scanner

Version: v0.9.4 (latest) | Published: May 21, 2026 | License: MIT | Imports: 10 | Imported by: 0

Repository

github.com/CodatusHQ/scanner

README ¶

Codatus

Codatus scans every repository in a GitHub organization or user account against a set of repo standards and produces a Markdown scorecard.

It answers one question: how does each repo in your org measure up against the standards you care about?

This repository is a Go library and a CLI. Posting the scorecard (e.g., as a GitHub Issue) is the caller's responsibility - the scanner returns structured results and Markdown, nothing more.

How it works

Codatus receives a GitHub account to scan (organization or user).
It lists the repositories accessible to the token, then filters out archived and forked repos. Both exclusions are reported in the scorecard header so the reader can see the full breakdown (Total repos, Forks excluded, Archived excluded, Repos scanned).
For each remaining repo, it runs 10 rule checks (5 scored rules and 5 additional checks, see below).
It produces a single Markdown scorecard summarizing pass/fail per repo per rule, plus a structured ScanResult value the caller can post-process (e.g. the bulk-scan binary serializes per-rule aggregates to JSON).

The CLI prints the Markdown to stdout. Callers using the library get a ScanResult (org name, scan timestamp, exclusion counts, per-repo results, skipped repos) and can generate the Markdown via GenerateReport(scanResult).
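The filtering step can be pictured with a small standalone sketch. It does not use the scanner package: the `repo` and `stats` types and the `filterRepos` helper are hypothetical stand-ins, and counting a repo that is both a fork and archived as a fork is an assumption made here.

```go
package main

import "fmt"

// repo is a hypothetical, trimmed-down stand-in for the scanner's Repo type.
type repo struct {
	Name     string
	Archived bool
	Fork     bool
}

// stats mirrors the counts shown in the scorecard header.
type stats struct {
	Total, ForksExcluded, ArchivedExcluded, Scanned int
}

// filterRepos drops forks and archived repos and counts each exclusion.
// A repo that is both a fork and archived is counted as a fork; that
// tie-break is an assumption of this sketch.
func filterRepos(all []repo) ([]repo, stats) {
	st := stats{Total: len(all)}
	var kept []repo
	for _, r := range all {
		switch {
		case r.Fork:
			st.ForksExcluded++
		case r.Archived:
			st.ArchivedExcluded++
		default:
			kept = append(kept, r)
		}
	}
	st.Scanned = len(kept)
	return kept, st
}

func main() {
	kept, st := filterRepos([]repo{
		{Name: "api"},
		{Name: "old-site", Archived: true},
		{Name: "vendored-lib", Fork: true},
	})
	fmt.Println(len(kept), st.Total, st.ForksExcluded, st.ArchivedExcluded, st.Scanned)
}
```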

Rules

Each rule produces a pass or fail result per repository. Rules fall into two categories:

Scored rules drive the org-level score. The score is the arithmetic mean of pass rates across the scored rules that were evaluated: all 5 on admin scans, 4 on non-admin scans (see Score and bucketing). Per-repo classification (Strong / Moderate / Weak) is also based on what fraction of scored rules a repo passes.
Additional checks are informational only. They appear in the report as coverage numbers but do not affect the score. They surface "nice to have" hygiene that's worth seeing but isn't load-bearing for whether a repo's standards are in good shape.

Scored rules (drive the org-level score)

MergeGate sources

The three rules below (Has branch protection, Has required reviewers, Has required checks) all read fields off a shared MergeGate struct that the scanner builds from up to two GitHub APIs per scan, depending on mode:

| Source | Endpoint | Visibility | Admin scans | Non-admin scans |
|--------|----------|------------|-------------|-----------------|
| 1. Rulesets | `GET /repos/{o}/{r}/rules/branches/{branch}` | Public | Always | Always |
| 2. Classic branch protection | `GET /repos/{o}/{r}/branches/{branch}/protection` | Admin-only | Always | Skipped (404s for non-admins) |
| 3. Public branch info | `GET /repos/{o}/{r}/branches/{branch}` | Public | Skipped | Always |

Source 2 is the precise classic-protection read (it sees the require-PR checkbox, reviewer count, and check contexts). Source 3 is a coarser public fallback for non-admin scans: it reports protected, protection.enabled, and check contexts, but not the require-PR sub-setting. Admin scans skip source 3 to avoid the false positives it would re-introduce.

Results are unioned across whichever sources ran. What each source contributes per field is documented in client.go (see MergeGate, GetRulesets, GetBranchProtection, GetBranchInfo).
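One way to picture the union is the sketch below. The `mergeGate` struct only mirrors the three field names this README mentions, and the per-field merge semantics (logical OR, maximum, de-duplicated set) are assumptions of the sketch rather than a description of client.go.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeGate is a hypothetical mirror of the three MergeGate fields the
// README names; the real struct may carry more.
type mergeGate struct {
	EnforcesPRFlow       bool
	RequiredReviewers    int
	RequiredStatusChecks []string
}

// union merges results from whichever sources ran. A nil entry stands for a
// source that was skipped. OR / max / set-union are assumed semantics.
func union(sources ...*mergeGate) mergeGate {
	var out mergeGate
	seen := map[string]bool{}
	for _, s := range sources {
		if s == nil {
			continue
		}
		out.EnforcesPRFlow = out.EnforcesPRFlow || s.EnforcesPRFlow
		if s.RequiredReviewers > out.RequiredReviewers {
			out.RequiredReviewers = s.RequiredReviewers
		}
		for _, c := range s.RequiredStatusChecks {
			if !seen[c] {
				seen[c] = true
				out.RequiredStatusChecks = append(out.RequiredStatusChecks, c)
			}
		}
	}
	// Sort so the merged slice is deterministic regardless of source order.
	sort.Strings(out.RequiredStatusChecks)
	return out
}

func main() {
	rulesets := &mergeGate{EnforcesPRFlow: true, RequiredStatusChecks: []string{"ci/test"}}
	classic := &mergeGate{RequiredReviewers: 2, RequiredStatusChecks: []string{"ci/test", "ci/lint"}}
	fmt.Printf("%+v\n", union(rulesets, classic, nil))
}
```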

1. Has branch protection

Check: the default branch enforces pull-request flow - direct pushes blocked, merges go through a PR.

Pass: MergeGate.EnforcesPRFlow is true. Sources set it on these signals:

Rulesets: a pull_request rule applies to the branch (any reviewer count, including 0).
Classic branch protection (admin scans): the response includes a required_pull_request_reviews sub-object.
Public branch info (non-admin scans): both protected: true and protection.enabled: true.

Fail: no source set the flag. Shape-only rulesets (deletion, non_fast_forward, required_signatures, required_linear_history, merge_queue, copilot_code_review) don't count; on admin scans, classic protection without the require-PR checkbox doesn't count; protected: true with protection.enabled: false doesn't count.

Public-scan visibility limit: the public branches endpoint exposes protection.enabled but redacts the require-PR sub-setting. A classic protection that's enabled but only enforces shape rules (signed commits, linear history) will pass this rule on a public scan even though it doesn't actually require a PR. Admin scans read the sub-setting directly and don't have this false-positive surface.

This rule is specifically about whether opening a PR is a prerequisite to changing the branch. Review and check requirements on the PR itself are Has required reviewers and Has required checks.
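As an illustration of the non-admin signal, the sketch below decodes only the fields of the public branch payload named above and applies the "both flags true" test. The `branchInfo` type and `enforcesPRFlowPublic` helper are hypothetical names, not the scanner's own.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// branchInfo holds only the fields of the public branch payload that the
// README mentions: protected, protection.enabled, and the check contexts.
type branchInfo struct {
	Protected  bool `json:"protected"`
	Protection struct {
		Enabled              bool `json:"enabled"`
		RequiredStatusChecks struct {
			Contexts []string `json:"contexts"`
		} `json:"required_status_checks"`
	} `json:"protection"`
}

// enforcesPRFlowPublic applies the non-admin signal: both flags must be true.
func enforcesPRFlowPublic(payload []byte) (bool, error) {
	var b branchInfo
	if err := json.Unmarshal(payload, &b); err != nil {
		return false, err
	}
	return b.Protected && b.Protection.Enabled, nil
}

func main() {
	on := []byte(`{"protected": true, "protection": {"enabled": true}}`)
	off := []byte(`{"protected": true, "protection": {"enabled": false}}`)
	fmt.Println(enforcesPRFlowPublic(on))
	fmt.Println(enforcesPRFlowPublic(off))
}
```

As the visibility note above says, this signal cannot tell a PR-requiring protection from one that only enforces shape rules.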

2. Has required reviewers

Check: the default branch requires at least one approving review before merging.

Pass: MergeGate.RequiredReviewers >= 1. Sources: rulesets' pull_request rule's required_approving_review_count (always consulted), or classic protection's required_pull_request_reviews.required_approving_review_count (admin scans only).

Admin-only. The reviewer count on classic protection is admin-only and the public branches endpoint doesn't expose it. Rather than answering only for rulesets-only repos (which would give misleading partial coverage), the scanner skips this rule entirely on non-admin scans: it doesn't appear in the JSON output, the Markdown table, or the score. Pass WithAdmin(true) (library) or --admin (CLI) to include it.

3. Has required checks

Check: the default branch requires at least one programmatic check (CI status check, workflow run, code scan, deployment, etc.) to pass before merging.

Pass: len(MergeGate.RequiredStatusChecks) > 0. Sources contribute to that slice as follows:

Rulesets (always consulted): any of five rule types with a non-empty enforcement list: required_status_checks, workflows, code_scanning, code_quality, required_deployments.
Classic branch protection (admin scans): required_status_checks.contexts.
Public branch info (non-admin scans): protection.required_status_checks.contexts.

Empty-list rules don't count. A workflows rule with parameters.workflows: [] (or similar empty-list configurations) enforces nothing and contributes no placeholder to the slice - otherwise this rule would pass for repos whose only check rule enforces nothing.

New merge-gate rule types can be added by extending the mergeGateRules registry in GetRulesets.
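The empty-list behaviour can be sketched as follows. The `listKey` map is a hypothetical stand-in for the mergeGateRules registry and covers only two of the five rule types; the parameter key names for the other three are deliberately left out rather than guessed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rule is one entry of a branch-rules response. Parameters stay raw so each
// rule type can name its own enforcement list.
type rule struct {
	Type       string                     `json:"type"`
	Parameters map[string]json.RawMessage `json:"parameters"`
}

// listKey maps a rule type to the parameters key holding its enforcement
// list. Hypothetical registry for this sketch, covering two types only.
var listKey = map[string]string{
	"required_status_checks": "required_status_checks",
	"workflows":              "workflows",
}

// enforcingRules returns the types of rules whose enforcement list is
// non-empty. Rules with an empty or missing list contribute nothing.
func enforcingRules(payload []byte) ([]string, error) {
	var rules []rule
	if err := json.Unmarshal(payload, &rules); err != nil {
		return nil, err
	}
	var out []string
	for _, r := range rules {
		key, known := listKey[r.Type]
		if !known {
			continue // shape-only or unregistered rule type
		}
		var items []json.RawMessage
		if raw, present := r.Parameters[key]; present {
			if err := json.Unmarshal(raw, &items); err != nil {
				return nil, err
			}
		}
		if len(items) > 0 {
			out = append(out, r.Type)
		}
	}
	return out, nil
}

func main() {
	payload := []byte(`[
	  {"type": "workflows", "parameters": {"workflows": []}},
	  {"type": "required_status_checks", "parameters": {"required_status_checks": [{"context": "ci/test"}]}},
	  {"type": "deletion"}
	]`)
	fmt.Println(enforcingRules(payload))
}
```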

4. Has CODEOWNERS

Check: a CODEOWNERS file exists in one of the three standard locations: root (/CODEOWNERS), docs/CODEOWNERS, or .github/CODEOWNERS.

Pass: file found in any of the three locations. Fail: file not found in any location.

5. Has CI workflow

Check: the repo has a CI workflow configured for any of the supported providers:

GitHub Actions: .github/workflows/*.yml or *.yaml
CircleCI: .circleci/config.yml
GitLab CI: .gitlab-ci.yml
Travis CI: .travis.yml
Buildkite: any file under .buildkite/
Azure Pipelines: azure-pipelines.yml
Jenkins: Jenkinsfile

Pass: at least one of the above paths is present in the repo. Fail: none of the recognized CI configurations exist. (Repos with server-side-only CI integrations - e.g. CircleCI without a checked-in config - are still missed; the rule is best-effort based on what's in the tree.)
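A rough sketch of the path matching, assuming the input is a flat list of file paths relative to the repo root (directories excluded). The `hasCIWorkflow` helper is illustrative only; in particular, matching GitHub Actions files only directly under .github/workflows is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// hasCIWorkflow reports whether any file path (relative to the repo root)
// matches one of the provider signals listed above.
func hasCIWorkflow(paths []string) bool {
	exact := map[string]bool{
		".circleci/config.yml": true,
		".gitlab-ci.yml":       true,
		".travis.yml":          true,
		"azure-pipelines.yml":  true,
		"Jenkinsfile":          true,
	}
	for _, p := range paths {
		// Exact root-level or canonical-directory matches, plus any file
		// under .buildkite/.
		if exact[p] || strings.HasPrefix(p, ".buildkite/") {
			return true
		}
		// GitHub Actions: a .yml or .yaml file directly under .github/workflows.
		if path.Dir(p) == ".github/workflows" {
			if ext := path.Ext(p); ext == ".yml" || ext == ".yaml" {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(hasCIWorkflow([]string{"README.md", ".github/workflows/ci.yaml"}))
	fmt.Println(hasCIWorkflow([]string{"docs/Jenkinsfile", "ci.yml"}))
}
```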

Additional checks (informational only)

6. Has README

Check: a README file exists at the repository root, matched case-insensitively with any extension or none. So README.md, Readme.rst, README.txt, README.markdown, readme all pass.

Pass: file found at root. Fail: no root-level file whose name, compared case-insensitively, is exactly readme or starts with readme followed by a dot. Subdirectory READMEs (e.g. docs/README.md) don't count.

There is no size threshold - any README counts. The previous "substantial" variant required >2 KB, which discriminated poorly.
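The matching rule is small enough to sketch directly. `isRootReadme` is a hypothetical helper written for illustration, taking a path relative to the repo root.

```go
package main

import (
	"fmt"
	"strings"
)

// isRootReadme reports whether p names a root-level README: the file name,
// lowercased, is exactly "readme" or begins with "readme.".
func isRootReadme(p string) bool {
	if strings.Contains(p, "/") {
		return false // subdirectory files don't count
	}
	name := strings.ToLower(p)
	return name == "readme" || strings.HasPrefix(name, "readme.")
}

func main() {
	for _, p := range []string{"README.md", "Readme.rst", "readme", "docs/README.md", "readme-old.md"} {
		fmt.Println(p, isRootReadme(p))
	}
}
```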

7. Has LICENSE

Check: GitHub auto-detected an open-source license for the repository (the license.spdx_id field on the listing payload is non-empty). GitHub uses the Licensee gem to detect license files at any conventional path or filename - LICENSE, LICENSE.md, LICENSE.txt, COPYING, LICENCE, etc.

Pass: GitHub returned a license SPDX id. Fail: GitHub couldn't auto-detect a license (no recognized license file, or a custom-text license Licensee doesn't recognize).

8. Has repo description

Check: the GitHub repository description field is not blank.

Pass: description is set and non-empty. Fail: description is blank or not set.

9. Has activity

Check: the most recent push to any branch is within the last 12 months (via the GitHub API's pushed_at field on the repository).

Pass: the repository was pushed within the last 12 months. Fail: the repository has not been pushed in the last 12 months, or has never been pushed. Archived repositories are filtered out before scanning, so they never reach this rule.

10. Has SECURITY.md

Check: a SECURITY.md file exists in any of the three locations GitHub recognizes for security policies: repo root, .github/SECURITY.md, or docs/SECURITY.md.

Pass: file found in any of those three locations. Fail: file not found.

Score and bucketing

The org-level score is the arithmetic mean of pass rates across the scored rules that were actually evaluated. For an admin scan that's all 5 scored rules; for a public scan it's 4 (HasRequiredReviewers is admin-only and silently skipped). The denominator adapts so the score isn't dragged toward zero by rules the scan couldn't see.

admin scan:   score = mean of 5 per-rule pass rates
public scan:  score = mean of 4 per-rule pass rates (no required-reviewers)

Each repo also gets a percentage based on the fraction of evaluated scored rules it passes (5-rule scans land at 0/20/40/60/80/100; 4-rule scans land at 0/25/50/75/100), and is bucketed:

Strong (≥80%)
Moderate (30-79%)
Weak (≤29%)

Additional checks do not affect either the org score or the per-repo bucket.
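The arithmetic can be sketched as below. `orgScore` and `bucket` are hypothetical helpers, not the package's Score and BucketOf; the integer division used for the per-repo percentage is an assumption, while the thresholds come from the list above.

```go
package main

import (
	"fmt"
	"math"
)

// orgScore is the arithmetic mean of per-rule pass rates (each in 0..1),
// scaled to 0..100 and rounded. ok is false when there is nothing to average.
func orgScore(passRates []float64) (score int, ok bool) {
	if len(passRates) == 0 {
		return 0, false
	}
	var sum float64
	for _, r := range passRates {
		sum += r
	}
	return int(math.Round(100 * sum / float64(len(passRates)))), true
}

// bucket classifies a repo by the share of evaluated scored rules it passes.
func bucket(passing, total int) (name string, pct int) {
	if total == 0 {
		return "Weak", 0
	}
	pct = passing * 100 / total
	switch {
	case pct >= 80:
		return "Strong", pct
	case pct >= 30:
		return "Moderate", pct
	default:
		return "Weak", pct
	}
}

func main() {
	// Admin scan over 53 repos: five per-rule pass rates.
	rates := []float64{11.0 / 53, 4.0 / 53, 4.0 / 53, 3.0 / 53, 27.0 / 53}
	fmt.Println(orgScore(rates))
	fmt.Println(bucket(3, 4)) // public scan: 3 of 4 evaluated rules pass
	fmt.Println(bucket(4, 5)) // admin scan: 4 of 5 evaluated rules pass
}
```

Note that on a 4-rule scan a repo needs all four rules to reach Strong, since 75% falls in the Moderate range.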

Scorecard format

The scorecard is a single Markdown document. Structure:

# Codatus - Repo Standards Scorecard

**Org:** {org_name}<br>
**Scanned:** {timestamp}<br>
**Repos:** {scanned} of {total} scanned ({forks} forks excluded, {archived} archived excluded, {skipped} skipped)

## Scored rules

| Rule | Passing | Failing | Pass rate |
|------|---------|---------|----------|
| Has branch protection | 11 | 42 | 20% |
| Has required reviewers | 4 | 49 | 7% |
| Has required checks | 4 | 49 | 7% |
| Has CODEOWNERS | 3 | 50 | 5% |
| Has CI workflow | 27 | 26 | 50% |

**Score: 18/100** (average pass rate across the scored rules above)

## Additional checks

| Rule | Passing | Failing | Pass rate |
|------|---------|---------|----------|
| Has README | 50 | 3 | 94% |
| Has LICENSE | 38 | 15 | 71% |
| Has repo description | 46 | 7 | 86% |
| Has activity | 43 | 10 | 81% |
| Has SECURITY.md | 3 | 50 | 5% |

## Rule reference

<details>
<summary>How each rule works and how to fix failures</summary>

### Scored rules

#### Has branch protection

Checks that the default branch has a protection rule in place. Detected via any of three GitHub APIs: ...

---

#### Has required reviewers

...

### Additional checks

#### Has README

...

</details>

## Repository details

### Strong (≥80%)

<details>
<summary><a href="https://github.com/{org}/repo-a">repo-a</a> - 100%</summary>

</details>

### Moderate (30-79%)

<details>
<summary><a href="https://github.com/{org}/repo-b">repo-b</a> - 60%</summary>

**Failing scored rules:**
- Has CODEOWNERS
- Has required checks

**Additional check failures:**
- Has SECURITY.md

</details>

### Weak (≤29%)

<details>
<summary><a href="https://github.com/{org}/repo-c">repo-c</a> - 0%</summary>

**Failing scored rules:**
- Has branch protection
- Has required reviewers
- Has required checks
- Has CODEOWNERS
- Has CI workflow

</details>

### Skipped ({n} repos)             <-- only if any repos were skipped

- [empty-repo](https://github.com/{org}/empty-repo) - repository is empty
- [huge-repo](https://github.com/{org}/huge-repo) - file tree too large (truncated by GitHub API)

Header line breaks use explicit <br> so spec-compliant Markdown renderers (CommonMark/marked.js/kramdown/GitHub) emit one line per item instead of folding consecutive single-newlines into one paragraph. The repo-stats parenthetical drops fields that are zero - with no exclusions and no skipped repos, it collapses to **Repos:** {scanned} of {total} scanned.
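The collapsing of zero-valued fields might look like the sketch below. `reposLine` is a hypothetical helper that follows the header template shown above; it is not the function report.go uses.

```go
package main

import (
	"fmt"
	"strings"
)

// reposLine renders the header's repo-stats line, dropping any
// parenthetical field whose count is zero.
func reposLine(scanned, total, forks, archived, skipped int) string {
	var parts []string
	if forks > 0 {
		parts = append(parts, fmt.Sprintf("%d forks excluded", forks))
	}
	if archived > 0 {
		parts = append(parts, fmt.Sprintf("%d archived excluded", archived))
	}
	if skipped > 0 {
		parts = append(parts, fmt.Sprintf("%d skipped", skipped))
	}
	line := fmt.Sprintf("**Repos:** %d of %d scanned", scanned, total)
	if len(parts) > 0 {
		line += " (" + strings.Join(parts, ", ") + ")"
	}
	return line
}

func main() {
	fmt.Println(reposLine(53, 53, 0, 0, 0))
	fmt.Println(reposLine(50, 60, 7, 0, 3))
}
```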

Tables render in fixed importance order (not sorted by pass rate). Both tables share the same column layout. The Rule reference section is collapsed by default; each rule has a single self-contained description that names what's checked, every detection path the rule walks (legacy and modern GitHub mechanisms), and how to fix it. Rule reference precedes Repository details so the rule definitions are in hand before the per-repo failure lists (which only mention rule names). Subsections (and entire buckets) are omitted when empty. Skipped repos are those that could not be scanned (empty repos, truncated file trees, API errors); they are excluded from the score and bucket counts, and render as the last subsection inside Repository details.

When repos_scanned is 0, the Score line reads **Score: N/A** (no repos available to score).

Canonical sample fixture

samples.Fixture() returns a deterministic ScanResult for the fictional acme-corp org. It's the single source of truth for the sample scorecard shown on the landing page and used as dev-seed data in the app, replacing what used to be hand-typed Markdown in each downstream repo.

Go consumers render it in process:

md := scanner.GenerateReport(samples.Fixture())

Non-Go consumers use the generator binary, which writes Markdown to stdout (or to --out):

go run github.com/CodatusHQ/scanner/cmd/generate-sample > sample-scorecard.md

No rendered .md is committed here - downstream copies are refreshed on demand by re-running the generator.

Scanner configuration

scanner.Scan(ctx, auth, opts...) takes an Auth - a sealed interface implemented by two concrete types. Pick the one that matches your token.

PATAuth - personal access token

For scanning with a user-generated token (classic or fine-grained PAT). Scanner calls /orgs/{Name}/repos and falls back to /users/{Name}/repos on 404, so it works for both org and user accounts.

results, err := scanner.Scan(ctx, scanner.PATAuth{
    Token: "ghp_...",
    Name:  "my-org",        // or a user login like "octocat"
})

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `Token` | `string` | Yes | Personal access token |
| `Name` | `string` | Yes | GitHub organization or user login to scan |

InstallationAuth - GitHub App installation

For scanning as a GitHub App. Scanner calls /installation/repositories, which returns exactly the repos the installation was granted access to. Works identically for org and user installs, and respects "Selected repositories" mode (no leak of other public repos).

results, err := scanner.Scan(ctx, scanner.InstallationAuth{
    Token: "ghs_...",       // short-lived installation access token
    Name:  "my-org",        // the account the app is installed on
})

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `Token` | `string` | Yes | Installation access token (from `/app/installations/{id}/access_tokens`) |
| `Name` | `string` | Yes | Account the app is installed on; used for per-repo URL construction |

Options

| Option | Description |
|--------|-------------|
| `WithBaseURL(url string)` | Override the GitHub API base URL. Defaults to the public GitHub API. Useful for testing against a mock server or targeting GitHub Enterprise. |
| `WithAdmin(b bool)` | Tell the scanner the auth has admin access on every target repo. Default false. When true, the scanner runs admin-only rules (currently: Has required reviewers). When false, those rules are silently skipped - they don't appear in any per-repo result, the JSON output, or the Markdown report. Pass true for installation-token scans (the Codatus GitHub App is granted admin), or for PAT scans where you're an admin of every target org. |

Required token permissions

Classic PAT:

repo — read repo contents and branch protection
read:org — required when Name is an organization

Fine-grained PAT: scoped to the target account, with Repository permissions:

Metadata: Read
Contents: Read
Administration: Read (for branch protection)

Installation token: permissions come from the GitHub App's configured repository permissions (not PAT scopes). At minimum the app needs Contents (read) and Metadata (read); Administration (read) is required for branch protection rules to resolve.

How these values are sourced (env vars, CLI flags, config file) is the responsibility of the caller, not the scanner module.

CLI

The codatus binary reads CODATUS_ORG and CODATUS_TOKEN from the environment, wraps them in PATAuth, runs a scan, and prints the Markdown scorecard to stdout. Log output (scan summary, errors) goes to stderr so stdout stays clean for piping.

Despite the name, CODATUS_ORG accepts either an organization slug or a user login - the library dispatches automatically.

# Organization
CODATUS_ORG=myorg CODATUS_TOKEN=ghp_... codatus > scorecard.md

# User account
CODATUS_ORG=my-username CODATUS_TOKEN=ghp_... codatus > scorecard.md

Bulk scan (many orgs at once)

The bulk-scan binary reads a list of orgs/users from a file (one slug per line, blank lines skipped, no comment handling) and scans them sequentially. For each org it writes a scorecard.md and a stats.json into a per-org subfolder, so partial runs preserve completed work even if a later scan aborts.

# orgs.txt
acme-corp
wayne-enterprises
stark-industries

# Run
bulk-scan --orgs orgs.txt --out ./scans --token ghp_...
# or with the token in env:
CODATUS_TOKEN=ghp_... bulk-scan --orgs orgs.txt --out ./scans
# admin-mode scan (you're an admin of every listed org, so include the
# `Has required reviewers` rule too):
CODATUS_TOKEN=ghp_... bulk-scan --orgs orgs.txt --out ./scans --admin

Output layout:

scans/
├── acme-corp/
│   ├── scorecard.md     # same Markdown the single-org CLI produces
│   └── stats.json       # structured aggregates: per-rule pass rates, totals, exclusion counts
├── wayne-enterprises/
│   ├── scorecard.md
│   └── stats.json
└── ...

Progress prints to stderr per org ([2/3] wayne-enterprises ... ok (42 scanned, 18/42 compliant = 42%)); a final summary lists succeeded / failed / not-attempted counts.

Failure handling:

Per-org errors (404, 403, "user not found", etc.) - logged, run continues to the next org.
Global errors (429 rate limit, 401 auth) - run aborts immediately. Files for orgs that already completed remain on disk; the un-scanned tail is reported in the summary as "not attempted."

Exit code is non-zero if any org failed or was not attempted.

What Codatus is not

Not a velocity/DORA metrics tool. It does not measure cycle time, deployment frequency, or review speed. That's a different product category (Swarmia, LinearB, CodePulse).
Not a security scanner. It checks whether SECURITY.md exists and whether branch protection is on, but it does not scan code for vulnerabilities (use Snyk, Dependabot, or OpenSSF Scorecard).
Not a developer portal. There is no service catalog, no scaffolding, no self-service actions (Backstage, Cortex, OpsLevel cover that). Just standards.

Documentation ¶

Index ¶

Constants
Variables
func GenerateReport(sr ScanResult) string
func IsRateLimitError(err error) bool
func Score(sr ScanResult) (score int, defined bool)
type Auth
type Bucket
- func BucketOf(rr RepoResult, scoredRules []Rule) (b Bucket, scoredPassing, scoredTotal, scorePct int)
- func Buckets() []Bucket
type FileEntry
type GitHubClient
- func NewGitHubClient(token string) GitHubClient
type HasActivity
- func (r HasActivity) Category() RuleCategory
- func (r HasActivity) Check(repo Repo) bool
- func (r HasActivity) Description() string
- func (r HasActivity) Name() string
type HasBranchProtection
- func (r HasBranchProtection) Category() RuleCategory
- func (r HasBranchProtection) Check(repo Repo) bool
- func (r HasBranchProtection) Description() string
- func (r HasBranchProtection) Name() string
type HasCIWorkflow
- func (r HasCIWorkflow) Category() RuleCategory
- func (r HasCIWorkflow) Check(repo Repo) bool
- func (r HasCIWorkflow) Description() string
- func (r HasCIWorkflow) Name() string
type HasCodeowners
- func (r HasCodeowners) Category() RuleCategory
- func (r HasCodeowners) Check(repo Repo) bool
- func (r HasCodeowners) Description() string
- func (r HasCodeowners) Name() string
type HasLicense
- func (r HasLicense) Category() RuleCategory
- func (r HasLicense) Check(repo Repo) bool
- func (r HasLicense) Description() string
- func (r HasLicense) Name() string
type HasReadme
- func (r HasReadme) Category() RuleCategory
- func (r HasReadme) Check(repo Repo) bool
- func (r HasReadme) Description() string
- func (r HasReadme) Name() string
type HasRepoDescription
- func (r HasRepoDescription) Category() RuleCategory
- func (r HasRepoDescription) Check(repo Repo) bool
- func (r HasRepoDescription) Description() string
- func (r HasRepoDescription) Name() string
type HasRequiredChecks
- func (r HasRequiredChecks) Category() RuleCategory
- func (r HasRequiredChecks) Check(repo Repo) bool
- func (r HasRequiredChecks) Description() string
- func (r HasRequiredChecks) Name() string
type HasRequiredReviewers
- func (r HasRequiredReviewers) Category() RuleCategory
- func (r HasRequiredReviewers) Check(repo Repo) bool
- func (r HasRequiredReviewers) Description() string
- func (r HasRequiredReviewers) Name() string
- func (r HasRequiredReviewers) RequiresAdmin() bool
type HasSecurityMd
- func (r HasSecurityMd) Category() RuleCategory
- func (r HasSecurityMd) Check(repo Repo) bool
- func (r HasSecurityMd) Description() string
- func (r HasSecurityMd) Name() string
type InstallationAuth
type MergeGate
type Option
- func WithAdmin(b bool) Option
- func WithBaseURL(url string) Option
type PATAuth
type Repo
type RepoResult
- func (rr RepoResult) Skipped() bool
type Rule
- func AdditionalRules() []Rule
- func AllRules() []Rule
- func ScoredRules() []Rule
type RuleCategory
type RuleResult
type ScanResult
- func Scan(ctx context.Context, auth Auth, opts ...Option) (ScanResult, error)

Constants ¶

const Version = "v0.9.4"

Version is the semver tag of this scanner library. Bumped manually per release. Surfaced in the report header so a reader can see which scanner version produced a given scorecard.

Variables ¶

var (
	ErrEmptyRepo     = errors.New("repository is empty")
	ErrTruncatedTree = errors.New("tree truncated by GitHub API")
)

Sentinel errors for per-repo scan failures.

Functions ¶

func GenerateReport ¶

func GenerateReport(sr ScanResult) string

GenerateReport produces a Markdown repo-standards scorecard from a ScanResult. The structure is fixed and meaningful for prospects landing from a cold-email link:

Header: title, org, scan time, single-line repo stats
## Scored rules table (importance order, drives the score)
**Score: N/100** inline callout (or **Score: N/A** when no repos)
## Additional checks table (importance order, same columns as scored)
## Rule reference (collapsed <details>, split by category)
## Repository details: ### Strong / Moderate / Weak / Skipped subsections

Rule reference precedes Repository details so a reader scanning top-down has the rule definitions in hand before they hit the per-repo failure lists, which only mention rule names.

func IsRateLimitError ¶ added in v0.6.0

func IsRateLimitError(err error) bool

IsRateLimitError reports whether an error is a GitHub rate limit error (primary or secondary). Rate limit errors must never be swallowed - they indicate a global problem that affects all subsequent API calls. Exported so callers (e.g., bulk-scan) can decide whether to abort a multi-org run on the first rate-limited org rather than continue and fail every subsequent call.

func Score ¶ added in v0.7.0

func Score(sr ScanResult) (score int, defined bool)

Score computes the org-level score: the arithmetic mean of pass rates across sr.RulesScored. Returns the score (0-100) and a flag indicating whether it's defined. When sr has no scanned repos OR no scored rules were evaluated (e.g., a non-admin scan with admin-only rules filtered out and no scored rules left), defined=false and the caller should render "N/A". Result is rounded to the nearest integer for display.

The denominator is len(sr.RulesScored), not the size of the global scored-rule set - that's how non-admin scans get the math right without rules they couldn't evaluate dragging the score down.

Types ¶

type Auth ¶ added in v0.3.0

type Auth interface {
	// contains filtered or unexported methods
}

Auth identifies how the scanner authenticates to GitHub. It is a sealed interface — only PATAuth and InstallationAuth in this package satisfy it. New auth types are added by defining a struct with an isAuth() method.
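For readers unfamiliar with the sealed-interface pattern in Go, here is a standalone sketch. The lowercase `auth`, `patAuth`, and `installationAuth` types and the `describe` function are illustrative stand-ins, not the package's own definitions.

```go
package main

import "fmt"

// auth is sealed: isAuth is unexported, so only types declared in this
// package can implement it.
type auth interface{ isAuth() }

type patAuth struct{ Token, Name string }
type installationAuth struct{ Token, Name string }

func (patAuth) isAuth()          {}
func (installationAuth) isAuth() {}

// describe dispatches on the concrete type. Because the interface is
// sealed, the set of cases is known to the package author.
func describe(a auth) string {
	switch a.(type) {
	case patAuth:
		return "list via /orgs/{Name}/repos, fall back to /users/{Name}/repos"
	case installationAuth:
		return "list via /installation/repositories"
	default:
		return "unknown auth"
	}
}

func main() {
	fmt.Println(describe(patAuth{Token: "ghp_...", Name: "my-org"}))
	fmt.Println(describe(installationAuth{Token: "ghs_...", Name: "my-org"}))
}
```

Code outside the package cannot add a third implementation, because it has no way to define the unexported isAuth method.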

type Bucket ¶ added in v0.7.0

type Bucket struct {
	Name   string // "Strong", "Moderate", "Weak"
	MinPct int    // inclusive lower bound (0..100)
	MaxPct int    // inclusive upper bound (0..100)
}

Bucket classifies a repo by what fraction of the scored rules it passes. Each bucket covers an integer percentage range; the full bucket set returned by Buckets() covers [0, 100] without gaps or overlaps. Display labels are derived from MinPct/MaxPct at render time (see report.go).

func BucketOf ¶ added in v0.7.0

func BucketOf(rr RepoResult, scoredRules []Rule) (b Bucket, scoredPassing, scoredTotal, scorePct int)

BucketOf classifies a single repo by the percentage of scored rules it passes. The caller passes the scored-rule set so the denominator is stable across the org's scan: every repo gets the same denominator, regardless of which rules happen to appear in any one repo's results. Pass sr.RulesScored from the parent ScanResult.

Returns the matching Bucket plus the underlying counts so callers don't re-derive them. If scoredRules is empty the result is the last-defined bucket (i.e. Weak) with zero counts; this only happens in test fixtures with no scored rules registered.

func Buckets ¶ added in v0.7.0

func Buckets() []Bucket

Buckets returns the score-range buckets in display order (highest range first). Adding/removing buckets, renaming them, or shifting thresholds is a one-place edit here - report and stats output both derive from this list and need no separate updates.

type FileEntry ¶

type FileEntry struct {
	Path string // full path relative to repo root (e.g., ".github/workflows/ci.yml")
	Size int
	Type string // "blob" (file) or "tree" (directory)
}

FileEntry represents a file or directory in a repo.

type GitHubClient ¶

type GitHubClient interface {
	// ListReposByAccount lists repos for a named org (falls back to user on 404).
	// Used by PAT auth.
	ListReposByAccount(ctx context.Context, name string) ([]Repo, error)
	// ListReposByInstallation lists the repos the current GitHub App installation
	// was granted access to. Used by installation-token auth.
	ListReposByInstallation(ctx context.Context) ([]Repo, error)
	GetTree(ctx context.Context, owner, repo, branch string) ([]FileEntry, error)
	GetBranchProtection(ctx context.Context, owner, repo, branch string) (*MergeGate, error)
	GetRulesets(ctx context.Context, owner, repo, branch string) (*MergeGate, error)
	// GetBranchInfo reads the public GET /repos/{o}/{r}/branches/{br}
	// endpoint, which exposes the protected flag and (for classic
	// per-repo branch protection) the required-status-check contexts to
	// any reader - including non-admins on public repos. This is the
	// fallback when the admin GetBranchProtection 404s and there are no
	// rulesets, so the scanner can still tell whether protection is on
	// and which status checks are required. Required-reviewer counts
	// are NOT exposed here (admin-only field on classic protection).
	GetBranchInfo(ctx context.Context, owner, repo, branch string) (*MergeGate, error)
}

GitHubClient is the interface for all GitHub API interactions. The scanner depends only on this interface, making it testable via mocks.

func NewGitHubClient ¶

func NewGitHubClient(token string) GitHubClient

NewGitHubClient creates a GitHubClient that calls the public GitHub REST API.

type HasActivity ¶ added in v0.5.1

type HasActivity struct {
	Now time.Time
}

HasActivity checks that the repo has had a commit (push) within the last 12 months. Set Now to a fixed time for deterministic testing; the zero value means time.Now() is used at check time.

func (HasActivity) Category ¶ added in v0.7.0

func (r HasActivity) Category() RuleCategory

func (HasActivity) Check ¶ added in v0.5.1

func (r HasActivity) Check(repo Repo) bool

func (HasActivity) Description ¶ added in v0.5.1

func (r HasActivity) Description() string

func (HasActivity) Name ¶ added in v0.5.1

func (r HasActivity) Name() string

type HasBranchProtection ¶

type HasBranchProtection struct{}

HasBranchProtection checks that the default branch enforces PR-flow: direct pushes blocked, merges go through a PR. One of three rules derived from MergeGate; the other two are HasRequiredReviewers and HasRequiredChecks. See MergeGate (client.go) and resolveMergeGate (scanner.go) for how the signal is built.

func (HasBranchProtection) Category ¶ added in v0.7.0

func (r HasBranchProtection) Category() RuleCategory

func (HasBranchProtection) Check ¶

func (r HasBranchProtection) Check(repo Repo) bool

func (HasBranchProtection) Description ¶ added in v0.4.0

func (r HasBranchProtection) Description() string

func (HasBranchProtection) Name ¶

func (r HasBranchProtection) Name() string

type HasCIWorkflow ¶

type HasCIWorkflow struct{}

HasCIWorkflow checks that the repo has a CI workflow configured for any of the well-known CI providers, not just GitHub Actions. Detected via the presence of one of these signals at the repo root or under their canonical directory:

GitHub Actions: .github/workflows/*.yml or *.yaml
CircleCI: .circleci/config.yml
GitLab CI: .gitlab-ci.yml
Travis CI: .travis.yml
Buildkite: any file under .buildkite/
Azure Pipelines: azure-pipelines.yml
Jenkins: Jenkinsfile

Repos using a CI integration that lives entirely server-side (e.g., CircleCI without a checked-in config) are still missed; this is a best-effort signal based on what's visible in the repo.

func (HasCIWorkflow) Category ¶ added in v0.7.0

func (r HasCIWorkflow) Category() RuleCategory

func (HasCIWorkflow) Check ¶

func (r HasCIWorkflow) Check(repo Repo) bool

func (HasCIWorkflow) Description ¶ added in v0.4.0

func (r HasCIWorkflow) Description() string

func (HasCIWorkflow) Name ¶

func (r HasCIWorkflow) Name() string

type HasCodeowners ¶

type HasCodeowners struct{}

HasCodeowners checks that a CODEOWNERS file exists in root, docs/, or .github/.

func (HasCodeowners) Category ¶ added in v0.7.0

func (r HasCodeowners) Category() RuleCategory

func (HasCodeowners) Check ¶

func (r HasCodeowners) Check(repo Repo) bool

func (HasCodeowners) Description ¶ added in v0.4.0

func (r HasCodeowners) Description() string

func (HasCodeowners) Name ¶

func (r HasCodeowners) Name() string

type HasLicense ¶

type HasLicense struct{}

HasLicense uses GitHub's auto-detected license (Licensee) instead of a path-pattern match, so any conventionally-named license file works: LICENSE, LICENSE.md, LICENSE.txt, LICENCE (British), COPYING (GNU), MIT-LICENSE, etc. - anything GitHub recognizes and surfaces as the repo's `license.spdx_id` in the listing payload.

Custom-text licenses GitHub can't auto-detect won't pass even though the file may be present. That's a known false negative; the trade-off is worth it for the much broader correct-positive coverage.

func (HasLicense) Category ¶ added in v0.7.0

func (r HasLicense) Category() RuleCategory

func (HasLicense) Check ¶

func (r HasLicense) Check(repo Repo) bool

func (HasLicense) Description ¶ added in v0.4.0

func (r HasLicense) Description() string

func (HasLicense) Name ¶

func (r HasLicense) Name() string

type HasReadme ¶ added in v0.7.0

type HasReadme struct{}

HasReadme checks that some form of README file exists at the repo root. Matches case-insensitively on the filename and accepts any extension (or no extension), so README.md, readme.rst, README.txt, Readme, README.markdown all pass. Subdirectory READMEs (e.g., docs/README.md) don't count - the rule is about a top-level project README.

(There is no size threshold. The previous "substantial" variant was dropped because its 2 KB cutoff was too low to discriminate quality and too high to reward minimal but useful READMEs.)
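The matching behavior described above (root only, case-insensitive, any extension or none) could look roughly like the following. The `isRootReadme` and `hasReadme` helpers are hypothetical names for illustration; the real rule operates on `Repo.Files`, and its exact matching logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isRootReadme reports whether a repo-relative path names a top-level
// README, ignoring case and accepting any extension or none.
func isRootReadme(path string) bool {
	if strings.Contains(path, "/") {
		return false // subdirectory READMEs don't count
	}
	name := strings.ToLower(path)
	return name == "readme" || strings.HasPrefix(name, "readme.")
}

// hasReadme reports whether any of the given paths is a root README.
func hasReadme(paths []string) bool {
	for _, p := range paths {
		if isRootReadme(p) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"README.md", "readme.rst", "Readme", "docs/README.md"} {
		fmt.Printf("%-16s %v\n", p, isRootReadme(p))
	}
}
```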

func (HasReadme) Category ¶ added in v0.7.0

func (r HasReadme) Category() RuleCategory

func (HasReadme) Check ¶ added in v0.7.0

func (r HasReadme) Check(repo Repo) bool

func (HasReadme) Description ¶ added in v0.7.0

func (r HasReadme) Description() string

func (HasReadme) Name ¶ added in v0.7.0

func (r HasReadme) Name() string

type HasRepoDescription ¶

type HasRepoDescription struct{}

HasRepoDescription checks that the repo description field is not blank.

func (HasRepoDescription) Category ¶ added in v0.7.0

func (r HasRepoDescription) Category() RuleCategory

func (HasRepoDescription) Check ¶

func (r HasRepoDescription) Check(repo Repo) bool

func (HasRepoDescription) Description ¶ added in v0.4.0

func (r HasRepoDescription) Description() string

func (HasRepoDescription) Name ¶

func (r HasRepoDescription) Name() string

type HasRequiredChecks ¶ added in v0.8.4

type HasRequiredChecks struct{}

HasRequiredChecks checks that the default branch requires at least one programmatic check (CI status check, workflow, code scan, deployment, etc.) to pass before merging. Reads MergeGate.RequiredStatusChecks, which resolveMergeGate populates from whichever sources ran.

func (HasRequiredChecks) Category ¶ added in v0.8.4

func (r HasRequiredChecks) Category() RuleCategory

func (HasRequiredChecks) Check ¶ added in v0.8.4

func (r HasRequiredChecks) Check(repo Repo) bool

func (HasRequiredChecks) Description ¶ added in v0.8.4

func (r HasRequiredChecks) Description() string

func (HasRequiredChecks) Name ¶ added in v0.8.4

func (r HasRequiredChecks) Name() string

type HasRequiredReviewers ¶

type HasRequiredReviewers struct{}

HasRequiredReviewers checks that at least one approving review is required. The rule is admin-only: non-admin scans skip it entirely, because the reviewer count on classic branch protection can only be read with admin access, and answering only for repos that use rulesets would produce misleading partial coverage. See effectiveRules in scanner.go.

func (HasRequiredReviewers) Category ¶ added in v0.7.0

func (r HasRequiredReviewers) Category() RuleCategory

func (HasRequiredReviewers) Check ¶

func (r HasRequiredReviewers) Check(repo Repo) bool

func (HasRequiredReviewers) Description ¶ added in v0.4.0

func (r HasRequiredReviewers) Description() string

func (HasRequiredReviewers) Name ¶

func (r HasRequiredReviewers) Name() string

func (HasRequiredReviewers) RequiresAdmin ¶ added in v0.8.0

func (r HasRequiredReviewers) RequiresAdmin() bool

type HasSecurityMd ¶

type HasSecurityMd struct{}

HasSecurityMd checks that SECURITY.md exists in any of the three locations GitHub recognizes for security policies: repo root, .github/, or docs/.

func (HasSecurityMd) Category ¶ added in v0.7.0

func (r HasSecurityMd) Category() RuleCategory

func (HasSecurityMd) Check ¶

func (r HasSecurityMd) Check(repo Repo) bool

func (HasSecurityMd) Description ¶ added in v0.4.0

func (r HasSecurityMd) Description() string

func (HasSecurityMd) Name ¶

func (r HasSecurityMd) Name() string

type InstallationAuth ¶ added in v0.3.0

type InstallationAuth struct {
	Token string
	Name  string // org or user login the app is installed on (used in repo URLs)
}

InstallationAuth uses a GitHub App installation access token. Scanner lists repositories via /installation/repositories, which returns exactly the repos the installation was granted access to (no public-repo leak on "Selected repositories" installs).

type MergeGate ¶ added in v0.9.2

type MergeGate struct {
	EnforcesPRFlow       bool     // direct pushes blocked; merges go through a PR
	RequiredReviewers    int      // approving reviewers required to merge
	RequiredStatusChecks []string // identifiers of required merge-gate checks
}

MergeGate holds the merge requirements the scanner extracts from GitHub's branch-protection APIs. The struct is the union of whichever sources ran for the scan mode; see resolveMergeGate (scanner.go) for the source matrix and per-source signals. A nil *MergeGate means no merge requirements were found.
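Because `Repo.MergeGate` is a pointer that may be nil, any code reading it needs a nil guard first. The sketch below mirrors the documented fields in local stand-in types; the helper names `requiresChecks` and `requiresReview` are hypothetical and only illustrate how the merge-gate rules might read the struct.

```go
package main

import "fmt"

// MergeGate mirrors the documented struct.
type MergeGate struct {
	EnforcesPRFlow       bool
	RequiredReviewers    int
	RequiredStatusChecks []string
}

// Repo is a trimmed stand-in carrying only the fields used here.
type Repo struct {
	Name      string
	MergeGate *MergeGate // nil if no merge requirements were found
}

// requiresChecks is true when at least one required check is recorded.
func requiresChecks(r Repo) bool {
	return r.MergeGate != nil && len(r.MergeGate.RequiredStatusChecks) > 0
}

// requiresReview is true when at least one approving review is required.
func requiresReview(r Repo) bool {
	return r.MergeGate != nil && r.MergeGate.RequiredReviewers >= 1
}

func main() {
	open := Repo{Name: "open"} // no merge requirements at all
	gated := Repo{Name: "gated", MergeGate: &MergeGate{
		EnforcesPRFlow:       true,
		RequiredReviewers:    1,
		RequiredStatusChecks: []string{"ci/test"},
	}}
	fmt.Println(requiresChecks(open), requiresReview(open))
	fmt.Println(requiresChecks(gated), requiresReview(gated))
}
```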

type Option ¶ added in v0.2.0

type Option func(*scanOptions)

Option configures optional scan behavior.

func WithAdmin ¶ added in v0.8.0

func WithAdmin(b bool) Option

WithAdmin signals that the auth has admin access on every repo it can see. When true, the scanner runs all rules, including those that need admin-only API endpoints (currently: required-reviewers visibility on classic per-repo branch protection). When false (the default), rules marked admin-only are silently skipped - they don't appear in the per-repo results, the JSON output, or the Markdown report. Their absence is invisible to downstream consumers, who simply don't see those keys/columns.

Pass true when scanning with an installation token issued by the Codatus GitHub App (which is granted admin) or a PAT belonging to an admin of every target org. Pass false (or leave default) for third-party / public scans where admin signals can't be read.
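One way the admin filtering described above could work is with an optional interface detected by type assertion, in the spirit of the `RequiresAdmin` method on `HasRequiredReviewers`. This is a sketch under assumptions: the `adminOnly` interface name, the `effective` function, and the stand-in rule types are all hypothetical, and the real `effectiveRules` in scanner.go may be implemented differently.

```go
package main

import "fmt"

// rule is a trimmed stand-in for the package's Rule interface.
type rule interface{ Name() string }

// adminOnly is a hypothetical optional interface: rules that implement it
// and return true are skipped on non-admin scans.
type adminOnly interface{ RequiresAdmin() bool }

type plainRule struct{ name string }

func (p plainRule) Name() string { return p.name }

type reviewersRule struct{}

func (reviewersRule) Name() string        { return "required-reviewers" }
func (reviewersRule) RequiresAdmin() bool { return true }

// effective returns the rules to run for a scan, dropping admin-only
// rules when admin is false and preserving the input order.
func effective(all []rule, admin bool) []rule {
	var out []rule
	for _, r := range all {
		if a, ok := r.(adminOnly); ok && a.RequiresAdmin() && !admin {
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	all := []rule{plainRule{"readme"}, reviewersRule{}, plainRule{"license"}}
	fmt.Println(len(effective(all, true)), len(effective(all, false)))
}
```

Filtering before the scan, rather than marking results as skipped afterwards, is what makes the omitted rules invisible in every downstream output.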

func WithBaseURL ¶ added in v0.2.0

func WithBaseURL(url string) Option

WithBaseURL sets a custom GitHub API base URL. Defaults to the public GitHub API when unset. Useful for testing against a mock server or pointing at a GitHub Enterprise instance.
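`Option` follows Go's functional-options pattern. The sketch below shows how such options are typically applied. The fields of `scanOptions` and the `resolveOptions` helper are assumptions, since `scanOptions` is unexported and its layout is not documented here.

```go
package main

import "fmt"

// scanOptions is unexported in the real package; these fields are
// hypothetical and exist only to illustrate the pattern.
type scanOptions struct {
	admin   bool
	baseURL string
}

// Option matches the documented signature.
type Option func(*scanOptions)

// WithAdmin and WithBaseURL mirror the documented constructors.
func WithAdmin(b bool) Option       { return func(o *scanOptions) { o.admin = b } }
func WithBaseURL(url string) Option { return func(o *scanOptions) { o.baseURL = url } }

// resolveOptions starts from defaults (non-admin, public GitHub API) and
// applies each option in order, so later options win.
func resolveOptions(opts ...Option) scanOptions {
	o := scanOptions{baseURL: "https://api.github.com"}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolveOptions())
	fmt.Printf("%+v\n", resolveOptions(WithAdmin(true), WithBaseURL("http://localhost:8080")))
}
```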

type PATAuth ¶ added in v0.3.0

type PATAuth struct {
	Token string
	Name  string // org or user login to scan
}

PATAuth uses a Personal Access Token targeting a named account. Scanner lists repositories via /orgs/{Name}/repos and falls back to /users/{Name}/repos on 404, so it works for both org and user accounts.

type Repo ¶

type Repo struct {
	Name          string
	Description   string
	DefaultBranch string
	Archived      bool
	Fork          bool
	PushedAt      time.Time   // most recent push to any branch (from list-repos)
	License       string      // SPDX id GitHub auto-detected (Licensee), "" if none
	Files         []FileEntry // all files and directories in the repo
	MergeGate     *MergeGate  // nil if no merge requirements were found
}

Repo represents a GitHub repository with the fields the scanner needs.

type RepoResult ¶

type RepoResult struct {
	RepoName         string
	MostRecentCommit time.Time // PushedAt from the listing; zero if unknown
	Results          []RuleResult
	KnownSkipReason  string
	UnknownSkipError string
}

RepoResult holds all rule results for a single repository. KnownSkipReason and UnknownSkipError are mutually exclusive.

func (RepoResult) Skipped ¶ added in v0.2.0

func (rr RepoResult) Skipped() bool
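Skipped carries no doc comment. A plausible reading, given the two skip fields on RepoResult, is that it returns true when either one is set. The sketch below encodes that assumption on a trimmed stand-in type; the `validate` helper is hypothetical and simply checks the documented mutual exclusivity.

```go
package main

import (
	"errors"
	"fmt"
)

// RepoResult is a trimmed stand-in with only the skip-related fields.
type RepoResult struct {
	RepoName         string
	KnownSkipReason  string
	UnknownSkipError string
}

// Skipped is assumed (not confirmed by the docs) to be true when either
// skip field is non-empty.
func (rr RepoResult) Skipped() bool {
	return rr.KnownSkipReason != "" || rr.UnknownSkipError != ""
}

// validate enforces the documented rule that the two skip fields are
// mutually exclusive.
func validate(rr RepoResult) error {
	if rr.KnownSkipReason != "" && rr.UnknownSkipError != "" {
		return errors.New("KnownSkipReason and UnknownSkipError are both set")
	}
	return nil
}

func main() {
	scanned := RepoResult{RepoName: "api"}
	empty := RepoResult{RepoName: "scratch", KnownSkipReason: "empty repository"}
	fmt.Println(scanned.Skipped(), empty.Skipped(), validate(empty))
}
```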

type Rule ¶

type Rule interface {
	Name() string
	Category() RuleCategory
	Check(repo Repo) bool
	Description() string
}

Rule defines a named check that produces a pass/fail result for a repo. Description supplies the per-rule text used by the Markdown scorecard's Rule reference section: a single self-contained paragraph that names what's checked, every detection path the rule walks, and how to fix it. Category determines whether the rule feeds into the org-level score or appears in the informational-only "Additional checks" section.
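To make the interface contract concrete, here is a sketch of a type that satisfies Rule. `DefaultBranchIsMain` is a hypothetical rule invented for illustration, and the types are local stand-ins mirroring the documented declarations. Note that the API documented here shows no way to register custom rules with Scan, so this only illustrates what an implementation looks like.

```go
package main

import "fmt"

type RuleCategory string

const (
	CategoryScored     RuleCategory = "scored"
	CategoryAdditional RuleCategory = "additional"
)

// Repo is a trimmed stand-in with only the fields this rule reads.
type Repo struct {
	Name          string
	DefaultBranch string
}

// Rule mirrors the documented interface.
type Rule interface {
	Name() string
	Category() RuleCategory
	Check(repo Repo) bool
	Description() string
}

// DefaultBranchIsMain is a hypothetical rule, not part of the package.
type DefaultBranchIsMain struct{}

func (DefaultBranchIsMain) Name() string           { return "Default branch is main" }
func (DefaultBranchIsMain) Category() RuleCategory { return CategoryAdditional }
func (DefaultBranchIsMain) Check(repo Repo) bool   { return repo.DefaultBranch == "main" }
func (DefaultBranchIsMain) Description() string {
	return "Checks that the default branch is named main. Fix: rename the default branch in the repository settings."
}

// Compile-time proof that the type satisfies the interface.
var _ Rule = DefaultBranchIsMain{}

func main() {
	var r Rule = DefaultBranchIsMain{}
	fmt.Println(r.Name(), r.Category(), r.Check(Repo{Name: "api", DefaultBranch: "main"}))
}
```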

func AdditionalRules ¶ added in v0.7.0

func AdditionalRules() []Rule

AdditionalRules returns just the rules with CategoryAdditional, in AllRules order.

func AllRules ¶

func AllRules() []Rule

AllRules returns the ordered list of rules the scanner evaluates. The order is fixed and meaningful: scored rules first (by importance), then additional checks (by importance). Callers that want only one category can use ScoredRules or AdditionalRules.

func ScoredRules ¶ added in v0.7.0

func ScoredRules() []Rule

ScoredRules returns just the rules with CategoryScored, in AllRules order.
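Both ScoredRules and AdditionalRules amount to an order-preserving filter over the AllRules list. A minimal sketch of that filter follows; the `rule` struct, the rule names, and `filterByCategory` are hypothetical stand-ins, not the package's internals.

```go
package main

import "fmt"

type RuleCategory string

const (
	CategoryScored     RuleCategory = "scored"
	CategoryAdditional RuleCategory = "additional"
)

// rule is a simplified stand-in carrying only a name and a category.
type rule struct {
	name string
	cat  RuleCategory
}

// filterByCategory keeps the rules of one category in their original order.
func filterByCategory(all []rule, cat RuleCategory) []rule {
	var out []rule
	for _, r := range all {
		if r.cat == cat {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Placeholder names; scored rules come first, as in AllRules.
	all := []rule{
		{"rule-a", CategoryScored},
		{"rule-b", CategoryScored},
		{"rule-c", CategoryAdditional},
	}
	fmt.Println(filterByCategory(all, CategoryScored))
	fmt.Println(filterByCategory(all, CategoryAdditional))
}
```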

type RuleCategory ¶ added in v0.7.0

type RuleCategory string

RuleCategory classifies a rule as either a *scored* rule (contributes to the org-level score) or an *additional* check (informational only).

const (
	CategoryScored     RuleCategory = "scored"
	CategoryAdditional RuleCategory = "additional"
)

type RuleResult ¶

type RuleResult struct {
	RuleName string
	Passed   bool
}

RuleResult holds the outcome of a single rule check for a single repo.

type ScanResult ¶ added in v0.6.0

type ScanResult struct {
	Org              string
	ScannedAt        time.Time
	TotalRepos       int          // total repos returned by GitHub before any filtering
	ArchivedExcluded int          // archived repos filtered out at listing time
	ForksExcluded    int          // forked repos filtered out at listing time
	Skipped          []RepoResult // empty repos, truncated trees, or unexpected errors during the scan
	Results          []RepoResult // repos that finished scanning (success or fail per-rule)

	// RulesScored and RulesAdditional are the rules actually run against
	// each repo, split by category. They reflect WithAdmin filtering: an
	// admin-only rule skipped on a non-admin scan does NOT appear here,
	// so all downstream math (Score, BucketOf, table aggregation) is
	// driven directly by these slices instead of inferring evaluated
	// rules from RepoResult.Results.
	//
	// JSON-tagged "-" because Rule is an interface and consumers that
	// marshal a ScanResult should instead build their own per-rule
	// payload (see cmd/bulk-scan for an example). The fields are stable
	// for in-process use only.
	RulesScored     []Rule `json:"-"`
	RulesAdditional []Rule `json:"-"`
}

ScanResult bundles the scan outcome with the listing-time exclusion counts the scanner accumulates while filtering archived and forked repos. The counts let callers report a full breakdown ("32 total, 4 forks excluded, 2 archived excluded, 26 scanned") without re-querying GitHub.

The library does not expose a precomputed "most recent commit across the org"; each RepoResult carries its own MostRecentCommit and consumers aggregate as needed.

func Scan ¶

func Scan(ctx context.Context, auth Auth, opts ...Option) (ScanResult, error)

Scan lists repositories accessible to auth and evaluates every rule against each non-archived, non-forked repo. Forks and archived repos are excluded at listing time and surface in the returned ScanResult's ForksExcluded / ArchivedExcluded counts.

Directories ¶

Path	Synopsis
cmd	
bulk-scan (command)	bulk-scan reads a list of GitHub orgs/users from a file, runs the scanner against each one, and writes per-org output files (scorecard.md + stats.json) into a destination folder.
generate-sample (command)	generate-sample renders samples.Fixture() through scanner.GenerateReport and writes the resulting Markdown to stdout (or to a file via --out).
scanner (command)	
samples	Package samples provides the canonical sample scorecard used to drive the landing page hero and the app's dev-seed data.
