scanner

package module
v0.8.0 Latest
Published: May 4, 2026 License: MIT Imports: 9 Imported by: 0

README

Codatus

Codatus scans every repository in a GitHub organization or user account against a set of engineering standards and produces a Markdown scorecard.

It answers one question: how does each repo in your org measure up against the standards you care about?

This repository is a Go library and a CLI. Posting the scorecard (e.g., as a GitHub Issue) is the caller's responsibility - the scanner returns structured results and Markdown, nothing more.


How it works

  1. Codatus receives a GitHub account to scan (organization or user).
  2. It lists the repositories accessible to the token, then filters out archived and forked repos. Both exclusions are reported in the scorecard header so the reader can see the full breakdown (Total repos, Forks excluded, Archived excluded, Repos scanned).
  3. For each remaining repo, it runs the 10 rule checks described below (9 on a public scan, which skips the one admin-only rule).
  4. It produces a single Markdown scorecard summarizing pass/fail per repo per rule, plus a structured ScanResult value the caller can post-process (e.g. the bulk-scan binary serializes per-rule aggregates to JSON).

The CLI prints the Markdown to stdout. Callers using the library get a ScanResult (org name, scan timestamp, exclusion counts, per-repo results, skipped repos) and can generate the Markdown via GenerateReport(scanResult).
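
As a rough illustration of step 2, the sketch below filters a repo list and keeps the counts the scorecard header reports. The `repo` and `exclusions` types and the `filterRepos` helper are hypothetical stand-ins, not the library's API, and counting a repo that is both a fork and archived as a fork is an assumption.

```go
package main

import "fmt"

// repo is a hypothetical, trimmed-down stand-in for the library's Repo type.
type repo struct {
	Name     string
	Archived bool
	Fork     bool
}

// exclusions mirrors the counts shown in the scorecard header.
type exclusions struct {
	Total, Forks, Archived int
}

// filterRepos drops forks and archived repos and counts each exclusion.
// Assumption: a repo that is both a fork and archived is counted as a fork.
func filterRepos(all []repo) ([]repo, exclusions) {
	ex := exclusions{Total: len(all)}
	var kept []repo
	for _, r := range all {
		switch {
		case r.Fork:
			ex.Forks++
		case r.Archived:
			ex.Archived++
		default:
			kept = append(kept, r)
		}
	}
	return kept, ex
}

func main() {
	kept, ex := filterRepos([]repo{
		{Name: "api"},
		{Name: "old-site", Archived: true},
		{Name: "vendored-lib", Fork: true},
	})
	fmt.Printf("total=%d forks=%d archived=%d scanned=%d\n",
		ex.Total, ex.Forks, ex.Archived, len(kept))
}
```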


Rules

Each rule produces a pass or fail result per repository. Rules fall into two categories:

  • Scored rules drive the org-level score. The score is the arithmetic mean of pass rates across the 5 scored rules. Per-repo classification (Strong / Moderate / Weak) is also based on what fraction of scored rules a repo passes.
  • Additional checks are informational only. They appear in the report as coverage numbers but do not affect the score. They surface "nice to have" hygiene that's worth seeing but isn't load-bearing for whether a repo's standards are in good shape.

Scored rules (drive the org-level score)

1. Has branch protection

Check: the default branch has branch protection enabled, as reported by any of three GitHub API signals (rulesets, classic protection, or the public branch endpoint).

Pass: branch protection is configured on the default branch (via rulesets, classic per-repo protection, or the public branch endpoint's protected flag). Fail: none of the three signals indicate protection.

The scanner consults three GitHub APIs in priority order: rulesets (publicly readable), the admin-only classic-protection endpoint (returns 404 to non-admins), and the public branch endpoint (which exposes protected: true/false to anyone with read access). The third fallback exists so non-admin scans can still detect classic protection without admin privileges - they just can't read all the details of that protection.
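
A minimal sketch of that priority order, assuming each API's answer has already been fetched and reduced to a boolean. The `protectionSource` helper is hypothetical; the real scanner works through its GitHubClient interface and may combine the signals differently.

```go
package main

import "fmt"

// protectionSource returns the first source, in priority order, that reports
// protection on the default branch, or "" when none of the three does.
// The three booleans are hypothetical pre-fetched answers from each API.
func protectionSource(rulesets, classic, branchFlag bool) string {
	switch {
	case rulesets:
		return "rulesets"
	case classic:
		return "classic protection"
	case branchFlag:
		return "public branch endpoint"
	default:
		return ""
	}
}

func main() {
	// A non-admin scan: no rulesets, the classic endpoint returned 404,
	// but the public branch endpoint says protected: true.
	src := protectionSource(false, false, true)
	fmt.Println("pass:", src != "", "source:", src)
}
```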

2. Has required reviewers

Check: the default branch's protection rules require at least one approving review before merging.

Admin-only. This rule reads required_pull_request_reviews.required_approving_review_count, which is admin-only on classic per-repo protection. Public scans (without --admin) skip this rule entirely - it doesn't appear in the JSON output, the Markdown table, or the score calculation. When you scan with admin access (e.g. via the Codatus GitHub App, or your own org with a PAT belonging to an admin), pass WithAdmin(true) (library) or --admin (CLI) to include it.

Pass: required reviewers is set to 1 or more. Fail: required reviewers is not configured, or set to 0, or branch protection is not enabled.

3. Requires status checks before merging

Check: the default branch's protection rules require at least one status check to pass before merging.

Pass: at least one required status check is configured (via rulesets, classic protection, or the public branch endpoint's protection.required_status_checks.contexts). Fail: none of those signals expose required contexts.

The public branch endpoint exposes the contexts array even to non-admin readers, so this rule is correctly answered for both admin and public scans.

4. Has CODEOWNERS

Check: a CODEOWNERS file exists in one of the three standard locations: root (/CODEOWNERS), docs/CODEOWNERS, or .github/CODEOWNERS.

Pass: file found in any of the three locations. Fail: file not found in any location.

5. Has CI workflow

Check: the repo has a CI workflow configured for any of the supported providers:

  • GitHub Actions: .github/workflows/*.yml or *.yaml
  • CircleCI: .circleci/config.yml
  • GitLab CI: .gitlab-ci.yml
  • Travis CI: .travis.yml
  • Buildkite: any file under .buildkite/
  • Azure Pipelines: azure-pipelines.yml
  • Jenkins: Jenkinsfile

Pass: at least one of the above paths is present in the repo. Fail: none of the recognized CI configurations exist. (Repos with server-side-only CI integrations - e.g. CircleCI without a checked-in config - are still missed; the rule is best-effort based on what's in the tree.)
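
The path list above could be matched roughly as follows. This is a sketch only: `hasCIWorkflow` here is a free function over plain file paths rather than the library's rule type, and case-sensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// hasCIWorkflow reports whether any repo-relative file path matches one of
// the CI locations listed above. Matching is case-sensitive (an assumption).
func hasCIWorkflow(files []string) bool {
	exact := map[string]bool{
		".circleci/config.yml": true,
		".gitlab-ci.yml":       true,
		".travis.yml":          true,
		"azure-pipelines.yml":  true,
		"Jenkinsfile":          true,
	}
	globs := []string{".github/workflows/*.yml", ".github/workflows/*.yaml"}
	for _, f := range files {
		if exact[f] || strings.HasPrefix(f, ".buildkite/") {
			return true
		}
		for _, g := range globs {
			// path.Match's "*" does not cross "/" boundaries.
			if ok, _ := path.Match(g, f); ok {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(hasCIWorkflow([]string{"README.md", ".github/workflows/ci.yaml"}))
	fmt.Println(hasCIWorkflow([]string{"README.md", "docs/Jenkinsfile"}))
}
```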

Additional checks (informational only)

6. Has README

Check: a README file exists at the repository root, matched case-insensitively with any extension or none. README.md, Readme.rst, README.txt, README.markdown, and readme all pass.

Pass: file found at root. Fail: no root-level file whose name is "readme" or starts with "readme." (case-insensitive). Subdirectory READMEs (e.g. docs/README.md) don't count.

There is no size threshold - any README counts. The previous "substantial" variant required >2 KB, which discriminated poorly.
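
One way to express that match, as a sketch: `isRootReadme` is a hypothetical helper that takes a repo-relative path, and it reads "starts with readme." as requiring the dot.

```go
package main

import (
	"fmt"
	"strings"
)

// isRootReadme reports whether a repo-relative path is a root-level README:
// the name is "readme" or starts with "readme.", compared case-insensitively.
func isRootReadme(p string) bool {
	if strings.Contains(p, "/") {
		return false // subdirectory files such as docs/README.md don't count
	}
	name := strings.ToLower(p)
	return name == "readme" || strings.HasPrefix(name, "readme.")
}

func main() {
	for _, p := range []string{"README.md", "Readme.rst", "readme", "docs/README.md"} {
		fmt.Println(p, isRootReadme(p))
	}
}
```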

7. Has LICENSE

Check: GitHub auto-detected an open-source license for the repository (the license.spdx_id field on the listing payload is non-empty). GitHub uses the Licensee gem to detect license files at any conventional path or filename - LICENSE, LICENSE.md, LICENSE.txt, COPYING, LICENCE, etc.

Pass: GitHub returned a license SPDX id. Fail: GitHub couldn't auto-detect a license (no recognized license file, or a custom-text license Licensee doesn't recognize).

8. Has repo description

Check: the GitHub repository description field is not blank.

Pass: description is set and non-empty. Fail: description is blank or not set.

9. Has activity

Check: the most recent push to any branch is within the last 12 months (via the GitHub API's pushed_at field on the repository).

Pass: the repository was pushed within the last 12 months. Fail: the repository has not been pushed in the last 12 months, or has never been pushed. Archived repositories are filtered out before scanning, so they never reach this rule.
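
The 12-month window can be sketched with the standard library's time arithmetic. `hasActivity` and `day` below are hypothetical helpers; treating a push exactly 12 months old as passing is an assumption, since the README does not say which side of the boundary is inclusive.

```go
package main

import (
	"fmt"
	"time"
)

// day builds a UTC midnight timestamp; a small helper for the examples.
func day(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

// hasActivity reports whether pushedAt falls within the 12 months before now.
// A zero pushedAt is treated as "never pushed" and fails.
// Assumption: a push exactly on the cutoff still passes.
func hasActivity(pushedAt, now time.Time) bool {
	if pushedAt.IsZero() {
		return false
	}
	cutoff := now.AddDate(0, -12, 0)
	return !pushedAt.Before(cutoff)
}

func main() {
	now := day(2026, 5, 4) // fixed "now" keeps the result deterministic
	fmt.Println(hasActivity(day(2025, 6, 1), now))
	fmt.Println(hasActivity(day(2024, 1, 15), now))
	fmt.Println(hasActivity(time.Time{}, now))
}
```

Passing the current time in explicitly plays the same role as the Now field on the HasActivity rule: tests can pin it to a fixed instant.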

10. Has SECURITY.md

Check: a SECURITY.md file exists in any of the three locations GitHub recognizes for security policies: repo root, .github/SECURITY.md, or docs/SECURITY.md.

Pass: file found in any of those three locations. Fail: file not found.

Score and bucketing

The org-level score is the arithmetic mean of pass rates across the scored rules that were actually evaluated. For an admin scan that's all 5 scored rules; for a public scan it's 4 (HasRequiredReviewers is admin-only and silently skipped). The denominator adapts so the score isn't dragged toward zero by rules the scan couldn't see.

admin scan:   score = mean of 5 per-rule pass rates
public scan:  score = mean of 4 per-rule pass rates (no required-reviewers)

Each repo also gets a percentage based on the fraction of evaluated scored rules it passes (5-rule scans land at 0/20/40/60/80/100; 4-rule scans land at 0/25/50/75/100), and is bucketed:

  • Strong (≥80%)
  • Moderate (30-79%)
  • Weak (≤29%)

Additional checks do not affect either the org score or the per-repo bucket.
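
To make the adaptive denominator concrete, here is a small sketch of the org-level score. `orgScore` is a hypothetical helper, not the library's Score function; averaging unrounded pass rates and rounding once at the end is an assumption. The pass counts are the ones from the sample scorecard in this README (53 repos).

```go
package main

import (
	"fmt"
	"math"
)

// orgScore returns the rounded mean of per-rule pass rates (each in 0..1)
// and whether the score is defined. It is undefined with no evaluated rules.
func orgScore(passRates []float64) (int, bool) {
	if len(passRates) == 0 {
		return 0, false
	}
	sum := 0.0
	for _, r := range passRates {
		sum += r
	}
	return int(math.Round(sum / float64(len(passRates)) * 100)), true
}

func main() {
	// Admin scan: all 5 scored rules were evaluated.
	admin := []float64{11.0 / 53, 4.0 / 53, 4.0 / 53, 3.0 / 53, 27.0 / 53}
	// Public scan: the admin-only required-reviewers rate is simply absent,
	// so the mean is taken over 4 rules instead of 5.
	public := []float64{11.0 / 53, 4.0 / 53, 3.0 / 53, 27.0 / 53}

	fmt.Println(orgScore(admin))
	fmt.Println(orgScore(public))
	fmt.Println(orgScore(nil))
}
```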


Scorecard format

The scorecard is a single Markdown document. Structure:

# Codatus - Engineering Standards Scorecard

**Org:** {org_name}<br>
**Scanned:** {timestamp}<br>
**Repos:** {scanned} of {total} scanned ({forks} forks excluded, {archived} archived excluded, {skipped} skipped)

## Scored rules

| Rule | Passing | Failing | Pass rate |
|------|---------|---------|----------|
| Has branch protection | 11 | 42 | 20% |
| Has required reviewers | 4 | 49 | 7% |
| Requires status checks before merging | 4 | 49 | 7% |
| Has CODEOWNERS | 3 | 50 | 5% |
| Has CI workflow | 27 | 26 | 50% |

**Score: 18/100** (average pass rate across the scored rules above)

## Additional checks

| Rule | Passing | Failing | Pass rate |
|------|---------|---------|----------|
| Has README | 50 | 3 | 94% |
| Has LICENSE | 38 | 15 | 71% |
| Has repo description | 46 | 7 | 86% |
| Has activity | 43 | 10 | 81% |
| Has SECURITY.md | 3 | 50 | 5% |

## Repository details

### Strong (≥80%)

<details>
<summary><a href="https://github.com/{org}/repo-a">repo-a</a> - 100%</summary>

</details>

### Moderate (30-79%)

<details>
<summary><a href="https://github.com/{org}/repo-b">repo-b</a> - 60%</summary>

**Failing scored rules:**
- Has CODEOWNERS
- Requires status checks before merging

**Additional check failures:**
- Has SECURITY.md

</details>

### Weak (≤29%)

<details>
<summary><a href="https://github.com/{org}/repo-c">repo-c</a> - 0%</summary>

**Failing scored rules:**
- Has branch protection
- Has required reviewers
- Requires status checks before merging
- Has CODEOWNERS
- Has CI workflow

</details>

### Skipped ({n} repos)             <-- only if any repos were skipped

- [empty-repo](https://github.com/{org}/empty-repo) - repository is empty
- [huge-repo](https://github.com/{org}/huge-repo) - file tree too large (truncated by GitHub API)

## Rule reference

<details>
<summary>What each rule checks and how to fix it</summary>

### Scored rules

#### Has branch protection
- **What it checks:** ...
- **How to fix:** ...

---

#### Has required reviewers
...

### Additional checks

#### Has README
...

</details>

Header line breaks use explicit <br> so spec-compliant Markdown renderers (CommonMark/marked.js/kramdown/GitHub) emit one line per item instead of folding consecutive single-newlines into one paragraph. The repo-stats parenthetical drops fields that are zero - with no exclusions and no skipped repos, it collapses to **Repos:** {scanned} of {total} scanned.

Tables render in fixed importance order (not sorted by pass rate). Both tables share the same column layout. The Rule reference section is collapsed by default and lists, for every rule actually present in the scan results, the "what it checks" / "how to fix" text - split into Scored rules and Additional checks subsections. Subsections (and entire buckets) are omitted when empty. Skipped repos are those that could not be scanned (empty repos, truncated file trees, API errors); they are excluded from the score and bucket counts, and render as the last subsection inside Repository details.

When repos_scanned is 0, the Score line reads **Score: N/A** (no repos available to score).
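
The zero-dropping behaviour of the repo-stats line can be sketched like this. `reposLine` is a hypothetical helper, not the library's renderer, and it makes no attempt at singular/plural wording.

```go
package main

import (
	"fmt"
	"strings"
)

// reposLine builds the **Repos:** header line, omitting any parenthetical
// field whose count is zero.
func reposLine(scanned, total, forks, archived, skipped int) string {
	var parts []string
	if forks > 0 {
		parts = append(parts, fmt.Sprintf("%d forks excluded", forks))
	}
	if archived > 0 {
		parts = append(parts, fmt.Sprintf("%d archived excluded", archived))
	}
	if skipped > 0 {
		parts = append(parts, fmt.Sprintf("%d skipped", skipped))
	}
	line := fmt.Sprintf("**Repos:** %d of %d scanned", scanned, total)
	if len(parts) > 0 {
		line += " (" + strings.Join(parts, ", ") + ")"
	}
	return line
}

func main() {
	fmt.Println(reposLine(53, 60, 5, 2, 0))
	fmt.Println(reposLine(10, 10, 0, 0, 0))
}
```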

Canonical sample fixture

samples.Fixture() returns a deterministic ScanResult for the fictional acme-corp org. It's the single source of truth for the sample scorecard shown on the landing page and used as dev-seed data in the app, replacing what used to be hand-typed Markdown in each downstream repo.

Go consumers render it in process:

md := scanner.GenerateReport(samples.Fixture())

Non-Go consumers use the generator binary, which writes Markdown to stdout (or to --out):

go run github.com/CodatusHQ/scanner/cmd/generate-sample > sample-scorecard.md

No rendered .md is committed here - downstream copies are refreshed on demand by re-running the generator.


Scanner configuration

scanner.Scan(ctx, auth, opts...) takes an Auth - a sealed interface implemented by two concrete types. Pick the one that matches your token.

PATAuth - personal access token

For scanning with a user-generated token (classic or fine-grained PAT). Scanner calls /orgs/{Name}/repos and falls back to /users/{Name}/repos on 404, so it works for both org and user accounts.

results, err := scanner.Scan(ctx, scanner.PATAuth{
    Token: "ghp_...",
    Name:  "my-org",        // or a user login like "octocat"
})

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| Token | string | Yes | Personal access token |
| Name | string | Yes | GitHub organization or user login to scan |

InstallationAuth - GitHub App installation

For scanning as a GitHub App. Scanner calls /installation/repositories, which returns exactly the repos the installation was granted access to. Works identically for org and user installs, and respects "Selected repositories" mode (no leak of other public repos).

results, err := scanner.Scan(ctx, scanner.InstallationAuth{
    Token: "ghs_...",       // short-lived installation access token
    Name:  "my-org",        // the account the app is installed on
})

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| Token | string | Yes | Installation access token (from /app/installations/{id}/access_tokens) |
| Name | string | Yes | Account the app is installed on; used for per-repo URL construction |

Options

| Option | Description |
|--------|-------------|
| WithBaseURL(url string) | Override the GitHub API base URL. Defaults to the public GitHub API. Useful for testing against a mock server or targeting GitHub Enterprise. |
| WithAdmin(b bool) | Tell the scanner the auth has admin access on every target repo. Default false. When true, the scanner runs admin-only rules (currently: Has required reviewers). When false, those rules are silently skipped - they don't appear in any per-repo result, the JSON output, or the Markdown report. Pass true for installation-token scans (the Codatus GitHub App is granted admin), or for PAT scans where you're an admin of every target org. |

Required token permissions

Classic PAT:

  • repo — read repo contents and branch protection
  • read:org — required when Name is an organization

Fine-grained PAT: scoped to the target account, with Repository permissions:

  • Metadata: Read
  • Contents: Read
  • Administration: Read (for branch protection)

Installation token: permissions come from the GitHub App's configured repository permissions (not PAT scopes). At minimum the app needs Contents (read) and Metadata (read); Administration (read) is required for branch protection rules to resolve.

How these values are sourced (env vars, CLI flags, config file) is the responsibility of the caller, not the scanner module.

CLI

The codatus binary reads CODATUS_ORG and CODATUS_TOKEN from the environment, wraps them in PATAuth, runs a scan, and prints the Markdown scorecard to stdout. Log output (scan summary, errors) goes to stderr so stdout stays clean for piping.

Despite the name, CODATUS_ORG accepts either an organization slug or a user login - the library dispatches automatically.

# Organization
CODATUS_ORG=myorg CODATUS_TOKEN=ghp_... codatus > scorecard.md

# User account
CODATUS_ORG=my-username CODATUS_TOKEN=ghp_... codatus > scorecard.md

Bulk scan (many orgs at once)

The bulk-scan binary reads a list of orgs/users from a file (one slug per line, blank lines skipped, no comment handling) and scans them sequentially. For each org it writes a scorecard.md and a stats.json into a per-org subfolder, so partial runs preserve completed work even if a later scan aborts.

# orgs.txt (file contents below; this label line is not part of the file)
acme-corp
wayne-enterprises
stark-industries

# Run
bulk-scan --orgs orgs.txt --out ./scans --token ghp_...
# or with the token in env:
CODATUS_TOKEN=ghp_... bulk-scan --orgs orgs.txt --out ./scans
# admin-mode scan (you're an admin of every listed org, so include the
# `Has required reviewers` rule too):
CODATUS_TOKEN=ghp_... bulk-scan --orgs orgs.txt --out ./scans --admin

Output layout:

scans/
├── acme-corp/
│   ├── scorecard.md     # same Markdown the single-org CLI produces
│   └── stats.json       # structured aggregates: per-rule pass rates, totals, exclusion counts
├── wayne-enterprises/
│   ├── scorecard.md
│   └── stats.json
└── ...

Progress prints to stderr per org ([2/3] wayne-enterprises ... ok (42 scanned, 18/42 compliant = 42%)); a final summary lists succeeded / failed / not-attempted counts.

Failure handling:

  • Per-org errors (404, 403, "user not found", etc.) - logged, run continues to the next org.
  • Global errors (429 rate limit, 401 auth) - run aborts immediately. Files for orgs that already completed remain on disk; the un-scanned tail is reported in the summary as "not attempted."

Exit code is non-zero if any org failed or was not attempted.
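
The abort-or-continue policy above can be sketched as a loop over orgs. Everything here is hypothetical (`runAll`, `summary`, the two sentinel errors); in the real tool a check such as IsRateLimitError would decide what counts as global. Counting the org that hit the global error as failed rather than not attempted is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for real API failures.
var (
	errGlobal   = errors.New("global error (rate limit or bad credentials)")
	errNotFound = errors.New("org not found")
)

type summary struct {
	Succeeded, Failed, NotAttempted int
}

// runAll scans orgs in order. Per-org errors are counted and skipped past;
// a global error aborts the run and leaves the rest as not attempted.
func runAll(orgs []string, scan func(org string) error) summary {
	var s summary
	for i, org := range orgs {
		err := scan(org)
		switch {
		case err == nil:
			s.Succeeded++
		case errors.Is(err, errGlobal):
			s.Failed++
			s.NotAttempted = len(orgs) - i - 1
			return s
		default:
			s.Failed++
		}
	}
	return s
}

// exitCode is non-zero if any org failed or was not attempted.
func exitCode(s summary) int {
	if s.Failed > 0 || s.NotAttempted > 0 {
		return 1
	}
	return 0
}

func main() {
	scan := func(org string) error {
		switch org {
		case "missing-org":
			return errNotFound
		case "stark-industries":
			return errGlobal
		}
		return nil
	}
	s := runAll([]string{"acme-corp", "missing-org", "stark-industries", "wayne-enterprises"}, scan)
	fmt.Printf("%+v exit=%d\n", s, exitCode(s))
}
```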


What Codatus is not

  • Not a velocity/DORA metrics tool. It does not measure cycle time, deployment frequency, or review speed. That's a different product category (Swarmia, LinearB, CodePulse).
  • Not a security scanner. It checks whether SECURITY.md exists and whether branch protection is on, but it does not scan code for vulnerabilities (use Snyk, Dependabot, or OpenSSF Scorecard).
  • Not a developer portal. There is no service catalog, no scaffolding, no self-service actions (Backstage, Cortex, OpsLevel cover that). Just standards.

Documentation

Constants

This section is empty.

Variables

var (
	ErrEmptyRepo     = errors.New("repository is empty")
	ErrTruncatedTree = errors.New("tree truncated by GitHub API")
)

Sentinel errors for per-repo scan failures.

Functions

func GenerateReport

func GenerateReport(sr ScanResult) string

GenerateReport produces a Markdown engineering-standards scorecard from a ScanResult. The structure is fixed and meaningful for prospects landing from a cold-email link:

  1. Header: title, org, scan time, single-line repo stats
  2. ## Scored rules table (importance order, drives the score)
  3. **Score: N/100** inline callout (or **Score: N/A** when no repos)
  4. ## Additional checks table (importance order, same columns as scored)
  5. ## Repository details: ### Strong / Moderate / Weak / Skipped subsections
  6. ## Rule reference (collapsed <details>, split by category)

func IsRateLimitError added in v0.6.0

func IsRateLimitError(err error) bool

IsRateLimitError reports whether an error is a GitHub rate limit error (primary or secondary). Rate limit errors must never be swallowed - they indicate a global problem that affects all subsequent API calls. Exported so callers (e.g., bulk-scan) can decide whether to abort a multi-org run on the first rate-limited org rather than continue and fail every subsequent call.

func Score added in v0.7.0

func Score(sr ScanResult) (score int, defined bool)

Score computes the org-level score: the arithmetic mean of pass rates across sr.RulesScored. Returns the score (0-100) and a flag indicating whether it's defined. When sr has no scanned repos OR no scored rules were evaluated (e.g., a non-admin scan with admin-only rules filtered out and no scored rules left), defined=false and the caller should render "N/A". Result is rounded to the nearest integer for display.

The denominator is len(sr.RulesScored), not the size of the global scored-rule set - that's how non-admin scans get the math right without rules they couldn't evaluate dragging the score down.

Types

type Auth added in v0.3.0

type Auth interface {
	// contains filtered or unexported methods
}

Auth identifies how the scanner authenticates to GitHub. It is a sealed interface - only PATAuth and InstallationAuth in this package satisfy it. Because the isAuth() method is unexported, new auth types can only be added inside this package, by defining a struct with an isAuth() method.
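
For readers unfamiliar with the pattern, this is roughly what a sealed interface looks like in Go. The lower-case types below are illustrative stand-ins, and `listEndpoint` is a hypothetical dispatcher; only the isAuth() marker method and the two endpoint paths come from this documentation.

```go
package main

import "fmt"

// auth is sealed: isAuth is unexported, so only types declared in this
// package can implement it.
type auth interface {
	isAuth()
}

type patAuth struct{ Token, Name string }
type installationAuth struct{ Token, Name string }

func (patAuth) isAuth()          {}
func (installationAuth) isAuth() {}

// listEndpoint picks the repo-listing endpoint for each auth type.
func listEndpoint(a auth) string {
	switch v := a.(type) {
	case patAuth:
		return "/orgs/" + v.Name + "/repos"
	case installationAuth:
		return "/installation/repositories"
	default:
		return ""
	}
}

func main() {
	fmt.Println(listEndpoint(patAuth{Name: "my-org"}))
	fmt.Println(listEndpoint(installationAuth{Name: "my-org"}))
}
```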

type BranchProtection

type BranchProtection struct {
	RequiredReviewers    int
	RequiredStatusChecks []string
}

BranchProtection holds the branch protection settings the scanner needs.

type Bucket added in v0.7.0

type Bucket struct {
	Name   string // "Strong", "Moderate", "Weak"
	MinPct int    // inclusive lower bound (0..100)
	MaxPct int    // inclusive upper bound (0..100)
}

Bucket classifies a repo by what fraction of the scored rules it passes. Each bucket covers an integer percentage range; the full bucket set returned by Buckets() covers [0, 100] without gaps or overlaps. Display labels are derived from MinPct/MaxPct at render time (see report.go).

func BucketOf added in v0.7.0

func BucketOf(rr RepoResult, scoredRules []Rule) (b Bucket, scoredPassing, scoredTotal, scorePct int)

BucketOf classifies a single repo by the percentage of scored rules it passes. The caller passes the scored-rule set so the denominator is stable across the org's scan: every repo gets the same denominator, regardless of which rules happen to appear in any one repo's results. Pass sr.RulesScored from the parent ScanResult.

Returns the matching Bucket plus the underlying counts so callers don't re-derive them. If scoredRules is empty the result is the last-defined bucket (i.e. Weak) with zero counts; this only happens in test fixtures with no scored rules registered.

func Buckets added in v0.7.0

func Buckets() []Bucket

Buckets returns the score-range buckets in display order (highest range first). Adding/removing buckets, renaming them, or shifting thresholds is a one-place edit here - report and stats output both derive from this list and need no separate updates.
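
A sketch of how classification against such a bucket list might work, using the ranges from the README. The lower-case `bucket`, `buckets` and `bucketOf` are hypothetical simplifications (the real BucketOf takes a RepoResult and a rule set), and truncating the percentage to an integer is an assumption.

```go
package main

import "fmt"

type bucket struct {
	Name           string
	MinPct, MaxPct int // inclusive bounds
}

// buckets lists the ranges from the README, highest range first.
func buckets() []bucket {
	return []bucket{
		{"Strong", 80, 100},
		{"Moderate", 30, 79},
		{"Weak", 0, 29},
	}
}

// bucketOf classifies a repo from its passing and total scored-rule counts.
// With total == 0 it returns the last bucket and a zero percentage.
func bucketOf(passing, total int) (bucket, int) {
	bs := buckets()
	if total == 0 {
		return bs[len(bs)-1], 0
	}
	pct := passing * 100 / total // integer percentage (assumption)
	for _, b := range bs {
		if pct >= b.MinPct && pct <= b.MaxPct {
			return b, pct
		}
	}
	return bs[len(bs)-1], pct
}

func main() {
	for _, c := range [][2]int{{5, 5}, {3, 5}, {1, 4}, {0, 0}} {
		b, pct := bucketOf(c[0], c[1])
		fmt.Printf("%d/%d -> %s (%d%%)\n", c[0], c[1], b.Name, pct)
	}
}
```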

type FileEntry

type FileEntry struct {
	Path string // full path relative to repo root (e.g., ".github/workflows/ci.yml")
	Size int
	Type string // "blob" (file) or "tree" (directory)
}

FileEntry represents a file or directory in a repo.

type GitHubClient

type GitHubClient interface {
	// ListReposByAccount lists repos for a named org (falls back to user on 404).
	// Used by PAT auth.
	ListReposByAccount(ctx context.Context, name string) ([]Repo, error)
	// ListReposByInstallation lists the repos the current GitHub App installation
	// was granted access to. Used by installation-token auth.
	ListReposByInstallation(ctx context.Context) ([]Repo, error)
	GetTree(ctx context.Context, owner, repo, branch string) ([]FileEntry, error)
	GetBranchProtection(ctx context.Context, owner, repo, branch string) (*BranchProtection, error)
	GetRulesets(ctx context.Context, owner, repo, branch string) (*BranchProtection, error)
	// GetBranchInfo reads the public GET /repos/{o}/{r}/branches/{br}
	// endpoint, which exposes the protected flag and (for classic
	// per-repo branch protection) the required-status-check contexts to
	// any reader - including non-admins on public repos. This is the
	// fallback when the admin GetBranchProtection 404s and there are no
	// rulesets, so the scanner can still tell whether protection is on
	// and which status checks are required. Required-reviewer counts
	// are NOT exposed here (admin-only field on classic protection).
	GetBranchInfo(ctx context.Context, owner, repo, branch string) (*BranchProtection, error)
}

GitHubClient is the interface for all GitHub API interactions. The scanner depends only on this interface, making it testable via mocks.

func NewGitHubClient

func NewGitHubClient(token string) GitHubClient

NewGitHubClient creates a GitHubClient that calls the public GitHub REST API.

type HasActivity added in v0.5.1

type HasActivity struct {
	Now time.Time
}

HasActivity checks that the repo has had a commit (push) within the last 12 months. Set Now to a fixed time for deterministic testing; the zero value means time.Now() is used at check time.

func (HasActivity) Category added in v0.7.0

func (r HasActivity) Category() RuleCategory

func (HasActivity) Check added in v0.5.1

func (r HasActivity) Check(repo Repo) bool

func (HasActivity) Description added in v0.5.1

func (r HasActivity) Description() string

func (HasActivity) HowToFix added in v0.5.1

func (r HasActivity) HowToFix() string

func (HasActivity) Name added in v0.5.1

func (r HasActivity) Name() string

type HasBranchProtection

type HasBranchProtection struct{}

HasBranchProtection checks that the default branch has protection rules enabled.

func (HasBranchProtection) Category added in v0.7.0

func (r HasBranchProtection) Category() RuleCategory

func (HasBranchProtection) Check

func (r HasBranchProtection) Check(repo Repo) bool

func (HasBranchProtection) Description added in v0.4.0

func (r HasBranchProtection) Description() string

func (HasBranchProtection) HowToFix added in v0.4.0

func (r HasBranchProtection) HowToFix() string

func (HasBranchProtection) Name

func (r HasBranchProtection) Name() string

type HasCIWorkflow

type HasCIWorkflow struct{}

HasCIWorkflow checks that the repo has a CI workflow configured for any of the well-known CI providers, not just GitHub Actions. Detected via the presence of one of these signals at the repo root or under their canonical directory:

  • GitHub Actions: .github/workflows/*.yml or *.yaml
  • CircleCI: .circleci/config.yml
  • GitLab CI: .gitlab-ci.yml
  • Travis CI: .travis.yml
  • Buildkite: any file under .buildkite/
  • Azure Pipelines: azure-pipelines.yml
  • Jenkins: Jenkinsfile

Repos using a CI integration that lives entirely server-side (e.g., CircleCI without a checked-in config) are still missed; this is a best-effort signal based on what's visible in the repo.

func (HasCIWorkflow) Category added in v0.7.0

func (r HasCIWorkflow) Category() RuleCategory

func (HasCIWorkflow) Check

func (r HasCIWorkflow) Check(repo Repo) bool

func (HasCIWorkflow) Description added in v0.4.0

func (r HasCIWorkflow) Description() string

func (HasCIWorkflow) HowToFix added in v0.4.0

func (r HasCIWorkflow) HowToFix() string

func (HasCIWorkflow) Name

func (r HasCIWorkflow) Name() string

type HasCodeowners

type HasCodeowners struct{}

HasCodeowners checks that a CODEOWNERS file exists in root, docs/, or .github/.

func (HasCodeowners) Category added in v0.7.0

func (r HasCodeowners) Category() RuleCategory

func (HasCodeowners) Check

func (r HasCodeowners) Check(repo Repo) bool

func (HasCodeowners) Description added in v0.4.0

func (r HasCodeowners) Description() string

func (HasCodeowners) HowToFix added in v0.4.0

func (r HasCodeowners) HowToFix() string

func (HasCodeowners) Name

func (r HasCodeowners) Name() string

type HasLicense

type HasLicense struct{}

HasLicense uses GitHub's auto-detected license (Licensee) instead of a path-pattern match, so any conventionally-named license file works: LICENSE, LICENSE.md, LICENSE.txt, LICENCE (British), COPYING (GNU), MIT-LICENSE, etc. - anything GitHub recognizes and surfaces as the repo's `license.spdx_id` in the listing payload.

Custom-text licenses GitHub can't auto-detect won't pass even though the file may be present. That's a known false negative; the trade-off is worth it for the much broader correct-positive coverage.

func (HasLicense) Category added in v0.7.0

func (r HasLicense) Category() RuleCategory

func (HasLicense) Check

func (r HasLicense) Check(repo Repo) bool

func (HasLicense) Description added in v0.4.0

func (r HasLicense) Description() string

func (HasLicense) HowToFix added in v0.4.0

func (r HasLicense) HowToFix() string

func (HasLicense) Name

func (r HasLicense) Name() string

type HasReadme added in v0.7.0

type HasReadme struct{}

HasReadme checks that some form of README file exists at the repo root. Matches case-insensitively on the filename and accepts any extension (or no extension), so README.md, readme.rst, README.txt, Readme, README.markdown all pass. Subdirectory READMEs (e.g., docs/README.md) don't count - the rule is about a top-level project README.

(No size threshold - the previous "substantial" variant was dropped because 2 KB is too low to discriminate quality and too high to reward minimal but useful READMEs.)

func (HasReadme) Category added in v0.7.0

func (r HasReadme) Category() RuleCategory

func (HasReadme) Check added in v0.7.0

func (r HasReadme) Check(repo Repo) bool

func (HasReadme) Description added in v0.7.0

func (r HasReadme) Description() string

func (HasReadme) HowToFix added in v0.7.0

func (r HasReadme) HowToFix() string

func (HasReadme) Name added in v0.7.0

func (r HasReadme) Name() string

type HasRepoDescription

type HasRepoDescription struct{}

HasRepoDescription checks that the repo description field is not blank.

func (HasRepoDescription) Category added in v0.7.0

func (r HasRepoDescription) Category() RuleCategory

func (HasRepoDescription) Check

func (r HasRepoDescription) Check(repo Repo) bool

func (HasRepoDescription) Description added in v0.4.0

func (r HasRepoDescription) Description() string

func (HasRepoDescription) HowToFix added in v0.4.0

func (r HasRepoDescription) HowToFix() string

func (HasRepoDescription) Name

func (r HasRepoDescription) Name() string

type HasRequiredReviewers

type HasRequiredReviewers struct{}

HasRequiredReviewers checks that at least one approving review is required.

This rule is admin-only: the required-approving-review count on classic per-repo branch protection is exposed only via the admin API (GET /repos/{o}/{r}/branches/{br}/protection, which returns 404 to non-admins). Rulesets surface the count publicly, so repos that use rulesets could still be evaluated in a non-admin scan, but most classic-protected repos would be indistinguishable from "no protection." Rather than report those as silent failures, the scanner skips this rule entirely on non-admin scans (see WithAdmin in scanner.go).

func (HasRequiredReviewers) Category added in v0.7.0

func (r HasRequiredReviewers) Category() RuleCategory

func (HasRequiredReviewers) Check

func (r HasRequiredReviewers) Check(repo Repo) bool

func (HasRequiredReviewers) Description added in v0.4.0

func (r HasRequiredReviewers) Description() string

func (HasRequiredReviewers) HowToFix added in v0.4.0

func (r HasRequiredReviewers) HowToFix() string

func (HasRequiredReviewers) Name

func (r HasRequiredReviewers) Name() string

func (HasRequiredReviewers) RequiresAdmin added in v0.8.0

func (r HasRequiredReviewers) RequiresAdmin() bool

type HasRequiredStatusChecks

type HasRequiredStatusChecks struct{}

HasRequiredStatusChecks checks that at least one status check is required before merging.

func (HasRequiredStatusChecks) Category added in v0.7.0

func (r HasRequiredStatusChecks) Category() RuleCategory

func (HasRequiredStatusChecks) Check

func (r HasRequiredStatusChecks) Check(repo Repo) bool

func (HasRequiredStatusChecks) Description added in v0.4.0

func (r HasRequiredStatusChecks) Description() string

func (HasRequiredStatusChecks) HowToFix added in v0.4.0

func (r HasRequiredStatusChecks) HowToFix() string

func (HasRequiredStatusChecks) Name

func (r HasRequiredStatusChecks) Name() string

type HasSecurityMd

type HasSecurityMd struct{}

HasSecurityMd checks that SECURITY.md exists in any of the three locations GitHub recognizes for security policies: repo root, .github/, or docs/.

func (HasSecurityMd) Category added in v0.7.0

func (r HasSecurityMd) Category() RuleCategory

func (HasSecurityMd) Check

func (r HasSecurityMd) Check(repo Repo) bool

func (HasSecurityMd) Description added in v0.4.0

func (r HasSecurityMd) Description() string

func (HasSecurityMd) HowToFix added in v0.4.0

func (r HasSecurityMd) HowToFix() string

func (HasSecurityMd) Name

func (r HasSecurityMd) Name() string

type InstallationAuth added in v0.3.0

type InstallationAuth struct {
	Token string
	Name  string // org or user login the app is installed on (used in repo URLs)
}

InstallationAuth uses a GitHub App installation access token. Scanner lists repositories via /installation/repositories, which returns exactly the repos the installation was granted access to (no public-repo leak on "Selected repositories" installs).

type Option added in v0.2.0

type Option func(*scanOptions)

Option configures optional scan behavior.

func WithAdmin added in v0.8.0

func WithAdmin(b bool) Option

WithAdmin signals that the auth has admin access on every repo it can see. When true, the scanner runs all rules, including those that need admin-only API endpoints (currently: required-reviewers visibility on classic per-repo branch protection). When false (the default), rules marked admin-only are silently skipped - they don't appear in the per-repo results, the JSON output, or the Markdown report. Their absence is invisible to downstream consumers, who simply don't see those keys/columns.

Pass true when scanning with an installation token issued by the Codatus GitHub App (which is granted admin) or a PAT belonging to an admin of every target org. Pass false (or leave default) for third-party / public scans where admin signals can't be read.

func WithBaseURL added in v0.2.0

func WithBaseURL(url string) Option

WithBaseURL sets a custom GitHub API base URL. Defaults to the public GitHub API when unset. Useful for testing against a mock server or pointing at a GitHub Enterprise instance.
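Option, WithAdmin, and WithBaseURL follow Go's functional options pattern. The sketch below re-declares them locally to show how such options are usually applied. The scanOptions fields, the resolve helper, and the default URL value are assumptions, since the real struct is unexported.

```go
package main

import "fmt"

// scanOptions stands in for the package's unexported options struct.
// The field names here are assumptions made for this sketch.
type scanOptions struct {
	admin   bool
	baseURL string
}

// Option matches the documented signature: a function that mutates options.
type Option func(*scanOptions)

// WithAdmin records whether the auth has admin access on every repo.
func WithAdmin(b bool) Option {
	return func(o *scanOptions) { o.admin = b }
}

// WithBaseURL overrides the API base URL.
func WithBaseURL(url string) Option {
	return func(o *scanOptions) { o.baseURL = url }
}

// resolve (hypothetical helper) starts from defaults and applies each
// option in the order given, so later options win.
func resolve(opts ...Option) scanOptions {
	o := scanOptions{baseURL: "https://api.github.com"} // assumed default
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolve())
	fmt.Printf("%+v\n", resolve(WithAdmin(true), WithBaseURL("http://localhost:8080")))
}
```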

type PATAuth added in v0.3.0

type PATAuth struct {
	Token string
	Name  string // org or user login to scan
}

PATAuth uses a Personal Access Token targeting a named account. Scanner lists repositories via /orgs/{Name}/repos and falls back to /users/{Name}/repos on 404, so it works for both org and user accounts.
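The org-then-user fallback can be sketched with the standard library alone. The function names listEndpoint and demoFallback are hypothetical, and the mock server stands in for the GitHub API; authentication headers, pagination, and response decoding are omitted.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// listEndpoint returns the listing path that answered for name: the org
// endpoint first, then the user endpoint if the org lookup returns 404.
func listEndpoint(client *http.Client, baseURL, name string) (string, error) {
	paths := []string{"/orgs/" + name + "/repos", "/users/" + name + "/repos"}
	for _, path := range paths {
		resp, err := client.Get(baseURL + path)
		if err != nil {
			return "", err
		}
		resp.Body.Close()
		switch resp.StatusCode {
		case http.StatusOK:
			return path, nil
		case http.StatusNotFound:
			continue // not this account type; try the next endpoint
		default:
			return "", fmt.Errorf("GET %s: unexpected status %d", path, resp.StatusCode)
		}
	}
	return "", fmt.Errorf("account %q not found as org or user", name)
}

// demoFallback runs listEndpoint against a mock server that only serves
// the user endpoint, so the org request gets a 404 from the mux.
func demoFallback() (string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/users/octocat/repos", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "[]")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()
	return listEndpoint(srv.Client(), srv.URL, "octocat")
}

func main() {
	path, err := demoFallback()
	fmt.Println(path, err)
}
```

Pointing the client at a local mock server is the same mechanism WithBaseURL exposes for tests.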

type Repo

type Repo struct {
	Name             string
	Description      string
	DefaultBranch    string
	Archived         bool
	Fork             bool
	PushedAt         time.Time         // most recent push to any branch (from list-repos)
	License          string            // SPDX id GitHub auto-detected (Licensee), "" if none
	Files            []FileEntry       // all files and directories in the repo
	BranchProtection *BranchProtection // nil if no protection configured
}

Repo represents a GitHub repository with the fields the scanner needs.

type RepoResult

type RepoResult struct {
	RepoName         string
	MostRecentCommit time.Time // PushedAt from the listing; zero if unknown
	Results          []RuleResult
	KnownSkipReason  string
	UnknownSkipError string
}

RepoResult holds all rule results for a single repository. KnownSkipReason and UnknownSkipError are mutually exclusive.

func (RepoResult) Skipped added in v0.2.0

func (rr RepoResult) Skipped() bool
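Skipped has no doc comment. A plausible reading, given that the two skip fields are mutually exclusive, is that it reports whether either one is set. The sketch below encodes that assumption on a local stand-in type; it is not confirmed by the package documentation.

```go
package main

import "fmt"

// repoResult is a trimmed local copy of RepoResult with only the fields
// this sketch needs.
type repoResult struct {
	RepoName         string
	KnownSkipReason  string
	UnknownSkipError string
}

// skipped reports whether either skip field is set (assumed behavior).
func (rr repoResult) skipped() bool {
	return rr.KnownSkipReason != "" || rr.UnknownSkipError != ""
}

func main() {
	scanned := repoResult{RepoName: "api"}
	empty := repoResult{RepoName: "scratch", KnownSkipReason: "empty repository"}
	fmt.Println(scanned.skipped(), empty.skipped())
}
```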

type Rule

type Rule interface {
	Name() string
	Category() RuleCategory
	Check(repo Repo) bool
	Description() string
	HowToFix() string
}

Rule defines a named check that produces a pass/fail result for a repo. Description and HowToFix supply the per-rule text used by the Markdown scorecard's Rule reference section. Category determines whether the rule feeds into the org-level score or appears in the informational-only "Additional checks" section.
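To illustrate the shape of the interface, here is a sketch of a type that satisfies it. The types are re-declared locally so the example compiles on its own. The exampleHasMakefile rule and the FileEntry.Path field are hypothetical and are not part of the documented package.

```go
package main

import "fmt"

type RuleCategory string

const CategoryAdditional RuleCategory = "additional"

// FileEntry and Repo are trimmed local copies; Path is an assumed field.
type FileEntry struct{ Path string }

type Repo struct {
	Name  string
	Files []FileEntry
}

// Rule mirrors the documented interface.
type Rule interface {
	Name() string
	Category() RuleCategory
	Check(repo Repo) bool
	Description() string
	HowToFix() string
}

// exampleHasMakefile is a hypothetical informational rule.
type exampleHasMakefile struct{}

func (exampleHasMakefile) Name() string           { return "Has Makefile" }
func (exampleHasMakefile) Category() RuleCategory { return CategoryAdditional }
func (exampleHasMakefile) Description() string    { return "A Makefile exists at the repo root." }
func (exampleHasMakefile) HowToFix() string       { return "Add a Makefile at the repo root." }

// Check passes when a root-level Makefile is present.
func (exampleHasMakefile) Check(repo Repo) bool {
	for _, f := range repo.Files {
		if f.Path == "Makefile" {
			return true
		}
	}
	return false
}

// Compile-time assertion that the type satisfies Rule.
var _ Rule = exampleHasMakefile{}

func main() {
	var r Rule = exampleHasMakefile{}
	repo := Repo{Name: "api", Files: []FileEntry{{Path: "Makefile"}}}
	fmt.Println(r.Name(), r.Category(), r.Check(repo))
}
```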

func AdditionalRules added in v0.7.0

func AdditionalRules() []Rule

AdditionalRules returns just the rules with CategoryAdditional, in AllRules order.

func AllRules

func AllRules() []Rule

AllRules returns the ordered list of rules the scanner evaluates. The order is fixed and meaningful: scored rules first (by importance), then additional checks (by importance). Callers that want only one category can use ScoredRules or AdditionalRules.

func ScoredRules added in v0.7.0

func ScoredRules() []Rule

ScoredRules returns just the rules with CategoryScored, in AllRules order.
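Both ScoredRules and AdditionalRules are described as category filters that preserve AllRules order. A minimal sketch of that kind of order-preserving filter follows. The rule struct, the byCategory helper, and the rule names are hypothetical stand-ins.

```go
package main

import "fmt"

// rule is a hypothetical stand-in carrying only a name and a category.
type rule struct {
	name     string
	category string
}

// byCategory returns the rules in all whose category equals cat, in the
// same relative order as the input.
func byCategory(all []rule, cat string) []rule {
	var out []rule
	for _, r := range all {
		if r.category == cat {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	all := []rule{
		{name: "first-scored", category: "scored"},
		{name: "second-scored", category: "scored"},
		{name: "first-additional", category: "additional"},
	}
	for _, r := range byCategory(all, "scored") {
		fmt.Println(r.name)
	}
}
```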

type RuleCategory added in v0.7.0

type RuleCategory string

RuleCategory classifies a rule as either a *scored* rule (contributes to the org-level score) or an *additional* check (informational only).

const (
	CategoryScored     RuleCategory = "scored"
	CategoryAdditional RuleCategory = "additional"
)

type RuleResult

type RuleResult struct {
	RuleName string
	Passed   bool
}

RuleResult holds the outcome of a single rule check for a single repo.

type ScanResult added in v0.6.0

type ScanResult struct {
	Org              string
	ScannedAt        time.Time
	TotalRepos       int          // total repos returned by GitHub before any filtering
	ArchivedExcluded int          // archived repos filtered out at listing time
	ForksExcluded    int          // forked repos filtered out at listing time
	Skipped          []RepoResult // empty repos, truncated trees, or unexpected errors during the scan
	Results          []RepoResult // repos that finished scanning (success or fail per-rule)

	// RulesScored and RulesAdditional are the rules actually run against
	// each repo, split by category. They reflect WithAdmin filtering: an
	// admin-only rule skipped on a non-admin scan does NOT appear here,
	// so all downstream math (Score, BucketOf, table aggregation) is
	// driven directly by these slices instead of inferring evaluated
	// rules from RepoResult.Results.
	//
	// JSON-tagged "-" because Rule is an interface and consumers that
	// marshal a ScanResult should instead build their own per-rule
	// payload (see cmd/bulk-scan for an example). The fields are stable
	// for in-process use only.
	RulesScored     []Rule `json:"-"`
	RulesAdditional []Rule `json:"-"`
}

ScanResult bundles the scan outcome with the listing-time exclusion counts the scanner accumulates while filtering archived and forked repos. The counts let callers report a full breakdown ("32 total, 4 forks excluded, 2 archived excluded, 26 scanned") without re-querying GitHub.

The library does not expose a precomputed "most recent commit across the org". Each RepoResult carries its own MostRecentCommit, and consumers aggregate as needed.
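As a sketch of the consumer-side work described here, the helpers below format the breakdown line and aggregate the newest commit time. The names breakdown, latestCommit, and sampleResults are hypothetical. Computing the scanned count as total minus both exclusions is an assumption that matches the example figures but ignores skipped repos.

```go
package main

import (
	"fmt"
	"time"
)

// repoResult is a trimmed local copy of RepoResult.
type repoResult struct {
	RepoName         string
	MostRecentCommit time.Time // zero if unknown
}

// breakdown formats the exclusion counts. The scanned figure is derived
// as total - forks - archived, which is an assumption of this sketch.
func breakdown(total, forks, archived int) string {
	scanned := total - forks - archived
	return fmt.Sprintf("%d total, %d forks excluded, %d archived excluded, %d scanned",
		total, forks, archived, scanned)
}

// latestCommit returns the newest MostRecentCommit across results, or
// the zero time when none is known. Zero values never win the comparison.
func latestCommit(results []repoResult) time.Time {
	var latest time.Time
	for _, r := range results {
		if r.MostRecentCommit.After(latest) {
			latest = r.MostRecentCommit
		}
	}
	return latest
}

// sampleResults returns illustrative data, including one unknown time.
func sampleResults() []repoResult {
	return []repoResult{
		{RepoName: "api", MostRecentCommit: time.Date(2025, time.March, 1, 0, 0, 0, 0, time.UTC)},
		{RepoName: "web", MostRecentCommit: time.Date(2026, time.May, 4, 0, 0, 0, 0, time.UTC)},
		{RepoName: "docs"},
	}
}

func main() {
	fmt.Println(breakdown(32, 4, 2))
	fmt.Println(latestCommit(sampleResults()).Format(time.RFC3339))
}
```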

func Scan

func Scan(ctx context.Context, auth Auth, opts ...Option) (ScanResult, error)

Scan lists repositories accessible to auth and evaluates every rule against each non-archived, non-forked repo. Forks and archived repos are excluded at listing time and surface in the returned ScanResult's ForksExcluded / ArchivedExcluded counts.

Directories

Path Synopsis
cmd
bulk-scan command
bulk-scan reads a list of GitHub orgs/users from a file, runs the scanner against each one, and writes per-org output files (scorecard.md + stats.json) into a destination folder.
generate-sample command
generate-sample renders samples.Fixture() through scanner.GenerateReport and writes the resulting Markdown to stdout (or to a file via --out).
scanner command
Package samples provides the canonical sample scorecard used to drive the landing page hero and the app's dev-seed data.