Documentation ¶
Overview ¶
Package git provides abstractions for running git commands and collecting commit metadata, diffs, and fork-point information.
Index ¶
- Constants
- Variables
- func CollectPlanMetadata(ctx context.Context, runner GitRunner, baseBranch string) map[string]string
- func DetectDefaultBranch(ctx context.Context, runner GitRunner, remote string) (string, error)
- func FetchBulkMetadata(ctx context.Context, runner GitRunner, commits []string) (map[string]CommitMetadata, error)
- func FetchMissingCommits(ctx context.Context, runner GitRunner, remote string, commits []string) (unfetchable int, err error)
- func FilterExistingCommits(ctx context.Context, runner GitRunner, commits []string) (existing []string, missing []string, err error)
- func MergeMetadata(existing, auto map[string]string) map[string]string
- func ResolveBaseBranch(ctx context.Context, runner GitRunner, explicit string, remote string) (string, error)
- type CommitDiffs
- type CommitMetadata
- type ExecGitRunner
- type FakeGitRunner
- type ForkPointResult
- type GitRunner
- type MainlineCache
Constants ¶
const DefaultWorkerCount = 10
DefaultWorkerCount is the default number of concurrent goroutines for diff collection.
Variables ¶
var MetadataFormat = strings.Join([]string{
fmtHash, fmtParents,
fmtAuthorN, fmtAuthorE, fmtAuthorD,
fmtCommitN, fmtCommitE, fmtCommitD,
fmtBody,
}, fmtFieldSep) + fmtRecordSep
MetadataFormat is the git log format string for bulk metadata extraction. Fields are separated by ASCII unit separator (%x1f), records by ASCII record separator (%x1e). These control characters are purpose-built for structured data and won't appear in normal git fields.
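The separator scheme can be illustrated with a small parser. This is a minimal sketch, not the package's own parsing code: splitRecords is a hypothetical helper, and the sample input uses fewer fields than MetadataFormat for brevity. It assumes git writes a newline after each formatted record.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	fieldSep  = "\x1f" // ASCII unit separator, emitted by %x1f
	recordSep = "\x1e" // ASCII record separator, emitted by %x1e
)

// splitRecords is a hypothetical helper: it splits raw `git log` output
// into records, and each record into its fields.
func splitRecords(raw string) [][]string {
	var records [][]string
	for _, rec := range strings.Split(raw, recordSep) {
		// Assumption: git emits a newline after each record, so the next
		// record starts with one. Trim it before splitting fields.
		rec = strings.TrimLeft(rec, "\n")
		if rec == "" {
			continue
		}
		records = append(records, strings.Split(rec, fieldSep))
	}
	return records
}

func main() {
	// Two shortened records: hash, parents, author name, body.
	raw := "abc123\x1fdef456\x1fAda\x1fFix bug\n\nLonger body.\x1e\n" +
		"def456\x1f\x1fBob\x1fInitial commit\x1e\n"
	for _, fields := range splitRecords(raw) {
		fmt.Printf("sha=%s fields=%d\n", fields[0], len(fields))
	}
}
```

Because the body field may itself contain newlines, splitting on the record separator rather than on line breaks keeps multi-line commit messages intact.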
Functions ¶
func CollectPlanMetadata ¶
func CollectPlanMetadata(ctx context.Context, runner GitRunner, baseBranch string) map[string]string
CollectPlanMetadata collects git metadata for the current HEAD commit. Returns a map of metadata keys to values. Skips keys that cannot be collected (e.g. if not in a git repo). Does not error on git failures; returns partial results with warnings logged via debug.Printf.
func DetectDefaultBranch ¶
func DetectDefaultBranch(ctx context.Context, runner GitRunner, remote string) (string, error)
DetectDefaultBranch returns the remote default branch reference. It tries <remote>/HEAD, then falls back to <remote>/main, then <remote>/master.
func FetchBulkMetadata ¶
func FetchBulkMetadata(ctx context.Context, runner GitRunner, commits []string) (map[string]CommitMetadata, error)
FetchBulkMetadata fetches metadata for all given commits in a single git call. Uses --no-walk with --stdin to process only the specified commits (not ancestors). Returns a map from commit SHA to CommitMetadata for O(1) lookup.
func FetchMissingCommits ¶
func FetchMissingCommits(ctx context.Context, runner GitRunner, remote string, commits []string) (unfetchable int, err error)
FetchMissingCommits attempts to fetch the given commits from the remote. Uses chunked fetching (1000 commits per batch) with recursive bisection on error to isolate unfetchable SHAs (e.g. force-pushed/rebased commits that no longer exist on the remote).
Returns the number of commits that could not be fetched.
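The chunk-then-bisect idea described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: fetchFunc stands in for a single git fetch call, and fetchInChunks, fetchWithBisection, and fakeFetch are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchFunc is a stand-in for one fetch call covering a batch of SHAs.
type fetchFunc func(batch []string) error

// fetchWithBisection tries the whole batch in one call. On failure it
// splits the batch in half and retries each half, so a single bad SHA
// is isolated in O(log n) extra calls. It returns the unfetchable count.
func fetchWithBisection(batch []string, fetch fetchFunc) int {
	if len(batch) == 0 {
		return 0
	}
	if err := fetch(batch); err == nil {
		return 0
	}
	if len(batch) == 1 {
		return 1 // this SHA cannot be fetched on its own
	}
	mid := len(batch) / 2
	return fetchWithBisection(batch[:mid], fetch) +
		fetchWithBisection(batch[mid:], fetch)
}

// fetchInChunks walks the commit list in fixed-size chunks.
func fetchInChunks(commits []string, chunkSize int, fetch fetchFunc) int {
	if chunkSize < 1 {
		chunkSize = 1
	}
	unfetchable := 0
	for start := 0; start < len(commits); start += chunkSize {
		end := start + chunkSize
		if end > len(commits) {
			end = len(commits)
		}
		unfetchable += fetchWithBisection(commits[start:end], fetch)
	}
	return unfetchable
}

// fakeFetch simulates a remote where the SHAs in gone no longer exist.
func fakeFetch(gone map[string]bool) fetchFunc {
	return func(batch []string) error {
		for _, sha := range batch {
			if gone[sha] {
				return errors.New("remote does not have " + sha)
			}
		}
		return nil
	}
}

func main() {
	commits := []string{"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"}
	fetch := fakeFetch(map[string]bool{"c3": true, "c7": true})
	fmt.Println("unfetchable:", fetchInChunks(commits, 4, fetch))
}
```

When most batches succeed, the cost stays close to one fetch per chunk; bisection only adds calls for chunks that contain at least one missing SHA.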
func FilterExistingCommits ¶
func FilterExistingCommits(ctx context.Context, runner GitRunner, commits []string) (existing []string, missing []string, err error)
FilterExistingCommits checks which commits exist in the local repo. Returns the list of existing commits (preserving input order), the list of missing commit SHAs, and any error.
Uses `git cat-file --batch-check` with stdin for efficiency (single process for all commits, rather than one git call per commit).
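As a rough sketch of how batch-check output can be split into existing and missing lists: git cat-file --batch-check prints one line per input object, either "<sha> <type> <size>" or "<sha> missing". The helper partitionByExistence below is hypothetical, and it treats any three-field line as existing, which may differ from the checks the package actually performs.

```go
package main

import (
	"fmt"
	"strings"
)

// partitionByExistence pairs each input SHA with the corresponding line
// of `git cat-file --batch-check` output. Lines with three fields
// ("<sha> <type> <size>") count as existing; anything else, including
// "<sha> missing" or a truncated output, counts as missing.
func partitionByExistence(commits []string, batchOutput string) (existing, missing []string) {
	lines := strings.Split(strings.TrimRight(batchOutput, "\n"), "\n")
	for i, sha := range commits {
		if i < len(lines) && len(strings.Fields(lines[i])) == 3 {
			existing = append(existing, sha)
		} else {
			missing = append(missing, sha)
		}
	}
	return existing, missing
}

func main() {
	commits := []string{"aaa111", "bbb222", "ccc333"}
	output := "aaa111 commit 245\nbbb222 missing\nccc333 commit 198\n"
	existing, missing := partitionByExistence(commits, output)
	fmt.Println("existing:", existing)
	fmt.Println("missing:", missing)
}
```

Iterating over the input slice rather than the output lines is what preserves input order in the existing list.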
func MergeMetadata ¶
func MergeMetadata(existing, auto map[string]string) map[string]string
MergeMetadata merges auto-collected metadata into existing user-provided metadata. User-provided keys take precedence: auto-collected values only fill in keys that are not already present. Empty auto-collected values are skipped. If existing is nil, the auto map is returned as-is.
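The precedence rules can be restated as a short sketch. The function mergeMetadata below is a hypothetical re-implementation written from the description above; in particular, whether the real function copies the existing map or updates it in place is an assumption (this sketch copies).

```go
package main

import "fmt"

// mergeMetadata fills gaps in existing with non-empty values from auto.
// Keys already present in existing are never overwritten.
func mergeMetadata(existing, auto map[string]string) map[string]string {
	if existing == nil {
		return auto // documented behavior: auto is returned as-is
	}
	merged := make(map[string]string, len(existing)+len(auto))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range auto {
		if v == "" {
			continue // empty auto-collected values are skipped
		}
		if _, present := merged[k]; !present {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	user := map[string]string{"base_branch": "release/v2"}
	auto := map[string]string{
		"base_branch": "origin/main", // ignored: user value wins
		"commit_sha":  "abc123",      // added: key was absent
		"author":      "",            // skipped: empty value
	}
	merged := mergeMetadata(user, auto)
	fmt.Println(merged["base_branch"], merged["commit_sha"], len(merged))
}
```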
func ResolveBaseBranch ¶
func ResolveBaseBranch(ctx context.Context, runner GitRunner, explicit string, remote string) (string, error)
ResolveBaseBranch determines the base branch ref to diff against.
Resolution order:
- explicit (from --metadata base_branch=...) -- for repos with non-standard default branches (not main/master), or PRs targeting non-default branches outside Buildkite CI.
- $BUILDKITE_PULL_REQUEST_BASE_BRANCH -- auto-set by the Buildkite agent on PR builds.
- DetectDefaultBranch() -- tries remote/HEAD, remote/main, remote/master.
Most users should NOT need to set base_branch explicitly. Override is only needed when:
- The repo uses a non-standard default branch (e.g. "develop", "trunk") AND remote/HEAD is not configured
- The build targets a non-default branch (e.g. a PR into "release/v2") AND $BUILDKITE_PULL_REQUEST_BASE_BRANCH is not set (non-Buildkite CI or manual trigger)
Each candidate is probed against the repository using git rev-parse --verify: the candidate is tried verbatim first, then as "<remote>/<candidate>". This handles every common shape without heuristics. Bare branch names ("main") resolve via the fallback to "origin/main"; refs from a different remote ("upstream/main"), fully-qualified refs ("refs/heads/release"), and values that already include the configured remote ("origin/main") all resolve on the first probe. Without the verbatim probe, prefixing a qualified ref would rewrite it into an invalid value such as "origin/upstream/main" and silently drop the explicit override.
Returns the resolved ref (e.g. "origin/main") or an error.
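The two-step probe can be sketched like this. It is a simplified illustration, not the package's code: verifyFunc stands in for a git rev-parse --verify call, and resolveCandidate and refSet are hypothetical names.

```go
package main

import "fmt"

// verifyFunc stands in for `git rev-parse --verify <ref>` succeeding.
type verifyFunc func(ref string) bool

// resolveCandidate tries the candidate verbatim, then with the remote
// prefixed. The verbatim probe comes first so that already-qualified
// refs are never rewritten.
func resolveCandidate(candidate, remote string, verify verifyFunc) (string, error) {
	if verify(candidate) {
		return candidate, nil
	}
	prefixed := remote + "/" + candidate
	if verify(prefixed) {
		return prefixed, nil
	}
	return "", fmt.Errorf("cannot resolve %q (also tried %q)", candidate, prefixed)
}

// refSet builds a verifyFunc from a fixed list of refs known to exist.
func refSet(refs ...string) verifyFunc {
	known := make(map[string]bool, len(refs))
	for _, r := range refs {
		known[r] = true
	}
	return func(ref string) bool { return known[ref] }
}

func main() {
	verify := refSet("origin/main", "upstream/main", "refs/heads/release")
	for _, candidate := range []string{"main", "upstream/main", "refs/heads/release", "trunk"} {
		ref, err := resolveCandidate(candidate, "origin", verify)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", candidate, ref)
	}
}
```

In this sketch "main" only resolves through the prefixed probe because no local ref named "main" is in the known set, which mirrors a typical CI checkout.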
Types ¶
type CommitDiffs ¶
type CommitDiffs struct {
FilesChanged string `json:"files_changed"`
DiffStat string `json:"diff_stat"`
GitDiff string `json:"git_diff,omitempty"`
GitDiffRaw string `json:"git_diff_raw,omitempty"`
}
CommitDiffs holds the diff information for a single commit relative to its fork-point.
func CollectDiffs ¶
func CollectDiffs(ctx context.Context, runner GitRunner, commits []string, mainBranch string, mc *MainlineCache, skipDiffs bool, workerCount int, onProgress func(done, total int)) ([]CommitDiffs, error)
CollectDiffs collects diff information for each commit concurrently. workerCount controls the number of concurrent goroutines (use DefaultWorkerCount for the standard 10-goroutine pool). Results are returned in the same order as the input commits slice.
For each commit it finds the fork-point and runs:
- git diff --no-ext-diff --name-only <base> <commit> -> FilesChanged
- git diff --no-ext-diff --numstat <base> <commit> -> DiffStat
- git diff --no-ext-diff <base> <commit> -> GitDiff (unless skipDiffs)
- git diff --no-ext-diff --raw <base> <commit> -> GitDiffRaw (unless skipDiffs)
The onProgress callback is called after each commit is processed with the running count and total. It may be nil.
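One common way to get order-preserving results from a fixed worker pool is to hand out indices and write each result into its own slot. The sketch below shows that pattern under stated assumptions; processInOrder and fakeDiff are hypothetical, and the package may structure its pool differently.

```go
package main

import (
	"fmt"
	"sync"
)

// processInOrder runs work over items with workerCount goroutines.
// Each worker writes to results[i] for the index it received, so the
// output order matches the input order regardless of scheduling.
func processInOrder(items []string, workerCount int, work func(string) string, onProgress func(done, total int)) []string {
	if workerCount < 1 {
		workerCount = 1
	}
	results := make([]string, len(items))
	jobs := make(chan int)

	var wg sync.WaitGroup
	var mu sync.Mutex // guards done and serializes onProgress calls
	done := 0

	for w := 0; w < workerCount; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = work(items[i])
				if onProgress != nil {
					mu.Lock()
					done++
					onProgress(done, len(items))
					mu.Unlock()
				}
			}
		}()
	}
	for i := range items {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

// fakeDiff stands in for the per-commit git diff calls.
func fakeDiff(commit string) string { return "diff:" + commit }

func main() {
	commits := []string{"c1", "c2", "c3", "c4"}
	out := processInOrder(commits, 2, fakeDiff, func(done, total int) {
		fmt.Printf("progress %d/%d\n", done, total)
	})
	fmt.Println(out)
}
```

Serializing the callback behind a mutex means callers can update a progress bar without adding their own locking, at the cost of brief contention between workers.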
func (CommitDiffs) ToMap ¶
func (d CommitDiffs) ToMap() map[string]string
ToMap returns the diff fields as a flat string map using the same key names as the JSON tags.
type CommitMetadata ¶
type CommitMetadata struct {
CommitSHA string `json:"commit_sha"`
ParentSHAs []string `json:"parent_shas"`
AuthorName string `json:"author_name"`
AuthorEmail string `json:"author_email"`
AuthorDate string `json:"author_date"`
CommitterName string `json:"committer_name"`
CommitterEmail string `json:"committer_email"`
CommitterDate string `json:"committer_date"`
Message string `json:"message"`
}
CommitMetadata holds the metadata for a single git commit.
func (CommitMetadata) ToMap ¶
func (m CommitMetadata) ToMap() map[string]string
ToMap returns the commit metadata as a flat string map using the same key names as the JSON tags. All keys are always present; ParentSHAs are stored as a space-separated string (empty string for root commits).
type ExecGitRunner ¶
type ExecGitRunner struct{}
ExecGitRunner runs git commands via os/exec.
func (*ExecGitRunner) OutputWithStdin ¶
type FakeGitRunner ¶
type FakeGitRunner struct {
// Responses maps a key derived from args to the output string.
Responses map[string]string
// StdinResponses maps a key derived from args to a function that
// takes stdin and returns the response. Used for OutputWithStdin.
StdinResponses map[string]func(stdin string) string
}
FakeGitRunner returns canned responses based on the git arguments. It is exported for use by tests in other packages.
func (*FakeGitRunner) OutputWithStdin ¶
type ForkPointResult ¶
type ForkPointResult struct {
Base string
Strategy string // "fork-point", "parent-fallback", "merge-base"
}
ForkPointResult holds the base commit for diffing and the strategy used to find it.
func FindForkPoint ¶
func FindForkPoint(ctx context.Context, runner GitRunner, mainBranch, commit string, mc *MainlineCache) (ForkPointResult, error)
FindForkPoint determines the base commit to diff against using 3 strategies:
- git merge-base --fork-point (uses reflog, best for recent branches)
- Mainline parent fallback (commit is on the first-parent chain of main)
- Plain git merge-base (fallback for unmerged branches)
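A fallback chain of this kind can be sketched as a list of strategies tried in turn. The sketch assumes the strategies run in the order listed and that the first one to yield a base wins; that ordering, and the names strategy, firstSuccessful, and demoStrategies, are assumptions for illustration rather than documented behavior.

```go
package main

import "fmt"

// forkPointResult mirrors the shape of ForkPointResult.
type forkPointResult struct {
	Base     string
	Strategy string
}

// strategy is a hypothetical wrapper: find reports the base commit and
// whether the strategy was able to produce one.
type strategy struct {
	name string
	find func(commit string) (string, bool)
}

// firstSuccessful returns the result of the first strategy that yields
// a non-empty base commit.
func firstSuccessful(commit string, strategies []strategy) (forkPointResult, error) {
	for _, s := range strategies {
		if base, ok := s.find(commit); ok && base != "" {
			return forkPointResult{Base: base, Strategy: s.name}, nil
		}
	}
	return forkPointResult{}, fmt.Errorf("no base commit found for %s", commit)
}

// demoStrategies simulates a repository whose reflog has expired, so
// the fork-point strategy never succeeds.
func demoStrategies() []strategy {
	mainlineFirstParent := map[string]string{"m2": "m1"} // commit -> first parent
	return []strategy{
		{"fork-point", func(string) (string, bool) { return "", false }},
		{"parent-fallback", func(c string) (string, bool) {
			parent, onMainline := mainlineFirstParent[c]
			return parent, onMainline
		}},
		{"merge-base", func(string) (string, bool) { return "m1", true }},
	}
}

func main() {
	for _, commit := range []string{"m2", "f1"} {
		res, err := firstSuccessful(commit, demoStrategies())
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: base=%s strategy=%s\n", commit, res.Base, res.Strategy)
	}
}
```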
type GitRunner ¶
type GitRunner interface {
// Output runs a git command and returns its stdout as a string.
Output(ctx context.Context, args ...string) (string, error)
// OutputWithStdin runs a git command with stdin piped and returns stdout.
OutputWithStdin(ctx context.Context, stdin string, args ...string) (string, error)
}
GitRunner abstracts git command execution for testability.
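To show why the interface helps with testing, here is a minimal sketch of a canned-response runner. The interface is copied locally so the example is self-contained; mapRunner, currentBranch, and demoBranch are hypothetical, and keying responses by the space-joined arguments is an assumption that is not necessarily how FakeGitRunner derives its keys.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// GitRunner is a local copy of the interface shown above.
type GitRunner interface {
	Output(ctx context.Context, args ...string) (string, error)
	OutputWithStdin(ctx context.Context, stdin string, args ...string) (string, error)
}

// mapRunner returns canned output keyed by the space-joined arguments.
type mapRunner struct {
	responses map[string]string
}

func (r *mapRunner) Output(ctx context.Context, args ...string) (string, error) {
	key := strings.Join(args, " ")
	out, ok := r.responses[key]
	if !ok {
		return "", fmt.Errorf("unexpected git call: %s", key)
	}
	return out, nil
}

// OutputWithStdin ignores stdin in this sketch and reuses Output.
func (r *mapRunner) OutputWithStdin(ctx context.Context, stdin string, args ...string) (string, error) {
	return r.Output(ctx, args...)
}

// currentBranch is example code under test: it only depends on GitRunner.
func currentBranch(ctx context.Context, runner GitRunner) (string, error) {
	out, err := runner.Output(ctx, "rev-parse", "--abbrev-ref", "HEAD")
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(out), nil
}

// demoBranch wires the fake runner into currentBranch.
func demoBranch() (string, error) {
	runner := &mapRunner{responses: map[string]string{
		"rev-parse --abbrev-ref HEAD": "main\n",
	}}
	return currentBranch(context.Background(), runner)
}

func main() {
	branch, err := demoBranch()
	fmt.Println(branch, err)
}
```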
type MainlineCache ¶
type MainlineCache struct {
// contains filtered or unexported fields
}
MainlineCache precomputes the first-parent topology of the default branch. This is used by the parent-fallback strategy to detect commits that are directly on the main branch (e.g. direct pushes or merge commits).
func BuildMainlineCache ¶
func BuildMainlineCache(ctx context.Context, runner GitRunner, mainBranch string, days int) (*MainlineCache, error)
BuildMainlineCache builds a cache of the first-parent chain from the given branch. Uses `git log --first-parent --format=%H %P` to enumerate all commits on the mainline and their first parents. The days parameter scopes the cache to the same lookback window as the commit list API, avoiding unbounded history for repos with very long mainline histories.
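With the %H %P format, each output line holds a commit hash followed by its parent hashes, first parent first. A minimal parsing sketch, assuming that layout (parseFirstParents is a hypothetical helper, not the cache's internal representation):

```go
package main

import (
	"fmt"
	"strings"
)

// parseFirstParents maps each commit on the mainline to its first
// parent. Root commits have no parents and map to the empty string.
func parseFirstParents(logOutput string) map[string]string {
	firstParent := make(map[string]string)
	for _, line := range strings.Split(logOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines, e.g. the trailing newline
		}
		parent := ""
		if len(fields) > 1 {
			parent = fields[1] // later fields are merged-in parents
		}
		firstParent[fields[0]] = parent
	}
	return firstParent
}

func main() {
	// c3 is a merge commit: first parent c2, second parent f9.
	out := "c3 c2 f9\nc2 c1\nc1\n"
	cache := parseFirstParents(out)
	fmt.Println(len(cache), cache["c3"], cache["c2"])
}
```

A map lookup then answers both questions the parent-fallback strategy needs: whether a commit is on the mainline, and which commit to diff it against.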
func (*MainlineCache) Size ¶
func (mc *MainlineCache) Size() int
Size returns the number of commits in the mainline cache.