Documentation ¶
Overview ¶
Package repotools provides VCS abstraction for repository operations.
This package supports multiple version control systems with git as the primary focus. SVN support is included for enterprises like LinkedIn that still use SVN as of 2025.
Key components:
- VCS detection and requirement checking
- Git identity detection for per-user session isolation
- Repository fingerprinting (initial commit hash, remote URL hashing)
- Secure hashing with salt to prevent enumeration attacks
Package repotools provides utilities for working with repository identifiers.
The repo ID format is: "repo_" + standard UUIDv7 string
UUIDv7 is used because it's time-sortable, which helps with concurrent init detection where the earlier timestamp wins as canonical. The standard UUID format preserves this time-sortability when sorted lexicographically.
Example generated IDs:
repo_01936d5a-0000-7abc-8def-0123456789ab
repo_01936d5a-0001-7abc-8def-0123456789ab
repo_01936d5a-0002-7abc-8def-0123456789ab
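The time-sortability claim can be illustrated with a small sketch that uses only the standard library. The helper name sortRepoIDs is hypothetical and not part of this package; the IDs are the sample values shown above. Ordering between IDs generated within the same millisecond depends on the UUID's remaining bits, so treat this as an illustration rather than a guarantee.

```go
package main

import (
	"fmt"
	"sort"
)

// sortRepoIDs (hypothetical helper) returns a sorted copy of ids.
// The UUIDv7 timestamp occupies the leading hex digits, so plain
// string order follows creation time.
func sortRepoIDs(ids []string) []string {
	sorted := append([]string(nil), ids...)
	sort.Strings(sorted)
	return sorted
}

func main() {
	ids := []string{
		"repo_01936d5a-0002-7abc-8def-0123456789ab",
		"repo_01936d5a-0000-7abc-8def-0123456789ab",
		"repo_01936d5a-0001-7abc-8def-0123456789ab",
	}
	sorted := sortRepoIDs(ids)
	// The first element carries the earliest timestamp, so it would be
	// the canonical ID under the "earlier timestamp wins" rule.
	fmt.Println("canonical:", sorted[0])
}
```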
Index ¶
- func FindMainRepoRoot(vcs VCS) (string, error)
- func FindRepoRoot(vcs VCS) (string, error)
- func GenerateRepoID() string
- func GetCurrentBranch(dir string) string
- func GetInitialCommitHash() (string, error)
- func GetRemoteURLs() ([]string, error)
- func GetRemoteURLsForDir(dir string) ([]string, error)
- func GetRepoName(gitRoot string) string
- func HashRemoteURLs(salt string, urls []string) []string
- func IsInstalled(vcs VCS) bool
- func IsPublicRepo() (bool, error)
- func IsValidRepoID(id string) bool
- func ParseRepoID(id string) (uuid.UUID, error)
- func RequireVCS(vcs VCS) error
- type GitIdentity
- type RepoFingerprint
- type VCS
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func FindMainRepoRoot ¶
FindMainRepoRoot finds the main repository root, resolving through worktrees. Unlike FindRepoRoot which returns the worktree directory, this always returns the main repository root that all worktrees share.
Why this matters: git worktrees are separate working directories that share the same repository. For features like the ledger, where we want ONE instance per repository (not per worktree), we need to resolve to the main repo root. Using git rev-parse --show-toplevel would return each worktree's own directory, giving every worktree its own ledger and fragmenting data.
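One possible way to resolve through worktrees is sketched below. It is not necessarily how FindMainRepoRoot is implemented. It assumes the caller has already obtained the output of git rev-parse --git-common-dir, which points at the shared .git directory and may be relative to the current directory. The helper name mainRootFromCommonDir and the sample paths are hypothetical, and bare repositories are not handled.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// mainRootFromCommonDir (hypothetical helper) derives the main repository
// root from the shared git directory. commonDir is assumed to be the output
// of `git rev-parse --git-common-dir`, run from cwd.
func mainRootFromCommonDir(cwd, commonDir string) string {
	if !filepath.IsAbs(commonDir) {
		// Relative output is resolved against the directory git ran in.
		commonDir = filepath.Join(cwd, commonDir)
	}
	// The main root is the parent of the shared .git directory.
	return filepath.Dir(filepath.Clean(commonDir))
}

func main() {
	// Inside a subdirectory of the main working tree.
	fmt.Println(mainRootFromCommonDir("/work/ox/cmd", "../.git"))
	// Inside a linked worktree that lives elsewhere on disk.
	fmt.Println(mainRootFromCommonDir("/work/ox-feature", "/work/ox/.git"))
}
```

On a Unix-like system both calls resolve to /work/ox, which is the property that keeps a per-repository ledger from being split across worktrees.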
func FindRepoRoot ¶
FindRepoRoot finds the root directory of the repository for the given VCS.
func GenerateRepoID ¶
func GenerateRepoID() string
GenerateRepoID generates a new prefixed repo ID using UUIDv7. Format: "repo_" + standard UUIDv7 string (e.g., "repo_01936d5a-0000-7abc-8def-0123456789ab")
UUIDv7 is time-sortable, and the standard format preserves this property when sorted lexicographically (e.g., ls -la .sageox/.repo_*).
Example ¶
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	id := repotools.GenerateRepoID()

	// validate the ID
	valid := repotools.IsValidRepoID(id)
	fmt.Printf("Valid: %v\n", valid)

	// parse back to UUID
	uuid, err := repotools.ParseRepoID(id)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	fmt.Printf("UUID version: %d\n", uuid.Version())
}

Output:
Valid: true
UUID version: 7
func GetCurrentBranch ¶ added in v0.3.0
GetCurrentBranch returns the current git branch for the given directory. Returns empty string on any error (best-effort).
func GetInitialCommitHash ¶
GetInitialCommitHash returns the hash of the initial (first) commit in the repo. This hash is used as repo_salt for secure hashing of remote URLs.
func GetRemoteURLs ¶
GetRemoteURLs returns all configured git remote URLs for the current directory.
func GetRemoteURLsForDir ¶ added in v0.5.0
GetRemoteURLsForDir returns all configured git remote URLs for the given directory. If dir is empty, uses the current working directory.
func GetRepoName ¶
GetRepoName returns a human-readable repo name derived from git remotes. Prefers "owner/repo" extracted from the first remote origin URL (e.g. git@github.com:sageox/ox.git → "sageox/ox"). Falls back to the git root directory name if no remote is available. Uses gitRoot to query remotes (via git -C), not the current working directory.
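The naming rule above can be sketched as follows. This is an illustration under stated assumptions, not the package's actual parsing: the helper names ownerRepoFromRemote and repoName are hypothetical, the remote URL is passed in directly rather than queried via git -C, and only scp-like and scheme-prefixed URLs are handled.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// ownerRepoFromRemote (hypothetical helper) extracts "owner/repo" from a
// remote URL such as git@github.com:sageox/ox.git or
// https://github.com/sageox/ox.git.
func ownerRepoFromRemote(remote string) (string, bool) {
	s := strings.TrimSuffix(strings.TrimSpace(remote), ".git")
	if i := strings.Index(s, "://"); i >= 0 {
		// Scheme form: drop "scheme://host/".
		s = s[i+3:]
		j := strings.Index(s, "/")
		if j < 0 {
			return "", false
		}
		s = s[j+1:]
	} else if i := strings.Index(s, ":"); i >= 0 {
		// scp-like form: drop "user@host:".
		s = s[i+1:]
	}
	parts := strings.Split(strings.Trim(s, "/"), "/")
	if len(parts) < 2 {
		return "", false
	}
	owner, repo := parts[len(parts)-2], parts[len(parts)-1]
	if owner == "" || repo == "" {
		return "", false
	}
	return owner + "/" + repo, true
}

// repoName (hypothetical helper) prefers the remote-derived name and falls
// back to the root directory name when no usable remote is available.
func repoName(gitRoot, remote string) string {
	if name, ok := ownerRepoFromRemote(remote); ok {
		return name
	}
	return filepath.Base(gitRoot)
}

func main() {
	fmt.Println(repoName("/work/ox", "git@github.com:sageox/ox.git"))
	fmt.Println(repoName("/work/scratch", ""))
}
```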
func HashRemoteURLs ¶
HashRemoteURLs creates salted SHA256 hashes of remote URLs.
SECURITY: Remote URLs are hashed (not sent plaintext) to protect repo identity. Knowing a repo's remote URL can reveal private repo names, internal tooling, or organizational structure. The hash allows the server to detect when two repos share the same origin (for merge detection) without learning the actual URL.
The salt (typically the repo's first commit hash) prevents enumeration attacks: an attacker with server access cannot precompute hashes for known repo URLs like "github.com/company/secret-project" to identify which repos are registered. Each repo's salt is unique, so the same URL produces different hashes for different repos.
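The effect of the per-repo salt can be sketched with the standard library. How the real HashRemoteURLs combines salt and URL, and whether it normalizes URLs first, is not specified here; the "salt:url" concatenation, the hex encoding, the helper name hashURLs, and the sample salts are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashURLs (hypothetical helper) returns one salted SHA256 hash per URL.
// Assumption: salt and URL are joined with ":" before hashing.
func hashURLs(salt string, urls []string) []string {
	hashes := make([]string, 0, len(urls))
	for _, u := range urls {
		sum := sha256.Sum256([]byte(salt + ":" + u))
		hashes = append(hashes, hex.EncodeToString(sum[:]))
	}
	return hashes
}

func main() {
	urls := []string{"github.com/company/secret-project"}

	// Placeholder salts standing in for two different first-commit hashes.
	a := hashURLs("salt-of-repo-a", urls)
	b := hashURLs("salt-of-repo-b", urls)

	// Same URL, different salts: a table precomputed for one repo's salt
	// does not match hashes registered under another repo's salt.
	fmt.Println("hashes differ across salts:", a[0] != b[0])

	// Same URL, same salt: clones that share a first commit produce
	// matching hashes, which is what allows shared-origin detection.
	fmt.Println("hashes match within a salt:", a[0] == hashURLs("salt-of-repo-a", urls)[0])
}
```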
func IsInstalled ¶
IsInstalled checks if the specified VCS tool is available in PATH.
func IsPublicRepo ¶
IsPublicRepo attempts to detect whether the repository is public. It currently uses heuristics; it could be enhanced with the GitHub API in the future.
func IsValidRepoID ¶
IsValidRepoID validates the format of a repo ID. Returns true if the ID has the correct prefix and can be parsed as a valid UUID.
Example ¶
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	// valid ID
	validID := repotools.GenerateRepoID()
	fmt.Printf("Valid ID: %v\n", repotools.IsValidRepoID(validID))

	// invalid IDs
	fmt.Printf("Invalid prefix: %v\n", repotools.IsValidRepoID("invalid_abc123"))
	fmt.Printf("Empty: %v\n", repotools.IsValidRepoID(""))
	fmt.Printf("No prefix: %v\n", repotools.IsValidRepoID("abc123"))
}

Output:
Valid ID: true
Invalid prefix: false
Empty: false
No prefix: false
func ParseRepoID ¶
ParseRepoID parses a repo ID back to its underlying UUID. Returns an error if the ID is invalid or doesn't have the correct prefix.
Example ¶
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	// generate a repo ID
	id := repotools.GenerateRepoID()

	// parse it back to UUID
	uuid, err := repotools.ParseRepoID(id)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	fmt.Printf("Parsed UUID version: %d\n", uuid.Version())
}

Output:
Parsed UUID version: 7
func RequireVCS ¶
RequireVCS checks if a VCS is installed and returns an error if it is not. Use this for fail-fast behavior in commands that require a VCS.
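The fail-fast pattern described for IsInstalled and RequireVCS might look like the sketch below. It takes a plain tool name instead of the package's VCS type, whose definition is not shown here. The helper names isInstalled and requireTool and the error wording are hypothetical; exec.LookPath is the standard-library PATH lookup.

```go
package main

import (
	"fmt"
	"os/exec"
)

// isInstalled (hypothetical helper) reports whether tool is found in PATH.
func isInstalled(tool string) bool {
	_, err := exec.LookPath(tool)
	return err == nil
}

// requireTool (hypothetical helper) turns a missing tool into an error so a
// command can stop before doing any work.
func requireTool(tool string) error {
	if !isInstalled(tool) {
		return fmt.Errorf("%s is required but was not found in PATH", tool)
	}
	return nil
}

func main() {
	if err := requireTool("git"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("git is available")
}
```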
Types ¶
type GitIdentity ¶
type GitIdentity struct {
	Name  string // git config user.name
	Email string // git config user.email
}
GitIdentity holds git user configuration.
func DetectGitIdentity ¶
func DetectGitIdentity() (*GitIdentity, error)
DetectGitIdentity reads git user configuration. Returns nil if no identity is configured (both name and email empty).
func (*GitIdentity) Slug ¶
func (g *GitIdentity) Slug() string
Slug returns a filesystem-safe identifier for the git identity. It uses the email username (the part before @) if available, otherwise the name. Example: "ryan@example.com" -> "ryan"
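A minimal sketch of the slug rule follows. The preference for the email username over the name comes from the description above. The sanitization step (lowercasing and replacing other characters with "-") is an assumption about what "filesystem-safe" means here, and the type identity and function slug are hypothetical stand-ins for GitIdentity and its method.

```go
package main

import (
	"fmt"
	"strings"
)

// identity is a hypothetical stand-in for GitIdentity.
type identity struct {
	Name  string
	Email string
}

// slug prefers the email username and falls back to the name.
// Assumption: anything outside [a-z0-9._-] is replaced with "-".
func slug(id identity) string {
	base := id.Name
	if at := strings.Index(id.Email, "@"); at > 0 {
		base = id.Email[:at]
	}
	var b strings.Builder
	for _, r := range strings.ToLower(base) {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '-', r == '_', r == '.':
			b.WriteRune(r)
		default:
			b.WriteByte('-')
		}
	}
	// Trim leading and trailing separators so the result is a clean name.
	return strings.Trim(b.String(), "-.")
}

func main() {
	fmt.Println(slug(identity{Name: "Ryan", Email: "ryan@example.com"}))
	fmt.Println(slug(identity{Name: "Ada Lovelace"}))
}
```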
type RepoFingerprint ¶
type RepoFingerprint struct {
	// FirstCommit is the hash of the initial commit (same as repo_salt).
	// Forks share this value, making it useful for detecting related repos.
	FirstCommit string `json:"first_commit"`

	// MonthlyCheckpoints maps "YYYY-MM" to the first commit hash of that month.
	// Used to detect divergence: if two repos share the same first commit but
	// have different monthly checkpoints, they've diverged.
	// Only includes months with commits; sparse map.
	MonthlyCheckpoints map[string]string `json:"monthly_checkpoints"`

	// AncestrySamples contains commit hashes at power-of-2 intervals from the
	// first commit: 1st, 2nd, 4th, 8th, 16th, 32nd, 64th, 128th, 256th.
	// Provides consistent sampling regardless of commit cadence or age.
	//
	// FUTURE CONSIDERATION (not implemented): Add yearly exponential fingerprints
	// starting from 2027. Each year would have its own set of power-of-2 samples
	// from Jan 1 of that year, stored by year key (e.g., "2027": [...hashes]).
	// These would NOT cascade into following years - each year stands alone.
	//
	// This adds recency detection for repos with long histories. The downside:
	// during the first month of a new year, merge detection may fail if one
	// candidate doesn't have current year hashes yet. Server-side logic must
	// account for this when deciding whether to use current year hashes.
	AncestrySamples []string `json:"ancestry_samples"`

	// RemoteHashes contains salted SHA256 hashes of normalized remote URLs.
	// Different clones of the same repo often share remote URLs (origin).
	// Hashed with FirstCommit as salt to prevent enumeration attacks.
	RemoteHashes []string `json:"remote_hashes,omitempty"`
}
RepoFingerprint holds repository identity fingerprint data for detecting identical or related repositories across different teams or installations.
Purpose: When multiple teams run `ox init` on the same codebase (forks, clones, or parallel work), this fingerprint enables the SageOx API to:
- Detect that repos are the same or related
- Suggest team merges when appropriate
- Identify divergence between forks
The fingerprint uses multiple signals because no single identifier is perfect:
- FirstCommit: Identifies the original repo, but forks share this
- MonthlyCheckpoints: Detect divergence over time (different commits = divergence)
- AncestrySamples: Consistent sampling regardless of commit frequency
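How the signals listed above might be combined is sketched below. This is an illustrative comparison, not the SageOx API's actual server-side logic. The type fingerprint and function compare are hypothetical, and only months present in both fingerprints are compared, which is a simplifying assumption given that the map is sparse.

```go
package main

import (
	"fmt"
	"sort"
)

// fingerprint is a hypothetical, trimmed-down stand-in for RepoFingerprint.
type fingerprint struct {
	FirstCommit        string
	MonthlyCheckpoints map[string]string
}

// compare reports whether a and b share an origin and, if so, the earliest
// month where both have a checkpoint but the hashes differ ("" if none).
func compare(a, b fingerprint) (related bool, divergedAt string) {
	if a.FirstCommit == "" || a.FirstCommit != b.FirstCommit {
		return false, ""
	}
	months := make([]string, 0, len(a.MonthlyCheckpoints))
	for m := range a.MonthlyCheckpoints {
		months = append(months, m)
	}
	sort.Strings(months) // "YYYY-MM" keys sort chronologically as text
	for _, m := range months {
		if other, ok := b.MonthlyCheckpoints[m]; ok && other != a.MonthlyCheckpoints[m] {
			return true, m
		}
	}
	return true, ""
}

func main() {
	upstream := fingerprint{
		FirstCommit:        "c1",
		MonthlyCheckpoints: map[string]string{"2024-01": "c1", "2024-02": "c5"},
	}
	fork := fingerprint{
		FirstCommit:        "c1",
		MonthlyCheckpoints: map[string]string{"2024-01": "c1", "2024-02": "f3"},
	}
	related, month := compare(upstream, fork)
	fmt.Println("related:", related, "diverged at:", month)
}
```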
func ComputeFingerprint ¶
func ComputeFingerprint() (*RepoFingerprint, error)
ComputeFingerprint generates a repository fingerprint from git history. This performs a single O(N) scan of the commit history to compute all fingerprint components efficiently. Returns nil if the repository has no commits.
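The single-pass idea can be sketched over an in-memory commit list. The real function reads git history; here the types commit and fingerprint and the functions compute and sampleCommits are hypothetical, commits are assumed to be ordered oldest first, and month keys are assumed to be derived in UTC.

```go
package main

import (
	"fmt"
	"time"
)

// commit and fingerprint are hypothetical stand-ins for git history entries
// and RepoFingerprint.
type commit struct {
	Hash string
	When time.Time
}

type fingerprint struct {
	FirstCommit        string
	MonthlyCheckpoints map[string]string
	AncestrySamples    []string
}

// compute walks commits (oldest first) once, filling every component.
// It returns nil when there are no commits.
func compute(commits []commit) *fingerprint {
	if len(commits) == 0 {
		return nil
	}
	fp := &fingerprint{
		FirstCommit:        commits[0].Hash,
		MonthlyCheckpoints: map[string]string{},
	}
	next := 1 // 1-based position of the next ancestry sample: 1, 2, 4, ... 256
	for i, c := range commits {
		month := c.When.UTC().Format("2006-01")
		if _, seen := fp.MonthlyCheckpoints[month]; !seen {
			fp.MonthlyCheckpoints[month] = c.Hash // first commit of the month
		}
		if i+1 == next && next <= 256 {
			fp.AncestrySamples = append(fp.AncestrySamples, c.Hash)
			next *= 2
		}
	}
	return fp
}

// sampleCommits builds n fake commits, ten days apart, named c1..cn.
func sampleCommits(n int) []commit {
	start := time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
	out := make([]commit, 0, n)
	for i := 1; i <= n; i++ {
		out = append(out, commit{Hash: fmt.Sprintf("c%d", i), When: start.AddDate(0, 0, 10*(i-1))})
	}
	return out
}

func main() {
	fp := compute(sampleCommits(10))
	fmt.Println("first:", fp.FirstCommit)
	fmt.Println("samples:", fp.AncestrySamples)
	fmt.Println("february checkpoint:", fp.MonthlyCheckpoints["2024-02"])
}
```

With ten sample commits, the ancestry samples are the 1st, 2nd, 4th, and 8th commits, and each month's checkpoint is the first commit dated in that month.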
func (*RepoFingerprint) WithRemoteHashes ¶
func (f *RepoFingerprint) WithRemoteHashes() error
WithRemoteHashes adds salted remote URL hashes to the fingerprint. Call this after ComputeFingerprint() to include remote URL identity. The hashes are salted with FirstCommit to prevent enumeration.