repotools

package
v0.5.0 Latest
Published: Mar 15, 2026 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Package repotools provides VCS abstraction for repository operations.

This package supports multiple version control systems with git as the primary focus. SVN support is included for enterprises like LinkedIn that still use SVN as of 2025.

Key components:

  • VCS detection and requirement checking
  • Git identity detection for per-user session isolation
  • Repository fingerprinting (initial commit hash, remote URL hashing)
  • Secure hashing with salt to prevent enumeration attacks

Package repotools provides utilities for working with repository identifiers.

The repo ID format is: "repo_" + standard UUIDv7 string

UUIDv7 is used because it's time-sortable, which helps with concurrent init detection where the earlier timestamp wins as canonical. The standard UUID format preserves this time-sortability when sorted lexicographically.

Example generated IDs:

repo_01936d5a-0000-7abc-8def-0123456789ab
repo_01936d5a-0001-7abc-8def-0123456789ab
repo_01936d5a-0002-7abc-8def-0123456789ab
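The "earlier timestamp wins" rule can be illustrated with a plain string sort. This is a minimal sketch, not the package's implementation: canonicalRepoID is a hypothetical helper, and it assumes all IDs use the same prefix and lowercase hex. IDs created within the same millisecond are ordered by their remaining bits, not by creation time.

```go
package main

import (
	"fmt"
	"sort"
)

// canonicalRepoID returns the lexicographically smallest ID. For
// UUIDv7-based IDs with a shared prefix this is also the ID with the
// earliest embedded timestamp. Hypothetical helper, not part of repotools.
func canonicalRepoID(ids []string) string {
	if len(ids) == 0 {
		return ""
	}
	sorted := append([]string(nil), ids...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	return sorted[0]
}

func main() {
	ids := []string{
		"repo_01936d5a-0002-7abc-8def-0123456789ab",
		"repo_01936d5a-0000-7abc-8def-0123456789ab",
		"repo_01936d5a-0001-7abc-8def-0123456789ab",
	}
	fmt.Println(canonicalRepoID(ids)) // repo_01936d5a-0000-7abc-8def-0123456789ab
}
```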

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func FindMainRepoRoot

func FindMainRepoRoot(vcs VCS) (string, error)

FindMainRepoRoot finds the main repository root, resolving through worktrees. Unlike FindRepoRoot, which returns the worktree directory, it always returns the main repository root that all worktrees share.

Why this matters: git worktrees are separate working directories that share the same repository. For features like the ledger where we want ONE instance per repository (not per worktree), we need to resolve to the main repo root. Using --show-toplevel would give each worktree its own ledger, fragmenting data.
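One way to resolve through worktrees is to ask git for the shared git directory (`git rev-parse --git-common-dir`) instead of the working tree (`--show-toplevel`), then take its parent. The sketch below covers only the path arithmetic. mainRootFromCommonDir is a hypothetical helper, and it assumes a non-bare repository whose git directory is "<root>/.git"; the package may resolve the root differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// mainRootFromCommonDir derives the main repository root from the output of
// `git rev-parse --git-common-dir`, which may be relative to cwd.
// Hypothetical helper; assumes the git directory lives at "<root>/.git".
func mainRootFromCommonDir(cwd, commonDir string) string {
	if !filepath.IsAbs(commonDir) {
		commonDir = filepath.Join(cwd, commonDir)
	}
	return filepath.Dir(filepath.Clean(commonDir))
}

func main() {
	// In the main working tree, git typically prints a relative path.
	fmt.Println(mainRootFromCommonDir("/src/ox", ".git"))
	// In a linked worktree, it points back at the main repository's git dir.
	fmt.Println(mainRootFromCommonDir("/src/ox-feature", "/src/ox/.git"))
}
```

On a Unix-like system both calls resolve to the same root, which is what keeps a per-repository feature such as the ledger in one place.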

func FindRepoRoot

func FindRepoRoot(vcs VCS) (string, error)

FindRepoRoot finds the root directory of the repository for the given VCS.

func GenerateRepoID

func GenerateRepoID() string

GenerateRepoID generates a new prefixed repo ID using UUIDv7. Format: "repo_" + standard UUIDv7 string (e.g., "repo_01936d5a-0000-7abc-8def-0123456789ab").

UUIDv7 is time-sortable, and the standard format preserves this property when sorted lexicographically (e.g., ls -la .sageox/.repo_*).

Example
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	id := repotools.GenerateRepoID()

	// validate the ID
	valid := repotools.IsValidRepoID(id)
	fmt.Printf("Valid: %v\n", valid)

	// parse back to UUID
	uuid, err := repotools.ParseRepoID(id)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	fmt.Printf("UUID version: %d\n", uuid.Version())
}
Output:
Valid: true
UUID version: 7

func GetCurrentBranch added in v0.3.0

func GetCurrentBranch(dir string) string

GetCurrentBranch returns the current git branch for the given directory. Returns empty string on any error (best-effort).

func GetInitialCommitHash

func GetInitialCommitHash() (string, error)

GetInitialCommitHash returns the hash of the initial (first) commit in the repo. This hash is used as repo_salt for secure hashing of remote URLs.

func GetRemoteURLs

func GetRemoteURLs() ([]string, error)

GetRemoteURLs returns all configured git remote URLs for the current directory.

func GetRemoteURLsForDir added in v0.5.0

func GetRemoteURLsForDir(dir string) ([]string, error)

GetRemoteURLsForDir returns all configured git remote URLs for the given directory. If dir is empty, uses the current working directory.

func GetRepoName

func GetRepoName(gitRoot string) string

GetRepoName returns a human-readable repo name derived from git remotes. Prefers "owner/repo" extracted from the first remote origin URL (e.g. git@github.com:sageox/ox.git → "sageox/ox"). Falls back to the git root directory name if no remote is available. Uses gitRoot to query remotes (via git -C), not the current working directory.
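The "owner/repo" extraction can be sketched as simple string handling. ownerRepoFromURL is a hypothetical helper that covers only scp-like and scheme-based URLs; the package's actual parsing rules and its directory-name fallback are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// ownerRepoFromURL extracts "owner/repo" from common git remote URL shapes.
// Hypothetical helper; returns "" when it cannot find two path segments.
func ownerRepoFromURL(remote string) string {
	s := strings.TrimSuffix(strings.TrimSpace(remote), ".git")
	if i := strings.Index(s, "://"); i >= 0 {
		s = s[i+3:] // drop the scheme, e.g. "https://"
		j := strings.Index(s, "/")
		if j < 0 {
			return ""
		}
		s = s[j+1:] // drop the host (and any user or port)
	} else if i := strings.Index(s, ":"); i >= 0 {
		s = s[i+1:] // scp-like form: user@host:owner/repo
	}
	parts := strings.Split(strings.Trim(s, "/"), "/")
	n := len(parts)
	if n < 2 || parts[n-2] == "" || parts[n-1] == "" {
		return ""
	}
	return parts[n-2] + "/" + parts[n-1]
}

func main() {
	fmt.Println(ownerRepoFromURL("git@github.com:sageox/ox.git"))     // sageox/ox
	fmt.Println(ownerRepoFromURL("https://github.com/sageox/ox.git")) // sageox/ox
}
```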

func HashRemoteURLs

func HashRemoteURLs(salt string, urls []string) []string

HashRemoteURLs creates salted SHA256 hashes of remote URLs.

SECURITY: Remote URLs are hashed (not sent plaintext) to protect repo identity. Knowing a repo's remote URL can reveal private repo names, internal tooling, or organizational structure. The hash allows the server to detect when two repos share the same origin (for merge detection) without learning the actual URL.

The salt (typically the repo's first commit hash) prevents enumeration attacks: an attacker with server access cannot precompute hashes for known repo URLs like "github.com/company/secret-project" to identify which repos are registered. Each repo's salt is unique, so the same URL produces different hashes for different repos.
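The effect of the salt can be seen in a small sketch. hashURLs is a hypothetical stand-in for HashRemoteURLs, and the way salt and URL are combined (salt, newline, URL) is an assumption; the package's actual input encoding and URL normalization may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashURLs returns one salted SHA-256 hex digest per URL.
// ASSUMPTION: the digest input is salt + "\n" + url; the real package may
// combine or normalize these values differently.
func hashURLs(salt string, urls []string) []string {
	out := make([]string, 0, len(urls))
	for _, u := range urls {
		sum := sha256.Sum256([]byte(salt + "\n" + u))
		out = append(out, hex.EncodeToString(sum[:]))
	}
	return out
}

func main() {
	urls := []string{"github.com/company/secret-project"}

	// Two unrelated repos have different first commits, hence different salts.
	a := hashURLs("first-commit-of-repo-a", urls)
	b := hashURLs("first-commit-of-repo-b", urls)
	fmt.Println(a[0] == b[0]) // false: a precomputed table for one salt is useless for another

	// Clones of the same repo share the salt, so their hashes still match.
	c := hashURLs("first-commit-of-repo-a", urls)
	fmt.Println(a[0] == c[0]) // true
}
```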

func IsInstalled

func IsInstalled(vcs VCS) bool

IsInstalled checks if the specified VCS tool is available in PATH.

func IsPublicRepo

func IsPublicRepo() (bool, error)

IsPublicRepo attempts to detect if the repository is public. It currently uses heuristics; this could be enhanced with the GitHub API in the future.

func IsValidRepoID

func IsValidRepoID(id string) bool

IsValidRepoID validates the format of a repo ID. Returns true if the ID has the correct prefix and can be parsed as a valid UUID.

Example
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	// valid ID
	validID := repotools.GenerateRepoID()
	fmt.Printf("Valid ID: %v\n", repotools.IsValidRepoID(validID))

	// invalid IDs
	fmt.Printf("Invalid prefix: %v\n", repotools.IsValidRepoID("invalid_abc123"))
	fmt.Printf("Empty: %v\n", repotools.IsValidRepoID(""))
	fmt.Printf("No prefix: %v\n", repotools.IsValidRepoID("abc123"))

}
Output:
Valid ID: true
Invalid prefix: false
Empty: false
No prefix: false

func ParseRepoID

func ParseRepoID(id string) (uuid.UUID, error)

ParseRepoID parses a repo ID back to its underlying UUID. Returns an error if the ID is invalid or doesn't have the correct prefix.

Example
package main

import (
	"fmt"

	"github.com/sageox/ox/internal/repotools"
)

func main() {
	// generate a repo ID
	id := repotools.GenerateRepoID()

	// parse it back to UUID
	uuid, err := repotools.ParseRepoID(id)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	fmt.Printf("Parsed UUID version: %d\n", uuid.Version())
}
Output:
Parsed UUID version: 7

func RequireVCS

func RequireVCS(vcs VCS) error

RequireVCS checks if a VCS is installed and returns an error if it is not. Use this for fail-fast behavior in commands that require a VCS.
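The fail-fast pattern might look like the sketch below, which re-creates the check with exec.LookPath from the standard library. The lowercase isInstalled and requireVCS are hypothetical stand-ins, and the sketch assumes the executable name equals the VCS string value ("git", "svn"); the error wording is also illustrative.

```go
package main

import (
	"fmt"
	"os/exec"
)

// VCS mirrors the package's string-based type for this sketch.
type VCS string

// isInstalled reports whether an executable named after the VCS is in PATH.
// ASSUMPTION: the binary name equals the VCS value.
func isInstalled(vcs VCS) bool {
	_, err := exec.LookPath(string(vcs))
	return err == nil
}

// requireVCS returns an error when the tool is missing, so a command can
// stop before doing any work.
func requireVCS(vcs VCS) error {
	if !isInstalled(vcs) {
		return fmt.Errorf("%s is required but was not found in PATH", vcs)
	}
	return nil
}

func main() {
	if err := requireVCS("git"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("git is available")
}
```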

Types

type GitIdentity

type GitIdentity struct {
	Name  string // git config user.name
	Email string // git config user.email
}

GitIdentity holds git user configuration.

func DetectGitIdentity

func DetectGitIdentity() (*GitIdentity, error)

DetectGitIdentity reads git user configuration. Returns nil if no identity is configured (both name and email are empty).

func (*GitIdentity) Slug

func (g *GitIdentity) Slug() string

Slug returns a filesystem-safe identifier for the git identity. Uses the email username (the part before @) if available, otherwise the name. Example: "ryan@example.com" -> "ryan"
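A slug along these lines could be derived as follows. The slug function is a hypothetical sketch: the preference for the email username comes from the description above, but lowercasing and the set of characters kept are assumptions about what "filesystem-safe" means here.

```go
package main

import (
	"fmt"
	"strings"
)

// slug prefers the email username and falls back to the name.
// ASSUMPTION: lowercase the result and replace anything outside
// [a-z0-9._-] with '-'; the real sanitization rules may differ.
func slug(name, email string) string {
	base := name
	if at := strings.Index(email, "@"); at > 0 {
		base = email[:at]
	}
	var b strings.Builder
	for _, r := range strings.ToLower(base) {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '-', r == '_', r == '.':
			b.WriteRune(r)
		default:
			b.WriteByte('-')
		}
	}
	// Trim leading and trailing separators so the result is never "." or "..".
	return strings.Trim(b.String(), "-.")
}

func main() {
	fmt.Println(slug("Ryan", "ryan@example.com")) // ryan
	fmt.Println(slug("Ada Lovelace", ""))         // ada-lovelace
}
```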

type RepoFingerprint

type RepoFingerprint struct {
	// FirstCommit is the hash of the initial commit (same as repo_salt).
	// Forks share this value, making it useful for detecting related repos.
	FirstCommit string `json:"first_commit"`

	// MonthlyCheckpoints maps "YYYY-MM" to the first commit hash of that month.
	// Used to detect divergence: if two repos share the same first commit but
	// have different monthly checkpoints, they've diverged.
	// Only includes months with commits; sparse map.
	MonthlyCheckpoints map[string]string `json:"monthly_checkpoints"`

	// AncestrySamples contains commit hashes at power-of-2 intervals from the
	// first commit: 1st, 2nd, 4th, 8th, 16th, 32nd, 64th, 128th, 256th.
	// Provides consistent sampling regardless of commit cadence or age.
	//
	// FUTURE CONSIDERATION (not implemented): Add yearly exponential fingerprints
	// starting from 2027. Each year would have its own set of power-of-2 samples
	// from Jan 1 of that year, stored by year key (e.g., "2027": [...hashes]).
	// These would NOT cascade into following years - each year stands alone.
	//
	// This adds recency detection for repos with long histories. The downside:
	// during the first month of a new year, merge detection may fail if one
	// candidate doesn't have current year hashes yet. Server-side logic must
	// account for this when deciding whether to use current year hashes.
	AncestrySamples []string `json:"ancestry_samples"`

	// RemoteHashes contains salted SHA256 hashes of normalized remote URLs.
	// Different clones of the same repo often share remote URLs (origin).
	// Hashed with FirstCommit as salt to prevent enumeration attacks.
	RemoteHashes []string `json:"remote_hashes,omitempty"`
}

RepoFingerprint holds repository identity fingerprint data for detecting identical or related repositories across different teams or installations.

Purpose: When multiple teams run `ox init` on the same codebase (forks, clones, or parallel work), this fingerprint enables the SageOx API to:

  • Detect that repos are the same or related
  • Suggest team merges when appropriate
  • Identify divergence between forks

The fingerprint uses multiple signals because no single identifier is perfect:

  • FirstCommit: Identifies the original repo, but forks share this
  • MonthlyCheckpoints: Detect divergence over time (different commits = divergence)
  • AncestrySamples: Consistent sampling regardless of commit frequency

func ComputeFingerprint

func ComputeFingerprint() (*RepoFingerprint, error)

ComputeFingerprint generates a repository fingerprint from git history. This performs a single O(N) scan of the commit history to compute all fingerprint components efficiently. Returns nil if the repository has no commits.
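The single-pass idea can be sketched over an in-memory commit list. Everything here is illustrative: commit, fingerprint, sampleCommits and computeFingerprint are hypothetical names, commits are assumed to arrive oldest first, and months are keyed in UTC. The real function reads git history directly and may treat ordering and time zones differently.

```go
package main

import (
	"fmt"
	"time"
)

// commit is a hypothetical, minimal view of one history entry.
type commit struct {
	Hash string
	When time.Time
}

// fingerprint holds the three history-derived signals described above.
type fingerprint struct {
	FirstCommit        string
	MonthlyCheckpoints map[string]string
	AncestrySamples    []string
}

// computeFingerprint fills all fields in one pass over commits (oldest first).
// It returns nil when there are no commits.
func computeFingerprint(commits []commit) *fingerprint {
	if len(commits) == 0 {
		return nil
	}
	fp := &fingerprint{
		FirstCommit:        commits[0].Hash,
		MonthlyCheckpoints: map[string]string{},
	}
	next := 1 // next 1-based position to sample: 1, 2, 4, ..., 256
	for i, c := range commits {
		month := c.When.UTC().Format("2006-01") // "YYYY-MM" key (UTC is an assumption)
		if _, seen := fp.MonthlyCheckpoints[month]; !seen {
			fp.MonthlyCheckpoints[month] = c.Hash // first commit seen in that month
		}
		if i+1 == next && next <= 256 {
			fp.AncestrySamples = append(fp.AncestrySamples, c.Hash)
			next *= 2
		}
	}
	return fp
}

// sampleCommits builds n synthetic commits "c1".."cn", one per day,
// starting on 2025-01-30 UTC.
func sampleCommits(n int) []commit {
	start := time.Date(2025, 1, 30, 0, 0, 0, 0, time.UTC)
	out := make([]commit, n)
	for i := range out {
		out[i] = commit{Hash: fmt.Sprintf("c%d", i+1), When: start.AddDate(0, 0, i)}
	}
	return out
}

func main() {
	fp := computeFingerprint(sampleCommits(10))
	fmt.Println(fp.FirstCommit)                   // c1
	fmt.Println(fp.AncestrySamples)               // [c1 c2 c4 c8]
	fmt.Println(fp.MonthlyCheckpoints["2025-02"]) // c3
}
```

Because the monthly map and the power-of-2 samples are both updated inside the same loop, the cost stays linear in the number of commits.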

func (*RepoFingerprint) WithRemoteHashes

func (f *RepoFingerprint) WithRemoteHashes() error

WithRemoteHashes adds salted remote URL hashes to the fingerprint. Call this after ComputeFingerprint() to include remote URL identity. The hashes are salted with FirstCommit to prevent enumeration.

type VCS

type VCS string

VCS represents a version control system type.

const (
	VCSGit VCS = "git"
	VCSSvn VCS = "svn" // future: LinkedIn still uses SVN as of 2025
)

func DetectVCS

func DetectVCS() (VCS, error)

DetectVCS determines which VCS is being used in the current directory. Returns the detected VCS type, or an error if none is found.
