package cache
v0.14.0
Warning

This package is not in the latest version of its module.

Published: Jan 2, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package cache provides semantic analysis caching with content-addressable storage.

Cache Versioning System

The cache uses a three-tier versioning system to detect when cached entries become stale due to changes in extraction logic, analysis prompts, or data structures.

Version Tiers:

  • CacheSchemaVersion: Tracks changes to the CachedAnalysis structure itself. Bump when adding, removing, or renaming fields in CachedAnalysis.

  • CacheMetadataVersion: Tracks changes to metadata extraction logic. Bump when changing FileMetadata fields, extraction algorithms, or handlers.

  • CacheSemanticVersion: Tracks changes to semantic analysis logic. Bump when changing prompts, SemanticAnalysis fields, or analysis routing.

Cache Key Format:

Old format: {hash[:16]}.json
New format: {hash[:16]}-v{schema}-{metadata}-{semantic}.json

Example: sha256:abc12345-v1-1-1.json
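
The new key format can be sketched as follows. This is a minimal illustration, not the package's implementation: cacheKey is a hypothetical helper, and truncating to at most 16 characters is an assumption read off the {hash[:16]} notation above.

```go
package main

import "fmt"

// cacheKey is a hypothetical helper that builds a versioned cache file name.
// It keeps at most the first 16 characters of the hash (assumption based on
// the {hash[:16]} notation) and appends the three version numbers.
func cacheKey(hash string, schema, metadata, semantic int) string {
	prefix := hash
	if len(prefix) > 16 {
		prefix = prefix[:16]
	}
	return fmt.Sprintf("%s-v%d-%d-%d.json", prefix, schema, metadata, semantic)
}

func main() {
	fmt.Println(cacheKey("sha256:abc12345", 1, 1, 1))
}
```

Because the versions are part of the file name, bumping any tier changes the key, so entries written under older versions are no longer found under the current key.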

Constants

const CacheMetadataVersion = 1

CacheMetadataVersion tracks changes to metadata extraction logic. Increment when:

  • Adding fields to FileMetadata
  • Changing metadata extraction algorithms
  • Fixing bugs in metadata handlers
  • Adding new metadata handlers
  • Changing categorization logic
  • Updating readability detection
const CacheSchemaVersion = 1

CacheSchemaVersion tracks changes to the CachedAnalysis structure. Increment when:

  • Adding fields to CachedAnalysis struct
  • Removing fields from CachedAnalysis struct
  • Renaming fields in CachedAnalysis struct
  • Changing field types in CachedAnalysis struct
  • Changing cache storage format (JSON structure)
  • Changing cache key generation algorithm
const CacheSemanticVersion = 2

CacheSemanticVersion tracks changes to semantic analysis logic. Increment when:

  • Changing prompt templates
  • Adding fields to SemanticAnalysis
  • Changing analysis routing logic (which analyzer for which file type)
  • Updating response parsing logic
  • Changing confidence score calculations
  • Updating entity/reference extraction
  • Fixing bugs in semantic analysis

Version 2: Multi-provider semantic analysis refactor (claude.* + analysis.* → semantic.*)

const (
	// HashPrefix is the standard prefix for SHA-256 content hashes
	HashPrefix = "sha256:"
)

Functions

func CacheVersion added in v0.13.0

func CacheVersion() string

CacheVersion returns the combined version string in the format "v{schema}.{metadata}.{semantic}".
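
As a rough sketch of that format, the three constants documented above can be combined like this. formatVersion and the lowercase constants are hypothetical local stand-ins, copied here so the example runs without importing the package.

```go
package main

import "fmt"

// Local copies of the documented constant values
// (CacheSchemaVersion = 1, CacheMetadataVersion = 1, CacheSemanticVersion = 2).
const (
	schemaVersion   = 1
	metadataVersion = 1
	semanticVersion = 2
)

// formatVersion is a hypothetical helper that renders the
// "v{schema}.{metadata}.{semantic}" format described in the docs.
func formatVersion(schema, metadata, semantic int) string {
	return fmt.Sprintf("v%d.%d.%d", schema, metadata, semantic)
}

func main() {
	fmt.Println(formatVersion(schemaVersion, metadataVersion, semanticVersion)) // v1.1.2
}
```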

func HashFile

func HashFile(filePath string) (string, error)
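
HashFile is undocumented here. One plausible reading, given HashPrefix, is that it streams the file through SHA-256 and returns the hex digest with the "sha256:" prefix. The sketch below shows that reading only; hashFile and demo are hypothetical names, and the exact output format is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

const hashPrefix = "sha256:" // mirrors the documented HashPrefix constant

// hashFile streams the file through SHA-256 so large files are not
// loaded into memory, then returns prefix + lowercase hex digest.
// The output format is an assumption, not confirmed by the docs.
func hashFile(filePath string) (string, error) {
	f, err := os.Open(filePath)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hashPrefix + hex.EncodeToString(h.Sum(nil)), nil
}

// demo writes "hello" to a temporary file and hashes it.
func demo() (string, error) {
	tmp, err := os.CreateTemp("", "hashfile-demo")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("hello"); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	return hashFile(tmp.Name())
}

func main() {
	sum, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sum)
}
```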

func IsCurrentVersion added in v0.13.0

func IsCurrentVersion(cached *types.CachedAnalysis) bool

IsCurrentVersion checks if a cache entry is from the current version.

func IsLegacyVersion added in v0.13.0

func IsLegacyVersion(cached *types.CachedAnalysis) bool

IsLegacyVersion checks if a cache entry is a legacy entry (version 0.0.0). Legacy entries are from before versioning was implemented.

func IsStaleVersion added in v0.13.0

func IsStaleVersion(cached *types.CachedAnalysis) bool

IsStaleVersion checks if a cache entry is from an older version that should be re-analyzed. Returns true if the entry should be considered stale and re-analyzed.

Staleness rules:

  • Schema version mismatch = always stale (incompatible structure)
  • Metadata version behind current = stale (missing newer metadata fields)
  • Semantic version behind current = stale (outdated analysis)
  • Future versions (newer than current) = not stale (forward compatible)
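
The staleness rules above can be sketched as a small comparison function. This is an illustration under stated assumptions: version and isStaleVersion are hypothetical, the real function takes a *types.CachedAnalysis, and the sketch assumes the schema rule takes precedence, so a newer schema is still treated as stale.

```go
package main

import "fmt"

// version is a hypothetical stand-in for the three version fields
// carried by a cached entry.
type version struct {
	schema, metadata, semantic int
}

// isStaleVersion applies the documented rules in order:
// any schema mismatch is stale; metadata or semantic versions behind
// the current ones are stale; newer metadata/semantic versions are not.
func isStaleVersion(entry, current version) bool {
	if entry.schema != current.schema {
		return true // incompatible structure
	}
	if entry.metadata < current.metadata {
		return true // missing newer metadata fields
	}
	if entry.semantic < current.semantic {
		return true // outdated analysis
	}
	return false // current or forward compatible
}

func main() {
	current := version{schema: 1, metadata: 1, semantic: 2}
	fmt.Println(isStaleVersion(version{1, 1, 1}, current)) // true: semantic behind
	fmt.Println(isStaleVersion(version{1, 2, 3}, current)) // false: newer than current
}
```

A legacy entry, which parses as (0, 0, 0), is stale under these rules whenever the current schema version is nonzero.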

func ParseCacheVersion added in v0.13.0

func ParseCacheVersion(cached *types.CachedAnalysis) (schema, metadata, semantic int)

ParseCacheVersion extracts version components from a cached analysis entry. Returns (0, 0, 0) for legacy entries that don't have version fields set.

func ShardPath added in v0.14.0

func ShardPath(basePath, hash, filename string) string

ShardPath generates a two-level sharded directory path for a given hash. This prevents filesystem performance degradation when directories contain many files by distributing entries across 65,536 possible subdirectories.

For hashes with "sha256:" prefix, the prefix is stripped before sharding to use the actual hash value for directory distribution.

Examples:

  • "sha256:41d6..." → "{base}/41/d6/{filename}"
  • "abc123..." → "{base}/ab/c1/{filename}"
  • "ab" (short) → "{base}/{filename}" (no sharding)
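
The sharding scheme in these examples can be sketched as below. shardPath is a hypothetical reimplementation for illustration; in particular, the cutoff of four characters for "short" hashes is an assumption, since the docs only show that a two-character hash is not sharded.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// shardPath strips an optional "sha256:" prefix and uses the first two
// pairs of characters as nested directory names. With hex hashes, each
// level has 256 possible names, giving 256 * 256 = 65,536 shards.
// Hashes shorter than four characters are not sharded (assumed cutoff).
func shardPath(basePath, hash, filename string) string {
	h := strings.TrimPrefix(hash, "sha256:")
	if len(h) < 4 {
		return filepath.Join(basePath, filename)
	}
	return filepath.Join(basePath, h[:2], h[2:4], filename)
}

func main() {
	fmt.Println(shardPath("cache", "sha256:41d6aa", "entry.json"))
	fmt.Println(shardPath("cache", "abc123", "entry.json"))
	fmt.Println(shardPath("cache", "ab", "entry.json"))
}
```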

func VersionString added in v0.13.0

func VersionString(cached *types.CachedAnalysis) string

VersionString returns the version string for a cached entry in format "v{schema}.{metadata}.{semantic}".

Types

type CacheStats added in v0.13.0

type CacheStats struct {
	TotalEntries  int            `json:"total_entries"`
	LegacyEntries int            `json:"legacy_entries"`
	TotalSize     int64          `json:"total_size_bytes"`
	VersionCounts map[string]int `json:"version_counts"`
}

CacheStats provides statistics about the cache contents.

type Manager

type Manager struct {
	// contains filtered or unexported fields
}

func NewManager

func NewManager(cacheDir string) (*Manager, error)

func (*Manager) Clear

func (m *Manager) Clear() error

func (*Manager) ClearOldVersions added in v0.13.0

func (m *Manager) ClearOldVersions() (int, error)

ClearOldVersions removes all cache entries that are not the current version. Recursively walks all subdirectories (provider and shard directories). Returns the number of entries removed.

func (*Manager) Get

func (m *Manager) Get(fileHash, provider string) (*types.CachedAnalysis, error)

func (*Manager) GetCacheDir added in v0.13.0

func (m *Manager) GetCacheDir() string

GetCacheDir returns the cache directory path.

func (*Manager) GetStats added in v0.13.0

func (m *Manager) GetStats() (*CacheStats, error)

GetStats returns statistics about the cache contents including version distribution. Recursively walks all subdirectories (provider and shard directories).
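
A recursive walk of this kind can be sketched with filepath.WalkDir. The example is only an approximation: collectStats, stats, and demo are hypothetical, only entry count and total size are gathered, and treating every ".json" file as a cache entry is an assumption.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// stats is a reduced, hypothetical version of CacheStats.
type stats struct {
	TotalEntries int
	TotalSize    int64
}

// collectStats walks root recursively and counts every ".json" file,
// regardless of how deep it sits in provider or shard directories.
func collectStats(root string) (stats, error) {
	var s stats
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".json" {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		s.TotalEntries++
		s.TotalSize += info.Size()
		return nil
	})
	return s, err
}

// demo builds a small temporary cache tree and collects its stats.
func demo() (stats, error) {
	root, err := os.MkdirTemp("", "cache-stats")
	if err != nil {
		return stats{}, err
	}
	defer os.RemoveAll(root)

	shard := filepath.Join(root, "provider", "41", "d6")
	if err := os.MkdirAll(shard, 0o755); err != nil {
		return stats{}, err
	}
	files := map[string]string{
		filepath.Join(shard, "a.json"):            `{}`,      // 2 bytes
		filepath.Join(root, "provider", "b.json"): `{"x":1}`, // 7 bytes
		filepath.Join(root, "notes.txt"):          "ignored",
	}
	for name, body := range files {
		if err := os.WriteFile(name, []byte(body), 0o644); err != nil {
			return stats{}, err
		}
	}
	return collectStats(root)
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("entries=%d size=%d\n", s.TotalEntries, s.TotalSize)
}
```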

func (*Manager) IsStale

func (m *Manager) IsStale(cached *types.CachedAnalysis, currentHash string) bool

IsStale checks if a cache entry is stale and needs re-analysis. Returns true if either:

  • Content hash doesn't match (file was modified)
  • Cache version is outdated (needs re-analysis with current logic)
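
The two conditions can be combined as in the sketch below. The entry type and its field names are hypothetical, since the fields of types.CachedAnalysis are not shown on this page, and the version comparison follows the staleness rules documented for IsStaleVersion.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for types.CachedAnalysis.
type entry struct {
	contentHash                string
	schema, metadata, semantic int
}

// current mirrors the documented constant values (1, 1, 2).
var current = entry{schema: 1, metadata: 1, semantic: 2}

// needsReanalysis reports whether the file changed since it was cached
// or the entry was produced by older extraction or analysis logic.
func needsReanalysis(e entry, currentHash string) bool {
	if e.contentHash != currentHash {
		return true // file was modified
	}
	return e.schema != current.schema ||
		e.metadata < current.metadata ||
		e.semantic < current.semantic
}

func main() {
	e := entry{contentHash: "sha256:aaaa", schema: 1, metadata: 1, semantic: 2}
	fmt.Println(needsReanalysis(e, "sha256:aaaa")) // false
	fmt.Println(needsReanalysis(e, "sha256:bbbb")) // true
}
```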

func (*Manager) Set

func (m *Manager) Set(cached *types.CachedAnalysis) error
