Documentation ¶
Overview ¶
Package types defines the core domain model for the knowing knowledge graph, including the result types for graph queries and traversals.
All entities (nodes, edges, files, repos, snapshots) are content-addressed using SHA-256 hashes, enabling deterministic identity, deduplication, and Merkle-based snapshot diffing. The hash functions in this package define the canonical identity computations used throughout the system.
Key types:
- Node: a symbol (function, type, method, etc.) in the knowledge graph
- Edge: a relationship (calls, imports, implements, references) between nodes
- File and Repo: tracked source artifacts
- Snapshot: a point-in-time Merkle root over all edges in a repo
- EdgeEvent: an append-only event for event-sourced diff tracking
The GraphStore interface (interfaces.go) and Extractor interface define the contracts that concrete implementations must satisfy.
Index ¶
- type BlastRadiusResult
- type CalleeResult
- type CallerResult
- type CallerWithProvenance
- type ComputationCache
- type DerivedResult
- type DiffResult
- type Edge
- type EdgeEvent
- type EdgeProvenance
- type ExtractOptions
- type ExtractResult
- type Extractor
- type File
- type GraphStore
- type Hash
- type Node
- type Repo
- type Snapshot
- type TraversalOptions
Types ¶
type BlastRadiusResult ¶
type BlastRadiusResult struct {
Target Node // the node whose blast radius was computed
ByRepo map[string][]CallerWithProvenance // repo URL -> callers in that repo
TotalCount int // total number of callers across all repos
Truncated bool // true if traversal hit the max depth limit
}
BlastRadiusResult groups all transitive callers of a target node by the repository they belong to. This powers the blast_radius MCP tool, showing how a change to one symbol ripples across the codebase.
type CalleeResult ¶
CalleeResult is a single node in a transitive callees traversal, paired with its depth (hop count) from the query source.
type CallerResult ¶
CallerResult is a single node in a transitive callers traversal, paired with its depth (hop count) from the query target.
type CallerWithProvenance ¶
type CallerWithProvenance struct {
Caller Node
Depth int
Confidence float64 // minimum confidence along the call path (0.0 to 1.0)
Provenance []EdgeProvenance // ordered provenance chain from caller to target
}
CallerWithProvenance pairs a caller node with the edge provenance chain from that caller back to the target. Confidence is the minimum confidence along the call path.
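The "minimum confidence along the call path" rule can be sketched as a fold over the provenance chain. This is an illustration only: `minConfidence` is a hypothetical helper, EdgeProvenance is trimmed to two fields, and returning 0 for an empty chain is an assumption rather than documented behavior.

```go
package main

import "fmt"

// EdgeProvenance is trimmed to the two fields this sketch needs.
type EdgeProvenance struct {
	Source     string
	Confidence float64
}

// minConfidence is a hypothetical helper. It returns the lowest confidence
// found in the chain, or 0 for an empty chain (an assumption of this sketch).
func minConfidence(chain []EdgeProvenance) float64 {
	if len(chain) == 0 {
		return 0
	}
	lowest := chain[0].Confidence
	for _, step := range chain[1:] {
		if step.Confidence < lowest {
			lowest = step.Confidence
		}
	}
	return lowest
}

func main() {
	chain := []EdgeProvenance{
		{Source: "ast_resolved", Confidence: 1.0},
		{Source: "ast_inferred", Confidence: 0.7},
		{Source: "lsp_resolved", Confidence: 0.9},
	}
	// The weakest link (ast_inferred) bounds the confidence of the whole path.
	fmt.Println(minConfidence(chain)) // prints 0.7
}
```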
type ComputationCache ¶
type ComputationCache interface {
// Get retrieves a cached result by its content-addressed hash.
Get(ctx context.Context, resultHash Hash) (*DerivedResult, error)
// GetByQuery retrieves a cached result by query type, parameter hash, and snapshot root.
GetByQuery(ctx context.Context, queryType string, params Hash, snapshot Hash) (*DerivedResult, error)
// Put stores a computed result.
Put(ctx context.Context, result DerivedResult) error
// Invalidate evicts cached results that depend on edges changed between
// oldSnapshot and newSnapshot. Returns the number of evicted entries.
Invalidate(ctx context.Context, oldSnapshot, newSnapshot Hash, diff DiffResult) (evicted int, err error)
}
ComputationCache manages content-addressed derived computation results (e.g., cached blast radius queries). Results are keyed by a combination of query type, parameters, and snapshot root, and are automatically invalidated when the snapshot changes.
The interface is defined now; its implementation is deferred.
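Since no implementation ships with the package, the following is only a sketch of how the (query type, parameters, snapshot root) key could behave in memory. `memoryCache`, `queryKey`, and the unexported method names are hypothetical, the types are trimmed, and the context parameter and Invalidate are omitted for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// Hash mirrors the package's 32-byte digest type.
type Hash [32]byte

// DerivedResult is trimmed to the fields this sketch uses.
type DerivedResult struct {
	QueryType    string
	QueryParams  Hash
	SnapshotRoot Hash
	Data         []byte
}

// queryKey is a hypothetical composite key for lookups by query.
type queryKey struct {
	queryType string
	params    Hash
	snapshot  Hash
}

// memoryCache is a hypothetical in-memory cache guarded by a mutex.
type memoryCache struct {
	mu      sync.RWMutex
	byQuery map[queryKey]DerivedResult
}

func newMemoryCache() *memoryCache {
	return &memoryCache{byQuery: make(map[queryKey]DerivedResult)}
}

// put stores a result under its (query type, params, snapshot root) key.
func (c *memoryCache) put(r DerivedResult) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byQuery[queryKey{r.QueryType, r.QueryParams, r.SnapshotRoot}] = r
}

// getByQuery reports whether a result exists for the given key.
func (c *memoryCache) getByQuery(queryType string, params, snapshot Hash) (DerivedResult, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	r, ok := c.byQuery[queryKey{queryType, params, snapshot}]
	return r, ok
}

func main() {
	cache := newMemoryCache()
	params, oldRoot, newRoot := Hash{1}, Hash{2}, Hash{3}
	cache.put(DerivedResult{QueryType: "blast_radius", QueryParams: params, SnapshotRoot: oldRoot, Data: []byte("cached")})

	_, hit := cache.getByQuery("blast_radius", params, oldRoot)
	_, stale := cache.getByQuery("blast_radius", params, newRoot)
	fmt.Println(hit, stale) // prints true false
}
```

Because the snapshot root is part of the key, a lookup against a newer snapshot misses without any explicit eviction; Invalidate would then only be needed to reclaim space.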
type DerivedResult ¶
type DerivedResult struct {
ResultHash Hash // content-addressed hash of this result
QueryType string // type of query (e.g., "blast_radius", "transitive_callers")
QueryParams Hash // hash of the query parameters
SnapshotRoot Hash // snapshot root at the time of computation
Data []byte // serialized result data
ComputedAt int64 // unix timestamp when computed
ComputedBy string // identifier of the computing agent/indexer
}
DerivedResult is a content-addressed cached computation result. Used by ComputationCache to store and retrieve expensive query results (e.g., blast radius, transitive callers) keyed by query parameters and snapshot root.
type DiffResult ¶
type DiffResult struct {
OldSnapshot Hash
NewSnapshot Hash
EdgesAdded []Edge // edges present in NewSnapshot but not OldSnapshot
EdgesRemoved []Edge // edges present in OldSnapshot but not NewSnapshot
NodesAdded []Node // nodes present in NewSnapshot but not OldSnapshot
NodesRemoved []Node // nodes present in OldSnapshot but not NewSnapshot
}
DiffResult contains the structural diff between two snapshots. Used by the snapshot_diff and semantic_diff MCP tools to report what changed between two points in time.
type Edge ¶
type Edge struct {
EdgeHash Hash // content-addressed identity: sha256(sourceHash || targetHash || edgeType || provenance)
SourceHash Hash // hash of the source node (caller, importer, implementor)
TargetHash Hash // hash of the target node (callee, imported package, interface)
EdgeType string // relationship kind: "calls", "imports", "implements", "references"
Confidence float64 // quality score from 0.0 to 1.0; ast_inferred=0.7, lsp_resolved=0.9, ast_resolved=1.0
Provenance string // how the edge was derived: "ast_resolved", "ast_inferred", "lsp_resolved", etc.
// CallSite fields store the source location of the call expression (not the
// target declaration). These positions are used by LSP enrichment: the enricher
// sends GetDefinition at (CallSiteFile, CallSiteLine, CallSiteCol) to confirm
// or correct the target. Zero values mean no call-site info is available.
CallSiteLine int // 1-indexed line of the call expression in the source file
CallSiteCol int // 0-indexed column of the call expression
CallSiteFile string // relative file path (within the repo) containing the call expression
// Runtime observation fields. Zero values for static edges.
ObservationCount int // total observations in current window (0 for static edges)
LastObserved int64 // unix timestamp of last observation (0 for static edges)
}
Edge represents a directed relationship between two nodes in the knowledge graph. Edge types include "calls", "imports", "implements", and "references". Each edge carries a confidence score and provenance tag indicating how it was derived (ast_resolved, ast_inferred, lsp_resolved, etc.).
type EdgeEvent ¶
type EdgeEvent struct {
EventID int64 // auto-increment primary key
EdgeHash Hash // hash of the edge that was added or removed
EventType string // "added" or "removed"
SnapshotHash Hash // the snapshot during which this event occurred
SourceCommit string // git commit that triggered the event
IndexerVer string // version of the indexer that produced this event (e.g., "v1")
Timestamp int64 // unix timestamp of the event
}
EdgeEvent represents an append-only edge mutation event for event sourcing. Each time an edge is added or removed during an index run, an EdgeEvent is recorded. These events power the SnapshotDiff query by tracking which edges changed between snapshots.
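One way to picture how an append-only event log could yield a diff is to fold the events and keep only the net effect per edge. The sketch below is an assumption about the approach, not the store's actual query: `netChanges` is a hypothetical function and EdgeEvent is trimmed to two fields.

```go
package main

import "fmt"

// Hash mirrors the package's 32-byte digest type.
type Hash [32]byte

// EdgeEvent is trimmed to the fields this sketch needs.
type EdgeEvent struct {
	EdgeHash  Hash
	EventType string // "added" or "removed"
}

// netChanges is a hypothetical helper. It folds events in order and returns
// the edges whose net effect is an addition or a removal. Edges that were
// added and later removed (or the reverse) cancel out.
func netChanges(events []EdgeEvent) (added, removed []Hash) {
	net := make(map[Hash]int)
	var order []Hash // first-seen order, for deterministic output
	for _, ev := range events {
		var delta int
		switch ev.EventType {
		case "added":
			delta = 1
		case "removed":
			delta = -1
		default:
			continue // ignore unknown event types
		}
		if _, seen := net[ev.EdgeHash]; !seen {
			order = append(order, ev.EdgeHash)
		}
		net[ev.EdgeHash] += delta
	}
	for _, h := range order {
		switch {
		case net[h] > 0:
			added = append(added, h)
		case net[h] < 0:
			removed = append(removed, h)
		}
	}
	return added, removed
}

func main() {
	a, b, c := Hash{1}, Hash{2}, Hash{3}
	added, removed := netChanges([]EdgeEvent{
		{a, "added"},
		{b, "added"},
		{b, "removed"}, // b cancels out
		{c, "removed"},
	})
	fmt.Println(len(added), len(removed)) // prints 1 1
}
```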
type EdgeProvenance ¶
type EdgeProvenance struct {
Source string // derivation method: "ast_resolved", "ast_inferred", "lsp_resolved", etc.
Confidence float64 // confidence score of this provenance step (0.0 to 1.0)
IndexerVersion string // version of the indexer that produced this edge
SourceCommit string // git commit hash at the time of extraction
SourceFileHash Hash // hash of the source file from which the edge was extracted
Timestamp int64 // unix timestamp of extraction
}
EdgeProvenance captures the full derivation history of an edge. Used in BlastRadiusResult to show the provenance chain from a caller back to the target, so consumers can assess trustworthiness.
type ExtractOptions ¶
type ExtractOptions struct {
RepoURL string // the repo URL (or local path) as registered in the store
RepoHash Hash // sha256(RepoURL)
CommitHash string // git commit hash being indexed
FilePath string // file path relative to the repository root
FileHash Hash // content-addressed file hash: sha256(repoHash || path || contentHash)
Content []byte // raw file contents
ModuleRoot string // absolute path to the module/repo root on disk (for go.mod resolution)
// ModuleToRepoURL maps Go module paths to stored repo URLs. This is
// populated by the indexer from the repos table so extractors can
// resolve cross-repo call targets to the correct stored repo URL
// rather than using heuristic inference from the import path.
// Example: "github.com/org/repo" -> "/Users/user/code/repo"
ModuleToRepoURL map[string]string
}
ExtractOptions contains all inputs needed for a single file extraction run. The indexer populates these fields and passes them to the selected Extractor.
type ExtractResult ¶
ExtractResult contains the nodes and edges produced by an extractor.
type Extractor ¶
type Extractor interface {
// Name returns a human-readable identifier for this extractor (e.g., "go", "go-treesitter").
Name() string
// CanHandle returns true if this extractor can process the file at the given path.
// The path is relative to the repository root.
CanHandle(path string) bool
// Extract parses the file described by opts and returns extracted nodes and edges.
// Returns an empty result (not an error) if no symbols are found.
Extract(ctx context.Context, opts ExtractOptions) (*ExtractResult, error)
}
Extractor produces nodes and edges from source files. The indexer maintains a registry of extractors and dispatches each file to the first extractor whose CanHandle returns true. Implementations include GoExtractor (full type resolution), GoTreeSitterExtractor (fast AST-only), and TreeSitterExtractor (Python via tree-sitter).
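The first-match dispatch described above can be sketched as a linear scan over the registry. `selectExtractor` and `suffixExtractor` are hypothetical names, the "python" extractor name is invented for the example, and the interface is trimmed to the two methods the dispatch needs (Extract is omitted).

```go
package main

import (
	"fmt"
	"strings"
)

// Extractor is trimmed to the methods used for dispatch.
type Extractor interface {
	Name() string
	CanHandle(path string) bool
}

// suffixExtractor is a hypothetical extractor that matches on file extension.
type suffixExtractor struct{ name, suffix string }

func (e suffixExtractor) Name() string               { return e.name }
func (e suffixExtractor) CanHandle(path string) bool { return strings.HasSuffix(path, e.suffix) }

// selectExtractor returns the first registered extractor that can handle path.
func selectExtractor(registry []Extractor, path string) (Extractor, bool) {
	for _, e := range registry {
		if e.CanHandle(path) {
			return e, true
		}
	}
	return nil, false
}

func main() {
	registry := []Extractor{
		suffixExtractor{name: "go", suffix: ".go"},
		suffixExtractor{name: "go-treesitter", suffix: ".go"},
		suffixExtractor{name: "python", suffix: ".py"},
	}
	if e, ok := selectExtractor(registry, "internal/store/sqlite.go"); ok {
		fmt.Println(e.Name()) // prints go
	}
}
```

Registration order matters under this scheme: placing the full-resolution extractor ahead of the AST-only one means the faster extractor is only reached if the first declines the file.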
type File ¶
type File struct {
FileHash Hash // sha256(repoHash || relativePath || contentHash)
RepoHash Hash // hash of the containing Repo
Path string // path relative to the repository root
ContentHash Hash // sha256 of the raw file contents; used for skip-if-unchanged checks
}
File represents a tracked source file within a repository. The FileHash incorporates the repo, path, and content, so a file's identity changes whenever its content changes (enabling content-based change detection).
type GraphStore ¶
type GraphStore interface {
// Write operations (upsert semantics via INSERT OR REPLACE).
PutNode(ctx context.Context, n Node) error
PutEdge(ctx context.Context, e Edge) error
PutFile(ctx context.Context, f File) error
PutRepo(ctx context.Context, r Repo) error
RecordEdgeEvent(ctx context.Context, ev EdgeEvent) error
CreateSnapshot(ctx context.Context, s Snapshot) error
// Point lookups by hash. Return nil when not found (no error).
GetNode(ctx context.Context, hash Hash) (*Node, error)
GetEdge(ctx context.Context, hash Hash) (*Edge, error)
GetSnapshot(ctx context.Context, hash Hash) (*Snapshot, error)
GetRepo(ctx context.Context, hash Hash) (*Repo, error)
// Query operations.
NodesByName(ctx context.Context, qualifiedPrefix string) ([]Node, error)
EdgesFrom(ctx context.Context, sourceHash Hash, edgeType string) ([]Edge, error)
EdgesTo(ctx context.Context, targetHash Hash, edgeType string) ([]Edge, error)
DanglingEdges(ctx context.Context) ([]Edge, error)
AllRepos(ctx context.Context) ([]Repo, error)
NodesByQualifiedName(ctx context.Context, qualifiedName string) ([]Node, error)
// Delete operations for incremental re-indexing and garbage collection.
DeleteEdge(ctx context.Context, hash Hash) error
DeleteNodesByFile(ctx context.Context, fileHash Hash) (int, error)
DeleteEdgesBySourceFile(ctx context.Context, fileHash Hash) ([]Edge, error)
EdgesBySourceFile(ctx context.Context, fileHash Hash) ([]Edge, error)
DeleteSnapshot(ctx context.Context, hash Hash) error
// Graph traversals (implemented as recursive CTEs in SQLite).
TransitiveCallers(ctx context.Context, target Hash, maxDepth int, snapshot Hash) ([]CallerResult, error)
TransitiveCallees(ctx context.Context, source Hash, maxDepth int, snapshot Hash) ([]CalleeResult, error)
BlastRadius(ctx context.Context, target Hash, snapshot Hash) (*BlastRadiusResult, error)
// Snapshot operations.
SnapshotDiff(ctx context.Context, oldRoot, newRoot Hash) (*DiffResult, error)
StaleEdges(ctx context.Context, snapshot Hash) ([]Edge, error)
LatestSnapshot(ctx context.Context, repoHash Hash) (*Snapshot, error)
// File queries.
FilesByRepo(ctx context.Context, repoHash Hash) ([]File, error)
FileByPath(ctx context.Context, repoHash Hash, path string) (*File, error)
NodesByFilePath(ctx context.Context, repoHash Hash, path string) ([]Node, error)
// Close releases the underlying database connection.
Close() error
}
GraphStore defines the operations the graph engine requires from its backing store. SQLite implements this today; an adjacency-list or external graph backend can implement it tomorrow without changing callers.
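The interface is organized into seven groups, plus Close:
- Write operations: PutNode, PutEdge, PutFile, PutRepo, RecordEdgeEvent, CreateSnapshot
- Point lookups: GetNode, GetEdge, GetSnapshot, GetRepo
- Query operations: NodesByName, NodesByQualifiedName, EdgesFrom, EdgesTo, DanglingEdges, AllRepos
- Delete operations: DeleteEdge, DeleteNodesByFile, DeleteEdgesBySourceFile, EdgesBySourceFile, DeleteSnapshot
- Graph traversals: TransitiveCallers, TransitiveCallees, BlastRadius
- Snapshot operations: SnapshotDiff, StaleEdges, LatestSnapshot
- File queries: FilesByRepo, FileByPath, NodesByFilePath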
All methods except Close accept a context for cancellation and timeout propagation. Methods that return a pointer return a nil pointer with a nil error when the entity is not found.
type Hash ¶
type Hash [32]byte
Hash is a content-addressed identifier (SHA-256 digest, 32 bytes). Used as the primary key for all graph entities: nodes, edges, files, repos, and snapshots. Two entities with identical content always produce the same Hash.
var EmptyHash Hash
EmptyHash is the zero-value hash.
func ComputeEdgeHash ¶
ComputeEdgeHash computes the content-addressed hash for an edge. The hash formula is: SHA-256(sourceHash + NUL + targetHash + NUL + edgeType + NUL + provenance). Because provenance is included, upgrading an edge from "ast_inferred" to "lsp_resolved" produces a new hash (the old edge must be deleted first).
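A sketch of the edge formula, to show why a provenance upgrade changes the identity. `edgeHashSketch` is a hypothetical stand-in for ComputeEdgeHash; it assumes the source and target hashes are fed in as raw 32-byte values, which the documentation does not specify.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash mirrors the package's 32-byte digest type.
type Hash [32]byte

// edgeHashSketch is a hypothetical stand-in for ComputeEdgeHash. It hashes
// source, target, edgeType, and provenance joined by single NUL bytes.
// Assumption: the node hashes are written as raw bytes, not hex strings.
func edgeHashSketch(source, target Hash, edgeType, provenance string) Hash {
	var buf []byte
	buf = append(buf, source[:]...)
	buf = append(buf, 0)
	buf = append(buf, target[:]...)
	buf = append(buf, 0)
	buf = append(buf, edgeType...)
	buf = append(buf, 0)
	buf = append(buf, provenance...)
	return Hash(sha256.Sum256(buf))
}

func main() {
	src := Hash(sha256.Sum256([]byte("caller")))
	dst := Hash(sha256.Sum256([]byte("callee")))

	inferred := edgeHashSketch(src, dst, "calls", "ast_inferred")
	resolved := edgeHashSketch(src, dst, "calls", "lsp_resolved")

	// Same endpoints and type, different provenance: two distinct identities.
	fmt.Println(inferred != resolved) // prints true
}
```

An upsert keyed on the edge hash therefore cannot overwrite the inferred edge in place, which is why the old edge has to be deleted explicitly.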
func ComputeNodeHash ¶
ComputeNodeHash computes the content-addressed hash for a node. The contentHash parameter is accepted for API compatibility but is not included in the hash computation. Node identity depends on (repo, package, name, kind) only.
The hash formula is: SHA-256(repoURL + NUL + packagePath + NUL + symbolName + NUL + symbolKind). NUL bytes are used as field separators to prevent ambiguous concatenation (e.g., "a/b" + "c" vs "a" + "b/c").
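The separator rule is easiest to see in code. The sketch below follows the stated formula, but `nodeHashSketch` is a hypothetical stand-in rather than the package's ComputeNodeHash, and it drops the unused contentHash parameter.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash mirrors the package's 32-byte digest type.
type Hash [32]byte

// nodeHashSketch is a hypothetical stand-in for ComputeNodeHash. It hashes
// the four identity fields joined by single NUL bytes.
func nodeHashSketch(repoURL, packagePath, symbolName, symbolKind string) Hash {
	h := sha256.New()
	for i, field := range []string{repoURL, packagePath, symbolName, symbolKind} {
		if i > 0 {
			h.Write([]byte{0}) // NUL separator between fields
		}
		h.Write([]byte(field))
	}
	var out Hash
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	// Without separators both inputs would concatenate to the same bytes.
	a := nodeHashSketch("repo", "a/b", "c", "function")
	b := nodeHashSketch("repo", "a", "b/c", "function")
	fmt.Println(a == b) // prints false
}
```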
type Node ¶
type Node struct {
// NodeHash is the content-addressed identity: sha256(repoURL || packagePath || symbolName || symbolKind).
// Note: contentHash was removed from the computation; node identity
// depends only on (repo, package, name, kind).
NodeHash Hash
FileHash Hash // reference to the containing File record
QualifiedName string // fully qualified name: "{repoURL}://{pkgPath}.{TypeName}.{SymbolName}"
Kind string // one of: function, type, method, interface, const, var
Line int // 1-indexed source line number of the declaration
Signature string // human-readable type signature for display (e.g., "func (SQLiteStore) PutNode()")
Doc string // doc comment preceding the declaration (first 200 chars, language-agnostic)
}
Node represents a symbol in the knowledge graph. A node is a function, type, method, interface, const, or var declaration extracted from source code. Nodes are identified by a content-addressed hash computed from (repo, package, name, kind), so two nodes in different files with the same qualified identity will share a hash.
type Repo ¶
type Repo struct {
RepoHash Hash // sha256(repoURL); canonical identity for the repo
RepoURL string // the URL or path that was passed to IndexRepo
LastCommit string // git commit hash from the most recent index run
LastIndexed int64 // unix timestamp of the most recent index run
}
Repo represents a tracked repository. The RepoURL can be either a remote URL (e.g., "github.com/org/repo") or a local filesystem path, depending on how the repo was registered.
type Snapshot ¶
type Snapshot struct {
SnapshotHash Hash // Merkle root: merkle_root(sorted(all_edge_hashes))
ParentHash Hash // hash of the previous snapshot in the chain; zero for the first
RepoHash Hash // hash of the repository this snapshot belongs to
CommitHash string // git commit hash at the time of snapshotting
Timestamp int64 // unix timestamp when the snapshot was created
NodeCount int // total number of nodes in the graph at snapshot time
EdgeCount int // total number of edges in the graph at snapshot time
}
Snapshot represents a point-in-time graph state for a single repository. The SnapshotHash is the Merkle root computed over all sorted edge hashes, providing a tamper-evident fingerprint of the entire graph at a given commit. Snapshots form a singly-linked chain via ParentHash, enabling efficient diffing and garbage collection.
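To illustrate why sorting the edge hashes makes the root independent of insertion order, here is a small Merkle root sketch. The tree shape is an assumption: pairs are combined with SHA-256 over their concatenation, an unpaired hash is promoted unchanged, and an empty input yields the zero hash. The package's actual construction may differ, and `merkleRootSketch` is a hypothetical name.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"sort"
)

// Hash mirrors the package's 32-byte digest type.
type Hash [32]byte

// merkleRootSketch sorts the edge hashes and reduces them pairwise.
// Assumptions: SHA-256(left || right) for pairs, odd hash promoted as-is,
// zero hash for empty input.
func merkleRootSketch(edgeHashes []Hash) Hash {
	if len(edgeHashes) == 0 {
		return Hash{}
	}
	// Copy before sorting so the caller's slice is left untouched.
	level := append([]Hash(nil), edgeHashes...)
	sort.Slice(level, func(i, j int) bool {
		return bytes.Compare(level[i][:], level[j][:]) < 0
	})
	for len(level) > 1 {
		next := make([]Hash, 0, (len(level)+1)/2)
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // unpaired hash moves up unchanged
				continue
			}
			var pair [64]byte
			copy(pair[:32], level[i][:])
			copy(pair[32:], level[i+1][:])
			next = append(next, Hash(sha256.Sum256(pair[:])))
		}
		level = next
	}
	return level[0]
}

func main() {
	a, b, c := Hash{1}, Hash{2}, Hash{3}
	r1 := merkleRootSketch([]Hash{a, b, c})
	r2 := merkleRootSketch([]Hash{c, a, b})
	fmt.Println(r1 == r2) // prints true: input order does not matter
}
```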
type TraversalOptions ¶
type TraversalOptions struct {
MaxDepth int // maximum hop count from the starting node
MaxResults int // maximum number of results to return
MinConfidence float64 // minimum edge confidence to follow (0.0 to 1.0)
}
TraversalOptions controls bounded graph traversal with early termination. Used to prevent unbounded recursion in transitive caller/callee queries.
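The three limits map naturally onto a breadth-first search. The sketch below is illustrative only: `transitiveCallers`, `callEdge`, and `callerHit` are hypothetical, nodes are identified by plain strings instead of Hash values for readability, and the limits are taken literally (a zero MaxDepth or MaxResults returns nothing).

```go
package main

import "fmt"

// TraversalOptions mirrors the package type.
type TraversalOptions struct {
	MaxDepth      int
	MaxResults    int
	MinConfidence float64
}

// callEdge is a hypothetical trimmed edge: caller calls callee.
type callEdge struct {
	caller, callee string
	confidence     float64
}

// callerHit pairs a caller with its hop count from the target.
type callerHit struct {
	name  string
	depth int
}

// transitiveCallers walks caller edges breadth-first from target, stopping at
// MaxDepth hops or MaxResults hits and skipping edges below MinConfidence.
func transitiveCallers(edges []callEdge, target string, opts TraversalOptions) []callerHit {
	callersOf := make(map[string][]callEdge)
	for _, e := range edges {
		callersOf[e.callee] = append(callersOf[e.callee], e)
	}
	visited := map[string]bool{target: true} // guards against cycles
	frontier := []string{target}
	var hits []callerHit
	for depth := 1; depth <= opts.MaxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, callee := range frontier {
			for _, e := range callersOf[callee] {
				if e.confidence < opts.MinConfidence || visited[e.caller] {
					continue
				}
				if len(hits) >= opts.MaxResults {
					return hits // early termination on the result cap
				}
				visited[e.caller] = true
				hits = append(hits, callerHit{e.caller, depth})
				next = append(next, e.caller)
			}
		}
		frontier = next
	}
	return hits
}

func main() {
	edges := []callEdge{
		{"a", "target", 1.0},
		{"b", "a", 0.9},
		{"c", "b", 0.5},
		{"d", "target", 0.7},
	}
	opts := TraversalOptions{MaxDepth: 2, MaxResults: 10, MinConfidence: 0.7}
	fmt.Println(transitiveCallers(edges, "target", opts)) // prints [{a 1} {d 1} {b 2}]
}
```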