Documentation
Overview ¶
Package core defines the L0 contracts for Crowl: types, interfaces, hashing, and format version constants. It imports only the standard library, chiefly crypto/sha256 and encoding/json.
Index ¶
- Constants
- Variables
- func CanonicalizeSignature(s string) string
- func EncodeEdge(e Edge) ([]byte, error)
- func EncodeLineage(l Lineage) ([]byte, error)
- func EncodeSymbol(s Symbol) ([]byte, error)
- func NormalizeBody(b []byte) []byte
- type CommitWriter
- type Direction
- type DriftResult
- type Edge
- type EdgeKind
- type FQN
- type Hash
- func ComputeBodyHash(kind SymbolKind, content []byte) Hash
- func ComputeEdgeID(from, to Hash, kind EdgeKind) Hash
- func ComputeLineageID(priorID, nextID Hash, reason LineageReason) Hash
- func ComputeSymbolID(fqn FQN, kind SymbolKind) Hash
- func ComputeTreeHash(activeIDs []Hash) Hash
- func HashBytes(b []byte) Hash
- func ParseHash(s string) (Hash, error)
- type KnowledgeCommit
- type Lifetime
- type Lineage
- type LineageReason
- type Record
- type RecordKind
- type Span
- type Store
- type Symbol
- type SymbolKind
- type WhyEvent
Constants ¶
const CrowlSemver = "0.5.2"
CrowlSemver is the tool version this binary identifies as in KnowledgeCommit. Bump on each release. Distinct from FormatVersion.
const FormatVersion uint16 = 4
FormatVersion is the on-disk format version this binary writes.
History:
v1 (0.1.x, 0.2.x): Symbol = {version, id, fqn, kind, signature, span, bodyHash}
v2 (0.3.x): adds Symbol.Language (string) for multi-language support.
v3 (0.4.x): Python FQNs change to module-path form
("click.core.Command.invoke" instead of
"src/click/core.py:Command.invoke"). Symbol.ID
changes for affected symbols; v2 indexes need
re-extraction (no in-place ID rewrite).
v4 (0.5.x): adds SymbolEnum (13), SymbolEnumMember (14),
SymbolNamespace (15), and EdgeExtends (12).
Strictly additive: existing v3 records read as
v4 unchanged. TypeScript Symbol FQNs migrate to
module-path form (extension stripped, "."-joined);
v3 TS indexes require re-extraction. New tree-sitter
pipeline (`internal/extract/tspipeline/`) coexists
with the legacy go/types Go extractor; selectable
via `crowl analyze --extractor`.
const MinSupportedVersion uint16 = 1
MinSupportedVersion is the oldest format this binary can read (with migration).
Variables ¶
var (
ErrNotFound = errors.New("core: not found")
ErrUnknownCommit = errors.New("core: unknown commit")
ErrFormatVersionUnknown = errors.New("core: format version not supported by this binary")
ErrCorruptRecord = errors.New("core: corrupt record")
)
Sentinel errors used across the core.
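Sentinel errors are normally matched with errors.Is, so they can still be recognized after a caller wraps them with extra context. The sketch below redeclares ErrNotFound locally so it compiles on its own; lookup and isNotFound are hypothetical helpers, not part of package core.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound mirrors the sentinel declared in package core. It is
// redeclared here only so the sketch is self-contained.
var ErrNotFound = errors.New("core: not found")

// lookup is a hypothetical helper that always fails, wrapping the
// sentinel with the ID it was asked for.
func lookup(id string) error {
	return fmt.Errorf("lookup %s: %w", id, ErrNotFound)
}

// isNotFound reports whether err is, or wraps, ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	err := lookup("abc123")
	fmt.Println(isNotFound(err)) // true
}
```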
var EmptyBodyHash = HashBytes(nil)
EmptyBodyHash is SHA-256("") — used for non-textual symbols (Folder, Package). FORMAT_SPEC §6.5.
var EmptyHash = Hash{}
EmptyHash is the well-known zero hash.
Functions ¶
func CanonicalizeSignature ¶
CanonicalizeSignature collapses runs of ASCII whitespace into single spaces and trims leading/trailing whitespace. Used for Symbol.Signature. See FORMAT_SPEC §4.1.
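The whitespace rule described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name canonicalize is hypothetical, and the exact set of bytes treated as ASCII whitespace (space, tab, newline, carriage return, vertical tab, form feed) is an assumption, since FORMAT_SPEC §4.1 is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// isASCIISpace reports whether c is an ASCII whitespace byte.
// The exact set is an assumption of this sketch.
func isASCIISpace(c byte) bool {
	switch c {
	case ' ', '\t', '\n', '\r', '\v', '\f':
		return true
	}
	return false
}

// canonicalize collapses each run of ASCII whitespace into a single
// space and drops leading and trailing whitespace. It works byte-wise,
// so multi-byte UTF-8 sequences pass through untouched.
func canonicalize(s string) string {
	var b strings.Builder
	pendingSpace := false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if isASCIISpace(c) {
			// Only remember the gap if something was already written,
			// which is what drops leading whitespace.
			pendingSpace = b.Len() > 0
			continue
		}
		if pendingSpace {
			b.WriteByte(' ')
			pendingSpace = false
		}
		b.WriteByte(c)
	}
	// A pending space at the end is never written, which trims the tail.
	return b.String()
}

func main() {
	fmt.Println(canonicalize("  func  Foo(a int,\n\tb string)  "))
	// func Foo(a int, b string)
}
```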
func EncodeEdge ¶
EncodeEdge returns the canonical JSON encoding.
func EncodeLineage ¶
EncodeLineage returns the canonical JSON encoding.
func EncodeSymbol ¶
EncodeSymbol returns the canonical JSON encoding of s per FORMAT_SPEC §2.2. (Single line, no whitespace around : or ,, no trailing newline.)
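For illustration, encoding/json's Marshal already emits the single-line shape described here: fields in declaration order, no whitespace around separators, and no trailing newline. The sketch below uses a trimmed-down, hypothetical miniSymbol type rather than the real Symbol, and it is an assumption that EncodeSymbol behaves like plain json.Marshal (for example with respect to HTML escaping).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniSymbol is a hypothetical three-field stand-in for core.Symbol.
type miniSymbol struct {
	Version uint16 `json:"version"`
	FQN     string `json:"fqn"`
	Kind    uint8  `json:"kind"`
}

// encode returns the compact JSON form of s.
func encode(s miniSymbol) ([]byte, error) {
	return json.Marshal(s)
}

func main() {
	b, err := encode(miniSymbol{Version: 4, FQN: "pkg.Foo", Kind: 8})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", b)
	// {"version":4,"fqn":"pkg.Foo","kind":8}
}
```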
func NormalizeBody ¶
NormalizeBody strips leading/trailing ASCII whitespace per FORMAT_SPEC §6.5 and FOUNDATION §4.2. Used as input to BodyHash for textual symbols.
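A minimal sketch of this trimming step, assuming the ASCII whitespace set is space, tab, newline, carriage return, vertical tab, and form feed (the spec sections cited above are not reproduced here, so the set is a guess). The name normalizeBody is local to the example.

```go
package main

import (
	"bytes"
	"fmt"
)

// asciiSpace is the cutset this sketch assumes for "ASCII whitespace".
const asciiSpace = " \t\n\r\v\f"

// normalizeBody strips leading and trailing ASCII whitespace.
// Interior whitespace is left untouched.
func normalizeBody(b []byte) []byte {
	return bytes.Trim(b, asciiSpace)
}

func main() {
	out := normalizeBody([]byte("\n\tfunc f() {}\n  "))
	fmt.Printf("%q\n", out) // "func f() {}"
}
```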
Types ¶
type CommitWriter ¶
type CommitWriter interface {
Put(ctx context.Context, r Record) error
Remove(ctx context.Context, id Hash) error
Finish(ctx context.Context) (KnowledgeCommit, error)
Abort(ctx context.Context) error
}
CommitWriter accumulates record-level changes for one source commit and produces the corresponding KnowledgeCommit when Finish is called.
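One plausible way to drive such a writer is to Put every record, call Abort on the first failure, and call Finish otherwise. The sketch below uses trimmed-down stand-ins for Record and KnowledgeCommit and a hypothetical in-memory fakeWriter; whether Abort is required after a failed Put is an assumption, since the interface contract does not say.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Trimmed-down stand-ins for the core types, just enough to compile.
type Record struct{ Payload []byte }
type KnowledgeCommit struct{ SourceSha string }

// writer is the subset of CommitWriter this sketch exercises.
type writer interface {
	Put(ctx context.Context, r Record) error
	Finish(ctx context.Context) (KnowledgeCommit, error)
	Abort(ctx context.Context) error
}

// fakeWriter is a hypothetical implementation that rejects empty payloads.
type fakeWriter struct {
	sha     string
	puts    int
	aborted bool
}

func (w *fakeWriter) Put(_ context.Context, r Record) error {
	if len(r.Payload) == 0 {
		return errors.New("empty payload")
	}
	w.puts++
	return nil
}

func (w *fakeWriter) Finish(context.Context) (KnowledgeCommit, error) {
	return KnowledgeCommit{SourceSha: w.sha}, nil
}

func (w *fakeWriter) Abort(context.Context) error {
	w.aborted = true
	return nil
}

// writeAll puts every record and finishes, aborting on the first failure.
func writeAll(ctx context.Context, w writer, recs []Record) (KnowledgeCommit, error) {
	for _, r := range recs {
		if err := w.Put(ctx, r); err != nil {
			// Join keeps the Put error and any Abort error together.
			return KnowledgeCommit{}, errors.Join(err, w.Abort(ctx))
		}
	}
	return w.Finish(ctx)
}

// run drives writeAll against a fresh fakeWriter and reports what happened.
func run(recs []Record) (*fakeWriter, KnowledgeCommit, error) {
	w := &fakeWriter{sha: "deadbeef"}
	kc, err := writeAll(context.Background(), w, recs)
	return w, kc, err
}

func main() {
	w, kc, err := run([]Record{{Payload: []byte("a")}, {Payload: []byte("b")}})
	fmt.Println(w.puts, kc.SourceSha, err) // 2 deadbeef <nil>
}
```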
type DriftResult ¶
DriftResult is the symbol-level diff between two indexed states (typically a historical SourceSha and current HEAD). Each list is sorted by FQN for deterministic output. Used by Query.Drift and the crowl_drift MCP tool.
- Added: symbols active at HEAD that were not active at the historical point.
- Removed: symbols active at the historical point that are not active at HEAD.
- Updated: symbols active at both points that gained at least one new revision in the interval. Because Symbol.ID is keyed by FQN+Kind, a new revision here means a body-changing rewrite; a change that alters the ID shows up as an Add plus a Remove instead.
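The Added and Removed lists amount to a set difference in each direction, sorted by FQN. The sketch below shows that idea with plain strings and maps; diff is a hypothetical helper, and the Updated list is left out because it depends on revision history that this example does not model.

```go
package main

import (
	"fmt"
	"sort"
)

// diff returns the FQNs present only in cur (added) and only in old
// (removed), each sorted for deterministic output.
func diff(old, cur map[string]bool) (added, removed []string) {
	for fqn := range cur {
		if !old[fqn] {
			added = append(added, fqn)
		}
	}
	for fqn := range old {
		if !cur[fqn] {
			removed = append(removed, fqn)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	historical := map[string]bool{"pkg.A": true, "pkg.B": true}
	head := map[string]bool{"pkg.B": true, "pkg.D": true, "pkg.C": true}
	added, removed := diff(historical, head)
	fmt.Println(added, removed) // [pkg.C pkg.D] [pkg.A]
}
```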
type Edge ¶
type Edge struct {
Version uint16 `json:"version"`
ID Hash `json:"id"`
From Hash `json:"from"`
To Hash `json:"to"`
Kind EdgeKind `json:"kind"`
}
Edge is a directed, typed relationship.
func DecodeEdge ¶
DecodeEdge parses a canonical JSON Edge blob. Edge schema has not changed v1→v2.
type EdgeKind ¶
type EdgeKind uint8
EdgeKind enumerates typed relationships between symbols.
const (
EdgeContains EdgeKind = 1
EdgeDefines EdgeKind = 2
EdgeHasMethod EdgeKind = 3
EdgeHasField EdgeKind = 4
EdgeEmbeds EdgeKind = 5
EdgeCalls EdgeKind = 6
EdgeImports EdgeKind = 7
// 8 is reserved (was EdgeReferences, removed in v0.3.4-dev: defined
// but never emitted by any extractor; query/MCP layers treated its
// absence as "no references" which was a silent-correctness lie).
EdgeImplements EdgeKind = 9
EdgeReturns EdgeKind = 10
EdgeAccepts EdgeKind = 11
EdgeExtends EdgeKind = 12 // v4: class A extends B; interface I extends J; trait sub-supertrait
)
type FQN ¶
type FQN string
FQN is a fully-qualified name. Format is per-language; the core treats it as opaque.
type Hash ¶
type Hash [32]byte
Hash is a 32-byte SHA-256 digest. Wire form is 64-char lowercase hex.
func ComputeBodyHash ¶
func ComputeBodyHash(kind SymbolKind, content []byte) Hash
ComputeBodyHash applies the per-kind body-hash rule:
- Folder, Package: SHA-256("")
- File: SHA-256(content)
- Other (textual): SHA-256(NormalizeBody(content))
Caller passes the relevant byte slice (file content for File; declaration span bytes for textual symbols).
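The per-kind rule can be sketched as a three-way switch. The kind numbers below are taken from the SymbolKind constants in this package; the function name bodyHash and the whitespace cutset used for normalization are assumptions of the example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// Kind values copied from the SymbolKind constants.
const (
	kindFolder   uint8 = 1
	kindFile     uint8 = 2
	kindPackage  uint8 = 3
	kindFunction uint8 = 8
)

// bodyHash applies the per-kind rule: containers hash the empty string,
// files hash their raw content, everything else hashes trimmed content.
func bodyHash(kind uint8, content []byte) [32]byte {
	switch kind {
	case kindFolder, kindPackage:
		return sha256.Sum256(nil)
	case kindFile:
		return sha256.Sum256(content)
	default:
		// Assumed ASCII whitespace set; see NormalizeBody.
		return sha256.Sum256(bytes.Trim(content, " \t\n\r\v\f"))
	}
}

func main() {
	a := bodyHash(kindFunction, []byte("  func f() {}\n"))
	b := bodyHash(kindFunction, []byte("func f() {}"))
	fmt.Println(a == b) // true
}
```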
func ComputeEdgeID ¶
ComputeEdgeID per FORMAT_SPEC §6.2:
input := from-bytes (32) || to-bytes (32) || uint16-LE(kind)
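A sketch of how that 66-byte input could be assembled and hashed. The name edgeID is local to the example, and hashing the input with SHA-256 is inferred from the Hash type rather than stated on this line.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// edgeID concatenates from (32 bytes), to (32 bytes), and the kind as a
// little-endian uint16, then hashes the result.
func edgeID(from, to [32]byte, kind uint8) [32]byte {
	buf := make([]byte, 0, 66)
	buf = append(buf, from[:]...)
	buf = append(buf, to[:]...)
	buf = binary.LittleEndian.AppendUint16(buf, uint16(kind))
	return sha256.Sum256(buf)
}

func main() {
	from := sha256.Sum256([]byte("caller"))
	to := sha256.Sum256([]byte("callee"))
	fmt.Printf("%x\n", edgeID(from, to, 6)) // 6 = EdgeCalls
}
```

Because the endpoints are concatenated in order, swapping from and to yields a different ID, which matches the edge being directed.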
func ComputeLineageID ¶
func ComputeLineageID(priorID, nextID Hash, reason LineageReason) Hash
ComputeLineageID per FORMAT_SPEC §6.3:
input := priorId-bytes (32) || nextId-bytes (32) || uint16-LE(reason)
func ComputeSymbolID ¶
func ComputeSymbolID(fqn FQN, kind SymbolKind) Hash
ComputeSymbolID per FORMAT_SPEC §6.1:
input := uint32-LE(len(fqn)) || fqn || uint16-LE(kind)
id := SHA-256(input)
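The layout above translates into a length-prefixed buffer followed by the kind. The sketch below is an illustration under that reading; symbolInput and symbolID are names local to the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// symbolInput builds uint32-LE(len(fqn)) || fqn || uint16-LE(kind).
func symbolInput(fqn string, kind uint8) []byte {
	buf := make([]byte, 0, 4+len(fqn)+2)
	buf = binary.LittleEndian.AppendUint32(buf, uint32(len(fqn)))
	buf = append(buf, fqn...)
	buf = binary.LittleEndian.AppendUint16(buf, uint16(kind))
	return buf
}

// symbolID hashes the input with SHA-256.
func symbolID(fqn string, kind uint8) [32]byte {
	return sha256.Sum256(symbolInput(fqn, kind))
}

func main() {
	fmt.Printf("%x\n", symbolID("click.core.Command.invoke", 9)) // 9 = SymbolMethod
}
```

The length prefix keeps the FQN unambiguous inside the buffer, so the FQN bytes can never be confused with the trailing kind bytes.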
func ComputeTreeHash ¶
ComputeTreeHash per FORMAT_SPEC §6.4:
sort active record IDs lexicographically; concat raw bytes; SHA-256.
activeIDs is consumed (sorted in place). Pass a copy if the caller needs to retain the original order.
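A sketch of the sort, concatenate, and hash steps, including the in-place sort that the note above warns about. The name treeHash is local to the example, and comparing raw bytes is an assumption about what "lexicographically" means here (it gives the same order as comparing lowercase hex).

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"slices"
	"sort"
)

// treeHash sorts ids in place by raw byte order, feeds them to SHA-256
// in that order, and returns the digest.
func treeHash(ids [][32]byte) [32]byte {
	sort.Slice(ids, func(i, j int) bool {
		return bytes.Compare(ids[i][:], ids[j][:]) < 0
	})
	h := sha256.New()
	for i := range ids {
		h.Write(ids[i][:])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	a, b := [32]byte{1}, [32]byte{2}
	ids := [][32]byte{b, a}
	// Clone first when the original order still matters to the caller.
	sum := treeHash(slices.Clone(ids))
	fmt.Println(sum == treeHash([][32]byte{a, b})) // true
	fmt.Println(ids[0] == b)                       // true
}
```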
func (Hash) MarshalText ¶
MarshalText implements encoding.TextMarshaler so a Hash JSON-encodes as its hex form, matching FORMAT_SPEC §2.3.
func (*Hash) UnmarshalText ¶
UnmarshalText decodes a hex string into a Hash.
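The two text methods might look roughly like the sketch below, which redeclares Hash locally. It is an assumption that decoding is a plain hex decode with a length check; the real method may be stricter, for example by rejecting uppercase hex, which encoding/hex accepts.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Hash is a local copy of the 32-byte digest type.
type Hash [32]byte

// MarshalText renders h as 64 lowercase hex characters.
func (h Hash) MarshalText() ([]byte, error) {
	out := make([]byte, hex.EncodedLen(len(h)))
	hex.Encode(out, h[:])
	return out, nil
}

// UnmarshalText parses 64 hex characters into h. On error h is left
// unchanged because decoding happens in a temporary first.
func (h *Hash) UnmarshalText(text []byte) error {
	if len(text) != hex.EncodedLen(len(h)) {
		return fmt.Errorf("hash: want %d hex chars, got %d", hex.EncodedLen(len(h)), len(text))
	}
	var tmp Hash
	if _, err := hex.Decode(tmp[:], text); err != nil {
		return err
	}
	*h = tmp
	return nil
}

func main() {
	h := Hash{0: 0xab, 31: 0x01}
	b, err := json.Marshal(h) // encoding/json picks up MarshalText
	if err != nil {
		panic(err)
	}
	fmt.Println(len(b)) // 66
}
```

The printed length is the 64 hex characters plus the two JSON quote characters.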
type KnowledgeCommit ¶
type KnowledgeCommit struct {
Version uint16 `json:"version"`
SourceSha string `json:"sourceSha"`
Parents []Hash `json:"-"` // encoded as git commit's parents, not in meta.json
Added []Hash `json:"added"`
Removed []Hash `json:"removed"`
Updated []Hash `json:"updated"`
Tree Hash `json:"tree"`
Timestamp int64 `json:"timestamp"`
CrowlVersion string `json:"crowlVersion"`
}
KnowledgeCommit is the set of mutations derived from one source commit. Parents mirrors git's commit DAG: 0 = root, 1 = normal, 2+ = merge. See FOUNDATION §4 and ADR-001 for parent semantics.
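The `json:"-"` tag on Parents means the field never appears in the encoded metadata. The sketch below shows that effect on a hypothetical four-field stand-in; it uses strings instead of Hash for brevity and assumes the metadata is produced with plain json.Marshal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniCommit is a hypothetical cut-down KnowledgeCommit.
type miniCommit struct {
	Version   uint16   `json:"version"`
	SourceSha string   `json:"sourceSha"`
	Parents   []string `json:"-"` // skipped by encoding/json
	Added     []string `json:"added"`
}

// encodeMeta returns the JSON form of kc without the Parents field.
func encodeMeta(kc miniCommit) ([]byte, error) {
	return json.Marshal(kc)
}

func main() {
	b, err := encodeMeta(miniCommit{
		Version:   4,
		SourceSha: "deadbeef",
		Parents:   []string{"p1", "p2"},
		Added:     []string{"a1"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", b)
	// {"version":4,"sourceSha":"deadbeef","added":["a1"]}
}
```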
type Lifetime ¶
Lifetime is derived state: when a record was added/removed and by whom. NOT stored in blobs; computed from tree presence across KCs. See FORMAT_SPEC §4.4.
type Lineage ¶
type Lineage struct {
Version uint16 `json:"version"`
ID Hash `json:"id"`
PriorID Hash `json:"priorId"`
NextID Hash `json:"nextId"`
Reason LineageReason `json:"reason"`
Manual bool `json:"manual"`
}
Lineage links a prior symbol identity to a successor identity.
func DecodeLineage ¶
DecodeLineage parses a canonical JSON Lineage blob. Lineage schema has not changed v1→v2.
type LineageReason ¶
type LineageReason uint8
LineageReason describes why two symbols are linked. v0.1 has only Renamed.
const (
LineageRenamed LineageReason = 1
)
func (LineageReason) IsValid ¶
func (r LineageReason) IsValid() bool
IsValid reports whether r is in the v1 valid range.
type Record ¶
type Record struct {
Kind RecordKind
Payload []byte
Lifetime Lifetime
}
Record is the atomic unit of the graph: a payload + lifetime.
func LineageToRecord ¶
LineageToRecord wraps l in a Record.
func SymbolToRecord ¶
SymbolToRecord wraps s in a Record with an empty Lifetime. Lifetime is computed/managed by the storage layer, not here.
type RecordKind ¶
type RecordKind uint8
RecordKind tags the payload of a Record.
const (
RecordSymbol RecordKind = 1
RecordEdge RecordKind = 2
RecordLineage RecordKind = 3
)
type Span ¶
type Span struct {
File string `json:"file"`
StartByte uint32 `json:"startByte"`
EndByte uint32 `json:"endByte"`
StartLine uint32 `json:"startLine"`
EndLine uint32 `json:"endLine"`
}
Span is a byte range within a source file.
type Store ¶
type Store interface {
GetRecord(ctx context.Context, id Hash) (Record, error)
GetKnowledgeCommit(ctx context.Context, sourceSha string) (KnowledgeCommit, error)
Walk(ctx context.Context, from Hash, kind EdgeKind, dir Direction) iter.Seq2[Edge, error]
AsOf(commit string) Store
BeginCommit(ctx context.Context, sourceSha string) (CommitWriter, error)
}
Store is the L0 storage interface (FOUNDATION §5).
Reads return state at HEAD by default; AsOf returns a time-travel view. Writes are scoped to a commit being built — see CommitWriter.
Error semantics:
- GetRecord returns ErrNotFound if no revision is active at HEAD.
- GetKnowledgeCommit returns ErrNotFound if no KC exists for sourceSha.
- AsOf is lazy; if commit is unknown, ErrUnknownCommit surfaces on first read.
type Symbol ¶
type Symbol struct {
Version uint16 `json:"version"`
ID Hash `json:"id"`
FQN FQN `json:"fqn"`
Kind SymbolKind `json:"kind"`
Signature string `json:"signature"`
Span Span `json:"span"`
BodyHash Hash `json:"bodyHash"`
Language string `json:"language,omitempty"` // v2+; "go", "typescript", "python"
}
Symbol is a named, addressable code element. Version is the FIRST field per FORMAT_SPEC §4.1 (file-format convention: version header first).
In v2, Language is added as the LAST field (preserving v1 reader compat at the JSON layer — v1 readers ignore unknown fields). Empty Language is the v1 default and indicates "go" by convention. The migration function sets Language="go" on every existing v1 record at read time.
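The read-time default can be illustrated with a small decoder: a v1 blob has no language key, so the field decodes to the empty string and is then treated as "go". The struct and the decodeLanguage helper below are hypothetical; the actual migration lives outside this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// symbolOnDisk is a hypothetical subset of the Symbol JSON schema.
type symbolOnDisk struct {
	Version  uint16 `json:"version"`
	FQN      string `json:"fqn"`
	Language string `json:"language,omitempty"`
}

// decodeLanguage parses blob and returns its language, defaulting to
// "go" when the field is absent or empty.
func decodeLanguage(blob string) (string, error) {
	var s symbolOnDisk
	if err := json.Unmarshal([]byte(blob), &s); err != nil {
		return "", err
	}
	if s.Language == "" {
		return "go", nil
	}
	return s.Language, nil
}

func main() {
	lang, err := decodeLanguage(`{"version":1,"fqn":"pkg.Foo"}`)
	fmt.Println(lang, err) // go <nil>
}
```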
func DecodeSymbol ¶
DecodeSymbol parses a canonical JSON Symbol blob and validates version range + kind. Does NOT migrate — that is L1 work; see internal/migrate/. Production callers should use migrate.DecodeAndMigrateSymbol; this primitive is exported for callers that need to inspect raw on-disk version (e.g., debugging, format introspection).
L0 boundary note: core/ imports only stdlib, so the migration step lives one layer up (see internal/migrate).
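A decode-then-validate step of this kind could look like the sketch below. It works on a hypothetical three-field rawSymbol and redeclares the constants and sentinels locally; which sentinel is returned for which failure, and the use of 1 through 15 as the valid kind range, are assumptions of the example rather than documented behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

const (
	minSupportedVersion uint16 = 1
	formatVersion       uint16 = 4
	maxSymbolKind       uint8  = 15 // SymbolNamespace
)

var (
	errFormatVersionUnknown = errors.New("core: format version not supported by this binary")
	errCorruptRecord        = errors.New("core: corrupt record")
)

// rawSymbol is a hypothetical subset of the on-disk Symbol.
type rawSymbol struct {
	Version uint16 `json:"version"`
	FQN     string `json:"fqn"`
	Kind    uint8  `json:"kind"`
}

// decode parses blob and checks version range and kind. It does not migrate.
func decode(blob []byte) (rawSymbol, error) {
	var s rawSymbol
	if err := json.Unmarshal(blob, &s); err != nil {
		return rawSymbol{}, fmt.Errorf("%w: %v", errCorruptRecord, err)
	}
	if s.Version < minSupportedVersion || s.Version > formatVersion {
		return rawSymbol{}, fmt.Errorf("%w: got %d", errFormatVersionUnknown, s.Version)
	}
	if s.Kind < 1 || s.Kind > maxSymbolKind {
		return rawSymbol{}, fmt.Errorf("%w: kind %d", errCorruptRecord, s.Kind)
	}
	return s, nil
}

// isVersionErr and isCorruptErr classify decode failures.
func isVersionErr(err error) bool { return errors.Is(err, errFormatVersionUnknown) }
func isCorruptErr(err error) bool { return errors.Is(err, errCorruptRecord) }

func main() {
	s, err := decode([]byte(`{"version":2,"fqn":"pkg.Foo","kind":8}`))
	fmt.Println(s.Version, s.FQN, err) // 2 pkg.Foo <nil>
}
```

Note that the version is returned as stored (2 in the usage above), which is what makes a primitive like this useful for inspecting raw on-disk records.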
type SymbolKind ¶
type SymbolKind uint8
SymbolKind enumerates the kinds of symbols Crowl tracks. Append-only: existing constants never renumber.
const (
SymbolFolder SymbolKind = 1
SymbolFile SymbolKind = 2
SymbolPackage SymbolKind = 3
SymbolStruct SymbolKind = 4
SymbolInterface SymbolKind = 5
SymbolTypeAlias SymbolKind = 6
SymbolTypeDef SymbolKind = 7
SymbolFunction SymbolKind = 8
SymbolMethod SymbolKind = 9
SymbolField SymbolKind = 10
SymbolConst SymbolKind = 11
SymbolVar SymbolKind = 12
SymbolEnum SymbolKind = 13 // v4: enum type (TS, Java, Rust, ...)
SymbolEnumMember SymbolKind = 14 // v4: a member of an enum
SymbolNamespace SymbolKind = 15 // v4: namespace / module construct (TS, Rust mod, ...)
)
func (SymbolKind) IsValid ¶
func (k SymbolKind) IsValid() bool
IsValid reports whether k is in the valid range.
func (SymbolKind) String ¶
func (i SymbolKind) String() string
type WhyEvent ¶
type WhyEvent struct {
SourceSha string
Timestamp int64
CommitMessage string
CommitAuthor string
Effect string // "added" | "updated" | "removed"
}
WhyEvent is one entry in the chronological narrative returned by Query.Why — a single KC that touched a particular symbol. Older KCs (created before the v0.4.1 schema migration ran) lack CommitMessage / CommitAuthor; those fields are empty strings in that case.