core

package
v0.5.2
Published: Apr 29, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package core defines the L0 contracts for Crowl: types, interfaces, hashing, and format version constants. This package imports only the standard library (notably crypto/sha256 and encoding/json).

Constants

const CrowlSemver = "0.5.2"

CrowlSemver is the tool version that this binary reports in each KnowledgeCommit. Bump it on each release. It is distinct from FormatVersion.

const FormatVersion uint16 = 4

FormatVersion is the on-disk format version this binary writes.

History:

v1 (0.1.x, 0.2.x): Symbol = {version, id, fqn, kind, signature, span, bodyHash}
v2 (0.3.x):        adds Symbol.Language (string) for multi-language support.
v3 (0.4.x):        Python FQNs change to module-path form
                   ("click.core.Command.invoke" instead of
                   "src/click/core.py:Command.invoke"). Symbol.ID
                   changes for affected symbols; v2 indexes need
                   re-extraction (no in-place ID rewrite).
v4 (0.5.x):        adds SymbolEnum (13), SymbolEnumMember (14),
                   SymbolNamespace (15), and EdgeExtends (12).
                   Strictly additive: existing v3 records read as
                   v4 unchanged. TypeScript Symbol FQNs migrate to
                   module-path form (extension stripped, "."-joined);
                   v3 TS indexes require re-extraction. New tree-sitter
                   pipeline (`internal/extract/tspipeline/`) coexists
                   with the legacy go/types Go extractor; selectable
                   via `crowl analyze --extractor`.

const MinSupportedVersion uint16 = 1

MinSupportedVersion is the oldest format this binary can read (with migration).

Variables

var (
	ErrNotFound             = errors.New("core: not found")
	ErrUnknownCommit        = errors.New("core: unknown commit")
	ErrFormatVersionUnknown = errors.New("core: format version not supported by this binary")
	ErrCorruptRecord        = errors.New("core: corrupt record")
)

Sentinel errors used across the core.

var EmptyBodyHash = HashBytes(nil)

EmptyBodyHash is SHA-256("") — used for non-textual symbols (Folder, Package). FORMAT_SPEC §6.5.

var EmptyHash = Hash{}

EmptyHash is the well-known zero hash.

Functions

func CanonicalizeSignature

func CanonicalizeSignature(s string) string

CanonicalizeSignature collapses runs of ASCII whitespace into single spaces and trims leading/trailing whitespace. Used for Symbol.Signature. See FORMAT_SPEC §4.1.
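The collapsing rule can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the lowercase name canonicalizeSignature is a hypothetical stand-in, and the exact set of characters FORMAT_SPEC §4.1 treats as ASCII whitespace is assumed here to be space, tab, newline, carriage return, vertical tab, and form feed.

```go
package main

import (
	"fmt"
	"strings"
)

// isASCIISpace reports whether r is one of six ASCII whitespace characters.
// The exact set is an assumption; FORMAT_SPEC §4.1 is the authority.
func isASCIISpace(r rune) bool {
	switch r {
	case ' ', '\t', '\n', '\r', '\v', '\f':
		return true
	}
	return false
}

// canonicalizeSignature is a hypothetical stand-in for core.CanonicalizeSignature.
func canonicalizeSignature(s string) string {
	// FieldsFunc drops empty fields, so leading, trailing, and repeated
	// whitespace all disappear before the fields are re-joined.
	return strings.Join(strings.FieldsFunc(s, isASCIISpace), " ")
}

func main() {
	fmt.Printf("%q\n", canonicalizeSignature("  func  Foo(a int,\n\tb string)  error "))
	// prints "func Foo(a int, b string) error"
}
```

strings.Fields is deliberately avoided in this sketch because it also splits on non-ASCII Unicode whitespace, which the description above does not mention.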

func EncodeEdge

func EncodeEdge(e Edge) ([]byte, error)

EncodeEdge returns the canonical JSON encoding.

func EncodeLineage

func EncodeLineage(l Lineage) ([]byte, error)

EncodeLineage returns the canonical JSON encoding.

func EncodeSymbol

func EncodeSymbol(s Symbol) ([]byte, error)

EncodeSymbol returns the canonical JSON encoding of s per FORMAT_SPEC §2.2. (Single line, no whitespace around : or ,, no trailing newline.)

func NormalizeBody

func NormalizeBody(b []byte) []byte

NormalizeBody strips leading/trailing ASCII whitespace per FORMAT_SPEC §6.5 and FOUNDATION §4.2. Used as input to BodyHash for textual symbols.
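A rough equivalent of the trimming step, assuming the same six-character ASCII whitespace set (the precise set is defined by FORMAT_SPEC §6.5, not by this sketch). The name normalizeBody is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// asciiSpace is the assumed whitespace cutset.
const asciiSpace = " \t\n\r\v\f"

// normalizeBody is a hypothetical stand-in for core.NormalizeBody.
// Only the ends are trimmed; interior whitespace is left untouched.
func normalizeBody(b []byte) []byte {
	return bytes.Trim(b, asciiSpace)
}

func main() {
	src := []byte("\n\tfunc f() {\n\t\treturn\n\t}\n\n")
	fmt.Printf("%q\n", normalizeBody(src))
}
```

Because only the ends are trimmed, re-indenting the interior of a declaration would still change the resulting body hash under this sketch.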

Types

type CommitWriter

type CommitWriter interface {
	Put(ctx context.Context, r Record) error
	Remove(ctx context.Context, id Hash) error
	Finish(ctx context.Context) (KnowledgeCommit, error)
	Abort(ctx context.Context) error
}

CommitWriter accumulates record-level changes for one source commit and produces the corresponding KnowledgeCommit when Finish is called.

type Direction

type Direction uint8

Direction selects edge traversal direction.

const (
	DirOut Direction = 1 // From → To
	DirIn  Direction = 2 // To → From
)

type DriftResult

type DriftResult struct {
	Added   []Symbol
	Removed []Symbol
	Updated []Symbol
}

DriftResult is the symbol-level diff between two indexed states (typically a historical SourceSha and current HEAD). Each list is sorted by FQN for deterministic output. Used by Query.Drift and the crowl_drift MCP tool.

  • Added: symbols active at HEAD that were not active at the historical point.
  • Removed: symbols active at the historical point that are not active at HEAD.
  • Updated: symbols active at both points that gained at least one new revision in the interval, that is, a body-changing rewrite. Symbol.ID is keyed by FQN+Kind, so a change that alters the ID appears as an Added plus a Removed entry instead.
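The three lists can be illustrated with a simplified two-snapshot diff. This is an approximation under stated assumptions: the symbol type, the drift function, and the string IDs are hypothetical, and comparing body hashes at the two endpoints would miss a symbol that was changed and then reverted inside the interval, which the revision-based definition above would still report as Updated.

```go
package main

import (
	"fmt"
	"sort"
)

// symbol is a reduced stand-in for core.Symbol; string IDs replace core.Hash.
type symbol struct {
	ID       string
	FQN      string
	BodyHash string
}

type driftResult struct {
	Added, Removed, Updated []symbol
}

// drift diffs two snapshots keyed by symbol ID.
func drift(old, head map[string]symbol) driftResult {
	var r driftResult
	for id, s := range head {
		prev, ok := old[id]
		switch {
		case !ok:
			r.Added = append(r.Added, s)
		case prev.BodyHash != s.BodyHash:
			r.Updated = append(r.Updated, s)
		}
	}
	for id, s := range old {
		if _, ok := head[id]; !ok {
			r.Removed = append(r.Removed, s)
		}
	}
	// Map iteration order is random, so sort each list by FQN for
	// deterministic output. Sorting the slice sorts the shared backing array.
	for _, list := range [][]symbol{r.Added, r.Removed, r.Updated} {
		sort.Slice(list, func(i, j int) bool { return list[i].FQN < list[j].FQN })
	}
	return r
}

func main() {
	old := map[string]symbol{
		"1": {"1", "pkg.B", "h1"},
		"2": {"2", "pkg.A", "h1"},
	}
	head := map[string]symbol{
		"2": {"2", "pkg.A", "h2"},
		"4": {"4", "pkg.Z", "h1"},
	}
	r := drift(old, head)
	fmt.Println(len(r.Added), len(r.Removed), len(r.Updated)) // 1 1 1
}
```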

type Edge

type Edge struct {
	Version uint16   `json:"version"`
	ID      Hash     `json:"id"`
	From    Hash     `json:"from"`
	To      Hash     `json:"to"`
	Kind    EdgeKind `json:"kind"`
}

Edge is a directed, typed relationship.

func DecodeEdge

func DecodeEdge(b []byte) (Edge, error)

DecodeEdge parses a canonical JSON Edge blob. Edge schema has not changed v1→v2.

type EdgeKind

type EdgeKind uint8

EdgeKind enumerates typed relationships between symbols.

const (
	EdgeContains  EdgeKind = 1
	EdgeDefines   EdgeKind = 2
	EdgeHasMethod EdgeKind = 3
	EdgeHasField  EdgeKind = 4
	EdgeEmbeds    EdgeKind = 5
	EdgeCalls     EdgeKind = 6
	EdgeImports   EdgeKind = 7
	// 8 is reserved (was EdgeReferences, removed in v0.3.4-dev: defined
	// but never emitted by any extractor; query/MCP layers treated its
	// absence as "no references" which was a silent-correctness lie).
	EdgeImplements EdgeKind = 9
	EdgeReturns    EdgeKind = 10
	EdgeAccepts    EdgeKind = 11
	EdgeExtends    EdgeKind = 12 // v4: class A extends B; interface I extends J; trait sub-supertrait
)

func (EdgeKind) IsValid

func (k EdgeKind) IsValid() bool

IsValid reports whether k is in the valid range. 8 is reserved (gap).
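A range check with a reserved gap might look like the following. This sketch assumes that the reserved value 8 is reported as invalid; the description above calls it a gap but does not state the return value explicitly. The lowercase names are hypothetical.

```go
package main

import "fmt"

type edgeKind uint8

const (
	edgeContains edgeKind = 1  // lowest defined kind
	edgeReserved edgeKind = 8  // formerly EdgeReferences
	edgeExtends  edgeKind = 12 // highest defined kind as of v4
)

// isValid is a hypothetical stand-in for EdgeKind.IsValid.
// Assumption: the reserved slot is rejected along with out-of-range values.
func (k edgeKind) isValid() bool {
	return k >= edgeContains && k <= edgeExtends && k != edgeReserved
}

func main() {
	fmt.Println(edgeKind(6).isValid(), edgeKind(8).isValid(), edgeKind(13).isValid())
}
```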

func (EdgeKind) String

func (i EdgeKind) String() string

type FQN

type FQN string

FQN is a fully-qualified name. Format is per-language; the core treats it as opaque.

type Hash

type Hash [32]byte

Hash is a 32-byte SHA-256 digest. Wire form is 64-char lowercase hex.

func ComputeBodyHash

func ComputeBodyHash(kind SymbolKind, content []byte) Hash

ComputeBodyHash applies the per-kind body-hash rule:

  • Folder, Package: SHA-256("")
  • File: SHA-256(content)
  • Other (textual): SHA-256(NormalizeBody(content))

Caller passes the relevant byte slice (file content for File; declaration span bytes for textual symbols).

func ComputeEdgeID

func ComputeEdgeID(from, to Hash, kind EdgeKind) Hash

ComputeEdgeID per FORMAT_SPEC §6.2:

input := from-bytes (32) || to-bytes (32) || uint16-LE(kind)

func ComputeLineageID

func ComputeLineageID(priorID, nextID Hash, reason LineageReason) Hash

ComputeLineageID per FORMAT_SPEC §6.3:

input := priorId-bytes (32) || nextId-bytes (32) || uint16-LE(reason)

func ComputeSymbolID

func ComputeSymbolID(fqn FQN, kind SymbolKind) Hash

ComputeSymbolID per FORMAT_SPEC §6.1:

input := uint32-LE(len(fqn)) || fqn || uint16-LE(kind)
id    := SHA-256(input)

func ComputeTreeHash

func ComputeTreeHash(activeIDs []Hash) Hash

ComputeTreeHash per FORMAT_SPEC §6.4:

sort active record IDs lexicographically; concat raw bytes; SHA-256.

activeIDs is consumed (sorted in place). Pass a copy if the caller needs to retain the original order.

func HashBytes

func HashBytes(b []byte) Hash

HashBytes returns SHA-256 of b.

func ParseHash

func ParseHash(s string) (Hash, error)

ParseHash decodes a hex string into a Hash.

func (Hash) HashHex

func (h Hash) HashHex() string

HashHex returns the lowercase hex form of h (FORMAT_SPEC §3.1).

func (Hash) IsEmpty

func (h Hash) IsEmpty() bool

IsEmpty reports whether h is the zero hash.

func (Hash) MarshalText

func (h Hash) MarshalText() ([]byte, error)

MarshalText implements encoding.TextMarshaler so a Hash JSON-encodes as its hex form, matching FORMAT_SPEC §2.3.

func (*Hash) UnmarshalText

func (h *Hash) UnmarshalText(text []byte) error

UnmarshalText decodes a hex string into a Hash.

type KnowledgeCommit

type KnowledgeCommit struct {
	Version      uint16 `json:"version"`
	SourceSha    string `json:"sourceSha"`
	Parents      []Hash `json:"-"` // encoded as git commit's parents, not in meta.json
	Added        []Hash `json:"added"`
	Removed      []Hash `json:"removed"`
	Updated      []Hash `json:"updated"`
	Tree         Hash   `json:"tree"`
	Timestamp    int64  `json:"timestamp"`
	CrowlVersion string `json:"crowlVersion"`
}

KnowledgeCommit is the set of mutations derived from one source commit. Parents mirrors git's commit DAG: 0 = root, 1 = normal, 2+ = merge. See FOUNDATION §4 and ADR-001 for parent semantics.

type Lifetime

type Lifetime struct {
	AddedCommit   string
	RemovedCommit string
	Author        string
	AuthorTime    int64
}

Lifetime is derived state: when a record was added/removed and by whom. NOT stored in blobs; computed from tree presence across KCs. See FORMAT_SPEC §4.4.

type Lineage

type Lineage struct {
	Version uint16        `json:"version"`
	ID      Hash          `json:"id"`
	PriorID Hash          `json:"priorId"`
	NextID  Hash          `json:"nextId"`
	Reason  LineageReason `json:"reason"`
	Manual  bool          `json:"manual"`
}

Lineage links a prior symbol identity to a successor identity.

func DecodeLineage

func DecodeLineage(b []byte) (Lineage, error)

DecodeLineage parses a canonical JSON Lineage blob. Lineage schema has not changed v1→v2.

type LineageReason

type LineageReason uint8

LineageReason describes why two symbols are linked. v0.1 has only Renamed.

const (
	LineageRenamed LineageReason = 1
)

func (LineageReason) IsValid

func (r LineageReason) IsValid() bool

IsValid reports whether r is in the v1 valid range.

type Record

type Record struct {
	Kind     RecordKind
	Payload  []byte
	Lifetime Lifetime
}

Record is the atomic unit of the graph: a payload + lifetime.

func EdgeToRecord

func EdgeToRecord(e Edge) (Record, error)

EdgeToRecord wraps e in a Record.

func LineageToRecord

func LineageToRecord(l Lineage) (Record, error)

LineageToRecord wraps l in a Record.

func SymbolToRecord

func SymbolToRecord(s Symbol) (Record, error)

SymbolToRecord wraps s in a Record with an empty Lifetime. Lifetime is computed/managed by the storage layer, not here.

type RecordKind

type RecordKind uint8

RecordKind tags the payload of a Record.

const (
	RecordSymbol  RecordKind = 1
	RecordEdge    RecordKind = 2
	RecordLineage RecordKind = 3
)

type Span

type Span struct {
	File      string `json:"file"`
	StartByte uint32 `json:"startByte"`
	EndByte   uint32 `json:"endByte"`
	StartLine uint32 `json:"startLine"`
	EndLine   uint32 `json:"endLine"`
}

Span is a byte range within a source file.

type Store

type Store interface {
	GetRecord(ctx context.Context, id Hash) (Record, error)
	GetKnowledgeCommit(ctx context.Context, sourceSha string) (KnowledgeCommit, error)
	Walk(ctx context.Context, from Hash, kind EdgeKind, dir Direction) iter.Seq2[Edge, error]
	AsOf(commit string) Store
	BeginCommit(ctx context.Context, sourceSha string) (CommitWriter, error)
}

Store is the L0 storage interface (FOUNDATION §5).

Reads return state at HEAD by default; AsOf returns a time-travel view. Writes are scoped to a commit being built — see CommitWriter.

Error semantics:

  • GetRecord returns ErrNotFound if no revision is active at HEAD.
  • GetKnowledgeCommit returns ErrNotFound if no KC exists for sourceSha.
  • AsOf is lazy; if commit is unknown, ErrUnknownCommit surfaces on first read.

type Symbol

type Symbol struct {
	Version   uint16     `json:"version"`
	ID        Hash       `json:"id"`
	FQN       FQN        `json:"fqn"`
	Kind      SymbolKind `json:"kind"`
	Signature string     `json:"signature"`
	Span      Span       `json:"span"`
	BodyHash  Hash       `json:"bodyHash"`
	Language  string     `json:"language,omitempty"` // v2+; "go", "typescript", "python"
}

Symbol is a named, addressable code element. Version is the FIRST field per FORMAT_SPEC §4.1 (file-format convention: version header first).

In v2, Language is added as the LAST field (preserving v1 reader compat at the JSON layer — v1 readers ignore unknown fields). Empty Language is the v1 default and indicates "go" by convention. The migration function sets Language="go" on every existing v1 record at read time.

func DecodeSymbol

func DecodeSymbol(b []byte) (Symbol, error)

DecodeSymbol parses a canonical JSON Symbol blob and validates version range + kind. Does NOT migrate — that is L1 work; see internal/migrate/. Production callers should use migrate.DecodeAndMigrateSymbol; this primitive is exported for callers that need to inspect raw on-disk version (e.g., debugging, format introspection).

L0 boundary note: core/ imports only stdlib, so the migration step lives one layer up (see internal/migrate).

type SymbolKind

type SymbolKind uint8

SymbolKind enumerates the kinds of symbols Crowl tracks. Append-only: existing constants never renumber.

const (
	SymbolFolder     SymbolKind = 1
	SymbolFile       SymbolKind = 2
	SymbolPackage    SymbolKind = 3
	SymbolStruct     SymbolKind = 4
	SymbolInterface  SymbolKind = 5
	SymbolTypeAlias  SymbolKind = 6
	SymbolTypeDef    SymbolKind = 7
	SymbolFunction   SymbolKind = 8
	SymbolMethod     SymbolKind = 9
	SymbolField      SymbolKind = 10
	SymbolConst      SymbolKind = 11
	SymbolVar        SymbolKind = 12
	SymbolEnum       SymbolKind = 13 // v4: enum type (TS, Java, Rust, ...)
	SymbolEnumMember SymbolKind = 14 // v4: a member of an enum
	SymbolNamespace  SymbolKind = 15 // v4: namespace / module construct (TS, Rust mod, ...)
)

func (SymbolKind) IsValid

func (k SymbolKind) IsValid() bool

IsValid reports whether k is in the valid range.

func (SymbolKind) String

func (i SymbolKind) String() string

type WhyEvent

type WhyEvent struct {
	SourceSha     string
	Timestamp     int64
	CommitMessage string
	CommitAuthor  string
	Effect        string // "added" | "updated" | "removed"
}

WhyEvent is one entry in the chronological narrative returned by Query.Why — a single KC that touched a particular symbol. Older KCs (created before the v0.4.1 schema migration ran) lack CommitMessage / CommitAuthor; those fields are empty strings in that case.
