Documentation ¶
Overview ¶
Package index provides the SQLite-backed derived index for VaultMind.
Index ¶
- Constants
- func ComputeAliasMentions(db *DB, minAliasLen int) (int, error)
- func ComputeTagOverlap(db *DB, threshold float64) (int, error)
- func CountFTS(d *DB, query string, filters ...SearchFilters) (int, error)
- func DecodeColBERTEmbedding(data []byte, _ int) ([][]float32, error)
- func DecodeEmbedding(data []byte) ([]float32, error)
- func DecodeSparseEmbedding(data []byte) (map[int32]float32, error)
- func DeleteNoteByPath(d *DB, path string) error
- func DetectEmbeddingDims(d *DB) (int, error)
- func EncodeColBERTEmbedding(colbert [][]float32) []byte
- func EncodeEmbedding(vec []float32) []byte
- func EncodeSparseEmbedding(sparse map[int32]float32) []byte
- func HasColBERTEmbeddings(d *DB) (bool, error)
- func HasEmbeddings(d *DB) (bool, error)
- func HasSparseEmbeddings(d *DB) (bool, error)
- func LoadEmbedding(d *DB, noteID string) ([]float32, error)
- func RecordNoteAccess(d *DB, noteID string) error
- func RecordNoteAccessAs(d *DB, noteID, caller string) error
- func ResolveLinks(db *DB) (int, error)
- func StoreColBERTEmbedding(d *DB, noteID string, colbert [][]float32) error
- func StoreEmbedding(d *DB, noteID string, vec []float32) error
- func StoreNote(d *DB, rec NoteRecord) error
- func StoreNoteInTx(tx *sql.Tx, rec NoteRecord) error
- func StoreSparseEmbedding(d *DB, noteID string, sparse map[int32]float32) error
- func StripForAliasMatch(body string) string
- type BlockRecord
- type BlockRow
- type DB
- func (d *DB) AllNoteTitles() ([]NoteTitle, error)
- func (d *DB) Begin() (*sql.Tx, error)
- func (d *DB) Close() error
- func (d *DB) Exec(query string, args ...interface{}) (sql.Result, error)
- func (d *DB) NoteHashes() (map[string]NoteHashInfo, error)
- func (d *DB) Query(query string, args ...interface{}) (*sql.Rows, error)
- func (d *DB) QueryFullNote(id string) (*FullNote, error)
- func (d *DB) QueryNoteByID(id string) (*NoteRow, error)
- func (d *DB) QueryNoteByPath(path string) (*NoteRow, error)
- func (d *DB) QueryNotesByAlias(alias string, normalized bool) ([]NoteRow, error)
- func (d *DB) QueryNotesByNormalized(normalized string) ([]NoteRow, error)
- func (d *DB) QueryNotesByTitle(title string, caseInsensitive bool) ([]NoteRow, error)
- func (d *DB) QueryRow(query string, args ...interface{}) *sql.Row
- func (d *DB) UpdateMTime(path string, mtime int64) error
- type EmbedResult
- type EmbeddingDimsCount
- type FTSResult
- type FullNote
- type HeadingRecord
- type HeadingRow
- type IndexAndEmbedResult
- type IndexError
- type IndexResult
- type Indexer
- func (idx *Indexer) EmbedNotes(ctx context.Context, dbPath string, embedder embedding.Embedder) (*EmbedResult, error)
- func (idx *Indexer) Incremental() (*IndexResult, error)
- func (idx *Indexer) IndexFile(relPath string) error
- func (idx *Indexer) Rebuild() (*IndexResult, error)
- func (idx *Indexer) RunEmbed(ctx context.Context, dbPath, model string) (*EmbedResult, error)
- type LinkRecord
- type NoteAccessStats
- func ListAccessedNotes(d *DB) ([]NoteAccessStats, error)
- func ListAccessedNotesByCaller(d *DB, caller string) ([]NoteAccessStats, error)
- func ListAccessedNotesExcludingCaller(d *DB, excludedCaller string) ([]NoteAccessStats, error)
- func ListAccessedNotesExcludingCallers(d *DB, excludedCallers []string) ([]NoteAccessStats, error)
- func LookupNoteAccess(d *DB, noteID string) (NoteAccessStats, error)
- type NoteColBERTEmbedding
- type NoteEmbedding
- type NoteHashInfo
- type NoteRecord
- type NoteRow
- type NoteSparseEmbedding
- type NoteTitle
- type PostIndexWarning
- type SearchFilters
Constants ¶
const (
	CallerAgent         = "agent"
	CallerAgentNeighbor = "agent-neighbor"
	CallerHook          = "hook"
)
Caller* constants name the provenance of an access event. The string values land in the note_accesses.caller column and let `vaultmind self` filter "what I engaged with" from "what the harness pre-loaded."
CallerAgent is the default for explicit agent reads (note get, the target of an Ask). CallerAgentNeighbor is set when an Ask's context-pack pulls a neighbor in alongside the target — still real engagement, but lower-intent than a direct read. CallerHook is set when a Claude Code hook (SessionStart persona load, UserPromptSubmit pointer fanout, etc.) fires the access; these accesses populate the activation log but `self` filters them out by default so its hot list reflects deliberate engagement rather than ambient harness traffic.
Set via the VAULTMIND_CALLER env var or passed explicitly to RecordNoteAccessAs. RecordNoteAccess (no caller arg) reads the env and falls back to CallerAgent.
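The fallback rule described above can be sketched as follows. This is a minimal illustration, not the package's code: resolveCaller and its lookup parameter are hypothetical names, and treating an empty VAULTMIND_CALLER the same as an unset one is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// Values copied from the Caller* constants documented above.
const (
	CallerAgent         = "agent"
	CallerAgentNeighbor = "agent-neighbor"
	CallerHook          = "hook"
)

// resolveCaller is a hypothetical helper. It returns the value of
// VAULTMIND_CALLER as reported by lookup, or CallerAgent when the
// variable is unset or empty (the empty-string case is an assumption).
func resolveCaller(lookup func(string) string) string {
	if c := lookup("VAULTMIND_CALLER"); c != "" {
		return c
	}
	return CallerAgent
}

func main() {
	// Real environment: the result depends on how the process was started.
	fmt.Println("from env:", resolveCaller(os.Getenv))

	// Simulated hook environment.
	hookEnv := map[string]string{"VAULTMIND_CALLER": CallerHook}
	fmt.Println("simulated hook:", resolveCaller(func(k string) string { return hookEnv[k] }))
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the rule testable without mutating the process environment.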
Functions ¶
func ComputeAliasMentions ¶
ComputeAliasMentions scans every note body for occurrences of aliases and domain-note titles, then writes alias_mention edges into the links table. It returns the number of new edges inserted. Aliases shorter than minAliasLen characters are skipped. Calling this function clears any previous alias_mention edges before computing fresh ones.
func ComputeTagOverlap ¶
ComputeTagOverlap scans the tags table for notes sharing common tags and writes tag_overlap edges into the links table weighted by TF-IDF-style tag specificity. Only pairs whose combined score meets threshold are inserted. Calling this function clears any previous tag_overlap edges before computing fresh ones.
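The documentation does not give the scoring formula, so the following is only one plausible reading of "TF-IDF-style tag specificity": each shared tag contributes ln(N/df), where N is the number of notes and df is how many notes carry the tag, and a pair is kept when the sum meets the threshold. The formula, the in-memory representation, and the names tagOverlap and overlapEdge are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// overlapEdge is a hypothetical stand-in for a tag_overlap row.
type overlapEdge struct {
	A, B  string
	Score float64
}

// tagOverlap scores every pair of notes by the summed specificity of
// their shared tags. Specificity here is ln(N/df), an assumed formula.
func tagOverlap(noteTags map[string][]string, threshold float64) []overlapEdge {
	// Deduplicate tags per note and count how many notes carry each tag.
	sets := make(map[string]map[string]bool, len(noteTags))
	df := map[string]int{}
	ids := make([]string, 0, len(noteTags))
	for id, tags := range noteTags {
		ids = append(ids, id)
		set := map[string]bool{}
		for _, t := range tags {
			if !set[t] {
				set[t] = true
				df[t]++
			}
		}
		sets[id] = set
	}
	sort.Strings(ids) // deterministic pair order

	n := float64(len(ids))
	var edges []overlapEdge
	for i := 0; i < len(ids); i++ {
		for j := i + 1; j < len(ids); j++ {
			score, shared := 0.0, 0
			for t := range sets[ids[i]] {
				if sets[ids[j]][t] {
					shared++
					score += math.Log(n / float64(df[t]))
				}
			}
			// Keep only pairs that share a tag and meet the threshold.
			if shared > 0 && score >= threshold {
				edges = append(edges, overlapEdge{ids[i], ids[j], score})
			}
		}
	}
	return edges
}

func main() {
	notes := map[string][]string{
		"a": {"go", "rare"},
		"b": {"go", "rare"},
		"c": {"go"},
	}
	for _, e := range tagOverlap(notes, 0.1) {
		fmt.Printf("%s <-> %s score=%.3f\n", e.A, e.B, e.Score)
	}
}
```

In this toy vault the tag "go" appears on every note and so contributes nothing, while "rare" links only a and b. That is the behavior a specificity weighting is meant to produce: ubiquitous tags do not create edges.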
func CountFTS ¶
func CountFTS(d *DB, query string, filters ...SearchFilters) (int, error)
CountFTS returns the total number of documents matching the query and filters, independent of any limit/offset. Used for pagination totals.
func DecodeColBERTEmbedding ¶
DecodeColBERTEmbedding deserializes a ColBERT BLOB back to a per-token matrix. The dims are read from the 4-byte header; the dims parameter is ignored (kept for API compat).
func DecodeEmbedding ¶
DecodeEmbedding deserializes raw little-endian bytes back to a float32 slice.
func DecodeSparseEmbedding ¶
DecodeSparseEmbedding deserializes packed (int32, float32) pairs back to a sparse map. Returns an empty map (not nil) when data is empty.
func DeleteNoteByPath ¶
DeleteNoteByPath removes a note and all its dependent rows from every table within a single transaction. It is used by the incremental indexer to clean up notes whose source files no longer exist on disk.
func DetectEmbeddingDims ¶
DetectEmbeddingDims returns the dimensionality of stored embeddings, or 0 if none exist. Uses a single-row query — does not load all embeddings.
In mixed-state vaults (some notes embedded with MiniLM, others with BGE-M3 after a model upgrade) this returns whichever row SQLite happens to scan first, so the result is not authoritative for "what model is this vault using." Use DetectEmbeddingDimsCounts when you need the full picture.
func EncodeColBERTEmbedding ¶
EncodeColBERTEmbedding serializes a per-token embedding matrix with a 4-byte dims header. Format: [uint32 dims][float32 data...] where data is tokens*dims floats.
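A minimal sketch of the layout described above, [uint32 dims][float32 data...], using only the standard library. The header's byte order is not stated here; little-endian is assumed to match the dense format. The names encodeColBERT and decodeColBERT and the error cases are illustrative, not the package's implementation.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeColBERT writes a 4-byte dims header followed by tokens*dims
// float32 values. It assumes every row has the same length as row 0.
func encodeColBERT(m [][]float32) []byte {
	dims := 0
	if len(m) > 0 {
		dims = len(m[0])
	}
	buf := make([]byte, 4, 4+4*dims*len(m))
	binary.LittleEndian.PutUint32(buf, uint32(dims))
	for _, row := range m {
		for _, v := range row {
			buf = binary.LittleEndian.AppendUint32(buf, math.Float32bits(v))
		}
	}
	return buf
}

// decodeColBERT reads dims from the header and splits the payload
// into rows of that width.
func decodeColBERT(data []byte) ([][]float32, error) {
	if len(data) < 4 {
		return nil, errors.New("colbert blob: missing dims header")
	}
	dims := int(binary.LittleEndian.Uint32(data))
	payload := data[4:]
	if dims == 0 {
		if len(payload) != 0 {
			return nil, errors.New("colbert blob: payload with zero dims")
		}
		return [][]float32{}, nil
	}
	if len(payload)%(4*dims) != 0 {
		return nil, errors.New("colbert blob: payload is not a whole number of rows")
	}
	out := make([][]float32, len(payload)/(4*dims))
	for t := range out {
		row := make([]float32, dims)
		for j := range row {
			off := 4 * (t*dims + j)
			row[j] = math.Float32frombits(binary.LittleEndian.Uint32(payload[off:]))
		}
		out[t] = row
	}
	return out, nil
}

func main() {
	blob := encodeColBERT([][]float32{{1, 2, 3}, {4, 5, 6}})
	m, err := decodeColBERT(blob)
	fmt.Println(len(blob), m, err)
}
```

Because the row width travels in the header, a decoder of this shape needs no external dims argument, which is consistent with DecodeColBERTEmbedding ignoring its second parameter.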
func EncodeEmbedding ¶
EncodeEmbedding serializes a float32 slice to raw little-endian bytes for BLOB storage.
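The dense format is simple enough to sketch directly: four little-endian bytes per float32, with no header. The helper names below and the length check on decode are assumptions. The real functions may validate differently.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeDense lays out each float32 as 4 little-endian bytes.
func encodeDense(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(v))
	}
	return buf
}

// decodeDense reverses encodeDense. Rejecting lengths that are not a
// multiple of 4 is an assumed validation step.
func decodeDense(data []byte) ([]float32, error) {
	if len(data)%4 != 0 {
		return nil, errors.New("embedding blob length is not a multiple of 4")
	}
	vec := make([]float32, len(data)/4)
	for i := range vec {
		vec[i] = math.Float32frombits(binary.LittleEndian.Uint32(data[4*i:]))
	}
	return vec, nil
}

func main() {
	blob := encodeDense([]float32{0.5, -1.25, 3})
	vec, err := decodeDense(blob)
	fmt.Println(len(blob), vec, err)
}
```

With this layout a vector's dimensionality is always len(blob)/4, so dimensionality can be recovered from a stored BLOB without decoding it.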
func EncodeSparseEmbedding ¶
EncodeSparseEmbedding serializes a sparse vector as packed (int32 token_id, float32 weight) pairs.
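A sketch of the packed-pair layout, assuming 8 bytes per entry (int32 token ID, then float32 weight) in little-endian order. The byte order and the sorted output order are assumptions, and the helper names are hypothetical. Sorting is used here only so that the same map always yields the same bytes.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
	"sort"
)

// encodeSparse writes one (token_id, weight) pair per map entry,
// ordered by token ID for deterministic output.
func encodeSparse(sparse map[int32]float32) []byte {
	ids := make([]int32, 0, len(sparse))
	for id := range sparse {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })

	buf := make([]byte, 0, 8*len(ids))
	for _, id := range ids {
		buf = binary.LittleEndian.AppendUint32(buf, uint32(id))
		buf = binary.LittleEndian.AppendUint32(buf, math.Float32bits(sparse[id]))
	}
	return buf
}

// decodeSparse returns an empty, non-nil map for empty input, matching
// the behavior documented for DecodeSparseEmbedding.
func decodeSparse(data []byte) (map[int32]float32, error) {
	if len(data)%8 != 0 {
		return nil, errors.New("sparse blob length is not a multiple of 8")
	}
	out := make(map[int32]float32, len(data)/8)
	for off := 0; off < len(data); off += 8 {
		id := int32(binary.LittleEndian.Uint32(data[off:]))
		out[id] = math.Float32frombits(binary.LittleEndian.Uint32(data[off+4:]))
	}
	return out, nil
}

func main() {
	blob := encodeSparse(map[int32]float32{42: 2, 7: 0.5})
	m, err := decodeSparse(blob)
	fmt.Println(len(blob), m, err)
}
```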
func HasColBERTEmbeddings ¶
HasColBERTEmbeddings returns true if any note has a stored ColBERT embedding.
func HasEmbeddings ¶
HasEmbeddings returns true if any note in the index has a stored embedding.
func HasSparseEmbeddings ¶
HasSparseEmbeddings returns true if any note has a stored sparse embedding.
func LoadEmbedding ¶
LoadEmbedding reads the embedding for a single note. Returns nil, nil if no embedding stored.
func RecordNoteAccess ¶
RecordNoteAccess records a note access with the default caller (read from VAULTMIND_CALLER env var, falling back to CallerAgent). Backwards compatible with pre-2026-05-01 callers: the call signature is unchanged, so existing call sites don't have to be rewritten.
Use RecordNoteAccessAs when the caller is known structurally (e.g. query.Ask passes CallerAgent for the target and CallerAgentNeighbor for context-pack neighbors). Use RecordNoteAccess when the caller is determined by the runtime context (e.g. a shell hook setting VAULTMIND_CALLER).
func RecordNoteAccessAs ¶
RecordNoteAccessAs records a note access with an explicit caller label. Two side effects:
- Inserts into note_accesses with (note_id, caller, accessed_at) — the per-event log that `self` and future ACT-R retrieval scoring read from.
- Updates the scalar (notes.access_count, notes.last_accessed_at) — kept for backward compatibility and fast lookup on hot paths.
Best-effort: each per-note tracking miss is the caller's responsibility to log at debug; never fail the user query over optional bookkeeping.
func ResolveLinks ¶
ResolveLinks updates unresolved links by matching dst_raw against note IDs, titles, and aliases. Sets dst_note_id and resolved=TRUE for matches.
func StoreColBERTEmbedding ¶
StoreColBERTEmbedding writes a ColBERT embedding BLOB for a note.
func StoreEmbedding ¶
StoreEmbedding writes an embedding BLOB for a note that already exists in the index.
func StoreNote ¶
func StoreNote(d *DB, rec NoteRecord) error
StoreNote stores a note within its own transaction: it deletes all existing rows for the note, then inserts fresh rows into every table (delete-before-reinsert).
func StoreNoteInTx ¶
func StoreNoteInTx(tx *sql.Tx, rec NoteRecord) error
StoreNoteInTx stores a note within an existing transaction. Used by Rebuild for batch transactions.
func StoreSparseEmbedding ¶
StoreSparseEmbedding writes a sparse embedding BLOB for a note.
func StripForAliasMatch ¶
StripForAliasMatch removes markup that should be excluded from alias detection: code fences, inline code, wikilinks (keeping aliased display text), and HTML comments.
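The four kinds of markup listed above can be approximated with regular expressions. This is a rough sketch rather than the package's implementation: the patterns, the choice to delete matches outright instead of replacing them with whitespace, and the name stripForAliasMatch are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is three backticks, built at runtime so this example can
// itself live inside a fenced code block.
var fence = strings.Repeat("`", 3)

var (
	// (?s) lets "." span newlines so multi-line blocks are matched.
	codeFenceRe   = regexp.MustCompile("(?s)" + fence + ".*?" + fence)
	inlineCodeRe  = regexp.MustCompile("`[^`\n]*`")
	htmlCommentRe = regexp.MustCompile(`(?s)<!--.*?-->`)
	// Group 1 captures the display text of [[Target|Display]] links.
	wikilinkRe = regexp.MustCompile(`\[\[[^\]|]*(?:\|([^\]]*))?\]\]`)
)

// stripForAliasMatch removes code fences, inline code, and HTML
// comments, and reduces wikilinks to their aliased display text.
// A wikilink without an alias is removed entirely.
func stripForAliasMatch(body string) string {
	body = codeFenceRe.ReplaceAllString(body, "")
	body = inlineCodeRe.ReplaceAllString(body, "")
	body = htmlCommentRe.ReplaceAllString(body, "")
	return wikilinkRe.ReplaceAllString(body, "${1}")
}

func main() {
	body := "See [[Spreading Activation|activation]] in " +
		"`code` <!-- draft --> and [[Other Note]]."
	fmt.Printf("%q\n", stripForAliasMatch(body))
}
```

Fences are stripped before inline code so that backticks inside a fenced block are never misread as inline spans.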
Types ¶
type BlockRecord ¶
BlockRecord represents a block ID anchor for storage.
type BlockRow ¶
type BlockRow struct {
BlockID string `json:"block_id"`
Heading string `json:"heading,omitempty"`
Line int `json:"line"`
}
BlockRow represents a block ID in query results.
type DB ¶
type DB struct {
// contains filtered or unexported fields
}
DB wraps *sql.DB with schema initialization and VaultMind-specific helpers.
func Open ¶
Open opens (or creates) a SQLite database at dbPath, creates the parent directory if needed, applies the full VaultMind schema, and configures pragmas (WAL mode, foreign key enforcement).
func (*DB) AllNoteTitles ¶
AllNoteTitles returns every note's ID and title from the index. Titles that are empty in the notes table (shouldn't happen post-index but guard anyway) are returned as-is — callers filter if they care.
func (*DB) NoteHashes ¶
func (d *DB) NoteHashes() (map[string]NoteHashInfo, error)
NoteHashes returns a map of note path → NoteHashInfo for all notes in the database. Used by the incremental indexer to detect changed and deleted notes.
func (*DB) QueryFullNote ¶
QueryFullNote returns complete note data including body, headings, blocks, aliases, tags. Uses GROUP_CONCAT subqueries to fold aliases and tags into the main note query, reducing the number of DB round-trips from 6 to 4.
func (*DB) QueryNoteByID ¶
QueryNoteByID returns the note with the given ID, or nil if not found.
func (*DB) QueryNoteByPath ¶
QueryNoteByPath returns the note at the given vault-relative path, or nil.
func (*DB) QueryNotesByAlias ¶
QueryNotesByAlias returns notes whose aliases match the given string. If normalized is true, compares against alias_normalized (lowercase, whitespace-collapsed).
func (*DB) QueryNotesByNormalized ¶
QueryNotesByNormalized searches for notes whose title or alias, when hyphens and underscores are replaced with spaces and lowercased, matches the given normalized input.
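The two normalization rules mentioned in this section (alias_normalized for QueryNotesByAlias, and the hyphen/underscore folding used here) can be illustrated in plain Go. The package may well perform them in SQL. The helper names below are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKey mirrors the rule described for QueryNotesByNormalized:
// hyphens and underscores become spaces, then the result is lowercased.
func normalizeKey(s string) string {
	s = strings.NewReplacer("-", " ", "_", " ").Replace(s)
	return strings.ToLower(s)
}

// normalizeAlias mirrors the alias_normalized description:
// lowercase with runs of whitespace collapsed to single spaces.
func normalizeAlias(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

func main() {
	fmt.Printf("%q\n", normalizeKey("Spreading_Activation-Model"))
	fmt.Printf("%q\n", normalizeAlias("  ACT-R   Theory "))
}
```

Note the difference: normalizeAlias keeps hyphens, so "ACT-R" stays hyphenated there but would become "act r" under normalizeKey.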
func (*DB) QueryNotesByTitle ¶
QueryNotesByTitle returns notes matching the given title. If caseInsensitive is true, uses LOWER() comparison.
type EmbedResult ¶
type EmbedResult struct {
Embedded int `json:"embedded"`
Skipped int `json:"skipped"`
Errors int `json:"errors"`
EmptyOutput int `json:"empty_output,omitempty"`
Model string `json:"model,omitempty"`
}
EmbedResult holds the outcome of an embedding pass.
EmptyOutput counts notes whose embedder returned without error but with empty Sparse and/or ColBERT outputs — the heads produced no usable tokens. These notes are NOT counted as Embedded (their sparse_embedding / colbert_embedding columns would be NULL); they remain pending for the next run. See vaultmind#22 for the silent-failure pattern this surfaces.
type EmbeddingDimsCount ¶
EmbeddingDimsCount is one (dimensions, count) pair from a vault.
func DetectEmbeddingDimsCounts ¶
func DetectEmbeddingDimsCounts(d *DB) ([]EmbeddingDimsCount, error)
DetectEmbeddingDimsCounts returns the count of notes per dense-embedding dimensionality. A consistent vault has exactly one entry; a mixed-state vault (mid-upgrade from MiniLM to BGE-M3, or partial-rebuild) returns multiple. Used by `doctor` to surface mixed state explicitly instead of claiming a single model name. See vaultmind#22 dig.
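To illustrate what a mixed-state result looks like, the sketch below groups raw dense BLOBs by dimensionality, assuming the headerless float32 layout described for EncodeEmbedding (4 bytes per dimension). The in-memory grouping, the dimsCount type, and the 384 and 1024 widths in the example are illustrative only. The real function presumably queries the database.

```go
package main

import (
	"fmt"
	"sort"
)

// dimsCount is a hypothetical stand-in for EmbeddingDimsCount.
type dimsCount struct {
	Dims  int
	Count int
}

// dimsCounts groups blobs by len(blob)/4 and returns the groups in
// ascending order of dimensionality. Empty blobs (no embedding) are skipped.
func dimsCounts(blobs [][]byte) []dimsCount {
	byDims := map[int]int{}
	for _, b := range blobs {
		if len(b) == 0 {
			continue
		}
		byDims[len(b)/4]++
	}
	out := make([]dimsCount, 0, len(byDims))
	for d, c := range byDims {
		out = append(out, dimsCount{Dims: d, Count: c})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Dims < out[j].Dims })
	return out
}

func main() {
	blobs := [][]byte{
		make([]byte, 4*384),  // e.g. a smaller model's vectors
		make([]byte, 4*384),
		make([]byte, 4*1024), // e.g. vectors written after an upgrade
		nil,                  // note without an embedding
	}
	counts := dimsCounts(blobs)
	fmt.Println(counts, "mixed:", len(counts) > 1)
}
```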
type FTSResult ¶
type FTSResult struct {
ID string `json:"id"`
Type string `json:"type"`
Title string `json:"title"`
Path string `json:"path"`
Snippet string `json:"snippet"`
Score float64 `json:"score"`
IsDomain bool `json:"is_domain_note"`
}
FTSResult represents a single full-text search hit per SRS-09.
type FullNote ¶
type FullNote struct {
ID string `json:"id"`
Type string `json:"type"`
Path string `json:"path"`
Title string `json:"title"`
Frontmatter map[string]interface{} `json:"frontmatter"`
Body string `json:"body,omitempty"`
Headings []HeadingRow `json:"headings,omitempty"`
Blocks []BlockRow `json:"blocks,omitempty"`
IsDomain bool `json:"is_domain_note"`
Aliases []string `json:"-"`
Tags []string `json:"-"`
}
FullNote contains all data for a single note.
type HeadingRecord ¶
HeadingRecord represents a heading for storage.
type HeadingRow ¶
type HeadingRow struct {
Level int `json:"level"`
Title string `json:"title"`
Slug string `json:"slug"`
}
HeadingRow represents a heading in query results.
type IndexAndEmbedResult ¶
type IndexAndEmbedResult struct {
Index *IndexResult `json:"index"`
Embed *EmbedResult `json:"embed,omitempty"`
}
IndexAndEmbedResult combines index and optional embed results for command output.
type IndexError ¶
type IndexError struct {
Path string `json:"path"`
Kind string `json:"kind"` // "read" | "parse" | "store" | "delete"
Error string `json:"error"`
}
IndexError names a specific per-file failure during Rebuild or Incremental. The counter in IndexResult.Errors tells you *how many* files failed; ErrorDetails tells you WHICH files and WHY — without it, a partial-index failure is an unactionable number (manifesto #3).
type IndexResult ¶
type IndexResult struct {
DBPath string `json:"db_path"`
Indexed int `json:"indexed"`
DomainNotes int `json:"domain_notes"`
UnstructuredNotes int `json:"unstructured_notes"`
Errors int `json:"errors"`
Skipped int `json:"skipped"`
DuplicateIDs int `json:"duplicate_ids"`
Added int `json:"added"`
Updated int `json:"updated"`
Deleted int `json:"deleted"`
FullRebuild bool `json:"full_rebuild"`
DurationMs int64 `json:"duration_ms"`
CompletedAt string `json:"completed_at"`
ErrorDetails []IndexError `json:"error_details,omitempty"`
PostIndexWarnings []PostIndexWarning `json:"post_index_warnings,omitempty"`
}
IndexResult holds the outcome of an index run, whether a full Rebuild or an Incremental pass.
type Indexer ¶
type Indexer struct {
// contains filtered or unexported fields
}
Indexer orchestrates vault scanning, parsing, and SQLite storage.
func NewIndexer ¶
NewIndexer creates an Indexer for the given vault.
func (*Indexer) EmbedNotes ¶
func (idx *Indexer) EmbedNotes(ctx context.Context, dbPath string, embedder embedding.Embedder) (*EmbedResult, error)
EmbedNotes computes and stores embeddings for all notes that don't have one yet. It opens its own DB connection (like Rebuild/Incremental) so it can be called after the indexer has closed its connection.
func (*Indexer) Incremental ¶
func (idx *Indexer) Incremental() (*IndexResult, error)
Incremental scans the vault and only indexes files that are new or changed (detected via content hash). Deleted files are removed from the index.
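The change detection can be pictured as a diff between two path-to-hash maps: what the index recorded (see NoteHashes) and what is on disk now. The sketch below is an assumption-laden illustration: SHA-256 is a stand-in for whatever hash the indexer uses, and diffHashes, hashContent, and changeSet are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashContent returns a hex digest of a file's bytes. SHA-256 is an
// assumption; the documentation only says "content hash".
func hashContent(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// changeSet lists vault-relative paths by what happened to them.
type changeSet struct {
	Added, Updated, Deleted []string
}

// diffHashes compares the indexed hashes against the on-disk hashes.
func diffHashes(indexed, onDisk map[string]string) changeSet {
	var c changeSet
	for path, h := range onDisk {
		old, known := indexed[path]
		switch {
		case !known:
			c.Added = append(c.Added, path)
		case old != h:
			c.Updated = append(c.Updated, path)
		}
	}
	for path := range indexed {
		if _, exists := onDisk[path]; !exists {
			c.Deleted = append(c.Deleted, path)
		}
	}
	sort.Strings(c.Added)
	sort.Strings(c.Updated)
	sort.Strings(c.Deleted)
	return c
}

func main() {
	indexed := map[string]string{
		"a.md": hashContent([]byte("alpha")),
		"b.md": hashContent([]byte("beta")),
		"c.md": hashContent([]byte("gamma")),
	}
	onDisk := map[string]string{
		"a.md": hashContent([]byte("alpha")),        // unchanged
		"b.md": hashContent([]byte("beta, edited")), // changed
		"d.md": hashContent([]byte("delta")),        // new
	}
	fmt.Printf("%+v\n", diffHashes(indexed, onDisk))
}
```

Unchanged paths appear in none of the three lists, which is what allows an incremental pass to skip them.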
func (*Indexer) Rebuild ¶
func (idx *Indexer) Rebuild() (*IndexResult, error)
Rebuild performs a full rebuild: scan all .md files, parse, and store.
func (*Indexer) RunEmbed ¶
RunEmbed runs an embedding pass against the index DB. The embedder is constructed lazily — only when there's pending work to do.
Why lazy: the BGE-M3 model is ~2.2GB on disk and CGO+ORT session creation pegs a CPU core for ~1s every time it's invoked. Running `vaultmind index --embed --model bge-m3` against a fully-embedded vault (which happens whenever the user re-runs after editing zero notes — hooks, scripts, retries, doctor checks) used to pay that load cost unconditionally. Heat without work. Counting pending notes first lets us skip the model load entirely when there's nothing to do.
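The lazy-construction pattern described above can be sketched independently of any real model: count the pending work first, and call the expensive constructor only when that count is non-zero. The embedder interface, fakeEmbedder, and runEmbed below are hypothetical stand-ins, not the package's types.

```go
package main

import "fmt"

// embedder is a hypothetical minimal interface for this sketch.
type embedder interface {
	Embed(text string) []float32
}

// fakeEmbedder stands in for a model that is expensive to load.
type fakeEmbedder struct{}

func (fakeEmbedder) Embed(text string) []float32 {
	return []float32{float32(len(text))}
}

// runEmbed constructs the embedder only when there is pending work.
// It returns the number of texts embedded.
func runEmbed(pending []string, newEmbedder func() (embedder, error)) (int, error) {
	if len(pending) == 0 {
		// Nothing to do: skip the model load entirely.
		return 0, nil
	}
	e, err := newEmbedder()
	if err != nil {
		return 0, err
	}
	embedded := 0
	for _, text := range pending {
		_ = e.Embed(text) // a real pass would store the vector
		embedded++
	}
	return embedded, nil
}

func main() {
	loads := 0
	factory := func() (embedder, error) {
		loads++ // stands in for the costly model load
		return fakeEmbedder{}, nil
	}

	n, _ := runEmbed(nil, factory)
	fmt.Println("embedded:", n, "model loads:", loads)

	n, _ = runEmbed([]string{"note one", "note two"}, factory)
	fmt.Println("embedded:", n, "model loads:", loads)
}
```

Passing a constructor function instead of a ready-made embedder is what makes the skip possible: the caller never pays for a model it does not use.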
type LinkRecord ¶
type LinkRecord struct {
DstNoteID string
DstRaw string
EdgeType string
TargetKind string
Heading string
BlockID string
Resolved bool
Confidence string
Origin string
Weight float64
}
LinkRecord represents a single outbound edge for storage.
type NoteAccessStats ¶
type NoteAccessStats struct {
NoteID string
AccessCount int
LastAccessedAt string // RFC3339Nano UTC, empty when never accessed
Title string
NoteType string
}
NoteAccessStats reports the access counters for a single note. Useful for doctor / debugging / verifying that RecordNoteAccess is firing on the paths it's supposed to. Title and NoteType are populated by ListAccessedNotes so the self-rendering layer can produce human-readable output without a separate join. LookupNoteAccess leaves them empty (single-id callers don't need them).
func ListAccessedNotes ¶
func ListAccessedNotes(d *DB) ([]NoteAccessStats, error)
ListAccessedNotes returns access stats across all notes with at least one recorded access, sorted newest-first by last access timestamp. Backs `vaultmind self` and any caller that wants "everything that's been touched." For the agent-only filtered view (excluding hook accesses), use ListAccessedNotesByCaller.
Pre-2026-05-01 this read from the scalar columns. Post-migration-007 it reads from the events table so callers see consistent data with the caller-filtered variant.
func ListAccessedNotesByCaller ¶
func ListAccessedNotesByCaller(d *DB, caller string) ([]NoteAccessStats, error)
ListAccessedNotesByCaller returns access stats restricted to events fired by the given caller. Used by `vaultmind self` to filter out hook fan-outs from the proprioceptive view: the SessionStart hook and per-turn pointer-recall fire RecordNoteAccess across many notes before the agent does any deliberate work, and showing them in the "hot" list pollutes the engagement signal `self` is supposed to surface. Pass an empty string to include all callers (matches ListAccessedNotes behaviour).
The "exclude" semantic — "show all callers EXCEPT X" — is provided by ListAccessedNotesExcludingCaller, which is the shape `self` actually wants ("agent + agent-neighbor, not hook").
func ListAccessedNotesExcludingCaller ¶
func ListAccessedNotesExcludingCaller(d *DB, excludedCaller string) ([]NoteAccessStats, error)
ListAccessedNotesExcludingCaller returns access stats from all callers EXCEPT the one named. Single-caller exclusion form; see ListAccessedNotesExcludingCallers for the multi-caller form.
func ListAccessedNotesExcludingCallers ¶
func ListAccessedNotesExcludingCallers(d *DB, excludedCallers []string) ([]NoteAccessStats, error)
ListAccessedNotesExcludingCallers returns access stats from all callers EXCEPT the ones named. Used by `vaultmind self` to filter out *both* hook fan-outs (CallerHook) and Ask context-pack neighbors (CallerAgentNeighbor) so the proprioceptive view reflects only deliberate-target accesses (CallerAgent — Ask top-hit + note get).
Round-1 review caught hook pollution; round-2 review caught the next-louder source: a single Ask fires N+1 access events (target + N neighbors) and an off-target nonsense query's neighbor fan-out dominates the hot list. Both pollutions close at the same caller-dimension layer the schema already provides.
Empty list returns the unfiltered view (matches ListAccessedNotes).
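The exclusion semantics can be modeled in memory over a list of access events. This sketch simplifies in two labeled ways: timestamps are plain int64 values rather than RFC3339Nano strings, and the aggregation happens in Go rather than in SQL. accessEvent, accessStats, and statsExcluding are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// accessEvent models one row of the per-event access log.
type accessEvent struct {
	NoteID     string
	Caller     string
	AccessedAt int64 // simplified timestamp; larger means more recent
}

// accessStats is a reduced stand-in for NoteAccessStats.
type accessStats struct {
	NoteID         string
	AccessCount    int
	LastAccessedAt int64
}

// statsExcluding aggregates events per note, ignoring events whose
// caller is in excluded, and sorts newest-first by last access.
// An empty excluded list yields the unfiltered view.
func statsExcluding(events []accessEvent, excluded []string) []accessStats {
	skip := make(map[string]bool, len(excluded))
	for _, c := range excluded {
		skip[c] = true
	}
	byNote := map[string]*accessStats{}
	for _, ev := range events {
		if skip[ev.Caller] {
			continue
		}
		s, ok := byNote[ev.NoteID]
		if !ok {
			s = &accessStats{NoteID: ev.NoteID}
			byNote[ev.NoteID] = s
		}
		s.AccessCount++
		if ev.AccessedAt > s.LastAccessedAt {
			s.LastAccessedAt = ev.AccessedAt
		}
	}
	out := make([]accessStats, 0, len(byNote))
	for _, s := range byNote {
		out = append(out, *s)
	}
	sort.Slice(out, func(i, j int) bool {
		return out[i].LastAccessedAt > out[j].LastAccessedAt
	})
	return out
}

func main() {
	events := []accessEvent{
		{"n1", "agent", 10},
		{"n1", "hook", 50},
		{"n2", "agent-neighbor", 30},
		{"n3", "agent", 20},
	}
	fmt.Println(statsExcluding(events, []string{"hook", "agent-neighbor"}))
	fmt.Println(statsExcluding(events, nil))
}
```

With both callers excluded, n2 disappears and n1 drops to a single access, so a hook-driven read no longer makes n1 look like the most recently engaged note.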
func LookupNoteAccess ¶
func LookupNoteAccess(d *DB, noteID string) (NoteAccessStats, error)
LookupNoteAccess returns the access stats for a single note, or (zero-stats, nil) when the note doesn't exist (deliberately mirrors QueryFullNote's "not found" semantics — caller checks whether NoteID came back populated). Reads from the scalar columns; for caller-aware lookups use the event-table queries directly.
type NoteColBERTEmbedding ¶
type NoteColBERTEmbedding struct {
NoteID string
ColBERT [][]float32
Type string
Title string
Path string
BodyText string
IsDomain bool
}
NoteColBERTEmbedding pairs a note ID with its ColBERT matrix and metadata.
func LoadAllColBERTEmbeddings ¶
func LoadAllColBERTEmbeddings(d *DB, dims int) ([]NoteColBERTEmbedding, error)
LoadAllColBERTEmbeddings returns all notes that have stored ColBERT embeddings. dims is the embedding dimensionality for decoding.
type NoteEmbedding ¶
type NoteEmbedding struct {
NoteID string
Embedding []float32
Type string
Title string
Path string
BodyText string
IsDomain bool
}
NoteEmbedding pairs a note ID with its embedding vector and metadata.
func LoadAllEmbeddings ¶
func LoadAllEmbeddings(d *DB) ([]NoteEmbedding, error)
LoadAllEmbeddings returns all notes that have stored embeddings, including metadata. This is a single query that avoids N+1 lookups when scoring and filtering results.
type NoteHashInfo ¶
NoteHashInfo holds the content hash and modification time for a note.
type NoteRecord ¶
type NoteRecord struct {
ID string
Path string
Title string
Type string
Status string
Created string
Updated string
BodyText string
Hash string
MTime int64
IsDomain bool
Aliases []string
Tags []string
ExtraKV map[string]interface{}
Links []LinkRecord
Headings []HeadingRecord
Blocks []BlockRecord
}
NoteRecord is the storage-ready representation of a parsed note. The indexer builds this from parser.ParsedNote + file metadata.
type NoteSparseEmbedding ¶
type NoteSparseEmbedding struct {
NoteID string
Sparse map[int32]float32
Type string
Title string
Path string
BodyText string
IsDomain bool
}
NoteSparseEmbedding pairs a note ID with its sparse vector and metadata.
func LoadAllSparseEmbeddings ¶
func LoadAllSparseEmbeddings(d *DB) ([]NoteSparseEmbedding, error)
LoadAllSparseEmbeddings returns all notes that have stored sparse embeddings.
type NoteTitle ¶
NoteTitle pairs a note's ID with its display title. Used by callers that need to list notes by title without loading full frontmatter/body (e.g. the ask command's fuzzy-title fallback on zero hits).
type PostIndexWarning ¶
PostIndexWarning reports failure of a post-store pass (link resolution, alias detection, tag overlap). These run after the note-store transaction commits; their failure leaves a partially-connected graph the operator can't distinguish from a successful run without this surface.
Conventional Step values: "orphan_sweep", "link_resolution", "alias_mention", "tag_overlap".
type SearchFilters ¶
type SearchFilters struct {
Type string // Filter by note type (empty = no filter)
Tag string // Filter by tag (empty = no filter)
}
SearchFilters holds optional filters for FTS search.