harness

package
v0.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 30, 2026 License: MIT Imports: 16 Imported by: 0

Documentation

Overview

Package harness is the composition layer that turns a Claude Code session directory into a memmy store, runs query batteries against it, and captures the per-node state changes that result.

Three top-level entry points: Ingest (corpus extraction + embedding cache priming), Replay (build a fresh memmy db from the corpus with a controllable Clock), and RunQueries (execute a query battery and snapshot before/after node state).

Functions

func FakeEmbedderModelID

func FakeEmbedderModelID(dim int) string

FakeEmbedderModelID is the conventional ModelID for cache entries produced by internal/embed/fake. Identity-keyed so different fake dims do not collide.
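To illustrate what "identity-keyed" means here, the following is a minimal sketch that derives a model ID from the embedding dimension. The function name fakeModelID and the "fake-dim<N>" format are assumptions made for illustration; the string FakeEmbedderModelID actually returns is not documented on this page.

```go
package main

import "fmt"

// fakeModelID is a hypothetical stand-in for FakeEmbedderModelID. The
// "fake-dim<N>" format is an assumption, not the package's real output.
func fakeModelID(dim int) string {
	return fmt.Sprintf("fake-dim%d", dim)
}

func main() {
	// Different dimensions yield different IDs, so cache entries written
	// by an 8-dim fake embedder never shadow those of a 16-dim one.
	fmt.Println(fakeModelID(8) != fakeModelID(16))
}
```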

func LastTurnTime

func LastTurnTime(ctx context.Context, corpusStorePath string) (time.Time, error)

LastTurnTime returns the latest turn timestamp in the corpus store at corpusStorePath. Callers that open a primed memmy db without running Replay use it to seed the FakeClock, so that subsequent Recall calls see a clock at or after every persisted LastTouched.

Types

type Hit

type Hit struct {
	Rank        int
	NodeID      string
	Text        string
	Score       float64
	Sim         float64
	NodeWeight  float64
	GraphMult   float64
	Depth       int
	SourceMsgID string
	TurnUUID    string // resolved from metadata when available
}

Hit is one returned chunk from a Recall call, joined with the originating turn UUID we stamped into the Write metadata.

type IngestOptions

type IngestOptions struct {
	// SessionsPath: file or directory passed to corpus.Extract.
	SessionsPath string
	// CorpusStorePath: where to materialize corpus.sqlite.
	CorpusStorePath string
	// EmbedCachePath: where to materialize the content-addressed cache.
	EmbedCachePath string
	// Embedder produces chunk vectors. Wrap with the Gemini embedder in
	// production; pass fake.New(...) in tests.
	Embedder embed.Embedder
	// EmbedderModelID is the identity stored in the cache key. Bump
	// when the embedder model changes.
	EmbedderModelID string
	// EmbedderKind is "fake" or "gemini" — recorded in the manifest.
	EmbedderKind string
	// Limit caps how many files get walked when SessionsPath is a
	// directory (0 = no cap). Useful for smoke runs.
	Limit int
	// Progress receives per-file and per-chunk updates. Optional.
	Progress Progress
	// Logger is the destination for structured ingest events. Optional.
	Logger *slog.Logger
}

IngestOptions configures one Ingest call.
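The EmbedCachePath and EmbedderModelID comments describe a content-addressed cache keyed partly by model identity. A minimal sketch of that idea follows, assuming a SHA-256 over the model ID and the chunk text. The helper cacheKey and its key layout are assumptions; the real cache's key format is not documented on this page.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey derives a content-addressed key from the model identity and
// the chunk text. The layout (modelID, NUL separator, text) is an
// assumption made for this sketch.
func cacheKey(modelID, text string) string {
	h := sha256.New()
	h.Write([]byte(modelID))
	h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") differ
	h.Write([]byte(text))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Same model and text: same key, so a re-run hits the cache.
	fmt.Println(cacheKey("model-v1", "hello") == cacheKey("model-v1", "hello"))
	// Bumping the model ID changes the key, so stale vectors are not reused.
	fmt.Println(cacheKey("model-v1", "hello") != cacheKey("model-v2", "hello"))
}
```

Keying on the model identity as well as the content is what makes "bump when the embedder model changes" sufficient to invalidate old vectors without deleting the cache.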

type IngestResult

type IngestResult struct {
	FilesScanned    int
	FilesIngested   int
	FilesSkippedDup int
	TurnsAdded      int
	ChunksEmbedded  int
	ChunksCacheHit  int
	CorpusSnapshot  string
	Manifest        manifest.DatasetManifest
}

IngestResult summarizes one Ingest invocation.

func Ingest

func Ingest(ctx context.Context, datasetName string, opts IngestOptions) (IngestResult, error)

Ingest walks SessionsPath, extracts turns, persists them to the corpus store, chunks them, and primes the embedding cache. It is idempotent per source file: a file is skipped on re-run if its path, mtime, and sha are unchanged.

type Progress

type Progress interface {
	StartFiles(total int)
	FileDone(path string, turns int)
	StartChunks(total int)
	ChunkDone(n int)
	Finish()
}

Progress is the optional callback interface that Ingest invokes after each file and after each chunk batch. Wire it to a progress bar in the binary; pass nil in tests.
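As a sketch of what a caller-side implementation might look like, the counter below satisfies the interface and tallies what it is told. The interface is redeclared locally so the example compiles on its own. The type countingProgress is hypothetical, and treating the argument of ChunkDone as the number of chunks finished in a batch is an assumption.

```go
package main

import "fmt"

// Progress mirrors the interface documented above so that this example
// is self-contained.
type Progress interface {
	StartFiles(total int)
	FileDone(path string, turns int)
	StartChunks(total int)
	ChunkDone(n int)
	Finish()
}

// countingProgress is a hypothetical implementation that only tallies.
type countingProgress struct {
	totalFiles, totalChunks int
	files, turns, chunks    int
	finished                bool
}

func (p *countingProgress) StartFiles(total int)            { p.totalFiles = total }
func (p *countingProgress) FileDone(path string, turns int) { p.files++; p.turns += turns }
func (p *countingProgress) StartChunks(total int)           { p.totalChunks = total }

// ChunkDone assumes n is the number of chunks completed in the batch.
func (p *countingProgress) ChunkDone(n int) { p.chunks += n }
func (p *countingProgress) Finish()         { p.finished = true }

// Compile-time check that countingProgress satisfies Progress.
var _ Progress = (*countingProgress)(nil)

func main() {
	p := &countingProgress{}
	p.StartFiles(2)
	p.FileDone("a.jsonl", 3)
	p.FileDone("b.jsonl", 5)
	p.StartChunks(10)
	p.ChunkDone(4)
	p.ChunkDone(6)
	p.Finish()
	fmt.Printf("files=%d turns=%d chunks=%d finished=%v\n",
		p.files, p.turns, p.chunks, p.finished)
}
```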

type QueryResult

type QueryResult struct {
	Query      queries.LabeledQuery
	StartedAt  time.Time
	FinishedAt time.Time
	Hits       []Hit
	PreState   []inspect.NodeState // top-K node state captured before Recall
	PostState  []inspect.NodeState // same nodes after Recall
	Error      string
}

QueryResult records everything needed to score one query.

func RunQueries

func RunQueries(ctx context.Context, qs []queries.LabeledQuery, opts RunQueriesOptions) ([]QueryResult, error)

RunQueries executes each labeled query against the live service, captures rankings + score breakdowns, and snapshots the node state of the top-K hits before and after Recall via the inspect reader.

The full corpus state is snapshotted ONCE at the start (O(corpus)) into a rolling baseline. For each query, post-state is then read for only the top-K hits (O(K)) and the delta is computed against that baseline. After each query the baseline is refreshed for the hit nodes, so that each query's "pre" reflects the state immediately before that query's Recall, not the state at battery start.
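The rolling-baseline bookkeeping can be sketched in a few lines. The types nodeState and baseline are simplified, hypothetical stand-ins (the real inspect.NodeState fields are not shown on this page), and the weights are made-up values.

```go
package main

import "fmt"

// nodeState is a simplified stand-in for inspect.NodeState.
type nodeState struct {
	ID     string
	Weight float64
}

// baseline maps node ID to the most recently observed state.
type baseline map[string]nodeState

// observe takes the post-Recall state of the hit nodes, returns the
// matching pre-state from the baseline, and then refreshes the baseline
// so the next query's "pre" is this query's "post". A node missing from
// the baseline yields a zero-valued pre-state.
func (b baseline) observe(post []nodeState) []nodeState {
	pre := make([]nodeState, 0, len(post))
	for _, p := range post {
		pre = append(pre, b[p.ID])
		b[p.ID] = p
	}
	return pre
}

func main() {
	// One full snapshot up front: O(corpus).
	b := baseline{
		"a": {ID: "a", Weight: 1.0},
		"b": {ID: "b", Weight: 1.0},
	}
	// Two queries that both hit node "a": O(K) work each.
	pre1 := b.observe([]nodeState{{ID: "a", Weight: 1.5}})
	pre2 := b.observe([]nodeState{{ID: "a", Weight: 2.0}})
	fmt.Println(pre1[0].Weight, pre2[0].Weight)
}
```

The second query's pre-state is the first query's post-state, which is the property the refresh step is there to guarantee.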

type ReplayOptions

type ReplayOptions struct {
	// CorpusStorePath: source of turns (chronological).
	CorpusStorePath string
	// EmbedCachePath: cache primed by Ingest. Replay's embedder consults
	// this cache so re-running with the same corpus does not re-embed.
	EmbedCachePath string
	// Embedder produces vectors via the cache wrapper.
	Embedder embed.Embedder
	// EmbedderModelID is the cache key namespace.
	EmbedderModelID string
	// ServiceConfig is the memmy service tunable bundle. Optional;
	// nil means defaults.
	ServiceConfig *memmy.ServiceConfig
	// FlatScanThreshold is the threshold below which Recall uses a
	// linear scan. 0 = default.
	FlatScanThreshold int
	// TenantTuple identifies the synthetic tenant memmy will write under.
	// Defaults to {agent: memmy-eval, dataset: <DatasetName>}.
	TenantTuple map[string]string
	// DatasetName is used as the default dataset key in TenantTuple.
	DatasetName string
	// Neo4j is the connection used for both the live memmy.Service and
	// the inspect Reader. Required.
	Neo4j inspect.Connection
}

ReplayOptions configures one Replay call.

type ReplayResult

type ReplayResult struct {
	Service       memmy.Service
	Closer        interface{ Close() error }
	Tenant        map[string]string
	FakeClock     *memmy.FakeClock
	Neo4j         inspect.Connection
	TurnsReplayed int
	NodesWritten  int
	StartedAt     time.Time
	FinishedAt    time.Time
}

ReplayResult exposes the live service so a caller can run queries without reopening it. Close releases the underlying Neo4j driver.

func OpenService

func OpenService(ctx context.Context, opts ReplayOptions, clockSeed time.Time) (*ReplayResult, error)

OpenService opens a memmy service backed by the configured Neo4j instance plus an embedcache wrapper around opts.Embedder. Used both by Replay (to build a fresh state from corpus) and by query-only paths. clockSeed is what the FakeClock is initialized to; pass the corpus's first-turn timestamp for fresh replays or the corpus's last-turn timestamp for query-only opens (so decay math sees a clock that's at-or-after every Node's LastTouched).

func Replay

func Replay(ctx context.Context, opts ReplayOptions) (*ReplayResult, error)

Replay reads every turn from the corpus store in chronological order, advances a FakeClock to each turn's timestamp, and calls memmy.Service.Write under the configured tenant. The FakeClock drives the Service's decay math so that decay/reinforcement dynamics observed in the resulting db reflect the real time gaps between turns rather than wall-clock latency of the harness.

func (*ReplayResult) Close

func (r *ReplayResult) Close() error

Close releases the underlying memmy storage handle.

type RunQueriesOptions

type RunQueriesOptions struct {
	Service      memmy.Service
	Tenant       map[string]string
	InspectConn  inspect.Connection // Neo4j connection for the inspect Reader
	K            int
	Hops         int
	OversampleN  int
	AdvanceClock time.Duration // FakeClock advance between queries (0 = none)
	FakeClock    *memmy.FakeClock
}

RunQueriesOptions configures one query battery.
