embed

package

v0.3.2 (latest) | Published: Jun 8, 2026 | License: MIT | Imports: 27 | Imported by: 0

Repository

github.com/DonaldMurillo/gofastr

README ¶

battery/embed

Local semantic-search battery for GoFastr.

idx, _ := embed.Open(embed.Options{
    // Real semantic — recommended once Ollama is running locally:
    Embedder: embed.NewOllamaEmbedder(embed.OllamaConfig{
        Model: "nomic-embed-text",            // ollama pull nomic-embed-text
    }),
    // Or the dependency-free deterministic stub for tests/offline dev:
    // Embedder: embed.NewStubEmbedder(128),
    Keyword:  embed.NewMemoryKeyword(),       // optional, enables hybrid
    Path:     "~/.gofastr/embed/myapp",       // optional, enables persistence
})
defer idx.Close()

idx.Add(ctx, embed.Document{ID: "auth", Source: "auth.go",
    Text: "Auth middleware verifies sessions and JWTs."})

hits, _ := idx.Query(ctx, embed.Query{
    Text:      "how does session middleware work",
    K:         5,
    Hybrid:    true,    // keyword + vector RRF fusion
    MMRLambda: 0.3,     // diversity reranking
})

What's in the box

File	What it does
`embed.go`	Public types: `Document`, `Chunk`, `Hit`, `Query`, `Filter`, `Stats`. The `Embedder`, `Chunker`, `Store`, `KeywordBackend`, `Reranker` interfaces. `Open(Options) (Index, error)`.
`index.go`	The default `Index` implementation. Orchestrates chunker → embedder → store → retrieval pipeline. WAL + snapshot lifecycle.
`store_flat.go`	`FlatStore`: in-memory `[]Chunk`, brute-force cosine, doc-scoped removal. Targets ~100k chunks at 384 dims (~150MB).
`chunker.go`	`FixedWindow`: language-agnostic rune-window chunker with overlap. Default.
`chunker_lang.go`	`LangAware`: Go AST-aware + Markdown heading-aware; falls back to `FixedWindow` per-section when chunks exceed `MaxRunes`.
`stub_embedder.go`	`StubEmbedder`: deterministic FNV-hashed bag-of-words. Test and offline-dev only — not a real model.
`embedder_ollama.go`	`OllamaEmbedder`: HTTP client against Ollama-compatible `/api/embed`. Real semantic embeddings, no CGO, no bundled model. Recommended production default.
`hybrid.go`	`MemoryKeyword`: in-process BM25-flavoured keyword backend. `fuseRRF` reciprocal-rank fusion.
`hybrid_search.go`	`WrapSearchBackend`: adapter from `battery/search.Backend` into `KeywordBackend`.
`mmr.go`	Maximal Marginal Relevance reranker. Run after candidate generation, before final top-K.
`persist.go`	Gob snapshot (atomic rename) + append-only WAL with replay. Model-fingerprint guard.
`watcher.go`	`Watcher`: polling filesystem watcher with include-exts, exclude-dirs, replace-on-mtime, delete-on-vanish.
`plugin.go`	`framework.Plugin` adapter that auto-mounts `/embed/*` routes.
`routes.go`	Stdlib `http.Handler` exposing `POST /index`, `POST /query`, `GET /stats`, `DELETE /doc/{id}`.

Retrieval pipeline

Query.Text
  │
  ├── Embedder.Embed(text)                       → qv
  ├── Store.Candidates(qv, Filter, 4×K)          → vector hits
  └── KeywordBackend.Search(text, 4×K)           → keyword hits   (if Hybrid)
                │
                └── fuseRRF                       → fused hits
                                    │
                                    └── mmr      → diverse top K (if MMRLambda > 0)
                                                        │
                                                        └── Reranker.Rerank (if Rerank)
                                                                │
                                                                └── strip Vec → caller

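The pipeline is opt-in: a default Query{Text: "x"} runs pure vector retrieval. Setting Hybrid enables fusion, provided Options.Keyword is set (with a nil keyword backend, hybrid queries silently fall back to pure vector retrieval). A non-zero MMRLambda enables diversity reranking. Rerank requires Options.Reranker to be set; otherwise the call returns an error, so a missing reranker cannot silently degrade quality.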
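The fusion step can be illustrated with a minimal sketch of reciprocal-rank fusion. This is not the package's fuseRRF: the function name rrfFuse, the smoothing constant k = 60, and the alphabetical tie-break are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges several ranked ID lists with reciprocal-rank fusion:
// score(id) = sum over lists of 1/(k + rank), where rank starts at 1.
// Only rank positions matter, so the input scores need not be comparable.
func rrfFuse(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break (assumption)
	})
	return ids
}

func main() {
	vectorHits := []string{"a", "b", "c"}
	keywordHits := []string{"b", "d", "a"}
	fmt.Println(rrfFuse(60, vectorHits, keywordHits)) // [b a d c]
}
```

Chunk "b" wins because it ranks near the top of both lists, even though it is first in only one of them.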

Persistence

When Options.Path is set, every mutation is appended to <path>/store.wal, and Open replays that log on top of the latest snapshot. Every Options.SnapshotEvery mutations (default 1000), or on an explicit Index.Snapshot() call, the full state is written atomically to <path>/store.snap and the WAL is truncated.
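The append-and-replay idea can be sketched in a few lines. This is an illustration only: the walOp record, the JSON-lines encoding, and the helper names are assumptions, and the package's actual on-disk WAL format is not shown in these docs.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// walOp is a hypothetical log record (not the package's real format).
type walOp struct {
	Kind  string `json:"kind"` // "add" or "remove"
	DocID string `json:"doc_id"`
	Text  string `json:"text,omitempty"`
}

// appendOp appends one record to the log and syncs it to disk.
func appendOp(path string, op walOp) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	if err := json.NewEncoder(f).Encode(op); err != nil {
		return err
	}
	return f.Sync()
}

// replay rebuilds in-memory state by applying every record in order.
func replay(path string) (map[string]string, error) {
	state := map[string]string{}
	f, err := os.Open(path)
	if errors.Is(err, os.ErrNotExist) {
		return state, nil // no log yet: start empty
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	dec := json.NewDecoder(f)
	for {
		var op walOp
		if err := dec.Decode(&op); err == io.EOF {
			return state, nil
		} else if err != nil {
			return nil, err
		}
		switch op.Kind {
		case "add":
			state[op.DocID] = op.Text
		case "remove":
			delete(state, op.DocID)
		}
	}
}

// demo writes three operations to a temporary log and replays them.
func demo() (map[string]string, error) {
	dir, err := os.MkdirTemp("", "wal-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.wal")
	ops := []walOp{
		{Kind: "add", DocID: "auth", Text: "Auth middleware"},
		{Kind: "add", DocID: "tmp", Text: "scratch"},
		{Kind: "remove", DocID: "tmp"},
	}
	for _, op := range ops {
		if err := appendOp(path, op); err != nil {
			return nil, err
		}
	}
	return replay(path)
}

func main() {
	state, err := demo()
	fmt.Println(state, err)
}
```

After replay only the "auth" document remains, because the later "remove" record cancels the earlier "add" for "tmp".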

The snapshot header records the embedder's Name() and Dim(). A mismatch on load returns *ModelMismatchError and refuses to load — mixing vectors from different models silently destroys retrieval quality.

Watcher

embed.NewWatcher(idx, embed.WatchOptions{...}) walks roots, applies include-exts (.go, .md, …), excludes well-known dirs (.git, node_modules, …), and polls every 2s by default. Replace-by-doc semantics in Index.Add mean a file edit re-chunks cleanly without leaving stale chunks behind.

HTTP surface

Method	Path	Body	Returns
POST	`/embed/index`	`{"documents":[…]}`	202 + `{"added": N}`
POST	`/embed/query`	`Query`	`{"hits":[…]}`
GET	`/embed/stats`	—	`Stats`
DELETE	`/embed/doc/{id}`	—	204

Mount via the plugin:

app.RegisterPlugin(embed.NewPlugin(idx))
app.InitPlugins()

Or wire the raw handler onto any router:

mux.Handle("/embed/", http.StripPrefix("/embed", embed.Handler(idx)))

CLI

gofastr embed index .                          # one-shot index of cwd
gofastr embed watch ./src ./docs               # poll-watch until SIGINT
gofastr embed query "auth middleware" -k 5 --hybrid
gofastr embed stats
gofastr embed clear                            # wipe local snapshot

When GOFASTR_URL is set, query and stats hit that server's /embed/* routes instead of opening a local index.

Kiln integration

The ContextHook field on Kiln's agent loop (agent.Loop in the snippet that follows) is a per-turn callback that prepends retrieved context to the system prompt. Wire it with the helper:

loop := &agent.Loop{
    Provider:    provider,
    Tools:       tools,
    ContextHook: agent.NewEmbedContextHook(idx, 6),
}

Each user turn re-queries the index against the latest user message and injects the top 6 chunks as # Project context ahead of the framework's built-in prompt.

Live tests against real Ollama

Default go test ./... covers the package with the stub embedder and an httptest mock of Ollama. Tests that exercise real semantic behaviour against a running Ollama live in live_test.go behind a //go:build live tag and are skipped unless you explicitly run them:

make ollama-up        # docker compose up + auto-pull nomic-embed-text
make embed-live       # go test -tags=live -v ./battery/embed/...
make ollama-down      # stop the container

docker-compose.yml bind-mounts ./.ollama/ (gitignored) so the model only downloads once per workstation. What the live suite verifies:

- Paraphrases of the same intent have higher cosine than unrelated pairs (the property the stub cannot satisfy).
- A small corpus + a paraphrase query surfaces the right doc at rank #1, both with and without hybrid keyword fusion.
- MMR with a real embedder surfaces diverse topics on a near-duplicate corpus.
- Snapshot + reopen survives a fresh embedder instance (model fingerprint matches).
- The polling watcher feeds the real embedder end-to-end.

Roadmap

Milestone	Status
M1 — package skeleton, types, stub embedder, flat store, chunker	✅
M1.5 — real semantic embedder (Ollama-compatible HTTP)	✅
M2 — gob snapshot + WAL, plugin + HTTP routes	✅
M3 — hybrid retrieval, filters, MMR, rerank hook	✅
M4 — polling watcher, `gofastr embed` CLI	✅
M5 — Kiln context hook, lang-aware chunker, example app, benches, docs	✅

The default production embedder is OllamaEmbedder. Users who want a different model wire any Embedder implementation (OpenAI Embeddings API, a private microservice, etc.) — the interface is three methods and the rest of the package is embedder-agnostic.

Documentation ¶

Overview ¶

Package embed provides a local semantic-search battery for GoFastr.

The package is built around a single per-app Index that stores [Chunk]s with vector embeddings and serves [Query]s via brute-force cosine similarity, with optional hybrid keyword fusion, metadata filtering, MMR diversity, and a pluggable rerank hook.

Components are intentionally separated so users can swap parts:

- Embedder turns text into vectors. The recommended production embedder is OllamaEmbedder, an HTTP client for Ollama-compatible servers (added in M1.5); a deterministic stub (NewStubEmbedder) ships for tests and offline development. Open requires an Embedder and returns an error without one.
- Chunker splits a Document into [Chunk]s. The default FixedWindow is language-agnostic and tokenizer-free.
- Store holds vectors and metadata. The default FlatStore keeps everything in memory and, when Options.Path is set, persists to disk through the snapshot and WAL added in M2.

See battery/embed/README.md for the architecture, retrieval pipeline, and milestone plan.

Index ¶

func Handler(idx Index) http.Handler
type Chunk
type Chunker
type Document
type Embedder
type Filter
type FixedWindow
- func NewFixedWindow(size, overlap int) *FixedWindow
- func (f *FixedWindow) Chunk(doc Document) ([]Chunk, error)
type FlatStore
- func NewFlatStore(dim int, model string) *FlatStore
- func (s *FlatStore) Add(_ context.Context, chunks []Chunk) error
- func (s *FlatStore) AllChunks() []Chunk
- func (s *FlatStore) Candidates(_ context.Context, qv []float32, filter Filter, top int) ([]Hit, error)
- func (s *FlatStore) ChunkByID(id string) (Chunk, bool)
- func (s *FlatStore) ChunkIDsForDoc(docID string) []string
- func (s *FlatStore) LoadSnapshot(path string) error
- func (s *FlatStore) RemoveDoc(_ context.Context, docID string) error
- func (s *FlatStore) Snapshot(path string) error
- func (s *FlatStore) Stats() Stats
type Hit
type Index
- func Open(opts Options) (Index, error)
type KeywordBackend
- func WrapSearchBackend(b search.Backend) KeywordBackend
type KeywordHit
type LangAware
- func NewLangAware() *LangAware
- func (l *LangAware) Chunk(doc Document) ([]Chunk, error)
type MemoryKeyword
- func NewMemoryKeyword() *MemoryKeyword
- func (m *MemoryKeyword) Delete(_ context.Context, id string) error
- func (m *MemoryKeyword) Index(_ context.Context, id, text string) error
- func (m *MemoryKeyword) Search(_ context.Context, text string, top int) ([]KeywordHit, error)
type ModelMismatchError
- func (e *ModelMismatchError) Error() string
type OllamaConfig
type OllamaEmbedder
- func NewOllamaEmbedder(cfg OllamaConfig) *OllamaEmbedder
- func (e *OllamaEmbedder) Dim() int
- func (e *OllamaEmbedder) Embed(ctx context.Context, texts []string) ([][]float32, error)
- func (e *OllamaEmbedder) Name() string
- func (e *OllamaEmbedder) Probe(ctx context.Context) error
type Options
type Plugin
- func NewPlugin(idx Index) *Plugin
- func (p *Plugin) Index() Index
- func (p *Plugin) Init(app *framework.App) error
- func (p *Plugin) Name() string
- func (p *Plugin) WithPrefix(prefix string) *Plugin
type Query
type Reranker
type Stats
type Store
type StubEmbedder
- func NewStubEmbedder(dim int) *StubEmbedder
- func (s *StubEmbedder) Dim() int
- func (s *StubEmbedder) Embed(_ context.Context, texts []string) ([][]float32, error)
- func (s *StubEmbedder) Name() string
type WatchOptions
type Watcher
- func NewWatcher(idx Index, opts WatchOptions) *Watcher
- func (w *Watcher) Run(ctx context.Context, roots ...string) error
- func (w *Watcher) ScanOnce(ctx context.Context, roots ...string) error

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Handler ¶

func Handler(idx Index) http.Handler

Handler returns an http.Handler that exposes the index over HTTP. The handler is framework-agnostic; it can be mounted under any prefix on any router or http.ServeMux. The plugin in Plugin mounts it under "/embed" on a GoFastr framework.App.

Security contract:

- Every route requires a non-empty Authorization header. The handler does not validate the credential itself; that is the caller's job when the handler is mounted behind an auth middleware. Rejecting anonymous traffic at the handler is a defense-in-depth measure so an unprotected mount cannot accidentally expose the index.
- POST routes require Content-Type: application/json (415 otherwise).
- Request bodies are capped at 1 MiB (413 otherwise).
- Upstream / driver errors are NEVER echoed back to the client; the handler returns a generic "internal error" string instead.

Routes:

POST /index body: {"documents": [...]} → {"added": N}
POST /query body: Query → {"hits": [...]}
GET /stats → Stats
DELETE /doc/{id} (or query param ?id=) → 204
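A client has to satisfy the header requirements above before a request is accepted. The following sketch builds a query request with only the standard library. The base URL, the token value, and the Bearer scheme are assumptions (the handler only requires that the Authorization header be non-empty), and queryBody is a local mirror of the documented Query JSON fields rather than the package type.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// queryBody mirrors the documented JSON field names of Query.
type queryBody struct {
	Text      string  `json:"text"`
	K         int     `json:"k,omitempty"`
	Hybrid    bool    `json:"hybrid,omitempty"`
	MMRLambda float64 `json:"mmr_lambda,omitempty"`
}

// newQueryRequest builds a POST to <baseURL>/embed/query, assuming the
// handler is mounted under the default "/embed" prefix.
func newQueryRequest(baseURL, token string, q queryBody) (*http.Request, error) {
	payload, err := json.Marshal(q)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/embed/query", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	// POST routes reject anything other than JSON with 415.
	req.Header.Set("Content-Type", "application/json")
	// Any non-empty value passes the handler's check; real validation is
	// expected to happen in middleware in front of it.
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	req, err := newQueryRequest("http://localhost:8080", "dev-token",
		queryBody{Text: "auth middleware", K: 5, Hybrid: true})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path, req.Header.Get("Content-Type"))
}
```

Passing the request to an http.Client's Do method would send it; the sketch stops short of that so it runs without a server.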

Types ¶

type Chunk ¶

type Chunk struct {
	ID       string         `json:"id"`
	DocID    string         `json:"doc_id"`
	Source   string         `json:"source,omitempty"`
	Text     string         `json:"text"`
	Offset   [2]int         `json:"offset"`
	Metadata map[string]any `json:"metadata,omitempty"`
	Vec      []float32      `json:"-"`
}

Chunk is a stored, embedded slice of a Document. Vec is owned by the store and is L2-normalized so cosine similarity reduces to a dot product.

type Chunker ¶

type Chunker interface {
	Chunk(doc Document) ([]Chunk, error)
}

Chunker splits a Document into [Chunk]s. Implementations MUST produce chunks with stable IDs derived from the document and chunk offset so re-indexing the same content does not duplicate entries.

type Document ¶

type Document struct {
	ID       string         `json:"id"`
	Source   string         `json:"source,omitempty"`
	Text     string         `json:"text"`
	Metadata map[string]any `json:"metadata,omitempty"`
}

Document is a unit of input to the index. A single Document is split by a Chunker into one or more [Chunk]s that are individually embedded and stored.

type Embedder ¶

type Embedder interface {
	Embed(ctx context.Context, texts []string) ([][]float32, error)
	Dim() int
	Name() string
}

Embedder turns text into fixed-dimension vectors. Implementations MUST produce L2-normalized output so the store can rely on dot product as cosine similarity.
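One way to meet the normalization requirement when wrapping a backend that returns raw vectors is a small decorator. The sketch below copies the interface from above into a standalone program; rawEmbedder and normalized are hypothetical names, and the toy two-dimensional "embedding" exists only to make the arithmetic visible.

```go
package main

import (
	"context"
	"fmt"
	"math"
)

// Embedder is copied from the interface documented above.
type Embedder interface {
	Embed(ctx context.Context, texts []string) ([][]float32, error)
	Dim() int
	Name() string
}

// rawEmbedder is a hypothetical backend that returns unnormalized vectors.
type rawEmbedder struct{}

func (rawEmbedder) Embed(_ context.Context, texts []string) ([][]float32, error) {
	out := make([][]float32, len(texts))
	for i, t := range texts {
		out[i] = []float32{float32(len(t)), 4} // toy 2-dim vector
	}
	return out, nil
}
func (rawEmbedder) Dim() int     { return 2 }
func (rawEmbedder) Name() string { return "toy:raw" }

// normalized wraps any Embedder and L2-normalizes its output in place.
// Dim and Name are promoted from the wrapped value.
type normalized struct{ Embedder }

func (n normalized) Embed(ctx context.Context, texts []string) ([][]float32, error) {
	vecs, err := n.Embedder.Embed(ctx, texts)
	if err != nil {
		return nil, err
	}
	for _, v := range vecs {
		var sum float64
		for _, x := range v {
			sum += float64(x) * float64(x)
		}
		if sum == 0 {
			continue // leave zero vectors untouched
		}
		inv := float32(1 / math.Sqrt(sum))
		for i := range v {
			v[i] *= inv
		}
	}
	return vecs, nil
}

// demo embeds "abc", whose raw vector is (3, 4) with length 5.
func demo() ([]float32, error) {
	var e Embedder = normalized{rawEmbedder{}}
	vecs, err := e.Embed(context.Background(), []string{"abc"})
	if err != nil {
		return nil, err
	}
	return vecs[0], nil
}

func main() {
	v, err := demo()
	fmt.Println(v, err)
}
```

With unit-length vectors on both sides, the dot product of two embeddings equals their cosine similarity, which is what lets the store skip the division at query time.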

type Filter ¶

type Filter struct {
	Source    string         `json:"source,omitempty"`
	Kind      string         `json:"kind,omitempty"`
	MetaMatch map[string]any `json:"meta_match,omitempty"`
}

Filter restricts the candidate set before scoring. Empty fields are ignored. MetaMatch entries are checked for exact equality against the chunk's Metadata.

type FixedWindow ¶

type FixedWindow struct {
	Size    int // window size, in runes
	Overlap int // overlap between consecutive windows, in runes
}

FixedWindow is a language-agnostic chunker that splits text into fixed-size rune windows with overlap. It does not look at token boundaries — that's the embedder's concern — so chunks are always reproducible from byte offsets alone.

func NewFixedWindow ¶

func NewFixedWindow(size, overlap int) *FixedWindow

NewFixedWindow validates size/overlap and returns a chunker. Overlap is clamped to [0, size-1].
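The windowing scheme can be sketched as follows. This is an independent illustration, not the package's code: the function name windows and the choice to return plain strings (rather than Chunk values with IDs and offsets) are simplifications.

```go
package main

import "fmt"

// windows splits text into rune windows of the given size, where each
// window starts (size - overlap) runes after the previous one.
// overlap is clamped to [0, size-1], matching the documented behaviour.
func windows(text string, size, overlap int) []string {
	if size <= 0 {
		return nil
	}
	if overlap < 0 {
		overlap = 0
	}
	if overlap > size-1 {
		overlap = size - 1
	}
	runes := []rune(text) // operate on runes so multi-byte characters stay whole
	step := size - overlap
	var out []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		out = append(out, string(runes[start:end]))
		if end == len(runes) {
			break // the final window reached the end of the text
		}
	}
	return out
}

func main() {
	fmt.Println(windows("abcdefghij", 4, 1)) // [abcd defg ghij]
	fmt.Println(windows("héllo", 2, 0))      // [hé ll o]
}
```

The clamp matters: an overlap equal to the window size would make the step zero and the loop would never advance.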

func (*FixedWindow) Chunk ¶

func (f *FixedWindow) Chunk(doc Document) ([]Chunk, error)

Chunk implements Chunker.

type FlatStore ¶

type FlatStore struct {
	// contains filtered or unexported fields
}

FlatStore is the default in-memory Store: a slice of chunks scored by brute-force cosine similarity. It targets up to ~100k chunks; for 384-dim float32 vectors that is 100,000 × 384 × 4 bytes ≈ 154 MB of vector data (roughly 150MB) plus chunk text, with query latency under ~30ms on a modern CPU.

func NewFlatStore ¶

func NewFlatStore(dim int, model string) *FlatStore

NewFlatStore returns an empty store sized for vectors of dimension dim produced by the named model. dim and model are recorded so that loading a snapshot embedded with a different model or dimension can be refused.

func (*FlatStore) Add ¶

func (s *FlatStore) Add(_ context.Context, chunks []Chunk) error

Add appends chunks. Vectors are L2-normalized in place so query-time scoring is a plain dot product. When the store was constructed with dim=0 (embedder dim unknown at boot time), the first non-empty vec fixes the dimension for the lifetime of the store.

func (*FlatStore) AllChunks ¶

func (s *FlatStore) AllChunks() []Chunk

AllChunks returns a copy of every stored chunk. Part of chunkLister; used to rebuild the keyword index after a snapshot load (snapshots don't persist keyword state).

func (*FlatStore) Candidates ¶

func (s *FlatStore) Candidates(_ context.Context, qv []float32, filter Filter, top int) ([]Hit, error)

Candidates returns up to top chunks by cosine similarity to qv, applying filter to the candidate set first.
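Brute-force candidate generation amounts to scoring every stored vector and keeping the best few. The sketch below assumes all vectors are already unit-length, so the dot product is the cosine; the scored type and topK function are illustrative names, and filtering is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// scored pairs a chunk ID with its similarity to the query.
type scored struct {
	ID    string
	Score float32
}

// dot equals cosine similarity when both inputs are unit-length.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// topK scores every vector against qv and returns the k best, highest first.
func topK(qv []float32, ids []string, vecs [][]float32, k int) []scored {
	if k < 0 {
		k = 0
	}
	hits := make([]scored, 0, len(vecs))
	for i, v := range vecs {
		hits = append(hits, scored{ID: ids[i], Score: dot(qv, v)})
	}
	// A full sort is the simplest approach; a bounded heap would avoid
	// sorting the whole slice when k is small.
	sort.SliceStable(hits, func(a, b int) bool { return hits[a].Score > hits[b].Score })
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	ids := []string{"x-axis", "diagonal", "y-axis"}
	vecs := [][]float32{{1, 0}, {0.6, 0.8}, {0, 1}}
	for _, h := range topK([]float32{1, 0}, ids, vecs, 2) {
		fmt.Println(h.ID, h.Score)
	}
}
```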

func (*FlatStore) ChunkByID ¶

func (s *FlatStore) ChunkByID(id string) (Chunk, bool)

ChunkByID returns the chunk with the given ID. Part of chunkLister; used to hydrate keyword-only hits (which carry just a chunk ID) into full Hits.

func (*FlatStore) ChunkIDsForDoc ¶

func (s *FlatStore) ChunkIDsForDoc(docID string) []string

ChunkIDsForDoc returns the IDs of every chunk belonging to docID. Part of the chunkLister capability the hybrid/keyword index uses to purge a doc's keyword entries when the doc is removed.

func (*FlatStore) LoadSnapshot ¶

func (s *FlatStore) LoadSnapshot(path string) error

LoadSnapshot replaces the store's contents with the snapshot at path. If the snapshot was produced by a different embedder model or dimension, a *ModelMismatchError is returned and the store is left untouched.

func (*FlatStore) RemoveDoc ¶

func (s *FlatStore) RemoveDoc(_ context.Context, docID string) error

RemoveDoc deletes every chunk belonging to docID. It performs a stable, in-place compaction so retained chunks keep their relative order — useful when persistence writes the underlying slice in insertion order.

func (*FlatStore) Snapshot ¶

func (s *FlatStore) Snapshot(path string) error

Snapshot writes a complete, atomic snapshot of the store to path. The write goes to path+".tmp" and is renamed into place; partial snapshots are never observable.
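The write-then-rename pattern described here can be sketched with the standard library. The snapshot struct, its fields, and the helper names are assumptions for illustration; only the path+".tmp" convention and the use of gob come from the documentation.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
)

// snapshot is a hypothetical payload; the real snapshot layout is not shown.
type snapshot struct {
	Model  string
	Dim    int
	Chunks []string
}

// writeAtomic encodes s to path+".tmp", syncs it, and renames it over path.
// Readers see either the old file or the complete new one, never a partial write.
func writeAtomic(path string, s snapshot) (err error) {
	tmp := path + ".tmp"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}
	defer func() {
		if err != nil { // clean up the temp file on any failure
			f.Close()
			os.Remove(tmp)
		}
	}()
	if err = gob.NewEncoder(f).Encode(s); err != nil {
		return err
	}
	if err = f.Sync(); err != nil {
		return err
	}
	if err = f.Close(); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// readSnapshot decodes a snapshot written by writeAtomic.
func readSnapshot(path string) (snapshot, error) {
	var s snapshot
	f, err := os.Open(path)
	if err != nil {
		return s, err
	}
	defer f.Close()
	err = gob.NewDecoder(f).Decode(&s)
	return s, err
}

// demo round-trips a snapshot and reports whether the temp file lingered.
func demo() (snapshot, bool, error) {
	dir, err := os.MkdirTemp("", "snap-sketch")
	if err != nil {
		return snapshot{}, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.snap")
	in := snapshot{Model: "ollama:nomic-embed-text", Dim: 768, Chunks: []string{"auth#0", "auth#1"}}
	if err := writeAtomic(path, in); err != nil {
		return snapshot{}, false, err
	}
	_, statErr := os.Stat(path + ".tmp")
	out, err := readSnapshot(path)
	return out, statErr == nil, err
}

func main() {
	s, tmpLeft, err := demo()
	fmt.Println(s.Model, s.Dim, len(s.Chunks), tmpLeft, err)
}
```

The rename is atomic only when the temp file and the destination live on the same filesystem, which is why the temp file is created next to the target rather than in a system temp directory.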

func (*FlatStore) Stats ¶

func (s *FlatStore) Stats() Stats

Stats returns a snapshot of store state.

type Hit ¶

type Hit struct {
	Chunk  Chunk   `json:"chunk"`
	Score  float64 `json:"score"`
	Reason string  `json:"reason,omitempty"`
}

Hit is a single retrieval result. Reason describes which stage of the pipeline produced the score ("vec", "kw", "hybrid", "mmr", "rerank") so callers can debug ranking behaviour.

type Index ¶

type Index interface {
	Add(ctx context.Context, docs ...Document) error
	Remove(ctx context.Context, docIDs ...string) error
	Query(ctx context.Context, q Query) ([]Hit, error)
	// Snapshot persists the current state to the path configured via
	// [Options.Path]. It is a no-op if no path was configured.
	Snapshot() error
	Stats() Stats
	Close() error
}

Index is the public handle returned by Open. It composes a Chunker, Embedder and Store into the read/write API GoFastr apps consume.

func Open ¶

func Open(opts Options) (Index, error)

Open constructs an Index from Options. It returns an error if no Embedder is supplied. If Options.Path is set, Open loads any existing snapshot and replays the WAL before returning.

type KeywordBackend ¶

type KeywordBackend interface {
	Index(ctx context.Context, id, text string) error
	Delete(ctx context.Context, id string) error
	// Search returns up to top results as (chunkID, score) pairs.
	Search(ctx context.Context, text string, top int) ([]KeywordHit, error)
}

KeywordBackend is the subset of the battery/search backend the embed package needs for hybrid retrieval. It is defined here so the embed package does not import battery/search in its core path; the real wiring happens in hybrid_search.go via an adapter.

func WrapSearchBackend ¶

func WrapSearchBackend(b search.Backend) KeywordBackend

WrapSearchBackend adapts a battery/search.Backend to the KeywordBackend interface this package expects. It is intentionally thin: every Index/Delete call writes through, and Search maps Documents back to chunk IDs.

type KeywordHit ¶

type KeywordHit struct {
	ChunkID string
	Score   float64
}

KeywordHit is one result from a KeywordBackend.

type LangAware ¶

type LangAware struct {
	// MaxRunes caps a single produced chunk in runes. Chunks above this
	// are re-chunked with FixedWindow. Defaults to 1024.
	MaxRunes int
	// Fallback is used for unknown kinds and for over-large structural
	// chunks. Defaults to NewFixedWindow(512, 64).
	Fallback Chunker
}

LangAware is a structure-aware chunker that splits Go source on top-level declarations and Markdown on headings, falling back to a FixedWindow for anything it can't parse or for chunks that exceed MaxRunes.

Routing key is Document.Metadata["kind"] when present; otherwise the Source extension drives the choice (.go → go, .md/.markdown → md).

func NewLangAware ¶

func NewLangAware() *LangAware

NewLangAware returns a LangAware chunker with sensible defaults.

func (*LangAware) Chunk ¶

func (l *LangAware) Chunk(doc Document) ([]Chunk, error)

Chunk implements Chunker.

type MemoryKeyword ¶

type MemoryKeyword struct {
	// contains filtered or unexported fields
}

MemoryKeyword is an allocation-cheap, dependency-free KeywordBackend meant to be passed as [Options.Keyword] when the caller wants hybrid retrieval without wiring battery/search. It is not installed automatically: a nil [Options.Keyword] leaves hybrid retrieval disabled.

It scores documents using a simplified BM25-flavoured formula: term frequency in the document, normalised by document length and scaled by inverse document frequency. This is not BM25 (there is no saturation parameter such as BM25's k1 for term frequency), but it is the right shape for fusing with vector scores via RRF, where only the rank order matters.
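A scoring function of roughly this shape can be sketched as below. The exact formula MemoryKeyword uses is not given in these docs, so everything here is an assumption: the score is the sum over query terms of (tf / docLen) × ln(1 + N/df), with lowercase whitespace tokenization.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// kwHit is an illustrative result type.
type kwHit struct {
	ID    string
	Score float64
}

// search ranks docs by length-normalised term frequency times IDF.
// Documents that match no query term are omitted.
func search(docs map[string]string, query string, top int) []kwHit {
	tokens := map[string][]string{}
	df := map[string]int{} // number of docs containing each term
	for id, text := range docs {
		toks := strings.Fields(strings.ToLower(text))
		tokens[id] = toks
		seen := map[string]bool{}
		for _, t := range toks {
			if !seen[t] {
				seen[t] = true
				df[t]++
			}
		}
	}
	n := float64(len(docs))
	qTerms := strings.Fields(strings.ToLower(query))
	var hits []kwHit
	for id, toks := range tokens {
		if len(toks) == 0 {
			continue
		}
		tf := map[string]int{}
		for _, t := range toks {
			tf[t]++
		}
		var score float64
		for _, q := range qTerms {
			if tf[q] == 0 {
				continue
			}
			idf := math.Log(1 + n/float64(df[q]))
			score += float64(tf[q]) / float64(len(toks)) * idf
		}
		if score > 0 {
			hits = append(hits, kwHit{ID: id, Score: score})
		}
	}
	sort.Slice(hits, func(a, b int) bool {
		if hits[a].Score != hits[b].Score {
			return hits[a].Score > hits[b].Score
		}
		return hits[a].ID < hits[b].ID
	})
	if top >= 0 && top < len(hits) {
		hits = hits[:top]
	}
	return hits
}

func main() {
	docs := map[string]string{
		"auth": "auth middleware verifies sessions",
		"db":   "database connection pool",
		"mw":   "logging middleware for every request and every response",
	}
	for _, h := range search(docs, "auth middleware", 5) {
		fmt.Printf("%s %.3f\n", h.ID, h.Score)
	}
}
```

Because reciprocal-rank fusion consumes only the order of the results, the absolute values produced by a formula like this do not need to be calibrated against cosine scores.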

func NewMemoryKeyword ¶

func NewMemoryKeyword() *MemoryKeyword

NewMemoryKeyword returns an empty in-memory keyword backend.

func (*MemoryKeyword) Delete ¶

func (m *MemoryKeyword) Delete(_ context.Context, id string) error

Delete implements KeywordBackend.

func (*MemoryKeyword) Index ¶

func (m *MemoryKeyword) Index(_ context.Context, id, text string) error

Index implements KeywordBackend. Re-indexing the same id replaces the prior entry so totals stay consistent.

func (*MemoryKeyword) Search ¶

func (m *MemoryKeyword) Search(_ context.Context, text string, top int) ([]KeywordHit, error)

Search implements KeywordBackend. Documents with zero matching terms are not returned.

type ModelMismatchError ¶

type ModelMismatchError struct {
	SnapshotModel, StoreModel string
	SnapshotDim, StoreDim     int
}

ModelMismatchError is returned when a persisted snapshot was produced by a different embedder than the one configured on the store. Mixing vectors from different models is silently catastrophic for retrieval quality, so the load is refused loudly instead.

func (*ModelMismatchError) Error ¶

func (e *ModelMismatchError) Error() string
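Callers can detect this error with errors.As and decide to rebuild the index. The sketch below uses a local stand-in type with the documented fields so it runs on its own; openIndex is a hypothetical placeholder for whatever call loads the snapshot, and the error message format is invented.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelMismatchError is a local stand-in mirroring the documented struct.
type ModelMismatchError struct {
	SnapshotModel, StoreModel string
	SnapshotDim, StoreDim     int
}

func (e *ModelMismatchError) Error() string {
	return fmt.Sprintf("snapshot built with %s (dim %d), store expects %s (dim %d)",
		e.SnapshotModel, e.SnapshotDim, e.StoreModel, e.StoreDim)
}

// openIndex is a hypothetical loader that fails with a wrapped mismatch.
func openIndex() error {
	return fmt.Errorf("open index: %w", &ModelMismatchError{
		SnapshotModel: "ollama:all-minilm", SnapshotDim: 384,
		StoreModel: "ollama:nomic-embed-text", StoreDim: 768,
	})
}

// describe shows how a caller might branch on the error type.
// errors.As matches whether or not the error has been wrapped.
func describe(err error) string {
	var mm *ModelMismatchError
	if errors.As(err, &mm) {
		return "reindex needed: snapshot model " + mm.SnapshotModel +
			", configured model " + mm.StoreModel
	}
	if err != nil {
		return "other failure: " + err.Error()
	}
	return "ok"
}

func main() {
	fmt.Println(describe(openIndex()))
	fmt.Println(describe(nil))
}
```

A typical response to a mismatch is to discard the old snapshot and re-index with the new embedder, since vectors from the two models are not comparable.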

type OllamaConfig ¶

type OllamaConfig struct {
	// BaseURL is the Ollama server root. Defaults to
	// "http://localhost:11434". Trailing slashes are trimmed.
	BaseURL string
	// Model is the embedding model name (e.g. "nomic-embed-text",
	// "mxbai-embed-large", "all-minilm"). Defaults to
	// "nomic-embed-text". Whatever you pick must be a model the server
	// has pulled.
	Model string
	// Dim is the embedding dimension. If 0, it is detected on the
	// first call by inspecting the response. Setting it explicitly
	// skips a probe at construction time and lets [FlatStore] pre-size.
	Dim int
	// Timeout is the per-request HTTP timeout. Defaults to 30s.
	Timeout time.Duration
	// Client is an optional pre-configured http.Client (for tests or
	// custom transport). When nil, a fresh client with the timeout is
	// constructed.
	Client *http.Client
}

OllamaConfig configures an OllamaEmbedder.

type OllamaEmbedder ¶

type OllamaEmbedder struct {
	// contains filtered or unexported fields
}

OllamaEmbedder is an Embedder that calls a locally running Ollama (or any compatible) server's /api/embed endpoint. Ollama itself, LM Studio, and llama.cpp's server all expose this contract.

This is the recommended swap from StubEmbedder for production use:

idx, _ := embed.Open(embed.Options{
    Embedder: embed.NewOllamaEmbedder(embed.OllamaConfig{
        Model: "nomic-embed-text",
    }),
})

Pre-flight: run `ollama serve` (it auto-starts on most installs) and `ollama pull nomic-embed-text`. The default model is 768-dim and is the standard recommendation for local semantic search.

Failure modes:

- Server unreachable → [Embed] returns the underlying network error. Callers should not blindly retry: a missing daemon is a configuration problem, not a transient one.
- Model not pulled → Ollama responds 404 with a clear message, which is surfaced verbatim so users see "pull <model> first".
- Dim mismatch with the configured FlatStore is detected later at [Store.Add] time via [errVecDim].

func NewOllamaEmbedder ¶

func NewOllamaEmbedder(cfg OllamaConfig) *OllamaEmbedder

NewOllamaEmbedder constructs an Ollama-backed embedder. It does not make any network calls until [Embed] is invoked — failures during boot are deferred to first use so misconfigurations can be retried without restarting the app.

func (*OllamaEmbedder) Dim ¶

func (e *OllamaEmbedder) Dim() int

Dim returns the embedding dimension. If the dimension was not set in the config and no embedding has been produced yet, Dim returns 0. When the store is initialised separately and needs the dimension up front, set Dim in the OllamaConfig or call Probe first.

func (*OllamaEmbedder) Embed ¶

func (e *OllamaEmbedder) Embed(ctx context.Context, texts []string) ([][]float32, error)

Embed calls /api/embed with the entire batch. Ollama performs the batching server-side, so we hand the slice over whole instead of looping client-side.

func (*OllamaEmbedder) Name ¶

func (e *OllamaEmbedder) Name() string

Name returns "ollama:<model>" — the snapshot fingerprint baked into persisted indexes uses this string, so changing models triggers a loud ModelMismatchError on reload.

func (*OllamaEmbedder) Probe ¶

func (e *OllamaEmbedder) Probe(ctx context.Context) error

Probe makes a tiny embedding call to detect Dim and confirm the server is reachable. Useful at app boot so dim-mismatch errors surface before the first user query rather than after.

type Options ¶

type Options struct {
	// Embedder produces vectors. Required.
	Embedder Embedder
	// Chunker splits documents. Defaults to a [FixedWindow] with 512-rune
	// windows and 64-rune overlap if nil.
	Chunker Chunker
	// Store holds vectors. Defaults to an in-memory [FlatStore] if nil.
	//
	// A custom Store must additionally implement Snapshot(path)/LoadSnapshot(path)
	// to be used with Path (persistence), and ChunkIDsForDoc(docID)/ChunkByID(id)/
	// AllChunks() to be used with Keyword (hybrid search). Open returns an error
	// rather than silently degrading if a capability is missing. FlatStore
	// implements all of them.
	Store Store
	// Keyword enables hybrid retrieval. When set, chunk text is
	// mirrored into this backend at Add time and consulted during
	// Query when [Query.Hybrid] is true. When nil, hybrid queries
	// silently fall back to pure vector retrieval. Use
	// [NewMemoryKeyword] for a zero-dep in-process backend or wrap any
	// [battery/search].Backend with [WrapSearchBackend].
	Keyword KeywordBackend
	// Reranker is the optional second-stage scorer. When nil,
	// [Query.Rerank] requests error out so silent quality loss is
	// impossible.
	Reranker Reranker
	// Path is the directory where the snapshot and WAL live. When
	// empty, the index runs purely in memory and [Index.Snapshot] is a
	// no-op. When set, [Open] loads any existing snapshot, replays the
	// WAL on top, and persists subsequent writes through the WAL.
	Path string
	// SnapshotEvery is the number of mutating ops after which the index
	// automatically flushes a full snapshot and truncates the WAL. 0
	// disables auto-snapshot; -1 snapshots after every op (mostly for
	// tests). The default is 1000.
	SnapshotEvery int
}

Options configures Open.

type Plugin ¶

type Plugin struct {
	// contains filtered or unexported fields
}

Plugin is the framework.Plugin adapter for the embed battery. It owns no state of its own — callers construct an Index and hand it to NewPlugin, which then registers /embed/* routes on the app's router during framework.App.Init.

Mount via:

idx, _ := embed.Open(embed.Options{Embedder: e, Path: "~/.gofastr/embed/myapp"})
app.Plugins.Register(embed.NewPlugin(idx))

func NewPlugin ¶

func NewPlugin(idx Index) *Plugin

NewPlugin returns a Plugin that mounts routes under "/embed". Use Plugin.WithPrefix to change the mount point.

func (*Plugin) Index ¶

func (p *Plugin) Index() Index

Index returns the underlying Index so other plugins or the app can perform direct calls without going through HTTP.

func (*Plugin) Init ¶

func (p *Plugin) Init(app *framework.App) error

Init implements framework.Plugin. Mounts the stdlib Handler under the configured prefix on the app's router; routing semantics match Go 1.22 ServeMux.

func (*Plugin) Name ¶

func (p *Plugin) Name() string

Name implements framework.Plugin.

func (*Plugin) WithPrefix ¶

func (p *Plugin) WithPrefix(prefix string) *Plugin

WithPrefix overrides the URL prefix. Leading slash required.

type Query ¶

type Query struct {
	Text      string  `json:"text"`
	K         int     `json:"k,omitempty"`
	Filter    Filter  `json:"filter,omitempty"`
	Hybrid    bool    `json:"hybrid,omitempty"`
	MMRLambda float64 `json:"mmr_lambda,omitempty"`
	Rerank    bool    `json:"rerank,omitempty"`
}

Query is the input to [Index.Query].

Zero-value semantics:

K=0 → 10
Hybrid=false and MMRLambda=0 → pure vector retrieval, no diversity reranking.
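For readers unfamiliar with MMR, here is a minimal sketch of greedy Maximal Marginal Relevance selection. It follows the textbook formulation, in which lambda weights relevance and (1 - lambda) weights the redundancy penalty; whether this package's MMRLambda uses that convention or the inverse is not stated in these docs, so treat the parameter meaning as an assumption.

```go
package main

import "fmt"

// dot equals cosine similarity for unit-length vectors.
func dot(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// mmr greedily picks up to k candidate indices. Each round it selects the
// candidate maximising
//
//	lambda*sim(query, c) - (1-lambda)*max over selected s of sim(c, s)
//
// so near-duplicates of already selected items are penalised.
// Negative similarities to selected items are treated as zero.
func mmr(query []float64, cands [][]float64, lambda float64, k int) []int {
	selected := []int{}
	used := make([]bool, len(cands))
	for len(selected) < k && len(selected) < len(cands) {
		best, bestScore := -1, 0.0
		for i, c := range cands {
			if used[i] {
				continue
			}
			maxSim := 0.0
			for _, j := range selected {
				if s := dot(c, cands[j]); s > maxSim {
					maxSim = s
				}
			}
			score := lambda*dot(query, c) - (1-lambda)*maxSim
			if best == -1 || score > bestScore {
				best, bestScore = i, score
			}
		}
		used[best] = true
		selected = append(selected, best)
	}
	return selected
}

func main() {
	query := []float64{1, 0}
	cands := [][]float64{
		{1, 0},     // 0: exact match
		{1, 0},     // 1: duplicate of 0
		{0.6, 0.8}, // 2: related but different direction
	}
	fmt.Println(mmr(query, cands, 0.3, 2)) // [0 2]
	fmt.Println(mmr(query, cands, 1.0, 2)) // [0 1]
}
```

With lambda = 0.3 the duplicate is skipped in favour of the less similar but novel candidate; with lambda = 1.0 the redundancy penalty vanishes and the result is plain relevance order.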

type Reranker ¶

type Reranker interface {
	Rerank(ctx context.Context, query string, hits []Hit) ([]Hit, error)
}

Reranker is an optional second-stage scorer applied to candidate hits before they are returned. It receives the original query text (not its vector) so cross-encoder rerankers can score query/chunk pairs jointly. The returned slice may be reordered or truncated; score values should reflect the reranker's confidence.

type Stats ¶

type Stats struct {
	Docs   int    `json:"docs"`
	Chunks int    `json:"chunks"`
	Dim    int    `json:"dim"`
	Model  string `json:"model"`
}

Stats describes the current state of an Index.

type Store ¶

type Store interface {
	Add(ctx context.Context, chunks []Chunk) error
	RemoveDoc(ctx context.Context, docID string) error
	// Candidates returns up to top chunks by cosine similarity to qv,
	// after applying filter. The returned slice MUST NOT be retained;
	// callers should copy if needed.
	Candidates(ctx context.Context, qv []float32, filter Filter, top int) ([]Hit, error)
	Stats() Stats
}

Store is the per-app vector store. It owns the embedding lifecycle for added documents and exposes a candidate-generation API the retrieval pipeline composes on top of.

type StubEmbedder ¶

type StubEmbedder struct {
	// contains filtered or unexported fields
}

StubEmbedder is a deterministic, dependency-free Embedder used by tests and offline development. It produces a hashed bag-of-words representation: each whitespace-separated token is hashed into one of Dim buckets with a sign derived from a second hash, and the result is L2-normalized.

It is NOT a real embedding model — there is no semantic similarity across paraphrases or synonyms — but it is fast, allocation-light, and produces high cosine similarity for documents that share tokens, which is enough to test the retrieval pipeline end-to-end.
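The hashing trick described above can be sketched as follows. This is not StubEmbedder's source: the specific FNV variants used for the bucket and the sign, the lowercasing step, and the function names are assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"strings"
)

// hashEmbed maps text to a dim-bucket signed bag-of-words vector and
// L2-normalizes it. Non-positive dims fall back to 128.
func hashEmbed(text string, dim int) []float32 {
	if dim <= 0 {
		dim = 128
	}
	vec := make([]float32, dim)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		// First hash chooses the bucket.
		h := fnv.New32a()
		h.Write([]byte(tok))
		bucket := int(h.Sum32() % uint32(dim))
		// Second, independent hash chooses the sign, which reduces the
		// bias introduced when different tokens share a bucket.
		s := fnv.New32()
		s.Write([]byte(tok))
		sign := float32(1)
		if s.Sum32()&1 == 1 {
			sign = -1
		}
		vec[bucket] += sign
	}
	var sum float64
	for _, x := range vec {
		sum += float64(x) * float64(x)
	}
	if sum > 0 {
		inv := float32(1 / math.Sqrt(sum))
		for i := range vec {
			vec[i] *= inv
		}
	}
	return vec
}

// cosine is a plain dot product, valid because inputs are unit-length.
func cosine(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

func main() {
	a := hashEmbed("auth middleware verifies sessions", 128)
	b := hashEmbed("session middleware for auth", 128)
	c := hashEmbed("database connection pool", 128)
	fmt.Println("shared tokens:", cosine(a, b))
	fmt.Println("no shared tokens:", cosine(a, c))
}
```

Texts that share tokens land in shared buckets and score higher, but "sessions" and "session" hash to unrelated buckets, which is exactly the lack of semantic similarity the documentation warns about.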

func NewStubEmbedder ¶

func NewStubEmbedder(dim int) *StubEmbedder

NewStubEmbedder returns a StubEmbedder of the given dimension. Dimensions <= 0 fall back to 128.

func (*StubEmbedder) Dim ¶

func (s *StubEmbedder) Dim() int

Dim implements Embedder.

func (*StubEmbedder) Embed ¶

func (s *StubEmbedder) Embed(_ context.Context, texts []string) ([][]float32, error)

Embed implements Embedder.

func (*StubEmbedder) Name ¶

func (s *StubEmbedder) Name() string

Name implements Embedder.

type WatchOptions ¶

type WatchOptions struct {
	// IncludeExts is the set of file extensions to index, including the
	// leading dot (".go", ".md"). Empty means all files.
	IncludeExts []string
	// ExcludeDirs is a set of directory names to skip entirely, matched
	// on the path's base name. Defaults to a sensible set if nil:
	// {".git", "node_modules", "dist", ".gofastr", "vendor"}.
	ExcludeDirs []string
	// PollInterval is how often the watcher re-scans roots after the
	// initial walk. Defaults to 2 seconds. Set to <= 0 to disable
	// polling (initial walk only).
	PollInterval time.Duration
	// MaxFileSize skips files larger than this. 0 means 1 MiB. Negative
	// means no limit.
	MaxFileSize int64
	// MetadataFunc derives chunk metadata from the absolute path of a
	// file being indexed. When nil, a default sets {"kind": "code"} for
	// .go files and {"kind": "doc"} for .md/.txt; everything else is
	// untagged.
	MetadataFunc func(absPath string) map[string]any
}

WatchOptions configures a Watcher.

type Watcher ¶

type Watcher struct {
	// contains filtered or unexported fields
}

Watcher walks one or more filesystem roots, feeds matching files into an Index as [Document]s, and (optionally) re-scans on a timer to keep the index in sync with changes.

The watcher uses polling rather than OS-level file events so it has no third-party dependency. Cost on small trees (~thousands of files) is negligible; on very large trees, swap for an fsnotify-backed implementation by replacing this file.

func NewWatcher ¶

func NewWatcher(idx Index, opts WatchOptions) *Watcher

NewWatcher constructs a Watcher. The Index is fed via [Index.Add] and [Index.Remove] as files appear, change, and disappear.

func (*Watcher) Run ¶

func (w *Watcher) Run(ctx context.Context, roots ...string) error

Run walks the provided roots once, indexes every matching file, and (unless PollInterval <= 0) loops re-scanning on every tick until ctx is canceled. Returns ctx.Err() on cancellation, or the first indexing error encountered.

func (*Watcher) ScanOnce ¶

func (w *Watcher) ScanOnce(ctx context.Context, roots ...string) error

ScanOnce walks roots once and applies any diffs against the last known state. Useful for tests and for the `gofastr embed index` one-shot CLI mode.
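The diffing step of a polling watcher reduces to comparing two scans. The sketch below is an illustration under assumptions: scans are represented as maps from path to modification time in Unix nanoseconds, and the function name diff is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// diff compares the previous scan with the current one.
// upsert lists paths that are new or whose mtime changed (to be re-added,
// relying on replace-by-doc semantics); remove lists paths that vanished.
func diff(prev, cur map[string]int64) (upsert, remove []string) {
	for path, mtime := range cur {
		if old, ok := prev[path]; !ok || old != mtime {
			upsert = append(upsert, path)
		}
	}
	for path := range prev {
		if _, ok := cur[path]; !ok {
			remove = append(remove, path)
		}
	}
	// Sort for deterministic output; map iteration order is random.
	sort.Strings(upsert)
	sort.Strings(remove)
	return upsert, remove
}

func main() {
	prev := map[string]int64{"a.go": 1, "b.md": 1, "c.go": 1}
	cur := map[string]int64{"a.go": 1, "b.md": 2, "d.go": 5}
	upsert, remove := diff(prev, cur)
	fmt.Println(upsert, remove) // [b.md d.go] [c.go]
}
```

After applying the diff, the current scan becomes the "last known state" for the next tick.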
