service

package
v0.2.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 12, 2026 License: AGPL-3.0 Imports: 17 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

View Source
var ErrInvalidInput = errors.New("invalid input")

ErrInvalidInput marks errors caused by the caller's request (missing fields, unknown tiers) as opposed to backend failures. API layers map it to 400; anything else is a server-side error.

Functions

func RecallPoolSize added in v0.0.6

func RecallPoolSize(k int) int

RecallPoolSize is the per-leg candidate pool Recall over-fetches for a final result count of k, with the default pool sizing. Exported so external pipelines (bench) that re-create recall stage-by-stage match production instead of hardcoding the constants.

Types

type AnswerInput added in v0.0.4

type AnswerInput struct {
	Namespace string
	Query     string
	// Limit caps how many recalled memories are given to the reader (default 10).
	Limit int
	Tiers []memory.Tier
}

AnswerInput is a retrieve-then-generate request.

type AnswerResult added in v0.0.4

type AnswerResult struct {
	Answer  string
	Sources []store.Scored
}

AnswerResult is the generated answer and the memories it was grounded on.

type Briefing added in v0.0.11

type Briefing struct {
	Namespace  string           `json:"namespace"`
	Facts      []*memory.Memory `json:"facts,omitempty"`      // semantic, highest-retention first
	Procedures []*memory.Memory `json:"procedures,omitempty"` // procedural, highest-retention first
	Recent     []*memory.Memory `json:"recent,omitempty"`     // episodic, newest first
	Pinned     []*memory.Memory `json:"pinned,omitempty"`     // tagged pinned, any tier
}

Briefing is a layered session-start summary of a namespace: the most durable facts and procedures, the most recent episodic activity, and pinned memories.

type ConsolidateMode

type ConsolidateMode string

ConsolidateMode selects how the opt-in LLM consolidation pipeline runs.

const (
	// ConsolidateAsync stores writes immediately and consolidates in the
	// background — writes never block on the LLM. The default.
	ConsolidateAsync ConsolidateMode = "async"
	// ConsolidateSync consolidates before returning, so a write reflects its
	// dedup/supersede outcome immediately (read-your-consolidated-writes).
	ConsolidateSync ConsolidateMode = "sync"
	// ConsolidateOff disables consolidation even when a consolidator is set.
	ConsolidateOff ConsolidateMode = "off"
)

type DedupInput added in v0.0.8

type DedupInput struct {
	// Similarity gates cluster membership. 0 falls back to the package
	// default (0.85). Negative disables the pass and Dedup returns an empty
	// report without erroring.
	Similarity float64
	// MinClusterSize is the smallest cluster acted on. 0 falls back to 2.
	MinClusterSize int
	// Tiers restricts the pass to these tiers; nil/empty means all tiers.
	Tiers []memory.Tier
	// Namespaces restricts the pass to these namespaces; nil/empty means every
	// namespace. API callers scope this to the request's namespace; only an
	// explicit all-namespaces request leaves it empty.
	Namespaces []string
	// NeighboursPerAnchor bounds the per-anchor vector-search fan-out.
	// 0 falls back to 20.
	NeighboursPerAnchor int
	// DryRun reports what would be done without tombstoning anything.
	DryRun bool
}

DedupInput configures a dedup pass invoked through the service. The zero value is valid and means "use the production defaults": 0.85 similarity, cluster size >= 2, all tiers, 20 neighbours per anchor, dry-run = false.

type ListInput

type ListInput struct {
	Namespace         string
	Tiers             []memory.Tier
	IncludeExpired    bool
	IncludeSuperseded bool
	// Limit caps the result count; <= 0 returns all matches.
	Limit int
	// AllNamespaces lists across every namespace instead of in.Namespace, with
	// Limit applied as a single global cap (newest first). Backs the admin UI's
	// "All projects" view.
	AllNamespaces bool
}

ListInput selects a slice of a namespace's memories for browsing. The zero value (besides Namespace) lists all live memories, newest store order.

type Metrics

type Metrics interface {
	// ConsolidateResult records one consolidation outcome: one of
	// "gated", "new", "update", "supersede", "noop", "error", "dropped".
	ConsolidateResult(result string)
	// ConsolidateQueueDepth reports the current async queue depth.
	ConsolidateQueueDepth(depth int)
	// RememberResult records the outcome of a Remember call: result is
	// "ok"|"error" and tier is the memory's tier.
	RememberResult(result, tier string)
	// RecallResult records the outcome of a Recall call. result is
	// "ok"|"error"; tierFilter is one of "all"|"working"|"episodic"|
	// "semantic"|"procedural"|"mixed"; hitsBucket is a pre-bucketed
	// count of returned memories: "0"|"1"|"2-5"|"6-20"|"21+".
	RecallResult(result, tierFilter, hitsBucket string)
	// ForgetResult records the outcome of a Forget call: "ok"|"not_found"|"error".
	ForgetResult(result string)
	// PromoteResult records one Promote batch: result is "ok"|"error";
	// facts is the number of semantic facts written.
	PromoteResult(result string, facts int)
	// FsckResult records one fsck pass: "ok"|"error". Counters for the
	// work done (purged, evicted, duplicate groups) are exposed separately
	// via the store's maintenance metrics.
	FsckResult(result string)
	// OpDuration observes end-to-end latency for a public operation
	// (e.g. "recall", "answer").
	OpDuration(op string, d time.Duration)
	// AnswerResult records one Answer call: "ok" or "error".
	AnswerResult(result string)
	// RerankResult records one recall rerank attempt: backend is the reranker's
	// label ("llm"|"cross_encoder"); result is "ok" or "fallback".
	RerankResult(backend, result string)
	// ReinforceResult records one best-effort recall reinforcement write:
	// "ok" or "error".
	ReinforceResult(result string)
	// DedupTombstoned records the total memories tombstoned by one one-shot
	// Service.Dedup call. Called once per call.
	DedupTombstoned(n int)
}

Metrics receives service-level events for observability. Methods must be safe for concurrent use; a nil Metrics is replaced by a no-op.

type Option

type Option func(*Service)

Option customizes a Service.

func WithAnswerer added in v0.0.4

func WithAnswerer(c llm.Completer) Option

WithAnswerer enables Answer: recall memories, then generate a grounded answer from them with this chat client.

func WithClock

func WithClock(now func() time.Time) Option

WithClock overrides the time source (tests).

func WithConsolidateMinScore

func WithConsolidateMinScore(minScore float64) Option

WithConsolidateMinScore sets the similarity gate: the LLM is only consulted when the nearest candidate scores at least minScore. 0 disables the gate.

func WithConsolidateMode

func WithConsolidateMode(m ConsolidateMode) Option

WithConsolidateMode selects async (default), sync, or off.

func WithConsolidateQueueCap added in v0.0.11

func WithConsolidateQueueCap(n int) Option

WithConsolidateQueueCap bounds the async consolidation queue. n <= 0 uses the default. Raise it for write-bursty deployments to reduce dropped jobs (a dropped job loses the dedup pass, never the memory).

func WithConsolidator

func WithConsolidator(c llm.Consolidator) Option

WithConsolidator enables the opt-in LLM consolidation pipeline.

func WithDistiller

func WithDistiller(d llm.Distiller) Option

WithDistiller enables episodic→semantic promotion via RunPromoter.

func WithIDGenerator

func WithIDGenerator(gen func() string) Option

WithIDGenerator overrides ID generation (tests).

func WithMetrics

func WithMetrics(m Metrics) Option

WithMetrics installs an observability sink for consolidation events.

func WithPromoteMinAccess

func WithPromoteMinAccess(n int) Option

WithPromoteMinAccess sets the minimum access_count for an episodic memory to be eligible for promotion.

func WithQueryPrefix

func WithQueryPrefix(p string) Option

WithQueryPrefix prepends an instruction to recall queries before embedding (e.g. the retrieval instruct expected by Qwen3-Embedding or bge models). Documents keep bare embeddings; the keyword leg keeps the raw query.

func WithRecallPool

func WithRecallPool(factor, floor int) Option

WithRecallPool overrides the per-leg candidate pool sizing (max(k*factor, floor)) for hybrid recall. Non-positive values keep the defaults. Used by the benchmark harness to sweep pool depth.

func WithReranker added in v0.0.4

func WithReranker(r rerank.Reranker, name string, topN int) Option

WithReranker enables reranking of recall candidates: after composite ranking, the top topN candidates are reordered by the reranker (an LLM or cross-encoder model), then truncated to the limit. name labels the backend in metrics. It adds one reranker call per Recall, so it is opt-in; a failed rerank falls back to the composite order. topN <= 0 keeps the default.

func WithScoreFusion

func WithScoreFusion(alpha float64) Option

WithScoreFusion sets the hybrid fusion weight: the vector leg by alpha and the keyword leg by 1-alpha (score fusion). alpha < 0 selects rank fusion (RRF). The package default is score fusion at DefaultFusionAlpha.

func WithShortTermCap

func WithShortTermCap(cap int) Option

WithShortTermCap bounds short-term memories per namespace, enforced by fsck.

func WithSyncReinforce

func WithSyncReinforce() Option

WithSyncReinforce makes recall reinforcement run synchronously (tests).

func WithTemporalTargeting added in v0.0.4

func WithTemporalTargeting(boost float64, ex search.AnchorExtractor) Option

WithTemporalTargeting enables temporal targeting in the re-ranker: when a query names a relative time, candidates dated near the referenced point are boosted by up to `boost` on the composite score. ex resolves the reference (use search.RegexAnchorExtractor{} for the no-LLM default). boost <= 0 or a nil extractor disables it.

func WithWriteDedup

func WithWriteDedup(minScore float64) Option

WithWriteDedup coalesces a fresh write into an existing same-tier memory when their vector similarity is at or above minScore, instead of storing a near-duplicate. It only acts when the LLM consolidation pipeline is not handling the write (no consolidator, a non-durable tier, or consolidation off), giving headless deployments basic corpus hygiene. 0 disables it.

type RecallInput

type RecallInput struct {
	Namespace string
	Query     string
	Tiers     []memory.Tier
	Limit     int
	// IncludeExpired / IncludeSuperseded relax the default live-only filter.
	IncludeExpired    bool
	IncludeSuperseded bool
	// AsOf, when non-zero, runs time-travel recall: it returns the facts whose
	// validity window contained AsOf (including ones since superseded), instead
	// of only currently-live memories.
	AsOf time.Time
	// Subtree expands recall to Namespace and every namespace nested under it
	// ("project" also reads "project/agent-a", "project/agent-b", ...), for the
	// multi-agent "read shared + private" pattern. Default (false) is exact scope,
	// so cross-agent recall never happens unless asked for.
	Subtree bool
}

RecallInput describes a hybrid recall query.

type RememberInput

type RememberInput struct {
	Namespace  string
	Content    string
	Tier       memory.Tier
	Summary    string
	Tags       []string
	Metadata   map[string]any
	Importance float64
	// TTL overrides the tier default. A negative TTL means "never expire".
	TTL *time.Duration
	// ID upserts an existing memory when set; otherwise a new ID is generated.
	ID string
	// Confidence overrides the seed corroboration for a durable fact (e.g. a
	// trusted import). nil uses the default seed. Ignored for short-term tiers.
	Confidence *float64
}

RememberInput describes a memory to store. Only Namespace and Content are required; Tier defaults to working and TTL to the tier default.

type Service

type Service struct {
	// contains filtered or unexported fields
}

Service wires storage and embeddings together. It is safe for concurrent use.

func New

func New(st store.Store, e embed.Embedder, opts ...Option) *Service

New builds a Service from a store and embedder.

func (*Service) Answer added in v0.0.4

func (s *Service) Answer(ctx context.Context, in AnswerInput) (AnswerResult, error)

Answer recalls memories for the query and asks the configured LLM to answer from them, grounding the response and returning the supporting memories. It requires an answerer (see WithAnswerer); recall reuses the full hybrid + rerank path, so a configured reranker applies here too.

func (*Service) Briefing added in v0.0.11

func (s *Service) Briefing(ctx context.Context, namespace string, perSection int) (Briefing, error)

Briefing builds a session-start briefing for a namespace: up to perSection memories in each of facts (semantic), procedures (procedural) and recent (episodic, newest first), plus all pinned memories. It is a cheap, query-less read for hooks to inject context when a session opens.

func (*Service) Dedup added in v0.0.8

Dedup runs a vector-cluster dedup pass: each cluster's representative (the member with the highest RetentionScore) is kept; the rest are tombstoned (SupersededBy → representative) so they're hidden from default search results. The action is reversible. With in.Namespaces empty the pass spans every namespace; callers usually scope it to one.

It's mainly a post-import cleanup tool, since exports tend to be full of restatements. The default similarity (0.85) is a paraphrase-level threshold; raise it for stricter, lower it for looser merging.

func (*Service) DeleteNamespace added in v0.0.8

func (s *Service) DeleteNamespace(ctx context.Context, namespace string) (int64, error)

DeleteNamespace removes every memory in a namespace. Returns the number of memories deleted.

func (*Service) Forget

func (s *Service) Forget(ctx context.Context, namespace, id string) error

Forget deletes a memory by ID.

func (*Service) ForgetByTag added in v0.0.11

func (s *Service) ForgetByTag(ctx context.Context, namespace, tag string) (int64, error)

ForgetByTag deletes every memory in a namespace carrying tag, including superseded and expired ones, and returns the count deleted. With the import provenance tag (import:<source>:<date>), this undoes a bulk import in one call.

func (*Service) Fsck

func (s *Service) Fsck(ctx context.Context) (maintenance.Report, error)

Fsck runs a consistency sweep: purge expired, enforce the short-term cap, and audit live memories for duplicate clusters.

func (*Service) Get

func (s *Service) Get(ctx context.Context, namespace, id string) (*memory.Memory, error)

Get returns a single memory by ID.

func (*Service) List

func (s *Service) List(ctx context.Context, in ListInput) ([]*memory.Memory, error)

List returns memories in a namespace matching the filter, without embeddings. It backs the UI memory browser and the client-derived relationship graph.

func (*Service) Namespaces

func (s *Service) Namespaces(ctx context.Context) ([]string, error)

Namespaces returns the distinct namespaces holding memories, for the UI tenant switcher.

func (*Service) Promote

func (s *Service) Promote(ctx context.Context) (int, error)

Promote distills frequently-accessed, not-yet-promoted episodic memories in each namespace into durable semantic facts (written via Remember so they get the similarity gate and consolidation dedup), then stamps the sources so they aren't reprocessed. Returns the number of facts written. No-op without a distiller.

func (*Service) Recall

func (s *Service) Recall(ctx context.Context, in RecallInput) ([]store.Scored, error)

Recall runs hybrid (vector + keyword) retrieval fused with RRF.

func (*Service) Remember

func (s *Service) Remember(ctx context.Context, in RememberInput) (*memory.Memory, error)

Remember embeds and stores a memory, returning the persisted record.

func (*Service) RunPromoter

func (s *Service) RunPromoter(ctx context.Context, interval time.Duration)

RunPromoter periodically distills frequently-accessed episodic memories into durable semantic facts until ctx is cancelled. It is a no-op without a distiller or a positive interval. Call once, typically in its own goroutine.

func (*Service) StartConsolidator

func (s *Service) StartConsolidator(ctx context.Context)

StartConsolidator runs the background consolidation worker until ctx is cancelled, then drains queued jobs within a bounded timeout. It is a no-op unless the service was built with a consolidator in async mode. Call once, typically in its own goroutine.

func (*Service) Stats

func (s *Service) Stats(ctx context.Context, namespace string) (Stats, error)

Stats computes a per-namespace overview by scanning all of its memories (including expired and superseded, so those can be counted separately).

func (*Service) StatsAll added in v0.0.10

func (s *Service) StatsAll(ctx context.Context) (Stats, error)

StatsAll merges per-namespace overviews into a single store-wide one (namespace reported as ""), backing the admin UI's "All projects" dashboard.

func (*Service) WaitBackground added in v0.0.6

func (s *Service) WaitBackground()

WaitBackground blocks until detached background goroutines (async recall reinforcement) finish. Call during shutdown, after the workers have been stopped and before closing the store.

type Stats

type Stats struct {
	Namespace     string              `json:"namespace"`
	Total         int                 `json:"total"`      // live memories (excludes expired/superseded)
	ByTier        map[memory.Tier]int `json:"by_tier"`    // live count per tier
	Expired       int                 `json:"expired"`    // past-TTL, not yet swept
	Superseded    int                 `json:"superseded"` // contradiction-tombstoned
	TotalAccesses int                 `json:"total_accesses"`
	AvgImportance float64             `json:"avg_importance"`
	LastWriteAt   *time.Time          `json:"last_write_at,omitempty"`
}

Stats summarizes a namespace for the UI dashboard. Counts are computed from a full listing, so callers should treat it as a curated-namespace overview, not a hot-path metric (Prometheus /metrics remains the source for operational counters).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL