package maintenance

v0.0.8
Published: Jun 11, 2026 | License: AGPL-3.0 | Imports: 8 | Imported by: 0

Documentation

Overview

Package maintenance keeps the store healthy: a background sweeper purges expired memories and bounds short-term capacity, and fsck additionally audits live memories for duplicate (poisoning) clusters.

Constants

This section is empty.

Variables

This section is empty.

Functions

func EnforceShortTermCap

func EnforceShortTermCap(ctx context.Context, st store.Store, cap int, now time.Time) (int, error)

EnforceShortTermCap evicts the lowest-retention short-term memories from every namespace that holds more than cap of them, and returns the number evicted. cap <= 0 disables eviction.
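The cap policy can be sketched in memory as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the `mem` type, its precomputed `retention` field, and the `evictOverCap` helper are hypothetical, and the real function works against a `store.Store`.

```go
package main

import (
	"fmt"
	"sort"
)

// mem is a hypothetical stand-in for a short-term memory record.
type mem struct {
	id        string
	namespace string
	retention float64 // assumed to be precomputed; higher means keep longer
}

// evictOverCap returns the IDs that would be evicted so that no namespace
// holds more than limit memories, dropping the lowest-retention ones first.
// limit <= 0 disables the policy, as documented for cap.
func evictOverCap(mems []mem, limit int) []string {
	if limit <= 0 {
		return nil
	}
	byNamespace := map[string][]mem{}
	for _, m := range mems {
		byNamespace[m.namespace] = append(byNamespace[m.namespace], m)
	}
	var evicted []string
	for _, group := range byNamespace {
		if len(group) <= limit {
			continue
		}
		// Sort ascending so the lowest-retention memories come first.
		sort.SliceStable(group, func(i, j int) bool {
			return group[i].retention < group[j].retention
		})
		for _, m := range group[:len(group)-limit] {
			evicted = append(evicted, m.id)
		}
	}
	sort.Strings(evicted) // map iteration is unordered; sort for stable output
	return evicted
}

func main() {
	mems := []mem{
		{"a", "ns1", 0.9}, {"b", "ns1", 0.1}, {"c", "ns1", 0.5},
		{"d", "ns2", 0.2},
	}
	// ns1 holds three memories against a limit of two, so its
	// lowest-retention member "b" is selected; ns2 is under the limit.
	fmt.Println(evictOverCap(mems, 2))
}
```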

func PurgeExpired

func PurgeExpired(ctx context.Context, st store.Store, now time.Time) (int, error)

PurgeExpired deletes memories whose TTL has passed as of now, in batches, and returns the number removed.
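A batched purge loop of this shape might look like the sketch below. The `memStore` map, the `expiredBatch` and `purgeExpired` helpers, and the batch size are all assumptions made for illustration. Treating an expiry exactly equal to now as expired is also an assumption; the documentation does not state the boundary.

```go
package main

import (
	"fmt"
	"time"
)

// memStore is a hypothetical in-memory store mapping memory ID to expiry.
// A zero expiry means the memory never expires.
type memStore map[string]time.Time

// expiredBatch returns at most limit IDs whose expiry is at or before now.
func (s memStore) expiredBatch(now time.Time, limit int) []string {
	var ids []string
	for id, exp := range s {
		if len(ids) >= limit {
			break
		}
		if !exp.IsZero() && !exp.After(now) {
			ids = append(ids, id)
		}
	}
	return ids
}

// purgeExpired deletes expired entries batchSize at a time and returns the
// total removed. It stops when a batch comes back empty.
func purgeExpired(s memStore, now time.Time, batchSize int) int {
	removed := 0
	for {
		batch := s.expiredBatch(now, batchSize)
		if len(batch) == 0 {
			return removed
		}
		for _, id := range batch {
			delete(s, id)
		}
		removed += len(batch)
	}
}

// demo purges a store holding three expired and two live memories in
// batches of two, returning the number removed and the number remaining.
func demo() (removed, remaining int) {
	now := time.Date(2026, 1, 1, 12, 0, 0, 0, time.UTC)
	s := memStore{
		"old1": now.Add(-time.Hour),
		"old2": now.Add(-time.Minute),
		"old3": now, // expiring exactly at now counts as expired in this sketch
		"live": now.Add(time.Hour),
		"keep": time.Time{}, // zero expiry: no TTL
	}
	removed = purgeExpired(s, now, 2)
	return removed, len(s)
}

func main() {
	fmt.Println(demo())
}
```

Passing now explicitly, as the real signature does, keeps the purge deterministic under test.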

Types

type ClusterAction added in v0.0.8

type ClusterAction struct {
	RepresentativeID string   `json:"representative_id"`
	TombstonedIDs    []string `json:"tombstoned_ids"`
	Size             int      `json:"size"`
}

ClusterAction describes one near-duplicate cluster found by a pass and the representative selection the pass would commit.

type DedupJob added in v0.0.8

type DedupJob struct {
	// contains filtered or unexported fields
}

DedupJob runs a vector-cluster dedup pass periodically. If the job was built with interval <= 0, Run returns immediately without running a pass.

func NewDedupJob added in v0.0.8

func NewDedupJob(st store.Store, emb embed.Embedder, m store.Metrics, log *slog.Logger,
	interval time.Duration, opts DedupOptions) *DedupJob

NewDedupJob builds a DedupJob that calls Dedup(opts) every interval. interval <= 0 disables the job.

func (*DedupJob) Run added in v0.0.8

func (d *DedupJob) Run(ctx context.Context)

Run loops on a ticker until ctx is cancelled. It runs one pass immediately and again on every tick. It is a no-op if the job was built with interval <= 0.
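The loop shape described for Run can be sketched with the standard library alone. The `runEvery` and `countPasses` helpers are hypothetical names, and the real method may order the cancellation check and the first pass differently.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery is a hypothetical helper mirroring the documented loop: it is a
// no-op when interval <= 0, otherwise it runs one pass immediately and one
// more per tick until ctx is cancelled.
func runEvery(ctx context.Context, interval time.Duration, pass func()) {
	if interval <= 0 {
		return
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	pass() // immediate first pass
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			pass()
		}
	}
}

// countPasses runs the loop under an already-cancelled context and reports
// how many passes ran before it returned.
func countPasses(interval time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	n := 0
	runEvery(ctx, interval, func() { n++ })
	return n
}

func main() {
	fmt.Println(countPasses(0))         // disabled: no pass runs
	fmt.Println(countPasses(time.Hour)) // only the immediate pass runs
}
```

In a long-lived process the loop would typically be started in its own goroutine and stopped by cancelling the context.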

type DedupOptions added in v0.0.8

type DedupOptions struct {
	// Similarity is the minimum cosine-like score (the store's vector
	// distance-to-score mapping) for two memories to join a cluster.
	// 0 falls back to defaultDedupSimilarity. Negative disables dedup and
	// the call returns an empty report.
	Similarity float64
	// MinClusterSize is the smallest cluster acted on; clusters with fewer
	// members are left alone. 0 falls back to defaultDedupMinClusterSize.
	MinClusterSize int
	// Tiers restricts the pass to these tiers; nil/empty means all tiers.
	Tiers []memory.Tier
	// Namespaces restricts the pass to these namespaces; nil/empty means every
	// namespace. Clusters never span namespaces, so scoping the pass to one
	// (the post-import case) is both cheaper and avoids touching other tenants.
	Namespaces []string
	// NeighboursPerAnchor bounds the per-anchor vector-search fan-out. Larger
	// values tighten clusters at higher vector-search cost. 0 falls back to
	// defaultDedupNeighboursAnchor.
	NeighboursPerAnchor int
	// DryRun reports what would be done without tombstoning anything.
	DryRun bool
	// Now is the instant retention scoring and expiry filtering are evaluated
	// at. Zero means time.Now().UTC().
	Now time.Time
	// Log receives progress messages; nil falls back to slog.Default().
	Log *slog.Logger
}

DedupOptions configures one dedup pass.
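The zero-value fallbacks in the field comments can be sketched as below. The three `assumed*` constants are placeholders: the real `defaultDedup*` values are unexported and not shown in this documentation. The `options` type and `resolve` helper are likewise hypothetical.

```go
package main

import "fmt"

// Placeholder defaults. These are NOT the package's real values, which are
// unexported and not shown in the documentation.
const (
	assumedSimilarity     = 0.9
	assumedMinClusterSize = 2
	assumedNeighbours     = 10
)

// options mirrors only the numeric fields of DedupOptions.
type options struct {
	Similarity          float64
	MinClusterSize      int
	NeighboursPerAnchor int
}

// resolve applies the zero-value fallbacks described in the field comments.
// enabled is false when Similarity is negative, which the field comment
// documents as the way to disable a pass.
func resolve(o options) (resolved options, enabled bool) {
	if o.Similarity < 0 {
		return o, false
	}
	if o.Similarity == 0 {
		o.Similarity = assumedSimilarity
	}
	if o.MinClusterSize == 0 {
		o.MinClusterSize = assumedMinClusterSize
	}
	if o.NeighboursPerAnchor == 0 {
		o.NeighboursPerAnchor = assumedNeighbours
	}
	return o, true
}

func main() {
	fmt.Println(resolve(options{}))                    // every field falls back
	fmt.Println(resolve(options{MinClusterSize: 4}))   // explicit value is kept
	fmt.Println(resolve(options{Similarity: -1}))      // pass disabled
}
```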

type DedupReport added in v0.0.8

type DedupReport struct {
	Namespaces    int             `json:"namespaces"`
	MemoriesSeen  int             `json:"memories_seen"`
	ClustersFound int             `json:"clusters_found"`
	Tombstoned    int             `json:"tombstoned"`
	DryRun        bool            `json:"dry_run"`
	Actions       []ClusterAction `json:"actions,omitempty"`
}

DedupReport summarizes one dedup pass.
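To show the wire format implied by the struct tags, the sketch below copies the two declarations locally and marshals them with encoding/json. The field values are made up for illustration, and `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClusterAction and DedupReport are copied from the declarations in this
// package so the example compiles on its own.
type ClusterAction struct {
	RepresentativeID string   `json:"representative_id"`
	TombstonedIDs    []string `json:"tombstoned_ids"`
	Size             int      `json:"size"`
}

type DedupReport struct {
	Namespaces    int             `json:"namespaces"`
	MemoriesSeen  int             `json:"memories_seen"`
	ClustersFound int             `json:"clusters_found"`
	Tombstoned    int             `json:"tombstoned"`
	DryRun        bool            `json:"dry_run"`
	Actions       []ClusterAction `json:"actions,omitempty"`
}

// encode renders a report as compact JSON.
func encode(r DedupReport) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// An empty report omits "actions" because of the omitempty tag.
	fmt.Println(encode(DedupReport{}))

	// Illustrative values only.
	fmt.Println(encode(DedupReport{
		Namespaces: 1, MemoriesSeen: 5, ClustersFound: 1, Tombstoned: 2,
		Actions: []ClusterAction{
			{RepresentativeID: "m1", TombstonedIDs: []string{"m2", "m3"}, Size: 3},
		},
	}))
}
```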

func Dedup added in v0.0.8

func Dedup(ctx context.Context, st store.Store, emb embed.Embedder, opts DedupOptions) (DedupReport, error)

Dedup clusters live memories per namespace by embedding similarity and tombstones the lower-scored members of each cluster, pointing them at the cluster's representative. The representative is the member with the highest RetentionScore (importance × access × recency), tie-broken by updated-at and then created-at so re-imports don't shadow the original.
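Representative selection with this tie-break order could be sketched as follows. The `candidate` type and helpers are hypothetical, and the tie-break direction (earlier timestamps win) is an assumption inferred from the stated goal that re-imports should not shadow the original.

```go
package main

import (
	"fmt"
	"time"
)

// candidate is a hypothetical cluster member with a precomputed score.
type candidate struct {
	id        string
	score     float64
	updatedAt time.Time
	createdAt time.Time
}

// better reports whether a should be preferred over b: higher score first,
// then earlier updatedAt, then earlier createdAt (assumed direction).
func better(a, b candidate) bool {
	if a.score != b.score {
		return a.score > b.score
	}
	if !a.updatedAt.Equal(b.updatedAt) {
		return a.updatedAt.Before(b.updatedAt)
	}
	return a.createdAt.Before(b.createdAt)
}

// representative picks the preferred member; ok is false for an empty cluster.
func representative(cluster []candidate) (best candidate, ok bool) {
	if len(cluster) == 0 {
		return candidate{}, false
	}
	best = cluster[0]
	for _, c := range cluster[1:] {
		if better(c, best) {
			best = c
		}
	}
	return best, true
}

// day returns midnight UTC on the given day of January 2026.
func day(n int) time.Time {
	return time.Date(2026, 1, n, 0, 0, 0, 0, time.UTC)
}

func main() {
	cluster := []candidate{
		{"reimport", 0.8, day(5), day(5)},
		{"original", 0.8, day(1), day(1)},
		{"stale", 0.3, day(1), day(1)},
	}
	rep, _ := representative(cluster)
	fmt.Println(rep.id) // the tie on score resolves to the older record
}
```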

Tombstoning is reversible: SetSuperseded excludes the duplicates from default search results but keeps them in storage. To free space, follow up with a store-level GC (not implemented here). The action is symmetric with consolidation's supersede, so the read path needs no changes.

Dedup is O(n · vector_search(n)) per namespace; with the embedder cache warm (the typical post-import case) the batched embed is near-free. For very large corpora, NeighboursPerAnchor bounds the union-find fan-out; the cluster of a memory can only ever be as wide as that fan-out allows.
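The clustering idea can be sketched with a brute-force neighbour search and a union-find structure. This is an illustration only: the real pass uses the store's vector search and its own distance-to-score mapping, whereas the sketch below uses plain cosine similarity, and the `clusters` helper and its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// find returns the root of x, halving paths as it goes.
func find(parent []int, x int) int {
	for parent[x] != x {
		parent[x] = parent[parent[x]]
		x = parent[x]
	}
	return x
}

type hit struct {
	j     int
	score float64
}

// clusters unions each anchor with at most k of its most similar neighbours
// scoring >= threshold, then returns the groups with at least minSize
// members. k <= 0 means unbounded fan-out in this sketch.
func clusters(vecs [][]float64, threshold float64, k, minSize int) [][]int {
	parent := make([]int, len(vecs))
	for i := range parent {
		parent[i] = i
	}
	for i := range vecs {
		var hits []hit
		for j := range vecs {
			if j == i {
				continue
			}
			if s := cosine(vecs[i], vecs[j]); s >= threshold {
				hits = append(hits, hit{j, s})
			}
		}
		sort.Slice(hits, func(a, b int) bool { return hits[a].score > hits[b].score })
		if k > 0 && len(hits) > k {
			hits = hits[:k]
		}
		for _, h := range hits {
			parent[find(parent, h.j)] = find(parent, i)
		}
	}
	groups := map[int][]int{}
	for i := range vecs {
		r := find(parent, i)
		groups[r] = append(groups[r], i)
	}
	var out [][]int
	for _, g := range groups {
		if len(g) >= minSize {
			out = append(out, g)
		}
	}
	sort.Slice(out, func(a, b int) bool { return out[a][0] < out[b][0] })
	return out
}

func main() {
	vecs := [][]float64{{1, 0}, {0.99, 0.1}, {0, 1}, {1, 0.05}, {0.1, 0.99}}
	fmt.Println(clusters(vecs, 0.95, 2, 2)) // [[0 1 3] [2 4]]
}
```

Because unions are transitive, a cluster can grow beyond k members even though each anchor contributes at most k edges.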

st is required. emb is required unless opts.Similarity <= 0, in which case the pass short-circuits and returns an empty report.

type Report

type Report struct {
	ExpiredPurged    int        `json:"expired_purged"`
	ShortTermEvicted int        `json:"short_term_evicted"`
	Namespaces       int        `json:"namespaces"`
	DuplicateGroups  [][]string `json:"duplicate_groups,omitempty"`
}

Report summarizes a consistency sweep.

func Fsck

func Fsck(ctx context.Context, st store.Store, cap int, now time.Time) (Report, error)

Fsck purges expired memories, enforces the short-term cap, and audits live memories for duplicate clusters (same normalized content) as a poisoning backstop. Duplicates are reported, not auto-deleted.
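The duplicate audit can be sketched as grouping by a normalized content key. The normalization used here (lowercasing and collapsing whitespace) is an assumption; the documentation does not say how content is normalized. `normalize` and `duplicateGroups` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize lowercases text and collapses runs of whitespace. This rule is
// an assumption made for the sketch.
func normalize(s string) string {
	return strings.Join(strings.Fields(strings.ToLower(s)), " ")
}

// duplicateGroups returns the IDs that share the same normalized content,
// one sorted group per duplicated key. Nothing is deleted.
func duplicateGroups(content map[string]string) [][]string {
	byKey := map[string][]string{}
	for id, text := range content {
		k := normalize(text)
		byKey[k] = append(byKey[k], id)
	}
	var groups [][]string
	for _, ids := range byKey {
		if len(ids) > 1 {
			sort.Strings(ids)
			groups = append(groups, ids)
		}
	}
	// Order groups by their first ID so the report is deterministic.
	sort.Slice(groups, func(i, j int) bool { return groups[i][0] < groups[j][0] })
	return groups
}

func main() {
	content := map[string]string{
		"m1": "The API key is rotated weekly.",
		"m2": "the  api key is rotated weekly.",
		"m3": "Deploys happen on Fridays.",
		"m4": "THE API KEY IS ROTATED WEEKLY.",
	}
	fmt.Println(duplicateGroups(content)) // [[m1 m2 m4]]
}
```

The resulting groups have the same shape as the DuplicateGroups field of Report.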

type Sweeper

type Sweeper struct {
	// contains filtered or unexported fields
}

Sweeper periodically purges expired memories and enforces the short-term cap.

func NewSweeper

func NewSweeper(st store.Store, log *slog.Logger, interval time.Duration, shortTermCap int) *Sweeper

NewSweeper builds a Sweeper that runs every interval and bounds short-term memory to shortTermCap per namespace (0 disables the cap).

func (*Sweeper) Run

func (s *Sweeper) Run(ctx context.Context)

Run sweeps on a ticker until ctx is cancelled. It runs one sweep immediately.
