maintenance

package
v0.3.4
Published: Jun 15, 2026 License: AGPL-3.0 Imports: 10 Imported by: 0

Documentation

Overview

Package maintenance keeps the store healthy: a background sweeper purges expired memories and bounds short-term capacity, and Fsck additionally audits live memories for duplicate (poisoning) clusters.

Constants

const PinnedTag = "pinned"

PinnedTag marks a memory as pinned: exempt from retro-tiering demotion and always surfaced in a session briefing.

Variables

var DefaultSplitKeys = []string{"import_source_namespace", "user_id", "agent_id", "run_id", "project"}

DefaultSplitKeys are the metadata keys, in priority order, that Split groups a namespace by. import_source_namespace is stamped by the importer when a merge discarded the source namespace, so it recovers a botched `--merge-into` import exactly; the rest cover the scope fields the mem0/agentmemory/mnemory adapters preserve (user_id/agent_id/run_id/project).

Functions

func DemoteStale added in v0.0.11

func DemoteStale(ctx context.Context, st store.Store, olderThan, now time.Time) (int, error)

DemoteStale demotes durable (semantic/procedural) memories to the episodic tier, where they pick up the episodic TTL. A memory qualifies only if it has never been recalled (AccessCount 0), was last updated before olderThan, is not highly important (importance below 0.5), and is not pinned. Unused "durable" debris (e.g. a low-quality bulk import) then ages out on its own, while anything recalled even once is reinforced and kept. DemoteStale is the mirror image of promotion. It returns the count demoted.
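The eligibility rule above can be read as a single predicate. The following is a minimal sketch under stated assumptions, not the package's implementation: the mem type, the string tier names, and the day helper are hypothetical stand-ins for the real memory and store types.

```go
package main

import (
	"fmt"
	"time"
)

// mem is a hypothetical stand-in for a stored memory; the real type lives
// elsewhere in the module and may be shaped differently.
type mem struct {
	Tier        string // assumed tier names: "semantic", "procedural", "episodic"
	AccessCount int
	Importance  float64
	Tags        []string
	UpdatedAt   time.Time
}

// day is a demo helper: midnight UTC on the n-th day of January 2026.
func day(n int) time.Time {
	return time.Date(2026, time.January, n, 0, 0, 0, 0, time.UTC)
}

// shouldDemote reports whether m meets every documented condition:
// durable tier, never recalled, importance below 0.5, not pinned,
// and last updated before olderThan.
func shouldDemote(m mem, olderThan time.Time) bool {
	if m.Tier != "semantic" && m.Tier != "procedural" {
		return false
	}
	if m.AccessCount != 0 || m.Importance >= 0.5 {
		return false
	}
	for _, t := range m.Tags {
		if t == "pinned" { // the value of PinnedTag
			return false
		}
	}
	return m.UpdatedAt.Before(olderThan)
}

func main() {
	cutoff := day(10)
	debris := mem{Tier: "semantic", Importance: 0.2, UpdatedAt: day(1)}
	recalled := mem{Tier: "semantic", AccessCount: 1, UpdatedAt: day(1)}
	pinned := mem{Tier: "procedural", Tags: []string{"pinned"}, UpdatedAt: day(1)}
	fmt.Println(shouldDemote(debris, cutoff), shouldDemote(recalled, cutoff), shouldDemote(pinned, cutoff))
}
```

A single recall or a pinned tag is enough to keep a memory durable in this sketch, matching the "anything recalled even once is kept" behaviour described above.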

func EnforceShortTermCap

func EnforceShortTermCap(ctx context.Context, st store.Store, cap int, now time.Time) (int, error)

EnforceShortTermCap evicts the lowest-retention short-term memories in each namespace that holds more than cap of them. cap <= 0 disables it. Returns the number evicted.
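To make the per-namespace behaviour concrete, here is a small in-memory sketch of cap enforcement. It assumes each memory already carries a precomputed retention score; the item type and evictOverCap function are hypothetical and do not reflect how the real store is queried.

```go
package main

import (
	"fmt"
	"sort"
)

// item is a hypothetical short-term memory with a precomputed retention score.
type item struct {
	ID        string
	Namespace string
	Retention float64
}

// evictOverCap returns the IDs that would be evicted: in every namespace
// holding more than limit items, the lowest-retention surplus is chosen.
// limit <= 0 disables eviction, as documented for cap.
func evictOverCap(items []item, limit int) []string {
	if limit <= 0 {
		return nil
	}
	byNS := map[string][]item{}
	for _, it := range items {
		byNS[it.Namespace] = append(byNS[it.Namespace], it)
	}
	var evicted []string
	for _, group := range byNS {
		if len(group) <= limit {
			continue
		}
		// Lowest retention first, so the surplus sits at the front.
		sort.Slice(group, func(i, j int) bool { return group[i].Retention < group[j].Retention })
		for _, it := range group[:len(group)-limit] {
			evicted = append(evicted, it.ID)
		}
	}
	sort.Strings(evicted) // deterministic output for the demo
	return evicted
}

func main() {
	items := []item{
		{"a", "ns1", 0.9}, {"b", "ns1", 0.1}, {"c", "ns1", 0.5},
		{"d", "ns2", 0.2},
	}
	fmt.Println(evictOverCap(items, 2)) // [b]
}
```

Note that ns2 is untouched in the demo even though its only item scores low: the cap applies per namespace, not globally.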

func ForgetByTag added in v0.0.11

func ForgetByTag(ctx context.Context, st store.Store, namespace, tag string) (int64, error)

ForgetByTag deletes every memory in a namespace carrying tag, including superseded and expired ones, and returns the count deleted. With the import provenance tag (import:<source>:<date>) this undoes a single bulk import.

func PurgeExpired

func PurgeExpired(ctx context.Context, st store.Store, now time.Time) (int, error)

PurgeExpired deletes memories whose TTL has passed as of now, in batches, and returns the number removed.
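Batched deletion of this kind is commonly written as a loop that stops once a batch comes back short. The sketch below illustrates that pattern against a hypothetical in-memory map; the expiring type, the batch size, and the choice to treat an expiry equal to now as already passed are all assumptions rather than details taken from the package.

```go
package main

import "fmt"

// expiring is a hypothetical in-memory store: id -> expiry in unix seconds.
// An expiry of 0 means the entry has no TTL.
type expiring map[string]int64

// deleteExpiredBatch removes up to limit entries whose expiry is at or
// before now (assumption: "at" counts as expired) and returns how many went.
func (s expiring) deleteExpiredBatch(now int64, limit int) int {
	n := 0
	for id, exp := range s {
		if n == limit {
			break
		}
		if exp != 0 && exp <= now {
			delete(s, id) // deleting during range is safe in Go
			n++
		}
	}
	return n
}

// purgeExpired repeats batches until one comes back short, then returns
// the total number of entries removed.
func purgeExpired(s expiring, now int64, batch int) int {
	if batch <= 0 {
		batch = 1 // guard: a non-positive batch would never terminate
	}
	total := 0
	for {
		n := s.deleteExpiredBatch(now, batch)
		total += n
		if n < batch {
			return total
		}
	}
}

func main() {
	s := expiring{"a": 10, "b": 20, "c": 0, "d": 5, "e": 30}
	fmt.Println(purgeExpired(s, 20, 2), len(s)) // 3 2
}
```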

func PurgeTombstones added in v0.0.11

func PurgeTombstones(ctx context.Context, st store.Store, olderThan time.Time) (int, error)

PurgeTombstones hard-deletes superseded (tombstoned) memories last updated before olderThan, reclaiming the storage and vector-index space they occupy. Tombstones are already excluded from default recall, so this never changes those results; it only frees space and bounds how far back time-travel (as_of) recall can reach. Returns the count deleted.

Types

type BackfillConfidenceReport added in v0.0.12

type BackfillConfidenceReport struct {
	Inspected int `json:"inspected"`
	Seeded    int `json:"seeded"`
	Skipped   int `json:"skipped"`
}

func BackfillConfidence added in v0.0.12

func BackfillConfidence(ctx context.Context, st store.Store, now time.Time) (BackfillConfidenceReport, error)

func BackfillConfidencePreview added in v0.0.12

func BackfillConfidencePreview(ctx context.Context, st store.Store, now time.Time) (BackfillConfidenceReport, error)

type ClusterAction added in v0.0.8

type ClusterAction struct {
	RepresentativeID string   `json:"representative_id"`
	TombstonedIDs    []string `json:"tombstoned_ids"`
	Size             int      `json:"size"`
}

ClusterAction describes one near-duplicate cluster found by a pass and the representative selection the pass would commit.

type DedupJob added in v0.0.8

type DedupJob struct {
	// contains filtered or unexported fields
}

DedupJob is a periodic vector-cluster dedup pass. With interval <= 0, Run is a no-op (the function returns immediately).

func NewDedupJob added in v0.0.8

func NewDedupJob(st store.Store, emb embed.Embedder, m store.Metrics, log *slog.Logger,
	interval time.Duration, opts DedupOptions) *DedupJob

NewDedupJob builds a DedupJob that calls Dedup(opts) every interval. interval <= 0 disables the job.

func (*DedupJob) Run added in v0.0.8

func (d *DedupJob) Run(ctx context.Context)

Run loops on a ticker until ctx is cancelled. It runs one pass immediately and again on every tick. It is a no-op if the job was built with interval <= 0.

type DedupOptions added in v0.0.8

type DedupOptions struct {
	// Similarity is the minimum cosine-like score (the store's vector
	// distance-to-score mapping) for two memories to join a cluster.
	// 0 falls back to defaultDedupSimilarity. Negative disables dedup and
	// the call returns an empty report.
	Similarity float64
	// MinClusterSize is the smallest cluster acted on. A pair of near-
	// duplicates below this is left alone. 0 falls back to
	// defaultDedupMinClusterSize.
	MinClusterSize int
	// Tiers restricts the pass to these tiers; nil/empty means all tiers.
	Tiers []memory.Tier
	// Namespaces restricts the pass to these namespaces; nil/empty means every
	// namespace. Clusters never span namespaces, so scoping the pass to one
	// (the post-import case) is both cheaper and avoids touching other tenants.
	Namespaces []string
	// NeighboursPerAnchor bounds the per-anchor vector-search fan-out. Larger
	// values tighten clusters at higher vector-search cost. 0 falls back to
	// defaultDedupNeighboursAnchor.
	NeighboursPerAnchor int
	// DryRun reports what would be done without tombstoning anything.
	DryRun bool
	// Now is the instant retention scoring and expiry filtering are evaluated
	// at. Zero means time.Now().UTC().
	Now time.Time
	// Log receives progress messages; nil falls back to slog.Default().
	Log *slog.Logger
}

DedupOptions configures one dedup pass.

type DedupReport added in v0.0.8

type DedupReport struct {
	Namespaces    int             `json:"namespaces"`
	MemoriesSeen  int             `json:"memories_seen"`
	ClustersFound int             `json:"clusters_found"`
	Tombstoned    int             `json:"tombstoned"`
	DryRun        bool            `json:"dry_run"`
	Actions       []ClusterAction `json:"actions,omitempty"`
}

DedupReport summarizes one dedup pass.

func Dedup added in v0.0.8

func Dedup(ctx context.Context, st store.Store, emb embed.Embedder, opts DedupOptions) (DedupReport, error)

Dedup clusters live memories per namespace by embedding similarity and tombstones the lower-scored members of each cluster, pointing them at the cluster's representative. The representative is the member with the highest RetentionScore (importance × access × recency), tie-broken by updated-at and then created-at so re-imports don't shadow the original.
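The representative rule can be sketched as a sort with a three-level comparison. This is an illustration only: the member type is hypothetical, the retention score is taken as given, and the direction of the tie-breaks (earlier timestamps win, so the original outranks a re-import) is an assumption inferred from the stated goal, not confirmed by the documentation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// member is a hypothetical cluster member with a precomputed retention score.
type member struct {
	ID        string
	Retention float64
	UpdatedAt time.Time
	CreatedAt time.Time
}

// day is a demo helper: midnight UTC on the n-th day of January 2026.
func day(n int) time.Time {
	return time.Date(2026, time.January, n, 0, 0, 0, 0, time.UTC)
}

// pickRepresentative returns the representative's ID and the IDs that would
// be tombstoned. Ordering: highest retention first, then (ASSUMPTION) earlier
// updated-at, then (ASSUMPTION) earlier created-at.
func pickRepresentative(cluster []member) (string, []string) {
	if len(cluster) == 0 {
		return "", nil
	}
	sorted := append([]member(nil), cluster...) // leave the caller's slice alone
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Retention != b.Retention {
			return a.Retention > b.Retention
		}
		if !a.UpdatedAt.Equal(b.UpdatedAt) {
			return a.UpdatedAt.Before(b.UpdatedAt)
		}
		return a.CreatedAt.Before(b.CreatedAt)
	})
	var tombstoned []string
	for _, m := range sorted[1:] {
		tombstoned = append(tombstoned, m.ID)
	}
	return sorted[0].ID, tombstoned
}

func main() {
	rep, dead := pickRepresentative([]member{
		{"reimport", 0.8, day(5), day(5)},
		{"original", 0.8, day(1), day(1)},
		{"weak", 0.2, day(9), day(9)},
	})
	fmt.Println(rep, dead) // original [reimport weak]
}
```

The two return values correspond loosely to the RepresentativeID and TombstonedIDs fields of ClusterAction.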

Tombstoning is reversible: SetSuperseded excludes the duplicates from default search results but keeps them in storage. To free that space, follow up with PurgeTombstones, which hard-deletes superseded memories last updated before a cutoff. The action is symmetric with consolidation's supersede, so the read path needs no changes.

Dedup is O(n · vector_search(n)) per namespace; with the embedder cache warm (the typical post-import case) the batched embed is near-free. For very large corpora, NeighboursPerAnchor bounds the union-find fan-out; the cluster of a memory can only ever be as wide as that fan-out allows.
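The union-find step mentioned above can be sketched independently of the vector search. The example assumes the search has already produced, for each anchor index, a neighbour list that passed the similarity threshold and was truncated to the fan-out; unionFind and clusters are hypothetical names, not the package's internals.

```go
package main

import (
	"fmt"
	"sort"
)

// unionFind is a minimal disjoint-set structure over indices 0..n-1.
type unionFind struct{ parent []int }

func newUnionFind(n int) *unionFind {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	return &unionFind{parent: p}
}

// find returns the root of x, halving the path as it walks up.
func (u *unionFind) find(x int) int {
	for u.parent[x] != x {
		u.parent[x] = u.parent[u.parent[x]]
		x = u.parent[x]
	}
	return x
}

// union merges the sets containing a and b.
func (u *unionFind) union(a, b int) {
	ra, rb := u.find(a), u.find(b)
	if ra != rb {
		u.parent[rb] = ra
	}
}

// clusters merges every anchor with its neighbours and returns the groups
// that have at least minSize members, each group in ascending index order.
func clusters(n int, neighbours map[int][]int, minSize int) [][]int {
	uf := newUnionFind(n)
	for anchor, ns := range neighbours {
		for _, nb := range ns {
			uf.union(anchor, nb)
		}
	}
	groups := map[int][]int{}
	for i := 0; i < n; i++ {
		r := uf.find(i)
		groups[r] = append(groups[r], i)
	}
	var out [][]int
	for _, g := range groups {
		if len(g) >= minSize {
			out = append(out, g)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i][0] < out[j][0] })
	return out
}

func main() {
	neighbours := map[int][]int{0: {1}, 1: {2}, 3: {4}}
	fmt.Println(clusters(6, neighbours, 3)) // [[0 1 2]]
	fmt.Println(clusters(6, neighbours, 2)) // [[0 1 2] [3 4]]
}
```

In the demo, raising the minimum size from 2 to 3 leaves the pair {3, 4} alone, which is the role MinClusterSize plays in DedupOptions.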

st is required. emb is required unless opts.Similarity <= 0, in which case the pass short-circuits to an empty report.

type RenamespaceReport added in v0.0.11

type RenamespaceReport struct {
	Moved   int            `json:"moved"`
	Targets map[string]int `json:"targets,omitempty"` // memories moved into each destination
	Skipped int            `json:"skipped"`           // left in place (no grouping key, or already in place)
	DryRun  bool           `json:"dry_run"`
}

RenamespaceReport summarizes a Move or Split.

func Move added in v0.0.11

func Move(ctx context.Context, st store.Store, fromNS, toNS string, dryRun bool) (RenamespaceReport, error)

Move relocates every memory in fromNS to toNS. A no-op when fromNS == toNS.

func Split added in v0.0.11

func Split(ctx context.Context, st store.Store, fromNS string, byKeys []string, dryRun bool) (RenamespaceReport, error)

Split regroups a namespace by metadata, moving each record to the namespace named by the first of byKeys it carries. Records with no grouping key (or whose key equals fromNS) stay put and are counted as skipped. Pass nil byKeys to use DefaultSplitKeys. This is the recovery path for a store whose imports were collapsed into one pool.
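The key-priority rule can be sketched as a lookup over a record's metadata. Several details here are assumptions: metadata is modelled as map[string]string, an empty value is treated as absent, and the search stops at the first key present even if its value equals fromNS. The targetNamespace function is hypothetical.

```go
package main

import "fmt"

// defaultSplitKeys mirrors the documented DefaultSplitKeys, in priority order.
var defaultSplitKeys = []string{"import_source_namespace", "user_id", "agent_id", "run_id", "project"}

// targetNamespace returns the destination namespace for a record and whether
// the record should move. Records with no grouping key, or whose first
// matching key already names fromNS, stay put.
func targetNamespace(meta map[string]string, fromNS string, byKeys []string) (string, bool) {
	if byKeys == nil {
		byKeys = defaultSplitKeys
	}
	for _, k := range byKeys {
		v, ok := meta[k]
		if !ok || v == "" { // ASSUMPTION: empty values count as missing
			continue
		}
		if v == fromNS {
			return "", false // already in place: counted as skipped
		}
		return v, true
	}
	return "", false // no grouping key: counted as skipped
}

func main() {
	meta := map[string]string{"import_source_namespace": "team-a", "user_id": "alice"}
	fmt.Println(targetNamespace(meta, "pool", nil)) // team-a true
	fmt.Println(targetNamespace(map[string]string{"project": "pool"}, "pool", nil))
}
```

Because import_source_namespace comes first, a record stamped by the importer returns to its original namespace even when it also carries a user_id.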

type Report

type Report struct {
	ExpiredPurged    int        `json:"expired_purged"`
	ShortTermEvicted int        `json:"short_term_evicted"`
	Namespaces       int        `json:"namespaces"`
	DuplicateGroups  [][]string `json:"duplicate_groups,omitempty"`
}

Report summarizes a consistency sweep.

func Fsck

func Fsck(ctx context.Context, st store.Store, cap int, now time.Time) (Report, error)

Fsck purges expired memories, enforces the short-term cap, and audits live memories for duplicate clusters (same normalized content) as a poisoning backstop. Duplicates are reported, not auto-deleted.
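A duplicate audit of this kind amounts to grouping IDs by a normalized form of the content and reporting groups with more than one member. The sketch below assumes normalization means lowercasing and collapsing whitespace, which the documentation does not specify; rec, normalize, and duplicateGroups are hypothetical names. The result has the same shape as the DuplicateGroups field of Report.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rec is a hypothetical live memory reduced to the fields the audit needs.
type rec struct{ ID, Content string }

// normalize lowercases and collapses whitespace.
// ASSUMPTION: the package's real normalization may differ.
func normalize(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

// duplicateGroups returns the ID groups that share normalized content.
// Nothing is deleted; the groups are only reported.
func duplicateGroups(recs []rec) [][]string {
	byContent := map[string][]string{}
	for _, r := range recs {
		k := normalize(r.Content)
		byContent[k] = append(byContent[k], r.ID)
	}
	var groups [][]string
	for _, ids := range byContent {
		if len(ids) > 1 {
			sort.Strings(ids)
			groups = append(groups, ids)
		}
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i][0] < groups[j][0] })
	return groups
}

func main() {
	recs := []rec{
		{"1", "Deploy on Friday"},
		{"2", "deploy   on friday "},
		{"3", "unrelated note"},
	}
	fmt.Println(duplicateGroups(recs)) // [[1 2]]
}
```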

type ScrubReport added in v0.2.5

type ScrubReport struct {
	LifecycleNoise  int `json:"lifecycle_noise"`
	ExactDuplicates int `json:"exact_duplicates"`
	Namespaces      int `json:"namespaces"`
}

ScrubReport summarizes a content-quality scrub. Counts are the number of memories that were (or, in a preview, would be) deleted in each category.

func Scrub added in v0.2.5

func Scrub(ctx context.Context, st store.Store, apply bool) (ScrubReport, error)

Scrub removes content-level junk that the namespace-oriented doctor fix and the embedding-similarity dedup pass both miss: session-lifecycle markers ("Session ended", "Stop checkpoint") and exact-duplicate memories (identical normalized content within a namespace, keeping the oldest). It previews when apply is false, returning the counts that would be removed without mutating the store. Live memories only — tombstones are left alone (already excluded from recall and reversible). Returns the per-category report.
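The preview/apply split and the keep-the-oldest rule can be illustrated with an in-memory pass. This is a sketch under assumptions: lifecycle markers are matched as whole normalized strings, normalization is lowercasing plus whitespace collapsing, and age is a plain integer timestamp. The note, report, and scrub names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// note is a hypothetical live memory; Created is a unix timestamp.
type note struct {
	ID, Namespace, Content string
	Created                int64
}

// report mirrors the two counters of ScrubReport that this sketch fills in.
type report struct{ LifecycleNoise, ExactDuplicates int }

func (r report) total() int { return r.LifecycleNoise + r.ExactDuplicates }

// markers lists the lifecycle strings named in the documentation.
// ASSUMPTION: they are matched against the whole normalized content.
var markers = map[string]bool{"session ended": true, "stop checkpoint": true}

func norm(s string) string {
	return strings.ToLower(strings.Join(strings.Fields(s), " "))
}

// scrub counts lifecycle noise and exact duplicates within each namespace,
// keeping the oldest copy. With apply false it returns the input untouched.
func scrub(notes []note, apply bool) ([]note, report) {
	ordered := append([]note(nil), notes...)
	sort.SliceStable(ordered, func(i, j int) bool { return ordered[i].Created < ordered[j].Created })
	seen := map[string]bool{}
	var kept []note
	var rep report
	for _, n := range ordered {
		content := norm(n.Content)
		key := n.Namespace + "\x00" + content
		switch {
		case markers[content]:
			rep.LifecycleNoise++
		case seen[key]:
			rep.ExactDuplicates++
		default:
			seen[key] = true
			kept = append(kept, n)
		}
	}
	if !apply {
		return notes, rep // preview: same counts, nothing removed
	}
	return kept, rep
}

func main() {
	notes := []note{
		{"1", "a", "Session ended", 1},
		{"2", "a", "Use tabs", 2},
		{"3", "a", "use  tabs", 3},
		{"4", "b", "Use tabs", 4},
	}
	_, preview := scrub(notes, false)
	kept, applied := scrub(notes, true)
	fmt.Println(preview.total(), applied.total(), len(kept)) // 2 2 2
}
```

Note 4 survives in the demo because duplicates are only compared within a namespace.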

func (ScrubReport) Total added in v0.2.5

func (r ScrubReport) Total() int

Total returns the number of memories removed across all categories.

type Sweeper

type Sweeper struct {
	// contains filtered or unexported fields
}

Sweeper periodically purges expired memories, enforces the short-term cap, and (optionally) garbage-collects old tombstones.

func NewSweeper

func NewSweeper(st store.Store, log *slog.Logger, cfg SweeperConfig) *Sweeper

NewSweeper builds a sweeper that runs every cfg.Interval.

func (*Sweeper) Run

func (s *Sweeper) Run(ctx context.Context)

Run sweeps on a ticker until ctx is cancelled, running one sweep immediately. It is a no-op when Interval <= 0, because time.NewTicker panics on a non-positive duration. Config validation already rejects such values, but Run guards against them as well so that a misconfigured interval cannot crash the sweeper goroutine.
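The loop shape described here (guard, immediate pass, then ticks until cancellation) is a common Go pattern, and DedupJob.Run is documented with the same behaviour. The following sketch shows the pattern with standard-library calls only; runEvery and countSweeps are hypothetical helpers, not part of the package.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runEvery calls sweep once immediately and then on every tick until ctx is
// cancelled. A non-positive interval makes it a no-op, because
// time.NewTicker panics on such a duration.
func runEvery(ctx context.Context, interval time.Duration, sweep func(context.Context)) {
	if interval <= 0 {
		return
	}
	sweep(ctx)
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			sweep(ctx)
		}
	}
}

// countSweeps is a demo helper: it runs the loop for runForMs milliseconds
// and reports how many sweeps happened.
func countSweeps(intervalMs, runForMs int) int {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(runForMs)*time.Millisecond)
	defer cancel()
	n := 0
	runEvery(ctx, time.Duration(intervalMs)*time.Millisecond, func(context.Context) { n++ })
	return n
}

func main() {
	fmt.Println(countSweeps(0, 50))    // disabled: no sweeps at all
	fmt.Println(countSweeps(1000, 20)) // only the immediate sweep fits
}
```

In a real program the loop would typically run in its own goroutine, with the caller cancelling ctx at shutdown.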

type SweeperConfig added in v0.0.11

type SweeperConfig struct {
	// Interval is how often the sweep runs.
	Interval time.Duration
	// ShortTermCap bounds working+episodic memories per namespace; the lowest-
	// retention ones over the cap are evicted. 0 disables it.
	ShortTermCap int
	// TombstoneTTL hard-deletes superseded memories last updated before now-TTL,
	// reclaiming space. 0 disables it (tombstones are kept indefinitely but stay
	// excluded from recall).
	TombstoneTTL time.Duration
	// DemoteAfter demotes never-recalled, low-importance durable memories older
	// than this to the episodic tier so unused debris ages out. 0 disables it.
	DemoteAfter time.Duration
}

SweeperConfig configures the periodic maintenance sweep.
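TombstoneTTL and DemoteAfter are durations, while PurgeTombstones and DemoteStale take an olderThan instant, so a sweep has to turn one into the other. A minimal sketch of that conversion follows, assuming the cutoff is simply now minus the duration and that negative values are treated like 0 (disabled); cutoff, day, and hours are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

// day is a demo helper: midnight UTC on the n-th day of January 2026.
func day(n int) time.Time {
	return time.Date(2026, time.January, n, 0, 0, 0, 0, time.UTC)
}

// hours is a demo helper that builds a duration of n hours.
func hours(n int) time.Duration { return time.Duration(n) * time.Hour }

// cutoff converts a TTL-style duration into an olderThan instant.
// It returns false when the step is disabled.
// ASSUMPTION: negative durations are treated the same as 0.
func cutoff(now time.Time, d time.Duration) (time.Time, bool) {
	if d <= 0 {
		return time.Time{}, false
	}
	return now.Add(-d), true
}

func main() {
	now := day(10)
	if c, ok := cutoff(now, hours(48)); ok {
		fmt.Println("purge tombstones last updated before", c.Format("2006-01-02"))
	}
	if _, ok := cutoff(now, 0); !ok {
		fmt.Println("demotion disabled")
	}
}
```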
