Documentation ¶
Overview ¶
Package maintenance keeps the store healthy: a background sweeper purges expired memories and bounds short-term capacity, and fsck additionally audits live memories for duplicate (poisoning) clusters.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func EnforceShortTermCap ¶
EnforceShortTermCap evicts the lowest-retention short-term memories in each namespace that holds more than cap of them. cap <= 0 disables it. Returns the number evicted.
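The eviction rule can be sketched in isolation. The block below is a minimal in-memory illustration, not the package's implementation: the `mem` type, the `evictOverCap` helper, and the `limit` parameter name are all hypothetical, and the real function works against a store rather than a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// mem is a hypothetical stand-in for a short-term memory record.
type mem struct {
	ID        string
	Namespace string
	Retention float64
}

// evictOverCap returns the IDs that would be evicted so that no namespace
// keeps more than limit memories. limit <= 0 disables eviction.
func evictOverCap(mems []mem, limit int) []string {
	if limit <= 0 {
		return nil
	}
	byNS := map[string][]mem{}
	for _, m := range mems {
		byNS[m.Namespace] = append(byNS[m.Namespace], m)
	}
	var evicted []string
	for _, group := range byNS {
		if len(group) <= limit {
			continue
		}
		// Lowest retention first, so the head of the slice is evicted.
		sort.Slice(group, func(i, j int) bool {
			return group[i].Retention < group[j].Retention
		})
		for _, m := range group[:len(group)-limit] {
			evicted = append(evicted, m.ID)
		}
	}
	sort.Strings(evicted) // deterministic order for the example
	return evicted
}

func main() {
	mems := []mem{
		{"a", "ns1", 0.9}, {"b", "ns1", 0.1}, {"c", "ns1", 0.5},
		{"d", "ns2", 0.2},
	}
	fmt.Println(evictOverCap(mems, 2))      // [b]
	fmt.Println(len(evictOverCap(mems, 0))) // 0
}
```

The cap applies per namespace, so `ns2` is untouched here even though its only memory has a low retention score.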
Types ¶
type ClusterAction ¶ added in v0.0.8
type ClusterAction struct {
	RepresentativeID string   `json:"representative_id"`
	TombstonedIDs    []string `json:"tombstoned_ids"`
	Size             int      `json:"size"`
}
ClusterAction describes one near-duplicate cluster found by a pass and the representative selection the pass would commit.
type DedupJob ¶ added in v0.0.8
type DedupJob struct {
	// contains filtered or unexported fields
}
DedupJob is a periodic vector-cluster dedup pass. With interval <= 0, Run is a no-op and returns immediately.
type DedupOptions ¶ added in v0.0.8
type DedupOptions struct {
	// Similarity is the minimum cosine-like score (the store's vector
	// distance-to-score mapping) for two memories to join a cluster.
	// 0 falls back to defaultDedupSimilarity. A negative value disables
	// dedup and the call returns an empty report.
	Similarity float64
	// MinClusterSize is the smallest cluster acted on. A cluster with fewer
	// members (for example, a lone pair of near-duplicates when the minimum
	// is 3) is left alone. 0 falls back to defaultDedupMinClusterSize.
	MinClusterSize int
	// Tiers restricts the pass to these tiers; nil/empty means all tiers.
	Tiers []memory.Tier
	// Namespaces restricts the pass to these namespaces; nil/empty means every
	// namespace. Clusters never span namespaces, so scoping the pass to one
	// (the post-import case) is both cheaper and avoids touching other tenants.
	Namespaces []string
	// NeighboursPerAnchor bounds the per-anchor vector-search fan-out. Larger
	// values let clusters grow wider at higher vector-search cost. 0 falls
	// back to defaultDedupNeighboursAnchor.
	NeighboursPerAnchor int
	// DryRun reports what would be done without tombstoning anything.
	DryRun bool
	// Now is the instant retention scoring and expiry filtering are evaluated
	// at. Zero means time.Now().UTC().
	Now time.Time
	// Log receives progress messages; nil falls back to slog.Default().
	Log *slog.Logger
}
DedupOptions configures one dedup pass.
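The zero-value fallbacks described in the field comments can be read as a small resolution step. The sketch below follows those comments (0 means "use the default", a negative Similarity disables the pass). The helper `resolve` is hypothetical, and the two constants are placeholder values chosen for illustration; the package's real defaults are not documented here.

```go
package main

import "fmt"

// Placeholder defaults, assumed for this example only.
const (
	assumedSimilarity     = 0.9
	assumedMinClusterSize = 2
)

// resolve applies the documented zero-value fallbacks. enabled is false
// when similarity is negative, which disables the pass.
func resolve(similarity float64, minCluster int) (sim float64, size int, enabled bool) {
	if similarity < 0 {
		return 0, 0, false
	}
	if similarity == 0 {
		similarity = assumedSimilarity
	}
	if minCluster == 0 {
		minCluster = assumedMinClusterSize
	}
	return similarity, minCluster, true
}

func main() {
	fmt.Println(resolve(0, 0))    // 0.9 2 true
	fmt.Println(resolve(0.97, 3)) // 0.97 3 true
	fmt.Println(resolve(-1, 0))   // 0 0 false
}
```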
type DedupReport ¶ added in v0.0.8
type DedupReport struct {
	Namespaces    int             `json:"namespaces"`
	MemoriesSeen  int             `json:"memories_seen"`
	ClustersFound int             `json:"clusters_found"`
	Tombstoned    int             `json:"tombstoned"`
	DryRun        bool            `json:"dry_run"`
	Actions       []ClusterAction `json:"actions,omitempty"`
}
DedupReport summarizes one dedup pass.
func Dedup ¶ added in v0.0.8
func Dedup(ctx context.Context, st store.Store, emb embed.Embedder, opts DedupOptions) (DedupReport, error)
Dedup clusters live memories per namespace by embedding similarity and tombstones the lower-scored members of each cluster, pointing them at the cluster's representative. The representative is the member with the highest RetentionScore (importance × access × recency), tie-broken by updated-at and then created-at so re-imports don't shadow the original.
Tombstoning is reversible: SetSuperseded excludes the duplicates from default search results but keeps them in storage. To free space, follow up with a store-level GC (not implemented here). The action is symmetric with consolidation's supersede, so the read path needs no changes.
Dedup is O(n · vector_search(n)) per namespace; with the embedder cache warm (the typical post-import case) the batched embed is near-free. For very large corpora, NeighboursPerAnchor bounds the union-find fan-out; the cluster of a memory can only ever be as wide as that fan-out allows.
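The interaction between the similarity threshold, the per-anchor fan-out, and union-find can be sketched on toy data. Everything here is illustrative: `cluster`, `find`, and `cosine` are hypothetical helpers, a brute-force scan stands in for the store's vector search, and plain cosine similarity stands in for the store's distance-to-score mapping.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

type hit struct {
	idx   int
	score float64
}

func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// find returns the root of x, compressing the path as it goes.
func find(parent []int, x int) int {
	for parent[x] != x {
		parent[x] = parent[parent[x]]
		x = parent[x]
	}
	return x
}

// cluster unions each anchor with at most k of its nearest neighbours whose
// score reaches threshold, then returns groups of at least minSize members.
func cluster(vecs [][]float64, threshold float64, k, minSize int) [][]int {
	if k < 0 {
		k = 0
	}
	parent := make([]int, len(vecs))
	for i := range parent {
		parent[i] = i
	}
	for i := range vecs {
		var hits []hit // brute-force stand-in for a vector search
		for j := range vecs {
			if j != i {
				hits = append(hits, hit{j, cosine(vecs[i], vecs[j])})
			}
		}
		sort.Slice(hits, func(a, b int) bool { return hits[a].score > hits[b].score })
		if len(hits) > k {
			hits = hits[:k] // the fan-out bound
		}
		for _, h := range hits {
			if h.score >= threshold {
				parent[find(parent, h.idx)] = find(parent, i)
			}
		}
	}
	groups := map[int][]int{}
	for i := range vecs {
		r := find(parent, i)
		groups[r] = append(groups[r], i)
	}
	var out [][]int
	for _, g := range groups {
		if len(g) >= minSize {
			out = append(out, g)
		}
	}
	sort.Slice(out, func(a, b int) bool { return out[a][0] < out[b][0] })
	return out
}

func main() {
	vecs := [][]float64{{1, 0}, {0.99, 0.1}, {0, 1}, {0.98, 0.15}}
	fmt.Println(cluster(vecs, 0.95, 2, 2)) // [[0 1 3]]
}
```

With k set to 0 no neighbours are examined and no clusters form, which is the degenerate case of the fan-out limiting how wide a cluster can get.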
st is required. emb is required unless opts.Similarity < 0, in which case the pass is a no-op and returns an empty report. A Similarity of 0 falls back to the default threshold, so emb is still needed.
type Report ¶
type Report struct {
	ExpiredPurged    int        `json:"expired_purged"`
	ShortTermEvicted int        `json:"short_term_evicted"`
	Namespaces       int        `json:"namespaces"`
	DuplicateGroups  [][]string `json:"duplicate_groups,omitempty"`
}
Report summarizes a consistency sweep.
type Sweeper ¶
type Sweeper struct {
	// contains filtered or unexported fields
}
Sweeper periodically purges expired memories and enforces the short-term cap.