storage

package
v0.1.0
Warning

This package is not in the latest version of its module.

Published: May 7, 2026 | License: MIT | Imports: 14 | Imported by: 0

Documentation

Overview

Package storage owns persistence for Thoughtline.

Backend: SQLite via modernc.org/sqlite (a pure-Go driver — no CGO, no platform-specific build flags, FTS5 baked in). The default database file lives at $THOUGHTLINE_HOME/thoughtline.db, falling back to a per-OS data directory when the env var is unset.

Schema sketch (firmed up in milestone M1, see docs/ARCHITECTURE.md):

CREATE TABLE memories (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    sync_id         TEXT NOT NULL UNIQUE,           -- stable across upserts
    project         TEXT NOT NULL,
    scope           TEXT NOT NULL CHECK (scope IN ('project','personal')),
    type            TEXT NOT NULL,
    topic_key       TEXT,                            -- nullable; upsert key
    title           TEXT NOT NULL,
    content         TEXT NOT NULL,
    normalized_hash TEXT NOT NULL,                   -- for conflict detection
    revision_count  INTEGER NOT NULL DEFAULT 0,
    created_at      INTEGER NOT NULL,                -- unix epoch ms
    updated_at      INTEGER NOT NULL,
    deleted_at      INTEGER,                          -- soft delete

    -- M5 reserved (embeddings layer; never written by v1):
    embedding             BLOB,
    embedding_model       TEXT,
    embedding_created_at  INTEGER
);

CREATE VIRTUAL TABLE memories_fts USING fts5(
    title, content, content=memories, content_rowid=id
);
-- triggers: keep memories_fts in sync on INSERT/UPDATE/DELETE

Search uses FTS5 + BM25 ranking by default; queries containing a "/" are matched against topic_key first (a trick borrowed from Engram). Embeddings are explicitly deferred to milestone M5 — schema is reserved so adding them is a non-breaking migration.

Status: skeleton. Schema, migrations, and queries land in M1.

Backend: SQLite via modernc.org/sqlite (a pure-Go driver — no CGO, FTS5 baked in). The default database file is opened with WAL journal mode and foreign keys on.

Public API (M1):

Open(ctx, path) → *Storage
(*Storage).Close()
(*Storage).Save(ctx, memory.Memory) → (memory.Memory, UpsertAction, error)

The single Save entry point dispatches:

  • empty TopicKey ⇒ insert a brand-new row, ActionCreated.
  • non-empty TopicKey ⇒ upsert keyed on (project, topic_key):
      • first save ⇒ ActionCreated.
      • re-save with changed content ⇒ UPDATE bumps revision_count, keeps id / sync_id / created_at, ActionUpdated.
      • re-save with identical content ⇒ ActionNoop, no row mutation.
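
The dispatch rules above can be sketched with an in-memory map standing in for the memories table. This is an illustrative model only: store, row, save, and the action names are hypothetical, the real Save works against SQLite, and comparing raw content here stands in for the normalized_hash comparison.

```go
package main

import "fmt"

// action mirrors the three outcomes described above (hypothetical names).
type action string

const (
	created action = "created"
	updated action = "updated"
	noop    action = "noop"
)

// row is a cut-down stand-in for a memories row.
type row struct {
	ID            int
	Project       string
	TopicKey      string
	Content       string
	RevisionCount int
}

// store is an in-memory stand-in for the table plus its
// (project, topic_key) lookup.
type store struct {
	nextID int
	rows   map[int]*row
	byKey  map[[2]string]int // (project, topic_key) -> id
}

func newStore() *store {
	return &store{nextID: 1, rows: map[int]*row{}, byKey: map[[2]string]int{}}
}

// save inserts when topicKey is empty, otherwise upserts keyed on
// (project, topic_key).
func (s *store) save(project, topicKey, content string) (row, action) {
	if topicKey != "" {
		if id, ok := s.byKey[[2]string{project, topicKey}]; ok {
			r := s.rows[id]
			if r.Content == content {
				return *r, noop // identical content: no row mutation
			}
			r.Content = content
			r.RevisionCount++ // the id is kept
			return *r, updated
		}
	}
	r := &row{ID: s.nextID, Project: project, TopicKey: topicKey, Content: content}
	s.nextID++
	s.rows[r.ID] = r
	if topicKey != "" {
		s.byKey[[2]string{project, topicKey}] = r.ID
	}
	return *r, created
}

func main() {
	s := newStore()
	for _, c := range []string{"v1", "v2", "v2"} {
		r, a := s.save("demo", "design/auth", c)
		fmt.Println(r.ID, r.RevisionCount, a)
	}
}
```

Saving twice with an empty topic key yields two independent rows in this model, which matches the "brand-new row" rule.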

Constants

const (
	DefaultSearchLimit = 10
	MaxSearchLimit     = 50

	// SnippetMaxChars caps the per-result preview returned to callers. The
	// underlying FTS5 snippet() is in token units; we trim to char units after.
	SnippetMaxChars = 300
)

Search defaults & caps. Exported so the server layer can document them.
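
A minimal sketch of how these constants might be applied, assuming the limit semantics documented for Recent and StatsOptions.RecentLimit (<=0 means the default, values above the cap are clamped down) and assuming the snippet is trimmed by counting runes. clampLimit and trimSnippet are hypothetical helpers, not part of the package API.

```go
package main

import "fmt"

// Local copies of the exported constants so the sketch compiles on its own.
const (
	DefaultSearchLimit = 10
	MaxSearchLimit     = 50
	SnippetMaxChars    = 300
)

// clampLimit maps a caller-supplied limit into the supported range:
// <=0 becomes the default, anything above the cap becomes the cap.
func clampLimit(limit int) int {
	switch {
	case limit <= 0:
		return DefaultSearchLimit
	case limit > MaxSearchLimit:
		return MaxSearchLimit
	default:
		return limit
	}
}

// trimSnippet cuts s to at most max characters. It counts runes rather
// than bytes so a multi-byte UTF-8 character is never split.
func trimSnippet(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max])
}

func main() {
	fmt.Println(clampLimit(0), clampLimit(25), clampLimit(500))
	fmt.Println(len([]rune(trimSnippet("a long preview of some memory content", 10))))
}
```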

Variables

var (
	ErrSessionNotFound        = errors.New("storage: session not found")
	ErrSessionAlreadyEnded    = errors.New("storage: session already ended")
	ErrSessionProjectMismatch = errors.New("storage: memory.Project does not match session.Project")
)

Session-related sentinels. Tests and the server layer match against these via errors.Is.

var ErrAlreadyPromoted = errors.New("storage: pending event already promoted")

ErrAlreadyPromoted is returned when MarkPromoted is called on a row that is already in status=promoted.

var ErrMemoryNotFound = errors.New("storage: memory not found")

ErrMemoryNotFound is returned by UpdateByID, SoftDelete, and other id-keyed mutations when no active (non-soft-deleted) row exists for the given id. Callers (and the server layer) should errors.Is against this sentinel to render a clean "not found" response.
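
As an illustration of the errors.Is pattern described above, the sketch below wraps a local copy of the sentinel and maps it to a status code. lookup, statusFor, errUnexpected, and the HTTP-style codes are hypothetical; the real server layer may render the "not found" response differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copy of the sentinel so the sketch compiles on its own.
var ErrMemoryNotFound = errors.New("storage: memory not found")

// errUnexpected stands in for any other failure (hypothetical).
var errUnexpected = errors.New("disk full")

// lookup is a hypothetical id-keyed operation that wraps the sentinel
// with extra context using %w, so errors.Is still matches it.
func lookup(id int64) error {
	return fmt.Errorf("update id %d: %w", id, ErrMemoryNotFound)
}

// statusFor maps an error to an HTTP-style status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return 200
	case errors.Is(err, ErrMemoryNotFound):
		return 404
	default:
		return 500
	}
}

func main() {
	fmt.Println(statusFor(nil), statusFor(lookup(42)), statusFor(errUnexpected))
}
```

Matching with errors.Is rather than == keeps the check working when a lower layer adds context to the error.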

var ErrPendingNotFound = errors.New("storage: pending event not found")

ErrPendingNotFound is returned when a pending_events lookup finds no row.

Functions

func DBQueryRowCount

func DBQueryRowCount(s *Storage, table string) (int, error)

DBQueryRowCount returns the row count of any table. Exported for tests only.

Types

type ListPendingParams

type ListPendingParams struct {
	Project   string
	Status    string    // "" = all statuses
	EventType string    // "" = all types
	Since     time.Time // zero = no lower bound
	Limit     int       // 0 = default (50)
	Offset    int
}

ListPendingParams carries the optional filters for ListPending.

type SearchOptions

type SearchOptions struct {
	Project  string
	Scope    string
	Type     string
	TopicKey string // GLOB pattern (e.g. "design/*"); exact when no wildcard
	Limit    int
	Offset   int
}

SearchOptions filters and paginates a Search call. Empty string filters (Project, Scope, Type, TopicKey) mean "no filter for this column".

type SearchResult

type SearchResult struct {
	ID            int64
	SyncID        string
	Project       string
	Scope         memory.Scope
	Type          memory.Type
	TopicKey      string
	Title         string
	Snippet       string
	Tags          []string
	Score         float64
	RevisionCount int
	UpdatedAt     time.Time
}

SearchResult is a single hit from Search. Snippet is at most SnippetMaxChars. Score is the BM25 rank from SQLite FTS5 — lower (more negative) means a stronger match. Topic-key shortcut hits use a synthetic score of -1000 so they always rank ahead of any FTS hit.
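
To make the score ordering concrete, here is a small sketch that sorts hits ascending by score, so more negative values come first. The hit type, rankHits, and shortcutScore are hypothetical; per the Search documentation a topic-key lookup that returns rows replaces the FTS path, so the mixed list here is only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a cut-down stand-in for SearchResult.
type hit struct {
	Title string
	Score float64
}

// shortcutScore is the synthetic score documented for topic-key hits.
const shortcutScore = -1000

// rankHits orders hits strongest-first: lower (more negative) scores
// sort ahead of higher ones. The sort is stable, so ties keep their order.
func rankHits(hits []hit) {
	sort.SliceStable(hits, func(i, j int) bool {
		return hits[i].Score < hits[j].Score
	})
}

func main() {
	hits := []hit{
		{Title: "weak FTS match", Score: -0.4},
		{Title: "strong FTS match", Score: -7.2},
		{Title: "topic-key hit", Score: shortcutScore},
	}
	rankHits(hits)
	for _, h := range hits {
		fmt.Println(h.Title, h.Score)
	}
}
```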

type Stats

type Stats struct {
	GeneratedAt time.Time

	TotalMemories   int
	DeletedMemories int

	ByType    map[memory.Type]int
	ByProject map[string]int
	ByScope   map[memory.Scope]int

	// MostRecentProjects lists up to 4 project names ordered by the most
	// recently updated memory in each project (MAX(updated_at) DESC).
	// Used by the dashboard's "Top Projects" section.
	MostRecentProjects []string

	OpenSessions   int
	ClosedSessions int

	RecentMemories []SearchResult
	RecentSessions []memory.Session
}

Stats is a snapshot of the database's contents — counts and the most recent items — designed to drive the dashboard UI and the tl_stats MCP tool. All fields reflect "active" state (deleted_at IS NULL) except DeletedMemories which is the count of soft-deleted rows.

type StatsOptions

type StatsOptions struct {
	// Project, when non-empty, scopes ALL counts to that project. Sessions
	// and memories from other projects are excluded entirely.
	Project string
	// RecentLimit caps how many recent memories AND sessions are included
	// in the snapshot. <=0 means DefaultSearchLimit (10), >MaxSearchLimit
	// is clamped to MaxSearchLimit (50).
	RecentLimit int
}

StatsOptions configures a Stats() call.

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

Storage owns a SQLite connection pool and is safe for concurrent use.

func Open

func Open(ctx context.Context, path string) (*Storage, error)

Open opens or creates the database at path and runs migrations. Use ":memory:" for an ephemeral in-process database (intended for tests).

func (*Storage) Close

func (s *Storage) Close() error

Close releases the underlying connection pool.

func (*Storage) CountPending

func (s *Storage) CountPending(ctx context.Context, project string) (int, error)

CountPending returns the number of rows in pending_events with status='pending' for the given project.

func (*Storage) DB

func (s *Storage) DB() *sql.DB

DB returns the underlying *sql.DB. Exposed for integration tests that need to seed data or inspect row state directly. Do not call in production code.

func (*Storage) EndSession

func (s *Storage) EndSession(ctx context.Context, id, summary string) (memory.Session, error)

EndSession closes the session by setting ended_at and persisting the summary. A session can only be ended once: subsequent calls return ErrSessionAlreadyEnded. Summary length is bounded by memory.MaxSessionSummaryBytes.

func (*Storage) FTSContains

func (s *Storage) FTSContains(ctx context.Context, q string) (bool, error)

FTSContains returns true if the FTS5 index has at least one row matching q. Used by tests to verify fresh content is searchable after Save.

func (*Storage) FTSCount

func (s *Storage) FTSCount(ctx context.Context) (int, error)

FTSCount returns the row count of the memories_fts virtual table. Used by tests to verify the FTS5 sync triggers fired correctly.

func (*Storage) GetByID

func (s *Storage) GetByID(ctx context.Context, id int64) (memory.Memory, error)

GetByID is exported for tests — production callers use Save.

func (*Storage) GetPendingByID

func (s *Storage) GetPendingByID(ctx context.Context, id int64) (pending.Event, error)

GetPendingByID retrieves a single pending event by its integer ID, including the full payload. Returns ErrPendingNotFound if no row exists.

func (*Storage) GetSession

func (s *Storage) GetSession(ctx context.Context, id string) (memory.Session, error)

GetSession fetches a single session by id. Returns ErrSessionNotFound when no row matches.

func (*Storage) InsertPending

func (s *Storage) InsertPending(ctx context.Context, ev pending.Event) (UpsertAction, error)

InsertPending inserts a new row into pending_events using INSERT OR IGNORE for idempotency. Returns ActionCreated on a new row, ActionNoop on a duplicate (project, event_hash).
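
The idempotency rule can be modelled in memory as a set keyed on (project, event_hash). This is a sketch of the INSERT OR IGNORE behaviour only; pendingKey, pendingStore, insert, and the string return values are hypothetical stand-ins for the real table and UpsertAction.

```go
package main

import "fmt"

// pendingKey models the uniqueness key described above.
type pendingKey struct {
	Project   string
	EventHash string
}

// pendingStore is an in-memory stand-in for the pending_events table.
type pendingStore struct {
	seen map[pendingKey]bool
}

func newPendingStore() *pendingStore {
	return &pendingStore{seen: map[pendingKey]bool{}}
}

// insert records the event once; a duplicate key is ignored and
// reported as "noop" instead of raising an error.
func (s *pendingStore) insert(project, eventHash string) string {
	k := pendingKey{Project: project, EventHash: eventHash}
	if s.seen[k] {
		return "noop"
	}
	s.seen[k] = true
	return "created"
}

func main() {
	s := newPendingStore()
	fmt.Println(s.insert("demo", "abc123"))
	fmt.Println(s.insert("demo", "abc123"))
	fmt.Println(s.insert("other", "abc123"))
}
```

The same hash under a different project is treated as a new event in this model, because the key includes the project.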

func (*Storage) ListPending

func (s *Storage) ListPending(ctx context.Context, p ListPendingParams) ([]pending.Event, error)

ListPending returns pending events filtered by the given params, ordered by captured_at DESC.

func (*Storage) MarkPromoted

func (s *Storage) MarkPromoted(ctx context.Context, pendingID, memoryID int64) error

MarkPromoted transitions a pending event to promoted status and records the memory ID that was created. Returns ErrAlreadyPromoted if the row is already in promoted state.

func (*Storage) Recent

func (s *Storage) Recent(ctx context.Context, project string, limit int) ([]SearchResult, error)

Recent returns the most recently updated active memories for a project, ordered by updated_at DESC. Soft-deleted rows are excluded. The returned SearchResult mirrors what tl_search yields, except Score is always 0 (recency is not a relevance signal) and Snippet is the content prefix truncated to SnippetMaxChars (no FTS5 match centering).

project must be non-empty: passing "" would silently return rows from every project, which is a security/data-leak shape we refuse to support. Callers that genuinely want cross-project recency should call Recent once per project.

limit follows the same semantics as Search: <=0 means DefaultSearchLimit, values above MaxSearchLimit are clamped down.

func (*Storage) RecentAll

func (s *Storage) RecentAll(ctx context.Context, project string, limit int) ([]SearchResult, error)

RecentAll returns the most recently updated active memories. When project is "" it returns hits from every project (scope-aware via FTS metadata). This is the cross-project sibling of Recent, intended for the dashboard Browse tab where the user wants a global feed.

func (*Storage) RecentSessions

func (s *Storage) RecentSessions(ctx context.Context, project string, limit int) ([]memory.Session, error)

RecentSessions returns the most recently STARTED sessions for the given project, ordered by started_at DESC. Mirrors Recent() for memories: project is required (an empty value is refused), and limit follows the same semantics (<=0 means DefaultSearchLimit, values above MaxSearchLimit are clamped down).

func (*Storage) Save

func (s *Storage) Save(ctx context.Context, m memory.Memory) (memory.Memory, UpsertAction, error)

Save persists m and returns the stored Memory (with id, sync_id, timestamps populated) along with the action performed. It assumes m has already passed memory.Validate; it does NOT re-validate the memory shape itself, but it DOES enforce the cross-table invariants that domain validation can't see — namely that an attached SessionID exists and belongs to the same project.

func (*Storage) Search

func (s *Storage) Search(ctx context.Context, query string, opts SearchOptions) ([]SearchResult, error)

Search returns memories matching the query, ranked by BM25 (or by recency for topic-key shortcut hits). Filters are applied as exact-match SQL WHERE predicates, except TopicKey which is a GLOB pattern.

Behaviour summary:

  • If query contains "/", a topic_key GLOB lookup runs first. If it returns rows, those rows are the result set (FTS does not run). If it returns zero rows, we fall through to the FTS path. This is the short-circuit behaviour documented in PROGRESS.md (M2 Q2).
  • Otherwise the FTS5 path runs, with each query token wrapped in quotes to neutralize FTS5 operator characters in user input.
  • Soft-deleted rows (deleted_at IS NOT NULL) are never returned.
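
The query handling summarized above might look roughly like the sketch below. quoteFTSQuery and usesTopicKeyShortcut are hypothetical helpers; the sketch assumes tokens are split on whitespace and relies on the FTS5 rule that a double quote inside a quoted string is escaped by doubling it.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteFTSQuery wraps each whitespace-separated token in double quotes
// so FTS5 treats it as a plain string instead of query syntax
// (AND, OR, NOT, *, :, and so on). Embedded quotes are doubled.
func quoteFTSQuery(q string) string {
	fields := strings.Fields(q)
	quoted := make([]string, 0, len(fields))
	for _, f := range fields {
		quoted = append(quoted, `"`+strings.ReplaceAll(f, `"`, `""`)+`"`)
	}
	return strings.Join(quoted, " ")
}

// usesTopicKeyShortcut reports whether the query would try the
// topic_key GLOB lookup before falling through to FTS.
func usesTopicKeyShortcut(q string) bool {
	return strings.Contains(q, "/")
}

func main() {
	fmt.Println(quoteFTSQuery(`auth AND token*`))
	fmt.Println(usesTopicKeyShortcut("design/auth"))
}
```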

func (*Storage) SetClock

func (s *Storage) SetClock(now func() time.Time)

SetClock replaces the storage's wall clock. Tests use this to make timestamps deterministic.

func (*Storage) SoftDelete

func (s *Storage) SoftDelete(ctx context.Context, id int64) error

SoftDelete marks the memory with the given id as deleted by setting deleted_at to the current clock time. Subsequent Search, Recent, and UpdateByID calls treat the row as not found.

The unique index on (project, topic_key) is partial — it only covers rows where deleted_at IS NULL — so soft-deleting a row frees its topic_key for a fresh Save in the same project.

Returns ErrMemoryNotFound if no active row matches id (either the id is unknown, or the row is already soft-deleted). The contract is symmetric with UpdateByID: only "active" rows are addressable.

FTS5 sync: the memories_au trigger fires on the underlying UPDATE and will re-index the row's title/content. Search-level filtering on deleted_at IS NULL is what actually hides the row from results — the FTS index itself is left populated, which is fine because there is no public API that reads memories_fts without joining back to memories.
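
The interaction between soft delete and the partial unique index can be modelled in memory: the index only tracks active rows, so deleting a row releases its key. table, mem, insert, softDelete, and errNotFound are hypothetical, and insert models the raw uniqueness constraint rather than Save's upsert path.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("memory not found")

// mem is a cut-down stand-in for a memories row.
type mem struct {
	Project  string
	TopicKey string
	Deleted  bool
}

// table keeps every row, plus an index over active rows only,
// which is what the partial unique index provides.
type table struct {
	nextID int
	rows   map[int]*mem
	active map[[2]string]int // (project, topic_key) -> id, active rows only
}

func newTable() *table {
	return &table{rows: map[int]*mem{}, active: map[[2]string]int{}}
}

// insert fails if an active row already holds (project, key).
func (t *table) insert(project, key string) (int, error) {
	k := [2]string{project, key}
	if _, taken := t.active[k]; taken {
		return 0, fmt.Errorf("topic_key %q already active in %q", key, project)
	}
	t.nextID++
	t.rows[t.nextID] = &mem{Project: project, TopicKey: key}
	t.active[k] = t.nextID
	return t.nextID, nil
}

// softDelete marks the row deleted and frees its key. Unknown or
// already-deleted ids both report errNotFound.
func (t *table) softDelete(id int) error {
	m, ok := t.rows[id]
	if !ok || m.Deleted {
		return errNotFound
	}
	m.Deleted = true
	delete(t.active, [2]string{m.Project, m.TopicKey})
	return nil
}

func main() {
	t := newTable()
	id, _ := t.insert("demo", "design/auth")
	fmt.Println(t.softDelete(id))
	fmt.Println(t.softDelete(id))
	fmt.Println(t.insert("demo", "design/auth"))
}
```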

func (*Storage) StartSession

func (s *Storage) StartSession(ctx context.Context, project, agentLabel string) (memory.Session, error)

StartSession creates a fresh open session for the given project. agentLabel is optional (empty string skips it). Returns the persisted Session with its UUIDv7 id and started_at populated.

func (*Storage) Stats

func (s *Storage) Stats(ctx context.Context, opts StatsOptions) (Stats, error)

Stats returns a snapshot of the database for dashboards. Errors propagate from the underlying queries; partial Stats are never returned.

func (*Storage) SweepPending

func (s *Storage) SweepPending(ctx context.Context, retentionDur, hardDeleteDur time.Duration, now time.Time) (SweepResult, error)

SweepPending runs two operations in sequence using the provided 'now' as the reference clock (so tests can inject a fixed time):

  1. Archive: pending rows with captured_at older than retentionDur → status=archived
  2. Delete: archived rows with archived_at older than hardDeleteDur → DELETE

Promoted rows are never touched.
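
The two phases can be sketched over an in-memory slice. This is a model of the documented behaviour, not the SQL the package runs: event, sweep, demoEvents, demoNow, and day are hypothetical, and setting archived_at to now during the archive phase is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// demoNow is a fixed reference clock so the sketch is deterministic.
var demoNow = time.Date(2026, time.May, 7, 12, 0, 0, 0, time.UTC)

// event is a cut-down stand-in for a pending_events row.
type event struct {
	Status     string // "pending", "archived", or "promoted"
	CapturedAt time.Time
	ArchivedAt time.Time
}

func demoEvents(now time.Time) []event {
	return []event{
		{Status: "pending", CapturedAt: now.Add(-40 * day)},
		{Status: "pending", CapturedAt: now.Add(-1 * day)},
		{Status: "archived", CapturedAt: now.Add(-200 * day), ArchivedAt: now.Add(-100 * day)},
		{Status: "promoted", CapturedAt: now.Add(-400 * day)},
	}
}

// sweep archives stale pending events, then drops long-archived ones.
// It updates statuses in the input slice and returns the survivors.
func sweep(events []event, retention, hardDelete time.Duration, now time.Time) (kept []event, archived, deleted int) {
	// Phase 1: pending rows captured before now-retention become archived.
	for i := range events {
		e := &events[i]
		if e.Status == "pending" && e.CapturedAt.Before(now.Add(-retention)) {
			e.Status = "archived"
			e.ArchivedAt = now // assumption: archive time is the sweep time
			archived++
		}
	}
	// Phase 2: archived rows older than now-hardDelete are removed.
	for _, e := range events {
		if e.Status == "archived" && e.ArchivedAt.Before(now.Add(-hardDelete)) {
			deleted++
			continue
		}
		kept = append(kept, e) // promoted rows always land here
	}
	return kept, archived, deleted
}

func main() {
	kept, archived, deleted := sweep(demoEvents(demoNow), 30*day, 90*day, demoNow)
	fmt.Println(len(kept), archived, deleted)
}
```

Because a row archived in phase 1 gets the sweep time as its archive time, it is not hard-deleted in the same pass under this assumption.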

func (*Storage) TopTags

func (s *Storage) TopTags(ctx context.Context, project string, limit int) ([]TagCount, error)

TopTags returns the most-used tags across active memories, optionally scoped to a project. limit <= 0 returns the default cap (20). Tags are stored as JSON arrays in the memories.tags column; SQLite does not have native JSON aggregation in the modernc driver, so we decode in Go.

Soft-deleted memories are excluded.
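
A minimal sketch of the decode-in-Go approach, assuming each input string is the JSON array stored in memories.tags for one active memory. topTags is a hypothetical helper; the tie-break (alphabetical by tag) and the handling of an empty column value are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// TagCount mirrors the exported result type.
type TagCount struct {
	Tag   string
	Count int
}

// topTags decodes each JSON-array value, tallies tag usage, and returns
// the most-used tags first. limit <= 0 falls back to the default cap of 20.
func topTags(rawTags []string, limit int) ([]TagCount, error) {
	if limit <= 0 {
		limit = 20
	}
	counts := map[string]int{}
	for _, raw := range rawTags {
		if raw == "" {
			continue // assumption: an empty column means "no tags"
		}
		var tags []string
		if err := json.Unmarshal([]byte(raw), &tags); err != nil {
			return nil, fmt.Errorf("decode tags %q: %w", raw, err)
		}
		for _, t := range tags {
			counts[t]++
		}
	}
	out := make([]TagCount, 0, len(counts))
	for t, c := range counts {
		out = append(out, TagCount{Tag: t, Count: c})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Tag < out[j].Tag // assumed tie-break
	})
	if len(out) > limit {
		out = out[:limit]
	}
	return out, nil
}

func main() {
	tags, err := topTags([]string{`["go","sqlite"]`, `["go"]`, `[]`}, 5)
	fmt.Println(tags, err)
}
```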

func (*Storage) UpdateByID

func (s *Storage) UpdateByID(ctx context.Context, id int64, patch UpdatePatch) (memory.Memory, error)

UpdateByID applies a partial mutation to the memory with the given id.

Behaviour:

  • If the row does not exist or is soft-deleted, returns ErrMemoryNotFound.
  • Empty patch (all nil) is a true noop: returns the current row unchanged without bumping revision_count or updated_at.
  • The merged Memory is re-validated against memory.Validate; any rule violation is returned verbatim (callers can errors.Is against the specific memory.Err* sentinels).
  • On real change: bumps revision_count, refreshes updated_at, recomputes normalized_hash, preserves id / sync_id / created_at / type / topic_key / project / scope.
  • FTS5 stays in sync via the existing memories_au trigger (eviction + re-insert with new title/content).

type SweepResult

type SweepResult struct {
	Archived int64
	Deleted  int64
}

SweepResult holds the counts from a SweepPending call.

type TagCount

type TagCount struct {
	Tag   string
	Count int
}

TagCount is one row of the TopTags result.

type UpdatePatch

type UpdatePatch struct {
	Title   *string
	Content *string
	Tags    *[]string
}

UpdatePatch describes a partial mutation applied by UpdateByID. Pointer fields distinguish "leave unchanged" (nil) from "set to this value" (non-nil — including a pointer to the zero value, which would for example clear the tag list when set to &[]string{}).

Only mutable fields are exposed: Title, Content, Tags. Type, TopicKey, Project, and Scope are identity-defining and cannot be changed by UpdateByID; when those need to change, the caller (typically the AI agent) is expected to delete the memory and save a new one.

func (UpdatePatch) IsEmpty

func (p UpdatePatch) IsEmpty() bool

IsEmpty reports whether the patch carries no changes.
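
The nil-versus-set distinction can be illustrated with a local copy of the struct. The IsEmpty body, the doc type, apply, and the generic ptr helper are hypothetical; the sketch only shows how pointer fields separate "leave unchanged" from "set to this value", including clearing the tag list.

```go
package main

import "fmt"

// UpdatePatch mirrors the exported struct.
type UpdatePatch struct {
	Title   *string
	Content *string
	Tags    *[]string
}

// IsEmpty is an assumed implementation: no field is set.
func (p UpdatePatch) IsEmpty() bool {
	return p.Title == nil && p.Content == nil && p.Tags == nil
}

// doc is a cut-down stand-in for memory.Memory.
type doc struct {
	Title   string
	Content string
	Tags    []string
}

// apply copies only the fields the patch actually sets.
func apply(d doc, p UpdatePatch) doc {
	if p.Title != nil {
		d.Title = *p.Title
	}
	if p.Content != nil {
		d.Content = *p.Content
	}
	if p.Tags != nil {
		// Copy so the result does not alias the patch's slice.
		d.Tags = append([]string(nil), (*p.Tags)...)
	}
	return d
}

// ptr returns a pointer to v (requires Go 1.18+ for generics).
func ptr[T any](v T) *T { return &v }

func main() {
	d := doc{Title: "old", Content: "body", Tags: []string{"a", "b"}}
	p := UpdatePatch{Title: ptr("new"), Tags: ptr([]string{})}
	fmt.Println(p.IsEmpty(), apply(d, p))
}
```

Here Content is left untouched because its pointer is nil, while Tags is cleared because it points at an empty slice.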

type UpsertAction

type UpsertAction int

UpsertAction reports what Save did with the row.

const (
	ActionCreated UpsertAction = iota
	ActionUpdated
	ActionNoop
)

func (UpsertAction) String

func (a UpsertAction) String() string
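
The String method carries no description above. A plausible shape is sketched below; the label strings ("created", "updated", "noop") and the fallback format are assumptions, not the package's confirmed output.

```go
package main

import "fmt"

// UpsertAction mirrors the exported type and constants.
type UpsertAction int

const (
	ActionCreated UpsertAction = iota
	ActionUpdated
	ActionNoop
)

// String returns a human-readable label (assumed values) and falls
// back to a numeric form for anything outside the known constants.
func (a UpsertAction) String() string {
	switch a {
	case ActionCreated:
		return "created"
	case ActionUpdated:
		return "updated"
	case ActionNoop:
		return "noop"
	default:
		return fmt.Sprintf("UpsertAction(%d)", int(a))
	}
}

func main() {
	// fmt uses the String method because UpsertAction implements fmt.Stringer.
	fmt.Println(ActionCreated, ActionUpdated, ActionNoop, UpsertAction(7))
}
```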
