migrate

package
v0.1.0 Latest
Published: May 7, 2026 License: MIT Imports: 13 Imported by: 0

README

cmd/migrate — Engram → Thoughtline migration tool

This binary reads your existing Engram memory database and copies every active observation into Thoughtline. It is a one-shot, idempotent migration tool — safe to run multiple times, safe to interrupt and re-run.


What this does

migrate opens your Engram database read-only, reads every observation that has not been soft-deleted, maps the column schema and type vocabulary to Thoughtline's format, and writes each row through storage.Save() — the same path the MCP server uses. This means FTS5 full-text indexing, normalized content hashing, and all validation rules fire exactly as they would for a normal tl_save call.

Your Engram sync_id values are preserved 1:1 so you can cross-reference migrated memories by their original ID.


Before you start

  1. Back up your Engram database:

    Compress-Archive -Path "$env:USERPROFILE\.engram\*" -DestinationPath "$env:USERPROFILE\.engram-backup-$(Get-Date -Format 'yyyy-MM-dd').zip" -Force
    

    Verify the zip exists and is non-zero before proceeding. The migration never writes to Engram, but having a backup is non-negotiable.

  2. Verify both databases are reachable:

    • Engram: $env:USERPROFILE\.engram\engram.db (default)
    • Thoughtline: $env:LOCALAPPDATA\thoughtline\thoughtline.db (default)
  3. Build the binary (see next section).


How to run

Step 1 — Build

# From the Thoughtline repo root:
go build -o migrate.exe ./cmd/migrate
Step 2 — Dry run first

Always run with --dry-run before writing anything. This reads and maps every row without touching the destination:

./migrate.exe --dry-run --verbose

Example output:

migrate: DRY RUN — no writes will be made
migrate: source=C:\Users\..\.engram\engram.db dest=C:\...\thoughtline.db log=...

migrate summary:
  migrated:                 0
  skipped (already exists): 0
  skipped (topic collision):0
  failed:                   0
  truncations:              0
  total processed:          291
  duration:                 42ms

Check total processed matches your expected Engram row count (soft-deleted rows are excluded automatically).

Step 3 — Run for real

If the dry run shows failed: 0 (or an acceptable number), run without --dry-run:

./migrate.exe

The tool prints a summary to stdout and writes a detailed per-row log to %LOCALAPPDATA%\thoughtline\migrate-YYYY-MM-DD-HHMMSS.log.

Step 4 — Verify

Open thoughtline ui and check the Stats panel. The new total should equal:

pre-migration Thoughtline count + migrate summary "migrated" count

Understanding the output

  • migrated: Rows successfully written to Thoughtline
  • skipped (already exists): Rows whose sync_id was already in Thoughtline; safe on re-runs
  • skipped (topic collision): Rows where (project, topic_key) already exists in Thoughtline as a native row; skipped to protect your data
  • failed: Rows that could not be migrated (oversized content, corrupt timestamps). Check the log file.
  • truncations: Rows whose title was > 200 characters; the title was truncated and the row was still migrated
  • total processed: Active (non-deleted) Engram rows seen by the migrator

Type mapping table

Engram uses a general-purpose type vocabulary. Thoughtline uses a gamedev-specific one. The mapping is deterministic:

bugfix       → bugfix          (no tag)
preference   → preference      (no tag; scope forced to personal)
decision     → decision        (no tag)
architecture → architecture    (no tag)
pattern      → convention      + origin-type:pattern
config       → convention      + origin-type:config
discovery    → convention      + origin-type:discovery
manual       → convention      + origin-type:manual
(any other)  → convention      + origin-type:<original>

The origin-type:* tag lets you find coerced rows later and re-classify them if you want. You can search for them in Thoughtline with tl_search filtering by tag.


What to do on errors

failed: N is not zero

Open the log file at %LOCALAPPDATA%\thoughtline\migrate-*.log and look for action=error lines. Each has a reason= field explaining what went wrong.

Common causes:

  • Oversized content (content too large): The row has more than 64 KiB of content. Thoughtline enforces this limit. To migrate the row manually, trim the content and use tl_save directly.
  • Corrupt timestamp (parse timestamp): The created_at or updated_at field isn't valid ISO 8601. This is rare in Engram databases but possible in very old rows.
  • Unknown scope (unknown scope): The row has a scope value other than project or personal. Inspect the raw row in Engram and fix before re-running.

skipped (topic collision): N is unexpectedly high

This means Thoughtline already has rows with the same (project, topic_key) as some Engram rows. The migrator skips those rows to protect your existing Thoughtline data. If you want to overwrite them, soft-delete the existing Thoughtline rows first via tl_delete, then re-run the migrator.

Rollback

If you need to undo the migration:

  1. In Thoughtline, migrated rows have sync_id values that match Engram's original sync_ids (formatted like obs-xxxxxxxx). You can find and soft-delete them manually or write a short script.
  2. Your Engram database is unaffected — the migrator never writes to it.
  3. Your .zip backup lets you restore Engram to pre-migration state if needed.
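The sync_id lookup in step 1 can be scripted from the migration log rather than the database, since every created row is recorded there. A minimal Go sketch, assuming only the key=value log format this README documents; createdSyncIDs is a hypothetical helper, not part of the migrator, and its naive whitespace split does not handle quoted values containing spaces:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// createdSyncIDs scans structured log lines (one key=value event per line)
// and returns the sync_id of every row logged with action=created.
// Note: splits on whitespace, so quoted values with spaces are not handled.
func createdSyncIDs(log string) []string {
	var ids []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		fields := map[string]string{}
		for _, tok := range strings.Fields(sc.Text()) {
			if k, v, ok := strings.Cut(tok, "="); ok {
				fields[k] = v
			}
		}
		if fields["action"] == "created" {
			ids = append(ids, fields["sync_id"])
		}
	}
	return ids
}

func main() {
	sample := `level=INFO msg=row_result sync_id=obs-abc123 action=created
level=INFO msg=row_result sync_id=obs-def456 action=skipped-duplicate reason=sync_id_already_exists`
	fmt.Println(createdSyncIDs(sample)) // [obs-abc123]
}
```

The resulting IDs can then be fed to tl_delete one at a time.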

Re-running safely

The migrator is idempotent. If you run it twice:

  • Rows that were already migrated appear as skipped (already exists) — they are identified by sync_id.
  • No data is overwritten or duplicated.
  • Rows that errored on the first run will be retried on the second run (they were never written, so they have no sync_id in the destination).

It is safe to interrupt the migrator mid-run and restart it.


Log file location

Every run writes a structured log to:

%LOCALAPPDATA%\thoughtline\migrate-YYYY-MM-DD-HHMMSS.log

Format: one line per event, key=value pairs:

level=INFO msg=row_result sync_id=obs-abc123 action=created
level=WARN msg=title_truncated sync_id=obs-xyz789 original="215 runes"
level=INFO msg=row_result sync_id=obs-def456 action=skipped-duplicate reason=sync_id_already_exists

This is your audit trail. Keep it until you have verified the migration is correct.

Documentation

Overview

Package migrate implements the one-shot migration from Engram's SQLite database to Thoughtline's storage layer. It is intentionally a standalone binary (cmd/migrate/main.go) so it can be built, audited, and discarded without touching the MCP server's public API surface.

Public types in this file are the shared vocabulary used by reader.go, mapper.go, writer.go, and main.go. No logic lives here — types only.

Functions

func MapRow

func MapRow(o EngramObservation, logger Logger) (memory.Memory, error)

MapRow applies all column and type mappings, returning a memory.Memory ready for storage.Save(). It enforces the following transformations:

  • project is lowercased (Engram stores mixed case in some old rows)
  • preference scope is forced to personal regardless of source
  • session_id is always cleared (Engram session IDs are not UUIDv7)
  • NormalizedHash is left zero (storage.Save() recomputes it)
  • title > 200 runes is truncated at rune boundary; logger receives one WARN
  • content > MaxContentBytes causes MapRow to return an error
  • invalid topic_key (fails Thoughtline's regex) is silently cleared
  • origin-type:<x> tag is merged into Tags for coerced types
  • bogus scope value returns error
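The title rule above is worth illustrating in isolation, because truncating at a byte offset instead of a rune boundary would corrupt multi-byte characters. A minimal Go sketch; truncateTitle is a hypothetical helper, not the package's actual implementation:

```go
package main

import "fmt"

// truncateTitle cuts s to at most max runes, never splitting a multi-byte
// character. It returns the (possibly shortened) title and whether it was cut.
func truncateTitle(s string, max int) (string, bool) {
	runes := []rune(s)
	if len(runes) <= max {
		return s, false
	}
	return string(runes[:max]), true
}

func main() {
	t, cut := truncateTitle("héllo wörld", 5)
	fmt.Println(t, cut) // héllo true
}
```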

func MapTimestamp

func MapTimestamp(iso string) (int64, error)

MapTimestamp parses an ISO 8601 string and returns the Unix epoch in milliseconds. The input is always interpreted in UTC to avoid timezone corruption.

Accepted formats:

  • ISO 8601 with T separator: "2024-03-15T10:30:00Z", "2024-03-15T10:30:00+00:00"
  • SQLite default datetime: "2024-03-15 10:30:00" (space separator, no timezone; assumed UTC)
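The two-format parse can be sketched with the standard time package. This is a hypothetical stand-in for MapTimestamp (here named parseEngramTime), assuming only the behaviors documented above:

```go
package main

import (
	"fmt"
	"time"
)

// parseEngramTime accepts RFC 3339 ("2024-03-15T10:30:00Z") or SQLite's
// default datetime ("2024-03-15 10:30:00", assumed UTC) and returns the
// Unix epoch in milliseconds.
func parseEngramTime(iso string) (int64, error) {
	layouts := []string{time.RFC3339, "2006-01-02 15:04:05"}
	for _, layout := range layouts {
		// ParseInLocation applies UTC only when the string carries no
		// zone of its own, which matches the "assumed UTC" rule.
		if t, err := time.ParseInLocation(layout, iso, time.UTC); err == nil {
			return t.UTC().UnixMilli(), nil
		}
	}
	return 0, fmt.Errorf("parse timestamp: unrecognized format %q", iso)
}

func main() {
	ms, err := parseEngramTime("2024-03-15 10:30:00")
	fmt.Println(ms, err)
}
```

Both accepted formats for the same instant yield the same millisecond value.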

func MapType

func MapType(engramType string) (memory.Type, []string, error)

MapType converts an Engram type string to a Thoughtline memory.Type plus any provenance tags to add to the migrated row.

Mapping table (spec engram-migration Requirement 4):

bugfix       → bugfix          (no tag)
preference   → preference      (no tag)
decision     → decision        (no tag)
architecture → architecture    (no tag)
pattern      → convention      + origin-type:pattern
config       → convention      + origin-type:config
discovery    → convention      + origin-type:discovery
manual       → convention      + origin-type:manual
<anything>   → convention      + origin-type:<anything>

Types

type Config

type Config struct {
	// Source is the absolute path to engram.db (read-only).
	Source string
	// Dest is the absolute path to thoughtline.db (read-write).
	Dest string
	// DryRun, when true, performs all reads and mapping but skips all writes.
	DryRun bool
	// Verbose, when true, causes each row result to be emitted to the logger
	// as it is processed (not just at the end).
	Verbose bool
}

Config holds the CLI flags resolved by main.go before calling Run.

type EngramObservation

type EngramObservation struct {
	SyncID         string
	Type           string // open set: bugfix|decision|architecture|pattern|config|preference|discovery|manual
	Title          string
	Content        string
	Project        string
	Scope          string
	TopicKey       string
	Tags           []string // Engram has no tags column; always nil. Reserved for future.
	NormalizedHash string   // stored but not forwarded — storage.Save() recomputes it
	RevisionCount  int
	CreatedAt      string  // ISO 8601
	UpdatedAt      string  // ISO 8601
	DeletedAt      *string // nil = active row
}

EngramObservation is the raw row read from the Engram observations table. Columns that Engram stores but Thoughtline has no equivalent for (tool_name, duplicate_count, last_seen_at) are read and discarded by the reader — they never appear here.

func ReadObservations

func ReadObservations(ctx context.Context, db *sql.DB) ([]EngramObservation, error)

ReadObservations reads all active (non-soft-deleted) observations from the Engram database and returns them as a slice of EngramObservation. It accepts a *sql.DB opened by the caller with read-only WAL pragmas — it never opens its own connection.

Columns that Engram stores but Thoughtline has no equivalent for (tool_name, duplicate_count, last_seen_at) are scanned and discarded.

The full result set is returned in one slice. At ~291 rows × ~2 KB avg the memory budget is well within reason.

type Logger

type Logger interface {
	Log(level, msg string, fields map[string]string)
}

Logger is the interface mapper and writer functions use so they stay pure and testable. StructuredLogger and NopLogger both satisfy it. Log fields are key=value pairs; callers pass them as a flat map.

type NopLogger

type NopLogger struct{}

NopLogger implements Logger by discarding every message. Use in tests that don't need to inspect log output, and in production contexts where no log destination is configured.

func (NopLogger) Log

func (NopLogger) Log(_ string, _ string, _ map[string]string)

Log discards the message.

type RowResult

type RowResult struct {
	SyncID string
	// Action is one of: "created", "skipped-deleted", "skipped-duplicate",
	// "skipped-topic-collision", "error".
	Action string
	// Reason is populated when Action is "error" or any "skipped-*" variant.
	Reason string
}

RowResult captures what happened to a single Engram row during migration. The full slice of RowResults is written to the structured log file; stdout shows only aggregate counters.

type StructuredLogger

type StructuredLogger struct {
	// contains filtered or unexported fields
}

StructuredLogger writes one log line per event to an io.Writer in the format:

level=INFO msg=row_result sync_id=obs-abc123 action=created

Keys are sorted for deterministic output. Both main.go (stdout + file) and tests that need to inspect output should use StructuredLogger; tests that don't care about log content should use NopLogger.

func NewStructuredLogger

func NewStructuredLogger(w io.Writer) StructuredLogger

NewStructuredLogger creates a StructuredLogger writing to w.

func (StructuredLogger) Log

func (l StructuredLogger) Log(level, msg string, fields map[string]string)

Log writes a single key=value log line. Fields are sorted by key for determinism. The level and msg are always the first two tokens.
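This formatting contract (level and msg first, remaining fields sorted by key) can be sketched as follows; formatLine is a hypothetical helper, not the actual method body:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLine renders one log event: level and msg are always the first two
// tokens, then the remaining fields follow sorted by key so that output is
// deterministic across runs.
func formatLine(level, msg string, fields map[string]string) string {
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	fmt.Fprintf(&b, "level=%s msg=%s", level, msg)
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%s", k, fields[k])
	}
	return b.String()
}

func main() {
	fmt.Println(formatLine("INFO", "row_result", map[string]string{
		"sync_id": "obs-abc123", "action": "created",
	}))
	// level=INFO msg=row_result action=created sync_id=obs-abc123
}
```

Sorting makes log output stable for tests that compare lines verbatim.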

type Summary

type Summary struct {
	Total            int
	Created          int
	SkippedDeleted   int // soft-deleted source rows; not migrated by spec
	SkippedDuplicate int // already in destination by sync_id
	SkippedTopicCol  int // (project, topic_key) collision with existing Thoughtline row
	Errors           int // rows that failed mapping or storage; do not abort the run
	Truncations      int // rows whose title was truncated at 200 runes
	Rows             []RowResult
}

Summary is the aggregate result of a migration run. Run() always returns a Summary even when individual rows errored — per-row errors are non-fatal.

func Run

func Run(ctx context.Context, cfg Config) (Summary, error)

Run executes the full migration from Config.Source (Engram DB) to Config.Dest (Thoughtline DB). It processes all active rows, accumulating results in a Summary. Individual row errors are non-fatal — Run always processes every row and returns the complete Summary regardless of how many rows errored.

If Config.DryRun is true, Run reads and maps all rows but performs no writes to the destination. Counters in the returned Summary reflect what would have happened.

Directories

Path Synopsis
Command migrate performs a one-shot migration from Engram's SQLite database to Thoughtline.
