distill

package
v0.1.2 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 4, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package distill reads rendered episode markdown (the durable episodic substrate captured by internal/episode) and surfaces candidate transformation moments for arc distillation — plasticity step 2.

It does the RELIABLE, mechanical half of distillation: it parses verbatim turns and flags high-precision candidate moments via keyword rules. It deliberately does NOT draft arcs or decide whether a candidate is already covered, because the 2026-05-31 distillation review found extraction reliable but drafting and dedup-judgment unreliable to automate. Those stay with the mind (propose, never auto-write identity).

Constants

const MinEpisodeBytes = 6000

MinEpisodeBytes is the signal-filter floor. Episodes below it are sub-minute noise captures (the 2026-04-27 review found 2 of 5 such). NOTE: byte size is a proxy for signal, not signal itself — a short-but-dense episode would be wrongly dropped; revisit if that case appears.

Functions

func FormatReport

func FormatReport(r Report, w io.Writer) error

FormatReport writes a human-readable, propose-only candidate report. It leads and closes with the contract — these are MOMENTS, not arcs; the mind drafts and approves — so the output can't be mistaken for finished identity.
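
The "lead and close with the contract" shape can be illustrated with a minimal sketch. Everything here is hypothetical: the contract wording, the `formatReport` and `render` names, and the plain list of strings standing in for a Report are assumptions made for illustration, not the package's actual output format.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// contractLine is illustrative wording only; the real report text is not shown in the docs.
const contractLine = "PROPOSE-ONLY: these are candidate moments, not arcs; the mind drafts and approves."

// formatReport writes the contract, then one numbered line per moment,
// then the contract again, returning the first write error it hits.
func formatReport(w io.Writer, moments []string) error {
	if _, err := fmt.Fprintln(w, contractLine); err != nil {
		return err
	}
	for i, m := range moments {
		if _, err := fmt.Fprintf(w, "%d. %s\n", i+1, m); err != nil {
			return err
		}
	}
	_, err := fmt.Fprintln(w, contractLine)
	return err
}

// render captures the report as a string, which is handy for inspection.
func render(moments []string) (string, error) {
	var sb strings.Builder
	err := formatReport(&sb, moments)
	return sb.String(), err
}

func main() {
	if err := formatReport(os.Stdout, []string{`turn 4: "you decide"`}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Repeating the contract at both ends means a reader who sees only the top or only the bottom of a long report still meets it.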

func SignalFilter

func SignalFilter(paths []string, minBytes int64) []string

SignalFilter drops episode paths CONFIRMED below minBytes — the minimum-signal threshold that keeps noise captures out of the corpus. A path that can't be stat'd is KEPT, not silently dropped: the size is unknown, so the safe move is to let it through and let ParseEpisodeFile surface any real error downstream (dropping a possibly signal-dense episode on a transient stat error is the failure mode to avoid).
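
The keep-on-unknown behavior described above can be sketched as follows. This is not the package's implementation: the `sizeOf` parameter and the `sizeFunc`, `statSize`, and `signalFilter` names are hypothetical, added so the rule can be exercised without touching disk. The documented `SignalFilter` takes only `paths` and `minBytes`.

```go
package main

import (
	"fmt"
	"os"
)

// sizeFunc reports a file's size in bytes, or an error when the size is unknown.
// Hypothetical helper type; the documented SignalFilter has no such parameter.
type sizeFunc func(path string) (int64, error)

// statSize is the on-disk sizeFunc, backed by os.Stat.
func statSize(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// signalFilter drops only paths CONFIRMED smaller than minBytes.
// A path whose size cannot be determined is kept.
func signalFilter(paths []string, minBytes int64, sizeOf sizeFunc) []string {
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		size, err := sizeOf(p)
		if err == nil && size < minBytes {
			continue // confirmed below the floor: drop
		}
		kept = append(kept, p) // large enough, or size unknown: keep
	}
	return kept
}

func main() {
	// Paths come from the command line; 6000 mirrors MinEpisodeBytes.
	for _, p := range signalFilter(os.Args[1:], 6000, statSize) {
		fmt.Println(p)
	}
}
```

Note that the drop condition requires both a successful size lookup and a size under the floor, so a file of exactly `minBytes` is kept.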

Types

type Candidate

type Candidate struct {
	Rule      RuleID
	EpisodeID string
	TurnIndex int
	Timestamp string
	Verbatim  string // the user turn that triggered the rule (for the mind to quote)
	Trigger   string // the exact phrase that fired the rule (shows WHY; aids judgment)
}

Candidate is a surfaced transformation moment — a propose-only pointer into an episode, never an arc. The mind drafts and approves; this just finds.

func ExtractCandidates

func ExtractCandidates(ep *Episode) []Candidate

ExtractCandidates applies the mechanical rules to an episode's USER turns only. Commit/build/TDD noise lives in ASSISTANT turns, which are never scanned, so it is excluded by construction (no rule can fire on it). A turn matching both rules yields one candidate per rule.
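
A rough sketch of this kind of rule pass, under stated assumptions: the phrase lists below are only the examples quoted in the RuleID doc comments, case-insensitive substring matching is a guess at the matching strategy, and the `turn`, `candidate`, `rule`, and `extract` names are hypothetical simplifications of the package's types.

```go
package main

import (
	"fmt"
	"strings"
)

// turn and candidate are trimmed-down stand-ins for the package's Turn and Candidate.
type turn struct {
	Index int
	Text  string
}

type candidate struct {
	Rule      string
	TurnIndex int
	Trigger   string
}

// rule pairs a rule ID with lowercase trigger phrases.
// ASSUMPTION: the real phrase lists and matching logic are not documented.
type rule struct {
	ID      string
	Phrases []string
}

var rules = []rule{
	{ID: "authority-grant", Phrases: []string{"you decide", "i trust you"}},
	{ID: "manifesto-lens", Phrases: []string{"manifesto lens on", "remember the manifesto"}},
}

// extract scans only the turns it is handed. Passing user turns alone is what
// keeps assistant-side noise out: no rule ever sees that text.
func extract(userTurns []turn) []candidate {
	var out []candidate
	for _, t := range userTurns {
		lower := strings.ToLower(t.Text)
		for _, r := range rules {
			for _, p := range r.Phrases {
				if strings.Contains(lower, p) {
					out = append(out, candidate{Rule: r.ID, TurnIndex: t.Index, Trigger: p})
					break // at most one candidate per rule per turn
				}
			}
		}
	}
	return out
}

func main() {
	for _, c := range extract([]turn{
		{Index: 1, Text: "Go for it."},
		{Index: 2, Text: "I trust you here. Remember the manifesto."},
	}) {
		fmt.Printf("turn %d: %s (trigger %q)\n", c.TurnIndex, c.Rule, c.Trigger)
	}
}
```

In this sketch a bare per-task approval such as "go for it" matches nothing, while a turn that hits both rules produces two candidates, one per rule.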

type Episode

type Episode struct {
	ID             string
	UserTurns      []Turn
	AssistantTurns []Turn
}

Episode is the parsed verbatim content of a rendered episode .md.

func ParseEpisodeFile

func ParseEpisodeFile(path string) (*Episode, error)

ParseEpisodeFile parses a rendered episode .md into its verbatim user and assistant turns. The renderer writes user/assistant text un-truncated (only commit subjects are clipped), so the turns are faithful for quoting.

type Report

type Report struct {
	EpisodesScanned int
	EpisodesKept    int
	Candidates      []Candidate `json:"candidates"`
	// ParseErrors records episodes that failed to parse, surfaced rather than
	// silently skipped (distill is infrastructure and can't log, so the visible
	// error rides in the report itself).
	ParseErrors []string `json:"parse_errors,omitempty"`
}

Report is the propose-only result of a distillation scan: the candidate moments found, plus the corpus stats. It is NOT a set of arcs — every entry is a pointer for the mind to judge and (maybe) draft.
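
Assuming the struct tags shown above are complete, the sketch below mirrors the Report and Candidate declarations locally and marshals an empty scan with `encoding/json`. The `encode` helper is hypothetical; the type definitions are copied from this page rather than imported.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, reproduced here so the example is self-contained.
type RuleID string

type Candidate struct {
	Rule      RuleID
	EpisodeID string
	TurnIndex int
	Timestamp string
	Verbatim  string
	Trigger   string
}

type Report struct {
	EpisodesScanned int
	EpisodesKept    int
	Candidates      []Candidate `json:"candidates"`
	ParseErrors     []string    `json:"parse_errors,omitempty"`
}

// encode marshals a Report to a JSON string.
func encode(r Report) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(Report{EpisodesScanned: 3, EpisodesKept: 2})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

With these tags, the two counters serialize under their Go field names, a nil `Candidates` slice encodes as `null`, and `parse_errors` is omitted entirely when there are no failures. A consumer therefore has to treat a missing `parse_errors` key as "no errors" rather than as a malformed report.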

func ScanEpisodes

func ScanEpisodes(dir string) (Report, error)

ScanEpisodes globs episode .md files under dir, signal-filters them, parses each, and returns the propose-only candidate Report. A per-episode parse failure is recorded in Report.ParseErrors (and that episode skipped) rather than aborting the whole scan or being swallowed — the corpus tool stays robust while the error stays visible.
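
The record-and-continue error handling can be sketched in isolation. All names here are hypothetical: `scan`, `scanResult`, and `parseStub` are stand-ins, the `*.md` glob pattern is an assumption about how episode files are found, and the stub parser only checks the file extension instead of reading anything.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// scanResult is a simplified stand-in for Report, tracking only parse outcomes.
type scanResult struct {
	Parsed      int
	ParseErrors []string
}

// parseStub stands in for a real episode parser.
// ASSUMPTION: it rejects anything without a .md extension, purely for illustration.
func parseStub(path string) error {
	if filepath.Ext(path) != ".md" {
		return errors.New("not a rendered episode")
	}
	return nil
}

// scan parses every path. A failure is recorded and that path skipped;
// it neither aborts the loop nor disappears silently.
func scan(paths []string, parse func(string) error) scanResult {
	var r scanResult
	for _, p := range paths {
		if err := parse(p); err != nil {
			r.ParseErrors = append(r.ParseErrors, fmt.Sprintf("%s: %v", p, err))
			continue
		}
		r.Parsed++
	}
	return r
}

func main() {
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	paths, err := filepath.Glob(filepath.Join(dir, "*.md"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	r := scan(paths, parseStub)
	fmt.Printf("parsed %d, %d parse errors\n", r.Parsed, len(r.ParseErrors))
}
```

Carrying the failures in the returned value fits a package that cannot log: the caller decides whether a non-empty error list is fatal.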

type RuleID

type RuleID string

RuleID names a candidate-extraction rule (the high-precision mechanical ones from the 2026-05-31 distillation review). Rule 2 (recurrence→structural) and arc DRAFTING are deliberately NOT here — they need semantic judgment the review found unreliable to mechanize, so they stay with the mind.

const (
	// RuleAuthorityGrant — the partner transfers standing decision authority
	// ("you decide", "I trust you", "you are the one to judge"), not a bare
	// per-task approval ("go for it"). These mark relationship-shaping moments.
	RuleAuthorityGrant RuleID = "authority-grant"
	// RuleManifestoLens — the user turn invokes the manifesto as a decision lens
	// ("manifesto lens on", "remember the manifesto") OR cites a numbered
	// principle ("principle 5"). A frequent precursor to a lens-redirected
	// decision — an arc shape, when the lens actually overrides an instinct.
	RuleManifestoLens RuleID = "manifesto-lens"
)

type Turn

type Turn struct {
	Index     int
	Timestamp string
	Text      string
}

Turn is one verbatim message from an episode, with its 1-based position.
