tokenopt

package
v0.7.2
Published: May 4, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package tokenopt reduces the token footprint of session raw.jsonl streams before they are handed to a summarization LLM. Deterministic, streaming, single-pass, no LLM calls.

The one public entry point is Compress. It reads a raw.jsonl stream from r, applies a fixed set of transforms, and writes the compressed stream to w. Memory stays bounded (a small dedup set of content hashes) regardless of session size.

This package is intentionally self-contained — it depends only on the standard library and a small LRU — so it can be used by the CLI, the sessionsummary package, and a future server-side distiller with no coupling. No persistence, no sidecar manifest: the raw.jsonl on disk is the audit trail.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Mode

type Mode int

Mode controls how aggressively Compress reduces the stream.

const (
	// ModeConversationOnly (default) emits user + assistant entries verbatim,
	// drops header and system entries, and selectively replaces tool entries:
	//
	//   - Tool calls with a non-empty `description` field in tool_input
	//     (Bash, Agent, Task, WebFetch, ...) emit
	//     `{type:"tool_mark", description:"..."}`. Adjacent calls with the
	//     same description collapse via the count field.
	//
	//   - Tool calls without a description (Edit, Read, Write, Glob, Grep, ...)
	//     produce NO tool_mark at all. Their actions are recoverable from
	//     surrounding assistant prose ("I'll edit foo.go and run tests"),
	//     so the marker would only be redundant noise.
	//
	// The descriptions are agent-authored intent strings — high
	// signal-per-byte, and far cheaper than the previous 120-char brief
	// extracted heuristically from the most-meaningful input field.
	ModeConversationOnly Mode = 0

	// ModeLossless keeps every entry but applies content-level transforms
	// (ANSI strip, progress collapse, image elision, large-Read truncation,
	// system-reminder + tool_result dedup). Use when a downstream consumer
	// still needs tool details (replay, debugging, contract-testing).
	ModeLossless Mode = 1
)
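The adjacent-mark batching described for ModeConversationOnly can be sketched like this. The `tool_mark` shape follows the doc comment above; the exact `count` field name beyond that is an assumption:

```go
package main

import "fmt"

// toolMark mirrors the documented {type:"tool_mark", description:"..."}
// shape; the count field name is assumed from the Mode doc comment.
type toolMark struct {
	Type        string `json:"type"`
	Description string `json:"description"`
	Count       int    `json:"count,omitempty"`
}

// collapseMarks merges runs of adjacent marks that share a description,
// bumping Count instead of emitting duplicate entries.
func collapseMarks(descs []string) []toolMark {
	var out []toolMark
	for _, d := range descs {
		if n := len(out); n > 0 && out[n-1].Description == d {
			out[n-1].Count++
			continue
		}
		out = append(out, toolMark{Type: "tool_mark", Description: d, Count: 1})
	}
	return out
}

func main() {
	marks := collapseMarks([]string{"run tests", "run tests", "edit foo.go"})
	fmt.Println(len(marks), marks[0].Count) // 2 2
}
```

Only *adjacent* identical descriptions collapse; a repeated description after an intervening different one starts a fresh mark.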

type Options

type Options struct {
	// Mode selects the compression strategy. Defaults to ModeConversationOnly.
	Mode Mode

	// LargeReadMaxLines is the line threshold above which a Read tool_result
	// body is truncated to head+tail. ModeLossless only. Defaults to 120.
	LargeReadMaxLines int

	// LargeReadKeepLines is how many lines to keep from the start and end when
	// a large Read body is truncated. ModeLossless only. Defaults to 40.
	LargeReadKeepLines int

	// ToolResultLRUSize caps the content-hash LRU for tool_result dedup.
	// ModeLossless only. Defaults to 1024 unique payloads.
	ToolResultLRUSize int

	// ToolResultMinBytes is the minimum content size worth deduping.
	// ModeLossless only. Defaults to 512.
	ToolResultMinBytes int
}

Options configures a Compress run. The zero value is a reasonable default: ModeConversationOnly, with each zero field taking the default documented on it.
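The "zero value is a reasonable default" behavior presumably amounts to filling zero fields with the documented defaults, roughly like this (a local sketch, not the real type or code):

```go
package main

import "fmt"

// options mirrors the documented tunable fields (local sketch only).
type options struct {
	LargeReadMaxLines  int
	LargeReadKeepLines int
	ToolResultLRUSize  int
	ToolResultMinBytes int
}

// withDefaults fills zero fields with the defaults documented on Options.
func withDefaults(o options) options {
	if o.LargeReadMaxLines == 0 {
		o.LargeReadMaxLines = 120
	}
	if o.LargeReadKeepLines == 0 {
		o.LargeReadKeepLines = 40
	}
	if o.ToolResultLRUSize == 0 {
		o.ToolResultLRUSize = 1024
	}
	if o.ToolResultMinBytes == 0 {
		o.ToolResultMinBytes = 512
	}
	return o
}

func main() {
	o := withDefaults(options{LargeReadMaxLines: 200})
	fmt.Println(o.LargeReadMaxLines, o.LargeReadKeepLines) // 200 40
}
```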

type Stats

type Stats struct {
	EntriesIn         int
	EntriesOut        int // differs from EntriesIn in ModeConversationOnly where tool entries collapse to markers
	BytesIn           int64
	BytesOut          int64
	ToolsMarked       int // ModeConversationOnly: count of tool entries replaced with compact markers
	ToolsBatched      int // ModeConversationOnly: count of adjacent identical tool_marks collapsed into prior with count++
	SystemDropped     int // ModeConversationOnly: count of system entries dropped
	HeaderDropped     int // ModeConversationOnly: count of header entries dropped (header carries metadata the LLM doesn't need; daemon stamps it from stored.Meta directly)
	ANSIStripped      int // ModeLossless: entries with ANSI sequences removed
	ProgressCollapsed int // ModeLossless: entries with \r progress frames collapsed
	ImagesElided      int // ModeLossless: base64 image payloads replaced
	LargeReadsElided  int // ModeLossless: large Read tool_result bodies truncated
	RemindersDeduped  int // ModeLossless: <system-reminder> blocks replaced with a ref
	ToolResultsRefd   int // ModeLossless: tool_result bodies replaced with a tool_ref

	// Token estimates use a ~4 chars/token heuristic from internal/tokens.
	// They exist so callers can log a token-reduction number alongside the
	// byte-reduction one. Tokens are the actual LLM cost driver; bytes are
	// a proxy. Swap the estimator (or wire in a real BPE tokenizer) without
	// changing any caller — the field shape stays stable.
	TokensInEstimate  int64
	TokensOutEstimate int64
}

Stats reports what Compress did. Zero values are meaningful (no matches).
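The token estimates in Stats use the ~4 chars/token heuristic from internal/tokens; a minimal version of that estimator might look like this (rounding direction is an assumption, and internal/tokens may differ):

```go
package main

import "fmt"

// estimateTokens applies the ~4 chars/token heuristic described in the
// Stats doc, rounding up so short non-empty content counts as at least
// one token.
func estimateTokens(byteLen int64) int64 {
	return (byteLen + 3) / 4
}

func main() {
	fmt.Println(estimateTokens(10)) // 3
}
```

Because the heuristic lives behind the Stats fields, callers only ever see `TokensInEstimate`/`TokensOutEstimate`, so swapping in a real BPE tokenizer changes the numbers but not the API.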

func Compress

func Compress(r io.Reader, w io.Writer) (Stats, error)

Compress reads raw.jsonl entries from r, applies streaming transforms, and writes compressed jsonl to w. Returns Stats describing what was done.

Guarantees (both modes):

  • Single pass over r. Surviving entries emit in original order.
  • Constant memory relative to session size.
  • User turns are preserved verbatim in both modes. (Header entries survive only in ModeLossless; ModeConversationOnly drops them, counted in HeaderDropped.)

Mode-specific behavior:

  • ModeConversationOnly (default): assistant turns verbatim; tool entries with a description collapse to compact tool_mark entries (adjacent identical marks batch via count); tool entries without a description, system entries, and header entries are dropped. Typical reduction 50–80% on real sessions.
  • ModeLossless: every entry preserved; content-level transforms only (ANSI strip, image elision, dedup, etc.).
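Two of the lossless content-level transforms, ANSI stripping and \r progress-frame collapsing, can be sketched as below. The CSI regex and the keep-last-frame rule are plausible assumptions; the package's exact patterns are not documented here:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ansi matches CSI escape sequences (a common pattern; the package's
// actual matcher may be broader).
var ansi = regexp.MustCompile(`\x1b\[[0-9;]*[A-Za-z]`)

// cleanContent strips ANSI codes, then collapses \r-overwritten progress
// frames so only the final frame of each line survives.
func cleanContent(s string) string {
	s = ansi.ReplaceAllString(s, "")
	lines := strings.Split(s, "\n")
	for i, ln := range lines {
		if j := strings.LastIndexByte(ln, '\r'); j >= 0 {
			lines[i] = ln[j+1:] // keep only the last overwrite
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	out := cleanContent("\x1b[32mok\x1b[0m\nprogress 10%\rprogress 50%\rprogress 100%")
	fmt.Println(out)
}
```

Terminal progress bars rewrite the same line dozens of times via `\r`; keeping only the final frame preserves the information while shedding most of the bytes.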

func CompressWith

func CompressWith(r io.Reader, w io.Writer, opts Options) (Stats, error)

CompressWith is Compress with tunable options.

func (*Stats) Add

func (s *Stats) Add(other Stats)

Add accumulates other into s. Useful for aggregating across many sessions (e.g., a daemon summarizing a nightly batch).

func (Stats) LogValue

func (s Stats) LogValue() slog.Value

LogValue implements slog.LogValuer. Enables callers (CLI, daemon) to emit single-line key=value compression telemetry with:

slog.Info("token_optimize", "stats", stats)

func (Stats) Reduction

func (s Stats) Reduction() (saved int64, pct float64)

Reduction returns bytes saved and percentage. Safe to call when BytesIn is 0.
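The arithmetic behind Reduction is simple enough to sketch directly; guarding the division is what makes it safe when BytesIn is 0 (a free-standing sketch, not the package's code):

```go
package main

import "fmt"

// reduction computes bytes saved and percent saved, returning 0% when
// bytesIn is 0, matching the documented "safe to call when BytesIn is 0".
func reduction(bytesIn, bytesOut int64) (saved int64, pct float64) {
	saved = bytesIn - bytesOut
	if bytesIn > 0 {
		pct = float64(saved) / float64(bytesIn) * 100
	}
	return saved, pct
}

func main() {
	saved, pct := reduction(1000, 250)
	fmt.Println(saved, pct) // 750 75
}
```

TokenReduction presumably applies the same formula to TokensInEstimate/TokensOutEstimate.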

func (Stats) TokenReduction added in v0.7.2

func (s Stats) TokenReduction() (saved int64, pct float64)

TokenReduction returns estimated tokens saved and percentage. The cost dashboard wants this — bytes are a proxy, tokens are the actual LLM cost. Code-heavy sessions tokenize at a different ratio than prose-heavy ones, so byte-percentage trends mislead operators on cost.
