Documentation ¶
Overview ¶
Package tokenopt reduces the token footprint of session raw.jsonl streams before they are handed to a summarization LLM. Deterministic, streaming, single-pass, no LLM calls.
The primary entry point is Compress (CompressWith exposes tunable options). It reads a raw.jsonl stream from r, applies a fixed set of transforms, and writes the compressed stream to w. Memory stays bounded (a small LRU of content hashes) regardless of session size.
This package is intentionally self-contained — it depends only on the standard library and a small LRU — so it can be used by the CLI, the sessionsummary package, and a future server-side distiller with no coupling. No persistence, no sidecar manifest: the raw.jsonl on disk is the audit trail.
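A minimal usage sketch. This page does not show the Compress signature; the example assumes Compress(r io.Reader, w io.Writer) (Stats, error), matching the description above, and uses a hypothetical import path:

package main

import (
	"bytes"
	"log"
	"log/slog"
	"os"

	"example.com/internal/tokenopt" // hypothetical import path
)

func main() {
	in, err := os.Open("raw.jsonl")
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()

	var out bytes.Buffer
	stats, err := tokenopt.Compress(in, &out) // assumed signature: (io.Reader, io.Writer) (Stats, error)
	if err != nil {
		log.Fatal(err)
	}
	slog.Info("token_optimize", "stats", stats) // key=value telemetry via Stats.LogValue
	// out.Bytes() now holds the compressed jsonl for the summarization LLM.
}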
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Mode ¶
type Mode int
Mode controls how aggressively Compress reduces the stream.
const (
	// ModeConversationOnly (default) emits user + assistant entries verbatim,
	// drops header and system entries, and selectively replaces tool entries:
	//
	//   - Tool calls with a non-empty `description` field in tool_input
	//     (Bash, Agent, Task, WebFetch, ...) emit
	//     `{type:"tool_mark", description:"..."}`. Adjacent calls with the
	//     same description collapse via the count field.
	//
	//   - Tool calls without a description (Edit, Read, Write, Glob, Grep, ...)
	//     produce NO tool_mark at all. Their actions are recoverable from
	//     surrounding assistant prose ("I'll edit foo.go and run tests"),
	//     so the marker would only be redundant noise.
	//
	// The descriptions are agent-authored intent strings — high
	// signal-per-byte, and far cheaper than the previous 120-char brief
	// extracted heuristically from the most-meaningful input field.
	ModeConversationOnly Mode = 0

	// ModeLossless keeps every entry but applies content-level transforms
	// (ANSI strip, progress collapse, image elision, large-Read truncation,
	// system-reminder + tool_result dedup). Use when a downstream consumer
	// still needs tool details (replay, debugging, contract-testing).
	ModeLossless Mode = 1
)
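Concretely, the selective replacement looks roughly like this. Entry shapes other than the tool_mark quoted above are illustrative assumptions, not quoted from the package:

// Illustrative ModeConversationOnly transform (entry shapes assumed):
//
//   {"type":"user", ...}                                       -> kept verbatim
//   {"type":"assistant", ...}                                  -> kept verbatim
//   {"type":"tool", "tool_input":{"description":"run tests"}}  -> {"type":"tool_mark","description":"run tests"}
//   {"type":"tool", "tool_input":{"description":"run tests"}}  -> collapsed into the prior mark, count: 2
//   {"type":"tool", "tool_input":{"file_path":"foo.go"}}       -> dropped (no description)
//   {"type":"system", ...}                                     -> dropped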
type Options ¶
type Options struct {
// Mode selects the compression strategy. Defaults to ModeConversationOnly.
Mode Mode
// LargeReadMaxLines is the line threshold above which a Read tool_result
// body is truncated to head+tail. ModeLossless only. Defaults to 120.
LargeReadMaxLines int
// LargeReadKeepLines is how many lines to keep from the start and end when
// a large Read body is truncated. ModeLossless only. Defaults to 40.
LargeReadKeepLines int
// ToolResultLRUSize caps the content-hash LRU for tool_result dedup.
// ModeLossless only. Defaults to 1024 unique payloads.
ToolResultLRUSize int
// ToolResultMinBytes is the minimum content size worth deduping.
// ModeLossless only. Defaults to 512.
ToolResultMinBytes int
}
Options configures a Compress run. Zero value is a reasonable default (ModeConversationOnly).
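A sketch of a tuned lossless run, assuming CompressWith takes (r io.Reader, w io.Writer, opts Options) and returns (Stats, error), with in and out as in the Overview example:

opts := tokenopt.Options{
	Mode:               tokenopt.ModeLossless,
	LargeReadMaxLines:  200, // truncate Read bodies only past 200 lines (default 120)
	LargeReadKeepLines: 60,  // keep 60 head + 60 tail lines (default 40)
	ToolResultLRUSize:  4096,
	ToolResultMinBytes: 256, // dedup payloads down to 256 bytes (default 512)
}
// Assumed signature: CompressWith(r io.Reader, w io.Writer, opts Options) (Stats, error).
stats, err := tokenopt.CompressWith(in, out, opts)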
type Stats ¶
type Stats struct {
EntriesIn int
EntriesOut int // differs from EntriesIn in ModeConversationOnly where tool entries collapse to markers
BytesIn int64
BytesOut int64
ToolsMarked int // ModeConversationOnly: count of tool entries replaced with compact markers
ToolsBatched int // ModeConversationOnly: count of adjacent identical tool_marks collapsed into prior with count++
SystemDropped int // ModeConversationOnly: count of system entries dropped
HeaderDropped int // ModeConversationOnly: count of header entries dropped (header carries metadata the LLM doesn't need; daemon stamps it from stored.Meta directly)
ANSIStripped int // ModeLossless: entries with ANSI sequences removed
ProgressCollapsed int // ModeLossless: entries with \r progress frames collapsed
ImagesElided int // ModeLossless: base64 image payloads replaced
LargeReadsElided int // ModeLossless: large Read tool_result bodies truncated
RemindersDeduped int // ModeLossless: <system-reminder> blocks replaced with a ref
ToolResultsRefd int // ModeLossless: tool_result bodies replaced with a tool_ref
// Token estimates use a ~4 chars/token heuristic from internal/tokens.
// They exist so callers can log a token-reduction number alongside the
// byte-reduction one. Tokens are the actual LLM cost driver; bytes are
// a proxy. Swap the estimator (or wire in a real BPE tokenizer) without
// changing any caller — the field shape stays stable.
TokensInEstimate int64
TokensOutEstimate int64
}
Stats reports what Compress did. Zero values are meaningful (no matches).
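The estimate heuristic itself is roughly this (a sketch; internal/tokens' actual rounding or weighting may differ):

// Sketch of the ~4 chars/token heuristic described above; the actual
// internal/tokens estimator may round or weight differently.
func estimateTokens(b []byte) int64 {
	return int64(len(b)) / 4
}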
func Compress ¶
Compress reads raw.jsonl entries from r, applies streaming transforms, and writes compressed jsonl to w. Returns Stats describing what was done.
Guarantees (both modes):
- Single pass over r. Surviving entries emit in original order.
- Constant memory relative to session size.
- User turns are preserved verbatim, always. Header entries survive only in ModeLossless; ModeConversationOnly drops them, as HeaderDropped records.
Mode-specific behavior:
- ModeConversationOnly (default): assistant turns verbatim; tool calls with a description collapse to compact tool_mark entries carrying that agent-authored description; undescribed tool calls, system entries, and the header are dropped. Typical reduction 50–80% on real sessions.
- ModeLossless: every entry preserved; content-level transforms only (ANSI strip, image elision, dedup, etc.).
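One way to choose a mode is to run both over the same bytes and compare, as in this sketch (uses CompressWith, below; Reduction's return shape is assumed to be saved bytes plus percentage):

raw, err := os.ReadFile("raw.jsonl")
if err != nil {
	log.Fatal(err)
}

var convOut, lossOut bytes.Buffer
conv, err := tokenopt.Compress(bytes.NewReader(raw), &convOut)
if err != nil {
	log.Fatal(err)
}
loss, err := tokenopt.CompressWith(bytes.NewReader(raw), &lossOut,
	tokenopt.Options{Mode: tokenopt.ModeLossless})
if err != nil {
	log.Fatal(err)
}

// Reduction's return shape is assumed: (bytes saved, percent saved).
_, convPct := conv.Reduction()
_, lossPct := loss.Reduction()
slog.Info("mode_compare", "conversation_pct", convPct, "lossless_pct", lossPct)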
func CompressWith ¶
CompressWith is Compress with tunable options.
func (*Stats) Add ¶
Add accumulates other into s. Useful for aggregating across many sessions (e.g., a daemon summarizing a nightly batch).
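A sketch of nightly-batch aggregation; sessionPaths and Add's pointer parameter are assumptions:

var total tokenopt.Stats
for _, path := range sessionPaths { // hypothetical []string of raw.jsonl paths
	in, err := os.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	stats, err := tokenopt.Compress(in, io.Discard)
	in.Close()
	if err != nil {
		log.Fatal(err)
	}
	total.Add(&stats) // parameter shape assumed; the doc says only "accumulates other into s"
}
slog.Info("nightly_batch", "stats", total)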
func (Stats) LogValue ¶
LogValue implements slog.LogValuer. Enables callers (CLI, daemon) to emit single-line key=value compression telemetry with:
slog.Info("token_optimize", "stats", stats)
func (Stats) Reduction ¶
Reduction returns bytes saved and percentage. Safe to call when BytesIn is 0.
func (Stats) TokenReduction ¶ added in v0.7.2
TokenReduction returns estimated tokens saved and percentage. The cost dashboard wants this — bytes are a proxy, tokens are the actual LLM cost. Code-heavy sessions tokenize at a different ratio than prose-heavy ones, so byte-percentage trends mislead operators on cost.
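A sketch of dashboard telemetry that reports both views side by side, making any byte-vs-token divergence visible (return shapes assumed: saved count plus percentage):

// Return shapes assumed: (saved count, percent saved) for both methods.
bytesSaved, bytesPct := stats.Reduction()
toksSaved, toksPct := stats.TokenReduction()
slog.Info("compression_cost",
	"bytes_saved", bytesSaved, "bytes_pct", bytesPct,
	"tokens_saved", toksSaved, "tokens_pct", toksPct)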