tokenstrip

package
v0.7.2
Published: May 4, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package tokenstrip is a streaming, token-aware compaction stage for session raw.jsonl streams. It sits downstream of tokenopt in the session pipeline and reduces token count — not bytes — via a small set of intentionally conservative transforms.

Why a separate package

tokenopt produces a byte-reduced stream (ANSI strip, image elision, tool-result dedup, etc.). Those transforms save bytes but rarely save tokens in proportion. tokenstrip attacks the tokenizer directly: NFC-normalize, eliminate zero-width characters, canonicalize whitespace, and — strictly inside assistant <thinking> blocks — drop stop words and optionally substitute high-token phrases with shorter synonyms.

Safety model — precise contract by transform

Some transforms here are lossy. The package is therefore OFF by default in upstream callers and gated behind explicit opt-in.

Fields NEVER mutated, regardless of transform or config:

  • header entries (session metadata)
  • user turns, in their entirety (intent signal is sacred)
  • tool_name, tool_input, tool_mark.brief (summarizer scaffolding)

For assistant entries, the applicability depends on whether a transform is lossless or lossy:

Lossless transforms (apply to assistant content globally):
  - NFC Unicode normalization — round-trippable canonical form
  - Zero-width + unusual whitespace strip — information-free glyphs
  - Whitespace canonicalization — multiple spaces/newlines → one
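Two of the three lossless passes can be sketched in plain stdlib Go (NFC normalization needs golang.org/x/text/unicode/norm and is omitted; the zero-width set below is a small illustrative subset, not the package's full table):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// zeroWidth lists a few information-free code points of the kind the
// lossless pass strips. Illustrative subset only.
var zeroWidth = map[rune]bool{
	'\u200B': true, // ZERO WIDTH SPACE
	'\u200C': true, // ZERO WIDTH NON-JOINER
	'\u200D': true, // ZERO WIDTH JOINER
	'\uFEFF': true, // ZERO WIDTH NO-BREAK SPACE / BOM
}

var wsRun = regexp.MustCompile(`[ \t]+`)
var nlRun = regexp.MustCompile(`\n{2,}`)

// losslessPass sketches the two character-level transforms: drop zero-width
// glyphs, then collapse space/tab runs to one space and newline runs to one.
func losslessPass(s string) string {
	s = strings.Map(func(r rune) rune {
		if zeroWidth[r] {
			return -1 // -1 drops the rune
		}
		return r
	}, s)
	s = wsRun.ReplaceAllString(s, " ")
	s = nlRun.ReplaceAllString(s, "\n")
	return s
}

func main() {
	fmt.Printf("%q\n", losslessPass("hello\u200B  world\n\n\nbye"))
}
```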

Lossy transforms (apply ONLY to text inside <thinking>…</thinking>):
  - Stop-word removal
  - Synonym substitution (opt-in even when tokenstrip is enabled)

This means assistant prose OUTSIDE <thinking> may see its whitespace canonicalized and zero-width chars removed (lossless-safe), but its words will never be dropped or rewritten (preserves the answer to the user verbatim). Assistant prose INSIDE <thinking> may additionally lose stop words / have synonyms substituted (lossy but scoped to reasoning).
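The lossy stop-word pass scoped to <thinking> text can be sketched as a word filter (the three-word stop list is a toy stand-in for the real English set, which isn't reproduced in this doc):

```go
package main

import (
	"fmt"
	"strings"
)

// stop is a toy stand-in for the English stop-word set; the real table
// is larger and language-selectable via Options.StopWordLanguage.
var stop = map[string]bool{"the": true, "a": true, "of": true}

// stripStopWords sketches the lossy pass applied only inside
// <thinking> blocks: split on whitespace, drop stop words, rejoin.
func stripStopWords(s string) string {
	var kept []string
	for _, w := range strings.Fields(s) {
		if !stop[strings.ToLower(w)] {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(stripStopWords("the root of the problem is a missing lock"))
}
```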

Streaming

Compress is single-pass over r, bounded memory, tolerant of oversized entries (>64KB). Unknown top-level JSON fields on each entry round-trip via map[string]json.RawMessage so downstream consumers keep whatever schema extensions upstream added.
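The unknown-field round-trip can be sketched with encoding/json: decode each entry into a loose map, touch only the field being transformed, and re-encode. (The "content" field name here is illustrative, not necessarily the package's actual schema.)

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rewriteEntry decodes one raw.jsonl entry into map[string]json.RawMessage
// so unknown top-level fields survive untouched, applies transform to the
// "content" string (illustrative field name), and re-encodes.
func rewriteEntry(line []byte, transform func(string) string) ([]byte, error) {
	var entry map[string]json.RawMessage
	if err := json.Unmarshal(line, &entry); err != nil {
		return nil, err
	}
	if raw, ok := entry["content"]; ok {
		var s string
		if err := json.Unmarshal(raw, &s); err == nil {
			b, _ := json.Marshal(transform(s))
			entry["content"] = b
		}
	}
	return json.Marshal(entry)
}

func main() {
	in := []byte(`{"role":"assistant","content":"hi   there","x_custom":42}`)
	out, _ := rewriteEntry(in, func(s string) string { return "hi there" })
	fmt.Println(string(out)) // x_custom survives the round-trip
}
```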

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DefaultSynonymTable

func DefaultSynonymTable() map[string]string

DefaultSynonymTable returns the baseline high-token phrase → shorter form mapping. Kept short and conservative; callers wanting more aggressive shortening should pass their own table.
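Whole-word, case-insensitive matching can be sketched with one regexp per phrase (the two-entry table is a toy example; the real DefaultSynonymTable contents aren't listed in this doc, and a production pass would precompile the patterns):

```go
package main

import (
	"fmt"
	"regexp"
)

// substitute applies a phrase→shorter-form table as whole-word,
// case-insensitive matches, mirroring the documented SynonymTable
// semantics. Compiling inside the loop keeps the sketch short.
func substitute(s string, table map[string]string) string {
	for from, to := range table {
		re := regexp.MustCompile(`(?i)\b` + regexp.QuoteMeta(from) + `\b`)
		s = re.ReplaceAllString(s, to)
	}
	return s
}

func main() {
	table := map[string]string{"in order to": "to", "utilize": "use"}
	fmt.Println(substitute("In order to Utilize the cache", table))
}
```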

Types

type DropThinkingMode added in v0.7.2

type DropThinkingMode int

DropThinkingMode controls the aggressive reduction of <thinking> blocks. See Options.DropThinkingMode for the rationale.

const (
	// DropThinkingNone preserves the entire <thinking> block (default).
	// Stop-word removal still applies to the inner text.
	DropThinkingNone DropThinkingMode = 0

	// DropThinkingFirstSentence keeps the first sentence of each block
	// and replaces the rest with an elision marker. Sentence boundary
	// is the first ".", "!", or "?" followed by whitespace or end-of-block.
	DropThinkingFirstSentence DropThinkingMode = 1

	// DropThinkingAll removes the entire block (including the surrounding
	// <thinking> tags). Maximum aggression — use only with telemetry-
	// backed evidence that quality_score isn't impacted.
	DropThinkingAll DropThinkingMode = 2
)
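The DropThinkingFirstSentence boundary rule — first ".", "!", or "?" followed by whitespace or end-of-block — can be sketched as follows (the "[…]" elision marker is an assumption; the package's actual marker text isn't documented here):

```go
package main

import (
	"fmt"
	"strings"
)

// firstSentence keeps text up to the first '.', '!' or '?' that is
// followed by whitespace or end-of-block, eliding the rest. A '.' inside
// a token like "v1.2" is not a boundary because no whitespace follows.
func firstSentence(block string) string {
	for i, r := range block {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		rest := block[i+1:]
		if rest == "" {
			return block // sentence ends the block: keep as-is
		}
		if strings.HasPrefix(rest, " ") || strings.HasPrefix(rest, "\n") ||
			strings.HasPrefix(rest, "\t") {
			return block[:i+1] + " […]" // assumed elision marker
		}
	}
	return block // no boundary found: keep the whole block
}

func main() {
	fmt.Println(firstSentence("Let me work out X. First, v1.2 parsing..."))
}
```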

type Options

type Options struct {
	// EnableSynonymSub turns on phrase→synonym substitution inside assistant
	// <thinking> blocks. Off by default even when tokenstrip itself is on,
	// because the table is opinionated and can produce awkward reasoning
	// text; callers should opt in explicitly.
	EnableSynonymSub bool

	// DropThinkingMode controls whether <thinking>...</thinking> blocks in
	// assistant content are reduced more aggressively than the default
	// stop-word strip. Default is DropThinkingNone (preserve current
	// behavior — full block kept, with stop-word removal applied to its
	// contents).
	//
	// On long Sonnet/Opus sessions, thinking blocks can be 30–50% of
	// total assistant prose. The summary schema (title, key_actions,
	// chapter_titles, aha_moments) doesn't require them, but the chain
	// of reasoning sometimes contains the framing for an aha_moment, so
	// we offer a conservative middle ground:
	//
	//   - DropThinkingNone: keep the entire block (current behavior).
	//   - DropThinkingFirstSentence: keep only the first sentence of
	//     each block (typically the framing — "Let me work out X" or
	//     "I need to figure out Y") and elide the body. Preserves the
	//     hint of where the reasoning was going without the deliberation
	//     cost.
	//   - DropThinkingAll: drop the block entirely. Maximum savings,
	//     maximum risk.
	//
	// Recommended rollout: DropThinkingFirstSentence behind an env-var
	// flag for a few weeks, A/B against EventSummarization quality_score
	// distribution, flip to default if there's no measurable quality
	// drop on the cohort that has it enabled.
	DropThinkingMode DropThinkingMode

	// SynonymTable overrides the default substitution table. Keys are
	// matched case-insensitively as whole words. A nil map falls back to
	// DefaultSynonymTable().
	SynonymTable map[string]string

	// StopWordLanguage is an ISO-639-1 language code (e.g. "en", "fr").
	// Empty string defaults to "en".
	StopWordLanguage string
}

Options configures a Compress run. Zero value is a reasonable default (English stop words, synonym substitution OFF, thinking blocks kept).

type Stats

type Stats struct {
	EntriesIn  int
	EntriesOut int
	BytesIn    int64
	BytesOut   int64

	NFCNormalized           int // entries where NFC normalization changed content
	ZeroWidthStripped       int // entries where zero-width / unusual whitespace was removed
	WhitespaceCanonicalized int // entries where whitespace collapse changed content
	StopWordsRemoved        int // <thinking> blocks where stop words were removed
	SynonymsSubstituted     int // <thinking> blocks where synonym substitution fired

	// ThinkingBlocksTrimmed counts <thinking> blocks reduced under
	// DropThinkingFirstSentence (kept first sentence only).
	ThinkingBlocksTrimmed int
	// ThinkingBlocksDropped counts <thinking> blocks removed entirely
	// under DropThinkingAll.
	ThinkingBlocksDropped int

	// Token estimates use a ~4 chars/token heuristic (Anthropic's rule of
	// thumb). They exist so callers can log a rough token-reduction number
	// without pulling in a BPE-heavy tokenizer. When a real tokenizer is
	// wired in later, swap the estimator in transforms.go and these fields
	// will reflect actual counts.
	TokensInEstimate  int64
	TokensOutEstimate int64
}

Stats reports what Compress did. Zero values mean no matches.
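The ~4 chars/token heuristic behind the token-estimate fields reduces to a byte-length division; the rounding direction below is an assumption, not documented package behavior:

```go
package main

import "fmt"

// estimateTokens sketches the ~4 chars/token heuristic behind
// TokensInEstimate/TokensOutEstimate: a rough length division,
// not a real BPE tokenizer. Rounding up is an assumption.
func estimateTokens(s string) int64 {
	return int64((len(s) + 3) / 4)
}

func main() {
	fmt.Println(estimateTokens("Compress reads raw.jsonl entries"))
}
```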

func Compress

func Compress(r io.Reader, w io.Writer) (Stats, error)

Compress reads raw.jsonl entries from r, applies token-aware transforms, and writes the transformed stream to w. Equivalent to CompressWith with a zero Options value.

func CompressWith

func CompressWith(r io.Reader, w io.Writer, opts Options) (Stats, error)

CompressWith is Compress with tunable options.

Guarantees:

  • Single pass over r, bounded memory.
  • Entry order preserved; no entries are dropped.
  • User turns and header entries are byte-identical on output.
  • Unknown top-level JSON fields survive round-trip.

func (*Stats) Add

func (s *Stats) Add(other Stats)

Add accumulates other into s. Useful for aggregating across many sessions.

func (Stats) LogValue

func (s Stats) LogValue() slog.Value

LogValue implements slog.LogValuer. Enables single-line key=value telemetry:

slog.Info("tokenstrip", "stats", stats)

func (Stats) Reduction

func (s Stats) Reduction() (saved int64, pct float64)

Reduction returns bytes saved and percentage. Safe when BytesIn is zero.
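A minimal sketch of those semantics, including the documented zero-BytesIn guard:

```go
package main

import "fmt"

// reduction mirrors the documented Reduction() behavior: bytes saved and
// percentage, guarded so bytesIn == 0 yields pct 0 rather than NaN.
func reduction(bytesIn, bytesOut int64) (saved int64, pct float64) {
	saved = bytesIn - bytesOut
	if bytesIn > 0 {
		pct = float64(saved) / float64(bytesIn) * 100
	}
	return saved, pct
}

func main() {
	s, p := reduction(2000, 1500)
	fmt.Printf("%d bytes (%.1f%%)\n", s, p)
}
```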

func (Stats) TokenReduction

func (s Stats) TokenReduction() (saved int64, pct float64)

TokenReduction returns estimated tokens saved and percentage.
