tokenstrip

package

v0.7.2 Latest Latest Go to latest Published: May 4, 2026 License: MIT Imports: 10 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/sageox/ox

Links

Open Source Insights

Documentation ¶

Overview ¶

Package tokenstrip is a streaming, token-aware compaction stage for session raw.jsonl streams. It sits downstream of tokenopt in the session pipeline and reduces token count — not bytes — via a small set of intentionally conservative transforms.

Why a separate package ¶

tokenopt produces a byte-reduced stream (ANSI strip, image elision, tool-result dedup, etc.). Those transforms save bytes but rarely save tokens in proportion. tokenstrip attacks the tokenizer directly: NFC-normalize, eliminate zero-width characters, canonicalize whitespace, and — strictly inside assistant <thinking> blocks — drop stop words and optionally substitute high-token phrases with shorter synonyms.

Safety model — precise contract by transform ¶

Some transforms here are lossy. The package is therefore OFF by default in upstream callers and gated behind explicit opt-in.

Fields NEVER mutated, regardless of transform or config:

header entries (session metadata)
user turns, in their entirety (intent signal is sacred)
tool_name, tool_input, tool_mark.brief (summarizer scaffolding)

For assistant entries, the applicability depends on whether a transform is lossless or lossy:

Lossless transforms (apply to assistant content globally):
  - NFC Unicode normalization — round-trippable canonical form
  - Zero-width + unusual whitespace strip — information-free glyphs
  - Whitespace canonicalization — multiple spaces/newlines → one

Lossy transforms (apply ONLY to text inside <thinking>…</thinking>):
  - Stop-word removal
  - Synonym substitution (opt-in even when tokenstrip is enabled)

This means assistant prose OUTSIDE <thinking> may see its whitespace canonicalized and zero-width chars removed (lossless-safe), but its words will never be dropped or rewritten (preserves the answer to the user verbatim). Assistant prose INSIDE <thinking> may additionally lose stop words / have synonyms substituted (lossy but scoped to reasoning).

Streaming ¶

Compress is single-pass over r, bounded memory, tolerant of oversized entries (>64KB). Unknown top-level JSON fields on each entry round-trip via map[string]json.RawMessage so downstream consumers keep whatever schema extensions upstream added.

Index ¶

func DefaultSynonymTable() map[string]string
type DropThinkingMode
type Options
type Stats
- func Compress(r io.Reader, w io.Writer) (Stats, error)
- func CompressWith(r io.Reader, w io.Writer, opts Options) (Stats, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func DefaultSynonymTable ¶

func DefaultSynonymTable() map[string]string

DefaultSynonymTable returns the baseline high-token phrase → shorter form mapping. Kept short and conservative; callers wanting more aggressive shortening should pass their own table.

Types ¶

type DropThinkingMode ¶ added in v0.7.2

type DropThinkingMode int

DropThinkingMode controls the aggressive reduction of <thinking> blocks. See Options.DropThinkingMode for the rationale.

const (
	// DropThinkingNone preserves the entire <thinking> block (default).
	// Stop-word removal still applies to the inner text.
	DropThinkingNone DropThinkingMode = 0

	// DropThinkingFirstSentence keeps the first sentence of each block
	// and replaces the rest with an elision marker. Sentence boundary
	// is the first ".", "!", or "?" followed by whitespace or end-of-block.
	DropThinkingFirstSentence DropThinkingMode = 1

	// DropThinkingAll removes the entire block (including the surrounding
	// <thinking> tags). Maximum aggression — use only with telemetry-
	// backed evidence that quality_score isn't impacted.
	DropThinkingAll DropThinkingMode = 2
)

type Options ¶

type Options struct {
	// EnableSynonymSub turns on phrase→synonym substitution inside assistant
	// <thinking> blocks. Off by default even when tokenstrip itself is on,
	// because the table is opinionated and can produce awkward reasoning
	// text; callers should opt in explicitly.
	EnableSynonymSub bool

	// DropThinkingMode controls whether <thinking>...</thinking> blocks in
	// assistant content are reduced more aggressively than the default
	// stop-word strip. Default is DropThinkingNone (preserve current
	// behavior — full block kept, with stop-word removal applied to its
	// contents).
	//
	// On long Sonnet/Opus sessions, thinking blocks can be 30–50% of
	// total assistant prose. The summary schema (title, key_actions,
	// chapter_titles, aha_moments) doesn't require them, but the chain
	// of reasoning sometimes contains the framing for an aha_moment, so
	// we offer a conservative middle ground:
	//
	//   - DropThinkingNone: keep the entire block (current behavior).
	//   - DropThinkingFirstSentence: keep only the first sentence of
	//     each block (typically the framing — "Let me work out X" or
	//     "I need to figure out Y") and elide the body. Preserves the
	//     hint of where the reasoning was going without the deliberation
	//     cost.
	//   - DropThinkingAll: drop the block entirely. Maximum savings,
	//     maximum risk.
	//
	// Recommended rollout: DropThinkingFirstSentence behind an env-var
	// flag for a few weeks, A/B against EventSummarization quality_score
	// distribution, flip to default if there's no measurable quality
	// drop on the cohort that has it enabled.
	DropThinkingMode DropThinkingMode

	// SynonymTable overrides the default substitution table. Keys are
	// matched case-insensitively as whole words. A nil map falls back to
	// DefaultSynonymTable().
	SynonymTable map[string]string

	// StopWordLanguage is an ISO-639-1 language code (e.g. "en", "fr").
	// Empty string defaults to "en".
	StopWordLanguage string
}

Options configures a Compress run. Zero value is a reasonable default (English stop words, synonym substitution OFF, thinking blocks kept).

type Stats ¶

type Stats struct {
	EntriesIn  int
	EntriesOut int
	BytesIn    int64
	BytesOut   int64

	NFCNormalized           int // entries where NFC normalization changed content
	ZeroWidthStripped       int // entries where zero-width / unusual whitespace was removed
	WhitespaceCanonicalized int // entries where whitespace collapse changed content
	StopWordsRemoved        int // <thinking> blocks where stop words were removed
	SynonymsSubstituted     int // <thinking> blocks where synonym substitution fired

	// ThinkingBlocksTrimmed counts <thinking> blocks reduced under
	// DropThinkingFirstSentence (kept first sentence only).
	ThinkingBlocksTrimmed int
	// ThinkingBlocksDropped counts <thinking> blocks removed entirely
	// under DropThinkingAll.
	ThinkingBlocksDropped int

	// Token estimates use a ~4 chars/token heuristic (Anthropic's rule of
	// thumb). They exist so callers can log a rough token-reduction number
	// without pulling in a BPE-heavy tokenizer. When a real tokenizer is
	// wired in later, swap the estimator in transforms.go and these fields
	// will reflect actual counts.
	TokensInEstimate  int64
	TokensOutEstimate int64
}

Stats reports what Compress did. Zero values mean no matches.

func Compress ¶

func Compress(r io.Reader, w io.Writer) (Stats, error)

Compress reads raw.jsonl entries from r, applies token-aware transforms, and writes the transformed stream to w. Equivalent to CompressWith with a zero Options value.

func CompressWith ¶

func CompressWith(r io.Reader, w io.Writer, opts Options) (Stats, error)

CompressWith is Compress with tunable options.

Guarantees:

Single pass over r, bounded memory.
Entry order preserved; nothing is dropped.
User turns and header entries are byte-identical on output.
Unknown top-level JSON fields survive round-trip.

func (*Stats) Add ¶

func (s *Stats) Add(other Stats)

Add accumulates other into s. Useful for aggregating across many sessions.

func (Stats) LogValue ¶

func (s Stats) LogValue() slog.Value

LogValue implements slog.LogValuer. Enables single-line key=value telemetry:

slog.Info("tokenstrip", "stats", stats)

func (Stats) Reduction ¶

func (s Stats) Reduction() (saved int64, pct float64)

Reduction returns bytes saved and percentage. Safe when BytesIn is zero.

func (Stats) TokenReduction ¶

func (s Stats) TokenReduction() (saved int64, pct float64)

TokenReduction returns estimated tokens saved and percentage.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL