Documentation
¶
Overview ¶
Package core provides core interfaces and utilities for tok.
The core package defines fundamental abstractions used throughout the codebase, including the CommandRunner interface for shell command execution and token estimation utilities.
CommandRunner Interface ¶
The CommandRunner interface abstracts shell command execution, enabling dependency injection and testability:
type CommandRunner interface {
Run(ctx context.Context, args []string) (output string, exitCode int, err error)
LookPath(name string) (string, error)
}
Token Estimation ¶
EstimateTokens provides a fast heuristic for counting tokens in text:
tokens := core.EstimateTokens("some text to analyze")
Index ¶
- Variables
- func CalculateSavings(tokensSaved int, model string) float64
- func CalculateTokensSaved(original, filtered string) int
- func EstimateTokens(text string) int
- func EstimateTokensFast(text string) int
- func EstimateTokensPrecise(text string) int
- func HasModelPricing(model string) bool
- func RegisterModelPricing(model string, input, output float64)
- func WarmupBPETokenizer()
- type BPETokenizer
- type BatchProcessor
- type BatchResult
- type CommandRunner
- type LimitedWriter
- type ModelPricing
- type OSCommandRunner
- type Processor
Constants ¶
This section is empty.
Variables ¶
var CommonModelPricing = map[string]ModelPricing{
"gpt-4o": {
Model: "gpt-4o",
InputPerMillion: 2.50,
OutputPerMillion: 10.00,
},
"gpt-4o-mini": {
Model: "gpt-4o-mini",
InputPerMillion: 0.15,
OutputPerMillion: 0.60,
},
"gpt-4.1": {
Model: "gpt-4.1",
InputPerMillion: 2.00,
OutputPerMillion: 8.00,
},
"gpt-4.1-mini": {
Model: "gpt-4.1-mini",
InputPerMillion: 0.40,
OutputPerMillion: 1.60,
},
"gpt-4.1-nano": {
Model: "gpt-4.1-nano",
InputPerMillion: 0.10,
OutputPerMillion: 0.40,
},
"claude-3.5-sonnet": {
Model: "claude-3.5-sonnet",
InputPerMillion: 3.00,
OutputPerMillion: 15.00,
},
"claude-3-haiku": {
Model: "claude-3-haiku",
InputPerMillion: 0.25,
OutputPerMillion: 1.25,
},
"claude-4-sonnet": {
Model: "claude-4-sonnet",
InputPerMillion: 3.00,
OutputPerMillion: 15.00,
},
"claude-4-opus": {
Model: "claude-4-opus",
InputPerMillion: 15.00,
OutputPerMillion: 75.00,
},
}
CommonModelPricing provides pricing for popular models (updated April 2026). Use RegisterModelPricing to add or override entries at runtime.
Functions ¶
func CalculateSavings ¶
CalculateSavings computes dollar savings from token reduction.
func CalculateTokensSaved ¶
CalculateTokensSaved computes token savings between original and filtered.
func EstimateTokens ¶
EstimateTokens is the single source of truth for token estimation. Uses BPE tokenization when available and loaded, falls back to heuristic. Optimized with fast path for short strings.
func EstimateTokensFast ¶
EstimateTokensFast provides a fast estimate without BPE. Use this when exact count isn't critical.
func EstimateTokensPrecise ¶
EstimateTokensPrecise always uses BPE, skipping the short-string heuristic. Use this for user-visible counts (savings dashboards, compression reports) where a 200-char heuristic bypass would produce misleading numbers. Falls back to fastEstimateTokens only if BPE initialization fails.
func HasModelPricing ¶
HasModelPricing reports whether a model has an explicit pricing entry.
func RegisterModelPricing ¶
RegisterModelPricing adds or overwrites pricing for a model at runtime. This allows users and plugins to keep pricing current without code changes.
func WarmupBPETokenizer ¶
func WarmupBPETokenizer()
WarmupBPETokenizer preloads the codec in a background goroutine. Call this during application startup to avoid latency on the first token estimation. Safe to call multiple times.
Types ¶
type BPETokenizer ¶
type BPETokenizer struct {
// contains filtered or unexported fields
}
BPETokenizer wraps tiktoken for accurate BPE token counting. P1.1: Replaces heuristic len/4 with real BPE tokenization. ~20-30% more accurate than heuristic estimation.
func (*BPETokenizer) Count ¶
func (b *BPETokenizer) Count(text string) int
Count returns the accurate BPE token count with caching.
type BatchProcessor ¶
type BatchProcessor struct {
// contains filtered or unexported fields
}
BatchProcessor processes multiple inputs concurrently.
func NewBatchProcessor ¶
func NewBatchProcessor(coordinator Processor, workers int) *BatchProcessor
func (*BatchProcessor) ProcessBatch ¶
func (bp *BatchProcessor) ProcessBatch(inputs []string) []BatchResult
ProcessBatch runs the processor on each input concurrently. Errors from the processor are captured in BatchResult.Error.
type BatchResult ¶
type CommandRunner ¶
type CommandRunner interface {
Run(ctx context.Context, args []string) (output string, exitCode int, err error)
LookPath(name string) (string, error)
}
CommandRunner abstracts shell command execution.
type LimitedWriter ¶
type LimitedWriter struct {
W io.Writer
N int64
Dropped int64
// contains filtered or unexported fields
}
LimitedWriter wraps an io.Writer and drops writes after N bytes. It is safe for concurrent use by multiple goroutines.
type ModelPricing ¶
type ModelPricing struct {
Model string
InputPerMillion float64 // Cost per 1M input tokens
OutputPerMillion float64 // Cost per 1M output tokens
}
ModelPricing holds per-token pricing for a model.
func GetModelPricing ¶
func GetModelPricing(model string) ModelPricing
GetModelPricing returns the pricing for a model, or the default if unknown.
type OSCommandRunner ¶
type OSCommandRunner struct {
Env []string
}
OSCommandRunner executes real shell commands using os/exec.
func NewOSCommandRunner ¶
func NewOSCommandRunner() *OSCommandRunner
NewOSCommandRunner creates a command runner with the current environment.
func (*OSCommandRunner) LookPath ¶
func (r *OSCommandRunner) LookPath(name string) (string, error)
LookPath resolves a command name to its full path.
func (*OSCommandRunner) Run ¶
Run executes a command and captures combined stdout+stderr. The binary name is validated for shell meta-characters to provide a clear error message; arguments are passed directly to exec.CommandContext which does not invoke a shell, making shell injection impossible. Output is capped at maxOutputSize to prevent OOM from runaway commands.