core

package

v0.5.1 Latest Latest Go to latest Published: May 12, 2026 License: MIT Imports: 14 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/GrayCodeAI/tok

Links

Open Source Insights

Documentation ¶

Overview ¶

Package core provides core interfaces and utilities for tok.

The core package defines fundamental abstractions used throughout the codebase, including the CommandRunner interface for shell command execution and token estimation utilities.

CommandRunner Interface ¶

The CommandRunner interface abstracts shell command execution, enabling dependency injection and testability:

type CommandRunner interface {
    Run(ctx context.Context, args []string) (output string, exitCode int, err error)
    LookPath(name string) (string, error)
}

Token Estimation ¶

EstimateTokens provides a fast heuristic for counting tokens in text:

tokens := core.EstimateTokens("some text to analyze")

Index ¶

Variables
func CalculateSavings(tokensSaved int, model string) float64
func CalculateTokensSaved(original, filtered string) int
func EstimateTokens(text string) int
func EstimateTokensFast(text string) int
func EstimateTokensPrecise(text string) int
func HasModelPricing(model string) bool
func RegisterModelPricing(model string, input, output float64)
func WarmupBPETokenizer()
type BPETokenizer
- func (b *BPETokenizer) Count(text string) int
type BatchProcessor
- func NewBatchProcessor(coordinator Processor, workers int) *BatchProcessor
- func (bp *BatchProcessor) ProcessBatch(inputs []string) []BatchResult
type BatchResult
type CommandRunner
type LimitedWriter
- func (lw *LimitedWriter) Write(p []byte) (n int, err error)
type ModelPricing
- func GetModelPricing(model string) ModelPricing
type OSCommandRunner
- func NewOSCommandRunner() *OSCommandRunner
- func (r *OSCommandRunner) LookPath(name string) (string, error)
- func (r *OSCommandRunner) Run(ctx context.Context, args []string) (string, int, error)
type Processor

Constants ¶

This section is empty.

Variables ¶

View Source

var CommonModelPricing = map[string]ModelPricing{

	"gpt-4o": {
		Model:            "gpt-4o",
		InputPerMillion:  2.50,
		OutputPerMillion: 10.00,
	},
	"gpt-4o-mini": {
		Model:            "gpt-4o-mini",
		InputPerMillion:  0.15,
		OutputPerMillion: 0.60,
	},
	"gpt-4.1": {
		Model:            "gpt-4.1",
		InputPerMillion:  2.00,
		OutputPerMillion: 8.00,
	},
	"gpt-4.1-mini": {
		Model:            "gpt-4.1-mini",
		InputPerMillion:  0.40,
		OutputPerMillion: 1.60,
	},
	"gpt-4.1-nano": {
		Model:            "gpt-4.1-nano",
		InputPerMillion:  0.10,
		OutputPerMillion: 0.40,
	},

	"claude-3.5-sonnet": {
		Model:            "claude-3.5-sonnet",
		InputPerMillion:  3.00,
		OutputPerMillion: 15.00,
	},
	"claude-3-haiku": {
		Model:            "claude-3-haiku",
		InputPerMillion:  0.25,
		OutputPerMillion: 1.25,
	},
	"claude-4-sonnet": {
		Model:            "claude-4-sonnet",
		InputPerMillion:  3.00,
		OutputPerMillion: 15.00,
	},
	"claude-4-opus": {
		Model:            "claude-4-opus",
		InputPerMillion:  15.00,
		OutputPerMillion: 75.00,
	},
}

CommonModelPricing provides pricing for popular models (updated April 2026). Use RegisterModelPricing to add or override entries at runtime.

Functions ¶

func CalculateSavings ¶

func CalculateSavings(tokensSaved int, model string) float64

CalculateSavings computes dollar savings from token reduction.

func CalculateTokensSaved ¶

func CalculateTokensSaved(original, filtered string) int

CalculateTokensSaved computes token savings between original and filtered.

func EstimateTokens ¶

func EstimateTokens(text string) int

EstimateTokens is the single source of truth for token estimation. Uses BPE tokenization when available and loaded, falls back to heuristic. Optimized with fast path for short strings.

func EstimateTokensFast ¶

func EstimateTokensFast(text string) int

EstimateTokensFast provides a fast estimate without BPE. Use this when exact count isn't critical.

func EstimateTokensPrecise ¶

func EstimateTokensPrecise(text string) int

EstimateTokensPrecise always uses BPE, skipping the short-string heuristic. Use this for user-visible counts (savings dashboards, compression reports) where a 200-char heuristic bypass would produce misleading numbers. Falls back to fastEstimateTokens only if BPE initialization fails.

func HasModelPricing ¶

func HasModelPricing(model string) bool

HasModelPricing reports whether a model has an explicit pricing entry.

func RegisterModelPricing ¶

func RegisterModelPricing(model string, input, output float64)

RegisterModelPricing adds or overwrites pricing for a model at runtime. This allows users and plugins to keep pricing current without code changes.

func WarmupBPETokenizer ¶

func WarmupBPETokenizer()

WarmupBPETokenizer preloads the codec in a background goroutine. Call this during application startup to avoid latency on the first token estimation. Safe to call multiple times.

Types ¶

type BPETokenizer ¶

type BPETokenizer struct {
	// contains filtered or unexported fields
}

BPETokenizer wraps tiktoken for accurate BPE token counting. P1.1: Replaces heuristic len/4 with real BPE tokenization. ~20-30% more accurate than heuristic estimation.

func (*BPETokenizer) Count ¶

func (b *BPETokenizer) Count(text string) int

Count returns the accurate BPE token count with caching.

type BatchProcessor ¶

type BatchProcessor struct {
	// contains filtered or unexported fields
}

BatchProcessor processes multiple inputs concurrently.

func NewBatchProcessor ¶

func NewBatchProcessor(coordinator Processor, workers int) *BatchProcessor

func (*BatchProcessor) ProcessBatch ¶

func (bp *BatchProcessor) ProcessBatch(inputs []string) []BatchResult

ProcessBatch runs the processor on each input concurrently. Errors from the processor are captured in BatchResult.Error.

type BatchResult ¶

type BatchResult struct {
	Index  int
	Output string
	Stats  interface{}
	Error  error
}

type CommandRunner ¶

type CommandRunner interface {
	Run(ctx context.Context, args []string) (output string, exitCode int, err error)
	LookPath(name string) (string, error)
}

CommandRunner abstracts shell command execution.

type LimitedWriter ¶

type LimitedWriter struct {
	W       io.Writer
	N       int64
	Dropped int64
	// contains filtered or unexported fields
}

LimitedWriter wraps an io.Writer and drops writes after N bytes. It is safe for concurrent use by multiple goroutines.

func (*LimitedWriter) Write ¶

func (lw *LimitedWriter) Write(p []byte) (n int, err error)

type ModelPricing ¶

type ModelPricing struct {
	Model            string
	InputPerMillion  float64 // Cost per 1M input tokens
	OutputPerMillion float64 // Cost per 1M output tokens
}

ModelPricing holds per-token pricing for a model.

func GetModelPricing ¶

func GetModelPricing(model string) ModelPricing

GetModelPricing returns the pricing for a model, or the default if unknown.

type OSCommandRunner ¶

type OSCommandRunner struct {
	Env []string
}

OSCommandRunner executes real shell commands using os/exec.

func NewOSCommandRunner ¶

func NewOSCommandRunner() *OSCommandRunner

NewOSCommandRunner creates a command runner with the current environment.

func (*OSCommandRunner) LookPath ¶

func (r *OSCommandRunner) LookPath(name string) (string, error)

LookPath resolves a command name to its full path.

func (*OSCommandRunner) Run ¶

func (r *OSCommandRunner) Run(ctx context.Context, args []string) (string, int, error)

Run executes a command and captures combined stdout+stderr. The binary name is validated for shell meta-characters to provide a clear error message; arguments are passed directly to exec.CommandContext which does not invoke a shell, making shell injection impossible. Output is capped at maxOutputSize to prevent OOM from runaway commands.

type Processor ¶

type Processor interface {
	Process(string) (string, interface{}, error)
}

Processor interface for batch processing The error return allows callers to detect failures per item.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL