model

package

v0.6.4 (latest) Published: Jun 23, 2026 License: MIT Imports: 17 Imported by: 0

Repository

github.com/cnjack/jcode

Documentation ¶

Index ¶

Constants
Variables
func FormatAPIError(err error, attempt, maxRetries int) string
func GetModelContextLimit(modelName string) int
func GetTokenUsage() (prompt, completion, total int64)
func IsRetryable(_ context.Context, err error) bool
func NewChatModel(_ context.Context, cfg *ChatModelConfig) (einomodel.ToolCallingChatModel, error)
func ParseProviderModel(s string) (provider, model string, err error)
func ParseRetryAfter(err error) time.Duration
func ResetTokenUsage()
func ResolveContextLimit(reg *ModelRegistry, cfg *config.Config, providerID, modelID string) int
func SmartBackoff(ctx context.Context, attempt int) time.Duration
func UsageNotifierFromContext(ctx context.Context) func()
func ValidateProvider(ctx context.Context, apiKey, baseURL string) error
func WithRetryError(ctx context.Context, err error) context.Context
func WithTokenTracker(ctx context.Context, t *TokenUsage) context.Context
func WithUsageNotifier(ctx context.Context, fn func()) context.Context
type APIErrorCategory
- func ClassifyError(err error) APIErrorCategory
- func (c APIErrorCategory) String() string
type AddParams
type ChatModelConfig
type ContextOverflowInfo
- func ParseContextOverflow(err error) *ContextOverflowInfo
type ModelCost
type ModelFactory
- func NewModelFactory(cfg *config.Config, fallback einomodel.ToolCallingChatModel) *ModelFactory
- func (f *ModelFactory) Fallback() einomodel.ToolCallingChatModel
- func (f *ModelFactory) GetModel(ctx context.Context, providerModel string) (einomodel.ToolCallingChatModel, error)
- func (f *ModelFactory) Registry() *ModelRegistry
type ModelInfo
type ModelLimit
type ModelModalities
type ModelPricing
type ModelRegistry
- func NewModelRegistry() *ModelRegistry
- func NewModelRegistryWithConfig(cfg *config.Config) *ModelRegistry
- func (r *ModelRegistry) GetModelCacheCost(providerID, modelID string) (cacheReadPer1M, cacheWritePer1M float64)
- func (r *ModelRegistry) GetModelContextLimit(providerID, modelID string) int
- func (r *ModelRegistry) GetModelCost(providerID, modelID string) (inputPer1M, outputPer1M float64)
- func (r *ModelRegistry) GetProvider(providerID string) *RegistryProvider
- func (r *ModelRegistry) GetProviderAPI(providerID string) string
- func (r *ModelRegistry) GetProviderEnvVars(providerID string) []string
- func (r *ModelRegistry) HasProvider(providerID string) bool
- func (r *ModelRegistry) ListProviderModels(providerID string, toolCallOnly bool) []*RegistryModel
- func (r *ModelRegistry) ListProviders() []*RegistryProvider
- func (r *ModelRegistry) Load() (map[string]*RegistryProvider, error)
- func (r *ModelRegistry) LookupModel(providerID, modelID string) (*RegistryProvider, *RegistryModel, bool)
- func (r *ModelRegistry) MergeConfigProviders(providers map[string]*config.ProviderConfig)
type RegistryModel
type RegistryProvider
type TokenUsage
- func TokenTrackerFromContext(ctx context.Context) *TokenUsage
- func (t *TokenUsage) Add(p AddParams)
- func (t *TokenUsage) AddByModel(model string, prompt, completion, total int)
- func (t *TokenUsage) BeginTurn()
- func (t *TokenUsage) CacheHitRate() float64
- func (t *TokenUsage) CacheObserved() bool
- func (t *TokenUsage) Get() (prompt, completion, total int64)
- func (t *TokenUsage) GetByModel() map[string]int64
- func (t *TokenUsage) GetFull() TokenUsageDetail
- func (t *TokenUsage) GetLastDetail() *TokenUsageDetail
- func (t *TokenUsage) GetLastTotal() int64
- func (t *TokenUsage) Reset()
- func (t *TokenUsage) ResetContext()
- func (t *TokenUsage) TurnUsage() (prompt, completion, cached int64)
type TokenUsageDetail
- func (d TokenUsageDetail) Minus(prev TokenUsageDetail) TokenUsageDetail

Constants ¶

const DefaultContextLimitFallback = 200000

DefaultContextLimitFallback is the conservative context window assumed when a model's true limit cannot be determined from the registry, built-in tables, or user config. Kept deliberately small so unknown models compact early rather than overflowing the provider's real window. Override per-model via config.ContextLimits or globally via config.DefaultContextLimit.

Variables ¶

var TokenTracker = &TokenUsage{}

TokenTracker is the global, process-wide token usage tracker.

Functions ¶

func FormatAPIError ¶

func FormatAPIError(err error, attempt, maxRetries int) string

FormatAPIError produces a user-friendly error message with retry context.

func GetModelContextLimit ¶

func GetModelContextLimit(modelName string) int

GetModelContextLimit returns the known context limit for a given model name. Returns 0 if the model is not in the known list.

func GetTokenUsage ¶

func GetTokenUsage() (prompt, completion, total int64)

GetTokenUsage returns the current token usage statistics from the global TokenTracker.

func IsRetryable ¶

func IsRetryable(_ context.Context, err error) bool

IsRetryable returns true if the error should be retried. It is designed to be used as ModelRetryConfig.IsRetryAble in the Eino framework.

Context overflow errors are NOT retryable — they need compaction. Auth errors are NOT retryable — they need user action.

func NewChatModel ¶

func NewChatModel(_ context.Context, cfg *ChatModelConfig) (einomodel.ToolCallingChatModel, error)

func ParseProviderModel ¶

func ParseProviderModel(s string) (provider, model string, err error)

ParseProviderModel splits "provider/model" into its components.
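As an illustration of the "provider/model" convention, here is a minimal sketch. The helper name `splitProviderModel` is hypothetical, and the rule it applies (split at the first "/", reject empty halves) is an assumption; the real ParseProviderModel may validate differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitProviderModel is a hypothetical stand-in for ParseProviderModel.
// Assumption: the input is split at the first "/" and both halves must be
// non-empty, so a model ID may itself contain further slashes.
func splitProviderModel(s string) (provider, model string, err error) {
	provider, model, ok := strings.Cut(s, "/")
	if !ok || provider == "" || model == "" {
		return "", "", fmt.Errorf("expected \"provider/model\", got %q", s)
	}
	return provider, model, nil
}

func main() {
	p, m, err := splitProviderModel("openai/gpt-4o")
	fmt.Println(p, m, err)

	_, _, err = splitProviderModel("gpt-4o")
	fmt.Println(err)
}
```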

func ParseRetryAfter ¶

func ParseRetryAfter(err error) time.Duration

ParseRetryAfter extracts a delay from an error message or OpenAI APIError. It looks for Retry-After header patterns in the error text. Returns 0 if no delay information is found.

func ResetTokenUsage ¶

func ResetTokenUsage()

ResetTokenUsage resets the global token usage tracker.

func ResolveContextLimit ¶ added in v0.5.1

func ResolveContextLimit(reg *ModelRegistry, cfg *config.Config, providerID, modelID string) int

ResolveContextLimit determines the effective context window (in tokens) for a provider/model pair. This is the single source of truth for window-size management — all middleware thresholds (compaction, summarization, reduction, reminders) derive from it.

Resolution order (first positive hit wins):

1. explicit user override: cfg.ContextLimits["provider/model"], then cfg.ContextLimits["model"]
2. models.dev registry metadata (reg.GetModelContextLimit)
3. built-in knownModels fallback table (GetModelContextLimit)
4. cfg.DefaultContextLimit, else DefaultContextLimitFallback

reg and cfg may be nil; the resolver degrades gracefully.
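The resolution order above can be sketched as a chain of "first positive value wins" lookups. Everything in this example is a stand-in: `resolveLimit` and its parameters are hypothetical, with `overrides` playing the role of cfg.ContextLimits, `registry` of reg.GetModelContextLimit, `known` of GetModelContextLimit, and `defaultLimit` of cfg.DefaultContextLimit.

```go
package main

import "fmt"

// Mirrors the documented DefaultContextLimitFallback constant.
const defaultContextLimitFallback = 200000

// resolveLimit walks the documented order and returns the first positive hit.
// Nil maps and nil lookup functions are tolerated, mimicking the "degrades
// gracefully" behaviour described for nil reg and cfg.
func resolveLimit(
	overrides map[string]int,
	registry func(provider, model string) int,
	known func(model string) int,
	defaultLimit int,
	provider, model string,
) int {
	if n := overrides[provider+"/"+model]; n > 0 {
		return n
	}
	if n := overrides[model]; n > 0 {
		return n
	}
	if registry != nil {
		if n := registry(provider, model); n > 0 {
			return n
		}
	}
	if known != nil {
		if n := known(model); n > 0 {
			return n
		}
	}
	if defaultLimit > 0 {
		return defaultLimit
	}
	return defaultContextLimitFallback
}

func main() {
	overrides := map[string]int{"acme/big": 1000000}
	fmt.Println(resolveLimit(overrides, nil, nil, 0, "acme", "big"))
	fmt.Println(resolveLimit(nil, nil, nil, 0, "acme", "unknown"))
}
```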

func SmartBackoff ¶

func SmartBackoff(ctx context.Context, attempt int) time.Duration

SmartBackoff returns a delay for the given retry attempt, respecting server-sent Retry-After hints when available. It is designed to be used as ModelRetryConfig.BackoffFunc in the Eino framework.

Strategy (matching Claude-Code & OpenCode patterns):

1. If the error contains a Retry-After hint, use it (capped at 5 min).
2. Otherwise fall back to exponential backoff: 500ms × 2^(attempt-1), capped at 32s, plus 0-25% random jitter.

func UsageNotifierFromContext ¶ added in v0.6.3

func UsageNotifierFromContext(ctx context.Context) func()

UsageNotifierFromContext retrieves the per-call usage notifier, if any.

func ValidateProvider ¶ added in v0.3.10

func ValidateProvider(ctx context.Context, apiKey, baseURL string) error

ValidateProvider tests connectivity to a provider by making a lightweight GET /models request. Returns nil on success, or a descriptive error.

func WithRetryError ¶

func WithRetryError(ctx context.Context, err error) context.Context

WithRetryError stores an error in context for BackoffFunc to inspect.

func WithTokenTracker ¶

func WithTokenTracker(ctx context.Context, t *TokenUsage) context.Context

WithTokenTracker attaches a per-agent TokenUsage to the context. chatModel.Generate/Stream will increment this tracker in addition to the global TokenTracker.

func WithUsageNotifier ¶ added in v0.6.3

func WithUsageNotifier(ctx context.Context, fn func()) context.Context

WithUsageNotifier attaches a callback that chatModel.Generate/Stream invokes after each API call's usage has been recorded. UIs use it to refresh the token/context display in real time during a run, not just at turn end. The model layer stays provider/UI-agnostic — it only fires the opaque callback.

Types ¶

type APIErrorCategory ¶

type APIErrorCategory int

APIErrorCategory classifies LLM API errors into actionable categories.

const (
	// ErrCategoryTransient — network blips, timeouts, 5xx; safe to retry.
	ErrCategoryTransient APIErrorCategory = iota
	// ErrCategoryRateLimit — 429 / "overloaded"; retry with back-off.
	ErrCategoryRateLimit
	// ErrCategoryContextOverflow — input too long; needs compaction, NOT retry.
	ErrCategoryContextOverflow
	// ErrCategoryAuth — 401/403; permanent until key is fixed.
	ErrCategoryAuth
	// ErrCategoryFatal — 400 bad request, unknown; do not retry.
	ErrCategoryFatal
)

func ClassifyError ¶

func ClassifyError(err error) APIErrorCategory

ClassifyError determines the category of an API error.
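The category comments suggest a mapping from HTTP status and message text to a category, which in turn drives the retry decision described under IsRetryable. The sketch below is an assumption-laden illustration: `classify` works on a status code and message instead of an `error`, and the matched substrings ("context length", "overloaded") are guesses, not the package's actual patterns.

```go
package main

import (
	"fmt"
	"strings"
)

type category int

const (
	transient category = iota
	rateLimit
	contextOverflow
	auth
	fatal
)

// classify maps a status code and message to a category, following the
// comments on the APIErrorCategory constants.
func classify(status int, msg string) category {
	lower := strings.ToLower(msg)
	switch {
	case strings.Contains(lower, "context length"):
		return contextOverflow // needs compaction, not a retry
	case status == 429 || strings.Contains(lower, "overloaded"):
		return rateLimit
	case status == 401 || status == 403:
		return auth
	case status >= 500:
		return transient
	default:
		return fatal // 400 bad request and anything unknown
	}
}

// retryable reports whether a category is worth retrying.
func retryable(c category) bool {
	return c == transient || c == rateLimit
}

func main() {
	fmt.Println(retryable(classify(503, "bad gateway")))
	fmt.Println(retryable(classify(401, "invalid api key")))
	fmt.Println(retryable(classify(400, "maximum context length exceeded")))
}
```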

func (APIErrorCategory) String ¶

func (c APIErrorCategory) String() string

type AddParams ¶ added in v0.6.3

type AddParams struct {
	Prompt     int
	Completion int
	Total      int
	Cached     int
	Reasoning  int
	CacheWrite int
	// CacheDetailsPresent is true when the provider returned a
	// prompt_tokens_details object at all (even with cached_tokens:0), letting
	// CacheObserved tell "supports caching, 0 hits" apart from "never reports
	// caching". See https://platform.openai.com/docs/guides/prompt-caching.
	CacheDetailsPresent bool
}

AddParams carries one API call's token usage. Using a struct keeps the growing set of token categories from turning Add into a long positional list.

type ChatModelConfig ¶

type ChatModelConfig struct {
	Model   string
	APIKey  string
	BaseURL string
}

type ContextOverflowInfo ¶

type ContextOverflowInfo struct {
	ActualTokens int
	LimitTokens  int
	TokenGap     int // ActualTokens - LimitTokens
}

ContextOverflowInfo holds parsed token counts from an overflow error.

func ParseContextOverflow ¶

func ParseContextOverflow(err error) *ContextOverflowInfo

ParseContextOverflow extracts token counts from a context overflow error. Returns nil if the error is not a context overflow or counts cannot be parsed.
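One way to extract such counts is a regular expression over the error text. The sketch below assumes an OpenAI-style message of the form "maximum context length is N tokens ... resulted in M tokens"; the pattern, the `overflowInfo` type, and `parseOverflow` are illustrative only, and other providers word this error differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// overflowRe captures the limit first and the actual token count second.
// Assumption: the message follows the OpenAI-style wording shown in main.
var overflowRe = regexp.MustCompile(
	`maximum context length is (\d+) tokens.*?resulted in (\d+) tokens`)

type overflowInfo struct {
	Actual, Limit, Gap int // Gap = Actual - Limit
}

// parseOverflow returns nil when the message does not match or the
// numbers cannot be parsed.
func parseOverflow(msg string) *overflowInfo {
	m := overflowRe.FindStringSubmatch(msg)
	if m == nil {
		return nil
	}
	limit, err1 := strconv.Atoi(m[1])
	actual, err2 := strconv.Atoi(m[2])
	if err1 != nil || err2 != nil {
		return nil
	}
	return &overflowInfo{Actual: actual, Limit: limit, Gap: actual - limit}
}

func main() {
	msg := "This model's maximum context length is 128000 tokens. " +
		"However, your messages resulted in 130512 tokens."
	if info := parseOverflow(msg); info != nil {
		fmt.Printf("%+v\n", *info)
	}
}
```

The gap tells a compaction step how many tokens it must shed at minimum before the request can succeed.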

type ModelCost ¶

type ModelCost struct {
	Input      float64 `json:"input"`
	Output     float64 `json:"output"`
	CacheRead  float64 `json:"cache_read,omitempty"`
	CacheWrite float64 `json:"cache_write,omitempty"`
}

ModelCost describes token prices in USD per 1M tokens.

type ModelFactory ¶

type ModelFactory struct {
	// contains filtered or unexported fields
}

ModelFactory creates and caches ChatModel instances by "provider/model" identifier.

func NewModelFactory ¶

func NewModelFactory(cfg *config.Config, fallback einomodel.ToolCallingChatModel) *ModelFactory

NewModelFactory creates a model factory with the given config and fallback model. The signature takes no registry argument; the factory's registry is exposed through Registry.

func (*ModelFactory) Fallback ¶

func (f *ModelFactory) Fallback() einomodel.ToolCallingChatModel

Fallback returns the default fallback model.

func (*ModelFactory) GetModel ¶

func (f *ModelFactory) GetModel(ctx context.Context, providerModel string) (einomodel.ToolCallingChatModel, error)

GetModel returns a ChatModel for the given "provider/model" identifier. Empty string returns the fallback model.

func (*ModelFactory) Registry ¶

func (f *ModelFactory) Registry() *ModelRegistry

Registry returns the underlying ModelRegistry for metadata lookups.

type ModelInfo ¶

type ModelInfo struct {
	ID           string
	ContextLimit int // Maximum context window size, 0 if unknown
	Pricing      ModelPricing
}

ModelInfo contains the ID, context limit, and pricing of a model.

type ModelLimit ¶

type ModelLimit struct {
	Context int `json:"context"`
	Input   int `json:"input,omitempty"`
	Output  int `json:"output,omitempty"`
}

ModelLimit describes context window and output limits.

type ModelModalities ¶

type ModelModalities struct {
	Input  []string `json:"input,omitempty"`
	Output []string `json:"output,omitempty"`
}

ModelModalities describes input/output modalities.

type ModelPricing ¶

type ModelPricing struct {
	InputPer1M     float64 // cost per 1M input tokens
	OutputPer1M    float64 // cost per 1M output tokens
	CacheReadPer1M float64 // cost per 1M cache-read (cached input) tokens; 0 ⇒ no discount data, fall back to InputPer1M
}

ModelPricing contains cost information for a model.
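To show how per-1M prices and the cache-read fallback combine, here is a hedged cost calculation. The `pricing` type mirrors the ModelPricing fields, but `estimateCost` is hypothetical, and it assumes cached tokens are a subset of prompt tokens (as the TokenUsage description of CachedTokens suggests).

```go
package main

import "fmt"

// pricing mirrors the fields of ModelPricing (USD per 1M tokens).
type pricing struct {
	InputPer1M     float64
	OutputPer1M    float64
	CacheReadPer1M float64 // 0 means no discount data
}

// estimateCost prices one call in USD. Cached prompt tokens are billed at
// the cache-read rate, falling back to the input rate when it is 0.
func estimateCost(p pricing, prompt, cached, completion int64) float64 {
	cacheRate := p.CacheReadPer1M
	if cacheRate == 0 {
		cacheRate = p.InputPer1M
	}
	uncached := prompt - cached
	if uncached < 0 {
		uncached = 0 // guard against inconsistent provider counts
	}
	total := float64(uncached)*p.InputPer1M +
		float64(cached)*cacheRate +
		float64(completion)*p.OutputPer1M
	return total / 1e6
}

func main() {
	p := pricing{InputPer1M: 3, OutputPer1M: 15, CacheReadPer1M: 0.3}
	fmt.Printf("%.4f USD\n", estimateCost(p, 1000000, 500000, 100000))
}
```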

type ModelRegistry ¶

type ModelRegistry struct {
	// contains filtered or unexported fields
}

ModelRegistry provides model metadata from models.dev and custom config. The base data is statically generated at build time via go:generate. Custom models from config are merged in at runtime.

func NewModelRegistry ¶

func NewModelRegistry() *ModelRegistry

NewModelRegistry creates a new ModelRegistry with a deep copy of generated data. Each RegistryProvider and its Models map are copied so that merging custom models at runtime never mutates the shared generatedProviders.
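The copy-on-construct idea can be illustrated with a small sketch. The types and the `cloneProviders` helper are hypothetical simplifications; as in the description, each provider struct and its Models map are copied, while the model values themselves are shared (an assumption about how deep the real copy goes).

```go
package main

import "fmt"

type model struct{ ID string }

type provider struct {
	ID     string
	Models map[string]*model
}

// generated stands in for the shared, build-time generated provider table.
var generated = map[string]*provider{
	"acme": {ID: "acme", Models: map[string]*model{"m1": {ID: "m1"}}},
}

// cloneProviders copies each provider struct and its Models map so that
// later insertions never touch the shared source table.
func cloneProviders(src map[string]*provider) map[string]*provider {
	dst := make(map[string]*provider, len(src))
	for id, p := range src {
		cp := *p
		cp.Models = make(map[string]*model, len(p.Models))
		for mid, m := range p.Models {
			cp.Models[mid] = m
		}
		dst[id] = &cp
	}
	return dst
}

func main() {
	reg := cloneProviders(generated)
	reg["acme"].Models["custom"] = &model{ID: "custom"}
	fmt.Println(len(reg["acme"].Models), len(generated["acme"].Models))
}
```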

func NewModelRegistryWithConfig ¶ added in v0.4.8

func NewModelRegistryWithConfig(cfg *config.Config) *ModelRegistry

NewModelRegistryWithConfig creates a ModelRegistry and merges custom models from config.

func (*ModelRegistry) GetModelCacheCost ¶ added in v0.6.4

func (r *ModelRegistry) GetModelCacheCost(providerID, modelID string) (cacheReadPer1M, cacheWritePer1M float64)

GetModelCacheCost returns the cache-read and cache-write prices (USD per 1M tokens) for a model, or 0 when the registry has no cache pricing for it.

func (*ModelRegistry) GetModelContextLimit ¶

func (r *ModelRegistry) GetModelContextLimit(providerID, modelID string) int

GetModelContextLimit returns the context limit for a model looked up via registry.

func (*ModelRegistry) GetModelCost ¶

func (r *ModelRegistry) GetModelCost(providerID, modelID string) (inputPer1M, outputPer1M float64)

GetModelCost returns pricing info for a model.

func (*ModelRegistry) GetProvider ¶

func (r *ModelRegistry) GetProvider(providerID string) *RegistryProvider

GetProvider returns provider info by ID, or nil if not found.

func (*ModelRegistry) GetProviderAPI ¶

func (r *ModelRegistry) GetProviderAPI(providerID string) string

GetProviderAPI returns the API base URL for a provider from the registry.

func (*ModelRegistry) GetProviderEnvVars ¶

func (r *ModelRegistry) GetProviderEnvVars(providerID string) []string

GetProviderEnvVars returns the environment variable names for a provider.

func (*ModelRegistry) HasProvider ¶

func (r *ModelRegistry) HasProvider(providerID string) bool

HasProvider returns whether the given provider ID exists in the registry.

func (*ModelRegistry) ListProviderModels ¶

func (r *ModelRegistry) ListProviderModels(providerID string, toolCallOnly bool) []*RegistryModel

ListProviderModels returns models for a provider from the registry. If toolCallOnly is true, only models with tool_call support are returned. Models are sorted by ID.
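A filter-then-sort pass of this kind might look like the sketch below. The `registryModel` type keeps only the two fields that matter here, and `listModels` is a hypothetical stand-in that works on a plain map rather than a registry.

```go
package main

import (
	"fmt"
	"sort"
)

type registryModel struct {
	ID       string
	ToolCall bool
}

// listModels optionally filters to tool-calling models, then sorts by ID so
// the result is stable despite Go's randomized map iteration order.
func listModels(models map[string]*registryModel, toolCallOnly bool) []*registryModel {
	out := make([]*registryModel, 0, len(models))
	for _, m := range models {
		if toolCallOnly && !m.ToolCall {
			continue
		}
		out = append(out, m)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	models := map[string]*registryModel{
		"zeta":  {ID: "zeta", ToolCall: true},
		"alpha": {ID: "alpha", ToolCall: true},
		"mid":   {ID: "mid", ToolCall: false},
	}
	for _, m := range listModels(models, true) {
		fmt.Println(m.ID)
	}
}
```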

func (*ModelRegistry) ListProviders ¶

func (r *ModelRegistry) ListProviders() []*RegistryProvider

ListProviders returns all providers in the curated display order.

func (*ModelRegistry) Load ¶

func (r *ModelRegistry) Load() (map[string]*RegistryProvider, error)

Load returns the provider/model data.

func (*ModelRegistry) LookupModel ¶

func (r *ModelRegistry) LookupModel(providerID, modelID string) (*RegistryProvider, *RegistryModel, bool)

LookupModel finds a model by provider ID and model ID (the two halves of a "provider/model" identifier). Returns the provider info, model info, and whether it was found.

func (*ModelRegistry) MergeConfigProviders ¶ added in v0.4.8

func (r *ModelRegistry) MergeConfigProviders(providers map[string]*config.ProviderConfig)

MergeConfigProviders merges custom models from config providers into the registry. For providers not in the registry, a new entry is created. For existing providers, custom models are added (existing models are not overridden).
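The merge rule (create missing providers, add missing models, never override existing ones) can be sketched as follows. The types and the `mergeCustom` helper are hypothetical; in particular, custom models are passed as plain slices here rather than as config.ProviderConfig values.

```go
package main

import "fmt"

type model struct {
	ID      string
	Context int
}

type provider struct {
	ID     string
	Models map[string]*model
}

// mergeCustom adds custom models to reg. Unknown providers get a new entry;
// models that already exist in the registry are left untouched.
func mergeCustom(reg map[string]*provider, custom map[string][]*model) {
	for id, models := range custom {
		p, ok := reg[id]
		if !ok {
			p = &provider{ID: id}
			reg[id] = p
		}
		if p.Models == nil {
			p.Models = make(map[string]*model)
		}
		for _, m := range models {
			if _, exists := p.Models[m.ID]; !exists {
				p.Models[m.ID] = m
			}
		}
	}
}

func main() {
	reg := map[string]*provider{
		"acme": {ID: "acme", Models: map[string]*model{"m1": {ID: "m1", Context: 8000}}},
	}
	mergeCustom(reg, map[string][]*model{
		"acme":  {{ID: "m1", Context: 1}, {ID: "m2", Context: 32000}},
		"local": {{ID: "tiny", Context: 4000}},
	})
	fmt.Println(reg["acme"].Models["m1"].Context, len(reg["acme"].Models), len(reg))
}
```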

type RegistryModel ¶

type RegistryModel struct {
	ID               string           `json:"id"`
	Name             string           `json:"name"`
	Family           string           `json:"family,omitempty"`
	Attachment       bool             `json:"attachment,omitempty"`
	Reasoning        bool             `json:"reasoning,omitempty"`
	ToolCall         bool             `json:"tool_call,omitempty"`
	StructuredOutput bool             `json:"structured_output,omitempty"`
	Temperature      bool             `json:"temperature,omitempty"`
	Knowledge        string           `json:"knowledge,omitempty"`
	ReleaseDate      string           `json:"release_date,omitempty"`
	LastUpdated      string           `json:"last_updated,omitempty"`
	Modalities       *ModelModalities `json:"modalities,omitempty"`
	OpenWeights      bool             `json:"open_weights,omitempty"`
	Cost             *ModelCost       `json:"cost,omitempty"`
	Limit            *ModelLimit      `json:"limit,omitempty"`
	Status           string           `json:"status,omitempty"`
	Recommended      bool             `json:"recommended,omitempty"`
	DefaultEnabled   bool             `json:"default_enabled,omitempty"`
}

RegistryModel represents a model from models.dev API.

type RegistryProvider ¶

type RegistryProvider struct {
	ID     string                    `json:"id"`
	Name   string                    `json:"name"`
	Env    []string                  `json:"env"`
	API    string                    `json:"api"`
	Doc    string                    `json:"doc,omitempty"`
	Models map[string]*RegistryModel `json:"models"`
}

RegistryProvider represents a provider from models.dev API.

type TokenUsage ¶

type TokenUsage struct {
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
	CachedTokens     int64
	ReasoningTokens  int64
	CacheWriteTokens int64
	CallCount        int64 // number of API calls recorded (averages denominator)
	LastTotalTokens  int64
	// contains filtered or unexported fields
}

TokenUsage tracks token consumption across all API calls.

CachedTokens is the cache-READ portion of the prompt (tokens served from the provider's KV cache). CacheWriteTokens is the cache-CREATION portion; it is 0 today because the shared go-openai transport does not surface cache_creation_input_tokens, and is kept as a forward-compatible field. ReasoningTokens is the reasoning/thinking subset of the completion.

func TokenTrackerFromContext ¶

func TokenTrackerFromContext(ctx context.Context) *TokenUsage

TokenTrackerFromContext retrieves the per-agent TokenUsage from the context, if any.

func (*TokenUsage) Add ¶

func (t *TokenUsage) Add(p AddParams)

Add records one API call's token usage.

func (*TokenUsage) AddByModel ¶

func (t *TokenUsage) AddByModel(model string, prompt, completion, total int)

AddByModel adds token usage attributed to a specific model name.

func (*TokenUsage) BeginTurn ¶ added in v0.6.4

func (t *TokenUsage) BeginTurn()

BeginTurn snapshots the cumulative counters as the baseline for the current agent turn so TurnUsage reports only this turn's delta. Called at the start of every runner turn.

func (*TokenUsage) CacheHitRate ¶ added in v0.6.3

func (t *TokenUsage) CacheHitRate() float64

CacheHitRate returns the cumulative KV cache hit rate, defined as cached / prompt — the fraction of prompt tokens served from the provider's cache. Returns 0 when no prompt tokens have been recorded. The result is clamped to [0,1] to stay robust against provider quirks.
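The definition above reduces to a guarded division with clamping. A minimal sketch, with `hitRate` as a hypothetical free function over raw counters:

```go
package main

import "fmt"

// hitRate returns cached/prompt clamped to [0,1], or 0 when no prompt
// tokens have been recorded.
func hitRate(cached, prompt int64) float64 {
	if prompt <= 0 {
		return 0
	}
	r := float64(cached) / float64(prompt)
	if r < 0 {
		return 0
	}
	if r > 1 {
		return 1 // some providers may report cached > prompt
	}
	return r
}

func main() {
	fmt.Println(hitRate(250, 1000))
	fmt.Println(hitRate(0, 0))
	fmt.Println(hitRate(1200, 1000))
}
```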

func (*TokenUsage) CacheObserved ¶ added in v0.6.3

func (t *TokenUsage) CacheObserved() bool

CacheObserved reports whether the provider has reported cache details (a prompt_tokens_details object) — used to distinguish "cache hit rate is 0%" from "this provider never reports caching". It is true on the first turn that carries cache details even when cached_tokens is 0, and stays true for the session (cleared only by Reset). The CachedTokens>0 fallback keeps it correct for older snapshots recorded before the presence flag existed.

func (*TokenUsage) Get ¶

func (t *TokenUsage) Get() (prompt, completion, total int64)

Get returns the current cumulative prompt, completion, and total token counts.

func (*TokenUsage) GetByModel ¶

func (t *TokenUsage) GetByModel() map[string]int64

GetByModel returns a snapshot of per-model token totals.

func (*TokenUsage) GetFull ¶ added in v0.6.3

func (t *TokenUsage) GetFull() TokenUsageDetail

GetFull returns a cumulative snapshot of all tracked token usage.

func (*TokenUsage) GetLastDetail ¶ added in v0.4.4

func (t *TokenUsage) GetLastDetail() *TokenUsageDetail

GetLastDetail returns the last API call's token usage detail.

func (*TokenUsage) GetLastTotal ¶ added in v0.3.2

func (t *TokenUsage) GetLastTotal() int64

GetLastTotal returns the last API call's total tokens, which approximates current context usage.

func (*TokenUsage) Reset ¶

func (t *TokenUsage) Reset()

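Reset resets the token tracker, clearing all counters. Use it at a genuine session boundary; see ResetContext for the narrower post-compaction reset.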

func (*TokenUsage) ResetContext ¶ added in v0.6.4

func (t *TokenUsage) ResetContext()

ResetContext clears only the "current context occupancy" snapshot (the last API call's per-call values), leaving the cumulative consumption ledger, the cache-support flag, the per-model breakdown, and the per-turn baseline intact. Call this after a compaction/summarization shrinks the live context: the context indicator should reflect the smaller window, but the session's accumulated spend must NOT be lost — it feeds budgets, the usage log, and cross-session stats. (Full Reset is for a genuine session boundary.)

func (*TokenUsage) TurnUsage ¶ added in v0.6.4

func (t *TokenUsage) TurnUsage() (prompt, completion, cached int64)

TurnUsage returns this turn's consumption (cumulative minus the BeginTurn baseline). Each value is clamped at 0 so a mid-turn Reset (which zeroes the cumulative and the baseline together) can never yield a negative delta.
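The baseline-and-delta mechanism behind BeginTurn and TurnUsage can be sketched with a small tracker. The `tracker` type and its methods are hypothetical and track only three counters; the mutex is an assumption about concurrent use, not something the documentation states.

```go
package main

import (
	"fmt"
	"sync"
)

type tracker struct {
	mu                                     sync.Mutex
	prompt, completion, cached             int64 // cumulative totals
	basePrompt, baseCompletion, baseCached int64 // snapshot at turn start
}

func (t *tracker) add(prompt, completion, cached int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.prompt += prompt
	t.completion += completion
	t.cached += cached
}

// beginTurn snapshots the cumulative counters as the turn baseline.
func (t *tracker) beginTurn() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.basePrompt, t.baseCompletion, t.baseCached = t.prompt, t.completion, t.cached
}

// turnUsage returns cumulative minus baseline, each clamped at 0.
func (t *tracker) turnUsage() (prompt, completion, cached int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	clamp := func(v int64) int64 {
		if v < 0 {
			return 0
		}
		return v
	}
	return clamp(t.prompt - t.basePrompt),
		clamp(t.completion - t.baseCompletion),
		clamp(t.cached - t.baseCached)
}

// reset zeroes the cumulative counters and the baseline together.
func (t *tracker) reset() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.prompt, t.completion, t.cached = 0, 0, 0
	t.basePrompt, t.baseCompletion, t.baseCached = 0, 0, 0
}

func main() {
	tr := &tracker{}
	tr.add(100, 20, 50) // earlier turn
	tr.beginTurn()
	tr.add(30, 5, 10) // current turn
	fmt.Println(tr.turnUsage())
}
```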

type TokenUsageDetail ¶ added in v0.4.4

type TokenUsageDetail struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
	CachedTokens     int `json:"cached_tokens"`
	ReasoningTokens  int `json:"reasoning_tokens,omitempty"`
	CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
	CallCount        int `json:"call_count,omitempty"`
}

TokenUsageDetail holds a token usage snapshot for tracing/observability and for JSON transport to the UI. Reasoning/cache-write/call-count carry omitempty so per-call telemetry stays compact while cumulative snapshots (GetFull) carry the full breakdown.

func (TokenUsageDetail) Minus ¶ added in v0.6.3

func (d TokenUsageDetail) Minus(prev TokenUsageDetail) TokenUsageDetail

Minus returns the per-field difference d-prev, used to derive the token delta of a single agent run from cumulative snapshots.
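Deriving a per-run delta from two cumulative snapshots is a field-by-field subtraction. In this sketch `usageDetail` is a trimmed, hypothetical version of TokenUsageDetail with four of its fields:

```go
package main

import "fmt"

type usageDetail struct {
	Prompt, Completion, Total, Cached int
}

// minus returns the per-field difference d - prev.
func (d usageDetail) minus(prev usageDetail) usageDetail {
	return usageDetail{
		Prompt:     d.Prompt - prev.Prompt,
		Completion: d.Completion - prev.Completion,
		Total:      d.Total - prev.Total,
		Cached:     d.Cached - prev.Cached,
	}
}

func main() {
	before := usageDetail{Prompt: 1000, Completion: 200, Total: 1200, Cached: 400}
	after := usageDetail{Prompt: 1800, Completion: 350, Total: 2150, Cached: 900}
	// Snapshot before and after an agent run, then subtract.
	fmt.Printf("%+v\n", after.minus(before))
}
```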
