modelprovider

package
v0.1.7
Published: Mar 10, 2026 License: Apache-2.0 Imports: 21 Imported by: 0

Documentation

Constants

const (
	DefaultGeminiModel      = "gemini-3-flash-preview"
	DefaultAnthropicModel   = "claude-sonnet-4-6"
	DefaultOpenAIModel      = "gpt-4o"
	DefaultOllamaModel      = "llama3"
	DefaultHuggingFaceModel = "bert-base-uncased"
)
const DefaultOllamaModelForSetup = "llama3.2:3b"

DefaultOllamaModelForSetup is the default model offered during setup. It runs well on Mac (Apple Silicon) and is pulled behind the scenes if not already present.

const DefaultOllamaURL = "http://localhost:11434"

DefaultOllamaURL is the default base URL for a local Ollama server.

Variables

This section is empty.

Functions

func EchoCheckModel

func EchoCheckModel(ctx context.Context, m model.Model) error

EchoCheckModel performs a minimal GenerateContent call (echo check) to verify that the given model's credentials are valid. It returns nil if the model responds successfully, and an error if the API returns an auth error (e.g. 401) or a system-level failure. Use this to validate tokens before using a provider.

func ListModels

func ListModels(ctx context.Context, url string) ([]string, error)

ListModels returns the names of models available on the Ollama server at url. If url is empty, DefaultOllamaURL is used. Returns an error if the request fails or the response cannot be parsed.
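
The lookup described above can be sketched as follows. This is not the package's implementation: it assumes ListModels queries Ollama's GET /api/tags endpoint, and listModels, tagsResponse, and demoListModels are hypothetical names. The demo runs against an in-process fake server so it works without Ollama installed.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// defaultOllamaURL mirrors DefaultOllamaURL from this package.
const defaultOllamaURL = "http://localhost:11434"

// tagsResponse is the subset of Ollama's GET /api/tags payload used here.
type tagsResponse struct {
	Models []struct {
		Name string `json:"name"`
	} `json:"models"`
}

// listModels is a hypothetical stand-in for ListModels. It falls back to the
// default URL when url is empty and returns the model names from /api/tags.
func listModels(ctx context.Context, url string) ([]string, error) {
	if url == "" {
		url = defaultOllamaURL
	}
	endpoint := strings.TrimRight(url, "/") + "/api/tags"
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, fmt.Errorf("list models: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("list models: unexpected status %s", resp.Status)
	}
	var tags tagsResponse
	if err := json.NewDecoder(resp.Body).Decode(&tags); err != nil {
		return nil, fmt.Errorf("parse response: %w", err)
	}
	names := make([]string, 0, len(tags.Models))
	for _, m := range tags.Models {
		names = append(names, m.Name)
	}
	return names, nil
}

// demoListModels runs listModels against a fake Ollama server.
func demoListModels() ([]string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/tags" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"models":[{"name":"llama3:latest"},{"name":"llama3.2:3b"}]}`)
	}))
	defer srv.Close()
	return listModels(context.Background(), srv.URL)
}

func main() {
	names, err := demoListModels()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

With the real package, the equivalent call would be ListModels(ctx, "") to query the default local server.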

func NewOpenAICompletionModel added in v0.1.7

func NewOpenAICompletionModel(name string, opts ...option.RequestOption) model.Model

NewOpenAICompletionModel creates a new model that uses the OpenAI completions endpoint.

func OllamaReachable

func OllamaReachable(ctx context.Context, url string) bool

OllamaReachable reports whether an Ollama server is reachable at the given URL. If url is empty, DefaultOllamaURL is used. The check uses a short timeout so setup does not block if Ollama is not running.
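
A reachability probe of this kind might look like the sketch below. The 2-second timeout and the choice to GET the server root are assumptions (the documentation only says "a short timeout"), and reachable and demoReachable are hypothetical names, not part of the package.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// defaultOllamaURL mirrors DefaultOllamaURL from this package.
const defaultOllamaURL = "http://localhost:11434"

// reachable is a hypothetical stand-in for OllamaReachable. It never returns
// an error: any failure, including a timeout, is reported as false.
func reachable(ctx context.Context, url string) bool {
	if url == "" {
		url = defaultOllamaURL
	}
	// Assumed timeout value; bounds the probe so setup cannot hang.
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return false
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

// demoReachable probes a live fake server, then the same address after the
// server has been shut down.
func demoReachable() (up, down bool) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "Ollama is running")
	}))
	url := srv.URL
	up = reachable(context.Background(), url)
	srv.Close()
	down = reachable(context.Background(), url)
	return up, down
}

func main() {
	up, down := demoReachable()
	fmt.Println("while running:", up, "after shutdown:", down)
}
```

Returning a plain bool keeps the caller's setup flow simple: it can offer Ollama as an option only when the probe succeeds.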

func PullModel

func PullModel(ctx context.Context, url, model string) error

PullModel pulls the named model from the Ollama library to the local server at url. If url is empty, DefaultOllamaURL is used. The pull runs with stream disabled so it blocks until the model is fully pulled. Use a context with a long timeout (e.g. 10+ minutes).
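
The blocking pull described above could be sketched as below, assuming it maps to Ollama's POST /api/pull with "stream": false. The "model" request field is an assumption (older Ollama releases used "name"), and pullModel, pullRequest, and demoPull are hypothetical names. The demo uses a fake server and the 10-minute context suggested above.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// pullRequest is the assumed body for Ollama's POST /api/pull.
type pullRequest struct {
	Model  string `json:"model"`
	Stream bool   `json:"stream"`
}

// pullModel is a hypothetical stand-in for PullModel. With stream disabled the
// server answers once, after the pull has finished, so this call blocks.
func pullModel(ctx context.Context, url, model string) error {
	body, err := json.Marshal(pullRequest{Model: model, Stream: false})
	if err != nil {
		return fmt.Errorf("encode request: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url+"/api/pull", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Errorf("pull %s: %w", model, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("pull %s: unexpected status %s", model, resp.Status)
	}
	var out struct {
		Status string `json:"status"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return fmt.Errorf("parse response: %w", err)
	}
	if out.Status != "success" {
		return fmt.Errorf("pull %s: status %q", model, out.Status)
	}
	return nil
}

// demoPull pulls from a fake server and returns the model name it received.
func demoPull() (string, error) {
	pulled := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in pullRequest
		if r.URL.Path != "/api/pull" || json.NewDecoder(r.Body).Decode(&in) != nil || in.Stream {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		pulled <- in.Model
		fmt.Fprint(w, `{"status":"success"}`)
	}))
	defer srv.Close()

	// A long timeout, since a real pull may download several gigabytes.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()
	if err := pullModel(ctx, srv.URL, "llama3.2:3b"); err != nil {
		return "", err
	}
	return <-pulled, nil
}

func main() {
	model, err := demoPull()
	fmt.Println(model, err)
}
```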

Types

type ModelConfig

type ModelConfig struct {
	Providers ProviderConfigs `json:"providers" yaml:"providers,omitempty" toml:"providers,omitempty"`
}

func DefaultModelConfig

func DefaultModelConfig(ctx context.Context, sp security.SecretProvider) ModelConfig

DefaultModelConfig builds the default model configuration by resolving API keys through the given SecretProvider. Each provider is added only if its API key is present. Callers that have no SecretProvider of their own can pass security.NewEnvProvider() to preserve the legacy os.Getenv behavior.
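
The "added only if its API key is present" rule can be illustrated with stand-in types. Everything below is hypothetical: the secretProvider interface and its GetSecret method, the secret names, and the provider identifiers are assumptions made for the sketch and are not taken from the security package or from this one. Only the default model names come from the constants above.

```go
package main

import (
	"context"
	"fmt"
)

// secretProvider is a hypothetical stand-in for security.SecretProvider; the
// real interface's method set is not shown in this documentation.
type secretProvider interface {
	GetSecret(ctx context.Context, name string) (string, error)
}

// mapSecrets serves secrets from an in-memory map, for demonstration only.
type mapSecrets map[string]string

func (m mapSecrets) GetSecret(_ context.Context, name string) (string, error) {
	return m[name], nil
}

// providerConfig is a trimmed stand-in for ProviderConfig.
type providerConfig struct {
	Provider  string
	ModelName string
}

// defaultProviders adds a provider only when its key resolves to a non-empty
// value. The secret names and provider identifiers are assumed.
func defaultProviders(ctx context.Context, sp secretProvider) []providerConfig {
	candidates := []struct{ provider, model, secret string }{
		{"gemini", "gemini-3-flash-preview", "GEMINI_API_KEY"},
		{"anthropic", "claude-sonnet-4-6", "ANTHROPIC_API_KEY"},
		{"openai", "gpt-4o", "OPENAI_API_KEY"},
	}
	var out []providerConfig
	for _, c := range candidates {
		key, err := sp.GetSecret(ctx, c.secret)
		if err != nil || key == "" {
			continue // no key, so the provider is left out
		}
		out = append(out, providerConfig{Provider: c.provider, ModelName: c.model})
	}
	return out
}

// demoDefaults resolves providers when only one key is configured.
func demoDefaults() []providerConfig {
	secrets := mapSecrets{"ANTHROPIC_API_KEY": "sk-test"}
	return defaultProviders(context.Background(), secrets)
}

func main() {
	fmt.Printf("%+v\n", demoDefaults())
}
```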

func (ModelConfig) NewEnvBasedModelProvider

func (c ModelConfig) NewEnvBasedModelProvider() ModelProvider

func (*ModelConfig) ValidateAndFilter

func (c *ModelConfig) ValidateAndFilter(ctx context.Context, sp security.SecretProvider, opts ...ValidateAndFilterOption) error

ValidateAndFilter mutates c.Providers, keeping only the providers that pass Validate and (unless skipped) EchoCheckModel. Providers that fail credential validation or the echo check (invalid token, 401) are excluded with a warning. It returns an error if no providers remain after filtering.
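
The filtering behavior can be sketched with stand-in types. This is an illustration under assumptions, not the package's code: provider, config, and echoStub are hypothetical, the echo check is injected as a function instead of calling a real model, and the skip flag is a plain bool where the real method takes options.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// provider is a trimmed stand-in for ProviderConfig.
type provider struct {
	Name  string
	Token string
}

// validate rejects providers without credentials.
func (p provider) validate() error {
	if p.Token == "" {
		return fmt.Errorf("provider %q: missing token", p.Name)
	}
	return nil
}

type config struct {
	Providers []provider
}

// validateAndFilter keeps providers that pass validate and, unless skipEcho is
// set, the echo check. It replaces c.Providers with the surviving entries.
func (c *config) validateAndFilter(echo func(provider) error, skipEcho bool) error {
	kept := make([]provider, 0, len(c.Providers))
	for _, p := range c.Providers {
		if err := p.validate(); err != nil {
			log.Printf("warning: excluding provider: %v", err)
			continue
		}
		if !skipEcho {
			if err := echo(p); err != nil {
				log.Printf("warning: excluding provider %q: %v", p.Name, err)
				continue
			}
		}
		kept = append(kept, p)
	}
	c.Providers = kept
	if len(kept) == 0 {
		return errors.New("no usable providers remain after filtering")
	}
	return nil
}

// echoStub simulates an API that rejects one specific token with a 401.
func echoStub(p provider) error {
	if p.Token == "expired" {
		return errors.New("401 unauthorized")
	}
	return nil
}

// demoFilter returns the names that survive filtering.
func demoFilter(skipEcho bool) ([]string, error) {
	c := config{Providers: []provider{
		{Name: "a", Token: "valid"},
		{Name: "b", Token: ""},
		{Name: "c", Token: "expired"},
	}}
	err := c.validateAndFilter(echoStub, skipEcho)
	names := make([]string, 0, len(c.Providers))
	for _, p := range c.Providers {
		names = append(names, p.Name)
	}
	return names, err
}

func main() {
	fmt.Println(demoFilter(false))
	fmt.Println(demoFilter(true))
}
```

Skipping the echo check keeps provider "c" in this demo, because its token is only rejected by the simulated API call.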

type ModelMap

type ModelMap map[string]model.Model

ModelMap is a map of model names to models.

func (ModelMap) GetAny

func (m ModelMap) GetAny() model.Model

func (ModelMap) Providers

func (m ModelMap) Providers() []string

type ModelProvider

type ModelProvider interface {
	GetModel(ctx context.Context, taskType TaskType) (ModelMap, error)
}

func NewSingleModelProvider added in v0.1.7

func NewSingleModelProvider(key string, model model.Model) ModelProvider

type ProviderConfig

type ProviderConfig struct {
	Name        string   `json:"name" yaml:"name,omitempty" toml:"name,omitempty"`
	Provider    string   `json:"provider" yaml:"provider,omitempty" toml:"provider,omitempty"`
	ModelName   string   `json:"model_name" yaml:"model_name,omitempty" toml:"model_name,omitempty"`
	Variant     string   `json:"variant" yaml:"variant,omitempty" toml:"variant,omitempty"`
	Token       string   `json:"token" yaml:"token,omitempty" toml:"token,omitempty"`
	Host        string   `json:"host" yaml:"host,omitempty" toml:"host,omitempty"`
	GoodForTask TaskType `json:"good_for_task" yaml:"good_for_task,omitempty" toml:"good_for_task,omitempty"`
	// EnableTokenTailoring, when true (the default), trims conversation history to fit the model's context window (arXiv:2601.14192).
	// Set to false to disable (e.g. debugging or when the provider handles context itself).
	EnableTokenTailoring *bool `json:"enable_token_tailoring,omitempty" yaml:"enable_token_tailoring,omitempty" toml:"enable_token_tailoring,omitempty"`

	MaxTokens *int `json:"max_tokens,omitempty" yaml:"max_tokens,omitempty" toml:"max_tokens,omitempty"`
}
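
The pointer-typed optional fields can be exercised with a small decoding sketch. The struct below copies a subset of the fields and JSON tags shown above. The sample provider values and the tokenTailoringEnabled and parseProviders helpers are hypothetical. Treating a nil EnableTokenTailoring as true follows the field comment, though how the package itself applies that default is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type TaskType string

// ProviderConfig copies a subset of the fields documented above.
type ProviderConfig struct {
	Name                 string   `json:"name"`
	Provider             string   `json:"provider"`
	ModelName            string   `json:"model_name"`
	GoodForTask          TaskType `json:"good_for_task"`
	EnableTokenTailoring *bool    `json:"enable_token_tailoring,omitempty"`
	MaxTokens            *int     `json:"max_tokens,omitempty"`
}

// tokenTailoringEnabled is a hypothetical helper: an omitted (nil) field means
// the documented default of true.
func tokenTailoringEnabled(p ProviderConfig) bool {
	return p.EnableTokenTailoring == nil || *p.EnableTokenTailoring
}

// sample uses made-up provider identifiers for illustration.
const sample = `{
  "providers": [
    {"name": "main", "provider": "anthropic", "model_name": "claude-sonnet-4-6", "good_for_task": "planning"},
    {"name": "local", "provider": "ollama", "model_name": "llama3", "good_for_task": "efficiency",
     "enable_token_tailoring": false, "max_tokens": 4096}
  ]
}`

// parseProviders decodes the "providers" array of a ModelConfig-shaped document.
func parseProviders(data []byte) ([]ProviderConfig, error) {
	var cfg struct {
		Providers []ProviderConfig `json:"providers"`
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return cfg.Providers, nil
}

func main() {
	providers, err := parseProviders([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range providers {
		fmt.Println(p.Name, p.ModelName, "tailoring:", tokenTailoringEnabled(p))
	}
}
```

Using *bool and *int lets the configuration distinguish "not set" from an explicit false or zero.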

func (ProviderConfig) String

func (p ProviderConfig) String() string

func (ProviderConfig) Validate

Validate returns an error if this provider is not usable (e.g. missing API key). It uses the given SecretProvider to resolve env-based keys when Token is empty. Call this before using the provider so the server never starts with invalid credentials.

type ProviderConfigs

type ProviderConfigs []ProviderConfig

func (ProviderConfigs) Providers

func (providers ProviderConfigs) Providers() []string

type TaskType

type TaskType string

TaskType represents different categories of tasks that LLMs are benchmarked against. These task types help in selecting the most appropriate model based on the specific requirements of the work being performed.

const (
	// TaskToolCalling represents tasks requiring reliable generation of executable code and API calls.
	// Benchmark: BFCL v4 (Berkeley Function Calling Leaderboard)
	// Top performers: Llama 3.1 405B Instruct (88.50%), Claude Opus 4.5 FC (77.47%)
	// Use this for: Function calling, API integration, structured code generation
	TaskToolCalling TaskType = "tool_calling"

	// TaskPlanning represents tasks involving agentic planning and coding for real-world software engineering.
	// Benchmark: SWE-Bench (Software Engineering Benchmark)
	// Top performers: Claude Sonnet 4.5 Parallel (82.00%), Claude Opus 4.5 (80.90%)
	// Use this for: Complex refactoring, multi-file changes, architectural decisions
	TaskPlanning TaskType = "planning"

	// TaskCoding represents pure code generation, algorithmic problem solving, and script writing.
	// Benchmarks: HumanEval (pioneered by Codex), MBPP, LiveCodeBench
	// Top performers: Claude Sonnet 4.5, GPT-5.2
	// Use this for: Single-function generation, copilot-style autocomplete, algorithmic coding
	TaskCoding TaskType = "coding"

	// TaskTerminalCalling represents tasks requiring precision in command-line interfaces and terminal operations.
	// Benchmark: Terminal Execution Bench 2.0
	// Top performers: Claude Sonnet 4.5 (61.30%), Claude Opus 4.5 (59.30%)
	// Use this for: Shell scripting, CLI tool usage, system administration tasks
	TaskTerminalCalling TaskType = "terminal_calling"

	// TaskScientificReasoning represents tasks requiring PhD-level scientific reasoning and logic.
	// Benchmark: GPQA Diamond (Graduate-Level Google-Proof Q&A)
	// Top performers: Gemini 3 Pro Deep Think (93.80%), GPT-5.2 (92.40%)
	// Use this for: Complex analysis, research tasks, domain-specific expertise
	TaskScientificReasoning TaskType = "scientific_reasoning"

	// TaskNovelReasoning represents tasks testing abstract visual pattern solving for never-before-seen problems.
	// Benchmark: ARC-AGI 2 (Abstraction and Reasoning Corpus)
	// Top performers: GPT-5.2 Pro High (54.20%), Poetiq Gemini 3 Pro Refine (54.00%)
	// Use this for: Novel problem-solving, pattern recognition, creative solutions
	TaskNovelReasoning TaskType = "novel_reasoning"

	// TaskGeneralTask represents broad knowledge tasks and general reasoning capabilities.
	// Benchmark: Humanity's Last Exam
	// Top performers: Gemini 3 Pro Deep Think (41.00%), Gemini 3 Pro Standard (37.50%)
	// Use this for: General knowledge queries, broad reasoning, interdisciplinary tasks
	TaskGeneralTask TaskType = "general_task"

	// TaskMathematical represents high-level competition mathematics and quantitative reasoning.
	// Benchmark: AIME 2025 (American Invitational Mathematics Examination)
	// Top performers: GPT-5.2 (100.00%), Gemini 3 Pro (100.00%), Grok 4.1 Heavy (100.00%)
	// Use this for: Mathematical proofs, quantitative analysis, algorithmic optimization
	TaskMathematical TaskType = "mathematical"

	// TaskLongHorizonAutonomy represents extended autonomous operation capabilities.
	// Benchmark: METR (measured in minutes before 50% failure rate)
	// Top performers: GPT-5 Medium (137.3 min), Claude Sonnet 4.5 (113.3 min)
	// Use this for: Long-running autonomous agents, multi-step workflows, sustained reasoning
	TaskLongHorizonAutonomy TaskType = "long_horizon_autonomy"

	// TaskEfficiency represents operational speed and cost efficiency considerations.
	// Benchmarks: Throughput (tokens/sec) and Cost (USD per 1M input tokens)
	// Throughput leaders: Llama 4 Scout (2600 t/s), Grok 4.1 (455 t/s)
	// Cost leaders: Grok 4.1 ($0.20), Gemini 3 Flash ($0.50)
	// Use this for: High-volume processing, cost-sensitive operations, real-time applications
	TaskEfficiency TaskType = "efficiency"

	// TaskSummarizer represents tasks requiring large-context summarization of
	// verbose tool outputs. Typically mapped to a model with a very large context
	// window (e.g., 1M tokens) so it can ingest and compress raw HTML, API
	// responses, or other bulk data before handing it back to a smaller agent.
	// Use this for: Auto-summarizing oversized tool results, condensing documents
	TaskSummarizer TaskType = "summarizer"

	// TaskComputerOperations represents native computer-use capabilities.
	// Benchmarks: OSWorld-Verified, WebArena Verified, APEX-Agents
	// Top performers: GPT-5.4 Pro
	// Use this for: Operating applications via keyboard and mouse commands
	TaskComputerOperations TaskType = "computer_operations"
)
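
One way task types could drive model selection is sketched below. It is only an illustration: providersFor is a hypothetical helper, and falling back to every provider when none is tagged for the task is an assumption, not documented behavior of GetModel.

```go
package main

import "fmt"

type TaskType string

// A few of the task types defined above.
const (
	TaskToolCalling TaskType = "tool_calling"
	TaskCoding      TaskType = "coding"
	TaskSummarizer  TaskType = "summarizer"
)

// providerConfig is a trimmed stand-in for ProviderConfig.
type providerConfig struct {
	Name        string
	GoodForTask TaskType
}

// providersFor returns the names of providers tagged for task. If none match,
// it returns all provider names (an assumed fallback).
func providersFor(task TaskType, all []providerConfig) []string {
	var matched, every []string
	for _, p := range all {
		every = append(every, p.Name)
		if p.GoodForTask == task {
			matched = append(matched, p.Name)
		}
	}
	if len(matched) > 0 {
		return matched
	}
	return every
}

func main() {
	cfg := []providerConfig{
		{Name: "sonnet", GoodForTask: TaskCoding},
		{Name: "flash", GoodForTask: TaskSummarizer},
	}
	fmt.Println(providersFor(TaskCoding, cfg))
	fmt.Println(providersFor(TaskToolCalling, cfg))
}
```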

type ValidateAndFilterOption

type ValidateAndFilterOption func(*validateAndFilterOptions)

ValidateAndFilterOption configures ValidateAndFilter behavior.

func SkipEchoCheck

func SkipEchoCheck() ValidateAndFilterOption

SkipEchoCheck disables the per-provider echo check (useful in tests to avoid real API calls).
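
ValidateAndFilterOption follows the functional options pattern. A minimal sketch of how such an option is typically defined and applied is given below. The skipEchoCheck field name and the resolveOptions helper are assumptions, since the unexported validateAndFilterOptions struct is not documented.

```go
package main

import "fmt"

// validateAndFilterOptions holds the resolved settings. The field name is an
// assumption; the real struct is unexported and not shown in the docs.
type validateAndFilterOptions struct {
	skipEchoCheck bool
}

// ValidateAndFilterOption mutates the options struct, as in the package.
type ValidateAndFilterOption func(*validateAndFilterOptions)

// SkipEchoCheck returns an option that turns the echo check off.
func SkipEchoCheck() ValidateAndFilterOption {
	return func(o *validateAndFilterOptions) {
		o.skipEchoCheck = true
	}
}

// resolveOptions is a hypothetical helper that starts from the zero value
// (echo check enabled) and applies each option in order.
func resolveOptions(opts ...ValidateAndFilterOption) validateAndFilterOptions {
	var o validateAndFilterOptions
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Println("default skip:", resolveOptions().skipEchoCheck)
	fmt.Println("with option: ", resolveOptions(SkipEchoCheck()).skipEchoCheck)
}
```

In a test, the real call would pass the option as the trailing argument, for example cfg.ValidateAndFilter(ctx, sp, SkipEchoCheck()).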
