Documentation ¶
Overview ¶
Package llms provides interfaces and utilities for LLM providers.
Index ¶
- Constants
- func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
- func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
- type CallOption
- func WithContextLength(length int) CallOption
- func WithJSONMode(enabled bool) CallOption
- func WithJSONSchema(schema any) CallOption
- func WithKeepAlive(keepAlive string) CallOption
- func WithMaxTokens(maxTokens int) CallOption
- func WithMinP(minP float64) CallOption
- func WithModel(model string) CallOption
- func WithSeed(seed int) CallOption
- func WithStopWords(stopWords []string) CallOption
- func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
- func WithTemperature(temperature float64) CallOption
- func WithThink(think any) CallOption
- func WithTools(tools []ToolDefinition) CallOption
- func WithTopK(topK int) CallOption
- func WithTopP(topP float64) CallOption
- type CallOptions
- type FunctionCall
- type FunctionDefinition
- type LLMReranker
- type LLMRerankerOption
- type Model
- type Tokenizer
- type ToolCall
- type ToolDefinition
Constants ¶
const RerankPromptDefault = `` /* 580-byte string literal not displayed */
RerankPromptDefault is the default prompt template for LLM-based reranking.
Variables ¶
This section is empty.
Functions ¶
func GenerateFromSinglePrompt ¶
func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
GenerateFromSinglePrompt generates a response from a single prompt. It wraps the prompt in a human message and returns the generated text.
func TextParts ¶
func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
TextParts creates a MessageContent with multiple text parts.
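A sketch of the helper's shape, again with hypothetical local types standing in for the schema package:

```go
// textParts mirrors llms.TextParts: one message content, many text parts.
package main

import "fmt"

// chatMessageType and messageContent are local stand-ins for the
// schema package's types.
type chatMessageType string

type messageContent struct {
	Role  chatMessageType
	Parts []string
}

// textParts collects the variadic strings as parts under one role.
func textParts(role chatMessageType, parts ...string) messageContent {
	return messageContent{Role: role, Parts: parts}
}

func main() {
	mc := textParts("human", "Describe this image.", "Focus on the colors.")
	fmt.Println(mc.Role, len(mc.Parts)) // human 2
}
```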
Types ¶
type CallOption ¶
type CallOption func(*CallOptions)
CallOption configures LLM generation options.
func WithContextLength ¶ added in v0.15.0
func WithContextLength(length int) CallOption
WithContextLength sets the context window size in tokens.
func WithJSONMode ¶ added in v0.15.0
func WithJSONMode(enabled bool) CallOption
WithJSONMode enables JSON output format.
func WithJSONSchema ¶ added in v0.15.0
func WithJSONSchema(schema any) CallOption
WithJSONSchema specifies a JSON schema for structured output.
func WithKeepAlive ¶ added in v0.15.0
func WithKeepAlive(keepAlive string) CallOption
WithKeepAlive controls how long the model stays loaded in memory. Examples: "5m", "10m", "0" to unload immediately.
func WithMaxTokens ¶ added in v0.15.0
func WithMaxTokens(maxTokens int) CallOption
WithMaxTokens specifies the maximum number of tokens to generate.
func WithMinP ¶ added in v0.15.0
func WithMinP(minP float64) CallOption
WithMinP specifies the minimum probability threshold for token selection.
func WithModel ¶ added in v0.15.0
func WithModel(model string) CallOption
WithModel specifies the model to use for this call.
func WithSeed ¶ added in v0.15.0
func WithSeed(seed int) CallOption
WithSeed specifies the seed for deterministic generation.
func WithStopWords ¶ added in v0.15.0
func WithStopWords(stopWords []string) CallOption
WithStopWords specifies the stop words to use. Generation stops when any stop word is encountered.
func WithStreamingFunc ¶
func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
WithStreamingFunc specifies the streaming function to use. The function is called for each chunk of the streamed response.
func WithTemperature ¶ added in v0.15.0
func WithTemperature(temperature float64) CallOption
WithTemperature specifies the temperature to use. Higher values produce more random outputs.
func WithThink ¶ added in v0.15.0
func WithThink(think any) CallOption
WithThink enables thinking/reasoning output for supported models. Pass true/false for standard models, or "high"/"medium"/"low" for GPT-OSS.
func WithTools ¶ added in v0.15.0
func WithTools(tools []ToolDefinition) CallOption
WithTools specifies function tools the model may call.
func WithTopK ¶ added in v0.15.0
func WithTopK(topK int) CallOption
WithTopK specifies the top-k value to use for sampling.
func WithTopP ¶ added in v0.15.0
func WithTopP(topP float64) CallOption
WithTopP specifies the top-p value to use for nucleus sampling.
type CallOptions ¶
type CallOptions struct {
// Model specifies the model to use (overrides default).
Model string `json:"model"`
// Temperature controls randomness in generation (0.0 to 2.0).
Temperature float64 `json:"temperature"`
// MaxTokens limits the maximum tokens in the response.
MaxTokens int `json:"max_tokens"`
// StopWords specifies sequences where generation should stop.
StopWords []string `json:"stop_words"`
// TopP controls diversity via nucleus sampling (0.0 to 1.0).
TopP float64 `json:"top_p"`
// TopK limits sampling to top K tokens.
TopK int `json:"top_k"`
// MinP sets minimum probability threshold for token selection.
MinP float64 `json:"min_p"`
// Seed sets a deterministic seed for reproducible outputs.
Seed int `json:"seed"`
// Metadata contains additional provider-specific options.
Metadata map[string]any `json:"metadata,omitempty"`
// StreamingFunc is called for each chunk when streaming is enabled.
StreamingFunc func(ctx context.Context, chunk []byte) error `json:"-"`
// JSONMode enables JSON output format.
JSONMode bool `json:"json_mode"`
// JSONSchema specifies a JSON schema for structured output.
JSONSchema any `json:"json_schema,omitempty"`
// Tools specifies function tools the model may call.
Tools []ToolDefinition `json:"tools,omitempty"`
// Think enables thinking/reasoning output for supported models.
// Can be true/false or "high"/"medium"/"low" for some models.
Think any `json:"think,omitempty"`
// KeepAlive controls how long the model stays loaded in memory.
KeepAlive string `json:"keep_alive,omitempty"`
// ContextLength sets the context window size in tokens.
ContextLength int `json:"context_length,omitempty"`
}
CallOptions contains configurable options for LLM generation calls.
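The With* constructors above follow Go's functional-options pattern: each returns a closure that mutates a CallOptions value. A minimal sketch with a few mirrored fields (lower-cased local names, and an assumed default temperature of 0.7 chosen only for illustration):

```go
// Functional-options pattern as used by CallOption/CallOptions,
// reproduced with a small local subset of the fields.
package main

import "fmt"

type callOptions struct {
	Model       string
	Temperature float64
	MaxTokens   int
}

type callOption func(*callOptions)

func withModel(m string) callOption        { return func(o *callOptions) { o.Model = m } }
func withTemperature(t float64) callOption { return func(o *callOptions) { o.Temperature = t } }
func withMaxTokens(n int) callOption       { return func(o *callOptions) { o.MaxTokens = n } }

// apply folds the options into a defaults struct, as a provider
// would do at the start of a generation call.
func apply(opts ...callOption) callOptions {
	o := callOptions{Temperature: 0.7} // hypothetical default
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := apply(withModel("llama3"), withTemperature(0.2), withMaxTokens(256))
	fmt.Println(o.Model, o.Temperature, o.MaxTokens) // llama3 0.2 256
}
```

Unset options keep their defaults, so callers only name the knobs they care about.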
type FunctionCall ¶ added in v0.15.0
type FunctionCall struct {
// Name is the name of the function to call.
Name string `json:"name"`
// Arguments is the JSON object of arguments to pass.
Arguments map[string]any `json:"arguments"`
}
FunctionCall contains the details of a function call.
type FunctionDefinition ¶ added in v0.15.0
type FunctionDefinition struct {
// Name is the function name.
Name string `json:"name"`
// Description explains what the function does.
Description string `json:"description,omitempty"`
// Parameters is a JSON Schema for the function parameters.
Parameters any `json:"parameters"`
}
FunctionDefinition describes a function that can be called by the model.
type LLMReranker ¶ added in v0.15.0
type LLMReranker struct {
// contains filtered or unexported fields
}
LLMReranker uses an LLM to rerank documents by relevance to a query. It evaluates each document in parallel with configurable concurrency.
func NewLLMReranker ¶ added in v0.15.0
func NewLLMReranker(model Model, opts ...LLMRerankerOption) *LLMReranker
NewLLMReranker creates a new LLM-based reranker. By default, it uses 5 concurrent operations and the default prompt.
func (*LLMReranker) Rerank ¶ added in v0.15.0
func (r *LLMReranker) Rerank(ctx context.Context, query string, docs []schema.Document) ([]schema.ScoredDocument, error)
Rerank reranks documents by relevance to the query using the LLM. Documents are evaluated in parallel and sorted by score descending.
type LLMRerankerOption ¶ added in v0.15.0
type LLMRerankerOption func(*LLMReranker)
LLMRerankerOption configures an LLMReranker.
func WithConcurrency ¶ added in v0.15.0
func WithConcurrency(c int) LLMRerankerOption
WithConcurrency sets the number of concurrent reranking operations. Values <= 0 are ignored, keeping the default of 5.
func WithPrompt ¶ added in v0.15.0
func WithPrompt(p string) LLMRerankerOption
WithPrompt sets a custom prompt template for reranking. The template receives .Query, .Content, and all document metadata fields.
type Model ¶
type Model interface {
// GenerateContent generates a response from the LLM given a conversation history.
// Use this for multi-turn conversations or when you need access to full response metadata.
GenerateContent(ctx context.Context, messages []schema.MessageContent, options ...CallOption) (*schema.ContentResponse, error)
// Call is a convenience method for single-turn prompts.
// It returns the generated text directly.
Call(ctx context.Context, prompt string, options ...CallOption) (string, error)
}
Model is the interface for LLM providers. Implementations support both single-turn and multi-turn conversations with optional streaming support.
type Tokenizer ¶
type Tokenizer interface {
// CountTokens returns the number of tokens in the text.
CountTokens(ctx context.Context, text string) (int, error)
}
Tokenizer is the interface for token counting. Implementations provide accurate token counts for a given model.
type ToolCall ¶ added in v0.15.0
type ToolCall struct {
// Function contains the function call details.
Function FunctionCall `json:"function"`
}
ToolCall represents a tool call request from the model.
type ToolDefinition ¶ added in v0.15.0
type ToolDefinition struct {
// Type is always "function".
Type string `json:"type"`
// Function contains the function definition.
Function FunctionDefinition `json:"function"`
}
ToolDefinition defines a function tool the model may call.
Directories ¶
| Path | Synopsis |
|---|---|
| fake | Package fake provides mock LLM implementations for testing. |
| gemini | Package gemini provides LLM and embedding support for Google's Gemini models. |
| ollama | Package ollama provides a client for interacting with Ollama's local LLM server. |