Documentation ¶
Overview ¶
Package llms provides interfaces and utilities for LLM providers.
Index ¶
- Constants
- func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
- func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
- type CallOption
- func WithMaxTokens(maxTokens int) CallOption
- func WithModel(model string) CallOption
- func WithSeed(seed int) CallOption
- func WithStopWords(stopWords []string) CallOption
- func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
- func WithTemperature(temperature float64) CallOption
- func WithTopK(topK int) CallOption
- func WithTopP(topP float64) CallOption
- type CallOptions
- type LLMReranker
- type LLMRerankerOption
- type Model
- type Tokenizer
Constants ¶
const RerankPromptDefault = `` /* 580-byte string literal not displayed */
RerankPromptDefault is the default prompt template for LLM-based reranking.
Variables ¶
This section is empty.
Functions ¶
func GenerateFromSinglePrompt ¶
func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
GenerateFromSinglePrompt generates a response from a single prompt. It wraps the prompt in a human message and returns the generated text.
func TextParts ¶
func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
TextParts creates a MessageContent with multiple text parts.
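A minimal sketch of what such a helper does, using hypothetical local stand-ins for `schema.ChatMessageType` and `schema.MessageContent` (the real field and part types may differ):

```go
package main

import "fmt"

// Hypothetical stand-ins for the schema package's types.
type ChatMessageType string

type TextPart struct{ Text string }

type MessageContent struct {
	Role  ChatMessageType
	Parts []TextPart
}

// textParts mirrors the documented helper: one MessageContent whose
// Parts slice holds each string as a separate text part.
func textParts(role ChatMessageType, parts ...string) MessageContent {
	mc := MessageContent{Role: role}
	for _, p := range parts {
		mc.Parts = append(mc.Parts, TextPart{Text: p})
	}
	return mc
}

func main() {
	mc := textParts("human", "You are helpful.", "Summarize this text.")
	fmt.Println(mc.Role, len(mc.Parts)) // human 2
}
```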
Types ¶
type CallOption ¶
type CallOption func(*CallOptions)
CallOption configures LLM generation options.
func WithMaxTokens ¶ added in v0.15.0
func WithMaxTokens(maxTokens int) CallOption
WithMaxTokens specifies the maximum number of tokens to generate.
func WithModel ¶ added in v0.15.0
func WithModel(model string) CallOption
WithModel specifies the model to use for this call.
func WithSeed ¶ added in v0.15.0
func WithSeed(seed int) CallOption
WithSeed specifies the seed for deterministic generation.
func WithStopWords ¶ added in v0.15.0
func WithStopWords(stopWords []string) CallOption
WithStopWords specifies the stop words to use. Generation stops when any stop word is encountered.
func WithStreamingFunc ¶
func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
WithStreamingFunc specifies the streaming function to use. The function is called for each chunk of the streamed response.
func WithTemperature ¶ added in v0.15.0
func WithTemperature(temperature float64) CallOption
WithTemperature specifies the temperature to use. Higher values produce more random outputs.
func WithTopK ¶ added in v0.15.0
func WithTopK(topK int) CallOption
WithTopK specifies the top-k value to use for sampling.
func WithTopP ¶ added in v0.15.0
func WithTopP(topP float64) CallOption
WithTopP specifies the top-p value to use for nucleus sampling.
type CallOptions ¶
type CallOptions struct {
// Model specifies the model to use (overrides default).
Model string `json:"model"`
// Temperature controls randomness in generation (0.0 to 2.0).
Temperature float64 `json:"temperature"`
// MaxTokens limits the maximum tokens in the response.
MaxTokens int `json:"max_tokens"`
// StopWords specifies sequences where generation should stop.
StopWords []string `json:"stop_words"`
// TopP controls diversity via nucleus sampling (0.0 to 1.0).
TopP float64 `json:"top_p"`
// TopK limits sampling to top K tokens.
TopK int `json:"top_k"`
// Seed sets a deterministic seed for reproducible outputs.
Seed int `json:"seed"`
// Metadata contains additional provider-specific options.
Metadata map[string]any `json:"metadata,omitempty"`
// StreamingFunc is called for each chunk when streaming is enabled.
StreamingFunc func(ctx context.Context, chunk []byte) error `json:"-"`
}
CallOptions contains configurable options for LLM generation calls.
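The `CallOption`/`CallOptions` pair is the standard Go functional-options pattern. A minimal self-contained sketch (field set trimmed; the `resolve` helper is an assumption showing how a provider would typically apply options over its defaults):

```go
package main

import "fmt"

// Trimmed local stand-in for the documented CallOptions struct.
type CallOptions struct {
	Model       string
	Temperature float64
	MaxTokens   int
}

type CallOption func(*CallOptions)

func WithModel(m string) CallOption        { return func(o *CallOptions) { o.Model = m } }
func WithTemperature(t float64) CallOption { return func(o *CallOptions) { o.Temperature = t } }
func WithMaxTokens(n int) CallOption       { return func(o *CallOptions) { o.MaxTokens = n } }

// resolve applies options over provider defaults, in order; later
// options override earlier ones.
func resolve(defaults CallOptions, opts ...CallOption) CallOptions {
	o := defaults
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	defaults := CallOptions{Model: "default-model", Temperature: 0.7}
	o := resolve(defaults, WithTemperature(0.2), WithMaxTokens(256))
	fmt.Printf("%+v\n", o) // {Model:default-model Temperature:0.2 MaxTokens:256}
}
```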
type LLMReranker ¶ added in v0.15.0
type LLMReranker struct {
// contains filtered or unexported fields
}
LLMReranker uses an LLM to rerank documents by relevance to a query. It evaluates each document in parallel with configurable concurrency.
func NewLLMReranker ¶ added in v0.15.0
func NewLLMReranker(model Model, opts ...LLMRerankerOption) *LLMReranker
NewLLMReranker creates a new LLM-based reranker. By default, it uses 5 concurrent operations and the default prompt.
func (*LLMReranker) Rerank ¶ added in v0.15.0
func (r *LLMReranker) Rerank(ctx context.Context, query string, docs []schema.Document) ([]schema.ScoredDocument, error)
Rerank reranks documents by relevance to the query using the LLM. Documents are evaluated in parallel and sorted by score descending.
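The parallel-evaluate-then-sort behavior can be sketched as follows. This is not the package's implementation: the scorer here is a toy function standing in for the per-document LLM call, and the semaphore shows one common way to bound concurrency:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Local stand-in for schema.ScoredDocument.
type ScoredDocument struct {
	Content string
	Score   float64
}

// rerank scores each document concurrently, bounded by a semaphore,
// then sorts by score descending.
func rerank(docs []string, score func(string) float64, concurrency int) []ScoredDocument {
	out := make([]ScoredDocument, len(docs))
	sem := make(chan struct{}, concurrency)
	var wg sync.WaitGroup
	for i, d := range docs {
		wg.Add(1)
		go func(i int, d string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a concurrency slot
			defer func() { <-sem }() // release it
			out[i] = ScoredDocument{Content: d, Score: score(d)}
		}(i, d)
	}
	wg.Wait()
	sort.Slice(out, func(a, b int) bool { return out[a].Score > out[b].Score })
	return out
}

func main() {
	score := func(d string) float64 { return float64(len(d)) } // toy relevance
	ranked := rerank([]string{"bb", "dddd", "a"}, score, 2)
	fmt.Println(ranked[0].Content) // dddd
}
```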
type LLMRerankerOption ¶ added in v0.15.0
type LLMRerankerOption func(*LLMReranker)
LLMRerankerOption configures an LLMReranker.
func WithConcurrency ¶ added in v0.15.0
func WithConcurrency(c int) LLMRerankerOption
WithConcurrency sets the number of concurrent reranking operations. Values <= 0 are ignored, keeping the default of 5.
func WithPrompt ¶ added in v0.15.0
func WithPrompt(p string) LLMRerankerOption
WithPrompt sets a custom prompt template for reranking. The template receives .Query, .Content, and all document metadata fields.
type Model ¶
type Model interface {
// GenerateContent generates a response from the LLM given a conversation history.
// Use this for multi-turn conversations or when you need access to full response metadata.
GenerateContent(ctx context.Context, messages []schema.MessageContent, options ...CallOption) (*schema.ContentResponse, error)
// Call is a convenience method for single-turn prompts.
// It returns the generated text directly.
Call(ctx context.Context, prompt string, options ...CallOption) (string, error)
}
Model is the interface for LLM providers. Implementations handle both single-turn and multi-turn conversations, with optional streaming.