Documentation ¶
Overview ¶
Package llms provides interfaces and utilities for LLM providers.
Index ¶
- Constants
- func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
- func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
- type CallOption
- func WithContextLength(length int) CallOption
- func WithJSONMode(enabled bool) CallOption
- func WithJSONSchema(schema any) CallOption
- func WithKeepAlive(keepAlive string) CallOption
- func WithMaxTokens(maxTokens int) CallOption
- func WithMinP(minP float64) CallOption
- func WithModel(model string) CallOption
- func WithSeed(seed int) CallOption
- func WithStopWords(stopWords []string) CallOption
- func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
- func WithTemperature(temperature float64) CallOption
- func WithThink(think any) CallOption
- func WithTools(tools []ToolDefinition) CallOption
- func WithTopK(topK int) CallOption
- func WithTopP(topP float64) CallOption
- type CallOptions
- type FunctionCall
- type FunctionDefinition
- type LLMReranker
- type LLMRerankerOption
- type Model
- type Tokenizer
- type ToolCall
- type ToolDefinition
Constants ¶
const RerankPromptDefault = `` /* 580-byte string literal not displayed */
RerankPromptDefault is the default prompt template for LLM-based reranking.
Variables ¶
This section is empty.
Functions ¶
func GenerateFromSinglePrompt ¶
func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)
GenerateFromSinglePrompt generates a response from a single prompt. It wraps the prompt in a human message and returns the generated text.
func TextParts ¶
func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent
TextParts creates a MessageContent with multiple text parts.
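A sketch of the helper's shape, again with hypothetical local types standing in for the schema package:

```go
// textParts mirrors llms.TextParts: one message content, many text parts.
package main

import "fmt"

// chatMessageType and messageContent are local stand-ins for the
// schema package's types.
type chatMessageType string

type messageContent struct {
	Role  chatMessageType
	Parts []string
}

// textParts collects the variadic strings as parts under one role.
func textParts(role chatMessageType, parts ...string) messageContent {
	return messageContent{Role: role, Parts: parts}
}

func main() {
	mc := textParts("human", "Describe this image.", "Focus on the colors.")
	fmt.Println(mc.Role, len(mc.Parts)) // human 2
}
```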
Types ¶
type CallOption ¶
type CallOption func(*CallOptions)
CallOption configures LLM generation options.
func WithContextLength ¶ added in v0.15.0
func WithContextLength(length int) CallOption
WithContextLength sets the context window size in tokens.
func WithJSONMode ¶ added in v0.15.0
func WithJSONMode(enabled bool) CallOption
WithJSONMode enables JSON output format.
func WithJSONSchema ¶ added in v0.15.0
func WithJSONSchema(schema any) CallOption
WithJSONSchema specifies a JSON schema for structured output.
func WithKeepAlive ¶ added in v0.15.0
func WithKeepAlive(keepAlive string) CallOption
WithKeepAlive controls how long the model stays loaded in memory. Examples: "5m", "10m", "0" to unload immediately.
func WithMaxTokens ¶ added in v0.15.0
func WithMaxTokens(maxTokens int) CallOption
WithMaxTokens specifies the maximum number of tokens to generate.
func WithMinP ¶ added in v0.15.0
func WithMinP(minP float64) CallOption
WithMinP specifies the minimum probability threshold for token selection.
func WithModel ¶ added in v0.15.0
func WithModel(model string) CallOption
WithModel specifies the model to use for this call.
func WithSeed ¶ added in v0.15.0
func WithSeed(seed int) CallOption
WithSeed specifies the seed for deterministic generation.
func WithStopWords ¶ added in v0.15.0
func WithStopWords(stopWords []string) CallOption
WithStopWords specifies the stop words to use. Generation stops when any stop word is encountered.
func WithStreamingFunc ¶
func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption
WithStreamingFunc specifies the streaming function to use. The function is called for each chunk of the streamed response.
func WithTemperature ¶ added in v0.15.0
func WithTemperature(temperature float64) CallOption
WithTemperature specifies the temperature to use. Higher values produce more random outputs.
func WithThink ¶ added in v0.15.0
func WithThink(think any) CallOption
WithThink enables thinking/reasoning output for supported models. Pass true/false for standard models, or "high"/"medium"/"low" for GPT-OSS.
func WithTools ¶ added in v0.15.0
func WithTools(tools []ToolDefinition) CallOption
WithTools specifies function tools the model may call.
func WithTopK ¶ added in v0.15.0
func WithTopK(topK int) CallOption
WithTopK specifies the top-k value to use for sampling.
func WithTopP ¶ added in v0.15.0
func WithTopP(topP float64) CallOption
WithTopP specifies the top-p value to use for nucleus sampling.
type CallOptions ¶
type CallOptions struct {
// Model specifies the model to use (overrides default).
Model string `json:"model"`
// Temperature controls randomness in generation (0.0 to 2.0).
Temperature float64 `json:"temperature"`
// MaxTokens limits the maximum tokens in the response.
MaxTokens int `json:"max_tokens"`
// StopWords specifies sequences where generation should stop.
StopWords []string `json:"stop_words"`
// TopP controls diversity via nucleus sampling (0.0 to 1.0).
TopP float64 `json:"top_p"`
// TopK limits sampling to top K tokens.
TopK int `json:"top_k"`
// MinP sets minimum probability threshold for token selection.
MinP float64 `json:"min_p"`
// Seed sets a deterministic seed for reproducible outputs.
Seed int `json:"seed"`
// Metadata contains additional provider-specific options.
Metadata map[string]any `json:"metadata,omitempty"`
// StreamingFunc is called for each chunk when streaming is enabled.
StreamingFunc func(ctx context.Context, chunk []byte) error `json:"-"`
// JSONMode enables JSON output format.
JSONMode bool `json:"json_mode"`
// JSONSchema specifies a JSON schema for structured output.
JSONSchema any `json:"json_schema,omitempty"`
// Tools specifies function tools the model may call.
Tools []ToolDefinition `json:"tools,omitempty"`
// Think enables thinking/reasoning output for supported models.
// Can be true/false or "high"/"medium"/"low" for some models.
Think any `json:"think,omitempty"`
// KeepAlive controls how long the model stays loaded in memory.
KeepAlive string `json:"keep_alive,omitempty"`
// ContextLength sets the context window size in tokens.
ContextLength int `json:"context_length,omitempty"`
}
CallOptions contains configurable options for LLM generation calls.
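The With* constructors above follow Go's functional-options pattern: each returns a closure that mutates a CallOptions value. A minimal sketch with a few mirrored fields (lower-cased local names, and an assumed default temperature of 0.7 chosen only for illustration):

```go
// Functional-options pattern as used by CallOption/CallOptions,
// reproduced with a small local subset of the fields.
package main

import "fmt"

type callOptions struct {
	Model       string
	Temperature float64
	MaxTokens   int
}

type callOption func(*callOptions)

func withModel(m string) callOption        { return func(o *callOptions) { o.Model = m } }
func withTemperature(t float64) callOption { return func(o *callOptions) { o.Temperature = t } }
func withMaxTokens(n int) callOption       { return func(o *callOptions) { o.MaxTokens = n } }

// apply folds the options into a defaults struct, as a provider
// would do at the start of a generation call.
func apply(opts ...callOption) callOptions {
	o := callOptions{Temperature: 0.7} // hypothetical default
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := apply(withModel("llama3"), withTemperature(0.2), withMaxTokens(256))
	fmt.Println(o.Model, o.Temperature, o.MaxTokens) // llama3 0.2 256
}
```

Unset options keep their defaults, so callers only name the knobs they care about.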
type FunctionCall ¶ added in v0.15.0
type FunctionCall struct {
// Name is the name of the function to call.
Name string `json:"name"`
// Arguments is the JSON object of arguments to pass.
Arguments map[string]any `json:"arguments"`
}
FunctionCall contains the details of a function call.
type FunctionDefinition ¶ added in v0.15.0
type FunctionDefinition struct {
// Name is the function name.
Name string `json:"name"`
// Description explains what the function does.
Description string `json:"description,omitempty"`
// Parameters is a JSON Schema for the function parameters.
Parameters any `json:"parameters"`
}
FunctionDefinition describes a function that can be called by the model.
type LLMReranker ¶ added in v0.15.0
type LLMReranker struct {
// contains filtered or unexported fields
}
LLMReranker uses an LLM to rerank documents by relevance to a query. It evaluates each document in parallel with configurable concurrency.
func NewLLMReranker ¶ added in v0.15.0
func NewLLMReranker(model Model, opts ...LLMRerankerOption) *LLMReranker
NewLLMReranker creates a new LLM-based reranker. By default, it uses 5 concurrent operations and the default prompt.
func (*LLMReranker) Rerank ¶ added in v0.15.0
func (r *LLMReranker) Rerank(ctx context.Context, query string, docs []schema.Document) ([]schema.ScoredDocument, error)
Rerank reranks documents by relevance to the query using the LLM. Documents are evaluated in parallel and sorted by score descending.
type LLMRerankerOption ¶ added in v0.15.0
type LLMRerankerOption func(*LLMReranker)
LLMRerankerOption configures an LLMReranker.
func WithConcurrency ¶ added in v0.15.0
func WithConcurrency(c int) LLMRerankerOption
WithConcurrency sets the number of concurrent reranking operations. Values <= 0 are ignored, keeping the default of 5.
func WithPrompt ¶ added in v0.15.0
func WithPrompt(p string) LLMRerankerOption
WithPrompt sets a custom prompt template for reranking. The template receives .Query, .Content, and all document metadata fields.
type Model ¶
type Model interface {
// GenerateContent generates a response from the LLM given a conversation history.
// Use this for multi-turn conversations or when you need access to full response metadata.
GenerateContent(ctx context.Context, messages []schema.MessageContent, options ...CallOption) (*schema.ContentResponse, error)
// Call is a convenience method for single-turn prompts.
// It returns the generated text directly.
Call(ctx context.Context, prompt string, options ...CallOption) (string, error)
}
Model is the interface for LLM providers. Implementations support both single-turn and multi-turn conversations with optional streaming support.
type Tokenizer ¶
type Tokenizer interface {
// CountTokens returns the number of tokens in the text.
CountTokens(ctx context.Context, text string) (int, error)
}
Tokenizer is the interface for token counting. Implementations provide accurate token counts for a given model.
type ToolCall ¶ added in v0.15.0
type ToolCall struct {
// Function contains the function call details.
Function FunctionCall `json:"function"`
}
ToolCall represents a tool call request from the model.
type ToolDefinition ¶ added in v0.15.0
type ToolDefinition struct {
// Type is always "function".
Type string `json:"type"`
// Function contains the function definition.
Function FunctionDefinition `json:"function"`
}
ToolDefinition defines a function tool the model may call.
Directories ¶
| Path | Synopsis |
|---|---|
| fake | Package fake provides mock LLM implementations for testing. |
| gemini | Package gemini provides LLM and embedding support for Google's Gemini models. |
| ollama | Package ollama provides a client for interacting with Ollama's local LLM server. |