llms

package
v0.35.3

Note: this package is not in the latest version of its module.

Published: Mar 11, 2026 License: MIT Imports: 9 Imported by: 1

Documentation

Overview

Package llms provides interfaces and utilities for LLM providers.

Index

Constants

const RerankPromptDefault = `` /* 580-byte string literal not displayed */

RerankPromptDefault is the default prompt template for LLM-based reranking.

Variables

This section is empty.

Functions

func GenerateFromSinglePrompt

func GenerateFromSinglePrompt(ctx context.Context, llm Model, prompt string, options ...CallOption) (string, error)

GenerateFromSinglePrompt generates a response from a single prompt. It wraps the prompt in a human message and returns the generated text.

func TextParts

func TextParts(role schema.ChatMessageType, parts ...string) schema.MessageContent

TextParts creates a MessageContent with multiple text parts.

Types

type CallOption

type CallOption func(*CallOptions)

CallOption configures LLM generation options.

func WithContextLength added in v0.15.0

func WithContextLength(length int) CallOption

WithContextLength sets the context window size in tokens.

func WithJSONMode added in v0.15.0

func WithJSONMode(enabled bool) CallOption

WithJSONMode enables or disables JSON-formatted output.

func WithJSONSchema added in v0.15.0

func WithJSONSchema(schema any) CallOption

WithJSONSchema specifies a JSON schema for structured output.

func WithKeepAlive added in v0.15.0

func WithKeepAlive(keepAlive string) CallOption

WithKeepAlive controls how long the model stays loaded in memory. Examples: "5m", "10m", "0" to unload immediately.

func WithMaxTokens added in v0.15.0

func WithMaxTokens(maxTokens int) CallOption

WithMaxTokens specifies the maximum number of tokens to generate.

func WithMinP added in v0.15.0

func WithMinP(minP float64) CallOption

WithMinP specifies the minimum probability threshold for token selection.

func WithModel added in v0.15.0

func WithModel(model string) CallOption

WithModel specifies the model to use for this call.

func WithSeed added in v0.15.0

func WithSeed(seed int) CallOption

WithSeed specifies the seed for deterministic generation.

func WithStopWords added in v0.15.0

func WithStopWords(stopWords []string) CallOption

WithStopWords specifies the stop words to use. Generation stops when any stop word is encountered.
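What a provider typically does with stop words can be sketched as truncation at the first matching sequence (a hypothetical helper; the package delegates actual stop handling to the provider):

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtStop cuts generated text at the earliest occurrence of any
// stop sequence, keeping only the text before it.
func truncateAtStop(text string, stopWords []string) string {
	cut := len(text)
	for _, s := range stopWords {
		if i := strings.Index(text, s); i >= 0 && i < cut {
			cut = i
		}
	}
	return text[:cut]
}

func main() {
	fmt.Println(truncateAtStop("final answer\nObservation: ...", []string{"Observation:"}))
}
```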

func WithStreamingFunc

func WithStreamingFunc(streamingFunc func(ctx context.Context, chunk []byte) error) CallOption

WithStreamingFunc specifies the streaming function to use. The function is called for each chunk of the streamed response.

func WithTemperature added in v0.15.0

func WithTemperature(temperature float64) CallOption

WithTemperature specifies the temperature to use. Higher values produce more random outputs.

func WithThink added in v0.15.0

func WithThink(think any) CallOption

WithThink enables thinking/reasoning output for supported models. Pass true/false for standard models, or "high"/"medium"/"low" for GPT-OSS.

func WithTools added in v0.15.0

func WithTools(tools []ToolDefinition) CallOption

WithTools specifies function tools the model may call.

func WithTopK added in v0.15.0

func WithTopK(topK int) CallOption

WithTopK specifies the top-k value to use for sampling.

func WithTopP added in v0.15.0

func WithTopP(topP float64) CallOption

WithTopP specifies the top-p value to use for nucleus sampling.

type CallOptions

type CallOptions struct {
	// Model specifies the model to use (overrides default).
	Model string `json:"model"`
	// Temperature controls randomness in generation (0.0 to 2.0).
	Temperature float64 `json:"temperature"`
	// MaxTokens limits the maximum tokens in the response.
	MaxTokens int `json:"max_tokens"`
	// StopWords specifies sequences where generation should stop.
	StopWords []string `json:"stop_words"`
	// TopP controls diversity via nucleus sampling (0.0 to 1.0).
	TopP float64 `json:"top_p"`
	// TopK limits sampling to top K tokens.
	TopK int `json:"top_k"`
	// MinP sets minimum probability threshold for token selection.
	MinP float64 `json:"min_p"`
	// Seed sets a deterministic seed for reproducible outputs.
	Seed int `json:"seed"`
	// Metadata contains additional provider-specific options.
	Metadata map[string]any `json:"metadata,omitempty"`
	// StreamingFunc is called for each chunk when streaming is enabled.
	StreamingFunc func(ctx context.Context, chunk []byte) error `json:"-"`
	// JSONMode enables JSON output format.
	JSONMode bool `json:"json_mode"`
	// JSONSchema specifies a JSON schema for structured output.
	JSONSchema any `json:"json_schema,omitempty"`
	// Tools specifies function tools the model may call.
	Tools []ToolDefinition `json:"tools,omitempty"`
	// Think enables thinking/reasoning output for supported models.
	// Can be true/false or "high"/"medium"/"low" for some models.
	Think any `json:"think,omitempty"`
	// KeepAlive controls how long the model stays loaded in memory.
	KeepAlive string `json:"keep_alive,omitempty"`
	// ContextLength sets the context window size in tokens.
	ContextLength int `json:"context_length,omitempty"`
}

CallOptions contains configurable options for LLM generation calls.
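The CallOption/CallOptions pair is the standard Go functional-options pattern. A trimmed-down sketch (only three of the documented fields, and a hypothetical applyOptions resolver with an assumed default temperature) shows how the variadic options compose:

```go
package main

import "fmt"

// Trimmed-down copies of the documented types; the real CallOptions
// defines many more fields.
type CallOptions struct {
	Model       string
	Temperature float64
	MaxTokens   int
}

type CallOption func(*CallOptions)

func WithModel(m string) CallOption        { return func(o *CallOptions) { o.Model = m } }
func WithTemperature(t float64) CallOption { return func(o *CallOptions) { o.Temperature = t } }
func WithMaxTokens(n int) CallOption       { return func(o *CallOptions) { o.MaxTokens = n } }

// applyOptions shows how a provider typically resolves the variadic
// options: start from defaults, then apply each option in order.
func applyOptions(opts ...CallOption) CallOptions {
	o := CallOptions{Temperature: 0.8} // hypothetical provider default
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := applyOptions(WithModel("llama3"), WithTemperature(0.2), WithMaxTokens(256))
	fmt.Printf("%+v\n", o)
}
```

Because options apply in order, a later option overrides an earlier one for the same field.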

type FunctionCall added in v0.15.0

type FunctionCall struct {
	// Name is the name of the function to call.
	Name string `json:"name"`
	// Arguments is the JSON object of arguments to pass.
	Arguments map[string]any `json:"arguments"`
}

FunctionCall contains the details of a function call.

type FunctionDefinition added in v0.15.0

type FunctionDefinition struct {
	// Name is the function name.
	Name string `json:"name"`
	// Description explains what the function does.
	Description string `json:"description,omitempty"`
	// Parameters is a JSON Schema for the function parameters.
	Parameters any `json:"parameters"`
}

FunctionDefinition describes a function that can be called by the model.

type LLMReranker added in v0.15.0

type LLMReranker struct {
	// contains filtered or unexported fields
}

LLMReranker uses an LLM to rerank documents by relevance to a query. It evaluates each document in parallel with configurable concurrency.

func NewLLMReranker added in v0.15.0

func NewLLMReranker(model Model, opts ...LLMRerankerOption) *LLMReranker

NewLLMReranker creates a new LLM-based reranker. By default it uses 5 concurrent operations and the RerankPromptDefault prompt template.

func (*LLMReranker) Rerank added in v0.15.0

func (r *LLMReranker) Rerank(ctx context.Context, query string, docs []schema.Document) ([]schema.ScoredDocument, error)

Rerank reranks documents by relevance to the query using the LLM. Documents are evaluated in parallel and sorted by score descending.
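The documented strategy — score each document in parallel under a concurrency bound, then sort by score descending — can be sketched with a semaphore channel; scoreFn stands in for the per-document LLM call:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// scoreAll evaluates every document with scoreFn, running at most
// `concurrency` calls at once, and returns the documents sorted by
// score descending.
func scoreAll(docs []string, concurrency int, scoreFn func(string) float64) []string {
	type scored struct {
		doc   string
		score float64
	}
	out := make([]scored, len(docs))
	sem := make(chan struct{}, concurrency) // bounds in-flight calls
	var wg sync.WaitGroup
	for i, d := range docs {
		wg.Add(1)
		go func(i int, d string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			out[i] = scored{d, scoreFn(d)}
		}(i, d)
	}
	wg.Wait()
	sort.Slice(out, func(a, b int) bool { return out[a].score > out[b].score })
	ranked := make([]string, len(out))
	for i, s := range out {
		ranked[i] = s.doc
	}
	return ranked
}

func main() {
	// A toy scorer that prefers longer documents.
	ranked := scoreAll([]string{"a", "bbb", "bb"}, 5, func(d string) float64 { return float64(len(d)) })
	fmt.Println(ranked) // [bbb bb a]
}
```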

type LLMRerankerOption added in v0.15.0

type LLMRerankerOption func(*LLMReranker)

LLMRerankerOption configures an LLMReranker.

func WithConcurrency added in v0.15.0

func WithConcurrency(c int) LLMRerankerOption

WithConcurrency sets the number of concurrent reranking operations. Values <= 0 are ignored, keeping the default of 5.

func WithPrompt added in v0.15.0

func WithPrompt(p string) LLMRerankerOption

WithPrompt sets a custom prompt template for reranking. The template receives .Query, .Content, and all document metadata fields.
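Go's text/template syntax is presumably what the template uses, given the .Query/.Content field references. A sketch of rendering such a template, where "source" is a hypothetical metadata key chosen for illustration:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderPrompt fills a custom rerank template with the query, the
// document content, and one metadata field.
func renderPrompt(query, content, source string) string {
	const tmpl = `Rate 0-10 how relevant this document is to the query.
Query: {{.Query}}
Source: {{.source}}
Document: {{.Content}}`
	t := template.Must(template.New("rerank").Parse(tmpl))
	var buf bytes.Buffer
	// Map keys may be lowercase; text/template resolves {{.source}} on a map.
	_ = t.Execute(&buf, map[string]any{
		"Query":   query,
		"Content": content,
		"source":  source,
	})
	return buf.String()
}

func main() {
	fmt.Println(renderPrompt("go generics", "Type parameters were added in Go 1.18.", "release-notes"))
}
```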

type Model

type Model interface {
	// GenerateContent generates a response from the LLM given a conversation history.
	// Use this for multi-turn conversations or when you need access to full response metadata.
	GenerateContent(ctx context.Context, messages []schema.MessageContent, options ...CallOption) (*schema.ContentResponse, error)
	// Call is a convenience method for single-turn prompts.
	// It returns the generated text directly.
	Call(ctx context.Context, prompt string, options ...CallOption) (string, error)
}

Model is the interface for LLM providers. Implementations support both single-turn and multi-turn conversations, with optional streaming.

type Tokenizer

type Tokenizer interface {
	// CountTokens returns the number of tokens in the text.
	CountTokens(ctx context.Context, text string) (int, error)
}

Tokenizer is the interface for token counting. Implementations provide accurate token counts for a given model.

type ToolCall added in v0.15.0

type ToolCall struct {
	// Function contains the function call details.
	Function FunctionCall `json:"function"`
}

ToolCall represents a tool call request from the model.

type ToolDefinition added in v0.15.0

type ToolDefinition struct {
	// Type is always "function".
	Type string `json:"type"`
	// Function contains the function definition.
	Function FunctionDefinition `json:"function"`
}

ToolDefinition defines a function tool the model may call.

Directories

Path Synopsis
Package ollama provides a client for interacting with Ollama's local LLM server.
