Documentation ¶
Overview ¶
Package llmproxy provides a unified client interface for interacting with multiple LLM providers.
Index ¶
- func LastUserInput(input Input) string
- type ChatRequest
- type ChatResponse
- type Citation
- type Client
- func (c *Client) AvailableModels() []ModelInfo
- func (c *Client) AvailableProviders() []ProviderInfo
- func (c *Client) Chat(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
- func (c *Client) DeleteFile(ctx context.Context, delReq *FileDeleteRequest) (bool, error)
- func (c *Client) Embed(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
- func (c *Client) EmbeddingModel(model ModelID) (ModelInfo, int, error)
- func (c *Client) Health(ctx context.Context) error
- func (c *Client) UploadFile(ctx context.Context, fileReq *FileUploadRequest) (*FileUploadResponse, error)
- type ContentType
- type Embedder
- type EmbeddingRequest
- type EmbeddingResponse
- type ExternalTool
- type ExternalToolBase
- type FileDeleteRequest
- type FileID
- type FileUploadRequest
- type FileUploadResponse
- type FunctionTool
- type Input
- type InputItem
- type InputItems
- type InputString
- type ModelCapability
- type ModelID
- type ModelInfo
- type NoArgs
- type Option
- type Options
- type Provider
- type ProviderID
- type ProviderInfo
- type ResponseFormat
- type Tool
- type ToolCallRequest
- type ToolResult
- type WebSearch
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func LastUserInput ¶
LastUserInput returns the last user-provided text from an Input. For InputString it returns the string itself; for InputItems it returns the last InputString in the list (skipping tool calls/results). Returns "" if none found.
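The lookup described above can be sketched as follows. This is a minimal, self-contained illustration, not the package's implementation: the types below are local stand-ins for Input, InputString and InputItems, and toolResult is a hypothetical non-text item used only to show the skipping behavior.

```go
package main

import "fmt"

// Local stand-ins for the package's input types (assumed shapes, illustration only).
type Input interface{ isInput() }

type InputItem interface {
	String() string
}

type InputString string

func (s InputString) String() string { return string(s) }
func (InputString) isInput()         {}

type InputItems []InputItem

func (InputItems) isInput() {}

// toolResult is a hypothetical non-text item that the scan should skip.
type toolResult struct{ Output string }

func (r toolResult) String() string { return "tool result: " + r.Output }

// lastUserInput mirrors the documented behavior of LastUserInput.
func lastUserInput(in Input) string {
	switch v := in.(type) {
	case InputString:
		return string(v)
	case InputItems:
		// Walk backwards so the most recent text item wins.
		for i := len(v) - 1; i >= 0; i-- {
			if s, ok := v[i].(InputString); ok {
				return string(s)
			}
		}
	}
	return ""
}

func main() {
	items := InputItems{InputString("first"), InputString("second"), toolResult{Output: "42"}}
	fmt.Println(lastUserInput(items))
	fmt.Println(lastUserInput(InputString("hello")))
}
```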
Types ¶
type ChatRequest ¶
type ChatRequest struct {
// Model is the model name.
Model ModelID `json:"model"`
// Input is the runtime input to the model. Can be a simple prompt in form of an InputString or a list of InputItem.
Input Input `json:"input"`
// SystemPrompt overrides the model's default system message/prompt.
// Needs to be re-supplied for each request, even if PreviousResponseID is used.
SystemPrompt string `json:"systemPrompt,omitempty"`
// PreviousResponseID is the ID of a previous response that will be passed as context
// to the next response, allowing for multi-turn conversations.
// This does not store full conversation history.
PreviousResponseID string `json:"previousResponseID,omitempty"`
// ResponseFormat specifies a structured JSON response format.
// If provided and supported by the model, the model is forced to respond in the specified format.
ResponseFormat *ResponseFormat `json:"responseFormat,omitempty"`
// Tools is an optional list of tools to be made available to the model/agent
Tools []Tool `json:"tools,omitempty"`
// FileIDs is an optional list of uploaded file IDs accompanying this
// request, for models that can use files as context.
FileIDs []FileID `json:"fileIDs,omitempty"`
// ImageIDs is an optional list of image file IDs accompanying this
// request, for multimodal models.
ImageIDs []FileID `json:"imageIDs,omitempty"`
// ImageURLs is an optional list of image URLs accompanying this
// request, for multimodal models that can fetch images from URLs.
ImageURLs []string `json:"imageURLs,omitempty"`
// Options lists model-specific options. For example, temperature can be
// set through this field, if the model supports it.
Options *Options `json:"options,omitempty"`
}
ChatRequest represents a request to generate a response from an LLM.
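The field comments on SystemPrompt and PreviousResponseID imply a pattern for multi-turn conversations: chain each request to the previous response's ID and re-supply the system prompt every time. The sketch below illustrates that pattern with trimmed local copies of the two structs (plain strings stand in for ModelID and Input); followUp is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Trimmed local copies of the documented structs, for illustration only.
type ChatRequest struct {
	Model              string
	Input              string
	SystemPrompt       string
	PreviousResponseID string
}

type ChatResponse struct {
	Text       string
	ResponseID string
}

// followUp builds the next turn of a conversation.
// The system prompt is passed again on purpose: per the field comment it is
// not carried over, even when PreviousResponseID is set.
func followUp(prev *ChatResponse, model, systemPrompt, input string) *ChatRequest {
	req := &ChatRequest{Model: model, Input: input, SystemPrompt: systemPrompt}
	if prev != nil {
		req.PreviousResponseID = prev.ResponseID
	}
	return req
}

func main() {
	first := &ChatResponse{Text: "Hi!", ResponseID: "resp-1"} // pretend result of an earlier call
	next := followUp(first, "hypothetical-model", "Answer briefly.", "And in French?")
	fmt.Printf("%+v\n", *next)
}
```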
func NewChatRequest ¶
func NewChatRequest(model ModelID, input Input, opts ...Option) *ChatRequest
NewChatRequest creates a new ChatRequest with the given arguments.
type ChatResponse ¶
type ChatResponse struct {
// Text is the generated response text.
Text string `json:"text,omitempty"`
// Citations are source references attached to spans of the assistant text.
// Populated when the model grounds its output; empty otherwise.
Citations []Citation `json:"citations,omitempty"`
// ToolCallRequests are requests made by the model to call function tools as next action.
ToolCallRequests []ToolCallRequest `json:"toolCallRequests,omitempty"`
// ResponseID is the unique ID of this response.
ResponseID string
// InputTokens is the number of input/prompt tokens consumed.
InputTokens int
// OutputTokens is the number of output/completion tokens consumed.
OutputTokens int
// TokensUsed indicates the total number of tokens used in this response. This may be more than the sum of
// InputTokens and OutputTokens for some providers.
TokensUsed int
}
ChatResponse represents a response from the LLM
type Citation ¶
type Citation struct {
// URL is the cited source.
URL string `json:"url"`
// Title is the source's title (optional).
Title string `json:"title,omitempty"`
// Snippet is the cited text excerpt from the source (optional).
Snippet string `json:"snippet,omitempty"`
// StartIdx is the start offset into the assistant message text (optional).
StartIdx int `json:"startIdx,omitempty"`
// EndIdx is the end offset into the assistant message text (optional).
EndIdx int `json:"endIdx,omitempty"`
}
Citation is a source reference attached to a span of assistant-generated text. Citations are populated on a best-effort basis when the model grounds its output (e.g. via native web search); providers that don't emit structured citations leave them empty.
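As an illustration of how the offsets might be consumed, the sketch below extracts the cited span from the response text. It assumes StartIdx and EndIdx are byte offsets forming a half-open range [StartIdx, EndIdx); the documentation does not state the unit or whether the end is inclusive, so verify this against the provider in use. citedSpan is a hypothetical helper.

```go
package main

import "fmt"

// Citation is a local copy of the documented struct (JSON tags omitted).
type Citation struct {
	URL      string
	Title    string
	Snippet  string
	StartIdx int
	EndIdx   int
}

// citedSpan returns the part of text covered by c.
// Assumption: offsets are byte offsets and the range is half-open.
// ok is false when the offsets are absent (both zero) or out of range.
func citedSpan(text string, c Citation) (span string, ok bool) {
	if c.StartIdx < 0 || c.EndIdx <= c.StartIdx || c.EndIdx > len(text) {
		return "", false
	}
	return text[c.StartIdx:c.EndIdx], true
}

func main() {
	text := "Go 1.18 added generics."
	c := Citation{URL: "https://example.com/go", StartIdx: 0, EndIdx: 7}
	if span, ok := citedSpan(text, c); ok {
		fmt.Printf("%q <- %s\n", span, c.URL)
	}
}
```

Because both offsets carry omitempty, a citation without span information decodes to zero values, which the bounds check above treats as "no span".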
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client is the main entry point for interacting with multiple LLM providers.
func NewClient ¶
NewClient creates a new client with the given set of providers. Providers are constructed by a registry/wiring layer and injected here; llmproxy itself has no knowledge of concrete provider implementations.
func (*Client) AvailableModels ¶
AvailableModels returns all models supported by the configured providers.
func (*Client) AvailableProviders ¶
func (c *Client) AvailableProviders() []ProviderInfo
AvailableProviders returns info about all configured providers.
func (*Client) Chat ¶
func (c *Client) Chat(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
Chat sends a text prompt to the LLM and returns the generated response.
func (*Client) DeleteFile ¶
DeleteFile removes a previously uploaded file from the LLM service using its FileID.
func (*Client) Embed ¶
func (c *Client) Embed(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
Embed generates embeddings for the given inputs. The provider is resolved from the request's model ID; if that provider does not support embedding, an error is returned.
func (*Client) EmbeddingModel ¶
EmbeddingModel returns the ModelInfo and embedding dimension for the given model, or an error if it is not an embedding model.
func (*Client) Health ¶
Health verifies the health of all configured providers. It returns an error if any provider is unhealthy.
func (*Client) UploadFile ¶
func (c *Client) UploadFile(ctx context.Context, fileReq *FileUploadRequest) (*FileUploadResponse, error)
UploadFile uploads a file to the LLM service.
type ContentType ¶
type ContentType string
ContentType represents the MIME type of a file.
const (
ContentTypePlainText ContentType = "text/plain"
ContentTypeCSV ContentType = "text/csv"
ContentTypePNG ContentType = "image/png"
ContentTypeJPEG ContentType = "image/jpeg"
ContentTypePDF ContentType = "application/pdf"
ContentTypeJSON ContentType = "application/json"
ContentTypeZIP ContentType = "application/zip"
)
type Embedder ¶
type Embedder interface {
ProviderID() ProviderID
Embed(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)
EmbeddingDimension(model ModelID) (int, error)
}
Embedder is an optional capability interface for providers that support text embedding.
type EmbeddingRequest ¶
type EmbeddingRequest struct {
// Model is the embedding model to use.
Model ModelID
// Inputs is the list of text strings to embed.
Inputs []string
}
EmbeddingRequest represents a request to generate embeddings from text inputs.
type EmbeddingResponse ¶
type EmbeddingResponse struct {
// Embeddings contains the embedding vectors, one per input.
Embeddings [][]float32
// Model is the model that was used to generate the embeddings.
Model string
// TokensUsed is the total number of tokens consumed.
TokensUsed int
// InputTokens is the number of input/prompt tokens consumed.
InputTokens int
// OutputTokens is the number of output/completion tokens consumed.
OutputTokens int
}
EmbeddingResponse represents the result of an embedding request.
type ExternalTool ¶
ExternalTool is a tool whose execution is handled outside of the LLM provider. Calls and results are exchanged between the caller and the model as part of the request flow.
type ExternalToolBase ¶
type ExternalToolBase struct {
Name string `json:"name"`
Description string `json:"description,omitempty"`
Parameters map[string]any `json:"parameters"`
}
ExternalToolBase is the canonical implementation of ExternalTool. Users can provide their own external tool implementations by embedding this struct, while Tool remains sealed.
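The embedding approach can be illustrated as follows. The sketch uses a local copy of ExternalToolBase whose method bodies are assumed to simply return the fields, and an assumed method set for ExternalTool built from the three documented methods; WeatherTool and its Endpoint field are hypothetical.

```go
package main

import "fmt"

// ExternalToolBase is a local copy of the documented struct.
// The method bodies are an assumption: they simply return the fields.
type ExternalToolBase struct {
	Name        string
	Description string
	Parameters  map[string]any
}

func (t ExternalToolBase) ToolName() string           { return t.Name }
func (t ExternalToolBase) ToolDescription() string    { return t.Description }
func (t ExternalToolBase) ToolParams() map[string]any { return t.Parameters }

// externalTool is an assumed method set for ExternalTool.
type externalTool interface {
	ToolName() string
	ToolDescription() string
	ToolParams() map[string]any
}

// WeatherTool is a hypothetical caller-defined tool. Embedding promotes the
// base methods, so WeatherTool satisfies externalTool without extra code.
type WeatherTool struct {
	ExternalToolBase
	Endpoint string // extra caller-side state, never sent to the model
}

func main() {
	var tool externalTool = WeatherTool{
		ExternalToolBase: ExternalToolBase{
			Name:        "get_weather",
			Description: "Returns the current weather for a city.",
			Parameters:  map[string]any{"type": "object"},
		},
		Endpoint: "https://example.com/weather",
	}
	fmt.Println(tool.ToolName(), "-", tool.ToolDescription())
}
```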
func (ExternalToolBase) ToolDescription ¶
func (t ExternalToolBase) ToolDescription() string
func (ExternalToolBase) ToolName ¶
func (t ExternalToolBase) ToolName() string
func (ExternalToolBase) ToolParams ¶
func (t ExternalToolBase) ToolParams() map[string]any
type FileDeleteRequest ¶
type FileDeleteRequest struct {
// FileID is the unique identifier of the file to be deleted.
FileID FileID `json:"fileID"`
// ProviderID specifies the target LLM provider from which to delete the file.
ProviderID ProviderID `json:"providerID"`
}
FileDeleteRequest represents a file deletion request
type FileUploadRequest ¶
type FileUploadRequest struct {
// File is the file content as an io.ReadCloser.
File io.ReadCloser `json:"file"`
// FileName is the name of the file, including its extension.
FileName string `json:"fileName"`
// FileType is the MIME type of the file (e.g., "application/pdf", "image/png", "text/plain").
FileType ContentType `json:"fileType"`
// Purpose is an optional description of the file's intended use.
Purpose string `json:"purpose,omitempty"`
// ProviderID specifies the target LLM provider for the file upload.
ProviderID ProviderID `json:"providerID"`
}
FileUploadRequest represents a file upload
type FileUploadResponse ¶
FileUploadResponse represents an uploaded file result
type FunctionTool ¶
type FunctionTool struct {
ExternalToolBase
// ToolCall calls the tool function with given context and parameters.
ToolCall func(ctx context.Context, arguments json.RawMessage) (any, error) `json:"-"`
}
FunctionTool is an external tool with a callable function handle that can be invoked with arguments parsed from JSON.
func NewFunctionTool ¶
func NewFunctionTool[T any, R any](name, description string, handler func(ctx context.Context, args T) (R, error)) (FunctionTool, error)
NewFunctionTool constructs a FunctionTool from typed parameters and a handler function.
type Input ¶
type Input interface {
// contains filtered or unexported methods
}
Input represents runtime input to a model. It can be either a string or an InputItem list.
type InputItem ¶
type InputItem interface {
// String returns a human-readable representation of the item, including its type.
// This is used to serialize the item as plain text if the model only supports a single message type.
String() string
// contains filtered or unexported methods
}
InputItem represents a single item in InputItems.
type InputItems ¶
type InputItems []InputItem
InputItems represents a list of InputItem.
func AsInputItems ¶
func AsInputItems(input Input) InputItems
AsInputItems converts an Input to InputItems.
type InputString ¶
type InputString string
InputString represents a simple string input. Can be used directly or as an item in InputItems.
func (InputString) String ¶
func (s InputString) String() string
type ModelCapability ¶
type ModelCapability string
ModelCapability represents what a model can do
const (
CapabilityChat ModelCapability = "chat"
CapabilityEmbedding ModelCapability = "embedding"
CapabilityFunctionCall ModelCapability = "function_call"
CapabilityVision ModelCapability = "vision"
CapabilityFineTuning ModelCapability = "fine_tuning"
CapabilityReasoning ModelCapability = "reasoning"
CapabilityClassification ModelCapability = "classification"
CapabilityCode ModelCapability = "code"
)
type ModelInfo ¶
type ModelInfo struct {
ID ModelID
Provider ProviderID
Label string
Capabilities []ModelCapability
TokenModifier float64 // TokenModifier scales token counts for billing (1.0 = pass-through).
MaxTokens *int // MaxTokens is the model's output token cap (chat models).
EmbeddingDimension *int // EmbeddingDimension is the output vector size (on embedding models).
}
ModelInfo represents information about a model
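Two typical uses of ModelInfo are capability checks and token scaling. The sketch below works on a trimmed local copy of the struct; hasCapability and billedTokens are hypothetical helpers, and rounding the scaled count up is an assumption, since the documentation only says that TokenModifier scales token counts.

```go
package main

import (
	"fmt"
	"math"
)

type ModelCapability string

const (
	CapabilityChat      ModelCapability = "chat"
	CapabilityEmbedding ModelCapability = "embedding"
)

// ModelInfo is a trimmed local copy of the documented struct.
type ModelInfo struct {
	ID            string
	Capabilities  []ModelCapability
	TokenModifier float64
}

// hasCapability reports whether the model lists capability c.
func hasCapability(m ModelInfo, c ModelCapability) bool {
	for _, have := range m.Capabilities {
		if have == c {
			return true
		}
	}
	return false
}

// billedTokens scales a raw token count by TokenModifier (1.0 = pass-through).
// Rounding up is an assumption made for this sketch.
func billedTokens(m ModelInfo, raw int) int {
	return int(math.Ceil(float64(raw) * m.TokenModifier))
}

func main() {
	m := ModelInfo{
		ID:            "hypothetical-model",
		Capabilities:  []ModelCapability{CapabilityChat},
		TokenModifier: 1.5,
	}
	fmt.Println(hasCapability(m, CapabilityChat), hasCapability(m, CapabilityEmbedding))
	fmt.Println(billedTokens(m, 1000))
}
```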
type NoArgs ¶
type NoArgs struct{}
NoArgs is an empty struct type used as the argument type for function tools that take no arguments.
type Option ¶
type Option func(*Options)
Option is a functional option for configuring generation options
func WithFrequencyPenalty ¶
WithFrequencyPenalty sets the frequency penalty option
func WithMaxTokens ¶
WithMaxTokens sets the maximum number of tokens to generate
func WithPresencePenalty ¶
WithPresencePenalty sets the presence penalty option
func WithTemperature ¶
WithTemperature sets the temperature option
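The With... helpers follow the functional options pattern. Their signatures are not shown on this page, so the parameter types below (int for WithMaxTokens, float32 for WithTemperature) are assumptions inferred from the Options fields. The sketch uses a trimmed local copy of Options, and apply is a hypothetical stand-in for what a constructor such as NewChatRequest might do with its opts.

```go
package main

import "fmt"

// Options is a trimmed local copy of the documented struct.
type Options struct {
	MaxTokens   *int
	Temperature *float32
}

// Option matches the documented type.
type Option func(*Options)

// Assumed signatures; only the function names appear in the documentation.
func WithMaxTokens(n int) Option {
	return func(o *Options) { o.MaxTokens = &n }
}

func WithTemperature(t float32) Option {
	return func(o *Options) { o.Temperature = &t }
}

// apply runs each option against a fresh Options value.
// Fields that no option touches stay nil, meaning "use the model's default".
func apply(opts ...Option) *Options {
	o := &Options{}
	for _, opt := range opts {
		opt(o)
	}
	return o
}

func main() {
	o := apply(WithMaxTokens(256), WithTemperature(0.2))
	fmt.Println(*o.MaxTokens, *o.Temperature)
}
```

Pointer fields let the caller distinguish "not set" from an explicit zero, which matters for options such as Temperature where 0.0 is a meaningful value.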
type Options ¶
type Options struct {
// MaxTokens specifies the maximum number of tokens to generate in the response.
// If nil, the model defaults to its internal maximum generation length.
MaxTokens *int `json:"maxTokens,omitempty"`
// Temperature controls randomness in generation.
// Higher values (e.g., 1.0) make output more random, lower values (e.g., 0.0) make it deterministic.
Temperature *float32 `json:"temperature,omitempty"`
// TopK specifies the maximum number of tokens to consider during sampling.
TopK *int `json:"topK,omitempty"`
// TopP (nucleus sampling) controls the cumulative probability cutoff for token selection.
// Only tokens whose cumulative probability <= TopP are considered at each step.
TopP *float32 `json:"topP,omitempty"`
// FrequencyPenalty reduces the likelihood of repeating tokens that have already appeared.
// Higher values make the model less likely to repeat the same text.
FrequencyPenalty *float32 `json:"frequencyPenalty,omitempty"`
// PresencePenalty penalizes tokens that have already appeared in the text.
// This encourages introducing new concepts instead of repeating old ones.
PresencePenalty *float32 `json:"presencePenalty,omitempty"`
// Seed sets the random seed for deterministic generation.
// Useful for reproducibility. Nil uses a random seed.
Seed *int `json:"seed,omitempty"`
}
Options represents model-specific options for generation.
type Provider ¶
type Provider interface {
// ProviderID returns the unique identifier of the LLM provider.
ProviderID() ProviderID
// AvailableModels returns a list of supported models by the LLM service.
AvailableModels() []ModelInfo
// Health verifies that the LLM service is available and responding.
// Returns an error if the service is unreachable or unhealthy.
Health(ctx context.Context) error
// Chat sends a text prompt (optionally with system instructions, ID of previous response,
// files, images, and model options) to the LLM and returns the generated response.
// The request may include structured instructions, multimodal input, and output formatting.
Chat(ctx context.Context, req *ChatRequest) (*ChatResponse, error)
// UploadFile uploads a file (text, PDF, image, etc.) to the LLM service.
// Returns a FileUploadResponse containing the FileID that can be referenced in future requests.
// Implementations may store the file temporarily or persist it depending on service capabilities.
UploadFile(ctx context.Context, fileReq *FileUploadRequest) (*FileUploadResponse, error)
// DeleteFile removes a previously uploaded file from the LLM service using its FileID.
// After deletion, the file can no longer be referenced in new requests.
DeleteFile(ctx context.Context, fileID FileID) (bool, error)
}
Provider defines the interface for interacting with a large language model (LLM) service. Implementations can wrap different backends (OpenAI, Ollama, custom models, etc.).
type ProviderInfo ¶
type ProviderInfo struct {
ID ProviderID
Models []ModelInfo
}
ProviderInfo describes a provider.
type ResponseFormat ¶
type ResponseFormat struct {
Name string `json:"name"`
Schema map[string]any `json:"schema"`
Description string `json:"description,omitempty"`
}
ResponseFormat defines a structured JSON response format for LLM outputs.
func NewResponseFormat ¶
func NewResponseFormat[T any](name string, description string) (*ResponseFormat, error)
NewResponseFormat creates a new ResponseFormat based on the provided type T. T must not have fields marked 'omitempty', so that every field is required in the schema.
type Tool ¶
type Tool interface {
// ToolName returns the name of the tool.
ToolName() string
// contains filtered or unexported methods
}
Tool represents a capability that can be used by a model. Tools may be executed internally by model providers or externally by the caller. This interface is sealed and must not be implemented outside the llmproxy package. Callers may only supply tools via the ExternalTool abstraction.
type ToolCallRequest ¶
type ToolCallRequest struct {
// The unique ID of the function tool call request generated by the model.
CallID string `json:"callId"`
// Name of the external tool called.
Name string `json:"name"`
// Arguments that should be passed to the external tool.
Arguments json.RawMessage `json:"arguments"`
}
ToolCallRequest represents a request made by the model to call an external tool as the next step.
func (ToolCallRequest) String ¶
func (r ToolCallRequest) String() string
type ToolResult ¶
type ToolResult struct {
// The unique ID of the function tool call request generated by the model.
CallID string `json:"callId"`
// Name of the external tool called.
Name string `json:"name"`
// Output of the external tool.
Output any `json:"output"`
}
ToolResult represents the result returned by executing an external tool.
func (ToolResult) String ¶
func (r ToolResult) String() string
type WebSearch ¶
type WebSearch struct{}
WebSearch is a marker tool requesting native web search on the provider side. Carries no fields: per-provider parameters (allowed domains, max uses, etc.) live on each provider's Config. Whether the marker actually triggers a native search is decided in each provider's mapping based on its config.
Directories ¶
| Path | Synopsis |
|---|---|
| | Package agent defines concepts for agentic applications |
| | Package registry wires concrete LLM providers into an llmproxy.Client. |
| | Package provider contains common constants for LLM providers |
| anthropic | Package anthropic implements the Anthropic LLM provider using the official anthropic-sdk-go. |
| gemini | Package gemini implements the Gemini LLM provider using the Google genai SDK. |
| mistral | Package mistral implements the Mistral LLM provider. |
| openai | Package openai implements the OpenAI LLM provider. |
| selfhosted | Package selfhosted implements the SelfHosted LLM provider for routing chat requests to operator-run inference servers (llama-server, vLLM, Ollama, etc.) via the OpenAI-compatible Chat Completions API. |
| | Package schemautil provides utilities for generating and manipulating JSON schemas with OpenAI's constraints. |
| | Package test provides common test methods and types for LLM providers. |