ai

package module
v1.0.0 Latest Latest
Published: Nov 11, 2025 License: MIT Imports: 18 Imported by: 0

README

Go AI

A Go library providing a unified, provider-agnostic interface for interacting with multiple AI models, including Google Gemini, OpenAI, and Anthropic. This library simplifies content generation and tool integration, allowing you to switch between AI providers with minimal code changes.

It also has built-in support for the Model Context Protocol (MCP), so it can discover and call tools hosted on external tool servers.

Features

  • Unified Client Interface: A single ai.Client interface for Google Gemini, OpenAI, and Anthropic.
  • Provider-Agnostic API: Universal Request, Response, and Message structs for consistent interaction.
  • Simplified Configuration: Easily configure clients using environment variables or functional options.
  • Multimodal Support: Support for text, images, audio, video, and PDF documents with automatic format handling.
  • First-Class Tool Support: Abstracted support for function calling (tools) that works across providers.
  • MCP Integration: Discover, connect to, and execute tools on MCP-compliant servers.
  • Universal API Proxy Server: HTTP proxy server that accepts any provider's API format and routes to any provider.
  • Error Handling: Clear error messages when using unsupported features with specific providers.

Installation

To add the library to your project, run:

go get github.com/liuzl/ai

Configuration

The easiest way to configure the client is by setting environment variables. The library's NewClientFromEnv() function will automatically detect and use them.

  • AI_PROVIDER: The provider to use. Can be openai (default), gemini, or anthropic.

OpenAI
  • OPENAI_API_KEY: Your OpenAI API key.
  • OPENAI_MODEL: (Optional) The model name, e.g., gpt-5-mini.
  • OPENAI_BASE_URL: (Optional) For using a custom or proxy endpoint.

Google Gemini
  • GEMINI_API_KEY: Your Gemini API key.
  • GEMINI_MODEL: (Optional) The model name, e.g., gemini-2.5-flash.
  • GEMINI_BASE_URL: (Optional) For using a custom endpoint.

Anthropic
  • ANTHROPIC_API_KEY: Your Anthropic API key.
  • ANTHROPIC_MODEL: (Optional) The model name, e.g., claude-haiku-4-5.
  • ANTHROPIC_BASE_URL: (Optional) For using a custom endpoint.
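
To make the lookup order above concrete, here is a minimal standard-library sketch of how the provider and its matching variables could be resolved. The `envConfig` type and `resolveEnv` helper are hypothetical names used only for illustration; they are not part of the library, and the real NewClientFromEnv may behave differently (for example in how it treats a missing key).

```go
package main

import (
	"fmt"
	"strings"
)

// envConfig is a hypothetical holder for the values read from the environment.
type envConfig struct {
	Provider string
	APIKey   string
	Model    string
	BaseURL  string
}

// resolveEnv reads AI_PROVIDER (defaulting to "openai") and then the
// <PROVIDER>_API_KEY, <PROVIDER>_MODEL and <PROVIDER>_BASE_URL variables.
// getenv is injected so the function can be exercised without touching
// the real process environment; pass os.Getenv in a real program.
func resolveEnv(getenv func(string) string) (envConfig, error) {
	provider := strings.ToLower(strings.TrimSpace(getenv("AI_PROVIDER")))
	if provider == "" {
		provider = "openai"
	}
	switch provider {
	case "openai", "gemini", "anthropic":
		// Supported provider names listed in the README.
	default:
		return envConfig{}, fmt.Errorf("unsupported AI_PROVIDER %q", provider)
	}

	prefix := strings.ToUpper(provider)
	cfg := envConfig{
		Provider: provider,
		APIKey:   getenv(prefix + "_API_KEY"),
		Model:    getenv(prefix + "_MODEL"),
		BaseURL:  getenv(prefix + "_BASE_URL"),
	}
	// Assumption: a missing API key is treated as a configuration error.
	if cfg.APIKey == "" {
		return envConfig{}, fmt.Errorf("%s_API_KEY is not set", prefix)
	}
	return cfg, nil
}

func main() {
	env := map[string]string{"AI_PROVIDER": "gemini", "GEMINI_API_KEY": "test-key"}
	cfg, err := resolveEnv(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Provider, cfg.APIKey != "")
}
```

Injecting the lookup function keeps the sketch testable; the optional model and base URL are simply left empty when unset.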

Usage

Basic Example: Simple Text Generation

This example shows how to create a client from environment variables and generate a simple text response.

// main.go
package main

import (
	"context"
	"fmt"
	"log"

	// Use godotenv to load .env file for local development
	_ "github.com/joho/godotenv/autoload"
	"github.com/liuzl/ai"
)

func main() {
	// Create a new client using the recommended NewClientFromEnv function.
	// This automatically reads the AI_PROVIDER and corresponding API keys.
	client, err := ai.NewClientFromEnv()
	if err != nil {
		log.Fatalf("Failed to create AI client: %v", err)
	}

	// Create a request for the model.
	req := &ai.Request{
		Messages: []ai.Message{
			{Role: ai.RoleUser, Content: "Tell me a one-sentence joke about programming."},
		},
	}

	// Call the Generate function.
	resp, err := client.Generate(context.Background(), req)
	if err != nil {
		log.Fatalf("Generate failed: %v", err)
	}

	// Print the result.
	fmt.Println(resp.Text)
}
Running the Examples

The examples directory contains runnable code. To run the simple chat example, execute the following command from the root of the project:

go run ./examples/simple_chat
Advanced Example: Using Tools with an MCP Server

This library can orchestrate interactions between an AI model and an external tool server that implements the Model Context Protocol (MCP).

The following example demonstrates the full loop:

  1. Connect to an MCP server to discover available tools.
  2. Ask the AI model a question, providing the list of tools it can use.
  3. Receive a ToolCall from the model.
  4. Execute the ToolCall on the MCP server.
  5. Send the tool's result back to the model for a final, synthesized answer.
package main

import (
	"context"
	"log"

	_ "github.com/joho/godotenv/autoload" // for loading .env file
	"github.com/liuzl/ai"
)

const (
	mcpServerName = "remote-shell"
	mcpServerURL  = "http://localhost:8080/mcp" // URL of a running MCP server
)

func main() {
	ctx := context.Background()

	// 1. Setup ToolServerManager and register the remote server.
	manager := ai.NewToolServerManager()
	if err := manager.AddRemoteServer(mcpServerName, mcpServerURL); err != nil {
		log.Fatalf("Failed to add remote tool server: %v", err)
	}

	// 2. Get the client for the server and defer its closing.
	// The second return value is discarded here for brevity; check it in real code.
	toolClient, _ := manager.GetClient(mcpServerName)
	defer toolClient.Close()

	// 3. Fetch available tools. The client will connect automatically.
	aiTools, err := toolClient.FetchTools(ctx)
	if err != nil {
		log.Fatalf("Failed to fetch tools: %v", err)
	}
	log.Printf("Found %d tools on server '%s'.\n", len(aiTools), mcpServerName)

	// 4. Create an AI client
	aiClient, err := ai.NewClientFromEnv()
	if err != nil {
		log.Fatalf("Failed to create AI client: %v", err)
	}

	// 5. Ask the model a question, making it aware of the tools
	messages := []ai.Message{
		{Role: ai.RoleUser, Content: "List all files in the current directory using the shell."},
	}
	req := &ai.Request{Messages: messages, Tools: aiTools}

	resp, err := aiClient.Generate(ctx, req)
	if err != nil {
		log.Fatalf("Initial model call failed: %v", err)
	}

	// 6. Check for a tool call and execute it
	if len(resp.ToolCalls) == 0 {
		log.Fatalf("Expected a tool call, but got text: %s", resp.Text)
	}
	toolCall := resp.ToolCalls[0]
	log.Printf("Model wants to call function '%s'.\n", toolCall.Function)
	messages = append(messages, ai.Message{Role: ai.RoleAssistant, ToolCalls: resp.ToolCalls})

	toolResult, err := toolClient.ExecuteTool(ctx, toolCall)
	if err != nil {
		log.Fatalf("Tool call failed: %v", err)
	}
	log.Printf("Tool executed successfully.\n")

	// 7. Send the result back to the model for a final answer
	messages = append(messages, ai.Message{Role: ai.RoleTool, ToolCallID: toolCall.ID, Content: toolResult})
	finalReq := &ai.Request{Messages: messages}

	finalResp, err := aiClient.Generate(ctx, finalReq)
	if err != nil {
		log.Fatalf("Final model call failed: %v", err)
	}

	// 8. Print the final, synthesized response
	log.Println("--- Final Model Response ---")
	log.Println(finalResp.Text)
	log.Println("--------------------------")
}

Multimodal Support

The library provides comprehensive support for multiple content types beyond text, including images, audio, video, and PDF documents. Different providers support different modalities:

Supported Content Types by Provider

Content Type    OpenAI   Gemini   Anthropic   Notes
Text            Yes      Yes      Yes         Universal support
Images          Yes      Yes      Yes         PNG, JPEG, WEBP, GIF
Audio           No       Yes      No          MP3, WAV, AIFF, AAC, OGG, FLAC
Video           No       Yes      No          MP4, MPEG, MOV, AVI, FLV, WEBM, etc.
PDF Documents   No       Yes      Yes         Native PDF parsing

Image Analysis Example
req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("What's in this image?"),
			ai.NewImagePartFromURL("https://example.com/image.jpg"),
		}),
	},
}
Audio Analysis Example (Gemini only)
req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Transcribe and analyze this audio"),
			ai.NewAudioPartFromURL("https://example.com/audio.mp3", "mp3"),
		}),
	},
}
Video Analysis Example (Gemini only)
req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Describe what happens in this video"),
			ai.NewVideoPartFromURL("https://example.com/video.mp4", "mp4"),
		}),
	},
}
PDF Document Analysis Example (Gemini & Anthropic)
req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Summarize this research paper"),
			ai.NewPDFPartFromURL("https://arxiv.org/pdf/1706.03762.pdf"),
		}),
	},
}
Using Base64-Encoded Media

All media types also support base64-encoded data for local files:

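// Read a local file and base64-encode it.
// Requires the "encoding/base64", "log", and "os" imports.
audioData, err := os.ReadFile("audio.mp3")
if err != nil {
	log.Fatalf("Failed to read audio file: %v", err)
}
base64Audio := base64.StdEncoding.EncodeToString(audioData)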

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Analyze this audio"),
			ai.NewAudioPartFromBase64(base64Audio, "mp3"),
		}),
	},
}
Automatic Format Handling
  • Gemini: Automatically downloads media from URLs and converts to base64
  • Anthropic: Supports both URL and base64 for images and PDFs
  • OpenAI: Supports URL and base64 for images
Error Handling

The library provides clear error messages when attempting to use unsupported content types:

// Trying to use audio with OpenAI will return:
// "OpenAI provider does not support audio input (content type: audio).
//  Supported providers: Gemini"
Complete Examples

See the examples directory for complete working examples. Each example includes detailed documentation on supported formats, use cases, and limitations.

Universal API Proxy Server

The library includes a universal HTTP proxy server that accepts any provider's API format and routes requests to any supported provider. This allows you to:

  • Use OpenAI format to call Gemini or Anthropic
  • Use Gemini format to call OpenAI or Anthropic
  • Use Anthropic format to call OpenAI or Gemini
  • All 9 format/provider combinations are supported!
Why Use the Universal Proxy?
  • Format Flexibility: Use the API format you're familiar with, regardless of backend provider
  • Tool Compatibility: Use existing tools/SDKs designed for one provider with another
  • No Code Changes: Switch providers without rewriting client code
  • Cost Optimization: Route expensive API calls to cheaper providers using the same format
  • Vendor Lock-in Mitigation: Build with one API, easily switch providers
Quick Start

Install and run the proxy server:

# Install
go install github.com/liuzl/ai/cmd/api-proxy@latest

# Example 1: Use OpenAI format to call Gemini
export GEMINI_API_KEY="your-gemini-api-key"
api-proxy -format openai -provider gemini -model gemini-2.5-flash

# Example 2: Use Anthropic format to call OpenAI
export OPENAI_API_KEY="your-openai-api-key"
api-proxy -format anthropic -provider openai -model gpt-4o

# Example 3: Use Gemini format to call Anthropic
export ANTHROPIC_API_KEY="your-anthropic-api-key"
api-proxy -format gemini -provider anthropic -model claude-3-5-haiku-20241022
Example: OpenAI SDK → Gemini Backend
from openai import OpenAI

# Start proxy: api-proxy -format openai -provider gemini

client = OpenAI(
    api_key="dummy",
    base_url="http://localhost:8080/v1"
)

# Use OpenAI SDK, but actually calls Gemini!
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
Example: Anthropic SDK → OpenAI Backend
import anthropic

# Start proxy: api-proxy -format anthropic -provider openai -model gpt-4o

client = anthropic.Anthropic(
    api_key="dummy",
    base_url="http://localhost:8080"
)

# Use Anthropic SDK, but actually calls OpenAI!
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

print(message.content[0].text)
Supported Format Combinations
API Format ↓ / Provider →   OpenAI   Gemini   Anthropic
OpenAI                      Yes      Yes      Yes
Gemini                      Yes      Yes      Yes
Anthropic                   Yes      Yes      Yes
Configuration Options
-listen string       Server listen address (default ":8080")
-format string       API format to accept: openai, gemini, anthropic (default "openai")
-provider string     Target provider to call: openai, gemini, anthropic (required)
-api-key string      Provider API key (or use env vars)
-model string        Target model (optional)
-base-url string     Custom API endpoint (optional)
-timeout duration    Request timeout (default 5m)
-verbose             Enable verbose logging
Supported Features

✅ All chat completion features

✅ System prompts

✅ Multi-turn conversations

✅ Tool/function calling

✅ Vision/multimodal inputs (images, audio, video, PDFs)

✅ All provider combinations

See the API proxy README for complete documentation, examples, and use cases.

License

MIT License - see LICENSE file for details.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertResponse

func ConvertResponse(targetFormat string, resp *Response) ([]byte, error)

ConvertResponse converts a universal Response into a provider-specific response payload.

Types

type AnthropicFormatConverter

type AnthropicFormatConverter struct{}

AnthropicFormatConverter provides conversion between Anthropic API format and Universal format. It implements the FormatConverter interface.

func NewAnthropicFormatConverter

func NewAnthropicFormatConverter() *AnthropicFormatConverter

NewAnthropicFormatConverter creates a new Anthropic format converter.

func (*AnthropicFormatConverter) ConvertRequestFromFormat

func (c *AnthropicFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts an Anthropic request to Universal format.

func (*AnthropicFormatConverter) ConvertRequestToUniversal

func (c *AnthropicFormatConverter) ConvertRequestToUniversal(anthropicReq *AnthropicMessagesRequest) (*Request, error)

ConvertRequestToUniversal converts an Anthropic request to Universal format.

func (*AnthropicFormatConverter) ConvertResponseToAnthropic

func (c *AnthropicFormatConverter) ConvertResponseToAnthropic(universalResp *Response, model string) (*AnthropicMessagesResponse, error)

ConvertResponseToAnthropic converts a Universal Response to Anthropic format.

func (*AnthropicFormatConverter) ConvertResponseToFormat

func (c *AnthropicFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to Anthropic format.

func (*AnthropicFormatConverter) GetEndpoint

func (c *AnthropicFormatConverter) GetEndpoint() string

GetEndpoint returns the Anthropic API endpoint path.

func (*AnthropicFormatConverter) GetProviderName

func (c *AnthropicFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type AnthropicMessagesRequest

type AnthropicMessagesRequest struct {
	Model     string             `json:"model"`
	System    string             `json:"system,omitempty"`
	Messages  []anthropicMessage `json:"messages"`
	MaxTokens int                `json:"max_tokens"`
	Tools     []anthropicTool    `json:"tools,omitempty"`
}

AnthropicMessagesRequest represents an Anthropic messages request.

type AnthropicMessagesResponse

type AnthropicMessagesResponse struct {
	ID         string                  `json:"id"`
	Type       string                  `json:"type"`
	Role       string                  `json:"role"`
	Model      string                  `json:"model"`
	Content    []anthropicContentBlock `json:"content"`
	StopReason string                  `json:"stop_reason"`
	Usage      *anthropicUsage         `json:"usage,omitempty"`
}

AnthropicMessagesResponse represents an Anthropic messages response.

type AudioSource

type AudioSource struct {
	Type   MediaSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the audio file
	Data   string          // Base64-encoded audio data
	Format string          // Audio format: "mp3", "wav", "aiff", "aac", "ogg", "flac"
}

AudioSource represents an audio input for audio-enabled models (primarily Gemini). Supported formats: MP3, WAV, AIFF, AAC, OGG, FLAC.

type AuthenticationError

type AuthenticationError struct {
	// contains filtered or unexported fields
}

AuthenticationError represents authentication failures (401, 403).

func NewAuthenticationError

func NewAuthenticationError(provider string, statusCode int, message string, err error) *AuthenticationError

NewAuthenticationError creates a new authentication error.

func (*AuthenticationError) Error

func (e *AuthenticationError) Error() string

func (*AuthenticationError) Provider

func (e *AuthenticationError) Provider() string

func (*AuthenticationError) StatusCode

func (e *AuthenticationError) StatusCode() int

func (*AuthenticationError) Unwrap

func (e *AuthenticationError) Unwrap() error

type Client

type Client interface {
	Generate(ctx context.Context, req *Request) (*Response, error)
}

Client is the unified interface for different AI providers.

func NewClient

func NewClient(opts ...Option) (Client, error)

NewClient is the single, unified factory function to create an AI client.

func NewClientFromEnv

func NewClientFromEnv() (Client, error)

NewClientFromEnv creates a new AI client by reading configuration from environment variables. It provides a convenient way to initialize the client without manual configuration.

It uses the following environment variables:

  • AI_PROVIDER: "openai", "gemini", or "anthropic" (defaults to "openai").
  • OPENAI_API_KEY, OPENAI_MODEL, OPENAI_BASE_URL
  • GEMINI_API_KEY, GEMINI_MODEL, GEMINI_BASE_URL
  • ANTHROPIC_API_KEY, ANTHROPIC_MODEL, ANTHROPIC_BASE_URL

type Config

type Config struct {
	// contains filtered or unexported fields
}

Config holds all possible configuration options for any client.

type ContentPart

type ContentPart struct {
	Type           ContentType
	Text           string          // For text parts
	ImageSource    *ImageSource    // For image parts
	AudioSource    *AudioSource    // For audio parts
	VideoSource    *VideoSource    // For video parts
	DocumentSource *DocumentSource // For document parts (PDF, etc.)
}

ContentPart represents a part of multimodal content (text, image, audio, video, document, etc.).

func NewAudioPartFromBase64

func NewAudioPartFromBase64(data, format string) ContentPart

NewAudioPartFromBase64 creates an audio content part from base64-encoded data. The format parameter specifies the audio format (e.g., "mp3", "wav", "ogg").

func NewAudioPartFromURL

func NewAudioPartFromURL(url, format string) ContentPart

NewAudioPartFromURL creates an audio content part from a URL. Supported formats: mp3, wav, aiff, aac, ogg, flac. Primarily supported by Gemini models.

func NewDocumentPartFromBase64

func NewDocumentPartFromBase64(data, mimeType string) ContentPart

NewDocumentPartFromBase64 creates a document content part from base64-encoded data. The mimeType parameter should be "application/pdf" for PDF documents.

func NewDocumentPartFromURL

func NewDocumentPartFromURL(url, mimeType string) ContentPart

NewDocumentPartFromURL creates a document content part from a URL. Primarily used for PDF documents. Supported by Gemini and Anthropic models.

func NewImagePartFromBase64

func NewImagePartFromBase64(data, format string) ContentPart

NewImagePartFromBase64 creates an image content part from base64-encoded data. The data parameter should be the base64-encoded image data. The format parameter specifies the image format (e.g., "png", "jpeg", "gif", "webp"). If format is empty, it will be auto-detected from the data URI prefix if present.

func NewImagePartFromURL

func NewImagePartFromURL(url string) ContentPart

NewImagePartFromURL creates an image content part from a URL.

func NewPDFPartFromBase64

func NewPDFPartFromBase64(data string) ContentPart

NewPDFPartFromBase64 is a convenience function for creating a PDF document part from base64 data.

func NewPDFPartFromURL

func NewPDFPartFromURL(url string) ContentPart

NewPDFPartFromURL is a convenience function for creating a PDF document part from a URL.

func NewTextPart

func NewTextPart(text string) ContentPart

NewTextPart creates a text content part.

func NewVideoPartFromBase64

func NewVideoPartFromBase64(data, format string) ContentPart

NewVideoPartFromBase64 creates a video content part from base64-encoded data. The format parameter specifies the video format (e.g., "mp4", "webm").

func NewVideoPartFromURL

func NewVideoPartFromURL(url, format string) ContentPart

NewVideoPartFromURL creates a video content part from a URL. Supported formats: mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp. Primarily supported by Gemini models.

type ContentType

type ContentType string

ContentType defines the type of content in a multimodal message.

const (
	ContentTypeText     ContentType = "text"
	ContentTypeImage    ContentType = "image"
	ContentTypeAudio    ContentType = "audio"
	ContentTypeVideo    ContentType = "video"
	ContentTypeDocument ContentType = "document"
)

type DocumentSource

type DocumentSource struct {
	Type     MediaSourceType // "url" or "base64"
	URL      string          // HTTP(S) URL to the document
	Data     string          // Base64-encoded document data
	MimeType string          // MIME type: "application/pdf", etc.
}

DocumentSource represents a document input (primarily PDF). Supported by Gemini and Anthropic.

type ErrorWithStatus

type ErrorWithStatus interface {
	error
	StatusCode() int
	Provider() string
	Unwrap() error
}

ErrorWithStatus is the base interface for all AI client errors. It provides access to the HTTP status code and allows for error type assertions.

type FormatConverter

type FormatConverter interface {
	// ConvertRequestFromFormat converts a provider-specific request format to Universal Request.
	// The input should be the unmarshaled JSON request body from the provider's API.
	ConvertRequestFromFormat(providerReq any) (*Request, error)

	// ConvertResponseToFormat converts a Universal Response to provider-specific response format.
	// Returns the response struct that can be marshaled to JSON for the provider's API.
	ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

	// GetEndpoint returns the API endpoint path for this format (e.g., "/v1/chat/completions", "/v1/messages").
	GetEndpoint() string

	// GetProviderName returns the provider name for this format (e.g., "openai", "gemini", "anthropic").
	GetProviderName() string
}

FormatConverter defines the interface for converting between provider-specific API formats and the Universal format. This enables creating proxy servers that accept one provider's API format and route to any other provider.

type FormatConverterFactory

type FormatConverterFactory struct{}

FormatConverterFactory creates format converters for different providers.

func NewFormatConverterFactory

func NewFormatConverterFactory() *FormatConverterFactory

NewFormatConverterFactory creates a new format converter factory.

func (*FormatConverterFactory) GetConverter

func (f *FormatConverterFactory) GetConverter(provider Provider) (FormatConverter, error)

GetConverter returns the appropriate format converter for the given provider.

type FunctionDefinition

type FunctionDefinition struct {
	Name        string          `json:"name"`
	Description string          `json:"description,omitempty"`
	Parameters  json.RawMessage `json:"parameters"`
}

FunctionDefinition is a universal, provider-agnostic function definition.

type GeminiFormatConverter

type GeminiFormatConverter struct{}

GeminiFormatConverter provides conversion between Google Gemini API format and Universal format. It implements the FormatConverter interface.

func NewGeminiFormatConverter

func NewGeminiFormatConverter() *GeminiFormatConverter

NewGeminiFormatConverter creates a new Gemini format converter.

func (*GeminiFormatConverter) ConvertRequestFromFormat

func (c *GeminiFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts a Gemini request to Universal format.

func (*GeminiFormatConverter) ConvertRequestToUniversal

func (c *GeminiFormatConverter) ConvertRequestToUniversal(geminiReq *GeminiGenerateContentRequest) (*Request, error)

ConvertRequestToUniversal converts a Gemini request to Universal format.

func (*GeminiFormatConverter) ConvertResponseToFormat

func (c *GeminiFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to Gemini format.

func (*GeminiFormatConverter) ConvertResponseToGemini

func (c *GeminiFormatConverter) ConvertResponseToGemini(universalResp *Response) (*GeminiGenerateContentResponse, error)

ConvertResponseToGemini converts a Universal Response to Gemini format.

func (*GeminiFormatConverter) GetEndpoint

func (c *GeminiFormatConverter) GetEndpoint() string

GetEndpoint returns the Gemini API endpoint path (note: model is typically part of the path).

func (*GeminiFormatConverter) GetProviderName

func (c *GeminiFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type GeminiGenerateContentRequest

type GeminiGenerateContentRequest struct {
	Contents          []geminiContent `json:"contents"`
	Tools             []geminiTool    `json:"tools,omitempty"`
	SystemInstruction *geminiContent  `json:"systemInstruction,omitempty"`
}

GeminiGenerateContentRequest represents a Gemini generateContent request.

type GeminiGenerateContentResponse

type GeminiGenerateContentResponse struct {
	Candidates []geminiCandidate `json:"candidates"`
}

GeminiGenerateContentResponse represents a Gemini generateContent response.

type ImageSource

type ImageSource struct {
	Type   ImageSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the image
	Data   string          // Base64-encoded image data (with or without data URI prefix)
	Format string          // Image format: "png", "jpeg", "gif", "webp" (optional, can be auto-detected)
}

ImageSource represents an image input for vision-enabled models.

type ImageSourceType

type ImageSourceType string

ImageSourceType defines how an image is provided.

const (
	ImageSourceTypeURL    ImageSourceType = "url"
	ImageSourceTypeBase64 ImageSourceType = "base64"
)

type InvalidRequestError

type InvalidRequestError struct {
	Details string
	// contains filtered or unexported fields
}

InvalidRequestError represents invalid request errors (400).

func NewInvalidRequestError

func NewInvalidRequestError(provider string, message string, details string, err error) *InvalidRequestError

NewInvalidRequestError creates a new invalid request error.

func (*InvalidRequestError) Error

func (e *InvalidRequestError) Error() string

func (*InvalidRequestError) Provider

func (e *InvalidRequestError) Provider() string

func (*InvalidRequestError) StatusCode

func (e *InvalidRequestError) StatusCode() int

func (*InvalidRequestError) Unwrap

func (e *InvalidRequestError) Unwrap() error

type MediaSourceType

type MediaSourceType string

MediaSourceType defines how media (audio/video/document) is provided.

const (
	MediaSourceTypeURL    MediaSourceType = "url"
	MediaSourceTypeBase64 MediaSourceType = "base64"
)

type Message

type Message struct {
	Role         Role
	Content      string        // Simple text content (for backward compatibility)
	ContentParts []ContentPart // Multimodal content (text + images, etc.)
	ToolCalls    []ToolCall
	ToolCallID   string
}

Message represents a universal message structure. Supports both simple text messages (Content) and multimodal messages (ContentParts).

func NewMultimodalMessage

func NewMultimodalMessage(role Role, parts []ContentPart) Message

NewMultimodalMessage creates a message with multimodal content parts.

func NewTextMessage

func NewTextMessage(role Role, text string) Message

NewTextMessage creates a simple text message. This is a convenience function for backward compatibility.

type NetworkError

type NetworkError struct {
	// contains filtered or unexported fields
}

NetworkError represents network-level failures (connection refused, DNS, etc.).

func NewNetworkError

func NewNetworkError(provider string, message string, err error) *NetworkError

NewNetworkError creates a new network error.

func (*NetworkError) Error

func (e *NetworkError) Error() string

func (*NetworkError) Provider

func (e *NetworkError) Provider() string

func (*NetworkError) StatusCode

func (e *NetworkError) StatusCode() int

func (*NetworkError) Unwrap

func (e *NetworkError) Unwrap() error

type OpenAIChatCompletionRequest

type OpenAIChatCompletionRequest struct {
	Model    string          `json:"model"`
	Messages []openaiMessage `json:"messages"`
	Tools    []openaiTool    `json:"tools,omitempty"`
}

OpenAIChatCompletionRequest represents an OpenAI chat completion request. This type is exported to enable format conversion in the proxy server.

type OpenAIFormatConverter

type OpenAIFormatConverter struct{}

OpenAIFormatConverter provides conversion between OpenAI API format and Universal format. This enables creating an OpenAI-compatible proxy server that can route to any provider. It implements the FormatConverter interface.

func NewOpenAIFormatConverter

func NewOpenAIFormatConverter() *OpenAIFormatConverter

NewOpenAIFormatConverter creates a new OpenAI format converter.

func (*OpenAIFormatConverter) ConvertRequestFromFormat

func (c *OpenAIFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts an OpenAI request to Universal format. Implements FormatConverter interface.

func (*OpenAIFormatConverter) ConvertRequestToUniversal

func (c *OpenAIFormatConverter) ConvertRequestToUniversal(openaiReq *OpenAIChatCompletionRequest) (*Request, error)

ConvertRequestToUniversal converts an OpenAI chat completion request to Universal Request format.

func (*OpenAIFormatConverter) ConvertResponseToFormat

func (c *OpenAIFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to OpenAI format. Implements FormatConverter interface.

func (*OpenAIFormatConverter) ConvertResponseToOpenAI

func (c *OpenAIFormatConverter) ConvertResponseToOpenAI(universalResp *Response, model string, promptTokens, completionTokens int) (*openaiChatCompletionResponse, error)

ConvertResponseToOpenAI converts a Universal Response to OpenAI chat completion response format.

func (*OpenAIFormatConverter) GetEndpoint

func (c *OpenAIFormatConverter) GetEndpoint() string

GetEndpoint returns the OpenAI API endpoint path.

func (*OpenAIFormatConverter) GetProviderName

func (c *OpenAIFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type Option

type Option func(*Config)

Option is the function signature for configuration options.

func WithAPIKey

func WithAPIKey(apiKey string) Option

WithAPIKey sets the API key for authentication.

func WithBaseURL

func WithBaseURL(baseURL string) Option

WithBaseURL sets a custom base URL for the API endpoint.

func WithModel

func WithModel(model string) Option

WithModel sets the model name to use for the client.

func WithProvider

func WithProvider(provider Provider) Option

WithProvider sets the AI provider.

func WithTimeout

func WithTimeout(timeout time.Duration) Option

WithTimeout sets the HTTP client timeout.

type Provider

type Provider string

Provider defines the supported AI providers.

const (
	ProviderOpenAI    Provider = "openai"
	ProviderGemini    Provider = "gemini"
	ProviderAnthropic Provider = "anthropic"
)

type RateLimitError

type RateLimitError struct {
	RetryAfter time.Duration
	// contains filtered or unexported fields
}

RateLimitError represents rate limiting errors (429).

func NewRateLimitError

func NewRateLimitError(provider string, message string, retryAfter time.Duration, err error) *RateLimitError

NewRateLimitError creates a new rate limit error.

func (*RateLimitError) Error

func (e *RateLimitError) Error() string

func (*RateLimitError) Provider

func (e *RateLimitError) Provider() string

func (*RateLimitError) StatusCode

func (e *RateLimitError) StatusCode() int

func (*RateLimitError) Unwrap

func (e *RateLimitError) Unwrap() error

type Request

type Request struct {
	Model        string
	SystemPrompt string
	Messages     []Message
	Tools        []Tool
}

Request is a universal request structure for content generation.

func ConvertRequest

func ConvertRequest(sourceFormat string, payload []byte) (*Request, error)

ConvertRequest converts a provider-specific request payload into the universal Request format.

func (*Request) Validate

func (r *Request) Validate() error

Validate checks whether the request is valid and returns an error if it is not. It validates all request fields and is intended to run before the request is sent to the API.

type Response

type Response struct {
	Text      string
	ToolCalls []ToolCall
}

Response is a universal response structure.

type Role

type Role string

Role defines the originator of a message.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ServerError

type ServerError struct {
	// contains filtered or unexported fields
}

ServerError represents server-side errors (5xx).

func NewServerError

func NewServerError(provider string, statusCode int, message string, err error) *ServerError

NewServerError creates a new server error.

func (*ServerError) Error

func (e *ServerError) Error() string

func (*ServerError) Provider

func (e *ServerError) Provider() string

func (*ServerError) StatusCode

func (e *ServerError) StatusCode() int

func (*ServerError) Unwrap

func (e *ServerError) Unwrap() error

type TimeoutError

type TimeoutError struct {
	Duration time.Duration
	// contains filtered or unexported fields
}

TimeoutError represents timeout errors (context deadline exceeded).

func NewTimeoutError

func NewTimeoutError(provider string, duration time.Duration, err error) *TimeoutError

NewTimeoutError creates a new timeout error.

func (*TimeoutError) Error

func (e *TimeoutError) Error() string

func (*TimeoutError) Provider

func (e *TimeoutError) Provider() string

func (*TimeoutError) StatusCode

func (e *TimeoutError) StatusCode() int

func (*TimeoutError) Unwrap

func (e *TimeoutError) Unwrap() error

type Tool

type Tool struct {
	Type     string             `json:"type"`
	Function FunctionDefinition `json:"function"`
}

Tool defines a tool the model can use.

type ToolCall

type ToolCall struct {
	ID        string
	Type      string
	Function  string
	Arguments string
}

ToolCall represents a request from the model to call a specific tool.
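Arguments is a plain string, so a handler has to decode it before use. The sketch below assumes the string carries a JSON object, which is how function-calling arguments are commonly encoded, though this section does not state it. The decodeArguments helper and the get_weather tool name are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the documented struct.
type ToolCall struct {
	ID        string
	Type      string
	Function  string
	Arguments string
}

// decodeArguments is a hypothetical helper. It assumes Arguments holds a
// JSON object and returns it as a map; an empty string yields an empty map.
func decodeArguments(tc ToolCall) (map[string]any, error) {
	args := map[string]any{}
	if tc.Arguments == "" {
		return args, nil
	}
	if err := json.Unmarshal([]byte(tc.Arguments), &args); err != nil {
		return nil, fmt.Errorf("tool %q: invalid arguments: %w", tc.Function, err)
	}
	return args, nil
}

func main() {
	tc := ToolCall{
		ID:        "call_1",
		Type:      "function",
		Function:  "get_weather", // hypothetical tool name
		Arguments: `{"city":"Paris"}`,
	}
	args, err := decodeArguments(tc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tc.Function, args["city"]) // prints "get_weather Paris"
}
```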

type ToolServerClient

type ToolServerClient struct {
	// contains filtered or unexported fields
}

ToolServerClient handles the connection lifecycle for a single tool server. It should be created and managed by the ToolServerManager.

func (*ToolServerClient) Close

func (c *ToolServerClient) Close() error

Close terminates the session with the tool server.

func (*ToolServerClient) Connect

func (c *ToolServerClient) Connect(ctx context.Context) error

Connect establishes a session with the tool server. It is optional to call this manually; methods like FetchTools and ExecuteTool will call it automatically if a session is not already active.

func (*ToolServerClient) ExecuteTool

func (c *ToolServerClient) ExecuteTool(ctx context.Context, toolCall ToolCall) (string, error)

ExecuteTool executes a tool call, automatically connecting if necessary.

func (*ToolServerClient) FetchTools

func (c *ToolServerClient) FetchTools(ctx context.Context) ([]Tool, error)

FetchTools lists available tools, automatically connecting if necessary.

type ToolServerConfig

type ToolServerConfig struct {
	MCPServers map[string]ToolServerDetails `json:"mcpServers"`
}

ToolServerConfig defines the top-level structure of the mcp.json file.

type ToolServerDetails

type ToolServerDetails struct {
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Env     map[string]string `json:"env"`
}

ToolServerDetails defines the configuration for a command-based tool server as found in the mcp.json file.
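The JSON tags on ToolServerConfig and ToolServerDetails imply the shape of an mcp.json file. The sketch below decodes a small sample with local copies of the two structs. The server name, command, arguments, and environment values in the sample are made up for illustration, and parseConfig is a hypothetical helper rather than part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented structs, with the same JSON tags.
type ToolServerDetails struct {
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Env     map[string]string `json:"env"`
}

type ToolServerConfig struct {
	MCPServers map[string]ToolServerDetails `json:"mcpServers"`
}

// sample is an illustrative mcp.json; all values are placeholders.
const sample = `{
  "mcpServers": {
    "files": {
      "command": "example-mcp-server",
      "args": ["--root", "/tmp"],
      "env": {"LOG_LEVEL": "debug"}
    }
  }
}`

// parseConfig is a hypothetical helper that decodes mcp.json content.
func parseConfig(data []byte) (ToolServerConfig, error) {
	var cfg ToolServerConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return ToolServerConfig{}, fmt.Errorf("parse mcp.json: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, server := range cfg.MCPServers {
		fmt.Println(name, server.Command, server.Args) // prints "files example-mcp-server [--root /tmp]"
	}
}
```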

type ToolServerManager

type ToolServerManager struct {
	// contains filtered or unexported fields
}

ToolServerManager discovers and manages tool server clients from various sources. It acts as a central registry for all known tool servers. All methods are safe for concurrent use.

func NewToolServerManager

func NewToolServerManager() *ToolServerManager

NewToolServerManager creates a new, empty manager.

func (*ToolServerManager) AddRemoteServer

func (m *ToolServerManager) AddRemoteServer(name, url string) error

AddRemoteServer programmatically registers a remote, HTTP-based tool server.

func (*ToolServerManager) GetClient

func (m *ToolServerManager) GetClient(name string) (*ToolServerClient, bool)

GetClient retrieves a ready-to-use client for the server with the given name.

func (*ToolServerManager) ListServerNames

func (m *ToolServerManager) ListServerNames() []string

ListServerNames returns a slice of the names of all registered servers.

func (*ToolServerManager) LoadFromFile

func (m *ToolServerManager) LoadFromFile(configFile string) error

LoadFromFile parses a standard mcp.json file and registers all defined servers with the manager.

type UnknownError

type UnknownError struct {
	// contains filtered or unexported fields
}

UnknownError represents unexpected errors that don't fit other categories.

func NewUnknownError

func NewUnknownError(provider string, statusCode int, message string, err error) *UnknownError

NewUnknownError creates a new unknown error.

func (*UnknownError) Error

func (e *UnknownError) Error() string

func (*UnknownError) Provider

func (e *UnknownError) Provider() string

func (*UnknownError) StatusCode

func (e *UnknownError) StatusCode() int

func (*UnknownError) Unwrap

func (e *UnknownError) Unwrap() error

type VideoSource

type VideoSource struct {
	Type   MediaSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the video file
	Data   string          // Base64-encoded video data
	Format string          // Video format: "mp4", "mpeg", "mov", "avi", "flv", "webm", etc.
}

VideoSource represents a video input for video-enabled models (primarily Gemini). Supported formats: MP4, MPEG, MOV, AVI, FLV, MPG, WEBM, WMV, and 3GPP.
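A VideoSource can carry either a URL or inline data. The sketch below builds both variants with local copies of the types. The field comments give "url" and "base64" as the Type values, but the package's constant names are not shown here, so the string conversions stand in for them. Standard base64 encoding without a data-URI prefix is an assumption, and videoFromBytes is a hypothetical helper.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// MediaSourceType and VideoSource are local copies of the documented types.
type MediaSourceType string

type VideoSource struct {
	Type   MediaSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the video file
	Data   string          // Base64-encoded video data
	Format string          // Video format, e.g. "mp4"
}

// videoFromBytes is a hypothetical helper that wraps raw bytes as an inline
// source, assuming standard base64 encoding is what the library expects.
func videoFromBytes(raw []byte, format string) VideoSource {
	return VideoSource{
		Type:   MediaSourceType("base64"),
		Data:   base64.StdEncoding.EncodeToString(raw),
		Format: format,
	}
}

func main() {
	inline := videoFromBytes([]byte("hi"), "mp4")
	fmt.Println(inline.Type, inline.Data, inline.Format) // prints "base64 aGk= mp4"

	remote := VideoSource{
		Type:   MediaSourceType("url"),
		URL:    "https://example.com/clip.mp4", // placeholder URL
		Format: "mp4",
	}
	fmt.Println(remote.Type, remote.URL)
}
```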

Directories

Path Synopsis
cmd/api-proxy (command)
examples/audio_chat (command)
examples/pdf_chat (command)
examples/simple_chat (command)
examples/video_chat (command)
examples/vision_chat (command)
