ai

package module

v1.0.0 (latest) Published: Nov 11, 2025 License: MIT Imports: 18 Imported by: 0

Repository

github.com/liuzl/ai

README ¶

Go AI

A Go library providing a unified, provider-agnostic interface for interacting with multiple AI models, including Google Gemini, OpenAI, and Anthropic. This library simplifies content generation and tool integration, allowing you to switch between AI providers with minimal code changes.

It also features built-in support for the Model Context Protocol (MCP), enabling integration with external tool servers.

Features

Unified Client Interface: A single ai.Client interface for Google Gemini, OpenAI, and Anthropic.
Provider-Agnostic API: Universal Request, Response, and Message structs for consistent interaction.
Simplified Configuration: Easily configure clients using environment variables or functional options.
Multimodal Support: Support for text, images, audio, video, and PDF documents with automatic format handling.
First-Class Tool Support: Abstracted support for function calling (tools) that works across providers.
MCP Integration: Discover, connect to, and execute tools on MCP-compliant servers.
Universal API Proxy Server: HTTP proxy server that accepts any provider's API format and routes to any provider.
Error Handling: Clear error messages when using unsupported features with specific providers.

Installation

To add the library to your project, run:

go get github.com/liuzl/ai

Configuration

The easiest way to configure the client is by setting environment variables. The library's NewClientFromEnv() function will automatically detect and use them.

AI_PROVIDER: The provider to use. Can be openai (default), gemini, or anthropic.

OpenAI

OPENAI_API_KEY: Your OpenAI API key.
OPENAI_MODEL: (Optional) The model name, e.g., gpt-5-mini.
OPENAI_BASE_URL: (Optional) For using a custom or proxy endpoint.

Google Gemini

GEMINI_API_KEY: Your Gemini API key.
GEMINI_MODEL: (Optional) The model name, e.g., gemini-2.5-flash.
GEMINI_BASE_URL: (Optional) For using a custom endpoint.

Anthropic

ANTHROPIC_API_KEY: Your Anthropic API key.
ANTHROPIC_MODEL: (Optional) The model name, e.g., claude-haiku-4-5.
ANTHROPIC_BASE_URL: (Optional) For using a custom endpoint.
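
To make the variable naming concrete, the sketch below shows one way the provider and its settings could be resolved from the environment. It is an illustration only, not the library's implementation: resolveEnv and envConfig are hypothetical names, and treating a missing API key as an error is an assumption. The lookup function is injected so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envConfig is a hypothetical holder for the values read from the environment.
type envConfig struct {
	Provider string
	APIKey   string
	Model    string
	BaseURL  string
}

// resolveEnv reads AI_PROVIDER (defaulting to "openai") and then the
// <PROVIDER>_API_KEY, <PROVIDER>_MODEL and <PROVIDER>_BASE_URL variables.
func resolveEnv(getenv func(string) string) (envConfig, error) {
	provider := strings.ToLower(strings.TrimSpace(getenv("AI_PROVIDER")))
	if provider == "" {
		provider = "openai"
	}
	switch provider {
	case "openai", "gemini", "anthropic":
		// Supported providers listed above.
	default:
		return envConfig{}, fmt.Errorf("unsupported AI_PROVIDER %q", provider)
	}

	prefix := strings.ToUpper(provider)
	cfg := envConfig{
		Provider: provider,
		APIKey:   getenv(prefix + "_API_KEY"),
		Model:    getenv(prefix + "_MODEL"),
		BaseURL:  getenv(prefix + "_BASE_URL"),
	}
	if cfg.APIKey == "" {
		// Assumption: this sketch treats a missing key as a configuration error.
		return envConfig{}, fmt.Errorf("%s_API_KEY is not set", prefix)
	}
	return cfg, nil
}

func main() {
	cfg, err := resolveEnv(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("provider=%s model=%q base_url=%q\n", cfg.Provider, cfg.Model, cfg.BaseURL)
}
```

In real code, ai.NewClientFromEnv() performs this lookup for you; the sketch only spells out which variables belong to which provider.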

Usage

Basic Example: Simple Text Generation

This example shows how to create a client from environment variables and generate a simple text response.

// main.go
package main

import (
	"context"
	"fmt"
	"log"

	// Use godotenv to load .env file for local development
	_ "github.com/joho/godotenv/autoload"
	"github.com/liuzl/ai"
)

func main() {
	// Create a new client using the recommended NewClientFromEnv function.
	// This automatically reads the AI_PROVIDER and corresponding API keys.
	client, err := ai.NewClientFromEnv()
	if err != nil {
		log.Fatalf("Failed to create AI client: %v", err)
	}

	// Create a request for the model.
	req := &ai.Request{
		Messages: []ai.Message{
			{Role: ai.RoleUser, Content: "Tell me a one-sentence joke about programming."},
		},
	}

	// Call the Generate function.
	resp, err := client.Generate(context.Background(), req)
	if err != nil {
		log.Fatalf("Generate failed: %v", err)
	}

	// Print the result.
	fmt.Println(resp.Text)
}

Running the Examples

The examples directory contains runnable code. To run the simple chat example, execute the following command from the root of the project:

go run ./examples/simple_chat

Advanced Example: Using Tools with an MCP Server

This library can orchestrate interactions between an AI model and an external tool server that implements the Model Context Protocol (MCP).

The following example demonstrates the full loop:

Connect to an MCP server to discover available tools.
Ask the AI model a question, providing the list of tools it can use.
Receive a ToolCall from the model.
Execute the ToolCall on the MCP server.
Send the tool's result back to the model for a final, synthesized answer.

package main

import (
	"context"
	"log"

	_ "github.com/joho/godotenv/autoload" // for loading .env file
	"github.com/liuzl/ai"
)

const (
	mcpServerName = "remote-shell"
	mcpServerURL  = "http://localhost:8080/mcp" // URL of a running MCP server
)

func main() {
	ctx := context.Background()

	// 1. Setup ToolServerManager and register the remote server.
	manager := ai.NewToolServerManager()
	if err := manager.AddRemoteServer(mcpServerName, mcpServerURL); err != nil {
		log.Fatalf("Failed to add remote tool server: %v", err)
	}

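	// 2. Get the client for the server and defer its closing.
	// GetClient returns (client, ok); check ok to avoid a nil client.
	toolClient, ok := manager.GetClient(mcpServerName)
	if !ok {
		log.Fatalf("Tool server %q is not registered", mcpServerName)
	}
	defer toolClient.Close()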

	// 3. Fetch available tools. The client will connect automatically.
	aiTools, err := toolClient.FetchTools(ctx)
	if err != nil {
		log.Fatalf("Failed to fetch tools: %v", err)
	}
	log.Printf("Found %d tools on server '%s'.\n", len(aiTools), mcpServerName)

	// 4. Create an AI client
	aiClient, err := ai.NewClientFromEnv()
	if err != nil {
		log.Fatalf("Failed to create AI client: %v", err)
	}

	// 5. Ask the model a question, making it aware of the tools
	messages := []ai.Message{
		{Role: ai.RoleUser, Content: "List all files in the current directory using the shell."},
	}
	req := &ai.Request{Messages: messages, Tools: aiTools}

	resp, err := aiClient.Generate(ctx, req)
	if err != nil {
		log.Fatalf("Initial model call failed: %v", err)
	}

	// 6. Check for a tool call and execute it
	if len(resp.ToolCalls) == 0 {
		log.Fatalf("Expected a tool call, but got text: %s", resp.Text)
	}
	toolCall := resp.ToolCalls[0]
	log.Printf("Model wants to call function '%s'.\n", toolCall.Function)
	messages = append(messages, ai.Message{Role: ai.RoleAssistant, ToolCalls: resp.ToolCalls})

	toolResult, err := toolClient.ExecuteTool(ctx, toolCall)
	if err != nil {
		log.Fatalf("Tool call failed: %v", err)
	}
	log.Printf("Tool executed successfully.\n")

	// 7. Send the result back to the model for a final answer
	messages = append(messages, ai.Message{Role: ai.RoleTool, ToolCallID: toolCall.ID, Content: toolResult})
	finalReq := &ai.Request{Messages: messages}

	finalResp, err := aiClient.Generate(ctx, finalReq)
	if err != nil {
		log.Fatalf("Final model call failed: %v", err)
	}

	// 8. Print the final, synthesized response
	log.Println("--- Final Model Response ---")
	log.Println(finalResp.Text)
	log.Println("--------------------------")
}

Multimodal Support

The library provides comprehensive support for multiple content types beyond text, including images, audio, video, and PDF documents. Different providers support different modalities:

Supported Content Types by Provider

Content Type	OpenAI	Gemini	Anthropic	Notes
Text	✅	✅	✅	Universal support
Images	✅	✅	✅	PNG, JPEG, WEBP, GIF
Audio	❌	✅	❌	MP3, WAV, AIFF, AAC, OGG, FLAC
Video	❌	✅	❌	MP4, MPEG, MOV, AVI, FLV, WEBM, etc.
PDF Documents	❌	✅	✅	Native PDF parsing
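
If you want to reject unsupported content before sending a request, the table above can be encoded as a small lookup. This is a sketch under stated assumptions: the supported map and checkSupport are hypothetical helpers, not part of the library, and the content type names are taken from the package's ContentType constants ("text", "image", "audio", "video", "document").

```go
package main

import "fmt"

// supported mirrors the table above: provider -> content type -> supported.
var supported = map[string]map[string]bool{
	"openai":    {"text": true, "image": true},
	"gemini":    {"text": true, "image": true, "audio": true, "video": true, "document": true},
	"anthropic": {"text": true, "image": true, "document": true},
}

// checkSupport returns an error when the provider cannot accept the content type.
func checkSupport(provider, contentType string) error {
	types, ok := supported[provider]
	if !ok {
		return fmt.Errorf("unknown provider %q", provider)
	}
	if !types[contentType] {
		return fmt.Errorf("%s provider does not support %s input", provider, contentType)
	}
	return nil
}

func main() {
	for _, p := range []string{"openai", "gemini", "anthropic"} {
		if err := checkSupport(p, "audio"); err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(p, "accepts audio")
	}
}
```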

Image Analysis Example

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("What's in this image?"),
			ai.NewImagePartFromURL("https://example.com/image.jpg"),
		}),
	},
}

Audio Analysis Example (Gemini only)

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Transcribe and analyze this audio"),
			ai.NewAudioPartFromURL("https://example.com/audio.mp3", "mp3"),
		}),
	},
}

Video Analysis Example (Gemini only)

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Describe what happens in this video"),
			ai.NewVideoPartFromURL("https://example.com/video.mp4", "mp4"),
		}),
	},
}

PDF Document Analysis Example (Gemini & Anthropic)

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Summarize this research paper"),
			ai.NewPDFPartFromURL("https://arxiv.org/pdf/1706.03762.pdf"),
		}),
	},
}

Using Base64-Encoded Media

All media types also support base64-encoded data for local files:

// Read the local file and base64-encode it.
audioData, err := os.ReadFile("audio.mp3")
if err != nil {
	log.Fatalf("Failed to read audio file: %v", err)
}
base64Audio := base64.StdEncoding.EncodeToString(audioData)

req := &ai.Request{
	Messages: []ai.Message{
		ai.NewMultimodalMessage(ai.RoleUser, []ai.ContentPart{
			ai.NewTextPart("Analyze this audio"),
			ai.NewAudioPartFromBase64(base64Audio, "mp3"),
		}),
	},
}

Automatic Format Handling

Gemini: Automatically downloads media from URLs and converts it to base64
Anthropic: Supports both URL and base64 for images and PDFs
OpenAI: Supports URL and base64 for images

Error Handling

The library provides clear error messages when attempting to use unsupported content types:

// Trying to use audio with OpenAI will return:
// "OpenAI provider does not support audio input (content type: audio).
//  Supported providers: Gemini"

Complete Examples

See the examples directory for complete working examples:

examples/simple_chat - Basic text generation
examples/vision_chat - Image analysis with all providers
examples/audio_chat - Audio analysis with Gemini
examples/video_chat - Video analysis with Gemini
examples/pdf_chat - PDF document Q&A with Gemini/Anthropic
examples/tool_server_interaction - MCP tool integration

Each example includes detailed documentation on supported formats, use cases, and limitations.

Universal API Proxy Server

The library includes a universal HTTP proxy server that accepts any provider's API format and routes requests to any supported provider. This allows you to:

Use OpenAI format to call Gemini or Anthropic
Use Gemini format to call OpenAI or Anthropic
Use Anthropic format to call OpenAI or Gemini
All 9 format/provider combinations are supported!

Why Use the Universal Proxy?

Format Flexibility: Use the API format you're familiar with, regardless of backend provider
Tool Compatibility: Use existing tools/SDKs designed for one provider with another
No Code Changes: Switch providers without rewriting client code
Cost Optimization: Route expensive API calls to cheaper providers using the same format
Vendor Lock-in Mitigation: Build with one API, easily switch providers

Quick Start

Install and run the proxy server:

# Install
go install github.com/liuzl/ai/cmd/api-proxy@latest

# Example 1: Use OpenAI format to call Gemini
export GEMINI_API_KEY="your-gemini-api-key"
api-proxy -format openai -provider gemini -model gemini-2.5-flash

# Example 2: Use Anthropic format to call OpenAI
export OPENAI_API_KEY="your-openai-api-key"
api-proxy -format anthropic -provider openai -model gpt-4o

# Example 3: Use Gemini format to call Anthropic
export ANTHROPIC_API_KEY="your-anthropic-api-key"
api-proxy -format gemini -provider anthropic -model claude-3-5-haiku-20241022

Example: OpenAI SDK → Gemini Backend

from openai import OpenAI

# Start proxy: api-proxy -format openai -provider gemini

client = OpenAI(
    api_key="dummy",
    base_url="http://localhost:8080/v1"
)

# Use OpenAI SDK, but actually calls Gemini!
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
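
The same call can be made from Go with only the standard library. The sketch below assumes the proxy is running with -format openai on the default listen address and serves the OpenAI chat completions path under /v1, matching the base URL in the Python example. chatRequest, chatMessage, and newChatRequest are hypothetical names, and only the fields this sketch sends are modeled.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// chatMessage and chatRequest cover only the fields this sketch sends.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds an OpenAI-style chat completion request for the proxy.
func newChatRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	endpoint := strings.TrimRight(baseURL, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer dummy") // placeholder key, as in the Python example
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080/v1", "gpt-4", "Hello!")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}

	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("proxy not reachable:", err)
		return
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read response:", err)
		return
	}
	fmt.Println(resp.Status)
	fmt.Println(string(raw))
}
```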

Example: Anthropic SDK → OpenAI Backend

import anthropic

# Start proxy: api-proxy -format anthropic -provider openai -model gpt-4o

client = anthropic.Anthropic(
    api_key="dummy",
    base_url="http://localhost:8080"
)

# Use Anthropic SDK, but actually calls OpenAI!
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

print(message.content[0].text)

Supported Format Combinations

API Format ↓ / Provider →	OpenAI	Gemini	Anthropic
OpenAI	✅	✅	✅
Gemini	✅	✅	✅
Anthropic	✅	✅	✅

Configuration Options

-listen string       Server listen address (default ":8080")
-format string       API format to accept: openai, gemini, anthropic (default "openai")
-provider string     Target provider to call: openai, gemini, anthropic (required)
-api-key string      Provider API key (or use env vars)
-model string        Target model (optional)
-base-url string     Custom API endpoint (optional)
-timeout duration    Request timeout (default 5m)
-verbose             Enable verbose logging

Supported Features

✅ All chat completion features

✅ System prompts

✅ Multi-turn conversations

✅ Tool/function calling

✅ Vision/multimodal inputs (images, audio, video, PDFs)

✅ All provider combinations

See the API proxy README for complete documentation, examples, and use cases.

License

MIT License - see LICENSE file for details.

Documentation ¶

Index ¶

func ConvertResponse(targetFormat string, resp *Response) ([]byte, error)
type AnthropicFormatConverter
- func NewAnthropicFormatConverter() *AnthropicFormatConverter
- func (c *AnthropicFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)
- func (c *AnthropicFormatConverter) ConvertRequestToUniversal(anthropicReq *AnthropicMessagesRequest) (*Request, error)
- func (c *AnthropicFormatConverter) ConvertResponseToAnthropic(universalResp *Response, model string) (*AnthropicMessagesResponse, error)
- func (c *AnthropicFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)
- func (c *AnthropicFormatConverter) GetEndpoint() string
- func (c *AnthropicFormatConverter) GetProviderName() string
type AnthropicMessagesRequest
type AnthropicMessagesResponse
type AudioSource
type AuthenticationError
- func NewAuthenticationError(provider string, statusCode int, message string, err error) *AuthenticationError
- func (e *AuthenticationError) Error() string
- func (e *AuthenticationError) Provider() string
- func (e *AuthenticationError) StatusCode() int
- func (e *AuthenticationError) Unwrap() error
type Client
- func NewClient(opts ...Option) (Client, error)
- func NewClientFromEnv() (Client, error)
type Config
type ContentPart
- func NewAudioPartFromBase64(data, format string) ContentPart
- func NewAudioPartFromURL(url, format string) ContentPart
- func NewDocumentPartFromBase64(data, mimeType string) ContentPart
- func NewDocumentPartFromURL(url, mimeType string) ContentPart
- func NewImagePartFromBase64(data, format string) ContentPart
- func NewImagePartFromURL(url string) ContentPart
- func NewPDFPartFromBase64(data string) ContentPart
- func NewPDFPartFromURL(url string) ContentPart
- func NewTextPart(text string) ContentPart
- func NewVideoPartFromBase64(data, format string) ContentPart
- func NewVideoPartFromURL(url, format string) ContentPart
type ContentType
type DocumentSource
type ErrorWithStatus
type FormatConverter
type FormatConverterFactory
- func NewFormatConverterFactory() *FormatConverterFactory
- func (f *FormatConverterFactory) GetConverter(provider Provider) (FormatConverter, error)
type FunctionDefinition
type GeminiFormatConverter
- func NewGeminiFormatConverter() *GeminiFormatConverter
- func (c *GeminiFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)
- func (c *GeminiFormatConverter) ConvertRequestToUniversal(geminiReq *GeminiGenerateContentRequest) (*Request, error)
- func (c *GeminiFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)
- func (c *GeminiFormatConverter) ConvertResponseToGemini(universalResp *Response) (*GeminiGenerateContentResponse, error)
- func (c *GeminiFormatConverter) GetEndpoint() string
- func (c *GeminiFormatConverter) GetProviderName() string
type GeminiGenerateContentRequest
type GeminiGenerateContentResponse
type ImageSource
type ImageSourceType
type InvalidRequestError
- func NewInvalidRequestError(provider string, message string, details string, err error) *InvalidRequestError
- func (e *InvalidRequestError) Error() string
- func (e *InvalidRequestError) Provider() string
- func (e *InvalidRequestError) StatusCode() int
- func (e *InvalidRequestError) Unwrap() error
type MediaSourceType
type Message
- func NewMultimodalMessage(role Role, parts []ContentPart) Message
- func NewTextMessage(role Role, text string) Message
type NetworkError
- func NewNetworkError(provider string, message string, err error) *NetworkError
- func (e *NetworkError) Error() string
- func (e *NetworkError) Provider() string
- func (e *NetworkError) StatusCode() int
- func (e *NetworkError) Unwrap() error
type OpenAIChatCompletionRequest
type OpenAIFormatConverter
- func NewOpenAIFormatConverter() *OpenAIFormatConverter
- func (c *OpenAIFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)
- func (c *OpenAIFormatConverter) ConvertRequestToUniversal(openaiReq *OpenAIChatCompletionRequest) (*Request, error)
- func (c *OpenAIFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)
- func (c *OpenAIFormatConverter) ConvertResponseToOpenAI(universalResp *Response, model string, promptTokens, completionTokens int) (*openaiChatCompletionResponse, error)
- func (c *OpenAIFormatConverter) GetEndpoint() string
- func (c *OpenAIFormatConverter) GetProviderName() string
type Option
- func WithAPIKey(apiKey string) Option
- func WithBaseURL(baseURL string) Option
- func WithModel(model string) Option
- func WithProvider(provider Provider) Option
- func WithTimeout(timeout time.Duration) Option
type Provider
type RateLimitError
- func NewRateLimitError(provider string, message string, retryAfter time.Duration, err error) *RateLimitError
- func (e *RateLimitError) Error() string
- func (e *RateLimitError) Provider() string
- func (e *RateLimitError) StatusCode() int
- func (e *RateLimitError) Unwrap() error
type Request
- func ConvertRequest(sourceFormat string, payload []byte) (*Request, error)
- func (r *Request) Validate() error
type Response
type Role
type ServerError
- func NewServerError(provider string, statusCode int, message string, err error) *ServerError
- func (e *ServerError) Error() string
- func (e *ServerError) Provider() string
- func (e *ServerError) StatusCode() int
- func (e *ServerError) Unwrap() error
type TimeoutError
- func NewTimeoutError(provider string, duration time.Duration, err error) *TimeoutError
- func (e *TimeoutError) Error() string
- func (e *TimeoutError) Provider() string
- func (e *TimeoutError) StatusCode() int
- func (e *TimeoutError) Unwrap() error
type Tool
type ToolCall
type ToolServerClient
- func (c *ToolServerClient) Close() error
- func (c *ToolServerClient) Connect(ctx context.Context) error
- func (c *ToolServerClient) ExecuteTool(ctx context.Context, toolCall ToolCall) (string, error)
- func (c *ToolServerClient) FetchTools(ctx context.Context) ([]Tool, error)
type ToolServerConfig
type ToolServerDetails
type ToolServerManager
- func NewToolServerManager() *ToolServerManager
- func (m *ToolServerManager) AddRemoteServer(name, url string) error
- func (m *ToolServerManager) GetClient(name string) (*ToolServerClient, bool)
- func (m *ToolServerManager) ListServerNames() []string
- func (m *ToolServerManager) LoadFromFile(configFile string) error
type UnknownError
- func NewUnknownError(provider string, statusCode int, message string, err error) *UnknownError
- func (e *UnknownError) Error() string
- func (e *UnknownError) Provider() string
- func (e *UnknownError) StatusCode() int
- func (e *UnknownError) Unwrap() error
type VideoSource

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ConvertResponse ¶

func ConvertResponse(targetFormat string, resp *Response) ([]byte, error)

ConvertResponse converts a universal Response into a provider-specific response payload.

Types ¶

type AnthropicFormatConverter ¶

type AnthropicFormatConverter struct{}

AnthropicFormatConverter provides conversion between Anthropic API format and Universal format. It implements the FormatConverter interface.

func NewAnthropicFormatConverter ¶

func NewAnthropicFormatConverter() *AnthropicFormatConverter

NewAnthropicFormatConverter creates a new Anthropic format converter.

func (*AnthropicFormatConverter) ConvertRequestFromFormat ¶

func (c *AnthropicFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts an Anthropic request to Universal format.

func (*AnthropicFormatConverter) ConvertRequestToUniversal ¶

func (c *AnthropicFormatConverter) ConvertRequestToUniversal(anthropicReq *AnthropicMessagesRequest) (*Request, error)

ConvertRequestToUniversal converts an Anthropic request to Universal format.

func (*AnthropicFormatConverter) ConvertResponseToAnthropic ¶

func (c *AnthropicFormatConverter) ConvertResponseToAnthropic(universalResp *Response, model string) (*AnthropicMessagesResponse, error)

ConvertResponseToAnthropic converts a Universal Response to Anthropic format.

func (*AnthropicFormatConverter) ConvertResponseToFormat ¶

func (c *AnthropicFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to Anthropic format.

func (*AnthropicFormatConverter) GetEndpoint ¶

func (c *AnthropicFormatConverter) GetEndpoint() string

GetEndpoint returns the Anthropic API endpoint path.

func (*AnthropicFormatConverter) GetProviderName ¶

func (c *AnthropicFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type AnthropicMessagesRequest ¶

type AnthropicMessagesRequest struct {
	Model     string             `json:"model"`
	System    string             `json:"system,omitempty"`
	Messages  []anthropicMessage `json:"messages"`
	MaxTokens int                `json:"max_tokens"`
	Tools     []anthropicTool    `json:"tools,omitempty"`
}

AnthropicMessagesRequest represents an Anthropic messages request.

type AnthropicMessagesResponse ¶

type AnthropicMessagesResponse struct {
	ID         string                  `json:"id"`
	Type       string                  `json:"type"`
	Role       string                  `json:"role"`
	Model      string                  `json:"model"`
	Content    []anthropicContentBlock `json:"content"`
	StopReason string                  `json:"stop_reason"`
	Usage      *anthropicUsage         `json:"usage,omitempty"`
}

AnthropicMessagesResponse represents an Anthropic messages response.

type AudioSource ¶

type AudioSource struct {
	Type   MediaSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the audio file
	Data   string          // Base64-encoded audio data
	Format string          // Audio format: "mp3", "wav", "aiff", "aac", "ogg", "flac"
}

AudioSource represents an audio input for audio-enabled models (primarily Gemini). Supported formats: MP3, WAV, AIFF, AAC, OGG, FLAC.

type AuthenticationError ¶

type AuthenticationError struct {
	// contains filtered or unexported fields
}

AuthenticationError represents authentication failures (401, 403).

func NewAuthenticationError ¶

func NewAuthenticationError(provider string, statusCode int, message string, err error) *AuthenticationError

NewAuthenticationError creates a new authentication error.

func (*AuthenticationError) Error ¶

func (e *AuthenticationError) Error() string

func (*AuthenticationError) Provider ¶

func (e *AuthenticationError) Provider() string

func (*AuthenticationError) StatusCode ¶

func (e *AuthenticationError) StatusCode() int

func (*AuthenticationError) Unwrap ¶

func (e *AuthenticationError) Unwrap() error

type Client ¶

type Client interface {
	Generate(ctx context.Context, req *Request) (*Response, error)
}

Client is the unified interface for different AI providers.

func NewClient ¶

func NewClient(opts ...Option) (Client, error)

NewClient is the single, unified factory function to create an AI client.
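
NewClient follows the functional options pattern, with WithProvider, WithAPIKey, WithModel, WithBaseURL, and WithTimeout as its options. The sketch below illustrates the pattern itself with local stand-in types; it does not reproduce the package's internals. The config and option types, the default values, and the API key check are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// config and option are local stand-ins for the package's Config and Option types.
type config struct {
	provider string
	apiKey   string
	model    string
	timeout  time.Duration
}

type option func(*config)

func withProvider(p string) option       { return func(c *config) { c.provider = p } }
func withAPIKey(k string) option         { return func(c *config) { c.apiKey = k } }
func withModel(m string) option          { return func(c *config) { c.model = m } }
func withTimeout(d time.Duration) option { return func(c *config) { c.timeout = d } }

// newConfig applies defaults first, then each option in the order given,
// so later options override earlier ones.
func newConfig(opts ...option) (*config, error) {
	c := &config{provider: "openai", timeout: 30 * time.Second} // assumed defaults
	for _, opt := range opts {
		opt(c)
	}
	if c.apiKey == "" {
		return nil, errors.New("an API key is required")
	}
	return c, nil
}

func main() {
	c, err := newConfig(
		withProvider("gemini"),
		withAPIKey("your-gemini-api-key"),
		withModel("gemini-2.5-flash"),
		withTimeout(time.Minute),
	)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("provider=%s model=%s timeout=%s\n", c.provider, c.model, c.timeout)
}
```

With the real package, the equivalent call would pass the exported options to ai.NewClient in the same way.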

func NewClientFromEnv ¶

func NewClientFromEnv() (Client, error)

NewClientFromEnv creates a new AI client by reading configuration from environment variables. It provides a convenient way to initialize the client without manual configuration.

It uses the following environment variables:

AI_PROVIDER: "openai", "gemini", or "anthropic" (defaults to "openai").
OPENAI_API_KEY, OPENAI_MODEL, OPENAI_BASE_URL
GEMINI_API_KEY, GEMINI_MODEL, GEMINI_BASE_URL
ANTHROPIC_API_KEY, ANTHROPIC_MODEL, ANTHROPIC_BASE_URL

type Config ¶

type Config struct {
	// contains filtered or unexported fields
}

Config holds all possible configuration options for any client.

type ContentPart ¶

type ContentPart struct {
	Type           ContentType
	Text           string          // For text parts
	ImageSource    *ImageSource    // For image parts
	AudioSource    *AudioSource    // For audio parts
	VideoSource    *VideoSource    // For video parts
	DocumentSource *DocumentSource // For document parts (PDF, etc.)
}

ContentPart represents a part of multimodal content (text, image, audio, video, document, etc.).

func NewAudioPartFromBase64 ¶

func NewAudioPartFromBase64(data, format string) ContentPart

NewAudioPartFromBase64 creates an audio content part from base64-encoded data. The format parameter specifies the audio format (e.g., "mp3", "wav", "ogg").

func NewAudioPartFromURL ¶

func NewAudioPartFromURL(url, format string) ContentPart

NewAudioPartFromURL creates an audio content part from a URL. Supported formats: mp3, wav, aiff, aac, ogg, flac. Primarily supported by Gemini models.

func NewDocumentPartFromBase64 ¶

func NewDocumentPartFromBase64(data, mimeType string) ContentPart

NewDocumentPartFromBase64 creates a document content part from base64-encoded data. The mimeType parameter should be "application/pdf" for PDF documents.

func NewDocumentPartFromURL ¶

func NewDocumentPartFromURL(url, mimeType string) ContentPart

NewDocumentPartFromURL creates a document content part from a URL. Primarily used for PDF documents. Supported by Gemini and Anthropic models.

func NewImagePartFromBase64 ¶

func NewImagePartFromBase64(data, format string) ContentPart

NewImagePartFromBase64 creates an image content part from base64-encoded data. The data parameter should be the base64-encoded image data. The format parameter specifies the image format (e.g., "png", "jpeg", "gif", "webp"). If format is empty, it will be auto-detected from the data URI prefix if present.
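
As an illustration of what detecting the format from a data URI prefix can look like, the sketch below splits a string such as "data:image/png;base64,..." into its format and payload. splitDataURI is a hypothetical helper written for this example; the package's own detection logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitDataURI separates an optional "data:image/<format>;base64," prefix
// from the payload. It returns an empty format when no such prefix is present.
func splitDataURI(data string) (format, payload string) {
	const prefix = "data:image/"
	const marker = ";base64,"

	if !strings.HasPrefix(data, prefix) {
		return "", data
	}
	idx := strings.Index(data, marker)
	if idx < 0 {
		return "", data
	}
	return data[len(prefix):idx], data[idx+len(marker):]
}

func main() {
	format, payload := splitDataURI("data:image/png;base64,iVBORw0KGgo=")
	fmt.Printf("format=%q payload=%q\n", format, payload)

	format, payload = splitDataURI("iVBORw0KGgo=")
	fmt.Printf("format=%q payload=%q\n", format, payload)
}
```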

func NewImagePartFromURL ¶

func NewImagePartFromURL(url string) ContentPart

NewImagePartFromURL creates an image content part from a URL.

func NewPDFPartFromBase64 ¶

func NewPDFPartFromBase64(data string) ContentPart

NewPDFPartFromBase64 is a convenience function for creating a PDF document part from base64 data.

func NewPDFPartFromURL ¶

func NewPDFPartFromURL(url string) ContentPart

NewPDFPartFromURL is a convenience function for creating a PDF document part from a URL.

func NewTextPart ¶

func NewTextPart(text string) ContentPart

NewTextPart creates a text content part.

func NewVideoPartFromBase64 ¶

func NewVideoPartFromBase64(data, format string) ContentPart

NewVideoPartFromBase64 creates a video content part from base64-encoded data. The format parameter specifies the video format (e.g., "mp4", "webm").

func NewVideoPartFromURL ¶

func NewVideoPartFromURL(url, format string) ContentPart

NewVideoPartFromURL creates a video content part from a URL. Supported formats: mp4, mpeg, mov, avi, flv, mpg, webm, wmv, 3gpp. Primarily supported by Gemini models.

type ContentType ¶

type ContentType string

ContentType defines the type of content in a multimodal message.

const (
	ContentTypeText     ContentType = "text"
	ContentTypeImage    ContentType = "image"
	ContentTypeAudio    ContentType = "audio"
	ContentTypeVideo    ContentType = "video"
	ContentTypeDocument ContentType = "document"
)

type DocumentSource ¶

type DocumentSource struct {
	Type     MediaSourceType // "url" or "base64"
	URL      string          // HTTP(S) URL to the document
	Data     string          // Base64-encoded document data
	MimeType string          // MIME type: "application/pdf", etc.
}

DocumentSource represents a document input (primarily PDF). Supported by Gemini and Anthropic.

type ErrorWithStatus ¶

type ErrorWithStatus interface {
	error
	StatusCode() int
	Provider() string
	Unwrap() error
}

ErrorWithStatus is the base interface for all AI client errors. It provides access to the HTTP status code and allows for error type assertions.
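
Because every error type in the package exposes Unwrap, callers can use errors.As to recover the status code even after an error has been wrapped. The sketch below copies the interface locally so that it runs on its own; stubError and describe are hypothetical names standing in for a concrete error such as *AuthenticationError and for your own handling code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorWithStatus is copied from the documentation above.
type ErrorWithStatus interface {
	error
	StatusCode() int
	Provider() string
	Unwrap() error
}

// stubError is a hypothetical stand-in for a concrete error returned by a client.
type stubError struct {
	provider string
	status   int
	cause    error
}

func (e *stubError) Error() string    { return fmt.Sprintf("%s: HTTP %d: %v", e.provider, e.status, e.cause) }
func (e *stubError) StatusCode() int  { return e.status }
func (e *stubError) Provider() string { return e.provider }
func (e *stubError) Unwrap() error    { return e.cause }

// describe reports the provider and status code if err carries them
// anywhere in its wrap chain.
func describe(err error) string {
	var statusErr ErrorWithStatus
	if errors.As(err, &statusErr) {
		return fmt.Sprintf("provider=%s status=%d", statusErr.Provider(), statusErr.StatusCode())
	}
	return "no status available"
}

func main() {
	base := &stubError{provider: "openai", status: 401, cause: errors.New("invalid API key")}
	wrapped := fmt.Errorf("generate failed: %w", base)

	fmt.Println(describe(wrapped))
	fmt.Println(describe(errors.New("plain error")))
}
```

To branch on a specific category instead, pass a pointer to the concrete type (for example a *RateLimitError variable) as the errors.As target.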

type FormatConverter ¶

type FormatConverter interface {
	// ConvertRequestFromFormat converts a provider-specific request format to Universal Request.
	// The input should be the unmarshaled JSON request body from the provider's API.
	ConvertRequestFromFormat(providerReq any) (*Request, error)

	// ConvertResponseToFormat converts a Universal Response to provider-specific response format.
	// Returns the response struct that can be marshaled to JSON for the provider's API.
	ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

	// GetEndpoint returns the API endpoint path for this format (e.g., "/v1/chat/completions", "/v1/messages").
	GetEndpoint() string

	// GetProviderName returns the provider name for this format (e.g., "openai", "gemini", "anthropic").
	GetProviderName() string
}

FormatConverter defines the interface for converting between provider-specific API formats and the Universal format. This enables creating proxy servers that accept one provider's API format and route to any other provider.

type FormatConverterFactory ¶

type FormatConverterFactory struct{}

FormatConverterFactory creates format converters for different providers.

func NewFormatConverterFactory ¶

func NewFormatConverterFactory() *FormatConverterFactory

NewFormatConverterFactory creates a new format converter factory.

func (*FormatConverterFactory) GetConverter ¶

func (f *FormatConverterFactory) GetConverter(provider Provider) (FormatConverter, error)

GetConverter returns the appropriate format converter for the given provider.

type FunctionDefinition ¶

type FunctionDefinition struct {
	Name        string          `json:"name"`
	Description string          `json:"description,omitempty"`
	Parameters  json.RawMessage `json:"parameters"`
}

FunctionDefinition is a universal, provider-agnostic function definition.

type GeminiFormatConverter ¶

type GeminiFormatConverter struct{}

GeminiFormatConverter provides conversion between Google Gemini API format and Universal format. It implements the FormatConverter interface.

func NewGeminiFormatConverter ¶

func NewGeminiFormatConverter() *GeminiFormatConverter

NewGeminiFormatConverter creates a new Gemini format converter.

func (*GeminiFormatConverter) ConvertRequestFromFormat ¶

func (c *GeminiFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts a Gemini request to Universal format.

func (*GeminiFormatConverter) ConvertRequestToUniversal ¶

func (c *GeminiFormatConverter) ConvertRequestToUniversal(geminiReq *GeminiGenerateContentRequest) (*Request, error)

ConvertRequestToUniversal converts a Gemini request to Universal format.

func (*GeminiFormatConverter) ConvertResponseToFormat ¶

func (c *GeminiFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to Gemini format.

func (*GeminiFormatConverter) ConvertResponseToGemini ¶

func (c *GeminiFormatConverter) ConvertResponseToGemini(universalResp *Response) (*GeminiGenerateContentResponse, error)

ConvertResponseToGemini converts a Universal Response to Gemini format.

func (*GeminiFormatConverter) GetEndpoint ¶

func (c *GeminiFormatConverter) GetEndpoint() string

GetEndpoint returns the Gemini API endpoint path (note: model is typically part of the path).

func (*GeminiFormatConverter) GetProviderName ¶

func (c *GeminiFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type GeminiGenerateContentRequest ¶

type GeminiGenerateContentRequest struct {
	Contents          []geminiContent `json:"contents"`
	Tools             []geminiTool    `json:"tools,omitempty"`
	SystemInstruction *geminiContent  `json:"systemInstruction,omitempty"`
}

GeminiGenerateContentRequest represents a Gemini generateContent request.

type GeminiGenerateContentResponse ¶

type GeminiGenerateContentResponse struct {
	Candidates []geminiCandidate `json:"candidates"`
}

GeminiGenerateContentResponse represents a Gemini generateContent response.

type ImageSource ¶

type ImageSource struct {
	Type   ImageSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the image
	Data   string          // Base64-encoded image data (with or without data URI prefix)
	Format string          // Image format: "png", "jpeg", "gif", "webp" (optional, can be auto-detected)
}

ImageSource represents an image input for vision-enabled models.

type ImageSourceType ¶

type ImageSourceType string

ImageSourceType defines how an image is provided.

const (
	ImageSourceTypeURL    ImageSourceType = "url"
	ImageSourceTypeBase64 ImageSourceType = "base64"
)
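
As a rough illustration of the two ways an ImageSource can be filled in, the sketch below uses local stand-in types that mirror the declarations above, so it compiles without importing the package. The helpers imageFromBytes and stripDataURIPrefix are hypothetical and are not part of the library; they only show one way to produce and normalize the Data field.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// Local stand-ins mirroring the documented declarations.
type ImageSourceType string

const (
	ImageSourceTypeURL    ImageSourceType = "url"
	ImageSourceTypeBase64 ImageSourceType = "base64"
)

type ImageSource struct {
	Type   ImageSourceType
	URL    string
	Data   string
	Format string
}

// imageFromBytes is a hypothetical helper that base64-encodes raw image
// bytes into an ImageSource of type "base64".
func imageFromBytes(raw []byte, format string) ImageSource {
	return ImageSource{
		Type:   ImageSourceTypeBase64,
		Data:   base64.StdEncoding.EncodeToString(raw),
		Format: format,
	}
}

// stripDataURIPrefix is a hypothetical helper that drops a leading
// "data:<mime>;base64," prefix when one is present.
func stripDataURIPrefix(data string) string {
	if strings.HasPrefix(data, "data:") {
		if i := strings.Index(data, ","); i >= 0 {
			return data[i+1:]
		}
	}
	return data
}

func main() {
	inline := imageFromBytes([]byte("placeholder bytes"), "png")
	fmt.Println(inline.Type, inline.Format, inline.Data)

	remote := ImageSource{Type: ImageSourceTypeURL, URL: "https://example.com/cat.png"}
	fmt.Println(remote.Type, remote.URL)

	fmt.Println(stripDataURIPrefix("data:image/png;base64,AAAA"))
}
```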

type InvalidRequestError ¶

type InvalidRequestError struct {
	Details string
	// contains filtered or unexported fields
}

InvalidRequestError represents invalid request errors (400).

func NewInvalidRequestError ¶

func NewInvalidRequestError(provider string, message string, details string, err error) *InvalidRequestError

NewInvalidRequestError creates a new invalid request error.

func (*InvalidRequestError) Error ¶

func (e *InvalidRequestError) Error() string

func (*InvalidRequestError) Provider ¶

func (e *InvalidRequestError) Provider() string

func (*InvalidRequestError) StatusCode ¶

func (e *InvalidRequestError) StatusCode() int

func (*InvalidRequestError) Unwrap ¶

func (e *InvalidRequestError) Unwrap() error
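
Because the error types expose Unwrap, callers can presumably inspect them with errors.As and errors.Is from the standard library. The sketch below uses a local stand-in with the same method set as InvalidRequestError; the unexported field names and the fixed 400 status are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// InvalidRequestError is a local stand-in; its unexported fields are assumed.
type InvalidRequestError struct {
	Details  string
	provider string
	message  string
	err      error
}

func (e *InvalidRequestError) Error() string {
	return fmt.Sprintf("%s: invalid request: %s", e.provider, e.message)
}
func (e *InvalidRequestError) Provider() string { return e.provider }
func (e *InvalidRequestError) StatusCode() int  { return 400 }
func (e *InvalidRequestError) Unwrap() error    { return e.err }

var errBadJSON = errors.New("malformed JSON body")

// describe shows how a caller might branch on the concrete error type.
func describe(err error) string {
	var invalid *InvalidRequestError
	if errors.As(err, &invalid) {
		return fmt.Sprintf("%s rejected the request (%d): %s",
			invalid.Provider(), invalid.StatusCode(), invalid.Details)
	}
	return "other error: " + err.Error()
}

func main() {
	err := fmt.Errorf("generate: %w", &InvalidRequestError{
		Details:  "messages must not be empty",
		provider: "openai",
		message:  "bad request",
		err:      errBadJSON,
	})
	fmt.Println(describe(err))
	// Unwrap lets errors.Is reach the underlying cause through the chain.
	fmt.Println(errors.Is(err, errBadJSON))
}
```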

type MediaSourceType ¶

type MediaSourceType string

MediaSourceType defines how media (audio/video/document) is provided.

const (
	MediaSourceTypeURL    MediaSourceType = "url"
	MediaSourceTypeBase64 MediaSourceType = "base64"
)

type Message ¶

type Message struct {
	Role         Role
	Content      string        // Simple text content (for backward compatibility)
	ContentParts []ContentPart // Multimodal content (text + images, etc.)
	ToolCalls    []ToolCall
	ToolCallID   string
}

Message represents a universal message structure. It supports both simple text messages (Content) and multimodal messages (ContentParts).

func NewMultimodalMessage ¶

func NewMultimodalMessage(role Role, parts []ContentPart) Message

NewMultimodalMessage creates a message with multimodal content parts.

func NewTextMessage ¶

func NewTextMessage(role Role, text string) Message

NewTextMessage creates a simple text message. This is a convenience function for backward compatibility.
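
To make the role of each Message field concrete, here is a small sketch of a conversation that includes a tool round trip. It uses local stand-in types (ContentParts is omitted), the body of NewTextMessage is an assumption, and toolResult is a hypothetical helper; the real library may expect tool results in a different shape.

```go
package main

import "fmt"

type Role string

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ToolCall struct {
	ID        string
	Type      string
	Function  string
	Arguments string
}

// Message mirrors the documented struct, minus ContentParts.
type Message struct {
	Role       Role
	Content    string
	ToolCalls  []ToolCall
	ToolCallID string
}

// NewTextMessage has the documented signature; the body is an assumption.
func NewTextMessage(role Role, text string) Message {
	return Message{Role: role, Content: text}
}

// toolResult is a hypothetical helper that answers a specific tool call.
func toolResult(call ToolCall, output string) Message {
	return Message{Role: RoleTool, Content: output, ToolCallID: call.ID}
}

// buildHistory assembles an illustrative four-message exchange.
func buildHistory() []Message {
	call := ToolCall{ID: "call_1", Type: "function", Function: "get_weather", Arguments: `{"city":"Oslo"}`}
	return []Message{
		NewTextMessage(RoleUser, "What is the weather in Oslo?"),
		{Role: RoleAssistant, ToolCalls: []ToolCall{call}},
		toolResult(call, "sunny"),
		NewTextMessage(RoleAssistant, "It is sunny in Oslo."),
	}
}

func main() {
	for _, m := range buildHistory() {
		fmt.Printf("%-9s content=%q toolCalls=%d toolCallID=%q\n",
			m.Role, m.Content, len(m.ToolCalls), m.ToolCallID)
	}
}
```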

type NetworkError ¶

type NetworkError struct {
	// contains filtered or unexported fields
}

NetworkError represents network-level failures (connection refused, DNS, etc.).

func NewNetworkError ¶

func NewNetworkError(provider string, message string, err error) *NetworkError

NewNetworkError creates a new network error.

func (*NetworkError) Error ¶

func (e *NetworkError) Error() string

func (*NetworkError) Provider ¶

func (e *NetworkError) Provider() string

func (*NetworkError) StatusCode ¶

func (e *NetworkError) StatusCode() int

func (*NetworkError) Unwrap ¶

func (e *NetworkError) Unwrap() error

type OpenAIChatCompletionRequest ¶

type OpenAIChatCompletionRequest struct {
	Model    string          `json:"model"`
	Messages []openaiMessage `json:"messages"`
	Tools    []openaiTool    `json:"tools,omitempty"`
}

OpenAIChatCompletionRequest represents an OpenAI chat completion request. This type is exported to enable format conversion in the proxy server.

type OpenAIFormatConverter ¶

type OpenAIFormatConverter struct{}

OpenAIFormatConverter provides conversion between OpenAI API format and Universal format. This enables creating an OpenAI-compatible proxy server that can route to any provider. It implements the FormatConverter interface.

func NewOpenAIFormatConverter ¶

func NewOpenAIFormatConverter() *OpenAIFormatConverter

NewOpenAIFormatConverter creates a new OpenAI format converter.

func (*OpenAIFormatConverter) ConvertRequestFromFormat ¶

func (c *OpenAIFormatConverter) ConvertRequestFromFormat(providerReq any) (*Request, error)

ConvertRequestFromFormat converts an OpenAI request to Universal format. It implements the FormatConverter interface.

func (*OpenAIFormatConverter) ConvertRequestToUniversal ¶

func (c *OpenAIFormatConverter) ConvertRequestToUniversal(openaiReq *OpenAIChatCompletionRequest) (*Request, error)

ConvertRequestToUniversal converts an OpenAI chat completion request to Universal Request format.

func (*OpenAIFormatConverter) ConvertResponseToFormat ¶

func (c *OpenAIFormatConverter) ConvertResponseToFormat(universalResp *Response, originalModel string) (any, error)

ConvertResponseToFormat converts a Universal Response to OpenAI format. It implements the FormatConverter interface.

func (*OpenAIFormatConverter) ConvertResponseToOpenAI ¶

func (c *OpenAIFormatConverter) ConvertResponseToOpenAI(universalResp *Response, model string, promptTokens, completionTokens int) (*openaiChatCompletionResponse, error)

ConvertResponseToOpenAI converts a Universal Response to OpenAI chat completion response format.

func (*OpenAIFormatConverter) GetEndpoint ¶

func (c *OpenAIFormatConverter) GetEndpoint() string

GetEndpoint returns the OpenAI API endpoint path.

func (*OpenAIFormatConverter) GetProviderName ¶

func (c *OpenAIFormatConverter) GetProviderName() string

GetProviderName returns the provider name.

type Option ¶

type Option func(*Config)

Option is the function signature for configuration options. Each Option modifies a Config.

func WithAPIKey ¶

func WithAPIKey(apiKey string) Option

WithAPIKey sets the API key for authentication.

func WithBaseURL ¶

func WithBaseURL(baseURL string) Option

WithBaseURL sets a custom base URL for the API endpoint.

func WithModel ¶

func WithModel(model string) Option

WithModel sets the model name to use for the client.

func WithProvider ¶

func WithProvider(provider Provider) Option

WithProvider sets the AI provider.

func WithTimeout ¶

func WithTimeout(timeout time.Duration) Option

WithTimeout sets the HTTP client timeout.
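
The Option type follows the functional options pattern. The sketch below shows how such options are typically applied; it is not the library's implementation. The Config fields, the 30-second default timeout, the default provider, and the newConfig constructor are all assumptions introduced for illustration.

```go
package main

import (
	"fmt"
	"time"
)

type Provider string

const (
	ProviderOpenAI Provider = "openai"
	ProviderGemini Provider = "gemini"
)

// Config is a hypothetical stand-in; the real Config fields are not shown here.
type Config struct {
	Provider Provider
	APIKey   string
	Model    string
	Timeout  time.Duration
}

// Option matches the documented signature.
type Option func(*Config)

func WithProvider(p Provider) Option     { return func(c *Config) { c.Provider = p } }
func WithAPIKey(key string) Option       { return func(c *Config) { c.APIKey = key } }
func WithModel(model string) Option      { return func(c *Config) { c.Model = model } }
func WithTimeout(d time.Duration) Option { return func(c *Config) { c.Timeout = d } }

// newConfig is a hypothetical constructor: it starts from assumed defaults
// and applies each option in order, so later options override earlier ones.
func newConfig(opts ...Option) Config {
	cfg := Config{Provider: ProviderOpenAI, Timeout: 30 * time.Second}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newConfig(
		WithProvider(ProviderGemini),
		WithModel("example-model"),
		WithTimeout(10*time.Second),
	)
	fmt.Println(cfg.Provider, cfg.Model, cfg.Timeout)
}
```

Options left out of the call keep their defaults, which is the main appeal of this pattern over a long positional constructor.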

type Provider ¶

type Provider string

Provider defines the supported AI providers.

const (
	ProviderOpenAI    Provider = "openai"
	ProviderGemini    Provider = "gemini"
	ProviderAnthropic Provider = "anthropic"
)

type RateLimitError ¶

type RateLimitError struct {
	RetryAfter time.Duration
	// contains filtered or unexported fields
}

RateLimitError represents rate limiting errors (429).

func NewRateLimitError ¶

func NewRateLimitError(provider string, message string, retryAfter time.Duration, err error) *RateLimitError

NewRateLimitError creates a new rate limit error.

func (*RateLimitError) Error ¶

func (e *RateLimitError) Error() string

func (*RateLimitError) Provider ¶

func (e *RateLimitError) Provider() string

func (*RateLimitError) StatusCode ¶

func (e *RateLimitError) StatusCode() int

func (*RateLimitError) Unwrap ¶

func (e *RateLimitError) Unwrap() error
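
One plausible use of the RetryAfter field is a retry loop that waits for the suggested delay before trying again. The sketch below uses a local stand-in for RateLimitError (the unexported field is assumed); retry and flaky are hypothetical helpers, not library functions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RateLimitError is a local stand-in; only RetryAfter is documented.
type RateLimitError struct {
	RetryAfter time.Duration
	provider   string
}

func (e *RateLimitError) Error() string { return e.provider + ": rate limited" }

// retry calls fn up to attempts times, sleeping for RetryAfter between
// attempts when fn fails with a rate limit error. Any other result,
// including success, is returned immediately.
func retry(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = fn()
		var rl *RateLimitError
		if !errors.As(err, &rl) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(rl.RetryAfter)
		}
	}
	return err
}

// flaky returns a function that is rate limited for the first n calls.
func flaky(n int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= n {
			return &RateLimitError{RetryAfter: 5 * time.Millisecond, provider: "openai"}
		}
		return nil
	}
}

func main() {
	fmt.Println(retry(3, flaky(2))) // succeeds on the third attempt
	fmt.Println(retry(2, flaky(5))) // gives up while still rate limited
}
```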

type Request ¶

type Request struct {
	Model        string
	SystemPrompt string
	Messages     []Message
	Tools        []Tool
}

Request is a universal request structure for content generation.

func ConvertRequest ¶

func ConvertRequest(sourceFormat string, payload []byte) (*Request, error)

ConvertRequest converts a provider-specific request payload into the universal Request format.
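
To give a feel for what such a conversion involves, the following sketch dispatches on a source format name and maps a simplified OpenAI-style payload onto a universal request. It is an illustration under several assumptions: the format string "openai", the handling of string-only message content, and the choice to lift system messages into SystemPrompt are not confirmed by the documentation above, and all types are local stand-ins.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Role string

type Message struct {
	Role    Role
	Content string
}

// Request mirrors the documented struct, minus Tools.
type Request struct {
	Model        string
	SystemPrompt string
	Messages     []Message
}

// openAIPayload covers only the simplest chat payload shape.
type openAIPayload struct {
	Model    string `json:"model"`
	Messages []struct {
		Role    string `json:"role"`
		Content string `json:"content"`
	} `json:"messages"`
}

func convertOpenAI(payload []byte) (*Request, error) {
	var in openAIPayload
	if err := json.Unmarshal(payload, &in); err != nil {
		return nil, fmt.Errorf("parse openai payload: %w", err)
	}
	req := &Request{Model: in.Model}
	for _, m := range in.Messages {
		if m.Role == "system" {
			// Assumption: system messages become the SystemPrompt.
			req.SystemPrompt = m.Content
			continue
		}
		req.Messages = append(req.Messages, Message{Role: Role(m.Role), Content: m.Content})
	}
	return req, nil
}

// convertRequest is a hypothetical dispatcher keyed by format name.
func convertRequest(sourceFormat string, payload []byte) (*Request, error) {
	switch sourceFormat {
	case "openai":
		return convertOpenAI(payload)
	default:
		return nil, fmt.Errorf("unsupported source format %q", sourceFormat)
	}
}

func main() {
	payload := []byte(`{"model":"example-model","messages":[
		{"role":"system","content":"Be brief."},
		{"role":"user","content":"Hello"}]}`)
	req, err := convertRequest("openai", payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *req)
}
```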

func (*Request) Validate ¶

func (r *Request) Validate() error

Validate reports whether the request is valid, returning a non-nil error if it is not. It checks all request fields and is meant to run before the request is sent to the API.

type Response ¶

type Response struct {
	Text      string
	ToolCalls []ToolCall
}

Response is a universal response structure.

type Role ¶

type Role string

Role defines the originator of a message.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type ServerError ¶

type ServerError struct {
	// contains filtered or unexported fields
}

ServerError represents server-side errors (5xx).

func NewServerError ¶

func NewServerError(provider string, statusCode int, message string, err error) *ServerError

NewServerError creates a new server error.

func (*ServerError) Error ¶

func (e *ServerError) Error() string

func (*ServerError) Provider ¶

func (e *ServerError) Provider() string

func (*ServerError) StatusCode ¶

func (e *ServerError) StatusCode() int

func (*ServerError) Unwrap ¶

func (e *ServerError) Unwrap() error

type TimeoutError ¶

type TimeoutError struct {
	Duration time.Duration
	// contains filtered or unexported fields
}

TimeoutError represents timeout errors (context deadline exceeded).

func NewTimeoutError ¶

func NewTimeoutError(provider string, duration time.Duration, err error) *TimeoutError

NewTimeoutError creates a new timeout error.

func (*TimeoutError) Error ¶

func (e *TimeoutError) Error() string

func (*TimeoutError) Provider ¶

func (e *TimeoutError) Provider() string

func (*TimeoutError) StatusCode ¶

func (e *TimeoutError) StatusCode() int

func (*TimeoutError) Unwrap ¶

func (e *TimeoutError) Unwrap() error

type Tool ¶

type Tool struct {
	Type     string             `json:"type"`
	Function FunctionDefinition `json:"function"`
}

Tool defines a tool the model can use.
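
Since Parameters is a json.RawMessage, a tool's parameter schema can be carried as pre-encoded JSON. The sketch below builds and round-trips a Tool using local stand-in types. The Name and Description fields (and their JSON tags), the "function" type value, and the get_weather tool are assumptions for illustration; only the Parameters field and the Tool struct are taken from the declarations shown in this reference.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type FunctionDefinition struct {
	Name        string          `json:"name"`        // assumed field
	Description string          `json:"description"` // assumed field
	Parameters  json.RawMessage `json:"parameters"`
}

type Tool struct {
	Type     string             `json:"type"`
	Function FunctionDefinition `json:"function"`
}

// weatherTool returns a hypothetical tool whose parameters are a JSON Schema.
func weatherTool() Tool {
	schema := `{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`
	return Tool{
		Type: "function",
		Function: FunctionDefinition{
			Name:        "get_weather",
			Description: "Look up the current weather for a city.",
			Parameters:  json.RawMessage(schema),
		},
	}
}

// encodeTool rejects malformed parameter JSON before marshaling.
func encodeTool(t Tool) (string, error) {
	if !json.Valid(t.Function.Parameters) {
		return "", fmt.Errorf("tool %q has invalid parameters JSON", t.Function.Name)
	}
	b, err := json.Marshal(t)
	return string(b), err
}

func decodeTool(s string) (Tool, error) {
	var t Tool
	err := json.Unmarshal([]byte(s), &t)
	return t, err
}

func main() {
	out, err := encodeTool(weatherTool())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```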

type ToolCall ¶

type ToolCall struct {
	ID        string
	Type      string
	Function  string
	Arguments string
}

ToolCall represents a request from the model to call a specific tool.
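
When handling a ToolCall locally, the Arguments string usually has to be decoded before the named function can run. The sketch below assumes Arguments holds a JSON object, which is common for function calling but is not stated above; dispatch, weatherArgs, and the get_weather tool are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the documented struct.
type ToolCall struct {
	ID        string
	Type      string
	Function  string
	Arguments string
}

// weatherArgs is the hypothetical argument shape for "get_weather".
type weatherArgs struct {
	City string `json:"city"`
}

// dispatch routes a call by function name and decodes its arguments.
func dispatch(call ToolCall) (string, error) {
	switch call.Function {
	case "get_weather":
		var args weatherArgs
		if err := json.Unmarshal([]byte(call.Arguments), &args); err != nil {
			return "", fmt.Errorf("decode arguments for %s: %w", call.Function, err)
		}
		return "sunny in " + args.City, nil
	default:
		return "", fmt.Errorf("unknown tool %q", call.Function)
	}
}

func main() {
	result, err := dispatch(ToolCall{
		ID:        "call_1",
		Type:      "function",
		Function:  "get_weather",
		Arguments: `{"city":"Oslo"}`,
	})
	fmt.Println(result, err)
}
```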

type ToolServerClient ¶

type ToolServerClient struct {
	// contains filtered or unexported fields
}

ToolServerClient handles the connection lifecycle for a single tool server. It should be created and managed by the ToolServerManager.

func (*ToolServerClient) Close ¶

func (c *ToolServerClient) Close() error

Close terminates the session with the tool server.

func (*ToolServerClient) Connect ¶

func (c *ToolServerClient) Connect(ctx context.Context) error

Connect establishes a session with the tool server. It is optional to call this manually; methods like FetchTools and ExecuteTool will call it automatically if a session is not already active.

func (*ToolServerClient) ExecuteTool ¶

func (c *ToolServerClient) ExecuteTool(ctx context.Context, toolCall ToolCall) (string, error)

ExecuteTool executes a tool call, automatically connecting if necessary.

func (*ToolServerClient) FetchTools ¶

func (c *ToolServerClient) FetchTools(ctx context.Context) ([]Tool, error)

FetchTools lists available tools, automatically connecting if necessary.
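
The connect-on-demand behavior described for Connect, ExecuteTool, and FetchTools can be pictured with a small sketch. This is a generic illustration of the pattern, not the library's code: lazyClient, its fields, and the fixed tool list are hypothetical, and no real session is opened.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyClient is a hypothetical client that connects on first use.
type lazyClient struct {
	mu        sync.Mutex
	connected bool
	dials     int // counts how many times a session was opened
}

// Connect opens a session only if none is active, so it is safe to call
// repeatedly and from several goroutines.
func (c *lazyClient) Connect() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.connected {
		return nil
	}
	c.dials++
	c.connected = true
	return nil
}

// FetchTools connects if necessary before doing its work.
func (c *lazyClient) FetchTools() ([]string, error) {
	if err := c.Connect(); err != nil {
		return nil, err
	}
	return []string{"echo"}, nil
}

// Close ends the session; a later call will reconnect.
func (c *lazyClient) Close() error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.connected = false
	return nil
}

func main() {
	c := &lazyClient{}
	tools, _ := c.FetchTools() // connects implicitly
	_, _ = c.FetchTools()      // reuses the active session
	fmt.Println(tools, c.dials)
	_ = c.Close()
}
```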

type ToolServerConfig ¶

type ToolServerConfig struct {
	MCPServers map[string]ToolServerDetails `json:"mcpServers"`
}

ToolServerConfig defines the top-level structure of the mcp.json file.

type ToolServerDetails ¶

type ToolServerDetails struct {
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Env     map[string]string `json:"env"`
}

ToolServerDetails defines the configuration for a command-based tool server as found in the mcp.json file.
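
The two structs above map directly onto the mcp.json layout. As a quick illustration, the sketch below copies their declarations locally and decodes a sample file with encoding/json; the "filesystem" server entry and its values are made up for the example, and parseConfig is a hypothetical helper rather than a library function.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type ToolServerConfig struct {
	MCPServers map[string]ToolServerDetails `json:"mcpServers"`
}

type ToolServerDetails struct {
	Command string            `json:"command"`
	Args    []string          `json:"args"`
	Env     map[string]string `json:"env"`
}

// sample is an illustrative mcp.json document.
const sample = `{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
      "env": {"LOG_LEVEL": "debug"}
    }
  }
}`

// parseConfig decodes mcp.json content into the documented structure.
func parseConfig(data []byte) (ToolServerConfig, error) {
	var cfg ToolServerConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return cfg, fmt.Errorf("parse mcp.json: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, srv := range cfg.MCPServers {
		fmt.Println(name, srv.Command, srv.Args, srv.Env)
	}
}
```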

type ToolServerManager ¶

type ToolServerManager struct {
	// contains filtered or unexported fields
}

ToolServerManager discovers and manages tool server clients from various sources. It acts as a central registry for all known tool servers. All methods are safe for concurrent use.

func NewToolServerManager ¶

func NewToolServerManager() *ToolServerManager

NewToolServerManager creates a new, empty manager.

func (*ToolServerManager) AddRemoteServer ¶

func (m *ToolServerManager) AddRemoteServer(name, url string) error

AddRemoteServer programmatically registers a remote, HTTP-based tool server.

func (*ToolServerManager) GetClient ¶

func (m *ToolServerManager) GetClient(name string) (*ToolServerClient, bool)

GetClient retrieves a ready-to-use client for the server with the given name.

func (*ToolServerManager) ListServerNames ¶

func (m *ToolServerManager) ListServerNames() []string

ListServerNames returns a slice of the names of all registered servers.

func (*ToolServerManager) LoadFromFile ¶

func (m *ToolServerManager) LoadFromFile(configFile string) error

LoadFromFile parses a standard mcp.json file and registers all defined servers with the manager.

type UnknownError ¶

type UnknownError struct {
	// contains filtered or unexported fields
}

UnknownError represents unexpected errors that don't fit other categories.

func NewUnknownError ¶

func NewUnknownError(provider string, statusCode int, message string, err error) *UnknownError

NewUnknownError creates a new unknown error.

func (*UnknownError) Error ¶

func (e *UnknownError) Error() string

func (*UnknownError) Provider ¶

func (e *UnknownError) Provider() string

func (*UnknownError) StatusCode ¶

func (e *UnknownError) StatusCode() int

func (*UnknownError) Unwrap ¶

func (e *UnknownError) Unwrap() error

type VideoSource ¶

type VideoSource struct {
	Type   MediaSourceType // "url" or "base64"
	URL    string          // HTTP(S) URL to the video file
	Data   string          // Base64-encoded video data
	Format string          // Video format: "mp4", "mpeg", "mov", "avi", "flv", "webm", etc.
}

VideoSource represents a video input for video-enabled models (primarily Gemini).

Supported formats: MP4, MPEG, MOV, AVI, FLV, MPG, WEBM, WMV, 3GPP.

Directories ¶

Path	Synopsis
cmd/api-proxy	command
examples/audio_chat	command
examples/pdf_chat	command
examples/simple_chat	command
examples/tool_server_interaction	command
examples/universal_proxy_demo	command
examples/video_chat	command
examples/vision_chat	command
