Documentation ¶
Overview ¶
Package gemini implements the Google Gemini backend for WaveTerm's AI chat system.
This package provides a complete implementation of the UseChatBackend interface for Google's Gemini API, including:
- Streaming chat responses via Server-Sent Events (SSE)
- Function calling (tool use) support
- Multi-modal input support (text, images, PDFs)
- Proper message conversion and state management
API Type ¶
The Gemini backend uses the API type constant:
uctypes.APIType_GoogleGemini = "google-gemini"
Supported Features ¶
- Text messages
- Image uploads (JPEG, PNG, etc.) with inline base64 encoding
- PDF document uploads with inline base64 encoding
- Text file attachments
- Directory listings
- Function/tool calling with structured arguments
- Streaming responses with real-time token delivery
Usage ¶
The backend is automatically registered and can be obtained via:
backend, err := aiusechat.GetBackendByAPIType(uctypes.APIType_GoogleGemini)
To use the Gemini API:
- Obtain a Google AI API key
- Set the chat's APIType to APIType_GoogleGemini
- Set the Model (e.g., "gemini-2.0-flash-exp")
- Provide the API key in the Config.APIToken field
Configuration Example ¶
chatOpts := uctypes.WaveChatOpts{
	ChatId:   "my-chat-id",
	ClientId: "my-client-id",
	Config: uctypes.AIOptsType{
		APIType:   uctypes.APIType_GoogleGemini,
		Model:     "gemini-2.0-flash-exp",
		APIToken:  "your-google-api-key",
		MaxTokens: 8192,
		Capabilities: []string{
			uctypes.AICapabilityTools,
			uctypes.AICapabilityImages,
			uctypes.AICapabilityPdfs,
		},
	},
	Tools:        []uctypes.ToolDefinition{...},
	SystemPrompt: []string{"You are a helpful assistant."},
}
Message Format ¶
The Gemini backend uses the GeminiChatMessage type internally, which stores:
- MessageId: Unique identifier for idempotency
- Role: "user" or "model" (model is Gemini's term for assistant)
- Parts: Array of message parts (text, inline data, function calls/responses)
- Usage: Token usage metadata
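To make the stored shape concrete, the following sketch mirrors a subset of the GeminiChatMessage and GeminiMessagePart fields (as listed under Types) in local types and marshals one message to JSON. The local names chatMessage, messagePart, and encodeMessage, and the sample values, are illustrative assumptions and not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// messagePart mirrors only the Text field of GeminiMessagePart (illustrative subset).
type messagePart struct {
	Text string `json:"text,omitempty"`
}

// chatMessage mirrors GeminiChatMessage without the Usage field (illustrative subset).
type chatMessage struct {
	MessageId string        `json:"messageid"`
	Role      string        `json:"role"` // "user" or "model"
	Parts     []messagePart `json:"parts"`
}

// encodeMessage returns the JSON form of a stored message.
func encodeMessage(m chatMessage) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeMessage(chatMessage{
		MessageId: "msg-1", // hypothetical ID
		Role:      "user",
		Parts:     []messagePart{{Text: "hello"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Note that the stored message uses lowercase keys such as "messageid", while the API-facing types use camelCase keys such as "inlineData".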
Function Calling ¶
Function calling is supported via Gemini's native function calling feature:
- Tools are converted to Gemini's FunctionDeclaration format
- Function calls are streamed with real-time argument updates
- Function responses are sent back as user messages with FunctionResponse parts
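The last point can be sketched as follows: tool results become FunctionResponse parts inside a single message with role "user". The local types mirror subsets of GeminiFunctionResponse, GeminiMessagePart, and GeminiContent; the toolResult type, the toolResultsToContent helper, and the "output" response key are hypothetical and may differ from what ConvertToolResultsToGeminiChatMessage actually does.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// functionResponse mirrors the Name and Response fields of GeminiFunctionResponse.
type functionResponse struct {
	Name     string         `json:"name"`
	Response map[string]any `json:"response"`
}

// messagePart mirrors only the FunctionResponse field of GeminiMessagePart.
type messagePart struct {
	FunctionResponse *functionResponse `json:"functionResponse,omitempty"`
}

// content mirrors GeminiContent.
type content struct {
	Role  string        `json:"role,omitempty"`
	Parts []messagePart `json:"parts"`
}

// toolResult is a hypothetical stand-in for a tool execution result.
type toolResult struct {
	ToolName string
	Output   string
}

// toolResultsToContent wraps each result in a FunctionResponse part and
// returns them as one user-role message.
func toolResultsToContent(results []toolResult) content {
	parts := make([]messagePart, 0, len(results))
	for _, r := range results {
		parts = append(parts, messagePart{
			FunctionResponse: &functionResponse{
				Name:     r.ToolName,
				Response: map[string]any{"output": r.Output}, // "output" key is an assumption
			},
		})
	}
	return content{Role: "user", Parts: parts}
}

func main() {
	c := toolResultsToContent([]toolResult{{ToolName: "read_file", Output: "ok"}})
	b, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```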
API Endpoint ¶
By default, the backend uses:
https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent
You can override this by setting Config.BaseURL.
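As a rough illustration of the default and the override, the sketch below resolves a request URL from a model name and an optional base URL. The resolveEndpoint helper is hypothetical, and it assumes that a non-empty Config.BaseURL replaces the whole URL; the package may instead treat it as a prefix or append query parameters.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultEndpointTemplate is the default endpoint from the documentation,
// with the {model} placeholder expressed as a format verb.
const defaultEndpointTemplate = "https://generativelanguage.googleapis.com/v1beta/models/%s:streamGenerateContent"

// resolveEndpoint returns baseURL unchanged when it is set (assumed behavior),
// otherwise the default endpoint with the model name filled in.
func resolveEndpoint(baseURL, model string) string {
	if baseURL != "" {
		return baseURL
	}
	return fmt.Sprintf(defaultEndpointTemplate, url.PathEscape(model))
}

func main() {
	fmt.Println(resolveEndpoint("", "gemini-2.0-flash-exp"))
	fmt.Println(resolveEndpoint("http://localhost:8080/gemini", "gemini-2.0-flash-exp"))
}
```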
Error Handling ¶
The backend handles:
- Content blocking/safety filters
- Token limit errors
- Network errors
- Malformed responses
- Context cancellation
All errors are propagated through the SSE stream.
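For API-level failures, the GeminiErrorResponse and GeminiError types describe the JSON error body. The sketch below decodes such a body using local mirrors of those types; the describeErrorBody helper and the sample payload are illustrative assumptions, not the package's own error path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// geminiError mirrors GeminiError.
type geminiError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Status  string `json:"status,omitempty"`
}

// geminiErrorResponse mirrors GeminiErrorResponse.
type geminiErrorResponse struct {
	Error *geminiError `json:"error,omitempty"`
}

// describeErrorBody turns a raw error body into a one-line description.
// It falls back to a generic message when the body is not in the expected shape.
func describeErrorBody(body []byte) string {
	var resp geminiErrorResponse
	if err := json.Unmarshal(body, &resp); err != nil || resp.Error == nil {
		return "unrecognized error body"
	}
	return fmt.Sprintf("gemini error %d (%s): %s", resp.Error.Code, resp.Error.Status, resp.Error.Message)
}

func main() {
	body := []byte(`{"error":{"code":429,"message":"quota exceeded","status":"RESOURCE_EXHAUSTED"}}`)
	fmt.Println(describeErrorBody(body))
}
```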
Limitations ¶
- File uploads must be provided as base64-encoded inline data
- Images and PDFs use inline data, not file upload URIs
- Multi-turn conversations require proper role alternation (user/model)
- Some advanced Gemini features like caching are not yet implemented
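Because files travel as inline data, a caller needs to base64-encode raw bytes before attaching them. A minimal sketch, assuming standard (padded) base64 and a local mirror of the MimeType and Data fields of GeminiInlineData; the newInlineData helper is hypothetical:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// inlineData mirrors the MimeType and Data fields of GeminiInlineData.
type inlineData struct {
	MimeType string `json:"mimeType"`
	Data     string `json:"data"` // base64 encoded
}

// newInlineData encodes raw bytes with standard base64 for inline transport.
func newInlineData(mimeType string, raw []byte) inlineData {
	return inlineData{
		MimeType: mimeType,
		Data:     base64.StdEncoding.EncodeToString(raw),
	}
}

func main() {
	d := newInlineData("text/plain", []byte("hello"))
	fmt.Println(d.MimeType, d.Data) // prints: text/plain aGVsbG8=
}
```

Base64 inflates the payload by roughly one third, which is worth keeping in mind for large images and PDFs.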
Index ¶
- Constants
- func ConvertAIChatToUIChat(aiChat uctypes.AIChat) (*uctypes.UIChat, error)
- func GetFunctionCallInputByToolCallId(aiChat uctypes.AIChat, toolCallId string) *uctypes.AIFunctionCallInput
- func UpdateToolUseData(chatId string, toolCallId string, toolUseData uctypes.UIMessageDataToolUse) error
- type GeminiCandidate
- type GeminiChatMessage
- func ConvertAIMessageToGeminiChatMessage(aiMsg uctypes.AIMessage) (*GeminiChatMessage, error)
- func ConvertToolResultsToGeminiChatMessage(toolResults []uctypes.AIToolResult) (*GeminiChatMessage, error)
- func RunGeminiChatStep(ctx context.Context, sseHandler *sse.SSEHandlerCh, ...) (*uctypes.WaveStopReason, *GeminiChatMessage, *uctypes.RateLimitInfo, error)
- type GeminiContent
- type GeminiError
- type GeminiErrorResponse
- type GeminiFileData
- type GeminiFunctionCall
- type GeminiFunctionCallingConfig
- type GeminiFunctionDeclaration
- type GeminiFunctionResponse
- type GeminiGenerationConfig
- type GeminiGoogleSearch
- type GeminiGroundingMetadata
- type GeminiInlineData
- type GeminiMessagePart
- type GeminiPromptFeedback
- type GeminiRequest
- type GeminiSafetyRating
- type GeminiStreamResponse
- type GeminiThinkingConfig
- type GeminiTool
- type GeminiToolConfig
- type GeminiUsageMetadata
Constants ¶
const (
	GeminiDefaultMaxTokens = 8192
)
Variables ¶
This section is empty.
Functions ¶
func ConvertAIChatToUIChat ¶
func ConvertAIChatToUIChat(aiChat uctypes.AIChat) (*uctypes.UIChat, error)
ConvertAIChatToUIChat converts an AIChat to a UIChat for Gemini
func GetFunctionCallInputByToolCallId ¶
func GetFunctionCallInputByToolCallId(aiChat uctypes.AIChat, toolCallId string) *uctypes.AIFunctionCallInput
GetFunctionCallInputByToolCallId returns the function call input associated with the given tool call ID
func UpdateToolUseData ¶
func UpdateToolUseData(chatId string, toolCallId string, toolUseData uctypes.UIMessageDataToolUse) error
UpdateToolUseData updates the tool use data for a specific tool call in the chat
Types ¶
type GeminiCandidate ¶
type GeminiCandidate struct {
	Content           *GeminiContent           `json:"content,omitempty"`
	FinishReason      string                   `json:"finishReason,omitempty"`
	Index             int                      `json:"index,omitempty"`
	SafetyRatings     []GeminiSafetyRating     `json:"safetyRatings,omitempty"`
	GroundingMetadata *GeminiGroundingMetadata `json:"groundingMetadata,omitempty"`
}
GeminiCandidate represents a candidate response
type GeminiChatMessage ¶
type GeminiChatMessage struct {
	MessageId string               `json:"messageid"`
	Role      string               `json:"role"` // "user", "model"
	Parts     []GeminiMessagePart  `json:"parts"`
	Usage     *GeminiUsageMetadata `json:"usage,omitempty"`
}
GeminiChatMessage represents a stored chat message for Gemini backend
func ConvertAIMessageToGeminiChatMessage ¶
func ConvertAIMessageToGeminiChatMessage(aiMsg uctypes.AIMessage) (*GeminiChatMessage, error)
ConvertAIMessageToGeminiChatMessage converts an AIMessage to a GeminiChatMessage. These messages are ALWAYS role "user".
func ConvertToolResultsToGeminiChatMessage ¶
func ConvertToolResultsToGeminiChatMessage(toolResults []uctypes.AIToolResult) (*GeminiChatMessage, error)
ConvertToolResultsToGeminiChatMessage converts AIToolResult slice to GeminiChatMessage
func RunGeminiChatStep ¶
func RunGeminiChatStep(
	ctx context.Context,
	sseHandler *sse.SSEHandlerCh,
	chatOpts uctypes.WaveChatOpts,
	cont *uctypes.WaveContinueResponse,
) (*uctypes.WaveStopReason, *GeminiChatMessage, *uctypes.RateLimitInfo, error)
RunGeminiChatStep executes a chat step using the Gemini API
func (*GeminiChatMessage) GetMessageId ¶
func (m *GeminiChatMessage) GetMessageId() string
func (*GeminiChatMessage) GetRole ¶
func (m *GeminiChatMessage) GetRole() string
func (*GeminiChatMessage) GetUsage ¶
func (m *GeminiChatMessage) GetUsage() *uctypes.AIUsage
type GeminiContent ¶
type GeminiContent struct {
	Role  string              `json:"role,omitempty"`
	Parts []GeminiMessagePart `json:"parts"`
}
GeminiContent represents a content message for the API
func (*GeminiContent) Clean ¶
func (c *GeminiContent) Clean() *GeminiContent
Clean removes internal fields from all parts
type GeminiError ¶
type GeminiError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Status  string `json:"status,omitempty"`
}
GeminiError represents an error
type GeminiErrorResponse ¶
type GeminiErrorResponse struct {
	Error *GeminiError `json:"error,omitempty"`
}
GeminiErrorResponse represents an error response
type GeminiFileData ¶
type GeminiFileData struct {
	MimeType    string `json:"mimeType"`
	FileUri     string `json:"fileUri"`               // gs:// URI from file upload
	DisplayName string `json:"displayName,omitempty"` // for multimodal function responses
}
GeminiFileData represents uploaded file reference
type GeminiFunctionCall ¶
type GeminiFunctionCall struct {
	Name string         `json:"name"`
	Args map[string]any `json:"args,omitempty"`
}
GeminiFunctionCall represents a function call from the model
type GeminiFunctionCallingConfig ¶
type GeminiFunctionCallingConfig struct {
	Mode string `json:"mode,omitempty"` // "AUTO", "ANY", "NONE"
}
GeminiFunctionCallingConfig represents function calling configuration
type GeminiFunctionDeclaration ¶
type GeminiFunctionDeclaration struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters,omitempty"`
}
GeminiFunctionDeclaration represents a function schema
func ConvertToolDefinitionToGemini ¶
func ConvertToolDefinitionToGemini(tool uctypes.ToolDefinition) GeminiFunctionDeclaration
ConvertToolDefinitionToGemini converts a Wave ToolDefinition to Gemini format
type GeminiFunctionResponse ¶
type GeminiFunctionResponse struct {
	Name     string              `json:"name"`
	Response map[string]any      `json:"response"`
	Parts    []GeminiMessagePart `json:"parts,omitempty"` // nested parts for multimodal content (Gemini 3 Pro and later)
}
GeminiFunctionResponse represents a function execution result
type GeminiGenerationConfig ¶
type GeminiGenerationConfig struct {
	Temperature     float32               `json:"temperature,omitempty"`
	TopP            float32               `json:"topP,omitempty"`
	TopK            int32                 `json:"topK,omitempty"`
	CandidateCount  int32                 `json:"candidateCount,omitempty"`
	MaxOutputTokens int32                 `json:"maxOutputTokens,omitempty"`
	StopSequences   []string              `json:"stopSequences,omitempty"`
	ThinkingConfig  *GeminiThinkingConfig `json:"thinkingConfig,omitempty"` // for Gemini 3+ models
}
GeminiGenerationConfig represents generation parameters
type GeminiGoogleSearch ¶
type GeminiGoogleSearch struct{}
GeminiGoogleSearch represents Google Search configuration (empty for default)
type GeminiGroundingMetadata ¶
type GeminiGroundingMetadata struct {
	WebSearchQueries []string `json:"webSearchQueries,omitempty"`
}
GeminiGroundingMetadata represents grounding metadata with web search results
type GeminiInlineData ¶
type GeminiInlineData struct {
	MimeType    string `json:"mimeType"`
	Data        string `json:"data"`                  // base64 encoded
	DisplayName string `json:"displayName,omitempty"` // for multimodal function responses
}
GeminiInlineData represents inline binary data
type GeminiMessagePart ¶
type GeminiMessagePart struct {
	// Text part
	Text string `json:"text,omitempty"`

	// Inline data (images, PDFs, etc.)
	InlineData *GeminiInlineData `json:"inlineData,omitempty"`

	// File data (for uploaded files)
	FileData *GeminiFileData `json:"fileData,omitempty"`

	// Function call (assistant calling a tool)
	FunctionCall *GeminiFunctionCall `json:"functionCall,omitempty"`

	// Function response (result of tool execution)
	FunctionResponse *GeminiFunctionResponse `json:"functionResponse,omitempty"`

	// Thought signature (for thinking models - applies to text and function calls)
	ThoughtSignature string `json:"thoughtSignature,omitempty"`

	// Internal fields (not sent to API)
	PreviewUrl  string                        `json:"previewurl,omitempty"`  // internal field
	FileName    string                        `json:"filename,omitempty"`    // internal field
	ToolUseData *uctypes.UIMessageDataToolUse `json:"toolusedata,omitempty"` // internal field
}
GeminiMessagePart represents different types of content in a message
func (*GeminiMessagePart) Clean ¶
func (p *GeminiMessagePart) Clean() *GeminiMessagePart
Clean removes internal fields before sending to API
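One plausible way to implement this kind of cleaning is to copy the part and zero the internal fields, leaving the stored message untouched. The sketch below uses a local messagePart type with a subset of the fields; whether the real Clean copies or mutates in place is not stated here, so treat the copy semantics as an assumption.

```go
package main

import "fmt"

// messagePart mirrors a subset of GeminiMessagePart: one API field
// and two of the internal fields that must not be sent to the API.
type messagePart struct {
	Text       string `json:"text,omitempty"`
	PreviewUrl string `json:"previewurl,omitempty"` // internal field
	FileName   string `json:"filename,omitempty"`   // internal field
}

// clean returns a shallow copy with the internal fields cleared.
// A nil receiver yields nil. Copy-on-clean is an assumption of this sketch.
func (p *messagePart) clean() *messagePart {
	if p == nil {
		return nil
	}
	c := *p
	c.PreviewUrl = ""
	c.FileName = ""
	return &c
}

func main() {
	stored := &messagePart{Text: "see attachment", PreviewUrl: "blob:123", FileName: "a.png"}
	wire := stored.clean()
	fmt.Printf("stored=%+v\nwire=%+v\n", *stored, *wire)
}
```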
type GeminiPromptFeedback ¶
type GeminiPromptFeedback struct {
	BlockReason   string               `json:"blockReason,omitempty"`
	SafetyRatings []GeminiSafetyRating `json:"safetyRatings,omitempty"`
}
GeminiPromptFeedback represents feedback about the prompt
type GeminiRequest ¶
type GeminiRequest struct {
	Contents          []GeminiContent         `json:"contents"`
	SystemInstruction *GeminiContent          `json:"systemInstruction,omitempty"`
	GenerationConfig  *GeminiGenerationConfig `json:"generationConfig,omitempty"`
	Tools             []GeminiTool            `json:"tools,omitempty"`
	ToolConfig        *GeminiToolConfig       `json:"toolConfig,omitempty"`
}
GeminiRequest represents a request to the Gemini API
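The sketch below assembles a minimal request body from a system prompt, one user message, and a token limit, using local mirrors of a subset of GeminiRequest, GeminiContent, and GeminiGenerationConfig. The buildRequest helper is hypothetical, and mapping each system prompt string to its own text part is an assumption about how the backend fills SystemInstruction.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// part mirrors only the Text field of GeminiMessagePart.
type part struct {
	Text string `json:"text,omitempty"`
}

// content mirrors GeminiContent.
type content struct {
	Role  string `json:"role,omitempty"`
	Parts []part `json:"parts"`
}

// generationConfig mirrors only MaxOutputTokens from GeminiGenerationConfig.
type generationConfig struct {
	MaxOutputTokens int32 `json:"maxOutputTokens,omitempty"`
}

// request mirrors a subset of GeminiRequest.
type request struct {
	Contents          []content         `json:"contents"`
	SystemInstruction *content          `json:"systemInstruction,omitempty"`
	GenerationConfig  *generationConfig `json:"generationConfig,omitempty"`
}

// buildRequest creates a single-turn request. The system instruction is
// omitted entirely when no system prompt strings are given.
func buildRequest(systemPrompt []string, userText string, maxTokens int32) request {
	req := request{
		Contents:         []content{{Role: "user", Parts: []part{{Text: userText}}}},
		GenerationConfig: &generationConfig{MaxOutputTokens: maxTokens},
	}
	if len(systemPrompt) > 0 {
		sys := content{}
		for _, s := range systemPrompt {
			sys.Parts = append(sys.Parts, part{Text: s}) // one part per prompt string (assumption)
		}
		req.SystemInstruction = &sys
	}
	return req
}

func main() {
	req := buildRequest([]string{"You are a helpful assistant."}, "hi", 8192)
	b, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```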
type GeminiSafetyRating ¶
type GeminiSafetyRating struct {
	Category    string `json:"category"`
	Probability string `json:"probability"`
}
GeminiSafetyRating represents a safety rating
type GeminiStreamResponse ¶
type GeminiStreamResponse struct {
	Candidates        []GeminiCandidate        `json:"candidates,omitempty"`
	PromptFeedback    *GeminiPromptFeedback    `json:"promptFeedback,omitempty"`
	UsageMetadata     *GeminiUsageMetadata     `json:"usageMetadata,omitempty"`
	GroundingMetadata *GeminiGroundingMetadata `json:"groundingMetadata,omitempty"`
}
GeminiStreamResponse represents a streaming response chunk
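To show how such chunks might be consumed, the sketch below reads an SSE-formatted string, decodes each "data:" line into a local mirror of a subset of GeminiStreamResponse, and concatenates the text parts. The collectText helper and the sample stream are illustrative assumptions; the package's own stream loop handles more cases (function calls, usage metadata, prompt feedback). The default bufio.Scanner line limit is also a simplification, since real chunks with inline data could exceed it.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Local mirrors of the fields needed to extract streamed text.
type part struct {
	Text string `json:"text,omitempty"`
}

type content struct {
	Role  string `json:"role,omitempty"`
	Parts []part `json:"parts"`
}

type candidate struct {
	Content      *content `json:"content,omitempty"`
	FinishReason string   `json:"finishReason,omitempty"`
}

type streamResponse struct {
	Candidates []candidate `json:"candidates,omitempty"`
}

// collectText decodes every "data:" line of an SSE stream as one chunk
// and returns the concatenated text of all candidate parts.
func collectText(stream string) (string, error) {
	var sb strings.Builder
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue // skip blank separators and other SSE fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "" {
			continue
		}
		var chunk streamResponse
		if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
			return "", fmt.Errorf("decode chunk: %w", err)
		}
		for _, c := range chunk.Candidates {
			if c.Content == nil {
				continue
			}
			for _, p := range c.Content.Parts {
				sb.WriteString(p.Text)
			}
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	stream := "data: {\"candidates\":[{\"content\":{\"role\":\"model\",\"parts\":[{\"text\":\"Hel\"}]}}]}\n\n" +
		"data: {\"candidates\":[{\"content\":{\"role\":\"model\",\"parts\":[{\"text\":\"lo\"}]},\"finishReason\":\"STOP\"}]}\n\n"
	text, err := collectText(stream)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```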
type GeminiThinkingConfig ¶
type GeminiThinkingConfig struct {
	ThinkingLevel string `json:"thinkingLevel,omitempty"` // "low" or "high"
}
GeminiThinkingConfig represents thinking configuration for Gemini 3+ models
type GeminiTool ¶
type GeminiTool struct {
	FunctionDeclarations []GeminiFunctionDeclaration `json:"functionDeclarations,omitempty"`
	GoogleSearch         *GeminiGoogleSearch         `json:"googleSearch,omitempty"`
}
GeminiTool represents a function tool definition
type GeminiToolConfig ¶
type GeminiToolConfig struct {
	FunctionCallingConfig *GeminiFunctionCallingConfig `json:"functionCallingConfig,omitempty"`
}
GeminiToolConfig represents tool choice configuration
type GeminiUsageMetadata ¶
type GeminiUsageMetadata struct {
	Model                   string `json:"model,omitempty"` // internal field
	PromptTokenCount        int    `json:"promptTokenCount"`
	CachedContentTokenCount int    `json:"cachedContentTokenCount,omitempty"`
	CandidatesTokenCount    int    `json:"candidatesTokenCount"`
	TotalTokenCount         int    `json:"totalTokenCount"`
}
GeminiUsageMetadata represents token usage