gemini

package

Version: v0.13.0-beta.0 | Published: Dec 6, 2025 | License: Apache-2.0 | Imports: 19 | Imported by: 0

Repository

github.com/wavetermdev/waveterm

Documentation ¶

Overview ¶

Package gemini implements the Google Gemini backend for WaveTerm's AI chat system.

This package provides a complete implementation of the UseChatBackend interface for Google's Gemini API, including:

Streaming chat responses via Server-Sent Events (SSE)
Function calling (tool use) support
Multi-modal input support (text, images, PDFs)
Proper message conversion and state management

API Type ¶

The Gemini backend uses the API type constant:

uctypes.APIType_GoogleGemini = "google-gemini"

Supported Features ¶

- Text messages
- Image uploads (JPEG, PNG, etc.) - inline base64 encoding
- PDF document uploads - inline base64 encoding
- Text file attachments
- Directory listings
- Function/tool calling with structured arguments
- Streaming responses with real-time token delivery

Usage ¶

The backend is automatically registered and can be obtained via:

backend, err := aiusechat.GetBackendByAPIType(uctypes.APIType_GoogleGemini)

To use the Gemini API:

Obtain a Google AI API key
Configure the chat with APIType_GoogleGemini
Set the Model (e.g., "gemini-2.0-flash-exp")
Provide the API key in the Config.APIToken field

Configuration Example ¶

chatOpts := uctypes.WaveChatOpts{
    ChatId:   "my-chat-id",
    ClientId: "my-client-id",
    Config: uctypes.AIOptsType{
        APIType:      uctypes.APIType_GoogleGemini,
        Model:        "gemini-2.0-flash-exp",
        APIToken:     "your-google-api-key",
        MaxTokens:    8192,
        Capabilities: []string{
            uctypes.AICapabilityTools,
            uctypes.AICapabilityImages,
            uctypes.AICapabilityPdfs,
        },
    },
    Tools:        []uctypes.ToolDefinition{...},
    SystemPrompt: []string{"You are a helpful assistant."},
}

Message Format ¶

The Gemini backend uses the GeminiChatMessage type internally, which stores:

MessageId: Unique identifier for idempotency
Role: "user" or "model" (model is Gemini's term for assistant)
Parts: Array of message parts (text, inline data, function calls/responses)
Usage: Token usage metadata

Function Calling ¶

Function calling is supported via Gemini's native function calling feature:

Tools are converted to Gemini's FunctionDeclaration format
Function calls are streamed with real-time argument updates
Function responses are sent back as user messages with FunctionResponse parts
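
The last step above can be sketched as follows. This is a minimal illustration, not the package's own code: the lowercase types are local stand-ins that mirror only the relevant fields of GeminiContent, GeminiMessagePart, and GeminiFunctionResponse, and the helper toolResultJSON and the tool name "read_file" are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring a subset of the package's types.
type functionResponse struct {
	Name     string         `json:"name"`
	Response map[string]any `json:"response"`
}

type messagePart struct {
	Text             string            `json:"text,omitempty"`
	FunctionResponse *functionResponse `json:"functionResponse,omitempty"`
}

type content struct {
	Role  string        `json:"role,omitempty"`
	Parts []messagePart `json:"parts"`
}

// toolResultJSON (hypothetical helper) wraps a tool result in a "user"
// message that carries a single FunctionResponse part, then serializes it.
func toolResultJSON(toolName string, result map[string]any) (string, error) {
	msg := content{
		Role: "user",
		Parts: []messagePart{
			{FunctionResponse: &functionResponse{Name: toolName, Response: result}},
		},
	}
	b, err := json.Marshal(msg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := toolResultJSON("read_file", map[string]any{"output": "hello"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

The function name in the response is expected to match the name from the model's preceding function call, which is how the result is paired with the call.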

API Endpoint ¶

By default, the backend uses:

https://generativelanguage.googleapis.com/v1beta/models/{model}:streamGenerateContent

You can override this by setting Config.BaseURL.
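
A rough sketch of how the endpoint could be derived from the model name is shown below. The helper buildStreamURL is hypothetical, and it assumes that a non-empty base URL replaces the whole endpoint; the package may instead treat Config.BaseURL as a prefix, so check the source before relying on either behavior.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultModelsURL is the prefix of the default endpoint quoted above.
const defaultModelsURL = "https://generativelanguage.googleapis.com/v1beta/models"

// buildStreamURL (hypothetical helper) returns the override when one is set,
// otherwise the default streaming endpoint for the given model.
// Assumption: the override replaces the full URL rather than just the host.
func buildStreamURL(baseURL, model string) string {
	if baseURL != "" {
		return baseURL
	}
	return fmt.Sprintf("%s/%s:streamGenerateContent", defaultModelsURL, url.PathEscape(model))
}

func main() {
	fmt.Println(buildStreamURL("", "gemini-2.0-flash-exp"))
	fmt.Println(buildStreamURL("https://proxy.example.com/gemini", "gemini-2.0-flash-exp"))
}
```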

Error Handling ¶

The backend handles:

Content blocking/safety filters
Token limit errors
Network errors
Malformed responses
Context cancellation

All errors are propagated through the SSE stream.

Limitations ¶

- File uploads must be provided as base64-encoded inline data
- Images and PDFs use inline data, not file upload URIs
- Multi-turn conversations require proper role alternation (user/model)
- Some advanced Gemini features like caching are not yet implemented
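
To make the role-alternation requirement concrete, here is a small validation sketch. The function checkAlternation is hypothetical and not part of the package; it only assumes the two role strings documented above and that consecutive messages must not share a role.

```go
package main

import "fmt"

// checkAlternation (hypothetical helper) reports the first place where a
// conversation uses an unknown role or repeats the previous message's role.
func checkAlternation(roles []string) error {
	for i, r := range roles {
		if r != "user" && r != "model" {
			return fmt.Errorf("message %d has unknown role %q", i, r)
		}
		if i > 0 && roles[i-1] == r {
			return fmt.Errorf("messages %d and %d both have role %q", i-1, i, r)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkAlternation([]string{"user", "model", "user"})) // <nil>
	fmt.Println(checkAlternation([]string{"user", "user"}))
}
```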

Index ¶

Constants
func ConvertAIChatToUIChat(aiChat uctypes.AIChat) (*uctypes.UIChat, error)
func GetFunctionCallInputByToolCallId(aiChat uctypes.AIChat, toolCallId string) *uctypes.AIFunctionCallInput
func UpdateToolUseData(chatId string, toolCallId string, toolUseData uctypes.UIMessageDataToolUse) error
type GeminiCandidate
type GeminiChatMessage
- func ConvertAIMessageToGeminiChatMessage(aiMsg uctypes.AIMessage) (*GeminiChatMessage, error)
- func ConvertToolResultsToGeminiChatMessage(toolResults []uctypes.AIToolResult) (*GeminiChatMessage, error)
- func RunGeminiChatStep(ctx context.Context, sseHandler *sse.SSEHandlerCh, ...) (*uctypes.WaveStopReason, *GeminiChatMessage, *uctypes.RateLimitInfo, error)
- func (m *GeminiChatMessage) GetMessageId() string
- func (m *GeminiChatMessage) GetRole() string
- func (m *GeminiChatMessage) GetUsage() *uctypes.AIUsage
type GeminiContent
- func (c *GeminiContent) Clean() *GeminiContent
type GeminiError
type GeminiErrorResponse
type GeminiFileData
type GeminiFunctionCall
type GeminiFunctionCallingConfig
type GeminiFunctionDeclaration
- func ConvertToolDefinitionToGemini(tool uctypes.ToolDefinition) GeminiFunctionDeclaration
type GeminiFunctionResponse
type GeminiGenerationConfig
type GeminiGoogleSearch
type GeminiGroundingMetadata
type GeminiInlineData
type GeminiMessagePart
- func (p *GeminiMessagePart) Clean() *GeminiMessagePart
type GeminiPromptFeedback
type GeminiRequest
type GeminiSafetyRating
type GeminiStreamResponse
type GeminiThinkingConfig
type GeminiTool
type GeminiToolConfig
type GeminiUsageMetadata

Constants ¶

const (
	GeminiDefaultMaxTokens = 8192
)
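
A default like this is commonly applied as a fallback when no limit is configured. The sketch below shows that pattern; the helper effectiveMaxTokens is hypothetical, and whether the backend applies the constant in exactly this way is an assumption.

```go
package main

import "fmt"

// GeminiDefaultMaxTokens mirrors the package constant shown above.
const GeminiDefaultMaxTokens = 8192

// effectiveMaxTokens (hypothetical helper) returns the configured limit,
// or the default when the configured value is unset (zero or negative).
func effectiveMaxTokens(configured int) int {
	if configured <= 0 {
		return GeminiDefaultMaxTokens
	}
	return configured
}

func main() {
	fmt.Println(effectiveMaxTokens(0))    // 8192
	fmt.Println(effectiveMaxTokens(2048)) // 2048
}
```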

Variables ¶

This section is empty.

Functions ¶

func ConvertAIChatToUIChat ¶

func ConvertAIChatToUIChat(aiChat uctypes.AIChat) (*uctypes.UIChat, error)

ConvertAIChatToUIChat converts an AIChat to a UIChat for Gemini

func GetFunctionCallInputByToolCallId ¶

func GetFunctionCallInputByToolCallId(aiChat uctypes.AIChat, toolCallId string) *uctypes.AIFunctionCallInput

GetFunctionCallInputByToolCallId returns the function call input associated with the given tool call ID

func UpdateToolUseData ¶

func UpdateToolUseData(chatId string, toolCallId string, toolUseData uctypes.UIMessageDataToolUse) error

UpdateToolUseData updates the tool use data for a specific tool call in the chat

Types ¶

type GeminiCandidate ¶

type GeminiCandidate struct {
	Content           *GeminiContent           `json:"content,omitempty"`
	FinishReason      string                   `json:"finishReason,omitempty"`
	Index             int                      `json:"index,omitempty"`
	SafetyRatings     []GeminiSafetyRating     `json:"safetyRatings,omitempty"`
	GroundingMetadata *GeminiGroundingMetadata `json:"groundingMetadata,omitempty"`
}

GeminiCandidate represents a candidate response

type GeminiChatMessage ¶

type GeminiChatMessage struct {
	MessageId string               `json:"messageid"`
	Role      string               `json:"role"` // "user", "model"
	Parts     []GeminiMessagePart  `json:"parts"`
	Usage     *GeminiUsageMetadata `json:"usage,omitempty"`
}

GeminiChatMessage represents a stored chat message for Gemini backend

func ConvertAIMessageToGeminiChatMessage ¶

func ConvertAIMessageToGeminiChatMessage(aiMsg uctypes.AIMessage) (*GeminiChatMessage, error)

ConvertAIMessageToGeminiChatMessage converts an AIMessage to a GeminiChatMessage. These messages are ALWAYS role "user".

func ConvertToolResultsToGeminiChatMessage ¶

func ConvertToolResultsToGeminiChatMessage(toolResults []uctypes.AIToolResult) (*GeminiChatMessage, error)

ConvertToolResultsToGeminiChatMessage converts AIToolResult slice to GeminiChatMessage

func RunGeminiChatStep ¶

func RunGeminiChatStep(
	ctx context.Context,
	sseHandler *sse.SSEHandlerCh,
	chatOpts uctypes.WaveChatOpts,
	cont *uctypes.WaveContinueResponse,
) (*uctypes.WaveStopReason, *GeminiChatMessage, *uctypes.RateLimitInfo, error)

RunGeminiChatStep executes a chat step using the Gemini API

func (*GeminiChatMessage) GetMessageId ¶

func (m *GeminiChatMessage) GetMessageId() string

func (*GeminiChatMessage) GetRole ¶

func (m *GeminiChatMessage) GetRole() string

func (*GeminiChatMessage) GetUsage ¶

func (m *GeminiChatMessage) GetUsage() *uctypes.AIUsage

type GeminiContent ¶

type GeminiContent struct {
	Role  string              `json:"role,omitempty"`
	Parts []GeminiMessagePart `json:"parts"`
}

GeminiContent represents a content message for the API

func (*GeminiContent) Clean ¶

func (c *GeminiContent) Clean() *GeminiContent

Clean removes internal fields from all parts

type GeminiError ¶

type GeminiError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Status  string `json:"status,omitempty"`
}

GeminiError represents an error

type GeminiErrorResponse ¶

type GeminiErrorResponse struct {
	Error *GeminiError `json:"error,omitempty"`
}

GeminiErrorResponse represents an error response
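
As an illustration of how an error body maps onto these two types, the sketch below decodes a sample payload. The lowercase types are local copies of the structs above, the helper describeError is hypothetical, and the sample JSON is made up for the example rather than captured from the API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Local copies of GeminiError and GeminiErrorResponse.
type geminiError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
	Status  string `json:"status,omitempty"`
}

type geminiErrorResponse struct {
	Error *geminiError `json:"error,omitempty"`
}

// describeError (hypothetical helper) decodes an error body and renders it
// as a single human-readable line.
func describeError(body []byte) (string, error) {
	var resp geminiErrorResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("malformed error response: %w", err)
	}
	if resp.Error == nil {
		return "", errors.New("response has no error field")
	}
	e := resp.Error
	return fmt.Sprintf("gemini error %d (%s): %s", e.Code, e.Status, e.Message), nil
}

func main() {
	body := []byte(`{"error":{"code":429,"message":"quota exceeded","status":"RESOURCE_EXHAUSTED"}}`)
	msg, err := describeError(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```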

type GeminiFileData ¶

type GeminiFileData struct {
	MimeType    string `json:"mimeType"`
	FileUri     string `json:"fileUri"`               // gs:// URI from file upload
	DisplayName string `json:"displayName,omitempty"` // for multimodal function responses
}

GeminiFileData represents uploaded file reference

type GeminiFunctionCall ¶

type GeminiFunctionCall struct {
	Name string         `json:"name"`
	Args map[string]any `json:"args,omitempty"`
}

GeminiFunctionCall represents a function call from the model

type GeminiFunctionCallingConfig ¶

type GeminiFunctionCallingConfig struct {
	Mode string `json:"mode,omitempty"` // "AUTO", "ANY", "NONE"
}

GeminiFunctionCallingConfig represents function calling configuration

type GeminiFunctionDeclaration ¶

type GeminiFunctionDeclaration struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters,omitempty"`
}

GeminiFunctionDeclaration represents a function schema

func ConvertToolDefinitionToGemini ¶

func ConvertToolDefinitionToGemini(tool uctypes.ToolDefinition) GeminiFunctionDeclaration

ConvertToolDefinitionToGemini converts a Wave ToolDefinition to Gemini format

type GeminiFunctionResponse ¶

type GeminiFunctionResponse struct {
	Name     string              `json:"name"`
	Response map[string]any      `json:"response"`
	Parts    []GeminiMessagePart `json:"parts,omitempty"` // nested parts for multimodal content (Gemini 3 Pro and later)
}

GeminiFunctionResponse represents a function execution result

type GeminiGenerationConfig ¶

type GeminiGenerationConfig struct {
	Temperature     float32               `json:"temperature,omitempty"`
	TopP            float32               `json:"topP,omitempty"`
	TopK            int32                 `json:"topK,omitempty"`
	CandidateCount  int32                 `json:"candidateCount,omitempty"`
	MaxOutputTokens int32                 `json:"maxOutputTokens,omitempty"`
	StopSequences   []string              `json:"stopSequences,omitempty"`
	ThinkingConfig  *GeminiThinkingConfig `json:"thinkingConfig,omitempty"` // for Gemini 3+ models
}

GeminiGenerationConfig represents generation parameters

type GeminiGoogleSearch ¶

type GeminiGoogleSearch struct{}

GeminiGoogleSearch represents Google Search configuration (empty for default)

type GeminiGroundingMetadata ¶

type GeminiGroundingMetadata struct {
	WebSearchQueries []string `json:"webSearchQueries,omitempty"`
}

GeminiGroundingMetadata represents grounding metadata with web search results

type GeminiInlineData ¶

type GeminiInlineData struct {
	MimeType    string `json:"mimeType"`
	Data        string `json:"data"`                  // base64 encoded
	DisplayName string `json:"displayName,omitempty"` // for multimodal function responses
}

GeminiInlineData represents inline binary data
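
The Data field holds the file contents as a base64 string. A minimal sketch of filling it is shown below; the type is a local copy of the struct above and newInlineData is a hypothetical constructor, not a function the package exports.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// inlineData is a local copy of GeminiInlineData.
type inlineData struct {
	MimeType    string `json:"mimeType"`
	Data        string `json:"data"`
	DisplayName string `json:"displayName,omitempty"`
}

// newInlineData (hypothetical helper) base64-encodes raw bytes with the
// standard alphabet and pairs them with their MIME type.
func newInlineData(mimeType string, raw []byte) inlineData {
	return inlineData{
		MimeType: mimeType,
		Data:     base64.StdEncoding.EncodeToString(raw),
	}
}

func main() {
	// Placeholder bytes standing in for real file contents.
	d := newInlineData("text/plain", []byte("hello"))
	fmt.Println(d.MimeType, d.Data) // text/plain aGVsbG8=
}
```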

type GeminiMessagePart ¶

type GeminiMessagePart struct {
	// Text part
	Text string `json:"text,omitempty"`

	// Inline data (images, PDFs, etc.)
	InlineData *GeminiInlineData `json:"inlineData,omitempty"`

	// File data (for uploaded files)
	FileData *GeminiFileData `json:"fileData,omitempty"`

	// Function call (assistant calling a tool)
	FunctionCall *GeminiFunctionCall `json:"functionCall,omitempty"`

	// Function response (result of tool execution)
	FunctionResponse *GeminiFunctionResponse `json:"functionResponse,omitempty"`

	// Thought signature (for thinking models - applies to text and function calls)
	ThoughtSignature string `json:"thoughtSignature,omitempty"`

	// Internal fields (not sent to API)
	PreviewUrl  string                        `json:"previewurl,omitempty"`  // internal field
	FileName    string                        `json:"filename,omitempty"`    // internal field
	ToolUseData *uctypes.UIMessageDataToolUse `json:"toolusedata,omitempty"` // internal field
}

GeminiMessagePart represents different types of content in a message

func (*GeminiMessagePart) Clean ¶

func (p *GeminiMessagePart) Clean() *GeminiMessagePart

Clean removes internal fields before sending to API
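
One plausible shape for such a method is sketched below, on a trimmed-down local type. It assumes Clean returns a copy with the internal fields zeroed and leaves the receiver untouched; the actual implementation may differ, and ToolUseData (omitted here) would be cleared the same way.

```go
package main

import "fmt"

// part is a trimmed-down local stand-in for GeminiMessagePart.
type part struct {
	Text       string `json:"text,omitempty"`
	PreviewUrl string `json:"previewurl,omitempty"` // internal field
	FileName   string `json:"filename,omitempty"`   // internal field
}

// clean returns a shallow copy with the internal fields cleared, so the
// stored message keeps its metadata while the API payload omits it.
func (p *part) clean() *part {
	if p == nil {
		return nil
	}
	c := *p
	c.PreviewUrl = ""
	c.FileName = ""
	return &c
}

func main() {
	stored := &part{Text: "see attached", PreviewUrl: "data:...", FileName: "a.png"}
	wire := stored.clean()
	fmt.Printf("%+v\n", *wire)
	fmt.Println(stored.FileName) // a.png
}
```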

type GeminiPromptFeedback ¶

type GeminiPromptFeedback struct {
	BlockReason   string               `json:"blockReason,omitempty"`
	SafetyRatings []GeminiSafetyRating `json:"safetyRatings,omitempty"`
}

GeminiPromptFeedback represents feedback about the prompt

type GeminiRequest ¶

type GeminiRequest struct {
	Contents          []GeminiContent         `json:"contents"`
	SystemInstruction *GeminiContent          `json:"systemInstruction,omitempty"`
	GenerationConfig  *GeminiGenerationConfig `json:"generationConfig,omitempty"`
	Tools             []GeminiTool            `json:"tools,omitempty"`
	ToolConfig        *GeminiToolConfig       `json:"toolConfig,omitempty"`
}

GeminiRequest represents a request to the Gemini API
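
To show how these fields fit together on the wire, the sketch below assembles a minimal text-only request and prints its JSON. The lowercase types mirror a subset of the structs in this package, and buildRequestJSON is a hypothetical helper; the real backend builds its request from WaveChatOpts and the stored chat history.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring a subset of the package's request types.
type part struct {
	Text string `json:"text,omitempty"`
}

type content struct {
	Role  string `json:"role,omitempty"`
	Parts []part `json:"parts"`
}

type generationConfig struct {
	MaxOutputTokens int32 `json:"maxOutputTokens,omitempty"`
}

type request struct {
	Contents          []content         `json:"contents"`
	SystemInstruction *content          `json:"systemInstruction,omitempty"`
	GenerationConfig  *generationConfig `json:"generationConfig,omitempty"`
}

// buildRequestJSON (hypothetical helper) serializes a single-turn request
// with a system instruction and an output token limit.
func buildRequestJSON(systemPrompt, userText string, maxTokens int32) (string, error) {
	req := request{
		Contents:          []content{{Role: "user", Parts: []part{{Text: userText}}}},
		SystemInstruction: &content{Parts: []part{{Text: systemPrompt}}},
		GenerationConfig:  &generationConfig{MaxOutputTokens: maxTokens},
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := buildRequestJSON("You are a helpful assistant.", "Hi", 8192)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```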

type GeminiSafetyRating ¶

type GeminiSafetyRating struct {
	Category    string `json:"category"`
	Probability string `json:"probability"`
}

GeminiSafetyRating represents a safety rating

type GeminiStreamResponse ¶

type GeminiStreamResponse struct {
	Candidates        []GeminiCandidate        `json:"candidates,omitempty"`
	PromptFeedback    *GeminiPromptFeedback    `json:"promptFeedback,omitempty"`
	UsageMetadata     *GeminiUsageMetadata     `json:"usageMetadata,omitempty"`
	GroundingMetadata *GeminiGroundingMetadata `json:"groundingMetadata,omitempty"`
}

GeminiStreamResponse represents a streaming response chunk
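
The sketch below shows one way chunks of this shape could be consumed. It assumes each chunk arrives as an SSE line of the form "data: {json}", uses local stand-ins for a subset of the types above, and the helper collectText is hypothetical. The real backend forwards tokens as they arrive instead of buffering them.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Local stand-ins mirroring a subset of the streaming types.
type part struct {
	Text string `json:"text,omitempty"`
}

type content struct {
	Role  string `json:"role,omitempty"`
	Parts []part `json:"parts"`
}

type candidate struct {
	Content      *content `json:"content,omitempty"`
	FinishReason string   `json:"finishReason,omitempty"`
}

type streamResponse struct {
	Candidates []candidate `json:"candidates,omitempty"`
}

// collectText (hypothetical helper) decodes every "data: " line and
// concatenates the text parts of all candidates, in order.
// Note: bufio.Scanner's default buffer caps a single line at 64 KiB.
func collectText(sseBody string) (string, error) {
	var sb strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(sseBody))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue // skip blank separators and non-data fields
		}
		var chunk streamResponse
		if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &chunk); err != nil {
			return "", fmt.Errorf("malformed chunk: %w", err)
		}
		for _, cand := range chunk.Candidates {
			if cand.Content == nil {
				continue
			}
			for _, p := range cand.Content.Parts {
				sb.WriteString(p.Text)
			}
		}
	}
	return sb.String(), scanner.Err()
}

func main() {
	body := `data: {"candidates":[{"content":{"role":"model","parts":[{"text":"Hel"}]}}]}` + "\n\n" +
		`data: {"candidates":[{"content":{"role":"model","parts":[{"text":"lo"}]},"finishReason":"STOP"}]}` + "\n\n"
	text, err := collectText(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text) // Hello
}
```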

type GeminiThinkingConfig ¶

type GeminiThinkingConfig struct {
	ThinkingLevel string `json:"thinkingLevel,omitempty"` // "low" or "high"
}

GeminiThinkingConfig represents thinking configuration for Gemini 3+ models

type GeminiTool ¶

type GeminiTool struct {
	FunctionDeclarations []GeminiFunctionDeclaration `json:"functionDeclarations,omitempty"`
	GoogleSearch         *GeminiGoogleSearch         `json:"googleSearch,omitempty"`
}

GeminiTool represents a function tool definition

type GeminiToolConfig ¶

type GeminiToolConfig struct {
	FunctionCallingConfig *GeminiFunctionCallingConfig `json:"functionCallingConfig,omitempty"`
}

GeminiToolConfig represents tool choice configuration
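
For orientation, the sketch below serializes a tool list together with a tool config the way GeminiTool, GeminiFunctionDeclaration, and GeminiToolConfig nest in a request. The lowercase types are local stand-ins, and the helper toolsJSON and the function "get_time" are hypothetical examples.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring the tool-related types.
type functionDeclaration struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters,omitempty"`
}

type tool struct {
	FunctionDeclarations []functionDeclaration `json:"functionDeclarations,omitempty"`
}

type functionCallingConfig struct {
	Mode string `json:"mode,omitempty"` // "AUTO", "ANY", "NONE"
}

type toolConfig struct {
	FunctionCallingConfig *functionCallingConfig `json:"functionCallingConfig,omitempty"`
}

// toolSection holds only the tool-related fields of a request.
type toolSection struct {
	Tools      []tool      `json:"tools,omitempty"`
	ToolConfig *toolConfig `json:"toolConfig,omitempty"`
}

// toolsJSON (hypothetical helper) declares one parameterless function and
// sets the function calling mode.
func toolsJSON(mode string) (string, error) {
	section := toolSection{
		Tools: []tool{{FunctionDeclarations: []functionDeclaration{
			{Name: "get_time", Description: "Returns the current time"},
		}}},
		ToolConfig: &toolConfig{FunctionCallingConfig: &functionCallingConfig{Mode: mode}},
	}
	b, err := json.Marshal(section)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := toolsJSON("AUTO")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```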

type GeminiUsageMetadata ¶

type GeminiUsageMetadata struct {
	Model                   string `json:"model,omitempty"` // internal field
	PromptTokenCount        int    `json:"promptTokenCount"`
	CachedContentTokenCount int    `json:"cachedContentTokenCount,omitempty"`
	CandidatesTokenCount    int    `json:"candidatesTokenCount"`
	TotalTokenCount         int    `json:"totalTokenCount"`
}

GeminiUsageMetadata represents token usage
