Documentation
Index ¶
- Constants
- func CountMediaParts(msg Message) int
- func CountPartsByType(msg Message, contentType string) int
- func ExtractTextContent(msg Message) string
- func HasOnlyTextContent(msg Message) bool
- func MigrateMessagesToLegacy(messages []Message) error
- func MigrateMessagesToMultimodal(messages []Message)
- func MigrateToLegacy(msg *Message) error
- func MigrateToMultimodal(msg *Message)
- type ChunkReader
- type ChunkWriter
- type ContentPart
- func NewAudioPart(filePath string) (ContentPart, error)
- func NewAudioPartFromData(base64Data, mimeType string) ContentPart
- func NewImagePart(filePath string, detail *string) (ContentPart, error)
- func NewImagePartFromData(base64Data, mimeType string, detail *string) ContentPart
- func NewImagePartFromURL(url string, detail *string) ContentPart
- func NewTextPart(text string) ContentPart
- func NewVideoPart(filePath string) (ContentPart, error)
- func NewVideoPartFromData(base64Data, mimeType string) ContentPart
- func SplitMultimodalMessage(msg Message) (text string, mediaParts []ContentPart)
- type CostInfo
- type MediaChunk
- type MediaContent
- type MediaItemSummary
- type MediaSummary
- type Message
- func (m *Message) AddAudioPart(filePath string) error
- func (m *Message) AddImagePart(filePath string, detail *string) error
- func (m *Message) AddImagePartFromURL(url string, detail *string)
- func (m *Message) AddPart(part ContentPart)
- func (m *Message) AddTextPart(text string)
- func (m *Message) AddVideoPart(filePath string) error
- func (m *Message) GetContent() string
- func (m *Message) HasMediaContent() bool
- func (m *Message) IsMultimodal() bool
- func (m Message) MarshalJSON() ([]byte, error)
- func (m *Message) SetMultimodalContent(parts []ContentPart)
- func (m *Message) SetTextContent(text string)
- func (m *Message) UnmarshalJSON(data []byte) error
- type MessageToolCall
- type MessageToolResult
- type StreamingMediaConfig
- type ToolDef
- type ToolStats
- type ValidationError
- type ValidationResult
Constants ¶
const (
	ContentTypeText  = "text"
	ContentTypeImage = "image"
	ContentTypeAudio = "audio"
	ContentTypeVideo = "video"
)
ContentType constants for different content part types
const (
	MIMETypeImageJPEG = "image/jpeg"
	MIMETypeImagePNG  = "image/png"
	MIMETypeImageGIF  = "image/gif"
	MIMETypeImageWebP = "image/webp"
	MIMETypeAudioMP3  = "audio/mpeg"
	MIMETypeAudioWAV  = "audio/wav"
	MIMETypeAudioOgg  = "audio/ogg"
	MIMETypeAudioWebM = "audio/webm"
	MIMETypeVideoMP4  = "video/mp4"
	MIMETypeVideoWebM = "video/webm"
	MIMETypeVideoOgg  = "video/ogg"
)
Common MIME types
Functions ¶
func CountMediaParts ¶ added in v1.1.0
func CountMediaParts(msg Message) int
CountMediaParts returns the number of media parts (image, audio, video) in a message.
func CountPartsByType ¶ added in v1.1.0
func CountPartsByType(msg Message, contentType string) int
CountPartsByType returns the number of parts of a specific type in a message.
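The counting helpers can be pictured as a simple scan over Parts. The sketch below is a hypothetical re-implementation on local stand-in types (only the Type field is mirrored); it is not the package's own code, and the lower-case function names are illustrative.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented constants and fields.
const (
	ContentTypeText  = "text"
	ContentTypeImage = "image"
	ContentTypeAudio = "audio"
	ContentTypeVideo = "video"
)

type ContentPart struct {
	Type string
}

type Message struct {
	Parts []ContentPart
}

// countPartsByType counts the parts whose Type equals contentType.
func countPartsByType(msg Message, contentType string) int {
	n := 0
	for _, p := range msg.Parts {
		if p.Type == contentType {
			n++
		}
	}
	return n
}

// countMediaParts counts image, audio, and video parts together.
func countMediaParts(msg Message) int {
	return countPartsByType(msg, ContentTypeImage) +
		countPartsByType(msg, ContentTypeAudio) +
		countPartsByType(msg, ContentTypeVideo)
}

func main() {
	msg := Message{Parts: []ContentPart{
		{Type: ContentTypeText},
		{Type: ContentTypeImage},
		{Type: ContentTypeImage},
		{Type: ContentTypeAudio},
	}}
	fmt.Println(countPartsByType(msg, ContentTypeImage)) // 2
	fmt.Println(countMediaParts(msg))                    // 3
}
```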
func ExtractTextContent ¶ added in v1.1.0
func ExtractTextContent(msg Message) string
ExtractTextContent extracts all text content from a message, regardless of format. This is useful for backward compatibility when you need just the text.
func HasOnlyTextContent ¶ added in v1.1.0
func HasOnlyTextContent(msg Message) bool
HasOnlyTextContent returns true if the message contains only text (no media).
func MigrateMessagesToLegacy ¶ added in v1.1.0
func MigrateMessagesToLegacy(messages []Message) error
MigrateMessagesToLegacy converts a slice of multimodal messages to legacy format in-place. Returns an error if any message contains media content.
func MigrateMessagesToMultimodal ¶ added in v1.1.0
func MigrateMessagesToMultimodal(messages []Message)
MigrateMessagesToMultimodal converts a slice of legacy messages to multimodal format in-place
func MigrateToLegacy ¶ added in v1.1.0
func MigrateToLegacy(msg *Message) error
MigrateToLegacy converts a multimodal message back to legacy text-only format. This is useful for backward compatibility with systems that don't support multimodal content. Returns an error if the message contains non-text content.
func MigrateToMultimodal ¶ added in v1.1.0
func MigrateToMultimodal(msg *Message)
MigrateToMultimodal converts a legacy text-only message to use the Parts structure. This is useful when transitioning existing code to the new multimodal API.
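The two migration directions can be sketched as follows. This is a hypothetical re-implementation on local stand-in types, not the package's code. Two details are assumptions: that Content is cleared after migrating to Parts (borrowed from the documented AddPart behaviour), and that text parts are joined without a separator when migrating back.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const ContentTypeText = "text"

// Local stand-ins mirroring a subset of the documented fields.
type ContentPart struct {
	Type string
	Text *string
}

type Message struct {
	Role    string
	Content string
	Parts   []ContentPart
}

// migrateToMultimodal moves legacy Content into a single text part.
// Clearing Content afterwards is an assumption.
func migrateToMultimodal(msg *Message) {
	if len(msg.Parts) > 0 || msg.Content == "" {
		return
	}
	text := msg.Content
	msg.Parts = []ContentPart{{Type: ContentTypeText, Text: &text}}
	msg.Content = ""
}

// migrateToLegacy folds text parts back into Content and
// fails if any part is not text.
func migrateToLegacy(msg *Message) error {
	if len(msg.Parts) == 0 {
		return nil
	}
	var sb strings.Builder
	for _, p := range msg.Parts {
		if p.Type != ContentTypeText {
			return errors.New("message contains non-text content: " + p.Type)
		}
		if p.Text != nil {
			sb.WriteString(*p.Text) // no separator: an assumption
		}
	}
	msg.Content = sb.String()
	msg.Parts = nil
	return nil
}

func main() {
	msg := Message{Role: "user", Content: "hello"}
	migrateToMultimodal(&msg)
	fmt.Println(len(msg.Parts), msg.Content == "") // 1 true

	if err := migrateToLegacy(&msg); err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Println(msg.Content) // hello

	msg.Parts = []ContentPart{{Type: "image"}}
	fmt.Println(migrateToLegacy(&msg) != nil) // true
}
```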
Types ¶
type ChunkReader ¶ added in v1.1.0
type ChunkReader struct {
// contains filtered or unexported fields
}
ChunkReader reads from an io.Reader and produces MediaChunks. Useful for converting continuous streams (e.g., microphone input) into chunks.
Example usage:
reader := NewChunkReader(micInput, config)
for {
	chunk, err := reader.NextChunk(ctx)
	if err == io.EOF {
		break
	}
	if err != nil {
		return err
	}
	session.SendChunk(ctx, chunk)
}
func NewChunkReader ¶ added in v1.1.0
func NewChunkReader(r io.Reader, config StreamingMediaConfig) *ChunkReader
NewChunkReader creates a new ChunkReader that reads from the given reader and produces MediaChunks according to the config.
func (*ChunkReader) NextChunk ¶ added in v1.1.0
func (cr *ChunkReader) NextChunk(ctx context.Context) (*MediaChunk, error)
NextChunk reads the next chunk from the reader. Returns io.EOF when the stream is complete. The returned chunk's IsLast field will be true on the final chunk.
type ChunkWriter ¶ added in v1.1.0
type ChunkWriter struct {
// contains filtered or unexported fields
}
ChunkWriter writes MediaChunks to an io.Writer. Useful for converting chunks back into continuous streams (e.g., speaker output).
Example usage:
writer := NewChunkWriter(speakerOutput)
for chunk := range session.Response() {
	if chunk.MediaDelta != nil {
		err := writer.WriteChunk(chunk.MediaDelta)
		if err != nil {
			return err
		}
	}
}
func NewChunkWriter ¶ added in v1.1.0
func NewChunkWriter(w io.Writer) *ChunkWriter
NewChunkWriter creates a new ChunkWriter that writes to the given writer.
func (*ChunkWriter) Flush ¶ added in v1.1.0
func (cw *ChunkWriter) Flush() error
Flush flushes any buffered data to the underlying writer (if it supports flushing).
func (*ChunkWriter) WriteChunk ¶ added in v1.1.0
func (cw *ChunkWriter) WriteChunk(chunk *MediaChunk) (int, error)
WriteChunk writes a MediaChunk to the underlying writer. Returns the number of bytes written and any error encountered.
type ContentPart ¶ added in v1.1.0
type ContentPart struct {
Type string `json:"type"` // "text", "image", "audio", "video"
// For text content
Text *string `json:"text,omitempty"`
// For media content (image, audio, video)
Media *MediaContent `json:"media,omitempty"`
}
ContentPart represents a single piece of content in a multimodal message. A message can contain multiple parts: text, images, audio, video, etc.
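As an illustration of the wire shape implied by the struct tags above, the sketch below marshals two parts using local stand-in types (MediaContent is reduced to three fields). It shows only what the plain tags produce; any custom marshaling in the real package is outside its scope, and the URL is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins copying the documented JSON tags (subset of MediaContent).
type MediaContent struct {
	URL      *string `json:"url,omitempty"`
	MIMEType string  `json:"mime_type"`
	Detail   *string `json:"detail,omitempty"`
}

type ContentPart struct {
	Type  string        `json:"type"`
	Text  *string       `json:"text,omitempty"`
	Media *MediaContent `json:"media,omitempty"`
}

// encodePart returns the JSON encoding of a part as a string.
func encodePart(p ContentPart) string {
	b, err := json.Marshal(p)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	text := "Describe this image."
	url := "https://example.com/cat.png" // placeholder URL

	// Text part: the nil Media pointer is omitted from the output.
	fmt.Println(encodePart(ContentPart{Type: "text", Text: &text}))

	// Image part: the nil Text pointer is omitted from the output.
	fmt.Println(encodePart(ContentPart{
		Type:  "image",
		Media: &MediaContent{URL: &url, MIMEType: "image/png"},
	}))
}
```

Because Text and Media are pointers tagged omitempty, a part carries only the field that matches its Type.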
func NewAudioPart ¶ added in v1.1.0
func NewAudioPart(filePath string) (ContentPart, error)
NewAudioPart creates a ContentPart with audio content from a file path
func NewAudioPartFromData ¶ added in v1.1.0
func NewAudioPartFromData(base64Data, mimeType string) ContentPart
NewAudioPartFromData creates a ContentPart with base64-encoded audio data
func NewImagePart ¶ added in v1.1.0
func NewImagePart(filePath string, detail *string) (ContentPart, error)
NewImagePart creates a ContentPart with image content from a file path
func NewImagePartFromData ¶ added in v1.1.0
func NewImagePartFromData(base64Data, mimeType string, detail *string) ContentPart
NewImagePartFromData creates a ContentPart with base64-encoded image data
func NewImagePartFromURL ¶ added in v1.1.0
func NewImagePartFromURL(url string, detail *string) ContentPart
NewImagePartFromURL creates a ContentPart with image content from a URL
func NewTextPart ¶ added in v1.1.0
func NewTextPart(text string) ContentPart
NewTextPart creates a ContentPart with text content
func NewVideoPart ¶ added in v1.1.0
func NewVideoPart(filePath string) (ContentPart, error)
NewVideoPart creates a ContentPart with video content from a file path
func NewVideoPartFromData ¶ added in v1.1.0
func NewVideoPartFromData(base64Data, mimeType string) ContentPart
NewVideoPartFromData creates a ContentPart with base64-encoded video data
func SplitMultimodalMessage ¶ added in v1.1.0
func SplitMultimodalMessage(msg Message) (text string, mediaParts []ContentPart)
SplitMultimodalMessage splits a multimodal message into separate text and media parts. Returns the text content and a slice of media content parts.
func (*ContentPart) Validate ¶ added in v1.1.0
func (cp *ContentPart) Validate() error
Validate checks if the ContentPart is valid
type CostInfo ¶
type CostInfo struct {
InputTokens int `json:"input_tokens"` // Number of input tokens consumed
OutputTokens int `json:"output_tokens"` // Number of output tokens generated
CachedTokens int `json:"cached_tokens,omitempty"` // Number of cached tokens used (reduces cost)
InputCostUSD float64 `json:"input_cost_usd"` // Cost of input tokens in USD
OutputCostUSD float64 `json:"output_cost_usd"` // Cost of output tokens in USD
CachedCostUSD float64 `json:"cached_cost_usd,omitempty"` // Cost savings from cached tokens
TotalCost float64 `json:"total_cost_usd"` // Total cost in USD
}
CostInfo tracks token usage and associated costs for LLM operations. All cost values are in USD. Used for both individual messages and aggregated tracking.
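Since CostInfo is described as usable for aggregated tracking, a field-by-field sum is one plausible way to roll up per-message costs. The helper below is hypothetical and works on a local copy of the struct; it makes no assumption about how TotalCost relates to the other cost fields.

```go
package main

import "fmt"

// CostInfo is a local copy of the documented struct.
type CostInfo struct {
	InputTokens   int     `json:"input_tokens"`
	OutputTokens  int     `json:"output_tokens"`
	CachedTokens  int     `json:"cached_tokens,omitempty"`
	InputCostUSD  float64 `json:"input_cost_usd"`
	OutputCostUSD float64 `json:"output_cost_usd"`
	CachedCostUSD float64 `json:"cached_cost_usd,omitempty"`
	TotalCost     float64 `json:"total_cost_usd"`
}

// sumCosts adds every field across the given entries.
func sumCosts(costs []CostInfo) CostInfo {
	var total CostInfo
	for _, c := range costs {
		total.InputTokens += c.InputTokens
		total.OutputTokens += c.OutputTokens
		total.CachedTokens += c.CachedTokens
		total.InputCostUSD += c.InputCostUSD
		total.OutputCostUSD += c.OutputCostUSD
		total.CachedCostUSD += c.CachedCostUSD
		total.TotalCost += c.TotalCost
	}
	return total
}

func main() {
	// Illustrative numbers only.
	total := sumCosts([]CostInfo{
		{InputTokens: 100, OutputTokens: 20, InputCostUSD: 0.25, OutputCostUSD: 0.25, TotalCost: 0.5},
		{InputTokens: 50, OutputTokens: 10, InputCostUSD: 0.125, OutputCostUSD: 0.125, TotalCost: 0.25},
	})
	fmt.Println(total.InputTokens, total.OutputTokens, total.TotalCost)
}
```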
type MediaChunk ¶ added in v1.1.0
type MediaChunk struct {
// Data contains the raw media bytes for this chunk
Data []byte `json:"data"`
// SequenceNum is the sequence number for ordering chunks (starts at 0)
SequenceNum int64 `json:"sequence_num"`
// Timestamp indicates when this chunk was created
Timestamp time.Time `json:"timestamp"`
// IsLast indicates if this is the final chunk in the stream
IsLast bool `json:"is_last"`
// Metadata contains chunk-specific metadata (MIME type, encoding, etc.)
Metadata map[string]string `json:"metadata,omitempty"`
}
MediaChunk represents a chunk of streaming media data. Used for bidirectional streaming where media is sent or received in chunks.
Example usage:
chunk := &MediaChunk{
	Data:        audioData,
	SequenceNum: 1,
	Timestamp:   time.Now(),
	IsLast:      false,
	Metadata:    map[string]string{"mime_type": "audio/pcm"},
}
type MediaContent ¶ added in v1.1.0
type MediaContent struct {
// Data source - exactly one should be set
Data *string `json:"data,omitempty"` // Base64-encoded media data
FilePath *string `json:"file_path,omitempty"` // Local file path
URL *string `json:"url,omitempty"` // External URL (http/https)
// Storage backend reference (used when media is externalized)
StorageReference *string `json:"storage_reference,omitempty"` // Backend-specific storage reference
// Media metadata
MIMEType string `json:"mime_type"` // e.g., "image/jpeg", "audio/mpeg", "video/mp4"
Format *string `json:"format,omitempty"` // Optional format hint (e.g., "png", "mp3", "mp4")
SizeKB *int64 `json:"size_kb,omitempty"` // Optional size in kilobytes
Detail *string `json:"detail,omitempty"` // Optional detail level for images: "low", "high", "auto"
Caption *string `json:"caption,omitempty"` // Optional caption/description
Duration *int `json:"duration,omitempty"` // Optional duration in seconds (for audio/video)
BitRate *int `json:"bit_rate,omitempty"` // Optional bit rate in kbps (for audio/video)
Channels *int `json:"channels,omitempty"` // Optional number of channels (for audio)
Width *int `json:"width,omitempty"` // Optional width in pixels (for image/video)
Height *int `json:"height,omitempty"` // Optional height in pixels (for image/video)
FPS *int `json:"fps,omitempty"` // Optional frames per second (for video)
PolicyName *string `json:"policy_name,omitempty"` // Retention policy name
}
MediaContent represents media data (image, audio, video) in a message. Supports both inline base64 data and external file/URL references.
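The "exactly one should be set" rule on the data source can be sketched as a small check. This is a guess at the kind of validation involved, written against a local stand-in type; whether the package's Validate counts StorageReference as a source, or requires a non-empty MIMEType, is an assumption here.

```go
package main

import (
	"errors"
	"fmt"
)

// MediaContent is a local stand-in with only the source fields and MIME type.
type MediaContent struct {
	Data, FilePath, URL, StorageReference *string
	MIMEType                              string
}

// validateMedia checks that exactly one source is set and a MIME type is given.
// Treating StorageReference as a fourth source is an assumption.
func validateMedia(mc MediaContent) error {
	sources := 0
	for _, s := range []*string{mc.Data, mc.FilePath, mc.URL, mc.StorageReference} {
		if s != nil && *s != "" {
			sources++
		}
	}
	if sources == 0 {
		return errors.New("media content has no data source")
	}
	if sources > 1 {
		return fmt.Errorf("media content has %d data sources, want exactly one", sources)
	}
	if mc.MIMEType == "" {
		return errors.New("media content is missing a MIME type")
	}
	return nil
}

func main() {
	url := "https://example.com/cat.png" // placeholder URL
	data := "aGVsbG8="                   // base64 for "hello"

	fmt.Println(validateMedia(MediaContent{URL: &url, MIMEType: "image/png"}))
	fmt.Println(validateMedia(MediaContent{URL: &url, Data: &data, MIMEType: "image/png"}))
	fmt.Println(validateMedia(MediaContent{MIMEType: "image/png"}))
}
```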
func (*MediaContent) GetBase64Data ¶ deprecated, added in v1.1.0
func (mc *MediaContent) GetBase64Data() (string, error)
GetBase64Data returns the base64-encoded data for this media content. If the data is already base64-encoded, it returns it directly. If the data is from a file, it reads and encodes the file. If the data is from a URL or StorageReference, it returns an error (caller should use MediaLoader).
Deprecated: For new code, use providers.MediaLoader.GetBase64Data which supports all sources including storage references and URLs with proper context handling.
func (*MediaContent) ReadData ¶ added in v1.1.0
func (mc *MediaContent) ReadData() (io.ReadCloser, error)
ReadData returns an io.ReadCloser for the media content; the caller is responsible for closing it. For base64 data, it decodes the data and returns a reader over the result. For file paths, it opens and returns the file. For URLs, it returns an error (the caller should fetch the content separately).
func (*MediaContent) Validate ¶ added in v1.1.0
func (mc *MediaContent) Validate() error
Validate checks if the MediaContent is valid
type MediaItemSummary ¶ added in v1.1.0
type MediaItemSummary struct {
Type string `json:"type"` // Content type: "image", "audio", "video"
Source string `json:"source"` // Source description (file path, URL, or "inline data")
MIMEType string `json:"mime_type"` // MIME type
SizeBytes int `json:"size_bytes"` // Size in bytes (0 if unknown)
Detail string `json:"detail,omitempty"` // Detail level for images
Loaded bool `json:"loaded"` // Whether media was successfully loaded
Error string `json:"error,omitempty"` // Error message if loading failed
}
MediaItemSummary provides details about a single media item in a message.
type MediaSummary ¶ added in v1.1.0
type MediaSummary struct {
TotalParts int `json:"total_parts"` // Total number of content parts
TextParts int `json:"text_parts"` // Number of text parts
ImageParts int `json:"image_parts"` // Number of image parts
AudioParts int `json:"audio_parts"` // Number of audio parts
VideoParts int `json:"video_parts"` // Number of video parts
MediaItems []MediaItemSummary `json:"media_items,omitempty"` // Details of each media item
}
MediaSummary provides a high-level overview of media content in a message. This is included in JSON output to make multimodal messages more observable.
type Message ¶
type Message struct {
Role string `json:"role"` // "system", "user", "assistant", "tool"
Content string `json:"content"` // Message content (legacy text-only, maintained for backward compatibility)
// Multimodal content parts (text, images, audio, video)
// If Parts is non-empty, it takes precedence over Content.
// For backward compatibility, if Parts is empty, Content will be used.
Parts []ContentPart `json:"parts,omitempty"`
// Tool invocations (for assistant messages that call tools)
ToolCalls []MessageToolCall `json:"tool_calls,omitempty"`
// Tool result (for tool role messages)
// When Role="tool", this contains the tool execution result
ToolResult *MessageToolResult `json:"tool_result,omitempty"`
// Source indicates where this message originated (runtime-only, not persisted in JSON)
// Values: "statestore" (loaded from StateStore), "pipeline" (created during execution), "" (user input)
Source string `json:"-"`
// Metadata for observability and tracking
Timestamp time.Time `json:"timestamp,omitempty"` // When the message was created
LatencyMs int64 `json:"latency_ms,omitempty"` // Time taken to generate (for assistant messages)
CostInfo *CostInfo `json:"cost_info,omitempty"` // Token usage and cost tracking
Meta map[string]interface{} `json:"meta,omitempty"` // Custom metadata
// Validation results (for assistant messages)
Validations []ValidationResult `json:"validations,omitempty"`
}
Message represents a single message in a conversation. This is the canonical message type used throughout the system.
func CloneMessage ¶ added in v1.1.0
CloneMessage creates a deep copy of a message
func CombineTextAndMedia ¶ added in v1.1.0
func CombineTextAndMedia(role, text string, mediaParts []ContentPart) Message
CombineTextAndMedia creates a multimodal message from separate text and media parts. This is the inverse of SplitMultimodalMessage.
func ConvertTextToMultimodal ¶ added in v1.1.0
ConvertTextToMultimodal is a convenience function that creates a multimodal message from a role and text content. This helps with code migration.
func (*Message) AddAudioPart ¶ added in v1.1.0
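func (m *Message) AddAudioPart(filePath string) error
AddAudioPart adds an audio content part from a file path.
func (*Message) AddImagePart ¶ added in v1.1.0
func (m *Message) AddImagePart(filePath string, detail *string) error
AddImagePart adds an image content part from a file path.
func (*Message) AddImagePartFromURL ¶ added in v1.1.0
func (m *Message) AddImagePartFromURL(url string, detail *string)
AddImagePartFromURL adds an image content part from a URL.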
func (*Message) AddPart ¶ added in v1.1.0
func (m *Message) AddPart(part ContentPart)
AddPart adds a content part to the message. If this is the first part added, it clears the legacy Content field.
func (*Message) AddTextPart ¶ added in v1.1.0
func (m *Message) AddTextPart(text string)
AddTextPart adds a text content part to the message.
func (*Message) AddVideoPart ¶ added in v1.1.0
func (m *Message) AddVideoPart(filePath string) error
AddVideoPart adds a video content part from a file path.
func (*Message) GetContent ¶ added in v1.1.0
func (m *Message) GetContent() string
GetContent returns the content of the message. If Parts is non-empty, it returns only the text parts concatenated. Otherwise, it returns the legacy Content field.
func (*Message) HasMediaContent ¶ added in v1.1.0
func (m *Message) HasMediaContent() bool
HasMediaContent returns true if the message contains any media (image, audio, video).
func (*Message) IsMultimodal ¶ added in v1.1.0
func (m *Message) IsMultimodal() bool
IsMultimodal returns true if the message contains multimodal content (Parts).
func (Message) MarshalJSON ¶ added in v1.1.0
func (m Message) MarshalJSON() ([]byte, error)
MarshalJSON implements custom JSON marshaling for Message. This enhances the output by:
1. Populating the Content field with a human-readable summary when Parts exist
2. Adding a MediaSummary field for observability of multimodal content
3. Omitting the Content field when ToolResult is present to avoid duplication
func (*Message) SetMultimodalContent ¶ added in v1.1.0
func (m *Message) SetMultimodalContent(parts []ContentPart)
SetMultimodalContent sets the message content to multimodal parts. This clears the legacy Content field.
func (*Message) SetTextContent ¶ added in v1.1.0
func (m *Message) SetTextContent(text string)
SetTextContent sets the message content to simple text. This clears any existing Parts and sets the legacy Content field.
func (*Message) UnmarshalJSON ¶ added in v1.1.5
func (m *Message) UnmarshalJSON(data []byte) error
UnmarshalJSON implements custom JSON unmarshaling for Message. After unmarshaling, if ToolResult is present, it copies ToolResult.Content to Message.Content for provider compatibility (providers expect the Content field to be populated).
type MessageToolCall ¶
type MessageToolCall struct {
ID string `json:"id"` // Unique identifier for this tool call
Name string `json:"name"` // Name of the tool to invoke
Args json.RawMessage `json:"args"` // JSON-encoded tool arguments
}
MessageToolCall represents a request to call a tool within a Message. The Args field contains the JSON-encoded arguments for the tool.
type MessageToolResult ¶
type MessageToolResult struct {
ID string `json:"id"` // References the MessageToolCall.ID that triggered this result
Name string `json:"name"` // Tool name that was executed
Content string `json:"content"` // Result content or error message
Error string `json:"error,omitempty"` // Error message if tool execution failed
LatencyMs int64 `json:"latency_ms"` // Tool execution latency in milliseconds
}
MessageToolResult represents the result of a tool execution in a Message. When embedded in Message, the Message.Role should be "tool".
type StreamingMediaConfig ¶ added in v1.1.0
type StreamingMediaConfig struct {
// Type specifies the media type being streamed
// Values: ContentTypeAudio, ContentTypeVideo
Type string `json:"type"`
// ChunkSize is the target size in bytes for each chunk
// Typical values: 4096-8192 for audio, 32768-65536 for video
ChunkSize int `json:"chunk_size"`
// SampleRate is the audio sample rate in Hz
// Common values: 8000 (phone quality), 16000 (wideband), 44100 (CD quality), 48000 (pro audio)
SampleRate int `json:"sample_rate,omitempty"`
// Encoding specifies the audio encoding format
// Values: "pcm" (raw), "opus", "mp3", "aac"
Encoding string `json:"encoding,omitempty"`
// Channels is the number of audio channels
// Values: 1 (mono), 2 (stereo)
Channels int `json:"channels,omitempty"`
// BitDepth is the audio bit depth in bits
// Common values: 16, 24, 32
BitDepth int `json:"bit_depth,omitempty"`
// Width is the video width in pixels
Width int `json:"width,omitempty"`
// Height is the video height in pixels
Height int `json:"height,omitempty"`
// FrameRate is the video frame rate (FPS)
// Common values: 24, 30, 60
FrameRate int `json:"frame_rate,omitempty"`
// BufferSize is the maximum number of chunks to buffer
// Larger values increase latency but provide more stability
// Typical values: 5-20
BufferSize int `json:"buffer_size,omitempty"`
// FlushInterval is how often to flush buffered data (if applicable)
FlushInterval time.Duration `json:"flush_interval,omitempty"`
// Metadata contains additional provider-specific configuration
Metadata map[string]interface{} `json:"metadata,omitempty"`
}
StreamingMediaConfig configures streaming media input parameters. Used to configure audio/video streaming sessions with providers.
Example usage for audio streaming:
config := &StreamingMediaConfig{
	Type:       ContentTypeAudio,
	ChunkSize:  8192,  // 8KB chunks
	SampleRate: 16000, // 16kHz audio
	Encoding:   "pcm", // Raw PCM audio
	Channels:   1,     // Mono
	BufferSize: 10,    // Buffer 10 chunks
}
func (*StreamingMediaConfig) Validate ¶ added in v1.1.0
func (c *StreamingMediaConfig) Validate() error
Validate checks if the StreamingMediaConfig is valid
type ToolDef ¶
type ToolDef struct {
Name string `json:"name"` // Unique tool name
Description string `json:"description"` // Human-readable description of what the tool does
InputSchema json.RawMessage `json:"input_schema"` // JSON Schema for input validation
OutputSchema json.RawMessage `json:"output_schema,omitempty"` // Optional JSON Schema for output validation
}
ToolDef represents a tool definition that can be provided to an LLM. The InputSchema and OutputSchema use JSON Schema format for validation.
type ToolStats ¶
type ToolStats struct {
TotalCalls int `json:"total_calls"` // Total number of tool calls
ByTool map[string]int `json:"by_tool"` // Count of calls per tool name
}
ToolStats tracks tool usage statistics across a conversation or run. Useful for monitoring which tools are being used and how frequently.
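One way such statistics might be gathered is to walk a conversation and tally each message's ToolCalls. The collector below is a hypothetical helper on local stand-in types; the page does not show how the package itself populates ToolStats.

```go
package main

import "fmt"

// Local stand-ins mirroring a subset of the documented fields.
type MessageToolCall struct {
	ID   string
	Name string
}

type Message struct {
	Role      string
	ToolCalls []MessageToolCall
}

type ToolStats struct {
	TotalCalls int            `json:"total_calls"`
	ByTool     map[string]int `json:"by_tool"`
}

// collectToolStats tallies tool calls across all messages.
func collectToolStats(messages []Message) ToolStats {
	stats := ToolStats{ByTool: make(map[string]int)}
	for _, m := range messages {
		for _, call := range m.ToolCalls {
			stats.TotalCalls++
			stats.ByTool[call.Name]++
		}
	}
	return stats
}

func main() {
	// Tool names are illustrative.
	stats := collectToolStats([]Message{
		{Role: "assistant", ToolCalls: []MessageToolCall{{ID: "1", Name: "search"}, {ID: "2", Name: "calc"}}},
		{Role: "assistant", ToolCalls: []MessageToolCall{{ID: "3", Name: "search"}}},
		{Role: "user"},
	})
	fmt.Println(stats.TotalCalls, stats.ByTool["search"], stats.ByTool["calc"]) // 3 2 1
}
```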
type ValidationError ¶
type ValidationError struct {
Type string `json:"type"` // Error type: "args_invalid" | "result_invalid" | "policy_violation"
Tool string `json:"tool"` // Name of the tool that failed validation
Detail string `json:"detail"` // Human-readable error details
}
ValidationError represents a validation failure in tool usage or message content. Used to provide structured error information when validation fails.
type ValidationResult ¶
type ValidationResult struct {
ValidatorType string `json:"validator_type"` // Type of validator
Passed bool `json:"passed"` // Whether the validation passed
Details map[string]interface{} `json:"details,omitempty"` // Validator-specific details
Timestamp time.Time `json:"timestamp,omitempty"` // When validation was performed
}
ValidationResult represents the outcome of a validator check on a message. These are attached to assistant messages to show which validations passed or failed.