Documentation ¶
Overview ¶
Package model defines the shared data types used across the SDK: multi-modal messages, tool calling protocol, and token usage tracking.
Index ¶
- func MarshalToolArgs(args any) (string, error)
- type DataRef
- type FileRef
- type MediaRef
- type Message
- func CloneMessages(msgs []Message) []Message
- func LastByRole(msgs []Message, role Role) (Message, bool)
- func NewImageMessage(role Role, text, imageURL string) Message
- func NewTextMessage(role Role, text string) Message
- func NewToolCallMessage(calls []ToolCall) Message
- func NewToolResultMessage(results []ToolResult) Message
- type Part
- type PartType
- type Role
- type StreamChunk
- type TokenUsage
- type ToolCall
- type ToolDefinition
- type ToolResult
- type Usage
Functions ¶
func MarshalToolArgs ¶
MarshalToolArgs marshals arguments to a JSON string suitable for ToolCall.Arguments.
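A minimal sketch of how such a helper might behave, assuming it wraps encoding/json. The local marshalToolArgs function and the redeclared ToolCall struct are stand-ins for illustration, not the package's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the documented struct; it is redeclared here so the
// sketch compiles on its own.
type ToolCall struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// marshalToolArgs is a hypothetical stand-in for model.MarshalToolArgs.
// Assumption: it encodes args with encoding/json and returns a string.
func marshalToolArgs(args any) (string, error) {
	b, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	args, err := marshalToolArgs(map[string]any{"city": "Paris", "days": 3})
	if err != nil {
		panic(err)
	}
	call := ToolCall{ID: "call_1", Name: "get_forecast", Arguments: args}
	fmt.Println(call.Arguments) // {"city":"Paris","days":3}
}
```

Keeping Arguments as a JSON string rather than a map lets the value travel unchanged between providers; the receiving tool decodes it into whatever type it expects.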
Types ¶
type DataRef ¶
type DataRef struct {
	MimeType string         `json:"mime_type,omitempty"`
	Value    map[string]any `json:"value"`
}
DataRef carries structured JSON-compatible data in a message part.
type FileRef ¶
type FileRef struct {
	URI      string `json:"uri"`
	MimeType string `json:"mime_type,omitempty"`
	Name     string `json:"name,omitempty"`
}
FileRef references a generic file (URI + MIME), e.g. for A2A-style payloads.
type MediaRef ¶
type MediaRef struct {
	URL       string `json:"url,omitempty"`
	Base64    string `json:"base64,omitempty"`
	MediaType string `json:"media_type,omitempty"`
}
MediaRef references an image or audio asset.
type Message ¶
Message is a multi-modal, provider-agnostic chat message.
func CloneMessages ¶ added in v0.2.3
CloneMessages returns a deep copy of msgs. A nil input returns nil rather than an empty slice, so callers keep the usual nil-slice behaviour, such as encoding to JSON null rather than [].
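The nil-preserving behaviour matters mostly for JSON output. The sketch below uses a reduced, hypothetical Message type and a local cloneMessages function (not the package's code) to show the difference between a nil and an empty slice after cloning.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a reduced stand-in for model.Message. It has no pointer or
// slice fields, so a plain copy is already a deep copy in this sketch.
type Message struct {
	Role string `json:"role"`
	Text string `json:"text"`
}

// cloneMessages is a hypothetical nil-preserving copy.
func cloneMessages(msgs []Message) []Message {
	if msgs == nil {
		return nil // keep nil as nil instead of allocating an empty slice
	}
	out := make([]Message, len(msgs))
	copy(out, msgs)
	return out
}

func main() {
	var none []Message
	nilJSON, _ := json.Marshal(cloneMessages(none))
	emptyJSON, _ := json.Marshal(cloneMessages([]Message{}))
	fmt.Println(string(nilJSON), string(emptyJSON)) // null []
}
```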
func LastByRole ¶ added in v0.3.0
LastByRole returns the last message in msgs whose Role matches role. The boolean is false when no such message exists. The returned Message is a by-value copy of the slice element, not a deep copy: it still shares its Parts slice and pointer-backed payloads with msgs, so callers that intend to mutate it should call Message.Clone first.
Typical use is for graph nodes that need to read a single role-scoped turn from a board channel — e.g. "the latest user message on MainChannel" — without re-implementing the reverse scan everywhere:
if m, ok := model.LastByRole(b.Channel(engine.MainChannel), model.RoleUser); ok {
	query = m.Content()
}
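For readers curious what the reverse scan amounts to, here is a minimal sketch with local stand-in types. The lastByRole function, the reduced Message struct, and the role string values are assumptions for illustration only.

```go
package main

import "fmt"

type Role string

// Hypothetical role values; the package's real constants may differ.
const (
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
)

// Message is a reduced stand-in for model.Message.
type Message struct {
	Role Role
	Text string
}

// lastByRole walks msgs from the end and returns the first match.
func lastByRole(msgs []Message, role Role) (Message, bool) {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == role {
			return msgs[i], true
		}
	}
	return Message{}, false
}

func main() {
	history := []Message{
		{Role: RoleUser, Text: "first question"},
		{Role: RoleAssistant, Text: "an answer"},
		{Role: RoleUser, Text: "follow-up"},
	}
	if m, ok := lastByRole(history, RoleUser); ok {
		fmt.Println(m.Text) // follow-up
	}
	_, ok := lastByRole(history, Role("tool"))
	fmt.Println(ok) // false
}
```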
func NewImageMessage ¶
NewImageMessage creates a message with the given role that contains text and an image URL.
func NewTextMessage ¶
NewTextMessage creates a simple text message.
func NewToolCallMessage ¶
NewToolCallMessage creates an assistant message containing tool calls.
func NewToolResultMessage ¶
func NewToolResultMessage(results []ToolResult) Message
NewToolResultMessage creates a tool-role message with multiple results.
func (Message) Clone ¶ added in v0.2.3
Clone returns a deep copy of m. It duplicates the Parts slice and all pointer-backed payloads so callers can safely retain or mutate the result.
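The difference between a plain assignment and a deep copy can be sketched as follows. The types are trimmed to two payload fields and the clone method is a hypothetical illustration, not the package's implementation.

```go
package main

import "fmt"

// Reduced stand-ins for the documented types.
type MediaRef struct{ URL string }

type Part struct {
	Text  string
	Image *MediaRef
}

type Message struct {
	Role  string
	Parts []Part
}

// clone is a hypothetical deep copy covering only the fields in this sketch:
// it allocates a new Parts slice and copies each pointed-to payload.
func (m Message) clone() Message {
	out := Message{Role: m.Role}
	if m.Parts != nil {
		out.Parts = make([]Part, len(m.Parts))
		for i, p := range m.Parts {
			out.Parts[i] = p
			if p.Image != nil {
				img := *p.Image // copy the value behind the pointer
				out.Parts[i].Image = &img
			}
		}
	}
	return out
}

func main() {
	orig := Message{Role: "user", Parts: []Part{{Image: &MediaRef{URL: "https://example.com/a.png"}}}}
	shallow := orig      // shares Parts and the MediaRef with orig
	deep := orig.clone() // owns its own Parts and MediaRef

	shallow.Parts[0].Image.URL = "changed"
	fmt.Println(orig.Parts[0].Image.URL) // changed
	fmt.Println(deep.Parts[0].Image.URL) // https://example.com/a.png
}
```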
func (Message) HasToolCalls ¶
HasToolCalls reports whether the message contains any tool calls.
func (Message) ToolResults ¶
func (m Message) ToolResults() []ToolResult
ToolResults extracts all tool-result parts.
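A rough sketch of how HasToolCalls and ToolResults could be implemented over the Parts slice. It assumes a part is identified by its non-nil payload pointer; the real methods may inspect Part.Type instead.

```go
package main

import "fmt"

// Reduced stand-ins for the documented types.
type ToolCall struct{ ID, Name, Arguments string }

type ToolResult struct {
	ToolCallID string
	Content    string
	IsError    bool
}

type Part struct {
	Text       string
	ToolCall   *ToolCall
	ToolResult *ToolResult
}

type Message struct{ Parts []Part }

// hasToolCalls is a hypothetical equivalent of Message.HasToolCalls.
func (m Message) hasToolCalls() bool {
	for _, p := range m.Parts {
		if p.ToolCall != nil {
			return true
		}
	}
	return false
}

// toolResults is a hypothetical equivalent of Message.ToolResults.
func (m Message) toolResults() []ToolResult {
	var out []ToolResult
	for _, p := range m.Parts {
		if p.ToolResult != nil {
			out = append(out, *p.ToolResult)
		}
	}
	return out
}

func main() {
	reply := Message{Parts: []Part{
		{Text: "Checking the weather."},
		{ToolCall: &ToolCall{ID: "call_1", Name: "get_forecast", Arguments: `{"city":"Paris"}`}},
	}}
	results := Message{Parts: []Part{
		{ToolResult: &ToolResult{ToolCallID: "call_1", Content: "sunny"}},
	}}
	fmt.Println(reply.hasToolCalls(), len(reply.toolResults()))     // true 0
	fmt.Println(results.hasToolCalls(), len(results.toolResults())) // false 1
}
```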
type Part ¶
type Part struct {
	Type       PartType    `json:"type"`
	Text       string      `json:"text,omitempty"`
	Image      *MediaRef   `json:"image,omitempty"`
	Audio      *MediaRef   `json:"audio,omitempty"`
	File       *FileRef    `json:"file,omitempty"`
	Data       *DataRef    `json:"data,omitempty"`
	ToolCall   *ToolCall   `json:"tool_call,omitempty"`
	ToolResult *ToolResult `json:"tool_result,omitempty"`
}
Part is a single content unit within a Message.
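Because every payload field is tagged omitempty, a Part encodes only its type and the one payload it carries. The sketch below trims the struct to three of the documented fields; the PartType values "text" and "image" are assumptions, since the package's constants are not listed in this section.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type PartType string

// Hypothetical PartType value, for illustration only.
const partTypeText PartType = "text"

type MediaRef struct {
	URL       string `json:"url,omitempty"`
	Base64    string `json:"base64,omitempty"`
	MediaType string `json:"media_type,omitempty"`
}

// Part is trimmed to three of the documented fields, with the same tags.
type Part struct {
	Type  PartType  `json:"type"`
	Text  string    `json:"text,omitempty"`
	Image *MediaRef `json:"image,omitempty"`
}

// encode marshals a Part and panics on failure to keep the sketch short.
func encode(p Part) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Part{Type: partTypeText, Text: "hello"}))
	// {"type":"text","text":"hello"}
	fmt.Println(encode(Part{Type: "image", Image: &MediaRef{URL: "https://example.com/cat.png"}}))
	// {"type":"image","image":{"url":"https://example.com/cat.png"}}
}
```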
func CloneParts ¶ added in v0.2.3
CloneParts returns a deep copy of parts.
type StreamChunk ¶
type StreamChunk struct {
	Role         Role       `json:"role,omitempty"`
	Content      string     `json:"content,omitempty"`
	ToolCalls    []ToolCall `json:"tool_calls,omitempty"`
	FinishReason string     `json:"finish_reason,omitempty"`
}
StreamChunk is an incremental piece of a streaming response.
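One plausible way to consume a stream is to fold the chunks into a final result. This sketch assumes that Content fragments concatenate in order, that each chunk carries complete tool calls, and that FinishReason is set on the final chunk; none of these are confirmed by this section, and the collect helper is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Reduced stand-ins for the documented types.
type ToolCall struct{ ID, Name, Arguments string }

type StreamChunk struct {
	Role         string
	Content      string
	ToolCalls    []ToolCall
	FinishReason string
}

// collect folds chunks into the full text, all tool calls seen, and the
// last non-empty finish reason.
func collect(chunks []StreamChunk) (string, []ToolCall, string) {
	var text strings.Builder
	var calls []ToolCall
	finish := ""
	for _, c := range chunks {
		text.WriteString(c.Content)
		calls = append(calls, c.ToolCalls...)
		if c.FinishReason != "" {
			finish = c.FinishReason
		}
	}
	return text.String(), calls, finish
}

func main() {
	chunks := []StreamChunk{
		{Role: "assistant", Content: "Hel"},
		{Content: "lo"},
		{FinishReason: "stop"},
	}
	text, calls, finish := collect(chunks)
	fmt.Println(text, len(calls), finish) // Hello 0 stop
}
```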
type TokenUsage ¶
type TokenUsage struct {
	InputTokens  int64 `json:"input_tokens"`
	OutputTokens int64 `json:"output_tokens"`
	TotalTokens  int64 `json:"total_tokens"`
	// CachedInputTokens is the subset of InputTokens that hit the
	// provider's prompt cache and are therefore billed at a reduced
	// rate (typically 10x cheaper). It is always <= InputTokens.
	//
	// Different providers expose this under different names but the
	// semantics are aligned: OpenAI / DeepSeek / Azure / Qwen report
	// `usage.prompt_tokens_details.cached_tokens`, Anthropic /
	// Minimax report `usage.cache_read_input_tokens`, ByteDance
	// Doubao reports `usage.prompt_tokens_details.cached_tokens`.
	// The Generate / stream paths normalise these into this single
	// field so callers can compute a cache hit-rate uniformly via
	// CachedInputTokens / InputTokens without provider-specific
	// branching.
	//
	// Zero means either the provider does not expose cache stats
	// or no cache hit occurred on this call. omitempty keeps the
	// wire format stable for providers that never set it.
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`
	// Model is the resolved LLM model name this usage came from
	// (e.g. "gpt-4o", "claude-3-7-sonnet-20250219"). Empty when the
	// reporter does not know which model produced the call (test
	// engines, aggregate reports). Hosts that bucket usage by model
	// for budgeting / quota enforcement consume this field.
	Model string `json:"model,omitempty"`
	// LatencyMs is the wall-clock latency of the producing call in
	// milliseconds. Zero when not measured. Used by sandbox hosts to
	// surface per-call timing on the same dimension as token counts
	// (avoids a parallel timing channel).
	LatencyMs int64 `json:"latency_ms,omitempty"`
	// CostMicros is the cost of the producing call in micro-units of
	// the host's configured currency (micro-USD = USD * 1_000_000).
	// Integer math is used so cumulative budgets do not drift.
	// Zero when no pricing catalog is configured. Hosts enforcing $
	// budgets accumulate this field.
	CostMicros int64 `json:"cost_micros,omitempty"`
}
TokenUsage tracks cumulative token consumption. Unlike Usage, it includes TotalTokens.
The Model / LatencyMs / CostMicros fields enrich the basic token counts so a sandbox host (typically the planned sdk/pod controller) can enforce dollar-denominated budgets and per-model rate limits without a separate sidecar accumulator. All three are optional: a reporter that only knows token counts leaves them zero / empty.
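As an illustration of how a host might read these fields, the sketch below computes the cache hit rate as CachedInputTokens / InputTokens and checks a budget in micro-units. The cacheHitRate and overBudget helpers and the sample numbers are hypothetical.

```go
package main

import "fmt"

// TokenUsage is trimmed to the fields this sketch reads.
type TokenUsage struct {
	InputTokens       int64
	CachedInputTokens int64
	CostMicros        int64
}

// cacheHitRate returns CachedInputTokens / InputTokens, or 0 when there
// were no input tokens (avoids dividing by zero).
func cacheHitRate(u TokenUsage) float64 {
	if u.InputTokens == 0 {
		return 0
	}
	return float64(u.CachedInputTokens) / float64(u.InputTokens)
}

// overBudget compares accumulated cost with a budget, both in micro-units,
// so the check stays in integer arithmetic.
func overBudget(u TokenUsage, budgetMicros int64) bool {
	return u.CostMicros > budgetMicros
}

func main() {
	u := TokenUsage{InputTokens: 1200, CachedInputTokens: 900, CostMicros: 4500}
	fmt.Printf("hit rate: %.2f\n", cacheHitRate(u)) // hit rate: 0.75
	fmt.Println(overBudget(u, 2_000_000))           // false
	// Convert to a float only for display, never for accumulation.
	fmt.Printf("cost: $%.4f\n", float64(u.CostMicros)/1e6) // cost: $0.0045
}
```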
func (TokenUsage) Add ¶
func (u TokenUsage) Add(other TokenUsage) TokenUsage
Add returns a new TokenUsage that is the sum of u and other.
Numeric fields (token counts, latency, cost) are summed. Model is preserved from u when both are non-empty and disagree (the accumulator's identity wins); when one side is empty the other fills it in. Adding a per-call delta into a running total therefore keeps the running total's model label intact even if the delta reports a different model — callers that need per-model breakdowns SHOULD bucket by Model before summing.
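The merge rule described above can be sketched as follows. The add method and sumByModel helper are hypothetical reimplementations written from the prose, not the package's code; sumByModel shows the suggested pattern of bucketing by Model before summing.

```go
package main

import "fmt"

// TokenUsage mirrors the documented fields (JSON tags omitted for brevity).
type TokenUsage struct {
	InputTokens       int64
	OutputTokens      int64
	TotalTokens       int64
	CachedInputTokens int64
	Model             string
	LatencyMs         int64
	CostMicros        int64
}

// add sums the numeric fields. The receiver's Model wins when set;
// otherwise other's Model fills it in.
func (u TokenUsage) add(other TokenUsage) TokenUsage {
	sum := TokenUsage{
		InputTokens:       u.InputTokens + other.InputTokens,
		OutputTokens:      u.OutputTokens + other.OutputTokens,
		TotalTokens:       u.TotalTokens + other.TotalTokens,
		CachedInputTokens: u.CachedInputTokens + other.CachedInputTokens,
		LatencyMs:         u.LatencyMs + other.LatencyMs,
		CostMicros:        u.CostMicros + other.CostMicros,
		Model:             u.Model,
	}
	if sum.Model == "" {
		sum.Model = other.Model
	}
	return sum
}

// sumByModel keeps one running total per model label.
func sumByModel(deltas []TokenUsage) map[string]TokenUsage {
	out := make(map[string]TokenUsage)
	for _, d := range deltas {
		out[d.Model] = out[d.Model].add(d)
	}
	return out
}

func main() {
	total := TokenUsage{Model: "gpt-4o", InputTokens: 100, OutputTokens: 20, TotalTokens: 120}
	delta := TokenUsage{Model: "claude-3-7-sonnet-20250219", InputTokens: 50, OutputTokens: 10, TotalTokens: 60}

	sum := total.add(delta)
	fmt.Println(sum.Model, sum.TotalTokens) // gpt-4o 180

	perModel := sumByModel([]TokenUsage{total, delta})
	fmt.Println(len(perModel), perModel["gpt-4o"].TotalTokens) // 2 120
}
```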
type ToolCall ¶
type ToolCall struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}
ToolCall represents a function call requested by the LLM.
type ToolDefinition ¶
type ToolDefinition struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"input_schema"`
}
ToolDefinition describes a tool for LLM function-calling.
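A sketch of what a definition might look like in practice. It assumes InputSchema holds a JSON Schema object, which is the usual convention for LLM function calling but is not stated in this section; the get_forecast tool and forecastTool helper are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolDefinition mirrors the documented struct.
type ToolDefinition struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"input_schema"`
}

// forecastTool builds a hypothetical tool definition whose input is an
// object with a required "city" string and an optional "days" integer.
func forecastTool() ToolDefinition {
	return ToolDefinition{
		Name:        "get_forecast",
		Description: "Return the weather forecast for a city.",
		InputSchema: map[string]any{
			"type": "object",
			"properties": map[string]any{
				"city": map[string]any{"type": "string"},
				"days": map[string]any{"type": "integer", "minimum": 1},
			},
			"required": []string{"city"},
		},
	}
}

func main() {
	b, err := json.MarshalIndent(forecastTool(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```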
type ToolResult ¶
type ToolResult struct {
	ToolCallID string `json:"tool_call_id"`
	Content    string `json:"content"`
	IsError    bool   `json:"is_error,omitempty"`
}
ToolResult carries the result of a tool execution back to the LLM.
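The round trip from ToolCall to ToolResult can be sketched as a small dispatch loop. The handler type, the runTools function, and the echo and fail tools are hypothetical; the sketch only shows how ToolCallID links a result to its call and how IsError might flag failures.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the documented types (JSON tags omitted for brevity).
type ToolCall struct{ ID, Name, Arguments string }

type ToolResult struct {
	ToolCallID string
	Content    string
	IsError    bool
}

// handler is a hypothetical tool signature: JSON arguments in, text out.
type handler func(arguments string) (string, error)

// runTools executes each call and returns one result per call, in order.
// Unknown tools and handler errors become results with IsError set.
func runTools(calls []ToolCall, handlers map[string]handler) []ToolResult {
	results := make([]ToolResult, 0, len(calls))
	for _, c := range calls {
		h, ok := handlers[c.Name]
		if !ok {
			results = append(results, ToolResult{ToolCallID: c.ID, Content: "unknown tool: " + c.Name, IsError: true})
			continue
		}
		out, err := h(c.Arguments)
		if err != nil {
			results = append(results, ToolResult{ToolCallID: c.ID, Content: err.Error(), IsError: true})
			continue
		}
		results = append(results, ToolResult{ToolCallID: c.ID, Content: out})
	}
	return results
}

func main() {
	handlers := map[string]handler{
		"echo": func(args string) (string, error) { return args, nil },
		"fail": func(string) (string, error) { return "", errors.New("boom") },
	}
	calls := []ToolCall{
		{ID: "call_1", Name: "echo", Arguments: `{"msg":"hi"}`},
		{ID: "call_2", Name: "fail"},
	}
	for _, r := range runTools(calls, handlers) {
		fmt.Println(r.ToolCallID, r.IsError, r.Content)
	}
	// call_1 false {"msg":"hi"}
	// call_2 true boom
}
```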
type Usage ¶
type Usage struct {
	InputTokens       int64 `json:"input_tokens"`
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`
	OutputTokens      int64 `json:"output_tokens"`
}
Usage represents raw token usage from a single LLM call (Provider layer).
CachedInputTokens carries the same semantics as TokenUsage.CachedInputTokens: it is the subset of InputTokens that hit the provider's prompt cache and is billed at a reduced rate (typically ~10x cheaper). Always <= InputTokens. Zero means either the provider does not expose cache stats or no cache hit occurred on this call.
omitempty keeps the wire format stable for providers (e.g. ollama) that never set it — pre-existing JSON consumers of Usage continue to observe the historical {input_tokens, output_tokens} shape.
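The effect of omitempty on the wire format can be checked directly with encoding/json. The struct below copies the documented tags; the encode helper is only there to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage mirrors the documented struct and its JSON tags.
type Usage struct {
	InputTokens       int64 `json:"input_tokens"`
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`
	OutputTokens      int64 `json:"output_tokens"`
}

// encode marshals a Usage and panics on failure to keep the sketch short.
func encode(u Usage) string {
	b, err := json.Marshal(u)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// No cache stats: the historical two-field shape is preserved.
	fmt.Println(encode(Usage{InputTokens: 10, OutputTokens: 5}))
	// {"input_tokens":10,"output_tokens":5}

	// With a cache hit, the extra field appears in declaration order.
	fmt.Println(encode(Usage{InputTokens: 10, CachedInputTokens: 4, OutputTokens: 5}))
	// {"input_tokens":10,"cached_input_tokens":4,"output_tokens":5}
}
```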