model

package
v0.3.17
Published: May 17, 2026 License: MIT Imports: 3 Imported by: 0

Documentation

Overview

Package model defines the shared data types used across the SDK: multi-modal messages, tool calling protocol, and token usage tracking.

Constants

This section is empty.

Variables

This section is empty.

Functions

func MarshalToolArgs

func MarshalToolArgs(args any) (string, error)

MarshalToolArgs marshals arguments to a JSON string suitable for ToolCall.Arguments.
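
As a rough illustration, the sketch below shows how an argument value might be turned into a ToolCall.Arguments string. The marshalToolArgs function and toolCall type are local stand-ins written for this example. It assumes the real function behaves like encoding/json.Marshal followed by a string conversion, which the documentation does not confirm.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the documented ToolCall struct (local stand-in).
type toolCall struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// marshalToolArgs is a hypothetical stand-in for model.MarshalToolArgs.
// Assumption: it encodes args with encoding/json and returns the JSON text.
func marshalToolArgs(args any) (string, error) {
	b, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	args, err := marshalToolArgs(map[string]any{"city": "Paris", "days": 3})
	if err != nil {
		panic(err)
	}
	// The tool ID and name are made up for the example.
	call := toolCall{ID: "call_1", Name: "get_weather", Arguments: args}
	fmt.Println(call.Arguments) // {"city":"Paris","days":3}
}
```

Keeping Arguments as a JSON string rather than a map lets the arguments pass through unchanged until the tool implementation decodes them.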

Types

type DataRef

type DataRef struct {
	MimeType string         `json:"mime_type,omitempty"`
	Value    map[string]any `json:"value"`
}

DataRef carries structured JSON-compatible data in a message part.

type FileRef

type FileRef struct {
	URI      string `json:"uri"`
	MimeType string `json:"mime_type,omitempty"`
	Name     string `json:"name,omitempty"`
}

FileRef references a generic file (URI + MIME), e.g. for A2A-style payloads.

type MediaRef

type MediaRef struct {
	URL       string `json:"url,omitempty"`
	Base64    string `json:"base64,omitempty"`
	MediaType string `json:"media_type,omitempty"`
}

MediaRef references an image or audio asset.

type Message

type Message struct {
	Role  Role   `json:"role"`
	Parts []Part `json:"parts"`
}

Message is a multi-modal, provider-agnostic chat message.

func CloneMessages added in v0.2.3

func CloneMessages(msgs []Message) []Message

CloneMessages returns a deep copy of msgs. Nil stays nil so callers can preserve the usual JSON / len semantics.
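
The nil-preservation rule matters mostly for JSON output and nil checks. The sketch below uses a simplified local message type and a hypothetical cloneMessages stand-in (not the package's implementation) to show how a nil slice and an empty slice encode differently after cloning.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a simplified stand-in: the real Message holds []Part.
type message struct {
	Role  string   `json:"role"`
	Parts []string `json:"parts"`
}

// cloneMessages is a hypothetical stand-in for model.CloneMessages.
func cloneMessages(msgs []message) []message {
	if msgs == nil {
		return nil // nil stays nil
	}
	out := make([]message, len(msgs))
	for i, m := range msgs {
		// Copy the Parts slice so the clone does not share backing storage.
		out[i] = message{Role: m.Role, Parts: append([]string(nil), m.Parts...)}
	}
	return out
}

func main() {
	var none []message
	a, _ := json.Marshal(cloneMessages(none))
	b, _ := json.Marshal(cloneMessages([]message{}))
	fmt.Println(string(a), string(b)) // null []
}
```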

func LastByRole added in v0.3.0

func LastByRole(msgs []Message, role Role) (Message, bool)

LastByRole returns the last message in msgs whose Role matches role. The boolean is false when no such message exists. The returned Message is the slice element itself (not a deep copy); callers that intend to mutate it should call Message.Clone first.

Typical use is for graph nodes that need to read a single role-scoped turn from a board channel — e.g. "the latest user message on MainChannel" — without re-implementing the reverse scan everywhere:

if m, ok := model.LastByRole(b.Channel(engine.MainChannel), model.RoleUser); ok {
    query = m.Content()
}
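
The aliasing caveat above can be seen in a small sketch. The types and the lastByRole function here are simplified local stand-ins, assuming a plain reverse scan. They are meant only to show why mutating the result without cloning reaches back into the source slice.

```go
package main

import "fmt"

type part struct{ Text string }

type message struct {
	Role  string
	Parts []part
}

// lastByRole is a hypothetical stand-in for model.LastByRole:
// it scans from the end and returns the first match.
func lastByRole(msgs []message, role string) (message, bool) {
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == role {
			return msgs[i], true
		}
	}
	return message{}, false
}

func main() {
	msgs := []message{
		{Role: "user", Parts: []part{{Text: "first"}}},
		{Role: "assistant", Parts: []part{{Text: "reply"}}},
		{Role: "user", Parts: []part{{Text: "second"}}},
	}
	m, ok := lastByRole(msgs, "user")
	fmt.Println(ok, m.Parts[0].Text) // true second

	// The returned struct shares its Parts backing array with msgs,
	// so writing through it changes the original element as well.
	m.Parts[0].Text = "edited"
	fmt.Println(msgs[2].Parts[0].Text) // edited
}
```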

func NewImageMessage

func NewImageMessage(role Role, text, imageURL string) Message

NewImageMessage creates a message with the given role that contains text and an image URL.

func NewTextMessage

func NewTextMessage(role Role, text string) Message

NewTextMessage creates a simple text message.

func NewToolCallMessage

func NewToolCallMessage(calls []ToolCall) Message

NewToolCallMessage creates an assistant message containing tool calls.

func NewToolResultMessage

func NewToolResultMessage(results []ToolResult) Message

NewToolResultMessage creates a tool-role message containing the given tool results.

func (Message) Clone added in v0.2.3

func (m Message) Clone() Message

Clone returns a deep copy of m. It duplicates the Parts slice and all pointer-backed payloads so callers can safely retain or mutate the result.
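
To make "pointer-backed payloads" concrete, here is a minimal sketch of a deep copy over a cut-down part type that carries only a ToolCall pointer. The clone method and the types are local stand-ins for illustration. The real Clone presumably treats the other pointer fields (Image, Audio, File, Data, ToolResult) the same way, but that is an assumption.

```go
package main

import "fmt"

type toolCall struct{ ID, Name, Arguments string }

// part is a reduced stand-in for Part with a single pointer payload.
type part struct {
	Text     string
	ToolCall *toolCall
}

type message struct {
	Role  string
	Parts []part
}

// clone is a hypothetical stand-in for Message.Clone.
func (m message) clone() message {
	out := message{Role: m.Role}
	if m.Parts == nil {
		return out
	}
	out.Parts = make([]part, len(m.Parts))
	for i, p := range m.Parts {
		out.Parts[i] = p
		if p.ToolCall != nil {
			tc := *p.ToolCall // copy the pointee, not just the pointer
			out.Parts[i].ToolCall = &tc
		}
	}
	return out
}

func main() {
	orig := message{Role: "assistant", Parts: []part{
		{ToolCall: &toolCall{ID: "1", Name: "search"}},
	}}
	cp := orig.clone()
	cp.Parts[0].ToolCall.Name = "lookup"
	fmt.Println(orig.Parts[0].ToolCall.Name, cp.Parts[0].ToolCall.Name) // search lookup
}
```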

func (Message) Content

func (m Message) Content() string

Content returns the concatenated text of all text parts.

func (Message) HasToolCalls

func (m Message) HasToolCalls() bool

HasToolCalls reports whether the message contains any tool calls.

func (Message) ToolCalls

func (m Message) ToolCalls() []ToolCall

ToolCalls extracts all tool-call parts.

func (Message) ToolResults

func (m Message) ToolResults() []ToolResult

ToolResults extracts all tool-result parts.
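
The accessors above can be pictured as simple filters over Parts. The following sketch uses local stand-in types and methods. It assumes that Content joins text parts with no separator and that ToolCalls returns the calls in part order, neither of which is spelled out in the documentation.

```go
package main

import (
	"fmt"
	"strings"
)

type partType string

const (
	partText     partType = "text"
	partToolCall partType = "tool_call"
)

type toolCall struct{ ID, Name, Arguments string }

type part struct {
	Type     partType
	Text     string
	ToolCall *toolCall
}

type message struct {
	Role  string
	Parts []part
}

// content joins the text of all text parts.
// Assumption: no separator is inserted between parts.
func (m message) content() string {
	var sb strings.Builder
	for _, p := range m.Parts {
		if p.Type == partText {
			sb.WriteString(p.Text)
		}
	}
	return sb.String()
}

// toolCalls collects every tool-call part in order.
func (m message) toolCalls() []toolCall {
	var out []toolCall
	for _, p := range m.Parts {
		if p.Type == partToolCall && p.ToolCall != nil {
			out = append(out, *p.ToolCall)
		}
	}
	return out
}

func main() {
	m := message{Role: "assistant", Parts: []part{
		{Type: partText, Text: "Checking "},
		{Type: partToolCall, ToolCall: &toolCall{ID: "1", Name: "get_weather", Arguments: `{"city":"Paris"}`}},
		{Type: partText, Text: "the forecast."},
	}}
	fmt.Println(m.content())        // Checking the forecast.
	fmt.Println(len(m.toolCalls())) // 1
}
```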

type Part

type Part struct {
	Type       PartType    `json:"type"`
	Text       string      `json:"text,omitempty"`
	Image      *MediaRef   `json:"image,omitempty"`
	Audio      *MediaRef   `json:"audio,omitempty"`
	File       *FileRef    `json:"file,omitempty"`
	Data       *DataRef    `json:"data,omitempty"`
	ToolCall   *ToolCall   `json:"tool_call,omitempty"`
	ToolResult *ToolResult `json:"tool_result,omitempty"`
}

Part is a single content unit within a Message.

func CloneParts added in v0.2.3

func CloneParts(parts []Part) []Part

CloneParts returns a deep copy of parts.

func (Part) Clone added in v0.2.3

func (p Part) Clone() Part

Clone returns a deep copy of p.

type PartType

type PartType string

PartType identifies the content type within a message Part.

const (
	PartText       PartType = "text"
	PartImage      PartType = "image"
	PartAudio      PartType = "audio"
	PartFile       PartType = "file"
	PartData       PartType = "data"
	PartToolCall   PartType = "tool_call"
	PartToolResult PartType = "tool_result"
)

type Role

type Role string

Role identifies who sent a message.

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

type StreamChunk

type StreamChunk struct {
	Role         Role       `json:"role,omitempty"`
	Content      string     `json:"content,omitempty"`
	ToolCalls    []ToolCall `json:"tool_calls,omitempty"`
	FinishReason string     `json:"finish_reason,omitempty"`
}

StreamChunk is an incremental piece of a streaming response.
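
A consumer typically folds chunks into a complete reply. The sketch below shows one way this might look, using a local stand-in type. It assumes that Content carries a text delta to append and that the final chunk sets FinishReason; the value "stop" is illustrative, not taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// streamChunk mirrors the text-related fields of StreamChunk (stand-in).
type streamChunk struct {
	Role         string
	Content      string
	FinishReason string
}

// collect appends chunk content until a chunk carries a finish reason.
// Assumption: Content is a delta, not the full text so far.
func collect(chunks []streamChunk) (text, finish string) {
	var sb strings.Builder
	for _, c := range chunks {
		sb.WriteString(c.Content)
		if c.FinishReason != "" {
			finish = c.FinishReason
			break
		}
	}
	return sb.String(), finish
}

func main() {
	text, finish := collect([]streamChunk{
		{Role: "assistant", Content: "Hel"},
		{Content: "lo"},
		{FinishReason: "stop"},
	})
	fmt.Println(text, finish) // Hello stop
}
```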

type TokenUsage

type TokenUsage struct {
	InputTokens  int64 `json:"input_tokens"`
	OutputTokens int64 `json:"output_tokens"`
	TotalTokens  int64 `json:"total_tokens"`

	// CachedInputTokens is the subset of InputTokens that hit the
	// provider's prompt cache and are therefore billed at a reduced
	// rate (typically ~10x cheaper). It is always <= InputTokens.
	//
	// Different providers expose this under different names but the
	// semantics are aligned: OpenAI / DeepSeek / Azure / Qwen /
	// ByteDance Doubao report
	// `usage.prompt_tokens_details.cached_tokens`, while Anthropic /
	// Minimax report `usage.cache_read_input_tokens`.
	// The Generate / stream paths normalise these into this single
	// field so callers can compute a cache hit-rate uniformly via
	// CachedInputTokens / InputTokens without provider-specific
	// branching.
	//
	// Zero means either the provider does not expose cache stats
	// or no cache hit occurred on this call. omitempty keeps the
	// wire format stable for providers that never set it.
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`

	// Model is the resolved LLM model name this usage came from
	// (e.g. "gpt-4o", "claude-3-7-sonnet-20250219"). Empty when the
	// reporter does not know which model produced the call (test
	// engines, aggregate reports). Hosts that bucket usage by model
	// for budgeting / quota enforcement consume this field.
	Model string `json:"model,omitempty"`

	// LatencyMs is the wall-clock latency of the producing call in
	// milliseconds. Zero when not measured. Used by sandbox hosts to
	// surface per-call timing on the same dimension as token counts
	// (avoids a parallel timing channel).
	LatencyMs int64 `json:"latency_ms,omitempty"`

	// CostMicros is the cost of the producing call in micro-units of
	// the host's configured currency (micro-USD = USD * 1_000_000).
	// Integer math is used so cumulative budgets do not drift.
	// Zero when no pricing catalog is configured. Hosts enforcing $
	// budgets accumulate this field.
	CostMicros int64 `json:"cost_micros,omitempty"`
}

TokenUsage tracks cumulative token consumption. Unlike Usage, it includes a TotalTokens field.

The Model / LatencyMs / CostMicros fields enrich the basic token counts so a sandbox host (typically the planned sdk/pod controller) can enforce dollar-denominated budgets and per-model rate limits without a separate sidecar accumulator. All three are optional: a reporter that only knows token counts leaves them zero / empty.
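
The two derived quantities mentioned in the field comments, cache hit-rate and dollar cost, are plain arithmetic. The sketch below works through both on a local stand-in struct. The helper names cacheHitRate and microsToUSD are invented for this example, and the USD conversion assumes the host's configured currency is USD.

```go
package main

import "fmt"

// tokenUsage mirrors the relevant TokenUsage fields (local stand-in).
type tokenUsage struct {
	InputTokens       int64
	OutputTokens      int64
	TotalTokens       int64
	CachedInputTokens int64
	CostMicros        int64
}

// cacheHitRate returns CachedInputTokens / InputTokens,
// or 0 when there were no input tokens.
func cacheHitRate(u tokenUsage) float64 {
	if u.InputTokens == 0 {
		return 0
	}
	return float64(u.CachedInputTokens) / float64(u.InputTokens)
}

// microsToUSD converts micro-USD to USD for display only;
// accumulation should stay in integer micros to avoid drift.
func microsToUSD(micros int64) float64 {
	return float64(micros) / 1_000_000
}

func main() {
	u := tokenUsage{
		InputTokens:       2000,
		OutputTokens:      500,
		TotalTokens:       2500,
		CachedInputTokens: 1500,
		CostMicros:        12500,
	}
	fmt.Printf("%.2f\n", cacheHitRate(u))           // 0.75
	fmt.Printf("%.4f\n", microsToUSD(u.CostMicros)) // 0.0125
}
```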

func (TokenUsage) Add

func (u TokenUsage) Add(other TokenUsage) TokenUsage

Add returns a new TokenUsage that is the sum of u and other.

Numeric fields (token counts, latency, cost) are summed. Model is preserved from u when both are non-empty and disagree (the accumulator's identity wins); when one side is empty the other fills it in. Adding a per-call delta into a running total therefore keeps the running total's model label intact even if the delta reports a different model — callers that need per-model breakdowns SHOULD bucket by Model before summing.
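
The Model rule is easiest to see with a running total. The sketch below is a hypothetical re-implementation on a local stand-in type that follows the description above (sum the numeric fields, keep the receiver's Model unless it is empty). It also shows the suggested alternative of bucketing by Model before summing; bucketByModel is a helper invented for this example.

```go
package main

import "fmt"

// tokenUsage mirrors the documented TokenUsage fields (local stand-in).
type tokenUsage struct {
	InputTokens       int64
	OutputTokens      int64
	TotalTokens       int64
	CachedInputTokens int64
	Model             string
	LatencyMs         int64
	CostMicros        int64
}

// add is a hypothetical stand-in for TokenUsage.Add.
func (u tokenUsage) add(other tokenUsage) tokenUsage {
	sum := tokenUsage{
		InputTokens:       u.InputTokens + other.InputTokens,
		OutputTokens:      u.OutputTokens + other.OutputTokens,
		TotalTokens:       u.TotalTokens + other.TotalTokens,
		CachedInputTokens: u.CachedInputTokens + other.CachedInputTokens,
		LatencyMs:         u.LatencyMs + other.LatencyMs,
		CostMicros:        u.CostMicros + other.CostMicros,
		Model:             u.Model, // the accumulator's identity wins
	}
	if sum.Model == "" {
		sum.Model = other.Model // an empty side is filled by the other
	}
	return sum
}

// bucketByModel groups per-call deltas by Model before summing.
func bucketByModel(deltas []tokenUsage) map[string]tokenUsage {
	out := make(map[string]tokenUsage)
	for _, d := range deltas {
		out[d.Model] = out[d.Model].add(d)
	}
	return out
}

func main() {
	deltas := []tokenUsage{
		{InputTokens: 100, OutputTokens: 20, TotalTokens: 120, Model: "gpt-4o"},
		{InputTokens: 50, OutputTokens: 10, TotalTokens: 60, Model: "claude-3-7-sonnet-20250219"},
	}
	var total tokenUsage
	for _, d := range deltas {
		total = total.add(d)
	}
	fmt.Println(total.Model, total.TotalTokens) // gpt-4o 180
	fmt.Println(len(bucketByModel(deltas)))     // 2
}
```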

type ToolCall

type ToolCall struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

ToolCall represents a function call requested by the LLM.

type ToolDefinition

type ToolDefinition struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	InputSchema map[string]any `json:"input_schema"`
}

ToolDefinition describes a tool for LLM function-calling.

type ToolResult

type ToolResult struct {
	ToolCallID string `json:"tool_call_id"`
	Content    string `json:"content"`
	IsError    bool   `json:"is_error,omitempty"`
}

ToolResult carries the result of a tool execution back to the LLM.
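
A tool round trip pairs each ToolCall with a ToolResult through the call ID. The sketch below is one possible shape for that step, using local stand-in types and a made-up "add" tool. The runTool dispatcher is hypothetical and is not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type toolCall struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

type toolResult struct {
	ToolCallID string `json:"tool_call_id"`
	Content    string `json:"content"`
	IsError    bool   `json:"is_error,omitempty"`
}

// runTool is a hypothetical dispatcher for a single made-up "add" tool.
// Failures are reported through IsError instead of a Go error so the
// model can see what went wrong.
func runTool(call toolCall) toolResult {
	res := toolResult{ToolCallID: call.ID}
	var args struct {
		A int `json:"a"`
		B int `json:"b"`
	}
	if err := json.Unmarshal([]byte(call.Arguments), &args); err != nil {
		res.Content, res.IsError = "invalid arguments: "+err.Error(), true
		return res
	}
	res.Content = fmt.Sprint(args.A + args.B)
	return res
}

func main() {
	ok := runTool(toolCall{ID: "c1", Name: "add", Arguments: `{"a":2,"b":3}`})
	fmt.Println(ok.ToolCallID, ok.Content, ok.IsError) // c1 5 false

	bad := runTool(toolCall{ID: "c2", Name: "add", Arguments: `{`})
	fmt.Println(bad.ToolCallID, bad.IsError) // c2 true
}
```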

type Usage

type Usage struct {
	InputTokens       int64 `json:"input_tokens"`
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`
	OutputTokens      int64 `json:"output_tokens"`
}

Usage represents raw token usage from a single LLM call (Provider layer).

CachedInputTokens carries the same semantics as TokenUsage.CachedInputTokens: it is the subset of InputTokens that hit the provider's prompt cache and are billed at a reduced rate (typically ~10x cheaper). It is always <= InputTokens. Zero means either the provider does not expose cache stats or no cache hit occurred on this call.

omitempty keeps the wire format stable for providers (e.g. ollama) that never set it — pre-existing JSON consumers of Usage continue to observe the historical {input_tokens, output_tokens} shape.
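
The effect of omitempty on the wire format can be checked directly with encoding/json. The struct below copies the documented field tags into a local stand-in type; encode is a small helper written for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage copies the documented Usage fields and tags (local stand-in).
type usage struct {
	InputTokens       int64 `json:"input_tokens"`
	CachedInputTokens int64 `json:"cached_input_tokens,omitempty"`
	OutputTokens      int64 `json:"output_tokens"`
}

// encode returns the JSON text for u.
func encode(u usage) string {
	b, err := json.Marshal(u)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// No cache stats: the historical two-field shape is preserved.
	fmt.Println(encode(usage{InputTokens: 10, OutputTokens: 4}))
	// {"input_tokens":10,"output_tokens":4}

	// With a cache hit the extra field appears.
	fmt.Println(encode(usage{InputTokens: 10, CachedInputTokens: 6, OutputTokens: 4}))
	// {"input_tokens":10,"cached_input_tokens":6,"output_tokens":4}
}
```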
