Documentation ¶
Overview ¶
Package chatserver implements an OpenAI-compatible HTTP server that exposes docker-agent agents through the /v1/chat/completions and /v1/models endpoints.
The goal is to let any tool that already speaks OpenAI's chat protocol (e.g. Open WebUI, custom shell scripts using the openai SDK) drive a docker-agent agent without needing to know about docker-agent's own protocol.
On types: we deliberately don't reuse the request/response structs from github.com/openai/openai-go/v3. The SDK is built around its internal `apijson` encoder; with stdlib `encoding/json` those types serialize every field and produce noisy responses. `apijson` lives under `internal/`, so we can't borrow it. `openai.Model` is the one type that round-trips cleanly with stdlib json, so we reuse it for /v1/models.
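For orientation, here is a minimal sketch of such a client using only the standard library. The listen address and model name are placeholders (use whatever address the server listens on and whichever agent /v1/models reports), and the response struct is trimmed to the one field it reads:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"model":    "assistant", // placeholder agent name
		"messages": []map[string]string{{"role": "user", "content": "hello"}},
	})
	resp, err := http.Post("http://localhost:8080/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Decode only the field we care about from the OpenAI-shaped response.
	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Choices[0].Message.Content)
}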
Index ¶
- func Run(ctx context.Context, agentFilename string, opts Options, ln net.Listener) error
- type ChatCompletionChoice
- type ChatCompletionMessage
- type ChatCompletionRequest
- type ChatCompletionResponse
- type ChatCompletionStreamChoice
- type ChatCompletionStreamDelta
- type ChatCompletionStreamResponse
- type ChatCompletionUsage
- type ContentImageURL
- type ContentPart
- type ErrorDetail
- type ErrorResponse
- type ModelsResponse
- type Options
- type StopSequences
- type ToolCallFunction
- type ToolCallReference
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Run ¶
func Run(ctx context.Context, agentFilename string, opts Options, ln net.Listener) error
Types ¶
type ChatCompletionChoice ¶
type ChatCompletionChoice struct {
Index int `json:"index"`
Message ChatCompletionMessage `json:"message"`
FinishReason string `json:"finish_reason"`
}
type ChatCompletionMessage ¶
type ChatCompletionMessage struct {
Role string `json:"role"`
// Content is the text content of the message. Populated whether the
// wire format used a string or a parts array (the parts' text values
// are concatenated).
Content string `json:"-"`
// Parts holds the original typed parts when the wire format used an
// array. Empty when the wire format was a plain string.
Parts []ContentPart `json:"-"`
Name string `json:"name,omitempty"`
ToolCallID string `json:"tool_call_id,omitempty"`
ToolCalls []ToolCallReference `json:"tool_calls,omitempty"`
}
ChatCompletionMessage is a single message in the conversation.
On the wire OpenAI accepts message content in two shapes: either a plain string (`"content": "hello"`) or an array of typed parts (`"content": [{"type":"text",...}, {"type":"image_url",...}]`). Both shapes are accepted on the request side; the response always uses the string form for text-only content and the parts form when images or other non-text content are present. The custom JSON (un)marshallers below preserve that union without forcing every Go caller to deal with it.
func (ChatCompletionMessage) MarshalJSON ¶
func (m ChatCompletionMessage) MarshalJSON() ([]byte, error)
MarshalJSON emits the parts array when present, otherwise a plain string. The role, name, tool_call_id, and tool_calls fields round-trip verbatim.
func (*ChatCompletionMessage) UnmarshalJSON ¶
func (m *ChatCompletionMessage) UnmarshalJSON(data []byte) error
UnmarshalJSON accepts either a string `content` field or an array of typed parts (OpenAI's multimodal shape).
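A small sketch of both wire shapes decoding into the same struct (it assumes the package is imported as chatserver, encoding/json is imported, and the image URL is a placeholder):

plain := []byte(`{"role":"user","content":"hello"}`)
parts := []byte(`{"role":"user","content":[
	{"type":"text","text":"what is in this image?"},
	{"type":"image_url","image_url":{"url":"https://example.com/cat.png"}}]}`)

var a, b chatserver.ChatCompletionMessage
_ = json.Unmarshal(plain, &a) // a.Content == "hello", a.Parts is empty
_ = json.Unmarshal(parts, &b) // b.Content == "what is in this image?", len(b.Parts) == 2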
type ChatCompletionRequest ¶
type ChatCompletionRequest struct {
Model string `json:"model"`
Messages []ChatCompletionMessage `json:"messages"`
Stream bool `json:"stream,omitempty"`
// Temperature is parsed and range-checked but not yet plumbed through
// to the runtime/model layer (no per-request override exists today).
// Set on the agent's YAML configuration to control sampling.
Temperature *float64 `json:"temperature,omitempty"`
// TopP is parsed and range-checked but not yet plumbed through.
TopP *float64 `json:"top_p,omitempty"`
// MaxTokens is the maximum number of tokens the model may generate in
// the response. Parsed and validated; runtime plumbing is tracked for
// a follow-up.
MaxTokens *int64 `json:"max_tokens,omitempty"`
// Stop is one or more substrings that, if produced, end generation.
// Accepted as either a single string or an array of strings, matching
// the OpenAI schema. Validated; not yet enforced.
Stop StopSequences `json:"stop,omitempty"`
}
ChatCompletionRequest is the body of a /v1/chat/completions call. We declare every field commonly sent by OpenAI clients so they are accepted without surprise. Whether each field is *acted on* is documented inline.
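A sketch of a request that fills the optional knobs (the values are illustrative; as noted above, temperature, top_p, max_tokens, and stop are validated but not yet forwarded to the runtime):

temp, topP, maxTok := 0.2, 0.9, int64(512)
req := chatserver.ChatCompletionRequest{
	Model: "assistant", // placeholder agent name
	Messages: []chatserver.ChatCompletionMessage{
		{Role: "system", Content: "Answer briefly."},
		{Role: "user", Content: "Summarise the project README."},
	},
	Stream:      true,
	Temperature: &temp,
	TopP:        &topP,
	MaxTokens:   &maxTok,
	Stop:        chatserver.StopSequences{"\n\n"},
}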
type ChatCompletionResponse ¶
type ChatCompletionResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []ChatCompletionChoice `json:"choices"`
Usage *ChatCompletionUsage `json:"usage,omitempty"`
}
type ChatCompletionStreamChoice ¶
type ChatCompletionStreamChoice struct {
Index int `json:"index"`
Delta ChatCompletionStreamDelta `json:"delta"`
FinishReason string `json:"finish_reason,omitempty"`
}
type ChatCompletionStreamDelta ¶
type ChatCompletionStreamDelta struct {
Role string `json:"role,omitempty"`
Content string `json:"content,omitempty"`
ToolCalls []ToolCallReference `json:"tool_calls,omitempty"`
}
type ChatCompletionStreamResponse ¶
type ChatCompletionStreamResponse struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
Model string `json:"model"`
Choices []ChatCompletionStreamChoice `json:"choices"`
}
ChatCompletionStreamResponse is one SSE chunk emitted when the client requests stream: true.
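A rough sketch of emitting one such chunk over SSE. The `data:` framing and the final `[DONE]` sentinel follow OpenAI's convention; the id and model are placeholders, and w is assumed to be the handler's http.ResponseWriter:

chunk := chatserver.ChatCompletionStreamResponse{
	ID:      "chatcmpl-123", // placeholder id
	Object:  "chat.completion.chunk",
	Created: time.Now().Unix(),
	Model:   "assistant",
	Choices: []chatserver.ChatCompletionStreamChoice{{
		Delta: chatserver.ChatCompletionStreamDelta{Content: "Hel"},
	}},
}
b, _ := json.Marshal(chunk)
fmt.Fprintf(w, "data: %s\n\n", b)
if f, ok := w.(http.Flusher); ok {
	f.Flush() // push the chunk to the client immediately
}
// ...further chunks... then the stream ends with:
fmt.Fprint(w, "data: [DONE]\n\n")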
type ChatCompletionUsage ¶
type ChatCompletionUsage struct {
PromptTokens int64 `json:"prompt_tokens"`
CompletionTokens int64 `json:"completion_tokens"`
TotalTokens int64 `json:"total_tokens"`
}
ChatCompletionUsage reports approximate token counts. Best-effort: when the underlying provider doesn't report usage we omit the field entirely.
type ContentImageURL ¶
ContentImageURL carries an image part. URL may be a regular http(s) URL or a data URL (`data:image/png;base64,...`).
type ContentPart ¶
type ContentPart struct {
Type string `json:"type"`
Text string `json:"text,omitempty"`
ImageURL *ContentImageURL `json:"image_url,omitempty"`
}
ContentPart mirrors one entry in OpenAI's typed-parts array. Today the server understands `text` and `image_url` parts; unknown types are preserved in the request payload but ignored when building the session, so future part types degrade gracefully.
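A sketch of building a multimodal user message out of typed parts, assuming (per the ContentImageURL doc above) that its URL field is named URL; the image address is a placeholder:

msg := chatserver.ChatCompletionMessage{
	Role: "user",
	Parts: []chatserver.ContentPart{
		{Type: "text", Text: "What is in this picture?"},
		{Type: "image_url", ImageURL: &chatserver.ContentImageURL{
			URL: "https://example.com/cat.png", // or a data:image/...;base64 URL
		}},
	},
}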
type ErrorDetail ¶
type ErrorResponse ¶
type ErrorResponse struct {
Error ErrorDetail `json:"error"`
}
ErrorResponse is the OpenAI-style error envelope returned on 4xx/5xx.
type ModelsResponse ¶
ModelsResponse is the body returned by /v1/models. Each agent in the team is exposed as one entry.
type Options ¶
type Options struct {
// AgentName pins the single agent to expose. Empty exposes every
// agent in the team and uses the team's default as the fallback.
AgentName string
// RunConfig is the runtime configuration used to load the team.
RunConfig *config.RuntimeConfig
// CORSOrigin is the allowed value for the Access-Control-Allow-Origin
// header. When empty, the CORS middleware is not registered at all
// (the server never emits any Access-Control-* response header).
//
// Multiple values can be provided separated by commas. Each entry is
// either a literal origin (matched exactly), the wildcard "*", or a
// pattern starting with "~" interpreted as a Go regular expression
// against the request's Origin header. Examples:
//
// "https://app.example.com"
// "https://app.example.com,https://staging.example.com"
// "~^https://[a-z0-9-]+\\.example\\.com$"
CORSOrigin string
// APIKey, if non-empty, is the static bearer token clients must
// present in the `Authorization` header (`Authorization: Bearer X`).
// Empty disables authentication; once set, every request to /v1/* is
// rejected with 401 unless it carries the matching token.
// /v1/models is also protected so an unauthenticated client can't
// fingerprint the server.
APIKey string
// MaxRequestBytes caps the size of an incoming request body. Zero
// means use the package default (1 MiB).
MaxRequestBytes int64
// RequestTimeout caps how long a single chat completion is allowed to
// run. Zero means use the package default (5 minutes). The cap covers
// model calls, tool calls, and SSE streaming combined.
RequestTimeout time.Duration
// ConversationsMaxSessions, when > 0, enables the X-Conversation-Id
// header: clients can pass a stable id to reuse the same session
// across requests instead of re-sending the full message history
// every turn. This is the size of the in-memory LRU cache.
ConversationsMaxSessions int
// ConversationTTL is how long a cached conversation may be idle
// before it's evicted. Zero means use the package default
// (30 minutes).
ConversationTTL time.Duration
// MaxIdleRuntimes bounds the number of idle runtimes pooled per
// agent. Building a runtime resolves tools and sets up channels;
// keeping a small pool of warm runtimes avoids paying that cost on
// every request. Zero disables pooling (a fresh runtime is built
// for every request, the original behaviour).
MaxIdleRuntimes int
}
Options configures the chat completions server. New capabilities (auth, conversations, runtime pooling, etc.) are added to this struct rather than to the Run signature so callers stay stable.
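A sketch of wiring Options into Run. The listener address, agent file name, and option values are placeholders, and RunConfig is elided because its construction depends on how the surrounding program loads its runtime configuration:

ctx := context.Background()
ln, err := net.Listen("tcp", "127.0.0.1:8080")
if err != nil {
	log.Fatal(err)
}
opts := chatserver.Options{
	AgentName:                "assistant",               // expose one agent only
	APIKey:                   os.Getenv("CHAT_API_KEY"), // empty disables auth
	CORSOrigin:               "https://app.example.com",
	RequestTimeout:           2 * time.Minute,
	ConversationsMaxSessions: 128, // enable X-Conversation-Id session reuse
	MaxIdleRuntimes:          4,
	// RunConfig: supply a *config.RuntimeConfig loaded elsewhere.
}
if err := chatserver.Run(ctx, "agent.yaml", opts, ln); err != nil {
	log.Fatal(err)
}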
type StopSequences ¶
type StopSequences []string
StopSequences is a JSON-flexible field that accepts either a single string or an array of strings. OpenAI's API uses both shapes interchangeably; clients in the wild send both.
func (*StopSequences) UnmarshalJSON ¶
func (s *StopSequences) UnmarshalJSON(data []byte) error
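A sketch of both accepted shapes; presumably the single-string form is normalised into a one-element slice:

var single, many chatserver.StopSequences
_ = json.Unmarshal([]byte(`"###"`), &single)       // single == StopSequences{"###"}
_ = json.Unmarshal([]byte(`["###","END"]`), &many) // many == StopSequences{"###", "END"}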
type ToolCallFunction ¶
type ToolCallFunction struct {
Name string `json:"name,omitempty"`
Arguments string `json:"arguments,omitempty"`
}
ToolCallFunction mirrors OpenAI's nested tool function descriptor.
type ToolCallReference ¶
type ToolCallReference struct {
// Index is the position of the tool call in the assistant message.
// In streaming mode multiple chunks targeting the same Index are
// concatenated by the client.
Index int `json:"index,omitempty"`
// ID matches what is later echoed back as ToolCallID on `tool` role
// messages — useful when correlating tool calls with their results.
ID string `json:"id,omitempty"`
// Type is always "function" today; OpenAI reserves the field for
// future expansion.
Type string `json:"type,omitempty"`
// Function carries the tool's name and JSON-encoded arguments.
Function ToolCallFunction `json:"function"`
}
ToolCallReference mirrors OpenAI's `tool_calls` entry. The server fills it in on the *response* side so clients can introspect what tools the agent invoked. Tools are still executed server-side; this is purely informational.
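A sketch of how the pieces correlate: the assistant message reports the call, and a follow-up `tool` role message echoes the same id via ToolCallID. Ids, tool names, and arguments below are illustrative:

assistant := chatserver.ChatCompletionMessage{
	Role: "assistant",
	ToolCalls: []chatserver.ToolCallReference{{
		ID:   "call_abc123",
		Type: "function",
		Function: chatserver.ToolCallFunction{
			Name:      "search",
			Arguments: `{"query":"docker-agent"}`,
		},
	}},
}
// When a client replays the conversation, the tool result references the call id.
toolResult := chatserver.ChatCompletionMessage{
	Role:       "tool",
	ToolCallID: "call_abc123",
	Content:    `{"hits":3}`,
}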