chatserver

package
v1.53.0
Published: Apr 28, 2026 License: Apache-2.0 Imports: 28 Imported by: 0

Documentation

Overview

Package chatserver implements an OpenAI-compatible HTTP server that exposes docker-agent agents through the /v1/chat/completions and /v1/models endpoints.

The goal is to let any tool that already speaks OpenAI's chat protocol (e.g. Open WebUI, custom shell scripts using the openai SDK) drive a docker-agent agent without needing to know about docker-agent's own protocol.

On types: we deliberately don't reuse the request/response structs from github.com/openai/openai-go/v3. The SDK is built around its internal `apijson` encoder; with stdlib `encoding/json` those types serialize every field and produce noisy responses. `apijson` lives under `internal/`, so we can't borrow it. `openai.Model` is the one type that round-trips cleanly with stdlib json, so we reuse it for /v1/models.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Run

func Run(ctx context.Context, agentFilename string, opts Options, ln net.Listener) error

Run starts an OpenAI-compatible HTTP server on the given listener and blocks until ctx is cancelled or the server fails. The team is loaded once from agentFilename and shared across requests; every chat completion request gets a fresh session.

Types

type ChatCompletionChoice

type ChatCompletionChoice struct {
	Index        int                   `json:"index"`
	Message      ChatCompletionMessage `json:"message"`
	FinishReason string                `json:"finish_reason"`
}

type ChatCompletionMessage

type ChatCompletionMessage struct {
	Role string `json:"role"`
	// Content is the text content of the message. Populated whether the
	// wire format used a string or a parts array (the parts' text values
	// are concatenated).
	Content string `json:"-"`
	// Parts holds the original typed parts when the wire format used an
	// array. Empty when the wire format was a plain string.
	Parts []ContentPart `json:"-"`

	Name       string              `json:"name,omitempty"`
	ToolCallID string              `json:"tool_call_id,omitempty"`
	ToolCalls  []ToolCallReference `json:"tool_calls,omitempty"`
}

ChatCompletionMessage is a single message in the conversation.

On the wire OpenAI accepts message content in two shapes: either a plain string (`"content": "hello"`) or an array of typed parts (`"content": [{"type":"text",...}, {"type":"image_url",...}]`). Both shapes are accepted on the request side; the response always uses the string form for text-only content and the parts form when images or other non-text content are present. The custom JSON (un)marshallers below preserve that union without forcing every Go caller to deal with it.

func (ChatCompletionMessage) MarshalJSON

func (m ChatCompletionMessage) MarshalJSON() ([]byte, error)

MarshalJSON emits the parts array when present, otherwise a plain string. The role, name, tool_call_id, and tool_calls fields round-trip verbatim.

func (*ChatCompletionMessage) UnmarshalJSON

func (m *ChatCompletionMessage) UnmarshalJSON(data []byte) error

UnmarshalJSON accepts either a string `content` field or an array of typed parts (OpenAI's multimodal shape).

type ChatCompletionRequest

type ChatCompletionRequest struct {
	Model    string                  `json:"model"`
	Messages []ChatCompletionMessage `json:"messages"`
	Stream   bool                    `json:"stream,omitempty"`

	// Temperature is parsed and range-checked but not yet plumbed through
	// to the runtime/model layer (no per-request override exists today).
	// Set on the agent's YAML configuration to control sampling.
	Temperature *float64 `json:"temperature,omitempty"`
	// TopP is parsed and range-checked but not yet plumbed through.
	TopP *float64 `json:"top_p,omitempty"`
	// MaxTokens is the maximum number of tokens the model may generate in
	// the response. Parsed and validated; runtime plumbing is tracked for
	// a follow-up.
	MaxTokens *int64 `json:"max_tokens,omitempty"`
	// Stop is one or more substrings that, if produced, end generation.
	// Accepted as either a single string or an array of strings, matching
	// the OpenAI schema. Validated; not yet enforced.
	Stop StopSequences `json:"stop,omitempty"`
}

ChatCompletionRequest is the body of a /v1/chat/completions call. We declare every field commonly sent by OpenAI clients so they are accepted without surprise. Whether each field is *acted on* is documented inline.

type ChatCompletionResponse

type ChatCompletionResponse struct {
	ID      string                 `json:"id"`
	Object  string                 `json:"object"`
	Created int64                  `json:"created"`
	Model   string                 `json:"model"`
	Choices []ChatCompletionChoice `json:"choices"`
	Usage   *ChatCompletionUsage   `json:"usage,omitempty"`
}

type ChatCompletionStreamChoice

type ChatCompletionStreamChoice struct {
	Index        int                       `json:"index"`
	Delta        ChatCompletionStreamDelta `json:"delta"`
	FinishReason string                    `json:"finish_reason,omitempty"`
}

type ChatCompletionStreamDelta

type ChatCompletionStreamDelta struct {
	Role      string              `json:"role,omitempty"`
	Content   string              `json:"content,omitempty"`
	ToolCalls []ToolCallReference `json:"tool_calls,omitempty"`
}

type ChatCompletionStreamResponse

type ChatCompletionStreamResponse struct {
	ID      string                       `json:"id"`
	Object  string                       `json:"object"`
	Created int64                        `json:"created"`
	Model   string                       `json:"model"`
	Choices []ChatCompletionStreamChoice `json:"choices"`
}

ChatCompletionStreamResponse is one SSE chunk emitted when the client requests stream: true.

type ChatCompletionUsage

type ChatCompletionUsage struct {
	PromptTokens     int64 `json:"prompt_tokens"`
	CompletionTokens int64 `json:"completion_tokens"`
	TotalTokens      int64 `json:"total_tokens"`
}

ChatCompletionUsage reports approximate token counts. Best-effort: when the underlying provider doesn't report usage we omit the field entirely.

type ContentImageURL

type ContentImageURL struct {
	URL    string `json:"url"`
	Detail string `json:"detail,omitempty"`
}

ContentImageURL carries an image part. URL may be a regular http(s) URL or a data URL (`data:image/png;base64,...`).

type ContentPart

type ContentPart struct {
	Type     string           `json:"type"`
	Text     string           `json:"text,omitempty"`
	ImageURL *ContentImageURL `json:"image_url,omitempty"`
}

ContentPart mirrors one entry in OpenAI's typed-parts array. Today the server understands `text` and `image_url` parts; unknown types are preserved in the request payload but ignored when building the session, so future part types degrade gracefully.

type ErrorDetail

type ErrorDetail struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Code    string `json:"code,omitempty"`
}

type ErrorResponse

type ErrorResponse struct {
	Error ErrorDetail `json:"error"`
}

ErrorResponse is the OpenAI-style error envelope returned on 4xx/5xx.

type ModelsResponse

type ModelsResponse struct {
	Object string         `json:"object"`
	Data   []openai.Model `json:"data"`
}

ModelsResponse is the body returned by /v1/models. Each agent in the team is exposed as one entry.

type Options

type Options struct {
	// AgentName pins the single agent to expose. Empty exposes every
	// agent in the team and uses the team's default as the fallback.
	AgentName string
	// RunConfig is the runtime configuration used to load the team.
	RunConfig *config.RuntimeConfig
	// CORSOrigin is the allowed value for the Access-Control-Allow-Origin
	// header. When empty, the CORS middleware is not registered at all
	// (the server never emits any Access-Control-* response header).
	//
	// Multiple values can be provided separated by commas. Each entry is
	// either a literal origin (matched exactly), the wildcard "*", or a
	// pattern starting with "~" interpreted as a Go regular expression
	// against the request's Origin header. Examples:
	//
	//	"https://app.example.com"
	//	"https://app.example.com,https://staging.example.com"
	//	"~^https://[a-z0-9-]+\\.example\\.com$"
	CORSOrigin string
	// APIKey, if non-empty, is the static bearer token clients must
	// present in the `Authorization` header (`Authorization: Bearer X`).
	// Empty disables authentication; once set, every request to /v1/* is
	// rejected with 401 unless it carries the matching token.
	// /v1/models is also protected so an unauthenticated client can't
	// fingerprint the server.
	APIKey string
	// MaxRequestBytes caps the size of an incoming request body. Zero
	// means use the package default (1 MiB).
	MaxRequestBytes int64
	// RequestTimeout caps how long a single chat completion is allowed to
	// run. Zero means use the package default (5 minutes). The cap covers
	// model calls, tool calls, and SSE streaming combined.
	RequestTimeout time.Duration
	// ConversationsMaxSessions, when > 0, enables the X-Conversation-Id
	// header: clients can pass a stable id to reuse the same session
	// across requests instead of re-sending the full message history
	// every turn. This is the size of the in-memory LRU cache.
	ConversationsMaxSessions int
	// ConversationTTL is how long a cached conversation may be idle
	// before it's evicted. Zero means use the package default
	// (30 minutes).
	ConversationTTL time.Duration
	// MaxIdleRuntimes bounds the number of idle runtimes pooled per
	// agent. Building a runtime resolves tools and sets up channels;
	// keeping a small pool of warm runtimes avoids paying that cost on
	// every request. Zero disables pooling (a fresh runtime is built
	// for every request, which was the original behaviour).
	MaxIdleRuntimes int
}

Options configures the chat completions server. New capabilities (auth, conversations, etc.) are added as fields on this struct rather than as new Run parameters, so existing callers stay stable.

type StopSequences

type StopSequences []string

StopSequences is a JSON-flexible field that accepts either a single string or an array of strings. OpenAI's API uses both shapes interchangeably; clients in the wild send both.

func (*StopSequences) UnmarshalJSON

func (s *StopSequences) UnmarshalJSON(data []byte) error

type ToolCallFunction

type ToolCallFunction struct {
	Name      string `json:"name,omitempty"`
	Arguments string `json:"arguments,omitempty"`
}

ToolCallFunction mirrors OpenAI's nested tool function descriptor.

type ToolCallReference

type ToolCallReference struct {
	// Index is the position of the tool call in the assistant message.
	// In streaming mode multiple chunks targeting the same Index are
	// concatenated by the client.
	Index int `json:"index,omitempty"`
	// ID matches what is later echoed back as ToolCallID on `tool` role
	// messages — useful when correlating tool calls with their results.
	ID string `json:"id,omitempty"`
	// Type is always "function" today; OpenAI reserves the field for
	// future expansion.
	Type string `json:"type,omitempty"`
	// Function carries the tool's name and JSON-encoded arguments.
	Function ToolCallFunction `json:"function"`
}

ToolCallReference mirrors OpenAI's `tool_calls` entry. The server fills it in on the *response* side so clients can introspect what tools the agent invoked. Tools are still executed server-side; this is purely informational.
