Documentation ¶
Overview ¶
Package local provides a client for local LLM models via a Python sidecar process.
The local provider communicates with a Python sidecar via JSON-RPC 2.0 over stdio. This enables llmkit to work with local model backends like Ollama, llama.cpp, vLLM, and HuggingFace transformers without requiring direct Go bindings.
Architecture ¶
The local client manages a long-running Python sidecar process:
Go Client <--JSON-RPC/stdio--> Python Sidecar <--Backend API--> Local Model
The sidecar process is started lazily on the first request and is kept running for subsequent requests. The sidecar handles all communication with the local model backend and can optionally connect to MCP servers.
Supported Backends ¶
- ollama: Ollama API server (default host: localhost:11434)
- llama.cpp: llama.cpp server (default host: localhost:8000)
- vllm: vLLM server (default host: localhost:8000)
- transformers: HuggingFace transformers (runs in-process in sidecar)
JSON-RPC Protocol ¶
Communication uses JSON-RPC 2.0 over stdio with newline-delimited messages:
Request (client -> sidecar):
{"jsonrpc": "2.0", "method": "complete", "params": {...}, "id": 1}
Response (sidecar -> client):
{"jsonrpc": "2.0", "result": {"content": "...", "usage": {...}}, "id": 1}
Streaming uses notifications (no ID, no response expected):
{"jsonrpc": "2.0", "method": "stream.chunk", "params": {"content": "...", "done": false}}
{"jsonrpc": "2.0", "method": "stream.done", "params": {"usage": {...}}}
Usage ¶
Using the provider registry:
import _ "github.com/randalmurphal/llmkit/local"
client, err := provider.New("local", provider.Config{
Model: "llama3.2:latest",
Options: map[string]any{
"backend": "ollama",
"sidecar_path": "/path/to/sidecar.py",
},
})
if err != nil {
log.Fatal(err)
}
defer client.Close()
Direct instantiation:
client := local.NewClient(
local.WithBackend(local.BackendOllama),
local.WithSidecarPath("/path/to/sidecar.py"),
local.WithModel("llama3.2:latest"),
)
defer client.Close()
resp, err := client.Complete(ctx, provider.Request{
Messages: []provider.Message{
{Role: provider.RoleUser, Content: "Hello!"},
},
})
Capabilities ¶
The local provider has the following capabilities:
- Streaming: true (via JSON-RPC notifications)
- Tools: false (local models don't have native tool support)
- MCP: true (the sidecar can connect to MCP servers)
- Sessions: false (no persistent sessions)
- Images: false (multimodal not currently supported)
- NativeTools: none
Sidecar Implementation ¶
The Python sidecar script must implement the following RPC methods:
- init: Initialize the backend connection
- complete: Perform a completion request
- shutdown: Clean shutdown
For streaming, the sidecar should send stream.chunk and stream.done notifications. See the protocol.go file for detailed message formats.
Index ¶
- Constants
- func IsNotification(data []byte) bool
- type Backend
- type Client
- func (c *Client) Capabilities() provider.Capabilities
- func (c *Client) Close() error
- func (c *Client) Complete(ctx context.Context, req provider.Request) (*provider.Response, error)
- func (c *Client) Provider() string
- func (c *Client) Stream(ctx context.Context, req provider.Request) (<-chan provider.StreamChunk, error)
- type CompleteParams
- type CompleteResult
- type Config
- type InitParams
- type InitResult
- type MCPServerConfig
- type MessageParam
- type Notification
- type Option
- func WithBackend(backend Backend) Option
- func WithEnv(env map[string]string) Option
- func WithHost(host string) Option
- func WithMCPServers(servers map[string]MCPServerConfig) Option
- func WithModel(model string) Option
- func WithPythonPath(path string) Option
- func WithRequestTimeout(d time.Duration) Option
- func WithSidecarPath(path string) Option
- func WithStartupTimeout(d time.Duration) Option
- func WithWorkDir(dir string) Option
- type Protocol
- type RPCError
- type Request
- type Response
- type ShutdownResult
- type Sidecar
- type StreamChunkParams
- type StreamDoneParams
- type UsageResult
Constants ¶
const (
	CodeParseError     = -32700
	CodeInvalidRequest = -32600
	CodeMethodNotFound = -32601
	CodeInvalidParams  = -32602
	CodeInternalError  = -32603
)
Standard JSON-RPC 2.0 error codes.
const (
	CodeBackendError    = -32000 // Backend API error
	CodeModelNotFound   = -32001 // Model not found/loaded
	CodeStreamError     = -32002 // Streaming error
	CodeConnectionError = -32003 // Backend connection failed
)
Application-specific error codes (range -32000 to -32099).
Variables ¶
This section is empty.
Functions ¶
func IsNotification ¶
func IsNotification(data []byte) bool
IsNotification reports whether a message is a notification (has no ID).
Types ¶
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client implements provider.Client for local LLM models via a Python sidecar.
func NewClient ¶
NewClient creates a new local model client. The sidecar process is not started until the first request.
func NewClientWithConfig ¶
NewClientWithConfig creates a new local model client from a Config.
func (*Client) Capabilities ¶
func (c *Client) Capabilities() provider.Capabilities
Capabilities implements provider.Client.
type CompleteParams ¶
type CompleteParams struct {
Messages []MessageParam `json:"messages"`
Model string `json:"model,omitempty"`
SystemPrompt string `json:"system_prompt,omitempty"`
MaxTokens int `json:"max_tokens,omitempty"`
Temperature float64 `json:"temperature,omitempty"`
Stream bool `json:"stream,omitempty"`
Options map[string]any `json:"options,omitempty"`
}
CompleteParams are the parameters for the "complete" RPC method.
type CompleteResult ¶
type CompleteResult struct {
Content string `json:"content"`
Model string `json:"model,omitempty"`
FinishReason string `json:"finish_reason,omitempty"`
Usage UsageResult `json:"usage"`
}
CompleteResult is the result of a "complete" RPC call.
type Config ¶
type Config struct {
// Backend specifies which local model backend to use.
// Required. One of: "ollama", "llama.cpp", "vllm", "transformers"
Backend Backend `json:"backend" yaml:"backend"`
// SidecarPath is the path to the Python sidecar script.
// Required.
SidecarPath string `json:"sidecar_path" yaml:"sidecar_path"`
// Model is the local model name to use.
// Format depends on backend (e.g., "llama3.2:latest" for Ollama).
Model string `json:"model" yaml:"model"`
// Host is the API server address for applicable backends.
// Default: "localhost:11434" for Ollama, "localhost:8000" for vLLM.
Host string `json:"host" yaml:"host"`
// PythonPath is the path to the Python interpreter.
// Default: "python3"
PythonPath string `json:"python_path" yaml:"python_path"`
// StartupTimeout is how long to wait for sidecar to become ready.
// Default: 30 seconds.
StartupTimeout time.Duration `json:"startup_timeout" yaml:"startup_timeout"`
// RequestTimeout is the default timeout for completion requests.
// Default: 5 minutes.
RequestTimeout time.Duration `json:"request_timeout" yaml:"request_timeout"`
// WorkDir is the working directory for the sidecar process.
WorkDir string `json:"work_dir" yaml:"work_dir"`
// Env provides additional environment variables for the sidecar.
Env map[string]string `json:"env" yaml:"env"`
// MCPServers configures MCP servers to pass through to the sidecar.
// The sidecar is responsible for connecting to MCP servers.
MCPServers map[string]MCPServerConfig `json:"mcp_servers" yaml:"mcp_servers"`
}
Config holds local provider configuration.
func DefaultConfig ¶
func DefaultConfig() Config
DefaultConfig returns a Config with sensible defaults.
func (Config) WithDefaults ¶
WithDefaults returns a copy of the config with defaults applied for unset fields.
type InitParams ¶
type InitParams struct {
Backend string `json:"backend"`
Model string `json:"model"`
Host string `json:"host,omitempty"`
MCPServers map[string]MCPServerConfig `json:"mcp_servers,omitempty"`
Options map[string]any `json:"options,omitempty"`
}
InitParams are the parameters for the "init" RPC method.
type InitResult ¶
type InitResult struct {
Ready bool `json:"ready"`
Version string `json:"version,omitempty"`
Message string `json:"message,omitempty"`
}
InitResult is the result of an "init" RPC call.
type MCPServerConfig ¶
type MCPServerConfig struct {
Type string `json:"type"` // "stdio", "http", "sse"
Command string `json:"command,omitempty"` // For stdio transport
Args []string `json:"args,omitempty"`
Env map[string]string `json:"env,omitempty"`
URL string `json:"url,omitempty"` // For http/sse transport
Headers []string `json:"headers,omitempty"`
}
MCPServerConfig defines an MCP server to enable.
type MessageParam ¶
type MessageParam struct {
Role string `json:"role"`
Content string `json:"content"`
Name string `json:"name,omitempty"` // For tool results
}
MessageParam is a message in the conversation.
type Notification ¶
type Notification struct {
JSONRPC string `json:"jsonrpc"`
Method string `json:"method"`
Params any `json:"params,omitempty"`
}
Notification is a JSON-RPC 2.0 notification (no ID, no response expected).
func ParseNotification ¶
func ParseNotification(data []byte) (*Notification, error)
ParseNotification attempts to parse a message as a notification. Returns nil, nil if the message is not a notification.
type Option ¶
type Option func(*Client)
Option configures a local Client.
func WithBackend ¶
func WithBackend(backend Backend) Option
WithBackend sets the local model backend.
func WithEnv ¶
func WithEnv(env map[string]string) Option
WithEnv sets additional environment variables for the sidecar process.
func WithHost ¶
func WithHost(host string) Option
WithHost sets the backend API server address.
func WithModel ¶
func WithModel(model string) Option
WithModel sets the model name to use.
func WithMCPServers ¶
func WithMCPServers(servers map[string]MCPServerConfig) Option
WithMCPServers configures MCP servers for the sidecar.
func WithPythonPath ¶
func WithPythonPath(path string) Option
WithPythonPath sets the Python interpreter path.
func WithRequestTimeout ¶
func WithRequestTimeout(d time.Duration) Option
WithRequestTimeout sets the default request timeout.
func WithSidecarPath ¶
func WithSidecarPath(path string) Option
WithSidecarPath sets the path to the sidecar script.
func WithStartupTimeout ¶
func WithStartupTimeout(d time.Duration) Option
WithStartupTimeout sets the sidecar startup timeout.
func WithWorkDir ¶
func WithWorkDir(dir string) Option
WithWorkDir sets the working directory for the sidecar.
type Protocol ¶
type Protocol struct {
// contains filtered or unexported fields
}
Protocol handles JSON-RPC encoding/decoding over stdio.
func NewProtocol ¶
NewProtocol creates a new JSON-RPC protocol handler.
func (*Protocol) Call ¶
Call sends a request and waits for a response. The result is unmarshaled into the provided value. This method is safe for concurrent use, but callers waiting for responses will be serialized.
func (*Protocol) ReadMessage ¶
ReadMessage reads a single message (response or notification). Returns the raw JSON for further processing. This method is safe for concurrent use.
type RPCError ¶
type RPCError struct {
Code int `json:"code"`
Message string `json:"message"`
Data json.RawMessage `json:"data,omitempty"`
}
RPCError is a JSON-RPC 2.0 error object.
type Request ¶
type Request struct {
JSONRPC string `json:"jsonrpc"`
Method string `json:"method"`
Params any `json:"params,omitempty"`
ID int64 `json:"id"`
}
Request is a JSON-RPC 2.0 request.
type Response ¶
type Response struct {
JSONRPC string `json:"jsonrpc"`
Result json.RawMessage `json:"result,omitempty"`
Error *RPCError `json:"error,omitempty"`
ID int64 `json:"id"`
}
Response is a JSON-RPC 2.0 response.
type ShutdownResult ¶
type ShutdownResult struct {
Success bool `json:"success"`
Message string `json:"message,omitempty"`
}
ShutdownResult is the result of a "shutdown" RPC call.
type Sidecar ¶
type Sidecar struct {
// contains filtered or unexported fields
}
Sidecar manages the Python sidecar process lifecycle.
func (*Sidecar) Done ¶
func (s *Sidecar) Done() <-chan struct{}
Done returns a channel that's closed when the sidecar process exits.
func (*Sidecar) Protocol ¶
Protocol returns the JSON-RPC protocol handler. Returns nil if the sidecar is not running.
type StreamChunkParams ¶
StreamChunkParams are the parameters for stream.chunk notifications.
func ParseStreamChunk ¶
func ParseStreamChunk(data json.RawMessage) (*StreamChunkParams, error)
ParseStreamChunk parses a stream.chunk notification payload.
type StreamDoneParams ¶
type StreamDoneParams struct {
Usage UsageResult `json:"usage"`
FinishReason string `json:"finish_reason,omitempty"`
Model string `json:"model,omitempty"`
}
StreamDoneParams are the parameters for stream.done notifications.
func ParseStreamDone ¶
func ParseStreamDone(data json.RawMessage) (*StreamDoneParams, error)
ParseStreamDone parses a stream.done notification payload.
type UsageResult ¶
type UsageResult struct {
InputTokens int `json:"input_tokens"`
OutputTokens int `json:"output_tokens"`
TotalTokens int `json:"total_tokens"`
}
UsageResult tracks token usage.