eyrie

package module
v0.2.0
Published: May 16, 2026 · License: MIT · Imports: 13 · Imported by: 0

README

Eyrie

What is eyrie?

eyrie is the provider runtime that hawk sits on top of. It handles everything between hawk and the LLM APIs — authentication, model resolution, streaming, retries, rate limiting, and caching — so hawk can focus on being a great coding agent.

When hawk calls a model, eyrie figures out which provider to use, how to talk to it, and how to stream the response back. When hawk switches from Anthropic to Ollama, eyrie handles the translation. When an API returns a 529, eyrie retries with backoff. When a response hits max_tokens, eyrie continues automatically.

hawk never talks to an LLM API directly. eyrie does.

What eyrie handles for hawk

  • Provider routing: detects the active provider from env vars, config file, or explicit key
  • Model resolution: maps abstract tiers (opus/sonnet/haiku) to concrete model IDs per provider
  • Streaming: parses SSE for Anthropic and OpenAI formats (text, tool calls, thinking blocks)
  • Reliability: retries on 429/500/529 with exponential backoff and Retry-After support
  • Long outputs: auto-continues when stop_reason == max_tokens
  • Cost control: Anthropic prompt caching breakpoints on system prompt and conversation prefix
  • Rate limiting: token bucket per provider, prevents hitting API limits
  • Config: reads/writes ~/.hawk/provider.json, applies to env vars
  • Model catalog: embedded pricing + context windows for all providers, live-fetched from OpenRouter
  • Testing: mock provider, so hawk's tests never need real API keys

Supported providers

  • Anthropic: set ANTHROPIC_API_KEY (default for hawk; supports thinking, caching)
  • OpenAI: set OPENAI_API_KEY (full tool use + reasoning effort)
  • OpenRouter: set OPENROUTER_API_KEY (200+ models via one key)
  • Grok (xAI): set XAI_API_KEY
  • Gemini: set GEMINI_API_KEY
  • CanopyWave: set CANOPYWAVE_API_KEY
  • Ollama: set OLLAMA_BASE_URL (local models, no key needed)
  • OpenCodeGo: set OPENCODEGO_API_KEY

eyrie detects which provider to use automatically — in the order above.

How hawk uses eyrie

// hawk creates a client once at startup
c := client.NewEyrieClient(&client.EyrieConfig{
    Provider: client.DetectProvider(), // reads from env / config file
})

// hawk streams a response
sr, err := c.StreamChat(ctx, conversation, client.ChatOptions{
    Model: catalog.GetProviderDefaultModel(provider, &cat),
})
if err != nil {
    log.Fatal(err)
}
defer sr.Close()

for evt := range sr.Events {
    switch evt.Type {
    case "content":   // stream text to terminal
    case "tool_call": // execute tool, append result
    case "thinking":  // show thinking indicator
    case "done":      // response complete
    }
}

// When a response hits max_tokens, eyrie continues automatically
resp, err := client.ChatWithContinuation(ctx, provider, messages,
    client.ChatOptions{Model: model},
    client.DefaultContinuationConfig(),
)

Provider config file

hawk stores provider config at ~/.hawk/provider.json. eyrie owns this file.

cfg := config.LoadProviderConfig("")        // load
config.ApplyProviderConfigToEnv(cfg, false, nil) // apply to env
config.SaveProviderConfig(cfg, "")          // save

Model catalog

eyrie ships with an embedded catalog of every supported model — pricing, context windows, max output. hawk uses this for cost tracking and model selection.

cat := catalog.DefaultModelCatalog()

// Get the best model for a tier
model := catalog.GetPreferredProviderModel("anthropic", catalog.TierSonnet, &cat)
// → "claude-sonnet-4-6"

// Check if a model is deprecated
warn := catalog.GetModelDeprecationWarning("claude-3-7-sonnet", "anthropic")
// → "⚠ Claude 3.7 Sonnet will be retired on February 19, 2026..."

Testing hawk without API keys

mock := client.NewMockProvider(client.MockModeFixed)
mock.Response = "Here is the code you asked for..."

// Inject into hawk's test suite — no real API calls
resp, _ := mock.Chat(ctx, messages, opts)

Install

go get github.com/GrayCodeAI/eyrie

Requires Go 1.26+. Zero external dependencies.

License

MIT © 2026 GrayCode AI

Documentation

Overview

Package eyrie is the core LLM client library for hawk.

It provides API provider configurations, model resolution, API limits, base types (messages, IDs, connectors), and error types.

Sub-packages:

  • types: Message types, content blocks, usage, IDs, connectors
  • errors: Error message constants and utilities
  • constants: API limits (image, PDF, media)
  • catalog: Model catalog, tiers, names, deprecation, provider data
  • config: Provider configuration, profiles, env, OpenAI-compatible runtime
  • client: EyrieClient, factory, provider detection
  • utils: Error utilities (SSL detection, API error sanitization)

Package eyrie observability provides OpenTelemetry-compatible structured tracing and metrics collection for all LLM provider calls.

Design follows OpenTelemetry Go SDK patterns (TraceID/SpanID, span lifecycle, metric instruments) while remaining zero-dependency (Go stdlib only).

Usage is opt-in: a nil *Telemetry adds zero overhead.
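The nil-receiver opt-in pattern is easy to see in a standalone sketch. The tracer type below is illustrative, not eyrie's Telemetry: the point is that every method guards against a nil receiver, so disabled telemetry costs callers one pointer comparison and no nil checks.

```go
package main

import "fmt"

// tracer demonstrates the nil-receiver pattern: methods on a nil
// pointer simply return, so callers never need to nil-check.
type tracer struct{ spans []string }

func (t *tracer) StartSpan(name string) {
	if t == nil {
		return // telemetry disabled: no-op
	}
	t.spans = append(t.spans, name)
}

func main() {
	var off *tracer           // nil means telemetry is off
	off.StartSpan("llm.chat") // safe: does nothing

	on := &tracer{}
	on.StartSpan("llm.chat")
	fmt.Println(len(on.spans)) // 1
}
```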

Index

Constants

const (
	SpanLLMChat     = "llm.chat"
	SpanLLMStream   = "llm.stream"
	SpanLLMRetry    = "llm.retry"
	SpanLLMCacheHit = "llm.cache_hit"
)
const (
	AttrLLMProvider     = "llm.provider"
	AttrLLMModel        = "llm.model"
	AttrLLMInputTokens  = "llm.input_tokens"
	AttrLLMOutputTokens = "llm.output_tokens"
	AttrLLMCostUSD      = "llm.cost_usd"
	AttrLLMLatencyMs    = "llm.latency_ms"
	AttrLLMStatus       = "llm.status"
)
const (
	// DefaultWarmInterval is the default interval between cache warming pings.
	// Anthropic's prompt cache TTL is 5 minutes; we warm at 4 minutes to stay ahead.
	DefaultWarmInterval = 4 * time.Minute
)

Default cache warmer settings.

Variables

var Version = strings.TrimSpace(versionFile)

Version of the eyrie library. Sourced from the VERSION file at the repo root — do not edit this variable directly. Bump VERSION instead, or let release-please/goreleaser do it.

Functions

This section is empty.

Types

type CacheStats added in v0.2.0

type CacheStats struct {
	WarmingRequests     int64
	CacheHits           int64
	CacheMisses         int64
	EstimatedSavingsUSD float64
	LastWarmedAt        time.Time
}

CacheStats tracks cache warming statistics.

type CacheWarmer added in v0.2.0

type CacheWarmer struct {
	Provider     string
	Model        string
	SystemPrompt string
	Interval     time.Duration
	Enabled      bool

	Stats CacheStats

	// ChatFn is the function used to send warming requests.
	// It should send a chat completion request to the provider.
	ChatFn func(ctx context.Context, messages []Message, opts ChatOptions) error
	// contains filtered or unexported fields
}

CacheWarmer keeps Anthropic's prompt cache warm by periodically sending minimal requests with the system prompt, ensuring subsequent real requests get cache hits at a 90% discount.

func NewCacheWarmer added in v0.2.0

func NewCacheWarmer(chatFn func(ctx context.Context, messages []Message, opts ChatOptions) error, systemPrompt, provider, model string) *CacheWarmer

NewCacheWarmer creates a new CacheWarmer configured for the given provider. The chatFn is called to send warming pings; it should make a real API call.

func (*CacheWarmer) CacheBreakpoints added in v0.2.0

func (cw *CacheWarmer) CacheBreakpoints(systemPrompt string, conversationPrefix []Message) []int

CacheBreakpoints suggests where to place cache breakpoints in a message list. Anthropic allows up to 4 breakpoints. The strategy is:

  • After system prompt (index 0 in returned slice signals "system prompt")
  • After the first user message
  • After large context blocks (messages with content > 200 chars)

Returns indices into the messages array where breakpoints should be placed. The special index -1 indicates the system prompt should be cached.

func (*CacheWarmer) EstimateSavings added in v0.2.0

func (cw *CacheWarmer) EstimateSavings(inputTokens int, requestCount int) float64

EstimateSavings calculates the cost savings from prompt caching in USD.

Without cache: inputTokens * requestCount * price_per_token
With cache:    inputTokens * price_per_token * 1.25 (first request, cache write)
             + inputTokens * (requestCount-1) * price_per_token * 0.1 (cached reads)

Returns the difference (savings amount).

func (*CacheWarmer) ShouldWarm added in v0.2.0

func (cw *CacheWarmer) ShouldWarm() bool

ShouldWarm returns true if the cache has likely expired (more than 4 minutes since the last warming request) and should be refreshed.

func (*CacheWarmer) Start added in v0.2.0

func (cw *CacheWarmer) Start(ctx context.Context) error

Start begins the background cache warming loop. It sends a warming ping immediately, then repeats every Interval until Stop is called or the context is cancelled. For non-Anthropic providers this is a no-op.

func (*CacheWarmer) Stop added in v0.2.0

func (cw *CacheWarmer) Stop()

Stop stops the cache warmer background loop.

func (*CacheWarmer) Warm added in v0.2.0

func (cw *CacheWarmer) Warm(ctx context.Context) error

Warm sends a single warming request immediately. This keeps the system prompt cached on Anthropic's side. Returns an error if the request fails. For non-Anthropic providers this is a no-op.

type ChatOptions added in v0.2.0

type ChatOptions struct {
	Model         string
	MaxTokens     int
	System        string
	EnableCaching bool
}

ChatOptions is a minimal options struct for cache warmer use.

type HealthCheckConfig added in v0.2.0

type HealthCheckConfig struct {
	// Interval between periodic health checks. Default: 30s.
	Interval time.Duration
	// Timeout for each individual ping. Default: 5s.
	Timeout time.Duration
	// DegradedThreshold: latency above this marks provider as Degraded. Default: 2s.
	DegradedThreshold time.Duration
	// UnhealthyAfter: number of consecutive failures before marking Unhealthy. Default: 3.
	UnhealthyAfter int
	// DegradedAfter: number of consecutive failures before marking Degraded. Default: 1.
	DegradedAfter int
}

HealthCheckConfig configures the HealthChecker behavior.

func DefaultHealthCheckConfig added in v0.2.0

func DefaultHealthCheckConfig() HealthCheckConfig

DefaultHealthCheckConfig returns sensible default configuration.

type HealthChecker added in v0.2.0

type HealthChecker struct {
	// contains filtered or unexported fields
}

HealthChecker periodically pings LLM providers to determine their health. It is safe for concurrent use. A nil *HealthChecker is safe (all methods are no-ops).

func NewHealthChecker added in v0.2.0

func NewHealthChecker(cfg HealthCheckConfig) *HealthChecker

NewHealthChecker creates a new HealthChecker with the given configuration. Pass a zero-value config to use defaults.

func (*HealthChecker) AllProviderHealth added in v0.2.0

func (hc *HealthChecker) AllProviderHealth() map[string]HealthStatus

AllProviderHealth returns the current health status of all registered providers.

func (*HealthChecker) Check added in v0.2.0

func (hc *HealthChecker) Check(provider string) HealthStatus

Check performs an immediate health check on a single provider and returns the resulting HealthStatus. Returns an Unhealthy status if the provider is not registered.

func (*HealthChecker) Register added in v0.2.0

func (hc *HealthChecker) Register(provider ProviderPinger)

Register adds a provider to be health-checked. Can be called at any time.

func (*HealthChecker) Start added in v0.2.0

func (hc *HealthChecker) Start()

Start begins periodic background health checks. Call Stop() to halt. No-op if already started or if hc is nil.

func (*HealthChecker) Stop added in v0.2.0

func (hc *HealthChecker) Stop()

Stop halts periodic health checking. Blocks until the background loop exits. No-op if not started or if hc is nil.

func (*HealthChecker) Unregister added in v0.2.0

func (hc *HealthChecker) Unregister(name string)

Unregister removes a provider from health checking.

type HealthState added in v0.2.0

type HealthState int

HealthState represents the health condition of a provider.

const (
	// Healthy indicates the provider is responding normally.
	Healthy HealthState = iota
	// Degraded indicates the provider is responding but with elevated latency or intermittent errors.
	Degraded
	// Unhealthy indicates the provider is not responding or consistently failing.
	Unhealthy
)

func (HealthState) String added in v0.2.0

func (h HealthState) String() string

String returns a human-readable representation of HealthState.

type HealthStatus added in v0.2.0

type HealthStatus struct {
	State       HealthState   `json:"state"`
	Latency     time.Duration `json:"latency"`
	LastChecked time.Time     `json:"last_checked"`
	Error       string        `json:"error,omitempty"`
	Message     string        `json:"message,omitempty"`
}

HealthStatus holds the current health status for a provider, including measured latency and the time of the last health check.

func (HealthStatus) IsHealthy added in v0.2.0

func (hs HealthStatus) IsHealthy() bool

IsHealthy returns true if the provider state is Healthy.

type Message added in v0.2.0

type Message struct {
	Role    string
	Content string
}

Message is a minimal message struct for cache warmer use.

type MetricsCollector added in v0.2.0

type MetricsCollector struct {
	// contains filtered or unexported fields
}

MetricsCollector aggregates metrics for LLM operations. It uses atomic operations and mutexes for thread safety with minimal contention.

func NewMetricsCollector added in v0.2.0

func NewMetricsCollector() *MetricsCollector

NewMetricsCollector creates a new MetricsCollector.

func (*MetricsCollector) CacheHitRate added in v0.2.0

func (mc *MetricsCollector) CacheHitRate() float64

CacheHitRate returns the ratio of cache hits to total cache-eligible requests.

func (*MetricsCollector) CostAccumulator added in v0.2.0

func (mc *MetricsCollector) CostAccumulator(key string) float64

CostAccumulator returns the total cost in USD for a provider/model key.

func (*MetricsCollector) ErrorRate added in v0.2.0

func (mc *MetricsCollector) ErrorRate(provider string) float64

ErrorRate returns the error rate for a provider (errors / total requests involving that provider).

func (*MetricsCollector) ExportJSON added in v0.2.0

func (mc *MetricsCollector) ExportJSON() string

ExportJSON dumps all metrics as a JSON string.

func (*MetricsCollector) ExportPrometheus added in v0.2.0

func (mc *MetricsCollector) ExportPrometheus() string

ExportPrometheus dumps metrics in Prometheus exposition format.

func (*MetricsCollector) LatencyHistogram added in v0.2.0

func (mc *MetricsCollector) LatencyHistogram(key string) (p50, p95, p99 float64)

LatencyHistogram returns P50, P95, P99 latencies in milliseconds for a key.

func (*MetricsCollector) RecordCustom added in v0.2.0

func (mc *MetricsCollector) RecordCustom(name string, value float64, attrs map[string]string)

RecordCustom records a custom named metric.

func (*MetricsCollector) RecordRequest added in v0.2.0

func (mc *MetricsCollector) RecordRequest(provider, model string, inputTok, outputTok int, latency time.Duration, costUSD float64, isError bool)

RecordRequest records a completed LLM request with all its metrics.

func (*MetricsCollector) RequestCount added in v0.2.0

func (mc *MetricsCollector) RequestCount(key string) int64

RequestCount returns total requests for a provider/model key.

func (*MetricsCollector) TokensUsed added in v0.2.0

func (mc *MetricsCollector) TokensUsed(key string) (int64, int64)

TokensUsed returns (inputTokens, outputTokens) for a provider/model key.

func (*MetricsCollector) TotalCost added in v0.2.0

func (mc *MetricsCollector) TotalCost() float64

TotalCost returns the sum of all accumulated costs in USD.

func (*MetricsCollector) TotalRequests added in v0.2.0

func (mc *MetricsCollector) TotalRequests() int64

TotalRequests returns the sum of all request counts.

type ProviderPinger added in v0.2.0

type ProviderPinger interface {
	Ping(ctx context.Context) error
	Name() string
}

ProviderPinger is the interface that providers must implement for health checking. This is satisfied by the client.Provider interface's Ping method.

type Span added in v0.2.0

type Span struct {
	TraceID    string            `json:"trace_id"`
	SpanID     string            `json:"span_id"`
	Name       string            `json:"name"`
	StartTime  time.Time         `json:"start_time"`
	EndTime    time.Time         `json:"end_time,omitempty"`
	Provider   string            `json:"provider,omitempty"`
	Model      string            `json:"model,omitempty"`
	Status     SpanStatus        `json:"status"`
	Attributes map[string]string `json:"attributes,omitempty"`
	Events     []SpanEvent       `json:"events,omitempty"`
}

Span represents a single timed operation in a trace, modeled after OpenTelemetry's Span concept. It carries a TraceID, SpanID, timing information, and arbitrary string attributes.

func (*Span) AddEvent added in v0.2.0

func (s *Span) AddEvent(name string, attrs map[string]string)

AddEvent adds a timestamped event to the span.

func (*Span) Duration added in v0.2.0

func (s *Span) Duration() time.Duration

Duration returns the span duration. Returns 0 if not yet ended.

func (*Span) SetAttribute added in v0.2.0

func (s *Span) SetAttribute(key, value string)

SetAttribute sets a single attribute on the span.

type SpanEvent added in v0.2.0

type SpanEvent struct {
	Name       string            `json:"name"`
	Timestamp  time.Time         `json:"timestamp"`
	Attributes map[string]string `json:"attributes,omitempty"`
}

SpanEvent records a timestamped event within a span.

type SpanStatus added in v0.2.0

type SpanStatus string

SpanStatus represents the outcome of a span.

const (
	StatusOK    SpanStatus = "ok"
	StatusError SpanStatus = "error"
	StatusUnset SpanStatus = "unset"
)

type Telemetry added in v0.2.0

type Telemetry struct {

	// OnSpanEnd is an optional callback invoked when a span ends.
	// Useful for exporting spans to external systems.
	OnSpanEnd func(*Span)
	// contains filtered or unexported fields
}

Telemetry wraps observability for all LLM calls. A nil *Telemetry is safe to use and adds zero overhead (all methods are no-ops on nil receiver).

func NewTelemetry added in v0.2.0

func NewTelemetry() *Telemetry

NewTelemetry creates a new Telemetry instance with an initialized MetricsCollector.

func (*Telemetry) EndSpan added in v0.2.0

func (t *Telemetry) EndSpan(span *Span, err error)

EndSpan completes a span, recording its status and duration. If err is non-nil, the span status is set to error and the error message is recorded. No-op if either t or span is nil.

func (*Telemetry) Metrics added in v0.2.0

func (t *Telemetry) Metrics() *MetricsCollector

Metrics returns the underlying MetricsCollector, or nil if Telemetry is nil.

func (*Telemetry) RecordMetric added in v0.2.0

func (t *Telemetry) RecordMetric(name string, value float64, attrs map[string]string)

RecordMetric records a named metric value with attributes. This is a general-purpose method for recording custom metrics. No-op if t is nil.

func (*Telemetry) Spans added in v0.2.0

func (t *Telemetry) Spans() []*Span

Spans returns a copy of all completed spans.

func (*Telemetry) StartSpan added in v0.2.0

func (t *Telemetry) StartSpan(name string, attrs map[string]string) *Span

StartSpan creates and starts a new Span with the given name and attributes. Returns nil if the Telemetry receiver is nil (opt-in pattern).
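Putting the pieces together, a typical span lifecycle might look like the following fragment, which uses only the signatures documented above; the attribute values, the model variable, and the surrounding call are placeholders.

```go
tel := eyrie.NewTelemetry()
tel.OnSpanEnd = func(s *eyrie.Span) {
	// export the finished span to a backend of your choice
}

span := tel.StartSpan(eyrie.SpanLLMChat, map[string]string{
	eyrie.AttrLLMProvider: "anthropic",
	eyrie.AttrLLMModel:    model, // placeholder
})
span.AddEvent("first_token", nil)
// ... perform the provider call, capturing err ...
tel.EndSpan(span, err)
```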

Directories

Path Synopsis
Package client provides LLM provider clients for Anthropic, OpenAI, and OpenAI-compatible APIs with streaming, retry, and provider detection.
sdk
go
