eyrie

package module
v0.2.0
Published: May 16, 2026 · License: MIT · Imports: 13 · Imported by: 0

README

Eyrie

What is eyrie?

eyrie is the provider runtime that hawk sits on top of. It handles everything between hawk and the LLM APIs — authentication, model resolution, streaming, retries, rate limiting, and caching — so hawk can focus on being a great coding agent.

When hawk calls a model, eyrie figures out which provider to use, how to talk to it, and how to stream the response back. When hawk switches from Anthropic to Ollama, eyrie handles the translation. When an API returns a 529, eyrie retries with backoff. When a response hits max_tokens, eyrie continues automatically.

hawk never talks to an LLM API directly. eyrie does.

What eyrie handles for hawk

  • Provider routing: detects the active provider from env vars, config file, or explicit key
  • Model resolution: maps abstract tiers (opus/sonnet/haiku) to concrete model IDs per provider
  • Streaming: parses SSE for Anthropic and OpenAI formats (text, tool calls, thinking blocks)
  • Reliability: retries on 429/500/529 with exponential backoff and Retry-After support
  • Long outputs: auto-continues when stop_reason == max_tokens
  • Cost control: Anthropic prompt caching breakpoints on system prompt and conversation prefix
  • Rate limiting: token bucket per provider, prevents hitting API limits
  • Config: reads/writes ~/.hawk/provider.json, applies to env vars
  • Model catalog: embedded pricing + context windows for all providers, live-fetched from OpenRouter
  • Testing: mock provider, so hawk's tests never need real API keys

Supported providers

  • Anthropic: set ANTHROPIC_API_KEY (default for hawk; supports thinking, caching)
  • OpenAI: set OPENAI_API_KEY (full tool use + reasoning effort)
  • OpenRouter: set OPENROUTER_API_KEY (200+ models via one key)
  • Grok (xAI): set XAI_API_KEY
  • Gemini: set GEMINI_API_KEY
  • CanopyWave: set CANOPYWAVE_API_KEY
  • Ollama: set OLLAMA_BASE_URL (local models, no key needed)
  • OpenCodeGo: set OPENCODEGO_API_KEY

eyrie detects which provider to use automatically — in the order above.

How hawk uses eyrie

// hawk creates a client once at startup
c := client.NewEyrieClient(&client.EyrieConfig{
    Provider: client.DetectProvider(), // reads from env / config file
})

// hawk streams a response
sr, err := c.StreamChat(ctx, conversation, client.ChatOptions{
    Model: catalog.GetProviderDefaultModel(provider, &cat),
})
if err != nil {
    log.Fatal(err)
}
defer sr.Close()

for evt := range sr.Events {
    switch evt.Type {
    case "content":   // stream text to terminal
    case "tool_call": // execute tool, append result
    case "thinking":  // show thinking indicator
    case "done":      // response complete
    }
}

// When a response hits max_tokens, eyrie continues automatically
resp, err := client.ChatWithContinuation(ctx, provider, messages,
    client.ChatOptions{Model: model},
    client.DefaultContinuationConfig(),
)

Provider config file

hawk stores provider config at ~/.hawk/provider.json. eyrie owns this file.

cfg := config.LoadProviderConfig("")        // load
config.ApplyProviderConfigToEnv(cfg, false, nil) // apply to env
config.SaveProviderConfig(cfg, "")          // save

Model catalog

eyrie ships with an embedded catalog of every supported model — pricing, context windows, max output. hawk uses this for cost tracking and model selection.

cat := catalog.DefaultModelCatalog()

// Get the best model for a tier
model := catalog.GetPreferredProviderModel("anthropic", catalog.TierSonnet, &cat)
// → "claude-sonnet-4-6"

// Check if a model is deprecated
warn := catalog.GetModelDeprecationWarning("claude-3-7-sonnet", "anthropic")
// → "⚠ Claude 3.7 Sonnet will be retired on February 19, 2026..."

Testing hawk without API keys

mock := client.NewMockProvider(client.MockModeFixed)
mock.Response = "Here is the code you asked for..."

// Inject into hawk's test suite — no real API calls
resp, _ := mock.Chat(ctx, messages, opts)

Install

go get github.com/GrayCodeAI/eyrie

Requires Go 1.26+. Zero external dependencies.

License

MIT © 2026 GrayCode AI

Documentation

Overview

Package eyrie is the core LLM client library for hawk.

It provides API provider configurations, model resolution, API limits, base types (messages, IDs, connectors), and error types.

Sub-packages:

  • types: Message types, content blocks, usage, IDs, connectors
  • errors: Error message constants and utilities
  • constants: API limits (image, PDF, media)
  • catalog: Model catalog, tiers, names, deprecation, provider data
  • config: Provider configuration, profiles, env, OpenAI-compatible runtime
  • client: EyrieClient, factory, provider detection
  • utils: Error utilities (SSL detection, API error sanitization)

Package eyrie observability provides OpenTelemetry-compatible structured tracing and metrics collection for all LLM provider calls.

Design follows OpenTelemetry Go SDK patterns (TraceID/SpanID, span lifecycle, metric instruments) while remaining zero-dependency (Go stdlib only).

Usage is opt-in: a nil *Telemetry adds zero overhead.
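The nil-receiver opt-in pattern is easy to see in a standalone sketch. The tracer type below is illustrative, not eyrie's Telemetry: the point is that every method guards against a nil receiver, so disabled telemetry costs callers one pointer comparison and no nil checks.

```go
package main

import "fmt"

// tracer demonstrates the nil-receiver pattern: methods on a nil
// pointer simply return, so callers never need to nil-check.
type tracer struct{ spans []string }

func (t *tracer) StartSpan(name string) {
	if t == nil {
		return // telemetry disabled: no-op
	}
	t.spans = append(t.spans, name)
}

func main() {
	var off *tracer           // nil means telemetry is off
	off.StartSpan("llm.chat") // safe: does nothing

	on := &tracer{}
	on.StartSpan("llm.chat")
	fmt.Println(len(on.spans)) // 1
}
```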

Index

Constants

const (
	SpanLLMChat     = "llm.chat"
	SpanLLMStream   = "llm.stream"
	SpanLLMRetry    = "llm.retry"
	SpanLLMCacheHit = "llm.cache_hit"
)
const (
	AttrLLMProvider     = "llm.provider"
	AttrLLMModel        = "llm.model"
	AttrLLMInputTokens  = "llm.input_tokens"
	AttrLLMOutputTokens = "llm.output_tokens"
	AttrLLMCostUSD      = "llm.cost_usd"
	AttrLLMLatencyMs    = "llm.latency_ms"
	AttrLLMStatus       = "llm.status"
)
const (
	// DefaultWarmInterval is the default interval between cache warming pings.
	// Anthropic's prompt cache TTL is 5 minutes; we warm at 4 minutes to stay ahead.
	DefaultWarmInterval = 4 * time.Minute
)

Default cache warmer settings.

Variables

var Version = strings.TrimSpace(versionFile)

Version of the eyrie library. Sourced from the VERSION file at the repo root — do not edit this variable directly. Bump VERSION instead, or let release-please/goreleaser do it.

Functions

This section is empty.

Types

type CacheStats added in v0.2.0

type CacheStats struct {
	WarmingRequests     int64
	CacheHits           int64
	CacheMisses         int64
	EstimatedSavingsUSD float64
	LastWarmedAt        time.Time
}

CacheStats tracks cache warming statistics.

type CacheWarmer added in v0.2.0

type CacheWarmer struct {
	Provider     string
	Model        string
	SystemPrompt string
	Interval     time.Duration
	Enabled      bool

	Stats CacheStats

	// ChatFn is the function used to send warming requests.
	// It should send a chat completion request to the provider.
	ChatFn func(ctx context.Context, messages []Message, opts ChatOptions) error
	// contains filtered or unexported fields
}

CacheWarmer keeps Anthropic's prompt cache warm by periodically sending minimal requests with the system prompt, ensuring subsequent real requests get cache hits at a 90% discount.

func NewCacheWarmer added in v0.2.0

func NewCacheWarmer(chatFn func(ctx context.Context, messages []Message, opts ChatOptions) error, systemPrompt, provider, model string) *CacheWarmer

NewCacheWarmer creates a new CacheWarmer configured for the given provider. The chatFn is called to send warming pings; it should make a real API call.

func (*CacheWarmer) CacheBreakpoints added in v0.2.0

func (cw *CacheWarmer) CacheBreakpoints(systemPrompt string, conversationPrefix []Message) []int

CacheBreakpoints suggests where to place cache breakpoints in a message list. Anthropic allows up to 4 breakpoints. The strategy is:

  • After system prompt (index 0 in returned slice signals "system prompt")
  • After the first user message
  • After large context blocks (messages with content > 200 chars)

Returns indices into the messages array where breakpoints should be placed. The special index -1 indicates the system prompt should be cached.

func (*CacheWarmer) EstimateSavings added in v0.2.0

func (cw *CacheWarmer) EstimateSavings(inputTokens int, requestCount int) float64

EstimateSavings calculates the cost savings from prompt caching in USD.

Without cache: inputTokens * requestCount * price_per_token
With cache:    inputTokens * price_per_token * 1.25 (first request, cache write)
             + inputTokens * (requestCount-1) * price_per_token * 0.1 (cached reads)

Returns the difference (savings amount).

func (*CacheWarmer) ShouldWarm added in v0.2.0

func (cw *CacheWarmer) ShouldWarm() bool

ShouldWarm returns true if the cache has likely expired (more than 4 minutes since the last warming request) and should be refreshed.

func (*CacheWarmer) Start added in v0.2.0

func (cw *CacheWarmer) Start(ctx context.Context) error

Start begins the background cache warming loop. It sends a warming ping immediately, then repeats every Interval until Stop is called or the context is cancelled. For non-Anthropic providers this is a no-op.

func (*CacheWarmer) Stop added in v0.2.0

func (cw *CacheWarmer) Stop()

Stop stops the cache warmer background loop.

func (*CacheWarmer) Warm added in v0.2.0

func (cw *CacheWarmer) Warm(ctx context.Context) error

Warm sends a single warming request immediately. This keeps the system prompt cached on Anthropic's side. Returns an error if the request fails. For non-Anthropic providers this is a no-op.

type ChatOptions added in v0.2.0

type ChatOptions struct {
	Model         string
	MaxTokens     int
	System        string
	EnableCaching bool
}

ChatOptions is a minimal options struct for cache warmer use.

type HealthCheckConfig added in v0.2.0

type HealthCheckConfig struct {
	// Interval between periodic health checks. Default: 30s.
	Interval time.Duration
	// Timeout for each individual ping. Default: 5s.
	Timeout time.Duration
	// DegradedThreshold: latency above this marks provider as Degraded. Default: 2s.
	DegradedThreshold time.Duration
	// UnhealthyAfter: number of consecutive failures before marking Unhealthy. Default: 3.
	UnhealthyAfter int
	// DegradedAfter: number of consecutive failures before marking Degraded. Default: 1.
	DegradedAfter int
}

HealthCheckConfig configures the HealthChecker behavior.

func DefaultHealthCheckConfig added in v0.2.0

func DefaultHealthCheckConfig() HealthCheckConfig

DefaultHealthCheckConfig returns sensible default configuration.

type HealthChecker added in v0.2.0

type HealthChecker struct {
	// contains filtered or unexported fields
}

HealthChecker periodically pings LLM providers to determine their health. It is safe for concurrent use. A nil *HealthChecker is safe (all methods are no-ops).

func NewHealthChecker added in v0.2.0

func NewHealthChecker(cfg HealthCheckConfig) *HealthChecker

NewHealthChecker creates a new HealthChecker with the given configuration. Pass a zero-value config to use defaults.

func (*HealthChecker) AllProviderHealth added in v0.2.0

func (hc *HealthChecker) AllProviderHealth() map[string]HealthStatus

AllProviderHealth returns the current health status of all registered providers.

func (*HealthChecker) Check added in v0.2.0

func (hc *HealthChecker) Check(provider string) HealthStatus

Check performs an immediate health check on a single provider and returns the resulting HealthStatus. Returns an Unhealthy status if the provider is not registered.

func (*HealthChecker) Register added in v0.2.0

func (hc *HealthChecker) Register(provider ProviderPinger)

Register adds a provider to be health-checked. Can be called at any time.

func (*HealthChecker) Start added in v0.2.0

func (hc *HealthChecker) Start()

Start begins periodic background health checks. Call Stop() to halt. No-op if already started or if hc is nil.

func (*HealthChecker) Stop added in v0.2.0

func (hc *HealthChecker) Stop()

Stop halts periodic health checking. Blocks until the background loop exits. No-op if not started or if hc is nil.

func (*HealthChecker) Unregister added in v0.2.0

func (hc *HealthChecker) Unregister(name string)

Unregister removes a provider from health checking.

type HealthState added in v0.2.0

type HealthState int

HealthState represents the health condition of a provider.

const (
	// Healthy indicates the provider is responding normally.
	Healthy HealthState = iota
	// Degraded indicates the provider is responding but with elevated latency or intermittent errors.
	Degraded
	// Unhealthy indicates the provider is not responding or consistently failing.
	Unhealthy
)

func (HealthState) String added in v0.2.0

func (h HealthState) String() string

String returns a human-readable representation of HealthState.

type HealthStatus added in v0.2.0

type HealthStatus struct {
	State       HealthState   `json:"state"`
	Latency     time.Duration `json:"latency"`
	LastChecked time.Time     `json:"last_checked"`
	Error       string        `json:"error,omitempty"`
	Message     string        `json:"message,omitempty"`
}

HealthStatus holds the current health status for a provider, including measured latency and the time of the last health check.

func (HealthStatus) IsHealthy added in v0.2.0

func (hs HealthStatus) IsHealthy() bool

IsHealthy returns true if the provider state is Healthy.

type Message added in v0.2.0

type Message struct {
	Role    string
	Content string
}

Message is a minimal message struct for cache warmer use.

type MetricsCollector added in v0.2.0

type MetricsCollector struct {
	// contains filtered or unexported fields
}

MetricsCollector aggregates metrics for LLM operations. It uses atomic operations and mutexes for thread safety with minimal contention.

func NewMetricsCollector added in v0.2.0

func NewMetricsCollector() *MetricsCollector

NewMetricsCollector creates a new MetricsCollector.

func (*MetricsCollector) CacheHitRate added in v0.2.0

func (mc *MetricsCollector) CacheHitRate() float64

CacheHitRate returns the ratio of cache hits to total cache-eligible requests.

func (*MetricsCollector) CostAccumulator added in v0.2.0

func (mc *MetricsCollector) CostAccumulator(key string) float64

CostAccumulator returns the total cost in USD for a provider/model key.

func (*MetricsCollector) ErrorRate added in v0.2.0

func (mc *MetricsCollector) ErrorRate(provider string) float64

ErrorRate returns the error rate for a provider (errors / total requests involving that provider).

func (*MetricsCollector) ExportJSON added in v0.2.0

func (mc *MetricsCollector) ExportJSON() string

ExportJSON dumps all metrics as a JSON string.

func (*MetricsCollector) ExportPrometheus added in v0.2.0

func (mc *MetricsCollector) ExportPrometheus() string

ExportPrometheus dumps metrics in Prometheus exposition format.

func (*MetricsCollector) LatencyHistogram added in v0.2.0

func (mc *MetricsCollector) LatencyHistogram(key string) (p50, p95, p99 float64)

LatencyHistogram returns P50, P95, P99 latencies in milliseconds for a key.

func (*MetricsCollector) RecordCustom added in v0.2.0

func (mc *MetricsCollector) RecordCustom(name string, value float64, attrs map[string]string)

RecordCustom records a custom named metric.

func (*MetricsCollector) RecordRequest added in v0.2.0

func (mc *MetricsCollector) RecordRequest(provider, model string, inputTok, outputTok int, latency time.Duration, costUSD float64, isError bool)

RecordRequest records a completed LLM request with all its metrics.

func (*MetricsCollector) RequestCount added in v0.2.0

func (mc *MetricsCollector) RequestCount(key string) int64

RequestCount returns total requests for a provider/model key.

func (*MetricsCollector) TokensUsed added in v0.2.0

func (mc *MetricsCollector) TokensUsed(key string) (int64, int64)

TokensUsed returns (inputTokens, outputTokens) for a provider/model key.

func (*MetricsCollector) TotalCost added in v0.2.0

func (mc *MetricsCollector) TotalCost() float64

TotalCost returns the sum of all accumulated costs in USD.

func (*MetricsCollector) TotalRequests added in v0.2.0

func (mc *MetricsCollector) TotalRequests() int64

TotalRequests returns the sum of all request counts.

type ProviderPinger added in v0.2.0

type ProviderPinger interface {
	Ping(ctx context.Context) error
	Name() string
}

ProviderPinger is the interface that providers must implement for health checking. This is satisfied by the client.Provider interface's Ping method.

type Span added in v0.2.0

type Span struct {
	TraceID    string            `json:"trace_id"`
	SpanID     string            `json:"span_id"`
	Name       string            `json:"name"`
	StartTime  time.Time         `json:"start_time"`
	EndTime    time.Time         `json:"end_time,omitempty"`
	Provider   string            `json:"provider,omitempty"`
	Model      string            `json:"model,omitempty"`
	Status     SpanStatus        `json:"status"`
	Attributes map[string]string `json:"attributes,omitempty"`
	Events     []SpanEvent       `json:"events,omitempty"`
}

Span represents a single timed operation in a trace, modeled after OpenTelemetry's Span concept. It carries a TraceID, SpanID, timing information, and arbitrary string attributes.

func (*Span) AddEvent added in v0.2.0

func (s *Span) AddEvent(name string, attrs map[string]string)

AddEvent adds a timestamped event to the span.

func (*Span) Duration added in v0.2.0

func (s *Span) Duration() time.Duration

Duration returns the span duration. Returns 0 if not yet ended.

func (*Span) SetAttribute added in v0.2.0

func (s *Span) SetAttribute(key, value string)

SetAttribute sets a single attribute on the span.

type SpanEvent added in v0.2.0

type SpanEvent struct {
	Name       string            `json:"name"`
	Timestamp  time.Time         `json:"timestamp"`
	Attributes map[string]string `json:"attributes,omitempty"`
}

SpanEvent records a timestamped event within a span.

type SpanStatus added in v0.2.0

type SpanStatus string

SpanStatus represents the outcome of a span.

const (
	StatusOK    SpanStatus = "ok"
	StatusError SpanStatus = "error"
	StatusUnset SpanStatus = "unset"
)

type Telemetry added in v0.2.0

type Telemetry struct {

	// OnSpanEnd is an optional callback invoked when a span ends.
	// Useful for exporting spans to external systems.
	OnSpanEnd func(*Span)
	// contains filtered or unexported fields
}

Telemetry wraps observability for all LLM calls. A nil *Telemetry is safe to use and adds zero overhead (all methods are no-ops on nil receiver).

func NewTelemetry added in v0.2.0

func NewTelemetry() *Telemetry

NewTelemetry creates a new Telemetry instance with an initialized MetricsCollector.

func (*Telemetry) EndSpan added in v0.2.0

func (t *Telemetry) EndSpan(span *Span, err error)

EndSpan completes a span, recording its status and duration. If err is non-nil, the span status is set to error and the error message is recorded. No-op if either t or span is nil.

func (*Telemetry) Metrics added in v0.2.0

func (t *Telemetry) Metrics() *MetricsCollector

Metrics returns the underlying MetricsCollector, or nil if Telemetry is nil.

func (*Telemetry) RecordMetric added in v0.2.0

func (t *Telemetry) RecordMetric(name string, value float64, attrs map[string]string)

RecordMetric records a named metric value with attributes. This is a general-purpose method for recording custom metrics. No-op if t is nil.

func (*Telemetry) Spans added in v0.2.0

func (t *Telemetry) Spans() []*Span

Spans returns a copy of all completed spans.

func (*Telemetry) StartSpan added in v0.2.0

func (t *Telemetry) StartSpan(name string, attrs map[string]string) *Span

StartSpan creates and starts a new Span with the given name and attributes. Returns nil if the Telemetry receiver is nil (opt-in pattern).
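Putting the pieces together, a typical span lifecycle might look like the following fragment, which uses only the signatures documented above; the attribute values, the model variable, and the surrounding call are placeholders.

```go
tel := eyrie.NewTelemetry()
tel.OnSpanEnd = func(s *eyrie.Span) {
	// export the finished span to a backend of your choice
}

span := tel.StartSpan(eyrie.SpanLLMChat, map[string]string{
	eyrie.AttrLLMProvider: "anthropic",
	eyrie.AttrLLMModel:    model, // placeholder
})
span.AddEvent("first_token", nil)
// ... perform the provider call, capturing err ...
tel.EndSpan(span, err)
```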

Directories

Path Synopsis
Package client provides LLM provider clients for Anthropic, OpenAI, and OpenAI-compatible APIs with streaming, retry, and provider detection.
sdk
go
