config

package
v0.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 16, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

README

Config

Package config handles application configuration: JSON file loading and environment variable overrides (12-factor style).

Purpose

  • Config: Single struct for server (host, port, TLS, CORS), transport (WebSocket/WebRTC), pipeline (provider, model, STT/LLM/TTS overrides), turn/VAD, runner, recording, transcripts, MCP, metrics, logging.
  • LoadConfig: Reads JSON from path and returns *Config. Call ApplyEnvOverrides on the result so env vars override file values.
  • Nested configs: RecordingConfig, TranscriptConfig, MCPConfig for recording, transcripts, and MCP tool integration.

Config struct tree

graph TD
    Config["Config"] --> Server["Host, Port, TLS, CORS\nLogLevel, JSONLogs"]
    Config --> Pipeline["Provider, Model\nSttProvider, LlmProvider, TtsProvider\nSTTModel, TTSModel, TTSVoice\nPlugins, PluginOptions, APIKeys"]
    Config --> Transport["Transport\nWebRTCICEServers"]
    Config --> Turn["TurnDetection, TurnStopSecs\nVAD*, UserTurnStopTimeoutSecs\nUserIdleTimeoutSecs"]
    Config --> Runner["RunnerTransport, RunnerPort\nProxyHost, Dialin\nSessionStore, RedisURL"]
    Config --> Recording["Recording\nRecordingConfig"]
    Config --> Transcripts["Transcripts\nTranscriptConfig"]
    Config --> MCP["MCP\nMCPConfig"]
    Config --> Metrics["MetricsEnabled"]
    RecordingConfig["RecordingConfig"] --> RecFields["Enable, Bucket\nBasePath, Format\nWorkerCount"]
    TranscriptConfig["TranscriptConfig"] --> TransFields["Enable, Driver\nDSN, TableName"]
    MCPConfig["MCPConfig"] --> MCPFields["Command, Args\nToolsFilter"]

Exported symbols

| Symbol | Type | Description |
| --- | --- | --- |
| Config | struct | Root config; fields for server, pipeline, transport, turn/VAD, runner, recording, transcripts, MCP, metrics |
| RecordingConfig | struct | Enable, Bucket, BasePath, Format, WorkerCount, QueueCap, MaxRetries |
| TranscriptConfig | struct | Enable, Driver, DSN, TableName |
| MCPConfig | struct | Command, Args, ToolsFilter |
| LoadConfig(path) | func | Read JSON file; returns *Config and an error. Call ApplyEnvOverrides afterwards to apply env overrides |
| ApplyEnvOverrides(cfg) | func | Apply VOXRAY_* and common env vars to cfg |
| Validate(cfg) | func | Check cfg; returns a slice of validation error messages |
| GetEnv(key, def) | func | os.Getenv with default |
| (c *Config) GetAPIKey(service, envVar) | method | API key from APIKeys map or env |
| (c *Config) STTProvider(), LLMProvider(), TTSProvider() | method | Per-task provider (stt_provider/llm_provider/tts_provider or provider) |
| (c *Config) TurnEnabled() | method | true when turn_detection == "silence" |
| (c *Config) VADBackendOrDefault() | method | VAD type, default "energy" |
| (c *Config) VADParams() | method | Struct with Confidence, StartSecs, StopSecs, MinVolume |
| (c *Config) MetricsEnabledOrDefault() | method | true if metrics_enabled unset or true |

Validation

  • Validate(cfg) checks the config and returns a slice of validation error messages (nil or empty when valid); callers decide how to handle them. Unknown JSON keys are ignored.

Concurrency

  • Config is intended to be loaded once and read-only thereafter; no internal locking.

Files

File Description
config.go Config, RecordingConfig, TranscriptConfig, MCPConfig, LoadConfig, ApplyEnvOverrides, GetEnv, Config methods

Documentation

Overview

Package config handles the application configuration, including environment variables and JSON files.

Functions

func ApplyEnvOverrides

func ApplyEnvOverrides(cfg *Config)

ApplyEnvOverrides applies environment variable overrides to cfg (12-factor config). It reads VOXRAY_PORT or PORT, VOXRAY_HOST or HOST, VOXRAY_LOG_LEVEL, VOXRAY_JSON_LOGS, VOXRAY_TLS_ENABLE, VOXRAY_TLS_CERT_FILE, VOXRAY_TLS_KEY_FILE, VOXRAY_CORS_ORIGINS (comma-separated), and VOXRAY_MAX_BODY_BYTES. Unset env vars leave cfg unchanged.
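
The override rule can be illustrated with a small stand-alone sketch. It is not the package's implementation: miniConfig, lookupFunc, firstSet, and applyOverrides are hypothetical names, and two behaviours are assumptions (the VOXRAY_-prefixed name wins when both names are set, and an unparsable port is silently ignored).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// miniConfig is a hypothetical stand-in for the few Config fields used here.
type miniConfig struct {
	Host               string
	Port               int
	CORSAllowedOrigins []string
}

// lookupFunc has the same shape as os.LookupEnv so a map can be injected in tests.
type lookupFunc func(key string) (string, bool)

// firstSet returns the value of the first key that is set and non-empty.
func firstSet(lookup lookupFunc, keys ...string) (string, bool) {
	for _, k := range keys {
		if v, ok := lookup(k); ok && v != "" {
			return v, true
		}
	}
	return "", false
}

// applyOverrides mutates cfg only for variables that are set.
// Assumption: the VOXRAY_-prefixed name takes precedence over the generic one.
func applyOverrides(cfg *miniConfig, lookup lookupFunc) {
	if v, ok := firstSet(lookup, "VOXRAY_HOST", "HOST"); ok {
		cfg.Host = v
	}
	if v, ok := firstSet(lookup, "VOXRAY_PORT", "PORT"); ok {
		// Assumption: a non-numeric port is ignored rather than reported.
		if n, err := strconv.Atoi(v); err == nil {
			cfg.Port = n
		}
	}
	if v, ok := firstSet(lookup, "VOXRAY_CORS_ORIGINS"); ok {
		var origins []string
		for _, o := range strings.Split(v, ",") {
			if o = strings.TrimSpace(o); o != "" {
				origins = append(origins, o)
			}
		}
		cfg.CORSAllowedOrigins = origins
	}
}

func main() {
	cfg := &miniConfig{Host: "localhost", Port: 8080}
	applyOverrides(cfg, os.LookupEnv)
	fmt.Printf("%+v\n", *cfg)
}
```

Passing the lookup as a function keeps the sketch testable without touching the process environment.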

func GetEnv

func GetEnv(key, def string) string

GetEnv returns the value of an environment variable, or def if unset. Used for API keys (e.g. OPENAI_API_KEY).

func Validate

func Validate(cfg *Config) []string

Validate checks the config and returns a slice of validation error messages. It returns nil or an empty slice when the config is valid.
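
A validator with this shape might look like the sketch below. The three rules are assumptions chosen for illustration (the Redis and dial-in rules are taken from the Config field comments, but whether Validate enforces them is not stated); miniConfig and validate are hypothetical names.

```go
package main

import "fmt"

// miniConfig is a hypothetical subset of Config used only for this sketch.
type miniConfig struct {
	Port            int
	SessionStore    string
	RedisURL        string
	RunnerTransport string
	Dialin          bool
}

// validate collects every problem instead of stopping at the first one,
// mirroring the []string return shape of Validate.
func validate(cfg *miniConfig) []string {
	var errs []string
	if cfg.Port < 0 || cfg.Port > 65535 {
		errs = append(errs, fmt.Sprintf("port %d out of range", cfg.Port))
	}
	if cfg.SessionStore == "redis" && cfg.RedisURL == "" {
		errs = append(errs, `redis_url is required when session_store is "redis"`)
	}
	if cfg.Dialin && cfg.RunnerTransport != "daily" {
		errs = append(errs, `dialin requires runner_transport "daily"`)
	}
	return errs
}

func main() {
	cfg := &miniConfig{Port: 8080, SessionStore: "redis"}
	for _, msg := range validate(cfg) {
		fmt.Println("config error:", msg)
	}
}
```

Returning all messages at once lets a caller print every problem in a single startup failure.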

Types

type Config

type Config struct {
	Host          string                     `json:"host"`
	Port          int                        `json:"port"`
	Model         string                     `json:"model"`
	Provider      string                     `json:"provider,omitempty"` // default for all tasks; "openai" or "groq"
	SttProvider   string                     `json:"stt_provider,omitempty"`
	LlmProvider   string                     `json:"llm_provider,omitempty"`
	TtsProvider   string                     `json:"tts_provider,omitempty"`
	STTModel      string                     `json:"stt_model,omitempty"`
	STTLanguage   string                     `json:"stt_language,omitempty"` // e.g. "hi-IN", "en-IN"; empty = auto-detect (Sarvam)
	TTSModel      string                     `json:"tts_model,omitempty"`
	TTSVoice      string                     `json:"tts_voice,omitempty"`
	Plugins       []string                   `json:"plugins"`
	PluginOptions map[string]json.RawMessage `json:"plugin_options,omitempty"` // per-plugin JSON options; key = plugin name
	APIKeys       map[string]string          `json:"api_keys,omitempty"`

	// Transport selects which network transports are enabled for the server.
	// Supported values:
	//   - "" or "websocket": only WebSocket (/ws).
	//   - "smallwebrtc": only SmallWebRTC signaling (/webrtc/offer).
	//   - "both": enable both WebSocket and SmallWebRTC on the same HTTP server.
	Transport string `json:"transport,omitempty"`
	// WebRTCICEServers lists ICE server URLs (e.g. STUN/TURN) for the SmallWebRTC transport.
	// When empty, a sensible default STUN server is used.
	WebRTCICEServers []string `json:"webrtc_ice_servers,omitempty"`

	// RTCMaxDurationSecs enforces a maximum lifetime for RTC connections (WebRTC and/or
	// WebSocket voice sessions) after the first inbound audio is observed.
	// 0 or negative disables the enforcement.
	RTCMaxDurationSecs float64 `json:"rtc_max_duration_secs,omitempty"`

	// Turn detection: when to consider user finished speaking
	TurnDetection       string  `json:"turn_detection,omitempty"`         // "none" | "silence"; default "none"
	TurnStopSecs        float64 `json:"turn_stop_secs,omitempty"`         // silence after speech to end turn (default 3)
	TurnPreSpeechMs     float64 `json:"turn_pre_speech_ms,omitempty"`     // pre-speech padding ms (default 500)
	TurnMaxDurationSecs float64 `json:"turn_max_duration_secs,omitempty"` // max segment duration secs (default 8)
	VADStartSecs        float64 `json:"vad_start_secs,omitempty"`         // VAD start trigger time for turn (default 0)
	VadThreshold        float64 `json:"vad_threshold,omitempty"`          // EnergyDetector RMS threshold (default 0.02)
	TurnAsync           bool    `json:"turn_async,omitempty"`             // use async AnalyzeEndOfTurn instead of sync AppendAudio

	// User turn / idle lifecycle.
	// When zero, UserTurnStopTimeoutSecs falls back to TurnStopSecs; when both
	// are zero, a conservative default (5s) is used.
	UserTurnStopTimeoutSecs float64 `json:"user_turn_stop_timeout_secs,omitempty"` // timeout with no activity before forcing user turn stop
	// When >0, triggers a UserIdleFrame after the bot has finished speaking
	// and the user has been idle for this duration.
	UserIdleTimeoutSecs float64 `json:"user_idle_timeout_secs,omitempty"`

	// VAD analyzer configuration. When unset, defaults
	// match the Python VADParams defaults.
	VADType         string  `json:"vad_type,omitempty"`           // "energy" (default), "silero", "aic" (future)
	VADConfidence   float64 `json:"vad_confidence,omitempty"`     // default 0.7
	VADStartSecsVAD float64 `json:"vad_start_secs_vad,omitempty"` // default 0.2
	VADStopSecs     float64 `json:"vad_stop_secs,omitempty"`      // default 0.2
	VADMinVolume    float64 `json:"vad_min_volume,omitempty"`     // default 0.6
	// VADBatchSize batches consecutive VAD chunks before inference when > 1 (e.g. Silero); default 1 = no batching.
	VADBatchSize int `json:"vad_batch_size,omitempty"`

	// Interruption: allow user to interrupt bot; strategy (e.g. "keyword") and min_words for future use.
	AllowInterruptions   bool   `json:"allow_interruptions,omitempty"`
	InterruptionStrategy string `json:"interruption_strategy,omitempty"`
	MinWords             int    `json:"min_words,omitempty"`

	// Runner: development runner (transport type, port, proxy for telephony).
	// RunnerTransport: "webrtc" | "daily" | "twilio" | "telnyx" | "plivo" | "exotel" | "livekit" | "" (use Transport + /ws as before).
	RunnerTransport string `json:"runner_transport,omitempty"`
	// RunnerPort overrides Port when runner is used (default 8080; Python runner uses 7860).
	RunnerPort int `json:"runner_port,omitempty"`
	// ProxyHost is the public hostname for telephony webhook XML (e.g. mybot.ngrok.io). No protocol.
	ProxyHost string `json:"proxy_host,omitempty"`
	// Dialin enables Daily PSTN dial-in webhook (POST /daily-dialin-webhook). Only with runner_transport=daily.
	Dialin bool `json:"dialin,omitempty"`
	// DailyDialinWebhookSecret when set requires X-Webhook-Secret header to match for POST /daily-dialin-webhook. Overridden by VOXRAY_DAILY_DIALIN_WEBHOOK_SECRET.
	DailyDialinWebhookSecret string `json:"daily_dialin_webhook_secret,omitempty"`

	// Session store for runner sessions (POST /start, /sessions/{id}/...).
	// "memory" (default): in-memory per process; use for single instance or vertical scaling.
	// "redis": shared store via Redis; use for horizontal scaling behind a load balancer.
	SessionStore string `json:"session_store,omitempty"`
	// RedisURL is the Redis connection URL (e.g. redis://localhost:6379/0). Required when session_store is "redis".
	RedisURL string `json:"redis_url,omitempty"`
	// SessionTTLSecs is the TTL for sessions in seconds (default 3600). Applies to Redis store; optional for memory store.
	SessionTTLSecs int `json:"session_ttl_secs,omitempty"`

	// PipelineInputQueueCap is the buffer size between transport read and pipeline push (default 256).
	// When > 0, overrides the default. Back-pressure: when full, the reader blocks so the transport doesn't consume unbounded memory.
	PipelineInputQueueCap int `json:"pipeline_input_queue_cap,omitempty"`

	// WSWriteCoalesceMs when > 0 enables WebSocket write coalescing: drain up to WSWriteCoalesceMaxFrames frames within this many ms before writing (reduces syscalls; adds latency). Default 0 = disabled.
	WSWriteCoalesceMs        int `json:"ws_write_coalesce_ms,omitempty"`
	WSWriteCoalesceMaxFrames int `json:"ws_write_coalesce_max_frames,omitempty"`

	// TLS: enable TLS and cert/key paths. Can be overridden by VOXRAY_TLS_* env vars.
	TLSEnable   bool   `json:"tls_enable,omitempty"`
	TLSCertFile string `json:"tls_cert_file,omitempty"`
	TLSKeyFile  string `json:"tls_key_file,omitempty"`

	// LogLevel is "debug", "info", or "error". Overridden by VOXRAY_LOG_LEVEL.
	LogLevel string `json:"log_level,omitempty"`
	// JSONLogs enables one-JSON-object-per-line logging. Overridden by VOXRAY_JSON_LOGS.
	JSONLogs bool `json:"json_logs,omitempty"`

	// CORSAllowedOrigins is a list of origins allowed for CORS (e.g. https://app.example.com). Empty means no CORS headers. Overridden by VOXRAY_CORS_ORIGINS (comma-separated).
	CORSAllowedOrigins []string `json:"cors_allowed_origins,omitempty"`

	// MaxRequestBodyBytes limits JSON request body size (e.g. /webrtc/offer, /start). Zero = no limit. Overridden by VOXRAY_MAX_BODY_BYTES.
	MaxRequestBodyBytes int64 `json:"max_request_body_bytes,omitempty"`

	// ServerAPIKey when non-empty requires Authorization: Bearer <key> or X-API-Key: <key> for /start, /sessions/*, /webrtc/offer, /ws. Overridden by VOXRAY_SERVER_API_KEY.
	ServerAPIKey string `json:"server_api_key,omitempty"`

	// MCP configures the MCP (Model Context Protocol) client for tool integration. When set, tools from the MCP server are registered with the LLM.
	MCP *MCPConfig `json:"mcp,omitempty"`

	// MetricsEnabled toggles Prometheus metrics collection and exposure at /metrics.
	// When omitted, metrics default to enabled to match README docs.
	// When set explicitly to false, handlers avoid recording metrics and /metrics still exists but exports an empty registry.
	MetricsEnabled *bool `json:"metrics_enabled,omitempty"`
	// Metrics holds optional tuning for metrics (e.g. audio sampling to reduce contention).
	Metrics MetricsConfig `json:"metrics,omitempty"`

	// Recording controls conversation-wide audio recording and async upload.
	Recording RecordingConfig `json:"recording,omitempty"`

	// Transcripts controls per-message transcript logging to an external database.
	Transcripts TranscriptConfig `json:"transcripts,omitempty"`
	// contains filtered or unexported fields
}
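
The comment on UserTurnStopTimeoutSecs describes a two-step fallback: use the field itself, else TurnStopSecs, else 5 seconds. A minimal sketch of that resolution follows; userTurnStopTimeout is a hypothetical helper, and treating negative values like zero is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// userTurnStopTimeout resolves the effective timeout from the two config values.
// Assumption: negative values are treated the same as zero (unset).
func userTurnStopTimeout(userTurnStopTimeoutSecs, turnStopSecs float64) time.Duration {
	secs := userTurnStopTimeoutSecs
	if secs <= 0 {
		secs = turnStopSecs
	}
	if secs <= 0 {
		secs = 5 // conservative default named in the field comment
	}
	return time.Duration(secs * float64(time.Second))
}

func main() {
	fmt.Println(userTurnStopTimeout(0, 0))   // both unset
	fmt.Println(userTurnStopTimeout(0, 3))   // falls back to TurnStopSecs
	fmt.Println(userTurnStopTimeout(1.5, 3)) // explicit value wins
}
```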

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig reads a JSON configuration file from the specified path and returns a *Config. It returns an error if the file cannot be read or if the JSON is invalid. Call ApplyEnvOverrides(cfg) after LoadConfig to apply 12-factor env overrides.
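
The read-then-decode step can be sketched as below, using only the standard library. This is an illustration rather than the package's code: miniConfig, parseConfig, and loadConfig are hypothetical names, and "config.json" is a placeholder path. It also shows that encoding/json skips unknown keys by default.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// miniConfig is a hypothetical two-field subset of Config.
type miniConfig struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// parseConfig decodes JSON bytes. json.Unmarshal ignores keys that have
// no matching struct field, so unknown keys do not cause an error.
func parseConfig(data []byte) (*miniConfig, error) {
	var cfg miniConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return &cfg, nil
}

// loadConfig reads the file at path and decodes it.
func loadConfig(path string) (*miniConfig, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	return parseConfig(data)
}

func main() {
	cfg, err := loadConfig("config.json") // hypothetical path
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```

Splitting parsing from file access keeps the decode step testable with in-memory bytes.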

func (*Config) GetAPIKey

func (c *Config) GetAPIKey(service string, envVar string) string

GetAPIKey returns the API key for the given service, checking the config first, then falling back to environment variables. Resolved values are cached so env lookups are not repeated.
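
One way to picture the lookup order and caching is the sketch below. It is an assumption-laden illustration: keyResolver, newKeyResolver, and get are hypothetical names, the mutex is this sketch's own choice, and caching empty results is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// keyResolver is a hypothetical stand-in for the part of Config that
// resolves API keys.
type keyResolver struct {
	apiKeys map[string]string   // values from the config file
	getenv  func(string) string // injected so tests can avoid the real environment
	mu      sync.Mutex
	cache   map[string]string
}

func newKeyResolver(apiKeys map[string]string, getenv func(string) string) *keyResolver {
	return &keyResolver{apiKeys: apiKeys, getenv: getenv, cache: map[string]string{}}
}

// get checks the cache, then the config map, then the environment variable.
// Assumption: every resolved value, including "", is cached per service.
func (r *keyResolver) get(service, envVar string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if v, ok := r.cache[service]; ok {
		return v
	}
	v := r.apiKeys[service]
	if v == "" {
		v = r.getenv(envVar)
	}
	r.cache[service] = v
	return v
}

func main() {
	r := newKeyResolver(map[string]string{"openai": "key-from-config"}, os.Getenv)
	fmt.Println(r.get("openai", "OPENAI_API_KEY") != "")
}
```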

func (*Config) LLMProvider

func (c *Config) LLMProvider() string

LLMProvider returns the provider to use for LLM (llm_provider if set, else provider).
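
The same "per-task override, else shared default" rule applies to the STT, LLM, and TTS accessors. A minimal sketch, with the hypothetical names providers and orDefault:

```go
package main

import "fmt"

// providers is a hypothetical subset of Config holding the four provider fields.
type providers struct {
	Provider    string // default for all tasks
	SttProvider string
	LlmProvider string
	TtsProvider string
}

// orDefault returns override when it is set, else def.
func orDefault(override, def string) string {
	if override != "" {
		return override
	}
	return def
}

func (p providers) stt() string { return orDefault(p.SttProvider, p.Provider) }
func (p providers) llm() string { return orDefault(p.LlmProvider, p.Provider) }
func (p providers) tts() string { return orDefault(p.TtsProvider, p.Provider) }

func main() {
	p := providers{Provider: "openai", LlmProvider: "groq"}
	fmt.Println(p.stt(), p.llm(), p.tts())
}
```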

func (*Config) LogFormatOrDefault

func (c *Config) LogFormatOrDefault() bool

LogFormatOrDefault returns whether JSON log format is enabled (from JSONLogs).

func (*Config) LogLevelOrDefault

func (c *Config) LogLevelOrDefault() string

LogLevelOrDefault returns the configured log level, or "info" if unset.

func (*Config) MetricsEnabledOrDefault

func (c *Config) MetricsEnabledOrDefault() bool

MetricsEnabledOrDefault reports whether Prometheus metrics should be recorded and exported. When c or c.MetricsEnabled is nil, it defaults to true (metrics on by default).
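
A *bool field lets the JSON distinguish "omitted" from "explicitly false", which a plain bool cannot do. The sketch below shows the three cases; metricsToggle, enabledOrDefault, and decodeToggle are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metricsToggle is a hypothetical one-field subset of Config.
type metricsToggle struct {
	MetricsEnabled *bool `json:"metrics_enabled,omitempty"`
}

// enabledOrDefault is safe on a nil receiver and treats a nil pointer as true.
func (m *metricsToggle) enabledOrDefault() bool {
	if m == nil || m.MetricsEnabled == nil {
		return true
	}
	return *m.MetricsEnabled
}

// decodeToggle parses a JSON document into a metricsToggle.
func decodeToggle(doc string) (*metricsToggle, error) {
	var m metricsToggle
	if err := json.Unmarshal([]byte(doc), &m); err != nil {
		return nil, err
	}
	return &m, nil
}

func main() {
	for _, doc := range []string{`{}`, `{"metrics_enabled": false}`, `{"metrics_enabled": true}`} {
		m, err := decodeToggle(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(doc, "->", m.enabledOrDefault())
	}
}
```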

func (*Config) PublicPipelineInfo

func (c *Config) PublicPipelineInfo() PublicPipelineInfo

PublicPipelineInfo returns non-sensitive pipeline labels for UI display.

func (*Config) STTProvider

func (c *Config) STTProvider() string

STTProvider returns the provider to use for STT (stt_provider if set, else provider).

func (*Config) TTSProvider

func (c *Config) TTSProvider() string

TTSProvider returns the provider to use for TTS (tts_provider if set, else provider).

func (*Config) TurnEnabled

func (c *Config) TurnEnabled() bool

TurnEnabled returns true when turn detection is set to "silence".

func (*Config) VADBackendOrDefault

func (c *Config) VADBackendOrDefault() string

VADBackendOrDefault returns the configured VAD backend, defaulting to "energy".

func (*Config) VADParams

func (c *Config) VADParams() (p struct {
	Confidence float64
	StartSecs  float64
	StopSecs   float64
	MinVolume  float64
})

VADParams returns a simple struct with the configured VAD parameters. Zero-values are allowed; the consumer (audio/vad) applies its own defaults.
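
Because zero values pass through unchanged, the consumer has to fill in defaults. The sketch below shows one way that could work, using the default values listed in the Config field comments (0.7, 0.2, 0.2, 0.6). The names vadParams and withDefaults are hypothetical, and the real audio/vad logic may differ.

```go
package main

import "fmt"

// vadParams mirrors the field set of the struct returned by VADParams.
type vadParams struct {
	Confidence float64
	StartSecs  float64
	StopSecs   float64
	MinVolume  float64
}

// withDefaults replaces zero fields with the defaults named in the
// Config field comments. Assumption: zero always means "unset".
func withDefaults(p vadParams) vadParams {
	if p.Confidence == 0 {
		p.Confidence = 0.7
	}
	if p.StartSecs == 0 {
		p.StartSecs = 0.2
	}
	if p.StopSecs == 0 {
		p.StopSecs = 0.2
	}
	if p.MinVolume == 0 {
		p.MinVolume = 0.6
	}
	return p
}

func main() {
	fmt.Printf("%+v\n", withDefaults(vadParams{StopSecs: 0.8}))
}
```

A side effect of this approach is that a caller cannot request an explicit zero, which is a known trade-off of zero-means-unset config fields.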

type MCPConfig

type MCPConfig struct {
	Command     string   `json:"command"`                // executable (e.g. "npx", "go")
	Args        []string `json:"args,omitempty"`         // arguments (e.g. ["-y", "mcp-server"] or ["run", "server.go"])
	ToolsFilter []string `json:"tools_filter,omitempty"` // if non-empty, only these tool names are registered
}

MCPConfig configures an MCP server connection (stdio: command + args). Used to register MCP tools with the LLM.
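
The ToolsFilter rule (register everything when the filter is empty, otherwise only the named tools) can be sketched as follows. filterTools is a hypothetical helper and the tool names are made up for illustration.

```go
package main

import "fmt"

// filterTools returns the tools to register. An empty filter keeps all
// available tools; otherwise only names present in the filter are kept,
// in their original order.
func filterTools(available, filter []string) []string {
	if len(filter) == 0 {
		return available
	}
	allowed := make(map[string]struct{}, len(filter))
	for _, name := range filter {
		allowed[name] = struct{}{}
	}
	var kept []string
	for _, name := range available {
		if _, ok := allowed[name]; ok {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	available := []string{"search", "weather", "calendar"} // hypothetical tool names
	fmt.Println(filterTools(available, nil))
	fmt.Println(filterTools(available, []string{"weather", "missing"}))
}
```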

type MetricsConfig

type MetricsConfig struct {
	// Future: AudioSampleRate is reserved for per-chunk metric sampling (0..1). Not yet wired; do not document until a consumer exists.
	AudioSampleRate float64 `json:"audio_sample_rate,omitempty"`
}

MetricsConfig holds optional metrics tuning.

type PublicPipelineInfo

type PublicPipelineInfo struct {
	SttProvider string `json:"stt_provider"`
	LlmProvider string `json:"llm_provider"`
	TtsProvider string `json:"tts_provider"`
	SttModel    string `json:"stt_model,omitempty"`
	SttLanguage string `json:"stt_language,omitempty"`
	LlmModel    string `json:"llm_model,omitempty"`
	TtsModel    string `json:"tts_model,omitempty"`
	TtsVoice    string `json:"tts_voice,omitempty"`
	VadType     string `json:"vad_type"`
	Transport   string `json:"transport"`
}

PublicPipelineInfo is safe to expose to browser clients (no API keys or secrets).

type RecordingConfig

type RecordingConfig struct {
	// Enable turns recording on for all sessions.
	Enable bool `json:"enable,omitempty"`
	// Bucket is the destination S3 bucket for recordings.
	Bucket string `json:"bucket,omitempty"`
	// BasePath is the key prefix within the bucket (e.g. "recordings/").
	BasePath string `json:"base_path,omitempty"`
	// Format is the file format/extension to use (e.g. "wav").
	Format string `json:"format,omitempty"`
	// WorkerCount is the number of async uploader workers (thread pool size).
	WorkerCount int `json:"worker_count,omitempty"`
	// QueueCap is the job queue capacity (default 32). SCALING: tune for S3 bandwidth and concurrent sessions.
	QueueCap int `json:"queue_cap,omitempty"`
	// MaxRetries is the number of retries for S3 upload on failure (default 3); exponential backoff between attempts.
	MaxRetries int `json:"max_retries,omitempty"`
}

RecordingConfig controls per-call/session audio recording.
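
The MaxRetries comment mentions exponential backoff between upload attempts. The sketch below computes such a schedule. The base delay, the doubling factor, and treating zero as "use the default of 3" are all assumptions; retriesOrDefault and backoffDelays are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// retriesOrDefault returns n, or 3 (the documented default) when n is
// not positive. Assumption: zero means "unset", not "never retry".
func retriesOrDefault(n int) int {
	if n <= 0 {
		return 3
	}
	return n
}

// backoffDelays returns the wait before each retry, doubling from base.
func backoffDelays(maxRetries int, base time.Duration) []time.Duration {
	if maxRetries <= 0 {
		return nil
	}
	delays := make([]time.Duration, 0, maxRetries)
	d := base
	for i := 0; i < maxRetries; i++ {
		delays = append(delays, d)
		d *= 2
	}
	return delays
}

func main() {
	cfgMaxRetries := 0 // as if max_retries were omitted from the JSON file
	// The 200ms base delay is an illustrative value, not taken from the package.
	for i, d := range backoffDelays(retriesOrDefault(cfgMaxRetries), 200*time.Millisecond) {
		fmt.Printf("retry %d after %v\n", i+1, d)
	}
}
```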

type TranscriptConfig

type TranscriptConfig struct {
	// Enable turns transcript logging on for all sessions.
	Enable bool `json:"enable,omitempty"`
	// Driver is the SQL driver name, e.g. "postgres" or "mysql".
	Driver string `json:"driver,omitempty"`
	// DSN is the driver-specific data source name / connection string.
	DSN string `json:"dsn,omitempty"`
	// TableName is the table to write transcript rows into (default "call_transcripts").
	TableName string `json:"table_name,omitempty"`
}

TranscriptConfig controls per-message transcript storage in a SQL database.
