Documentation ¶
Overview ¶
Package services defines interfaces and implementations for LLM, STT, and TTS services. These align conceptually with common LLM/STT/TTS service abstractions and with websocket/realtime session handling. Use the factory functions (NewLLMFromConfig, NewSTTFromConfig, NewTTSFromConfig) to construct services by provider name; see the Supported*Providers variables for the capability matrix and pkg/services/factory.go for provider wiring. To construct a RealtimeService, use realtime.NewFromConfig(cfg, provider) instead, which avoids an import cycle.
Index ¶
- Constants
- Variables
- func NewServicesFromConfig(cfg *config.Config) (LLMService, STTService, TTSService)
- type LLMService
- type LLMServiceWithTools
- type RealtimeConfig
- type RealtimeEvent
- type RealtimeService
- type RealtimeSession
- type STTService
- type STTStreamingService
- type TTSService
- type TTSStreamingService
- type ToolHandler
Constants ¶
const (
	ProviderOpenAI       = "openai"
	ProviderGroq         = "groq"
	ProviderSarvam       = "sarvam"
	ProviderGrok         = "grok"
	ProviderCerebras     = "cerebras"
	ProviderElevenLabs   = "elevenlabs"
	ProviderAWS          = "aws"
	ProviderMistral      = "mistral"
	ProviderDeepSeek     = "deepseek"
	ProviderAnthropic    = "anthropic"
	ProviderGoogle       = "google"
	ProviderGoogleVertex = "google_vertex"
	ProviderOllama       = "ollama"
	ProviderQwen         = "qwen"
	ProviderWhisper      = "whisper"
	// Pipecat-integrated providers
	ProviderAsyncAI   = "asyncai"
	ProviderCamb      = "camb"
	ProviderFish      = "fish"
	ProviderGradium   = "gradium"
	ProviderHume      = "hume"
	ProviderInworld   = "inworld"
	ProviderMinimax   = "minimax"
	ProviderMoondream = "moondream"
	ProviderNeuphonic = "neuphonic"
	ProviderOpenPipe  = "openpipe"
	ProviderSoniox    = "soniox"
	ProviderXTTS      = "xtts"
)
Variables ¶
var SupportedLLMProviders = []string{
	ProviderOpenAI,
	ProviderGroq,
	ProviderGrok,
	ProviderCerebras,
	ProviderAWS,
	ProviderMistral,
	ProviderDeepSeek,
	ProviderAnthropic,
	ProviderGoogle,
	ProviderGoogleVertex,
	ProviderOllama,
	ProviderQwen,
	ProviderAsyncAI,
	ProviderFish,
	ProviderInworld,
	ProviderMinimax,
	ProviderMoondream,
	ProviderOpenPipe,
}
SupportedLLMProviders lists provider keys that can be passed to NewLLMFromConfig.
var SupportedRealtimeProviders = []string{ProviderOpenAI, ProviderHume, ProviderInworld}
SupportedRealtimeProviders lists provider keys for realtime (use realtime.NewFromConfig to construct).
var SupportedSTTProviders = []string{
	ProviderOpenAI,
	ProviderGroq,
	ProviderSarvam,
	ProviderElevenLabs,
	ProviderAWS,
	ProviderGoogle,
	ProviderWhisper,
	ProviderCamb,
	ProviderGradium,
	ProviderSoniox,
}
SupportedSTTProviders lists provider keys that can be passed to NewSTTFromConfig.
var SupportedTTSProviders = []string{
	ProviderOpenAI,
	ProviderGroq,
	ProviderSarvam,
	ProviderElevenLabs,
	ProviderAWS,
	ProviderGoogle,
	ProviderHume,
	ProviderInworld,
	ProviderMinimax,
	ProviderNeuphonic,
	ProviderXTTS,
}
SupportedTTSProviders lists provider keys that can be passed to NewTTSFromConfig.
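Because the Supported*Providers variables are plain string slices, a caller can check a provider key before handing it to a factory function. The following is a minimal sketch of that check. The `supportedSTT` slice is a local, abbreviated stand-in for SupportedSTTProviders, and `isSupported` is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// supportedSTT is a local, abbreviated stand-in for SupportedSTTProviders.
var supportedSTT = []string{"openai", "groq", "sarvam", "elevenlabs"}

// isSupported reports whether key appears in the given provider list.
// It is a hypothetical helper written for this example.
func isSupported(list []string, key string) bool {
	for _, p := range list {
		if p == key {
			return true
		}
	}
	return false
}

func main() {
	for _, key := range []string{"groq", "anthropic"} {
		fmt.Printf("%s supported for STT: %v\n", key, isSupported(supportedSTT, key))
	}
}
```

Validating up front may be useful because the factory functions return only a service value and no error, so how an unknown key is handled is not visible from the signatures.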
Functions ¶
func NewServicesFromConfig ¶
func NewServicesFromConfig(cfg *config.Config) (LLMService, STTService, TTSService)
NewServicesFromConfig returns LLM, STT, and TTS services based on cfg. It resolves the provider per task (stt_provider, llm_provider, or tts_provider, otherwise provider) and uses the task-specific model or voice when set.
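One plausible reading of the per-task resolution is "use the task-specific key when it is set, otherwise the shared provider key". The sketch below illustrates that reading only. The `taskConfig` struct and `resolve` function are hypothetical; the real config.Config fields and the actual precedence rules are not shown in this documentation.

```go
package main

import "fmt"

// taskConfig is a hypothetical stand-in for the provider-related
// fields of config.Config; the real field names may differ.
type taskConfig struct {
	Provider    string // shared default ("provider")
	STTProvider string // task override ("stt_provider")
	LLMProvider string // task override ("llm_provider")
	TTSProvider string // task override ("tts_provider")
}

// resolve returns the task-specific provider when it is non-empty,
// and the shared fallback otherwise. This precedence is an assumption.
func resolve(taskProvider, fallback string) string {
	if taskProvider != "" {
		return taskProvider
	}
	return fallback
}

func main() {
	cfg := taskConfig{Provider: "openai", STTProvider: "groq"}
	fmt.Println("stt:", resolve(cfg.STTProvider, cfg.Provider))
	fmt.Println("llm:", resolve(cfg.LLMProvider, cfg.Provider))
	fmt.Println("tts:", resolve(cfg.TTSProvider, cfg.Provider))
}
```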
Types ¶
type LLMService ¶
type LLMService = llmapi.LLMService
LLMService provides chat completion; may stream text frames. Re-exported from llmapi.
func NewLLMFromConfig ¶
func NewLLMFromConfig(cfg *config.Config, provider, model string) LLMService
NewLLMFromConfig returns an LLMService for the given provider and model. Provider must be one of SupportedLLMProviders; model is the chat model (e.g. cfg.Model).
type LLMServiceWithTools ¶
type LLMServiceWithTools = llmapi.LLMServiceWithTools
LLMServiceWithTools is an LLM service that supports registering tools. Re-exported from llmapi.
type RealtimeConfig ¶
type RealtimeConfig struct {
Provider string // e.g. "openai"
Model string // e.g. "gpt-4o-realtime" or regular chat model
Voice string // TTS voice, if applicable
Tools []map[string]any // optional function calling tools
}
RealtimeConfig configures a realtime session for a given provider/model.
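As an illustration, a RealtimeConfig with one function-calling tool might be populated as follows. The struct is redeclared locally so the example compiles on its own. The tool schema (type, name, description, parameters), the `get_weather` tool, and the voice name are assumptions for this sketch; the shape a given provider expects is not specified here.

```go
package main

import "fmt"

// RealtimeConfig mirrors the struct documented above so that this
// example is self-contained.
type RealtimeConfig struct {
	Provider string
	Model    string
	Voice    string
	Tools    []map[string]any
}

// weatherTool builds a hypothetical tool definition. The key names
// used here are an assumed schema, not one confirmed by this package.
func weatherTool() map[string]any {
	return map[string]any{
		"type":        "function",
		"name":        "get_weather",
		"description": "Look up the current weather for a city.",
		"parameters": map[string]any{
			"type": "object",
			"properties": map[string]any{
				"city": map[string]any{"type": "string"},
			},
			"required": []string{"city"},
		},
	}
}

func main() {
	cfg := RealtimeConfig{
		Provider: "openai",
		Model:    "gpt-4o-realtime",
		Voice:    "alloy", // hypothetical voice name
		Tools:    []map[string]any{weatherTool()},
	}
	fmt.Printf("%s/%s with %d tool(s)\n", cfg.Provider, cfg.Model, len(cfg.Tools))
}
```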
type RealtimeEvent ¶
type RealtimeEvent struct {
Text *frames.LLMTextFrame
Audio *frames.TTSAudioRawFrame
Frame frames.Frame
}
RealtimeEvent represents a high-level event emitted by a realtime session. It can carry LLM text, TTS audio, or generic frames for extensibility.
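A consumer typically needs to work out which of the three fields an event carries. The sketch below checks them in order with a switch. The frame types are local stand-ins with hypothetical fields (`Text`, `Audio`), since the real definitions live in the frames package and are not shown here. Whether more than one field can be set on a single event is also not documented, so the ordering is an assumption.

```go
package main

import "fmt"

// Stand-ins for the frames package types; field names are hypothetical.
type Frame interface{}
type LLMTextFrame struct{ Text string }
type TTSAudioRawFrame struct{ Audio []byte }

// RealtimeEvent mirrors the struct documented above.
type RealtimeEvent struct {
	Text  *LLMTextFrame
	Audio *TTSAudioRawFrame
	Frame Frame
}

// describe reports which payload an event carries, checking Text,
// then Audio, then the generic Frame.
func describe(ev RealtimeEvent) string {
	switch {
	case ev.Text != nil:
		return "text: " + ev.Text.Text
	case ev.Audio != nil:
		return fmt.Sprintf("audio: %d bytes", len(ev.Audio.Audio))
	case ev.Frame != nil:
		return fmt.Sprintf("frame: %T", ev.Frame)
	default:
		return "empty"
	}
}

func main() {
	events := []RealtimeEvent{
		{Text: &LLMTextFrame{Text: "hello"}},
		{Audio: &TTSAudioRawFrame{Audio: []byte{0, 1, 2}}},
		{Frame: "custom"},
		{},
	}
	for _, ev := range events {
		fmt.Println(describe(ev))
	}
}
```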
type RealtimeService ¶
type RealtimeService interface {
NewSession(ctx context.Context, cfg RealtimeConfig) (RealtimeSession, error)
}
RealtimeService creates realtime sessions.
type RealtimeSession ¶
type RealtimeSession interface {
// SendText sends text input into the session (e.g. user message).
SendText(ctx context.Context, text string) error
// SendAudio sends raw audio input into the session (e.g. microphone audio).
SendAudio(ctx context.Context, audio []byte, sampleRate, numChannels int) error
// Events returns a channel of high-level events from the session.
Events() <-chan RealtimeEvent
// Close terminates the session and closes the events channel.
Close(ctx context.Context) error
}
RealtimeSession is a bidirectional, long-lived conversation with an AI service.
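The session lifecycle (send input, read events, close) can be sketched with an in-memory fake. Everything below is illustrative: the interface is trimmed (SendAudio is omitted), RealtimeEvent is reduced to a string payload, and `echoSession` is a hypothetical implementation that simply echoes text back. The one documented behaviour it relies on is that Close closes the events channel, which ends the range loop.

```go
package main

import (
	"context"
	"fmt"
)

// RealtimeEvent is reduced to a plain text payload for this sketch.
type RealtimeEvent struct{ Text string }

// RealtimeSession is a trimmed copy of the documented interface.
type RealtimeSession interface {
	SendText(ctx context.Context, text string) error
	Events() <-chan RealtimeEvent
	Close(ctx context.Context) error
}

// echoSession is a hypothetical in-memory session that echoes input.
type echoSession struct{ events chan RealtimeEvent }

func newEchoSession() *echoSession {
	return &echoSession{events: make(chan RealtimeEvent, 8)}
}

func (s *echoSession) SendText(ctx context.Context, text string) error {
	select {
	case s.events <- RealtimeEvent{Text: "echo: " + text}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func (s *echoSession) Events() <-chan RealtimeEvent { return s.events }

// Close closes the events channel, as the interface comment describes.
func (s *echoSession) Close(ctx context.Context) error {
	close(s.events)
	return nil
}

// collect sends each input, closes the session, then drains the
// buffered events until the channel is closed.
func collect(ctx context.Context, s RealtimeSession, inputs []string) ([]string, error) {
	for _, in := range inputs {
		if err := s.SendText(ctx, in); err != nil {
			return nil, err
		}
	}
	if err := s.Close(ctx); err != nil {
		return nil, err
	}
	var out []string
	for ev := range s.Events() {
		out = append(out, ev.Text)
	}
	return out, nil
}

// demo runs collect against the fake session.
func demo() ([]string, error) {
	return collect(context.Background(), newEchoSession(), []string{"hi", "bye"})
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

With a real provider, events would normally be read in a separate goroutine while input is still being sent; the sequential send-then-drain order here works only because the fake buffers its events.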
type STTService ¶
type STTService interface {
Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]*frames.TranscriptionFrame, error)
}
STTService transcribes audio to text (transcription frames).
func NewSTTFromConfig ¶
func NewSTTFromConfig(cfg *config.Config, provider string) STTService
NewSTTFromConfig returns an STTService for the given provider. Provider must be one of SupportedSTTProviders; cfg.STTModel is used when supported (e.g. Groq).
type STTStreamingService ¶
type STTStreamingService interface {
STTService
// TranscribeStream sends transcription frames (interim and final) to outCh as audio is received on audioCh.
TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- frames.Frame)
}
STTStreamingService optionally supports streaming transcription (interim + final frames).
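Since streaming is optional, a caller holding an STTService can probe for the streaming capability with a type assertion and fall back to batch transcription otherwise. The sketch below shows the pattern with simplified stand-ins: the interfaces use plain strings instead of frame types, and `batchOnly`, `streaming`, and `mode` are hypothetical names introduced for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// Simplified stand-ins: the documented methods use frame types from
// the frames package; plain strings are used here instead.
type STTService interface {
	Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]string, error)
}

type STTStreamingService interface {
	STTService
	TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- string)
}

// batchOnly is a hypothetical service with no streaming support.
type batchOnly struct{}

func (batchOnly) Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]string, error) {
	return []string{fmt.Sprintf("final (%d bytes)", len(audio))}, nil
}

// streaming is a hypothetical service that adds TranscribeStream.
type streaming struct{ batchOnly }

func (streaming) TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- string) {
	for chunk := range audioCh {
		select {
		case outCh <- fmt.Sprintf("interim (%d bytes)", len(chunk)):
		case <-ctx.Done():
			return
		}
	}
}

// mode reports which path a caller would take for svc.
func mode(svc STTService) string {
	if _, ok := svc.(STTStreamingService); ok {
		return "streaming"
	}
	return "batch"
}

func main() {
	fmt.Println(mode(batchOnly{}))
	fmt.Println(mode(streaming{}))
}
```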
type TTSService ¶
type TTSService interface {
Speak(ctx context.Context, text string, sampleRate int) ([]*frames.TTSAudioRawFrame, error)
}
TTSService converts text to speech (audio frames).
func NewTTSFromConfig ¶
func NewTTSFromConfig(cfg *config.Config, provider, model, voice string) TTSService
NewTTSFromConfig returns a TTSService for the given provider, model, and voice. Provider must be one of SupportedTTSProviders; model and voice are typically cfg.TTSModel and cfg.TTSVoice.
type TTSStreamingService ¶
type TTSStreamingService interface {
TTSService
// SpeakStream streams TTS audio frames to outCh as they are produced.
SpeakStream(ctx context.Context, text string, sampleRate int, outCh chan<- frames.Frame)
}
TTSStreamingService optionally supports streaming TTS (incremental audio to outCh).
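A caller of SpeakStream has to read from outCh while audio is being produced. The sketch below runs a fake SpeakStream in a goroutine and counts the chunks received. It assumes that SpeakStream blocks until it has finished sending and does not close outCh itself, so the caller closes the channel; the documentation does not state who owns the channel, so treat that as an assumption. `fakeTTS`, `audioChunk`, and `streamChunks` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// audioChunk stands in for frames.TTSAudioRawFrame.
type audioChunk struct{ PCM []byte }

// fakeTTS is a hypothetical streaming TTS that emits one chunk per word.
type fakeTTS struct{}

// SpeakStream sends chunks until the text is exhausted or ctx is
// cancelled. It does not close outCh (assumed ownership: the caller).
func (fakeTTS) SpeakStream(ctx context.Context, text string, sampleRate int, outCh chan<- audioChunk) {
	for _, word := range strings.Fields(text) {
		select {
		case outCh <- audioChunk{PCM: []byte(word)}:
		case <-ctx.Done():
			return
		}
	}
}

// streamChunks runs SpeakStream in a goroutine and counts the chunks
// that arrive on the channel.
func streamChunks(text string) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	outCh := make(chan audioChunk)
	go func() {
		defer close(outCh) // the caller owns the channel in this sketch
		fakeTTS{}.SpeakStream(ctx, text, 24000, outCh)
	}()

	n := 0
	for range outCh {
		n++
	}
	return n
}

func main() {
	fmt.Println("chunks:", streamChunks("hello streaming world"))
}
```

Closing the channel from the goroutine that calls SpeakStream keeps the close on the sending side, which avoids a send on a closed channel.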
type ToolHandler ¶
type ToolHandler = llmapi.ToolHandler
ToolHandler is called when the LLM requests a tool call. Re-exported from llmapi.
Directories ¶
| Path | Synopsis |
|---|---|
| camb | Package camb provides Camb AI speech-to-text. |
| cerebras | Package cerebras provides Cerebras inference API-backed LLM via OpenAI-compatible API. |
| deepseek | Package deepseek provides DeepSeek-backed LLM via OpenAI-compatible API. |
| google | Package google provides Google Gemini LLM, Vertex AI LLM, and Google Cloud STT/TTS services. |
| gradium | Package gradium provides Gradium speech-to-text (WebSocket or REST). |
| grok | Package grok provides xAI Grok-backed LLM via OpenAI-compatible API. |
| groq | Package groq provides Groq-backed LLM, STT, and TTS via OpenAI-compatible API. |
| hume | Package hume provides Hume (Hume AI) text-to-speech. |
| inworld | Package inworld provides Inworld text-to-speech (and LLM). |
| llmapi | Package llmapi defines LLM and tool-calling interfaces so that implementers (e.g. |
| minimax | Package minimax provides Minimax text-to-speech. |
| mistral | Package mistral provides Mistral AI-backed LLM via OpenAI-compatible API. |
| mock | Package mock provides mock STT, LLM, and TTS services for testing and stress testing without calling real APIs. |
| neuphonic | Package neuphonic provides Neuphonic text-to-speech (HTTP SSE streaming). |
| ollama | Package ollama provides Ollama-backed LLM via OpenAI-compatible API (localhost or custom base URL). |
| openai | Package openai provides OpenAI-based LLM (and optionally STT/TTS) for Voxray. |
| openpipe | Package openpipe provides OpenPipe-backed LLM via OpenAI-compatible API. |
| qwen | Package qwen provides Alibaba DashScope Qwen LLM via OpenAI-compatible API. |
| sarvam | Package sarvam provides Sarvam AI TTS and STT service implementations. |
| soniox | Package soniox provides Soniox speech-to-text (WebSocket API used for batch Transcribe). |
| stt | Package stt provides STT service implementations (OpenAI Whisper, Groq Whisper). |
| tts | Package tts provides TTS service implementations (OpenAI TTS, Groq TTS). |
| whisper | Package whisper provides Whisper API-backed STT (OpenAI or self-hosted compatible) with configurable base URL. |
| xtts | Package xtts provides Coqui XTTS text-to-speech via local streaming server. |