omnivoice

v0.11.1
Published: Jun 21, 2026 License: MIT Imports: 14 Imported by: 4

README

OmniVoice

Batteries-included voice pipeline framework for Go. This package provides a unified interface for speech-to-text (STT) and text-to-speech (TTS) with all providers included.

For a minimal dependency footprint, use omnivoice-core instead.

Voice Architecture

OmniVoice supports two approaches for real-time voice:

| Approach | Latency | Use Case |
|----------|---------|----------|
| Traditional Pipeline (STT→LLM→TTS) | 500-1500ms | Custom voices, domain-specific STT |
| Native Voice-to-Voice | 100-200ms | Low latency, natural conversation |

This package provides the Traditional Pipeline components. For native voice-to-voice, see the realtime providers listed under Included Providers below.

See the Voice Architecture Guide for a detailed comparison.

Features

  • 🎯 Unified Interface: Single API for all STT and TTS providers
  • 🗂️ Provider Registry: Get providers by name - no need to import individual provider packages
  • 🔌 Multiple Providers: OpenAI, Deepgram, ElevenLabs, Twilio, Telnyx
  • Streaming Support: Real-time transcription and synthesis
  • 🚀 Easy Integration: Import and use with minimal configuration

Installation

go get github.com/plexusone/omnivoice

CLI

OmniVoice includes a command-line tool for transcription.

Install CLI

go install github.com/plexusone/omnivoice/cmd/omnivoice@latest

Usage

# Set your API key
export DEEPGRAM_API_KEY="your-api-key"

# Basic transcription (stdout)
omnivoice transcribe podcast.mp3

# Save to file
omnivoice transcribe -p deepgram -o transcript.txt podcast.mp3

# JSON output with full metadata (OmniVoice Transcript format)
omnivoice transcribe -p deepgram --diarize --timestamps -f json -o transcript.json podcast.mp3

# Generate SRT subtitles
omnivoice transcribe -p deepgram -f srt -o subtitles.srt podcast.mp3

# Generate WebVTT subtitles
omnivoice transcribe -p deepgram -f vtt -o subtitles.vtt podcast.mp3

# List available providers
omnivoice providers list

Output Formats

| Format | Description |
|--------|-------------|
| text | Plain transcript text (default) |
| json | OmniVoice Transcript format with full metadata |
| srt | SubRip subtitles |
| vtt | WebVTT subtitles |

Environment Variables

| Variable | Provider |
|----------|----------|
| DEEPGRAM_API_KEY | Deepgram |
| OPENAI_API_KEY | OpenAI |
| ELEVENLABS_API_KEY | ElevenLabs |

Quick Start (Library)

import (
    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all" // Register all providers
)

Usage

package main

import (
    "context"
    "log"
    "os"

    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all"
)

func main() {
    ctx := context.Background()

    // Get providers by name using the registry
    sttProvider, err := omnivoice.GetSTTProvider("deepgram",
        omnivoice.WithAPIKey(os.Getenv("DEEPGRAM_API_KEY")))
    if err != nil {
        log.Fatal(err)
    }

    ttsProvider, err := omnivoice.GetTTSProvider("elevenlabs",
        omnivoice.WithAPIKey(os.Getenv("ELEVENLABS_API_KEY")))
    if err != nil {
        log.Fatal(err)
    }

    // Transcribe audio
    result, err := sttProvider.TranscribeFile(ctx, "audio.mp3", omnivoice.TranscriptionConfig{
        Language:             "en",
        EnableWordTimestamps: true,
    })
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("Transcription: %s", result.Text)

    // Synthesize speech
    audio, err := ttsProvider.Synthesize(ctx, "Hello, world!", omnivoice.SynthesisConfig{
        VoiceID: "pNInz6obpgDQGcFmaJgB", // Adam
    })
    if err != nil {
        log.Fatal(err)
    }
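    // audio.Audio contains the audio bytes; using it here also keeps the
    // example compiling, since Go rejects unused variables.
    log.Printf("Synthesized %d bytes of audio", len(audio.Audio))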
}

Provider Registry

Get providers by name at runtime - no need to import individual provider packages:

// STT/TTS providers: "openai", "elevenlabs", "deepgram", "twilio"
ttsProvider, _ := omnivoice.GetTTSProvider("elevenlabs", omnivoice.WithAPIKey(key))
sttProvider, _ := omnivoice.GetSTTProvider("deepgram", omnivoice.WithAPIKey(key))

// Realtime providers: "openai", "gemini"
rtProvider, _ := omnivoice.GetRealtimeProvider("openai", omnivoice.WithAPIKey(key))

// Gateway providers: "twilio", "telnyx"
gateway, _ := omnivoice.GetGatewayProvider("twilio", omnivoice.WithAccountSID(sid), omnivoice.WithAuthToken(token))

// List registered providers
fmt.Println(omnivoice.ListTTSProviders())      // [openai elevenlabs deepgram twilio]
fmt.Println(omnivoice.ListSTTProviders())      // [openai elevenlabs deepgram twilio]
fmt.Println(omnivoice.ListRealtimeProviders()) // [openai gemini]
fmt.Println(omnivoice.ListGatewayProviders())  // [twilio telnyx]

// Realtime factory API (for gateway integration)
factory, _ := omnivoice.GetRealtimeFactory("openai")
fmt.Println(omnivoice.ListRealtimeFactories()) // [openai gemini]

Language Codes

OmniVoice accepts language codes in BCP-47 format, which includes ISO 639-1 two-letter codes and regional variants.

Common codes:

| Code | Language |
|------|----------|
| en | English |
| en-US | English (US) |
| en-GB | English (UK) |
| es | Spanish |
| es-MX | Spanish (Mexico) |
| fr | French |
| de | German |
| it | Italian |
| pt | Portuguese |
| pt-BR | Portuguese (Brazil) |
| ja | Japanese |
| ko | Korean |
| zh | Chinese |
| zh-CN | Chinese (Simplified) |
| zh-TW | Chinese (Traditional) |
| ar | Arabic |
| hi | Hindi |
| ru | Russian |

Notes:

  • Use simple codes (en) for broad compatibility across providers
  • Use regional variants (en-US) when accent/dialect matters for TTS
  • Provider support varies; see provider documentation for full language lists
  • STT providers generally support automatic language detection when no code is specified
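The first two notes can be turned into a small helper. The following is a minimal sketch, not part of OmniVoice: `baseLanguage`, `pickLanguage`, and the `supported` set are hypothetical names, and the support list is made up for illustration. It keeps a regional tag when the provider is known to accept it and otherwise falls back to the simple code.

```go
package main

import (
	"fmt"
	"strings"
)

// baseLanguage returns the primary language subtag of a BCP-47 tag in
// lowercase, for example "en-US" -> "en" and "ja" -> "ja".
func baseLanguage(tag string) string {
	if i := strings.IndexByte(tag, '-'); i >= 0 {
		tag = tag[:i]
	}
	return strings.ToLower(tag)
}

// pickLanguage keeps the full tag when the (hypothetical) supported set
// contains it, and otherwise falls back to the simple language code.
func pickLanguage(tag string, supported map[string]bool) string {
	if supported[tag] {
		return tag
	}
	return baseLanguage(tag)
}

func main() {
	// Illustrative support list only; consult the provider documentation.
	supported := map[string]bool{"en": true, "en-US": true, "es": true}

	fmt.Println(pickLanguage("en-US", supported)) // regional tag is kept
	fmt.Println(pickLanguage("es-MX", supported)) // falls back to "es"
}
```

The result would then be passed as the `Language` field of a `TranscriptionConfig`, as in the usage example earlier in this README.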

Included Providers

STT/TTS Providers

| Provider | STT | TTS | Registry Name |
|----------|-----|-----|---------------|
| OpenAI | Whisper | TTS-1/TTS-1-HD | "openai" |
| ElevenLabs | Scribe | Multilingual v2 | "elevenlabs" |
| Deepgram | Nova-2 | Aura | "deepgram" |
| Twilio | Media Streams | Media Streams | "twilio" |

Native Voice-to-Voice (Realtime)

| Provider | Latency | Registry Name |
|----------|---------|---------------|
| OpenAI Realtime | ~100ms | "openai" |
| Gemini Live | ~200ms | "gemini" |

Voice Gateway

| Provider | Registry Name |
|----------|---------------|
| Twilio | "twilio" |
| Telnyx | "telnyx" |

License

MIT License - see LICENSE for details.

Documentation

Overview

Package omnivoice provides a unified interface for speech-to-text and text-to-speech. This is the batteries-included package that imports all providers. For a minimal dependency footprint, use github.com/plexusone/omnivoice-core instead.

Quick Start

Import the package with all providers:

import (
    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all"
)

Or import specific providers:

import (
    "github.com/plexusone/omnivoice"
    openai "github.com/plexusone/omni-openai/omnivoice"
)

Creating Providers

// OpenAI provider
sttProvider := openai.NewSTTProvider(apiKey)
ttsProvider := openai.NewTTSProvider(apiKey)

// Create multi-provider client
sttClient := omnivoice.NewSTTClient(sttProvider)
ttsClient := omnivoice.NewTTSClient(ttsProvider)

Constants

const (
	StatusRinging  = callsystem.StatusRinging
	StatusAnswered = callsystem.StatusAnswered
	StatusEnded    = callsystem.StatusEnded
	StatusFailed   = callsystem.StatusFailed
	StatusBusy     = callsystem.StatusBusy
	StatusNoAnswer = callsystem.StatusNoAnswer
	CallInbound    = callsystem.Inbound
	CallOutbound   = callsystem.Outbound
)

Re-export CallStatus constants

const (
	GatewayProviderTwilio  = gateway.ProviderTwilio
	GatewayProviderTelnyx  = gateway.ProviderTelnyx
	GatewayProviderVonage  = gateway.ProviderVonage
	GatewayProviderPlivo   = gateway.ProviderPlivo
	GatewayProviderLiveKit = gateway.ProviderLiveKit
)

Re-export Gateway provider name constants

const (
	PipelineModeText     = gateway.PipelineModeText
	PipelineModeRealtime = gateway.PipelineModeRealtime
)

Re-export Pipeline mode constants

const (
	EventSessionStarted   = gateway.EventSessionStarted
	EventSessionEnded     = gateway.EventSessionEnded
	EventUserSpeechStart  = gateway.EventUserSpeechStart
	EventUserSpeechEnd    = gateway.EventUserSpeechEnd
	EventUserTranscript   = gateway.EventUserTranscript
	EventAgentThinking    = gateway.EventAgentThinking
	EventAgentSpeechStart = gateway.EventAgentSpeechStart
	EventAgentSpeechEnd   = gateway.EventAgentSpeechEnd
	EventAgentTranscript  = gateway.EventAgentTranscript
	EventToolCall         = gateway.EventToolCall
	EventInterruption     = gateway.EventInterruption
	EventError            = gateway.EventError
	EventAudioReceived    = gateway.EventAudioReceived
	EventAudioSent        = gateway.EventAudioSent
)

Re-export Gateway event type constants

const (
	// PriorityThin is the priority for thin (stdlib-only) provider implementations.
	PriorityThin = core.PriorityThin

	// PriorityThick is the priority for thick (official SDK) provider implementations.
	PriorityThick = core.PriorityThick
)

Re-export priority constants from omnivoice-core.

const (
	SubtitleFormatSRT = subtitle.FormatSRT
	SubtitleFormatVTT = subtitle.FormatVTT
)

Subtitle format constants.

const TranscriptFormatVersion = stt.TranscriptFormatVersion

TranscriptFormatVersion is the current version of the OmniVoice transcript format.

const TranscriptSchemaURL = stt.TranscriptSchemaURL

TranscriptSchemaURL is the JSON Schema URL for the transcript format.

const Version = "0.10.0"

Version is the current version of the omnivoice package.

Variables

var (
	WithFrom             = callsystem.WithFrom
	WithTimeout          = callsystem.WithTimeout
	WithMachineDetection = callsystem.WithMachineDetection
	WithRecording        = callsystem.WithRecording
	WithWhisper          = callsystem.WithWhisper
	WithAgent            = callsystem.WithAgent
	WithStatusCallback   = callsystem.WithStatusCallback
)

Re-export CallOption functions

var (
	AudioFormatTwilio       = gateway.AudioFormatTwilio
	AudioFormatTelnyx       = gateway.AudioFormatTelnyx
	AudioFormatOpenAI       = gateway.AudioFormatOpenAI
	AudioFormatGeminiInput  = gateway.AudioFormatGeminiInput
	AudioFormatGeminiOutput = gateway.AudioFormatGeminiOutput
)

var (
	// Core options
	WithAPIKey    = registry.WithAPIKey
	WithBaseURL   = registry.WithBaseURL
	WithExtension = registry.WithExtension

	// CallSystem options
	WithAccountSID  = registry.WithAccountSID
	WithAuthToken   = registry.WithAuthToken
	WithPhoneNumber = registry.WithPhoneNumber
	WithWebhookURL  = registry.WithWebhookURL
	WithRegion      = registry.WithRegion

	// Gateway options
	WithListener     = registry.WithListener
	WithPublicURL    = registry.WithPublicURL
	WithListenAddr   = registry.WithListenAddr
	WithConnectionID = registry.WithConnectionID

	// Realtime options
	WithVoice        = registry.WithVoice
	WithModel        = registry.WithModel
	WithInstructions = registry.WithInstructions

	// Pipeline options
	WithSTTProvider     = registry.WithSTTProvider
	WithSTTAPIKey       = registry.WithSTTAPIKey
	WithSTTModel        = registry.WithSTTModel
	WithSTTLanguage     = registry.WithSTTLanguage
	WithTTSProvider     = registry.WithTTSProvider
	WithTTSAPIKey       = registry.WithTTSAPIKey
	WithTTSVoiceID      = registry.WithTTSVoiceID
	WithTTSModel        = registry.WithTTSModel
	WithLLMProvider     = registry.WithLLMProvider
	WithLLMAPIKey       = registry.WithLLMAPIKey
	WithLLMModel        = registry.WithLLMModel
	WithLLMSystemPrompt = registry.WithLLMSystemPrompt

	// Session options
	WithGreeting           = registry.WithGreeting
	WithMaxSessionDuration = registry.WithMaxSessionDuration
	WithInterruptionMode   = registry.WithInterruptionMode
	WithLogger             = registry.WithLogger
	WithPipelineMode       = registry.WithPipelineMode
)

Re-export registry option functions from omnivoice-core.

var (
	ErrNoAvailableProvider   = stt.ErrNoAvailableProvider
	ErrStreamingNotSupported = stt.ErrStreamingNotSupported
	ErrInvalidAudio          = stt.ErrInvalidAudio
	ErrInvalidConfig         = stt.ErrInvalidConfig
	ErrAudioTooLong          = stt.ErrAudioTooLong
	ErrAudioTooShort         = stt.ErrAudioTooShort
	ErrRateLimited           = stt.ErrRateLimited
	ErrQuotaExceeded         = stt.ErrQuotaExceeded
	ErrUnsupportedLanguage   = stt.ErrUnsupportedLanguage
	ErrUnsupportedFormat     = stt.ErrUnsupportedFormat
	ErrStreamClosed          = stt.ErrStreamClosed
)

Re-export STT errors

var (
	// DefaultSubtitleOptions returns sensible defaults for subtitle generation.
	DefaultSubtitleOptions = subtitle.DefaultOptions

	// GenerateSRT generates SRT subtitles from a transcription result.
	GenerateSRT = subtitle.GenerateSRT

	// GenerateVTT generates WebVTT subtitles from a transcription result.
	GenerateVTT = subtitle.GenerateVTT

	// SaveSRT generates and saves SRT to a file.
	SaveSRT = subtitle.SaveSRT

	// SaveVTT generates and saves WebVTT to a file.
	SaveVTT = subtitle.SaveVTT
)

Re-export subtitle functions.

var (
	ErrTTSNoAvailableProvider = tts.ErrNoAvailableProvider
	ErrVoiceNotFound          = tts.ErrVoiceNotFound
	ErrTTSInvalidConfig       = tts.ErrInvalidConfig
	ErrTTSRateLimited         = tts.ErrRateLimited
	ErrTTSQuotaExceeded       = tts.ErrQuotaExceeded
	ErrTTSStreamClosed        = tts.ErrStreamClosed
)

Re-export TTS errors

var NewCallSystemClient = callsystem.NewClient

NewCallSystemClient creates a new CallSystem client with the given providers. The first provider becomes the primary by default.
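The primary-then-fallback idea can be sketched without the real package. Everything below is hypothetical: `dialer` is a much smaller interface than `callsystem.CallSystem`, and the actual client may choose or order providers differently. The sketch only illustrates trying the first provider and moving to the next one on failure.

```go
package main

import (
	"errors"
	"fmt"
)

// dialer is a hypothetical, reduced stand-in for a CallSystem provider.
type dialer interface {
	Name() string
	Dial(to string) error
}

// client holds providers in order; providers[0] acts as the primary.
type client struct{ providers []dialer }

func newClient(providers ...dialer) *client {
	return &client{providers: providers}
}

// Dial tries the primary first, then each fallback in order. It returns the
// name of the provider that placed the call, or the last error seen.
func (c *client) Dial(to string) (string, error) {
	err := errors.New("no providers configured")
	for _, p := range c.providers {
		dialErr := p.Dial(to)
		if dialErr == nil {
			return p.Name(), nil
		}
		err = fmt.Errorf("%s: %w", p.Name(), dialErr)
	}
	return "", err
}

// fake is a test double that either always fails or always succeeds.
type fake struct {
	name string
	fail bool
}

func (f fake) Name() string { return f.name }

func (f fake) Dial(string) error {
	if f.fail {
		return errors.New("unavailable")
	}
	return nil
}

func main() {
	c := newClient(fake{"twilio", true}, fake{"telnyx", false})
	name, err := c.Dial("+15550100")
	fmt.Println(name, err)
}
```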

var NewRealtimeClient = realtime.NewClient

Re-export Realtime functions

var NewSTTClient = stt.NewClient

Re-export STT functions

var NewTTSClient = tts.NewClient

Re-export TTS functions

Functions

func GetCallSystemProvider added in v0.7.0

func GetCallSystemProvider(name string, opts ...ProviderOption) (callsystem.CallSystem, error)

GetCallSystemProvider creates a CallSystem provider instance from the registry.

func GetCallSystemProviderPriority added in v0.10.0

func GetCallSystemProviderPriority(name string) int

GetCallSystemProviderPriority returns the priority of the registered CallSystem provider.

func GetGatewayProvider added in v0.10.0

func GetGatewayProvider(name string, opts ...ProviderOption) (registry.Gateway, error)

GetGatewayProvider creates a Gateway provider instance from the registry.

func GetGatewayProviderPriority added in v0.10.0

func GetGatewayProviderPriority(name string) int

GetGatewayProviderPriority returns the priority of the registered Gateway provider.

func GetRealtimeFactory added in v0.11.0

func GetRealtimeFactory(name string) (gateway.RealtimeProviderFactory, error)

GetRealtimeFactory returns a realtime provider factory by name. Supported providers:

  • "openai": OpenAI Realtime API (~100ms latency)
  • "gemini": Google Gemini Live API (~200ms latency)

Example:

factory, err := omnivoice.GetRealtimeFactory("openai")
if err != nil {
    log.Fatal(err)
}
// Use factory with gateway configuration

func GetRealtimeProvider added in v0.9.0

func GetRealtimeProvider(name string, opts ...ProviderOption) (registry.RealtimeProvider, error)

GetRealtimeProvider creates a Realtime provider instance from the registry.

Realtime providers enable native voice-to-voice conversations with low latency (~100-300ms). Available providers include:

  • "openai": OpenAI Realtime API (~100ms latency)
  • "gemini": Google Gemini Live API (~200ms latency)

Example:

provider, err := omnivoice.GetRealtimeProvider("openai",
    omnivoice.WithAPIKey(os.Getenv("OPENAI_API_KEY")))

func GetRealtimeProviderPriority added in v0.10.0

func GetRealtimeProviderPriority(name string) int

GetRealtimeProviderPriority returns the priority of the registered Realtime provider.

func GetSTTProvider

func GetSTTProvider(name string, opts ...ProviderOption) (stt.Provider, error)

GetSTTProvider creates an STT provider instance from the registry.

func GetSTTProviderPriority added in v0.10.0

func GetSTTProviderPriority(name string) int

GetSTTProviderPriority returns the priority of the registered STT provider.

func GetTTSProvider

func GetTTSProvider(name string, opts ...ProviderOption) (tts.Provider, error)

GetTTSProvider creates a TTS provider instance from the registry.

func GetTTSProviderPriority added in v0.10.0

func GetTTSProviderPriority(name string) int

GetTTSProviderPriority returns the priority of the registered TTS provider.

func HasCallSystemProvider added in v0.7.0

func HasCallSystemProvider(name string) bool

HasCallSystemProvider returns true if a CallSystem provider with the given name is registered.

func HasGatewayProvider added in v0.10.0

func HasGatewayProvider(name string) bool

HasGatewayProvider returns true if a Gateway provider with the given name is registered.

func HasRealtimeFactory added in v0.11.0

func HasRealtimeFactory(name string) bool

HasRealtimeFactory returns true if a realtime factory with the given name exists.

func HasRealtimeProvider added in v0.9.0

func HasRealtimeProvider(name string) bool

HasRealtimeProvider returns true if a Realtime provider with the given name is registered.

func HasSTTProvider

func HasSTTProvider(name string) bool

HasSTTProvider returns true if an STT provider with the given name is registered.

func HasTTSProvider

func HasTTSProvider(name string) bool

HasTTSProvider returns true if a TTS provider with the given name is registered.

func ListCallSystemProviders added in v0.7.0

func ListCallSystemProviders() []string

ListCallSystemProviders returns a list of all registered CallSystem provider names.

func ListGatewayProviders added in v0.10.0

func ListGatewayProviders() []string

ListGatewayProviders returns a list of all registered Gateway provider names.

func ListRealtimeFactories added in v0.11.0

func ListRealtimeFactories() []string

ListRealtimeFactories returns the names of all available realtime factories.

func ListRealtimeProviders added in v0.9.0

func ListRealtimeProviders() []string

ListRealtimeProviders returns a list of all registered Realtime provider names.

func ListSTTProviders

func ListSTTProviders() []string

ListSTTProviders returns a list of all registered STT provider names.

func ListTTSProviders

func ListTTSProviders() []string

ListTTSProviders returns a list of all registered TTS provider names.

func MustGetRealtimeFactory added in v0.11.0

func MustGetRealtimeFactory(name string) gateway.RealtimeProviderFactory

MustGetRealtimeFactory returns a realtime provider factory by name. It panics if the factory is not found.
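The Must variant follows a common Go convention: wrap a fallible getter and panic on error, which suits program start-up where a missing factory is a configuration bug. The sketch below shows the pattern with hypothetical stand-ins (`lookup`, `mustLookup`) that return strings instead of real factories.

```go
package main

import (
	"errors"
	"fmt"
)

// lookup is a hypothetical stand-in for a fallible getter such as
// GetRealtimeFactory. It returns a string instead of a real factory.
func lookup(name string) (string, error) {
	switch name {
	case "openai", "gemini":
		return name + "-factory", nil
	}
	return "", errors.New("unknown realtime factory: " + name)
}

// mustLookup panics instead of returning an error.
func mustLookup(name string) string {
	v, err := lookup(name)
	if err != nil {
		panic(err)
	}
	return v
}

func main() {
	fmt.Println(mustLookup("openai"))

	// A bad name panics; recover here only to keep the demo running.
	defer func() { fmt.Println("recovered:", recover()) }()
	mustLookup("nope")
}
```

In request-handling code, where the name may come from user input, the error-returning GetRealtimeFactory is usually the safer choice.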

func RegisterCallSystemProvider added in v0.7.0

func RegisterCallSystemProvider(name string, factory registry.CallSystemProviderFactory, priority int)

RegisterCallSystemProvider registers a CallSystem provider factory with the given name and priority. Higher priority values override lower priority registrations.

func RegisterGatewayProvider added in v0.10.0

func RegisterGatewayProvider(name string, factory registry.GatewayProviderFactory, priority int)

RegisterGatewayProvider registers a Gateway provider factory with the given name and priority. Higher priority values override lower priority registrations.

func RegisterRealtimeProvider added in v0.9.0

func RegisterRealtimeProvider(name string, factory registry.RealtimeProviderFactory, priority int)

RegisterRealtimeProvider registers a Realtime provider factory with the given name and priority. Higher priority values override lower priority registrations.

func RegisterSTTProvider

func RegisterSTTProvider(name string, factory registry.STTProviderFactory, priority int)

RegisterSTTProvider registers an STT provider factory with the given name and priority. Higher priority values override lower priority registrations.

func RegisterTTSProvider

func RegisterTTSProvider(name string, factory registry.TTSProviderFactory, priority int)

RegisterTTSProvider registers a TTS provider factory with the given name and priority. Higher priority values override lower priority registrations.
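The rule that higher priority values override lower ones can be illustrated with a toy registry. This is a sketch under stated assumptions, not the omnivoice-core implementation: the `registry` type, the numeric values of `priorityThin` and `priorityThick`, and the tie-breaking rule (the earlier registration wins on equal priority) are all hypothetical. It is also not safe for concurrent use.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical values; the real PriorityThin and PriorityThick may differ.
const (
	priorityThin  = 0
	priorityThick = 10
)

// factory is simplified to return a description instead of a provider.
type factory func() string

type entry struct {
	build    factory
	priority int
}

type registry struct{ entries map[string]entry }

func newRegistry() *registry {
	return &registry{entries: map[string]entry{}}
}

// Register stores f under name unless an entry with equal or higher
// priority already exists.
func (r *registry) Register(name string, f factory, priority int) {
	if cur, ok := r.entries[name]; ok && cur.priority >= priority {
		return
	}
	r.entries[name] = entry{build: f, priority: priority}
}

// Get builds the winning registration for name.
func (r *registry) Get(name string) (string, error) {
	e, ok := r.entries[name]
	if !ok {
		return "", fmt.Errorf("provider %q not registered", name)
	}
	return e.build(), nil
}

// List returns registered names in sorted order.
func (r *registry) List() []string {
	names := make([]string, 0, len(r.entries))
	for n := range r.entries {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	r := newRegistry()
	r.Register("deepgram", func() string { return "deepgram (official SDK)" }, priorityThick)
	r.Register("deepgram", func() string { return "deepgram (stdlib only)" }, priorityThin)
	r.Register("openai", func() string { return "openai (stdlib only)" }, priorityThin)

	got, _ := r.Get("deepgram")
	fmt.Println(got, r.List())
}
```

With this rule the outcome does not depend on registration order, which matters when registrations run from package `init` functions.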

func WithGatewayRealtimeConfig added in v0.11.0

func WithGatewayRealtimeConfig(config *gateway.RealtimeConfig) registry.ProviderOption

WithGatewayRealtimeConfig returns a provider option that configures realtime settings. This is used for native voice-to-voice mode with gateways.

func WithGatewayToolHandlers added in v0.11.0

func WithGatewayToolHandlers(provider string, handlers map[string]ToolHandler) registry.ProviderOption

WithGatewayToolHandlers returns a provider option that configures tool handlers for a gateway. The provider parameter specifies which gateway provider to configure ("twilio" or "telnyx").

func WithGatewayTools added in v0.11.0

func WithGatewayTools(provider string, tools []ToolDefinition) registry.ProviderOption

WithGatewayTools returns a provider option that configures tools for a gateway. The provider parameter specifies which gateway provider to configure ("twilio" or "telnyx").

func WithRealtimeFactory added in v0.11.0

func WithRealtimeFactory(factory gateway.RealtimeProviderFactory) registry.ProviderOption

WithRealtimeFactory returns a provider option that configures a realtime provider factory. This is used for native voice-to-voice mode with gateways.

Types

type AudioFormat added in v0.10.0

type AudioFormat = gateway.AudioFormat

Re-export audio format types and constants

type Call added in v0.6.0

type Call = callsystem.Call

Call represents a phone or video call.

type CallDirection added in v0.6.0

type CallDirection = callsystem.CallDirection

CallDirection indicates inbound or outbound call.

type CallHandler added in v0.6.0

type CallHandler = callsystem.CallHandler

CallHandler is called when a new call arrives.

type CallOption added in v0.6.0

type CallOption = callsystem.CallOption

CallOption configures an outbound call.

type CallOptions added in v0.6.0

type CallOptions = callsystem.CallOptions

CallOptions holds parsed options for MakeCall.

type CallStatus added in v0.6.0

type CallStatus = callsystem.CallStatus

CallStatus represents the call state.

type CallSystem added in v0.6.0

type CallSystem = callsystem.CallSystem

CallSystem defines the interface for telephony/meeting integrations.

type CallSystemClient added in v0.7.0

type CallSystemClient = callsystem.Client

CallSystemClient manages multiple CallSystem providers with fallback support.

type CallSystemConfig added in v0.6.0

type CallSystemConfig = callsystem.CallSystemConfig

CallSystemConfig configures a call system integration.

type FunctionDeclaration added in v0.9.0

type FunctionDeclaration = realtime.FunctionDeclaration

FunctionDeclaration describes a function the model can call.

type Gateway added in v0.10.0

type Gateway = gateway.Gateway

Gateway defines the interface for voice gateway providers.

type GatewayCallHandler added in v0.10.0

type GatewayCallHandler = gateway.CallHandler

GatewayCallHandler is called when a new call is received.

type GatewayCallInfo added in v0.10.0

type GatewayCallInfo = gateway.CallInfo

GatewayCallInfo contains information about a call.

type GatewayConfig added in v0.10.0

type GatewayConfig = gateway.Config

GatewayConfig provides common configuration for voice gateways.

type GatewayEvent added in v0.10.0

type GatewayEvent = gateway.Event

GatewayEvent represents a session event.

type GatewayEventType added in v0.10.0

type GatewayEventType = gateway.EventType

GatewayEventType identifies the type of session event.

type GatewayMetrics added in v0.10.0

type GatewayMetrics = gateway.Metrics

GatewayMetrics contains session performance metrics.

type GatewayMinimal added in v0.10.0

type GatewayMinimal = registry.Gateway

GatewayMinimal is the minimal interface used by the registry. Use type assertion to access the full Gateway interface.
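The type assertion mentioned above uses Go's comma-ok form so that a provider lacking the wider interface does not cause a panic. In the sketch below, `minimal` and `full` are hypothetical stand-ins for `registry.Gateway` and `gateway.Gateway`, and their methods are invented for illustration.

```go
package main

import "fmt"

// minimal and full stand in for registry.Gateway and gateway.Gateway.
// The method sets are invented for this sketch.
type minimal interface {
	Name() string
}

type full interface {
	minimal
	Start() error
}

// fullGateway implements both interfaces.
type fullGateway struct{}

func (fullGateway) Name() string { return "twilio" }
func (fullGateway) Start() error { return nil }

// bareGateway implements only the minimal interface.
type bareGateway struct{}

func (bareGateway) Name() string { return "bare" }

// asFull reports whether m also implements the full interface.
func asFull(m minimal) (full, bool) {
	f, ok := m.(full)
	return f, ok
}

func main() {
	for _, m := range []minimal{fullGateway{}, bareGateway{}} {
		if f, ok := asFull(m); ok {
			fmt.Println(m.Name(), "supports Start:", f.Start() == nil)
		} else {
			fmt.Println(m.Name(), "exposes only the minimal interface")
		}
	}
}
```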

type GatewaySession added in v0.10.0

type GatewaySession = gateway.Session

GatewaySession represents an active voice conversation session.

type GatewayToolCall added in v0.10.0

type GatewayToolCall = gateway.ToolCall

GatewayToolCall represents a tool invocation during conversation.

type GatewayTurn added in v0.10.0

type GatewayTurn = gateway.Turn

GatewayTurn represents a single conversation turn.

type LLMProvider added in v0.10.0

type LLMProvider = gateway.LLMProvider

LLMProvider defines the interface for LLM integration with voice gateways.

type ProcessConfig added in v0.9.0

type ProcessConfig = realtime.ProcessConfig

ProcessConfig configures a real-time audio processing session.

type ProviderConfig

type ProviderConfig = registry.ProviderConfig

ProviderConfig holds common configuration options for creating providers.

type ProviderOption

type ProviderOption = registry.ProviderOption

ProviderOption configures a ProviderConfig.
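Options such as `WithAPIKey` follow the functional options pattern. The sketch below assumes, without confirmation from this page, that an option is a function that mutates a config struct; `providerConfig`, its fields, and the lowercase helpers are hypothetical and are not the real `registry` types.

```go
package main

import "fmt"

// providerConfig is a hypothetical stand-in for registry.ProviderConfig.
type providerConfig struct {
	APIKey  string
	BaseURL string
}

// providerOption mutates a providerConfig. This shape is an assumption.
type providerOption func(*providerConfig)

func withAPIKey(key string) providerOption {
	return func(c *providerConfig) { c.APIKey = key }
}

func withBaseURL(url string) providerOption {
	return func(c *providerConfig) { c.BaseURL = url }
}

// apply starts from defaults and runs each option in order, so later
// options override earlier ones.
func apply(opts ...providerOption) providerConfig {
	cfg := providerConfig{BaseURL: "https://api.example.com"}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := apply(withAPIKey("secret"), withBaseURL("https://proxy.example.com"))
	fmt.Println(cfg.APIKey != "", cfg.BaseURL)
}
```

This shape lets a getter accept `opts ...ProviderOption` and grow new settings without changing its signature.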

type RealtimeAudioChunk added in v0.9.0

type RealtimeAudioChunk = realtime.AudioChunk

RealtimeAudioChunk represents a chunk of audio data from the model.

type RealtimeClient added in v0.9.0

type RealtimeClient = realtime.Client

RealtimeClient is the multi-provider Realtime client.

type RealtimeConfig added in v0.10.0

type RealtimeConfig = gateway.RealtimeConfig

RealtimeConfig configures a realtime provider for voice-to-voice.

type RealtimeProvider added in v0.9.0

type RealtimeProvider = realtime.Provider

RealtimeProvider defines the interface for real-time voice-to-voice providers. This is the full interface from the realtime package, including ProcessAudioStream.

type RealtimeProviderFactory added in v0.9.0

type RealtimeProviderFactory = gateway.RealtimeProviderFactory

RealtimeProviderFactory creates realtime providers from configuration.

type RealtimeProviderMinimal added in v0.10.0

type RealtimeProviderMinimal = registry.RealtimeProvider

RealtimeProviderMinimal is the minimal interface used by the registry. Use type assertion to access the full RealtimeProvider interface.

type RealtimeTranscript added in v0.9.0

type RealtimeTranscript = realtime.Transcript

RealtimeTranscript represents a transcript update during a realtime conversation.

type STTClient

type STTClient = stt.Client

STTClient is the multi-provider STT client.

type STTProvider

type STTProvider = stt.Provider

STTProvider defines the interface for STT providers.

type STTStreamingProvider

type STTStreamingProvider = stt.StreamingProvider

STTStreamingProvider extends Provider with streaming support.

type Segment

type Segment = stt.Segment

Segment represents a transcription segment.

type StreamEvent

type StreamEvent = stt.StreamEvent

StreamEvent represents a streaming transcription event.

type SubtitleFormat

type SubtitleFormat = subtitle.Format

SubtitleFormat represents the output format for subtitles.

type SubtitleOptions

type SubtitleOptions = subtitle.Options

SubtitleOptions configures subtitle generation.

type SynthesisConfig

type SynthesisConfig = tts.SynthesisConfig

SynthesisConfig configures a TTS synthesis request.

type SynthesisResult

type SynthesisResult = tts.SynthesisResult

SynthesisResult contains the result of a TTS synthesis.

type TTSClient

type TTSClient = tts.Client

TTSClient is the multi-provider TTS client.

type TTSProvider

type TTSProvider = tts.Provider

TTSProvider defines the interface for TTS providers.

type TTSStreamChunk

type TTSStreamChunk = tts.StreamChunk

TTSStreamChunk represents a chunk of streaming audio.

type TTSStreamingProvider

type TTSStreamingProvider = tts.StreamingProvider

TTSStreamingProvider extends Provider with input streaming support.

type ToolDefinition added in v0.11.0

type ToolDefinition struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters,omitempty"`
}

ToolDefinition defines a tool that can be called during voice conversations. This is a generic type that works with any gateway provider.

type ToolHandler added in v0.11.0

type ToolHandler func(ctx context.Context, args map[string]any) (string, error)

ToolHandler is a function that handles tool calls during voice conversations.

type Transcript added in v0.8.0

type Transcript = stt.Transcript

Type aliases for backwards compatibility.

func LoadTranscript added in v0.8.0

func LoadTranscript(filePath string) (*Transcript, error)

LoadTranscript reads a transcript from a JSON file.

func NewTranscript added in v0.8.0

func NewTranscript(result *TranscriptionResult, provider, model, audioFile string, config *TranscriptionConfig) *Transcript

NewTranscript creates a Transcript from a TranscriptionResult. This is a convenience wrapper around stt.NewTranscript.

type TranscriptMetadata added in v0.8.0

type TranscriptMetadata = stt.TranscriptMetadata

Type aliases for backwards compatibility.

type TranscriptOptions added in v0.8.0

type TranscriptOptions = stt.TranscriptOptions

Type aliases for backwards compatibility.

type TranscriptSegment added in v0.8.0

type TranscriptSegment = stt.TranscriptSegment

Type aliases for backwards compatibility.

type TranscriptWord added in v0.8.0

type TranscriptWord = stt.TranscriptWord

Type aliases for backwards compatibility.

type TranscriptionConfig

type TranscriptionConfig = stt.TranscriptionConfig

TranscriptionConfig configures an STT transcription request.

type TranscriptionResult

type TranscriptionResult = stt.TranscriptionResult

TranscriptionResult contains the result of an STT transcription.

type Voice

type Voice = tts.Voice

Voice represents a voice configuration for TTS.

type Word

type Word = stt.Word

Word represents a word with timing information.

Directories

| Path | Synopsis |
|------|----------|
| cmd/omnivoice (command) | Package main provides the entry point for the omnivoice CLI. |
| internal/cli | Package cli provides the command-line interface for omnivoice. |
| providers/all | Package all imports and registers all omnivoice providers. |
| schema | Package schema re-exports JSON Schema definitions from omnivoice-core. |
