Documentation ¶
Overview ¶
Package voice provides text-to-speech generation via a local Kokoro-compatible server.
It sends TTS requests to a locally running speech server, receives WAV audio, and converts it to OGG/Opus format via ffmpeg for delivery as Telegram voice messages. It also maintains the list of available voice IDs and their metadata.
Plane: shared
Index ¶
- Constants
- Variables
- func ConvertWAVToOGG(wavData []byte) ([]byte, error)
- func Install() error
- func IsValidVoice(id string) bool
- func PrintVoiceList()
- func SpeakToBytes(text, voiceID string, speed float64) ([]byte, error)
- func Status() error
- func Transcribe(audioData []byte, filename string) (string, error)
- func Uninstall() error
- type SpeechRequest
- type VoiceInfo
Constants ¶
const DefaultVoice = "af_heart"
Variables ¶
var Voices = []VoiceInfo{
	{"af_heart", "Female", "American", "Default, warm"},
	{"af_alloy", "Female", "American", ""},
	{"af_bella", "Female", "American", "Youthful, soft"},
	{"af_jessica", "Female", "American", ""},
	{"af_kore", "Female", "American", ""},
	{"af_nicole", "Female", "American", ""},
	{"af_nova", "Female", "American", "Professional"},
	{"af_river", "Female", "American", ""},
	{"af_sarah", "Female", "American", "Calm, composed"},
	{"af_sky", "Female", "American", "Bright, energetic"},
	{"am_adam", "Male", "American", "Deep"},
	{"am_echo", "Male", "American", ""},
	{"am_eric", "Male", "American", ""},
	{"am_fenrir", "Male", "American", ""},
	{"am_liam", "Male", "American", ""},
	{"am_michael", "Male", "American", ""},
	{"am_onyx", "Male", "American", ""},
	{"am_puck", "Male", "American", ""},
	{"am_santa", "Male", "American", ""},
	{"bf_alice", "Female", "British", ""},
	{"bf_emma", "Female", "British", "Elegant"},
	{"bf_lily", "Female", "British", ""},
	{"bm_daniel", "Male", "British", ""},
	{"bm_fable", "Male", "British", ""},
	{"bm_george", "Male", "British", ""},
	{"bm_lewis", "Male", "British", ""},
}
Functions ¶
func ConvertWAVToOGG ¶
ConvertWAVToOGG converts WAV audio bytes to OGG/Opus format via ffmpeg.
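One way such a conversion could be wired up is to pipe the WAV bytes through ffmpeg's stdin and read OGG/Opus from its stdout. The sketch below is not the package's implementation: the names opusArgs and wavToOGG are hypothetical, and the 32k bitrate is an illustrative assumption. It requires an ffmpeg binary built with libopus on the PATH.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// opusArgs returns ffmpeg arguments that read audio from stdin and write
// OGG/Opus to stdout. The bitrate is an assumed value, not taken from the package.
func opusArgs() []string {
	return []string{
		"-hide_banner", "-loglevel", "error",
		"-i", "pipe:0", // read the WAV input from stdin
		"-c:a", "libopus", // encode the audio stream with Opus
		"-b:a", "32k", // assumed bitrate for speech
		"-f", "ogg", // force the OGG container
		"pipe:1", // write the result to stdout
	}
}

// wavToOGG is a hypothetical stand-in for ConvertWAVToOGG.
func wavToOGG(wav []byte) ([]byte, error) {
	cmd := exec.Command("ffmpeg", opusArgs()...)
	cmd.Stdin = bytes.NewReader(wav)

	var out, errBuf bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &errBuf

	if err := cmd.Run(); err != nil {
		// Include ffmpeg's stderr so the caller can see why it failed.
		return nil, fmt.Errorf("ffmpeg: %w: %s", err, errBuf.String())
	}
	return out.Bytes(), nil
}
```

Using pipes avoids temporary files, at the cost of holding both the input and the output in memory, which is usually acceptable for short voice messages.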
func IsValidVoice ¶
IsValidVoice reports whether a voice ID is recognized.
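A check like this can be sketched as a linear scan over the voice list. The example below is an assumption about how it might work, not the package's code: voiceInfo, its field names (ID, Gender, Accent, Description), and isValidVoice are hypothetical, and the list is shortened to three entries from Voices.

```go
package main

// voiceInfo mirrors the four positional values in each Voices entry.
// The field names are assumed; the real VoiceInfo fields are not documented here.
type voiceInfo struct {
	ID, Gender, Accent, Description string
}

// voices is a shortened copy of the Voices list, for illustration only.
var voices = []voiceInfo{
	{"af_heart", "Female", "American", "Default, warm"},
	{"am_adam", "Male", "American", "Deep"},
	{"bf_emma", "Female", "British", "Elegant"},
}

// isValidVoice reports whether id matches an entry in voices.
// The comparison is exact and case-sensitive in this sketch.
func isValidVoice(id string) bool {
	for _, v := range voices {
		if v.ID == id {
			return true
		}
	}
	return false
}
```

With a list of a few dozen entries a linear scan is cheap; a map keyed by ID would only matter if the list grew much larger.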
func SpeakToBytes ¶
SpeakToBytes generates TTS audio for the given text, voice ID, and speed, and returns the result as WAV bytes.
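The request sent to the speech server is not documented here. As a rough sketch, assuming the server exposes an OpenAI-style /v1/audio/speech endpoint (as some Kokoro-compatible servers do), a request could be built as follows. The speechRequest fields, the "kokoro" model name, the endpoint path, and newSpeechRequest are all assumptions, not the package's actual SpeechRequest type or API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"strings"
)

// speechRequest is a hypothetical JSON payload modeled on OpenAI-style
// speech endpoints; the real SpeechRequest type may differ.
type speechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	ResponseFormat string  `json:"response_format"`
	Speed          float64 `json:"speed"`
}

// newSpeechRequest builds a POST request asking the server for WAV audio.
// baseURL is the address of the local speech server (assumed).
func newSpeechRequest(baseURL, text, voiceID string, speed float64) (*http.Request, error) {
	body, err := json.Marshal(speechRequest{
		Model:          "kokoro", // assumed model name
		Input:          text,
		Voice:          voiceID,
		ResponseFormat: "wav",
		Speed:          speed,
	})
	if err != nil {
		return nil, err
	}

	url := strings.TrimRight(baseURL, "/") + "/v1/audio/speech" // assumed path
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}
```

A caller would pass the request to an http.Client, check for a 200 status, and read the response body to obtain the WAV bytes.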
func Transcribe ¶
Transcribe sends audio data to the mlx-audio STT endpoint. It reads the vocabulary and language from config on every call, so config changes take effect without a restart (hot reload). The language defaults to "en" when voice_language is unset; set voice_language to "auto" to use Whisper auto-detection.
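The language rule described above can be sketched as a small helper. This is an illustration under assumptions: resolveLanguage is a hypothetical name, and treating "auto" as "send no language hint" is a guess at how auto-detection might be requested, not confirmed behavior of the package or the endpoint.

```go
package main

import "strings"

// resolveLanguage maps the voice_language config value to the language
// hint for a transcription request. The second result reports whether a
// hint should be sent at all.
func resolveLanguage(voiceLanguage string) (lang string, send bool) {
	switch v := strings.ToLower(strings.TrimSpace(voiceLanguage)); v {
	case "":
		// Unset: fall back to English, as the documentation states.
		return "en", true
	case "auto":
		// Assumed: omit the hint so Whisper detects the language itself.
		return "", false
	default:
		return v, true
	}
}
```

Calling the helper on each request, rather than caching its result at startup, is what lets a changed voice_language value apply to the next transcription.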