voice

package
v1.10.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 25, 2026 License: MIT Imports: 15 Imported by: 0

Documentation

Overview

Package voice provides text-to-speech generation via a local Kokoro-compatible server.

It sends TTS requests to a locally running speech server, receives WAV audio, and converts it to OGG/Opus format via ffmpeg for delivery as Telegram voice messages. It also maintains the list of available voice IDs and their metadata, and provides speech-to-text transcription through Transcribe.

Plane: shared


Constants

const DefaultVoice = "af_heart"

Variables

var Voices = []VoiceInfo{

	{"af_heart", "Female", "American", "Default, warm"},
	{"af_alloy", "Female", "American", ""},
	{"af_bella", "Female", "American", "Youthful, soft"},
	{"af_jessica", "Female", "American", ""},
	{"af_kore", "Female", "American", ""},
	{"af_nicole", "Female", "American", ""},
	{"af_nova", "Female", "American", "Professional"},
	{"af_river", "Female", "American", ""},
	{"af_sarah", "Female", "American", "Calm, composed"},
	{"af_sky", "Female", "American", "Bright, energetic"},

	{"am_adam", "Male", "American", "Deep"},
	{"am_echo", "Male", "American", ""},
	{"am_eric", "Male", "American", ""},
	{"am_fenrir", "Male", "American", ""},
	{"am_liam", "Male", "American", ""},
	{"am_michael", "Male", "American", ""},
	{"am_onyx", "Male", "American", ""},
	{"am_puck", "Male", "American", ""},
	{"am_santa", "Male", "American", ""},

	{"bf_alice", "Female", "British", ""},
	{"bf_emma", "Female", "British", "Elegant"},
	{"bf_lily", "Female", "British", ""},

	{"bm_daniel", "Male", "British", ""},
	{"bm_fable", "Male", "British", ""},
	{"bm_george", "Male", "British", ""},
	{"bm_lewis", "Male", "British", ""},
}

Functions

func ConvertWAVToOGG

func ConvertWAVToOGG(wavData []byte) ([]byte, error)

ConvertWAVToOGG converts WAV audio bytes to OGG/Opus format via ffmpeg.
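The package's actual ffmpeg invocation is not shown on this page. As a rough sketch of how such a conversion can be done, the following pipes WAV bytes into ffmpeg on stdin and reads OGG/Opus from stdout. The names oggOpusArgs and convertWAVToOGG are hypothetical stand-ins, and the 32k bitrate is an assumed value rather than the package's setting.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// oggOpusArgs returns ffmpeg arguments that read audio from stdin and write
// OGG/Opus to stdout. The bitrate is an assumption made for this sketch.
func oggOpusArgs() []string {
	return []string{
		"-hide_banner", "-loglevel", "error",
		"-i", "pipe:0", // read the input from stdin
		"-c:a", "libopus", // encode the audio stream with Opus
		"-b:a", "32k", // assumed bitrate
		"-f", "ogg", // write an OGG container
		"pipe:1", // write the output to stdout
	}
}

// convertWAVToOGG is a hypothetical stand-in for ConvertWAVToOGG.
// It requires an ffmpeg binary on PATH.
func convertWAVToOGG(wavData []byte) ([]byte, error) {
	cmd := exec.Command("ffmpeg", oggOpusArgs()...)
	cmd.Stdin = bytes.NewReader(wavData)
	var out, stderr bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("ffmpeg: %w: %s", err, stderr.String())
	}
	return out.Bytes(), nil
}

func main() {
	// Print the command line only; the conversion itself is not run here.
	fmt.Println("ffmpeg " + strings.Join(oggOpusArgs(), " "))
}
```

Working on in-memory pipes avoids temporary files, which suits short voice clips. Telegram voice messages expect OGG audio encoded with Opus, which is why the sketch selects libopus and the ogg muxer.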

func Install

func Install() error

Install sets up the voice server script and launchd service.

func IsValidVoice

func IsValidVoice(id string) bool

IsValidVoice checks if a voice ID is recognized.
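A minimal sketch of how such a check could work, assuming a linear scan over the voice table. The lowercase names isValidVoice, resolveVoice, voices, and defaultVoice are local stand-ins for illustration; resolveVoice (fall back to the default voice for unknown IDs) is a hypothetical helper that is not part of the documented API.

```go
package main

import "fmt"

// VoiceInfo mirrors the documented struct.
type VoiceInfo struct {
	ID     string
	Gender string
	Accent string
	Note   string
}

// voices is a three-entry excerpt of the Voices table, for illustration.
var voices = []VoiceInfo{
	{"af_heart", "Female", "American", "Default, warm"},
	{"am_adam", "Male", "American", "Deep"},
	{"bf_emma", "Female", "British", "Elegant"},
}

const defaultVoice = "af_heart"

// isValidVoice reports whether id matches an entry in voices.
// The comparison in this sketch is exact and case-sensitive.
func isValidVoice(id string) bool {
	for _, v := range voices {
		if v.ID == id {
			return true
		}
	}
	return false
}

// resolveVoice is a hypothetical helper: it returns id when it is
// recognized and the default voice otherwise.
func resolveVoice(id string) string {
	if isValidVoice(id) {
		return id
	}
	return defaultVoice
}

func main() {
	fmt.Println(isValidVoice("bf_emma"))
	fmt.Println(resolveVoice("xx_unknown"))
}
```

With only a few dozen voices, a linear scan is fast enough; a map keyed by ID would be an alternative if the table grew.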

func PrintVoiceList

func PrintVoiceList()

PrintVoiceList prints the available voices in a table.
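The table layout used by the package is not shown here. One common way to print aligned columns in Go is text/tabwriter, sketched below. The function names, the column headers, and the io.Writer parameter are assumptions made for this example; the documented PrintVoiceList takes no arguments.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
	"text/tabwriter"
)

// VoiceInfo mirrors the documented struct.
type VoiceInfo struct {
	ID     string
	Gender string
	Accent string
	Note   string
}

// printVoiceList writes voices to w as an aligned table.
// Header names are assumed for this sketch.
func printVoiceList(w io.Writer, voices []VoiceInfo) error {
	// minwidth 0, tabwidth 0, padding 2, pad with spaces, no flags.
	tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
	fmt.Fprintln(tw, "ID\tGENDER\tACCENT\tNOTE")
	for _, v := range voices {
		fmt.Fprintf(tw, "%s\t%s\t%s\t%s\n", v.ID, v.Gender, v.Accent, v.Note)
	}
	// Flush computes the column widths and emits the buffered rows.
	return tw.Flush()
}

// renderVoiceList returns the table as a string, which is convenient in tests.
func renderVoiceList(voices []VoiceInfo) string {
	var sb strings.Builder
	_ = printVoiceList(&sb, voices) // writes to a strings.Builder do not fail
	return sb.String()
}

func main() {
	voices := []VoiceInfo{
		{"af_heart", "Female", "American", "Default, warm"},
		{"bm_george", "Male", "British", ""},
	}
	if err := printVoiceList(os.Stdout, voices); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Taking an io.Writer rather than writing straight to stdout keeps the formatting testable.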

func SpeakToBytes

func SpeakToBytes(text, voiceID string, speed float64) ([]byte, error)

SpeakToBytes generates TTS audio and returns WAV bytes.

func Status

func Status() error

Status checks if the voice server is running and healthy.

func Transcribe

func Transcribe(audioData []byte, filename string) (string, error)

Transcribe sends audio data to the mlx-audio STT endpoint. It reads the vocabulary and language from config on each call, so config changes take effect without a restart (hot-reload). The language defaults to "en" when voice_language is unset; set it to "auto" for Whisper auto-detection.
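The request format used by Transcribe is not documented on this page. The sketch below illustrates the language rule described above and one plausible way to package the audio as a multipart form. The field names "file", "language", and "prompt", the choice to omit the language field for "auto", and the helper names resolveLanguage and buildForm are all assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
)

// resolveLanguage applies the documented rule: an unset voice_language
// defaults to "en". For "auto" it returns ok == false, meaning "send no
// language and let the server detect it" (an assumption of this sketch).
func resolveLanguage(cfg string) (lang string, ok bool) {
	switch cfg {
	case "":
		return "en", true
	case "auto":
		return "", false
	default:
		return cfg, true
	}
}

// buildForm is a hypothetical helper that assembles a multipart body
// carrying the audio plus optional language and vocabulary fields.
func buildForm(audio []byte, filename, cfgLanguage, vocabulary string) (*bytes.Buffer, string, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)

	part, err := mw.CreateFormFile("file", filename) // assumed field name
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(audio); err != nil {
		return nil, "", err
	}
	if lang, ok := resolveLanguage(cfgLanguage); ok {
		if err := mw.WriteField("language", lang); err != nil { // assumed field name
			return nil, "", err
		}
	}
	if vocabulary != "" {
		if err := mw.WriteField("prompt", vocabulary); err != nil { // assumed field name
			return nil, "", err
		}
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := mw.Close(); err != nil {
		return nil, "", err
	}
	return &body, mw.FormDataContentType(), nil
}

func main() {
	body, contentType, err := buildForm([]byte("fake audio"), "note.ogg", "", "Kokoro, ffmpeg")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("content type: %s, body bytes: %d\n", contentType, body.Len())
}
```

Resolving the language inside the per-call path, rather than once at startup, is what lets a config change apply to the next request.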

func Uninstall

func Uninstall() error

Uninstall removes the voice server launchd service and script.

Types

type SpeechRequest

type SpeechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	Speed          float64 `json:"speed"`
	LangCode       string  `json:"lang_code"`
	ResponseFormat string  `json:"response_format"`
}
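To show how the JSON tags shape the request body, here is a small sketch that fills a SpeechRequest and marshals it. The model name "kokoro", deriving lang_code from the first letter of the voice ID, and the helper names newSpeechRequest and speechJSON are assumptions for illustration; the page does not state which values SpeakToBytes sends.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpeechRequest mirrors the documented struct.
type SpeechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	Speed          float64 `json:"speed"`
	LangCode       string  `json:"lang_code"`
	ResponseFormat string  `json:"response_format"`
}

// newSpeechRequest is a hypothetical constructor.
func newSpeechRequest(text, voiceID string, speed float64) SpeechRequest {
	langCode := ""
	if voiceID != "" {
		// Assumption: the first letter of the voice ID ("a" or "b" in the
		// Voices table) doubles as the language code.
		langCode = voiceID[:1]
	}
	return SpeechRequest{
		Model:          "kokoro", // assumed model name
		Input:          text,
		Voice:          voiceID,
		Speed:          speed,
		LangCode:       langCode,
		ResponseFormat: "wav", // the overview says the server returns WAV
	}
}

// speechJSON returns the JSON body for a request.
func speechJSON(text, voiceID string, speed float64) (string, error) {
	b, err := json.Marshal(newSpeechRequest(text, voiceID, speed))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := speechJSON("Hello", "af_heart", 1.0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Because none of the fields use omitempty, every key appears in the payload even when its value is empty or zero.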

type VoiceInfo

type VoiceInfo struct {
	ID     string
	Gender string
	Accent string
	Note   string
}
