tts

package
v0.1.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 24, 2025 License: MIT Imports: 1 Imported by: 0

Documentation

Overview

Package tts provides the text-to-speech provider abstraction and implementations. It defines the Provider interface that all TTS services must implement, along with common types and request/response structures.

Key features:

  • Provider interface for TTS abstraction
  • Support for multiple providers (say, elevenlabs)
  • Timing control and speed adjustment
  • Voice listing and selection

Providers:

  • say: macOS built-in TTS (AIFF, M4A output)
  • elevenlabs: ElevenLabs API (MP3 output)

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type GenerateRequest

type GenerateRequest struct {
	// Text is the content to convert to speech
	Text string

	// Voice is the voice identifier (provider-specific)
	Voice string

	// OutputPath is where the audio file should be created
	OutputPath string

	// Rate is the speaking rate in words per minute (optional, used by 'say' provider)
	Rate *int

	// ModelID is the TTS model identifier (optional, used by ElevenLabs)
	ModelID *string

	// Format is the desired audio format (e.g., "aiff", "mp3")
	Format string

	// TargetDuration is the desired duration in seconds (optional, for timing control)
	TargetDuration *float64
}

GenerateRequest contains all parameters needed to generate audio.

type Provider

type Provider interface {
	// Generate creates audio from text and returns the output file path.
	// The provider is responsible for creating the audio file at the specified path.
	Generate(ctx context.Context, req GenerateRequest) (string, error)

	// ListVoices returns available voices for this provider.
	ListVoices(ctx context.Context) ([]Voice, error)

	// Name returns the provider name (e.g., "say", "elevenlabs").
	Name() string
}

Provider defines the interface for text-to-speech providers. Implementations include macOS 'say' command and ElevenLabs API.

type Voice

type Voice struct {
	// ID is the unique voice identifier
	ID string

	// Name is the human-readable voice name
	Name string

	// Description provides additional information about the voice
	Description string

	// Language is the voice language code (e.g., "en-US")
	Language string

	// Gender is the voice gender (if applicable)
	Gender string
}

Voice represents a TTS voice.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL