openai-tts

Version: v0.35.3 (latest). Published: Mar 11, 2026. License: MIT. Imports: 7. Imported by: 0.

Repository

github.com/sevigo/goframe

README

OpenAI TTS Example

This example demonstrates how to use OpenAI's Text-to-Speech API with the goframe voice package.

Features Demonstrated

Basic Text-to-Speech - Simple audio generation
Voice Selection - All 6 available OpenAI voices
HD Model - Higher quality audio synthesis
Streaming - Reduced latency for longer content
Audio Formats - MP3, Opus, AAC, FLAC

Prerequisites

OpenAI API Key - Create one on the OpenAI Platform
API Credits - The TTS API costs $15 per 1M characters (tts-1) or $30 per 1M characters (tts-1-hd)

Running the Example

# Set your OpenAI API key
export OPENAI_API_KEY="sk-..."

# Run the example
go run main.go

Output

The example generates multiple audio files:

openai_tts_output.mp3 - Basic example
openai_voice_alloy.mp3 - Alloy voice (neutral, versatile)
openai_voice_echo.mp3 - Echo voice (warm, engaging)
openai_voice_fable.mp3 - Fable voice (expressive, dramatic)
openai_voice_onyx.mp3 - Onyx voice (deep, authoritative)
openai_voice_nova.mp3 - Nova voice (energetic, friendly)
openai_voice_shimmer.mp3 - Shimmer voice (soft, gentle)
openai_hd_output.mp3 - HD quality example
openai_streamed.mp3 - Streaming example
openai_format_*.mp3 - Different audio formats

Voice Characteristics

Voice	Description	Best For
alloy	Neutral, versatile	General purpose, documentaries
echo	Warm, engaging	Conversations, podcasts
fable	Expressive, dramatic	Storytelling, audiobooks
onyx	Deep, authoritative	Serious topics, news
nova	Energetic, friendly	Marketing, tutorials
shimmer	Soft, gentle	Meditation, children's content
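
The per-voice output files listed earlier suggest a simple loop over the six voice names. The sketch below shows only the naming side of that loop; voiceFileName is a hypothetical helper, and the synthesis call itself is left as a comment because it depends on the goframe API.

```go
package main

import "fmt"

// voices lists the six voice names from the table above.
var voices = []string{"alloy", "echo", "fable", "onyx", "nova", "shimmer"}

// voiceFileName is a hypothetical helper that mirrors the file naming
// shown in the Output section (openai_voice_<name>.mp3).
func voiceFileName(voice string) string {
	return fmt.Sprintf("openai_voice_%s.mp3", voice)
}

func main() {
	for _, v := range voices {
		// In the real example, each iteration would synthesize audio with
		// this voice and write the result to the file named here.
		fmt.Println(v, "->", voiceFileName(v))
	}
}
```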

Models

tts-1 (Standard)

Faster generation
Lower cost ($15 per 1M characters)
Suitable for real-time applications
Good quality for most use cases

tts-1-hd (High Definition)

Higher quality audio
Higher cost ($30 per 1M characters)
Better for pre-recorded content
Clearer pronunciation and prosody

Audio Formats

Format	Use Case	Compression
mp3	Most compatible	Lossy, good
opus	Web streaming	Lossy, best
aac	Mobile apps	Lossy, excellent
flac	Lossless archival	Lossless
wav	Audio editing	None (uncompressed)
pcm	Raw audio	None (raw samples)
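
When writing one file per format, it helps to reject unknown format names before making a request. The sketch below assumes the format name doubles as the file extension, which is a convention chosen here rather than something the example guarantees; supportedFormats and outputFileName are hypothetical names.

```go
package main

import "fmt"

// supportedFormats holds the six format names from the table above.
var supportedFormats = map[string]bool{
	"mp3": true, "opus": true, "aac": true,
	"flac": true, "wav": true, "pcm": true,
}

// outputFileName is a hypothetical helper. It validates the format and
// returns base + "." + format, assuming the format name is used as the
// file extension.
func outputFileName(base, format string) (string, error) {
	if !supportedFormats[format] {
		return "", fmt.Errorf("unsupported audio format %q", format)
	}
	return base + "." + format, nil
}

func main() {
	for _, f := range []string{"mp3", "opus", "aac", "flac"} {
		name, err := outputFileName("openai_format", f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```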

API Usage Examples

Basic Usage

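synthesizer, err := openai.NewSynthesizer(
    openai.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
    openai.WithModel("tts-1"),
    openai.WithVoice("alloy"),
)
if err != nil {
    log.Fatal(err)
}
// ctx is a context.Context, such as context.Background().
audio, err := synthesizer.Synthesize(ctx, "Hello world!")
if err != nil {
    log.Fatal(err)
}
if err := os.WriteFile("output.mp3", audio.Data, 0600); err != nil {
    log.Fatal(err)
}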

With Options

audio, err := synthesizer.Synthesize(ctx,
    "Speaking faster now",
    voice.WithVoice("nova"),
    voice.WithModel("tts-1-hd"),
    voice.WithSpeed(1.2),
)
if err != nil {
    log.Fatal(err)
}

Streaming

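stream, err := synthesizer.Stream(ctx, "Long text...")
if err != nil {
    log.Fatal(err)
}
defer stream.Close()
file, err := os.Create("output.mp3")
if err != nil {
    log.Fatal(err)
}
defer file.Close()
_, err = io.Copy(file, stream) // check err as above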

Cost Considerations

tts-1: $0.015 per 1K characters
tts-1-hd: $0.030 per 1K characters
Example: 10,000 characters (roughly 10 minutes of speech) = $0.15 (tts-1) or $0.30 (tts-1-hd)
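
The pricing above reduces to characters times the per-million rate. A minimal sketch of that arithmetic follows, using the rates quoted in this README; estimateCost and pricePerMillion are hypothetical names, and current prices should be confirmed with OpenAI.

```go
package main

import "fmt"

// pricePerMillion holds the USD rates quoted in this README,
// per 1M input characters.
var pricePerMillion = map[string]float64{
	"tts-1":    15.0,
	"tts-1-hd": 30.0,
}

// estimateCost is a hypothetical helper that returns the USD cost of
// synthesizing chars characters with the given model.
func estimateCost(model string, chars int) (float64, error) {
	rate, ok := pricePerMillion[model]
	if !ok {
		return 0, fmt.Errorf("unknown model %q", model)
	}
	return float64(chars) * rate / 1_000_000, nil
}

func main() {
	for _, m := range []string{"tts-1", "tts-1-hd"} {
		cost, err := estimateCost(m, 10_000)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: $%.2f\n", m, cost) // $0.15 and $0.30 for 10,000 characters
	}
}
```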

Limitations

Maximum text length: 4,096 characters per request
Rate limits apply (check OpenAI dashboard)
No word-level timestamps (use Kokoro-FastAPI for that)
No subtitle generation built-in

Related Examples

../kokoro-tts/ - Local TTS with Kokoro-FastAPI
../kokoro-dialogue/ - Multi-speaker dialogue synthesis
../kokoro-captioned-dialogue/ - Dialogue with word-level timestamps

Documentation

Overview

Package main demonstrates OpenAI Text-to-Speech API usage.

Source Files

main.go
