pipeline

package
v0.3.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 24, 2026 | License: MIT | Imports: 8 | Imported by: 0

Documentation

Overview

Package pipeline provides components for connecting voice processing stages.

The pipeline package connects STT, LLM, and TTS providers to transport connections and manages audio streaming, buffering, and error handling.

Constants

This section is empty.

Variables

var (
	// ErrPipelineActive is returned when trying to start a pipeline that is already active.
	ErrPipelineActive = errors.New("pipeline is already active")
)

Functions

This section is empty.

Types

type STTPipeline

type STTPipeline struct {
	// contains filtered or unexported fields
}

STTPipeline connects a transport connection's audio output to an STT provider. It streams the caller's audio to the STT service and delivers transcripts through the callbacks set in STTPipelineConfig.

func NewSTTPipeline

func NewSTTPipeline(provider stt.StreamingProvider, config STTPipelineConfig) *STTPipeline

NewSTTPipeline creates a new STT pipeline.

func (*STTPipeline) IsActive

func (p *STTPipeline) IsActive() bool

IsActive returns whether the pipeline is currently transcribing.

func (*STTPipeline) StartFromConnection

func (p *STTPipeline) StartFromConnection(ctx context.Context, conn transport.Connection) error

StartFromConnection starts transcribing audio from the connection. The call is non-blocking: transcription runs in background goroutines until Stop() is called.

func (*STTPipeline) Stop

func (p *STTPipeline) Stop()

Stop stops the STT pipeline.

type STTPipelineConfig

type STTPipelineConfig struct {
	// Model is the STT model to use (provider-specific).
	Model string

	// Language is the BCP-47 language code (e.g., "en-US").
	Language string

	// Encoding is the audio encoding (e.g., "mulaw", "pcm").
	// Use "mulaw" for Twilio Media Streams.
	Encoding string

	// SampleRate is the audio sample rate.
	// Use 8000 for telephony (mu-law).
	SampleRate int

	// Channels is the number of audio channels (1 = mono).
	Channels int

	// OnTranscript is called when a transcript is received.
	// The bool indicates if it's a final (non-interim) result.
	OnTranscript func(transcript string, isFinal bool)

	// OnError is called when an error occurs during streaming.
	OnError func(error)

	// OnSpeechStart is called when speech is detected.
	OnSpeechStart func()

	// OnSpeechEnd is called when speech ends (utterance complete).
	OnSpeechEnd func()
}

STTPipelineConfig configures the STT pipeline.
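As an illustration of how the callbacks might be wired, the sketch below builds a config that keeps only final transcripts. The struct is a local copy of a subset of the documented fields so that the example compiles without the package. newCollectingConfig is a hypothetical helper, and the field values are taken from the telephony hints in the field comments, not from DefaultSTTConfig, whose actual values are not listed here.

```go
package main

import (
	"fmt"
	"strings"
)

// STTPipelineConfig is a local copy of some documented fields, declared
// here only so the sketch is self-contained.
type STTPipelineConfig struct {
	Language     string
	Encoding     string
	SampleRate   int
	Channels     int
	OnTranscript func(transcript string, isFinal bool)
	OnError      func(error)
}

// newCollectingConfig (hypothetical helper) returns a telephony-style config
// whose OnTranscript appends final results to *finals and drops interim ones.
func newCollectingConfig(finals *[]string) STTPipelineConfig {
	return STTPipelineConfig{
		Language:   "en-US",
		Encoding:   "mulaw",
		SampleRate: 8000,
		Channels:   1,
		OnTranscript: func(transcript string, isFinal bool) {
			if isFinal {
				*finals = append(*finals, transcript)
			}
		},
		OnError: func(err error) {
			fmt.Println("stt error:", err)
		},
	}
}

func main() {
	var finals []string
	cfg := newCollectingConfig(&finals)

	// Simulate the calls a running pipeline might make.
	cfg.OnTranscript("hel", false)
	cfg.OnTranscript("hello", true)
	cfg.OnTranscript("world", true)

	fmt.Println(strings.Join(finals, " "))
}
```

If the pipeline invokes callbacks from its background goroutines, which the non-blocking design suggests but this page does not state, shared state such as finals should be guarded with a mutex.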

func DefaultSTTConfig

func DefaultSTTConfig() STTPipelineConfig

DefaultSTTConfig returns sensible defaults for telephony.

type StreamingTTSPipeline

type StreamingTTSPipeline struct {
	*TTSPipeline
	// contains filtered or unexported fields
}

StreamingTTSPipeline extends TTSPipeline with input streaming support. It reads text from an io.Reader (e.g., streaming LLM output), passes it to the TTS provider, and sends the resulting audio to the transport.

func NewStreamingTTSPipeline

func NewStreamingTTSPipeline(provider tts.StreamingProvider, config TTSPipelineConfig) *StreamingTTSPipeline

NewStreamingTTSPipeline creates a pipeline that accepts streaming text input.

func (*StreamingTTSPipeline) StreamToConnection

func (p *StreamingTTSPipeline) StreamToConnection(ctx context.Context, textReader io.Reader, conn transport.Connection) error

StreamToConnection reads text from a reader and streams synthesized audio to the connection. This enables low-latency streaming from LLM output to voice output.

type TTSPipeline

type TTSPipeline struct {
	// contains filtered or unexported fields
}

TTSPipeline connects a TTS provider's streaming output to a transport connection. It manages buffering, error handling, and graceful shutdown.

func NewTTSPipeline

func NewTTSPipeline(provider tts.Provider, config TTSPipelineConfig) *TTSPipeline

NewTTSPipeline creates a new TTS pipeline.

func (*TTSPipeline) IsActive

func (p *TTSPipeline) IsActive() bool

IsActive returns whether the pipeline is currently synthesizing.

func (*TTSPipeline) Stop

func (p *TTSPipeline) Stop()

Stop stops any active synthesis.

func (*TTSPipeline) SynthesizeToConnection

func (p *TTSPipeline) SynthesizeToConnection(ctx context.Context, text string, conn transport.Connection) error

SynthesizeToConnection synthesizes text and streams audio to the connection. The call is non-blocking: synthesis runs in a background goroutine.

type TTSPipelineConfig

type TTSPipelineConfig struct {
	// VoiceID is the voice to use for synthesis.
	VoiceID string

	// OutputFormat is the audio format (e.g., "ulaw", "pcm").
	// Use "ulaw" for Twilio Media Streams.
	OutputFormat string

	// SampleRate is the audio sample rate.
	// Use 8000 for telephony (mu-law).
	SampleRate int

	// Model is the TTS model to use (provider-specific).
	Model string

	// OnError is called when an error occurs during streaming.
	OnError func(error)

	// OnComplete is called when synthesis completes.
	OnComplete func()
}

TTSPipelineConfig configures the TTS pipeline.

func DefaultTTSConfig

func DefaultTTSConfig() TTSPipelineConfig

DefaultTTSConfig returns sensible defaults for telephony.
