Documentation ¶
Overview ¶
Package bargein provides barge-in detection for voice conversations.
Barge-in allows users to interrupt the AI agent while it is speaking. The detector monitors STT (Speech-to-Text) events for user speech and stops TTS (Text-to-Speech) playback according to the configured interruption mode.
Interruption Modes ¶
The detector supports multiple interruption modes:
- ModeImmediate: Stop agent speech immediately when user starts speaking
- ModeAfterSentence: Wait for agent to complete current sentence
- ModeDisabled: Never interrupt (user must wait for agent to finish)
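The difference between the three modes can be sketched as a single decision function. This is an illustration only, not the package's implementation: the types are redeclared so the example compiles on its own, and shouldStopNow and its atSentenceEnd parameter are hypothetical names introduced here.

```go
package main

import "fmt"

// InterruptionMode and its constants are redeclared from the package docs
// so that this sketch is self-contained.
type InterruptionMode string

const (
	ModeImmediate     InterruptionMode = "immediate"
	ModeAfterSentence InterruptionMode = "after_sentence"
	ModeDisabled      InterruptionMode = "disabled"
)

// shouldStopNow is a hypothetical helper (not part of the package). It reports
// whether agent speech should be stopped at this moment, given that user
// speech has been detected and whether the agent has just finished a sentence.
func shouldStopNow(mode InterruptionMode, atSentenceEnd bool) bool {
	switch mode {
	case ModeImmediate:
		return true
	case ModeAfterSentence:
		return atSentenceEnd
	default: // ModeDisabled, and any unknown mode, never interrupts
		return false
	}
}

func main() {
	modes := []InterruptionMode{ModeImmediate, ModeAfterSentence, ModeDisabled}
	for _, m := range modes {
		fmt.Printf("%-15s mid-sentence=%v sentence-end=%v\n",
			m, shouldStopNow(m, false), shouldStopNow(m, true))
	}
}
```

Treating unknown modes like ModeDisabled is a conservative choice made for this sketch; the real detector may behave differently.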
Usage ¶
cfg := bargein.Config{
	Mode:                bargein.ModeImmediate,
	MinSpeechDurationMs: 200, // Avoid false triggers from noise
	SilenceThresholdMs:  500,
}

detector := bargein.NewDetector(cfg)
detector.AttachTTS(ttsPipeline)
detector.AttachSTTEvents(sttEvents)

detector.OnInterrupt(func(event gateway.Event) {
	// Handle interruption (e.g., log, cancel pending responses)
})

// Start returns an error, e.g. ErrNoSTTEvents if no STT events are attached.
if err := detector.Start(ctx); err != nil {
	log.Printf("bargein: start failed: %v", err)
}
Integration ¶
The detector integrates with existing omnivoice-core components:
- pipeline.TTSPipeline: Stopped when barge-in is detected
- stt.StreamEvent: Monitors for EventSpeechStart and EventSpeechEnd
- gateway.Event: Emits EventInterruption on barge-in
Index ¶
- Variables
- type Config
- type Detector
- func (d *Detector) AttachSTTEvents(events <-chan stt.StreamEvent)
- func (d *Detector) AttachTTS(tts TTSController)
- func (d *Detector) IsRunning() bool
- func (d *Detector) OnInterrupt(handler InterruptHandler)
- func (d *Detector) SetAgentSpeaking(speaking bool)
- func (d *Detector) Start(ctx context.Context) error
- func (d *Detector) Stop()
- type InterruptHandler
- type InterruptionMode
- type TTSController
Constants ¶
This section is empty.
Variables ¶
var ErrNoSTTEvents = errors.New("no STT events attached")
ErrNoSTTEvents is returned when Start is called without attaching STT events.
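Callers can test for this sentinel with errors.Is, which keeps working even if the error is wrapped on the way up. The sketch below is self-contained: it redeclares the variable, and startDetector and isMissingSTT are hypothetical stand-ins, since whether Detector.Start wraps the error is not stated here.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoSTTEvents is redeclared with the documented message so the sketch
// compiles without the real package.
var ErrNoSTTEvents = errors.New("no STT events attached")

// startDetector is a hypothetical stand-in for Detector.Start. Wrapping the
// sentinel with %w is an assumption made for illustration.
func startDetector(sttAttached bool) error {
	if !sttAttached {
		return fmt.Errorf("start detector: %w", ErrNoSTTEvents)
	}
	return nil
}

// isMissingSTT reports whether err is, or wraps, ErrNoSTTEvents.
func isMissingSTT(err error) bool {
	return errors.Is(err, ErrNoSTTEvents)
}

func main() {
	if err := startDetector(false); isMissingSTT(err) {
		fmt.Println("attach an STT event channel before calling Start:", err)
	}
}
```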
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	// Mode determines when to trigger barge-in.
	// Default: ModeImmediate
	Mode InterruptionMode

	// MinSpeechDurationMs is the minimum speech duration in milliseconds
	// before triggering barge-in. This prevents false triggers from brief
	// noises or non-speech sounds.
	// Default: 200ms
	MinSpeechDurationMs int

	// SilenceThresholdMs is the duration of silence in milliseconds that
	// indicates the user has stopped speaking. Used in ModeAfterSentence.
	// Default: 500ms
	SilenceThresholdMs int

	// CooldownMs is the minimum time in milliseconds between consecutive
	// barge-in events. Prevents rapid repeated triggering.
	// Default: 300ms
	CooldownMs int

	// MinAgentSpeechMs is the minimum agent speech duration in milliseconds
	// before barge-in is enabled. Prevents interrupting very short responses.
	// Default: 500ms
	MinAgentSpeechMs int
}
Config configures the barge-in detector.
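To show how the three timing fields could interact, here is a minimal sketch of a gating check. It is not the detector's actual logic: gateConfig, ms, and allowBargeIn are hypothetical names, and the assumption is that all three thresholds must be met before an interruption fires.

```go
package main

import (
	"fmt"
	"time"
)

// gateConfig copies only the timing fields of Config that this sketch needs.
type gateConfig struct {
	MinSpeechDurationMs int
	CooldownMs          int
	MinAgentSpeechMs    int
}

// ms converts a millisecond count into a time.Duration.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// allowBargeIn is a hypothetical gate. It allows an interruption only when
// the user has spoken long enough, the agent has spoken long enough, and the
// cooldown since the previous barge-in has elapsed.
func allowBargeIn(cfg gateConfig, userSpeech, agentSpeech, sinceLast time.Duration) bool {
	if userSpeech < ms(cfg.MinSpeechDurationMs) {
		return false // likely a brief noise rather than speech
	}
	if agentSpeech < ms(cfg.MinAgentSpeechMs) {
		return false // agent has only just started; let short replies finish
	}
	if sinceLast < ms(cfg.CooldownMs) {
		return false // too soon after the previous barge-in
	}
	return true
}

func main() {
	cfg := gateConfig{MinSpeechDurationMs: 200, CooldownMs: 300, MinAgentSpeechMs: 500}
	fmt.Println(allowBargeIn(cfg, ms(80), ms(2000), ms(5000)))  // false: user speech too short
	fmt.Println(allowBargeIn(cfg, ms(250), ms(2000), ms(5000))) // true: all thresholds met
}
```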
func DefaultConfig ¶
func DefaultConfig() Config
DefaultConfig returns a Config with sensible defaults.
func (Config) MinAgentSpeech ¶
MinAgentSpeech returns MinAgentSpeechMs as a time.Duration.
func (Config) MinSpeechDuration ¶
MinSpeechDuration returns MinSpeechDurationMs as a time.Duration.
func (Config) SilenceThreshold ¶
SilenceThreshold returns SilenceThresholdMs as a time.Duration.
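The duration helpers above most likely reduce to a multiplication by time.Millisecond. The following sketch assumes exactly that; the real method bodies are not shown in this documentation, and the Config here is a trimmed local copy.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a trimmed local copy holding only the millisecond fields that
// have duration helpers.
type Config struct {
	MinSpeechDurationMs int
	SilenceThresholdMs  int
	MinAgentSpeechMs    int
}

// MinSpeechDuration returns MinSpeechDurationMs as a time.Duration.
func (c Config) MinSpeechDuration() time.Duration {
	return time.Duration(c.MinSpeechDurationMs) * time.Millisecond
}

// SilenceThreshold returns SilenceThresholdMs as a time.Duration.
func (c Config) SilenceThreshold() time.Duration {
	return time.Duration(c.SilenceThresholdMs) * time.Millisecond
}

// MinAgentSpeech returns MinAgentSpeechMs as a time.Duration.
func (c Config) MinAgentSpeech() time.Duration {
	return time.Duration(c.MinAgentSpeechMs) * time.Millisecond
}

func main() {
	c := Config{MinSpeechDurationMs: 200, SilenceThresholdMs: 500, MinAgentSpeechMs: 500}
	fmt.Println(c.MinSpeechDuration(), c.SilenceThreshold(), c.MinAgentSpeech())
}
```

Converting once at the boundary lets the rest of the code compare against time.Duration values directly, without mixing raw integers and durations.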
type Detector ¶
type Detector struct {
	// contains filtered or unexported fields
}
Detector monitors STT events and triggers barge-in when appropriate.
func NewDetector ¶
NewDetector creates a new barge-in detector with the given configuration.
func (*Detector) AttachSTTEvents ¶
func (d *Detector) AttachSTTEvents(events <-chan stt.StreamEvent)
AttachSTTEvents sets the STT event channel to monitor.
func (*Detector) AttachTTS ¶
func (d *Detector) AttachTTS(tts TTSController)
AttachTTS sets the TTS controller to stop when barge-in is triggered.
func (*Detector) OnInterrupt ¶
func (d *Detector) OnInterrupt(handler InterruptHandler)
OnInterrupt sets the handler called when barge-in is triggered.
func (*Detector) SetAgentSpeaking ¶
func (d *Detector) SetAgentSpeaking(speaking bool)
SetAgentSpeaking notifies the detector that the agent has started or stopped speaking.
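One way to keep the detector's view of agent speech accurate is to bracket playback with a pair of calls. The sketch below assumes that calling pattern; speakingNotifier, speak, and recorder are hypothetical names used only to make the example self-contained.

```go
package main

import "fmt"

// speakingNotifier is a hypothetical one-method interface that *Detector
// would satisfy through its SetAgentSpeaking method.
type speakingNotifier interface {
	SetAgentSpeaking(speaking bool)
}

// speak marks the agent as speaking for the duration of play. The deferred
// call ensures the flag is cleared even if play returns an error.
func speak(n speakingNotifier, play func() error) error {
	n.SetAgentSpeaking(true)
	defer n.SetAgentSpeaking(false)
	return play()
}

// recorder is a test double that records every SetAgentSpeaking call.
type recorder struct{ calls []bool }

func (r *recorder) SetAgentSpeaking(speaking bool) {
	r.calls = append(r.calls, speaking)
}

func main() {
	r := &recorder{}
	err := speak(r, func() error {
		fmt.Println("playing agent audio...")
		return nil
	})
	fmt.Println(r.calls, err)
}
```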
type InterruptHandler ¶
InterruptHandler is called when barge-in is triggered.
type InterruptionMode ¶
type InterruptionMode string
InterruptionMode defines when barge-in should trigger.
const (
	// ModeImmediate stops agent speech immediately when user speech is detected.
	// This provides the most responsive experience but may trigger on brief noises.
	ModeImmediate InterruptionMode = "immediate"

	// ModeAfterSentence waits for the agent to complete the current sentence
	// before stopping. This provides a smoother experience but higher latency.
	ModeAfterSentence InterruptionMode = "after_sentence"

	// ModeDisabled disables barge-in entirely. The user must wait for the
	// agent to finish speaking before their speech is processed.
	ModeDisabled InterruptionMode = "disabled"
)
type TTSController ¶
type TTSController interface {
	// IsActive returns whether TTS is currently playing.
	IsActive() bool

	// Stop stops any active TTS playback.
	Stop()
}
TTSController is the interface for controlling TTS playback. This is satisfied by pipeline.TTSPipeline.
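Because the interface has only two methods, a test double is easy to write. The following is a minimal sketch with the interface redeclared locally; fakeTTS is a hypothetical type, not something the package provides.

```go
package main

import (
	"fmt"
	"sync"
)

// TTSController is redeclared from the package docs so the sketch compiles
// on its own.
type TTSController interface {
	IsActive() bool
	Stop()
}

// fakeTTS is a hypothetical in-memory controller for tests. It tracks whether
// playback is active and how many times Stop was called.
type fakeTTS struct {
	mu     sync.Mutex
	active bool
	stops  int
}

// IsActive returns whether the fake is currently "playing".
func (f *fakeTTS) IsActive() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.active
}

// Stop marks playback as stopped and counts the call.
func (f *fakeTTS) Stop() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.active = false
	f.stops++
}

// Compile-time check that *fakeTTS satisfies TTSController.
var _ TTSController = (*fakeTTS)(nil)

func main() {
	tts := &fakeTTS{active: true}
	var c TTSController = tts
	fmt.Println("active before stop:", c.IsActive())
	c.Stop()
	fmt.Println("active after stop:", c.IsActive(), "stops:", tts.stops)
}
```

The mutex is included because a detector would plausibly call Stop from its own goroutine while a test reads the state from another.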