bargein

package
v0.14.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 15, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package bargein provides barge-in detection for voice conversations.

Barge-in allows users to interrupt the AI agent while it is speaking. The detector monitors STT (Speech-to-Text) events for user speech and automatically stops TTS (Text-to-Speech) playback when appropriate.

Interruption Modes

The detector supports multiple interruption modes:

  • ModeImmediate: Stop agent speech immediately when user starts speaking
  • ModeAfterSentence: Wait for agent to complete current sentence
  • ModeDisabled: Never interrupt (user must wait for agent to finish)
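The three modes above can be read as a small decision table. The sketch below is illustrative only: it uses a local copy of the documented constants so it compiles on its own, and the `action` type and `decide` function are hypothetical names that are not part of the package.

```go
package main

import "fmt"

// InterruptionMode is a local stand-in mirroring the documented
// bargein.InterruptionMode type and its three constants.
type InterruptionMode string

const (
	ModeImmediate     InterruptionMode = "immediate"
	ModeAfterSentence InterruptionMode = "after_sentence"
	ModeDisabled      InterruptionMode = "disabled"
)

// action is a hypothetical type describing what happens to agent speech
// when the user starts talking.
type action string

const (
	stopNow           action = "stop now"
	stopAtSentenceEnd action = "stop at sentence end"
	keepSpeaking      action = "keep speaking"
)

// decide maps a mode to the behavior described in the list above.
// Treating unknown modes like ModeDisabled is an assumption of this sketch.
func decide(m InterruptionMode) action {
	switch m {
	case ModeImmediate:
		return stopNow
	case ModeAfterSentence:
		return stopAtSentenceEnd
	default:
		return keepSpeaking
	}
}

func main() {
	for _, m := range []InterruptionMode{ModeImmediate, ModeAfterSentence, ModeDisabled} {
		fmt.Printf("%s -> %s\n", m, decide(m))
	}
}
```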

Usage

cfg := bargein.Config{
    Mode:                bargein.ModeImmediate,
    MinSpeechDurationMs: 200,  // Avoid false triggers from noise
    SilenceThresholdMs:  500,
}

detector := bargein.NewDetector(cfg)
detector.AttachTTS(ttsPipeline)
detector.AttachSTTEvents(sttEvents)

detector.OnInterrupt(func(event gateway.Event) {
    // Handle interruption (e.g., log, cancel pending responses)
})

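if err := detector.Start(ctx); err != nil {
    // Start returns ErrNoSTTEvents if no STT events were attached.
    return err
}
defer detector.Stop()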

Integration

The detector integrates with existing omnivoice-core components:

  • pipeline.TTSPipeline: Stopped when barge-in is detected
  • stt.StreamEvent: Monitored for EventSpeechStart and EventSpeechEnd
  • gateway.Event: EventInterruption is emitted on barge-in

Constants

This section is empty.

Variables

var ErrNoSTTEvents = errors.New("no STT events attached")

ErrNoSTTEvents is returned when Start is called without attaching STT events.
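Callers can test for this sentinel with errors.Is. The following is a minimal sketch under stated assumptions: `detector` and `tryStart` are hypothetical stand-ins defined here so the example runs on its own, and the nil-channel check is only a guess at how the real Detector decides that no events are attached.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrNoSTTEvents is a local copy of the documented sentinel error.
var ErrNoSTTEvents = errors.New("no STT events attached")

// detector is a hypothetical stand-in for bargein.Detector.
// The event element type is simplified to struct{}.
type detector struct {
	events <-chan struct{}
}

// Start refuses to run without an event channel. The real method's
// internals are not shown in the docs; this check is an assumption.
func (d *detector) Start(ctx context.Context) error {
	if d.events == nil {
		return ErrNoSTTEvents
	}
	return nil
}

// tryStart is a hypothetical helper that starts d with a background context.
func tryStart(d *detector) error {
	return d.Start(context.Background())
}

func main() {
	err := tryStart(&detector{})
	if errors.Is(err, ErrNoSTTEvents) {
		fmt.Println("attach STT events before calling Start:", err)
	}
}
```

Comparing with errors.Is rather than == keeps the check working if the error is later wrapped with additional context.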

Functions

This section is empty.

Types

type Config

type Config struct {
	// Mode determines when to trigger barge-in.
	// Default: ModeImmediate
	Mode InterruptionMode

	// MinSpeechDurationMs is the minimum speech duration in milliseconds
	// before triggering barge-in. This prevents false triggers from brief
	// noises or non-speech sounds.
	// Default: 200ms
	MinSpeechDurationMs int

	// SilenceThresholdMs is the duration of silence in milliseconds that
	// indicates the user has stopped speaking. Used in ModeAfterSentence.
	// Default: 500ms
	SilenceThresholdMs int

	// CooldownMs is the minimum time in milliseconds between consecutive
	// barge-in events. Prevents rapid repeated triggering.
	// Default: 300ms
	CooldownMs int

	// MinAgentSpeechMs is the minimum agent speech duration in milliseconds
	// before barge-in is enabled. Prevents interrupting very short responses.
	// Default: 500ms
	MinAgentSpeechMs int
}

Config configures the barge-in detector.
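One plausible reading of how the three thresholds combine is sketched below. This is not the package's implementation: `gate`, `ms`, and `shouldInterrupt` are hypothetical names, and the rule that all three conditions must hold at once is an assumption drawn from the field comments.

```go
package main

import (
	"fmt"
	"time"
)

// gate is a local stand-in holding three of the Config thresholds as durations.
type gate struct {
	MinSpeech      time.Duration // cf. MinSpeechDurationMs
	MinAgentSpeech time.Duration // cf. MinAgentSpeechMs
	Cooldown       time.Duration // cf. CooldownMs
}

// ms converts a millisecond count to a time.Duration.
func ms(n int) time.Duration { return time.Duration(n) * time.Millisecond }

// shouldInterrupt reports whether a barge-in would fire under this sketch:
// the user has spoken long enough, the agent has spoken long enough, and the
// cooldown since the previous barge-in has elapsed. A zero threshold always
// passes, matching the documented "no minimum" meaning of zero.
func shouldInterrupt(g gate, userSpeech, agentSpeech, sinceLast time.Duration) bool {
	return userSpeech >= g.MinSpeech &&
		agentSpeech >= g.MinAgentSpeech &&
		sinceLast >= g.Cooldown
}

func main() {
	g := gate{MinSpeech: ms(200), MinAgentSpeech: ms(500), Cooldown: ms(300)}
	fmt.Println(shouldInterrupt(g, ms(80), ms(2000), ms(5000)))  // user speech too short
	fmt.Println(shouldInterrupt(g, ms(250), ms(2000), ms(5000))) // all thresholds met
	fmt.Println(shouldInterrupt(g, ms(250), ms(2000), ms(100)))  // still in cooldown
}
```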

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns a Config with sensible defaults.

func (Config) Cooldown

func (c Config) Cooldown() time.Duration

Cooldown returns CooldownMs as a time.Duration.

func (Config) MinAgentSpeech

func (c Config) MinAgentSpeech() time.Duration

MinAgentSpeech returns MinAgentSpeechMs as a time.Duration.

func (Config) MinSpeechDuration

func (c Config) MinSpeechDuration() time.Duration

MinSpeechDuration returns MinSpeechDurationMs as a time.Duration.

func (Config) SilenceThreshold

func (c Config) SilenceThreshold() time.Duration

SilenceThreshold returns SilenceThresholdMs as a time.Duration.
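The four accessors above convert an integer millisecond field into a time.Duration. The method bodies are not shown in the documentation, so the following is only a sketch of the usual conversion, using a local stand-in Config with two of the fields.

```go
package main

import (
	"fmt"
	"time"
)

// Config is a local stand-in with two of the documented millisecond fields.
type Config struct {
	MinSpeechDurationMs int
	CooldownMs          int
}

// MinSpeechDuration converts the millisecond count to a time.Duration.
// Converting the int first and then multiplying by time.Millisecond is the
// usual idiom; the real method may be written differently.
func (c Config) MinSpeechDuration() time.Duration {
	return time.Duration(c.MinSpeechDurationMs) * time.Millisecond
}

// Cooldown follows the same pattern for CooldownMs.
func (c Config) Cooldown() time.Duration {
	return time.Duration(c.CooldownMs) * time.Millisecond
}

func main() {
	cfg := Config{MinSpeechDurationMs: 200, CooldownMs: 300}
	fmt.Println(cfg.MinSpeechDuration(), cfg.Cooldown()) // prints "200ms 300ms"
}
```

Duration-typed accessors let the values be passed directly to time.After or time.NewTimer without repeating the unit conversion at each call site.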

func (*Config) Validate

func (c *Config) Validate() Config

Validate checks whether the config is valid and applies defaults. Negative values are replaced with their defaults. Zero values are kept and mean "no minimum" for the respective setting.
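The negative-versus-zero rule can be sketched as follows. This is an illustration of the documented semantics, not the package source: `validate` and `orDefault` are hypothetical names, Config is a local stand-in without the Mode field, and the default values are the ones listed in the Config field comments.

```go
package main

import "fmt"

// Config is a local stand-in with the documented millisecond fields.
type Config struct {
	MinSpeechDurationMs int
	SilenceThresholdMs  int
	CooldownMs          int
	MinAgentSpeechMs    int
}

// Defaults as listed in the Config field comments.
const (
	defaultMinSpeechMs      = 200
	defaultSilenceMs        = 500
	defaultCooldownMs       = 300
	defaultMinAgentSpeechMs = 500
)

// orDefault keeps zero and positive values and replaces negative ones.
func orDefault(v, def int) int {
	if v < 0 {
		return def
	}
	return v
}

// validate returns a copy of c with negative fields replaced by defaults.
// Zero is left alone because it means "no minimum".
func validate(c Config) Config {
	c.MinSpeechDurationMs = orDefault(c.MinSpeechDurationMs, defaultMinSpeechMs)
	c.SilenceThresholdMs = orDefault(c.SilenceThresholdMs, defaultSilenceMs)
	c.CooldownMs = orDefault(c.CooldownMs, defaultCooldownMs)
	c.MinAgentSpeechMs = orDefault(c.MinAgentSpeechMs, defaultMinAgentSpeechMs)
	return c
}

func main() {
	in := Config{MinSpeechDurationMs: -1, SilenceThresholdMs: 750, CooldownMs: 0, MinAgentSpeechMs: -50}
	fmt.Printf("%+v\n", validate(in))
}
```

Because zero is preserved, a Config literal that omits a field gets "no minimum" for that setting rather than the documented default; starting from DefaultConfig() and overriding individual fields avoids that surprise.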

type Detector

type Detector struct {
	// contains filtered or unexported fields
}

Detector monitors STT events and triggers barge-in when appropriate.

func NewDetector

func NewDetector(cfg Config) *Detector

NewDetector creates a new barge-in detector with the given configuration.

func (*Detector) AttachSTTEvents

func (d *Detector) AttachSTTEvents(events <-chan stt.StreamEvent)

AttachSTTEvents sets the STT event channel to monitor.

func (*Detector) AttachTTS

func (d *Detector) AttachTTS(tts TTSController)

AttachTTS sets the TTS controller to stop when barge-in is triggered.

func (*Detector) IsRunning

func (d *Detector) IsRunning() bool

IsRunning returns whether the detector is actively monitoring.

func (*Detector) OnInterrupt

func (d *Detector) OnInterrupt(handler InterruptHandler)

OnInterrupt sets the handler called when barge-in is triggered.

func (*Detector) SetAgentSpeaking

func (d *Detector) SetAgentSpeaking(speaking bool)

SetAgentSpeaking notifies the detector that the agent has started or stopped speaking.

func (*Detector) Start

func (d *Detector) Start(ctx context.Context) error

Start begins monitoring STT events for barge-in. It returns ErrNoSTTEvents if no STT events have been attached. Call Stop() to end monitoring.

func (*Detector) Stop

func (d *Detector) Stop()

Stop ends monitoring and releases resources.

type InterruptHandler

type InterruptHandler func(event gateway.Event)

InterruptHandler is called when barge-in is triggered.

type InterruptionMode

type InterruptionMode string

InterruptionMode defines when barge-in should trigger.

const (
	// ModeImmediate stops agent speech immediately when user speech is detected.
	// This provides the most responsive experience but may trigger on brief noises.
	ModeImmediate InterruptionMode = "immediate"

	// ModeAfterSentence waits for the agent to complete the current sentence
	// before stopping. This provides a smoother experience but higher latency.
	ModeAfterSentence InterruptionMode = "after_sentence"

	// ModeDisabled disables barge-in entirely. The user must wait for the
	// agent to finish speaking before their speech is processed.
	ModeDisabled InterruptionMode = "disabled"
)

type TTSController

type TTSController interface {
	// IsActive returns whether TTS is currently playing.
	IsActive() bool

	// Stop stops any active TTS playback.
	Stop()
}

TTSController is the interface for controlling TTS playback. This is satisfied by pipeline.TTSPipeline.
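Because TTSController is a two-method interface, a test double is easy to write. The sketch below assumes nothing beyond the interface shown above: `fakeTTS` is a hypothetical type, and the interface is redeclared locally so the example compiles without the package.

```go
package main

import (
	"fmt"
	"sync"
)

// TTSController is a local copy of the documented interface.
type TTSController interface {
	IsActive() bool
	Stop()
}

// fakeTTS is a hypothetical test double that records Stop calls.
// The mutex guards against concurrent use from a detector goroutine.
type fakeTTS struct {
	mu     sync.Mutex
	active bool
	stops  int
}

// IsActive reports whether playback is marked as running.
func (f *fakeTTS) IsActive() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.active
}

// Stop marks playback as stopped and counts the call.
func (f *fakeTTS) Stop() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.active = false
	f.stops++
}

// Compile-time check that *fakeTTS satisfies the interface.
var _ TTSController = (*fakeTTS)(nil)

func main() {
	tts := &fakeTTS{active: true}
	tts.Stop()
	fmt.Println(tts.IsActive(), tts.stops) // prints "false 1"
}
```

A double like this could be passed to AttachTTS in a unit test to assert that a barge-in stopped playback exactly once.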
