package segmenter

Version: v1.8.2 | Published: Jan 7, 2026 | License: Apache-2.0 | Imports: 7 | Imported by: 2

Repository

github.com/mutablelogic/go-media

Documentation ¶

Overview ¶

Package segmenter provides audio segmentation functionality.

The segmenter reads audio in any format that FFmpeg can decode and outputs chunks of raw PCM samples, as either float32 or int16 values. It supports:

Fixed segment sizes (e.g., 30 seconds)
Silence-based segmentation (break on silence boundaries)
Sample rate conversion
Mono output (single channel)

Use cases include:

Speech-to-text preprocessing (splitting audio into segments)
Audio analysis (processing chunks at a time)
Streaming audio processing

Index ¶

Constants
type Opt
type SegmentFuncFloat32
type SegmentFuncInt16
type Segmenter
- func New(path string, sampleRate int, opts ...Opt) (*Segmenter, error)
- func NewFromReader(r io.Reader, sampleRate int, opts ...Opt) (*Segmenter, error)

Constants ¶

const (
	DefaultSilenceThreshold = 0.01                   // Default silence threshold (RMS)
	DefaultSilenceDuration  = time.Millisecond * 500 // Default silence duration
	MinSegmentDuration      = time.Millisecond * 100 // Minimum segment/silence duration
)

Types ¶

type Opt ¶ added in v1.7.6

type Opt func(*opts) error

Opt is a function that applies an option to a Segmenter.

func WithDefaultSilence ¶ added in v1.8.0

func WithDefaultSilence() Opt

WithDefaultSilence enables silence detection with the default threshold and duration: a threshold of 0.01 (1% RMS, DefaultSilenceThreshold) and a silence duration of 500ms (DefaultSilenceDuration).

func WithFFmpegOpt ¶ added in v1.8.0

func WithFFmpegOpt(opt ffmpeg.Opt) Opt

WithFFmpegOpt wraps an ffmpeg.Opt as a segmenter.Opt for use with segmenter.NewFromReader.

func WithSegmentSize ¶ added in v1.7.7

func WithSegmentSize(v time.Duration) Opt

WithSegmentSize sets the target segment duration. A segment is output once it reaches approximately this duration. The minimum is 100ms (MinSegmentDuration).
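To get a feel for how large a segment is in memory, the following sketch converts a segment duration into a sample count for mono output. The helper samplesPerSegment is hypothetical and not part of this package; because segments are only approximately the target duration, treat the result as an estimate.

```go
package main

import (
	"fmt"
	"time"
)

// samplesPerSegment is a hypothetical helper (not part of the segmenter
// package). It returns the number of mono samples in a segment of
// duration d at the given sample rate in Hz.
func samplesPerSegment(sampleRate int, d time.Duration) int {
	return int(int64(sampleRate) * int64(d) / int64(time.Second))
}

func main() {
	// 30 seconds of 16 kHz mono audio.
	fmt.Println(samplesPerSegment(16000, 30*time.Second)) // 480000

	// The documented minimum segment duration of 100ms.
	fmt.Println(samplesPerSegment(16000, 100*time.Millisecond)) // 1600
}
```

At 16 kHz, a 30-second segment is therefore roughly 480000 samples, or about 1.9 MB as float32 and half that as int16.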

func WithSilenceSize ¶ added in v1.7.7

func WithSilenceSize(v time.Duration) Opt

WithSilenceSize sets the silence duration that triggers a segment boundary. It only takes effect when silence detection is enabled (see WithDefaultSilence and WithSilenceThreshold). The minimum is 100ms (MinSegmentDuration).

func WithSilenceThreshold ¶ added in v1.8.0

func WithSilenceThreshold(threshold float64) Opt

WithSilenceThreshold enables silence detection with a custom threshold. The threshold is the RMS energy level (0.0-1.0) below which audio is considered silence. A typical value is 0.01 (1%).
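To illustrate what an RMS threshold means in practice, here is a minimal sketch that computes the RMS level of a window of float32 samples and compares it with a threshold. The functions rms and isSilent are hypothetical and written only for this example; the window size and the exact comparison the package uses internally are not documented here, so this is an assumption about the general technique rather than the package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// rms returns the root-mean-square level of samples, which are assumed
// to lie in the range [-1.0, 1.0]. An empty window has level 0.
func rms(samples []float32) float64 {
	if len(samples) == 0 {
		return 0
	}
	var sum float64
	for _, s := range samples {
		sum += float64(s) * float64(s)
	}
	return math.Sqrt(sum / float64(len(samples)))
}

// isSilent reports whether the RMS level of the window is below threshold.
// Using a strict "<" here is an assumption made for this sketch.
func isSilent(samples []float32, threshold float64) bool {
	return rms(samples) < threshold
}

func main() {
	quiet := []float32{0.001, -0.002, 0.001, -0.001}
	loud := []float32{0.5, -0.5, 0.5, -0.5}

	fmt.Println(isSilent(quiet, 0.01)) // true
	fmt.Println(isSilent(loud, 0.01))  // false
}
```

A threshold of 0.01 corresponds to a signal about 40 dB below full scale, which is why it is a reasonable starting point for speech recorded in a quiet room; noisier recordings may need a higher value.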

type SegmentFuncFloat32 ¶

type SegmentFuncFloat32 func(timestamp time.Duration, samples []float32) error

SegmentFuncFloat32 is a callback function which is called for each segment of audio samples. The first argument is the timestamp of the segment start. Return nil to continue, io.EOF to stop early, or any other error to abort.

type SegmentFuncInt16 ¶

type SegmentFuncInt16 func(timestamp time.Duration, samples []int16) error

SegmentFuncInt16 is a callback function which is called for each segment of audio samples. The first argument is the timestamp of the segment start. Return nil to continue, io.EOF to stop early, or any other error to abort.
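The return-value contract of the callbacks can be sketched without FFmpeg. In the example below, segmentFunc mirrors the documented SegmentFuncInt16 signature, while deliver and stopAfter are hypothetical stand-ins written for this illustration. In particular, treating io.EOF as a clean stop that yields a nil error is an assumption of the sketch; the documentation above does not state what DecodeInt16 itself returns in that case.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// segmentFunc mirrors the documented SegmentFuncInt16 signature.
type segmentFunc func(timestamp time.Duration, samples []int16) error

// deliver is a hypothetical stand-in for a decode loop. It passes each
// segment to fn along with its start timestamp and returns the number of
// callback invocations. io.EOF from fn stops the loop without an error
// (an assumption of this sketch); any other error aborts and is returned.
func deliver(segments [][]int16, sampleRate int, fn segmentFunc) (int, error) {
	offset := 0 // samples delivered so far
	for i, seg := range segments {
		ts := time.Duration(offset) * time.Second / time.Duration(sampleRate)
		if err := fn(ts, seg); errors.Is(err, io.EOF) {
			return i + 1, nil
		} else if err != nil {
			return i + 1, err
		}
		offset += len(seg)
	}
	return len(segments), nil
}

// stopAfter returns a callback that requests an early stop once it has
// seen n segments.
func stopAfter(n int) segmentFunc {
	seen := 0
	return func(timestamp time.Duration, samples []int16) error {
		seen++
		fmt.Printf("segment at %v: %d samples\n", timestamp, len(samples))
		if seen >= n {
			return io.EOF
		}
		return nil
	}
}

func main() {
	// Three toy segments of four samples each, at a toy rate of 4 Hz.
	segments := [][]int16{{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}}
	calls, err := deliver(segments, 4, stopAfter(2))
	fmt.Println(calls, err)
}
```

Returning io.EOF is useful when only the first part of a file is needed, for example to transcribe a short preview without decoding the rest.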

type Segmenter ¶

type Segmenter struct {
	// contains filtered or unexported fields
}

Segmenter reads audio samples from a reader and segments them into fixed-size chunks. It can be used to process audio samples in chunks for speech recognition, audio analysis, etc.

func New ¶ added in v1.8.0

func New(path string, sampleRate int, opts ...Opt) (*Segmenter, error)

New creates a new segmenter from a file path or URL. The sampleRate is the output sample rate in Hz (e.g., 16000 for speech). The output is always mono (single channel).

Options:

WithSegmentSize(duration) - output segments of approximately this duration
WithDefaultSilence() - also break segments on silence boundaries
WithSilenceThreshold(threshold) - custom silence threshold (0.0-1.0)
WithSilenceSize(duration) - minimum silence duration to trigger a break

func NewFromReader ¶ added in v1.8.0

func NewFromReader(r io.Reader, sampleRate int, opts ...Opt) (*Segmenter, error)

NewFromReader creates a new segmenter from an io.Reader. See New for parameter documentation.

func (*Segmenter) Close ¶

func (s *Segmenter) Close() error

Close releases all resources associated with the segmenter.

func (*Segmenter) DecodeFloat32 ¶

func (s *Segmenter) DecodeFloat32(ctx context.Context, fn SegmentFuncFloat32) error

DecodeFloat32 decodes the audio stream into float32 samples and calls the callback function for each segment. Samples are in the range [-1.0, 1.0].

The "best" audio stream is automatically selected. Output is mono at the configured sample rate.

func (*Segmenter) DecodeInt16 ¶

func (s *Segmenter) DecodeInt16(ctx context.Context, fn SegmentFuncInt16) error

DecodeInt16 decodes the audio stream into int16 samples and calls the callback function for each segment. Samples are in the range [-32768, 32767].

The "best" audio stream is automatically selected. Output is mono at the configured sample rate.
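The two sample ranges, [-1.0, 1.0] for float32 and [-32768, 32767] for int16, are related by a simple scale factor. The sketch below shows one common conversion convention (scaling by 32768 and clamping); int16ToFloat32 and float32ToInt16 are hypothetical helpers, and the package may scale or round differently internally.

```go
package main

import "fmt"

// int16ToFloat32 scales signed 16-bit PCM samples into [-1.0, 1.0).
func int16ToFloat32(in []int16) []float32 {
	out := make([]float32, len(in))
	for i, v := range in {
		out[i] = float32(v) / 32768
	}
	return out
}

// float32ToInt16 scales samples in [-1.0, 1.0] to signed 16-bit PCM,
// clamping values that would overflow and truncating toward zero.
func float32ToInt16(in []float32) []int16 {
	out := make([]int16, len(in))
	for i, v := range in {
		scaled := v * 32768
		switch {
		case scaled > 32767:
			out[i] = 32767
		case scaled < -32768:
			out[i] = -32768
		default:
			out[i] = int16(scaled)
		}
	}
	return out
}

func main() {
	fmt.Println(int16ToFloat32([]int16{-32768, 0, 16384})) // [-1 0 0.5]
	fmt.Println(float32ToInt16([]float32{1, -1, 0.5}))     // [32767 -32768 16384]
}
```

In practice it is simpler to call whichever of DecodeFloat32 or DecodeInt16 matches the consumer's expected format than to convert afterwards.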

func (*Segmenter) Duration ¶

func (s *Segmenter) Duration() time.Duration

Duration returns the total duration of the media stream. Returns zero if duration is unknown.
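Because Duration can return zero, code that reports progress from segment timestamps should guard against dividing by an unknown total. A minimal sketch, using a hypothetical progress helper that is not part of this package:

```go
package main

import (
	"fmt"
	"time"
)

// progress is a hypothetical helper. It returns the fraction of the stream
// that precedes timestamp, capped at 1.0. The second result is false when
// the total duration is unknown (zero or negative).
func progress(timestamp, total time.Duration) (float64, bool) {
	if total <= 0 {
		return 0, false
	}
	f := float64(timestamp) / float64(total)
	if f > 1 {
		f = 1
	}
	return f, true
}

func main() {
	fmt.Println(progress(15*time.Second, time.Minute)) // 0.25 true
	fmt.Println(progress(15*time.Second, 0))           // 0 false
}
```

Since the callback timestamp marks the start of a segment, a progress value computed this way lags by up to one segment length.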

func (*Segmenter) SampleRate ¶ added in v1.8.0

func (s *Segmenter) SampleRate() int

SampleRate returns the configured output sample rate.
