whisper

package
Version: v0.0.2
Published: Jul 26, 2024 | License: Apache-2.0 | Imports: 9 | Imported by: 2

Constants

const (
	SampleRate = C.WHISPER_SAMPLE_RATE                 // Expected sample rate, samples per second
	SampleBits = uint16(unsafe.Sizeof(C.float(0))) * 8 // Sample size in bits
	NumFFT     = C.WHISPER_N_FFT
	HopLength  = C.WHISPER_HOP_LENGTH
	ChunkSize  = C.WHISPER_CHUNK_SIZE
)
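Whisper_full takes samples as []float32, while many audio sources deliver signed 16-bit PCM. The sketch below shows one way to convert such input and to size a buffer for a given duration. The local sampleRate constant is an assumption: it mirrors SampleRate using the value 16000 from whisper.cpp's WHISPER_SAMPLE_RATE, and the helper names are hypothetical, not part of this package.

```go
package main

import "fmt"

// sampleRate mirrors the package's SampleRate constant. The value 16000 is an
// assumption taken from whisper.cpp's WHISPER_SAMPLE_RATE; verify it for your build.
const sampleRate = 16000

// pcm16ToFloat32 converts signed 16-bit PCM samples to float32 values in [-1, 1).
// This is a hypothetical helper, not part of the package.
func pcm16ToFloat32(pcm []int16) []float32 {
	out := make([]float32, len(pcm))
	for i, s := range pcm {
		out[i] = float32(s) / 32768.0
	}
	return out
}

// samplesForDuration returns how many mono samples cover the given number of
// milliseconds at sampleRate.
func samplesForDuration(ms int) int {
	return ms * sampleRate / 1000
}

func main() {
	samples := pcm16ToFloat32([]int16{0, 16384, -32768})
	fmt.Println(samples, samplesForDuration(500))
}
```

Dividing by 32768 keeps the most negative int16 value at exactly -1, at the cost of the largest positive value falling just short of 1.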

Variables

var (
	ErrTokenizerFailed  = errors.New("whisper_tokenize failed")
	ErrAutoDetectFailed = errors.New("whisper_lang_auto_detect failed")
	ErrConversionFailed = errors.New("whisper_convert failed")
	ErrInvalidLanguage  = errors.New("invalid language")
)

Functions

func NewClient

func NewClient(abspath string) *client

Create a new client with the specified root URL to the models

func Whisper_lang_max_id

func Whisper_lang_max_id() int

Largest language id (i.e. number of available languages - 1)

func Whisper_lang_str

func Whisper_lang_str(id int) string

Return the short string for the specified language id (e.g. 2 -> "de"), or an empty string if the id is not found.
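Whisper_lang_max_id and Whisper_lang_str can be combined to enumerate every language the model knows. The following is a minimal sketch of that loop; it takes the two values as parameters so it runs without the cgo library, and fakeLangStr is a hypothetical stand-in for Whisper_lang_str.

```go
package main

import "fmt"

// languageTable maps each short language string to its id by probing ids
// 0..maxID inclusive. maxID stands in for Whisper_lang_max_id() and str for
// Whisper_lang_str; both are parameters so the sketch needs no cgo.
func languageTable(maxID int, str func(id int) string) map[string]int {
	table := make(map[string]int)
	for id := 0; id <= maxID; id++ {
		if s := str(id); s != "" { // an empty string means "not found"
			table[s] = id
		}
	}
	return table
}

// fakeLangStr is a hypothetical stand-in for Whisper_lang_str that knows
// only three languages.
func fakeLangStr(id int) string {
	names := []string{"en", "zh", "de"}
	if id < 0 || id >= len(names) {
		return ""
	}
	return names[id]
}

func main() {
	fmt.Println(languageTable(2, fakeLangStr))
}
```

With the real package, the call would presumably be languageTable(Whisper_lang_max_id(), Whisper_lang_str).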

func Whisper_log_set

func Whisper_log_set(fn func(level LogLevel, text string))

Set logging output

Types

type Client

type Client interface {
	Get(ctx context.Context, w io.Writer, path string) error
}

type Context

type Context C.struct_whisper_context

func Whisper_init

func Whisper_init(path string) *Context

Allocates all memory needed for the model and loads the model from the given file. Returns nil on failure.

func (*Context) Whisper_free

func (ctx *Context) Whisper_free()

Frees all memory allocated by the model.

func (*Context) Whisper_full

func (ctx *Context) Whisper_full(
	params *Params,
	samples []float32,
	encoderBeginCallback func() bool,
	newSegmentCallback func(int),
	progressCallback func(int),
) error

Run the entire model: PCM -> log mel spectrogram -> encoder -> decoder -> text. Uses the specified decoding strategy to obtain the text.
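The encoderBeginCallback parameter has the shape func() bool. One possible use is to tie it to a context.Context so a long transcription can be stopped from outside. The sketch below assumes that returning false asks the library to abort, as the corresponding callback does in whisper.cpp; that behaviour is not stated on this page, and the helper names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// continueUntilDone returns a func() bool that could be passed as
// encoderBeginCallback. It reports true while ctx is live and false once ctx
// is cancelled or past its deadline.
// Assumption: a false return tells the library to stop, as in whisper.cpp.
func continueUntilDone(ctx context.Context) func() bool {
	return func() bool { return ctx.Err() == nil }
}

// demo calls the callback once before and once after cancelling the context.
func demo() (before, after bool) {
	ctx, cancel := context.WithCancel(context.Background())
	cb := continueUntilDone(ctx)
	before = cb()
	cancel()
	after = cb()
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```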

func (*Context) Whisper_full_get_segment_t0

func (ctx *Context) Whisper_full_get_segment_t0(segment int) int64

Get the start time of the specified segment.

func (*Context) Whisper_full_get_segment_t1

func (ctx *Context) Whisper_full_get_segment_t1(segment int) int64

Get the end time of the specified segment.

func (*Context) Whisper_full_get_segment_text

func (ctx *Context) Whisper_full_get_segment_text(segment int) string

Get the text of the specified segment.
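After Whisper_full returns, the segment accessors are typically read in a loop bounded by Whisper_full_n_segments. The sketch below shows that loop against a small interface listing only the *Context methods it uses, so it can run with a fake. The 10 ms timestamp unit is an assumption carried over from whisper.cpp and is not stated on this page; segmentReader, segment and fakeContext are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// segmentReader lists the *Context methods this sketch relies on.
type segmentReader interface {
	Whisper_full_n_segments() int
	Whisper_full_get_segment_t0(segment int) int64
	Whisper_full_get_segment_t1(segment int) int64
	Whisper_full_get_segment_text(segment int) string
}

// tick is the assumed unit of t0/t1 values (10 ms, following whisper.cpp).
const tick = 10 * time.Millisecond

// segment is a hypothetical plain-Go copy of one result segment.
type segment struct {
	Start, End time.Duration
	Text       string
}

// collectSegments copies every segment out of r in order.
func collectSegments(r segmentReader) []segment {
	var segs []segment
	for i := 0; i < r.Whisper_full_n_segments(); i++ {
		segs = append(segs, segment{
			Start: time.Duration(r.Whisper_full_get_segment_t0(i)) * tick,
			End:   time.Duration(r.Whisper_full_get_segment_t1(i)) * tick,
			Text:  r.Whisper_full_get_segment_text(i),
		})
	}
	return segs
}

// fakeContext is a hypothetical stand-in for *Context holding canned results.
type fakeContext struct {
	t0, t1 []int64
	text   []string
}

func (f fakeContext) Whisper_full_n_segments() int                { return len(f.text) }
func (f fakeContext) Whisper_full_get_segment_t0(i int) int64     { return f.t0[i] }
func (f fakeContext) Whisper_full_get_segment_t1(i int) int64     { return f.t1[i] }
func (f fakeContext) Whisper_full_get_segment_text(i int) string  { return f.text[i] }

func main() {
	ctx := fakeContext{t0: []int64{0, 150}, t1: []int64{150, 320}, text: []string{" Hello", " world"}}
	for _, s := range collectSegments(ctx) {
		fmt.Printf("[%v -> %v]%s\n", s.Start, s.End, s.Text)
	}
}
```

Copying results into plain Go values means they stay usable after the context is freed with Whisper_free.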

func (*Context) Whisper_full_get_token_data

func (ctx *Context) Whisper_full_get_token_data(segment int, token int) TokenData

Get token data for the specified token in the specified segment. This contains probabilities, timestamps, etc.

func (*Context) Whisper_full_get_token_id

func (ctx *Context) Whisper_full_get_token_id(segment int, token int) Token

Get the token id at the specified token index in the specified segment.

func (*Context) Whisper_full_get_token_p

func (ctx *Context) Whisper_full_get_token_p(segment int, token int) float32

Get the probability of the specified token in the specified segment.

func (*Context) Whisper_full_get_token_text

func (ctx *Context) Whisper_full_get_token_text(segment int, token int) string

Get the token text of the specified token index in the specified segment.

func (*Context) Whisper_full_lang_id

func (ctx *Context) Whisper_full_lang_id() int

Return the id of the autodetected language, or -1 if not found. Added to whisper.cpp in https://github.com/ggerganov/whisper.cpp/commit/a1c1583cc7cd8b75222857afc936f0638c5683d6

Example: a return value of 2 corresponds to German ("de"), the same numbering used by Whisper_lang_id.

func (*Context) Whisper_full_n_segments

func (ctx *Context) Whisper_full_n_segments() int

Number of generated text segments. A segment can be a few words, a sentence, or even a paragraph.

func (*Context) Whisper_full_n_tokens

func (ctx *Context) Whisper_full_n_tokens(segment int) int

Get number of tokens in the specified segment.
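The per-token accessors take a segment index and a token index, with Whisper_full_n_tokens giving the upper bound for the latter. As an illustration, the sketch below averages token probabilities over one segment, which could serve as a rough confidence score. It works against a small interface so it runs without cgo; tokenReader, meanTokenProb and fakeTokens are hypothetical names.

```go
package main

import "fmt"

// tokenReader lists the *Context methods this sketch relies on.
type tokenReader interface {
	Whisper_full_n_tokens(segment int) int
	Whisper_full_get_token_p(segment int, token int) float32
}

// meanTokenProb returns the arithmetic mean of the token probabilities in the
// given segment. The boolean is false when the segment has no tokens.
func meanTokenProb(r tokenReader, segment int) (float32, bool) {
	n := r.Whisper_full_n_tokens(segment)
	if n <= 0 {
		return 0, false
	}
	var sum float32
	for i := 0; i < n; i++ {
		sum += r.Whisper_full_get_token_p(segment, i)
	}
	return sum / float32(n), true
}

// fakeTokens is a hypothetical stand-in for *Context: one slice of
// probabilities per segment.
type fakeTokens [][]float32

func (f fakeTokens) Whisper_full_n_tokens(s int) int              { return len(f[s]) }
func (f fakeTokens) Whisper_full_get_token_p(s, t int) float32    { return f[s][t] }

func main() {
	toks := fakeTokens{{0.5, 1.0}, {}}
	mean, ok := meanTokenProb(toks, 0)
	fmt.Println(mean, ok)
}
```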

func (*Context) Whisper_lang_id

func (ctx *Context) Whisper_lang_id(lang string) int

Return the id of the specified language, or -1 if not found. Examples:

"de" -> 2
"german" -> 2

type LogLevel

type LogLevel C.enum_ggml_log_level
const (
	LogLevelDebug LogLevel = C.GGML_LOG_LEVEL_DEBUG
	LogLevelInfo  LogLevel = C.GGML_LOG_LEVEL_INFO
	LogLevelWarn  LogLevel = C.GGML_LOG_LEVEL_WARN
	LogLevelError LogLevel = C.GGML_LOG_LEVEL_ERROR
)

func (LogLevel) String

func (v LogLevel) String() string

type Params

func NewParams

func NewParams(strategy SamplingStrategy) *Params

Returns new default parameters. Call Close() to free memory associated with the parameters.

func (*Params) Close

func (p *Params) Close()

func (*Params) Language

func (p *Params) Language() int

Get language id

func (*Params) MarshalJSON

func (p *Params) MarshalJSON() ([]byte, error)

func (*Params) SetAudioCtx

func (p *Params) SetAudioCtx(n int)

Set audio encoder context

func (*Params) SetDuration

func (p *Params) SetDuration(duration_ms int)

Set audio duration to process in ms

func (*Params) SetInitialPrompt

func (p *Params) SetInitialPrompt(prompt string)

Set initial prompt

func (*Params) SetLanguage

func (p *Params) SetLanguage(lang int) error

Set language id

func (*Params) SetMaxSegmentLength

func (p *Params) SetMaxSegmentLength(n int)

Set max segment length in characters

func (*Params) SetMaxTokensPerSegment

func (p *Params) SetMaxTokensPerSegment(n int)

Set max tokens per segment (0 = no limit)

func (*Params) SetNoContext

func (p *Params) SetNoContext(v bool)

func (*Params) SetOffset

func (p *Params) SetOffset(offset_ms int)

Set start offset in ms
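SetOffset and SetDuration both take plain integers in milliseconds, which is easy to get wrong when the caller works with time.Duration. A minimal sketch of a conversion helper follows. It targets a small interface with the same two method signatures so it can run against a recorder; windowSetter, setWindow and recorder are hypothetical names, not part of this package.

```go
package main

import (
	"fmt"
	"time"
)

// windowSetter lists the *Params methods this sketch relies on.
type windowSetter interface {
	SetOffset(offset_ms int)
	SetDuration(duration_ms int)
}

// setWindow restricts processing to the span [start, start+length),
// converting both values to whole milliseconds (fractions are truncated).
func setWindow(p windowSetter, start, length time.Duration) {
	p.SetOffset(int(start / time.Millisecond))
	p.SetDuration(int(length / time.Millisecond))
}

// recorder is a hypothetical stand-in for *Params that remembers what was set.
type recorder struct{ offset, duration int }

func (r *recorder) SetOffset(ms int)   { r.offset = ms }
func (r *recorder) SetDuration(ms int) { r.duration = ms }

func main() {
	r := &recorder{}
	setWindow(r, 90*time.Second, 30*time.Second)
	fmt.Println(r.offset, r.duration)
}
```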

func (*Params) SetPrintProgress

func (p *Params) SetPrintProgress(v bool)

func (*Params) SetPrintRealtime

func (p *Params) SetPrintRealtime(v bool)

func (*Params) SetPrintSpecial

func (p *Params) SetPrintSpecial(v bool)

func (*Params) SetPrintTimestamps

func (p *Params) SetPrintTimestamps(v bool)

func (*Params) SetSingleSegment

func (p *Params) SetSingleSegment(v bool)

func (*Params) SetSplitOnWord

func (p *Params) SetSplitOnWord(v bool)

func (*Params) SetThreads

func (p *Params) SetThreads(threads int)

Set number of threads to use

func (*Params) SetTokenSumThreshold

func (p *Params) SetTokenSumThreshold(t float32)

Set timestamp token sum probability threshold (~0.01)

func (*Params) SetTokenThreshold

func (p *Params) SetTokenThreshold(t float32)

Set timestamp token probability threshold (~0.01)

func (*Params) SetTokenTimestamps

func (p *Params) SetTokenTimestamps(b bool)

func (*Params) SetTranslate

func (p *Params) SetTranslate(v bool)

func (*Params) String

func (p *Params) String() string

func (*Params) Threads

func (p *Params) Threads() int

Threads available

type SamplingStrategy

type SamplingStrategy C.enum_whisper_sampling_strategy
const (
	SAMPLING_GREEDY      SamplingStrategy = C.WHISPER_SAMPLING_GREEDY      // similar to OpenAI's GreedyDecoder
	SAMPLING_BEAM_SEARCH SamplingStrategy = C.WHISPER_SAMPLING_BEAM_SEARCH // similar to OpenAI's BeamSearchDecoder
)

func (SamplingStrategy) MarshalJSON

func (v SamplingStrategy) MarshalJSON() ([]byte, error)

func (SamplingStrategy) String

func (v SamplingStrategy) String() string

type Token

type Token C.whisper_token

type TokenData

type TokenData C.struct_whisper_token_data

func (*TokenData) MarshalJSON

func (t *TokenData) MarshalJSON() ([]byte, error)

func (TokenData) String

func (t TokenData) String() string
