Documentation
Constants
const (
	// DefaultOpenAITranscriptionModel is the built-in OpenAI transcription model.
	DefaultOpenAITranscriptionModel = "whisper-1"
	// DefaultGeminiTranscriptionModel is the built-in Gemini transcription model.
	DefaultGeminiTranscriptionModel = "gemini-2.5-flash"
)
Variables
var (
	// ErrProviderNotFound indicates that a requested provider ID does not exist.
	ErrProviderNotFound = errors.New("AI provider not found")
	// ErrCapabilityUnsupported indicates that the provider does not support the requested capability.
	ErrCapabilityUnsupported = errors.New("AI provider capability unsupported")
	// ErrSTTNotSupported indicates that the provider does not have a dedicated
	// speech-to-text endpoint. Use the audiollm package for multimodal audio
	// understanding when this is returned.
	ErrSTTNotSupported = errors.New("provider does not support speech-to-text capability")
	// ErrAudioLLMNotSupported indicates that the provider does not have a
	// multimodal-audio LLM available in this codebase.
	ErrAudioLLMNotSupported = errors.New("provider does not support multimodal audio capability")
)
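Because these are sentinel errors created with errors.New, callers can match them with errors.Is even after they have been wrapped. The sketch below redeclares two of the sentinels locally so it compiles on its own. The helpers nextStep and wrapForProvider are hypothetical and are not part of this package; they only illustrate the fallback to audiollm that the ErrSTTNotSupported comment recommends.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented sentinels so the sketch is self-contained.
var (
	ErrSTTNotSupported      = errors.New("provider does not support speech-to-text capability")
	ErrAudioLLMNotSupported = errors.New("provider does not support multimodal audio capability")
)

// wrapForProvider is a hypothetical helper that adds context while keeping
// the sentinel reachable through errors.Is (via the %w verb).
func wrapForProvider(providerID string, err error) error {
	return fmt.Errorf("provider %q: %w", providerID, err)
}

// nextStep is a hypothetical helper that decides what a caller might do
// after attempting to set up speech-to-text.
func nextStep(err error) string {
	switch {
	case err == nil:
		return "use speech-to-text"
	case errors.Is(err, ErrSTTNotSupported):
		// The package comment suggests audiollm in this case.
		return "fall back to audiollm"
	case errors.Is(err, ErrAudioLLMNotSupported):
		return "no audio capability available"
	default:
		return "report error"
	}
}

func main() {
	fmt.Println(nextStep(nil))
	fmt.Println(nextStep(wrapForProvider("gemini-main", ErrSTTNotSupported)))
	fmt.Println(nextStep(wrapForProvider("other", ErrAudioLLMNotSupported)))
	fmt.Println(nextStep(errors.New("network unreachable")))
}
```

Matching with errors.Is rather than == keeps the check working if an intermediate layer wraps the error with extra context.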
Functions
func DefaultTranscriptionModel
func DefaultTranscriptionModel(providerType ProviderType) (string, error)
DefaultTranscriptionModel returns the built-in transcription model for a provider.
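The function body is not shown on this page, so the following is only a minimal sketch of how such a lookup might behave. It assumes a switch over the two documented provider types and an error for anything else; the lowercase defaultTranscriptionModel is a hypothetical stand-in, and the error text for unknown types is invented for illustration.

```go
package main

import "fmt"

// ProviderType and the constants below mirror the documented declarations;
// they are redeclared here so the sketch compiles without the real package.
type ProviderType string

const (
	ProviderOpenAI ProviderType = "OPENAI"
	ProviderGemini ProviderType = "GEMINI"
)

const (
	DefaultOpenAITranscriptionModel = "whisper-1"
	DefaultGeminiTranscriptionModel = "gemini-2.5-flash"
)

// defaultTranscriptionModel is a hypothetical stand-in for the documented
// DefaultTranscriptionModel; the real implementation may differ.
func defaultTranscriptionModel(providerType ProviderType) (string, error) {
	switch providerType {
	case ProviderOpenAI:
		return DefaultOpenAITranscriptionModel, nil
	case ProviderGemini:
		return DefaultGeminiTranscriptionModel, nil
	default:
		// Assumption: unknown types yield an error. The actual error
		// value returned by the package is not documented here.
		return "", fmt.Errorf("no default transcription model for provider type %q", providerType)
	}
}

func main() {
	for _, pt := range []ProviderType{ProviderOpenAI, ProviderGemini, "OTHER"} {
		model, err := defaultTranscriptionModel(pt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", pt, model)
	}
}
```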
Types
type ProviderConfig
type ProviderConfig struct {
	ID       string
	Title    string
	Type     ProviderType
	Endpoint string
	APIKey   string
}
ProviderConfig configures a callable AI provider connection.
func FindProvider
func FindProvider(providers []ProviderConfig, providerID string) (*ProviderConfig, error)
FindProvider returns the provider with the given ID.
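A lookup like this can be sketched as a linear scan over the slice. The version below is a hypothetical stand-in named findProvider, not the package's own code. It assumes that a missing ID is reported with ErrProviderNotFound (consistent with that error's comment, but not confirmed by this page) and that the returned pointer refers to the element inside the caller's slice. The endpoint and API key values are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// Local mirrors of the documented declarations so the sketch is self-contained.
type ProviderType string

const (
	ProviderOpenAI ProviderType = "OPENAI"
	ProviderGemini ProviderType = "GEMINI"
)

type ProviderConfig struct {
	ID       string
	Title    string
	Type     ProviderType
	Endpoint string
	APIKey   string
}

var ErrProviderNotFound = errors.New("AI provider not found")

// findProvider is a hypothetical stand-in for the documented FindProvider.
func findProvider(providers []ProviderConfig, providerID string) (*ProviderConfig, error) {
	// Index-based loop so the returned pointer refers to the slice element
	// itself rather than to a copy of it.
	for i := range providers {
		if providers[i].ID == providerID {
			return &providers[i], nil
		}
	}
	// Assumption: a missing ID maps to ErrProviderNotFound.
	return nil, ErrProviderNotFound
}

func main() {
	providers := []ProviderConfig{
		{ID: "primary", Title: "OpenAI", Type: ProviderOpenAI, Endpoint: "https://example.invalid/v1", APIKey: "placeholder"},
		{ID: "backup", Title: "Gemini", Type: ProviderGemini, Endpoint: "https://example.invalid/v1beta", APIKey: "placeholder"},
	}

	if p, err := findProvider(providers, "backup"); err == nil {
		fmt.Println("found:", p.Title, p.Type)
	}

	if _, err := findProvider(providers, "missing"); errors.Is(err, ErrProviderNotFound) {
		fmt.Println("lookup failed:", err)
	}
}
```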
type ProviderType
type ProviderType string
ProviderType identifies an AI provider implementation.
const (
	// ProviderOpenAI is OpenAI's hosted API.
	ProviderOpenAI ProviderType = "OPENAI"
	// ProviderGemini is Google's Gemini API.
	ProviderGemini ProviderType = "GEMINI"
)
Directories
| Path | Synopsis |
|---|---|
| audio | Package audio provides audio container/codec helpers for AI providers. |
| audiollm | Package audiollm defines the multimodal-audio capability for AI providers. |
| audiollm/gemini | Package gemini implements audiollm.Model against the Gemini generateContent endpoint. |
| stt | Package stt defines the speech-to-text capability for AI providers. |
| stt/openai | Package openai implements stt.Transcriber against the OpenAI /audio/transcriptions endpoint (and any compatible third-party endpoint such as Groq Whisper, faster-whisper self-hosted, or Azure Whisper). |