package embeddings

v0.4.0 (latest) | Published: May 5, 2026 | License: MIT | Imports: 13 | Imported by: 0

Repository

github.com/GrayCodeAI/yaad

Documentation ¶

Overview ¶

Package embeddings provides a pluggable embedding provider interface. Supported providers are OpenAI, Voyage AI, and a local stub that returns hash-based pseudo-vectors.

Index ¶

func Cosine(a, b []float32) float32
func ExtractRetryDelay(errMsg string) time.Duration
func GetInputType(model string, mode EmbedMode) string
func GetMaxBatchSize(model string) int
type EmbedMode
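type EmbeddingMemo
- func NewEmbeddingMemo(maxEntries int) *EmbeddingMemo
- func (m *EmbeddingMemo) Get(content string) ([]float32, bool)
- func (m *EmbeddingMemo) Len() int
- func (m *EmbeddingMemo) Put(content string, embedding []float32)
type MemoizedProvider
- func NewMemoizedProvider(inner Provider, maxEntries int) *MemoizedProvider
- func (p *MemoizedProvider) Dims() int
- func (p *MemoizedProvider) Embed(ctx context.Context, text string) ([]float32, error)
- func (p *MemoizedProvider) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
- func (p *MemoizedProvider) EmbedWithMode(ctx context.Context, text string, mode EmbedMode) ([]float32, error)
- func (p *MemoizedProvider) Memo() *EmbeddingMemo
- func (p *MemoizedProvider) Name() string
type ModelDefaults
- func GetModelDefaults(model string) (ModelDefaults, bool)
type Pacer
- func NewPacer(minInterval time.Duration) *Pacer
- func (p *Pacer) SetInterval(d time.Duration)
- func (p *Pacer) Wait()
type Provider
- func NewLocal() Provider
- func NewOpenAI(apiKey, model string) Provider
- func NewVoyage(apiKey, model string) Provider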

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Cosine ¶

func Cosine(a, b []float32) float32

Cosine returns the cosine similarity between two vectors.
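
For readers unfamiliar with the measure, cosine similarity is the dot product of two vectors divided by the product of their lengths. The following is a minimal sketch of that formula, not the package's source. The name cosineSketch is hypothetical, and returning 0 for mismatched lengths or zero vectors is an assumption, since the documentation does not say how Cosine treats those inputs.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSketch is a hypothetical stand-in for Cosine. Returning 0 for
// mismatched lengths, empty input, or zero vectors is an assumption.
func cosineSketch(a, b []float32) float32 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(normA) * math.Sqrt(normB)))
}

func main() {
	fmt.Println(cosineSketch([]float32{1, 0}, []float32{1, 0}))  // same direction
	fmt.Println(cosineSketch([]float32{1, 0}, []float32{0, 1}))  // orthogonal
	fmt.Println(cosineSketch([]float32{1, 0}, []float32{-1, 0})) // opposite direction
}
```

Accumulating in float64 is a design choice of this sketch that reduces rounding error on long vectors.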

func ExtractRetryDelay ¶

func ExtractRetryDelay(errMsg string) time.Duration

ExtractRetryDelay parses rate-limit error messages for the recommended wait time. It recognises patterns such as "Please try again in 1.5s", "retry in 500ms", and "try again in 2 seconds". Returns 0 if no delay is found.
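
One plausible way to implement this kind of parsing is a single regular expression followed by a unit conversion. The sketch below covers only the three documented message shapes. The pattern, the name extractRetryDelaySketch, and the sample error messages are all assumptions, since the package's actual expression is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// retryDelayRe is an assumed pattern: "try again in" or "retry in", a
// decimal number, then a millisecond or second unit.
var retryDelayRe = regexp.MustCompile(
	`(?i)(?:try again|retry) in (\d+(?:\.\d+)?)\s*(ms|milliseconds?|s|sec|seconds?)\b`)

// extractRetryDelaySketch is a hypothetical stand-in for ExtractRetryDelay.
// It returns 0 when no delay is found, as the documentation describes.
func extractRetryDelaySketch(errMsg string) time.Duration {
	m := retryDelayRe.FindStringSubmatch(errMsg)
	if m == nil {
		return 0
	}
	value, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0
	}
	unit := time.Second
	if strings.HasPrefix(strings.ToLower(m[2]), "m") {
		unit = time.Millisecond
	}
	return time.Duration(value * float64(unit))
}

func main() {
	// The messages below are made-up examples, not captured API responses.
	for _, msg := range []string{
		"Rate limit reached. Please try again in 1.5s",
		"429: retry in 500ms",
		"Too many requests, try again in 2 seconds",
		"internal server error",
	} {
		fmt.Printf("%q -> %v\n", msg, extractRetryDelaySketch(msg))
	}
}
```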

func GetInputType ¶

func GetInputType(model string, mode EmbedMode) string

GetInputType returns the appropriate input_type string for the given model and embed mode. If the model has no asymmetric input types, an empty string is returned.

func GetMaxBatchSize ¶

func GetMaxBatchSize(model string) int

GetMaxBatchSize returns the maximum batch size for a model, or 64 as the default when the model is unknown.
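
A caller would typically use this value to split a large input into request-sized chunks. The sketch below illustrates that idea under stated assumptions: the table entry "example-model" and the helpers maxBatchSizeSketch and splitBatches are hypothetical, and only the fallback of 64 comes from the documentation.

```go
package main

import "fmt"

// defaultMaxBatchSize is the documented fallback for unknown models.
const defaultMaxBatchSize = 64

// maxBatchSizes is a hypothetical table; the real per-model limits are
// not listed in this documentation.
var maxBatchSizes = map[string]int{"example-model": 2}

// maxBatchSizeSketch is a hypothetical stand-in for GetMaxBatchSize.
func maxBatchSizeSketch(model string) int {
	if n, ok := maxBatchSizes[model]; ok && n > 0 {
		return n
	}
	return defaultMaxBatchSize
}

// splitBatches cuts texts into consecutive chunks of at most size items.
func splitBatches(texts []string, size int) [][]string {
	if size <= 0 {
		size = defaultMaxBatchSize
	}
	var batches [][]string
	for start := 0; start < len(texts); start += size {
		end := start + size
		if end > len(texts) {
			end = len(texts)
		}
		batches = append(batches, texts[start:end])
	}
	return batches
}

func main() {
	texts := []string{"a", "b", "c", "d", "e"}
	for i, batch := range splitBatches(texts, maxBatchSizeSketch("example-model")) {
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

Each chunk could then be passed to a provider's EmbedBatch method in turn.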

Types ¶

type EmbedMode ¶

type EmbedMode int

EmbedMode specifies whether a text is a document or a query for asymmetric embedding models.

const (
	// ModeDocument indicates the text is a document to be stored.
	ModeDocument EmbedMode = iota
	// ModeQuery indicates the text is a search query.
	ModeQuery
)

type EmbeddingMemo ¶

type EmbeddingMemo struct {
	// contains filtered or unexported fields
}

EmbeddingMemo caches embeddings by content hash to skip re-embedding unchanged content.

func NewEmbeddingMemo ¶

func NewEmbeddingMemo(maxEntries int) *EmbeddingMemo

NewEmbeddingMemo creates a memo cache with the given max entry count.

func (*EmbeddingMemo) Get ¶

func (m *EmbeddingMemo) Get(content string) ([]float32, bool)

Get returns a cached embedding for the content, if present.

func (*EmbeddingMemo) Len ¶

func (m *EmbeddingMemo) Len() int

Len returns the number of cached entries.

func (*EmbeddingMemo) Put ¶

func (m *EmbeddingMemo) Put(content string, embedding []float32)

Put stores an embedding for the given content, evicting the oldest entry if at capacity.
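
The behaviour described above (lookup by content hash, oldest entry evicted at capacity) can be sketched with a map plus an insertion-order slice. This is an illustration, not the package's implementation: the type memoSketch is hypothetical, and the choice of SHA-256 as the content hash, the mutex, and the handling of a repeated Put are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// memoSketch is a hypothetical stand-in for EmbeddingMemo.
type memoSketch struct {
	mu      sync.Mutex // assumption: the real type may or may not lock
	max     int
	entries map[[32]byte][]float32
	order   [][32]byte // keys in insertion order, oldest first
}

func newMemoSketch(maxEntries int) *memoSketch {
	return &memoSketch{max: maxEntries, entries: make(map[[32]byte][]float32)}
}

// Get looks up the embedding stored under the hash of content.
func (m *memoSketch) Get(content string) ([]float32, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	v, ok := m.entries[sha256.Sum256([]byte(content))]
	return v, ok
}

// Put stores the embedding, evicting the oldest entry when full.
func (m *memoSketch) Put(content string, embedding []float32) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.max <= 0 {
		return
	}
	key := sha256.Sum256([]byte(content))
	if _, exists := m.entries[key]; exists {
		m.entries[key] = embedding // overwrite without changing age
		return
	}
	if len(m.order) >= m.max {
		delete(m.entries, m.order[0])
		m.order = m.order[1:]
	}
	m.entries[key] = embedding
	m.order = append(m.order, key)
}

// Len reports the number of cached entries.
func (m *memoSketch) Len() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.entries)
}

func main() {
	memo := newMemoSketch(2)
	memo.Put("a", []float32{1})
	memo.Put("b", []float32{2})
	memo.Put("c", []float32{3}) // evicts "a", the oldest entry
	_, hasA := memo.Get("a")
	_, hasC := memo.Get("c")
	fmt.Println(memo.Len(), hasA, hasC)
}
```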

type MemoizedProvider ¶

type MemoizedProvider struct {
	// contains filtered or unexported fields
}

MemoizedProvider wraps a Provider with content-addressed memoization.

func NewMemoizedProvider ¶

func NewMemoizedProvider(inner Provider, maxEntries int) *MemoizedProvider

NewMemoizedProvider wraps an existing Provider with a memo cache.
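
The wrapping pattern can be illustrated with a much smaller interface than Provider. In the sketch below, embedder, countingEmbedder, memoizedSketch, and demo are hypothetical names. The cache is an unbounded map keyed by the text itself, which simplifies away the maxEntries limit and content hashing that the real type is documented to use. Not caching errors is also an assumption.

```go
package main

import (
	"context"
	"fmt"
)

// embedder is a reduced, hypothetical version of the Provider interface.
type embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// countingEmbedder is a fake inner provider that records how often it runs.
type countingEmbedder struct{ calls int }

func (c *countingEmbedder) Embed(_ context.Context, text string) ([]float32, error) {
	c.calls++
	return []float32{float32(len(text))}, nil
}

// memoizedSketch forwards to inner only on a cache miss.
type memoizedSketch struct {
	inner embedder
	cache map[string][]float32
}

func newMemoizedSketch(inner embedder) *memoizedSketch {
	return &memoizedSketch{inner: inner, cache: make(map[string][]float32)}
}

func (p *memoizedSketch) Embed(ctx context.Context, text string) ([]float32, error) {
	if v, ok := p.cache[text]; ok {
		return v, nil
	}
	v, err := p.inner.Embed(ctx, text)
	if err != nil {
		return nil, err // assumption: failures are not cached
	}
	p.cache[text] = v
	return v, nil
}

// demo embeds the same text twice and reports how often inner was called.
func demo() (innerCalls int, sameVector bool) {
	inner := &countingEmbedder{}
	p := newMemoizedSketch(inner)
	ctx := context.Background()
	first, _ := p.Embed(ctx, "hello")
	second, _ := p.Embed(ctx, "hello")
	return inner.calls, len(first) == len(second) && first[0] == second[0]
}

func main() {
	calls, same := demo()
	fmt.Println(calls, same)
}
```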

func (*MemoizedProvider) Dims ¶

func (p *MemoizedProvider) Dims() int

func (*MemoizedProvider) Embed ¶

func (p *MemoizedProvider) Embed(ctx context.Context, text string) ([]float32, error)

func (*MemoizedProvider) EmbedBatch ¶

func (p *MemoizedProvider) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)

func (*MemoizedProvider) EmbedWithMode ¶

func (p *MemoizedProvider) EmbedWithMode(ctx context.Context, text string, mode EmbedMode) ([]float32, error)

func (*MemoizedProvider) Memo ¶

func (p *MemoizedProvider) Memo() *EmbeddingMemo

Memo returns the underlying cache for inspection/testing.

func (*MemoizedProvider) Name ¶

func (p *MemoizedProvider) Name() string

type ModelDefaults ¶

type ModelDefaults struct {
	IndexInputType string // e.g., "search_document", "RETRIEVAL_DOCUMENT"
	QueryInputType string // e.g., "search_query", "RETRIEVAL_QUERY"
	Dimensions     int    // output dimensions
	MaxBatchSize   int    // max texts per batch request
}

ModelDefaults holds the default embedding parameters for a single model: its asymmetric input types, output dimensions, and maximum batch size.

func GetModelDefaults ¶

func GetModelDefaults(model string) (ModelDefaults, bool)

GetModelDefaults returns the defaults for a known model. The second return value is false when the model is not in the table.
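
To show how a defaults table, GetModelDefaults, and GetInputType could fit together, here is a minimal sketch. The model names and numeric values in the table are invented for illustration, all lower-case identifiers are hypothetical, and returning an empty string for an unknown model is an assumption. Only the field names and the "search_document" and "search_query" strings come from the documentation above.

```go
package main

import "fmt"

type embedMode int

const (
	modeDocument embedMode = iota // text to be stored
	modeQuery                     // text used as a search query
)

// modelDefaults mirrors the documented ModelDefaults fields.
type modelDefaults struct {
	IndexInputType string
	QueryInputType string
	Dimensions     int
	MaxBatchSize   int
}

// defaultsTable is hypothetical; the real table contents are not shown
// in this documentation.
var defaultsTable = map[string]modelDefaults{
	"example-asymmetric-model": {
		IndexInputType: "search_document",
		QueryInputType: "search_query",
		Dimensions:     1024,
		MaxBatchSize:   128,
	},
	"example-symmetric-model": {Dimensions: 1536, MaxBatchSize: 64},
}

// getModelDefaultsSketch reports false when the model is not in the table.
func getModelDefaultsSketch(model string) (modelDefaults, bool) {
	d, ok := defaultsTable[model]
	return d, ok
}

// getInputTypeSketch picks the input type for the mode, or "" when the
// model has no asymmetric input types (or, by assumption, is unknown).
func getInputTypeSketch(model string, mode embedMode) string {
	d, ok := getModelDefaultsSketch(model)
	if !ok {
		return ""
	}
	if mode == modeQuery {
		return d.QueryInputType
	}
	return d.IndexInputType
}

func main() {
	fmt.Printf("%q\n", getInputTypeSketch("example-asymmetric-model", modeDocument))
	fmt.Printf("%q\n", getInputTypeSketch("example-asymmetric-model", modeQuery))
	fmt.Printf("%q\n", getInputTypeSketch("example-symmetric-model", modeQuery))
}
```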

type Pacer ¶

type Pacer struct {
	// contains filtered or unexported fields
}

Pacer enforces minimum intervals between API requests to stay under rate limits.

func NewPacer ¶

func NewPacer(minInterval time.Duration) *Pacer

NewPacer creates a Pacer with the given minimum interval between requests.

func (*Pacer) SetInterval ¶

func (p *Pacer) SetInterval(d time.Duration)

SetInterval adjusts the pacer's minimum interval dynamically.

func (*Pacer) Wait ¶

func (p *Pacer) Wait()

Wait blocks until the next request is allowed according to the pacer's minimum interval.
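
A pacer of this kind can be built from a timestamp of the last request and a sleep for whatever remains of the interval. The sketch below is one possible shape, not the package's code. The type pacerSketch and the helper elapsedMillis are hypothetical, and holding the mutex while sleeping (so concurrent callers queue up) is a design assumption.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pacerSketch is a hypothetical stand-in for Pacer.
type pacerSketch struct {
	mu          sync.Mutex
	minInterval time.Duration
	last        time.Time // time the previous Wait returned
}

func newPacerSketch(minInterval time.Duration) *pacerSketch {
	return &pacerSketch{minInterval: minInterval}
}

// SetInterval changes the minimum spacing for subsequent Wait calls.
func (p *pacerSketch) SetInterval(d time.Duration) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.minInterval = d
}

// Wait sleeps until minInterval has passed since the previous Wait
// returned. The first call returns immediately.
func (p *pacerSketch) Wait() {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.last.IsZero() {
		if remaining := p.minInterval - time.Since(p.last); remaining > 0 {
			time.Sleep(remaining)
		}
	}
	p.last = time.Now()
}

// elapsedMillis runs Wait the given number of times and reports the
// total elapsed time in milliseconds.
func elapsedMillis(calls int, intervalMillis int64) int64 {
	p := newPacerSketch(time.Duration(intervalMillis) * time.Millisecond)
	start := time.Now()
	for i := 0; i < calls; i++ {
		p.Wait()
	}
	return time.Since(start).Milliseconds()
}

func main() {
	// Three calls spaced by 20ms take roughly 40ms in total.
	fmt.Println(elapsedMillis(3, 20), "ms")
}
```

A caller could combine this with a parsed retry delay by passing that delay to SetInterval after a rate-limit error.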

type Provider ¶

type Provider interface {
	Embed(ctx context.Context, text string) ([]float32, error)
	EmbedBatch(ctx context.Context, texts []string) ([][]float32, error)
	EmbedWithMode(ctx context.Context, text string, mode EmbedMode) ([]float32, error)
	Dims() int
	Name() string
}

Provider generates vector embeddings for text.

func NewLocal ¶

func NewLocal() Provider

NewLocal returns a local stub provider that generates deterministic pseudo-vectors from SHA-256 hashes. It is useful for testing and offline use. The vectors are not semantically meaningful; use OpenAI or Voyage for real semantic search.
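
To make the idea of hash-based pseudo-vectors concrete, the sketch below derives a small unit-length vector from a SHA-256 digest. How the real stub maps hash bytes to floats, and how many dimensions it produces, is not documented here. The 8-dimension layout, the scaling into [-1, 1], the normalization, and the name pseudoVector are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math"
)

// sketchDims is an assumed size: 32 digest bytes read as 8 uint32 values.
const sketchDims = 8

// pseudoVector deterministically maps text to a unit-length vector.
// Equal inputs give equal vectors; similar inputs do not give similar ones.
func pseudoVector(text string) []float32 {
	sum := sha256.Sum256([]byte(text))
	vec := make([]float32, sketchDims)
	var norm float64
	for i := range vec {
		u := binary.BigEndian.Uint32(sum[i*4 : i*4+4])
		x := float64(u)/float64(math.MaxUint32)*2 - 1 // scale into [-1, 1]
		vec[i] = float32(x)
		norm += x * x
	}
	norm = math.Sqrt(norm)
	if norm == 0 {
		return vec
	}
	for i := range vec {
		vec[i] = float32(float64(vec[i]) / norm)
	}
	return vec
}

func main() {
	a := pseudoVector("hello world")
	b := pseudoVector("hello world")
	fmt.Println(len(a), a[0] == b[0])
}
```

Because a one-character change in the input produces an unrelated digest, vectors built this way are suitable for exercising storage and retrieval code paths but not for ranking by meaning.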

func NewOpenAI ¶

func NewOpenAI(apiKey, model string) Provider

NewOpenAI creates an OpenAI embedding provider. The model argument is "text-embedding-3-small" (1536 dims) or "text-embedding-3-large" (3072 dims).

func NewVoyage ¶

func NewVoyage(apiKey, model string) Provider

NewVoyage creates a Voyage AI embedding provider. A supported model is "voyage-code-3" (optimized for code).
