embeddings

package
v0.4.0
Warning

This package is not in the latest version of its module.

Published: Mar 5, 2026 License: MIT Imports: 17 Imported by: 4

Documentation


Constants

const AllowInsecureEnvVar = "CHROMAGO_ALLOW_INSECURE_EF"

AllowInsecureEnvVar is the environment variable that allows insecure HTTP connections for embedding functions loaded from config. This is useful for backward compatibility with existing collections that have HTTP base URLs stored in config.

const (
	// MaxImageFileSize is the maximum allowed size for image files (50 MB).
	MaxImageFileSize = 50 * 1024 * 1024
)

Variables

This section is empty.

Functions

func AllowInsecureFromEnv added in v0.3.0

func AllowInsecureFromEnv() bool

AllowInsecureFromEnv checks if insecure mode is allowed via environment variable. When true, embedding functions loaded from config will allow HTTP connections even if the stored config doesn't have insecure: true.

func ConfigFloat64 added in v0.3.0

func ConfigFloat64(cfg EmbeddingFunctionConfig, key string) (float64, bool)

ConfigFloat64 extracts a float64 from EmbeddingFunctionConfig. Handles both float64 and int types.

func ConfigInt added in v0.3.0

func ConfigInt(cfg EmbeddingFunctionConfig, key string) (int, bool)

ConfigInt extracts an integer from EmbeddingFunctionConfig. Handles both int (direct assignment) and float64 (JSON unmarshaling).
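The two cases exist because encoding/json decodes every JSON number into float64 when the target is an interface{}. A minimal sketch of that behavior, using a local `config` type as a stand-in for EmbeddingFunctionConfig and a hypothetical `configInt` helper (the `dimensions` key is illustrative, and rejecting fractional values is an assumption):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// config is a local stand-in for EmbeddingFunctionConfig.
type config map[string]interface{}

// configInt is a hypothetical re-implementation of the documented behavior:
// accept an int stored directly, or a float64 produced by encoding/json.
// Assumption: fractional float64 values are rejected rather than truncated.
func configInt(cfg config, key string) (int, bool) {
	switch v := cfg[key].(type) {
	case int:
		return v, true
	case float64:
		if v != float64(int(v)) {
			return 0, false
		}
		return int(v), true
	default:
		return 0, false
	}
}

func main() {
	// Direct assignment stores an int.
	direct := config{"dimensions": 384}

	// A JSON round trip stores a float64 under the same key.
	var decoded config
	if err := json.Unmarshal([]byte(`{"dimensions": 384}`), &decoded); err != nil {
		panic(err)
	}

	a, okA := configInt(direct, "dimensions")
	b, okB := configInt(decoded, "dimensions")
	fmt.Println(a, okA, b, okB)
}
```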

func ConfigStringSlice added in v0.3.0

func ConfigStringSlice(cfg EmbeddingFunctionConfig, key string) ([]string, bool)

ConfigStringSlice extracts a []string from EmbeddingFunctionConfig. Handles both []string (direct assignment) and []interface{} (JSON unmarshaling).
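The same JSON effect applies to slices: a decoded array arrives as []interface{}, not []string. The sketch below uses a local `config` stand-in and a hypothetical `configStringSlice` helper; failing on any non-string element is an assumption about how mixed arrays are handled.

```go
package main

import "fmt"

// config is a local stand-in for EmbeddingFunctionConfig.
type config map[string]interface{}

// configStringSlice is a hypothetical helper that accepts either a []string
// or a []interface{} whose elements are all strings.
func configStringSlice(cfg config, key string) ([]string, bool) {
	switch v := cfg[key].(type) {
	case []string:
		return v, true
	case []interface{}:
		out := make([]string, 0, len(v))
		for _, item := range v {
			s, ok := item.(string)
			if !ok {
				// Assumption: a mixed-type array is treated as invalid.
				return nil, false
			}
			out = append(out, s)
		}
		return out, true
	default:
		return nil, false
	}
}

func main() {
	direct := config{"labels": []string{"a", "b"}}
	decoded := config{"labels": []interface{}{"a", "b"}} // shape after json.Unmarshal
	fmt.Println(configStringSlice(direct, "labels"))
	fmt.Println(configStringSlice(decoded, "labels"))
}
```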

func HasDense added in v0.3.0

func HasDense(name string) bool

HasDense checks if a dense embedding function is registered.

func HasMultimodal added in v0.3.3

func HasMultimodal(name string) bool

HasMultimodal checks if a multimodal embedding function is registered.

func HasSparse added in v0.3.0

func HasSparse(name string) bool

HasSparse checks if a sparse embedding function is registered.

func ListDense added in v0.3.0

func ListDense() []string

ListDense returns all registered dense embedding function names.

func ListMultimodal added in v0.3.3

func ListMultimodal() []string

ListMultimodal returns all registered multimodal embedding function names.

func ListSparse added in v0.3.0

func ListSparse() []string

ListSparse returns all registered sparse embedding function names.

func LogInsecureEnvVarWarning added in v0.3.0

func LogInsecureEnvVarWarning(providerName string)

LogInsecureEnvVarWarning logs a warning when the insecure env var is used. This helps users discover they should migrate to config-based insecure setting.

func NewValidator added in v0.3.0

func NewValidator() *validator.Validate

NewValidator creates a validator configured to properly validate Secret fields. The validator extracts the underlying string value so `validate:"required"` works correctly.

func RegisterDense added in v0.3.0

func RegisterDense(name string, factory EmbeddingFunctionFactory) error

RegisterDense registers a dense embedding function factory by name. Returns an error if a factory with the same name is already registered.

func RegisterMultimodal added in v0.3.3

func RegisterMultimodal(name string, factory MultimodalEmbeddingFunctionFactory) error

RegisterMultimodal registers a multimodal embedding function factory by name. Returns an error if a factory with the same name is already registered.

func RegisterSparse added in v0.3.0

func RegisterSparse(name string, factory SparseEmbeddingFunctionFactory) error

RegisterSparse registers a sparse embedding function factory by name. Returns an error if a factory with the same name is already registered.
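The Register, Has, and Build functions together describe a name-to-factory registry. The sketch below shows one way such a registry could work. It is not the package's implementation: `registry`, `embedder`, `factory`, and `stub` are hypothetical local types, and the mutex and the error for unknown names are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// embedder is a simplified stand-in for EmbeddingFunction.
type embedder interface{ Name() string }

// config is a local stand-in for EmbeddingFunctionConfig.
type config map[string]interface{}

// factory mirrors the shape of EmbeddingFunctionFactory.
type factory func(cfg config) (embedder, error)

// registry maps names to factories. Assumption: guarded by a mutex.
type registry struct {
	mu        sync.RWMutex
	factories map[string]factory
}

func newRegistry() *registry {
	return &registry{factories: make(map[string]factory)}
}

// register rejects a second factory under an existing name.
func (r *registry) register(name string, f factory) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.factories[name]; exists {
		return fmt.Errorf("embedding function %q is already registered", name)
	}
	r.factories[name] = f
	return nil
}

// has reports whether a factory is registered under name.
func (r *registry) has(name string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.factories[name]
	return ok
}

// build looks up the factory and invokes it with cfg.
func (r *registry) build(name string, cfg config) (embedder, error) {
	r.mu.RLock()
	f, ok := r.factories[name]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown embedding function %q", name)
	}
	return f(cfg)
}

// stub is a trivial embedder used only for the demonstration.
type stub struct{ name string }

func (s stub) Name() string { return s.name }

func main() {
	reg := newRegistry()
	f := func(config) (embedder, error) { return stub{name: "my_ef"}, nil }
	fmt.Println(reg.register("my_ef", f))        // first registration succeeds
	fmt.Println(reg.register("my_ef", f) != nil) // duplicate is rejected
	fmt.Println(reg.has("my_ef"), reg.has("missing"))
}
```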

Types

type Closeable added in v0.3.0

type Closeable interface {
	Close() error
}

Closeable is an optional interface for embedding functions that hold resources. Callers should check if an embedding function implements this interface and call Close() when done to release resources (e.g., ONNX runtime, native libraries).

type ConsistentHashEmbeddingFunction

type ConsistentHashEmbeddingFunction struct {
	// contains filtered or unexported fields
}

ConsistentHashEmbeddingFunction generates deterministic embeddings using SHA-256 hashing. Useful for testing and scenarios where consistent embeddings are needed for the same input.
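To illustrate the idea of hash-derived embeddings, here is a minimal sketch. The way the package maps digest bytes to vector components is not documented above, so the scheme below (hashing the text together with a block counter and scaling each 4-byte group to roughly [-1, 1]) is an assumption, and `hashEmbed` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// hashEmbed derives a deterministic vector of length dim from text.
// Assumption: SHA-256 is expanded in counter mode, and every 4 digest bytes
// become one component scaled to roughly [-1, 1].
func hashEmbed(text string, dim int) []float32 {
	out := make([]float32, 0, dim)
	for block := uint32(0); len(out) < dim; block++ {
		var ctr [4]byte
		binary.BigEndian.PutUint32(ctr[:], block)
		sum := sha256.Sum256(append([]byte(text), ctr[:]...))
		for i := 0; i+4 <= len(sum) && len(out) < dim; i += 4 {
			u := binary.BigEndian.Uint32(sum[i : i+4])
			out = append(out, float32(float64(u)/float64(1<<31)-1))
		}
	}
	return out
}

func main() {
	a := hashEmbed("hello world", 8)
	b := hashEmbed("hello world", 8)
	// The same input always yields the same vector.
	fmt.Println(len(a), a[0] == b[0])
}
```

Because the output depends only on the input text, vectors like these are handy in tests but carry no semantic meaning.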

func (*ConsistentHashEmbeddingFunction) DefaultSpace added in v0.3.0

func (*ConsistentHashEmbeddingFunction) EmbedDocuments

func (e *ConsistentHashEmbeddingFunction) EmbedDocuments(ctx context.Context, documents []string) ([]Embedding, error)

func (*ConsistentHashEmbeddingFunction) EmbedQuery

func (e *ConsistentHashEmbeddingFunction) EmbedQuery(_ context.Context, document string) (Embedding, error)

func (*ConsistentHashEmbeddingFunction) GetConfig added in v0.3.0

func (*ConsistentHashEmbeddingFunction) Name added in v0.3.0

func (*ConsistentHashEmbeddingFunction) SupportedSpaces added in v0.3.0

func (e *ConsistentHashEmbeddingFunction) SupportedSpaces() []DistanceMetric

type Distance

type Distance float32

type DistanceMetric

type DistanceMetric string

const (
	L2     DistanceMetric = "l2"
	COSINE DistanceMetric = "cosine"
	IP     DistanceMetric = "ip"
)

type DistanceMetricOperator

type DistanceMetricOperator interface {
	Compare(a, b []float32) float64
}

type Distances

type Distances []Distance

type Embedding

type Embedding interface {
	Len() int
	ContentAsFloat32() []float32
	ContentAsInt32() []int32
	FromFloat32(content ...float32) error
	Compare(other Embedding, metric DistanceMetricOperator) float32
	IsDefined() bool
}

Embedding represents a vector embedding that can be either float32 or int32 based. Implementations include Float32Embedding and Int32Embedding.

func NewEmbeddingFromFloat32

func NewEmbeddingFromFloat32(embedding []float32) Embedding

NewEmbeddingFromFloat32 creates a new Float32Embedding from a slice of float32 values.

func NewEmbeddingFromFloat64

func NewEmbeddingFromFloat64(embedding []float64) Embedding

NewEmbeddingFromFloat64 creates a Float32Embedding by converting float64 values to float32.

func NewEmbeddingFromInt32

func NewEmbeddingFromInt32(embedding []int32) Embedding

NewEmbeddingFromInt32 creates a Float32Embedding by converting int32 values to float32.

func NewEmbeddingsFromFloat32

func NewEmbeddingsFromFloat32(lst [][]float32) ([]Embedding, error)

NewEmbeddingsFromFloat32 creates a slice of embeddings from a 2D slice of float32 values.

func NewEmbeddingsFromInt32

func NewEmbeddingsFromInt32(lst [][]int32) ([]Embedding, error)

NewEmbeddingsFromInt32 creates a slice of Int32Embedding from a 2D slice of int32 values.

func NewEmbeddingsFromInterface

func NewEmbeddingsFromInterface(lst []interface{}) ([]Embedding, error)

NewEmbeddingsFromInterface parses embeddings from a slice of interface{} values, typically used when unmarshaling JSON responses from embedding APIs.
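As an illustration of the JSON case, the sketch below converts a decoded `[]interface{}` of number arrays into `[][]float32`. The helper name `vectorsFromInterface` is hypothetical and it returns plain slices rather than Embedding values; the error handling for malformed input is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vectorsFromInterface converts the shape json.Unmarshal produces for an
// array of arrays ([]interface{} of []interface{} of float64) to float32 rows.
func vectorsFromInterface(lst []interface{}) ([][]float32, error) {
	out := make([][]float32, 0, len(lst))
	for i, item := range lst {
		row, ok := item.([]interface{})
		if !ok {
			return nil, fmt.Errorf("item %d: expected an array, got %T", i, item)
		}
		vec := make([]float32, len(row))
		for j, raw := range row {
			f, ok := raw.(float64) // JSON numbers decode to float64
			if !ok {
				return nil, fmt.Errorf("item %d, element %d: expected a number, got %T", i, j, raw)
			}
			vec[j] = float32(f)
		}
		out = append(out, vec)
	}
	return out, nil
}

func main() {
	var payload []interface{}
	if err := json.Unmarshal([]byte(`[[0.1, 0.2], [0.3, 0.4]]`), &payload); err != nil {
		panic(err)
	}
	vecs, err := vectorsFromInterface(payload)
	fmt.Println(len(vecs), err)
}
```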

func NewEmptyEmbedding

func NewEmptyEmbedding() Embedding

NewEmptyEmbedding creates an undefined embedding with no data.

func NewEmptyEmbeddings

func NewEmptyEmbeddings() []Embedding

NewEmptyEmbeddings creates an empty slice of embeddings.

func NewInt32Embedding

func NewInt32Embedding(embedding []int32) Embedding

NewInt32Embedding creates a new Int32Embedding from a slice of int32 values.

type EmbeddingFunction

type EmbeddingFunction interface {
	// EmbedDocuments returns a vector for each text.
	EmbedDocuments(ctx context.Context, texts []string) ([]Embedding, error)
	// EmbedQuery embeds a single text.
	EmbedQuery(ctx context.Context, text string) (Embedding, error)
	// Name returns the static identifier for this embedding function (e.g., "openai", "cohere").
	Name() string
	// GetConfig returns the current configuration as a serializable map.
	GetConfig() EmbeddingFunctionConfig
	// DefaultSpace returns the recommended distance metric for this embedding function.
	DefaultSpace() DistanceMetric
	// SupportedSpaces returns all distance metrics supported by this embedding function.
	SupportedSpaces() []DistanceMetric
}

func BuildDense added in v0.3.0

func BuildDense(name string, config EmbeddingFunctionConfig) (EmbeddingFunction, error)

BuildDense creates a dense EmbeddingFunction from name and config.

func BuildDenseCloseable added in v0.3.0

func BuildDenseCloseable(name string, config EmbeddingFunctionConfig) (EmbeddingFunction, func() error, error)

BuildDenseCloseable creates a dense EmbeddingFunction and returns a closer function. The closer handles cleanup for embedding functions that implement Closeable. If the embedding function does not implement Closeable, the closer is a no-op.
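The closer pattern can be sketched with a type assertion against a Close method. This is an illustration only: `closerFor`, `nativeBacked`, and `httpBacked` are hypothetical names, and the local `closeable` interface simply mirrors the documented Closeable.

```go
package main

import "fmt"

// closeable mirrors the documented Closeable interface.
type closeable interface {
	Close() error
}

// closerFor returns a cleanup function for any value: the value's own Close
// method when it implements closeable, otherwise a no-op.
func closerFor(ef interface{}) func() error {
	if c, ok := ef.(closeable); ok {
		return c.Close
	}
	return func() error { return nil }
}

// nativeBacked stands in for an embedding function holding native resources.
type nativeBacked struct{ closed bool }

func (n *nativeBacked) Close() error {
	n.closed = true
	return nil
}

// httpBacked stands in for an embedding function with nothing to release.
type httpBacked struct{}

func main() {
	heavy := &nativeBacked{}
	closeHeavy := closerFor(heavy)
	closeLight := closerFor(httpBacked{})

	fmt.Println(closeHeavy(), heavy.closed) // resources released
	fmt.Println(closeLight())               // nothing to do
}
```

Returning a closer in every case lets callers write `defer closer()` unconditionally instead of repeating the type assertion at each call site.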

func NewConsistentHashEmbeddingFunction

func NewConsistentHashEmbeddingFunction() EmbeddingFunction

NewConsistentHashEmbeddingFunction creates a ConsistentHashEmbeddingFunction with default dimension (384).

func NewConsistentHashEmbeddingFunctionFromConfig added in v0.3.0

func NewConsistentHashEmbeddingFunctionFromConfig(cfg EmbeddingFunctionConfig) (EmbeddingFunction, error)

NewConsistentHashEmbeddingFunctionFromConfig creates a ConsistentHashEmbeddingFunction from config.

type EmbeddingFunctionConfig added in v0.3.0

type EmbeddingFunctionConfig map[string]interface{}

EmbeddingFunctionConfig represents serializable configuration for an embedding function. Used for cross-language compatibility and config persistence.

type EmbeddingFunctionFactory added in v0.3.0

type EmbeddingFunctionFactory func(config EmbeddingFunctionConfig) (EmbeddingFunction, error)

EmbeddingFunctionFactory creates an EmbeddingFunction from config.

type EmbeddingModel

type EmbeddingModel string

EmbeddingModel represents the name/identifier of an embedding model.

type Embeddings

type Embeddings []Embedding

Embeddings is a slice of Embedding values.

type Float32Embedding

type Float32Embedding struct {
	ArrayOfFloat32 *[]float32
}

Float32Embedding implements Embedding using float32 values. This is the most common embedding type for dense vectors.

func (*Float32Embedding) Compare

func (e *Float32Embedding) Compare(other Embedding, metric DistanceMetricOperator) float32

func (*Float32Embedding) ContentAsFloat32

func (e *Float32Embedding) ContentAsFloat32() []float32

func (*Float32Embedding) ContentAsInt32

func (e *Float32Embedding) ContentAsInt32() []int32

func (*Float32Embedding) FromFloat32

func (e *Float32Embedding) FromFloat32(content ...float32) error

func (*Float32Embedding) IsDefined

func (e *Float32Embedding) IsDefined() bool

func (*Float32Embedding) Len

func (e *Float32Embedding) Len() int

func (*Float32Embedding) MarshalJSON

func (e *Float32Embedding) MarshalJSON() ([]byte, error)

func (*Float32Embedding) UnmarshalJSON

func (e *Float32Embedding) UnmarshalJSON(b []byte) error

type ImageInput added in v0.3.3

type ImageInput struct {
	Base64   string
	URL      string
	FilePath string
}

ImageInput represents an image that can be embedded. Exactly one of Base64, URL, or FilePath must be set.
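The "exactly one source" rule can be checked by counting the non-empty fields. The sketch below uses a local `imageInput` struct with the same three fields; the `validate` method and its error messages are hypothetical, not the package's own.

```go
package main

import (
	"errors"
	"fmt"
)

// imageInput is a local stand-in with the same fields as ImageInput.
type imageInput struct {
	Base64   string
	URL      string
	FilePath string
}

// validate returns nil only when exactly one source field is set.
func (i imageInput) validate() error {
	set := 0
	for _, v := range []string{i.Base64, i.URL, i.FilePath} {
		if v != "" {
			set++
		}
	}
	switch set {
	case 1:
		return nil
	case 0:
		return errors.New("no image source set")
	default:
		return errors.New("more than one image source set")
	}
}

func main() {
	fmt.Println(imageInput{URL: "https://example.com/cat.png"}.validate())
	fmt.Println(imageInput{}.validate())
	fmt.Println(imageInput{URL: "https://example.com/cat.png", FilePath: "cat.png"}.validate())
}
```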

func NewImageInputFromBase64 added in v0.3.3

func NewImageInputFromBase64(base64Data string) ImageInput

NewImageInputFromBase64 creates an ImageInput from a base64-encoded string.

func NewImageInputFromFile added in v0.3.3

func NewImageInputFromFile(filePath string) ImageInput

NewImageInputFromFile creates an ImageInput from a local file path.

func NewImageInputFromURL added in v0.3.3

func NewImageInputFromURL(url string) ImageInput

NewImageInputFromURL creates an ImageInput from a URL.

func (ImageInput) ToBase64 added in v0.3.3

func (i ImageInput) ToBase64(_ context.Context) (string, error)

ToBase64 converts the image input to a base64-encoded string. For Base64 inputs, returns the value directly. For URL inputs, returns an error (URLs should be passed directly to the API). For FilePath inputs, reads the file and encodes it.

func (ImageInput) Type added in v0.3.3

func (i ImageInput) Type() ImageInputType

Type returns the type of the image input based on which field is set.

func (ImageInput) Validate added in v0.3.3

func (i ImageInput) Validate() error

Validate checks that exactly one input source is specified.

type ImageInputType added in v0.3.3

type ImageInputType string

ImageInputType represents the type of image input.

const (
	ImageInputTypeBase64   ImageInputType = "base64"
	ImageInputTypeURL      ImageInputType = "url"
	ImageInputTypeFilePath ImageInputType = "file"
)

type Int32Embedding

type Int32Embedding struct {
	ArrayOfInt32 *[]int32
}

Int32Embedding implements Embedding using int32 values. Used for quantized embeddings to reduce memory usage.

func (*Int32Embedding) Compare

func (e *Int32Embedding) Compare(other Embedding, metric DistanceMetricOperator) float32

func (*Int32Embedding) ContentAsFloat32

func (e *Int32Embedding) ContentAsFloat32() []float32

func (*Int32Embedding) ContentAsInt32

func (e *Int32Embedding) ContentAsInt32() []int32

func (*Int32Embedding) FromFloat32

func (e *Int32Embedding) FromFloat32(_ ...float32) error

func (*Int32Embedding) FromInt32

func (e *Int32Embedding) FromInt32(content ...int32) error

func (*Int32Embedding) IsDefined

func (e *Int32Embedding) IsDefined() bool

func (*Int32Embedding) Len

func (e *Int32Embedding) Len() int

func (*Int32Embedding) MarshalJSON

func (e *Int32Embedding) MarshalJSON() ([]byte, error)

func (*Int32Embedding) UnmarshalJSON

func (e *Int32Embedding) UnmarshalJSON(b []byte) error

type KnnVector added in v0.3.0

type KnnVector interface {
	Len() int
	ValuesAsFloat32() []float32
}

KnnVector represents a vector that can be used for k-nearest neighbor search.

type MultimodalEmbeddingFunction added in v0.3.3

type MultimodalEmbeddingFunction interface {
	EmbeddingFunction
	// EmbedImages returns embeddings for a batch of images.
	EmbedImages(ctx context.Context, images []ImageInput) ([]Embedding, error)
	// EmbedImage returns an embedding for a single image.
	EmbedImage(ctx context.Context, image ImageInput) (Embedding, error)
}

MultimodalEmbeddingFunction extends EmbeddingFunction with image embedding capabilities.

func BuildMultimodal added in v0.3.3

func BuildMultimodal(name string, config EmbeddingFunctionConfig) (MultimodalEmbeddingFunction, error)

BuildMultimodal creates a MultimodalEmbeddingFunction from name and config.

func BuildMultimodalCloseable added in v0.3.3

func BuildMultimodalCloseable(name string, config EmbeddingFunctionConfig) (MultimodalEmbeddingFunction, func() error, error)

BuildMultimodalCloseable creates a MultimodalEmbeddingFunction and returns a closer function. The closer handles cleanup for embedding functions that implement Closeable. If the embedding function does not implement Closeable, the closer is a no-op.

type MultimodalEmbeddingFunctionFactory added in v0.3.3

type MultimodalEmbeddingFunctionFactory func(config EmbeddingFunctionConfig) (MultimodalEmbeddingFunction, error)

MultimodalEmbeddingFunctionFactory creates a MultimodalEmbeddingFunction from config.

type Secret added in v0.3.0

type Secret struct {
	// contains filtered or unexported fields
}

Secret is a string type that prevents accidental logging of sensitive values. It implements fmt.Stringer, fmt.GoStringer, fmt.Formatter, json.Marshaler, and slog.LogValuer to mask the value in all common output scenarios.

func NewSecret added in v0.3.0

func NewSecret(value string) Secret

NewSecret creates a new Secret from a string value.

func (Secret) Format added in v0.3.0

func (s Secret) Format(f fmt.State, _ rune)

Format implements fmt.Formatter for complete control over all format verbs.

func (Secret) GoString added in v0.3.0

func (s Secret) GoString() string

GoString implements fmt.GoStringer for %#v formatting.

func (Secret) IsEmpty added in v0.3.0

func (s Secret) IsEmpty() bool

IsEmpty returns true if the secret value is empty.

func (Secret) LogValue added in v0.3.0

func (s Secret) LogValue() slog.Value

LogValue implements slog.LogValuer for structured logging protection.

func (Secret) MarshalJSON added in v0.3.0

func (s Secret) MarshalJSON() ([]byte, error)

MarshalJSON implements json.Marshaler, returning a redacted placeholder.

func (Secret) String added in v0.3.0

func (s Secret) String() string

String implements fmt.Stringer, returning a redacted placeholder.

func (*Secret) UnmarshalJSON added in v0.3.0

func (s *Secret) UnmarshalJSON(_ []byte) error

UnmarshalJSON implements json.Unmarshaler. It returns an error to enforce the use of environment variables for secrets. Secrets should be passed via api_key_env_var configuration, not directly in JSON.

func (Secret) Value added in v0.3.0

func (s Secret) Value() string

Value returns the actual secret value. Use this only when the value is needed (e.g., for HTTP headers). Never log the result of this method.

type SparseEmbeddingFunction added in v0.3.0

type SparseEmbeddingFunction interface {
	// EmbedDocumentsSparse returns a sparse vector for each text.
	EmbedDocumentsSparse(ctx context.Context, texts []string) ([]*SparseVector, error)
	// EmbedQuerySparse embeds a single text as a sparse vector.
	EmbedQuerySparse(ctx context.Context, text string) (*SparseVector, error)
	// Name returns the static identifier for this sparse embedding function (e.g., "bm25", "splade").
	Name() string
	// GetConfig returns the current configuration as a serializable map.
	GetConfig() EmbeddingFunctionConfig
}

func BuildSparse added in v0.3.0

func BuildSparse(name string, config EmbeddingFunctionConfig) (SparseEmbeddingFunction, error)

BuildSparse creates a SparseEmbeddingFunction from name and config.

func BuildSparseCloseable added in v0.3.0

func BuildSparseCloseable(name string, config EmbeddingFunctionConfig) (SparseEmbeddingFunction, func() error, error)

BuildSparseCloseable creates a SparseEmbeddingFunction and returns a closer function. The closer handles cleanup for embedding functions that implement Closeable. If the embedding function does not implement Closeable, the closer is a no-op.

type SparseEmbeddingFunctionFactory added in v0.3.0

type SparseEmbeddingFunctionFactory func(config EmbeddingFunctionConfig) (SparseEmbeddingFunction, error)

SparseEmbeddingFunctionFactory creates a SparseEmbeddingFunction from config.

type SparseVector added in v0.3.0

type SparseVector struct {
	Indices []int     `json:"indices"`
	Values  []float32 `json:"values"`
	Labels  []string  `json:"labels,omitempty"`
}

SparseVector represents a sparse embedding vector.

func NewSparseVector added in v0.3.0

func NewSparseVector(indices []int, values []float32) (*SparseVector, error)

NewSparseVector creates a new sparse vector. Returns an error if:

  • indices and values have different lengths
  • any index is negative
  • any index is duplicated
  • any value is NaN or infinite

func (*SparseVector) Len added in v0.3.0

func (s *SparseVector) Len() int

Len returns the number of non-zero elements.

func (*SparseVector) MarshalJSON added in v0.3.0

func (s *SparseVector) MarshalJSON() ([]byte, error)

MarshalJSON implements JSON marshaling for sparse vectors.

func (*SparseVector) Validate added in v0.3.0

func (s *SparseVector) Validate() error

Validate checks that the sparse vector is valid. A valid sparse vector has:

  • matching lengths for indices and values
  • all indices are non-negative
  • no duplicate indices
  • no NaN or infinite values

func (*SparseVector) ValuesAsFloat32 added in v0.3.0

func (s *SparseVector) ValuesAsFloat32() []float32

ValuesAsFloat32 returns the non-zero values.

Directories

Path Synopsis
Package defaultef exposes the legacy default ONNX embedding function package path.
Package ort exposes the canonical ONNX runtime embedding function package path.
Package roboflow provides a Roboflow CLIP embedding function for text and images.
