vector

Version: v0.1.0
Published: Dec 16, 2025 | License: AGPL-3.0 | Imports: 21 | Imported by: 0

Documentation

Overview

Package vector provides a unified interface for vector database operations.

This package abstracts vector storage and similarity search, allowing different backends (chromem-go, Qdrant, Chroma, etc.) to be used interchangeably. It serves as the foundation for both:

  • Long-term memory (session/conversation search)
  • RAG (document retrieval)

Architecture (ported from legacy pkg/databases)

The package follows a provider pattern:

┌─────────────────────────────────────────────────────────────────────┐
│  Provider Interface                                                  │
│  • Upsert, Search, Delete operations                                │
│  • Collection management                                            │
│  • Metadata filtering                                               │
├─────────────────────────────────────────────────────────────────────┤
│  Implementations                                                     │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐   │
│  │ ChromemProv │ │ QdrantProv  │ │ ChromaProv  │ │ PineconePr  │   │
│  │ (embedded)  │ │ (external)  │ │ (external)  │ │ (cloud)     │   │
│  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

Usage

The typical flow is:

  1. Create a provider from configuration
  2. Use with SearchEngine (v2/search) for high-level operations
  3. SearchEngine handles embedding generation and query processing

Example:

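provider, err := vector.NewChromemProvider(vector.ChromemConfig{
    PersistPath: ".hector/vectors",
})
if err != nil {
    return err
}
defer provider.Close()

// Upsert with a pre-computed vector (embedding and metadata are supplied by the caller)
if err := provider.Upsert(ctx, "documents", "doc1", embedding, metadata); err != nil {
    return err
}

// Search with a query vector, returning at most 10 results
results, err := provider.Search(ctx, "documents", queryVector, 10)
if err != nil {
    return err
}
// results is ordered by Score, highest first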

Types

type ChromaConfig

type ChromaConfig struct {
	// Host is the Chroma server hostname.
	Host string `yaml:"host"`

	// Port is the Chroma HTTP port (default: 8000).
	Port int `yaml:"port,omitempty"`

	// APIKey for authenticated access (optional).
	APIKey string `yaml:"api_key,omitempty"`

	// UseTLS enables HTTPS connections.
	UseTLS bool `yaml:"use_tls,omitempty"`
}

ChromaConfig configures the Chroma vector provider.

Direct port from legacy pkg/databases/chroma.go
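
To illustrate how Host, Port, and UseTLS might combine into a connection endpoint, here is a minimal sketch. The chromaConfig struct only mirrors the documented fields, and baseURL is a hypothetical helper that is not part of this package; the real provider may build its endpoint differently.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// chromaConfig mirrors the documented ChromaConfig fields for illustration.
type chromaConfig struct {
	Host   string
	Port   int
	UseTLS bool
}

// baseURL is a hypothetical helper: it applies the documented default
// port (8000) when Port is unset and picks the scheme from UseTLS.
func baseURL(cfg chromaConfig) string {
	port := cfg.Port
	if port == 0 {
		port = 8000 // documented default for Chroma
	}
	scheme := "http"
	if cfg.UseTLS {
		scheme = "https"
	}
	return scheme + "://" + net.JoinHostPort(cfg.Host, strconv.Itoa(port))
}

func main() {
	fmt.Println(baseURL(chromaConfig{Host: "localhost"}))
	fmt.Println(baseURL(chromaConfig{Host: "chroma.internal", Port: 8443, UseTLS: true}))
}
```

The same pattern would apply to the other Host/Port/UseTLS configurations, each with its own documented default port.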

type ChromaProvider

type ChromaProvider struct {
	// contains filtered or unexported fields
}

ChromaProvider implements Provider using Chroma vector database.

Direct port from legacy pkg/databases/chroma.go

func NewChromaProvider

func NewChromaProvider(cfg ChromaConfig) (*ChromaProvider, error)

NewChromaProvider creates a new Chroma provider.

func (*ChromaProvider) Close

func (p *ChromaProvider) Close() error

Close closes the HTTP client.

func (*ChromaProvider) CreateCollection

func (p *ChromaProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection creates a new collection in Chroma.

func (*ChromaProvider) Delete

func (p *ChromaProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document by ID.

func (*ChromaProvider) DeleteByFilter

func (p *ChromaProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*ChromaProvider) DeleteCollection

func (p *ChromaProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection removes a collection from Chroma.

func (*ChromaProvider) Name

func (p *ChromaProvider) Name() string

Name returns the provider name.

func (*ChromaProvider) Search

func (p *ChromaProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors.

func (*ChromaProvider) SearchWithFilter

func (p *ChromaProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*ChromaProvider) Upsert

func (p *ChromaProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector.

type ChromemConfig

type ChromemConfig struct {
	// PersistPath for file persistence (optional).
	// If empty, vectors are stored in memory only.
	// Directory will be created if it doesn't exist.
	PersistPath string `yaml:"persist_path,omitempty"`

	// Compress enables gzip compression for persistence.
	// Reduces file size but increases CPU usage.
	Compress bool `yaml:"compress,omitempty"`
}

ChromemConfig configures the chromem provider.

type ChromemProvider

type ChromemProvider struct {
	// contains filtered or unexported fields
}

ChromemProvider implements Provider using chromem-go for embedded vector storage.

This is the recommended provider for zero-config deployments as it requires no external services. It stores vectors in memory with optional file persistence.

Features:

  • Pure Go, no external dependencies
  • Optional file persistence (with optional gzip compression)
  • Cosine similarity search
  • Metadata filtering

Limitations:

  • Single-process only (no distributed search)
  • Memory-bound (all vectors in RAM)
  • No hybrid search support

For production at scale, consider Qdrant or other external providers.
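
Because all vectors are held in RAM, a rough size estimate helps decide when an external provider is warranted. The sketch below counts only the raw float32 data (4 bytes per component); the vectorBytes helper and the 1536-dimension example are illustrative assumptions, and real usage will be higher once IDs, metadata, and allocator overhead are included.

```go
package main

import "fmt"

// vectorBytes returns the raw size in bytes of n float32 vectors of the
// given dimension. It ignores IDs, metadata, and allocator overhead.
func vectorBytes(n, dim int) int64 {
	const bytesPerFloat32 = 4
	return int64(n) * int64(dim) * bytesPerFloat32
}

func main() {
	// Hypothetical workload: one million 1536-dimensional embeddings.
	b := vectorBytes(1_000_000, 1536)
	fmt.Printf("%d bytes (%.2f GiB)\n", b, float64(b)/(1<<30))
}
```

For this hypothetical workload the raw vectors alone come to 6,144,000,000 bytes, or roughly 5.7 GiB.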

func NewChromemProvider

func NewChromemProvider(cfg ChromemConfig) (*ChromemProvider, error)

NewChromemProvider creates a new chromem-based vector provider.

func (*ChromemProvider) Close

func (p *ChromemProvider) Close() error

Close persists the database and releases resources.

func (*ChromemProvider) CreateCollection

func (p *ChromemProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection is a no-op for this provider because chromem-go creates collections implicitly.

func (*ChromemProvider) Delete

func (p *ChromemProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document from a collection by ID.

func (*ChromemProvider) DeleteByFilter

func (p *ChromemProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*ChromemProvider) DeleteCollection

func (p *ChromemProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection removes a collection and all its documents.

func (*ChromemProvider) Name

func (p *ChromemProvider) Name() string

Name returns the provider name.

func (*ChromemProvider) Search

func (p *ChromemProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors in a collection.

func (*ChromemProvider) SearchWithFilter

func (p *ChromemProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*ChromemProvider) Upsert

func (p *ChromemProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector embedding.

type Config

type Config struct {
	// Type identifies the provider implementation.
	// Values: "chromem", "qdrant", "chroma", "pinecone", "milvus", "weaviate"
	Type string `yaml:"type"`

	// Collection is the default collection name.
	// Individual operations can override this.
	Collection string `yaml:"collection,omitempty"`
}

Config is the base configuration for all vector providers.

type MilvusConfig

type MilvusConfig struct {
	// Host is the Milvus server hostname.
	Host string `yaml:"host"`

	// Port is the Milvus HTTP port (default: 19530).
	Port int `yaml:"port,omitempty"`

	// APIKey for authenticated access (optional).
	APIKey string `yaml:"api_key,omitempty"`

	// UseTLS enables HTTPS connections.
	UseTLS bool `yaml:"use_tls,omitempty"`
}

MilvusConfig configures the Milvus vector provider.

Direct port from legacy pkg/databases/milvus.go

type MilvusProvider

type MilvusProvider struct {
	// contains filtered or unexported fields
}

MilvusProvider implements Provider using Milvus vector database.

Direct port from legacy pkg/databases/milvus.go

func NewMilvusProvider

func NewMilvusProvider(cfg MilvusConfig) (*MilvusProvider, error)

NewMilvusProvider creates a new Milvus provider.

func (*MilvusProvider) Close

func (p *MilvusProvider) Close() error

Close closes the HTTP client.

func (*MilvusProvider) CreateCollection

func (p *MilvusProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection creates a new collection in Milvus.

func (*MilvusProvider) Delete

func (p *MilvusProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document by ID.

func (*MilvusProvider) DeleteByFilter

func (p *MilvusProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*MilvusProvider) DeleteCollection

func (p *MilvusProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection removes a collection from Milvus.

func (*MilvusProvider) Name

func (p *MilvusProvider) Name() string

Name returns the provider name.

func (*MilvusProvider) Search

func (p *MilvusProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors.

func (*MilvusProvider) SearchWithFilter

func (p *MilvusProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*MilvusProvider) Upsert

func (p *MilvusProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector.

type NilProvider

type NilProvider struct{}

NilProvider is a no-op implementation for when vector search is disabled.

func (NilProvider) Close

func (NilProvider) Close() error

func (NilProvider) CreateCollection

func (NilProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

func (NilProvider) Delete

func (NilProvider) Delete(ctx context.Context, collection string, id string) error

func (NilProvider) DeleteByFilter

func (NilProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

func (NilProvider) DeleteCollection

func (NilProvider) DeleteCollection(ctx context.Context, collection string) error

func (NilProvider) Name

func (NilProvider) Name() string

func (NilProvider) Search

func (NilProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

func (NilProvider) SearchWithFilter

func (NilProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

func (NilProvider) Upsert

func (NilProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error
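
The NilProvider methods are undocumented, so the following is only a sketch of the no-op (null object) pattern the type description suggests. The searcher interface and nilSearcher type are trimmed-down, hypothetical stand-ins; the assumption that every call succeeds and returns nothing, and the name "nil", are not confirmed by the source.

```go
package main

import (
	"context"
	"fmt"
)

// result and searcher are trimmed-down stand-ins for Result and Provider.
type result struct {
	ID    string
	Score float32
}

type searcher interface {
	Search(ctx context.Context, collection string, vector []float32, topK int) ([]result, error)
	Name() string
}

// nilSearcher is a hypothetical no-op: every call succeeds and returns
// nothing. The real NilProvider may behave differently.
type nilSearcher struct{}

func (nilSearcher) Search(context.Context, string, []float32, int) ([]result, error) {
	return nil, nil
}

func (nilSearcher) Name() string { return "nil" }

// recall counts hits; it needs no special case for "search disabled".
func recall(s searcher, query []float32) int {
	hits, err := s.Search(context.Background(), "memory", query, 5)
	if err != nil {
		return 0
	}
	return len(hits)
}

func main() {
	var s searcher = nilSearcher{}
	fmt.Println(s.Name(), recall(s, []float32{0.1, 0.2}))
}
```

The benefit of this design is that callers hold a usable value at all times and never need to check whether vector search is enabled before calling it.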

type PineconeConfig

type PineconeConfig struct {
	// APIKey is required for Pinecone authentication.
	APIKey string `yaml:"api_key"`

	// Host is the Pinecone API host (optional, defaults to https://api.pinecone.io).
	Host string `yaml:"host,omitempty"`

	// IndexName is the default index to use.
	IndexName string `yaml:"index_name"`

	// Environment is the Pinecone environment (e.g., "us-west1-gcp").
	Environment string `yaml:"environment,omitempty"`
}

PineconeConfig configures the Pinecone vector provider.

Direct port from legacy pkg/databases/pinecone.go

type PineconeProvider

type PineconeProvider struct {
	// contains filtered or unexported fields
}

PineconeProvider implements Provider using Pinecone vector database.

Direct port from legacy pkg/databases/pinecone.go

func NewPineconeProvider

func NewPineconeProvider(cfg PineconeConfig) (*PineconeProvider, error)

NewPineconeProvider creates a new Pinecone provider.

func (*PineconeProvider) Close

func (p *PineconeProvider) Close() error

Close closes the Pinecone client.

func (*PineconeProvider) CreateCollection

func (p *PineconeProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection checks if the index exists (Pinecone indexes must be created separately).

func (*PineconeProvider) Delete

func (p *PineconeProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document by ID.

func (*PineconeProvider) DeleteByFilter

func (p *PineconeProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*PineconeProvider) DeleteCollection

func (p *PineconeProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection always returns an error: Pinecone indexes cannot be deleted through this provider and must be removed via the Pinecone API.

func (*PineconeProvider) Name

func (p *PineconeProvider) Name() string

Name returns the provider name.

func (*PineconeProvider) Search

func (p *PineconeProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors.

func (*PineconeProvider) SearchWithFilter

func (p *PineconeProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*PineconeProvider) Upsert

func (p *PineconeProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector.

type Provider

type Provider interface {
	// Upsert adds or updates a document with its vector embedding.
	//
	// If a document with the same ID exists, it will be updated.
	// The vector dimension must match the collection's configured dimension.
	//
	// Parameters:
	//   - collection: logical grouping (e.g., "memory", "documents")
	//   - id: unique document identifier
	//   - vector: embedding vector from the embedder
	//   - metadata: additional searchable/filterable attributes
	Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

	// Search finds the most similar vectors in a collection.
	//
	// Returns results ordered by similarity score (highest first).
	// The query vector should be generated by the same embedder
	// used for indexing.
	//
	// Parameters:
	//   - collection: which collection to search
	//   - vector: query embedding vector
	//   - topK: maximum number of results to return
	Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

	// SearchWithFilter combines vector similarity with metadata filtering.
	//
	// The filter map supports equality matching on metadata fields.
	// Filter semantics:
	//   - {"field": "value"} - exact match
	//   - {"field1": "v1", "field2": "v2"} - AND of all conditions
	//
	// Example:
	//   filter := map[string]any{"user_id": "user123", "type": "document"}
	SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

	// Delete removes a document from a collection by ID.
	Delete(ctx context.Context, collection string, id string) error

	// DeleteByFilter removes all documents matching the filter.
	//
	// Use with caution - this can delete many documents at once.
	DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

	// CreateCollection creates a new collection with specified vector dimension.
	//
	// Some backends create collections implicitly on first Upsert.
	// This method allows explicit creation with configuration.
	CreateCollection(ctx context.Context, collection string, vectorDimension int) error

	// DeleteCollection removes a collection and all its documents.
	DeleteCollection(ctx context.Context, collection string) error

	// Name returns the provider implementation name (e.g., "chromem", "qdrant").
	Name() string

	// Close releases resources held by the provider.
	// Should be called when the provider is no longer needed.
	io.Closer
}

Provider defines the interface for vector database operations.

All implementations must be safe for concurrent use by multiple goroutines. The vector dimension is determined by the embedder and should be consistent within a collection.

Derived from legacy pkg/databases/registry.go:DatabaseProvider
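
The filter semantics documented on SearchWithFilter (exact match per field, AND across fields) can be sketched as a small predicate. The matches function below is a hypothetical illustration, not the package's implementation; each backend translates the filter into its own query language.

```go
package main

import "fmt"

// matches reports whether metadata satisfies every key/value pair in
// filter: exact match per field, AND across fields. An empty filter
// matches everything. Values are compared with ==, so this sketch
// assumes comparable values such as strings, numbers, and booleans
// (comparing two slices or maps this way would panic at runtime).
func matches(metadata, filter map[string]any) bool {
	for key, want := range filter {
		got, ok := metadata[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	doc := map[string]any{"user_id": "user123", "type": "document", "lang": "en"}

	fmt.Println(matches(doc, map[string]any{"user_id": "user123", "type": "document"}))
	fmt.Println(matches(doc, map[string]any{"user_id": "user123", "type": "note"}))
}
```

The first call prints true because both conditions hold; the second prints false because the "type" condition fails, even though "user_id" matches.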

func NewProvider

func NewProvider(cfg *ProviderConfig) (Provider, error)

NewProvider creates a vector provider from configuration.
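
A factory of this kind typically dispatches on the Type field and reads the matching sub-configuration. The sketch below shows that dispatch with local stand-in types and only two backends; describe is a hypothetical function that returns a description instead of constructing a provider, and the error cases are assumptions about reasonable behavior rather than the documented behavior of NewProvider.

```go
package main

import (
	"errors"
	"fmt"
)

type providerType string

const (
	typeChromem providerType = "chromem"
	typeQdrant  providerType = "qdrant"
)

type chromemConfig struct{ PersistPath string }

type qdrantConfig struct {
	Host string
	Port int
}

// providerConfig mirrors the shape of ProviderConfig with two backends.
type providerConfig struct {
	Type    providerType
	Chromem *chromemConfig
	Qdrant  *qdrantConfig
}

// describe stands in for construction: it reports which backend would
// be built from cfg, or an error if cfg is unusable.
func describe(cfg *providerConfig) (string, error) {
	if cfg == nil {
		return "", errors.New("vector: nil config")
	}
	switch cfg.Type {
	case typeChromem:
		if cfg.Chromem == nil || cfg.Chromem.PersistPath == "" {
			return "chromem (in-memory)", nil
		}
		return "chromem at " + cfg.Chromem.PersistPath, nil
	case typeQdrant:
		if cfg.Qdrant == nil {
			return "", errors.New("vector: qdrant config missing")
		}
		return fmt.Sprintf("qdrant at %s:%d", cfg.Qdrant.Host, cfg.Qdrant.Port), nil
	default:
		return "", fmt.Errorf("vector: unknown provider type %q", cfg.Type)
	}
}

func main() {
	fmt.Println(describe(&providerConfig{Type: typeChromem}))
	fmt.Println(describe(&providerConfig{Type: typeQdrant, Qdrant: &qdrantConfig{Host: "localhost", Port: 6334}}))
	fmt.Println(describe(&providerConfig{Type: "unknown"}))
}
```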

type ProviderConfig

type ProviderConfig struct {
	// Type identifies which provider to create.
	Type ProviderType `yaml:"type"`

	// Chromem configuration (used when Type == "chromem").
	Chromem *ChromemConfig `yaml:"chromem,omitempty"`

	// Qdrant configuration (used when Type == "qdrant").
	Qdrant *QdrantConfig `yaml:"qdrant,omitempty"`

	// Pinecone configuration (used when Type == "pinecone").
	Pinecone *PineconeConfig `yaml:"pinecone,omitempty"`

	// Weaviate configuration (used when Type == "weaviate").
	Weaviate *WeaviateConfig `yaml:"weaviate,omitempty"`

	// Milvus configuration (used when Type == "milvus").
	Milvus *MilvusConfig `yaml:"milvus,omitempty"`

	// Chroma configuration (used when Type == "chroma").
	Chroma *ChromaConfig `yaml:"chroma,omitempty"`
}

ProviderConfig is the configuration for creating vector providers.

func (*ProviderConfig) SetDefaults

func (c *ProviderConfig) SetDefaults()

SetDefaults applies default values.

func (*ProviderConfig) Validate

func (c *ProviderConfig) Validate() error

Validate checks the configuration.
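
The source does not say which defaults are applied or which checks are made, so the following is only a sketch of the usual SetDefaults-then-Validate pattern. The local types are stand-ins; the 6334 default comes from the QdrantConfig field comment, while defaulting an empty type to "chromem" and the specific validation rules are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

type qdrantConfig struct {
	Host string
	Port int
}

type providerConfig struct {
	Type   string
	Qdrant *qdrantConfig
}

// setDefaults fills unset fields. Assumptions: an empty Type falls back
// to "chromem", and a zero Qdrant port becomes the documented 6334.
func (c *providerConfig) setDefaults() {
	if c.Type == "" {
		c.Type = "chromem"
	}
	if c.Qdrant != nil && c.Qdrant.Port == 0 {
		c.Qdrant.Port = 6334
	}
}

// validate applies hypothetical checks after defaults have been set.
func (c *providerConfig) validate() error {
	switch c.Type {
	case "chromem":
		return nil
	case "qdrant":
		if c.Qdrant == nil || c.Qdrant.Host == "" {
			return errors.New("qdrant: host is required")
		}
		if c.Qdrant.Port < 1 || c.Qdrant.Port > 65535 {
			return fmt.Errorf("qdrant: invalid port %d", c.Qdrant.Port)
		}
		return nil
	default:
		return fmt.Errorf("unknown provider type %q", c.Type)
	}
}

func main() {
	cfg := &providerConfig{Type: "qdrant", Qdrant: &qdrantConfig{Host: "localhost"}}
	cfg.setDefaults()
	fmt.Println(cfg.Qdrant.Port, cfg.validate())
}
```

Calling the defaulting step first keeps the validation rules simple, since they only ever see fully populated values.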

type ProviderType

type ProviderType string

ProviderType identifies a vector provider implementation.

const (
	// ProviderChromem uses chromem-go for embedded vector storage.
	// Zero-config, no external dependencies. Best for development and small deployments.
	ProviderChromem ProviderType = "chromem"

	// ProviderQdrant uses Qdrant vector database.
	// High-performance, supports distributed deployments.
	ProviderQdrant ProviderType = "qdrant"

	// ProviderChroma uses Chroma vector database.
	// Python-native but has Go client support.
	ProviderChroma ProviderType = "chroma"

	// ProviderPinecone uses Pinecone managed vector database.
	// Fully managed cloud service.
	ProviderPinecone ProviderType = "pinecone"

	// ProviderMilvus uses Milvus vector database.
	// Open-source, supports large-scale deployments.
	ProviderMilvus ProviderType = "milvus"

	// ProviderWeaviate uses Weaviate vector database.
	// Supports GraphQL queries and hybrid search.
	ProviderWeaviate ProviderType = "weaviate"
)

type QdrantConfig

type QdrantConfig struct {
	// Host is the Qdrant server hostname.
	Host string `yaml:"host"`

	// Port is the Qdrant gRPC port (default: 6334).
	Port int `yaml:"port"`

	// APIKey for authenticated access (optional).
	APIKey string `yaml:"api_key,omitempty"`

	// UseTLS enables TLS connections.
	UseTLS bool `yaml:"use_tls,omitempty"`
}

QdrantConfig configures the Qdrant vector provider.

Direct port from legacy pkg/databases/qdrant.go

type QdrantProvider

type QdrantProvider struct {
	// contains filtered or unexported fields
}

QdrantProvider implements Provider using Qdrant vector database.

Direct port from legacy pkg/databases/qdrant.go

func NewQdrantProvider

func NewQdrantProvider(cfg QdrantConfig) (*QdrantProvider, error)

NewQdrantProvider creates a new Qdrant provider.

func (*QdrantProvider) Close

func (p *QdrantProvider) Close() error

Close closes the Qdrant client.

func (*QdrantProvider) CreateCollection

func (p *QdrantProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection creates a new collection.

func (*QdrantProvider) Delete

func (p *QdrantProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document by ID.

func (*QdrantProvider) DeleteByFilter

func (p *QdrantProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*QdrantProvider) DeleteCollection

func (p *QdrantProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection removes a collection.

func (*QdrantProvider) Name

func (p *QdrantProvider) Name() string

Name returns the provider name.

func (*QdrantProvider) Search

func (p *QdrantProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors.

func (*QdrantProvider) SearchWithFilter

func (p *QdrantProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*QdrantProvider) Upsert

func (p *QdrantProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector.

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry manages named vector providers.

This allows multiple providers to be configured and accessed by name, similar to how databases or embedders are managed.

func NewRegistry

func NewRegistry() *Registry

NewRegistry creates a new provider registry.

func (*Registry) Close

func (r *Registry) Close() error

Close closes all registered providers.

func (*Registry) Get

func (r *Registry) Get(name string) (Provider, bool)

Get retrieves a provider by name.

func (*Registry) List

func (r *Registry) List() []string

List returns all registered provider names.

func (*Registry) MustGet

func (r *Registry) MustGet(name string) Provider

MustGet retrieves a provider by name or panics.

func (*Registry) Register

func (r *Registry) Register(name string, provider Provider) error

Register adds a provider to the registry.
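
A name-keyed registry like this is commonly a map guarded by a read/write mutex. The sketch below is one way to write it under stated assumptions: closer is a stand-in for Provider, the lowercase method names are local, and rejecting empty or duplicate names and returning sorted names are guesses about behavior that the source does not document.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// closer is a minimal stand-in for Provider.
type closer interface{ Close() error }

type registry struct {
	mu        sync.RWMutex
	providers map[string]closer
}

func newRegistry() *registry {
	return &registry{providers: make(map[string]closer)}
}

// register rejects empty and duplicate names (an assumption).
func (r *registry) register(name string, p closer) error {
	if name == "" {
		return errors.New("provider name is empty")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.providers[name]; dup {
		return fmt.Errorf("provider %q already registered", name)
	}
	r.providers[name] = p
	return nil
}

func (r *registry) get(name string) (closer, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, ok := r.providers[name]
	return p, ok
}

// list returns names in sorted order so output is deterministic.
func (r *registry) list() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.providers))
	for name := range r.providers {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

type noop struct{}

func (noop) Close() error { return nil }

func main() {
	r := newRegistry()
	_ = r.register("memory", noop{})
	_ = r.register("docs", noop{})
	fmt.Println(r.list())
	fmt.Println(r.register("docs", noop{}) != nil)
}
```

A MustGet-style accessor would wrap get and panic when the name is missing, which suits start-up wiring where a missing provider is a programming error.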

type Result

type Result struct {
	// ID is the unique document identifier.
	ID string

	// Score represents similarity/relevance (higher is better).
	Score float32

	// Content is the original text content (if stored in metadata).
	Content string

	// Vector is the document's embedding (optional, not always returned).
	Vector []float32

	// Metadata contains the document's metadata fields.
	Metadata map[string]any
}

Result represents a single search result.

Results are returned ordered by Score (highest first). The Score semantics depend on the implementation:

  • Cosine similarity: typically 0.0 to 1.0 (1.0 = identical direction)
  • Euclidean distance: inverted (higher = more similar)
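
To make the two score conventions concrete, here is a minimal sketch of both. The cosine function is the standard formula; invertedEuclidean uses 1/(1+d), which is one common way to turn a distance into a "higher is better" score and is an assumption here, since the source does not say which inversion each backend applies.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of a and b. It returns 0 when
// the lengths differ or either vector has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// invertedEuclidean maps a distance d >= 0 to a score in (0, 1] using
// 1/(1+d), so identical vectors score 1 and distant vectors approach 0.
// It returns 0 when the lengths differ.
func invertedEuclidean(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var sum float64
	for i := range a {
		d := float64(a[i]) - float64(b[i])
		sum += d * d
	}
	return 1 / (1 + math.Sqrt(sum))
}

func main() {
	x := []float32{1, 0}
	y := []float32{0, 1}
	fmt.Println(cosine(x, x), cosine(x, y))
	fmt.Printf("%.4f %.4f\n", invertedEuclidean(x, x), invertedEuclidean(x, y))
}
```

With either convention, sorting by Score in descending order puts the closest match first, which is what lets callers treat results from different backends uniformly.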

type WeaviateConfig

type WeaviateConfig struct {
	// Host is the Weaviate server hostname.
	Host string `yaml:"host"`

	// Port is the Weaviate HTTP port (default: 8080).
	Port int `yaml:"port,omitempty"`

	// APIKey for authenticated access (optional).
	APIKey string `yaml:"api_key,omitempty"`

	// UseTLS enables HTTPS connections.
	UseTLS bool `yaml:"use_tls,omitempty"`
}

WeaviateConfig configures the Weaviate vector provider.

Direct port from legacy pkg/databases/weaviate.go

type WeaviateProvider

type WeaviateProvider struct {
	// contains filtered or unexported fields
}

WeaviateProvider implements Provider using Weaviate vector database.

Direct port from legacy pkg/databases/weaviate.go

func NewWeaviateProvider

func NewWeaviateProvider(cfg WeaviateConfig) (*WeaviateProvider, error)

NewWeaviateProvider creates a new Weaviate provider.

func (*WeaviateProvider) Close

func (p *WeaviateProvider) Close() error

Close closes the HTTP client.

func (*WeaviateProvider) CreateCollection

func (p *WeaviateProvider) CreateCollection(ctx context.Context, collection string, vectorDimension int) error

CreateCollection creates a new class in Weaviate.

func (*WeaviateProvider) Delete

func (p *WeaviateProvider) Delete(ctx context.Context, collection string, id string) error

Delete removes a document by ID.

func (*WeaviateProvider) DeleteByFilter

func (p *WeaviateProvider) DeleteByFilter(ctx context.Context, collection string, filter map[string]any) error

DeleteByFilter removes all documents matching the filter.

func (*WeaviateProvider) DeleteCollection

func (p *WeaviateProvider) DeleteCollection(ctx context.Context, collection string) error

DeleteCollection removes a class from Weaviate.

func (*WeaviateProvider) Name

func (p *WeaviateProvider) Name() string

Name returns the provider name.

func (*WeaviateProvider) Search

func (p *WeaviateProvider) Search(ctx context.Context, collection string, vector []float32, topK int) ([]Result, error)

Search finds the most similar vectors.

func (*WeaviateProvider) SearchWithFilter

func (p *WeaviateProvider) SearchWithFilter(ctx context.Context, collection string, vector []float32, topK int, filter map[string]any) ([]Result, error)

SearchWithFilter combines vector similarity with metadata filtering.

func (*WeaviateProvider) Upsert

func (p *WeaviateProvider) Upsert(ctx context.Context, collection string, id string, vector []float32, metadata map[string]any) error

Upsert adds or updates a document with its vector.
