vectordb

package
v1.3.1

This package is not in the latest version of its module.
Published: Mar 2, 2026 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package vectordb provides a database-agnostic abstraction for vector similarity search.

This package defines a common interface Service that can be implemented by different vector database adapters (Qdrant, pgVector, etc.), allowing applications to switch between databases without changing application code.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                        │
│  (uses vectordb.Service - no DB-specific imports)           │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                  vectordb.Service                           │
│          (common interface + DB-agnostic types)             │
└──────────────────────────┬──────────────────────────────────┘
                           │
              ┌────────────┴────────────┐
              ▼                         ▼
      ┌───────────────┐         ┌────────────────┐
      │ qdrant.Adapter│         │pgvector.Adapter│
      │  (implements) │         │  (planned)     │
      └───────────────┘         └────────────────┘

Benefits

  • Single Source of Truth: Filter types, search interfaces, and result types defined once.
  • Easy to Add New DBs: Just add a new adapter; consuming projects don't change.
  • Consistent API: All projects using std get the same interface.
  • Testability: Mock the interface once, works for all DBs.

Usage

In your application, depend only on the vectordb interface:

import (
    "context"

    "github.com/Aleph-Alpha/std/v1/vectordb"
)

type SearchService struct {
    db vectordb.Service
}

func NewSearchService(db vectordb.Service) *SearchService {
    return &SearchService{db: db}
}

func (s *SearchService) Search(ctx context.Context, query string, vector []float32) ([]vectordb.SearchResult, error) {
    results, err := s.db.Search(ctx, vectordb.SearchRequest{
        CollectionName: "documents",
        Vector:         vector,
        TopK:           10,
        Filters: []*vectordb.FilterSet{
            {
                Must: &vectordb.ConditionSet{
                    Conditions: []vectordb.FilterCondition{
                        vectordb.NewMatch("status", "published"),
                    },
                },
            },
        },
    })
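    if err != nil {
        return nil, err
    }
    // Search returns one result slice per request; only one request was sent.
    if len(results) == 0 {
        return nil, nil
    }
    return results[0], nil
}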

Wire Up with Qdrant

In your main setup:

import (
    "log"

    "github.com/Aleph-Alpha/std/v1/qdrant"
    "github.com/Aleph-Alpha/std/v1/vectordb"
)

func main() {
    // Create Qdrant client (with health checks, config, etc.).
    // Assumes the second return value is an error.
    qc, err := qdrant.NewQdrantClient(qdrant.QdrantParams{
        Config: &qdrant.Config{Endpoint: "localhost", Port: 6334},
    })
    if err != nil {
        log.Fatalf("create qdrant client: %v", err)
    }

    // Create adapter for DB-agnostic usage
    var db vectordb.Service = qdrant.NewAdapter(qc.Client())

    // Use in application
    svc := NewSearchService(db)
    _ = svc // ...
}

Package Layout

vectordb/
├── interface.go      # Service interface
├── types.go          # SearchRequest, SearchResult, EmbeddingInput, Collection
├── filters.go        # FilterSet, FilterCondition, and condition types
├── utils.go          # Convenience constructors (New*) and JSON helpers
└── doc.go            # This file

qdrant/                      # Qdrant package (includes adapter)
├── client.go                # QdrantClient wrapper
├── operations.go            # Adapter - implements Service
├── converter.go             # vectordb types → qdrant types
└── ...

Future adapters would live in their own packages:

pgvector/             # (planned) PostgreSQL pgvector adapter

Filter Types

The package provides DB-agnostic filter conditions:

| Type                    | Description                  | SQL Equivalent                      |
|-------------------------|------------------------------|-------------------------------------|
| MatchCondition          | Exact value match            | WHERE field = value                 |
| MatchAnyCondition       | Value in set                 | WHERE field IN (...)                |
| MatchExceptCondition    | Value not in set             | WHERE field NOT IN (...)            |
| NumericRangeCondition   | Numeric range                | WHERE field >= min AND field <= max |
| TimeRangeCondition      | Datetime range               | WHERE created_at BETWEEN ...        |
| IsNullCondition         | Field is null                | WHERE field IS NULL                 |
| IsEmptyCondition        | Field is empty/null/missing  | WHERE field IS NULL OR field = ''   |
| NestedFilterCondition   | Nested filter group          | (A OR B) AND (C OR D)               |

Use convenience constructors for cleaner code:

// Internal field (top-level in payload)
vectordb.NewMatch("status", "published")

// User-defined field (stored under "custom." prefix)
vectordb.NewUserMatch("category", "research")

// Range conditions with NumericRange/TimeRange structs.
// Bounds are pointers; min, max, start, and end are declared by the caller.
vectordb.NewNumericRange("price", vectordb.NumericRange{Gte: &min, Lt: &max})
vectordb.NewTimeRange("created_at", vectordb.TimeRange{Gte: &start, Lt: &end})

Functions

func Must

func Must(conditions ...FilterCondition) func(*FilterSet)

Must creates a Must clause (AND logic) with the given conditions. All conditions must match for a document to be included.

func MustNot

func MustNot(conditions ...FilterCondition) func(*FilterSet)

MustNot creates a MustNot clause (NOT logic) with the given conditions. Documents matching any of these conditions are excluded.

func Should

func Should(conditions ...FilterCondition) func(*FilterSet)

Should creates a Should clause (OR logic) with the given conditions. At least one condition must match for a document to be included.
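
Must, MustNot, and Should each return a func(*FilterSet), which is the functional-options pattern. The following is a minimal sketch of how such helpers could be written, using local stand-in types; it is an illustration of the pattern and not the package's actual source.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented types.
type FilterCondition interface{ IsFilterCondition() }

type MatchCondition struct {
	Field string
	Value any
}

func (c *MatchCondition) IsFilterCondition() {}

type ConditionSet struct{ Conditions []FilterCondition }

type FilterSet struct {
	Must, Should, MustNot *ConditionSet
}

// Each helper returns an option that fills in one clause of the FilterSet.
func Must(conditions ...FilterCondition) func(*FilterSet) {
	return func(fs *FilterSet) { fs.Must = &ConditionSet{Conditions: conditions} }
}

func Should(conditions ...FilterCondition) func(*FilterSet) {
	return func(fs *FilterSet) { fs.Should = &ConditionSet{Conditions: conditions} }
}

func MustNot(conditions ...FilterCondition) func(*FilterSet) {
	return func(fs *FilterSet) { fs.MustNot = &ConditionSet{Conditions: conditions} }
}

// NewFilterSet applies the options in order to an empty FilterSet.
func NewFilterSet(clauses ...func(*FilterSet)) *FilterSet {
	fs := &FilterSet{}
	for _, apply := range clauses {
		apply(fs)
	}
	return fs
}

func main() {
	fs := NewFilterSet(
		Must(&MatchCondition{Field: "status", Value: "published"}),
		Should(&MatchCondition{Field: "tag", Value: "ml"}, &MatchCondition{Field: "tag", Value: "ai"}),
	)
	fmt.Println(len(fs.Must.Conditions), len(fs.Should.Conditions), fs.MustNot == nil) // 1 2 true
}
```

In this sketch a clause that is never supplied stays nil, which lines up with the omitempty tags on the documented FilterSet fields.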

Types

type Collection

type Collection struct {
	// Name is the unique identifier of the collection
	Name string `json:"name"`

	// Status indicates the operational state (e.g., "Green", "Yellow")
	Status string `json:"status"`

	// VectorSize is the dimension of vectors in this collection
	VectorSize int `json:"vectorSize"`

	// Distance is the similarity metric (e.g., "Cosine", "Dot", "Euclid")
	Distance string `json:"distance"`

	// VectorCount is the number of indexed vectors
	VectorCount uint64 `json:"vectorCount"`

	// PointCount is the number of stored points/documents
	PointCount uint64 `json:"pointCount"`
}

Collection contains metadata about a vector collection.

type ConditionSet

type ConditionSet struct {
	Conditions []FilterCondition `json:"conditions,omitempty"`
}

ConditionSet holds a group of conditions for a single clause.

func (*ConditionSet) MarshalJSON

func (cs *ConditionSet) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for ConditionSet. This is needed because FilterCondition is an interface.

func (*ConditionSet) UnmarshalJSON

func (cs *ConditionSet) UnmarshalJSON(data []byte) error

UnmarshalJSON implements custom JSON unmarshaling for ConditionSet. It detects the condition type based on JSON keys and deserializes into the appropriate concrete type (MatchCondition, NumericRangeCondition, etc.).
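
Key-based type detection can be sketched as a two-pass decode: first into a map of raw messages to inspect the keys, then into the chosen concrete type. The example below covers only the three match conditions, using the JSON keys from their documented struct tags ("equalTo", "anyOf", "noneOf"). decodeCondition is a hypothetical helper, and the package's real detection logic may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Local stand-ins with the JSON tags shown in the type documentation.
type MatchCondition struct {
	Field string `json:"field"`
	Value any    `json:"equalTo"`
}

type MatchAnyCondition struct {
	Field  string `json:"field"`
	Values []any  `json:"anyOf"`
}

type MatchExceptCondition struct {
	Field  string `json:"field"`
	Values []any  `json:"noneOf"`
}

// decodeCondition picks a concrete type by looking for a discriminating key,
// then decodes the same bytes into that type.
func decodeCondition(data []byte) (any, error) {
	var keys map[string]json.RawMessage
	if err := json.Unmarshal(data, &keys); err != nil {
		return nil, err
	}
	has := func(k string) bool { _, ok := keys[k]; return ok }

	var target any
	switch {
	case has("equalTo"):
		target = &MatchCondition{}
	case has("anyOf"):
		target = &MatchAnyCondition{}
	case has("noneOf"):
		target = &MatchExceptCondition{}
	default:
		return nil, errors.New("unknown condition type")
	}
	if err := json.Unmarshal(data, target); err != nil {
		return nil, err
	}
	return target, nil
}

func main() {
	c, err := decodeCondition([]byte(`{"field":"tag","anyOf":["ml","ai"]}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if m, ok := c.(*MatchAnyCondition); ok {
		fmt.Println(m.Field, len(m.Values)) // tag 2
	}
}
```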

type EmbeddingInput

type EmbeddingInput struct {
	// ID is the unique identifier for this embedding
	ID string `json:"id"`

	// Vector is the dense embedding representation
	Vector []float32 `json:"vector"`

	// Payload is optional metadata to store with the vector
	Payload map[string]any `json:"payload,omitempty"`
}

EmbeddingInput is the input for inserting vectors into a collection.
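
Service.Insert is documented as batching internally. For illustration, splitting a slice of inputs into fixed-size batches might look like the sketch below; chunk and the batch size of 2 are hypothetical and say nothing about how the adapters actually batch.

```go
package main

import "fmt"

// Local stand-in for the documented EmbeddingInput type.
type EmbeddingInput struct {
	ID      string
	Vector  []float32
	Payload map[string]any
}

// chunk splits items into consecutive batches of at most size elements.
// A non-positive size yields a single batch containing everything.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		size = len(items)
	}
	var out [][]T
	for len(items) > size {
		// The three-index slice caps capacity so appends to a batch
		// cannot overwrite the next batch's elements.
		out = append(out, items[:size:size])
		items = items[size:]
	}
	if len(items) > 0 {
		out = append(out, items)
	}
	return out
}

func main() {
	inputs := []EmbeddingInput{{ID: "1"}, {ID: "2"}, {ID: "3"}, {ID: "4"}, {ID: "5"}}
	for i, batch := range chunk(inputs, 2) {
		fmt.Println(i, len(batch))
	}
}
```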

type FieldType

type FieldType int

FieldType indicates whether a field is internal (system-managed) or user-defined (stored under a prefix like "custom.").

const (
	// InternalField - system-managed fields stored at top-level
	InternalField FieldType = iota
	// UserField - user-defined fields stored under a prefix (e.g., "custom.")
	UserField
)
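
An adapter presumably uses FieldType to decide where a field lives in the stored payload. A minimal sketch of that mapping, assuming the "custom." prefix mentioned above; payloadKey and userPrefix are hypothetical names and not part of the package:

```go
package main

import "fmt"

type FieldType int

const (
	InternalField FieldType = iota
	UserField
)

// userPrefix is an assumption taken from the "custom." example in the docs.
const userPrefix = "custom."

// payloadKey maps a logical field name to its key in the stored payload.
func payloadKey(field string, ft FieldType) string {
	if ft == UserField {
		return userPrefix + field
	}
	return field
}

func main() {
	fmt.Println(payloadKey("status", InternalField)) // status
	fmt.Println(payloadKey("category", UserField))   // custom.category
}
```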

type FilterCondition

type FilterCondition interface {
	// IsFilterCondition is a marker method to ensure type safety
	IsFilterCondition()
}

FilterCondition is the interface all filter conditions must implement. Each database adapter converts these to its native filter format.

type FilterSet

type FilterSet struct {
	// Must: All conditions must match (AND)
	Must *ConditionSet `json:"must,omitempty"`
	// Should: At least one condition must match (OR)
	Should *ConditionSet `json:"should,omitempty"`
	// MustNot: None of the conditions should match (NOT)
	MustNot *ConditionSet `json:"mustNot,omitempty"`
}

FilterSet supports Must (AND), Should (OR), and MustNot (NOT) clauses. Use with SearchRequest.Filters to filter search results.

Example:

filters := &FilterSet{
    Must: &ConditionSet{
        Conditions: []FilterCondition{
            &MatchCondition{Field: "city", Value: "London"},
        },
    },
}

func NewFilterSet

func NewFilterSet(clauses ...func(*FilterSet)) *FilterSet

NewFilterSet creates a FilterSet with the given clauses. Use with Must(), Should(), and MustNot() helpers.

Example:

vectordb.NewFilterSet(
    vectordb.Must(vectordb.NewMatch("status", "published")),
    vectordb.Should(vectordb.NewMatch("tag", "ml"), vectordb.NewMatch("tag", "ai")),
)

type IsEmptyCondition

type IsEmptyCondition struct {
	Field     string    `json:"field"`
	FieldType FieldType `json:"-"`
}

IsEmptyCondition checks if a field is empty (doesn't exist, null, or []). SQL equivalent: WHERE field IS NULL OR field = '' OR field = []

func NewIsEmpty

func NewIsEmpty(field string) *IsEmptyCondition

NewIsEmpty creates an IS EMPTY condition for internal fields.

func NewUserIsEmpty

func NewUserIsEmpty(field string) *IsEmptyCondition

NewUserIsEmpty creates an IS EMPTY condition for user-defined fields.

func (*IsEmptyCondition) IsFilterCondition

func (c *IsEmptyCondition) IsFilterCondition()

type IsNullCondition

type IsNullCondition struct {
	Field     string    `json:"field"`
	FieldType FieldType `json:"-"`
}

IsNullCondition checks if a field has a NULL value. SQL equivalent: WHERE field IS NULL

func NewIsNull

func NewIsNull(field string) *IsNullCondition

NewIsNull creates an IS NULL condition for internal fields.

func NewUserIsNull

func NewUserIsNull(field string) *IsNullCondition

NewUserIsNull creates an IS NULL condition for user-defined fields.

func (*IsNullCondition) IsFilterCondition

func (c *IsNullCondition) IsFilterCondition()

type MatchAnyCondition

type MatchAnyCondition struct {
	Field     string    `json:"field"`
	Values    []any     `json:"anyOf"`
	FieldType FieldType `json:"-"`
}

MatchAnyCondition matches if value is one of the given values (IN operator). SQL equivalent: WHERE field IN (value1, value2, ...)

func NewMatchAny

func NewMatchAny(field string, values ...any) *MatchAnyCondition

NewMatchAny creates an IN condition for internal fields.

func NewUserMatchAny

func NewUserMatchAny(field string, values ...any) *MatchAnyCondition

NewUserMatchAny creates an IN condition for user-defined fields.

func (*MatchAnyCondition) IsFilterCondition

func (c *MatchAnyCondition) IsFilterCondition()

type MatchCondition

type MatchCondition struct {
	Field     string    `json:"field"`
	Value     any       `json:"equalTo"`
	FieldType FieldType `json:"-"`
}

MatchCondition represents an exact match filter (WHERE field = value). Supports string, bool, and int64 values.

func NewMatch

func NewMatch(field string, value any) *MatchCondition

NewMatch creates a match condition for internal fields.

func NewUserMatch

func NewUserMatch(field string, value any) *MatchCondition

NewUserMatch creates a match condition for user-defined fields.

func (*MatchCondition) IsFilterCondition

func (c *MatchCondition) IsFilterCondition()

type MatchExceptCondition

type MatchExceptCondition struct {
	Field     string    `json:"field"`
	Values    []any     `json:"noneOf"`
	FieldType FieldType `json:"-"`
}

MatchExceptCondition matches if value is NOT one of the given values (NOT IN). SQL equivalent: WHERE field NOT IN (value1, value2, ...)

func NewMatchExcept

func NewMatchExcept(field string, values ...any) *MatchExceptCondition

NewMatchExcept creates a NOT IN condition for internal fields.

func NewUserMatchExcept

func NewUserMatchExcept(field string, values ...any) *MatchExceptCondition

NewUserMatchExcept creates a NOT IN condition for user-defined fields.

func (*MatchExceptCondition) IsFilterCondition

func (c *MatchExceptCondition) IsFilterCondition()

type NestedFilterCondition added in v1.1.0

type NestedFilterCondition struct {
	Filter *FilterSet `json:"filter"`
}

NestedFilterCondition allows a FilterSet to be used as a condition. This enables complex recursive filters like (A AND B) OR (C AND D).

Example - (A OR B) AND (C OR D):

filters := &FilterSet{
    Must: &ConditionSet{
        Conditions: []FilterCondition{
            &NestedFilterCondition{
                Filter: &FilterSet{
                    Should: &ConditionSet{Conditions: []FilterCondition{A, B}},
                },
            },
            &NestedFilterCondition{
                Filter: &FilterSet{
                    Should: &ConditionSet{Conditions: []FilterCondition{C, D}},
                },
            },
        },
    },
}

Example - (A AND B) OR (C AND D):

filters := &FilterSet{
    Should: &ConditionSet{
        Conditions: []FilterCondition{
            &NestedFilterCondition{
                Filter: &FilterSet{
                    Must: &ConditionSet{Conditions: []FilterCondition{A, B}},
                },
            },
            &NestedFilterCondition{
                Filter: &FilterSet{
                    Must: &ConditionSet{Conditions: []FilterCondition{C, D}},
                },
            },
        },
    },
}
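
To make the nesting semantics concrete, the sketch below evaluates a FilterSet against an in-memory payload, recursing into nested filters. It uses local stand-in types and handles only MatchCondition. How the clauses combine (all Must, none of MustNot, and at least one Should when Should is non-empty) is an assumption for illustration; real adapters translate filters to the database's native format instead of evaluating them in Go.

```go
package main

import (
	"fmt"
	"reflect"
)

type FilterCondition interface{ IsFilterCondition() }

type MatchCondition struct {
	Field string
	Value any
}

func (c *MatchCondition) IsFilterCondition() {}

type ConditionSet struct{ Conditions []FilterCondition }

type FilterSet struct{ Must, Should, MustNot *ConditionSet }

type NestedFilterCondition struct{ Filter *FilterSet }

func (c *NestedFilterCondition) IsFilterCondition() {}

// matches reports whether payload satisfies fs under the assumed semantics.
func matches(fs *FilterSet, payload map[string]any) bool {
	if fs == nil {
		return true
	}
	if fs.Must != nil {
		for _, c := range fs.Must.Conditions {
			if !holds(c, payload) {
				return false
			}
		}
	}
	if fs.MustNot != nil {
		for _, c := range fs.MustNot.Conditions {
			if holds(c, payload) {
				return false
			}
		}
	}
	if fs.Should != nil && len(fs.Should.Conditions) > 0 {
		for _, c := range fs.Should.Conditions {
			if holds(c, payload) {
				return true
			}
		}
		return false
	}
	return true
}

// holds evaluates a single condition; nested filters recurse into matches.
func holds(c FilterCondition, payload map[string]any) bool {
	switch v := c.(type) {
	case *MatchCondition:
		got, ok := payload[v.Field]
		return ok && reflect.DeepEqual(got, v.Value)
	case *NestedFilterCondition:
		return matches(v.Filter, payload)
	default:
		return false // condition types not covered by this sketch
	}
}

// group wraps conditions in a nested Must clause (logical AND).
func group(conds ...FilterCondition) FilterCondition {
	return &NestedFilterCondition{Filter: &FilterSet{Must: &ConditionSet{Conditions: conds}}}
}

func main() {
	// (status=published AND lang=en) OR (status=draft AND owner=me)
	fs := &FilterSet{Should: &ConditionSet{Conditions: []FilterCondition{
		group(&MatchCondition{"status", "published"}, &MatchCondition{"lang", "en"}),
		group(&MatchCondition{"status", "draft"}, &MatchCondition{"owner", "me"}),
	}}}
	fmt.Println(matches(fs, map[string]any{"status": "draft", "owner": "me"}))     // true
	fmt.Println(matches(fs, map[string]any{"status": "published", "lang": "de"})) // false
}
```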

func (*NestedFilterCondition) IsFilterCondition added in v1.1.0

func (c *NestedFilterCondition) IsFilterCondition()

type NumericRange

type NumericRange struct {
	Gt  *float64 `json:"greaterThan,omitempty"`          // GreaterThan (exclusive)
	Gte *float64 `json:"greaterThanOrEqualTo,omitempty"` // GreaterThanOrEqualTo (inclusive)
	Lt  *float64 `json:"lessThan,omitempty"`             // LessThan (exclusive)
	Lte *float64 `json:"lessThanOrEqualTo,omitempty"`    // LessThanOrEqualTo (inclusive)
}

NumericRange defines bounds for numeric filtering. Used with NewNumericRange for cleaner constructor calls.
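
Each bound is a pointer, so an unset bound is simply nil. The sketch below shows one way to interpret the four bounds against a value; contains and ptr are hypothetical helpers on a local stand-in type, not part of the package.

```go
package main

import "fmt"

// Local stand-in for the documented NumericRange type.
type NumericRange struct {
	Gt, Gte, Lt, Lte *float64
}

// ptr returns a pointer to v, which keeps literal bounds readable.
func ptr(v float64) *float64 { return &v }

// contains reports whether v satisfies every bound that is set.
// Nil bounds are ignored.
func (r NumericRange) contains(v float64) bool {
	if r.Gt != nil && !(v > *r.Gt) {
		return false
	}
	if r.Gte != nil && !(v >= *r.Gte) {
		return false
	}
	if r.Lt != nil && !(v < *r.Lt) {
		return false
	}
	if r.Lte != nil && !(v <= *r.Lte) {
		return false
	}
	return true
}

func main() {
	r := NumericRange{Gte: ptr(10), Lt: ptr(20)} // 10 <= v < 20
	fmt.Println(r.contains(10), r.contains(19.99), r.contains(20)) // true true false
}
```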

type NumericRangeCondition

type NumericRangeCondition struct {
	Field     string       `json:"field"`
	Range     NumericRange `json:"-"`
	FieldType FieldType    `json:"-"`
}

NumericRangeCondition filters by numeric range. SQL equivalent: WHERE field >= min AND field <= max

func NewNumericRange

func NewNumericRange(field string, r NumericRange) *NumericRangeCondition

NewNumericRange creates a numeric range condition for internal fields.

func NewUserNumericRange

func NewUserNumericRange(field string, r NumericRange) *NumericRangeCondition

NewUserNumericRange creates a numeric range condition for user-defined fields.

func (*NumericRangeCondition) IsFilterCondition

func (c *NumericRangeCondition) IsFilterCondition()

func (*NumericRangeCondition) MarshalJSON

func (c *NumericRangeCondition) MarshalJSON() ([]byte, error)

func (*NumericRangeCondition) UnmarshalJSON

func (c *NumericRangeCondition) UnmarshalJSON(data []byte) error

type SearchRequest

type SearchRequest struct {
	// CollectionName is the target collection to search in
	CollectionName string `json:"collectionName"`

	// Vector is the query embedding to find similar vectors for
	Vector []float32 `json:"vector"`

	// TopK is the maximum number of results to return
	TopK int `json:"maxResults"`

	// Filters is optional metadata filtering (AND/OR/NOT logic)
	Filters []*FilterSet `json:"filters,omitempty"`
}

SearchRequest represents a single similarity search query. Use with Service.Search() for single or batch queries.

type SearchResult

type SearchResult struct {
	// ID is the unique identifier of the matched point
	ID string `json:"id"`

	// Score is the similarity score (higher = more similar for cosine)
	Score float32 `json:"score"`

	// Payload contains the metadata stored with the vector
	Payload map[string]any `json:"payload"`

	// Vector is the stored embedding (only populated if requested)
	Vector []float32 `json:"vector,omitempty"`

	// CollectionName identifies which collection this result came from
	CollectionName string `json:"collectionName,omitempty"`
}

SearchResult represents a single search result with its similarity score. It is database-agnostic: the payload is converted to map[string]any.

type Service

type Service interface {
	// Search performs similarity search across one or more requests.
	// Each request can target a different collection with different filters.
	// Returns:
	//   - results: slice of result slices—one []SearchResult per request
	//   - err: combined error (per-request errors and systemic errors joined)
	//
	// Example:
	//   results, err := db.Search(ctx,
	//       SearchRequest{CollectionName: "docs", Vector: vec1, TopK: 10},
	//       SearchRequest{CollectionName: "docs", Vector: vec2, TopK: 5, Filters: filters},
	//   )
	//   if err != nil {
	//       return err
	//   }
	//   for _, res := range results {
	//       // use res...
	//   }
	Search(ctx context.Context, requests ...SearchRequest) ([][]SearchResult, error)

	// Insert adds embeddings to a collection.
	// Uses batch processing internally for efficiency.
	Insert(ctx context.Context, collectionName string, inputs []EmbeddingInput) error

	// Delete removes points by their IDs from a collection.
	Delete(ctx context.Context, collection string, ids []string) error

	// EnsureCollection creates a collection if it doesn't exist.
	// Safe to call multiple times—no-op if collection already exists.
	EnsureCollection(ctx context.Context, name string, vectorSize uint64) error

	// GetCollection retrieves metadata about a collection.
	GetCollection(ctx context.Context, name string) (*Collection, error)

	// ListCollections returns names of all collections.
	ListCollections(ctx context.Context) ([]string, error)
}

Service is the common interface for all vector databases. It provides a database-agnostic abstraction for vector similarity search, allowing applications to switch between different vector databases (Qdrant, pgVector, etc.) without changing application code.

Example usage:

func NewSearchService(db vectordb.Service) *SearchService {
    return &SearchService{db: db}
}

// Works with any implementation, for example:
// - qdrant.NewAdapter(qc.Client())
// - a pgvector adapter (planned)

type TimeRange

type TimeRange struct {
	Gt  *time.Time `json:"after,omitempty"`      // After (exclusive)
	Gte *time.Time `json:"atOrAfter,omitempty"`  // AtOrAfter (inclusive)
	Lt  *time.Time `json:"before,omitempty"`     // Before (exclusive)
	Lte *time.Time `json:"atOrBefore,omitempty"` // AtOrBefore (inclusive)
}

TimeRange defines bounds for time filtering. Used with NewTimeRange for cleaner constructor calls.

type TimeRangeCondition

type TimeRangeCondition struct {
	Field     string    `json:"field"`
	Range     TimeRange `json:"-"`
	FieldType FieldType `json:"-"`
}

TimeRangeCondition filters by datetime range. SQL equivalent: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'

func NewTimeRange

func NewTimeRange(field string, t TimeRange) *TimeRangeCondition

NewTimeRange creates a time range condition for internal fields.

func NewUserTimeRange

func NewUserTimeRange(field string, t TimeRange) *TimeRangeCondition

NewUserTimeRange creates a time range condition for user-defined fields.

func (*TimeRangeCondition) IsFilterCondition

func (c *TimeRangeCondition) IsFilterCondition()

func (TimeRangeCondition) MarshalJSON

func (c TimeRangeCondition) MarshalJSON() ([]byte, error)

func (*TimeRangeCondition) UnmarshalJSON

func (c *TimeRangeCondition) UnmarshalJSON(data []byte) error
