Documentation ¶
Overview ¶
Package vectordb provides a database-agnostic abstraction for vector similarity search.
This package defines a common interface Service that can be implemented by different vector database adapters (Qdrant, pgVector, etc.), allowing applications to switch between databases without changing application code.
Architecture ¶
┌─────────────────────────────────────────────────────────────┐
│                      Application Layer                      │
│      (uses vectordb.Service - no DB-specific imports)       │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      vectordb.Service                       │
│           (common interface + DB-agnostic types)            │
└──────────────────────────┬──────────────────────────────────┘
                           │
              ┌────────────┴────────────┐
              ▼                         ▼
      ┌───────────────┐         ┌────────────────┐
      │ qdrant.Adapter│         │pgvector.Adapter│
      │  (implements) │         │   (planned)    │
      └───────────────┘         └────────────────┘
Benefits ¶
- Single Source of Truth: Filter types, search interfaces, and result types defined once.
- Easy to Add New DBs: Just add a new adapter; consuming projects don't change.
- Consistent API: All projects using std get the same interface.
- Testability: Mock the interface once, works for all DBs.
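The testability point can be sketched with an in-memory fake. So that the block compiles on its own, it declares a trimmed local copy of the interface (Search and Insert only) and stand-in types instead of importing the real package. The names `fakeDB`, `seededFake`, and `topHit`, and the dot-product ranking, are hypothetical and not part of vectordb.

```go
package main

import (
	"context"
	"fmt"
	"sort"
)

// Local stand-ins mirroring a subset of the documented vectordb types.
type EmbeddingInput struct {
	ID     string
	Vector []float32
}

type SearchRequest struct {
	CollectionName string
	Vector         []float32
	TopK           int
}

type SearchResult struct {
	ID    string
	Score float32
}

// Service is a trimmed copy of the documented interface (assumption: two methods only).
type Service interface {
	Search(ctx context.Context, requests ...SearchRequest) ([][]SearchResult, error)
	Insert(ctx context.Context, collectionName string, inputs []EmbeddingInput) error
}

// fakeDB is a hypothetical in-memory test double that ranks by dot product.
type fakeDB struct {
	points map[string][]EmbeddingInput
}

func (f *fakeDB) Insert(_ context.Context, collectionName string, inputs []EmbeddingInput) error {
	f.points[collectionName] = append(f.points[collectionName], inputs...)
	return nil
}

func (f *fakeDB) Search(_ context.Context, requests ...SearchRequest) ([][]SearchResult, error) {
	out := make([][]SearchResult, 0, len(requests))
	for _, req := range requests {
		var hits []SearchResult
		for _, p := range f.points[req.CollectionName] {
			hits = append(hits, SearchResult{ID: p.ID, Score: dot(req.Vector, p.Vector)})
		}
		// Highest score first, then truncate to TopK.
		sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
		if req.TopK > 0 && len(hits) > req.TopK {
			hits = hits[:req.TopK]
		}
		out = append(out, hits)
	}
	return out, nil
}

func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		if i < len(b) {
			s += a[i] * b[i]
		}
	}
	return s
}

// seededFake returns a fake preloaded with two orthogonal vectors.
func seededFake() *fakeDB {
	db := &fakeDB{points: map[string][]EmbeddingInput{}}
	_ = db.Insert(context.Background(), "documents", []EmbeddingInput{
		{ID: "a", Vector: []float32{1, 0}},
		{ID: "b", Vector: []float32{0, 1}},
	})
	return db
}

// topHit is application-style code: it sees only the Service interface.
func topHit(db Service, collection string, query []float32) (string, error) {
	results, err := db.Search(context.Background(),
		SearchRequest{CollectionName: collection, Vector: query, TopK: 1})
	if err != nil {
		return "", err
	}
	if len(results) == 0 || len(results[0]) == 0 {
		return "", nil
	}
	return results[0][0].ID, nil
}

func main() {
	id, err := topHit(seededFake(), "documents", []float32{1, 0})
	fmt.Println(id, err)
}
```

Because `topHit` accepts the interface, the same function would work unchanged against a real adapter.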
Usage ¶
In your application, depend only on the vectordb interface:
import (
	"context"

	"github.com/Aleph-Alpha/std/v1/vectordb"
)

type SearchService struct {
	db vectordb.Service
}

func NewSearchService(db vectordb.Service) *SearchService {
	return &SearchService{db: db}
}

func (s *SearchService) Search(ctx context.Context, query string, vector []float32) ([]vectordb.SearchResult, error) {
	results, err := s.db.Search(ctx, vectordb.SearchRequest{
		CollectionName: "documents",
		Vector:         vector,
		TopK:           10,
		Filters: []*vectordb.FilterSet{
			{
				Must: &vectordb.ConditionSet{
					Conditions: []vectordb.FilterCondition{
						vectordb.NewMatch("status", "published"),
					},
				},
			},
		},
	})
	if err != nil {
		return nil, err
	}
	// Search returns one result slice per request; only one request was sent.
	if len(results) == 0 {
		return nil, nil
	}
	return results[0], nil
}
Wire Up with Qdrant ¶
In your main setup:
import (
	"log"

	"github.com/Aleph-Alpha/std/v1/qdrant"
	"github.com/Aleph-Alpha/std/v1/vectordb"
)

func main() {
	// Create Qdrant client (with health checks, config, etc.)
	qc, err := qdrant.NewQdrantClient(qdrant.QdrantParams{
		Config: &qdrant.Config{Endpoint: "localhost", Port: 6334},
	})
	if err != nil {
		log.Fatalf("create qdrant client: %v", err)
	}
	// Create adapter for DB-agnostic usage
	db := qdrant.NewAdapter(qc.Client())
	// Use in application
	svc := NewSearchService(db)
	// ...
}
Package Layout ¶
vectordb/
├── interface.go   # Service interface
├── types.go       # SearchRequest, SearchResult, EmbeddingInput, Collection
├── filters.go     # FilterSet, FilterCondition, and condition types
├── utils.go       # Convenience constructors (New*) and JSON helpers
└── doc.go         # This file

qdrant/            # Qdrant package (includes adapter)
├── client.go      # QdrantClient wrapper
├── operations.go  # Adapter - implements Service
├── converter.go   # vectordb types → qdrant types
└── ...
Future adapters would live in their own packages:
pgvector/ # (planned) PostgreSQL pgvector adapter
Filter Types ¶
The package provides DB-agnostic filter conditions:
| Type                  | Description                 | SQL Equivalent                      |
|-----------------------|-----------------------------|-------------------------------------|
| MatchCondition        | Exact value match           | WHERE field = value                 |
| MatchAnyCondition     | Value in set                | WHERE field IN (...)                |
| MatchExceptCondition  | Value not in set            | WHERE field NOT IN (...)            |
| NumericRangeCondition | Numeric range               | WHERE field >= min AND field <= max |
| TimeRangeCondition    | Datetime range              | WHERE created_at BETWEEN ...        |
| IsNullCondition       | Field is null               | WHERE field IS NULL                 |
| IsEmptyCondition      | Field is empty/null/missing | WHERE field IS NULL OR field = ''   |
| NestedFilterCondition | Nested filter group         | (A OR B) AND (C OR D)               |
Use convenience constructors for cleaner code:
// Internal field (top-level in payload)
vectordb.NewMatch("status", "published")
// User-defined field (stored under "custom." prefix)
vectordb.NewUserMatch("category", "research")
// Range conditions with NumericRange/TimeRange structs
vectordb.NewNumericRange("price", vectordb.NumericRange{Gte: &min, Lt: &max})
vectordb.NewTimeRange("created_at", vectordb.TimeRange{Gte: &start, Lt: &end})
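The range structs take pointer bounds, so an unset (nil) bound means "unbounded on that side". A minimal sketch of how such bounds could be evaluated, for instance inside a test fake, is shown here. `NumericRange` is a local mirror of the documented struct; `ptr` and `contains` are hypothetical helpers and not part of the package.

```go
package main

import "fmt"

// NumericRange mirrors the documented struct: a nil bound is ignored.
type NumericRange struct {
	Gt, Gte, Lt, Lte *float64
}

// ptr is a hypothetical helper for taking the address of a literal.
func ptr(v float64) *float64 { return &v }

// contains reports whether v satisfies every bound that is set.
func contains(r NumericRange, v float64) bool {
	if r.Gt != nil && !(v > *r.Gt) {
		return false
	}
	if r.Gte != nil && !(v >= *r.Gte) {
		return false
	}
	if r.Lt != nil && !(v < *r.Lt) {
		return false
	}
	if r.Lte != nil && !(v <= *r.Lte) {
		return false
	}
	return true
}

func main() {
	// Roughly: WHERE price >= 10 AND price < 20
	r := NumericRange{Gte: ptr(10), Lt: ptr(20)}
	fmt.Println(contains(r, 10), contains(r, 20))
}
```

Mixing an inclusive lower bound with an exclusive upper bound gives a half-open interval, which avoids double-counting values that sit on a boundary between adjacent ranges.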
Index ¶
- func Must(conditions ...FilterCondition) func(*FilterSet)
- func MustNot(conditions ...FilterCondition) func(*FilterSet)
- func Should(conditions ...FilterCondition) func(*FilterSet)
- type Collection
- type ConditionSet
- type EmbeddingInput
- type FieldType
- type FilterCondition
- type FilterSet
- type IsEmptyCondition
- type IsNullCondition
- type MatchAnyCondition
- type MatchCondition
- type MatchExceptCondition
- type NestedFilterCondition
- type NumericRange
- type NumericRangeCondition
- type SearchRequest
- type SearchResult
- type Service
- type TimeRange
- type TimeRangeCondition
Functions ¶
func Must ¶
func Must(conditions ...FilterCondition) func(*FilterSet)
Must creates a Must clause (AND logic) with the given conditions. All conditions must match for a document to be included.
func MustNot ¶
func MustNot(conditions ...FilterCondition) func(*FilterSet)
MustNot creates a MustNot clause (NOT logic) with the given conditions. Documents matching any of these conditions are excluded.
func Should ¶
func Should(conditions ...FilterCondition) func(*FilterSet)
Should creates a Should clause (OR logic) with the given conditions. At least one condition must match for a document to be included.
Types ¶
type Collection ¶
type Collection struct {
// Name is the unique identifier of the collection
Name string `json:"name"`
// Status indicates the operational state (e.g., "Green", "Yellow")
Status string `json:"status"`
// VectorSize is the dimension of vectors in this collection
VectorSize int `json:"vectorSize"`
// Distance is the similarity metric (e.g., "Cosine", "Dot", "Euclid")
Distance string `json:"distance"`
// VectorCount is the number of indexed vectors
VectorCount uint64 `json:"vectorCount"`
// PointCount is the number of stored points/documents
PointCount uint64 `json:"pointCount"`
}
Collection contains metadata about a vector collection.
type ConditionSet ¶
type ConditionSet struct {
Conditions []FilterCondition `json:"conditions,omitempty"`
}
ConditionSet holds a group of conditions for a single clause.
func (*ConditionSet) MarshalJSON ¶
func (cs *ConditionSet) MarshalJSON() ([]byte, error)
MarshalJSON implements custom JSON marshaling for ConditionSet. This is needed because FilterCondition is an interface.
func (*ConditionSet) UnmarshalJSON ¶
func (cs *ConditionSet) UnmarshalJSON(data []byte) error
UnmarshalJSON implements custom JSON unmarshaling for ConditionSet. It detects the condition type based on JSON keys and deserializes into the appropriate concrete type (MatchCondition, NumericRangeCondition, etc.)
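Key-based detection of this kind can be sketched as follows. The JSON keys (`equalTo`, `anyOf`, `noneOf`) come from the struct tags documented on this page; the function names `decodeCondition`, `hasKey`, and `kindOf` are hypothetical, only three condition types are covered, and the real implementation may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Local mirrors of three documented condition types (FieldType omitted).
type MatchCondition struct {
	Field string `json:"field"`
	Value any    `json:"equalTo"`
}

type MatchAnyCondition struct {
	Field  string `json:"field"`
	Values []any  `json:"anyOf"`
}

type MatchExceptCondition struct {
	Field  string `json:"field"`
	Values []any  `json:"noneOf"`
}

func hasKey(m map[string]json.RawMessage, k string) bool {
	_, ok := m[k]
	return ok
}

// decodeCondition probes the object's keys, then unmarshals the same bytes
// into the concrete type that the distinguishing key identifies.
func decodeCondition(data []byte) (any, error) {
	var keys map[string]json.RawMessage
	if err := json.Unmarshal(data, &keys); err != nil {
		return nil, err
	}
	var target any
	switch {
	case hasKey(keys, "equalTo"):
		target = &MatchCondition{}
	case hasKey(keys, "anyOf"):
		target = &MatchAnyCondition{}
	case hasKey(keys, "noneOf"):
		target = &MatchExceptCondition{}
	default:
		return nil, errors.New("unknown condition type")
	}
	if err := json.Unmarshal(data, target); err != nil {
		return nil, err
	}
	return target, nil
}

// kindOf returns a short label for the decoded type, or "error".
func kindOf(raw string) string {
	c, err := decodeCondition([]byte(raw))
	if err != nil {
		return "error"
	}
	switch c.(type) {
	case *MatchCondition:
		return "match"
	case *MatchAnyCondition:
		return "matchAny"
	case *MatchExceptCondition:
		return "matchExcept"
	}
	return "unknown"
}

func main() {
	fmt.Println(kindOf(`{"field":"status","equalTo":"published"}`))
	fmt.Println(kindOf(`{"field":"tag","anyOf":["ml","ai"]}`))
}
```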
type EmbeddingInput ¶
type EmbeddingInput struct {
// ID is the unique identifier for this embedding
ID string `json:"id"`
// Vector is the dense embedding representation
Vector []float32 `json:"vector"`
// Payload is optional metadata to store with the vector
Payload map[string]any `json:"payload,omitempty"`
}
EmbeddingInput is the input for inserting vectors into a collection.
type FieldType ¶
type FieldType int
FieldType indicates whether a field is internal (system-managed) or user-defined (stored under a prefix like "custom.").
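As an illustration of the prefix rule, an adapter might resolve the payload key as sketched here. The page lists no exported constants, so the names `internalField` and `userField` are invented for this example, as are `userPrefix` and `payloadKey`. Only the "custom." prefix itself comes from the description above.

```go
package main

import "fmt"

// FieldType mirrors the documented type.
type FieldType int

// Hypothetical constant names; the real package may use different ones.
const (
	internalField FieldType = iota
	userField
)

// userPrefix is the prefix the documentation mentions for user-defined fields.
const userPrefix = "custom."

// payloadKey returns the key under which a field would be looked up.
func payloadKey(field string, ft FieldType) string {
	if ft == userField {
		return userPrefix + field
	}
	return field
}

func main() {
	fmt.Println(payloadKey("status", internalField))
	fmt.Println(payloadKey("category", userField))
}
```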
type FilterCondition ¶
type FilterCondition interface {
// IsFilterCondition is a marker method to ensure type safety
IsFilterCondition()
}
FilterCondition is the interface all filter conditions must implement. Each database adapter converts these to its native filter format.
type FilterSet ¶
type FilterSet struct {
// Must: All conditions must match (AND)
Must *ConditionSet `json:"must,omitempty"`
// Should: At least one condition must match (OR)
Should *ConditionSet `json:"should,omitempty"`
// MustNot: None of the conditions should match (NOT)
MustNot *ConditionSet `json:"mustNot,omitempty"`
}
FilterSet supports Must (AND), Should (OR), and MustNot (NOT) clauses. Use with SearchRequest.Filters to filter search results.
Example:
filters := &FilterSet{
	Must: &ConditionSet{
		Conditions: []FilterCondition{
			&MatchCondition{Field: "city", Value: "London"},
		},
	},
}
func NewFilterSet ¶
NewFilterSet creates a FilterSet with the given clauses. Use with Must(), Should(), and MustNot() helpers.
Example:
vectordb.NewFilterSet(
	vectordb.Must(vectordb.NewMatch("status", "published")),
	vectordb.Should(vectordb.NewMatch("tag", "ml"), vectordb.NewMatch("tag", "ai")),
)
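Must, Should, and MustNot each return a func(*FilterSet), which is the functional-options pattern. The sketch below shows how such options could compose. The signature of NewFilterSet is not shown on this page, so the variadic form used here is an assumption, and all types are local stand-ins.

```go
package main

import "fmt"

// Local stand-ins for the documented filter types.
type FilterCondition interface{ IsFilterCondition() }

type MatchCondition struct {
	Field string
	Value any
}

func (*MatchCondition) IsFilterCondition() {}

type ConditionSet struct{ Conditions []FilterCondition }

type FilterSet struct {
	Must, Should, MustNot *ConditionSet
}

// Must and Should follow the documented signatures: each returns an option
// that fills in one clause of the FilterSet.
func Must(conditions ...FilterCondition) func(*FilterSet) {
	return func(fs *FilterSet) { fs.Must = &ConditionSet{Conditions: conditions} }
}

func Should(conditions ...FilterCondition) func(*FilterSet) {
	return func(fs *FilterSet) { fs.Should = &ConditionSet{Conditions: conditions} }
}

// NewFilterSet applies each option in order.
// Assumption: the real signature is not documented on this page.
func NewFilterSet(opts ...func(*FilterSet)) *FilterSet {
	fs := &FilterSet{}
	for _, opt := range opts {
		opt(fs)
	}
	return fs
}

func main() {
	fs := NewFilterSet(
		Must(&MatchCondition{Field: "status", Value: "published"}),
		Should(
			&MatchCondition{Field: "tag", Value: "ml"},
			&MatchCondition{Field: "tag", Value: "ai"},
		),
	)
	fmt.Println(len(fs.Must.Conditions), len(fs.Should.Conditions), fs.MustNot == nil)
}
```

Clauses that are not passed stay nil, which pairs naturally with the omitempty tags on FilterSet.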
type IsEmptyCondition ¶
IsEmptyCondition checks if a field is empty (doesn't exist, null, or []). SQL equivalent: WHERE field IS NULL OR field = '' OR field = []
func NewIsEmpty ¶
func NewIsEmpty(field string) *IsEmptyCondition
NewIsEmpty creates an IS EMPTY condition for internal fields.
func NewUserIsEmpty ¶
func NewUserIsEmpty(field string) *IsEmptyCondition
NewUserIsEmpty creates an IS EMPTY condition for user-defined fields.
func (*IsEmptyCondition) IsFilterCondition ¶
func (c *IsEmptyCondition) IsFilterCondition()
type IsNullCondition ¶
IsNullCondition checks if a field has a NULL value. SQL equivalent: WHERE field IS NULL
func NewIsNull ¶
func NewIsNull(field string) *IsNullCondition
NewIsNull creates an IS NULL condition for internal fields.
func NewUserIsNull ¶
func NewUserIsNull(field string) *IsNullCondition
NewUserIsNull creates an IS NULL condition for user-defined fields.
func (*IsNullCondition) IsFilterCondition ¶
func (c *IsNullCondition) IsFilterCondition()
type MatchAnyCondition ¶
type MatchAnyCondition struct {
Field string `json:"field"`
Values []any `json:"anyOf"`
FieldType FieldType `json:"-"`
}
MatchAnyCondition matches if value is one of the given values (IN operator). SQL equivalent: WHERE field IN (value1, value2, ...)
func NewMatchAny ¶
func NewMatchAny(field string, values ...any) *MatchAnyCondition
NewMatchAny creates an IN condition for internal fields.
func NewUserMatchAny ¶
func NewUserMatchAny(field string, values ...any) *MatchAnyCondition
NewUserMatchAny creates an IN condition for user-defined fields.
func (*MatchAnyCondition) IsFilterCondition ¶
func (c *MatchAnyCondition) IsFilterCondition()
type MatchCondition ¶
type MatchCondition struct {
Field string `json:"field"`
Value any `json:"equalTo"`
FieldType FieldType `json:"-"`
}
MatchCondition represents an exact match filter (WHERE field = value). Supports string, bool, and int64 values.
func NewMatch ¶
func NewMatch(field string, value any) *MatchCondition
NewMatch creates a match condition for internal fields.
func NewUserMatch ¶
func NewUserMatch(field string, value any) *MatchCondition
NewUserMatch creates a match condition for user-defined fields.
func (*MatchCondition) IsFilterCondition ¶
func (c *MatchCondition) IsFilterCondition()
type MatchExceptCondition ¶
type MatchExceptCondition struct {
Field string `json:"field"`
Values []any `json:"noneOf"`
FieldType FieldType `json:"-"`
}
MatchExceptCondition matches if value is NOT one of the given values (NOT IN). SQL equivalent: WHERE field NOT IN (value1, value2, ...)
func NewMatchExcept ¶
func NewMatchExcept(field string, values ...any) *MatchExceptCondition
NewMatchExcept creates a NOT IN condition for internal fields.
func NewUserMatchExcept ¶
func NewUserMatchExcept(field string, values ...any) *MatchExceptCondition
NewUserMatchExcept creates a NOT IN condition for user-defined fields.
func (*MatchExceptCondition) IsFilterCondition ¶
func (c *MatchExceptCondition) IsFilterCondition()
type NestedFilterCondition ¶ added in v1.1.0
type NestedFilterCondition struct {
Filter *FilterSet `json:"filter"`
}
NestedFilterCondition allows a FilterSet to be used as a condition. This enables complex recursive filters like (A AND B) OR (C AND D).
Example - (A OR B) AND (C OR D):
filters := &FilterSet{
	Must: &ConditionSet{
		Conditions: []FilterCondition{
			&NestedFilterCondition{
				Filter: &FilterSet{
					Should: &ConditionSet{Conditions: []FilterCondition{A, B}},
				},
			},
			&NestedFilterCondition{
				Filter: &FilterSet{
					Should: &ConditionSet{Conditions: []FilterCondition{C, D}},
				},
			},
		},
	},
}
Example - (A AND B) OR (C AND D):
filters := &FilterSet{
	Should: &ConditionSet{
		Conditions: []FilterCondition{
			&NestedFilterCondition{
				Filter: &FilterSet{
					Must: &ConditionSet{Conditions: []FilterCondition{A, B}},
				},
			},
			&NestedFilterCondition{
				Filter: &FilterSet{
					Must: &ConditionSet{Conditions: []FilterCondition{C, D}},
				},
			},
		},
	},
}
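To make the recursive semantics concrete, here is a small in-memory evaluator for nested filters. It is only a sketch: the types are local stand-ins limited to MatchCondition and NestedFilterCondition, and `matches`, `eval`, `anyOf`, and `exampleFilter` are hypothetical helpers. Real adapters translate filters into the database's native format instead of evaluating them in Go.

```go
package main

import "fmt"

type FilterCondition interface{ IsFilterCondition() }

type MatchCondition struct {
	Field string
	Value any
}

func (*MatchCondition) IsFilterCondition() {}

type NestedFilterCondition struct{ Filter *FilterSet }

func (*NestedFilterCondition) IsFilterCondition() {}

type ConditionSet struct{ Conditions []FilterCondition }

type FilterSet struct {
	Must, Should, MustNot *ConditionSet
}

// matches applies Must (all), MustNot (none), and Should (at least one).
func matches(fs *FilterSet, payload map[string]any) bool {
	if fs == nil {
		return true
	}
	if fs.Must != nil {
		for _, c := range fs.Must.Conditions {
			if !eval(c, payload) {
				return false
			}
		}
	}
	if fs.MustNot != nil {
		for _, c := range fs.MustNot.Conditions {
			if eval(c, payload) {
				return false
			}
		}
	}
	if fs.Should != nil && len(fs.Should.Conditions) > 0 {
		for _, c := range fs.Should.Conditions {
			if eval(c, payload) {
				return true
			}
		}
		return false
	}
	return true
}

// eval handles one condition; nested filters recurse back into matches.
// The equality check assumes comparable payload values such as strings.
func eval(c FilterCondition, payload map[string]any) bool {
	switch v := c.(type) {
	case *MatchCondition:
		return payload[v.Field] == v.Value
	case *NestedFilterCondition:
		return matches(v.Filter, payload)
	}
	return false
}

// anyOf wraps conditions in a nested Should group (an OR).
func anyOf(cs ...FilterCondition) FilterCondition {
	return &NestedFilterCondition{Filter: &FilterSet{Should: &ConditionSet{Conditions: cs}}}
}

// exampleFilter builds (city=London OR city=Berlin) AND (lang=en OR lang=de).
func exampleFilter() *FilterSet {
	return &FilterSet{Must: &ConditionSet{Conditions: []FilterCondition{
		anyOf(&MatchCondition{Field: "city", Value: "London"}, &MatchCondition{Field: "city", Value: "Berlin"}),
		anyOf(&MatchCondition{Field: "lang", Value: "en"}, &MatchCondition{Field: "lang", Value: "de"}),
	}}}
}

func main() {
	f := exampleFilter()
	fmt.Println(matches(f, map[string]any{"city": "Berlin", "lang": "de"}))
	fmt.Println(matches(f, map[string]any{"city": "London", "lang": "fr"}))
}
```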
func (*NestedFilterCondition) IsFilterCondition ¶ added in v1.1.0
func (c *NestedFilterCondition) IsFilterCondition()
type NumericRange ¶
type NumericRange struct {
Gt *float64 `json:"greaterThan,omitempty"` // GreaterThan (exclusive)
Gte *float64 `json:"greaterThanOrEqualTo,omitempty"` // GreaterThanOrEqualTo (inclusive)
Lt *float64 `json:"lessThan,omitempty"` // LessThan (exclusive)
Lte *float64 `json:"lessThanOrEqualTo,omitempty"` // LessThanOrEqualTo (inclusive)
}
NumericRange defines bounds for numeric filtering. Used with NewNumericRange for cleaner constructor calls.
type NumericRangeCondition ¶
type NumericRangeCondition struct {
Field string `json:"field"`
Range NumericRange `json:"-"`
FieldType FieldType `json:"-"`
}
NumericRangeCondition filters by numeric range. SQL equivalent: WHERE field >= min AND field <= max
func NewNumericRange ¶
func NewNumericRange(field string, r NumericRange) *NumericRangeCondition
NewNumericRange creates a numeric range condition for internal fields.
func NewUserNumericRange ¶
func NewUserNumericRange(field string, r NumericRange) *NumericRangeCondition
NewUserNumericRange creates a numeric range condition for user-defined fields.
func (*NumericRangeCondition) IsFilterCondition ¶
func (c *NumericRangeCondition) IsFilterCondition()
func (*NumericRangeCondition) MarshalJSON ¶
func (c *NumericRangeCondition) MarshalJSON() ([]byte, error)
func (*NumericRangeCondition) UnmarshalJSON ¶
func (c *NumericRangeCondition) UnmarshalJSON(data []byte) error
type SearchRequest ¶
type SearchRequest struct {
// CollectionName is the target collection to search in
CollectionName string `json:"collectionName"`
// Vector is the query embedding to find similar vectors for
Vector []float32 `json:"vector"`
// TopK is the maximum number of results to return
TopK int `json:"maxResults"`
// Filters is optional metadata filtering (AND/OR/NOT logic)
Filters []*FilterSet `json:"filters,omitempty"`
}
SearchRequest represents a single similarity search query. Use with Service.Search() for single or batch queries.
type SearchResult ¶
type SearchResult struct {
// ID is the unique identifier of the matched point
ID string `json:"id"`
// Score is the similarity score (higher = more similar for cosine)
Score float32 `json:"score"`
// Payload contains the metadata stored with the vector
Payload map[string]any `json:"payload"`
// Vector is the stored embedding (only populated if requested)
Vector []float32 `json:"vector,omitempty"`
// CollectionName identifies which collection this result came from
CollectionName string `json:"collectionName,omitempty"`
}
SearchResult represents a single search result with its similarity score. This is database-agnostic—payload is converted to map[string]any.
type Service ¶
type Service interface {
// Search performs similarity search across one or more requests.
// Each request can target a different collection with different filters.
// Returns:
// - results: slice of result slices—one []SearchResult per request
// - err: combined error (per-request errors and systemic errors joined)
//
// Example:
// results, err := db.Search(ctx,
// SearchRequest{CollectionName: "docs", Vector: vec1, TopK: 10},
// SearchRequest{CollectionName: "docs", Vector: vec2, TopK: 5, Filters: filters},
// )
// if err != nil {
// return err
// }
// for _, res := range results {
// // use res...
// }
Search(ctx context.Context, requests ...SearchRequest) ([][]SearchResult, error)
// Insert adds embeddings to a collection.
// Uses batch processing internally for efficiency.
Insert(ctx context.Context, collectionName string, inputs []EmbeddingInput) error
// Delete removes points by their IDs from a collection.
Delete(ctx context.Context, collection string, ids []string) error
// EnsureCollection creates a collection if it doesn't exist.
// Safe to call multiple times—no-op if collection already exists.
EnsureCollection(ctx context.Context, name string, vectorSize uint64) error
// GetCollection retrieves metadata about a collection.
GetCollection(ctx context.Context, name string) (*Collection, error)
// ListCollections returns names of all collections.
ListCollections(ctx context.Context) ([]string, error)
}
Service is the common interface for all vector databases. It provides a database-agnostic abstraction for vector similarity search, allowing applications to switch between different vector databases (Qdrant, pgVector, etc.) without changing application code.
Example usage:
func NewSearchService(db vectordb.Service) *SearchService {
return &SearchService{db: db}
}
// Works with any implementation:
// - qdrant.NewAdapter(qc.Client())
// - a future pgvector adapter (planned)
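The Search contract (one result slice per request, with per-request errors joined into a single error) could be implemented along these lines. This is a sketch under stated assumptions: `searchOne` and `searchBatch` are hypothetical, the types are trimmed stand-ins, and whether a real adapter returns partial results alongside a non-nil error is not stated on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Trimmed stand-ins for the documented request and result types.
type SearchRequest struct{ CollectionName string }

type SearchResult struct{ ID string }

// searchOne is a hypothetical per-request backend call.
func searchOne(req SearchRequest) ([]SearchResult, error) {
	if req.CollectionName == "" {
		return nil, errors.New("empty collection name")
	}
	return []SearchResult{{ID: req.CollectionName + "-1"}}, nil
}

// searchBatch keeps results aligned with requests by index and joins
// per-request errors with errors.Join (Go 1.20+), which returns nil
// when there is nothing to join.
func searchBatch(requests ...SearchRequest) ([][]SearchResult, error) {
	results := make([][]SearchResult, len(requests))
	var errs []error
	for i, req := range requests {
		res, err := searchOne(req)
		if err != nil {
			errs = append(errs, fmt.Errorf("request %d: %w", i, err))
			continue
		}
		results[i] = res
	}
	return results, errors.Join(errs...)
}

func main() {
	results, err := searchBatch(
		SearchRequest{CollectionName: "docs"},
		SearchRequest{}, // fails: no collection name
	)
	fmt.Println(len(results), len(results[0]), err)
}
```

Keeping `results[i]` aligned with `requests[i]` lets a caller match each result slice to the request that produced it, even when some requests fail.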
type TimeRange ¶
type TimeRange struct {
Gt *time.Time `json:"after,omitempty"` // After (exclusive)
Gte *time.Time `json:"atOrAfter,omitempty"` // AtOrAfter (inclusive)
Lt *time.Time `json:"before,omitempty"` // Before (exclusive)
Lte *time.Time `json:"atOrBefore,omitempty"` // AtOrBefore (inclusive)
}
TimeRange defines bounds for time filtering. Used with NewTimeRange for cleaner constructor calls.
type TimeRangeCondition ¶
type TimeRangeCondition struct {
Field string `json:"field"`
Range TimeRange `json:"-"`
FieldType FieldType `json:"-"`
}
TimeRangeCondition filters by datetime range. SQL equivalent: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
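The SQL above describes a half-open interval: inclusive start, exclusive end. A minimal sketch of the same check in Go follows, using a local mirror of the documented TimeRange fields. `inRange`, `year2024`, and `inYear2024` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

// TimeRange mirrors the documented struct: a nil bound is ignored.
type TimeRange struct {
	Gt, Gte, Lt, Lte *time.Time
}

// inRange reports whether t satisfies every bound that is set.
func inRange(r TimeRange, t time.Time) bool {
	if r.Gt != nil && !t.After(*r.Gt) {
		return false
	}
	if r.Gte != nil && t.Before(*r.Gte) {
		return false
	}
	if r.Lt != nil && !t.Before(*r.Lt) {
		return false
	}
	if r.Lte != nil && t.After(*r.Lte) {
		return false
	}
	return true
}

// year2024 builds the half-open interval [2024-01-01, 2025-01-01) in UTC.
func year2024() TimeRange {
	start := time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
	end := time.Date(2025, time.January, 1, 0, 0, 0, 0, time.UTC)
	return TimeRange{Gte: &start, Lt: &end}
}

// inYear2024 checks midnight UTC of the given date against year2024.
func inYear2024(y, m, d int) bool {
	return inRange(year2024(), time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC))
}

func main() {
	fmt.Println(inYear2024(2024, 1, 1), inYear2024(2025, 1, 1))
}
```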
func NewTimeRange ¶
func NewTimeRange(field string, t TimeRange) *TimeRangeCondition
NewTimeRange creates a time range condition for internal fields.
func NewUserTimeRange ¶
func NewUserTimeRange(field string, t TimeRange) *TimeRangeCondition
NewUserTimeRange creates a time range condition for user-defined fields.
func (*TimeRangeCondition) IsFilterCondition ¶
func (c *TimeRangeCondition) IsFilterCondition()
func (TimeRangeCondition) MarshalJSON ¶
func (c TimeRangeCondition) MarshalJSON() ([]byte, error)
func (*TimeRangeCondition) UnmarshalJSON ¶
func (c *TimeRangeCondition) UnmarshalJSON(data []byte) error