vectorstore

package
v0.4.0
Published: Feb 12, 2026 License: MIT Imports: 5 Imported by: 0

README

Vector Store Package

The vectorstore package provides a unified interface for working with vector databases in Aixgo. It enables semantic search, similarity matching, and retrieval-augmented generation (RAG) workflows through a provider-agnostic API.


Overview

Vector databases store high-dimensional embeddings alongside metadata, enabling fast similarity search. This package abstracts common operations across different vector database providers, allowing you to switch backends without changing your application code.

Key Concepts
  • Collection: Isolated namespace with use-case specific configuration
  • Document: A piece of content with its embedding vector and metadata
  • Embedding: A numerical representation (vector) of content, typically generated by a machine learning model
  • Query: Type-safe search with composable filters
  • Multi-modal Content: Support for text, images, audio, and video

Features

  • Collection-Based Isolation: Separate namespaces for different use cases
  • Provider Agnostic: Switch between memory, Firestore, and future providers using the same API
  • Multi-Modal Support: Text, images, audio, and video content
  • Flexible Filtering: Compose type-safe filters with And/Or/Not
  • Batch Operations: Efficiently upsert and retrieve multiple documents
  • Streaming Support: Handle large result sets with iterators
  • Multiple Distance Metrics: Cosine similarity, Euclidean distance, dot product
  • Production Ready: Built-in validation, error handling, and concurrency safety
  • Extensible: Easy to add custom vector store providers

Installation

go get github.com/aixgo-dev/aixgo/pkg/vectorstore
Provider-Specific Dependencies
# For Firestore support
go get cloud.google.com/go/firestore
go get github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore

# In-memory support ships with the base package; no separate install is needed.

Quick Start

Basic Usage with Memory Provider
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/aixgo-dev/aixgo/pkg/vectorstore"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory"
)

func main() {
    // Create an in-memory vector store
    store, err := memory.New()
    if err != nil {
        log.Fatal(err)
    }
    defer store.Close()

    // Get a collection
    coll := store.Collection("documents")

    ctx := context.Background()

    // Create a document with embedding
    doc := &vectorstore.Document{
        ID:      "doc-1",
        Content: vectorstore.NewTextContent("Aixgo is a production-grade AI agent framework for Go"),
        Embedding: vectorstore.NewEmbedding(
            []float32{0.1, 0.2, 0.3, /* ... 384 dimensions total */},
            "text-embedding-3-small",
        ),
        Tags: []string{"documentation", "framework"},
    }

    // Store the document
    result, err := coll.Upsert(ctx, doc)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Inserted: %d documents\n", result.Inserted)

    // Query for similar documents
    query := &vectorstore.Query{
        Embedding: vectorstore.NewEmbedding(
            []float32{0.11, 0.21, 0.31, /* ... */},
            "text-embedding-3-small",
        ),
        Limit:    5,
        MinScore: 0.7,
    }

    results, err := coll.Query(ctx, query)
    if err != nil {
        log.Fatal(err)
    }

    for _, match := range results.Matches {
        fmt.Printf("Score: %.3f - %s\n", match.Score, match.Document.Content.String())
    }
}
Production Setup with Firestore
import (
    "log"
    "time"

    "github.com/aixgo-dev/aixgo/pkg/vectorstore"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore"
)

// Create Firestore vector store
store, err := firestore.New(
    firestore.WithProject("my-gcp-project"),
    firestore.WithDatabase("(default)"),
    firestore.WithCredentials("/path/to/service-account.json"),
)
if err != nil {
    log.Fatal(err)
}
defer store.Close()

// Get collection with TTL for semantic caching
cache := store.Collection("cache",
    vectorstore.WithTTL(5*time.Minute),
    vectorstore.WithDeduplication(true),
)

Supported Providers

Provider    Status     Best For                  Persistence  Scalability
memory      Available  Development, testing      No           Low (10K docs)
firestore   Available  Production, serverless    Yes          High
qdrant      Planned    High-performance search   Yes          Very High
pgvector    Planned    Existing PostgreSQL apps  Yes          High
Memory Provider

Pros:

  • Zero setup required
  • Fast for small datasets
  • Perfect for testing

Cons:

  • Data lost on restart
  • Limited capacity (default 10K documents)
  • Brute-force search (O(n) complexity)

Use Cases:

  • Unit tests
  • Local development
  • Prototyping
Firestore Provider

Pros:

  • Serverless, fully managed
  • Automatic scaling
  • Real-time sync capabilities
  • Built-in security rules

Cons:

  • Requires GCP setup
  • Costs based on operations
  • Index creation takes time

Use Cases:

  • Production deployments
  • Serverless architectures
  • Projects already using Firebase/GCP

API Reference

VectorStore Interface
type VectorStore interface {
    // Collection returns a collection with the given name and options
    Collection(name string, opts ...CollectionOption) Collection

    // ListCollections returns all collection names
    ListCollections(ctx context.Context) ([]string, error)

    // DeleteCollection removes a collection and all its documents
    DeleteCollection(ctx context.Context, name string) error

    // Stats returns store-level statistics
    Stats(ctx context.Context) (*StoreStats, error)

    // Close closes the connection
    Close() error
}
Collection Interface
type Collection interface {
    // Upsert inserts or updates documents
    Upsert(ctx context.Context, docs ...*Document) (*UpsertResult, error)

    // UpsertBatch efficiently upserts multiple documents
    UpsertBatch(ctx context.Context, docs []*Document, opts ...BatchOption) (*UpsertResult, error)

    // Query performs similarity search
    Query(ctx context.Context, query *Query) (*QueryResult, error)

    // QueryStream returns an iterator for large result sets
    QueryStream(ctx context.Context, query *Query) (ResultIterator, error)

    // Get retrieves documents by IDs
    Get(ctx context.Context, ids ...string) ([]*Document, error)

    // Delete removes documents by IDs
    Delete(ctx context.Context, ids ...string) (*DeleteResult, error)

    // DeleteByFilter removes documents matching a filter
    DeleteByFilter(ctx context.Context, filter Filter) (*DeleteResult, error)

    // Count returns the number of documents matching a filter
    Count(ctx context.Context, filter Filter) (int64, error)
}
Document Structure
type Document struct {
    ID        string       // Unique identifier
    Content   *Content     // Multi-modal (text/image/audio/video)
    Embedding *Embedding   // Vector + model metadata
    Scope     *Scope       // Multi-tenant isolation
    Temporal  *Temporal    // Time-based metadata
    Tags      []string     // Indexed labels
    Metadata  map[string]any // Free-form data
}
Query Structure
type Query struct {
    Embedding         *Embedding     // Query vector
    Filters           Filter         // Composable filters
    Limit             int            // Number of results (default: 10)
    Offset            int            // Pagination offset
    MinScore          float32        // Minimum similarity (0.0-1.0)
    Metric            DistanceMetric // Similarity metric
    IncludeEmbeddings bool           // Include vectors in results
    IncludeContent    bool           // Include content in results
    SortBy            []SortBy       // Hybrid ranking
    Explain           bool           // Debug info
}
Composable Filters
// Composite filters
And(filters...)
Or(filters...)
Not(filter)

// Field-based filters
Eq(field, value)
Ne(field, value)
Gt/Gte/Lt/Lte(field, value)
In/NotIn(field, values...)
Contains/StartsWith/EndsWith(field, substring)

// Tag filters
TagFilter(tag)
TagsFilter(tags...)  // All tags
AnyTagFilter(tags...) // Any tag

// Scope filters
ScopeFilter(scope)
TenantFilter(tenant)
UserFilter(user)
SessionFilter(session)

// Time filters
CreatedAfter/Before(time)
UpdatedAfter/Before(time)
NotExpired()

// Score filters
ScoreAbove/Below/AtLeast(threshold)
Distance Metrics
const (
    DistanceMetricCosine     = "cosine"      // Default, range: -1 to 1
    DistanceMetricEuclidean  = "euclidean"   // Range: 0 to infinity
    DistanceMetricDotProduct = "dot_product" // Range: -inf to +inf
)

When to use:

  • Cosine: Most text embeddings (normalized vectors)
  • Euclidean: When vector magnitude matters
  • Dot Product: Faster than cosine for normalized vectors

Best Practices

1. Collection Design

Use collections for isolation:

// Semantic cache
cache := store.Collection("cache",
    vectorstore.WithTTL(5*time.Minute),
    vectorstore.WithDeduplication(true),
)

// Agent memory
memory := store.Collection("agent-memory",
    vectorstore.WithScope("user", "session"),
)

// Document store
docs := store.Collection("documents",
    vectorstore.WithIndexing(vectorstore.IndexTypeHNSW),
    vectorstore.WithDimensions(768),
)
2. Document Design

Keep content focused:

// Good: Focused, semantic chunk
doc := &vectorstore.Document{
    ID:      "user-guide-installation",
    Content: vectorstore.NewTextContent("To install Aixgo, run: go get github.com/aixgo-dev/aixgo"),
    Embedding: vectorstore.NewEmbedding(embedding, "text-embedding-3-small"),
    Tags:    []string{"installation", "quickstart"},
}

// Avoid: Too large, mixed topics
doc := &vectorstore.Document{
    Content: vectorstore.NewTextContent("... entire 50-page user manual ..."),
}
3. Batch Operations

Prefer batch upserts:

// Good: Batch operation
result, err := coll.UpsertBatch(ctx, documents,
    vectorstore.WithBatchSize(100),
    vectorstore.WithParallelism(4),
)

// Avoid: Individual operations
for _, doc := range documents {
    coll.Upsert(ctx, doc) // Inefficient
}
4. Error Handling

Validate before operations:

// Documents are validated automatically
result, err := coll.Upsert(ctx, doc)
if err != nil {
    return fmt.Errorf("upsert failed: %w", err)
}
5. Context and Timeouts

Always use timeouts:

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

results, err := coll.Query(ctx, query)
if err != nil {
    if errors.Is(err, context.DeadlineExceeded) {
        return fmt.Errorf("query timeout: %w", err)
    }
    return err
}
6. Resource Cleanup

Always close connections:

store, err := memory.New()
if err != nil {
    return err
}
defer store.Close() // Important!

Troubleshooting

Firestore Permission Denied
Error: rpc error: code = PermissionDenied

Solution: Grant proper IAM roles:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:SERVICE_ACCOUNT@PROJECT.iam.gserviceaccount.com" \
  --role="roles/datastore.user"
Firestore Index Not Ready
Error: index not found or not ready

Solution: Wait for index creation (5-10 minutes):

# Check index status
gcloud firestore indexes composite list

# Create index
gcloud firestore indexes composite create \
  --collection-group=embeddings \
  --query-scope=COLLECTION \
  --field-config=field-path=embedding,vector-config='{"dimension":"384","flat":{}}'
Query Returns No Results

Check these common issues:

  1. MinScore too high: Try lowering or removing it
  2. Wrong distance metric: Use cosine for most text embeddings
  3. Filters too restrictive: Test without filters first
  4. Empty database: Verify documents were stored
// Debug: Query without filters
results, err := coll.Query(ctx, &vectorstore.Query{
    Embedding: queryVec,
    Limit:     100,
    MinScore:  0.0, // Remove score threshold
})

Examples

Example 1: Semantic Caching
cache := store.Collection("cache",
    vectorstore.WithTTL(5*time.Minute),
    vectorstore.WithDeduplication(true),
)

// Store
doc := &vectorstore.Document{
    ID:        queryHash,
    Content:   vectorstore.NewTextContent(query),
    Embedding: vectorstore.NewEmbedding(queryEmbedding, "model"),
    Temporal:  vectorstore.NewTemporalWithTTL(5*time.Minute),
    Metadata:  map[string]any{"result": cachedResult},
}
cache.Upsert(ctx, doc)

// Lookup
query := &vectorstore.Query{
    Embedding: vectorstore.NewEmbedding(queryEmbedding, "model"),
    MinScore:  0.95, // High threshold
    Limit:     1,
}
results, _ := cache.Query(ctx, query)
if results.HasMatches() {
    // Cache hit
    result := results.TopMatch().Document.Metadata["result"]
}
Example 2: Hybrid Search with Filters
// Search for recent documentation in English
results, err := coll.Query(ctx, &vectorstore.Query{
    Embedding: queryEmbedding,
    Filters: vectorstore.And(
        vectorstore.Eq("doc_type", "documentation"),
        vectorstore.Eq("language", "en"),
        vectorstore.Eq("status", "published"),
        vectorstore.Not(vectorstore.Eq("archived", true)),
    ),
    Limit:    10,
    MinScore: 0.75,
})
Example 3: Streaming Large Results
iter, err := coll.QueryStream(ctx, &vectorstore.Query{
    Embedding: queryEmbedding,
    Limit:     10000,
})
if err != nil {
    log.Fatal(err)
}
defer iter.Close()

for iter.Next() {
    match := iter.Match()
    fmt.Printf("Score: %.4f, Doc: %s\n", match.Score, match.Document.ID)
}

if err := iter.Err(); err != nil {
    log.Fatal(err)
}
Example 4: Multi-Modal Search
media := store.Collection("media",
    vectorstore.WithDimensions(512), // CLIP dimensions
)

// Store image
doc := &vectorstore.Document{
    ID:        imageID,
    Content:   vectorstore.NewImageURL(imageURL),
    Embedding: vectorstore.NewEmbedding(clipEmbedding, "clip-vit-base-patch32"),
    Tags:      []string{"photo", "product"},
}
media.Upsert(ctx, doc)

// Query with image embedding
results, _ := media.Query(ctx, &vectorstore.Query{
    Embedding: queryImageEmbedding,
    Filters:   vectorstore.TagFilter("product"),
})

Performance Considerations

Memory Provider
  • Complexity: O(n) for search (brute-force)
  • Throughput: ~10,000 searches/sec (depends on dimensions)
  • Capacity: Up to 100K documents on typical hardware
Firestore Provider
  • Complexity: O(log n) with vector index
  • Throughput: ~1,000 searches/sec (network bound)
  • Capacity: Unlimited (serverless)
Optimization Tips
  1. Use batch operations for bulk inserts
  2. Set appropriate Limit (smaller = faster)
  3. Add filters to reduce search space
  4. Consider embedding dimensions (smaller = faster, less accurate)
  5. Use streaming for large result sets
  6. Disable content/embeddings in results if not needed

Contributing

To add a new vector store provider:

  1. Implement the VectorStore and Collection interfaces
  2. Add provider-specific constructors (e.g., yourprovider.New())
  3. Add tests and documentation
  4. Submit a pull request

See CONTRIBUTING.md for details.

Documentation

Overview

Example (AgentMemory)

Example_agentMemory demonstrates using collections for agent memory.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	// Create memory collection with scope requirements
	memory := store.Collection("agent-memory",
		vectorstore.WithScope("user", "session"),
		vectorstore.WithMaxDocuments(1000),
	)

	ctx := context.Background()

	// Store a memory
	memoryDoc := &vectorstore.Document{
		ID:      "memory-1",
		Content: vectorstore.NewTextContent("User prefers dark mode"),
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.5, 0.6, 0.7},
			"text-embedding-3-small",
		),
		Scope: vectorstore.NewScope("tenant1", "user123", "session456"),
		Tags:  []string{"preference", "ui"},
		Temporal: &vectorstore.Temporal{
			CreatedAt: time.Now(),
			UpdatedAt: time.Now(),
		},
	}

	_, err := memory.Upsert(ctx, memoryDoc)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Retrieve memories for a specific user
	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.5, 0.6, 0.7},
			"text-embedding-3-small",
		),
		Filters: vectorstore.And(
			vectorstore.UserFilter("user123"),
			vectorstore.TagFilter("preference"),
		),
		Limit: 5,
	}

	result, _ := memory.Query(ctx, query)
	for _, match := range result.Matches {
		fmt.Printf("Memory: %s (score: %.2f)\n",
			match.Document.Content.String(),
			match.Score,
		)
	}
}
Example (Basic)

Example demonstrates basic usage of the VectorStore interface.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	// NOTE: This is a documentation example showing the API.
	// It won't run without a real store implementation.

	// Create a vector store (implementation-specific)
	var store vectorstore.VectorStore // = memory.New() or firestore.New() etc.
	defer func() { _ = store.Close() }()

	// Create a collection for documents
	docs := store.Collection("documents")

	// Create a document
	doc := &vectorstore.Document{
		ID:      "doc1",
		Content: vectorstore.NewTextContent("The quick brown fox jumps over the lazy dog"),
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.1, 0.2, 0.3, 0.4, 0.5},
			"text-embedding-3-small",
		),
		Tags: []string{"example", "demo"},
	}

	// Insert the document
	ctx := context.Background()
	result, err := docs.Upsert(ctx, doc)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	fmt.Printf("Inserted: %d documents\n", result.Inserted)

	// Query for similar documents
	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.1, 0.2, 0.3, 0.4, 0.5},
			"text-embedding-3-small",
		),
		Limit: 10,
	}

	queryResult, _ := docs.Query(ctx, query)
	fmt.Printf("Found: %d matches\n", queryResult.Count())
}
Example (BatchOperations)

Example_batchOperations demonstrates batch upsert with progress tracking.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	docs := store.Collection("documents")
	ctx := context.Background()

	// Create many documents
	documents := make([]*vectorstore.Document, 1000)
	for i := range documents {
		documents[i] = &vectorstore.Document{
			ID:      fmt.Sprintf("doc-%d", i),
			Content: vectorstore.NewTextContent(fmt.Sprintf("Document %d", i)),
			Embedding: vectorstore.NewEmbedding(
				[]float32{float32(i) / 1000.0, 0.5, 0.5},
				"model",
			),
		}
	}

	// Batch insert with progress tracking
	result, err := docs.UpsertBatch(ctx, documents,
		vectorstore.WithBatchSize(100),
		vectorstore.WithParallelism(4),
		vectorstore.WithProgressCallback(func(processed, total int) {
			pct := float64(processed) / float64(total) * 100
			fmt.Printf("Progress: %d/%d (%.1f%%)\n", processed, total, pct)
		}),
	)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	fmt.Printf("Inserted: %d, Failed: %d\n", result.Inserted, result.Failed)
}
Example (ComplexFilters)

Example_complexFilters demonstrates complex filter queries.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	docs := store.Collection("products")
	ctx := context.Background()

	// Complex filter: recent, high-rated electronics in stock
	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.1, 0.2, 0.3},
			"model",
		),
		Filters: vectorstore.And(
			vectorstore.TagFilter("electronics"),
			vectorstore.Gte("rating", 4.5),
			vectorstore.Eq("in_stock", true),
			vectorstore.CreatedAfter(time.Now().Add(-30*24*time.Hour)),
			vectorstore.Or(
				vectorstore.Contains("category", "phone"),
				vectorstore.Contains("category", "laptop"),
			),
		),
		SortBy: []vectorstore.SortBy{
			vectorstore.SortByScore(),
			vectorstore.SortByField("rating", true),
		},
		Limit: 20,
	}

	result, _ := docs.Query(ctx, query)
	fmt.Printf("Found %d matching products\n", result.Count())
}
Example (ConversationHistory)

Example_conversationHistory demonstrates storing conversation history.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	// Create conversation collection
	conversations := store.Collection("conversations",
		vectorstore.WithScope("user", "thread"),
		vectorstore.WithMaxDocuments(100000),
	)

	ctx := context.Background()

	// Store a conversation turn
	now := time.Now()
	turn := &vectorstore.Document{
		ID:      "turn-1",
		Content: vectorstore.NewTextContent("User: What's the weather?\nAssistant: It's sunny today."),
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.2, 0.3, 0.4},
			"text-embedding-3-small",
		),
		Scope: &vectorstore.Scope{
			User:   "user123",
			Thread: "thread-abc",
		},
		Temporal: &vectorstore.Temporal{
			CreatedAt: now,
			EventTime: &now,
		},
		Tags: []string{"weather", "conversation"},
		Metadata: map[string]any{
			"turn_number": 1,
			"user_id":     "user123",
		},
	}

	_, err := conversations.Upsert(ctx, turn)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Retrieve recent conversation history
	query := &vectorstore.Query{
		Filters: vectorstore.And(
			vectorstore.UserFilter("user123"),
			vectorstore.Eq("thread", "thread-abc"),
			vectorstore.CreatedAfter(time.Now().Add(-24*time.Hour)),
		),
		SortBy: []vectorstore.SortBy{
			vectorstore.SortByCreatedAt(false), // Ascending (chronological)
		},
		Limit: 20,
	}

	result, _ := conversations.Query(ctx, query)
	fmt.Printf("Found %d conversation turns\n", result.Count())
}
Example (Deduplication)

Example_deduplication demonstrates content deduplication.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	// Create collection with aggressive deduplication
	docs := store.Collection("documents",
		vectorstore.WithDeduplicationThreshold(0.95),
	)

	ctx := context.Background()

	// Insert multiple similar documents
	documents := []*vectorstore.Document{
		{
			ID:        "doc1",
			Content:   vectorstore.NewTextContent("The quick brown fox"),
			Embedding: vectorstore.NewEmbedding([]float32{0.1, 0.2, 0.3}, "model"),
		},
		{
			ID:        "doc2",
			Content:   vectorstore.NewTextContent("The quick brown fox"), // Duplicate
			Embedding: vectorstore.NewEmbedding([]float32{0.1, 0.2, 0.3}, "model"),
		},
		{
			ID:        "doc3",
			Content:   vectorstore.NewTextContent("A different document"),
			Embedding: vectorstore.NewEmbedding([]float32{0.9, 0.8, 0.7}, "model"),
		},
	}

	result, err := docs.UpsertBatch(ctx, documents)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	fmt.Printf("Inserted: %d, Deduplicated: %d\n",
		result.Inserted,
		result.Deduplicated,
	)
}
Example (Multimodal)

Example_multimodal demonstrates multi-modal content.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	media := store.Collection("media",
		vectorstore.WithDimensions(512),
	)

	ctx := context.Background()

	// Store an image
	imageDoc := &vectorstore.Document{
		ID: "img1",
		Content: vectorstore.NewImageURL(
			"https://example.com/photo.jpg",
		),
		Embedding: vectorstore.NewEmbedding(
			make([]float32, 512), // CLIP embedding
			"clip-vit-base-patch32",
		),
		Tags: []string{"photo", "landscape"},
	}

	_, err := media.Upsert(ctx, imageDoc)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Query with image embedding
	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			make([]float32, 512),
			"clip-vit-base-patch32",
		),
		Filters: vectorstore.TagFilter("photo"),
		Limit:   10,
	}

	result, _ := media.Query(ctx, query)
	fmt.Printf("Found %d similar images\n", result.Count())
}
Example (Pagination)

Example_pagination demonstrates paginated queries.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	docs := store.Collection("documents")
	ctx := context.Background()

	pageSize := 20
	page := 0

	for {
		query := &vectorstore.Query{
			Embedding: vectorstore.NewEmbedding(
				[]float32{0.1, 0.2, 0.3},
				"model",
			),
			Limit:  pageSize,
			Offset: page * pageSize,
		}

		result, _ := docs.Query(ctx, query)

		fmt.Printf("Page %d: %d results\n", page+1, result.Count())

		// Process results...
		for _, match := range result.Matches {
			_ = match // Process match
		}

		// Check if there are more pages
		if !result.HasMore() {
			break
		}

		page++
	}
}
Example (SemanticCache)

Example_semanticCache demonstrates using collections for semantic caching.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	// Create a cache collection with TTL and deduplication
	cache := store.Collection("cache",
		vectorstore.WithTTL(5*time.Minute),
		vectorstore.WithDeduplication(true),
		vectorstore.WithMaxDocuments(10000),
	)

	ctx := context.Background()

	// Cache a query result
	cacheDoc := &vectorstore.Document{
		ID:      "query-hash-123",
		Content: vectorstore.NewTextContent("What is the capital of France?"),
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.1, 0.2, 0.3},
			"text-embedding-3-small",
		),
		Temporal: vectorstore.NewTemporalWithTTL(5 * time.Minute),
		Tags:     []string{"qa", "geography"},
		Metadata: map[string]any{
			"answer": "Paris",
			"cached": time.Now(),
		},
	}

	_, err := cache.Upsert(ctx, cacheDoc)
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}

	// Lookup cached result by similarity
	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.11, 0.19, 0.31},
			"text-embedding-3-small",
		),
		Limit:    1,
		MinScore: 0.95, // High threshold for cache hits
	}

	result, _ := cache.Query(ctx, query)
	if result.HasMatches() {
		answer := result.TopMatch().Document.Metadata["answer"]
		fmt.Printf("Cache hit! Answer: %s\n", answer)
	}
}
Example (Streaming)

Example_streaming demonstrates streaming query results.

package main

import (
	"context"
	"fmt"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	docs := store.Collection("documents")
	ctx := context.Background()

	query := &vectorstore.Query{
		Embedding: vectorstore.NewEmbedding(
			[]float32{0.1, 0.2, 0.3},
			"model",
		),
		Limit: 1000, // Large result set
	}

	// Stream results
	iter, _ := docs.QueryStream(ctx, query)
	defer func() { _ = iter.Close() }()

	count := 0
	for iter.Next() {
		match := iter.Match()
		if match.Score >= 0.8 {
			count++
			fmt.Printf("High score match: %s\n", match.Document.ID)
		}
	}

	if err := iter.Err(); err != nil {
		fmt.Printf("Error: %v\n", err)
	}

	fmt.Printf("Found %d high-score matches\n", count)
}
Example (TimeBasedQueries)

Example_timeBasedQueries demonstrates temporal queries.

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/aixgo-dev/aixgo/pkg/vectorstore"
)

func main() {
	var store vectorstore.VectorStore
	defer func() { _ = store.Close() }()

	events := store.Collection("events")
	ctx := context.Background()

	// Query for events in the last week that haven't expired
	query := &vectorstore.Query{
		Filters: vectorstore.And(
			vectorstore.CreatedAfter(time.Now().Add(-7*24*time.Hour)),
			vectorstore.NotExpired(),
			vectorstore.TagsFilter("important", "scheduled"),
		),
		SortBy: []vectorstore.SortBy{
			vectorstore.SortByCreatedAt(true), // Most recent first
		},
		Limit: 50,
	}

	result, _ := events.Query(ctx, query)
	for _, match := range result.Matches {
		fmt.Printf("Event: %s at %s\n",
			match.Document.Content.String(),
			match.Document.Temporal.CreatedAt,
		)
	}
}

Functions

func ForEach

func ForEach(iter ResultIterator, fn func(*Match) error) error

ForEach applies a function to each result in an iterator. The iterator is closed after iteration.

Example:

iter, err := coll.QueryStream(ctx, query)
if err != nil {
    return err
}
err = vectorstore.ForEach(iter, func(match *Match) error {
    fmt.Printf("Score: %.4f, ID: %s\n", match.Score, match.Document.ID)
    return nil
})

func GetTagFilter

func GetTagFilter(f Filter) (tag string, ok bool)

GetTagFilter extracts the tag from a tag filter. Returns empty string if not a tag filter.

func GetTimeFilter

func GetTimeFilter(f Filter) (field TimeField, op FilterOperator, value time.Time, ok bool)

GetTimeFilter extracts field, operator, and value from a time filter. Returns zero values if not a time filter.

func IsAndFilter

func IsAndFilter(f Filter) bool

IsAndFilter checks if the filter is an AND composite.

func IsNotFilter

func IsNotFilter(f Filter) bool

IsNotFilter checks if the filter is a NOT filter.

func IsOrFilter

func IsOrFilter(f Filter) bool

IsOrFilter checks if the filter is an OR composite.

func Validate

func Validate(doc *Document) error

Validate validates a document before storage. This performs comprehensive validation including:

  • ID format and length
  • Content presence and validity
  • Embedding dimensions and values
  • Metadata key safety
  • Temporal constraints

func ValidateContent

func ValidateContent(c *Content) error

ValidateContent validates document content.

func ValidateEmbedding

func ValidateEmbedding(e *Embedding) error

ValidateEmbedding validates an embedding vector.

func ValidateID

func ValidateID(id string) error

ValidateID validates a document ID.

func ValidateMetadataKey

func ValidateMetadataKey(key string) error

ValidateMetadataKey validates a metadata key.

func ValidateScope

func ValidateScope(s *Scope) error

ValidateScope validates scope information.

func ValidateTag

func ValidateTag(tag string) error

ValidateTag validates a tag.

func ValidateTemporal

func ValidateTemporal(t *Temporal) error

ValidateTemporal validates temporal information.

Types

type BatchConfig

type BatchConfig struct {
	// BatchSize is the number of documents to process per batch.
	// Default: 100
	BatchSize int

	// Parallelism is the number of concurrent batches.
	// Default: 1 (sequential)
	Parallelism int

	// ContinueOnError controls whether to continue on individual document errors.
	// If false, the entire batch fails on first error.
	// If true, failed documents are collected in UpsertResult.Errors.
	// Default: true
	ContinueOnError bool

	// ProgressCallback is called after each batch completes.
	// Receives (processed, total) counts.
	ProgressCallback func(processed, total int)

	// ValidateBeforeBatch validates all documents before starting batch.
	// If true, invalid documents cause immediate failure.
	// If false, validation happens per-batch.
	// Default: false
	ValidateBeforeBatch bool

	// RetryOnError enables retry for failed batches.
	// Default: false
	RetryOnError bool

	// MaxRetries is the maximum number of retries per batch.
	// Only meaningful if RetryOnError is true.
	// Default: 3
	MaxRetries int

	// RetryDelay is the delay between retries.
	// Default: 1 second
	RetryDelay time.Duration
}

BatchConfig contains configuration for batch operations.

func ApplyBatchOptions

func ApplyBatchOptions(opts []BatchOption) *BatchConfig

ApplyBatchOptions applies a list of batch options to a config. This is used internally by collection implementations.

func (*BatchConfig) Validate

func (c *BatchConfig) Validate() error

Validate validates a BatchConfig.

type BatchOption

type BatchOption func(*BatchConfig)

BatchOption configures batch operations.

func WithBatchSize

func WithBatchSize(size int) BatchOption

WithBatchSize sets the batch size.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithBatchSize(100),
)

func WithContinueOnError

func WithContinueOnError(enabled bool) BatchOption

WithContinueOnError sets whether to continue on errors.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithContinueOnError(false), // Fail fast
)

func WithMaxRetries

func WithMaxRetries(max int) BatchOption

WithMaxRetries sets the maximum number of retries. Also enables retry.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithMaxRetries(5),
)

func WithParallelism

func WithParallelism(n int) BatchOption

WithParallelism sets the number of concurrent batches.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithBatchSize(100),
    WithParallelism(4),
)

func WithProgressCallback

func WithProgressCallback(callback func(processed, total int)) BatchOption

WithProgressCallback sets a progress callback.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithProgressCallback(func(processed, total int) {
        log.Printf("Progress: %d/%d (%.1f%%)",
            processed, total,
            float64(processed)/float64(total)*100)
    }),
)

func WithRetry

func WithRetry(enabled bool) BatchOption

WithRetry enables retry on batch failures.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithRetry(true),
    WithMaxRetries(5),
    WithRetryDelay(2*time.Second),
)

func WithRetryDelay

func WithRetryDelay(delay time.Duration) BatchOption

WithRetryDelay sets the delay between retries. Also enables retry.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithRetryDelay(2*time.Second),
)

func WithValidation

func WithValidation(enabled bool) BatchOption

WithValidation enables pre-batch validation.

Example:

result, err := coll.UpsertBatch(ctx, docs,
    WithValidation(true),
)

type Collection

type Collection interface {
	// Name returns the collection name.
	Name() string

	// Upsert inserts or updates documents in the collection.
	// If a document with the same ID exists, it is updated. Otherwise, a new document is created.
	//
	// Documents are validated before insertion. Invalid documents cause the entire
	// operation to fail (all-or-nothing semantics).
	//
	// For large batches, consider using UpsertBatch for better performance.
	//
	// Example:
	//
	//	doc := &Document{
	//	    ID: "doc1",
	//	    Content: NewTextContent("Hello world"),
	//	    Embedding: NewEmbedding([]float32{0.1, 0.2, 0.3}, "text-embedding-3-small"),
	//	    Tags: []string{"greeting", "english"},
	//	}
	//	result, err := coll.Upsert(ctx, doc)
	Upsert(ctx context.Context, documents ...*Document) (*UpsertResult, error)

	// UpsertBatch performs batch upsert with progress tracking and error handling.
	// This is optimized for inserting large numbers of documents.
	//
	// Options can control batch size, parallelism, and progress callbacks.
	//
	// Example:
	//
	//	result, err := coll.UpsertBatch(ctx, documents,
	//	    WithBatchSize(100),
	//	    WithProgressCallback(func(processed, total int) {
	//	        log.Printf("Progress: %d/%d", processed, total)
	//	    }),
	//	)
	UpsertBatch(ctx context.Context, documents []*Document, opts ...BatchOption) (*UpsertResult, error)

	// Query performs similarity search and returns matching documents.
	// The query can include vector similarity, metadata filters, temporal constraints,
	// scope filters, and more.
	//
	// Example:
	//
	//	results, err := coll.Query(ctx, &Query{
	//	    Embedding: queryEmbedding,
	//	    Limit: 10,
	//	    Filters: And(
	//	        TagFilter("product"),
	//	        ScoreAbove(0.8),
	//	    ),
	//	})
	Query(ctx context.Context, query *Query) (*QueryResult, error)

	// QueryStream performs similarity search and streams results via an iterator.
	// This is useful for processing large result sets without loading everything into memory.
	//
	// The iterator must be closed when done to release resources.
	//
	// Example:
	//
	//	iter, err := coll.QueryStream(ctx, query)
	//	if err != nil {
	//	    log.Fatal(err)
	//	}
	//	defer iter.Close()
	//
	//	for iter.Next() {
	//	    match := iter.Match()
	//	    fmt.Printf("Score: %.4f, Content: %s\n", match.Score, match.Document.Content)
	//	}
	//	if err := iter.Err(); err != nil {
	//	    log.Fatal(err)
	//	}
	QueryStream(ctx context.Context, query *Query) (ResultIterator, error)

	// Get retrieves documents by their IDs.
	// Documents that don't exist are omitted from the result (no error is returned).
	//
	// The order of returned documents may not match the order of requested IDs.
	//
	// Example:
	//
	//	docs, err := coll.Get(ctx, "doc1", "doc2", "doc3")
	Get(ctx context.Context, ids ...string) ([]*Document, error)

	// Delete removes documents by their IDs.
	// IDs that don't exist are silently ignored (no error is returned).
	//
	// Example:
	//
	//	result, err := coll.Delete(ctx, "doc1", "doc2")
	//	fmt.Printf("Deleted %d documents\n", result.Deleted)
	Delete(ctx context.Context, ids ...string) (*DeleteResult, error)

	// DeleteByFilter removes all documents matching the filter.
	// This is useful for bulk deletion based on criteria.
	//
	// WARNING: This can delete many documents. Use with caution.
	//
	// Example:
	//
	//	// Delete all expired documents
	//	result, err := coll.DeleteByFilter(ctx,
	//	    Expired(),
	//	)
	DeleteByFilter(ctx context.Context, filter Filter) (*DeleteResult, error)

	// Count returns the number of documents in the collection.
	// If a filter is provided, only documents matching the filter are counted.
	//
	// Example:
	//
	//	total, err := coll.Count(ctx, nil) // All documents
	//	active, err := coll.Count(ctx, Eq("status", "active"))
	Count(ctx context.Context, filter Filter) (int64, error)

	// Stats returns statistics about the collection.
	// This includes document count, storage size, index info, etc.
	Stats(ctx context.Context) (*CollectionStats, error)

	// Clear removes all documents from the collection.
	// This is primarily useful for testing.
	//
	// WARNING: This permanently deletes all data in the collection.
	Clear(ctx context.Context) error
}

Collection represents an isolated namespace for documents and embeddings. Collections enable use-case specific configurations like TTL, deduplication, scoping, and indexing strategies.

All Collection methods are safe for concurrent use.

type CollectionConfig

type CollectionConfig struct {
	// TTL specifies the time-to-live for documents in this collection.
	// Documents are automatically deleted after TTL expires.
	// Zero means no TTL (documents never expire).
	TTL time.Duration

	// EnableDeduplication enables content-based deduplication.
	// When enabled, documents with identical embeddings (or content hashes)
	// are deduplicated automatically.
	EnableDeduplication bool

	// DeduplicationThreshold is the similarity threshold for deduplication (0.0-1.0).
	// Documents with similarity >= threshold are considered duplicates.
	// Default: 0.99 (nearly identical)
	DeduplicationThreshold float32

	// IndexType specifies the vector index type.
	// Examples: "flat", "hnsw", "ivf"
	IndexType IndexType

	// EmbeddingDimensions is the expected dimensionality of embeddings.
	// If set, documents with different dimensions will be rejected.
	// Zero means no dimension validation.
	EmbeddingDimensions int

	// AutoGenerateEmbeddings enables automatic embedding generation.
	// When enabled, documents without embeddings will have them generated
	// using the specified embedding function.
	AutoGenerateEmbeddings bool

	// EmbeddingFunction is the function to generate embeddings.
	// Only used if AutoGenerateEmbeddings is true.
	EmbeddingFunction EmbeddingFunction

	// ScopeRequired specifies required scope fields.
	// Documents without these scope fields will be rejected.
	ScopeRequired []string

	// MaxDocuments limits the number of documents in the collection.
	// Oldest documents are removed when limit is exceeded (FIFO).
	// Zero means no limit.
	MaxDocuments int64

	// EnableVersioning enables document versioning.
	// Previous versions are retained and can be queried.
	EnableVersioning bool

	// MaxVersions limits the number of versions per document.
	// Only meaningful if EnableVersioning is true.
	// Zero means unlimited versions.
	MaxVersions int

	// EnableAuditLog enables audit logging for all operations.
	EnableAuditLog bool

	// Metadata contains additional provider-specific configuration.
	Metadata map[string]any
}

CollectionConfig contains the configuration for a collection. This is built from CollectionOption functions.

func ApplyOptions

func ApplyOptions(opts []CollectionOption) *CollectionConfig

ApplyOptions applies a list of options to a config. This is used internally by collection implementations.

func (*CollectionConfig) Validate

func (c *CollectionConfig) Validate() error

Validate validates a CollectionConfig.

type CollectionOption

type CollectionOption func(*CollectionConfig)

CollectionOption configures a Collection. Options are applied when creating or accessing a collection.

func WithAuditLog

func WithAuditLog(enabled bool) CollectionOption

WithAuditLog enables audit logging for the collection.

Example:

docs := store.Collection("docs",
    WithAuditLog(true),
)

func WithAutoEmbeddings

func WithAutoEmbeddings(fn EmbeddingFunction) CollectionOption

WithAutoEmbeddings enables automatic embedding generation.

Example:

docs := store.Collection("docs",
    WithAutoEmbeddings(myEmbeddingFunc),
)

func WithDeduplication

func WithDeduplication(enabled bool) CollectionOption

WithDeduplication enables content-based deduplication. Documents with similarity >= 0.99 are considered duplicates.

Example:

docs := store.Collection("docs", WithDeduplication(true))

func WithDeduplicationThreshold

func WithDeduplicationThreshold(threshold float32) CollectionOption

WithDeduplicationThreshold sets the similarity threshold for deduplication. Also enables deduplication.

Example:

docs := store.Collection("docs", WithDeduplicationThreshold(0.95))

func WithDimensions

func WithDimensions(dimensions int) CollectionOption

WithDimensions sets the expected embedding dimensions. Documents with different dimensions will be rejected.

Example:

docs := store.Collection("docs", WithDimensions(768))

func WithIndexing

func WithIndexing(indexType IndexType) CollectionOption

WithIndexing sets the vector index type.

Example:

docs := store.Collection("docs", WithIndexing(IndexTypeHNSW))

func WithMaxDocuments

func WithMaxDocuments(max int64) CollectionOption

WithMaxDocuments limits the collection size. Oldest documents are removed when limit is exceeded.

Example:

cache := store.Collection("cache",
    WithMaxDocuments(10000),
)

func WithMaxVersions

func WithMaxVersions(max int) CollectionOption

WithMaxVersions limits the number of versions per document. Also enables versioning.

Example:

docs := store.Collection("docs",
    WithMaxVersions(5),
)

func WithMetadata

func WithMetadata(metadata map[string]any) CollectionOption

WithMetadata adds custom metadata to the collection config.

Example:

docs := store.Collection("docs",
    WithMetadata(map[string]any{
        "shard_count": 4,
        "replication": 3,
    }),
)

func WithScope

func WithScope(fields ...string) CollectionOption

WithScope specifies required scope fields. Documents without these fields will be rejected.

Example:

memory := store.Collection("memory",
    WithScope("user", "session"),
)

func WithTTL

func WithTTL(ttl time.Duration) CollectionOption

WithTTL sets the time-to-live for documents in the collection.

Example:

cache := store.Collection("cache", WithTTL(5*time.Minute))

func WithVersioning

func WithVersioning(enabled bool) CollectionOption

WithVersioning enables document versioning.

Example:

docs := store.Collection("docs",
    WithVersioning(true),
)

type CollectionStats

type CollectionStats struct {
	// Name is the collection name
	Name string

	// Documents is the number of documents in the collection
	Documents int64

	// StorageBytes is the storage used by this collection in bytes
	StorageBytes int64

	// EmbeddingDimensions is the dimensionality of embeddings in this collection
	EmbeddingDimensions int

	// IndexType is the index type used (e.g., "flat", "hnsw", "ivf")
	IndexType string

	// CreatedAt is when the collection was created
	CreatedAt *TimestampValue

	// UpdatedAt is when the collection was last modified
	UpdatedAt *TimestampValue

	// Extra contains provider-specific statistics
	Extra map[string]any
}

CollectionStats contains statistics about a specific collection.

type Content

type Content struct {
	// Type indicates the content type
	Type ContentType

	// Text is the text content (for ContentTypeText)
	Text string

	// Data is the binary data (for images, audio, video)
	// Stored as base64-encoded string for JSON serialization
	Data []byte

	// MimeType is the MIME type of the content (e.g., "image/jpeg", "audio/mp3")
	MimeType string

	// URL is an optional external URL for the content
	// Useful for referencing large media without storing it inline
	URL string

	// Chunks contains text chunks for long documents
	// Useful for document splitting and retrieval
	Chunks []string
}

Content represents multi-modal document content. A document can contain text, images, audio, or video.

func NewAudioContent

func NewAudioContent(data []byte, mimeType string) *Content

NewAudioContent creates a Content with audio data.

func NewImageContent

func NewImageContent(data []byte, mimeType string) *Content

NewImageContent creates a Content with image data.

func NewImageURL

func NewImageURL(url string) *Content

NewImageURL creates a Content with an image URL reference.

func NewTextContent

func NewTextContent(text string) *Content

NewTextContent creates a Content with text.

func NewVideoContent

func NewVideoContent(data []byte, mimeType string) *Content

NewVideoContent creates a Content with video data.

func (*Content) DataBase64

func (c *Content) DataBase64() string

DataBase64 returns the binary data as a base64-encoded string. This is useful for JSON serialization.

func (*Content) String

func (c *Content) String() string

String returns a string representation of the content. For text, returns the text. For binary, returns a summary.

type ContentType

type ContentType string

ContentType represents the type of content in a document.

const (
	// ContentTypeText represents text content
	ContentTypeText ContentType = "text"

	// ContentTypeImage represents image content (JPEG, PNG, WebP, etc.)
	ContentTypeImage ContentType = "image"

	// ContentTypeAudio represents audio content (MP3, WAV, etc.)
	ContentTypeAudio ContentType = "audio"

	// ContentTypeVideo represents video content (MP4, WebM, etc.)
	ContentTypeVideo ContentType = "video"

	// ContentTypeMultimodal represents mixed content types
	ContentTypeMultimodal ContentType = "multimodal"
)

type DeleteResult

type DeleteResult struct {
	// Deleted is the number of documents actually deleted
	Deleted int64

	// NotFound is the number of IDs that were not found
	NotFound int64

	// NotFoundIDs contains the IDs that were not found
	NotFoundIDs []string

	// Timing contains operation timing information
	Timing *OperationTiming
}

DeleteResult contains the results of a delete operation.

func (*DeleteResult) Success

func (r *DeleteResult) Success() bool

Success returns true if at least one document was deleted.

type DistanceMetric

type DistanceMetric string

DistanceMetric represents the method for calculating vector similarity.

const (
	// DistanceMetricCosine calculates cosine similarity (default)
	// Range: -1 (opposite) to 1 (identical)
	// Best for: Most text embeddings (normalized vectors)
	DistanceMetricCosine DistanceMetric = "cosine"

	// DistanceMetricEuclidean calculates Euclidean (L2) distance
	// Range: 0 (identical) to infinity (different)
	// Best for: When magnitude matters
	DistanceMetricEuclidean DistanceMetric = "euclidean"

	// DistanceMetricDotProduct calculates dot product similarity
	// Range: -infinity to +infinity
	// Best for: Normalized vectors, faster than cosine
	DistanceMetricDotProduct DistanceMetric = "dot_product"

	// DistanceMetricManhattan calculates Manhattan (L1) distance
	// Range: 0 (identical) to infinity (different)
	// Best for: High-dimensional sparse vectors
	DistanceMetricManhattan DistanceMetric = "manhattan"

	// DistanceMetricHamming calculates Hamming distance (for binary vectors)
	// Range: 0 (identical) to vector_length (completely different)
	// Best for: Binary embeddings
	DistanceMetricHamming DistanceMetric = "hamming"
)

type Document

type Document struct {
	// ID is the unique identifier for the document.
	// Must be unique within a collection.
	// IDs should be URL-safe strings (alphanumeric, hyphens, underscores).
	ID string

	// Content is the multi-modal content of the document.
	// Supports text, images, audio, video.
	Content *Content

	// Embedding is the vector representation of the content.
	// Can be nil if embeddings are generated server-side.
	Embedding *Embedding

	// Scope defines hierarchical context for the document.
	// Useful for multi-tenancy, user isolation, session tracking.
	// Example: {Tenant: "acme", User: "user123", Session: "sess456"}
	Scope *Scope

	// Temporal contains time-related information for the document.
	// Useful for TTL, time-based queries, event ordering.
	Temporal *Temporal

	// Tags are indexed labels for efficient filtering.
	// Unlike metadata, tags are optimized for equality queries.
	// Examples: ["product", "electronics", "featured"]
	Tags []string

	// Metadata contains additional free-form information.
	// Use this for data that doesn't fit into typed fields.
	// Keys should be alphanumeric with underscores (no special chars).
	Metadata map[string]any

	// Score is the similarity score (populated during queries).
	// Not stored, only returned in query results.
	Score float32 `json:"-"`

	// Distance is the raw distance metric (populated during queries).
	// Not stored, only returned in query results.
	Distance float32 `json:"-"`
}

Document represents a document with embeddings, content, and metadata. Documents are the primary unit of storage in a vector database.

Unlike the previous version, which had only a string Content field, this Document supports:

  • Multi-modal content (text, image, audio, video)
  • Typed scope fields for multi-tenancy
  • Temporal information for time-based queries
  • Tags for efficient filtering
  • Structured metadata separate from free-form data

type Embedding

type Embedding struct {
	// Vector is the embedding vector
	Vector []float32

	// Model is the name of the model that generated this embedding
	// Examples: "text-embedding-3-small", "clip-vit-base-patch32"
	Model string

	// Dimensions is the dimensionality of the vector
	// Automatically set from len(Vector)
	Dimensions int

	// Normalized indicates whether the vector is normalized (unit length)
	// Many distance metrics work better with normalized vectors
	Normalized bool
}

Embedding represents a vector embedding with metadata.

func NewEmbedding

func NewEmbedding(vector []float32, model string) *Embedding

NewEmbedding creates a new Embedding from a vector and model name.

func NewNormalizedEmbedding

func NewNormalizedEmbedding(vector []float32, model string) *Embedding

NewNormalizedEmbedding creates a normalized embedding (unit length). The vector is normalized in-place.

func (*Embedding) Normalize

func (e *Embedding) Normalize()

Normalize normalizes the embedding vector to unit length (in-place).

type EmbeddingFunction

type EmbeddingFunction func(content *Content) (*Embedding, error)

EmbeddingFunction generates embeddings for content.

type ExplainStep

type ExplainStep struct {
	// Name is the step name
	Name string

	// Duration is how long this step took
	Duration time.Duration

	// Details contains step-specific information
	Details map[string]any
}

ExplainStep represents a single step in query execution.

type Filter

type Filter interface {
	// contains filtered or unexported methods
}

Filter represents a condition for filtering documents. Filters can be combined using And(), Or(), Not().
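The combinators documented below follow standard predicate semantics: And requires every condition to match, Or requires at least one, and Not inverts. A self-contained sketch of those semantics over plain predicate functions (illustrative only; the package's Filter type is opaque and is not defined this way):

```go
package main

import "fmt"

// Pred is a stand-in for Filter: a predicate over a document's tag set.
type Pred func(tags map[string]bool) bool

func HasTag(tag string) Pred {
	return func(tags map[string]bool) bool { return tags[tag] }
}

// And matches only when every predicate matches.
func And(ps ...Pred) Pred {
	return func(t map[string]bool) bool {
		for _, p := range ps {
			if !p(t) {
				return false
			}
		}
		return true
	}
}

// Or matches when at least one predicate matches.
func Or(ps ...Pred) Pred {
	return func(t map[string]bool) bool {
		for _, p := range ps {
			if p(t) {
				return true
			}
		}
		return false
	}
}

// Not inverts a predicate.
func Not(p Pred) Pred {
	return func(t map[string]bool) bool { return !p(t) }
}

func main() {
	doc := map[string]bool{"product": true, "featured": true}
	f := And(HasTag("product"), Not(HasTag("archived")))
	fmt.Println(f(doc)) // tagged "product" and not "archived"
}
```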

func And

func And(filters ...Filter) Filter

And combines multiple filters with AND logic (all must match).

func AnyTagFilter

func AnyTagFilter(tags ...string) Filter

AnyTagFilter creates a filter that matches documents with any of the specified tags.

func Contains

func Contains(field string, substring string) Filter

Contains creates a substring filter.

func CreatedAfter

func CreatedAfter(t time.Time) Filter

CreatedAfter filters documents created after a time.

func CreatedBefore

func CreatedBefore(t time.Time) Filter

CreatedBefore filters documents created before a time.

func EndsWith

func EndsWith(field string, suffix string) Filter

EndsWith creates a suffix filter.

func Eq

func Eq(field string, value any) Filter

Eq creates an equality filter.

func Exists

func Exists(field string) Filter

Exists creates an existence filter.

func Expired

func Expired() Filter

Expired filters documents that have expired.

func ExpiresAfter

func ExpiresAfter(t time.Time) Filter

ExpiresAfter filters documents that expire after a time.

func ExpiresBefore

func ExpiresBefore(t time.Time) Filter

ExpiresBefore filters documents that expire before a time.

func FieldFilter

func FieldFilter(field string, operator FilterOperator, value any) Filter

FieldFilter creates a filter on a metadata field.

func GetFilters

func GetFilters(f Filter) []Filter

GetFilters returns all filters in the composite filter. This is used internally by providers to decompose complex filters.

func GetNotFilter

func GetNotFilter(f Filter) (inner Filter, ok bool)

GetNotFilter extracts the inner filter from a NOT filter.

func Gt

func Gt(field string, value any) Filter

Gt creates a greater-than filter.

func Gte

func Gte(field string, value any) Filter

Gte creates a greater-than-or-equal filter.

func In

func In(field string, values ...any) Filter

In creates an in-set filter.

func Lt

func Lt(field string, value any) Filter

Lt creates a less-than filter.

func Lte

func Lte(field string, value any) Filter

Lte creates a less-than-or-equal filter.

func Ne

func Ne(field string, value any) Filter

Ne creates an inequality filter.

func Not

func Not(filter Filter) Filter

Not negates a filter.

func NotExists

func NotExists(field string) Filter

NotExists creates a non-existence filter.

func NotExpired

func NotExpired() Filter

NotExpired filters documents that have not expired yet.

func NotIn

func NotIn(field string, values ...any) Filter

NotIn creates a not-in-set filter.

func Or

func Or(filters ...Filter) Filter

Or combines multiple filters with OR logic (at least one must match).

func ScopeFilter

func ScopeFilter(scope *Scope) Filter

ScopeFilter creates a filter based on scope. Documents must match all non-empty scope fields.

func ScoreAbove

func ScoreAbove(threshold float32) Filter

ScoreAbove filters results with score > threshold.

func ScoreAtLeast

func ScoreAtLeast(threshold float32) Filter

ScoreAtLeast filters results with score >= threshold.

func ScoreBelow

func ScoreBelow(threshold float32) Filter

ScoreBelow filters results with score < threshold.

func ScoreFilter

func ScoreFilter(operator FilterOperator, value float32) Filter

ScoreFilter creates a filter based on similarity score. Only applies during vector similarity queries.

func SessionFilter

func SessionFilter(session string) Filter

SessionFilter creates a filter for a specific session.

func StartsWith

func StartsWith(field string, prefix string) Filter

StartsWith creates a prefix filter.

func TagFilter

func TagFilter(tag string) Filter

TagFilter creates a filter that matches documents with a specific tag.

func TagsFilter

func TagsFilter(tags ...string) Filter

TagsFilter creates a filter that matches documents with all specified tags.

func TenantFilter

func TenantFilter(tenant string) Filter

TenantFilter creates a filter for a specific tenant.

func TimeFilter

func TimeFilter(field TimeField, operator FilterOperator, value time.Time) Filter

TimeFilter creates a time-based filter.

func UpdatedAfter

func UpdatedAfter(t time.Time) Filter

UpdatedAfter filters documents updated after a time.

func UpdatedBefore

func UpdatedBefore(t time.Time) Filter

UpdatedBefore filters documents updated before a time.

func UserFilter

func UserFilter(user string) Filter

UserFilter creates a filter for a specific user.

type FilterOperator

type FilterOperator string

FilterOperator represents a comparison operator.

const (
	// OpEqual checks for equality (==)
	OpEqual FilterOperator = "eq"

	// OpNotEqual checks for inequality (!=)
	OpNotEqual FilterOperator = "ne"

	// OpGreaterThan checks if field > value
	OpGreaterThan FilterOperator = "gt"

	// OpGreaterThanOrEqual checks if field >= value
	OpGreaterThanOrEqual FilterOperator = "gte"

	// OpLessThan checks if field < value
	OpLessThan FilterOperator = "lt"

	// OpLessThanOrEqual checks if field <= value
	OpLessThanOrEqual FilterOperator = "lte"

	// OpIn checks if field is in a set of values
	OpIn FilterOperator = "in"

	// OpNotIn checks if field is not in a set of values
	OpNotIn FilterOperator = "nin"

	// OpContains checks if a string field contains a substring
	OpContains FilterOperator = "contains"

	// OpStartsWith checks if a string field starts with a prefix
	OpStartsWith FilterOperator = "startswith"

	// OpEndsWith checks if a string field ends with a suffix
	OpEndsWith FilterOperator = "endswith"

	// OpExists checks if a field exists (value is ignored)
	OpExists FilterOperator = "exists"

	// OpNotExists checks if a field does not exist (value is ignored)
	OpNotExists FilterOperator = "notexists"
)

func GetFieldFilter

func GetFieldFilter(f Filter) (field string, op FilterOperator, value any, ok bool)

GetFieldFilter extracts the field, operator, and value from a field filter. Returns zero values and ok=false if f is not a field filter.

func GetScoreFilter

func GetScoreFilter(f Filter) (op FilterOperator, value float32, ok bool)

GetScoreFilter extracts the operator and value from a score filter. Returns zero values and ok=false if f is not a score filter.

type IndexType

type IndexType string

IndexType represents the type of vector index.

const (
	// IndexTypeFlat performs brute-force (exact) search.
	// Best for: Small collections (<10K documents), maximum accuracy.
	IndexTypeFlat IndexType = "flat"

	// IndexTypeHNSW uses Hierarchical Navigable Small World graph.
	// Best for: Large collections, good balance of speed and accuracy.
	IndexTypeHNSW IndexType = "hnsw"

	// IndexTypeIVF uses an Inverted File (IVF) index, typically combined with Product Quantization.
	// Best for: Very large collections, faster but less accurate.
	IndexTypeIVF IndexType = "ivf"

	// IndexTypeAuto lets the provider choose based on collection size.
	IndexTypeAuto IndexType = "auto"
)

type Match

type Match struct {
	// Document is the matched document
	Document *Document

	// Score is the similarity score (higher is more similar)
	// For cosine similarity: -1 (opposite) to 1 (identical)
	// For euclidean: normalized to 0-1 range
	Score float32

	// Distance is the raw distance metric (optional)
	// The interpretation depends on the distance metric used
	Distance float32

	// Rank is the result rank (1-based)
	// Useful for hybrid ranking scenarios
	Rank int
}

Match represents a single search result with similarity score.

func CollectAll

func CollectAll(iter ResultIterator) ([]*Match, error)

CollectAll collects all results from an iterator into a slice. The iterator is closed after collection.

This is a convenience function for cases where you want to materialize all results. Use with caution on large result sets.

Example:

iter, err := coll.QueryStream(ctx, query)
if err != nil {
    return err
}
matches, err := vectorstore.CollectAll(iter)
if err != nil {
    return err
}

func CollectN

func CollectN(iter ResultIterator, n int) ([]*Match, error)

CollectN collects up to N results from an iterator. The iterator is NOT closed (caller should close it).

Example:

iter, err := coll.QueryStream(ctx, query)
if err != nil {
    return err
}
defer iter.Close()

// Get first 10 results
matches, err := vectorstore.CollectN(iter, 10)

type OperationTiming

type OperationTiming struct {
	// Total is the total operation time
	Total time.Duration

	// Validation is the time spent validating documents
	Validation time.Duration

	// Storage is the time spent writing to storage
	Storage time.Duration

	// IndexUpdate is the time spent updating indexes
	IndexUpdate time.Duration
}

OperationTiming contains timing information for CRUD operations.

type Query

type Query struct {
	// Embedding is the query vector for similarity search.
	// If nil, performs a pure metadata/filter query without vector similarity.
	Embedding *Embedding

	// Filters specifies conditions that documents must match.
	// Can be combined using And(), Or(), Not() for complex queries.
	Filters Filter

	// Limit is the maximum number of results to return.
	// Default: 10. Maximum: 10000.
	Limit int

	// Offset is the number of results to skip (for pagination).
	// Default: 0.
	Offset int

	// MinScore is the minimum similarity score (0.0-1.0).
	// Documents with lower scores are excluded.
	// Default: 0 (no minimum).
	MinScore float32

	// Metric specifies how to calculate vector similarity.
	// Default: Cosine similarity.
	Metric DistanceMetric

	// IncludeEmbeddings controls whether to return embeddings in results.
	// Default: false (embeddings are large and often not needed).
	IncludeEmbeddings bool

	// IncludeContent controls whether to return document content in results.
	// Default: true. Set to false to only get metadata/scores.
	IncludeContent bool

	// SortBy specifies additional sorting criteria (after similarity).
	// Useful for hybrid ranking (e.g., similarity + recency).
	SortBy []SortBy

	// Explain requests query execution details (for debugging/optimization).
	// Default: false.
	Explain bool
}

Query defines parameters for similarity search. It supports vector similarity, metadata filters, temporal constraints, scope filters, pagination, and more.

func NewFilterQuery

func NewFilterQuery(filters Filter) *Query

NewFilterQuery creates a filter-only query (no vector similarity).

func NewQuery

func NewQuery(embedding *Embedding) *Query

NewQuery creates a query with an embedding vector.

func (*Query) Validate

func (q *Query) Validate() error

Validate validates a query.

type QueryExplain

type QueryExplain struct {
	// Strategy describes the query execution strategy
	// Examples: "brute_force", "hnsw_index", "ivf_index"
	Strategy string

	// IndexUsed indicates which index was used (if any)
	IndexUsed string

	// ScannedDocuments is the number of documents scanned
	ScannedDocuments int64

	// FilteredDocuments is the number of documents after filtering
	FilteredDocuments int64

	// VectorComparisons is the number of vector similarity comparisons
	VectorComparisons int64

	// CacheHit indicates if results were served from cache
	CacheHit bool

	// Steps contains detailed execution steps
	Steps []ExplainStep
}

QueryExplain contains detailed query execution information. This is useful for debugging and optimization.

func (*QueryExplain) ExplainString

func (e *QueryExplain) ExplainString() string

ExplainString returns a human-readable explanation of query execution.

type QueryResult

type QueryResult struct {
	// Matches are the matching documents with their scores
	Matches []*Match

	// Total is the total number of matches (before limit/offset)
	// Useful for pagination
	Total int64

	// Offset is the offset that was applied
	Offset int

	// Limit is the limit that was applied
	Limit int

	// Timing contains query execution timing information
	Timing *QueryTiming

	// Explain contains query execution details (if requested)
	Explain *QueryExplain
}

QueryResult contains the results of a similarity search query.

func (*QueryResult) AvgScore

func (r *QueryResult) AvgScore() float32

AvgScore returns the average similarity score across all matches.

func (*QueryResult) Count

func (r *QueryResult) Count() int

Count returns the number of matches in this result.

func (*QueryResult) Documents

func (r *QueryResult) Documents() []*Document

Documents returns just the documents from matches (without scores).

func (*QueryResult) Empty

func (r *QueryResult) Empty() bool

Empty returns true if there are no matches.

func (*QueryResult) FilterByScore

func (r *QueryResult) FilterByScore(minScore float32) []*Match

FilterByScore returns matches with score >= minScore.

func (*QueryResult) FilterByTag

func (r *QueryResult) FilterByTag(tag string) []*Match

FilterByTag returns matches with a specific tag.

func (*QueryResult) First

func (r *QueryResult) First(n int) []*Match

First returns the first N matches.

func (*QueryResult) GroupByTag

func (r *QueryResult) GroupByTag() map[string][]*Match

GroupByTag groups matches by tag. Each match may appear in multiple groups if it has multiple tags.

func (*QueryResult) HasMatches

func (r *QueryResult) HasMatches() bool

HasMatches returns true if the query returned any matches.

func (*QueryResult) HasMore

func (r *QueryResult) HasMore() bool

HasMore returns true if there are more results beyond the current page.

func (*QueryResult) IDs

func (r *QueryResult) IDs() []string

IDs returns just the document IDs from matches.

func (*QueryResult) Last

func (r *QueryResult) Last(n int) []*Match

Last returns the last N matches.

func (*QueryResult) MatchByID

func (r *QueryResult) MatchByID(id string) *Match

MatchByID finds a match by document ID.

func (*QueryResult) MaxScore

func (r *QueryResult) MaxScore() float32

MaxScore returns the highest similarity score.

func (*QueryResult) MinScore

func (r *QueryResult) MinScore() float32

MinScore returns the lowest similarity score.

func (*QueryResult) NextOffset

func (r *QueryResult) NextOffset() int

NextOffset returns the offset for the next page.

func (*QueryResult) PrevOffset

func (r *QueryResult) PrevOffset() int

PrevOffset returns the offset for the previous page.

func (*QueryResult) Scores

func (r *QueryResult) Scores() []float32

Scores returns just the scores from matches.

func (*QueryResult) Slice

func (r *QueryResult) Slice(start, end int) []*Match

Slice returns a slice of matches [start:end].

func (*QueryResult) TopMatch

func (r *QueryResult) TopMatch() *Match

TopMatch returns the highest scoring match, or nil if no matches.
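The pagination helpers above (Total, Offset, HasMore, NextOffset) compose into a simple paging loop. The sketch below uses a simplified stand-in for QueryResult; the HasMore/NextOffset logic shown is an assumption inferred from the documented semantics, not the package implementation.

```go
package main

import "fmt"

// Match and QueryResult are simplified stand-ins for the package types.
type Match struct {
	ID    string
	Score float32
}

type QueryResult struct {
	Matches []*Match
	Total   int64 // total matches before limit/offset
	Offset  int
	Limit   int
}

// HasMore reports whether results exist beyond this page (assumed logic).
func (r *QueryResult) HasMore() bool {
	return int64(r.Offset+len(r.Matches)) < r.Total
}

// NextOffset returns the offset for the next page (assumed logic).
func (r *QueryResult) NextOffset() int {
	return r.Offset + len(r.Matches)
}

func main() {
	page := &QueryResult{
		Matches: []*Match{{ID: "a", Score: 0.9}, {ID: "b", Score: 0.8}},
		Total:   5,
		Offset:  0,
		Limit:   2,
	}
	if page.HasMore() {
		// In real code: re-run the query with Offset = page.NextOffset().
		fmt.Println("next offset:", page.NextOffset())
	}
}
```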

type QueryTiming

type QueryTiming struct {
	// Total is the total query execution time
	Total time.Duration

	// VectorSearch is the time spent on vector similarity search
	VectorSearch time.Duration

	// FilterApplication is the time spent applying filters
	FilterApplication time.Duration

	// Retrieval is the time spent retrieving full documents
	Retrieval time.Duration

	// Scoring is the time spent calculating similarity scores
	Scoring time.Duration
}

QueryTiming contains timing information for a query.

type ResultIterator

type ResultIterator interface {
	// Next advances to the next result.
	// Returns true if there is a result, false if iteration is complete or an error occurred.
	//
	// Always check Err() after Next returns false to distinguish between
	// normal completion and errors.
	Next() bool

	// Match returns the current search match.
	// Only valid after Next returns true.
	Match() *Match

	// Err returns any error that occurred during iteration.
	// Should be checked after Next returns false.
	Err() error

	// Close releases resources associated with the iterator.
	// Always call Close when done, typically via defer.
	Close() error
}

ResultIterator provides streaming access to query results. It follows the iterator pattern common in Go database libraries.
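The iterator contract (Next, then Match, then check Err after Next returns false) can be sketched end to end with an in-memory implementation, roughly what NewSliceIterator might look like. The Match type here is a simplified stand-in.

```go
package main

import "fmt"

// Match is a simplified stand-in for the package's match type.
type Match struct {
	ID    string
	Score float32
}

// ResultIterator mirrors the documented interface.
type ResultIterator interface {
	Next() bool
	Match() *Match
	Err() error
	Close() error
}

// sliceIterator materializes all results in memory.
type sliceIterator struct {
	matches []*Match
	pos     int // index of current match; -1 before the first Next
}

func NewSliceIterator(matches []*Match) ResultIterator {
	return &sliceIterator{matches: matches, pos: -1}
}

func (it *sliceIterator) Next() bool {
	if it.pos+1 >= len(it.matches) {
		return false
	}
	it.pos++
	return true
}

func (it *sliceIterator) Match() *Match { return it.matches[it.pos] }
func (it *sliceIterator) Err() error   { return nil }
func (it *sliceIterator) Close() error { return nil }

func main() {
	iter := NewSliceIterator([]*Match{{ID: "a", Score: 0.9}, {ID: "b", Score: 0.7}})
	defer iter.Close()

	// The canonical consumption loop.
	for iter.Next() {
		fmt.Println(iter.Match().ID, iter.Match().Score)
	}
	// Always distinguish normal completion from errors.
	if err := iter.Err(); err != nil {
		fmt.Println("iteration failed:", err)
	}
}
```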

func FilterIterator

func FilterIterator(iter ResultIterator, predicate func(*Match) bool) ResultIterator

FilterIterator applies a predicate to filter results from an iterator. Returns a new iterator with only matching results.

Example:

iter, err := coll.QueryStream(ctx, query)
if err != nil {
    return err
}
defer iter.Close()

// Filter for high-scoring results
filtered := vectorstore.FilterIterator(iter, func(m *Match) bool {
    return m.Score >= 0.8
})
defer filtered.Close()

func MapIterator

func MapIterator(iter ResultIterator, fn func(*Match) *Match) ResultIterator

MapIterator applies a transformation function to each result. Returns a new iterator with transformed results.

Example:

iter, err := coll.QueryStream(ctx, query)
if err != nil {
    return err
}
defer iter.Close()

// Boost scores
boosted := vectorstore.MapIterator(iter, func(m *Match) *Match {
    m.Score *= 1.2
    if m.Score > 1.0 {
        m.Score = 1.0
    }
    return m
})
defer boosted.Close()

func NewChannelIterator

func NewChannelIterator(matches <-chan *Match, errs <-chan error) ResultIterator

NewChannelIterator creates a ResultIterator from channels. The match channel should be closed when done. The error channel receives at most one error.

func NewEmptyIterator

func NewEmptyIterator() ResultIterator

NewEmptyIterator creates a ResultIterator with no results.

func NewErrorIterator

func NewErrorIterator(err error) ResultIterator

NewErrorIterator creates a ResultIterator that returns an error.

func NewSliceIterator

func NewSliceIterator(matches []*Match) ResultIterator

NewSliceIterator creates a ResultIterator from a slice of matches. This is a helper for providers that materialize all results in memory.

type Scope

type Scope struct {
	// Tenant is the top-level scope (organization, workspace, team)
	Tenant string

	// User is the user identifier
	User string

	// Session is the session identifier
	Session string

	// Agent is the agent identifier (for multi-agent systems)
	Agent string

	// Thread is the conversation thread identifier
	Thread string

	// Custom contains additional custom scope dimensions
	// Use this for domain-specific scoping needs
	Custom map[string]string
}

Scope defines hierarchical context for a document. This enables multi-tenancy, user isolation, and session tracking.

Scope fields are indexed separately for efficient filtering. All fields are optional and can be combined as needed.

func GetScopeFilter

func GetScopeFilter(f Filter) (scope *Scope, ok bool)

GetScopeFilter extracts the scope from a scope filter. The ok result is false if f is not a scope filter.

func NewScope

func NewScope(tenant, user, session string) *Scope

NewScope creates a scope with common fields.

func (*Scope) Match

func (s *Scope) Match(other *Scope) bool

Match checks if this scope matches another scope. An empty (zero-value) field matches any value (wildcard).
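The wildcard semantics can be illustrated with a self-contained sketch. The field set mirrors Scope, but the matching logic below is an assumption based on the documented behavior (an unset field matches anything), not the package implementation.

```go
package main

import "fmt"

// Scope is a simplified stand-in with a subset of the documented fields.
type Scope struct {
	Tenant  string
	User    string
	Session string
}

// Match treats an empty field on the receiver as a wildcard (assumed logic).
func (s *Scope) Match(other *Scope) bool {
	match := func(want, got string) bool { return want == "" || want == got }
	return match(s.Tenant, other.Tenant) &&
		match(s.User, other.User) &&
		match(s.Session, other.Session)
}

func main() {
	doc := &Scope{Tenant: "acme", User: "u1", Session: "s42"}

	// User/Session unset: matches any user and session within the tenant.
	tenantOnly := &Scope{Tenant: "acme"}
	fmt.Println(tenantOnly.Match(doc)) // true

	// A different user does not match.
	otherUser := &Scope{Tenant: "acme", User: "u2"}
	fmt.Println(otherUser.Match(doc)) // false
}
```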

type SortBy

type SortBy struct {
	// Field is the field name to sort by
	// Can be a metadata field, score, or temporal field
	Field string

	// Descending indicates descending order (default is ascending)
	Descending bool
}

SortBy specifies a field to sort by.

func SortByCreatedAt

func SortByCreatedAt(descending bool) SortBy

SortByCreatedAt creates a sort by creation time.

func SortByField

func SortByField(field string, descending bool) SortBy

SortByField creates a sort by metadata field.

func SortByScore

func SortByScore() SortBy

SortByScore creates a sort by similarity score (descending by default).

func SortByUpdatedAt

func SortByUpdatedAt(descending bool) SortBy

SortByUpdatedAt creates a sort by update time.

type StoreStats

type StoreStats struct {
	// Collections is the total number of collections
	Collections int64

	// Documents is the total number of documents across all collections
	Documents int64

	// StorageBytes is the total storage used in bytes
	StorageBytes int64

	// Provider is the vector store provider name (e.g., "memory", "firestore", "qdrant")
	Provider string

	// Version is the provider version
	Version string

	// Extra contains provider-specific statistics
	Extra map[string]any
}

StoreStats contains statistics about the entire vector store.

type Temporal

type Temporal struct {
	// CreatedAt is when the document was created
	CreatedAt time.Time

	// UpdatedAt is when the document was last updated
	UpdatedAt time.Time

	// ExpiresAt is when the document should expire (optional)
	// Used for automatic cleanup in caching scenarios
	ExpiresAt *time.Time

	// EventTime is the time of the event this document represents (optional)
	// Used for time-series data and event ordering
	EventTime *time.Time

	// ValidFrom is when this document becomes valid (optional)
	// Used for future-dated content
	ValidFrom *time.Time

	// ValidUntil is when this document stops being valid (optional)
	// Different from ExpiresAt - this is semantic validity, not storage
	ValidUntil *time.Time
}

Temporal contains time-related information for a document. This enables TTL, time-based queries, and event ordering.

func NewTemporal

func NewTemporal() *Temporal

NewTemporal creates a Temporal with creation time set to now.

func NewTemporalWithTTL

func NewTemporalWithTTL(ttl time.Duration) *Temporal

NewTemporalWithTTL creates a Temporal with TTL.

func (*Temporal) IsExpired

func (t *Temporal) IsExpired() bool

IsExpired checks if the document has expired.

func (*Temporal) IsValid

func (t *Temporal) IsValid() bool

IsValid checks if the document is currently valid (within ValidFrom/ValidUntil range).

func (*Temporal) SetExpiry

func (t *Temporal) SetExpiry(ttl time.Duration)

SetExpiry sets the expiration time relative to now.

func (*Temporal) Touch

func (t *Temporal) Touch()

Touch updates the UpdatedAt timestamp to now.

type TimeField

type TimeField string

TimeField represents a temporal field to filter on.

const (
	// TimeFieldCreatedAt filters on document creation time
	TimeFieldCreatedAt TimeField = "created_at"

	// TimeFieldUpdatedAt filters on document update time
	TimeFieldUpdatedAt TimeField = "updated_at"

	// TimeFieldExpiresAt filters on document expiration time
	TimeFieldExpiresAt TimeField = "expires_at"

	// TimeFieldEventTime filters on event time
	TimeFieldEventTime TimeField = "event_time"

	// TimeFieldValidFrom filters on validity start time
	TimeFieldValidFrom TimeField = "valid_from"

	// TimeFieldValidUntil filters on validity end time
	TimeFieldValidUntil TimeField = "valid_until"
)

type TimestampValue

type TimestampValue struct {
	time.Time
}

TimestampValue represents a timestamp that can be stored and queried. This is a helper type for stats and results.

func NewTimestamp

func NewTimestamp(t time.Time) *TimestampValue

NewTimestamp creates a TimestampValue from a time.Time.

type UpsertResult

type UpsertResult struct {
	// Inserted is the number of new documents inserted
	Inserted int64

	// Updated is the number of existing documents updated
	Updated int64

	// Failed is the number of documents that failed validation/insertion
	Failed int64

	// FailedIDs contains the IDs of documents that failed (if any)
	FailedIDs []string

	// Errors contains errors for failed documents (parallel to FailedIDs)
	Errors []error

	// Timing contains operation timing information
	Timing *OperationTiming

	// Deduplicated is the number of documents that were deduplicated
	// Only set if deduplication is enabled
	Deduplicated int64

	// DeduplicatedIDs contains the IDs of deduplicated documents
	DeduplicatedIDs []string
}

UpsertResult contains the results of an upsert operation.

func (*UpsertResult) PartialSuccess

func (r *UpsertResult) PartialSuccess() bool

PartialSuccess returns true if some documents succeeded but some failed.

func (*UpsertResult) Success

func (r *UpsertResult) Success() bool

Success returns true if all documents were successfully upserted.

func (*UpsertResult) TotalProcessed

func (r *UpsertResult) TotalProcessed() int64

TotalProcessed returns the total number of documents processed.

type VectorStore

type VectorStore interface {
	// Collection returns a Collection with the specified name and options.
	// Collections provide isolated namespaces for documents and embeddings.
	//
	// The name must be a valid collection identifier (alphanumeric, hyphens, underscores).
	// Options configure behavior like TTL, deduplication, indexing, etc.
	//
	// Collections are created lazily on first use. Calling Collection multiple times
	// with the same name returns the same logical collection.
	//
	// Example:
	//
	//	cache := store.Collection("cache", WithTTL(5*time.Minute))
	//	docs := store.Collection("documents", WithIndexing(IndexTypeHNSW))
	Collection(name string, opts ...CollectionOption) Collection

	// ListCollections returns the names of all collections in this store.
	// This is useful for administration and debugging.
	ListCollections(ctx context.Context) ([]string, error)

	// DeleteCollection permanently deletes a collection and all its documents.
	// This operation cannot be undone.
	//
	// Returns an error if the collection doesn't exist or cannot be deleted.
	DeleteCollection(ctx context.Context, name string) error

	// Stats returns statistics about the vector store.
	// This includes total collections, documents, storage size, etc.
	Stats(ctx context.Context) (*StoreStats, error)

	// Close closes the connection to the vector database and releases resources.
	// After Close is called, the VectorStore should not be used.
	Close() error
}

VectorStore is the top-level interface for vector database operations. It provides methods for creating and managing collections, which are isolated namespaces for documents and embeddings.

VectorStore acts as a factory for Collections, enabling multi-tenancy, use-case isolation, and organized storage of embeddings.

Example:

store, err := memory.New()
if err != nil {
    log.Fatal(err)
}
defer store.Close()

// Create collection for semantic caching
cache := store.Collection("cache",
    WithTTL(5*time.Minute),
    WithDeduplication(true),
)

// Create collection for agent memory
memory := store.Collection("agent-memory",
    WithScope("user", "session"),
)
