sqlitevec

package module
v0.5.4 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 11, 2026 License: Apache-2.0 Imports: 11 Imported by: 0

README

goagent/memory/vector/sqlitevec

SQLite + sqlite-vec store for goagent.

Implements goagent.VectorStore over SQLite with the sqlite-vec extension. Suitable for embedded and single-process deployments with no external database dependency.

go get github.com/Germanblandin1/goagent/memory/vector/sqlitevec

Requires CGO and a C compiler.

Documentation

Documentation

Overview

Package sqlitevec implements goagent.VectorStore over SQLite with the sqlite-vec extension (https://github.com/asg017/sqlite-vec).

Build requirements

This package requires CGO and a C compiler. The sqlite-vec extension needs sqlite3.h, which mattn/go-sqlite3 ships as sqlite3-binding.h. A shim header at csrc/sqlite3.h bridges the two. Run once from the repository root:

go env -w CGO_CFLAGS="-I$(pwd)/memory/vector/sqlitevec/csrc -I$(go env GOMODCACHE)/github.com/mattn/go-sqlite3@v1.14.40"

The setting persists in your Go environment file (~/.config/go/env). Update the mattn/go-sqlite3 version suffix if that dependency is upgraded.

Schema

This package uses two tables (created by Migrate or supplied by the caller):

goagent_embeddings      — regular table: id, content, metadata, created_at
goagent_embeddings_vec  — vec0 virtual table: rowid (FK to main table), embedding

The vec0 table provides an indexed KNN search via sqlite-vec's MATCH operator.

Usage

Call Register (or use Open) before opening any SQLite connection:

sqlitevec.Register()
db, err := sql.Open("sqlite3", "path/to/db.sqlite")

Or use the convenience wrapper:

db, err := sqlitevec.Open("path/to/db.sqlite")

Metadata filtering

Search supports goagent.WithFilter when MetadataColumn is set. Filtering is applied in Go after the database returns results (post-query). topK is applied by sqlite-vec first, so a selective filter may yield fewer than topK results. All key-value pairs in the filter map must match (AND semantics); values are compared with reflect.DeepEqual.

This is appropriate for sqlitevec's typical scale (< 100k entries) where the overhead of post-filtering a small result set is negligible.

Score threshold

goagent.WithScoreThreshold is also applied post-query in Go, after the score conversion from distance. Both options can be combined.

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func Migrate

func Migrate(ctx context.Context, db *sql.DB, cfg MigrateConfig) error

Migrate creates the data table and the vec0 virtual table if they do not exist. It is idempotent — safe to call on every application start.

Two tables are created:

  • TableName: id TEXT PK, content TEXT, metadata TEXT (JSON), created_at INTEGER
  • TableName+"_vec": vec0 virtual table with an embedding float[Dims] column

To use the created tables with New:

cfg := sqlitevec.MigrateConfig{TableName: "goagent_embeddings", Dims: 1536}
if err := sqlitevec.Migrate(ctx, db, cfg); err != nil { ... }

store, err := sqlitevec.New(db, sqlitevec.TableConfig{
    Table:          cfg.TableName,
    IDColumn:       "id",
    VectorColumn:   "embedding",
    TextColumn:     "content",
    MetadataColumn: "metadata",
})
Example
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)

func main() {
	ctx := context.Background()
	db, err := sqlitevec.Open("/path/to/mydb.sqlite")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	cfg := sqlitevec.MigrateConfig{
		TableName: "goagent_embeddings",
		Dims:      1536, // match your embedding model (e.g. text-embedding-3-small)
	}
	if err := sqlitevec.Migrate(ctx, db, cfg); err != nil {
		log.Fatal(err)
	}

	store, err := sqlitevec.New(db, sqlitevec.TableConfig{
		Table:          cfg.TableName,
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata",
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = store
	fmt.Println("store ready")
}

func Open

func Open(dsn string) (*sql.DB, error)

Open registers the sqlite-vec extension and opens a SQLite database at dsn. It is the recommended way to obtain a *sql.DB for use with this package.

Callers managing their own connection must call Register before sql.Open.

func Register

func Register()

Register enables the sqlite-vec extension for all SQLite connections opened after this call. It is idempotent and safe to call multiple times. Must be called before sql.Open when not using Open.

Types

type DistanceMetric

type DistanceMetric string

DistanceMetric selects the vector similarity function used in Search.

const (
	// L2 uses Euclidean distance via the sqlite-vec KNN index (MATCH … AND k = ?).
	// Score is 1 / (1 + distance), in (0, 1].
	// This is the default. Queries are index-accelerated.
	L2 DistanceMetric = "l2"

	// Cosine uses cosine distance via the vec_distance_cosine SQL function.
	// Score is 1 - distance; for unit-normalised vectors the range is [0, 1].
	// Queries perform a full scan — suitable for datasets up to tens of thousands
	// of rows. For larger datasets, normalise vectors before inserting and use L2
	// (L2 on unit vectors is equivalent to cosine similarity).
	Cosine DistanceMetric = "cosine"
)

type MigrateConfig

type MigrateConfig struct {
	// TableName is the base name for the two tables to create:
	//   TableName       — regular data table
	//   TableName+"_vec" — vec0 virtual table for vector search
	// Default: "goagent_embeddings".
	TableName string

	// Dims is the vector dimension. Required. Must match the embedding model
	// (e.g. 768 for nomic-embed-text, 1024 for voyage-3, 1536 for
	// text-embedding-3-small).
	Dims int

	// Metric is the distance metric that will be used with New.
	// Stored only for documentation — has no effect on the schema.
	// Default: L2.
	Metric DistanceMetric
}

MigrateConfig configures the tables that Migrate will create. Use this when the caller has no existing schema and wants a quick start.

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store implements goagent.VectorStore over SQLite with the sqlite-vec extension. It satisfies the goagent.VectorStore interface directly.

func New

func New(db *sql.DB, cfg TableConfig, opts ...StoreOption) (*Store, error)

New creates a Store backed by db using the given TableConfig and options. Returns an error if any required field is missing or contains invalid characters.

Example
package main

import (
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)

func main() {
	db, err := sqlitevec.Open("/path/to/mydb.sqlite")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	store, err := sqlitevec.New(db, sqlitevec.TableConfig{
		Table:          "embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata", // optional; omit if your table has no metadata column
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = store
	fmt.Println("store created")
}

func (*Store) BulkDelete added in v0.5.4

func (s *Store) BulkDelete(ctx context.Context, ids []string) error

BulkDelete removes all entries with the given ids within a single SQLite transaction. IDs that do not exist are silently ignored.

func (*Store) BulkUpsert added in v0.5.4

func (s *Store) BulkUpsert(ctx context.Context, entries []goagent.UpsertEntry) error

BulkUpsert stores or updates all entries within a single SQLite transaction, reducing round-trips compared to N individual Upsert calls. Each entry follows the same multi-step upsert logic as Store.Upsert.

func (*Store) Delete

func (s *Store) Delete(ctx context.Context, id string) error

Delete removes the entry with the given id from both the data table and the vec0 virtual table. It is a no-op if id does not exist.

func (*Store) Search

func (s *Store) Search(ctx context.Context, vec []float32, topK int, opts ...goagent.SearchOption) ([]goagent.ScoredMessage, error)

Search returns the topK messages most similar to vec, ordered by similarity descending. Each returned Message has RoleDocument so it is never forwarded to a provider.

L2 search uses sqlite-vec's indexed KNN (MATCH … AND k = ?). Cosine search uses vec_distance_cosine and performs a full scan.

WithScoreThreshold and WithFilter are both applied post-query in Go. topK is applied by the database first, so fewer than topK results may be returned when either option is active. WithFilter requires MetadataColumn to be set; silently ignored otherwise. All key-value pairs in the filter must match (AND semantics); values are compared with reflect.DeepEqual.

Example (WithFilter)

ExampleStore_Search_withFilter demonstrates metadata filtering using goagent.WithFilter.

For sqlitevec, filtering is applied in Go after sqlite-vec returns the topK nearest neighbours. This is appropriate for sqlitevec's typical scale (embedded, local, < 100k entries) where the post-filter overhead is negligible. Requires MetadataColumn to be set in TableConfig.

Scenario: a local agent that indexes documents from multiple projects. A query for "project-alpha" must not surface documents from other projects, even if they are semantically similar.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)

func main() {
	ctx := context.Background()
	db, err := sqlitevec.Open(":memory:") // in-memory SQLite for the example
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := sqlitevec.Migrate(ctx, db, sqlitevec.MigrateConfig{
		TableName: "embeddings",
		Dims:      3,
	}); err != nil {
		log.Fatal(err)
	}

	store, err := sqlitevec.New(db, sqlitevec.TableConfig{
		Table:          "embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata", // required for WithFilter
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upsert documents tagged with a project name.
	// In production this step runs at index time, not at query time.
	docs := []struct {
		id      string
		vec     []float32
		text    string
		project string
	}{
		{"alpha-001", []float32{1, 0, 0}, "Alpha: deployment checklist", "alpha"},
		{"alpha-002", []float32{0.9, 0.1, 0}, "Alpha: rollback procedure", "alpha"},
		{"beta-001", []float32{0.95, 0.05, 0}, "Beta: deployment guide", "beta"},
	}
	for _, d := range docs {
		msg := goagent.Message{
			Role:     goagent.RoleDocument,
			Content:  []goagent.ContentBlock{goagent.TextBlock(d.text)},
			Metadata: map[string]any{"project": d.project},
		}
		if err := store.Upsert(ctx, d.id, d.vec, msg); err != nil {
			log.Fatal(err)
		}
	}

	queryVec := []float32{1, 0, 0}

	// Without filter: "Beta: deployment guide" would rank highly because its
	// vector is almost identical to the alpha docs.
	// With filter: only alpha documents qualify.
	results, err := store.Search(ctx, queryVec, 5,
		goagent.WithFilter(map[string]any{"project": "alpha"}),
	)
	if err != nil {
		log.Fatal(err)
	}

	for _, r := range results {
		fmt.Printf("project=%s text=%s\n",
			r.Message.Metadata["project"],
			r.Message.TextContent(),
		)
	}
}
Example (WithScoreThresholdAndFilter)

ExampleStore_Search_withScoreThresholdAndFilter demonstrates combining a score threshold with a metadata filter.

Both are applied in Go post-query: the threshold discards results below the minimum similarity, and the filter discards results whose metadata does not match. The order of application is: ScoreThreshold first, then Filter. Both options can yield fewer than topK results.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)

func main() {
	ctx := context.Background()
	db, err := sqlitevec.Open(":memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := sqlitevec.Migrate(ctx, db, sqlitevec.MigrateConfig{
		TableName: "embeddings",
		Dims:      3,
	}); err != nil {
		log.Fatal(err)
	}

	store, err := sqlitevec.New(db, sqlitevec.TableConfig{
		Table:          "embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata",
	})
	if err != nil {
		log.Fatal(err)
	}

	queryVec := []float32{1, 0, 0}

	results, err := store.Search(ctx, queryVec, 10,
		goagent.WithFilter(map[string]any{"project": "alpha"}),
		goagent.WithScoreThreshold(0.90), // only results with ≥90% similarity
	)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("results above threshold: %d\n", len(results))
}

func (*Store) Upsert

func (s *Store) Upsert(ctx context.Context, id string, vec []float32, msg goagent.Message) error

Upsert stores or updates the message and its embedding vector under id. The operation is idempotent: calling Upsert twice with the same id replaces the first entry. Only text content and optional metadata from msg are persisted — Role and ToolCalls are discarded.

type StoreOption

type StoreOption func(*storeOptions)

StoreOption configures the behaviour of a Store.

func WithDistanceMetric

func WithDistanceMetric(m DistanceMetric) StoreOption

WithDistanceMetric sets the similarity metric used in Search. Default: L2.

type TableConfig

type TableConfig struct {
	// Table is the regular SQLite table name (e.g. "goagent_embeddings").
	Table string

	// IDColumn is the PRIMARY KEY column of type TEXT in the data table.
	IDColumn string

	// VectorColumn is the vector column defined in the vec0 virtual table
	// (the table named Table+"_vec").
	VectorColumn string

	// TextColumn is the TEXT column in the data table containing the chunk text.
	TextColumn string

	// MetadataColumn is optional. If non-empty, it must be a TEXT column holding
	// a JSON object. Its content is deserialized into Message.Metadata.
	MetadataColumn string
}

TableConfig describes the schema used by this package. The vec0 virtual table is inferred by appending "_vec" to Table. All fields are required except MetadataColumn.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL