Documentation
¶
Overview ¶
Package sqlitevec implements goagent.VectorStore over SQLite with the sqlite-vec extension (https://github.com/asg017/sqlite-vec).
Build requirements ¶
This package requires CGO and a C compiler. The sqlite-vec extension needs sqlite3.h, which mattn/go-sqlite3 ships as sqlite3-binding.h. A shim header at csrc/sqlite3.h bridges the two. Run once from the repository root:
go env -w CGO_CFLAGS="-I$(pwd)/memory/vector/sqlitevec/csrc -I$(go env GOMODCACHE)/github.com/mattn/go-sqlite3@v1.14.40"
The setting persists in your Go environment file (~/.config/go/env). Update the mattn/go-sqlite3 version suffix if that dependency is upgraded.
Schema ¶
This package uses two tables (created by Migrate or supplied by the caller):
goagent_embeddings — regular table: id, content, metadata, created_at goagent_embeddings_vec — vec0 virtual table: rowid (FK to main table), embedding
The vec0 table provides an indexed KNN search via sqlite-vec's MATCH operator.
Usage ¶
Call Register (or use Open) before opening any SQLite connection:
sqlitevec.Register()
db, err := sql.Open("sqlite3", "path/to/db.sqlite")
Or use the convenience wrapper:
db, err := sqlitevec.Open("path/to/db.sqlite")
Metadata filtering ¶
Search supports goagent.WithFilter when MetadataColumn is set. Filtering is applied in Go after the database returns results (post-query). topK is applied by sqlite-vec first, so a selective filter may yield fewer than topK results. All key-value pairs in the filter map must match (AND semantics); values are compared with reflect.DeepEqual.
This is appropriate for sqlitevec's typical scale (< 100k entries) where the overhead of post-filtering a small result set is negligible.
Score threshold ¶
goagent.WithScoreThreshold is also applied post-query in Go, after the score conversion from distance. Both options can be combined.
Index ¶
- func Migrate(ctx context.Context, db *sql.DB, cfg MigrateConfig) error
- func Open(dsn string) (*sql.DB, error)
- func Register()
- type DistanceMetric
- type MigrateConfig
- type Store
- func (s *Store) BulkDelete(ctx context.Context, ids []string) error
- func (s *Store) BulkUpsert(ctx context.Context, entries []goagent.UpsertEntry) error
- func (s *Store) Delete(ctx context.Context, id string) error
- func (s *Store) Search(ctx context.Context, vec []float32, topK int, opts ...goagent.SearchOption) ([]goagent.ScoredMessage, error)
- func (s *Store) Upsert(ctx context.Context, id string, vec []float32, msg goagent.Message) error
- type StoreOption
- type TableConfig
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Migrate ¶
Migrate creates the data table and the vec0 virtual table if they do not exist. It is idempotent — safe to call on every application start.
Two tables are created:
- TableName: id TEXT PK, content TEXT, metadata TEXT (JSON), created_at INTEGER
- TableName+"_vec": vec0 virtual table with an embedding float[Dims] column
To use the created tables with New:
cfg := sqlitevec.MigrateConfig{TableName: "goagent_embeddings", Dims: 1536}
if err := sqlitevec.Migrate(ctx, db, cfg); err != nil { ... }
store, err := sqlitevec.New(db, sqlitevec.TableConfig{
Table: cfg.TableName,
IDColumn: "id",
VectorColumn: "embedding",
TextColumn: "content",
MetadataColumn: "metadata",
})
Example ¶
package main
import (
"context"
"fmt"
"log"
"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)
func main() {
ctx := context.Background()
db, err := sqlitevec.Open("/path/to/mydb.sqlite")
if err != nil {
log.Fatal(err)
}
defer db.Close()
cfg := sqlitevec.MigrateConfig{
TableName: "goagent_embeddings",
Dims: 1536, // match your embedding model (e.g. text-embedding-3-small)
}
if err := sqlitevec.Migrate(ctx, db, cfg); err != nil {
log.Fatal(err)
}
store, err := sqlitevec.New(db, sqlitevec.TableConfig{
Table: cfg.TableName,
IDColumn: "id",
VectorColumn: "embedding",
TextColumn: "content",
MetadataColumn: "metadata",
})
if err != nil {
log.Fatal(err)
}
_ = store
fmt.Println("store ready")
}
Output:
Types ¶
type DistanceMetric ¶
type DistanceMetric string
DistanceMetric selects the vector similarity function used in Search.
const ( // L2 uses Euclidean distance via the sqlite-vec KNN index (MATCH … AND k = ?). // Score is 1 / (1 + distance), in (0, 1]. // This is the default. Queries are index-accelerated. L2 DistanceMetric = "l2" // Cosine uses cosine distance via the vec_distance_cosine SQL function. // Score is 1 - distance; for unit-normalised vectors the range is [0, 1]. // Queries perform a full scan — suitable for datasets up to tens of thousands // of rows. For larger datasets, normalise vectors before inserting and use L2 // (L2 on unit vectors is equivalent to cosine similarity). Cosine DistanceMetric = "cosine" )
type MigrateConfig ¶
type MigrateConfig struct {
// TableName is the base name for the two tables to create:
// TableName — regular data table
// TableName+"_vec" — vec0 virtual table for vector search
// Default: "goagent_embeddings".
TableName string
// Dims is the vector dimension. Required. Must match the embedding model
// (e.g. 768 for nomic-embed-text, 1024 for voyage-3, 1536 for
// text-embedding-3-small).
Dims int
// Metric is the distance metric that will be used with New.
// Stored only for documentation — has no effect on the schema.
// Default: L2.
Metric DistanceMetric
}
MigrateConfig configures the tables that Migrate will create. Use this when the caller has no existing schema and wants a quick start.
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store implements goagent.VectorStore over SQLite with the sqlite-vec extension. It satisfies the goagent.VectorStore interface directly.
func New ¶
func New(db *sql.DB, cfg TableConfig, opts ...StoreOption) (*Store, error)
New creates a Store backed by db using the given TableConfig and options. Returns an error if any required field is missing or contains invalid characters.
Example ¶
package main
import (
"fmt"
"log"
"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)
func main() {
db, err := sqlitevec.Open("/path/to/mydb.sqlite")
if err != nil {
log.Fatal(err)
}
defer db.Close()
store, err := sqlitevec.New(db, sqlitevec.TableConfig{
Table: "embeddings",
IDColumn: "id",
VectorColumn: "embedding",
TextColumn: "content",
MetadataColumn: "metadata", // optional; omit if your table has no metadata column
})
if err != nil {
log.Fatal(err)
}
_ = store
fmt.Println("store created")
}
Output:
func (*Store) BulkDelete ¶ added in v0.5.4
BulkDelete removes all entries with the given ids within a single SQLite transaction. IDs that do not exist are silently ignored.
func (*Store) BulkUpsert ¶ added in v0.5.4
BulkUpsert stores or updates all entries within a single SQLite transaction, reducing round-trips compared to N individual Upsert calls. Each entry follows the same multi-step upsert logic as Store.Upsert.
func (*Store) Delete ¶
Delete removes the entry with the given id from both the data table and the vec0 virtual table. It is a no-op if id does not exist.
func (*Store) Search ¶
func (s *Store) Search(ctx context.Context, vec []float32, topK int, opts ...goagent.SearchOption) ([]goagent.ScoredMessage, error)
Search returns the topK messages most similar to vec, ordered by similarity descending. Each returned Message has RoleDocument so it is never forwarded to a provider.
L2 search uses sqlite-vec's indexed KNN (MATCH … AND k = ?). Cosine search uses vec_distance_cosine and performs a full scan.
WithScoreThreshold and WithFilter are both applied post-query in Go. topK is applied by the database first, so fewer than topK results may be returned when either option is active. WithFilter requires MetadataColumn to be set; silently ignored otherwise. All key-value pairs in the filter must match (AND semantics); values are compared with reflect.DeepEqual.
Example (WithFilter) ¶
ExampleStore_Search_withFilter demonstrates metadata filtering using goagent.WithFilter.
For sqlitevec, filtering is applied in Go after sqlite-vec returns the topK nearest neighbours. This is appropriate for sqlitevec's typical scale (embedded, local, < 100k entries) where the post-filter overhead is negligible. Requires MetadataColumn to be set in TableConfig.
Scenario: a local agent that indexes documents from multiple projects. A query for "project-alpha" must not surface documents from other projects, even if they are semantically similar.
package main
import (
"context"
"fmt"
"log"
"github.com/Germanblandin1/goagent"
"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)
func main() {
ctx := context.Background()
db, err := sqlitevec.Open(":memory:") // in-memory SQLite for the example
if err != nil {
log.Fatal(err)
}
defer db.Close()
if err := sqlitevec.Migrate(ctx, db, sqlitevec.MigrateConfig{
TableName: "embeddings",
Dims: 3,
}); err != nil {
log.Fatal(err)
}
store, err := sqlitevec.New(db, sqlitevec.TableConfig{
Table: "embeddings",
IDColumn: "id",
VectorColumn: "embedding",
TextColumn: "content",
MetadataColumn: "metadata", // required for WithFilter
})
if err != nil {
log.Fatal(err)
}
// Upsert documents tagged with a project name.
// In production this step runs at index time, not at query time.
docs := []struct {
id string
vec []float32
text string
project string
}{
{"alpha-001", []float32{1, 0, 0}, "Alpha: deployment checklist", "alpha"},
{"alpha-002", []float32{0.9, 0.1, 0}, "Alpha: rollback procedure", "alpha"},
{"beta-001", []float32{0.95, 0.05, 0}, "Beta: deployment guide", "beta"},
}
for _, d := range docs {
msg := goagent.Message{
Role: goagent.RoleDocument,
Content: []goagent.ContentBlock{goagent.TextBlock(d.text)},
Metadata: map[string]any{"project": d.project},
}
if err := store.Upsert(ctx, d.id, d.vec, msg); err != nil {
log.Fatal(err)
}
}
queryVec := []float32{1, 0, 0}
// Without filter: "Beta: deployment guide" would rank highly because its
// vector is almost identical to the alpha docs.
// With filter: only alpha documents qualify.
results, err := store.Search(ctx, queryVec, 5,
goagent.WithFilter(map[string]any{"project": "alpha"}),
)
if err != nil {
log.Fatal(err)
}
for _, r := range results {
fmt.Printf("project=%s text=%s\n",
r.Message.Metadata["project"],
r.Message.TextContent(),
)
}
}
Output:
Example (WithScoreThresholdAndFilter) ¶
ExampleStore_Search_withScoreThresholdAndFilter demonstrates combining a score threshold with a metadata filter.
Both are applied in Go post-query: the threshold discards results below the minimum similarity, and the filter discards results whose metadata does not match. The order of application is: ScoreThreshold first, then Filter. Both options can yield fewer than topK results.
package main
import (
"context"
"fmt"
"log"
"github.com/Germanblandin1/goagent"
"github.com/Germanblandin1/goagent/memory/vector/sqlitevec"
)
func main() {
ctx := context.Background()
db, err := sqlitevec.Open(":memory:")
if err != nil {
log.Fatal(err)
}
defer db.Close()
if err := sqlitevec.Migrate(ctx, db, sqlitevec.MigrateConfig{
TableName: "embeddings",
Dims: 3,
}); err != nil {
log.Fatal(err)
}
store, err := sqlitevec.New(db, sqlitevec.TableConfig{
Table: "embeddings",
IDColumn: "id",
VectorColumn: "embedding",
TextColumn: "content",
MetadataColumn: "metadata",
})
if err != nil {
log.Fatal(err)
}
queryVec := []float32{1, 0, 0}
results, err := store.Search(ctx, queryVec, 10,
goagent.WithFilter(map[string]any{"project": "alpha"}),
goagent.WithScoreThreshold(0.90), // only results with ≥90% similarity
)
if err != nil {
log.Fatal(err)
}
fmt.Printf("results above threshold: %d\n", len(results))
}
Output:
type StoreOption ¶
type StoreOption func(*storeOptions)
StoreOption configures the behaviour of a Store.
func WithDistanceMetric ¶
func WithDistanceMetric(m DistanceMetric) StoreOption
WithDistanceMetric sets the similarity metric used in Search. Default: L2.
type TableConfig ¶
type TableConfig struct {
// Table is the regular SQLite table name (e.g. "goagent_embeddings").
Table string
// IDColumn is the PRIMARY KEY column of type TEXT in the data table.
IDColumn string
// VectorColumn is the vector column defined in the vec0 virtual table
// (the table named Table+"_vec").
VectorColumn string
// TextColumn is the TEXT column in the data table containing the chunk text.
TextColumn string
// MetadataColumn is optional. If non-empty, it must be a TEXT column holding
// a JSON object. Its content is deserialized into Message.Metadata.
MetadataColumn string
}
TableConfig describes the schema used by this package. The vec0 virtual table is inferred by appending "_vec" to Table. All fields are required except MetadataColumn.