Documentation ¶
Overview ¶
Package pgvector implements goagent.VectorStore over PostgreSQL with the pgvector extension.
The caller describes their existing table via TableConfig — this package does not impose any schema. For a quick start without an existing table, use Migrate.
Metadata filtering ¶
When MetadataColumn is set to a JSONB column, Search supports goagent.WithFilter to restrict results server-side using PostgreSQL's JSONB containment operator (@>). All key-value pairs in the filter map must be present in the stored metadata (AND semantics).
For best performance on large tables, create a GIN index on the metadata column:
CREATE INDEX ON embeddings USING gin(metadata jsonb_path_ops);
Without the index, PostgreSQL falls back to a sequential scan. For tables under ~100k rows, the sequential scan is typically fast enough.
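The AND semantics of the containment filter can be illustrated in plain Go. This is a minimal sketch, not the package's code: containsAll is a hypothetical helper that handles only flat maps with comparable values, whereas PostgreSQL's @> also recurses into nested objects and arrays.

```go
package main

import "fmt"

// containsAll reports whether every key-value pair in filter is also present
// in stored. Hypothetical helper: flat maps with comparable values only.
func containsAll(stored, filter map[string]any) bool {
	for k, want := range filter {
		got, ok := stored[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	stored := map[string]any{"team": "engineering", "language": "go"}

	// A subset of the stored pairs matches.
	fmt.Println(containsAll(stored, map[string]any{"team": "engineering"})) // true

	// One mismatching pair is enough to reject the row.
	fmt.Println(containsAll(stored, map[string]any{"team": "engineering", "language": "python"})) // false

	// An empty filter matches every row.
	fmt.Println(containsAll(stored, map[string]any{})) // true
}
```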
Score threshold ¶
goagent.WithScoreThreshold is applied in Go after the database returns results. topK is applied by the database first, so a selective threshold may yield fewer than topK results.
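The ordering of the two steps is easy to see in a small sketch. The scored type below is a hypothetical stand-in for goagent.ScoredMessage, and the inclusive comparison (>=) is an assumption; the package may treat the boundary differently.

```go
package main

import "fmt"

// scored is a hypothetical stand-in for goagent.ScoredMessage.
type scored struct {
	ID    string
	Score float64
}

// applyThreshold keeps only results whose score is at least minScore.
// The input order is preserved.
func applyThreshold(results []scored, minScore float64) []scored {
	kept := make([]scored, 0, len(results))
	for _, r := range results {
		if r.Score >= minScore {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	// Pretend the database has already returned its topK = 3 nearest rows.
	fromDB := []scored{{"a", 0.93}, {"b", 0.81}, {"c", 0.42}}

	// The threshold runs afterwards, so only 2 of the 3 rows survive.
	kept := applyThreshold(fromDB, 0.80)
	fmt.Println(len(kept)) // 2
}
```

Because the cut happens after LIMIT, a caller that needs a guaranteed number of results above the threshold may have to request a larger topK.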
Index ¶
- Variables
- func Migrate(ctx context.Context, db *sql.DB, cfg MigrateConfig) error
- type DistanceFunc
- type MigrateConfig
- type Querier
- type Store
- func (s *Store) BulkDelete(ctx context.Context, ids []string) error
- func (s *Store) BulkUpsert(ctx context.Context, entries []goagent.UpsertEntry) error
- func (s *Store) Delete(ctx context.Context, id string) error
- func (s *Store) Search(ctx context.Context, vec []float32, topK int, opts ...goagent.SearchOption) ([]goagent.ScoredMessage, error)
- func (s *Store) Upsert(ctx context.Context, id string, vec []float32, msg goagent.Message) error
- type StoreOption
- type TableConfig
Variables ¶
var (
	// Cosine uses cosine distance (<=>).
	// Score is 1 - distance, in [0, 1] for normalised vectors.
	// Recommended for most text embedding models.
	// HNSW operator class: vector_cosine_ops.
	Cosine = DistanceFunc{"<=>", "vector_cosine_ops"}

	// L2 uses Euclidean distance (<->).
	// Score is 1 / (1 + distance), in (0, 1].
	// Use when the model produces non-normalised vectors.
	// HNSW operator class: vector_l2_ops.
	L2 = DistanceFunc{"<->", "vector_l2_ops"}

	// InnerProduct uses negative inner product (<#>).
	// Score is the inner product (negated back to positive).
	// Equivalent to cosine similarity for normalised vectors;
	// marginally faster on some hardware.
	// HNSW operator class: vector_ip_ops.
	InnerProduct = DistanceFunc{"<#>", "vector_ip_ops"}
)
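The score formulas in the comments on Cosine, L2, and InnerProduct can be written out as a small conversion function. This is a sketch under the assumption that the score is derived from the raw operator result exactly as those comments state; scoreFromDistance is a hypothetical name, not part of the package.

```go
package main

import "fmt"

// scoreFromDistance converts the raw value returned by a pgvector operator
// into a similarity score, following the formulas in the comments above.
func scoreFromDistance(operator string, distance float64) (float64, error) {
	switch operator {
	case "<=>": // cosine distance
		return 1 - distance, nil
	case "<->": // Euclidean distance
		return 1 / (1 + distance), nil
	case "<#>": // negative inner product
		return -distance, nil
	default:
		return 0, fmt.Errorf("unsupported operator %q", operator)
	}
}

func main() {
	cos, _ := scoreFromDistance("<=>", 0.25)
	l2, _ := scoreFromDistance("<->", 1.0)
	ip, _ := scoreFromDistance("<#>", -0.5)
	fmt.Println(cos, l2, ip) // 0.75 0.5 0.5
}
```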
Functions ¶
func Migrate ¶
Migrate creates the vector extension, table, and HNSW index if they do not exist. It is idempotent — safe to call multiple times without error. The created table has columns: id TEXT PK, embedding vector(Dims), content TEXT, metadata JSONB, created_at TIMESTAMPTZ.
To use the created table with New:
	cfg := pgvector.MigrateConfig{TableName: "goagent_embeddings", Dims: 1536}
	if err := pgvector.Migrate(ctx, db, cfg); err != nil { ... }
	store, err := pgvector.New(db, pgvector.TableConfig{
		Table:          cfg.TableName,
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata",
	})
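For readers who want to see roughly what such a migration involves, the sketch below builds DDL for the extension, the table with the columns listed above, and an HNSW index. The exact statements Migrate executes are not shown in this documentation, so migrationSQL, the index name, and the absence of NOT NULL or DEFAULT clauses are all assumptions. The CREATE EXTENSION and USING hnsw ... WITH (m, ef_construction) forms are standard pgvector syntax.

```go
package main

import "fmt"

// migrationSQL returns idempotent DDL comparable to what Migrate is described
// as creating. Hypothetical helper: table must already be a validated
// identifier, because identifiers cannot be passed as query parameters.
func migrationSQL(table string, dims, m, efConstruction int, opsClass string) []string {
	return []string{
		"CREATE EXTENSION IF NOT EXISTS vector",
		fmt.Sprintf("CREATE TABLE IF NOT EXISTS %s (id TEXT PRIMARY KEY, embedding vector(%d), content TEXT, metadata JSONB, created_at TIMESTAMPTZ)",
			table, dims),
		// The index name is an assumption chosen for illustration.
		fmt.Sprintf("CREATE INDEX IF NOT EXISTS %s_embedding_hnsw ON %s USING hnsw (embedding %s) WITH (m = %d, ef_construction = %d)",
			table, table, opsClass, m, efConstruction),
	}
}

func main() {
	// Documented defaults: Cosine operator class, m = 16, ef_construction = 64.
	for _, stmt := range migrationSQL("goagent_embeddings", 1536, 16, 64, "vector_cosine_ops") {
		fmt.Println(stmt + ";")
	}
}
```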
Example ¶
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent/memory/vector/pgvector"
	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	ctx := context.Background()
	db, err := sql.Open("pgx", "postgres://user:pass@localhost/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	cfg := pgvector.MigrateConfig{
		TableName: "goagent_embeddings",
		Dims:      1536, // match your embedding model (e.g. text-embedding-3-small)
	}
	if err := pgvector.Migrate(ctx, db, cfg); err != nil {
		log.Fatal(err)
	}

	store, err := pgvector.New(db, pgvector.TableConfig{
		Table:          cfg.TableName,
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata",
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = store
	fmt.Println("store ready")
}
Types ¶
type DistanceFunc ¶
type DistanceFunc struct {
	// contains filtered or unexported fields
}
DistanceFunc defines the vector distance operator used for similarity search and the corresponding HNSW operator class for index creation.
The DistanceFunc passed to New and the operator class used in Migrate must match. On a mismatch the PostgreSQL planner cannot use the index for the query's operator, and Search falls back to a sequential scan.
The three built-in values (Cosine, L2, InnerProduct) cover the most commonly used pgvector operators. For other operators (for example L1 distance, <+>, in newer pgvector releases) or future ones, use NewDistanceFunc.
func NewDistanceFunc ¶
func NewDistanceFunc(operator, opsClass string) DistanceFunc
NewDistanceFunc constructs a DistanceFunc for a custom or future pgvector operator. operator is the SQL infix operator (e.g. "<=>") and opsClass is the HNSW operator class (e.g. "vector_cosine_ops").
Use the built-in values (Cosine, L2, InnerProduct) for standard pgvector operators.
type MigrateConfig ¶
type MigrateConfig struct {
	// TableName is the name of the table to create. Default: "goagent_embeddings".
	TableName string

	// Dims is the vector dimension. Required. Must match the embedding model
	// used (768, 1024, 1536, etc.).
	Dims int

	// DistanceFunc sets the HNSW operator class for the index.
	// Must match the WithDistanceFunc option passed to New.
	// Default: Cosine.
	DistanceFunc DistanceFunc

	// HNSWm is the m parameter of the HNSW index (connections per node).
	// Common values: 8 (reduced memory), 16 (default), 32 (higher recall).
	// Default: 16.
	HNSWm int

	// HNSWefConstruction is the candidate set size during index construction.
	// Higher value = more accurate but slower to build.
	// Default: 64.
	HNSWefConstruction int
}
MigrateConfig configures the table that Migrate will create. Use this when the caller has no existing table and wants a quick start. For production, review the HNSW index parameters based on expected volume.
type Querier ¶
type Querier interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
	QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
}
Querier is the minimal interface the Store needs from a connection. Both *sql.DB and *sql.Tx satisfy it, as does any pgx adapter that wraps them.
type Store ¶
type Store struct {
	// contains filtered or unexported fields
}
Store implements goagent.VectorStore over PostgreSQL with the pgvector extension.
func New ¶
func New(db Querier, cfg TableConfig, opts ...StoreOption) (*Store, error)
New creates a Store backed by db using the given TableConfig and options. Returns an error if any required field is missing or contains invalid characters.
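Table and column names cannot be bound as query parameters, so they end up interpolated into SQL text, which is why validating them matters. The sketch below shows one plausible check. The accepted character set, the optional schema prefix rule, and the validateIdent name are assumptions; the package's actual rules are not documented here.

```go
package main

import (
	"fmt"
	"regexp"
)

// identRe accepts a plain identifier with an optional schema prefix, such as
// "embeddings" or "public.embeddings". This pattern is an assumption.
var identRe = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$`)

// validateIdent returns an error when a required identifier is missing or
// contains characters outside the assumed safe set.
func validateIdent(field, value string) error {
	if value == "" {
		return fmt.Errorf("%s is required", field)
	}
	if !identRe.MatchString(value) {
		return fmt.Errorf("%s contains invalid characters: %q", field, value)
	}
	return nil
}

func main() {
	fmt.Println(validateIdent("Table", "public.embeddings"))       // <nil>
	fmt.Println(validateIdent("Table", "embeddings; DROP TABLE x")) // error
	fmt.Println(validateIdent("IDColumn", ""))                     // error
}
```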
Example ¶
package main

import (
	"database/sql"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent/memory/vector/pgvector"
	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	db, err := sql.Open("pgx", "postgres://user:pass@localhost/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	store, err := pgvector.New(db, pgvector.TableConfig{
		Table:          "embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata", // optional; omit if your table has no metadata column
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = store
	fmt.Println("store created")
}
func (*Store) BulkDelete ¶ added in v0.5.4
BulkDelete removes all entries with the given ids in a single query using a parameterized IN clause. IDs that do not exist are silently ignored.
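A parameterized IN clause needs one numbered placeholder per id. The following sketch shows how such a statement could be assembled; buildBulkDelete is a hypothetical helper, and returning an empty query for an empty id list is an assumption made here because IN () is not valid SQL.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBulkDelete returns a DELETE statement with one $n placeholder per id,
// plus the matching argument slice. table and idColumn must already be
// validated identifiers. An empty ids slice yields an empty query.
func buildBulkDelete(table, idColumn string, ids []string) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	placeholders := make([]string, len(ids))
	args := make([]any, len(ids))
	for i, id := range ids {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = id
	}
	query := fmt.Sprintf("DELETE FROM %s WHERE %s IN (%s)",
		table, idColumn, strings.Join(placeholders, ", "))
	return query, args
}

func main() {
	query, args := buildBulkDelete("embeddings", "id", []string{"a", "b", "c"})
	fmt.Println(query)     // DELETE FROM embeddings WHERE id IN ($1, $2, $3)
	fmt.Println(len(args)) // 3
}
```

The ids travel as bound arguments rather than as SQL text, so their contents never need escaping.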
func (*Store) BulkUpsert ¶ added in v0.5.4
BulkUpsert stores or updates all entries in a single database transaction when the underlying connection supports BeginTx (e.g. *sql.DB). Otherwise entries are upserted sequentially with individual calls. The operation is idempotent; duplicate IDs within entries follow last-write-wins semantics.
func (*Store) Delete ¶
Delete removes the entry with the given id from the store. It is a no-op if id does not exist.
func (*Store) Search ¶
func (s *Store) Search(ctx context.Context, vec []float32, topK int, opts ...goagent.SearchOption) ([]goagent.ScoredMessage, error)
Search returns the topK messages most similar to vec, ordered by similarity descending. Each returned Message has RoleDocument so it is never forwarded to a provider.
WithFilter applies a JSONB containment filter (metadata @> filter::jsonb) server-side. Requires MetadataColumn to be set in TableConfig; silently ignored otherwise. topK is applied after the filter by the database, so fewer than topK results may be returned when the filter is selective. WithScoreThreshold is applied post-query in Go.
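The shape of the query implied by this description can be sketched as follows. The SQL that Search actually issues is not shown in this documentation, so searchSQL and its parameter layout are assumptions. The "[x,y,z]" text format for vectors and the ORDER BY <distance expression> LIMIT pattern are standard pgvector usage.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vectorLiteral renders vec in pgvector's text input format.
func vectorLiteral(vec []float32) string {
	parts := make([]string, len(vec))
	for i, v := range vec {
		parts[i] = strconv.FormatFloat(float64(v), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// searchSQL builds a top-k query. When filtered is true, the JSONB containment
// predicate is added before ranking, so LIMIT applies to the filtered rows.
// All identifiers must already be validated.
func searchSQL(table, idCol, vecCol, textCol, metaCol, operator string, filtered bool) string {
	dist := fmt.Sprintf("%s %s $1::vector", vecCol, operator)
	where, limit := "", "$2"
	if filtered {
		where = fmt.Sprintf(" WHERE %s @> $2::jsonb", metaCol)
		limit = "$3"
	}
	return fmt.Sprintf("SELECT %s, %s, %s AS distance FROM %s%s ORDER BY %s LIMIT %s",
		idCol, textCol, dist, table, where, dist, limit)
}

func main() {
	fmt.Println(vectorLiteral([]float32{0.1, 0.9, 0})) // [0.1,0.9,0]
	fmt.Println(searchSQL("embeddings", "id", "embedding", "content", "metadata", "<=>", true))
}
```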
Example (WithFilter) ¶
ExampleStore_Search_withFilter demonstrates metadata filtering using goagent.WithFilter.
The filter is applied server-side with PostgreSQL's JSONB containment operator (metadata @> filter::jsonb), so only rows whose metadata contains all specified key-value pairs are candidates for similarity ranking.
Scenario: a shared embedding table used by multiple teams. Each document is tagged with "team" and "language". A query for the engineering team must not surface documents owned by other teams, even if they are semantically close.
For large tables, create a GIN index to make the JSONB filter index-assisted:
CREATE INDEX ON goagent_embeddings USING gin(metadata jsonb_path_ops);
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/memory/vector/pgvector"
	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	ctx := context.Background()
	db, err := sql.Open("pgx", "postgres://user:pass@localhost/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	store, err := pgvector.New(db, pgvector.TableConfig{
		Table:          "goagent_embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata", // required for WithFilter
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upsert documents with team and language metadata.
	// In production this step runs at index time, not at query time.
	docs := []struct {
		id   string
		vec  []float32
		text string
		meta map[string]any
	}{
		{
			id:   "eng-go-001",
			vec:  []float32{0.1, 0.9, 0.0},
			text: "Error handling patterns in Go use explicit return values.",
			meta: map[string]any{"team": "engineering", "language": "go"},
		},
		{
			id:   "eng-py-001",
			vec:  []float32{0.1, 0.85, 0.05},
			text: "Python exceptions propagate up the call stack automatically.",
			meta: map[string]any{"team": "engineering", "language": "python"},
		},
		{
			id:   "mkt-001",
			vec:  []float32{0.15, 0.8, 0.1},
			text: "Our error-free delivery guarantee is central to our brand.",
			meta: map[string]any{"team": "marketing", "language": "en"},
		},
	}
	for _, d := range docs {
		msg := goagent.Message{
			Role:     goagent.RoleDocument,
			Content:  []goagent.ContentBlock{goagent.TextBlock(d.text)},
			Metadata: d.meta,
		}
		if err := store.Upsert(ctx, d.id, d.vec, msg); err != nil {
			log.Fatal(err)
		}
	}

	// Query embedding for "how do I handle errors".
	queryVec := []float32{0.1, 0.88, 0.02}

	// Without filter: all three documents are candidates.
	// With filter: only the Go engineering document qualifies, even though the
	// marketing document has a similar embedding.
	results, err := store.Search(ctx, queryVec, 5,
		goagent.WithFilter(map[string]any{
			"team":     "engineering",
			"language": "go",
		}),
	)
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range results {
		fmt.Printf("score=%.2f team=%s text=%s\n",
			r.Score,
			r.Message.Metadata["team"],
			r.Message.TextContent(),
		)
	}
}
Example (WithScoreThresholdAndFilter) ¶
ExampleStore_Search_withScoreThresholdAndFilter demonstrates combining a score threshold with a metadata filter.
The threshold is applied after the database returns topK results. The filter is applied server-side before ranking. Together they let you express: "give me the top 10 engineering docs, but only if they are at least 80% similar to my query".
package main

import (
	"context"
	"database/sql"
	"fmt"
	"log"

	"github.com/Germanblandin1/goagent"
	"github.com/Germanblandin1/goagent/memory/vector/pgvector"
	_ "github.com/jackc/pgx/v5/stdlib"
)

func main() {
	ctx := context.Background()
	db, err := sql.Open("pgx", "postgres://user:pass@localhost/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	store, err := pgvector.New(db, pgvector.TableConfig{
		Table:          "goagent_embeddings",
		IDColumn:       "id",
		VectorColumn:   "embedding",
		TextColumn:     "content",
		MetadataColumn: "metadata",
	})
	if err != nil {
		log.Fatal(err)
	}

	queryVec := []float32{0.1, 0.88, 0.02}
	results, err := store.Search(ctx, queryVec, 10,
		goagent.WithFilter(map[string]any{"team": "engineering"}),
		goagent.WithScoreThreshold(0.80), // discard results below 80% similarity
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("results above threshold: %d\n", len(results))
}
type StoreOption ¶
type StoreOption func(*storeOptions)
StoreOption configures the behaviour of a Store.
func WithDistanceFunc ¶
func WithDistanceFunc(d DistanceFunc) StoreOption
WithDistanceFunc sets the distance operator used in Search. Must match the operator class of the table's vector index. Default: Cosine.
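StoreOption follows Go's functional options pattern: each option is a function that mutates a private options struct whose fields start at their defaults. A minimal self-contained sketch of the pattern is below. The field name, the WithOperator option, and the resolve helper are hypothetical and only mirror the documented default of Cosine (<=>).

```go
package main

import "fmt"

// storeOptions holds settings that options may override.
type storeOptions struct {
	operator string
}

// StoreOption mutates storeOptions, mirroring the type shown above.
type StoreOption func(*storeOptions)

// WithOperator is a hypothetical option that overrides the distance operator.
func WithOperator(op string) StoreOption {
	return func(o *storeOptions) { o.operator = op }
}

// resolve starts from the defaults and applies each option in order,
// so later options win over earlier ones.
func resolve(opts ...StoreOption) storeOptions {
	o := storeOptions{operator: "<=>"} // default: cosine distance
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Println(resolve().operator)                    // <=>
	fmt.Println(resolve(WithOperator("<->")).operator) // <->
}
```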
type TableConfig ¶
type TableConfig struct {
	// Table is the table or view name. May include schema: "public.embeddings".
	Table string

	// IDColumn is the PRIMARY KEY column of type TEXT.
	IDColumn string

	// VectorColumn is the column of type vector(n) where n is the embedding
	// model dimension (e.g. 1536 for text-embedding-3-small, 768 for
	// nomic-embed-text, 1024 for voyage-3).
	VectorColumn string

	// TextColumn is the TEXT column containing the chunk text.
	// Its value is returned as a goagent.ContentBlock of type text in ScoredMessage.
	TextColumn string

	// MetadataColumn is optional. If non-empty, it must be a JSONB column.
	// Its content is deserialized into Message.Metadata of each ScoredMessage.
	MetadataColumn string
}
TableConfig describes the schema of the caller's vector table. No field has a default — the caller must be explicit. All fields are required except MetadataColumn.