embeddings

package
v0.33.0
Published: Dec 11, 2025 License: MIT Imports: 20 Imported by: 0

README

Embeddings Package

This package provides a Go client for the vector-db MCP server running in a Docker container. It's a translation of the Clojure namespace from test/embeddings/clj/vector_db_process.clj.

Overview

The embeddings package provides:

  1. Container Management - Automatically starts and manages the vector-db Docker container
  2. MCP Client - Connects to the vector database via the official Go MCP SDK
  3. Vector Operations - High-level functions for working with vector collections and embeddings

Features

  • Start/stop vector DB container automatically
  • MCP protocol communication via stdio
  • Collection management (create, delete, list)
  • Vector operations (add, delete, search)
  • Similarity search by cosine distance
  • Metadata support for vectors
  • Full type safety with Go
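The search feature ranks results by cosine distance. The README does not show how the vector-db server computes it, so the following is only a minimal sketch that assumes the conventional definition (1 minus cosine similarity). The function name cosineDistance is hypothetical and not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineDistance returns 1 - cos(theta) for two equal-length vectors.
// Hypothetical helper: it assumes the conventional definition of cosine
// distance and is not taken from the embeddings package or the server.
func cosineDistance(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(a), len(b))
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cosine distance is undefined for a zero vector")
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB)), nil
}

func main() {
	same, _ := cosineDistance([]float64{1, 0}, []float64{2, 0})
	orthogonal, _ := cosineDistance([]float64{1, 0}, []float64{0, 1})
	fmt.Printf("same direction: %.1f, orthogonal: %.1f\n", same, orthogonal)
}
```

Under this definition a smaller Distance in a search result means a closer match: 0 for vectors pointing the same way, 1 for orthogonal vectors.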

Usage

Basic Example

package main

import (
    "context"
    "log"

    "github.com/docker/mcp-gateway/pkg/gateway/embeddings"
)

func main() {
    ctx := context.Background()

    // Create client (this starts the container)
    // The dimension parameter specifies the vector dimension (1536 for OpenAI embeddings)
    client, err := embeddings.NewVectorDBClient(ctx, "./data", 1536, nil)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Create a collection
    _, err = client.CreateCollection(ctx, "my-vectors")
    if err != nil {
        log.Fatal(err)
    }

    // Add a vector (1536 dimensions for OpenAI embeddings)
    vector := make([]float64, 1536)
    for i := range vector {
        vector[i] = 0.1 // Your actual embedding values here
    }

    metadata := map[string]interface{}{
        "text": "This is my document",
        "source": "example.txt",
    }

    _, err = client.AddVector(ctx, "my-vectors", vector, metadata)
    if err != nil {
        log.Fatal(err)
    }

    // Search for similar vectors
    results, err := client.SearchVectors(ctx, vector, &embeddings.SearchOptions{
        CollectionName: "my-vectors",
        Limit: 10,
    })
    if err != nil {
        log.Fatal(err)
    }

    for _, result := range results {
        log.Printf("Match: ID=%d, Distance=%f, Metadata=%v\n",
            result.ID, result.Distance, result.Metadata)
    }
}

Collection Operations

// List all collections
collections, err := client.ListCollections(ctx)

// Delete a collection
_, err = client.DeleteCollection(ctx, "my-vectors")

Vector Operations

// Add a vector with metadata
metadata := map[string]interface{}{
    "title": "My Document",
    "category": "research",
}
result, err := client.AddVector(ctx, "collection-name", vector, metadata)

// Search with options
results, err := client.SearchVectors(ctx, queryVector, &embeddings.SearchOptions{
    CollectionName: "my-vectors",  // Search in specific collection
    Limit: 20,                       // Return top 20 results
})

// Search across multiple collections (exclude some).
// Plain assignment here: results and err are already declared above.
results, err = client.SearchVectors(ctx, queryVector, &embeddings.SearchOptions{
    ExcludeCollections: []string{"test-data"},
    Limit: 10,
})

// Delete a vector by ID (an int64, such as the ID field of a SearchResult)
_, err = client.DeleteVector(ctx, vectorID)

Advanced: Direct Tool Access

// List available MCP tools
tools, err := client.ListTools(ctx)

// Call any tool directly
result, err := client.CallTool(ctx, "tool-name", map[string]interface{}{
    "param1": "value1",
    "param2": 123,
})
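
For orientation, a direct tool call travels over the stdio transport as a JSON-RPC 2.0 request whose method is tools/call, as defined by the MCP specification. The sketch below only illustrates that message shape. The MCP Go SDK builds and frames the real request, and the types and the buildToolCall helper here are hypothetical stand-ins rather than SDK or package APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams mirrors the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// rpcRequest is a minimal JSON-RPC 2.0 request envelope (illustrative only).
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// buildToolCall encodes a tools/call request for the given tool and arguments.
// Hypothetical helper: the SDK performs this step internally.
func buildToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	msg, err := buildToolCall(1, "tool-name", map[string]any{
		"param1": "value1",
		"param2": 123,
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(msg))
}
```

Use ListTools to discover the real tool names and argument schemas; "tool-name", "param1", and "param2" are the same placeholders the README uses.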

Key Differences from Clojure Version

  1. Simplified API: Uses CommandTransport instead of manual pipe management
  2. Automatic Initialization: MCP initialization happens during Connect()
  3. Strong Typing: Uses Go structs instead of dynamic maps
  4. Error Handling: Explicit error returns instead of Clojure's exception model
  5. Concurrency: Uses sync.Mutex instead of Clojure's core.async channels

Vector Database Details

  • Image: jimclark106/vector-db:latest
  • Vector Dimension: Configurable via the dimension parameter (default: 1536 for OpenAI embeddings)
    • Pass 0 or a negative value to use the default (1536)
    • Common dimensions: 1536 (OpenAI), 768 (sentence transformers), 384 (MiniLM)
  • Database: SQLite with vec extension
  • Transport: stdio (JSON-RPC over stdin/stdout)
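
The dimension rule above (0 or a negative value selects 1536) can be sketched as follows. This is an illustration of the documented behavior, not the package's code: effectiveDimension and checkVector are hypothetical helpers, and whether the client validates vector length before calling the server is an assumption.

```go
package main

import "fmt"

// defaultDimension is the documented default (OpenAI embedding size).
const defaultDimension = 1536

// effectiveDimension applies the documented rule: 0 or a negative value
// falls back to the default. Hypothetical helper for illustration.
func effectiveDimension(requested int) int {
	if requested <= 0 {
		return defaultDimension
	}
	return requested
}

// checkVector reports an error when a vector's length does not match the
// configured dimension. Client-side validation is an assumption here.
func checkVector(vector []float64, dimension int) error {
	want := effectiveDimension(dimension)
	if len(vector) != want {
		return fmt.Errorf("vector has %d values, want %d", len(vector), want)
	}
	return nil
}

func main() {
	fmt.Println(effectiveDimension(0))   // default applies
	fmt.Println(effectiveDimension(768)) // explicit value is kept
	fmt.Println(checkVector(make([]float64, 384), 384))
	fmt.Println(checkVector(make([]float64, 384), 0))
}
```

A check like this before AddVector or SearchVectors could surface a mismatched embedding model early instead of as a server-side error.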

Requirements

  • Docker daemon running
  • Go 1.24+
  • The official MCP Go SDK (github.com/modelcontextprotocol/go-sdk/mcp)

Architecture

┌─────────────────┐
│  Your Go App    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ VectorDBClient  │  (this package)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  MCP Client     │  (go-sdk/mcp)
└────────┬────────┘
         │ stdio
         ▼
┌─────────────────┐
│ Docker Container│  (jimclark106/vector-db)
└─────────────────┘

Documentation

Overview

Example

Example demonstrates how to use the vector DB client

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/mcp-gateway/pkg/gateway/embeddings"
)

func main() {
	ctx := context.Background()

	// Create a client which starts the vector DB container
	client, err := embeddings.NewVectorDBClient(ctx, "./data", 1536, func(msg string) {
		fmt.Println(msg)
	})
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}
	defer client.Close()

	// Check if container is alive
	if !client.IsAlive() {
		log.Fatal("Container is not running")
	}

	// List available tools (connection is already initialized)
	toolsResult, err := client.ListTools(ctx)
	if err != nil {
		log.Fatalf("Failed to list tools: %v", err)
	}
	fmt.Printf("Available tools: %d\n", len(toolsResult.Tools))

	// Create a collection
	_, err = client.CreateCollection(ctx, "my-collection")
	if err != nil {
		log.Fatalf("Failed to create collection: %v", err)
	}

	// List collections
	collections, err := client.ListCollections(ctx)
	if err != nil {
		log.Fatalf("Failed to list collections: %v", err)
	}
	fmt.Printf("Collections: %v\n", collections)

	// Add a vector (1536 dimensions)
	sampleVector := make([]float64, 1536)
	for i := range sampleVector {
		sampleVector[i] = 0.1
	}
	metadata := map[string]any{
		"name": "test-doc",
	}
	_, err = client.AddVector(ctx, "my-collection", sampleVector, metadata)
	if err != nil {
		log.Fatalf("Failed to add vector: %v", err)
	}

	// Search for similar vectors
	results, err := client.SearchVectors(ctx, sampleVector, &embeddings.SearchOptions{
		CollectionName: "my-collection",
		Limit:          5,
	})
	if err != nil {
		log.Fatalf("Failed to search vectors: %v", err)
	}
	fmt.Printf("Search results: %d\n", len(results))
	for _, result := range results {
		fmt.Printf("  ID: %d, Distance: %f, Collection: %s\n",
			result.ID, result.Distance, result.Collection)
	}

	// Delete a vector by ID
	if len(results) > 0 {
		_, err = client.DeleteVector(ctx, results[0].ID)
		if err != nil {
			log.Fatalf("Failed to delete vector: %v", err)
		}
	}

	// Delete a collection
	_, err = client.DeleteCollection(ctx, "my-collection")
	if err != nil {
		log.Fatalf("Failed to delete collection: %v", err)
	}
}

Example (LongRunning)

Example_longRunning demonstrates waiting for container completion

package main

import (
	"context"
	"log"

	"github.com/docker/mcp-gateway/pkg/gateway/embeddings"
)

func main() {
	ctx := context.Background()

	client, err := embeddings.NewVectorDBClient(ctx, "./data", 1536, nil)
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}

	// In a separate goroutine, wait for container to exit
	go func() {
		if err := client.Wait(); err != nil {
			log.Printf("Container exited with error: %v", err)
		} else {
			log.Println("Container exited successfully")
		}
	}()

	// Do work with the client (already initialized)...
	// For example: client.ListCollections(ctx), client.SearchVectors(ctx, ...), etc.

	// When done, close the client (which stops the container)
	if err := client.Close(); err != nil {
		log.Printf("Failed to close client: %v", err)
	}
}

Example (WithTimeout)

Example_withTimeout demonstrates usage with context timeouts

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/docker/mcp-gateway/pkg/gateway/embeddings"
)

func main() {
	// Create a context with timeout
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	// Create client with the timeout context
	client, err := embeddings.NewVectorDBClient(ctx, "./data", 1536, nil)
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}
	defer client.Close()

	// Perform operations (connection is already initialized)
	collections, err := client.ListCollections(ctx)
	if err != nil {
		log.Fatalf("Failed to list collections: %v", err)
	}
	fmt.Printf("Collections: %v\n", collections)
}

Functions

func Pull

func Pull(ctx context.Context) error

Pull downloads the embeddings OCI artifact, extracts it to a temporary directory, and copies the vectors.db directory to ~/.docker/mcp if it does not already exist there.

Example usage:

go run ./examples/embeddings/pull.go

func Push

func Push(ctx context.Context, vectorDBPath string, ociRef string) error

Push creates an OCI artifact containing the vector database directory and pushes it to the specified OCI reference. The directory will always be named "vectors.db" in the OCI artifact regardless of the source directory name.

Example usage:

go run ./examples/embeddings/push.go ~/.docker/mcp/vectors.db jimclark106/embeddings:v1.0

Types

type Collection

type Collection struct {
	Name string `json:"name"`
}

Collection represents a vector collection

type SearchArgs

type SearchArgs struct {
	Vector             []float64 `json:"vector"`
	CollectionName     string    `json:"collection_name,omitempty"`
	ExcludeCollections []string  `json:"exclude_collections,omitempty"`
	Limit              int       `json:"limit,omitempty"`
}

SearchArgs combines search options with the vector for the search tool call

type SearchOptions

type SearchOptions struct {
	CollectionName     string   // Search only within this collection
	ExcludeCollections []string // Collections to exclude from search
	Limit              int      // Maximum number of results (default 10)
}

SearchOptions contains options for vector search
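
SearchArgs and SearchOptions carry the same fields apart from the vector, so one plausible way to combine them is sketched below. The types are copied from the declarations above so the block stands alone. The toSearchArgs helper is hypothetical, and applying the default limit of 10 on the client side (rather than on the server) is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchOptions and SearchArgs are local copies of the documented types.
type SearchOptions struct {
	CollectionName     string
	ExcludeCollections []string
	Limit              int
}

type SearchArgs struct {
	Vector             []float64 `json:"vector"`
	CollectionName     string    `json:"collection_name,omitempty"`
	ExcludeCollections []string  `json:"exclude_collections,omitempty"`
	Limit              int       `json:"limit,omitempty"`
}

// defaultLimit follows the "default 10" note on SearchOptions.Limit.
const defaultLimit = 10

// toSearchArgs merges a query vector with optional search options.
// Hypothetical helper: a nil options pointer or a non-positive limit
// falls back to defaultLimit.
func toSearchArgs(vector []float64, opts *SearchOptions) SearchArgs {
	args := SearchArgs{Vector: vector, Limit: defaultLimit}
	if opts == nil {
		return args
	}
	args.CollectionName = opts.CollectionName
	args.ExcludeCollections = opts.ExcludeCollections
	if opts.Limit > 0 {
		args.Limit = opts.Limit
	}
	return args
}

func main() {
	args := toSearchArgs([]float64{0.1, 0.2}, &SearchOptions{CollectionName: "my-vectors"})
	out, err := json.Marshal(args)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because of the omitempty tags, an unset CollectionName or ExcludeCollections is left out of the encoded arguments entirely.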

type SearchResult

type SearchResult struct {
	ID           int64          `json:"id"`
	Collection   string         `json:"collection"`
	Distance     float64        `json:"distance"`
	Metadata     map[string]any `json:"metadata"`
	VectorLength int            `json:"vector_length"`
}

SearchResult represents a single search result
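
The JSON tags on SearchResult suggest how a result is decoded. The sketch below unmarshals a hand-written payload into the documented struct. The payload is invented for illustration and the assumption that the server returns a JSON array of such objects is not confirmed by this page; decodeResults is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchResult is a local copy of the documented type.
type SearchResult struct {
	ID           int64          `json:"id"`
	Collection   string         `json:"collection"`
	Distance     float64        `json:"distance"`
	Metadata     map[string]any `json:"metadata"`
	VectorLength int            `json:"vector_length"`
}

// decodeResults parses a JSON array of search results.
// Hypothetical helper; the real wire format may differ.
func decodeResults(payload []byte) ([]SearchResult, error) {
	var results []SearchResult
	if err := json.Unmarshal(payload, &results); err != nil {
		return nil, fmt.Errorf("decode search results: %w", err)
	}
	return results, nil
}

func main() {
	// Invented sample payload, shaped after the struct tags above.
	payload := []byte(`[{"id":7,"collection":"my-vectors","distance":0.25,` +
		`"metadata":{"source":"example.txt"},"vector_length":1536}]`)

	results, err := decodeResults(payload)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, r := range results {
		fmt.Printf("ID=%d Collection=%s Distance=%.2f Source=%v\n",
			r.ID, r.Collection, r.Distance, r.Metadata["source"])
	}
}
```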

type VectorDBClient

type VectorDBClient struct {
	// contains filtered or unexported fields
}

VectorDBClient wraps the MCP client connection to the vector DB server

func NewVectorDBClient

func NewVectorDBClient(ctx context.Context, dataDir string, dimension int, logFunc func(string)) (*VectorDBClient, error)

NewVectorDBClient creates a new MCP client and starts the vector DB container. The dataDir parameter specifies where the vector database stores its data. The dimension parameter specifies the vector dimension; pass 0 or a negative value to use the default of 1536, which matches OpenAI embeddings. The logFunc parameter is optional (it may be nil) and can be used to log MCP messages.

func (*VectorDBClient) AddVector

func (c *VectorDBClient) AddVector(ctx context.Context, collectionName string, vector []float64, metadata map[string]any) (*mcp.CallToolResult, error)

AddVector adds a vector to a collection, creating the collection if it doesn't exist. The vector's length must match the dimension the client was created with (1536 by default). Metadata is optional.

func (*VectorDBClient) CallTool

func (c *VectorDBClient) CallTool(ctx context.Context, toolName string, arguments any) (*mcp.CallToolResult, error)

CallTool calls a tool on the MCP server with the given name and arguments. The arguments parameter accepts any type - the MCP SDK handles JSON marshaling.

func (*VectorDBClient) Close

func (c *VectorDBClient) Close() error

Close closes the MCP client session and stops the Docker container

func (*VectorDBClient) CreateCollection

func (c *VectorDBClient) CreateCollection(ctx context.Context, collectionName string) (*mcp.CallToolResult, error)

CreateCollection creates a new vector collection

func (*VectorDBClient) DeleteCollection

func (c *VectorDBClient) DeleteCollection(ctx context.Context, collectionName string) (*mcp.CallToolResult, error)

DeleteCollection deletes a collection and all its vectors

func (*VectorDBClient) DeleteVector

func (c *VectorDBClient) DeleteVector(ctx context.Context, vectorID int64) (*mcp.CallToolResult, error)

DeleteVector deletes a vector by its ID

func (*VectorDBClient) IsAlive

func (c *VectorDBClient) IsAlive() bool

IsAlive checks if the container process is still running

func (*VectorDBClient) ListCollections

func (c *VectorDBClient) ListCollections(ctx context.Context) ([]string, error)

ListCollections lists all vector collections in the database. Returns a slice of collection names.

func (*VectorDBClient) ListTools

func (c *VectorDBClient) ListTools(ctx context.Context) (*mcp.ListToolsResult, error)

ListTools lists available tools from the MCP server

func (*VectorDBClient) SearchVectors

func (c *VectorDBClient) SearchVectors(ctx context.Context, vector []float64, options *SearchOptions) ([]SearchResult, error)

SearchVectors searches for similar vectors using cosine distance. The vector's length must match the dimension the client was created with (1536 by default). Returns a slice of search results.

func (*VectorDBClient) Session

func (c *VectorDBClient) Session() *mcp.ClientSession

Session returns the MCP client session

func (*VectorDBClient) Wait

func (c *VectorDBClient) Wait() error

Wait waits for the container to exit and returns any error
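
Wait takes no context, so a caller that wants to bound the wait has to add its own timeout. One way to do that is sketched below against a small interface that *VectorDBClient would satisfy through its Wait method. The waiter interface, waitWithTimeout, errWaitTimeout, and fakeWaiter are all hypothetical names introduced for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waiter is satisfied by any type with Wait() error, including a client
// exposing the Wait method documented above.
type waiter interface {
	Wait() error
}

// errWaitTimeout is returned when the timeout elapses first.
var errWaitTimeout = errors.New("timed out waiting for container exit")

// Durations used by the demo below.
const (
	shortDelay = 10 * time.Millisecond
	longDelay  = time.Second
)

// waitWithTimeout runs w.Wait in a goroutine and returns its result,
// or errWaitTimeout if it does not finish within the timeout.
func waitWithTimeout(w waiter, timeout time.Duration) error {
	// Buffered so the goroutine can still send and exit after a timeout.
	done := make(chan error, 1)
	go func() { done <- w.Wait() }()

	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return errWaitTimeout
	}
}

// fakeWaiter stands in for a client; it "exits" after a fixed delay.
type fakeWaiter struct{ delay time.Duration }

func (f fakeWaiter) Wait() error {
	time.Sleep(f.delay)
	return nil
}

func main() {
	fmt.Println(waitWithTimeout(fakeWaiter{delay: shortDelay}, longDelay))
	fmt.Println(waitWithTimeout(fakeWaiter{delay: longDelay}, shortDelay))
}
```

A timeout here only stops the caller from blocking; the container keeps running until Close is called.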
