vectorfs

package v0.0.0-...-668ef5a

Published: Mar 11, 2026 License: Apache-2.0 Imports: 24 Imported by: 0
VectorFS Plugin

Document Vector Search Plugin for AGFS with S3 storage and TiDB Cloud vector indexing.

Overview

VectorFS provides semantic search capabilities for documents by combining:

  • S3 for scalable document storage
  • TiDB Cloud vector index for fast similarity search using HNSW algorithm
  • OpenAI embeddings (default) for generating vector representations

Features

  • Automatic Indexing: Documents are automatically indexed when written (async with worker pool)
  • Deduplication: Same content (same SHA256 digest) won't be indexed twice
  • Semantic Search: Use the standard grep command for vector similarity search
  • Document Retrieval: Read original documents with the cat command
  • Subdirectory Support: Organize documents in nested folders
  • Batch Copy: Copy entire folders with the cp -r command
  • Scalable Storage: S3-backed document storage
  • Fast Vector Search: TiDB Cloud's HNSW index with >90% recall rate
  • Document Chunking: Smart chunking by paragraphs and sentences
  • Multiple Namespaces: Isolate documents by project/namespace
  • Similarity Scores: Search results include distance and relevance scores

Directory Structure

/vectorfs/
  README                    - Documentation
  <namespace>/              - Project/namespace directory
    docs/                   - Document directory (auto-indexed)
      file1.txt             - Root-level document
      subfolder/            - Subdirectory (virtual)
        file2.txt           - Nested document
        deep/file3.txt      - Deeply nested document
    .indexing               - Indexing status (virtual file, read-only)

Note:

  • Subdirectories under docs/ are virtual - they don't need to be created explicitly. Just write files with paths like docs/guides/tutorial.txt and the directory structure is maintained in metadata.
  • The .indexing file is a virtual read-only status file. Currently returns "idle" as a placeholder. Future versions will show real-time worker pool status.

Configuration

YAML Configuration
plugins:
  vectorfs:
    enabled: true
    path: /vectorfs
    config:
      # S3 Storage Configuration
      s3_bucket: my-document-bucket
      s3_key_prefix: vectorfs # Optional, default: vectorfs
      s3_region: us-east-1 # Optional, default: us-east-1
      s3_access_key: AKIAXXXXXXXX # Optional, uses IAM role if not provided
      s3_secret_key: secret # Optional
      s3_endpoint: "" # Optional, for custom S3-compatible services

      # TiDB Cloud Configuration
      tidb_dsn: "user:password@tcp(gateway01.us-west-2.prod.aws.tidbcloud.com:4000)/dbname?tls=true"

      # Embedding Configuration
      embedding_provider: openai # Default: openai
      openai_api_key: sk-xxxxxxxxxxxxxxxx
      embedding_model: text-embedding-3-small # Default: text-embedding-3-small
      embedding_dim: 1536 # Default: 1536

      # Chunking Configuration (Optional)
      chunk_size: 512 # Default: 512 tokens
      chunk_overlap: 50 # Default: 50 tokens

      # Worker Pool Configuration (Optional)
      index_workers: 4 # Default: 4 concurrent workers
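
Taken together, the optional keys above imply a set of defaults. As a rough illustration of how they might be filled in (applyDefaults is a hypothetical helper, not part of the plugin; the real Initialize may differ):

```go
package main

import "fmt"

// applyDefaults fills in the optional defaults documented above.
// Purely illustrative; the plugin's actual config handling may differ.
func applyDefaults(cfg map[string]interface{}) map[string]interface{} {
	defaults := map[string]interface{}{
		"s3_key_prefix":      "vectorfs",
		"s3_region":          "us-east-1",
		"embedding_provider": "openai",
		"embedding_model":    "text-embedding-3-small",
		"embedding_dim":      1536,
		"chunk_size":         512,
		"chunk_overlap":      50,
		"index_workers":      4,
	}
	for k, v := range defaults {
		if _, ok := cfg[k]; !ok {
			cfg[k] = v
		}
	}
	return cfg
}

func main() {
	cfg := applyDefaults(map[string]interface{}{"s3_bucket": "my-document-bucket"})
	fmt.Println(cfg["s3_region"], cfg["index_workers"])
}
```

Required keys (s3_bucket, tidb_dsn, openai_api_key) have no defaults and must be supplied explicitly.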
TiDB Cloud Setup
  1. Create a TiDB Cloud cluster (Serverless or Dedicated)
  2. Enable TiFlash (required for vector search)
  3. Get the connection string (DSN) from cluster details
  4. Tables will be created automatically when you create a namespace
S3 Setup
  1. Create an S3 bucket (or use S3-compatible service like MinIO)
  2. Configure access credentials (IAM role recommended for production)
  3. Documents will be stored as: s3://bucket/vectorfs/<namespace>/<digest>

Usage

1. Create a Namespace (Project)
agfs:/> mkdir /vectorfs/my_project

This creates TiDB tables:

  • tbl_meta_my_project - File metadata
  • tbl_chunks_my_project - Document chunks with vector embeddings
2. Write Documents

Documents are automatically indexed when written to the docs/ directory:

# Write a single file
agfs:/> echo "How to deploy applications..." > /vectorfs/my_project/docs/deployment.txt

# Write to subdirectory (virtual subdirectories)
agfs:/> echo "Kubernetes guide" > /vectorfs/my_project/docs/guides/kubernetes.txt
agfs:/> echo "Docker tutorial" > /vectorfs/my_project/docs/tutorials/docker.txt

What happens:

  1. Write operation returns immediately (~8ms)
  2. Indexing happens asynchronously in background worker pool:
    • SHA256 digest calculated
    • Document uploaded to S3
    • Text split into chunks (~512 tokens)
    • Embeddings generated via OpenAI API
    • Chunks and embeddings stored in TiDB
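
The write path above can be sketched in Go. This is a simplified stand-in (digestOf and runWorkers are illustrative names; the real workers do the S3 upload, chunking, embedding, and TiDB insert):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

type indexTask struct {
	name, digest, content string
}

// digestOf computes the SHA256 content digest used for deduplication.
func digestOf(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// runWorkers drains the queue with the given number of workers,
// standing in for the S3 upload / chunk / embed / TiDB store phase.
func runWorkers(queue chan indexTask, workers int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				// placeholder: upload to S3, chunk, embed, store in TiDB
				fmt.Printf("indexed %s (%s…)\n", t.name, t.digest[:8])
			}
		}()
	}
	wg.Wait()
}

func main() {
	queue := make(chan indexTask, 100) // queue capacity from the docs
	content := "How to deploy applications..."
	queue <- indexTask{"deployment.txt", digestOf(content), content}
	// the write call would return to the user here (~8ms)
	close(queue)
	runWorkers(queue, 4) // 4 workers by default
}
```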

Copy entire folders:

# Copy multiple files and folders
agfs:/> cp -r /s3fs/mybucket/docs /vectorfs/my_project/docs/imported
3. Search Documents

Use the standard grep command for semantic search:

agfs:/> grep "deployment strategies" /vectorfs/my_project/docs

# Or use agfs-shell's fsgrep command
$ fsgrep -r "deployment strategies" /vectorfs/my_project/docs

Returns:

{
  "matches": [
    {
      "file": "/vectorfs/my_project/docs/deployment.txt",
      "line": 1,
      "content": "How to deploy applications using blue-green strategy...",
      "metadata": {
        "distance": 0.234,
        "score": 0.766
      }
    },
    {
      "file": "/vectorfs/my_project/docs/kubernetes.txt",
      "line": 3,
      "content": "Kubernetes deployment strategies include rolling updates...",
      "metadata": {
        "distance": 0.412,
        "score": 0.588
      }
    }
  ],
  "count": 2
}

Similarity scores:

  • distance: Cosine distance (0.0 = identical, 1.0 = orthogonal/unrelated; up to 2.0 for opposed vectors)
  • score: Relevance score (1.0 - distance, higher is better)

The search uses cosine distance in TiDB's vector index to find semantically similar chunks.
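
For intuition, the distance/score relationship can be reproduced client-side. cosineDistance below mirrors the formula behind VEC_COSINE_DISTANCE for illustration only; it is not the plugin's code:

```go
package main

import (
	"fmt"
	"math"
)

// cosineDistance computes 1 - (a·b)/(|a||b|), the formula behind
// TiDB's VEC_COSINE_DISTANCE. Identical directions give 0,
// orthogonal vectors 1, opposite vectors 2 — so score = 1 - distance
// can in principle go negative.
func cosineDistance(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

func main() {
	q := []float64{1, 0}
	fmt.Println(cosineDistance(q, []float64{1, 0}))     // identical: distance 0
	fmt.Println(cosineDistance(q, []float64{0, 1}))     // orthogonal: distance 1
	fmt.Println(1 - cosineDistance(q, []float64{1, 1})) // score ≈ 0.707
}
```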

4. Read Documents

Read original document content from S3:

# Read a file
agfs:/> cat /vectorfs/my_project/docs/deployment.txt

# Read from subdirectory
agfs:/> cat /vectorfs/my_project/docs/guides/kubernetes.txt

Documents are retrieved from S3 using the file's digest and returned with their original content.

5. List Documents
agfs:/> ls /vectorfs/my_project/docs
deployment.txt
kubernetes.txt
architecture.md
guides/
tutorials/

# List subdirectory
agfs:/> ls /vectorfs/my_project/docs/guides
kubernetes.txt
getting-started.md
6. Check Indexing Status

Each namespace has a virtual .indexing file that shows background indexing status:

agfs:/> cat /vectorfs/my_project/.indexing
idle

Current Status: This file currently returns idle as a placeholder. Since indexing happens asynchronously in a worker pool, documents may still be processing in the background even when showing "idle".

Future Enhancement: Will show real-time worker pool statistics:

  • Queue depth (pending documents)
  • Active workers processing
  • Indexing rate and completion status

Note: With async indexing, there may be a short delay (typically 1-15 seconds depending on file size) between writing a file and it being searchable. Large files (>20KB) with many chunks take longer to index.

Architecture

Data Flow
User writes file
      ↓
  Calculate SHA256 digest
      ↓
  Submit to index queue → Return immediately (~8ms)
      ↓
Worker pool (4 workers by default) processes async:
      ↓
  Upload to S3 (s3://bucket/vectorfs/<namespace>/<digest>)
      ↓
  Chunk document (paragraphs → sentences)
      ↓
  Generate embeddings (OpenAI API, batch)
      ↓
  Store in TiDB:
    - tbl_meta_<namespace> (file metadata)
    - tbl_chunks_<namespace> (chunks + vector embeddings)
Vector Search Flow
User runs grep
      ↓
  Generate query embedding (OpenAI API)
      ↓
  TiDB vector search:
    SELECT ... ORDER BY VEC_COSINE_DISTANCE(embedding, <query>) LIMIT 10
      ↓
  Return matching chunks as GrepMatch format
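
The search SQL above needs the query embedding serialized for TiDB. A hedged sketch (vectorLiteral and searchQuery are illustrative helpers; the bracketed string form for VECTOR values is an assumption about the wire format, and the plugin's actual serialization may differ):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vectorLiteral serializes an embedding as a bracketed string,
// the textual form TiDB accepts for VECTOR values (assumed here
// for illustration).
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = strconv.FormatFloat(float64(f), 'g', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// searchQuery builds the per-namespace similarity query from the
// flow above; the embedding would be bound as the ? parameter.
func searchQuery(namespace string, limit int) string {
	return fmt.Sprintf(
		"SELECT file_digest, chunk_index, chunk_text,"+
			" VEC_COSINE_DISTANCE(embedding, ?) AS distance"+
			" FROM tbl_chunks_%s ORDER BY distance LIMIT %d",
		namespace, limit)
}

func main() {
	fmt.Println(searchQuery("my_project", 10))
	fmt.Println(vectorLiteral([]float32{0.1, 0.25}))
}
```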

Database Schema

File Metadata Table
CREATE TABLE tbl_meta_<namespace> (
    file_digest VARCHAR(64) PRIMARY KEY,
    file_name VARCHAR(1024) NOT NULL,
    s3_key VARCHAR(1024) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_file_name (file_name)
);
Chunks Table with Vector Index
CREATE TABLE tbl_chunks_<namespace> (
    chunk_id BIGINT AUTO_INCREMENT PRIMARY KEY,
    file_digest VARCHAR(64) NOT NULL,
    chunk_index INT NOT NULL,
    chunk_text TEXT NOT NULL,
    embedding VECTOR(1536) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_file_digest (file_digest),
    VECTOR INDEX idx_embedding ((VEC_COSINE_DISTANCE(embedding)))
);

Performance Considerations

Write Performance
  • Write Response: ~8ms (immediate return, async indexing)
  • Worker Pool: 4 concurrent workers (configurable)
  • Queue Capacity: 100 pending tasks
Indexing Performance (Background)
  • Embedding API: ~100-200ms per batch (OpenAI)
  • TiDB Insert: ~10-50ms per chunk
  • S3 Upload: ~50-200ms per document
  • Large Files: 26KB file (~169 chunks) completes in ~15 seconds

Benefits of async indexing:

  • No timeout issues with cp -r for large folders
  • User operations never blocked
  • Controlled concurrency prevents API rate limits
Search Performance
  • Query Embedding: ~100ms (OpenAI)
  • Vector Search: ~10-50ms (TiDB HNSW index)
  • Total: ~150ms for typical search

TiDB Cloud vector search maintains >90% recall rate with HNSW indexing.

Cost Estimation

OpenAI Embeddings
  • Model: text-embedding-3-small
  • Cost: ~$0.02 per 1M tokens
  • Example: 100 documents × 1000 words ≈ 130K tokens ≈ $0.003
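
The arithmetic behind that example, assuming roughly 1.3 tokens per English word:

```go
package main

import "fmt"

// embeddingCostUSD estimates embedding cost at ~$0.02 per 1M tokens
// (text-embedding-3-small), assuming ~1.3 tokens per word.
func embeddingCostUSD(words float64) float64 {
	tokens := words * 1.3
	return tokens / 1_000_000 * 0.02
}

func main() {
	// 100 documents × 1000 words ≈ 130K tokens
	fmt.Printf("$%.4f\n", embeddingCostUSD(100*1000))
}
```
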
TiDB Cloud
  • Serverless: Pay per use (RU consumption)
  • Dedicated: Fixed monthly cost based on cluster size
S3 Storage
  • Standard storage: ~$0.023 per GB/month
  • Example: 1000 documents × 10KB ≈ 10MB ≈ $0.0002/month

Limitations

  1. No Updates: Updating documents creates a new version (different digest). Old versions remain in S3 and TiDB.

  2. Deletion: Not yet implemented. Use direct TiDB/S3 operations to clean up.

  3. Single Embedding Provider: Only OpenAI is supported currently.

  4. TiFlash Required: TiDB Cloud cluster must have TiFlash enabled for vector search.

  5. Indexing Visibility: The .indexing status file is currently a placeholder (always shows "idle"). No API yet to check:

    • Whether a specific file has been indexed
    • Real-time queue depth or worker status
    • Indexing progress or completion percentage

Troubleshooting

"failed to connect to TiDB"
  • Verify DSN connection string
  • Ensure TLS is enabled for TiDB Cloud: ?tls=true
  • Check network connectivity and firewall rules
"failed to initialize S3 client"
  • Verify AWS credentials or IAM role
  • Check bucket name and region
  • For custom endpoints, ensure s3_endpoint is correct
"failed to generate embeddings"
  • Verify OpenAI API key is valid
  • Check API rate limits and quotas
  • Ensure network access to api.openai.com
"vector search returns no results"
  • Verify documents have been indexed (ls /vectorfs/<namespace>/docs)
  • Check TiFlash is enabled on TiDB Cloud cluster
  • Try broader search queries
"file not appearing in search results immediately"
  • Indexing happens asynchronously in background worker pool
  • Small files (< 5KB): typically indexed within 1-3 seconds
  • Large files (> 20KB): may take 10-15+ seconds to complete indexing
  • Check server logs for indexing completion: grep "Successfully indexed" /var/log/agfs.log
  • The .indexing status file currently doesn't show real-time status (placeholder)
  • Workaround: Wait a few seconds after writing, then search again

Example: Complete Workflow

# 1. Create namespace
mkdir /vectorfs/tech_docs

# 2. Add documents
echo "Kubernetes is a container orchestration platform..." > /vectorfs/tech_docs/docs/k8s.txt
echo "Docker provides containerization for applications..." > /vectorfs/tech_docs/docs/docker.txt
echo "Terraform enables infrastructure as code..." > /vectorfs/tech_docs/docs/terraform.txt

# 3. Search
grep "container management" /vectorfs/tech_docs/docs

# Returns semantically similar results:
# - k8s.txt (mentions container orchestration)
# - docker.txt (mentions containerization)

Future Enhancements

  • Real-time indexing status in .indexing file (queue depth, active workers, completion %)
  • Per-file indexing status API (check if specific file has been indexed)
  • Document update/delete operations
  • Multiple embedding providers (Cohere, Hugging Face, etc.)
  • Hybrid search (vector + keyword)
  • Metadata filtering in search
  • Configurable top-K results
  • Re-indexing support
  • Priority queue for indexing tasks

License

Apache 2.0

Documentation

Index

Constants

const (
	PluginName = "vectorfs"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Chunk

type Chunk struct {
	Text  string
	Index int
}

Chunk represents a text chunk

func ChunkDocument

func ChunkDocument(text string, cfg ChunkerConfig) []Chunk

ChunkDocument splits a document into chunks
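
The README describes chunking by paragraphs and then sentences. A simplified, character-based sketch of the paragraph stage (not the package's actual token-based algorithm with overlap):

```go
package main

import (
	"fmt"
	"strings"
)

// chunkByParagraph packs whole paragraphs into chunks of at most
// maxChars characters, flushing a chunk when the next paragraph
// would overflow it. Illustrative only; ChunkDocument works in
// tokens and also splits oversized paragraphs by sentence.
func chunkByParagraph(text string, maxChars int) []string {
	var chunks []string
	var cur strings.Builder
	for _, para := range strings.Split(text, "\n\n") {
		para = strings.TrimSpace(para)
		if para == "" {
			continue
		}
		if cur.Len() > 0 && cur.Len()+len(para) > maxChars {
			chunks = append(chunks, cur.String())
			cur.Reset()
		}
		if cur.Len() > 0 {
			cur.WriteString("\n\n")
		}
		cur.WriteString(para)
	}
	if cur.Len() > 0 {
		chunks = append(chunks, cur.String())
	}
	return chunks
}

func main() {
	doc := "First paragraph.\n\nSecond paragraph.\n\nThird."
	for i, c := range chunkByParagraph(doc, 20) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```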

type ChunkData

type ChunkData struct {
	ChunkIndex int
	ChunkText  string
	Embedding  []float32
}

ChunkData represents a chunk to be inserted

type ChunkerConfig

type ChunkerConfig struct {
	ChunkSize    int // Approximate chunk size in tokens
	ChunkOverlap int // Overlap between chunks in tokens
}

ChunkerConfig holds chunking configuration

type EmbeddingClient

type EmbeddingClient struct {
	// contains filtered or unexported fields
}

EmbeddingClient handles embedding generation

func NewEmbeddingClient

func NewEmbeddingClient(cfg EmbeddingConfig) (*EmbeddingClient, error)

NewEmbeddingClient creates a new embedding client

func (*EmbeddingClient) GenerateBatchEmbeddings

func (e *EmbeddingClient) GenerateBatchEmbeddings(texts []string) ([][]float32, error)

GenerateBatchEmbeddings generates embeddings for multiple texts

func (*EmbeddingClient) GenerateEmbedding

func (e *EmbeddingClient) GenerateEmbedding(text string) ([]float32, error)

GenerateEmbedding generates an embedding for the given text

func (*EmbeddingClient) GetDimension

func (e *EmbeddingClient) GetDimension() int

GetDimension returns the embedding dimension

type EmbeddingConfig

type EmbeddingConfig struct {
	Provider  string // Provider name (openai)
	APIKey    string // API key
	Model     string // Model name
	Dimension int    // Embedding dimension
}

EmbeddingConfig holds embedding configuration

type FileMetadata

type FileMetadata struct {
	FileDigest string
	FileName   string
	S3Key      string
	FileSize   int64
	CreatedAt  time.Time
	UpdatedAt  time.Time
}

FileMetadata represents file metadata stored in TiDB

type Indexer

type Indexer struct {
	// contains filtered or unexported fields
}

Indexer handles document indexing

func NewIndexer

func NewIndexer(
	s3Client *S3Client,
	tidbClient *TiDBClient,
	embeddingClient *EmbeddingClient,
	chunkerConfig ChunkerConfig,
) *Indexer

NewIndexer creates a new indexer

func (*Indexer) DeleteDocument

func (idx *Indexer) DeleteDocument(namespace, digest string) error

DeleteDocument removes a document from the index

func (*Indexer) IndexChunks

func (idx *Indexer) IndexChunks(namespace, digest, fileName, content string) error

IndexChunks performs chunking and embedding generation, then stores the chunks in TiDB (async phase). This is called after PrepareDocument to enable vector search on the document.

func (*Indexer) IndexDocument

func (idx *Indexer) IndexDocument(namespace, digest, fileName, content string) error

IndexDocument indexes a document (upload to S3, chunk, generate embeddings, store in TiDB). Deprecated: Use PrepareDocument + IndexChunks for better performance. This method is kept for backward compatibility.

func (*Indexer) PrepareDocument

func (idx *Indexer) PrepareDocument(namespace, digest, fileName, content string) (bool, error)

PrepareDocument uploads document to S3 and registers metadata in TiDB (synchronous phase). After this completes, the file is visible via ls/cat. Returns (alreadyExists, error) - if alreadyExists is true, no further indexing is needed.
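
The intended two-phase call order can be illustrated with a stand-in type, since constructing a real Indexer requires live S3/TiDB clients (fakeIndexer is purely illustrative; only the method signatures match the documentation above):

```go
package main

import "fmt"

// fakeIndexer stands in for Indexer so the two-phase flow
// (PrepareDocument, then IndexChunks) can be shown without S3 or TiDB.
type fakeIndexer struct{ log []string }

func (f *fakeIndexer) PrepareDocument(namespace, digest, fileName, content string) (bool, error) {
	f.log = append(f.log, "prepare "+fileName) // upload + metadata; file now visible via ls/cat
	return false, nil                          // false: not previously indexed
}

func (f *fakeIndexer) IndexChunks(namespace, digest, fileName, content string) error {
	f.log = append(f.log, "chunks "+fileName) // chunk + embed + store; file now searchable
	return nil
}

func main() {
	idx := &fakeIndexer{}
	already, err := idx.PrepareDocument("my_project", "digest123", "deployment.txt", "...")
	if err == nil && !already {
		// the async worker pool would run this phase in the background
		_ = idx.IndexChunks("my_project", "digest123", "deployment.txt", "...")
	}
	fmt.Println(idx.log)
}
```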

type S3Client

type S3Client struct {
	// contains filtered or unexported fields
}

S3Client handles S3 operations for document storage

func NewS3Client

func NewS3Client(cfg S3Config) (*S3Client, error)

NewS3Client creates a new S3 client

func (*S3Client) DeleteDocument

func (c *S3Client) DeleteDocument(ctx context.Context, namespace, digest string) error

DeleteDocument deletes a document from S3

func (*S3Client) DocumentExists

func (c *S3Client) DocumentExists(ctx context.Context, namespace, digest string) (bool, error)

DocumentExists checks if a document exists in S3

func (*S3Client) DownloadDocument

func (c *S3Client) DownloadDocument(ctx context.Context, namespace, digest string) ([]byte, error)

DownloadDocument downloads a document from S3

func (*S3Client) UploadDocument

func (c *S3Client) UploadDocument(ctx context.Context, namespace, digest string, data []byte) error

UploadDocument uploads a document to S3

type S3Config

type S3Config struct {
	AccessKey string
	SecretKey string
	Bucket    string
	KeyPrefix string
	Region    string
	Endpoint  string
}

S3Config holds S3 configuration

type TiDBClient

type TiDBClient struct {
	// contains filtered or unexported fields
}

TiDBClient handles TiDB operations for vector search

func NewTiDBClient

func NewTiDBClient(cfg TiDBConfig) (*TiDBClient, error)

NewTiDBClient creates a new TiDB client

func (*TiDBClient) Close

func (c *TiDBClient) Close() error

Close closes the TiDB connection

func (*TiDBClient) CreateNamespace

func (c *TiDBClient) CreateNamespace(namespace string, embeddingDim int) error

CreateNamespace creates tables for a new namespace (fails if already exists)

func (*TiDBClient) DeleteFileByName

func (c *TiDBClient) DeleteFileByName(namespace, fileName string) error

DeleteFileByName deletes all versions of a file by name (used before writing new content)

func (*TiDBClient) DeleteFileChunks

func (c *TiDBClient) DeleteFileChunks(namespace, fileDigest string) error

DeleteFileChunks deletes all chunks for a file

func (*TiDBClient) DeleteFileMetadata

func (c *TiDBClient) DeleteFileMetadata(namespace, fileDigest string) error

DeleteFileMetadata deletes file metadata

func (*TiDBClient) DeleteNamespace

func (c *TiDBClient) DeleteNamespace(namespace string) error

DeleteNamespace drops all tables for a namespace

func (*TiDBClient) FileExists

func (c *TiDBClient) FileExists(namespace, digest string) (bool, error)

FileExists checks if a file (by digest) is already indexed

func (*TiDBClient) GetFileMetadataByName

func (c *TiDBClient) GetFileMetadataByName(namespace, fileName string) (*FileMetadata, error)

GetFileMetadataByName retrieves file metadata by file name (returns the latest version)

func (*TiDBClient) HasFilesWithPrefix

func (c *TiDBClient) HasFilesWithPrefix(namespace, prefix string) (bool, error)

HasFilesWithPrefix checks if any files exist with the given prefix (for directory detection). This is much faster than loading all files just to check whether a directory exists.

func (*TiDBClient) InsertChunk

func (c *TiDBClient) InsertChunk(namespace, fileDigest string, chunkIndex int, chunkText string, embedding []float32) error

InsertChunk inserts a document chunk with embedding

func (*TiDBClient) InsertChunksBatch

func (c *TiDBClient) InsertChunksBatch(namespace, fileDigest string, chunks []ChunkData) error

InsertChunksBatch inserts multiple chunks in a single batch operation. This significantly reduces database round-trips compared to individual inserts.

func (*TiDBClient) InsertFileMetadata

func (c *TiDBClient) InsertFileMetadata(namespace string, meta FileMetadata) error

InsertFileMetadata inserts file metadata

func (*TiDBClient) ListFiles

func (c *TiDBClient) ListFiles(namespace string) ([]FileMetadata, error)

ListFiles lists all files in a namespace

func (*TiDBClient) ListFilesWithPrefix

func (c *TiDBClient) ListFilesWithPrefix(namespace, prefix string) ([]FileMetadata, error)

ListFilesWithPrefix lists files in a namespace with a given prefix (database-level filtering). This is more efficient than ListFiles when only a subset of files is needed.

func (*TiDBClient) ListNamespaces

func (c *TiDBClient) ListNamespaces() ([]string, error)

ListNamespaces lists all namespaces (by finding all tbl_meta_* tables)

func (*TiDBClient) NamespaceExists

func (c *TiDBClient) NamespaceExists(namespace string) (bool, error)

NamespaceExists checks if a namespace exists

func (*TiDBClient) VectorSearch

func (c *TiDBClient) VectorSearch(namespace string, queryEmbedding []float32, limit int) ([]VectorMatch, error)

VectorSearch performs vector similarity search

type TiDBConfig

type TiDBConfig struct {
	DSN string // Connection string
}

TiDBConfig holds TiDB configuration

type VectorFSPlugin

type VectorFSPlugin struct {
	// contains filtered or unexported fields
}

func NewVectorFSPlugin

func NewVectorFSPlugin() *VectorFSPlugin

NewVectorFSPlugin creates a new VectorFS plugin

func (*VectorFSPlugin) GetConfigParams

func (v *VectorFSPlugin) GetConfigParams() []plugin.ConfigParameter

func (*VectorFSPlugin) GetFileSystem

func (v *VectorFSPlugin) GetFileSystem() filesystem.FileSystem

func (*VectorFSPlugin) GetReadme

func (v *VectorFSPlugin) GetReadme() string

func (*VectorFSPlugin) Initialize

func (v *VectorFSPlugin) Initialize(cfg map[string]interface{}) error

func (*VectorFSPlugin) Name

func (v *VectorFSPlugin) Name() string

func (*VectorFSPlugin) Shutdown

func (v *VectorFSPlugin) Shutdown() error

func (*VectorFSPlugin) Validate

func (v *VectorFSPlugin) Validate(cfg map[string]interface{}) error

type VectorMatch

type VectorMatch struct {
	FileDigest string
	FileName   string
	ChunkText  string
	ChunkIndex int
	Distance   float64
}

VectorMatch represents a vector search result
