vector

package
v0.1.8-rc.22

Warning

This package is not in the latest version of its module.

Published: Mar 29, 2026 License: Apache-2.0 Imports: 33 Imported by: 0

Documentation

Overview

Package vector provides the vector store used for Genie's semantic memory: embedding and searching over documents (runbooks, synced data from Drive, Gmail, Slack, etc.) so the agent can retrieve relevant context via memory_search.

It solves the problem of giving the agent access to large, heterogeneous corpora: items are embedded, stored in an IStore (in-memory or Qdrant), and queried by semantic similarity. The sync pipeline upserts NormalizedItems from data sources; tools expose search and (when configured) delete. Without this package, the agent would have no persistent, searchable memory across runs.

Index

Constants

const (
	MemoryStoreToolName  = "memory_store"
	MemorySearchToolName = "memory_search"
	MemoryDeleteToolName = "memory_delete"
	MemoryListToolName   = "memory_list"
	MemoryMergeToolName  = "memory_merge"
)

Tool name constants for the vector memory tools. Use these instead of magic strings when referencing memory tools elsewhere (e.g. retrieval tool classification, empty-memory guard, loop detection).

const MaxCharsPerEmbeddingChunk = 16_000

MaxCharsPerEmbeddingChunk is a safe character limit so that chunked text stays under embedding model token limits (e.g. OpenAI text-embedding-3-small 8191 tokens). Email/HTML can be ~2 chars per token; 8000 tokens ≈ 16000 chars.

const MetaAgentName = "__agent_name"

MetaAgentName is the metadata key used to scope documents to a specific agent, preventing one agent from reading or overwriting another agent's data in a shared vector store collection.

const MetaLogicalID = "_logical_id"

MetaLogicalID is the reserved metadata key that stores the caller-facing document ID. ScopedStore namespaces internal document IDs by visibility scope to prevent cross-scope collisions, and uses this key to recover the original logical ID on search output. Callers must never set this key manually — ScopedStore strips any caller-supplied value.
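The internal namespacing scheme is unexported, but the idea can be sketched with a toy example. Everything below (the `/`-joined ID format, the helper names) is illustrative only — it shows why a reserved `_logical_id` key is needed: two scopes can store the same logical ID without colliding, and search output can still map back to the caller-facing ID.

```go
package main

import "fmt"

const metaLogicalID = "_logical_id"

// namespacedWrite derives a hypothetical scope-qualified internal ID and
// stamps the logical ID into metadata so it can be recovered on read.
func namespacedWrite(scope, logicalID string) (internalID string, meta map[string]string) {
	return scope + "/" + logicalID, map[string]string{metaLogicalID: logicalID}
}

// recoverLogicalID maps a stored document back to its caller-facing ID,
// as ScopedStore does on search output.
func recoverLogicalID(meta map[string]string) string {
	return meta[metaLogicalID]
}

func main() {
	idA, metaA := namespacedWrite("private:user1", "note-42")
	idB, _ := namespacedWrite("global", "note-42")
	fmt.Println(idA != idB)              // distinct internal IDs, no cross-scope collision
	fmt.Println(recoverLogicalID(metaA)) // note-42
}
```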

const MetaVisibility = "visibility"

MetaVisibility is the reserved metadata key used to scope documents by visibility (private, group, or global). It is automatically injected by ScopedStore on every write and used as a filter on every read. Callers must never set this key manually — ScopedStore strips any caller-supplied value and replaces it with the derived scope.

const SkillNameKey = "skill_name"

const VisibilityGlobal = "global"

VisibilityGlobal is the visibility scope value for documents that should be accessible to all users and channels within a deployed agent. Used by raw write paths (data source sync, activity reports, learned skills, tool index, graph entities/relations) to stamp content that must remain searchable from any context.

Variables

var DefaultContentChunker = NewContentChunker(MaxCharsPerEmbeddingChunk, embeddingChunkOverlap)

DefaultContentChunker is the shared content-aware chunker using the standard embedding chunk size and overlap. Datasource connectors should use this instead of creating their own.

Functions

func ChunkTextForEmbedding

func ChunkTextForEmbedding(text string) []string

ChunkTextForEmbedding splits text into chunks using trpc-agent-go's RecursiveChunking strategy (paragraph→line→word→character boundaries, with 10% overlap). Returns a single chunk if text is short enough; otherwise multiple chunks suitable for embedding one at a time.

func NewMemoryDeleteTool

func NewMemoryDeleteTool(store IStore) tool.Tool

NewMemoryDeleteTool creates a tool that deletes entries from vector memory by ID. Use this to clean up stale, incorrect, or outdated memories.

func NewMemoryListTool

func NewMemoryListTool(store IStore, cfg *Config) tool.Tool

NewMemoryListTool creates a tool that lists entries from vector memory. Supports metadata filtering to browse specific categories.

func NewMemoryMergeTool

func NewMemoryMergeTool(store IStore, cfg *Config) tool.Tool

NewMemoryMergeTool creates a tool that merges multiple memory entries into one. The agent provides the consolidated text; the tool upserts it under the first ID and deletes the remaining IDs.

func NewMemorySearchTool

func NewMemorySearchTool(store IStore, cfg *Config) tool.Tool

NewMemorySearchTool creates a tool that searches the vector memory. When cfg.AllowedMetadataKeys is set, only those keys may be used in filter.

func NewMemoryStoreTool

func NewMemoryStoreTool(store IStore, cfg *Config, scorer MemoryImportanceScorer) tool.Tool

NewMemoryStoreTool creates a tool that stores text into the vector memory. When cfg.AllowedMetadataKeys is set, only those keys are accepted in metadata. If req.ID is set, the document is upserted (an existing document with that ID is replaced). If scorer is non-nil, each write is scored for importance and the score is stored as "_importance" metadata for retrieval-time quality filtering.

Types

type AddRequest

type AddRequest struct {
	Items []BatchItem
}

AddRequest holds the items to insert into the vector store.

type BatchItem

type BatchItem struct {
	ID       string
	Text     string
	Metadata map[string]string
}

BatchItem represents a single document to be stored via Add.

func ChunkContentToBatchItems

func ChunkContentToBatchItems(itemID, content string, baseMeta map[string]string) ([]BatchItem, error)

ChunkContentToBatchItems splits content into chunks and returns BatchItems ready for Upsert with stable IDs and chunk metadata. Uses the default RecursiveChunking strategy.
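The shape of the result can be sketched as follows. The exact ID format and metadata keys the package emits are not documented here, so the `#N` suffix and `chunk` key below are assumptions purely for illustration — the point is that chunk IDs are deterministic functions of the item ID, so re-ingestion overwrites:

```go
package main

import "fmt"

type BatchItem struct {
	ID       string
	Text     string
	Metadata map[string]string
}

// toBatchItems is a hypothetical sketch of turning chunks into BatchItems:
// stable per-chunk IDs derived from the item ID, plus chunk metadata merged
// with the base metadata.
func toBatchItems(itemID string, chunks []string, baseMeta map[string]string) []BatchItem {
	items := make([]BatchItem, 0, len(chunks))
	for i, c := range chunks {
		meta := map[string]string{"chunk": fmt.Sprintf("%d/%d", i+1, len(chunks))}
		for k, v := range baseMeta {
			meta[k] = v
		}
		items = append(items, BatchItem{
			ID:       fmt.Sprintf("%s#%d", itemID, i), // stable: re-ingestion overwrites
			Text:     c,
			Metadata: meta,
		})
	}
	return items
}

func main() {
	items := toBatchItems("gdrive:doc1", []string{"part one", "part two"},
		map[string]string{"source": "gdrive"})
	fmt.Println(items[0].ID, items[1].Metadata["chunk"]) // gdrive:doc1#0 2/2
}
```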

type Config

type Config struct {
	// PersistenceDir is the directory where the vector store snapshot is
	// saved as a JSON file. If empty, the store is ephemeral (in-memory only).
	// Note: PersistenceDir is ignored when using an external store (Qdrant),
	// as those handle persistence internally.
	PersistenceDir    string `yaml:"persistence_dir,omitempty" toml:"persistence_dir,omitempty"`
	EmbeddingProvider string `yaml:"embedding_provider,omitempty" toml:"embedding_provider,omitempty"` // "openai", "ollama", "huggingface", "gemini"
	APIKey            string `yaml:"api_key,omitempty" toml:"api_key,omitempty"`
	OllamaURL         string `yaml:"ollama_url,omitempty" toml:"ollama_url,omitempty"`
	OllamaModel       string `yaml:"ollama_model,omitempty" toml:"ollama_model,omitempty"`
	HuggingFaceURL    string `yaml:"huggingface_url,omitempty" toml:"huggingface_url,omitempty"`
	GeminiAPIKey      string `yaml:"gemini_api_key,omitempty" toml:"gemini_api_key,omitempty"`
	GeminiModel       string `yaml:"gemini_model,omitempty" toml:"gemini_model,omitempty"`
	// VectorStoreProvider specifies the vector store backend to use.
	// Options: "inmemory" (default), "qdrant"
	VectorStoreProvider string `yaml:"vector_store_provider,omitempty" toml:"vector_store_provider,omitempty"`
	// Qdrant configuration (only used when VectorStoreProvider is "qdrant")
	Qdrant qdrantstore.Config `yaml:"qdrant,omitempty" toml:"qdrant,omitempty"`
	// AllowedMetadataKeys optionally restricts which metadata keys may be used in
	// memory_store and memory_search. If non-empty, only these keys are accepted
	// for metadata (store) and filter (search), enabling product/category buckets.
	AllowedMetadataKeys []string `yaml:"allowed_metadata_keys,omitempty" toml:"allowed_metadata_keys,omitempty"`
}

Config holds the configuration for the vector store. It supports OpenAI, Ollama (via OpenAI-compatible endpoint), HuggingFace Text-Embeddings-Inference, Gemini, and a deterministic dummy embedder for development and testing.

func DefaultConfig

func DefaultConfig(ctx context.Context, sp security.SecretProvider) Config

DefaultConfig builds the default vector store configuration by resolving API keys and endpoints through the given SecretProvider. Callers that do not need a custom SecretProvider can pass security.NewEnvProvider() to preserve the legacy os.Getenv behavior.

func (Config) NewStore

func (cfg Config) NewStore(ctx context.Context) (*Store, error)

NewStore creates a new vector store backed by trpc-agent-go/knowledge. If cfg.PersistenceDir is set and using in-memory store, existing data is loaded from disk. If using Qdrant, persistence is handled by the external store itself.

type ContentChunker

type ContentChunker struct {
	// contains filtered or unexported fields
}

ContentChunker selects a chunking strategy based on content type. Reusable across data sources (GDrive, Slack, Gmail, etc.). Each datasource calls ChunkForType with the appropriate content type to get intelligent splitting.

func NewContentChunker

func NewContentChunker(chunkSize, overlap int) *ContentChunker

NewContentChunker creates a ContentChunker with the given size and overlap. It initialises both RecursiveChunking (for generic text) and MarkdownChunking (for structured documents) from trpc-agent-go.

func (*ContentChunker) ChunkForType

func (cc *ContentChunker) ChunkForType(text string, ct ContentType) []string

ChunkForType splits text using the strategy for the given content type and returns the text chunks.

func (*ContentChunker) ChunkToBatchItemsForType

func (cc *ContentChunker) ChunkToBatchItemsForType(itemID, content string, baseMeta map[string]string, ct ContentType) ([]BatchItem, error)

ChunkToBatchItemsForType splits content and returns BatchItems using the strategy appropriate for the content type.

func (*ContentChunker) StrategyFor

func (cc *ContentChunker) StrategyFor(ct ContentType) chunking.Strategy

StrategyFor returns the appropriate chunking strategy for the content type.

type ContentType

type ContentType int

ContentType classifies document content for strategy selection.

const (
	// ContentTypePlain is generic text content.
	ContentTypePlain ContentType = iota
	// ContentTypeMarkdown is markdown-structured content (Google Docs, .md files).
	ContentTypeMarkdown
)

func ContentTypeFromMIME

func ContentTypeFromMIME(mime string) ContentType

ContentTypeFromMIME maps a MIME type to a ContentType. Google Workspace document types are treated as markdown because exported text often contains heading-like structure. This function is reusable by any datasource.
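A minimal sketch of such a mapping, assuming `text/markdown` and the Google Workspace document MIME prefix as representative inputs (the package's actual match list may be broader):

```go
package main

import (
	"fmt"
	"strings"
)

type ContentType int

const (
	ContentTypePlain ContentType = iota
	ContentTypeMarkdown
)

// contentTypeFromMIME is an illustrative re-implementation in the spirit of
// ContentTypeFromMIME: markdown and Google Docs exports get the markdown
// strategy, everything else falls back to plain text.
func contentTypeFromMIME(mime string) ContentType {
	switch {
	case mime == "text/markdown",
		strings.HasPrefix(mime, "application/vnd.google-apps.document"):
		return ContentTypeMarkdown
	default:
		return ContentTypePlain
	}
}

func main() {
	fmt.Println(contentTypeFromMIME("text/markdown") == ContentTypeMarkdown) // true
	fmt.Println(contentTypeFromMIME("text/plain") == ContentTypePlain)       // true
}
```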

type DeleteRequest

type DeleteRequest struct {
	IDs []string
}

DeleteRequest holds the IDs to remove from the vector store.

type IStore

type IStore interface {
	Search(ctx context.Context, req SearchRequest) ([]SearchResult, error)
	Add(ctx context.Context, req AddRequest) error
	// Upsert replaces existing documents with the same ID, or inserts if not present.
	// Use a stable ID (e.g. source:external_id) to overwrite memory when appropriate.
	Upsert(ctx context.Context, req UpsertRequest) error
	Delete(ctx context.Context, req DeleteRequest) error
	Close(ctx context.Context) error
}

func NewScopedStore

func NewScopedStore(inner IStore) IStore

NewScopedStore returns an IStore that enforces per-user/per-channel visibility on every operation. Wrap the raw store before passing it to NewToolProvider so that all LLM-facing memory tools are automatically scoped. The orchestrator and other internal callers should continue using the unwrapped store (they manage their own visibility filters).

type MemoryDeleteRequest

type MemoryDeleteRequest struct {
	IDs []string `json:"ids" jsonschema:"description=List of memory IDs to delete,required"`
}

MemoryDeleteRequest is the input for the memory_delete tool.

type MemoryDeleteResponse

type MemoryDeleteResponse struct {
	Deleted int    `json:"deleted"`
	Message string `json:"message"`
}

MemoryDeleteResponse is the output for the memory_delete tool.

func (MemoryDeleteResponse) MarshalJSON

func (r MemoryDeleteResponse) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for tool responses.

type MemoryImportanceScorer

type MemoryImportanceScorer interface {
	// ScoreText returns an importance score (1-10) for the given text.
	ScoreText(ctx context.Context, text string) int
}

MemoryImportanceScorer scores text content for importance on a 1-10 scale. This is a local interface to avoid importing reactree/memory; callers can satisfy it with rtmemory.ImportanceScorer or a no-op implementation.

type MemoryListRequest

type MemoryListRequest struct {
	Filter map[string]string `json:"filter,omitempty" jsonschema:"description=Optional metadata filter to narrow results (e.g. type=accomplishment)"`
	Limit  int               `json:"limit,omitempty" jsonschema:"description=Maximum entries to return (default 20)"`
}

MemoryListRequest is the input for the memory_list tool.

type MemoryListResponse

type MemoryListResponse struct {
	Entries []MemorySearchResultItem `json:"entries"`
	Count   int                      `json:"count"`
}

MemoryListResponse is the output for the memory_list tool.

func (MemoryListResponse) MarshalJSON

func (r MemoryListResponse) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for tool responses.

type MemoryMergeRequest

type MemoryMergeRequest struct {
	IDs        []string          `json:"ids" jsonschema:"description=List of memory IDs to merge (minimum 2),required"`
	MergedText string            `json:"merged_text" jsonschema:"description=The consolidated text for the merged memory entry,required"`
	Metadata   map[string]string `` /* 136-byte string literal not displayed */
}

MemoryMergeRequest is the input for the memory_merge tool.

type MemoryMergeResponse

type MemoryMergeResponse struct {
	MergedID     string `json:"merged_id"`
	DeletedCount int    `json:"deleted_count"`
	Message      string `json:"message"`
}

MemoryMergeResponse is the output for the memory_merge tool.

func (MemoryMergeResponse) MarshalJSON

func (r MemoryMergeResponse) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for tool responses.

type MemorySearchRequest

type MemorySearchRequest struct {
	Query  string            `json:"query" jsonschema:"description=The search query to find relevant memories,required"`
	Limit  int               `json:"limit,omitempty" jsonschema:"description=Maximum number of results to return (default 5)"`
	Filter map[string]string `` /* 153-byte string literal not displayed */
}

MemorySearchRequest is the input for the memory_search tool.

type MemorySearchResponse

type MemorySearchResponse struct {
	Results []MemorySearchResultItem `json:"results"`
	Count   int                      `json:"count"`
}

MemorySearchResponse is the output for the memory_search tool.

func (MemorySearchResponse) MarshalJSON

func (r MemorySearchResponse) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for tool responses.

type MemorySearchResultItem

type MemorySearchResultItem struct {
	ID         string            `json:"id"`
	Content    string            `json:"content"`
	Metadata   map[string]string `json:"metadata,omitempty"`
	Similarity float64           `json:"similarity"`
}

MemorySearchResultItem represents a single search result from the memory_search tool.

type MemoryStoreRequest

type MemoryStoreRequest struct {
	Text     string            `json:"text" jsonschema:"description=The text content to store in memory,required"`
	Metadata map[string]string `` /* 144-byte string literal not displayed */
	ID       string            `` /* 133-byte string literal not displayed */
}

MemoryStoreRequest is the input for the memory_store tool.

type MemoryStoreResponse

type MemoryStoreResponse struct {
	ID      string `json:"id"`
	Message string `json:"message"`
}

MemoryStoreResponse is the output for the memory_store tool.

func (MemoryStoreResponse) MarshalJSON

func (r MemoryStoreResponse) MarshalJSON() ([]byte, error)

MarshalJSON implements custom JSON marshaling for tool responses.

type SearchRequest

type SearchRequest struct {
	Query  string
	Limit  int
	Filter map[string]string
}

SearchRequest holds the parameters for a vector store search. When Filter is non-nil, only documents whose metadata contains ALL specified entries are returned. This replaces the former SearchWithFilter method — an unfiltered search simply leaves Filter nil.
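The AND semantics described above can be sketched with a small predicate (illustrative only — the store applies this filtering internally, or pushes it down to Qdrant):

```go
package main

import "fmt"

// matchesFilter mirrors the documented semantics: a document matches only
// if its metadata contains ALL filter entries; a nil filter matches everything.
func matchesFilter(meta, filter map[string]string) bool {
	for k, want := range filter {
		if meta[k] != want {
			return false
		}
	}
	return true // nil or empty filter falls through: unfiltered search
}

func main() {
	meta := map[string]string{"source": "gmail", "sender": "alice"}
	fmt.Println(matchesFilter(meta, map[string]string{"source": "gmail"}))                  // true
	fmt.Println(matchesFilter(meta, map[string]string{"source": "gmail", "sender": "bob"})) // false
	fmt.Println(matchesFilter(meta, nil))                                                   // true
}
```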

type SearchResult

type SearchResult struct {
	ID       string            `json:"id"`
	Content  string            `json:"content"`
	Metadata map[string]string `json:"metadata,omitempty"`
	Score    float64           `json:"score"`
}

SearchResult represents a single result returned by Store.Search. It contains the matched document content, its metadata and the cosine similarity score (0.0–1.0, higher is more similar).

func (SearchResult) String

func (s SearchResult) String() string

type SearchResults

type SearchResults []SearchResult

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store wraps a trpc-agent-go vector store and embedder to provide simple add/search operations for agent memory. When PersistenceDir is set and using in-memory store, the store snapshots its state to disk after every Add and restores it on startup. When using Qdrant, persistence is handled by the external store itself.

func (*Store) Add

func (s *Store) Add(ctx context.Context, req AddRequest) error

Add stores one or more documents in the vector store. When multiple items are provided, their embeddings are generated concurrently via errgroup, reducing wall-clock latency from N×round-trip to max(round-trip). A single disk snapshot is taken at the end.

func (*Store) Close

func (s *Store) Close(ctx context.Context) error

Close flushes any pending state to disk (if persistence is configured). For Qdrant stores, it closes the client connection. It is safe to call multiple times.

func (*Store) Delete

func (s *Store) Delete(ctx context.Context, req DeleteRequest) error

Delete removes one or more documents by their IDs from the vector store. A single snapshot is taken at the end. Errors from individual deletes are collected but do not stop processing of remaining items.

func (*Store) Search

func (s *Store) Search(ctx context.Context, req SearchRequest) ([]SearchResult, error)

Search finds semantically similar documents, optionally filtered by metadata key-value pairs. Only documents whose metadata contains ALL specified filter entries are returned. Pass nil Filter for unfiltered search. This enables source-based memory isolation (e.g. per-sender, per-channel).

func (*Store) Upsert

func (s *Store) Upsert(ctx context.Context, req UpsertRequest) error

Upsert replaces documents with the same ID (delete then add). Use a stable ID (e.g. source:external_id) so that re-ingestion overwrites rather than duplicates.
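A toy in-memory illustration of this contract: keying by a stable source-derived ID means re-ingesting the same upstream record replaces the document instead of accumulating duplicates. (The map stands in for the store; the real implementation does delete-then-add per ID.)

```go
package main

import "fmt"

type doc struct{ Text string }

// upsert is replace-or-insert keyed by a stable ID.
func upsert(store map[string]doc, id string, d doc) {
	store[id] = d
}

func main() {
	store := map[string]doc{}
	id := "gmail:msg-1001" // stable: derived from the upstream record, per the source:external_id convention
	upsert(store, id, doc{Text: "v1"})
	upsert(store, id, doc{Text: "v2 (re-ingested)"})
	fmt.Println(len(store), store[id].Text) // 1 v2 (re-ingested)
}
```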

type ToolProvider

type ToolProvider struct {
	// contains filtered or unexported fields
}

ToolProvider wraps an IStore and optional Config, and satisfies the tools.ToolProviders interface so vector memory tools can be passed directly to tools.NewRegistry. When cfg is non-nil and AllowedMetadataKeys is set, memory_store and memory_search only accept those keys for metadata and filter (product/category buckets).

func NewToolProvider

func NewToolProvider(store IStore, cfg *Config, scorer MemoryImportanceScorer) *ToolProvider

NewToolProvider creates a ToolProvider for the vector memory tools (memory_store and memory_search). cfg may be nil; when set with AllowedMetadataKeys, only those metadata keys are allowed. scorer may be nil; when set, memory_store writes are scored for importance and the score is stored as "_importance" metadata.

func (*ToolProvider) GetTools

func (p *ToolProvider) GetTools(_ context.Context) []tool.Tool

GetTools returns the memory store and memory search tools.

type UpsertRequest

type UpsertRequest struct {
	Items []BatchItem
}

UpsertRequest holds the items to upsert (replace-or-insert) in the vector store.

Directories

Path Synopsis
Package qdrantstore provides the Qdrant vector store backend for Genie's semantic memory.
Code generated by counterfeiter.
