toolsindex

Version: v1.72.0
Published: May 31, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package toolsindex is the tools-discovery consumer of the indexjobs framework (#440). It embeds every globally-visible MCP tool's descriptor (name + description + parameter-schema summary) under source_kind "tools" and ranks them by cosine similarity for platform_find_tools.

Unlike the api-catalog consumer, the tool corpus is not a DB table: tools are registered in-process from compiled-in toolkits plus admin visibility config. So the Source (in pkg/platform) enumerates the live registry; this package owns only the vector storage (a Sink over tool_embeddings) and the query-time ranking. The expected-count breadcrumb the reconciler diffs against lives in the framework-owned index_sources table (migration 000053).

Constants

const SourceID = "platform"

SourceID is the single logical tool-corpus identifier. There is one tool registry per deployment, identical across replicas (same binary plus the same DB-backed visibility config), so a constant source_id is sufficient; vectors keyed on it are shared by every replica.

const SourceKind = "tools"

SourceKind is the indexjobs source_kind this package serves.

Variables

This section is empty.

Functions

This section is empty.

Types

type ScoredTool

type ScoredTool struct {
	ToolName string
	Score    float64
}

ScoredTool is one tool name with its cosine similarity to a query, returned by the store's similarity ranking. Score is in [-1, 1] (1 = identical direction); for the platform's normalized embeddings it is effectively [0, 1].
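As an illustration of what Score measures, the following is a minimal in-process sketch of cosine similarity. It is not this package's implementation (the store computes the score in PostgreSQL via pgvector); the function name cosineSimilarity is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// It returns 0 when the lengths differ or either vector has zero norm.
// Hypothetical helper for illustration only.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	query := []float32{1, 0}
	fmt.Println(cosineSimilarity(query, []float32{2, 0}))  // same direction: 1
	fmt.Println(cosineSimilarity(query, []float32{0, 3}))  // orthogonal: 0
	fmt.Println(cosineSimilarity(query, []float32{-1, 0})) // opposite direction: -1
}
```

The score depends only on direction, not magnitude, which is why {1, 0} and {2, 0} score 1.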

type Sink

type Sink struct {
	// contains filtered or unexported fields
}

Sink implements indexjobs.Sink for the tools kind over the tool_embeddings table (vectors) and index_sources (expected count). The key's SourceID is used verbatim as the tool_embeddings source_id; there is no composite encoding because, unlike api-catalog, the tool corpus is a single flat set per source.

func NewSink

func NewSink(store *Store) *Sink

NewSink returns a Sink backed by the given store.

func (*Sink) FindGaps

func (s *Sink) FindGaps(ctx context.Context) ([]string, error)

FindGaps returns the source ids the reconciler should re-sync. For the tools kind, gap detection is not count-based (see StampExpected and Store.FindGaps).

func (*Sink) Kind

func (*Sink) Kind() string

Kind reports the tools source kind.

func (*Sink) ListExisting

func (s *Sink) ListExisting(ctx context.Context, key indexjobs.Key) (map[string]indexjobs.Vector, error)

ListExisting returns the persisted vectors keyed by tool name for the worker's dedup pass.

func (*Sink) StampExpected

func (*Sink) StampExpected(context.Context, indexjobs.Key, int) error

StampExpected is a no-op for the tools kind. The framework calls it after a successful embed to record an expected item count for count-based gap detection, but the tools kind detects gaps by always re-syncing (see Store.FindGaps), so there is no count to record.

func (*Sink) Upsert

func (s *Sink) Upsert(ctx context.Context, key indexjobs.Key, rows []indexjobs.Vector) error

Upsert atomically replaces the full vector set for the source, so a tool dropped from the registry has its stale vector removed.

func (*Sink) UpsertBatch

func (s *Sink) UpsertBatch(ctx context.Context, key indexjobs.Key, rows []indexjobs.Vector) error

UpsertBatch writes one chunk in place without disturbing rows outside it (the worker's incremental progress persistence).

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store persists tool embedding vectors (tool_embeddings) and the expected-count breadcrumb (index_sources) and answers the query-time cosine ranking. Backed by PostgreSQL + pgvector.

func NewStore

func NewStore(db *sql.DB) *Store

NewStore returns a Store over the given database.

func (*Store) FindGaps

func (*Store) FindGaps(_ context.Context) ([]string, error)

FindGaps always returns the single tools source, so the reconciler re-syncs the tool index on every sweep.

Unlike a DB-backed corpus, the tool set lives in the running process (compiled-in toolkits plus admin visibility/description config), and it drifts in ways a count comparison cannot see: a description edit or a visibility flip changes the live descriptors without changing the stored vector count, so an expected-vs-indexed count diff would report "no gap" while the index is stale.

Returning the source unconditionally makes the worker re-enumerate the live registry each sweep. Its text-hash dedup (pkg/indexjobs/embed.go) skips the embedding provider for unchanged tools, so a no-change pass costs one in-memory tools/list plus a row rewrite, and any add / remove / edit / deny-flip converges within one reconcile interval. The content-blind count check is left to DB-backed consumers (#441+), whose corpus is a table the gap query can compare against directly.
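To make the count-blindness argument concrete, here is a minimal sketch of a text-hash dedup pass. It does not reproduce pkg/indexjobs/embed.go; the SHA-256 choice, the helper names textHash and needsEmbedding, and the map shapes are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// textHash fingerprints a tool descriptor's text.
// SHA-256 is an assumption; the framework's actual hash is not documented here.
func textHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// needsEmbedding returns, sorted by name, the live tools whose text hash is
// missing from or differs from the stored hash. Unchanged tools are skipped,
// so they never reach the embedding provider.
func needsEmbedding(live, stored map[string]string) []string {
	var changed []string
	for name, text := range live {
		if stored[name] != textHash(text) {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	// Stored state: tool name -> hash of the text that was embedded.
	stored := map[string]string{
		"search_docs": textHash("Search the docs."),
		"list_tables": textHash("List tables."),
	}
	// Live registry: same two tools, but one description was edited.
	live := map[string]string{
		"search_docs": "Search the docs.",
		"list_tables": "List tables in a schema.",
	}
	// The counts match (2 vs 2), yet one tool is stale.
	fmt.Println(len(live) == len(stored), needsEmbedding(live, stored))
}
```

In this sketch the count comparison sees no gap, while the hash comparison flags list_tables for re-embedding.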

func (*Store) ListVectors

func (s *Store) ListVectors(ctx context.Context, sourceID string) (map[string]indexjobs.Vector, error)

ListVectors returns every persisted vector for the source, keyed by tool name, for the worker's text-hash dedup pass.

func (*Store) RankBySimilarity

func (s *Store) RankBySimilarity(ctx context.Context, sourceID string, queryVec []float32) ([]ScoredTool, error)

RankBySimilarity returns every indexed tool for the source ordered by cosine similarity to queryVec (most similar first). pgvector's `<=>` is the cosine-distance operator, so 1 - distance is the similarity. No LIMIT is applied: the corpus is small (low hundreds) and the caller filters by persona before capping, which must happen on the full ranked set to avoid a denied tool consuming a top-K slot.
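The filter-before-cap ordering can be sketched as follows. This is an assumed shape for the caller's side, not code from this package: topKAllowed and the allowed predicate are hypothetical, and ScoredTool is redeclared locally so the example stands alone.

```go
package main

import "fmt"

// ScoredTool mirrors the package's exported type for a self-contained example.
type ScoredTool struct {
	ToolName string
	Score    float64
}

// topKAllowed walks the full ranked list (most similar first), drops tools
// the caller may not see, and stops once k tools have been kept.
func topKAllowed(ranked []ScoredTool, allowed func(string) bool, k int) []ScoredTool {
	if k <= 0 {
		return nil
	}
	out := make([]ScoredTool, 0, k)
	for _, t := range ranked {
		if len(out) == k {
			break
		}
		if allowed(t.ToolName) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	ranked := []ScoredTool{
		{"admin_reset", 0.92}, // denied for this persona
		{"search_docs", 0.88},
		{"list_tables", 0.81},
	}
	allowed := func(name string) bool { return name != "admin_reset" }

	// Filtering the full set first yields two usable tools.
	for _, t := range topKAllowed(ranked, allowed, 2) {
		fmt.Println(t.ToolName, t.Score)
	}
}
```

Had the list been capped to two entries before filtering, admin_reset would have consumed a slot and only search_docs would have survived.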

func (*Store) Replace

func (s *Store) Replace(ctx context.Context, sourceID string, rows []indexjobs.Vector) error

Replace atomically swaps the full vector set for the source: it deletes every existing row for source_id and inserts the supplied set in one transaction, so a tool removed from the registry has its stale vector dropped.

func (*Store) UpsertBatch

func (s *Store) UpsertBatch(ctx context.Context, sourceID string, rows []indexjobs.Vector) error

UpsertBatch inserts or updates the supplied rows in place without deleting rows outside the batch (incremental progress for the worker's per-chunk persistence).
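The difference between Replace and UpsertBatch can be illustrated with an in-memory stand-in. This is only a sketch of the semantics described above: memStore and its map-based rows are hypothetical, and the real Store performs these operations against PostgreSQL.

```go
package main

import (
	"fmt"
	"sort"
)

// memStore is a hypothetical in-memory stand-in for one source's vectors,
// keyed by tool name.
type memStore struct {
	rows map[string][]float32
}

// Replace swaps in the full set: any tool absent from rows is dropped.
func (m *memStore) Replace(rows map[string][]float32) {
	fresh := make(map[string][]float32, len(rows))
	for name, vec := range rows {
		fresh[name] = vec
	}
	m.rows = fresh
}

// UpsertBatch writes the given rows in place and leaves all others untouched.
func (m *memStore) UpsertBatch(rows map[string][]float32) {
	if m.rows == nil {
		m.rows = make(map[string][]float32, len(rows))
	}
	for name, vec := range rows {
		m.rows[name] = vec
	}
}

// names returns the stored tool names in sorted order.
func (m *memStore) names() []string {
	out := make([]string, 0, len(m.rows))
	for name := range m.rows {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	s := &memStore{}
	s.Replace(map[string][]float32{"a": {1}, "b": {2}})

	s.UpsertBatch(map[string][]float32{"c": {3}}) // a and b survive
	fmt.Println(s.names())

	s.Replace(map[string][]float32{"a": {1}}) // b and c are dropped
	fmt.Println(s.names())
}
```

UpsertBatch suits per-chunk progress because earlier chunks are preserved, while Replace is what removes the vector of a tool that has left the registry.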
