strategy

package
v1.16.0
Warning

This package is not in the latest version of its module.
Published: Dec 30, 2025 License: Apache-2.0 Imports: 32 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func BuildShouldIgnore added in v1.9.30

func BuildShouldIgnore(buildCtx BuildContext, strategyParams map[string]any) func(path string) bool

BuildShouldIgnore creates a filter function based on the BuildContext and an optional strategy-level override. Strategy params can override the RAG-level respect_vcs setting. It returns nil if no filtering should be applied, so callers must check for nil before calling the result.
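Because the returned filter may be nil, callers need a nil check before invoking it. The following is a minimal sketch of that pattern, not code from this package: filterPaths and the vendor/ predicate are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// filterPaths is a hypothetical helper (not part of this package). It applies
// an optional ignore predicate and treats a nil predicate as "keep everything",
// mirroring the documented "returns nil if no filtering should be applied".
func filterPaths(paths []string, shouldIgnore func(path string) bool) []string {
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		if shouldIgnore != nil && shouldIgnore(p) {
			continue // path is filtered out
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	paths := []string{"main.go", "vendor/lib.go", "README.md"}

	// Illustrative predicate; the real filter is built from VCS ignore files.
	ignoreVendor := func(path string) bool { return strings.HasPrefix(path, "vendor/") }

	fmt.Println(filterPaths(paths, ignoreVendor))
	fmt.Println(filterPaths(paths, nil)) // nil filter: nothing is dropped
}
```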

func CreateEmbedder added in v1.9.26

func CreateEmbedder(embedModel provider.Provider, batchSize, maxConcurrency int) *embed.Embedder

CreateEmbedder creates an embedder with the specified configuration.

func EmitEvent

func EmitEvent(events chan<- types.Event, event types.Event, strategyName string)

EmitEvent sends an event to the events channel using a non-blocking send. This prevents strategies from hanging if the event channel is full or not ready. It automatically sets the StrategyName field in the event.
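A non-blocking send in Go is usually written as a select with a default case. The sketch below illustrates the idea under assumptions: Event is a stand-in for types.Event that models only the StrategyName field, and emit (with its bool result) is a hypothetical function, not the package's implementation.

```go
package main

import "fmt"

// Event is a stand-in for types.Event; only StrategyName comes from the docs,
// Message is a hypothetical field added for the demo.
type Event struct {
	StrategyName string
	Message      string
}

// emit sets the strategy name and tries to send without blocking.
// It reports whether the event was delivered (an addition for this sketch).
func emit(events chan<- Event, ev Event, strategyName string) bool {
	ev.StrategyName = strategyName
	select {
	case events <- ev:
		return true
	default:
		// Channel is full, has no ready receiver, or is nil: drop the event.
		return false
	}
}

func main() {
	events := make(chan Event, 1)

	fmt.Println(emit(events, Event{Message: "indexing started"}, "bm25")) // buffer has room
	fmt.Println(emit(events, Event{Message: "indexing done"}, "bm25"))    // buffer is full
	fmt.Println((<-events).StrategyName)
}
```

The trade-off is that events can be dropped silently, which is acceptable for progress reporting but not for data that must arrive.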

func GetParam

func GetParam[T any](params map[string]any, key string, defaultValue T) T

GetParam gets a parameter from the config Params map. It includes numeric coercion so YAML numbers (which may be decoded as int, int64, uint64 or float64) can be safely read as either int or float64 without callers needing to worry about the concrete type.
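The numeric coercion described above can be sketched roughly as follows. This is an illustrative reimplementation under assumptions, not the package's code: getParam and toFloat are hypothetical names, and routing integers through float64 loses precision for very large values.

```go
package main

import "fmt"

// toFloat converts the numeric types a YAML decoder may produce.
func toFloat(v any) (float64, bool) {
	switch n := v.(type) {
	case int:
		return float64(n), true
	case int64:
		return float64(n), true
	case uint64:
		return float64(n), true
	case float64:
		return n, true
	}
	return 0, false
}

// getParam is a hypothetical sketch of a typed lookup with numeric coercion.
// It falls back to def when the key is missing or the value cannot be used as T.
func getParam[T any](params map[string]any, key string, def T) T {
	raw, ok := params[key]
	if !ok {
		return def
	}
	if v, ok := raw.(T); ok {
		return v // exact type match, no coercion needed
	}
	f, isNum := toFloat(raw)
	if !isNum {
		return def
	}
	var out any
	switch any(def).(type) {
	case int:
		out = int(f)
	case float64:
		out = f
	default:
		return def
	}
	return out.(T)
}

func main() {
	params := map[string]any{"limit": int64(5), "threshold": 1, "name": "docs"}

	fmt.Println(getParam(params, "limit", 10))      // int64 read as int
	fmt.Println(getParam(params, "threshold", 0.5)) // int read as float64
	fmt.Println(getParam(params, "missing", 3))     // default
}
```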

func GetParamPtr

func GetParamPtr[T any](params map[string]any, key string) *T

GetParamPtr returns a pointer to a parameter value from the config Params map.

func MergeDocPaths

func MergeDocPaths(sharedDocs, strategyDocs []string, parentDir string) []string

MergeDocPaths merges shared docs with strategy-specific docs and makes the resulting paths absolute.
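One plausible reading of this behavior is sketched below. The details are assumptions rather than documented facts: the sketch resolves relative paths against parentDir (assumed to be absolute), keeps shared docs first, and drops duplicates. mergeDocPaths is a hypothetical name.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// mergeDocPaths is a hypothetical sketch: it concatenates shared and
// strategy-specific paths, resolves relative ones against parentDir, and
// removes duplicates while preserving first-seen order (an assumption).
func mergeDocPaths(sharedDocs, strategyDocs []string, parentDir string) []string {
	all := append(append([]string{}, sharedDocs...), strategyDocs...)
	merged := make([]string, 0, len(all))
	seen := make(map[string]bool, len(all))

	for _, p := range all {
		if !filepath.IsAbs(p) {
			p = filepath.Join(parentDir, p)
		}
		p = filepath.Clean(p)
		if seen[p] {
			continue
		}
		seen[p] = true
		merged = append(merged, p)
	}
	return merged
}

func main() {
	shared := []string{"docs", "README.md"}
	specific := []string{"docs", "/abs/notes"}

	for _, p := range mergeDocPaths(shared, specific, "/proj") {
		fmt.Println(p)
	}
}
```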

func ResolveDatabasePath

func ResolveDatabasePath(dbCfg latest.RAGDatabaseConfig, parentDir, defaultName string) (string, error)

ResolveDatabasePath resolves a database configuration to a path.

func ResolveModelConfig added in v1.9.26

func ResolveModelConfig(ref string, models map[string]latest.ModelConfig) (latest.ModelConfig, error)

ResolveModelConfig resolves a model reference into a ModelConfig. Supports "provider/model" inline references or named references into the models map.
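The two reference forms can be illustrated with a small resolver. This is a sketch under assumptions: ModelConfig here is a two-field stand-in for latest.ModelConfig, resolveModel is a hypothetical name, and checking named references before inline ones is a guess about precedence.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelConfig is a simplified stand-in for latest.ModelConfig;
// both fields are hypothetical.
type ModelConfig struct {
	Provider string
	Model    string
}

// resolveModel tries a named reference first, then an inline
// "provider/model" reference. The lookup order is an assumption.
func resolveModel(ref string, models map[string]ModelConfig) (ModelConfig, error) {
	if cfg, ok := models[ref]; ok {
		return cfg, nil
	}
	provider, model, found := strings.Cut(ref, "/")
	if found && provider != "" && model != "" {
		return ModelConfig{Provider: provider, Model: model}, nil
	}
	return ModelConfig{}, fmt.Errorf("unknown model reference %q", ref)
}

func main() {
	models := map[string]ModelConfig{
		"embedder": {Provider: "openai", Model: "text-embedding-3-small"},
	}

	for _, ref := range []string{"embedder", "anthropic/claude-sonnet-4-5", "nope"} {
		cfg, err := resolveModel(ref, models)
		fmt.Println(cfg, err)
	}
}
```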

Types

type BM25Strategy

type BM25Strategy struct {
	// contains filtered or unexported fields
}

BM25Strategy implements BM25 keyword-based retrieval. BM25 is a ranking function that scores documents using term frequency and inverse document frequency.
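For readers unfamiliar with BM25, the textbook scoring formula can be sketched as below. This is a generic illustration, not this package's implementation: the whitespace tokenizer, the parameter values k1 = 1.2 and b = 0.75, and the name bm25Scores are all assumptions.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Common default BM25 parameters (assumed, not taken from this package).
const (
	bm25K1 = 1.2  // term-frequency saturation
	bm25B  = 0.75 // document-length normalization
)

// bm25Scores returns one BM25 score per document for the given query,
// using lowercase whitespace tokenization.
func bm25Scores(docs []string, query string) []float64 {
	tokenized := make([][]string, len(docs))
	docFreq := map[string]int{} // number of documents containing each term
	totalLen := 0

	for i, d := range docs {
		toks := strings.Fields(strings.ToLower(d))
		tokenized[i] = toks
		totalLen += len(toks)
		seen := map[string]bool{}
		for _, t := range toks {
			if !seen[t] {
				seen[t] = true
				docFreq[t]++
			}
		}
	}

	scores := make([]float64, len(docs))
	if totalLen == 0 {
		return scores
	}
	n := float64(len(docs))
	avgLen := float64(totalLen) / n
	queryTerms := strings.Fields(strings.ToLower(query))

	for i, toks := range tokenized {
		termFreq := map[string]int{}
		for _, t := range toks {
			termFreq[t]++
		}
		for _, q := range queryTerms {
			f := float64(termFreq[q])
			if f == 0 {
				continue
			}
			df := float64(docFreq[q])
			idf := math.Log(1 + (n-df+0.5)/(df+0.5))
			norm := f + bm25K1*(1-bm25B+bm25B*float64(len(toks))/avgLen)
			scores[i] += idf * f * (bm25K1 + 1) / norm
		}
	}
	return scores
}

func main() {
	docs := []string{"the quick brown fox", "lazy dog sleeps", "quick quick fox"}
	fmt.Println(bm25Scores(docs, "quick"))
}
```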

func (*BM25Strategy) CheckAndReindexChangedFiles

func (s *BM25Strategy) CheckAndReindexChangedFiles(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

CheckAndReindexChangedFiles checks for file changes and re-indexes if needed.

func (*BM25Strategy) Close

func (s *BM25Strategy) Close() error

Close releases resources.

func (*BM25Strategy) Initialize

func (s *BM25Strategy) Initialize(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

Initialize indexes all documents for BM25 retrieval.

func (*BM25Strategy) Query

func (s *BM25Strategy) Query(ctx context.Context, query string, numResults int, threshold float64) ([]database.SearchResult, error)

Query searches for relevant documents using BM25 scoring.

func (*BM25Strategy) StartFileWatcher

func (s *BM25Strategy) StartFileWatcher(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

StartFileWatcher starts monitoring files for changes.

type BuildContext

type BuildContext struct {
	RAGName       string
	ParentDir     string
	SharedDocs    []string
	Models        map[string]latest.ModelConfig
	Env           environment.Provider
	ModelsGateway string
	RespectVCS    bool // Whether to respect VCS ignore files (e.g., .gitignore) when collecting files
}

BuildContext contains everything needed to build a strategy.

type ChunkingConfig added in v1.9.26

type ChunkingConfig struct {
	Size                  int
	Overlap               int
	RespectWordBoundaries bool
	CodeAware             bool
}

ChunkingConfig holds chunking parameters.
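How Size and Overlap interact can be shown with a sliding-window splitter. This is a sketch under assumptions: it measures Size and Overlap in runes (the unit is not stated here), ignores RespectWordBoundaries and CodeAware, and chunkRunes is a hypothetical name.

```go
package main

import "fmt"

// chunkRunes splits text into windows of at most size runes, where each
// window after the first repeats the last overlap runes of the previous one.
// An invalid overlap (negative or >= size) is treated as zero.
func chunkRunes(text string, size, overlap int) []string {
	if size <= 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0
	}
	runes := []rune(text)
	step := size - overlap

	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final window reached the end of the text
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkRunes("abcdefghij", 4, 1) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

Overlap trades index size for recall: text that straddles a chunk boundary still appears intact in at least one chunk.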

func ParseChunkingConfig added in v1.9.26

func ParseChunkingConfig(cfg latest.RAGStrategyConfig) ChunkingConfig

ParseChunkingConfig extracts chunking configuration from RAGStrategyConfig.

type Config

type Config struct {
	Name      string
	Strategy  Strategy
	Docs      []string // Merged document paths (shared + strategy-specific)
	Limit     int      // Max results for this strategy
	Threshold float64  // Score threshold
	Chunking  ChunkingConfig
}

Config contains a strategy and its runtime configuration.

func BuildStrategy

func BuildStrategy(ctx context.Context, cfg latest.RAGStrategyConfig, buildCtx BuildContext, events chan<- types.Event) (*Config, error)

BuildStrategy builds a strategy from config. It explicitly dispatches to the appropriate constructor based on the strategy type.

func NewBM25FromConfig

func NewBM25FromConfig(_ context.Context, cfg latest.RAGStrategyConfig, buildCtx BuildContext, events chan<- types.Event) (*Config, error)

NewBM25FromConfig creates a BM25 strategy from configuration.

func NewChunkedEmbeddingsFromConfig

func NewChunkedEmbeddingsFromConfig(ctx context.Context, cfg latest.RAGStrategyConfig, buildCtx BuildContext, events chan<- types.Event) (*Config, error)

NewChunkedEmbeddingsFromConfig creates a chunked-embeddings strategy from configuration.

This strategy embeds document chunks directly and uses vector similarity search for retrieval. It's the simplest embedding-based RAG strategy.

func NewSemanticEmbeddingsFromConfig added in v1.9.29

func NewSemanticEmbeddingsFromConfig(ctx context.Context, cfg latest.RAGStrategyConfig, buildCtx BuildContext, events chan<- types.Event) (*Config, error)

NewSemanticEmbeddingsFromConfig creates a semantic-embeddings strategy from configuration.

This strategy uses an LLM to generate semantic summaries of each chunk before embedding. The summaries capture the meaning/purpose of the code, making retrieval more semantic than direct chunk embedding.

Configuration (in RAGStrategyConfig.Params):

  • embedding_model (string, required): embedding model name (same as chunked-embeddings)
  • chat_model (string, required): chat model used to generate semantic representations for each chunk (e.g., "anthropic/claude-sonnet-4-5")
  • vector_dimensions (int, required): embedding vector dimensions
  • semantic_prompt (string, optional): prompt template for the semantic LLM
  • ast_context (bool, optional): when true, include TreeSitter-derived AST metadata in the semantic prompt (requires chunking.code_aware for best results)
  • similarity_metric (string, optional): "cosine_similarity" (default) or "euclidean"
  • threshold (float, optional): minimum similarity score (default: 0.5)
  • embedding_batch_size (int, optional): batch size for embedding calls (default: 50)
  • max_embedding_concurrency (int, optional): parallel embedding/LLM calls (default: 3)
  • max_indexing_concurrency (int, optional): parallel file indexing (default: 3)
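The two similarity_metric options correspond to standard vector measures, sketched below. This is a generic illustration: how the package converts a Euclidean distance into a score comparable with threshold is not documented here, and cosineSimilarity and euclideanDistance are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

var errDimMismatch = errors.New("vectors must have the same non-zero length")

// cosineSimilarity returns the cosine of the angle between a and b
// (1 = same direction, 0 = orthogonal). Zero vectors are rejected.
func cosineSimilarity(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errDimMismatch
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cosine similarity is undefined for a zero vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

// euclideanDistance returns the straight-line distance between a and b
// (0 = identical; larger means less similar).
func euclideanDistance(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errDimMismatch
	}
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum), nil
}

func main() {
	a, b := []float64{1, 0}, []float64{0, 1}
	cos, _ := cosineSimilarity(a, b)
	dist, _ := euclideanDistance(a, b)
	fmt.Println(cos, dist)
}
```

Note the opposite orientation of the two measures: a higher cosine similarity means a closer match, while a higher Euclidean distance means a weaker one.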

Template Placeholders

Templates use JavaScript template literal syntax (${variable}). The following placeholders are available:

  • ${path} - full source file path
  • ${basename} - base name of the source file
  • ${chunk_index} - numeric index of the chunk
  • ${content} - raw chunk content
  • ${ast_context} - formatted AST metadata (empty when unavailable)

If semantic_prompt is omitted, a sensible default is used.
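Placeholder substitution of this kind can be sketched with strings.NewReplacer from the standard library. This illustrates the ${variable} syntax only: renderPrompt is a hypothetical name, and leaving unknown placeholders untouched is an assumption about behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// renderPrompt replaces each ${name} placeholder with its value.
// Placeholders without a matching entry are left as-is (an assumption).
// Substituted values are not scanned again, so content containing "${...}"
// is inserted literally.
func renderPrompt(tmpl string, vars map[string]string) string {
	pairs := make([]string, 0, len(vars)*2)
	for name, value := range vars {
		pairs = append(pairs, "${"+name+"}", value)
	}
	return strings.NewReplacer(pairs...).Replace(tmpl)
}

func main() {
	tmpl := "Summarize chunk ${chunk_index} of ${basename}:\n${content}"
	vars := map[string]string{
		"path":        "/src/pkg/main.go",
		"basename":    "main.go",
		"chunk_index": "0",
		"content":     "package main",
		"ast_context": "",
	}
	fmt.Println(renderPrompt(tmpl, vars))
}
```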

type DefaultEmbeddingInputBuilder added in v1.9.26

type DefaultEmbeddingInputBuilder struct{}

DefaultEmbeddingInputBuilder returns the raw chunk content unchanged. This is the default used by chunked-embeddings strategy.

func (DefaultEmbeddingInputBuilder) BuildEmbeddingInput added in v1.9.26

func (d DefaultEmbeddingInputBuilder) BuildEmbeddingInput(_ context.Context, _ string, ch chunk.Chunk) (string, error)

BuildEmbeddingInput returns the raw chunk content.

type EmbeddingConfig added in v1.9.26

type EmbeddingConfig struct {
	Provider    provider.Provider
	ModelID     string // Full model ID for pricing (e.g., "openai/text-embedding-3-small")
	ModelsStore *modelsdev.Store
}

EmbeddingConfig holds configuration for creating an embedding provider.

func CreateEmbeddingProvider added in v1.9.26

func CreateEmbeddingProvider(ctx context.Context, modelName string, buildCtx BuildContext) (*EmbeddingConfig, error)

CreateEmbeddingProvider creates an embedding model provider from configuration. Supports "auto" for auto-detection, inline "provider/model" format, or named model references.

type EmbeddingInputBuilder added in v1.9.26

type EmbeddingInputBuilder interface {
	BuildEmbeddingInput(ctx context.Context, sourcePath string, ch chunk.Chunk) (string, error)
}

EmbeddingInputBuilder builds the string that will be sent to the embedding model for a given chunk. This allows strategies to customize the embedded text without changing the stored document content.
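A custom builder might, for example, prepend the file name so the embedding carries some location context. The sketch below is self-contained and therefore redeclares the interface locally: Chunk is a stand-in for chunk.Chunk with a hypothetical Content field, and pathPrefixBuilder, rawBuilder, and buildInput are illustrative names.

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"
)

// Chunk is a stand-in for chunk.Chunk; the Content field is an assumption.
type Chunk struct {
	Content string
}

// EmbeddingInputBuilder mirrors the interface shape shown above,
// with the local Chunk type substituted.
type EmbeddingInputBuilder interface {
	BuildEmbeddingInput(ctx context.Context, sourcePath string, ch Chunk) (string, error)
}

// rawBuilder returns the chunk content unchanged, like the documented default.
type rawBuilder struct{}

func (rawBuilder) BuildEmbeddingInput(_ context.Context, _ string, ch Chunk) (string, error) {
	return ch.Content, nil
}

// pathPrefixBuilder is a hypothetical builder that prefixes the file name.
// Only the embedded text changes; the stored chunk content is untouched.
type pathPrefixBuilder struct{}

func (pathPrefixBuilder) BuildEmbeddingInput(ctx context.Context, sourcePath string, ch Chunk) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err // respect cancellation
	}
	return "File: " + filepath.Base(sourcePath) + "\n\n" + ch.Content, nil
}

// buildInput is a small demo helper that runs a builder with a background context.
func buildInput(b EmbeddingInputBuilder, sourcePath, content string) (string, error) {
	return b.BuildEmbeddingInput(context.Background(), sourcePath, Chunk{Content: content})
}

func main() {
	for _, b := range []EmbeddingInputBuilder{rawBuilder{}, pathPrefixBuilder{}} {
		text, err := buildInput(b, "/src/pkg/main.go", "package main")
		fmt.Printf("%q %v\n", text, err)
	}
}
```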

type Strategy

type Strategy interface {
	// Initialize indexes all documents from the given paths.
	Initialize(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

	// Query searches for relevant documents using the strategy's retrieval method.
	// numResults is the maximum number of candidates to retrieve (before fusion).
	Query(ctx context.Context, query string, numResults int, threshold float64) ([]database.SearchResult, error)

	// CheckAndReindexChangedFiles checks for file changes and re-indexes if needed.
	CheckAndReindexChangedFiles(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

	// StartFileWatcher starts monitoring files for changes.
	StartFileWatcher(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

	// Close releases resources held by the strategy.
	Close() error
}

Strategy defines the interface for different retrieval strategies. This is the canonical definition used by both the strategies and rag packages.
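The numResults and threshold arguments of Query suggest a filter-then-truncate step over scored candidates. The sketch below shows one way that could look; it is not the package's implementation. The result type stands in for database.SearchResult, topK is a hypothetical name, and treating threshold as an inclusive minimum score is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// result is a stand-in for database.SearchResult; both fields are hypothetical.
type result struct {
	ID    string
	Score float64
}

// topK keeps results scoring at least threshold, orders them by descending
// score, and returns at most numResults of them. The input slice is not modified.
func topK(results []result, numResults int, threshold float64) []result {
	kept := make([]result, 0, len(results))
	for _, r := range results {
		if r.Score >= threshold {
			kept = append(kept, r)
		}
	}
	// Stable sort keeps the original order among equal scores.
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })

	if numResults >= 0 && len(kept) > numResults {
		kept = kept[:numResults]
	}
	return kept
}

func main() {
	candidates := []result{{"a", 0.42}, {"b", 0.91}, {"c", 0.67}, {"d", 0.55}}
	fmt.Println(topK(candidates, 2, 0.5))
}
```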

type VectorSearchResultData added in v1.9.26

type VectorSearchResultData struct {
	database.Document
	Embedding      []float64
	EmbeddingInput string // Only populated for semantic-embeddings
	Similarity     float64
}

VectorSearchResultData is the internal search result type used by VectorStore. It contains base document data plus similarity score.

type VectorStore added in v1.9.26

type VectorStore struct {
	// contains filtered or unexported fields
}

VectorStore provides shared embedding-based indexing and retrieval infrastructure. This is NOT a standalone strategy - it's infrastructure that strategies use.

func NewVectorStore added in v1.9.26

func NewVectorStore(cfg VectorStoreConfig) *VectorStore

NewVectorStore creates a new vector store with the given configuration.

func (*VectorStore) CheckAndReindexChangedFiles added in v1.9.26

func (s *VectorStore) CheckAndReindexChangedFiles(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

CheckAndReindexChangedFiles checks for file changes and re-indexes if needed.

func (*VectorStore) Close added in v1.9.26

func (s *VectorStore) Close() error

Close releases resources.

func (*VectorStore) GetIndexingUsage added in v1.9.26

func (s *VectorStore) GetIndexingUsage() (tokens int64, cost float64)

GetIndexingUsage returns the token count and cost accumulated during indexing.

func (*VectorStore) Initialize added in v1.9.26

func (s *VectorStore) Initialize(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

Initialize indexes all documents.

func (*VectorStore) Query added in v1.9.26

func (s *VectorStore) Query(ctx context.Context, query string, numResults int, threshold float64) ([]database.SearchResult, error)

Query searches for relevant documents using vector similarity.

func (*VectorStore) RecordUsage added in v1.9.26

func (s *VectorStore) RecordUsage(tokens int64, cost float64)

RecordUsage records usage and emits a usage event with cumulative totals. This is exported so strategies can track additional usage (e.g., semantic LLM calls).
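Cumulative usage tracking that is safe under concurrent indexing can be sketched with a mutex-guarded accumulator. This is an illustration under assumptions, not the VectorStore implementation: usageTracker and its methods are hypothetical, and event emission is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// usageTracker accumulates token and cost totals across goroutines.
type usageTracker struct {
	mu     sync.Mutex
	tokens int64
	cost   float64
}

// Record adds one usage sample and returns the cumulative totals, which a
// caller could attach to a usage event.
func (u *usageTracker) Record(tokens int64, cost float64) (totalTokens int64, totalCost float64) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.tokens += tokens
	u.cost += cost
	return u.tokens, u.cost
}

// Totals returns the current cumulative usage.
func (u *usageTracker) Totals() (tokens int64, cost float64) {
	u.mu.Lock()
	defer u.mu.Unlock()
	return u.tokens, u.cost
}

func main() {
	var tracker usageTracker
	var wg sync.WaitGroup

	// Simulate embedding calls and extra LLM calls reporting usage in parallel.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tracker.Record(100, 0.25)
		}()
	}
	wg.Wait()

	fmt.Println(tracker.Totals())
}
```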

func (*VectorStore) SetEmbeddingInputBuilder added in v1.9.26

func (s *VectorStore) SetEmbeddingInputBuilder(builder EmbeddingInputBuilder)

SetEmbeddingInputBuilder allows callers to override how text is prepared before being sent to the embedding model. Passing nil resets to the default behavior (raw chunk content).

func (*VectorStore) StartFileWatcher added in v1.9.26

func (s *VectorStore) StartFileWatcher(ctx context.Context, docPaths []string, chunking ChunkingConfig) error

StartFileWatcher starts monitoring files for changes.

type VectorStoreConfig added in v1.9.26

type VectorStoreConfig struct {
	Name                 string
	Database             vectorStoreDB
	Embedder             *embed.Embedder
	Events               chan<- types.Event
	SimilarityMetric     string
	ModelID              string
	ModelsStore          modelStore
	EmbeddingConcurrency int
	FileIndexConcurrency int
	Chunking             ChunkingConfig
	ShouldIgnore         func(path string) bool // Optional filter for gitignore support
}

VectorStoreConfig holds configuration for creating a VectorStore.
