Documentation ¶
Overview ¶
Package fusion provides result fusion strategies for combining multiple retrieval results. It implements algorithms that merge results from different retrieval methods or sources into a single, deduplicated result set.
The package includes Reciprocal Rank Fusion (RRF), a robust algorithm for combining ranked lists that doesn't require score normalization.
Types ¶
type RRFFusionEngine ¶
type RRFFusionEngine struct {
// contains filtered or unexported fields
}
RRFFusionEngine implements Reciprocal Rank Fusion for combining multiple result sets. RRF is a simple yet effective method for merging ranked lists that computes scores based on the rank positions rather than the raw similarity scores.
The RRF formula is: RRF_Score(d) = Σ_i 1/(k + rank_i(d)), summed over every result list i that contains d, where:
- d is a document/chunk
- rank_i(d) is the rank of d in the i-th result list
- k is a smoothing constant (default: 60)
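As a worked illustration of the formula, the sketch below computes the fused score of a chunk that holds rank 1 in one list and rank 3 in another, with k = 60. It assumes ranks start at 1, which this documentation does not state, and rrfScore is a hypothetical helper rather than part of the package.

```go
package main

import "fmt"

// rrfScore is a hypothetical helper (not part of package fusion).
// It sums 1/(k + rank) over the ranks a chunk holds in each result list.
// Ranks are assumed to be 1-based.
func rrfScore(k float64, ranks ...int) float64 {
	var score float64
	for _, r := range ranks {
		score += 1.0 / (k + float64(r))
	}
	return score
}

func main() {
	// Rank 1 in the first list and rank 3 in the second: 1/61 + 1/63.
	fmt.Printf("%.6f\n", rrfScore(60, 1, 3)) // prints 0.032266
	// Rank 1 in a single list only: 1/61.
	fmt.Printf("%.6f\n", rrfScore(60, 1)) // prints 0.016393
}
```

Under these assumptions, a chunk returned by both lists scores roughly twice as high as one returned at the top of a single list, which is why agreement between retrievers dominates the fused ranking.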
Advantages of RRF:
- Score normalization not required
- Handles different retrieval methods gracefully
- Robust to outliers and score scale differences
- Simple and computationally efficient
Example:
    engine := fusion.NewRRFFusionEngine()
    results, err := engine.ReciprocalRankFusion(ctx, [][]*core.Chunk{
        vectorResults,  // Results from vector search
        keywordResults, // Results from keyword search
        graphResults,   // Results from graph traversal
    }, 10)
func NewRRFFusionEngine ¶
func NewRRFFusionEngine() *RRFFusionEngine
NewRRFFusionEngine creates a new RRF fusion engine with the standard k=60. The value k=60 is recommended by the original RRF paper and works well across different domains and retrieval systems.
Returns:
- *RRFFusionEngine: Configured fusion engine
Example:
    engine := fusion.NewRRFFusionEngine()
    fused, err := engine.ReciprocalRankFusion(ctx, resultSets, 10)
func (*RRFFusionEngine) Fuse ¶
func (e *RRFFusionEngine) Fuse(ctx context.Context, resultSets [][]*core.Chunk, topK int) ([]*core.Chunk, error)
Fuse performs a simple merge of multiple result sets without ranking. It deduplicates chunks by ID and returns up to topK results. This is a basic fusion method; prefer ReciprocalRankFusion when the order of the merged results matters.
Parameters:
- ctx: Context for cancellation (currently unused)
- resultSets: Multiple result sets to merge
- topK: Maximum number of results to return
Returns:
- []*core.Chunk: Merged and deduplicated results (up to topK)
- error: Always nil (included for interface compatibility)
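The deduplicating merge described above might look roughly like the following sketch. It is not the package's implementation: Chunk is a hypothetical stand-in for core.Chunk with only an assumed ID field, mergeDedup is an illustrative name, and keeping the first occurrence of each ID in input order is an assumption about behavior the documentation does not specify.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for core.Chunk; only an ID field is assumed.
type Chunk struct {
	ID string
}

// mergeDedup is an illustrative sketch, not the package's implementation.
// It walks the result sets in order, keeps the first occurrence of each ID,
// and stops once topK chunks have been collected.
func mergeDedup(resultSets [][]*Chunk, topK int) []*Chunk {
	seen := make(map[string]bool)
	merged := make([]*Chunk, 0)
	for _, set := range resultSets {
		for _, c := range set {
			if len(merged) >= topK {
				return merged
			}
			if c == nil || seen[c.ID] {
				continue // skip nil entries and IDs already collected
			}
			seen[c.ID] = true
			merged = append(merged, c)
		}
	}
	return merged
}

func main() {
	a := []*Chunk{{ID: "c1"}, {ID: "c2"}}
	b := []*Chunk{{ID: "c2"}, {ID: "c3"}}
	for _, c := range mergeDedup([][]*Chunk{a, b}, 10) {
		fmt.Println(c.ID) // prints c1, c2, c3 on separate lines
	}
}
```

Because no scores are computed, the output order in this sketch simply follows the order of the input sets, so earlier result sets are favored when topK truncates the list.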
func (*RRFFusionEngine) ReciprocalRankFusion ¶
func (e *RRFFusionEngine) ReciprocalRankFusion(ctx context.Context, resultSets [][]*core.Chunk, topK int) ([]*core.Chunk, error)
ReciprocalRankFusion merges results from different retrieval methods using the RRF algorithm. It computes a fused score for each chunk based on its rank position in each result set, then returns the chunks sorted by their fused scores.
Parameters:
- ctx: Context for cancellation (currently unused)
- resultSets: Multiple ranked result sets to fuse
- topK: Maximum number of results to return
Returns:
- []*core.Chunk: Fused and ranked results (up to topK)
- error: Always nil (included for interface compatibility)
The algorithm:
- For each chunk in each result set, compute RRF score: 1/(k + rank)
- Sum RRF scores across all result sets for each unique chunk
- Sort chunks by their total RRF score (descending)
- Return topK chunks
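The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's source: Chunk is a hypothetical stand-in for core.Chunk with only an ID field, rrfFuse is an illustrative name, ranks are assumed to be 1-based, and ties are broken by first appearance, which the documentation does not specify.

```go
package main

import (
	"fmt"
	"sort"
)

// Chunk is a hypothetical stand-in for core.Chunk; only an ID field is assumed.
type Chunk struct {
	ID string
}

// rrfFuse is an illustrative sketch of Reciprocal Rank Fusion.
// Ranks are assumed to start at 1; k is the smoothing constant.
func rrfFuse(resultSets [][]*Chunk, k float64, topK int) []*Chunk {
	scores := make(map[string]float64) // summed RRF score per chunk ID
	byID := make(map[string]*Chunk)    // first chunk seen for each ID
	order := make([]string, 0)         // IDs in first-seen order

	// Steps 1 and 2: add 1/(k + rank) for every appearance of a chunk.
	for _, set := range resultSets {
		for i, c := range set {
			if c == nil {
				continue
			}
			if _, ok := byID[c.ID]; !ok {
				byID[c.ID] = c
				order = append(order, c.ID)
			}
			scores[c.ID] += 1.0 / (k + float64(i+1))
		}
	}

	// Step 3: sort by total score, descending; the stable sort keeps
	// first-seen order for equal scores.
	sort.SliceStable(order, func(a, b int) bool {
		return scores[order[a]] > scores[order[b]]
	})

	// Step 4: keep at most topK chunks.
	if topK < 0 {
		topK = 0
	}
	if len(order) > topK {
		order = order[:topK]
	}
	fused := make([]*Chunk, 0, len(order))
	for _, id := range order {
		fused = append(fused, byID[id])
	}
	return fused
}

func main() {
	c1, c2, c3, c4 := &Chunk{"c1"}, &Chunk{"c2"}, &Chunk{"c3"}, &Chunk{"c4"}
	vector := []*Chunk{c1, c2, c3}
	keyword := []*Chunk{c2, c4, c1}
	for _, c := range rrfFuse([][]*Chunk{vector, keyword}, 60, 5) {
		fmt.Println(c.ID) // prints c2, c1, c4, c3 on separate lines
	}
}
```

In this sketch c2 ranks first because it holds ranks 2 and 1 (1/62 + 1/61), narrowly ahead of c1 with ranks 1 and 3 (1/61 + 1/63); chunks found by only one retriever trail both.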
Example:
    vectorResults := []*core.Chunk{chunk1, chunk2, chunk3}
    keywordResults := []*core.Chunk{chunk2, chunk4, chunk1}
    fused, _ := engine.ReciprocalRankFusion(ctx, [][]*core.Chunk{
        vectorResults, keywordResults,
    }, 5)