rerank

package
v0.4.0
Published: Jun 16, 2026 | License: AGPL-3.0 | Imports: 13 | Imported by: 0

Documentation

Overview

Package rerank holds the optional read-side rerank stage of recall. After hybrid retrieval and composite ranking, a reranker reads the query and the candidates together, which embedding similarity cannot do because the query and the documents are embedded separately, and reorders the candidates by how well each answers the query. Two implementations share the Reranker contract: a cross-encoder ranking model behind a /rerank endpoint (CrossEncoder, built with New) and a chat LLM prompted over a numbered list (built with NewLLM).

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Candidate added in v0.0.6

type Candidate struct {
	ID      string
	Content string
}

Candidate is one memory offered to a reranker.

type Completer added in v0.0.6

type Completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.

type Config

type Config struct {
	// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
	BaseURL string
	Model   string
	APIKey  string
	// MaxDocChars truncates each document before sending so an oversized
	// candidate can't blow the server's physical batch and fail the whole
	// request. 0 disables truncation.
	MaxDocChars int
	HTTPClient  *http.Client
}

Config configures the cross-encoder reranker client.
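
The MaxDocChars comment above describes truncating each document before it is sent. A minimal sketch of such a cap follows. The helper name truncateDoc is hypothetical, and counting runes rather than bytes is an assumption; the package may count differently.

```go
package main

import "fmt"

// truncateDoc is a hypothetical helper, not part of the package. It caps doc
// at max characters (counted as runes, an assumption) and treats max <= 0 as
// "truncation disabled", matching the documented meaning of 0.
func truncateDoc(doc string, max int) string {
	if max <= 0 {
		return doc
	}
	r := []rune(doc)
	if len(r) <= max {
		return doc
	}
	// Slicing runes avoids cutting a multi-byte UTF-8 character in half.
	return string(r[:max])
}

func main() {
	fmt.Println(truncateDoc("héllo wörld", 5)) // prints héllo
	fmt.Println(truncateDoc("short", 0))       // prints short
}
```

Cutting on rune boundaries keeps the truncated document valid UTF-8, which matters when the body is later JSON-encoded for the server.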

type CrossEncoder

type CrossEncoder struct {
	// contains filtered or unexported fields
}

CrossEncoder reranks candidates with a dedicated ranking model (bge-reranker, Qwen3-Reranker, mxbai-rerank, …) served over the Cohere-style /rerank API that Infinity, vLLM, TEI, and llama-server --rerank expose. It is a cheaper alternative to the LLM reranker.
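
For orientation, a Cohere-style /rerank exchange can be sketched as below. The struct and function names are hypothetical, the model name is a placeholder, and the JSON field names follow the Cohere convention (model, query, documents in the request; results with index and relevance_score in the response). Individual servers may differ, so treat the shapes as an assumption rather than this package's wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// rerankRequest is an assumed Cohere-style request body.
type rerankRequest struct {
	Model     string   `json:"model"`
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
}

// rerankResponse is an assumed Cohere-style response body. Each result
// points back at a document by its position in the request.
type rerankResponse struct {
	Results []struct {
		Index          int     `json:"index"`
		RelevanceScore float64 `json:"relevance_score"`
	} `json:"results"`
}

// rankedIndexes decodes a response body and returns the document indexes
// ordered by descending relevance score.
func rankedIndexes(body []byte) ([]int, error) {
	var resp rerankResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	sort.SliceStable(resp.Results, func(i, j int) bool {
		return resp.Results[i].RelevanceScore > resp.Results[j].RelevanceScore
	})
	idx := make([]int, len(resp.Results))
	for i, r := range resp.Results {
		idx[i] = r.Index
	}
	return idx, nil
}

func main() {
	req, _ := json.Marshal(rerankRequest{
		Model:     "example-reranker", // placeholder model name
		Query:     "q",
		Documents: []string{"d0", "d1"},
	})
	fmt.Println(string(req))

	body := []byte(`{"results":[{"index":0,"relevance_score":0.1},{"index":1,"relevance_score":0.9}]}`)
	idx, err := rankedIndexes(body)
	fmt.Println(idx, err) // prints [1 0] <nil>
}
```

Sorting client-side rather than trusting the server's result order is a defensive choice in this sketch; the returned indexes would then be mapped back to candidate IDs.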

func New

func New(cfg Config) (*CrossEncoder, error)

New builds a cross-encoder reranker client. BaseURL is required.

func (*CrossEncoder) Rerank

func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

Rerank scores every candidate against the query with the ranking model and returns the candidate IDs most-relevant-first. Candidates the server omits are appended in their original order, satisfying the reorder-only contract of Reranker.
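
The reorder-only contract described above can be illustrated with a small merge step. This is a sketch, not the package's code: completeOrder is a hypothetical helper, the Candidate type is redeclared locally so the example stands alone, and skipping unknown or repeated IDs from the backend is an assumption.

```go
package main

import "fmt"

// Candidate mirrors the package's Candidate type so the sketch stands alone.
type Candidate struct {
	ID      string
	Content string
}

// completeOrder is a hypothetical helper. It keeps the backend's ranked IDs
// (dropping unknown or repeated ones, an assumption) and then appends every
// candidate the backend omitted, in original input order.
func completeOrder(ranked []string, candidates []Candidate) []string {
	known := make(map[string]bool, len(candidates))
	for _, c := range candidates {
		known[c.ID] = true
	}
	seen := make(map[string]bool, len(candidates))
	out := make([]string, 0, len(candidates))
	for _, id := range ranked {
		if known[id] && !seen[id] {
			seen[id] = true
			out = append(out, id)
		}
	}
	// Anything the backend did not mention keeps its original relative order.
	for _, c := range candidates {
		if !seen[c.ID] {
			seen[c.ID] = true
			out = append(out, c.ID)
		}
	}
	return out
}

func main() {
	cands := []Candidate{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	fmt.Println(completeOrder([]string{"c", "zzz", "c"}, cands)) // prints [c a b]
}
```

With this shape, a backend that returns only a top-k list or a partially malformed answer still yields a permutation of the input IDs.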

type Limited added in v0.4.0

type Limited struct {
	// contains filtered or unexported fields
}

Limited caps the number of in-flight Rerank calls. It is built with NewLimited: when max <= 0, NewLimited returns inner unchanged. onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.

func (*Limited) Max added in v0.4.0

func (l *Limited) Max() int

func (*Limited) Rerank added in v0.4.0

func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

type Reranker added in v0.0.6

type Reranker interface {
	// Rerank returns candidate IDs ordered most-relevant-first. Candidates the
	// backend omits are appended in their original order, so reranking can only
	// reorder, never drop, the input.
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

Reranker reorders retrieved candidates by how well each answers a query.

func NewLLM added in v0.0.6

func NewLLM(c Completer) Reranker

NewLLM builds a reranker over a chat backend. Each candidate's content is capped in length to keep the prompt small; otherwise a deep candidate pool of long memories can exhaust a RAM-limited local server's context or activation budget.
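
One way an LLM reranker over a numbered list might work is sketched below. Everything here is illustrative: buildPrompt, parseOrder, cannedCompleter, the prompt wording, and the reply format (numbers separated by any non-digit text) are assumptions, and the Candidate and Completer types are redeclared locally so the example runs on its own.

```go
package main

import (
	"context"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

type Candidate struct{ ID, Content string }

type Completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

var numberRE = regexp.MustCompile(`\d+`)

// buildPrompt renders the query and a 1-based numbered list of candidates,
// cutting each content to maxChars runes (the per-candidate cap).
func buildPrompt(query string, cs []Candidate, maxChars int) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Query: %s\n", query)
	for i, c := range cs {
		r := []rune(c.Content)
		if maxChars > 0 && len(r) > maxChars {
			r = r[:maxChars]
		}
		fmt.Fprintf(&b, "%d. %s\n", i+1, string(r))
	}
	return b.String()
}

// parseOrder maps the numbers found in reply back to candidate IDs,
// ignoring out-of-range and repeated numbers.
func parseOrder(reply string, cs []Candidate) []string {
	seen := make(map[int]bool)
	var ids []string
	for _, tok := range numberRE.FindAllString(reply, -1) {
		n, err := strconv.Atoi(tok)
		if err != nil || n < 1 || n > len(cs) || seen[n] {
			continue
		}
		seen[n] = true
		ids = append(ids, cs[n-1].ID)
	}
	return ids
}

// cannedCompleter is a test double that always returns the same reply.
type cannedCompleter string

func (c cannedCompleter) Complete(context.Context, string, string) (string, error) {
	return string(c), nil
}

func main() {
	cs := []Candidate{{"m1", "cats purr"}, {"m2", "dogs bark"}, {"m3", "fish swim"}}
	var llm Completer = cannedCompleter("2, 3, 2, 9")
	reply, err := llm.Complete(context.Background(),
		"Rank the passages by relevance.", buildPrompt("which pet barks?", cs, 40))
	if err != nil {
		panic(err)
	}
	fmt.Println(parseOrder(reply, cs)) // prints [m2 m3]
}
```

In this sketch m1 is missing from the parsed order; under the Reranker contract it would be appended afterwards in its original position among the omitted candidates.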

func NewLimited added in v0.4.0

func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker
