Documentation ¶
Overview ¶
Package rerank holds the optional read-side rerank stage of recall. After hybrid retrieval and composite ranking, a reranker reads the query and the candidates together, which embeddings cannot do, and reorders the candidates by how well each answers the query. Two implementations share the Reranker contract: a cross-encoder ranking model behind a /rerank endpoint (CrossEncoder) and a chat LLM prompted over a numbered list (NewLLM).
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Completer ¶ added in v0.0.6
Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.
type Config ¶
type Config struct {
// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
BaseURL string
Model string
APIKey string
// MaxDocChars truncates each document before sending. 0 disables.
MaxDocChars int
// MaxBatchChars caps the total characters across query and documents in
// a single /rerank request. Set just below the model's effective context
// in characters (≈ n_ctx × chars-per-token × (1 − template reserve);
// ~4000 for a 1024-token model). 0 disables proactive batching.
MaxBatchChars int
HTTPClient *http.Client
}
Config configures the cross-encoder reranker client.
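The sizing rule in the MaxBatchChars comment can be worked through as a small calculation. The sketch below is illustrative only: batchChars is a hypothetical helper that is not part of this package, and the figures of 4 characters per token and a 2% template reserve are assumptions chosen to land near the ~4000 quoted for a 1024-token model.

```go
package main

import "fmt"

// batchChars is a hypothetical helper (not part of package rerank). It
// applies the rule from the MaxBatchChars comment:
// n_ctx × chars-per-token × (1 − template reserve).
func batchChars(nCtx int, charsPerToken, templateReserve float64) int {
	return int(float64(nCtx) * charsPerToken * (1 - templateReserve))
}

func main() {
	// Assumed figures: 4 characters per token, 2% reserved for the template.
	fmt.Println(batchChars(1024, 4, 0.02))
}
```

The result would then be assigned to Config.MaxBatchChars. Measure the chars-per-token ratio on representative documents rather than relying on the assumed value.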
type CrossEncoder ¶
type CrossEncoder struct {
// contains filtered or unexported fields
}
CrossEncoder reranks candidates with a dedicated ranking model served over the Cohere-style /rerank API (Infinity, vLLM, TEI, llama-server --rerank).
func New ¶
func New(cfg Config) (*CrossEncoder, error)
func (*CrossEncoder) Rerank ¶
func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
Rerank scores every candidate against the query and returns candidate IDs most-relevant-first. Candidates the server omits are dropped. When the payload exceeds MaxBatchChars, candidates are split across multiple requests and merged by score.
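The splitting and merging described above might look roughly like the following. This is a sketch under stated assumptions, not the package's code: the Candidate fields (ID, Text), the greedy packing strategy, and the helper names splitBatches and mergeByScore are all hypothetical. It also assumes scores from separate requests are directly comparable.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a stand-in. The real type's fields are not shown in this
// documentation, so ID and Text are assumptions.
type Candidate struct {
	ID   string
	Text string
}

// splitBatches greedily packs candidates so that len(query) plus the document
// lengths in each batch stays within maxChars. A single oversized document
// still gets a batch of its own. maxChars <= 0 disables splitting.
// Note: len counts bytes; a real client might count runes instead.
func splitBatches(query string, cands []Candidate, maxChars int) [][]Candidate {
	if len(cands) == 0 {
		return nil
	}
	if maxChars <= 0 {
		return [][]Candidate{cands}
	}
	var batches [][]Candidate
	var cur []Candidate
	used := len(query)
	for _, c := range cands {
		if len(cur) > 0 && used+len(c.Text) > maxChars {
			batches = append(batches, cur)
			cur, used = nil, len(query)
		}
		cur = append(cur, c)
		used += len(c.Text)
	}
	return append(batches, cur)
}

// mergeByScore flattens per-batch scores into IDs ordered best-first.
func mergeByScore(perBatch []map[string]float64) []string {
	all := map[string]float64{}
	for _, m := range perBatch {
		for id, s := range m {
			all[id] = s
		}
	}
	ids := make([]string, 0, len(all))
	for id := range all {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if all[ids[i]] != all[ids[j]] {
			return all[ids[i]] > all[ids[j]]
		}
		return ids[i] < ids[j] // deterministic tie-break
	})
	return ids
}

func main() {
	cands := []Candidate{{"a", "aaaa"}, {"b", "bbbb"}, {"c", "cc"}}
	fmt.Println(len(splitBatches("q?", cands, 8)))
	fmt.Println(mergeByScore([]map[string]float64{{"a": 0.2, "b": 0.9}, {"c": 0.5}}))
}
```

Each batch repeats the query, so the query length is charged against every batch's budget.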
func (*CrossEncoder) Scores ¶ added in v0.4.13
func (c *CrossEncoder) Scores(ctx context.Context, query string, candidates []Candidate) (map[string]float64, error)
Scores returns each candidate's raw relevance score (id → score) from the reranker, applying the same per-doc truncation and batch splitting as Rerank but keeping the scores instead of collapsing them to an order. Candidates the server omits are absent from the map. Exposed for offline analysis (e.g. the rerank-gate separability bench); the production recall path uses Rerank.
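As an illustration of the offline analysis that raw scores allow, the sketch below computes a simple gap between relevant and non-relevant candidates from an id → score map of the shape Scores returns. The scoreGap function and its metric are hypothetical, and the bench mentioned above is not shown here and may measure separability differently.

```go
package main

import (
	"fmt"
	"math"
)

// scoreGap is a hypothetical analysis helper. It returns the lowest score
// among relevant IDs minus the highest score among the remaining IDs. A
// positive gap means a single threshold separates the two groups. IDs absent
// from scores (omitted by the server) are simply not counted. ok is false
// when either group has no scored members.
func scoreGap(scores map[string]float64, relevant map[string]bool) (gap float64, ok bool) {
	minRel, maxIrr := math.Inf(1), math.Inf(-1)
	for id, s := range scores {
		if relevant[id] {
			minRel = math.Min(minRel, s)
		} else {
			maxIrr = math.Max(maxIrr, s)
		}
	}
	if math.IsInf(minRel, 1) || math.IsInf(maxIrr, -1) {
		return 0, false
	}
	return minRel - maxIrr, true
}

func main() {
	// "d" is labeled relevant but was omitted by the server, so it has no score.
	scores := map[string]float64{"a": 1.0, "b": 0.75, "c": 0.25}
	relevant := map[string]bool{"a": true, "b": true, "d": true}
	gap, ok := scoreGap(scores, relevant)
	fmt.Println(gap, ok)
}
```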
type Limited ¶ added in v0.4.0
type Limited struct {
// contains filtered or unexported fields
}
Limited caps the number of in-flight Rerank calls made to a wrapped reranker (inner). A cap of max <= 0 returns inner unchanged. onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.
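One common way to build such a cap is a buffered channel used as a semaphore. The sketch below is not the package's Limited: the limited type, the newLimited constructor and its parameter order, the Candidate fields, and the choice to give up waiting when the context is cancelled are all assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// Candidate and Reranker are local stand-ins; the Candidate fields are assumed.
type Candidate struct{ ID, Text string }

type Reranker interface {
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

// limited is a hypothetical concurrency cap around an inner Reranker.
type limited struct {
	inner      Reranker
	slots      chan struct{} // buffered; capacity is the in-flight cap
	inFlight   int64         // updated atomically
	onInFlight func(int)
}

// newLimited returns inner unchanged when max <= 0.
func newLimited(inner Reranker, max int, onInFlight func(int)) Reranker {
	if max <= 0 {
		return inner
	}
	return &limited{inner: inner, slots: make(chan struct{}, max), onInFlight: onInFlight}
}

func (l *limited) report(n int64) {
	if l.onInFlight != nil {
		l.onInFlight(int(n))
	}
}

func (l *limited) Rerank(ctx context.Context, query string, cands []Candidate) ([]string, error) {
	select {
	case l.slots <- struct{}{}: // acquire a slot
	case <-ctx.Done(): // assumption: stop waiting if the caller cancels
		return nil, ctx.Err()
	}
	l.report(atomic.AddInt64(&l.inFlight, 1))
	defer func() {
		l.report(atomic.AddInt64(&l.inFlight, -1))
		<-l.slots // release the slot
	}()
	return l.inner.Rerank(ctx, query, cands)
}

// echo is a trivial inner reranker that returns IDs in input order.
type echo struct{}

func (echo) Rerank(_ context.Context, _ string, cands []Candidate) ([]string, error) {
	ids := make([]string, len(cands))
	for i, c := range cands {
		ids[i] = c.ID
	}
	return ids, nil
}

// run exercises the wrapper once and records the reported in-flight counts.
func run() (ids []string, seen []int, unchanged bool) {
	r := newLimited(echo{}, 2, func(n int) { seen = append(seen, n) })
	ids, _ = r.Rerank(context.Background(), "q", []Candidate{{ID: "x"}})
	_, unchanged = newLimited(echo{}, 0, nil).(echo)
	return ids, seen, unchanged
}

func main() {
	fmt.Println(run())
}
```

The counter is decremented before the slot is released, so the reported count never exceeds the cap.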
type Reranker ¶ added in v0.0.6
type Reranker interface {
// Rerank returns candidate IDs ordered most-relevant-first. Backends may
// return fewer IDs than the input slice; candidates omitted from the output
// are treated as irrelevant and DROPPED from the result set. An empty slice
// means no candidate is relevant.
Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}
Reranker reorders retrieved candidates by how well each answers a query, and MAY drop irrelevant candidates entirely. Reranking is not reorder-only, so callers must not assume the output contains every input ID.
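To make the drop semantics concrete, here is a toy implementation of the contract and a caller that applies the returned order. It is a sketch only: the keyword type, the apply helper, and the Candidate fields (ID, Text) are hypothetical, and substring counting stands in for a real relevance model.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"strings"
)

// Candidate and Reranker are local stand-ins; the Candidate fields are assumed.
type Candidate struct{ ID, Text string }

type Reranker interface {
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

// keyword is a toy Reranker: it orders candidates by how often the query
// terms occur in their text and omits candidates with no occurrence.
type keyword struct{}

func (keyword) Rerank(_ context.Context, query string, cands []Candidate) ([]string, error) {
	terms := strings.Fields(strings.ToLower(query))
	hits := map[string]int{}
	var ids []string
	for _, c := range cands {
		text := strings.ToLower(c.Text)
		n := 0
		for _, t := range terms {
			n += strings.Count(text, t)
		}
		if n > 0 { // zero hits: omit the ID, which drops the candidate
			hits[c.ID] = n
			ids = append(ids, c.ID)
		}
	}
	sort.SliceStable(ids, func(i, j int) bool { return hits[ids[i]] > hits[ids[j]] })
	return ids, nil
}

// apply keeps only the candidates named in ids, in that order.
func apply(cands []Candidate, ids []string) []Candidate {
	byID := make(map[string]Candidate, len(cands))
	for _, c := range cands {
		byID[c.ID] = c
	}
	out := make([]Candidate, 0, len(ids))
	for _, id := range ids {
		if c, ok := byID[id]; ok {
			out = append(out, c)
		}
	}
	return out
}

func rerankDemo() []Candidate {
	cands := []Candidate{
		{ID: "a", Text: "go modules"},
		{ID: "b", Text: "rerank with a cross-encoder, then rerank again"},
		{ID: "c", Text: "how to rerank"},
	}
	var r Reranker = keyword{}
	ids, _ := r.Rerank(context.Background(), "rerank", cands)
	return apply(cands, ids)
}

func main() {
	for _, c := range rerankDemo() {
		fmt.Println(c.ID)
	}
}
```

Candidate "a" never mentions the query term, so its ID is omitted and apply removes it from the result set.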