Documentation ¶
Overview ¶
Package rerank holds the optional read-side rerank stage of recall. After hybrid retrieval and composite ranking, a reranker reads the query and the candidates together, which embedding similarity alone cannot do, and reorders them by how well each answers the query. Two implementations share the Reranker contract: a cross-encoder ranking model behind a /rerank endpoint (CrossEncoder) and a chat LLM prompted over a numbered list of candidates (NewLLM).
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Completer ¶ added in v0.0.6
Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.
type Config ¶
type Config struct {
	// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
	BaseURL string
	Model   string
	APIKey  string
	// MaxDocChars truncates each document before sending. 0 disables.
	MaxDocChars int
	// MaxBatchChars caps the total characters across query and documents in
	// a single /rerank request. Set just below the model's effective context
	// in characters (≈ n_ctx × chars-per-token × (1 − template reserve);
	// ~4000 for a 1024-token model). 0 disables proactive batching.
	MaxBatchChars int
	HTTPClient    *http.Client
}
Config configures the cross-encoder reranker client.
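The rule of thumb in the MaxBatchChars comment can be worked through as plain arithmetic. The sketch below is illustrative only: estimateMaxBatchChars is a hypothetical helper, and the 4 characters per token and 2% template reserve are assumed values that depend on the tokenizer and the server's prompt template.

```go
package main

import "fmt"

// estimateMaxBatchChars applies n_ctx × chars-per-token × (1 − reserve).
// charsPerToken and reserve are assumptions; measure them for your model.
func estimateMaxBatchChars(nCtx int, charsPerToken, reserve float64) int {
	return int(float64(nCtx) * charsPerToken * (1 - reserve))
}

func main() {
	// A 1024-token context at roughly 4 chars per token with 2% reserved
	// lands close to the ~4000 figure quoted in the comment.
	fmt.Println(estimateMaxBatchChars(1024, 4.0, 0.02))
}
```

Text that tokenizes densely (code, non-Latin scripts) yields fewer characters per token, so a lower charsPerToken is the safer choice when the corpus is mixed.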
type CrossEncoder ¶
type CrossEncoder struct {
	// contains filtered or unexported fields
}
CrossEncoder reranks candidates with a dedicated ranking model served over the Cohere-style /rerank API (Infinity, vLLM, TEI, llama-server --rerank).
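For orientation, a Cohere-style rerank exchange can be sketched as below. This is not the package's client: the JSON field names (model, query, documents, results, index, relevance_score) follow the Cohere rerank API as commonly implemented and are assumptions here, individual servers may differ, and postRerank and fakeServer are hypothetical helpers. A local test server stands in for a real backend.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// rerankRequest and rerankResponse mirror the assumed wire format.
type rerankRequest struct {
	Model     string   `json:"model"`
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
}

type rerankResponse struct {
	Results []struct {
		Index          int     `json:"index"` // position in Documents
		RelevanceScore float64 `json:"relevance_score"`
	} `json:"results"`
}

// postRerank appends "/rerank" to baseURL and posts one request.
func postRerank(client *http.Client, baseURL, model, query string, docs []string) (rerankResponse, error) {
	var out rerankResponse
	body, err := json.Marshal(rerankRequest{Model: model, Query: query, Documents: docs})
	if err != nil {
		return out, err
	}
	url := strings.TrimRight(baseURL, "/") + "/rerank"
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return out, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return out, fmt.Errorf("rerank: unexpected status %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out, err
}

// fakeServer returns a canned response so the sketch runs without a backend.
func fakeServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/rerank" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"results":[{"index":1,"relevance_score":0.9},{"index":0,"relevance_score":0.1}]}`)
	}))
}

func main() {
	srv := fakeServer()
	defer srv.Close()
	res, err := postRerank(srv.Client(), srv.URL+"/v1", "some-model", "q", []string{"doc a", "doc b"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range res.Results {
		fmt.Println(r.Index, r.RelevanceScore)
	}
}
```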
func New ¶
func New(cfg Config) (*CrossEncoder, error)
func (*CrossEncoder) Rerank ¶
func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
Rerank scores every candidate against the query and returns candidate IDs most-relevant-first. Candidates the server omits are dropped. When the payload exceeds MaxBatchChars, candidates are split across multiple requests and merged by score.
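The split-and-merge behaviour described above can be sketched as follows. This is an illustration under assumptions, not the package's code: doc is a hypothetical stand-in for Candidate, splitBatches and mergeByScore are invented names, the greedy packing strategy is a guess, and len counts bytes, which only approximates characters for non-ASCII text.

```go
package main

import (
	"fmt"
	"sort"
)

// doc is a hypothetical stand-in for the package's Candidate type.
type doc struct {
	ID   string
	Text string
}

// scored pairs a candidate ID with the score returned for it.
type scored struct {
	ID    string
	Score float64
}

// splitBatches greedily packs docs so that the query plus the documents in
// each batch stay within maxChars. A document that exceeds the budget on its
// own still gets a batch. maxChars <= 0 disables splitting.
func splitBatches(query string, docs []doc, maxChars int) [][]doc {
	if len(docs) == 0 {
		return nil
	}
	if maxChars <= 0 {
		return [][]doc{docs}
	}
	var batches [][]doc
	var cur []doc
	used := len(query) // every request repeats the query
	for _, d := range docs {
		if len(cur) > 0 && used+len(d.Text) > maxChars {
			batches = append(batches, cur)
			cur, used = nil, len(query)
		}
		cur = append(cur, d)
		used += len(d.Text)
	}
	return append(batches, cur)
}

// mergeByScore flattens per-batch results and orders IDs by descending score.
func mergeByScore(batches ...[]scored) []string {
	var all []scored
	for _, b := range batches {
		all = append(all, b...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].Score > all[j].Score })
	ids := make([]string, len(all))
	for i, s := range all {
		ids[i] = s.ID
	}
	return ids
}

func main() {
	docs := []doc{{"a", "aaaa"}, {"b", "bbbb"}, {"c", "cc"}}
	for i, b := range splitBatches("q", docs, 9) {
		fmt.Println("batch", i, b)
	}
	fmt.Println(mergeByScore(
		[]scored{{"a", 0.2}, {"b", 0.9}},
		[]scored{{"c", 0.5}},
	))
}
```

Merging by raw score assumes the model scores each query-document pair independently, so scores from different requests are comparable; that holds for typical cross-encoders but is worth confirming for the server in use.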
type Limited ¶ added in v0.4.0
type Limited struct {
	// contains filtered or unexported fields
}
Limited wraps a Reranker and caps the number of in-flight Rerank calls. When max <= 0, the inner Reranker is returned unchanged. onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.
type Reranker ¶ added in v0.0.6
type Reranker interface {
	// Rerank returns candidate IDs ordered most-relevant-first. Backends may
	// return fewer IDs than the input slice; candidates omitted from the output
	// are treated as irrelevant and DROPPED from the result set. An empty slice
	// means no candidate is relevant.
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}
Reranker reorders retrieved candidates by how well each answers a query and MAY drop irrelevant candidates entirely, so reranking is no longer reorder-only.
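Because the contract allows a backend to return fewer IDs than it was given, a caller has to rebuild its result set from the returned IDs rather than reorder in place. A minimal sketch of that step, where candidate is a hypothetical stand-in for Candidate and applyOrder is an invented helper:

```go
package main

import "fmt"

// candidate is a local stand-in for the package's Candidate type.
type candidate struct{ ID, Text string }

// applyOrder returns the candidates named in ids, in that order. Candidates
// missing from ids are dropped; unknown or repeated IDs are ignored.
func applyOrder(cands []candidate, ids []string) []candidate {
	byID := make(map[string]candidate, len(cands))
	for _, c := range cands {
		byID[c.ID] = c
	}
	out := make([]candidate, 0, len(ids))
	for _, id := range ids {
		if c, ok := byID[id]; ok {
			out = append(out, c)
			delete(byID, id) // a repeated ID is only used once
		}
	}
	return out
}

func main() {
	cands := []candidate{{"a", "first"}, {"b", "second"}, {"c", "third"}}
	// The reranker kept "c" and "a" and omitted "b", so "b" is dropped.
	fmt.Println(applyOrder(cands, []string{"c", "a"}))
	// An empty ID slice means no candidate is relevant.
	fmt.Println(len(applyOrder(cands, nil)))
}
```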