rerank

package

Version: v0.4.13 (latest) | Published: Jun 20, 2026 | License: AGPL-3.0 | Imports: 15 | Imported by: 0

Repository

github.com/eleboucher/memini

Documentation ¶

Overview ¶

Package rerank holds the optional read-side rerank stage of recall: after hybrid retrieval and composite ranking, a reranker reads the query and the candidates together — something embeddings can't — and reorders them by how well each answers the query. Two implementations share the Reranker contract: a cross-encoder ranking model behind a /rerank endpoint (CrossEncoder) and a chat LLM prompted over a numbered list (NewLLM).

Index ¶

type Candidate
type Completer
type Config
type CrossEncoder
- func New(cfg Config) (*CrossEncoder, error)
- func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
- func (c *CrossEncoder) Scores(ctx context.Context, query string, candidates []Candidate) (map[string]float64, error)
type Limited
- func (l *Limited) Max() int
- func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
type Reranker
- func NewLLM(c Completer) Reranker
- func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Candidate ¶ added in v0.0.6

type Candidate struct {
	ID      string
	Content string
}

Candidate is one memory offered to a reranker.

type Completer ¶ added in v0.0.6

type Completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.

type Config ¶

type Config struct {
	// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
	BaseURL string
	Model   string
	APIKey  string
	// MaxDocChars truncates each document before sending. 0 disables.
	MaxDocChars int
	// MaxBatchChars caps the total characters across query and documents in
	// a single /rerank request. Set just below the model's effective context
	// in characters (≈ n_ctx × chars-per-token × (1 − template reserve);
	// ~4000 for a 1024-token model). 0 disables proactive batching.
	MaxBatchChars int
	HTTPClient    *http.Client
}

Config configures the cross-encoder reranker client.
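The MaxBatchChars comment gives a rule of thumb rather than a fixed value. As a rough sketch of that arithmetic, the hypothetical helper below (batchChars is not part of this package) evaluates n_ctx × chars-per-token × (1 − template reserve). The 4 characters per token and the 3% reserve are assumptions chosen to land near the "~4000 for a 1024-token model" figure; measure both for your own model and tokenizer.

```go
package main

import "fmt"

// batchChars estimates a MaxBatchChars value from the formula in the
// MaxBatchChars field comment. charsPerToken and reserve are assumptions
// that depend on the tokenizer and the server's prompt template.
func batchChars(nCtx int, charsPerToken, reserve float64) int {
	return int(float64(nCtx) * charsPerToken * (1 - reserve))
}

func main() {
	// 1024 tokens, about 4 chars per token, 3% held back for the template.
	fmt.Println(batchChars(1024, 4.0, 0.03)) // prints 3973
}
```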

type CrossEncoder ¶

type CrossEncoder struct {
	// contains filtered or unexported fields
}

CrossEncoder reranks candidates with a dedicated ranking model served over the Cohere-style /rerank API (Infinity, vLLM, TEI, llama-server --rerank).

func New ¶

func New(cfg Config) (*CrossEncoder, error)

func (*CrossEncoder) Rerank ¶

func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

Rerank scores every candidate against the query and returns candidate IDs most-relevant-first. Candidates the server omits are dropped. When the payload exceeds MaxBatchChars, candidates are split across multiple requests and merged by score.
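The page does not describe how candidates are split across requests. One plausible approach is greedy packing, sketched below under stated assumptions: splitBatches is a hypothetical helper, the query is counted against every batch, and "characters" are counted as runes. The real implementation may differ on all three points.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Candidate mirrors the documented struct so the sketch is self-contained.
type Candidate struct {
	ID      string
	Content string
}

// splitBatches greedily packs candidates so that the query plus the documents
// in each batch stay within maxChars. A document that is too large on its own
// still gets a batch to itself. maxChars <= 0 disables splitting.
func splitBatches(query string, cands []Candidate, maxChars int) [][]Candidate {
	if len(cands) == 0 {
		return nil
	}
	if maxChars <= 0 {
		return [][]Candidate{cands}
	}
	base := utf8.RuneCountInString(query)
	var batches [][]Candidate
	var cur []Candidate
	used := base
	for _, c := range cands {
		n := utf8.RuneCountInString(c.Content)
		if len(cur) > 0 && used+n > maxChars {
			// Adding this document would overflow: close the current batch.
			batches = append(batches, cur)
			cur, used = nil, base
		}
		cur = append(cur, c)
		used += n
	}
	return append(batches, cur)
}

func main() {
	cands := []Candidate{
		{ID: "a", Content: "aaaa"},
		{ID: "b", Content: "bbbb"},
		{ID: "c", Content: "cccc"},
	}
	for i, b := range splitBatches("qq", cands, 10) {
		fmt.Println(i, len(b))
	}
}
```

Each batch would then be sent as its own /rerank request, with the per-batch scores combined into a single ordering.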

func (*CrossEncoder) Scores ¶ added in v0.4.13

func (c *CrossEncoder) Scores(ctx context.Context, query string, candidates []Candidate) (map[string]float64, error)

Scores returns each candidate's raw relevance score (id → score) from the reranker, applying the same per-doc truncation and batch splitting as Rerank but keeping the scores instead of collapsing them to an order. Candidates the server omits are absent from the map. Exposed for offline analysis (e.g. the rerank-gate separability bench); the production recall path uses Rerank.
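For offline analysis it is often handy to turn the id → score map back into an ordering. The sketch below shows one way to do that; orderByScore is a hypothetical helper, and it assumes that a higher score means more relevant, which this page implies but does not state outright.

```go
package main

import (
	"fmt"
	"sort"
)

// orderByScore turns an id → score map into IDs sorted most-relevant-first,
// assuming higher scores are better. Ties are broken by ID so the result is
// deterministic despite Go's random map iteration order.
func orderByScore(scores map[string]float64) []string {
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		si, sj := scores[ids[i]], scores[ids[j]]
		if si != sj {
			return si > sj
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	scores := map[string]float64{"a": 0.2, "b": 0.9, "c": 0.2}
	fmt.Println(orderByScore(scores)) // prints [b a c]
}
```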

type Limited ¶ added in v0.4.0

type Limited struct {
	// contains filtered or unexported fields
}

Limited wraps a Reranker and caps the number of in-flight Rerank calls. It is constructed with NewLimited: when max <= 0, NewLimited returns inner unchanged. onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.

func (*Limited) Max ¶ added in v0.4.0

func (l *Limited) Max() int

func (*Limited) Rerank ¶ added in v0.4.0

func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

type Reranker ¶ added in v0.0.6

type Reranker interface {
	// Rerank returns candidate IDs ordered most-relevant-first. Backends may
	// return fewer IDs than the input slice; candidates omitted from the output
	// are treated as irrelevant and DROPPED from the result set. An empty slice
	// means no candidate is relevant.
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

Reranker reorders retrieved candidates by how well each answers a query and MAY drop irrelevant candidates entirely, so callers must not assume the output is a permutation of the input.
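Since Rerank returns IDs rather than candidates, a caller has to map the IDs back onto its own slice and drop whatever was omitted. A minimal sketch of that step follows; applyOrder is a hypothetical helper, and skipping unknown or repeated IDs is a defensive assumption rather than documented behavior.

```go
package main

import "fmt"

type Candidate struct{ ID, Content string }

// applyOrder returns the candidates named in ids, in that order. Candidates
// missing from ids are dropped; unknown or repeated IDs are skipped.
func applyOrder(cands []Candidate, ids []string) []Candidate {
	byID := make(map[string]Candidate, len(cands))
	for _, c := range cands {
		byID[c.ID] = c
	}
	out := make([]Candidate, 0, len(ids))
	for _, id := range ids {
		if c, ok := byID[id]; ok {
			out = append(out, c)
			delete(byID, id) // a repeated ID will not match again
		}
	}
	return out
}

func main() {
	cands := []Candidate{{ID: "a", Content: "x"}, {ID: "b", Content: "y"}, {ID: "c", Content: "z"}}
	for _, c := range applyOrder(cands, []string{"c", "a"}) {
		fmt.Println(c.ID, c.Content)
	}
}
```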

func NewLLM ¶ added in v0.0.6

func NewLLM(c Completer) Reranker

NewLLM builds a reranker over a chat backend. The per-candidate content cap keeps the prompt small — a deep candidate pool of long memories can otherwise blow a RAM-limited local server's context/activation budget.
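To make the numbered-list idea concrete, the sketch below builds a capped, numbered prompt body and maps a reply of numbers back to candidate IDs. Everything here is an assumption for illustration: the helper names, the "1. text" layout, the cap value, and the comma-separated reply format are not taken from this package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type Candidate struct{ ID, Content string }

// truncate keeps at most max runes of s; max <= 0 disables the cap.
func truncate(s string, max int) string {
	r := []rune(s)
	if max <= 0 || len(r) <= max {
		return s
	}
	return string(r[:max])
}

// numberedList renders candidates as "1. ...\n2. ...\n" with each content
// capped at maxChars runes.
func numberedList(cands []Candidate, maxChars int) string {
	var b strings.Builder
	for i, c := range cands {
		fmt.Fprintf(&b, "%d. %s\n", i+1, truncate(c.Content, maxChars))
	}
	return b.String()
}

// parseOrder maps a reply such as "2, 1" back to candidate IDs, skipping
// anything that is not a valid, previously unseen 1-based index.
func parseOrder(reply string, cands []Candidate) []string {
	seen := make(map[int]bool)
	var ids []string
	notDigit := func(r rune) bool { return r < '0' || r > '9' }
	for _, f := range strings.FieldsFunc(reply, notDigit) {
		n, err := strconv.Atoi(f)
		if err != nil || n < 1 || n > len(cands) || seen[n] {
			continue
		}
		seen[n] = true
		ids = append(ids, cands[n-1].ID)
	}
	return ids
}

func main() {
	cands := []Candidate{{ID: "m1", Content: "hello world"}, {ID: "m2", Content: "hi"}}
	fmt.Print(numberedList(cands, 5))
	fmt.Println(parseOrder("2, 1", cands))
}
```

Referring to candidates by position keeps long or opaque IDs out of the prompt, and ignoring out-of-range numbers means a sloppy reply drops candidates instead of failing the call.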

func NewLimited ¶ added in v0.4.0

func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker
