rerank

package

v0.4.0 (latest) | Published: Jun 16, 2026 | License: AGPL-3.0 | Imports: 13 | Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/eleboucher/memini

Documentation

Overview

Package rerank implements the optional read-side rerank stage of recall. After hybrid retrieval and composite ranking, a reranker reads the query and the candidates together, which embeddings cannot do, and reorders the candidates by how well each answers the query. Two implementations share the Reranker contract: CrossEncoder, a cross-encoder ranking model behind a /rerank endpoint, and the reranker returned by NewLLM, a chat LLM prompted over a numbered list of candidates.

Index

type Candidate
type Completer
type Config
type CrossEncoder
- func New(cfg Config) (*CrossEncoder, error)
- func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
type Limited
- func (l *Limited) Max() int
- func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
type Reranker
- func NewLLM(c Completer) Reranker
- func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker

Types

type Candidate (added in v0.0.6)

type Candidate struct {
	ID      string
	Content string
}

Candidate is one memory offered to a reranker.

type Completer (added in v0.0.6)

type Completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.

type Config

type Config struct {
	// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
	BaseURL string
	Model   string
	APIKey  string
	// MaxDocChars truncates each document before sending so an oversized
	// candidate can't blow the server's physical batch and fail the whole
	// request. 0 disables truncation.
	MaxDocChars int
	HTTPClient  *http.Client
}

Config configures the cross-encoder reranker client.
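The MaxDocChars comment describes per-document truncation. A minimal sketch of one way to do it is shown here, assuming characters are counted as runes so multi-byte text is never cut mid-character; the helper name truncateDoc is hypothetical, and the package may count or cut differently.

```go
package main

import "fmt"

// truncateDoc is a hypothetical helper: it keeps at most maxChars runes of doc.
// maxChars == 0 disables truncation, as documented for MaxDocChars; treating
// negative values the same way is an assumption of this sketch.
func truncateDoc(doc string, maxChars int) string {
	if maxChars <= 0 {
		return doc
	}
	n := 0
	for i := range doc { // i is the byte offset of each rune
		if n == maxChars {
			return doc[:i]
		}
		n++
	}
	return doc // doc already fits
}

func main() {
	fmt.Println(truncateDoc("héllo wörld", 5)) // héllo
	fmt.Println(truncateDoc("short", 0))       // short
}
```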

type CrossEncoder

type CrossEncoder struct {
	// contains filtered or unexported fields
}

CrossEncoder reranks candidates with a dedicated ranking model (bge-reranker, Qwen3-Reranker, mxbai-rerank, …) served over the Cohere-style /rerank API that Infinity, vLLM, TEI, and llama-server --rerank expose. It is a cheaper alternative to the LLM reranker.

func New

func New(cfg Config) (*CrossEncoder, error)

New builds a cross-encoder reranker client. BaseURL is required.

func (*CrossEncoder) Rerank

func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

Rerank scores every candidate against the query with the ranking model and returns the candidate IDs most-relevant-first. Candidates the server omits are appended in their original order, satisfying the reorder-only contract of Reranker.
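The flow described above can be sketched against a stub server. The JSON field names (model, query, documents, results, index, relevance_score) are an assumption based on the Cohere-style shape; individual servers may differ, and the rerank and demo functions are hypothetical, not the package's code.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"strings"
)

type Candidate struct{ ID, Content string }

// Assumed Cohere-style wire shapes; not confirmed by the package docs.
type rerankRequest struct {
	Model     string   `json:"model"`
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
}

type rerankResponse struct {
	Results []struct {
		Index int     `json:"index"`
		Score float64 `json:"relevance_score"`
	} `json:"results"`
}

// rerank posts the documents to baseURL + "/rerank" and maps scores back to IDs.
func rerank(ctx context.Context, client *http.Client, baseURL, model, query string, cands []Candidate) ([]string, error) {
	docs := make([]string, len(cands))
	for i, c := range cands {
		docs[i] = c.Content
	}
	body, err := json.Marshal(rerankRequest{Model: model, Query: query, Documents: docs})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/rerank"
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("rerank: unexpected status %s", resp.Status)
	}
	var out rerankResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	// Highest score first; stable so ties keep the server's order.
	sort.SliceStable(out.Results, func(i, j int) bool {
		return out.Results[i].Score > out.Results[j].Score
	})
	seen := make([]bool, len(cands))
	ids := make([]string, 0, len(cands))
	for _, r := range out.Results {
		if r.Index < 0 || r.Index >= len(cands) || seen[r.Index] {
			continue // ignore out-of-range or duplicate indices
		}
		seen[r.Index] = true
		ids = append(ids, cands[r.Index].ID)
	}
	for i, c := range cands {
		if !seen[i] {
			ids = append(ids, c.ID) // omitted by the server: keep original order
		}
	}
	return ids, nil
}

// demo calls rerank against a stub server that scores only two of three documents.
func demo() ([]string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"results":[{"index":0,"relevance_score":0.2},{"index":2,"relevance_score":0.9}]}`)
	}))
	defer srv.Close()
	cands := []Candidate{{"a", "alpha"}, {"b", "beta"}, {"c", "gamma"}}
	return rerank(context.Background(), srv.Client(), srv.URL+"/v1", "stub-model", "query", cands)
}

func main() {
	ids, err := demo()
	fmt.Println(ids, err) // [c a b] <nil>
}
```

The stub leaves out index 1, so "b" is appended after the scored IDs rather than dropped.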

type Limited (added in v0.4.0)

type Limited struct {
	// contains filtered or unexported fields
}

Limited caps the number of in-flight Rerank calls to the wrapped Reranker. It is built with NewLimited: when max <= 0, NewLimited returns inner unchanged; onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.

func (*Limited) Max (added in v0.4.0)

func (l *Limited) Max() int

func (*Limited) Rerank (added in v0.4.0)

func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

type Reranker (added in v0.0.6)

type Reranker interface {
	// Rerank returns candidate IDs ordered most-relevant-first. Candidates the
	// backend omits are appended in their original order, so reranking can only
	// reorder, never drop, the input.
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

Reranker reorders retrieved candidates by how well each answers a query.

func NewLLM (added in v0.0.6)

func NewLLM(c Completer) Reranker

NewLLM builds a reranker over a chat backend. It caps the content of each candidate to keep the prompt small; otherwise a deep candidate pool of long memories can exceed the context/activation budget of a RAM-limited local server.

func NewLimited (added in v0.4.0)

func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker
