rerank

package

v0.4.5 (latest) | Published: Jun 17, 2026 | License: AGPL-3.0 | Imports: 14 | Imported by: 0

Repository

github.com/eleboucher/memini

Documentation ¶

Overview ¶

Package rerank implements the optional read-side rerank stage of recall. After hybrid retrieval and composite ranking, a reranker reads the query and the candidates together, which embedding similarity cannot do, and reorders them by how well each answers the query. Two implementations share the Reranker contract: a cross-encoder ranking model behind a /rerank endpoint (CrossEncoder, built with New) and a chat LLM prompted over a numbered list (built with NewLLM).

Index ¶

type Candidate
type Completer
type Config
type CrossEncoder
- func New(cfg Config) (*CrossEncoder, error)
- func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
type Limited
- func (l *Limited) Max() int
- func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
type Reranker
- func NewLLM(c Completer) Reranker
- func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Candidate ¶ added in v0.0.6

type Candidate struct {
	ID      string
	Content string
}

Candidate is one memory offered to a reranker.

type Completer ¶ added in v0.0.6

type Completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

Completer is the single-turn chat completion an LLM reranker needs. The chat clients in internal/llm satisfy it; declaring it here keeps this package free of a dependency on the chat backends.

type Config ¶

type Config struct {
	// BaseURL is the API root (e.g. http://host:8002/v1); "/rerank" is appended.
	BaseURL string
	Model   string
	APIKey  string
	// MaxDocChars truncates each document before sending. 0 disables.
	MaxDocChars int
	// MaxBatchChars caps the total characters across query and documents in
	// a single /rerank request. Set just below the model's effective context
	// in characters (≈ n_ctx × chars-per-token × (1 − template reserve);
	// ~4000 for a 1024-token model). 0 disables proactive batching.
	MaxBatchChars int
	HTTPClient    *http.Client
}

Config configures the cross-encoder reranker client.
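The MaxBatchChars comment above gives a rule of thumb rather than a fixed value. As a rough worked example, the sketch below evaluates that formula; the helper name batchCharBudget and the input figures (4 characters per token, 2% template reserve) are illustrative assumptions, not values taken from the package.

```go
package main

import "fmt"

// batchCharBudget estimates a MaxBatchChars value from the formula in the
// Config comment: n_ctx × chars-per-token × (1 − template reserve).
// The helper is hypothetical; all three inputs must be chosen per model.
func batchCharBudget(nCtx int, charsPerToken, templateReserve float64) int {
	return int(float64(nCtx) * charsPerToken * (1 - templateReserve))
}

func main() {
	// Assumed figures: a 1024-token context, roughly 4 characters per
	// token, and 2% of the window reserved for the server's template.
	fmt.Println(batchCharBudget(1024, 4, 0.02))
}
```

With these assumed inputs the result lands close to the ~4000 quoted for a 1024-token model. Text that tokenizes less efficiently (code, non-Latin scripts) would likely call for a smaller chars-per-token figure.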

type CrossEncoder ¶

type CrossEncoder struct {
	// contains filtered or unexported fields
}

CrossEncoder reranks candidates with a dedicated ranking model served over the Cohere-style /rerank API (Infinity, vLLM, TEI, llama-server --rerank).

func New ¶

func New(cfg Config) (*CrossEncoder, error)

func (*CrossEncoder) Rerank ¶

func (c *CrossEncoder) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

Rerank scores every candidate against the query and returns candidate IDs most-relevant-first. Candidates the server omits are dropped. When the payload exceeds MaxBatchChars, candidates are split across multiple requests and merged by score.
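The split-and-merge behavior described above could be sketched as follows. This is an illustration under assumptions, not the package's implementation: the greedy batching strategy, the helper names splitByChars and mergeByScore, and the scored type are all hypothetical, and the scores in main are stand-ins for what one /rerank call per batch would return.

```go
package main

import (
	"fmt"
	"sort"
	"unicode/utf8"
)

// Candidate mirrors the documented struct so the sketch is self-contained.
type Candidate struct{ ID, Content string }

// scored pairs a candidate ID with the score one request assigned it.
type scored struct {
	ID    string
	Score float64
}

// splitByChars greedily groups candidates so that the query plus the
// documents in each batch stay within maxChars. A candidate that is too
// large on its own still gets a batch to itself. maxChars <= 0 disables
// splitting.
func splitByChars(query string, cands []Candidate, maxChars int) [][]Candidate {
	if len(cands) == 0 {
		return nil
	}
	if maxChars <= 0 {
		return [][]Candidate{cands}
	}
	base := utf8.RuneCountInString(query)
	var batches [][]Candidate
	var cur []Candidate
	used := base
	for _, c := range cands {
		n := utf8.RuneCountInString(c.Content)
		if len(cur) > 0 && used+n > maxChars {
			batches = append(batches, cur)
			cur, used = nil, base
		}
		cur = append(cur, c)
		used += n
	}
	return append(batches, cur)
}

// mergeByScore flattens per-batch results and orders IDs by descending score.
func mergeByScore(batches ...[]scored) []string {
	var all []scored
	for _, b := range batches {
		all = append(all, b...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].Score > all[j].Score })
	ids := make([]string, len(all))
	for i, s := range all {
		ids[i] = s.ID
	}
	return ids
}

func main() {
	cands := []Candidate{{"a", "aaaa"}, {"b", "bbbb"}, {"c", "cc"}}
	batches := splitByChars("q", cands, 8)

	// Stand-in scores; a real client would issue one request per batch.
	fake := map[string]float64{"a": 0.2, "b": 0.9, "c": 0.5}
	var results [][]scored
	for _, batch := range batches {
		var rs []scored
		for _, c := range batch {
			rs = append(rs, scored{c.ID, fake[c.ID]})
		}
		results = append(results, rs)
	}
	fmt.Println(len(batches), mergeByScore(results...))
}
```

Merging by raw score only makes sense when the model's scores are comparable across requests, which is generally true for cross-encoders that score each query-document pair independently.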

type Limited ¶ added in v0.4.0

type Limited struct {
	// contains filtered or unexported fields
}

Limited caps the number of in-flight Rerank calls made to an inner Reranker. It is constructed with NewLimited: when max <= 0, NewLimited returns inner unchanged. onInFlight, if non-nil, receives the absolute in-flight count on every acquire and release.

func (*Limited) Max ¶ added in v0.4.0

func (l *Limited) Max() int

func (*Limited) Rerank ¶ added in v0.4.0

func (l *Limited) Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)

type Reranker ¶ added in v0.0.6

type Reranker interface {
	// Rerank returns candidate IDs ordered most-relevant-first. Backends may
	// return fewer IDs than the input slice; candidates omitted from the output
	// are treated as irrelevant and DROPPED from the result set. An empty slice
	// means no candidate is relevant.
	Rerank(ctx context.Context, query string, candidates []Candidate) ([]string, error)
}

Reranker reorders retrieved candidates by how well each answers a query, and MAY drop irrelevant candidates entirely. Reranking is no longer reorder-only, so callers must not assume the returned slice has the same length as the input.
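Since Rerank returns IDs rather than candidates, a caller has to map the result back onto its own slice. A minimal sketch of that step, assuming the caller wants omitted candidates dropped as the contract describes; applyOrder is a hypothetical helper, and skipping unknown or repeated IDs is a defensive choice of this sketch rather than documented behavior.

```go
package main

import "fmt"

// Candidate mirrors the documented struct so the sketch is self-contained.
type Candidate struct{ ID, Content string }

// applyOrder returns the candidates named in ids, in that order.
// Candidates missing from ids are dropped. IDs that match no candidate,
// or that repeat, are skipped.
func applyOrder(cands []Candidate, ids []string) []Candidate {
	byID := make(map[string]Candidate, len(cands))
	for _, c := range cands {
		byID[c.ID] = c
	}
	out := make([]Candidate, 0, len(ids))
	for _, id := range ids {
		c, ok := byID[id]
		if !ok {
			continue
		}
		out = append(out, c)
		delete(byID, id) // guard against a backend repeating an ID
	}
	return out
}

func main() {
	cands := []Candidate{{"a", "first"}, {"b", "second"}, {"c", "third"}}
	// The reranker kept "c" and "a" and judged "b" irrelevant.
	for _, c := range applyOrder(cands, []string{"c", "a"}) {
		fmt.Println(c.ID, c.Content)
	}
}
```

An empty ids slice yields an empty result, which matches the contract's "no candidate is relevant" case.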

func NewLLM ¶ added in v0.0.6

func NewLLM(c Completer) Reranker

NewLLM builds a reranker over a chat backend. The per-candidate content cap keeps the prompt small: a deep candidate pool of long memories can otherwise exceed a RAM-limited local server's context/activation budget.

func NewLimited ¶ added in v0.4.0

func NewLimited(inner Reranker, max int, onInFlight func(n int64)) Reranker
