rerank

package

v0.1.0 | Published: May 12, 2026 | License: Apache-2.0 | Imports: 11 | Imported by: 0

Repository

github.com/gallowaysoftware/murmur

README ¶

Example: search rerank service (Pattern A from doc/search-integration.md)

A runnable reference implementation of the query-time rescore pattern from doc/search-integration.md. It is an HTTP server that takes a query, recalls candidates from an upstream search engine (OpenSearch in production), fetches counter features from Murmur, and returns the top-K reranked candidates.

What it does

Client → /search?q=...
       │
       ├─ 1: Recall (e.g. OpenSearch BM25)         → top N=200 candidates
       │
       ├─ 2: Murmur GetMany / GetWindowMany        → counter features per candidate
       │     (batched; one BatchGetItem-shaped call covers all N)
       │
       └─ 3: Score(BM25, likes_all, likes_24h)     → top K=20 results
                  ↓
              JSON response

Counters never go in the OpenSearch index. The text-side index serves only slow-moving fields; popularity is fetched fresh per query and applied as a second-stage rerank in this service. The pattern's full justification — including pagination, ML rerank framing, and when to combine with Pattern B — is in doc/search-integration.md.

Files

File	Purpose
`rerank.go`	The pure rerank logic. `Service.Search` does recall → counter-fetch → score → top-K. The `Recaller` interface is the seam for swapping OpenSearch in.
`cmd/server/main.go`	The HTTP server. Wires a static demo recaller against the Murmur QueryService client. Replace the static recaller with a Recaller implementation for your search backend.
`rerank_test.go`	11 unit tests against a real Murmur QueryService (httptest-backed) so the integration path is exercised end-to-end without external infra.

Run the tests

go test ./examples/search-rerank/...

Notable tests: TestSearch_WindowedFeatureBoostsRecent proves the windowed-counter feature pulls a recently active post above an equally popular but cold post. TestSearch_DegradesGracefullyWhenMurmurFails proves the service returns BM25-only results when Murmur is unavailable rather than returning a 500 to the user.

Run the server

# In one terminal, start a Murmur query service (the page-view-counters
# example or any other Counter pipeline).
go run ./examples/page-view-counters/cmd/query

# In another terminal:
go run ./examples/search-rerank/cmd/server \
    --likes-alltime-url http://localhost:50051 \
    --addr :8080

# Hit it:
curl 'http://localhost:8080/search?q=anything'

The static recaller returns a fixed set of 10 candidates for any query. To wire OpenSearch in production, implement the rerank.Recaller interface — that's the entire seam.

Score function

The default score is a hand-tuned multiplicative blend of BM25 and log-scaled likes:

func DefaultScore(c Candidate, likesAll, likes24h int64) float64 {
    recencyBoost := 1 + math.Log10(float64(likes24h)+1)
    allTimeBoost := 1 + 0.2*math.Log10(float64(likesAll)+1)
    return c.BM25 * recencyBoost * allTimeBoost
}

Replace it for ML rerank. Murmur is invariant under the choice — same GetMany / GetWindowMany call shape. See doc/search-integration.md "Two-pass ML ranking" for the framing.

Composition with Pattern B

This example pairs cleanly with examples/search-projector/. Use both:

Projector (Pattern B) keeps a coarse popularity_bucket field indexed in OpenSearch for filtering and rough ordering.
Rerank service (Pattern A) does the fine ranking within the bucket via fresh counter fetches.

The recall stage's query becomes:

filter: popularity_bucket >= 3
sort:   popularity_bucket DESC, _score DESC, doc_id ASC
limit:  N=200

Pagination uses search_after on the composite sort key for stable cursor navigation; per-page rerank is N=20 — cheap. See doc/search-integration.md "Pagination" section for the full treatment of cursor-based pagination with external rescore.
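The recall query and its cursor can be sketched as a request body built in Go. This is a minimal illustration, assuming the standard OpenSearch query DSL (`bool`, `range`, `sort`, `search_after`). The text field name `body` and the helper `buildRecallQuery` are hypothetical and not part of this example's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildRecallQuery assembles a search request body for the recall stage.
// "body" (the text field) is a hypothetical field name; popularity_bucket
// and doc_id come from the sort spec above. after is the sort-key tuple of
// the last hit on the previous page, or nil for the first page.
func buildRecallQuery(q string, size int, after []any) map[string]any {
	req := map[string]any{
		"size": size,
		"query": map[string]any{
			"bool": map[string]any{
				"must": []any{
					map[string]any{"match": map[string]any{"body": q}},
				},
				"filter": []any{
					map[string]any{"range": map[string]any{
						"popularity_bucket": map[string]any{"gte": 3},
					}},
				},
			},
		},
		"sort": []any{
			map[string]any{"popularity_bucket": "desc"},
			map[string]any{"_score": "desc"},
			map[string]any{"doc_id": "asc"},
		},
	}
	if len(after) > 0 {
		req["search_after"] = after
	}
	return req
}

func main() {
	// Second page: resume after the last hit's (bucket, score, doc_id) tuple.
	// The tuple values here are made up for illustration.
	req := buildRecallQuery("anything", 200, []any{4, 7.25, "post-0193"})
	out, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because doc_id is the final tiebreaker, the sort key is unique per document, which is what makes the cursor stable across pages.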

What's NOT in this example

OpenSearch integration. The Recaller interface is the seam; production wires it against opensearch-go/v3 with the AWS-signed transport for managed OpenSearch Service.
A learned ranking model. ScoreFn is a simple BM25 × log(likes) function. For production ML rerank, replace it with a call into your inference service (XGBoost endpoint, cross-encoder transformer, LLM). The data flow is unchanged — Murmur provides the counter features regardless.
Search-session caches for pagination. The example doesn't implement the cached-rescored-list pattern from doc/search-integration.md "Pagination". For shallow pagination (page 1–3) it's not needed; for infinite-feed UIs add a Valkey-backed session cache in front of Search.
Personalization. Only global features (likes-per-post). Per-(user, item) features need a feature store or an embedding-based recommender — Murmur isn't the right tool for those, see doc/search-integration.md "Where Murmur isn't the answer".

Documentation ¶

Overview ¶

Package rerank is the runnable Pattern A reference implementation from doc/search-integration.md. It exposes an HTTP search endpoint that does two-stage retrieval:

1. Recall: call OpenSearch (or any candidate source) and get the top N by text relevance.
2. Rerank: fetch counter features from Murmur via GetMany / GetWindowMany, score each candidate, and return the top K.

Counters never go in the OpenSearch index. The text-side index serves only slow-moving fields; popularity is fetched fresh per query and applied as a second-stage rerank in this service.

Pairs with examples/search-projector (Pattern B). Production deployments use both: B for filterable bucket fields ("≥1k likes"), A for fine ranking within a bucket. See doc/search-integration.md.

Counter feature lookup ¶

The rerank stage reads counters from Murmur via the QueryService.GetMany (all-time) and GetWindowMany (windowed-recent) RPCs. Both are batched — one BatchGetItem-shaped fetch covers up to 100 entities, with the singleflight coalescing layer collapsing concurrent identical requests. For N=200 candidates the fetch is two BatchGetItems in parallel, ~10ms p99.
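The batching arithmetic can be sketched on the client side as follows. This is only an illustration of splitting 200 IDs into batches of at most 100 and fetching them in parallel, reading the 100-entity limit as applying per fetch. It is not Murmur's implementation, and `fetchFn`, `chunk`, `fetchAll`, `makeIDs`, and `fakeFetch` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFn is a hypothetical stand-in for one batched counter lookup; it is
// not the Murmur client API.
type fetchFn func(ids []string) (map[string]int64, error)

// chunk splits ids into consecutive batches of at most size elements.
func chunk(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		out = append(out, ids[:n])
		ids = ids[n:]
	}
	return out
}

// fetchAll runs one fetch per batch concurrently and merges the results.
// It returns the first error seen along with whatever was merged.
func fetchAll(ids []string, size int, fetch fetchFn) (map[string]int64, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
	)
	merged := make(map[string]int64, len(ids))
	for _, batch := range chunk(ids, size) {
		wg.Add(1)
		go func(batch []string) {
			defer wg.Done()
			vals, err := fetch(batch)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			for id, v := range vals {
				merged[id] = v
			}
		}(batch)
	}
	wg.Wait()
	return merged, firstErr
}

// makeIDs returns n synthetic document IDs.
func makeIDs(n int) []string {
	ids := make([]string, n)
	for i := range ids {
		ids[i] = fmt.Sprintf("post-%03d", i)
	}
	return ids
}

// fakeFetch pretends every entity has exactly one like.
func fakeFetch(ids []string) (map[string]int64, error) {
	out := make(map[string]int64, len(ids))
	for _, id := range ids {
		out[id] = 1
	}
	return out, nil
}

func main() {
	ids := makeIDs(200)
	vals, err := fetchAll(ids, 100, fakeFetch)
	fmt.Println(len(chunk(ids, 100)), len(vals), err)
}
```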

Score function ¶

The default score is a hand-tuned combination of BM25 and log-scaled likes:

score = bm25 * (1 + log10(likes_24h + 1)) * (1 + 0.2 * log10(likes_all + 1))

Real production systems use learned models (XGBoost / cross-encoder transformers); Murmur is invariant under that choice — it just provides the counter-feature inputs. Override ScoreFn to plug a different model in.

Index ¶

func DefaultScore(c Candidate, likesAll, likes24h int64) float64
type Candidate
type Config
type Recaller
type ScoreFn
type ScoredCandidate
type Service
- func New(cfg Config) (*Service, error)
- func (s *Service) Search(ctx context.Context, query string) ([]ScoredCandidate, error)
- func (s *Service) Stats() *Stats
type Stats

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func DefaultScore ¶

func DefaultScore(c Candidate, likesAll, likes24h int64) float64

DefaultScore is the default ScoreFn — a multiplicative combination of BM25 and log10-scaled likes. Typical product-search shape; tunable per-app.
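To make the blend concrete, here is a small worked example. It restates the DefaultScore body from the README as a local function (`defaultScore`) so that it runs on its own. The input values are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Candidate mirrors the documented type so the sketch compiles standalone.
type Candidate struct {
	ID   string
	BM25 float64
}

// defaultScore restates the DefaultScore body shown in the README.
func defaultScore(c Candidate, likesAll, likes24h int64) float64 {
	recencyBoost := 1 + math.Log10(float64(likes24h)+1)
	allTimeBoost := 1 + 0.2*math.Log10(float64(likesAll)+1)
	return c.BM25 * recencyBoost * allTimeBoost
}

func main() {
	c := Candidate{ID: "post-1", BM25: 2.0}

	// likes24h=9   -> recencyBoost = 1 + log10(10)       = 2.0
	// likesAll=999 -> allTimeBoost = 1 + 0.2*log10(1000) = 1.6
	// so the score is about 2.0 * 2.0 * 1.6 = 6.4
	fmt.Printf("hot:  %.2f\n", defaultScore(c, 999, 9))

	// Same all-time likes but no recent activity: recencyBoost falls to 1.0,
	// so the score is about 2.0 * 1.0 * 1.6 = 3.2.
	fmt.Printf("cold: %.2f\n", defaultScore(c, 999, 0))
}
```

With equal BM25 and equal all-time likes, the recently active post scores roughly twice as high, which is the behavior the windowed feature is meant to produce.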

Types ¶

type Candidate ¶

type Candidate struct {
	ID   string  // document ID
	BM25 float64 // text-relevance score from OpenSearch
}

Candidate is the input to the rerank stage — what the recall stage returned, before counter features are applied.

type Config ¶

type Config struct {
	// Recall is the candidate-generation stage. Required.
	Recall Recaller

	// LikesAllTime is the Murmur QueryService client for the all-time
	// likes counter. Optional — set nil to skip this feature.
	LikesAllTime murmurv1connect.QueryServiceClient

	// LikesWindowed is the Murmur QueryService client for the windowed
	// likes counter. Same client type, but typically points at a
	// DIFFERENT pipeline (a windowed counter over the last 24h, not the
	// all-time one). Set nil to skip the windowed feature.
	LikesWindowed murmurv1connect.QueryServiceClient

	// WindowDuration is the duration to query against LikesWindowed.
	// Default: 24 hours.
	WindowDuration time.Duration

	// TopN is the recall set size. Default: 200.
	TopN int

	// TopK is the final returned set size. Default: 20.
	TopK int

	// ScoreFn is the final-stage scoring function. Default: DefaultScore.
	ScoreFn ScoreFn
}

Config configures the rerank service. ScoreFn defaults to DefaultScore; set it explicitly for ML rerankers.
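The documented defaults can be summarized in a short sketch. The `settings` type below is a trimmed, hypothetical stand-in for Config (it omits the Recaller and the Murmur clients), and `withDefaults` is an assumption about how zero values might be filled in. How New actually applies or validates defaults is not shown in this documentation.

```go
package main

import (
	"fmt"
	"time"
)

// settings is a trimmed, hypothetical stand-in for rerank.Config that keeps
// only the fields with documented defaults.
type settings struct {
	WindowDuration time.Duration
	TopN           int
	TopK           int
}

// withDefaults fills zero-valued fields with the documented defaults:
// a 24-hour window, TopN=200, TopK=20. Non-zero fields are left alone.
func withDefaults(s settings) settings {
	if s.WindowDuration == 0 {
		s.WindowDuration = 24 * time.Hour
	}
	if s.TopN == 0 {
		s.TopN = 200
	}
	if s.TopK == 0 {
		s.TopK = 20
	}
	return s
}

func main() {
	// Only TopK is set explicitly; the other fields take their defaults.
	fmt.Printf("%+v\n", withDefaults(settings{TopK: 10}))
}
```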

type Recaller ¶

type Recaller interface {
	Recall(ctx context.Context, query string, topN int) ([]Candidate, error)
}

Recaller is the abstraction over the candidate-generation stage. Production: the OpenSearch BM25 + filter query path. Tests: a fake that returns a fixed list. The interface is intentionally minimal — the rerank service doesn't need to know anything about the recall implementation beyond "give me up to N candidates for this query."
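A fixed-list fake of the kind described for tests might look like the sketch below. The Candidate and Recaller declarations mirror the documented shapes so that the snippet compiles on its own. `staticRecaller` and `demoRecall` are hypothetical names, not the ones used in this repository.

```go
package main

import (
	"context"
	"fmt"
)

// Candidate and Recaller mirror the documented shapes; they are redeclared
// here only so the sketch compiles standalone.
type Candidate struct {
	ID   string
	BM25 float64
}

type Recaller interface {
	Recall(ctx context.Context, query string, topN int) ([]Candidate, error)
}

// staticRecaller is a hypothetical fake that ignores the query and returns
// up to topN candidates from a fixed list.
type staticRecaller struct {
	candidates []Candidate
}

func (r staticRecaller) Recall(ctx context.Context, query string, topN int) ([]Candidate, error) {
	// Respect cancellation even though no I/O happens here.
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	n := len(r.candidates)
	if topN >= 0 && topN < n {
		n = topN
	}
	// Copy so callers cannot mutate the fixture.
	out := make([]Candidate, n)
	copy(out, r.candidates[:n])
	return out, nil
}

// demoRecall runs the fake against a three-item fixture.
func demoRecall(topN int) ([]Candidate, error) {
	var r Recaller = staticRecaller{candidates: []Candidate{
		{ID: "a", BM25: 3.1},
		{ID: "b", BM25: 2.4},
		{ID: "c", BM25: 1.0},
	}}
	return r.Recall(context.Background(), "anything", topN)
}

func main() {
	got, err := demoRecall(2)
	fmt.Println(len(got), err)
}
```

A production implementation would satisfy the same interface but issue the search request inside Recall.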

type ScoreFn ¶

type ScoreFn func(c Candidate, likesAll, likes24h int64) float64

ScoreFn computes the final ranking score from a candidate's BM25 and counter features. Default — see DefaultScore — uses log10-scaled likes; override to plug in a learned model.
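As an illustration of overriding the score, the sketch below defines an additive blend with the same signature. The local type declarations mirror the documented ones. `additiveScore` and its weights are hypothetical and purely illustrative, and a learned model would sit behind the same function shape.

```go
package main

import (
	"fmt"
	"math"
)

// Candidate and ScoreFn mirror the shapes documented in package rerank;
// they are redeclared here only so the sketch compiles standalone.
type Candidate struct {
	ID   string
	BM25 float64
}

type ScoreFn func(c Candidate, likesAll, likes24h int64) float64

// additiveScore is a hypothetical alternative to DefaultScore: a linear
// blend of BM25 and log-scaled likes. The weights are illustrative only.
func additiveScore(c Candidate, likesAll, likes24h int64) float64 {
	return c.BM25 +
		0.5*math.Log10(float64(likes24h)+1) +
		0.1*math.Log10(float64(likesAll)+1)
}

func main() {
	// With the real package this would be Config{ScoreFn: additiveScore}.
	var fn ScoreFn = additiveScore
	fmt.Println(fn(Candidate{ID: "post-1", BM25: 2.0}, 1000, 50))
}
```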

type ScoredCandidate ¶

type ScoredCandidate struct {
	ID    string
	Score float64

	// Features carried through for inspection.
	BM25     float64
	LikesAll int64
	Likes24h int64
}

ScoredCandidate is the output of the rerank stage — the candidate plus its final blended score and the raw feature values that produced it (handy for explainability and A/B logging).

type Service ¶

type Service struct {
	// contains filtered or unexported fields
}

Service is the rerank service — wraps a Recaller plus Murmur clients behind a single Search method.

func New ¶

func New(cfg Config) (*Service, error)

New constructs a Service.

func (*Service) Search ¶

func (s *Service) Search(ctx context.Context, query string) ([]ScoredCandidate, error)

Search runs the two-stage retrieval and returns the top-K reranked candidates. Failures in the counter-fetch stage degrade gracefully — missing features are treated as zero, the recall result still scores (BM25 alone), and the response carries on. The user sees a slightly less relevant ranking, not an error.
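The degradation behavior described above can be sketched as follows. This is not the package's actual implementation. `lookupFn` is a hypothetical stand-in for the batched Murmur fetch, and `rerank`, `lookupOrEmpty`, `score`, and `failingLookup` are illustrative names. The point is that a failed fetch yields zero-valued features, so each score collapses to BM25.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

type Candidate struct {
	ID   string
	BM25 float64
}

type ScoredCandidate struct {
	ID       string
	Score    float64
	BM25     float64
	LikesAll int64
	Likes24h int64
}

// lookupFn is a hypothetical stand-in for a batched Murmur counter fetch.
type lookupFn func(ids []string) (map[string]int64, error)

// lookupOrEmpty swallows fetch errors and returns a nil map, so every
// feature read from it yields zero.
func lookupOrEmpty(fn lookupFn, ids []string) map[string]int64 {
	if fn == nil {
		return nil
	}
	vals, err := fn(ids)
	if err != nil {
		return nil
	}
	return vals
}

// score restates the default blend from the README.
func score(bm25 float64, likesAll, likes24h int64) float64 {
	return bm25 * (1 + math.Log10(float64(likes24h)+1)) *
		(1 + 0.2*math.Log10(float64(likesAll)+1))
}

// rerank scores every candidate and returns the topK highest.
func rerank(cands []Candidate, allTime, windowed lookupFn, topK int) []ScoredCandidate {
	ids := make([]string, len(cands))
	for i, c := range cands {
		ids[i] = c.ID
	}
	all := lookupOrEmpty(allTime, ids)
	recent := lookupOrEmpty(windowed, ids)

	out := make([]ScoredCandidate, 0, len(cands))
	for _, c := range cands {
		la, l24 := all[c.ID], recent[c.ID] // missing keys and nil maps read as 0
		out = append(out, ScoredCandidate{
			ID: c.ID, Score: score(c.BM25, la, l24),
			BM25: c.BM25, LikesAll: la, Likes24h: l24,
		})
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if topK >= 0 && topK < len(out) {
		out = out[:topK]
	}
	return out
}

// failingLookup simulates Murmur being unavailable.
func failingLookup([]string) (map[string]int64, error) {
	return nil, errors.New("murmur unavailable")
}

func main() {
	cands := []Candidate{{ID: "a", BM25: 1.5}, {ID: "b", BM25: 3.0}}
	for _, sc := range rerank(cands, failingLookup, failingLookup, 20) {
		fmt.Printf("%s %.2f\n", sc.ID, sc.Score)
	}
}
```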

func (*Service) Stats ¶

func (s *Service) Stats() *Stats

Stats returns a pointer to live counters. Safe for concurrent reads.

type Stats ¶

type Stats struct {
	Searches          atomic.Int64
	CandidatesFetched atomic.Int64
	LikesAllHits      atomic.Int64
	LikesWindowHits   atomic.Int64
	LikesMisses       atomic.Int64
}

Stats reports search volume and counter coverage. Useful for dashboards and for verifying the rerank stage is doing what you think.
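A dashboard might derive coverage ratios from these counters along the lines of the sketch below. The Stats declaration mirrors the documented struct. The exact semantics of each field (for example, whether a hit is counted per candidate) are an assumption here, and `ratio` and `demoStats` are hypothetical helpers with made-up numbers.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Stats mirrors the documented struct (atomic.Int64 needs Go 1.19+).
type Stats struct {
	Searches          atomic.Int64
	CandidatesFetched atomic.Int64
	LikesAllHits      atomic.Int64
	LikesWindowHits   atomic.Int64
	LikesMisses       atomic.Int64
}

// ratio returns num/den, or 0 when den is 0.
func ratio(num, den int64) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

// demoStats simulates the counters after one search; the increments are
// made up and assume hits are counted per candidate.
func demoStats() *Stats {
	s := &Stats{}
	s.Searches.Add(1)
	s.CandidatesFetched.Add(200)
	s.LikesAllHits.Add(180)
	s.LikesWindowHits.Add(150)
	s.LikesMisses.Add(20)
	return s
}

func main() {
	s := demoStats()
	fetched := s.CandidatesFetched.Load()
	fmt.Printf("all-time coverage: %.2f\n", ratio(s.LikesAllHits.Load(), fetched))
	fmt.Printf("windowed coverage: %.2f\n", ratio(s.LikesWindowHits.Load(), fetched))
}
```

Each Load is individually atomic, but a ratio computed from two separate loads is only an approximate snapshot while searches are in flight.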

Source Files ¶

rerank.go

Directories ¶

Path	Synopsis
cmd/server (command)	HTTP server fronting the rerank service.
