Documentation
Overview
Package main runs a retrieval-quality benchmark for embedding models on production data. It samples real user messages that look like retrieval queries (help/remember/what-about-X) and then, for each candidate embedding configuration:
- embeds all topic summaries in the haystack
- embeds the test queries
- computes Recall@k where the "gold" topic is the one containing the original message (via topic_id)
It also reports wall time and token counts, which are useful for sizing the full production re-embedding job.
Usage:
go run ./cmd/embed-benchmark --queries=50 --db=data/prod/laplaced.db