queries

package

v0.2.0 · Published: Apr 30, 2026 · License: MIT · Imports: 16 · Imported by: 0

Repository

github.com/Cidan/memmy

Documentation ¶

Overview ¶

Package queries owns the labeled-query side of the eval framework.

A LabeledQuery couples a query string with the gold node IDs (or turn UUIDs) that should rank highly when the query runs against the replayed memmy store. Labels come either from a Generator (cheap rule-based for tests, LLM for production) or from a Judge that scores returned candidates after the fact.

The package ships pluggable Generator / Judge interfaces and Fake implementations that work without network access. Real Gemini-backed implementations are wired in cmd/memmy-eval to keep this package import-graph clean for unit tests.

Index ¶

func QueryID(text string, cat Category) string
type Candidate
type Category
- func AllCategories() []Category
type FakeGenerator
- func NewFakeGenerator() *FakeGenerator
- func (FakeGenerator) Generate(_ context.Context, turns []corpus.StoredTurn, req GenerateRequest) ([]LabeledQuery, error)
- func (FakeGenerator) Version() string
type FakeJudge
- func NewFakeJudge() *FakeJudge
- func (FakeJudge) Judge(_ context.Context, q LabeledQuery, cands []Candidate) ([]Verdict, error)
- func (FakeJudge) Version() string
type GenerateRequest
type Generator
type Judge
type LabeledQuery
type Store
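- func OpenStore(path string) (*Store, error)
- func (s *Store) All(ctx context.Context) ([]LabeledQuery, error)
- func (s *Store) ByCategory(ctx context.Context, c Category) ([]LabeledQuery, error)
- func (s *Store) Close() error
- func (s *Store) Count(ctx context.Context) (int, error)
- func (s *Store) CountForGeneration(ctx context.Context, generatorVersion, corpusSnapshotHash string, category Category) (int, error)
- func (s *Store) Embedding(ctx context.Context, queryID string, dim int) ([]float32, bool, error)
- func (s *Store) Put(ctx context.Context, q LabeledQuery, generatorVersion, corpusSnapshotHash string) error
- func (s *Store) PutEmbedding(ctx context.Context, queryID string, vec []float32) error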
type Verdict

Functions ¶

func QueryID ¶

func QueryID(text string, cat Category) string

QueryID returns the stable hash of (text, cat) that is used as the query's primary key.
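
The documentation does not say which hash QueryID uses. The following is a minimal sketch of one way to derive a stable key from the same two inputs, assuming SHA-256 with a NUL separator; queryIDSketch is a hypothetical name and is not part of the package.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Category mirrors the package's string-backed Category type.
type Category string

// queryIDSketch is a hypothetical stand-in for QueryID. It hashes the
// category and the text with a NUL byte between them, so that shifting
// characters from one field to the other changes the digest.
// ASSUMPTION: the real hash function and encoding are not documented.
func queryIDSketch(text string, cat Category) string {
	h := sha256.New()
	h.Write([]byte(cat))
	h.Write([]byte{0})
	h.Write([]byte(text))
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := queryIDSketch("what did we decide about caching?", "paraphrase")
	b := queryIDSketch("what did we decide about caching?", "paraphrase")
	c := queryIDSketch("what did we decide about caching?", "distractor")
	// Same inputs give the same ID; a different category gives a different ID.
	fmt.Println(a == b, a == c, len(a)) // true false 64
}
```

Because the key depends only on the inputs, re-generating the same query yields the same ID, which is what lets a store treat repeated inserts as no-ops.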

Types ¶

type Candidate ¶

type Candidate struct {
	NodeID   string
	TurnUUID string
	Text     string
}

Candidate is a single returned chunk handed to the Judge.

type Category ¶

type Category string

Category labels the kind of query for per-bucket metric reporting. Generators produce queries grouped by category and downstream metric aggregation slices results by category.

const (
	// CategoryParaphrase: lightly reworded version of a known turn —
	// the gold turn should rank top-1.
	CategoryParaphrase Category = "paraphrase"
	// CategoryNegation: query that should NOT match the named turn.
	CategoryNegation Category = "negation"
	// CategoryTopicJump: query targets a topic only one specific turn covered.
	CategoryTopicJump Category = "topic-jump"
	// CategoryDistractor: query similar in surface form but topically distinct.
	CategoryDistractor Category = "distractor"
	// CategoryStaleRelevant: query targets an old turn — tests decay vs relevance.
	CategoryStaleRelevant Category = "stale-relevant"
	// CategoryTemporal: query references a time window ("yesterday").
	CategoryTemporal Category = "temporal"
)

func AllCategories ¶

func AllCategories() []Category

AllCategories returns every supported category in declaration order.
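
As an illustration of per-bucket reporting, the sketch below groups outcomes by category and computes a hit rate for each. The result type and hitRateByCategory are hypothetical; the package's actual metric code is not shown in this documentation.

```go
package main

import "fmt"

// Category mirrors the package's string-backed Category type.
type Category string

const (
	CategoryParaphrase Category = "paraphrase"
	CategoryDistractor Category = "distractor"
)

// result is a hypothetical per-query outcome; the package does not define it.
type result struct {
	Cat Category
	Hit bool // true when a gold turn was retrieved
}

// hitRateByCategory returns hits / total for each category present in rs.
func hitRateByCategory(rs []result) map[Category]float64 {
	hits := map[Category]int{}
	total := map[Category]int{}
	for _, r := range rs {
		total[r.Cat]++
		if r.Hit {
			hits[r.Cat]++
		}
	}
	rates := make(map[Category]float64, len(total))
	for c, n := range total {
		rates[c] = float64(hits[c]) / float64(n)
	}
	return rates
}

func main() {
	rates := hitRateByCategory([]result{
		{CategoryParaphrase, true},
		{CategoryParaphrase, false},
		{CategoryDistractor, true},
	})
	// Iterate a fixed category list so the report order is stable.
	for _, c := range []Category{CategoryParaphrase, CategoryDistractor} {
		fmt.Printf("%s: %.2f\n", c, rates[c])
	}
}
```

In real use, the fixed list in main would presumably come from AllCategories, which returns the categories in declaration order.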

type FakeGenerator ¶

type FakeGenerator struct{}

FakeGenerator emits deterministic per-turn queries for tests. It does NOT use any external service. Output:

paraphrase: the first sentence of the turn, prefixed with "about:"
distractor: a phrase guaranteed not to appear in the corpus
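
The paraphrase rule above can be sketched roughly as follows. The helper names are hypothetical, and two details are assumptions: that a sentence ends at the first '.', '!' or '?', and that a single space follows the "about:" prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// firstSentence returns the text up to and including the first '.', '!'
// or '?'. If there is no terminator it returns the whole trimmed text.
// ASSUMPTION: the real sentence splitter is not documented.
func firstSentence(text string) string {
	text = strings.TrimSpace(text)
	if i := strings.IndexAny(text, ".!?"); i >= 0 {
		return text[:i+1]
	}
	return text
}

// paraphraseQuery is a hypothetical helper that builds the paraphrase
// query for one turn: the "about:" prefix plus the turn's first sentence.
func paraphraseQuery(turnText string) string {
	return "about: " + firstSentence(turnText)
}

func main() {
	fmt.Println(paraphraseQuery("We moved the cache to Redis. Latency dropped."))
	// about: We moved the cache to Redis.
}
```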

func NewFakeGenerator ¶

func NewFakeGenerator() *FakeGenerator

NewFakeGenerator returns the deterministic test Generator.

func (FakeGenerator) Generate ¶

func (FakeGenerator) Generate(_ context.Context, turns []corpus.StoredTurn, req GenerateRequest) ([]LabeledQuery, error)

Generate satisfies Generator.

func (FakeGenerator) Version ¶

func (FakeGenerator) Version() string

Version satisfies Generator.

type FakeJudge ¶

type FakeJudge struct{}

FakeJudge marks any candidate whose text shares a non-trivial token with the query as relevant (score 1.0); every other candidate scores 0.0.
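
A token-overlap rule of this kind might look like the sketch below. What counts as a "non-trivial" token is not specified here, so the sketch assumes lowercase alphanumeric tokens of at least four bytes; tokens and overlapScore are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokens lowercases s, splits it on anything that is not a letter or digit,
// and keeps tokens of at least four bytes.
// ASSUMPTION: the four-byte cutoff stands in for "non-trivial".
func tokens(s string) map[string]bool {
	set := map[string]bool{}
	fields := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, f := range fields {
		if len(f) >= 4 {
			set[f] = true
		}
	}
	return set
}

// overlapScore returns 1.0 when query and candidate share a kept token,
// and 0.0 otherwise.
func overlapScore(query, candidate string) float64 {
	q := tokens(query)
	for t := range tokens(candidate) {
		if q[t] {
			return 1.0
		}
	}
	return 0.0
}

func main() {
	fmt.Println(overlapScore("redis cache latency", "We moved the cache to Redis.")) // 1
	fmt.Println(overlapScore("the cat", "the dog"))                                  // 0
}
```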

func NewFakeJudge ¶

func NewFakeJudge() *FakeJudge

NewFakeJudge returns the deterministic test Judge.

func (FakeJudge) Judge ¶

func (FakeJudge) Judge(_ context.Context, q LabeledQuery, cands []Candidate) ([]Verdict, error)

Judge satisfies Judge.

func (FakeJudge) Version ¶

func (FakeJudge) Version() string

Version satisfies Judge.

type GenerateRequest ¶

type GenerateRequest struct {
	Categories []Category
	TargetN    int // per category
}

GenerateRequest configures one Generator.Generate call.

type Generator ¶

type Generator interface {
	// Version is the generator-version field used as part of the
	// dedup key. Bump when the prompting strategy changes.
	Version() string
	// Generate returns up to req.TargetN queries per category in
	// req.Categories, drawn from `turns`.
	Generate(ctx context.Context, turns []corpus.StoredTurn, req GenerateRequest) ([]LabeledQuery, error)
}

Generator turns a corpus into a labeled query set. Implementations must be deterministic given the same (corpus, request) pair — the dedup key in the queries store assumes identical re-runs produce identical outputs.
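
To make the determinism requirement concrete, here is a minimal sketch of a generator that returns the same output for the same corpus regardless of input order. All types are local stand-ins (corpus.StoredTurn's fields are not shown in this documentation), and the "sort by UUID, take the first TargetN" policy is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

type Category string

// storedTurn is a stand-in for corpus.StoredTurn; its fields are assumed.
type storedTurn struct {
	UUID string
	Text string
}

type generateRequest struct {
	Categories []Category
	TargetN    int // per category
}

type labeledQuery struct {
	Category      Category
	Text          string
	GoldTurnUUIDs []string
}

// generate emits at most req.TargetN queries per requested category.
// It sorts a copy of turns by UUID first, so the result does not depend
// on the order in which the caller supplies the corpus.
func generate(turns []storedTurn, req generateRequest) []labeledQuery {
	sorted := append([]storedTurn(nil), turns...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].UUID < sorted[j].UUID })

	var out []labeledQuery
	for _, cat := range req.Categories {
		for i, t := range sorted {
			if i >= req.TargetN {
				break
			}
			out = append(out, labeledQuery{
				Category:      cat,
				Text:          string(cat) + ": " + t.Text,
				GoldTurnUUIDs: []string{t.UUID},
			})
		}
	}
	return out
}

func main() {
	req := generateRequest{Categories: []Category{"paraphrase", "distractor"}, TargetN: 2}
	a := generate([]storedTurn{{"t3", "c"}, {"t1", "a"}, {"t2", "b"}}, req)
	b := generate([]storedTurn{{"t1", "a"}, {"t2", "b"}, {"t3", "c"}}, req)
	fmt.Println(len(a), a[0].GoldTurnUUIDs[0], b[0].GoldTurnUUIDs[0]) // 4 t1 t1
}
```

An LLM-backed generator would need its own way to meet the same guarantee; the sketch only covers the rule-based case.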

type Judge ¶

type Judge interface {
	Version() string
	Judge(ctx context.Context, q LabeledQuery, candidates []Candidate) ([]Verdict, error)
}

Judge scores a (query, candidate) pair as relevant or not. It is used to expand gold labels after a run by asking an LLM whether the candidate actually answered the query. Returned scores are in [0, 1].
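
One way the returned verdicts could feed back into gold labels is sketched below. The expandGold helper and the 0.5 threshold are hypothetical; the candidate and verdict types are local copies of the fields documented for Candidate and Verdict.

```go
package main

import "fmt"

type candidate struct {
	NodeID   string
	TurnUUID string
}

type verdict struct {
	NodeID string
	Score  float64 // 0..1, higher = more relevant
}

// expandGold returns gold plus the turn UUID of every candidate whose
// verdict score is at least threshold. Existing entries keep their order
// and duplicates are skipped. Verdicts for unknown node IDs are ignored.
func expandGold(gold []string, cands []candidate, verdicts []verdict, threshold float64) []string {
	turnOf := make(map[string]string, len(cands))
	for _, c := range cands {
		turnOf[c.NodeID] = c.TurnUUID
	}
	seen := make(map[string]bool, len(gold))
	for _, g := range gold {
		seen[g] = true
	}
	out := append([]string(nil), gold...)
	for _, v := range verdicts {
		uuid, ok := turnOf[v.NodeID]
		if !ok || v.Score < threshold || seen[uuid] {
			continue
		}
		seen[uuid] = true
		out = append(out, uuid)
	}
	return out
}

func main() {
	cands := []candidate{{"n1", "t1"}, {"n2", "t2"}, {"n3", "t3"}}
	verdicts := []verdict{{"n1", 1.0}, {"n2", 0.9}, {"n3", 0.1}}
	fmt.Println(expandGold([]string{"t1"}, cands, verdicts, 0.5)) // [t1 t2]
}
```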

type LabeledQuery ¶

type LabeledQuery struct {
	ID            string
	Category      Category
	Text          string
	GoldTurnUUIDs []string
	Notes         string
	GeneratedAt   time.Time
}

LabeledQuery is the unit a Generator produces and a query battery consumes. GoldTurnUUIDs are the corpus turn UUIDs whose chunks should be considered "correct hits"; metrics map them to memmy node IDs at scoring time via the source turn UUID stored in node text.
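
A rough sketch of scoring one query against its gold turns follows. It assumes each ranked result already carries its source turn UUID (as Candidate does), rather than parsing it out of node text; firstGoldRank and reciprocalRank are hypothetical helpers, not part of this package.

```go
package main

import "fmt"

// candidate holds the two Candidate fields this sketch needs.
type candidate struct {
	NodeID   string
	TurnUUID string
}

// firstGoldRank returns the 1-based rank of the first candidate whose
// TurnUUID appears in gold, or 0 when no gold turn was retrieved.
func firstGoldRank(gold []string, ranked []candidate) int {
	isGold := make(map[string]bool, len(gold))
	for _, g := range gold {
		isGold[g] = true
	}
	for i, c := range ranked {
		if isGold[c.TurnUUID] {
			return i + 1
		}
	}
	return 0
}

// reciprocalRank converts a rank into 1/rank, with 0 for a miss.
func reciprocalRank(rank int) float64 {
	if rank == 0 {
		return 0
	}
	return 1 / float64(rank)
}

func main() {
	ranked := []candidate{{"n7", "t1"}, {"n4", "t2"}, {"n9", "t3"}}
	rank := firstGoldRank([]string{"t2"}, ranked)
	fmt.Println(rank, reciprocalRank(rank)) // 2 0.5
}
```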

type Store ¶

type Store struct {
	// contains filtered or unexported fields
}

Store is the per-dataset queries.sqlite handle.

func OpenStore ¶

func OpenStore(path string) (*Store, error)

OpenStore opens or creates the queries database at path.

func (*Store) All ¶

func (s *Store) All(ctx context.Context) ([]LabeledQuery, error)

All returns every labeled query in storage order (by category then ID).

func (*Store) ByCategory ¶

func (s *Store) ByCategory(ctx context.Context, c Category) ([]LabeledQuery, error)

ByCategory returns labeled queries filtered to one category.

func (*Store) Close ¶

func (s *Store) Close() error

Close releases the handle.

func (*Store) Count ¶

func (s *Store) Count(ctx context.Context) (int, error)

Count returns the total number of queries.

func (*Store) CountForGeneration ¶

func (s *Store) CountForGeneration(ctx context.Context, generatorVersion, corpusSnapshotHash string, category Category) (int, error)

CountForGeneration returns the number of queries already stored that match the (generator_version, corpus_snapshot_hash, category) tuple. Used by the queries subcommand to decide whether to re-run the generator at all.
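
The skip decision could be sketched as below. The "fewer than targetN stored" policy, the pendingCategories helper, and the in-memory fakeCounts type are assumptions for illustration; only the CountForGeneration signature comes from this package.

```go
package main

import (
	"context"
	"fmt"
)

type Category string

// generationCounter is the one Store method this sketch relies on.
type generationCounter interface {
	CountForGeneration(ctx context.Context, generatorVersion, corpusSnapshotHash string, category Category) (int, error)
}

// pendingCategories returns the categories that still have fewer than
// targetN stored queries for this (version, snapshot) pair.
// ASSUMPTION: the real subcommand may use a different rule.
func pendingCategories(ctx context.Context, s generationCounter, version, snapshot string, cats []Category, targetN int) ([]Category, error) {
	var pending []Category
	for _, c := range cats {
		n, err := s.CountForGeneration(ctx, version, snapshot, c)
		if err != nil {
			return nil, fmt.Errorf("count %q: %w", c, err)
		}
		if n < targetN {
			pending = append(pending, c)
		}
	}
	return pending, nil
}

// fakeCounts is an in-memory stand-in for *Store, keyed by category only.
type fakeCounts map[Category]int

func (f fakeCounts) CountForGeneration(_ context.Context, _, _ string, c Category) (int, error) {
	return f[c], nil
}

func main() {
	store := fakeCounts{"paraphrase": 10, "distractor": 3}
	cats := []Category{"paraphrase", "distractor"}
	pending, err := pendingCategories(context.Background(), store, "fake-v1", "abc123", cats, 10)
	if err != nil {
		panic(err)
	}
	fmt.Println(pending) // [distractor]
}
```

An empty result means every category is already covered, so the generator call can be skipped entirely.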

func (*Store) Embedding ¶

func (s *Store) Embedding(ctx context.Context, queryID string, dim int) ([]float32, bool, error)

Embedding returns the stored embedding for queryID. When none is stored, it returns a nil slice and false.

func (*Store) Put ¶

func (s *Store) Put(ctx context.Context, q LabeledQuery, generatorVersion, corpusSnapshotHash string) error

Put inserts a labeled query. It is idempotent: if a query with the same ID already exists, the call is a no-op. This preserves the original generated_at and embedding, so re-running the generator after new data appears does not discard the cached vector.
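
The insert-if-absent behaviour can be illustrated with an in-memory map. This is only a sketch of the semantics: the real Store is backed by SQLite, and memStore, its methods, and the int64 timestamp are hypothetical simplifications.

```go
package main

import "fmt"

// row is a simplified record for one stored query.
type row struct {
	Text        string
	GeneratedAt int64 // Unix seconds here; the real field is a time.Time
	Embedding   []float32
}

// memStore is a hypothetical in-memory stand-in for the queries store.
type memStore struct {
	rows map[string]*row
}

func newMemStore() *memStore {
	return &memStore{rows: map[string]*row{}}
}

// put writes the row only when id is absent and reports whether it wrote.
// An existing row, including its timestamp and embedding, is left alone.
func (s *memStore) put(id, text string, generatedAt int64) bool {
	if _, exists := s.rows[id]; exists {
		return false
	}
	s.rows[id] = &row{Text: text, GeneratedAt: generatedAt}
	return true
}

// putEmbedding attaches vec to an existing row; unknown IDs are ignored.
func (s *memStore) putEmbedding(id string, vec []float32) {
	if r, ok := s.rows[id]; ok {
		r.Embedding = vec
	}
}

func main() {
	s := newMemStore()
	fmt.Println(s.put("q1", "about: caching", 100)) // true
	s.putEmbedding("q1", []float32{0.1, 0.2})
	fmt.Println(s.put("q1", "about: caching", 200)) // false
	r := s.rows["q1"]
	fmt.Println(r.GeneratedAt, len(r.Embedding)) // 100 2
}
```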

func (*Store) PutEmbedding ¶

func (s *Store) PutEmbedding(ctx context.Context, queryID string, vec []float32) error

PutEmbedding stores the query embedding. Idempotent.
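
The on-disk encoding of embeddings is not documented. As a sketch of why Embedding might take a dim argument, the example below stores a vector as little-endian float32 bytes and rejects blobs whose length does not match the expected dimension. The encoding, encodeVec, and decodeVec are all assumptions.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeVec packs vec as consecutive little-endian float32 values.
// ASSUMPTION: the real storage format is not documented.
func encodeVec(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(v))
	}
	return buf
}

// decodeVec unpacks buf into dim float32 values. It returns an error when
// the blob length does not equal 4*dim, which would indicate a vector
// produced by a model with a different dimension.
func decodeVec(buf []byte, dim int) ([]float32, error) {
	if len(buf) != 4*dim {
		return nil, fmt.Errorf("embedding has %d bytes, want %d for dim %d", len(buf), 4*dim, dim)
	}
	out := make([]float32, dim)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(buf[4*i:]))
	}
	return out, nil
}

func main() {
	blob := encodeVec([]float32{1.5, -2})
	vec, err := decodeVec(blob, 2)
	fmt.Println(vec, err) // [1.5 -2] <nil>

	_, err = decodeVec(blob, 3)
	fmt.Println(err != nil) // true
}
```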

type Verdict ¶

type Verdict struct {
	NodeID string
	Score  float64 // 0..1, higher = more relevant
	Reason string
}

Verdict is the Judge's assessment of one candidate.
