Documentation ¶
Overview ¶
Package benchmark provides a self-validating benchmark harness for Synapses.
Each Scenario derives its ground truth from the current graph state rather than from hardcoded node IDs, which makes benchmarks portable across any indexed codebase.
The quality metrics (precision, recall, F1) are structural and deterministic, and latency is reported alongside them. No LLM judge is needed, because results are measured against the graph's own topology.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func BuiltinScenarioNames ¶
func BuiltinScenarioNames() []string
BuiltinScenarioNames returns the names of all built-in scenarios.
Types ¶
type QueryResult ¶
type QueryResult struct {
	Label     string  `json:"label"`
	Precision float64 `json:"precision"`
	Recall    float64 `json:"recall"`
	F1        float64 `json:"f1"`
	LatencyMs float64 `json:"latency_ms"`
	Expected  int     `json:"expected"` // ground truth size
	Returned  int     `json:"returned"` // result size
	Relevant  int     `json:"relevant"` // |expected ∩ returned|
}
QueryResult holds the outcome of a single benchmark query.
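The relationship between the count fields and the score fields can be sketched as follows. This is a minimal illustration, not the package's implementation: `score` is a hypothetical helper, the local `QueryResult` mirrors only some of the documented fields, and the convention of reporting 0 when a denominator is empty is an assumption.

```go
package main

import "fmt"

// QueryResult mirrors a subset of the documented fields (latency omitted).
type QueryResult struct {
	Label     string
	Precision float64
	Recall    float64
	F1        float64
	Expected  int // ground truth size
	Returned  int // result size
	Relevant  int // |expected ∩ returned|
}

// score is a hypothetical helper, not part of the package API.
// It compares a returned ID set against the expected (ground truth) set.
func score(label string, expected, returned []string) QueryResult {
	want := make(map[string]struct{}, len(expected))
	for _, id := range expected {
		want[id] = struct{}{}
	}
	seen := make(map[string]struct{}, len(returned))
	relevant := 0
	for _, id := range returned {
		if _, dup := seen[id]; dup {
			continue // count each returned ID once
		}
		seen[id] = struct{}{}
		if _, ok := want[id]; ok {
			relevant++
		}
	}
	r := QueryResult{Label: label, Expected: len(want), Returned: len(seen), Relevant: relevant}
	// Assumption: an empty denominator yields 0 rather than NaN.
	if r.Returned > 0 {
		r.Precision = float64(relevant) / float64(r.Returned)
	}
	if r.Expected > 0 {
		r.Recall = float64(relevant) / float64(r.Expected)
	}
	if r.Precision+r.Recall > 0 {
		r.F1 = 2 * r.Precision * r.Recall / (r.Precision + r.Recall)
	}
	return r
}

func main() {
	r := score("callers", []string{"a", "b", "c", "d"}, []string{"a", "b", "x"})
	fmt.Printf("%s: P=%.3f R=%.3f F1=%.3f\n", r.Label, r.Precision, r.Recall, r.F1)
}
```

Here precision is Relevant/Returned, recall is Relevant/Expected, and F1 is their harmonic mean.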
type Result ¶
type Result struct {
	Timestamp  string           `json:"timestamp"`
	RepoID     string           `json:"repo_id"`
	NodeCount  int              `json:"node_count"`
	EdgeCount  int              `json:"edge_count"`
	Scenarios  []ScenarioResult `json:"scenarios"`
	Summary    Summary          `json:"summary"`
	DurationMs int64            `json:"total_duration_ms"`
	Note       string           `json:"note,omitempty"` // informational note (e.g. scenario name normalization)
}
Result holds the outcome of running a complete benchmark suite.
type Scenario ¶
type Scenario struct {
	Name        string
	Description string
	// Run executes the scenario against the live graph and store.
	// Returns query results. Error means the scenario couldn't run (not a quality failure).
	Run func(g *graph.Graph, st *store.Store) ([]QueryResult, error)
	// PassThreshold is the minimum average F1 to consider this scenario "passed".
	PassThreshold float64
}
Scenario defines a benchmark that derives ground truth from the graph.
func BuiltinScenarios ¶
func BuiltinScenarios() []Scenario
BuiltinScenarios returns the standard set of scenarios shipped with Synapses. Their names are listed by BuiltinScenarioNames() for MCP tool discovery.
func FindScenario ¶
FindScenario returns the named scenario, or an error if not found.
type ScenarioResult ¶
type ScenarioResult struct {
	Name         string        `json:"name"`
	Description  string        `json:"description"`
	Queries      []QueryResult `json:"queries"`
	Passed       bool          `json:"passed"`
	AvgF1        float64       `json:"avg_f1"`
	AvgLatencyMs float64       `json:"avg_latency_ms"`
	Error        string        `json:"error,omitempty"`
}
ScenarioResult holds the outcome of a single scenario.
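How a Scenario's Run output might turn into a ScenarioResult can be sketched as below. Everything here is an assumption made for illustration: `runScenario`, `okScenario`, and `brokenScenario` are hypothetical, `Graph` and `Store` are empty stand-ins for `graph.Graph` and `store.Store`, and treating "minimum average F1" as a `>=` comparison is an interpretation of the PassThreshold comment rather than confirmed behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Graph and Store are empty stand-ins for graph.Graph and store.Store.
type Graph struct{}
type Store struct{}

// Local mirrors of the documented types, reduced to the fields used here.
type QueryResult struct {
	Label     string
	F1        float64
	LatencyMs float64
}

type Scenario struct {
	Name          string
	Run           func(g *Graph, st *Store) ([]QueryResult, error)
	PassThreshold float64
}

type ScenarioResult struct {
	Name         string
	Queries      []QueryResult
	Passed       bool
	AvgF1        float64
	AvgLatencyMs float64
	Error        string
}

// runScenario is a hypothetical helper. A Run error is recorded in Error
// and is kept separate from a quality failure (Passed == false).
func runScenario(s Scenario, g *Graph, st *Store) ScenarioResult {
	res := ScenarioResult{Name: s.Name}
	queries, err := s.Run(g, st)
	if err != nil {
		res.Error = err.Error()
		return res
	}
	res.Queries = queries
	if len(queries) == 0 {
		return res
	}
	var f1Sum, latSum float64
	for _, q := range queries {
		f1Sum += q.F1
		latSum += q.LatencyMs
	}
	n := float64(len(queries))
	res.AvgF1 = f1Sum / n
	res.AvgLatencyMs = latSum / n
	res.Passed = res.AvgF1 >= s.PassThreshold // assumption: threshold is inclusive
	return res
}

// okScenario returns fixed query results for demonstration.
func okScenario() Scenario {
	return Scenario{Name: "ok", PassThreshold: 0.75, Run: func(*Graph, *Store) ([]QueryResult, error) {
		return []QueryResult{{"q1", 1.0, 2}, {"q2", 0.5, 4}}, nil
	}}
}

// brokenScenario simulates a scenario that cannot run at all.
func brokenScenario() Scenario {
	return Scenario{Name: "broken", PassThreshold: 0.75, Run: func(*Graph, *Store) ([]QueryResult, error) {
		return nil, errors.New("graph too small")
	}}
}

func main() {
	for _, s := range []Scenario{okScenario(), brokenScenario()} {
		r := runScenario(s, &Graph{}, &Store{})
		fmt.Printf("%s: passed=%v avgF1=%.2f err=%q\n", r.Name, r.Passed, r.AvgF1, r.Error)
	}
}
```

Keeping the error path separate from the pass/fail decision matches the distinction the Scenario comments draw between a scenario that could not run and one that ran but scored poorly.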
type Summary ¶
type Summary struct {
	ScenariosRun     int     `json:"scenarios_run"`
	ScenariosPassed  int     `json:"scenarios_passed"`
	ScenariosErrored int     `json:"scenarios_errored"` // scenarios that could not run (graph too small, etc.)
	AvgPrecision     float64 `json:"avg_precision"`
	AvgRecall        float64 `json:"avg_recall"`
	AvgF1            float64 `json:"avg_f1"`
	AvgLatencyMs     float64 `json:"avg_latency_ms"`
	P95LatencyMs     float64 `json:"p95_latency_ms"`
}
Summary aggregates metrics across all scenarios.
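The latency fields of a summary like this could be derived as sketched below. The documentation does not say which percentile method the package uses or whether it aggregates per query or per scenario, so the nearest-rank method and the flat list of per-query latencies are both assumptions, and `percentile` and `mean` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100).
// It returns 0 for empty input and does not modify the caller's slice.
func percentile(values []float64, p float64) float64 {
	if len(values) == 0 {
		return 0
	}
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	// Nearest rank: the smallest rank covering at least p percent of samples.
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// mean returns the arithmetic mean, or 0 for empty input.
func mean(values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	return sum / float64(len(values))
}

func main() {
	// Hypothetical per-query latencies in milliseconds.
	latencies := []float64{12, 3, 5, 8, 40, 4, 6, 7, 9, 5}
	fmt.Printf("avg=%.1fms p95=%.1fms\n", mean(latencies), percentile(latencies, 95))
}
```

Reporting a high percentile next to the average is useful because a single slow query can dominate the tail while barely moving the mean.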