benchmark

package
v0.8.0 Latest
Warning: This package is not in the latest version of its module.

Published: Mar 29, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package benchmark provides a self-validating benchmark harness for Synapses.

Each Scenario derives ground truth from the current graph state — no hardcoded node IDs. This makes benchmarks portable across any indexed codebase.

Metrics are structural and deterministic: precision, recall, F1, latency. No LLM judge needed — we measure against the graph's own topology.
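The structural metrics named above can be sketched in isolation. The following is a minimal, hypothetical illustration of how precision, recall, and F1 fall out of two sets of node IDs; it is not the package's actual implementation:

```go
package main

import "fmt"

// f1Metrics computes precision, recall, and F1 for one query given
// the ground-truth node IDs and the IDs the query returned.
// Sketch only; the real implementation may differ.
func f1Metrics(expected, returned []string) (precision, recall, f1 float64) {
	truth := make(map[string]bool, len(expected))
	for _, id := range expected {
		truth[id] = true
	}
	relevant := 0 // |expected ∩ returned|
	for _, id := range returned {
		if truth[id] {
			relevant++
		}
	}
	if len(returned) > 0 {
		precision = float64(relevant) / float64(len(returned))
	}
	if len(expected) > 0 {
		recall = float64(relevant) / float64(len(expected))
	}
	if precision+recall > 0 {
		f1 = 2 * precision * recall / (precision + recall)
	}
	return
}

func main() {
	p, r, f := f1Metrics([]string{"a", "b", "c", "d"}, []string{"a", "b", "x"})
	fmt.Printf("precision=%.2f recall=%.2f f1=%.2f\n", p, r, f)
}
```

Because the ground truth is derived from the graph itself, these numbers are fully deterministic for a given index state.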

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func BuiltinScenarioNames

func BuiltinScenarioNames() []string

BuiltinScenarioNames returns the names of all built-in scenarios.

Types

type QueryResult

type QueryResult struct {
	Label     string  `json:"label"`
	Precision float64 `json:"precision"`
	Recall    float64 `json:"recall"`
	F1        float64 `json:"f1"`
	LatencyMs float64 `json:"latency_ms"`
	Expected  int     `json:"expected"` // ground truth size
	Returned  int     `json:"returned"` // result size
	Relevant  int     `json:"relevant"` // |expected ∩ returned|
}

QueryResult holds the outcome of a single benchmark query.

type Result

type Result struct {
	Timestamp  string           `json:"timestamp"`
	RepoID     string           `json:"repo_id"`
	NodeCount  int              `json:"node_count"`
	EdgeCount  int              `json:"edge_count"`
	Scenarios  []ScenarioResult `json:"scenarios"`
	Summary    Summary          `json:"summary"`
	DurationMs int64            `json:"total_duration_ms"`
	Note       string           `json:"note,omitempty"` // informational note (e.g. scenario name normalization)
}

Result holds the outcome of running a complete benchmark suite.

func RunAll

func RunAll(g *graph.Graph, st *store.Store) *Result

RunAll executes all built-in scenarios and returns the aggregate result.

func RunScenarios

func RunScenarios(g *graph.Graph, st *store.Store, scenarios []Scenario) *Result

RunScenarios executes the given scenarios and returns the aggregate result.

type Scenario

type Scenario struct {
	Name        string
	Description string
	// Run executes the scenario against the live graph and store.
	// Returns query results. Error means the scenario couldn't run (not a quality failure).
	Run func(g *graph.Graph, st *store.Store) ([]QueryResult, error)
	// PassThreshold is the minimum average F1 to consider this scenario "passed".
	PassThreshold float64
}

Scenario defines a benchmark that derives ground truth from the graph.
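A custom scenario might look like the sketch below. The Graph and Store types here are hypothetical stand-ins for the real graph.Graph and store.Store (used only to keep the example self-contained), and the "self-lookup" scenario is invented for illustration; note how Run returns an error, not a failed result, when the graph cannot support the scenario:

```go
package main

import (
	"errors"
	"fmt"
)

// Graph and Store are hypothetical stand-ins for the real
// graph.Graph and store.Store types.
type Graph struct{ NodeCount int }
type Store struct{}

type QueryResult struct {
	Label                            string
	Precision, Recall, F1, LatencyMs float64
}

// Scenario mirrors the package type, using the stand-ins above.
type Scenario struct {
	Name, Description string
	Run               func(g *Graph, st *Store) ([]QueryResult, error)
	PassThreshold     float64
}

// selfLookup builds a hypothetical custom scenario. Per the Run
// contract, an error means "couldn't run", not "poor quality".
func selfLookup() Scenario {
	return Scenario{
		Name:          "self-lookup",
		Description:   "each node should be retrievable by its own ID",
		PassThreshold: 0.8,
		Run: func(g *Graph, st *Store) ([]QueryResult, error) {
			if g.NodeCount < 2 {
				return nil, errors.New("graph too small")
			}
			return []QueryResult{{Label: "self", Precision: 1, Recall: 1, F1: 1}}, nil
		},
	}
}

func main() {
	s := selfLookup()
	qs, err := s.Run(&Graph{NodeCount: 10}, &Store{})
	fmt.Println(len(qs), err)
}
```

With the real types, a slice of such scenarios could be passed to RunScenarios alongside the built-ins.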

func BuiltinScenarios

func BuiltinScenarios() []Scenario

BuiltinScenarios returns the standard set of scenarios shipped with Synapses. These are listed in BuiltinScenarioNames() for MCP tool discovery.

func FindScenario

func FindScenario(name string) (Scenario, error)

FindScenario returns the named scenario, or an error if not found.

type ScenarioResult

type ScenarioResult struct {
	Name         string        `json:"name"`
	Description  string        `json:"description"`
	Queries      []QueryResult `json:"queries"`
	Passed       bool          `json:"passed"`
	AvgF1        float64       `json:"avg_f1"`
	AvgLatencyMs float64       `json:"avg_latency_ms"`
	Error        string        `json:"error,omitempty"`
}

ScenarioResult holds the outcome of a single scenario.
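Given the PassThreshold field on Scenario, the Passed and AvgF1 fields could plausibly be derived as in this sketch (assumed logic, not confirmed from the source):

```go
package main

import "fmt"

// QueryResult mirrors the package's per-query outcome, reduced to
// the fields this sketch needs.
type QueryResult struct {
	Label string
	F1    float64
}

// score averages the per-query F1 values and compares the average
// against the scenario's PassThreshold. Sketch only.
func score(queries []QueryResult, passThreshold float64) (avgF1 float64, passed bool) {
	if len(queries) == 0 {
		return 0, false
	}
	var sum float64
	for _, q := range queries {
		sum += q.F1
	}
	avgF1 = sum / float64(len(queries))
	return avgF1, avgF1 >= passThreshold
}

func main() {
	qs := []QueryResult{{"callers", 0.9}, {"callees", 0.7}}
	avg, ok := score(qs, 0.75)
	fmt.Printf("avg_f1=%.2f passed=%v\n", avg, ok)
}
```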

type Summary

type Summary struct {
	ScenariosRun     int     `json:"scenarios_run"`
	ScenariosPassed  int     `json:"scenarios_passed"`
	ScenariosErrored int     `json:"scenarios_errored"` // scenarios that could not run (graph too small, etc.)
	AvgPrecision     float64 `json:"avg_precision"`
	AvgRecall        float64 `json:"avg_recall"`
	AvgF1            float64 `json:"avg_f1"`
	AvgLatencyMs     float64 `json:"avg_latency_ms"`
	P95LatencyMs     float64 `json:"p95_latency_ms"`
}

Summary aggregates across all scenarios.
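P95LatencyMs could be computed in several ways; the nearest-rank method below is one common choice, shown purely as a sketch (the package's actual interpolation is not documented here):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// p95 returns the 95th-percentile latency using the nearest-rank
// method: sort the samples and take the ceil(0.95*n)-th value.
// Sketch only; the real computation may interpolate differently.
func p95(latenciesMs []float64) float64 {
	if len(latenciesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), latenciesMs...) // don't mutate the caller's slice
	sort.Float64s(sorted)
	idx := int(math.Ceil(0.95*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	return sorted[idx]
}

func main() {
	var ls []float64
	for i := 1; i <= 20; i++ {
		ls = append(ls, float64(i))
	}
	fmt.Println(p95(ls)) // → 19
}
```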
