snapshot

package

Version: v0.9.0 (latest) | Published: May 17, 2026 | License: MIT | Imports: 17 | Imported by: 0

Repository

github.com/runixer/laplaced

Documentation ¶

Overview ¶

Package snapshot captures Tempo traces and a database snapshot to disk for later offline replay and evaluation. The package is agent-agnostic: callers pick the TraceQL filter, the package mechanically fetches traces and writes them out. Per-agent extraction (which event holds the prompt, which attribute holds the cost) is left to consumers.

Index ¶

Constants
Variables
func CopyDB(outDir, srcDB string) (string, error)
func ExtractServiceVersion(traceJSON []byte) string
func FormatReport(r *AnalysisReport) string
func RenderLabelsTodo(m *Manifest, spans []*RerankerSpanData) string
func WriteManifest(path string, m *Manifest) error
func WriteManifestFile(outDir string, m *Manifest) error
func WriteTrace(outDir, traceID string, body []byte) (string, error)
type AnalysisReport
- func AnalyzeSpans(spans []*RerankerSpanData) *AnalysisReport
type ArtifactCandidateLine
- func ParseArtifactCandidates(body string) []ArtifactCandidateLine
type CandidateLine
- func ParseCandidates(body string) []CandidateLine
type FallbackBreakdown
type Label
type LabelsFile
- func ParseLabelsMarkdown(r io.Reader) (LabelsFile, error)
type Manifest
- func ReadManifest(path string) (*Manifest, error)
type Numeric
type PerTraceSnapshot
type PersonCandidateLine
- func ParsePersonCandidates(body string) []PersonCandidateLine
type RerankerSpanData
- func ExtractRerankerSpan(traceID string, traceJSON []byte) (*RerankerSpanData, error)
type SelectionResult
type TempoClient
- func NewTempoClient(baseURL string) *TempoClient
- func (c *TempoClient) FetchTrace(ctx context.Context, traceID string) ([]byte, error)
- func (c *TempoClient) Search(ctx context.Context, query string, start, end int64, limit int) ([]TraceMeta, error)
type TraceLabels
type TraceManifestEntry
type TraceMeta
- func (m TraceMeta) StartTimeUnix() int64
type TypeStats

Constants ¶

const ManifestVersion = "1"

ManifestVersion is the on-disk schema version for snapshot manifests. Bump when changing the YAML structure in a non-additive way.

Variables ¶

var ErrNoRerankerSpan = errors.New("no reranker.Execute span in trace")

ErrNoRerankerSpan is returned by ExtractRerankerSpan when the trace JSON contains no span named "reranker.Execute".

Functions ¶

func CopyDB ¶

func CopyDB(outDir, srcDB string) (string, error)

CopyDB copies srcDB into <outDir>/<defaultDBName>. SQLite files are stable snapshots when the source isn't actively being written to — we don't take extra locks. Caller is responsible for ensuring the source isn't being modified concurrently (typically true for data/prod/laplaced.db which is already a copy).

func ExtractServiceVersion ¶

func ExtractServiceVersion(traceJSON []byte) string

ExtractServiceVersion pulls service.version from the resource attributes of a Tempo trace JSON if present. Returns "" on any decode issue — this is best-effort metadata for the manifest.
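
The best-effort lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: the top-level `batches` key and the OTLP-style attribute encoding are assumptions about the trace JSON layout, `serviceVersion` is a hypothetical stand-in name, and the sample attribute values are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// traceDoc models only the fields needed here. The "batches" key and the
// attribute encoding are assumptions about the trace JSON layout.
type traceDoc struct {
	Batches []struct {
		Resource struct {
			Attributes []struct {
				Key   string `json:"key"`
				Value struct {
					StringValue string `json:"stringValue"`
				} `json:"value"`
			} `json:"attributes"`
		} `json:"resource"`
	} `json:"batches"`
}

// serviceVersion is a hypothetical stand-in for ExtractServiceVersion: it
// returns the first service.version resource attribute, or "" if decoding
// fails or the attribute is absent.
func serviceVersion(traceJSON []byte) string {
	var doc traceDoc
	if err := json.Unmarshal(traceJSON, &doc); err != nil {
		return ""
	}
	for _, b := range doc.Batches {
		for _, a := range b.Resource.Attributes {
			if a.Key == "service.version" {
				return a.Value.StringValue
			}
		}
	}
	return ""
}

func main() {
	// Sample data, invented for this sketch.
	body := []byte(`{"batches":[{"resource":{"attributes":[
		{"key":"service.name","value":{"stringValue":"example"}},
		{"key":"service.version","value":{"stringValue":"0.9.0"}}]}}]}`)
	fmt.Printf("%q\n", serviceVersion(body))
	fmt.Printf("%q\n", serviceVersion([]byte("not json")))
}
```

Returning "" instead of an error keeps manifest generation going when a trace lacks the attribute, which matches the best-effort contract stated above.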

func FormatReport ¶

func FormatReport(r *AnalysisReport) string

FormatReport renders an AnalysisReport as human-readable text. JSON form is produced separately (caller handles persistence).

func RenderLabelsTodo ¶

func RenderLabelsTodo(m *Manifest, spans []*RerankerSpanData) string

RenderLabelsTodo turns a manifest plus parsed reranker spans into the human-fillable markdown described in docs/plans/reranker-replay-eval.md.

All three candidate types — topics, people, artifacts — are rendered as fillable tables when their respective `reranker.*_candidates_input` events are present on the span. Spans captured before the events were instrumented will fall back to a "no candidates parsed" stub for the missing sections.

func WriteManifest ¶

func WriteManifest(path string, m *Manifest) error

WriteManifest serialises m as YAML to <path>, creating the file with restrictive permissions.

func WriteManifestFile ¶

func WriteManifestFile(outDir string, m *Manifest) error

WriteManifestFile is a convenience wrapper over WriteManifest pinning the canonical filename inside the snapshot directory.

func WriteTrace ¶

func WriteTrace(outDir, traceID string, body []byte) (string, error)

WriteTrace persists a single trace JSON to <outDir>/traces/<traceID>.json. Returns the path written, relative to outDir, for inclusion in the manifest.
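
The directory layout and the relative return value can be illustrated as follows. This is a sketch under stated assumptions: `writeTrace` and `demo` are hypothetical names, and the 0o755 and 0o600 permission bits are guesses rather than values taken from the package.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeTrace is a hypothetical stand-in for WriteTrace. It writes body to
// <outDir>/traces/<traceID>.json and returns the path relative to outDir.
// The permission bits are assumptions.
func writeTrace(outDir, traceID string, body []byte) (string, error) {
	rel := filepath.Join("traces", traceID+".json")
	abs := filepath.Join(outDir, rel)
	if err := os.MkdirAll(filepath.Dir(abs), 0o755); err != nil {
		return "", fmt.Errorf("create traces dir: %w", err)
	}
	if err := os.WriteFile(abs, body, 0o600); err != nil {
		return "", fmt.Errorf("write trace %s: %w", traceID, err)
	}
	return rel, nil
}

// demo writes one trace into a temporary directory, checks that it can be
// read back, and returns the relative path with forward slashes.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "snapshot-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	rel, err := writeTrace(dir, "abc123", []byte(`{"batches":[]}`))
	if err != nil {
		return "", err
	}
	if _, err := os.ReadFile(filepath.Join(dir, rel)); err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	rel, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rel)
}
```

Storing the relative path in the manifest keeps a snapshot directory relocatable: the manifest stays valid after the directory is moved or archived.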

Types ¶

type AnalysisReport ¶

type AnalysisReport struct {
	Traces      int                `json:"traces"`
	Fallbacks   FallbackBreakdown  `json:"fallbacks"`
	Topics      TypeStats          `json:"topics"`
	People      TypeStats          `json:"people"`
	Artifacts   TypeStats          `json:"artifacts"`
	CostUSD     Numeric            `json:"cost_usd"`
	LLMCalls    Numeric            `json:"llm_calls"`
	ToolCalls   Numeric            `json:"tool_calls"`
	DurationMs  Numeric            `json:"duration_ms"`
	HallucTopic Numeric            `json:"halluc_rate_topics"` // (raw-kept)/raw, per-trace
	Per         []PerTraceSnapshot `json:"per_trace,omitempty"`
}

AnalysisReport summarises the no-LLM-required statistics over a batch of reranker spans. Built deterministically from RerankerSpanData fields so it can be diffed across snapshots / prompt versions.

func AnalyzeSpans ¶

func AnalyzeSpans(spans []*RerankerSpanData) *AnalysisReport

AnalyzeSpans walks the spans once, collects per-trace floats, then computes summary stats. Stable across reruns.

type ArtifactCandidateLine ¶

type ArtifactCandidateLine struct {
	ID         int
	Similarity float64
	FileType   string
	FileName   string
	Keywords   string // empty if absent
	Entities   string // empty if absent
	Summary    string
}

ArtifactCandidateLine is one parsed row from a `reranker.artifacts_candidates_input` event. Mirrors formatArtifactCandidates:

[Artifact:N] (sim) type: "filename" [| keywords] [| Entities: …] | summary

func ParseArtifactCandidates ¶

func ParseArtifactCandidates(body string) []ArtifactCandidateLine

ParseArtifactCandidates splits an artifacts_candidates_input body into rows. Splits on ` | ` and identifies parts by position (head first, summary last) and prefix (`Entities: ` marks the entities cell).
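
The position-and-prefix rule above can be sketched like this. It is an illustrative re-implementation, not the package's code: `artifactRow`, `headRe`, and `parseArtifactRows` are hypothetical names, the exact spacing in the head cell is inferred from the format line, and skipping unrecognised rows is assumed by analogy with the other parsers. The sample rows are invented.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

type artifactRow struct {
	ID         int
	Similarity float64
	FileType   string
	FileName   string
	Keywords   string
	Entities   string
	Summary    string
}

// headRe matches `[Artifact:N] (sim) type: "filename"`. The spacing is an
// assumption based on the documented format line.
var headRe = regexp.MustCompile(`^\[Artifact:(\d+)\] \(([0-9.]+)\) ([^:]+): "(.*)"$`)

// parseArtifactRows splits each line on " | ", treats the first cell as the
// head and the last as the summary, and classifies middle cells by prefix.
func parseArtifactRows(body string) []artifactRow {
	var rows []artifactRow
	for _, line := range strings.Split(body, "\n") {
		parts := strings.Split(strings.TrimSpace(line), " | ")
		if len(parts) < 2 {
			continue // need at least a head and a summary
		}
		m := headRe.FindStringSubmatch(parts[0])
		if m == nil {
			continue // unrecognised head: skip the row
		}
		id, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		sim, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			continue
		}
		row := artifactRow{ID: id, Similarity: sim, FileType: m[3], FileName: m[4]}
		row.Summary = parts[len(parts)-1]
		for _, cell := range parts[1 : len(parts)-1] {
			if strings.HasPrefix(cell, "Entities: ") {
				row.Entities = strings.TrimPrefix(cell, "Entities: ")
			} else {
				row.Keywords = cell
			}
		}
		rows = append(rows, row)
	}
	return rows
}

func main() {
	body := `[Artifact:7] (0.82) pdf: "report.pdf" | budget, q3 | Entities: Alice, Bob | Quarterly budget report
[Artifact:8] (0.40) image: "cat.png" | A cat photo`
	for _, r := range parseArtifactRows(body) {
		fmt.Printf("%+v\n", r)
	}
}
```

One limitation of this sketch: a summary that itself contains ` | ` is split, and only its final segment is kept as the summary.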

type CandidateLine ¶

type CandidateLine struct {
	ID         int
	Similarity float64
	Date       string // free-form string from the producer (yyyy-mm-dd)
	MsgCount   int
	Size       string // "12K", "371", etc — kept verbatim
	Summary    string
}

CandidateLine is one parsed row from a `reranker.candidates_input` event. Mirrors the producer in internal/agent/reranker/candidates.go's formatCandidatesForReranker.

func ParseCandidates ¶

func ParseCandidates(body string) []CandidateLine

ParseCandidates splits a candidates_input body into one CandidateLine per recognised row. Unrecognised lines are silently skipped — a malformed candidate must never abort label generation for the whole snapshot.

type FallbackBreakdown ¶

type FallbackBreakdown struct {
	Total    int            `json:"total"`
	Success  int            `json:"success"`
	ByReason map[string]int `json:"by_reason"`
}

FallbackBreakdown counts traces by fallback_reason. Empty string is the success path, kept as "" key for fidelity with the trace data.
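
A minimal sketch of the counting rule, assuming one fallback_reason string per trace. `countFallbacks` is a hypothetical helper, and the reason strings in the example are invented for illustration.

```go
package main

import "fmt"

type fallbackBreakdown struct {
	Total    int
	Success  int
	ByReason map[string]int
}

// countFallbacks tallies one fallback_reason per trace. The empty string is
// counted as success and is also kept as its own "" key in ByReason.
func countFallbacks(reasons []string) fallbackBreakdown {
	fb := fallbackBreakdown{ByReason: make(map[string]int)}
	for _, r := range reasons {
		fb.Total++
		fb.ByReason[r]++
		if r == "" {
			fb.Success++
		}
	}
	return fb
}

func main() {
	// Hypothetical reasons; real values come from the trace data.
	fb := countFallbacks([]string{"", "timeout", "", "invalid_json", "timeout"})
	fmt.Println(fb.Total, fb.Success, fb.ByReason["timeout"], fb.ByReason[""])
}
```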

type Label ¶

type Label struct {
	Score  int    `json:"score"`
	Reason string `json:"reason,omitempty"`
}

Label is one filled-in cell. Reason is optional ("" allowed).

type LabelsFile ¶

type LabelsFile map[string]TraceLabels

LabelsFile is the JSON shape produced by judge-import.

{
  "<trace_id>": {
    "topics":    {"<topic_id>":    {"score": 9, "reason": "..."}, ...},
    "people":    {"<person_id>":   {"score": 7, "reason": "..."}, ...},
    "artifacts": {"<artifact_id>": {"score": 5, "reason": "..."}, ...}
  }
}
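
The shape above decodes directly with encoding/json. The sketch below redeclares the three types locally so it is self-contained; `decodeLabels` is a hypothetical helper and the sample IDs and reasons are invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the sketch compiles on its own.
type Label struct {
	Score  int    `json:"score"`
	Reason string `json:"reason,omitempty"`
}

type TraceLabels struct {
	Topics    map[string]Label `json:"topics"`
	People    map[string]Label `json:"people"`
	Artifacts map[string]Label `json:"artifacts"`
}

type LabelsFile map[string]TraceLabels

// decodeLabels parses a labels JSON document into a LabelsFile.
func decodeLabels(data []byte) (LabelsFile, error) {
	var lf LabelsFile
	if err := json.Unmarshal(data, &lf); err != nil {
		return nil, fmt.Errorf("decode labels: %w", err)
	}
	return lf, nil
}

func main() {
	data := []byte(`{
	  "trace-1": {
	    "topics":    {"42": {"score": 9, "reason": "on topic"}},
	    "people":    {},
	    "artifacts": {"7": {"score": 5}}
	  }
	}`)
	lf, err := decodeLabels(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(lf["trace-1"].Topics["42"].Score, lf["trace-1"].Artifacts["7"].Reason == "")
}
```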

func ParseLabelsMarkdown ¶

func ParseLabelsMarkdown(r io.Reader) (LabelsFile, error)

ParseLabelsMarkdown reads a filled labels-todo.md and returns a LabelsFile. Only rows with non-empty Score are emitted; "Leave Score blank for trivial 0/10 rows" is the convention from docs/plans/reranker-replay-eval.md.

Bad rows are skipped silently — labelling is human work and the parser must be resilient to typos. The caller can compare the row count vs. emitted label count to gauge completeness.

type Manifest ¶

type Manifest struct {
	Version     string               `yaml:"version"`
	Agent       string               `yaml:"agent"`     // free-form label, e.g. "reranker"
	SpanName    string               `yaml:"span_name"` // OTel span name driving the search
	TempoURL    string               `yaml:"tempo_url"`
	Filter      string               `yaml:"filter"` // TraceQL query used
	WindowStart time.Time            `yaml:"window_start"`
	WindowEnd   time.Time            `yaml:"window_end"`
	CapturedAt  time.Time            `yaml:"captured_at"`
	TraceCount  int                  `yaml:"trace_count"`
	DBPath      string               `yaml:"db_path"` // relative to snapshot dir
	Traces      []TraceManifestEntry `yaml:"traces"`
}

Manifest describes a single snapshot directory: which traces it contains, where they came from, and which DB snapshot ships alongside.

func ReadManifest ¶

func ReadManifest(path string) (*Manifest, error)

ReadManifest loads a manifest from disk.

type Numeric ¶

type Numeric struct {
	N      int     `json:"n"`
	Mean   float64 `json:"mean"`
	Median float64 `json:"median"`
	P95    float64 `json:"p95"`
	Max    float64 `json:"max"`
	Min    float64 `json:"min"`
	Sum    float64 `json:"sum"`
}

Numeric is a basic distribution summary.
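
One way such a summary could be computed from a slice of per-trace values is sketched below. The percentile method is not documented; this sketch assumes the nearest-rank definition for P95 and the mean of the two middle values for an even-length median. `numeric` and `summarize` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

type numeric struct {
	N                                 int
	Mean, Median, P95, Max, Min, Sum float64
}

// summarize computes a distribution summary over values. It sorts a copy,
// so the caller's slice is left untouched.
func summarize(values []float64) numeric {
	n := len(values)
	if n == 0 {
		return numeric{}
	}
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)

	var sum float64
	for _, v := range sorted {
		sum += v
	}

	median := sorted[n/2]
	if n%2 == 0 {
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}

	// Nearest-rank P95 (an assumption): the value at 1-based rank ceil(0.95*n).
	rank := int(math.Ceil(0.95 * float64(n)))

	return numeric{
		N:      n,
		Mean:   sum / float64(n),
		Median: median,
		P95:    sorted[rank-1],
		Max:    sorted[n-1],
		Min:    sorted[0],
		Sum:    sum,
	}
}

func main() {
	fmt.Printf("%+v\n", summarize([]float64{5, 1, 4, 2, 3}))
}
```

Sorting before summing also fixes the order of floating-point additions, which is one simple way to keep results identical across reruns regardless of input order.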

type PerTraceSnapshot ¶

type PerTraceSnapshot struct {
	TraceID    string  `json:"trace_id"`
	UserID     int64   `json:"user_id"`
	Fallback   string  `json:"fallback,omitempty"`
	TopicsIn   int     `json:"topics_in"`
	TopicsKept int     `json:"topics_kept"`
	PeopleIn   int     `json:"people_in"`
	PeopleKept int     `json:"people_kept"`
	ArtIn      int     `json:"artifacts_in"`
	ArtKept    int     `json:"artifacts_kept"`
	CostUSD    float64 `json:"cost_usd"`
	DurationMs int64   `json:"duration_ms"`
	ToolCalls  int     `json:"tool_calls"`
}

PerTraceSnapshot is a one-line digest used for `--per-trace` listing.

type PersonCandidateLine ¶

type PersonCandidateLine struct {
	ID     int
	Name   string
	Circle string
	Bio    string
}

PersonCandidateLine is one parsed row from a `reranker.people_candidates_input` event. Mirrors storage.FormatPeople: `[Person:N] Name [Circle][: Bio]`. The Name column is kept verbatim and may include `(@username)` and `(aka aliases)` segments.

func ParsePersonCandidates ¶

func ParsePersonCandidates(body string) []PersonCandidateLine

ParsePersonCandidates splits a people_candidates_input body into rows. Lines that don't match the producer format are silently skipped (same contract as ParseCandidates).

type RerankerSpanData ¶

type RerankerSpanData struct {
	TraceID    string
	UserID     int64
	DurationMs int64

	CandidatesIn struct {
		Topics, People, Artifacts int
	}
	ModelKept struct {
		Topics, People, Artifacts int
	}
	ModelRawCount struct {
		Topics, People, Artifacts int
	}

	FallbackReason string
	CostUSD        float64
	LLMCalls       int
	ToolCalls      int

	RawQuery                string
	EnrichedQuery           string
	CandidatesInput         string // raw multi-line body, see ParseCandidates
	PeopleCandidatesInput   string // raw body, see ParsePersonCandidates
	ArtifactCandidatesInput string // raw body, see ParseArtifactCandidates
	UserProfile             string // raw <user_profile> XML at trace time
	RecentTopics            string // raw <recent_topics> XML at trace time

	// ToolCallRequests is one slice per `reranker.tool_call` event, each
	// holding the requested topic IDs for that iteration.
	ToolCallRequests [][]int

	// Selected* are the IDs that ended up in the final selection (post-fallback
	// if any). The *Reasons maps hold the model-supplied reason per ID; entries
	// are empty when the trace ended via a fallback path.
	SelectedTopics    []int
	SelectedPeople    []int
	SelectedArtifacts []int
	SelectedReasons   map[int]string // topic-only, kept for back-compat
	SelectedReasonsP  map[int]string
	SelectedReasonsA  map[int]string
}

RerankerSpanData holds the parts of a reranker.Execute span needed to generate a labels-todo.md entry. Mirrors the OTel schema established by commit 2aa4245 on feat/otel-migration.

Counts: ModelRawCount is what the model returned in its final JSON, before filterValid. ModelKept is post-filterValid. Both are zero when the trace ended in a fallback path.

func ExtractRerankerSpan ¶

func ExtractRerankerSpan(traceID string, traceJSON []byte) (*RerankerSpanData, error)

ExtractRerankerSpan parses the Tempo trace JSON and pulls the first `reranker.Execute` span into a RerankerSpanData. Returns ErrNoRerankerSpan if the trace contains no such span.
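
The "first matching span, else sentinel error" behaviour can be sketched as follows. The JSON layout (`batches`, `scopeSpans`, `spans`, `spanId`) is an assumption about the trace format, and `firstRerankerSpanID` and `errNoRerankerSpan` are hypothetical local names; the real function extracts far more than a span ID.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errNoRerankerSpan mirrors the documented sentinel for this sketch.
var errNoRerankerSpan = errors.New("no reranker.Execute span in trace")

// spanDoc models only span names and IDs; the layout is an assumption.
type spanDoc struct {
	Batches []struct {
		ScopeSpans []struct {
			Spans []struct {
				Name   string `json:"name"`
				SpanID string `json:"spanId"`
			} `json:"spans"`
		} `json:"scopeSpans"`
	} `json:"batches"`
}

// firstRerankerSpanID returns the ID of the first span named
// "reranker.Execute" in document order, or errNoRerankerSpan if none exists.
func firstRerankerSpanID(traceJSON []byte) (string, error) {
	var doc spanDoc
	if err := json.Unmarshal(traceJSON, &doc); err != nil {
		return "", fmt.Errorf("decode trace: %w", err)
	}
	for _, b := range doc.Batches {
		for _, ss := range b.ScopeSpans {
			for _, s := range ss.Spans {
				if s.Name == "reranker.Execute" {
					return s.SpanID, nil
				}
			}
		}
	}
	return "", errNoRerankerSpan
}

func main() {
	_, err := firstRerankerSpanID([]byte(`{"batches":[{"scopeSpans":[{"spans":[{"name":"other","spanId":"01"}]}]}]}`))
	if errors.Is(err, errNoRerankerSpan) {
		fmt.Println("skipping trace:", err)
	}
}
```

A sentinel error lets a batch caller tell "this trace has no reranker span, skip it" apart from a genuine decode failure.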

type SelectionResult ¶

type SelectionResult struct {
	Topics    []int
	People    []int
	Artifacts []int

	TopicReasons    map[int]string
	PersonReasons   map[int]string
	ArtifactReasons map[int]string
}

SelectionResult holds the parsed selection_reasons body broken out by type. The producer marshals one flat array `[{id:"<Type>:<N>", reason:...}, ...]` where Type is Topic / Person / Artifact.
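
Splitting the flat array by type might look like the sketch below. It assumes the JSON keys are literally `id` and `reason`, as the description suggests, and that entries with an unknown type or a non-numeric ID are skipped; both points are assumptions. `selection` and `parseSelection` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

type selection struct {
	Topics, People, Artifacts                    []int
	TopicReasons, PersonReasons, ArtifactReasons map[int]string
}

// parseSelection decodes [{"id":"<Type>:<N>","reason":"..."}] and groups the
// entries by Type. Malformed entries are skipped (an assumption).
func parseSelection(body string) (selection, error) {
	var items []struct {
		ID     string `json:"id"`
		Reason string `json:"reason"`
	}
	if err := json.Unmarshal([]byte(body), &items); err != nil {
		return selection{}, fmt.Errorf("decode selection_reasons: %w", err)
	}
	sel := selection{
		TopicReasons:    map[int]string{},
		PersonReasons:   map[int]string{},
		ArtifactReasons: map[int]string{},
	}
	for _, it := range items {
		typ, num, ok := strings.Cut(it.ID, ":")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(num)
		if err != nil {
			continue
		}
		switch typ {
		case "Topic":
			sel.Topics = append(sel.Topics, n)
			sel.TopicReasons[n] = it.Reason
		case "Person":
			sel.People = append(sel.People, n)
			sel.PersonReasons[n] = it.Reason
		case "Artifact":
			sel.Artifacts = append(sel.Artifacts, n)
			sel.ArtifactReasons[n] = it.Reason
		}
	}
	return sel, nil
}

func main() {
	sel, err := parseSelection(`[{"id":"Topic:12","reason":"recent"},{"id":"Person:3","reason":"mentioned"}]`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sel.Topics, sel.People, sel.TopicReasons[12])
}
```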

type TempoClient ¶

type TempoClient struct {
	BaseURL string
	HTTP    *http.Client
}

TempoClient is a thin HTTP client over Tempo's search and trace APIs. No auth — this is dev tooling against a LAN-only Tempo (Traefik whitelist).

func NewTempoClient ¶

func NewTempoClient(baseURL string) *TempoClient

NewTempoClient constructs a client with sensible defaults.

func (*TempoClient) FetchTrace ¶

func (c *TempoClient) FetchTrace(ctx context.Context, traceID string) ([]byte, error)

FetchTrace returns the full trace JSON as raw bytes. We persist it as-is so consumers can use any JSON tooling without re-marshalling drift.

func (*TempoClient) Search ¶

func (c *TempoClient) Search(ctx context.Context, query string, start, end int64, limit int) ([]TraceMeta, error)

Search runs a TraceQL query over [start, end] and returns up to limit traces. start and end are unix seconds. Tempo enforces a 168h max window.
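
As an illustration of the request this method issues, the sketch below builds a /api/search URL from the same arguments and rejects windows longer than 168h before any request is sent. Whether the real client performs such a pre-check is not stated, so treat it as an assumption; `searchURL`, `errWindowTooLarge`, and the host name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
	"strings"
	"time"
)

// maxWindowSeconds is the documented 168h Tempo limit expressed in seconds.
const maxWindowSeconds = int64(168 * time.Hour / time.Second)

var errWindowTooLarge = errors.New("search window exceeds 168h")

// searchURL builds a Tempo search URL. start and end are unix seconds.
// The client-side window check is an assumption made for this sketch.
func searchURL(baseURL, query string, start, end int64, limit int) (string, error) {
	if end < start {
		return "", errors.New("end is before start")
	}
	if end-start > maxWindowSeconds {
		return "", errWindowTooLarge
	}
	q := url.Values{}
	q.Set("q", query)
	q.Set("start", strconv.FormatInt(start, 10))
	q.Set("end", strconv.FormatInt(end, 10))
	q.Set("limit", strconv.Itoa(limit))
	return strings.TrimRight(baseURL, "/") + "/api/search?" + q.Encode(), nil
}

func main() {
	now := time.Now().Unix()
	u, err := searchURL("http://tempo.example:3200", `{ name = "reranker.Execute" }`, now-3600, now, 50)
	fmt.Println(u, err)
}
```

Callers that need more than 168h of history would have to split the range into consecutive windows and merge the results themselves.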

type TraceLabels ¶

type TraceLabels struct {
	Topics    map[string]Label `json:"topics"`
	People    map[string]Label `json:"people"`
	Artifacts map[string]Label `json:"artifacts"`
}

TraceLabels groups labels by candidate type. Entries map ID (as string, to keep JSON keys stable) to a Label.

type TraceManifestEntry ¶

type TraceManifestEntry struct {
	TraceID        string `yaml:"trace_id"`
	File           string `yaml:"file"`
	DurationMs     int64  `yaml:"duration_ms"`
	StartTimeUnix  int64  `yaml:"start_time_unix"`
	ServiceVersion string `yaml:"service_version,omitempty"`
}

TraceManifestEntry is one row in the manifest's trace listing. service_version is captured opportunistically from resource attrs so a human reading the manifest can spot mixed-deploy snapshots at a glance.

type TraceMeta ¶

type TraceMeta struct {
	TraceID           string `json:"traceID"`
	RootServiceName   string `json:"rootServiceName"`
	RootTraceName     string `json:"rootTraceName"`
	StartTimeUnixNano string `json:"startTimeUnixNano"`
	DurationMs        int64  `json:"durationMs"`
}

TraceMeta is one entry from /api/search.

func (TraceMeta) StartTimeUnix ¶

func (m TraceMeta) StartTimeUnix() int64

StartTimeUnix converts the nano-precision string to unix seconds. Tempo returns startTimeUnixNano as a string because nanosecond timestamps exceed the integer range that float64-based JSON parsers can represent exactly; we only need second precision for the manifest.
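
The conversion itself is a parse followed by an integer division. In this sketch `startTimeUnix` is a hypothetical free-function version of the method, and returning 0 for unparseable input is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// startTimeUnix converts a decimal nanosecond timestamp string to unix
// seconds, truncating the sub-second part. Unparseable input yields 0
// (an assumption for this sketch).
func startTimeUnix(nanos string) int64 {
	n, err := strconv.ParseInt(nanos, 10, 64)
	if err != nil {
		return 0
	}
	return n / int64(time.Second)
}

func main() {
	secs := startTimeUnix("1747480000123456789")
	fmt.Println(secs, time.Unix(secs, 0).UTC().Format(time.RFC3339))
}
```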

type TypeStats ¶

type TypeStats struct {
	CandidatesIn  Numeric `json:"candidates_in"`
	ModelRawCount Numeric `json:"model_raw_count"`
	ModelKept     Numeric `json:"model_kept"`
	SelectionRate Numeric `json:"selection_rate"` // model_kept / candidates_in (per-trace)
}

TypeStats aggregates the candidate-funnel for one of (topics, people, artifacts).
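
The two per-trace ratios named in the field comments, selection rate (model_kept / candidates_in) and the hallucination rate on AnalysisReport ((raw-kept)/raw), can be written out as below. How the package treats a zero denominator is not documented; this sketch reports it through a boolean so the caller can decide, which is an assumption. Both function names are hypothetical.

```go
package main

import "fmt"

// selectionRate returns kept/in for one trace. ok is false when in is zero,
// so the caller can leave that trace out of the distribution.
func selectionRate(kept, in int) (rate float64, ok bool) {
	if in == 0 {
		return 0, false
	}
	return float64(kept) / float64(in), true
}

// hallucRate returns (raw-kept)/raw for one trace: the share of IDs the
// model returned that did not survive validation. ok is false when raw is zero.
func hallucRate(raw, kept int) (rate float64, ok bool) {
	if raw == 0 {
		return 0, false
	}
	return float64(raw-kept) / float64(raw), true
}

func main() {
	// Example funnel: 12 candidates in, 4 IDs returned by the model, 3 kept.
	if r, ok := selectionRate(3, 12); ok {
		fmt.Println("selection rate:", r)
	}
	if r, ok := hallucRate(4, 3); ok {
		fmt.Println("hallucination rate:", r)
	}
}
```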

Directories ¶

Path	Synopsis
replay	Package replay re-runs captured reranker (and, in future, other agent) invocations through the production agent code, using the snapshotted database as the back-end.