snapshot

package
v0.9.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 17, 2026 License: MIT Imports: 17 Imported by: 0

Documentation

Overview

Package snapshot captures Tempo traces and a database snapshot to disk for later offline replay and evaluation. The package is agent-agnostic: callers pick the TraceQL filter, the package mechanically fetches traces and writes them out. Per-agent extraction (which event holds the prompt, which attribute holds the cost) is left to consumers.


Constants

const ManifestVersion = "1"

ManifestVersion is the on-disk schema version for snapshot manifests. Bump when changing the YAML structure in a non-additive way.

Variables

var ErrNoRerankerSpan = errors.New("no reranker.Execute span in trace")

ErrNoRerankerSpan is returned by ExtractRerankerSpan when the trace JSON contains no span named "reranker.Execute".

Functions

func CopyDB

func CopyDB(outDir, srcDB string) (string, error)

CopyDB copies srcDB into <outDir>/<defaultDBName>. A SQLite file is a stable snapshot only while nothing is writing to the source; CopyDB takes no extra locks, so the caller is responsible for ensuring the source is not modified concurrently (typically true for data/prod/laplaced.db, which is already a copy).

func ExtractServiceVersion

func ExtractServiceVersion(traceJSON []byte) string

ExtractServiceVersion pulls service.version from the resource attributes of a Tempo trace JSON if present. Returns "" on any decode issue — this is best-effort metadata for the manifest.
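The best-effort contract can be illustrated with a small decoder that returns "" on any failure. This is a sketch, not the package's implementation: the top-level key ("batches" or "resourceSpans") and the OTLP-style attribute layout are assumptions about the Tempo response, and serviceVersion is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// traceDoc mirrors only the fields needed here. Both top-level keys are
// accepted because the exact Tempo response shape is an assumption.
type traceDoc struct {
	Batches       []batch `json:"batches"`
	ResourceSpans []batch `json:"resourceSpans"`
}

type batch struct {
	Resource struct {
		Attributes []struct {
			Key   string `json:"key"`
			Value struct {
				StringValue string `json:"stringValue"`
			} `json:"value"`
		} `json:"attributes"`
	} `json:"resource"`
}

// serviceVersion returns the first service.version resource attribute, or ""
// when the JSON cannot be decoded or the attribute is absent.
func serviceVersion(traceJSON []byte) string {
	var doc traceDoc
	if err := json.Unmarshal(traceJSON, &doc); err != nil {
		return ""
	}
	for _, b := range append(doc.Batches, doc.ResourceSpans...) {
		for _, a := range b.Resource.Attributes {
			if a.Key == "service.version" {
				return a.Value.StringValue
			}
		}
	}
	return ""
}

func main() {
	body := []byte(`{"batches":[{"resource":{"attributes":[
		{"key":"service.name","value":{"stringValue":"bot"}},
		{"key":"service.version","value":{"stringValue":"1.4.2"}}]}}]}`)
	fmt.Printf("%q %q\n", serviceVersion(body), serviceVersion([]byte("not json")))
}
```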

func FormatReport

func FormatReport(r *AnalysisReport) string

FormatReport renders an AnalysisReport as human-readable text. JSON form is produced separately (caller handles persistence).

func RenderLabelsTodo

func RenderLabelsTodo(m *Manifest, spans []*RerankerSpanData) string

RenderLabelsTodo turns a manifest plus parsed reranker spans into the human-fillable markdown described in docs/plans/reranker-replay-eval.md.

All three candidate types (topics, people, artifacts) are rendered as fillable tables when their respective `reranker.*_candidates_input` events are present on the span. Spans captured before the events were instrumented fall back to a "no candidates parsed" stub for the missing sections.

func WriteManifest

func WriteManifest(path string, m *Manifest) error

WriteManifest serialises m to path with restrictive permissions.

func WriteManifestFile

func WriteManifestFile(outDir string, m *Manifest) error

WriteManifestFile is a convenience wrapper over WriteManifest pinning the canonical filename inside the snapshot directory.

func WriteTrace

func WriteTrace(outDir, traceID string, body []byte) (string, error)

WriteTrace persists a single trace JSON to <outDir>/traces/<traceID>.json. Returns the path written, relative to outDir, for inclusion in the manifest.

Types

type AnalysisReport

type AnalysisReport struct {
	Traces      int                `json:"traces"`
	Fallbacks   FallbackBreakdown  `json:"fallbacks"`
	Topics      TypeStats          `json:"topics"`
	People      TypeStats          `json:"people"`
	Artifacts   TypeStats          `json:"artifacts"`
	CostUSD     Numeric            `json:"cost_usd"`
	LLMCalls    Numeric            `json:"llm_calls"`
	ToolCalls   Numeric            `json:"tool_calls"`
	DurationMs  Numeric            `json:"duration_ms"`
	HallucTopic Numeric            `json:"halluc_rate_topics"` // (raw-kept)/raw, per-trace
	Per         []PerTraceSnapshot `json:"per_trace,omitempty"`
}

AnalysisReport summarises the no-LLM-required statistics over a batch of reranker spans. Built deterministically from RerankerSpanData fields so it can be diffed across snapshots / prompt versions.

func AnalyzeSpans

func AnalyzeSpans(spans []*RerankerSpanData) *AnalysisReport

AnalyzeSpans walks the spans once, collects per-trace floats, then computes summary stats. Stable across reruns.

type ArtifactCandidateLine

type ArtifactCandidateLine struct {
	ID         int
	Similarity float64
	FileType   string
	FileName   string
	Keywords   string // empty if absent
	Entities   string // empty if absent
	Summary    string
}

ArtifactCandidateLine is one parsed row from a `reranker.artifacts_candidates_input` event. Mirrors formatArtifactCandidates:

[Artifact:N] (sim) type: "filename" [| keywords] [| Entities: …] | summary

func ParseArtifactCandidates

func ParseArtifactCandidates(body string) []ArtifactCandidateLine

ParseArtifactCandidates splits an artifacts_candidates_input body into rows. Splits on ` | ` and identifies parts by position (head first, summary last) and prefix (`Entities: ` marks the entities cell).
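The position-and-prefix rule can be sketched for a single row as follows. The exact shape of the head cell (a decimal similarity in parentheses, a quoted filename) is an assumption inferred from the format line above, and parseArtifactLine is a hypothetical helper rather than the package's parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// artifactLine is a local mirror of ArtifactCandidateLine.
type artifactLine struct {
	ID         int
	Similarity float64
	FileType   string
	FileName   string
	Keywords   string
	Entities   string
	Summary    string
}

// headRe matches the assumed head cell, e.g. `[Artifact:7] (0.82) pdf: "q3.pdf"`.
var headRe = regexp.MustCompile(`^\[Artifact:(\d+)\] \(([0-9.]+)\) ([^:]+): "(.*)"$`)

// parseArtifactLine splits one row on " | ": the first cell is the head, the
// last is the summary, and any middle cell is the entities cell if it starts
// with "Entities: " and the keywords cell otherwise.
func parseArtifactLine(line string) (artifactLine, bool) {
	parts := strings.Split(strings.TrimSpace(line), " | ")
	if len(parts) < 2 {
		return artifactLine{}, false // need at least head and summary
	}
	m := headRe.FindStringSubmatch(parts[0])
	if m == nil {
		return artifactLine{}, false
	}
	id, err := strconv.Atoi(m[1])
	if err != nil {
		return artifactLine{}, false
	}
	sim, err := strconv.ParseFloat(m[2], 64)
	if err != nil {
		return artifactLine{}, false
	}
	out := artifactLine{
		ID: id, Similarity: sim, FileType: m[3], FileName: m[4],
		Summary: parts[len(parts)-1],
	}
	for _, cell := range parts[1 : len(parts)-1] {
		if strings.HasPrefix(cell, "Entities: ") {
			out.Entities = strings.TrimPrefix(cell, "Entities: ")
		} else {
			out.Keywords = cell
		}
	}
	return out, true
}

func main() {
	row := `[Artifact:7] (0.82) pdf: "q3.pdf" | budget, forecast | Entities: ACME | Quarterly report`
	fmt.Println(parseArtifactLine(row))
}
```

One limitation of this sketch: a summary or filename that itself contains ` | ` would be split into extra cells.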

type CandidateLine

type CandidateLine struct {
	ID         int
	Similarity float64
	Date       string // free-form string from the producer (yyyy-mm-dd)
	MsgCount   int
	Size       string // "12K", "371", etc — kept verbatim
	Summary    string
}

CandidateLine is one parsed row from a `reranker.candidates_input` event. Mirrors the producer in internal/agent/reranker/candidates.go's formatCandidatesForReranker.

func ParseCandidates

func ParseCandidates(body string) []CandidateLine

ParseCandidates splits a candidates_input body into one CandidateLine per recognised row. Unrecognised lines are silently skipped — a malformed candidate must never abort label generation for the whole snapshot.

type FallbackBreakdown

type FallbackBreakdown struct {
	Total    int            `json:"total"`
	Success  int            `json:"success"`
	ByReason map[string]int `json:"by_reason"`
}

FallbackBreakdown counts traces by fallback_reason. The empty string marks the success path and is kept under the "" key for fidelity with the trace data.
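A minimal sketch of how such a breakdown could be tallied from per-trace fallback reasons. The reason strings in the example ("timeout", "invalid_json") are hypothetical, and countFallbacks is not part of the package.

```go
package main

import "fmt"

// fallbackBreakdown is a local mirror of FallbackBreakdown.
type fallbackBreakdown struct {
	Total    int
	Success  int
	ByReason map[string]int
}

// countFallbacks tallies one fallback reason per trace. The empty reason is
// counted as a success and is also kept under the "" key of ByReason.
func countFallbacks(reasons []string) fallbackBreakdown {
	fb := fallbackBreakdown{ByReason: map[string]int{}}
	for _, r := range reasons {
		fb.Total++
		fb.ByReason[r]++
		if r == "" {
			fb.Success++
		}
	}
	return fb
}

func main() {
	fb := countFallbacks([]string{"", "timeout", "", "invalid_json", "timeout"})
	fmt.Printf("%+v\n", fb)
}
```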

type Label

type Label struct {
	Score  int    `json:"score"`
	Reason string `json:"reason,omitempty"`
}

Label is one filled-in cell. Reason is optional ("" allowed).

type LabelsFile

type LabelsFile map[string]TraceLabels

LabelsFile is the JSON shape produced by judge-import.

{
  "<trace_id>": {
    "topics":    {"<topic_id>":    {"score": 9, "reason": "..."}, ...},
    "people":    {"<person_id>":   {"score": 7, "reason": "..."}, ...},
    "artifacts": {"<artifact_id>": {"score": 5, "reason": "..."}, ...}
  }
}
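The JSON shape above decodes directly into nested maps. The sketch below uses local mirrors of Label, TraceLabels and LabelsFile and a hypothetical countLabels helper to count filled-in cells; the sample IDs and scores are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the documented types.
type label struct {
	Score  int    `json:"score"`
	Reason string `json:"reason,omitempty"`
}

type traceLabels struct {
	Topics    map[string]label `json:"topics"`
	People    map[string]label `json:"people"`
	Artifacts map[string]label `json:"artifacts"`
}

type labelsFile map[string]traceLabels

// countLabels decodes a labels JSON document and returns the total number of
// labelled candidates across all traces and candidate types.
func countLabels(data []byte) (int, error) {
	var lf labelsFile
	if err := json.Unmarshal(data, &lf); err != nil {
		return 0, err
	}
	total := 0
	for _, tl := range lf {
		total += len(tl.Topics) + len(tl.People) + len(tl.Artifacts)
	}
	return total, nil
}

func main() {
	data := []byte(`{
	  "t1": {"topics": {"12": {"score": 9, "reason": "on point"}},
	         "people": {"3": {"score": 7}}, "artifacts": {}},
	  "t2": {"topics": {"4": {"score": 2}}, "people": {}, "artifacts": {}}
	}`)
	fmt.Println(countLabels(data))
}
```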

func ParseLabelsMarkdown

func ParseLabelsMarkdown(r io.Reader) (LabelsFile, error)

ParseLabelsMarkdown reads a filled labels-todo.md and returns a LabelsFile. Only rows with non-empty Score are emitted; "Leave Score blank for trivial 0/10 rows" is the convention from docs/plans/reranker-replay-eval.md.

Bad rows are skipped silently — labelling is human work and the parser must be resilient to typos. The caller can compare the row count vs. emitted label count to gauge completeness.

type Manifest

type Manifest struct {
	Version     string               `yaml:"version"`
	Agent       string               `yaml:"agent"`     // free-form label, e.g. "reranker"
	SpanName    string               `yaml:"span_name"` // OTel span name driving the search
	TempoURL    string               `yaml:"tempo_url"`
	Filter      string               `yaml:"filter"` // TraceQL query used
	WindowStart time.Time            `yaml:"window_start"`
	WindowEnd   time.Time            `yaml:"window_end"`
	CapturedAt  time.Time            `yaml:"captured_at"`
	TraceCount  int                  `yaml:"trace_count"`
	DBPath      string               `yaml:"db_path"` // relative to snapshot dir
	Traces      []TraceManifestEntry `yaml:"traces"`
}

Manifest describes a single snapshot directory: which traces it contains, where they came from, and which DB snapshot ships alongside.

func ReadManifest

func ReadManifest(path string) (*Manifest, error)

ReadManifest loads a manifest from disk.

type Numeric

type Numeric struct {
	N      int     `json:"n"`
	Mean   float64 `json:"mean"`
	Median float64 `json:"median"`
	P95    float64 `json:"p95"`
	Max    float64 `json:"max"`
	Min    float64 `json:"min"`
	Sum    float64 `json:"sum"`
}

Numeric is a basic distribution summary.
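One way such a summary could be computed is sketched below. The percentile method (nearest-rank) and the handling of an empty input are assumptions; the package may compute P95 differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// numeric is a local mirror of Numeric.
type numeric struct {
	N                                 int
	Mean, Median, P95, Max, Min, Sum float64
}

// summarize computes the summary over vals without mutating the input.
// P95 uses the nearest-rank method (an assumption); an empty input yields
// the zero value.
func summarize(vals []float64) numeric {
	if len(vals) == 0 {
		return numeric{}
	}
	s := append([]float64(nil), vals...)
	sort.Float64s(s)

	out := numeric{N: len(s), Min: s[0], Max: s[len(s)-1]}
	for _, v := range s {
		out.Sum += v
	}
	out.Mean = out.Sum / float64(len(s))

	mid := len(s) / 2
	if len(s)%2 == 1 {
		out.Median = s[mid]
	} else {
		out.Median = (s[mid-1] + s[mid]) / 2
	}

	// Nearest rank: the smallest value with at least 95% of samples at or below it.
	rank := int(math.Ceil(0.95 * float64(len(s))))
	out.P95 = s[rank-1]
	return out
}

func main() {
	fmt.Printf("%+v\n", summarize([]float64{4, 1, 3, 2}))
}
```

Sorting a copy first keeps the result independent of input order, which fits the stated goal of reports that are stable across reruns.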

type PerTraceSnapshot

type PerTraceSnapshot struct {
	TraceID    string  `json:"trace_id"`
	UserID     int64   `json:"user_id"`
	Fallback   string  `json:"fallback,omitempty"`
	TopicsIn   int     `json:"topics_in"`
	TopicsKept int     `json:"topics_kept"`
	PeopleIn   int     `json:"people_in"`
	PeopleKept int     `json:"people_kept"`
	ArtIn      int     `json:"artifacts_in"`
	ArtKept    int     `json:"artifacts_kept"`
	CostUSD    float64 `json:"cost_usd"`
	DurationMs int64   `json:"duration_ms"`
	ToolCalls  int     `json:"tool_calls"`
}

PerTraceSnapshot is a one-line digest used for `--per-trace` listing.

type PersonCandidateLine

type PersonCandidateLine struct {
	ID     int
	Name   string
	Circle string
	Bio    string
}

PersonCandidateLine is one parsed row from a `reranker.people_candidates_input` event. Mirrors storage.FormatPeople: `[Person:N] Name [Circle][: Bio]`. The Name column is kept verbatim and may include `(@username)` and `(aka aliases)` segments.

func ParsePersonCandidates

func ParsePersonCandidates(body string) []PersonCandidateLine

ParsePersonCandidates splits a people_candidates_input body into rows. Lines that don't match the producer format are silently skipped (same contract as ParseCandidates).

type RerankerSpanData

type RerankerSpanData struct {
	TraceID    string
	UserID     int64
	DurationMs int64

	CandidatesIn struct {
		Topics, People, Artifacts int
	}
	ModelKept struct {
		Topics, People, Artifacts int
	}
	ModelRawCount struct {
		Topics, People, Artifacts int
	}

	FallbackReason string
	CostUSD        float64
	LLMCalls       int
	ToolCalls      int

	RawQuery                string
	EnrichedQuery           string
	CandidatesInput         string // raw multi-line body, see ParseCandidates
	PeopleCandidatesInput   string // raw body, see ParsePersonCandidates
	ArtifactCandidatesInput string // raw body, see ParseArtifactCandidates
	UserProfile             string // raw <user_profile> XML at trace time
	RecentTopics            string // raw <recent_topics> XML at trace time

	// ToolCallRequests is one slice per `reranker.tool_call` event, each
	// holding the requested topic IDs for that iteration.
	ToolCallRequests [][]int

	// Selected* are the IDs that ended up in the final selection (post-fallback
	// if any). The *Reasons maps hold the model-supplied reason per ID; entries
	// are empty when the trace ended via a fallback path.
	SelectedTopics    []int
	SelectedPeople    []int
	SelectedArtifacts []int
	SelectedReasons   map[int]string // topic-only, kept for back-compat
	SelectedReasonsP  map[int]string
	SelectedReasonsA  map[int]string
}

RerankerSpanData holds the parts of a reranker.Execute span needed to generate a labels-todo.md entry. Mirrors the OTel schema established by commit 2aa4245 on feat/otel-migration.

Counts: ModelRawCount is what the model returned in its final JSON, before filterValid. ModelKept is post-filterValid. Both are zero when the trace ended in a fallback path.

func ExtractRerankerSpan

func ExtractRerankerSpan(traceID string, traceJSON []byte) (*RerankerSpanData, error)

ExtractRerankerSpan parses the Tempo trace JSON and pulls the first `reranker.Execute` span into a RerankerSpanData. Returns ErrNoRerankerSpan if the trace contains no such span.

type SelectionResult

type SelectionResult struct {
	Topics    []int
	People    []int
	Artifacts []int

	TopicReasons    map[int]string
	PersonReasons   map[int]string
	ArtifactReasons map[int]string
}

SelectionResult holds the parsed selection_reasons body broken out by type. The producer marshals one flat array `[{id:"<Type>:<N>", reason:...}, ...]` where Type is Topic / Person / Artifact.
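Splitting the flat array by type prefix could look like the sketch below. The JSON key names ("id", "reason") follow the shape quoted above; skipping entries with a malformed or unknown ID is an assumption, and parseSelection is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// selection is a local mirror of SelectionResult.
type selection struct {
	Topics, People, Artifacts                    []int
	TopicReasons, PersonReasons, ArtifactReasons map[int]string
}

// parseSelection decodes the flat selection_reasons array and routes each
// entry by the "<Type>:" prefix of its id. Malformed ids are skipped.
func parseSelection(body []byte) (selection, error) {
	var rows []struct {
		ID     string `json:"id"`
		Reason string `json:"reason"`
	}
	if err := json.Unmarshal(body, &rows); err != nil {
		return selection{}, err
	}
	out := selection{
		TopicReasons:    map[int]string{},
		PersonReasons:   map[int]string{},
		ArtifactReasons: map[int]string{},
	}
	for _, r := range rows {
		typ, num, ok := strings.Cut(r.ID, ":")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(num)
		if err != nil {
			continue
		}
		switch typ {
		case "Topic":
			out.Topics = append(out.Topics, n)
			out.TopicReasons[n] = r.Reason
		case "Person":
			out.People = append(out.People, n)
			out.PersonReasons[n] = r.Reason
		case "Artifact":
			out.Artifacts = append(out.Artifacts, n)
			out.ArtifactReasons[n] = r.Reason
		}
	}
	return out, nil
}

func main() {
	body := []byte(`[{"id":"Topic:3","reason":"recent"},{"id":"Person:9","reason":"mentioned"}]`)
	fmt.Println(parseSelection(body))
}
```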

type TempoClient

type TempoClient struct {
	BaseURL string
	HTTP    *http.Client
}

TempoClient is a thin HTTP client over Tempo's search and trace APIs. No auth — this is dev tooling against a LAN-only Tempo (Traefik whitelist).

func NewTempoClient

func NewTempoClient(baseURL string) *TempoClient

NewTempoClient constructs a client with sensible defaults.

func (*TempoClient) FetchTrace

func (c *TempoClient) FetchTrace(ctx context.Context, traceID string) ([]byte, error)

FetchTrace returns the full trace JSON as raw bytes. We persist it as-is so consumers can use any JSON tooling without re-marshalling drift.

func (*TempoClient) Search

func (c *TempoClient) Search(ctx context.Context, query string, start, end int64, limit int) ([]TraceMeta, error)

Search runs a TraceQL query over [start, end] and returns up to limit traces. start and end are unix seconds. Tempo enforces a 168h max window.
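Because of the 168h limit, a caller that needs a longer range has to issue several searches. The sketch below shows one way to split a range in unix seconds into windows no longer than 168h; splitWindows and window are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// maxWindowSec is the 168h search limit expressed in seconds.
const maxWindowSec = int64(168 * time.Hour / time.Second)

// window is a search range in unix seconds.
type window struct{ Start, End int64 }

// splitWindows cuts [start, end] into consecutive windows of at most 168h.
// It returns nil when start >= end.
func splitWindows(start, end int64) []window {
	var out []window
	for s := start; s < end; s += maxWindowSec {
		e := s + maxWindowSec
		if e > end {
			e = end
		}
		out = append(out, window{Start: s, End: e})
	}
	return out
}

func main() {
	end := time.Date(2026, 5, 17, 0, 0, 0, 0, time.UTC)
	start := end.Add(-15 * 24 * time.Hour)
	for _, w := range splitWindows(start.Unix(), end.Unix()) {
		fmt.Println(w.Start, w.End)
	}
}
```

Adjacent windows share a boundary second, so a caller combining results would probably want to deduplicate by trace ID.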

type TraceLabels

type TraceLabels struct {
	Topics    map[string]Label `json:"topics"`
	People    map[string]Label `json:"people"`
	Artifacts map[string]Label `json:"artifacts"`
}

TraceLabels groups labels by candidate type. Entries map ID (as string, to keep JSON keys stable) to a Label.

type TraceManifestEntry

type TraceManifestEntry struct {
	TraceID        string `yaml:"trace_id"`
	File           string `yaml:"file"`
	DurationMs     int64  `yaml:"duration_ms"`
	StartTimeUnix  int64  `yaml:"start_time_unix"`
	ServiceVersion string `yaml:"service_version,omitempty"`
}

TraceManifestEntry is one row in the manifest's trace listing. service_version is captured opportunistically from resource attrs so a human reading the manifest can spot mixed-deploy snapshots at a glance.

type TraceMeta

type TraceMeta struct {
	TraceID           string `json:"traceID"`
	RootServiceName   string `json:"rootServiceName"`
	RootTraceName     string `json:"rootTraceName"`
	StartTimeUnixNano string `json:"startTimeUnixNano"`
	DurationMs        int64  `json:"durationMs"`
}

TraceMeta is one entry from /api/search.

func (TraceMeta) StartTimeUnix

func (m TraceMeta) StartTimeUnix() int64

StartTimeUnix converts the nano-precision string to unix seconds. Tempo returns startTimeUnixNano as a string because nanosecond timestamps exceed the range that some JSON parsers can represent exactly as numbers; we only need second precision for the manifest.
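The conversion amounts to parsing the decimal string and dropping the sub-second digits. A minimal sketch, assuming that an unparsable value maps to 0 (the package's actual error behaviour is not documented here):

```go
package main

import (
	"fmt"
	"strconv"
)

// startTimeUnix parses a decimal nanosecond timestamp and truncates it to
// unix seconds. Returning 0 on a parse failure is an assumption.
func startTimeUnix(nano string) int64 {
	n, err := strconv.ParseInt(nano, 10, 64)
	if err != nil {
		return 0
	}
	return n / 1_000_000_000
}

func main() {
	fmt.Println(startTimeUnix("1715945000123456789"))
}
```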

type TypeStats

type TypeStats struct {
	CandidatesIn  Numeric `json:"candidates_in"`
	ModelRawCount Numeric `json:"model_raw_count"`
	ModelKept     Numeric `json:"model_kept"`
	SelectionRate Numeric `json:"selection_rate"` // model_kept / candidates_in (per-trace)
}

TypeStats aggregates the candidate-funnel for one of (topics, people, artifacts).
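The per-trace selection rate is a simple ratio that then feeds a distribution summary. The sketch below computes the ratios only; skipping traces with zero candidates to avoid dividing by zero is an assumption about how the package treats that case, and selectionRates is a hypothetical helper.

```go
package main

import "fmt"

// funnel holds the per-trace counts for one candidate type.
type funnel struct {
	CandidatesIn int
	ModelKept    int
}

// selectionRates returns model_kept / candidates_in for every trace that had
// at least one candidate. Traces with zero candidates are skipped (assumed).
func selectionRates(traces []funnel) []float64 {
	rates := make([]float64, 0, len(traces))
	for _, t := range traces {
		if t.CandidatesIn == 0 {
			continue
		}
		rates = append(rates, float64(t.ModelKept)/float64(t.CandidatesIn))
	}
	return rates
}

func main() {
	fmt.Println(selectionRates([]funnel{
		{CandidatesIn: 10, ModelKept: 2},
		{CandidatesIn: 0, ModelKept: 0},
		{CandidatesIn: 5, ModelKept: 5},
	}))
}
```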

Directories

Path Synopsis
Package replay re-runs captured reranker (and, in future, other agent) invocations through the production agent code, using the snapshotted database as the back-end.
