Documentation ¶
Overview ¶
Package obs houses cross-cutting observability wiring (tracing today, metrics/logs later). It owns the lifecycle of the global OpenTelemetry TracerProvider so the rest of the code can fetch a tracer via otel.Tracer(...) without knowing about exporters or SDK configuration.
Index ¶
- Constants
- func ContentEnabled() bool
- func LoggerWithSpan(ctx context.Context, logger *slog.Logger) *slog.Logger
- func ObserveErr(span trace.Span, err error) error
- func RecordContent(span trace.Span, name, body string, extra ...attribute.KeyValue)
- func RedactBase64Payloads(body string) string
- func SetContentEnabled(v bool)
- func TextPreview(text string, maxLen int) (preview, hashHex string)
- type ShutdownFunc
- type TracingConfig
Constants ¶
const (
	ExporterOTLP   = "otlp"
	ExporterStdout = "stdout"
)
Exporter names for TracingConfig.Exporter.
const DefaultPreviewLen = 160
DefaultPreviewLen is the truncation threshold used by TextPreview when the caller doesn't override it. ~160 chars fits inside one Tempo attribute slot without bloating index bytes, while still leaving enough text for direct TraceQL substring search (e.g. `{span.bot.user_query_preview =~ ".*Isotonic.*"}`).
Functions ¶
func ContentEnabled ¶
func ContentEnabled() bool
ContentEnabled reports the current state of the toggle. Exposed for tests and for callers that want to cheaply skip assembling a payload when recording is disabled.
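The "cheaply skip assembling a payload" pattern can be sketched as follows. This is a minimal stand-alone illustration, not the package's code: the lowercase names, the use of atomic.Bool for the toggle, and the buildPayload helper are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// contentEnabled stands in for the package-level toggle. Backing it with
// atomic.Bool is an assumption of this sketch, not something the docs state.
var contentEnabled atomic.Bool

func setContentEnabled(v bool) { contentEnabled.Store(v) }
func isContentEnabled() bool   { return contentEnabled.Load() }

// calls counts how often the (hypothetical) expensive step runs,
// so the skip is observable.
var calls int

// buildPayload is a hypothetical expensive step, e.g. marshalling a request.
func buildPayload() string {
	calls++
	return `{"prompt":"..."}`
}

// maybeRecord assembles the payload only when recording is enabled.
func maybeRecord(record func(body string)) {
	if !isContentEnabled() {
		return
	}
	record(buildPayload())
}

func main() {
	maybeRecord(func(string) {})
	fmt.Println("payload builds while disabled:", calls)

	setContentEnabled(true)
	maybeRecord(func(string) {})
	fmt.Println("payload builds after enabling:", calls)
}
```

Checking the toggle before building the body keeps the disabled path free of marshalling work, which is the use the doc comment describes.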
func LoggerWithSpan ¶
func LoggerWithSpan(ctx context.Context, logger *slog.Logger) *slog.Logger
LoggerWithSpan returns logger augmented with "trace_id" and "span_id" fields taken from the OTEL span attached to ctx. If ctx has no valid span (e.g. a background goroutine that hasn't been instrumented), logger is returned unchanged.
Intended to be called once per logical scope — typically right after the root span for that scope is started — and the returned logger then threaded through the entire scope. All subsequent `.Info()`/`.Warn()`/`.Error()` calls on the scoped logger inherit the trace/span ids automatically.
This works around the fact that this codebase uses non-context slog APIs (`logger.Info(...)` rather than `logger.InfoContext(ctx, ...)`), so a context-extracting Handler would never see the span. Binding the ids onto the logger at scope entry is the simplest fix that avoids rewriting every call site.
The format of trace_id ("0123...32-hex") and span_id ("0123...16-hex") matches Grafana's Tempo derived-fields configuration, so a Loki log line with these fields renders a one-click "open in Tempo" button.
func ObserveErr ¶
func ObserveErr(span trace.Span, err error) error
ObserveErr records err on span and sets Error status when err is non-nil, then returns err unchanged so callers can one-line it:
return obs.ObserveErr(span, doThing())
When err is nil it is a no-op.
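A rough sketch of the record-and-return shape, using a hypothetical two-method span interface and a fake span in place of trace.Span so the example runs without the OTel SDK. The method names on this interface are invented for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// span is a hypothetical subset of what a tracing span offers.
type span interface {
	RecordError(err error)
	SetErrorStatus(description string)
}

// fakeSpan remembers what was reported so the behaviour is observable.
type fakeSpan struct {
	recorded []error
	status   string
}

func (s *fakeSpan) RecordError(err error)      { s.recorded = append(s.recorded, err) }
func (s *fakeSpan) SetErrorStatus(desc string) { s.status = desc }

// observeErr reports a non-nil err on the span and returns err unchanged.
// A nil err leaves the span untouched.
func observeErr(s span, err error) error {
	if err == nil {
		return nil
	}
	s.RecordError(err)
	s.SetErrorStatus(err.Error())
	return err
}

var errBoom = errors.New("boom")

// doThing is a hypothetical operation that may fail.
func doThing(fail bool) error {
	if fail {
		return errBoom
	}
	return nil
}

// run shows the one-line call site.
func run(s span, fail bool) error {
	return observeErr(s, doThing(fail))
}

func main() {
	s := &fakeSpan{}
	fmt.Println("ok path:", run(s, false), "recorded:", len(s.recorded))
	fmt.Println("error path:", run(s, true), "recorded:", len(s.recorded))
}
```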
func RecordContent ¶
func RecordContent(span trace.Span, name, body string, extra ...attribute.KeyValue)
RecordContent adds a span event carrying a content payload when the content toggle is enabled; it is a no-op otherwise. The body is stored under the reserved attribute key "body" — extra attributes are appended alongside but must not shadow this key.
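One way to picture the reserved-key rule is the attribute list a content event might carry. The attr type below is a hypothetical stand-in for attribute.KeyValue, and silently dropping a shadowing extra is an assumption of this sketch; the docs only say callers must not shadow the key.

```go
package main

import "fmt"

// attr is a hypothetical stand-in for attribute.KeyValue.
type attr struct{ Key, Value string }

// bodyKey is the reserved attribute key named in the docs.
const bodyKey = "body"

// eventAttrs builds the attribute list for a content event: the body first,
// then the extras. Extras that reuse the reserved key are dropped here,
// which is a policy chosen for this sketch only.
func eventAttrs(body string, extra ...attr) []attr {
	out := make([]attr, 0, len(extra)+1)
	out = append(out, attr{Key: bodyKey, Value: body})
	for _, a := range extra {
		if a.Key == bodyKey {
			continue
		}
		out = append(out, a)
	}
	return out
}

func main() {
	attrs := eventAttrs(`{"q":"hi"}`,
		attr{Key: "model", Value: "example-model"},
		attr{Key: "body", Value: "would shadow the payload"},
	)
	for _, a := range attrs {
		fmt.Printf("%s=%s\n", a.Key, a.Value)
	}
}
```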
func RedactBase64Payloads ¶
func RedactBase64Payloads(body string) string
RedactBase64Payloads replaces every "data:<mime>;base64,<payload>" fragment in body with "redacted:sha256:<hex>:<mime>:<size>", where <hex> is the sha256 of the decoded bytes (matching artifacts.content_hash) and <size> is the decoded byte count. The shape is deliberately distinct from a real data URL, so the matching regex cannot re-match already-redacted output.
Designed for use inside RecordContent payloads where the body is typically a marshalled JSON request/response carrying multimodal file inputs. With 1-5 MB images per OR call and 3-4 OR calls per turn, leaving raw base64 in trace events drives per-trace size beyond Tempo's max_bytes_per_trace limit; redacting brings a media-heavy turn from ~30 MB down to ~1 MB.
Replay reconstructs the original FilePart by parsing this placeholder, looking up the artifact by content_hash, and reading bytes from the snapshotted artifact storage. Validity checks (sha256 + size) catch any drift between trace and snapshot.
Hot-path note: a no-match body returns in O(len(body)) with zero allocs. On match we run the regex exactly once via FindAllStringSubmatchIndex and stream the result into a byte buffer — avoiding the double-regex cost of ReplaceAllStringFunc + per-match FindStringSubmatch.
func SetContentEnabled ¶
func SetContentEnabled(v bool)
SetContentEnabled flips the content-recording toggle. Called from InitTracing based on TracingConfig.TraceContent.
func TextPreview ¶
func TextPreview(text string, maxLen int) (preview, hashHex string)
TextPreview returns a short, single-line, base64-redacted preview of text suitable for use as a span attribute, plus the sha256-hex of the original (untruncated, pre-redaction) bytes. Both are intended to be set on a span alongside each other so an investigator can:
- search by readable substring via the preview attribute, and
- join different spans on the same input via the hash, even after the preview gets truncated or collides between similar queries.
The function is deterministic and allocation-light: it short-circuits on strings already under maxLen with no redaction matches.
If maxLen <= 0, DefaultPreviewLen is used. The hash is always over the raw input — truncation must not change the hash for the same input.
Types ¶
type ShutdownFunc ¶
ShutdownFunc flushes pending spans and releases exporter resources. Safe to call with a context that has a timeout — the SDK respects deadlines.
func InitTracing ¶
func InitTracing(ctx context.Context, cfg TracingConfig, serviceVersion string) (ShutdownFunc, error)
InitTracing configures the global TracerProvider.
When cfg.Enabled is false, InitTracing is a no-op: the OTel default (noop provider) stays in effect, otel.Tracer(...) returns cheap no-op spans, and the returned ShutdownFunc does nothing.
When enabled, it builds an exporter based on cfg.Exporter:
- "otlp" (default): OTLP/gRPC pointed at cfg.OTLPEndpoint (loopback, insecure — for a local collector like Alloy on the same host).
- "stdout": pretty-prints spans to stderr, for local dev where network-level delivery is not what's being tested.
Sampling is AlwaysSample, which is acceptable here because the workload is low-volume and downstream retention is short.
serviceVersion is recorded as the service.version resource attribute; callers typically pass the build-time Version variable.
type TracingConfig ¶
type TracingConfig struct {
	Enabled      bool
	Exporter     string // "otlp" (default) or "stdout"
	OTLPEndpoint string // required when Exporter == "otlp"
	ServiceName  string
	// TraceContent, when true, flips the content toggle via SetContentEnabled
	// so RecordContent starts attaching body-bearing span events. The trace
	// carries a resource attribute laplaced.trace_content=true so the mode
	// is self-identifying at query time.
	TraceContent bool
}
TracingConfig is the minimal contract InitTracing needs from caller config. It is defined here (not imported from internal/config) to keep internal/obs free of upstream dependencies, which is useful for tests and for future reuse.
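The field comments imply a small set of rules for picking an exporter, which might be checked along these lines. The resolveExporter helper is hypothetical, as is treating an empty Exporter as "otlp" and rejecting unknown names; the endpoint value in main is only an illustrative loopback address.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	exporterOTLP   = "otlp"
	exporterStdout = "stdout"
)

// tracingConfig mirrors the documented TracingConfig fields.
type tracingConfig struct {
	Enabled      bool
	Exporter     string
	OTLPEndpoint string
	ServiceName  string
	TraceContent bool
}

// resolveExporter is a hypothetical helper that applies the documented rules:
// disabled means nothing to build, "otlp" is the default and needs an
// endpoint, "stdout" needs nothing else.
func resolveExporter(cfg tracingConfig) (string, error) {
	if !cfg.Enabled {
		return "", nil
	}
	switch cfg.Exporter {
	case "", exporterOTLP: // assumption: empty selects the default
		if cfg.OTLPEndpoint == "" {
			return "", errors.New("OTLPEndpoint is required for the otlp exporter")
		}
		return exporterOTLP, nil
	case exporterStdout:
		return exporterStdout, nil
	default:
		return "", fmt.Errorf("unknown exporter %q", cfg.Exporter)
	}
}

func main() {
	cfg := tracingConfig{Enabled: true, OTLPEndpoint: "127.0.0.1:4317"}
	name, err := resolveExporter(cfg)
	fmt.Println(name, err)

	_, err = resolveExporter(tracingConfig{Enabled: true})
	fmt.Println(err)
}
```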