obs

package
v0.9.1 Latest
Warning: This package is not in the latest version of its module.
Published: May 18, 2026 License: MIT Imports: 19 Imported by: 0

Documentation

Overview

Package obs houses cross-cutting observability wiring (tracing today, metrics/logs later). It owns the lifecycle of the global OpenTelemetry TracerProvider so the rest of the code can fetch a tracer via otel.Tracer(...) without knowing about exporters or SDK configuration.

Constants

const (
	ExporterOTLP   = "otlp"
	ExporterStdout = "stdout"
)

Exporter names for TracingConfig.Exporter.

const DefaultPreviewLen = 160

DefaultPreviewLen is the truncation threshold used by TextPreview when the caller doesn't override it. Roughly 160 characters fit inside one Tempo attribute slot without bloating index bytes, while still leaving enough text for direct TraceQL substring search (e.g. `{span.bot.user_query_preview =~ ".*Isotonic.*"}`).

Variables

This section is empty.

Functions

func ContentEnabled

func ContentEnabled() bool

ContentEnabled reports the current state of the toggle. Exposed for tests and for callers that want to cheaply skip assembling a payload when recording is disabled.

func LoggerWithSpan

func LoggerWithSpan(ctx context.Context, logger *slog.Logger) *slog.Logger

LoggerWithSpan returns logger augmented with "trace_id" and "span_id" fields taken from the OTEL span attached to ctx. If ctx has no valid span (e.g. a background goroutine that hasn't been instrumented), logger is returned unchanged.

Intended to be called once per logical scope — typically right after the root span for that scope is started — and the returned logger then threaded through the entire scope. All subsequent `.Info()`/`.Warn()`/`.Error()` calls on the scoped logger inherit the trace/span ids automatically.

This works around the fact that this codebase uses the non-context slog APIs (`logger.Info(...)` rather than `logger.InfoContext(ctx, ...)`), so a context-extracting Handler would never see the span. Binding the ids onto the logger at scope entry is the simplest fix that avoids rewriting every call site.

The format of trace_id ("0123...32-hex") and span_id ("0123...16-hex") matches Grafana's Tempo derived-fields configuration, so a Loki log line with these fields renders a one-click "open in Tempo" button.
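The binding step described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: bindIDs, logOnce and hasIDs are hypothetical names, and the ids are passed in as plain strings, whereas LoggerWithSpan reads them from the span attached to ctx.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// bindIDs is a hypothetical stand-in for the binding step. It assumes the
// ids were already extracted; empty ids model "ctx has no valid span".
func bindIDs(logger *slog.Logger, traceID, spanID string) *slog.Logger {
	if traceID == "" || spanID == "" {
		return logger // no valid span: logger is returned unchanged
	}
	return logger.With("trace_id", traceID, "span_id", spanID)
}

// logOnce builds a scoped logger, emits one record through the non-context
// API, and returns the JSON line that was written.
func logOnce(traceID, spanID string) string {
	var buf bytes.Buffer
	base := slog.New(slog.NewJSONHandler(&buf, nil))
	scoped := bindIDs(base, traceID, spanID)
	scoped.Info("fetched profile") // no ctx needed: ids ride on the logger
	return buf.String()
}

// hasIDs reports whether a JSON log line carries both id fields.
func hasIDs(line string) bool {
	return strings.Contains(line, `"trace_id":`) && strings.Contains(line, `"span_id":`)
}

func main() {
	fmt.Print(logOnce("0123456789abcdef0123456789abcdef", "0123456789abcdef"))
	fmt.Println(hasIDs(logOnce("", "")))
}
```

Because the fields are attached with logger.With, every later call on the scoped logger carries them without any change at the call site.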

func ObserveErr

func ObserveErr(span trace.Span, err error) error

ObserveErr records err on span and sets Error status when err is non-nil, then returns err unchanged so callers can one-line it:

	return obs.ObserveErr(span, doThing())

When err is nil it is a no-op.
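The one-liner pattern can be illustrated without the OpenTelemetry dependency. In the sketch below, spanLike is a hypothetical, cut-down stand-in for trace.Span (it is not the OpenTelemetry interface), fakeSpan is a test double, and using err.Error() as the status description is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// spanLike is a hypothetical stand-in for the span type; the real
// trace.Span interface has different method signatures.
type spanLike interface {
	RecordError(err error)
	SetErrorStatus(description string)
}

// fakeSpan remembers what was recorded so the behaviour can be inspected.
type fakeSpan struct {
	recorded []error
	status   string
}

func (s *fakeSpan) RecordError(err error)      { s.recorded = append(s.recorded, err) }
func (s *fakeSpan) SetErrorStatus(desc string) { s.status = desc }

// observeErr mirrors the documented contract: no-op on nil, otherwise
// record the error, mark the span as failed, and return err unchanged.
func observeErr(span spanLike, err error) error {
	if err == nil {
		return nil
	}
	span.RecordError(err)
	span.SetErrorStatus(err.Error()) // assumed description format
	return err
}

var errBoom = errors.New("boom")

// doThing is a placeholder operation that can be made to fail.
func doThing(fail bool) error {
	if fail {
		return errBoom
	}
	return nil
}

// run shows the one-line call shape from the documentation.
func run(span spanLike, fail bool) error {
	return observeErr(span, doThing(fail))
}

func main() {
	s := &fakeSpan{}
	fmt.Println(run(s, false), len(s.recorded))
	fmt.Println(run(s, true), s.status)
}
```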

func RecordContent

func RecordContent(span trace.Span, name, body string, extra ...attribute.KeyValue)

RecordContent adds a span event carrying a content payload when the content toggle is enabled; it is a no-op otherwise. The body is stored under the reserved attribute key "body" — extra attributes are appended alongside but must not shadow this key.

func RedactBase64Payloads

func RedactBase64Payloads(body string) string

RedactBase64Payloads replaces every "data:<mime>;base64,<payload>" fragment in body with "redacted:sha256:<hex>:<mime>:<size>", where <hex> is the sha256 of the decoded bytes (matching artifacts.content_hash) and <size> is the decoded byte count. The shape is deliberately distinct from a real data URL so that the matching regex cannot match already-redacted output again.

Designed for use inside RecordContent payloads where the body is typically a marshalled JSON request/response carrying multimodal file inputs. With 1-5 MB images per OR call and 3-4 OR calls per turn, leaving raw base64 in trace events drives per-trace size beyond Tempo's max_bytes_per_trace limit; redacting brings a media-heavy turn from ~30 MB down to ~1 MB.

Replay reconstructs the original FilePart by parsing this placeholder, looking up the artifact by content_hash, and reading bytes from the snapshotted artifact storage. Validity checks (sha256 + size) catch any drift between trace and snapshot.

Hot-path note: a no-match body returns in O(len(body)) with zero allocs. On match we run the regex exactly once via FindAllStringSubmatchIndex and stream the result into a byte buffer — avoiding the double-regex cost of ReplaceAllStringFunc + per-match FindStringSubmatch.
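A rough sketch of the single-pass approach described above follows. The pattern in dataURLRe, the strings.Contains pre-check, and the choice to leave undecodable fragments untouched are all assumptions made for illustration; the package's actual regex and error handling are not shown in this documentation.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// dataURLRe is an assumed pattern: group 1 is the mime type, group 2 the payload.
var dataURLRe = regexp.MustCompile(`data:([A-Za-z0-9.+/-]+);base64,([A-Za-z0-9+/]+={0,2})`)

// redact is a hypothetical re-creation of the documented behaviour.
func redact(body string) string {
	// Cheap linear pre-check so bodies without a candidate skip the regex.
	if !strings.Contains(body, ";base64,") {
		return body
	}
	// One regex pass; each match is [start, end, mimeStart, mimeEnd, payStart, payEnd].
	matches := dataURLRe.FindAllStringSubmatchIndex(body, -1)
	if len(matches) == 0 {
		return body
	}
	var out strings.Builder
	out.Grow(len(body))
	last := 0
	for _, m := range matches {
		raw, err := base64.StdEncoding.DecodeString(body[m[4]:m[5]])
		if err != nil {
			continue // sketch choice: leave undecodable fragments as they are
		}
		sum := sha256.Sum256(raw)
		out.WriteString(body[last:m[0]]) // text between the previous match and this one
		out.WriteString("redacted:sha256:")
		out.WriteString(hex.EncodeToString(sum[:]))
		out.WriteByte(':')
		out.WriteString(body[m[2]:m[3]]) // mime type
		out.WriteByte(':')
		out.WriteString(strconv.Itoa(len(raw))) // decoded byte count
		last = m[1]
	}
	out.WriteString(body[last:])
	return out.String()
}

func main() {
	in := `{"url":"data:image/png;base64,aGVsbG8="}`
	once := redact(in)
	fmt.Println(once)
	fmt.Println(redact(once) == once) // the placeholder is not matched again
}
```

Since the placeholder no longer contains ";base64,", running the function on its own output returns it unchanged, which is the property the distinct shape is meant to guarantee.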

func SetContentEnabled

func SetContentEnabled(v bool)

SetContentEnabled flips the content-recording toggle. Called from InitTracing based on TracingConfig.TraceContent.
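One way such a toggle could be kept safe for concurrent use is an atomic flag. The sketch below is an assumption about the mechanism, not the package's code: contentEnabled, buildPayload and record are hypothetical names. It also shows the "cheaply skip assembling a payload" use mentioned under ContentEnabled.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// contentEnabled is a hypothetical package-level flag; the real package
// may store the toggle differently.
var contentEnabled atomic.Bool

func setContentEnabled(v bool) { contentEnabled.Store(v) }
func isContentEnabled() bool   { return contentEnabled.Load() }

// buildPayload stands in for an expensive marshalling step and counts
// how often it runs.
func buildPayload(calls *int) string {
	*calls++
	return `{"prompt":"hello"}`
}

// record returns the payload that would be attached to the span, or ""
// when recording is disabled. The payload is never built when disabled.
func record(calls *int) string {
	if !isContentEnabled() {
		return ""
	}
	return buildPayload(calls)
}

func main() {
	calls := 0
	fmt.Println(record(&calls) == "", calls) // toggle off: nothing assembled
	setContentEnabled(true)
	fmt.Println(record(&calls), calls)
}
```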

func TextPreview

func TextPreview(text string, maxLen int) (preview, hashHex string)

TextPreview returns a short, single-line, base64-redacted preview of text suitable for use as a span attribute, plus the sha256-hex of the original (untruncated, pre-redaction) bytes. Both are intended to be set on a span alongside each other so an investigator can:

  • search by readable substring via the preview attribute, and
  • join different spans on the same input via the hash, even after the preview gets truncated or collides between similar queries.

The function is deterministic and allocation-light: it short-circuits on strings already under maxLen with no redaction matches.

If maxLen <= 0, DefaultPreviewLen is used. The hash is always over the raw input — truncation must not change the hash for the same input.
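The hash and truncation rules can be sketched as follows. This is a simplified illustration under stated assumptions: textPreview and defaultPreviewLen are hypothetical names, the base64 redaction step is omitted, whitespace collapsing is one possible way to produce a single line, and truncation is assumed to count runes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// defaultPreviewLen mirrors the documented DefaultPreviewLen value.
const defaultPreviewLen = 160

// textPreview is a hypothetical, reduced version of TextPreview.
// The base64 redaction step is intentionally left out of this sketch.
func textPreview(text string, maxLen int) (preview, hashHex string) {
	if maxLen <= 0 {
		maxLen = defaultPreviewLen
	}
	// Hash the raw input first so truncation can never affect it.
	sum := sha256.Sum256([]byte(text))
	hashHex = hex.EncodeToString(sum[:])

	// Collapse whitespace runs (including newlines) into single spaces.
	preview = strings.Join(strings.Fields(text), " ")

	// Truncate on rune boundaries (assumption: length is counted in runes).
	if r := []rune(preview); len(r) > maxLen {
		preview = string(r[:maxLen])
	}
	return preview, hashHex
}

func main() {
	short, h1 := textPreview("hello\nworld", 5)
	full, h2 := textPreview("hello\nworld", 0)
	fmt.Println(short)
	fmt.Println(full)
	fmt.Println(h1 == h2) // same input, same hash, regardless of maxLen
}
```

Setting both values on a span lets the preview serve substring search while the hash joins spans that share the same input.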

Types

type ShutdownFunc

type ShutdownFunc func(context.Context) error

ShutdownFunc flushes pending spans and releases exporter resources. Safe to call with a context that has a timeout — the SDK respects deadlines.

func InitTracing

func InitTracing(ctx context.Context, cfg TracingConfig, serviceVersion string) (ShutdownFunc, error)

InitTracing configures the global TracerProvider.

When cfg.Enabled is false, InitTracing is a no-op: the OTel default (noop provider) stays in effect, otel.Tracer(...) returns cheap no-op spans, and the returned ShutdownFunc does nothing.

When enabled, it builds an exporter based on cfg.Exporter:

  • "otlp" (default): OTLP/gRPC pointed at cfg.OTLPEndpoint (loopback, insecure — for a local collector like Alloy on the same host).
  • "stdout": pretty-prints spans to stderr, for local dev where network-level delivery is not what's being tested.

Sampling is AlwaysSample, which is acceptable here because the workload is low-volume and downstream retention is short.

serviceVersion is recorded as the service.version resource attribute; callers typically pass the build-time Version variable.

type TracingConfig

type TracingConfig struct {
	Enabled      bool
	Exporter     string // "otlp" (default) or "stdout"
	OTLPEndpoint string // required when Exporter == "otlp"
	ServiceName  string
	// TraceContent, when true, flips the content toggle via SetContentEnabled
	// so RecordContent starts attaching body-bearing span events. The trace
	// carries a resource attribute laplaced.trace_content=true so the mode
	// is self-identifying at query time.
	TraceContent bool
}

TracingConfig is the minimal contract InitTracing needs from caller config. It is defined here (not imported from internal/config) to keep internal/obs free of upstream dependencies, which is useful for tests and for future reuse.
