Documentation ¶
Overview ¶
Package phtrace provides OpenTelemetry tracing and metrics helpers for PayCloud services.
It is an optional subpackage: importing paycloudhelper/phtrace does NOT automatically start the SDK. Consumer services call Init(ctx, cfg) once during startup, then register/use the returned Shutdown to flush on exit.
Design goals:
- Near-zero cost when disabled (env flag OTEL_ENABLED=false or Init not called): helpers reduce to a package-level atomic load.
- Works with Grafana Tempo for traces, Prometheus for metrics, Loki for logs — via the standard OTLP gRPC exporter hitting an OTel Collector.
- W3C traceparent propagation across HTTP (otelecho), gRPC (otelgrpc), and RabbitMQ (see rmqprop.go in this package).
- Graceful shutdown: Shutdown() flushes pending spans and metrics within a caller-provided timeout so the process does not hang on exit.
Backward compatibility: This is a NEW subpackage (no pre-existing symbols to break). All functions are safe to call before Init() — they become no-ops that return a disabled tracer/meter. Consumers that never call Init() pay no runtime cost beyond a package-level atomic load.
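The "no-op before Init" behaviour described above can be pictured with a small stdlib-only sketch. The names enabled, getTracer, noopTracer and liveTracer are hypothetical and are not part of phtrace; the sketch only assumes that the package keeps an atomic flag and hands out a do-nothing implementation while that flag is unset.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// enabled is a hypothetical package-level flag that a successful Init would set.
var enabled atomic.Bool

// spanCounter stands in for a real exporter so the effect is observable.
var spanCounter atomic.Int64

// tracer is a stand-in for trace.Tracer with a single method.
type tracer interface {
	Start(name string) bool // reports whether a span was recorded
}

type noopTracer struct{}

func (noopTracer) Start(string) bool { return false }

type liveTracer struct{}

func (liveTracer) Start(string) bool {
	spanCounter.Add(1)
	return true
}

// getTracer costs a single atomic load when telemetry is disabled.
func getTracer() tracer {
	if !enabled.Load() {
		return noopTracer{}
	}
	return liveTracer{}
}

func main() {
	fmt.Println(getTracer().Start("before-init")) // false: no-op path
	enabled.Store(true)                           // simulate a successful Init
	fmt.Println(getTracer().Start("after-init"))  // true: live path
}
```

Call sites never need to check the flag themselves; they always receive something that is safe to call.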
Index ¶
- Constants
- Variables
- func ExtractAMQP(ctx context.Context, headers amqp.Table) context.Context
- func InjectAMQP(ctx context.Context, headers amqp.Table) amqp.Table
- func IsEnabled() bool
- func LogDCtx(ctx context.Context, format string, args ...interface{})
- func LogECtx(ctx context.Context, format string, args ...interface{})
- func LogICtx(ctx context.Context, format string, args ...interface{})
- func LogWCtx(ctx context.Context, format string, args ...interface{})
- func Meter(name string) metric.Meter
- func Propagator() propagation.TextMapPropagator
- func Resource() *resource.Resource
- func Tracer(name string) trace.Tracer
- type AMQPCarrier
- type Config
- type LogContextCtx
- func (l *LogContextCtx) Ctx() context.Context
- func (l *LogContextCtx) LogD(format string, args ...interface{})
- func (l *LogContextCtx) LogE(format string, args ...interface{})
- func (l *LogContextCtx) LogI(format string, args ...interface{})
- func (l *LogContextCtx) LogW(format string, args ...interface{})
- func (l *LogContextCtx) With(fields ...string) *LogContextCtx
- type Option
- func WithEnabled(v bool) Option
- func WithEndpoint(v string) Option
- func WithEnvironment(v string) Option
- func WithInsecure(v bool) Option
- func WithResourceAttribute(k, v string) Option
- func WithSamplingRatio(v float64) Option
- func WithServiceName(v string) Option
- func WithServiceVersion(v string) Option
- type PhaseHistogram
- type Shutdown
Constants ¶
const (
	FieldTraceID    = "trace_id"
	FieldSpanID     = "span_id"
	FieldTicketID   = "ticket_id"
	FieldReffNo     = "reff_no"
	FieldMerchantID = "merchant_id"
	FieldOrderID    = "order_id"
	FieldTrxID      = "trx_id"
	FieldTrxNo      = "trx_no"
	FieldService    = "service"
	FieldRoute      = "route"
	FieldVendor     = "vendor"
)
Standard log field keys used across all PayCloud services so Loki/Grafana queries are consistent. These are the canonical names — always use these constants instead of raw strings when adding fields via WithFields.
Variables ¶
var DefaultPhaseBuckets = []float64{
5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000,
}
DefaultPhaseBuckets are the histogram bucket boundaries (milliseconds) used by QR-MPM phase timing. They follow a roughly doubling 1-2.5-5 progression per decade, tuned to the latency analysis in docs/2026-04-22-qrmpm-performance-analysis.md so p50/p99 sit on distinct buckets.
Units: milliseconds.
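To make the boundaries concrete, the sketch below converts a time.Duration to milliseconds and finds the bucket it would fall into. It assumes upper-inclusive boundaries, as in OpenTelemetry explicit-bucket histograms; toMillis and bucketIndex are illustrative helpers, not phtrace functions.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// defaultPhaseBuckets copies the boundaries documented above (milliseconds).
var defaultPhaseBuckets = []float64{5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000}

// toMillis converts a duration to fractional milliseconds.
func toMillis(d time.Duration) float64 {
	return float64(d) / float64(time.Millisecond)
}

// bucketIndex returns the index of the bucket holding ms, assuming each bucket
// covers (previous bound, bound]. Index len(bounds) is the overflow bucket.
func bucketIndex(bounds []float64, ms float64) int {
	return sort.SearchFloat64s(bounds, ms)
}

func main() {
	samples := []time.Duration{
		3 * time.Millisecond,
		10 * time.Millisecond,
		180 * time.Millisecond,
		12 * time.Second,
	}
	for _, d := range samples {
		ms := toMillis(d)
		fmt.Printf("%v -> %.0f ms -> bucket %d\n", d, ms, bucketIndex(defaultPhaseBuckets, ms))
	}
}
```

Anything above 10000 ms lands in the overflow bucket, so phases slower than ten seconds are counted but not resolved further.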
Functions ¶
func ExtractAMQP ¶
func ExtractAMQP(ctx context.Context, headers amqp.Table) context.Context
ExtractAMQP returns a new context carrying the span context extracted from the given AMQP headers. When headers is nil or does not contain a traceparent, it returns ctx unchanged (any new span will become a root span).
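For readers unfamiliar with the header being extracted, here is a minimal stdlib-only sketch of parsing a version-00 W3C traceparent value out of a header map. It is not how phtrace is implemented (that is delegated to the registered propagator); parseTraceparent and extract are hypothetical names, and only version 00 is handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLowerHex reports whether s is non-empty and contains only 0-9 and a-f.
func isLowerHex(s string) bool {
	if s == "" {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		if !(c >= '0' && c <= '9') && !(c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// parseTraceparent parses "00-<32 hex trace id>-<16 hex span id>-<2 hex flags>".
func parseTraceparent(v string) (traceID, spanID string, sampled, ok bool) {
	parts := strings.Split(v, "-")
	if len(parts) != 4 || parts[0] != "00" ||
		len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return "", "", false, false
	}
	for _, p := range parts[1:] {
		if !isLowerHex(p) {
			return "", "", false, false
		}
	}
	// All-zero trace or span IDs are invalid per the W3C specification.
	if strings.Trim(parts[1], "0") == "" || strings.Trim(parts[2], "0") == "" {
		return "", "", false, false
	}
	flags, err := strconv.ParseUint(parts[3], 16, 8)
	if err != nil {
		return "", "", false, false
	}
	return parts[1], parts[2], flags&1 == 1, true
}

// extract looks up "traceparent" in headers; reading a nil map is safe in Go.
func extract(headers map[string]interface{}) (traceID, spanID string, ok bool) {
	v, _ := headers["traceparent"].(string)
	traceID, spanID, _, ok = parseTraceparent(v)
	return traceID, spanID, ok
}

func main() {
	h := map[string]interface{}{
		"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
	}
	fmt.Println(extract(h))
	fmt.Println(extract(nil)) // nothing to extract: a new span would be a root span
}
```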
func InjectAMQP ¶
func InjectAMQP(ctx context.Context, headers amqp.Table) amqp.Table
InjectAMQP writes the current span context from ctx into the provided AMQP headers using the globally registered propagator. Safe when headers is nil — InjectAMQP will not allocate a new Table in that case (caller loses the traceparent). Prefer passing a non-nil amqp.Table so propagation actually works. Returns the (possibly newly allocated) headers for ergonomic call sites: `headers = phtrace.InjectAMQP(ctx, headers)`.
func IsEnabled ¶
func IsEnabled() bool
IsEnabled reports whether phtrace has been successfully initialized and is actively exporting telemetry. Cheap (atomic load) and safe for hot paths.
func LogDCtx ¶
func LogDCtx(ctx context.Context, format string, args ...interface{})
LogDCtx logs at Debug level with `[trace_id=... span_id=...] ` prefix drawn from ctx. Safe when ctx has no active span — the prefix is simply omitted.
func LogECtx ¶
func LogECtx(ctx context.Context, format string, args ...interface{})
LogECtx logs at Error level with trace context. Additionally records an exception event on the active span (if any) so errors show up in Tempo.
func Meter ¶
func Meter(name string) metric.Meter
Meter returns the named meter, or a no-op meter when phtrace is disabled. Safe to call even before Init.
func Propagator ¶
func Propagator() propagation.TextMapPropagator
Propagator returns the globally registered text map propagator. Falls back to a composite TraceContext+Baggage propagator when Init has not been called so RMQ/HTTP carriers still work (span context will simply be empty).
Types ¶
type AMQPCarrier ¶
AMQPCarrier is a propagation.TextMapCarrier backed by AMQP message headers. Use it to inject and extract the W3C traceparent header on the AMQP hot path so traces flow seamlessly between publisher and consumer services.
Usage on the publish side:
headers := amqp.Table{}
phtrace.InjectAMQP(ctx, headers)
ch.Publish(exchange, routingKey, false, false, amqp.Publishing{
	Headers: headers,
	Body:    body,
})
Usage on the consume side:
ctx = phtrace.ExtractAMQP(ctx, delivery.Headers)
ctx, span := phtrace.Tracer("rmq-consumer").Start(ctx, "process_message")
defer span.End()
func NewAMQPCarrier ¶
func NewAMQPCarrier(h amqp.Table) *AMQPCarrier
NewAMQPCarrier wraps an amqp.Table so it can be used as a TextMapCarrier. It does not copy the table; mutations to the carrier mutate the table.
func (*AMQPCarrier) Get ¶
func (c *AMQPCarrier) Get(key string) string
Get returns the value for the given header key, or "" when absent. W3C traceparent is commonly stored as a string, but AMQP allows typed values, so we normalize non-string values to their string form.
func (*AMQPCarrier) Keys ¶
func (c *AMQPCarrier) Keys() []string
Keys returns all header keys. Order is not guaranteed — callers that need stable ordering must sort the returned slice themselves.
func (*AMQPCarrier) Set ¶
func (c *AMQPCarrier) Set(key, value string)
Set writes a header value. An empty key is ignored per AMQP semantics.
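The Get, Keys and Set semantics above can be sketched with a plain map. The table type below only mirrors the shape of amqp.Table (a map from string to arbitrary values), and the normalization via fmt.Sprint is an assumption about how non-string values might be stringified, not the package's confirmed behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// table mirrors the shape of amqp.Table for this sketch.
type table map[string]interface{}

// carrier wraps a table without copying it, so writes are visible to the caller.
type carrier struct{ h table }

// Get returns "" for absent keys and stringifies typed values.
func (c carrier) Get(key string) string {
	v, ok := c.h[key]
	if !ok || v == nil {
		return ""
	}
	switch t := v.(type) {
	case string:
		return t
	case []byte:
		return string(t)
	default:
		return fmt.Sprint(t) // assumed normalization for other AMQP field types
	}
}

// Set ignores empty keys and nil tables.
func (c carrier) Set(key, value string) {
	if key == "" || c.h == nil {
		return
	}
	c.h[key] = value
}

// Keys returns header names in unspecified (map iteration) order.
func (c carrier) Keys() []string {
	keys := make([]string, 0, len(c.h))
	for k := range c.h {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	h := table{"x-retry": int32(3)}
	c := carrier{h: h}
	c.Set("traceparent", "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	keys := c.Keys()
	sort.Strings(keys) // callers sort when they need stable ordering
	fmt.Println(keys, c.Get("x-retry"), c.Get("missing") == "")
}
```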
type Config ¶
type Config struct {
	// Enabled toggles OTel export. When false, Init returns a no-op Shutdown
	// and all package helpers degrade to no-ops. Env: OTEL_ENABLED (default true
	// when any OTEL_EXPORTER_OTLP_ENDPOINT is set, false otherwise).
	Enabled bool

	// ServiceName is the logical service identifier. Required.
	// Env: OTEL_SERVICE_NAME.
	ServiceName string

	// ServiceVersion is the build/semver version. Env: OTEL_SERVICE_VERSION.
	ServiceVersion string

	// Environment is the deployment env (prod, stg, dev). Env: OTEL_DEPLOYMENT_ENV.
	Environment string

	// Endpoint is the OTLP collector host:port (gRPC). Required when enabled.
	// Env: OTEL_EXPORTER_OTLP_ENDPOINT (e.g. "otel-collector:4317").
	Endpoint string

	// Insecure disables TLS for the OTLP gRPC connection. Typically true in
	// dev / in-cluster where the collector is an internal service.
	// Env: OTEL_EXPORTER_OTLP_INSECURE (default true).
	Insecure bool

	// SamplingRatio is the head-based sampling ratio for traces (0..1). 1.0
	// captures everything, 0.05 captures 5%. Parent-based sampling is applied
	// so downstream spans inherit parent decisions.
	// Env: OTEL_TRACES_SAMPLER_ARG (default 1.0).
	SamplingRatio float64

	// DialTimeout is the timeout applied while connecting to the OTLP endpoint
	// during Init. Kept short so a flaky collector cannot block service startup.
	// Env: OTEL_DIAL_TIMEOUT (default 5s).
	DialTimeout time.Duration

	// BatchTimeout is the maximum delay between span export batches.
	// Env: OTEL_BATCH_TIMEOUT (default 5s).
	BatchTimeout time.Duration

	// BatchMaxExportSize is the maximum spans per export batch.
	// Env: OTEL_BATCH_MAX_EXPORT_SIZE (default 512).
	BatchMaxExportSize int

	// MetricExportInterval is how often the periodic metric reader pushes.
	// Env: OTEL_METRIC_EXPORT_INTERVAL (default 15s).
	MetricExportInterval time.Duration

	// ResourceAttributes are extra key=value pairs attached to every span and
	// metric. Env: OTEL_RESOURCE_ATTRIBUTES (comma-separated key=value).
	ResourceAttributes map[string]string
}
Config controls OTLP exporter behavior for traces and metrics. Only ServiceName and Endpoint are strictly required; the rest have safe defaults.
Typical usage in a service main():
shutdown, err := phtrace.Init(ctx, phtrace.FromEnv(
	phtrace.WithServiceName("paycloud-be-createordersnap-manager"),
	phtrace.WithServiceVersion(build.Version),
))
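How FromEnv reads the environment is not shown in this documentation. As a rough illustration of the two value formats mentioned in the Config comments, the sketch below parses a comma-separated key=value list (as in OTEL_RESOURCE_ATTRIBUTES) and a duration with a fallback default. Both helpers are hypothetical, and accepting Go duration syntax such as "5s" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseResourceAttributes parses "k1=v1,k2=v2", skipping malformed pairs.
func parseResourceAttributes(s string) map[string]string {
	out := map[string]string{}
	for _, pair := range strings.Split(s, ",") {
		k, v, ok := strings.Cut(pair, "=")
		k = strings.TrimSpace(k)
		if !ok || k == "" {
			continue
		}
		out[k] = strings.TrimSpace(v)
	}
	return out
}

// durationOr parses raw as a Go duration and falls back to def when raw is
// empty, malformed, or not positive.
func durationOr(raw string, def time.Duration) time.Duration {
	d, err := time.ParseDuration(strings.TrimSpace(raw))
	if err != nil || d <= 0 {
		return def
	}
	return d
}

func main() {
	attrs := parseResourceAttributes("team=payments, region=ap-southeast-1,broken")
	fmt.Println(len(attrs), attrs["team"], attrs["region"])
	fmt.Println(durationOr("", 5*time.Second), durationOr("250ms", 5*time.Second))
}
```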
type LogContextCtx ¶
type LogContextCtx struct {
// contains filtered or unexported fields
}
LogContextCtx is a trace-aware sibling of phlogger.LogContext. Prefix is rebuilt per log call so trace/span IDs reflect the currently-active span on ctx (useful when a parent context spawns child spans).
func WithFields ¶
func WithFields(ctx context.Context, fields ...string) *LogContextCtx
WithFields returns a LogContextCtx carrying ctx plus an immutable set of extra fields prefixed to every log line. Use this when a function logs multiple times for the same logical operation and you want consistent fields.
Example:
lc := phtrace.WithFields(ctx, phtrace.FieldTicketID, ticketID, phtrace.FieldReffNo, reff)
lc.LogI("qrmpm start")
lc.LogI("qrmpm done status=%s", status)
func (*LogContextCtx) Ctx ¶
func (l *LogContextCtx) Ctx() context.Context
Ctx returns the underlying context (helpful when spawning child spans).
func (*LogContextCtx) LogD ¶
func (l *LogContextCtx) LogD(format string, args ...interface{})
LogD logs at Debug level with trace + field prefix.
func (*LogContextCtx) LogE ¶
func (l *LogContextCtx) LogE(format string, args ...interface{})
LogE logs at Error level with trace + field prefix (records span exception).
func (*LogContextCtx) LogI ¶
func (l *LogContextCtx) LogI(format string, args ...interface{})
LogI logs at Info level with trace + field prefix.
func (*LogContextCtx) LogW ¶
func (l *LogContextCtx) LogW(format string, args ...interface{})
LogW logs at Warning level with trace + field prefix.
func (*LogContextCtx) With ¶
func (l *LogContextCtx) With(fields ...string) *LogContextCtx
With returns a new LogContextCtx extending this one with extra fields.
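The field handling described for WithFields and With can be sketched as follows. The flat key, value, key, value layout matches the ...string signatures above, but the exact prefix format, the dropping of a dangling key, and the type name logCtx are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// logCtx holds an immutable, flat list of key, value, key, value, ...
type logCtx struct {
	fields []string
}

// withFields copies the input so later caller mutations cannot leak in.
func withFields(fields ...string) *logCtx {
	return &logCtx{fields: append([]string(nil), fields...)}
}

// With returns a new logCtx; the receiver is left untouched.
func (l *logCtx) With(fields ...string) *logCtx {
	merged := make([]string, 0, len(l.fields)+len(fields))
	merged = append(merged, l.fields...)
	merged = append(merged, fields...)
	return &logCtx{fields: merged}
}

// prefix is rebuilt on every call so it can reflect the currently active span.
// An empty traceID models "no active span": the trace part is omitted.
func (l *logCtx) prefix(traceID, spanID string) string {
	var b strings.Builder
	if traceID != "" {
		fmt.Fprintf(&b, "[trace_id=%s span_id=%s] ", traceID, spanID)
	}
	for i := 0; i+1 < len(l.fields); i += 2 { // a dangling key without a value is dropped
		fmt.Fprintf(&b, "%s=%s ", l.fields[i], l.fields[i+1])
	}
	return b.String()
}

func main() {
	base := withFields("ticket_id", "T-1")
	child := base.With("vendor", "acme")
	fmt.Printf("%q\n", base.prefix("", ""))
	fmt.Printf("%q\n", child.prefix("abc", "def"))
}
```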
type Option ¶
type Option func(*Config)
Option configures a Config via functional options, following the paycloudhelper public API style rule for options with >= 3 fields.
func WithEndpoint ¶
func WithEndpoint(v string) Option
WithEndpoint sets Config.Endpoint (required when enabled).
func WithEnvironment ¶
func WithEnvironment(v string) Option
WithEnvironment sets Config.Environment.
func WithResourceAttribute ¶
func WithResourceAttribute(k, v string) Option
WithResourceAttribute adds or replaces a single resource attribute.
func WithSamplingRatio ¶
func WithSamplingRatio(v float64) Option
WithSamplingRatio sets Config.SamplingRatio (clamped to [0,1] at Init).
func WithServiceName ¶
func WithServiceName(v string) Option
WithServiceName sets Config.ServiceName (required).
func WithServiceVersion ¶
func WithServiceVersion(v string) Option
WithServiceVersion sets Config.ServiceVersion.
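The functional-options pattern behind these helpers, together with the clamping mentioned for WithSamplingRatio, can be sketched in a few lines. The lowercase config, option and build names are stand-ins so the example compiles on its own; where exactly phtrace applies its defaults and clamping is not documented here and is assumed.

```go
package main

import "fmt"

// config is a trimmed stand-in for phtrace.Config.
type config struct {
	ServiceName        string
	SamplingRatio      float64
	ResourceAttributes map[string]string
}

// option mutates a config, mirroring `type Option func(*Config)`.
type option func(*config)

func withServiceName(v string) option {
	return func(c *config) { c.ServiceName = v }
}

func withSamplingRatio(v float64) option {
	return func(c *config) { c.SamplingRatio = v }
}

// withResourceAttribute adds or replaces one attribute, allocating lazily.
func withResourceAttribute(k, v string) option {
	return func(c *config) {
		if c.ResourceAttributes == nil {
			c.ResourceAttributes = map[string]string{}
		}
		c.ResourceAttributes[k] = v
	}
}

// clampRatio restricts v to the closed interval [0, 1].
func clampRatio(v float64) float64 {
	if v < 0 {
		return 0
	}
	if v > 1 {
		return 1
	}
	return v
}

// build applies options in order over the defaults, then clamps the ratio.
func build(opts ...option) config {
	c := config{SamplingRatio: 1.0} // documented default
	for _, o := range opts {
		o(&c)
	}
	c.SamplingRatio = clampRatio(c.SamplingRatio)
	return c
}

func main() {
	c := build(
		withServiceName("demo-service"),
		withSamplingRatio(1.7), // out of range on purpose
		withResourceAttribute("team", "payments"),
	)
	fmt.Println(c.ServiceName, c.SamplingRatio, c.ResourceAttributes["team"])
}
```

Because options are applied in order, a later option overrides an earlier one that touches the same field.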
type PhaseHistogram ¶
type PhaseHistogram struct {
// contains filtered or unexported fields
}
PhaseHistogram wraps a Float64Histogram pre-configured with milliseconds unit and QR-MPM phase bucket boundaries. Thread-safe; create once at startup per service and reuse across goroutines.
func MustPhaseHistogram ¶
func MustPhaseHistogram(meterName, histName string, buckets []float64) *PhaseHistogram
MustPhaseHistogram is like NewPhaseHistogram but panics on error. Use at package init where an instrument failure indicates a programming mistake.
func NewPhaseHistogram ¶
func NewPhaseHistogram(meterName, histName string, buckets []float64) (*PhaseHistogram, error)
NewPhaseHistogram creates (or returns a cached) phase-duration histogram. The meterName selects which instrumentation scope the instrument belongs to (typically the service name). Safe to call multiple times; subsequent calls with the same (meterName, histName) return the cached instrument.
When phtrace is disabled this returns a no-op histogram that records nothing.
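The "create or return cached" behaviour can be pictured as a mutex-guarded map keyed by the (meterName, histName) pair. This is a sketch under that assumption; phaseHist, newPhaseHist and mustPhaseHist are hypothetical, and rejecting empty names is an invented error condition used only to show the Must variant.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// histKey identifies an instrument by instrumentation scope and name.
type histKey struct{ meter, name string }

// phaseHist is a stand-in for the real histogram wrapper.
type phaseHist struct {
	key     histKey
	buckets []float64
}

var (
	mu    sync.Mutex
	cache = map[histKey]*phaseHist{}
)

// newPhaseHist returns the cached instrument for (meter, name), creating it on
// first use. Buckets passed on later calls are ignored in this sketch.
func newPhaseHist(meter, name string, buckets []float64) (*phaseHist, error) {
	if meter == "" || name == "" {
		return nil, errors.New("meter and histogram names are required")
	}
	key := histKey{meter: meter, name: name}
	mu.Lock()
	defer mu.Unlock()
	if h, ok := cache[key]; ok {
		return h, nil
	}
	h := &phaseHist{key: key, buckets: append([]float64(nil), buckets...)}
	cache[key] = h
	return h, nil
}

// mustPhaseHist panics on error, for use at package initialization.
func mustPhaseHist(meter, name string, buckets []float64) *phaseHist {
	h, err := newPhaseHist(meter, name, buckets)
	if err != nil {
		panic(err)
	}
	return h
}

func main() {
	a := mustPhaseHist("demo-service", "phase_duration_ms", []float64{5, 10, 25})
	b := mustPhaseHist("demo-service", "phase_duration_ms", nil)
	fmt.Println(a == b) // true: the second call hits the cache
}
```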
func (*PhaseHistogram) Observe ¶
func (p *PhaseHistogram) Observe(ctx context.Context, phase string, extra ...attribute.KeyValue) func()
Observe is a convenience helper: it starts timing immediately and returns a done func that records the elapsed time when called, typically via defer. Example:
defer phaseHist.Observe(ctx, "rmq_publish")()
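The trailing () in the example matters: the call to Observe runs immediately (starting the clock), and only the returned func is deferred. A stdlib-only sketch of the same pattern follows; observe, recordFunc and fakeClock are hypothetical helpers, and the injectable clock exists only to make the output deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// recordFunc receives the phase name and the elapsed time in milliseconds.
type recordFunc func(phase string, ms float64)

// observe captures the start time now and returns a func that records the
// elapsed milliseconds when it is eventually called.
func observe(now func() time.Time, record recordFunc, phase string) func() {
	start := now()
	return func() {
		elapsed := now().Sub(start)
		record(phase, float64(elapsed)/float64(time.Millisecond))
	}
}

// fakeClock returns a clock that advances by stepMs milliseconds per call.
func fakeClock(stepMs int64) func() time.Time {
	t := time.Unix(0, 0)
	return func() time.Time {
		t = t.Add(time.Duration(stepMs) * time.Millisecond)
		return t
	}
}

func publish(now func() time.Time, record recordFunc) {
	// observe(...) is evaluated here; the returned func runs when publish returns.
	defer observe(now, record, "rmq_publish")()
	// ... the work being timed would go here ...
}

func main() {
	record := func(phase string, ms float64) {
		fmt.Printf("%s took %.0f ms\n", phase, ms)
	}
	publish(fakeClock(150), record) // prints "rmq_publish took 150 ms"
}
```

Writing `defer observe(...)` without the trailing () would defer the call that starts the timer and discard the done func it returns, so nothing would be recorded.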
func (*PhaseHistogram) Record ¶
func (p *PhaseHistogram) Record(ctx context.Context, phase string, duration time.Duration, extra ...attribute.KeyValue)
Record records the duration, in milliseconds, tagged with the given phase and any extra attributes. Zero and negative durations are accepted as-is; they may indicate a clock bug upstream, and handling them is left to the caller.
type Shutdown ¶
Shutdown is returned by Init and MUST be called during graceful shutdown to flush pending spans and metrics. It is safe to call multiple times; only the first call performs shutdown. It is safe to pass a nil context — it will be replaced by context.Background internally.
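The guarantees listed for Shutdown (idempotent, tolerant of a nil context) map naturally onto sync.Once. The sketch below assumes a signature of func(ctx context.Context) error, which this documentation does not show; shutdownFunc, newShutdown and countingFlush are illustrative names, and returning the first call's error on later calls is likewise an assumption.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// shutdownFunc is an assumed shape for the Shutdown type.
type shutdownFunc func(ctx context.Context) error

// newShutdown wraps flush so that only the first call performs work.
func newShutdown(flush func(ctx context.Context) error) shutdownFunc {
	var once sync.Once
	var err error
	return func(ctx context.Context) error {
		if ctx == nil {
			ctx = context.Background() // tolerate a nil context
		}
		once.Do(func() { err = flush(ctx) })
		return err // later calls return the first call's result
	}
}

// countingFlush returns a flush func that counts invocations and reports
// whether the context had already expired.
func countingFlush(calls *int) func(context.Context) error {
	return func(ctx context.Context) error {
		*calls++
		return ctx.Err()
	}
}

func main() {
	calls := 0
	shutdown := newShutdown(countingFlush(&calls))

	// The caller bounds how long flushing may take.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	err1 := shutdown(ctx)
	err2 := shutdown(ctx)
	fmt.Println(err1, err2, calls) // <nil> <nil> 1
}
```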
func Init ¶
Init initializes OTLP gRPC exporters for traces and metrics and registers global providers. Call it once per process; subsequent calls return the same Shutdown and error captured on the first call.
When cfg.Enabled is false (e.g. OTEL_ENABLED=false), Init returns a no-op Shutdown and no error, leaving the global providers as the default no-op.
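The "first call wins" contract can be sketched with sync.Once caching both return values. The telemetry type, its fields, and the simplified Init(enabled, endpoint) signature are hypothetical; the real Init takes a context and a Config, and its validation rules are only partly documented above.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// telemetry caches the outcome of the first Init call.
type telemetry struct {
	once     sync.Once
	setups   int // how many times setup actually ran
	shutdown func() error
	err      error
}

// Init runs setup at most once and always returns the first call's result.
func (t *telemetry) Init(enabled bool, endpoint string) (func() error, error) {
	t.once.Do(func() {
		t.setups++
		t.shutdown = func() error { return nil } // no-op unless replaced below
		if !enabled {
			return // disabled: no-op shutdown and no error
		}
		if endpoint == "" {
			t.err = errors.New("endpoint is required when enabled")
			return
		}
		// A real implementation would dial the collector here and install a
		// shutdown func that flushes pending spans and metrics.
	})
	return t.shutdown, t.err
}

func main() {
	var tel telemetry
	_, err1 := tel.Init(false, "")
	_, err2 := tel.Init(true, "") // ignored: the first call already decided
	fmt.Println(err1, err2, tel.setups)
}
```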