Documentation ¶
Overview ¶
Package metrics provides a lightweight in-process metrics registry with Prometheus text-format output. It requires zero external dependencies.
Instrumented points on the hot path:
- agentguard_checks_total — counter, by decision label
- agentguard_request_duration_ms — histogram, end-to-end /v1/check
- agentguard_policy_eval_duration_ms — histogram, Engine.Check only
- agentguard_audit_write_duration_ms — histogram, Logger.Log only
- agentguard_pending_approvals — gauge, current queue depth
Index ¶
- Constants
- Variables
- func AddAuditReplayEntries(n uint64)
- func ApprovalEvictedFor(reason string) uint64
- func DecSSESubscribers()
- func IncApprovalEvicted(reason string)
- func IncApprovalReplayMismatch()
- func IncAuditCorruptLine()
- func IncAuditRotation()
- func IncDecision(decision string)
- func IncLLMProxyBufferOverflow(provider string)
- func IncLLMProxyNonStreamingOverflow(provider string)
- func IncLLMProxyStreamsRejected()
- func IncNotifyDropped(notifier, reason string)
- func IncRateLimitBucketEvicted(scope string)
- func IncRateLimited()
- func IncRequestRejected(reason string)
- func IncSSEEventDropped(reason string)
- func IncSSESubscribers()
- func LLMProxyBufferOverflowFor(provider string) uint64
- func LLMProxyNonStreamingOverflowFor(provider string) uint64
- func LLMProxyStreamsActive() int64
- func LLMProxyStreamsRejectedTotal() uint64
- func MigrationStatusFor(from, to, status string) int64
- func NotifyDroppedFor(notifier, reason string) uint64
- func NotifyDroppedSnapshot() map[notifyDroppedKey]uint64
- func ObserveNotifyDispatch(notifier string, seconds float64)
- func RateLimitBucketEvictedFor(scope string) uint64
- func RequestRejectedSnapshot() map[string]uint64
- func SSEEventDroppedFor(reason string) uint64
- func SetAuditReplayDuration(d time.Duration)
- func SetLLMProxyStreamsActive(n int64)
- func SetMigrationStatus(from, to, status string, value int64)
- func SetNotifyQueueDepth(n int)
- func SetPendingApprovals(n int)
- func SetRateLimitBuckets(n int)
- func WritePrometheus(w io.Writer)
- type Histogram
Constants ¶
const (
ApprovalEvictedLRUResolved = "lru_resolved"
ApprovalEvictedQueueFull = "queue_full"
)
Well-known reason labels for IncApprovalEvicted. When the approval queue is at capacity, either an old resolved entry is dropped to make room (lru_resolved) or the request is refused with 503 because nothing was resolved (queue_full). Both paths increment this counter so operators can distinguish "we need a bigger queue" from "we need more approvers".
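The two eviction paths described above can be sketched as follows. This is a minimal illustration, not the package's queue: the `queue` and `entry` types, the `admit` method, and the per-reason map are all hypothetical names introduced for the example.

```go
package main

import "fmt"

const (
	evictedLRUResolved = "lru_resolved"
	evictedQueueFull   = "queue_full"
)

// entry and queue are hypothetical stand-ins for the real approval queue.
type entry struct {
	id       string
	resolved bool
}

type queue struct {
	capacity int
	entries  []entry           // oldest first
	evicted  map[string]uint64 // per-reason eviction counts
}

// admit adds e to the queue. At capacity it drops the oldest resolved entry
// to make room; if nothing is resolved it refuses, and the caller would
// answer 503. Both outcomes are counted under their own reason label.
func (q *queue) admit(e entry) bool {
	if len(q.entries) < q.capacity {
		q.entries = append(q.entries, e)
		return true
	}
	for i, old := range q.entries {
		if old.resolved {
			q.entries = append(q.entries[:i], q.entries[i+1:]...)
			q.entries = append(q.entries, e)
			q.evicted[evictedLRUResolved]++
			return true
		}
	}
	q.evicted[evictedQueueFull]++
	return false
}

func main() {
	q := &queue{capacity: 2, evicted: map[string]uint64{}}
	q.admit(entry{id: "a", resolved: true})
	q.admit(entry{id: "b"})
	okC := q.admit(entry{id: "c"}) // evicts the resolved entry "a"
	okD := q.admit(entry{id: "d"}) // nothing resolved left: refused
	fmt.Println(okC, okD, q.evicted)
	// prints: true false map[lru_resolved:1 queue_full:1]
}
```

A rising `queue_full` count with a flat `lru_resolved` count would point at slow approvers rather than an undersized queue.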
const (
MigrationStatusRan = "ran"
MigrationStatusSkipped = "skipped"
MigrationStatusFailed = "failed"
)
const (
NotifyDroppedQueueFull = "queue_full"
)
Well-known reason labels for IncNotifyDropped. Kept bounded so the Prometheus series cardinality stays predictable.
const (
RejectedBodyTooLarge = "body_too_large"
)
Well-known reason labels for IncRequestRejected. Other reasons are allowed but callers must keep the cardinality bounded.
const (
// SSEDroppedSlowConsumer labels a broadcast that was discarded because
// the per-subscriber channel was full (the subscriber isn't draining
// fast enough). This is the fail-fast drop in broadcastLocked's
// default case.
SSEDroppedSlowConsumer = "slow_consumer"
)
Variables ¶
var (
ChecksTotal uint64 // all /v1/check requests
AllowedTotal uint64
DeniedTotal uint64
ApprovalTotal uint64 // REQUIRE_APPROVAL decisions
RateLimitedTotal uint64 // rate-limit denies
)
var (
AuditReplayEntriesTotal uint64
AuditRotationsTotal uint64
)
Audit replay + rotation counters. Replay happens once at startup (seeding in-memory decision counters from the audit log); rotations happen inline on FileLogger.Log when the size threshold is crossed.
var (
RequestDuration = newHistogram(durationBuckets)
PolicyEvalDuration = newHistogram(durationBuckets)
AuditWriteDuration = newHistogram(durationBuckets)
)
Package-level histograms.
var ApprovalReplayMismatchTotal uint64
ApprovalReplayMismatchTotal counts /v1/check requests that carried an approval_id whose corresponding PendingAction.Request did not match the retry's operationally-meaningful fields (agent_id / scope / command / path / domain / url / action). Mismatches are NOT short-circuited to the cached decision — the request falls through to normal Engine.Check evaluation. This metric is the security signal: legitimate retries match shape and never increment it; a non-zero rate means either a buggy gateway is reusing ids across distinct actions or an attacker who learned an approved id is replaying it against unrelated commands.
See V05 audit B1 (R-Sec H1, R-Stub C3) for the underlying gating-bypass finding the validator closes.
var AuditCorruptLinesTotal uint64
AuditCorruptLinesTotal counts audit log lines that failed JSON parse during Query() and were skipped. Rare in practice — the usual cause is a crash between the write syscall and the newline flush, or disk corruption. Kept visible via /metrics so operators can spot silent audit-file degradation instead of discovering it when a query returns fewer entries than expected.
Functions ¶
func AddAuditReplayEntries ¶ added in v0.5.0
func AddAuditReplayEntries(n uint64)
AddAuditReplayEntries records the number of entries processed during replay. The counter is cumulative, so it would sum across replays if replay were ever re-entered; in the normal case of one replay per process it equals that replay's entry count.
func ApprovalEvictedFor ¶ added in v0.5.0
func ApprovalEvictedFor(reason string) uint64
ApprovalEvictedFor returns the count for a specific reason (for tests).
func DecSSESubscribers ¶ added in v0.5.0
func DecSSESubscribers()
DecSSESubscribers is the counterpart to IncSSESubscribers.
func IncApprovalEvicted ¶ added in v0.5.0
func IncApprovalEvicted(reason string)
IncApprovalEvicted increments agentguard_approvals_evicted_total{reason=...}. Cardinality is bounded to the ApprovalEvicted* constants above.
func IncApprovalReplayMismatch ¶ added in v0.5.0
func IncApprovalReplayMismatch()
IncApprovalReplayMismatch increments agentguard_approval_replay_mismatch_total. Called from pkg/proxy.handleCheck when the approval-id round-trip lookup hits an entry but the retry request's shape differs from the original.
func IncAuditCorruptLine ¶ added in v0.5.0
func IncAuditCorruptLine()
IncAuditCorruptLine bumps agentguard_audit_corrupt_lines_total.
func IncAuditRotation ¶ added in v0.5.0
func IncAuditRotation()
IncAuditRotation increments agentguard_audit_rotations_total. Called from the FileLogger rotation success path after the new live file is open.
func IncDecision ¶
func IncDecision(decision string)
IncDecision increments the appropriate decision counter.
func IncLLMProxyBufferOverflow ¶ added in v0.5.0
func IncLLMProxyBufferOverflow(provider string)
IncLLMProxyBufferOverflow increments agentguard_llmproxy_buffer_overflow_total{provider=...}. Provider MUST be "openai" or "anthropic" — the LLM proxy enforces that upstream so cardinality stays bounded.
func IncLLMProxyNonStreamingOverflow ¶ added in v0.5.0
func IncLLMProxyNonStreamingOverflow(provider string)
IncLLMProxyNonStreamingOverflow increments agentguard_llmproxy_non_streaming_overflow_total{provider=...}. Provider MUST be "openai" or "anthropic" — the LLM proxy enforces that upstream so cardinality stays bounded.
func IncLLMProxyStreamsRejected ¶ added in v0.5.0
func IncLLMProxyStreamsRejected()
IncLLMProxyStreamsRejected bumps agentguard_llmproxy_streams_rejected_total. Called once per streaming request that was refused with 503 because the global cap was already at MaxConcurrentStreams.
func IncNotifyDropped ¶ added in v0.5.0
func IncNotifyDropped(notifier, reason string)
IncNotifyDropped increments the labeled counter for a notification drop. notifier should be a bounded-cardinality notifier type ("webhook"/"slack"/"console"/"log"); reason should be a stable NotifyDropped* constant. Callers MUST NOT pass agent- or user-supplied strings here — that would explode Prometheus cardinality.
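A labeled counter of this kind is commonly kept as a map keyed by the label tuple. The sketch below assumes that layout; `dropKey`, `incDropped`, and `droppedSnapshot` are hypothetical names, and the real package may store its counts differently.

```go
package main

import (
	"fmt"
	"sync"
)

// dropKey is a hypothetical (notifier, reason) label pair.
type dropKey struct{ notifier, reason string }

var (
	dropMu  sync.Mutex
	dropped = map[dropKey]uint64{}
)

// incDropped bumps the series for one label pair. Each distinct pair creates
// a new series, which is why both labels must come from small fixed sets.
func incDropped(notifier, reason string) {
	dropMu.Lock()
	defer dropMu.Unlock()
	dropped[dropKey{notifier, reason}]++
}

// droppedSnapshot returns a copy so callers cannot mutate live counters.
func droppedSnapshot() map[dropKey]uint64 {
	dropMu.Lock()
	defer dropMu.Unlock()
	out := make(map[dropKey]uint64, len(dropped))
	for k, v := range dropped {
		out[k] = v
	}
	return out
}

func main() {
	incDropped("webhook", "queue_full")
	incDropped("webhook", "queue_full")
	incDropped("slack", "queue_full")

	snap := droppedSnapshot()
	snap[dropKey{"webhook", "queue_full"}] = 0 // only the copy changes

	fmt.Println(droppedSnapshot()[dropKey{"webhook", "queue_full"}]) // 2
	fmt.Println(len(snap))                                           // 2
}
```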
func IncRateLimitBucketEvicted ¶ added in v0.5.0
func IncRateLimitBucketEvicted(scope string)
IncRateLimitBucketEvicted increments agentguard_ratelimit_bucket_evictions_total{scope=...}. Cardinality is bounded by the set of policy scopes (typically < 20 across a deployment).
func IncRateLimited ¶
func IncRateLimited()
IncRateLimited increments the rate-limit-specific counter.
It used to also bump ChecksTotal/DeniedTotal, which double-counted rate-limited requests because logAndRespond unconditionally calls IncDecision("DENY") for the synthetic rate-limit DENY result. As of v0.5 the unified logAndRespond path owns ChecksTotal/DeniedTotal for every decision (including the synthetic rate-limit DENY); IncRateLimited only touches the rate-limit-specific series.
Closes R3 #21 (audit finding "rate-limited requests double-count ChecksTotal and DeniedTotal").
func IncRequestRejected ¶ added in v0.5.0
func IncRequestRejected(reason string)
IncRequestRejected increments agentguard_request_rejected_total{reason=...}.
func IncSSEEventDropped ¶ added in v0.5.0
func IncSSEEventDropped(reason string)
IncSSEEventDropped bumps the labeled counter for an SSE broadcast drop.
func IncSSESubscribers ¶ added in v0.5.0
func IncSSESubscribers()
IncSSESubscribers is called on Subscribe. The matching DecSSESubscribers call runs on Unsubscribe, which is always invoked from the SSE handler's defer, so the gauge stays accurate even if a client drops without the server side noticing.
func LLMProxyBufferOverflowFor ¶ added in v0.5.0
func LLMProxyBufferOverflowFor(provider string) uint64
LLMProxyBufferOverflowFor returns the current count (for tests).
func LLMProxyNonStreamingOverflowFor ¶ added in v0.5.0
func LLMProxyNonStreamingOverflowFor(provider string) uint64
LLMProxyNonStreamingOverflowFor returns the current count (for tests).
func LLMProxyStreamsActive ¶ added in v0.5.0
func LLMProxyStreamsActive() int64
LLMProxyStreamsActive returns the current active-streams gauge value (for tests).
func LLMProxyStreamsRejectedTotal ¶ added in v0.5.0
func LLMProxyStreamsRejectedTotal() uint64
LLMProxyStreamsRejectedTotal returns the rejected-streams counter (for tests).
func MigrationStatusFor ¶ added in v0.5.0
func MigrationStatusFor(from, to, status string) int64
MigrationStatusFor returns the gauge value for a (from, to, status) triple (for tests).
func NotifyDroppedFor ¶ added in v0.5.0
func NotifyDroppedFor(notifier, reason string) uint64
NotifyDroppedFor returns the count for a specific (notifier, reason) pair (for tests).
func NotifyDroppedSnapshot ¶ added in v0.5.0
func NotifyDroppedSnapshot() map[notifyDroppedKey]uint64
NotifyDroppedSnapshot returns a copy of the current counts (for tests).
func ObserveNotifyDispatch ¶ added in v0.5.0
func ObserveNotifyDispatch(notifier string, seconds float64)
ObserveNotifyDispatch records a dispatch latency in seconds for the named notifier type. A missing histogram is created lazily; cardinality is bounded to the notifierType() domain in pkg/notify.
func RateLimitBucketEvictedFor ¶ added in v0.5.0
func RateLimitBucketEvictedFor(scope string) uint64
RateLimitBucketEvictedFor returns the eviction count for a scope (for tests).
func RequestRejectedSnapshot ¶ added in v0.5.0
func RequestRejectedSnapshot() map[string]uint64
RequestRejectedSnapshot returns a copy of the current counts (for tests).
func SSEEventDroppedFor ¶ added in v0.5.0
func SSEEventDroppedFor(reason string) uint64
SSEEventDroppedFor returns the count for a specific reason (for tests).
func SetAuditReplayDuration ¶ added in v0.5.0
func SetAuditReplayDuration(d time.Duration)
SetAuditReplayDuration records the duration of the startup audit replay. The value is expressed in seconds in the Prometheus output; internally it is stored as nanoseconds in an atomic integer, so the setter is a single atomic store.
func SetLLMProxyStreamsActive ¶ added in v0.5.0
func SetLLMProxyStreamsActive(n int64)
SetLLMProxyStreamsActive updates the active-streams gauge. Called from the llmproxy server on every stream entry/exit (which atomically also updates the underlying server-side counter — this metric mirrors that counter). 0 is a valid value (no streams in flight).
func SetMigrationStatus ¶ added in v0.5.0
func SetMigrationStatus(from, to, status string, value int64)
SetMigrationStatus updates the migration-status gauge for a given (from, to, status) triple. Callers typically record one ran/skipped/failed value per migration per startup.
func SetNotifyQueueDepth ¶ added in v0.5.0
func SetNotifyQueueDepth(n int)
SetNotifyQueueDepth updates the notify dispatch queue depth gauge.
func SetPendingApprovals ¶
func SetPendingApprovals(n int)
SetPendingApprovals sets the current queue depth gauge.
func SetRateLimitBuckets ¶ added in v0.5.0
func SetRateLimitBuckets(n int)
SetRateLimitBuckets updates the rate-limit bucket gauge. Called from the /metrics handler with Limiter.BucketCount() so operators can see bucket growth without exporting the limiter internals.
func WritePrometheus ¶
func WritePrometheus(w io.Writer)
WritePrometheus writes all metrics to w in the Prometheus text exposition format (https://prometheus.io/docs/instrumenting/exposition_formats/).
Types ¶
type Histogram ¶
type Histogram struct {
// contains filtered or unexported fields
}
Histogram tracks a distribution using cumulative bucket counts. Each bucket counts observations with value ≤ the bucket bound, which is the Prometheus histogram convention.
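The cumulative-bucket convention can be illustrated with a small stand-in type. The `cumulativeHistogram` type, its fields, and its methods are assumptions made for this sketch; the real Histogram's fields are unexported and its method set is not listed here.

```go
package main

import (
	"fmt"
	"sync"
)

// cumulativeHistogram is a hypothetical stand-in for Histogram.
type cumulativeHistogram struct {
	mu     sync.Mutex
	bounds []float64 // upper bounds, ascending
	counts []uint64  // counts[i] = observations with value <= bounds[i]
	total  uint64    // all observations (the implicit +Inf bucket)
	sum    float64   // sum of observed values
}

func newCumulativeHistogram(bounds []float64) *cumulativeHistogram {
	return &cumulativeHistogram{bounds: bounds, counts: make([]uint64, len(bounds))}
}

// Observe increments every bucket whose bound is >= v, which keeps the
// counts cumulative without any work at export time.
func (h *cumulativeHistogram) Observe(v float64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for i, b := range h.bounds {
		if v <= b {
			h.counts[i]++
		}
	}
	h.total++
	h.sum += v
}

// Snapshot returns a copy of the bucket counts plus the total.
func (h *cumulativeHistogram) Snapshot() ([]uint64, uint64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	out := make([]uint64, len(h.counts))
	copy(out, h.counts)
	return out, h.total
}

func main() {
	h := newCumulativeHistogram([]float64{1, 5, 10})
	for _, v := range []float64{0.5, 3, 7, 20} {
		h.Observe(v)
	}
	counts, total := h.Snapshot()
	fmt.Println(counts, total) // [1 2 3] 4
}
```

The observation 20 exceeds every bound, so it appears only in the total, which corresponds to the `+Inf` bucket in Prometheus output.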