authevents

package
v1.62.2 Latest
Warning

This package is not in the latest version of its module.

Published: May 17, 2026 License: Apache-2.0 Imports: 11 Imported by: 0

Documentation

Overview

Package authevents provides durable audit history for the OAuth lifecycle of every connection — connect, refresh, rotation, revocation, and admin deletion — keyed on (connection_kind, connection_name).

Two failure modes motivated this package:

  • Before authevents, a revoked-refresh deletion left no trail: the row vanished from connection_oauth_tokens and an operator saw "Token: not yet acquired" with no way to distinguish "never connected" from "the IdP rejected our refresh ten minutes ago because the SSO session idled out." Now every delete is paired with a refresh_failed_revoked + token_deleted_revoked event so operators can correlate the symptom to a specific point in time.

  • Rotation-required IdPs (Microsoft Entra with rotation enforced, rotation-enabled Keycloak) silently lose access permanently if a rotated token fails to persist. authevents emits refresh_rotation_persistence_failed at ERROR level the moment the DB write fails, before any subsequent tool call exposes the dead connection.

Event types are a closed set (see the Type constants in event.go); callers cannot invent new types at runtime. Detail payloads are per-type JSON and must NEVER include access or refresh tokens, IdP response bodies, or human-readable error_description strings. Those strings vary per IdP and sometimes carry user identifiers, so only the RFC 6749 machine-readable `error` field is recorded.
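To illustrate the payload rule, a producer could decode only the `error` member of an RFC 6749 error response and discard everything else. This is a minimal sketch; `idpErrorCode` is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// idpErrorCode is a hypothetical helper: it decodes only the RFC 6749
// `error` member and ignores error_description and every other field,
// so no free-form IdP text can reach the detail payload.
func idpErrorCode(body []byte) string {
	var resp struct {
		Error string `json:"error"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "" // unparseable body: record no code rather than raw text
	}
	return resp.Error
}

func main() {
	body := []byte(`{"error":"invalid_grant","error_description":"session for alice@example.com idled out"}`)
	fmt.Println(idpErrorCode(body)) // the description never leaves this function
}
```

Decoding into a narrow struct, rather than a map, means a newly added IdP field cannot leak into the audit row by accident.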

The package is kind-agnostic: the same table records events for the MCP gateway (kind=mcp), the HTTP API gateway (kind=api), and any future kind that participates in the unified OAuth flow.

Constants

const (
	// SystemBackgroundRefresh is the actor recorded when the
	// connoauth refresher loop (running on a server replica with no
	// associated human) refreshes a token.
	SystemBackgroundRefresh = "system:background-refresh"
	// SystemToolCall is the actor recorded when an outbound toolkit
	// request triggers refresh-on-access-token-expiry. Distinct from
	// SystemBackgroundRefresh so operators can see whether the
	// keepalive is doing its job (frequent background events =
	// healthy) or whether refreshes are only happening reactively
	// (the bug the keepalive was added to fix).
	SystemToolCall = "system:tool-call"
)

Actor identifies who/what initiated the event. Operator-driven events carry the operator's email (or apikey:name); platform-driven events carry one of the synthetic constants above (SystemBackgroundRefresh for the refresher loop, SystemToolCall for refresh triggered by an outbound toolkit request). Distinguishing the two answers the operator's question: did a human click this, or did the platform keep things alive on its own?

Variables

var ErrInvalidEvent = errors.New("authevents: invalid event")

ErrInvalidEvent indicates the caller passed an event that fails validation (missing required field or unknown type). Wrapped so errors.Is works through the call chain.

Functions

This section is empty.

Types

type ConnectCompletedDetail

type ConnectCompletedDetail struct {
	Scope            string    `json:"scope,omitempty"`
	ExpiresAt        time.Time `json:"expires_at,omitzero"`
	RefreshExpiresAt time.Time `json:"refresh_expires_at,omitzero"`
	HasRefreshToken  bool      `json:"has_refresh_token"`
}

ConnectCompletedDetail is the detail payload for ConnectCompleted. Captured server-side after the token exchange succeeds.

type Event

type Event struct {
	// ID is the server-assigned row UUID. Empty on inserts; populated
	// by the store after the INSERT … RETURNING id.
	ID string `json:"id"`
	// OccurredAt is the wall-clock time of the event. Zero on inserts;
	// the store stamps NOW().
	OccurredAt time.Time `json:"occurred_at"`
	// Kind is the connection kind (mcp, api, future).
	Kind string `json:"connection_kind"`
	// Name is the connection name within the kind.
	Name string `json:"connection_name"`
	// Type is one of the declared TypeXxx constants. Required.
	Type Type `json:"event_type"`
	// Actor is who/what initiated the event. See SystemXxx constants
	// for synthetic actors; operator-driven events use the email or
	// apikey:name. Required (empty string is rejected at insert).
	Actor string `json:"actor"`
	// IDPHost is the host portion of the IdP token endpoint. Empty
	// when not relevant to the event (e.g., TypeTokenDeletedAdmin).
	IDPHost string `json:"idp_host,omitempty"`
	// Detail is per-Type JSON. May be nil for events with no extra
	// payload.
	//
	// swaggertype tag tells swag (used by make swagger) to treat the
	// field as a generic object; without it swag can't resolve
	// json.RawMessage and the generation step fails.
	Detail json.RawMessage `json:"detail,omitempty" swaggertype:"object"`
}

Event is one row in connection_auth_events. Detail is per-Type JSON; see the doc on each TypeXxx constant for the shape callers should produce.

JSON tags are snake_case to match the TypeScript ConnectionAuthEvent interface the portal History panel consumes. Without explicit tags Go marshals field names verbatim (ID/OccurredAt/...), the portal's `event.event_type` lookup returns undefined, and every row in the History panel renders empty.
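The effect of the tags can be seen by marshaling the same fields with and without them. The two structs below are trimmed, hypothetical copies of Event used only for this comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tagged mirrors two Event fields with the snake_case tags shown above.
type tagged struct {
	Kind string `json:"connection_kind"`
	Type string `json:"event_type"`
}

// untagged has the same fields without tags, to show the default names.
type untagged struct {
	Kind string
	Type string
}

// mustJSON marshals v and panics on error; both types here always encode.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(tagged{Kind: "mcp", Type: "refresh_succeeded"}))
	// {"connection_kind":"mcp","event_type":"refresh_succeeded"}
	fmt.Println(mustJSON(untagged{Kind: "mcp", Type: "refresh_succeeded"}))
	// {"Kind":"mcp","Type":"refresh_succeeded"}
}
```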

func (*Event) IsValid

func (e *Event) IsValid() bool

IsValid checks the required fields. Used by the store before insert.

type Filter

type Filter struct {
	// Kind, Name select events for a specific connection. Both must
	// be set (or both empty for cross-connection queries — only used
	// by tests and the prune job).
	Kind string
	Name string
	// Limit caps the result count. Required to be > 0; List returns
	// an error when the caller forgets it (the History panel passes
	// 30, the prune job doesn't list).
	Limit int
	// Since drops events older than Since. Zero disables.
	Since time.Time
}

Filter constrains a List query. All fields are AND-combined. Zero values of Kind, Name, and Since disable that condition; Limit is the exception and must be > 0.

type MemoryStore

type MemoryStore struct {
	// contains filtered or unexported fields
}

MemoryStore is an in-process Store for tests and for dev deployments without a database. Events DO NOT survive process restarts. Concurrency-safe.

List(...) returns newest-first by OccurredAt, with ties broken by a monotonic insertion counter (seq) so back-to-back inserts that produce identical OccurredAt values still yield a deterministic order. Identical-timestamp ties happen on hardware where time.Now()'s wall-clock granularity is coarser than the time between two consecutive Insert calls — common on Apple Silicon when tests run without -race.

PostgresStore (store_postgres.go) does NOT have a comparable tie-breaker — the table's primary key is a random UUID and List's query is "ORDER BY occurred_at DESC" with no secondary sort. The only production consumer of either backend that depends on this ordering is admin.lastRevocationFor, which scans the most recent few events for any revocation lead type and tolerates either order (both the lead and the trail of a revocation pair carry the same reason). Inside test code we still want determinism, hence the counter here.

func NewMemoryStore

func NewMemoryStore() *MemoryStore

NewMemoryStore returns an in-process Store. The Postgres store is the production choice; this exists so unit tests don't need a database AND so TestNoopOnlyInterfaces sees a real (non-noop) implementation alongside the Postgres one.

func (*MemoryStore) Insert

func (s *MemoryStore) Insert(_ context.Context, ev Event) error

Insert appends ev to the in-memory slice after validating.

func (*MemoryStore) List

func (s *MemoryStore) List(_ context.Context, f Filter) ([]Event, error)

List collects events matching f and returns them ordered occurred_at DESC, then caps to f.Limit. Sorting (rather than relying on insertion order) keeps the contract identical to PostgresStore.List even when callers pre-populate OccurredAt out of monotonic order.

func (*MemoryStore) Prune

func (s *MemoryStore) Prune(_ context.Context, cutoff time.Time) (int64, error)

Prune drops events older than cutoff. Returns the count removed.

type PostgresStore

type PostgresStore struct {
	// contains filtered or unexported fields
}

PostgresStore writes connection_auth_events rows via the supplied *sql.DB. The schema is migration 000040.

func NewPostgresStore

func NewPostgresStore(db *sql.DB) *PostgresStore

NewPostgresStore wires a Store to the platform's database.

func (*PostgresStore) Close

func (s *PostgresStore) Close() error

Close stops the prune goroutine and waits for it. Safe to call when StartPruneRoutine was never called.

func (*PostgresStore) Insert

func (s *PostgresStore) Insert(ctx context.Context, ev Event) error

Insert appends ev. RETURNING id populates ev.ID server-side so the caller can reference the row in subsequent logs.

func (*PostgresStore) List

func (s *PostgresStore) List(ctx context.Context, f Filter) ([]Event, error)

List runs a per-filter query against the (kind, name, occurred_at) index and decodes each row into an Event. ORDER BY occurred_at DESC follows the index ordering, so Postgres can read rows in index order without a separate sort step.

func (*PostgresStore) Prune

func (s *PostgresStore) Prune(ctx context.Context, cutoff time.Time) (int64, error)

Prune deletes rows whose occurred_at < cutoff. Returns the rowcount so the caller can log the daily prune size.

func (*PostgresStore) StartPruneRoutine

func (s *PostgresStore) StartPruneRoutine(interval, retention time.Duration)

StartPruneRoutine launches a background goroutine that calls Prune once every `interval` with cutoff = now() - retention. The goroutine is stopped by Close. interval is typically 24h in production; retention is typically 90d. The first prune fires after one interval so a freshly-started replica doesn't immediately churn the DB.

type RefreshDetail

type RefreshDetail struct {
	BeforeExpiresAt        time.Time `json:"before_expires_at,omitzero"`
	BeforeRefreshExpiresAt time.Time `json:"before_refresh_expires_at,omitzero"`
	AfterExpiresAt         time.Time `json:"after_expires_at,omitzero"`
	AfterRefreshExpiresAt  time.Time `json:"after_refresh_expires_at,omitzero"`
	RotatedRefresh         bool      `json:"rotated_refresh,omitempty"`
	DurationMS             int64     `json:"duration_ms,omitempty"`
	// IDPErrorCode is the RFC 6749 `error` field from the IdP's error
	// body (e.g., "invalid_grant"). Empty on success. NEVER carries
	// error_description — that's per-IdP text that can leak user IDs.
	IDPErrorCode string `json:"idp_error_code,omitempty"`
	// ErrorClass distinguishes transient from revoked from
	// rotation-persistence-failure when reading the detail blob in
	// isolation (without the row's Type).
	ErrorClass string `json:"error_class,omitempty"`
}

RefreshDetail is the detail shape for refresh_succeeded / refresh_failed_transient / refresh_failed_revoked. Same shape across the three so dashboards can diff before/after the same way.

type Store

type Store interface {
	// Insert appends an event. The store stamps OccurredAt and ID;
	// callers should not pre-populate them. Returns ErrInvalidEvent
	// for events with missing required fields or unknown types.
	Insert(ctx context.Context, ev Event) error
	// List returns the most recent events matching f, ordered
	// occurred_at DESC (newest first). Capped by f.Limit.
	List(ctx context.Context, f Filter) ([]Event, error)
	// Prune deletes events older than cutoff. Returns the count of
	// deleted rows so the caller can log it. The prune job runs this
	// daily with cutoff = now - retention.
	Prune(ctx context.Context, cutoff time.Time) (int64, error)
}

Store persists and queries connection_auth_events rows. The Postgres implementation lives in store_postgres.go; an in-memory implementation lives in store_memory.go for tests and for dev deployments that lack a database. Both satisfy this interface.

Distinct from connoauth.Store deliberately: this package's writes are append-only audit history, while connoauth.Store is the authoritative token-row state. Mixing them would conflate two very different update cadences (every refresh tick writes here; only the IdP can mutate connoauth rows).

type Type

type Type string

Type names the event category. Closed set — callers must use the declared constants; arbitrary strings are rejected by the store. Ordering of constants is intentional: declaration order matches the typical timeline of a connection's lifecycle, which makes diffs against this file readable when new events are added.

const (
	// TypeConnectStarted records the operator hitting Connect on the
	// portal. Emitted by the admin oauth-start handler immediately
	// before issuing the PKCE state. Paired with a later
	// TypeConnectCompleted on success, or no follow-up event when the
	// operator abandons the browser flow.
	TypeConnectStarted Type = "connect_started"
	// TypeConnectCompleted records a successful authorization_code
	// exchange (the IdP returned a token pair and the row was
	// persisted). Emitted by the admin callback handler.
	TypeConnectCompleted Type = "connect_completed"
	// TypeRefreshSucceeded records a successful refresh-token grant.
	// Emitted by the background refresher and by the toolkit-side
	// refresh paths (gateway/oauth.go, apigateway/auth.go) on success.
	TypeRefreshSucceeded Type = "refresh_succeeded"
	// TypeRefreshFailedTransient records a network/5xx/ctx-cancel
	// failure during refresh. The row is NOT deleted; the next attempt
	// can succeed. Emitted at WARN level by both refresh paths.
	TypeRefreshFailedTransient Type = "refresh_failed_transient"
	// TypeRefreshFailedRevoked records that the IdP was called and
	// returned a definitive rejection (RFC 6749 §5.2 invalid_grant @
	// HTTP 400). Paired with a subsequent TypeTokenDeletedRevoked
	// event. Reserved strictly for IdP-returned rejections — the
	// locally-decided cases use TypeRefreshSkippedExpired and
	// TypeRefreshSkippedNoToken below so the History panel does not
	// falsely attribute the verdict to the upstream IdP.
	TypeRefreshFailedRevoked Type = "refresh_failed_revoked"
	// TypeRefreshSkippedNoToken records that the refresh attempt was
	// aborted before any network call because no refresh_token was
	// persisted. Paired with a subsequent TypeTokenDeletedRevoked
	// event. Distinct from TypeRefreshFailedRevoked so the History
	// panel can show that the IdP was NOT contacted on this tick.
	TypeRefreshSkippedNoToken Type = "refresh_skipped_no_token"
	// TypeRefreshSkippedExpired records that the refresh attempt was
	// aborted before any network call because the IdP-disclosed
	// refresh deadline (refresh_expires_in from the most recent
	// successful refresh response) had already passed. Paired with a
	// subsequent TypeTokenDeletedRevoked event. Distinct from
	// TypeRefreshFailedRevoked so the History panel does not falsely
	// claim the IdP returned an error code on this tick.
	TypeRefreshSkippedExpired Type = "refresh_skipped_expired"
	// TypeRefreshRotationPersistenceFailed records the most serious
	// failure class: the IdP issued a rotated token pair (the old
	// refresh token is therefore invalid the instant the new one is
	// minted) but persisting the new pair to the store failed. The
	// connection is now permanently broken until reconnect. Emitted
	// at ERROR level.
	TypeRefreshRotationPersistenceFailed Type = "refresh_rotation_persistence_failed"
	// TypeTokenDeletedRevoked records the automatic deletion of a
	// token row following a TypeRefreshFailedRevoked,
	// TypeRefreshSkippedNoToken, or TypeRefreshSkippedExpired lead
	// event. Distinct from TypeTokenDeletedAdmin so operator-initiated
	// and automatic deletions are visually distinguishable in the
	// History panel.
	TypeTokenDeletedRevoked Type = "token_deleted_revoked"
	// TypeTokenDeletedAdmin records an operator deleting the
	// connection (or otherwise clearing its token row) through the
	// admin API. Emitted by the connection delete handler.
	TypeTokenDeletedAdmin Type = "token_deleted_admin"
)

func (Type) IsValid

func (t Type) IsValid() bool

IsValid reports whether t is one of the declared event types. The store rejects events with unknown types so a misconfigured caller cannot smuggle arbitrary strings into the history.
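A closed-set check of this kind can be written as a lookup against the declared values. The sketch below uses a local `eventType` and a hypothetical `isKnown` method; it mirrors the ten constants listed above but is not the package's own implementation.

```go
package main

import "fmt"

type eventType string

// known lists the ten declared event type values from this page.
var known = map[eventType]struct{}{
	"connect_started":                     {},
	"connect_completed":                   {},
	"refresh_succeeded":                   {},
	"refresh_failed_transient":            {},
	"refresh_failed_revoked":              {},
	"refresh_skipped_no_token":            {},
	"refresh_skipped_expired":             {},
	"refresh_rotation_persistence_failed": {},
	"token_deleted_revoked":               {},
	"token_deleted_admin":                 {},
}

// isKnown reports whether t is in the closed set. Anything else,
// including the empty string, is rejected.
func (t eventType) isKnown() bool {
	_, ok := known[t]
	return ok
}

func main() {
	fmt.Println(eventType("refresh_succeeded").isKnown())
	fmt.Println(eventType("refresh_maybe").isKnown())
}
```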

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer is the small, opinionated surface most callers want — emit a typed event without assembling an Event struct or worrying about a missing store. Wraps Store with structured logging and nil-safety so the producing code paths can call e.g. w.Refreshed(ctx, key, ...) inline without polluting their logic with NullStore checks.

A nil Writer is valid and emits nothing — callers that don't have a configured store (memory-only dev, tests) can pass nil and the rest of the code path is unaffected.
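Nil-safety of this kind relies on Go allowing method calls on a nil pointer receiver. The sketch below uses hypothetical `writer`, `store`, and `countingStore` types to show the pattern; it is not the package's Writer.

```go
package main

import "fmt"

// store is a pared-down, hypothetical stand-in for the Store interface.
type store interface {
	Insert(eventType string) error
}

type writer struct {
	store store
}

// Emit returns early on a nil receiver or a missing store, so callers
// never need to guard the call site.
func (w *writer) Emit(eventType string) {
	if w == nil || w.store == nil {
		return
	}
	_ = w.store.Insert(eventType) // errors are handled elsewhere in a real writer
}

// countingStore records how many inserts it received.
type countingStore struct{ n int }

func (c *countingStore) Insert(string) error { c.n++; return nil }

func main() {
	var none *writer
	none.Emit("refresh_succeeded") // no panic, no effect

	cs := &countingStore{}
	w := &writer{store: cs}
	w.Emit("refresh_succeeded")
	fmt.Println(cs.n)
}
```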

func NewWriter

func NewWriter(store Store, logger *slog.Logger) *Writer

NewWriter wraps store. Pass logger=nil to use the global default.

func (*Writer) ConnectCompleted

func (w *Writer) ConnectCompleted(ctx context.Context, kind, name, actor, tokenURL string, d ConnectCompletedDetail)

ConnectCompleted records a successful authorization_code exchange.

func (*Writer) ConnectStarted

func (w *Writer) ConnectStarted(ctx context.Context, kind, name, actor, tokenURL, returnURL string)

ConnectStarted records the operator initiating the authorization flow. tokenURL is recorded as idp_host so the History panel surfaces which IdP the operator hit.
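Deriving idp_host from a token URL could look like the sketch below. `idpHost` is a hypothetical helper, and keeping the port (url.URL.Host rather than Hostname) is an assumption about what "host portion" means here.

```go
package main

import (
	"fmt"
	"net/url"
)

// idpHost returns only the host of the token endpoint. The path and
// query string are dropped, so tenant IDs or parameters embedded in
// the URL do not end up in the audit row.
func idpHost(tokenURL string) string {
	u, err := url.Parse(tokenURL)
	if err != nil {
		return "" // unparseable URL: record no host
	}
	return u.Host
}

func main() {
	fmt.Println(idpHost("https://login.example.com/tenant-123/oauth2/v2.0/token"))
	fmt.Println(idpHost("https://idp.example.com:8443/realms/main/token?x=1"))
}
```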

func (*Writer) Emit

func (w *Writer) Emit(ctx context.Context, ev Event)

Emit is the low-level entry point that the typed helpers delegate to. Callers should prefer the typed helpers; Emit exists for events whose detail shape doesn't fit one of the helpers (currently TypeConnectStarted with arbitrary detail).

Emit returns no error: durability of the audit log is best-effort. A persist failure is logged at WARN but does NOT propagate — the caller's normal flow (refresh succeeded / row deleted / etc.) must not fail because the audit row couldn't be written.

func (*Writer) RefreshFailedRevoked

func (w *Writer) RefreshFailedRevoked(ctx context.Context, kind, name, actor, tokenURL string, d RefreshDetail)

RefreshFailedRevoked records a definitive refresh rejection. Paired with a subsequent TokenDeletedRevoked.

func (*Writer) RefreshFailedTransient

func (w *Writer) RefreshFailedTransient(ctx context.Context, kind, name, actor, tokenURL string, d RefreshDetail)

RefreshFailedTransient records a transient refresh failure (network / 5xx / ctx cancel). The row is NOT deleted; the caller's retry path can run.

func (*Writer) RefreshSkippedExpired

func (w *Writer) RefreshSkippedExpired(ctx context.Context, kind, name, actor, tokenURL string)

RefreshSkippedExpired records a refresh attempt aborted because the IdP-disclosed refresh deadline had already passed.

func (*Writer) RefreshSkippedNoToken

func (w *Writer) RefreshSkippedNoToken(ctx context.Context, kind, name, actor, tokenURL string)

RefreshSkippedNoToken records a refresh attempt with no refresh_token persisted.

func (*Writer) RefreshSucceeded

func (w *Writer) RefreshSucceeded(ctx context.Context, kind, name, actor, tokenURL string, d RefreshDetail)

RefreshSucceeded records a successful refresh-token grant.

func (*Writer) RotationPersistenceFailed

func (w *Writer) RotationPersistenceFailed(ctx context.Context, kind, name, actor, tokenURL, persistError string)

RotationPersistenceFailed records the most serious failure class: the IdP issued a rotated token pair (old refresh is now invalid) but persisting the new pair failed. The caller MUST also emit an ERROR-level slog line so operators see the page.

func (*Writer) TokenDeletedAdmin

func (w *Writer) TokenDeletedAdmin(ctx context.Context, kind, name, actor string)

TokenDeletedAdmin records an operator deleting the connection or otherwise clearing its token row.

func (*Writer) TokenDeletedRevoked

func (w *Writer) TokenDeletedRevoked(ctx context.Context, kind, name, actor, tokenURL, reason string)

TokenDeletedRevoked records the auto-deletion of a token row after a revoked-refresh signal from the IdP.
