etag

package
v1.6.0
Published: May 7, 2026 License: MIT Imports: 22 Imported by: 0

Documentation

Overview

Package etag implements a reverse-engineered version of GitHub's ETag algorithm and a conditional-request HTTP transport that uses it.

The algorithm was originally reverse-engineered by bored-engineer:

https://github.com/bored-engineer/github-conditional-http-transport
https://www.bored-engineer.com/posts/github-etag-algorithm/

GitHub's server-side ETag hash includes the Authorization header, so a plain store-and-forward ETag cache falls over under rotating auth (GitHub App installation tokens refresh hourly; fine-grained PATs rotate on a schedule). The precompute trick reproduces that hash client-side at request time using the current Authorization header, so the cached body stays valid across rotations and 304 hit rates stay high.

This package ships:

  • ComputeExpectedETag, NormaliseETag, ParseVary: low-level helpers for callers composing their own transport.
  • Cache: a three-method interface (Get/Add/Remove) any backend can implement. The default NewLRUCache is memory-bounded and in-process.
  • NewTransport: an http.RoundTripper that does the hit/miss/304/write-invalidation dance around any Cache implementation.

Drift-detector tuning. The detector is calibrated for steady GitHub-App traffic. Internal thresholds (how often the transport probes after the cooldown elapses, and how many consecutive successful probes are needed to recover) are private and may change. With the current values, a transport handling fewer than roughly 100 cacheable requests after the cooldown window elapses will not complete recovery to precompute mode. This is a deliberate trade-off against probe-induced load on high-traffic deployments.

Security invariant: no log line emitted from this package may include req.Header or resp.Header as a structured field. The Authorization header value is a live credential. Only specific scalar fields (lengths, status codes, path templates, event kinds) are ever logged.

drift.go implements transparent ETag drift detection and fallback. Precompute mode is the default; if the client-side hash diverges from the server-issued ETag on driftThreshold validations inside driftWindow, the Transport silently switches to passive mode (sends the server's stored ETag verbatim). After driftCooldown, sampled probe-back requests retry precompute; consecutive successes restore the precompute path. Drift transitions surface as KindDriftDetected / KindDriftRecovered events on WithEventCallback; the read-only Stats() snapshot also exposes live drift state plus per-Outcome counters.

Constants

This section is empty.

Variables

var (
	// ErrKeyScopeRequired is returned by NewTransport when a caller-supplied
	// Cache is used without a WithKeyScope option. Required to prevent
	// cross-tenant body leaks when multiple identities share one cache.
	ErrKeyScopeRequired = errors.New("etag: WithKeyScope is required when WithCache supplies a Cache; see package docs")

	// ErrDoubleWrap is returned when NewTransport is called with a base that
	// is already an *etag.Transport. Nesting the transport is never intended.
	ErrDoubleWrap = errors.New("etag: base is already an *etag.Transport; do not double-wrap")

	// ErrBaseTransportType is returned when NewTransport is called with a
	// base that is not nil and not an *http.Transport. The gzip invariant
	// requires DisableCompression=true on an *http.Transport, which we
	// cannot set on an arbitrary wrapper.
	ErrBaseTransportType = errors.New("etag: base transport must be *http.Transport so DisableCompression can be set")

	// ErrNilCache is returned when WithCache is called with a nil Cache. If
	// you want the default LRU, omit WithCache entirely.
	ErrNilCache = errors.New("etag: WithCache was called with nil; omit the option to use the default LRU")

	// ErrConflictingScope is returned when WithKeyScope and WithAutoKeyScope
	// are both set on the same Transport. Pick one.
	ErrConflictingScope = errors.New("etag: WithKeyScope and WithAutoKeyScope are mutually exclusive")

	// ErrEmptyScope is returned (wrapped) from RoundTrip when a
	// WithAutoKeyScope fn returns an empty string with a nil error. The
	// scope is contractually non-empty; the caller's fn must either
	// return a non-empty scope or a non-nil error.
	ErrEmptyScope = errors.New("etag: WithAutoKeyScope fn returned an empty scope with a nil error")
)

Sentinel errors returned by NewTransport, plus ErrEmptyScope, which RoundTrip returns wrapped. Callers can match all of them with errors.Is.
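A minimal sketch of the errors.Is pattern these sentinels are meant for. The import path of this package is not shown here, so the snippet redeclares two of the sentinels as local stand-ins; wrap and classify are hypothetical helpers, and wrap only imitates the %w-style wrapping that RoundTrip is described as applying.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for two of the sentinels listed above, redeclared so the
// sketch compiles on its own. Real code would use the package's variables.
var (
	ErrKeyScopeRequired = errors.New("etag: WithKeyScope is required when WithCache supplies a Cache; see package docs")
	ErrEmptyScope       = errors.New("etag: WithAutoKeyScope fn returned an empty scope with a nil error")
)

// wrap imitates a layer adding context with %w (hypothetical helper).
func wrap(err error) error {
	return fmt.Errorf("round trip failed: %w", err)
}

// classify labels an error. errors.Is walks the %w chain, so a wrapped
// sentinel still matches.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrKeyScopeRequired):
		return "config: missing key scope"
	case errors.Is(err, ErrEmptyScope):
		return "runtime: scope fn returned an empty scope"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify(wrap(ErrEmptyScope)))
	fmt.Println(classify(ErrKeyScopeRequired))
	fmt.Println(classify(errors.New("boom")))
}
```

Comparing with == would miss the wrapped ErrEmptyScope case, which is why errors.Is is the safer default.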

Functions

func ComputeExpectedETag

func ComputeExpectedETag(reqHeaders http.Header, respVary []string, body []byte) string

ComputeExpectedETag is the single source of truth for the client-side hash computation.

func Hash

func Hash(reqHeaders http.Header, vary []string) hash.Hash

Hash returns a SHA256 hasher pre-loaded with the VaryHeaders values in declaration order, each suffixed with ':'. Callers Write the raw (pre-compression) response body and then Sum to obtain the ETag bytes.

The vary argument is a FILTER: when non-nil, a header is only mixed into the hash when vary contains its name. When vary is nil, all three canonical headers are used. This matches GitHub's server-side behaviour: the server hashes over a fixed header set regardless of what it advertises in Vary.
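The shape described above can be sketched with the standard library alone. This is an illustration of the documented behaviour, not the package's source: sketchHash and sketchIfNoneMatch are hypothetical names, the canonical header list is passed in by the caller because it is not spelled out here, and two details are assumptions (a filtered-out header is skipped entirely, and an absent header contributes an empty value before its ':').

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"net/http"
	"strings"
)

// contains reports whether list holds name, ignoring case.
func contains(list []string, name string) bool {
	for _, v := range list {
		if strings.EqualFold(v, name) {
			return true
		}
	}
	return false
}

// sketchHash pre-loads a SHA-256 hasher with each selected header value
// followed by ':'. A non-nil vary acts as a filter over canonical.
// ASSUMPTION: filtered-out headers are skipped entirely, and absent
// headers contribute an empty value.
func sketchHash(req http.Header, canonical, vary []string) hash.Hash {
	h := sha256.New()
	for _, name := range canonical {
		if vary != nil && !contains(vary, name) {
			continue
		}
		h.Write([]byte(req.Get(name)))
		h.Write([]byte{':'})
	}
	return h
}

// sketchIfNoneMatch writes the raw body and returns the strong-quoted hex form.
func sketchIfNoneMatch(req http.Header, canonical, vary []string, body []byte) string {
	h := sketchHash(req, canonical, vary)
	h.Write(body)
	return `"` + hex.EncodeToString(h.Sum(nil)) + `"`
}

func main() {
	canonical := []string{"Authorization", "Cookie"} // hypothetical list
	req := http.Header{}
	req.Set("Authorization", "token example")
	fmt.Println(sketchIfNoneMatch(req, canonical, nil, []byte(`{"ok":true}`)))
}
```

Because the Authorization value is part of the input, rotating the token changes the result even when the body is unchanged, which is the property the precompute path relies on.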

func NewTransport

func NewTransport(base http.RoundTripper, opts ...Option) (http.RoundTripper, error)

NewTransport returns a Transport wrapping base. When base is nil, a cloned http.DefaultTransport with DisableCompression=true is used. Returns an error when:

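  • base is a non-nil http.RoundTripper that is not an *http.Transport (ErrBaseTransportType; the library cannot set DisableCompression on an arbitrary wrapper).
  • base is already a *Transport (ErrDoubleWrap).
  • WithCache was called with a nil Cache (ErrNilCache).
  • a caller-supplied Cache was provided via WithCache without a non-empty WithKeyScope or a WithAutoKeyScope (ErrKeyScopeRequired; cross-tenant safety requirement).
  • WithKeyScope and WithAutoKeyScope are both set (ErrConflictingScope).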

func NormaliseETag

func NormaliseETag(e string) string

NormaliseETag strips the W/ weak-marker prefix and surrounding quotes so two ETag strings can be byte-compared. Callers building If-None-Match values construct the strong-quoted form from the raw hex hash; this function is for comparisons only.
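As a sketch of the comparison described above, assuming nothing beyond the stated behaviour (strip the W/ prefix, strip surrounding quotes). normalise and strongQuote are hypothetical local names, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// normalise strips a leading weak marker and surrounding double quotes so
// two ETag strings can be compared byte for byte.
func normalise(e string) string {
	e = strings.TrimPrefix(e, "W/")
	return strings.Trim(e, `"`)
}

// strongQuote builds the strong-quoted form from a raw hex hash, as a
// caller constructing If-None-Match would.
func strongQuote(hexHash string) string {
	return `"` + hexHash + `"`
}

func main() {
	server := `W/"0123abcd"`              // weak form, as a server might send it
	client := strongQuote("0123abcd")     // strong form built client-side
	fmt.Println(normalise(server) == normalise(client))
}
```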

func ParseVary

func ParseVary(h http.Header) []string

ParseVary MUST only be called on 200-response headers. GitHub omits Vary on 304 responses (RFC 7232 section 4.1 expects a 304 to repeat the Vary header the 200 would have carried, but GitHub does not), so calling ParseVary on a 304 would silently fall back to the canonical list and lose any endpoint-specific Vary the original 200 carried. The transport's 304 branch does NOT call this function.

Server order is preserved; we do not sort. The server hashes in its iteration order, and any client-side reordering diverges.
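An order-preserving Vary parse might look like the sketch below. parseVarySketch is a hypothetical stand-in written against the description above: it keeps the server's order, does not sort, and returns a copy of a caller-supplied fallback list when no Vary header is present (the fallback parameter is an assumption standing in for the canonical list).

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parseVarySketch splits every Vary header line on commas, trims
// whitespace, and keeps names in the order the server sent them.
// When nothing is found it returns a copy of fallback.
func parseVarySketch(h http.Header, fallback []string) []string {
	var out []string
	for _, line := range h.Values("Vary") {
		for _, name := range strings.Split(line, ",") {
			if name = strings.TrimSpace(name); name != "" {
				out = append(out, name)
			}
		}
	}
	if len(out) == 0 {
		return append([]string(nil), fallback...)
	}
	return out
}

func main() {
	h := http.Header{}
	h.Add("Vary", "Accept, Authorization")
	h.Add("Vary", "Cookie")
	fmt.Println(parseVarySketch(h, nil))
}
```

The fallback branch is exactly what makes a call on a 304 dangerous: a missing header is indistinguishable from "use the canonical list".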

func VaryHeaders

func VaryHeaders() []string

VaryHeaders returns a fresh copy of the canonical Vary header list. Mutating the returned slice does not affect internal state.

Types

type Cache

type Cache interface {
	Get(ctx context.Context, key string) (Entry, bool, error)
	Add(ctx context.Context, key string, e Entry) error
	Remove(ctx context.Context, key string) error
}

Cache is the minimal interface any backend must implement. Implementors must be safe for concurrent use. Methods take a context so network-backed backends can honour deadlines and cancellation; the default in-memory LRU ignores it. Add overwrites; on error the transport treats the response as uncached. Get returning (zero, false, err) is treated as a miss. Remove is idempotent.
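A custom backend only needs the three methods. The sketch below is a deliberately simple, unbounded map-backed implementation plus a small contract check (Add overwrites, Remove is idempotent). Entry and Cache are redeclared locally, copied from the declarations in this document, so the file compiles without the package; mapCache and checkContract are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// Entry and Cache are local copies of the documented declarations.
type Entry struct {
	ETag     string
	Body     []byte
	Headers  http.Header
	StoredAt time.Time
}

type Cache interface {
	Get(ctx context.Context, key string) (Entry, bool, error)
	Add(ctx context.Context, key string, e Entry) error
	Remove(ctx context.Context, key string) error
}

// mapCache is an unbounded in-memory backend guarded by a RWMutex.
// It ignores ctx because nothing here can block on I/O.
type mapCache struct {
	mu sync.RWMutex
	m  map[string]Entry
}

func newMapCache() *mapCache { return &mapCache{m: make(map[string]Entry)} }

func (c *mapCache) Get(_ context.Context, key string) (Entry, bool, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.m[key]
	return e, ok, nil
}

func (c *mapCache) Add(_ context.Context, key string, e Entry) error {
	if e.StoredAt.IsZero() {
		e.StoredAt = time.Now()
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key] = e // overwrite semantics
	return nil
}

func (c *mapCache) Remove(_ context.Context, key string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.m, key) // deleting a missing key is a no-op
	return nil
}

var _ Cache = (*mapCache)(nil) // compile-time interface check

// checkContract exercises the documented semantics against any backend.
func checkContract(c Cache) error {
	ctx := context.Background()
	for _, tag := range []string{`"v1"`, `"v2"`} {
		if err := c.Add(ctx, "k", Entry{ETag: tag}); err != nil {
			return err
		}
	}
	e, ok, err := c.Get(ctx, "k")
	if err != nil {
		return err
	}
	if !ok || e.ETag != `"v2"` {
		return errors.New("Add did not overwrite")
	}
	for i := 0; i < 2; i++ { // second Remove must also succeed
		if err := c.Remove(ctx, "k"); err != nil {
			return err
		}
	}
	if _, ok, _ := c.Get(ctx, "k"); ok {
		return errors.New("entry survived Remove")
	}
	return nil
}

func main() {
	if err := checkContract(newMapCache()); err != nil {
		fmt.Println("contract violated:", err)
		return
	}
	fmt.Println("contract holds")
}
```

A network-backed implementation would honour ctx in each method; the same checkContract helper could then serve as a quick test for it.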

func NewLRUCache

func NewLRUCache(size int) Cache

NewLRUCache returns the default in-process Cache: a bounded LRU with no TTL-based eviction. size <= 0 uses 4096. The returned Cache is safe for concurrent use and spawns no background goroutines: hashicorp/golang-lru/v2 starts a reaper only when ttl > 0; we pass 0 to turn that off.

The eviction callback fires on count-cap eviction and explicit Remove, not on same-key overwrite (upstream MoveToFront). It runs synchronously in the calling goroutine while c.mu is held and must not re-lock.

type DriftEvent

type DriftEvent struct {
	DetectedAt time.Time
	Recovered  bool
}

DriftEvent fires on each drift state transition. Recovered=false on detection; Recovered=true on probe-back recovery. Drift state is per-Transport: if you build multiple *Transport instances in one process, the callback fires per Transport. Read Stats() for the current truth at any time.

DetectedAt is the time of the transition. For Recovered=false events this is when drift was first observed; for Recovered=true events this is when recovery was confirmed.

type Entry

type Entry struct {
	ETag     string
	Body     []byte
	Headers  http.Header
	StoredAt time.Time
}

Entry is what the transport caches: the server's ETag, the full body as last read, and the response headers. StoredAt is populated by NewLRUCache; custom backends may populate it too, or leave it zero if freshness tracking isn't useful.

Use named-field struct literals so future field additions remain non-breaking.

type Event added in v1.6.0

type Event struct {
	Kind            Kind
	URL             *url.URL      // nil on KindDriftDetected, KindDriftRecovered
	PathTemplate    string        // normalised path; see algo.go
	Status          int           // 0 when no resp
	BodyLen         int           // KindStore only
	Age             time.Duration // KindHit only
	Err             error         // KindGetError, KindStoreError, KindRemoveError
	GitHubRequestID string        // X-GitHub-Request-Id; empty when no resp
	DriftEvent      DriftEvent    // KindDriftDetected, KindDriftRecovered
}

Event is the per-call payload delivered to a WithEventCallback handler. Fields are populated when meaningful for the Kind; zero values are valid otherwise. URL is the caller's pointer (captured before the transport's internal request clone); the callback must not mutate it.

KindHit fires at cache lookup, before the wire 304 round-trip; the served-from-cache outcome is signalled to the caller via cond.HeaderCacheStatus on the synthesised response. A single tripping mismatch fires KindMismatch and then KindDriftDetected. With the pages package, the callback fires once per page in a paginated traversal.

Panics in the callback are not recovered; they propagate through RoundTrip.

type Kind added in v1.6.0

type Kind string

Kind discriminates Event values. Bare names match the slog "kind" attribute on etag_event records 1:1; drift kinds drop the "etag_" prefix that the etag_drift_* slog kind attributes carry.

const (
	KindGetError        Kind = "get_error"
	KindHit             Kind = "hit"
	KindMiss            Kind = "miss"
	KindBypassOversize  Kind = "bypass_oversize"
	KindBypassNoncache  Kind = "bypass_noncacheable"
	KindNoEtagHeader    Kind = "no_etag_header"
	KindValidatedOK     Kind = "validated_ok"
	KindMismatch        Kind = "mismatch"
	KindStoreError      Kind = "store_error"
	KindStore           Kind = "store"
	KindRemoveError     Kind = "remove_error"
	KindInvalidatedGone Kind = "invalidated_gone"
	KindDriftDetected   Kind = "drift_detected"
	KindDriftRecovered  Kind = "drift_recovered"
)

type Option

type Option interface {
	// contains filtered or unexported methods
}

Option configures a Transport. The interface form (rather than a bare `func(*config)`) lets us evolve the API without a breaking change: new option shapes can introduce richer concrete types that still satisfy Option.

func WithAutoKeyScope added in v1.3.0

func WithAutoKeyScope(fn func(*http.Request) (string, error)) Option

WithAutoKeyScope derives the cache-key scope per request via fn(req). Use it when one *http.Client serves multiple tenants. Mutually exclusive with WithKeyScope: combining both yields ErrConflictingScope. Either option satisfies the caller-supplied-Cache scope requirement.

fn must be safe for concurrent use and must return either a non-empty scope string with a nil error, or an empty string with a non-nil error. A non-nil error from fn is wrapped and returned from RoundTrip; an empty scope with a nil error surfaces from RoundTrip as a wrapped ErrEmptyScope.
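A scope function that honours this contract could be sketched as follows. The way the tenant is carried (a value in the request context under a private key) is an assumption chosen for illustration, and tenantKey, withTenant, tenantScope and newRequest are hypothetical names. The scope is a plain tenant identifier, never a credential.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// tenantKey is a private context key type (hypothetical).
type tenantKey struct{}

// withTenant attaches a tenant identifier to ctx.
func withTenant(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, tenantKey{}, id)
}

// tenantScope has the signature WithAutoKeyScope expects. It returns either
// a non-empty scope with a nil error, or an empty scope with a non-nil
// error; it never returns ("", nil). It only reads from the request, so it
// is safe for concurrent use.
func tenantScope(req *http.Request) (string, error) {
	id, _ := req.Context().Value(tenantKey{}).(string)
	if id == "" {
		return "", errors.New("tenantScope: no tenant in request context")
	}
	return "tenant:" + id, nil
}

// newRequest builds a request for the demo; an empty tenant leaves the
// context untouched.
func newRequest(tenant string) *http.Request {
	ctx := context.Background()
	if tenant != "" {
		ctx = withTenant(ctx, tenant)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.github.com/rate_limit", nil)
	if err != nil {
		panic(err)
	}
	return req
}

func main() {
	fmt.Println(tenantScope(newRequest("acme")))
	fmt.Println(tenantScope(newRequest("")))
}
```

Returning an explicit error for the missing-tenant case keeps the failure attributable to the caller's function instead of surfacing as ErrEmptyScope.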

func WithCache

func WithCache(c Cache) Option

WithCache supplies the storage backend. When omitted, the Transport uses NewLRUCache(4096). If a Cache is supplied, WithKeyScope is REQUIRED to prevent cross-tenant body leaks (different callers writing to the same URL under different auth).

Passing nil marks the option as caller-set but with a nil value; NewTransport rejects this with ErrNilCache. If you want the default LRU, omit the option entirely.

func WithEventCallback added in v1.6.0

func WithEventCallback(cb func(ctx context.Context, evt Event)) Option

WithEventCallback registers a callback fired for every cache decision, validation outcome, store/invalidation, and drift transition. Drift transitions arrive as KindDriftDetected / KindDriftRecovered events with the full DriftEvent payload on evt.DriftEvent.

The callback runs synchronously inside RoundTrip and may be invoked concurrently from many goroutines. It must be fast, non-blocking, and panic-free; panics propagate up through RoundTrip.
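One way to satisfy the "fast, non-blocking, concurrency-safe" requirement is to do nothing in the callback except bump atomic counters. The sketch below redeclares trimmed-down Kind and Event types locally so it compiles alone; kindCounter and simulate are hypothetical, and simulate merely stands in for concurrent RoundTrip calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// Kind and Event are trimmed local copies of the documented types.
type Kind string

const (
	KindHit  Kind = "hit"
	KindMiss Kind = "miss"
)

type Event struct {
	Kind   Kind
	Status int
}

// kindCounter tallies events without locks, allocation, or I/O.
type kindCounter struct {
	hits, misses, other atomic.Int64
}

// callback matches the func(ctx, evt) shape WithEventCallback takes.
func (c *kindCounter) callback(_ context.Context, evt Event) {
	switch evt.Kind {
	case KindHit:
		c.hits.Add(1)
	case KindMiss:
		c.misses.Add(1)
	default:
		c.other.Add(1)
	}
}

// simulate fires n events from n goroutines; every fourth one is a miss.
func simulate(c *kindCounter, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			k := KindHit
			if i%4 == 0 {
				k = KindMiss
			}
			c.callback(context.Background(), Event{Kind: k})
		}(i)
	}
	wg.Wait()
}

func main() {
	c := &kindCounter{}
	simulate(c, 100)
	fmt.Println("hits:", c.hits.Load(), "misses:", c.misses.Load(), "other:", c.other.Load())
}
```

Anything slower, such as writing to a metrics backend, is better done by a separate goroutine that reads these counters periodically.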

func WithKeyScope

func WithKeyScope(scope string) Option

WithKeyScope namespaces cache entries. The scope string is hashed with SHA256 into the cache key, so two callers sharing a Cache with different scopes never collide. Scopes are treated as opaque: do NOT embed secrets in the scope value.

An empty scope is treated identically to omitting the option; if WithCache is also present, NewTransport fails with ErrKeyScopeRequired.
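To illustrate why hashing the scope into the key separates tenants, here is a sketch of a scoped key derivation. The actual key layout used by the transport is not documented here; scopedKey, its inputs, and the length-prefixed encoding are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// scopedKey derives a cache key from scope, method, and URL.
// ASSUMPTION: this layout is illustrative only. Each part is
// length-prefixed so ("ab", "c") and ("a", "bc") cannot collide.
func scopedKey(scope, method, rawURL string) string {
	h := sha256.New()
	for _, part := range []string{scope, method, rawURL} {
		fmt.Fprintf(h, "%d:%s", len(part), part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	u := "https://api.github.com/repos/octocat/hello-world"
	fmt.Println(scopedKey("tenant:a", "GET", u) == scopedKey("tenant:b", "GET", u))
	fmt.Println(scopedKey("tenant:a", "GET", u) == scopedKey("tenant:a", "GET", u))
}
```

Since the scope is only hashed, not encrypted, it offers namespacing and nothing more, which is consistent with the advice not to embed secrets in it.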

func WithLogger

func WithLogger(l *slog.Logger) Option

WithLogger supplies the slog.Logger the transport emits events to. Pass nil (or omit the option) to silence the package; a nil logger is replaced with slog.New(slog.DiscardHandler) at construction.

func WithMaxBodyBytes

func WithMaxBodyBytes(n int) Option

WithMaxBodyBytes caps the per-entry body size the transport will buffer and cache. Responses exceeding this cap pass through to the caller uncached. Values below the 8 KiB internal floor are accepted: the initial buffer allocation stays at the floor, while the caller-supplied value still acts as the cap. Default: 4 MiB.
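A common way to implement "buffer up to a cap, otherwise pass through" is to read one byte past the cap and, on overflow, stitch the already-read bytes back in front of the remaining stream. The sketch below shows that technique; bufferUpTo and deliver are hypothetical helpers, and this is not claimed to be how the transport does it internally.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// bufferUpTo reads at most max bytes into memory. If the body fits, it
// returns the bytes and a nil passthrough. If the body is larger, it
// returns a reader that replays the bytes already consumed followed by
// the rest of r, so the caller still sees the full body.
func bufferUpTo(r io.Reader, max int) (buf []byte, passthrough io.Reader, err error) {
	buf, err = io.ReadAll(io.LimitReader(r, int64(max)+1)) // one extra byte detects overflow
	if err != nil {
		return nil, nil, err
	}
	if len(buf) <= max {
		return buf, nil, nil
	}
	return nil, io.MultiReader(bytes.NewReader(buf), r), nil
}

// deliver reports whether the body was small enough to cache and what the
// caller ends up reading either way.
func deliver(s string, max int) (cacheable bool, seen string) {
	buf, rest, err := bufferUpTo(strings.NewReader(s), max)
	if err != nil {
		return false, ""
	}
	if rest != nil {
		all, _ := io.ReadAll(rest)
		return false, string(all)
	}
	return true, string(buf)
}

func main() {
	fmt.Println(deliver("hello", 5))
	fmt.Println(deliver("hello!", 5))
}
```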

func WithMaxCacheBytes

func WithMaxCacheBytes(n int64) Option

WithMaxCacheBytes caps the total byte budget held by the default NewLRUCache. Exceeding the budget evicts oldest entries. Has no effect when a caller-supplied Cache is used; custom backends enforce their own budgets. Default: 256 MiB.

type Stats

type Stats struct {
	Degraded        bool
	DegradedAt      time.Time // zero when not degraded
	TotalMismatches int64     // monotonic over Transport lifetime

	// TotalHits counts cache lookups that matched (transport.go:218 site).
	TotalHits int64
	// TotalMisses counts cache lookups that missed (transport.go:231 site).
	TotalMisses int64
	// TotalStores counts wire-200 entries written to cache. Includes
	// re-validated stores: a 200 whose ETag matched precompute also
	// reaches cache.Add, so this is "stores that succeeded against the
	// cache backend", not "stores of new entries".
	TotalStores int64
	// TotalBypasses aggregates uncached pass-throughs: bypass_oversize,
	// bypass_noncacheable, and the two no_etag_header sites.
	TotalBypasses int64
}

Stats is the read-only snapshot returned by (*Transport).Stats. Suitable for /healthz probes, Prometheus gauges, or polling dashboards. The four Total* per-Outcome counters added in v1.6.0 are populated lock-free from atomic counters; suitable for hit-rate metrics without enabling DEBUG-level slog ingestion. {Degraded, DegradedAt} remain mutex-guarded for snapshot consistency under concurrent transitions.
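For hit-rate metrics, two snapshots are usually more useful than one, since the counters accumulate over the Transport's lifetime. The sketch below uses a local copy of the Stats struct so it compiles alone; hitRate and windowHitRate are hypothetical helpers, and treating hits plus misses as the lookup total is an assumption based on the field comments.

```go
package main

import (
	"fmt"
	"time"
)

// Stats is a local copy of the documented struct.
type Stats struct {
	Degraded        bool
	DegradedAt      time.Time
	TotalMismatches int64
	TotalHits       int64
	TotalMisses     int64
	TotalStores     int64
	TotalBypasses   int64
}

// hitRate returns hits / (hits + misses), or 0 when there were no lookups.
func hitRate(s Stats) float64 {
	lookups := s.TotalHits + s.TotalMisses
	if lookups == 0 {
		return 0
	}
	return float64(s.TotalHits) / float64(lookups)
}

// windowHitRate computes the hit rate between two snapshots of the same
// Transport, prev taken before cur.
func windowHitRate(prev, cur Stats) float64 {
	return hitRate(Stats{
		TotalHits:   cur.TotalHits - prev.TotalHits,
		TotalMisses: cur.TotalMisses - prev.TotalMisses,
	})
}

func main() {
	prev := Stats{TotalHits: 10, TotalMisses: 10}
	cur := Stats{TotalHits: 19, TotalMisses: 11}
	fmt.Printf("lifetime %.2f, window %.2f\n", hitRate(cur), windowHitRate(prev, cur))
}
```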

type Transport

type Transport struct {
	// contains filtered or unexported fields
}

Transport is an http.RoundTripper that adds If-None-Match on cacheable GET/HEAD requests and replays the cached body as a synthesised 200 when the server answers with 304 Not Modified.

Transport runs precompute-mode by default: the If-None-Match value is computed from the cached body and the CURRENT request headers, so cached entries stay useful across Authorization rotation (GitHub App installation tokens, rotating fine-grained PATs).

If algorithm drift is detected (10 precompute/server-ETag mismatches inside a 60-second window), the transport transparently switches to passive mode and sends the server's stored ETag verbatim. After a 1-hour cooldown, sampled probe-back requests retry precompute; consecutive successes restore precompute mode automatically. Passive mode never replays caller credentials in If-None-Match: only the server's previously issued opaque ETag is sent.

Use WithEventCallback for per-call lifecycle events (cache decisions, validation, store, drift transitions). Stats() exposes the four atomic per-Outcome counters and the drift-state pair.
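The detection rule described for Transport (10 mismatches inside a 60-second window) is a sliding-window threshold. The sketch below shows one way to express that rule; it is not the package's drift.go. driftWindow, newDriftWindow and feed are hypothetical names, the type is not safe for concurrent use, and cooldown and probe-back recovery are left out.

```go
package main

import (
	"fmt"
	"time"
)

// driftWindow trips once threshold mismatches fall inside window.
type driftWindow struct {
	threshold int
	window    time.Duration
	stamps    []time.Time // mismatch times, oldest first
}

func newDriftWindow(threshold, windowSeconds int) *driftWindow {
	return &driftWindow{
		threshold: threshold,
		window:    time.Duration(windowSeconds) * time.Second,
	}
}

// observe records a mismatch at now, discards mismatches that are a full
// window old or older, and reports whether the threshold has been reached.
func (d *driftWindow) observe(now time.Time) bool {
	cutoff := now.Add(-d.window)
	i := 0
	for i < len(d.stamps) && !d.stamps[i].After(cutoff) {
		i++
	}
	d.stamps = append(d.stamps[i:], now)
	return len(d.stamps) >= d.threshold
}

// feed replays n mismatches spaced spacingSeconds apart and reports
// whether any of them tripped the detector.
func feed(d *driftWindow, spacingSeconds, n int) bool {
	start := time.Unix(0, 0)
	tripped := false
	for i := 0; i < n; i++ {
		at := start.Add(time.Duration(i*spacingSeconds) * time.Second)
		if d.observe(at) {
			tripped = true
		}
	}
	return tripped
}

func main() {
	fmt.Println("burst of 10, 1s apart:", feed(newDriftWindow(10, 60), 1, 10))
	fmt.Println("20 mismatches, 10s apart:", feed(newDriftWindow(10, 60), 10, 20))
}
```

A burst trips the detector, while the same or a larger number of mismatches spread thinly does not, which is the distinction between algorithm drift and occasional noise.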

func (*Transport) RoundTrip

func (t *Transport) RoundTrip(req *http.Request) (*http.Response, error)

RoundTrip implements http.RoundTripper.

func (*Transport) Stats

func (t *Transport) Stats() Stats

Stats returns a snapshot of the Transport's drift detector state and per-Outcome counters. Safe to call from any goroutine.

Holds driftMu for the {Degraded, DegradedAt} pair so the snapshot is internally consistent under concurrent transitions. The hot path (buildIfNoneMatch) intentionally stays lock-free; it can briefly see a transient state and at worst sends one extra probe.
