etag

package
v1.0.0
Published: Apr 25, 2026 License: MIT Imports: 20 Imported by: 0

Documentation

Overview

Package etag implements GitHub's reverse-engineered ETag algorithm and a conditional-request HTTP transport that uses it.

The algorithm was originally reverse-engineered by bored-engineer:

https://github.com/bored-engineer/github-conditional-http-transport
https://www.bored-engineer.com/posts/github-etag-algorithm/

GitHub's server-side ETag hash includes the Authorization header, so a plain store-and-forward ETag cache falls over under rotating auth (GitHub App installation tokens refresh hourly; fine-grained PATs rotate on a schedule). The precompute trick reproduces that hash client-side at request time using the current Authorization header, so the cached body stays valid across rotations and 304 hit rates stay high.

This package ships:

  • ComputeExpectedETag, NormaliseETag, ParseVary: low-level helpers for callers composing their own transport.
  • Cache: a three-method interface (Get/Add/Remove) any backend can implement. The default NewLRUCache is memory-bounded and in-process.
  • NewTransport: an http.RoundTripper that does the hit/miss/304/write-invalidation dance around any Cache implementation.

Security invariant: no log line emitted from this package may include req.Header or resp.Header as a structured field. The Authorization header value is a live credential. Only specific scalar fields (lengths, status codes, path templates, event kinds) are ever logged.

Constants

This section is empty.

Variables

var (
	// ErrKeyScopeRequired is returned by NewTransport when a caller-supplied
	// Cache is used without a WithKeyScope option. Required to prevent
	// cross-tenant body leaks when multiple identities share one cache.
	ErrKeyScopeRequired = errors.New("etag: WithKeyScope is required when WithCache supplies a Cache; see package docs")

	// ErrDoubleWrap is returned when NewTransport is called with a base that
	// is already an *etag.Transport. Nesting the transport is never intended.
	ErrDoubleWrap = errors.New("etag: base is already an *etag.Transport; do not double-wrap")

	// ErrBaseTransportType is returned when NewTransport is called with a
	// base that is not nil and not an *http.Transport. The gzip invariant
	// requires DisableCompression=true on an *http.Transport, which we
	// cannot set on an arbitrary wrapper.
	ErrBaseTransportType = errors.New("etag: base transport must be *http.Transport so DisableCompression can be set")

	// ErrNilCache is returned when WithCache is called with a nil Cache. If
	// you want the default LRU, omit WithCache entirely.
	ErrNilCache = errors.New("etag: WithCache was called with nil; omit the option to use the default LRU")
)

Sentinel errors returned by NewTransport, exported so callers can match them with errors.Is.

Functions

func ComputeExpectedETag

func ComputeExpectedETag(reqHeaders http.Header, respVary []string, body []byte) string

ComputeExpectedETag is the single source of truth for the client-side hash computation.

func Hash

func Hash(reqHeaders http.Header, vary []string) hash.Hash

Hash returns a SHA256 hasher pre-loaded with the VaryHeaders values in declaration order, each suffixed with ':'. Callers Write the raw (pre-compression) response body and then Sum to obtain the ETag bytes.

The vary argument is a FILTER: when non-nil, a header is only mixed into the hash when vary contains its name. When vary is nil, all three canonical headers are used. This matches GitHub's server-side behaviour: the server hashes over a fixed header set regardless of what it advertises in Vary.
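The header-prefix-then-body construction described above can be sketched with the standard library alone. This is an illustration, not the package's source: the three-entry canonical list, the handling of absent headers (an empty value still followed by ':'), and the names sketchHash and sketchETag are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"net/http"
)

// canonical is an ASSUMED header list for illustration only; the real list
// is whatever VaryHeaders returns.
var canonical = []string{"Accept", "Authorization", "Cookie"}

// sketchHash returns a SHA-256 hasher pre-loaded with each selected header
// value followed by ':'. A nil vary selects every canonical header; a
// non-nil vary acts as a filter on that list.
func sketchHash(req http.Header, vary []string) hash.Hash {
	allowed := map[string]bool{}
	for _, name := range vary {
		allowed[http.CanonicalHeaderKey(name)] = true
	}
	h := sha256.New()
	for _, name := range canonical {
		if vary != nil && !allowed[name] {
			continue // filtered out by vary
		}
		// ASSUMPTION: an absent header contributes an empty value.
		h.Write([]byte(req.Get(name)))
		h.Write([]byte{':'})
	}
	return h
}

// sketchETag writes the raw body after the header prefix and hex-encodes
// the digest.
func sketchETag(req http.Header, vary []string, body []byte) string {
	h := sketchHash(req, vary)
	h.Write(body)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	body := []byte(`{"id":1}`)
	old := http.Header{"Authorization": {"token old"}}
	cur := http.Header{"Authorization": {"token new"}}
	// The same body hashes differently once the credential rotates.
	fmt.Println(sketchETag(old, nil, body) != sketchETag(cur, nil, body)) // true
}
```

Because the credential is part of the hash input, recomputing with the current request headers is what keeps a cached body usable after a token rotation.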

func NewTransport

func NewTransport(base http.RoundTripper, opts ...Option) (http.RoundTripper, error)

NewTransport returns a Transport wrapping base. When base is nil, a cloned http.DefaultTransport with DisableCompression=true is used. Returns an error when:

  • base is a non-nil http.RoundTripper that is not an *http.Transport (ErrBaseTransportType: the library cannot set DisableCompression on an arbitrary wrapper).
  • base is already a *Transport (ErrDoubleWrap).
  • a caller-supplied Cache was provided via WithCache without a non-empty WithKeyScope (ErrKeyScopeRequired: cross-tenant safety requirement).
  • WithCache was called with a nil Cache (ErrNilCache).
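The base-transport rules can be illustrated with a small standard-library sketch. It is not the package's code: prepareBase, errBaseType and wrapper are hypothetical names, and cloning a caller-supplied *http.Transport (rather than mutating it) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errBaseType is a local stand-in for the package's ErrBaseTransportType.
var errBaseType = errors.New("sketch: base must be *http.Transport")

// prepareBase follows the documented rules: nil means "clone the default
// transport", an *http.Transport is accepted, anything else is rejected.
func prepareBase(base http.RoundTripper) (*http.Transport, error) {
	var tr *http.Transport
	switch b := base.(type) {
	case nil:
		tr = http.DefaultTransport.(*http.Transport).Clone()
	case *http.Transport:
		tr = b.Clone() // ASSUMPTION: clone instead of mutating the caller's value
	default:
		return nil, errBaseType
	}
	// The hash is computed over the raw body, so transparent gzip is off.
	tr.DisableCompression = true
	return tr, nil
}

// wrapper is an arbitrary RoundTripper that hides its inner transport.
type wrapper struct{ inner http.RoundTripper }

func (w wrapper) RoundTrip(r *http.Request) (*http.Response, error) {
	return w.inner.RoundTrip(r)
}

func main() {
	tr, err := prepareBase(nil)
	fmt.Println(tr.DisableCompression, err) // true <nil>

	_, err = prepareBase(wrapper{inner: http.DefaultTransport})
	fmt.Println(errors.Is(err, errBaseType)) // true
}
```

Callers who need their own middleware can therefore place it outside the etag transport rather than underneath it.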

func NormaliseETag

func NormaliseETag(e string) string

NormaliseETag strips the W/ weak-marker prefix and surrounding quotes so two ETag strings can be byte-compared. Callers building If-None-Match values construct the strong-quoted form from the raw hex hash; this function is for comparisons only.
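A minimal sketch of the comparison described above, using only the strings package. The function names normalise and strongQuoted are hypothetical, and trimming surrounding whitespace is an assumption the documentation does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// normalise strips the weak marker and surrounding quotes so two ETag
// strings can be compared byte for byte.
func normalise(e string) string {
	e = strings.TrimSpace(e) // ASSUMPTION: tolerate stray whitespace
	e = strings.TrimPrefix(e, "W/")
	return strings.Trim(e, `"`)
}

// strongQuoted builds an If-None-Match value from a raw hex hash.
func strongQuoted(hexHash string) string { return `"` + hexHash + `"` }

func main() {
	fmt.Println(normalise(`W/"abc123"`) == normalise(`"abc123"`)) // true
	fmt.Println(strongQuoted("abc123"))                           // "abc123"
}
```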

func ParseVary

func ParseVary(h http.Header) []string

ParseVary MUST only be called on 200-response headers. GitHub omits Vary on 304 responses (RFC 7232 section 4.1 asks servers to repeat it, but clients cannot rely on that), so calling ParseVary on a 304 would silently fall back to the canonical list and lose any endpoint-specific Vary the original 200 carried. The transport's 304 branch does NOT call this function.

Server order is preserved; we do not sort. The server hashes in its iteration order, and any client-side reordering diverges.
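An order-preserving parse with a canonical fallback might look like the sketch below. The fallback list, the canonicalisation of header names, and the name parseVary are assumptions for illustration; only the "keep server order, fall back when Vary is absent" behaviour comes from the text above.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// fallback is an ASSUMED canonical list used when Vary is absent.
var fallback = []string{"Accept", "Authorization", "Cookie"}

// parseVary returns the Vary names in the order the server sent them.
func parseVary(h http.Header) []string {
	var out []string
	for _, line := range h.Values("Vary") {
		for _, name := range strings.Split(line, ",") {
			if name = strings.TrimSpace(name); name != "" {
				out = append(out, http.CanonicalHeaderKey(name))
			}
		}
	}
	if len(out) == 0 {
		// No Vary at all: return a copy of the fallback list.
		return append([]string(nil), fallback...)
	}
	return out // deliberately not sorted
}

func main() {
	ok := http.Header{"Vary": {"Authorization, Accept"}}
	fmt.Println(parseVary(ok)) // [Authorization Accept]

	// A 304 without Vary silently yields the fallback, which is why the
	// function should only see 200-response headers.
	notModified := http.Header{}
	fmt.Println(parseVary(notModified)) // [Accept Authorization Cookie]
}
```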

func VaryHeaders

func VaryHeaders() []string

VaryHeaders returns a fresh copy of the canonical Vary header list. Mutating the returned slice does not affect internal state.

Types

type Cache

type Cache interface {
	Get(ctx context.Context, key string) (Entry, bool, error)
	Add(ctx context.Context, key string, e Entry) error
	Remove(ctx context.Context, key string) error
}

Cache is the minimal interface any backend must implement. Implementors must be safe for concurrent use. Methods take a context so network-backed backends can honour deadlines and cancellation; the default in-memory LRU ignores it. Add overwrites; on error the transport treats the response as uncached. Get returning (zero, false, err) is treated as a miss. Remove is idempotent.
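To make the contract concrete, here is a sketch of a custom backend. Entry and Cache are redeclared locally (copied from this page) so the example compiles on its own; mapCache and demo are hypothetical names, and the unbounded map is for illustration only.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// Entry and Cache are local copies of the documented types.
type Entry struct {
	ETag     string
	Body     []byte
	Headers  http.Header
	StoredAt time.Time
}

type Cache interface {
	Get(ctx context.Context, key string) (Entry, bool, error)
	Add(ctx context.Context, key string, e Entry) error
	Remove(ctx context.Context, key string) error
}

// mapCache is a hypothetical unbounded backend guarded by a mutex.
type mapCache struct {
	mu sync.RWMutex
	m  map[string]Entry
}

func newMapCache() *mapCache { return &mapCache{m: make(map[string]Entry)} }

func (c *mapCache) Get(ctx context.Context, key string) (Entry, bool, error) {
	if err := ctx.Err(); err != nil {
		return Entry{}, false, err // (zero, false, err) is treated as a miss
	}
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.m[key]
	return e, ok, nil
}

func (c *mapCache) Add(ctx context.Context, key string, e Entry) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key] = e // Add overwrites any previous entry
	return nil
}

func (c *mapCache) Remove(ctx context.Context, key string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.m, key) // deleting a missing key is a no-op, so Remove is idempotent
	return nil
}

var _ Cache = (*mapCache)(nil) // compile-time interface check

// demo exercises Add, Get and a repeated Remove.
func demo() (string, bool, bool) {
	ctx := context.Background()
	var c Cache = newMapCache()
	_ = c.Add(ctx, "k", Entry{ETag: "abc", Body: []byte("{}"), StoredAt: time.Now()})
	e, hit, _ := c.Get(ctx, "k")
	_ = c.Remove(ctx, "k")
	_ = c.Remove(ctx, "k") // second call is harmless
	_, again, _ := c.Get(ctx, "k")
	return e.ETag, hit, again
}

func main() {
	fmt.Println(demo()) // abc true false
}
```

A network-backed implementation would pass ctx through to its client instead of only checking ctx.Err().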

func NewLRUCache

func NewLRUCache(size int) Cache

NewLRUCache returns the default in-process Cache: a bounded LRU with no TTL-based eviction. size <= 0 uses 4096. The returned Cache is safe for concurrent use and spawns no background goroutines: hashicorp/golang-lru/v2 starts a reaper only when ttl > 0; we pass 0 to turn that off.

type DriftEvent

type DriftEvent struct {
	DetectedAt time.Time
	Recovered  bool
}

DriftEvent fires on each drift state transition. Recovered=false on detection; Recovered=true on probe-back recovery. Drift state is per-Transport: if you build multiple *Transport instances in one process, the callback fires per Transport. Read Stats() for the current truth at any time.

DetectedAt is the time of the transition. For Recovered=false events this is when drift was first observed; for Recovered=true events this is when recovery was confirmed.

type Entry

type Entry struct {
	ETag     string
	Body     []byte
	Headers  http.Header
	StoredAt time.Time
}

Entry is what the transport caches: the server's ETag, the full body as last read, and the response headers. StoredAt is populated by NewLRUCache; custom backends may populate it too, or leave it zero if freshness tracking isn't useful.

Use named-field struct literals so future field additions remain non-breaking.

type Option

type Option interface {
	// contains filtered or unexported methods
}

Option configures a Transport. The interface form (rather than a bare `func(*config)`) lets us evolve the API without a breaking change: new option shapes can introduce richer concrete types that still satisfy Option.
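The design choice can be sketched as follows. Everything here (config, option, optionFunc, scopeOption, withMaxBody, withScope, build) is a hypothetical stand-in; the package's unexported method set is not shown on this page.

```go
package main

import "fmt"

// config is a stand-in for the transport's private configuration.
type config struct {
	maxBody int
	scope   string
}

// option is sealed by an unexported method, so only this package can
// define implementations.
type option interface{ apply(*config) }

// optionFunc adapts a plain function: the simplest concrete shape.
type optionFunc func(*config)

func (f optionFunc) apply(c *config) { f(c) }

// scopeOption is a richer concrete type that still satisfies option;
// introducing such a type later does not break existing callers.
type scopeOption struct{ scope string }

func (o scopeOption) apply(c *config) { c.scope = o.scope }

func withMaxBody(n int) option { return optionFunc(func(c *config) { c.maxBody = n }) }
func withScope(s string) option { return scopeOption{scope: s} }

// build applies options over defaults.
func build(opts ...option) config {
	c := config{maxBody: 4 << 20} // default mirrors the documented 4 MiB
	for _, o := range opts {
		o.apply(&c)
	}
	return c
}

func main() {
	c := build(withScope("tenant-a"))
	fmt.Println(c.maxBody, c.scope) // 4194304 tenant-a
}
```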

func WithCache

func WithCache(c Cache) Option

WithCache supplies the storage backend. When omitted, the Transport uses NewLRUCache(4096). If a Cache is supplied, WithKeyScope is REQUIRED to prevent cross-tenant body leaks (different callers writing to the same URL under different auth).

Passing nil marks the option as caller-set but with a nil value; NewTransport rejects this with ErrNilCache. If you want the default LRU, omit the option entirely.

func WithDriftDetected

func WithDriftDetected(cb func(DriftEvent)) Option

WithDriftDetected registers a callback fired on each drift state transition: Recovered=false on detection, Recovered=true on probe-back recovery. Without this option the transport still detects drift and degrades transparently; only the user-visible signal is omitted.

The callback runs synchronously inside RoundTrip; keep it fast and non-blocking. Panics are contained by a recover guard so a misbehaving callback cannot crash the transport.
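A recover guard of the kind described can be sketched like this. DriftEvent is redeclared locally, safeNotify is a hypothetical name, and the buffered-channel hand-off is one possible way for a caller to keep the callback non-blocking, not something the package prescribes.

```go
package main

import (
	"fmt"
	"time"
)

// DriftEvent is a local copy of the documented type.
type DriftEvent struct {
	DetectedAt time.Time
	Recovered  bool
}

// safeNotify invokes cb and reports whether it returned normally.
func safeNotify(cb func(DriftEvent), ev DriftEvent) (ok bool) {
	if cb == nil {
		return true // no callback registered
	}
	defer func() {
		if r := recover(); r != nil {
			ok = false // contain the panic; the request continues
		}
	}()
	cb(ev)
	return true
}

func main() {
	ev := DriftEvent{DetectedAt: time.Now()}

	// A misbehaving callback cannot crash the caller.
	fmt.Println(safeNotify(func(DriftEvent) { panic("boom") }, ev)) // false

	// Hand slow work to a buffered channel instead of blocking.
	events := make(chan DriftEvent, 1)
	ok := safeNotify(func(e DriftEvent) {
		select {
		case events <- e:
		default: // drop rather than block
		}
	}, ev)
	fmt.Println(ok, len(events)) // true 1
}
```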

func WithKeyScope

func WithKeyScope(scope string) Option

WithKeyScope namespaces cache entries. The scope string is hashed with SHA-256 into the cache key, so two callers sharing a Cache with different scopes never collide. The scope is an opaque label, not a credential: do NOT embed secrets in the scope value.
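One way to fold a scope into a cache key is sketched below. The exact key layout the package uses is not documented here, so the field order, the NUL separators and the name scopedKey are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// scopedKey is a hypothetical key derivation: hash scope, method and URL,
// each terminated by a NUL byte so no field can bleed into its neighbour.
func scopedKey(scope, method, url string) string {
	h := sha256.New()
	for _, part := range []string{scope, method, url} {
		h.Write([]byte(part))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	const u = "https://api.github.com/repos/o/r"
	a := scopedKey("installation-1", "GET", u)
	b := scopedKey("installation-2", "GET", u)
	// Same URL, different scopes: the keys do not collide.
	fmt.Println(a != b, len(a)) // true 64
}
```

A stable, non-secret identifier such as an installation ID or tenant name is a reasonable scope; a token is not.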

func WithLogger

func WithLogger(l *slog.Logger) Option

WithLogger supplies the slog.Logger the transport emits events to. Default: slog.Default().

func WithMaxBodyBytes

func WithMaxBodyBytes(n int) Option

WithMaxBodyBytes caps the per-entry body size the transport will buffer and cache. Responses exceeding this cap pass through to the caller uncached. Values below the 8 KiB internal floor are accepted: the initial buffer allocation stays at the floor, but the caller-supplied value is still the cap that is enforced. Default: 4 MiB.
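The "buffer up to a cap, otherwise pass through" behaviour can be approximated with io.LimitReader and io.MultiReader. This is a sketch under assumptions, not the transport's implementation; readCapped and deliver are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readCapped reads at most max bytes of body. If the body is larger, it
// reports cacheable=false and returns a reader that replays the bytes
// already consumed followed by the remainder, so the caller still sees
// the complete body.
func readCapped(body io.Reader, max int) (buf []byte, rest io.Reader, cacheable bool, err error) {
	// Read one byte past the cap to distinguish "exactly max" from "too big".
	buf, err = io.ReadAll(io.LimitReader(body, int64(max)+1))
	if err != nil {
		return nil, nil, false, err
	}
	if len(buf) > max {
		return nil, io.MultiReader(bytes.NewReader(buf), body), false, nil
	}
	return buf, bytes.NewReader(buf), true, nil
}

// deliver shows what the caller receives and whether it would be cached.
func deliver(s string, max int) (string, bool) {
	_, rest, ok, err := readCapped(strings.NewReader(s), max)
	if err != nil {
		return "", false
	}
	all, _ := io.ReadAll(rest)
	return string(all), ok
}

func main() {
	fmt.Println(deliver("tiny", 8))           // tiny true
	fmt.Println(deliver("much too large", 8)) // much too large false
}
```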

func WithMaxCacheBytes

func WithMaxCacheBytes(n int64) Option

WithMaxCacheBytes caps the total byte budget held by the default NewLRUCache. Exceeding the budget evicts oldest entries. Has no effect when a caller-supplied Cache is used; custom backends enforce their own budgets. Default: 256 MiB.

type Stats

type Stats struct {
	Degraded        bool
	DegradedAt      time.Time // zero when not degraded
	TotalMismatches int64     // monotonic over Transport lifetime
}

Stats is the read-only snapshot returned by (*Transport).Stats. Suitable for /healthz probes, Prometheus gauges, or polling dashboards.

type Transport

type Transport struct {
	// contains filtered or unexported fields
}

Transport is an http.RoundTripper that adds If-None-Match on cacheable GET/HEAD requests and replays the cached body as a synthesised 200 when the server answers with 304 Not Modified.

Transport runs in precompute mode by default: the If-None-Match value is computed from the cached body and the CURRENT request headers, so cached entries stay useful across Authorization rotation (GitHub App installation tokens, rotating fine-grained PATs).

If algorithm drift is detected (10 precompute/server-ETag mismatches inside a 60-second window), the transport transparently switches to passive mode and sends the server's stored ETag verbatim. After a 1-hour cooldown, sampled probe-back requests retry precompute; consecutive successes restore precompute mode automatically.

Passive mode never replays caller credentials in If-None-Match: only the server's previously issued opaque ETag is sent. Use WithDriftDetected for state-transition callbacks; use Stats() to read live state.
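The detection rule (10 mismatches inside a 60-second window) can be pictured as a sliding-window counter. The sketch below is hypothetical and not concurrency-safe; detector, mismatch and simulate are invented names, and the cooldown and probe-back phases are left out.

```go
package main

import (
	"fmt"
	"time"
)

const (
	threshold = 10               // mismatches, from the documented default
	window    = 60 * time.Second // sliding window, from the documented default
)

// detector is a hypothetical sliding-window mismatch counter.
type detector struct {
	hits     []time.Time // mismatch timestamps still inside the window
	degraded bool
}

// mismatch records one precompute/server disagreement at time now and
// reports whether this call tripped the switch to passive mode.
func (d *detector) mismatch(now time.Time) bool {
	cutoff := now.Add(-window)
	keep := d.hits[:0]
	for _, t := range d.hits {
		if t.After(cutoff) {
			keep = append(keep, t) // still inside the window
		}
	}
	d.hits = append(keep, now)
	if !d.degraded && len(d.hits) >= threshold {
		d.degraded = true
		return true
	}
	return false
}

// simulate feeds n mismatches spaced step apart and reports the final state.
func simulate(n int, step time.Duration) bool {
	d := &detector{}
	start := time.Unix(0, 0)
	for i := 0; i < n; i++ {
		d.mismatch(start.Add(time.Duration(i) * step))
	}
	return d.degraded
}

func main() {
	fmt.Println(simulate(10, time.Second))    // true: 10 mismatches within 9s
	fmt.Println(simulate(10, 10*time.Second)) // false: at most 6 fall in any window
}
```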

func (*Transport) RoundTrip

func (t *Transport) RoundTrip(req *http.Request) (*http.Response, error)

RoundTrip implements http.RoundTripper.

func (*Transport) Stats

func (t *Transport) Stats() Stats

Stats returns a snapshot of the Transport's drift detector state. Safe to call from any goroutine.

Holds driftMu for the {Degraded, DegradedAt} pair so the snapshot is internally consistent under concurrent transitions. The hot path (buildIfNoneMatch) intentionally stays lock-free; it can briefly see a transient state and at worst sends one extra probe.
