# Advanced Cache (advCache)
High‑load in‑memory HTTP cache & reverse proxy for Go.
Designed for low latency and sustained high throughput — hundreds of thousands of RPS on commodity hosts.
Built around sharded storage, LRU with TinyLFU admission (Doorkeeper), background refresh, a resilient upstream cluster, and a lightweight worker orchestrator. The hot path is engineered to avoid allocations and global locks.
## Highlights
- Sharded storage (power‑of‑two shards) with per‑shard LRU and a global shard balancer for proportional eviction.
- Admission = W‑TinyLFU + Doorkeeper (Count‑Min Sketch + gated Bloom‑like filter).
- Background refresh with TTL, β‑staggering, scan‑rate and upstream rate limiting.
- Reverse proxy mode with an upstream cluster: per‑backend rate limiting, health probing, slow‑start, throttling/quarantine, and a fan‑in slot‑selection pattern for balanced dispatch.
- Worker orchestration for eviction/refresh/GC: on/off/start/reload/scale via a lightweight governor.
- Careful memory discipline: pooled buffers, zero‑copy headers, predictable storage budget.
- Dump/restore per shard with CRC32 and version rotation (optional GZIP).
- fasthttp HTTP layer, focused REST surface, Prometheus/VictoriaMetrics metrics.
- Kubernetes‑friendly: liveness probe, graceful shutdown, configurable GOMAXPROCS (auto when 0).
See also: METRICS.md and ROADMAP.md.
## Repository map
Quick orientation to major components.
```text
cmd/
  main.go                   # entrypoint, flags, wiring, logger, probes
internal/cache/api/         # HTTP controllers: main cache route, on/off, clear, metrics
pkg/
  config/                   # YAML config loader & derived fields
  http/server/              # fasthttp server, middlewares (Server header, JSON, etc.)
  orchestrator/             # worker governor & transport
  pools/                    # buffer and slice pools
  prometheus/metrics/       # Prometheus/VictoriaMetrics exposition
  storage/                  # sharded map, per‑shard LRU, LFU, dumper, refresher, evictor
    lru/  lfu/  map/
  upstream/                 # backend & cluster: rate‑limit, health, proxy logic
  k8s/                      # probes
  utils/, common/, types/   # helpers
```
## How requests are canonicalized (cache key)
To keep keys consistent and deterministic, requests are normalized before lookup/insert:
### Whitelist filtering
Only items listed in the config participate in the key:
- Query: `rules.*.cache_key.query` — exact parameter names (supports names like `project[id]`).
- Headers: `rules.*.cache_key.headers` — exact header names to include (e.g. `Accept-Encoding`).

All other query params and headers are ignored for the key.
### Deterministic ordering
Selected query params and headers are sorted lexicographically (by name, then value) before key construction, so semantically identical requests map to the same key.
Source: `pkg/model/query.go`, `pkg/model/header.go`, `pkg/common/sort/key_value.go`.
### Compression variants
If you whitelist `Accept-Encoding`, its normalized value becomes part of the key, isolating gzip/brotli/plain variants.
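As an illustration, a minimal sketch of the canonicalization idea (the real construction lives in `pkg/model` and `pkg/common/sort`; the names and key layout below are hypothetical):

```go
// Hypothetical sketch of cache-key canonicalization: keep only whitelisted
// query params and headers, sort them by name then value, and join into a key.
package main

import (
	"fmt"
	"sort"
	"strings"
)

type kv struct{ name, value string }

// buildKey is illustrative only; the actual key format is defined in pkg/model.
func buildKey(path string, whitelisted []kv) string {
	sort.Slice(whitelisted, func(i, j int) bool {
		if whitelisted[i].name != whitelisted[j].name {
			return whitelisted[i].name < whitelisted[j].name
		}
		return whitelisted[i].value < whitelisted[j].value
	})
	var b strings.Builder
	b.WriteString(path)
	for _, p := range whitelisted {
		b.WriteByte('&')
		b.WriteString(p.name)
		b.WriteByte('=')
		b.WriteString(p.value)
	}
	return b.String()
}

func main() {
	// Two semantically identical requests map to the same key regardless of order.
	fmt.Println(buildKey("/api/v1/stats", []kv{{"timezone", "UTC"}, {"language", "en"}}))
	fmt.Println(buildKey("/api/v1/stats", []kv{{"language", "en"}, {"timezone", "UTC"}}))
}
```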
### Response headers (cache value)
- Whitelist filtering (no sorting): only response headers listed in `rules.*.cache_value.headers` are stored and forwarded as‑is; no reordering is performed.
- `Server: <service-name>` — always set by middleware; if an upstream server name was present, it is preserved as `X-Origin-Server` and replaced with the local `Server` value.
  Source: `pkg/http/server/middleware/server_name.go`.
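A minimal sketch of that middleware behaviour as a fasthttp handler wrapper (illustrative, not the actual file contents):

```go
package middleware

import "github.com/valyala/fasthttp"

// serverName is an illustrative sketch: it preserves an upstream Server header
// as X-Origin-Server and always sets the local service name as Server.
func serverName(name string, next fasthttp.RequestHandler) fasthttp.RequestHandler {
	return func(ctx *fasthttp.RequestCtx) {
		next(ctx) // let the handler/proxy produce the response first

		if origin := ctx.Response.Header.Peek(fasthttp.HeaderServer); len(origin) > 0 {
			ctx.Response.Header.SetBytesV("X-Origin-Server", origin)
		}
		ctx.Response.Header.Set(fasthttp.HeaderServer, name)
	}
}
```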
Note: X-Refreshed-At is planned to indicate background refresh timing. (See ROADMAP.md.)
## Configuration
Two example profiles are included:
- `advcache.cfg.yaml` — deployment profile
- `advcache.cfg.local.yaml` — local/stress profile
Selected top‑level keys (under `cache:`):
- `env` — log/metrics label (`dev`, `prod`, etc.).
- `runtime.gomaxprocs` — `0` = auto (via automaxprocs); set an explicit N to cap CPUs.
- `api.{name,port}` — service name and HTTP port.
- `upstream.policy` — `"await"` (back‑pressure) or `"deny"` (fail‑fast).
- `upstream.cluster.backends[]` — per‑backend: `rate`, `timeout`, `max_timeout`, `use_max_timeout_header`, `healthcheck`.
- `data.dump` — snapshots: `{enabled, dir, name, crc32_control_sum, max_versions, gzip}`.
- `storage.size` — memory budget (bytes).
- `admission` — TinyLFU: `table_len_per_shard` (power‑of‑two), `estimated_length`, `door_bits_per_counter` (8–16 typical), `sample_multiplier` (traffic‑proportional aging).
- `eviction` — pressure policy: `soft_limit` (background eviction + enforced admission), `hard_limit` (minimal hot‑path eviction + runtime memory limit); `replicas`, `scan_rate`.
- `refresh` — `{enabled, ttl, beta, rate, replicas, scan_rate, coefficient}`; `beta` staggers refreshes ahead of the TTL (see the sketch after this list).
- `forceGC` — periodic `FreeOSMemory`.
- `metrics.enabled` — Prometheus/VictoriaMetrics.
- `k8s.probe.timeout` — probe timeout.
- `rules` — per‑path overrides + cache key/value canonicalization.
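The `beta` knob follows the common probabilistic early‑expiration pattern: each read may trigger a background refresh slightly before the TTL, with a randomized offset, so replicas do not refresh the same key at the same instant. A hedged sketch of that idea (names and the exact formula are illustrative, not the project's code):

```go
// Sketch of β-staggered ("probabilistic early expiration") refresh.
package main

import (
	"fmt"
	"math"
	"math/rand"
	"time"
)

// delta approximates how long a refresh takes; beta (refresh.beta) controls how
// aggressively entries are refreshed ahead of their expiry.
func shouldRefresh(now, expiry time.Time, delta time.Duration, beta float64) bool {
	jitter := time.Duration(-float64(delta) * beta * math.Log(rand.Float64()))
	return now.Add(jitter).After(expiry)
}

func main() {
	expiry := time.Now().Add(200 * time.Millisecond)
	for i := 0; i < 5; i++ {
		// The closer we get to expiry, the more likely a refresh is triggered.
		fmt.Println(shouldRefresh(time.Now(), expiry, 100*time.Millisecond, 0.5))
		time.Sleep(50 * time.Millisecond)
	}
}
```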
### Example (deployment excerpt)

```yaml
cache:
  env: "dev"
  enabled: true
  runtime:
    gomaxprocs: 0
  api:
    name: "starTeam.advCache"
    port: "8020"
  upstream:
    policy: "await"
    cluster:
      backends:
        - id: "prod-node-1"
          enabled: true
          host: "localhost:8081"
          scheme: "http"
          rate: 100000
          timeout: "10s"
          max_timeout: "1m"
          use_max_timeout_header: ""
          healthcheck: "/healthcheck"
        - id: "low-resources-prod-node-2"
          enabled: true
          host: "localhost:8082"
          scheme: "http"
          rate: 3000
          timeout: "10s"
          max_timeout: "1m"
          use_max_timeout_header: ""
          healthcheck: "/healthcheck"
        - id: "legacy-prod-node-3"
          enabled: true
          host: "localhost:8083"
          scheme: "http"
          rate: 500
          timeout: "1m"
          max_timeout: "10m"
          use_max_timeout_header: ""
          healthcheck: "/legacy/health/is-ok"
  data:
    dump:
      enabled: false
      dir: "public/dump"
      name: "cache.dump"
      crc32_control_sum: true
      max_versions: 3
      gzip: false
  storage:
    size: 53687091200 # 50 GiB
  admission:
    table_len_per_shard: 32768
    estimated_length: 10000000
    door_bits_per_counter: 12
    sample_multiplier: 12
  eviction:
    enabled: true
    replicas: 4
    scan_rate: 8
    soft_limit: 0.8
    hard_limit: 0.9
  refresh:
    enabled: true
    ttl: "3h"
    beta: 0.5
    rate: 1250
    replicas: 4
    scan_rate: 32
    coefficient: 0.5
  forceGC:
    enabled: true
    interval: "10s"
  metrics:
    enabled: true
  k8s:
    probe:
      timeout: "5s"
  rules:
    /api/v2/cloud/data:
      cache_key:
        query: ["project[id]", domain, language, choice, timezone]
        headers: [Accept-Encoding]
      cache_value:
        headers: [Content-Type, Content-Length, Content-Encoding, Connection, Strict-Transport-Security, Vary, Cache-Control]
    /api/v1/stats:
      enabled: true
      ttl: "36h"
      beta: 0.4
      coefficient: 0.7
      cache_key:
        query: [language, timezone]
        headers: [Accept-Encoding]
      cache_value:
        headers: [Content-Type, Content-Length, Content-Encoding, Connection, Strict-Transport-Security, Vary, Cache-Control]
```
### Example (local stress excerpt)

```yaml
cache:
  env: "dev"
  enabled: true
  runtime:
    gomaxprocs: 12
  api:
    name: "starTeam.adv:8020"
    port: "8081"
  upstream:
    policy: "deny"
    cluster:
      backends:
        - id: "adv"
          enabled: true
          host: "localhost:8020"
          scheme: "http"
          rate: 250000
          timeout: "5s"
          max_timeout: "3m"
          use_max_timeout_header: "X-Google-Bot"
          healthcheck: "/k8s/probe"
  storage:
    size: 10737418240 # 10 GiB
  admission:
    table_len_per_shard: 32768
    estimated_length: 10000000
    door_bits_per_counter: 12
    sample_multiplier: 10
  eviction:
    enabled: true
    replicas: 4
    scan_rate: 8
    soft_limit: 0.9
    hard_limit: 0.99
  forceGC:
    enabled: true
    interval: "10s"
  metrics:
    enabled: true
```
## Eviction & pressure policy
- **Background eviction at SOFT‑LIMIT.** When `heap_usage >= storage.size × soft_limit`, the evictor runs in the background and does not touch the hot path. It removes items using a larger LRU sample (preferentially keeping newer entries). Increase `replicas` and `scan_rate` to shave memory continuously.
- **Admission at SOFT‑LIMIT.** TinyLFU admission is enforced on the hot path during pressure to avoid polluting the cache with low‑value inserts while the evictor catches up.
- **Minimal hot‑path eviction at HARD‑LIMIT.** When `heap_usage >= storage.size × hard_limit`, a single‑item eviction per request is applied to reduce contention with the background worker, and the runtime memory limit is set in parallel. This preserves throughput and avoids latency cliffs.
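A condensed sketch of how the two thresholds translate into actions (hypothetical names; the real logic lives in the storage and evictor packages):

```go
// Illustrative pressure check against the configured soft/hard limits.
package pressure

// Policy mirrors the relevant config knobs.
type Policy struct {
	StorageSize int64   // storage.size (bytes)
	SoftLimit   float64 // eviction.soft_limit
	HardLimit   float64 // eviction.hard_limit
}

type Action int

const (
	None            Action = iota
	BackgroundEvict        // soft limit: background evictor + enforced admission
	HotPathEvict           // hard limit: evict a single item on the request path
)

// Decide maps current heap usage to the pressure action described above.
func (p Policy) Decide(heapUsage int64) Action {
	usage := float64(heapUsage)
	switch {
	case usage >= float64(p.StorageSize)*p.HardLimit:
		return HotPathEvict
	case usage >= float64(p.StorageSize)*p.SoftLimit:
		return BackgroundEvict
	default:
		return None
	}
}
```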
## TinyLFU + Doorkeeper (admission)
- Count‑Min Sketch (depth = 4) with compact counters, sharded to minimize contention.
- Sample‑based aging: the sketch ages after `estimated_length × sample_multiplier` observations (traffic‑proportional).
- Doorkeeper (Bloom‑like bitset) gates first‑seen keys; it is reset with each aging to avoid false‑positive growth.

Recommended starting points: `table_len_per_shard: 8192–32768` · `door_bits_per_counter: 12` · `sample_multiplier: 8–12`.
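A simplified sketch of the admission decision (Doorkeeper first, then a frequency comparison against the eviction victim); counter widths, sharding, and aging are omitted, and the interfaces below are stand‑ins rather than the project's types:

```go
// Simplified W-TinyLFU-style admission: a Bloom-like doorkeeper gates
// first-seen keys, then a Count-Min Sketch estimate for the candidate is
// compared with the estimate for the LRU victim.
package admission

type sketch interface {
	Increment(hash uint64)
	Estimate(hash uint64) uint32
}

type doorkeeper interface {
	// AllowOnce returns false the first time a hash is seen and true afterwards.
	AllowOnce(hash uint64) bool
}

// Admit reports whether the candidate should replace the victim.
func Admit(cms sketch, door doorkeeper, candidate, victim uint64) bool {
	// First sighting: record it in the doorkeeper only, do not admit yet.
	if !door.AllowOnce(candidate) {
		return false
	}
	cms.Increment(candidate)
	// Admit only if the candidate is estimated to be more popular than the victim.
	return cms.Estimate(candidate) > cms.Estimate(victim)
}
```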
## Sizing evidence (current tests)
With randomized object sizes between 1 KiB and 16 KiB (mocks), the cache fills to ~10 GiB of logical data with ~500 MiB of overhead. Resident usage stabilizes around ~10.5 GiB for a 10 GiB dataset under these conditions.
## Build & run
Requirements: Go 1.24+

```bash
# Build
go build -o advCache ./cmd/main.go

# Run (uses the default config path if present)
./advCache

# Run with an explicit config path
./advCache -cfg ./advcache.cfg.yaml

# Docker (example multi-stage)
docker build -t advcache .
docker run --rm -p 8020:8020 -v "$PWD/public/dump:/app/public/dump" advcache -cfg /app/advcache.cfg.yaml
```

The built‑in defaults try `advcache.cfg.yaml` and then `advcache.cfg.local.yaml` if `-cfg` is not provided.
## HTTP endpoints
- `GET /{any}` — main cached endpoint (cache key = path + selected query + selected request headers).
- `GET /cache/on` — enable caching.
- `GET /cache/off` — disable caching.
- `GET /cache/clear` — two‑step clear: the first call returns a token with a 5‑minute TTL; a second call with `?token=` performs the clear.
- `GET /metrics` — Prometheus/VictoriaMetrics exposition.
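For example, the two‑step clear driven from a Go client might look like this (the token response format is an assumption here; check the controller in `internal/cache/api` for the exact shape):

```go
// Hypothetical client for the two-step clear: the first GET returns a token
// (format assumed), the second GET presents it within its 5-minute TTL.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

func main() {
	base := "http://localhost:8020"

	// Step 1: request a clear token.
	resp, err := http.Get(base + "/cache/clear")
	if err != nil {
		panic(err)
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()
	token := strings.TrimSpace(string(body)) // assumption: token is returned in the body

	// Step 2: confirm the clear with the token.
	resp, err = http.Get(base + "/cache/clear?token=" + url.QueryEscape(token))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("clear status:", resp.Status)
}
```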
## Observability
- Hits, misses, proxied/fallback counts, errors, panics.
- Cache length and memory gauges.
- Upstream health: healthy/sick/dead.
- Eviction/admission activity.
- Refresh scan/attempt metrics.
Enable periodic stats to stdout with logs.stats: true in config.
## Tuning guide (ops)
- Upstream policy: `deny` for fail‑fast load tests; `await` in production for back‑pressure.
- Eviction thresholds: start with `soft_limit: 0.8–0.9`, `hard_limit: 0.9–0.99`, and `forceGC.enabled: true`. If hot‑path eviction triggers often, increase evictor `replicas` or `scan_rate`.
- Admission: watch Doorkeeper density and the reset interval; if density exceeds ~0.5, increase `door_bits_per_counter` or reduce `sample_multiplier`.
- CPU: leave `gomaxprocs: 0` in production; pin CPUs via container limits/quotas if needed.
- Headers: whitelist only what must participate in the key; `Accept-Encoding` is a good default when you store compressed variants.
## Testing
- Unit tests around the storage hot path, TinyLFU, and the shard balancer.
- Dump/load tests with CRC and rotation.
- Upstream fault injection: timeouts, spikes, error bursts.
- Benchmarks with `-benchmem`; race‑enabled tests for concurrency‑sensitive code.
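A typical hot‑path benchmark shape for a `_test.go` file, run with `go test -bench . -benchmem` (the tiny store below is a stand‑in, not the project's storage API):

```go
package storage

import (
	"strconv"
	"sync"
	"testing"
)

// store is a minimal stand-in so the benchmark is self-contained.
type store struct {
	mu sync.RWMutex
	m  map[string][]byte
}

func (s *store) Set(k string, v []byte) { s.mu.Lock(); s.m[k] = v; s.mu.Unlock() }
func (s *store) Get(k string) ([]byte, bool) {
	s.mu.RLock()
	v, ok := s.m[k]
	s.mu.RUnlock()
	return v, ok
}

func BenchmarkGetHotPath(b *testing.B) {
	s := &store{m: make(map[string][]byte)}
	for i := 0; i < 1024; i++ {
		s.Set("key-"+strconv.Itoa(i), []byte("payload"))
	}
	b.ReportAllocs() // same effect as -benchmem for this benchmark
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		i := 0
		for pb.Next() {
			s.Get("key-" + strconv.Itoa(i&1023))
			i++
		}
	})
}
```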
## License
MIT — see LICENSE.
## Maintainer
Borislav Glazunov — glazunov2142@gmail.com · Telegram @glbrslv