cache

package
v2.1.0
Published: Aug 27, 2025 License: BSD-3-Clause, MIT Imports: 16 Imported by: 0

Documentation

Overview

Package cache implements the CLOCK-Pro caching algorithm.

CLOCK-Pro is a patent-free alternative to the Adaptive Replacement Cache, https://en.wikipedia.org/wiki/Adaptive_replacement_cache. It is an approximation of LIRS ( https://en.wikipedia.org/wiki/LIRS_caching_algorithm ), much like the CLOCK page replacement algorithm is an approximation of LRU.

This implementation is based on the Python code from https://bitbucket.org/SamiLehtinen/pyclockpro.

Slides describing the algorithm: http://fr.slideshare.net/huliang64/clockpro

The original paper: http://static.usenix.org/event/usenix05/tech/general/full_papers/jiang/jiang_html/html.html

It is MIT licensed, like the original.

Index

Constants

const ValueMetadataSize = 32

ValueMetadataSize denotes the number of bytes of metadata allocated for a cache entry. Note that builds with cgo disabled allocate no metadata, and 32-bit builds allocate less for a cache.Value. However, we keep the value constant to reduce friction for writing tests.

Variables

This section is empty.

Functions

func Free

func Free(v *Value)

Free frees the specified value. The buffer associated with the value will possibly be reused, making it invalid to use the buffer after calling Free. Do not call Free on a value that has been added to the cache.

Types

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

Cache implements Pebble's sharded block cache. The Clock-PRO algorithm is used for page replacement (http://static.usenix.org/event/usenix05/tech/general/full_papers/jiang/jiang_html/html.html). In order to provide better concurrency, 4 x NumCPUs shards are created, with each shard being given 1/n of the target cache size. The Clock-PRO algorithm is run independently on each shard.

Blocks are keyed by an (handleID, fileNum, offset) triple. The handleID is a namespace for file numbers and allows a single Cache to be shared between multiple Pebble instances (via separate Handles). The fileNum and offset refer to an sstable file number and the offset of the block within the file. Because sstables are immutable and file numbers are never reused, (fileNum,offset) are unique for the lifetime of a Pebble instance.

In addition to maintaining a map from (fileNum,offset) to data, each shard maintains a map of the cached blocks for a particular fileNum. This allows efficient eviction of all blocks for a file (used when an sstable is deleted from disk).

Memory Management

A normal implementation of the block cache would result in GC having to read through all the structures and keep track of the liveness of many objects. This was found to cause significant overhead in CRDB when compared to the earlier use of RocksDB.

In order to reduce pressure on the Go GC, manual memory management is performed for the data stored in the cache. Manual memory management is performed by calling into C.{malloc,free} to allocate memory; this memory is outside the purview of the GC. Cache.Values are reference counted and the memory backing a manual value is freed when the reference count drops to 0.

Manual memory management brings the possibility of memory leaks. It is imperative that every Handle returned by Cache.{Get,Set} is eventually released. The "invariants" build tag enables a leak detection facility that places a GC finalizer on cache.Value. When the cache.Value finalizer is run, if the underlying buffer is still present a leak has occurred. The "tracing" build tag enables tracing of cache.Value reference count manipulation and eases finding where a leak has occurred. These two facilities are usually used in combination by specifying `-tags invariants,tracing`. Note that "tracing" produces a significant slowdown, while "invariants" does not.

func New

func New(size int64) *Cache

New creates a new cache of the specified size. Memory for the cache is allocated on demand, not during initialization. The cache is created with a reference count of 1. Each DB it is associated with adds a reference, so the creator of the cache should usually release their reference after the DB is created.

c := cache.New(...)
defer c.Unref()
d, err := pebble.Open(dirname, &pebble.Options{Cache: c})

func NewWithShards added in v2.1.0

func NewWithShards(size int64, shards int) *Cache

NewWithShards creates a new cache with the specified size and number of shards.

func (*Cache) MaxSize

func (c *Cache) MaxSize() int64

MaxSize returns the max size of the cache.

func (*Cache) Metrics

func (c *Cache) Metrics() Metrics

Metrics returns the metrics for the cache.

func (*Cache) NewHandle added in v2.1.0

func (c *Cache) NewHandle() *Handle

func (*Cache) Ref

func (c *Cache) Ref()

Ref adds a reference to the cache. The cache only remains valid as long as a reference is maintained to it.

func (*Cache) Reserve

func (c *Cache) Reserve(n int) func()

Reserve N bytes in the cache. This effectively shrinks the size of the cache by N bytes, without actually consuming any memory. The returned closure should be invoked to release the reservation.

func (*Cache) Size

func (c *Cache) Size() int64

Size returns the current space used by the cache.

func (*Cache) Unref

func (c *Cache) Unref()

Unref releases a reference on the cache.

type Handle

type Handle struct {
	// contains filtered or unexported fields
}

Handle is the interface through which a store uses the cache. Each store uses a separate "handle". A handle corresponds to a separate "namespace" inside the cache; a handle cannot see another handle's blocks.

func (*Handle) Cache added in v2.1.0

func (c *Handle) Cache() *Cache

Cache returns the Cache instance associated with the handle.

func (*Handle) Close added in v2.1.0

func (c *Handle) Close()

func (*Handle) Delete added in v2.1.0

func (c *Handle) Delete(fileNum base.DiskFileNum, offset uint64)

Delete deletes the cached value for the specified file and offset.

func (*Handle) EvictFile added in v2.1.0

func (c *Handle) EvictFile(fileNum base.DiskFileNum)

EvictFile evicts all cache values for the specified file.

func (*Handle) Get added in v2.1.0

func (c *Handle) Get(fileNum base.DiskFileNum, offset uint64) *Value

Get retrieves the cache value for the specified file and offset, returning nil if no value is present.

func (*Handle) GetWithReadHandle added in v2.1.0

func (c *Handle) GetWithReadHandle(
	ctx context.Context, fileNum base.DiskFileNum, offset uint64,
) (cv *Value, rh ReadHandle, errorDuration time.Duration, cacheHit bool, err error)

GetWithReadHandle retrieves the cache value for this handle's ID and the specified fileNum and offset. If found, a valid Value is returned (with cacheHit set to true), else a valid ReadHandle is returned.

See the ReadHandle declaration for the contract the caller must satisfy when getting a valid ReadHandle.

This method can block before returning since multiple concurrent gets for the same cache value will take turns getting a ReadHandle, which represents permission to do the read. This blocking respects context cancellation, in which case an error is returned (and not a valid ReadHandle).

When blocking, the errorDuration return value can be non-zero and is populated with the total duration that other readers that observed an error (see ReadHandle.SetReadError) spent in doing the read. This duration can be greater than the time spent blocked in this method, since some of these errors could have occurred prior to this call. But it serves as a rough indicator of whether turn taking could have caused higher latency due to context cancellation of other readers.

While waiting, someone else may successfully read the value, which results in a valid Value being returned. This is a case where cacheHit=false.

func (*Handle) Set added in v2.1.0

func (c *Handle) Set(fileNum base.DiskFileNum, offset uint64, value *Value)

Set sets the cache value for the specified file and offset, overwriting an existing value if present. The value must have been allocated by Alloc.

The cache takes a reference on the Value and holds it until it gets evicted.

type Metrics

type Metrics struct {
	// The number of bytes in use by the cache.
	Size int64
	// The count of objects (blocks or tables) in the cache.
	Count int64
	// The number of cache hits.
	Hits int64
	// The number of cache misses.
	Misses int64
}

Metrics holds metrics for the cache.

type ReadHandle added in v2.1.0

type ReadHandle struct {
	// contains filtered or unexported fields
}

ReadHandle represents a contract with a caller that had a miss when doing a cache lookup, and wants to do a read and insert the read block into the cache. The contract applies when ReadHandle.Valid returns true, in which case the caller has been assigned the turn to do the read (and others are potentially waiting for it).

Contract:

The caller must immediately start doing a read, or can first wait on a shared resource that would also block a different reader if it was assigned the turn instead (specifically, this refers to Options.LoadBlockSema). After the read, it must either call SetReadValue or SetReadError depending on whether the read succeeded or failed.

func (ReadHandle) SetReadError added in v2.1.0

func (rh ReadHandle) SetReadError(err error)

SetReadError specifies that the caller has encountered a read error.

func (ReadHandle) SetReadValue added in v2.1.0

func (rh ReadHandle) SetReadValue(v *Value)

SetReadValue provides the Value that the caller has read and sets it in the block cache.

The cache takes a reference on the Value and holds it until it is evicted and no longer needed by other readers.

func (ReadHandle) Valid added in v2.1.0

func (rh ReadHandle) Valid() bool

Valid returns true for a valid ReadHandle.

type Value

type Value struct {
	// contains filtered or unexported fields
}

Value holds a reference counted immutable value.

func Alloc

func Alloc(n int) *Value

Alloc allocates a byte slice of the specified size, possibly reusing previously allocated but unused memory. The memory backing the value is manually managed. The caller MUST either add the value to the cache (via Handle.Set), or release the value (via Free). Failure to do so will result in a memory leak.

func (*Value) RawBuffer added in v2.1.0

func (v *Value) RawBuffer() []byte

RawBuffer returns the buffer associated with the value. The contents of the buffer should not be changed once the value has been added to the cache. Instead, a new Value should be created and added to the cache to replace the existing value.

func (*Value) Release added in v2.1.0

func (v *Value) Release()

Release a ref count on the buffer. It is a no-op to call Release on a nil Value.

func (*Value) Truncate

func (v *Value) Truncate(n int)

Truncate the buffer to the specified length. The buffer length should not be changed once the value has been added to the cache as there may be concurrent readers of the Value. Instead, a new Value should be created and added to the cache to replace the existing value.
