chunker

package
v0.4.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 2, 2022 License: Apache-2.0, MIT Imports: 18 Imported by: 0

Documentation

Overview

Package chunker provides functionality for chunking the chain of entries generated from a provider.MultihashIterator, represented by the EntriesChunker interface. The package provides a default implementation of this interface, CachedEntriesChunker.

CachedEntriesChunker stores a cache of generated entries chains with configurable capacity and maximum chunk size. The cache guarantees that a chain of entries is either fully cached or not cached at all. This includes chains with overlapping sections. In that case, the overlapping section is not evicted from the cache until the larger chain it overlaps with is evicted. The CachedEntriesChunker also supports restoring previously cached entries upon instantiation.

See: CachedEntriesChunker, NewCachedEntriesChunker

Types

type CachedEntriesChunker

type CachedEntriesChunker struct {
	// contains filtered or unexported fields
}

CachedEntriesChunker is an EntriesChunker that caches the generated chunks using an LRU cache. The chunks within a chain are guaranteed to be either fully cached or not cached at all. If chains overlap, the smaller overlapping portion is not evicted unless all the chains that reference it are evicted.

The number of chains cached will be at most equal to the given capacity. The capacity is immutable. Chains are evicted as needed if the capacity is reached.
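The capacity rule above can be illustrated with a toy LRU whose unit of accounting is a whole chain rather than an individual chunk. This is a minimal sketch using only the standard library; chainLRU, chainEntry and put are hypothetical names and do not reflect how CachedEntriesChunker is implemented.

```go
package main

import (
	"container/list"
	"fmt"
)

// chainEntry is a hypothetical record of one cached chain.
type chainEntry struct {
	root   string   // stand-in for the link of the chain root
	chunks []string // stand-ins for the chunks that make up the chain
}

// chainLRU is a toy LRU keyed by chain root. Its capacity counts chains,
// not the chunks inside them.
type chainLRU struct {
	capacity int
	order    *list.List               // front = most recently used chain
	items    map[string]*list.Element // chain root -> element in order
}

func newChainLRU(capacity int) *chainLRU {
	return &chainLRU{capacity: capacity, order: list.New(), items: map[string]*list.Element{}}
}

// put caches a whole chain and returns the roots of any chains evicted to
// stay within capacity. Re-adding a known root only refreshes its recency.
func (c *chainLRU) put(root string, chunks []string) (evicted []string) {
	if el, ok := c.items[root]; ok {
		c.order.MoveToFront(el)
		return nil
	}
	c.items[root] = c.order.PushFront(&chainEntry{root: root, chunks: chunks})
	for c.order.Len() > c.capacity {
		e := c.order.Remove(c.order.Back()).(*chainEntry)
		delete(c.items, e.root)
		evicted = append(evicted, e.root)
	}
	return evicted
}

// len reports the number of cached chains.
func (c *chainLRU) len() int { return c.order.Len() }

func main() {
	c := newChainLRU(2)
	c.put("r1", []string{"a", "b", "c"}) // three chunks, still one chain
	c.put("r2", []string{"d"})
	evicted := c.put("r3", []string{"e", "f"})
	fmt.Println(evicted, c.len()) // prints "[r1] 2"
}
```

Although the first chain holds three chunks, it occupies a single slot; adding a third chain evicts it as a unit.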

This cache restores previously cached values from the datastore upon instantiation. If the capacity is smaller than the number of chains persisted, the surplus chains will be evicted in no particular order.

See: NewCachedEntriesChunker.

func NewCachedEntriesChunker

func NewCachedEntriesChunker(ctx context.Context, ds datastore.Batching, chunkSize, capacity int) (*CachedEntriesChunker, error)

NewCachedEntriesChunker instantiates a new CachedEntriesChunker backed by a given datastore.

The chunks are generated with the given maximum chunkSize and are stored in an LRU cache. Once stored, the individual chunks that make up the entries chain are retrievable in their raw binary form via CachedEntriesChunker.GetRawCachedChunk.

The growth of the LRU cache is limited by the given capacity. The capacity specifies the number of complete chains that are cached, not the number of chunks within each chain. The actual storage consumed by the cache is a function of: 1) maximum chunk size, 2) multihash length and 3) capacity. For example, a fully populated cache with a chunk size of 16384, multihashes of length 128 bits (16 bytes) and a capacity of 1024 will consume 256 MiB of space, i.e. 16384 * 1024 * 16 bytes.
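The arithmetic in the example above can be checked with a few lines of Go. This is a rough sketch: estimateCacheBytes is a hypothetical helper, not part of this package, and it follows the formula as stated (chunk size times capacity times multihash length), ignoring encoding overhead and caching metadata.

```go
package main

import "fmt"

// estimateCacheBytes multiplies the three factors named in the text:
// maximum chunk size (multihashes per chunk), capacity (number of chains)
// and multihash length in bytes. It is an estimate only and assumes the
// formula from the documentation, not measured storage usage.
func estimateCacheBytes(chunkSize, capacity, mhLenBytes int) int64 {
	return int64(chunkSize) * int64(capacity) * int64(mhLenBytes)
}

func main() {
	const mib = 1 << 20
	// A 128-bit multihash is 16 bytes long.
	b := estimateCacheBytes(16384, 1024, 16)
	fmt.Printf("%d bytes = %d MiB\n", b, b/mib) // prints "268435456 bytes = 256 MiB"
}
```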

This struct guarantees that for any given chain of entries, either the entire chain is cached, or it is not cached at all. When chains overlap, the overlapping portion of the chain is not evicted until the larger chain is evicted.
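One way to picture the all-or-nothing guarantee for overlapping chains is reference counting: a chunk stays cached while at least one cached chain still uses it. The sketch below is a toy model under that assumption; chainCache and its methods are hypothetical and are not taken from the package's actual bookkeeping.

```go
package main

import "fmt"

// chainCache is a toy model in which each chain is a list of chunk IDs and
// a chunk is dropped only when no cached chain references it any more.
type chainCache struct {
	chains map[string][]string // chain root -> chunk IDs in that chain
	refs   map[string]int      // chunk ID -> number of cached chains using it
}

func newChainCache() *chainCache {
	return &chainCache{chains: map[string][]string{}, refs: map[string]int{}}
}

// add caches a whole chain at once; a chain is never partially cached.
func (c *chainCache) add(root string, chunks []string) {
	if _, ok := c.chains[root]; ok {
		return
	}
	c.chains[root] = chunks
	for _, id := range chunks {
		c.refs[id]++
	}
}

// evict removes a whole chain. Chunks shared with another cached chain
// survive until that chain is evicted too.
func (c *chainCache) evict(root string) {
	for _, id := range c.chains[root] {
		if c.refs[id]--; c.refs[id] == 0 {
			delete(c.refs, id)
		}
	}
	delete(c.chains, root)
}

// hasChunk reports whether a chunk is still held by any cached chain.
func (c *chainCache) hasChunk(id string) bool { return c.refs[id] > 0 }

func main() {
	c := newChainCache()
	c.add("long", []string{"a", "b", "c"})
	c.add("short", []string{"b", "c"}) // overlaps with the tail of "long"

	c.evict("short")
	fmt.Println(c.hasChunk("b"), c.hasChunk("a")) // prints "true true"

	c.evict("long")
	fmt.Println(c.hasChunk("b")) // prints "false"
}
```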

Upon instantiation, the chunker restores its state from the datastore and prunes the datastore as needed. For example, if the given capacity is smaller than the number of chains present in the datastore, it evicts chains to respect the given capacity.

Note that caching metadata of negligible size is persisted in addition to the chunks. The caching metadata is checked during restore to determine the roots of cached chains and the number of overlapping chunks.

The context is only used to cancel a call to this function while it is accessing the datastore.

See CachedEntriesChunker.Chunk, CachedEntriesChunker.GetRawCachedChunk

func (*CachedEntriesChunker) Cap

func (ls *CachedEntriesChunker) Cap() int

Cap returns the maximum number of entries chains this cache stores.

Note, the maximum number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

func (*CachedEntriesChunker) Chunk

func (ls *CachedEntriesChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)

Chunk chunks the multihashes supplied by the given mhi into a chain of schema.EntryChunk instances and stores them.
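To make the idea of a chain of chunks concrete, the sketch below splits a list of items into chunks of bounded size and links them together. It uses only the standard library: entryChunk is a hypothetical stand-in for schema.EntryChunk, strings stand in for multihashes, and the linking order (the chunk built last becomes the root) is an assumption of this sketch rather than a statement about the package.

```go
package main

import "fmt"

// entryChunk is a hypothetical stand-in for schema.EntryChunk.
type entryChunk struct {
	entries []string    // stand-ins for multihashes
	next    *entryChunk // link to the next chunk in the chain; nil at the end
}

// chunkAll splits items into chunks of at most chunkSize entries and links
// them into a chain. It returns the root of the chain, or nil when there is
// nothing to chunk or chunkSize is not positive.
func chunkAll(items []string, chunkSize int) *entryChunk {
	if chunkSize <= 0 {
		return nil
	}
	var root *entryChunk
	for start := 0; start < len(items); start += chunkSize {
		end := start + chunkSize
		if end > len(items) {
			end = len(items)
		}
		// Each new chunk points at the chunk built before it, so the chunk
		// built last becomes the root of the chain.
		root = &entryChunk{entries: items[start:end], next: root}
	}
	return root
}

// chainLen counts the chunks reachable from c.
func chainLen(c *entryChunk) int {
	n := 0
	for ; c != nil; c = c.next {
		n++
	}
	return n
}

func main() {
	root := chunkAll([]string{"a", "b", "c", "d", "e"}, 2)
	fmt.Println(chainLen(root), root.entries) // prints "3 [e]"
}
```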

func (*CachedEntriesChunker) Clear

func (ls *CachedEntriesChunker) Clear(ctx context.Context) error

Clear purges all stored items from the CachedEntriesChunker.

func (*CachedEntriesChunker) Close

func (ls *CachedEntriesChunker) Close() error

Close syncs the backing datastore but does not close it. This is because the cached entries chunker wraps an existing datastore rather than constructing one, and the wrapped datastore may be in use elsewhere.

func (*CachedEntriesChunker) GetRawCachedChunk

func (ls *CachedEntriesChunker) GetRawCachedChunk(ctx context.Context, l ipld.Link) ([]byte, error)

GetRawCachedChunk gets the raw cached entry chunk for the given link, or nil if no such cached chunk exists.

func (*CachedEntriesChunker) Len

func (ls *CachedEntriesChunker) Len() int

Len returns the number of entries chains that are currently stored in the cache.

Note, the number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

type EntriesChunker

type EntriesChunker interface {
	// Chunk chunks multihashes supplied by a given provider.MultihashIterator into a chain of
	// schema.EntryChunk and returns the link of the chain root.
	Chunk(context.Context, provider.MultihashIterator) (ipld.Link, error)
}

EntriesChunker chunks multihashes supplied by a given provider.MultihashIterator into a chain of schema.EntryChunk.
