snap

package

Version: v0.0.20 (latest) Published: Dec 20, 2025 License: Apache-2.0 Imports: 14 Imported by: 0

Repository

github.com/signadot/tony-format

README ¶

Event-Based Snapshot Design

Overview

This package provides event-based snapshot storage for logd. Snapshots store events directly from the stream package, with a size-bound index for efficient path lookups.

Design Goals

Store events directly: Snapshots contain a stream of stream.Event values, not IR nodes
Size-bound index: The index fits in memory even for very large snapshots
Efficient path access: Look up paths without reading the entire snapshot
Streaming reads: Support reading paths without full document reconstruction

Architecture

Snapshot Format

[8 bytes: event stream size (uint64, big-endian)]
[4 bytes: index size (uint32, big-endian)]
[event stream bytes]
[index bytes]

Event stream size: 8-byte uint64 indicating size of event stream in bytes
Index size: 4-byte uint32 indicating size of index in bytes
Event stream: Sequence of stream.Event values encoded in Tony format
Index: Size-bound map of paths to byte offsets

The two size fields come first in the file. They are written as placeholders and updated once the event stream and index have been written.
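The header layout above can be sketched with encoding/binary. This is a minimal illustration, not the package's own code: the names headerSize, encodeHeader and decodeHeader are hypothetical, and only the 8-byte plus 4-byte big-endian layout is taken from the description.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize mirrors the documented 12-byte header (8 + 4 bytes).
const headerSize = 12

// encodeHeader packs both sizes in the documented big-endian layout.
// Hypothetical helper, not part of the snap API.
func encodeHeader(eventSize uint64, indexSize uint32) [headerSize]byte {
	var h [headerSize]byte
	binary.BigEndian.PutUint64(h[0:8], eventSize)
	binary.BigEndian.PutUint32(h[8:12], indexSize)
	return h
}

// decodeHeader is the inverse of encodeHeader.
func decodeHeader(b []byte) (eventSize uint64, indexSize uint32, err error) {
	if len(b) < headerSize {
		return 0, 0, errors.New("short header")
	}
	return binary.BigEndian.Uint64(b[0:8]), binary.BigEndian.Uint32(b[8:12]), nil
}

func main() {
	h := encodeHeader(1024, 96)
	eventSize, indexSize, err := decodeHeader(h[:])
	fmt.Println(eventSize, indexSize, err)
}
```

With this layout the event stream always begins at byte 12, and the index begins at 12 plus the event stream size.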

Index Structure

The index is a list of kpaths in order of the stream events, each associated with an offset in the event data:

IndexEntry: Contains a kpath (a *Path, serialized as text) and an offset (int64)
Entries: Ordered list of IndexEntry values, sorted by offset
Ancestor lookup: If exact path not indexed, find nearest ancestor
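The ancestor fallback can be sketched as follows. This is an illustration under loud assumptions: paths are plain strings whose segments are introduced by "." or "[" (quoting and other kpath syntax are ignored), and the names entry, parent and nearest are hypothetical rather than part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a simplified stand-in for IndexEntry.
type entry struct {
	Path   string
	Offset int64
}

// parent strips the last segment of a simplified kinded path.
// Assumption: segments start with '.' or '['; real kpaths have more syntax.
func parent(p string) string {
	i := strings.LastIndexAny(p, ".[")
	if i < 0 {
		return ""
	}
	return p[:i]
}

// nearest returns the entry for p, or for its closest indexed ancestor.
func nearest(entries []entry, p string) (entry, bool) {
	byPath := make(map[string]entry, len(entries))
	for _, e := range entries {
		byPath[e.Path] = e
	}
	for {
		if e, ok := byPath[p]; ok {
			return e, true
		}
		if p == "" {
			return entry{}, false // not even the root is indexed
		}
		p = parent(p)
	}
}

func main() {
	idx := []entry{{Path: "", Offset: 0}, {Path: "users", Offset: 40}, {Path: "users.123", Offset: 200}}
	e, ok := nearest(idx, "users.123.name")
	fmt.Println(e.Path, e.Offset, ok)
}
```

A reader would then seek to the returned offset and decode forward until it reaches the path it actually wants.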

Building Snapshots

1. Write placeholder sizes at beginning (12 bytes: 8 for event stream size, 4 for index size)
2. Process IR node or event stream
3. Convert to events (if starting from IR node)
4. Write events to snapshot file
5. Build size-bound index while processing events
6. After all events written:
   - Write index bytes
   - Seek back to beginning and update event stream size and index size
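The write sequence above (placeholder header, events, index, then seek back and patch) can be sketched against any io.WriteSeeker. This is a simplified illustration, not the Builder implementation: events and the index are treated as opaque byte slices rather than Tony-encoded values, and writeSnapshot and buildToBytes are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// writeSnapshot follows the documented order: placeholder header, events,
// index, then a seek back to offset 0 to fill in the real sizes.
func writeSnapshot(w io.WriteSeeker, events [][]byte, index []byte) error {
	var header [12]byte // placeholder until the sizes are known
	if _, err := w.Write(header[:]); err != nil {
		return err
	}
	var eventSize uint64
	for _, ev := range events {
		n, err := w.Write(ev)
		if err != nil {
			return err
		}
		eventSize += uint64(n)
	}
	if _, err := w.Write(index); err != nil {
		return err
	}
	binary.BigEndian.PutUint64(header[0:8], eventSize)
	binary.BigEndian.PutUint32(header[8:12], uint32(len(index)))
	if _, err := w.Seek(0, io.SeekStart); err != nil {
		return err
	}
	_, err := w.Write(header[:])
	return err
}

// buildToBytes runs writeSnapshot against a temporary file and returns
// the resulting bytes, which keeps the example self-contained.
func buildToBytes(events [][]byte, index []byte) ([]byte, error) {
	f, err := os.CreateTemp("", "snap-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if err := writeSnapshot(f, events, index); err != nil {
		return nil, err
	}
	return os.ReadFile(f.Name())
}

func main() {
	data, err := buildToBytes([][]byte{[]byte("abc"), []byte("de")}, []byte("IDX"))
	fmt.Println(len(data), err)
}
```

Writing the sizes last is what lets the writer stream events without knowing the total size up front, at the cost of requiring a seekable destination.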

Reading Snapshots

1. Read sizes from beginning of file:
   - Read event stream size (8 bytes, uint64)
   - Read index size (4 bytes, uint32)
2. Locate the event stream (starts at offset 12, spans event stream size bytes); events are read on demand
3. Read index (starting after event stream, for index size bytes)
4. Parse index structure
5. For path lookup:
   - Find path or nearest ancestor in index
   - Seek to offset in event stream
   - Decode events from that offset
   - Reconstruct path value from events
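The read side can be sketched with io.SectionReader, which bounds each region so that a read in the event stream cannot run into the index. This is an assumption-laden illustration: the helper names sections, readChunk and sample are hypothetical, event decoding is omitted, and the bytes are opaque placeholders.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const headerSize = 12

// sections reads the header and returns bounded readers for both regions.
func sections(r io.ReaderAt) (events, index *io.SectionReader, err error) {
	var h [headerSize]byte
	if _, err := io.ReadFull(io.NewSectionReader(r, 0, headerSize), h[:]); err != nil {
		return nil, nil, fmt.Errorf("read header: %w", err)
	}
	eventSize := int64(binary.BigEndian.Uint64(h[0:8]))
	indexSize := int64(binary.BigEndian.Uint32(h[8:12]))
	events = io.NewSectionReader(r, headerSize, eventSize)
	index = io.NewSectionReader(r, headerSize+eventSize, indexSize)
	return events, index, nil
}

// readChunk returns up to n event bytes starting at off, plus the raw index.
func readChunk(data []byte, off int64, n int) (chunk, index []byte, err error) {
	ev, idx, err := sections(bytes.NewReader(data))
	if err != nil {
		return nil, nil, err
	}
	if index, err = io.ReadAll(idx); err != nil {
		return nil, nil, err
	}
	if _, err = ev.Seek(off, io.SeekStart); err != nil {
		return nil, nil, err
	}
	// The SectionReader ends at the event stream boundary, so asking for
	// more than remains never returns index bytes.
	chunk, err = io.ReadAll(io.LimitReader(ev, int64(n)))
	return chunk, index, err
}

// sample builds a tiny snapshot: header, 5 event bytes, 3 index bytes.
func sample() []byte {
	events, index := []byte("abcde"), []byte("IDX")
	var h [headerSize]byte
	binary.BigEndian.PutUint64(h[0:8], uint64(len(events)))
	binary.BigEndian.PutUint32(h[8:12], uint32(len(index)))
	return append(append(h[:], events...), index...)
}

func main() {
	chunk, index, err := readChunk(sample(), 3, 10)
	fmt.Printf("%q %q %v\n", chunk, index, err)
}
```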

Implementation Status

Basic index structure (Index, IndexEntry)
Event-based snapshot writer
Index builder (records paths and offsets as events are processed)
Basic snapshot reader with path lookup

Migration from IR-Node-Based Snapshots

The previous IR-node-based implementation (with !snap-loc, !snap-range, !snap-chunks) has been archived in archive/. The new design:

Simpler: No chunking logic, just events + index
More efficient: Direct event storage, no IR node conversion overhead
Better scaling: Size-bound index works for arbitrarily large snapshots

Documentation ¶

Overview ¶

Package snap provides event-based snapshot storage.

Snapshots store stream.Event sequences with a size-bound index mapping paths to byte offsets. This enables efficient path lookups without loading entire documents into memory.

Format: [header: 12 bytes][events][index]

Index ¶

Constants
func EncodeRandomDocument(doc *ir.Node, enc *stream.Encoder) error
func GetChunkSize() int
func RandomDocument(config RandomDocConfig) (*ir.Node, []string, error)
type Builder
- func NewBuilder(w W, index *Index, patches []*ir.Node) (*Builder, error)
- func (b *Builder) Close() error
- func (b *Builder) WriteEvent(ev *stream.Event) error
type Index
- func OpenIndex(r io.Reader, size int) (*Index, error)
- func (idx *Index) EstimatedSize() int64
- func (s *Index) FromTony(data []byte, opts ...gomap.UnmapOption) error
- func (s *Index) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error
- func (idx *Index) Lookup(kp string) (index int, err error)
- func (s *Index) ToTony(opts ...gomap.MapOption) ([]byte, error)
- func (s *Index) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)
type IndexEntry
- func (s *IndexEntry) FromTony(data []byte, opts ...gomap.UnmapOption) error
- func (s *IndexEntry) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error
- func (s *IndexEntry) ToTony(opts ...gomap.MapOption) ([]byte, error)
- func (s *IndexEntry) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)
type Path
- func (p *Path) MarshalText() ([]byte, error)
- func (p *Path) String() string
- func (p *Path) UnmarshalText(d []byte) error
type PathFinder
- func NewPathFinder(r io.ReadSeekCloser, index *Index, off int64, idxPath, desPath *kpath.KPath, ...) (*PathFinder, error)
- func (pf *PathFinder) FindEvents() ([]stream.Event, error)
type R
type RandomDocConfig
- func DefaultRandomDocConfig() RandomDocConfig
type Snapshot
- func Open(rc R) (*Snapshot, error)
- func (s *Snapshot) Close() error
- func (s *Snapshot) ReadPath(p string) (*ir.Node, error)
type W

Constants ¶

const (
	DefaultChunkSize = 4096
	HeaderSize       = 12
)

Variables ¶

This section is empty.

Functions ¶

func EncodeRandomDocument ¶

func EncodeRandomDocument(doc *ir.Node, enc *stream.Encoder) error

EncodeRandomDocument encodes a random document to events using the stream encoder.

func GetChunkSize ¶

func GetChunkSize() int

GetChunkSize returns the chunk size used for indexing, in bytes. It defaults to DefaultChunkSize (4096) and can be overridden with the SNAP_MAX_CHUNK_SIZE environment variable.
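An environment override of this kind is commonly written as below. This is a hypothetical sketch, not the body of GetChunkSize: the helper names are invented, and falling back to the default on empty, non-numeric or non-positive values is an assumption, since the documentation does not say how invalid values are handled.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultChunkSize mirrors the documented DefaultChunkSize constant.
const defaultChunkSize = 4096

// parseChunkSize interprets the raw env value.
// Assumption: empty, non-numeric, or non-positive values fall back to the default.
func parseChunkSize(v string) int {
	if v == "" {
		return defaultChunkSize
	}
	n, err := strconv.Atoi(v)
	if err != nil || n <= 0 {
		return defaultChunkSize
	}
	return n
}

// chunkSizeFromEnv is a hypothetical stand-in for GetChunkSize.
func chunkSizeFromEnv() int {
	return parseChunkSize(os.Getenv("SNAP_MAX_CHUNK_SIZE"))
}

func main() {
	fmt.Println(chunkSizeFromEnv())
}
```

Smaller chunks produce more index entries and shorter scans per lookup; larger chunks keep the index smaller at the cost of decoding more events per lookup.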

func RandomDocument ¶

func RandomDocument(config RandomDocConfig) (*ir.Node, []string, error)

RandomDocument generates a random document with mixed structure. It returns the document as an ir.Node along with all paths that exist in it.

Types ¶

type Builder ¶

type Builder struct {
	// contains filtered or unexported fields
}

Builder writes snapshot files by consuming stream events. Automatically creates index entries at chunk boundaries.

func NewBuilder ¶

func NewBuilder(w W, index *Index, patches []*ir.Node) (*Builder, error)

NewBuilder creates a snapshot builder writing to w. Populates the provided index as events are written.

func (*Builder) Close ¶

func (b *Builder) Close() error

func (*Builder) WriteEvent ¶

func (b *Builder) WriteEvent(ev *stream.Event) error

WriteEvent writes an event to the snapshot. Creates index entries when chunk size threshold is reached.
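One way to picture chunk-threshold indexing: track the running byte offset and add an entry only when at least one chunk's worth of bytes has been written since the last entry. The sketch below is a guess at the general shape, not the Builder's actual policy; the types entry and indexer are hypothetical, and event sizes are passed in directly instead of being measured from encoded events.

```go
package main

import "fmt"

// entry is a simplified stand-in for IndexEntry.
type entry struct {
	Path   string
	Offset int64
}

// indexer records at most one entry per chunkSize bytes of event data.
type indexer struct {
	chunkSize   int64
	offset      int64 // bytes of event data seen so far
	lastIndexed int64 // offset of the most recent entry
	entries     []entry
}

// add accounts for one encoded event of n bytes located at path.
// The first event is always indexed; later ones only after a full chunk.
func (ix *indexer) add(path string, n int64) {
	if len(ix.entries) == 0 || ix.offset-ix.lastIndexed >= ix.chunkSize {
		ix.entries = append(ix.entries, entry{Path: path, Offset: ix.offset})
		ix.lastIndexed = ix.offset
	}
	ix.offset += n
}

func main() {
	ix := &indexer{chunkSize: 10}
	for _, p := range []string{"a", "b", "c", "d", "e"} {
		ix.add(p, 4) // every event is 4 bytes in this toy run
	}
	fmt.Println(ix.entries)
}
```

Because entries are spaced at least chunkSize bytes apart, the number of entries is bounded by roughly the event stream size divided by the chunk size, which is what keeps the index size-bound.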

type Index ¶

type Index struct {
	Entries []IndexEntry // Ordered by Offset
}

Index maps kinded paths to event stream offsets. Entries are in document order (sorted for objects, sequential for arrays).

func OpenIndex ¶

func OpenIndex(r io.Reader, size int) (*Index, error)

OpenIndex reads an index of size bytes from r.

func (*Index) EstimatedSize ¶

func (idx *Index) EstimatedSize() int64

EstimatedSize returns an estimate of the index size in bytes.

func (*Index) FromTony ¶

func (s *Index) FromTony(data []byte, opts ...gomap.UnmapOption) error

FromTony parses Tony format bytes and populates Index.

func (*Index) FromTonyIR ¶

func (s *Index) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error

FromTonyIR populates Index from a Tony IR node.

func (*Index) Lookup ¶

func (idx *Index) Lookup(kp string) (index int, err error)

Lookup finds the index entry at or before path kp in document order. Returns the largest index i where Entries[i].Path <= kp. Returns 0 if kp comes before all indexed paths.
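The "largest i with Entries[i].Path <= kp" rule maps naturally onto a binary search. The sketch below assumes paths are ordered by plain string comparison, which is a simplification: the real index uses document order over kinded paths. The function name lookup is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// lookup returns the largest i such that paths[i] <= kp, or 0 when kp
// sorts before every path. paths must be sorted ascending.
// Assumption: plain string order stands in for kinded-path document order.
func lookup(paths []string, kp string) int {
	// sort.Search finds the first index whose path is strictly greater than kp.
	i := sort.Search(len(paths), func(i int) bool { return paths[i] > kp })
	if i == 0 {
		return 0
	}
	return i - 1
}

func main() {
	paths := []string{"a", "c", "e"}
	for _, kp := range []string{"0", "a", "d", "z"} {
		fmt.Println(kp, lookup(paths, kp))
	}
}
```

Note that a result of 0 is ambiguous under this rule: it can mean either that the first entry matches or that kp precedes all entries, so callers start decoding from the first entry in both cases.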

func (*Index) ToTony ¶

func (s *Index) ToTony(opts ...gomap.MapOption) ([]byte, error)

ToTony converts Index to Tony format bytes.

func (*Index) ToTonyIR ¶

func (s *Index) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)

ToTonyIR converts Index to a Tony IR node.

type IndexEntry ¶

type IndexEntry struct {
	Path   *Path // Kinded path (e.g., "a.b[0]", "users.123.name")
	Offset int64 // Byte offset in event stream
	Size   int64 `tony:"omit"`
}

IndexEntry maps a kinded path to its byte offset in the event stream.

tony:schemagen=index-entry

func (*IndexEntry) FromTony ¶

func (s *IndexEntry) FromTony(data []byte, opts ...gomap.UnmapOption) error

FromTony parses Tony format bytes and populates IndexEntry.

func (*IndexEntry) FromTonyIR ¶

func (s *IndexEntry) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error

FromTonyIR populates IndexEntry from a Tony IR node.

func (*IndexEntry) ToTony ¶

func (s *IndexEntry) ToTony(opts ...gomap.MapOption) ([]byte, error)

ToTony converts IndexEntry to Tony format bytes.

func (*IndexEntry) ToTonyIR ¶

func (s *IndexEntry) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)

ToTonyIR converts IndexEntry to a Tony IR node.

type Path ¶

type Path struct {
	kpath.KPath
}

func (*Path) MarshalText ¶

func (p *Path) MarshalText() ([]byte, error)

func (*Path) String ¶

func (p *Path) String() string

func (*Path) UnmarshalText ¶

func (p *Path) UnmarshalText(d []byte) error

type PathFinder ¶

type PathFinder struct {
	R io.ReadSeekCloser
	// contains filtered or unexported fields
}

PathFinder seeks to an indexed offset and extracts events for a target path.

Uses stream.KPathState to initialize state for the indexed path. For leaf array elements, KPathState positions one element before, so processing the first event at the offset advances to the correct position.

func NewPathFinder ¶

func NewPathFinder(r io.ReadSeekCloser, index *Index, off int64, idxPath, desPath *kpath.KPath, eventSize int64) (*PathFinder, error)

NewPathFinder creates a PathFinder starting at offset off (indexed at idxPath) to find desPath.

Initializes state using stream.KPathState(idxPath), which positions correctly for reading events starting at off. For field and sparse array entries, advances state past the key by processing a dummy null event. index is the snapshot index, used to determine chunk boundaries for buffering. eventSize is the total size of the event stream, used to prevent reading past the event stream into the index section.

func (*PathFinder) FindEvents ¶

func (pf *PathFinder) FindEvents() ([]stream.Event, error)

FindEvents extracts events for the desired path from the snapshot. Buffers chunks for efficient I/O, reading additional chunks as needed.

type R ¶

type R interface {
	io.ReadSeekCloser
}

type RandomDocConfig ¶

type RandomDocConfig struct {
	// MinSize and MaxSize control the approximate size range in bytes
	MinSize int
	MaxSize int

	// MaxDepth controls maximum nesting depth
	MaxDepth int

	// ObjectFieldProbability is probability (0.0-1.0) that a container will be an object vs array
	ObjectFieldProbability float64

	// ContainerProbability is probability (0.0-1.0) that a value will be a container vs primitive
	ContainerProbability float64

	// StringLengthRange controls string value lengths
	StringLengthMin int
	StringLengthMax int

	// Seed for random number generator (0 means use current time)
	Seed int64
}

RandomDocConfig configures random document generation.

func DefaultRandomDocConfig ¶

func DefaultRandomDocConfig() RandomDocConfig

DefaultRandomDocConfig returns a reasonable default configuration.

type Snapshot ¶

type Snapshot struct {
	R         io.ReadSeekCloser
	Index     *Index
	EventSize uint64 // Size of event stream in bytes

}

Snapshot is an opened snapshot file providing random access to paths.

func Open ¶

func Open(rc R) (*Snapshot, error)

Open reads a snapshot from rc. The index is loaded into memory; events are read on demand.

func (*Snapshot) Close ¶

func (s *Snapshot) Close() error

func (*Snapshot) ReadPath ¶

func (s *Snapshot) ReadPath(p string) (*ir.Node, error)

ReadPath reads the IR node at path p. Returns nil if the path is not found.

type W ¶

type W interface {
	io.WriteCloser
	io.Seeker
}
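Both R and W are small combinations of standard io interfaces, so an *os.File satisfies each of them. The sketch below redeclares the two interfaces locally (so it compiles without importing the package) and checks this at compile time; roundTrip is a hypothetical helper used only for demonstration.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Local copies of the documented interfaces, redeclared for a standalone example.
type W interface {
	io.WriteCloser
	io.Seeker
}

type R interface {
	io.ReadSeekCloser
}

// Compile-time checks: *os.File can serve as both writer and reader.
var (
	_ W = (*os.File)(nil)
	_ R = (*os.File)(nil)
)

// roundTrip writes msg through a W, rewinds, and reads it back through an R.
func roundTrip(msg string) (string, error) {
	f, err := os.CreateTemp("", "snap-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())

	var w W = f
	if _, err := w.Write([]byte(msg)); err != nil {
		w.Close()
		return "", err
	}
	if _, err := w.Seek(0, io.SeekStart); err != nil {
		w.Close()
		return "", err
	}
	var r R = f
	data, err := io.ReadAll(r)
	if cerr := r.Close(); err == nil {
		err = cerr
	}
	return string(data), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```

The Seeker requirement on W follows from the build procedure, which has to return to the start of the file to fill in the header sizes.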
