snap

package

Version: v0.0.13 Published: Dec 17, 2025 License: Apache-2.0 Imports: 15 Imported by: 0

Repository

github.com/signadot/tony-format

README ¶

Event-Based Snapshot Design

Overview

This package provides event-based snapshot storage for logd. Snapshots store events directly from the stream package, with a size-bound index for efficient path lookups.

Design Goals

Store events directly: Snapshots contain a stream of stream.Event values, not IR nodes
Size-bound index: The index fits in memory even for very large snapshots
Efficient path access: Look up paths without reading the entire snapshot
Streaming reads: Support reading paths without full document reconstruction

Architecture

Snapshot Format

[8 bytes: event stream size (uint64, big-endian)]
[4 bytes: index size (uint32, big-endian)]
[event stream bytes]
[index bytes]

Event stream size: 8-byte uint64 indicating size of event stream in bytes
Index size: 4-byte uint32 indicating size of index in bytes
Event stream: Sequence of stream.Event values encoded in Tony format
Index: Size-bound ordered list of paths, each paired with a byte offset into the event stream

The sizes are written first (as placeholders, then updated), followed by the event stream and index.
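
The 12-byte header described above can be sketched with encoding/binary. This is a minimal illustration, not the package's own code: encodeHeader and decodeHeader are hypothetical helper names, and only the layout (8-byte big-endian uint64, then 4-byte big-endian uint32) comes from the format description.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize matches the layout described above:
// 8 bytes of event stream size followed by 4 bytes of index size.
const headerSize = 12

// encodeHeader is a hypothetical helper that packs both sizes big-endian.
func encodeHeader(eventSize uint64, indexSize uint32) []byte {
	buf := make([]byte, headerSize)
	binary.BigEndian.PutUint64(buf[0:8], eventSize)
	binary.BigEndian.PutUint32(buf[8:12], indexSize)
	return buf
}

// decodeHeader is the inverse of encodeHeader.
func decodeHeader(buf []byte) (eventSize uint64, indexSize uint32, err error) {
	if len(buf) < headerSize {
		return 0, 0, errors.New("short header")
	}
	return binary.BigEndian.Uint64(buf[0:8]), binary.BigEndian.Uint32(buf[8:12]), nil
}

func main() {
	hdr := encodeHeader(1024, 64)
	ev, idx, err := decodeHeader(hdr)
	if err != nil {
		panic(err)
	}
	// The index begins right after the header and the event stream.
	fmt.Println("event bytes:", ev, "index bytes:", idx, "index starts at:", headerSize+int64(ev))
}
```

Because both sizes sit at fixed offsets, a reader can locate the index with one small read and a single addition.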

Index Structure

The index is a list of kpaths in order of the stream events, each associated with an offset in the event data:

IndexEntry: Contains a kpath (string) and offset (int64)
Entries: Ordered list of IndexEntry values, sorted by offset
Ancestor lookup: If exact path not indexed, find nearest ancestor
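
A rough sketch of the ancestor lookup idea follows. It assumes paths are plain strings in which a child extends its parent with either ".field" or "[index]", which simplifies the real kpath syntax; the entry type and both function names are hypothetical stand-ins, and the linear scan is chosen for clarity rather than speed.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical stand-in for IndexEntry with a plain string path.
type entry struct {
	Path   string
	Offset int64
}

// isAncestorOrSelf reports whether anc equals p or is a prefix of p that ends
// at a segment boundary ('.' for fields, '[' for array indexes). The empty
// path is treated as the document root. Assumption: simplified kpath syntax.
func isAncestorOrSelf(anc, p string) bool {
	if anc == "" || anc == p {
		return true
	}
	if !strings.HasPrefix(p, anc) {
		return false
	}
	next := p[len(anc)] // safe: p is strictly longer than anc here
	return next == '.' || next == '['
}

// nearestAncestor returns the index of the deepest indexed ancestor-or-self
// of p, or -1 if no entry qualifies.
func nearestAncestor(entries []entry, p string) int {
	best := -1
	for i, e := range entries {
		if isAncestorOrSelf(e.Path, p) && (best < 0 || len(e.Path) >= len(entries[best].Path)) {
			best = i
		}
	}
	return best
}

func main() {
	entries := []entry{{"", 0}, {"users", 10}, {"users.123", 40}, {"zones", 900}}
	i := nearestAncestor(entries, "users.123.name")
	fmt.Println(entries[i].Path, entries[i].Offset)
}
```

Decoding would then start at the returned entry's offset and skip forward until the requested path is reached.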

Building Snapshots

1. Write placeholder sizes at beginning (12 bytes: 8 for event stream size, 4 for index size)
2. Process IR node or event stream
3. Convert to events (if starting from IR node)
4. Write events to snapshot file
5. Build size-bound index while processing events
6. After all events written:
- Write index bytes
- Seek back to beginning and update event stream size and index size
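
The placeholder-then-patch sequence above can be sketched as follows. This is an illustration under stated assumptions, not the Builder implementation: writeSnapshot and buildTemp are hypothetical names, and events and index are treated as opaque, already-encoded byte slices rather than stream.Event values.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

// writeSnapshot is a hypothetical sketch of the build steps.
func writeSnapshot(w io.WriteSeeker, events [][]byte, index []byte) error {
	// Reserve the 12-byte header with zeroed placeholder sizes.
	var hdr [12]byte
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	// Write the events, tracking the total event stream size.
	var eventSize uint64
	for _, ev := range events {
		n, err := w.Write(ev)
		if err != nil {
			return err
		}
		eventSize += uint64(n)
	}
	// Append the index after the event stream.
	if _, err := w.Write(index); err != nil {
		return err
	}
	// Seek back and overwrite the placeholders with the real sizes.
	binary.BigEndian.PutUint64(hdr[0:8], eventSize)
	binary.BigEndian.PutUint32(hdr[8:12], uint32(len(index)))
	if _, err := w.Seek(0, io.SeekStart); err != nil {
		return err
	}
	_, err := w.Write(hdr[:])
	return err
}

// buildTemp writes a tiny snapshot to a temporary file and returns its bytes.
func buildTemp() ([]byte, error) {
	f, err := os.CreateTemp("", "snap-sketch-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if err := writeSnapshot(f, [][]byte{[]byte("ev1"), []byte("ev22")}, []byte("idx")); err != nil {
		return nil, err
	}
	return os.ReadFile(f.Name())
}

func main() {
	data, err := buildTemp()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, header=% x\n", len(data), data[:12])
}
```

Writing the sizes last means the writer never has to know the event stream length up front, at the cost of requiring a seekable destination.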

Reading Snapshots

1. Read sizes from beginning of file:
- Read event stream size (8 bytes, uint64)
- Read index size (4 bytes, uint32)
2. Read event stream (starting at offset 12, for event stream size bytes)
3. Read index (starting after event stream, for index size bytes)
4. Parse index structure
5. For path lookup:
- Find path or nearest ancestor in index
- Seek to offset in event stream
- Decode events from that offset
- Reconstruct path value from events
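
The read side can be sketched with io.SectionReader, which bounds each region so a reader cannot run from the event stream into the index. The sections type and the helper names are hypothetical, and decoding the index or the events themselves is out of scope here; the sketch only locates the two regions and seeks within the event stream.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// sections is a hypothetical view over a snapshot's two regions.
type sections struct {
	Events *io.SectionReader
	Index  *io.SectionReader
}

// openSections reads the 12-byte header and returns bounded readers for the
// event stream (starting at offset 12) and the index (right after it).
func openSections(r io.ReaderAt) (*sections, error) {
	var hdr [12]byte
	if _, err := r.ReadAt(hdr[:], 0); err != nil {
		return nil, fmt.Errorf("read header: %w", err)
	}
	eventSize := int64(binary.BigEndian.Uint64(hdr[0:8]))
	indexSize := int64(binary.BigEndian.Uint32(hdr[8:12]))
	return &sections{
		Events: io.NewSectionReader(r, 12, eventSize),
		Index:  io.NewSectionReader(r, 12+eventSize, indexSize),
	}, nil
}

// readEventsFrom seeks to off (relative to the event stream start, as index
// offsets are) and returns the remaining event bytes.
func (s *sections) readEventsFrom(off int64) ([]byte, error) {
	if _, err := s.Events.Seek(off, io.SeekStart); err != nil {
		return nil, err
	}
	return io.ReadAll(s.Events)
}

// sampleSnapshot builds header + "ev1ev22" + "idx" in memory.
func sampleSnapshot() []byte {
	events, index := []byte("ev1ev22"), []byte("idx")
	buf := make([]byte, 12, 12+len(events)+len(index))
	binary.BigEndian.PutUint64(buf[0:8], uint64(len(events)))
	binary.BigEndian.PutUint32(buf[8:12], uint32(len(index)))
	buf = append(buf, events...)
	return append(buf, index...)
}

// readSample returns the event bytes from off and the raw index bytes.
func readSample(off int64) (string, string, error) {
	s, err := openSections(bytes.NewReader(sampleSnapshot()))
	if err != nil {
		return "", "", err
	}
	ev, err := s.readEventsFrom(off)
	if err != nil {
		return "", "", err
	}
	idx, err := io.ReadAll(s.Index)
	return string(ev), string(idx), err
}

func main() {
	ev, idx, err := readSample(3)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev, idx)
}
```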

Implementation Status

Basic index structure (Index, IndexEntry)
Event-based snapshot writer
Index builder (records paths and offsets as events are processed)
Basic snapshot reader with path lookup

Migration from IR-Node-Based Snapshots

The previous IR-node-based implementation (with !snap-loc, !snap-range, !snap-chunks) has been archived in archive/. The new design:

Simpler: No chunking logic, just events + index
More efficient: Direct event storage, no IR node conversion overhead
Better scaling: Size-bound index works for arbitrarily large snapshots

Documentation ¶

Overview ¶

Package snap provides event-based snapshot storage for logd.

This package is being redesigned to store events directly from the stream package, with a size-bound index into those events. This allows efficient storage and retrieval of large snapshots without loading entire documents into memory.

Design Principles:

Store events directly (from stream.Event) rather than IR nodes
Maintain a size-bound index for efficient path lookups
Support streaming reads without full document reconstruction

The previous IR-node-based implementation has been archived in internal/snap/archive/.

Index ¶

Constants
func EncodeRandomDocument(doc *ir.Node, enc *stream.Encoder) error
func GetChunkSize() int
func RandomDocument(config RandomDocConfig) (*ir.Node, []string, error)
type Builder
- func NewBuilder(w W, index *Index, patches []*ir.Node) (*Builder, error)
- func (b *Builder) Close() error
- func (b *Builder) WriteEvent(ev *stream.Event) error
type Index
- func OpenIndex(r io.Reader, size int) (*Index, error)
- func (idx *Index) EstimatedSize() int64
- func (s *Index) FromTony(data []byte, opts ...gomap.UnmapOption) error
- func (s *Index) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error
- func (idx *Index) Lookup(kp string) (index int, err error)
- func (s *Index) ToTony(opts ...gomap.MapOption) ([]byte, error)
- func (s *Index) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)
type IndexEntry
- func (s *IndexEntry) FromTony(data []byte, opts ...gomap.UnmapOption) error
- func (s *IndexEntry) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error
- func (s *IndexEntry) ToTony(opts ...gomap.MapOption) ([]byte, error)
- func (s *IndexEntry) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)
type Path
- func (p *Path) MarshalText() ([]byte, error)
- func (p *Path) String() string
- func (p *Path) UnmarshalText(d []byte) error
type PathFinder
- func NewPathFinder(r io.ReadSeekCloser, off int64, idxPath, desPath *kpath.KPath) (*PathFinder, error)
- func (pf *PathFinder) FindEvents() ([]stream.Event, error)
type R
type RandomDocConfig
- func DefaultRandomDocConfig() RandomDocConfig
type Snapshot
- func Open(rc R) (*Snapshot, error)
- func (s *Snapshot) Close() error
- func (s *Snapshot) ReadPath(p string) (*ir.Node, error)
type W

Constants ¶

const (
	DefaultChunkSize = 4096
	HeaderSize       = 12
)

Variables ¶

This section is empty.

Functions ¶

func EncodeRandomDocument ¶

func EncodeRandomDocument(doc *ir.Node, enc *stream.Encoder) error

EncodeRandomDocument encodes a random document to events using the stream encoder.

func GetChunkSize ¶

func GetChunkSize() int

GetChunkSize returns the maximum chunk size for indexing. It defaults to DefaultChunkSize (4096) if the SNAP_MAX_CHUNK_SIZE environment variable is not set. This allows tests to use smaller chunk sizes to exercise chunk boundary conditions.
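
The environment override described here follows a common pattern, sketched below. The function names are hypothetical, and the fallback to the default on an unparsable or non-positive value is an assumption; the documentation only specifies the behavior when the variable is unset.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultChunkSize mirrors the documented default of 4096.
const defaultChunkSize = 4096

// parseChunkSize interprets the raw variable value.
// Assumption: empty, invalid, or non-positive values fall back to the default.
func parseChunkSize(v string) int {
	if v == "" {
		return defaultChunkSize
	}
	n, err := strconv.Atoi(v)
	if err != nil || n <= 0 {
		return defaultChunkSize
	}
	return n
}

// chunkSizeFromEnv is a hypothetical stand-in for GetChunkSize.
func chunkSizeFromEnv() int {
	return parseChunkSize(os.Getenv("SNAP_MAX_CHUNK_SIZE"))
}

func main() {
	fmt.Println(chunkSizeFromEnv())
}
```

A test that wants to hit chunk boundaries with small documents could set SNAP_MAX_CHUNK_SIZE to a small value such as 64 before building a snapshot.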

func RandomDocument ¶

func RandomDocument(config RandomDocConfig) (*ir.Node, []string, error)

RandomDocument generates a random document with mixed structure. It returns the document as an ir.Node along with all paths that exist in it.

Types ¶

type Builder ¶

type Builder struct {
	// contains filtered or unexported fields
}

func NewBuilder ¶

func NewBuilder(w W, index *Index, patches []*ir.Node) (*Builder, error)

func (*Builder) Close ¶

func (b *Builder) Close() error

func (*Builder) WriteEvent ¶

func (b *Builder) WriteEvent(ev *stream.Event) error

WriteEvent processes a single event, writing it to the snapshot and updating state/index.

type Index ¶

type Index struct {
	// Entries is a list of indexed paths in order of appearance in the event stream.
	// Entries are ordered by their Offset values.
	Entries []IndexEntry
}

Index is an index into event-based snapshots. It contains a list of kpaths in order of the stream events, each associated with an offset in the event data.

func OpenIndex ¶

func OpenIndex(r io.Reader, size int) (*Index, error)

OpenIndex reads an index of size bytes from r.

func (*Index) EstimatedSize ¶

func (idx *Index) EstimatedSize() int64

EstimatedSize returns an estimate of the index size in bytes.

func (*Index) FromTony ¶

func (s *Index) FromTony(data []byte, opts ...gomap.UnmapOption) error

FromTony parses Tony format bytes and populates Index.

func (*Index) FromTonyIR ¶

func (s *Index) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error

FromTonyIR populates Index from a Tony IR node.

func (*Index) Lookup ¶

func (idx *Index) Lookup(kp string) (index int, err error)

Lookup finds the entry for the given path, or the entry just before it in sorted (document) order. Since document order for objects is sorted order, entries are sorted by path. Returns the entry just before where the path would be inserted, which should be an ancestor or a sibling that comes before the requested path.

If the target path comes before all indexed entries, returns the first entry (index 0). The caller should check if the returned entry is actually before or at the target path.
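
The "entry just before the insertion point" behavior can be sketched with sort.Search. This is not the package's Lookup: it assumes paths compare as plain strings, which may differ from the real kpath ordering, and the names lookup and atOrBefore are hypothetical. Returning -1 for an empty index is also an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// lookup returns the index of the last path that sorts at or before target.
// If target sorts before every entry it returns 0, so the caller must verify
// the result with atOrBefore. Assumption: returns -1 for an empty slice.
func lookup(paths []string, target string) int {
	if len(paths) == 0 {
		return -1
	}
	// i is the first index whose path sorts strictly after target.
	i := sort.Search(len(paths), func(i int) bool { return paths[i] > target })
	if i == 0 {
		return 0
	}
	return i - 1
}

// atOrBefore is the caller-side check mentioned in the documentation.
func atOrBefore(paths []string, i int, target string) bool {
	return i >= 0 && i < len(paths) && paths[i] <= target
}

func main() {
	paths := []string{"a", "a.b", "a.c", "d"}
	for _, target := range []string{"a.b", "a.b.x", "b", "0"} {
		i := lookup(paths, target)
		fmt.Println(target, "->", paths[i], atOrBefore(paths, i, target))
	}
}
```

Note that the predecessor is not always an ancestor: for the target "b" the sketch returns "a.c", an earlier entry from a different subtree, which still gives a valid offset to start scanning forward from.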

func (*Index) ToTony ¶

func (s *Index) ToTony(opts ...gomap.MapOption) ([]byte, error)

ToTony converts Index to Tony format bytes.

func (*Index) ToTonyIR ¶

func (s *Index) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)

ToTonyIR converts Index to a Tony IR node.

type IndexEntry ¶

type IndexEntry struct {
	Path   *Path // Kinded path (e.g., "a.b[0]", "users.123.name")
	Offset int64 // Byte offset in the event stream where this path appears
	Size   int64 `tony:"omit"`
}

IndexEntry represents a single entry in the snapshot index. Each entry maps a kinded path to its byte offset in the event stream.

tony:schemagen=index-entry

func (*IndexEntry) FromTony ¶

func (s *IndexEntry) FromTony(data []byte, opts ...gomap.UnmapOption) error

FromTony parses Tony format bytes and populates IndexEntry.

func (*IndexEntry) FromTonyIR ¶

func (s *IndexEntry) FromTonyIR(node *ir.Node, opts ...gomap.UnmapOption) error

FromTonyIR populates IndexEntry from a Tony IR node.

func (*IndexEntry) ToTony ¶

func (s *IndexEntry) ToTony(opts ...gomap.MapOption) ([]byte, error)

ToTony converts IndexEntry to Tony format bytes.

func (*IndexEntry) ToTonyIR ¶

func (s *IndexEntry) ToTonyIR(opts ...gomap.MapOption) (*ir.Node, error)

ToTonyIR converts IndexEntry to a Tony IR node.

type Path ¶

type Path struct {
	kpath.KPath
}

func (*Path) MarshalText ¶

func (p *Path) MarshalText() ([]byte, error)

func (*Path) String ¶

func (p *Path) String() string

func (*Path) UnmarshalText ¶

func (p *Path) UnmarshalText(d []byte) error

type PathFinder ¶

type PathFinder struct {
	R io.ReadSeekCloser
	// contains filtered or unexported fields
}

func NewPathFinder ¶

func NewPathFinder(r io.ReadSeekCloser, off int64, idxPath, desPath *kpath.KPath) (*PathFinder, error)

func (*PathFinder) FindEvents ¶

func (pf *PathFinder) FindEvents() ([]stream.Event, error)

FindEvents reads events from the snapshot and returns only those events that correspond to the desired path.

type R ¶

type R interface {
	io.ReadSeekCloser
}

type RandomDocConfig ¶

type RandomDocConfig struct {
	// MinSize and MaxSize control the approximate size range in bytes
	MinSize int
	MaxSize int

	// MaxDepth controls maximum nesting depth
	MaxDepth int

	// ObjectFieldProbability is probability (0.0-1.0) that a container will be an object vs array
	ObjectFieldProbability float64

	// ContainerProbability is probability (0.0-1.0) that a value will be a container vs primitive
	ContainerProbability float64

	// StringLengthRange controls string value lengths
	StringLengthMin int
	StringLengthMax int

	// Seed for random number generator (0 means use current time)
	Seed int64
}

RandomDocConfig configures random document generation.

func DefaultRandomDocConfig ¶

func DefaultRandomDocConfig() RandomDocConfig

DefaultRandomDocConfig returns a reasonable default configuration.

type Snapshot ¶

type Snapshot struct {
	R         io.ReadSeekCloser
	Index     *Index
	EventSize uint64 // Size of event stream in bytes

}

func Open ¶

func Open(rc R) (*Snapshot, error)

func (*Snapshot) Close ¶

func (s *Snapshot) Close() error

func (*Snapshot) ReadPath ¶

func (s *Snapshot) ReadPath(p string) (*ir.Node, error)

ReadPath reads a specific path from the snapshot. Returns nil if the path is not found.

type W ¶

type W interface {
	io.WriteCloser
	io.Seeker
}
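
W combines io.WriteCloser and io.Seeker, so an *os.File satisfies it directly. For tests that should not touch disk, an in-memory implementation is a small amount of code. The sketch below is hypothetical (memFile and writeSeekCloser are not part of this package); writeSeekCloser simply mirrors W's method set so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// writeSeekCloser mirrors the method set of W.
type writeSeekCloser interface {
	io.WriteCloser
	io.Seeker
}

// memFile is a hypothetical in-memory, seekable write target.
type memFile struct {
	buf []byte
	pos int64
}

// Write copies p at the current position, growing the buffer as needed.
func (m *memFile) Write(p []byte) (int, error) {
	end := m.pos + int64(len(p))
	if end > int64(len(m.buf)) {
		m.buf = append(m.buf, make([]byte, end-int64(len(m.buf)))...)
	}
	copy(m.buf[m.pos:end], p)
	m.pos = end
	return len(p), nil
}

// Seek moves the write position, following io.Seeker semantics.
func (m *memFile) Seek(offset int64, whence int) (int64, error) {
	var base int64
	switch whence {
	case io.SeekStart:
		base = 0
	case io.SeekCurrent:
		base = m.pos
	case io.SeekEnd:
		base = int64(len(m.buf))
	default:
		return 0, errors.New("memFile: invalid whence")
	}
	if base+offset < 0 {
		return 0, errors.New("memFile: negative position")
	}
	m.pos = base + offset
	return m.pos, nil
}

// Close is a no-op for the in-memory buffer.
func (m *memFile) Close() error { return nil }

// Compile-time check that memFile has the same method set as W.
var _ writeSeekCloser = (*memFile)(nil)

// demo writes a placeholder, appends a body, then patches the placeholder.
func demo() string {
	m := &memFile{}
	m.Write([]byte("....body"))
	m.Seek(0, io.SeekStart)
	m.Write([]byte("HEAD"))
	return string(m.buf)
}

func main() {
	fmt.Println(demo())
}
```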

Directories ¶

Path	Synopsis
archive
