snapshot

package
v0.1.0 Latest
Published: Jun 2, 2026 License: MIT Imports: 22 Imported by: 0

Documentation

Overview

Package snapshot serialises the durable on-disk representation of a gograph snapshot (CSR + LPG + schema) and reads it back into a fresh process.

A snapshot is a directory containing a manifest.json plus one binary file per kept-on-disk component. Publication is atomic on any POSIX filesystem: the writer assembles the new directory under a sibling .tmp path, fsyncs every file, then renames the .tmp directory to its final name. Concurrent readers continue using the previous directory until they re-open.

Example

Example writes a full (v3) snapshot of a labelled graph to a directory and loads it back into a fresh process view, inspecting the manifest and the parsed CSR readback.

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/FlavioCFOliveira/GoGraph/graph/adjlist"
	"github.com/FlavioCFOliveira/GoGraph/graph/csr"
	"github.com/FlavioCFOliveira/GoGraph/graph/lpg"
	"github.com/FlavioCFOliveira/GoGraph/store/snapshot"
)

func main() {
	dir, err := os.MkdirTemp("", "snapshot-example")
	if err != nil {
		panic(err)
	}
	defer func() { _ = os.RemoveAll(dir) }()

	// A small labelled, weighted graph plus its frozen CSR snapshot.
	g := lpg.New[string, int64](adjlist.Config{Directed: true})
	if err := g.AddEdge("alice", "bob", 7); err != nil {
		panic(err)
	}
	if err := g.SetNodeLabel("alice", "Person"); err != nil {
		panic(err)
	}
	c := csr.BuildFromAdjList(g.AdjList())

	// WriteSnapshotFull lays out csr.bin + labels.bin + properties.bin +
	// a manifest, and (because the graph is string-keyed) a mapper.bin,
	// stamping the manifest at v3. Publication is atomic.
	snapDir := filepath.Join(dir, "snapshot")
	if err := snapshot.WriteSnapshotFull(snapDir, c, g); err != nil {
		panic(err)
	}

	// Load it back: LoadSnapshotFull verifies every component's CRC and
	// returns the parsed readbacks.
	loaded, err := snapshot.LoadSnapshotFull(snapDir)
	if err != nil {
		panic(err)
	}
	// The readback exposes the parsed edge array and the interned label
	// strings. (CSR.Vertices is the dense row-pointer array sized by the
	// largest interned NodeID, not the live-node count, so it is not
	// asserted here.)
	fmt.Printf("manifest version=%d\n", loaded.Manifest.Version)
	fmt.Printf("csr edges=%d\n", len(loaded.CSR.Edges))
	fmt.Printf("label strings=%d\n", len(loaded.Labels.Strings))

}
Output:
manifest version=3
csr edges=1
label strings=1

Constants

const CSRFile = "csr.bin"

CSRFile is the conventional file name carrying the CSR triplet (vertices + edges + optional weights) inside a snapshot directory.

const EdgeHandlesFile = "edgehandles.bin"

EdgeHandlesFile is the conventional file name carrying the durable per-handle edge metadata (per-CREATE relationship type and properties keyed by the stable edge handle) inside a snapshot directory. It is a sibling of CSRFile and is referenced by an additional FileEntry in the manifest only when the writer emitted at least one record.

const IndexesDir = "indexes"

IndexesDir is the conventional sub-directory inside a v2 snapshot that holds one index.Serializer-encoded file per registered secondary index. The file name is <indexName>.bin; the manifest records the size and CRC32C of every entry under [Manifest.Indexes].

const LabelsFile = "labels.bin"

LabelsFile is the conventional file name carrying the durable LPG label state inside a v2 snapshot directory. It is a sibling of CSRFile and is referenced by an additional entry in the [Manifest.Files] slice.

const ManifestVersion = 3

ManifestVersion is the highest on-disk schema version this build understands. The current build writes version 3 manifests via WriteSnapshotFull when N=string (CSR + labels + properties + mapper, fully self-sufficient on load), version 2 manifests via the same writer for non-string N (CSR + labels + properties, requires WAL replay to reconstruct the natural-key mapper), and version 1 manifests via the legacy WriteSnapshotCSR code path (CSR-only snapshots). The loader transparently accepts all three.

const MapperFile = "mapper.bin"

MapperFile is the conventional file name carrying the durable (NodeID -> natural key) interning table inside a v3 snapshot directory. It is a sibling of CSRFile, LabelsFile and PropertiesFile and is referenced by an additional entry in [Manifest.Files] when the writer emitted it.

const PropertiesFile = "properties.bin"

PropertiesFile is the conventional file name carrying the durable LPG typed-property state inside a v2 snapshot directory. It is a sibling of CSRFile and LabelsFile and is referenced by an additional entry in [Manifest.Files] when the writer emitted any property at all.

const TombstonesFile = "tombstones.bin"

TombstonesFile is the conventional file name carrying the durable LPG node-tombstone set inside a snapshot directory. It is a sibling of CSRFile and is referenced by an additional entry in the [Manifest.Files] slice.

The component is OPTIONAL: the writer emits it only when the graph has at least one tombstoned node, so a snapshot of a graph that never deleted a node is byte-identical to one produced before this component existed. A snapshot without the component loads as an empty tombstone set — the backward-compatibility contract.

Forward compatibility is one-directional. A reader that predates this component ignores the unknown file name and so would resurrect deleted nodes; reopening a store written by a current binary with an older binary is therefore a downgrade hazard. Upgrades (older snapshot, newer binary) are always safe.

Variables

var ErrCSRCorrupted = errors.New("snapshot: csr.bin corrupted")

ErrCSRCorrupted is returned by ReadCSR when the csr.bin payload is structurally malformed: an implausible vertex/edge count, an out-of-range weight-element size, or a weights-array byte length that overflows. It mirrors the per-component corruption sentinels used by the sibling readers (ErrLabelsCorrupted, ErrPropertiesCorrupted, ErrMapperCorrupted) so callers can classify a corrupt CSR the same way. [readVerifiedCSR] / Open wrap it under ErrCorrupted.

var ErrCorrupted = errors.New("snapshot: directory corrupted")

ErrCorrupted is returned by Open when a component file CRC32C disagrees with the manifest, or when a referenced file is missing or shorter than expected.

var ErrEdgeHandlesCorrupted = errors.New("snapshot: edgehandles.bin corrupted")

ErrEdgeHandlesCorrupted is returned by ReadEdgeHandles when the edgehandles.bin file is structurally malformed (bad magic, unsupported version, implausible count, label/key index past its table, unknown property kind, or a truncated record).

var ErrLabelsCorrupted = errors.New("snapshot: labels.bin corrupted")

ErrLabelsCorrupted is returned by ReadLabels when the labels.bin file is structurally malformed (bad magic, truncated record, or a label-string index that points beyond the embedded string table).

var ErrManifestCorrupted = errors.New("snapshot: manifest corrupted")

ErrManifestCorrupted is returned when the manifest does not parse as JSON or its file list disagrees with what is on disk.

var ErrManifestUnsupported = errors.New("snapshot: manifest version unsupported")

ErrManifestUnsupported is returned by LoadManifest when the manifest version is newer than this build understands.

var ErrMapperApply = errors.New("snapshot: cannot apply mapper")

ErrMapperApply is returned by ApplyMapperToGraph when the supplied readback violates an invariant the writer is responsible for upholding (intra-shard gap, hash/shard mismatch, duplicate key, or a non-empty target mapper). It wraps the underlying [graph.ErrMapper…] sentinels so callers can branch on the typed cause via errors.Is.

var ErrMapperCorrupted = errors.New("snapshot: mapper.bin corrupted")

ErrMapperCorrupted is returned by ReadMapperString when the mapper.bin file is structurally malformed (bad magic, unsupported format version, truncated record, or an implausible length prefix).

var ErrPropertiesCorrupted = errors.New("snapshot: properties.bin corrupted")

ErrPropertiesCorrupted is returned by ReadProperties when the properties.bin file is structurally malformed (bad magic, truncated record, key index past the embedded key table, unknown kind, or a value length implausibly large).

var ErrTombstonesCorrupted = errors.New("snapshot: tombstones.bin corrupted")

ErrTombstonesCorrupted is returned by ReadTombstones when the tombstones.bin file is structurally malformed (bad magic, unsupported format version, implausible count, or a truncated record).

Functions

func ApplyCSRToGraph

func ApplyCSRToGraph[N comparable, W any](g *lpg.Graph[N, W], rb *CSRReadback) error

ApplyCSRToGraph replays the adjacency in rb into g. The pre-condition is that g's underlying mapper has already been populated with every NodeID referenced by rb, typically by an immediately preceding ApplyMapperToGraph (v3 snapshots) or by a WAL replay (v2 snapshots that pair with a WAL prefix). Records whose endpoints the mapper cannot resolve are skipped and counted via `store.snapshot.ApplyCSR.unresolved`; the function does not return an error for them, so a partial mapper degrades cleanly rather than aborting recovery midway.

Weight decoding is supported for the common int/float weight types (int8/uint8/bool, int16/uint16, int32/uint32/float32, int/uint/int64/uint64/float64/uintptr). Other W types apply zero weights; the metric `store.snapshot.ApplyCSR.weightFallback` reports the fallback count for observability.

ApplyCSRToGraph is idempotent against a freshly-loaded mapper but not against a graph that already contains edges: re-applying a CSR to a graph with existing edges may duplicate them in multigraph mode or no-op in simple-graph mode. Callers should run this exactly once per recovery, immediately after the mapper restore and before any WAL replay.

rb is passed by pointer to avoid copying the three slices in the readback (vertices, edges, weight bytes) on every call. The function does not mutate rb.

func ApplyEdgeHandlesToGraph

func ApplyEdgeHandlesToGraph[N comparable, W any](g *lpg.Graph[N, W], rb EdgeHandlesReadback)

ApplyEdgeHandlesToGraph replays rb into a live g, re-attaching every per-handle edge label and property keyed by its stable handle and the endpoint NodeID pair. It MUST run AFTER the mapper and CSR (with its handle column) are applied so the handle the record references is already live on the adjacency slot — though the per-handle metadata stores are keyed by (NodeID pair, handle) directly and do not require the adjacency edge to be present, so a record whose edge the CSR did not materialise is still re-attached harmlessly. The handle high-water counter is re-seeded for every record so a post-recovery edge creation never re-mints a live handle (invariant I5).

func ApplyLabelsToGraph

func ApplyLabelsToGraph[N comparable, W any](g *lpg.Graph[N, W], rb LabelsReadback) error

ApplyLabelsToGraph replays rb into a live g. The pre-condition is that g's underlying mapper has already been populated with every NodeID referenced by rb — typically by replaying the WAL prefix covered by the snapshot, or by re-issuing the original AddNode / AddEdge calls. Records whose NodeID cannot be resolved by the mapper are skipped and counted via the `store.snapshot.ApplyLabels.unresolved` metric counter; the function does not return an error for them so a partial mapper degrades cleanly rather than aborting recovery mid-way.

Edge label records whose endpoints are resolvable but whose edge is absent from the adjacency list (e.g., the CSR was not yet applied) are likewise skipped and counted under `store.snapshot.ApplyLabels.edgeMissing`; this matches lpg.Graph.SetEdgeLabel's own no-op-on-missing-edge contract.

func ApplyMapperToGraph

func ApplyMapperToGraph[N comparable, W any](g *lpg.Graph[N, W], rb MapperReadback) error

ApplyMapperToGraph rebuilds g's underlying graph.Mapper from the snapshot readback. It is only meaningful for string-keyed graphs: any other N type returns nil without touching g, because no v3 mapper.bin is ever produced for non-string graphs. The caller is expected to invoke this function before ApplyCSRToGraph, ApplyLabelsToGraph, or ApplyPropertiesToGraph so subsequent resolution calls see the restored interning table.

Pre-condition: g must hold a fresh (empty) mapper. Calling on a graph that already has interned values returns ErrMapperApply wrapping graph.ErrMapperNotEmpty so the caller can distinguish a programmer error from a corruption error.

Concurrency: ApplyMapperToGraph is not safe to call concurrently with mutations or reads on g. It is intended for the one-shot snapshot-load phase of recovery.

func ApplyMapperToGraphWithCodec

func ApplyMapperToGraphWithCodec[N comparable, W any](g *lpg.Graph[N, W], rb MapperReadback, codec keyDecoder[N]) error

ApplyMapperToGraphWithCodec rebuilds g's underlying graph.Mapper from a version-2 (codec) snapshot readback for ANY comparable key type N. Each MapperRawPair carries the codec-encoded key bytes the snapshot writer produced via WriteMapper; this function decodes them back into N via the supplied codec (the same one the store uses on the WAL) and seeds the interning table through graph.Mapper.LoadFrom.

It is the codec-aware dual of ApplyMapperToGraph: recovery calls this when the loaded readback carries RawPairs (non-string keys) and the string-specialised path when it carries Pairs. An empty readback is a no-op.

Pre-condition and concurrency contract match ApplyMapperToGraph: g must hold a fresh (empty) mapper, and the call must not race with any other access to g. A decode failure surfaces as ErrMapperApply wrapping the codec error; a structural violation surfaces as ErrMapperApply wrapping the relevant [graph.ErrMapper…] sentinel.

func ApplyPropertiesToGraph

func ApplyPropertiesToGraph[N comparable, W any](g *lpg.Graph[N, W], rb PropertiesReadback) error

ApplyPropertiesToGraph replays rb into a live g. The pre-condition is that g's underlying mapper has already been populated with every NodeID referenced by rb — typically by replaying the WAL prefix covered by the snapshot, or by re-issuing the original AddNode / AddEdge calls. Records whose NodeID cannot be resolved by the mapper are skipped and counted via the `store.snapshot.ApplyProperties.unresolved` metric counter; the function does not return an error for them so a partial mapper degrades cleanly rather than aborting recovery mid-way.

Edge property records whose endpoints are resolvable but whose edge is absent from the adjacency list (e.g., the CSR was not yet applied) are likewise skipped and counted under `store.snapshot.ApplyProperties.edgeMissing`; this matches lpg.Graph.SetEdgeProperty's own no-op-on-missing-edge contract.

func ApplyTombstonesToGraph

func ApplyTombstonesToGraph[N comparable, W any](g *lpg.Graph[N, W], rb TombstonesReadback)

ApplyTombstonesToGraph replays rb into a live g, re-tombstoning every NodeID the snapshot recorded as removed. It must run AFTER the snapshot nodes are loaded (mapper + CSR) so the ids it restores reference the same stable slots; it re-tombstones by id directly via lpg.Graph.RestoreTombstones and so does not require the natural keys to be resolvable.

A later WAL re-create (OpAddNode) for any of these ids still revives it, preserving the chronology of a delete→recreate cycle that straddles the snapshot boundary.

func WriteCSR

func WriteCSR[W any](w io.Writer, c *csr.CSR[W]) (size int64, crc uint32, err error)

WriteCSR serialises c to w, returning the number of bytes written and the CRC32C of the serialised payload. The on-disk layout is:

uint64 nVertices      (little-endian)
uint64 nEdges
uint8  hasWeights     (1 = weights array present)
uint8  weightSizeBytes (0 when hasWeights = 0)
[vertices]            (nVertices * 8 bytes)
[edges]               (nEdges * 8 bytes)
[weights]             (nEdges * weightSizeBytes bytes, when present)
uint8  hasHandles      (OPTIONAL trailing block; 1 = handle array present)
[handles]             (nEdges * 8 bytes, when hasHandles = 1)

The trailing handles block (Stage 2 of the stable-edge-handle work) is emitted ONLY when the source CSR carries a per-slot handle column (csr.CSR.HandlesSlice != nil). A graph that never used AddEdgeH produces no trailing block, so its csr.bin is byte-identical to one written before this column existed — the v1 golden and the cross-process byte-equality fixtures are unaffected. [readCSRLimited] detects the block by attempting to read one more byte after the weights array: present → handles follow, EOF → none (the backward-compatible read branch).

func WriteEdgeHandles

func WriteEdgeHandles[N comparable, W any](w io.Writer, g *lpg.Graph[N, W]) (size int64, crc uint32, emitted bool, err error)

WriteEdgeHandles serialises every per-handle edge label and property attached to g into w in the edgehandles.bin format. It returns the number of bytes written and the CRC32C of the serialised payload; both are stored in the manifest's FileEntry so LoadSnapshotFull can verify integrity at load time. When the graph carries no per-handle metadata it returns (0, 0, false, nil), that is, emitted is false and err is nil, signalling the caller to omit the component entirely.

Records are emitted in the deterministic order lpg.Graph.WalkEdgeHandles yields (the same source-node order csr.bin / labels.bin use), and within a record the label names and property keys are sorted, so the component is byte-stable across writes of the same logical state — the cross-process byte-equality contract the snapshot relies on.

func WriteLabels

func WriteLabels[N comparable, W any](w io.Writer, g *lpg.Graph[N, W]) (size int64, crc uint32, err error)

WriteLabels serialises every node and edge label attached to g into w in the labels.bin format documented at the top of this file. It returns the number of bytes written and the CRC32C of the serialised payload — both stored in the manifest's FileEntry for the labels.bin component so Open / LoadSnapshotFull can verify integrity at load time.

The CRC32C covers the entire on-disk file, including the magic header. This lets the manifest's CRC field validate every byte of labels.bin end-to-end without a separate inner-payload checksum.

The on-disk string table is populated by walking g's lpg.LabelRegistry in interning order; the labelStringIdx written for each (node | edge) record indexes into that table. Because LabelID is itself assigned in interning order, this preserves the registry's identity across save and load: the reader interns each name back in the same order and observes the same LabelID values without an extra remap step.

The walk holds the registry's RLock for the duration of the string table emission; node/edge enumeration uses the same lock-free / RLock-only primitives the public LPG accessors expose.

func WriteManifest

func WriteManifest(w io.Writer, m Manifest) error

WriteManifest writes m to w in canonical (pretty-printed) JSON.

func WriteMapper

func WriteMapper[N comparable](w io.Writer, m *graph.Mapper[N], codec keyEncoder[N]) (size int64, crc uint32, err error)

WriteMapper serialises every (NodeID -> key) pair held by m into w, generalising WriteMapperString to any comparable key type N via the supplied codec. It returns the number of bytes written and the CRC32C of the serialised payload, both recorded in the manifest's FileEntry for the mapper.bin component so LoadSnapshotFull can verify integrity at load time.

Back-compatibility: when N is the canonical string type the function delegates to WriteMapperString, which emits the frozen version-1 layout (raw UTF-8 key bytes, no codec framing). The on-disk image is therefore byte-identical to every string mapper.bin produced before the codec generalisation, regardless of which string codec is supplied. For any other N the function emits a version-2 layout whose per-record key bytes are the output of codec.Encode.

On-disk layout (version 2, all little-endian):

uint32  magic           ('GMAP', 0x50414D47)
uint16  formatVersion   (2)
uint64  pairCount
for each pair:
    uint64  nodeID
    uint32  keyLen       (length of the codec-encoded key bytes)
    [keyLen]byte key     (codec.Encode output for the natural key)

Pairs are emitted in graph.Mapper.Walk order (shard-major, intra-index-major) so the read side reconstructs the mapper deterministically. The CRC32C covers the entire on-disk file, including the magic header.

func WriteMapperString

func WriteMapperString(w io.Writer, m *graph.Mapper[string]) (size int64, crc uint32, err error)

WriteMapperString serialises every (NodeID -> string key) pair held by m into w in the mapper.bin format documented below. It returns the number of bytes written and the CRC32C of the serialised payload — both stored in the manifest's FileEntry for the mapper.bin component so LoadSnapshotFull can verify integrity at load time.

On-disk layout (all little-endian):

uint32  magic           ('GMAP', 0x50414D47)
uint16  formatVersion   (1)
uint64  pairCount
for each pair:
    uint64  nodeID
    uint32  keyLen
    [keyLen]byte key

Pairs are emitted in graph.Mapper.Walk order (shard-major, intra-index-major) so the read side can reconstruct the mapper deterministically.

The CRC32C covers the entire on-disk file, including the magic header. The reader recomputes the CRC end-to-end at load time.

func WriteProperties

func WriteProperties[N comparable, W any](w io.Writer, g *lpg.Graph[N, W]) (size int64, crc uint32, err error)

WriteProperties serialises every node and edge property attached to g into w in the properties.bin format documented at the top of this file. It returns the number of bytes written and the CRC32C of the serialised payload — both stored in the manifest's FileEntry for the properties.bin component so LoadSnapshotFull can verify integrity at load time.

The CRC32C covers the entire on-disk file, including the magic header. This lets the manifest's CRC field validate every byte of properties.bin end-to-end without a separate inner-payload checksum.

The on-disk key string table is populated by walking g's lpg.PropertyKeyRegistry in interning order; the keyIdx written for each (node | edge) record indexes into that table.

Concurrency contract: the walk relies on the same lock-free / RLock-only primitives the public LPG accessors expose. Properties added by a concurrent mutator race with the snapshot writer in the same way labels do — the writer either observes the new property (and the matching node/edge entry) or it does not, but never an inconsistent fragment.

func WriteSnapshotCSR

func WriteSnapshotCSR[W any](dir string, c *csr.CSR[W]) error

WriteSnapshotCSR is the legacy high-level helper that lays out a snapshot directory containing a v1 manifest plus the CSR. It is retained for backward compatibility: callers that also need LPG label durability must use WriteSnapshotFull, which writes a v2 or v3 manifest indexing labels.bin alongside csr.bin. Atomic publication is achieved by assembling the snapshot under dir + ".tmp" and renaming it to dir on success.

Example

ExampleWriteSnapshotCSR shows the lighter, CSR-only (v1) path: it writes just the adjacency and reads it straight back with Open.

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/FlavioCFOliveira/GoGraph/graph/adjlist"
	"github.com/FlavioCFOliveira/GoGraph/graph/csr"
	"github.com/FlavioCFOliveira/GoGraph/store/snapshot"
)

func main() {
	dir, err := os.MkdirTemp("", "snapshot-csr-example")
	if err != nil {
		panic(err)
	}
	defer func() { _ = os.RemoveAll(dir) }()

	a := adjlist.New[string, int64](adjlist.Config{Directed: true})
	if err := a.AddEdge("a", "b", 1); err != nil {
		panic(err)
	}
	if err := a.AddEdge("a", "c", 2); err != nil {
		panic(err)
	}
	c := csr.BuildFromAdjList(a)

	snapDir := filepath.Join(dir, "snapshot")
	if err := snapshot.WriteSnapshotCSR(snapDir, c); err != nil {
		panic(err)
	}

	loaded, err := snapshot.Open(snapDir)
	if err != nil {
		panic(err)
	}
	fmt.Printf("manifest version=%d\n", loaded.Manifest.Version)
	fmt.Printf("csr edges=%d\n", len(loaded.CSR.Edges))

}
Output:
manifest version=1
csr edges=2

func WriteSnapshotCSRCtx

func WriteSnapshotCSRCtx[W any](ctx context.Context, dir string, c *csr.CSR[W]) error

WriteSnapshotCSRCtx is the context-aware variant of WriteSnapshotCSR. ctx.Err() is checked at three stage boundaries: before the CSR write, before the manifest write, and before the atomic rename. On cancellation the temporary staging directory is cleaned up and the wrapped ctx.Err is returned.

func WriteSnapshotFull

func WriteSnapshotFull[N comparable, W any](dir string, c *csr.CSR[W], g *lpg.Graph[N, W]) error

WriteSnapshotFull is the v2/v3 high-level helper: it lays out a snapshot directory containing csr.bin (legacy v1 component), labels.bin (v2 component), properties.bin (v2 component) and a manifest indexing them. When the underlying graph.Mapper is string-keyed (N=string) the writer additionally emits mapper.bin — the durable (NodeID -> natural key) interning table — and the manifest is stamped at ManifestVersion (v3). For any other N the writer falls back to the v2 layout (no mapper.bin) and the manifest records [manifestVersionV2]; recovery from a v2 snapshot continues to rely on WAL replay to re-intern keys.

Atomic publication is achieved by assembling the snapshot under dir + ".tmp" and renaming it to dir on success — the same protocol used by WriteSnapshotCSR.

When g carries a non-nil index.Manager (set via lpg.Graph.SetIndexManager) with at least one registered index that implements index.Serializer, an indexes/ sub-directory is also produced — one file per registered serializable index, each referenced from the manifest's Indexes field. Subscribers that do not implement index.Serializer are skipped (rebuild-on-restart).

Callers that do not need durable LPG labels or properties can keep using WriteSnapshotCSR; it writes a v1-shaped directory that future readers (including this one) accept transparently.

func WriteSnapshotFullCtx

func WriteSnapshotFullCtx[N comparable, W any](
	ctx context.Context,
	dir string,
	c *csr.CSR[W],
	g *lpg.Graph[N, W],
) error

WriteSnapshotFullCtx is the context-aware variant of WriteSnapshotFull. ctx.Err() is checked at five stage boundaries: before the CSR write, before the labels write, before the properties write, before the manifest write, and before the atomic rename. On cancellation the temporary staging directory is cleaned up and the wrapped ctx.Err is returned.

func WriteSnapshotFullWithMapperCodec

func WriteSnapshotFullWithMapperCodec[N comparable, W any](
	dir string,
	c *csr.CSR[W],
	g *lpg.Graph[N, W],
	codec keyEncoder[N],
) error

WriteSnapshotFullWithMapperCodec is the codec-aware variant of WriteSnapshotFull: it threads codec (the same [txn.Codec] the store uses to serialise node identifiers onto the WAL) into the mapper.bin writer so the durable NodeID->key interning table is emitted for ANY comparable key type N, not just string. A snapshot written this way is self-sufficient on load for every key type, which lets the checkpointer truncate the WAL instead of retaining it unboundedly (audit gap F3).

For string-keyed graphs the mapper bytes remain byte-identical to the version-1 layout (see WriteMapper), so this entry point is a safe drop-in for the existing WriteSnapshotFull on string stores too.

codec must not be nil; pass the store's [txn.Store.Codec].

func WriteSnapshotFullWithMapperCodecCtx

func WriteSnapshotFullWithMapperCodecCtx[N comparable, W any](
	ctx context.Context,
	dir string,
	c *csr.CSR[W],
	g *lpg.Graph[N, W],
	codec keyEncoder[N],
) error

WriteSnapshotFullWithMapperCodecCtx is the context-aware variant of WriteSnapshotFullWithMapperCodec. ctx cancellation is honoured at the same stage boundaries as WriteSnapshotFullCtx; the only difference is that the mapper.bin component is emitted for every key type via codec rather than for string alone.

func WriteTombstones

func WriteTombstones[N comparable, W any](w io.Writer, g *lpg.Graph[N, W]) (size int64, crc uint32, err error)

WriteTombstones serialises g's current tombstone set (the NodeIDs removed via lpg.Graph.RemoveNode) into w in the tombstones.bin format. It returns the number of bytes written and the CRC32C of the serialised payload — both stored in the manifest's FileEntry for the component so LoadSnapshotFull can verify integrity at load time.

The CRC32C covers the entire on-disk file, including the magic header, matching the labels.bin / properties.bin discipline. The id list is emitted in ascending order (lpg.Graph.TombstonedIDs sorts it) so the component is deterministic across writes of the same logical state.

Types

type CSRReadback

type CSRReadback struct {
	Vertices    []uint64
	Edges       []graph.NodeID
	HasWeights  bool
	WeightSize  uint8
	WeightBytes []byte
	// Handles is the optional per-slot stable-edge-handle column, aligned
	// slot-for-slot with Edges (handles[i] is the stable handle of the edge
	// at edges[i]). It is nil when the snapshot predates the column or its
	// source graph carried no handles; when non-nil it has the same length
	// as Edges. See the trailing-block note on [WriteCSR].
	Handles []uint64
}

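CSRReadback is the parsed form of a CSR previously serialised by WriteCSR: the vertices, the edges, the (optional) raw weight bytes, and the optional handle column. The struct carries no checksum field; per ReadCSR, verifying the payload against the manifest CRC32C is left to the caller.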

func ReadCSR

func ReadCSR(r io.Reader) (CSRReadback, error)

ReadCSR parses a CSR previously written by WriteCSR from r. The caller is responsible for verifying the surrounding manifest CRC; this function only enforces the structural contract.

Untrusted input: a bare ReadCSR over an io.Reader of unknown length cannot know the true remaining-bytes bound, so the declared vertex, edge, and weight sizes are checked only against an absolute backstop cap (see [maxCSRCount]) plus the overflow-safe weights computation. That stops an unbounded pre-EOF allocation, but the precise bound requires the file size. Callers loading an untrusted snapshot should prefer the Open / LoadSnapshotFull path, which supplies the manifest-recorded size (FileEntry.Size) so the count is rejected the moment it exceeds what that many bytes could possibly hold; bounding a bare reader otherwise remains the caller's responsibility.

type EdgeHandleRecord

type EdgeHandleRecord struct {
	Src        uint64
	Dst        uint64
	Handle     uint64
	Labels     []string
	Properties map[string]lpg.PropertyValue
}

EdgeHandleRecord is one persisted per-handle edge metadata record: the endpoint NodeIDs, the stable handle, the per-CREATE label names, and the per-CREATE properties. NodeIDs are stored verbatim (the snapshot mapper is restored before this component is applied), matching csr.bin's NodeID references.

type EdgeHandlesReadback

type EdgeHandlesReadback struct {
	Records []EdgeHandleRecord
}

EdgeHandlesReadback is the structural parse of an edgehandles.bin file. The caller materialises it back into a live lpg.Graph via ApplyEdgeHandlesToGraph once the underlying mapper is populated.

func ReadEdgeHandles

func ReadEdgeHandles(r io.Reader) (EdgeHandlesReadback, error)

ReadEdgeHandles parses an edgehandles.bin payload produced by WriteEdgeHandles. It performs strict structural validation: a missing or wrong magic, an unsupported version, an implausible count, a label/key index past its table, an unknown property kind, or a truncated record all surface as ErrEdgeHandlesCorrupted.

type EdgeLabelEntry

type EdgeLabelEntry struct {
	Src       uint64
	Dst       uint64
	StringIdx uint32
}

EdgeLabelEntry pairs a (src, dst) NodeID couple with the string-table index of one label name attached to that edge. An edge carrying N labels yields N entries; parallel edges between the same endpoints fold into the same edgeKey on disk just as they do in lpg.Graph's in-memory edgeBag.

type EdgePropertyEntry

type EdgePropertyEntry struct {
	Src        uint64
	Dst        uint64
	KeyIdx     uint32
	Kind       lpg.PropertyKind
	ValueBytes []byte
}

EdgePropertyEntry pairs a (src, dst) NodeID couple with the key string-table index, the kind tag, and the encoded value bytes for one property attached to that edge. An edge carrying P properties yields P entries; as with labels, parallel edges between the same endpoints fold into the same edgeKey on disk just as they do in lpg.Graph's in-memory shards.

type FileEntry

type FileEntry struct {
	Name   string `json:"name"`
	Size   int64  `json:"size"`
	CRC32C uint32 `json:"crc32c"`
}

FileEntry records one component file inside a snapshot directory.

type IndexFileEntry

type IndexFileEntry struct {
	Name   string `json:"name"`
	Size   int64  `json:"size"`
	CRC32C uint32 `json:"crc32c"`
}

IndexFileEntry pairs an index file's logical name (the name it was registered under with index.Manager.CreateIndex) with its on-disk size and CRC32C. It is the secondary-index analogue of FileEntry and travels in [Manifest.Indexes].

func WriteIndexes

func WriteIndexes(dir string, m *index.Manager) ([]IndexFileEntry, error)

WriteIndexes serialises every registered index in m to one file per index under dir/IndexesDir. Returns one IndexFileEntry per successfully serialised index, which the caller threads into the manifest. Subscribers that do not implement index.Serializer are silently skipped (rebuild-on-restart contract).

On any I/O error the partial directory under dir/IndexesDir is removed (best effort) so the caller does not need to clean up.

type IndexReadback

type IndexReadback struct {
	Name  string
	Bytes []byte
}

IndexReadback is the raw byte payload of one secondary index file returned by LoadSnapshotFull. The bytes are passed verbatim to index.Serializer.Deserialize by [store/recovery.Open*]; the snapshot loader does not interpret them further.

func LoadIndexes

func LoadIndexes(dir string, entries []IndexFileEntry) ([]IndexReadback, error)

LoadIndexes reads every entry in entries from dir/IndexesDir and returns the raw bytes for each. Files whose on-disk CRC32C does not match the manifest record surface a metric warning via `store.snapshot.indexes.corrupted` and are reported with [IndexReadback.Bytes] == nil; the caller treats nil bytes as "rebuild from LPG" rather than as a fatal error.

A missing indexes/ directory is not an error: it simply means the snapshot does not carry persisted indexes (backward compatibility with snapshots produced before this format extension).
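A caller-side sketch of the nil-bytes contract follows. It mirrors IndexReadback with a local type so it runs without the module; partition and the index names are hypothetical, and the real restore step (index.Serializer.Deserialize) is only named in a comment.

```go
package main

import "fmt"

// indexReadback mirrors the shape of IndexReadback for this sketch.
type indexReadback struct {
	Name  string
	Bytes []byte
}

// partition splits readbacks into indexes whose bytes can be handed to a
// deserialiser and indexes that must be rebuilt because Bytes is nil.
func partition(rbs []indexReadback) (restore, rebuild []string) {
	for _, rb := range rbs {
		if rb.Bytes == nil {
			// Missing file or CRC mismatch: rebuild from the LPG.
			rebuild = append(rebuild, rb.Name)
			continue
		}
		// In real code this is where Deserialize would receive rb.Bytes.
		restore = append(restore, rb.Name)
	}
	return restore, rebuild
}

func main() {
	restore, rebuild := partition([]indexReadback{
		{Name: "by_name", Bytes: []byte{0x01}},
		{Name: "by_age", Bytes: nil},
	})
	fmt.Println(restore, rebuild) // [by_name] [by_age]
}
```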

type LabelsReadback

type LabelsReadback struct {
	Strings    []string
	NodeLabels []NodeLabelEntry
	EdgeLabels []EdgeLabelEntry
}

LabelsReadback is the structural parse of a labels.bin file. The caller materialises it back into a live lpg.Graph via ApplyLabelsToGraph once the underlying mapper is populated.
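To illustrate how the string table and the per-node entries relate, the sketch below resolves each StringIdx into its label name. It uses a local mirror of NodeLabelEntry; labelsByNode and errBadIndex are hypothetical, and the bounds check is defensive, since ReadLabels is documented to reject out-of-range indexes already.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeLabelEntry mirrors the shape of NodeLabelEntry for this sketch.
type nodeLabelEntry struct {
	NodeID    uint64
	StringIdx uint32
}

// errBadIndex is a hypothetical sentinel for an index past the table.
var errBadIndex = errors.New("string-table index out of range")

// labelsByNode groups label names by NodeID. A node carrying N labels
// contributes N entries, so its slice ends up with N names.
func labelsByNode(strs []string, entries []nodeLabelEntry) (map[uint64][]string, error) {
	out := make(map[uint64][]string)
	for _, e := range entries {
		if uint64(e.StringIdx) >= uint64(len(strs)) {
			return nil, errBadIndex
		}
		out[e.NodeID] = append(out[e.NodeID], strs[e.StringIdx])
	}
	return out, nil
}

func main() {
	strs := []string{"Person", "Admin"}
	entries := []nodeLabelEntry{{1, 0}, {1, 1}, {2, 0}}
	fmt.Println(labelsByNode(strs, entries)) // map[1:[Person Admin] 2:[Person]] <nil>
}
```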

func ReadLabels

func ReadLabels(r io.Reader) (LabelsReadback, error)

ReadLabels parses a labels.bin payload produced by WriteLabels. It performs strict structural validation: a missing or wrong magic, a future format-version byte, a truncated record, or an out-of-range string-table index all surface as ErrLabelsCorrupted.

The caller is responsible for verifying the surrounding manifest CRC matches the file bytes (the Open / LoadSnapshotFull helpers do this); this function only enforces the structural contract.

type LoadedCSR

type LoadedCSR struct {
	Manifest Manifest
	CSR      CSRReadback
}

LoadedCSR is the result of [LoadCSR] / Open: the parsed CSR arrays plus the manifest entry that produced them.

func Open

func Open(dir string) (LoadedCSR, error)

Open verifies and loads the snapshot rooted at dir. It reads the manifest, then reads csr.bin and verifies its CRC32C matches the manifest entry. Future versions may load additional components (labels.bin, properties.bin, schema.bin) by extending Manifest.Files.

type LoadedSnapshot

type LoadedSnapshot struct {
	Manifest   Manifest
	CSR        CSRReadback
	Labels     LabelsReadback
	Properties PropertiesReadback
	Mapper     MapperReadback
	Indexes    []IndexReadback
	// Tombstones is the node-removal set restored from tombstones.bin. It
	// is empty for snapshots that carry no tombstones.bin entry (older
	// snapshots, or any snapshot of a graph that never removed a node) —
	// the backward-compatibility contract.
	Tombstones TombstonesReadback
	// EdgeHandles is the per-handle edge metadata restored from
	// edgehandles.bin (each parallel edge's per-CREATE relationship type and
	// properties keyed by its stable handle). It is empty for snapshots that
	// carry no edgehandles.bin entry (older snapshots, or any snapshot of a
	// graph that never used the handle-keyed metadata stores) — the
	// backward-compatibility contract.
	EdgeHandles EdgeHandlesReadback
}

LoadedSnapshot is the result of LoadSnapshotFull: the parsed CSR arrays, the parsed labels readback (empty for v1 snapshots), the parsed properties readback (empty when properties.bin is absent), the parsed mapper readback (empty when mapper.bin is absent, e.g. a v1 CSR-only snapshot or a v2 snapshot written without a codec for a non-string key type), the optional per-index byte payloads (one entry per indexes/<name>.bin file referenced by the manifest), and the manifest that produced them.

When mapper.bin is present, exactly one of [MapperReadback.Pairs] (version-1 string layout) and [MapperReadback.RawPairs] (version-2 codec layout) is populated; see MapperReadback.

Each IndexReadback.Bytes may be nil even when the manifest references the index — that signals the file was missing or its CRC32C did not validate. Callers must treat nil bytes as "rebuild from LPG" rather than as a fatal error; the corruption was already metered by LoadIndexes under `store.snapshot.indexes.corrupted`.

func LoadSnapshotFull

func LoadSnapshotFull(dir string) (LoadedSnapshot, error)

LoadSnapshotFull verifies and loads the snapshot rooted at dir, returning the CSR, the labels readback, and the properties readback. v1 snapshots are accepted transparently: their manifest has no labels.bin or properties.bin entry, and the returned [LoadedSnapshot.Labels] / [LoadedSnapshot.Properties] are zero values (empty tables, no records). v2 snapshots may carry any combination of labels.bin and properties.bin; each component is CRC-validated only when its manifest entry is present.

CSR CRC verification mirrors Open; labels and properties CRC verification use the same TeeReader pattern so a corrupted component surfaces as ErrCorrupted.
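The TeeReader pattern mentioned above can be sketched as follows, using only hash/crc32 and io. The helper names (parseVerified, verifyBytes, checksum) and errCorrupted are stand-ins invented for this example; the package's own internals may differ.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"hash/crc32"
	"io"
)

// errCorrupted is a hypothetical stand-in for the package's ErrCorrupted.
var errCorrupted = errors.New("component CRC32C mismatch")

// castagnoli selects the CRC32C polynomial.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// parseVerified feeds r to parse through a TeeReader so every byte the
// parser consumes is also hashed, then compares the digest to wantCRC.
func parseVerified(r io.Reader, wantCRC uint32, parse func(io.Reader) error) error {
	h := crc32.New(castagnoli)
	tee := io.TeeReader(r, h)
	if err := parse(tee); err != nil {
		return err
	}
	// Drain anything the parser left unread so the hash covers the whole file.
	if _, err := io.Copy(io.Discard, tee); err != nil {
		return err
	}
	if h.Sum32() != wantCRC {
		return errCorrupted
	}
	return nil
}

// checksum computes the CRC32C of b.
func checksum(b []byte) uint32 { return crc32.Checksum(b, castagnoli) }

// verifyBytes runs parseVerified over b with a parser that reads nothing.
func verifyBytes(b []byte, want uint32) error {
	return parseVerified(bytes.NewReader(b), want, func(io.Reader) error { return nil })
}

func main() {
	payload := []byte("csr payload")
	want := checksum(payload)
	fmt.Println(verifyBytes(payload, want)) // <nil>
	payload[0] ^= 0xFF                      // flip bits to simulate corruption
	fmt.Println(verifyBytes(payload, want)) // component CRC32C mismatch
}
```

Hashing while parsing avoids a second pass over the file; the trade-off is that the parser sees the bytes before the checksum is known, so its result should be discarded when the final comparison fails.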

type Manifest

type Manifest struct {
	Version   int              `json:"version"`
	CreatedAt time.Time        `json:"created_at"`
	Order     uint64           `json:"order"`
	Size      uint64           `json:"size"`
	Files     []FileEntry      `json:"files"`
	Indexes   []IndexFileEntry `json:"indexes,omitempty"`
}

Manifest is the JSON-encoded index of a snapshot directory.

Indexes is the secondary-index sub-manifest: it carries one IndexFileEntry per file written under indexes/<name>.bin. The field is omitted from the JSON form when empty so v2 manifests produced before this extension are byte-identical to the ones produced by current builds when no indexes are registered.
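The effect of the omitempty tag can be seen with a trimmed-down mirror of the struct. The manifest type below keeps only two fields for brevity and is not the package's Manifest; the index name and values are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexFileEntry mirrors the JSON shape of IndexFileEntry.
type indexFileEntry struct {
	Name   string `json:"name"`
	Size   int64  `json:"size"`
	CRC32C uint32 `json:"crc32c"`
}

// manifest is a reduced mirror of Manifest with the same Indexes tag.
type manifest struct {
	Version int              `json:"version"`
	Indexes []indexFileEntry `json:"indexes,omitempty"`
}

// encode marshals m and panics on failure (the types here cannot fail).
func encode(m manifest) string {
	b, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// No indexes registered: the "indexes" key is absent from the output.
	fmt.Println(encode(manifest{Version: 2}))
	// One index registered: the key appears.
	fmt.Println(encode(manifest{
		Version: 2,
		Indexes: []indexFileEntry{{Name: "by_name", Size: 16, CRC32C: 7}},
	}))
}
```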

func LoadManifest

func LoadManifest(r io.Reader) (Manifest, error)

LoadManifest parses a Manifest from r. It returns ErrManifestUnsupported when the manifest's version is newer than this build supports.
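A version gate of this kind might look like the sketch below. The constant maxSupportedVersion (set to 3 here), the reduced manifest type, and the error value are assumptions for illustration; the real function may validate more than the version.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// maxSupportedVersion is an assumed ceiling for this sketch.
const maxSupportedVersion = 3

// errManifestUnsupported is a hypothetical stand-in for ErrManifestUnsupported.
var errManifestUnsupported = errors.New("manifest version newer than this build")

// manifest is a reduced mirror of Manifest carrying only the version.
type manifest struct {
	Version int `json:"version"`
}

// loadManifest decodes JSON from r and rejects versions from the future.
func loadManifest(r io.Reader) (manifest, error) {
	var m manifest
	if err := json.NewDecoder(r).Decode(&m); err != nil {
		return manifest{}, err
	}
	if m.Version > maxSupportedVersion {
		return manifest{}, errManifestUnsupported
	}
	return m, nil
}

// loadManifestString is a convenience wrapper for in-memory input.
func loadManifestString(s string) (manifest, error) {
	return loadManifest(strings.NewReader(s))
}

func main() {
	fmt.Println(loadManifestString(`{"version":2}`))
	fmt.Println(loadManifestString(`{"version":99}`))
}
```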

func ReadManifestFile

func ReadManifestFile(path string) (Manifest, error)

ReadManifestFile is a convenience wrapper around an O_NOFOLLOW open plus LoadManifest. The file is opened via [openSnapshotComponent] so a manifest.json that is a symlink in an untrusted snapshot directory is rejected rather than dereferenced.
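For readers unfamiliar with O_NOFOLLOW, here is a small Unix-only sketch of the behaviour. openNoFollow and demo are hypothetical helpers, not the package's openSnapshotComponent, and syscall.O_NOFOLLOW is not defined on every platform (it is missing on Windows, for example).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// openNoFollow opens path read-only and fails if the final path component
// is a symbolic link. Symlinks in parent directories are still followed.
func openNoFollow(path string) (*os.File, error) {
	return os.OpenFile(path, os.O_RDONLY|syscall.O_NOFOLLOW, 0)
}

// demo creates a regular file and a symlink to it in a temp directory and
// reports whether each could be opened with openNoFollow.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "nofollow-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "manifest.json")
	if err := os.WriteFile(target, []byte("{}"), 0o600); err != nil {
		return false, false, err
	}
	link := filepath.Join(dir, "link.json")
	if err := os.Symlink(target, link); err != nil {
		return false, false, err
	}

	regularOK, symlinkOK := false, false
	if f, err := openNoFollow(target); err == nil {
		regularOK = true
		f.Close()
	}
	if f, err := openNoFollow(link); err == nil {
		symlinkOK = true
		f.Close()
	}
	return regularOK, symlinkOK, nil
}

func main() {
	regularOK, symlinkOK, err := demo()
	fmt.Println(regularOK, symlinkOK, err)
}
```

Because the flag guards only the last path component, it protects against a planted manifest.json symlink but not against a symlinked snapshot directory.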

type MapperPair

type MapperPair struct {
	ID  graph.NodeID
	Key string
}

MapperPair is one (NodeID, natural key) record as parsed from the on-disk mapper.bin payload. The slice exposed by [MapperReadback.Pairs] is enumerated in shard-major / intra-index-major order, the same order graph.Mapper.Walk produces, which is the order the writer serialised.

type MapperRawPair

type MapperRawPair struct {
	ID  graph.NodeID
	Key []byte
}

MapperRawPair is one (NodeID, raw key bytes) record as parsed from a codec-encoded (version 2) mapper.bin payload. The bytes are the opaque output of [txn.Codec.Encode] for the natural key; recovery decodes them back into the concrete key type N via the matching codec. The slice exposed by [MapperReadback.RawPairs] is enumerated in the same Walk order the writer serialised.

type MapperReadback

type MapperReadback struct {
	Pairs    []MapperPair
	RawPairs []MapperRawPair
}

MapperReadback is the structural parse of a mapper.bin file. The caller materialises it back into a live graph.Mapper via graph.Mapper.LoadFrom once a fresh mapper has been constructed.

Exactly one of Pairs / RawPairs is populated, selected by the on-disk format version:

  • Pairs holds string keys, parsed from a version-1 (string) file. ApplyMapperToGraph consumes this directly for string-keyed graphs.
  • RawPairs holds codec-encoded key bytes, parsed from a version-2 file produced for a non-string key type. ApplyMapperToGraphWithCodec decodes these via the supplied codec.

Both are empty when mapper.bin was absent (v1/v2 snapshots written by the no-codec writer for non-string keys, or any v1 CSR-only snapshot).
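The three cases above lend themselves to a single switch. The sketch mirrors the readback types locally and assumes NodeIDs are uint64; describe and decodeUint64 are hypothetical, with decodeUint64 standing in for whatever codec the graph's key type actually uses.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Local mirrors of MapperPair, MapperRawPair, and MapperReadback.
type mapperPair struct {
	ID  uint64
	Key string
}

type mapperRawPair struct {
	ID  uint64
	Key []byte
}

type mapperReadback struct {
	Pairs    []mapperPair
	RawPairs []mapperRawPair
}

// decodeUint64 is a hypothetical codec for 8-byte big-endian keys.
func decodeUint64(b []byte) (uint64, error) {
	if len(b) != 8 {
		return 0, errors.New("want 8 key bytes")
	}
	return binary.BigEndian.Uint64(b), nil
}

// describe reports which layout a readback carries, decoding raw keys
// to confirm that they match the assumed codec.
func describe(rb mapperReadback) (string, error) {
	switch {
	case len(rb.Pairs) > 0 && len(rb.RawPairs) > 0:
		return "", errors.New("both Pairs and RawPairs populated")
	case len(rb.Pairs) > 0:
		return fmt.Sprintf("string keys: %d", len(rb.Pairs)), nil
	case len(rb.RawPairs) > 0:
		for _, p := range rb.RawPairs {
			if _, err := decodeUint64(p.Key); err != nil {
				return "", err
			}
		}
		return fmt.Sprintf("codec keys: %d", len(rb.RawPairs)), nil
	default:
		return "no mapper records", nil
	}
}

func main() {
	fmt.Println(describe(mapperReadback{Pairs: []mapperPair{{ID: 0, Key: "alice"}}}))
	fmt.Println(describe(mapperReadback{RawPairs: []mapperRawPair{{ID: 0, Key: []byte{0, 0, 0, 0, 0, 0, 0, 42}}}}))
	fmt.Println(describe(mapperReadback{}))
}
```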

func ReadMapperBytes

func ReadMapperBytes(r io.Reader) (MapperReadback, error)

ReadMapperBytes parses a mapper.bin payload produced by WriteMapper for a non-string key type (version 2): the per-record key bytes are the opaque codec output and are returned verbatim in [MapperReadback.RawPairs] for the caller to decode with the matching codec. A version-1 (string) payload is accepted too — its UTF-8 key bytes are returned as RawPairs so a single reader path can serve both layouts when a codec is in hand.

Structural validation matches ReadMapperString: bad magic, an unsupported version, a truncated record, or an implausible length prefix all surface as ErrMapperCorrupted. The caller verifies the surrounding manifest CRC ([readVerifiedMapperBytes] does this); this function enforces only the structural contract.

func ReadMapperString

func ReadMapperString(r io.Reader) (MapperReadback, error)

ReadMapperString parses a mapper.bin payload produced by WriteMapperString. It performs strict structural validation: a missing or wrong magic, an unsupported format version, a truncated record, or an implausible key length all surface as ErrMapperCorrupted.

The caller is responsible for verifying the surrounding manifest CRC matches the file bytes (LoadSnapshotFull does this); this function only enforces the structural contract.

type NodeLabelEntry

type NodeLabelEntry struct {
	NodeID    uint64
	StringIdx uint32
}

NodeLabelEntry pairs a NodeID with the string-table index of one label name attached to that node. A node carrying N labels yields N entries.

type NodePropertyEntry

type NodePropertyEntry struct {
	NodeID     uint64
	KeyIdx     uint32
	Kind       lpg.PropertyKind
	ValueBytes []byte
}

NodePropertyEntry pairs a NodeID with the key string-table index, the kind tag, and the encoded value bytes for one property attached to that node. A node carrying P properties yields P entries.

type PropertiesReadback

type PropertiesReadback struct {
	Keys           []string
	NodeProperties []NodePropertyEntry
	EdgeProperties []EdgePropertyEntry
}

PropertiesReadback is the structural parse of a properties.bin file. The caller materialises it back into a live lpg.Graph via ApplyPropertiesToGraph once the underlying mapper is populated.

func ReadProperties

func ReadProperties(r io.Reader) (PropertiesReadback, error)

ReadProperties parses a properties.bin payload produced by WriteProperties. It performs strict structural validation: a missing or wrong magic, a future format-version byte, a truncated record, an unknown kind tag, or a key-table index that points beyond the embedded string table all surface as ErrPropertiesCorrupted.

The caller is responsible for verifying the surrounding manifest CRC matches the file bytes (the LoadSnapshotFull helper does this); this function only enforces the structural contract.

type TombstonesReadback

type TombstonesReadback struct {
	IDs []graph.NodeID
}

TombstonesReadback is the structural parse of a tombstones.bin file: the sorted set of removed NodeIDs. The caller materialises it back into a live lpg.Graph via ApplyTombstonesToGraph.
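Because the IDs arrive sorted, membership can be tested with a binary search rather than a map. The sketch assumes ascending order and a uint64 representation for NodeIDs; isRemoved is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// isRemoved reports whether id appears in sortedIDs, which must be in
// ascending order. The lookup costs O(log n) comparisons.
func isRemoved(sortedIDs []uint64, id uint64) bool {
	i := sort.Search(len(sortedIDs), func(i int) bool { return sortedIDs[i] >= id })
	return i < len(sortedIDs) && sortedIDs[i] == id
}

func main() {
	tombstones := []uint64{3, 7, 11}
	fmt.Println(isRemoved(tombstones, 7)) // true
	fmt.Println(isRemoved(tombstones, 8)) // false
}
```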

func ReadTombstones

func ReadTombstones(r io.Reader) (TombstonesReadback, error)

ReadTombstones parses a tombstones.bin payload produced by WriteTombstones. It performs strict structural validation: a missing or wrong magic, a future format-version word, an implausible count, or a truncated record all surface as ErrTombstonesCorrupted.

The caller is responsible for verifying the surrounding manifest CRC matches the file bytes (the LoadSnapshotFull helper does this); this function only enforces the structural contract.
