Documentation
¶
Overview ¶
Package backup implements the per-adapter logical-backup format defined in docs/design/2026_04_29_proposed_snapshot_logical_decoder.md (Phase 0) and reused by docs/design/2026_04_29_proposed_logical_backup.md (Phase 1).
This file owns the filename encoding rules for non-S3 segments. S3 object keys preserve their `/` separators (and so are not transformed by EncodeSegment); every other adapter scope encodes user-supplied bytes through this path.
Encoding rules (see "Filename encoding" in the Phase 0 doc):
- Bytes in the unreserved set [A-Za-z0-9._-] pass through.
- Every other byte is rendered as %HH (uppercase hex), like application/x-www-form-urlencoded but applied to every non-allowlisted byte.
- If the encoded result exceeds maxSegmentBytes (240), the segment is replaced with <sha256-hex-prefix-32>__<truncated-original> and the full original bytes must be recorded in KEYMAP.jsonl by the caller.
- Binary DynamoDB partition / sort keys take a separate "b64.<base64url>" path so a binary key never collides with a string key whose hex encoding happens to look like base64. EncodeBinarySegment emits that form.
Index ¶
- Constants
- Variables
- func DecodeLiveEntries(r io.Reader) ([]RoundTripEntry, SnapshotHeader, error)
- func DecodeSegment(seg string) ([]byte, error)
- func EncodeBinarySegment(raw []byte) string
- func EncodeDDBItemKey(tableName string, generation uint64, hashKey, rangeKey string) []byte
- func EncodeDDBTableMetaKey(tableName string) []byte
- func EncodeInfoSidecarPath(fsmPath string) string
- func EncodeMsgDataKey(queueName string, gen uint64, messageID string) []byte
- func EncodeQueueMetaKey(queueName string) []byte
- func EncodeSegment(raw []byte) string
- func HasInlineTTL(value []byte) bool
- func IsBinarySegment(seg string) bool
- func IsBlobAtomicWriteOutOfSpace(err error) bool
- func IsBlobAtomicWriteRetriable(err error) bool
- func IsShaFallback(seg string) bool
- func LoadKeymap(r io.Reader) (map[string]KeymapRecord, error)
- func OpenSidecarFile(path string) (*os.File, error)
- func ReadSnapshot(r io.Reader, fn func(SnapshotHeader, SnapshotEntry) error) error
- func ResolveCommitTS(manifestTS uint64, override *uint64) (uint64, error)
- func SnapshotIndexFromPath(path string) (uint64, error)
- func VerifyChecksums(root string) error
- func WriteChecksums(root string) error
- func WriteEncodeInfo(w io.Writer, info EncodeInfo) error
- func WriteManifest(w io.Writer, m Manifest) error
- type Adapter
- type AdapterSet
- type Adapters
- type DDBEncoder
- func (d *DDBEncoder) Finalize() error
- func (d *DDBEncoder) HandleGSIRow(_, _ []byte) error
- func (d *DDBEncoder) HandleItem(key, value []byte) error
- func (d *DDBEncoder) HandleTableGen(_, _ []byte) error
- func (d *DDBEncoder) HandleTableMeta(key, value []byte) error
- func (d *DDBEncoder) WithBundleJSONL(on bool) *DDBEncoder
- func (d *DDBEncoder) WithWarnSink(fn func(event string, fields ...any)) *DDBEncoder
- type DecodeCounters
- type DecodeOptions
- type DecodeResult
- type DynamoDBEncoder
- type EncodeInfo
- type EncodeInfoSelfTest
- type EncodeOptions
- type EncodeResult
- type Exclusions
- type KeymapReader
- type KeymapRecord
- type KeymapWriter
- type Live
- type Manifest
- type RedisDB
- func (r *RedisDB) Finalize() error
- func (r *RedisDB) HandleHLL(userKey, value []byte) error
- func (r *RedisDB) HandleHashField(key, value []byte) error
- func (r *RedisDB) HandleHashMeta(key, value []byte) error
- func (r *RedisDB) HandleListClaim(_, _ []byte) error
- func (r *RedisDB) HandleListItem(key, value []byte) error
- func (r *RedisDB) HandleListMeta(key, value []byte) error
- func (r *RedisDB) HandleListMetaDelta(_, _ []byte) error
- func (r *RedisDB) HandleSetMember(key, _ []byte) error
- func (r *RedisDB) HandleSetMeta(key, value []byte) error
- func (r *RedisDB) HandleSetMetaDelta(_, _ []byte) error
- func (r *RedisDB) HandleStreamEntry(key, value []byte) error
- func (r *RedisDB) HandleStreamMeta(key, value []byte) error
- func (r *RedisDB) HandleString(userKey, value []byte) error
- func (r *RedisDB) HandleTTL(userKey, value []byte) error
- func (r *RedisDB) HandleZSetLegacyBlob(key, value []byte) error
- func (r *RedisDB) HandleZSetMember(key, value []byte) error
- func (r *RedisDB) HandleZSetMeta(key, value []byte) error
- func (r *RedisDB) HandleZSetMetaDelta(_, _ []byte) error
- func (r *RedisDB) HandleZSetScore(_, _ []byte) error
- func (r *RedisDB) WithPendingTTLByteCap(capacity int) *RedisDB
- func (r *RedisDB) WithWarnSink(fn func(event string, fields ...any)) *RedisDB
- type RedisEncoder
- type RoundTripEntry
- type S3Encoder
- func (s *S3Encoder) Finalize() error
- func (s *S3Encoder) HandleBlob(key, value []byte) error
- func (s *S3Encoder) HandleBucketMeta(key, value []byte) error
- func (s *S3Encoder) HandleIgnored(_, _ []byte) error
- func (s *S3Encoder) HandleIncompleteUpload(prefix string, key, value []byte) error
- func (s *S3Encoder) HandleObjectManifest(key, value []byte) error
- func (s *S3Encoder) WithIncludeIncompleteUploads(on bool) *S3Encoder
- func (s *S3Encoder) WithIncludeOrphans(on bool) *S3Encoder
- func (s *S3Encoder) WithRenameCollisions(on bool) *S3Encoder
- func (s *S3Encoder) WithWarnSink(fn func(event string, fields ...any)) *S3Encoder
- type S3RecordEncoder
- type SQSEncoder
- func (s *SQSEncoder) Finalize() error
- func (s *SQSEncoder) HandleMessageData(key, value []byte) error
- func (s *SQSEncoder) HandleQueueGen(key, value []byte) error
- func (s *SQSEncoder) HandleQueueMeta(key, value []byte) error
- func (s *SQSEncoder) HandleSideRecord(prefix string, key, value []byte) error
- func (s *SQSEncoder) WithIncludeSideRecords(on bool) *SQSEncoder
- func (s *SQSEncoder) WithPreserveVisibility(on bool) *SQSEncoder
- func (s *SQSEncoder) WithWarnSink(fn func(event string, fields ...any)) *SQSEncoder
- type SQSRecordEncoder
- type SnapshotEntry
- type SnapshotHeader
- type Source
Constants ¶
const ( DDBTableMetaPrefix = "!ddb|meta|table|" DDBTableGenPrefix = "!ddb|meta|gen|" DDBItemPrefix = "!ddb|item|" DDBGSIPrefix = "!ddb|gsi|" )
Snapshot prefixes the DynamoDB encoder dispatches on. Mirror the live constants in kv/shard_key.go (DynamoTableMetaPrefix etc.) so a renamed prefix is caught by the dispatch tests below.
const ( KindSHAFallback = "sha-fallback" KindS3LeafData = "s3-leaf-data" KindMetaCollision = "meta-suffix-rename" )
KEYMAP.jsonl shape (one record per line):
{"encoded":"<encoded-segment>","original":"<base64url-no-padding>","kind":"sha-fallback"}
Records are written in encounter order (the order the encoder produced them) and never modified after write. The file is append-only; if the same encoded segment is written twice the reader keeps the last entry, but the encoder is expected not to emit duplicates within a single dump.
Records exist only for entries whose original bytes are NOT recoverable from the encoded filename alone:
- KindSHAFallback — segment is `<sha-prefix-32>__<truncated-original>` (filename length exceeded EncodeSegment's 240-byte ceiling).
- KindS3LeafData — S3 object renamed to `<obj>.elastickv-leaf-data` because both `<obj>` and `<obj>/...` existed in the same bucket.
- KindMetaCollision — user S3 object key happened to end in `.elastickv-meta.json`; renamed under --rename-collisions.
A consumer that does not care about reversing these to original bytes can ignore KEYMAP.jsonl entirely.
const ( // PhasePhase0SnapshotDecode marks dumps produced by Phase 0a (offline // snapshot decoder). PhasePhase0SnapshotDecode = "phase0-snapshot-decode" // PhasePhase1LivePinned marks dumps produced by Phase 1 (live PIT // extraction with cluster-wide read_ts pinning). PhasePhase1LivePinned = "phase1-live-pinned" )
const ( // ChecksumAlgorithmSHA256 is the only checksum algorithm Phase 0a writes. // Phase 1 may add others later (e.g. blake3) under the same field. ChecksumAlgorithmSHA256 = "sha256" // ChecksumFormatSha256sum identifies the line-oriented sha256sum(1) // format used by the CHECKSUMS file. Operators verify with // `sha256sum -c CHECKSUMS` from the dump root. ChecksumFormatSha256sum = "sha256sum" // EncodedFilenameCharsetRFC3986 is the EncodeSegment charset used for // every non-S3-object filename in the dump. EncodedFilenameCharsetRFC3986 = "rfc3986-unreserved-plus-percent" // S3MetaSuffixDefault is the reserved suffix for the S3 sidecar // metadata file (`<obj>.elastickv-meta.json`). S3MetaSuffixDefault = ".elastickv-meta.json" // S3CollisionStrategyLeafDataSuffix renames the shorter of two // colliding S3 keys to `<obj>.elastickv-leaf-data` and records the // rename in KEYMAP.jsonl. S3CollisionStrategyLeafDataSuffix = "leaf-data-suffix" // DynamoDBLayoutPerItem emits one item per file // (`items/<pk>/<sk>.json`); the user's stated default. DynamoDBLayoutPerItem = "per-item" // DynamoDBLayoutJSONL bundles items into `items/data-<part>.jsonl` // (opt-in via --dynamodb-bundle-mode jsonl). DynamoDBLayoutJSONL = "jsonl" // KeySegmentMaxBytesDefault matches EncodeSegment's maxSegmentBytes. KeySegmentMaxBytesDefault uint32 = 240 )
const ( RedisHashMetaPrefix = "!hs|meta|" RedisHashFieldPrefix = "!hs|fld|" RedisHashMetaDeltaPrefix = "!hs|meta|d|" )
Snapshot key prefixes the hash encoder dispatches on. Mirror the live store/hash_helpers.go constants — a renamed prefix on the live side surfaces here at compile time via the dispatch tests.
const ( ListMetaPrefix = "!lst|meta|" ListItemPrefix = "!lst|itm|" ListMetaDeltaPrefix = "!lst|meta|d|" ListClaimPrefix = "!lst|claim|" )
Redis list encoder. Translates raw !lst|... snapshot records into the per-list `lists/<key>.json` shape defined by Phase 0 (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md).
Three on-disk key families share the `!lst|` namespace; only two carry restorable state:
- !lst|meta|<userKey> -> 24-byte (Head, Tail, Len) blob
- !lst|itm|<userKey><seq(8)> -> raw item bytes (Redis lists are binary-safe)
- !lst|claim|... -> POP tombstone for OCC uniqueness. The live read path (rangeList → fetchListRange in redis.go) does NOT consult claims — POPs also Del the underlying item key in the same OCC commit, so a snapshot taken after a POP commit has no item record for the popped seq. The encoder therefore skips claim keys entirely.
- !lst|meta|d|... -> meta delta. The hash encoder skips its analogous deltas and treats !hs|fld| as the source of truth; the list encoder mirrors that policy — !lst|itm| keys are the source of truth and the delta arithmetic is not replayed at backup time.
const ( RedisSetMetaPrefix = "!st|meta|" RedisSetMemberPrefix = "!st|mem|" RedisSetMetaDeltaPrefix = "!st|meta|d|" )
Redis set encoder. Translates raw !st|... snapshot records into the per-set `sets/<key>.json` shape defined by Phase 0 (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md).
Wire format mirrors store/set_helpers.go:
- !st|meta|<userKeyLen(4)><userKey> → 8-byte BE Len
- !st|mem|<userKeyLen(4)><userKey><member> → empty value; the member bytes live in the key (binary-safe, per Redis's SADD contract).
- !st|meta|d|<userKeyLen(4)><userKey>... → 8-byte LenDelta; skipped silently. Same policy as hash deltas: !st|mem| keys are the source of truth at backup time, and delta arithmetic does not need to be replayed.
const ( RedisStreamMetaPrefix = "!stream|meta|" RedisStreamEntryPrefix = "!stream|entry|" )
Redis stream encoder. Translates raw !stream|... snapshot records into the per-stream `streams/<key>.jsonl` shape defined by Phase 0 (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md, lines 336-344).
Wire format mirrors store/stream_helpers.go and adapter/redis_storage_codec.go:
- !stream|meta|<userKeyLen(4)><userKey> → 24-byte BE Length(8) || LastMs(8) || LastSeq(8)
- !stream|entry|<userKeyLen(4)><userKey><ms(8)><seq(8)> → magic-prefixed pb.RedisStreamEntry protobuf with fields {id string, fields []string} where Fields is the interleaved (name1, value1, name2, value2, ...) XADD field list.
The protobuf entry value carries a magic prefix `0x00 'R' 'X' 'E' 0x01` (mirror of adapter/redis_storage_codec.go:17 storedRedisStreamEntryProtoPrefix); re-declared here so this package stays adapter-independent.
Output is JSONL (one record per line) plus a trailing `_meta` terminator line that captures length, last_ms, last_seq, and TTL. Per the design line 336-339:
{"id":"1714400000000-0","fields":{"event":"login","user":"alice"}}
{"_meta":true,"length":2,"last_ms":1714400000001,"last_seq":0,
"expire_at_ms":null}
JSONL was chosen for streams over per-entry files because real streams routinely hold tens of thousands of entries and per-entry inode pressure would dominate `tar`/`find` runtime.
const ( RedisStringPrefix = "!redis|str|" RedisHLLPrefix = "!redis|hll|" RedisTTLPrefix = "!redis|ttl|" )
Snapshot key prefixes the encoder dispatches on. Kept in sync with adapter/redis_compat_types.go so a renamed prefix in the live code is caught here at compile time via the corresponding tests.
const ( RedisZSetMetaPrefix = "!zs|meta|" RedisZSetMemberPrefix = "!zs|mem|" RedisZSetScorePrefix = "!zs|scr|" RedisZSetMetaDeltaPrefix = "!zs|meta|d|" // RedisZSetLegacyBlobPrefix is the consolidated single-key // layout the live store still writes for non-empty persisted // zsets (`adapter/redis_compat_types.go:82` redisZSetPrefix, // produced by `adapter/redis_compat_commands.go:3495-3508` and // read by `adapter/redis_compat_helpers.go:610-631` as the // fallback when no wide-column members exist). A backup that // skipped this prefix would silently drop legacy-only zsets; // HandleZSetLegacyBlob decodes the blob and registers the same // per-member state HandleZSetMeta + HandleZSetMember would. // Codex P1 finding on PR #790 (round 2). RedisZSetLegacyBlobPrefix = "!redis|zset|" )
Redis zset encoder. Translates raw !zs|... snapshot records into the per-zset `zsets/<key>.json` shape defined by Phase 0 (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md).
Wire format mirrors store/zset_helpers.go:
- !zs|meta|<userKeyLen(4)><userKey> → 8-byte BE Len
- !zs|mem|<userKeyLen(4)><userKey><member> → 8-byte IEEE 754 score
- !zs|scr|<userKeyLen(4)><userKey><sortableScore(8)><member> → empty (score index)
- !zs|meta|d|<userKeyLen(4)><userKey><...> → 8-byte LenDelta
Routing rules:
- `!zs|mem|` is the source of truth: it carries both the member name (in the trailing key bytes) and its IEEE 754 score (in the value).
- `!zs|scr|` is a secondary index used at scan time on the live side; for backup purposes it is redundant and is silently skipped, the same way `!st|scr|` (no such prefix exists) and `!hs|meta|d|` deltas are skipped by the hash/set encoders.
- `!zs|meta|d|` deltas are silently skipped; `!zs|mem|` already reflects the post-delta state at backup time.
const ( S3BucketMetaPrefix = s3keys.BucketMetaPrefix S3BucketGenPrefix = s3keys.BucketGenerationPrefix S3ObjectManifestPrefix = s3keys.ObjectManifestPrefix S3UploadMetaPrefix = s3keys.UploadMetaPrefix S3UploadPartPrefix = s3keys.UploadPartPrefix S3BlobPrefix = s3keys.BlobPrefix S3GCUploadPrefix = s3keys.GCUploadPrefix S3RoutePrefix = s3keys.RoutePrefix )
Snapshot prefixes the S3 encoder dispatches on. Mirror internal/s3keys/keys.go so a renamed prefix surfaces at compile time via the dispatch tests.
const ( // PebbleSnapshotMagicLen is the byte length of the "EKVPBBL1" // header. Exposed so callers can sniff the first 8 bytes of a // file to decide whether to dispatch into ReadSnapshot or fall // through to another reader. PebbleSnapshotMagicLen = 8 MaxSnapshotEncodedKeySize = maxSnapshotUserKeySize + snapshotTSSize MaxSnapshotEncodedValueSize = maxSnapshotUserValueSize + snapshotValueHeaderSize + envelopeMaxB )
Snapshot format constants — mirror store/lsm_store.go.
const ( SQSQueueMetaPrefix = "!sqs|queue|meta|" SQSQueueGenPrefix = "!sqs|queue|gen|" SQSQueueSeqPrefix = "!sqs|queue|seq|" SQSQueueTombstonePrefix = "!sqs|queue|tombstone|" SQSMsgDataPrefix = "!sqs|msg|data|" SQSMsgVisPrefix = "!sqs|msg|vis|" SQSMsgByAgePrefix = "!sqs|msg|byage|" SQSMsgDedupPrefix = "!sqs|msg|dedup|" SQSMsgGroupPrefix = "!sqs|msg|group|" )
Snapshot key prefixes the SQS encoder dispatches on. Kept in sync with adapter/sqs_keys.go and adapter/sqs_messages.go (see SqsQueueMetaPrefix / SqsMsgDataPrefix); a renamed prefix in the live code is caught here at dispatch time by the corresponding tests that synthesise records with these literal byte strings.
const CHECKSUMSFilename = "CHECKSUMS"
CHECKSUMSFilename is the on-disk name of the dump-tree-wide sha256sum(1)-compatible checksum file. Stored at the dump root so `sha256sum -c CHECKSUMS` from the same root verifies the dump without elastickv tooling — the documented vendor-independent recovery property of Phase 0a.
const CurrentFormatVersion uint32 = 1
CurrentFormatVersion is the format major-version this code emits and accepts. Restore-side code MUST refuse `format_version > current`. A minor-version bump (e.g., adding optional fields) does not change this constant.
const EncodeInfoFormatVersion uint32 = 1
EncodeInfoFormatVersion is the on-disk schema version for ENCODE_INFO.json. Bumped on incompatible schema changes; ReadEncodeInfo rejects unknown versions with ErrUnsupportedEncodeInfoFormatVersion so a future encoder release cannot silently drop fields a current operator relies on.
const PebbleSnapshotMagic = "EKVPBBL1"
PebbleSnapshotMagic is the 8-byte file header that introduces a native Pebble snapshot. Exposed for callers that need to sniff a file before deciding which reader to dispatch to. Declared as an untyped string CONSTANT (not a `var [8]byte`) so an importer cannot mutate the bytes — a writable package variable would let any caller corrupt the header globally and break parsing for every consumer (coderabbit Major on PR #792 round 2). Callers comparing against the magic should treat the encoded-key type of their own data: most call sites convert to a byte slice via `[]byte(PebbleSnapshotMagic)` at the comparison point.
const S3LeafDataSuffix = ".elastickv-leaf-data"
S3LeafDataSuffix renames the shorter of two S3 keys when the longer would force its parent to be a directory. Recorded in KEYMAP.jsonl.
const S3MetaSuffixReserved = ".elastickv-meta.json"
S3MetaSuffixReserved is the sidecar suffix per the design doc. A user S3 object key whose suffix matches this is rejected at dump time unless WithRenameCollisions is on.
Variables ¶
var ( ErrDDBInvalidSchema = errors.New("backup: invalid !ddb|meta|table value") ErrDDBInvalidItem = errors.New("backup: invalid !ddb|item value") ErrDDBMalformedKey = errors.New("backup: malformed DynamoDB key") )
ErrDDBInvalidSchema, ErrDDBInvalidItem, ErrDDBMalformedKey are the typed error classes for this encoder. Surface via errors.Is.
var ( // ErrSQSEncodeInvalidQueue is returned when a _queue.json cannot be // parsed or carries an empty queue name. ErrSQSEncodeInvalidQueue = errors.New("backup: sqs encode invalid _queue.json") // ErrSQSEncodeInvalidMessage is returned when a messages.jsonl line // cannot be parsed. ErrSQSEncodeInvalidMessage = errors.New("backup: sqs encode invalid messages.jsonl") // ErrSQSEncodeNotRegular is returned when a dump file is not a regular // file (symlink / FIFO / device / directory). ErrSQSEncodeNotRegular = errors.New("backup: sqs dump file is not a regular file") // ErrSQSEncodeUnsupportedPartitioned is returned for HT-FIFO queues // (partition_count > 1), whose partitioned key family this slice does // not yet reproduce. ErrSQSEncodeUnsupportedPartitioned = errors.New("backup: sqs partitioned (HT-FIFO) queue not yet supported by encoder") )
var ( // ErrS3InvalidBucketMeta is returned when a !s3|bucket|meta value // fails JSON decoding. ErrS3InvalidBucketMeta = errors.New("backup: invalid !s3|bucket|meta value") // ErrS3InvalidManifest is returned when a !s3|obj|head value fails // JSON decoding. ErrS3InvalidManifest = errors.New("backup: invalid !s3|obj|head value") // ErrS3MalformedKey is returned when an S3 key cannot be parsed // for its structural components. ErrS3MalformedKey = errors.New("backup: malformed S3 key") // ErrS3MetaSuffixCollision is returned when a user object key // collides with the reserved S3MetaSuffixReserved suffix. ErrS3MetaSuffixCollision = errors.New("backup: user S3 object key collides with reserved sidecar suffix") // ErrS3IncompleteBlobChunks is returned when a manifest declares // N chunks for some part but the snapshot did not contain all N. // Without this guard a partial / racy snapshot would silently // emit a truncated body. Codex P1 #729. ErrS3IncompleteBlobChunks = errors.New("backup: incomplete blob chunks for manifest-declared part") )
var ErrChecksumMismatch = errors.New("backup: checksum mismatch")
ErrChecksumMismatch is returned by VerifyChecksums when a recomputed digest does not match the stored value.
var ErrChecksumsEmpty = errors.New("backup: CHECKSUMS contains no checksum rows")
ErrChecksumsEmpty is returned by VerifyChecksums when the CHECKSUMS file parsed to zero checksum rows (empty file, all blank lines, or comment-only — none of which are valid Phase 0a dump shapes). Typed so callers can distinguish "CHECKSUMS file did not list anything" from "a listed file mismatched".
var ErrChecksumsMalformedLine = errors.New("backup: malformed CHECKSUMS line")
ErrChecksumsMalformedLine is returned by VerifyChecksums when a CHECKSUMS line does not match the expected `<hex> <path>` shape. Typed so callers can distinguish "CHECKSUMS itself corrupt" from "file content tampering".
var ErrChecksumsPathTraversal = errors.New("backup: CHECKSUMS path escapes dump root")
ErrChecksumsPathTraversal is returned by VerifyChecksums when a CHECKSUMS line's path field would resolve outside the dump root. Typed so a future verifier built on top of this package can branch on errors.Is and distinguish path-shape attacks from honest content-mismatch.
var ErrChecksumsSymlinkEscape = errors.New("backup: CHECKSUMS path traverses symlink")
ErrChecksumsSymlinkEscape is returned by VerifyChecksums when a CHECKSUMS line's resolved path traverses a symlink. Typed so a caller branching on errors.Is can distinguish symlink-tampering from textual `..`-traversal (ErrChecksumsPathTraversal).
var ErrDDBEncodeInvalidItem = errors.New("backup: dynamodb encode invalid item json")
ErrDDBEncodeInvalidItem is returned when an items/*.json file cannot be parsed into a well-formed item (bad JSON shape, unknown attribute type, missing primary-key attribute, a non-S/N/B primary key, or a malformed number literal).
var ErrDDBEncodeInvalidSchema = errors.New("backup: dynamodb encode invalid _schema.json")
ErrDDBEncodeInvalidSchema is returned when a _schema.json file cannot be parsed into the expected shape.
var ErrDDBEncodeNotRegular = errors.New("backup: dynamodb dump file is not a regular file")
ErrDDBEncodeNotRegular is returned when a _schema.json (or, later, item file) is not a regular file — a symlink, FIFO, device, or directory. A dedicated DDB sentinel (not the redis one) so callers classify it correctly via errors.Is.
var ErrDecodeOptionsInvalid = errors.New("backup: DecodeOptions invalid")
ErrDecodeOptionsInvalid is returned when DecodeSnapshot is called with an under-populated DecodeOptions (missing OutRoot, etc.).
var ErrEncodeAdapterData = errors.New("backup: adapter encoder rejected input tree")
ErrEncodeAdapterData marks every error returned by an adapter encoder (Redis / DynamoDB / S3 / SQS) so callers can distinguish "the input tree contained content the encoder cannot translate" from "operator passed a bad flag". The encoder is offline-only — every adapter error originates from rejecting the content under opts.InputRoot (a malformed DynamoDB _schema.json, an S3 collision artifact the encoder cannot reverse, a SQS side-record with an unknown kind, …). These are data-correctness failures, not user errors; the CLI maps this sentinel to exit 2 so runbooks can branch on exit status to quarantine bad dump data (codex P2 v9 #904).
Wrapped via errors.Mark inside runAdapterEncoders so the original adapter sentinel chain (ErrDDBEncodeInvalidSchema, …) is preserved for callers that errors.Is on the more specific type.
var ErrEncodeDuplicateKey = errors.New("backup: duplicate encoded key in snapshot build")
ErrEncodeDuplicateKey is returned when two reconstructed records MVCC-encode to the same key. The live store's keyspace is a set; a duplicate means an adapter reverse-encoder produced colliding internal keys, which would make the loaded snapshot order-dependent. The encoder fails closed rather than emit a `.fsm` whose Pebble image depends on insertion order.
var ErrEncodeKeyTooLarge = errors.New("backup: encoded key length exceeds limit")
ErrEncodeKeyTooLarge / ErrEncodeValueTooLarge mirror the decode-side MaxSnapshotEncodedKeySize / MaxSnapshotEncodedValueSize caps. A reconstructed entry exceeding them would produce a `.fsm` the live restore path rejects, so the encoder fails closed before writing a single byte rather than emit an unloadable file.
var ErrEncodeUnsupportedDynamoDBLayout = errors.New("backup: DynamoDB JSONL layout not supported by encoder")
ErrEncodeUnsupportedDynamoDBLayout is returned when an input dump declares `dynamodb_layout: "jsonl"` in MANIFEST.json. The DynamoDB reverse encoder only walks per-item files (items/*.json, items/*/*.json) and would silently skip every items/data-*.jsonl file, producing an .fsm with only table metadata and no items — a silent-data-loss restore artifact (codex P2 v7 #904). Fail closed until the encoder learns the JSONL layout (M7 / future milestone).
var ErrEncodeUnsupportedS3IncompleteUploads = errors.New("backup: S3 include_incomplete_uploads not supported by encoder")
ErrEncodeUnsupportedS3IncompleteUploads is returned when a caller (the CLI threading manifest.exclusions.include_incomplete_uploads, or a library caller setting EncodeOptions.S3IncludeIncompleteUploads) asks for S3 in-flight multipart uploads to round-trip, but the reverse encoder cannot rebuild that subtree. The S3 reverse encoder silently skips `_incomplete_uploads/` payload directories (internal/backup/encode_s3_objects.go), so a dump that included those records would publish an .fsm missing them. Fail closed until the encoder learns the subtree (codex P2 v21 #904).
var ErrEncodeUnsupportedS3Orphans = errors.New("backup: S3 include_orphans not supported by encoder")
ErrEncodeUnsupportedS3Orphans is returned when the manifest (or a library caller) requests round-tripping S3 pre-generation orphan blob chunks but the reverse encoder cannot rebuild that subtree. Same pattern as ErrEncodeUnsupportedS3IncompleteUploads — the S3 reverse encoder silently skips `_orphans/` payload directories; fail closed until the encoder learns them (codex P2 v21 #904).
var ErrEncodeUnsupportedSQSPreserveVisibility = errors.New("backup: preserve_sqs_visibility not supported by encoder")
ErrEncodeUnsupportedSQSPreserveVisibility is returned when the manifest requests preserving in-flight SQS message visibility state (`preserve_sqs_visibility=true`), but the reverse encoder unconditionally resets VisibleAtMillis / receive count / first receive / receipt token to zero on every restored message (internal/backup/encode_sqs.go). A dump that intentionally preserved visibility would silently restore as "every message visible/reset" without this guard (codex P2 v21 #904).
var ErrEncodeValueTooLarge = errors.New("backup: encoded value length exceeds limit")
var ErrInvalidEncodedSegment = errors.New("backup: invalid encoded filename segment")
ErrInvalidEncodedSegment is returned by DecodeSegment when its input is neither a valid percent-encoded segment, a binary-prefixed segment, nor a SHA-fallback segment.
var ErrInvalidKeymapRecord = errors.New("backup: invalid KEYMAP.jsonl record")
ErrInvalidKeymapRecord is returned by Reader.Next when a line does not parse as a KeymapRecord (malformed JSON, missing field, malformed base64, etc.).
var ErrInvalidManifest = errors.New("backup: manifest invalid")
ErrInvalidManifest is returned by ReadManifest when the JSON parses but fails structural validation (missing required field, unknown phase, etc.).
var ErrLastCommitTSRegression = errors.New("backup: --last-commit-ts override is older than manifest last_commit_ts")
ErrLastCommitTSRegression is returned by ResolveCommitTS when a `--last-commit-ts` override is older than MANIFEST.last_commit_ts. Seeding the restored node's HLC ceiling below a timestamp already durable in the dump would let a post-restart leader re-issue a timestamp at-or-below a restored row's commit ts — the exact HLC-ceiling regression the design's §"MVCC re-encoding" forbids. Raising the ceiling (T >= manifest) is always safe; lowering it is not, so the override is accepted in one direction only.
var ErrPendingTTLBufferFull = cockroachdberr.New("backup: pendingTTL byte budget exhausted; raise WithPendingTTLByteCap or accept orphan-counter mode via WithPendingTTLByteCap(0)")
ErrPendingTTLBufferFull is returned by HandleTTL when an unknown-kind TTL arrives but the pendingTTL buffer's BYTE budget (pendingTTLBytesCap) is already exhausted. The encoder fails closed here rather than silently counting the TTL as an orphan because in real Pebble snapshot order (`!redis|ttl|` lex-sorts before `!st|`/`!stream|`/`!zs|`), the dropped entry would likely belong to a valid wide-column key that arrives later — losing its expire_at_ms would produce a restored database with non-expiring data that the source snapshot's clients expected to expire.
The budget is in BYTES (not entries) because Redis user keys can be up to 1 MiB each; an entry-count cap of N still permits N MiB of accumulated key bytes, which defeats the OOM protection on adversarial snapshots with large keys. Codex P1 on PR #790 round 5 (entry-count -> byte-budget on round 6).
The default cap is 1 GiB (see defaultPendingTTLBytesCap), sized to accommodate typical real-world workloads — at the 58-byte per-entry cost (50-byte user key + 8-byte expireAtMs payload) the cap holds up to ~18.5M expiring wide-column keys before fail-closed (1,073,741,824 / 58 ≈ 18,513,652). Larger deployments must raise the ceiling via WithPendingTTLByteCap. The OOM ceiling is still hard (adversarial snapshots with TB of large keys fail closed). Codex P1 on PR #790 round 7; claude r9/r10 sizing-claim correction in round 11.
Recovery: raise WithPendingTTLByteCap above the snapshot's expected cumulative byte cost of unmatched-at-intake TTLs, or set the byte cap to 0 to explicitly opt into the lossy counter-only mode.
var ErrRedisEncodeHardLink = errors.New("backup: redis dump file is hard-linked")
ErrRedisEncodeHardLink is returned (on platforms where the link count is observable — see refuseHardLink) when a dump file has more than one hard link. A hard link can name an inode outside the dump subtree while passing the IsRegular and os.Root symlink guards, so ingesting it would breach the untrusted-input boundary (codex P2 on PR #828).
var ErrRedisEncodeInvalidJSON = errors.New("backup: redis encode invalid collection JSON")
ErrRedisEncodeInvalidJSON is returned when a collection JSON file in the dump cannot be parsed into its expected shape.
var ErrRedisEncodeMissingKeymap = errors.New("backup: redis encode missing KEYMAP entry for sha-fallback key")
ErrRedisEncodeMissingKeymap is returned when a strings/ or hll/ file name (or a TTL sidecar key) took the SHA-fallback encoding but the db's KEYMAP.jsonl has no matching record to recover the original user-key bytes. The encoder fails closed rather than emit a record under a truncated/hashed key the live cluster would never serve.
var ErrRedisEncodeMultiDBUnsupported = errors.New("backup: redis encoder requires single db_0 (multi-DB or non-zero DB not yet supported)")
ErrRedisEncodeMultiDBUnsupported is returned when the input tree contains a Redis db_<N>/ for any N != 0, or contains multiple db_<N> directories. The current Redis MVCC key prefixes (!redis|str|, !redis|hll|, !redis|ttl|, …) carry NO database component, so feeding two distinct DBs into the same snapshot builder would either collide on same-named keys or silently merge both DBs under db_0 on restore (DecodeOptions.RedisDBIndex defaults to 0). Failing closed preserves correctness until Phase 1 makes the native keys DB-aware (codex P2 v14 #904).
v14 originally fanned out across db_<N> to address codex P1 v13's silent-data-loss concern; codex's v14 follow-up clarified that fan-out under the current key format produces mis-scoped output. The corrected fix replaces fan-out with fail-closed.
var ErrRedisEncodeNotDir = errors.New("backup: redis db path is not a directory")
ErrRedisEncodeNotDir is returned when the redis/db_<n> path exists but is a regular file rather than a directory — a malformed dump. A dedicated sentinel (not ErrRedisEncodeMissingKeymap) so callers can distinguish "bad dump layout" from "sha-fallback key without a keymap entry" via errors.Is.
var ErrRedisEncodeNotRegular = errors.New("backup: redis dump sidecar is not a regular file")
ErrRedisEncodeNotRegular is returned when a dump sidecar (KEYMAP.jsonl, strings_ttl.jsonl, ...) exists but is not a regular file — a symlink, FIFO, device, or directory. Reading such a path with plain os.Open would follow the symlink or block indefinitely on a reader-less FIFO; the encoder fails closed instead, matching the non-regular refusal walkBlobDir applies to *.bin entries (codex P2 on PR #828).
var ErrRedisInvalidHashKey = cockroachdberr.New("backup: malformed !hs| key")
ErrRedisInvalidHashKey is returned when an !hs| key cannot be parsed for its userKeyLen+userKey segment (truncated, malformed, etc).
var ErrRedisInvalidHashMeta = cockroachdberr.New("backup: invalid !hs|meta| value")
ErrRedisInvalidHashMeta is returned when the !hs|meta| value is not the expected 8-byte big-endian field count.
var ErrRedisInvalidListKey = cockroachdberr.New("backup: malformed !lst| key")
ErrRedisInvalidListKey is returned when an !lst| key cannot be parsed for its userKey + (optional) seq segments.
var ErrRedisInvalidListMeta = cockroachdberr.New("backup: invalid !lst|meta| value")
ErrRedisInvalidListMeta is returned when an !lst|meta| value is not the expected 24-byte (Head, Tail, Len) layout.
var ErrRedisInvalidSetKey = cockroachdberr.New("backup: malformed !st| key")
ErrRedisInvalidSetKey is returned when an !st| key cannot be parsed for its userKeyLen+userKey (or member) segments.
var ErrRedisInvalidSetMeta = cockroachdberr.New("backup: invalid !st|meta| value")
ErrRedisInvalidSetMeta is returned when an !st|meta| value is not the expected 8-byte big-endian member count.
var ErrRedisInvalidStreamEntry = cockroachdberr.New("backup: invalid !stream|entry| value")
ErrRedisInvalidStreamEntry is returned when an !stream|entry| value's magic prefix is missing or its protobuf body fails to unmarshal.
var ErrRedisInvalidStreamKey = cockroachdberr.New("backup: malformed !stream| key")
ErrRedisInvalidStreamKey is returned when a !stream| key cannot be parsed for its userKeyLen+userKey (or trailing ID) segments.
var ErrRedisInvalidStreamMeta = cockroachdberr.New("backup: invalid !stream|meta| value")
ErrRedisInvalidStreamMeta is returned when an !stream|meta| value is not the expected 24 bytes or carries a negative length.
var ErrRedisInvalidStringValue = cockroachdberr.New("backup: invalid !redis|str| value")
ErrRedisInvalidStringValue is returned when a !redis|str| value uses the new magic-prefix format but its declared TTL section is truncated. Legacy (no-magic) values are accepted as opaque raw bytes.
var ErrRedisInvalidTTLValue = cockroachdberr.New("backup: invalid !redis|ttl| value")
ErrRedisInvalidTTLValue is returned when a !redis|ttl| value is not the expected 8-byte big-endian uint64 millisecond expiry.
var ErrRedisInvalidZSetKey = cockroachdberr.New("backup: malformed !zs| key")
ErrRedisInvalidZSetKey is returned when an !zs| key cannot be parsed for its userKeyLen+userKey (or member) segments.
var ErrRedisInvalidZSetLegacyBlob = cockroachdberr.New("backup: invalid !redis|zset| value")
ErrRedisInvalidZSetLegacyBlob is returned when a `!redis|zset|` value's magic prefix is missing, its protobuf body fails to unmarshal, or its decoded scores include NaN. Fail-closed for the same reason as ErrRedisInvalidZSetMember: silently accepting a corrupt blob would lose the entire zset's contents at restore.
var ErrRedisInvalidZSetMember = cockroachdberr.New("backup: invalid !zs|mem| value")
ErrRedisInvalidZSetMember is returned when an !zs|mem| value is not the expected 8-byte IEEE 754 score, or contains a NaN score (Redis's ZADD command rejects NaN, so a NaN at backup time indicates store corruption and a silent fall-through would re-corrupt the restored cluster).
var ErrRedisInvalidZSetMeta = cockroachdberr.New("backup: invalid !zs|meta| value")
ErrRedisInvalidZSetMeta is returned when an !zs|meta| value is not the expected 8-byte big-endian member count.
var ErrS3EncodeInvalidBucket = errors.New("backup: s3 encode invalid _bucket.json")
ErrS3EncodeInvalidBucket is returned when a _bucket.json cannot be parsed into the expected shape (or carries an empty bucket name).
var ErrS3EncodeInvalidManifest = errors.New("backup: s3 encode invalid object sidecar")
ErrS3EncodeInvalidManifest is returned when an object's .elastickv-meta.json sidecar cannot be parsed.
var ErrS3EncodeNotRegular = errors.New("backup: s3 dump file is not a regular file")
ErrS3EncodeNotRegular is returned when a dump file (here _bucket.json) is not a regular file — a symlink, FIFO, device, or directory.
var ErrS3EncodeReservedPrefixCollision = errors.New("backup: s3 encode user object key collides with reserved dump directory")
ErrS3EncodeReservedPrefixCollision is returned when a user object key collides with a reserved dump-control directory (_incomplete_uploads/ or _orphans/) — the decoder writes such keys at their natural path without renaming, so the encoder cannot disambiguate them from the dump's own payload. Codex P1 #842 follow-up: failing closed prevents silently dropping the entire user-object subtree.
var ErrS3EncodeUnsupportedCollision = errors.New("backup: s3 encode object-name collision not yet supported")
ErrS3EncodeUnsupportedCollision is returned when the dump carries an object-name collision artifact (KEYMAP.jsonl or a .elastickv-leaf-data file) that this slice does not yet reverse — failing closed avoids emitting a record under the wrong object key.
var ErrSQSInvalidMessage = errors.New("backup: invalid !sqs|msg|data value")
ErrSQSInvalidMessage is returned for !sqs|msg|data values that miss the magic prefix or fail JSON decoding.
var ErrSQSInvalidQueueMeta = errors.New("backup: invalid !sqs|queue|meta value")
ErrSQSInvalidQueueMeta is returned for !sqs|queue|meta values that miss the magic prefix or fail JSON decoding.
var ErrSQSMalformedKey = errors.New("backup: malformed SQS key")
ErrSQSMalformedKey is returned when an SQS key cannot be parsed for the queue-name segment (e.g., the heuristic boundary detection found no transition byte).
var ErrSelfTestLowerLastCommitTS = errors.New("backup: --last-commit-ts T < manifest.last_commit_ts (HLC ceiling regression)")
ErrSelfTestLowerLastCommitTS is returned when the operator-supplied T is below the manifest's last_commit_ts. The HLC ceiling invariant (CLAUDE.md "Timestamp Oracle") forbids lowering the ceiling on restore: a lower T would let a post-restart leader issue a read ts ≤ a restored row's commit ts.
Enforced at two layers:
- CLI (`resolveLastCommitTS`) rejects --last-commit-ts T < manifest before EncodeSnapshot is called (exit code 2).
- Library (`validateEncodeOptions`) rejects when the caller threads `opts.ManifestLastCommitTS > 0` and `opts.LastCommitTS` is below it — defense-in-depth for in-process callers (Phase 1 live extractor, integration tests) that bypass the CLI.
Callers can errors.Is on this sentinel to map to the right exit code (claude v3 doc bug #904 + claude v7 doc bug #904 + codex P2 v2 #904).
var ErrShaFallbackNeedsKeymap = errors.New("backup: filename uses SHA fallback; consult KEYMAP.jsonl")
ErrShaFallbackNeedsKeymap is returned by DecodeSegment when its input is a SHA-fallback segment. The segment cannot be reversed to its original bytes from the filename alone — the caller must consult KEYMAP.jsonl.
var ErrSnapshotBadMagic = cockroachdberr.New("backup: snapshot magic header does not match \"EKVPBBL1\"")
ErrSnapshotBadMagic is returned when the first 8 bytes of the reader do not match `EKVPBBL1`. The decoder caller should treat this as an immediate hard failure rather than try to skip past the bad header — a wrong magic almost always indicates the file is not actually a Pebble snapshot (an MVCC streaming snapshot, a tar archive, a partial truncate, etc.).
var ErrSnapshotBuilderReused = errors.New("backup: snapshotBuilder.WriteTo called more than once")
ErrSnapshotBuilderReused is returned by WriteTo when it is called more than once on the same builder. A builder is single-use (one per encode run); a second WriteTo would re-emit the already-written entries, producing a valid-but-unintended stream. Enforced so the per-adapter feed loops in later milestones cannot silently double- emit (claude review on PR #825).
var ErrSnapshotEncryptedEntry = cockroachdberr.New("backup: snapshot contains encrypted entries — Phase 0a does not link the decryption keyring")
ErrSnapshotEncryptedEntry is returned when a value-header declares the entry is encrypted (encState=0b01). Phase 0a does NOT carry the decryption keyring; an encrypted snapshot must be decoded with a Phase 0a+keyring binary or after Stage 8 of the encryption rollout reverses the encryption.
var ErrSnapshotEncryptedReserved = cockroachdberr.New("backup: value header carries reserved encryption_state; decoder cannot interpret this entry")
ErrSnapshotEncryptedReserved is returned when a value-header carries reserved encryption_state bits (0b10 or 0b11). Mirrors store.ErrEncryptedValueReservedState — the decoder fails closed rather than treat the body as cleartext, matching the design's §7.1 fail-closed contract.
var ErrSnapshotIndexUnparseable = errors.New("backup: snapshot path does not match <index>.fsm convention")
ErrSnapshotIndexUnparseable is returned by SnapshotIndexFromPath when the input filename does not match the live writer's `<uint64>.fsm` convention. The Phase 0a CLI surfaces this as a soft warning (the dump still completes; MANIFEST.snapshot_index is left zero) rather than a hard failure — operator-supplied .fsm files copied off-cluster sometimes carry generic names.
var ErrSnapshotKeyTooLarge = cockroachdberr.New("backup: snapshot key length exceeds limit")
ErrSnapshotKeyTooLarge / ErrSnapshotValueTooLarge are returned when the on-disk length prefix declares an entry larger than the MaxSnapshotEncodedKeySize / MaxSnapshotEncodedValueSize budgets. Mirrors `store.ErrSnapshotKeyTooLarge` and `store.ErrValueTooLarge` from the live restore path so a corrupt or adversarial snapshot fails closed at the length-prefix layer instead of triggering an OOM-sized allocation (codex P1 + gemini security-high on PR #792).
var ErrSnapshotShortKey = cockroachdberr.New("backup: encoded key shorter than timestamp suffix")
ErrSnapshotShortKey is returned when an entry's encoded key is shorter than the 8-byte timestamp suffix that `store.fillEncodedKey` always appends. Indicates a corrupt snapshot — the live store would never emit such a key.
var ErrSnapshotShortValue = cockroachdberr.New("backup: encoded value shorter than value-header")
ErrSnapshotShortValue is returned when an entry's encoded value is shorter than the 9-byte value header. Indicates a corrupt snapshot — the live store always writes the header even for tombstones.
var ErrSnapshotTruncated = cockroachdberr.New("backup: snapshot truncated mid-entry")
ErrSnapshotTruncated is returned when the snapshot ends mid-entry (after a key length but before the key, or after a value length but before the value). A clean EOF at the start of the key-length field is a normal terminator and is NOT an error.
var ErrSnapshotValueTooLarge = cockroachdberr.New("backup: snapshot value length exceeds limit")
var ErrUnsupportedEncodeInfoFormatVersion = errors.New("backup: unsupported ENCODE_INFO format_version")
ErrUnsupportedEncodeInfoFormatVersion is returned by ReadEncodeInfo when the sidecar's format_version is not EncodeInfoFormatVersion. Mirrors the decoder's ErrUnsupportedFormatVersion contract so callers can branch on errors.Is.
var ErrUnsupportedFormatVersion = errors.New("backup: manifest format_version unsupported")
ErrUnsupportedFormatVersion is returned by ReadManifest when the on-disk format_version is greater than CurrentFormatVersion or zero.
Functions ¶
func DecodeLiveEntries ¶
func DecodeLiveEntries(r io.Reader) ([]RoundTripEntry, SnapshotHeader, error)
DecodeLiveEntries decodes an EKVPBBL1 stream through ReadSnapshot and returns its live (non-tombstone) entries plus the header. It is the round-trip self-test primitive from the design's §"Round-trip self-test": the encoder decodes its own just-written bytes and compares the recovered records against what it fed the builder before committing the final `.fsm` to disk, so a node never receives an unloadable snapshot.
Tombstones are skipped (the encoder never writes them, so seeing one would indicate a corrupted build; surfacing only live records keeps the comparison symmetric with what Add accepts). Byte slices are cloned because ReadSnapshot reuses its scratch buffers across the callback.
func DecodeSegment ¶
DecodeSegment is the inverse of EncodeSegment for percent-encoded and binary-prefixed inputs. SHA-fallback inputs return ErrShaFallbackNeedsKeymap so the caller knows to consult KEYMAP.jsonl rather than treat the partial suffix as the original key.
As a defensive measure DecodeSegment refuses inputs longer than maxSegmentBytes. EncodeSegment never produces such inputs, so any caller passing one is either reading a corrupted dump or has a bug; either way the percentDecode allocation should not run.
func EncodeBinarySegment ¶
EncodeBinarySegment encodes a DynamoDB B-attribute (binary) segment as "b64.<base64url-no-padding>" so that binary keys never collide with string keys whose hex-encoding happens to look like base64.
Short-circuits the SHA-fallback for inputs whose base64 expansion (~4/3 of the raw length, plus the 4-byte "b64." prefix) would always overflow maxSegmentBytes. As with EncodeSegment, this avoids an unnecessary large allocation when the result would have been discarded anyway.
func EncodeDDBItemKey ¶
EncodeDDBItemKey constructs a !ddb|item key for tests. Mirrors the live legacyDynamoItemKey constructor in adapter/dynamodb.go (string hash + range, simplest shape).
func EncodeDDBTableMetaKey ¶
EncodeDDBTableMetaKey constructs a !ddb|meta|table key for tests.
func EncodeInfoSidecarPath ¶
EncodeInfoSidecarPath returns the path-derived sidecar location for a given .fsm output path. Multiple .fsm files can share a directory (e.g., per-node dumps under /backups/); a static "ENCODE_INFO.json" name would silently overwrite siblings (gemini medium #896).
Convention: append ".encode_info.json" to the full output path. The same scheme gpg and sha256sum follow when their input is path-addressable.
func EncodeMsgDataKey ¶
EncodeMsgDataKey constructs a !sqs|msg|data key for tests. Mirrors the live sqsMsgDataKey constructor in adapter/sqs_messages.go.
func EncodeQueueMetaKey ¶
EncodeQueueMetaKey constructs a !sqs|queue|meta key for tests.
func EncodeSegment ¶
EncodeSegment encodes a single user-supplied path segment for use as a filename component. It is the inverse of DecodeSegment for non-fallback inputs.
The encoding is deterministic given the same input.
Three structural short-circuits ensure DecodeSegment cannot misclassify a legitimate key:
- If `raw` is longer than maxSegmentBytes, even a fully-unreserved encoding (1:1) cannot fit, so we go straight to shaFallback. This also caps the percent-encode allocation at ~maxSegmentBytes, preventing OOM on adversarial input.
- If the percent-encoded form happens to match the SHA-fallback shape (32 hex chars followed by "__"), we promote it to a real SHA-fallback so DecodeSegment's structural detection cannot fabricate a wrong original.
- If the percent-encoded form starts with the binary "b64." prefix, we promote to SHA-fallback for the same reason: a plain string key like "b64.foo" would otherwise be decoded as base64 and produce different bytes on round-trip.
Both promoted-fallback paths leave the original in KEYMAP.jsonl (a correctness dependency, per the package doc), so exact-byte recovery is preserved.
func HasInlineTTL ¶
HasInlineTTL reports whether a !redis|str| value carries the new-format inline TTL header. Useful for tests asserting the producer's choice.
func IsBinarySegment ¶
IsBinarySegment reports whether seg is a base64-url encoded binary segment emitted by EncodeBinarySegment.
func IsBlobAtomicWriteOutOfSpace ¶
IsBlobAtomicWriteOutOfSpace reports whether err from writeFileAtomic (or any os.File write the master pipeline issues) was driven by a full disk. The platform-specific error codes (POSIX ENOSPC vs. Windows ERROR_DISK_FULL / ERROR_HANDLE_DISK_FULL) live in disk_full_{unix,windows}.go so retry/alarm logic in callers classifies disk-full uniformly across operating systems (Codex P2 round 9).
func IsBlobAtomicWriteRetriable ¶
IsBlobAtomicWriteRetriable reports whether err from writeFileAtomic is a retriable I/O failure. Today the only retriable signal is io.ErrShortWrite. ENOSPC (disk full) is intentionally NOT retriable here — the master pipeline must surface it to the operator rather than spin: a backup against a full disk has no business retrying. IsBlobAtomicWriteOutOfSpace is the explicit out-of-space probe so the pipeline can choose the right alarm wording.
func IsShaFallback ¶
IsShaFallback reports whether seg uses the SHA-prefix-and-truncated-original form. Such segments cannot be reversed without KEYMAP.jsonl.
func LoadKeymap ¶
func LoadKeymap(r io.Reader) (map[string]KeymapRecord, error)
LoadKeymap reads every record from r into an in-memory map keyed by encoded segment. The last record wins on duplicates. Suitable for scopes where the keymap fits comfortably in memory; for large scopes callers should use KeymapReader directly.
func OpenSidecarFile ¶
OpenSidecarFile is the exported wrapper around the per-platform openSidecarFile. It opens path for write while refusing symlink, hard-link, FIFO, socket, and other non-regular-file clobber attacks via the platform-appropriate primitives (O_NOFOLLOW + O_NONBLOCK + Nlink check on unix; Lstat-then-OpenFile on Windows; a stricter Lstat-then-OpenFile fallback on other platforms).
Use this whenever a writer creates or replaces a "sidecar" style file at a deterministic path inside an operator-supplied directory — the path is predictable to an attacker who can pre- create the entry, so the open MUST refuse to follow a symlink or truncate a hard-linked / non-regular file (codex P2 v25 #904 extended this from in-package adapter writers to the cmd/elastickv-snapshot-encode CLI's ENCODE_INFO.json writer).
func ReadSnapshot ¶
func ReadSnapshot(r io.Reader, fn func(SnapshotHeader, SnapshotEntry) error) error
ReadSnapshot reads the EKVPBBL1 header from r, then yields every entry through fn. fn receives a transient SnapshotEntry whose byte slices are NOT safe to retain across calls (the reader reuses scratch buffers to keep per-entry allocations bounded for multi-GB snapshots). If fn returns an error, iteration stops and the error is returned verbatim.
Iteration terminates cleanly on EOF at the start of an entry's key-length field. EOF inside an entry returns ErrSnapshotTruncated.
Tombstone entries (flags bit 0 set) are surfaced via SnapshotEntry.Tombstone — callers decide whether to suppress them (Phase 0a's intended behavior for backup output) or include them (a multi-version diagnostic dump might want both).
Callers that need the header even when the snapshot has zero entries (and so the callback never fires) should use ReadSnapshotWithHeader, which surfaces it as a separate return value. The callback's per-entry SnapshotHeader argument carries the same value but only fires when at least one entry exists.
func ResolveCommitTS ¶
ResolveCommitTS returns the commit timestamp the encoder stamps on every reconstructed key (design §"MVCC re-encoding": uniform stamping). manifestTS is MANIFEST.last_commit_ts; override is the optional `--last-commit-ts` value (nil = no override).
Fail-closed contract: an override is accepted only when it is >= manifestTS (raising the restored HLC ceiling is safe; lowering it risks a post-restart timestamp colliding with a restored row). The returned value is used verbatim for BOTH the EKVPBBL1 header and every key's invTS, so the two never disagree.
func SnapshotIndexFromPath ¶
SnapshotIndexFromPath extracts the integer applied-index encoded in a `.fsm` filename. The live writer (internal/raftengine/etcd/ engine.go) names every snapshot `<index>.fsm` where <index> is the FSM's applied_index at the moment the snapshot was taken. Phase 0a's MANIFEST.json carries that value as `snapshot_index` so a restore-time operator knows how stale the dump is relative to the cluster's last activity.
The numeric portion is everything between the final path separator (or string start) and the `.fsm` suffix. Lengths and zero-padding are not enforced — the live writer happens to pad to 16 digits, but a hand-rolled snapshot named `42.fsm` decodes correctly too.
Returns ErrSnapshotIndexUnparseable when path lacks the .fsm suffix or when the basename is not a parseable uint64.
func VerifyChecksums ¶
VerifyChecksums reads root/CHECKSUMS, recomputes sha256 of every listed file, and returns the first mismatch as an error. Files referenced by CHECKSUMS but missing on disk surface as the same error class so a partial-restore is obvious.
Memory: streaming. Uses bufio.Scanner with a fixed 8 KiB per-line budget so a multi-million-file dump tree's CHECKSUMS file does not need to fit in memory, and a corrupt CHECKSUMS without newlines cannot trick us into buffering arbitrarily much (the gemini security-high finding on PR #810 — the previous os.ReadFile + strings.Split path materialised the whole file).
Path safety: every CHECKSUMS line's relative-path field is validated by validateChecksumRelPath before being joined with root, so an adversarial CHECKSUMS shipped alongside a backup (e.g. with a `../../etc/shadow` entry) cannot escape root and fingerprint files the operator did not intend to expose. The coderabbit critical finding on PR #810.
Used by the Phase 0b encoder's self-test path: after encoding a directory tree back into a `.fsm`, it re-runs the decoder and asserts WriteChecksums + VerifyChecksums round-trip cleanly.
func WriteChecksums ¶
WriteChecksums walks root recursively, computes sha256 over every regular file encountered, and writes a sha256sum(1)-compatible CHECKSUMS file at root/CHECKSUMS.
Each line is `<hex> <relative-path>\n` (two spaces, "binary" mode in sha256sum vocabulary). The relative path is rooted at `root` itself and uses forward-slash separators so the file is portable across operating systems (sha256sum on Linux and the macOS/BSD shasum tool both accept this shape).
CHECKSUMS itself is excluded from the listing — the file is written last and lists every other file in the tree.
Determinism: the line ordering is lexicographic on the relative path so two invocations on the same dump tree produce byte-identical CHECKSUMS files. The design (§"CHECKSUMS") relies on this for the "encode → decode round-trip is byte-identical" property.
func WriteEncodeInfo ¶
func WriteEncodeInfo(w io.Writer, info EncodeInfo) error
WriteEncodeInfo serializes info to w. Caller is responsible for the fsync+close discipline (the cmd wrapper uses os.File.Sync then Close to surface late writeback errors — gemini r1 medium on #810).
Types ¶
type Adapter ¶
type Adapter struct {
Tables []string `json:"tables,omitempty"`
Buckets []string `json:"buckets,omitempty"`
Databases []uint32 `json:"databases,omitempty"`
Queues []string `json:"queues,omitempty"`
}
Adapter holds the scope identifiers for one adapter. Field names are per-adapter to match the protocol's natural vocabulary.
type AdapterSet ¶
AdapterSet selects which adapter encoders DecodeSnapshot instantiates. Disabled adapters are silently dropped at dispatch time (counted as Internal); they do NOT create empty output directories.
func AllAdapters ¶
func AllAdapters() AdapterSet
AllAdapters enables every adapter. Convenience for the CLI's default (`--adapter` flag absent) and for end-to-end tests.
type Adapters ¶
type Adapters struct {
DynamoDB *Adapter `json:"dynamodb,omitempty"`
S3 *Adapter `json:"s3,omitempty"`
Redis *Adapter `json:"redis,omitempty"`
SQS *Adapter `json:"sqs,omitempty"`
}
Adapters lists which scopes were dumped per adapter. The pointer values express two distinguishable on-disk states:
- nil -> the adapter was excluded from this dump (e.g. `--adapter dynamodb,s3` filtered it out). The corresponding JSON key is absent.
- non-nil pointer to Adapter{} -> the adapter was in scope but no scopes for it were emitted (no tables, no buckets, etc.). The JSON key is present with an empty object.
- non-nil pointer to a populated Adapter -> the listed scopes were emitted.
Storing pointers (rather than zero-value Adapter structs) is what keeps "excluded by filter" distinguishable from "included but empty" through json.Marshal — non-pointer fields would collapse both states into the same on-disk shape.
type DDBEncoder ¶
type DDBEncoder struct {
// contains filtered or unexported fields
}
DDBEncoder encodes the DynamoDB prefix family into the per-table layout described in docs/design/2026_04_29_proposed_snapshot_logical_decoder.md (Phase 0): one `_schema.json` per table and one `items/<pk>/<sk>.json` per item (default per-item layout).
Lifecycle: Handle* per record, Finalize once. Items arrive before the schema in lex order ('i' < 'm' under !ddb|), so the encoder buffers per-encoded-table-segment and emits at Finalize once the schema is known.
Wide-column GSI rows (!ddb|gsi|*) are NOT dumped: they are derivable from the base item set + schema, and replaying GSI rows on restore would conflict with the destination's own index maintenance.
func NewDDBEncoder ¶
func NewDDBEncoder(outRoot string) *DDBEncoder
NewDDBEncoder constructs an encoder rooted at <outRoot>/dynamodb/.
func (*DDBEncoder) Finalize ¶
func (d *DDBEncoder) Finalize() error
Finalize emits each table's _schema.json and per-item JSON files. Tables with items but no schema (orphans) emit a warning and are skipped — preserving the spec's lenient handling for incomplete inputs. Real flush errors fail fast so corruption surfaces immediately rather than being attributed to a later table (Gemini MEDIUM #182).
func (*DDBEncoder) HandleGSIRow ¶
func (d *DDBEncoder) HandleGSIRow(_, _ []byte) error
HandleGSIRow drops GSI rows by default (they are derivable from the base item set + schema). Exposed as a no-op so the master pipeline can dispatch all !ddb|* prefixes uniformly without special-casing.
func (*DDBEncoder) HandleItem ¶
func (d *DDBEncoder) HandleItem(key, value []byte) error
HandleItem processes a !ddb|item|<encTable>|<gen>|<rest> record. The encoded table segment AND the item generation are parsed out of the key; the proto is buffered keyed by generation so Finalize can emit only the rows belonging to the schema's active generation.
Stale-generation rows (left behind by an in-flight delete/recreate before async cleanup finishes) would otherwise silently leak under the new schema and either resurrect deleted data or fail Finalize when primary-key names changed across generations — Codex P1 #237.
func (*DDBEncoder) HandleTableGen ¶
func (d *DDBEncoder) HandleTableGen(_, _ []byte) error
HandleTableGen drops the per-table generation counter (operational state, not user-visible).
func (*DDBEncoder) HandleTableMeta ¶
func (d *DDBEncoder) HandleTableMeta(key, value []byte) error
HandleTableMeta processes a !ddb|meta|table|<encodedTable> record. Strips the magic prefix, proto-unmarshals into DynamoTableSchema, and parks it on the per-table state.
func (*DDBEncoder) WithBundleJSONL ¶
func (d *DDBEncoder) WithBundleJSONL(on bool) *DDBEncoder
WithBundleJSONL switches per-table layout to `items/data-<part>.jsonl` (one item per line). Default is per-item files. The choice is recorded in MANIFEST.json (`dynamodb_layout`) by the master pipeline; the encoder itself only needs the flag to pick the on-disk shape.
Bundle mode is a follow-up: this PR ships per-item only. Calling WithBundleJSONL(true) returns an error from Finalize until the bundle path lands.
func (*DDBEncoder) WithWarnSink ¶
func (d *DDBEncoder) WithWarnSink(fn func(event string, fields ...any)) *DDBEncoder
WithWarnSink wires structured-warning emission (orphan items, schema-less tables, etc.).
type DecodeCounters ¶
type DecodeCounters struct {
Total uint64
Tombstone uint64
DynamoDB uint64
S3 uint64
Redis uint64
SQS uint64
Internal uint64
Unknown uint64
}
DecodeCounters reports per-class entry counts after a successful DecodeSnapshot call.
The breakdown is:
Total - every snapshot entry that reached the dispatcher
Tombstone - entries whose value-header flagged a delete; skipped
before adapter dispatch
DynamoDB - entries routed to a DDBEncoder method
S3 - entries routed to an S3Encoder method
Redis - entries routed to a RedisDB method
SQS - entries routed to a SQSEncoder method
Internal - entries matched by a known internal-only prefix
(e.g. !txn|, !s3route|) AND entries that matched an
adapter prefix whose adapter was excluded by
DecodeOptions.Adapters; both are intentional drops
Unknown - entries whose key prefix matched no route — surfaced
so a malformed or version-skewed snapshot does not
silently round-trip into an empty dump
Total = Tombstone + DynamoDB + S3 + Redis + SQS + Internal + Unknown.
type DecodeOptions ¶
type DecodeOptions struct {
// OutRoot is the destination directory tree root. Must be set.
OutRoot string
// ScratchRoot is where the S3 encoder buffers blob chunks during
// assembly. Defaults to <OutRoot>/.scratch when empty (NEVER to
// OutRoot itself — sharing would let S3Encoder.Finalize's
// os.RemoveAll wipe the final dump). Callers backing the CLI's
// --scratch-root flag override it (e.g. to a tmpfs).
//
// DecodeSnapshot returns ErrDecodeOptionsInvalid if ScratchRoot
// cleaned-equals OutRoot: a CLI misconfig that passes
// --scratch-root=$OUT_ROOT would otherwise be the silent
// data-loss path the default-target check is designed to prevent.
ScratchRoot string
// Adapters selects which adapter encoders to construct. Entries
// for disabled adapters are dropped silently and counted as
// Internal.
Adapters AdapterSet
// RedisDBIndex selects the <n> in redis/db_<n>/. Phase 0a always
// emits db_0 but the parameter is plumbed so Phase 1 can dump
// multiple DBs.
RedisDBIndex int
// IncludeIncompleteUploads enables `--include-incomplete-uploads`
// for the S3 encoder (in-flight multipart uploads).
IncludeIncompleteUploads bool
// IncludeOrphans enables `--include-orphans` for the S3 encoder
// (pre-generation orphan blob chunks).
IncludeOrphans bool
// RenameS3Collisions enables `--rename-collisions` for the S3
// encoder (rename user keys ending in `.elastickv-meta.json`).
RenameS3Collisions bool
// PreserveSQSVisibility enables `--preserve-sqs-visibility` for
// the SQS encoder (carry live visibility-state fields into the
// dump rather than zeroing them).
PreserveSQSVisibility bool
// IncludeSQSSideRecords enables `--include-sqs-side-records` for
// the SQS encoder (`_internals/dedup.jsonl` etc.).
IncludeSQSSideRecords bool
// DynamoDBBundleJSONL switches the DynamoDB encoder to the JSONL
// bundle layout (`items/data-<part>.jsonl`).
DynamoDBBundleJSONL bool
// WarnSink, when non-nil, receives structured warnings from the
// per-adapter encoders ("redis_orphan_ttl", "ddb_orphan_items",
// ...). The dispatcher itself does not emit warnings.
WarnSink func(event string, fields ...any)
}
DecodeOptions configures DecodeSnapshot. Zero values are documented as "off" for booleans; OutRoot is required.
type DecodeResult ¶
type DecodeResult struct {
Header SnapshotHeader
Counters DecodeCounters
}
DecodeResult is returned by DecodeSnapshot after a successful run. Header carries the snapshot's `last_commit_ts` for the caller's MANIFEST.json.
func DecodeSnapshot ¶
func DecodeSnapshot(r io.Reader, opts DecodeOptions) (DecodeResult, error)
DecodeSnapshot reads a `.fsm`-format stream from r, dispatches every non-tombstone entry to the appropriate adapter encoder rooted under opts.OutRoot, and runs Finalize on each enabled adapter at end of stream.
Caller closes r. DecodeSnapshot does NOT write MANIFEST.json or CHECKSUMS — both are emitted by the cmd/ wrapper from the returned DecodeResult plus its own wall-time / source-path metadata.
type DynamoDBEncoder ¶
type DynamoDBEncoder struct {
// contains filtered or unexported fields
}
DynamoDBEncoder reconstructs the internal DynamoDB keyspace from the decoded dynamodb/ directory tree.
func NewDynamoDBEncoder ¶
func NewDynamoDBEncoder(inRoot string) *DynamoDBEncoder
NewDynamoDBEncoder constructs an encoder rooted at <inRoot>/dynamodb/.
func (*DynamoDBEncoder) Encode ¶
func (e *DynamoDBEncoder) Encode(b *snapshotBuilder) error
Encode walks dynamodb/<table>/ and stages each table's schema + generation-counter records on b. A missing dynamodb/ directory is not an error.
type EncodeInfo ¶
type EncodeInfo struct {
FormatVersion uint32 `json:"format_version"`
EncoderVersion string `json:"encoder_version"`
EncoderKeyFormatVersion uint32 `json:"encoder_key_format_version"`
WallTimeISO string `json:"wall_time_iso"`
InputRoot string `json:"input_root"`
OutputFSMPath string `json:"output_fsm_path"`
OutputFSMSHA256 string `json:"output_fsm_sha256"`
LastCommitTS uint64 `json:"last_commit_ts"`
LastCommitTSOverridden bool `json:"last_commit_ts_overridden"`
ManifestLastCommitTS uint64 `json:"manifest_last_commit_ts"`
ManifestClusterID string `json:"manifest_cluster_id,omitempty"`
AdaptersEnabled []string `json:"adapters_enabled"`
SelfTest EncodeInfoSelfTest `json:"self_test"`
}
EncodeInfo is the on-disk shape of <output>.encode_info.json. Schema pinned by docs/design/2026_06_01_proposed_snapshot_encode_cli.md §"ENCODE_INFO.json". Restore operators rely on this for "encoded for the right cluster, by the right encoder version, against this exact file" confirmation; tag changes are a breaking schema bump.
func NewEncodeInfo ¶
func NewEncodeInfo(now time.Time) EncodeInfo
NewEncodeInfo stamps the current format version + wall time so callers only fill in the encode-specific fields. Mirrors NewPhase0SnapshotManifest. EncoderKeyFormatVersion is the on-disk key format the encoder produces; today it tracks CurrentFormatVersion (no separate key-format version has been declared), which is conservative: future encoder bumps that change MVCC layout MUST bump both manifest and encoder-key formats so restore operators can correlate.
func ReadEncodeInfo ¶
func ReadEncodeInfo(r io.Reader) (EncodeInfo, error)
ReadEncodeInfo parses an ENCODE_INFO.json payload from r. Rejects unknown format_version values with ErrUnsupportedEncodeInfoFormatVersion so a future schema bump surfaces as a typed error rather than a silent field drop. Unknown JSON fields are tolerated to allow forward-compat additions within the same format_version.
type EncodeInfoSelfTest ¶
EncodeInfoSelfTest captures the self-test outcome (parent §"Round-trip self-test"). Ran=false when --self-test was off; Matched is only meaningful when Ran=true.
type EncodeOptions ¶
type EncodeOptions struct {
// InputRoot is the directory tree root produced by the decoder.
// Must contain MANIFEST.json; per-adapter encoders read their
// subtrees (redis/, dynamodb/, s3/, sqs/) directly off this root.
InputRoot string
// Adapters selects which adapter encoders to invoke; disabled
// adapters are skipped without error. Mirrors DecodeOptions.Adapters.
Adapters AdapterSet
// LastCommitTS is the EFFECTIVE T used for both the EKVPBBL1
// header and every key's invTS = ^T. Callers pass manifest.last_commit_ts
// by default and the --last-commit-ts override otherwise.
LastCommitTS uint64
// DynamoDBBundleJSONL is true when the input dump's MANIFEST.json
// has `dynamodb_layout: "jsonl"`. The reverse encoder does not
// support that layout — it would silently skip every
// items/data-*.jsonl file and publish an .fsm with only table
// metadata. Fail-closed via ErrEncodeUnsupportedDynamoDBLayout
// when true (codex P2 v7 #904). When the encoder gains JSONL
// support, this field will switch from a guard to a control.
DynamoDBBundleJSONL bool
// S3IncludeIncompleteUploads is true when the input dump's
// MANIFEST.json has `exclusions.include_incomplete_uploads=true`
// (the producer dumped in-flight multipart uploads under
// _incomplete_uploads/). The reverse encoder cannot rebuild that
// subtree today; fail-closed via ErrEncodeUnsupportedS3IncompleteUploads
// when true AND Adapters.S3 is enabled (codex P2 v21 #904).
S3IncludeIncompleteUploads bool
// S3IncludeOrphans is true when the input dump's MANIFEST.json
// has `exclusions.include_orphans=true` (the producer dumped
// pre-generation orphan blob chunks under _orphans/). The reverse
// encoder cannot rebuild that subtree today; fail-closed via
// ErrEncodeUnsupportedS3Orphans when true AND Adapters.S3 is
// enabled (codex P2 v21 #904).
S3IncludeOrphans bool
// PreserveSQSVisibility is true when the input dump's MANIFEST.json
// has `exclusions.preserve_sqs_visibility=true` (the producer
// preserved in-flight message visibility state — VisibleAtMillis,
// receive count, first receive, receipt token). The reverse
// encoder unconditionally zeros those fields on every restored
// message; fail-closed via ErrEncodeUnsupportedSQSPreserveVisibility
// when true AND Adapters.SQS is enabled (codex P2 v21 #904).
PreserveSQSVisibility bool
// ManifestLastCommitTS is the floor LastCommitTS must not fall
// below. When > 0, EncodeSnapshot fails-closed with
// ErrSelfTestLowerLastCommitTS if LastCommitTS < ManifestLastCommitTS.
// This is defense-in-depth for the CLI's pre-check (which already
// rejects --last-commit-ts T < manifest), and it's the load-bearing
// guard for future in-process library callers (Phase 1 live extractor,
// integration tests) that bypass the CLI: a library caller that
// forgets to compare against the manifest can no longer silently
// publish a low-TS .fsm (codex P2 v2 #904). Callers that genuinely
// have no manifest reference (synthetic test fixtures) leave this
// at 0 to opt out of the check.
ManifestLastCommitTS uint64
// SelfTest enables the round-trip self-test. When true,
// EncodeSnapshot writes the FSM to an on-disk temp file under
// SelfTestDecodeOptions.OutRoot (encode-self-test-fsm-*), streams
// it through DecodeSnapshot, and copies to the caller's io.Writer
// ONLY if the decode survives — i.e. the bytes the encoder
// produced are loadable. When false, the FSM streams straight to
// the writer with no extra buffering. Memory cost in self-test
// mode is O(1) on top of the sort working set (the temp file
// holds the snapshot; only a small streaming buffer is in RAM).
SelfTest bool
// SelfTestDecodeOptions are threaded into the scratch DecodeSnapshot
// call. The CLI reads MANIFEST.json's Exclusions + DynamoDBLayout
// and populates this so the self-test's scratch tree matches what
// the original decoder would have produced.
SelfTestDecodeOptions DecodeOptions
// AllowMissingManifest opts out of the MANIFEST.json presence
// check in validateEncodeOptions. When false (default),
// EncodeSnapshot requires <InputRoot>/MANIFEST.json to exist —
// the contract on InputRoot has always claimed this, but until
// codex P2 v17 #904 the library only checked the path was a
// directory, so a real library caller pointing at the wrong
// directory would silently emit a header-only .fsm (each enabled
// adapter no-ops when its top-level subdir is missing).
//
// Set to true for synthetic test fixtures that don't have a
// MANIFEST.json on disk. Production callers (CLI, Phase 1
// in-process extractor) MUST leave this at false so a bad
// InputRoot surfaces an explicit error rather than a
// silent-empty .fsm.
AllowMissingManifest bool
// contains filtered or unexported fields
}
EncodeOptions configures EncodeSnapshot. Mirrors the decoder's DecodeOptions in shape: required InputRoot, AdapterSet, then per-adapter option flags read back from the input MANIFEST.json by the CLI.
func (*EncodeOptions) SetSelfTestCorruptHookForTest ¶
func (o *EncodeOptions) SetSelfTestCorruptHookForTest(hook func(*os.File))
SetSelfTestCorruptHookForTest installs a same-process hook that fires against the on-disk self-test buffer between WriteTo and the re-decode call. The hook can WriteAt into the file to inject corruption so the subsequent self-test mismatches deterministically.
Production code MUST NOT call this; it is exclusively a test seam for callers OUTSIDE package backup (specifically the cmd/elastickv-snapshot-encode CLI tests, which need to drive a real end-to-end self-test mismatch to verify the stale-.fsm cleanup path — codex P2 v10 #904). In-package tests should set EncodeOptions.corruptBufferForTest directly.
type EncodeResult ¶
type EncodeResult struct {
// Header is what ReadSnapshotWithHeader returned when the encoder
// decoded its own output for the self-test. Header.LastCommitTS
// equals the effective T (uniform-stamping rule per parent doc
// §"MVCC re-encoding").
Header SnapshotHeader
// BytesWritten is the number of bytes written to the caller's
// io.Writer (the SHA256-anchored payload).
BytesWritten int64
// SHA256 of the produced .fsm bytes (raw 32-byte digest; the CLI
// hex-encodes it via encoding/hex when writing ENCODE_INFO.json).
SHA256 [32]byte
// SelfTestRan is true iff opts.SelfTest was true AND the encoder
// ran (i.e. no earlier per-adapter error short-circuited).
SelfTestRan bool
// SelfTestMatched is meaningful only when SelfTestRan; reports
// whether the re-decode produced no diff against InputRoot.
SelfTestMatched bool
// SelfTestMismatchTxt is non-nil when SelfTestRan && !SelfTestMatched.
// The CLI writes it as <output>.mismatch.txt at exit 2.
SelfTestMismatchTxt []byte
// AdaptersEnabled is the canonical fan-out order of adapters that
// were actually invoked; ENCODE_INFO.json embeds this verbatim.
AdaptersEnabled []string
}
EncodeResult is the public return value from EncodeSnapshot. Mirrors the decoder's DecodeResult shape.
func EncodeSnapshot ¶
func EncodeSnapshot(opts EncodeOptions, out io.Writer) (EncodeResult, error)
EncodeSnapshot reads the directory tree at opts.InputRoot, invokes the enabled per-adapter encoders in canonical fan-out order, optionally runs the round-trip self-test, and writes the .fsm bytes to out. The .fsm bytes are NOT returned; they go to out.
When opts.SelfTest=false the FSM streams straight to out with a sha256 tee and no extra buffering. When opts.SelfTest=true the FSM is written to an on-disk temp file (encode-self-test-fsm-*) under opts.SelfTestDecodeOptions.OutRoot, the file is streamed through DecodeSnapshot, and bytes are copied to out ONLY if the decode survives. Memory cost in self-test mode is O(1) on top of the sort working set (gemini high #904 — the earlier *bytes.Buffer version would OOM on multi-GB snapshots).
Self-test failure returns (result, nil) with result.SelfTestMatched == false and result.SelfTestMismatchTxt populated. Callers MUST check result.SelfTestMatched before treating a nil error as success. The CLI relies on this contract to write mismatch.txt + exit 2; library callers should follow the same pattern.
EncodeSnapshot does NOT read MANIFEST.json itself, but it WILL enforce a floor on opts.LastCommitTS when the caller threads the manifest value through opts.ManifestLastCommitTS — a low LastCommitTS returns ErrSelfTestLowerLastCommitTS BEFORE any bytes are written. The CLI's resolveLastCommitTS sets both fields to the reconciled values, and library callers SHOULD do the same. The check is opt-in (ManifestLastCommitTS=0 disables it) so synthetic test fixtures without a manifest reference can still call this directly (codex P2 v2 #904).
type Exclusions ¶
type Exclusions struct {
IncludeIncompleteUploads bool `json:"include_incomplete_uploads"`
IncludeOrphans bool `json:"include_orphans"`
PreserveSQSVisibility bool `json:"preserve_sqs_visibility"`
IncludeSQSSideRecords bool `json:"include_sqs_side_records"`
// RenameS3Collisions records whether the producer ran with
// --rename-collisions (DecodeOptions.RenameS3Collisions), so the
// M6 encoder's self-test can thread the same option back through
// DecodeSnapshot. Older manifests that omit this field decode as
// false (no-rename), matching the decoder default. Intentionally
// NOT added to exclusionsRequiredFields below so legacy manifests
// continue to validate (#896 v5 — claude review on M6 design).
RenameS3Collisions bool `json:"rename_s3_collisions,omitempty"`
}
Exclusions records the producer-side flags that affected which records were emitted. Restore tools log these so an operator can correlate a surprising dump shape with the producer invocation.
type KeymapReader ¶
type KeymapReader struct {
// contains filtered or unexported fields
}
KeymapReader iterates JSONL records line-by-line. Memory footprint is bounded by keymapBufSizeReader regardless of file size.
func NewKeymapReader ¶
func NewKeymapReader(r io.Reader) *KeymapReader
NewKeymapReader wraps r so the caller can iterate records via Next.
func (*KeymapReader) Next ¶
func (r *KeymapReader) Next() (KeymapRecord, bool, error)
Next decodes the next record. It returns (rec, true, nil) on success, (zero, false, nil) at end of stream, and (zero, false, err) on parse failure or I/O error. Once an error is returned the reader is sticky: subsequent calls return the same error.
The base64-encoded `original` field is validated at parse time rather than lazily: a malformed dump must surface on the first read of the affected line, not propagate silently until a much later rec.Original() call. Same error class either way.
type KeymapRecord ¶
type KeymapRecord struct {
// Encoded is the filename segment as it appears in the dump tree.
Encoded string `json:"encoded"`
// OriginalB64 is base64url-no-padding of the original key bytes.
OriginalB64 string `json:"original"`
// Kind classifies why this record exists; see Kind* constants.
Kind string `json:"kind"`
}
KeymapRecord is a single mapping from encoded filename component back to the original key bytes. Original bytes are arbitrary (binary safe), so they are encoded as base64url-no-padding for transport in JSON.
func (KeymapRecord) Original ¶
func (r KeymapRecord) Original() ([]byte, error)
Original returns the decoded original key bytes from r.OriginalB64.
type KeymapWriter ¶
type KeymapWriter struct {
// contains filtered or unexported fields
}
KeymapWriter appends records to a KEYMAP.jsonl stream. Concurrent calls to Write are serialised through the underlying bufio.Writer; the caller is expected to use a single writer per scope.
func NewKeymapWriter ¶
func NewKeymapWriter(w io.Writer) *KeymapWriter
NewKeymapWriter returns a writer that appends JSONL records to w. Close must be called to flush.
func (*KeymapWriter) Close ¶
func (w *KeymapWriter) Close() error
Close flushes any buffered records to the underlying writer.
func (*KeymapWriter) Count ¶
func (w *KeymapWriter) Count() int
Count returns the number of records written so far. Useful for the "omit empty KEYMAP file" decision after the dump completes.
func (*KeymapWriter) Write ¶
func (w *KeymapWriter) Write(rec KeymapRecord) error
Write appends one KeymapRecord. The record is JSON-serialised with a trailing newline (json.Encoder behavior), giving the JSONL contract.
func (*KeymapWriter) WriteOriginal ¶
func (w *KeymapWriter) WriteOriginal(encoded string, original []byte, kind string) error
WriteOriginal is a convenience wrapper that base64-encodes raw original bytes for the caller.
type Live ¶
type Live struct {
// ReadTS is the pinned read_ts at which BackupScanner traversed the
// keyspace.
ReadTS uint64 `json:"read_ts"`
// PinTokenSHA256 is the hex SHA-256 of the pin_token issued by
// BeginBackup. Stored as a hash rather than the raw token so the
// manifest carries no auth-sensitive material.
PinTokenSHA256 string `json:"pin_token_sha256,omitempty"`
}
Live records the cluster-wide pinning information that produced a Phase 1 dump. Phase 0 dumps leave this nil.
type Manifest ¶
type Manifest struct {
FormatVersion uint32 `json:"format_version"`
Phase string `json:"phase"`
ElastickvVersion string `json:"elastickv_version,omitempty"`
ClusterID string `json:"cluster_id,omitempty"`
SnapshotIndex uint64 `json:"snapshot_index,omitempty"`
LastCommitTS uint64 `json:"last_commit_ts,omitempty"`
WallTimeISO string `json:"wall_time_iso"`
Source *Source `json:"source,omitempty"`
Live *Live `json:"live,omitempty"`
// Adapters and Exclusions are pointer types so ReadManifest can
// distinguish "section omitted entirely" (a corrupted or
// truncated dump that should fail validation) from "section
// present but populated with default values" (legitimate
// scope-everything-excluded). Codex P2 #146 (round 3).
Adapters *Adapters `json:"adapters"`
Exclusions *Exclusions `json:"exclusions"`
ChecksumAlgorithm string `json:"checksum_algorithm"`
ChecksumFormat string `json:"checksum_format"`
EncodedFilenameCharset string `json:"encoded_filename_charset"`
KeySegmentMaxBytes uint32 `json:"key_segment_max_bytes"`
S3MetaSuffix string `json:"s3_meta_suffix"`
S3CollisionStrategy string `json:"s3_collision_strategy"`
DynamoDBLayout string `json:"dynamodb_layout"`
}
Manifest is the on-disk MANIFEST.json structure. Field tags match the spec in docs/design/2026_04_29_proposed_snapshot_logical_decoder.md.
func NewPhase0SnapshotManifest ¶
NewPhase0SnapshotManifest seeds a manifest with the Phase 0a defaults. Callers fill in scope (Adapters), Source/wall time and exclusions before passing it to WriteManifest. Adapters and Exclusions are seeded to non-nil zero values so the resulting manifest passes the "section-present" validation; callers populating individual scopes reach in via the now-non-nil pointer.
type RedisDB ¶
type RedisDB struct {
// contains filtered or unexported fields
}
RedisDB encodes one logical Redis database (`redis/db_<n>/`). All operations are scoped to its outRoot; the caller wires per-database instances when the producer supports multiple databases (today only db_0 is meaningful, but the encoder is wired to take any non-negative index so a future multi-db dump does not silently collide on db_0).
Lifecycle:
r := NewRedisDB(outRoot, dbIndex) for each snapshot record matching a redis prefix: r.Handle*(...) r.Finalize()
Handle* methods are NOT goroutine-safe; the decoder pipeline is inherently sequential per scope, so a mutex would only add cost.
func NewRedisDB ¶
NewRedisDB constructs a RedisDB rooted at <outRoot>/redis/db_<n>/. dbIndex selects <n>; today the producer always passes 0, but accepting the index as a parameter prevents a future multi-db dump from silently colliding on db_0.
func (*RedisDB) Finalize ¶
Finalize flushes all open sidecar writers and emits warnings for any pending TTL records whose user key was never claimed by the wide-column encoders. Call exactly once after every snapshot record has been dispatched.
func (*RedisDB) HandleHLL ¶
HandleHLL processes one !redis|hll|<userKey> record. The value is the raw HLL sketch bytes, written byte-for-byte to hll/<encoded>.bin. TTL for HLL keys lives in !redis|ttl|<userKey> and is consumed by HandleTTL.
func (*RedisDB) HandleHashField ¶
HandleHashField processes one !hs|fld|<userKey><fieldName> record. The value is the raw field-value bytes (binary-safe).
Note: Redis hash field names are binary-safe and may legitimately be empty — `HSET k "" v` is a valid command and the live store emits a key shaped exactly `!hs|fld|<len><userKey>` with no trailing field bytes. We deliberately do NOT reject zero-length field names here so backup decoding succeeds on real data created via HSET with empty names. Codex P1 round 13 (PR #725).
func (*RedisDB) HandleHashMeta ¶
HandleHashMeta processes one !hs|meta|<userKey> record. The value is the 8-byte BE field count. We park the state for finalize-time flush and register the user key so a later !redis|ttl|<userKey> record routes back to this hash state.
Delta keys (!hs|meta|d|...) share the !hs|meta| string prefix, so a snapshot dispatcher that routes by "starts with RedisHashMetaPrefix" will land delta records here too. Phase 0a's output (an array of observed fields) doesn't need to apply the delta arithmetic — the !hs|fld|... records are the source of truth — so we silently skip delta keys instead of returning ErrRedisInvalidHashKey. Codex P1 round 14 (PR #725 #13).
func (*RedisDB) HandleListClaim ¶
HandleListClaim accepts and discards one !lst|claim|... record. See the file-level comment: the live read path does not consult claims; POP'd item keys are deleted in the same OCC commit. Restored lists therefore reflect the post-POP state without any claim replay.
func (*RedisDB) HandleListItem ¶
HandleListItem processes one !lst|itm|<userKey><sortable_seq(8)> record. The value is the raw item bytes (binary-safe). The seq is the trailing 8-byte sortable-int64 — sortable encoding flips the sign bit so a forward byte-ordered scan yields ascending int64, which matches the live store's left-to-right read order.
func (*RedisDB) HandleListMeta ¶
HandleListMeta processes one !lst|meta|<userKey> record. The value is the 24-byte (Head, Tail, Len) layout. We park the declared length so flushLists can warn on a mismatch with the observed item count and register the user key so a later !redis|ttl|<userKey> record routes back to this list state.
!lst|meta|d|<userKey>... delta keys share the !lst|meta| string prefix, so a snapshot dispatcher that routes by "starts with ListMetaPrefix" lands delta records here too. The hash encoder solved the analogous problem (Codex P1 round 14 PR #725) by silently skipping the delta family; we mirror that policy because !lst|itm| records are the source of truth for the restored list contents and the delta arithmetic does not need to be replayed at backup time.
func (*RedisDB) HandleListMetaDelta ¶
HandleListMetaDelta accepts and discards one !lst|meta|d|... record. See HandleListMeta's docstring for the rationale; !lst|itm| is the source of truth at backup time.
func (*RedisDB) HandleSetMember ¶
HandleSetMember processes one !st|mem|<len><userKey><member> record. The value is empty by design (Redis sets store the member bytes in the key, not the value), so HandleSetMember discards the value argument; the member bytes are extracted from the key's trailing segment.
func (*RedisDB) HandleSetMeta ¶
HandleSetMeta processes one !st|meta|<len><userKey> record. The value is the 8-byte BE member count. We park the declared length so flushSets can warn on a mismatch with the observed member count and register the user key so a later !redis|ttl|<userKey> record routes back to this set state.
!st|meta|d|... delta keys share the !st|meta| string prefix, so a snapshot dispatcher that routes by "starts with RedisSetMetaPrefix" lands delta records here too. The hash encoder solved the analogous problem (Codex P1 round 14 PR #725) by silently skipping the delta family; we mirror that policy because !st|mem| records are the source of truth for the restored set contents.
func (*RedisDB) HandleSetMetaDelta ¶
HandleSetMetaDelta accepts and discards one !st|meta|d|... record. See HandleSetMeta's docstring for the rationale; !st|mem| is the source of truth at backup time.
func (*RedisDB) HandleStreamEntry ¶
HandleStreamEntry processes one !stream|entry|<userKey><ms><seq> record. The ID is recovered from the trailing 16 bytes of the key; the value is the magic-prefixed `pb.RedisStreamEntry` protobuf carrying the entry's interleaved (name, value) field list.
func (*RedisDB) HandleStreamMeta ¶
HandleStreamMeta processes one !stream|meta|<userKey> record. Value layout: Length(8) || LastMs(8) || LastSeq(8). The encoder uses the meta's last_ms / last_seq verbatim in the JSONL _meta terminator so a restorer can replay them into the same XADD '*' monotonicity window. Length mismatches against the observed entry count surface as `redis_stream_length_mismatch` at flush time.
func (*RedisDB) HandleString ¶
HandleString processes one !redis|str|<userKey> record. The value is the raw stored bytes; HandleString peels the magic-prefix TTL header (if present) and writes the user-visible value to strings/<encoded>.bin and the TTL — if any — to strings_ttl.jsonl.
func (*RedisDB) HandleTTL ¶
HandleTTL processes one !redis|ttl|<userKey> record. Routing depends on what the encoder has previously recorded for the user key. There are two ordering regimes the snapshot stream presents:
- Prefix sorts BEFORE !redis|ttl| in encoded-key order (!hs|, !lst|, !redis|str|, !redis|hll|). The typed record arrives FIRST, kindByKey is already set when HandleTTL fires, and we route directly to the per-type sidecar / inline field.
- Prefix sorts AFTER !redis|ttl| (!st|, !stream|, !zs|, because `r` < `s`/`s`/`z`). The TTL arrives FIRST and kindByKey is still redisKindUnknown. We park the expiry in pendingTTL and let each wide-column state-init function (setState / zsetState / streamState) drain it when the user key finally surfaces as a typed record. Codex P1 finding on PR #790.
Routing:
- redisKindHLL -> hll_ttl.jsonl (case 1)
- redisKindString -> strings_ttl.jsonl (case 1; legacy strings whose TTL lives in !redis|ttl| rather than the inline header)
- redisKindHash/List/Set/ZSet/Stream -> inlined into the per-key JSON (case 1 for hash/list, case 2 for set/zset/stream where the state-init already drained from pendingTTL before HandleTTL would even be called the second time)
- redisKindUnknown -> bufferPendingTTL. Finalize counts truly unmatched entries (key never registered as a typed record).
func (*RedisDB) HandleZSetLegacyBlob ¶
HandleZSetLegacyBlob processes one `!redis|zset|<userKey>` record. This is the consolidated single-key layout the live store still writes for non-empty persisted zsets (see RedisZSetLegacyBlobPrefix docstring). The encoded value is a magic-prefixed `pb.RedisZSetValue` carrying every (member, score) pair.
Decoded entries land in the same per-key state HandleZSetMember would have produced, so the per-zset JSON output is identical regardless of which layout the live store used. NaN scores fail closed at intake, matching HandleZSetMember's contract.
Scan-order context (per Pebble lexicographic order):
`!redis|zset|<k>` (0x72='r') sorts BEFORE `!zs|...` (0x7A='z'), so a mid-migration store with both formats hits this handler first; later wide-column records merge into the same state. Duplicate HandleZSetMember calls follow the latest-wins policy.
`!redis|zset|<k>` (0x72='r' < 0x74='t' for the byte after the shared `!redis|` prefix would be the wrong direction; the actual comparison is byte 'z'=0x7A > 't'=0x74) — so `!redis|ttl|<k>` sorts BEFORE `!redis|zset|<k>`. The real production order for a TTL'd legacy zset is therefore:
!redis|ttl|k → HandleTTL (redisKindUnknown) → pendingTTL["k"] = ms !redis|zset|k → HandleZSetLegacyBlob → zsetState() → claimPendingTTL() drains pendingTTL into st.expireAtMs
The HandleTTL redisKindZSet fast-path branch only fires for the reverse order (custom dispatcher or replay), where a wide-column row registered the key before the TTL arrived.
func (*RedisDB) HandleZSetMember ¶
HandleZSetMember processes one !zs|mem|<len><userKey><member> record. The value is the 8-byte IEEE 754 big-endian score. The member bytes live in the key (binary-safe).
NaN scores are rejected. Redis's ZADD command itself rejects NaN at the wire level, so a NaN at backup time indicates either a corrupted store or a write path that bypassed ZADD validation — either way, emitting it as `score: null` or letting json.Marshal fail mid-flush would corrupt the dump silently. Fail closed here so the problem surfaces at the source record.
func (*RedisDB) HandleZSetMeta ¶
HandleZSetMeta processes one !zs|meta|<len><userKey> record. The value is the 8-byte BE member count. We park the declared length so flushZSets can warn on a mismatch with the observed member count and register the user key so a later !redis|ttl|<userKey> record routes back to this zset state.
!zs|meta|d|... delta keys share the !zs|meta| string prefix, so a snapshot dispatcher that routes by "starts with RedisZSetMetaPrefix" lands delta records here too. We mirror the hash/set encoder policy (PRs #725, #758) and silently skip the delta family because !zs|mem| records are the source of truth for restored zset contents.
func (*RedisDB) HandleZSetMetaDelta ¶
HandleZSetMetaDelta accepts and discards one !zs|meta|d|... record. See HandleZSetMeta's docstring for the rationale; !zs|mem| is the source of truth at backup time.
func (*RedisDB) HandleZSetScore ¶
HandleZSetScore accepts and discards one !zs|scr|... record. The score index is a live-side secondary index for ZRANGEBYSCORE; the authoritative member→score mapping lives in !zs|mem|. Snapshot dispatchers should still route this prefix here (rather than to the orphan-record warning sink) so a future audit that greps for "every !zs| prefix has a handler" finds one.
func (*RedisDB) WithPendingTTLByteCap ¶
WithPendingTTLByteCap overrides the default byte budget for the pendingTTL buffer. A value of 0 disables buffering — every unknown-kind TTL becomes an immediate orphan (matches the pre-pendingTTL behavior). Negative inputs are coerced to 0. Returns the receiver so it can be chained with other With* setters.
The budget is in BYTES (sum of `len(userKey) + pendingTTLEntryOverheadBytes` over every buffered entry), NOT entry count, because Redis user keys can be up to 1 MiB each and an entry-count cap of N would still permit ~N MiB of accumulated key bytes. Codex P1 finding on PR #790 round 6.
type RedisEncoder ¶
type RedisEncoder struct {
// contains filtered or unexported fields
}
RedisEncoder reconstructs the internal Redis keyspace for one logical database (redis/db_<n>/) from its decoded directory tree.
func NewRedisEncoder ¶
func NewRedisEncoder(inRoot string, dbIndex int) *RedisEncoder
NewRedisEncoder constructs an encoder rooted at <inRoot>/redis/db_<n>/. Negative dbIndex is coerced to 0 (mirrors NewRedisDB).
func (*RedisEncoder) Encode ¶
func (e *RedisEncoder) Encode(b *snapshotBuilder) error
Encode walks the db subtree and stages every reconstructed record on b. A missing db directory is not an error — there is simply nothing to encode for that database.
type RoundTripEntry ¶
RoundTripEntry is one live record recovered by DecodeLiveEntries. The byte slices are owned by the caller (cloned out of the reader's scratch buffer).
type S3Encoder ¶
type S3Encoder struct {
// contains filtered or unexported fields
}
S3Encoder emits per-bucket _bucket.json + assembled object bodies + .elastickv-meta.json sidecars + KEYMAP.jsonl, per the Phase 0 design (docs/design/2026_04_29_proposed_snapshot_logical_decoder.md).
Lifecycle: Handle* per record, Finalize once. Records arrive in snapshot lex order:
!s3|blob|* (b) -- written to a per-(bucket,object)
scratch chunk pool
!s3|bucket|gen|* (bg) -- ignored (operational counter)
!s3|bucket|meta|* (bm) -- buffered until Finalize
!s3|gc|upload|* (g) -- ignored (in-flight cleanup state)
!s3|obj|head|* (o) -- buffered until Finalize
!s3|upload|meta|* (um) -- excluded by default; opt in via
WithIncludeIncompleteUploads
!s3|upload|part|* (up) -- same
!s3route|* (r) -- ignored (control plane)
Object body assembly happens at Finalize: for each object manifest, the encoder enumerates parts in PartNo order and chunks in ChunkNo order, concatenates the matching blob chunks (which were pre-spilled to scratch files as they arrived), and writes the assembled body to <outRoot>/s3/<bucket>/<object> with the metadata sidecar at <object>.elastickv-meta.json.
Memory: O(num_objects + num_buckets) buffered metadata. Per-blob payloads are streamed to disk as they arrive — never held in memory.
func NewS3Encoder ¶
NewS3Encoder constructs an encoder rooted at <outRoot>/s3/. Blob chunks are spilled to <scratchRoot>/s3/ as they arrive and assembled into final object bodies at Finalize. The caller owns scratchRoot; it must exist and be writable. A common choice is os.TempDir() under the dump runner — the encoder removes its scratch subtree on Close().
func (*S3Encoder) Finalize ¶
Finalize assembles every object body, writes its sidecar, flushes per-bucket _bucket.json, and removes the scratch tree.
func (*S3Encoder) HandleBlob ¶
HandleBlob spills a !s3|blob| record to a per-chunk scratch file and registers it under the (bucket, object, gen, uploadID, partNo, chunkNo, partVersion) routing key. EncodeSegment percent-encodes `/` so a multi-segment object key like `../../tmp/pwn` collapses into one filename, but a literal `..` (or `.`) survives unchanged because both `.` chars are RFC3986-unreserved. Without explicit validation, a crafted bucket+object pair like `bucket="..", object=".."` would resolve to filepath.Join(scratchRoot, "..", "..") = the parent of scratchRoot, letting writeFileAtomic land outside the decoder's controlled directory before safeJoinUnderRoot ever runs at output time. Codex P1 round 11.
func (*S3Encoder) HandleBucketMeta ¶
HandleBucketMeta decodes and parks a !s3|bucket|meta record.
func (*S3Encoder) HandleIgnored ¶
HandleIgnored is a no-op for prefixes the encoder explicitly drops (!s3|bucket|gen|, !s3|gc|upload|, !s3route|). Exposed so the master pipeline can dispatch all !s3|* prefixes uniformly without special-casing.
func (*S3Encoder) HandleIncompleteUpload ¶
HandleIncompleteUpload routes !s3|upload|meta|/!s3|upload|part| records to <bucket>/_incomplete_uploads/records.jsonl when the include flag is on; otherwise drops them.
The output writer is opened once per bucket on the first record and cached on s3BucketState. Re-opening per record (the prior implementation) used create/truncate semantics, so each call wiped the file and only the last record survived — Codex P2 #318 / Gemini HIGH+MEDIUM #318.
func (*S3Encoder) HandleObjectManifest ¶
HandleObjectManifest decodes and parks an !s3|obj|head record. The manifest's UploadID and Parts list drive the Finalize-time blob assembly.
func (*S3Encoder) WithIncludeIncompleteUploads ¶
WithIncludeIncompleteUploads routes !s3|upload|meta|/!s3|upload|part| records to s3/<bucket>/_incomplete_uploads/. Default is to skip them.
func (*S3Encoder) WithIncludeOrphans ¶
WithIncludeOrphans surfaces blob chunks that have no matching manifest under s3/<bucket>/_orphans/. Default skips them.
func (*S3Encoder) WithRenameCollisions ¶
WithRenameCollisions opts in to renaming user objects that collide with the reserved S3MetaSuffixReserved suffix. Default rejects.
type S3RecordEncoder ¶
type S3RecordEncoder struct {
// contains filtered or unexported fields
}
S3RecordEncoder reconstructs the internal S3 keyspace from the decoded s3/ directory tree. (Named distinctly from the decoder's S3Encoder in s3.go, which despite its name turns records INTO the dump tree.)
func NewS3RecordEncoder ¶
func NewS3RecordEncoder(inRoot string) *S3RecordEncoder
NewS3RecordEncoder constructs an encoder rooted at <inRoot>/s3/.
func (*S3RecordEncoder) Encode ¶
func (e *S3RecordEncoder) Encode(b *snapshotBuilder) error
Encode walks s3/<bucket>/ and stages each bucket's meta + generation counter records on b. A missing s3/ directory is not an error.
type SQSEncoder ¶
type SQSEncoder struct {
// contains filtered or unexported fields
}
SQSEncoder encodes the SQS prefix family into the per-queue layout described in docs/design/2026_04_29_proposed_snapshot_logical_decoder.md (Phase 0): one `_queue.json` per queue and one ordered `messages.jsonl`.
Lifecycle: per-snapshot pass calls Handle* for each record, then exactly one Finalize. Side-records (vis/byage/dedup/group/tombstone) are excluded by default; opt in with WithIncludeSideRecords. Visibility state on emitted messages is zeroed by default; opt in to preserve with WithPreserveVisibility.
The encoder buffers messages per queue in memory and sorts them at Finalize-time by (SendTimestampMillis, SequenceNumber, MessageID). This is acceptable for typical operational queues; queues with hundreds of millions of messages will need a future stream-and-merge variant.
func NewSQSEncoder ¶
func NewSQSEncoder(outRoot string) *SQSEncoder
NewSQSEncoder constructs an encoder rooted at <outRoot>/sqs/.
func (*SQSEncoder) Finalize ¶
func (s *SQSEncoder) Finalize() error
Finalize flushes every queue's _queue.json and messages.jsonl. Queues with buffered messages but no meta record (orphans) emit a warning and have their messages dropped — restoring orphan messages without a queue config would silently create a queue with default settings, which is rarely what the operator wants. However, if --include-sqs-side-records is on and this orphan queue has buffered side records (vis/byage/dedup/group/tombstone), those are still flushed under the encoded-prefix directory: the most common reason for a missing meta is a DeleteQueue that left tombstones, and dropping exactly those records is the opposite of what the operator asked for. Codex P2 round 8.
func (*SQSEncoder) HandleMessageData ¶
func (s *SQSEncoder) HandleMessageData(key, value []byte) error
HandleMessageData processes one !sqs|msg|data|<encQueue><gen><encMsgID> record. The encoded queue segment is parsed out of the key and used as the per-queue routing key; the message is buffered until Finalize so it can be sorted and emitted in send-order.
func (*SQSEncoder) HandleQueueGen ¶
func (s *SQSEncoder) HandleQueueGen(key, value []byte) error
HandleQueueGen processes one !sqs|queue|gen|<encoded> record. The value is a base-10 decimal string holding the queue's current generation (mirrors adapter/sqs_catalog.go's CreateQueue Put: the live cluster writes strconv.FormatUint(gen, 10)). Capturing activeGen lets flushQueue drop messages tagged with older generations — those are residual rows left by PurgeQueue / DeleteQueue that the reaper has not yet cleaned, and emitting them to messages.jsonl would resurrect purged messages on restore. Codex P1 round 10.
func (*SQSEncoder) HandleQueueMeta ¶
func (s *SQSEncoder) HandleQueueMeta(key, value []byte) error
HandleQueueMeta processes one !sqs|queue|meta|<encoded> record. Strips the magic prefix, decodes the JSON, projects to the dump-format sqsQueueMetaPublic, and parks it on the per-queue state.
func (*SQSEncoder) HandleSideRecord ¶
func (s *SQSEncoder) HandleSideRecord(prefix string, key, value []byte) error
HandleSideRecord buffers (vis|byage|dedup|group|tombstone) records when includeSideRecords is on; otherwise drops them silently (this is the documented Phase 0 default).
func (*SQSEncoder) WithIncludeSideRecords ¶
func (s *SQSEncoder) WithIncludeSideRecords(on bool) *SQSEncoder
WithIncludeSideRecords routes vis/byage/dedup/group/tombstone records into _internals/. Default is to exclude them — they are derivable from the queue config + message records and replaying them on restore can resurrect aborted state.
func (*SQSEncoder) WithPreserveVisibility ¶
func (s *SQSEncoder) WithPreserveVisibility(on bool) *SQSEncoder
WithPreserveVisibility passes the visibility-state fields (visible_at_millis, current_receipt_token, receive_count, first_receive_millis) through to the dump. Default is to zero them so the restored queue starts with every message visible.
func (*SQSEncoder) WithWarnSink ¶
func (s *SQSEncoder) WithWarnSink(fn func(event string, fields ...any)) *SQSEncoder
WithWarnSink wires a structured warning hook (same shape as RedisDB.WithWarnSink). Used for orphan messages and unresolvable side records.
type SQSRecordEncoder ¶
type SQSRecordEncoder struct {
// contains filtered or unexported fields
}
SQSRecordEncoder reconstructs the internal SQS keyspace from the decoded sqs/ directory tree. (Named distinctly from the decoder's SQSEncoder in sqs.go.)
func NewSQSRecordEncoder ¶
func NewSQSRecordEncoder(inRoot string) *SQSRecordEncoder
NewSQSRecordEncoder constructs an encoder rooted at <inRoot>/sqs/.
func (*SQSRecordEncoder) Encode ¶
func (e *SQSRecordEncoder) Encode(b *snapshotBuilder) error
Encode walks sqs/<queue>/ and stages each queue's meta + counters + message records. A missing sqs/ directory is not an error.
type SnapshotEntry ¶
type SnapshotEntry struct {
UserKey []byte
UserValue []byte
CommitTS uint64
ExpireAt uint64
Tombstone bool
}
SnapshotEntry is one decoded entry emitted by ReadSnapshot's callback. Fields are the user-visible key / value bytes plus the MVCC metadata the decoder peeled off (commit timestamp, expiry, tombstone marker). Slices are owned by the snapshot reader's scratch buffer and may be overwritten when the callback returns — callers that need to retain bytes across iterations must `bytes.Clone` them.
type SnapshotHeader ¶
type SnapshotHeader struct {
LastCommitTS uint64
}
SnapshotHeader is the decoded preamble returned to the caller before iteration begins so the caller can record the snapshot's commit-time horizon in its MANIFEST.json (per design §380-422).
func ReadSnapshotWithHeader ¶
func ReadSnapshotWithHeader(r io.Reader, fn func(SnapshotHeader, SnapshotEntry) error) (SnapshotHeader, error)
ReadSnapshotWithHeader is the variant of ReadSnapshot that returns the decoded EKVPBBL1 header even when the snapshot contained zero entries (in which case fn is never called). Phase 0a's DecodeSnapshot needs this so MANIFEST.json's `last_commit_ts` field is populated for an empty-snapshot dump — the per-entry callback's header argument cannot fire on a header-only file.
type Source ¶
type Source struct {
// FSMPath is the absolute or relative path of the .fsm file the
// decoder consumed.
FSMPath string `json:"fsm_path"`
// FSMCRC32C is the CRC32C value the decoder verified against the
// .fsm file's footer (lowercase hex).
FSMCRC32C string `json:"fsm_crc32c,omitempty"`
}
Source records where a Phase 0 dump came from. Phase 1 dumps leave Source nil and populate Live instead.
Source Files
¶
- checksums.go
- decode.go
- disk_full_unix.go
- dynamodb.go
- encode.go
- encode_dynamodb.go
- encode_dynamodb_gsi.go
- encode_dynamodb_items.go
- encode_dynamodb_numeric.go
- encode_info.go
- encode_redis.go
- encode_redis_coll.go
- encode_s3.go
- encode_s3_objects.go
- encode_snapshot.go
- encode_sqs.go
- encode_sqs_side.go
- filename.go
- keymap.go
- manifest.go
- open_nofollow_unix.go
- open_sidecar_export.go
- redis_hash.go
- redis_list.go
- redis_set.go
- redis_stream.go
- redis_string.go
- redis_zset.go
- s3.go
- snapshot_reader.go
- source.go
- sqs.go