Documentation
Overview ¶
Package pebble implements the cross-engine compaction primitive for the v3 Pebble-backed storage engine.
The compactor takes a `base` engine and one or more `applied` engines (each representing one sync_run's worth of records) and atomically merges each `applied` engine's data into `base` using pebble.DB.IngestAndExcise:
- For each applied engine, iterate its records in the source engine's key order and accumulate them into an SST file on disk.
- Call base.DB.IngestAndExcise(paths, exciseSpan) with the new SST and the key range covering the applied sync_id. Pebble atomically: (a) excises every key in [exciseSpan.Start, exciseSpan.End) from base — old sync_id rows go away in one shot. (b) ingests the SST as a new L6 file (or flushable, depending on Pebble's choice) under that range.
The net effect is a byte-level merge: zero proto encode/decode per record, zero LSM compaction churn from a record-by-record Put loop.
Bound: each Compact call replaces exactly one sync_id range in the destination. Multi-sync rollup is a higher-level loop that calls Compact once per sync_id, in sequence.
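The excise-then-ingest behaviour described above can be modelled on a plain sorted slice. This is a minimal sketch of the semantics only, not Pebble's implementation: the `kv` type, the `ingestAndExcise` helper, and the `sync-a/...` key layout are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// kv is one record in the toy store (hypothetical type).
type kv struct{ Key, Value string }

// ingestAndExcise models the documented semantics on a slice: drop every
// key in [start, end), then add the replacement records in key order.
// It is an illustration only and does not call Pebble.
func ingestAndExcise(base []kv, start, end string, incoming []kv) []kv {
	out := make([]kv, 0, len(base)+len(incoming))
	for _, r := range base {
		if r.Key >= start && r.Key < end {
			continue // excised: old rows for this sync_id go away
		}
		out = append(out, r)
	}
	out = append(out, incoming...)
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}

func main() {
	base := []kv{
		{"sync-a/grant/1", "old"},
		{"sync-a/grant/2", "old"},
		{"sync-b/grant/1", "keep"},
	}
	incoming := []kv{{"sync-a/grant/1", "new"}}

	// Assumed key layout: the span covers every key with the "sync-a/"
	// prefix, because '0' is the byte immediately after '/'.
	merged := ingestAndExcise(base, "sync-a/", "sync-a0", incoming)
	for _, r := range merged {
		fmt.Println(r.Key, r.Value)
	}
}
```

Note that the old `sync-a/grant/2` row disappears even though the incoming data has no replacement for it: the destination ends up with exactly the incoming view of the excised range.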
Index ¶
Constants ¶
This section is empty.
Variables ¶
var ErrEmptySync = errors.New("synccompactor/pebble: source has no records under the given sync_id")
ErrEmptySync is returned when Compact is called with a sync_id that has no records in the source engine. The caller can treat this as a no-op or as an error depending on context.
Functions ¶
func MergeInto ¶ added in v0.12.5
func MergeInto(ctx context.Context, dest *enginepkg.Engine, sources []SourceSync, destSyncID string) error
MergeInto folds every source's primary records into dest under destSyncID, producing a single merged sync. It is the native-Pebble equivalent of the SQLite ATTACH union: across all inputs, the newest record per logical key (by discovered_at, ties keep the incumbent) survives, and dest's derived indexes are rebuilt from the survivors.
destSyncID must already exist in dest (created by StartNewSync) and be empty. MergeInto only adds records and never excises, so it assumes there is no pre-existing data under destSyncID.
Only the four primary record buckets are copied: resource_types, resources, entitlements, grants. Assets are intentionally NOT copied — this matches the SQLite compaction path, which folds only those four tables and drops assets from the compacted sync; copying assets here would give Pebble compacted syncs asset access the SQLite path does not have. Index keyspaces are not copied verbatim either; the per-record write path rebuilds them from the surviving primaries.
Sources are applied in the given order. The dedup keeps the record with the strictly-greater discovered_at; on an equal discovered_at the already-written (earlier source) record is kept. Callers pass sources in the same order the SQLite fold applies them so the tie winner is identical across engines.
Types ¶
type Compactor ¶
type Compactor struct {
// contains filtered or unexported fields
}
Compactor merges sync_run data between two Pebble engines using IngestAndExcise. A Compactor can be reused across many Compact calls, but it is safe for sequential use only and must not be called concurrently.
func NewCompactor ¶
NewCompactor builds a compactor that writes its merge into base. The tmpDir is where intermediate SST files are written; it must be on the same filesystem as the engine's data directory so Pebble can hard-link (a different FS would force a copy and break atomicity). If tmpDir is empty, os.TempDir is used (acceptable for tests but not production).
func (*Compactor) Compact ¶
Compact merges all records belonging to syncID in `source` into the base engine. Any pre-existing data in base under that syncID is excised before the new data is ingested — base ends up with exactly source's view of that sync.
Atomicity: per-bucket atomic, NOT whole-Compact atomic. Each IngestAndExcise call (one per record-type bucket: grants, by_entitlement index, by_principal index) is atomic from base's perspective — a concurrent reader sees the old-or-new state of that bucket, never a mixture. However, the multi-bucket loop is NOT transactional as a whole: a crash or hard cancellation mid-loop leaves base with new data in some buckets and old data in others (and the same for the DeleteRange-only path used for empty buckets). Recovery is "re-run Compact for the same syncID" — every step is idempotent (excise + ingest the same SSTs again converges to the source's view).
The caller is responsible for quiescing writes to `source` for the duration of the call: an active syncer would invalidate the SST while it is being built.
type SourceSync ¶ added in v0.12.5
SourceSync names one input to a merge: its open Pebble engine and the sync id whose records should be folded into the destination.