Documentation
Overview
Package snapshot defines the SnapshotSource contract used by Murmur's Bootstrap execution mode. SnapshotSources scan a source-of-truth datastore (Mongo, DynamoDB, JDBC, S3 dump) and emit synthetic events into the same monoid-combine logic the streaming runtime uses, populating initial state when the live event stream lacks full history (typical for CDC sources).
Bootstrap mirrors Debezium's "incremental snapshot" pattern: it is chunked, watermarked, resumable, and able to run in parallel with live capture. Concrete implementations live in subpackages (snapshot/mongo, snapshot/dynamodb).
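As a rough illustration of why snapshot records and live events can share one code path, the sketch below folds both through a single combine function and skips event IDs it has already seen. The `Event` type, `Combine`, and `Apply` are hypothetical names chosen for this example and are not part of Murmur's API; integer addition stands in for whatever monoid a real pipeline uses.

```go
package main

import "fmt"

// Event is a hypothetical record shape: a stable ID plus a value to fold into state.
type Event struct {
	ID    string // derived from the document's primary key for snapshot records
	Key   string
	Count int64
}

// Combine is the monoid operation assumed for this sketch (addition, identity 0).
func Combine(a, b int64) int64 { return a + b }

// Apply folds events into state and ignores IDs that were already applied,
// so a replayed snapshot chunk does not change the result.
func Apply(state map[string]int64, seen map[string]bool, events []Event) {
	for _, e := range events {
		if seen[e.ID] {
			continue
		}
		seen[e.ID] = true
		state[e.Key] = Combine(state[e.Key], e.Count)
	}
}

func main() {
	state := map[string]int64{}
	seen := map[string]bool{}

	snapshot := []Event{
		{ID: "doc-1", Key: "alice", Count: 3},
		{ID: "doc-2", Key: "bob", Count: 5},
	}
	Apply(state, seen, snapshot)
	Apply(state, seen, snapshot[:1]) // a re-emitted chunk is a no-op

	live := []Event{{ID: "evt-9", Key: "alice", Count: 1}}
	Apply(state, seen, live) // live events use the same fold

	fmt.Println(state["alice"], state["bob"]) // prints "4 5"
}
```

The in-memory `seen` map is only a stand-in for whatever dedup mechanism the runtime provides.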
Types
type HandoffToken
type HandoffToken []byte
HandoffToken is an opaque marker captured at the start of a snapshot. It tells the live source where to resume after bootstrap completes, so that no events are missed or duplicated.
- For Mongo: a Change Stream resume token captured before the collection scan.
- For DynamoDB: a Streams shard iterator position captured before the table scan.
- For JDBC + binlog CDC: a binlog position.
The bootstrap runtime persists this token alongside its progress and hands it back to the live source on transition.
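One way to picture "persists this token alongside its progress" is a single checkpoint record that carries both values. The sketch below is an assumption for illustration only: the `Checkpoint` struct, its field names, and the JSON encoding are hypothetical and say nothing about how the bootstrap runtime actually stores its state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HandoffToken mirrors the type documented above.
type HandoffToken []byte

// Checkpoint is a hypothetical persisted record: the handoff token captured
// before the scan, plus the latest scan progress marker.
type Checkpoint struct {
	Handoff HandoffToken `json:"handoff"`
	Marker  []byte       `json:"marker"`
	Done    bool         `json:"done"`
}

// Encode serializes a checkpoint; byte slices become base64 strings in JSON.
func Encode(c Checkpoint) ([]byte, error) { return json.Marshal(c) }

// Decode restores a checkpoint written by Encode.
func Decode(b []byte) (Checkpoint, error) {
	var c Checkpoint
	err := json.Unmarshal(b, &c)
	return c, err
}

func main() {
	cp := Checkpoint{
		Handoff: HandoffToken("resume-token-bytes"), // opaque to the runtime
		Marker:  []byte("chunk-0042"),
	}
	raw, err := Encode(cp)
	if err != nil {
		panic(err)
	}
	restored, err := Decode(raw)
	if err != nil {
		panic(err)
	}
	// After a restart the marker feeds Resume; once Done is set, the handoff
	// token goes back to the live source.
	fmt.Println(string(restored.Handoff), string(restored.Marker), restored.Done)
}
```

Keeping both values in one record means a restart never sees a progress marker without the token that was captured for that same snapshot.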
type Source
type Source[T any] interface {
	// CaptureHandoff captures the live-source resume position before scanning begins.
	// Called once at bootstrap start.
	CaptureHandoff(ctx context.Context) (HandoffToken, error)

	// Scan emits records via out until the snapshot is exhausted. Implementations should
	// honor ctx cancellation and yield records in stable, resumable chunks. Each record
	// must carry an EventID derived from the document's primary key so re-runs are
	// idempotent under at-least-once dedup.
	Scan(ctx context.Context, out chan<- source.Record[T]) error

	// Resume continues a previously-interrupted scan from the persisted progress marker.
	// Implementations that don't support resumption should restart from the beginning;
	// at-least-once dedup handles the duplicate emissions.
	Resume(ctx context.Context, marker []byte, out chan<- source.Record[T]) error

	// Name returns a stable identifier for logging and metrics.
	Name() string

	// Close releases any underlying resources.
	Close() error
}
Source scans a datastore and emits records as synthetic events. Implementations should support resumable, chunked progress so a long-running bootstrap can survive worker restarts.
Distinct from pkg/source.Source (the live-streaming abstraction): a snapshot.Source is one-shot and finite; a pkg/source.Source is open-ended and streaming. Bootstrap mode uses snapshot.Source; live mode uses pkg/source.Source.
Directories
| Path | Synopsis |
|---|---|
| dynamodb | Package dynamodb provides a snapshot.Source backed by DynamoDB ParallelScan, suitable for bootstrapping a Murmur pipeline from a DDB-resident OLTP table. |
| jsonl | Package jsonl provides a snapshot.Source that reads JSON Lines from any io.Reader (local files, S3 objects, gzip streams, etc.). |
| mongo | Package mongo provides a Mongo-backed implementation of snapshot.Source for Murmur's Bootstrap execution mode. |
| parquet | Package parquet provides a snapshot.Source that reads Apache Parquet from any io.ReaderAt (local files, S3 objects fetched into memory, in-memory bytes from tests, etc.). |
| s3 | Package s3 provides a snapshot.Source that scans every JSON-Lines object under an S3 prefix and emits records for bootstrap. |