Documentation
Overview
Package c1zsanitize transforms a real .c1z snapshot into an identity-stripped copy whose graph topology, cardinalities, and annotation structure are preserved. The output is suitable for shipping to internal development environments where the original customer data must not appear.
A single per-c1z HMAC-SHA256 secret drives the whole transform. Within one c1z, the same input always maps to the same output, so cross-references stay coherent. Across c1zs sanitized under different secrets the outputs differ, so an attacker holding multiple sanitized outputs cannot correlate them.
v0.1 reads and writes the v1/v2 sqlite-zstd .c1z format via connectorstore.Reader / Writer. v3 c1z3 output will land in v0.2 once the storage-engine-v4 PR stack merges.
Constants
const MinSecretBytes = 32
MinSecretBytes is the minimum length of a per-c1z secret. Anything shorter is rejected by Sanitize; in practice operators should use 32 random bytes from a CSPRNG.
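As a rough illustration of the advice above, the following sketch draws a 32-byte secret from the operating system's CSPRNG and applies a length check. The names minSecretBytes, newSecret, and checkSecret are hypothetical stand-ins written for this example; they are not part of the package, and the check Sanitize actually performs may differ in detail.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// minSecretBytes mirrors the documented MinSecretBytes constant.
const minSecretBytes = 32

// newSecret returns n bytes read from the operating system's CSPRNG.
// It is a hypothetical helper, not part of c1zsanitize.
func newSecret(n int) ([]byte, error) {
	if n < minSecretBytes {
		return nil, fmt.Errorf("secret length %d is below the %d-byte minimum", n, minSecretBytes)
	}
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return nil, err
	}
	return b, nil
}

// checkSecret is an assumed stand-in for the length check that
// Sanitize is documented to perform on the secret.
func checkSecret(secret []byte) error {
	if len(secret) < minSecretBytes {
		return errors.New("secret too short")
	}
	return nil
}

func main() {
	secret, err := newSecret(32)
	if err != nil {
		panic(err)
	}
	fmt.Println("full secret accepted:", checkSecret(secret) == nil)
	fmt.Println("truncated secret rejected:", checkSecret(secret[:16]) != nil)
}
```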
Functions
func LoadOrGenerateSecret
LoadOrGenerateSecret returns the per-c1z HMAC secret. When flagPath is set it loads and length-checks that file. Otherwise it mints a fresh CSPRNG secret and writes it next to outPath, refusing to clobber an existing one so a prior run's reversible mapping is never silently replaced. generated reports whether a new secret was minted so the caller can tell the operator to archive it.
func Sanitize
func Sanitize(ctx context.Context, src connectorstore.Reader, dst connectorstore.Writer, opts Options) error
Sanitize copies records from src to dst, transforming identifiers, names, free text, emails, and timestamps under the per-c1z secret. One destination sync is opened per source sync; parent_sync_id linkage is preserved via a srcSyncID → dstSyncID map maintained for the duration of the call.
func SanitizeID
SanitizeID returns a deterministic, irreversible transform of input under the per-c1z secret. The same input yields the same output within a c1z; c1zs sanitized under different secrets yield different outputs.
Empty input returns empty output so callers can transform optional fields without checking presence first.
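A minimal sketch of a keyed transform with these properties is shown below. It assumes a hex-encoded, untruncated HMAC-SHA256 digest and a (secret, input) parameter order; the real SanitizeID signature and output encoding are not given here, so sanitizeID is a hypothetical stand-in.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sanitizeID is a hypothetical stand-in for SanitizeID. It returns the
// hex-encoded HMAC-SHA256 of input under secret, and passes empty
// input through so optional fields stay empty.
func sanitizeID(secret []byte, input string) string {
	if input == "" {
		return ""
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(input)) // hash.Hash writes never return an error
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	// Fixed demo keys for illustration only; real secrets come from a CSPRNG.
	secretA := []byte("0123456789abcdef0123456789abcdef")
	secretB := []byte("fedcba9876543210fedcba9876543210")

	a1 := sanitizeID(secretA, "user-42")
	a2 := sanitizeID(secretA, "user-42")
	b1 := sanitizeID(secretB, "user-42")

	fmt.Println("stable within one secret:", a1 == a2)
	fmt.Println("differs across secrets:", a1 != b1)
	fmt.Println("empty stays empty:", sanitizeID(secretA, "") == "")
}
```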
func SecretPath
SecretPath returns where the per-c1z HMAC secret is read from or written to: the explicit flag path when set, otherwise a file next to the sanitized output.
Types
type Options
type Options struct {
	// Secret is the per-c1z HMAC key. Must be at least MinSecretBytes.
	// The operator chooses whether to archive or discard it; the
	// sanitizer never persists it on its own.
	Secret []byte

	// TimestampAnchor is the wall-clock value the newest timestamp in
	// the source c1z lands on. All other timestamps shift by the same
	// delta so relative deltas are preserved. Defaults to time.Now()
	// when zero.
	TimestampAnchor time.Time

	// AllowUnknownAnnotations controls behavior when an annotation's
	// Any type URL is not in the handler registry. The zero value is
	// the safe default: unknown annotations are dropped and a log line
	// names the type URL, so a newly-added annotation type carrying
	// customer data can never pass through unsanitized. Set true to
	// pass unknown annotations through unchanged: convenient for
	// development against new annotation types, dangerous on real
	// customer data.
	AllowUnknownAnnotations bool
}
Options configures a sanitization run.