Documentation ¶
Overview ¶
Package codec implements the v3 storage engine's record codec layer.
The codec layer encodes and decodes c1.storage.v3 record types onto Pebble's key-value primitives. It uses a hybrid pattern:
- Generated, typed codecs for the SDK's six built-in record types, registered via init() during package import. The hot write path dispatches to these.
- Cached descriptor-reflection codecs (*ReflectCodec) for any other proto message: debug paths, manifest descriptor walking, and potential future extension protos. Each is constructed lazily on the first Lookup miss and cached process-wide.
Index ¶
- Variables
- func AppendTupleBool(dst []byte, b bool) []byte
- func AppendTupleBytes(dst []byte, b []byte) []byte
- func AppendTupleInt32(dst []byte, n int32) []byte
- func AppendTupleInt64(dst []byte, n int64) []byte
- func AppendTupleSeparator(dst []byte) []byte
- func AppendTupleString(dst []byte, s string) []byte
- func AppendTupleStrings(dst []byte, s ...string) []byte
- func AppendTupleUint32(dst []byte, n uint32) []byte
- func AppendTupleUint64(dst []byte, n uint64) []byte
- func DecodeSyncID(b []byte) string
- func DecodeTupleStringTo(dst []byte, src []byte, off int) ([]byte, int, error)
- func EncodeSyncID(s string) ([]byte, error)
- func Register(name protoreflect.FullName, c Codec)
- func RegisteredNames() []protoreflect.FullName
- type Codec
- func Lookup(md protoreflect.MessageDescriptor) Codec
- type ReflectCodec
- func NewReflectCodec(md protoreflect.MessageDescriptor) *ReflectCodec
- func (c *ReflectCodec) DecodeValue(b []byte, dst proto.Message) error
- func (c *ReflectCodec) DeleteIndexes(batch *pebble.Batch, msg proto.Message) error
- func (c *ReflectCodec) EncodeKey(msg proto.Message) ([]byte, error)
- func (c *ReflectCodec) EncodeValue(msg proto.Message) ([]byte, error)
- func (c *ReflectCodec) WriteIndexes(batch *pebble.Batch, msg proto.Message) error
Constants ¶
This section is empty.
Variables ¶
var ErrCodecTypeMismatch = errors.New("codec: input message type does not match registered codec")
ErrCodecTypeMismatch is returned by a generated codec when the supplied proto.Message is not the type the codec was registered for. Engine write paths surface this as a DataLoss-class error.
var ErrInvalidSyncID = errors.New("codec: sync_id is not a valid KSUID")
ErrInvalidSyncID is returned when EncodeSyncID receives a string that is not a parseable KSUID.
var ErrInvalidTuple = errors.New("codec: invalid tuple encoding")
ErrInvalidTuple is returned by tuple decoders when the bytes are malformed (e.g. an escape sequence with no follower, or a truncated fixed-width integer).
var ErrReflectMissingTable = fmt.Errorf("codec: descriptor missing (storage.v3.table) option")
ErrReflectMissingTable is returned by EncodeKey / WriteIndexes when the descriptor lacks a (storage.v3.table) option. Reflection codecs can still encode/decode values without one; only key paths require the table metadata.
Functions ¶
func AppendTupleBool ¶
AppendTupleBool writes 0x26 for false, 0x27 for true. These bytes sort false-before-true and never collide with the separator (0x00) or escape (0x01).
func AppendTupleBytes ¶
AppendTupleBytes writes a tuple-encoded raw-bytes component. Same escape rules as strings — needed because some connectors emit external IDs as opaque bytes that may contain embedded NUL.
func AppendTupleInt32 ¶
AppendTupleInt32 writes a sign-flipped big-endian 4-byte int32. Sign-flipping puts negative numbers before non-negative in bytewise comparison, matching natural int order.
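The sign flip can be illustrated with a minimal sketch. The helper name appendInt32 is hypothetical and only mirrors the description above; it is not the package's implementation.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// appendInt32 is a hypothetical stand-in for AppendTupleInt32: it flips the
// sign bit and appends the result as 4 big-endian bytes.
func appendInt32(dst []byte, n int32) []byte {
	return binary.BigEndian.AppendUint32(dst, uint32(n)^0x80000000)
}

func main() {
	values := []int32{-2147483648, -1, 0, 1, 2147483647}
	for i := 1; i < len(values); i++ {
		prev := appendInt32(nil, values[i-1])
		cur := appendInt32(nil, values[i])
		// Each encoded value should compare below its successor.
		fmt.Printf("% x < % x: %v\n", prev, cur, bytes.Compare(prev, cur) < 0)
	}
}
```

Flipping the top bit maps the most negative value to all-zero bytes and the most positive to all-0xff bytes, so a plain bytewise comparison agrees with integer order.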
func AppendTupleInt64 ¶
AppendTupleInt64 writes a sign-flipped big-endian 8-byte int64.
func AppendTupleSeparator ¶
AppendTupleSeparator writes a single separator byte between elements. Callers emit this themselves so the encoder is composable — e.g. a record's primary-key emission appends version + type + sync_id + separator + external_id with no separator at the end.
func AppendTupleString ¶
AppendTupleString writes a tuple-encoded string component (no trailing separator). The caller is responsible for emitting the separator between successive components.
func AppendTupleStrings ¶
AppendTupleStrings tuple-encodes each string in s and interleaves the tuple separator between successive elements. Equivalent to calling AppendTupleString in a loop with AppendTupleSeparator between calls, but centralized in one place so key-encoding sites cannot silently drift by emitting one separator too many or too few.
No leading or trailing separator is emitted. Callers that need a leading separator (e.g. to delimit the raw sync_id bytes that precede the tuple tail in every Pebble v3 key) or a trailing separator (e.g. to make a by-value range-scan prefix unambiguous — see keys.go's convention doc) must add it themselves.
For a single string, AppendTupleStrings(dst, s) is exactly equivalent to AppendTupleString(dst, s).
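The separator placement described above can be sketched as follows. The helper names are hypothetical. The separator (0x00) and escape (0x01) bytes come from this page, but the two-byte escape pairs used here are an assumption made for illustration, not the package's documented wire format.

```go
package main

import "fmt"

// The source names 0x00 as the separator and 0x01 as the escape byte.
const (
	sep = 0x00
	esc = 0x01
)

// appendString is a hypothetical stand-in for AppendTupleString.
func appendString(dst []byte, s string) []byte {
	for i := 0; i < len(s); i++ {
		switch c := s[i]; c {
		case sep:
			dst = append(dst, esc, 0x01) // assumed encoding of an embedded 0x00
		case esc:
			dst = append(dst, esc, 0x02) // assumed encoding of an embedded 0x01
		default:
			dst = append(dst, c)
		}
	}
	return dst
}

// appendStrings is a hypothetical stand-in for AppendTupleStrings: separators
// go between elements only, never before the first or after the last.
func appendStrings(dst []byte, parts ...string) []byte {
	for i, p := range parts {
		if i > 0 {
			dst = append(dst, sep)
		}
		dst = appendString(dst, p)
	}
	return dst
}

func main() {
	fmt.Printf("%q\n", appendStrings(nil, "user", "a\x00b"))
	fmt.Printf("%q\n", appendStrings(nil, "solo"))
}
```

A caller that needs a trailing separator for a range-scan prefix would append it explicitly after the call, as the text above requires.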
func AppendTupleUint32 ¶
AppendTupleUint32 writes a big-endian 4-byte uint32 (no sign flip).
func AppendTupleUint64 ¶
AppendTupleUint64 writes a big-endian 8-byte uint64 (no sign flip).
func DecodeSyncID ¶
DecodeSyncID converts a 20-byte canonical KSUID back to its base62 string form for human-readable display. Returns an empty string if b is not exactly 20 bytes.
func DecodeTupleStringTo ¶
DecodeTupleStringTo decodes a single tuple-encoded string from src starting at offset off. It returns the decoded bytes (a []byte, per the signature; the dst parameter suggests they are appended to dst), the offset immediately after the consumed bytes (pointing at the separator or end-of-input), and any error. If the input ends inside an escape sequence, it returns ErrInvalidTuple.
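A decoder with this shape might look like the sketch below. The name decodeStringTo and the error variable are hypothetical, and the escape pairs (0x01 0x01 for an embedded 0x00, 0x01 0x02 for an embedded 0x01) are assumed for illustration because this page does not spell them out.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalidTuple = errors.New("invalid tuple encoding") // stand-in for ErrInvalidTuple

// decodeStringTo is a hypothetical stand-in for DecodeTupleStringTo. It
// appends the decoded bytes to dst and stops at an unescaped 0x00 separator
// or at end of input.
func decodeStringTo(dst, src []byte, off int) ([]byte, int, error) {
	if off < 0 || off > len(src) {
		return dst, off, errInvalidTuple
	}
	for i := off; i < len(src); i++ {
		switch src[i] {
		case 0x00:
			return dst, i, nil // offset points at the separator
		case 0x01:
			if i+1 >= len(src) {
				return dst, i, errInvalidTuple // escape with no follower
			}
			i++
			switch src[i] {
			case 0x01:
				dst = append(dst, 0x00)
			case 0x02:
				dst = append(dst, 0x01)
			default:
				return dst, i, errInvalidTuple // unknown escape follower
			}
		default:
			dst = append(dst, src[i])
		}
	}
	return dst, len(src), nil
}

func main() {
	key := []byte("user\x00a\x01\x01b")
	first, next, err := decodeStringTo(nil, key, 0)
	fmt.Printf("%q %d %v\n", first, next, err)
	second, next, err := decodeStringTo(nil, key, next+1) // skip the separator
	fmt.Printf("%q %d %v\n", second, next, err)
}
```

Because the returned offset points at the separator rather than past it, the caller advances by one before decoding the next component.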
func EncodeSyncID ¶
EncodeSyncID converts a KSUID string into its 20-byte canonical binary form. baton's sync_id values are KSUIDs (27-char base62); storing them in keys as base62 strings would cost about 7 extra bytes per occurrence, multiplied across N indexes and 100M+ rows. The binary form is uniformly 20 bytes and lex-compares identically to the base62 form, because KSUID's fixed-width base62 encoding uses an alphabet in ascending ASCII order and therefore preserves the byte order of the underlying value.
Returns ErrInvalidSyncID if s is not a valid KSUID string.
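The conversion can be sketched with the standard library alone. The function encodeSyncID below is a hypothetical stand-in, not the package's code, and the two 27-character inputs are arbitrary sample values. The sketch reads the base62 digits (alphabet 0-9, A-Z, a-z) as one large number and writes it into 20 big-endian bytes.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"strings"
)

const base62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

var errInvalidSyncID = errors.New("sync_id is not a valid KSUID") // stand-in for ErrInvalidSyncID

// encodeSyncID is a hypothetical stand-in for EncodeSyncID.
func encodeSyncID(s string) ([]byte, error) {
	if len(s) != 27 {
		return nil, errInvalidSyncID
	}
	n := new(big.Int)
	radix := big.NewInt(62)
	for i := 0; i < len(s); i++ {
		d := strings.IndexByte(base62, s[i])
		if d < 0 {
			return nil, errInvalidSyncID
		}
		n.Mul(n, radix)
		n.Add(n, big.NewInt(int64(d)))
	}
	if n.BitLen() > 160 { // 27 base62 digits can overflow 20 bytes
		return nil, errInvalidSyncID
	}
	return n.FillBytes(make([]byte, 20)), nil
}

func main() {
	// Two sample strings; the first sorts before the second as text.
	a, _ := encodeSyncID("0ujtsYcgvSTl8PAuAdqWYSMnLOv")
	b, _ := encodeSyncID("0ujzPyRiIAffKhBux4PvQdDqMHY")
	fmt.Printf("%x\n%x\n", a, b)
	fmt.Println("binary order matches:", string(a) < string(b))
}
```

The explicit digit loop is deliberate: math/big's own base-62 parsing orders lowercase letters before uppercase, which differs from the KSUID alphabet.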
func Register ¶
func Register(name protoreflect.FullName, c Codec)
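Register installs a codec under its proto full-name. Called only from generated init() functions. Panics on duplicate registration: a duplicate means two generated codecs claim the same name, which is a build-time mistake that should fail at startup rather than a condition to handle at runtime.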
func RegisteredNames ¶
func RegisteredNames() []protoreflect.FullName
RegisteredNames returns the proto full-names of all generated codecs registered in the binary. Intended for test introspection and for the manifest's RecordTypeInfo population.
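The registration contract can be sketched as below. Everything here is illustrative: plain strings stand in for protoreflect.FullName, the record names are hypothetical, and sorting the returned names is an assumption, since the text above does not state an ordering.

```go
package main

import (
	"fmt"
	"sort"
)

// codec is a placeholder for the package's Codec interface.
type codec interface{}

// generated maps a proto full-name to its codec.
var generated = map[string]codec{}

// register is a hypothetical stand-in for Register. It is meant to run only
// from init(), so it takes no lock and panics on a duplicate name.
func register(name string, c codec) {
	if _, dup := generated[name]; dup {
		panic("codec: duplicate registration for " + name)
	}
	generated[name] = c
}

// registeredNames returns the registered names, sorted here only to give
// stable output.
func registeredNames() []string {
	names := make([]string, 0, len(generated))
	for n := range generated {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func init() {
	register("example.v1.Resource", struct{}{})
	register("example.v1.Grant", struct{}{})
}

func main() {
	fmt.Println(registeredNames())
}
```

Because Go runs all init() functions before main, a duplicate name in this sketch would stop the process at startup, before any write path is reachable.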
Types ¶
type Codec ¶
type Codec interface {
// EncodeKey returns the primary-key bytes for the message.
// The bytes are tuple-encoded per the codec's record-type
// declaration. Returns ErrCodecTypeMismatch if msg is not the
// type the codec was registered for.
EncodeKey(msg proto.Message) ([]byte, error)
// EncodeValue returns deterministic proto wire bytes for the
// message. Generated codecs use proto.MarshalOptions{Deterministic:
// true} so two equal records produce equal bytes — required for
// the equivalence harness's byte-equality assertion.
EncodeValue(msg proto.Message) ([]byte, error)
// DecodeValue parses bytes into dst. dst must be the same type
// as the registered codec; mismatches return ErrCodecTypeMismatch.
DecodeValue(b []byte, dst proto.Message) error
// WriteIndexes appends all secondary-index entries for msg to
// batch. Called inside a pebble.Batch alongside the primary write.
WriteIndexes(batch *pebble.Batch, msg proto.Message) error
// DeleteIndexes appends index-entry deletions for msg to batch.
// Called during overwrite (after reading the previous value) and
// during explicit Delete. Same atomicity as WriteIndexes.
DeleteIndexes(batch *pebble.Batch, msg proto.Message) error
}
Codec is the per-record-type interface every storage codec implements. Methods return ErrCodecTypeMismatch on a type-assertion mismatch rather than panicking; the engine write path propagates these errors and surfaces a DataLoss-class gRPC code to upstream callers.
Generated codecs hold a private type-assertion at the entry point; reflection codecs evaluate against the descriptor at call time.
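The entry-point type assertion can be sketched as follows. The record types, the message placeholder, and grantCodec are all hypothetical; only the pattern of returning a sentinel error instead of panicking is taken from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

var errCodecTypeMismatch = errors.New("codec: input message type does not match registered codec")

// message is a placeholder for proto.Message.
type message interface{}

// grant and resource are hypothetical record types.
type grant struct{ ID string }
type resource struct{ ID string }

// grantCodec mimics a generated codec: one type assertion at the entry point,
// and an error rather than a panic when it fails.
type grantCodec struct{}

func (grantCodec) EncodeKey(msg message) ([]byte, error) {
	g, ok := msg.(*grant)
	if !ok || g == nil {
		return nil, errCodecTypeMismatch
	}
	return []byte(g.ID), nil
}

func main() {
	var c grantCodec
	key, err := c.EncodeKey(&grant{ID: "g1"})
	fmt.Printf("%q %v\n", key, err)
	_, err = c.EncodeKey(&resource{ID: "r1"})
	fmt.Println(errors.Is(err, errCodecTypeMismatch))
}
```

Returning the sentinel lets a caller classify the failure with errors.Is and map it to whatever status code the write path uses.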
func Lookup ¶
func Lookup(md protoreflect.MessageDescriptor) Codec
Lookup returns the codec for the given message descriptor. If a generated codec is registered, returns it (hot path, lock-free read from a frozen map). Otherwise constructs a ReflectCodec lazily, caches it in reflectCache, and returns it.
Lookup never returns nil and never returns an error; an unknown descriptor produces a working reflection codec. Whether that codec can actually encode keys depends on whether the descriptor has the required (storage.v3.table) option — ReflectCodec's methods return errors at call time if the descriptor lacks the necessary metadata.
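The two-tier dispatch can be sketched as below. The sketch makes several assumptions: strings stand in for message descriptors, the type names are hypothetical, and sync.Map is one possible choice for the process-wide cache, which the text above names reflectCache without describing its type.

```go
package main

import (
	"fmt"
	"sync"
)

// codec is a placeholder for the Codec interface.
type codec interface{ Kind() string }

type generatedCodec struct{}
type reflectCodec struct{ name string }

func (generatedCodec) Kind() string { return "generated" }
func (*reflectCodec) Kind() string  { return "reflect" }

// generated is populated before main and only read afterwards, so lookups
// need no lock.
var generated = map[string]codec{"example.v1.Grant": generatedCodec{}}

// reflectCache holds lazily built fallback codecs.
var reflectCache sync.Map

// lookup is a hypothetical stand-in for Lookup: generated codec first, then a
// cached reflection codec. It never returns nil.
func lookup(name string) codec {
	if c, ok := generated[name]; ok {
		return c
	}
	if c, ok := reflectCache.Load(name); ok {
		return c.(codec)
	}
	// LoadOrStore keeps a single winner if two goroutines miss concurrently.
	c, _ := reflectCache.LoadOrStore(name, &reflectCodec{name: name})
	return c.(codec)
}

func main() {
	fmt.Println(lookup("example.v1.Grant").Kind())
	first := lookup("example.v1.Debug")
	fmt.Println(first.Kind(), first == lookup("example.v1.Debug"))
}
```

In this sketch a concurrent miss may construct a codec that is then discarded; that trade-off keeps the hit path free of locks.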
type ReflectCodec ¶
type ReflectCodec struct {
// contains filtered or unexported fields
}
ReflectCodec encodes records via cached descriptor reflection. It can encode and decode values for any message. Key and index encoding additionally require the descriptor to carry a (storage.v3.table) option; without it, those methods return ErrReflectMissingTable.
ReflectCodec is constructed lazily by Lookup() and cached process-wide. The construction cost — resolving primary-key field paths and index declarations — is paid once per descriptor across the entire process; subsequent calls reuse the cached codec.
Performance: ~5× slower than a generated typed codec at the same workload. Used only off the engine's hot write path.
func NewReflectCodec ¶
func NewReflectCodec(md protoreflect.MessageDescriptor) *ReflectCodec
NewReflectCodec constructs a codec for the given message descriptor. Callers should generally go through Lookup() instead, which caches.
func (*ReflectCodec) DecodeValue ¶
func (c *ReflectCodec) DecodeValue(b []byte, dst proto.Message) error
func (*ReflectCodec) DeleteIndexes ¶
func (c *ReflectCodec) DeleteIndexes(batch *pebble.Batch, msg proto.Message) error
func (*ReflectCodec) EncodeKey ¶
func (c *ReflectCodec) EncodeKey(msg proto.Message) ([]byte, error)
EncodeKey encodes the primary key from the descriptor's (storage.v3.table) metadata. It returns ErrReflectMissingTable if the descriptor lacks that option.
func (*ReflectCodec) EncodeValue ¶
func (c *ReflectCodec) EncodeValue(msg proto.Message) ([]byte, error)
EncodeValue uses deterministic proto marshal — same contract as generated codecs.