Documentation ¶
Overview ¶
Package column implements the organization of columns on storage for a zst columnar storage object.
A zst object is created by allocating a RecordWriter for a top-level zng row type (i.e., "schema") via NewRecordWriter. The object to be written to is wrapped in a Spiller with a column threshold. Output is streamed to the underlying spiller in a single pass. (In the future, we may implement multiple passes to optimize the storage layout of column data or spread a given zst object across multiple
NewRecordWriter recursively descends the record type, allocating a column Writer for each node in the type tree. The top-level record body is written via a call to Write, and each column is called with its respective value represented as a zcode.Bytes. The columns buffer data in memory until they reach their byte threshold or until Flush is called.
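The recursive descent and threshold-based buffering described above can be sketched roughly as follows. This is a simplified illustration, not the package's implementation: typeNode, value, sink, leafWriter, and newWriter are hypothetical stand-ins for the zed type tree, zcode.Bytes, the Spiller, and the column writers, and the byte threshold is passed in directly.

```go
package main

import "bytes"

// typeNode is a hypothetical stand-in for a node in a record type tree:
// a leaf when Fields is empty, a record otherwise.
type typeNode struct {
	Name   string
	Fields []typeNode
}

// value is a hypothetical stand-in for an encoded value: leaves carry
// bytes, records carry one child value per field.
type value struct {
	Bytes  []byte
	Fields []value
}

// sink is a hypothetical stand-in for the spiller; it records each
// flushed column chunk in the order it was written.
type sink struct {
	chunks []string
}

// columnWriter mirrors the Write/Flush half of the Writer interface.
type columnWriter interface {
	write(v value)
	flush()
}

// leafWriter buffers the bytes of one leaf column in memory.
type leafWriter struct {
	name   string
	buf    bytes.Buffer
	thresh int
	out    *sink
}

// write appends the value and flushes once the buffer reaches the threshold.
func (l *leafWriter) write(v value) {
	l.buf.Write(v.Bytes)
	if l.buf.Len() >= l.thresh {
		l.flush()
	}
}

// flush pushes any buffered bytes to the sink as one chunk.
func (l *leafWriter) flush() {
	if l.buf.Len() == 0 {
		return
	}
	l.out.chunks = append(l.out.chunks, l.name+":"+l.buf.String())
	l.buf.Reset()
}

// recordWriter fans a record value out to one writer per field.
type recordWriter []columnWriter

func (r recordWriter) write(v value) {
	for i, w := range r {
		w.write(v.Fields[i])
	}
}

func (r recordWriter) flush() {
	for _, w := range r {
		w.flush()
	}
}

// newWriter recursively descends the type tree, allocating one writer per node.
func newWriter(t typeNode, thresh int, out *sink) columnWriter {
	if len(t.Fields) == 0 {
		return &leafWriter{name: t.Name, thresh: thresh, out: out}
	}
	rw := make(recordWriter, 0, len(t.Fields))
	for _, f := range t.Fields {
		rw = append(rw, newWriter(f, thresh, out))
	}
	return rw
}
```

In this sketch each leaf flushes independently, so a frequently written column can spill several chunks before a sparse one spills any. An explicit flush of the top-level writer pushes every column at once.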
After all of the zng data is written, a reassembly record may be formed for the RecordColumn by calling its MarshalZNG method, which builds the record value in place using zcode.Builder and returns the zed.TypeRecord (i.e., schema) of that record column.
Data is read from a zst file by scanning the reassembly records, then unmarshaling a zed.Record body into an empty Record by calling Record.UnmarshalZNG, which recursively builds an assembly structure. An io.ReaderAt is passed to unmarshal so each column reader can access the underlying storage object and read its column data efficiently in largish column chunks.
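As a rough illustration of how a column reader might pull its chunks through a shared io.ReaderAt, the sketch below walks a list of (offset, length) pairs. The segment fields follow the shape given by SegmapTypeString; the columnReader type, its next method, and the memObject helper are hypothetical and are not part of this package.

```go
package main

import (
	"bytes"
	"io"
)

// segment is a hypothetical stand-in for the package's Segment type; the
// fields follow SegmapTypeString ("[{offset:int64,length:int32}]").
type segment struct {
	Offset int64
	Length int32
}

// columnReader walks one column's segments in order, reading each from
// the shared storage object as a single chunk.
type columnReader struct {
	segmap []segment
	r      io.ReaderAt
}

// next returns the bytes of the next segment, or io.EOF once the segmap
// is exhausted.
func (c *columnReader) next() ([]byte, error) {
	if len(c.segmap) == 0 {
		return nil, io.EOF
	}
	s := c.segmap[0]
	c.segmap = c.segmap[1:]
	buf := make([]byte, s.Length)
	// A SectionReader confines the read to this segment's byte range.
	sr := io.NewSectionReader(c.r, s.Offset, int64(s.Length))
	if _, err := io.ReadFull(sr, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// memObject wraps a string as an in-memory storage object for demonstration.
func memObject(s string) io.ReaderAt {
	return bytes.NewReader([]byte(s))
}
```

Because io.ReaderAt reads at explicit offsets and keeps no cursor, several column readers can share one storage object without coordinating seeks.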
Once an assembly is built, the reconstructed zng row data can be read from the assembly by calling the Read method on the top-level Record and passing in a zcode.Builder to reconstruct the record body in place. The assembly does not need any type information, as the structure of values is entirely self-describing in the zng data format.
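The row reconstruction described above can be pictured with the following minimal sketch, in which each call on the top-level record pulls one value from every column and appends it to a builder. All names here (builder, reader, primitive, record, errEndOfColumn) are hypothetical stand-ins. In particular, the token-based builder only imitates the role of zcode.Builder and does not produce zng encoding.

```go
package main

import (
	"errors"
	"strings"
)

// errEndOfColumn is a hypothetical error returned when a column has no more values.
var errEndOfColumn = errors.New("end of column")

// builder is a hypothetical stand-in for zcode.Builder; it collects
// tokens instead of producing encoded bytes.
type builder struct{ tokens []string }

func (b *builder) String() string { return strings.Join(b.tokens, " ") }
func (b *builder) reset()         { b.tokens = b.tokens[:0] }

// reader is implemented by every node of the assembly.
type reader interface {
	read(b *builder) error
}

// primitive yields the values of one leaf column in order.
type primitive struct{ vals []string }

func (p *primitive) read(b *builder) error {
	if len(p.vals) == 0 {
		return errEndOfColumn
	}
	b.tokens = append(b.tokens, p.vals[0])
	p.vals = p.vals[1:]
	return nil
}

// record reads one value from each field, bracketing nested records so
// the output carries its own structure.
type record []reader

func (r record) read(b *builder) error {
	for _, f := range r {
		if sub, ok := f.(record); ok {
			b.tokens = append(b.tokens, "{")
			if err := sub.read(b); err != nil {
				return err
			}
			b.tokens = append(b.tokens, "}")
			continue
		}
		if err := f.read(b); err != nil {
			return err
		}
	}
	return nil
}
```

The sketch needs no schema while reading: the nesting is carried by the shape of the assembly itself, which parallels the statement that the assembly needs no type information.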
Index ¶
- Constants
- Variables
- func UnmarshalSegmap(in zed.Value, s *[]Segment) error
- func UnmarshalSegment(zv zcode.Bytes, s *Segment) error
- type Array
- type ArrayWriter
- type Field
- type FieldWriter
- type Int
- type IntWriter
- type Interface
- type Presence
- type PresenceWriter
- type Primitive
- type PrimitiveWriter
- type Record
- type RecordWriter
- type Segment
- type Spiller
- type Union
- type UnionWriter
- type Writer
Constants ¶
const MaxSegmentThresh = 20 * 1024 * 1024
const SegmapTypeString = "[{offset:int64,length:int32}]"
Variables ¶
var ErrColumnMismatch = errors.New("zng record value doesn't match column writer")
var ErrCorruptSegment = errors.New("segmap value corrupt")
var ErrNonRecordAccess = errors.New("attempting to access a field in a non-record value")
Functions ¶
Types ¶
type ArrayWriter ¶
type ArrayWriter struct {
// contains filtered or unexported fields
}
func NewArrayWriter ¶
func NewArrayWriter(inner zed.Type, spiller *Spiller) *ArrayWriter
func (*ArrayWriter) Flush ¶
func (a *ArrayWriter) Flush(eof bool) error
func (*ArrayWriter) MarshalZNG ¶
type FieldWriter ¶
type FieldWriter struct {
// contains filtered or unexported fields
}
func (*FieldWriter) Flush ¶
func (f *FieldWriter) Flush(eof bool) error
func (*FieldWriter) MarshalZNG ¶
type Presence ¶
type Presence struct {
Int
// contains filtered or unexported fields
}
func NewPresence ¶
func NewPresence() *Presence
type PresenceWriter ¶
type PresenceWriter struct {
IntWriter
// contains filtered or unexported fields
}
func NewPresenceWriter ¶
func NewPresenceWriter(spiller *Spiller) *PresenceWriter
func (*PresenceWriter) Finish ¶
func (p *PresenceWriter) Finish()
func (*PresenceWriter) TouchUnset ¶
func (p *PresenceWriter) TouchUnset()
func (*PresenceWriter) TouchValue ¶
func (p *PresenceWriter) TouchValue()
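This page does not describe how presence is encoded. Since PresenceWriter embeds IntWriter and exposes TouchValue, TouchUnset, and Finish, one plausible reading is that it records alternating run lengths of set and unset values. The sketch below illustrates that idea under exactly that assumption. The presenceRuns type, its methods, and the expand helper are hypothetical and may not match the real encoding.

```go
package main

// presenceRuns is a hypothetical run-length recorder. Runs alternate
// between "value present" and "value unset", starting with a (possibly
// empty) run of present values.
type presenceRuns struct {
	runs  []int64
	run   int64
	unset bool
}

// touchValue records that the next row has a value in this column.
func (p *presenceRuns) touchValue() { p.touch(false) }

// touchUnset records that the next row has no value in this column.
func (p *presenceRuns) touchUnset() { p.touch(true) }

// touch extends the current run, or closes it and starts a new one when
// the state flips.
func (p *presenceRuns) touch(unset bool) {
	if p.unset != unset {
		p.runs = append(p.runs, p.run)
		p.run = 0
		p.unset = unset
	}
	p.run++
}

// finish closes the final run.
func (p *presenceRuns) finish() {
	p.runs = append(p.runs, p.run)
}

// expand decodes run lengths back into one boolean per row
// (true means a value is present).
func expand(runs []int64) []bool {
	var out []bool
	present := true
	for _, n := range runs {
		for i := int64(0); i < n; i++ {
			out = append(out, present)
		}
		present = !present
	}
	return out
}
```

Under this scheme a column that is set in every row costs a single integer, which is the usual motivation for run-length encoding a presence vector.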
type PrimitiveWriter ¶
type PrimitiveWriter struct {
// contains filtered or unexported fields
}
func NewPrimitiveWriter ¶
func NewPrimitiveWriter(spiller *Spiller) *PrimitiveWriter
func (*PrimitiveWriter) Flush ¶
func (p *PrimitiveWriter) Flush(eof bool) error
func (*PrimitiveWriter) MarshalZNG ¶
type RecordWriter ¶
type RecordWriter []*FieldWriter
func NewRecordWriter ¶
func NewRecordWriter(typ *zed.TypeRecord, spiller *Spiller) RecordWriter
func (RecordWriter) Flush ¶
func (r RecordWriter) Flush(eof bool) error
func (RecordWriter) MarshalZNG ¶
type UnionWriter ¶
type UnionWriter struct {
// contains filtered or unexported fields
}
func NewUnionWriter ¶
func NewUnionWriter(typ *zed.TypeUnion, spiller *Spiller) *UnionWriter
func (*UnionWriter) Flush ¶
func (u *UnionWriter) Flush(eof bool) error
func (*UnionWriter) MarshalZNG ¶
type Writer ¶
type Writer interface {
// Write encodes the given value into memory. When the column exceeds
// a threshold, it is automatically flushed. Flush may also be called
// explicitly to push columns to storage and thus avoid too much row skew
// between columns.
Write(zcode.Bytes) error
// Push all in-memory column data to the storage layer.
Flush(bool) error
// MarshalZNG is called after all data is flushed to build the reassembly
// record for this column.
MarshalZNG(*zed.Context, *zcode.Builder) (zed.Type, error)
}