table

package
v0.2.0 Latest
Published: May 27, 2026 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Package table holds Prism's columnar in-memory table type (`Table`), shared by every DAG node from Source through Encode. Columns are typed by a small `Kind` enum that buckets Pulse's 13 storage types down to five categories sufficient for spec validation and rendering.

D015 establishes that Table is materialised (not streaming); D016 establishes that storage is columnar with deferred bit-packing. D024 (queued for P02) records why `table/` is its own package rather than nested under `compile/` or `plan/`.

Constants

This section is empty.

Variables

This section is empty.

Functions

func SchemaFields

func SchemaFields(s *encoding.Schema) []encoding.Field

SchemaFields returns the schema's field slice for callers that want to walk fields without reaching through Schema().Fields. Kept as a tiny accessor to centralise the dereference and ease future schema refactors.

Types

type BoolColumn

type BoolColumn []bool

BoolColumn is the storage for KindBool columns.

func (BoolColumn) IsNull

func (BoolColumn) IsNull(int) bool

IsNull implements Column.

func (BoolColumn) Kind

func (BoolColumn) Kind() Kind

Kind implements Column.

func (BoolColumn) Len

func (c BoolColumn) Len() int

Len implements Column.

func (BoolColumn) NullCount

func (BoolColumn) NullCount() int

NullCount implements Column.

func (BoolColumn) ValueAt

func (c BoolColumn) ValueAt(i int) any

ValueAt implements Column.

type Column

type Column interface {
	// Kind reports the Prism Kind bucket this column belongs to.
	Kind() Kind
	// Len returns the row count.
	Len() int
	// ValueAt returns the i-th value as an interface{} (any). Returns
	// nil if the column carries an explicit null sentinel at i.
	ValueAt(i int) any
	// IsNull reports whether row i carries an explicit null marker.
	// Implementations that don't track nullability return false for
	// every i (the safe backward-compatible answer).
	IsNull(i int) bool
	// NullCount returns the number of nulls in the column. Plain
	// slice-backed columns return 0; nullable wrappers consult their
	// bitmap. -1 is reserved for future implementations that defer
	// the count until requested.
	NullCount() int
}

Column is the read interface for one materialised column. Concrete implementations are typed Go slices (no bit-packing in v1, per D016).
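As a rough illustration of how a caller might consume this interface, the sketch below redeclares a trimmed `Kind` and the `Column` interface locally (the package's import path is not shown here) and walks a column through the interface alone. `intColumn` and `sumInts` are hypothetical names for this example, not part of the package.

```go
package main

import "fmt"

// Kind is a local stand-in for the package's Kind enum (two values only).
type Kind int

const (
	KindUnknown Kind = iota
	KindInt
)

// Column mirrors the read interface documented above.
type Column interface {
	Kind() Kind
	Len() int
	ValueAt(i int) any
	IsNull(i int) bool
	NullCount() int
}

// intColumn is a hypothetical slice-backed column that tracks no nulls.
type intColumn []int64

func (intColumn) Kind() Kind          { return KindInt }
func (c intColumn) Len() int          { return len(c) }
func (c intColumn) ValueAt(i int) any { return c[i] }
func (intColumn) IsNull(int) bool     { return false }
func (intColumn) NullCount() int      { return 0 }

// sumInts reads a column only through the interface and skips null rows.
func sumInts(c Column) (int64, error) {
	if c.Kind() != KindInt {
		return 0, fmt.Errorf("want KindInt, got kind %d", c.Kind())
	}
	var total int64
	for i := 0; i < c.Len(); i++ {
		if c.IsNull(i) {
			continue
		}
		v, ok := c.ValueAt(i).(int64)
		if !ok {
			return 0, fmt.Errorf("row %d: unexpected type %T", i, c.ValueAt(i))
		}
		total += v
	}
	return total, nil
}

func main() {
	total, err := sumInts(intColumn{1, 2, 3})
	fmt.Println(total, err) // prints "6 <nil>"
}
```

Checking `Kind()` once up front keeps the per-row loop to a single type assertion.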

type DateColumn

type DateColumn []int64

DateColumn is the storage for KindDate columns. Values are stored as int64 "days since epoch" using Pulse's wire format; scales/encoders convert to time.Time at the render boundary.
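The conversion at the render boundary might look like the sketch below. It assumes the epoch is the Unix epoch (1970-01-01 UTC), which the text above does not state; `daysToTime` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// daysToTime converts a days-since-epoch value to a UTC time.Time.
// Assumption: "epoch" means the Unix epoch, 1970-01-01.
func daysToTime(days int64) time.Time {
	const secondsPerDay = 24 * 60 * 60
	return time.Unix(days*secondsPerDay, 0).UTC()
}

func main() {
	fmt.Println(daysToTime(0).Format("2006-01-02"))     // prints "1970-01-01"
	fmt.Println(daysToTime(19000).Format("2006-01-02")) // prints "2022-01-08"
}
```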

func (DateColumn) IsNull

func (DateColumn) IsNull(int) bool

IsNull implements Column.

func (DateColumn) Kind

func (DateColumn) Kind() Kind

Kind implements Column.

func (DateColumn) Len

func (c DateColumn) Len() int

Len implements Column.

func (DateColumn) NullCount

func (DateColumn) NullCount() int

NullCount implements Column.

func (DateColumn) ValueAt

func (c DateColumn) ValueAt(i int) any

ValueAt implements Column.

type FloatColumn

type FloatColumn []float64

FloatColumn is the storage for KindFloat columns.

func (FloatColumn) IsNull

func (FloatColumn) IsNull(int) bool

IsNull implements Column.

func (FloatColumn) Kind

func (FloatColumn) Kind() Kind

Kind implements Column.

func (FloatColumn) Len

func (c FloatColumn) Len() int

Len implements Column.

func (FloatColumn) NullCount

func (FloatColumn) NullCount() int

NullCount implements Column.

func (FloatColumn) ValueAt

func (c FloatColumn) ValueAt(i int) any

ValueAt implements Column.

type IntColumn

type IntColumn []int64

IntColumn is the storage for KindInt columns. int64 is wide enough to hold u64 values up to math.MaxInt64; values above that are truncated at decode time and a warning surfaces in Source telemetry.

func (IntColumn) IsNull

func (IntColumn) IsNull(int) bool

IsNull implements Column. Plain slice columns never report nulls; nullable variants live behind NullableColumn.

func (IntColumn) Kind

func (IntColumn) Kind() Kind

Kind implements Column.

func (IntColumn) Len

func (c IntColumn) Len() int

Len implements Column.

func (IntColumn) NullCount

func (IntColumn) NullCount() int

NullCount implements Column.

func (IntColumn) ValueAt

func (c IntColumn) ValueAt(i int) any

ValueAt implements Column.

type Kind

type Kind int

Kind buckets Pulse storage types into Prism's columnar categories. The mapping is intentionally lossy: every Pulse type folds into exactly one Kind, so downstream code (scales, encodings, format strings) reasons about one of five shapes instead of thirteen.

const (
	// KindUnknown is the zero value; a Column must never report it.
	KindUnknown Kind = iota
	// KindInt covers unsigned and bit-packed integer Pulse types.
	KindInt
	// KindFloat covers f32, f64, and decimal128 variants.
	KindFloat
	// KindString covers categorical types (rendered by dictionary).
	KindString
	// KindBool covers packed_bool. Null state lives in the per-record
	// null bitmap when Field.Nullable is set.
	KindBool
	// KindDate covers the date Pulse type (days-since-epoch).
	KindDate
)

func KindFromPulseFieldType

func KindFromPulseFieldType(ft encoding.FieldType) Kind

KindFromPulseFieldType folds the 13 Pulse FieldType variants into the five Prism Kinds. Decimal types route to KindFloat in v1 because we surface them as numeric scalars at the encoding layer; revisit when a dedicated decimal renderer lands. Nullability is orthogonal to Kind — callers consult Field.Nullable separately.
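The folding can be pictured as a single switch. The sketch below is illustrative only: the real function takes an `encoding.FieldType`, whose variants are not listed on this page, so it uses string tokens as stand-ins. Only type names mentioned elsewhere in this documentation appear, and treating `u64` as a token is an assumption. `kindFromToken` is a hypothetical name.

```go
package main

import "fmt"

// Kind is redeclared locally so the sketch is self-contained.
type Kind int

const (
	KindUnknown Kind = iota
	KindInt
	KindFloat
	KindString
	KindBool
	KindDate
)

// kindFromToken is a hypothetical stand-in for KindFromPulseFieldType.
// It covers only tokens named in this documentation, not all 13 variants.
func kindFromToken(tok string) Kind {
	switch tok {
	case "u64": // assumed token for an unsigned integer type
		return KindInt
	case "f32", "f64", "decimal128": // decimals surface as numeric scalars in v1
		return KindFloat
	case "categorical_u8":
		return KindString
	case "packed_bool":
		return KindBool
	case "date":
		return KindDate
	default:
		return KindUnknown
	}
}

func main() {
	fmt.Println(kindFromToken("decimal128") == KindFloat) // prints "true"
}
```

Nullability does not appear in the switch because, as noted above, it is tracked separately from Kind.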

func (Kind) String

func (k Kind) String() string

String returns the snake-case name (used in error context and tests).

type NullBitmap

type NullBitmap struct {
	// contains filtered or unexported fields
}

NullBitmap is a packed bit set tracking which positions in a column carry an explicit null marker. Bits are LSB-first within each word. A nil *NullBitmap is treated as "no nulls" — callers should compare against nil before touching the methods.

The bitmap is grown lazily by Set; explicit pre-allocation via NewNullBitmap is preferred when the upper bound is known so the hash-join writer doesn't churn allocations as it appends nulls one row at a time.
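A minimal sketch of such a bit set follows, assuming 64-bit words (the word size is not stated above). The lowercase `bitmap` type and its methods are hypothetical stand-ins for the unexported internals. Unlike the guidance above, these methods tolerate a nil receiver on reads, which is a choice made for this example only.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitmap is a hypothetical stand-in for NullBitmap: 64-bit words, LSB-first.
type bitmap struct {
	words []uint64
}

// newBitmap pre-sizes the backing slice for n positions.
func newBitmap(n int) *bitmap {
	return &bitmap{words: make([]uint64, (n+63)/64)}
}

// set marks position i (i must be >= 0), growing the slice on demand.
func (b *bitmap) set(i int) {
	w := i / 64
	for w >= len(b.words) {
		b.words = append(b.words, 0)
	}
	b.words[w] |= uint64(1) << uint(i%64)
}

// isNull reports whether position i is marked; a nil bitmap has no nulls.
func (b *bitmap) isNull(i int) bool {
	if b == nil || i < 0 || i/64 >= len(b.words) {
		return false
	}
	return b.words[i/64]&(uint64(1)<<uint(i%64)) != 0
}

// count returns the number of set bits via a per-word popcount.
func (b *bitmap) count() int {
	if b == nil {
		return 0
	}
	n := 0
	for _, w := range b.words {
		n += bits.OnesCount64(w)
	}
	return n
}

func main() {
	b := newBitmap(0)
	b.set(3)
	b.set(3)   // setting twice does not change the count
	b.set(130) // out of range: the backing slice grows to three words
	fmt.Println(b.count(), b.isNull(3), b.isNull(4)) // prints "2 true false"
}
```

Pre-sizing with `newBitmap(n)` avoids the repeated `append` calls in `set` when the row count is known up front.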

func NewNullBitmap

func NewNullBitmap(n int) *NullBitmap

NewNullBitmap returns a bitmap pre-sized for n positions. n may be zero; the slice grows on demand.

func (*NullBitmap) Capacity

func (b *NullBitmap) Capacity() int

Capacity returns the high-water mark of positions ever set or the pre-allocated size — whichever is larger.

func (*NullBitmap) Count

func (b *NullBitmap) Count() int

Count returns the number of bits set in the bitmap.

func (*NullBitmap) IsNull

func (b *NullBitmap) IsNull(i int) bool

IsNull reports whether position i carries a null marker.

func (*NullBitmap) Set

func (b *NullBitmap) Set(i int)

Set marks position i as null. Out-of-range positions extend the backing slice. Calling Set twice on the same position is a no-op for Count.

type NullableColumn

type NullableColumn struct {
	Inner Column
	Nulls *NullBitmap
}

NullableColumn wraps any Column with an optional null bitmap. When the bitmap is nil, the wrapper is transparent. When set, ValueAt returns nil for marked positions and IsNull / NullCount consult the bitmap.

The plan-stage hash join allocates a NullableColumn per output column that may receive unmatched rows; the resolver and inline loader wrap nullable Pulse fields the same way.

func (NullableColumn) IsNull

func (n NullableColumn) IsNull(i int) bool

IsNull implements Column.

func (NullableColumn) Kind

func (n NullableColumn) Kind() Kind

Kind implements Column by delegating to Inner.

func (NullableColumn) Len

func (n NullableColumn) Len() int

Len implements Column.

func (NullableColumn) NullCount

func (n NullableColumn) NullCount() int

NullCount implements Column.

func (NullableColumn) Unwrap

func (n NullableColumn) Unwrap() Column

Unwrap returns the underlying column. Callers that need to feed a slice-backed reader (e.g. scale extent computation) can read the inner column directly, then consult IsNull per row.
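The read pattern described here, a direct pass over the inner slice with a per-row null check, might look like the sketch below. `extent` is a hypothetical helper: `values` plays the role of the unwrapped FloatColumn and `isNull` plays the role of the wrapper's IsNull method.

```go
package main

import "fmt"

// extent returns the min and max of values, skipping rows flagged null.
// ok is false when there is no non-null row to report.
func extent(values []float64, isNull func(i int) bool) (lo, hi float64, ok bool) {
	for i, v := range values {
		if isNull(i) {
			continue
		}
		if !ok {
			lo, hi, ok = v, v, true
			continue
		}
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	return lo, hi, ok
}

func main() {
	nulls := map[int]bool{1: true} // row 1 is null, so its -9 is ignored
	lo, hi, ok := extent([]float64{3, -9, -1, 8}, func(i int) bool { return nulls[i] })
	fmt.Println(lo, hi, ok) // prints "-1 8 true"
}
```

Reading the typed slice directly avoids boxing every value into `any`, which is the likely motivation for exposing Unwrap.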

func (NullableColumn) ValueAt

func (n NullableColumn) ValueAt(i int) any

ValueAt returns nil for null rows; otherwise delegates to Inner.

type StringColumn

type StringColumn []string

StringColumn is the storage for KindString columns (categorical values decoded against their Pulse dictionary at materialisation time).

func (StringColumn) IsNull

func (StringColumn) IsNull(int) bool

IsNull implements Column.

func (StringColumn) Kind

func (StringColumn) Kind() Kind

Kind implements Column.

func (StringColumn) Len

func (c StringColumn) Len() int

Len implements Column.

func (StringColumn) NullCount

func (StringColumn) NullCount() int

NullCount implements Column.

func (StringColumn) ValueAt

func (c StringColumn) ValueAt(i int) any

ValueAt implements Column.

type Table

type Table struct {
	// contains filtered or unexported fields
}

Table is the columnar in-memory result of one DAG node. Tables are immutable once constructed (callers receive aliased columns; never mutate them in place). Hash is computed by the producer (Resolver, SourceNode, inline converter) and propagated as the cache key.

func Filter

func Filter(src *Table, keep []bool, partitionTag string) (*Table, error)

Filter returns a new Table holding only the rows for which keep[i] is true. The schema is preserved verbatim (same field order, same types); the row count is the number of true entries in keep. The returned table's hash is derived from the source table's hash plus a suffix carrying the partition tag, so cached tables remain content-addressed.

Filter is the partition primitive used by the encoder's facet fan-out (D054). It is intentionally narrow — no transform semantics, no column projection — so the hot path stays a single columnar walk.

partitionTag is an opaque string the caller uses to differentiate multiple sibling filters of the same parent. The encoder's facet path passes "facet:<rowVal>:<colVal>" so two distinct partitions of the same parent produce different hashes.
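The per-column work behind a keep-mask filter can be sketched with a generic helper. `filterSlice` is a hypothetical name, and rejecting a mask whose length differs from the row count is an assumption about error handling that the text above does not confirm. Hash derivation is left out because its exact form is not documented here.

```go
package main

import (
	"errors"
	"fmt"
)

// filterSlice copies the entries of col whose keep flag is true.
// The source slice is never modified, matching Table immutability.
func filterSlice[T any](col []T, keep []bool) ([]T, error) {
	if len(col) != len(keep) {
		return nil, errors.New("keep mask length must equal row count")
	}
	kept := 0
	for _, k := range keep {
		if k {
			kept++
		}
	}
	out := make([]T, 0, kept) // exact capacity, so append never reallocates
	for i, k := range keep {
		if k {
			out = append(out, col[i])
		}
	}
	return out, nil
}

func main() {
	keep := []bool{true, false, true}
	regions, _ := filterSlice([]string{"north", "south", "east"}, keep)
	sales, _ := filterSlice([]float64{1.5, 2.5, 3.5}, keep)
	fmt.Println(regions, sales) // prints "[north east] [1.5 3.5]"
}
```

Applying the same mask to every column keeps rows aligned across the filtered table.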

func FromInline

func FromInline(name string, values []map[string]any, fields []spec.FieldSpec) (*Table, *encoding.Schema, error)

FromInline turns inline `data.values` rows (and optional `data.fields` declarations) into a *Table backed by a synthetic *encoding.Schema.

Type resolution:

  • If fields is non-empty, every declared field is honoured verbatim; unknown type tokens fall back to KindString (categorical_u8).
  • Otherwise the first row's JSON kinds drive inference:
      • string → categorical_u8 / KindString
      • float64 / json.Number → f64 / KindFloat
      • bool → packed_bool / KindBool
      • other (nil, nested arrays/maps) → categorical_u8 / KindString, so downstream rules see a usable measure type.

Subsequent rows are validated against the resolved schema; a row whose JSON kind for a given field disagrees with the schema returns PRISM_RESOLVE_INLINE_TYPE_MISMATCH with row index and field name.
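The inference and validation steps above can be sketched as follows. Kind names are plain strings here, `inferKind` and `checkRows` are hypothetical helpers, and the error text is illustrative rather than the real PRISM_RESOLVE_INLINE_TYPE_MISMATCH payload. Treating a key that is missing from a later row like a nil value is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inferKind maps one decoded JSON value to a kind name, following the
// inference rules listed above.
func inferKind(v any) string {
	switch v.(type) {
	case string:
		return "string"
	case float64, json.Number:
		return "float"
	case bool:
		return "bool"
	default: // nil, nested arrays/maps
		return "string"
	}
}

// checkRows infers a kind per field from the first row, then reports a
// later row whose value kind disagrees with that schema.
func checkRows(rows []map[string]any) (map[string]string, error) {
	schema := map[string]string{}
	if len(rows) == 0 {
		return schema, nil
	}
	for name, v := range rows[0] {
		schema[name] = inferKind(v)
	}
	for i, row := range rows[1:] {
		for name, want := range schema {
			// Assumption: a missing key is treated like a nil value.
			if got := inferKind(row[name]); got != want {
				return nil, fmt.Errorf("row %d, field %q: got %s, want %s", i+1, name, got, want)
			}
		}
	}
	return schema, nil
}

func main() {
	rows := []map[string]any{
		{"x": 1.0, "label": "a"},
		{"x": "oops", "label": "b"},
	}
	_, err := checkRows(rows)
	fmt.Println(err)
}
```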

Hash is xxhash64 over a canonical JSON encoding of values (rows sorted by key per row, fields written in schema declaration order). Identical inputs map to identical hashes regardless of map iteration order.
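The order-independence property can be demonstrated with the standard library alone. The sketch below swaps in FNV-1a from `hash/fnv` because xxhash64 is not in the standard library, and it relies on `encoding/json` writing map keys in sorted order rather than in schema declaration order. It therefore shows the idea and will not reproduce the package's actual hash values. `contentHash` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// contentHash hashes rows through a canonical JSON encoding.
// json.Marshal sorts map keys, so the bytes fed to the hash do not
// depend on Go's randomised map iteration order.
func contentHash(rows []map[string]any) (string, error) {
	h := fnv.New64a() // stand-in for xxhash64
	for _, row := range rows {
		b, err := json.Marshal(row)
		if err != nil {
			return "", err
		}
		h.Write(b)
		h.Write([]byte{'\n'}) // row separator
	}
	return fmt.Sprintf("%016x", h.Sum64()), nil
}

func main() {
	a := []map[string]any{{"x": 1.0, "label": "a"}}
	b := []map[string]any{{"label": "a", "x": 1.0}}
	ha, _ := contentHash(a)
	hb, _ := contentHash(b)
	fmt.Println(ha == hb) // prints "true"
}
```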

func NewTable

func NewTable(schema *encoding.Schema, columns map[string]Column, rowCount int, hash string) (*Table, error)

NewTable builds and validates a Table.

Validation:

  • schema must be non-nil and have at least one field.
  • columns must contain exactly one entry per schema field (no extras, no missing).
  • every column's Len() must equal rowCount.
  • every column's Kind() must match KindFromPulseFieldType for its schema field's Type.
  • rowCount must be in [0, limits.TableMaxRows()]. Exceeding the cap returns PRISM_RESOLVE_007.

hash is propagated verbatim; the producer owns hashing strategy.
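The validation rules listed above can be sketched over simplified stand-ins. `field` and `column` are hypothetical reductions of the real schema field and Column types, `maxRows` stands in for `limits.TableMaxRows()`, and the error messages are illustrative (the real constructor returns codes such as PRISM_RESOLVE_007).

```go
package main

import (
	"errors"
	"fmt"
)

// field and column are hypothetical reductions of the real types.
type field struct{ name, kind string }

type column struct {
	kind string
	rows int
}

// validate applies the documented checks in order. Field names are
// assumed unique, so equal counts plus full coverage imply no extras.
func validate(fields []field, columns map[string]column, rowCount, maxRows int) error {
	if len(fields) == 0 {
		return errors.New("schema must have at least one field")
	}
	if rowCount < 0 || rowCount > maxRows {
		return fmt.Errorf("rowCount %d outside [0, %d]", rowCount, maxRows)
	}
	if len(columns) != len(fields) {
		return fmt.Errorf("got %d columns for %d fields", len(columns), len(fields))
	}
	for _, f := range fields {
		c, ok := columns[f.name]
		switch {
		case !ok:
			return fmt.Errorf("field %q has no column", f.name)
		case c.rows != rowCount:
			return fmt.Errorf("column %q has %d rows, want %d", f.name, c.rows, rowCount)
		case c.kind != f.kind:
			return fmt.Errorf("column %q is %s, schema says %s", f.name, c.kind, f.kind)
		}
	}
	return nil
}

func main() {
	fields := []field{{"x", "float"}, {"label", "string"}}
	cols := map[string]column{"x": {"float", 2}, "label": {"string", 2}}
	fmt.Println(validate(fields, cols, 2, 1000)) // prints "<nil>"

	cols["label"] = column{"string", 3} // length no longer matches rowCount
	fmt.Println(validate(fields, cols, 2, 1000))
}
```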

func (*Table) Column

func (t *Table) Column(name string) (Column, bool)

Column looks up a column by field name.

func (*Table) Columns

func (t *Table) Columns() []Column

Columns returns the columns in schema declaration order.

func (*Table) FieldNames

func (t *Table) FieldNames() []string

FieldNames returns the field names in schema declaration order.

func (*Table) Hash

func (t *Table) Hash() string

Hash returns the producer-supplied content hash. Cache keys combine this with the parent node's fingerprint per design/05-dag-executor.md.

func (*Table) NumRows

func (t *Table) NumRows() int

NumRows returns the row count.

func (*Table) Schema

func (t *Table) Schema() *encoding.Schema

Schema returns the table's Pulse schema.
