pbpath

v0.3.0 · Published: Apr 4, 2026 · License: Apache-2.0

Repository

github.com/loicalleyne/bufarrowlib

pbpath lets you address any value inside a protobuf message using a compact, human-readable string path. Given a .proto schema and a message descriptor you can parse a path string, then traverse a live message to extract the values at that location — including fan-out across repeated fields via wildcards, ranges, and Python-style slices.

For hot paths that evaluate multiple paths against many messages of the same type, the Plan API compiles paths into a trie-based execution plan that traverses shared prefixes only once.

For jq-style interactive exploration, the Pipeline API provides a composable expression language with pipes (|), a comma operator (,), select() filtering, arithmetic operators (+, -, *, /, %), variable bindings (as $name), control flow (if/then/elif/else/end, try/catch, and the // alternative operator), iteration primitives (reduce, foreach, label/break), and 70+ built-in functions covering strings, collections, math, regex, and serialization.

For a deep dive into the internal object model, trie structure, expression system, and performance recommendations, see the Architecture Guide.

Core API

// Parse a path string against a message descriptor.
path, err := pbpath.ParsePath(md, "device.geo.country")

// Walk the path through a concrete message to collect values at every step.
// PathValues returns a slice — one Values per matching branch when the path
// contains wildcards, ranges, or slices.
results, err := pbpath.PathValues(path, msg)

// For a scalar path (no fan-out) there is exactly one result.
last := results[0].Index(-1)
fmt.Println(last.Value.String()) // e.g. "US"

PathValues

func PathValues(p Path, m proto.Message, opts ...PathOption) ([]Values, error)

PathValues walks the given Path through message m and returns every matching Values branch. When the path contains fan-out steps — ListWildcardStep, ListRangeStep, or a ListIndexStep with a negative index — the function produces one Values per matching element. Nested fan-out steps produce the cartesian product of all matches.

Each Values carries:

Path — the concrete path taken (wildcards/ranges are replaced with the actual ListIndex of each element).
Values — the protoreflect.Value at every step along that path.

Options

Option	Effect
`Strict()`	Return an error when a negative index or range bound resolves out-of-bounds. Without it, out-of-range accesses are silently clamped or the branch is skipped.

Values helpers

// Index returns the (step, value) pair at position i (supports negative indices).
pair := vals.Index(-1)   // last step+value
pair.Step                // the Step
pair.Value               // the protoreflect.Value

// ListIndices returns the concrete list indices visited along the path.
indices := vals.ListIndices() // e.g. [0, 2] for repeats[0].inner[2]

// String returns a human-readable "path = value" representation.
fmt.Println(vals.String())
// (pkg.Msg).items[0].name = "widget"

Multi-Path Plan API

When you need to extract several values from every message in a stream, the Plan API avoids redundant work by sharing traversal across paths with common prefixes.

Quick Start

// Compile once.
plan, err := pbpath.NewPlan(md,
    pbpath.PlanPath("device.geo.country",       pbpath.Alias("country")),
    pbpath.PlanPath("device.geo.city",           pbpath.Alias("city")),
    pbpath.PlanPath("imp[*].id",                 pbpath.Alias("imp_id")),
    pbpath.PlanPath("imp[*].pmp.deals[*].id",    pbpath.Alias("deal_id")),
    pbpath.PlanPath("imp[0:3].banner.w",         pbpath.StrictPath()),
)
if err != nil {
    log.Fatal(err)
}

// Evaluate many times.
for _, msg := range messages {
    results, err := plan.Eval(msg)
    if err != nil {
        log.Fatal(err)
    }
    // results[0] = country  (1 branch)
    // results[1] = city     (1 branch)
    // results[2] = imp_id   (N branches, one per impression)
    // results[3] = deal_id  (N×M branches, impressions × deals)
    // results[4] = banner.w (up to 3 branches, strict)
}

plan.Eval returns [][]Values — one []Values slot per path, in the order the paths were provided to NewPlan. Each slot may contain multiple Values branches when the path fans out via wildcards, ranges, or slices.

NewPlan

func NewPlan(md protoreflect.MessageDescriptor, paths ...PlanPathSpec) (*Plan, error)

Compiles path strings against md, builds a trie of shared prefixes, and returns an immutable Plan. All parse errors are bundled into a single returned error. The Plan is safe for concurrent use by multiple goroutines.

PlanPath and PlanOption

func PlanPath(path string, opts ...PlanOption) PlanPathSpec

Pairs a raw path string with per-path options. Available options:

Option	Effect
`Alias(name)`	Give the entry a human-readable name (returned by `Plan.Entries`). Defaults to the raw path string.
`StrictPath()`	Return an error from `Eval` if any range or index on this path was clamped due to the list being shorter than the bound.

Plan.Eval

func (p *Plan) Eval(m proto.Message) ([][]Values, error)

Traverses m along all compiled paths simultaneously. Paths sharing a prefix are walked once through the shared segment, then forked. Returns [][]Values indexed by entry position.

The Eval method always traverses leniently — out-of-bounds indices and range bounds are clamped or skipped rather than returning errors. If a path was compiled with StrictPath(), the clamped flag is checked at the leaf and an error is returned only for that path.

Plan.Entries

func (p *Plan) Entries() []PlanEntry

Returns metadata for each compiled path:

type PlanEntry struct {
    Name string   // alias or raw path string
    Path Path     // the compiled Path
}

Useful for mapping result slots to output column names.

for i, e := range plan.Entries() {
    fmt.Printf("slot %d: name=%s  path=%s\n", i, e.Name, e.Path)
}

PathValuesMulti (Convenience)

func PathValuesMulti(
    md protoreflect.MessageDescriptor,
    m proto.Message,
    paths ...PlanPathSpec,
) ([][]Values, error)

One-shot wrapper that compiles a Plan and immediately evaluates it. Handy for tests and one-off extractions. For repeated evaluation of the same paths against many messages, prefer NewPlan + Plan.Eval.

results, err := pbpath.PathValuesMulti(md, msg,
    pbpath.PlanPath("nested.stringfield", pbpath.Alias("greeting")),
    pbpath.PlanPath("repeats[*].nested.stringfield"),
)

Trie-Based Shared-Prefix Optimization

Paths are inserted into a trie keyed by step equality. Two steps are merged when they have the same kind and kind-specific parameters:

Step kind	Equality criterion
`FieldAccessStep`	Same field number
`ListIndexStep`	Same index (including negatives)
`MapIndexStep`	Same key value
`ListWildcardStep`	Always equal
`ListRangeStep`	Same start, end, step, and omitted flags
`AnyExpandStep`	Same message full name
`FilterStep`	Never merged (each filter has its own predicate)
`MapWildcardStep`	Always equal

Different step types on the same field (e.g. imp[*] vs imp[0:3]) fork into separate trie branches — they produce independent fan-out groups. FilterSteps are never merged even if their predicates look identical — each filter produces its own trie branch.
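
The shared-prefix idea can be illustrated with a small stand-alone trie. This is only a sketch, not pbpath's implementation: the node type and insert method are hypothetical, and steps are keyed by their textual form rather than by field number or range parameters as in the table above.

```go
package main

import "fmt"

// node is one trie node; children are keyed by a step's textual form.
// (Assumption for illustration: pbpath keys by step kind and parameters.)
type node struct {
	children map[string]*node
}

// insert adds a path (a sequence of steps) to the trie and returns how many
// new nodes had to be created. Steps that are already present are shared.
func (n *node) insert(steps []string) int {
	created := 0
	cur := n
	for _, s := range steps {
		if cur.children == nil {
			cur.children = map[string]*node{}
		}
		next, ok := cur.children[s]
		if !ok {
			next = &node{}
			cur.children[s] = next
			created++
		}
		cur = next
	}
	return created
}

func main() {
	root := &node{}
	paths := [][]string{
		{"device", "geo", "country"},
		{"device", "geo", "city"},
		{"imp", "[*]", "id"},
		{"imp", "[*]", "pmp", "deals", "[*]", "id"},
		{"imp", "[0:3]", "banner", "w"},
	}
	for _, p := range paths {
		fmt.Println(p, "new nodes:", root.insert(p))
	}
}
```

In this sketch the second path adds a single node because device.geo is already present, while imp[0:3] forks away from imp[*] directly below imp.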

Path String Syntax

A path string is a dot-separated chain of field names, optionally prefixed by an explicit root and suffixed with index, map-key, wildcard, range, or slice accessors.

Grammar

path         = [ root ] { accessor }
root         = "(" full_message_name ")"
accessor     = field_access | index_access | filter_access
field_access = "." field_name
index_access = "[" key "]"
key          = integer | string_literal | "true" | "false"
             | "*"                           ← wildcard (list or map)
             | [ start ] ":" [ end ]         ← range
             | [ start ] ":" [ end ] ":" [ step ]  ← slice
filter_access = "[?(" predicate ")]"
predicate    = or_expr
or_expr      = and_expr { "||" and_expr }
and_expr     = unary_expr { "&&" unary_expr }
unary_expr   = "!" unary_expr | primary
primary      = comparison | truthy_check | "(" or_expr ")"
comparison   = atom comparator atom
truthy_check = atom
atom         = relative_path | string_literal | integer | float | "true" | "false"
relative_path = "." field_name { "." field_name }
comparator   = "==" | "!=" | "<" | "<=" | ">" | ">="

Obtaining the Message Descriptor

ParsePath requires a protoreflect.MessageDescriptor. You can get one from:

Generated code – (*pb.MyMessage)(nil).ProtoReflect().Descriptor()
Dynamic descriptors – via protodesc.NewFile from a FileDescriptorProto
protocompile / buf – parse .proto files at runtime

Protobuf Editions

pbpath operates entirely on protoreflect descriptors, so it works with proto2, proto3, and Protobuf Editions (Edition 2023+) without any changes. Editions features are resolved by the protobuf runtime before pbpath sees the descriptors.

Identifying the Root Message

Every path implicitly starts at a root message – the outermost message type whose descriptor you pass to ParsePath.

Given this .proto file:

// file: BidRequest.proto
package bidrequest;

message BidRequestEvent {
  string id = 1;
  DeviceEvent device = 3;
  repeated ImpressionEvent imp = 4;
  // ...
}

The root message is BidRequestEvent. You may write the root explicitly or omit it:

Path string	Equivalent explicit form
`id`	`(bidrequest.BidRequestEvent).id`
`device.ip`	`(bidrequest.BidRequestEvent).device.ip`

The explicit (package.MessageName) form is optional; it is useful mainly when you want paths to be unambiguous in documentation or tooling.

Field Access

Use the text name of the field (the snake_case name from the .proto file), separated by dots:

field_name
parent.child
parent.child.grandchild

Field names correspond exactly to the names in your .proto definition. For example, given:

message DeviceEvent {
  string ip = 3;
  GeoEvent geo = 4;

  message GeoEvent {
    string country = 4;
    string city = 6;
  }
}

Goal	Path
Device IP	`device.ip`
Geo country	`device.geo.country`
Geo city	`device.geo.city`

Repeated Field (List) Indexing

Append [index] after a repeated field name, where index is an integer. Indices are zero-based. Negative indices are allowed and are resolved at traversal time relative to the list length: -1 is the last element, -2 is the second-to-last, and so on.

repeated_field[0]     // first element
repeated_field[-1]    // last element
repeated_field[-2]    // second-to-last element
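
As a rough illustration of how a negative index maps onto a concrete position, here is a minimal sketch. The resolveIndex helper is hypothetical and not part of pbpath. Treating every out-of-range index as "skip" is an assumption made for the example.

```go
package main

import "fmt"

// resolveIndex maps a possibly negative list index onto a list of length n.
// It returns ok=false when the index falls outside [0, n), modelling a
// lenient "skip this branch" outcome.
// Hypothetical helper for illustration; not part of pbpath.
func resolveIndex(i, n int) (idx int, ok bool) {
	if i < 0 {
		i += n // -1 becomes n-1, -2 becomes n-2, and so on
	}
	if i < 0 || i >= n {
		return 0, false
	}
	return i, true
}

func main() {
	for _, i := range []int{0, -1, -2, -5} {
		idx, ok := resolveIndex(i, 4)
		fmt.Printf("index %d on a 4-element list: idx=%d ok=%v\n", i, idx, ok)
	}
}
```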

Index literals may be decimal, octal (0-prefixed), or hex (0x-prefixed):

Literal	Decimal value
`0`	0
`12`	12
`0x1F`	31
`010`	8 (octal)
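
The literal forms in the table happen to match what Go's strconv.ParseInt accepts when asked to infer the base (base 0). The sketch below uses that as a stand-in to check the values; parseIndexLiteral is a hypothetical helper, and pbpath's own lexer may differ in details.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIndexLiteral parses an integer literal with base inference:
// a "0x" prefix selects hex, a leading "0" selects octal, and anything
// else is decimal. Hypothetical helper; not pbpath's lexer.
func parseIndexLiteral(lit string) (int64, error) {
	return strconv.ParseInt(lit, 0, 64)
}

func main() {
	for _, lit := range []string{"0", "12", "0x1F", "010", "-6"} {
		v, err := parseIndexLiteral(lit)
		fmt.Printf("%-5s -> %d (err=%v)\n", lit, v, err)
	}
}
```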

Map Field Indexing

Append [key] after a map field, where the key literal matches the map's key type:

Map key type	Syntax	Example
`string`	`["value"]` or `['value']`	`strkeymap["hello"]`
`bool`	`true` / `false`	`boolkeymap[true]`
`int32` / `int64`	signed integer	`int32keymap[-6]`
`uint32` / `uint64`	unsigned integer	`uint64keymap[0xffffffffffffffff]`

String keys support the same escape sequences as protobuf text format (\n, \t, \", \\, hex \xHH, unicode \uHHHH / \UHHHHHHHH, and octal \ooo).
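
For double-quoted keys, the listed escapes overlap with Go's own string-literal escapes, so strconv.Unquote gives a quick way to see what a key decodes to. This is an approximation only: decodeKey is a hypothetical helper, and pbpath's parser may accept forms Go does not (for example single-quoted keys).

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeKey interprets a double-quoted key literal using Go's string-literal
// escape rules, which cover \n, \t, \", \\, \xHH, \uHHHH, \UHHHHHHHH and
// three-digit octal \ooo. Hypothetical helper; not pbpath's parser.
func decodeKey(quoted string) (string, error) {
	return strconv.Unquote(quoted)
}

func main() {
	for _, lit := range []string{`"hello"`, `"tab\there"`, `"\x41\102\u0043"`} {
		key, err := decodeKey(lit)
		fmt.Printf("%s -> %q (err=%v)\n", lit, key, err)
	}
}
```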

After indexing a map you can continue traversing into the value type:

strkeymap["mykey"].stringfield

Wildcards, Ranges, and Slices

These step types cause PathValues to fan out, producing one result per matching element.

Wildcard — `[*]` or `[:]` or `[::]`

On repeated (list) fields, selects every element:

repeats[*]              // all elements
repeats[:]              // same (normalizes to [*])
repeats[::]             // same (normalizes to [*])

On map fields, selects all values in the map (order is non-deterministic):

strkeymap[*]            // all values in the map
strkeymap[*].stringfield // a field from every map value

Ranges and slices ([0:3], [::-1], etc.) are only supported on repeated fields. Using a range/slice on a map field is a parse error.

Range — `[start:end]`

Selects a half-open range of elements [start, end) with stride 1. Both start and end may be negative:

repeats[0:3]            // elements 0, 1, 2
repeats[1:3]            // elements 1, 2
repeats[-3:-1]          // 3rd-to-last through 2nd-to-last
repeats[2:]             // element 2 through the end
repeats[:2]             // elements 0, 1

Slice — `[start:end:step]` (Python semantics)

Full Python-style slice with an explicit stride/step. Any of start, end, and step may be omitted — omitted bounds default based on the step sign:

When step > 0	Default start	Default end
omitted start	`0` (beginning)	—
omitted end	—	`len` (past the end)

When step < 0	Default start	Default end
omitted start	`len - 1` (last)	—
omitted end	—	before index 0

repeats[::2]            // every other element: 0, 2, 4, …
repeats[1::2]           // odd-indexed: 1, 3, 5, …
repeats[::-1]           // all elements in reverse
repeats[3:0:-1]         // elements 3, 2, 1 (reverse, half-open)
repeats[-1::-1]         // reverse from last element
repeats[0:10:3]         // elements 0, 3, 6, 9
repeats[:3:2]           // elements 0, 2

A step of 0 is always a parse error ([::0]).
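
The default and clamping rules above follow Python's slice semantics, which can be sketched in a few lines. The sliceIndices helper below is hypothetical and only illustrates the rules; it is not pbpath's implementation. A nil bound stands for an omitted start or end.

```go
package main

import "fmt"

// sliceIndices returns the list indices selected by [start:end:step] on a
// list of length n, following Python's slice rules. A nil start or end means
// the bound was omitted. step must be non-zero.
// Hypothetical helper for illustration; not pbpath's implementation.
func sliceIndices(start, end *int, step, n int) []int {
	if step == 0 {
		panic("slice step cannot be zero")
	}
	// lo and hi are the smallest and largest values a resolved bound may take.
	lo, hi := 0, n
	if step < 0 {
		lo, hi = -1, n-1
	}
	resolve := func(p *int, def int) int {
		if p == nil {
			return def // omitted bound: use the sign-dependent default
		}
		v := *p
		if v < 0 {
			v += n // negative bounds count from the end
		}
		if v < lo {
			return lo
		}
		if v > hi {
			return hi
		}
		return v
	}
	var s, e int
	if step > 0 {
		s, e = resolve(start, 0), resolve(end, n)
	} else {
		s, e = resolve(start, n-1), resolve(end, -1)
	}
	var out []int
	for i := s; (step > 0 && i < e) || (step < 0 && i > e); i += step {
		out = append(out, i)
	}
	return out
}

// ptr returns a pointer to v, marking a bound as explicitly given.
func ptr(v int) *int { return &v }

func main() {
	fmt.Println(sliceIndices(nil, nil, 2, 5))        // [::2]
	fmt.Println(sliceIndices(nil, nil, -1, 3))       // [::-1]
	fmt.Println(sliceIndices(ptr(3), ptr(0), -1, 5)) // [3:0:-1]
	fmt.Println(sliceIndices(ptr(-3), ptr(-1), 1, 5)) // [-3:-1]
}
```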

Chaining Steps

Steps can be freely chained. After a field access you can index, after an index you can access a field, and so on:

field.subfield[0].deeper_field["key"].leaf
repeats[*].nested.stringfield
repeats[::2].nested.int32repeats[0]

Mid-Traversal Filtering — `[?(...)]`

Filter predicates let you select only the elements that match a condition, similar to jq's select(). Filters can be applied to repeated message fields or single message fields.

Syntax

field[?(.subfield == "value")]          // equality
field[?(.price > 100)]                  // numeric comparison
field[?(.active)]                       // truthy check
field[?(!.hidden)]                      // negation
field[?(.active && .price > 0)]         // AND
field[?(.type == "a" || .type == "b")]  // OR
field[?(.inner.flag)]                   // nested path
field[?((.x || .y) && .z)]             // grouping with parens

On repeated fields

When applied to a repeated message field, the filter iterates all elements (like [*]) and keeps only those where the predicate is truthy:

// Select active items with price > 100
plan, _ := pbpath.NewPlan(md,
    pbpath.PlanPath("items[?(.active && .price > 100)].name"),
)

This is equivalent to jq's select(.active and .price > 100).

On single message fields

When applied to a non-repeated message field, the filter acts as a gate: if the predicate is truthy the traversal continues; otherwise the branch is dropped (producing no results).
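
The difference between the two cases can be sketched with plain Go values. The item type and the filterList and gate helpers are hypothetical stand-ins for protobuf messages and pbpath's filter step, used only to show the two behaviours side by side.

```go
package main

import "fmt"

// item stands in for a protobuf message with a few scalar fields.
type item struct {
	name   string
	active bool
	price  int
}

// filterList models a filter on a repeated field: visit every element and
// keep those for which pred is true.
func filterList(items []item, pred func(item) bool) []item {
	var out []item
	for _, it := range items {
		if pred(it) {
			out = append(out, it)
		}
	}
	return out
}

// gate models a filter on a single message field: the branch either
// continues with the message or is dropped entirely.
func gate(it item, pred func(item) bool) (item, bool) {
	if pred(it) {
		return it, true
	}
	return item{}, false
}

func main() {
	// Corresponds to the predicate .active && .price > 100.
	pred := func(it item) bool { return it.active && it.price > 100 }
	items := []item{{"a", true, 150}, {"b", false, 500}, {"c", true, 50}}

	fmt.Println(filterList(items, pred)) // only "a" survives
	_, ok := gate(items[1], pred)
	fmt.Println(ok) // the branch for "b" is dropped
}
```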

Predicate atoms

Atom	Example	Description
Relative path	`.field`, `.inner.flag`	Field value on the current element
String literal	`"hello"`, `'world'`	String constant
Integer literal	`42`, `-1`, `0xFF`	Integer constant
Float literal	`3.14`, `-0.5`	Float constant
Boolean literal	`true`, `false`	Boolean constant

Comparators

Operator	Meaning
`==`	Equal
`!=`	Not equal
`<`	Less than
`<=`	Less than or equal
`>`	Greater than
`>=`	Greater than or equal

Programmatic API

Filters can also be built programmatically:

predicate := pbpath.FuncEq(
    pbpath.FilterPathRef(".status", statusFD),
    pbpath.Literal(pbpath.ScalarString("active"), 0),
)
p := pbpath.Path{
    pbpath.Root(md),
    pbpath.FieldAccess(itemsFD),
    pbpath.ListWildcard(),
    pbpath.Filter(predicate),
    pbpath.FieldAccess(nameFD),
}

Fan-Out and Nested Fan-Outs

When a path contains one or more wildcard, range, or slice steps, PathValues fans out and returns multiple Values — one per matching list element. When multiple fan-out steps appear in a single path the result is the cartesian product of all expansions.

Single Fan-Out

Given a message with repeats = [A, B, C]:

path, _ := pbpath.ParsePath(md, "repeats[*].nested.stringfield")
results, _ := pbpath.PathValues(path, msg)

// results has 3 entries:
// [0] → (pkg.Test).repeats[0].nested.stringfield = "alpha"
// [1] → (pkg.Test).repeats[1].nested.stringfield = "beta"
// [2] → (pkg.Test).repeats[2].nested.stringfield = "gamma"

Each result's Path contains the concrete ListIndex (not the wildcard), so you always know exactly which element produced the value.

Nested Fan-Out (Cartesian Product)

Consider a schema where a repeated field contains another repeated field:

message Outer {
  repeated Middle items = 1;
}
message Middle {
  repeated Inner sub = 1;
}
message Inner {
  string value = 1;
}

With items containing two Middle messages, each with three Inner messages:

path, _ := pbpath.ParsePath(md, "items[*].sub[*].value")
results, _ := pbpath.PathValues(path, msg)

This produces 2 × 3 = 6 results — every combination of outer and inner index:

items[0].sub[0].value = "a"
items[0].sub[1].value = "b"
items[0].sub[2].value = "c"
items[1].sub[0].value = "d"
items[1].sub[1].value = "e"
items[1].sub[2].value = "f"

You can use Values.ListIndices() to recover the indices for each level:

for _, r := range results {
    indices := r.ListIndices()
    // indices[0] = items index, indices[1] = sub index
    fmt.Printf("items[%d].sub[%d] = %s\n",
        indices[0], indices[1], r.Index(-1).Value.String())
}
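
To make the cartesian expansion concrete without a protobuf schema, the following sketch performs the same two-level fan-out over a nested Go slice. The branch type and fanOut function are hypothetical and only mirror the shape of the results described above.

```go
package main

import "fmt"

// branch records the concrete list indices taken through two nested
// repeated fields, plus the leaf value reached.
type branch struct {
	indices []int
	value   string
}

// fanOut expands items[*].sub[*].value over a nested slice, emitting one
// branch per (outer, inner) index pair in traversal order.
func fanOut(items [][]string) []branch {
	var out []branch
	for i, subs := range items {
		for j, v := range subs {
			out = append(out, branch{indices: []int{i, j}, value: v})
		}
	}
	return out
}

func main() {
	items := [][]string{{"a", "b", "c"}, {"d", "e", "f"}}
	for _, b := range fanOut(items) {
		fmt.Printf("items[%d].sub[%d].value = %q\n", b.indices[0], b.indices[1], b.value)
	}
}
```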

Mixed Fan-Out: Range × Wildcard

Fan-out steps don't all have to be the same kind. You can mix ranges, slices, and wildcards:

// First two items, all their sub-items in reverse
path, _ := pbpath.ParsePath(md, "items[0:2].sub[::-1].value")
results, _ := pbpath.PathValues(path, msg)

If items[0] has 3 subs and items[1] has 2 subs, this produces 5 results:

items[0].sub[2].value   ← reversed
items[0].sub[1].value
items[0].sub[0].value
items[1].sub[1].value   ← reversed
items[1].sub[0].value

Step Constructors (Programmatic API)

Paths can also be built programmatically instead of parsing a string:

p := pbpath.Path{
    pbpath.Root(md),
    pbpath.FieldAccess(repeatsFD),
    pbpath.ListWildcard(),
    pbpath.FieldAccess(nestedFD),
    pbpath.FieldAccess(stringfieldFD),
}
results, err := pbpath.PathValues(p, msg)

Available Constructors

Constructor	Produces	Path Syntax
`Root(md)`	`RootStep`	`(full.Name)`
`FieldAccess(fd)`	`FieldAccessStep`	`.field`
`ListIndex(i)`	`ListIndexStep`	`[i]` (negative OK)
`MapIndex(k)`	`MapIndexStep`	`[key]`
`AnyExpand(md)`	`AnyExpandStep`	`.(full.Name)`
`ListWildcard()`	`ListWildcardStep`	`[*]`
`ListRange(start, end)`	`ListRangeStep`	`[start:end]`
`ListRangeFrom(start)`	`ListRangeStep`	`[start:]`
`ListRangeStep3(start, end, step, startOmitted, endOmitted)`	`ListRangeStep`	`[start:end:step]`
`Filter(predicate)`	`FilterStep`	`[?(...)]`
`MapWildcard()`	`MapWildcardStep`	`[*]` (on map fields)

ListRangeStep3 panics if step is 0. Use the startOmitted/endOmitted flags to indicate that a bound should be defaulted at traversal time (matching the behaviour of omitting them in the string syntax).

Examples

Simple Scalar

message BidRequestEvent {
  string id = 1;
  bool throttled = 10;
}

id              → string value of the id field
throttled       → bool value of the throttled field

Nested Messages

message BidRequestEvent {
  DeviceEvent device = 3;
  message DeviceEvent {
    GeoEvent geo = 4;
    DeviceExtEvent ext = 7;
    message GeoEvent { string country = 4; }
    message DeviceExtEvent {
      DoohEvent dooh = 1;
      message DoohEvent { uint32 venuetypeid = 2; }
    }
  }
}

device.geo.country          → the country string
device.ext.dooh.venuetypeid → uint32 venue type id

Repeated Message Elements

message BidRequestEvent {
  repeated ImpressionEvent imp = 4;
  message ImpressionEvent {
    string id = 1;
    BannerEvent banner = 4;
    message BannerEvent { uint32 w = 2; }
  }
}

imp[0].id        → id of the first impression
imp[0].banner.w  → banner width of the first impression
imp[-1].id       → id of the last impression

Repeated Scalars

message BidRequestEvent {
  repeated string cur = 6;
}

cur[0]   → first currency string
cur[-1]  → last currency string

Wildcard over Repeated Messages

path, _ := pbpath.ParsePath(md, "imp[*].id")
results, _ := pbpath.PathValues(path, msg)
// One Values per impression, each ending with that impression's id.
for _, r := range results {
    fmt.Println(r.Index(-1).Value.String())
}

Range: First N Impressions

path, _ := pbpath.ParsePath(md, "imp[0:3].banner.w")
results, _ := pbpath.PathValues(path, msg)
// Up to 3 results (or fewer if imp has < 3 elements).

Slice: Every Other Element in Reverse

path, _ := pbpath.ParsePath(md, "imp[::-2]")
results, _ := pbpath.PathValues(path, msg)
// With 5 impressions → indices 4, 2, 0.

Nested Fan-Out with ListIndices

// All deals across all impressions.
path, _ := pbpath.ParsePath(md, "imp[*].pmp.deals[*].id")
results, _ := pbpath.PathValues(path, msg)

for _, r := range results {
    idx := r.ListIndices() // [imp_index, deal_index]
    fmt.Printf("imp[%d].deal[%d] = %s\n",
        idx[0], idx[1], r.Index(-1).Value.String())
}

Deeply Nested Through Repeated Fields

message ImpressionEvent {
  PrivateMarketplaceEvent pmp = 3;
  message PrivateMarketplaceEvent {
    repeated DealEvent deals = 2;
    message DealEvent {
      string id = 1;
      DealExtEvent ext = 6;
      message DealExtEvent { bool must_bid = 3; }
    }
  }
}

imp[0].pmp.deals[0].id            → deal id
imp[0].pmp.deals[0].ext.must_bid  → must_bid flag on the deal
imp[*].pmp.deals[*].ext.must_bid  → must_bid for every deal in every impression

Map Access

message Test {
  map<string, Nested> strkeymap = 4;
  map<int32, Test>    int32keymap = 6;
  message Nested { string stringfield = 2; }
}

strkeymap["mykey"]              → the Nested message for key "mykey"
strkeymap["mykey"].stringfield  → stringfield inside that Nested message
int32keymap[-6]                 → the Test message for key -6

Self-Referential / Recursive Messages

message Test {
  Nested nested = 1;
  message Nested {
    string stringfield = 2;
    Test nested = 4;        // back-reference to Test
  }
}

nested.stringfield                  → top-level Nested's stringfield
nested.nested.nested.stringfield    → 3 levels deep through the cycle

Complex: Multiple Step Types Combined

Using the testmessage.proto schema, with one map key written as an octal literal:

int32keymap[-6].uint64keymap[040000000000].repeats[0].nested.nested.strkeymap["k"].intfield

This path:

Indexes int32keymap with key -6
On the resulting Test, indexes uint64keymap with key 4294967296 (octal 040000000000)
Indexes repeats list at position 0
Accesses nested (a Nested message)
Accesses nested on that Nested (back to a Test)
Indexes strkeymap with string key "k"
Reads the intfield scalar

Explicit Root

When you want to be explicit about the message type:

(bidrequest.BidRequestEvent).imp[0].pmp.deals[0].id
(pbpath.testdata.Test).nested.stringfield

The fully-qualified name inside () must exactly match the message descriptor's FullName().

Strict Mode

By default, out-of-bounds indices and range bounds are silently handled:

A negative index that resolves past the beginning of a list skips that branch (no result is emitted for it).
Range/slice bounds are clamped to the list length.

PathValues — global strict

Pass Strict() to make any out-of-bounds condition an immediate error:

results, err := pbpath.PathValues(path, msg, pbpath.Strict())
// err is non-nil if any index or bound is out of range.

Plan — per-path strict

With the Plan API, strict checking is per-path via StrictPath(). The traversal itself is always lenient (clamp/skip); the clamped flag is checked at leaf nodes only for strict paths.

plan, _ := pbpath.NewPlan(md,
    pbpath.PlanPath("imp[0:100].id", pbpath.StrictPath()), // errors if clamped
    pbpath.PlanPath("imp[0:100].banner.w"),                    // silently clamped
)
results, err := plan.Eval(msg)
// err is non-nil only if the strict path's bounds were clamped.

This lets you mix lenient and strict paths in the same plan without separate traversals.
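
The "traverse leniently, check the clamped flag afterwards" idea can be sketched as follows. The clampRange and evalRange helpers are hypothetical, handle only forward ranges with non-negative bounds, and are not pbpath's code.

```go
package main

import "fmt"

// clampRange resolves a forward range [start:end) with non-negative bounds
// against a list of length n and reports whether a bound was adjusted.
func clampRange(start, end, n int) (lo, hi int, clamped bool) {
	lo, hi = start, end
	if lo > n {
		lo, clamped = n, true
	}
	if hi > n {
		hi, clamped = n, true
	}
	return lo, hi, clamped
}

// evalRange always traverses leniently; when strict is set, a clamped bound
// is reported as an error after the traversal instead of being ignored.
func evalRange(start, end, n int, strict bool) ([]int, error) {
	lo, hi, clamped := clampRange(start, end, n)
	var idx []int
	for i := lo; i < hi; i++ {
		idx = append(idx, i)
	}
	if strict && clamped {
		return nil, fmt.Errorf("range [%d:%d] clamped to list length %d", start, end, n)
	}
	return idx, nil
}

func main() {
	fmt.Println(evalRange(0, 100, 3, false)) // lenient: indices 0, 1, 2 and no error
	fmt.Println(evalRange(0, 100, 3, true))  // strict: an error is returned
}
```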

Value Type

Value is the universal intermediate representation used by pbpath expressions and the Query API. It is a small union struct (≤48 bytes on 64-bit) that can hold any protobuf value without boxing.

Value Kinds

Kind	Constructor	Description
`NullKind`	`Null()`	Absent or unset value
`ScalarKind`	`Scalar(v)`, `ScalarBool(b)`, `ScalarInt64(n)`, `ScalarFloat64(f)`, `ScalarString(s)`	Any protobuf scalar
`ListKind`	`ListVal(vs)`	A slice of Values (fan-out branches)
`MessageKind`	`MessageVal(m)`	A protobuf message reference

Values are created from protobuf values with FromProtoValue and converted back with ToProtoValue. The IsNull, IsNonZero, Kind, ProtoValue, List, and Message accessors provide typed access without allocations.

Result Type

Result wraps a []Value representing the fan-out output of a single path entry. It provides typed accessors for the most common protobuf scalar types:

result := resultSet.Get("country")
country := result.String()       // first branch (or "")
allIDs := result.Strings()       // all branches

price := result.Float64()        // first branch (or 0)
prices := result.Float64s()      // all branches

active := result.Bool()          // first branch (or false)
count := result.Int64()          // first branch (or 0)

Accessor Naming Convention

Singular	Plural	Type
`String()`	`Strings()`	`string`
`Bool()`	`Bools()`	`bool`
`Int32()`	`Int32s()`	`int32`
`Int64()`	`Int64s()`	`int64`
`Uint32()`	`Uint32s()`	`uint32`
`Uint64()`	`Uint64s()`	`uint64`
`Float32()`	`Float32s()`	`float32`
`Float64()`	`Float64s()`	`float64`
`Bytes()`	—	`[]byte`
`Message()`	`Messages()`	`protoreflect.Message`
`ProtoValues()`	—	`[]protoreflect.Value`

Singular accessors return the first branch's value (or zero). Plural accessors return all branches. Len() returns the number of branches.
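
The singular/plural convention can be pictured with a toy type that holds string branches. The stringResult type below is a hypothetical stand-in written for this example, not the library's Result.

```go
package main

import "fmt"

// stringResult is a toy stand-in for a Result holding string branches.
type stringResult struct{ branches []string }

// Len reports the number of branches.
func (r stringResult) Len() int { return len(r.branches) }

// String returns the first branch, or "" when there are none.
func (r stringResult) String() string {
	if len(r.branches) == 0 {
		return ""
	}
	return r.branches[0]
}

// Strings returns a copy of every branch.
func (r stringResult) Strings() []string {
	return append([]string(nil), r.branches...)
}

func main() {
	ids := stringResult{branches: []string{"imp-1", "imp-2"}}
	var empty stringResult
	fmt.Println(ids.String(), ids.Strings(), ids.Len())
	fmt.Printf("%q %d\n", empty.String(), empty.Len())
}
```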

Query API

The Query API wraps a Plan and presents results through the typed ResultSet/Result types instead of raw [][]protoreflect.Value slices.

Quick Start

q, err := pbpath.NewQuery(md,
    pbpath.PlanPath("device.geo.country", pbpath.Alias("country")),
    pbpath.PlanPath("imp[*].id",          pbpath.Alias("imp_id")),
    pbpath.PlanPath("imp[*].bidfloor",    pbpath.Alias("price")),
)
if err != nil {
    log.Fatal(err)
}

// Evaluate — returns a ResultSet with typed accessors.
rs, err := q.Run(msg)
if err != nil {
    log.Fatal(err)
}

country := rs.Get("country").String()       // "US"
impIDs := rs.Get("imp_id").Strings()        // ["imp-1", "imp-2", ...]
prices := rs.Get("price").Float64s()        // [1.5, 2.0, ...]

// Check if a path produced any results
if rs.Has("country") {
    // ...
}

// Iterate all entries
for _, name := range rs.Names() {
    result := rs.Get(name)
    fmt.Printf("%s: %d branches\n", name, result.Len())
}

Concurrent Use

// RunConcurrent allocates fresh buffers — safe for multi-goroutine use.
rs, err := q.RunConcurrent(msg)

Accessing the Underlying Plan

plan := q.Plan() // *Plan — useful for EvalLeaves when you need raw performance

Expression Engine

The Plan API supports computed columns through a composable Expr tree. Expressions reference protobuf field paths via PathRef and apply functions to produce derived values — all evaluated inline during plan traversal.

Quick Start

plan, err := pbpath.NewPlan(md, nil,
    // Coalesce: first non-zero value from multiple paths
    pbpath.PlanPath("device_id",
        pbpath.WithExpr(pbpath.FuncCoalesce(
            pbpath.PathRef("user.id"),
            pbpath.PathRef("site.id"),
            pbpath.PathRef("device.ifa"),
        )),
        pbpath.Alias("device_id"),
    ),

    // Conditional: use banner dimensions if present, else video
    pbpath.PlanPath("width",
        pbpath.WithExpr(pbpath.FuncCond(
            pbpath.FuncHas(pbpath.PathRef("imp[0].banner.w")),
            pbpath.PathRef("imp[0].banner.w"),
            pbpath.PathRef("imp[0].video.w"),
        )),
        pbpath.Alias("width"),
    ),

    // Arithmetic: compute a derived value
    pbpath.PlanPath("total",
        pbpath.WithExpr(pbpath.FuncMul(
            pbpath.PathRef("items[0].price"),
            pbpath.PathRef("items[0].qty"),
        )),
        pbpath.Alias("total"),
    ),

    // Default: provide a fallback literal
    pbpath.PlanPath("country",
        pbpath.WithExpr(pbpath.FuncDefault(
            pbpath.PathRef("device.geo.country"),
            protoreflect.ValueOfString("UNKNOWN"),
        )),
        pbpath.Alias("country"),
    ),
)

Available Functions

Category	Functions	Output kind
Control flow	`FuncCoalesce`, `FuncDefault`, `FuncCond`	Same as input
Existence	`FuncHas`	Bool
Length	`FuncLen`	Int64
Predicates	`FuncEq`, `FuncNe`, `FuncLt`, `FuncLe`, `FuncGt`, `FuncGe`	Bool
Arithmetic	`FuncAdd`, `FuncSub`, `FuncMul`, `FuncDiv`, `FuncMod`	Numeric (auto-promoted)
Math	`FuncAbs`, `FuncCeil`, `FuncFloor`, `FuncRound`, `FuncMin`, `FuncMax`	Preserved
String	`FuncUpper`, `FuncLower`, `FuncTrim`, `FuncTrimPrefix`, `FuncTrimSuffix`, `FuncConcat`	String
Cast	`FuncCastInt`, `FuncCastFloat`, `FuncCastString`	Changed
Timestamp	`FuncStrptime`, `FuncTryStrptime`, `FuncAge`, `FuncExtract{Year,Month,Day,Hour,Minute,Second}`	Int64
ETL	`FuncHash`, `FuncEpochToDate`, `FuncDatePart`, `FuncBucket`, `FuncMask`, `FuncCoerce`, `FuncEnumName`	Varies
Aggregates	`FuncSum`, `FuncDistinct`, `FuncListConcat`	Varies
Filter/Logic	`FuncSelect`, `FuncAnd`, `FuncOr`, `FuncNot`	Same / Bool

Expressions compose freely — a FuncCond can contain FuncHas as its predicate, PathRef as the then-branch, and FuncDefault as the else-branch. See the Architecture Guide for implementation details.
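
The composition described above can be pictured as a small expression tree. The sketch below is a stand-alone toy, not pbpath's Expr type: record, expr, pathRef, literal, coalesce, and cond are all hypothetical names, and values are plain strings where the empty string means "unset".

```go
package main

import "fmt"

// record stands in for a decoded message: field path -> value ("" = unset).
type record map[string]string

// expr is a node in a small expression tree.
type expr interface{ eval(r record) string }

// pathRef reads the value stored under a field path.
type pathRef string

func (p pathRef) eval(r record) string { return r[string(p)] }

// literal always yields a constant.
type literal string

func (l literal) eval(record) string { return string(l) }

// coalesce yields the first non-empty operand.
type coalesce []expr

func (c coalesce) eval(r record) string {
	for _, e := range c {
		if v := e.eval(r); v != "" {
			return v
		}
	}
	return ""
}

// cond picks then or otherwise depending on whether test is non-empty.
type cond struct{ test, then, otherwise expr }

func (c cond) eval(r record) string {
	if c.test.eval(r) != "" {
		return c.then.eval(r)
	}
	return c.otherwise.eval(r)
}

func main() {
	msg := record{"site.id": "s1", "imp[0].video.w": "640"}

	deviceID := coalesce{pathRef("user.id"), pathRef("site.id"), pathRef("device.ifa")}
	width := cond{
		test:      pathRef("imp[0].banner.w"),
		then:      pathRef("imp[0].banner.w"),
		otherwise: coalesce{pathRef("imp[0].video.w"), literal("0")},
	}
	fmt.Println(deviceID.eval(msg), width.eval(msg))
}
```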

Pipeline API (jq-style)

The Pipeline API provides a jq-style expression language for exploratory protobuf querying. Unlike the Plan API (which is designed for high-throughput ETL), pipelines are parsed from human-readable strings and evaluated interactively.

Quick Start

// Parse a pipeline against a message descriptor.
p, err := pbpath.ParsePipeline(md, `.items | .[] | select(.active) | .name`)
if err != nil {
    log.Fatal(err)
}

// Execute against a protobuf message.
results, err := p.ExecMessage(msg.ProtoReflect())
for _, v := range results {
    fmt.Println(v.String())
}

Grammar

pipeline       = comma_expr ["as" "$" ident "|" pipeline] { "|" comma_expr ["as" "$" ident "|" pipeline] }
comma_expr     = alt_expr { "," alt_expr }
alt_expr       = or_expr { "//" or_expr }
or_expr        = and_expr { "or" and_expr }
and_expr       = compare_expr { "and" compare_expr }
compare_expr   = add_expr [ ("==" | "!=" | "<" | "<=" | ">" | ">=") add_expr ]
add_expr       = mul_expr { ("+" | "-") mul_expr }
mul_expr       = postfix_expr { ("*" | "/" | "%") postfix_expr }
postfix_expr   = primary { suffix } [ "?" ]
suffix         = "." ident | "[]" | "[" integer "]"
primary        = "." | ".field" | ".[]" | ".[n]"
               | "[" pipeline "]"     // collect into array
               | "(" pipeline ")"     // grouping
               | "{" [ obj_entry { "," obj_entry } [","] ] "}"  // object construction
               | ident [ "(" pipeline { ";" pipeline } ")" ]
               | "$" ident            // variable reference
               | "-" primary          // unary negation
               | "!" primary          // unary not
               | "@" ident            // format string
               | "if" ... "then" ... ["elif" ...] ["else" ...] "end"
               | "try" primary ["catch" primary]
               | "reduce" expr "as" "$" ident "(" pipeline ";" pipeline ")"
               | "foreach" expr "as" "$" ident "(" pipeline ";" pipeline [";" pipeline] ")"
               | "label" "$" ident "|" pipeline
               | "break" "$" ident
               | literal
obj_entry      = ident ":" alt_expr               // static key
               | string ":" alt_expr              // string literal key
               | "(" pipeline ")" ":" alt_expr    // dynamic key
               | ident                            // shorthand for {ident: .ident}

Pipe Operator `|`

The pipe operator chains stages left-to-right. Each stage receives every value produced by the previous stage:

.items | .[] | .name          // access items field, iterate, extract name

Comma Operator `,`

The comma operator produces multiple outputs from each input. Comma has higher precedence than pipe, so .a, .b | f means (.a, .b) | f:

.items | .[0] | .name, .value     // two outputs: name and value of first item
[.name, .kind]                     // collect both into a list
(.name, .kind) | ascii_upcase     // upcase both

select(predicate)

Keeps the input if the predicate produces a truthy result; drops it otherwise:

.items | .[] | select(.active)                // truthy check
.items | .[] | select(.value > 20)            // comparison
.items | .[] | select(.active and .value > 10) // compound
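
One way to picture the pipe and select semantics is to treat every stage as a function from one input to zero or more outputs. The sketch below is not pbpath's implementation and does not use its API: `item`, `stage`, `pipe`, `iterate`, `selectBy`, and `name` are hypothetical names chosen for illustration only.

```go
package main

import "fmt"

// item is a hypothetical stand-in for a protobuf message with three fields.
type item struct {
	Name   string
	Value  int
	Active bool
}

// stage maps one input to zero or more outputs, like a single pipeline stage.
type stage func(in any) []any

// pipe chains stages left to right: every output of one stage feeds the next.
func pipe(stages ...stage) stage {
	return func(in any) []any {
		cur := []any{in}
		for _, s := range stages {
			var next []any
			for _, v := range cur {
				next = append(next, s(v)...)
			}
			cur = next
		}
		return cur
	}
}

// iterate mimics `.[]`: it emits each element of a []item input.
func iterate(in any) []any {
	var out []any
	for _, it := range in.([]item) {
		out = append(out, it)
	}
	return out
}

// selectBy mimics select(pred): it emits the input unchanged or nothing at all.
func selectBy(pred func(item) bool) stage {
	return func(in any) []any {
		if pred(in.(item)) {
			return []any{in}
		}
		return nil
	}
}

// name mimics `.name`.
func name(in any) []any { return []any{in.(item).Name} }

func main() {
	items := []item{{"alpha", 10, true}, {"beta", 20, false}, {"gamma", 30, true}}
	p := pipe(iterate, selectBy(func(it item) bool { return it.Active && it.Value > 10 }), name)
	fmt.Println(p(items)) // [gamma]
}
```

Because a dropped input simply contributes no outputs, later stages never see it; that is the same reason `empty` can be used to filter inside a pipeline.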

Collect `[pipeline]`

Gathers all outputs of the inner pipeline into a single list value:

[.items | .[] | .name]            // ["alpha", "beta", "gamma", "delta"]
[.items | .[] | .name] | length  // 4

Built-in Functions

String Functions

Function	Args	Description
`ascii_downcase`	—	Convert to lowercase
`ascii_upcase`	—	Convert to uppercase
`ltrimstr(s)`	1	Remove prefix `s`
`rtrimstr(s)`	1	Remove suffix `s`
`startswith(s)`	1	Test if starts with `s` → bool
`endswith(s)`	1	Test if ends with `s` → bool
`split(sep)`	1	Split string by separator → array
`join(sep)`	1	Join array elements by separator → string
`test(re)`	1	Test regex match → bool
`match(re)`	1	Find match → [offset, length, string]
`capture(re)`	1	Named capture groups → [name, value, ...]
`gsub(re; s)`	2	Replace all matches of `re` with `s`
`sub(re; s)`	2	Replace first match of `re` with `s`
`explode`	—	String → array of Unicode code points
`implode`	—	Array of code points → string

"HELLO" | ascii_downcase                // "hello"
"hello world" | ltrimstr("hello ")      // "world"
"a-b-c" | split("-") | join("+")        // "a+b+c"
"foo bar foo" | gsub("foo"; "baz")      // "baz bar baz"
"hello" | explode | implode             // "hello"

Collection Functions

Function	Args	Description
`map(f)`	1	Apply pipeline `f` to each element → array
`sort_by(f)`	1	Sort by key pipeline (stable) → array
`group_by(f)`	1	Group consecutive equal keys → array of arrays
`unique_by(f)`	1	Deduplicate by key (first wins) → array
`min_by(f)`	1	Element with minimum key
`max_by(f)`	1	Element with maximum key
`flatten`	—	Flatten nested arrays one level
`reverse`	—	Reverse array or string
`first`	—	First element of array
`last`	—	Last element of array
`nth(n)`	1	Element at index `n`
`limit(n; f)`	2	First `n` outputs of pipeline `f`
`contains(x)`	1	Test if input contains `x` → bool
`inside(x)`	1	Test if input is inside `x` → bool
`index(s)`	1	First index of `s` (string or array)
`rindex(s)`	1	Last index of `s`
`indices(s)`	1	All indices of `s` → array

.items | map(.name)                         // ["alpha", "beta", "gamma", "delta"]
.items | sort_by(.name) | first | .name     // "alpha"
.items | map(.name) | join(", ")            // "alpha, beta, gamma, delta"
.items | group_by(.kind) | length           // 2 (groups "A" and "B")
.items | unique_by(.kind) | .[] | .name     // "alpha", "beta"
.items | min_by(.value) | .name             // "alpha"
.items | max_by(.value) | .name             // "delta"
"foobar" | contains("bar")                  // true
"abcabc" | index("bc")                      // 1

Numeric Functions

Function	Args	Description
`fabs`	—	Absolute value (float)
`sqrt`	—	Square root
`log`	—	Natural logarithm
`pow`	—	Power: input `[base, exp]`
`nan`	—	NaN constant
`infinite`	—	Infinity constant
`isnan`	—	Test for NaN → bool
`isinfinite`	—	Test for ±infinity → bool
`isnormal`	—	Test for normal number → bool

16 | sqrt                     // 4
[2, 10] | pow                 // 1024
nan | isnan                   // true

Serialization & Format Strings

Function	Description
`tojson`	Value → JSON string
`fromjson`	JSON string → value
`@base64`	Encode to Base64
`@base64d`	Decode from Base64
`@uri`	URL-encode
`@csv`	Array → CSV row
`@tsv`	Array → TSV row
`@html`	HTML-escape
`@json`	Same as `tojson`

"hello" | @base64              // "aGVsbG8="
"aGVsbG8=" | @base64d          // "hello"
"<b>bold</b>" | @html          // "&lt;b&gt;bold&lt;/b&gt;"
42 | tojson                    // "42"

Core Functions

Function	Description
`length`	Array/string/object/message length; absolute value for numbers
`type`	Type name: "null", "boolean", "number", "string", "array", "object"
`keys`	Array → indices; message/object → field/key names
`values`	Message/object → values; array → identity
`add`	Reduce: sum numbers, concat strings, flatten arrays, merge objects
`tostring`	Convert to string representation
`tonumber`	Convert string to number
`not`	Logical negation
`empty`	Produce zero outputs
`null`	Null constant
`has(k)`	Test if object/message has key `k` → bool
`in(obj)`	Test if input key exists in `obj` → bool
`to_entries`	Object → `[{key, value}, ...]`
`from_entries`	`[{key, value}, ...]` → object
`with_entries(f)`	`to_entries | map(f) | from_entries`
`getpath(path)`	Get value at path (array of keys/indices)
`setpath(path; val)`	Set value at path
`delpaths(paths)`	Delete values at multiple paths

Arithmetic Operators

Standard arithmetic operators with proper precedence (* / % before + -):

Operator	Description
`+`	Add numbers, concatenate strings, concatenate arrays, merge objects
`-`	Subtract numbers
`*`	Multiply numbers, recursively merge objects
`/`	Divide numbers (integer division for integers)
`%`	Modulo

Special + behaviours: null + x = x, "a" + "b" = "ab", [1] + [2] = [1,2], {a:1} + {b:2} = {a:1,b:2}.

5 + 3                              // 8
10 - 3                             // 7
4 * 5                              // 20
10 / 3                             // 3 (integer)
10 % 3                             // 1
"hello" + " " + "world"           // "hello world"
2 + 3 * 4                         // 14 (mul before add)
.items | .[0] | .value * 2 + 1    // field arithmetic

Variables — `as $name`

Bind expression results to variables for use in subsequent pipeline stages:

.name as $n | .items | .[] | select(.active) | [$n, .name]
.items | .[0] | .name as $n | .value as $v | [$n, $v]

Variables are lexically scoped — inner bindings shadow outer ones without affecting the outer scope. The body of an as binding receives the same input as the expression being bound (not its output).

If-then-else

Conditional expressions with optional elif chains:

if .active then .name else "inactive" end
if .value > 50 then "high" elif .value > 10 then "medium" else "low" end
.items | .[] | if .active then .name else empty end

If no else clause is provided and the condition is false, the input passes through unchanged (identity).

Try-catch

Error handling that suppresses or intercepts errors:

try .name                          // pass through; suppress error
try error catch "caught"           // catch → substitute value
try error("oops") catch .          // catch body receives error message

try without catch silently suppresses errors (produces no output on error).

Alternative Operator `//`

Returns the left side if truthy, otherwise the right side (like jq's //):

.name // "default"                 // use .name if non-null/non-false
null // false // "last"            // chains: first truthy wins
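
The chaining rule can be sketched in plain Go. This is an illustration rather than pbpath code: `truthy` and `alt` are hypothetical helpers, and the sketch assumes that "truthy" means non-null and non-false, as the comment on the first example suggests.

```go
package main

import "fmt"

// truthy treats everything except nil (null) and false as truthy.
// This rule is an assumption taken from the "non-null/non-false" comment.
func truthy(v any) bool {
	if v == nil {
		return false
	}
	if b, ok := v.(bool); ok {
		return b
	}
	return true
}

// alt mimics a chain `a // b // c`: the first truthy candidate wins, and the
// last candidate is returned as-is when none of the earlier ones is truthy.
func alt(candidates ...any) any {
	for i, c := range candidates {
		if truthy(c) || i == len(candidates)-1 {
			return c
		}
	}
	return nil // no candidates at all
}

func main() {
	fmt.Println(alt(nil, false, "last")) // last
	fmt.Println(alt("name", "default"))  // name
}
```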

Optional Operator `?`

Suppresses errors on the preceding expression, producing empty output instead:

.name?                             // no error if .name fails
.items | .[]?                      // suppress iterate errors

Reduce

Fold a stream into a single accumulator value:

reduce (.items | .[]) as $x (0; . + $x.value)     // sum: 100
reduce (.items | .[]) as $x (0; . + 1)             // count: 4
reduce (.items | .[]) as $x (""; . + $x.name + " ") // concat

Syntax: reduce STREAM as $VAR (INIT; UPDATE) — evaluates STREAM, binds each value to $VAR, and folds through UPDATE starting from INIT. The accumulator (. inside UPDATE) starts at INIT and is updated after each stream element.

Foreach

Like reduce but emits intermediate results:

foreach (.items | .[]) as $x (0; . + $x.value)           // running sum: 10, 30, 60, 100
foreach (.items | .[]) as $x (0; . + 1; . * 10)          // with extract: 10, 20, 30, 40

Syntax: foreach STREAM as $VAR (INIT; UPDATE [; EXTRACT]) — like reduce but emits a value after each iteration. Without EXTRACT, emits the accumulator; with EXTRACT, emits the EXTRACT expression applied to the accumulator.
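
The difference between the two forms is easiest to see as ordinary folds. The sketch below uses hypothetical generic helpers named `reduce` and `foreach` (not part of pbpath) and assumes the four items carry the values 10, 20, 30, and 40, which matches the running sums shown above.

```go
package main

import "fmt"

// reduce folds stream into a single accumulator, starting from init.
func reduce[T, A any](stream []T, init A, update func(acc A, x T) A) A {
	acc := init
	for _, x := range stream {
		acc = update(acc, x)
	}
	return acc
}

// foreach runs the same fold but emits extract(acc) after every element.
func foreach[T, A, O any](stream []T, init A, update func(acc A, x T) A, extract func(acc A) O) []O {
	out := make([]O, 0, len(stream))
	acc := init
	for _, x := range stream {
		acc = update(acc, x)
		out = append(out, extract(acc))
	}
	return out
}

func main() {
	values := []int{10, 20, 30, 40} // assumed .value of each item
	sum := func(acc, x int) int { return acc + x }
	count := func(acc, _ int) int { return acc + 1 }
	identity := func(acc int) int { return acc }
	timesTen := func(acc int) int { return acc * 10 }

	fmt.Println(reduce(values, 0, sum))            // 100
	fmt.Println(foreach(values, 0, sum, identity)) // [10 30 60 100]
	fmt.Println(foreach(values, 0, count, timesTen)) // [10 20 30 40]
}
```

Passing `identity` as the extract step corresponds to the two-argument `foreach` form, which emits the accumulator itself.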

Label-break

Early exit from a pipeline using label/break:

label $out | .items | .[] | if .value > 20 then break $out else . end

Object Construction `{...}`

Build schema-free objects with key-value pairs. Objects support static keys, string literal keys, dynamic keys via (expr), and shorthand notation:

{name: .name, val: .value}                // static keys
{"my-key": .name}                          // string literal key
{(.name): .value}                          // dynamic key from expression
{name}                                     // shorthand: same as {name: .name}
{name, value}                              // multiple shorthand
{}                                         // empty object

Objects are schema-free — they hold arbitrary string keys and Value values, unlike MessageKind which is tied to a protobuf descriptor. Object values render as JSON-like strings: {"name":"alpha","value":10}.

Object operations

{a: 1} + {b: 2}                           // merge: {"a":1,"b":2}
{a: 1, b: 2} + {b: 3}                     // right wins: {"a":1,"b":3}
{a: {x: 1}} * {a: {y: 2}}                 // recursive merge: {"a":{"x":1,"y":2}}
{a: 1, b: 2} | keys                       // ["a","b"]
{a: 1, b: 2} | values                     // [1,2]
{a: 1, b: 2} | length                     // 2
{a: 1, b: 2} | has("a")                   // true
"a" | in({"a": 1})                         // true
{a: 1, b: 2} | .[]                         // 1, 2 (iterate values)
{a: 1, b: 2} | .a                          // 1 (field access)
{a: 1, b: 2} | type                        // "object"
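
The two merge operators differ only in how they treat a key present on both sides. A minimal sketch with Go maps follows; `object`, `merge`, and `deepMerge` are hypothetical names, and pbpath's own object representation is not assumed here.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a schema-free object.
type object = map[string]any

// merge mimics `+` on objects: a shallow merge where keys from b win.
func merge(a, b object) object {
	out := make(object, len(a)+len(b))
	for k, v := range a {
		out[k] = v
	}
	for k, v := range b {
		out[k] = v
	}
	return out
}

// deepMerge mimics `*` on objects: when both sides hold an object under the
// same key the two are merged recursively; otherwise the value from b wins.
func deepMerge(a, b object) object {
	out := merge(a, nil) // shallow copy of a
	for k, bv := range b {
		ao, aIsObj := out[k].(object)
		bo, bIsObj := bv.(object)
		if aIsObj && bIsObj {
			out[k] = deepMerge(ao, bo)
		} else {
			out[k] = bv
		}
	}
	return out
}

func main() {
	fmt.Println(merge(object{"a": 1, "b": 2}, object{"b": 3}))                       // map[a:1 b:3]
	fmt.Println(deepMerge(object{"a": object{"x": 1}}, object{"a": object{"y": 2}})) // map[a:map[x:1 y:2]]
}
```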

to_entries / from_entries / with_entries

Convert between object and array-of-pairs representations:

{a: 1, b: 2} | to_entries                 // [{key:"a",value:1}, {key:"b",value:2}]
[{key: "x", value: 42}] | from_entries    // {"x":42}
{a: 1} | with_entries(.)                   // {"a":1} (identity transform)

Path operations on objects

{a: {b: 42}} | getpath(["a", "b"])         // 42
{a: 1} | setpath(["b"]; 2)                // {"a":1,"b":2}
{a: 1, b: 2, c: 3} | delpaths([["b"]])    // {"a":1,"c":3}

Building objects from pipelines

// Extract specific fields from each item
.items | .[] | {name, value}

// Conditional values
{val: (if .value > 50 then "big" else "small" end)}

// Reduce into an object
reduce (.items | .[] | .name) as $n ({} ; . + {($n): true})

// Merge multiple objects
[{a: 1}, {b: 2}, {c: 3}] | add            // {"a":1,"b":2,"c":3}

Additional Built-in Functions

Function	Description
`min`	Minimum of array
`max`	Maximum of array
`floor`	Floor of number
`ceil`	Ceiling of number
`round`	Round number
`range`	Generate 0..n-1 from input n
`any` / `all`	Test array elements for truthiness
`any(f)` / `all(f)`	Test with predicate
`while(cond; update)`	Emit values while condition holds
`until(cond; update)`	Apply update until condition holds
`recurse` / `recurse(f)`	Recursive descent
`repeat(f)`	Apply f repeatedly
`error` / `error(msg)`	Raise an error
`debug`	Pass-through (for debugging)
`env`	Returns null (no env in protobuf context)

Pipeline Examples

// Extract active item names, sorted.
p, _ := pbpath.ParsePipeline(md,
    `.items | [.[] | select(.active)] | sort_by(.name) | .[] | .name`)
results, _ := p.ExecMessage(msg.ProtoReflect())
// → ["alpha", "delta", "gamma"]

// Sum of all values.
p, _ = pbpath.ParsePipeline(md, `.items | map(.value) | add`)
results, _ = p.ExecMessage(msg.ProtoReflect())
// → [100]

// Regex filtering.
p, _ = pbpath.ParsePipeline(md,
    `.items | .[] | select(.name | test("^(alpha|gamma)$")) | .name`)
results, _ = p.ExecMessage(msg.ProtoReflect())
// → ["alpha", "gamma"]

// Transform with gsub and base64.
p, _ = pbpath.ParsePipeline(md, `.name | gsub("-"; "_") | @base64`)
results, _ = p.ExecMessage(msg.ProtoReflect())

// Variable binding with conditional.
p, _ = pbpath.ParsePipeline(md,
    `.name as $n | .items | .[] | if .active then $n + ":" + .name else "skip" end`)
results, _ = p.ExecMessage(msg.ProtoReflect())

// Reduce to sum values.
p, _ = pbpath.ParsePipeline(md,
    `reduce (.items | .[]) as $x (0; . + $x.value)`)
results, _ = p.ExecMessage(msg.ProtoReflect())
// → [100]

// Alternative operator for defaults.
p, _ = pbpath.ParsePipeline(md, `.missing_field // "default"`)
results, _ = p.ExecMessage(msg.ProtoReflect())

EvalLeaves — High-Performance Evaluation

Plan.EvalLeaves is the recommended method for hot paths. It returns only leaf values (not the full path/values chain) and reuses pre-allocated scratch buffers, giving near-zero per-call allocations.

// Compile once.
plan, _ := pbpath.NewPlan(md, nil,
    pbpath.PlanPath("device.geo.country", pbpath.Alias("country")),
    pbpath.PlanPath("imp[*].id",          pbpath.Alias("imp_id")),
)

// Evaluate per message — near-zero allocations.
for _, msg := range stream {
    leaves, _ := plan.EvalLeaves(msg)
    country := leaves[0]  // []Value with 1 element
    impIDs  := leaves[1]  // []Value with N elements
}

Method	Allocations	Thread-safe	Returns
`Eval(msg)`	Full Values chains	✅	`[][]Values`
`EvalLeaves(msg)`	Near-zero (scratch reuse)	❌	`[][]Value`
`EvalLeavesConcurrent(msg)`	Fresh buffers per call	✅	`[][]Value`
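
The trade-off in the table can be illustrated with a toy type that is unrelated to pbpath's internals. In the sketch below, `extractor`, `leaves`, and `leavesConcurrent` are hypothetical; in particular, the idea that a reused scratch buffer is overwritten by the next call is an assumption about scratch-reuse designs in general, not a documented property of `Plan.EvalLeaves`.

```go
package main

import "fmt"

// extractor is a hypothetical stand-in for a compiled plan.
type extractor struct {
	scratch []int // reused across calls to leaves
}

// leaves writes results into the shared scratch buffer. It stops allocating
// once the buffer has grown, but concurrent callers would race on scratch,
// and each call overwrites the slice returned by the previous one.
func (e *extractor) leaves(msg []int) []int {
	e.scratch = e.scratch[:0]
	for _, v := range msg {
		e.scratch = append(e.scratch, v*2)
	}
	return e.scratch
}

// leavesConcurrent allocates a fresh buffer per call, so it touches no shared
// state and its result stays valid after later calls.
func (e *extractor) leavesConcurrent(msg []int) []int {
	out := make([]int, 0, len(msg))
	for _, v := range msg {
		out = append(out, v*2)
	}
	return out
}

func main() {
	e := &extractor{}

	first := e.leaves([]int{1, 2})
	e.leaves([]int{5, 6})
	fmt.Println(first) // [10 12]: the second call reused the backing array

	kept := e.leavesConcurrent([]int{1, 2})
	e.leavesConcurrent([]int{5, 6})
	fmt.Println(kept) // [2 4]
}
```

Under that assumption, a caller using the scratch-reuse method would consume or copy the results before the next call and keep the evaluator confined to one goroutine.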

Running Benchmarks

# Run all pbpath benchmarks
go test -bench=. -benchmem ./proto/pbpath/

# Run Plan evaluation benchmarks
go test -bench='BenchmarkPlan' -benchmem ./proto/pbpath/

# Run with longer duration for stable results
go test -bench='BenchmarkPlan' -benchmem -benchtime=5s -count=3 ./proto/pbpath/

Error Cases

Input	Error
`unknown`	field not found in message descriptor
`int32keymap["foo"]`	string key for int32 map
`nested.stringfield[0]`	indexing a non-repeated field
`strkeymap.key`	traversing map internal fields
`strkeymap["k"]["k2"]`	double-indexing (value is not a map)
`strkeymap[0:3]`	range/slice not supported on map fields
`repeats[::0]`	step must not be zero
`(wrong.Name).id`	root name doesn't match descriptor
`nested.`	trailing dot with no field name
`nested 🎉`	illegal characters

Slice Quick-Reference

Syntax	Selects	Python equivalent
`[*]`	all elements	`[:]`
`[:]`	all elements	`[:]`
`[::]`	all elements	`[::]`
`[0:3]`	elements 0, 1, 2	`[0:3]`
`[2:]`	from index 2 to end	`[2:]`
`[:2]`	elements 0, 1	`[:2]`
`[-2:]`	last 2 elements	`[-2:]`
`[-3:-1]`	3rd-to-last through 2nd-to-last	`[-3:-1]`
`[::2]`	every other (0, 2, 4, …)	`[::2]`
`[1::2]`	odd-indexed (1, 3, 5, …)	`[1::2]`
`[::-1]`	all in reverse	`[::-1]`
`[3:0:-1]`	3, 2, 1	`[3:0:-1]`
`[0:10:3]`	0, 3, 6, 9	`[0:10:3]`
`[5:2]`	empty (start ≥ end, step=1)	`[5:2]`
`[::0]`	error — step must not be 0	`[::0]` → `ValueError`
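
The rows above follow Python's slice rules, which can be reproduced in a few lines of Go. The sketch below is an independent illustration, not pbpath's code: `sliceIndices` is a hypothetical helper whose parameter list merely mirrors the shape of `ListRangeStep3`.

```go
package main

import (
	"errors"
	"fmt"
)

// sliceIndices returns the indices that a Python-style [start:end:step] slice
// selects from a list of length n. startOmitted and endOmitted mark bounds
// that were left out of the slice expression.
func sliceIndices(n, start, end, step int, startOmitted, endOmitted bool) ([]int, error) {
	if step == 0 {
		return nil, errors.New("step must not be zero")
	}
	// Valid bound range: [0, n] going forward, [-1, n-1] going backward.
	lo, hi := 0, n
	if step < 0 {
		lo, hi = -1, n-1
	}
	// clamp resolves a negative bound relative to the end, then clamps it.
	clamp := func(i int) int {
		if i < 0 {
			i += n
		}
		if i < lo {
			return lo
		}
		if i > hi {
			return hi
		}
		return i
	}
	if startOmitted {
		start = lo
		if step < 0 {
			start = hi
		}
	} else {
		start = clamp(start)
	}
	if endOmitted {
		end = hi
		if step < 0 {
			end = lo
		}
	} else {
		end = clamp(end)
	}
	var out []int
	for i := start; (step > 0 && i < end) || (step < 0 && i > end); i += step {
		out = append(out, i)
	}
	return out, nil
}

func main() {
	const n = 6 // hypothetical list length
	fmt.Println(sliceIndices(n, -2, 0, 1, false, true))  // [4 5] <nil>
	fmt.Println(sliceIndices(n, 0, 0, -1, true, true))   // [5 4 3 2 1 0] <nil>
	fmt.Println(sliceIndices(n, 3, 0, -1, false, false)) // [3 2 1] <nil>
	fmt.Println(sliceIndices(n, 5, 2, 1, false, false))  // [] <nil>
}
```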

Documentation ¶

Overview ¶

Package pbpath provides functionality for representing a sequence of protobuf reflection operations on a message, including parsing human-readable path strings and traversing messages along a path to collect values.

pbpath extends the standard protopath.Step with additional step kinds for list range slicing ([start:end], [start:], [:end]) and list wildcards ([*], [:]) that fan out during value traversal.

Index ¶

func FormatValue(v protoreflect.Value, fd protoreflect.FieldDescriptor) string
func ParseStrptime(format, value string) (time.Time, error)
func PathValuesMulti(md protoreflect.MessageDescriptor, m proto.Message, paths ...PlanPathSpec) ([][]Values, error)
type EntryOption
- func Alias(name string) EntryOption
- func StrictPath() EntryOption
- func WithExpr(expr Expr) EntryOption
type Expr
- func CondWithAutoPromote(on bool, predicate, then, els Expr) Expr
- func FilterPathRef(relPath string, fields ...protoreflect.FieldDescriptor) Expr
- func FuncAbs(child Expr) Expr
- func FuncAdd(a, b Expr) Expr
- func FuncAge(children ...Expr) Expr
- func FuncAnd(a, b Expr) Expr
- func FuncBucket(child Expr, size int) Expr
- func FuncCastFloat(child Expr) Expr
- func FuncCastInt(child Expr) Expr
- func FuncCastString(child Expr) Expr
- func FuncCeil(child Expr) Expr
- func FuncCoalesce(children ...Expr) Expr
- func FuncCoerce(child Expr, ifTrue, ifFalse Value) Expr
- func FuncConcat(sep string, children ...Expr) Expr
- func FuncCond(predicate, then, els Expr) Expr
- func FuncDatePart(part string, child Expr) Expr
- func FuncDefault(child Expr, literal Value) Expr
- func FuncDistinct(child Expr) Expr
- func FuncDiv(a, b Expr) Expr
- func FuncEnumName(child Expr) Expr
- func FuncEpochToDate(child Expr) Expr
- func FuncEq(a, b Expr) Expr
- func FuncExtractDay(child Expr) Expr
- func FuncExtractHour(child Expr) Expr
- func FuncExtractMinute(child Expr) Expr
- func FuncExtractMonth(child Expr) Expr
- func FuncExtractSecond(child Expr) Expr
- func FuncExtractYear(child Expr) Expr
- func FuncFloor(child Expr) Expr
- func FuncGe(a, b Expr) Expr
- func FuncGt(a, b Expr) Expr
- func FuncHas(child Expr) Expr
- func FuncHash(children ...Expr) Expr
- func FuncLe(a, b Expr) Expr
- func FuncLen(child Expr) Expr
- func FuncListConcat(child Expr, sep string) Expr
- func FuncLower(child Expr) Expr
- func FuncLt(a, b Expr) Expr
- func FuncMask(child Expr, keepFirst, keepLast int, maskChar string) Expr
- func FuncMax(a, b Expr) Expr
- func FuncMin(a, b Expr) Expr
- func FuncMod(a, b Expr) Expr
- func FuncMul(a, b Expr) Expr
- func FuncNe(a, b Expr) Expr
- func FuncNot(child Expr) Expr
- func FuncOr(a, b Expr) Expr
- func FuncRound(child Expr) Expr
- func FuncSelect(predicate, input Expr) Expr
- func FuncStrptime(format string, child Expr) Expr
- func FuncSub(a, b Expr) Expr
- func FuncSum(child Expr) Expr
- func FuncTrim(child Expr) Expr
- func FuncTrimPrefix(child Expr, prefix string) Expr
- func FuncTrimSuffix(child Expr, suffix string) Expr
- func FuncTryStrptime(format string, child Expr) Expr
- func FuncUpper(child Expr) Expr
- func Literal(val Value, kind protoreflect.Kind) Expr
- func PathRef(path string) Expr
type ObjectEntry
type Path
- func ParsePath(md protoreflect.MessageDescriptor, path string) (Path, error)
- func (p Path) Index(i int) Step
- func (p Path) String() string
type PathOption
- func Strict() PathOption
type PipeContext
type PipeExpr
type Pipeline
- func ParsePipeline(md protoreflect.MessageDescriptor, input string) (*Pipeline, error)
- func (p *Pipeline) Exec(input []Value) ([]Value, error)
- func (p *Pipeline) ExecMessage(msg protoreflect.Message) ([]Value, error)
- func (p *Pipeline) ExecOne(input Value) ([]Value, error)
type Plan
- func NewPlan(md protoreflect.MessageDescriptor, opts []PlanOption, paths ...PlanPathSpec) (*Plan, error)
- func (p *Plan) Clone() *Plan
- func (p *Plan) Entries() []PlanEntry
- func (p *Plan) Eval(m proto.Message) ([][]Values, error)
- func (p *Plan) EvalLeaves(m proto.Message) ([][]Value, error)
- func (p *Plan) EvalLeavesConcurrent(m proto.Message) ([][]Value, error)
- func (p *Plan) ExprInputEntries(idx int) []int
- func (p *Plan) InternalPath(idx int) Path
type PlanEntry
type PlanOption
- func AutoPromote(on bool) PlanOption
type PlanPathSpec
- func PlanPath(path string, opts ...EntryOption) PlanPathSpec
type Query
- func NewQuery(plan *Plan) *Query
- func (q *Query) Plan() *Plan
- func (q *Query) Run(msg proto.Message) (ResultSet, error)
- func (q *Query) RunConcurrent(msg proto.Message) (ResultSet, error)
type Result
- func NewResult(vals []Value) Result
- func QueryEval(md protoreflect.MessageDescriptor, path string, msg proto.Message) (Result, error)
- func (r Result) Bool() bool
- func (r Result) Bools() []bool
- func (r Result) Bytes() []byte
- func (r Result) BytesSlice() [][]byte
- func (r Result) Float64() float64
- func (r Result) Float64s() []float64
- func (r Result) Int64() int64
- func (r Result) Int64s() []int64
- func (r Result) IsEmpty() bool
- func (r Result) Len() int
- func (r Result) Message() protoreflect.Message
- func (r Result) Messages() []protoreflect.Message
- func (r Result) ProtoValues() []protoreflect.Value
- func (r Result) String() string
- func (r Result) Strings() []string
- func (r Result) Uint64() uint64
- func (r Result) Uint64s() []uint64
- func (r Result) Value() Value
- func (r Result) Values() []Value
type ResultSet
- func (rs ResultSet) All() iter.Seq2[string, Result]
- func (rs ResultSet) At(i int) (name string, result Result)
- func (rs ResultSet) Get(name string) Result
- func (rs ResultSet) Has(name string) bool
- func (rs ResultSet) Len() int
- func (rs ResultSet) Names() []string
type Step
- func AnyExpand(md protoreflect.MessageDescriptor) Step
- func FieldAccess(fd protoreflect.FieldDescriptor) Step
- func Filter(predicate Expr) Step
- func ListIndex(i int) Step
- func ListRange(start, end int) Step
- func ListRangeFrom(start int) Step
- func ListRangeStep3(start, end, step int, startOmitted, endOmitted bool) Step
- func ListWildcard() Step
- func MapIndex(k protoreflect.MapKey) Step
- func MapWildcard() Step
- func Root(md protoreflect.MessageDescriptor) Step
- func (s Step) EndOmitted() bool
- func (s Step) FieldDescriptor() protoreflect.FieldDescriptor
- func (s Step) Kind() StepKind
- func (s Step) ListIndex() int
- func (s Step) MapIndex() protoreflect.MapKey
- func (s Step) MessageDescriptor() protoreflect.MessageDescriptor
- func (s Step) Predicate() Expr
- func (s Step) ProtoStep() protopath.Step
- func (s Step) RangeEnd() int
- func (s Step) RangeOpen() bool
- func (s Step) RangeStart() int
- func (s Step) RangeStep() int
- func (s Step) StartOmitted() bool
type StepKind
type Value
- func FromProtoValue(pv protoreflect.Value) Value
- func ListVal(items []Value) Value
- func MessageVal(m protoreflect.Message) Value
- func Null() Value
- func ObjectVal(entries []ObjectEntry) Value
- func Scalar(pv protoreflect.Value) Value
- func ScalarBool(b bool) Value
- func ScalarFloat64(f float64) Value
- func ScalarInt32(n int32) Value
- func ScalarInt64(n int64) Value
- func ScalarString(s string) Value
- func (v Value) Entries() []ObjectEntry
- func (v Value) Get(fd protoreflect.FieldDescriptor) Value
- func (v Value) Index(i int) Value
- func (v Value) IsNonZero() bool
- func (v Value) IsNull() bool
- func (v Value) Kind() ValueKind
- func (v Value) Len() int
- func (v Value) List() []Value
- func (v Value) Message() protoreflect.Message
- func (v Value) ProtoValue() protoreflect.Value
- func (v Value) String() string
- func (v Value) ToProtoValue() protoreflect.Value
type ValueKind
type Values
- func PathValues(p Path, m proto.Message, opts ...PathOption) ([]Values, error)
- func (p Values) Index(i int) (out struct{ ... })
- func (p Values) Len() int
- func (p Values) ListIndices() []int
- func (p Values) String() string

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func FormatValue ¶

func FormatValue(v protoreflect.Value, fd protoreflect.FieldDescriptor) string

FormatValue returns a human-readable string representation of a protoreflect.Value.

func ParseStrptime ¶ added in v0.2.0

func ParseStrptime(format, value string) (time.Time, error)

ParseStrptime parses a date/time string using the given format.

Format auto-detection:

If the format contains a '%' character, it is interpreted as a DuckDB strptime format (e.g. "%Y-%m-%d %H:%M:%S").
Otherwise it is treated as a Go time.Parse layout (e.g. "2006-01-02 15:04:05").

Supported DuckDB format specifiers:

%Y  — 4-digit year          %y  — 2-digit year (00–99 → 2000–2099)
%m  — month (01–12)         %-m — month without leading zero
%d  — day (01–31)           %-d — day without leading zero
%H  — hour 24h (00–23)     %-H — hour without leading zero
%I  — hour 12h (01–12)     %-I — hour 12h without leading zero
%M  — minute (00–59)       %-M — minute without leading zero
%S  — second (00–59)       %-S — second without leading zero
%f  — microseconds (up to 6 digits, zero-padded right)
%p  — AM/PM
%z  — UTC offset (+HHMM / -HHMM / Z)
%Z  — timezone name (parsed but forced to UTC for portability)
%j  — day of year (001–366)
%a  — abbreviated weekday name (Mon, Tue, …) — consumed but ignored
%A  — full weekday name (Monday, Tuesday, …) — consumed but ignored
%b  — abbreviated month name (Jan, Feb, …)
%B  — full month name (January, February, …)
%%  — literal '%'
%n  — any whitespace (at least one character)
%t  — any whitespace (at least one character; same as %n)

The parser is intentionally simple and lenient: it does not validate day-of-week consistency, and %Z is treated as informational (the result is always in UTC unless %z provides an offset).
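
The auto-detection rule can be sketched with the standard library alone. The code below is not how ParseStrptime is implemented: `parseTimestamp` and `goLayout` are hypothetical, only six specifiers plus `%%` are handled, and the translation assumes the literal text in the format contains nothing that Go would read as a layout token.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// goLayout maps a small subset of the specifiers listed above to the
// corresponding Go reference-time tokens.
var goLayout = map[byte]string{
	'Y': "2006", 'm': "01", 'd': "02",
	'H': "15", 'M': "04", 'S': "05",
	'%': "%",
}

// parseTimestamp treats format as strptime-style when it contains '%' and as
// a Go layout otherwise.
func parseTimestamp(format, value string) (time.Time, error) {
	if !strings.Contains(format, "%") {
		return time.Parse(format, value) // already a Go layout
	}
	var layout strings.Builder
	for i := 0; i < len(format); i++ {
		if format[i] != '%' {
			layout.WriteByte(format[i]) // literal character
			continue
		}
		i++
		if i == len(format) {
			return time.Time{}, fmt.Errorf("dangling %% at end of format %q", format)
		}
		tok, ok := goLayout[format[i]]
		if !ok {
			return time.Time{}, fmt.Errorf("unsupported specifier %%%c", format[i])
		}
		layout.WriteString(tok)
	}
	return time.Parse(layout.String(), value)
}

func main() {
	a, err := parseTimestamp("%Y-%m-%d %H:%M:%S", "2026-04-04 12:30:00")
	fmt.Println(a, err) // 2026-04-04 12:30:00 +0000 UTC <nil>

	b, err := parseTimestamp("2006-01-02 15:04:05", "2026-04-04 12:30:00")
	fmt.Println(a.Equal(b), err) // true <nil>
}
```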

func PathValuesMulti ¶

func PathValuesMulti(md protoreflect.MessageDescriptor, m proto.Message, paths ...PlanPathSpec) ([][]Values, error)

PathValuesMulti is a convenience wrapper that compiles a Plan from the given path specs and immediately evaluates it against m. For repeated evaluation of the same paths against many messages, prefer NewPlan + Plan.Eval.

Example ¶

ExamplePathValuesMulti demonstrates the convenience wrapper for one-shot multi-path evaluation. This is useful for tests and ad-hoc extractions. For repeated evaluation of the same paths across many messages, prefer NewPlan + EvalLeaves.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	nested := dynamicpb.NewMessage(nestedMD)
	nested.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString("hello"))
	msg := dynamicpb.NewMessage(testMD)
	msg.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(nested))

	results, err := pbpath.PathValuesMulti(testMD, msg,
		pbpath.PlanPath("nested.stringfield", pbpath.Alias("greeting")),
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(results[0][0].Index(-1).Value.Interface())
}

Output:
hello

Types ¶

type EntryOption ¶ added in v0.2.0

type EntryOption func(*planEntryOpts)

EntryOption configures a single path entry inside a Plan.

func Alias ¶

func Alias(name string) EntryOption

Alias returns an EntryOption that gives this path entry a human-readable name. The alias is returned by Plan.Entries and is useful for mapping paths to output column names.

func StrictPath ¶

func StrictPath() EntryOption

StrictPath returns an EntryOption that makes this path's evaluation return an error when a range or index is clamped due to the list being shorter than the requested bound. Without StrictPath, out-of-bounds accesses are silently clamped or skipped.

Example ¶

ExampleStrictPath demonstrates using StrictPath to detect when a range or index was clamped because the list is shorter than expected. Without StrictPath, out-of-bounds accesses are silently clamped or skipped.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	plan, err := pbpath.NewPlan(testMD, nil,
		// This path expects at least 10 elements — will error if fewer exist.
		pbpath.PlanPath("repeats[0:10].nested.stringfield", pbpath.StrictPath()),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Build a message with only 2 repeats — the range [0:10] will be clamped.
	msg := dynamicpb.NewMessage(testMD)
	list := msg.Mutable(testMD.Fields().ByName("repeats")).List()
	for _, v := range []string{"a", "b"} {
		n := dynamicpb.NewMessage(nestedMD)
		n.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString(v))
		child := dynamicpb.NewMessage(testMD)
		child.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(n))
		list.Append(protoreflect.ValueOfMessage(child))
	}

	_, err = plan.Eval(msg)
	fmt.Printf("strict error: %v\n", err != nil)

}

Output:
strict error: true

func WithExpr ¶ added in v0.2.0

func WithExpr(expr Expr) EntryOption

WithExpr returns an EntryOption that attaches a composable Expr to a path entry. The expr tree's leaf PathRef nodes are resolved against the plan's trie during compilation; at evaluation time the function tree is applied to the resolved leaf values.

When WithExpr is set the path string passed to PlanPath is ignored for traversal purposes — all paths come from the Expr tree's leaves. The PlanPath path string (or Alias) is still used as the entry name.

type Expr ¶ added in v0.2.0

type Expr interface {
	// contains filtered or unexported methods
}

Expr represents a composable expression in a Plan.

An Expr tree is built from leaf PathRef nodes (referencing protobuf field paths) and interior function nodes (created via Func* constructors). The tree is validated and resolved at NewPlan time; evaluation happens inside Plan.EvalLeaves.

Expr is intentionally an opaque interface — construct instances via PathRef, FuncCoalesce, FuncAdd, and friends.

func CondWithAutoPromote ¶ added in v0.2.0

func CondWithAutoPromote(on bool, predicate, then, els Expr) Expr

CondWithAutoPromote returns a FuncCond with an explicit auto-promote override. When on is true and the then/else branches have different output kinds, the result is promoted to the wider type.

func FilterPathRef ¶ added in v0.3.0

func FilterPathRef(relPath string, fields ...protoreflect.FieldDescriptor) Expr

FilterPathRef creates a leaf Expr referencing a field path relative to the current message cursor. Used inside filter predicates. The fields parameter is the resolved chain of field descriptors from the cursor message to the target field.

func FuncAbs ¶ added in v0.2.0

func FuncAbs(child Expr) Expr

FuncAbs returns the absolute value of a numeric child. Preserves int vs float kind.

func FuncAdd ¶ added in v0.2.0

func FuncAdd(a, b Expr) Expr

FuncAdd returns the sum of two numeric children (a + b). Go-style type promotion: mixed int/float promotes to float.

func FuncAge ¶ added in v0.2.0

func FuncAge(children ...Expr) Expr

FuncAge computes the duration in milliseconds between timestamps.

1 argument: now − child (age of the timestamp).
2 arguments: child[0] − child[1] (difference between two timestamps).

Children must be Int64 Unix-millisecond values (e.g. from Strptime). Output kind: Int64Kind.

func FuncAnd ¶ added in v0.3.0

func FuncAnd(a, b Expr) Expr

FuncAnd returns the logical AND of two boolean children. Both children should evaluate to boolean-like values; the result is true only when both are truthy (non-null, non-zero). Output kind: BoolKind.

func FuncBucket ¶ added in v0.2.0

func FuncBucket(child Expr, size int) Expr

FuncBucket floors the integer child value to the nearest multiple of size. result = value − value%size. Output kind: pass-through.
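
The formula above can be tried in plain Go. This is a minimal sketch that takes `value − value%size` literally; `bucket` is a hypothetical helper, not part of pbpath, and returning the value unchanged for a zero size is an assumption. Note that Go's `%` truncates toward zero, so under this literal reading a negative value moves toward zero rather than toward negative infinity.

```go
package main

import "fmt"

// bucket applies the documented formula value - value%size.
// Hypothetical helper for illustration; a zero size is passed through
// unchanged, which is an assumption of this sketch.
func bucket(value, size int64) int64 {
	if size == 0 {
		return value
	}
	return value - value%size
}

func main() {
	fmt.Println(bucket(17, 5)) // 15
	fmt.Println(bucket(20, 5)) // 20
	fmt.Println(bucket(-7, 5)) // -5, because -7 % 5 is -2 in Go
}
```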

func FuncCastFloat ¶ added in v0.2.0

func FuncCastFloat(child Expr) Expr

FuncCastFloat casts the child to Float64 (Double). int→float64, string→float64 (parse), bool→0.0/1.0. Output kind: DoubleKind.

func FuncCastInt ¶ added in v0.2.0

func FuncCastInt(child Expr) Expr

FuncCastInt casts the child to Int64. float→int64 (truncate), string→int64 (parse), bool→0/1. Output kind: Int64Kind.

func FuncCastString ¶ added in v0.2.0

func FuncCastString(child Expr) Expr

FuncCastString casts the child to String using [valueToString]. Output kind: StringKind.

func FuncCeil ¶ added in v0.2.0

func FuncCeil(child Expr) Expr

FuncCeil returns the ceiling of a float child. No-op for integers. Preserves int vs float kind.

func FuncCoalesce ¶ added in v0.2.0

func FuncCoalesce(children ...Expr) Expr

FuncCoalesce returns the first non-zero child value. All children must resolve to the same protoreflect.Kind.
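
The "first non-zero value" rule can be sketched with a generic Go function. `coalesce` is a hypothetical stand-in, not the library's implementation; the single type parameter mirrors the requirement that all children share one kind.

```go
package main

import "fmt"

// coalesce returns the first argument that differs from the zero value of T.
// When every argument is zero (or none are given) it returns the zero value.
func coalesce[T comparable](vals ...T) T {
	var zero T
	for _, v := range vals {
		if v != zero {
			return v
		}
	}
	return zero
}

func main() {
	fmt.Println(coalesce("", "", "fallback")) // fallback
	fmt.Println(coalesce(0, 7, 9))            // 7
}
```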

func FuncCoerce ¶ added in v0.2.0

func FuncCoerce(child Expr, ifTrue, ifFalse Value) Expr

FuncCoerce maps a boolean child to one of two literal values. If the child is non-zero (true), ifTrue is returned; otherwise ifFalse. Output kind: StringKind.

func FuncConcat ¶ added in v0.2.0

func FuncConcat(sep string, children ...Expr) Expr

FuncConcat joins the string representations of all children with sep. Output kind: StringKind.

func FuncCond ¶ added in v0.2.0

func FuncCond(predicate, then, els Expr) Expr

FuncCond evaluates predicate (child 0); if its value is non-zero, returns child 1 (then), otherwise child 2 (else). Use CondWithAutoPromote to override the plan-level auto-promote setting.

func FuncDatePart ¶ added in v0.2.0

func FuncDatePart(part string, child Expr) Expr

FuncDatePart extracts a calendar component from a Unix-epoch-second timestamp. Supported parts: "year", "month", "day", "hour", "minute", "second". The part name is stored in the separator field. Output kind: Int64Kind.

func FuncDefault ¶ added in v0.2.0

func FuncDefault(child Expr, literal Value) Expr

FuncDefault returns the child value if non-zero, otherwise the literal.

func FuncDistinct ¶ added in v0.2.0

func FuncDistinct(child Expr) Expr

FuncDistinct is an aggregate that counts the number of distinct values across all fan-out branches of the child. Output kind: Int64Kind.
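
As a rough illustration of what "distinct across fan-out branches" means, the sketch below counts unique strings in a slice that stands in for the branch values. `distinctCount` is a hypothetical helper and the string representation of branches is an assumption.

```go
package main

import "fmt"

// distinctCount returns the number of unique values among the branches.
func distinctCount(branches []string) int64 {
	seen := make(map[string]struct{}, len(branches))
	for _, b := range branches {
		seen[b] = struct{}{}
	}
	return int64(len(seen))
}

func main() {
	// Four branches, three distinct values.
	fmt.Println(distinctCount([]string{"US", "CA", "US", "FR"})) // 3
}
```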

func FuncDiv ¶ added in v0.2.0

func FuncDiv(a, b Expr) Expr

FuncDiv returns the quotient of two numeric children (a / b). Integer division truncates toward zero. Division by zero returns zero.
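
The integer behaviour described above (truncation toward zero, zero on a zero divisor) matches what a guarded Go division does. A minimal sketch, with `safeDiv` as a hypothetical name:

```go
package main

import "fmt"

// safeDiv divides a by b, returning 0 instead of panicking when b is 0.
// Go integer division already truncates toward zero.
func safeDiv(a, b int64) int64 {
	if b == 0 {
		return 0
	}
	return a / b
}

func main() {
	fmt.Println(safeDiv(7, 2))  // 3
	fmt.Println(safeDiv(-7, 2)) // -3
	fmt.Println(safeDiv(7, 0))  // 0
}
```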

func FuncEnumName ¶ added in v0.2.0

func FuncEnumName(child Expr) Expr

FuncEnumName maps an enum-typed field to its string name. The protoreflect.EnumDescriptor is resolved at NewPlan time by inspecting the child leaf's terminal field descriptor. Output kind: StringKind.

func FuncEpochToDate ¶ added in v0.2.0

func FuncEpochToDate(child Expr) Expr

FuncEpochToDate converts a Unix-epoch-second timestamp (Int64) to a day-offset (Int32) by dividing by 86 400. Useful for date-only columns. Output kind: Int32Kind. TODO: ideally maps to Arrow Date32 in typemap.go — deferred.

func FuncEq ¶ added in v0.2.0

func FuncEq(a, b Expr) Expr

FuncEq returns a Bool indicating whether a == b. Numeric operands use Go-style int→float promotion; strings use lexicographic comparison.

func FuncExtractDay ¶ added in v0.2.0

func FuncExtractDay(child Expr) Expr

FuncExtractDay extracts the day of month (1–31) from a Unix-millisecond timestamp. Output kind: Int64Kind.

func FuncExtractHour ¶ added in v0.2.0

func FuncExtractHour(child Expr) Expr

FuncExtractHour extracts the hour (0–23) from a Unix-millisecond timestamp. Output kind: Int64Kind.

func FuncExtractMinute ¶ added in v0.2.0

func FuncExtractMinute(child Expr) Expr

FuncExtractMinute extracts the minute (0–59) from a Unix-millisecond timestamp. Output kind: Int64Kind.

func FuncExtractMonth ¶ added in v0.2.0

func FuncExtractMonth(child Expr) Expr

FuncExtractMonth extracts the month (1–12) from a Unix-millisecond timestamp. Output kind: Int64Kind.

func FuncExtractSecond ¶ added in v0.2.0

func FuncExtractSecond(child Expr) Expr

FuncExtractSecond extracts the second (0–59) from a Unix-millisecond timestamp. Output kind: Int64Kind.

func FuncExtractYear ¶ added in v0.2.0

func FuncExtractYear(child Expr) Expr

FuncExtractYear extracts the year from a Unix-millisecond timestamp (Int64). Output kind: Int64Kind.

func FuncFloor ¶ added in v0.2.0

func FuncFloor(child Expr) Expr

FuncFloor returns the floor of a float child. No-op for integers. Preserves int vs float kind.

func FuncGe ¶ added in v0.2.0

func FuncGe(a, b Expr) Expr

FuncGe returns a Bool indicating whether a >= b.

func FuncGt ¶ added in v0.2.0

func FuncGt(a, b Expr) Expr

FuncGt returns a Bool indicating whether a > b.

func FuncHas ¶ added in v0.2.0

func FuncHas(child Expr) Expr

FuncHas returns a Bool indicating whether the child path is set (non-zero). Output kind: BoolKind.

func FuncHash ¶ added in v0.2.0

func FuncHash(children ...Expr) Expr

FuncHash returns an FNV-1a 64-bit hash of the child's string representation. Output kind: Int64Kind.
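
FNV-1a 64-bit is available in Go's `hash/fnv` package. The sketch below hashes a string and reinterprets the unsigned result as Int64, so the value may be negative; that reinterpretation and the helper name `hashString` are assumptions about how an unsigned hash ends up in an Int64 column.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashString returns the FNV-1a 64-bit hash of s as a signed integer.
func hashString(s string) int64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return int64(h.Sum64())
}

func main() {
	// The hash is deterministic: equal inputs give equal outputs.
	fmt.Println(hashString("US") == hashString("US")) // true
	fmt.Println(hashString("US"))
}
```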

func FuncLe ¶ added in v0.2.0

func FuncLe(a, b Expr) Expr

FuncLe returns a Bool indicating whether a <= b.

func FuncLen ¶ added in v0.2.0

func FuncLen(child Expr) Expr

FuncLen returns the length of a repeated/map/bytes/string field as Int64. Output kind: Int64Kind.

func FuncListConcat ¶ added in v0.2.0

func FuncListConcat(child Expr, sep string) Expr

FuncListConcat is an aggregate that joins the string representation of all fan-out branches of the child with the given separator. Output kind: StringKind.

func FuncLower ¶ added in v0.2.0

func FuncLower(child Expr) Expr

FuncLower returns the string value converted to lower case. Output kind: StringKind.

func FuncLt ¶ added in v0.2.0

func FuncLt(a, b Expr) Expr

FuncLt returns a Bool indicating whether a < b.

func FuncMask ¶ added in v0.2.0

func FuncMask(child Expr, keepFirst, keepLast int, maskChar string) Expr

FuncMask redacts the interior of a string, keeping keepFirst leading characters and keepLast trailing characters. The interior is replaced with repetitions of maskChar (default "*"). Output kind: StringKind.
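
A minimal sketch of the masking rule, counting characters as runes. `mask` is a hypothetical helper; returning the input unchanged when keepFirst + keepLast covers the whole string is an assumption, since the documentation does not describe that case.

```go
package main

import (
	"fmt"
	"strings"
)

// mask keeps the first keepFirst and last keepLast runes of s and replaces
// everything in between with maskChar (default "*").
func mask(s string, keepFirst, keepLast int, maskChar string) string {
	if maskChar == "" {
		maskChar = "*"
	}
	if keepFirst < 0 {
		keepFirst = 0
	}
	if keepLast < 0 {
		keepLast = 0
	}
	r := []rune(s)
	if keepFirst+keepLast >= len(r) {
		return s // nothing left to redact (assumed behaviour)
	}
	interior := len(r) - keepFirst - keepLast
	return string(r[:keepFirst]) + strings.Repeat(maskChar, interior) + string(r[len(r)-keepLast:])
}

func main() {
	fmt.Println(mask("4111111111111111", 4, 4, "")) // 4111********1111
	fmt.Println(mask("héllo", 1, 1, "#"))           // h###o
}
```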

func FuncMax ¶ added in v0.2.0

func FuncMax(a, b Expr) Expr

FuncMax returns the larger of two numeric children. Go-style int→float promotion for mixed types.

func FuncMin ¶ added in v0.2.0

func FuncMin(a, b Expr) Expr

FuncMin returns the smaller of two numeric children. Go-style int→float promotion for mixed types.

func FuncMod ¶ added in v0.2.0

func FuncMod(a, b Expr) Expr

FuncMod returns the remainder of two integer children (a % b). Mod by zero returns zero.

func FuncMul ¶ added in v0.2.0

func FuncMul(a, b Expr) Expr

FuncMul returns the product of two numeric children (a * b).

func FuncNe ¶ added in v0.2.0

func FuncNe(a, b Expr) Expr

FuncNe returns a Bool indicating whether a != b.

func FuncNot ¶ added in v0.3.0

func FuncNot(child Expr) Expr

FuncNot returns the logical NOT of a boolean child. The result is true when the child is falsy (null, zero, false, empty). Output kind: BoolKind.

func FuncOr ¶ added in v0.3.0

func FuncOr(a, b Expr) Expr

FuncOr returns the logical OR of two boolean children. The result is true when at least one child is truthy. Output kind: BoolKind.

func FuncRound ¶ added in v0.2.0

func FuncRound(child Expr) Expr

FuncRound returns the nearest integer value of a float child (banker's rounding). No-op for integers. Preserves int vs float kind.
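
Banker's rounding means ties go to the nearest even integer. Go's standard library exposes this as `math.RoundToEven`, shown below next to `math.Round` (ties away from zero) for contrast. Whether pbpath calls `math.RoundToEven` internally is not stated; `roundHalfEven` is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"math"
)

// roundHalfEven rounds to the nearest integer, with ties going to even.
func roundHalfEven(v float64) float64 {
	return math.RoundToEven(v)
}

func main() {
	for _, v := range []float64{0.5, 1.5, 2.5, -2.5, 2.4} {
		fmt.Println(v, "->", roundHalfEven(v), "vs", math.Round(v))
	}
	// 0.5 -> 0 vs 1
	// 1.5 -> 2 vs 2
	// 2.5 -> 2 vs 3
	// -2.5 -> -2 vs -3
	// 2.4 -> 2 vs 2
}
```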

func FuncSelect ¶ added in v0.3.0

func FuncSelect(predicate, input Expr) Expr

FuncSelect evaluates a predicate child and, if truthy, returns the value of the second child (the "input" being filtered). If the predicate is falsy, returns null. This gives jq-style `| select(pred)` semantics.

Usage: FuncSelect(predicate, inputPathRef)

The first child is the boolean predicate; the second is the value to pass through (typically the same path being filtered). Output kind: pass-through (same as the input child).

func FuncStrptime ¶ added in v0.2.0

func FuncStrptime(format string, child Expr) Expr

FuncStrptime parses a string child into a Unix-millisecond timestamp (Int64) using the given format. If the format contains '%' specifiers, it is interpreted as a DuckDB strptime format; otherwise it is treated as a Go time.Parse layout.

Returns an invalid value on parse failure. The format is stored in the separator field. Output kind: Int64Kind.

func FuncSub ¶ added in v0.2.0

func FuncSub(a, b Expr) Expr

FuncSub returns the difference of two numeric children (a - b).

func FuncSum ¶ added in v0.2.0

func FuncSum(child Expr) Expr

FuncSum is an aggregate that sums all fan-out branches of the child. The result is the same for every branch. Output kind: pass-through.

func FuncTrim ¶ added in v0.2.0

func FuncTrim(child Expr) Expr

FuncTrim returns the string with leading and trailing whitespace removed. Output kind: StringKind.

func FuncTrimPrefix ¶ added in v0.2.0

func FuncTrimPrefix(child Expr, prefix string) Expr

FuncTrimPrefix returns the string with the given prefix removed (if present). The prefix is stored as the separator field. Output kind: StringKind.

func FuncTrimSuffix ¶ added in v0.2.0

func FuncTrimSuffix(child Expr, suffix string) Expr

FuncTrimSuffix returns the string with the given suffix removed (if present). The suffix is stored as the separator field. Output kind: StringKind.

func FuncTryStrptime ¶ added in v0.2.0

func FuncTryStrptime(format string, child Expr) Expr

FuncTryStrptime is like FuncStrptime but returns zero (epoch) instead of an invalid value on parse failure. Output kind: Int64Kind.

func FuncUpper ¶ added in v0.2.0

func FuncUpper(child Expr) Expr

FuncUpper returns the string value converted to upper case. Output kind: StringKind.

func Literal ¶ added in v0.3.0

func Literal(val Value, kind protoreflect.Kind) Expr

Literal creates a leaf Expr that always returns val. The kind parameter determines the output kind for Arrow type inference; pass 0 for automatic (inferred from val's protoreflect.Value).

func PathRef ¶ added in v0.2.0

func PathRef(path string) Expr

PathRef creates a leaf Expr referencing a protobuf field path. The path is parsed and validated when the enclosing PlanPathSpec is compiled by NewPlan.

type ObjectEntry ¶ added in v0.3.0

type ObjectEntry struct {
	Key   string
	Value Value
}

ObjectEntry is a single key-value pair in an ObjectKind Value.

type Path ¶

type Path []Step

Path is a sequence of protobuf reflection steps applied to some root protobuf message value to arrive at the current value. The first step must be a Root step.

func ParsePath ¶

func ParsePath(md protoreflect.MessageDescriptor, path string) (Path, error)

ParsePath translates a human-readable representation of a path into a Path.

An empty path is an empty string.
A field access step is path '.' identifier
A map index step is path '[' natural ']'
A list index step is path '[' integer ']' (negative indices allowed)
A list range step is path '[' start ':' end ']' or '[' start ':' ']'
A list slice step is path '[' start ':' end ':' step ']' (Python-style slice)

Any of start, end, step may be omitted: [::2], [1::], [::-1], [::]
step=0 is an error.
Both [:] and [::] produce a wildcard.

A list wildcard step is path '[' '*' ']' or '[' ':' ']' or '[' '::' ']'
A root step is '(' msg.Descriptor().String() ')'

If the path does not start with '(' then the root step is implicitly for the given message. The parser is "type aware" to distinguish lists and maps keyed by numbers.

Example ¶

ExampleParsePath demonstrates parsing a path string against a message descriptor. The returned Path is a slice of Step values that can be inspected or passed to PathValues for traversal.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, _ := exampleDescriptors()

	// Parse a simple field access path.
	path, err := pbpath.ParsePath(testMD, "nested.stringfield")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("steps: %d\n", len(path))
	fmt.Printf("path:  %s\n", path)

	// Parse a wildcard path — fans out across all elements of "repeats".
	path2, err := pbpath.ParsePath(testMD, "repeats[*].nested.stringfield")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("fan-out path: %s\n", path2)

}

Output:
steps: 3
path:  (example.Test).nested.stringfield
fan-out path: (example.Test).repeats[*].nested.stringfield

func (Path) Index ¶

func (p Path) Index(i int) Step

Index returns the ith step in the path and supports negative indexing. A negative index starts counting from the tail of the Path such that -1 refers to the last step, -2 refers to the second-to-last step, and so on. It returns a zero Step value if the index is out-of-bounds.
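
The negative-indexing rule can be sketched on an ordinary slice. `at` is a hypothetical generic helper and the step strings are placeholders for real Step values.

```go
package main

import "fmt"

// at returns the i-th element of s. Negative indices count from the tail,
// and an out-of-bounds index yields the zero value of T.
func at[T any](s []T, i int) T {
	if i < 0 {
		i += len(s)
	}
	if i < 0 || i >= len(s) {
		var zero T
		return zero
	}
	return s[i]
}

func main() {
	steps := []string{"(example.Test)", ".nested", ".stringfield"}
	fmt.Println(at(steps, -1))       // .stringfield
	fmt.Println(at(steps, -3))       // (example.Test)
	fmt.Printf("%q\n", at(steps, 5)) // ""
}
```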

func (Path) String ¶

func (p Path) String() string

String returns a structured representation of the path by concatenating the string representation of every path step.

type PathOption ¶

type PathOption func(*pathOpts)

PathOption configures the behaviour of PathValues.

func Strict ¶

func Strict() PathOption

Strict returns a PathOption that makes PathValues return an error when a negative index or range bound resolves to an out-of-bounds position. Without Strict, out-of-bounds accesses are silently clamped or skipped.

type PipeContext ¶ added in v0.3.0

type PipeContext struct {
	// contains filtered or unexported fields
}

PipeContext carries execution state through a pipeline. It holds the root message descriptor for schema-aware operations and variable bindings from `as $name` expressions.

type PipeExpr ¶ added in v0.3.0

type PipeExpr interface {
	// contains filtered or unexported methods
}

PipeExpr is a node in a pipeline expression tree. Each node receives one input Value and produces zero or more output Values.

Implementations include path access, iteration, collection, comparisons, boolean logic, built-in functions, and literal constants.

type Pipeline ¶ added in v0.3.0

type Pipeline struct {
	// contains filtered or unexported fields
}

Pipeline is a compiled sequence of pipe-separated expressions. Each expression receives every value from the current stream and produces zero or more output values for the next expression.

Create a Pipeline via ParsePipeline.

func ParsePipeline ¶ added in v0.3.0

func ParsePipeline(md protoreflect.MessageDescriptor, input string) (*Pipeline, error)

ParsePipeline parses a jq-style pipeline string against a message descriptor and returns a compiled Pipeline ready for execution.

Grammar:

pipeline       = comma_expr [ "as" "$" ident "|" pipeline ] { "|" comma_expr [ "as" "$" ident "|" pipeline ] }
comma_expr     = alt_expr { "," alt_expr }
alt_expr       = or_expr { "//" or_expr }
or_expr        = and_expr { "or" and_expr }
and_expr       = compare_expr { "and" compare_expr }
compare_expr   = add_expr [ ("==" | "!=" | "<" | "<=" | ">" | ">=") add_expr ]
add_expr       = mul_expr { ("+" | "-") mul_expr }
mul_expr       = postfix_expr { ("*" | "/" | "%") postfix_expr }
postfix_expr   = primary { suffix } [ "?" ]
suffix         = "." ident           // field access
               | "[" "]"             // iterate
               | "[" integer "]"     // index
primary        = "."                 // identity (or start of path/iterate/index)
               | "." ident           // field access on input
               | "." "[" "]"         // iterate on input
               | "." "[" int "]"     // index on input
               | "[" pipeline "]"    // collect
               | "(" pipeline ")"    // grouping
               | ident [ "(" pipeline { ";" pipeline } ")" ]  // function call
               | "$" ident           // variable reference
               | "-" primary         // unary negation
               | "!" primary         // unary not
               | "@" ident           // format string (@base64, @csv, @text, etc.)
               | "if" expr "then" pipeline {"elif" expr "then" pipeline} ["else" pipeline] "end"
               | "try" primary ["catch" primary]
               | "reduce" postfix_expr "as" "$" ident "(" pipeline ";" pipeline ")"
               | "foreach" postfix_expr "as" "$" ident "(" pipeline ";" pipeline [";" pipeline] ")"
               | "label" "$" ident "|" pipeline
               | "break" "$" ident
               | "def" ident [ "(" ident { ";" ident } ")" ] ":" pipeline ";" pipeline  // user-defined function
               | "{" [ obj_entry { "," obj_entry } [","] ] "}"  // object construction
               | string_interp       // string interpolation
               | literal
string_interp  = strbegin { pipeline strmid } pipeline strend  // "text \(expr) text"
obj_entry      = ident ":" alt_expr                       // static key
               | string ":" alt_expr                      // string key
               | "(" pipeline ")" ":" alt_expr            // dynamic key
               | ident                                    // shorthand for ident: .ident
literal        = string | integer | float | "true" | "false" | "null"

func (*Pipeline) Exec ¶ added in v0.3.0

func (p *Pipeline) Exec(input []Value) ([]Value, error)

Exec runs the pipeline against the given input stream. Each expression is applied to every value in the current stream; results are concatenated to form the input for the next expression.

func (*Pipeline) ExecMessage ¶ added in v0.3.0

func (p *Pipeline) ExecMessage(msg protoreflect.Message) ([]Value, error)

ExecMessage runs the pipeline against a protobuf message.

func (*Pipeline) ExecOne ¶ added in v0.3.0

func (p *Pipeline) ExecOne(input Value) ([]Value, error)

ExecOne runs the pipeline with a single input Value.

type Plan ¶

type Plan struct {
	// contains filtered or unexported fields
}

Plan is an immutable, pre-compiled bundle of Path values ready for repeated evaluation against messages of a single type.

Paths that share a common prefix are traversed once through the shared segment, then forked — giving an O(1) cost per shared step rather than O(P) where P is the number of paths.
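
Prefix sharing can be pictured as inserting each path into a trie keyed by step. The sketch below is a simplified model, not the Plan's actual data structure: it splits paths on "." and counts how many new nodes each insertion creates. The field name "other" is hypothetical and does not exist in the example schema.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one step in the trie; children are keyed by step text.
type node struct {
	children map[string]*node
}

func newNode() *node { return &node{children: map[string]*node{}} }

// insert adds a dot-separated path and reports how many nodes it created.
// Steps already present from earlier paths are reused.
func (n *node) insert(path string) int {
	created := 0
	cur := n
	for _, step := range strings.Split(path, ".") {
		next, ok := cur.children[step]
		if !ok {
			next = newNode()
			cur.children[step] = next
			created++
		}
		cur = next
	}
	return created
}

func main() {
	root := newNode()
	fmt.Println(root.insert("repeats[*].nested.stringfield")) // 3
	fmt.Println(root.insert("repeats[*].nested.other"))       // 1, only the last step is new
	fmt.Println(root.insert("nested.stringfield"))            // 2
}
```

In this model the second path reuses two of its three steps, which is the saving the Plan obtains when it walks a shared segment once and then forks.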

Plan.Eval and Plan.EvalLeavesConcurrent are safe for concurrent use by multiple goroutines.

Plan.EvalLeaves reuses internal scratch buffers and is NOT safe for concurrent use; it is the preferred method when called from a single goroutine (e.g. inside [Transcoder.AppendDenorm]).

func NewPlan ¶

func NewPlan(md protoreflect.MessageDescriptor, opts []PlanOption, paths ...PlanPathSpec) (*Plan, error)

NewPlan compiles one or more path strings against md into an immutable Plan. Parsing and trie construction happen once; Plan.Eval is the hot path.

opts may be nil when no plan-level options are needed. All paths must be rooted at the same message descriptor md. Returns an error that bundles all parse failures.

Example ¶

ExampleNewPlan demonstrates compiling multiple paths into a Plan for repeated evaluation against messages of the same type.

The Plan API is the recommended approach for hot paths because:

Paths sharing a common prefix are traversed only once (trie merging).
EvalLeaves reuses scratch buffers, eliminating per-call allocations.
The Plan is immutable and safe for concurrent use (with EvalLeavesConcurrent).

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	// Compile two paths. Both are rooted at Test, so the Plan walks the shared
	// root step once and then forks into "repeats[*]" and "nested".
	plan, err := pbpath.NewPlan(testMD, nil,
		pbpath.PlanPath("repeats[*].nested.stringfield", pbpath.Alias("name")),
		pbpath.PlanPath("nested.stringfield", pbpath.Alias("top")),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Build a message:
	//   Test { nested: { stringfield: "root" }, repeats: [
	//     Test{ nested: { stringfield: "a" } },
	//     Test{ nested: { stringfield: "b" } },
	//   ]}
	nested := dynamicpb.NewMessage(nestedMD)
	nested.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString("root"))
	msg := dynamicpb.NewMessage(testMD)
	msg.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(nested))
	list := msg.Mutable(testMD.Fields().ByName("repeats")).List()
	for _, v := range []string{"a", "b"} {
		n := dynamicpb.NewMessage(nestedMD)
		n.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString(v))
		child := dynamicpb.NewMessage(testMD)
		child.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(n))
		list.Append(protoreflect.ValueOfMessage(child))
	}

	// Evaluate.
	results, err := plan.Eval(msg)
	if err != nil {
		log.Fatal(err)
	}

	// results[0] = "name" path (repeats[*].nested.stringfield) → 2 branches
	fmt.Printf("name: %d branches\n", len(results[0]))
	for _, v := range results[0] {
		fmt.Printf("  %v\n", v.Index(-1).Value.Interface())
	}
	// results[1] = "top" path (nested.stringfield) → 1 branch
	fmt.Printf("top: %v\n", results[1][0].Index(-1).Value.Interface())

}

Output:
name: 2 branches
  a
  b
top: root

func (*Plan) Clone ¶ added in v0.3.0

func (p *Plan) Clone() *Plan

Clone returns a shallow copy of p with its own scratch buffer. The trie, entries, and schema are shared (all immutable after construction); only the mutable [scratch] field is reset so that the clone's Plan.EvalLeaves calls do not race with the original or other clones.

Use Clone when creating independent workers (e.g. [Transcoder.Clone]) that each need to call Plan.EvalLeaves without synchronisation.

func (*Plan) Entries ¶

func (p *Plan) Entries() []PlanEntry

Entries returns metadata for each compiled path, in the order they were provided to NewPlan.

Example ¶

ExamplePlan_Entries shows how to inspect a Plan's compiled entries. This is useful for mapping result slots to output column names — each entry's Name is the alias (if provided) or the raw path string.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, _ := exampleDescriptors()

	plan, err := pbpath.NewPlan(testMD, nil,
		pbpath.PlanPath("nested.stringfield", pbpath.Alias("col_a")),
		pbpath.PlanPath("repeats[*].nested.stringfield"),
	)
	if err != nil {
		log.Fatal(err)
	}

	for _, e := range plan.Entries() {
		fmt.Printf("%-30s  path: %s\n", e.Name, e.Path)
	}
}

Output:
col_a                           path: (example.Test).nested.stringfield
repeats[*].nested.stringfield   path: (example.Test).repeats[*].nested.stringfield

func (*Plan) Eval ¶

func (p *Plan) Eval(m proto.Message) ([][]Values, error)

Eval traverses m along all compiled paths simultaneously. Paths sharing a prefix are traversed once through the shared segment.

Returns a slice of []Values indexed by entry position (matching the order paths were provided to NewPlan). Each []Values may contain multiple entries when the path fans out via wildcards, ranges, or slices.

Per-path StrictPath options are checked at leaf nodes: if any branch was clamped during traversal and the path is strict, an error is returned.

Example ¶

ExamplePlan_Eval shows how fan-out paths produce multiple Values branches. The [0:2] range slice selects the first two elements of the repeated field, even though three elements exist.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	plan, err := pbpath.NewPlan(testMD, nil,
		pbpath.PlanPath("repeats[0:2].nested.stringfield"),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Three-element repeated field — the [0:2] slice selects the first two.
	msg := dynamicpb.NewMessage(testMD)
	list := msg.Mutable(testMD.Fields().ByName("repeats")).List()
	for _, v := range []string{"x", "y", "z"} {
		n := dynamicpb.NewMessage(nestedMD)
		n.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString(v))
		child := dynamicpb.NewMessage(testMD)
		child.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(n))
		list.Append(protoreflect.ValueOfMessage(child))
	}

	results, err := plan.Eval(msg)
	if err != nil {
		log.Fatal(err)
	}

	for _, v := range results[0] {
		fmt.Println(v.Index(-1).Value.Interface())
	}
}

Output:
x
y
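
The shared-prefix traversal that Eval describes can be pictured with a toy trie. The sketch below is not pbpath's implementation: it models a message as nested map[string]any, supports only dotted field names, and every identifier in it (node, buildTrie, eval) is hypothetical. It shows three paths with a common prefix costing 5 field lookups, where three independent walks would need 8.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one segment in a toy path trie. leaves holds the positions of the
// paths that end at this node.
type node struct {
	children map[string]*node
	order    []string // child names in insertion order, for stable traversal
	leaves   []int
}

func newNode() *node { return &node{children: map[string]*node{}} }

// buildTrie inserts each dotted path; paths with a common prefix share nodes.
func buildTrie(paths []string) *node {
	root := newNode()
	for i, p := range paths {
		cur := root
		for _, seg := range strings.Split(p, ".") {
			next, ok := cur.children[seg]
			if !ok {
				next = newNode()
				cur.children[seg] = next
				cur.order = append(cur.order, seg)
			}
			cur = next
		}
		cur.leaves = append(cur.leaves, i)
	}
	return root
}

// eval walks data along the trie and stores each leaf value in out at the
// position of its path. It returns the number of field lookups performed.
func eval(n *node, data any, out []any) int {
	for _, i := range n.leaves {
		out[i] = data
	}
	m, ok := data.(map[string]any)
	if !ok {
		return 0
	}
	lookups := 0
	for _, seg := range n.order {
		child, present := m[seg]
		lookups++
		if present {
			lookups += eval(n.children[seg], child, out)
		}
	}
	return lookups
}

func main() {
	paths := []string{"device.geo.country", "device.geo.city", "device.id"}
	data := map[string]any{
		"device": map[string]any{
			"id":  "d-1",
			"geo": map[string]any{"country": "US", "city": "Austin"},
		},
	}
	out := make([]any, len(paths))
	lookups := eval(buildTrie(paths), data, out)
	fmt.Println(out...)  // US Austin d-1
	fmt.Println(lookups) // 5
}
```

Results land in out by original path position, which mirrors how Eval indexes its return value by entry order.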

func (*Plan) EvalLeaves ¶ added in v0.2.0

func (p *Plan) EvalLeaves(m proto.Message) ([][]Value, error)

EvalLeaves traverses m along all compiled paths simultaneously, returning only the leaf (last) value for each branch rather than the full path/values chain.

This is significantly cheaper than Plan.Eval because it avoids clonePath/cloneValues allocations entirely, tracking only a single cursor value per branch per trie step.

The returned slice is indexed by entry position (matching NewPlan order). Each inner slice contains one Value per fan-out branch. An empty inner slice means the path produced no values for this message, comparable to a NULL produced by a left join.

EvalLeaves reuses internal scratch buffers and is NOT safe for concurrent use. Use Plan.EvalLeavesConcurrent when calling from multiple goroutines.

Example ¶

ExamplePlan_EvalLeaves demonstrates the high-performance EvalLeaves method, which returns only leaf values (not full path chains) and reuses internal scratch buffers to minimize allocations.

Use EvalLeaves for hot paths where you process thousands of messages per second. For concurrent access from multiple goroutines, use EvalLeavesConcurrent instead.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	plan, err := pbpath.NewPlan(testMD, nil,
		pbpath.PlanPath("nested.stringfield", pbpath.Alias("top_name")),
		pbpath.PlanPath("repeats[*].nested.stringfield", pbpath.Alias("child_name")),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Build a message with nested value and two repeats.
	nested := dynamicpb.NewMessage(nestedMD)
	nested.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString("root"))
	msg := dynamicpb.NewMessage(testMD)
	msg.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(nested))
	list := msg.Mutable(testMD.Fields().ByName("repeats")).List()
	for _, v := range []string{"x", "y"} {
		n := dynamicpb.NewMessage(nestedMD)
		n.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString(v))
		child := dynamicpb.NewMessage(testMD)
		child.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(n))
		list.Append(protoreflect.ValueOfMessage(child))
	}

	// EvalLeaves returns [][]pbpath.Value: just the leaf values, with no full
	// path chains. Internal scratch buffers are reused to minimize allocations.
	leaves, err := plan.EvalLeaves(msg)
	if err != nil {
		log.Fatal(err)
	}

	// leaves[0] = top_name → 1 value
	fmt.Printf("top_name: %s\n", leaves[0][0].ToProtoValue().String())

	// leaves[1] = child_name → 2 values (one per repeat)
	fmt.Printf("child_names: %d values\n", len(leaves[1]))
	for _, v := range leaves[1] {
		fmt.Printf("  %s\n", v.ToProtoValue().String())
	}

}

Output:
top_name: root
child_names: 2 values
  x
  y

func (*Plan) EvalLeavesConcurrent ¶ added in v0.2.0

func (p *Plan) EvalLeavesConcurrent(m proto.Message) ([][]Value, error)

EvalLeavesConcurrent is like Plan.EvalLeaves but allocates fresh buffers per call, making it safe for concurrent use by multiple goroutines.
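
The trade-off between EvalLeaves and EvalLeavesConcurrent follows a common Go pattern, sketched below with a hypothetical evaluator type. This is not pbpath code; evalReuse and evalFresh are made-up names standing in for a buffer-reusing call and a fresh-allocation call.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// evaluator is a hypothetical stand-in for a compiled plan.
type evaluator struct {
	scratch []string // reused by evalReuse; never touched by evalFresh
}

// evalReuse truncates and refills the shared scratch buffer. The returned
// slice aliases that buffer, so it is only valid until the next call and
// must not be used from more than one goroutine at a time.
func (e *evaluator) evalReuse(input string) []string {
	e.scratch = e.scratch[:0]
	e.scratch = append(e.scratch, strings.Fields(input)...)
	return e.scratch
}

// evalFresh allocates a new slice per call and writes no shared state,
// which is what makes it safe to call concurrently.
func (e *evaluator) evalFresh(input string) []string {
	return append([]string(nil), strings.Fields(input)...)
}

func main() {
	e := &evaluator{}

	first := e.evalReuse("x y")
	_ = e.evalReuse("p q")
	fmt.Println(first) // [p q]: the earlier result was overwritten in place

	inputs := []string{"a b", "c d", "e f"}
	results := make([][]string, len(inputs))
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			results[i] = e.evalFresh(in) // each goroutine writes only its own slot
		}(i, in)
	}
	wg.Wait()
	fmt.Println(results) // [[a b] [c d] [e f]]
}
```

Whether the slices returned by Plan.EvalLeaves alias its scratch buffers is not stated above. If results must outlive the next call, copying them or using EvalLeavesConcurrent is the conservative choice.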

func (*Plan) ExprInputEntries ¶ added in v0.2.0

func (p *Plan) ExprInputEntries(idx int) []int

ExprInputEntries returns the plan-entry indices of all leaf PathRef nodes referenced by the Expr on user-visible entry idx. These are indices into the full internal entries slice (which may include hidden leaf paths appended after the user-visible entries).

Returns nil when the entry has no Expr. Callers can use Plan.InternalPath to retrieve the compiled path for each returned index.

func (*Plan) InternalPath ¶ added in v0.2.0

func (p *Plan) InternalPath(idx int) Path

InternalPath returns the compiled Path for the given internal entry index. Unlike Plan.Entries (which exposes only user-visible entries), this method can access hidden leaf paths that were added to the trie for Expr evaluation. Returns nil if idx is out of range.

type PlanEntry ¶

type PlanEntry struct {
	// Name is the alias if one was set via [Alias], otherwise the original
	// path string.
	Name string
	// Path is the compiled [Path]. Nil for Expr-only entries.
	Path Path
	// OutputKind is the protoreflect.Kind of the expression's result,
	// or zero when no [WithExpr] was set (raw path leaf kind) or when the
	// Expr is pass-through (same kind as input leaf).
	OutputKind protoreflect.Kind
	// HasExpr is true when this entry was created with [WithExpr].
	HasExpr bool
}

PlanEntry exposes metadata about one compiled path inside a Plan.

type PlanOption ¶

type PlanOption func(*planConfig)

PlanOption configures plan-level behaviour for NewPlan.

func AutoPromote ¶ added in v0.2.0

func AutoPromote(on bool) PlanOption

AutoPromote returns a PlanOption that sets the default auto-promotion behaviour for Cond expressions. When true, Cond branches with mismatched output kinds are automatically promoted to a common type. Individual Cond expressions can override this setting.

type PlanPathSpec ¶

type PlanPathSpec struct {
	// contains filtered or unexported fields
}

PlanPathSpec pairs a raw path string with per-path options. Create one with PlanPath.

func PlanPath ¶

func PlanPath(path string, opts ...EntryOption) PlanPathSpec

PlanPath creates a PlanPathSpec pairing a path string with per-path options.

type Query ¶ added in v0.3.0

type Query struct {
	// contains filtered or unexported fields
}

Query is a pre-compiled, reusable query object built from a Plan. It wraps the plan and presents evaluation results through the typed ResultSet / Result API instead of raw protoreflect.Value slices.

Use NewQuery to create a Query from an existing plan, or QueryEval for a one-shot evaluation without pre-compilation.

func NewQuery ¶ added in v0.3.0

func NewQuery(plan *Plan) *Query

NewQuery creates a Query that wraps plan. The plan must already be compiled via NewPlan. The Query does not own the plan; the caller is responsible for ensuring the plan outlives the query.

func (*Query) Plan ¶ added in v0.3.0

func (q *Query) Plan() *Plan

Plan returns the underlying Plan.

func (*Query) Run ¶ added in v0.3.0

func (q *Query) Run(msg proto.Message) (ResultSet, error)

Run evaluates the query against msg and returns a ResultSet keyed by each path entry's name (alias or path string).

Run reuses the plan's internal scratch buffers (via Plan.EvalLeaves) and is therefore NOT safe for concurrent use. Use Query.RunConcurrent when calling from multiple goroutines.

func (*Query) RunConcurrent ¶ added in v0.3.0

func (q *Query) RunConcurrent(msg proto.Message) (ResultSet, error)

RunConcurrent is like Query.Run but allocates fresh buffers per call, making it safe for concurrent use by multiple goroutines.

type Result ¶ added in v0.3.0

type Result struct {
	// contains filtered or unexported fields
}

Result wraps a slice of Value representing the fan-out output of a single path entry. It provides typed accessor methods for convenient consumption of query results.

The zero value is an empty result (no values).

Accessor naming convention:

Plural methods (Float64s, Strings, …) return a slice of all branches.
Singular methods (Float64, String, …) return the first branch value or the Go zero value when the result is empty.
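
The plural/singular convention can be sketched with a toy type. toyResult below is hypothetical and stores branches as []any; it is not the Result implementation, only an illustration of "skip silently" versus "first or zero".

```go
package main

import "fmt"

// toyResult is a hypothetical stand-in for a fan-out result: one entry per branch.
type toyResult struct{ vals []any }

// Int64s returns every branch that holds an int64 and silently skips the
// rest, so the returned slice can be shorter than len(r.vals).
func (r toyResult) Int64s() []int64 {
	out := make([]int64, 0, len(r.vals))
	for _, v := range r.vals {
		if n, ok := v.(int64); ok {
			out = append(out, n)
		}
	}
	return out
}

// Int64 returns the first branch as int64, or 0 when the result is empty or
// the first branch is not an int64.
func (r toyResult) Int64() int64 {
	if len(r.vals) == 0 {
		return 0
	}
	n, _ := r.vals[0].(int64)
	return n
}

func main() {
	r := toyResult{vals: []any{int64(7), "n/a", int64(9)}}
	fmt.Println(r.Int64s(), r.Int64()) // [7 9] 7

	var empty toyResult
	fmt.Println(empty.Int64s(), empty.Int64()) // [] 0
}
```

A zero from a singular accessor is ambiguous: it can mean a real zero, an empty result, or a type mismatch. Checking Result.IsEmpty or Result.Len first removes the first ambiguity.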

func NewResult ¶ added in v0.3.0

func NewResult(vals []Value) Result

NewResult creates a Result from a slice of Value. The slice is referenced, not copied.

func QueryEval ¶ added in v0.3.0

func QueryEval(md protoreflect.MessageDescriptor, path string, msg proto.Message) (Result, error)

QueryEval is a convenience function that compiles a single path against md, evaluates it against msg, and returns the typed Result.

For repeated evaluation of the same path against many messages, prefer NewPlan + NewQuery + Query.Run.

func (Result) Bool ¶ added in v0.3.0

func (r Result) Bool() bool

Bool returns the first branch as bool. Returns false if the result is empty or the value is not boolean.

func (Result) Bools ¶ added in v0.3.0

func (r Result) Bools() []bool

Bools returns all branches as bool values. Non-boolean branches are silently skipped.

func (Result) Bytes ¶ added in v0.3.0

func (r Result) Bytes() []byte

Bytes returns the first branch as []byte. Returns nil if the result is empty or the value is not bytes.

func (Result) BytesSlice ¶ added in v0.3.0

func (r Result) BytesSlice() [][]byte

BytesSlice returns all branches as []byte values. Non-bytes branches are silently skipped.

func (Result) Float64 ¶ added in v0.3.0

func (r Result) Float64() float64

Float64 returns the first branch as float64. Returns 0 if the result is empty or the value is not numeric.

func (Result) Float64s ¶ added in v0.3.0

func (r Result) Float64s() []float64

Float64s returns all branches as float64 values. Non-numeric branches are silently skipped. The returned slice has length ≤ Result.Len.

func (Result) Int64 ¶ added in v0.3.0

func (r Result) Int64() int64

Int64 returns the first branch as int64. Returns 0 if the result is empty or the value is not an integer type.

func (Result) Int64s ¶ added in v0.3.0

func (r Result) Int64s() []int64

Int64s returns all branches as int64 values. Non-integer branches are silently skipped.

func (Result) IsEmpty ¶ added in v0.3.0

func (r Result) IsEmpty() bool

IsEmpty reports whether the result contains no values.

func (Result) Len ¶ added in v0.3.0

func (r Result) Len() int

Len returns the number of fan-out branches in this result.

func (Result) Message ¶ added in v0.3.0

func (r Result) Message() protoreflect.Message

Message returns the first branch as protoreflect.Message. Returns nil if the result is empty or the value is not a message.

func (Result) Messages ¶ added in v0.3.0

func (r Result) Messages() []protoreflect.Message

Messages returns all branches as protoreflect.Message values. Non-message branches are silently skipped.

func (Result) ProtoValues ¶ added in v0.3.0

func (r Result) ProtoValues() []protoreflect.Value

ProtoValues converts all branches back to protoreflect.Value for interoperability with existing code that expects the legacy representation. Null values become the zero protoreflect.Value.

func (Result) String ¶ added in v0.3.0

func (r Result) String() string

String returns the first branch as string. Returns "" if the result is empty. Non-string values are converted via the unexported valueToStringValue helper.

func (Result) Strings ¶ added in v0.3.0

func (r Result) Strings() []string

Strings returns all branches as string values. Non-string values are converted via the unexported valueToStringValue helper.

func (Result) Uint64 ¶ added in v0.3.0

func (r Result) Uint64() uint64

Uint64 returns the first branch as uint64. Returns 0 if the result is empty or the value is not an unsigned integer type.

func (Result) Uint64s ¶ added in v0.3.0

func (r Result) Uint64s() []uint64

Uint64s returns all branches as uint64 values. Non-unsigned-integer branches are silently skipped.

func (Result) Value ¶ added in v0.3.0

func (r Result) Value() Value

Value returns the first Value in the result, or null if empty.

func (Result) Values ¶ added in v0.3.0

func (r Result) Values() []Value

Values returns the raw Value slice.

type ResultSet ¶ added in v0.3.0

type ResultSet struct {
	// contains filtered or unexported fields
}

ResultSet is an ordered collection of named Result values returned by a Query.Run call. Results can be accessed by name or iterated in order.

The zero value is a valid, empty result set.

func (ResultSet) All ¶ added in v0.3.0

func (rs ResultSet) All() iter.Seq2[string, Result]

All returns an iterator over all (name, result) pairs in order. Intended for use with Go range-over-func.

func (ResultSet) At ¶ added in v0.3.0

func (rs ResultSet) At(i int) (name string, result Result)

At returns the name and Result at position i. Panics if i is out of range.

func (ResultSet) Get ¶ added in v0.3.0

func (rs ResultSet) Get(name string) Result

Get returns the Result for the given entry name. If the name does not exist, an empty Result is returned.

func (ResultSet) Has ¶ added in v0.3.0

func (rs ResultSet) Has(name string) bool

Has reports whether the result set contains an entry with the given name.

func (ResultSet) Len ¶ added in v0.3.0

func (rs ResultSet) Len() int

Len returns the number of entries in the result set.

func (ResultSet) Names ¶ added in v0.3.0

func (rs ResultSet) Names() []string

Names returns the entry names in order.
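
The access patterns of ResultSet (lookup by name, presence check, ordered iteration) can be sketched with a toy ordered collection. toySet is hypothetical and holds plain string slices. Its All method is written as a plain function with the same shape as iter.Seq2, so the sketch compiles without the iter package.

```go
package main

import "fmt"

// toySet is a hypothetical ordered name-to-values collection.
type toySet struct {
	names  []string            // entry names in insertion order
	byName map[string][]string // values per entry name
}

// Get returns the values for name, or an empty (nil) slice when absent.
func (s toySet) Get(name string) []string { return s.byName[name] }

// Has reports whether name is an entry, even if its values are empty.
func (s toySet) Has(name string) bool {
	_, ok := s.byName[name]
	return ok
}

// All returns a push-style iterator with the same shape as
// iter.Seq2[string, []string]; iteration stops when yield returns false.
func (s toySet) All() func(yield func(string, []string) bool) {
	return func(yield func(string, []string) bool) {
		for _, n := range s.names {
			if !yield(n, s.byName[n]) {
				return
			}
		}
	}
}

func main() {
	rs := toySet{
		names:  []string{"top_name", "child_name"},
		byName: map[string][]string{"top_name": {"root"}, "child_name": {"x", "y"}},
	}
	// With Go 1.23 or later this could be: for name, vals := range rs.All() { ... }
	rs.All()(func(name string, vals []string) bool {
		fmt.Println(name, vals)
		return true
	})
	// Prints:
	// top_name [root]
	// child_name [x y]

	fmt.Println(rs.Has("missing"), len(rs.Get("missing"))) // false 0
}
```

Because Get returns an empty Result for unknown names, Has is the way to tell "entry present but empty" apart from "no such entry".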

type Step ¶

type Step struct {
	// contains filtered or unexported fields
}

Step is a single operation in a Path. It wraps a protopath.Step for the standard step kinds and adds ListRangeStep, ListWildcardStep, FilterStep, and MapWildcardStep.

func AnyExpand ¶

func AnyExpand(md protoreflect.MessageDescriptor) Step

AnyExpand returns an AnyExpandStep for the given message descriptor.

func FieldAccess ¶

func FieldAccess(fd protoreflect.FieldDescriptor) Step

FieldAccess returns a FieldAccessStep for the given field descriptor.

func Filter ¶ added in v0.3.0

func Filter(predicate Expr) Step

Filter returns a FilterStep that evaluates predicate against the current cursor value and keeps only branches where the predicate is truthy. The predicate is an Expr whose leaf PathRef nodes are relative to the cursor's message descriptor; resolution happens at NewPlan time.
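
The effect of a filter step on fan-out branches can be sketched as follows. The record type and keep function are hypothetical, and a plain Go func stands in for the compiled predicate Expr. pbpath's truthiness rules are not modelled.

```go
package main

import "fmt"

// record is a toy stand-in for the message under the cursor.
type record map[string]string

// keep returns the branches for which pred reports true, preserving order.
func keep(branches []record, pred func(record) bool) []record {
	var out []record
	for _, b := range branches {
		if pred(b) {
			out = append(out, b)
		}
	}
	return out
}

func main() {
	branches := []record{
		{"stringfield": "x"},
		{"stringfield": "y"},
		{"stringfield": "x"},
	}
	// Roughly what a predicate such as .stringfield == "x" expresses.
	kept := keep(branches, func(r record) bool { return r["stringfield"] == "x" })
	fmt.Println(len(kept)) // 2
}
```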

func ListIndex ¶

func ListIndex(i int) Step

ListIndex returns a ListIndexStep for the given index. Negative indices are allowed and resolved at traversal time. For non-negative indices, the underlying protopath.ListIndex is used. For negative indices, only the raw value is stored (since protopath panics on negatives).
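
Resolving a negative index at traversal time amounts to adding the list length. The helper below is a minimal sketch with a hypothetical name (resolveIndex). What pbpath does with an out-of-range index (skip, clamp, or error) depends on options not covered here, so the sketch only reports it.

```go
package main

import "fmt"

// resolveIndex maps a possibly negative index onto a list of length n.
// It returns the resolved position and whether that position is in range.
func resolveIndex(i, n int) (int, bool) {
	if i < 0 {
		i += n // -1 becomes n-1, the last element
	}
	if i < 0 || i >= n {
		return 0, false
	}
	return i, true
}

func main() {
	list := []string{"x", "y", "z"}
	for _, i := range []int{0, -1, -3, 3, -4} {
		if pos, ok := resolveIndex(i, len(list)); ok {
			fmt.Printf("[%d] -> %s\n", i, list[pos])
		} else {
			fmt.Printf("[%d] -> out of range\n", i)
		}
	}
	// Prints:
	// [0] -> x
	// [-1] -> z
	// [-3] -> x
	// [3] -> out of range
	// [-4] -> out of range
}
```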

func ListRange ¶

func ListRange(start, end int) Step

ListRange returns a ListRangeStep representing the half-open range [start, end) with stride 1. Both start and end may be negative (resolved at traversal time).

func ListRangeFrom ¶

func ListRangeFrom(start int) Step

ListRangeFrom returns a ListRangeStep with only a start bound (open end) and stride 1. This represents [start:] syntax.

func ListRangeStep3 ¶

func ListRangeStep3(start, end, step int, startOmitted, endOmitted bool) Step

ListRangeStep3 returns a ListRangeStep with explicit start, end, and step. Panics if step is 0.

Use startOmitted/endOmitted to indicate that the respective bound should be defaulted at traversal time based on the step sign (Python semantics).
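
The Python slice semantics referred to here can be written out as a small index calculator. This is a sketch modelled on Python's own rules, not pbpath's traversal code. The function name sliceIndices is hypothetical, and its parameters mirror ListRangeStep3 plus the list length n. Out-of-range bounds are clamped, as Python does.

```go
package main

import "fmt"

// sliceIndices returns the list positions selected by a Python-style slice
// over a list of length n. start and end are ignored when the matching
// omitted flag is set.
func sliceIndices(start, end, step int, startOmitted, endOmitted bool, n int) []int {
	if step == 0 {
		panic("slice step must not be 0")
	}
	// Bounds that explicit values are clamped to. For a negative step the
	// lower bound is -1, meaning "one position before index 0".
	lower, upper := 0, n
	if step < 0 {
		lower, upper = -1, n-1
	}
	clamp := func(i int) int {
		if i < 0 {
			i += n // negative values count from the end
			if i < lower {
				i = lower
			}
		} else if i > upper {
			i = upper
		}
		return i
	}
	// Omitted bounds default to the far ends in the direction of travel.
	s, e := lower, upper
	if step < 0 {
		s, e = upper, lower
	}
	if !startOmitted {
		s = clamp(start)
	}
	if !endOmitted {
		e = clamp(end)
	}
	var out []int
	if step > 0 {
		for i := s; i < e; i += step {
			out = append(out, i)
		}
	} else {
		for i := s; i > e; i += step {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	const n = 5
	fmt.Println(sliceIndices(0, 2, 1, false, false, n))   // [0:2]     -> [0 1]
	fmt.Println(sliceIndices(-2, 0, 1, false, true, n))   // [-2:]     -> [3 4]
	fmt.Println(sliceIndices(0, 0, -1, true, true, n))    // [::-1]    -> [4 3 2 1 0]
	fmt.Println(sliceIndices(1, 100, 2, false, false, n)) // [1:100:2] -> [1 3]
	fmt.Println(sliceIndices(4, 0, -2, false, false, n))  // [4:0:-2]  -> [4 2]
}
```

The omitted flags are needed because no integer can stand for "omitted": with a negative step, an explicit end of -1 means the last element, whereas an omitted end means "run past index 0".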

func ListWildcard ¶

func ListWildcard() Step

ListWildcard returns a ListWildcardStep that selects every element.

func MapIndex ¶

func MapIndex(k protoreflect.MapKey) Step

MapIndex returns a MapIndexStep for the given map key.

func MapWildcard ¶ added in v0.3.0

func MapWildcard() Step

MapWildcard returns a MapWildcardStep that selects every value from a map field, iterating over all key-value pairs.
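
Fanning out over a map yields one branch per entry. The sketch below uses a plain Go map and a hypothetical mapValues helper. The order in which pbpath yields map values is not stated here; the sketch sorts keys only to make its own output deterministic, since Go map iteration order is unspecified.

```go
package main

import (
	"fmt"
	"sort"
)

// mapValues returns one value per map entry, visiting keys in sorted order.
func mapValues(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]string, 0, len(m))
	for _, k := range keys {
		out = append(out, m[k])
	}
	return out
}

func main() {
	labels := map[string]string{"env": "prod", "region": "us", "app": "api"}
	fmt.Println(mapValues(labels)) // [api prod us]
}
```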

func Root ¶

func Root(md protoreflect.MessageDescriptor) Step

Root returns a RootStep for the given message descriptor.

func (Step) EndOmitted ¶

func (s Step) EndOmitted() bool

EndOmitted reports whether the end bound was omitted in the source syntax. When true, the effective end is computed at traversal time based on the step sign: len for positive step, -(len+1) for negative step.

func (Step) FieldDescriptor ¶

func (s Step) FieldDescriptor() protoreflect.FieldDescriptor

FieldDescriptor returns the field descriptor for a FieldAccessStep.

func (Step) Kind ¶

func (s Step) Kind() StepKind

Kind returns the step's kind.

func (Step) ListIndex ¶

func (s Step) ListIndex() int

ListIndex returns the list index for a ListIndexStep. May be negative (resolved at traversal time).

func (Step) MapIndex ¶

func (s Step) MapIndex() protoreflect.MapKey

MapIndex returns the map key for a MapIndexStep.

func (Step) MessageDescriptor ¶

func (s Step) MessageDescriptor() protoreflect.MessageDescriptor

MessageDescriptor returns the message descriptor for RootStep and AnyExpandStep.

func (Step) Predicate ¶ added in v0.3.0

func (s Step) Predicate() Expr

Predicate returns the predicate Expr for a FilterStep. Returns nil for all other step kinds.

func (Step) ProtoStep ¶

func (s Step) ProtoStep() protopath.Step

ProtoStep returns the underlying protopath.Step. It panics if the step kind is ListRangeStep, ListWildcardStep, FilterStep, or MapWildcardStep.

func (Step) RangeEnd ¶

func (s Step) RangeEnd() int

RangeEnd returns the end bound of a ListRangeStep. The value is 0 when Step.EndOmitted is true.

func (Step) RangeOpen ¶

func (s Step) RangeOpen() bool

RangeOpen reports whether the range has no explicit end bound.

Deprecated: use Step.EndOmitted instead.

func (Step) RangeStart ¶

func (s Step) RangeStart() int

RangeStart returns the start bound of a ListRangeStep. The value is 0 when Step.StartOmitted is true.

func (Step) RangeStep ¶

func (s Step) RangeStep() int

RangeStep returns the stride of a ListRangeStep. Defaults to 1 when not explicitly provided. Never 0.

func (Step) StartOmitted ¶

func (s Step) StartOmitted() bool

StartOmitted reports whether the start bound was omitted in the source syntax. When true, the effective start is computed at traversal time based on the step sign: 0 for positive step, len-1 for negative step.

type StepKind ¶

type StepKind int

StepKind enumerates the kinds of steps in a Path. The first six values mirror protopath.StepKind.

const (
	// RootStep identifies a [Step] as a root message.
	RootStep StepKind = iota
	// FieldAccessStep identifies a [Step] as accessing a message field by name.
	FieldAccessStep
	// UnknownAccessStep identifies a [Step] as accessing unknown fields.
	UnknownAccessStep
	// ListIndexStep identifies a [Step] as indexing into a list by position.
	// Negative indices are allowed and resolved at traversal time.
	ListIndexStep
	// MapIndexStep identifies a [Step] as indexing into a map by key.
	MapIndexStep
	// AnyExpandStep identifies a [Step] as expanding an Any message.
	AnyExpandStep
	// ListRangeStep identifies a [Step] as a Python-style slice
	// [start:end:step] applied to a repeated field during value traversal.
	// Any of start, end, step may be omitted; omitted bounds are resolved
	// at traversal time based on the step sign (see Python slice semantics).
	ListRangeStep
	// ListWildcardStep identifies a [Step] as selecting all elements of a
	// repeated field. Equivalent to [*], [:], or [::] in the path syntax.
	ListWildcardStep
	// FilterStep identifies a [Step] as a mid-traversal predicate filter.
	// It evaluates a predicate expression against the current cursor value
	// (which must be a message) and keeps only branches where the predicate
	// is truthy. Syntax: [?(.field == "value")].
	//
	// The predicate is compiled at [NewPlan] time into a sub-[Plan] rooted
	// at the cursor's message descriptor. At traversal time the predicate
	// is evaluated per-branch with zero additional compilation cost.
	FilterStep
	// MapWildcardStep identifies a [Step] as selecting all values from a
	// map field. Equivalent to [*] applied to a map-typed field.
	MapWildcardStep
)

type Value ¶ added in v0.3.0

type Value struct {
	// contains filtered or unexported fields
}

Value is the universal intermediate representation for pbpath expressions.

It is intentionally a small struct (≤ 48 bytes on 64-bit) that can be passed by value without heap allocation in hot-path expression evaluation. The zero value is null (kind == NullKind).

Design rationale:

scalar is stored as protoreflect.Value (interface{} wrapper) so every proto primitive type is supported without boxing again.
list is a slice header (24 bytes) stored directly in the struct. For small fan-out counts, Go 1.26 stack-allocated append backing stores keep these off the heap entirely.
msg is a protoreflect.Message interface — one word + one pointer.
kind occupies a single int. Keeping it first enables branch prediction on the hot-path switch.
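
The layout described above is a tagged union whose zero value is null. The toy below illustrates that shape with hypothetical names (value, scalarOf, listOf, describe). It uses any where the real type stores protoreflect values, and it makes no claim about the real struct's size or field order beyond what is stated above.

```go
package main

import "fmt"

type kind int

const (
	nullKind kind = iota // zero value: no data
	scalarKind
	listKind
)

// value is a toy tagged union. The kind tag comes first and its zero value
// is nullKind, so an uninitialised value is already a valid null.
type value struct {
	kind   kind
	scalar any
	list   []value
}

// scalarOf wraps x, or returns null when there is nothing to wrap.
func scalarOf(x any) value {
	if x == nil {
		return value{}
	}
	return value{kind: scalarKind, scalar: x}
}

// listOf always yields listKind, so a nil slice becomes an empty list
// rather than null.
func listOf(items []value) value { return value{kind: listKind, list: items} }

// describe dispatches on the kind tag.
func describe(v value) string {
	switch v.kind {
	case scalarKind:
		return fmt.Sprintf("scalar(%v)", v.scalar)
	case listKind:
		return fmt.Sprintf("list(len=%d)", len(v.list))
	default:
		return "null"
	}
}

func main() {
	var zero value
	fmt.Println(describe(zero))                               // null
	fmt.Println(describe(scalarOf("US")))                     // scalar(US)
	fmt.Println(describe(listOf(nil)))                        // list(len=0)
	fmt.Println(describe(listOf([]value{scalarOf(1), zero}))) // list(len=2)
}
```

Keeping null as the zero value means a freshly declared Value, or a zeroed slice of them, needs no initialisation before use.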

func FromProtoValue ¶ added in v0.3.0

func FromProtoValue(pv protoreflect.Value) Value

FromProtoValue converts a protoreflect.Value to a Value. Messages are wrapped as MessageKind; Lists/Maps are wrapped as ScalarKind (preserving the proto List/Map interface for downstream expression functions). Invalid values become null.

func ListVal ¶ added in v0.3.0

func ListVal(items []Value) Value

ListVal creates a ListKind Value from an existing slice of Value. If items is nil or empty, the result is an empty list (not null).

func MessageVal ¶ added in v0.3.0

func MessageVal(m protoreflect.Message) Value

MessageVal wraps a protoreflect.Message into a Value with kind MessageKind. If m is nil, the result is null.

func Null ¶ added in v0.3.0

func Null() Value

Null returns the null Value. Equivalent to the zero value.

func ObjectVal ¶ added in v0.3.0

func ObjectVal(entries []ObjectEntry) Value

ObjectVal creates an ObjectKind Value from key-value pairs. If entries is nil, the result is an empty object (not null).

func Scalar ¶ added in v0.3.0

func Scalar(pv protoreflect.Value) Value

Scalar wraps a single protoreflect.Value into a Value with kind ScalarKind. If pv is not valid (zero), the result is null.

func ScalarBool ¶ added in v0.3.0

func ScalarBool(b bool) Value

ScalarBool creates a ScalarKind Value holding a bool.

func ScalarFloat64 ¶ added in v0.3.0

func ScalarFloat64(f float64) Value

ScalarFloat64 creates a ScalarKind Value holding a float64.

func ScalarInt32 ¶ added in v0.3.0

func ScalarInt32(n int32) Value

ScalarInt32 creates a ScalarKind Value holding an int32.

func ScalarInt64 ¶ added in v0.3.0

func ScalarInt64(n int64) Value

ScalarInt64 creates a ScalarKind Value holding an int64.

func ScalarString ¶ added in v0.3.0

func ScalarString(s string) Value

ScalarString creates a ScalarKind Value holding a string.

func (Value) Entries ¶ added in v0.3.0

func (v Value) Entries() []ObjectEntry

Entries returns the key-value pairs for an ObjectKind value. Returns nil for non-object kinds.

func (Value) Get ¶ added in v0.3.0

func (v Value) Get(fd protoreflect.FieldDescriptor) Value

Get returns the value of field fd on a MessageKind value. Returns null for non-message values.

func (Value) Index ¶ added in v0.3.0

func (v Value) Index(i int) Value

Index returns the i-th element of a ListKind value. Returns null for non-list kinds or out-of-bounds indices. Negative indices are supported (Python-style).

func (Value) IsNonZero ¶ added in v0.3.0

func (v Value) IsNonZero() bool

IsNonZero reports whether the value is non-null and not the protobuf zero value for its kind. For scalars this means non-zero/non-empty; for lists, non-empty; for messages, non-nil.

func (Value) IsNull ¶ added in v0.3.0

func (v Value) IsNull() bool

IsNull reports whether the value is null (kind == NullKind).

func (Value) Kind ¶ added in v0.3.0

func (v Value) Kind() ValueKind

Kind returns the ValueKind of this value.

func (Value) Len ¶ added in v0.3.0

func (v Value) Len() int

Len returns the number of elements for a ListKind value, or the number of entries for an ObjectKind value, or 0 otherwise.

func (Value) List ¶ added in v0.3.0

func (v Value) List() []Value

List returns the child elements for a ListKind value. Returns nil for non-list kinds.

func (Value) Message ¶ added in v0.3.0

func (v Value) Message() protoreflect.Message

Message returns the protoreflect.Message for a MessageKind value. Returns nil for non-message kinds.

func (Value) ProtoValue ¶ added in v0.3.0

func (v Value) ProtoValue() protoreflect.Value

ProtoValue returns the underlying protoreflect.Value for a scalar. For non-scalar kinds, it returns the zero protoreflect.Value.

func (Value) String ¶ added in v0.3.0

func (v Value) String() string

String returns a human-readable representation of the value for debugging.

func (Value) ToProtoValue ¶ added in v0.3.0

func (v Value) ToProtoValue() protoreflect.Value

ToProtoValue converts a Value back to a protoreflect.Value. For ScalarKind, the wrapped value is returned directly. For MessageKind, the message is wrapped via protoreflect.ValueOfMessage. For ListKind and NullKind, the zero protoreflect.Value is returned.

type ValueKind ¶ added in v0.3.0

type ValueKind int

ValueKind identifies the category of data held by a Value. A Value is always exactly one of these kinds.

const (
	// NullKind indicates the [Value] carries no data.
	// This is the zero-value kind for [Value].
	NullKind ValueKind = iota

	// ScalarKind indicates the [Value] wraps a single [protoreflect.Value].
	ScalarKind

	// ListKind indicates the [Value] holds an ordered collection of child
	// [Value] elements — the result of a collect or fan-out operation.
	ListKind

	// MessageKind indicates the [Value] wraps a live [protoreflect.Message]
	// that can be traversed further by subsequent path steps.
	MessageKind

	// ObjectKind indicates the [Value] holds a constructed object — an
	// ordered sequence of key-value pairs produced by {key: expr, ...}
	// syntax. Unlike [MessageKind], objects are schema-free and can hold
	// arbitrary string keys and [Value] values.
	ObjectKind
)

type Values ¶

type Values struct {
	Path   Path
	Values []protoreflect.Value
	// contains filtered or unexported fields
}

Values is a Path paired with a sequence of values at each step. The lengths of Values.Path and Values.Values must be identical. The first step must be a Root step and the first value must be a concrete message value.

func PathValues ¶

func PathValues(p Path, m proto.Message, opts ...PathOption) ([]Values, error)

PathValues returns the values along a path in message m.

When the path contains a ListWildcardStep or ListRangeStep, the function fans out: one Values is produced for every matching list element (the Cartesian product when multiple fan-out steps are nested).

A single ListIndexStep with a negative index is resolved relative to the list length (e.g. -1 → last element).
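
Nested fan-out can be pictured with a list of lists, loosely modelling a path such as repeats[*].repeats[*].nested.stringfield. The sketch is a toy with hypothetical names (branch, fanOut), not the PathValues algorithm. It records the index chosen at each fan-out step so that every branch stays distinguishable.

```go
package main

import "fmt"

// branch records the index chosen at each fan-out step and the leaf reached.
type branch struct {
	indices []int
	leaf    string
}

// fanOut expands two nested wildcard steps over a list of lists: one branch
// per (outer, inner) index pair that actually exists.
func fanOut(groups [][]string) []branch {
	var out []branch
	for i, inner := range groups {
		for j, leaf := range inner {
			out = append(out, branch{indices: []int{i, j}, leaf: leaf})
		}
	}
	return out
}

func main() {
	groups := [][]string{{"a", "b"}, {}, {"c"}}
	for _, b := range fanOut(groups) {
		fmt.Println(b.indices, b.leaf)
	}
	// Prints:
	// [0 0] a
	// [0 1] b
	// [2 0] c
}
```

The branch count is the sum of the inner list lengths, so an empty inner list contributes nothing and the total can grow quickly as wildcards are nested.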

Example ¶

ExamplePathValues demonstrates traversing a path through a live message to extract values. For scalar paths, exactly one Values is returned.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	// Build a message: Test { nested: { stringfield: "hello" } }
	nested := dynamicpb.NewMessage(nestedMD)
	nested.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString("hello"))
	msg := dynamicpb.NewMessage(testMD)
	msg.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(nested))

	// Parse and evaluate.
	path, err := pbpath.ParsePath(testMD, "nested.stringfield")
	if err != nil {
		log.Fatal(err)
	}
	results, err := pbpath.PathValues(path, msg)
	if err != nil {
		log.Fatal(err)
	}

	// Scalar path → exactly one result.
	fmt.Printf("branches: %d\n", len(results))

	// Index(-1) returns the last (step, value) pair — the leaf value.
	leaf := results[0].Index(-1)
	fmt.Printf("value: %s\n", leaf.Value.String())

}

Output:
branches: 1
value: hello

Example (Fanout) ¶

ExamplePathValues_fanout demonstrates how wildcards cause PathValues to produce multiple result branches — one per matching list element.

package main

import (
	"fmt"
	"log"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoreflect"
	"google.golang.org/protobuf/types/descriptorpb"
	"google.golang.org/protobuf/types/dynamicpb"

	"github.com/loicalleyne/bufarrowlib/proto/pbpath"
)

// exampleDescriptors constructs a Test schema with a Nested submessage and a
// repeated Test field called "repeats". This models a self-referential proto
// message:
//
//	message Test {
//	  Nested           nested  = 1;
//	  repeated Test    repeats = 2;
//	  message Nested {
//	    string stringfield = 1;
//	  }
//	}
//
// Self-referential (recursive) messages are fully supported by pbpath.
func exampleDescriptors() (protoreflect.MessageDescriptor, protoreflect.MessageDescriptor) {
	stringType := descriptorpb.FieldDescriptorProto_TYPE_STRING.Enum()
	messageType := descriptorpb.FieldDescriptorProto_TYPE_MESSAGE.Enum()
	labelOptional := descriptorpb.FieldDescriptorProto_LABEL_OPTIONAL.Enum()
	labelRepeated := descriptorpb.FieldDescriptorProto_LABEL_REPEATED.Enum()

	fdp := &descriptorpb.FileDescriptorProto{
		Name:    proto.String("example.proto"),
		Package: proto.String("example"),
		Syntax:  proto.String("proto3"),
		MessageType: []*descriptorpb.DescriptorProto{
			{
				Name: proto.String("Test"),
				Field: []*descriptorpb.FieldDescriptorProto{
					{Name: proto.String("nested"), Number: proto.Int32(1), Type: messageType, TypeName: proto.String(".example.Test.Nested"), Label: labelOptional},
					{Name: proto.String("repeats"), Number: proto.Int32(2), Type: messageType, TypeName: proto.String(".example.Test"), Label: labelRepeated},
				},
				NestedType: []*descriptorpb.DescriptorProto{
					{
						Name: proto.String("Nested"),
						Field: []*descriptorpb.FieldDescriptorProto{
							{Name: proto.String("stringfield"), Number: proto.Int32(1), Type: stringType, Label: labelOptional},
						},
					},
				},
			},
		},
	}
	fd, err := protodesc.NewFile(fdp, nil)
	if err != nil {
		log.Fatalf("protodesc.NewFile: %v", err)
	}
	testMD := fd.Messages().ByName("Test")
	nestedMD := testMD.Messages().ByName("Nested")
	return testMD, nestedMD
}

func main() {
	testMD, nestedMD := exampleDescriptors()

	// Build a message with 3 elements in the "repeats" list.
	msg := dynamicpb.NewMessage(testMD)
	list := msg.Mutable(testMD.Fields().ByName("repeats")).List()
	for _, v := range []string{"alpha", "beta", "gamma"} {
		n := dynamicpb.NewMessage(nestedMD)
		n.Set(nestedMD.Fields().ByName("stringfield"), protoreflect.ValueOfString(v))
		child := dynamicpb.NewMessage(testMD)
		child.Set(testMD.Fields().ByName("nested"), protoreflect.ValueOfMessage(n))
		list.Append(protoreflect.ValueOfMessage(child))
	}

	path, err := pbpath.ParsePath(testMD, "repeats[*].nested.stringfield")
	if err != nil {
		log.Fatal(err)
	}
	results, err := pbpath.PathValues(path, msg)
	if err != nil {
		log.Fatal(err)
	}

	// One branch per list element, each with the concrete index resolved.
	fmt.Printf("branches: %d\n", len(results))
	for _, r := range results {
		// ListIndices() extracts the concrete list indices visited.
		indices := r.ListIndices()
		fmt.Printf("  repeats[%d] = %s\n", indices[0], r.Index(-1).Value.String())
	}

}

Output:
branches: 3
  repeats[0] = alpha
  repeats[1] = beta
  repeats[2] = gamma

func (Values) Index ¶

func (p Values) Index(i int) (out struct {
	Step  Step
	Value protoreflect.Value
},
)

Index returns the ith step and value and supports negative indexing. A negative index starts counting from the tail of the Values such that -1 refers to the last pair, -2 refers to the second-to-last pair, and so on.
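The negative-indexing rule can be modelled as a small normalization step. In the sketch below, resolveIndex is a hypothetical helper, not part of pbpath, and the out-of-range handling (reporting false) is an assumption, since the documentation above does not say what Index returns in that case.

```go
package main

import "fmt"

// resolveIndex is a hypothetical helper (not part of pbpath). It maps an
// index that may be negative onto a position in a sequence of length n,
// counting negative values from the tail. The boolean reports whether the
// resolved position is in range; that behaviour is an assumption.
func resolveIndex(i, n int) (int, bool) {
	if i < 0 {
		i += n
	}
	if i < 0 || i >= n {
		return 0, false
	}
	return i, true
}

func main() {
	// Stand-in for the (step, value) pairs of a three-step path.
	pairs := []string{"(root)", ".nested", ".stringfield"}
	for _, i := range []int{0, -1, -2, 5} {
		if pos, ok := resolveIndex(i, len(pairs)); ok {
			fmt.Printf("Index(%d) -> %s\n", i, pairs[pos])
		} else {
			fmt.Printf("Index(%d) -> out of range\n", i)
		}
	}
}
```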

func (Values) Len ¶

func (p Values) Len() int

Len reports the length of the path and values. If the path and values have different lengths, it returns the smaller of the two.

func (Values) ListIndices ¶

func (p Values) ListIndices() []int

ListIndices returns the concrete list indices visited along this Values path. It collects the index from every ListIndexStep in order.
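As a rough model of that collection rule, the sketch below walks a resolved branch and keeps only the list indices. The step and stepKind types are simplified stand-ins invented for illustration; pbpath's real Step type is not reproduced here.

```go
package main

import "fmt"

// stepKind and step are simplified, hypothetical stand-ins for pbpath's
// Step type; the real type carries more information.
type stepKind int

const (
	rootStep stepKind = iota
	fieldStep
	listIndexStep
)

type step struct {
	kind  stepKind
	name  string // set for fieldStep
	index int    // set for listIndexStep
}

// listIndices collects the index of every list-index step, in path order.
func listIndices(path []step) []int {
	var out []int
	for _, s := range path {
		if s.kind == listIndexStep {
			out = append(out, s.index)
		}
	}
	return out
}

func main() {
	// Models the resolved branch repeats[1].repeats[0].nested.stringfield.
	path := []step{
		{kind: rootStep},
		{kind: fieldStep, name: "repeats"},
		{kind: listIndexStep, index: 1},
		{kind: fieldStep, name: "repeats"},
		{kind: listIndexStep, index: 0},
		{kind: fieldStep, name: "nested"},
		{kind: fieldStep, name: "stringfield"},
	}
	fmt.Println(listIndices(path)) // [1 0]
}
```

With nested fan-out, each returned branch would carry one index per list level, so the slice identifies which element of each list the leaf value came from.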

func (Values) String ¶

func (p Values) String() string

String returns a human-readable representation of the path and last value. Do not depend on the output being stable.

For example:

(path.to.MyMessage).list_field[5].map_field["hello"] = {hello: "world"}

