ir

package
v0.0.12 Latest
Published: Dec 12, 2025 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Overview

Package ir provides the intermediate representation (IR) for Tony format documents.


The IR package defines the core data structures for representing Tony format documents as a tree of nodes. All Tony documents (whether parsed from text, created programmatically, or generated from schemas) are represented as ir.Node trees.

The IR itself is a simple recursive structure that is readily representable in JSON, YAML, and Tony format. This makes the IR useful for manipulating Tony documents in contexts which lack parsing and encoding support. The IR contains no position information from input documents, making it purely semantic.

Node Structure

A Node represents a single value in a Tony document. Nodes can be:

  • Atomic types: null, boolean, number, string
  • Composite types: object (key-value pairs), array (ordered list)
  • Metadata: tags, comments, line decomposition (the Lines field)

Each node maintains parent-child relationships, allowing navigation through the tree structure.

The IR works as a recursive tagged union structure, where values are placed in fields depending on the node type.

Node Types

The Type field indicates the node's type:

  • NullType: null value
  • BoolType: boolean (true/false)
  • NumberType: numeric value (int64 or float64)
  • StringType: string value
  • ArrayType: ordered list of nodes
  • ObjectType: key-value pairs (fields and values)
  • CommentType: comment node
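As a rough illustration of the tagged-union layout, the sketch below switches on a node's type to decide which field carries the value. The Type and Node declarations are simplified local stand-ins that copy a few field names from the package; they and the Describe helper are assumptions made for illustration, not the package's own definitions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Type and Node are simplified, local stand-ins for ir.Type and ir.Node.
type Type int

const (
	NullType Type = iota
	NumberType
	StringType
	BoolType
	ObjectType
	ArrayType
)

type Node struct {
	Type   Type
	Fields []*Node
	Values []*Node
	String string
	Bool   bool
	Int64  *int64
}

// Describe (hypothetical helper) reads only the field that is
// meaningful for the node's Type.
func Describe(n *Node) string {
	switch n.Type {
	case NullType:
		return "null"
	case BoolType:
		return strconv.FormatBool(n.Bool)
	case NumberType:
		if n.Int64 != nil {
			return strconv.FormatInt(*n.Int64, 10)
		}
		return "number"
	case StringType:
		return strconv.Quote(n.String)
	case ArrayType:
		return fmt.Sprintf("array(%d)", len(n.Values))
	case ObjectType:
		return fmt.Sprintf("object(%d)", len(n.Fields))
	}
	return "unknown"
}

func main() {
	i := int64(42)
	arr := &Node{Type: ArrayType, Values: []*Node{
		{Type: NumberType, Int64: &i},
		{Type: StringType, String: "hello"},
	}}
	fmt.Println(Describe(arr))
	for _, v := range arr.Values {
		fmt.Println(Describe(v))
	}
}
```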

Creating Nodes

Use constructor functions to create nodes:

node := ir.FromString("hello")
num := ir.FromInt(42)
flag := ir.FromBool(true)
obj := ir.FromMap(map[string]*ir.Node{
    "key": ir.FromString("value"),
})
arr := ir.FromSlice([]*ir.Node{
    ir.FromInt(1),
    ir.FromInt(2),
})

IR Structure Constraints

The IR has specific constraints that must be maintained:

## Objects

For ObjectType nodes, Fields[i] is the key for the value at Values[i], so there will always be the same number of fields as values.

Fields are always either:

  • String typed (and not multiline) - normal object keys
  • Int typed (fitting in uint32) - for int-keyed maps (sparse arrays)
  • Null typed - represents a merge key and may occur multiple times

Other keys (non-null) should appear only once. Objects must either have all keys int typed, or all keys not int typed (mixed int/string keys are not allowed).
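The object constraints above could be checked along the following lines. This is a sketch over a local stand-in Node type, not the package's validation code; CheckObject, str, and num are hypothetical names. It treats null merge keys as non-int keys and does not check the "not multiline" condition on string keys.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

type Type int

const (
	NullType Type = iota
	NumberType
	StringType
	ObjectType
)

// Node is a simplified, local stand-in for ir.Node.
type Node struct {
	Type   Type
	Fields []*Node
	Values []*Node
	String string
	Int64  *int64
}

// CheckObject (hypothetical) returns the first violated constraint, or nil.
func CheckObject(n *Node) error {
	if n.Type != ObjectType {
		return errors.New("not an object node")
	}
	if len(n.Fields) != len(n.Values) {
		return errors.New("fields and values differ in length")
	}
	seen := map[string]bool{}
	intKeys, otherKeys := 0, 0
	for _, f := range n.Fields {
		var key string
		switch {
		case f.Type == NullType:
			otherKeys++ // merge key: may occur multiple times
			continue
		case f.Type == StringType:
			otherKeys++
			key = "s:" + f.String
		case f.Type == NumberType && f.Int64 != nil:
			if *f.Int64 < 0 || *f.Int64 > math.MaxUint32 {
				return errors.New("int key does not fit in uint32")
			}
			intKeys++
			key = fmt.Sprintf("i:%d", *f.Int64)
		default:
			return errors.New("key is not string, int, or null typed")
		}
		if seen[key] {
			return fmt.Errorf("duplicate key %q", key)
		}
		seen[key] = true
	}
	if intKeys > 0 && otherKeys > 0 {
		return errors.New("mixed int and non-int keys")
	}
	return nil
}

func str(s string) *Node { return &Node{Type: StringType, String: s} }
func num(i int64) *Node  { return &Node{Type: NumberType, Int64: &i} }

func main() {
	ok := &Node{Type: ObjectType,
		Fields: []*Node{str("a"), str("b")},
		Values: []*Node{num(1), num(2)}}
	mixed := &Node{Type: ObjectType,
		Fields: []*Node{str("a"), num(0)},
		Values: []*Node{num(1), num(2)}}
	fmt.Println(CheckObject(ok))
	fmt.Println(CheckObject(mixed))
}
```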

## Strings

String canonical values are stored under the String field. If the string was a multiline folding Tony string, then the Lines field may contain the folding decomposition. Producers should not populate Lines where String is not equal to the concatenation of Lines. Consumers should check if they are equal and if not, remove the Lines decomposition and consider String canonical.
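A consumer-side check for the rule above might look like the sketch below. It uses a local stand-in Node with only the String and Lines fields, and it assumes "concatenation" means joining Lines with no separator; NormalizeLines is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a simplified, local stand-in holding only the string fields.
type Node struct {
	String string
	Lines  []string
}

// NormalizeLines (hypothetical) drops Lines when it does not
// concatenate to String, leaving String as the canonical value.
func NormalizeLines(n *Node) {
	if n.Lines != nil && strings.Join(n.Lines, "") != n.String {
		n.Lines = nil
	}
}

func main() {
	consistent := &Node{String: "hello world", Lines: []string{"hello ", "world"}}
	stale := &Node{String: "hello world", Lines: []string{"hello ", "there"}}
	NormalizeLines(consistent)
	NormalizeLines(stale)
	fmt.Println(len(consistent.Lines), len(stale.Lines))
}
```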

## Numbers

Number values are placed under:

  • Int64: if it is an integer (64-bit signed)
  • Float64: if it is a floating point number (64-bit IEEE float)
  • Number: as a string fallback if neither Int64 nor Float64 can represent it
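One way to picture the placement rule is the sketch below, which tries an integer parse, then a float parse, then falls back to the literal text. The Node type is a local stand-in and FromLiteral is a hypothetical helper; the package's own parser may decide representability differently (for example, this sketch accepts a float64 even when the conversion rounds).

```go
package main

import (
	"fmt"
	"strconv"
)

// Node is a simplified, local stand-in holding only the number fields.
type Node struct {
	Number  string
	Float64 *float64
	Int64   *int64
}

// FromLiteral (hypothetical) places a numeric literal in the first
// field that can hold it: Int64, then Float64, then the Number string.
func FromLiteral(lit string) *Node {
	if i, err := strconv.ParseInt(lit, 10, 64); err == nil {
		return &Node{Int64: &i}
	}
	if f, err := strconv.ParseFloat(lit, 64); err == nil {
		return &Node{Float64: &f}
	}
	// Neither int64 nor float64 fits (e.g. out of range): keep the text.
	return &Node{Number: lit}
}

func main() {
	for _, lit := range []string{"42", "1.5", "1e400"} {
		n := FromLiteral(lit)
		fmt.Println(lit, n.Int64 != nil, n.Float64 != nil, n.Number)
	}
}
```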

## Comments

CommentType nodes define comment association. Comment content is placed in the Lines field.

A comment node either:

  • Contains 1 element in Values, a non-comment node to which it is associated as a head comment
  • Contains 0 elements and resides in the Comment field of a non-comment node, representing its line comment plus possibly trailing comment material

A comment node may not represent both a head comment and a line comment. In the second case, normally it represents a single line comment (e.g., `null # comment`) with 1 entry in Lines. All such comments must contain all whitespace between the end of the value and the `#` to preserve vertical alignment.

All comments not preceding any value nor occurring on the same line of any value are collected and appended to the Lines of the comment node residing in the Comment field of the root non-comment node. If that node has no line comment, a dummy line comment is present with value "".
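The two comment shapes can be told apart by the number of entries in Values, as in the sketch below. The types are local stand-ins and CommentRole is a hypothetical helper; the exact text stored in Lines (shown here with the leading whitespace and `#`) is an assumption based on the description above.

```go
package main

import (
	"errors"
	"fmt"
)

type Type int

const (
	NullType Type = iota
	CommentType
)

// Node is a simplified, local stand-in for ir.Node.
type Node struct {
	Type    Type
	Values  []*Node
	Lines   []string
	Comment *Node
}

// CommentRole (hypothetical) classifies a comment node as "head" or "line".
func CommentRole(c *Node) (string, error) {
	if c.Type != CommentType {
		return "", errors.New("not a comment node")
	}
	switch len(c.Values) {
	case 0:
		return "line", nil
	case 1:
		if c.Values[0].Type == CommentType {
			return "", errors.New("head comment must wrap a non-comment node")
		}
		return "head", nil
	}
	return "", errors.New("comment node has more than one value")
}

func main() {
	// `null # comment`: the value carries its line comment in Comment.
	value := &Node{Type: NullType,
		Comment: &Node{Type: CommentType, Lines: []string{" # comment"}}}
	// A head comment wraps the value it precedes.
	head := &Node{Type: CommentType,
		Lines:  []string{"# about the value"},
		Values: []*Node{value}}
	fmt.Println(CommentRole(head))
	fmt.Println(CommentRole(value.Comment))
}
```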

Navigating Nodes

Nodes maintain parent-child relationships:

  • Parent: parent node (nil for root)
  • ParentIndex: index in parent's array/object
  • ParentField: field name if parent is object
  • Fields: field names (for ObjectType)
  • Values: child values (for ObjectType and ArrayType)

Use Path() to get a JSONPath-style path string:

path := node.Path() // e.g., "$.foo.bar[0]"

Use KPath() to get a kinded path string:

kpath := node.KPath() // e.g., "foo.bar[0]"
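To show how the parent links are enough to recover a kinded path, here is a minimal sketch that walks Parent, ParentIndex, and ParentField. It uses a local stand-in Node and a hypothetical KindedPath function; sparse arrays, field quoting, and comment nodes are left out, so it is not a substitute for the package's KPath().

```go
package main

import (
	"fmt"
	"strconv"
)

type Type int

const (
	StringType Type = iota
	ObjectType
	ArrayType
)

// Node is a simplified, local stand-in with only the navigation fields.
type Node struct {
	Type        Type
	Parent      *Node
	ParentIndex int
	ParentField string
}

// KindedPath (hypothetical) builds the path from the root down to n.
func KindedPath(n *Node) string {
	if n.Parent == nil {
		return "" // root
	}
	prefix := KindedPath(n.Parent)
	if n.Parent.Type == ArrayType {
		return prefix + "[" + strconv.Itoa(n.ParentIndex) + "]"
	}
	if prefix == "" {
		return n.ParentField
	}
	return prefix + "." + n.ParentField
}

func main() {
	root := &Node{Type: ObjectType}
	a := &Node{Type: ArrayType, Parent: root, ParentField: "a"}
	elem := &Node{Type: ObjectType, Parent: a, ParentIndex: 0}
	b := &Node{Type: StringType, Parent: elem, ParentField: "b"}
	fmt.Println(KindedPath(b)) // a[0].b
}
```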

Path Operations

The package provides two path systems:

  • Path: JSONPath-style paths (e.g., "$.foo.bar[0]")
  • KPath: Kinded paths (e.g., "foo.bar[0]") that encode node kinds in syntax

Use GetPath() or GetKPath() to navigate to a node:

child, err := node.GetKPath("foo.bar[0]")
if err != nil {
    // path doesn't exist
}

Tags

Nodes can have tags (YAML-style metadata) stored in the Tag field:

node.Tag = "!or"
node.Tag = "!schema(person)"

Use tag manipulation functions:

ir.TagArgs("!schema(person)") // parse tag name and arguments
ir.TagCompose("!all", []string{"has-path"}, "foo") // compose tags; args is a []string

Comparison and Hashing

Nodes can be compared for equality:

equal := ir.Compare(a, b) == 0

Nodes can be hashed (useful for caching, deduplication):

hash := node.Hash()

JSON Interoperability

The IR itself is representable in JSON, YAML, and Tony format, making it self-describing. Nodes can be converted to/from JSON:

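jsonBytes, err := json.Marshal(node) // encoding/json; calls (*Node).MarshalJSON
var decoded ir.Node
err = json.Unmarshal(jsonBytes, &decoded) // calls (*Node).UnmarshalJSON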

This allows the IR to be serialized and manipulated in contexts without Tony format support.

Thread Safety

Node structures are not thread-safe. If you need to access nodes from multiple goroutines, you must synchronize access yourself or clone nodes for each goroutine.
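A minimal sketch of the "synchronize access yourself" option follows. It wraps a local stand-in Node in a mutex-guarded holder; SharedTree and AppendConcurrently are hypothetical names, not part of the package. For the alternative approach, each goroutine would instead work on its own copy, for which the package lists a (*Node).Clone method.

```go
package main

import (
	"fmt"
	"sync"
)

// Node is a simplified, local stand-in for ir.Node.
type Node struct {
	String string
	Values []*Node
}

// SharedTree (hypothetical) serializes all access to one tree.
type SharedTree struct {
	mu   sync.Mutex
	root *Node
}

// Append adds a child under the lock.
func (s *SharedTree) Append(child *Node) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.root.Values = append(s.root.Values, child)
}

// Len reads the child count under the lock.
func (s *SharedTree) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.root.Values)
}

// AppendConcurrently starts k goroutines that each append one child.
func AppendConcurrently(s *SharedTree, k int) {
	var wg sync.WaitGroup
	for i := 0; i < k; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Append(&Node{String: fmt.Sprint(i)})
		}(i)
	}
	wg.Wait()
}

func main() {
	tree := &SharedTree{root: &Node{}}
	AppendConcurrently(tree, 8)
	fmt.Println(tree.Len())
}
```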

Related Packages

  • github.com/signadot/tony-format/go-tony/parse - Parses text into IR nodes
  • github.com/signadot/tony-format/go-tony/encode - Encodes IR nodes to text
  • github.com/signadot/tony-format/go-tony/schema - Schema system using IR
  • github.com/signadot/tony-format/go-tony/mergeop - Operations on IR nodes

Package ir contains the Tony format implementation.


Constants

const (
	IntKeysTag = "!sparsearray"
	IntKeysFmt = "%d"
	MergeKey   = "<<"
)

Variables

var (
	ErrParse     = errors.New("parse error")
	ErrBadFormat = format.ErrBadFormat
)

Functions

func CheckTag

func CheckTag(tag string) error

func Compare added in v0.0.7

func Compare(a, b *Node) int

Compare returns an integer comparing two nodes. The result will be 0 if a==b, -1 if a < b, and +1 if a > b.

func HeadTag

func HeadTag(tag string) (string, string)

func Join added in v0.0.10

func Join(prefix string, suffix string) string

Join joins a prefix segment with a suffix kinded path. The prefix should be a single segment (field name, [index], or {index}). Returns the combined kinded path string.

Examples:

  • Join("a", "b.c") → "a.b.c"
  • Join("a", "[0]") → "a[0]"
  • Join("[0]", "b") → "[0].b"
  • Join("a", "") → "a"
  • Join("", "b") → "b"
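The listed examples are consistent with a simple rule: bracketed suffixes attach directly, everything else is joined with a dot. The sketch below encodes that reading in a hypothetical joinKPath function; it is inferred from the examples only and is not the package's implementation.

```go
package main

import "fmt"

// joinKPath (hypothetical) reproduces the behavior shown in the
// Join examples: empty sides pass through, "[..]" and "{..}"
// suffixes attach without a dot, fields are joined with ".".
func joinKPath(prefix, suffix string) string {
	switch {
	case prefix == "":
		return suffix
	case suffix == "":
		return prefix
	case suffix[0] == '[' || suffix[0] == '{':
		return prefix + suffix
	default:
		return prefix + "." + suffix
	}
}

func main() {
	fmt.Println(joinKPath("a", "b.c"))
	fmt.Println(joinKPath("a", "[0]"))
	fmt.Println(joinKPath("[0]", "b"))
}
```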

func RSplit added in v0.0.12

func RSplit(kpath string) (parentPath string, lastSegment string)

RSplit splits a kinded path into the parent path and the last segment. Returns the parent path (everything except the last segment) and the last segment. Panics if the path cannot be parsed (invalid kinded path syntax).

Examples:

  • RSplit("a.b.c") → ("a.b", "c")
  • RSplit("a[0]") → ("a", "[0]")
  • RSplit("[0].b") → ("[0]", "b")
  • RSplit("{13}.c") → ("{13}", "c")
  • RSplit("a") → ("", "a")
  • RSplit("") → ("", "")

The last segment is returned as a string representation:

  • Field: "a" or "'field name'" (quoted if needed)
  • Dense array: "[0]"
  • Sparse array: "{0}"

func SegmentFieldName added in v0.0.12

func SegmentFieldName(segment string) (fieldName string, isField bool)

SegmentFieldName extracts the field name from a segment string. Returns the unquoted field name and true if the segment is a field segment. Returns false if the segment is not a field (e.g., array index "[0]", sparse index "{0}", wildcard "*").

Examples:

  • SegmentFieldName("a") → ("a", true)
  • SegmentFieldName("'field name'") → ("field name", true)
  • SegmentFieldName("\"field name\"") → ("field name", true)
  • SegmentFieldName("[0]") → ("", false)
  • SegmentFieldName("{0}") → ("", false)
  • SegmentFieldName("*") → ("", false)
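A simplified classifier matching the examples above could look like this. The fieldName function is hypothetical; it strips one pair of matching quotes without processing escape sequences, and it treats the empty segment as a non-field, which the documentation does not specify.

```go
package main

import "fmt"

// fieldName (hypothetical) returns the unquoted field name and true
// for field segments, and ("", false) for index or wildcard segments.
func fieldName(segment string) (string, bool) {
	if segment == "" || segment == "*" {
		return "", false
	}
	switch segment[0] {
	case '[', '{':
		return "", false // dense or sparse index
	case '\'', '"':
		last := len(segment) - 1
		if last >= 1 && segment[last] == segment[0] {
			return segment[1:last], true
		}
		return "", false // unterminated quote
	}
	return segment, true
}

func main() {
	for _, s := range []string{"a", "'field name'", "[0]", "{0}", "*"} {
		name, ok := fieldName(s)
		fmt.Printf("%q -> %q %v\n", s, name, ok)
	}
}
```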

func Split added in v0.0.10

func Split(kpath string) (firstSegment string, restPath string)

Split splits a kinded path into the first segment and the remaining path. Returns the first segment as a string (suitable for use as a map key) and the rest of the path. Panics if the path cannot be parsed (invalid kinded path syntax).

Examples:

  • Split("a.b.c") → ("a", "b.c")
  • Split("[0].b") → ("[0]", "b")
  • Split("{13}.c") → ("{13}", "c")
  • Split("a") → ("a", "")
  • Split("") → ("", "")

The first segment is returned as a string representation:

  • Field: "a" or "'field name'" (quoted if needed)
  • Dense array: "[0]"
  • Sparse array: "{0}"

func SplitAll added in v0.0.10

func SplitAll(kpath string) []string

SplitAll splits a kinded path into all segments from root to leaf. Returns a slice of segment strings, each representing a valid top-level kpath. Panics if the path cannot be parsed (invalid kinded path syntax).

Examples:

  • SplitAll("a.b.c") → ["a", "b", "c"]
  • SplitAll("[0].b") → ["[0]", "b"]
  • SplitAll("{13}.c") → ["{13}", "c"]
  • SplitAll("a") → ["a"]
  • SplitAll("") → []

Each segment is a valid top-level kpath that will parse:

  • Field: "a" or "'field name'" (quoted if needed)
  • Dense array: "[0]" or "[*]"
  • Sparse array: "{0}" or "{*}"
  • Field wildcard: "*" (top-level) or ".*" (nested)
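For unquoted paths, segment splitting can be sketched as a single scan over the string. The splitAll function below is hypothetical and deliberately narrow: it does not handle quoted field names, performs no validation (so it never panics on bad input), and returns a nested field wildcard as "*" rather than ".*".

```go
package main

import "fmt"

// splitAll (hypothetical) cuts a kinded path at '.', and around
// "[...]" and "{...}" groups, returning segments from root to leaf.
func splitAll(kpath string) []string {
	var segs []string
	start := 0
	for i := 0; i < len(kpath); i++ {
		switch kpath[i] {
		case '.':
			if i > start {
				segs = append(segs, kpath[start:i])
			}
			start = i + 1
		case '[', '{':
			if i > start {
				segs = append(segs, kpath[start:i])
			}
			start = i
		case ']', '}':
			segs = append(segs, kpath[start:i+1])
			start = i + 1
		}
	}
	if start < len(kpath) {
		segs = append(segs, kpath[start:])
	}
	return segs
}

func main() {
	fmt.Println(splitAll("a.b.c"))
	fmt.Println(splitAll("a[0].b"))
	fmt.Println(splitAll("{13}.c"))
}
```

Under the same assumptions, the first and last elements of the result correspond to the segment strings that Split and RSplit return.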

func TagArgs

func TagArgs(tag string) (string, []string, string)

func TagCompose

func TagCompose(tag string, args []string, oTag string) string

func TagGet

func TagGet(tag, what string) (string, []string)

func TagHas

func TagHas(tag, what string) bool

TagHas: the what argument should be prefixed with "!".

func TagRemove

func TagRemove(tag, what string) string

func ToMap

func ToMap(node *Node) map[string]*Node

func Truth

func Truth(node *Node) bool

Types

type KPath added in v0.0.10

type KPath struct {
	Field          *string // Object field name (e.g., "a", "b") - similar to Path.Field
	FieldAll       bool    // Object field wildcard .* - matches all fields
	Index          *int    // Dense array index (e.g., 0, 1) - similar to Path.Index
	IndexAll       bool    // Dense array wildcard [*] - similar to Path.IndexAll
	SparseIndex    *int    // Sparse array index (e.g., 0, 42) - for {n} syntax
	SparseIndexAll bool    // Sparse array wildcard {*} - matches all sparse indices
	KeyValue       *Node   // Optional: for !key(path) objects, the path value (future)
	Next           *KPath  // Next segment in path (nil for leaf) - similar to Path.Next
}

KPath represents a kinded path (similar to Path but for kinded syntax). Kinded paths encode node kinds in the path syntax itself:

  • "a.b" → Object accessed via ".b" (a is ObjectType)
  • "a.*" → Object field wildcard (matches all fields)
  • "a[0]" → Dense Array accessed via "[0]" (a is ArrayType)
  • "a[*]" → Dense Array wildcard (matches all elements)
  • "a{0}" → Sparse Array accessed via "{0}" (a is an int-keyed ObjectType, i.e. a sparse array)
  • "a{*}" → Sparse Array wildcard (matches all sparse indices)

Future: Support for !key(path) objects:

  • "a.b(<value>)[2].fred" → Object with !key path value

func ParseKPath added in v0.0.10

func ParseKPath(kpath string) (*KPath, error)

ParseKPath parses a kinded path string into a KPath structure.

Kinded path syntax:

  • "a.b" → Object accessed via ".b"
  • "a[0]" → Dense Array accessed via "[0]"
  • "a[*]" → Dense Array wildcard (matches all elements)
  • "a{0}" → Sparse Array accessed via "{0}"

Examples:

  • "a.b.c" → Object path with 3 segments
  • "a[0][1]" → Dense array path with 3 segments
  • "a[*].b" → Array wildcard then object
  • "a{0}.b" → Sparse array then object
  • "" → Root path (returns nil)

Returns an error if the path syntax is invalid.

func (*KPath) Compare added in v0.0.10

func (p *KPath) Compare(other *KPath) int

Compare compares two paths lexicographically. Returns -1 if p < other, 0 if p == other, 1 if p > other.

func (*KPath) IsChildOf added in v0.0.10

func (p *KPath) IsChildOf(parent *KPath) bool

IsChildOf returns true if this path is a child of the given parent path.

func (*KPath) Parent added in v0.0.10

func (p *KPath) Parent() *KPath

Parent returns the parent path (all segments except the last). Returns nil if this is already the root segment or if there's only one segment.

func (*KPath) SegmentString added in v0.0.10

func (p *KPath) SegmentString() string

SegmentString returns the canonical string representation of this single segment. Unlike String(), this only returns the current segment, not the entire path. Examples:

  • KPath{Field: &"a"} → "a"
  • KPath{Field: &"field name"} → "'field name'" (quoted if needed)
  • KPath{Index: &0} → "[0]"
  • KPath{SparseIndex: &42} → "{42}"
  • KPath{FieldAll: true} → "*"
  • KPath{IndexAll: true} → "[*]"
  • KPath{SparseIndexAll: true} → "{*}"

func (*KPath) String added in v0.0.10

func (p *KPath) String() string

String returns the kinded path string representation of this KPath. Example:

KPath{Field: &"a", Next: &KPath{Field: &"b", ...}} → "a.b"
KPath{Field: &"a", Next: &KPath{FieldAll: true, ...}} → "a.*"
KPath{Field: &"a", Next: &KPath{Index: &0, ...}} → "a[0]"
KPath{Field: &"a", Next: &KPath{IndexAll: true, ...}} → "a[*]"
KPath{Field: &"a", Next: &KPath{SparseIndex: &42, ...}} → "a{42}"
KPath{Field: &"a", Next: &KPath{SparseIndexAll: true, ...}} → "a{*}"
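The `&"a"` notation in these examples is shorthand; Go cannot take the address of a literal. The sketch below builds the same linked segments with a small generic helper and renders them. The KPath struct is a local copy of the layout shown above (without KeyValue), and ptr, segment, and render are hypothetical names; the quoting rule in particular is a guess and does not escape embedded quotes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// KPath is a local copy of the documented struct layout, minus KeyValue.
type KPath struct {
	Field          *string
	FieldAll       bool
	Index          *int
	IndexAll       bool
	SparseIndex    *int
	SparseIndexAll bool
	Next           *KPath
}

// ptr returns a pointer to v, standing in for the &"a" shorthand.
func ptr[T any](v T) *T { return &v }

// segment renders one segment without any leading dot.
func segment(p *KPath) string {
	switch {
	case p.Field != nil:
		// Assumed quoting rule: quote when the name contains syntax characters.
		if strings.ContainsAny(*p.Field, " .[]{}*'\"") {
			return "'" + *p.Field + "'"
		}
		return *p.Field
	case p.FieldAll:
		return "*"
	case p.Index != nil:
		return "[" + strconv.Itoa(*p.Index) + "]"
	case p.IndexAll:
		return "[*]"
	case p.SparseIndex != nil:
		return "{" + strconv.Itoa(*p.SparseIndex) + "}"
	case p.SparseIndexAll:
		return "{*}"
	}
	return ""
}

// render walks Next and joins segments, adding "." before nested fields.
func render(p *KPath) string {
	var b strings.Builder
	for first := true; p != nil; p, first = p.Next, false {
		if !first && (p.Field != nil || p.FieldAll) {
			b.WriteByte('.')
		}
		b.WriteString(segment(p))
	}
	return b.String()
}

func main() {
	path := &KPath{Field: ptr("a"), Next: &KPath{Index: ptr(0), Next: &KPath{Field: ptr("b")}}}
	fmt.Println(render(path)) // a[0].b
}
```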

type KeyVal

type KeyVal struct {
	Key *Node
	Val *Node
}

type Node

type Node struct {
	Type        Type
	Parent      *Node
	ParentIndex int
	ParentField string
	Fields      []*Node
	Values      []*Node

	Tag     string
	Lines   []string
	Comment *Node

	String  string
	Bool    bool
	Number  string
	Float64 *float64
	Int64   *int64
}

func Comment added in v0.0.7

func Comment(n *Node, c string) *Node

func FromBool

func FromBool(v bool) *Node

func FromFloat

func FromFloat(f float64) *Node

func FromInt

func FromInt(v int64) *Node

func FromIntKeysMap

func FromIntKeysMap(yMap map[uint32]*Node) *Node

func FromIntKeysMapAt

func FromIntKeysMapAt(res *Node, yMap map[uint32]*Node) *Node

func FromKeyVals

func FromKeyVals(kvs []KeyVal) *Node

func FromKeyValsAt

func FromKeyValsAt(res *Node, kvs []KeyVal) *Node

func FromMap

func FromMap(yMap map[string]*Node) *Node

func FromSlice

func FromSlice(ySlice []*Node) *Node

func FromString

func FromString(v string) *Node

func FromStringAt

func FromStringAt(p *Node, v string) *Node

func Get

func Get(y *Node, field string) *Node

func Null

func Null() *Node

func (*Node) Clone

func (y *Node) Clone() *Node

func (*Node) CloneTo

func (y *Node) CloneTo(dst *Node) *Node

func (*Node) DeepEqual added in v0.0.9

func (y *Node) DeepEqual(other *Node) bool

DeepEqual reports whether two nodes are deeply equal. It compares all data fields recursively, but does not compare Parent, ParentIndex, or ParentField as these are structural metadata.

func (*Node) FromTonyIR added in v0.0.6

func (node *Node) FromTonyIR(o *Node) error

func (*Node) GetKPath added in v0.0.10

func (y *Node) GetKPath(kpath string) (*Node, error)

GetKPath navigates an ir.Node tree using a kinded path. Similar to GetPath() but uses kinded path syntax.

Example:

rootNode.GetKPath("a.b.c") starts at rootNode and navigates to the value of field "a", then field "b" within it, then field "c".

Returns an error if the path doesn't exist or is invalid.

func (*Node) GetPath

func (y *Node) GetPath(yPath string) (*Node, error)

func (*Node) Hash added in v0.0.7

func (n *Node) Hash() uint64

Hash returns a 64-bit hash of the node. The hash includes comments. It panics if n is nil.

func (*Node) KPath added in v0.0.10

func (y *Node) KPath() string

KPath returns the kinded path string representation of this node's position in the tree. Similar to Path() but returns kinded path syntax (e.g., "a.b[0]" instead of "$.a.b[0]").

Examples:

  • Root node → ""
  • Object field "a" → "a"
  • Array element at index 0 → "[0]"
  • Nested object "a.b" → "a.b"
  • Mixed "a[0].b" → "a[0].b"

func (*Node) ListKPath added in v0.0.10

func (y *Node) ListKPath(dst []*Node, kpath string) ([]*Node, error)

ListKPath traverses an ir.Node tree and collects all nodes matching a kinded path. Similar to ListPath() but uses kinded path syntax.

Returns a slice of matching nodes.

func (*Node) ListPath

func (y *Node) ListPath(dst []*Node, yPath string) ([]*Node, error)

func (*Node) MarshalJSON

func (y *Node) MarshalJSON() ([]byte, error)

func (*Node) NonCommentParent

func (y *Node) NonCommentParent() *Node

func (*Node) Path

func (y *Node) Path() string

func (*Node) RemoveComments added in v0.0.7

func (y *Node) RemoveComments()

RemoveComments removes all comment nodes from the tree recursively. It removes:

  • Comment nodes (Type == CommentType)
  • Comment fields from all nodes (sets Comment to nil)

func (*Node) Root

func (y *Node) Root() *Node

func (*Node) ToIntKeysMap

func (y *Node) ToIntKeysMap() (map[uint32]*Node, error)

func (*Node) ToTonyIR added in v0.0.6

func (node *Node) ToTonyIR() (*Node, error)

func (*Node) UnmarshalJSON

func (y *Node) UnmarshalJSON(d []byte) error

func (*Node) Visit

func (y *Node) Visit(f func(y *Node, isPost bool) (bool, error)) error

func (*Node) WithTag

func (y *Node) WithTag(tag string) *Node

type Path

type Path struct {
	IndexAll bool
	Index    *int
	Field    *string
	Subtree  bool
	Next     *Path
}

func ParsePath

func ParsePath(p string) (*Path, error)

func (*Path) String

func (p *Path) String() string

type Type

type Type int
const (
	NullType Type = iota
	NumberType
	StringType
	BoolType
	ObjectType
	ArrayType
	CommentType
)

func Types

func Types() []Type

func (Type) IsLeaf

func (t Type) IsLeaf() bool

func (Type) MarshalText

func (t Type) MarshalText() ([]byte, error)

func (Type) String

func (t Type) String() string

func (*Type) UnmarshalText

func (t *Type) UnmarshalText(d []byte) error
