dql

package
v1.6.5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 18, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

README

DQL - DOM Query Language

DQL is a universal, expressive query language for structured and tree-shaped data, built for XGo. It provides a unified query interface across JSON, XML, HTML, ASTs, file systems, and any other domain that can be modeled as a tree.


Table of Contents


Overview

DQL brings a concise, chainable query syntax to XGo. Think of it as XPath or CSS selectors — but designed for XGo's type system, with first-class support for lazy evaluation, error propagation, and domain-specific adaptations.

// Find all hyperlinks in an HTML document
for a in doc.**.a {
    if url := a.$href; url != "" {
        echo url
    }
}

// Filter JSON records
for animal in doc.animals.*@($class == "zebra") {
    echo animal.$at
}

// Walk the file system
for e in fs`.`.**.file.match("*.xgo") {
    echo e.path
}

Supported Domains

DQL is not tied to any single format. Any data that can be represented as a tree can be queried with DQL:

  • JSON — query fields, arrays, and nested objects
  • YAML — same model as JSON
  • HTML / XML — traverse elements, attributes, and text
  • Go / XGo AST — inspect and filter syntax trees
  • File System — walk directories, match files by name or extension
  • Any custom tree structure — implement the NodeSet interface

Core Concepts

NodeSet

Everything in DQL is a NodeSet. Queries consume a NodeSet and produce a new NodeSet. The entry point is typically the root document node, called doc.

doc NodeSet  // root of the query

NodeSets are lazily evaluated — they describe a query plan that is executed only when you actually iterate or read data from them. This keeps memory usage low and allows complex query composition without overhead.

Child Access

Navigate the tree with dot notation:

ns.name          // direct child named "name"
ns."elem-name"   // child with a hyphenated name
ns.*             // all direct children (wildcard)
ns.**.name       // all descendants named "name" (deep query)
ns.**.*          // all descendants
ns[0]            // first element of an array/list node (0-based)
Filtering (Select)

Filter nodes using @:

ns@name          // keep only nodes named "name"
ns@(condExpr)    // keep nodes matching a boolean expression
ns@fn(args)      // keep nodes where fn(...) returns true

Examples:

doc.users.*@($role == "admin")         // users whose role attribute is "admin"
doc.users.*@($age.int?:100 < 18)       // users under 18
doc.**.div@hasClass("widget")          // all divs with a given CSS class
Attributes

Read node-level data with $:

val := ns.$name              // single-value: zero value on error
val, err := ns.$name         // dual-value: explicit error handling
val := ns.$name!             // panic on error
val := ns.$name?:"default"   // use a fallback value on error
val := ns.$"attr-name"       // hyphenated attribute names

Attribute access always operates on the first node in the NodeSet. To collect an attribute from every node, use a list comprehension:

names := [user.$name for user in doc.users.*]
Methods

Methods provide access to computed or typed data:

text := div.text             // call Text() method (single-value)
text, err := div.text        // dual-value
count := div.count?:0        // explicit default for numeric methods
age, err := node.$age.int    // numeric methods require dual-value

Methods prefixed with _ map to XGo_-prefixed implementations:

node._text    // → node.XGo_text()
node._count   // → node.XGo_count()
node._first   // → node.XGo_first()

Note on numeric methods: Methods that return numeric types (like int, float, count) do not have a single-value form, to prevent silent bugs where a zero default masks a real error. Always use the dual-value form or an explicit ?: default.


Syntax Reference

Syntax Meaning
ns.name Direct child named name
ns."elem-name" Direct child with special characters
ns.* All direct children
ns[n] Child at index n (0-based)
ns.**.name Deep search for name
ns.**.* All descendants
ns@name Filter: keep nodes named name
ns@(expr) Filter: keep nodes matching expression
ns.$name Attribute (single-value, zero on error)
val, err := ns.$name Attribute (dual-value)
ns.$name! Attribute, panic on error
ns.$name?:def Attribute, custom default on error
ns.method(args) Method call
ns._method Method call via XGo_method()
ns.all / ns._all Materialize and cache all results
ns.one / ns._one First match, early termination
ns.single / ns._single Exactly one match (validates uniqueness)

Query Examples

JSON
doc := json`{
    "animals": [
        {"class": "gopher", "at": "Line 1"},
        {"class": "armadillo", "at": "Line 2"},
        {"class": "zebra", "at": "Line 3"},
        {"class": "unknown", "at": "Line 4"},
        {"class": "gopher", "at": "Line 5"},
        {"class": "bee", "at": "Line 6"},
        {"class": "gopher", "at": "Line 7"},
        {"class": "zebra", "at": "Line 8"}
    ]
}`!

// Iterate filtered records
for animal in doc.animals.*@($class == "zebra") {
    echo animal.$at
}

// Collect all names into a list
names := [a.$class for a in doc.animals.*]

// Access by index
first := doc.animals[0].$class

// Find unique admin (validate exactly one exists)
user := doc.users.*@($role == "admin")._single
name, err := user.$name
HTML
import "os"
import "github.com/goplus/xgo/dql/html"

doc := html.source(os.Args[1])

// Print all hyperlink URLs
for a in doc.**.a {
    if url := a.$href; url != "" {
        echo url
    }
}

// Collect text from all paragraphs
texts := [p.text for p in doc.**.p]

// Find the first element with a specific class (early termination)
widget := doc.**.*@($class == "widget").one
id := widget.$id
XGo AST
doc := xgo`
x, y := "Hi", 123
echo x
print y
`!

// Find all expression statements
stmts := doc.shadowEntry.body.list.*@(self.class == "ExprStmt")

// Extract function names from call expressions
for fn in stmts.x@(self.class == "CallExpr").fun@(self.class == "Ident") {
    echo fn.$name
}
File System
// Walk current directory and print all .xgo files
for e in fs`.`.**.file.match("*.xgo") {
    echo e.path
}

// Collect all Go source file names
names := [f.$name for f in root.**.file@match("*.go", $name)]

Error Handling

DQL integrates with XGo's error handling operators and follows a consistent two-version design for attribute and method access.

Two-Version Design

Most accessors come in two forms:

// Single-value: convenient, returns zero on error
name := node.$name

// Dual-value: explicit, lets you inspect the error
name, err := node.$name

Additionally, XGo's error operators work directly on DQL expressions:

name := node.$name!           // panic if error
name := node.$name?:"N/A"     // custom fallback
NodeSet Error State

Some operations return a NodeSet that internally carries an error (e.g., _one when no match is found, _single when zero or multiple matches are found). This error propagates automatically to any subsequent attribute or method access:

admin := doc.users.*@($role == "admin")._one

// All of these reflect the internal ErrNotFound:
name := admin.$name              // returns zero value
name, err := admin.$name         // err == dql.ErrNotFound
name := admin.$name!             // panics

An empty NodeSet is still a valid NodeSet — loops simply don't execute:

for user in admin {
    // Not reached if admin has ErrNotFound
}

Performance and Caching

DQL uses lazy evaluation by default. Queries are not executed until you iterate or read from the NodeSet. This is memory-efficient and enables query composition, but means a NodeSet re-executes its query each time it is accessed.

Use cache control methods to avoid repeated execution:

_all / all — Materialize Everything

Executes the query and caches all results. Use when you need to access the same NodeSet multiple times.

users := doc.users.*@($active == true)._all

names  := [u.$name  for u in users]   // uses cache
emails := [u.$email for u in users]   // uses cache, no re-query
_one / one — First Match, Early Exit

Stops after finding the first matching node. Use when you know (or assume) there is at most one result.

admin := doc.users.*@($role == "admin")._one
name := admin.$name   // ErrNotFound if no match
_single / single — Uniqueness Validation

Validates that exactly one node matches. Returns ErrMultipleEntities if more than one is found.

user := doc.users.*@($id == 12345)._single
name, err := user.$name
// err may be ErrNotFound or ErrMultipleEntities
Choosing the Right Method
Need Use
Multiple accesses to same results _all
Expect one result, want fast exit _one
Require exactly one result _single
Single-pass iteration no cache (default lazy)

Implementing a NodeSet

To make a custom data source queryable with DQL, implement the following interface on your NodeSet type.

Required Methods
// Iteration
func (ns NodeSet) XGo_Enum() iter.Seq[NodeSet]

// Extract first node
func (ns NodeSet) XGo_first() (NodeType, error)

// Type conversion from raw sequence
func NodeSet_Cast(seq func(func(NodeType) bool)) NodeSet
Child Navigation
func (ns NodeSet) XGo_Elem(name string) NodeSet    // ns.name
func (ns NodeSet) XGo_Child() NodeSet              // ns.*
func (ns NodeSet) XGo_Index(index int) NodeSet     // ns[n]
func (ns NodeSet) XGo_Any(name string) NodeSet     // ns.**.name  ("" = all)
Filtering
func (ns NodeSet) XGo_Select(name string) NodeSet  // ns@name
// Conditional filters are compiler-generated using NodeSet_Cast
Attribute Access
func (ns NodeSet) XGo_Attr__0(name string) ValueType          // single-value
func (ns NodeSet) XGo_Attr__1(name string) (ValueType, error) // dual-value
Cache Control

Provide either general-form (XGo_all, XGo_one, XGo_single) or domain-specific form (All, One, Single) methods:

func (ns NodeSet) XGo_all()    NodeSet  // or All()
func (ns NodeSet) XGo_one()    NodeSet  // or One()
func (ns NodeSet) XGo_single() NodeSet  // or Single()

Domain types (NodeType, ValueType, IndexType) are customized per implementation.


Standard Errors

The dql package defines two standard sentinel errors:

package dql

var (
    ErrNotFound         = errors.New("node not found")
    ErrMultipleEntities = errors.New("multiple entities found, expected single")
)

These are returned by _one (when no match exists) and _single (when zero or more than one match exists), and propagate through the NodeSet to any subsequent attribute or method call.

Documentation

Index

Constants

View Source
const (
	XGoPackage = true
)

Variables

View Source
var (
	ErrNotFound      = errors.New("entity not found")
	ErrMultiEntities = errors.New("too many entities found")
)

Functions

func Collect

func Collect[T any, Seq ~func(func(T) bool)](seq Seq) []T

Collect retrieves all items from the provided sequence.

func First

func First[T any, Seq ~func(func(T) bool)](seq Seq) (ret T, err error)

First retrieves the first item from the provided sequence. If the sequence is empty, it returns ErrNotFound.

func Int

func Int(text string) (int, error)

Int parses the given string as an integer, removing any commas and trimming whitespace.

func NopIter

func NopIter[T any](yield func(T) bool)

NopIter is a no-operation iterator that yields no values.

func Single

func Single[T any, Seq ~func(func(T) bool)](seq Seq) (ret T, err error)

Single retrieves a single item from the provided sequence. If the sequence is empty, it returns ErrNotFound. If the sequence contains more than one item, it returns ErrMultiEntities.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL