tpl

package

Version: v1.4.6 (latest) | Published: May 17, 2025 | License: Apache-2.0 | Imports: 13 | Imported by: 0

Repository

github.com/goplus/gop

README ¶

TPL: Text Processing Language

Text processing is a common task in programming, and regular expressions have long been the go-to solution. However, regular expressions are notorious for their cryptic syntax and poor readability. Enter Go+ TPL (Text Processing Language), an enhanced alternative that offers both power and intuitive syntax.

Go+ TPL is a grammar-based language similar to EBNF (Extended Backus-Naur Form) that seamlessly integrates with Go+. It provides a more readable and maintainable approach to text processing while offering capabilities beyond what regular expressions can achieve.

Understanding Go+ TPL

To understand Go+ TPL, you need to grasp three key concepts:

1. Naming Rules

The foundation of TPL is its naming rules, expressed as name = rule. A TPL grammar consists of a series of named rules, with the first one being the root rule. The rule can be a combination of:

Basic Tokens: Fundamental syntax units like INT, FLOAT, CHAR, STRING, IDENT, "+", "++", "+=", "<<=", etc.
Keywords: An IDENT enclosed in quotes, such as "if", "else", "for".
References: References to other named rules, including self-references.
Sequence: R1 R2 ... Rn - matches a sequence of rules.
Alternatives: R1 | R2 | ... | Rn - matches any one of the rules.
Repetition Operators:
- *R - matches the rule zero or more times
- +R - matches the rule one or more times
- ?R - matches the rule zero or one time (optional)
List Operator: R1 % R2 - shorthand for R1 *(R2 R1), representing a sequence of R1 separated by R2. For example, INT % "," represents a comma-separated list of integers.
Adjacency Operator: R1 ++ R2 - indicates that R1 and R2 must be adjacent with no whitespace or comments between them.

The default operator precedence is: unary operators (*R, +R, ?R) > ++ > % > sequence (space) > |. Parentheses can be used to change the precedence.

String Literals in Detail

STRING (string literals) can take two forms:

"Hello\nWorld\n"  // QSTRING (quoted string)

`Hello
World
`               // RAWSTRING (raw string)

STRING can be defined as:

STRING = QSTRING | RAWSTRING
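
As a rough illustration in plain Go, the two forms can denote the same string value. The sketch below assumes that TPL string literals follow Go's lexical syntax (an assumption, not something stated above) and uses the standard library's strconv.Unquote; stringValue is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strconv"
)

// stringValue decodes a string literal written in Go syntax.
// It accepts both the quoted form ("...") and the raw form (`...`).
// Hypothetical helper; it is not part of the tpl package.
func stringValue(lit string) (string, error) {
	return strconv.Unquote(lit)
}

func main() {
	quoted := `"Hello\nWorld\n"` // QSTRING form: \n is an escape sequence
	raw := "`Hello\nWorld\n`"    // RAWSTRING form: contains real newlines

	q, err := stringValue(quoted)
	if err != nil {
		panic(err)
	}
	r, err := stringValue(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(q == r) // true
}
```

Both literals decode to the same two-line string, which is why a grammar can usually treat them uniformly as STRING.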

The Adjacency Operator Explained

Since TPL rules automatically filter whitespace and comments, the sequence R1 R2 doesn't express that R1 and R2 are adjacent. This is where the adjacency operator ++ comes in.

For example, a Go+ domain text literal is defined as IDENT ++ RAWSTRING, making these valid:

tpl`expr = INT % ","`
json`{"name": "Ken", age: 15}`

By contrast, these would match IDENT STRING but are not valid domain text literals:

tpl"expr = *INT"              // IDENT must be followed by RAWSTRING, not QSTRING
tpl/* comment */`expr = *INT` // No whitespace or comments allowed between IDENT and RAWSTRING
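
One way to picture the adjacency check is to compare token positions: two tokens are adjacent when the second starts exactly where the first ends. The sketch below is plain Go and uses the standard go/scanner package as a stand-in for the TPL scanner; isDomainTextLit is a hypothetical helper, and nothing here is taken from the tpl implementation.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// isDomainTextLit reports whether src begins with an identifier that is
// immediately followed by a raw string, i.e. IDENT ++ RAWSTRING.
// Hypothetical helper for illustration only.
func isDomainTextLit(src string) bool {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: comments are skipped, like whitespace

	pos1, tok1, lit1 := s.Scan()
	pos2, tok2, lit2 := s.Scan()

	if tok1 != token.IDENT || tok2 != token.STRING {
		return false
	}
	if len(lit2) == 0 || lit2[0] != '`' {
		return false // a quoted string, not a raw string
	}
	// Adjacent means the raw string starts right where the identifier ends.
	return pos2 == pos1+token.Pos(len(lit1))
}

func main() {
	fmt.Println(isDomainTextLit("tpl`expr = *INT`"))               // true
	fmt.Println(isDomainTextLit("tpl\"expr = *INT\""))             // false
	fmt.Println(isDomainTextLit("tpl/* comment */`expr = *INT`")) // false
}
```

The third input still scans as IDENT followed by STRING because the comment is filtered out; only the position comparison reveals the gap.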

2. Matching Results

Each rule has its built-in matching result:

Tokens and Keywords: Result is *tpl.Token.
Sequence (R1 R2 ... Rn): Result is a list ([]any) with n elements.
Repetition (*R, +R): Result is a list ([]any) with elements depending on how many times R matches.
Alternatives (R1 | R2 | ... | Rn): Result depends on which rule matches.
Optional (?R): Result is either the result of R or nil if no match.
List Operator (R1 % R2): Result is a complex tree-like structure with three levels.

Let's explain why it has three levels:
1. The first level is the result of the entire expression R1 % R2 (i.e., R1 *(R2 R1)), which is a list with two elements.
2. The first element of this list is the result of the first R1.
3. The second element is a list containing the results of all subsequent (R2 R1) matches.
  - Each element in this second-level list is itself a list with two elements: the result of R2 and the result of R1.
For example, when parsing "1, 2, 3" with INT % ",", the result structure would be:
```
[
  <INT:1>,                // First R1
  [
    [<COMMA>, <INT:2>],   // First (R2 R1)
    [<COMMA>, <INT:3>]    // Second (R2 R1)
  ]
]
```
This tree-like structure preserves all the information about the matched elements and their relationships, but can be complex to work with directly. That's why TPL provides helper functions like ListOp and BinaryOp to transform this structure into more usable forms.
Adjacency Operator (R1 ++ R2): Result is a list ([]any) with 2 elements, similar to a R1 R2 sequence.
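
To make the shape of the list-operator result concrete, here is a minimal hand-written matcher in plain Go for INT % ",", read as INT *("," INT). It is only a sketch: tok, tokenize and matchIntList are hypothetical names, the standard text/scanner package stands in for the TPL scanner, and the real matcher may work quite differently.

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// tok is a hypothetical stand-in for *tpl.Token.
type tok struct {
	Kind string // "INT" or "COMMA"
	Lit  string
}

// tokenize splits src into INT and COMMA tokens, skipping whitespace.
func tokenize(src string) []tok {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	var toks []tok
	for r := s.Scan(); r != scanner.EOF; r = s.Scan() {
		switch r {
		case scanner.Int:
			toks = append(toks, tok{"INT", s.TokenText()})
		case ',':
			toks = append(toks, tok{"COMMA", ","})
		}
	}
	return toks
}

// matchIntList matches INT *("," INT) and returns the nested result:
// [first INT, [[",", INT], [",", INT], ...]].
func matchIntList(toks []tok) ([]any, bool) {
	if len(toks) == 0 || toks[0].Kind != "INT" {
		return nil, false
	}
	rest := []any{}
	for i := 1; i+1 < len(toks) && toks[i].Kind == "COMMA" && toks[i+1].Kind == "INT"; i += 2 {
		rest = append(rest, []any{toks[i], toks[i+1]}) // one (R2 R1) match
	}
	return []any{toks[0], rest}, true
}

func main() {
	result, ok := matchIntList(tokenize("1, 2, 3"))
	fmt.Println(ok, len(result))            // true 2
	fmt.Println(len(result[1].([]any)))     // 2: two ("," INT) pairs
	fmt.Println(result[0].(tok).Lit)        // 1
}
```

The outer list always has exactly two elements, no matter how many items were matched; the item count shows up only in the length of the second element.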

3. Rewriting Matching Results

The default matching result is called "self" in TPL. You can rewrite this result using a Go+ closure => { ... }.

This feature is crucial as it allows seamless integration between TPL and Go+. In Go+, you reference TPL through domain text literals, and within TPL, you can call Go+ code through result rewriting.

Practical Examples

Basic Example: Parsing Integers

import "gop/tpl"

cl := tpl`
expr = INT % "," => {
    return tpl.ListOp[int](self, v => {
        return v.(*tpl.Token).Lit.int!
    })
}
`!

echo cl.parseExpr("1, 2, 3", nil)!  // Outputs: [1 2 3]

This example parses a comma-separated list of integers and converts it to a flat list of integers using TPL's ListOp function.

Building a Calculator

Creating a calculator with Go+ TPL is remarkably concise:

import "gop/tpl"

cl := tpl`
expr = operand % ("*" | "/") % ("+" | "-") => {
    return tpl.BinaryOp(true, self, (op, x, y) => {
        switch op.Tok {
        case '+': return x.(float64) + y.(float64)
        case '-': return x.(float64) - y.(float64)
        case '*': return x.(float64) * y.(float64)
        case '/': return x.(float64) / y.(float64)
        }
        panic("unexpected")
    })
}

operand = basicLit | unaryExpr

unaryExpr = "-" operand => {
    return -(self[1].(float64))
}

basicLit = INT | FLOAT => {
    return self.(*tpl.Token).Lit.float!
}
`!

echo cl.parseExpr("1 + 2 * -3", nil)!  // Outputs: -5

This calculator handles basic arithmetic operations with proper operator precedence in fewer than 30 lines of code.
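
The precedence comes from nesting: each operand of the outer + and - level is itself the result of an inner * and / level. The plain Go sketch below folds such a nested result from left to right. It assumes that the recursive flag of BinaryOp means "operands that are themselves X % op results are folded first"; that reading, along with the names binaryOp, operand and calc, is an assumption made for illustration, and strings stand in for operator tokens.

```go
package main

import "fmt"

// binaryOp folds the result of X % op, shaped [x0, [[op, x1], [op, x2], ...]],
// from left to right. Hypothetical stand-in, not the tpl implementation.
func binaryOp(in []any, fn func(op string, x, y any) any) any {
	acc := operand(in[0], fn)
	for _, pair := range in[1].([]any) {
		p := pair.([]any)
		acc = fn(p[0].(string), acc, operand(p[1], fn))
	}
	return acc
}

// operand folds v first if it is itself a nested X % op result.
func operand(v any, fn func(op string, x, y any) any) any {
	if nested, ok := v.([]any); ok {
		return binaryOp(nested, fn)
	}
	return v
}

// calc applies one arithmetic operator to two float64 operands.
func calc(op string, x, y any) any {
	a, b := x.(float64), y.(float64)
	switch op {
	case "+":
		return a + b
	case "-":
		return a - b
	case "*":
		return a * b
	case "/":
		return a / b
	}
	panic("unexpected operator " + op)
}

func main() {
	// Hand-built result for "1 + 2 * -3": the outer level separates terms
	// by "+", and each term is an inner level separated by "*".
	self := []any{
		[]any{1.0, []any{}}, // term "1"
		[]any{
			[]any{"+", []any{2.0, []any{[]any{"*", -3.0}}}}, // term "2 * -3"
		},
	}
	fmt.Println(binaryOp(self, calc)) // -5
}
```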

Conclusion

Go+ TPL offers a powerful yet intuitive alternative to regular expressions for text processing. By combining grammar-based parsing with seamless Go+ integration, it enables developers to create clear, maintainable text processing solutions.

For more examples of TPL in action, check out the Go+ demos starting with tpl- at https://github.com/goplus/gop/tree/main/demo. These examples showcase how to implement calculators, parse text to generate ASTs, and even implement entire languages in just a few hundred lines of code.

Whether you're parsing structured text, building domain-specific languages, or implementing complex text transformations, Go+ TPL provides a robust and readable approach that surpasses traditional regular expressions.

Documentation ¶

Index ¶

func BasicLit(this any) *ast.BasicLit
func BinaryExpr(recursive bool, in []any) ast.Expr
func BinaryExprNR(in []any) ast.Expr
func BinaryExprR(in []any) ast.Expr
func BinaryOp(recursive bool, in []any, fn func(op *Token, x, y any) any) any
func BinaryOpNR(in []any, fn func(op *Token, x, y any) any) any
func BinaryOpR(in []any, fn func(op *Token, x, y any) any) any
func Dump(result any, omitSemi ...bool)
func Fdump(w io.Writer, ret any, prefix, indent string, omitSemi bool)
func Ident(this any) *ast.Ident
func List(in []any) []any
func ListOp[T any](in []any, fn func(v any) T) []T
func Panic(pos token.Pos, msg string)
func RangeOp(in []any, fn func(v any))
func Relocate(err error, filename string, line, col int) error
func ShowConflict(f bool) int
func UnaryExpr(in []any) ast.Expr
type Compiler
- func FromFile(fset *token.FileSet, filename string, src any, conf *cl.Config) (ret Compiler, err error)
- func New(src any, params ...any) (ret Compiler, err error)
- func NewEx(src any, filename string, line, col int, params ...any) (ret Compiler, err error)
- func (p *Compiler) Match(filename string, src any, conf *Config) (ms MatchState, result any, err error)
- func (p *Compiler) Parse(filename string, src any, conf *Config) (result any, err error)
- func (p *Compiler) ParseExpr(x string, conf *Config) (result any, err error)
- func (p *Compiler) ParseExprFrom(filename string, src any, conf *Config) (result any, err error)
type Config
type Error
type MatchState
- func (p *MatchState) Next() *Token
type Scanner
type Token

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func BasicLit ¶ added in v1.3.7

func BasicLit(this any) *ast.BasicLit

BasicLit converts the matching result of a basic literal to an ast.BasicLit expression.

func BinaryExpr ¶ added in v1.3.7

func BinaryExpr(recursive bool, in []any) ast.Expr

BinaryExpr converts the matching result of (X % op) to a binary expression. X % op means X *(op X)

func BinaryExprNR ¶ added in v1.3.7

func BinaryExprNR(in []any) ast.Expr

func BinaryExprR ¶ added in v1.3.7

func BinaryExprR(in []any) ast.Expr

func BinaryOp ¶ added in v1.3.7

func BinaryOp(recursive bool, in []any, fn func(op *Token, x, y any) any) any

func BinaryOpNR ¶ added in v1.3.7

func BinaryOpNR(in []any, fn func(op *Token, x, y any) any) any

func BinaryOpR ¶ added in v1.3.7

func BinaryOpR(in []any, fn func(op *Token, x, y any) any) any

func Dump ¶

func Dump(result any, omitSemi ...bool)

func Fdump ¶

func Fdump(w io.Writer, ret any, prefix, indent string, omitSemi bool)

func Ident ¶ added in v1.3.7

func Ident(this any) *ast.Ident

Ident converts the matching result of an identifier to an ast.Ident expression.

func List ¶ added in v1.3.7

func List(in []any) []any

List converts the matching result of (R % ",") to a flat list. R % "," means R *("," R)

func ListOp ¶ added in v1.3.9

func ListOp[T any](in []any, fn func(v any) T) []T

ListOp converts the matching result of (R % ",") to a flat list, applying fn to the result of each R. R % "," means R *("," R).

func Panic ¶ added in v1.3.7

func Panic(pos token.Pos, msg string)

Panic panics with a matcher error.

func RangeOp ¶ added in v1.3.7

func RangeOp(in []any, fn func(v any))

RangeOp traverses the matching result of (R % ",") and calls fn(result of R) for each match. R % "," means R *("," R).

func Relocate ¶ added in v1.3.8

func Relocate(err error, filename string, line, col int) error

Relocate relocates the error positions.

func ShowConflict ¶ added in v1.3.8

func ShowConflict(f bool) int

ShowConflict sets the flag to show or hide conflicts.

func UnaryExpr ¶ added in v1.3.7

func UnaryExpr(in []any) ast.Expr

UnaryExpr converts the matching result of (op X) to a unary expression.

Types ¶

type Compiler ¶

type Compiler struct {
	cl.Result
}

Compiler represents a TPL compiler.

func FromFile ¶

func FromFile(fset *token.FileSet, filename string, src any, conf *cl.Config) (ret Compiler, err error)

FromFile creates a new TPL compiler from a file. fset can be nil.

func New ¶

func New(src any, params ...any) (ret Compiler, err error)

New creates a new TPL compiler. params are given in pairs: ruleName1, retProc1, ..., ruleNameN, retProcN.

func NewEx ¶ added in v1.3.8

func NewEx(src any, filename string, line, col int, params ...any) (ret Compiler, err error)

NewEx creates a new TPL compiler. params: ruleName1, retProc1, ..., ruleNameN, retProcN

func (*Compiler) Match ¶

func (p *Compiler) Match(filename string, src any, conf *Config) (ms MatchState, result any, err error)

Match matches a source file.

func (*Compiler) Parse ¶

func (p *Compiler) Parse(filename string, src any, conf *Config) (result any, err error)

Parse parses a source file.

func (*Compiler) ParseExpr ¶

func (p *Compiler) ParseExpr(x string, conf *Config) (result any, err error)

ParseExpr parses an expression.

func (*Compiler) ParseExprFrom ¶

func (p *Compiler) ParseExprFrom(filename string, src any, conf *Config) (result any, err error)

ParseExprFrom parses an expression from a file.

type Config ¶

type Config struct {
	Scanner          Scanner
	ScanErrorHandler scanner.ErrorHandler
	ScanMode         scanner.Mode
	Fset             *token.FileSet
}

Config represents a parsing configuration of Compiler.Parse.

type Error ¶ added in v1.3.8

type Error = matcher.Error

Error represents a matching error.

type MatchState ¶ added in v1.3.7

type MatchState struct {
	Toks []*Token
	Ctx  *matcher.Context
	N    int
}

MatchState represents a matching state.

func (*MatchState) Next ¶ added in v1.3.7

func (p *MatchState) Next() *Token

Next returns the next token.

type Scanner ¶

type Scanner interface {
	Scan() Token
	Init(file *token.File, src []byte, err scanner.ErrorHandler, mode scanner.Mode)
}

Scanner represents a TPL scanner.

type Token ¶

type Token = types.Token

A Token is a lexical unit returned by Scan.

Source Files ¶

tpl.go

Directories ¶

Path	Synopsis
ast
cl
encoding
encoding/csv
encoding/json
encoding/regexp
encoding/regexposix
encoding/xml
matcher
parser
parser/parsertest
scanner
scanner/scannertest
token
types
variant
variant/builtin
variant/delay
variant/math
variant/time