package preprocess

Version: v0.26.1
Warning: This package is not in the latest version of its module.

Published: Feb 19, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package preprocess provides the Rugo source preprocessing pipeline. It handles syntactic sugar expansion, shell fallback, string interpolation, and other source transformations that run before parsing.

Constants

This section is empty.

Variables

var RugoKeywords = map[string]bool{
	"if": true, "elsif": true, "else": true, "end": true,
	"while": true, "for": true, "in": true, "def": true,
	"return": true, "require": true, "break": true, "next": true,
	"true": true, "false": true, "nil": true, "import": true, "use": true,
	"rats": true, "try": true, "or": true,
	"spawn": true, "parallel": true, "bench": true, "fn": true,
	"struct": true, "with": true, "sandbox": true, "do": true,
}
var Sources embed.FS

Functions

func ExpandBareAppend

func ExpandBareAppend(src string) string

ExpandBareAppend desugars bare append statements.

append(x, val)  → x = append(x, val)

Only rewrites when append( starts the line (after whitespace). Lines like "y = append(x, val)" or "puts(append(x, val))" are left alone.
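
The rule above can be illustrated with a minimal sketch. It is not the package's implementation: `sketchExpandBareAppend` and the `bareAppend` pattern are hypothetical names, and the sketch assumes the first argument is a plain identifier.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bareAppend matches a line that starts (after optional indentation) with
// append(NAME, and captures the indentation and NAME.
// Assumption: the first argument is a plain identifier.
var bareAppend = regexp.MustCompile(`^([ \t]*)append\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*,`)

// sketchExpandBareAppend is a hypothetical stand-in for ExpandBareAppend.
func sketchExpandBareAppend(src string) string {
	lines := strings.Split(src, "\n")
	for i, line := range lines {
		m := bareAppend.FindStringSubmatch(line)
		if m == nil {
			continue // not a bare append statement; leave the line alone
		}
		indent, target := m[1], m[2]
		// Keep the indentation and prefix the statement with "NAME = ".
		lines[i] = indent + target + " = " + line[len(indent):]
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Println(sketchExpandBareAppend("append(x, val)"))     // x = append(x, val)
	fmt.Println(sketchExpandBareAppend("y = append(x, val)")) // unchanged
}
```

Anchoring the pattern at the start of the line is what keeps assignments and nested calls untouched.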

func ExpandCompoundAssign

func ExpandCompoundAssign(src string) string

ExpandCompoundAssign desugars compound assignment operators.

x += y       → x = x + y
arr[0] += y  → arr[0] = arr[0] + y

Handles +=, -=, *=, /=, %=. Respects string boundaries.
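
As a rough sketch of the rewrite, the following handles one line at a time and skips operators inside double-quoted strings. `sketchExpandCompoundAssign` is a hypothetical name, and the real function may treat other string forms and edge cases differently.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchExpandCompoundAssign is a hypothetical, single-line stand-in for
// ExpandCompoundAssign. Assumption: only double-quoted strings are tracked.
func sketchExpandCompoundAssign(line string) string {
	inString := false
	for i := 0; i+1 < len(line); i++ {
		ch := line[i]
		if inString && ch == '\\' {
			i++ // skip the escaped byte
			continue
		}
		if ch == '"' {
			inString = !inString
			continue
		}
		if inString {
			continue
		}
		if strings.IndexByte("+-*/%", ch) >= 0 && line[i+1] == '=' {
			indent := line[:len(line)-len(strings.TrimLeft(line, " \t"))]
			target := strings.TrimSpace(line[:i])
			rhs := strings.TrimSpace(line[i+2:])
			// target OP= rhs  becomes  target = target OP rhs
			return indent + target + " = " + target + " " + string(ch) + " " + rhs
		}
	}
	return line
}

func main() {
	fmt.Println(sketchExpandCompoundAssign("x += y"))         // x = x + y
	fmt.Println(sketchExpandCompoundAssign("arr[0] += y"))    // arr[0] = arr[0] + y
	fmt.Println(sketchExpandCompoundAssign(`puts("a += b")`)) // unchanged
}
```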

func ExpandHashColonSyntax

func ExpandHashColonSyntax(src string) (string, error)

ExpandHashColonSyntax rewrites colon-style hash keys into arrow syntax.

{foo: "bar"}  →  {"foo" => "bar"}

Only bare identifiers followed by ": " are rewritten. String contents are left untouched. The arrow syntax {expr => val} is unaffected.

func ExpandHeredocs

func ExpandHeredocs(src string) (string, []int, error)

ExpandHeredocs replaces heredoc syntax with single-line string expressions. It must run before StripComments, since heredoc bodies may contain # characters.

Supported forms (DELIM is [A-Z_][A-Z0-9_]*):

x = <<DELIM          — interpolating heredoc (assignment context)
x = <<~DELIM         — interpolating, strip common indent
x = <<'DELIM'        — raw heredoc (no interpolation)
x = <<~'DELIM'       — raw, strip common indent
return <<DELIM       — heredoc in return context
return <<~'DELIM'    — raw heredoc in return context

The closing delimiter may be indented; leading whitespace is ignored when matching. Body lines between the opener and closer are collected verbatim.
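
The "strip common indent" behavior of the `<<~` forms can be sketched as follows. This is an illustration only: `stripCommonIndent` is a hypothetical helper, and it assumes that blank lines are ignored when computing the common indentation, which the documentation above does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCommonIndent removes the smallest leading-whitespace prefix shared by
// all non-blank lines. Assumption: blank lines do not count toward the minimum.
func stripCommonIndent(lines []string) []string {
	common := -1
	for _, l := range lines {
		if strings.TrimSpace(l) == "" {
			continue
		}
		n := len(l) - len(strings.TrimLeft(l, " \t"))
		if common < 0 || n < common {
			common = n
		}
	}
	out := make([]string, len(lines))
	for i, l := range lines {
		if common > 0 && len(l) >= common {
			out[i] = l[common:]
		} else {
			out[i] = l // too short to strip (for example, an empty line)
		}
	}
	return out
}

func main() {
	body := []string{"    SELECT *", "      FROM dogs", "", "    LIMIT 1"}
	for _, l := range stripCommonIndent(body) {
		fmt.Printf("%q\n", l)
	}
}
```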

func ExpandTrySugar

func ExpandTrySugar(src string) (string, []int)

ExpandTrySugar expands single-line try forms into the full block form. It returns the expanded source and a mapping from output line (0-indexed) to original input line (1-indexed).

try EXPR            → try EXPR or _ nil end
try EXPR or DEFAULT → try EXPR or _ DEFAULT end
x = try EXPR ...    → x = try EXPR ... (same, in assignment context)

Multi-line try blocks (try ... or ident ... end) are left untouched.

func FindAllTopLevel

func FindAllTopLevel(s string, pred func(ch byte, pos int, src string) bool) []int

FindAllTopLevel is like FindTopLevel but returns all matching positions.

func FindTopLevel

func FindTopLevel(s string, pred func(ch byte, pos int, src string) bool) int

FindTopLevel scans s for a byte matching pred at bracket depth 0, outside all string literals. Returns the byte offset or -1.

This covers the common pattern used by findCompoundOp, findDestructAssign, findDoAssignment, findTopLevelOr, findTopLevelPipes, etc.
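
A minimal sketch of such a scan, assuming double quotes, single quotes, and backticks all delimit strings and that a backslash escapes the next byte inside any of them. `sketchFindTopLevel` is a hypothetical name, not the package's code.

```go
package main

import "fmt"

// sketchFindTopLevel returns the offset of the first byte that satisfies pred
// at bracket depth 0 and outside string literals, or -1 if there is none.
func sketchFindTopLevel(s string, pred func(ch byte, pos int, src string) bool) int {
	depth := 0
	var quote byte // active string delimiter, 0 when outside a string
	for i := 0; i < len(s); i++ {
		ch := s[i]
		if quote != 0 {
			if ch == '\\' {
				i++ // skip the escaped byte
			} else if ch == quote {
				quote = 0
			}
			continue
		}
		if ch == '"' || ch == '\'' || ch == '`' {
			quote = ch
			continue
		}
		if depth == 0 && pred(ch, i, s) {
			return i
		}
		switch ch {
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			if depth > 0 {
				depth--
			}
		}
	}
	return -1
}

func main() {
	isComma := func(ch byte, pos int, src string) bool { return ch == ',' }
	// The comma inside f(...) is nested; the first top-level comma is at offset 7.
	fmt.Println(sketchFindTopLevel(`f(a, b), "x, y", c`, isComma)) // 7
}
```

Passing the position and the full source to the predicate lets callers match multi-byte tokens by looking ahead from `pos`.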

func HasInterpolation

func HasInterpolation(s string) bool

HasInterpolation reports whether a string contains #{} interpolation.

func InsertArraySeparators

func InsertArraySeparators(src string) string

func IsCloseBracket

func IsCloseBracket(ch byte) bool

IsCloseBracket reports whether ch is a closing bracket/paren/brace.

func IsInsideString

func IsInsideString(s string, pos int) bool

IsInsideString reports whether byte offset pos in s falls inside a string literal. This matches the original preprocessor behavior: it checks the string state just before pos (scanning bytes 0..pos-1), so opening delimiters return false and closing delimiters return true.
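
The delimiter convention is easy to get wrong, so here is a small sketch of the "state just before pos" rule. `sketchIsInsideString` is a hypothetical name, and the sketch assumes only double and single quotes open strings.

```go
package main

import "fmt"

// sketchIsInsideString scans bytes 0..pos-1 and reports whether a string
// literal is still open at that point.
func sketchIsInsideString(s string, pos int) bool {
	var quote byte // active delimiter, 0 when outside a string
	for i := 0; i < pos && i < len(s); i++ {
		ch := s[i]
		if quote != 0 {
			if ch == '\\' {
				i++ // skip the escaped byte
			} else if ch == quote {
				quote = 0
			}
			continue
		}
		if ch == '"' || ch == '\'' {
			quote = ch
		}
	}
	return quote != 0
}

func main() {
	src := `a "bc" d` // opening quote at offset 2, closing quote at offset 5
	fmt.Println(sketchIsInsideString(src, 2)) // false: opening delimiter
	fmt.Println(sketchIsInsideString(src, 3)) // true: string body
	fmt.Println(sketchIsInsideString(src, 5)) // true: closing delimiter
	fmt.Println(sketchIsInsideString(src, 6)) // false: back in code
}
```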

func IsOpenBracket

func IsOpenBracket(ch byte) bool

IsOpenBracket reports whether ch is an opening bracket/paren/brace.

func Preprocess

func Preprocess(src string, allFuncs map[string]bool) (string, []int, error)

Preprocess performs line-level transformations:

1. Parenthesis-free function calls: `puts "foo"` → `puts("foo")`
2. Shell fallback: unknown idents → `__shell__("cmd line")`

It uses positional resolution at the top level: a function name is only recognized after its `def` line has been encountered. Inside function bodies, all function names (allFuncs) are visible to allow forward references.

Returns the preprocessed source and a line map (preprocessed line, 0-indexed → original line, 1-indexed). If the returned line map is nil, the mapping is 1:1.
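
A caller that reports errors against the original source needs to translate line numbers through this map. The helper below is a hypothetical sketch; it assumes that "1:1" means preprocessed line n (0-indexed) corresponds to original line n+1.

```go
package main

import "fmt"

// originalLine translates a 0-indexed preprocessed line into a 1-indexed
// original line. It returns 0 when outLine is outside the map.
func originalLine(lineMap []int, outLine int) int {
	if lineMap == nil {
		return outLine + 1 // assumed meaning of a 1:1 mapping
	}
	if outLine < 0 || outLine >= len(lineMap) {
		return 0
	}
	return lineMap[outLine]
}

func main() {
	// Example map: original line 2 expanded into three preprocessed lines.
	lineMap := []int{1, 2, 2, 2, 3}
	fmt.Println(originalLine(lineMap, 3)) // 2
	fmt.Println(originalLine(nil, 3))     // 4
}
```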

func ProcessInterpolation

func ProcessInterpolation(s string) (format string, exprs []string, err error)

ProcessInterpolation converts "Hello #{expr}" into a format string plus arguments. It returns the format string and the list of expression strings.
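
The shape of the conversion can be sketched like this. The `%v` placeholder and the escaping of literal `%` as `%%` are assumptions made for illustration; the documentation does not say which format verbs the package emits. `sketchProcessInterpolation` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// sketchProcessInterpolation splits a string body into a format string and
// the expressions found inside #{...}. Assumption: placeholders are "%v".
func sketchProcessInterpolation(s string) (string, []string, error) {
	var format strings.Builder
	var exprs []string
	for i := 0; i < len(s); i++ {
		switch {
		case s[i] == '#' && i+1 < len(s) && s[i+1] == '{':
			// Find the matching closing brace, allowing nested braces.
			depth, j := 1, i+2
			for j < len(s) && depth > 0 {
				switch s[j] {
				case '{':
					depth++
				case '}':
					depth--
				}
				j++
			}
			if depth != 0 {
				return "", nil, errors.New("unterminated #{ in string")
			}
			exprs = append(exprs, s[i+2:j-1])
			format.WriteString("%v")
			i = j - 1 // continue after the closing brace
		case s[i] == '%':
			format.WriteString("%%") // keep literal percent signs literal
		default:
			format.WriteByte(s[i])
		}
	}
	return format.String(), exprs, nil
}

func main() {
	format, exprs, err := sketchProcessInterpolation("Hello #{name}!")
	fmt.Printf("%q %q %v\n", format, exprs, err)
}
```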

func RejectSemicolons

func RejectSemicolons(src string) error

RejectSemicolons scans source for user-written semicolons outside strings and heredocs, and returns an error if found. Semicolons are reserved for internal use by the preprocessor as statement separators.

func RejectTrailingCommas

func RejectTrailingCommas(src string) error

RejectTrailingCommas scans source for trailing commas before ']' or '}' (outside string literals) and returns an error if found.
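
One way to implement such a check is to remember the last significant byte seen in code and test it whenever a closing bracket appears. The sketch below is hypothetical (`sketchRejectTrailingCommas` is not the package's function) and its error message format is invented for illustration.

```go
package main

import "fmt"

// sketchRejectTrailingCommas returns an error when a ',' is the last
// non-whitespace code byte before ']' or '}'.
func sketchRejectTrailingCommas(src string) error {
	var quote byte // active string delimiter, 0 when in code
	var last byte  // last non-whitespace byte seen outside strings
	line := 1
	for i := 0; i < len(src); i++ {
		ch := src[i]
		if ch == '\n' {
			line++
		}
		if quote != 0 {
			if ch == '\\' {
				i++ // skip the escaped byte
			} else if ch == quote {
				quote = 0
			}
			continue
		}
		switch ch {
		case '"', '\'':
			quote = ch
		case ' ', '\t', '\n', '\r':
			continue // whitespace never becomes the "last" byte
		case ']', '}':
			if last == ',' {
				return fmt.Errorf("line %d: trailing comma before %q", line, ch)
			}
		}
		last = ch
	}
	return nil
}

func main() {
	fmt.Println(sketchRejectTrailingCommas("[1, 2, 3]"))
	fmt.Println(sketchRejectTrailingCommas("[1, 2,\n]"))
}
```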

func ScanFuncDefs

func ScanFuncDefs(src string) map[string]bool

ScanFuncDefs does a quick scan to find all `def name(` patterns so the preprocessor knows which identifiers are user functions.

func StripComments

func StripComments(src string) (string, error)

StripComments removes # comments from source, respecting string and backtick boundaries. It returns an error if an unterminated string literal is found.

Types

type StringTracker

type StringTracker struct {
	// contains filtered or unexported fields
}

StringTracker iterates byte-by-byte over source text, tracking string literal boundaries (double-quoted, single-quoted, backtick) and escape sequences. Callers check InString() instead of maintaining their own inDouble/inSingle/escaped flags.

InString() returns true for the entire string span including both opening and closing delimiters, matching the preprocessor's convention of skipping all bytes that are part of string literals.

func NewStringTracker

func NewStringTracker(src string) *StringTracker

NewStringTracker creates a StringTracker for the given source text. Call Next() to advance to the first byte.

func (*StringTracker) InBacktick

func (s *StringTracker) InBacktick() bool

InBacktick reports whether the current position is inside a backtick expression.

func (*StringTracker) InCode

func (s *StringTracker) InCode() bool

InCode reports whether the current position is outside all string literals.

func (*StringTracker) InDoubleString

func (s *StringTracker) InDoubleString() bool

InDoubleString reports whether the current position is inside a double-quoted string literal.

func (*StringTracker) InSingleString

func (s *StringTracker) InSingleString() bool

InSingleString reports whether the current position is inside a single-quoted string literal.

func (*StringTracker) InString

func (s *StringTracker) InString() bool

InString reports whether the current position is inside a string literal (double-quoted, single-quoted, or backtick), including both opening and closing delimiters.

func (*StringTracker) Line

func (s *StringTracker) Line() int

Line returns the current 1-based line number.

func (*StringTracker) LookingAt

func (s *StringTracker) LookingAt(prefix string) bool

LookingAt checks if src[pos:] starts with the given prefix. Useful for multi-character token detection (e.g., "||", "or", "fn(").

func (*StringTracker) Next

func (s *StringTracker) Next() (byte, bool)

Next advances to the next byte, updating string/escape state. Returns the byte and true, or (0, false) at end of input.

func (*StringTracker) Peek

func (s *StringTracker) Peek() (byte, bool)

Peek returns the next byte without advancing, or (0, false) at end.

func (*StringTracker) Pos

func (s *StringTracker) Pos() int

Pos returns the current byte offset (the position of the last byte returned by Next). Returns -1 before the first call to Next.

func (*StringTracker) Skip

func (s *StringTracker) Skip(n int) int

Skip advances past n bytes without returning them. String/escape state is updated for each skipped byte. Returns the number of bytes actually skipped (may be less than n at end of input).

func (*StringTracker) Src

func (s *StringTracker) Src() string

Src returns the full source text being scanned.

type StructInfo

type StructInfo struct {
	Name   string   // struct name (e.g. "Dog")
	Fields []string // field names
	Line   int      // 1-based line number of the struct keyword in original source
}

StructInfo holds metadata about a struct definition extracted during preprocessing. Structs are expanded into constructor functions before parsing, so they don't appear in the AST as nodes.

func ExpandStructDefs

func ExpandStructDefs(src string) (string, []int, []StructInfo)

ExpandStructDefs rewrites struct definitions and method definitions.

struct Dog
  name
  breed
end

becomes:

def Dog(name, breed)
  return {"__type__" => "Dog", "name" => name, "breed" => breed}
end
def new(name, breed)
  return Dog(name, breed)
end

And:

def Dog.bark()

becomes:

def bark(self)
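
The constructor text shown above can be generated from a struct's name and field list. The sketch below covers only the `def Dog(...)` constructor; `constructorFor` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// constructorFor renders a constructor function for a struct with the given
// name and fields, following the expansion shown in the documentation.
func constructorFor(name string, fields []string) string {
	pairs := []string{fmt.Sprintf("%q => %q", "__type__", name)}
	for _, f := range fields {
		// Each field becomes a "field" => field entry in the hash.
		pairs = append(pairs, fmt.Sprintf("%q => %s", f, f))
	}
	return fmt.Sprintf("def %s(%s)\n  return {%s}\nend",
		name, strings.Join(fields, ", "), strings.Join(pairs, ", "))
}

func main() {
	fmt.Println(constructorFor("Dog", []string{"name", "breed"}))
}
```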
