goparser

package
v0.3.0
Published: May 22, 2026 License: BSD-3-Clause Imports: 19 Imported by: 0

README

Parser

An experiment: build an almost single-pass compiler for Go using syntax-directed translation, with no complex data structures (no syntax tree), only lists of tokens.

The goal is to have the shortest and simplest path from source to bytecode.

Design

The input of the parser is a list of tokens produced by the scanner. Multiple tokens are processed at once. The minimal input that yields a meaningful result (neither an error nor nil) is a complete statement.

The output of the parser is also a list of tokens, to be consumed by the compiler to produce bytecode. The set of output tokens is identical to the set of bytecode instructions, except that:

  • code locations may be provided as labels instead of numerical values,
  • memory locations for constants and variables may be provided as symbol names instead of numerical values.

Status

Go language support:

  • named functions
  • anonymous functions (closures)
  • methods
  • internal function calls
  • external function calls (calling runtime symbols in interpreter)
  • export to runtime
  • builtin calls (new, make, copy, delete, len, cap, ...)
  • out of order declarations
  • arbitrary precision constants
  • basic types
  • complete numeric types
  • complex numbers
  • function types
  • variadic functions
  • pointers
  • structures
  • embedded structures
  • recursive structures
  • literal composite objects
  • interfaces
  • arrays, slices
  • maps
  • deterministic maps
  • channel types
  • channel operations
  • multi-assign expressions
  • var defined by assign :=
  • var assign =
  • var declaration
  • type declaration
  • func declaration
  • const declaration
  • iota expression
  • panic statement
  • defer statement
  • recover statement
  • range clause
  • go statement
  • if statement (including else and else if)
  • for statement
  • switch statement
  • type switch statement
  • break statement
  • continue statement
  • fallthrough statement
  • goto statement
  • label statement
  • select statement
  • binary operators
  • unary operators
  • logical operators && and ||
  • assign operators
  • operator precedence rules
  • parenthesis expressions
  • call expressions
  • index expressions
  • selector expressions
  • slice expressions
  • type conversions
  • type assertions
  • parametric types (generic)
  • type parametric functions (generic)
  • type constraints (generic)
  • type checking
  • comment pragmas
  • package import
  • init functions
  • modules

Other items:

  • REPL
  • multiline statements in REPL
  • completion, history in REPL
  • eval strings
  • eval files (including stdin, ...)
  • debug traces for scanner, parser, compiler, bytecode vm
  • simple interpreter tests to exercise from source to execution
  • compile time error detection and diagnosis
  • stack dump
  • symbol tables, data tables, code bound to source lines
  • interactive debugger: breaks, continue, introspection, ...
  • machine level debugger
  • source level debugger
  • replay debugger, backward instruction execution
  • vm monitor: live memory / code display during run
  • stdlib wrappers a la yaegi
  • system and environment sandboxing
  • build constraints (arch, sys, etc)
  • test command (running go test / benchmark / example files)
  • skipping / including test files
  • test coverage
  • fuzz tests for scanner, vm, ...

Documentation

Overview

Package goparser implements a structured parser for Go.

Index

Constants

This section is empty.

Variables

var (
	ErrEllipsisArray  = errors.New("[...] array")
	ErrFuncType       = errors.New("invalid function type")
	ErrInvalidType    = errors.New("invalid type")
	ErrMissingType    = errors.New("missing type")
	ErrSize           = errors.New("invalid size")
	ErrSyntax         = errors.New("syntax error")
	ErrNotImplemented = errors.New("not implemented")
)

Type parsing error definitions.

Functions

func IsExported

func IsExported(name string) bool

IsExported reports whether the given name starts with an upper-case letter.
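The check described above can be sketched in a few lines. This is a standalone approximation written for illustration (the lowercase `isExported` is a hypothetical stand-in, not the package's source); it decodes the first rune so that non-ASCII upper-case letters are handled too.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isExported reports whether name starts with an upper-case letter.
// Hypothetical stand-in for goparser.IsExported.
func isExported(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

func main() {
	for _, n := range []string{"Parser", "symGet", "Élan", ""} {
		fmt.Printf("%q exported: %v\n", n, isExported(n))
	}
}
```

An empty name decodes to utf8.RuneError, which is not upper-case, so it is reported as unexported.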

func MatchFileName

func MatchFileName(name string, ctx *buildContext) bool

MatchFileName reports whether name (a .go file basename) matches the given build context's GOOS/GOARCH constraints encoded in the file name.

func MatchFileNameFor

func MatchFileNameFor(name, goos, goarch string) bool

MatchFileNameFor reports whether name matches the given GOOS/GOARCH constraints encoded in the file name. It is like MatchFileName but for an explicit platform.
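For readers unfamiliar with file-name build constraints, the sketch below approximates the `name_GOOS.go`, `name_GOARCH.go` and `name_GOOS_GOARCH.go` convention used by the Go toolchain. It is a simplified illustration: the `knownOS` and `knownArch` sets are deliberately tiny and hypothetical, and the package's real matching may differ in details.

```go
package main

import (
	"fmt"
	"strings"
)

// Deliberately small, hypothetical sets; the Go toolchain knows many more.
var (
	knownOS   = map[string]bool{"linux": true, "darwin": true, "windows": true}
	knownArch = map[string]bool{"amd64": true, "arm64": true, "386": true}
)

// matchFileNameFor reports whether a .go file basename is selected for
// the given platform according to its _GOOS / _GOARCH suffixes.
func matchFileNameFor(name, goos, goarch string) bool {
	name = strings.TrimSuffix(name, ".go")
	name = strings.TrimSuffix(name, "_test")
	// Only what follows the first underscore can carry a constraint,
	// so a file named "linux.go" is unconstrained.
	i := strings.Index(name, "_")
	if i < 0 {
		return true
	}
	parts := strings.Split(name[i+1:], "_")
	n := len(parts)
	if n >= 2 && knownOS[parts[n-2]] && knownArch[parts[n-1]] {
		return parts[n-2] == goos && parts[n-1] == goarch
	}
	if knownOS[parts[n-1]] {
		return parts[n-1] == goos
	}
	if knownArch[parts[n-1]] {
		return parts[n-1] == goarch
	}
	return true
}

func main() {
	for _, f := range []string{"file.go", "file_linux.go", "file_windows_amd64.go"} {
		fmt.Println(f, matchFileNameFor(f, "linux", "amd64"))
	}
}
```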

func PackageName

func PackageName(importPath string) string

PackageName returns the identifier used to reference the package given its import path: the last segment, or the second-to-last when the last matches "v[0-9]*" (module versioning suffix).
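The rule can be sketched as follows. One assumption is made explicit: the "v[0-9]*" pattern is read here as the letter v followed by at least one digit; the package may interpret it differently. `packageName` and `versionSuffix` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// versionSuffix matches a module major-version path element such as "v2".
// Assumption: at least one digit is required after the "v".
var versionSuffix = regexp.MustCompile(`^v[0-9]+$`)

// packageName returns the last path segment, or the second-to-last one
// when the last segment looks like a version suffix.
func packageName(importPath string) string {
	parts := strings.Split(importPath, "/")
	last := parts[len(parts)-1]
	if len(parts) > 1 && versionSuffix.MatchString(last) {
		return parts[len(parts)-2]
	}
	return last
}

func main() {
	fmt.Println(packageName("net/http"))              // http
	fmt.Println(packageName("github.com/foo/bar/v2")) // bar
}
```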

func QualifyName added in v0.2.0

func QualifyName(pkg, name string) string

QualifyName composes the canonical pkg-qualified symbol-table key for a top-level name. For pointer-receiver method names ("*Tag.M"), the '*' moves to the very front of the key ("*<pkg>.Tag.M") so the standard pointer-method composition `"*"+typeKey+"."+method` still produces the same key. Exported so the comp package (qualifyLabel) shares the exact composition.
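A minimal sketch of that composition, written from the description only. The handling of an empty pkg (returning name unchanged) is an assumption, and `qualifyName` is a hypothetical stand-in for the exported function.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifyName builds "<pkg>.<name>", moving a leading '*' (pointer
// receiver) to the very front of the key.
func qualifyName(pkg, name string) string {
	if pkg == "" {
		return name // assumption: main package / REPL names stay bare
	}
	if strings.HasPrefix(name, "*") {
		return "*" + pkg + "." + name[1:]
	}
	return pkg + "." + name
}

func main() {
	pkg := "example.com/tags" // hypothetical import path
	typeKey := qualifyName(pkg, "Tag")
	fmt.Println(qualifyName(pkg, "*Tag.M"))
	// The pointer-method composition yields the same key.
	fmt.Println("*" + typeKey + "." + "M")
}
```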

func RegisterGenericShim added in v0.2.0

func RegisterGenericShim(pkg, source string, nativeRefs []string)

RegisterGenericShim queues an interpreted-source shim to install into pkg's symbol table the next time a Parser populates that package (typically from ImportPackageValues). source must start with `package <pkg>`. nativeRefs lists names referenced bare by source from pkg's exports: those are pre-seeded into p.Symbols at qualified keys so symGet's importingPkg / CompilingPkg fallback resolves them at signature parse and body instantiation time.

Safe to call from package init; the registry persists for the process lifetime and is consulted by every parser created thereafter.

Types

type DeferredDecl added in v0.2.0

type DeferredDecl struct {
	PkgPath string
	Toks    Tokens
}

DeferredDecl is a top-level declaration (func body, var initializer) whose code generation is deferred to Phase 2, tagged with the import path of the package it came from ("" for the main package / REPL). The tag lets Phase 2 resolve unqualified identifiers against the originating package's symbols, which matters when a sibling import shadowed a bare name in the symbol table.

type ErrUndefined

type ErrUndefined struct {
	Name string
	Loc  string // optional "file:line:col" source position
	Pos  int    // optional global source offset, for snippet rendering
}

ErrUndefined is returned during parsing when a referenced symbol is not yet defined. It is retryable: the lazy fixpoint loop in interp.Eval defers the declaration and retries after other declarations have been processed.

func (ErrUndefined) ErrPos added in v0.3.0

func (e ErrUndefined) ErrPos() int

ErrPos exposes the source offset so a diagnostic chokepoint (interp.Eval) can render a source snippet. Returns 0 when no position was attached.

func (ErrUndefined) Error

func (e ErrUndefined) Error() string

type PackageSource added in v0.2.0

type PackageSource struct {
	Name    string // basename (e.g. "uuid.go")
	Content string
}

PackageSource is a single .go file's basename and content as loaded by LoadPackageSources.

type Parser

type Parser struct {
	*scan.Scanner

	Symbols  symbol.SymMap
	Packages map[string]*symbol.Package

	CompilingPkg string // while a deferred decl is being parsed/compiled in Phase 2: its origin package's import path ("" = main/REPL); makes unqualified type/name lookups prefer that package's symbols (see symGet, comp.Compiler.symAt)

	InitFuncs []string // ordered list of init function internal names
	// contains filtered or unexported fields
}

Parser represents the state of a parser.

func NewParser

func NewParser(spec *lang.Spec, noPkg bool) *Parser

NewParser returns a new parser.

func (*Parser) ImportPackageValues

func (p *Parser) ImportPackageValues(m map[string]map[string]reflect.Value)

ImportPackageValues populates packages with values.

func (*Parser) LoadPackageSources added in v0.2.0

func (p *Parser) LoadPackageSources(importPath string, includeTests bool) ([]PackageSource, error)

LoadPackageSources returns the .go files of the given package import path (a directory in the FS chain pkgfs -> stdlibfs -> remotefs), filtered by build tags (file-name and //go:build directives). When includeTests is false, _test.go files are skipped (matching `import "X"` resolution); pass true to include them (used by `mvm test <importpath>`).

Result order matches fs.ReadDir, which is sorted by filename.

func (*Parser) ParseAll

func (p *Parser) ParseAll(name, src string) (out []DeferredDecl, err error)

ParseAll parses code and its dependencies, and returns the still-to-be-code-generated declarations (func bodies, var initializers), each tagged with its originating package path, or an error. When src == "" the source is loaded from the package directory `name`, so its decls are tagged with `name`; otherwise (main package / REPL) they are tagged with "".

func (*Parser) ParseDecl

func (p *Parser) ParseDecl(toks Tokens) (handled bool, err error)

ParseDecl resolves a declaration's symbols (Phase 1) without emitting code. Returns handled=true if fully resolved, false if code generation is needed.

func (*Parser) ParseOneStmt

func (p *Parser) ParseOneStmt(toks Tokens) (Tokens, error)

ParseOneStmt parses a single pre-scanned statement token slice.

func (*Parser) SetBuildContext

func (p *Parser) SetBuildContext(goos, goarch string)

SetBuildContext overrides the parser's target GOOS/GOARCH for build constraint filtering.

func (*Parser) SetIncludeTests added in v0.2.0

func (p *Parser) SetIncludeTests(b bool)

SetIncludeTests toggles whether ParseAll's directory-mode load includes _test.go files. Off by default (matching `import "X"` resolution); turn on for `mvm test <importpath>` so test functions become callable.

func (*Parser) SetPkgfs

func (p *Parser) SetPkgfs(pkgPath string)

SetPkgfs sets the parser's virtual filesystem for reading sources.

func (*Parser) SetRemoteFS added in v0.2.0

func (p *Parser) SetRemoteFS(fsys fs.FS)

SetRemoteFS installs a last-resort filesystem consulted when neither pkgfs nor stdlibfs contain the requested import path. Typical use is a modfs.FS that fetches modules from a proxy on demand.

func (*Parser) SetStdlibFS

func (p *Parser) SetStdlibFS(fsys fs.FS)

SetStdlibFS installs a fallback filesystem for resolving imported source packages that are not present in the primary pkgfs. This is used to resolve generics-first stdlib packages (cmp, slices, maps, ...) whose sources are embedded in the interpreter binary.

func (*Parser) SetTestSkipFiles added in v0.3.0

func (p *Parser) SetTestSkipFiles(names map[string]bool)

SetTestSkipFiles records basenames of bridged-stdlib test files that loadBridgedTestSources must skip. Used by `mvm test`'s drop-on-compile-error retry: a stdlib external test file that references export_test.go-only symbols (e.g. a method the real native type lacks) can't compile against the bridge, so the driver drops it and reloads the rest. nil or empty means skip nothing.

func (*Parser) SetTestSourceFS added in v0.3.0

func (p *Parser) SetTestSourceFS(fsys fs.FS)

SetTestSourceFS installs the test-source filesystem consulted by LoadPackageSources only when (a) includeTests is on and (b) the target import path is a bridge-only stdlib package (i.e. has a Bin entry in p.Packages but no source in pkgfs/stdlibfs/remotefs). The intended supplier is stdlib.GorootTestFS(), which serves $GOROOT/src so external `package X_test` files run against the existing reflect bindings.

This FS is deliberately separate from the pkgfs -> stdlibfs -> remotefs chain: feeding $GOROOT/src into that chain would make ordinary `import "strings"` start loading interpreted source alongside the reflect bridge, double-defining every exported symbol.

func (*Parser) SymAdd

func (p *Parser) SymAdd(i int, name string, v vm.Value, k symbol.Kind, t *vm.Type)

SymAdd adds a new named symbol, recording the key for potential rollback.

func (*Parser) SymSet

func (p *Parser) SymSet(key string, sym *symbol.Symbol)

SymSet inserts sym at key in the symbol table, recording the key for potential rollback.

func (*Parser) WithImportingPkg added in v0.2.0

func (p *Parser) WithImportingPkg(pkg string) func()

WithImportingPkg sets p.importingPkg to pkg and returns a function that restores the previous value. Callers loading a package's source directly (e.g. `mvm test <importpath>`) use this to mirror the canonical-key setup that importSrc performs for transitive imports, so the target's top-level Type/Func/Method/Var/Const symbols land at `<pkg>.<name>` keys rather than bare keys (which would mismatch every subsequent qualified lookup in the target's own deferred bodies). See pkgKey, symGet, and the Phase 2 Path B memory notes.
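The set-and-restore shape of this method pairs naturally with defer. The sketch below reproduces only that pattern on a hypothetical `parser` type with a single field; it says nothing about how symbol keys are actually built.

```go
package main

import "fmt"

// parser is a hypothetical stand-in holding only the relevant field.
type parser struct{ importingPkg string }

// withImportingPkg sets importingPkg and returns a function that
// restores the previous value.
func (p *parser) withImportingPkg(pkg string) func() {
	prev := p.importingPkg
	p.importingPkg = pkg
	return func() { p.importingPkg = prev }
}

// load shows the intended call shape: set on entry, restore on return.
func load(p *parser, pkg string) string {
	defer p.withImportingPkg(pkg)()
	return p.importingPkg // value seen while loading
}

func main() {
	p := &parser{}
	fmt.Printf("%q\n", load(p, "example.com/uuid")) // "example.com/uuid"
	fmt.Printf("%q\n", p.importingPkg)              // ""
}
```

Note the trailing `()` in the defer statement: the setter runs immediately, and only the returned restore function is deferred.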

type SelectCaseDesc

type SelectCaseDesc struct {
	Dir     reflect.SelectDir
	ValName string // scoped name of recv value var ("" if none)
	OkName  string // scoped name of recv ok var ("" if none)
}

SelectCaseDesc describes one case of a select statement for the compiler.

type Token

type Token struct {
	scan.Token
	Arg []any
}

Token represents a parser token.

func (Token) FieldKeyName added in v0.2.0

func (t Token) FieldKeyName() (string, bool)

FieldKeyName returns the field name of a struct-composite-literal key Colon token (emitted by newFieldColon), and false otherwise.

func (*Token) MarkNoFnew added in v0.2.0

func (t *Token) MarkNoFnew()

MarkNoFnew tags this Token (intended for Type Idents) so the compiler will not emit a speculative Fnew for it.

func (Token) NoFnew added in v0.2.0

func (t Token) NoFnew() bool

NoFnew reports whether this Token was tagged via MarkNoFnew.

type Tokens

type Tokens []Token

Tokens represents a slice of tokens.

func (Tokens) Index

func (toks Tokens) Index(tok lang.Token) int

Index returns the index in toks of the first matching tok, or -1.

func (Tokens) LastIndex

func (toks Tokens) LastIndex(tok lang.Token) int

LastIndex returns the index in toks of the last matching tok, or -1.

func (Tokens) Split

func (toks Tokens) Split(tok lang.Token) (result []Tokens)

Split returns a slice of token slices, separated by tok.

func (Tokens) SplitStart

func (toks Tokens) SplitStart(tok lang.Token) (result []Tokens)

SplitStart is similar to Split, except the first token in toks is skipped.

func (Tokens) String

func (toks Tokens) String() (s string)
