bytecode

package

v0.14.2 (latest) Published: Apr 26, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Repository

github.com/sunholo-data/ailang

Documentation ¶

Overview ¶

Package bytecode defines the AILANG bytecode instruction set, image format, and Value type used by the register VM (internal/vm).

This package is the compilation target for Statement IR (Phase 2C) and the input to the VM dispatch loop (Phase 2B). Per the M-BYTECODE-VM design doc (§11), this package must NOT import internal/vm. The import direction is always vm → bytecode.

Instructions are 32-bit words with two layouts:

┌──────────┬──────────┬──────────┬──────────┐
│  OpCode  │    A     │    B     │    C     │   ABC form
│  8 bits  │  8 bits  │  8 bits  │  8 bits  │
└──────────┴──────────┴──────────┴──────────┘

┌──────────┬──────────┬─────────────────────┐
│  OpCode  │    A     │       Bx (16 bits)  │   ABx form
│  8 bits  │  8 bits  │                     │
└──────────┴──────────┴─────────────────────┘

Bx is interpreted as unsigned (constant pool indices, prototype indices). SBx is the same field interpreted as a signed offset (jump targets), with bias 0x8000 — i.e. SBx = int(Bx) - 0x8000, range [-32768, 32767].
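The two layouts and the SBx bias can be illustrated with a minimal standalone sketch. The names `word`, `packABC`, `packABx`, and `packASBx` are hypothetical and are not this package's API. The placement of the opcode in the most significant byte is an assumption read off the left-to-right diagram; the source does not state the bit order.

```go
package main

import "fmt"

// word is a hypothetical stand-in for the package's Instruction type.
type word uint32

// sbxBias is the bias stated in the overview: SBx = int(Bx) - 0x8000.
const sbxBias = 0x8000

// packABC places op in the top byte, then A, B, C.
// Assumption: this bit order mirrors the diagram; it is not confirmed.
func packABC(op, a, b, c uint8) word {
	return word(op)<<24 | word(a)<<16 | word(b)<<8 | word(c)
}

// packABx reuses the B and C bytes as a single 16-bit field.
func packABx(op, a uint8, bx uint16) word {
	return word(op)<<24 | word(a)<<16 | word(bx)
}

// packASBx stores a signed offset by adding the bias before packing.
func packASBx(op, a uint8, sbx int) (word, error) {
	if sbx < -sbxBias || sbx > sbxBias-1 {
		return 0, fmt.Errorf("sbx %d out of range [-32768, 32767]", sbx)
	}
	return packABx(op, a, uint16(sbx+sbxBias)), nil
}

// bx reads the low 16 bits as an unsigned index.
func (w word) bx() uint16 { return uint16(w & 0xFFFF) }

// sbx reads the same 16 bits and removes the bias.
func (w word) sbx() int { return int(w.bx()) - sbxBias }

func main() {
	w, err := packASBx(0x10, 2, -3)
	if err != nil {
		panic(err)
	}
	fmt.Printf("word=%#x bx=%#x sbx=%d\n", uint32(w), w.bx(), w.sbx())
}
```

With this scheme a stored Bx of 0x8000 decodes to an offset of zero, so forward and backward jumps share one unsigned field.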

Index ¶

func Disassemble(img *BytecodeImage) string
func DisassembleFunc(p *FuncPrototype, img *BytecodeImage) string
func OpCount() int
type ADTObj
type BytecodeImage
- func NewImage() *BytecodeImage
- func (img *BytecodeImage) AddConstant(v Value) int
- func (img *BytecodeImage) AddPrototype(p *FuncPrototype) int
- func (img *BytecodeImage) SetEntryPoint(idx int) error
- func (img *BytecodeImage) Validate() error
- func (img *BytecodeImage) ValidatePrototype(protoIdx int) error
type ClosureObj
type FuncPrototype
- func (p *FuncPrototype) LookupConstant(localIdx int, img *BytecodeImage) (Value, bool)
- func (p *FuncPrototype) NumParameters() uint8
- func (p *FuncPrototype) NumRegisters() uint8
- func (p *FuncPrototype) ProtoName() string
type FuncPrototypeRef
type Instruction
- func EncodeABC(op OpCode, a, b, c uint8) Instruction
- func EncodeABx(op OpCode, a uint8, bx uint16) Instruction
- func EncodeASBx(op OpCode, a uint8, sbx int) Instruction
- func (i Instruction) A() uint8
- func (i Instruction) B() uint8
- func (i Instruction) Bx() uint16
- func (i Instruction) C() uint8
- func (i Instruction) Op() OpCode
- func (i Instruction) SBx() int
- func (i Instruction) String() string
type ListObj
type OpCode
- func (op OpCode) String() string
type RecordField
type RecordObj
type StringObj
type TupleObj
type Value
- func NewADT(tag int, fields []Value) Value
- func NewBool(b bool) Value
- func NewClosure(proto FuncPrototypeRef, captures []Value) Value
- func NewFloat(f float64) Value
- func NewInt(n int64) Value
- func NewList(elems []Value) Value
- func NewRecord(fields []RecordField) Value
- func NewString(s string) Value
- func NewTuple(elems []Value) Value
- func Unit() Value
- func (v Value) AsADT() *ADTObj
- func (v Value) AsClosure() *ClosureObj
- func (v Value) AsList() []Value
- func (v Value) AsRecord() []RecordField
- func (v Value) AsString() string
- func (v Value) AsTuple() []Value
- func (v Value) Equal(other Value) bool
- func (v Value) String() string
type ValueTag
- func (t ValueTag) String() string

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Disassemble ¶

func Disassemble(img *BytecodeImage) string

Disassemble produces a human-readable listing of an entire BytecodeImage: header (entry point, totals), constant pool, globals, and one block per prototype with annotated instructions.

The output is intentionally text-only and stable enough for golden tests. It is the primary debugging surface for Phase 2D — `ailang disasm` calls directly into this function.

func DisassembleFunc ¶

func DisassembleFunc(p *FuncPrototype, img *BytecodeImage) string

DisassembleFunc disassembles a single FuncPrototype against an image (for constant resolution). Used by tests that don't want a full image dump.

func OpCount ¶

func OpCount() int

OpCount returns the number of defined opcodes (excluding the sentinel). Used by tests and disassemblers.

Types ¶

type ADTObj ¶

type ADTObj struct {
	Tag    int
	Fields []Value
}

ADTObj is a tagged algebraic data type instance. Tag is a per-type local ordinal assigned during type elaboration; it is NOT globally unique. Type disambiguation is the compiler's responsibility (§4.3).

type BytecodeImage ¶

type BytecodeImage struct {
	// Constants is the deduplicated constant pool. Constants are added via
	// AddConstant which dedupes by structural equality.
	Constants []Value

	// Prototypes is the table of compiled functions. The order is the
	// canonical numeric prototype identity used by CLOSURE indices.
	Prototypes []*FuncPrototype

	// EntryPoint is the index in Prototypes of the function the VM should
	// invoke when running the image. -1 means "no entry point" (e.g. a
	// library image with only callable functions).
	EntryPoint int

	// Globals is a flat array of mutable global slots. LOAD_GLOBAL Bx indexes
	// into this slice. Phase 2B uses this only for hand-assembled tests; the
	// compiler in Phase 2C will populate it from top-level let bindings.
	Globals []Value
}

BytecodeImage is the in-memory bytecode unit. It bundles a constant pool, a prototype table, and an entry point. Phase 2B is in-memory only — there is no on-disk format yet (Phase 2D will add serialization if needed).

func NewImage ¶

func NewImage() *BytecodeImage

NewImage returns an empty bytecode image with no entry point.

func (*BytecodeImage) AddConstant ¶

func (img *BytecodeImage) AddConstant(v Value) int

AddConstant adds a value to the constant pool, deduplicating by structural equality (Value.Equal). Returns the pool index. The dedup is intentionally linear — Phase 2B images are tiny. If profiling later shows this matters, we can add a hash table keyed by a canonical representation.
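A minimal sketch of the linear-scan deduplication described above. The `pool` type is hypothetical and holds plain strings compared with `==`; the real pool holds Value and compares with Value.Equal.

```go
package main

import "fmt"

// pool is a hypothetical, simplified constant pool.
type pool struct {
	consts []string
}

// add returns the index of v, appending only when no equal entry exists.
// The scan is linear, which is acceptable for small pools.
func (p *pool) add(v string) int {
	for i, c := range p.consts {
		if c == v {
			return i
		}
	}
	p.consts = append(p.consts, v)
	return len(p.consts) - 1
}

func main() {
	var p pool
	a := p.add("hello")
	b := p.add("world")
	c := p.add("hello") // same index as a; the pool does not grow
	fmt.Println(a, b, c, len(p.consts))
}
```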

func (*BytecodeImage) AddPrototype ¶

func (img *BytecodeImage) AddPrototype(p *FuncPrototype) int

AddPrototype appends a prototype and returns its index. The returned index is what CLOSURE instructions reference (via the parent prototype's NestedProtos table) and what EntryPoint is set to.

func (*BytecodeImage) SetEntryPoint ¶

func (img *BytecodeImage) SetEntryPoint(idx int) error

SetEntryPoint marks a prototype index as the image's entry point. Returns an error if the index is out of range.

func (*BytecodeImage) Validate ¶

func (img *BytecodeImage) Validate() error

Validate performs structural sanity checks on an image:

- every instruction's prototype/constant references are in range
- jump targets land within the same prototype
- LineInfo (if non-nil) has the right length

This is intended to be called by tests and by the VM before execution as a defensive check. It does not verify type correctness or semantic well-formedness; that is the compiler's job.
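The three checks can be sketched on a reduced model. The `instr` and `proto` types below are hypothetical, and the sketch assumes jump offsets are relative to the next instruction, which the source does not specify.

```go
package main

import (
	"errors"
	"fmt"
)

// instr is a hypothetical pre-decoded instruction: either a jump with a
// relative offset or a constant load with a local constant index.
type instr struct {
	isJump bool
	offset int // relative jump offset (used when isJump is true)
	konst  int // local constant index (used otherwise)
}

// proto is a hypothetical, reduced function prototype.
type proto struct {
	code      []instr
	constants []int
	lineInfo  []int
}

// validate performs structural checks only; it says nothing about types.
func validate(p proto) error {
	if p.lineInfo != nil && len(p.lineInfo) != len(p.code) {
		return errors.New("lineInfo length does not match instruction count")
	}
	for pc, in := range p.code {
		if in.isJump {
			// Assumption: offsets are relative to the next instruction.
			target := pc + 1 + in.offset
			if target < 0 || target >= len(p.code) {
				return fmt.Errorf("pc %d: jump target %d out of range", pc, target)
			}
			continue
		}
		if in.konst < 0 || in.konst >= len(p.constants) {
			return fmt.Errorf("pc %d: constant index %d out of range", pc, in.konst)
		}
	}
	return nil
}

func main() {
	ok := proto{code: []instr{{konst: 0}, {isJump: true, offset: -2}}, constants: []int{7}}
	bad := proto{code: []instr{{isJump: true, offset: 5}}}
	fmt.Println(validate(ok))
	fmt.Println(validate(bad))
}
```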

func (*BytecodeImage) ValidatePrototype ¶

func (img *BytecodeImage) ValidatePrototype(protoIdx int) error

ValidatePrototype runs the structural checks from Validate against a single prototype. The compiler uses this to validate per-function after lowering, so a single buggy proto can be rolled back and tagged EvalOnly rather than aborting the whole image (see compiler.go Phase 2).

type ClosureObj ¶

type ClosureObj struct {
	Proto    FuncPrototypeRef
	Captures []Value
}

ClosureObj is a flat closure: the prototype plus a fixed array of captured values. Captures are copied at closure creation time (§3.3 — flat closures).
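To illustrate what copying captures at creation time implies, here is a small sketch with hypothetical names (`closure`, `newClosure`) and integer registers in place of Value. Later writes to the enclosing frame do not reach the closure.

```go
package main

import "fmt"

// closure is a hypothetical flat closure holding only its captured values.
type closure struct {
	captures []int
}

// newClosure copies the selected registers into a fresh slice, so the
// closure does not alias the frame it was created in.
func newClosure(regs []int, idxs ...int) closure {
	caps := make([]int, len(idxs))
	for i, r := range idxs {
		caps[i] = regs[r]
	}
	return closure{captures: caps}
}

func main() {
	regs := []int{10, 20, 30}
	c := newClosure(regs, 0, 2)
	regs[0] = 99 // later writes to the frame do not reach the closure
	fmt.Println(c.captures)
}
```

Note that with real Values the copy is of the Value struct; heap objects referenced through Obj would still be shared.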

type FuncPrototype ¶

type FuncPrototype struct {
	// Name is a human-readable identifier for stack traces and disassembly.
	Name string

	// NumRegs is the size of the register frame allocated when this function
	// is called. The compiler (Phase 2C) computes this from VarDecl count
	// plus a few scratch slots.
	NumRegs uint8

	// NumParams is the number of parameters. They occupy registers
	// [0, NumParams). The remaining registers are locals.
	NumParams uint8

	// IsVariadic indicates the last parameter consumes a list of remaining
	// arguments. Phase 2B does not require this; included so the type doesn't
	// need a breaking change in 2C.
	IsVariadic bool

	// NumCaptures is the number of values this function captures from its
	// enclosing scope when it is instantiated as a closure. Used by the VM
	// to know how many pseudo-MOVE instructions to consume after CLOSURE.
	// Top-level functions have NumCaptures=0.
	NumCaptures uint8

	// Instructions is the bytecode body.
	Instructions []Instruction

	// Constants holds indices into the parent BytecodeImage's constant pool.
	// Storing indices (not values) keeps the prototype small and the
	// deduplication centralized.
	//
	// LOAD_CONST's Bx field is an index into THIS slice (the prototype's
	// local constant table), not directly into the image pool. The
	// indirection lets a prototype use only the constants it actually
	// references.
	Constants []int

	// NestedProtos holds indices into the parent BytecodeImage's prototype
	// table. CLOSURE's Bx field indexes into this slice.
	NestedProtos []int

	// LineInfo maps instruction index → source line number. Used to attach
	// source locations to runtime errors. Length should equal len(Instructions);
	// a zero entry means "no source line available".
	LineInfo []int

	// File is the source file the function was compiled from. Used together
	// with LineInfo to format runtime errors as `<file>:<line>`. Empty when
	// the file is unknown (e.g. hand-built test prototypes).
	File string

	// EvalOnly marks this prototype as a stub: the function exists in the
	// program but the bytecode compiler couldn't compile it (or chose not to)
	// and the VM must trap to the evaluator via VM.Interop on every call.
	//
	// When true, Instructions and LineInfo are nil and NumRegs is zero. The
	// VM checks this flag inside OpCall/OpTailCall before pushing a frame.
	// Other prototypes may still hold OpClosure references to this prototype
	// via NestedProtos — that is the whole point of M-BYTECODE-2D M3.
	EvalOnly bool

	// EvalReason is a human-readable explanation of why the function was
	// marked EvalOnly (typically the compile error). Used in error messages
	// and in `ailang disasm` output. Only meaningful when EvalOnly is true.
	EvalReason string
}

FuncPrototype is a compiled function: instructions, register layout, constant references, nested prototype indices (for closures), and source line information for diagnostics.

FuncPrototype satisfies the FuncPrototypeRef interface so it can be used directly as a closure target without an adapter.

func (*FuncPrototype) LookupConstant ¶

func (p *FuncPrototype) LookupConstant(localIdx int, img *BytecodeImage) (Value, bool)

LookupConstant resolves a local constant index (the Bx of a LOAD_CONST) to the actual Value via the supplied image's pool. For an out-of-range index, the second return value is false; the VM should treat that as a corrupt image.
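The two-level indirection (local index, then pool index) might look like the following sketch. The `image` and `proto` types are hypothetical reductions that hold strings instead of Value, and checking the pool index as well as the local index is an assumption about how defensive the lookup is.

```go
package main

import "fmt"

// image is a hypothetical image holding only a constant pool.
type image struct{ pool []string }

// proto is a hypothetical prototype; constants holds indices into image.pool.
type proto struct{ constants []int }

// lookup maps a local constant index to the pooled value.
// The second result is false when either index is out of range.
func (p proto) lookup(localIdx int, img image) (string, bool) {
	if localIdx < 0 || localIdx >= len(p.constants) {
		return "", false
	}
	poolIdx := p.constants[localIdx]
	if poolIdx < 0 || poolIdx >= len(img.pool) {
		return "", false
	}
	return img.pool[poolIdx], true
}

func main() {
	img := image{pool: []string{"a", "b", "c"}}
	p := proto{constants: []int{2, 0}} // local 0 -> pool 2, local 1 -> pool 0
	fmt.Println(p.lookup(0, img))
	fmt.Println(p.lookup(5, img))
}
```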

func (*FuncPrototype) NumParameters ¶

func (p *FuncPrototype) NumParameters() uint8

NumParameters implements FuncPrototypeRef.

func (*FuncPrototype) NumRegisters ¶

func (p *FuncPrototype) NumRegisters() uint8

NumRegisters implements FuncPrototypeRef.

func (*FuncPrototype) ProtoName ¶

func (p *FuncPrototype) ProtoName() string

ProtoName implements FuncPrototypeRef.

type FuncPrototypeRef ¶

type FuncPrototypeRef interface {
	ProtoName() string
	NumRegisters() uint8
	NumParameters() uint8
}

FuncPrototypeRef is a forward reference to a FuncPrototype defined in image.go. Defined here as an interface to avoid forcing a particular concrete type into the closure value.

In practice the only implementer is *FuncPrototype, but using an interface keeps Value decoupled from the image format and lets tests substitute stub prototypes.

type Instruction ¶

type Instruction uint32

Instruction is a single 32-bit bytecode word.

func EncodeABC ¶

func EncodeABC(op OpCode, a, b, c uint8) Instruction

EncodeABC builds an ABC-form instruction. Each operand must fit in 8 bits.

func EncodeABx ¶

func EncodeABx(op OpCode, a uint8, bx uint16) Instruction

EncodeABx builds an ABx-form instruction. Bx is unsigned 16 bits.

func EncodeASBx ¶

func EncodeASBx(op OpCode, a uint8, sbx int) Instruction

EncodeASBx builds an ABx-form instruction with a signed 16-bit operand. sbx must be in the range [-0x8000, 0x7FFF].

func (Instruction) A ¶

func (i Instruction) A() uint8

A extracts the A operand (8 bits).

func (Instruction) B ¶

func (i Instruction) B() uint8

B extracts the B operand (8 bits, ABC form).

func (Instruction) Bx ¶

func (i Instruction) Bx() uint16

Bx extracts the wide unsigned operand (16 bits, ABx form).

func (Instruction) C ¶

func (i Instruction) C() uint8

C extracts the C operand (8 bits, ABC form).

func (Instruction) Op ¶

func (i Instruction) Op() OpCode

Op extracts the opcode.

func (Instruction) SBx ¶

func (i Instruction) SBx() int

SBx extracts the wide signed operand (16 bits, ABx form, biased).

func (Instruction) String ¶

func (i Instruction) String() string

String returns a human-readable disassembly of a single instruction. The output is intentionally minimal; full disassembly with constant resolution happens in Phase 2D's disassembler.

type ListObj ¶

type ListObj struct {
	Elems []Value
}

ListObj is a slice-backed list. The design doc (§3.2) leaves the choice of cons-cell vs slice open; Phase 2B uses slices for cache locality and simple indexing. Revisit in Phase 2D if benchmarks call for it.

type OpCode ¶

type OpCode uint8

OpCode is the 8-bit instruction tag.

const (
	// Loads
	OpLoadConst  OpCode = iota // R[A] = Constants[Bx]                       (ABx)
	OpLoadNil                  // R[A] = Unit                                (A)
	OpMove                     // R[A] = R[B]                                (AB)
	OpLoadGlobal               // R[A] = Globals[Bx]                         (ABx)

	// Arithmetic (int and float, dispatched by value tag at runtime)
	OpAdd // R[A] = R[B] + R[C]                                             (ABC)
	OpSub // R[A] = R[B] - R[C]                                             (ABC)
	OpMul // R[A] = R[B] * R[C]                                             (ABC)
	OpDiv // R[A] = R[B] / R[C]                                             (ABC)
	OpMod // R[A] = R[B] % R[C]                                             (ABC)
	OpNeg // R[A] = -R[B]                                                   (AB)

	// Comparison (result is Bool)
	OpEq // R[A] = R[B] == R[C]                                             (ABC)
	OpLt // R[A] = R[B] < R[C]                                              (ABC)
	OpLe // R[A] = R[B] <= R[C]                                             (ABC)

	// Logic (NOT only — AND/OR are compiled as conditional jumps)
	OpNot // R[A] = !R[B]                                                   (AB)

	// String
	OpConcat // R[A] = R[B] ++ R[C]                                         (ABC)

	// Control flow
	OpJump        // IP += SBx                                              (SBx)
	OpJumpIfFalse // if !R[A] then IP += SBx                                (A, SBx)
	OpCall        // R[A..A+C-1] = R[A](R[A+1..A+B])                        (ABC)
	OpTailCall    // return R[A](R[A+1..A+B])  -- reuses current frame      (AB)
	OpReturn      // return R[A]                                            (A)

	// Closures
	OpClosure // R[A] = new closure from Prototypes[Bx], followed by C MOVE

	// Collections
	OpMakeList   // R[A] = [R[B]..R[B+C-1]]                                 (ABC)
	OpMakeTuple  // R[A] = (R[B]..R[B+C-1])                                 (ABC)
	OpMakeRecord // R[A] = {fields from R[B]..R[B+C-1]} -- field names from
	// the immediately following pseudo-instructions encoded
	// as constant pool indices                                (ABC)
	OpCons     // R[A] = R[B] :: R[C]                                     (ABC)
	OpGetField // R[A] = R[B].Fields[C]  -- C indexes record's sorted
	// field name table inherited from constant pool           (ABC)
	OpGetIndex // R[A] = R[B][R[C]]  -- list integer indexing only        (ABC)

	// ADT
	OpMakeADT // R[A] = ADT{tag=B, fields=R[A+1..A+C]}                      (ABC)
	OpGetTag  // R[A] = R[B].tag  -- for switch dispatch                    (AB)

	// Builtins
	OpBuiltinCall // R[A] = BuiltinTable[Bx](R[A+1..A+C])                   (ABx, C)
	// Note: encoding uses ABx + a following ABC pseudo-op
	// is avoided here. C is read from a second instruction
	// word — see the assembler in vm/assemble.go (M4).
	OpBuiltinCallHOF // R[A] = HOFBuiltinTable[B](vm, R[A+1..A+C])          (ABC)
	// Like OpBuiltinCall but passes VM as ClosureCaller so the
	// builtin can invoke closure arguments via VM.CallClosure.
	OpBuiltinTrap // R[A] = evaluator.CallBuiltin(Bx, R[A+1..A+C])          (Phase 2C/2E)
	OpEffectTrap  // yield to evaluator for effect Bx, arg in R[A]          (Phase 2E)

)

Opcode definitions. Numeric values are part of the bytecode format and must remain stable; append new opcodes at the end. Opcodes correspond directly to the design doc M-BYTECODE-VM §4.2.
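As a rough illustration of how the register semantics in the comments above compose, here is a toy dispatch loop over a handful of opcodes. It is not the VM in internal/vm: the opcode numbering, the pre-decoded `ins` struct, the integer-only registers, and the choice to make jump offsets relative to the next instruction are all assumptions made for this sketch.

```go
package main

import "fmt"

// opcode values are local to this sketch and do not match the real
// numbering of the package's OpCode constants.
type opcode uint8

const (
	opLoadConst   opcode = iota // R[A] = K[B]
	opAdd                       // R[A] = R[B] + R[C]
	opLt                        // R[A] = R[B] < R[C], stored as 1 or 0
	opJump                      // IP += SBx
	opJumpIfFalse               // if R[A] == 0 then IP += SBx
	opReturn                    // return R[A]
)

// ins is a pre-decoded instruction; the real format packs these fields
// into one 32-bit word.
type ins struct {
	op      opcode
	a, b, c int
	sbx     int
}

// loopProgram sums the integers 0..4 using a backward jump.
func loopProgram() ([]ins, []int) {
	k := []int{0, 1, 5}
	code := []ins{
		{op: opLoadConst, a: 0, b: 0},     // 0: acc = 0
		{op: opLoadConst, a: 1, b: 0},     // 1: i = 0
		{op: opLoadConst, a: 2, b: 1},     // 2: one = 1
		{op: opLoadConst, a: 3, b: 2},     // 3: limit = 5
		{op: opLt, a: 4, b: 1, c: 3},      // 4: t = i < limit
		{op: opJumpIfFalse, a: 4, sbx: 3}, // 5: if !t goto 9
		{op: opAdd, a: 0, b: 0, c: 1},     // 6: acc += i
		{op: opAdd, a: 1, b: 1, c: 2},     // 7: i += one
		{op: opJump, sbx: -5},             // 8: goto 4
		{op: opReturn, a: 0},              // 9: return acc
	}
	return code, k
}

// run executes code against constants k and returns the RETURN operand.
func run(code []ins, k []int) int {
	regs := make([]int, 8)
	ip := 0
	for {
		in := code[ip]
		ip++ // assumption: offsets are relative to the next instruction
		switch in.op {
		case opLoadConst:
			regs[in.a] = k[in.b]
		case opAdd:
			regs[in.a] = regs[in.b] + regs[in.c]
		case opLt:
			regs[in.a] = 0
			if regs[in.b] < regs[in.c] {
				regs[in.a] = 1
			}
		case opJump:
			ip += in.sbx
		case opJumpIfFalse:
			if regs[in.a] == 0 {
				ip += in.sbx
			}
		case opReturn:
			return regs[in.a]
		}
	}
}

func main() {
	code, k := loopProgram()
	fmt.Println(run(code, k)) // 0+1+2+3+4
}
```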

func (OpCode) String ¶

func (op OpCode) String() string

String returns the human-readable opcode name. Used by the disassembler and error messages.

type RecordField ¶

type RecordField struct {
	Name  string
	Value Value
}

RecordField is a single field of a record. Records store fields in alphabetical order by Name (per §4.3 — A1 determinism).

type RecordObj ¶

type RecordObj struct {
	Fields []RecordField
}

RecordObj is an alphabetically-sorted set of named fields. The sort invariant is enforced by NewRecord; do not construct RecordObj directly.

type StringObj ¶

type StringObj struct {
	S string
}

StringObj wraps an immutable string. Strings are reference-shared across VM/evaluator boundaries since they cannot be mutated.

type TupleObj ¶

type TupleObj struct {
	Elems []Value
}

TupleObj is a fixed-arity heterogeneous record-without-names.

type Value ¶

type Value struct {
	Tag  ValueTag
	Int  int64
	Flt  float64
	Bool bool
	Obj  any // *StringObj, *ListObj, *TupleObj, *RecordObj, *ClosureObj, *ADTObj
}

Value is the Phase 2B tagged-struct representation of an AILANG runtime value, per M-BYTECODE-VM §3.2.

Primitives (Int, Float, Bool, Unit) are unboxed into the dedicated fields. Heap objects (String, List, Tuple, Record, Closure, ADT) live in Obj as pointers to the corresponding *Obj struct.

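This is intentionally larger than a NaN-boxed uint64: with the fields declared above, the struct occupies 48 bytes on typical 64-bit platforms once alignment padding and the two-word interface are counted, versus 8. The design doc gates NaN-boxing on Phase 2D benchmark evidence; do not switch representation until we have data showing value dispatch is the bottleneck.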

func NewADT ¶

func NewADT(tag int, fields []Value) Value

NewADT constructs an ADT value with the given tag and fields.

func NewBool ¶

func NewBool(b bool) Value

NewBool constructs a Bool value.

func NewClosure ¶

func NewClosure(proto FuncPrototypeRef, captures []Value) Value

NewClosure constructs a Closure value with flat-captured values.

func NewFloat ¶

func NewFloat(f float64) Value

NewFloat constructs a Float value.

func NewInt ¶

func NewInt(n int64) Value

NewInt constructs an Int value.

func NewList ¶

func NewList(elems []Value) Value

NewList constructs a List value from the given elements. The slice is retained as-is — callers must not mutate it after construction.

func NewRecord ¶

func NewRecord(fields []RecordField) Value

NewRecord constructs a Record value, sorting fields alphabetically by name. Duplicate field names cause a panic — the compiler must reject duplicates upstream.
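The sort-then-reject-duplicates contract can be sketched as follows. The `field` type and `newRecord` function are hypothetical; sorting a copy rather than in place and comparing names byte-wise are assumptions, since the source only says "alphabetically".

```go
package main

import (
	"fmt"
	"sort"
)

// field is a hypothetical stand-in for RecordField.
type field struct {
	name  string
	value int
}

// newRecord returns the fields sorted by name and panics on duplicates.
// Assumption: the input is copied so the caller's slice is left untouched.
func newRecord(fields []field) []field {
	out := make([]field, len(fields))
	copy(out, fields)
	sort.Slice(out, func(i, j int) bool { return out[i].name < out[j].name })
	// After sorting, duplicates are adjacent, so one pass finds them.
	for i := 1; i < len(out); i++ {
		if out[i].name == out[i-1].name {
			panic(fmt.Sprintf("duplicate record field %q", out[i].name))
		}
	}
	return out
}

func main() {
	fmt.Println(newRecord([]field{{"y", 2}, {"x", 1}}))
}
```

Because the order is canonical, two records with the same fields compare equal position by position regardless of how the source program ordered them.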

func NewString ¶

func NewString(s string) Value

NewString constructs a String value backed by a heap StringObj.

func NewTuple ¶

func NewTuple(elems []Value) Value

NewTuple constructs a Tuple value from the given elements.

func Unit ¶

func Unit() Value

Unit returns the canonical unit value. Unit is a singleton from a semantic standpoint; the struct copy is cheap.

func (Value) AsADT ¶

func (v Value) AsADT() *ADTObj

AsADT returns the underlying ADT object.

func (Value) AsClosure ¶

func (v Value) AsClosure() *ClosureObj

AsClosure returns the underlying closure object.

func (Value) AsList ¶

func (v Value) AsList() []Value

AsList returns the underlying slice. Panics if v is not a List. The caller must not mutate the returned slice.

func (Value) AsRecord ¶

func (v Value) AsRecord() []RecordField

AsRecord returns the underlying record fields (alphabetically sorted).

func (Value) AsString ¶

func (v Value) AsString() string

AsString returns the underlying string. Panics if v is not a String.

func (Value) AsTuple ¶

func (v Value) AsTuple() []Value

AsTuple returns the underlying slice. Panics if v is not a Tuple.

func (Value) Equal ¶

func (v Value) Equal(other Value) bool

Equal reports whether two values are structurally equal under AILANG's canonical equality (§3.6). This is the equivalence relation used for constant-pool deduplication and for comparing test results between the VM and evaluator.

Notes:

- Float NaN compares equal to NaN here (so NaN constants dedupe). The `EQ` opcode at runtime uses IEEE semantics (NaN != NaN) and must NOT route through this function.
- Records require both sides to be in alphabetical order, which is the constructor's invariant.
- Closures compare by prototype identity and capture-by-capture equality. Two closures wrapping the same prototype with different captures are unequal.

func (Value) String ¶

func (v Value) String() string

String renders a value for debugging. Output is intended to be readable, not machine-parseable, and not part of any wire format.

type ValueTag ¶

type ValueTag uint8

ValueTag identifies the runtime type of a Value.

Tag identity is part of the in-memory representation, not the bytecode format. Numeric values may be reordered freely.

const (
	TagInt ValueTag = iota
	TagFloat
	TagBool
	TagUnit
	TagString
	TagList
	TagTuple
	TagRecord
	TagClosure
	TagADT
)

func (ValueTag) String ¶

func (t ValueTag) String() string

String returns the human-readable tag name. Used for error messages and debugging.

Directories ¶

Path	Synopsis
compiler	Package compiler lowers Statement IR (internal/gen/stmt) into bytecode images runnable by the register VM (internal/vm).
