bytecode

package
v0.14.2 Latest
Published: Apr 26, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package bytecode defines the AILANG bytecode instruction set, image format, and Value type used by the register VM (internal/vm).

This package is the compilation target for Statement IR (Phase 2C) and the input to the VM dispatch loop (Phase 2B). Per the M-BYTECODE-VM design doc (§11), this package must NOT import internal/vm. The import direction is always vm → bytecode.

Instructions are 32-bit words with two layouts:

┌──────────┬──────────┬──────────┬──────────┐
│  OpCode  │    A     │    B     │    C     │   ABC form
│  8 bits  │  8 bits  │  8 bits  │  8 bits  │
└──────────┴──────────┴──────────┴──────────┘

┌──────────┬──────────┬─────────────────────┐
│  OpCode  │    A     │       Bx (16 bits)  │   ABx form
│  8 bits  │  8 bits  │                     │
└──────────┴──────────┴─────────────────────┘

Bx is interpreted as unsigned (constant pool indices, prototype indices). SBx is the same field interpreted as a signed offset (jump targets), with bias 0x8000 — i.e. SBx = int(Bx) - 0x8000, range [-32768, 32767].
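
The bias arithmetic can be sketched as follows. This is a minimal illustration, not the package's implementation: the bit positions (opcode in the most significant byte, then A, then B/C or Bx) are an assumption read off the diagrams above, and the lowercase helper names are hypothetical.

```go
package main

import "fmt"

// Instruction mirrors the package's 32-bit word. The bit positions used
// below are an assumption read off the layout diagram, not confirmed by
// the package source.
type Instruction uint32

const sbxBias = 0x8000 // bias stated in the overview

// encodeABC packs an opcode and three 8-bit operands.
func encodeABC(op, a, b, c uint8) Instruction {
	return Instruction(op)<<24 | Instruction(a)<<16 | Instruction(b)<<8 | Instruction(c)
}

// encodeABx packs an opcode, an 8-bit operand, and a 16-bit unsigned operand.
func encodeABx(op, a uint8, bx uint16) Instruction {
	return Instruction(op)<<24 | Instruction(a)<<16 | Instruction(bx)
}

// encodeASBx stores a signed offset by adding the bias before packing.
func encodeASBx(op, a uint8, sbx int) (Instruction, error) {
	if sbx < -sbxBias || sbx > sbxBias-1 {
		return 0, fmt.Errorf("sbx %d out of range [-32768, 32767]", sbx)
	}
	return encodeABx(op, a, uint16(sbx+sbxBias)), nil
}

// bx reads the low 16 bits as an unsigned operand.
func (i Instruction) bx() uint16 { return uint16(i) }

// sbx reads the same field and removes the bias: SBx = int(Bx) - 0x8000.
func (i Instruction) sbx() int { return int(i.bx()) - sbxBias }

func main() {
	ins, err := encodeASBx(15, 0, -3)
	if err != nil {
		panic(err)
	}
	fmt.Printf("word=%#x bx=%d sbx=%d\n", uint32(ins), ins.bx(), ins.sbx())
}
```

Storing the offset with a bias keeps the field unsigned in the word, so one extractor serves both interpretations and only the final subtraction differs.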

Functions

func Disassemble

func Disassemble(img *BytecodeImage) string

Disassemble produces a human-readable listing of an entire BytecodeImage: header (entry point, totals), constant pool, globals, and one block per prototype with annotated instructions.

The output is intentionally text-only and stable enough for golden tests. It is the primary debugging surface for Phase 2D — `ailang disasm` calls directly into this function.

func DisassembleFunc

func DisassembleFunc(p *FuncPrototype, img *BytecodeImage) string

DisassembleFunc disassembles a single FuncPrototype against an image (for constant resolution). Used by tests that don't want a full image dump.

func OpCount

func OpCount() int

OpCount returns the number of defined opcodes (excluding the sentinel). Used by tests and disassemblers.

Types

type ADTObj

type ADTObj struct {
	Tag    int
	Fields []Value
}

ADTObj is a tagged algebraic data type instance. Tag is a per-type local ordinal assigned during type elaboration; it is NOT globally unique. Type disambiguation is the compiler's responsibility (§4.3).

type BytecodeImage

type BytecodeImage struct {
	// Constants is the deduplicated constant pool. Constants are added via
	// AddConstant which dedupes by structural equality.
	Constants []Value

	// Prototypes is the table of compiled functions. The order is the
	// canonical numeric prototype identity used by CLOSURE indices.
	Prototypes []*FuncPrototype

	// EntryPoint is the index in Prototypes of the function the VM should
	// invoke when running the image. -1 means "no entry point" (e.g. a
	// library image with only callable functions).
	EntryPoint int

	// Globals is a flat array of mutable global slots. LOAD_GLOBAL Bx indexes
	// into this slice. Phase 2B uses this only for hand-assembled tests; the
	// compiler in Phase 2C will populate it from top-level let bindings.
	Globals []Value
}

BytecodeImage is the in-memory bytecode unit. It bundles a constant pool, a prototype table, and an entry point. Phase 2B is in-memory only — there is no on-disk format yet (Phase 2D will add serialization if needed).

func NewImage

func NewImage() *BytecodeImage

NewImage returns an empty bytecode image with no entry point.

func (*BytecodeImage) AddConstant

func (img *BytecodeImage) AddConstant(v Value) int

AddConstant adds a value to the constant pool, deduplicating by structural equality (Value.Equal). Returns the pool index. The dedup is intentionally linear — Phase 2B images are tiny. If profiling later shows this matters, we can add a hash table keyed by a canonical representation.
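
A linear dedup of this kind might look like the sketch below. The `value` and `image` types are simplified, hypothetical stand-ins (ints and strings only) so the example stays self-contained; the real pool compares with Value.Equal.

```go
package main

import "fmt"

// value is a simplified stand-in for bytecode.Value (hypothetical; it only
// models ints and strings).
type value struct {
	isStr bool
	n     int64
	s     string
}

// equal is structural equality for the simplified value.
func (v value) equal(o value) bool { return v == o }

// image holds just the constant pool for this sketch.
type image struct{ constants []value }

// addConstant scans the pool linearly and reuses an existing slot when an
// equal value is already present; otherwise it appends a new slot.
func (img *image) addConstant(v value) int {
	for i, c := range img.constants {
		if c.equal(v) {
			return i
		}
	}
	img.constants = append(img.constants, v)
	return len(img.constants) - 1
}

func main() {
	img := &image{}
	a := img.addConstant(value{n: 42})
	b := img.addConstant(value{isStr: true, s: "hi"})
	c := img.addConstant(value{n: 42}) // reuses the slot of the first 42
	fmt.Println(a, b, c, len(img.constants))
}
```

Each insertion costs a scan of the existing pool, which is the trade-off the paragraph above accepts for small images.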

func (*BytecodeImage) AddPrototype

func (img *BytecodeImage) AddPrototype(p *FuncPrototype) int

AddPrototype appends a prototype and returns its index. The returned index is what CLOSURE instructions reference (via the parent prototype's NestedProtos table) and what EntryPoint is set to.

func (*BytecodeImage) SetEntryPoint

func (img *BytecodeImage) SetEntryPoint(idx int) error

SetEntryPoint marks a prototype index as the image's entry point. Returns an error if the index is out of range.

func (*BytecodeImage) Validate

func (img *BytecodeImage) Validate() error

Validate performs structural sanity checks on an image:

  • every instruction's prototype/constant references are in range
  • jump targets land within the same prototype
  • LineInfo (if non-nil) has the right length

This is intended to be called by tests and by the VM before execution as a defensive check. It does not verify type correctness or semantic well-formedness; that is the compiler's job.
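
The three checks listed above could be sketched for a single prototype as follows. Everything here is hypothetical: the types are pre-decoded stand-ins rather than the package's own, and the rule that a jump offset is relative to the next instruction is an assumption, since the documentation only states `IP += SBx`.

```go
package main

import (
	"errors"
	"fmt"
)

// instr is a hypothetical, already-decoded instruction.
type instr struct {
	isLoadConst bool
	isJump      bool
	bx          int // local constant index for a constant load
	sbx         int // signed offset for a jump
}

// proto is a hypothetical stand-in for FuncPrototype.
type proto struct {
	instrs    []instr
	constants []int // indices into the image-level pool
	lineInfo  []int
}

// validateProto applies the three structural checks to one prototype.
// poolLen is the length of the image-level constant pool.
func validateProto(p *proto, poolLen int) error {
	if p.lineInfo != nil && len(p.lineInfo) != len(p.instrs) {
		return errors.New("LineInfo length does not match instruction count")
	}
	for pc, in := range p.instrs {
		if in.isLoadConst {
			if in.bx < 0 || in.bx >= len(p.constants) {
				return fmt.Errorf("pc %d: local constant %d out of range", pc, in.bx)
			}
			if k := p.constants[in.bx]; k < 0 || k >= poolLen {
				return fmt.Errorf("pc %d: pool index %d out of range", pc, k)
			}
		}
		if in.isJump {
			// Assumption: the offset is applied after IP has advanced.
			target := pc + 1 + in.sbx
			if target < 0 || target >= len(p.instrs) {
				return fmt.Errorf("pc %d: jump target %d outside prototype", pc, target)
			}
		}
	}
	return nil
}

func main() {
	good := &proto{
		instrs:    []instr{{isLoadConst: true, bx: 0}, {isJump: true, sbx: -2}},
		constants: []int{3},
	}
	fmt.Println(validateProto(good, 4))

	bad := &proto{instrs: []instr{{isJump: true, sbx: 5}}}
	fmt.Println(validateProto(bad, 0))
}
```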

func (*BytecodeImage) ValidatePrototype

func (img *BytecodeImage) ValidatePrototype(protoIdx int) error

ValidatePrototype runs the structural checks from Validate against a single prototype. The compiler uses this to validate per-function after lowering, so a single buggy proto can be rolled back and tagged EvalOnly rather than aborting the whole image (see compiler.go Phase 2).

type ClosureObj

type ClosureObj struct {
	Proto    FuncPrototypeRef
	Captures []Value
}

ClosureObj is a flat closure: the prototype plus a fixed array of captured values. Captures are copied at closure creation time (§3.3 — flat closures).
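
Copy-at-creation semantics can be illustrated with a small sketch. The `value` and `closure` types and the `newClosure` helper are hypothetical simplifications, not the package's API.

```go
package main

import "fmt"

// value is a simplified stand-in for bytecode.Value.
type value int64

// closure mirrors the shape of ClosureObj with a name in place of the
// prototype reference.
type closure struct {
	protoName string
	captures  []value
}

// newClosure copies the selected registers so the closure owns a snapshot
// of their values at creation time.
func newClosure(name string, regs []value, captured []int) *closure {
	caps := make([]value, len(captured))
	for i, r := range captured {
		caps[i] = regs[r]
	}
	return &closure{protoName: name, captures: caps}
}

func main() {
	regs := []value{10, 20, 30}
	c := newClosure("adder", regs, []int{0, 2})
	regs[0] = 99 // later writes to the frame are not visible to the closure
	fmt.Println(c.captures)
}
```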

type FuncPrototype

type FuncPrototype struct {
	// Name is a human-readable identifier for stack traces and disassembly.
	Name string

	// NumRegs is the size of the register frame allocated when this function
	// is called. The compiler (Phase 2C) computes this from VarDecl count
	// plus a few scratch slots.
	NumRegs uint8

	// NumParams is the number of parameters. They occupy registers
	// [0, NumParams). The remaining registers are locals.
	NumParams uint8

	// IsVariadic indicates the last parameter consumes a list of remaining
	// arguments. Phase 2B does not require this; included so the type doesn't
	// need a breaking change in 2C.
	IsVariadic bool

	// NumCaptures is the number of values this function captures from its
	// enclosing scope when it is instantiated as a closure. Used by the VM
	// to know how many pseudo-MOVE instructions to consume after CLOSURE.
	// Top-level functions have NumCaptures=0.
	NumCaptures uint8

	// Instructions is the bytecode body.
	Instructions []Instruction

	// Constants holds indices into the parent BytecodeImage's constant pool.
	// Storing indices (not values) keeps the prototype small and the
	// deduplication centralized.
	//
	// LOAD_CONST's Bx field is an index into THIS slice (the prototype's
	// local constant table), not directly into the image pool. The
	// indirection lets a prototype use only the constants it actually
	// references.
	Constants []int

	// NestedProtos holds indices into the parent BytecodeImage's prototype
	// table. CLOSURE's Bx field indexes into this slice.
	NestedProtos []int

	// LineInfo maps instruction index → source line number. Used to attach
	// source locations to runtime errors. Length should equal len(Instructions);
	// a zero entry means "no source line available".
	LineInfo []int

	// File is the source file the function was compiled from. Used together
	// with LineInfo to format runtime errors as `<file>:<line>`. Empty when
	// the file is unknown (e.g. hand-built test prototypes).
	File string

	// EvalOnly marks this prototype as a stub: the function exists in the
	// program but the bytecode compiler couldn't compile it (or chose not to)
	// and the VM must trap to the evaluator via VM.Interop on every call.
	//
	// When true, Instructions and LineInfo are nil and NumRegs is zero. The
	// VM checks this flag inside OpCall/OpTailCall before pushing a frame.
	// Other prototypes may still hold OpClosure references to this prototype
	// via NestedProtos — that is the whole point of M-BYTECODE-2D M3.
	EvalOnly bool

	// EvalReason is a human-readable explanation of why the function was
	// marked EvalOnly (typically the compile error). Used in error messages
	// and in `ailang disasm` output. Only meaningful when EvalOnly is true.
	EvalReason string
}

FuncPrototype is a compiled function: instructions, register layout, constant references, nested prototype indices (for closures), and source line information for diagnostics.

FuncPrototype satisfies the FuncPrototypeRef interface so it can be used directly as a closure target without an adapter.

func (*FuncPrototype) LookupConstant

func (p *FuncPrototype) LookupConstant(localIdx int, img *BytecodeImage) (Value, bool)

LookupConstant resolves a local constant index (the Bx of a LOAD_CONST) to the actual Value via the supplied image's pool. If the index is out of range, the second return value is false; the VM should treat that as a corrupt image.
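
The two-level lookup (local table, then image pool) might be sketched like this. The types are hypothetical stand-ins with a string in place of Value; only the indirection itself is taken from the documentation.

```go
package main

import "fmt"

// value is a simplified stand-in for bytecode.Value.
type value string

// image holds the deduplicated pool.
type image struct{ constants []value }

// proto holds indices into the image pool, as FuncPrototype.Constants does.
type proto struct{ constants []int }

// lookupConstant maps a local index to a pool index, then to the value.
// Either step being out of range reports false.
func (p *proto) lookupConstant(localIdx int, img *image) (value, bool) {
	if localIdx < 0 || localIdx >= len(p.constants) {
		return "", false
	}
	poolIdx := p.constants[localIdx]
	if poolIdx < 0 || poolIdx >= len(img.constants) {
		return "", false
	}
	return img.constants[poolIdx], true
}

func main() {
	img := &image{constants: []value{"zero", "one", "two"}}
	p := &proto{constants: []int{2, 0}} // local 0 -> pool 2, local 1 -> pool 0

	v, ok := p.lookupConstant(0, img)
	fmt.Println(v, ok)

	_, ok = p.lookupConstant(5, img)
	fmt.Println(ok)
}
```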

func (*FuncPrototype) NumParameters

func (p *FuncPrototype) NumParameters() uint8

NumParameters implements FuncPrototypeRef.

func (*FuncPrototype) NumRegisters

func (p *FuncPrototype) NumRegisters() uint8

NumRegisters implements FuncPrototypeRef.

func (*FuncPrototype) ProtoName

func (p *FuncPrototype) ProtoName() string

ProtoName implements FuncPrototypeRef.

type FuncPrototypeRef

type FuncPrototypeRef interface {
	ProtoName() string
	NumRegisters() uint8
	NumParameters() uint8
}

FuncPrototypeRef is a forward reference to a FuncPrototype defined in image.go. Defined here as an interface to avoid forcing a particular concrete type into the closure value.

In practice the only implementer is *FuncPrototype, but using an interface keeps Value decoupled from the image format and lets tests substitute stub prototypes.
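
A test double along these lines is one way to use that decoupling. The interface is copied from the declaration above; `stubProto` and `describe` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// FuncPrototypeRef is copied from the declaration above.
type FuncPrototypeRef interface {
	ProtoName() string
	NumRegisters() uint8
	NumParameters() uint8
}

// stubProto is a hypothetical test double that carries no bytecode.
type stubProto struct {
	name   string
	regs   uint8
	params uint8
}

func (s stubProto) ProtoName() string    { return s.name }
func (s stubProto) NumRegisters() uint8  { return s.regs }
func (s stubProto) NumParameters() uint8 { return s.params }

// Compile-time check that the stub satisfies the interface.
var _ FuncPrototypeRef = stubProto{}

// describe works with any implementer, real or stub.
func describe(p FuncPrototypeRef) string {
	return fmt.Sprintf("%s/%d (%d regs)", p.ProtoName(), p.NumParameters(), p.NumRegisters())
}

func main() {
	fmt.Println(describe(stubProto{name: "f", regs: 4, params: 2}))
}
```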

type Instruction

type Instruction uint32

Instruction is a single 32-bit bytecode word.

func EncodeABC

func EncodeABC(op OpCode, a, b, c uint8) Instruction

EncodeABC builds an ABC-form instruction. Each operand must fit in 8 bits.

func EncodeABx

func EncodeABx(op OpCode, a uint8, bx uint16) Instruction

EncodeABx builds an ABx-form instruction. Bx is unsigned 16 bits.

func EncodeASBx

func EncodeASBx(op OpCode, a uint8, sbx int) Instruction

EncodeASBx builds an ABx-form instruction with a signed 16-bit operand. sbx must be in the range [-0x8000, 0x7FFF].

func (Instruction) A

func (i Instruction) A() uint8

A extracts the A operand (8 bits).

func (Instruction) B

func (i Instruction) B() uint8

B extracts the B operand (8 bits, ABC form).

func (Instruction) Bx

func (i Instruction) Bx() uint16

Bx extracts the wide unsigned operand (16 bits, ABx form).

func (Instruction) C

func (i Instruction) C() uint8

C extracts the C operand (8 bits, ABC form).

func (Instruction) Op

func (i Instruction) Op() OpCode

Op extracts the opcode.

func (Instruction) SBx

func (i Instruction) SBx() int

SBx extracts the wide signed operand (16 bits, ABx form, biased).

func (Instruction) String

func (i Instruction) String() string

String returns a human-readable disassembly of a single instruction. The output is intentionally minimal; full disassembly with constant resolution happens in Phase 2D's disassembler.

type ListObj

type ListObj struct {
	Elems []Value
}

ListObj is a slice-backed list. The design doc (§3.2) leaves the choice of cons-cell vs slice open; Phase 2B uses slices for cache locality and simple indexing. Revisit in Phase 2D if benchmarks demand.

type OpCode

type OpCode uint8

OpCode is the 8-bit instruction tag.

const (
	// Loads
	OpLoadConst  OpCode = iota // R[A] = Constants[Bx]                       (ABx)
	OpLoadNil                  // R[A] = Unit                                (A)
	OpMove                     // R[A] = R[B]                                (AB)
	OpLoadGlobal               // R[A] = Globals[Bx]                         (ABx)

	// Arithmetic (int and float, dispatched by value tag at runtime)
	OpAdd // R[A] = R[B] + R[C]                                             (ABC)
	OpSub // R[A] = R[B] - R[C]                                             (ABC)
	OpMul // R[A] = R[B] * R[C]                                             (ABC)
	OpDiv // R[A] = R[B] / R[C]                                             (ABC)
	OpMod // R[A] = R[B] % R[C]                                             (ABC)
	OpNeg // R[A] = -R[B]                                                   (AB)

	// Comparison (result is Bool)
	OpEq // R[A] = R[B] == R[C]                                             (ABC)
	OpLt // R[A] = R[B] < R[C]                                              (ABC)
	OpLe // R[A] = R[B] <= R[C]                                             (ABC)

	// Logic (NOT only — AND/OR are compiled as conditional jumps)
	OpNot // R[A] = !R[B]                                                   (AB)

	// String
	OpConcat // R[A] = R[B] ++ R[C]                                         (ABC)

	// Control flow
	OpJump        // IP += SBx                                              (SBx)
	OpJumpIfFalse // if !R[A] then IP += SBx                                (A, SBx)
	OpCall        // R[A..A+C-1] = R[A](R[A+1..A+B])                        (ABC)
	OpTailCall    // return R[A](R[A+1..A+B])  -- reuses current frame      (AB)
	OpReturn      // return R[A]                                            (A)

	// Closures
	OpClosure // R[A] = new closure from Prototypes[Bx], followed by C MOVE

	// Collections
	OpMakeList   // R[A] = [R[B]..R[B+C-1]]                                 (ABC)
	OpMakeTuple  // R[A] = (R[B]..R[B+C-1])                                 (ABC)
	OpMakeRecord // R[A] = {fields from R[B]..R[B+C-1]} -- field names from
	// the immediately following pseudo-instructions encoded
	// as constant pool indices                                (ABC)
	OpCons     // R[A] = R[B] :: R[C]                                     (ABC)
	OpGetField // R[A] = R[B].Fields[C]  -- C indexes record's sorted
	// field name table inherited from constant pool           (ABC)
	OpGetIndex // R[A] = R[B][R[C]]  -- list integer indexing only        (ABC)

	// ADT
	OpMakeADT // R[A] = ADT{tag=B, fields=R[A+1..A+C]}                      (ABC)
	OpGetTag  // R[A] = R[B].tag  -- for switch dispatch                    (AB)

	// Builtins
	OpBuiltinCall // R[A] = BuiltinTable[Bx](R[A+1..A+C])                   (ABx, C)
	// Note: encoding uses ABx + a following ABC pseudo-op
	// is avoided here. C is read from a second instruction
	// word — see the assembler in vm/assemble.go (M4).
	OpBuiltinCallHOF // R[A] = HOFBuiltinTable[B](vm, R[A+1..A+C])          (ABC)
	// Like OpBuiltinCall but passes VM as ClosureCaller so the
	// builtin can invoke closure arguments via VM.CallClosure.
	OpBuiltinTrap // R[A] = evaluator.CallBuiltin(Bx, R[A+1..A+C])          (Phase 2C/2E)
	OpEffectTrap  // yield to evaluator for effect Bx, arg in R[A]          (Phase 2E)

)

Opcode definitions. Numeric values are part of the bytecode format and must remain stable; append new opcodes at the end. Opcodes correspond directly to the design doc M-BYTECODE-VM §4.2.
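
To make the register notation in the comments concrete, here is a toy dispatch loop over a handful of the operations. It is only a sketch: the opcode numbering is local to the example and does not match the package's values, instructions are kept as decoded structs rather than packed words, and values are plain int64.

```go
package main

import (
	"errors"
	"fmt"
)

type opcode uint8

// A small subset with sketch-local numbering.
const (
	opLoadConst opcode = iota // R[A] = K[Bx]
	opMove                    // R[A] = R[B]
	opAdd                     // R[A] = R[B] + R[C]
	opMul                     // R[A] = R[B] * R[C]
	opReturn                  // return R[A]
)

// instr is a decoded instruction; unused operands stay zero.
type instr struct {
	op      opcode
	a, b, c uint8
	bx      uint16
}

// run executes straight-line code against a fresh register frame.
func run(code []instr, consts []int64, numRegs int) (int64, error) {
	regs := make([]int64, numRegs)
	for ip, in := range code {
		switch in.op {
		case opLoadConst:
			regs[in.a] = consts[in.bx]
		case opMove:
			regs[in.a] = regs[in.b]
		case opAdd:
			regs[in.a] = regs[in.b] + regs[in.c]
		case opMul:
			regs[in.a] = regs[in.b] * regs[in.c]
		case opReturn:
			return regs[in.a], nil
		default:
			return 0, fmt.Errorf("unknown opcode %d at ip %d", in.op, ip)
		}
	}
	return 0, errors.New("fell off the end without a return")
}

func main() {
	consts := []int64{2, 3, 4}
	code := []instr{
		{op: opLoadConst, a: 0, bx: 0}, // R0 = 2
		{op: opLoadConst, a: 1, bx: 1}, // R1 = 3
		{op: opAdd, a: 0, b: 0, c: 1},  // R0 = 5
		{op: opLoadConst, a: 1, bx: 2}, // R1 = 4
		{op: opMul, a: 0, b: 0, c: 1},  // R0 = 20
		{op: opReturn, a: 0},
	}
	result, err := run(code, consts, 2)
	fmt.Println(result, err)
}
```

The loop does no bounds checking of its own; it relies on the kind of up-front structural validation that Validate describes.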

func (OpCode) String

func (op OpCode) String() string

String returns the human-readable opcode name. Used by the disassembler and error messages.

type RecordField

type RecordField struct {
	Name  string
	Value Value
}

RecordField is a single field of a record. Records store fields in alphabetical order by Name (per §4.3 — A1 determinism).

type RecordObj

type RecordObj struct {
	Fields []RecordField
}

RecordObj is an alphabetically-sorted set of named fields. The sort invariant is enforced by NewRecord; do not construct RecordObj directly.

type StringObj

type StringObj struct {
	S string
}

StringObj wraps an immutable string. Strings are reference-shared across VM/evaluator boundaries since they cannot be mutated.

type TupleObj

type TupleObj struct {
	Elems []Value
}

TupleObj is a fixed-arity, heterogeneous sequence of values: a record without field names.

type Value

type Value struct {
	Tag  ValueTag
	Int  int64
	Flt  float64
	Bool bool
	Obj  any // *StringObj, *ListObj, *TupleObj, *RecordObj, *ClosureObj, *ADTObj
}

Value is the Phase 2B tagged-struct representation of an AILANG runtime value, per M-BYTECODE-VM §3.2.

Primitives (Int, Float, Bool, Unit) are unboxed into the dedicated fields. Heap objects (String, List, Tuple, Record, Closure, ADT) live in Obj as pointers to the corresponding *Obj struct.

This is intentionally larger than a NaN-boxed uint64 (48 bytes with this field layout on 64-bit platforms, vs 8). The design doc gates NaN-boxing on Phase 2D benchmark evidence; do not switch representation until we have data showing value dispatch is the bottleneck.

func NewADT

func NewADT(tag int, fields []Value) Value

NewADT constructs an ADT value with the given tag and fields.

func NewBool

func NewBool(b bool) Value

NewBool constructs a Bool value.

func NewClosure

func NewClosure(proto FuncPrototypeRef, captures []Value) Value

NewClosure constructs a Closure value with flat-captured values.

func NewFloat

func NewFloat(f float64) Value

NewFloat constructs a Float value.

func NewInt

func NewInt(n int64) Value

NewInt constructs an Int value.

func NewList

func NewList(elems []Value) Value

NewList constructs a List value from the given elements. The slice is retained as-is — callers must not mutate it after construction.

func NewRecord

func NewRecord(fields []RecordField) Value

NewRecord constructs a Record value, sorting fields alphabetically by name. Duplicate field names cause a panic — the compiler must reject duplicates upstream.
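
The sort-then-check behaviour described here could look like the sketch below. The `field` type and `newRecord` helper are hypothetical, the field value is simplified to int64, and copying the input slice before sorting is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// field is a simplified stand-in for RecordField.
type field struct {
	Name  string
	Value int64
}

// newRecord returns the fields sorted by name and panics on duplicates.
// After sorting, any duplicate names are adjacent, so one pass finds them.
func newRecord(fields []field) []field {
	out := make([]field, len(fields))
	copy(out, fields) // assumption: the caller's slice is left untouched
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	for i := 1; i < len(out); i++ {
		if out[i].Name == out[i-1].Name {
			panic(fmt.Sprintf("duplicate record field %q", out[i].Name))
		}
	}
	return out
}

func main() {
	rec := newRecord([]field{
		{Name: "y", Value: 2},
		{Name: "x", Value: 1},
	})
	for _, f := range rec {
		fmt.Println(f.Name, f.Value)
	}
}
```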

func NewString

func NewString(s string) Value

NewString constructs a String value backed by a heap StringObj.

func NewTuple

func NewTuple(elems []Value) Value

NewTuple constructs a Tuple value from the given elements.

func Unit

func Unit() Value

Unit returns the canonical unit value. Unit is a singleton from a semantic standpoint; the struct copy is cheap.

func (Value) AsADT

func (v Value) AsADT() *ADTObj

AsADT returns the underlying ADT object.

func (Value) AsClosure

func (v Value) AsClosure() *ClosureObj

AsClosure returns the underlying closure object.

func (Value) AsList

func (v Value) AsList() []Value

AsList returns the underlying slice. Panics if v is not a List. The caller must not mutate the returned slice.

func (Value) AsRecord

func (v Value) AsRecord() []RecordField

AsRecord returns the underlying record fields (alphabetically sorted).

func (Value) AsString

func (v Value) AsString() string

AsString returns the underlying string. Panics if v is not a String.

func (Value) AsTuple

func (v Value) AsTuple() []Value

AsTuple returns the underlying slice. Panics if v is not a Tuple.

func (Value) Equal

func (v Value) Equal(other Value) bool

Equal reports whether two values are structurally equal under AILANG's canonical equality (§3.6). This is the equivalence relation used for constant-pool deduplication and for comparing test results between the VM and evaluator.

Notes:

  • Float NaN compares equal to NaN here (so NaN constants dedupe). The `EQ` opcode at runtime uses IEEE semantics — NaN != NaN — and must NOT route through this function.
  • Records require both sides to be in alphabetical order, which is the constructor's invariant.
  • Closures compare by prototype identity and capture-by-capture equality. Two closures wrapping the same prototype with different captures are unequal.

func (Value) String

func (v Value) String() string

String renders a value for debugging. Output is intended to be readable, not machine-parseable, and not part of any wire format.

type ValueTag

type ValueTag uint8

ValueTag identifies the runtime type of a Value.

Tag identity is part of the in-memory representation, not the bytecode format. Numeric values may be reordered freely.

const (
	TagInt ValueTag = iota
	TagFloat
	TagBool
	TagUnit
	TagString
	TagList
	TagTuple
	TagRecord
	TagClosure
	TagADT
)

func (ValueTag) String

func (t ValueTag) String() string

String returns the human-readable tag name. Used for error messages and debugging.

Directories

Path Synopsis
Package compiler lowers Statement IR (internal/gen/stmt) into bytecode images runnable by the register VM (internal/vm).
