Documentation ¶
Overview ¶
Package bytecode defines the AILANG bytecode instruction set, image format, and Value type used by the register VM (internal/vm).
This package is the compilation target for Statement IR (Phase 2C) and the input to the VM dispatch loop (Phase 2B). Per the M-BYTECODE-VM design doc (§11), this package must NOT import internal/vm. The import direction is always vm → bytecode.
Instructions are 32-bit words with two layouts:
┌──────────┬──────────┬──────────┬──────────┐
│  OpCode  │    A     │    B     │    C     │   ABC form
│  8 bits  │  8 bits  │  8 bits  │  8 bits  │
└──────────┴──────────┴──────────┴──────────┘

┌──────────┬──────────┬─────────────────────┐
│  OpCode  │    A     │    Bx (16 bits)     │   ABx form
│  8 bits  │  8 bits  │                     │
└──────────┴──────────┴─────────────────────┘
Bx is interpreted as unsigned (constant pool indices, prototype indices). SBx is the same field interpreted as a signed offset (jump targets), with bias 0x8000 — i.e. SBx = int(Bx) - 0x8000, range [-32768, 32767].
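The two layouts and the SBx bias can be sketched as follows. This is a minimal illustration, not the package's implementation: the bit positions (opcode in the top byte, following the left-to-right order of the diagram) are an assumption, and the lowercase helper names are hypothetical stand-ins for EncodeABC, EncodeABx, EncodeASBx, Bx, and SBx.

```go
package main

import "fmt"

// Instruction mirrors the 32-bit word described above.
// Assumption: opcode sits in the top byte, then A, then B/C or Bx.
type Instruction uint32

// sbxBias is the bias from the text: SBx = int(Bx) - 0x8000.
const sbxBias = 0x8000

// encodeABC packs four 8-bit fields into one word.
func encodeABC(op, a, b, c uint8) Instruction {
	return Instruction(uint32(op)<<24 | uint32(a)<<16 | uint32(b)<<8 | uint32(c))
}

// encodeABx packs an opcode, A, and a 16-bit unsigned operand.
func encodeABx(op, a uint8, bx uint16) Instruction {
	return Instruction(uint32(op)<<24 | uint32(a)<<16 | uint32(bx))
}

// encodeASBx stores a signed offset by adding the bias before packing.
func encodeASBx(op, a uint8, sbx int) Instruction {
	if sbx < -sbxBias || sbx > sbxBias-1 {
		panic("sbx out of range")
	}
	return encodeABx(op, a, uint16(sbx+sbxBias))
}

func (i Instruction) op() uint8  { return uint8(i >> 24) }
func (i Instruction) a() uint8   { return uint8(i >> 16) }
func (i Instruction) b() uint8   { return uint8(i >> 8) }
func (i Instruction) c() uint8   { return uint8(i) }
func (i Instruction) bx() uint16 { return uint16(i) }

// sbx removes the bias to recover the signed offset.
func (i Instruction) sbx() int { return int(i.bx()) - sbxBias }

func main() {
	// 15 is an arbitrary opcode number chosen for the demo.
	jump := encodeASBx(15, 0, -3)
	fmt.Printf("bx=%#04x sbx=%d\n", jump.bx(), jump.sbx())
}
```

With this bias, an offset of 0 is stored as Bx = 0x8000, the most negative offset as 0, and the most positive as 0xFFFF.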
Index ¶
- func Disassemble(img *BytecodeImage) string
- func DisassembleFunc(p *FuncPrototype, img *BytecodeImage) string
- func OpCount() int
- type ADTObj
- type BytecodeImage
- type ClosureObj
- type FuncPrototype
- type FuncPrototypeRef
- type Instruction
- type ListObj
- type OpCode
- type RecordField
- type RecordObj
- type StringObj
- type TupleObj
- type Value
- func NewADT(tag int, fields []Value) Value
- func NewBool(b bool) Value
- func NewClosure(proto FuncPrototypeRef, captures []Value) Value
- func NewFloat(f float64) Value
- func NewInt(n int64) Value
- func NewList(elems []Value) Value
- func NewRecord(fields []RecordField) Value
- func NewString(s string) Value
- func NewTuple(elems []Value) Value
- func Unit() Value
- type ValueTag
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Disassemble ¶
func Disassemble(img *BytecodeImage) string
Disassemble produces a human-readable listing of an entire BytecodeImage: header (entry point, totals), constant pool, globals, and one block per prototype with annotated instructions.
The output is intentionally text-only and stable enough for golden tests. It is the primary debugging surface for Phase 2D — `ailang disasm` calls directly into this function.
func DisassembleFunc ¶
func DisassembleFunc(p *FuncPrototype, img *BytecodeImage) string
DisassembleFunc disassembles a single FuncPrototype against an image (for constant resolution). Used by tests that don't want a full image dump.
Types ¶
type ADTObj ¶
ADTObj is a tagged algebraic data type instance. Tag is a per-type local ordinal assigned during type elaboration; it is NOT globally unique. Type disambiguation is the compiler's responsibility (§4.3).
type BytecodeImage ¶
type BytecodeImage struct {
// Constants is the deduplicated constant pool. Constants are added via
// AddConstant which dedupes by structural equality.
Constants []Value
// Prototypes is the table of compiled functions. The order is the
// canonical numeric prototype identity used by CLOSURE indices.
Prototypes []*FuncPrototype
// EntryPoint is the index in Prototypes of the function the VM should
// invoke when running the image. -1 means "no entry point" (e.g. a
// library image with only callable functions).
EntryPoint int
// Globals is a flat array of mutable global slots. LOAD_GLOBAL Bx indexes
// into this slice. Phase 2B uses this only for hand-assembled tests; the
// compiler in Phase 2C will populate it from top-level let bindings.
Globals []Value
}
BytecodeImage is the in-memory bytecode unit. It bundles a constant pool, a prototype table, and an entry point. Phase 2B is in-memory only — there is no on-disk format yet (Phase 2D will add serialization if needed).
func NewImage ¶
func NewImage() *BytecodeImage
NewImage returns an empty bytecode image with no entry point.
func (*BytecodeImage) AddConstant ¶
func (img *BytecodeImage) AddConstant(v Value) int
AddConstant adds a value to the constant pool, deduplicating by structural equality (Value.Equal). Returns the pool index. The dedup is intentionally linear — Phase 2B images are tiny. If profiling later shows this matters, we can add a hash table keyed by a canonical representation.
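A linear dedup of the kind described here might look like the sketch below. The `constant` and `pool` types are hypothetical stand-ins for Value and BytecodeImage, reduced to ints and strings so that structural equality is plain struct comparison; the real Value.Equal covers more cases.

```go
package main

import "fmt"

// constant is a hypothetical stand-in for bytecode.Value.
type constant struct {
	isString bool
	n        int64
	s        string
}

// equal is structural equality for this reduced type.
func (c constant) equal(o constant) bool {
	return c == o
}

// pool is a hypothetical stand-in for the image's constant pool.
type pool struct {
	constants []constant
}

// add returns the index of an existing equal constant, or appends v and
// returns the new index. The scan is linear in the pool size.
func (p *pool) add(v constant) int {
	for i, c := range p.constants {
		if c.equal(v) {
			return i
		}
	}
	p.constants = append(p.constants, v)
	return len(p.constants) - 1
}

func main() {
	var p pool
	first := p.add(constant{n: 42})
	second := p.add(constant{isString: true, s: "hi"})
	again := p.add(constant{n: 42}) // reuses the slot of the first 42
	fmt.Println(first, second, again, len(p.constants))
}
```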
func (*BytecodeImage) AddPrototype ¶
func (img *BytecodeImage) AddPrototype(p *FuncPrototype) int
AddPrototype appends a prototype and returns its index. The returned index is what CLOSURE instructions reference (via the parent prototype's NestedProtos table) and what EntryPoint is set to.
func (*BytecodeImage) SetEntryPoint ¶
func (img *BytecodeImage) SetEntryPoint(idx int) error
SetEntryPoint marks a prototype index as the image's entry point. Returns an error if the index is out of range.
func (*BytecodeImage) Validate ¶
func (img *BytecodeImage) Validate() error
Validate performs structural sanity checks on an image:
- every instruction's prototype/constant references are in range
- jump targets land within the same prototype
- LineInfo (if non-nil) has the right length
This is intended to be called by tests and by the VM before execution as a defensive check. It does not verify type correctness or semantic well-formedness; that's the compiler's job.
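The three checks listed above could be sketched roughly as follows. Everything here is illustrative: `instr` and `proto` are hypothetical simplified types (the real package works on packed instruction words), and the jump rule assumes offsets are relative to the instruction after the jump, which the documentation does not state.

```go
package main

import "fmt"

// instr is a hypothetical, already-decoded instruction.
type instr struct {
	kind     string // "jump", "loadconst", or anything else
	offset   int    // signed jump offset (SBx) when kind == "jump"
	constIdx int    // local constant index (Bx) when kind == "loadconst"
}

// proto is a cut-down stand-in for FuncPrototype.
type proto struct {
	code      []instr
	constants []int // indices into the image-level pool
	lineInfo  []int // nil, or one entry per instruction
}

// validate applies three structural checks to one prototype.
func validate(p proto, poolSize int) error {
	if p.lineInfo != nil && len(p.lineInfo) != len(p.code) {
		return fmt.Errorf("LineInfo has %d entries, want %d", len(p.lineInfo), len(p.code))
	}
	for i, poolIdx := range p.constants {
		if poolIdx < 0 || poolIdx >= poolSize {
			return fmt.Errorf("constant %d: pool index %d out of range", i, poolIdx)
		}
	}
	for pc, in := range p.code {
		switch in.kind {
		case "loadconst":
			if in.constIdx < 0 || in.constIdx >= len(p.constants) {
				return fmt.Errorf("pc %d: constant index %d out of range", pc, in.constIdx)
			}
		case "jump":
			// Assumption: the offset is relative to the next instruction.
			target := pc + 1 + in.offset
			if target < 0 || target >= len(p.code) {
				return fmt.Errorf("pc %d: jump target %d outside prototype", pc, target)
			}
		}
	}
	return nil
}

func main() {
	ok := proto{
		code:      []instr{{kind: "loadconst"}, {kind: "jump", offset: -2}, {kind: "return"}},
		constants: []int{0},
	}
	bad := proto{code: []instr{{kind: "jump", offset: 5}}}
	fmt.Println(validate(ok, 1))
	fmt.Println(validate(bad, 1))
}
```

None of these checks look at operand types, which matches the stated split: structure is checked here, semantics by the compiler.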
func (*BytecodeImage) ValidatePrototype ¶
func (img *BytecodeImage) ValidatePrototype(protoIdx int) error
ValidatePrototype runs the structural checks from Validate against a single prototype. The compiler uses this to validate per-function after lowering, so a single buggy proto can be rolled back and tagged EvalOnly rather than aborting the whole image (see compiler.go Phase 2).
type ClosureObj ¶
type ClosureObj struct {
Proto FuncPrototypeRef
Captures []Value
}
ClosureObj is a flat closure: the prototype plus a fixed array of captured values. Captures are copied at closure creation time (§3.3 — flat closures).
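The effect of copying captures at creation time can be shown with a small sketch. The `closure` type and `makeClosure` helper are hypothetical and use int registers for brevity; they illustrate the flat-closure idea rather than the VM's actual CLOSURE handling.

```go
package main

import "fmt"

// closure is a hypothetical stand-in for ClosureObj with int captures.
type closure struct {
	captures []int
}

// makeClosure copies the selected registers into a fresh slice, so the
// closure does not alias the frame that created it.
func makeClosure(regs []int, captured []int) closure {
	caps := make([]int, len(captured))
	for i, r := range captured {
		caps[i] = regs[r]
	}
	return closure{captures: caps}
}

func main() {
	regs := []int{10, 20, 30}
	c := makeClosure(regs, []int{0, 2})
	regs[0] = 99 // a later write to the frame is not visible to the closure
	fmt.Println(c.captures) // prints [10 30]
}
```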
type FuncPrototype ¶
type FuncPrototype struct {
// Name is a human-readable identifier for stack traces and disassembly.
Name string
// NumRegs is the size of the register frame allocated when this function
// is called. The compiler (Phase 2C) computes this from VarDecl count
// plus a few scratch slots.
NumRegs uint8
// NumParams is the number of parameters. They occupy registers
// [0, NumParams). The remaining registers are locals.
NumParams uint8
// IsVariadic indicates the last parameter consumes a list of remaining
// arguments. Phase 2B does not require this; included so the type doesn't
// need a breaking change in 2C.
IsVariadic bool
// NumCaptures is the number of values this function captures from its
// enclosing scope when it is instantiated as a closure. Used by the VM
// to know how many pseudo-MOVE instructions to consume after CLOSURE.
// Top-level functions have NumCaptures=0.
NumCaptures uint8
// Instructions is the bytecode body.
Instructions []Instruction
// Constants holds indices into the parent BytecodeImage's constant pool.
// Storing indices (not values) keeps the prototype small and the
// deduplication centralized.
//
// LOAD_CONST's Bx field is an index into THIS slice (the prototype's
// local constant table), not directly into the image pool. The
// indirection lets a prototype use only the constants it actually
// references.
Constants []int
// NestedProtos holds indices into the parent BytecodeImage's prototype
// table. CLOSURE's Bx field indexes into this slice.
NestedProtos []int
// LineInfo maps instruction index → source line number. Used to attach
// source locations to runtime errors. Length should equal len(Instructions);
// a zero entry means "no source line available".
LineInfo []int
// File is the source file the function was compiled from. Used together
// with LineInfo to format runtime errors as `<file>:<line>`. Empty when
// the file is unknown (e.g. hand-built test prototypes).
File string
// EvalOnly marks this prototype as a stub: the function exists in the
// program but the bytecode compiler couldn't compile it (or chose not to)
// and the VM must trap to the evaluator via VM.Interop on every call.
//
// When true, Instructions and LineInfo are nil and NumRegs is zero. The
// VM checks this flag inside OpCall/OpTailCall before pushing a frame.
// Other prototypes may still hold OpClosure references to this prototype
// via NestedProtos — that is the whole point of M-BYTECODE-2D M3.
EvalOnly bool
// EvalReason is a human-readable explanation of why the function was
// marked EvalOnly (typically the compile error). Used in error messages
// and in `ailang disasm` output. Only meaningful when EvalOnly is true.
EvalReason string
}
FuncPrototype is a compiled function: instructions, register layout, constant references, nested prototype indices (for closures), and source line information for diagnostics.
FuncPrototype satisfies the FuncPrototypeRef interface so it can be used directly as a closure target without an adapter.
func (*FuncPrototype) LookupConstant ¶
func (p *FuncPrototype) LookupConstant(localIdx int, img *BytecodeImage) (Value, bool)
LookupConstant resolves a local constant index (the Bx of a LOAD_CONST) to the actual Value via the supplied image's pool. For an out-of-range index the second return value is false; the VM should treat that as a corrupt image.
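The two-level lookup (local index, then pool index) can be sketched as below. The `image` and `proto` types are hypothetical reductions of BytecodeImage and FuncPrototype with string constants, and treating a bad pool index the same as a bad local index is an assumption.

```go
package main

import "fmt"

// image is a hypothetical stand-in for BytecodeImage.
type image struct {
	constants []string // the shared, deduplicated pool
}

// proto is a hypothetical stand-in for FuncPrototype.
type proto struct {
	constants []int // local table: indices into image.constants
}

// lookup resolves a local index through the prototype's table and then
// through the image pool. It reports false if either step is out of range.
func (p *proto) lookup(localIdx int, img *image) (string, bool) {
	if localIdx < 0 || localIdx >= len(p.constants) {
		return "", false
	}
	poolIdx := p.constants[localIdx]
	if poolIdx < 0 || poolIdx >= len(img.constants) {
		return "", false
	}
	return img.constants[poolIdx], true
}

func main() {
	img := &image{constants: []string{"a", "b", "c"}}
	p := &proto{constants: []int{2, 0}} // this function only uses "c" and "a"
	v, ok := p.lookup(0, img)
	fmt.Println(v, ok) // prints c true
	_, ok = p.lookup(5, img)
	fmt.Println(ok) // prints false
}
```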
func (*FuncPrototype) NumParameters ¶
func (p *FuncPrototype) NumParameters() uint8
NumParameters implements FuncPrototypeRef.
func (*FuncPrototype) NumRegisters ¶
func (p *FuncPrototype) NumRegisters() uint8
NumRegisters implements FuncPrototypeRef.
func (*FuncPrototype) ProtoName ¶
func (p *FuncPrototype) ProtoName() string
ProtoName implements FuncPrototypeRef.
type FuncPrototypeRef ¶
FuncPrototypeRef is a forward reference to a FuncPrototype defined in image.go. It is defined here as an interface to avoid forcing a particular concrete type into the closure value.
In practice the only implementer is *FuncPrototype, but using an interface keeps Value decoupled from the image format and lets tests substitute stub prototypes.
type Instruction ¶
type Instruction uint32
Instruction is a single 32-bit bytecode word.
func EncodeABC ¶
func EncodeABC(op OpCode, a, b, c uint8) Instruction
EncodeABC builds an ABC-form instruction. Each operand must fit in 8 bits.
func EncodeABx ¶
func EncodeABx(op OpCode, a uint8, bx uint16) Instruction
EncodeABx builds an ABx-form instruction. Bx is unsigned 16 bits.
func EncodeASBx ¶
func EncodeASBx(op OpCode, a uint8, sbx int) Instruction
EncodeASBx builds an ABx-form instruction with a signed 16-bit operand. sbx must be in the range [-0x8000, 0x7FFF].
func (Instruction) Bx ¶
func (i Instruction) Bx() uint16
Bx extracts the wide unsigned operand (16 bits, ABx form).
func (Instruction) SBx ¶
func (i Instruction) SBx() int
SBx extracts the wide signed operand (16 bits, ABx form, biased).
func (Instruction) String ¶
func (i Instruction) String() string
String returns a human-readable disassembly of a single instruction. The output is intentionally minimal; full disassembly with constant resolution happens in Phase 2D's disassembler.
type ListObj ¶
type ListObj struct {
Elems []Value
}
ListObj is a slice-backed list. The design doc (§3.2) leaves the choice of cons-cell vs slice open; Phase 2B uses slices for cache locality and simple indexing. Revisit in Phase 2D if benchmarks demand it.
type OpCode ¶
type OpCode uint8
OpCode is the 8-bit instruction tag.
const (
	// Loads
	OpLoadConst OpCode = iota // R[A] = Constants[Bx] (ABx)
	OpLoadNil                 // R[A] = Unit (A)
	OpMove                    // R[A] = R[B] (AB)
	OpLoadGlobal              // R[A] = Globals[Bx] (ABx)

	// Arithmetic (int and float, dispatched by value tag at runtime)
	OpAdd // R[A] = R[B] + R[C] (ABC)
	OpSub // R[A] = R[B] - R[C] (ABC)
	OpMul // R[A] = R[B] * R[C] (ABC)
	OpDiv // R[A] = R[B] / R[C] (ABC)
	OpMod // R[A] = R[B] % R[C] (ABC)
	OpNeg // R[A] = -R[B] (AB)

	// Comparison (result is Bool)
	OpEq // R[A] = R[B] == R[C] (ABC)
	OpLt // R[A] = R[B] < R[C] (ABC)
	OpLe // R[A] = R[B] <= R[C] (ABC)

	// Logic (NOT only; AND/OR are compiled as conditional jumps)
	OpNot // R[A] = !R[B] (AB)

	// String
	OpConcat // R[A] = R[B] ++ R[C] (ABC)

	// Control flow
	OpJump        // IP += SBx (SBx)
	OpJumpIfFalse // if !R[A] then IP += SBx (A, SBx)
	OpCall        // R[A..A+C-1] = R[A](R[A+1..A+B]) (ABC)
	OpTailCall    // return R[A](R[A+1..A+B]) -- reuses current frame (AB)
	OpReturn      // return R[A] (A)

	// Closures
	OpClosure // R[A] = new closure from Prototypes[Bx], followed by C MOVE

	// Collections
	OpMakeList   // R[A] = [R[B]..R[B+C-1]] (ABC)
	OpMakeTuple  // R[A] = (R[B]..R[B+C-1]) (ABC)
	OpMakeRecord // R[A] = {fields from R[B]..R[B+C-1]} -- field names from
	             // the immediately following pseudo-instructions encoded
	             // as constant pool indices (ABC)
	OpCons       // R[A] = R[B] :: R[C] (ABC)
	OpGetField   // R[A] = R[B].Fields[C] -- C indexes record's sorted
	             // field name table inherited from constant pool (ABC)
	OpGetIndex   // R[A] = R[B][R[C]] -- list integer indexing only (ABC)

	// ADT
	OpMakeADT // R[A] = ADT{tag=B, fields=R[A+1..A+C]} (ABC)
	OpGetTag  // R[A] = R[B].tag -- for switch dispatch (AB)

	// Builtins
	OpBuiltinCall    // R[A] = BuiltinTable[Bx](R[A+1..A+C]) (ABx, C)
	                 // Note: encoding uses ABx + a following ABC pseudo-op
	                 // is avoided here. C is read from a second instruction
	                 // word; see the assembler in vm/assemble.go (M4).
	OpBuiltinCallHOF // R[A] = HOFBuiltinTable[B](vm, R[A+1..A+C]) (ABC)
	                 // Like OpBuiltinCall but passes VM as ClosureCaller so the
	                 // builtin can invoke closure arguments via VM.CallClosure.
	OpBuiltinTrap    // R[A] = evaluator.CallBuiltin(Bx, R[A+1..A+C]) (Phase 2C/2E)
	OpEffectTrap     // yield to evaluator for effect Bx, arg in R[A] (Phase 2E)
)
Opcode definitions. Numeric values are part of the bytecode format and must remain stable; append new opcodes at the end. Opcodes correspond directly to the design doc M-BYTECODE-VM §4.2.
type RecordField ¶
RecordField is a single field of a record. Records store fields in alphabetical order by Name (per §4.3 — A1 determinism).
type RecordObj ¶
type RecordObj struct {
Fields []RecordField
}
RecordObj is an alphabetically-sorted set of named fields. The sort invariant is enforced by NewRecord; do not construct RecordObj directly.
type StringObj ¶
type StringObj struct {
S string
}
StringObj wraps an immutable string. Strings are reference-shared across VM/evaluator boundaries since they cannot be mutated.
type TupleObj ¶
type TupleObj struct {
Elems []Value
}
TupleObj is a fixed-arity heterogeneous record-without-names.
type Value ¶
type Value struct {
Tag ValueTag
Int int64
Flt float64
Bool bool
Obj any // *StringObj, *ListObj, *TupleObj, *RecordObj, *ClosureObj, *ADTObj
}
Value is the Phase 2B tagged-struct representation of an AILANG runtime value, per M-BYTECODE-VM §3.2.
Primitives (Int, Float, Bool, Unit) are unboxed into the dedicated fields. Heap objects (String, List, Tuple, Record, Closure, ADT) live in Obj as pointers to the corresponding *Obj struct.
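This is intentionally larger than a NaN-boxed uint64: with the fields shown above and an integer ValueTag, a Value occupies 48 bytes on 64-bit platforms after alignment padding, versus 8. The design doc gates NaN-boxing on Phase 2D benchmark evidence; do not switch representation until we have data showing value dispatch is the bottleneck.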
func NewClosure ¶
func NewClosure(proto FuncPrototypeRef, captures []Value) Value
NewClosure constructs a Closure value with flat-captured values.
func NewList ¶
func NewList(elems []Value) Value
NewList constructs a List value from the given elements. The slice is retained as-is — callers must not mutate it after construction.
func NewRecord ¶
func NewRecord(fields []RecordField) Value
NewRecord constructs a Record value, sorting fields alphabetically by name. Duplicate field names cause a panic — the compiler must reject duplicates upstream.
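The sort-then-reject-duplicates behaviour might be sketched like this. The `field` type and `newRecord` helper are hypothetical; copying the input slice and using byte-wise string ordering for "alphabetical" are both assumptions, since the documentation does not say how the real constructor handles either.

```go
package main

import (
	"fmt"
	"sort"
)

// field is a hypothetical stand-in for RecordField with int values.
type field struct {
	name  string
	value int
}

// newRecord returns the fields sorted by name and panics on duplicates.
// Assumption: the input slice is copied rather than sorted in place.
func newRecord(fields []field) []field {
	sorted := make([]field, len(fields))
	copy(sorted, fields)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].name < sorted[j].name })
	// After sorting, duplicate names are adjacent.
	for i := 1; i < len(sorted); i++ {
		if sorted[i].name == sorted[i-1].name {
			panic("duplicate record field: " + sorted[i].name)
		}
	}
	return sorted
}

func main() {
	r := newRecord([]field{{"y", 2}, {"x", 1}})
	fmt.Println(r) // prints [{x 1} {y 2}]
}
```

Keeping the fields sorted means two records with the same field names lay out their values at the same positions, so they can be compared index by index.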
func Unit ¶
func Unit() Value
Unit returns the canonical unit value. Unit is a singleton from a semantic standpoint; the struct copy is cheap.
func (Value) AsClosure ¶
func (v Value) AsClosure() *ClosureObj
AsClosure returns the underlying closure object.
func (Value) AsList ¶
AsList returns the underlying slice. Panics if v is not a List. The caller must not mutate the returned slice.
func (Value) AsRecord ¶
func (v Value) AsRecord() []RecordField
AsRecord returns the underlying record fields (alphabetically sorted).
func (Value) Equal ¶
Equal reports whether two values are structurally equal under AILANG's canonical equality (§3.6). This is the equivalence relation used for constant-pool deduplication and for comparing test results between the VM and evaluator.
Notes:
- Float NaN compares equal to NaN here (so NaN constants dedupe). The `EQ` opcode at runtime uses IEEE semantics — NaN != NaN — and must NOT route through this function.
- Records require both sides to be in alphabetical order, which is the constructor's invariant.
- Closures compare by prototype identity and capture-by-capture equality. Two closures wrapping the same prototype with different captures are unequal.
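The difference between the two float comparisons in the first note can be illustrated with a short sketch. Both function names are hypothetical; they stand in for the float case of Value.Equal and for the comparison an `EQ` opcode would perform.

```go
package main

import (
	"fmt"
	"math"
)

// structuralEqual is the dedup-style comparison: NaN equals NaN, so a NaN
// constant is found again when it is added to the pool a second time.
func structuralEqual(a, b float64) bool {
	if math.IsNaN(a) && math.IsNaN(b) {
		return true
	}
	return a == b
}

// runtimeEq is the IEEE comparison: NaN is never equal to anything.
func runtimeEq(a, b float64) bool {
	return a == b
}

func main() {
	nan := math.NaN()
	fmt.Println(structuralEqual(nan, nan), runtimeEq(nan, nan)) // prints true false
	fmt.Println(structuralEqual(1.5, 1.5), runtimeEq(1.5, 1.5)) // prints true true
}
```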