Documentation ¶
Overview ¶
Package cevm provides Go bindings to the C++ EVM (cevm) with GPU acceleration. Import this package to use the C++ EVM as a drop-in replacement for go-ethereum's EVM.
The C++ EVM supports:
- Block-STM parallel execution
- GPU Keccak-256 state hashing (Metal/CUDA)
- GPU batch ecrecover (Metal/CUDA)
- GPU EVM opcode interpreter (Metal/CUDA)
- ZAP VM plugin protocol (native)
Build with CGo: CGO_ENABLED=1 go build -tags cgo
Build without CGo: CGO_ENABLED=0 go build (types only, no execution)
Binary: the `cevm` binary in luxcpp/evm/build/bin/ is the Lux VM plugin.
Concurrency model ¶
ExecuteBlock and ExecuteBlockV2 are safe to call concurrently from multiple goroutines. The implementation guarantees:
- No shared mutable state on the Go side. Every call allocates a fresh []C.CGpuTx for its inputs and a fresh runtime.Pinner for its lifetime. The pinner pins the base address of every Go-owned []byte (tx.Data, tx.Code) that the C side dereferences, and is unpinned via defer after the C call returns, including on the error path.
- The C result is freed via defer (gpu_free_result / gpu_free_result_v2) on every code path including failure. Gas/status arrays are copied into Go-owned slices before the result is freed.
- The C++ engine uses a thread_local engine cache (one per OS thread reached by goroutines via cgo) for the Keccak hasher; per-instance MTLBuffer / CUDA context caches are mutex-protected on the C++ side. Two goroutines on different OS threads use independent kernel state.
- The CPU path is fully reentrant: each call constructs a fresh cevm state and tears it down before returning.
What is NOT safe:
- Mutating the Transaction.Data or Transaction.Code slices while a concurrent ExecuteBlock call is reading them. The pinner only prevents GC moves; it does not provide read/write synchronization.
- Sharing a *BlockResult between goroutines without external sync.
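The two caveats above can be handled on the caller side. The following is a minimal sketch, not the package's own code: Transaction here is a local stand-in carrying only the slice fields, fakeExecute is a hypothetical placeholder for ExecuteBlock, and sharedResult is an illustrative mutex-guarded accumulator. Each goroutine works on a private deep copy of Data and Code, so the original buffers may be mutated freely.

```go
package main

import (
	"fmt"
	"sync"
)

// Transaction mirrors only the slice-bearing fields of the documented
// Transaction type; it is a local stand-in, not the cevm type itself.
type Transaction struct {
	Data []byte
	Code []byte
}

// cloneTxs deep-copies Data and Code so the caller may keep mutating its
// own buffers while an execution call reads the copies.
func cloneTxs(txs []Transaction) []Transaction {
	out := make([]Transaction, len(txs))
	for i, tx := range txs {
		out[i].Data = append([]byte(nil), tx.Data...)
		out[i].Code = append([]byte(nil), tx.Code...)
	}
	return out
}

// sharedResult guards a value that several goroutines update.
type sharedResult struct {
	mu       sync.Mutex
	totalGas uint64
}

func (r *sharedResult) add(gas uint64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.totalGas += gas
}

func (r *sharedResult) total() uint64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.totalGas
}

// fakeExecute is a hypothetical placeholder for ExecuteBlock. It only
// reads the slices and returns a made-up gas figure.
func fakeExecute(txs []Transaction) uint64 {
	var gas uint64
	for _, tx := range txs {
		gas += uint64(len(tx.Data) + len(tx.Code))
	}
	return gas
}

func main() {
	orig := []Transaction{{Data: []byte{1, 2, 3}, Code: []byte{0x60, 0x00}}}
	var res sharedResult
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		snapshot := cloneTxs(orig) // private copy, taken before any mutation
		wg.Add(1)
		go func() {
			defer wg.Done()
			res.add(fakeExecute(snapshot))
		}()
	}
	orig[0].Data[0] = 0xff // safe: the goroutines read their own copies
	wg.Wait()
	fmt.Println("total gas:", res.total())
}
```

Copying costs an allocation per transaction; callers that never mutate their buffers after submission can skip it.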
ABI version ¶
The Go module's ABIVersion constant is checked against the loaded library's gpu_abi_version() in init(). A mismatch panics at process start — that is intentional. A silent ABI mismatch produces wrong gas/state results and would corrupt consensus, so fail-fast is the only safe behaviour.
Use Health() at startup to additionally verify each backend executes the canonical health-check battery (arithmetic, storage, hashing, memory, and the call bridge) without error.
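For diagnostics outside init(), the same comparison can be written by hand. This is a minimal sketch: checkABI is a hypothetical helper, expectedABI copies the documented ABIVersion value, and the loaded argument stands in for whatever LibraryABIVersion() returns in a real process.

```go
package main

import "fmt"

// expectedABI mirrors the documented ABIVersion constant.
const expectedABI uint32 = 6

// checkABI compares the version the Go module expects against the version
// the loaded library reports. loaded stands in for LibraryABIVersion().
func checkABI(expected, loaded uint32) error {
	if expected != loaded {
		return fmt.Errorf("cevm ABI mismatch: module expects v%d, library reports v%d", expected, loaded)
	}
	return nil
}

func main() {
	// A hypothetical stale library that still reports v5.
	if err := checkABI(expectedABI, 5); err != nil {
		fmt.Println("startup aborted:", err)
	}
}
```

Returning an error instead of panicking suits tooling that only wants to report skew; the package itself panics, as described above.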
Index ¶
- Constants
- func BackendName(b Backend) string
- func BatchRecoverSenders(txs types.Transactions, signer types.Signer) ([]common.Address, error)
- func LibraryABIVersion() uint32
- func PluginExists() bool
- func PluginPath() string
- func VMID() string
- type Backend
- type BlockContext
- type BlockResult
- type BlockResultV2
- func ExecuteBlockV2(backend Backend, numThreads uint32, txs []Transaction) (*BlockResultV2, error)
- func ExecuteBlockV3(backend Backend, numThreads uint32, txs []Transaction, ctx *BlockContext) (*BlockResultV2, error)
- func ExecuteBlockV4(backend Backend, numThreads uint32, txs []Transaction, ctx *BlockContext, ...) (*BlockResultV2, error)
- type HealthProbeResult
- type HealthReport
- type StateAccount
- type Transaction
- type TxStatus
Constants ¶
const ABIVersion uint32 = 6
ABIVersion is the C ABI version this Go module expects. Compare against the loaded library's gpu_abi_version() to detect version skew.
v5 (v0.26.0): added gpu_execute_block_v3 with CBlockContext (TIMESTAMP, NUMBER, CHAINID, BASEFEE, etc.) and per-tx status[] in BlockResult. V2 callers still work; only ExecuteBlockV3 sees the new BlockContext fields.
v6: added gpu_execute_block_v4 + CGpuStateAccount. Callers can now hand the GPU a state snapshot (account nonce, balance, code, code_hash) so the kernel CALL/CREATE path resolves targets on-device instead of returning CallNotSupported. V3 callers still see the same wire shape.
Variables ¶
This section is empty.
Functions ¶
func BackendName ¶
func BackendName(b Backend) string
BackendName returns the human-readable name of a backend as reported by the C++ library (which is authoritative).
func BatchRecoverSenders ¶
func BatchRecoverSenders(txs types.Transactions, signer types.Signer) ([]common.Address, error)
BatchRecoverSenders recovers the sender address for every transaction in the input slice in a single cgo dispatch into the luxcpp/crypto first-party secp256k1 ecrecover pipeline.
The output slice is the same length as txs and indexed positionally — i.e. out[i] is the sender of txs[i]. Per-tx behaviour matches types.Sender byte-for-byte; that contract is enforced by the parity test in cevm_secp256k1_parity_test.go.
On any per-tx recovery failure the call returns an error naming the offending tx index. This matches the existing parallel.Executor behaviour (Stage 1 in parallel/parallel.go) which errors on the first failed sender rather than mixing successful and failed senders mid-block.
The recovered addresses are written into each tx's sigCache via types.CacheSender, so subsequent calls to types.Sender(signer, tx) return the cached value and skip recomputation.
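The positional output and first-failure error contract can be sketched as follows. This is an illustration of the contract only: recoverAll, address and errBadSig are hypothetical names, and the loop is sequential where the real function uses a single cgo dispatch.

```go
package main

import (
	"errors"
	"fmt"
)

// address is a local stand-in for a 20-byte account address.
type address [20]byte

// errBadSig is a hypothetical per-transaction recovery failure.
var errBadSig = errors.New("invalid signature")

// recoverAll calls recoverOne for indices 0..n-1 and returns a slice where
// out[i] is the sender of transaction i. It stops at the first failure and
// names the offending index instead of returning a partial result.
func recoverAll(n int, recoverOne func(i int) (address, error)) ([]address, error) {
	out := make([]address, n)
	for i := 0; i < n; i++ {
		addr, err := recoverOne(i)
		if err != nil {
			return nil, fmt.Errorf("sender recovery failed for tx %d: %w", i, err)
		}
		out[i] = addr
	}
	return out, nil
}

func main() {
	// Hypothetical recovery function that rejects the third transaction.
	_, err := recoverAll(4, func(i int) (address, error) {
		if i == 2 {
			return address{}, errBadSig
		}
		return address{byte(i)}, nil
	})
	fmt.Println(err)
}
```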
func LibraryABIVersion ¶
func LibraryABIVersion() uint32
LibraryABIVersion returns the ABI version reported by the loaded library. Useful for diagnostics when binaries and shared libs may drift.
func PluginExists ¶
func PluginExists() bool
PluginExists reports whether the cevm plugin binary is present on disk.
func PluginPath ¶
func PluginPath() string
PluginPath returns the absolute path to the cevm VM plugin binary. Used by lux CLI and universe Makefiles to locate the built plugin.
Types ¶
type Backend ¶
type Backend int
Backend selects the C++ EVM execution mode.
const (
// CPUSequential runs transactions one at a time on a single core.
CPUSequential Backend = 0
// CPUParallel uses Block-STM to run transactions across all cores.
CPUParallel Backend = 1
// GPUMetal offloads Keccak, ecrecover, and the EVM interpreter to Metal.
GPUMetal Backend = 2
// GPUCUDA offloads Keccak, ecrecover, and the EVM interpreter to CUDA.
GPUCUDA Backend = 3
)
func AutoDetect ¶
func AutoDetect() Backend
AutoDetect returns the best available backend for this machine.
func AvailableBackends ¶
func AvailableBackends() []Backend
AvailableBackends returns the list of backends compiled and detected at runtime by the loaded library.
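Callers that want their own selection policy rather than AutoDetect can filter the list returned by AvailableBackends. The sketch below assumes a caller-defined preference order; pickBackend is a hypothetical helper, the Backend constants are copied locally from the declaration above, and nothing here describes how AutoDetect itself decides.

```go
package main

import "fmt"

// Backend and its constants are copied from the documented declaration so
// the example compiles on its own.
type Backend int

const (
	CPUSequential Backend = 0
	CPUParallel   Backend = 1
	GPUMetal      Backend = 2
	GPUCUDA       Backend = 3
)

// pickBackend returns the first entry of prefer that also appears in
// available, and falls back to CPUSequential when none match.
func pickBackend(available, prefer []Backend) Backend {
	have := make(map[Backend]bool, len(available))
	for _, b := range available {
		have[b] = true
	}
	for _, b := range prefer {
		if have[b] {
			return b
		}
	}
	return CPUSequential
}

func main() {
	// Hypothetical machine: no CUDA, but Metal and both CPU modes.
	available := []Backend{CPUSequential, CPUParallel, GPUMetal}
	prefer := []Backend{GPUCUDA, GPUMetal, CPUParallel}
	fmt.Println("selected backend:", pickBackend(available, prefer))
}
```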
type BlockContext ¶
type BlockContext struct {
Origin [20]byte
GasPrice uint64
Timestamp uint64
Number uint64
Prevrandao [32]byte
GasLimit uint64
ChainID uint64
BaseFee uint64
BlobBaseFee uint64
Coinbase [20]byte
BlobHashes [8][32]byte
NumBlobHashes uint32
}
BlockContext is the block-level execution context shared by every transaction in a block. It feeds the EVM opcodes that report block-level state: TIMESTAMP, NUMBER, CHAINID, BASEFEE, COINBASE, GASLIMIT, PREVRANDAO, BLOBHASH, BLOBBASEFEE.
Pass a non-nil *BlockContext to ExecuteBlockV3 when the call must mirror real chain semantics (consensus, replay, fork-aware execution). The zero-value is the documented "no context" default — chain id resolves to 0, timestamp to 0, etc., which matches the dispatcher's pre-v0.26 behaviour.
Field layout matches the C-side CBlockContext byte-for-byte: this struct is passed to the C ABI via direct memcpy, no field-by-field translation. Field order MUST match go_bridge.h CBlockContext exactly. Adding new fields requires bumping ABIVersion and the C-side EVM_GPU_ABI_VERSION in lockstep.
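Populating the fixed-size blob hash array is the one step that needs care. The sketch below copies the struct locally and assumes that NumBlobHashes counts the valid entries at the front of BlobHashes, which the field names suggest but the text does not state; setBlobHashes is a hypothetical helper.

```go
package main

import "fmt"

// BlockContext is copied field-for-field from the documented struct.
type BlockContext struct {
	Origin        [20]byte
	GasPrice      uint64
	Timestamp     uint64
	Number        uint64
	Prevrandao    [32]byte
	GasLimit      uint64
	ChainID       uint64
	BaseFee       uint64
	BlobBaseFee   uint64
	Coinbase      [20]byte
	BlobHashes    [8][32]byte
	NumBlobHashes uint32
}

// setBlobHashes stores up to len(ctx.BlobHashes) hashes and records how
// many are valid. Assumption: NumBlobHashes is the count of leading valid
// entries in BlobHashes.
func setBlobHashes(ctx *BlockContext, hashes [][32]byte) error {
	if len(hashes) > len(ctx.BlobHashes) {
		return fmt.Errorf("too many blob hashes: %d > %d", len(hashes), len(ctx.BlobHashes))
	}
	ctx.BlobHashes = [8][32]byte{} // clear stale entries
	copy(ctx.BlobHashes[:], hashes)
	ctx.NumBlobHashes = uint32(len(hashes))
	return nil
}

func main() {
	// Illustrative values only.
	ctx := BlockContext{Timestamp: 1700000000, Number: 42, ChainID: 1, GasLimit: 30000000}
	if err := setBlobHashes(&ctx, [][32]byte{{0x01}, {0x02}}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("blob hashes set:", ctx.NumBlobHashes)
}
```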
type BlockResult ¶
type BlockResult struct {
// GasUsed per transaction, indexed by position.
GasUsed []uint64
// TotalGas consumed by the entire block.
TotalGas uint64
// ExecTimeMs is wall-clock execution time in milliseconds.
ExecTimeMs float64
// Conflicts detected during Block-STM parallel execution.
Conflicts uint32
// ReExecutions caused by conflicts.
ReExecutions uint32
}
BlockResult holds the outcome of executing a block of transactions.
func ExecuteBlock ¶
func ExecuteBlock(backend Backend, txs []Transaction) (*BlockResult, error)
ExecuteBlock runs a block of transactions through the C++ EVM.
Thread safety: ExecuteBlock is safe to call from multiple goroutines concurrently. The C++ engine uses thread-local kernel hosts, so each OS thread on which a goroutine reaches the GPU path gets its own MTLBuffer/CUDA context cache. There are no shared mutable globals between calls.
Memory safety: every Go-owned []byte the C side dereferences (tx.Data, tx.Code) is pinned for the duration of the C call. The ctxs[] slice itself is a stack-allocated local (or heap-promoted by escape analysis, either way reachable) — runtime.KeepAlive(ctxs) at the end guarantees the GC won't collect it while the C call is still in flight. The pinner is unpinned via defer on every return path including errors.
type BlockResultV2 ¶
type BlockResultV2 struct {
StateRoot [32]byte
GasUsed []uint64
Status []TxStatus
TotalGas uint64
ExecTimeMs float64
Conflicts uint32
ReExecutions uint32
ABIVersion uint32
}
BlockResultV2 extends BlockResult with the V2 ABI fields: per-tx status and the post-execution state root.
func ExecuteBlockV2 ¶
func ExecuteBlockV2(backend Backend, numThreads uint32, txs []Transaction) (*BlockResultV2, error)
ExecuteBlockV2 runs a block through the C++ EVM and returns the V2 result with per-tx status and post-execution state root.
Thread safety: same as ExecuteBlock, safe under concurrent goroutines.
Memory safety: same pinner + KeepAlive contract as ExecuteBlock.
func ExecuteBlockV3 ¶
func ExecuteBlockV3(backend Backend, numThreads uint32, txs []Transaction, ctx *BlockContext) (*BlockResultV2, error)
ExecuteBlockV3 runs a block through the C++ EVM with an explicit block context and returns the V2 result shape (state root + per-tx status).
Pass `ctx == nil` for V2 semantics: a zero-initialised block context in which chain id, timestamp, etc. all resolve to zero. Pass a populated *BlockContext to feed CHAINID, TIMESTAMP, NUMBER, BASEFEE, COINBASE, etc. through to every backend that consumes them. Backend support currently differs: the Metal kernel reads the context directly, the CPU kernel path picks it up once the parallel agent's wiring lands, and the CUDA host drops it until that backend grows the same overload.
Thread safety and memory safety are identical to ExecuteBlockV2: pinner over Data/Code, KeepAlive over the ctxs slice, defer-free of the result. The BlockContext itself is passed by value into a stack-allocated C struct, so it doesn't need pinning.
func ExecuteBlockV4 ¶
func ExecuteBlockV4(backend Backend, numThreads uint32, txs []Transaction, ctx *BlockContext, state []StateAccount) (*BlockResultV2, error)
ExecuteBlockV4 runs a block with both an explicit BlockContext and a caller-supplied state snapshot. The snapshot lets the GPU CALL/CREATE path resolve target nonce / balance / code on-device instead of returning CallNotSupported. Pass an empty `state` for V3 semantics.
State packing: each StateAccount is copied into a flat C array; account code is concatenated into a single blob and each entry indexes into the blob via (offset, size). The blob and the C account array are kept alive for the duration of the cgo call via runtime.KeepAlive.
Thread safety / memory safety: same contract as ExecuteBlockV3.
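The state packing described above can be sketched in pure Go. This is an illustration under assumptions: packedAccount and its CodeOffset / CodeSize fields are hypothetical names and widths, not the real CGpuStateAccount layout, and packState is not the binding's own function.

```go
package main

import "fmt"

// StateAccount is copied from the documented struct.
type StateAccount struct {
	Address  [20]byte
	Nonce    uint64
	Balance  [4]uint64
	Code     []byte
	CodeHash [32]byte
}

// packedAccount is a hypothetical flat record: the Code slice is replaced
// by an (offset, size) pair into a shared blob.
type packedAccount struct {
	Address    [20]byte
	Nonce      uint64
	Balance    [4]uint64
	CodeOffset uint64
	CodeSize   uint64
	CodeHash   [32]byte
}

// packState concatenates every account's code into one blob and records
// where each account's code starts and how long it is.
func packState(state []StateAccount) ([]packedAccount, []byte) {
	accounts := make([]packedAccount, len(state))
	var blob []byte
	for i, acc := range state {
		accounts[i] = packedAccount{
			Address:    acc.Address,
			Nonce:      acc.Nonce,
			Balance:    acc.Balance,
			CodeOffset: uint64(len(blob)),
			CodeSize:   uint64(len(acc.Code)),
			CodeHash:   acc.CodeHash,
		}
		blob = append(blob, acc.Code...)
	}
	return accounts, blob
}

func main() {
	state := []StateAccount{
		{Address: [20]byte{0xaa}, Code: []byte{0x60, 0x00}},
		{Address: [20]byte{0xbb}}, // EOA: no code
		{Address: [20]byte{0xcc}, Code: []byte{0x00}},
	}
	accounts, blob := packState(state)
	for _, a := range accounts {
		fmt.Printf("offset=%d size=%d\n", a.CodeOffset, a.CodeSize)
	}
	fmt.Println("blob length:", len(blob))
}
```

An account with no code gets size 0, which matches the documented convention that empty code marks an EOA.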
type HealthProbeResult ¶
HealthProbeResult is the outcome of a single probe on a single backend.
type HealthReport ¶
type HealthReport struct {
Backend Backend
Name string
OK bool
Err error
Probe string // first failing probe name, empty when OK
ProbesRun int
ProbeResults []HealthProbeResult
// Aggregate stats — sum of gas across probes, time of the last probe.
GasUsed uint64
Status TxStatus
ExecTime float64
}
HealthReport is the per-backend result of Health(). It aggregates the per-probe results into a single OK / not-OK signal: a backend is healthy iff every probe ran to its expected status with non-zero gas.
func Health ¶
func Health() []HealthReport
Health runs a battery of canonical bytecode programs through every backend the loaded library exposes and returns a per-backend report. Use at process start to fail-fast on misconfigured GPUs (driver missing, library mismatch, device permissions, kernel coverage gaps). Returns nil only if the runtime cannot enumerate backends at all.
The battery covers:
- arithmetic (ADD/POP) — strict gas parity required across backends
- storage (SSTORE / SLOAD) — strict gas parity required
- hashing (KECCAK256) — non-zero gas required, parity not strict
- memory ops (MSTORE / MLOAD / MCOPY) — non-zero gas required, parity not strict
- the CALL bridge (CALL with a constant target) — must complete cleanly
A backend is reported OK iff every probe executed to its expected status with non-zero gas AND its gas matches every other backend on the strict-parity probes. A failure sets Err and Probe to identify the offending case.
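The OK rule above can be expressed as a small predicate. This is a sketch of the stated rule, not the package's implementation: the probe type and firstFailure are hypothetical, every backend is assumed to run the same probes in the same order, and the gas figures are made up.

```go
package main

import "fmt"

// probe is a hypothetical per-probe record for one backend.
type probe struct {
	Name     string
	Gas      uint64
	StatusOK bool // the probe reached its expected status
	Strict   bool // strict gas parity is required across backends
}

// firstFailure returns the name of the first failing probe for backend b,
// or "" when the backend is healthy. all[i] holds the probes of backend i.
func firstFailure(all [][]probe, b int) string {
	for i, p := range all[b] {
		if !p.StatusOK || p.Gas == 0 {
			return p.Name
		}
		if !p.Strict {
			continue
		}
		for other := range all {
			if other != b && i < len(all[other]) && all[other][i].Gas != p.Gas {
				return p.Name
			}
		}
	}
	return ""
}

func main() {
	// Illustrative gas numbers only.
	all := [][]probe{
		{{"arithmetic", 5, true, true}, {"hashing", 36, true, false}},
		{{"arithmetic", 5, true, true}, {"hashing", 40, true, false}},
	}
	for b := range all {
		fmt.Printf("backend %d first failing probe: %q\n", b, firstFailure(all, b))
	}
}
```

Under this reading, a parity mismatch flags every backend involved, since none of them matches all the others.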
type StateAccount ¶
type StateAccount struct {
Address [20]byte
Nonce uint64
Balance [4]uint64
Code []byte
CodeHash [32]byte
}
StateAccount is one entry in the snapshot of touched accounts handed to ExecuteBlockV4. Fields mirror the C-side CGpuStateAccount byte-for-byte (modulo the inline `Code` slice which the binding flattens into a single blob before crossing the cgo boundary).
Address is canonical 20-byte big-endian. Balance is little-endian limbs (Balance[0] = low 64 bits). Code may be nil for EOAs — empty code is the EOA marker. CodeHash should be keccak256(code); the dispatcher does not recompute it because callers usually have it cached on the StateDB side.
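Converting a 256-bit balance into the little-endian limb layout is easy to get backwards, so a worked sketch may help. The helpers balanceLimbs and limbsFromDecimal are hypothetical and use only math/big and encoding/binary from the standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math/big"
)

// balanceLimbs splits a non-negative value of at most 256 bits into four
// 64-bit limbs, least significant limb first (limbs[0] = low 64 bits).
func balanceLimbs(v *big.Int) ([4]uint64, error) {
	var limbs [4]uint64
	if v.Sign() < 0 || v.BitLen() > 256 {
		return limbs, errors.New("balance does not fit in 256 unsigned bits")
	}
	var buf [32]byte
	v.FillBytes(buf[:]) // big-endian, zero-padded to 32 bytes
	for i := 0; i < 4; i++ {
		// Limb i covers bytes [32-8*(i+1), 32-8*i) of the big-endian form.
		limbs[i] = binary.BigEndian.Uint64(buf[32-8*(i+1) : 32-8*i])
	}
	return limbs, nil
}

// limbsFromDecimal parses a base-10 string and converts it to limbs.
func limbsFromDecimal(s string) ([4]uint64, error) {
	v, ok := new(big.Int).SetString(s, 10)
	if !ok {
		return [4]uint64{}, fmt.Errorf("not a decimal integer: %q", s)
	}
	return balanceLimbs(v)
}

func main() {
	// 2^64 does not fit in the low limb, so it lands in limbs[1].
	limbs, err := limbsFromDecimal("18446744073709551616")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(limbs)
}
```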
type Transaction ¶
type Transaction struct {
From [20]byte
To [20]byte
HasTo bool
Data []byte // Calldata
Code []byte // EVM bytecode (optional — required for real GPU execution)
GasLimit uint64
Value uint64
Nonce uint64
GasPrice uint64
}
Transaction is a single EVM transaction to execute.
When Code is non-empty AND a GPU backend is selected, the C++ EVM dispatches each tx through the parallel opcode interpreter (Metal: kernel::EvmKernelHost, CUDA: cuda::EvmKernel). When Code is empty, GPU backends use the scheduler-only Block-STM kernel.
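The dispatch rule above reduces to a two-input decision. The sketch below is a caller-side restatement for illustration: dispatchPath and its string labels are hypothetical, the Backend constants are copied locally, and the CPU branch is labelled generically because the text does not describe how CPU backends treat Code.

```go
package main

import "fmt"

// Backend and its constants are copied from the documented declaration.
type Backend int

const (
	CPUSequential Backend = 0
	CPUParallel   Backend = 1
	GPUMetal      Backend = 2
	GPUCUDA       Backend = 3
)

// dispatchPath reports which path the text above describes for a given
// backend and bytecode. The returned labels are illustrative only.
func dispatchPath(b Backend, code []byte) string {
	gpu := b == GPUMetal || b == GPUCUDA
	switch {
	case !gpu:
		return "cpu"
	case len(code) > 0:
		return "gpu-opcode-interpreter"
	default:
		return "gpu-scheduler-only"
	}
}

func main() {
	push0Stop := []byte{0x5f, 0x00} // PUSH0, STOP
	fmt.Println(dispatchPath(GPUMetal, push0Stop))
	fmt.Println(dispatchPath(GPUCUDA, nil))
	fmt.Println(dispatchPath(CPUParallel, push0Stop))
}
```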