accel

v1.0.7
Published: Mar 12, 2026 License: BSD-3-Clause Imports: 10 Imported by: 12

Documentation

Overview

Package accel provides GPU-accelerated operations for blockchain and ML workloads.

The package supports multiple GPU backends (Metal, WebGPU, CUDA) via runtime plugin discovery. When built without CGO or when no backends are available, operations return ErrNoBackends.

Architecture

accel wraps the lux-accel C++ library which provides:

  • ML operations: matmul, attention, convolution, normalization
  • Crypto operations: batch signature verification, hashing, Merkle trees
  • ZK operations: NTT, MSM, polynomial arithmetic
  • Lattice crypto: Kyber, Dilithium post-quantum operations
  • FHE operations: BFV/CKKS homomorphic encryption
  • DEX operations: AMM swaps, TWAP, order matching

Backend Selection

Backends are automatically detected and selected in this priority order:

  • CUDA (NVIDIA GPUs)
  • Metal (Apple Silicon)
  • WebGPU (cross-platform fallback)

Override the automatic choice with the LUX_BACKEND environment variable or through the API:

session, _ := accel.NewSessionWithBackend(accel.BackendMetal)
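
The priority rule and the environment override described above can be sketched without the library. This is a minimal illustration, not the package's implementation: the Backend type, the pickBackend helper, the lowercase override names, and the choice to fail when an explicit request cannot be met are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Backend is a local stand-in for accel.BackendType (hypothetical).
type Backend string

const (
	CUDA   Backend = "cuda"
	Metal  Backend = "metal"
	WebGPU Backend = "webgpu"
)

// priority mirrors the documented order: CUDA, then Metal, then WebGPU.
var priority = []Backend{CUDA, Metal, WebGPU}

// pickBackend returns the override if it names an available backend,
// otherwise the first available backend in priority order.
func pickBackend(override string, available map[Backend]bool) (Backend, bool) {
	want := Backend(strings.ToLower(strings.TrimSpace(override)))
	if want != "" {
		if available[want] {
			return want, true
		}
		// Assumption: an explicit request that cannot be met is a failure,
		// not a silent fallback.
		return "", false
	}
	for _, b := range priority {
		if available[b] {
			return b, true
		}
	}
	return "", false
}

func main() {
	// Pretend this machine exposes Metal and WebGPU only.
	available := map[Backend]bool{Metal: true, WebGPU: true}
	b, ok := pickBackend(os.Getenv("LUX_BACKEND"), available)
	fmt.Println(b, ok)
}
```

With LUX_BACKEND unset, the sketch picks Metal because CUDA is not in the available set.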

Runtime Backend Selection

To select a backend based on the operations you need:

// Select best backend for ZK operations
backend, _ := accel.SelectBackend(accel.OpNTT, accel.OpMSM)
session, _ := accel.NewSessionWithBackend(backend)

// Query capabilities
caps, _ := accel.Capabilities(accel.BackendWebGPU)
if caps.Supports(accel.OpMSM) {
    // Use MSM on WebGPU
}

// Compare backends for an operation
comparison, _ := accel.CompareBackends(accel.OpNTT, 10)
fmt.Printf("Fastest backend for NTT: %s\n", comparison.Fastest)

// Print all capabilities
accel.PrintCapabilities()

Pure Go Mode

When built with CGO_ENABLED=0, the package compiles in pure Go mode. All operations return ErrNoBackends but the package remains importable, allowing graceful fallback to CPU implementations.
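
The fallback pattern this enables can be sketched as follows. The GPU call is replaced by a stub that always fails the way a pure Go build is described to, so the example runs on its own; errNoBackends, gpuSHA256Batch, and hashBatch are hypothetical stand-ins, not the package's API.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// errNoBackends stands in for accel.ErrNoBackends.
var errNoBackends = errors.New("accel: no GPU backends available")

// gpuSHA256Batch stands in for a GPU call. Here it always reports that no
// backend exists, as a CGO_ENABLED=0 build is documented to do.
func gpuSHA256Batch(inputs [][]byte) ([][]byte, error) {
	return nil, errNoBackends
}

// cpuSHA256Batch hashes every input with the standard library.
func cpuSHA256Batch(inputs [][]byte) [][]byte {
	out := make([][]byte, len(inputs))
	for i, in := range inputs {
		sum := sha256.Sum256(in)
		out[i] = sum[:]
	}
	return out
}

// hashBatch tries the GPU path first and falls back to the CPU only when
// the failure means "no GPU"; other errors are passed through.
func hashBatch(inputs [][]byte) ([][]byte, error) {
	out, err := gpuSHA256Batch(inputs)
	if err == nil {
		return out, nil
	}
	if errors.Is(err, errNoBackends) {
		return cpuSHA256Batch(inputs), nil
	}
	return nil, err
}

func main() {
	out, err := hashBatch([][]byte{[]byte("abc")})
	if err != nil {
		fmt.Println("hash failed:", err)
		return
	}
	fmt.Printf("%x\n", out[0])
}
```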

Basic Usage

// Initialize library
if err := accel.Init(); err != nil {
    log.Printf("GPU accel not available: %v", err)
}
defer accel.Shutdown()

// Check availability
if !accel.Available() {
    // Use CPU fallback
    return
}

// Create session
session, err := accel.NewSession()
if err != nil {
    log.Fatal(err)
}
defer session.Close()

// Create tensors
a, _ := accel.NewTensor[float32](session, []int{1024, 1024})
b, _ := accel.NewTensor[float32](session, []int{1024, 1024})
c, _ := accel.NewTensor[float32](session, []int{1024, 1024})

// Perform GPU operation
if err := session.ML().MatMul(a.Untyped(), b.Untyped(), c.Untyped()); err != nil {
    log.Fatal(err)
}

Integration with Lux Node

The accel package integrates with lux-node for:

  • Batch signature verification in consensus
  • Merkle tree computation for state sync
  • Post-quantum cryptography for future-proofing

See the node/consensus and precompile packages for integration examples.

Constants

const (
	BLSBatchVerifyThreshold    = 64  // Min signatures for GPU batch verify
	BLSBatchAggregateThreshold = 128 // Min items for GPU aggregation
	HashBatchThreshold         = 32  // Min items for GPU batch hash
	NTTBatchThreshold          = 4   // Min polynomials for GPU batch NTT
	MSMBatchThreshold          = 64  // Min points for GPU MSM
	KyberBatchThreshold        = 8   // Min operations for GPU batch
	DilithiumBatchThreshold    = 8   // Min operations for GPU batch
)

Batch operation thresholds: the minimum number of items for GPU acceleration to be worthwhile.
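
A caller might use such a threshold to choose between the CPU and GPU paths before dispatching a batch. The sketch below copies the HashBatchThreshold value from the list above; the shouldUseGPU helper is hypothetical, and treating the threshold as inclusive is an assumption based on the "Min items" wording.

```go
package main

import "fmt"

// hashBatchThreshold copies the HashBatchThreshold value listed above.
const hashBatchThreshold = 32

// shouldUseGPU reports whether a batch of n items should go to the GPU.
// Assumption: the threshold is inclusive (n == threshold uses the GPU).
func shouldUseGPU(n, threshold int, gpuAvailable bool) bool {
	return gpuAvailable && n >= threshold
}

func main() {
	for _, n := range []int{8, 32, 500} {
		fmt.Printf("n=%d gpu=%v\n", n, shouldUseGPU(n, hashBatchThreshold, true))
	}
}
```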

const (
	KyberPublicKeySize  = 1184
	KyberSecretKeySize  = 2400
	KyberCiphertextSize = 1088
	KyberSharedKeySize  = 32
)

Kyber key and ciphertext sizes (ML-KEM-768)

const (
	DilithiumPublicKeySize = 1952
	DilithiumSecretKeySize = 4016
	DilithiumSignatureSize = 3309
)

Dilithium sizes (ML-DSA-65)

const Version = "0.1.0"

Version is the library version.

Variables

var (
	// ErrNoBackends indicates no GPU backends are available.
	ErrNoBackends = errors.New("accel: no GPU backends available")

	// ErrNotInitialized indicates the library was not initialized.
	ErrNotInitialized = errors.New("accel: library not initialized")

	// ErrInvalidArgument indicates an invalid argument was provided.
	ErrInvalidArgument = errors.New("accel: invalid argument")

	// ErrOutOfMemory indicates GPU memory allocation failed.
	ErrOutOfMemory = errors.New("accel: out of GPU memory")

	// ErrNotSupported indicates the operation is not supported.
	ErrNotSupported = errors.New("accel: operation not supported")

	// ErrKernelFailed indicates a GPU kernel execution failed.
	ErrKernelFailed = errors.New("accel: kernel execution failed")

	// ErrBackendNotFound indicates the requested backend is not available.
	ErrBackendNotFound = errors.New("accel: backend not found")

	// ErrSessionClosed indicates the session has been closed.
	ErrSessionClosed = errors.New("accel: session closed")

	// ErrShapeMismatch indicates tensor shapes are incompatible.
	ErrShapeMismatch = errors.New("accel: tensor shape mismatch")

	// ErrBatchSizeMismatch indicates mismatched batch input sizes.
	ErrBatchSizeMismatch = errors.New("accel: mismatched batch input sizes")

	// ErrNilInput indicates nil input in batch operation.
	ErrNilInput = errors.New("accel: nil input in batch operation")
)

BackendPriority defines the order used for automatic backend selection (CUDA, then Metal, then WebGPU).

Functions

func AllCapabilities

func AllCapabilities() map[BackendType]*BackendCapabilities

AllCapabilities returns capabilities for all available backends.

func Available

func Available() bool

Available returns true if at least one GPU backend is available.

func BLSBatchVerify

func BLSBatchVerify(pks, sigs, msgs [][]byte) ([]bool, error)

BLSBatchVerify verifies multiple BLS signatures using GPU acceleration. Returns slice of bools indicating validity of each signature. Returns ErrNotSupported if GPU unavailable or batch too small.

func CUDAAvailable

func CUDAAvailable() bool

CUDAAvailable returns true if CUDA backend is available.

func DeviceCount

func DeviceCount() int

DeviceCount returns the total number of available devices.

func DilithiumSign

func DilithiumSign(msg, sk []byte) (sig []byte, err error)

DilithiumSign signs a message using Dilithium (ML-DSA).

func DilithiumVerify

func DilithiumVerify(msg, sig, pk []byte) (bool, error)

DilithiumVerify verifies a Dilithium signature.

func GetLastError

func GetLastError() string

GetLastError returns the last error message from the C library.

func GetVersion

func GetVersion() string

GetVersion returns the C library version string.

func Init

func Init() error

Init initializes the accel library. Must be called before any other operations. Safe to call multiple times; subsequent calls are no-ops.

func Keccak256Batch

func Keccak256Batch(inputs [][]byte) ([][]byte, error)

Keccak256Batch computes Keccak256 hashes for multiple inputs using GPU. Returns ErrNotSupported if GPU unavailable or batch too small.

func KyberDecaps

func KyberDecaps(ct, sk []byte) (ss []byte, err error)

KyberDecaps decapsulates a ciphertext using a secret key.

func KyberEncaps

func KyberEncaps(pk []byte) (ct, ss []byte, err error)

KyberEncaps encapsulates a shared secret using a public key.

func KyberKeyGen

func KyberKeyGen() (pk, sk []byte, err error)

KyberKeyGen generates a Kyber keypair using GPU acceleration.

func LoadPlugin

func LoadPlugin(path string) error

LoadPlugin explicitly loads a backend plugin from a path.

func MSM

func MSM(scalars, bases [][]byte) ([]byte, error)

MSM computes a multi-scalar multiplication: sum(scalars[i] * bases[i]). Returns ErrNotSupported if the GPU is unavailable or the batch is too small.

func MerkleRoot

func MerkleRoot(leaves [][]byte) ([]byte, error)

MerkleRoot computes the Merkle root of leaves using GPU. Returns ErrNotSupported if GPU unavailable or batch too small.
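
When the GPU path is unavailable, a CPU reference is useful for fallback and for cross-checking results. The sketch below is one common construction, not necessarily what the library computes: hashing with SHA-256, concatenating left and right children, and requiring a power-of-two leaf count are all assumptions (the last one borrowed from the CryptoOps.MerkleRoot comment).

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// merkleRoot reduces 32-byte leaves pairwise until one node remains.
// Assumptions: parent = SHA-256(left || right), leaf count is a power of two.
func merkleRoot(leaves [][32]byte) ([32]byte, error) {
	n := len(leaves)
	if n == 0 || n&(n-1) != 0 {
		return [32]byte{}, errors.New("leaf count must be a non-zero power of two")
	}
	level := make([][32]byte, n)
	copy(level, leaves)
	for len(level) > 1 {
		next := make([][32]byte, len(level)/2)
		for i := range next {
			var pair [64]byte
			copy(pair[:32], level[2*i][:])
			copy(pair[32:], level[2*i+1][:])
			next[i] = sha256.Sum256(pair[:])
		}
		level = next
	}
	return level[0], nil
}

func main() {
	leaves := make([][32]byte, 4)
	for i := range leaves {
		leaves[i] = sha256.Sum256([]byte{byte(i)})
	}
	root, err := merkleRoot(leaves)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%x\n", root)
}
```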

func MetalAvailable

func MetalAvailable() bool

MetalAvailable returns true if Metal backend is available.

func NTTForward

func NTTForward(coeffs, roots []uint64, modulus uint64) error

NTTForward computes forward Number Theoretic Transform on a polynomial. Modifies coeffs in-place.

func NTTInverse

func NTTInverse(coeffs, invRoots []uint64, modulus uint64) error

NTTInverse computes inverse Number Theoretic Transform on a polynomial. Modifies coeffs in-place.
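
To make the transform concrete, here is a small quadratic-time reference that can be used to cross-check GPU results on tiny inputs. It is a textbook definition, not the library's algorithm: it takes a single primitive root rather than the roots slice the functions above expect (whose layout is not documented here), and it returns new slices instead of modifying the input in place.

```go
package main

import "fmt"

// powMod computes base^exp mod m by square-and-multiply.
// It assumes m is small enough that products fit in a uint64.
func powMod(base, exp, m uint64) uint64 {
	result := uint64(1)
	base %= m
	for exp > 0 {
		if exp&1 == 1 {
			result = result * base % m
		}
		base = base * base % m
		exp >>= 1
	}
	return result
}

// naiveNTT evaluates out[k] = sum_j a[j] * w^(j*k) mod q in O(n^2) time,
// where w is a primitive n-th root of unity modulo the prime q.
func naiveNTT(a []uint64, w, q uint64) []uint64 {
	n := uint64(len(a))
	out := make([]uint64, n)
	for k := uint64(0); k < n; k++ {
		var acc uint64
		for j := uint64(0); j < n; j++ {
			acc = (acc + a[j]%q*powMod(w, j*k, q)) % q
		}
		out[k] = acc
	}
	return out
}

// naiveINTT inverts naiveNTT: it transforms with w^-1 and scales by n^-1.
// Both inverses come from Fermat's little theorem, so q must be prime.
func naiveINTT(x []uint64, w, q uint64) []uint64 {
	n := uint64(len(x))
	wInv := powMod(w, q-2, q)
	nInv := powMod(n, q-2, q)
	out := naiveNTT(x, wInv, q)
	for i := range out {
		out[i] = out[i] * nInv % q
	}
	return out
}

func main() {
	const q, w = 17, 4 // 4 is a primitive 4th root of unity mod 17
	a := []uint64{1, 2, 3, 4}
	x := naiveNTT(a, w, q)
	fmt.Println(x)                  // [10 7 15 6]
	fmt.Println(naiveINTT(x, w, q)) // [1 2 3 4]
}
```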

func PrintCapabilities

func PrintCapabilities()

PrintCapabilities prints a human-readable summary of backend capabilities.

func SHA256Batch

func SHA256Batch(inputs [][]byte) ([][]byte, error)

SHA256Batch computes SHA256 hashes for multiple inputs using GPU. Returns ErrNotSupported if GPU unavailable or batch too small.

func Shutdown

func Shutdown()

Shutdown releases all library resources. Call when done using the library.

func WebGPUAvailable

func WebGPUAvailable() bool

WebGPUAvailable returns true if WebGPU backend is available.

Types

type BackendCapabilities

type BackendCapabilities struct {
	Backend    BackendType
	Operations map[OperationType]bool
	Categories map[string]bool
}

BackendCapabilities describes what operations a backend supports.

func Capabilities

func Capabilities(backend BackendType) (*BackendCapabilities, error)

Capabilities returns the capabilities for a specific backend.

func GetCapabilities

func GetCapabilities(backend BackendType) (*BackendCapabilities, error)

GetCapabilities returns the capabilities for a backend. This probes the backend to determine which operations are supported.

func (*BackendCapabilities) SupportedOperations

func (c *BackendCapabilities) SupportedOperations() []OperationType

SupportedOperations returns a list of supported operations.

func (*BackendCapabilities) Supports

func (c *BackendCapabilities) Supports(op OperationType) bool

Supports returns true if the backend supports the operation.

func (*BackendCapabilities) SupportsCategory

func (c *BackendCapabilities) SupportsCategory(cat string) bool

SupportsCategory returns true if the backend supports any operation in the category.

type BackendComparison

type BackendComparison struct {
	Operation OperationType
	Results   map[BackendType]BenchmarkResult
	Fastest   BackendType
}

BackendComparison holds benchmark results across backends.

func CompareBackends

func CompareBackends(op OperationType, iterations int) (*BackendComparison, error)

CompareBackends runs a quick benchmark of an operation across all backends. Returns results for comparison. If the operation isn't supported, that backend's result will have an error.

type BackendInfo

type BackendInfo struct {
	Type        BackendType
	Name        string
	APIVersion  int
	DeviceCount int
}

BackendInfo provides information about an available backend.

type BackendType

type BackendType int

BackendType identifies a GPU compute backend.

const (
	// BackendAuto selects the best available backend automatically.
	// Priority: CUDA > Metal > WebGPU
	BackendAuto BackendType = iota

	// BackendMetal uses Apple Metal (macOS/iOS).
	BackendMetal

	// BackendWebGPU uses WebGPU via Dawn (cross-platform).
	BackendWebGPU

	// BackendCUDA uses NVIDIA CUDA.
	BackendCUDA
)

func Backends

func Backends() []BackendType

Backends returns a list of available backend types.

func MustSelectBackend

func MustSelectBackend(ops ...OperationType) BackendType

MustSelectBackend returns the best backend or panics on error.

func SelectBackend

func SelectBackend(ops ...OperationType) (BackendType, error)

SelectBackend returns the best backend for the given operations. If ops is empty, returns the highest priority available backend.
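
The selection rule described here (walk the priority order, return the first backend that supports every requested operation) can be sketched with plain maps. The backend and op types, the selectBackend helper, and the capability table below are hypothetical illustration data, not values reported by the library.

```go
package main

import (
	"errors"
	"fmt"
)

// backend and op are local stand-ins for BackendType and OperationType.
type backend string
type op string

var errNoMatch = errors.New("no available backend supports all requested operations")

// selectBackend returns the first backend in priority order whose
// capability set contains every requested op. A backend missing from
// caps is treated as unavailable. With no ops, the first available
// backend wins.
func selectBackend(priority []backend, caps map[backend]map[op]bool, ops ...op) (backend, error) {
	for _, b := range priority {
		supported, available := caps[b]
		if !available {
			continue
		}
		ok := true
		for _, o := range ops {
			if !supported[o] {
				ok = false
				break
			}
		}
		if ok {
			return b, nil
		}
	}
	return "", errNoMatch
}

func main() {
	priority := []backend{"cuda", "metal", "webgpu"}
	// Made-up capability table for the example.
	caps := map[backend]map[op]bool{
		"metal":  {"ntt": true},
		"webgpu": {"ntt": true, "msm": true},
	}
	b, err := selectBackend(priority, caps, "ntt", "msm")
	fmt.Println(b, err) // webgpu <nil>
}
```

In this made-up table Metal ranks higher but lacks MSM, so the lower-priority backend is chosen.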

func SelectBestBackend

func SelectBestBackend(ops []OperationType, preferPerformance bool) (BackendType, error)

SelectBestBackend returns the best available backend for a set of operations. It considers backend availability, capability support, and optionally performance.

func (BackendType) String

func (b BackendType) String() string

String returns the backend name.

type BenchmarkResult

type BenchmarkResult struct {
	Backend   BackendType
	Operation OperationType
	Duration  time.Duration
	Error     error
}

BenchmarkResult holds timing data for an operation.

type CryptoOps

type CryptoOps interface {
	// SHA256 computes SHA-256 hashes for a batch of inputs.
	// input: [N, input_len] bytes
	// output: [N, 32] bytes
	SHA256(input, output *UntypedTensor) error

	// Keccak256 computes Keccak-256 (Ethereum hash) for a batch.
	// input: [N, input_len] bytes
	// output: [N, 32] bytes
	Keccak256(input, output *UntypedTensor) error

	// Poseidon computes Poseidon hash (ZK-friendly).
	// input: [N, field_elements] uint64
	// output: [N, 1] uint64
	Poseidon(input, output *UntypedTensor) error

	// ECDSAVerifyBatch verifies multiple ECDSA signatures in parallel.
	// messages: [N, 32] bytes (message hashes)
	// signatures: [N, 64] bytes (r || s)
	// pubkeys: [N, 33] bytes (compressed) or [N, 65] (uncompressed)
	// results: [N] uint8 (1 = valid, 0 = invalid)
	ECDSAVerifyBatch(messages, signatures, pubkeys, results *UntypedTensor) error

	// Ed25519VerifyBatch verifies multiple Ed25519 signatures.
	// messages: [N, msg_len] bytes
	// signatures: [N, 64] bytes
	// pubkeys: [N, 32] bytes
	// results: [N] uint8 (1 = valid, 0 = invalid)
	Ed25519VerifyBatch(messages, signatures, pubkeys, results *UntypedTensor) error

	// BLSVerifyBatch verifies multiple BLS signatures.
	// messages: [N, msg_len] bytes
	// signatures: [N, 96] bytes (G2 points)
	// pubkeys: [N, 48] bytes (G1 points)
	// results: [N] uint8 (1 = valid, 0 = invalid)
	BLSVerifyBatch(messages, signatures, pubkeys, results *UntypedTensor) error

	// BLSAggregate aggregates multiple BLS signatures into one.
	// signatures: [N, 96] bytes
	// aggregated: [96] bytes
	BLSAggregate(signatures, aggregated *UntypedTensor) error

	// MerkleRoot computes Merkle root from leaves.
	// leaves: [N, 32] bytes (N must be power of 2)
	// root: [32] bytes
	MerkleRoot(leaves, root *UntypedTensor) error

	// MerkleBatch computes multiple Merkle roots in parallel.
	// leavesSet: [M, N, 32] bytes
	// roots: [M, 32] bytes
	MerkleBatch(leavesSet, roots *UntypedTensor) error

	// MerkleProof generates Merkle proof for a leaf.
	// leaves: [N, 32] bytes
	// leafIndex: index of the leaf
	// proof: [log2(N), 32] bytes
	MerkleProof(leaves *UntypedTensor, leafIndex int, proof *UntypedTensor) error
}

CryptoOps provides GPU-accelerated cryptographic operations.

type DEXOps

type DEXOps interface {
	// ConstantProductSwap computes AMM swap output using x*y=k formula.
	// reserveX: [N] uint64 - X token reserves
	// reserveY: [N] uint64 - Y token reserves
	// amountIn: [N] uint64 - input amounts
	// xToY: true for X→Y swap, false for Y→X
	// amountOut: [N] uint64 - output amounts
	// fee: fee percentage (e.g., 0.003 for 0.3%)
	ConstantProductSwap(reserveX, reserveY, amountIn *UntypedTensor, xToY bool, amountOut *UntypedTensor, fee float32) error

	// ConstantProductSwapBatch processes multiple swaps.
	// reserves: [M, 2] uint64 (reserveX, reserveY per pool)
	// swaps: [N, 3] uint64 (poolIndex, amountIn, direction)
	// amounts: [N] uint64 output amounts
	ConstantProductSwapBatch(reserves, swaps, amounts *UntypedTensor, fee float32) error

	// ComputeTWAP computes time-weighted average price.
	// prices: [N] uint64 - historical prices
	// timestamps: [N] uint64 - timestamps
	// start, end: time range
	// twap: [1] uint64 output
	ComputeTWAP(prices, timestamps *UntypedTensor, start, end uint64, twap *UntypedTensor) error

	// MatchOrders matches bid/ask orders.
	// bids: [N, 3] uint64 (price, quantity, orderId)
	// asks: [M, 3] uint64 (price, quantity, orderId)
	// matches: output (bidId, askId, quantity, price)
	// prices: fill prices
	// amounts: fill amounts
	MatchOrders(bids, asks, matches, prices, amounts *UntypedTensor) error

	// MatchOrdersWithPriority matches orders with time/price priority.
	// bids: [N, 4] uint64 (price, quantity, orderId, timestamp)
	// asks: [M, 4] uint64
	MatchOrdersWithPriority(bids, asks, matches *UntypedTensor) error

	// ComputeLiquidity computes concentrated liquidity positions (Uniswap V3 style).
	// tickLower: [N] int32 - lower tick
	// tickUpper: [N] int32 - upper tick
	// amounts: [N, 2] uint64 (amount0, amount1)
	// liquidity: [N] uint128 output
	ComputeLiquidity(tickLower, tickUpper, amounts, liquidity *UntypedTensor) error

	// ComputePositionValue computes position value at current price.
	// liquidity: [N] uint128
	// tickLower: [N] int32
	// tickUpper: [N] int32
	// currentTick: current price tick
	// values: [N, 2] uint64 (token0, token1)
	ComputePositionValue(liquidity, tickLower, tickUpper *UntypedTensor, currentTick int32, values *UntypedTensor) error

	// CalculateFees computes accumulated fees for positions.
	// liquidity: [N] uint128
	// feeGrowthInside0: [N] uint256
	// feeGrowthInside1: [N] uint256
	// fees: [N, 2] uint64 output
	CalculateFees(liquidity, feeGrowthInside0, feeGrowthInside1, fees *UntypedTensor) error

	// BatchSettlement settles multiple trades atomically.
	// trades: [N, 4] uint64 (buyer, seller, token, amount)
	// balances: [M, T] uint64 (M users, T tokens)
	// newBalances: output
	BatchSettlement(trades, balances, newBalances *UntypedTensor) error
}

DEXOps provides GPU-accelerated DEX (decentralized exchange) operations.
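
As a worked example of the x*y=k formula behind ConstantProductSwap, the sketch below computes one swap on the CPU with integer arithmetic. It is an illustration under stated assumptions: the fee is expressed in parts per million instead of the float32 the interface takes, it is deducted from the input amount, and results are rounded down. The library's rounding and fee handling may differ.

```go
package main

import (
	"fmt"
	"math/big"
)

// swapOut returns floor(reserveOut * in / (reserveIn + in)), where
// in = floor(amountIn * (1_000_000 - feePPM) / 1_000_000).
// big.Int keeps the intermediate product from overflowing.
func swapOut(reserveIn, reserveOut, amountIn, feePPM uint64) uint64 {
	if feePPM > 1_000_000 {
		return 0 // a fee above 100% leaves nothing to swap
	}
	in := new(big.Int).SetUint64(amountIn)
	in.Mul(in, new(big.Int).SetUint64(1_000_000-feePPM))
	in.Div(in, big.NewInt(1_000_000))

	num := new(big.Int).Mul(new(big.Int).SetUint64(reserveOut), in)
	den := new(big.Int).Add(new(big.Int).SetUint64(reserveIn), in)
	if den.Sign() == 0 {
		return 0 // empty pool and empty input
	}
	// The quotient is always below reserveOut, so it fits in a uint64.
	return num.Div(num, den).Uint64()
}

func main() {
	// 1,000,000 of each token in the pool, 1,000 in, 0.3% fee.
	fmt.Println(swapOut(1_000_000, 1_000_000, 1_000, 3_000)) // 996
}
```

After the fee, 997 units enter the pool, and 1,000,000 * 997 / 1,000,997 rounds down to 996.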

type DType

type DType int

DType represents a tensor data type.

const (
	Float32 DType = iota
	Float16
	Float64
	Int32
	Int64
	Uint8
	Uint32
	Uint64
)

func DTypeOf

func DTypeOf[T TensorElement]() DType

DTypeOf returns the DType for a Go type.

func (DType) Size

func (d DType) Size() int

Size returns the byte size of a single element.
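
Element size combines with a tensor's shape to give its memory footprint. The sketch below uses local stand-ins for the DType constants, in the same order as listed above, with the usual byte widths for those types; tensorBytes is a hypothetical helper, and treating an empty shape as a single scalar element is an assumption.

```go
package main

import "fmt"

// dtype is a local stand-in for DType, declared in the same order.
type dtype int

const (
	float32T dtype = iota
	float16T
	float64T
	int32T
	int64T
	uint8T
	uint32T
	uint64T
)

// elemSize maps each dtype to its width in bytes.
var elemSize = map[dtype]int{
	float32T: 4, float16T: 2, float64T: 8,
	int32T: 4, int64T: 8,
	uint8T: 1, uint32T: 4, uint64T: 8,
}

// tensorBytes multiplies the element count by the element width.
// Assumption: an empty shape denotes one scalar element.
func tensorBytes(d dtype, shape []int) int {
	n := 1
	for _, dim := range shape {
		n *= dim
	}
	return n * elemSize[d]
}

func main() {
	// A 1024x1024 float32 matrix, as in the Basic Usage example.
	fmt.Println(tensorBytes(float32T, []int{1024, 1024})) // 4194304
}
```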

func (DType) String

func (d DType) String() string

String returns the dtype name.

type DeviceCaps

type DeviceCaps uint32

DeviceCaps represents device capability flags.

const (
	CapFP16         DeviceCaps = 1 << iota // Half-precision float support
	CapFP64                                // Double precision support
	CapSubgroups                           // Subgroup/warp operations
	CapInt64Atomics                        // 64-bit atomic operations
)

func (DeviceCaps) Has

func (c DeviceCaps) Has(cap DeviceCaps) bool

Has returns true if the device has the specified capability.
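
Because the flags are distinct bits, a device's capabilities combine with bitwise OR and are tested with a mask. The sketch below redeclares the constants locally to stay self-contained; the has method is a guess at the semantics (all requested bits must be set), not the library's source.

```go
package main

import "fmt"

// caps is a local stand-in for DeviceCaps with the same bit layout.
type caps uint32

const (
	capFP16 caps = 1 << iota
	capFP64
	capSubgroups
	capInt64Atomics
)

// has reports whether every bit in want is set in c.
// Assumption: a multi-bit query requires all of its bits.
func (c caps) has(want caps) bool {
	return c&want == want
}

func main() {
	device := capFP16 | capSubgroups
	fmt.Println(device.has(capFP16))           // true
	fmt.Println(device.has(capFP64))           // false
	fmt.Println(device.has(capFP16 | capFP64)) // false
}
```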

type DeviceInfo

type DeviceInfo struct {
	Backend          BackendType
	Index            int
	Name             string
	Vendor           string
	IsDiscrete       bool
	IsUnifiedMemory  bool
	TotalMemory      uint64 // bytes
	MaxBufferSize    uint64 // bytes
	MaxWorkgroupSize uint32
	SIMDWidth        uint32
	Capabilities     DeviceCaps
}

DeviceInfo contains information about a compute device.

func Devices

func Devices() []DeviceInfo

Devices returns information about all available devices across all backends.

func (*DeviceInfo) MemoryGB

func (d *DeviceInfo) MemoryGB() float64

MemoryGB returns total memory in gigabytes.

type Error

type Error struct {
	Op      string // Operation that failed
	Backend BackendType
	Err     error
	Detail  string // Additional detail from C library
}

Error wraps an error with additional context from the C library.

func (*Error) Error

func (e *Error) Error() string

func (*Error) Unwrap

func (e *Error) Unwrap() error
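
Because Error exposes Unwrap, callers can match the sentinel errors listed under Variables with errors.Is and recover the context with errors.As. The sketch below reproduces that shape with local stand-ins so it runs without the library; opError, its message format, and classify are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfMemory stands in for accel.ErrOutOfMemory.
var errOutOfMemory = errors.New("accel: out of GPU memory")

// opError mirrors the shape of Error: an operation name, a wrapped
// sentinel, and extra detail. The message format is an assumption.
type opError struct {
	Op     string
	Err    error
	Detail string
}

func (e *opError) Error() string { return fmt.Sprintf("%s: %v (%s)", e.Op, e.Err, e.Detail) }
func (e *opError) Unwrap() error { return e.Err }

// classify shows both lookups: errors.Is for the sentinel,
// errors.As for the wrapping struct.
func classify(err error) string {
	if err == nil {
		return "ok"
	}
	if errors.Is(err, errOutOfMemory) {
		var oe *opError
		if errors.As(err, &oe) {
			return "oom in " + oe.Op
		}
		return "oom"
	}
	return "other"
}

func main() {
	err := &opError{Op: "MatMul", Err: errOutOfMemory, Detail: "requested 8 GiB"}
	fmt.Println(classify(err)) // oom in MatMul
	fmt.Println(err)
}
```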

type FHEOps

type FHEOps interface {
	// BFVEncrypt encrypts plaintext with BFV scheme.
	// plaintext: [N] int64 values (N ≤ poly_modulus_degree)
	// pk: public key
	// ciphertext: output ciphertext
	BFVEncrypt(plaintext, pk, ciphertext *UntypedTensor) error

	// BFVEncryptBatch encrypts multiple plaintexts.
	// plaintexts: [M, N] int64
	// pk: public key
	// ciphertexts: [M, ...] output
	BFVEncryptBatch(plaintexts, pk, ciphertexts *UntypedTensor) error

	// BFVDecrypt decrypts ciphertext.
	// ciphertext: input ciphertext
	// sk: secret key
	// plaintext: [N] int64 output
	BFVDecrypt(ciphertext, sk, plaintext *UntypedTensor) error

	// BFVAdd adds two ciphertexts.
	// ct1, ct2: input ciphertexts
	// result: output ciphertext
	BFVAdd(ct1, ct2, result *UntypedTensor) error

	// BFVMultiply multiplies ciphertexts with relinearization.
	// ct1, ct2: input ciphertexts
	// relinKey: relinearization key
	// result: output ciphertext
	BFVMultiply(ct1, ct2, relinKey, result *UntypedTensor) error

	// BFVMultiplyPlain multiplies ciphertext by plaintext.
	// ct: input ciphertext
	// plain: [N] int64 plaintext
	// result: output ciphertext
	BFVMultiplyPlain(ct, plain, result *UntypedTensor) error

	// BFVRotate rotates ciphertext slots.
	// ct: input ciphertext
	// galoisKey: Galois key for rotation
	// steps: rotation amount (positive = left)
	// result: output ciphertext
	BFVRotate(ct, galoisKey *UntypedTensor, steps int, result *UntypedTensor) error

	// CKKSEncrypt encrypts with CKKS (approximate arithmetic).
	// plaintext: [N] float64 values
	// pk: public key
	// scale: encoding scale
	// ciphertext: output
	CKKSEncrypt(plaintext, pk *UntypedTensor, scale float64, ciphertext *UntypedTensor) error

	// CKKSDecrypt decrypts CKKS ciphertext.
	// ciphertext: input
	// sk: secret key
	// plaintext: [N] float64 output
	CKKSDecrypt(ciphertext, sk, plaintext *UntypedTensor) error

	// CKKSAdd adds two CKKS ciphertexts.
	CKKSAdd(ct1, ct2, result *UntypedTensor) error

	// CKKSMultiply multiplies CKKS ciphertexts.
	CKKSMultiply(ct1, ct2, relinKey, result *UntypedTensor) error

	// CKKSRescale rescales ciphertext after multiplication.
	CKKSRescale(ct, result *UntypedTensor) error

	// CKKSRotate rotates CKKS slots.
	CKKSRotate(ct, galoisKey *UntypedTensor, steps int, result *UntypedTensor) error

	// Bootstrap refreshes ciphertext noise level (limited support).
	Bootstrap(ct, bootstrapKey, result *UntypedTensor) error
}

FHEOps provides GPU-accelerated fully homomorphic encryption operations. Supports BFV (exact arithmetic) and CKKS (approximate arithmetic) schemes.

type LatticeOps

type LatticeOps interface {
	// KyberKeyGen generates Kyber (ML-KEM) key pair.
	// pk: [1184] bytes (Kyber768 public key)
	// sk: [2400] bytes (Kyber768 secret key)
	KyberKeyGen(pk, sk *UntypedTensor) error

	// KyberKeyGenBatch generates multiple key pairs in parallel.
	// pk: [N, 1184] bytes
	// sk: [N, 2400] bytes
	KyberKeyGenBatch(pk, sk *UntypedTensor) error

	// KyberEncaps encapsulates shared secret.
	// pk: [1184] bytes public key
	// ct: [1088] bytes ciphertext output
	// ss: [32] bytes shared secret output
	KyberEncaps(pk, ct, ss *UntypedTensor) error

	// KyberEncapsBatch performs batch encapsulation.
	// pk: [N, 1184] bytes
	// ct: [N, 1088] bytes
	// ss: [N, 32] bytes
	KyberEncapsBatch(pk, ct, ss *UntypedTensor) error

	// KyberDecaps decapsulates shared secret.
	// ct: [1088] bytes ciphertext
	// sk: [2400] bytes secret key
	// ss: [32] bytes shared secret output
	KyberDecaps(ct, sk, ss *UntypedTensor) error

	// KyberDecapsBatch performs batch decapsulation.
	// ct: [N, 1088] bytes
	// sk: [N, 2400] bytes
	// ss: [N, 32] bytes
	KyberDecapsBatch(ct, sk, ss *UntypedTensor) error

	// DilithiumKeyGen generates Dilithium (ML-DSA) key pair.
	// pk: [1952] bytes (Dilithium3 public key)
	// sk: [4016] bytes (Dilithium3 secret key)
	DilithiumKeyGen(pk, sk *UntypedTensor) error

	// DilithiumSign signs a message.
	// msg: [msg_len] bytes message
	// sk: [4016] bytes secret key
	// sig: [3293] bytes signature output
	DilithiumSign(msg, sk, sig *UntypedTensor) error

	// DilithiumSignBatch signs multiple messages in parallel.
	// msgs: [N, msg_len] bytes
	// sk: [4016] bytes (same key for all)
	// sigs: [N, 3293] bytes
	DilithiumSignBatch(msgs, sk, sigs *UntypedTensor) error

	// DilithiumVerify verifies a signature.
	// msg: [msg_len] bytes
	// sig: [3293] bytes
	// pk: [1952] bytes
	// Returns true if valid.
	DilithiumVerify(msg, sig, pk *UntypedTensor) (bool, error)

	// DilithiumVerifyBatch verifies multiple signatures.
	// msgs: [N, msg_len] bytes
	// sigs: [N, 3293] bytes
	// pks: [N, 1952] bytes
	// results: [N] uint8 (1 = valid, 0 = invalid)
	DilithiumVerifyBatch(msgs, sigs, pks, results *UntypedTensor) error

	// PolynomialNTT performs NTT in lattice polynomial ring.
	// Operates on polynomials in Z_q[X]/(X^256 + 1).
	PolynomialNTT(input, output *UntypedTensor, q uint32) error

	// PolynomialINTT performs inverse NTT.
	PolynomialINTT(input, output *UntypedTensor, q uint32) error

	// PolynomialMul multiplies polynomials in NTT domain.
	PolynomialMul(a, b, c *UntypedTensor, q uint32) error

	// PolynomialAdd adds polynomials.
	PolynomialAdd(a, b, c *UntypedTensor, q uint32) error
}

LatticeOps provides GPU-accelerated lattice-based cryptography operations. Implements NIST post-quantum standards: ML-KEM (Kyber) and ML-DSA (Dilithium).

type MLOps

type MLOps interface {
	// MatMul performs matrix multiplication: C = A @ B
	MatMul(a, b, c *UntypedTensor) error

	// MatMulTranspose performs C = A @ B^T or C = A^T @ B
	MatMulTranspose(a, b, c *UntypedTensor, transposeA, transposeB bool) error

	// ReLU applies rectified linear unit: y = max(0, x)
	ReLU(input, output *UntypedTensor) error

	// GELU applies Gaussian error linear unit activation
	GELU(input, output *UntypedTensor) error

	// Softmax applies softmax along an axis
	Softmax(input, output *UntypedTensor, axis int) error

	// LayerNorm applies layer normalization
	LayerNorm(input, gamma, beta, output *UntypedTensor, eps float32) error

	// Attention computes scaled dot-product attention
	// output = softmax(Q @ K^T / scale) @ V
	Attention(q, k, v, output *UntypedTensor, scale float32) error

	// Conv2D performs 2D convolution
	Conv2D(input, kernel, output *UntypedTensor, stride, padding [2]int) error

	// MaxPool2D performs 2D max pooling
	MaxPool2D(input, output *UntypedTensor, kernelSize, stride [2]int) error

	// BatchNorm applies batch normalization
	BatchNorm(input, gamma, beta, mean, variance, output *UntypedTensor, eps float32) error

	// Dropout applies dropout with given probability (inference mode)
	Dropout(input, output *UntypedTensor, p float32) error

	// Add performs element-wise addition
	Add(a, b, c *UntypedTensor) error

	// Multiply performs element-wise multiplication
	Multiply(a, b, c *UntypedTensor) error

	// Sum reduces tensor along specified axes
	Sum(input, output *UntypedTensor, axes []int) error

	// Mean reduces tensor along specified axes
	Mean(input, output *UntypedTensor, axes []int) error
}

MLOps provides GPU-accelerated machine learning operations.

type OperationType

type OperationType int

OperationType identifies a type of compute operation.

const (
	// ML Operations
	OpMatMul OperationType = iota
	OpReLU
	OpGELU
	OpSoftmax
	OpLayerNorm
	OpAttention

	// Crypto Operations
	OpSHA256
	OpKeccak256
	OpPoseidon
	OpECDSAVerify
	OpEd25519Verify
	OpBLSVerify
	OpMerkleRoot

	// ZK Operations
	OpNTT
	OpINTT
	OpMSM
	OpPolyMul

	// FHE Operations
	OpBFVEncrypt
	OpBFVDecrypt
	OpBFVAdd
	OpBFVMul

	// Lattice Operations
	OpKyberKeyGen
	OpKyberEncaps
	OpKyberDecaps
	OpDilithiumSign
	OpDilithiumVerify

	// DEX Operations
	OpConstantProductSwap
	OpTWAP
	OpOrderMatch
)

func (OperationType) Category

func (o OperationType) Category() string

Category returns the operation category.

func (OperationType) String

func (o OperationType) String() string

String returns the operation name.

type Session

type Session struct {
	// contains filtered or unexported fields
}

Session manages a GPU acceleration context. All tensor operations must use tensors created from the same session. Session is safe for concurrent use.

func DefaultSession

func DefaultSession() (*Session, error)

DefaultSession returns a lazily initialized default session. It uses the best available backend (Metal on macOS, CUDA on Linux).

func NewSession

func NewSession(opts ...SessionOption) (*Session, error)

NewSession creates a new acceleration session with auto-detected best backend.

func NewSessionWithBackend

func NewSessionWithBackend(backend BackendType, opts ...SessionOption) (*Session, error)

NewSessionWithBackend creates a session using a specific backend.

func NewSessionWithDevice

func NewSessionWithDevice(backend BackendType, deviceIndex int, opts ...SessionOption) (*Session, error)

NewSessionWithDevice creates a session using a specific device.

func (*Session) Backend

func (s *Session) Backend() BackendType

Backend returns the backend type for this session.

func (*Session) Close

func (s *Session) Close() error

Close releases all session resources.

func (*Session) Crypto

func (s *Session) Crypto() CryptoOps

Crypto returns the cryptographic operations interface.

func (*Session) DEX

func (s *Session) DEX() DEXOps

DEX returns the decentralized exchange operations interface.

func (*Session) DeviceInfo

func (s *Session) DeviceInfo() DeviceInfo

DeviceInfo returns information about the session's device.

func (*Session) FHE

func (s *Session) FHE() FHEOps

FHE returns the fully homomorphic encryption operations interface.

func (*Session) IsClosed

func (s *Session) IsClosed() bool

IsClosed returns true if the session has been closed.

func (*Session) Lattice

func (s *Session) Lattice() LatticeOps

Lattice returns the lattice cryptography operations interface.

func (*Session) ML

func (s *Session) ML() MLOps

ML returns the ML operations interface.

func (*Session) Sync

func (s *Session) Sync() error

Sync waits for all pending operations to complete.

func (*Session) SyncContext

func (s *Session) SyncContext(ctx context.Context) error

SyncContext waits for pending operations with context cancellation.

func (*Session) ZK

func (s *Session) ZK() ZKOps

ZK returns the zero-knowledge proof operations interface.

type SessionOption

type SessionOption func(*sessionConfig)

SessionOption configures session creation.

func WithAsync

func WithAsync(async bool) SessionOption

WithAsync enables asynchronous operation mode.

func WithBackend

func WithBackend(b BackendType) SessionOption

WithBackend specifies the backend to use.

func WithDevice

func WithDevice(index int) SessionOption

WithDevice specifies the device index within the backend.
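
SessionOption follows Go's functional options pattern: each option is a function that mutates a private configuration before the session is built. Since sessionConfig is unexported and its fields are not shown, the sketch below uses a hypothetical config struct with made-up fields and defaults to show how such options compose.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the unexported sessionConfig.
type config struct {
	backend string
	device  int
	async   bool
}

// option mirrors the shape of SessionOption.
type option func(*config)

func withBackend(b string) option { return func(c *config) { c.backend = b } }
func withDevice(i int) option     { return func(c *config) { c.device = i } }
func withAsync(a bool) option     { return func(c *config) { c.async = a } }

// newConfig starts from defaults (assumed here) and applies options in
// order, so a later option overrides an earlier one.
func newConfig(opts ...option) config {
	cfg := config{backend: "auto"}
	for _, o := range opts {
		o(&cfg)
	}
	return cfg
}

func main() {
	cfg := newConfig(withBackend("metal"), withDevice(1))
	fmt.Println(cfg.backend, cfg.device, cfg.async)
}
```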

type Tensor

type Tensor[T TensorElement] struct {
	// contains filtered or unexported fields
}

Tensor represents a multi-dimensional array in GPU memory. A Tensor is safe for concurrent reads but not for concurrent modification.

func NewTensor

func NewTensor[T TensorElement](s *Session, shape []int) (*Tensor[T], error)

NewTensor creates a new tensor with the given shape.

func NewTensorWithData

func NewTensorWithData[T TensorElement](s *Session, shape []int, data []T) (*Tensor[T], error)

NewTensorWithData creates a tensor initialized with data from a slice.

func (*Tensor[T]) Bytes

func (t *Tensor[T]) Bytes() int

Bytes returns the total byte size.

func (*Tensor[T]) Close

func (t *Tensor[T]) Close()

Close releases tensor resources.

func (*Tensor[T]) DType

func (t *Tensor[T]) DType() DType

DType returns the element data type.

func (*Tensor[T]) FromSlice

func (t *Tensor[T]) FromSlice(src []T) error

FromSlice copies data from a Go slice to the tensor.

func (*Tensor[T]) NDim

func (t *Tensor[T]) NDim() int

NDim returns the number of dimensions.

func (*Tensor[T]) NumEl

func (t *Tensor[T]) NumEl() int

NumEl returns the total number of elements.

func (*Tensor[T]) Shape

func (t *Tensor[T]) Shape() []int

Shape returns a copy of the tensor shape.

func (*Tensor[T]) ToSlice

func (t *Tensor[T]) ToSlice() ([]T, error)

ToSlice copies tensor data to a Go slice.

func (*Tensor[T]) Untyped

func (t *Tensor[T]) Untyped() *UntypedTensor

Untyped returns an untyped view of the tensor for passing to ops.

type TensorElement

type TensorElement interface {
	float32 | float64 | int32 | int64 | uint8 | uint32 | uint64
}

TensorElement is a type constraint for tensor element types.

type UntypedTensor

type UntypedTensor struct {
	// contains filtered or unexported fields
}

UntypedTensor provides type-erased tensor operations. Used internally and for dynamic typing scenarios.

func (*UntypedTensor) Bytes

func (t *UntypedTensor) Bytes() int

Bytes returns the total byte size.

func (*UntypedTensor) DType

func (t *UntypedTensor) DType() DType

DType returns the element data type.

func (*UntypedTensor) Handle

func (t *UntypedTensor) Handle() uintptr

Handle returns the raw tensor handle for CGO operations.

func (*UntypedTensor) NDim

func (t *UntypedTensor) NDim() int

NDim returns the number of dimensions.

func (*UntypedTensor) NumEl

func (t *UntypedTensor) NumEl() int

NumEl returns the total number of elements.

func (*UntypedTensor) Shape

func (t *UntypedTensor) Shape() []int

Shape returns a copy of the tensor shape.

type ZKOps

type ZKOps interface {
	// NTT performs Number Theoretic Transform.
	// input: [N] uint64 coefficients
	// output: [N] uint64 NTT values
	// roots: [N] uint64 roots of unity
	// modulus: prime modulus
	NTT(input, output, roots *UntypedTensor, modulus uint64) error

	// INTT performs inverse NTT.
	// input: [N] uint64 NTT values
	// output: [N] uint64 coefficients
	// invRoots: [N] uint64 inverse roots of unity
	// modulus: prime modulus
	INTT(input, output, invRoots *UntypedTensor, modulus uint64) error

	// MSM performs multi-scalar multiplication on elliptic curves.
	// scalars: [N, scalar_size] bytes
	// bases: [N, point_size] bytes (affine points)
	// result: [point_size] bytes
	MSM(scalars, bases, result *UntypedTensor) error

	// MSMBatch performs multiple MSMs in parallel.
	// scalars: [M, N, scalar_size] bytes
	// bases: [M, N, point_size] bytes
	// results: [M, point_size] bytes
	MSMBatch(scalars, bases, results *UntypedTensor) error

	// PolyMul multiplies polynomials in coefficient form.
	// a: [N] uint64 coefficients
	// b: [N] uint64 coefficients
	// c: [2N-1] uint64 result coefficients
	// modulus: prime modulus
	PolyMul(a, b, c *UntypedTensor, modulus uint64) error

	// PolyEval evaluates polynomial at given points.
	// coeffs: [degree+1] uint64
	// points: [N] uint64
	// results: [N] uint64
	// modulus: prime modulus
	PolyEval(coeffs, points, results *UntypedTensor, modulus uint64) error

	// CommitPoly computes polynomial commitment (KZG).
	// coeffs: [degree+1, field_size] bytes
	// srs: structured reference string
	// commitment: [point_size] bytes
	CommitPoly(coeffs, srs, commitment *UntypedTensor) error

	// FFT performs Fast Fourier Transform (complex).
	// input: [N, 2] float32 (real, imag)
	// output: [N, 2] float32
	FFT(input, output *UntypedTensor) error

	// IFFT performs inverse FFT.
	IFFT(input, output *UntypedTensor) error

	// FieldAdd adds field elements.
	// a: [N] uint64
	// b: [N] uint64
	// c: [N] uint64
	// modulus: prime modulus
	FieldAdd(a, b, c *UntypedTensor, modulus uint64) error

	// FieldMul multiplies field elements.
	FieldMul(a, b, c *UntypedTensor, modulus uint64) error

	// FieldInv computes modular inverse.
	FieldInv(a, b *UntypedTensor, modulus uint64) error
}

ZKOps provides GPU-accelerated zero-knowledge proof operations.
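When testing against an interface like this, a small CPU reference can help check GPU results and serve as a fallback when operations return ErrNoBackends. The sketch below is a naive reference in pure Go, not the library's implementation. It assumes that roots[i] holds the i-th power of a primitive N-th root of unity, and that the inverse transform scales by N^-1; neither layout nor scaling is stated in the interface comments. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addMod returns (a+b) mod m without overflowing uint64. Requires m > 0.
func addMod(a, b, m uint64) uint64 {
	a, b = a%m, b%m
	if a >= m-b {
		return a - (m - b)
	}
	return a + b
}

// mulMod returns (a*b) mod m via a 128-bit intermediate product.
func mulMod(a, b, m uint64) uint64 {
	hi, lo := bits.Mul64(a%m, b%m)
	_, rem := bits.Div64(hi, lo, m) // hi < m because both factors are < m
	return rem
}

// powMod returns base^exp mod m by square-and-multiply. For prime m,
// powMod(a, m-2, m) is the modular inverse of a (Fermat's little theorem).
func powMod(base, exp, m uint64) uint64 {
	result := uint64(1) % m
	for base %= m; exp > 0; exp >>= 1 {
		if exp&1 == 1 {
			result = mulMod(result, base, m)
		}
		base = mulMod(base, base, m)
	}
	return result
}

// powers returns [w^0, w^1, ..., w^(n-1)] mod p.
func powers(w uint64, n int, p uint64) []uint64 {
	out := make([]uint64, n)
	for i := range out {
		out[i] = powMod(w, uint64(i), p)
	}
	return out
}

// nttNaive computes out[k] = sum_j in[j] * roots[(j*k) mod N] mod p in
// O(N^2) time. Assumes roots[i] = w^i (natural order).
func nttNaive(in, roots []uint64, p uint64) []uint64 {
	n := len(in)
	out := make([]uint64, n)
	for k := 0; k < n; k++ {
		for j := 0; j < n; j++ {
			out[k] = addMod(out[k], mulMod(in[j], roots[(j*k)%n], p), p)
		}
	}
	return out
}

// inttNaive applies the same sum with inverse roots, then scales by N^-1.
func inttNaive(in, invRoots []uint64, p uint64) []uint64 {
	out := nttNaive(in, invRoots, p)
	nInv := powMod(uint64(len(in)), p-2, p)
	for i := range out {
		out[i] = mulMod(out[i], nInv, p)
	}
	return out
}

// polyMul is schoolbook multiplication of non-empty coefficient slices;
// two length-N inputs give 2N-1 output coefficients.
func polyMul(a, b []uint64, p uint64) []uint64 {
	c := make([]uint64, len(a)+len(b)-1)
	for i, ai := range a {
		for j, bj := range b {
			c[i+j] = addMod(c[i+j], mulMod(ai, bj, p), p)
		}
	}
	return c
}

func main() {
	const p, w = 17, 4 // 4 is a primitive 4th root of unity mod 17
	wInv := powMod(w, p-2, p)
	coeffs := []uint64{1, 2, 3, 4}
	vals := nttNaive(coeffs, powers(w, 4, p), p)
	fmt.Println(vals)                                       // [10 7 15 6]
	fmt.Println(inttNaive(vals, powers(wInv, 4, p), p))     // [1 2 3 4]
	fmt.Println(polyMul([]uint64{1, 2}, []uint64{3, 4}, p)) // [3 10 8]
}
```

The quadratic loops favor clarity over speed, so this reference suits small N in tests rather than production workloads.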

Directories

Path    Synopsis
internal/capi    Package capi provides CGO bindings to the lux-accel C library.
ops/consensus    Package consensus provides GPU-accelerated consensus primitives.
ops/crypto    Package crypto provides GPU-accelerated cryptographic operations.
ops/dex    Package dex provides GPU-accelerated DEX operations.
ops/fhe    Package fhe provides GPU-accelerated Fully Homomorphic Encryption operations.
ops/lattice    Package lattice provides GPU-accelerated lattice cryptography operations.
ops/zk    Package zk provides GPU-accelerated zero-knowledge proof operations.
