ane

package
v0.4.4 Latest
Warning

This package is not in the latest version of its module.
Published: Mar 21, 2026 | License: MIT | Imports: 3 | Imported by: 1

Documentation

Overview

Package ane provides high-level access to the Apple Neural Engine.

Open a Client, compile MIL programs into Model values, and evaluate them on the ANE hardware. IOSurface buffers are managed automatically.

c, err := ane.Open()
if err != nil {
	log.Fatal(err)
}
defer c.Close()

m, err := c.Compile(ane.CompileOptions{
	MILText:    milText,
	WeightBlob: blob,
	ModelType:  ane.ModelTypeMIL,
})
if err != nil {
	log.Fatal(err)
}
defer m.Close()

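// The typed I/O methods return errors too; check them.
if err := m.WriteInputF32(0, input); err != nil {
	log.Fatal(err)
}
if err := m.Eval(); err != nil {
	log.Fatal(err)
}
if err := m.ReadOutputF32(0, output); err != nil {
	log.Fatal(err)
}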

Compilation

There are two compilation paths. ModelTypeMIL compiles MIL text and weights in memory:

MIL text + weights → ANEInMemoryModel → compile → load → Model

ModelTypePackage loads a pre-compiled .mlmodelc package from disk:

.mlmodelc path → ANEModel → compile → load → Model

After compilation the ANE reports its expected memory layout via model attributes. IOSurfaces are created to match these layouts exactly. Surface sizes, channels, and spatial dimensions are never specified manually — they are parsed from the compiled model.

Client.CompileWithStats returns a CompileStats with wall-clock compile and load phase durations.

Tensor Layout

Logical tensor data is in channel-first (NCHW) order, so element x of channel c is data[c*width + x]. In the IOSurface, that element sits at byte offset c*PlaneStride + x*ElemSize. The minimum row stride is 64 bytes (the ANE's alignment granularity).

TensorLayout describes the memory layout for each input and output. AllocSize (Channels * PlaneStride) includes stride padding and is always >= the logical data size. Typed I/O methods (Model.WriteInputF32, Model.ReadOutputF32, etc.) accept logical element counts and handle stride padding internally. Raw I/O (Model.WriteInput, Model.ReadOutput) requires exactly AllocSize bytes.
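
The offset arithmetic above can be illustrated with a small standalone sketch. It does not import this package: the layout type below is a hypothetical stand-in that carries only the fields the formulas need, and the example values (2 channels, width 3, float32 elements, 64-byte planes) are assumptions chosen for illustration.

```go
package main

import "fmt"

// layout is a hypothetical stand-in for TensorLayout, reduced to the
// fields used by the documented formulas.
type layout struct {
	Channels    int
	Width       int
	ElemSize    int
	PlaneStride int
}

// allocSize mirrors the documented relation AllocSize = Channels * PlaneStride.
func (l layout) allocSize() int { return l.Channels * l.PlaneStride }

// byteOffset returns the documented byte offset of element x in channel c.
func (l layout) byteOffset(c, x int) int { return c*l.PlaneStride + x*l.ElemSize }

// packOffsets maps each logical index c*Width + x to its byte offset.
func packOffsets(l layout) []int {
	offs := make([]int, 0, l.Channels*l.Width)
	for c := 0; c < l.Channels; c++ {
		for x := 0; x < l.Width; x++ {
			offs = append(offs, l.byteOffset(c, x))
		}
	}
	return offs
}

func main() {
	// Assumed example: 2 channels of 3 float32 values, planes padded to 64 bytes.
	l := layout{Channels: 2, Width: 3, ElemSize: 4, PlaneStride: 64}
	fmt.Println("logical bytes:", l.Channels*l.Width*l.ElemSize)
	fmt.Println("alloc size:  ", l.allocSize())
	fmt.Println("offsets:     ", packOffsets(l))
}
```

In this example the logical data is 24 bytes while the padded allocation is 128 bytes, which is why the raw I/O methods and the typed I/O methods expect different sizes.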

Shared Events

SharedEvent enables ANE↔GPU/CPU synchronization via IOSurface shared events. Use Model.EvalWithSignalEvent, Model.EvalBidirectional, and related methods for pipelined evaluation across compute domains.
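
As a mental model only, a shared event can be thought of as a monotonically increasing counter that one side signals and the other side waits on. The sketch below is a CPU-only analog written with the standard library. The counterEvent type, its ignore-lower-values rule, and the "wait until the counter reaches the value" semantics are assumptions for illustration; they are not taken from this package's implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// counterEvent is a hypothetical CPU-only analog of a shared event:
// a monotonically increasing counter that waiters can block on.
type counterEvent struct {
	mu      sync.Mutex
	value   uint64
	changed chan struct{} // closed and replaced on every increase
}

func newCounterEvent() *counterEvent {
	return &counterEvent{changed: make(chan struct{})}
}

// Signal raises the counter to v. Lower or equal values are ignored (assumed).
func (e *counterEvent) Signal(v uint64) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if v > e.value {
		e.value = v
		close(e.changed)
		e.changed = make(chan struct{})
	}
}

// Wait blocks until the counter reaches v or the timeout expires.
func (e *counterEvent) Wait(v uint64, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		e.mu.Lock()
		reached, ch := e.value >= v, e.changed
		e.mu.Unlock()
		if reached {
			return true
		}
		select {
		case <-ch: // counter changed; re-check
		case <-timer.C:
			return false
		}
	}
}

// demo has a producer signal 1 while a consumer waits for 1, then for 2.
func demo() (reached, timedOut bool) {
	e := newCounterEvent()
	go e.Signal(1)
	reached = e.Wait(1, time.Second)
	timedOut = !e.Wait(2, 10*time.Millisecond)
	return reached, timedOut
}

func main() {
	fmt.Println(demo())
}
```

In a pipelined setup, each stage would wait for the previous stage's value and signal the next one, so the counter doubles as a step index.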

Telemetry

For hardware performance counters, diagnostics, and runtime snapshots, see the github.com/tmc/apple/x/ane/telemetry package.

Index

Constants

This section is empty.

Variables

var (
	ErrNoANE                    = errors.New("ane: no ANE hardware available")
	ErrCompileBudgetExhausted   = errors.New("ane: compile budget exhausted")
	ErrMapFailed                = errors.New("ane: IOSurface mapping failed")
	ErrUnsupportedSelector      = errors.New("ane: unsupported selector")
	ErrVirtualClientUnavailable = errors.New("ane: virtual client unavailable")
	ErrModelLoad                = errors.New("ane: model load failed")
	ErrEval                     = errors.New("ane: evaluation failed")
	ErrUnsupportedLayout        = errors.New("ane: unsupported tensor layout")
)

Functions

func EnsureANELoaded added in v0.3.0

func EnsureANELoaded() error

EnsureANELoaded loads the AppleNeuralEngine private framework once.

func EnsureCoreMLLoaded added in v0.3.0

func EnsureCoreMLLoaded() error

EnsureCoreMLLoaded loads the CoreML framework once.

func EnsureEspressoLoaded deprecated added in v0.3.0

func EnsureEspressoLoaded() error

EnsureEspressoLoaded loads the Espresso private framework once.

Deprecated: new code should prefer github.com/tmc/apple/x/espresso.

func FP16ToFloat32

func FP16ToFloat32(h uint16) float32

FP16ToFloat32 converts an IEEE 754 half-precision value to float32.
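
For readers unfamiliar with the half-precision format, the following is a minimal standalone sketch of such a decode using only the standard library. The halfToFloat32 function is a hypothetical illustration of the IEEE 754 binary16 bit layout (1 sign bit, 5 exponent bits, 10 fraction bits); it is not this package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// halfToFloat32 decodes an IEEE 754 binary16 bit pattern (hypothetical helper).
func halfToFloat32(h uint16) float32 {
	sign := uint32(h>>15) & 1
	exp := uint32(h>>10) & 0x1f
	frac := uint32(h) & 0x3ff

	switch {
	case exp == 0 && frac == 0:
		// Signed zero.
		return math.Float32frombits(sign << 31)
	case exp == 0:
		// Subnormal: value is frac * 2^-24.
		v := float32(frac) * (1.0 / (1 << 24))
		if sign == 1 {
			v = -v
		}
		return v
	case exp == 0x1f:
		// Infinity or NaN: keep the payload bits.
		return math.Float32frombits(sign<<31 | 0xff<<23 | frac<<13)
	default:
		// Normal: rebias the exponent from 15 to 127 (add 112).
		return math.Float32frombits(sign<<31 | (exp+112)<<23 | frac<<13)
	}
}

func main() {
	for _, h := range []uint16{0x3c00, 0xc000, 0x7bff, 0x0001} {
		fmt.Printf("%#04x -> %g\n", h, halfToFloat32(h))
	}
}
```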

func Float32ToFP16

func Float32ToFP16(f float32) uint16

Float32ToFP16 converts a float32 to IEEE 754 half-precision.

func RowStrideFor

func RowStrideFor(width, elemSize int) int

RowStrideFor computes the minimum 64-byte-aligned row stride.
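
One plausible reading of this computation is "round width*elemSize up to the next multiple of 64". The sketch below implements that reading; the rowStride helper is hypothetical and the rounding rule is an assumption, so consult the source before relying on edge cases such as a zero width.

```go
package main

import "fmt"

// alignment is the documented ANE alignment granularity in bytes.
const alignment = 64

// rowStride rounds width*elemSize up to a multiple of alignment.
// Hypothetical helper; the rounding rule is an assumption.
func rowStride(width, elemSize int) int {
	n := width * elemSize
	return (n + alignment - 1) / alignment * alignment
}

func main() {
	fmt.Println(rowStride(3, 4))   // 12 bytes of data
	fmt.Println(rowStride(17, 4))  // 68 bytes of data
	fmt.Println(rowStride(100, 2)) // 200 bytes of data
}
```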

Types

type ANEError

type ANEError struct {
	Op    string // operation that failed
	Class string // error domain or class
	Code  int    // error code
	Err   error  // underlying error
}

ANEError wraps an error from the ANE subsystem with context.

func (*ANEError) Error

func (e *ANEError) Error() string

func (*ANEError) Is

func (e *ANEError) Is(target error) bool

func (*ANEError) Unwrap

func (e *ANEError) Unwrap() error
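
Because ANEError exposes Unwrap, callers can match it with the standard errors.Is and errors.As helpers. The sketch below shows the pattern with local stand-ins: errEval and opError are hypothetical mirrors of a sentinel and of ANEError's documented fields, and the custom Is method is omitted, so this is an illustration of the pattern rather than the package's behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// errEval is a local stand-in for a sentinel such as ErrEval.
var errEval = errors.New("ane: evaluation failed")

// opError is a hypothetical mirror of ANEError's documented fields.
type opError struct {
	Op    string
	Class string
	Code  int
	Err   error
}

func (e *opError) Error() string {
	return fmt.Sprintf("%s: %s code %d: %v", e.Op, e.Class, e.Code, e.Err)
}

// Unwrap lets errors.Is and errors.As see the underlying error.
func (e *opError) Unwrap() error { return e.Err }

// classify reports whether err matches the sentinel and returns the code
// of the first *opError in the chain, or -1 if there is none.
func classify(err error) (isEval bool, code int) {
	code = -1
	var oe *opError
	if errors.As(err, &oe) {
		code = oe.Code
	}
	return errors.Is(err, errEval), code
}

// sample builds a wrapped error with made-up field values.
func sample() error {
	inner := &opError{Op: "eval", Class: "hypothetical.domain", Code: 7, Err: errEval}
	return fmt.Errorf("step 3: %w", inner)
}

func main() {
	fmt.Println(classify(sample()))
}
```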

type Client added in v0.3.0

type Client struct{}

Client manages a connection to the ANE hardware.

func Open

func Open() (*Client, error)

Open creates a new Client for ANE inference. On non-Darwin platforms, it always returns ErrNoANE.

func (*Client) ClientObjcID added in v0.3.0

func (c *Client) ClientObjcID() uintptr

func (*Client) Close added in v0.3.0

func (c *Client) Close() error

Close releases the client resources.

func (*Client) Compile added in v0.3.0

func (c *Client) Compile(opts CompileOptions) (*Model, error)

Compile compiles a model and returns a ready-to-evaluate Model.

func (*Client) CompileCount added in v0.3.0

func (c *Client) CompileCount() int64

CompileCount returns the number of compilations performed.

func (*Client) CompileWithStats added in v0.3.0

func (c *Client) CompileWithStats(opts CompileOptions) (*Model, CompileStats, error)

CompileWithStats compiles a model and returns a Model along with compilation timing.

func (*Client) Info added in v0.3.0

func (c *Client) Info() DeviceInfo

Info returns the device information for this client.

type CompileOptions

type CompileOptions struct {
	ModelType ModelType

	// For ModelTypeMIL:
	MILText []byte // MIL program text
	// Legacy single-weight MIL fields.
	WeightBlob []byte // weight binary blob (may be nil)
	WeightPath string // model path key for weight dict (default: "@model_path/weights/weight.bin")
	// WeightFiles allows MIL graphs with multiple named BLOBFILE inputs.
	// If both WeightBlob and WeightFiles are set, they are merged.
	WeightFiles []WeightFile

	// For ModelTypePackage:
	PackagePath string // path to .mlmodelc directory
	ModelKey    string // key for model dictionary (default: "s")

	// QoS selects the ANE quality-of-service scheduling class.
	// Zero uses the default value of 21. Higher values may receive
	// priority on shared hardware.
	QoS uint32

	// PerfStatsMask enables hardware performance counters during evaluation.
	// Zero disables stats collection. When non-zero, telemetry.EvalWithStats
	// returns counter names and timing data. Valid masks are 4 bits wide
	// (0x1–0xF); higher values are silently converted to zero by the driver.
	PerfStatsMask uint32
}

CompileOptions configures model compilation.
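
Since out-of-range PerfStatsMask values are silently zeroed by the driver, it may be worth validating options before compiling. The sketch below shows one way to do that in standalone form; effectiveQoS and checkPerfStatsMask are hypothetical helpers that encode only the defaults and ranges stated in the field comments above.

```go
package main

import "fmt"

const (
	defaultQoS       = 21  // documented default used when QoS is zero
	perfStatsMaskMax = 0xF // documented 4-bit mask range
)

// effectiveQoS returns the QoS value that a zero field would resolve to.
func effectiveQoS(q uint32) uint32 {
	if q == 0 {
		return defaultQoS
	}
	return q
}

// checkPerfStatsMask rejects masks that the driver would silently zero.
func checkPerfStatsMask(mask uint32) error {
	if mask > perfStatsMaskMax {
		return fmt.Errorf("perf stats mask %#x exceeds 4 bits and would be treated as 0", mask)
	}
	return nil
}

func main() {
	fmt.Println(effectiveQoS(0), effectiveQoS(33))
	fmt.Println(checkPerfStatsMask(0x3))
	fmt.Println(checkPerfStatsMask(0x10))
}
```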

type CompileStats added in v0.2.2

type CompileStats struct {
	CompileNS int64 // wall-clock compile phase
	LoadNS    int64 // wall-clock load phase
	TotalNS   int64 // wall-clock total
}

CompileStats contains timing measurements from model compilation.

func (CompileStats) Available added in v0.2.2

func (s CompileStats) Available() bool

Available reports whether any timing data was collected.

func (CompileStats) ReportMetrics added in v0.2.2

func (s CompileStats) ReportMetrics(b interface{ ReportMetric(float64, string) })

ReportMetrics reports compilation timing to a testing.B-compatible reporter.

func (CompileStats) String added in v0.2.2

func (s CompileStats) String() string

String returns a compact human-readable summary of compile timing.
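
The nanosecond fields convert directly to time.Duration, and any type with a ReportMetric(float64, string) method can act as the reporter. The sketch below illustrates both points with local stand-ins: the stats type mirrors the documented fields, while summary, report, recorder, and the metric unit names are hypothetical and do not reproduce the package's actual output format.

```go
package main

import (
	"fmt"
	"time"
)

// stats is a hypothetical mirror of CompileStats' documented fields.
type stats struct {
	CompileNS int64
	LoadNS    int64
	TotalNS   int64
}

// summary formats the phases as durations (format is an assumption).
func summary(s stats) string {
	return fmt.Sprintf("compile=%v load=%v total=%v",
		time.Duration(s.CompileNS), time.Duration(s.LoadNS), time.Duration(s.TotalNS))
}

// recorder satisfies interface{ ReportMetric(float64, string) } without
// depending on the testing package.
type recorder struct{ metrics map[string]float64 }

func (r *recorder) ReportMetric(v float64, unit string) { r.metrics[unit] = v }

// report sends per-phase milliseconds to b. Unit names are made up here.
func report(s stats, b interface{ ReportMetric(float64, string) }) {
	b.ReportMetric(float64(s.CompileNS)/1e6, "compile-ms")
	b.ReportMetric(float64(s.LoadNS)/1e6, "load-ms")
}

func main() {
	s := stats{CompileNS: 1500000, LoadNS: 2500000, TotalNS: 4000000}
	fmt.Println(summary(s))

	r := &recorder{metrics: map[string]float64{}}
	report(s, r)
	fmt.Println(r.metrics)
}
```

In a benchmark, a *testing.B value could be passed in place of the recorder, since it has a matching ReportMetric method.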

type DeviceInfo

type DeviceInfo struct {
	HasANE        bool
	NumCores      uint32
	Architecture  string
	Product       string
	BuildVersion  string
	IsVM          bool
	NumANEs       uint32
	SubType       string
	BoardType     int64
	InternalBuild bool
}

DeviceInfo describes the ANE hardware present on the system.

func Probe

func Probe() (DeviceInfo, error)

Probe returns device information about the ANE. On non-Darwin platforms, it always reports no ANE.

type MetalDevice

type MetalDevice struct{}

MetalDevice wraps a Metal GPU device for zero-copy interop with ANE.

func OpenMetal

func OpenMetal() (*MetalDevice, error)

func (*MetalDevice) Close

func (d *MetalDevice) Close() error

func (*MetalDevice) MetalSharedEvent

func (d *MetalDevice) MetalSharedEvent(ev *SharedEvent) (any, error)

func (*MetalDevice) NewMetalSharedEvent

func (d *MetalDevice) NewMetalSharedEvent() (any, *SharedEvent, error)

type Model added in v0.3.0

type Model struct{}

Model represents a compiled and loaded model ready for evaluation.

func (*Model) Close added in v0.3.0

func (m *Model) Close() error

func (*Model) CompileModelType added in v0.3.0

func (m *Model) CompileModelType() ModelType

func (*Model) Eval added in v0.3.0

func (m *Model) Eval() error

func (*Model) EvalAsync added in v0.3.0

func (m *Model) EvalAsync() <-chan error

func (*Model) EvalAsyncWithCallback added in v0.3.0

func (m *Model) EvalAsyncWithCallback(fn func(error))

func (*Model) EvalBidirectional added in v0.3.0

func (m *Model) EvalBidirectional(waitPort uint32, waitValue uint64, signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error

func (*Model) EvalWithSignalEvent added in v0.3.0

func (m *Model) EvalWithSignalEvent(signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error

func (*Model) InMemModelObjcID added in v0.3.0

func (m *Model) InMemModelObjcID() uintptr

func (*Model) InputAllocSize added in v0.3.0

func (m *Model) InputAllocSize(i int) int

func (*Model) InputChannels added in v0.3.0

func (m *Model) InputChannels(i int) int

func (*Model) InputLayout added in v0.3.0

func (m *Model) InputLayout(i int) TensorLayout

func (*Model) InputSurface added in v0.3.0

func (m *Model) InputSurface(i int) uintptr

func (*Model) InputSurfaces added in v0.3.0

func (m *Model) InputSurfaces() []uintptr

func (*Model) MetalInputBuffer added in v0.3.0

func (m *Model) MetalInputBuffer(d *MetalDevice, i int) (any, error)

func (*Model) MetalOutputBuffer added in v0.3.0

func (m *Model) MetalOutputBuffer(d *MetalDevice, i int) (any, error)

func (*Model) ModelObjcID added in v0.3.0

func (m *Model) ModelObjcID() uintptr

func (*Model) NumInputs added in v0.3.0

func (m *Model) NumInputs() int

func (*Model) NumOutputs added in v0.3.0

func (m *Model) NumOutputs() int

func (*Model) OutputAllocSize added in v0.3.0

func (m *Model) OutputAllocSize(i int) int

func (*Model) OutputChannels added in v0.3.0

func (m *Model) OutputChannels(i int) int

func (*Model) OutputLayout added in v0.3.0

func (m *Model) OutputLayout(i int) TensorLayout

func (*Model) OutputSurface added in v0.3.0

func (m *Model) OutputSurface(i int) uintptr

func (*Model) OutputSurfaces added in v0.3.0

func (m *Model) OutputSurfaces() []uintptr

func (*Model) RawPerfStatsMask added in v0.3.0

func (m *Model) RawPerfStatsMask() uint32

func (*Model) RawRequest added in v0.3.0

func (m *Model) RawRequest() uintptr

func (*Model) ReadOutput added in v0.3.0

func (m *Model) ReadOutput(i int, data []byte) error

func (*Model) ReadOutputF32 added in v0.3.0

func (m *Model) ReadOutputF32(i int, data []float32) error

func (*Model) ReadOutputFP16 added in v0.3.0

func (m *Model) ReadOutputFP16(i int, data []float32) error

func (*Model) ReadOutputFP16Channels added in v0.3.0

func (m *Model) ReadOutputFP16Channels(i, channel int, data []float32) error

func (*Model) Spatial added in v0.3.0

func (m *Model) Spatial(i int) int

func (*Model) WriteInput added in v0.3.0

func (m *Model) WriteInput(i int, data []byte) error

func (*Model) WriteInputF32 added in v0.3.0

func (m *Model) WriteInputF32(i int, data []float32) error

func (*Model) WriteInputFP16 added in v0.3.0

func (m *Model) WriteInputFP16(i int, data []float32) error

func (*Model) WriteInputFP16Channels added in v0.3.0

func (m *Model) WriteInputFP16Channels(i, channel int, data []float32) error

type ModelType

type ModelType int

ModelType selects the compilation path.

const (
	ModelTypeMIL     ModelType = iota // In-memory MIL text + weights
	ModelTypePackage                  // On-disk .mlmodelc package
)

type Pipeline

type Pipeline struct{}

Pipeline manages SharedEvent synchronization between ANE and Metal.

func NewPipeline

func NewPipeline(_ *MetalDevice) (*Pipeline, error)

func (*Pipeline) ANEEvent

func (p *Pipeline) ANEEvent() *SharedEvent

func (*Pipeline) ANEToMetal

func (p *Pipeline) ANEToMetal(_ *Model) error

func (*Pipeline) Bidirectional

func (p *Pipeline) Bidirectional(_ *Model) error

func (*Pipeline) Close

func (p *Pipeline) Close() error

func (*Pipeline) Counter

func (p *Pipeline) Counter() uint64

func (*Pipeline) Metal

func (p *Pipeline) Metal() *MetalDevice

func (*Pipeline) MetalEvent

func (p *Pipeline) MetalEvent() any

func (*Pipeline) WaitOnANE

func (p *Pipeline) WaitOnANE(_ *Model) error

type PooledRequest

type PooledRequest struct{}

PooledRequest is a request checked out from a pool.

func (*PooledRequest) Eval

func (pr *PooledRequest) Eval() error

func (*PooledRequest) Release

func (pr *PooledRequest) Release()

type RequestPool

type RequestPool struct{}

RequestPool pre-allocates a ring of ANE requests for pipelined evaluation.

func NewRequestPool

func NewRequestPool(m *Model, depth int) (*RequestPool, error)

func (*RequestPool) Acquire

func (p *RequestPool) Acquire() *PooledRequest

func (*RequestPool) Close

func (p *RequestPool) Close() error
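
The general idea of a fixed-depth pool can be sketched with a buffered channel: slots are pre-allocated once, acquire takes one out, and release puts it back. The pool and slot types below are hypothetical, and the blocking behavior of acquire when the pool is empty is an assumption; this package's RequestPool may behave differently.

```go
package main

import "fmt"

// slot is a hypothetical stand-in for a pre-allocated request.
type slot struct{ id int }

// pool hands out a fixed number of slots through a buffered channel.
type pool struct{ free chan *slot }

// newPool pre-allocates depth slots.
func newPool(depth int) (*pool, error) {
	if depth <= 0 {
		return nil, fmt.Errorf("depth must be positive, got %d", depth)
	}
	p := &pool{free: make(chan *slot, depth)}
	for i := 0; i < depth; i++ {
		p.free <- &slot{id: i}
	}
	return p, nil
}

// acquire blocks until a slot is free (assumed behavior).
func (p *pool) acquire() *slot { return <-p.free }

// release returns a slot to the pool for reuse.
func (p *pool) release(s *slot) { p.free <- s }

func main() {
	p, err := newPool(2)
	if err != nil {
		panic(err)
	}
	a, b := p.acquire(), p.acquire()
	fmt.Println("in flight:", a.id, b.id)
	p.release(a)
	fmt.Println("reused:", p.acquire().id)
}
```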

type SharedEvent

type SharedEvent struct{}

SharedEvent wraps an IOSurfaceSharedEvent for ANE↔GPU/CPU synchronization.

func NewSharedEvent

func NewSharedEvent() (*SharedEvent, error)

func SharedEventFromPort

func SharedEventFromPort(port uint32) (*SharedEvent, error)

func (*SharedEvent) Close

func (e *SharedEvent) Close() error

func (*SharedEvent) Port

func (e *SharedEvent) Port() uint32

func (*SharedEvent) Signal

func (e *SharedEvent) Signal(value uint64)

func (*SharedEvent) SignaledValue

func (e *SharedEvent) SignaledValue() uint64

func (*SharedEvent) TimeWait added in v0.2.2

func (e *SharedEvent) TimeWait(value uint64, timeout time.Duration) (bool, time.Duration)

func (*SharedEvent) Wait

func (e *SharedEvent) Wait(value uint64, timeout time.Duration) bool

type SharedEventEvalOptions

type SharedEventEvalOptions struct {
	DisableIOFencesUseSharedEvents bool // set kANEFDisableIOFencesUseSharedEventsKey
	EnableFWToFWSignal             bool // set kANEFEnableFWToFWSignal (keep false for ANE→Metal on physical hosts)
}

SharedEventEvalOptions configures shared event evaluation behavior.

type StateHandle

type StateHandle struct{}

StateHandle manages KV cache state for stateful MIL models.

func NewStateHandle

func NewStateHandle(_ *Model, _ int) *StateHandle

func (*StateHandle) Advance

func (s *StateHandle) Advance(_ int)

func (*StateHandle) Close

func (s *StateHandle) Close() error

func (*StateHandle) MaxSeq

func (s *StateHandle) MaxSeq() int

func (*StateHandle) Model added in v0.3.0

func (s *StateHandle) Model() *Model

func (*StateHandle) Position

func (s *StateHandle) Position() int

func (*StateHandle) Remaining

func (s *StateHandle) Remaining() int

func (*StateHandle) Reset

func (s *StateHandle) Reset()
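
The method names suggest a sequence cursor over a fixed-capacity cache. The sketch below models that bookkeeping in isolation. The cursor type is hypothetical, and its rules (the second constructor argument is the capacity, Advance clamps to [0, MaxSeq], Remaining is MaxSeq minus Position) are assumptions, not documented behavior.

```go
package main

import "fmt"

// cursor is a hypothetical model of position tracking over a cache
// that holds at most maxSeq tokens.
type cursor struct {
	pos    int
	maxSeq int
}

// advance moves the position by n, clamped to [0, maxSeq] (assumed).
func (c *cursor) advance(n int) {
	c.pos += n
	if c.pos > c.maxSeq {
		c.pos = c.maxSeq
	}
	if c.pos < 0 {
		c.pos = 0
	}
}

// remaining reports how many positions are left before the cache is full.
func (c *cursor) remaining() int { return c.maxSeq - c.pos }

// reset rewinds to the start, as when beginning a new sequence.
func (c *cursor) reset() { c.pos = 0 }

func main() {
	c := &cursor{maxSeq: 8}
	c.advance(5) // e.g. a 5-token prompt
	c.advance(1) // one generated token
	fmt.Println(c.pos, c.remaining())
	c.reset()
	fmt.Println(c.pos, c.remaining())
}
```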

type TensorLayout

type TensorLayout struct {
	Channels    int
	Width       int
	Height      int
	ElemSize    int
	RowStride   int
	PlaneStride int
}

TensorLayout describes the compiled model's memory layout for a single tensor.

func (TensorLayout) AllocSize

func (l TensorLayout) AllocSize() int

func (TensorLayout) LogicalBytes

func (l TensorLayout) LogicalBytes() int

func (TensorLayout) LogicalElements

func (l TensorLayout) LogicalElements() int

type WeightFile

type WeightFile struct {
	Path string
	Blob []byte
}

WeightFile describes a named MIL BLOBFILE entry.

Directories

Path            Synopsis
dynamicmatmul   Package dynamicmatmul provides compile-once ANE matmul kernels with runtime-provided weights.
linear          Package linear provides reusable ANE-backed linear forward execution.
mil             Package mil generates MIL (Model Intermediate Language) programs and weight blobs for Apple Neural Engine compilation.
telemetry       Package telemetry provides diagnostic and performance measurement types for the Apple Neural Engine.