ane

package

Version: v0.4.4 (latest) | Published: Mar 21, 2026 | License: MIT | Imports: 3 | Imported by: 1

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/tmc/apple

Documentation ¶

Overview ¶

Package ane provides high-level access to the Apple Neural Engine.

Open a Client, compile MIL programs into Model values, and evaluate them on the ANE hardware. IOSurface buffers are managed automatically.

c, err := ane.Open()
if err != nil {
	log.Fatal(err)
}
defer c.Close()

m, err := c.Compile(ane.CompileOptions{
	MILText:    milText,
	WeightBlob: blob,
	ModelType:  ane.ModelTypeMIL,
})
if err != nil {
	log.Fatal(err)
}
defer m.Close()

if err := m.WriteInputF32(0, input); err != nil {
	log.Fatal(err)
}
if err := m.Eval(); err != nil {
	log.Fatal(err)
}
if err := m.ReadOutputF32(0, output); err != nil {
	log.Fatal(err)
}

Compilation ¶

There are two compilation paths. ModelTypeMIL compiles MIL text and weights in memory:

MIL text + weights → ANEInMemoryModel → compile → load → Model

ModelTypePackage loads a pre-compiled .mlmodelc package from disk:

.mlmodelc path → ANEModel → compile → load → Model

After compilation the ANE reports its expected memory layout via model attributes. IOSurfaces are created to match these layouts exactly. Surface sizes, channels, and spatial dimensions are never specified manually — they are parsed from the compiled model.

Client.CompileWithStats returns a CompileStats with wall-clock compile and load phase durations.

Tensor Layout ¶

Data is in channel-first (NCHW) order: data[c*width + x]. The byte offset in the IOSurface is c*PlaneStride + x*ElemSize. The minimum row stride is 64 bytes (the ANE's alignment granularity).

TensorLayout describes the memory layout for each input and output. AllocSize (Channels * PlaneStride) includes stride padding and is always >= the logical data size. Typed I/O methods (Model.WriteInputF32, Model.ReadOutputF32, etc.) accept logical element counts and handle stride padding internally. Raw I/O (Model.WriteInput, Model.ReadOutput) requires exactly AllocSize bytes.

Shared Events ¶

SharedEvent enables ANE↔GPU/CPU synchronization via IOSurface shared events. Use Model.EvalWithSignalEvent, Model.EvalBidirectional, and related methods for pipelined evaluation across compute domains.

Telemetry ¶

For hardware performance counters, diagnostics, and runtime snapshots, see the github.com/tmc/apple/x/ane/telemetry package.

Index ¶

Variables
func EnsureANELoaded() error
func EnsureCoreMLLoaded() error
func EnsureEspressoLoaded() error (deprecated)
func FP16ToFloat32(h uint16) float32
func Float32ToFP16(f float32) uint16
func RowStrideFor(width, elemSize int) int
type ANEError
- func (e *ANEError) Error() string
- func (e *ANEError) Is(target error) bool
- func (e *ANEError) Unwrap() error
type Client
- func Open() (*Client, error)
- func (c *Client) ClientObjcID() uintptr
- func (c *Client) Close() error
- func (c *Client) Compile(opts CompileOptions) (*Model, error)
- func (c *Client) CompileCount() int64
- func (c *Client) CompileWithStats(opts CompileOptions) (*Model, CompileStats, error)
- func (c *Client) Info() DeviceInfo
type CompileOptions
type CompileStats
- func (s CompileStats) Available() bool
- func (s CompileStats) ReportMetrics(b interface{ ... })
- func (s CompileStats) String() string
type DeviceInfo
- func Probe() (DeviceInfo, error)
type MetalDevice
- func OpenMetal() (*MetalDevice, error)
- func (d *MetalDevice) Close() error
- func (d *MetalDevice) MetalSharedEvent(ev *SharedEvent) (any, error)
- func (d *MetalDevice) NewMetalSharedEvent() (any, *SharedEvent, error)
type Model
- func (m *Model) Close() error
- func (m *Model) CompileModelType() ModelType
- func (m *Model) Eval() error
- func (m *Model) EvalAsync() <-chan error
- func (m *Model) EvalAsyncWithCallback(fn func(error))
- func (m *Model) EvalBidirectional(waitPort uint32, waitValue uint64, signalPort uint32, signalValue uint64, ...) error
- func (m *Model) EvalWithSignalEvent(signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error
- func (m *Model) InMemModelObjcID() uintptr
- func (m *Model) InputAllocSize(i int) int
- func (m *Model) InputChannels(i int) int
- func (m *Model) InputLayout(i int) TensorLayout
- func (m *Model) InputSurface(i int) uintptr
- func (m *Model) InputSurfaces() []uintptr
- func (m *Model) MetalInputBuffer(d *MetalDevice, i int) (any, error)
- func (m *Model) MetalOutputBuffer(d *MetalDevice, i int) (any, error)
- func (m *Model) ModelObjcID() uintptr
- func (m *Model) NumInputs() int
- func (m *Model) NumOutputs() int
- func (m *Model) OutputAllocSize(i int) int
- func (m *Model) OutputChannels(i int) int
- func (m *Model) OutputLayout(i int) TensorLayout
- func (m *Model) OutputSurface(i int) uintptr
- func (m *Model) OutputSurfaces() []uintptr
- func (m *Model) RawPerfStatsMask() uint32
- func (m *Model) RawRequest() uintptr
- func (m *Model) ReadOutput(i int, data []byte) error
- func (m *Model) ReadOutputF32(i int, data []float32) error
- func (m *Model) ReadOutputFP16(i int, data []float32) error
- func (m *Model) ReadOutputFP16Channels(i, channel int, data []float32) error
- func (m *Model) Spatial(i int) int
- func (m *Model) WriteInput(i int, data []byte) error
- func (m *Model) WriteInputF32(i int, data []float32) error
- func (m *Model) WriteInputFP16(i int, data []float32) error
- func (m *Model) WriteInputFP16Channels(i, channel int, data []float32) error
type ModelType
type Pipeline
- func NewPipeline(_ *MetalDevice) (*Pipeline, error)
- func (p *Pipeline) ANEEvent() *SharedEvent
- func (p *Pipeline) ANEToMetal(_ *Model) error
- func (p *Pipeline) Bidirectional(_ *Model) error
- func (p *Pipeline) Close() error
- func (p *Pipeline) Counter() uint64
- func (p *Pipeline) Metal() *MetalDevice
- func (p *Pipeline) MetalEvent() any
- func (p *Pipeline) WaitOnANE(_ *Model) error
type PooledRequest
- func (pr *PooledRequest) Eval() error
- func (pr *PooledRequest) Release()
type RequestPool
- func NewRequestPool(m *Model, depth int) (*RequestPool, error)
- func (p *RequestPool) Acquire() *PooledRequest
- func (p *RequestPool) Close() error
type SharedEvent
- func NewSharedEvent() (*SharedEvent, error)
- func SharedEventFromPort(port uint32) (*SharedEvent, error)
- func (e *SharedEvent) Close() error
- func (e *SharedEvent) Port() uint32
- func (e *SharedEvent) Signal(value uint64)
- func (e *SharedEvent) SignaledValue() uint64
- func (e *SharedEvent) TimeWait(value uint64, timeout time.Duration) (bool, time.Duration)
- func (e *SharedEvent) Wait(value uint64, timeout time.Duration) bool
type SharedEventEvalOptions
type StateHandle
- func NewStateHandle(_ *Model, _ int) *StateHandle
- func (s *StateHandle) Advance(_ int)
- func (s *StateHandle) Close() error
- func (s *StateHandle) MaxSeq() int
- func (s *StateHandle) Model() *Model
- func (s *StateHandle) Position() int
- func (s *StateHandle) Remaining() int
- func (s *StateHandle) Reset()
type TensorLayout
- func (l TensorLayout) AllocSize() int
- func (l TensorLayout) LogicalBytes() int
- func (l TensorLayout) LogicalElements() int
type WeightFile

Constants ¶

This section is empty.

Variables ¶

var (
	ErrNoANE                    = errors.New("ane: no ANE hardware available")
	ErrCompileBudgetExhausted   = errors.New("ane: compile budget exhausted")
	ErrMapFailed                = errors.New("ane: IOSurface mapping failed")
	ErrUnsupportedSelector      = errors.New("ane: unsupported selector")
	ErrVirtualClientUnavailable = errors.New("ane: virtual client unavailable")
	ErrModelLoad                = errors.New("ane: model load failed")
	ErrEval                     = errors.New("ane: evaluation failed")
	ErrUnsupportedLayout        = errors.New("ane: unsupported tensor layout")
)

Functions ¶

func EnsureANELoaded ¶ added in v0.3.0

func EnsureANELoaded() error

EnsureANELoaded loads the AppleNeuralEngine private framework once.

func EnsureCoreMLLoaded ¶ added in v0.3.0

func EnsureCoreMLLoaded() error

EnsureCoreMLLoaded loads the CoreML framework once.

func EnsureEspressoLoaded ¶ deprecated, added in v0.3.0

func EnsureEspressoLoaded() error

EnsureEspressoLoaded loads the Espresso private framework once.

Deprecated: new code should prefer github.com/tmc/apple/x/espresso.

func FP16ToFloat32 ¶

func FP16ToFloat32(h uint16) float32

FP16ToFloat32 converts an IEEE 754 half-precision value to float32.

func Float32ToFP16 ¶

func Float32ToFP16(f float32) uint16

Float32ToFP16 converts a float32 to IEEE 754 half-precision.

func RowStrideFor ¶

func RowStrideFor(width, elemSize int) int

RowStrideFor computes the minimum 64-byte-aligned row stride.
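
A 64-byte-aligned stride is usually obtained by rounding the row's byte length up to the next multiple of 64. The sketch below shows that arithmetic. It is an assumption about the computation rather than the package's source, the name rowStride is hypothetical, and it only considers positive widths and element sizes.

```go
package main

import "fmt"

// alignment is the 64-byte granularity mentioned in the Tensor Layout section.
const alignment = 64

// rowStride rounds width*elemSize up to the next multiple of alignment.
// Assumes width > 0 and elemSize > 0.
func rowStride(width, elemSize int) int {
	n := width * elemSize
	return (n + alignment - 1) / alignment * alignment
}

func main() {
	fmt.Println(rowStride(3, 4))   // 12 bytes of data
	fmt.Println(rowStride(16, 4))  // exactly 64 bytes of data
	fmt.Println(rowStride(17, 4))  // 68 bytes of data
	fmt.Println(rowStride(100, 2)) // 200 bytes of data
}
```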

Types ¶

type ANEError ¶

type ANEError struct {
	Op    string // operation that failed
	Class string // error domain or class
	Code  int    // error code
	Err   error  // underlying error
}

ANEError wraps an error from the ANE subsystem with context.

func (*ANEError) Error ¶

func (e *ANEError) Error() string

func (*ANEError) Is ¶

func (e *ANEError) Is(target error) bool

func (*ANEError) Unwrap ¶

func (e *ANEError) Unwrap() error

type Client ¶ added in v0.3.0

type Client struct{}

Client manages a connection to the ANE hardware.

func Open ¶

func Open() (*Client, error)

Open creates a new Client for ANE inference. On non-Darwin platforms, it always returns ErrNoANE.

func (*Client) ClientObjcID ¶ added in v0.3.0

func (c *Client) ClientObjcID() uintptr

func (*Client) Close ¶ added in v0.3.0

func (c *Client) Close() error

Close releases the client resources.

func (*Client) Compile ¶ added in v0.3.0

func (c *Client) Compile(opts CompileOptions) (*Model, error)

Compile compiles a model and returns a ready-to-evaluate Model.

func (*Client) CompileCount ¶ added in v0.3.0

func (c *Client) CompileCount() int64

CompileCount returns the number of compilations performed.

func (*Client) CompileWithStats ¶ added in v0.3.0

func (c *Client) CompileWithStats(opts CompileOptions) (*Model, CompileStats, error)

CompileWithStats compiles a model and returns a Model along with compilation timing.

func (*Client) Info ¶ added in v0.3.0

func (c *Client) Info() DeviceInfo

Info returns the device information for this client.

type CompileOptions ¶

type CompileOptions struct {
	ModelType ModelType

	// For ModelTypeMIL:
	MILText []byte // MIL program text
	// Legacy single-weight MIL fields.
	WeightBlob []byte // weight binary blob (may be nil)
	WeightPath string // model path key for weight dict (default: "@model_path/weights/weight.bin")
	// WeightFiles allows MIL graphs with multiple named BLOBFILE inputs.
	// If both WeightBlob and WeightFiles are set, they are merged.
	WeightFiles []WeightFile

	// For ModelTypePackage:
	PackagePath string // path to .mlmodelc directory
	ModelKey    string // key for model dictionary (default: "s")

	// QoS selects the ANE quality-of-service scheduling class.
	// Zero uses the default value of 21. Higher values may receive
	// priority on shared hardware.
	QoS uint32

	// PerfStatsMask enables hardware performance counters during evaluation.
	// Zero disables stats collection. When non-zero, telemetry.EvalWithStats
	// returns counter names and timing data. Valid masks are 4 bits wide
	// (0x1–0xF); higher values are silently converted to zero by the driver.
	PerfStatsMask uint32
}

CompileOptions configures model compilation.
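
The field comments above describe two rules that are easy to get wrong: a zero QoS means 21, and a PerfStatsMask wider than 4 bits is silently treated as zero by the driver. A caller might check both before compiling. The helpers below are a hypothetical sketch of such a pre-check and are not part of the package.

```go
package main

import "fmt"

// defaultQoS is the documented value used when QoS is zero.
const defaultQoS = 21

// effectiveQoS returns the QoS that would apply for a given field value.
func effectiveQoS(q uint32) uint32 {
	if q == 0 {
		return defaultQoS
	}
	return q
}

// checkPerfStatsMask rejects masks wider than 4 bits, which the driver
// would otherwise silently convert to zero (stats disabled).
func checkPerfStatsMask(mask uint32) error {
	if mask > 0xF {
		return fmt.Errorf("perf stats mask %#x is wider than 4 bits", mask)
	}
	return nil
}

func main() {
	fmt.Println("QoS for 0:", effectiveQoS(0))
	fmt.Println("QoS for 33:", effectiveQoS(33))
	fmt.Println("mask 0x3:", checkPerfStatsMask(0x3))
	fmt.Println("mask 0x10:", checkPerfStatsMask(0x10))
}
```

Failing early on an oversized mask avoids the confusing case where stats collection appears enabled in the options but no counters come back.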

type CompileStats ¶ added in v0.2.2

type CompileStats struct {
	CompileNS int64 // wall-clock compile phase
	LoadNS    int64 // wall-clock load phase
	TotalNS   int64 // wall-clock total
}

CompileStats contains timing measurements from model compilation.

func (CompileStats) Available ¶ added in v0.2.2

func (s CompileStats) Available() bool

Available reports whether any timing data was collected.

func (CompileStats) ReportMetrics ¶ added in v0.2.2

func (s CompileStats) ReportMetrics(b interface{ ReportMetric(float64, string) })

ReportMetrics reports compilation timing to a testing.B-compatible reporter.
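
The parameter type is the method set of testing.B's ReportMetric, so any value with that method can receive the timings. The sketch below shows the shape of such a reporter with local stand-ins. The stats struct copies the CompileStats fields, while reportMS, the millisecond conversion, and the unit names are hypothetical, since this documentation does not say which units or names ReportMetrics emits.

```go
package main

import "fmt"

// stats is a local stand-in with the same fields as CompileStats.
type stats struct {
	CompileNS int64 // wall-clock compile phase
	LoadNS    int64 // wall-clock load phase
	TotalNS   int64 // wall-clock total
}

// recorder satisfies interface{ ReportMetric(float64, string) } and keeps
// the reported values in a map, which is handy outside of benchmarks.
type recorder struct {
	metrics map[string]float64
}

func (r *recorder) ReportMetric(n float64, unit string) { r.metrics[unit] = n }

// reportMS reports each phase in milliseconds under hypothetical unit names.
func reportMS(s stats, b interface{ ReportMetric(float64, string) }) {
	b.ReportMetric(float64(s.CompileNS)/1e6, "compile-ms")
	b.ReportMetric(float64(s.LoadNS)/1e6, "load-ms")
	b.ReportMetric(float64(s.TotalNS)/1e6, "total-ms")
}

func main() {
	r := &recorder{metrics: map[string]float64{}}
	reportMS(stats{CompileNS: 12_500_000, LoadNS: 2_500_000, TotalNS: 15_000_000}, r)
	fmt.Println(r.metrics["compile-ms"], r.metrics["load-ms"], r.metrics["total-ms"])
}
```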

func (CompileStats) String ¶ added in v0.2.2

func (s CompileStats) String() string

String returns a compact human-readable summary of compile timing.

type DeviceInfo ¶

type DeviceInfo struct {
	HasANE        bool
	NumCores      uint32
	Architecture  string
	Product       string
	BuildVersion  string
	IsVM          bool
	NumANEs       uint32
	SubType       string
	BoardType     int64
	InternalBuild bool
}

DeviceInfo describes the ANE hardware present on the system.

func Probe ¶

func Probe() (DeviceInfo, error)

Probe returns device information about the ANE. On non-Darwin platforms, it always reports no ANE.

type MetalDevice ¶

type MetalDevice struct{}

MetalDevice wraps a Metal GPU device for zero-copy interop with ANE.

func OpenMetal ¶

func OpenMetal() (*MetalDevice, error)

func (*MetalDevice) Close ¶

func (d *MetalDevice) Close() error

func (*MetalDevice) MetalSharedEvent ¶

func (d *MetalDevice) MetalSharedEvent(ev *SharedEvent) (any, error)

func (*MetalDevice) NewMetalSharedEvent ¶

func (d *MetalDevice) NewMetalSharedEvent() (any, *SharedEvent, error)

type Model ¶ added in v0.3.0

type Model struct{}

Model represents a compiled and loaded model ready for evaluation.

func (*Model) Close ¶ added in v0.3.0

func (m *Model) Close() error

func (*Model) CompileModelType ¶ added in v0.3.0

func (m *Model) CompileModelType() ModelType

func (*Model) Eval ¶ added in v0.3.0

func (m *Model) Eval() error

func (*Model) EvalAsync ¶ added in v0.3.0

func (m *Model) EvalAsync() <-chan error

func (*Model) EvalAsyncWithCallback ¶ added in v0.3.0

func (m *Model) EvalAsyncWithCallback(fn func(error))

func (*Model) EvalBidirectional ¶ added in v0.3.0

func (m *Model) EvalBidirectional(waitPort uint32, waitValue uint64, signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error

func (*Model) EvalWithSignalEvent ¶ added in v0.3.0

func (m *Model) EvalWithSignalEvent(signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error

func (*Model) InMemModelObjcID ¶ added in v0.3.0

func (m *Model) InMemModelObjcID() uintptr

func (*Model) InputAllocSize ¶ added in v0.3.0

func (m *Model) InputAllocSize(i int) int

func (*Model) InputChannels ¶ added in v0.3.0

func (m *Model) InputChannels(i int) int

func (*Model) InputLayout ¶ added in v0.3.0

func (m *Model) InputLayout(i int) TensorLayout

func (*Model) InputSurface ¶ added in v0.3.0

func (m *Model) InputSurface(i int) uintptr

func (*Model) InputSurfaces ¶ added in v0.3.0

func (m *Model) InputSurfaces() []uintptr

func (*Model) MetalInputBuffer ¶ added in v0.3.0

func (m *Model) MetalInputBuffer(d *MetalDevice, i int) (any, error)

func (*Model) MetalOutputBuffer ¶ added in v0.3.0

func (m *Model) MetalOutputBuffer(d *MetalDevice, i int) (any, error)

func (*Model) ModelObjcID ¶ added in v0.3.0

func (m *Model) ModelObjcID() uintptr

func (*Model) NumInputs ¶ added in v0.3.0

func (m *Model) NumInputs() int

func (*Model) NumOutputs ¶ added in v0.3.0

func (m *Model) NumOutputs() int

func (*Model) OutputAllocSize ¶ added in v0.3.0

func (m *Model) OutputAllocSize(i int) int

func (*Model) OutputChannels ¶ added in v0.3.0

func (m *Model) OutputChannels(i int) int

func (*Model) OutputLayout ¶ added in v0.3.0

func (m *Model) OutputLayout(i int) TensorLayout

func (*Model) OutputSurface ¶ added in v0.3.0

func (m *Model) OutputSurface(i int) uintptr

func (*Model) OutputSurfaces ¶ added in v0.3.0

func (m *Model) OutputSurfaces() []uintptr

func (*Model) RawPerfStatsMask ¶ added in v0.3.0

func (m *Model) RawPerfStatsMask() uint32

func (*Model) RawRequest ¶ added in v0.3.0

func (m *Model) RawRequest() uintptr

func (*Model) ReadOutput ¶ added in v0.3.0

func (m *Model) ReadOutput(i int, data []byte) error

func (*Model) ReadOutputF32 ¶ added in v0.3.0

func (m *Model) ReadOutputF32(i int, data []float32) error

func (*Model) ReadOutputFP16 ¶ added in v0.3.0

func (m *Model) ReadOutputFP16(i int, data []float32) error

func (*Model) ReadOutputFP16Channels ¶ added in v0.3.0

func (m *Model) ReadOutputFP16Channels(i, channel int, data []float32) error

func (*Model) Spatial ¶ added in v0.3.0

func (m *Model) Spatial(i int) int

func (*Model) WriteInput ¶ added in v0.3.0

func (m *Model) WriteInput(i int, data []byte) error

func (*Model) WriteInputF32 ¶ added in v0.3.0

func (m *Model) WriteInputF32(i int, data []float32) error

func (*Model) WriteInputFP16 ¶ added in v0.3.0

func (m *Model) WriteInputFP16(i int, data []float32) error

func (*Model) WriteInputFP16Channels ¶ added in v0.3.0

func (m *Model) WriteInputFP16Channels(i, channel int, data []float32) error

type ModelType ¶

type ModelType int

ModelType selects the compilation path.

const (
	ModelTypeMIL     ModelType = iota // In-memory MIL text + weights
	ModelTypePackage                  // On-disk .mlmodelc package
)

type Pipeline ¶

type Pipeline struct{}

Pipeline manages SharedEvent synchronization between ANE and Metal.

func NewPipeline ¶

func NewPipeline(_ *MetalDevice) (*Pipeline, error)

func (*Pipeline) ANEEvent ¶

func (p *Pipeline) ANEEvent() *SharedEvent

func (*Pipeline) ANEToMetal ¶

func (p *Pipeline) ANEToMetal(_ *Model) error

func (*Pipeline) Bidirectional ¶

func (p *Pipeline) Bidirectional(_ *Model) error

func (*Pipeline) Close ¶

func (p *Pipeline) Close() error

func (*Pipeline) Counter ¶

func (p *Pipeline) Counter() uint64

func (*Pipeline) Metal ¶

func (p *Pipeline) Metal() *MetalDevice

func (*Pipeline) MetalEvent ¶

func (p *Pipeline) MetalEvent() any

func (*Pipeline) WaitOnANE ¶

func (p *Pipeline) WaitOnANE(_ *Model) error

type PooledRequest ¶

type PooledRequest struct{}

PooledRequest is a request checked out from a pool.

func (*PooledRequest) Eval ¶

func (pr *PooledRequest) Eval() error

func (*PooledRequest) Release ¶

func (pr *PooledRequest) Release()

type RequestPool ¶

type RequestPool struct{}

RequestPool pre-allocates a ring of ANE requests for pipelined evaluation.

func NewRequestPool ¶

func NewRequestPool(m *Model, depth int) (*RequestPool, error)

func (*RequestPool) Acquire ¶

func (p *RequestPool) Acquire() *PooledRequest

func (*RequestPool) Close ¶

func (p *RequestPool) Close() error

type SharedEvent ¶

type SharedEvent struct{}

SharedEvent wraps an IOSurfaceSharedEvent for ANE↔GPU/CPU synchronization.

func NewSharedEvent ¶

func NewSharedEvent() (*SharedEvent, error)

func SharedEventFromPort ¶

func SharedEventFromPort(port uint32) (*SharedEvent, error)

func (*SharedEvent) Close ¶

func (e *SharedEvent) Close() error

func (*SharedEvent) Port ¶

func (e *SharedEvent) Port() uint32

func (*SharedEvent) Signal ¶

func (e *SharedEvent) Signal(value uint64)

func (*SharedEvent) SignaledValue ¶

func (e *SharedEvent) SignaledValue() uint64

func (*SharedEvent) TimeWait ¶ added in v0.2.2

func (e *SharedEvent) TimeWait(value uint64, timeout time.Duration) (bool, time.Duration)

func (*SharedEvent) Wait ¶

func (e *SharedEvent) Wait(value uint64, timeout time.Duration) bool

type SharedEventEvalOptions ¶

type SharedEventEvalOptions struct {
	DisableIOFencesUseSharedEvents bool // set kANEFDisableIOFencesUseSharedEventsKey
	EnableFWToFWSignal             bool // set kANEFEnableFWToFWSignal (keep false for ANE→Metal on physical hosts)
}

SharedEventEvalOptions configures shared event evaluation behavior.

type StateHandle ¶

type StateHandle struct{}

StateHandle manages KV cache state for stateful MIL models.

func NewStateHandle ¶

func NewStateHandle(_ *Model, _ int) *StateHandle

func (*StateHandle) Advance ¶

func (s *StateHandle) Advance(_ int)

func (*StateHandle) Close ¶

func (s *StateHandle) Close() error

func (*StateHandle) MaxSeq ¶

func (s *StateHandle) MaxSeq() int

func (*StateHandle) Model ¶ added in v0.3.0

func (s *StateHandle) Model() *Model

func (*StateHandle) Position ¶

func (s *StateHandle) Position() int

func (*StateHandle) Remaining ¶

func (s *StateHandle) Remaining() int

func (*StateHandle) Reset ¶

func (s *StateHandle) Reset()

type TensorLayout ¶

type TensorLayout struct {
	Channels    int
	Width       int
	Height      int
	ElemSize    int
	RowStride   int
	PlaneStride int
}

TensorLayout describes the compiled model's memory layout for a single tensor.

func (TensorLayout) AllocSize ¶

func (l TensorLayout) AllocSize() int

func (TensorLayout) LogicalBytes ¶

func (l TensorLayout) LogicalBytes() int

func (TensorLayout) LogicalElements ¶

func (l TensorLayout) LogicalElements() int
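
The three methods can be related with simple arithmetic. In the sketch below, AllocSize follows the Channels * PlaneStride definition given in the Tensor Layout section. The formulas for LogicalElements (Channels * Height * Width) and LogicalBytes (elements times ElemSize) are assumptions inferred from the names, and the layout type is a local stand-in rather than the package's TensorLayout.

```go
package main

import "fmt"

// layout is a local stand-in with the same fields as TensorLayout.
type layout struct {
	Channels    int
	Width       int
	Height      int
	ElemSize    int
	RowStride   int
	PlaneStride int
}

// AllocSize is the padded surface size: Channels * PlaneStride.
func (l layout) AllocSize() int { return l.Channels * l.PlaneStride }

// LogicalElements counts elements without padding (assumed formula).
func (l layout) LogicalElements() int { return l.Channels * l.Height * l.Width }

// LogicalBytes is the unpadded data size in bytes (assumed formula).
func (l layout) LogicalBytes() int { return l.LogicalElements() * l.ElemSize }

func main() {
	// Four channels of ten 2-byte elements: 20 data bytes per channel,
	// padded to a 64-byte row and plane.
	l := layout{Channels: 4, Width: 10, Height: 1, ElemSize: 2, RowStride: 64, PlaneStride: 64}
	fmt.Println("elements:", l.LogicalElements())
	fmt.Println("logical bytes:", l.LogicalBytes())
	fmt.Println("alloc bytes:", l.AllocSize())
}
```

The gap between the logical and allocated sizes is the stride padding that the typed I/O methods hide and the raw I/O methods expose.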

type WeightFile ¶

type WeightFile struct {
	Path string
	Blob []byte
}

WeightFile describes a named MIL BLOBFILE entry.

Directories ¶

Path	Synopsis
dynamicmatmul	Package dynamicmatmul provides compile-once ANE matmul kernels with runtime-provided weights.
forward
linear	Package linear provides reusable ANE-backed linear forward execution.
mil	Package mil generates MIL (Model Intermediate Language) programs and weight blobs for Apple Neural Engine compilation.
model
telemetry	Package telemetry provides diagnostic and performance measurement types for the Apple Neural Engine.
