Documentation ¶
Overview ¶
Package ane provides high-level access to the Apple Neural Engine.
Open a Client, compile MIL programs into Model values, and evaluate them on the ANE hardware. IOSurface buffers are managed automatically.
c, err := ane.Open()
if err != nil {
	log.Fatal(err)
}
defer c.Close()

m, err := c.Compile(ane.CompileOptions{
	MILText:    milText,
	WeightBlob: blob,
	ModelType:  ane.ModelTypeMIL,
})
if err != nil {
	log.Fatal(err)
}
defer m.Close()

// Typed I/O takes logical element counts; stride padding is handled internally.
if err := m.WriteInputF32(0, input); err != nil {
	log.Fatal(err)
}
if err := m.Eval(); err != nil {
	log.Fatal(err)
}
if err := m.ReadOutputF32(0, output); err != nil {
	log.Fatal(err)
}
Compilation ¶
There are two compilation paths. ModelTypeMIL compiles MIL text and weights in memory:
MIL text + weights → ANEInMemoryModel → compile → load → Model
ModelTypePackage loads a pre-compiled .mlmodelc package from disk:
.mlmodelc path → ANEModel → compile → load → Model
After compilation, the ANE reports its expected memory layout via model attributes, and IOSurfaces are created to match these layouts exactly. Surface sizes, channels, and spatial dimensions are never specified manually; they are parsed from the compiled model.
Client.CompileWithStats returns a CompileStats with wall-clock compile and load phase durations.
Tensor Layout ¶
Logical data is in channel-first (NCHW) order: element x of channel c is data[c*width + x]. Its byte offset in the IOSurface is c*PlaneStride + x*ElemSize. The minimum row stride is 64 bytes (the ANE's alignment granularity).
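The offset arithmetic above can be sketched in a few lines. The layout type and byteOffset helper here are illustrative stand-ins, not part of the package API, and the sketch assumes a single-row tensor as in the formula.

```go
package main

import "fmt"

// layout is a hypothetical stand-in for the fields the text refers to.
type layout struct {
	Channels    int
	Width       int
	ElemSize    int // bytes per element
	PlaneStride int // bytes between consecutive channels, padding included
}

// byteOffset returns the byte offset of element x in channel c,
// following c*PlaneStride + x*ElemSize.
func byteOffset(l layout, c, x int) int {
	return c*l.PlaneStride + x*l.ElemSize
}

func main() {
	// Three channels of five fp16 values: 10 logical bytes per channel,
	// padded out to a 64-byte plane stride.
	l := layout{Channels: 3, Width: 5, ElemSize: 2, PlaneStride: 64}
	fmt.Println(byteOffset(l, 0, 0)) // 0
	fmt.Println(byteOffset(l, 1, 0)) // 64
	fmt.Println(byteOffset(l, 2, 4)) // 136
}
```

Note that the logical index data[c*width + x] and the byte offset differ as soon as the plane stride is larger than width*ElemSize.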
TensorLayout describes the memory layout for each input and output. AllocSize (Channels * PlaneStride) includes stride padding and is always >= the logical data size. Typed I/O methods (Model.WriteInputF32, Model.ReadOutputF32, etc.) accept logical element counts and handle stride padding internally. Raw I/O (Model.WriteInput, Model.ReadOutput) requires exactly AllocSize bytes.
Shared Events ¶
SharedEvent enables ANE↔GPU/CPU synchronization via IOSurface shared events. Use Model.EvalWithSignalEvent, Model.EvalBidirectional, and related methods for pipelined evaluation across compute domains.
Telemetry ¶
For hardware performance counters, diagnostics, and runtime snapshots, see the github.com/tmc/apple/x/ane/telemetry package.
Index ¶
- Variables
- func EnsureANELoaded() error
- func EnsureCoreMLLoaded() error
- func EnsureEspressoLoaded() error (deprecated)
- func FP16ToFloat32(h uint16) float32
- func Float32ToFP16(f float32) uint16
- func RowStrideFor(width, elemSize int) int
- type ANEError
- type Client
- type CompileOptions
- type CompileStats
- type DeviceInfo
- type MetalDevice
- type Model
- func (m *Model) Close() error
- func (m *Model) CompileModelType() ModelType
- func (m *Model) Eval() error
- func (m *Model) EvalAsync() <-chan error
- func (m *Model) EvalAsyncWithCallback(fn func(error))
- func (m *Model) EvalBidirectional(waitPort uint32, waitValue uint64, signalPort uint32, signalValue uint64, ...) error
- func (m *Model) EvalWithSignalEvent(signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error
- func (m *Model) InMemModelObjcID() uintptr
- func (m *Model) InputAllocSize(i int) int
- func (m *Model) InputChannels(i int) int
- func (m *Model) InputLayout(i int) TensorLayout
- func (m *Model) InputSurface(i int) uintptr
- func (m *Model) InputSurfaces() []uintptr
- func (m *Model) MetalInputBuffer(d *MetalDevice, i int) (any, error)
- func (m *Model) MetalOutputBuffer(d *MetalDevice, i int) (any, error)
- func (m *Model) ModelObjcID() uintptr
- func (m *Model) NumInputs() int
- func (m *Model) NumOutputs() int
- func (m *Model) OutputAllocSize(i int) int
- func (m *Model) OutputChannels(i int) int
- func (m *Model) OutputLayout(i int) TensorLayout
- func (m *Model) OutputSurface(i int) uintptr
- func (m *Model) OutputSurfaces() []uintptr
- func (m *Model) RawPerfStatsMask() uint32
- func (m *Model) RawRequest() uintptr
- func (m *Model) ReadOutput(i int, data []byte) error
- func (m *Model) ReadOutputF32(i int, data []float32) error
- func (m *Model) ReadOutputFP16(i int, data []float32) error
- func (m *Model) ReadOutputFP16Channels(i, channel int, data []float32) error
- func (m *Model) Spatial(i int) int
- func (m *Model) WriteInput(i int, data []byte) error
- func (m *Model) WriteInputF32(i int, data []float32) error
- func (m *Model) WriteInputFP16(i int, data []float32) error
- func (m *Model) WriteInputFP16Channels(i, channel int, data []float32) error
- type ModelType
- type Pipeline
- func (p *Pipeline) ANEEvent() *SharedEvent
- func (p *Pipeline) ANEToMetal(_ *Model) error
- func (p *Pipeline) Bidirectional(_ *Model) error
- func (p *Pipeline) Close() error
- func (p *Pipeline) Counter() uint64
- func (p *Pipeline) Metal() *MetalDevice
- func (p *Pipeline) MetalEvent() any
- func (p *Pipeline) WaitOnANE(_ *Model) error
- type PooledRequest
- type RequestPool
- type SharedEvent
- func (e *SharedEvent) Close() error
- func (e *SharedEvent) Port() uint32
- func (e *SharedEvent) Signal(value uint64)
- func (e *SharedEvent) SignaledValue() uint64
- func (e *SharedEvent) TimeWait(value uint64, timeout time.Duration) (bool, time.Duration)
- func (e *SharedEvent) Wait(value uint64, timeout time.Duration) bool
- type SharedEventEvalOptions
- type StateHandle
- type TensorLayout
- type WeightFile
Constants ¶
This section is empty.
Variables ¶
var (
	ErrNoANE                  = errors.New("ane: no ANE hardware available")
	ErrCompileBudgetExhausted = errors.New("ane: compile budget exhausted")
	ErrMapFailed              = errors.New("ane: IOSurface mapping failed")
	ErrUnsupportedSelector    = errors.New("ane: unsupported selector")
	ErrModelLoad              = errors.New("ane: model load failed")
	ErrEval                   = errors.New("ane: evaluation failed")
	ErrUnsupportedLayout      = errors.New("ane: unsupported tensor layout")
)
Functions ¶
func EnsureANELoaded ¶ added in v0.3.0
func EnsureANELoaded() error
EnsureANELoaded loads the AppleNeuralEngine private framework once.
func EnsureCoreMLLoaded ¶ added in v0.3.0
func EnsureCoreMLLoaded() error
EnsureCoreMLLoaded loads the CoreML framework once.
func EnsureEspressoLoaded ¶ deprecated, added in v0.3.0
func EnsureEspressoLoaded() error
EnsureEspressoLoaded loads the Espresso private framework once.
Deprecated: new code should prefer github.com/tmc/apple/x/espresso.
func FP16ToFloat32 ¶
FP16ToFloat32 converts an IEEE 754 half-precision value to float32.
func Float32ToFP16 ¶
Float32ToFP16 converts a float32 to IEEE 754 half-precision.
func RowStrideFor ¶
RowStrideFor computes the minimum 64-byte-aligned row stride.
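As an illustration of what a 64-byte-aligned row stride means, the sketch below rounds width*elemSize up to the next multiple of 64. The function name rowStride is hypothetical, and treating an empty row as one full 64-byte unit is an assumption based on the stated minimum stride; the package's own edge-case handling may differ.

```go
package main

import "fmt"

// rowStride rounds width*elemSize up to a multiple of 64 bytes.
// Returning 64 for an empty row is an assumption of this sketch.
func rowStride(width, elemSize int) int {
	const align = 64
	n := width * elemSize
	if n <= 0 {
		return align
	}
	return (n + align - 1) / align * align
}

func main() {
	fmt.Println(rowStride(1, 2))   // 64
	fmt.Println(rowStride(32, 2))  // 64
	fmt.Println(rowStride(33, 2))  // 128
	fmt.Println(rowStride(100, 4)) // 448
}
```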
Types ¶
type ANEError ¶
type ANEError struct {
	Op    string // operation that failed
	Class string // error domain or class
	Code  int    // error code
	Err   error  // underlying error
}
ANEError wraps an error from the ANE subsystem with context.
type Client ¶ added in v0.3.0
type Client struct{}
Client manages a connection to the ANE hardware.
func Open ¶
Open creates a new Client for ANE inference. On non-Darwin platforms, it always returns ErrNoANE.
func (*Client) ClientObjcID ¶ added in v0.3.0
func (*Client) Compile ¶ added in v0.3.0
func (c *Client) Compile(opts CompileOptions) (*Model, error)
Compile compiles a model and returns a ready-to-evaluate Model.
func (*Client) CompileCount ¶ added in v0.3.0
CompileCount returns the number of compilations performed.
func (*Client) CompileWithStats ¶ added in v0.3.0
func (c *Client) CompileWithStats(opts CompileOptions) (*Model, CompileStats, error)
CompileWithStats compiles a model and returns a Model along with compilation timing.
func (*Client) Info ¶ added in v0.3.0
func (c *Client) Info() DeviceInfo
Info returns the device information for this client.
type CompileOptions ¶
type CompileOptions struct {
	ModelType ModelType

	// For ModelTypeMIL:
	MILText []byte // MIL program text

	// Legacy single-weight MIL fields.
	WeightBlob []byte // weight binary blob (may be nil)
	WeightPath string // model path key for weight dict (default: "@model_path/weights/weight.bin")

	// WeightFiles allows MIL graphs with multiple named BLOBFILE inputs.
	// If both WeightBlob and WeightFiles are set, they are merged.
	WeightFiles []WeightFile

	// For ModelTypePackage:
	PackagePath string // path to .mlmodelc directory
	ModelKey    string // key for model dictionary (default: "s")

	// QoS selects the ANE quality-of-service scheduling class.
	// Zero uses the default value of 21. Higher values may receive
	// priority on shared hardware.
	QoS uint32

	// PerfStatsMask enables hardware performance counters during evaluation.
	// Zero disables stats collection. When non-zero, telemetry.EvalWithStats
	// returns counter names and timing data. Valid masks are 4 bits wide
	// (0x1–0xF); higher values are silently converted to zero by the driver.
	PerfStatsMask uint32
}
CompileOptions configures model compilation.
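Because an out-of-range PerfStatsMask is silently zeroed by the driver, callers may want to validate it before compiling. The sketch below shows one way to do that, together with the documented QoS default; effectiveQoS and checkPerfStatsMask are hypothetical helpers, not package functions.

```go
package main

import "fmt"

// defaultQoS is the default documented for a zero QoS field.
const defaultQoS = 21

// effectiveQoS returns the QoS value that a zero or non-zero field implies.
func effectiveQoS(q uint32) uint32 {
	if q == 0 {
		return defaultQoS
	}
	return q
}

// checkPerfStatsMask rejects masks wider than 4 bits, which the driver
// would otherwise silently convert to zero (disabling stats).
func checkPerfStatsMask(mask uint32) error {
	if mask > 0xF {
		return fmt.Errorf("perf stats mask %#x exceeds 4 bits (max 0xF)", mask)
	}
	return nil
}

func main() {
	fmt.Println(effectiveQoS(0))                 // 21
	fmt.Println(effectiveQoS(33))                // 33
	fmt.Println(checkPerfStatsMask(0xF) == nil)  // true
	fmt.Println(checkPerfStatsMask(0x10) == nil) // false
}
```

Failing early on a bad mask avoids a confusing situation where stats collection appears enabled in the options but returns nothing.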
type CompileStats ¶ added in v0.2.2
type CompileStats struct {
	CompileNS int64 // wall-clock compile phase
	LoadNS    int64 // wall-clock load phase
	TotalNS   int64 // wall-clock total
}
CompileStats contains timing measurements from model compilation.
func (CompileStats) Available ¶ added in v0.2.2
func (s CompileStats) Available() bool
Available reports whether any timing data was collected.
func (CompileStats) ReportMetrics ¶ added in v0.2.2
func (s CompileStats) ReportMetrics(b interface{ ReportMetric(float64, string) })
ReportMetrics reports compilation timing to a testing.B-compatible reporter.
func (CompileStats) String ¶ added in v0.2.2
func (s CompileStats) String() string
String returns a compact human-readable summary of compile timing.
type DeviceInfo ¶
type DeviceInfo struct {
	HasANE        bool
	NumCores      uint32
	Architecture  string
	Product       string
	BuildVersion  string
	IsVM          bool
	NumANEs       uint32
	SubType       string
	BoardType     int64
	InternalBuild bool
}
DeviceInfo describes the ANE hardware present on the system.
func Probe ¶
func Probe() (DeviceInfo, error)
Probe returns device information about the ANE. On non-Darwin platforms, it always reports no ANE.
type MetalDevice ¶
type MetalDevice struct{}
MetalDevice wraps a Metal GPU device for zero-copy interop with ANE.
func OpenMetal ¶
func OpenMetal() (*MetalDevice, error)
func (*MetalDevice) Close ¶
func (d *MetalDevice) Close() error
func (*MetalDevice) MetalSharedEvent ¶
func (d *MetalDevice) MetalSharedEvent(ev *SharedEvent) (any, error)
func (*MetalDevice) NewMetalSharedEvent ¶
func (d *MetalDevice) NewMetalSharedEvent() (any, *SharedEvent, error)
type Model ¶ added in v0.3.0
type Model struct{}
Model represents a compiled and loaded model ready for evaluation.
func (*Model) CompileModelType ¶ added in v0.3.0
func (*Model) EvalAsyncWithCallback ¶ added in v0.3.0
func (*Model) EvalBidirectional ¶ added in v0.3.0
func (*Model) EvalWithSignalEvent ¶ added in v0.3.0
func (m *Model) EvalWithSignalEvent(signalPort uint32, signalValue uint64, cfg SharedEventEvalOptions) error
func (*Model) InMemModelObjcID ¶ added in v0.3.0
func (*Model) InputAllocSize ¶ added in v0.3.0
func (*Model) InputChannels ¶ added in v0.3.0
func (*Model) InputLayout ¶ added in v0.3.0
func (m *Model) InputLayout(i int) TensorLayout
func (*Model) InputSurface ¶ added in v0.3.0
func (*Model) InputSurfaces ¶ added in v0.3.0
func (*Model) MetalInputBuffer ¶ added in v0.3.0
func (m *Model) MetalInputBuffer(d *MetalDevice, i int) (any, error)
func (*Model) MetalOutputBuffer ¶ added in v0.3.0
func (m *Model) MetalOutputBuffer(d *MetalDevice, i int) (any, error)
func (*Model) ModelObjcID ¶ added in v0.3.0
func (*Model) NumOutputs ¶ added in v0.3.0
func (*Model) OutputAllocSize ¶ added in v0.3.0
func (*Model) OutputChannels ¶ added in v0.3.0
func (*Model) OutputLayout ¶ added in v0.3.0
func (m *Model) OutputLayout(i int) TensorLayout
func (*Model) OutputSurface ¶ added in v0.3.0
func (*Model) OutputSurfaces ¶ added in v0.3.0
func (*Model) RawPerfStatsMask ¶ added in v0.3.0
func (*Model) RawRequest ¶ added in v0.3.0
func (*Model) ReadOutputF32 ¶ added in v0.3.0
func (*Model) ReadOutputFP16 ¶ added in v0.3.0
func (*Model) ReadOutputFP16Channels ¶ added in v0.3.0
func (*Model) WriteInputF32 ¶ added in v0.3.0
func (*Model) WriteInputFP16 ¶ added in v0.3.0
type Pipeline ¶
type Pipeline struct{}
Pipeline manages SharedEvent synchronization between ANE and Metal.
func NewPipeline ¶
func NewPipeline(_ *MetalDevice) (*Pipeline, error)
func (*Pipeline) ANEEvent ¶
func (p *Pipeline) ANEEvent() *SharedEvent
func (*Pipeline) ANEToMetal ¶
func (*Pipeline) Bidirectional ¶
func (*Pipeline) Metal ¶
func (p *Pipeline) Metal() *MetalDevice
func (*Pipeline) MetalEvent ¶
type PooledRequest ¶
type PooledRequest struct{}
PooledRequest is a request checked out from a pool.
func (*PooledRequest) Eval ¶
func (pr *PooledRequest) Eval() error
func (*PooledRequest) Release ¶
func (pr *PooledRequest) Release()
type RequestPool ¶
type RequestPool struct{}
RequestPool pre-allocates a ring of ANE requests for pipelined evaluation.
func NewRequestPool ¶
func NewRequestPool(m *Model, depth int) (*RequestPool, error)
func (*RequestPool) Acquire ¶
func (p *RequestPool) Acquire() *PooledRequest
func (*RequestPool) Close ¶
func (p *RequestPool) Close() error
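The general idea of a pre-allocated ring with acquire and release can be sketched with a buffered channel. This is only an illustration of the pattern: the slot and pool types are hypothetical, and the blocking behavior of acquire is an assumption, since the listing above does not say what RequestPool.Acquire does when every request is checked out.

```go
package main

import (
	"errors"
	"fmt"
)

// slot stands in for a pre-allocated request.
type slot struct{ id int }

// pool hands out a fixed set of slots through a buffered channel.
type pool struct{ free chan *slot }

// newPool allocates depth slots up front.
func newPool(depth int) (*pool, error) {
	if depth <= 0 {
		return nil, errors.New("depth must be positive")
	}
	p := &pool{free: make(chan *slot, depth)}
	for i := 0; i < depth; i++ {
		p.free <- &slot{id: i}
	}
	return p, nil
}

// acquire blocks until a slot is free (an assumption of this sketch).
func (p *pool) acquire() *slot { return <-p.free }

// release returns a slot to the pool.
func (p *pool) release(s *slot) { p.free <- s }

func main() {
	p, err := newPool(2)
	if err != nil {
		panic(err)
	}
	a, b := p.acquire(), p.acquire()
	fmt.Println(a.id, b.id, len(p.free)) // 0 1 0
	p.release(a)
	fmt.Println(len(p.free)) // 1
}
```

Allocating all slots up front keeps per-evaluation work free of allocation, which is the usual motivation for a pool of this kind.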
type SharedEvent ¶
type SharedEvent struct{}
SharedEvent wraps an IOSurfaceSharedEvent for ANE↔GPU/CPU synchronization.
func NewSharedEvent ¶
func NewSharedEvent() (*SharedEvent, error)
func SharedEventFromPort ¶
func SharedEventFromPort(port uint32) (*SharedEvent, error)
func (*SharedEvent) Close ¶
func (e *SharedEvent) Close() error
func (*SharedEvent) Port ¶
func (e *SharedEvent) Port() uint32
func (*SharedEvent) Signal ¶
func (e *SharedEvent) Signal(value uint64)
func (*SharedEvent) SignaledValue ¶
func (e *SharedEvent) SignaledValue() uint64
type SharedEventEvalOptions ¶
type SharedEventEvalOptions struct {
}
SharedEventEvalOptions configures shared event evaluation behavior.
type StateHandle ¶
type StateHandle struct{}
StateHandle manages KV cache state for stateful MIL models.
func NewStateHandle ¶
func NewStateHandle(_ *Model, _ int) *StateHandle
func (*StateHandle) Advance ¶
func (s *StateHandle) Advance(_ int)
func (*StateHandle) Close ¶
func (s *StateHandle) Close() error
func (*StateHandle) MaxSeq ¶
func (s *StateHandle) MaxSeq() int
func (*StateHandle) Model ¶ added in v0.3.0
func (s *StateHandle) Model() *Model
func (*StateHandle) Position ¶
func (s *StateHandle) Position() int
func (*StateHandle) Remaining ¶
func (s *StateHandle) Remaining() int
func (*StateHandle) Reset ¶
func (s *StateHandle) Reset()
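The method set above (Position, Advance, Remaining, MaxSeq, Reset) reads like a bounded sequence cursor. The following sketch shows that bookkeeping in isolation. The kvCursor type is hypothetical, and both "Remaining is MaxSeq minus Position" and clamping at MaxSeq are assumptions; the listing does not specify how StateHandle behaves at the boundary.

```go
package main

import "fmt"

// kvCursor is a hypothetical model of position tracking for a
// fixed-capacity KV cache.
type kvCursor struct {
	pos    int
	maxSeq int
}

func (s *kvCursor) Position() int  { return s.pos }
func (s *kvCursor) MaxSeq() int    { return s.maxSeq }
func (s *kvCursor) Remaining() int { return s.maxSeq - s.pos }

// Advance moves the cursor forward by n tokens, clamping at maxSeq
// (the clamp is an assumption of this sketch).
func (s *kvCursor) Advance(n int) {
	s.pos += n
	if s.pos > s.maxSeq {
		s.pos = s.maxSeq
	}
}

// Reset rewinds the cursor to the start of the sequence.
func (s *kvCursor) Reset() { s.pos = 0 }

func main() {
	s := &kvCursor{maxSeq: 8}
	s.Advance(5)
	fmt.Println(s.Position(), s.Remaining()) // 5 3
	s.Advance(10)
	fmt.Println(s.Position(), s.Remaining()) // 8 0
	s.Reset()
	fmt.Println(s.Position(), s.Remaining()) // 0 8
}
```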
type TensorLayout ¶
type TensorLayout struct {
	Channels    int
	Width       int
	Height      int
	ElemSize    int
	RowStride   int
	PlaneStride int
}
TensorLayout describes the compiled model's memory layout for a single tensor.
func (TensorLayout) AllocSize ¶
func (l TensorLayout) AllocSize() int
func (TensorLayout) LogicalBytes ¶
func (l TensorLayout) LogicalBytes() int
func (TensorLayout) LogicalElements ¶
func (l TensorLayout) LogicalElements() int
type WeightFile ¶
WeightFile describes a named MIL BLOBFILE entry.
Directories ¶
| Path | Synopsis |
|---|---|
| | Package dynamicmatmul provides compile-once ANE matmul kernels with runtime-provided weights. |
| | Package linear provides reusable ANE-backed linear forward execution. |
| | Package mil generates MIL (Model Intermediate Language) programs and weight blobs for Apple Neural Engine compilation. |
| | Package telemetry provides diagnostic and performance measurement types for the Apple Neural Engine. |