Documentation ¶
Overview ¶
Package dynamicmatmul provides compile-once ANE matmul kernels with runtime-provided weights.
It stages activations and weights into a single ANE input surface, then evaluates y = x*w without recompiling when w changes.
Index ¶
- type EvalStats
- type Executor
- func (e *Executor) Close()
- func (e *Executor) CopyOutputToInput(dst *model.Kernel, dstInput, dstChannel int) error
- func (e *Executor) Eval(x, w []float32) ([]float32, error)
- func (e *Executor) EvalCF(xCF []float32) (EvalStats, error)
- func (e *Executor) EvalCFHW(xCF []float32) (uint64, error)
- func (e *Executor) EvalCFIOInto(dstCF, xCF []float32) (EvalStats, error)
- func (e *Executor) EvalCFIOIntoHW(dstCF, xCF []float32) (uint64, error)
- func (e *Executor) EvalInto(dst, x, w []float32) (EvalStats, error)
- func (e *Executor) EvalOneHotIOInto(dst []float32, xs []int) (EvalStats, error)
- func (e *Executor) EvalOneHotIOIntoHW(dst []float32, xs []int) (uint64, error)
- func (e *Executor) EvalWithStats(x, w []float32) ([]float32, EvalStats, error)
- func (e *Executor) PrimeWeightsIO(wIO []float32) error
- func (e *Executor) ReadOutputCF(dstCF []float32) error
- func (e *Executor) UpdateWeightsIORows(wIO []float32, rows []int) error
- type Options
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Executor ¶
type Executor struct {
// contains filtered or unexported fields
}
Executor evaluates row-major y = x*w with runtime-provided weights.
x has shape [batch, inDim], w has shape [inDim, outDim], and the result has shape [batch, outDim]. The executor compiles once for a fixed shape and rewrites the packed input surface on each evaluation.
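As a point of comparison for the shapes described above, the following is a minimal CPU sketch of row-major y = x*w. It is not part of this package: the name matmulRef and its explicit shape parameters are hypothetical, chosen only to make the indexing concrete. A reference like this can be useful when checking executor output in tests.

```go
package main

import "fmt"

// matmulRef is a hypothetical CPU reference (not part of dynamicmatmul).
// It computes row-major y = x*w, where x is [batch, inDim],
// w is [inDim, outDim], and the result is [batch, outDim].
func matmulRef(x, w []float32, batch, inDim, outDim int) ([]float32, error) {
	if len(x) != batch*inDim || len(w) != inDim*outDim {
		return nil, fmt.Errorf("shape mismatch: len(x)=%d len(w)=%d", len(x), len(w))
	}
	y := make([]float32, batch*outDim)
	for b := 0; b < batch; b++ {
		for i := 0; i < inDim; i++ {
			xv := x[b*inDim+i]
			// Accumulate the contribution of input channel i to every output.
			for o := 0; o < outDim; o++ {
				y[b*outDim+o] += xv * w[i*outDim+o]
			}
		}
	}
	return y, nil
}

func main() {
	x := []float32{1, 2, 3, 4}       // batch=2, inDim=2
	w := []float32{1, 0, 1, 0, 1, 1} // inDim=2, outDim=3
	y, err := matmulRef(x, w, 2, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(y) // prints [1 2 3 3 4 7]
}
```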
func (*Executor) CopyOutputToInput ¶
CopyOutputToInput copies the last evaluated output tensor into a destination kernel input without converting through Go-managed float buffers.
func (*Executor) EvalCF ¶
EvalCF evaluates a channel-first input tensor against previously primed weights and leaves the output resident in the ANE output surfaces.
xCF has shape [inDim, batch].
func (*Executor) EvalCFHW ¶
EvalCFHW evaluates a channel-first input tensor against previously primed weights and returns only aggregate hardware execution time.
func (*Executor) EvalCFIOInto ¶
EvalCFIOInto evaluates a channel-first input tensor into a channel-first output tensor using previously primed weights.
xCF has shape [inDim, batch] and dstCF has shape [outDim, batch].
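To illustrate the channel-first shapes, here is a small sketch that converts between a row-major [batch, dim] slice and a [dim, batch] slice. It assumes that "channel-first" means the element for channel c and batch position b is stored at index c*batch+b, which is the natural reading of the stated shapes but is not confirmed by the documentation. The helpers toChannelFirst and fromChannelFirst are hypothetical and not part of this package.

```go
package main

import "fmt"

// toChannelFirst transposes a row-major [batch, dim] slice into [dim, batch].
// Hypothetical helper; the caller must ensure len(x) == batch*dim.
func toChannelFirst(x []float32, batch, dim int) []float32 {
	cf := make([]float32, len(x))
	for b := 0; b < batch; b++ {
		for c := 0; c < dim; c++ {
			cf[c*batch+b] = x[b*dim+c]
		}
	}
	return cf
}

// fromChannelFirst is the inverse: [dim, batch] back to row-major [batch, dim].
func fromChannelFirst(cf []float32, batch, dim int) []float32 {
	x := make([]float32, len(cf))
	for b := 0; b < batch; b++ {
		for c := 0; c < dim; c++ {
			x[b*dim+c] = cf[c*batch+b]
		}
	}
	return x
}

func main() {
	x := []float32{1, 2, 3, 4, 5, 6} // batch=2, dim=3
	cf := toChannelFirst(x, 2, 3)
	fmt.Println(cf)                         // prints [1 4 2 5 3 6]
	fmt.Println(fromChannelFirst(cf, 2, 3)) // prints [1 2 3 4 5 6]
}
```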
func (*Executor) EvalCFIOIntoHW ¶
EvalCFIOIntoHW evaluates a channel-first input tensor into a channel-first output tensor using previously primed weights and returns only aggregate hardware execution time.
func (*Executor) EvalOneHotIOInto ¶
EvalOneHotIOInto computes y = x*w for one-hot activations encoded by xs and a previously primed IO-layout weight matrix.
xs holds at most batch token ids, each in [0, inDim). The id at position t selects the row of w used for batch element t. Any remaining batch positions are treated as zero input.
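Because a one-hot activation has a single 1 at index xs[t], the product y = x*w reduces to copying one row of w per batch element. The sketch below shows that equivalence on the CPU. It assumes a row-major [batch, outDim] output, which the documentation does not specify for dst, and the function name oneHotMatmul is hypothetical.

```go
package main

import "fmt"

// oneHotMatmul is a hypothetical CPU reference (not part of dynamicmatmul).
// w is [inDim, outDim]; the result is assumed to be row-major [batch, outDim].
// Batch positions beyond len(xs) stay zero, matching "zero input".
func oneHotMatmul(xs []int, w []float32, batch, inDim, outDim int) ([]float32, error) {
	if len(xs) > batch {
		return nil, fmt.Errorf("got %d ids, batch is %d", len(xs), batch)
	}
	if len(w) != inDim*outDim {
		return nil, fmt.Errorf("len(w)=%d, want %d", len(w), inDim*outDim)
	}
	y := make([]float32, batch*outDim)
	for t, id := range xs {
		if id < 0 || id >= inDim {
			return nil, fmt.Errorf("id %d at position %d out of range [0, %d)", id, t, inDim)
		}
		// A one-hot row times w is exactly row id of w.
		copy(y[t*outDim:(t+1)*outDim], w[id*outDim:(id+1)*outDim])
	}
	return y, nil
}

func main() {
	w := []float32{1, 2, 3, 4, 5, 6} // inDim=3, outDim=2
	y, err := oneHotMatmul([]int{2, 0}, w, 3, 3, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(y) // prints [5 6 1 2 0 0]
}
```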
func (*Executor) EvalOneHotIOIntoHW ¶
EvalOneHotIOIntoHW computes y = x*w for one-hot activations encoded by xs and a previously primed IO-layout weight matrix, returning only aggregate hardware execution time.
func (*Executor) EvalWithStats ¶
EvalWithStats computes y = x*w and returns a new output slice plus timing.
func (*Executor) PrimeWeightsIO ¶
PrimeWeightsIO copies the full IO-layout weight matrix into the cached ANE input buffers. wIO must be laid out as [inDim, outDim].
func (*Executor) ReadOutputCF ¶
ReadOutputCF reads the last evaluated output tensor in channel-first order.
dstCF has shape [outDim, batch].
type Options ¶
type Options struct {
QoS uint32
// TileOut forces output-channel tiling when > 0.
//
// Each tile compiles a separate kernel with output width <= TileOut.
// When TileOut == 0, New first tries a single full-width kernel and
// falls back to tiling only if full-width compilation fails.
TileOut int
}
Options configures executor creation.
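The TileOut comment implies that the output dimension is split into column ranges, each no wider than TileOut. How the package actually chooses tile boundaries is not documented here, so the following is only one plausible sketch: equal-width tiles with a shorter final tile. The function tileRanges is hypothetical, and it models TileOut == 0 as a single full-width range without modeling the compile-failure fallback.

```go
package main

import "fmt"

// tileRanges is a hypothetical helper (not part of dynamicmatmul).
// It splits [0, outDim) into half-open ranges of width <= tileOut.
// tileOut <= 0 yields one full-width range.
func tileRanges(outDim, tileOut int) [][2]int {
	if outDim <= 0 {
		return nil
	}
	if tileOut <= 0 || tileOut >= outDim {
		return [][2]int{{0, outDim}}
	}
	var ranges [][2]int
	for start := 0; start < outDim; start += tileOut {
		end := start + tileOut
		if end > outDim {
			end = outDim // final tile may be narrower
		}
		ranges = append(ranges, [2]int{start, end})
	}
	return ranges
}

func main() {
	fmt.Println(tileRanges(10, 4)) // prints [[0 4] [4 8] [8 10]]
	fmt.Println(tileRanges(10, 0)) // prints [[0 10]]
}
```

Under this scheme each tile would multiply x by the matching column slice of w, and the per-tile outputs would be concatenated along the output dimension.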