Documentation ¶
Overview ¶
Package cpu implements the CPU backend with SIMD optimizations and BLAS integration.
Index ¶
- type CPUBackend
- func (cpu *CPUBackend) Add(a, b *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) Conv2D(input, kernel *tensor.RawTensor, stride, padding int) *tensor.RawTensor
- func (cpu *CPUBackend) Device() tensor.Device
- func (cpu *CPUBackend) Div(a, b *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) MatMul(a, b *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) MaxPool2D(input *tensor.RawTensor, kernelSize, stride int) *tensor.RawTensor
- func (cpu *CPUBackend) Mul(a, b *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) Name() string
- func (cpu *CPUBackend) ReLU(x *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) Reshape(t *tensor.RawTensor, newShape tensor.Shape) *tensor.RawTensor
- func (cpu *CPUBackend) Softmax(x *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) Sub(a, b *tensor.RawTensor) *tensor.RawTensor
- func (cpu *CPUBackend) Transpose(t *tensor.RawTensor, axes ...int) *tensor.RawTensor
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CPUBackend ¶
type CPUBackend struct {
// contains filtered or unexported fields
}
CPUBackend implements tensor operations on CPU with optional SIMD and BLAS optimizations.
func (*CPUBackend) Add ¶
func (cpu *CPUBackend) Add(a, b *tensor.RawTensor) *tensor.RawTensor
Add performs element-wise addition with NumPy-style broadcasting.
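As an illustration of what NumPy-style broadcasting implies for output shapes, here is a minimal sketch written against plain Go slices. The helper name broadcastShape is hypothetical and is not part of this package; how CPUBackend resolves shapes internally is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// broadcastShape (hypothetical helper) returns the shape produced by
// broadcasting a against b under NumPy rules: shapes are aligned from the
// right, and each pair of dimensions must be equal or one of them must be 1.
func broadcastShape(a, b []int) ([]int, error) {
	n := len(a)
	if len(b) > n {
		n = len(b)
	}
	out := make([]int, n)
	for i := 1; i <= n; i++ {
		da, db := 1, 1 // missing leading dimensions are treated as 1
		if i <= len(a) {
			da = a[len(a)-i]
		}
		if i <= len(b) {
			db = b[len(b)-i]
		}
		switch {
		case da == db:
			out[n-i] = da
		case da == 1:
			out[n-i] = db
		case db == 1:
			out[n-i] = da
		default:
			return nil, errors.New("shapes are not broadcastable")
		}
	}
	return out, nil
}

func main() {
	shape, err := broadcastShape([]int{3, 1}, []int{4})
	fmt.Println(shape, err) // [3 4] <nil>
}
```

Under these rules a [3, 1] tensor added to a [4] tensor yields a [3, 4] result, while [2, 3] and [4] cannot be combined.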
func (*CPUBackend) Conv2D ¶
func (cpu *CPUBackend) Conv2D(input, kernel *tensor.RawTensor, stride, padding int) *tensor.RawTensor
Conv2D performs 2D convolution using the im2col algorithm.
Input shape: [batch, in_channels, height, width]
Kernel shape: [out_channels, in_channels, kernel_h, kernel_w]
Output shape: [batch, out_channels, out_h, out_w]
Parameters:
- input: Input tensor [N, C_in, H, W]
- kernel: Convolution kernel [C_out, C_in, K_h, K_w]
- stride: Stride for convolution (default: 1)
- padding: Padding to apply (default: 0)
Algorithm: Im2col
- Transform input patches into columns (im2col)
- Reshape kernel into matrix
- Perform matrix multiplication
- Reshape output to [N, C_out, H_out, W_out]
Im2col is efficient because:
- Converts convolution to matmul (highly optimized)
- Cache-friendly memory access
- Reuses existing matmul code
Reference: "High Performance Convolutional Neural Networks for Document Processing" (Chellapilla et al., 2006).
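The im2col steps described for Conv2D can be sketched roughly as follows for a single image stored as a flat row-major slice. This is an illustrative sketch, not the package's implementation: the function names im2col and conv2d, the flat-slice layout, and the output-size formula (h + 2*pad - k)/stride + 1 (the conventional one for padded convolution) are assumptions. Like most deep learning libraries, the sketch does not flip the kernel.

```go
package main

import "fmt"

// im2col (hypothetical) unfolds a [c, h, w] input into a matrix with c*k*k
// rows and outH*outW columns. Each column holds one receptive field.
// Positions that fall into the zero padding are left as 0.
func im2col(in []float32, c, h, w, k, stride, pad int) (cols []float32, outH, outW int) {
	outH = (h+2*pad-k)/stride + 1
	outW = (w+2*pad-k)/stride + 1
	cols = make([]float32, c*k*k*outH*outW)
	for ch := 0; ch < c; ch++ {
		for ky := 0; ky < k; ky++ {
			for kx := 0; kx < k; kx++ {
				row := (ch*k+ky)*k + kx
				for oy := 0; oy < outH; oy++ {
					for ox := 0; ox < outW; ox++ {
						iy := oy*stride + ky - pad
						ix := ox*stride + kx - pad
						if iy < 0 || iy >= h || ix < 0 || ix >= w {
							continue // padded region stays zero
						}
						cols[row*outH*outW+oy*outW+ox] = in[(ch*h+iy)*w+ix]
					}
				}
			}
		}
	}
	return cols, outH, outW
}

// conv2d (hypothetical) treats the [cOut, c, k, k] kernel as a
// [cOut, c*k*k] matrix and multiplies it by the im2col matrix.
// The result is laid out as [cOut, outH, outW].
func conv2d(in, kernel []float32, c, h, w, cOut, k, stride, pad int) ([]float32, int, int) {
	cols, outH, outW := im2col(in, c, h, w, k, stride, pad)
	rows, n := c*k*k, outH*outW
	out := make([]float32, cOut*n)
	for o := 0; o < cOut; o++ {
		for r := 0; r < rows; r++ {
			kv := kernel[o*rows+r]
			for j := 0; j < n; j++ {
				out[o*n+j] += kv * cols[r*n+j]
			}
		}
	}
	return out, outH, outW
}

func main() {
	// One channel, 3x3 input, a single 2x2 kernel of ones, stride 1, no padding.
	in := []float32{1, 2, 3, 4, 5, 6, 7, 8, 9}
	kernel := []float32{1, 1, 1, 1}
	out, outH, outW := conv2d(in, kernel, 1, 3, 3, 1, 2, 1, 0)
	fmt.Println(out, outH, outW) // [12 16 24 28] 2 2
}
```

Each output value here is the sum of one 2x2 window, which makes the result easy to check by hand. A batched version would repeat the same two steps per image.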
func (*CPUBackend) Device ¶
func (cpu *CPUBackend) Device() tensor.Device
Device returns the compute device.
func (*CPUBackend) Div ¶
func (cpu *CPUBackend) Div(a, b *tensor.RawTensor) *tensor.RawTensor
Div performs element-wise division with broadcasting.
func (*CPUBackend) MatMul ¶
func (cpu *CPUBackend) MatMul(a, b *tensor.RawTensor) *tensor.RawTensor
MatMul performs matrix multiplication.
For 2D tensors: (M, K) @ (K, N) -> (M, N)
Uses a naive O(n³) implementation for Phase 1.
TODO: Integrate with gonum/blas for better performance in Phase 2.
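For reference, a naive triple-loop multiplication of the kind described can be sketched as below. The function matMul and the flat row-major layout are assumptions made for illustration; the package's actual loop order and storage are not documented here.

```go
package main

import "fmt"

// matMul (hypothetical) multiplies a row-major (m x k) matrix a by a
// row-major (k x n) matrix b and returns the row-major (m x n) product.
func matMul(a, b []float32, m, k, n int) []float32 {
	out := make([]float32, m*n)
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			var sum float32
			for p := 0; p < k; p++ {
				sum += a[i*k+p] * b[p*n+j]
			}
			out[i*n+j] = sum
		}
	}
	return out
}

func main() {
	a := []float32{1, 2, 3, 4, 5, 6}    // 2x3
	b := []float32{7, 8, 9, 10, 11, 12} // 3x2
	fmt.Println(matMul(a, b, 2, 3, 2))  // [58 64 139 154]
}
```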
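func (*CPUBackend) MaxPool2D ¶
func (cpu *CPUBackend) MaxPool2D(input *tensor.RawTensor, kernelSize, stride int) *tensor.RawTensor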
MaxPool2D performs 2D max pooling.
Max pooling reduces spatial dimensions by taking the maximum value in each pooling window. Unlike Conv2D, MaxPool2D has no learnable parameters.
Input shape: [batch, channels, height, width]
Output shape: [batch, channels, out_height, out_width]
Where:
out_height = (height - kernelSize) / stride + 1
out_width = (width - kernelSize) / stride + 1
Algorithm:
- For each batch and channel
- Slide kernelSize x kernelSize window with given stride
- Take maximum value in each window
- Output max value
Example (2x2 pool, stride=2):
Input:  [[1,2,3,4],
         [5,6,7,8],
         [9,10,11,12],
         [13,14,15,16]]
Output: [[6,8],
         [14,16]]
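The sliding-window procedure can be sketched for a single [height, width] plane as follows. The function maxPool2D and its flat row-major input are hypothetical and chosen only for illustration; a full implementation would repeat this for every batch and channel.

```go
package main

import "fmt"

// maxPool2D (hypothetical) pools one row-major h x w plane with a square
// kernelSize x kernelSize window and returns the pooled plane and its size.
func maxPool2D(in []float32, h, w, kernelSize, stride int) ([]float32, int, int) {
	outH := (h-kernelSize)/stride + 1
	outW := (w-kernelSize)/stride + 1
	out := make([]float32, outH*outW)
	for oy := 0; oy < outH; oy++ {
		for ox := 0; ox < outW; ox++ {
			// Start from the top-left element of the window.
			best := in[oy*stride*w+ox*stride]
			for ky := 0; ky < kernelSize; ky++ {
				for kx := 0; kx < kernelSize; kx++ {
					if v := in[(oy*stride+ky)*w+ox*stride+kx]; v > best {
						best = v
					}
				}
			}
			out[oy*outW+ox] = best
		}
	}
	return out, outH, outW
}

func main() {
	in := []float32{
		1, 3, 2, 4,
		5, 6, 7, 8,
		9, 2, 1, 0,
		3, 4, 5, 6,
	}
	out, outH, outW := maxPool2D(in, 4, 4, 2, 2)
	fmt.Println(out, outH, outW) // [6 8 9 6] 2 2
}
```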
func (*CPUBackend) Mul ¶
func (cpu *CPUBackend) Mul(a, b *tensor.RawTensor) *tensor.RawTensor
Mul performs element-wise multiplication with broadcasting.
func (*CPUBackend) ReLU ¶ added in v0.2.0
func (cpu *CPUBackend) ReLU(x *tensor.RawTensor) *tensor.RawTensor
ReLU applies ReLU activation: max(0, x).
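Element-wise, max(0, x) amounts to the following sketch over a plain slice. The relu helper is hypothetical and is not the method's actual code.

```go
package main

import "fmt"

// relu (hypothetical) returns a new slice in which negative values are
// replaced by 0 and non-negative values are kept unchanged.
func relu(x []float32) []float32 {
	out := make([]float32, len(x)) // zero-initialized
	for i, v := range x {
		if v > 0 {
			out[i] = v
		}
	}
	return out
}

func main() {
	fmt.Println(relu([]float32{-2, 0, 3.5})) // [0 0 3.5]
}
```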
func (*CPUBackend) Softmax ¶ added in v0.2.0
func (cpu *CPUBackend) Softmax(x *tensor.RawTensor) *tensor.RawTensor
Softmax applies softmax along the last dimension. Expects 2D input [batch_size, num_classes].
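A row-wise softmax over a [batch_size, num_classes] input might look like the sketch below. Subtracting the row maximum before exponentiating is a common numerical-stability technique; whether this package does so is an assumption, as are the softmax helper name and the flat float64 layout.

```go
package main

import (
	"fmt"
	"math"
)

// softmax (hypothetical) normalizes each row of a row-major rows x cols
// matrix so that its entries are positive and sum to 1.
func softmax(x []float64, rows, cols int) []float64 {
	out := make([]float64, len(x))
	for r := 0; r < rows; r++ {
		row := x[r*cols : (r+1)*cols]
		// Subtract the row maximum so math.Exp never overflows.
		maxV := row[0]
		for _, v := range row[1:] {
			if v > maxV {
				maxV = v
			}
		}
		var sum float64
		for c, v := range row {
			e := math.Exp(v - maxV)
			out[r*cols+c] = e
			sum += e
		}
		for c := range row {
			out[r*cols+c] /= sum
		}
	}
	return out
}

func main() {
	// Two rows of equal logits; the second row would overflow without
	// the max subtraction.
	fmt.Println(softmax([]float64{0, 0, 1000, 1000}, 2, 2)) // [0.5 0.5 0.5 0.5]
}
```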