package tensor

v0.7.7
Published: Jan 6, 2026 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package tensor provides the core tensor types and operations for the Born ML framework.

The raw_ops files provide type-specific tensor operations for ONNX inference. The type-specific implementations (Float32, Float64, Int32, Int64) are intentionally similar/duplicated for performance, since generics would add overhead.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Backend

type Backend interface {
	// Element-wise binary operations
	Add(a, b *RawTensor) *RawTensor
	Sub(a, b *RawTensor) *RawTensor
	Mul(a, b *RawTensor) *RawTensor
	Div(a, b *RawTensor) *RawTensor

	// Matrix operations
	MatMul(a, b *RawTensor) *RawTensor

	// BatchMatMul performs batched matrix multiplication for 3D/4D tensors.
	// For 3D: [B, M, K] @ [B, K, N] -> [B, M, N]
	// For 4D: [B, H, M, K] @ [B, H, K, N] -> [B, H, M, N]
	BatchMatMul(a, b *RawTensor) *RawTensor

	// Convolutional operations
	Conv2D(input, kernel *RawTensor, stride, padding int) *RawTensor
	MaxPool2D(input *RawTensor, kernelSize, stride int) *RawTensor

	// Convolutional backward operations
	Conv2DInputBackward(input, kernel, grad *RawTensor, stride, padding int) *RawTensor
	Conv2DKernelBackward(input, kernel, grad *RawTensor, stride, padding int) *RawTensor
	MaxPool2DBackward(input, grad *RawTensor, maxIndices []int, kernelSize, stride int) *RawTensor

	// Shape operations
	Reshape(t *RawTensor, newShape Shape) *RawTensor
	Transpose(t *RawTensor, axes ...int) *RawTensor

	// Scalar operations (element-wise with scalar)
	MulScalar(x *RawTensor, scalar any) *RawTensor // multiply by scalar
	AddScalar(x *RawTensor, scalar any) *RawTensor // add scalar
	SubScalar(x *RawTensor, scalar any) *RawTensor // subtract scalar
	DivScalar(x *RawTensor, scalar any) *RawTensor // divide by scalar

	// Math operations (element-wise)
	Exp(x *RawTensor) *RawTensor   // exponential
	Log(x *RawTensor) *RawTensor   // natural logarithm
	Sqrt(x *RawTensor) *RawTensor  // square root
	Rsqrt(x *RawTensor) *RawTensor // reciprocal square root (1/sqrt(x))
	Cos(x *RawTensor) *RawTensor   // cosine
	Sin(x *RawTensor) *RawTensor   // sine

	// Activation functions
	Softmax(x *RawTensor, dim int) *RawTensor // softmax along dimension

	// Comparison operations (element-wise, return bool tensor)
	Greater(a, b *RawTensor) *RawTensor      // a > b
	Lower(a, b *RawTensor) *RawTensor        // a < b
	GreaterEqual(a, b *RawTensor) *RawTensor // a >= b
	LowerEqual(a, b *RawTensor) *RawTensor   // a <= b
	Equal(a, b *RawTensor) *RawTensor        // a == b
	NotEqual(a, b *RawTensor) *RawTensor     // a != b

	// Boolean operations (element-wise on bool tensors)
	Or(a, b *RawTensor) *RawTensor  // logical OR
	And(a, b *RawTensor) *RawTensor // logical AND
	Not(x *RawTensor) *RawTensor    // logical NOT

	// Reduction operations
	Sum(x *RawTensor) *RawTensor                            // total sum (scalar result)
	SumDim(x *RawTensor, dim int, keepDim bool) *RawTensor  // sum along dimension
	MeanDim(x *RawTensor, dim int, keepDim bool) *RawTensor // mean along dimension
	Argmax(x *RawTensor, dim int) *RawTensor                // index of maximum value along dimension

	// Manipulation operations
	Cat(tensors []*RawTensor, dim int) *RawTensor // concatenate along dimension
	Chunk(x *RawTensor, n, dim int) []*RawTensor  // split into n equal parts
	Unsqueeze(x *RawTensor, dim int) *RawTensor   // add dimension of size 1
	Squeeze(x *RawTensor, dim int) *RawTensor     // remove dimension of size 1

	// Indexing operations
	Gather(x *RawTensor, dim int, index *RawTensor) *RawTensor // select elements along dim using index tensor
	Where(condition, x, y *RawTensor) *RawTensor               // conditional element selection
	Embedding(weight, indices *RawTensor) *RawTensor           // lookup embeddings by indices

	// Shape operations (broadcast)
	Expand(x *RawTensor, shape Shape) *RawTensor // broadcast to shape

	// Type conversion
	Cast(x *RawTensor, dtype DataType) *RawTensor // cast to different data type

	// Metadata
	Name() string
	Device() Device
}

Backend defines the interface that all compute backends must implement. Backends handle the actual computation for tensor operations.

Implementations:

  • CPU: Pure Go with SIMD optimizations (TASK-003)
  • CUDA: NVIDIA GPU via driver API (Phase 2)
  • Vulkan: Cross-platform GPU compute (Phase 2)
  • Metal: Apple GPU (Phase 2)
  • WebGPU: Browser/native GPU (Phase 2)
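
As a sketch of backend-agnostic code, the hypothetical helper below (not part of this package) is written purely against the Backend interface, so it runs on any implementation:

// addThenScale is a hypothetical helper: it adds two tensors and
// scales the result by 2 using only Backend methods.
func addThenScale(b Backend, x, y *RawTensor) *RawTensor {
	sum := b.Add(x, y)           // element-wise addition
	return b.MulScalar(sum, 2.0) // scalar is `any`; accepted scalar types are backend-defined
}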

type DType

type DType interface {
	~float32 | ~float64 | ~int32 | ~int64 | ~uint8 | ~bool
}

DType is a constraint for supported tensor data types. It uses Go generics to ensure compile-time type safety.
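
A user-defined generic function can build on the same type set. A minimal sketch (hypothetical helper; note that bool satisfies DType but not +, so the constraint here is narrowed to the numeric types):

// sumSlice sums a slice of any numeric type in the DType set.
func sumSlice[T ~float32 | ~float64 | ~int32 | ~int64 | ~uint8](data []T) T {
	var total T
	for _, v := range data {
		total += v
	}
	return total
}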

type DataType

type DataType int

DataType represents runtime type information for tensors.

const (
	Float32 DataType = iota
	Float64
	Int32
	Int64
	Uint8
	Bool
)

Supported data types for tensors.

func (DataType) Size

func (dt DataType) Size() int

Size returns the byte size of the data type.

func (DataType) String

func (dt DataType) String() string

String returns a human-readable name for the data type.
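
A quick sketch of both methods (float32 is 4 bytes and int64 is 8 by Go's definitions; the exact String names are implementation-defined):

fmt.Println(tensor.Float32.Size())   // 4
fmt.Println(tensor.Int64.Size())     // 8
fmt.Println(tensor.Float32.String()) // e.g. "float32"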

type Device

type Device int

Device represents the compute device for tensor operations.

const (
	CPU Device = iota
	CUDA
	Vulkan
	Metal
	WebGPU
)

Supported compute devices.

func (Device) String

func (d Device) String() string

String returns a human-readable device name.

type LazyBackend added in v0.6.0

type LazyBackend interface {
	// ReadGPUBuffer reads data from a GPU buffer to CPU memory.
	// bufferPtr is unsafe.Pointer to *wgpu.Buffer (or similar GPU buffer type).
	// size is the number of bytes to read.
	// Returns the CPU data or an error.
	ReadGPUBuffer(bufferPtr unsafe.Pointer, size uint64) ([]byte, error)

	// ReleaseGPUBuffer releases the GPU buffer when no longer needed.
	ReleaseGPUBuffer(bufferPtr unsafe.Pointer)
}

LazyBackend is an interface for backends that support lazy GPU evaluation. The backend must implement ReadGPUBuffer to transfer data from GPU to CPU.
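
As a rough sketch, a GPU backend might satisfy LazyBackend as follows (gpuBackend and the readback mechanics are placeholders, not part of this package):

// gpuBackend is a hypothetical GPU backend with lazy readback support.
type gpuBackend struct{ /* device handles, queues, ... */ }

func (g *gpuBackend) ReadGPUBuffer(bufferPtr unsafe.Pointer, size uint64) ([]byte, error) {
	data := make([]byte, size)
	// ... map the GPU buffer and copy size bytes into data;
	// the details depend entirely on the GPU API in use ...
	return data, nil
}

func (g *gpuBackend) ReleaseGPUBuffer(bufferPtr unsafe.Pointer) {
	// ... free or unreference the underlying GPU buffer ...
}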

type LazyGPUData added in v0.6.0

type LazyGPUData struct {
	// contains filtered or unexported fields
}

LazyGPUData holds a reference to GPU-resident data for lazy evaluation. When Data() is called on a RawTensor with LazyGPUData, the data is transferred from GPU to CPU only at that point (lazy realization).

func NewLazyGPUData added in v0.6.0

func NewLazyGPUData(bufferPtr unsafe.Pointer, size uint64, backend LazyBackend) *LazyGPUData

NewLazyGPUData creates a new LazyGPUData referencing a GPU buffer. The GPU buffer will be automatically released when garbage collected.

func (*LazyGPUData) BufferPtr added in v0.6.0

func (l *LazyGPUData) BufferPtr() unsafe.Pointer

BufferPtr returns the underlying GPU buffer pointer. This is used by backend operations that need to chain GPU operations.

func (*LazyGPUData) IsRealized added in v0.6.0

func (l *LazyGPUData) IsRealized() bool

IsRealized returns whether the GPU data has been transferred to CPU.

func (*LazyGPUData) MarkRealized added in v0.6.0

func (l *LazyGPUData) MarkRealized()

MarkRealized marks the GPU data as realized (transferred to CPU).

func (*LazyGPUData) Realize added in v0.6.0

func (l *LazyGPUData) Realize() ([]byte, error)

Realize transfers data from GPU to CPU and returns it. This is called lazily when Data() is accessed. Thread-safe: multiple goroutines can safely call this. After realization, the GPU buffer is released to free GPU memory.

func (*LazyGPUData) Release added in v0.6.0

func (l *LazyGPUData) Release()

Release releases the GPU buffer. Called when the tensor is no longer needed.

func (*LazyGPUData) Size added in v0.6.0

func (l *LazyGPUData) Size() uint64

Size returns the buffer size in bytes.

type MockBackend

type MockBackend struct{}

MockBackend is a simple backend for testing. It implements all operations naively for correctness verification.

func NewMockBackend

func NewMockBackend() *MockBackend

NewMockBackend creates a new MockBackend.
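
Typical test usage, as a sketch:

backend := tensor.NewMockBackend()
a := tensor.Ones[float32](Shape{2, 2}, backend)
b := tensor.Ones[float32](Shape{2, 2}, backend)
c := a.Add(b) // every element == 2, computed naively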

func (*MockBackend) Add

func (m *MockBackend) Add(a, b *RawTensor) *RawTensor

Add performs element-wise addition with broadcasting.

func (*MockBackend) AddScalar added in v0.3.0

func (m *MockBackend) AddScalar(x *RawTensor, scalar any) *RawTensor

AddScalar adds a scalar to tensor elements (mock implementation).

func (*MockBackend) And added in v0.3.0

func (m *MockBackend) And(a, b *RawTensor) *RawTensor

And performs element-wise logical AND operation (mock implementation).

func (*MockBackend) Argmax added in v0.3.0

func (m *MockBackend) Argmax(_ *RawTensor, _ int) *RawTensor

Argmax returns indices of maximum values along the specified dimension (mock stub).

func (*MockBackend) BatchMatMul added in v0.4.0

func (m *MockBackend) BatchMatMul(a, b *RawTensor) *RawTensor

BatchMatMul performs batched matrix multiplication (naive implementation for testing).

func (*MockBackend) Cast added in v0.3.0

func (m *MockBackend) Cast(x *RawTensor, dtype DataType) *RawTensor

Cast converts the tensor to a different data type (mock implementation).

func (*MockBackend) Cat added in v0.3.0

func (m *MockBackend) Cat(tensors []*RawTensor, dim int) *RawTensor

Cat concatenates tensors along the specified dimension (naive implementation).

func (*MockBackend) Chunk added in v0.3.0

func (m *MockBackend) Chunk(x *RawTensor, n, dim int) []*RawTensor

Chunk splits tensor into n equal parts along the specified dimension.

func (*MockBackend) Conv2D

func (m *MockBackend) Conv2D(input, kernel *RawTensor, stride, padding int) *RawTensor

Conv2D performs 2D convolution (naive implementation for testing).

func (*MockBackend) Conv2DInputBackward added in v0.7.1

func (m *MockBackend) Conv2DInputBackward(_, _, _ *RawTensor, _, _ int) *RawTensor

Conv2DInputBackward computes gradient w.r.t. input for Conv2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) Conv2DKernelBackward added in v0.7.1

func (m *MockBackend) Conv2DKernelBackward(_, _, _ *RawTensor, _, _ int) *RawTensor

Conv2DKernelBackward computes gradient w.r.t. kernel for Conv2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) Cos added in v0.3.0

func (m *MockBackend) Cos(x *RawTensor) *RawTensor

Cos computes element-wise cosine.

func (*MockBackend) Device

func (m *MockBackend) Device() Device

Device returns the device type.

func (*MockBackend) Div

func (m *MockBackend) Div(a, b *RawTensor) *RawTensor

Div performs element-wise division with broadcasting.

func (*MockBackend) DivScalar added in v0.3.0

func (m *MockBackend) DivScalar(x *RawTensor, scalar any) *RawTensor

DivScalar divides tensor elements by a scalar (mock implementation).

func (*MockBackend) Embedding added in v0.5.1

func (m *MockBackend) Embedding(weight, indices *RawTensor) *RawTensor

Embedding performs embedding lookup (naive implementation).

weight: [numEmbeddings, embeddingDim]
indices: any shape of int32 indices
output: [...indices.shape, embeddingDim]

func (*MockBackend) Equal added in v0.3.0

func (m *MockBackend) Equal(a, b *RawTensor) *RawTensor

Equal performs element-wise equality comparison (mock implementation).

func (*MockBackend) Exp added in v0.3.0

func (m *MockBackend) Exp(x *RawTensor) *RawTensor

Exp computes element-wise exponential.

func (*MockBackend) Expand added in v0.3.0

func (m *MockBackend) Expand(_ *RawTensor, _ Shape) *RawTensor

Expand broadcasts the tensor to a new shape (mock stub).

func (*MockBackend) Gather added in v0.3.0

func (m *MockBackend) Gather(x *RawTensor, dim int, index *RawTensor) *RawTensor

Gather selects elements along dim using index tensor (naive implementation).

func (*MockBackend) Greater added in v0.3.0

func (m *MockBackend) Greater(a, b *RawTensor) *RawTensor

Greater performs element-wise greater-than comparison (mock implementation).

func (*MockBackend) GreaterEqual added in v0.3.0

func (m *MockBackend) GreaterEqual(a, b *RawTensor) *RawTensor

GreaterEqual performs element-wise greater-than-or-equal comparison (mock implementation).

func (*MockBackend) Log added in v0.3.0

func (m *MockBackend) Log(x *RawTensor) *RawTensor

Log computes natural logarithm element-wise (mock implementation).

func (*MockBackend) Lower added in v0.3.0

func (m *MockBackend) Lower(a, b *RawTensor) *RawTensor

Lower performs element-wise less-than comparison (mock implementation).

func (*MockBackend) LowerEqual added in v0.3.0

func (m *MockBackend) LowerEqual(a, b *RawTensor) *RawTensor

LowerEqual performs element-wise less-than-or-equal comparison (mock implementation).

func (*MockBackend) MatMul

func (m *MockBackend) MatMul(a, b *RawTensor) *RawTensor

MatMul performs matrix multiplication.

func (*MockBackend) MaxPool2D

func (m *MockBackend) MaxPool2D(input *RawTensor, kernelSize, stride int) *RawTensor

MaxPool2D performs 2D max pooling (naive implementation for testing).

func (*MockBackend) MaxPool2DBackward added in v0.7.1

func (m *MockBackend) MaxPool2DBackward(_, _ *RawTensor, _ []int, _, _ int) *RawTensor

MaxPool2DBackward computes gradient w.r.t. input for MaxPool2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) MeanDim added in v0.3.0

func (m *MockBackend) MeanDim(x *RawTensor, dim int, keepDim bool) *RawTensor

MeanDim computes the mean of tensor elements along the specified dimension.

func (*MockBackend) Mul

func (m *MockBackend) Mul(a, b *RawTensor) *RawTensor

Mul performs element-wise multiplication with broadcasting.

func (*MockBackend) MulScalar added in v0.3.0

func (m *MockBackend) MulScalar(x *RawTensor, scalar any) *RawTensor

MulScalar multiplies tensor elements by a scalar (mock implementation).

func (*MockBackend) Name

func (m *MockBackend) Name() string

Name returns the backend name.

func (*MockBackend) Not added in v0.3.0

func (m *MockBackend) Not(x *RawTensor) *RawTensor

Not performs element-wise logical NOT operation (mock implementation).

func (*MockBackend) NotEqual added in v0.3.0

func (m *MockBackend) NotEqual(a, b *RawTensor) *RawTensor

NotEqual performs element-wise inequality comparison (mock implementation).

func (*MockBackend) Or added in v0.3.0

func (m *MockBackend) Or(a, b *RawTensor) *RawTensor

Or performs element-wise logical OR operation (mock implementation).

func (*MockBackend) Reshape

func (m *MockBackend) Reshape(t *RawTensor, newShape Shape) *RawTensor

Reshape changes tensor shape.

func (*MockBackend) Rsqrt added in v0.3.0

func (m *MockBackend) Rsqrt(x *RawTensor) *RawTensor

Rsqrt computes element-wise reciprocal square root.

func (*MockBackend) Sin added in v0.3.0

func (m *MockBackend) Sin(x *RawTensor) *RawTensor

Sin computes element-wise sine.

func (*MockBackend) Softmax added in v0.3.0

func (m *MockBackend) Softmax(_ *RawTensor, _ int) *RawTensor

Softmax applies softmax activation along the specified dimension (mock stub).

func (*MockBackend) Sqrt added in v0.3.0

func (m *MockBackend) Sqrt(x *RawTensor) *RawTensor

Sqrt computes element-wise square root.

func (*MockBackend) Squeeze added in v0.3.0

func (m *MockBackend) Squeeze(x *RawTensor, dim int) *RawTensor

Squeeze removes a dimension of size 1 at the specified position.

func (*MockBackend) Sub

func (m *MockBackend) Sub(a, b *RawTensor) *RawTensor

Sub performs element-wise subtraction with broadcasting.

func (*MockBackend) SubScalar added in v0.3.0

func (m *MockBackend) SubScalar(x *RawTensor, scalar any) *RawTensor

SubScalar subtracts a scalar from tensor elements (mock implementation).

func (*MockBackend) Sum added in v0.3.0

func (m *MockBackend) Sum(x *RawTensor) *RawTensor

Sum computes the total sum of all tensor elements (mock implementation).

func (*MockBackend) SumDim added in v0.3.0

func (m *MockBackend) SumDim(x *RawTensor, dim int, keepDim bool) *RawTensor

SumDim sums tensor elements along the specified dimension (naive implementation).

func (*MockBackend) Transpose

func (m *MockBackend) Transpose(t *RawTensor, axes ...int) *RawTensor

Transpose transposes tensor dimensions.

func (*MockBackend) Unsqueeze added in v0.3.0

func (m *MockBackend) Unsqueeze(x *RawTensor, dim int) *RawTensor

Unsqueeze adds a dimension of size 1 at the specified position.

func (*MockBackend) Where added in v0.3.0

func (m *MockBackend) Where(condition, x, y *RawTensor) *RawTensor

Where performs conditional element selection (naive implementation).

type RawTensor

type RawTensor struct {
	// contains filtered or unexported fields
}

RawTensor is the low-level tensor representation. It uses reference-counted shared buffers for Copy-on-Write semantics. Supports lazy GPU evaluation: data is transferred from GPU only when Data() is called.
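
A minimal sketch of direct RawTensor use:

raw, err := tensor.NewRaw(Shape{2, 3}, tensor.Float32, tensor.CPU)
if err != nil {
	log.Fatal(err)
}
data := raw.AsFloat32() // zero-copy typed view, 6 elements
data[0] = 1.5           // writes through to the underlying buffer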

func Cast added in v0.6.0

func Cast(x *RawTensor, dtype DataType) (*RawTensor, error)

Cast converts a tensor to a different data type.

func Clip added in v0.6.0

func Clip(x *RawTensor, minVal, maxVal float32) (*RawTensor, error)

Clip clamps values to the range [minVal, maxVal].
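
Example (a sketch):

clipped, err := tensor.Clip(x, 0, 1) // clamp every element into [0, 1]
if err != nil {
	log.Fatal(err)
}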

func Concat added in v0.6.0

func Concat(tensors []*RawTensor, axis int) (*RawTensor, error)

Concat concatenates tensors along the specified dimension.

func Expand added in v0.6.0

func Expand(x *RawTensor, targetShape Shape) (*RawTensor, error)

Expand broadcasts a tensor to a larger shape.

func Flatten added in v0.6.0

func Flatten(x *RawTensor, axis int) (*RawTensor, error)

Flatten flattens tensor from axis onward into a single dimension.
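
For instance, flattening a [2, 3, 4] tensor from axis 1 should yield shape [2, 12] (a sketch based on the description above):

x, _ := tensor.NewRaw(Shape{2, 3, 4}, tensor.Float32, tensor.CPU)
flat, err := tensor.Flatten(x, 1) // expected shape: [2, 12]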

func FullRaw added in v0.6.0

func FullRaw(shape Shape, value float32, dtype DataType, device Device) (*RawTensor, error)

FullRaw creates a RawTensor filled with a constant value.

func GELU added in v0.6.0

func GELU(x *RawTensor) (*RawTensor, error)

GELU applies the Gaussian Error Linear Unit activation. Uses approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
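
Usage sketch (GELU(0) = 0 under the tanh approximation, so a zero tensor stays zero):

x, _ := tensor.FullRaw(Shape{3}, 0, tensor.Float32, tensor.CPU)
y, err := tensor.GELU(x) // y should be all zeros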

func Gather added in v0.6.0

func Gather(x, indices *RawTensor, axis int) (*RawTensor, error)

Gather selects elements along an axis according to indices.

func LeakyReLU added in v0.6.0

func LeakyReLU(x *RawTensor, alpha float32) (*RawTensor, error)

LeakyReLU applies leaky ReLU: max(x, alpha*x).

func LogSoftmax added in v0.6.0

func LogSoftmax(x *RawTensor, axis int) (*RawTensor, error)

LogSoftmax computes log(softmax(x)) along the specified axis.

func NewLazyRaw added in v0.6.0

func NewLazyRaw(shape Shape, dtype DataType, device Device, gpuData *LazyGPUData) (*RawTensor, error)

NewLazyRaw creates a new RawTensor with lazy GPU data. The data is not transferred from GPU until Data() is called.

func NewRaw

func NewRaw(shape Shape, dtype DataType, device Device) (*RawTensor, error)

NewRaw creates a new RawTensor with the given shape and type. Memory is allocated and zeroed (per Go allocation semantics) but not otherwise initialized.

func PReLU added in v0.6.0

func PReLU(x, slope *RawTensor) (*RawTensor, error)

PReLU applies parametric ReLU: max(x, slope*x) where slope is per-element or broadcasted.

func ReLU added in v0.6.0

func ReLU(x *RawTensor) (*RawTensor, error)

ReLU applies the ReLU activation function element-wise: max(x, 0).

func Reshape added in v0.6.0

func Reshape(x *RawTensor, newShape Shape) (*RawTensor, error)

Reshape returns a new tensor with the given shape (shares data if contiguous).

func SiLU added in v0.6.0

func SiLU(x *RawTensor) (*RawTensor, error)

SiLU applies the Sigmoid Linear Unit (Swish) activation: x * sigmoid(x).

func Sigmoid added in v0.6.0

func Sigmoid(x *RawTensor) (*RawTensor, error)

Sigmoid applies the sigmoid activation function: 1/(1+exp(-x)).

func Slice added in v0.6.0

func Slice(x *RawTensor, starts, ends, axes, steps []int64) (*RawTensor, error)

Slice extracts a slice from a tensor.
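
The parameters appear to follow ONNX Slice conventions (per-axis starts, exclusive ends, axes, steps). A sketch under that assumption:

x, _ := tensor.NewRaw(Shape{4, 5}, tensor.Float32, tensor.CPU)
// Rows [0, 2) along axis 0 with step 1 -> expected shape [2, 5].
y, err := tensor.Slice(x, []int64{0}, []int64{2}, []int64{0}, []int64{1})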

func Softmax added in v0.6.0

func Softmax(x *RawTensor, axis int) (*RawTensor, error)

Softmax applies softmax along the specified axis.

func Split added in v0.6.0

func Split(x *RawTensor, axis int, splitSizes []int) ([]*RawTensor, error)

Split splits a tensor into multiple tensors along an axis.

func Squeeze added in v0.6.0

func Squeeze(x *RawTensor, axes ...int) (*RawTensor, error)

Squeeze removes dimensions of size 1 at the specified axes.

If no axes are specified, removes all dimensions of size 1.

func Tanh added in v0.6.0

func Tanh(x *RawTensor) (*RawTensor, error)

Tanh applies the hyperbolic tangent activation function.

func TransposeAxes added in v0.6.0

func TransposeAxes(x *RawTensor, axes ...int) (*RawTensor, error)

TransposeAxes transposes dimensions according to the given permutation.

func Unsqueeze added in v0.6.0

func Unsqueeze(x *RawTensor, axes ...int) (*RawTensor, error)

Unsqueeze adds dimensions of size 1 at the specified axes.

func WhereRaw added in v0.6.0

func WhereRaw(condition, x, y *RawTensor) (*RawTensor, error)

WhereRaw selects elements from x or y based on condition.

func (*RawTensor) AsBool

func (r *RawTensor) AsBool() []bool

AsBool interprets the data as []bool. Panics if the tensor's dtype is not Bool. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsFloat32

func (r *RawTensor) AsFloat32() []float32

AsFloat32 interprets the data as []float32. Panics if the tensor's dtype is not Float32. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsFloat64

func (r *RawTensor) AsFloat64() []float64

AsFloat64 interprets the data as []float64. Panics if the tensor's dtype is not Float64. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsInt32

func (r *RawTensor) AsInt32() []int32

AsInt32 interprets the data as []int32. Panics if the tensor's dtype is not Int32. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsInt64

func (r *RawTensor) AsInt64() []int64

AsInt64 interprets the data as []int64. Panics if the tensor's dtype is not Int64. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsUint8

func (r *RawTensor) AsUint8() []uint8

AsUint8 interprets the data as []uint8. Panics if the tensor's dtype is not Uint8. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) ByteSize

func (r *RawTensor) ByteSize() int

ByteSize returns the total memory size in bytes.

func (*RawTensor) Clone

func (r *RawTensor) Clone() *RawTensor

Clone creates a shallow copy of the RawTensor (shares buffer with reference counting). The buffer is reference-counted and will be copied only when modified (copy-on-write). This enables cheap cloning and inplace optimizations when refCount == 1. Note: GPU lazy data is shared (same underlying GPU buffer).

Example:

a := tensor.Ones[float32](Shape{1000, 1000}, backend)
b := a.Clone()  // Shares buffer with a (just increments refCount)
c := a.Add(b)   // May use inplace if refCount allows

func (*RawTensor) DType

func (r *RawTensor) DType() DataType

DType returns the tensor's data type.

func (*RawTensor) Data

func (r *RawTensor) Data() []byte

Data returns the raw byte slice. For lazy GPU tensors, this triggers data transfer from GPU to CPU (expensive!). WARNING: Direct access to underlying memory. Use with caution.

func (*RawTensor) Device

func (r *RawTensor) Device() Device

Device returns the tensor's compute device.

func (*RawTensor) ForceNonUnique

func (r *RawTensor) ForceNonUnique() func()

ForceNonUnique temporarily increases refCount to prevent inplace modifications. Returns a cleanup function that MUST be called to restore refCount (use defer).

This is used by autodiff backend to preserve original input values: inplace optimizations would corrupt the computational graph.

Example:

defer t.ForceNonUnique()()            // t is a *RawTensor
result := backend.Mul(t, other)       // No inplace modification!

func (*RawTensor) GPUData added in v0.6.0

func (r *RawTensor) GPUData() *LazyGPUData

GPUData returns the lazy GPU data reference, if any. Returns nil for CPU-only tensors.

func (*RawTensor) IsLazy added in v0.6.0

func (r *RawTensor) IsLazy() bool

IsLazy returns true if this tensor has unrealized GPU data. Use this to check if Data() will trigger an expensive GPU→CPU transfer.

func (*RawTensor) IsUnique

func (r *RawTensor) IsUnique() bool

IsUnique returns true if this tensor is the only reference to the buffer. When true, backends can perform inplace operations for better performance.

func (*RawTensor) NumElements

func (r *RawTensor) NumElements() int

NumElements returns the total number of elements.

func (*RawTensor) Release

func (r *RawTensor) Release()

Release decrements the reference count and deallocates if it reaches 0. This is called automatically when a tensor is no longer needed (e.g., by GC finalizer).

func (*RawTensor) SetGPUData added in v0.6.0

func (r *RawTensor) SetGPUData(gpuData *LazyGPUData)

SetGPUData sets the lazy GPU data reference. This is used by GPU backends to create lazy tensors.

func (*RawTensor) Shape

func (r *RawTensor) Shape() Shape

Shape returns the tensor's shape.

func (*RawTensor) Strides

func (r *RawTensor) Strides() []int

Strides returns the tensor's memory strides.

type Shape

type Shape []int

Shape represents the dimensions of a tensor.

func BroadcastShapes

func BroadcastShapes(a, b Shape) (Shape, bool, error)

BroadcastShapes implements NumPy-style broadcasting rules.

Rules:

 1. Compare shapes element-wise from right to left.
 2. Dimensions are compatible if:
      • They are equal, OR
      • One of them is 1
 3. Missing dimensions are treated as 1.

Returns the broadcasted shape, a flag indicating if broadcasting is needed, and an error if incompatible.

Examples:

(3, 1) + (3, 5) → (3, 5), true, nil
(1, 5) + (3, 5) → (3, 5), true, nil
(3, 5) + (3, 5) → (3, 5), false, nil
(3, 4) + (3, 5) → nil, false, Error
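
In code:

out, needed, err := tensor.BroadcastShapes(Shape{3, 1}, Shape{3, 5})
// out == Shape{3, 5}, needed == true, err == nil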

func (Shape) Clone

func (s Shape) Clone() Shape

Clone returns a copy of the shape.

func (Shape) ComputeStrides

func (s Shape) ComputeStrides() []int

ComputeStrides calculates row-major strides for the shape. Strides define memory layout: stride[i] = product of all dimensions after i.
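
For a row-major [2, 3, 4] tensor, the strides are [12, 4, 1]:

s := Shape{2, 3, 4}
strides := s.ComputeStrides() // [12, 4, 1]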

func (Shape) Equal

func (s Shape) Equal(other Shape) bool

Equal checks if two shapes are equal.

func (Shape) NumElements

func (s Shape) NumElements() int

NumElements returns the total number of elements in the tensor.

func (Shape) Validate

func (s Shape) Validate() error

Validate checks if the shape is valid (all dimensions > 0).

type Tensor

type Tensor[T DType, B Backend] struct {
	// contains filtered or unexported fields
}

Tensor is a generic tensor with type T and backend B. It provides type-safe operations over multi-dimensional arrays.

Type Parameters:

  • T: Data type (must satisfy DType constraint)
  • B: Computation backend (must implement Backend interface)

Example:

backend := cpu.New()
t := tensor.Zeros[float32](Shape{3, 4}, backend)
result := t.Add(t) // Type-safe addition

func Arange

func Arange[T DType, B Backend](start, end T, b B) *Tensor[T, B]

Arange creates a 1D tensor with values from start to end (exclusive). Only works with numeric types (not bool).

Example:

t := tensor.Arange[int32](0, 10, backend) // [0, 1, 2, ..., 9]

func Cat added in v0.3.0

func Cat[T DType, B Backend](tensors []*Tensor[T, B], dim int) *Tensor[T, B]

Cat concatenates tensors along the specified dimension.

All tensors must have the same shape except along the concatenation dimension. Supports negative dim indexing (-1 = last dimension).

Example:

a := tensor.Randn[float32](Shape{2, 3}, backend)
b := tensor.Randn[float32](Shape{2, 5}, backend)
c := tensor.Cat([]*Tensor[float32, B]{a, b}, 1) // Shape: [2, 8]

func Eye

func Eye[T DType, B Backend](n int, b B) *Tensor[T, B]

Eye creates a 2D identity matrix.

Example:

t := tensor.Eye[float32](3, backend) // 3x3 identity matrix

func FromSlice

func FromSlice[T DType, B Backend](data []T, shape Shape, b B) (*Tensor[T, B], error)

FromSlice creates a tensor from a Go slice. The slice is copied into the tensor's memory.
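
Example (a sketch; the error is presumably returned when len(data) does not match the shape):

data := []float32{1, 2, 3, 4, 5, 6}
t, err := tensor.FromSlice(data, Shape{2, 3}, backend)
if err != nil {
	log.Fatal(err)
}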

func Full

func Full[T DType, B Backend](shape Shape, value T, b B) *Tensor[T, B]

Full creates a tensor filled with a specific value.

Example:

t := tensor.Full[float32](Shape{3, 3}, 3.14, backend)

func New

func New[T DType, B Backend](raw *RawTensor, b B) *Tensor[T, B]

New creates a Tensor from a RawTensor and backend.

func Ones

func Ones[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Ones creates a tensor filled with ones.

Example:

t := tensor.Ones[float64](Shape{2, 3}, backend)

func Rand

func Rand[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Rand creates a tensor with random values uniformly distributed in [0, 1). Only works with float types.

Example:

t := tensor.Rand[float32](Shape{10, 10}, backend)

func Randn

func Randn[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Randn creates a tensor with random values from a normal distribution (mean=0, std=1). Uses Box-Muller transform for generating normal distribution. Only works with float types. Note: Uses math/rand (not crypto/rand) - appropriate for ML/statistical purposes.

Example:

t := tensor.Randn[float32](Shape{100, 100}, backend)

func Where added in v0.3.0

func Where[T DType, B Backend](cond *Tensor[bool, B], x, y *Tensor[T, B]) *Tensor[T, B]

Where selects elements from x or y based on condition.

For each element:

  • If condition is true, select from x
  • If condition is false, select from y

Supports broadcasting between condition, x, and y.

Example:

cond := tensor.Full[bool](Shape{3}, true, backend)
x := tensor.Full[float32](Shape{3}, 1.0, backend)
y := tensor.Full[float32](Shape{3}, 0.0, backend)
result := tensor.Where(cond, x, y)  // [1.0, 1.0, 1.0]

func Zeros

func Zeros[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Zeros creates a tensor filled with zeros.

Example:

backend := cpu.New()
t := tensor.Zeros[float32](Shape{3, 4}, backend)

func (*Tensor[T, B]) Add

func (t *Tensor[T, B]) Add(other *Tensor[T, B]) *Tensor[T, B]

Add performs element-wise addition with broadcasting.

Example:

a := tensor.Ones[float32](Shape{3, 1}, backend)
b := tensor.Ones[float32](Shape{3, 5}, backend)
c := a.Add(b) // Shape: [3, 5] (broadcasted)

func (*Tensor[T, B]) AddScalar added in v0.3.0

func (t *Tensor[T, B]) AddScalar(scalar T) *Tensor[T, B]

AddScalar adds a scalar value to each element of the tensor.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.AddScalar(1.0)  // add 1.0 to all elements

func (*Tensor[bool, B]) And added in v0.3.0

func (t *Tensor[bool, B]) And(other *Tensor[bool, B]) *Tensor[bool, B]

And computes element-wise logical AND between two boolean tensors.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := tensor.Full[bool](Shape{3}, false, backend)
c := a.And(b)  // [false, false, false]

func (*Tensor[T, B]) Argmax added in v0.3.0

func (t *Tensor[T, B]) Argmax(dim int) *Tensor[int32, B]

Argmax returns the index of the maximum value along the specified dimension.

Returns a tensor of type int32 with the same shape as the input except the specified dimension is removed.

Supports negative dimension indexing (-1 = last dimension).

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
indices := x.Argmax(1)  // Shape: [3], index of max in each row

func (*Tensor[T, B]) At

func (t *Tensor[T, B]) At(indices ...int) T

At returns the element at the given indices. Panics if indices are out of bounds.

Example:

t := tensor.Zeros[float32](Shape{3, 4}, backend)
value := t.At(1, 2) // Row 1, column 2

func (*Tensor[T, B]) Backend

func (t *Tensor[T, B]) Backend() B

Backend returns the computation backend.

func (*Tensor[T, B]) BatchMatMul added in v0.4.0

func (t *Tensor[T, B]) BatchMatMul(other *Tensor[T, B]) *Tensor[T, B]

BatchMatMul performs batched matrix multiplication.

For 3D tensors: [B, M, K] @ [B, K, N] -> [B, M, N]
For 4D tensors: [B, H, M, K] @ [B, H, K, N] -> [B, H, M, N]

Example (Attention scores):

q := tensor.Randn[float32](tensor.Shape{2, 4, 64, 16}, backend) // [B, H, S, D]
k := tensor.Randn[float32](tensor.Shape{2, 4, 64, 16}, backend)
kT := k.Transpose(0, 1, 3, 2)
scores := q.BatchMatMul(kT) // [2, 4, 64, 64]

func (*Tensor[T, B]) Chunk added in v0.3.0

func (t *Tensor[T, B]) Chunk(n, dim int) []*Tensor[T, B]

Chunk splits the tensor into n equal parts along the specified dimension.

The dimension size must be divisible by n. Supports negative dim indexing (-1 = last dimension).

Example:

x := tensor.Randn[float32](Shape{2, 3, 6}, backend)
parts := x.Chunk(3, -1) // 3 tensors of shape [2, 3, 2]

func (*Tensor[T, B]) Clone

func (t *Tensor[T, B]) Clone() *Tensor[T, B]

Clone creates a deep copy of the tensor.

func (*Tensor[T, B]) Cos added in v0.3.0

func (t *Tensor[T, B]) Cos() *Tensor[T, B]

Cos computes the cosine of each element (input in radians).

Example:

x := tensor.Arange[float32](0, 10, backend)
y := x.Cos()  // cos(x) for each element

func (*Tensor[T, B]) DType

func (t *Tensor[T, B]) DType() DataType

DType returns the tensor's data type.

func (*Tensor[T, B]) Data

func (t *Tensor[T, B]) Data() []T

Data returns a typed slice view of the tensor's data. The slice directly accesses the underlying memory (zero-copy).

WARNING: Modifications to the returned slice will modify the tensor.

func (*Tensor[T, B]) Detach added in v0.3.0

func (t *Tensor[T, B]) Detach() *Tensor[T, B]

Detach returns a new tensor that shares the same data but doesn't track gradients.

This is useful for:

  • Stopping gradient flow at a specific point
  • Creating targets in reinforcement learning (no backprop through target)
  • HRM carry states between iterations (detach to prevent long gradient chains)
  • Teacher-student training (stop gradients through teacher)

The returned tensor shares the underlying data (zero-copy) but has no gradient tracking. Any operations on the detached tensor won't appear in the autodiff tape.

Example:

// Training with detached target
prediction := model.Forward(input)
target := target_model.Forward(input).Detach()  // No gradients through target
loss := prediction.Sub(target).Pow(2).Mean()

// HRM carry state
newCarry := Carry{
    zH: zH.Detach(),  // Break gradient chain
    zL: zL.Detach(),
}

func (*Tensor[T, B]) Device

func (t *Tensor[T, B]) Device() Device

Device returns the tensor's compute device.

func (*Tensor[T, B]) Div

func (t *Tensor[T, B]) Div(other *Tensor[T, B]) *Tensor[T, B]

Div performs element-wise division with broadcasting.

func (*Tensor[T, B]) DivScalar added in v0.3.0

func (t *Tensor[T, B]) DivScalar(scalar T) *Tensor[T, B]

DivScalar divides each element of the tensor by a scalar value.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.DivScalar(2.0)  // divide all elements by 2.0

func (*Tensor[T, B]) Embedding added in v0.5.1

func (t *Tensor[T, B]) Embedding(indices *Tensor[int32, B]) *Tensor[T, B]

Embedding performs embedding lookup using the tensor as the weight matrix.

The tensor t should have shape [numEmbeddings, embeddingDim]. The indices tensor contains integer indices to look up. Returns embeddings with shape [...indices.shape, embeddingDim].

This operation is differentiable - gradients flow back to the weight tensor via scatter-add (same indices accumulate gradients).

Example:

weight := tensor.Randn[float32](Shape{1000, 256}, backend)  // vocab=1000, dim=256
indices, _ := tensor.FromSlice([]int32{0, 5, 3}, Shape{3}, backend)
embeddings := weight.Embedding(indices)  // shape: [3, 256]

func (*Tensor[T, B]) Eq added in v0.3.0

func (t *Tensor[T, B]) Eq(other *Tensor[T, B]) *Tensor[bool, B]

Eq is a short alias for Equal.

Example:

mask := a.Eq(b)  // same as a.Equal(b)

func (*Tensor[T, B]) Equal added in v0.3.0

func (t *Tensor[T, B]) Equal(other *Tensor[T, B]) *Tensor[bool, B]

Equal returns a boolean tensor where each element is true if the corresponding elements in this tensor and other are equal.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Equal(b)  // [false, false, true, false, false]

func (*Tensor[T, B]) Exp added in v0.3.0

func (t *Tensor[T, B]) Exp() *Tensor[T, B]

Exp computes the exponential (e^x) of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Exp()  // e^x for each element

func (*Tensor[T, B]) Expand added in v0.3.0

func (t *Tensor[T, B]) Expand(shape Shape) *Tensor[T, B]

Expand broadcasts the tensor to a new shape.

The new shape must be compatible with the current shape according to NumPy broadcasting rules. Dimensions of size 1 can be broadcast to any size.

Example:

x := tensor.Randn[float32](Shape{1, 3}, backend)
y := x.Expand(Shape{5, 3})  // broadcast to [5, 3]

func (*Tensor[T, B]) Float32 added in v0.3.0

func (t *Tensor[T, B]) Float32() *Tensor[float32, B]

Float32 casts the tensor to float32 dtype.

Example:

x := tensor.Arange[int32](0, 10, backend)
y := x.Float32()  // Tensor[float32, B]

func (*Tensor[T, B]) Float64 added in v0.3.0

func (t *Tensor[T, B]) Float64() *Tensor[float64, B]

Float64 casts the tensor to float64 dtype.

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
y := x.Float64()  // Tensor[float64, B]

func (*Tensor[T, B]) Gather added in v0.3.0

func (t *Tensor[T, B]) Gather(dim int, index *Tensor[int32, B]) *Tensor[T, B]

Gather selects elements from the tensor along a dimension using an index tensor.

For each element in the index tensor, Gather selects the corresponding element from the input tensor along the specified dimension.

Example:

x := tensor.Randn[float32](Shape{3, 5}, backend)
indices, _ := tensor.FromSlice([]int32{0, 2, 4}, Shape{3}, backend)
y := x.Gather(1, indices)  // select columns 0, 2, 4 for each row

func (*Tensor[T, B]) Ge added in v0.3.0

func (t *Tensor[T, B]) Ge(other *Tensor[T, B]) *Tensor[bool, B]

Ge is a short alias for GreaterEqual.

Example:

mask := a.Ge(b)  // same as a.GreaterEqual(b)

func (*Tensor[T, B]) Grad

func (t *Tensor[T, B]) Grad() *Tensor[T, B]

Grad returns the gradient tensor (if computed by autodiff).

func (*Tensor[T, B]) Greater added in v0.3.0

func (t *Tensor[T, B]) Greater(other *Tensor[T, B]) *Tensor[bool, B]

Greater returns a boolean tensor where each element is true if the corresponding element in this tensor is greater than the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Greater(b)  // [false, false, false, true, true]

func (*Tensor[T, B]) GreaterEqual added in v0.3.0

func (t *Tensor[T, B]) GreaterEqual(other *Tensor[T, B]) *Tensor[bool, B]

GreaterEqual returns a boolean tensor where each element is true if the corresponding element in this tensor is greater than or equal to the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.GreaterEqual(b)  // [false, false, true, true, true]

func (*Tensor[T, B]) Gt added in v0.3.0

func (t *Tensor[T, B]) Gt(other *Tensor[T, B]) *Tensor[bool, B]

Gt is a short alias for Greater.

Example:

mask := a.Gt(b)  // same as a.Greater(b)

func (*Tensor[T, B]) Int32 added in v0.3.0

func (t *Tensor[T, B]) Int32() *Tensor[int32, B]

Int32 casts the tensor to int32 dtype.

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
y := x.Int32()  // Tensor[int32, B]

func (*Tensor[T, B]) Int64 added in v0.3.0

func (t *Tensor[T, B]) Int64() *Tensor[int64, B]

Int64 casts the tensor to int64 dtype.

Example:

x := tensor.Arange[int32](0, 10, backend)
y := x.Int64()  // Tensor[int64, B]

func (*Tensor[T, B]) Item

func (t *Tensor[T, B]) Item() T

Item returns the scalar value of a 0-D tensor. Panics if the tensor is not a scalar.
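
Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
total := x.Sum()  // scalar tensor, shape []
v := total.Item() // the float32 value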

func (*Tensor[T, B]) Le added in v0.3.0

func (t *Tensor[T, B]) Le(other *Tensor[T, B]) *Tensor[bool, B]

Le is a short alias for LowerEqual.

Example:

mask := a.Le(b)  // same as a.LowerEqual(b)

func (*Tensor[T, B]) Log added in v0.3.0

func (t *Tensor[T, B]) Log() *Tensor[T, B]

Log computes the natural logarithm (ln(x)) of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Log()  // ln(x) for each element

func (*Tensor[T, B]) Lower added in v0.3.0

func (t *Tensor[T, B]) Lower(other *Tensor[T, B]) *Tensor[bool, B]

Lower returns a boolean tensor where each element is true if the corresponding element in this tensor is less than the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Lower(b)  // [true, true, false, false, false]

func (*Tensor[T, B]) LowerEqual added in v0.3.0

func (t *Tensor[T, B]) LowerEqual(other *Tensor[T, B]) *Tensor[bool, B]

LowerEqual returns a boolean tensor where each element is true if the corresponding element in this tensor is less than or equal to the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.LowerEqual(b)  // [true, true, true, false, false]

func (*Tensor[T, B]) Lt added in v0.3.0

func (t *Tensor[T, B]) Lt(other *Tensor[T, B]) *Tensor[bool, B]

Lt is a short alias for Lower.

Example:

mask := a.Lt(b)  // same as a.Lower(b)

func (*Tensor[T, B]) MatMul

func (t *Tensor[T, B]) MatMul(other *Tensor[T, B]) *Tensor[T, B]

MatMul performs matrix multiplication.

Requirements:

  • For 2D tensors: (M, K) @ (K, N) → (M, N)
  • For batched: (B, M, K) @ (B, K, N) → (B, M, N)

Example:

a := tensor.Randn[float32](Shape{3, 4}, backend)
b := tensor.Randn[float32](Shape{4, 5}, backend)
c := a.MatMul(b) // Shape: [3, 5]

func (*Tensor[T, B]) MeanDim added in v0.3.0

func (t *Tensor[T, B]) MeanDim(dim int, keepDim bool) *Tensor[T, B]

MeanDim computes the mean of tensor elements along the specified dimension.

Parameters:

  • dim: dimension to reduce (supports negative indexing: -1 = last dim)
  • keepDim: if true, keep the reduced dimension with size 1; if false, remove it

Example:

x := tensor.Randn[float32](Shape{2, 3, 4}, backend)
y := x.MeanDim(-1, true)   // shape: [2, 3, 1]
z := x.MeanDim(-1, false)  // shape: [2, 3]

func (*Tensor[T, B]) Mul

func (t *Tensor[T, B]) Mul(other *Tensor[T, B]) *Tensor[T, B]

Mul performs element-wise multiplication with broadcasting.

func (*Tensor[T, B]) MulScalar added in v0.3.0

func (t *Tensor[T, B]) MulScalar(scalar T) *Tensor[T, B]

MulScalar multiplies each element of the tensor by a scalar value.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.MulScalar(2.5)  // multiply all elements by 2.5

func (*Tensor[T, B]) Ne added in v0.3.0

func (t *Tensor[T, B]) Ne(other *Tensor[T, B]) *Tensor[bool, B]

Ne is a short alias for NotEqual.

Example:

mask := a.Ne(b)  // same as a.NotEqual(b)

func (*Tensor[bool, B]) Not added in v0.3.0

func (t *Tensor[bool, B]) Not() *Tensor[bool, B]

Not computes element-wise logical NOT of a boolean tensor.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := a.Not()  // [false, false, false]

func (*Tensor[T, B]) NotEqual added in v0.3.0

func (t *Tensor[T, B]) NotEqual(other *Tensor[T, B]) *Tensor[bool, B]

NotEqual returns a boolean tensor where each element is true if the corresponding elements in this tensor and other are not equal.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.NotEqual(b)  // [true, true, false, true, true]

func (*Tensor[T, B]) NumElements

func (t *Tensor[T, B]) NumElements() int

NumElements returns the total number of elements.

func (*Tensor[bool, B]) Or added in v0.3.0

func (t *Tensor[bool, B]) Or(other *Tensor[bool, B]) *Tensor[bool, B]

Or computes element-wise logical OR between two boolean tensors.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := tensor.Full[bool](Shape{3}, false, backend)
c := a.Or(b)  // [true, true, true]

func (*Tensor[T, B]) Raw

func (t *Tensor[T, B]) Raw() *RawTensor

Raw returns the underlying RawTensor. Used by backend implementations for low-level operations.

func (*Tensor[T, B]) RequireGrad

func (t *Tensor[T, B]) RequireGrad() *Tensor[T, B]

RequireGrad marks this tensor for gradient computation. When called, subsequent operations involving this tensor will be tracked in the computation graph (if using an AutodiffBackend).

Returns the tensor itself for method chaining.

Example:

x := tensor.Ones[float32](Shape{2, 2}, autodiffBackend).RequireGrad()
y := x.Mul(x) // Operations are tracked
y.Backward()  // Computes gradients
fmt.Println(x.Grad()) // dy/dx available

func (*Tensor[T, B]) RequiresGrad

func (t *Tensor[T, B]) RequiresGrad() bool

RequiresGrad returns true if this tensor requires gradient computation.

func (*Tensor[T, B]) Reshape

func (t *Tensor[T, B]) Reshape(newShape ...int) *Tensor[T, B]

Reshape returns a tensor with the same data but different shape. The new shape must have the same number of elements.

Example:

t := tensor.Arange[int32](0, 12, backend) // Shape: [12]
reshaped := t.Reshape(3, 4)               // Shape: [3, 4]

func (*Tensor[T, B]) Rsqrt added in v0.3.0

func (t *Tensor[T, B]) Rsqrt() *Tensor[T, B]

Rsqrt computes the reciprocal square root (1/sqrt(x)) of each element.

This is often faster than computing Sqrt and then taking the reciprocal.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Rsqrt()  // 1/sqrt(x) for each element

func (*Tensor[T, B]) Set

func (t *Tensor[T, B]) Set(value T, indices ...int)

Set sets the element at the given indices. Panics if indices are out of bounds.

func (*Tensor[T, B]) SetGrad

func (t *Tensor[T, B]) SetGrad(grad *Tensor[T, B])

SetGrad sets the gradient tensor. Used internally by autodiff (TASK-004).

func (*Tensor[T, B]) Shape

func (t *Tensor[T, B]) Shape() Shape

Shape returns the tensor's shape.

func (*Tensor[T, B]) Sin added in v0.3.0

func (t *Tensor[T, B]) Sin() *Tensor[T, B]

Sin computes the sine of each element (input in radians).

Example:

x := tensor.Arange[float32](0, 10, backend)
y := x.Sin()  // sin(x) for each element

func (*Tensor[T, B]) Softmax added in v0.3.0

func (t *Tensor[T, B]) Softmax(dim int) *Tensor[T, B]

Softmax computes the softmax function along the specified dimension.

Softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j in dimension. Supports negative dimension indexing (-1 = last dimension).

Example:

logits := tensor.Randn[float32](Shape{2, 10}, backend)
probs := logits.Softmax(1)  // softmax along last dimension

func (*Tensor[T, B]) Sqrt added in v0.3.0

func (t *Tensor[T, B]) Sqrt() *Tensor[T, B]

Sqrt computes the square root of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Sqrt()  // sqrt(x) for each element

func (*Tensor[T, B]) Squeeze added in v0.3.0

func (t *Tensor[T, B]) Squeeze(dim int) *Tensor[T, B]

Squeeze removes a dimension of size 1 at the specified position.

Panics if the dimension size is not 1. Supports negative dim indexing. This is a view operation (no data copy).

Example:

x := tensor.Randn[float32](Shape{2, 1, 3}, backend)
y := x.Squeeze(1)  // Shape: [2, 3]
z := x.Squeeze(-2) // Shape: [2, 3]

func (*Tensor[T, B]) String

func (t *Tensor[T, B]) String() string

String returns a human-readable representation of the tensor.

func (*Tensor[T, B]) Sub

func (t *Tensor[T, B]) Sub(other *Tensor[T, B]) *Tensor[T, B]

Sub performs element-wise subtraction with broadcasting.

func (*Tensor[T, B]) SubScalar added in v0.3.0

func (t *Tensor[T, B]) SubScalar(scalar T) *Tensor[T, B]

SubScalar subtracts a scalar value from each element of the tensor.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.SubScalar(0.5)  // subtract 0.5 from all elements

func (*Tensor[T, B]) Sum added in v0.3.0

func (t *Tensor[T, B]) Sum() *Tensor[T, B]

Sum computes the sum of all elements in the tensor, returning a scalar.

The result is a tensor with shape [] (scalar).

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
total := x.Sum()  // sum of all 12 elements

func (*Tensor[T, B]) SumDim added in v0.3.0

func (t *Tensor[T, B]) SumDim(dim int, keepDim bool) *Tensor[T, B]

SumDim sums tensor elements along the specified dimension.

Parameters:

  • dim: dimension to reduce (supports negative indexing: -1 = last dim)
  • keepDim: if true, keep the reduced dimension with size 1; if false, remove it

Example:

x := tensor.Randn[float32](Shape{2, 3, 4}, backend)
y := x.SumDim(-1, true)   // shape: [2, 3, 1]
z := x.SumDim(-1, false)  // shape: [2, 3]

func (*Tensor[T, B]) T

func (t *Tensor[T, B]) T() *Tensor[T, B]

T is a shortcut for 2D transpose (swaps rows and columns). Panics if the tensor is not 2D.

Example:

t := tensor.Randn[float32](Shape{3, 4}, backend)
transposed := t.T() // Shape: [4, 3]

func (*Tensor[T, B]) Transpose

func (t *Tensor[T, B]) Transpose(axes ...int) *Tensor[T, B]

Transpose transposes the tensor by permuting its dimensions.

If axes is empty, reverses all dimensions (for 2D, this is standard transpose). Otherwise, axes specifies the permutation.

Example:

t := tensor.Randn[float32](Shape{2, 3, 4}, backend)
transposed := t.Transpose(2, 0, 1) // Shape: [4, 2, 3]

func (*Tensor[T, B]) Unsqueeze added in v0.3.0

func (t *Tensor[T, B]) Unsqueeze(dim int) *Tensor[T, B]

Unsqueeze adds a dimension of size 1 at the specified position.

Supports negative dim indexing. This is a view operation (no data copy).

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Unsqueeze(1)  // Shape: [2, 1, 3]
z := x.Unsqueeze(-1) // Shape: [2, 3, 1]
