package tensor

v0.7.7
Published: Jan 6, 2026 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package tensor provides the core tensor types and operations for the Born ML framework.

The raw_ops files provide type-specific tensor operations for ONNX inference. The type-specific implementations (Float32, Float64, Int32, Int64) are intentionally similar/duplicated for performance, since generics would add overhead.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Backend

type Backend interface {
	// Element-wise binary operations
	Add(a, b *RawTensor) *RawTensor
	Sub(a, b *RawTensor) *RawTensor
	Mul(a, b *RawTensor) *RawTensor
	Div(a, b *RawTensor) *RawTensor

	// Matrix operations
	MatMul(a, b *RawTensor) *RawTensor

	// BatchMatMul performs batched matrix multiplication for 3D/4D tensors.
	// For 3D: [B, M, K] @ [B, K, N] -> [B, M, N]
	// For 4D: [B, H, M, K] @ [B, H, K, N] -> [B, H, M, N]
	BatchMatMul(a, b *RawTensor) *RawTensor

	// Convolutional operations
	Conv2D(input, kernel *RawTensor, stride, padding int) *RawTensor
	MaxPool2D(input *RawTensor, kernelSize, stride int) *RawTensor

	// Convolutional backward operations
	Conv2DInputBackward(input, kernel, grad *RawTensor, stride, padding int) *RawTensor
	Conv2DKernelBackward(input, kernel, grad *RawTensor, stride, padding int) *RawTensor
	MaxPool2DBackward(input, grad *RawTensor, maxIndices []int, kernelSize, stride int) *RawTensor

	// Shape operations
	Reshape(t *RawTensor, newShape Shape) *RawTensor
	Transpose(t *RawTensor, axes ...int) *RawTensor

	// Scalar operations (element-wise with scalar)
	MulScalar(x *RawTensor, scalar any) *RawTensor // multiply by scalar
	AddScalar(x *RawTensor, scalar any) *RawTensor // add scalar
	SubScalar(x *RawTensor, scalar any) *RawTensor // subtract scalar
	DivScalar(x *RawTensor, scalar any) *RawTensor // divide by scalar

	// Math operations (element-wise)
	Exp(x *RawTensor) *RawTensor   // exponential
	Log(x *RawTensor) *RawTensor   // natural logarithm
	Sqrt(x *RawTensor) *RawTensor  // square root
	Rsqrt(x *RawTensor) *RawTensor // reciprocal square root (1/sqrt(x))
	Cos(x *RawTensor) *RawTensor   // cosine
	Sin(x *RawTensor) *RawTensor   // sine

	// Activation functions
	Softmax(x *RawTensor, dim int) *RawTensor // softmax along dimension

	// Comparison operations (element-wise, return bool tensor)
	Greater(a, b *RawTensor) *RawTensor      // a > b
	Lower(a, b *RawTensor) *RawTensor        // a < b
	GreaterEqual(a, b *RawTensor) *RawTensor // a >= b
	LowerEqual(a, b *RawTensor) *RawTensor   // a <= b
	Equal(a, b *RawTensor) *RawTensor        // a == b
	NotEqual(a, b *RawTensor) *RawTensor     // a != b

	// Boolean operations (element-wise on bool tensors)
	Or(a, b *RawTensor) *RawTensor  // logical OR
	And(a, b *RawTensor) *RawTensor // logical AND
	Not(x *RawTensor) *RawTensor    // logical NOT

	// Reduction operations
	Sum(x *RawTensor) *RawTensor                            // total sum (scalar result)
	SumDim(x *RawTensor, dim int, keepDim bool) *RawTensor  // sum along dimension
	MeanDim(x *RawTensor, dim int, keepDim bool) *RawTensor // mean along dimension
	Argmax(x *RawTensor, dim int) *RawTensor                // index of maximum value along dimension

	// Manipulation operations
	Cat(tensors []*RawTensor, dim int) *RawTensor // concatenate along dimension
	Chunk(x *RawTensor, n, dim int) []*RawTensor  // split into n equal parts
	Unsqueeze(x *RawTensor, dim int) *RawTensor   // add dimension of size 1
	Squeeze(x *RawTensor, dim int) *RawTensor     // remove dimension of size 1

	// Indexing operations
	Gather(x *RawTensor, dim int, index *RawTensor) *RawTensor // select elements along dim using index tensor
	Where(condition, x, y *RawTensor) *RawTensor               // conditional element selection
	Embedding(weight, indices *RawTensor) *RawTensor           // lookup embeddings by indices

	// Shape operations (broadcast)
	Expand(x *RawTensor, shape Shape) *RawTensor // broadcast to shape

	// Type conversion
	Cast(x *RawTensor, dtype DataType) *RawTensor // cast to different data type

	// Metadata
	Name() string
	Device() Device
}

Backend defines the interface that all compute backends must implement. Backends handle the actual computation for tensor operations.

Implementations:

  • CPU: Pure Go with SIMD optimizations (TASK-003)
  • CUDA: NVIDIA GPU via driver API (Phase 2)
  • Vulkan: Cross-platform GPU compute (Phase 2)
  • Metal: Apple GPU (Phase 2)
  • WebGPU: Browser/native GPU (Phase 2)
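
As a sketch of backend-agnostic code, the hypothetical helper below (not part of this package) is written purely against the Backend interface, so it runs on any implementation:

// addThenScale is a hypothetical helper: it adds two tensors and
// scales the result by 2 using only Backend methods.
func addThenScale(b Backend, x, y *RawTensor) *RawTensor {
	sum := b.Add(x, y)           // element-wise addition
	return b.MulScalar(sum, 2.0) // scalar is `any`; accepted scalar types are backend-defined
}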

type DType

type DType interface {
	~float32 | ~float64 | ~int32 | ~int64 | ~uint8 | ~bool
}

DType is a constraint for supported tensor data types. It uses Go generics to ensure compile-time type safety.
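
A user-defined generic function can build on the same type set. A minimal sketch (hypothetical helper; note that bool satisfies DType but not +, so the constraint here is narrowed to the numeric types):

// sumSlice sums a slice of any numeric type in the DType set.
func sumSlice[T ~float32 | ~float64 | ~int32 | ~int64 | ~uint8](data []T) T {
	var total T
	for _, v := range data {
		total += v
	}
	return total
}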

type DataType

type DataType int

DataType represents runtime type information for tensors.

const (
	Float32 DataType = iota
	Float64
	Int32
	Int64
	Uint8
	Bool
)

Supported data types for tensors.

func (DataType) Size

func (dt DataType) Size() int

Size returns the byte size of the data type.

func (DataType) String

func (dt DataType) String() string

String returns a human-readable name for the data type.
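
A quick sketch of both methods (float32 is 4 bytes and int64 is 8 by Go's definitions; the exact String names are implementation-defined):

fmt.Println(tensor.Float32.Size())   // 4
fmt.Println(tensor.Int64.Size())     // 8
fmt.Println(tensor.Float32.String()) // e.g. "float32"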

type Device

type Device int

Device represents the compute device for tensor operations.

const (
	CPU Device = iota
	CUDA
	Vulkan
	Metal
	WebGPU
)

Supported compute devices.

func (Device) String

func (d Device) String() string

String returns a human-readable device name.

type LazyBackend added in v0.6.0

type LazyBackend interface {
	// ReadGPUBuffer reads data from a GPU buffer to CPU memory.
	// bufferPtr is unsafe.Pointer to *wgpu.Buffer (or similar GPU buffer type).
	// size is the number of bytes to read.
	// Returns the CPU data or an error.
	ReadGPUBuffer(bufferPtr unsafe.Pointer, size uint64) ([]byte, error)

	// ReleaseGPUBuffer releases the GPU buffer when no longer needed.
	ReleaseGPUBuffer(bufferPtr unsafe.Pointer)
}

LazyBackend is an interface for backends that support lazy GPU evaluation. The backend must implement ReadGPUBuffer to transfer data from GPU to CPU.
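
As a rough sketch, a GPU backend might satisfy LazyBackend as follows (gpuBackend and the readback mechanics are placeholders, not part of this package):

// gpuBackend is a hypothetical GPU backend with lazy readback support.
type gpuBackend struct{ /* device handles, queues, ... */ }

func (g *gpuBackend) ReadGPUBuffer(bufferPtr unsafe.Pointer, size uint64) ([]byte, error) {
	data := make([]byte, size)
	// ... map the GPU buffer and copy size bytes into data;
	// the details depend entirely on the GPU API in use ...
	return data, nil
}

func (g *gpuBackend) ReleaseGPUBuffer(bufferPtr unsafe.Pointer) {
	// ... free or unreference the underlying GPU buffer ...
}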

type LazyGPUData added in v0.6.0

type LazyGPUData struct {
	// contains filtered or unexported fields
}

LazyGPUData holds a reference to GPU-resident data for lazy evaluation. When Data() is called on a RawTensor with LazyGPUData, the data is transferred from GPU to CPU only at that point (lazy realization).

func NewLazyGPUData added in v0.6.0

func NewLazyGPUData(bufferPtr unsafe.Pointer, size uint64, backend LazyBackend) *LazyGPUData

NewLazyGPUData creates a new LazyGPUData referencing a GPU buffer. The GPU buffer will be automatically released when garbage collected.

func (*LazyGPUData) BufferPtr added in v0.6.0

func (l *LazyGPUData) BufferPtr() unsafe.Pointer

BufferPtr returns the underlying GPU buffer pointer. This is used by backend operations that need to chain GPU operations.

func (*LazyGPUData) IsRealized added in v0.6.0

func (l *LazyGPUData) IsRealized() bool

IsRealized returns whether the GPU data has been transferred to CPU.

func (*LazyGPUData) MarkRealized added in v0.6.0

func (l *LazyGPUData) MarkRealized()

MarkRealized marks the GPU data as realized (transferred to CPU).

func (*LazyGPUData) Realize added in v0.6.0

func (l *LazyGPUData) Realize() ([]byte, error)

Realize transfers data from GPU to CPU and returns it. This is called lazily when Data() is accessed. Thread-safe: multiple goroutines can safely call this. After realization, the GPU buffer is released to free GPU memory.

func (*LazyGPUData) Release added in v0.6.0

func (l *LazyGPUData) Release()

Release releases the GPU buffer. Called when the tensor is no longer needed.

func (*LazyGPUData) Size added in v0.6.0

func (l *LazyGPUData) Size() uint64

Size returns the buffer size in bytes.

type MockBackend

type MockBackend struct{}

MockBackend is a simple backend for testing. It implements all operations naively for correctness verification.

func NewMockBackend

func NewMockBackend() *MockBackend

NewMockBackend creates a new MockBackend.
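
Typical test usage, as a sketch:

backend := tensor.NewMockBackend()
a := tensor.Ones[float32](Shape{2, 2}, backend)
b := tensor.Ones[float32](Shape{2, 2}, backend)
c := a.Add(b) // every element == 2, computed naively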

func (*MockBackend) Add

func (m *MockBackend) Add(a, b *RawTensor) *RawTensor

Add performs element-wise addition with broadcasting.

func (*MockBackend) AddScalar added in v0.3.0

func (m *MockBackend) AddScalar(x *RawTensor, scalar any) *RawTensor

AddScalar adds a scalar to tensor elements (mock implementation).

func (*MockBackend) And added in v0.3.0

func (m *MockBackend) And(a, b *RawTensor) *RawTensor

And performs element-wise logical AND operation (mock implementation).

func (*MockBackend) Argmax added in v0.3.0

func (m *MockBackend) Argmax(_ *RawTensor, _ int) *RawTensor

Argmax returns indices of maximum values along the specified dimension (mock stub).

func (*MockBackend) BatchMatMul added in v0.4.0

func (m *MockBackend) BatchMatMul(a, b *RawTensor) *RawTensor

BatchMatMul performs batched matrix multiplication (naive implementation for testing).

func (*MockBackend) Cast added in v0.3.0

func (m *MockBackend) Cast(x *RawTensor, dtype DataType) *RawTensor

Cast converts the tensor to a different data type (mock implementation).

func (*MockBackend) Cat added in v0.3.0

func (m *MockBackend) Cat(tensors []*RawTensor, dim int) *RawTensor

Cat concatenates tensors along the specified dimension (naive implementation).

func (*MockBackend) Chunk added in v0.3.0

func (m *MockBackend) Chunk(x *RawTensor, n, dim int) []*RawTensor

Chunk splits tensor into n equal parts along the specified dimension.

func (*MockBackend) Conv2D

func (m *MockBackend) Conv2D(input, kernel *RawTensor, stride, padding int) *RawTensor

Conv2D performs 2D convolution (naive implementation for testing).

func (*MockBackend) Conv2DInputBackward added in v0.7.1

func (m *MockBackend) Conv2DInputBackward(_, _, _ *RawTensor, _, _ int) *RawTensor

Conv2DInputBackward computes gradient w.r.t. input for Conv2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) Conv2DKernelBackward added in v0.7.1

func (m *MockBackend) Conv2DKernelBackward(_, _, _ *RawTensor, _, _ int) *RawTensor

Conv2DKernelBackward computes gradient w.r.t. kernel for Conv2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) Cos added in v0.3.0

func (m *MockBackend) Cos(x *RawTensor) *RawTensor

Cos computes element-wise cosine.

func (*MockBackend) Device

func (m *MockBackend) Device() Device

Device returns the device type.

func (*MockBackend) Div

func (m *MockBackend) Div(a, b *RawTensor) *RawTensor

Div performs element-wise division with broadcasting.

func (*MockBackend) DivScalar added in v0.3.0

func (m *MockBackend) DivScalar(x *RawTensor, scalar any) *RawTensor

DivScalar divides tensor elements by a scalar (mock implementation).

func (*MockBackend) Embedding added in v0.5.1

func (m *MockBackend) Embedding(weight, indices *RawTensor) *RawTensor

Embedding performs embedding lookup (naive implementation).

weight: [numEmbeddings, embeddingDim]
indices: any shape of int32 indices
output: [...indices.shape, embeddingDim]

func (*MockBackend) Equal added in v0.3.0

func (m *MockBackend) Equal(a, b *RawTensor) *RawTensor

Equal performs element-wise equality comparison (mock implementation).

func (*MockBackend) Exp added in v0.3.0

func (m *MockBackend) Exp(x *RawTensor) *RawTensor

Exp computes element-wise exponential.

func (*MockBackend) Expand added in v0.3.0

func (m *MockBackend) Expand(_ *RawTensor, _ Shape) *RawTensor

Expand broadcasts the tensor to a new shape (mock stub).

func (*MockBackend) Gather added in v0.3.0

func (m *MockBackend) Gather(x *RawTensor, dim int, index *RawTensor) *RawTensor

Gather selects elements along dim using index tensor (naive implementation).

func (*MockBackend) Greater added in v0.3.0

func (m *MockBackend) Greater(a, b *RawTensor) *RawTensor

Greater performs element-wise greater-than comparison (mock implementation).

func (*MockBackend) GreaterEqual added in v0.3.0

func (m *MockBackend) GreaterEqual(a, b *RawTensor) *RawTensor

GreaterEqual performs element-wise greater-than-or-equal comparison (mock implementation).

func (*MockBackend) Log added in v0.3.0

func (m *MockBackend) Log(x *RawTensor) *RawTensor

Log computes natural logarithm element-wise (mock implementation).

func (*MockBackend) Lower added in v0.3.0

func (m *MockBackend) Lower(a, b *RawTensor) *RawTensor

Lower performs element-wise less-than comparison (mock implementation).

func (*MockBackend) LowerEqual added in v0.3.0

func (m *MockBackend) LowerEqual(a, b *RawTensor) *RawTensor

LowerEqual performs element-wise less-than-or-equal comparison (mock implementation).

func (*MockBackend) MatMul

func (m *MockBackend) MatMul(a, b *RawTensor) *RawTensor

MatMul performs matrix multiplication.

func (*MockBackend) MaxPool2D

func (m *MockBackend) MaxPool2D(input *RawTensor, kernelSize, stride int) *RawTensor

MaxPool2D performs 2D max pooling (naive implementation for testing).

func (*MockBackend) MaxPool2DBackward added in v0.7.1

func (m *MockBackend) MaxPool2DBackward(_, _ *RawTensor, _ []int, _, _ int) *RawTensor

MaxPool2DBackward computes gradient w.r.t. input for MaxPool2D. Stub implementation for MockBackend (test-only).

func (*MockBackend) MeanDim added in v0.3.0

func (m *MockBackend) MeanDim(x *RawTensor, dim int, keepDim bool) *RawTensor

MeanDim computes the mean of tensor elements along the specified dimension.

func (*MockBackend) Mul

func (m *MockBackend) Mul(a, b *RawTensor) *RawTensor

Mul performs element-wise multiplication with broadcasting.

func (*MockBackend) MulScalar added in v0.3.0

func (m *MockBackend) MulScalar(x *RawTensor, scalar any) *RawTensor

MulScalar multiplies tensor elements by a scalar (mock implementation).

func (*MockBackend) Name

func (m *MockBackend) Name() string

Name returns the backend name.

func (*MockBackend) Not added in v0.3.0

func (m *MockBackend) Not(x *RawTensor) *RawTensor

Not performs element-wise logical NOT operation (mock implementation).

func (*MockBackend) NotEqual added in v0.3.0

func (m *MockBackend) NotEqual(a, b *RawTensor) *RawTensor

NotEqual performs element-wise inequality comparison (mock implementation).

func (*MockBackend) Or added in v0.3.0

func (m *MockBackend) Or(a, b *RawTensor) *RawTensor

Or performs element-wise logical OR operation (mock implementation).

func (*MockBackend) Reshape

func (m *MockBackend) Reshape(t *RawTensor, newShape Shape) *RawTensor

Reshape changes tensor shape.

func (*MockBackend) Rsqrt added in v0.3.0

func (m *MockBackend) Rsqrt(x *RawTensor) *RawTensor

Rsqrt computes element-wise reciprocal square root.

func (*MockBackend) Sin added in v0.3.0

func (m *MockBackend) Sin(x *RawTensor) *RawTensor

Sin computes element-wise sine.

func (*MockBackend) Softmax added in v0.3.0

func (m *MockBackend) Softmax(_ *RawTensor, _ int) *RawTensor

Softmax applies softmax activation along the specified dimension (mock stub).

func (*MockBackend) Sqrt added in v0.3.0

func (m *MockBackend) Sqrt(x *RawTensor) *RawTensor

Sqrt computes element-wise square root.

func (*MockBackend) Squeeze added in v0.3.0

func (m *MockBackend) Squeeze(x *RawTensor, dim int) *RawTensor

Squeeze removes a dimension of size 1 at the specified position.

func (*MockBackend) Sub

func (m *MockBackend) Sub(a, b *RawTensor) *RawTensor

Sub performs element-wise subtraction with broadcasting.

func (*MockBackend) SubScalar added in v0.3.0

func (m *MockBackend) SubScalar(x *RawTensor, scalar any) *RawTensor

SubScalar subtracts a scalar from tensor elements (mock implementation).

func (*MockBackend) Sum added in v0.3.0

func (m *MockBackend) Sum(x *RawTensor) *RawTensor

Sum computes the total sum of all tensor elements (mock implementation).

func (*MockBackend) SumDim added in v0.3.0

func (m *MockBackend) SumDim(x *RawTensor, dim int, keepDim bool) *RawTensor

SumDim sums tensor elements along the specified dimension (naive implementation).

func (*MockBackend) Transpose

func (m *MockBackend) Transpose(t *RawTensor, axes ...int) *RawTensor

Transpose transposes tensor dimensions.

func (*MockBackend) Unsqueeze added in v0.3.0

func (m *MockBackend) Unsqueeze(x *RawTensor, dim int) *RawTensor

Unsqueeze adds a dimension of size 1 at the specified position.

func (*MockBackend) Where added in v0.3.0

func (m *MockBackend) Where(condition, x, y *RawTensor) *RawTensor

Where performs conditional element selection (naive implementation).

type RawTensor

type RawTensor struct {
	// contains filtered or unexported fields
}

RawTensor is the low-level tensor representation. It uses reference-counted shared buffers for Copy-on-Write semantics. Supports lazy GPU evaluation: data is transferred from GPU only when Data() is called.
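
A minimal sketch of direct RawTensor use:

raw, err := tensor.NewRaw(Shape{2, 3}, tensor.Float32, tensor.CPU)
if err != nil {
	log.Fatal(err)
}
data := raw.AsFloat32() // zero-copy typed view, 6 elements
data[0] = 1.5           // writes through to the underlying buffer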

func Cast added in v0.6.0

func Cast(x *RawTensor, dtype DataType) (*RawTensor, error)

Cast converts a tensor to a different data type.

func Clip added in v0.6.0

func Clip(x *RawTensor, minVal, maxVal float32) (*RawTensor, error)

Clip clamps values to the range [minVal, maxVal].
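
Example (a sketch):

clipped, err := tensor.Clip(x, 0, 1) // clamp every element into [0, 1]
if err != nil {
	log.Fatal(err)
}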

func Concat added in v0.6.0

func Concat(tensors []*RawTensor, axis int) (*RawTensor, error)

Concat concatenates tensors along the specified dimension.

func Expand added in v0.6.0

func Expand(x *RawTensor, targetShape Shape) (*RawTensor, error)

Expand broadcasts a tensor to a larger shape.

func Flatten added in v0.6.0

func Flatten(x *RawTensor, axis int) (*RawTensor, error)

Flatten flattens tensor from axis onward into a single dimension.
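
For instance, flattening a [2, 3, 4] tensor from axis 1 should yield shape [2, 12] (a sketch based on the description above):

x, _ := tensor.NewRaw(Shape{2, 3, 4}, tensor.Float32, tensor.CPU)
flat, err := tensor.Flatten(x, 1) // expected shape: [2, 12]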

func FullRaw added in v0.6.0

func FullRaw(shape Shape, value float32, dtype DataType, device Device) (*RawTensor, error)

FullRaw creates a RawTensor filled with a constant value.

func GELU added in v0.6.0

func GELU(x *RawTensor) (*RawTensor, error)

GELU applies the Gaussian Error Linear Unit activation. Uses approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
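
Usage sketch (GELU(0) = 0 under the tanh approximation, so a zero tensor stays zero):

x, _ := tensor.FullRaw(Shape{3}, 0, tensor.Float32, tensor.CPU)
y, err := tensor.GELU(x) // y should be all zeros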

func Gather added in v0.6.0

func Gather(x, indices *RawTensor, axis int) (*RawTensor, error)

Gather selects elements along an axis according to indices.

func LeakyReLU added in v0.6.0

func LeakyReLU(x *RawTensor, alpha float32) (*RawTensor, error)

LeakyReLU applies leaky ReLU: max(x, alpha*x).

func LogSoftmax added in v0.6.0

func LogSoftmax(x *RawTensor, axis int) (*RawTensor, error)

LogSoftmax computes log(softmax(x)) along the specified axis.

func NewLazyRaw added in v0.6.0

func NewLazyRaw(shape Shape, dtype DataType, device Device, gpuData *LazyGPUData) (*RawTensor, error)

NewLazyRaw creates a new RawTensor with lazy GPU data. The data is not transferred from GPU until Data() is called.

func NewRaw

func NewRaw(shape Shape, dtype DataType, device Device) (*RawTensor, error)

NewRaw creates a new RawTensor with the given shape and type. Memory is allocated and zeroed (per Go allocation semantics) but not otherwise initialized.

func PReLU added in v0.6.0

func PReLU(x, slope *RawTensor) (*RawTensor, error)

PReLU applies parametric ReLU: max(x, slope*x) where slope is per-element or broadcasted.

func ReLU added in v0.6.0

func ReLU(x *RawTensor) (*RawTensor, error)

ReLU applies the ReLU activation function element-wise: max(x, 0).

func Reshape added in v0.6.0

func Reshape(x *RawTensor, newShape Shape) (*RawTensor, error)

Reshape returns a new tensor with the given shape (shares data if contiguous).

func SiLU added in v0.6.0

func SiLU(x *RawTensor) (*RawTensor, error)

SiLU applies the Sigmoid Linear Unit (Swish) activation: x * sigmoid(x).

func Sigmoid added in v0.6.0

func Sigmoid(x *RawTensor) (*RawTensor, error)

Sigmoid applies the sigmoid activation function: 1/(1+exp(-x)).

func Slice added in v0.6.0

func Slice(x *RawTensor, starts, ends, axes, steps []int64) (*RawTensor, error)

Slice extracts a slice from a tensor.
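
The parameters appear to follow ONNX Slice conventions (per-axis starts, exclusive ends, axes, steps). A sketch under that assumption:

x, _ := tensor.NewRaw(Shape{4, 5}, tensor.Float32, tensor.CPU)
// Rows [0, 2) along axis 0 with step 1 -> expected shape [2, 5].
y, err := tensor.Slice(x, []int64{0}, []int64{2}, []int64{0}, []int64{1})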

func Softmax added in v0.6.0

func Softmax(x *RawTensor, axis int) (*RawTensor, error)

Softmax applies softmax along the specified axis.

func Split added in v0.6.0

func Split(x *RawTensor, axis int, splitSizes []int) ([]*RawTensor, error)

Split splits a tensor into multiple tensors along an axis.

func Squeeze added in v0.6.0

func Squeeze(x *RawTensor, axes ...int) (*RawTensor, error)

Squeeze removes dimensions of size 1 at the specified axes.

If no axes are specified, removes all dimensions of size 1.

func Tanh added in v0.6.0

func Tanh(x *RawTensor) (*RawTensor, error)

Tanh applies the hyperbolic tangent activation function.

func TransposeAxes added in v0.6.0

func TransposeAxes(x *RawTensor, axes ...int) (*RawTensor, error)

TransposeAxes transposes dimensions according to the given permutation.

func Unsqueeze added in v0.6.0

func Unsqueeze(x *RawTensor, axes ...int) (*RawTensor, error)

Unsqueeze adds dimensions of size 1 at the specified axes.

func WhereRaw added in v0.6.0

func WhereRaw(condition, x, y *RawTensor) (*RawTensor, error)

WhereRaw selects elements from x or y based on condition.

func (*RawTensor) AsBool

func (r *RawTensor) AsBool() []bool

AsBool interprets the data as []bool. Panics if the tensor's dtype is not Bool. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsFloat32

func (r *RawTensor) AsFloat32() []float32

AsFloat32 interprets the data as []float32. Panics if the tensor's dtype is not Float32. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsFloat64

func (r *RawTensor) AsFloat64() []float64

AsFloat64 interprets the data as []float64. Panics if the tensor's dtype is not Float64. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsInt32

func (r *RawTensor) AsInt32() []int32

AsInt32 interprets the data as []int32. Panics if the tensor's dtype is not Int32. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsInt64

func (r *RawTensor) AsInt64() []int64

AsInt64 interprets the data as []int64. Panics if the tensor's dtype is not Int64. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) AsUint8

func (r *RawTensor) AsUint8() []uint8

AsUint8 interprets the data as []uint8. Panics if the tensor's dtype is not Uint8. For lazy GPU tensors, this triggers data transfer from GPU to CPU.

func (*RawTensor) ByteSize

func (r *RawTensor) ByteSize() int

ByteSize returns the total memory size in bytes.

func (*RawTensor) Clone

func (r *RawTensor) Clone() *RawTensor

Clone creates a shallow copy of the RawTensor (shares buffer with reference counting). The buffer is reference-counted and will be copied only when modified (copy-on-write). This enables cheap cloning and inplace optimizations when refCount == 1. Note: GPU lazy data is shared (same underlying GPU buffer).

Example:

a := tensor.Ones[float32](Shape{1000, 1000}, backend)
b := a.Clone()  // Shares buffer with a (just increments refCount)
c := a.Add(b)   // May use inplace if refCount allows

func (*RawTensor) DType

func (r *RawTensor) DType() DataType

DType returns the tensor's data type.

func (*RawTensor) Data

func (r *RawTensor) Data() []byte

Data returns the raw byte slice. For lazy GPU tensors, this triggers data transfer from GPU to CPU (expensive!). WARNING: Direct access to underlying memory. Use with caution.

func (*RawTensor) Device

func (r *RawTensor) Device() Device

Device returns the tensor's compute device.

func (*RawTensor) ForceNonUnique

func (r *RawTensor) ForceNonUnique() func()

ForceNonUnique temporarily increases refCount to prevent inplace modifications. Returns a cleanup function that MUST be called to restore refCount (use defer).

This is used by autodiff backend to preserve original input values: inplace optimizations would corrupt the computational graph.

Example:

defer t.ForceNonUnique()()            // t is a *RawTensor
result := backend.Mul(t, other)       // No inplace modification!

func (*RawTensor) GPUData added in v0.6.0

func (r *RawTensor) GPUData() *LazyGPUData

GPUData returns the lazy GPU data reference, if any. Returns nil for CPU-only tensors.

func (*RawTensor) IsLazy added in v0.6.0

func (r *RawTensor) IsLazy() bool

IsLazy returns true if this tensor has unrealized GPU data. Use this to check if Data() will trigger an expensive GPU→CPU transfer.

func (*RawTensor) IsUnique

func (r *RawTensor) IsUnique() bool

IsUnique returns true if this tensor is the only reference to the buffer. When true, backends can perform inplace operations for better performance.

func (*RawTensor) NumElements

func (r *RawTensor) NumElements() int

NumElements returns the total number of elements.

func (*RawTensor) Release

func (r *RawTensor) Release()

Release decrements the reference count and deallocates if it reaches 0. This is called automatically when a tensor is no longer needed (e.g., by GC finalizer).

func (*RawTensor) SetGPUData added in v0.6.0

func (r *RawTensor) SetGPUData(gpuData *LazyGPUData)

SetGPUData sets the lazy GPU data reference. This is used by GPU backends to create lazy tensors.

func (*RawTensor) Shape

func (r *RawTensor) Shape() Shape

Shape returns the tensor's shape.

func (*RawTensor) Strides

func (r *RawTensor) Strides() []int

Strides returns the tensor's memory strides.

type Shape

type Shape []int

Shape represents the dimensions of a tensor.

func BroadcastShapes

func BroadcastShapes(a, b Shape) (Shape, bool, error)

BroadcastShapes implements NumPy-style broadcasting rules.

Rules:

 1. Compare shapes element-wise from right to left.
 2. Dimensions are compatible if:
      • They are equal, OR
      • One of them is 1
 3. Missing dimensions are treated as 1.

Returns the broadcasted shape, a flag indicating if broadcasting is needed, and an error if incompatible.

Examples:

(3, 1) + (3, 5) → (3, 5), true, nil
(1, 5) + (3, 5) → (3, 5), true, nil
(3, 5) + (3, 5) → (3, 5), false, nil
(3, 4) + (3, 5) → nil, false, Error
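
In code:

out, needed, err := tensor.BroadcastShapes(Shape{3, 1}, Shape{3, 5})
// out == Shape{3, 5}, needed == true, err == nil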

func (Shape) Clone

func (s Shape) Clone() Shape

Clone returns a copy of the shape.

func (Shape) ComputeStrides

func (s Shape) ComputeStrides() []int

ComputeStrides calculates row-major strides for the shape. Strides define memory layout: stride[i] = product of all dimensions after i.
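
For a row-major [2, 3, 4] tensor, the strides are [12, 4, 1]:

s := Shape{2, 3, 4}
strides := s.ComputeStrides() // [12, 4, 1]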

func (Shape) Equal

func (s Shape) Equal(other Shape) bool

Equal checks if two shapes are equal.

func (Shape) NumElements

func (s Shape) NumElements() int

NumElements returns the total number of elements in the tensor.

func (Shape) Validate

func (s Shape) Validate() error

Validate checks if the shape is valid (all dimensions > 0).

type Tensor

type Tensor[T DType, B Backend] struct {
	// contains filtered or unexported fields
}

Tensor is a generic tensor with type T and backend B. It provides type-safe operations over multi-dimensional arrays.

Type Parameters:

  • T: Data type (must satisfy DType constraint)
  • B: Computation backend (must implement Backend interface)

Example:

backend := cpu.New()
t := tensor.Zeros[float32](Shape{3, 4}, backend)
result := t.Add(t) // Type-safe addition

func Arange

func Arange[T DType, B Backend](start, end T, b B) *Tensor[T, B]

Arange creates a 1D tensor with values from start to end (exclusive). Only works with numeric types (not bool).

Example:

t := tensor.Arange[int32](0, 10, backend) // [0, 1, 2, ..., 9]

func Cat added in v0.3.0

func Cat[T DType, B Backend](tensors []*Tensor[T, B], dim int) *Tensor[T, B]

Cat concatenates tensors along the specified dimension.

All tensors must have the same shape except along the concatenation dimension. Supports negative dim indexing (-1 = last dimension).

Example:

a := tensor.Randn[float32](Shape{2, 3}, backend)
b := tensor.Randn[float32](Shape{2, 5}, backend)
c := tensor.Cat([]*Tensor[float32, B]{a, b}, 1) // Shape: [2, 8]

func Eye

func Eye[T DType, B Backend](n int, b B) *Tensor[T, B]

Eye creates a 2D identity matrix.

Example:

t := tensor.Eye[float32](3, backend) // 3x3 identity matrix

func FromSlice

func FromSlice[T DType, B Backend](data []T, shape Shape, b B) (*Tensor[T, B], error)

FromSlice creates a tensor from a Go slice. The slice is copied into the tensor's memory.
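
Example (a sketch; the error is presumably returned when len(data) does not match the shape):

data := []float32{1, 2, 3, 4, 5, 6}
t, err := tensor.FromSlice(data, Shape{2, 3}, backend)
if err != nil {
	log.Fatal(err)
}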

func Full

func Full[T DType, B Backend](shape Shape, value T, b B) *Tensor[T, B]

Full creates a tensor filled with a specific value.

Example:

t := tensor.Full[float32](Shape{3, 3}, 3.14, backend)

func New

func New[T DType, B Backend](raw *RawTensor, b B) *Tensor[T, B]

New creates a Tensor from a RawTensor and backend.

func Ones

func Ones[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Ones creates a tensor filled with ones.

Example:

t := tensor.Ones[float64](Shape{2, 3}, backend)

func Rand

func Rand[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Rand creates a tensor with random values uniformly distributed in [0, 1). Only works with float types.

Example:

t := tensor.Rand[float32](Shape{10, 10}, backend)

func Randn

func Randn[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Randn creates a tensor with random values from a normal distribution (mean=0, std=1). Uses Box-Muller transform for generating normal distribution. Only works with float types. Note: Uses math/rand (not crypto/rand) - appropriate for ML/statistical purposes.

Example:

t := tensor.Randn[float32](Shape{100, 100}, backend)

func Where added in v0.3.0

func Where[T DType, B Backend](cond *Tensor[bool, B], x, y *Tensor[T, B]) *Tensor[T, B]

Where selects elements from x or y based on condition.

For each element:

  • If condition is true, select from x
  • If condition is false, select from y

Supports broadcasting between condition, x, and y.

Example:

cond := tensor.Full[bool](Shape{3}, true, backend)
x := tensor.Full[float32](Shape{3}, 1.0, backend)
y := tensor.Full[float32](Shape{3}, 0.0, backend)
result := tensor.Where(cond, x, y)  // [1.0, 1.0, 1.0]

func Zeros

func Zeros[T DType, B Backend](shape Shape, b B) *Tensor[T, B]

Zeros creates a tensor filled with zeros.

Example:

backend := cpu.New()
t := tensor.Zeros[float32](Shape{3, 4}, backend)

func (*Tensor[T, B]) Add

func (t *Tensor[T, B]) Add(other *Tensor[T, B]) *Tensor[T, B]

Add performs element-wise addition with broadcasting.

Example:

a := tensor.Ones[float32](Shape{3, 1}, backend)
b := tensor.Ones[float32](Shape{3, 5}, backend)
c := a.Add(b) // Shape: [3, 5] (broadcasted)

func (*Tensor[T, B]) AddScalar added in v0.3.0

func (t *Tensor[T, B]) AddScalar(scalar T) *Tensor[T, B]

AddScalar adds a scalar value to each element of the tensor.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.AddScalar(1.0)  // add 1.0 to all elements

func (*Tensor[bool, B]) And added in v0.3.0

func (t *Tensor[bool, B]) And(other *Tensor[bool, B]) *Tensor[bool, B]

And computes element-wise logical AND between two boolean tensors.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := tensor.Full[bool](Shape{3}, false, backend)
c := a.And(b)  // [false, false, false]

func (*Tensor[T, B]) Argmax added in v0.3.0

func (t *Tensor[T, B]) Argmax(dim int) *Tensor[int32, B]

Argmax returns the index of the maximum value along the specified dimension.

Returns a tensor of type int32 with the same shape as the input except the specified dimension is removed.

Supports negative dimension indexing (-1 = last dimension).

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
indices := x.Argmax(1)  // Shape: [3], index of max in each row

func (*Tensor[T, B]) At

func (t *Tensor[T, B]) At(indices ...int) T

At returns the element at the given indices. Panics if indices are out of bounds.

Example:

t := tensor.Zeros[float32](Shape{3, 4}, backend)
value := t.At(1, 2) // Row 1, column 2

func (*Tensor[T, B]) Backend

func (t *Tensor[T, B]) Backend() B

Backend returns the computation backend.

func (*Tensor[T, B]) BatchMatMul added in v0.4.0

func (t *Tensor[T, B]) BatchMatMul(other *Tensor[T, B]) *Tensor[T, B]

BatchMatMul performs batched matrix multiplication.

For 3D tensors: [B, M, K] @ [B, K, N] -> [B, M, N]
For 4D tensors: [B, H, M, K] @ [B, H, K, N] -> [B, H, M, N]

Example (Attention scores):

q := tensor.Randn[float32](tensor.Shape{2, 4, 64, 16}, backend) // [B, H, S, D]
k := tensor.Randn[float32](tensor.Shape{2, 4, 64, 16}, backend)
kT := k.Transpose(0, 1, 3, 2)
scores := q.BatchMatMul(kT) // [2, 4, 64, 64]

func (*Tensor[T, B]) Chunk added in v0.3.0

func (t *Tensor[T, B]) Chunk(n, dim int) []*Tensor[T, B]

Chunk splits the tensor into n equal parts along the specified dimension.

The dimension size must be divisible by n. Supports negative dim indexing (-1 = last dimension).

Example:

x := tensor.Randn[float32](Shape{2, 3, 6}, backend)
parts := x.Chunk(3, -1) // 3 tensors of shape [2, 3, 2]

func (*Tensor[T, B]) Clone

func (t *Tensor[T, B]) Clone() *Tensor[T, B]

Clone creates a deep copy of the tensor.

func (*Tensor[T, B]) Cos added in v0.3.0

func (t *Tensor[T, B]) Cos() *Tensor[T, B]

Cos computes the cosine of each element (input in radians).

Example:

x := tensor.Arange[float32](0, 10, backend)
y := x.Cos()  // cos(x) for each element

func (*Tensor[T, B]) DType

func (t *Tensor[T, B]) DType() DataType

DType returns the tensor's data type.

func (*Tensor[T, B]) Data

func (t *Tensor[T, B]) Data() []T

Data returns a typed slice view of the tensor's data. The slice directly accesses the underlying memory (zero-copy).

WARNING: Modifications to the returned slice will modify the tensor.

func (*Tensor[T, B]) Detach added in v0.3.0

func (t *Tensor[T, B]) Detach() *Tensor[T, B]

Detach returns a new tensor that shares the same data but doesn't track gradients.

This is useful for:

  • Stopping gradient flow at a specific point
  • Creating targets in reinforcement learning (no backprop through target)
  • HRM carry states between iterations (detach to prevent long gradient chains)
  • Teacher-student training (stop gradients through teacher)

The returned tensor shares the underlying data (zero-copy) but has no gradient tracking. Any operations on the detached tensor won't appear in the autodiff tape.

Example:

// Training with detached target
prediction := model.Forward(input)
target := target_model.Forward(input).Detach()  // No gradients through target
loss := prediction.Sub(target).Pow(2).Mean()

// HRM carry state
newCarry := Carry{
    zH: zH.Detach(),  // Break gradient chain
    zL: zL.Detach(),
}

func (*Tensor[T, B]) Device

func (t *Tensor[T, B]) Device() Device

Device returns the tensor's compute device.

func (*Tensor[T, B]) Div

func (t *Tensor[T, B]) Div(other *Tensor[T, B]) *Tensor[T, B]

Div performs element-wise division with broadcasting.

func (*Tensor[T, B]) DivScalar added in v0.3.0

func (t *Tensor[T, B]) DivScalar(scalar T) *Tensor[T, B]

DivScalar divides each element of the tensor by a scalar value.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.DivScalar(2.0)  // divide all elements by 2.0

func (*Tensor[T, B]) Embedding added in v0.5.1

func (t *Tensor[T, B]) Embedding(indices *Tensor[int32, B]) *Tensor[T, B]

Embedding performs embedding lookup using the tensor as the weight matrix.

The tensor t should have shape [numEmbeddings, embeddingDim]. The indices tensor contains integer indices to look up. Returns embeddings with shape [...indices.shape, embeddingDim].

This operation is differentiable - gradients flow back to the weight tensor via scatter-add (same indices accumulate gradients).

Example:

weight := tensor.Randn[float32](Shape{1000, 256}, backend)  // vocab=1000, dim=256
indices, _ := tensor.FromSlice([]int32{0, 5, 3}, Shape{3}, backend)
embeddings := weight.Embedding(indices)  // shape: [3, 256]

func (*Tensor[T, B]) Eq added in v0.3.0

func (t *Tensor[T, B]) Eq(other *Tensor[T, B]) *Tensor[bool, B]

Eq is a short alias for Equal.

Example:

mask := a.Eq(b)  // same as a.Equal(b)

func (*Tensor[T, B]) Equal added in v0.3.0

func (t *Tensor[T, B]) Equal(other *Tensor[T, B]) *Tensor[bool, B]

Equal returns a boolean tensor where each element is true if the corresponding elements in this tensor and other are equal.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Equal(b)  // [false, false, true, false, false]

func (*Tensor[T, B]) Exp added in v0.3.0

func (t *Tensor[T, B]) Exp() *Tensor[T, B]

Exp computes the exponential (e^x) of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Exp()  // e^x for each element

func (*Tensor[T, B]) Expand added in v0.3.0

func (t *Tensor[T, B]) Expand(shape Shape) *Tensor[T, B]

Expand broadcasts the tensor to a new shape.

The new shape must be compatible with the current shape according to NumPy broadcasting rules. Dimensions of size 1 can be broadcast to any size.

Example:

x := tensor.Randn[float32](Shape{1, 3}, backend)
y := x.Expand(Shape{5, 3})  // broadcast to [5, 3]

func (*Tensor[T, B]) Float32 added in v0.3.0

func (t *Tensor[T, B]) Float32() *Tensor[float32, B]

Float32 casts the tensor to float32 dtype.

Example:

x := tensor.Arange[int32](0, 10, backend)
y := x.Float32()  // Tensor[float32, B]

func (*Tensor[T, B]) Float64 added in v0.3.0

func (t *Tensor[T, B]) Float64() *Tensor[float64, B]

Float64 casts the tensor to float64 dtype.

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
y := x.Float64()  // Tensor[float64, B]

func (*Tensor[T, B]) Gather added in v0.3.0

func (t *Tensor[T, B]) Gather(dim int, index *Tensor[int32, B]) *Tensor[T, B]

Gather selects elements from the tensor along a dimension using an index tensor.

For each element in the index tensor, Gather selects the corresponding element from the input tensor along the specified dimension.

Example:

x := tensor.Randn[float32](Shape{3, 5}, backend)
indices, _ := tensor.FromSlice([]int32{0, 2, 4}, Shape{3}, backend)
y := x.Gather(1, indices)  // select columns 0, 2, 4 for each row

func (*Tensor[T, B]) Ge added in v0.3.0

func (t *Tensor[T, B]) Ge(other *Tensor[T, B]) *Tensor[bool, B]

Ge is a short alias for GreaterEqual.

Example:

mask := a.Ge(b)  // same as a.GreaterEqual(b)

func (*Tensor[T, B]) Grad

func (t *Tensor[T, B]) Grad() *Tensor[T, B]

Grad returns the gradient tensor (if computed by autodiff).

func (*Tensor[T, B]) Greater added in v0.3.0

func (t *Tensor[T, B]) Greater(other *Tensor[T, B]) *Tensor[bool, B]

Greater returns a boolean tensor where each element is true if the corresponding element in this tensor is greater than the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Greater(b)  // [false, false, false, true, true]

func (*Tensor[T, B]) GreaterEqual added in v0.3.0

func (t *Tensor[T, B]) GreaterEqual(other *Tensor[T, B]) *Tensor[bool, B]

GreaterEqual returns a boolean tensor where each element is true if the corresponding element in this tensor is greater than or equal to the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.GreaterEqual(b)  // [false, false, true, true, true]

func (*Tensor[T, B]) Gt added in v0.3.0

func (t *Tensor[T, B]) Gt(other *Tensor[T, B]) *Tensor[bool, B]

Gt is a short alias for Greater.

Example:

mask := a.Gt(b)  // same as a.Greater(b)

func (*Tensor[T, B]) Int32 added in v0.3.0

func (t *Tensor[T, B]) Int32() *Tensor[int32, B]

Int32 casts the tensor to int32 dtype.

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
y := x.Int32()  // Tensor[int32, B]

func (*Tensor[T, B]) Int64 added in v0.3.0

func (t *Tensor[T, B]) Int64() *Tensor[int64, B]

Int64 casts the tensor to int64 dtype.

Example:

x := tensor.Arange[int32](0, 10, backend)
y := x.Int64()  // Tensor[int64, B]

func (*Tensor[T, B]) Item

func (t *Tensor[T, B]) Item() T

Item returns the scalar value of a 0-D tensor. Panics if the tensor is not a scalar.
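
Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
total := x.Sum()  // scalar tensor, shape []
v := total.Item() // the float32 value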

func (*Tensor[T, B]) Le added in v0.3.0

func (t *Tensor[T, B]) Le(other *Tensor[T, B]) *Tensor[bool, B]

Le is a short alias for LowerEqual.

Example:

mask := a.Le(b)  // same as a.LowerEqual(b)

func (*Tensor[T, B]) Log added in v0.3.0

func (t *Tensor[T, B]) Log() *Tensor[T, B]

Log computes the natural logarithm (ln(x)) of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Log()  // ln(x) for each element

func (*Tensor[T, B]) Lower added in v0.3.0

func (t *Tensor[T, B]) Lower(other *Tensor[T, B]) *Tensor[bool, B]

Lower returns a boolean tensor where each element is true if the corresponding element in this tensor is less than the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.Lower(b)  // [true, true, false, false, false]

func (*Tensor[T, B]) LowerEqual added in v0.3.0

func (t *Tensor[T, B]) LowerEqual(other *Tensor[T, B]) *Tensor[bool, B]

LowerEqual returns a boolean tensor where each element is true if the corresponding element in this tensor is less than or equal to the corresponding element in other.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.LowerEqual(b)  // [true, true, true, false, false]

func (*Tensor[T, B]) Lt added in v0.3.0

func (t *Tensor[T, B]) Lt(other *Tensor[T, B]) *Tensor[bool, B]

Lt is a short alias for Lower.

Example:

mask := a.Lt(b)  // same as a.Lower(b)

func (*Tensor[T, B]) MatMul

func (t *Tensor[T, B]) MatMul(other *Tensor[T, B]) *Tensor[T, B]

MatMul performs matrix multiplication.

Requirements:

  • For 2D tensors: (M, K) @ (K, N) → (M, N)
  • For batched: (B, M, K) @ (B, K, N) → (B, M, N)

Example:

a := tensor.Randn[float32](Shape{3, 4}, backend)
b := tensor.Randn[float32](Shape{4, 5}, backend)
c := a.MatMul(b) // Shape: [3, 5]

func (*Tensor[T, B]) MeanDim added in v0.3.0

func (t *Tensor[T, B]) MeanDim(dim int, keepDim bool) *Tensor[T, B]

MeanDim computes the mean of tensor elements along the specified dimension.

Parameters:

  • dim: dimension to reduce (supports negative indexing: -1 = last dim)
  • keepDim: if true, keep the reduced dimension with size 1; if false, remove it

Example:

x := tensor.Randn[float32](Shape{2, 3, 4}, backend)
y := x.MeanDim(-1, true)   // shape: [2, 3, 1]
z := x.MeanDim(-1, false)  // shape: [2, 3]

func (*Tensor[T, B]) Mul

func (t *Tensor[T, B]) Mul(other *Tensor[T, B]) *Tensor[T, B]

Mul performs element-wise multiplication with broadcasting.

func (*Tensor[T, B]) MulScalar added in v0.3.0

func (t *Tensor[T, B]) MulScalar(scalar T) *Tensor[T, B]

MulScalar multiplies each element of the tensor by a scalar value.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.MulScalar(2.5)  // multiply all elements by 2.5

func (*Tensor[T, B]) Ne added in v0.3.0

func (t *Tensor[T, B]) Ne(other *Tensor[T, B]) *Tensor[bool, B]

Ne is a short alias for NotEqual.

Example:

mask := a.Ne(b)  // same as a.NotEqual(b)

func (*Tensor[bool, B]) Not added in v0.3.0

func (t *Tensor[bool, B]) Not() *Tensor[bool, B]

Not computes element-wise logical NOT of a boolean tensor.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := a.Not()  // [false, false, false]

func (*Tensor[T, B]) NotEqual added in v0.3.0

func (t *Tensor[T, B]) NotEqual(other *Tensor[T, B]) *Tensor[bool, B]

NotEqual returns a boolean tensor where each element is true if the corresponding elements in this tensor and other are not equal.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Arange[float32](0, 5, backend)
b := tensor.Full[float32](Shape{5}, 2.0, backend)
mask := a.NotEqual(b)  // [true, true, false, true, true]

func (*Tensor[T, B]) NumElements

func (t *Tensor[T, B]) NumElements() int

NumElements returns the total number of elements.

func (*Tensor[bool, B]) Or added in v0.3.0

func (t *Tensor[bool, B]) Or(other *Tensor[bool, B]) *Tensor[bool, B]

Or computes element-wise logical OR between two boolean tensors.

Supports broadcasting between tensors of different shapes.

Example:

a := tensor.Full[bool](Shape{3}, true, backend)
b := tensor.Full[bool](Shape{3}, false, backend)
c := a.Or(b)  // [true, true, true]

func (*Tensor[T, B]) Raw

func (t *Tensor[T, B]) Raw() *RawTensor

Raw returns the underlying RawTensor. Used by backend implementations for low-level operations.

func (*Tensor[T, B]) RequireGrad

func (t *Tensor[T, B]) RequireGrad() *Tensor[T, B]

RequireGrad marks this tensor for gradient computation. When called, subsequent operations involving this tensor will be tracked in the computation graph (if using an AutodiffBackend).

Returns the tensor itself for method chaining.

Example:

x := tensor.Ones[float32](Shape{2, 2}, autodiffBackend).RequireGrad()
y := x.Mul(x) // Operations are tracked
y.Backward()  // Computes gradients
fmt.Println(x.Grad()) // dy/dx available

func (*Tensor[T, B]) RequiresGrad

func (t *Tensor[T, B]) RequiresGrad() bool

RequiresGrad returns true if this tensor requires gradient computation.

func (*Tensor[T, B]) Reshape

func (t *Tensor[T, B]) Reshape(newShape ...int) *Tensor[T, B]

Reshape returns a tensor with the same data but different shape. The new shape must have the same number of elements.

Example:

t := tensor.Arange[int32](0, 12, backend) // Shape: [12]
reshaped := t.Reshape(3, 4)               // Shape: [3, 4]

func (*Tensor[T, B]) Rsqrt added in v0.3.0

func (t *Tensor[T, B]) Rsqrt() *Tensor[T, B]

Rsqrt computes the reciprocal square root (1/sqrt(x)) of each element.

This is often faster than computing Sqrt and then taking the reciprocal.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Rsqrt()  // 1/sqrt(x) for each element

func (*Tensor[T, B]) Set

func (t *Tensor[T, B]) Set(value T, indices ...int)

Set sets the element at the given indices. Panics if indices are out of bounds.

func (*Tensor[T, B]) SetGrad

func (t *Tensor[T, B]) SetGrad(grad *Tensor[T, B])

SetGrad sets the gradient tensor. Used internally by autodiff (TASK-004).

func (*Tensor[T, B]) Shape

func (t *Tensor[T, B]) Shape() Shape

Shape returns the tensor's shape.

func (*Tensor[T, B]) Sin added in v0.3.0

func (t *Tensor[T, B]) Sin() *Tensor[T, B]

Sin computes the sine of each element (input in radians).

Example:

x := tensor.Arange[float32](0, 10, backend)
y := x.Sin()  // sin(x) for each element

func (*Tensor[T, B]) Softmax added in v0.3.0

func (t *Tensor[T, B]) Softmax(dim int) *Tensor[T, B]

Softmax computes the softmax function along the specified dimension.

Softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j in dimension. Supports negative dimension indexing (-1 = last dimension).

Example:

logits := tensor.Randn[float32](Shape{2, 10}, backend)
probs := logits.Softmax(1)  // softmax along last dimension

func (*Tensor[T, B]) Sqrt added in v0.3.0

func (t *Tensor[T, B]) Sqrt() *Tensor[T, B]

Sqrt computes the square root of each element.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Sqrt()  // sqrt(x) for each element

func (*Tensor[T, B]) Squeeze added in v0.3.0

func (t *Tensor[T, B]) Squeeze(dim int) *Tensor[T, B]

Squeeze removes a dimension of size 1 at the specified position.

Panics if the dimension size is not 1. Supports negative dim indexing. This is a view operation (no data copy).

Example:

x := tensor.Randn[float32](Shape{2, 1, 3}, backend)
y := x.Squeeze(1)  // Shape: [2, 3]
z := x.Squeeze(-2) // Shape: [2, 3]

func (*Tensor[T, B]) String

func (t *Tensor[T, B]) String() string

String returns a human-readable representation of the tensor.

func (*Tensor[T, B]) Sub

func (t *Tensor[T, B]) Sub(other *Tensor[T, B]) *Tensor[T, B]

Sub performs element-wise subtraction with broadcasting.

func (*Tensor[T, B]) SubScalar added in v0.3.0

func (t *Tensor[T, B]) SubScalar(scalar T) *Tensor[T, B]

SubScalar subtracts a scalar value from each element of the tensor.

The scalar is broadcast to all elements of the tensor.

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.SubScalar(0.5)  // subtract 0.5 from all elements

func (*Tensor[T, B]) Sum added in v0.3.0

func (t *Tensor[T, B]) Sum() *Tensor[T, B]

Sum computes the sum of all elements in the tensor, returning a scalar.

The result is a tensor with shape [] (scalar).

Example:

x := tensor.Randn[float32](Shape{3, 4}, backend)
total := x.Sum()  // sum of all 12 elements

func (*Tensor[T, B]) SumDim added in v0.3.0

func (t *Tensor[T, B]) SumDim(dim int, keepDim bool) *Tensor[T, B]

SumDim sums tensor elements along the specified dimension.

Parameters:

  • dim: dimension to reduce (supports negative indexing: -1 = last dim)
  • keepDim: if true, keep the reduced dimension with size 1; if false, remove it

Example:

x := tensor.Randn[float32](Shape{2, 3, 4}, backend)
y := x.SumDim(-1, true)   // shape: [2, 3, 1]
z := x.SumDim(-1, false)  // shape: [2, 3]

func (*Tensor[T, B]) T

func (t *Tensor[T, B]) T() *Tensor[T, B]

T is a shortcut for 2D transpose (swaps rows and columns). Panics if the tensor is not 2D.

Example:

t := tensor.Randn[float32](Shape{3, 4}, backend)
transposed := t.T() // Shape: [4, 3]

func (*Tensor[T, B]) Transpose

func (t *Tensor[T, B]) Transpose(axes ...int) *Tensor[T, B]

Transpose transposes the tensor by permuting its dimensions.

If axes is empty, reverses all dimensions (for 2D, this is standard transpose). Otherwise, axes specifies the permutation.

Example:

t := tensor.Randn[float32](Shape{2, 3, 4}, backend)
transposed := t.Transpose(2, 0, 1) // Shape: [4, 2, 3]

func (*Tensor[T, B]) Unsqueeze added in v0.3.0

func (t *Tensor[T, B]) Unsqueeze(dim int) *Tensor[T, B]

Unsqueeze adds a dimension of size 1 at the specified position.

Supports negative dim indexing. This is a view operation (no data copy).

Example:

x := tensor.Randn[float32](Shape{2, 3}, backend)
y := x.Unsqueeze(1)  // Shape: [2, 1, 3]
z := x.Unsqueeze(-1) // Shape: [2, 3, 1]
