born

module
v0.2.0
Published: Nov 28, 2025 License: Apache-2.0

README

Born - Production-Ready ML for Go


"Models are born production-ready"

Born is a modern deep learning framework for Go, inspired by Burn (Rust). Build ML models in pure Go and deploy as single binaries - no Python runtime, no complex dependencies.

Project Status: 🚀 v0.2.0 released! WebGPU GPU backend with up to 123x MatMul speedup. Latest: ⚡ Phase 2 (GPU) complete: the WebGPU backend delivers a 10.9x speedup on batched inference.

Pure Go ML with GPU acceleration - no CGO required!


Why Born?

The Problem

Deploying ML models is hard:

  • Python runtime required
  • Complex dependency management
  • Large Docker images
  • Slow startup times
  • Integration friction with Go backends
The Born Solution
import "github.com/born-ml/born"

// Models "born" ready for production
model := born.Load("resnet50.born")
prediction := model.Predict(image)

// That's it. No Python. No containers. Just Go.

Benefits:

  • Single binary deployment
  • Fast startup (< 100ms)
  • Small memory footprint
  • Native Go integration
  • Cross-platform out of the box

Features

  • Pure Go - No CGO dependencies, trivial cross-compilation
  • Type Safe - Generics-powered API for compile-time guarantees
  • Multiple Backends - CPU (pure Go) and WebGPU today; CUDA, Vulkan, and Metal planned
  • Autodiff - Automatic differentiation via decorators
  • Production Ready - ONNX support, quantization, serving
  • WebAssembly - Run inference in browsers natively
  • Single Binary - Models embedded in executables

Quick Start

Installation
# Clone repository
git clone https://github.com/born-ml/born.git
cd born

# Build
make build

# Or install CLI
make install
Development Setup

Requirements:

  • Go 1.25+
  • Make (optional, but recommended)
  • golangci-lint (for linting)

Build:

make build          # Build all binaries
make test           # Run tests
make lint           # Run linter
make bench          # Run benchmarks
Example: MNIST Classification

Working example included! See examples/mnist/ for complete implementation.

package main

import (
    "fmt"

    "github.com/born-ml/born/autodiff"
    "github.com/born-ml/born/backend/cpu"
    "github.com/born-ml/born/nn"
    "github.com/born-ml/born/optim"
)

func main() {
    // Create backend with autodiff
    backend := autodiff.New(cpu.New())

    // Define model (784 → 128 → 10)
    model := NewMNISTNet(backend)
    // Create loss and optimizer
    criterion := nn.NewCrossEntropyLoss(backend)
    optimizer := optim.NewAdam(model.Parameters(), optim.AdamConfig{
        LR:    0.001,
        Betas: [2]float32{0.9, 0.999},
    }, backend)

    // Training loop (batch loading elided; see examples/mnist for the full data pipeline)
    for epoch := range 10 {
        // Forward pass
        logits := model.Forward(batch.ImagesTensor)
        loss := criterion.Forward(logits, batch.LabelsTensor)

        // Backward pass
        optimizer.ZeroGrad()
        grads := backend.Backward(loss.Raw())
        optimizer.Step(grads)

        // Log progress
        acc := nn.Accuracy(logits, batch.LabelsTensor)
        fmt.Printf("Epoch %d: Loss=%.4f, Accuracy=%.2f%%\n",
            epoch, loss.Raw().AsFloat32()[0], acc*100)
    }
}

Run it: cd examples/mnist && go run .

Phase 1 includes:

  • βœ… Tensor operations (Add, MatMul, Reshape, etc.)
  • βœ… Automatic differentiation with gradient tape
  • βœ… Neural network modules (Linear, ReLU, activations)
  • βœ… Optimizers (SGD with momentum, Adam with bias correction)
  • βœ… CrossEntropyLoss with numerical stability (log-sum-exp trick)
  • βœ… Full MNIST training example

Architecture

Backend Abstraction

Born uses a backend interface for device independence:

type Backend interface {
    Add(a, b *RawTensor) *RawTensor
    MatMul(a, b *RawTensor) *RawTensor
    // ... other operations
}

Available Backends:

| Backend | Status | Description |
|---------|--------|-------------|
| CPU | ✅ Available | Pure Go implementation (v0.1.1) |
| WebGPU | ✅ Available | Zero-CGO GPU via go-webgpu (v0.2.0) |
| Vulkan | 📋 Planned (v0.3, Q1 2026) | Cross-platform GPU compute |
| CUDA | 📋 Planned (v0.3, Q1 2026) | NVIDIA GPU, zero-CGO |
| Metal | 📋 Planned (v1.0, Q2 2026) | Apple GPU (macOS/iOS) |
Decorator Pattern

Functionality composed via decorators (inspired by Burn):

// Basic backend
base := cpu.New()

// Add autodiff
withAutodiff := autodiff.New(base)

// Add kernel fusion
optimized := fusion.New(withAutodiff)

// Your code works with any backend!
model := createModel(optimized)
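The composition above can be seen in miniature with a toy Backend and a logging wrapper; the types here are illustrative stand-ins for Born's much larger interface, but the wrapping mechanics are the same as the autodiff decorator recording ops onto its gradient tape:

```go
package main

import "fmt"

// Toy stand-in for Born's Backend interface; one method is enough
// to show the pattern.
type Backend interface {
	Add(a, b []float32) []float32
}

// cpuBackend is the base implementation.
type cpuBackend struct{}

func (cpuBackend) Add(a, b []float32) []float32 {
	out := make([]float32, len(a))
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out
}

// logging wraps any Backend and records each op before delegating,
// the way an autodiff decorator records ops onto a gradient tape.
type logging struct {
	inner Backend
	ops   []string
}

func (l *logging) Add(a, b []float32) []float32 {
	l.ops = append(l.ops, "Add")
	return l.inner.Add(a, b)
}

func main() {
	var be Backend = &logging{inner: cpuBackend{}}
	fmt.Println(be.Add([]float32{1, 2}, []float32{3, 4})) // [4 6]
	fmt.Println(be.(*logging).ops)                        // [Add]
}
```

Because the wrapper satisfies the same interface as the base, decorators stack in any order and calling code never needs to know which ones are present.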
Type Safety with Generics
type Tensor[T DType, B Backend] struct {
    raw     *RawTensor
    backend B
}

// Compile-time type checking: element type and backend travel with the tensor
func (t *Tensor[T, B]) MatMul(other *Tensor[T, B]) *Tensor[T, B]

Roadmap

Phase 1: Core (v0.1) - βœ… COMPLETE (Nov 2025)
  • Tensor API with generics
  • CPU backend (pure Go)
  • Autodiff decorator with gradient tape
  • NN modules (Linear, ReLU, Sigmoid, Tanh, Sequential)
  • SGD/Adam optimizers with momentum/bias correction
  • CrossEntropyLoss with numerical stability
  • MNIST classification example

Status: All 7 core tasks complete. 132 unit tests, 83.8% average coverage, 0 linter issues.

Phase 2: GPU (v0.2) - βœ… COMPLETE (Nov 2025)
  • WebGPU backend (zero-CGO via go-webgpu)
  • WGSL compute shaders (12 operations)
  • GPU buffer pooling & memory management
  • MNIST GPU inference (10.9x speedup)

Status: All 5 GPU tasks complete. 123x MatMul speedup, ~16000 samples/sec throughput.
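The buffer-pooling idea can be sketched CPU-side with sync.Pool; this is a hypothetical stand-in to show the concept, not the WebGPU backend's actual device-buffer pool:

```go
package main

import (
	"fmt"
	"sync"
)

// bufferPool reuses fixed-size scratch buffers, the way a GPU
// backend reuses device buffers instead of allocating one per op.
type bufferPool struct {
	size int
	pool sync.Pool
}

func newBufferPool(size int) *bufferPool {
	p := &bufferPool{size: size}
	// Store *[]float32 so Put does not allocate a fresh interface box.
	p.pool.New = func() any {
		b := make([]float32, size)
		return &b
	}
	return p
}

func (p *bufferPool) Get() []float32  { return *p.pool.Get().(*[]float32) }
func (p *bufferPool) Put(b []float32) { p.pool.Put(&b) }

func main() {
	pool := newBufferPool(1024)
	buf := pool.Get()
	buf[0] = 42 // use the buffer as scratch space for an op
	pool.Put(buf)
	// Subsequent Gets may reuse the same backing array.
	fmt.Println(len(pool.Get())) // 1024
}
```

On a GPU the payoff is larger than on the CPU: device allocations require driver round-trips, so reusing same-sized buffers keeps the hot path allocation-free.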

Phase 3: Production (v0.3) - Q1 2026
  • ONNX import/export
  • Model quantization
  • Model serving
  • Vulkan/CUDA backends
Phase 4: Advanced (v1.0) - Q2 2026
  • Metal backend
  • Distributed training
  • Advanced optimizations
  • Model zoo

Full roadmap: See project milestones


Documentation

For Users
For Contributors

Philosophy

"Born Ready"

Models trained anywhere (PyTorch, TensorFlow) are imported and born production-ready:

Training β†’ Birth β†’ Production
 (Burn)    (Born)    (Run)

PyTorch trains  β†’  Born imports  β†’  Born deploys
TensorFlow trains β†’ Born imports β†’ Born deploys
Born trains    β†’  Born ready   β†’  Born serves
Production First
  • Single Binary: Entire model in one executable
  • No Runtime: No Python, no dependencies
  • Fast Startup: < 100ms cold start
  • Small Memory: Minimal footprint
  • Cloud Native: Natural fit for Go services
Developer Experience
  • Type Safe: Catch errors at compile time
  • Clean API: Intuitive and ergonomic
  • Great Docs: Comprehensive documentation
  • Easy Deploy: go build and you're done

Performance

Actual Benchmarks (AMD Ryzen 9 5950X, NVIDIA RTX 3080):

Matrix Operations (WebGPU vs CPU):

| Operation | CPU | GPU | Speedup |
|-----------|-----|-----|---------|
| MatMul 1024x1024 | 7143ms | 58ms | 123x |
| MatMul 512x512 | 499ms | 12ms | 41x |
| MatMul 256x256 | 56ms | 3.7ms | 15x |

Neural Network Inference:

| Batch Size | CPU | GPU | Speedup | Throughput |
|------------|-----|-----|---------|------------|
| 64 | 48ms | 19ms | 2.5x | 3,357/s |
| 256 | 182ms | 21ms | 8.5x | 11,883/s |
| 512 | 348ms | 32ms | 10.9x | 15,973/s |

Note: CPU backend uses naive O(nΒ³) MatMul. SIMD optimizations planned for v0.3.
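For reference, the naive O(n³) MatMul the note refers to is the textbook triple loop; this is a sketch of the algorithm, not necessarily the backend's exact code:

```go
package main

import "fmt"

// matMul multiplies an m×k matrix a by a k×n matrix b (both
// row-major) with the textbook triple loop: O(m·k·n) multiply-adds
// and a cache-unfriendly stride of n over b in the inner loop,
// which is why large sizes fall so far behind the GPU path.
func matMul(a, b []float32, m, k, n int) []float32 {
	out := make([]float32, m*n)
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			var sum float32
			for p := 0; p < k; p++ {
				sum += a[i*k+p] * b[p*n+j]
			}
			out[i*n+j] = sum
		}
	}
	return out
}

func main() {
	// [1 2; 3 4] * [5 6; 7 8] = [19 22; 43 50]
	fmt.Println(matMul([]float32{1, 2, 3, 4}, []float32{5, 6, 7, 8}, 2, 2, 2)) // [19 22 43 50]
}
```

Blocking, loop reordering, and SIMD (the v0.3 plan) all attack the same inner loop without changing the arithmetic.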


Inspiration

Born is inspired by and learns from:

  • Burn - Architecture patterns, decorator design
  • PyTorch - API ergonomics
  • TinyGrad - Simplicity principles
  • Gonum - Go numerical computing
  • HDF5 for Go - Model serialization, dataset storage (planned)

Community

Project is in early development. Star the repo to follow progress!


License

Licensed under the Apache License, Version 2.0.

Why Apache 2.0?

  • βœ… Patent protection - Critical for ML algorithms and production use
  • βœ… Enterprise-friendly - Clear legal framework for commercial adoption
  • βœ… Industry standard - Same as TensorFlow, battle-tested in ML ecosystem
  • βœ… Contributor protection - Explicit patent grant and termination clauses

See LICENSE file for full terms.


FAQ

Q: Why not use Gorgonia? A: Gorgonia is great but uses a different approach. Born focuses on modern Go (generics), pure Go (no CGO), and production-first design inspired by Burn.

Q: When will it be ready? A: Phase 1 (v0.1.1) and Phase 2 (v0.2.0 WebGPU) are RELEASED! ONNX support (Phase 3) targeted for Q1 2026.

Q: Can I use PyTorch models? A: Yes! Via ONNX import (Phase 3). Train in PyTorch, deploy with Born.

Q: WebAssembly support? A: Yes! Pure Go compiles to WASM natively. Inference in browsers out of the box.

Q: How can I help? A: Watch this space! Contributing guide coming soon.


Born for Production. Ready from Day One.

Made with ❀️ by the Born ML team

Documentation β€’ Contributing β€’ Community

Directories

| Path | Synopsis |
|------|----------|
| autodiff | Package autodiff provides automatic differentiation capabilities. |
| backend/cpu | Package cpu provides a pure Go CPU backend for tensor operations. |
| cmd/born | Package main provides the Born ML Framework CLI. |
| cmd/born-bench | Package main provides benchmarking tools for Born. |
| cmd/born-convert | Package main provides model conversion tools for Born. |
| examples/mnist | |
| examples/mnist-cnn | |
| examples/mnist-gpu | MNIST GPU Inference Benchmark |
| internal/autodiff | Package autodiff implements automatic differentiation using the decorator pattern. |
| internal/autodiff/ops | Package ops defines operation interfaces and implementations for automatic differentiation. |
| internal/backend | Package backend provides backend implementations for tensor operations. |
| internal/backend/cpu | Package cpu implements the CPU backend with SIMD optimizations and BLAS integration. |
| internal/backend/webgpu | Package webgpu implements the WebGPU backend for GPU-accelerated tensor operations. |
| internal/nn | Package nn implements neural network modules for the Born ML Framework. |
| internal/optim | Package optim implements optimization algorithms for training neural networks. |
| internal/tensor | Package tensor provides the core tensor types and operations for Born ML framework. |
| nn | Package nn provides neural network layers and building blocks. |
| optim | Package optim provides optimization algorithms for training neural networks. |
| tensor | Package tensor provides type-safe tensor operations for the Born ML framework. |
