# Born - Production-Ready ML for Go

"Models are born production-ready"
Born is a modern deep learning framework for Go, inspired by Burn (Rust). Build ML models in pure Go and deploy as single binaries - no Python runtime, no complex dependencies.
Project Status: v0.1.0 Initial Release! (MNIST: 97.44% MLP, 98.18% CNN - production-ready)

Celebrating 16 years of Go (2009-2025) with production-ready ML.
## Why Born?

### The Problem

Deploying ML models is hard:
- Python runtime required
- Complex dependency management
- Large Docker images
- Slow startup times
- Integration friction with Go backends
### The Born Solution

```go
import "github.com/born-ml/born"

// Models "born" ready for production
model := born.Load("resnet50.born")
prediction := model.Predict(image)

// That's it. No Python. No containers. Just Go.
```
Benefits:
- Single binary deployment
- Fast startup (< 100ms)
- Small memory footprint
- Native Go integration
- Cross-platform out of the box
## Features
- Pure Go - No CGO dependencies, trivial cross-compilation
- Type Safe - Generics-powered API for compile-time guarantees
- Multiple Backends - CPU (SIMD), CUDA, Vulkan, Metal, WebGPU
- Autodiff - Automatic differentiation via decorators
- Production Ready - ONNX support, quantization, serving
- WebAssembly - Run inference in browsers natively
- Single Binary - Models embedded in executables
## Quick Start

### Installation

```shell
# Clone the repository
git clone https://github.com/born-ml/born.git
cd born

# Build
make build

# Or install the CLI
make install
```
### Development Setup

Requirements:
- Go 1.25+
- Make (optional, but recommended)
- golangci-lint (for linting)

Build:

```shell
make build   # Build all binaries
make test    # Run tests
make lint    # Run linter
make bench   # Run benchmarks
```
### Example: MNIST Classification

Working example included! See examples/mnist/ for the complete implementation.

```go
package main

import (
	"fmt"

	"github.com/born-ml/born/internal/autodiff"
	"github.com/born-ml/born/internal/backend/cpu"
	"github.com/born-ml/born/internal/nn"
	"github.com/born-ml/born/internal/optim"
)

func main() {
	// Create backend with autodiff
	backend := autodiff.New(cpu.New())

	// Define model (784 -> 128 -> 10); NewMNISTNet is defined in examples/mnist
	model := NewMNISTNet(backend)

	// Create loss and optimizer
	criterion := nn.NewCrossEntropyLoss(backend)
	optimizer := optim.NewAdam(model.Parameters(), optim.AdamConfig{
		LR:    0.001,
		Betas: [2]float32{0.9, 0.999},
	}, backend)

	// Training loop (batch loading omitted; see examples/mnist for the data loader)
	for epoch := range 10 {
		// Forward pass
		logits := model.Forward(batch.ImagesTensor)
		loss := criterion.Forward(logits, batch.LabelsTensor)

		// Backward pass
		optimizer.ZeroGrad()
		grads := backend.Backward(loss.Raw())
		optimizer.Step(grads)

		// Log progress
		acc := nn.Accuracy(logits, batch.LabelsTensor)
		fmt.Printf("Epoch %d: Loss=%.4f, Accuracy=%.2f%%\n",
			epoch, loss.Raw().AsFloat32()[0], acc*100)
	}
}
```

Run it: `cd examples/mnist && go run .`
Phase 1 includes:

- ✅ Tensor operations (Add, MatMul, Reshape, etc.)
- ✅ Automatic differentiation with gradient tape
- ✅ Neural network modules (Linear, ReLU, activations)
- ✅ Optimizers (SGD with momentum, Adam with bias correction)
- ✅ CrossEntropyLoss with numerical stability (log-sum-exp trick)
- ✅ Full MNIST training example
## Architecture

### Backend Abstraction

Born uses a backend interface for device independence:

```go
type Backend interface {
	Add(a, b *RawTensor) *RawTensor
	MatMul(a, b *RawTensor) *RawTensor
	// ... other operations
}
```
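To make the interface concrete, a toy implementation might look like the following. The `RawTensor` fields here (a flat buffer plus a shape) are assumptions for the sketch, not Born's actual internal layout:

```go
package main

import "fmt"

// RawTensor is an illustrative stand-in for Born's internal tensor:
// a flat float32 buffer plus a shape.
type RawTensor struct {
	Data  []float32
	Shape []int
}

// Backend abstracts where and how tensor ops execute.
type Backend interface {
	Add(a, b *RawTensor) *RawTensor
}

// CPUBackend is the simplest possible implementation: plain Go loops.
type CPUBackend struct{}

func (CPUBackend) Add(a, b *RawTensor) *RawTensor {
	out := &RawTensor{Data: make([]float32, len(a.Data)), Shape: a.Shape}
	for i := range a.Data {
		out.Data[i] = a.Data[i] + b.Data[i]
	}
	return out
}

func main() {
	var be Backend = CPUBackend{}
	a := &RawTensor{Data: []float32{1, 2, 3}, Shape: []int{3}}
	b := &RawTensor{Data: []float32{10, 20, 30}, Shape: []int{3}}
	fmt.Println(be.Add(a, b).Data)
}
```

A GPU backend would satisfy the same interface while dispatching to device kernels, so model code never changes.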
Available Backends:

| Backend | Status | Description |
|---------|--------|-------------|
| CPU | Available (v0.1) | Pure Go with SIMD optimizations |
| CUDA | Planned | NVIDIA GPU via direct driver calls |
| Vulkan | Planned | Cross-platform GPU compute |
| Metal | Planned | Apple GPU (macOS/iOS) |
| WebGPU | Planned | Modern browser GPU |
### Decorator Pattern

Functionality is composed via decorators (inspired by Burn):

```go
// Basic backend
base := cpu.New()

// Add autodiff
withAutodiff := autodiff.New(base)

// Add kernel fusion
optimized := fusion.New(withAutodiff)

// Your code works with any backend!
model := createModel(optimized)
```
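The pattern is plain Go interface wrapping: a decorator holds an inner `Backend`, satisfies the same interface, and adds behavior around each call. A self-contained sketch with a trimmed-down interface and an op-counting decorator standing in for autodiff or fusion (all names illustrative):

```go
package main

import "fmt"

// Backend is a tiny slice of the real interface, enough to decorate.
type Backend interface {
	Add(a, b []float32) []float32
}

type cpu struct{}

func (cpu) Add(a, b []float32) []float32 {
	out := make([]float32, len(a))
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out
}

// counting wraps any Backend and records how many ops ran -
// the same shape an autodiff or fusion decorator would take.
type counting struct {
	inner Backend
	ops   int
}

func (c *counting) Add(a, b []float32) []float32 {
	c.ops++ // decorator behavior before delegating
	return c.inner.Add(a, b)
}

func main() {
	traced := &counting{inner: cpu{}}
	// Code written against Backend works with either the base or the wrapper.
	var be Backend = traced
	be.Add([]float32{1, 2}, []float32{3, 4})
	be.Add([]float32{5}, []float32{6})
	fmt.Println("ops recorded:", traced.ops)
}
```

Because decorators and base backends share one interface, they stack in any order, which is what lets `fusion.New(autodiff.New(cpu.New()))` work.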
### Type Safety with Generics

```go
type Tensor[T DType, B Backend] struct {
	raw     *RawTensor
	backend B
}

// Compile-time type checking: element type and backend are part of the type
func (t *Tensor[T, B]) MatMul(other *Tensor[T, B]) *Tensor[T, B]
```
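A compilable sketch of the same idea, simplified to a single type parameter (the types and methods here are illustrative, not Born's):

```go
package main

import "fmt"

// DType restricts tensors to supported element types.
type DType interface {
	~float32 | ~float64
}

// Tensor carries its element type in the type system, so mixing
// float32 and float64 tensors fails at compile time, not at runtime.
type Tensor[T DType] struct {
	data []T
}

// Dot computes the dot product of two same-typed tensors.
func (t Tensor[T]) Dot(other Tensor[T]) T {
	var sum T
	for i := range t.data {
		sum += t.data[i] * other.data[i]
	}
	return sum
}

func main() {
	a := Tensor[float32]{data: []float32{1, 2, 3}}
	b := Tensor[float32]{data: []float32{4, 5, 6}}
	fmt.Println(a.Dot(b)) // 1*4 + 2*5 + 3*6
	// a.Dot(Tensor[float64]{...}) would be a compile error.
}
```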
## Roadmap

### Phase 1: Core (v0.1) - ✅ Complete (Nov 2025)
- Tensor API with generics
- CPU backend (pure Go)
- Autodiff decorator with gradient tape
- NN modules (Linear, ReLU, Sigmoid, Tanh, Sequential)
- SGD/Adam optimizers with momentum/bias correction
- CrossEntropyLoss with numerical stability
- MNIST classification example
Status: All 7 core tasks complete. 132 unit tests, 83.8% average coverage, 0 linter issues.
### Phase 2: GPU (v0.2) - Q2 2026
- Vulkan backend
- CUDA backend
- Kernel fusion
- Memory optimization
### Phase 3: Production (v0.3) - Q3 2026
- ONNX import/export
- Model quantization
- Model serving
- Distributed training
### Phase 4: Advanced (v1.0) - Q4 2026
- Metal backend
- WebGPU backend
- Advanced optimizations
- Model zoo
Full roadmap: See project milestones
## Documentation
For Contributors
## Philosophy

### "Born Ready"

Models trained anywhere (PyTorch, TensorFlow) are imported and born production-ready:

```text
Training → Birth → Production
 (Burn)    (Born)    (Run)

PyTorch trains    → Born imports → Born deploys
TensorFlow trains → Born imports → Born deploys
Born trains       → Born ready   → Born serves
```
### Production First
- Single Binary: Entire model in one executable
- No Runtime: No Python, no dependencies
- Fast Startup: < 100ms cold start
- Small Memory: Minimal footprint
- Cloud Native: Natural fit for Go services
### Developer Experience

- Type Safe: Catch errors at compile time
- Clean API: Intuitive and ergonomic
- Great Docs: Comprehensive documentation
- Easy Deploy: `go build` and you're done
## Benchmarks

Full benchmarks are coming now that Phase 1 is complete.
Targets (Intel i9-13900K, NVIDIA RTX 4090):

| Operation | CPU Target | GPU Target | Speedup |
|-----------|------------|------------|---------|
| MatMul 1024x1024 | < 20ms | < 1ms | 20x |
| Conv2d 224x224 | < 100ms | < 3ms | 30x |
| Transformer Block | < 150ms | < 5ms | 30x |
## Inspiration
Born is inspired by and learns from:
- Burn - Architecture patterns, decorator design
- PyTorch - API ergonomics
- TinyGrad - Simplicity principles
- Gonum - Go numerical computing
- HDF5 for Go - Model serialization, dataset storage (planned)
Project is in early development. Star the repo to follow progress!
## License
Licensed under the Apache License, Version 2.0.
Why Apache 2.0?
- ✅ Patent protection - Critical for ML algorithms and production use
- ✅ Enterprise-friendly - Clear legal framework for commercial adoption
- ✅ Industry standard - Same as TensorFlow, battle-tested in the ML ecosystem
- ✅ Contributor protection - Explicit patent grant and termination clauses
See LICENSE file for full terms.
## FAQ
Q: Why not use Gorgonia?
A: Gorgonia is great but uses a different approach. Born focuses on modern Go (generics), pure Go (no CGO), and production-first design inspired by Burn.
Q: When will it be ready?
A: Phase 1 (CPU backend, basic training) shipped in v0.1.0. GPU backends and production features arrive in later phases; follow development on GitHub.
Q: Can I use PyTorch models?
A: Yes! Via ONNX import (Phase 3). Train in PyTorch, deploy with Born.
Q: WebAssembly support?
A: Yes! Pure Go compiles to WASM natively. Inference in browsers out of the box.
Q: How can I help?
A: Watch this space! Contributing guide coming soon.