SciGo
Why SciGo?
SciGo = Statistical Computing In Go
SciGo brings the power and familiarity of scikit-learn to the Go ecosystem, offering:
- Blazing Fast: Native Go implementation with built-in parallelization
- scikit-learn Compatible: Familiar Fit/Predict API for easy migration
- LightGBM Support: Full compatibility with Python LightGBM models (.txt/JSON/string)
- Well Documented: Complete API documentation with examples on pkg.go.dev
- Streaming Support: Online learning algorithms for real-time data
- Zero Heavy Dependencies: Pure Go implementation (only scientific essentials)
- Comprehensive: Regression, classification, clustering, tree-based models, and more
- Production Ready: Extensive tests, benchmarks, and error handling
- Superior to leaves: Not just inference - full training, convenience features, and numerical precision
Installation
Go Module (Recommended)
go get github.com/YuminosukeSato/scigo@latest
Quick Start Options
- Docker:
docker run --rm -it ghcr.io/yuminosukesato/scigo:latest
- GitPod:
- Go Install:
go install github.com/YuminosukeSato/scigo/examples/quick-start@latest
Quick Start
Tip: For complete API documentation with examples, visit pkg.go.dev/scigo
Option 1: One-Liner with LightGBM
package main

import (
    "github.com/YuminosukeSato/scigo/sklearn/lightgbm"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Super convenient one-liner training!
    X := mat.NewDense(100, 4, data)   // Your data (100 samples x 4 features)
    y := mat.NewDense(100, 1, labels) // Your labels

    // Train and predict in one line!
    result := lightgbm.QuickTrain(X, y)
    predictions := result.Predict(XTest) // XTest: your held-out features

    // Or use AutoML for automatic tuning
    best := lightgbm.AutoFit(X, y)

    // Load Python LightGBM models directly!
    model := lightgbm.NewLGBMClassifier()
    model.LoadModel("python_model.txt") // Full compatibility!
    predictions, _ = model.Predict(XTest)
}
Option 2: Classic Linear Regression
package main

import (
    "fmt"
    "log"

    "github.com/YuminosukeSato/scigo/linear"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Create and train model - just like scikit-learn!
    model := linear.NewLinearRegression()

    // Training data
    X := mat.NewDense(4, 2, []float64{
        1, 1,
        1, 2,
        2, 2,
        2, 3,
    })
    y := mat.NewDense(4, 1, []float64{
        2, 3, 3, 4,
    })

    // Fit the model
    if err := model.Fit(X, y); err != nil {
        log.Fatal(err)
    }

    // Make predictions
    XTest := mat.NewDense(2, 2, []float64{
        1.5, 1.5,
        2.5, 3.5,
    })
    predictions, _ := model.Predict(XTest)

    fmt.Println("Ready, Set, SciGo! Predictions:", predictions)
}
API Documentation

Package Documentation

| Package | Description |
|---------|-------------|
| sklearn/lightgbm | LightGBM with Python model compatibility & convenience features |
| sklearn/linear_model | Linear models with full scikit-learn compatibility |
| preprocessing | Data preprocessing utilities (StandardScaler, MinMaxScaler, OneHotEncoder) |
| linear | Linear machine learning algorithms (LinearRegression) |
| metrics | Model evaluation metrics (MSE, RMSE, MAE, R², MAPE) |
| core/model | Base model with weight export/import and scikit-learn compatibility |
Complete API Examples
The documentation includes comprehensive examples for all major APIs. Visit pkg.go.dev or use go doc locally:
# View package documentation
go doc github.com/YuminosukeSato/scigo/preprocessing
go doc github.com/YuminosukeSato/scigo/linear
go doc github.com/YuminosukeSato/scigo/metrics
# View specific function documentation
go doc github.com/YuminosukeSato/scigo/preprocessing.StandardScaler.Fit
go doc github.com/YuminosukeSato/scigo/linear.LinearRegression.Predict
go doc github.com/YuminosukeSato/scigo/metrics.MSE
# Run example tests
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example
Algorithms
Supervised Learning
Linear Models
- ✅ Linear Regression - Full scikit-learn compatible implementation with QR decomposition
- ✅ SGD Regressor - Stochastic Gradient Descent for large-scale learning (see the sketch below)
- ✅ SGD Classifier - Linear classifiers with SGD training
- ✅ Passive-Aggressive - Online learning for classification and regression
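For large-scale problems, the SGD-based models follow the same Fit/Predict pattern shown in the Quick Start. Below is a minimal sketch; the NewSGDRegressor constructor, its zero-argument form, and the imported package name are assumptions to verify against the sklearn/linear_model godoc.

package main

import (
    "fmt"
    "log"

    linearmodel "github.com/YuminosukeSato/scigo/sklearn/linear_model"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Assumed constructor; defaults are chosen by the library.
    model := linearmodel.NewSGDRegressor()

    // Tiny toy problem: 4 samples, 2 features.
    X := mat.NewDense(4, 2, []float64{1, 1, 1, 2, 2, 2, 2, 3})
    y := mat.NewDense(4, 1, []float64{2, 3, 3, 4})

    if err := model.Fit(X, y); err != nil {
        log.Fatal(err)
    }

    predictions, err := model.Predict(X)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(predictions)
}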
Data Preprocessing
- ✅ StandardScaler - Standardizes features by removing the mean and scaling to unit variance (see the sketch below)
- ✅ MinMaxScaler - Scales features to a given range (e.g., [0,1] or [-1,1])
- ✅ OneHotEncoder - Encodes categorical features as one-hot numeric arrays
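A typical preprocessing flow mirrors scikit-learn's fit/transform pattern. The sketch below assumes a NewStandardScaler constructor and a Transform method alongside the documented StandardScaler.Fit; consult the preprocessing godoc for the exact API.

package main

import (
    "fmt"
    "log"

    "github.com/YuminosukeSato/scigo/preprocessing"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Assumed constructor; Fit is documented, Transform is assumed here.
    scaler := preprocessing.NewStandardScaler()

    X := mat.NewDense(3, 2, []float64{
        1, 10,
        2, 20,
        3, 30,
    })

    // Learn per-feature mean and variance, then scale to zero mean / unit variance.
    if err := scaler.Fit(X); err != nil {
        log.Fatal(err)
    }
    XScaled, err := scaler.Transform(X)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(mat.Formatted(XScaled))
}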
Tree-based Models
- ✅ LightGBM - Full Python model compatibility (.txt/JSON/string formats)
  - LGBMClassifier - Binary and multiclass classification
  - LGBMRegressor - Regression with multiple objectives
  - QuickTrain - One-liner training with automatic model selection
  - AutoFit - Automatic hyperparameter tuning
  - Superior to leaves - training + convenience features
- 🚧 Random Forest (Coming Soon)
- 🚧 XGBoost compatibility (Coming Soon)
Unsupervised Learning
Clustering
- ✅ MiniBatch K-Means - Scalable K-Means for large datasets (see the sketch below)
- 🚧 DBSCAN (Coming Soon)
- 🚧 Hierarchical Clustering (Coming Soon)
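To give a feel for the clustering API, here is a minimal sketch. The cluster.NewMiniBatchKMeans constructor, its single argument for the number of clusters, and the Fit/Predict signatures are assumptions; check the sklearn/cluster godoc for the real interface.

package main

import (
    "fmt"
    "log"

    "github.com/YuminosukeSato/scigo/sklearn/cluster"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Assumed constructor taking the number of clusters.
    km := cluster.NewMiniBatchKMeans(2)

    // Two well-separated blobs around (0,0) and (10,10).
    X := mat.NewDense(6, 2, []float64{
        0.0, 0.0, 0.2, 0.1, 0.1, 0.3,
        10.0, 10.0, 10.2, 9.9, 9.8, 10.1,
    })

    // Clustering is unsupervised; the actual Fit may take only X rather than (X, nil).
    if err := km.Fit(X, nil); err != nil {
        log.Fatal(err)
    }
    labels, err := km.Predict(X)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(labels)
}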
Special Features
Online Learning & Streaming
- ✅ Incremental Learning - Update models with new data batches (see the sketch below)
- ✅ Partial Fit - scikit-learn compatible online learning
- ✅ Concept Drift Detection - DDM and ADWIN algorithms
- ✅ Streaming Pipelines - Real-time data processing with channels
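As a rough illustration of the incremental-learning flow, the sketch below feeds a model one mini-batch at a time via PartialFit; in a real pipeline the batches would arrive from a channel (see FitStream in the compatibility section) and could additionally be routed through a drift detector. The SGD regressor constructor and package alias are the same assumptions as in the earlier sketch.

package main

import (
    "log"

    linearmodel "github.com/YuminosukeSato/scigo/sklearn/linear_model"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // Assumed constructor; any model exposing PartialFit follows the same pattern.
    model := linearmodel.NewSGDRegressor()

    // Simulate a stream of three mini-batches arriving over time.
    for batch := 0; batch < 3; batch++ {
        base := float64(batch)
        X := mat.NewDense(2, 1, []float64{base, base + 1})
        y := mat.NewDense(2, 1, []float64{2 * base, 2 * (base + 1)})

        // Each call updates the model in place using only the newest batch.
        if err := model.PartialFit(X, y); err != nil {
            log.Fatal(err)
        }
    }
}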
scikit-learn Compatibility
SciGo implements the familiar scikit-learn API with full compatibility:
// Just like scikit-learn!
model.Fit(X, y) // Train the model
model.Predict(X) // Make predictions
model.Score(X, y) // Evaluate the model
model.PartialFit(X, y) // Incremental learning
// New in v0.3.0 - Full scikit-learn compatibility
model.GetParams(deep) // Get model parameters
model.SetParams(params) // Set model parameters
weights, _ := model.ExportWeights() // Export model weights
model.ImportWeights(weights) // Import with guaranteed reproducibility
// Streaming - unique to Go!
model.FitStream(ctx, dataChan) // Streaming training
New Features in v0.3.0
- Complete Weight Reproducibility - Guaranteed identical outputs with the same weights (see the round-trip sketch below)
- gRPC/Protobuf Support - Distributed training and prediction
- Full Parameter Management - GetParams/SetParams for all models
- Model Serialization - Export/Import with full precision
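As a rough sketch of what the reproducibility guarantee looks like in practice, the snippet below exports the weights of a trained LinearRegression and imports them into a fresh instance; it only assumes that ImportWeights accepts whatever ExportWeights returns and that Predict returns a gonum matrix, as in the compatibility snippet above.

package main

import (
    "fmt"
    "log"

    "github.com/YuminosukeSato/scigo/linear"
    "gonum.org/v1/gonum/mat"
)

func main() {
    X := mat.NewDense(4, 2, []float64{1, 1, 1, 2, 2, 2, 2, 3})
    y := mat.NewDense(4, 1, []float64{2, 3, 3, 4})

    src := linear.NewLinearRegression()
    if err := src.Fit(X, y); err != nil {
        log.Fatal(err)
    }

    // Export from the trained model, import into a fresh one.
    weights, err := src.ExportWeights()
    if err != nil {
        log.Fatal(err)
    }
    dst := linear.NewLinearRegression()
    if err := dst.ImportWeights(weights); err != nil {
        log.Fatal(err)
    }

    // With identical weights, both models should produce identical predictions.
    p1, _ := src.Predict(X)
    p2, _ := dst.Predict(X)
    fmt.Println(mat.Equal(p1, p2))
}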
SciGo leverages Go's concurrency for exceptional performance:

| Algorithm | Dataset Size | SciGo | scikit-learn (Python) | Speedup |
|-----------|--------------|-------|-----------------------|---------|
| Linear Regression | 1M×100 | 245ms | 890ms | 3.6× |
| SGD Classifier | 500K×50 | 180ms | 520ms | 2.9× |
| MiniBatch K-Means | 100K×20 | 95ms | 310ms | 3.3× |
| Streaming SGD | 1M streaming | 320ms | 1.2s | 3.8× |

Benchmarks on MacBook Pro M2, 16GB RAM
Memory Efficiency

| Dataset Size | Memory | Allocations |
|--------------|--------|-------------|
| 100×10 | 22.8KB | 22 |
| 1,000×10 | 191.8KB | 22 |
| 10,000×20 | 3.4MB | 57 |
| 50,000×50 | 41.2MB | 61 |
Architecture
scigo/
├── linear/            # Linear models
├── sklearn/           # scikit-learn compatible implementations
│   ├── linear_model/  # SGD, Passive-Aggressive
│   ├── cluster/       # Clustering algorithms
│   └── drift/         # Concept drift detection
├── metrics/           # Evaluation metrics
├── core/              # Core abstractions
│   ├── model/         # Base model interfaces
│   ├── tensor/        # Tensor operations
│   └── parallel/      # Parallel processing
├── datasets/          # Dataset utilities
└── examples/          # Usage examples
Metrics
Comprehensive evaluation metrics with full documentation:
- Regression: MSE, RMSE, MAE, R², MAPE (see the sketch below)
- Classification: Accuracy, Precision, Recall, F1-Score, ROC-AUC (coming)
- Clustering: Silhouette Score, Davies-Bouldin Index (coming)
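A quick sanity check with the regression metrics might look like the sketch below. metrics.MSE is listed in the package documentation above; its exact signature (here assumed to take two gonum matrices and return (float64, error)) should be confirmed with go doc.

package main

import (
    "fmt"
    "log"

    "github.com/YuminosukeSato/scigo/metrics"
    "gonum.org/v1/gonum/mat"
)

func main() {
    // True targets and model predictions for 4 samples.
    yTrue := mat.NewDense(4, 1, []float64{2, 3, 3, 4})
    yPred := mat.NewDense(4, 1, []float64{2.1, 2.9, 3.2, 3.8})

    // Signature assumed as MSE(yTrue, yPred) (float64, error); see `go doc .../metrics.MSE`.
    mse, err := metrics.MSE(yTrue, yPred)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("MSE: %.4f\n", mse)
}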
Testing & Quality
# Run tests
go test ./...
# Run benchmarks
go test -bench=. -benchmem ./...
# Check coverage (76.7% overall coverage)
go test -cover ./...
# Run linter (errcheck, govet, ineffassign, staticcheck, unused, misspell)
make lint-full
# Run examples to see API usage
go test -v ./preprocessing -run Example
go test -v ./linear -run Example
go test -v ./metrics -run Example
go test -v ./core/model -run Example
Quality Gates
- ✅ Test Coverage: 76.7% (target: 70%+)
- ✅ Linting: golangci-lint with comprehensive checks
- ✅ Documentation: Complete godoc for all public APIs
- ✅ Examples: Comprehensive example functions for all major APIs
Examples
Check out the examples directory.
Contributing
We welcome contributions! Please see our Contributing Guide.
Development Setup
# Clone the repository
git clone https://github.com/YuminosukeSato/scigo.git
cd scigo
# Install dependencies
go mod download
# Run tests
go test ./...
# Run linter
golangci-lint run
Continuous Delivery (CD)
SciGo uses automated continuous delivery for releases:
- Automatic Release: Every push to the main branch triggers an automatic patch version release
- Version Management: Versions are automatically incremented (e.g., 0.4.0 → 0.4.1)
- Release Assets: Binaries for Linux, macOS, and Windows are automatically built and attached
- Docker Images: Docker images are automatically built and pushed to GitHub Container Registry (ghcr.io)
- Documentation: pkg.go.dev is automatically updated with the latest version
Release Process
1. Merge PR to main: When a PR is merged to the main branch
2. Automatic Tests: CI runs all tests and coverage checks
3. Version Bump: Patch version is automatically incremented
4. Create Release: GitHub Release is created with:
   - Multi-platform binaries (Linux, macOS, Windows)
   - Release notes from CHANGELOG.md
   - Docker image at ghcr.io/yuminosukesato/scigo:VERSION
5. Post-Release: An issue is created to track post-release verification tasks
Manual Release
For major or minor version releases, create and push a tag manually:
git tag v0.5.0 -m "Release v0.5.0"
git push origin v0.5.0
This triggers the existing release.yml workflow.
Roadmap
Phase 1: Core ML (Current)
- ✅ Linear models
- ✅ Online learning
- ✅ Basic clustering
- 🚧 Tree-based models
Phase 2: Advanced Features
- Neural Networks (MLP)
- Deep Learning integration
- Model serialization (ONNX export)
- GPU acceleration
Phase 3: Enterprise Features
- Distributed training
- AutoML capabilities
- Model versioning
- A/B testing framework
Documentation
Core Documentation
API Quick Reference
Migration & Advanced Guides
Acknowledgments
License
SciGo is licensed under the MIT License. See LICENSE for details.
Ready, Set, SciGo!
Where Science Meets Go - Say goodbye to slow ML!
Made with ❤️ and lots of ☕ in Go
### Running scikit-learn parity tests
Development-only parity tests compare the Go implementation against scikit-learn outputs.
They are not part of the default go test; use the parity build tag explicitly.
Steps
1. Generate golden data
   - Use uv instead of pip.
   - Command: uv run --with scikit-learn --with numpy --with scipy python scripts/golden/gen_logreg.py
2. Run parity tests
   - Command: go test ./sklearn/linear_model -tags=parity -run Parity -v

One-liner
make parity-linear

Notes
- The current LogisticRegression uses simplified gradient descent; tolerances will be tightened once the lbfgs/newton-cg solvers are implemented.
- The golden file is written to tests/golden/logreg_case1.json. A schematic example of a parity test is sketched below.
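For orientation, a test gated behind the parity build tag typically looks something like the following sketch; the test name, golden-file struct fields, package name, and relative path are illustrative only, not the repository's actual test.

//go:build parity

package linear_model_test

import (
    "encoding/json"
    "os"
    "testing"
)

// goldenCase is an illustrative shape for the golden file; the real fields may differ.
type goldenCase struct {
    X      [][]float64 `json:"X"`
    Probas [][]float64 `json:"probas"`
}

func TestLogisticRegressionParity(t *testing.T) {
    raw, err := os.ReadFile("../../tests/golden/logreg_case1.json")
    if err != nil {
        t.Fatal(err)
    }
    var golden goldenCase
    if err := json.Unmarshal(raw, &golden); err != nil {
        t.Fatal(err)
    }
    // Fit the Go LogisticRegression on golden.X and compare its probabilities
    // against golden.Probas within a loose tolerance (tightened once the
    // lbfgs/newton-cg solvers land).
}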