# GoCUDA - Professional CUDA + Go Framework
A high-performance, production-ready framework for GPU-accelerated computing in Go with intelligent CPU/GPU dispatch, automatic memory management, and comprehensive performance monitoring.
## Features

### Core Capabilities
- Intelligent Dispatch: Automatic CPU/GPU selection based on workload characteristics
- Memory Management: Advanced memory pooling and automatic cleanup
- Performance Monitoring: Real-time metrics collection and analysis
- Batch Processing: Efficient batch operations for improved throughput
- Concurrent Workers: Optimized worker pools for maximum performance
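The dispatch rule behind the first bullet can be sketched in a few lines. This is a minimal illustration of threshold-based CPU/GPU selection, not the framework's actual internals; the names `chooseDevice` and `cpuThreshold` are hypothetical, though the 1M-element threshold matches the default shown in the Quick Start below.

```go
package main

import "fmt"

type device string

const (
	deviceCPU device = "cpu"
	deviceGPU device = "gpu"
)

// chooseDevice routes small workloads to the CPU, where GPU launch and
// transfer overhead would dominate, and large workloads to the GPU.
func chooseDevice(elements, cpuThreshold int) device {
	if elements < cpuThreshold {
		return deviceCPU
	}
	return deviceGPU
}

func main() {
	cpuThreshold := 1024 * 1024 // 1M elements
	fmt.Println(chooseDevice(5, cpuThreshold))              // tiny vector -> cpu
	fmt.Println(chooseDevice(4*1024*1024, cpuThreshold))    // large workload -> gpu
}
```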
### Professional Features
- Comprehensive Logging: Structured logging with configurable levels
- Error Handling: Robust error handling and recovery mechanisms
- Configuration Management: Flexible configuration with optimal defaults
- Resource Management: Automatic resource cleanup and leak prevention
- Metrics Export: Integration with monitoring systems
### Developer Experience
- Simple API: Clean, intuitive interface for common operations
- Extensive Documentation: Comprehensive guides and examples
- Performance Optimization: Built-in performance analysis tools
- Production Ready: Tested and optimized for production workloads
## Requirements
- Go 1.21+
- CUDA 11.0+ (with compatible GPU drivers)
- GCC/G++ compiler (for CGO compilation)
- Linux/WSL2 (primary support)
## Installation

```bash
# Install the framework
go get github.com/BasedCodeCapital/gocuda

# Install CUDA dependencies (Ubuntu/Debian)
sudo apt update
sudo apt install nvidia-cuda-toolkit build-essential

# Verify the CUDA installation
nvcc --version
nvidia-smi
```
## Quick Start

### Basic Usage

```go
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/BasedCodeCapital/gocuda/gocuda"
    "github.com/sirupsen/logrus"
)

func main() {
    // Create configuration
    config := gocuda.DefaultConfig()
    config.LogLevel = logrus.InfoLevel
    config.CPUThreshold = 1024 * 1024 // 1M elements
    config.EnableMetrics = true

    // Create and initialize engine
    engine, err := gocuda.NewEngine(config)
    if err != nil {
        log.Fatalf("Failed to create engine: %v", err)
    }

    if err := engine.Initialize(); err != nil {
        log.Fatalf("Failed to initialize engine: %v", err)
    }

    defer engine.Shutdown()

    // Create compute engine
    compute := gocuda.NewComputeEngine(engine)

    // Vector addition with intelligent dispatch
    a := []float32{1, 2, 3, 4, 5}
    b := []float32{10, 20, 30, 40, 50}

    result, err := compute.VectorAdd(context.Background(), a, b)
    if err != nil {
        log.Fatalf("Vector addition failed: %v", err)
    }

    fmt.Printf("Result: %v\n", result)
}
```
### Advanced Usage

```go
// Matrix multiplication
matrixA := make([]float32, 1024*1024)
matrixB := make([]float32, 1024*1024)
// ... fill matrices ...
result, err := compute.MatrixMultiply(context.Background(), matrixA, matrixB, 1024)

// Batch operations
operations := []gocuda.VectorOperation{
    {A: vectorA1, B: vectorB1},
    {A: vectorA2, B: vectorB2},
    // ... more operations
}
err = compute.BatchVectorAdd(context.Background(), operations)

// Performance metrics
metrics, err := engine.GetMetrics()
fmt.Printf("GPU Utilization: %.1f%%\n", metrics.GPUUtilization)
fmt.Printf("Average Execution Time: %v\n", metrics.AvgExecutionTime)
```
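The batch API above gets its throughput win from fanning operations out to a fixed pool of workers. The following self-contained sketch shows that pattern on the host side; the `vectorOp` type, `runBatch` function, and worker count are illustrative, not the framework's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type vectorOp struct{ a, b []float32 }

// addVec performs an element-wise sum of one operation's inputs.
func addVec(op vectorOp) []float32 {
	out := make([]float32, len(op.a))
	for i := range op.a {
		out[i] = op.a[i] + op.b[i]
	}
	return out
}

// runBatch distributes operations across a fixed worker pool; each
// worker writes to a distinct result slot, so no locking is needed.
func runBatch(ops []vectorOp, workers int) [][]float32 {
	results := make([][]float32, len(ops))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = addVec(ops[i])
			}
		}()
	}
	for i := range ops {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	ops := []vectorOp{
		{a: []float32{1, 2}, b: []float32{10, 20}},
		{a: []float32{3, 4}, b: []float32{30, 40}},
	}
	fmt.Println(runBatch(ops, 2)) // [[11 22] [33 44]]
}
```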
## Architecture

```text
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Application   │───▶│   GoCUDA        │───▶│   CUDA Runtime  │
│   Code          │    │   Framework     │    │   + GPU         │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │   Performance   │
                    │   Monitoring    │
                    └─────────────────┘
```
### Core Components
- Engine: Main framework coordinator
- Device Manager: CUDA device detection and management
- Memory Manager: GPU memory allocation and pooling
- Compute Engine: High-level operation interface
- Metrics Collector: Performance monitoring and analysis
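One way to picture how these components fit together is as small interfaces wired into the engine. This is a hedged sketch only; the interface and struct names below mirror the component list but are hypothetical, not the framework's exported types.

```go
package main

import "fmt"

// Illustrative component contracts, named after the list above.
type deviceManager interface{ DeviceCount() int }
type metricsCollector interface{ Record(op string) }

// engine coordinates the components, as the "Engine" bullet describes.
type engine struct {
	devices deviceManager
	metrics metricsCollector
}

// Stub implementations standing in for real CUDA-backed components.
type stubDevices struct{}

func (stubDevices) DeviceCount() int { return 1 }

type stubMetrics struct{ ops []string }

func (m *stubMetrics) Record(op string) { m.ops = append(m.ops, op) }

func main() {
	e := engine{devices: stubDevices{}, metrics: &stubMetrics{}}
	e.metrics.Record("vector_add")
	fmt.Println(e.devices.DeviceCount())
}
```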
## Performance Characteristics

### CPU vs GPU Dispatch
- Small Operations: Automatically use CPU to avoid GPU overhead
- Large Operations: Leverage GPU parallelism for maximum performance
- Adaptive Thresholds: Configurable based on hardware capabilities
### Memory Management
- Memory Pools: Reduce allocation overhead
- Automatic Cleanup: Prevent memory leaks
- Smart Allocation: Optimize memory usage patterns
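The pooling idea above can be illustrated with Go's standard `sync.Pool`. This is a host-side sketch of buffer reuse, assuming a fixed scratch-buffer size; it is not the framework's GPU memory pool, whose internals the README does not specify.

```go
package main

import (
	"fmt"
	"sync"
)

// bufferPool reuses float32 scratch buffers to cut allocation overhead.
// A pointer-to-slice is stored to avoid an allocation on every Put.
var bufferPool = sync.Pool{
	New: func() any {
		b := make([]float32, 1024)
		return &b
	},
}

// withBuffer borrows a buffer, runs f, and always returns the buffer,
// mirroring the "automatic cleanup" bullet above.
func withBuffer(f func(buf []float32)) {
	bp := bufferPool.Get().(*[]float32)
	defer bufferPool.Put(bp)
	f(*bp)
}

func main() {
	withBuffer(func(buf []float32) {
		buf[0] = 42
		fmt.Println(len(buf), buf[0])
	})
}
```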
### Benchmark Results (RTX 4070)

```text
Vector Addition (2M elements):
  CPU: 4.5ms
  GPU: 8.2ms (transfer overhead dominates, so the dispatcher uses the CPU)

Matrix Multiplication (2048x2048):
  CPU: 32.1s
  GPU: 52.2ms (615x speedup)

Batch Operations (100 operations):
  Sequential: 450ms
  Batched: 125ms (3.6x speedup)
```
## Configuration

### Default Configuration

```go
config := gocuda.DefaultConfig()
// Automatically optimized for your hardware
```

### Custom Configuration
```go
config := &gocuda.Config{
    PreferredDevice: 0,                  // Use a specific GPU
    CPUThreshold:    512 * 1024,         // 512K elements
    WorkerCount:     8,                  // 8 concurrent workers
    BatchSize:       64,                 // Process 64 ops per batch
    MemoryPoolSize:  1024 * 1024 * 1024, // 1GB pool
    EnableMetrics:   true,               // Enable monitoring
    LogLevel:        logrus.InfoLevel,
}
```
### Optimal Configuration

```go
// Get hardware-optimized settings
compute := gocuda.NewComputeEngine(engine)
optimalConfig, err := compute.GetOptimalConfig()
```
## Monitoring & Metrics

### Built-in Metrics
- Total operations executed
- Success/failure rates
- Average execution times
- GPU utilization
- Memory usage statistics
### Accessing Metrics

```go
metrics, err := engine.GetMetrics()
fmt.Printf("Total Operations: %d\n", metrics.TotalOperations)
fmt.Printf("GPU Utilization: %.1f%%\n", metrics.GPUUtilization)
fmt.Printf("Memory Usage: %d MB\n", metrics.MemoryUsage/1024/1024)
```
### Integration with Monitoring Systems

Metrics can be exported to Prometheus, Grafana, or other systems; the export implementation depends on your monitoring stack.
## Testing

```bash
# Run unit tests
go test ./...

# Run benchmarks
go test -bench=. ./...

# Run with race detection
go test -race ./...
```
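For quick experiments outside a `_test.go` file, the standard library's `testing.Benchmark` can time a function directly. The sketch below benchmarks a plain CPU vector add as a stand-in workload; it does not exercise the framework itself.

```go
package main

import (
	"fmt"
	"testing"
)

// vectorAdd is a simple CPU element-wise sum used as the benchmark body.
func vectorAdd(a, b, out []float32) {
	for i := range a {
		out[i] = a[i] + b[i]
	}
}

func main() {
	n := 1 << 20 // 1M elements
	a := make([]float32, n)
	b := make([]float32, n)
	out := make([]float32, n)

	// testing.Benchmark runs the function enough times for a stable
	// per-operation estimate, the same machinery `go test -bench` uses.
	res := testing.Benchmark(func(bb *testing.B) {
		for i := 0; i < bb.N; i++ {
			vectorAdd(a, b, out)
		}
	})
	fmt.Println("ns/op:", res.NsPerOp())
}
```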
## Development

### Project Structure

```text
gocuda/
├── gocuda/           # Core framework
│   ├── engine.go     # Main engine
│   ├── device.go     # Device management
│   ├── memory.go     # Memory management
│   ├── compute.go    # Compute operations
│   └── metrics.go    # Performance monitoring
├── examples/         # Usage examples
├── docs/             # Documentation
└── tests/            # Test suites
```
### Building from Source

```bash
# Clone the repository
git clone https://github.com/BasedCodeCapital/gocuda.git
cd gocuda

# Build the framework
make build

# Run tests
make test

# Run examples
make examples
```
## Documentation
- API Reference - Complete API documentation
- Performance Guide - Optimization strategies
- Architecture Overview - Technical deep dive
- Migration Guide - Upgrading from v1.x
## Contributing
We welcome contributions! Please see our Contributing Guide for details.
### Development Setup

```bash
# Fork and clone the repository
git clone https://github.com/BasedCodeCapital/gocuda.git
cd gocuda

# Install dependencies
go mod tidy

# Run tests
make test
```

Then submit a pull request with your changes.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- NVIDIA CUDA team for the excellent GPU computing platform
- Go team for the fantastic programming language
- Open source community for contributions and feedback
## Roadmap

### v2.0 (Current)
- ✅ Intelligent CPU/GPU dispatch
- ✅ Advanced memory management
- ✅ Performance monitoring
- ✅ Batch processing

### v2.1 (Planned)
- Multi-GPU support
- Streaming operations
- Custom kernel integration
- Enhanced metrics export

### v3.0 (Future)
- Machine learning integration
- Distributed computing
- Advanced optimization algorithms
- WebAssembly support
## Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@gocuda.dev

Made with ❤️ by the GoCUDA team