Documentation ¶
Overview ¶
Package collector provides interfaces and implementations for collecting system configuration data.
This package defines a unified interface for gathering measurements from various system sources including Kubernetes clusters, GPU hardware, operating system configuration, and systemd services. Collectors run concurrently and return structured measurement data that can be serialized for analysis or recommendation generation.
Core Interface ¶
The Collector interface defines a single method for gathering data:
type Collector interface {
	Collect(ctx context.Context) (*measurement.Measurement, error)
}
All collectors support context-based cancellation for graceful shutdown and timeout handling.
Factory Pattern ¶
The Factory interface enables dependency injection and testing by abstracting collector creation:
type Factory interface {
	CreateSystemDCollector() Collector
	CreateOSCollector() Collector
	CreateKubernetesCollector() Collector
	CreateGPUCollector() Collector
	CreateNodeTopologyCollector() Collector // added in v0.8.2
}
The DefaultFactory provides production implementations with configurable options:
factory := collector.NewDefaultFactory(
	collector.WithSystemDServices([]string{"containerd.service", "kubelet.service"}),
)
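For readers unfamiliar with functional options, the following sketch shows how such a constructor might apply them. It only assumes the documented shape of Option, a function that receives the factory. The lowercase names, the services field, and the exact default unit names are hypothetical and are not taken from the package source.

```go
package main

import "fmt"

// factory is a hypothetical stand-in for DefaultFactory; the field name
// "services" is an assumption made for this sketch.
type factory struct {
	services []string
}

// option has the same shape as the documented Option type:
// a function that mutates the factory being built.
type option func(*factory)

// withServices sketches what an option such as WithSystemDServices might do.
func withServices(services []string) option {
	return func(f *factory) {
		// Copy so later changes to the caller's slice do not leak in.
		f.services = append([]string(nil), services...)
	}
}

// newFactory sets defaults first, then applies the options in order,
// so a later option overrides an earlier one.
func newFactory(opts ...option) *factory {
	f := &factory{
		// Assumed unit names for the documented containerd, docker, and kubelet defaults.
		services: []string{"containerd.service", "docker.service", "kubelet.service"},
	}
	for _, opt := range opts {
		opt(f)
	}
	return f
}

func main() {
	fmt.Println(newFactory().services)
	fmt.Println(newFactory(withServices([]string{"kubelet.service"})).services)
}
```

The appeal of this pattern is that new settings can be added later without changing the constructor's signature.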
Available Collectors ¶
Kubernetes (k8s): Collects cluster configuration including:
- Node information (provider, kernel, container runtime)
- Server version and platform details
- Deployed container images
- GPU Operator ClusterPolicy configuration
GPU: Gathers GPU hardware and driver information including:
- GPU model, architecture, and compute capability
- Driver version, CUDA version, and firmware
- GPU-specific settings (MIG mode, persistence mode)
Operating System (os): Captures OS-level configuration:
- GRUB boot parameters and kernel arguments
- Sysctl kernel parameters
- Loaded kernel modules
- OS release information
SystemD: Monitors systemd service states:
- Service status and configuration
- Active state and startup settings
- Resource limits and dependencies
Usage Example ¶
Using the default factory:
factory := collector.NewDefaultFactory()
k8sCollector := factory.CreateKubernetesCollector()
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Named m rather than measurement to avoid shadowing the measurement package.
m, err := k8sCollector.Collect(ctx)
if err != nil {
	log.Fatalf("collection failed: %v", err)
}
Running multiple collectors in parallel:
factory := collector.NewDefaultFactory()
// The derived ctx is canceled as soon as any collector returns an error.
g, ctx := errgroup.WithContext(context.Background())
var measurements []*measurement.Measurement
var mu sync.Mutex // guards measurements
collectors := []struct {
	name string
	c    collector.Collector
}{
	{"k8s", factory.CreateKubernetesCollector()},
	{"gpu", factory.CreateGPUCollector()},
	{"os", factory.CreateOSCollector()},
	{"systemd", factory.CreateSystemDCollector()},
}
for _, col := range collectors {
	col := col // per-iteration copy; required before Go 1.22
	g.Go(func() error {
		m, err := col.c.Collect(ctx)
		if err != nil {
			return fmt.Errorf("%s collection failed: %w", col.name, err)
		}
		mu.Lock()
		measurements = append(measurements, m)
		mu.Unlock()
		return nil
	})
}
// Wait returns the first non-nil error, if any.
if err := g.Wait(); err != nil {
	log.Fatalf("collection error: %v", err)
}
Subpackages ¶
The collector package is organized into subpackages by data source:
- collector/k8s - Kubernetes API collectors
- collector/gpu - GPU hardware collectors
- collector/os - Operating system collectors
- collector/systemd - SystemD service collectors
- collector/file - Utilities for reading files from the filesystem
Error Handling ¶
Collectors return errors when:
- Required resources are unavailable (e.g., no Kubernetes cluster)
- Permissions are insufficient
- Context is canceled or times out
- Data parsing fails
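Callers should decide per collector whether a failure is fatal. For example, a host without GPUs or without access to a cluster may still yield useful OS and systemd measurements, whereas a canceled context usually means the whole run should stop.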
Types ¶
type Collector ¶
type Collector interface {
	Collect(ctx context.Context) (*measurement.Measurement, error)
}
Collector defines the interface for collecting system measurement data. Implementations gather data from various sources including system services, hardware components, OS configuration, and cluster state. All collectors must support context-based cancellation.
type DefaultFactory ¶
DefaultFactory is the standard implementation of Factory that creates collectors with production dependencies. It configures default systemd services to monitor.
func NewDefaultFactory ¶
func NewDefaultFactory(opts ...Option) *DefaultFactory
NewDefaultFactory creates a new DefaultFactory with default configuration. By default, it monitors containerd, docker, and kubelet systemd services. Additional configuration can be provided via functional options.
func (*DefaultFactory) CreateGPUCollector ¶
func (f *DefaultFactory) CreateGPUCollector() Collector
CreateGPUCollector creates a GPU collector that gathers GPU hardware and driver information.
func (*DefaultFactory) CreateKubernetesCollector ¶
func (f *DefaultFactory) CreateKubernetesCollector() Collector
CreateKubernetesCollector creates a Kubernetes API collector.
func (*DefaultFactory) CreateNodeTopologyCollector ¶ added in v0.8.2
func (f *DefaultFactory) CreateNodeTopologyCollector() Collector
CreateNodeTopologyCollector creates a node topology collector that gathers taint and label information across all cluster nodes.
func (*DefaultFactory) CreateOSCollector ¶
func (f *DefaultFactory) CreateOSCollector() Collector
CreateOSCollector creates an OS collector.
func (*DefaultFactory) CreateSystemDCollector ¶
func (f *DefaultFactory) CreateSystemDCollector() Collector
CreateSystemDCollector creates a systemd collector that monitors the configured services.
type Factory ¶
type Factory interface {
	CreateSystemDCollector() Collector
	CreateOSCollector() Collector
	CreateKubernetesCollector() Collector
	CreateGPUCollector() Collector
	CreateNodeTopologyCollector() Collector
}
Factory defines the interface for creating collector instances. Implementations of Factory provide configured collectors for various system components. This interface enables dependency injection and facilitates testing by allowing mock collectors.
type Option ¶
type Option func(*DefaultFactory)
Option defines a configuration option for DefaultFactory.
func WithMaxNodesPerEntry ¶ added in v0.8.2
WithMaxNodesPerEntry configures the maximum number of node names stored per taint/label entry in the topology collector. 0 = no limit.
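The "0 = no limit" rule can be pictured with a small sketch. capNodes is a hypothetical helper, not the package's implementation, and treating negative values as "no limit" as well is an assumption of this example.

```go
package main

import "fmt"

// capNodes is a hypothetical helper: it keeps at most limit node names
// and treats limit == 0 as "no limit", as the option's description states.
// Treating negative limits the same way is an assumption of this sketch.
func capNodes(nodes []string, limit int) []string {
	if limit <= 0 || len(nodes) <= limit {
		return nodes
	}
	return nodes[:limit]
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	fmt.Println(len(capNodes(nodes, 0)), len(capNodes(nodes, 2))) // prints "3 2"
}
```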
func WithSystemDServices ¶
WithSystemDServices configures the systemd services to monitor.
Directories ¶
| Path | Synopsis |
|---|---|
| file | Package file provides utilities for reading files from the filesystem. |
| gpu | Package gpu collects GPU hardware and driver configuration data. |
| k8s | Package k8s collects Kubernetes cluster configuration data. |
| os | Package os collects operating system configuration data. |
| systemd | Package systemd collects systemd service configuration data. |