Published: Feb 17, 2026 License: GPL-3.0


Load Testing Framework for cd-operator

This directory contains comprehensive load tests to validate cd-operator performance at scale. The framework tests PR pipeline operations, drift detection across multiple clusters, and system behavior under sustained load.

Overview

The load testing framework validates:

  • PR Pipeline Performance: Handling 100+ concurrent PRs across multiple repositories
  • Drift Detection at Scale: Monitoring 50+ applications across 5+ ArgoCD clusters
  • Worker Pool Behavior: Bounded concurrency and goroutine management
  • Rate Limiting: Graceful handling of GitHub/ArgoCD API rate limits
  • Resource Usage: CPU, memory, and goroutine stability under load

Test Structure

tests/load/
├── suite_test.go           # Test suite setup and global fixtures
├── pr_load_test.go         # PR pipeline load tests
├── drift_load_test.go      # Drift detection load tests
├── helpers/
│   ├── resources.go        # Resource monitoring utilities
│   ├── generators.go       # Mock data generators
│   ├── metrics.go          # Prometheus metrics collection
│   └── assertions.go       # Performance SLO assertions
└── README.md               # This file

Prerequisites

Option 1: Using an Existing Cluster

For realistic load testing, use a real Kubernetes cluster:

# Ensure kubectl is configured
kubectl cluster-info

# Set environment variable to use existing cluster
export USE_EXISTING_CLUSTER=true

Cluster Requirements:

  • Kubernetes 1.26+
  • 4+ CPU cores
  • 8GB+ RAM
  • CRDs installed (make install)

Option 2: Using envtest

For lighter-weight testing (less realistic but faster):

# Use envtest (default)
export USE_ENVTEST=true

Install Dependencies

# Install test dependencies
go mod download

# Install CRDs (if using existing cluster)
make install

Running Load Tests

Basic Usage

Run all load tests with default configuration:

go test -v ./tests/load/... -timeout=30m

Scaling Test Load

Use the LOAD_FACTOR environment variable to scale test workloads:

# Half load (faster, for CI)
LOAD_FACTOR=0.5 go test -v ./tests/load/...

# Baseline load (default)
LOAD_FACTOR=1.0 go test -v ./tests/load/...

# Double load (stress testing)
LOAD_FACTOR=2.0 go test -v ./tests/load/...

# Maximum load (10x baseline)
LOAD_FACTOR=10.0 go test -v ./tests/load/... -timeout=60m

Load factor scales:

  • Number of PRs/apps created
  • Timeout durations
  • Concurrency levels
Running Specific Tests

# Only PR pipeline tests
go test -v ./tests/load/... -run TestPRPipeline -timeout=20m

# Only drift detection tests
go test -v ./tests/load/... -run TestDriftPipeline -timeout=20m

# Specific test case
go test -v ./tests/load/... -run "TestPRPipeline/Concurrent_PR_Discovery" -timeout=15m

Ginkgo Focus/Skip

# Focus on specific test
go test -v ./tests/load/... -ginkgo.focus="should handle 100 concurrent PRs"

# Skip specific test
go test -v ./tests/load/... -ginkgo.skip="rate limiting"

Performance Baselines

PR Pipeline
| Metric | Target | Acceptable Range |
|---|---|---|
| Discovery Latency P95 | <5s | 2-10s |
| Qualification Latency P95 | <2s | 1-5s |
| Throughput | >10 PRs/sec | 5-20 PRs/sec |
| Memory Usage | <500MB | 300-800MB |
| Goroutine Growth | <100 | 50-200 |
| Error Rate | <1% | 0-2% |

Drift Detection

| Metric | Target | Acceptable Range |
|---|---|---|
| Drift Check Latency P95 | <10s | 5-20s |
| Multi-Cluster Throughput | >10 checks/sec | 5-15 checks/sec |
| Memory Usage | <500MB | 300-800MB |
| Error Rate | <1% | 0-2% |

Test Scenarios

PR Pipeline Load Tests (pr_load_test.go)
1. Concurrent PR Discovery

Tests the operator's ability to discover and track 100 PRs across 5 repositories simultaneously.

Validates:

  • CRD creation throughput
  • Discovery loop performance
  • State transition speed
  • Memory stability

Load: 100 PRs, 5 repositories

2. PR Qualification Under Load

Tests qualification logic with mixed pass/fail scenarios under high volume.

Validates:

  • Validation rule performance
  • Layout detection speed
  • Mergeable state checks
  • Result distribution accuracy

Load: 100 PRs (70% pass, 20% layout fail, 10% not mergeable)

3. Worker Pool Concurrency

Tests worker pool behavior under sustained load with periodic PR arrivals.

Validates:

  • Goroutine count stability
  • Queue depth management
  • No unbounded growth
  • Graceful backpressure

Load: 200 PRs in 20 batches over 1 minute

Drift Detection Load Tests (drift_load_test.go)
1. Multi-Cluster Drift Detection

Tests drift monitoring across multiple ArgoCD clusters with many applications.

Validates:

  • Multi-cluster query parallelization
  • Sync status check accuracy
  • Drift detection correctness
  • Status update propagation

Load: 50 applications, 5 clusters

2. ArgoCD API Rate Limiting

Tests graceful handling of ArgoCD API rate limits with retry backoff.

Validates:

  • Rate limit detection
  • Exponential backoff
  • Request retry logic
  • Error rate under throttling

Load: 30 applications, aggressive rate limiting (10 req/sec)

3. Parallel Cluster Checks

Tests parallel execution efficiency across multiple clusters with simulated latency.

Validates:

  • Parallelization effectiveness
  • Latency handling
  • Resource efficiency
  • Speedup vs serial execution

Load: 25 applications, 5 clusters, 100ms latency per cluster

Performance Report

After running tests, an HTML performance report is generated:

load-test-report.html

The report includes:

  • Test summary (duration, items processed)
  • Latency percentiles (P50, P95, P99)
  • Resource usage charts (memory, goroutines)
  • Throughput analysis
  • Error rate breakdown

View the report:

open load-test-report.html  # macOS
xdg-open load-test-report.html  # Linux

Interpreting Results

Successful Test Run

Performance Summary:
  Throughput: 15.23 PRs/sec
  P50 Latency: 1.2s
  P95 Latency: 3.8s
  P99 Latency: 5.1s
  Memory Delta: 45.32 MB
  Goroutines: 50 -> 75

SLO Report: 5/5 SLOs met (100.0%)
─────────────────────────────────────────────
✓ PR Discovery Latency P95: 3.80 seconds (target: <= 5.00 seconds)
✓ PR Qualification Latency P95: 1.50 seconds (target: <= 2.00 seconds)
✓ Memory Usage: 345.23 MB (target: <= 500.00 MB)
✓ Goroutine Growth: 25 count (target: <= 100 count)
✓ Error Rate: 0.20 percent (target: <= 1.00 percent)
─────────────────────────────────────────────

Warning Signs

High P99 Latency:

  • Investigate long tail latencies
  • Check for slow API calls
  • Review timeout configurations

Memory Growth:

  • Check for memory leaks
  • Review object caching
  • Validate cleanup logic

Goroutine Growth:

  • Check for goroutine leaks
  • Validate worker pool shutdown
  • Review context cancellation

High Error Rate:

  • Check API connectivity
  • Review retry logic
  • Validate rate limiting
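
The memory and goroutine deltas in the summary above can be captured with the standard runtime package. A minimal sketch of before/after snapshots (not the framework's helpers/resources.go, which is more complete):

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshot captures the goroutine count and heap usage; comparing a
// snapshot taken before the load with one taken after exposes leaks.
type snapshot struct {
	goroutines int
	heapMB     float64
}

func take() snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return snapshot{
		goroutines: runtime.NumGoroutine(),
		heapMB:     float64(m.HeapAlloc) / (1 << 20),
	}
}

func main() {
	before := take()
	// ... run the load scenario here ...
	after := take()
	fmt.Printf("Goroutines: %d -> %d\n", before.goroutines, after.goroutines)
	fmt.Printf("Memory Delta: %.2f MB\n", after.heapMB-before.heapMB)
}
```

A steadily climbing goroutine count across batches is the clearest leak signal; heap numbers are noisier because the GC reclaims lazily.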

Troubleshooting

Test Timeouts

If tests time out, increase the timeout or reduce the load:

# Increase timeout
go test -v ./tests/load/... -timeout=60m

# Reduce load
LOAD_FACTOR=0.5 go test -v ./tests/load/... -timeout=30m

Out of Memory

If tests run out of memory, reduce concurrency:

# Reduce load factor
LOAD_FACTOR=0.3 go test -v ./tests/load/...

CRD Not Found Errors

Install CRDs before running tests:

make install

Connection Refused Errors

Ensure you have a running Kubernetes cluster:

kubectl cluster-info

Slow Test Execution

Load tests are CPU and I/O intensive. To speed up:

  1. Use fewer test iterations: LOAD_FACTOR=0.5
  2. Run specific tests: -run TestPRPipeline
  3. Use existing cluster: USE_EXISTING_CLUSTER=true
  4. Increase cluster resources

CI/CD Integration

GitHub Actions Example

name: Load Tests

on:
  schedule:
    - cron: '0 2 * * *'  # Nightly
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-go@v4
        with:
          go-version: '1.26'

      - name: Setup test cluster
        run: |
          kind create cluster
          make install

      - name: Run load tests
        env:
          USE_EXISTING_CLUSTER: true
          LOAD_FACTOR: 0.5
        run: |
          go test -v ./tests/load/... -timeout=30m

      - name: Upload performance report
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: load-test-report
          path: load-test-report.html

Best Practices

When to Run Load Tests

  • Before major releases - Validate performance regressions
  • After optimization work - Measure improvement
  • Nightly in CI - Catch performance degradation early
  • On-demand - Troubleshoot production issues

Analyzing Results

  1. Establish Baseline - Run tests on main branch
  2. Compare Results - Run on feature branch
  3. Check SLOs - Verify all targets met
  4. Review Report - Analyze detailed metrics
  5. Investigate Failures - Debug issues before merge

Scaling Guidelines

| LOAD_FACTOR | Use Case | Duration | Resources |
|---|---|---|---|
| 0.1 | Quick smoke test | 5 min | Minimal |
| 0.5 | CI/PR checks | 15 min | Standard |
| 1.0 | Nightly regression | 30 min | Standard |
| 2.0 | Weekly stress test | 60 min | High |
| 5.0+ | Capacity planning | 2 hr+ | Maximum |

Contributing

When adding new load tests:

  1. Follow naming convention: <component>_load_test.go
  2. Use table-driven tests: For parameterized scenarios
  3. Add SLO assertions: Define performance targets
  4. Document scenarios: Explain what's being tested
  5. Scale with LOAD_FACTOR: Use scaleLoad() helper
  6. Generate reports: Capture performance data
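
A table-driven scenario (point 2) might be structured roughly like this; the type, fields, and runScenario stand-in are illustrative, not the framework's actual shapes:

```go
package main

import "fmt"

// scenario describes one parameterized load case; baseline counts would
// normally be passed through a scaleLoad()-style helper before use.
type scenario struct {
	name     string
	prs      int
	repos    int
	p95Limit float64 // SLO target in seconds
}

// runScenario is a stand-in for a real load test body.
func runScenario(s scenario) string {
	return fmt.Sprintf("%s: %d PRs across %d repos", s.name, s.prs, s.repos)
}

func main() {
	scenarios := []scenario{
		{"concurrent discovery", 100, 5, 5.0},
		{"qualification", 100, 5, 2.0},
	}
	for _, s := range scenarios {
		fmt.Println(runScenario(s))
	}
}
```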

Support

For issues or questions:

  • File a GitHub issue with load-test label
  • Include test output and performance report
  • Share cluster specs and LOAD_FACTOR used
