package artifacts

v0.4.0
Published: Nov 26, 2025 License: Apache-2.0 Imports: 13 Imported by: 0

README

E2E Test Artifact Collection

Automatic collection of debugging artifacts when e2e tests fail.

Features

  • Automatic: Collects artifacts only on test failures (configurable)
  • Safe: Never fails tests due to collection errors
  • Fast: < 1s overhead for passing tests, < 30s for failing tests
  • Comprehensive: Logs, pod status, events, resources

Configuration (ENV Variables)

All configuration is ENV-based with sensible defaults:

Collection Control
  • E2E_ARTIFACTS_ENABLED (default: true) - Master switch
  • E2E_ARTIFACTS_ON_FAILURE_ONLY (default: true) - Only collect on failures
  • E2E_ARTIFACTS_MINIMAL_ONLY (default: false) - P0 artifacts only
Storage
  • E2E_ARTIFACTS_DIR (default: test/e2e/artifacts) - Base directory
Size Limits
  • E2E_ARTIFACTS_MAX_LOG_LINES (default: 500) - Max log lines per pod
  • E2E_ARTIFACTS_MAX_RESOURCE_SIZE (default: 10485760) - Max 10MB per file
  • E2E_ARTIFACTS_MAX_TOTAL_SIZE (default: 104857600) - Max 100MB per test
Timeouts
  • E2E_ARTIFACTS_TIMEOUT (default: 30s) - Max collection time
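
Under the hood this is the usual env-with-default pattern. A minimal sketch of that pattern (the envInt helper and main function are illustrative only; LoadConfigFromEnv, documented below, is the real entry point):

package main

import (
	"fmt"
	"os"
	"strconv"
)

// envInt reads an integer ENV variable, falling back to a default when the
// variable is unset or unparsable. Hypothetical helper for illustration.
func envInt(key string, def int) int {
	if v, ok := os.LookupEnv(key); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

func main() {
	maxLogLines := envInt("E2E_ARTIFACTS_MAX_LOG_LINES", 500)
	fmt.Println("max log lines:", maxLogLines)
}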

Usage

Running Tests with Artifacts
# Default behavior (enabled, on-failure-only)
make test-e2e

# Disable artifacts
E2E_ARTIFACTS_ENABLED=false make test-e2e

# Collect for all tests (even passing)
E2E_ARTIFACTS_ON_FAILURE_ONLY=false make test-e2e

# Increase log lines
E2E_ARTIFACTS_MAX_LOG_LINES=1000 make test-e2e
Artifact Location

Artifacts are stored in:

test/e2e/artifacts/
└── run-{timestamp}/
    ├── metadata.json
    └── {test-name}/
        ├── metadata.json
        ├── logs/
        │   ├── operator-controller.log
        │   └── pod-{name}.log
        ├── pods/
        │   └── {pod-name}-status.json
        ├── resources/
        │   ├── vectorpipeline-{name}-status.json
        │   └── deployment-{name}.yaml
        └── events/
            └── namespace-events.txt
Unified Test Results

When you run make test-e2e, all results are automatically saved in a unified structure with reports and artifacts correlated by timestamp:

# Run tests - results automatically saved with timestamp
make test-e2e

# Results structure:
test/e2e/results/run-{timestamp}/
├── reports/
│   ├── junit-report.xml    # JUnit XML for CI integration
│   ├── report.json          # Ginkgo JSON report
│   └── test-output.log      # Full test output logs
└── artifacts/               # Debug artifacts (only for failed tests)
    ├── metadata.json        # Run-level metadata
    └── {test-name}/         # Per-test artifacts
        ├── metadata.json
        ├── logs/
        ├── pods/
        ├── resources/
        └── events/

Benefits:

  • Single runID correlates all reports and artifacts
  • Easy to navigate - everything in one directory
  • CI/CD friendly - upload one directory
  • Helpful output with quick analysis commands
CI Integration (GitHub Actions)
- name: Run E2E Tests
  run: make test-e2e

- name: Upload Test Results
  if: always()  # Upload even if tests fail
  uses: actions/upload-artifact@v4
  with:
    name: e2e-results-${{ github.run_number }}
    path: test/e2e/results/
    retention-days: 30

Collected Artifacts (P0 - MVP)

Critical for Debugging
  1. Pod Status JSON - Conditions, restarts, phase
  2. Operator Controller Logs - Time-filtered logs (test duration + 1min buffer)
  3. VectorPipeline CR Status - Validation results
  4. Namespace Events - What happened in test namespace
  5. Resource Metadata - Deployments, DaemonSets, Services
Future (Phase 2)
  • Full pod logs (all containers)
  • Full pod descriptions
  • Vector agent/aggregator logs
  • ConfigCheck pod logs
  • Timeline reconstruction

Architecture

  • Thread-safe: Uses sync.Map for parallel test support
  • Graceful degradation: Collection errors don't fail tests
  • Size limits: Prevents CI artifact bloat
  • Atomic writes: Temp file + rename for reliability (see the sketch below)
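
The atomic-write bullet refers to the standard temp-file-plus-rename technique. A minimal sketch of that technique, assuming nothing about the package's internals:

package main

import (
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file in the target directory and
// renames it into place, so a crash mid-write never leaves a partial artifact.
// Illustration of the pattern only, not the package's actual code.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op once the rename has succeeded

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := writeFileAtomic("metadata.json", []byte(`{"ok":true}`)); err != nil {
		panic(err)
	}
}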

Performance

  • Passing tests: < 1s overhead (if ON_FAILURE_ONLY=true)
  • Failing tests: < 30s collection time
  • Storage: < 100MB per test, < 500MB per run

Troubleshooting

No artifacts collected
  1. Check E2E_ARTIFACTS_ENABLED=true
  2. Verify the test is using framework.NewFramework() or framework.Shared()
  3. Check GinkgoWriter output for warning messages
Artifacts too large
  1. Reduce E2E_ARTIFACTS_MAX_LOG_LINES (default: 500)
  2. Enable E2E_ARTIFACTS_MINIMAL_ONLY=true
  3. Tighten the per-file limit with E2E_ARTIFACTS_MAX_RESOURCE_SIZE (default: 10MB)
Collection timeout
  1. Increase E2E_ARTIFACTS_TIMEOUT (default: 30s)
  2. Check kubectl connectivity
  3. Review namespace resource count

Important Bug Fixes

Time-based Log Collection (Fixed)

Problem: Previously, operator logs were collected using kubectl logs --tail 500, which retrieved the last 500 lines from the entire pod lifetime. In long-running test suites (e.g., full e2e runs lasting 15+ minutes), the operator pod could generate thousands of log lines, causing the last 500 lines to exclude logs from earlier failing tests.

Example: A test failing at 18:05-18:07 would collect operator logs from 16:02-16:03 (the pod's startup logs), completely missing the relevant reconciliation attempts.

Solution: Implemented time-based log collection using kubectl logs --since-time with the test's start time (+ 1 minute buffer). This ensures operator logs are collected only for the relevant time period, regardless of how long the pod has been running.
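
A minimal sketch of how such a --since-time argument can be derived from the test's start time (the one-minute pre-start buffer, pod name, and namespace here are assumptions for illustration; the collector's real command construction may differ):

package main

import (
	"fmt"
	"time"
)

// sinceTimeArgs builds the time-filtered part of a kubectl logs invocation.
// Illustrative sketch only.
func sinceTimeArgs(pod, namespace string, testStart time.Time) []string {
	// Start one minute before the test so setup-related lines are included.
	since := testStart.Add(-1 * time.Minute).UTC().Format(time.RFC3339)
	return []string{"logs", pod, "-n", namespace, "--since-time=" + since}
}

func main() {
	args := sinceTimeArgs("operator-controller-manager-abc123", "operator-system", time.Now())
	fmt.Println("kubectl", args)
}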

Impact:

  • Fixes flaky test debugging where operator logs were missing
  • Enables reliable root cause analysis for race conditions
  • Reduces confusion when logs don't match test timeline

Development

See the architect's design document for Phase 2+ enhancements.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func TruncateLogLines

func TruncateLogLines(content []byte, maxLines int) []byte

TruncateLogLines truncates log output to the specified number of lines. It keeps the LAST N lines (the most recent logs are most relevant for debugging).
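
A minimal sketch of the last-N-lines behavior described above (this is an assumed implementation for illustration, not the package's actual code):

package main

import (
	"bytes"
	"fmt"
)

// truncateLastLines keeps only the final maxLines lines of content, mirroring
// the documented TruncateLogLines contract. Assumed implementation;
// trailing-newline handling may differ in the real function.
func truncateLastLines(content []byte, maxLines int) []byte {
	lines := bytes.Split(content, []byte("\n"))
	if len(lines) <= maxLines {
		return content
	}
	return bytes.Join(lines[len(lines)-maxLines:], []byte("\n"))
}

func main() {
	logs := []byte("line1\nline2\nline3\nline4")
	fmt.Printf("%s\n", truncateLastLines(logs, 2)) // prints line3 and line4
}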

Types

type ArtifactInventory

type ArtifactInventory struct {
	PodCount       int      `json:"pod_count"`
	LogFiles       []string `json:"log_files"`
	ResourceFiles  []string `json:"resource_files"`
	EventFiles     []string `json:"event_files"`
	TotalSizeBytes int64    `json:"total_size_bytes"`
	CollectionTime string   `json:"collection_time"` // Human-readable duration
}

ArtifactInventory tracks what artifacts were collected

type Collector

type Collector interface {
	// Initialize sets up the collector for a test run
	Initialize(runID string) error

	// CollectForTest collects artifacts for a test
	CollectForTest(ctx context.Context, testInfo TestInfo) error

	// Close finalizes the collector and writes summary
	Close() error
}

Collector manages artifact collection for e2e tests

func NewCollector

func NewCollector(config Config) (Collector, error)

NewCollector creates a new artifact collector
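
Putting the documented API together, a typical lifecycle might look like the following sketch (the import path is a placeholder, and the runID and TestInfo values are illustrative; in practice the e2e framework wires this up):

package main

import (
	"context"
	"fmt"
	"time"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	cfg := artifacts.LoadConfigFromEnv()

	collector, err := artifacts.NewCollector(cfg)
	if err != nil {
		panic(err)
	}
	if err := collector.Initialize("run-20251126-120000"); err != nil { // runID is illustrative
		panic(err)
	}
	defer collector.Close() // finalizes the run and writes the summary

	// Populated by the test framework in practice; assumed values shown here.
	info := artifacts.TestInfo{
		Name:      "test-normal-mode",
		Namespace: "e2e-test-ns",
		Failed:    true,
		StartTime: time.Now().Add(-2 * time.Minute),
		EndTime:   time.Now(),
	}

	ctx, cancel := context.WithTimeout(context.Background(), cfg.CollectionTimeout)
	defer cancel()
	if err := collector.CollectForTest(ctx, info); err != nil {
		// Collection errors are reported but must never fail the test itself.
		fmt.Println("artifact collection warning:", err)
	}
}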

type Config

type Config struct {
	// Collection control
	Enabled              bool // Master switch for artifact collection
	CollectOnFailureOnly bool // Collect artifacts only for failed tests
	CollectMinimalOnly   bool // Collect only P0 artifacts (fast path)

	// Storage paths
	BaseDir string // Base directory for artifact storage

	// Size limits (prevent artifact bloat)
	MaxLogLines     int   // Maximum log lines per pod
	MaxResourceSize int64 // Maximum size for single resource (bytes)
	MaxTotalSize    int64 // Maximum total size per test (bytes)

	// Timeouts
	CollectionTimeout time.Duration // Maximum time to collect artifacts

	// Filters
	NamespacePatterns []string // Namespace patterns to collect from
	PodLabelSelectors []string // Pod label selectors for filtering
}

Config defines artifact collection behavior

func LoadConfigFromEnv

func LoadConfigFromEnv() Config

LoadConfigFromEnv loads configuration from environment variables, following the Phase 1 pattern: ENV-based config with sensible defaults.

type MetadataBuilder

type MetadataBuilder struct {
	// contains filtered or unexported fields
}

MetadataBuilder helps build and write metadata files

func NewMetadataBuilder

func NewMetadataBuilder(storage *Storage) *MetadataBuilder

NewMetadataBuilder creates a new metadata builder

func (*MetadataBuilder) WriteRunMetadata

func (m *MetadataBuilder) WriteRunMetadata(meta RunMetadata) error

WriteRunMetadata writes run metadata to JSON file

func (*MetadataBuilder) WriteTestMetadata

func (m *MetadataBuilder) WriteTestMetadata(meta TestMetadata, testDir string) error

WriteTestMetadata writes test metadata to JSON file

type RunMetadata

type RunMetadata struct {
	RunID        string            `json:"run_id"`
	StartTime    time.Time         `json:"start_time"`
	EndTime      time.Time         `json:"end_time,omitempty"`
	TotalTests   int               `json:"total_tests"`
	FailedTests  int               `json:"failed_tests"`
	PassedTests  int               `json:"passed_tests"`
	Environment  map[string]string `json:"environment"`
	ArtifactsDir string            `json:"artifacts_dir"`
	// Git information for tracking test run version
	GitCommit   string `json:"git_commit,omitempty"`
	GitBranch   string `json:"git_branch,omitempty"`
	GitDirty    string `json:"git_dirty,omitempty"`   // "dirty", "staged", or empty if clean
	Description string `json:"description,omitempty"` // Optional user description
}

RunMetadata contains metadata about an entire test run

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

Storage handles filesystem operations for artifact collection

func NewStorage

func NewStorage(baseDir string, runID string, maxSize int64) (*Storage, error)

NewStorage creates a new storage instance with specified configuration

func (*Storage) CreateTestDir

func (s *Storage) CreateTestDir(testName string) (string, error)

CreateTestDir creates a directory for a specific test

func (*Storage) GetRunDir

func (s *Storage) GetRunDir() string

GetRunDir returns the run directory path

func (*Storage) GetRunID

func (s *Storage) GetRunID() string

GetRunID returns the run ID

func (*Storage) WriteFile

func (s *Storage) WriteFile(testDir, category, filename string, content []byte) error

WriteFile writes content to a file within a test directory, enforcing size limits. testDir is the test-specific directory name (e.g., "test-normal-mode"), category is the subdirectory within the test dir (e.g., "logs", "resources", "events"), and filename is the name of the file to write.
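
For example, writing a pod-status artifact through Storage (the import path, runID, file name, and payload are illustrative; the directory layout follows the tree shown in the README):

package main

import (
	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	// 100 MB total-size cap, matching the documented default.
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}
	if _, err := storage.CreateTestDir("test-normal-mode"); err != nil {
		panic(err)
	}

	// Per the parameter description above, testDir is the test-specific
	// directory name and category selects the subdirectory.
	status := []byte(`{"phase":"Running"}`) // illustrative payload
	if err := storage.WriteFile("test-normal-mode", "pods", "my-pod-status.json", status); err != nil {
		panic(err)
	}
}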

func (*Storage) WriteFileInRunDir

func (s *Storage) WriteFileInRunDir(filename string, content []byte) error

WriteFileInRunDir writes a file directly in the run directory (not test-specific). Used for run-level metadata.

func (*Storage) WriteStream

func (s *Storage) WriteStream(testDir, category, filename string, reader io.Reader, maxLines int) error

WriteStream writes content from a reader to a file with size limits. Useful for streaming command output without loading it all into memory.
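
A sketch of streaming kubectl output straight into an artifact file (the import path, runID, deployment, and namespace are illustrative; error handling is condensed):

package main

import (
	"os/exec"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}

	// Stream operator logs into the artifact file, keeping at most 500 lines,
	// without buffering the whole output in memory.
	cmd := exec.Command("kubectl", "logs", "deploy/operator-controller", "-n", "operator-system")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	if err := storage.WriteStream("test-normal-mode", "logs", "operator-controller.log", stdout, 500); err != nil {
		panic(err)
	}
	_ = cmd.Wait()
}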

type TestInfo

type TestInfo struct {
	Name           string
	Namespace      string
	Failed         bool
	FailureMessage string
	Duration       time.Duration
	StartTime      time.Time
	EndTime        time.Time
	Labels         []string

	// Test sequence tracking (for degradation analysis)
	SequenceNumber int           // Which test in the run (1, 2, 3...)
	OperatorAge    time.Duration // How long operator has been running

	// Kubernetes context
	KubectlClient *kubectl.Client
}

TestInfo contains information about a test execution

type TestMetadata

type TestMetadata struct {
	Name           string        `json:"name"`
	Namespace      string        `json:"namespace"`
	StartTime      time.Time     `json:"start_time"`
	EndTime        time.Time     `json:"end_time"`
	Duration       time.Duration `json:"duration_ms"` // in milliseconds for JSON
	Failed         bool          `json:"failed"`
	FailureMessage string        `json:"failure_message,omitempty"`
	Labels         []string      `json:"labels"`

	// Test sequence tracking (for degradation analysis)
	TestSequenceNumber int           `json:"test_sequence_number"` // Which test in the run (1, 2, 3...)
	OperatorAge        time.Duration `json:"operator_age_seconds"` // How long operator has been running

	// Collected artifacts inventory
	Artifacts ArtifactInventory `json:"artifacts"`
}

TestMetadata contains metadata about a single test execution

func BuildTestMetadata

func BuildTestMetadata(info TestInfo, artifacts ArtifactInventory) TestMetadata

BuildTestMetadata creates TestMetadata from TestInfo
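
A sketch tying TestInfo, ArtifactInventory, and MetadataBuilder together (the import path and all values are illustrative):

package main

import (
	"time"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}

	info := artifacts.TestInfo{ // values are illustrative
		Name:      "test-normal-mode",
		Namespace: "e2e-test-ns",
		Failed:    true,
		StartTime: time.Now().Add(-90 * time.Second),
		EndTime:   time.Now(),
		Duration:  90 * time.Second,
	}
	inventory := artifacts.ArtifactInventory{
		PodCount:       2,
		LogFiles:       []string{"logs/operator-controller.log"},
		TotalSizeBytes: 12345,
	}

	meta := artifacts.BuildTestMetadata(info, inventory)
	builder := artifacts.NewMetadataBuilder(storage)
	if err := builder.WriteTestMetadata(meta, "test-normal-mode"); err != nil {
		panic(err)
	}
}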
