package artifacts

v0.4.0
Published: Nov 26, 2025 License: Apache-2.0 Imports: 13 Imported by: 0

README

E2E Test Artifact Collection

Automatic collection of debugging artifacts when e2e tests fail.

Features

  • Automatic: Collects artifacts only on test failures (configurable)
  • Safe: Never fails tests due to collection errors
  • Fast: < 1s overhead for passing tests, < 30s for failing tests
  • Comprehensive: Logs, pod status, events, resources

Configuration (ENV Variables)

All configuration is ENV-based with sensible defaults:

Collection Control
  • E2E_ARTIFACTS_ENABLED (default: true) - Master switch
  • E2E_ARTIFACTS_ON_FAILURE_ONLY (default: true) - Only collect on failures
  • E2E_ARTIFACTS_MINIMAL_ONLY (default: false) - P0 artifacts only
Storage
  • E2E_ARTIFACTS_DIR (default: test/e2e/artifacts) - Base directory
Size Limits
  • E2E_ARTIFACTS_MAX_LOG_LINES (default: 500) - Max log lines per pod
  • E2E_ARTIFACTS_MAX_RESOURCE_SIZE (default: 10485760) - Max 10MB per file
  • E2E_ARTIFACTS_MAX_TOTAL_SIZE (default: 104857600) - Max 100MB per test
Timeouts
  • E2E_ARTIFACTS_TIMEOUT (default: 30s) - Max collection time
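
Under the hood this is the usual env-with-default pattern. A minimal sketch of that pattern (the envInt helper and main function are illustrative only; LoadConfigFromEnv, documented below, is the real entry point):

package main

import (
	"fmt"
	"os"
	"strconv"
)

// envInt reads an integer ENV variable, falling back to a default when the
// variable is unset or unparsable. Hypothetical helper for illustration.
func envInt(key string, def int) int {
	if v, ok := os.LookupEnv(key); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def
}

func main() {
	maxLogLines := envInt("E2E_ARTIFACTS_MAX_LOG_LINES", 500)
	fmt.Println("max log lines:", maxLogLines)
}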

Usage

Running Tests with Artifacts
# Default behavior (enabled, on-failure-only)
make test-e2e

# Disable artifacts
E2E_ARTIFACTS_ENABLED=false make test-e2e

# Collect for all tests (even passing)
E2E_ARTIFACTS_ON_FAILURE_ONLY=false make test-e2e

# Increase log lines
E2E_ARTIFACTS_MAX_LOG_LINES=1000 make test-e2e
Artifact Location

Artifacts are stored in:

test/e2e/artifacts/
└── run-{timestamp}/
    ├── metadata.json
    └── {test-name}/
        ├── metadata.json
        ├── logs/
        │   ├── operator-controller.log
        │   └── pod-{name}.log
        ├── pods/
        │   └── {pod-name}-status.json
        ├── resources/
        │   ├── vectorpipeline-{name}-status.json
        │   └── deployment-{name}.yaml
        └── events/
            └── namespace-events.txt
Unified Test Results

When you run make test-e2e, all results are automatically saved in a unified structure with reports and artifacts correlated by timestamp:

# Run tests - results automatically saved with timestamp
make test-e2e

# Results structure:
test/e2e/results/run-{timestamp}/
├── reports/
│   ├── junit-report.xml    # JUnit XML for CI integration
│   ├── report.json          # Ginkgo JSON report
│   └── test-output.log      # Full test output logs
└── artifacts/               # Debug artifacts (only for failed tests)
    ├── metadata.json        # Run-level metadata
    └── {test-name}/         # Per-test artifacts
        ├── metadata.json
        ├── logs/
        ├── pods/
        ├── resources/
        └── events/

Benefits:

  • Single runID correlates all reports and artifacts
  • Easy to navigate - everything in one directory
  • CI/CD friendly - upload one directory
  • Helpful output with quick analysis commands
CI Integration (GitHub Actions)
- name: Run E2E Tests
  run: make test-e2e

- name: Upload Test Results
  if: always()  # Upload even if tests fail
  uses: actions/upload-artifact@v4
  with:
    name: e2e-results-${{ github.run_number }}
    path: test/e2e/results/
    retention-days: 30

Collected Artifacts (P0 - MVP)

Critical for Debugging
  1. Pod Status JSON - Conditions, restarts, phase
  2. Operator Controller Logs - Time-filtered logs (test duration + 1min buffer)
  3. VectorPipeline CR Status - Validation results
  4. Namespace Events - What happened in test namespace
  5. Resource Metadata - Deployments, DaemonSets, Services
Future (Phase 2)
  • Full pod logs (all containers)
  • Full pod descriptions
  • Vector agent/aggregator logs
  • ConfigCheck pod logs
  • Timeline reconstruction

Architecture

  • Thread-safe: Uses sync.Map for parallel test support
  • Graceful degradation: Collection errors don't fail tests
  • Size limits: Prevents CI artifact bloat
  • Atomic writes: Temp file + rename for reliability (see the sketch below)
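
The atomic-write bullet refers to the standard temp-file-plus-rename technique. A minimal sketch of that technique, assuming nothing about the package's internals:

package main

import (
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file in the target directory and
// renames it into place, so a crash mid-write never leaves a partial artifact.
// Illustration of the pattern only, not the package's actual code.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op once the rename has succeeded

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := writeFileAtomic("metadata.json", []byte(`{"ok":true}`)); err != nil {
		panic(err)
	}
}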

Performance

  • Passing tests: < 1s overhead (if ON_FAILURE_ONLY=true)
  • Failing tests: < 30s collection time
  • Storage: < 100MB per test, < 500MB per run

Troubleshooting

No artifacts collected
  1. Check E2E_ARTIFACTS_ENABLED=true
  2. Verify the test is using framework.NewFramework() or framework.Shared()
  3. Check GinkgoWriter output for warning messages
Artifacts too large
  1. Reduce E2E_ARTIFACTS_MAX_LOG_LINES (default: 500)
  2. Enable E2E_ARTIFACTS_MINIMAL_ONLY=true
  3. Tighten the per-file limit with E2E_ARTIFACTS_MAX_RESOURCE_SIZE (default: 10MB)
Collection timeout
  1. Increase E2E_ARTIFACTS_TIMEOUT (default: 30s)
  2. Check kubectl connectivity
  3. Review namespace resource count

Important Bug Fixes

Time-based Log Collection (Fixed)

Problem: Previously, operator logs were collected using kubectl logs --tail 500, which retrieved the last 500 lines from the entire pod lifetime. In long-running test suites (e.g., full e2e runs lasting 15+ minutes), the operator pod could generate thousands of log lines, causing the last 500 lines to exclude logs from earlier failing tests.

Example: A test failing at 18:05-18:07 would collect operator logs from 16:02-16:03 (the pod's startup logs), completely missing the relevant reconciliation attempts.

Solution: Implemented time-based log collection using kubectl logs --since-time with the test's start time (+ 1 minute buffer). This ensures operator logs are collected only for the relevant time period, regardless of how long the pod has been running.
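
A minimal sketch of how such a --since-time argument can be derived from the test's start time (the one-minute pre-start buffer, pod name, and namespace here are assumptions for illustration; the collector's real command construction may differ):

package main

import (
	"fmt"
	"time"
)

// sinceTimeArgs builds the time-filtered part of a kubectl logs invocation.
// Illustrative sketch only.
func sinceTimeArgs(pod, namespace string, testStart time.Time) []string {
	// Start one minute before the test so setup-related lines are included.
	since := testStart.Add(-1 * time.Minute).UTC().Format(time.RFC3339)
	return []string{"logs", pod, "-n", namespace, "--since-time=" + since}
}

func main() {
	args := sinceTimeArgs("operator-controller-manager-abc123", "operator-system", time.Now())
	fmt.Println("kubectl", args)
}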

Impact:

  • Fixes flaky test debugging where operator logs were missing
  • Enables reliable root cause analysis for race conditions
  • Reduces confusion when logs don't match test timeline

Development

See the architect's design document for Phase 2+ enhancements.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func TruncateLogLines

func TruncateLogLines(content []byte, maxLines int) []byte

TruncateLogLines truncates log output to the specified number of lines. It keeps the LAST N lines (the most recent logs are most relevant for debugging).
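
A minimal sketch of the last-N-lines behavior described above (this is an assumed implementation for illustration, not the package's actual code):

package main

import (
	"bytes"
	"fmt"
)

// truncateLastLines keeps only the final maxLines lines of content, mirroring
// the documented TruncateLogLines contract. Assumed implementation;
// trailing-newline handling may differ in the real function.
func truncateLastLines(content []byte, maxLines int) []byte {
	lines := bytes.Split(content, []byte("\n"))
	if len(lines) <= maxLines {
		return content
	}
	return bytes.Join(lines[len(lines)-maxLines:], []byte("\n"))
}

func main() {
	logs := []byte("line1\nline2\nline3\nline4")
	fmt.Printf("%s\n", truncateLastLines(logs, 2)) // prints line3 and line4
}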

Types

type ArtifactInventory

type ArtifactInventory struct {
	PodCount       int      `json:"pod_count"`
	LogFiles       []string `json:"log_files"`
	ResourceFiles  []string `json:"resource_files"`
	EventFiles     []string `json:"event_files"`
	TotalSizeBytes int64    `json:"total_size_bytes"`
	CollectionTime string   `json:"collection_time"` // Human-readable duration
}

ArtifactInventory tracks what artifacts were collected

type Collector

type Collector interface {
	// Initialize sets up the collector for a test run
	Initialize(runID string) error

	// CollectForTest collects artifacts for a test
	CollectForTest(ctx context.Context, testInfo TestInfo) error

	// Close finalizes the collector and writes summary
	Close() error
}

Collector manages artifact collection for e2e tests

func NewCollector

func NewCollector(config Config) (Collector, error)

NewCollector creates a new artifact collector
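
Putting the documented API together, a typical lifecycle might look like the following sketch (the import path is a placeholder, and the runID and TestInfo values are illustrative; in practice the e2e framework wires this up):

package main

import (
	"context"
	"fmt"
	"time"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	cfg := artifacts.LoadConfigFromEnv()

	collector, err := artifacts.NewCollector(cfg)
	if err != nil {
		panic(err)
	}
	if err := collector.Initialize("run-20251126-120000"); err != nil { // runID is illustrative
		panic(err)
	}
	defer collector.Close() // finalizes the run and writes the summary

	// Populated by the test framework in practice; assumed values shown here.
	info := artifacts.TestInfo{
		Name:      "test-normal-mode",
		Namespace: "e2e-test-ns",
		Failed:    true,
		StartTime: time.Now().Add(-2 * time.Minute),
		EndTime:   time.Now(),
	}

	ctx, cancel := context.WithTimeout(context.Background(), cfg.CollectionTimeout)
	defer cancel()
	if err := collector.CollectForTest(ctx, info); err != nil {
		// Collection errors are reported but must never fail the test itself.
		fmt.Println("artifact collection warning:", err)
	}
}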

type Config

type Config struct {
	// Collection control
	Enabled              bool // Master switch for artifact collection
	CollectOnFailureOnly bool // Collect artifacts only for failed tests
	CollectMinimalOnly   bool // Collect only P0 artifacts (fast path)

	// Storage paths
	BaseDir string // Base directory for artifact storage

	// Size limits (prevent artifact bloat)
	MaxLogLines     int   // Maximum log lines per pod
	MaxResourceSize int64 // Maximum size for single resource (bytes)
	MaxTotalSize    int64 // Maximum total size per test (bytes)

	// Timeouts
	CollectionTimeout time.Duration // Maximum time to collect artifacts

	// Filters
	NamespacePatterns []string // Namespace patterns to collect from
	PodLabelSelectors []string // Pod label selectors for filtering
}

Config defines artifact collection behavior

func LoadConfigFromEnv

func LoadConfigFromEnv() Config

LoadConfigFromEnv loads configuration from environment variables, following the Phase 1 pattern: ENV-based config with sensible defaults.

type MetadataBuilder

type MetadataBuilder struct {
	// contains filtered or unexported fields
}

MetadataBuilder helps build and write metadata files

func NewMetadataBuilder

func NewMetadataBuilder(storage *Storage) *MetadataBuilder

NewMetadataBuilder creates a new metadata builder

func (*MetadataBuilder) WriteRunMetadata

func (m *MetadataBuilder) WriteRunMetadata(meta RunMetadata) error

WriteRunMetadata writes run metadata to JSON file

func (*MetadataBuilder) WriteTestMetadata

func (m *MetadataBuilder) WriteTestMetadata(meta TestMetadata, testDir string) error

WriteTestMetadata writes test metadata to JSON file

type RunMetadata

type RunMetadata struct {
	RunID        string            `json:"run_id"`
	StartTime    time.Time         `json:"start_time"`
	EndTime      time.Time         `json:"end_time,omitempty"`
	TotalTests   int               `json:"total_tests"`
	FailedTests  int               `json:"failed_tests"`
	PassedTests  int               `json:"passed_tests"`
	Environment  map[string]string `json:"environment"`
	ArtifactsDir string            `json:"artifacts_dir"`
	// Git information for tracking test run version
	GitCommit   string `json:"git_commit,omitempty"`
	GitBranch   string `json:"git_branch,omitempty"`
	GitDirty    string `json:"git_dirty,omitempty"`   // "dirty", "staged", or empty if clean
	Description string `json:"description,omitempty"` // Optional user description
}

RunMetadata contains metadata about an entire test run

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

Storage handles filesystem operations for artifact collection

func NewStorage

func NewStorage(baseDir string, runID string, maxSize int64) (*Storage, error)

NewStorage creates a new storage instance with specified configuration

func (*Storage) CreateTestDir

func (s *Storage) CreateTestDir(testName string) (string, error)

CreateTestDir creates a directory for a specific test

func (*Storage) GetRunDir

func (s *Storage) GetRunDir() string

GetRunDir returns the run directory path

func (*Storage) GetRunID

func (s *Storage) GetRunID() string

GetRunID returns the run ID

func (*Storage) WriteFile

func (s *Storage) WriteFile(testDir, category, filename string, content []byte) error

WriteFile writes content to a file within a test directory, enforcing size limits. testDir is the test-specific directory name (e.g., "test-normal-mode"), category is the subdirectory within the test dir (e.g., "logs", "resources", "events"), and filename is the name of the file to write.
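
For example, writing a pod-status artifact through Storage (the import path, runID, file name, and payload are illustrative; the directory layout follows the tree shown in the README):

package main

import (
	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	// 100 MB total-size cap, matching the documented default.
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}
	if _, err := storage.CreateTestDir("test-normal-mode"); err != nil {
		panic(err)
	}

	// Per the parameter description above, testDir is the test-specific
	// directory name and category selects the subdirectory.
	status := []byte(`{"phase":"Running"}`) // illustrative payload
	if err := storage.WriteFile("test-normal-mode", "pods", "my-pod-status.json", status); err != nil {
		panic(err)
	}
}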

func (*Storage) WriteFileInRunDir

func (s *Storage) WriteFileInRunDir(filename string, content []byte) error

WriteFileInRunDir writes a file directly in the run directory (not test-specific). Used for run-level metadata.

func (*Storage) WriteStream

func (s *Storage) WriteStream(testDir, category, filename string, reader io.Reader, maxLines int) error

WriteStream writes content from a reader to a file with size limits. Useful for streaming command output without loading it all into memory.
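
A sketch of streaming kubectl output straight into an artifact file (the import path, runID, deployment, and namespace are illustrative; error handling is condensed):

package main

import (
	"os/exec"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}

	// Stream operator logs into the artifact file, keeping at most 500 lines,
	// without buffering the whole output in memory.
	cmd := exec.Command("kubectl", "logs", "deploy/operator-controller", "-n", "operator-system")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	if err := storage.WriteStream("test-normal-mode", "logs", "operator-controller.log", stdout, 500); err != nil {
		panic(err)
	}
	_ = cmd.Wait()
}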

type TestInfo

type TestInfo struct {
	Name           string
	Namespace      string
	Failed         bool
	FailureMessage string
	Duration       time.Duration
	StartTime      time.Time
	EndTime        time.Time
	Labels         []string

	// Test sequence tracking (for degradation analysis)
	SequenceNumber int           // Which test in the run (1, 2, 3...)
	OperatorAge    time.Duration // How long operator has been running

	// Kubernetes context
	KubectlClient *kubectl.Client
}

TestInfo contains information about a test execution

type TestMetadata

type TestMetadata struct {
	Name           string        `json:"name"`
	Namespace      string        `json:"namespace"`
	StartTime      time.Time     `json:"start_time"`
	EndTime        time.Time     `json:"end_time"`
	Duration       time.Duration `json:"duration_ms"` // in milliseconds for JSON
	Failed         bool          `json:"failed"`
	FailureMessage string        `json:"failure_message,omitempty"`
	Labels         []string      `json:"labels"`

	// Test sequence tracking (for degradation analysis)
	TestSequenceNumber int           `json:"test_sequence_number"` // Which test in the run (1, 2, 3...)
	OperatorAge        time.Duration `json:"operator_age_seconds"` // How long operator has been running

	// Collected artifacts inventory
	Artifacts ArtifactInventory `json:"artifacts"`
}

TestMetadata contains metadata about a single test execution

func BuildTestMetadata

func BuildTestMetadata(info TestInfo, artifacts ArtifactInventory) TestMetadata

BuildTestMetadata creates TestMetadata from TestInfo
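
A sketch tying TestInfo, ArtifactInventory, and MetadataBuilder together (the import path and all values are illustrative):

package main

import (
	"time"

	// Placeholder import path: substitute the module's real path.
	artifacts "example.com/operator/test/e2e/artifacts"
)

func main() {
	storage, err := artifacts.NewStorage("test/e2e/artifacts", "run-20251126-120000", 100*1024*1024)
	if err != nil {
		panic(err)
	}

	info := artifacts.TestInfo{ // values are illustrative
		Name:      "test-normal-mode",
		Namespace: "e2e-test-ns",
		Failed:    true,
		StartTime: time.Now().Add(-90 * time.Second),
		EndTime:   time.Now(),
		Duration:  90 * time.Second,
	}
	inventory := artifacts.ArtifactInventory{
		PodCount:       2,
		LogFiles:       []string{"logs/operator-controller.log"},
		TotalSizeBytes: 12345,
	}

	meta := artifacts.BuildTestMetadata(info, inventory)
	builder := artifacts.NewMetadataBuilder(storage)
	if err := builder.WriteTestMetadata(meta, "test-normal-mode"); err != nil {
		panic(err)
	}
}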
