handlers

package
v1.2.9
Published: Nov 25, 2025 License: Apache-2.0 Imports: 27 Imported by: 0

README

CloudZero Agent HTTP Handlers

The handlers package implements Primary Adapters in CloudZero Agent's hexagonal architecture, providing HTTP interfaces that translate external requests into domain service operations for cost allocation processing.

Architecture Overview

graph LR
    A[External Systems<br/>Kubernetes] --> B[HTTP Handlers<br/>Primary Adapters]
    B --> C[Domain Services<br/>Business Logic]
    C --> D[Storage/APIs<br/>Secondary Adapters]

Core Handlers

Admission Webhook API (webhook.go)
  • ValidationWebhookAPI: Kubernetes admission control for cost allocation metadata
  • Fail-Open Design: Always allows resources if processing fails to prevent cluster disruption
  • Multi-Version Support: Compatible with admission.k8s.io/v1 and v1beta1 APIs
  • Request Validation: Content-type checking, timeout management, size limits
  • Connection Management: Periodic closure for load balancing across replicas
Prometheus Remote Write API (remote_write.go)
  • RemoteWriteAPI: High-throughput metric ingestion from Prometheus instances
  • Protocol Support: Full remote_write v1 and v2 specification compliance
  • Compression Handling: Snappy decompression for efficient network transmission
  • Size Limits: 16MB maximum payload with memory protection
  • Load Balancing: Connection management for distributing Prometheus load
Metrics API (prom_metrics.go)
  • PromMetricsAPI: Prometheus metrics exposition for operational monitoring
  • Standard Endpoint: /metrics endpoint following Prometheus conventions
  • Comprehensive Metrics: Agent performance, business metrics, and infrastructure data
  • Auto-Discovery: Compatible with Kubernetes service discovery patterns
Shipper API (shipper.go)
  • ShipperAPI: Operational endpoints for metric shipping monitoring
  • Debug Capabilities: Shipping pipeline status and performance insights
  • Health Checking: Integration with Kubernetes liveness and readiness probes
  • Prometheus Integration: Internal metrics for shipping operations monitoring
Profiling API (profiling.go)
  • ProfilingAPI: Go pprof profiling endpoints for performance analysis
  • Standard Endpoints: Complete pprof interface for development and debugging
  • Security Considerations: Should be restricted in production environments
  • Tool Compatibility: Direct integration with go tool pprof and analysis tools

Request Processing Patterns

Kubernetes Admission Control
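// Simplified sketch of the request flow.
func (a *ValidationWebhookAPI) PostAdmissionRequest(w http.ResponseWriter, r *http.Request) {
    // 1. Request validation and timeout setup
    ctx, cancel := context.WithTimeout(r.Context(), DefaultTimeout)
    defer cancel()

    // 2. Read the size-limited body and parse the admission review
    body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, MaxRequestBodyBytes))
    if err != nil {
        sendAllowResponse(w, r) // fail open: an unreadable body must not block the cluster
        return
    }
    review, err := a.requestBodyToModelReview(body)
    if err != nil {
        sendAllowResponse(w, r) // fail open on malformed reviews as well
        return
    }

    // 3. Process through domain service; failures are logged, never returned to Kubernetes
    if _, err := a.controller.Review(ctx, review); err != nil {
        log.Ctx(ctx).Error().Err(err).Msg("admission review failed")
    }

    // 4. Always allow resources (fail-open behavior)
    sendAllowResponse(w, r)
}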
Prometheus Metric Ingestion
func (a *RemoteWriteAPI) PostMetrics(w http.ResponseWriter, r *http.Request) {
    // 1. Validate request size and content type
    if r.ContentLength > MaxPayloadSize {
        logErrorReply(r, w, "too big", http.StatusOK)
        return
    }

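    // 2. Read and process metric data
    contentType := r.Header.Get("Content-Type")
    encodingType := r.Header.Get("Content-Encoding")
    data, err := io.ReadAll(r.Body)
    if err != nil {
        logErrorReply(r, w, "read failed", http.StatusBadRequest)
        return
    }
    stats, err := a.metrics.PutMetrics(r.Context(), contentType, encodingType, data)
    if err != nil {
        logErrorReply(r, w, "processing failed", http.StatusInternalServerError)
        return
    }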

    // 3. Implement load balancing through connection management
    if r.ProtoMajor == 1 && shouldCloseConnection() {
        w.Header().Set("Connection", "close")
    }

    // 4. Return processing statistics
    if stats != nil {
        stats.SetHeaders(w)
    }
}
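
The snippet above calls shouldCloseConnection without showing it. The following is a minimal sketch of one possible policy, assuming a simple "close every Nth request" rule; the connectionCycler type and its closeEvery field are hypothetical names, and the real package may use a time-based or randomized rule instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// connectionCycler is a hypothetical helper: it asks the client to drop its
// keep-alive connection on every Nth request so that the reconnect can land
// on a different replica behind the load balancer.
type connectionCycler struct {
	closeEvery uint64 // assumed policy: close once per closeEvery requests
	count      atomic.Uint64
}

// shouldClose reports whether the current request should end the connection.
func (c *connectionCycler) shouldClose() bool {
	return c.count.Add(1)%c.closeEvery == 0
}

// wrap sets "Connection: close" on selected HTTP/1.x responses. HTTP/2
// multiplexes streams over one connection and does not use this header,
// so those requests are left alone.
func (c *connectionCycler) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ProtoMajor == 1 && c.shouldClose() {
			w.Header().Set("Connection", "close")
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	cycler := &connectionCycler{closeEvery: 3}
	h := cycler.wrap(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	for i := 1; i <= 3; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/", nil))
		fmt.Printf("request %d: Connection=%q\n", i, rec.Header().Get("Connection"))
	}
}
```

Long-lived keep-alive connections pin a Prometheus instance to one replica, which is why occasional closure helps spread load.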

HTTP Infrastructure Integration

Server Integration (go-obvious/server)
  • Consistent Patterns: All handlers implement server.API interface
  • Middleware Support: Logging, metrics, authentication, and CORS handling
  • Graceful Shutdown: Coordinated shutdown with proper connection drainage
  • Health Checking: Built-in health and readiness endpoint support
Router Configuration (chi)
  • High Performance: Chi router optimized for high-throughput operations
  • RESTful Patterns: Standard HTTP method and path conventions
  • Middleware Chains: Composable request processing pipeline
  • Route Groups: Logical organization of related endpoints
Error Handling
// Consistent error response pattern
func logErrorReply(r *http.Request, w http.ResponseWriter, data string, statusCode int) {
    log.Ctx(r.Context()).Error().Msg(data)
    request.Reply(r, w, data, statusCode)
}

// Fail-open behavior for critical operations
func sendAllowResponse(w http.ResponseWriter, r *http.Request) {
    allowResponse := &types.AdmissionResponse{Allowed: true}
    resp, err := marshallResponseToJSON(ctx, review, allowResponse)
    if err != nil {
        // Use minimal JSON response to ensure we always allow
        w.WriteHeader(http.StatusOK)
        w.Write([]byte(minimalAllowResponse))
        return
    }
    w.WriteHeader(http.StatusOK)
    w.Write(resp)
}

Security Considerations

Request Validation
  • Content-Type Verification: Strict content type checking for all endpoints
  • Size Limits: Maximum request body sizes to prevent memory exhaustion
  • Timeout Management: Request-level timeouts to prevent resource exhaustion
  • Input Sanitization: Validation of all external input data
Authentication and Authorization
  • Kubernetes Integration: Service account-based authentication for webhooks
  • API Keys: CloudZero platform authentication for metric uploads
  • Network Security: TLS encryption for all external communications
  • Access Logging: Comprehensive request logging for audit and monitoring
Production Hardening
  • Rate Limiting: Protection against abuse and overload conditions
  • Circuit Breaking: Protection against cascading failures
  • Health Monitoring: Comprehensive health checking and alerting
  • Resource Limits: Memory and CPU usage controls for stability

Performance Characteristics

High-Throughput Design
  • Concurrent Processing: Handlers designed for concurrent request handling
  • Memory Efficiency: Streaming processing for large payloads
  • Connection Pooling: Efficient reuse of HTTP connections
  • Load Balancing: Request distribution across multiple agent replicas
Prometheus Integration Optimization
  • Remote Write Efficiency: Optimized for high-volume metric ingestion
  • Compression Support: Snappy decompression for bandwidth efficiency
  • Batch Processing: Efficient handling of large metric batches
  • Connection Management: HTTP/1.1 connection cycling for load distribution
Kubernetes Webhook Optimization
  • Fast Response Times: Sub-second admission control decisions
  • Fail-Open Design: Never blocks cluster operations due to agent issues
  • Request Correlation: Structured logging for debugging and monitoring
  • Resource Efficiency: Minimal memory and CPU usage per request

Monitoring and Observability

Request Metrics
// HTTP request metrics
var (
    httpRequestsTotal = prometheus.NewCounterVec(
        prometheus.CounterOpts{Name: "http_requests_total"},
        []string{"method", "path", "status"},
    )

    httpRequestDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{Name: "http_request_duration_seconds"},
        []string{"method", "path"},
    )
)
Health Monitoring
  • Endpoint Health: Individual handler health status reporting
  • Dependency Status: Database and external service connectivity
  • Performance Metrics: Request latency and throughput tracking
  • Error Rate Monitoring: Error classification and trending
Distributed Tracing
  • Request Correlation: Trace ID propagation through request processing
  • Service Boundaries: Span creation at handler and service boundaries
  • Performance Analysis: End-to-end request timing and bottleneck identification
  • Error Attribution: Trace-based error tracking and root cause analysis

Development Guidelines

Adding New Handlers
  1. Implement API Interface: Create struct implementing server.API
  2. Define Routes: Configure chi router with appropriate HTTP methods
  3. Add Validation: Implement request validation and error handling
  4. Domain Integration: Delegate business logic to domain services
  5. Add Tests: Comprehensive unit and integration testing
Testing Strategies
// Unit testing with mocked dependencies
func TestValidationWebhookAPI_PostAdmissionRequest(t *testing.T) {
    mockController := &mocks.WebhookController{}
    api := NewValidationWebhookAPI("/webhook", mockController)

    req := httptest.NewRequest("POST", "/webhook", bytes.NewReader(testAdmissionReview))
    w := httptest.NewRecorder()

    api.PostAdmissionRequest(w, req)

    assert.Equal(t, http.StatusOK, w.Code)
    mockController.AssertExpectations(t)
}

// Integration testing with real HTTP server
func TestRemoteWriteAPI_Integration(t *testing.T) {
    collector := setupTestCollector(t)
    api := NewRemoteWriteAPI("/api/v1/write", collector)

    server := httptest.NewServer(api.Routes())
    defer server.Close()

    // Test with real Prometheus remote_write data
    resp, err := http.Post(server.URL+"/", "application/x-protobuf", bytes.NewReader(testMetrics))
    require.NoError(t, err)
    defer resp.Body.Close()
    assert.Equal(t, http.StatusNoContent, resp.StatusCode)
}
Error Handling Best Practices
  • Consistent Response Format: Standardized error response structure
  • Appropriate Status Codes: Correct HTTP status codes for different error types
  • Detailed Logging: Comprehensive error logging for troubleshooting
  • Client-Friendly Messages: Clear error messages for API consumers
Performance Optimization
  • Request Parsing: Efficient parsing of request bodies and headers
  • Memory Management: Minimize allocations and enable garbage collection
  • Connection Handling: Proper connection lifecycle management
  • Resource Cleanup: Ensure proper cleanup of resources and goroutines

Documentation

Overview

Package handlers provides HTTP request handlers for CloudZero Agent Primary Adapter implementations. This package implements the HTTP interface layer that receives external requests from Kubernetes API servers, Prometheus instances, and other external systems, translating them into domain service operations.

The handlers serve as Primary Adapters in the hexagonal architecture, providing the entry point for all external communication with the CloudZero Agent while maintaining clean separation between HTTP concerns and business logic.

Key responsibilities:

  • HTTP request/response handling: Parse incoming requests and generate appropriate responses
  • Protocol adaptation: Convert HTTP requests to domain service method calls
  • Error handling: Translate domain errors to appropriate HTTP status codes and responses
  • Content negotiation: Handle various content types and serialization formats
  • Security enforcement: Validate request authenticity and enforce security policies

Handler types:

  • ValidationWebhookAPI: Kubernetes admission webhook for cost allocation validation
  • Remote write handlers: Prometheus metric ingestion endpoints
  • Health check handlers: Service health and readiness endpoints
  • Profiling handlers: Development and debugging support endpoints

The package ensures that domain services remain HTTP-agnostic while providing robust, production-ready HTTP interfaces for all CloudZero Agent functionality.

Index

Constants

const (

	// MaxRequestBodyBytes defines the maximum size of Kubernetes admission review requests.
	// This limit is based on Kubernetes etcd storage constraints with additional safety buffer
	// to handle large resource manifests while preventing memory exhaustion attacks.
	//
	// Size calculation:
	//   - Kubernetes default etcd max size: 1.5MB
	//   - Kubernetes internal buffer: 2x (3MB total)
	//   - CloudZero Agent additional buffer: 2x (6MB final limit)
	//
	// This conservative approach ensures:
	//   - Support for large ConfigMaps, Secrets, and Custom Resources
	//   - Protection against malicious oversized requests
	//   - Memory usage predictability in constrained environments
	//   - Compatibility with Kubernetes API server request size policies
	//
	// Reference: https://github.com/istio/istio/commit/6ca5055a4db6695ef5504eabdfde3799f2ea91fd
	MaxRequestBodyBytes = int64(6 * 1024 * 1024)

	// DefaultTimeout specifies the maximum processing time for admission webhook requests.
	// This timeout balances thorough cost allocation analysis with Kubernetes responsiveness
	// requirements, ensuring webhook decisions complete within cluster performance expectations.
	//
	// Timeout considerations:
	//   - Kubernetes API server expects webhook responses within 10-30 seconds
	//   - CloudZero metadata extraction and storage operations typically complete in <1 second
	//   - Network latency for CloudZero API calls may add 2-5 seconds
	//   - Buffer for temporary slowdowns or resource contention
	//
	// Production implications:
	//   - Prevents webhook from blocking cluster operations during slowdowns
	//   - Allows graceful fallback to fail-open behavior if processing exceeds timeout
	//   - Maintains cluster responsiveness under high admission request volumes
	//   - Provides predictable SLA for application deployment times
	DefaultTimeout = 15 * time.Second

	// MinTimeout establishes the minimum acceptable timeout for admission webhook processing.
	// This prevents extremely short timeouts that could cause unnecessary failures during
	// normal operations while still allowing timeout customization for specific environments.
	//
	// This minimum ensures:
	//   - Sufficient time for standard CloudZero metadata extraction
	//   - Network request completion under normal conditions
	//   - Database transaction completion for resource storage
	//   - Graceful handling of minor system resource contention
	//
	// Values below this threshold risk causing webhook failures during normal
	// cluster operations, potentially degrading application deployment reliability.
	MinTimeout = 5 * time.Second
)

MaxRequestBodyBytes represents the maximum size of the Kubernetes objects we read. Kubernetes allows a 2x buffer on the maximum etcd size (https://github.com/kubernetes/kubernetes/blob/0afa569499d480df4977568454a50790891860f5/staging/src/k8s.io/apiserver/pkg/server/config.go#L362). We allow an additional 2x buffer, as it is still fairly cheap (6 MB). Taken from https://github.com/istio/istio/commit/6ca5055a4db6695ef5504eabdfde3799f2ea91fd

const MaxPayloadSize = 16 * 1024 * 1024

MaxPayloadSize defines the maximum allowed size for Prometheus remote_write requests. This limit balances memory usage protection with support for large metric batches from high-volume Prometheus instances while preventing memory exhaustion attacks.

Size considerations:

  • Large Prometheus deployments can send 10MB+ batches during high-volume periods
  • Compressed payloads typically achieve 5-10x compression ratios
  • CloudZero Agent must handle concurrent requests from multiple Prometheus instances
  • Memory usage must remain predictable in constrained Kubernetes environments

Production implications:

  • Prevents out-of-memory conditions during traffic spikes
  • Enables reliable operation under high metric ingestion loads
  • Maintains predictable resource consumption for cluster resource planning
  • Provides clear error responses for oversized requests

The 16MB limit accommodates legitimate large metric batches while providing protection against malicious or misconfigured Prometheus instances.

Variables

This section is empty.

Functions

func NewShipperAPI

func NewShipperAPI(base string, d *shipper.MetricShipper) server.API

NewShipperAPI creates a new HTTP API server for CloudZero Agent metric shipping operations. This constructor initializes all necessary components for exposing metric shipper functionality through HTTP endpoints, enabling operational monitoring and debugging capabilities.

The ShipperAPI provides essential operational visibility into the CloudZero data transmission pipeline, allowing monitoring systems and operations teams to track shipping performance, diagnose issues, and ensure reliable data delivery to the CloudZero platform.

Configuration parameters:

  • base: HTTP path prefix for shipper endpoints (typically "/shipper" or "/shipping")
  • d: MetricShipper domain service implementing the core shipping business logic

Initialization process:

  1. Configure HTTP routing with chi router for efficient request handling
  2. Set up metric endpoints for Prometheus scraping and operational monitoring
  3. Register shipper endpoints with go-obvious server infrastructure
  4. Enable structured logging and health checking integration

The returned server.API can be registered with the CloudZero Agent HTTP server to begin serving metric shipping operational endpoints for monitoring and debugging.

Production considerations:

  • Prometheus metrics endpoint for automated monitoring and alerting
  • Health check endpoints for Kubernetes liveness and readiness probes
  • Debug endpoints for operations team troubleshooting and analysis
  • Performance metrics for CloudZero platform integration monitoring

func NewValidationWebhookAPI

func NewValidationWebhookAPI(base string, controller webhook.WebhookController) server.API

NewValidationWebhookAPI creates a new Kubernetes admission webhook API server instance. This constructor initializes all necessary components for processing admission requests, including Kubernetes API deserialization support and HTTP routing configuration.

The webhook API server provides the primary integration point between Kubernetes clusters and CloudZero cost allocation services, enabling automatic metadata extraction during resource creation, update, and deletion operations.

Configuration parameters:

  • base: HTTP path prefix for webhook endpoints (typically "/webhook" or "/validate")
  • controller: Domain service implementing the actual cost allocation processing logic

Initialization process:

  1. Configure Kubernetes API deserialization for v1 and v1beta1 admission formats
  2. Set up HTTP routing with chi router for efficient request handling
  3. Register webhook endpoints with go-obvious server infrastructure
  4. Enable structured logging and metrics collection for operational monitoring

The returned server.API can be registered with the CloudZero Agent HTTP server to begin processing admission webhook requests from Kubernetes API servers.

Production considerations:

  • Supports both legacy (v1beta1) and modern (v1) Kubernetes admission APIs
  • Automatic content negotiation based on request API version
  • Fail-safe deserialization with comprehensive error handling
  • Ready for integration with agent lifecycle management and health checking
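
Version handling for v1 and v1beta1 can be illustrated by peeking at the apiVersion and kind fields that every AdmissionReview carries. The reviewVersion function below is a hypothetical sketch; the package itself relies on Kubernetes API deserialization, whose details are not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typeMeta holds the two identifying fields present on every Kubernetes object.
type typeMeta struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
}

// reviewVersion (hypothetical) returns the admission API version of a request
// body so that the response can be written in the same version.
func reviewVersion(body []byte) (string, error) {
	var tm typeMeta
	if err := json.Unmarshal(body, &tm); err != nil {
		return "", fmt.Errorf("decode type metadata: %w", err)
	}
	if tm.Kind != "AdmissionReview" {
		return "", fmt.Errorf("unexpected kind %q", tm.Kind)
	}
	switch tm.APIVersion {
	case "admission.k8s.io/v1", "admission.k8s.io/v1beta1":
		return tm.APIVersion, nil
	default:
		return "", fmt.Errorf("unsupported apiVersion %q", tm.APIVersion)
	}
}

func main() {
	body := []byte(`{"apiVersion":"admission.k8s.io/v1beta1","kind":"AdmissionReview"}`)
	version, err := reviewVersion(body)
	fmt.Println(version, err)
}
```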

Types

type ProfilingAPI

type ProfilingAPI struct {
	// api.Service provides the foundational HTTP server infrastructure from go-obvious/server.
	// This embedded service handles HTTP server lifecycle, request routing, middleware integration,
	// and provides consistent API patterns across CloudZero Agent HTTP endpoints.
	api.Service
}

ProfilingAPI provides HTTP endpoints for CloudZero Agent performance profiling and debugging. This API serves as a Primary Adapter in the hexagonal architecture, exposing Go pprof profiling capabilities through HTTP interfaces for development, debugging, and performance analysis.

The ProfilingAPI enables developers and operations teams to analyze CloudZero Agent runtime performance, memory usage patterns, goroutine behavior, and CPU utilization characteristics in development and controlled production environments.

Key capabilities:

  • CPU profiling: Sample-based CPU usage analysis for performance optimization
  • Memory profiling: Heap allocation tracking and memory leak detection
  • Goroutine profiling: Concurrency analysis and deadlock debugging
  • Block profiling: Synchronization bottleneck identification
  • Trace collection: Detailed execution flow analysis for complex debugging scenarios

Development use cases:

  • Performance optimization: Identify CPU and memory bottlenecks in metric processing
  • Memory leak debugging: Track heap allocations and garbage collection patterns
  • Concurrency analysis: Understand goroutine interactions and synchronization behavior
  • Production debugging: Investigate performance issues in live environments

Security considerations:

  • Should only be enabled in development and controlled production environments
  • Provides detailed runtime information that could be sensitive
  • Typically deployed behind authentication and network access controls
  • CPU profiling may impact performance during profile collection

The API follows Go standard pprof conventions, making it compatible with standard profiling tools like "go tool pprof" and web-based profile analysis interfaces.

func NewProfilingAPI

func NewProfilingAPI(base string) *ProfilingAPI

NewProfilingAPI creates a new HTTP API server for CloudZero Agent performance profiling and debugging. This constructor initializes all necessary components for exposing Go runtime profiling capabilities through HTTP endpoints, enabling development teams to analyze agent performance characteristics.

The ProfilingAPI provides essential development and debugging capabilities for CloudZero Agent, allowing performance analysis, memory usage tracking, and concurrency debugging in both development and controlled production environments.

Configuration parameters:

  • base: HTTP path prefix for profiling endpoints (typically "/debug/pprof")

Standard deployment path:

The profiling API should be mounted at "/debug/pprof" to maintain compatibility with Go ecosystem tooling and established pprof conventions. This enables direct integration with "go tool pprof" and other profiling analysis tools.

Initialization process:

  1. Configure HTTP routing with chi router for efficient profiling request handling
  2. Set up all standard pprof endpoints with proper Go runtime integration
  3. Register dynamic profile handlers for all available runtime profiles
  4. Enable structured profiling endpoint access patterns

Production considerations:

  • Enable only in development and controlled production environments
  • Consider network access restrictions and authentication requirements
  • Monitor performance impact during active profiling sessions
  • Implement proper access logging for security audit requirements

The returned ProfilingAPI can be registered with the CloudZero Agent HTTP server to begin serving performance profiling endpoints for development and debugging.

func (*ProfilingAPI) Register

func (a *ProfilingAPI) Register(app server.Server) error

Register integrates the ProfilingAPI with the CloudZero Agent HTTP server infrastructure. This method completes the profiling API setup by mounting the debugging endpoints and enabling performance analysis capabilities for development and operations teams.

Registration process:

  • Mount profiling routes at the configured base path (typically /debug/pprof)
  • Enable HTTP middleware for logging, security, and access control
  • Configure all standard Go pprof endpoints with runtime integration
  • Activate dynamic profile handlers for comprehensive profiling support

The registration process integrates the profiling API with the agent's broader HTTP infrastructure, enabling coordinated startup, shutdown, and access control across all agent components.

Security considerations:

  • Profiling endpoints provide detailed runtime information
  • Consider authentication and network access restrictions
  • Enable proper access logging for security audit requirements
  • Monitor for potential performance impact during profiling

Error conditions:

  • Service registration failures (port conflicts, permission issues)
  • Route mounting conflicts with existing endpoints
  • Middleware initialization failures

Once registration completes successfully, the profiling API is ready to serve performance analysis endpoints for development and debugging activities.

func (*ProfilingAPI) Routes

func (a *ProfilingAPI) Routes() *chi.Mux

Routes configures HTTP request routing for the performance profiling API endpoints. This method creates a chi router instance with all standard Go pprof endpoints plus dynamic handlers for runtime-available profiles.

Standard pprof endpoints:

  • GET /: Profile index page with links to all available profiles
  • GET /cmdline: Command line arguments used to start the agent
  • GET /profile: CPU profile collection (30-second default sampling)
  • GET /symbol: Symbol table lookup for profile analysis
  • GET /trace: Execution trace collection for detailed flow analysis

Dynamic profile endpoints:

Automatically registers handlers for all runtime-available profiles, including:

  • /heap: Memory heap allocation profiles
  • /goroutine: Active goroutine stack traces
  • /allocs: All memory allocation profiles since startup
  • /block: Blocking synchronization profiles
  • /mutex: Mutex contention profiles
  • /threadcreate: Thread creation profiles

The chi router provides:

  • High-performance HTTP routing optimized for profiling request patterns
  • Middleware support for access control and audit logging
  • RESTful routing compatible with standard Go profiling tools
  • Integration with go-obvious server infrastructure

This routing configuration enables seamless integration with "go tool pprof" and other Go ecosystem profiling tools while maintaining compatibility with CloudZero Agent operational patterns.

type PromMetricsAPI

type PromMetricsAPI struct {
	// api.Service provides the foundational HTTP server infrastructure from go-obvious/server.
	// This embedded service handles HTTP server lifecycle, request routing, middleware integration,
	// and provides consistent API patterns across CloudZero Agent HTTP endpoints.
	api.Service
}

PromMetricsAPI provides HTTP endpoints for CloudZero Agent Prometheus metrics collection. This API serves as a Primary Adapter in the hexagonal architecture, exposing internal agent metrics through the standard Prometheus HTTP interface for monitoring and alerting systems.

The PromMetricsAPI enables external monitoring systems to collect operational metrics from CloudZero Agent components, providing visibility into performance, health, and business metrics essential for production operations and CloudZero cost optimization insights.

Key capabilities:

  • Prometheus metrics exposition: Standard /metrics endpoint for automated scraping
  • Internal agent metrics: Performance counters, error rates, and operational health
  • Business metrics: Cost allocation processing rates, webhook admission statistics
  • Infrastructure metrics: Resource usage, connection health, and storage operations

Metrics categories exposed:

  • HTTP request metrics: Request rates, response times, error rates by endpoint
  • Webhook processing: Admission request volumes, processing latencies, failure rates
  • Metric shipping: Upload rates, batch sizes, CloudZero API response times
  • Storage operations: Database query performance, storage utilization, error rates
  • Resource processing: Kubernetes resource admission rates by type and operation

Integration patterns:

  • Prometheus scraping: Automated metric collection every 15-30 seconds
  • Grafana dashboards: Visual monitoring and alerting for operational teams
  • Alert Manager: Automated incident response based on metric thresholds
  • CloudZero platform: Business metrics integration for cost optimization insights

The API provides a thin integration layer around the standard Prometheus HTTP handler, maintaining compatibility with Prometheus ecosystem tooling while integrating with CloudZero Agent HTTP server infrastructure and operational patterns.

func NewPromMetricsAPI

func NewPromMetricsAPI(base string) *PromMetricsAPI

NewPromMetricsAPI creates a new HTTP API server for CloudZero Agent Prometheus metrics exposition. This constructor initializes all necessary components for exposing internal agent metrics through the standard Prometheus HTTP interface for monitoring system integration.

The PromMetricsAPI provides essential operational visibility into CloudZero Agent performance, health, and business metrics, enabling comprehensive monitoring and alerting for production deployments and CloudZero cost optimization operations.

Configuration parameters:

  • base: HTTP path prefix for metrics endpoints (typically "/metrics")

Standard deployment path:

The metrics API should be mounted at "/metrics" to maintain compatibility with Prometheus ecosystem conventions and automated service discovery patterns. This enables seamless integration with existing monitoring infrastructure.

Initialization process:

  1. Configure HTTP routing with chi router for efficient metrics request handling
  2. Set up Prometheus HTTP handler integration with standard metrics exposition format
  3. Register metrics endpoints with go-obvious server infrastructure
  4. Enable structured metrics access patterns for monitoring systems

Metrics collection:

The API exposes all metrics registered with the default Prometheus registry, including both standard Go runtime metrics and CloudZero-specific business metrics. This provides comprehensive observability into agent operations.

The returned PromMetricsAPI can be registered with the CloudZero Agent HTTP server to begin serving Prometheus metrics endpoints for monitoring system integration.

func (*PromMetricsAPI) Register

func (a *PromMetricsAPI) Register(app server.Server) error

Register integrates the PromMetricsAPI with the CloudZero Agent HTTP server infrastructure. This method completes the metrics API setup by mounting the Prometheus endpoints and enabling comprehensive monitoring capabilities for operations teams and automated systems.

Registration process:

  • Mount metrics routes at the configured base path (typically /metrics)
  • Enable HTTP middleware for logging, security, and performance tracking
  • Configure Prometheus HTTP handler with default registry integration
  • Activate metrics exposition for monitoring system scraping

The registration process integrates the metrics API with the agent's broader HTTP infrastructure, enabling coordinated startup, shutdown, and operational monitoring across all agent components.

Operational integration:

  • Prometheus scraping: Enables automated metric collection from monitoring systems
  • Service discovery: Compatible with Kubernetes service discovery and annotation patterns
  • Health monitoring: Provides metrics for liveness and readiness probe integration
  • Performance tracking: Exposes operational metrics for SLA monitoring and alerting

Error conditions:

  • Service registration failures (port conflicts, permission issues)
  • Route mounting conflicts with existing endpoints
  • Middleware initialization failures

Once registration completes successfully, the metrics API is ready to serve Prometheus-format metrics for comprehensive CloudZero Agent monitoring.

func (*PromMetricsAPI) Routes

func (a *PromMetricsAPI) Routes() *chi.Mux

Routes configures HTTP request routing for the Prometheus metrics API endpoints. This method creates a chi router instance with the standard Prometheus metrics endpoint integrated with the official Prometheus HTTP handler.

Metrics endpoint configuration:

  • GET /: Standard Prometheus metrics exposition endpoint
  • Content-Type: text/plain; version=0.0.4; charset=utf-8
  • Format: Standard Prometheus exposition format with metric families

Metrics exposed include:

  • Go runtime metrics: Memory usage, garbage collection, goroutines
  • HTTP server metrics: Request rates, response times, status codes
  • CloudZero business metrics: Webhook processing, metric shipping, cost allocation
  • Storage metrics: Database operations, query performance, error rates

The chi router provides:

  • High-performance HTTP routing optimized for frequent metrics scraping
  • Middleware support for access control and performance monitoring
  • Standard HTTP patterns compatible with Prometheus ecosystem tools
  • Integration with go-obvious server infrastructure

This routing configuration enables seamless integration with Prometheus monitoring systems while maintaining compatibility with CloudZero Agent operational patterns and providing comprehensive observability into agent performance and health.

type RemoteWriteAPI

type RemoteWriteAPI struct {
	// api.Service provides the foundational HTTP server infrastructure from go-obvious/server.
	// This embedded service handles HTTP server lifecycle, request routing, middleware integration,
	// and provides consistent API patterns across CloudZero Agent HTTP endpoints.
	api.Service
	// contains filtered or unexported fields
}

RemoteWriteAPI provides HTTP endpoints for Prometheus remote_write metric ingestion. This API serves as a Primary Adapter in the hexagonal architecture, implementing the Prometheus remote_write protocol for collecting metrics from Prometheus instances and processing them through the CloudZero cost allocation pipeline.

The RemoteWriteAPI enables CloudZero Agent to receive metric data from multiple Prometheus instances across Kubernetes clusters, extracting cost allocation insights and forwarding relevant metrics to the CloudZero platform for comprehensive cost optimization analysis.

Key capabilities:

  • Prometheus remote_write protocol: Full support for v1 and v2 remote_write specifications
  • Compression handling: Automatic decompression of snappy-compressed payloads
  • High throughput: Optimized for processing thousands of metrics per second
  • Error handling: Graceful handling of malformed requests with appropriate HTTP responses
  • Load balancing: Connection management for distributing load across agent replicas

Protocol support:

  • Remote Write v1: Standard protobuf-based metric ingestion
  • Remote Write v2: Enhanced protocol with improved metadata support
  • Content encoding: Snappy compression for efficient network transmission
  • Batch processing: Handles large metric batches from high-volume Prometheus instances

Integration patterns:

  • Prometheus configuration: remote_write endpoint configuration
  • Kubernetes deployment: Multiple agent replicas with load balancing
  • CloudZero platform: Metric forwarding and cost allocation analysis
  • Monitoring systems: Request metrics and error rate tracking

The API maintains clean separation between HTTP protocol concerns and metric processing business logic, delegating actual metric collection to the domain service.

func NewRemoteWriteAPI

func NewRemoteWriteAPI(base string, d *domain.MetricCollector) *RemoteWriteAPI

NewRemoteWriteAPI creates a new HTTP API server for Prometheus remote_write metric ingestion. This constructor initializes all necessary components for receiving and processing Prometheus metrics through the CloudZero Agent cost allocation pipeline.

The RemoteWriteAPI provides the essential integration point between Prometheus monitoring systems and CloudZero cost optimization, enabling automatic metric collection and processing for comprehensive cost allocation analysis across Kubernetes environments.

Configuration parameters:

  • base: HTTP path prefix for remote_write endpoints (typically "/api/v1/write")
  • d: MetricCollector domain service implementing the core metric processing logic

Initialization process:

  1. Configure HTTP routing with chi router for high-performance metric ingestion
  2. Set up remote_write endpoint with proper request validation and size limits
  3. Register metric processing endpoints with go-obvious server infrastructure
  4. Enable structured logging and metrics collection for operational monitoring

Production characteristics:

  • High throughput: Optimized for processing thousands of metrics per second
  • Memory efficiency: Streaming processing with predictable memory usage
  • Error resilience: Graceful handling of malformed or oversized requests
  • Load balancing: Connection management for distributing Prometheus load

The returned RemoteWriteAPI can be registered with the CloudZero Agent HTTP server to begin receiving Prometheus metrics for cost allocation processing.

func (*RemoteWriteAPI) PostMetrics

func (a *RemoteWriteAPI) PostMetrics(w http.ResponseWriter, r *http.Request)

PostMetrics processes Prometheus remote_write requests for CloudZero metric ingestion. This method implements the core remote_write endpoint that receives metric data from Prometheus instances and processes it through the CloudZero cost allocation pipeline.

Processing pipeline:

  1. Request validation: Verify content length, size limits, and required headers
  2. Body reading: Stream request body with memory usage protection
  3. Metric processing: Parse and process metrics through domain service
  4. Response generation: Return appropriate HTTP status and headers
  5. Connection management: Implement load balancing through periodic connection closure

Protocol compliance:

  • Content-Type: application/x-protobuf for remote_write v1/v2
  • Content-Encoding: snappy compression support
  • Status codes: HTTP 204 for success, 400/500 for errors
  • Headers: Custom response headers for client optimization

Performance characteristics:

  • Streaming processing: Memory-efficient handling of large metric batches
  • Size limits: 16MB maximum payload protection
  • Connection management: Periodic closure for load distribution
  • Error resilience: Graceful handling of malformed requests

Load balancing:

The handler periodically closes HTTP/1.1 connections so that Prometheus requests are redistributed across multiple CloudZero Agent replicas, improving resource utilization and fault tolerance.

This method represents the primary integration point between Prometheus monitoring and CloudZero cost optimization, processing high-volume metric streams while maintaining sub-second response times.

func (*RemoteWriteAPI) Register

func (a *RemoteWriteAPI) Register(app server.Server) error

Register integrates the RemoteWriteAPI with the CloudZero Agent HTTP server infrastructure. This method completes the remote_write API setup by mounting the Prometheus endpoints and enabling high-throughput metric ingestion capabilities for cost allocation processing.

Registration process:

  • Mount remote_write routes at the configured base path
  • Enable HTTP middleware for logging, metrics, and error handling
  • Configure request size limits and timeout management
  • Activate metric ingestion pipeline with domain service integration

The registration process integrates the remote_write API with the agent's broader HTTP infrastructure, enabling coordinated startup, shutdown, and performance monitoring across all agent components.

Performance considerations:

  • High-throughput endpoint requiring efficient request processing
  • Memory management for handling large metric batches from Prometheus
  • Connection pooling and load balancing across multiple agent replicas
  • Metrics collection for monitoring ingestion rates and error patterns

Error conditions:

  • Service registration failures (port conflicts, permission issues)
  • Route mounting conflicts with existing endpoints
  • Middleware initialization failures

Once registration completes successfully, the remote_write API is ready to receive Prometheus metrics for processing through the CloudZero cost allocation pipeline.

func (*RemoteWriteAPI) Routes

func (a *RemoteWriteAPI) Routes() *chi.Mux

Routes configures HTTP request routing for the Prometheus remote_write API endpoints. This method creates a chi router instance with the remote_write endpoint optimized for high-throughput metric ingestion from Prometheus instances.

Remote write endpoint configuration:

  • POST /: Primary Prometheus remote_write endpoint for metric ingestion
  • Protocol support: Prometheus remote_write v1 and v2 specifications
  • Content type: application/x-protobuf, sent with Content-Encoding: snappy
  • Request handling: Streaming processing for memory efficiency

The chi router provides:

  • High-performance HTTP routing optimized for frequent metric submissions
  • Middleware support for authentication, logging, and performance monitoring
  • Standard HTTP patterns compatible with Prometheus remote_write clients
  • Integration with go-obvious server infrastructure

Performance optimizations:

  • Minimal routing overhead for high-frequency requests
  • Efficient request body processing with size limits
  • Connection management for load balancing across replicas
  • Request correlation for debugging and monitoring

This routing configuration enables seamless integration with Prometheus monitoring while maintaining high throughput and reliability for CloudZero metric processing.

type ShipperAPI

type ShipperAPI struct {
	// api.Service provides the foundational HTTP server infrastructure from go-obvious/server.
	// This embedded service handles HTTP server lifecycle, request routing, middleware integration,
	// and provides consistent API patterns across CloudZero Agent HTTP endpoints.
	api.Service
	// contains filtered or unexported fields
}

ShipperAPI provides HTTP endpoints for CloudZero Agent metric shipping operations and monitoring. This API serves as a Primary Adapter in the hexagonal architecture, exposing metric shipper functionality through HTTP interfaces for operational monitoring, debugging, and integration.

The ShipperAPI enables external systems to interact with the metric shipping pipeline, providing visibility into CloudZero data transmission status, performance metrics, and operational health information.

Key capabilities:

  • Prometheus metrics exposure: Internal shipper metrics for monitoring and alerting
  • Health checking: Endpoint status and connectivity validation
  • Debugging support: Operational insights for troubleshooting shipping issues
  • Performance monitoring: Throughput, latency, and error rate tracking

Integration points:

  • Prometheus monitoring: Scrapes /metrics endpoint for operational dashboards
  • Health checks: Kubernetes liveness and readiness probe support
  • Debugging tools: Operations team access to shipping pipeline status
  • Performance analysis: CloudZero platform integration monitoring

The API maintains separation between HTTP concerns and metric shipping business logic, delegating actual shipping operations to the domain service while providing HTTP accessibility.

func (*ShipperAPI) Register

func (a *ShipperAPI) Register(app server.Server) error

Register integrates the ShipperAPI with the CloudZero Agent HTTP server infrastructure. This method completes the shipper API setup by mounting the operational endpoints and enabling metric shipping monitoring and debugging capabilities.

Registration process:

  • Mount shipper routes at the configured base path
  • Enable HTTP middleware for logging, metrics, and error handling
  • Configure Prometheus metrics scraping endpoint
  • Activate health check and debugging endpoints

The registration process integrates the shipper API with the agent's broader HTTP infrastructure, enabling coordinated startup, shutdown, and health monitoring across all agent components.

Error conditions:

  • Service registration failures (port conflicts, permission issues)
  • Route mounting conflicts with existing endpoints
  • Middleware initialization failures

Once registration completes successfully, the shipper API is ready to serve operational endpoints for metric shipping monitoring, debugging, and health checking.

func (*ShipperAPI) Routes

func (a *ShipperAPI) Routes() *chi.Mux

Routes configures HTTP request routing for the metric shipper API endpoints. This method creates a chi router instance with all necessary routes for monitoring and debugging CloudZero metric shipping operations.

Route configuration:

  • GET /metrics: Prometheus metrics endpoint for operational monitoring and alerting
  • Future endpoints: Health checks, debug information, and performance metrics as needed

The chi router provides:

  • High-performance HTTP routing with minimal overhead for metrics scraping
  • Middleware support for cross-cutting concerns (logging, authentication)
  • RESTful routing patterns compatible with monitoring and debugging tools
  • Integration with go-obvious server infrastructure

This routing configuration enables monitoring systems to collect operational metrics from the metric shipping pipeline while maintaining compatibility with standard HTTP tooling and Prometheus monitoring patterns.

type ValidationWebhookAPI

type ValidationWebhookAPI struct {
	// api.Service provides the foundational HTTP server infrastructure from go-obvious/server.
	// This embedded service handles HTTP server lifecycle, request routing, middleware integration,
	// and provides consistent API patterns across CloudZero Agent HTTP endpoints.
	//
	// Service capabilities:
	//   - Router mounting: Automatic registration of webhook routes with the HTTP server
	//   - Middleware support: Request logging, authentication, and performance monitoring
	//   - Graceful shutdown: Proper cleanup during agent termination or restart
	//   - Health checking: Integration with agent health monitoring systems
	api.Service
	// contains filtered or unexported fields
}

ValidationWebhookAPI implements the HTTP interface for CloudZero Agent Kubernetes admission webhooks. This struct serves as a Primary Adapter in the hexagonal architecture, translating HTTP admission requests from the Kubernetes API server into domain service operations for cost allocation processing.

The webhook API provides the critical integration point between Kubernetes resource lifecycle management and CloudZero cost optimization, enabling automatic cost allocation metadata extraction and storage during resource admission processing.

Architecture responsibilities:

  • HTTP protocol handling: Parse admission review requests and generate valid responses
  • Request validation: Ensure admission requests meet security and format requirements
  • Timeout management: Enforce processing time limits to maintain cluster responsiveness
  • Error handling: Translate domain errors into appropriate HTTP responses with fail-open behavior
  • API versioning: Support both v1 and v1beta1 admission review formats for compatibility

Production characteristics:

  • Fail-open behavior: Always allow resources if processing fails to prevent cluster disruption
  • Connection management: Periodic connection closure for load balancing across replicas
  • Security validation: Content-type verification and request size limits
  • Structured logging: Comprehensive request tracing for operational monitoring

The webhook integrates with the domain webhook controller to perform actual cost allocation logic while maintaining clean separation between HTTP concerns and business logic.

func (*ValidationWebhookAPI) PostAdmissionRequest

func (a *ValidationWebhookAPI) PostAdmissionRequest(w http.ResponseWriter, r *http.Request)

PostAdmissionRequest processes Kubernetes admission review requests for CloudZero cost allocation. This method implements the core webhook endpoint that receives admission requests from Kubernetes API servers and orchestrates the complete cost allocation pipeline while maintaining fail-open behavior for cluster stability.

Processing pipeline:

  1. Request validation: Verify content type, timeout parameters, and body size limits
  2. Admission review parsing: Deserialize Kubernetes admission request into domain objects
  3. Business logic execution: Extract cost allocation metadata through webhook controller
  4. Response generation: Create properly formatted admission review response
  5. Fail-open handling: Always allow requests if processing encounters errors

Webhook execution matrix:

| Result Type              | HTTP Code | status.Code | status.Status | status.Message |
|--------------------------|-----------|-------------|---------------|----------------|
| Validating Allowed       | 200       | -           | -             | -              |
| Validating Not Allowed   | 200       | 400         | Failure       | Custom message |
| Processing Error         | 200       | -           | -             | - (fail-open)  |

Timeout management:

  • Default timeout: 15 seconds for comprehensive processing
  • Minimum timeout: 5 seconds to prevent unnecessary failures
  • Query parameter override: "?timeout=10s" for custom timeouts
  • Context cancellation: Proper cleanup on request cancellation

Production safeguards:

  • Fail-open behavior: Processing errors result in admission approval
  • Connection management: Periodic closure for load balancing
  • Security validation: Content-type checking and size limits
  • Comprehensive logging: Structured events for operational monitoring

This method represents the primary integration point between Kubernetes resource lifecycle management and CloudZero cost optimization, processing thousands of admission requests per minute in production clusters while maintaining sub-second response times.

func (*ValidationWebhookAPI) Register

func (a *ValidationWebhookAPI) Register(app server.Server) error

Register integrates the ValidationWebhookAPI with the CloudZero Agent HTTP server infrastructure. This method completes the webhook server setup by mounting the API endpoints and enabling request processing for Kubernetes admission control integration.

Registration process:

  • Mount webhook routes at the configured base path
  • Enable HTTP middleware for logging, metrics, and error handling
  • Configure request timeout and size limit enforcement
  • Activate admission review processing pipeline

The registration process integrates the webhook with the agent's broader HTTP infrastructure, enabling coordinated startup, shutdown, and health monitoring across all agent components.

Error conditions:

  • Service registration failures (port conflicts, permission issues)
  • Route mounting conflicts with existing endpoints
  • Middleware initialization failures

Once registration completes successfully, the webhook is ready to receive admission requests from Kubernetes API servers and process them through the CloudZero cost allocation pipeline.

func (*ValidationWebhookAPI) Routes

func (a *ValidationWebhookAPI) Routes() *chi.Mux

Routes configures HTTP request routing for the admission webhook API endpoints. This method creates a chi router instance with the route needed to process Kubernetes admission review requests.

Route configuration:

  • POST /: Primary admission review endpoint for Kubernetes API server requests
  • Future endpoints: Health checks, metrics, and debugging endpoints as needed

The chi router provides:

  • High-performance HTTP routing with minimal overhead
  • Middleware support for cross-cutting concerns (logging, metrics, authentication)
  • RESTful routing patterns compatible with Kubernetes expectations
  • Integration with go-obvious server infrastructure

This routing configuration enables the webhook to respond to admission requests while maintaining compatibility with standard HTTP tooling and monitoring systems.
