multigres-operator

module
v0.10.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 12, 2026 License: Apache-2.0

README ΒΆ

Multigres Operator

The Multigres Operator is a Kubernetes operator for managing distributed, sharded PostgreSQL clusters across multiple failure domains (zones or regions). It provides a unified API to define the topology of your database system, handling the complex orchestration of shards, cells (failure domains), and gateways.

Table of Contents

Features

  • Global Cluster Management: Single source of truth (MultigresCluster) for the entire database topology.
  • Automated Sharding: Manages TableGroups and Shards as first-class citizens.
  • Direct Pod Management: Manages individual Pods and PVCs directly (no StatefulSets), enabling targeted decommissioning, rolling updates with primary awareness, and granular PVC lifecycle control.
  • Failover & High Availability: Orchestrates Primary/Standby failovers across defined Cells.
  • Template System: Define configuration once (CoreTemplate, CellTemplate, ShardTemplate) and reuse it across the cluster.
  • Hierarchical Defaults: Smart override logic allowing for global defaults, namespace defaults, and granular overrides.
  • External Gateway Exposure: Optional external gateway support via spec.externalGateway with configurable externalIPs, tracked by a GatewayExternalReady condition.
  • External Admin Web Exposure: Optional external exposure for the multiadmin-web Service via spec.externalAdminWeb, mirroring the gateway pattern with an AdminWebExternalReady condition.
  • PostgreSQL Configuration: Reference a user-created ConfigMap with postgresql.conf overrides via postgresConfigRef on shard templates. ConfigMap content changes trigger automatic rolling updates.
  • Integrated Cert Management: Built-in self-signed certificate generation and rotation for validating webhooks, with optional support for cert-manager.

Installation

Prerequisites
  • Kubernetes v1.25+
Quick Start

Install the operator with built-in self-signed certificate management:

kubectl apply --server-side -f \
  https://github.com/multigres/multigres-operator/releases/latest/download/install.yaml

This deploys the operator into the multigres-operator namespace with:

  • All CRDs (MultigresCluster, Cell, Shard, TableGroup, TopoServer, and templates)
  • RBAC roles and bindings
  • Mutating and Validating webhooks with self-signed certificates (auto-rotated)
  • The operator Deployment
  • Metrics endpoint

Once the operator is running, try a sample cluster:

kubectl apply -f https://raw.githubusercontent.com/multigres/multigres-operator/main/config/samples/minimal.yaml

For more sample configurations, see the samples directory.

Deployment Options
Option Description Guide
Self-signed certs (default) Zero-config TLS β€” operator generates and rotates its own CA. (Installed above)
cert-manager External certificate management via cert-manager. Cert-Manager Demo
Observability stack Full metrics, tracing, and dashboards (Prometheus, Tempo, Grafana). Observability Demo

How it Works

The Multigres Operator follows a Parent/Child architecture. You, the user, manage the Root resource (MultigresCluster) and its shared Templates. The operator automatically creates and reconciles all necessary child resources (Cells, TableGroups, Shards, TopoServers) to match your desired state.

Resource Hierarchy
[MultigresCluster] πŸš€ (Root CR - User Editable)
      β”‚
      β”œβ”€β”€ πŸ“ Defines [TemplateDefaults] (Cluster-wide default templates)
      β”‚
      β”œβ”€β”€ 🌍 [GlobalTopoServer] (Child CR) ← πŸ“„ Uses [CoreTemplate] OR inline [spec]
      β”‚
      β”œβ”€β”€ πŸ€– MultiAdmin Resources ← πŸ“„ Uses [CoreTemplate] OR inline [spec]
      β”‚
      β”œβ”€β”€ πŸ’  [Cell] (Child CR) ← πŸ“„ Uses [CellTemplate] OR inline [spec]
      β”‚    β”‚
      β”‚    β”œβ”€β”€ πŸšͺ MultiGateway Resources
      β”‚    └── πŸ“‘ [LocalTopoServer] (Child CR, optional)
      β”‚
      └── πŸ—ƒοΈ [TableGroup] (Child CR)
           β”‚
           └── πŸ“¦ [Shard] (Child CR) ← πŸ“„ Uses [ShardTemplate] OR inline [spec]
                β”‚
                β”œβ”€β”€ 🧠 MultiOrch Resources (Deployment)
                └── 🏊 Pools (Operator-managed Pods + PVCs)

πŸ“„ [CoreTemplate] (User-editable, scoped config)
   β”œβ”€β”€ globalTopoServer
   └── multiadmin

πŸ“„ [CellTemplate] (User-editable, scoped config)
   β”œβ”€β”€ multigateway
   └── localTopoServer (optional)

πŸ“„ [ShardTemplate] (User-editable, scoped config)
   β”œβ”€β”€ multiorch
   └── pools (postgres + multipooler)

Important:

  • Only MultigresCluster, CoreTemplate, CellTemplate, and ShardTemplate are meant to be edited by users.
  • Child resources (Cell, TableGroup, Shard, TopoServer) are Read-Only. Any manual changes to them will be immediately reverted by the operator to ensure the system stays in sync with the root configuration.

Configuration & Defaults

The operator uses a 4-Level Override Chain to resolve configuration for every component. This allows you to keep your MultigresCluster spec clean while maintaining full control when needed.

1. The Default Hierarchy

When determining the configuration for a component (e.g., a Shard), the operator looks for configuration in this order:

  1. Inline Spec / Explicit Template Ref: Defined directly on the component in the MultigresCluster YAML.
  2. Cluster-Level Template Default: Defined in spec.templateDefaults of the MultigresCluster.
  3. Namespace-Level Default: A template of the correct kind (e.g., ShardTemplate) named "default" in the same namespace.
  4. Operator Hardcoded Defaults: Fallback values built into the operator Webhook.
2. Templates and Overrides

Templates allow you to define standard configurations (e.g., "Standard High-Availability Cell"). You can then apply specific overrides on top of a template.

Example: Using a Template with Overrides

spec:
  cells:
    - name: "us-east-1a"
      cellTemplate: "standard-ha-cell" # <--- Uses the template
      overrides:                       # <--- Patches specific fields
        multigateway:
          replicas: 5                  # <--- Overrides only the replica count

Note on Overrides: When using overrides, you must provide the complete struct for the section you are overriding if it's a pointer. For specific fields like resources, it's safer to ensure you provide the full context if the merge behavior isn't granular enough for your needs (currently, the resolver performs a deep merge).

3. Template Update Behavior

[!WARNING] When a template (CoreTemplate, CellTemplate, or ShardTemplate) is updated, all clusters using that template are reconciled immediately. This means changes to a shared template propagate to every referencing cluster at once.

For production environments where you want controlled rollouts, consider versioning templates by name:

# Instead of editing "standard-shard" in-place...
apiVersion: multigres.com/v1alpha1
kind: ShardTemplate
metadata:
  name: standard-shard-v2   # <--- New version = new resource
spec:
  # ... updated configuration

Then update each cluster's templateRef individually when ready:

spec:
  templateDefaults:
    shardTemplate: "standard-shard-v2"  # <--- Opt-in to the new version

[!NOTE] Avoid using default-named templates (the namespace-level fallback) in production if you need controlled rollouts. They cannot be versioned since any cluster without an explicit template reference will automatically use whichever template is named default.

This mechanism may change in future versions. See Template Propagation for details on planned improvements.


Backup & Restore

The operator integrates pgBackRest for automated backups, WAL archiving, and point-in-time recovery (PITR). Two storage backends are supported: S3 (recommended for production and multi-cell clusters) and Filesystem (PVC-based, for development/single-node). Backup configuration is fully declarative and propagates from the cluster level down to individual shards.

Key features:

  • Replica-based backups β€” backups run on a replica to avoid impacting the primary
  • S3 credential options β€” IRSA, static credentials, or EC2 instance metadata
  • Auto-generated TLS β€” pgBackRest inter-node TLS is managed automatically, with optional cert-manager support

[!WARNING] Filesystem backups are cell-local. Cross-cell failover cannot restore from another cell's backup. Use S3 for multi-cell clusters.

πŸ“– Full documentation: Backup & Restore Guide

Observability

The operator ships with built-in support for metrics, alerting, distributed tracing, and structured logging.

  • Metrics β€” Prometheus endpoint with 8 operator-specific metrics + controller-runtime framework metrics
  • Alerts β€” 10 pre-configured PrometheusRule alerts with dedicated runbooks (view runbooks)
  • Grafana Dashboards β€” Operator dashboard, per-cluster topology dashboard, and data-plane operations dashboard
  • Distributed Tracing β€” OpenTelemetry OTLP support, disabled by default, zero overhead when off
  • Structured Logging β€” JSON logging with automatic trace_id/span_id injection for log-trace correlation

πŸ“– Full documentation: Observability Guide Β· Observability Demo


Webhook & Certificate Management

The operator includes a Mutating and Validating Webhook to enforce defaults and data integrity.

Automatic Certificate Management (Default)

By default, the operator manages its own TLS certificates using the generic pkg/cert module. This implements a Split-Secret PKI architecture:

  1. Bootstrap: On startup, the cert rotator generates a self-signed Root CA (ECDSA P-256) and a Server Certificate, storing them in two separate Kubernetes Secrets.
  2. CA Bundle Injection: A post-reconcile hook automatically patches the MutatingWebhookConfiguration and ValidatingWebhookConfiguration with the CA bundle.
  3. Rotation: A background loop checks certificates hourly. Certs nearing expiry (or signed by a rotated CA) are automatically renewed without downtime.
  4. Owner References: Both secrets are owned by the operator Deployment, so they are garbage-collected on uninstall.
Using external Cert-Manager

If you prefer to use cert-manager or another external tool, deploy using the cert-manager overlay (install-certmanager.yaml). This overlay:

  1. Creates a Certificate and ClusterIssuer resource for cert-manager to manage.
  2. Mounts the cert-manager-provisioned secret to /var/run/secrets/webhook so certificates exist on disk at startup.

The operator automatically detects the certificate management strategy on startup:

  • If certificates already exist on disk and the operator did not previously manage them (no cert-strategy annotation), it assumes an external provider (e.g. cert-manager) and skips internal rotation.
  • If no certificates exist on disk, or the operator previously annotated the ValidatingWebhookConfiguration, internal certificate rotation is enabled.

πŸ“– Cert-Manager walkthrough: Cert-Manager Demo


GitOps & Webhook Defaults

The operator's Mutating Webhook materialises all defaults (images, replicas, resources, backup config, etc.) directly into the MultigresCluster spec stored in etcd. This means kubectl get multigrescluster -o yaml always shows the full effective configuration β€” no hidden in-memory defaults.

Some fields (like cell assignments on shards) are intentionally kept dynamic and resolved at reconcile time. The resolved values are visible on child CRs (Shard, TableGroup).

If you use GitOps tooling (ArgoCD, Flux), the webhook-materialised fields can cause diffs between your Git manifests and the live state. The documentation covers recommended mitigations.

πŸ“– Full documentation: Webhook Defaults & GitOps Guide


Pool Replication & Quorum

Multigres uses a configurable durability policy to control synchronous replication quorum. The default policy is AT_LEAST_2, which requires every write to be acknowledged by at least 2 nodes (the primary + 1 synchronous standby). For multi-AZ clusters, MULTI_CELL_AT_LEAST_2 enforces cross-zone quorum. This has implications for how many replicas you should run per cell in readWrite pools.

Replicas per Cell Configuration Rolling Upgrade Behavior
1 1 pod (primary only, no standbys) Downtime during upgrades. No standby to maintain quorum.
2 1 primary + 1 standby Downtime during upgrades. Draining the standby leaves zero synchronous standbys, violating AT_LEAST_2. Upstream multigres rejects the UpdateSynchronousStandbyList REMOVE because it would empty the synchronous standby list.
3 (recommended) 1 primary + 2 standbys Zero-downtime upgrades. One standby can be drained while the other maintains quorum.

The operator enforces a hard minimum of 1 replica per cell (the CRD rejects replicasPerCell: 0). For pools with fewer than 3 replicas, the webhook returns an admission warning (not a rejection) explaining the quorum limitation.

πŸ“– Full documentation: Durability Policy


Constraints & Limits (v1alpha1)

Please be aware of the following constraints in the current version:

  • Database Limit: Only 1 database is supported per cluster. It must be named postgres and marked default: true.
  • Shard Naming: Shards currently must be named 0-inf - this is a limitation of the current implementation of Multigres.
  • Naming Lengths:
    • TableGroup Names: If the combined name (cluster-db-tg) exceeds 28 characters, the operator automatically hashes the database and tablegroup names to ensure that the resulting child resource names (Shards, Pods, PVCs) stay within Kubernetes limits (63 chars).
    • Cluster Name: Recommended to be under 20 characters to ensure that even with hashing, suffixes fit comfortably.
  • Immutable Fields: Some fields like zone and region in Cell definitions are immutable after creation.
  • Append-Only Pools and Cells: Pools and cells cannot be renamed or removed from a cluster. This prevents orphaned pods and stale etcd registrations.

Further Reading

Resource Description
Operator Capability Levels Maturity assessment against the Operator Framework capability model
Webhook Defaults & GitOps How the webhook materialises defaults, dynamic cell resolution, and GitOps compatibility
Durability Policy Configurable replication quorum: AT_LEAST_2 (default) and MULTI_CELL_AT_LEAST_2 for cross-AZ durability
External Admin Web External exposure for the multiadmin-web Service
PostgreSQL Configuration Custom postgresql.conf overrides via ConfigMap reference
Storage Management PVC deletion policies (Retain/Delete) and volume expansion
Configuration Reference Operator flags, environment variables, and logging
Demos Guided walkthroughs (webhook, cert-manager, observability)
Developer Documentation Internal architecture, controller patterns, caching strategy
Contributing Development setup, local Kind deployment, code style
Changelog Release history

Directories ΒΆ

Path Synopsis
api
v1alpha1
Package v1alpha1 defines the API types for the Multigres Operator.
Package v1alpha1 defines the API types for the Multigres Operator.
cmd
multigres-operator command
The multigres-operator command runs the Kubernetes operator for Multigres distributed database clusters.
The multigres-operator command runs the Kubernetes operator for Multigres distributed database clusters.
pkg
cert
Package cert provides a generic, consumer-agnostic TLS certificate lifecycle manager for Kubernetes operators.
Package cert provides a generic, consumer-agnostic TLS certificate lifecycle manager for Kubernetes operators.
cluster-handler/controller/multigrescluster
Package multigrescluster implements the controller for the root MultigresCluster resource.
Package multigrescluster implements the controller for the root MultigresCluster resource.
cluster-handler/controller/tablegroup
Package tablegroup implements the controller for the TableGroup resource.
Package tablegroup implements the controller for the TableGroup resource.
data-handler/backuphealth
Package backuphealth evaluates the health of database backups by querying the primary pooler via gRPC for backup metadata and determining whether the most recent backup is within acceptable staleness thresholds.
Package backuphealth evaluates the health of database backups by querying the primary pooler via gRPC for backup metadata and determining whether the most recent backup is within acceptable staleness thresholds.
data-handler/drain
Package drain implements the drain state machine for graceful pod removal.
Package drain implements the drain state machine for graceful pod removal.
data-handler/topo
Package topo provides utility functions for interacting with the Multigres global topology server (etcd).
Package topo provides utility functions for interacting with the Multigres global topology server (etcd).
monitoring
Package monitoring provides Prometheus metrics and recording helpers for the Multigres Operator.
Package monitoring provides Prometheus metrics and recording helpers for the Multigres Operator.
resolver
Package resolver provides the central logic for calculating the "Effective Specification" of a MultigresCluster.
Package resolver provides the central logic for calculating the "Effective Specification" of a MultigresCluster.
resource-handler/controller/cell
Package cell implements the controller for the Cell resource.
Package cell implements the controller for the Cell resource.
resource-handler/controller/shard
Package shard implements the controller for the Shard resource.
Package shard implements the controller for the Shard resource.
resource-handler/controller/storage
Package storage provides shared utilities for building Kubernetes storage resources used across the resource-handler controllers.
Package storage provides shared utilities for building Kubernetes storage resources used across the resource-handler controllers.
resource-handler/controller/toposerver
Package toposerver implements the controller for the TopoServer resource.
Package toposerver implements the controller for the TopoServer resource.
testutil
Package testutil provides test utilities for Kubernetes operators.
Package testutil provides test utilities for Kubernetes operators.
util/metadata
Package metadata provides utilities for building Kubernetes resource metadata such as labels, annotations, and owner references used across all Multigres components.
Package metadata provides utilities for building Kubernetes resource metadata such as labels, annotations, and owner references used across all Multigres components.
util/name
Package names generates deterministic, unique names for Kubernetes objects.
Package names generates deterministic, unique names for Kubernetes objects.
util/pvc
Package pvc provides utilities for PVC lifecycle management.
Package pvc provides utilities for PVC lifecycle management.
util/status
Package status provides shared helpers for managing the lifecycle state of Multigres Custom Resources.
Package status provides shared helpers for managing the lifecycle state of Multigres Custom Resources.
webhook
Package webhook provides the entry point for configuring Kubernetes admission webhooks for the Multigres Operator.
Package webhook provides the entry point for configuring Kubernetes admission webhooks for the Multigres Operator.
webhook/handlers
Package handlers implements the specific business logic for Kubernetes Admission Control.
Package handlers implements the specific business logic for Kubernetes Admission Control.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL