shard

package
v0.10.3 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 12, 2026 License: Apache-2.0 Imports: 45 Imported by: 0

Documentation

Overview

Package shard implements the controller for the Shard resource.

The Shard controller is the most complex controller in the operator, responsible for managing all infrastructure required to run a database shard. It creates and maintains the following components:

PostgreSQL Pools

For each pool defined in the Shard spec, the controller creates:

  • Pod: Runs PostgreSQL replicas (operator-managed) with proper volume claims and configuration.
  • Headless Service: Enables direct pod addressing for replication.
  • Backup PVC: Shared persistent volume for backup storage (when configured).

Pools can span multiple cells for high availability. In multi-cell configurations, the controller creates separate Pods per cell while maintaining a unified view in the Shard status.

MultiOrch Orchestrator

The controller manages the MultiOrch component which handles:

  • Leader election and failover for PostgreSQL
  • Replication topology management
  • Health monitoring and automatic recovery

For each cell where the shard operates, a MultiOrch Deployment and Service are created.

Configuration Management

The controller creates ConfigMaps for PostgreSQL configuration:

  • pg_hba.conf template: Authentication rules for client connections.

Status Aggregation

The controller continuously monitors all managed resources and aggregates their status into the Shard's status fields, including:

  • Ready/Total pod counts across all pools
  • List of cells where the shard is deployed
  • Orchestrator readiness per cell
  • Overall phase (Initializing, Progressing, Healthy)

Constants

const (
	// DataVolumeName is the name of the data volume for PostgreSQL
	DataVolumeName = "pgdata"

	// DataMountPath is where the PVC is mounted
	// Mounted at parent directory because mounting directly at pg_data/ prevents
	// initdb from setting directory permissions (non-root can't chmod mount points).
	// pgctld creates pg_data/ subdirectory with proper 0700/0750 permissions.
	DataMountPath = "/var/lib/pooler"

	// PgDataPath is the actual postgres data directory (PGDATA env var value)
	// pgctld expects postgres data at <pooler-dir>/pg_data
	PgDataPath = "/var/lib/pooler/pg_data"

	// PoolerDirMountPath must equal DataMountPath because both containers share the PVC
	// and pgctld derives postgres data directory as <pooler-dir>/pg_data
	PoolerDirMountPath = "/var/lib/pooler"

	// SocketDirVolumeName is the name of the shared volume for unix sockets
	SocketDirVolumeName = "socket-dir"

	// SocketDirMountPath is the mount path for unix sockets (postgres and pgctld communicate here)
	// We use /var/run/postgresql because that is the default socket directory for the official postgres image.
	SocketDirMountPath = "/var/run/postgresql"

	// BackupVolumeName is the name of the backup volume for pgbackrest
	BackupVolumeName = "backup-data"

	// BackupMountPath is where the backup volume is mounted
	// pgbackrest stores backups here via --repo1-path
	BackupMountPath = "/backups"

	// PgHbaVolumeName is the name of the volume for pg_hba template
	PgHbaVolumeName = "pg-hba-template"

	// PgHbaMountPath is where the pg_hba template is mounted
	PgHbaMountPath = "/etc/pgctld"

	// PgHbaTemplatePath is the full path to the pg_hba template file
	PgHbaTemplatePath = PgHbaMountPath + "/pg_hba_template.conf"

	// PostgresConfigVolumeName is the name of the volume for the postgresql.conf template
	PostgresConfigVolumeName = "postgres-config-template"

	// PostgresConfigMountPath is where the postgresql.conf template is mounted
	PostgresConfigMountPath = "/etc/pgctld/postgres"

	// PostgresConfigFilePath is the full path to the postgresql.conf template file
	PostgresConfigFilePath = PostgresConfigMountPath + "/postgresql.conf.tmpl"

	// PostgresPasswordSecretKey is the key within the Secret that holds the password
	PostgresPasswordSecretKey = "password"

	// PgBackRestCertVolumeName is the name of the volume for pgBackRest TLS certificates
	PgBackRestCertVolumeName = "pgbackrest-certs"

	// PgBackRestCertMountPath is where pgBackRest TLS certificates are mounted
	PgBackRestCertMountPath = "/certs/pgbackrest"

	// PgBackRestPort is the port for the pgBackRest TLS server
	PgBackRestPort = 8432
)
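The comments above describe two invariants among the path constants: PoolerDirMountPath must equal DataMountPath, and pgctld derives the data directory as <pooler-dir>/pg_data. A minimal sketch of checking them is shown below. The lowercase constants are local copies of the documented values, and `derivedPgData` is a hypothetical helper standing in for the derivation pgctld is described as performing.

```go
package main

import (
	"fmt"
	"path"
)

// Local copies of the documented constant values.
const (
	dataMountPath      = "/var/lib/pooler"
	pgDataPath         = "/var/lib/pooler/pg_data"
	poolerDirMountPath = "/var/lib/pooler"
)

// derivedPgData is a hypothetical helper that mirrors the documented
// derivation of the postgres data directory: <pooler-dir>/pg_data.
func derivedPgData(poolerDir string) string {
	return path.Join(poolerDir, "pg_data")
}

func main() {
	// Both containers share the PVC, so the two mount paths must match.
	fmt.Println(poolerDirMountPath == dataMountPath)
	// The derived directory must match the PGDATA value.
	fmt.Println(derivedPgData(poolerDirMountPath) == pgDataPath)
}
```

If either check printed false, pgctld and the postgres container would disagree about where the data directory lives.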
const (
	// DefaultMultiPoolerHTTPPort is the default port for MultiPooler HTTP traffic.
	DefaultMultiPoolerHTTPPort int32 = 15200

	// DefaultMultiPoolerGRPCPort is the default port for MultiPooler gRPC traffic.
	DefaultMultiPoolerGRPCPort int32 = 15270

	// DefaultPostgresPort is the default port for PostgreSQL protocol traffic.
	DefaultPostgresPort int32 = 5432

	// DefaultPgctldHTTPPort is the default port for pgctld HTTP traffic.
	DefaultPgctldHTTPPort int32 = 15400

	// DefaultMultiOrchHTTPPort is the default port for MultiOrch HTTP traffic.
	DefaultMultiOrchHTTPPort int32 = 15300

	// DefaultMultiOrchGRPCPort is the default port for MultiOrch gRPC traffic.
	DefaultMultiOrchGRPCPort int32 = 15370
)
const (
	// DefaultDataVolumeSize is the minimal viable backup/data size
	DefaultDataVolumeSize = "1Gi"
)
const (

	// DefaultPoolReplicas is the default number of replicas for a pool cell if not specified.
	DefaultPoolReplicas int32 = 1
)
const DefaultPostgresPassword = "postgres"

DefaultPostgresPassword is the default password for the PostgreSQL superuser during v1alpha1. Future versions will support user-supplied or auto-generated credentials.

const (
	// MultiOrchComponentName is the component label value for MultiOrch resources
	MultiOrchComponentName = "multiorch"
)
const (
	// PoolComponentName is the component label value for pool resources
	PoolComponentName = "shard-pool"
)

Variables

var DefaultPgHbaTemplate string

DefaultPgHbaTemplate is the default pg_hba.conf template for pooler instances. Uses trust authentication for testing/development. Production deployments should override this with proper authentication (scram-sha-256, SSL certificates, etc.).

Functions

func BuildMultiOrchDeployment

func BuildMultiOrchDeployment(
	shard *multigresv1alpha1.Shard,
	cellName string,
	scheme *runtime.Scheme,
) (*appsv1.Deployment, error)

BuildMultiOrchDeployment creates a Deployment for the MultiOrch component in a specific cell. For shards spanning multiple cells, this function should be called once per cell. MultiOrch handles orchestration for the shard.

func BuildMultiOrchService

func BuildMultiOrchService(
	shard *multigresv1alpha1.Shard,
	cellName string,
	scheme *runtime.Scheme,
) (*corev1.Service, error)

BuildMultiOrchService creates a Service for the MultiOrch component in a specific cell.

func BuildPgHbaConfigMap

func BuildPgHbaConfigMap(
	shard *multigresv1alpha1.Shard,
	scheme *runtime.Scheme,
) (*corev1.ConfigMap, error)

BuildPgHbaConfigMap creates a ConfigMap containing the pg_hba.conf template. This ConfigMap is shared across all pools in a shard and mounted into postgres containers.

func BuildPoolDataPVC

func BuildPoolDataPVC(
	shard *multigresv1alpha1.Shard,
	poolName string,
	cellName string,
	poolSpec multigresv1alpha1.PoolSpec,
	index int,
	deleteOnShardRemoval bool,
	scheme *runtime.Scheme,
) (*corev1.PersistentVolumeClaim, error)

BuildPoolDataPVC creates a PersistentVolumeClaim for a pool pod's data volume. When deleteOnShardRemoval is true, a controller ownerRef is set so that Kubernetes GC cascade-deletes the PVC when the Shard is removed.

func BuildPoolDataPVCName

func BuildPoolDataPVCName(
	shard *multigresv1alpha1.Shard,
	poolName, cellName string,
	index int,
) string

BuildPoolDataPVCName constructs the PVC name for a specific pod index.

func BuildPoolHeadlessService

func BuildPoolHeadlessService(
	shard *multigresv1alpha1.Shard,
	poolName string,
	cellName string,
	poolSpec multigresv1alpha1.PoolSpec,
	scheme *runtime.Scheme,
) (*corev1.Service, error)

BuildPoolHeadlessService creates a headless Service for a pool's pods in a specific cell. Headless services are required for pool pod DNS records.

func BuildPoolPod

func BuildPoolPod(
	shard *multigresv1alpha1.Shard,
	poolName string,
	cellName string,
	poolSpec multigresv1alpha1.PoolSpec,
	index int,
	scheme *runtime.Scheme,
) (*corev1.Pod, error)

BuildPoolPod creates a Pod for a shard pool in a specific cell at the given replica index. It reuses the existing container and volume builders from containers.go and sets operator-specific metadata (finalizer, spec-hash).

func BuildPoolPodDisruptionBudget

func BuildPoolPodDisruptionBudget(
	shard *multigresv1alpha1.Shard,
	poolName string,
	cellName string,
	scheme *runtime.Scheme,
) (*policyv1.PodDisruptionBudget, error)

BuildPoolPodDisruptionBudget creates a PodDisruptionBudget that limits voluntary evictions to one pod at a time per pool/cell combination.
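To make "one pod at a time" concrete, the sketch below works through the usual PodDisruptionBudget arithmetic for a budget expressed as maxUnavailable. It assumes the builder sets maxUnavailable to 1; the documentation above does not say which field is used. `disruptionsAllowed` is a hypothetical helper, not part of this package.

```go
package main

import "fmt"

// disruptionsAllowed follows the PodDisruptionBudget arithmetic for a
// maxUnavailable budget: desiredHealthy = expected - maxUnavailable, and
// the number of evictions allowed is currentHealthy - desiredHealthy,
// floored at zero.
func disruptionsAllowed(expected, currentHealthy, maxUnavailable int) int {
	desiredHealthy := expected - maxUnavailable
	if desiredHealthy < 0 {
		desiredHealthy = 0
	}
	allowed := currentHealthy - desiredHealthy
	if allowed < 0 {
		return 0
	}
	return allowed
}

func main() {
	// Three healthy pods out of three: one voluntary eviction is allowed.
	fmt.Println(disruptionsAllowed(3, 3, 1))
	// One pod already down: further voluntary evictions are blocked.
	fmt.Println(disruptionsAllowed(3, 2, 1))
}
```

Under that assumption, a node drain can evict one pool pod per pool/cell combination and must wait for it to become healthy again before evicting the next.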

func BuildPoolPodName

func BuildPoolPodName(shard *multigresv1alpha1.Shard, poolName, cellName string, index int) string

BuildPoolPodName constructs the deterministic name for a pool pod at the given index. The base name is generated using PodConstraints (60 chars) and the index is appended as a suffix.

func BuildPoolServiceID

func BuildPoolServiceID(podName string) string

BuildPoolServiceID generates a short, deterministic service ID for a multipooler from its pod name. The pod name is already guaranteed unique (via JoinWithConstraints), so hashing it yields a stable short ID. Because the hash is only 32 bits wide, collisions are very unlikely rather than strictly impossible.

Format: p-{fnv32a_hex} (e.g. "p-a1b2c3d4", always 10 chars). The "p" prefix follows the multigres component naming convention (p=pooler).
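The documented format can be reproduced with the standard library's hash/fnv package. The sketch below is a stand-in, not the package's implementation: `poolServiceID` is a hypothetical name, and it assumes the hex digits are lowercase and zero-padded to eight characters, which is what the "always 10 chars" note implies.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// poolServiceID is a hypothetical stand-in for BuildPoolServiceID.
// Assumption: the ID is "p-" followed by the zero-padded, lowercase hex
// encoding of the 32-bit FNV-1a hash of the pod name.
func poolServiceID(podName string) string {
	h := fnv.New32a()
	_, _ = h.Write([]byte(podName)) // hash.Hash.Write never returns an error
	return fmt.Sprintf("p-%08x", h.Sum32())
}

func main() {
	// The pod name below is made up for the example.
	id := poolServiceID("example-shard-pool-cell1-0")
	fmt.Println(id, len(id))
}
```

The same pod name always maps to the same ID, so the ID survives pod restarts as long as the name does.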

func BuildPostgresPasswordSecret

func BuildPostgresPasswordSecret(
	shard *multigresv1alpha1.Shard,
	scheme *runtime.Scheme,
) (*corev1.Secret, error)

BuildPostgresPasswordSecret creates a Secret containing the PostgreSQL superuser password. Both pgctld and multipooler source their credentials from this Secret via the POSTGRES_PASSWORD env var.

func BuildSharedBackupPVC

func BuildSharedBackupPVC(
	shard *multigresv1alpha1.Shard,
	cellName string,
	deleteOnShardRemoval bool,
	scheme *runtime.Scheme,
) (*corev1.PersistentVolumeClaim, error)

BuildSharedBackupPVC creates a ReadWriteMany PersistentVolumeClaim shared by all pods in the cell. When deleteOnShardRemoval is true, a controller ownerRef is set so that Kubernetes GC cascade-deletes the PVC when the Shard is removed.

func BuildSharedBackupPVCName

func BuildSharedBackupPVCName(shard *multigresv1alpha1.Shard, cellName string) string

BuildSharedBackupPVCName builds the deterministic name for the cell-level shared backup PVC.

func ComputeSpecHash

func ComputeSpecHash(pod *corev1.Pod) string

ComputeSpecHash produces a deterministic FNV-1a hex string over the operator-managed pod spec fields that should trigger a rolling update when changed.

Fields included: images, commands, args, env vars, resources, volume mounts, container security contexts, pod affinity, and node selector.

Hash write errors are discarded throughout because hash.Hash.Write never returns an error per the hash.Hash interface contract (it panics on failure instead).
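A minimal sketch of this kind of hash is shown below, using a trimmed-down stand-in for the pod spec. The `podFields` type, the `specHash` function, the 64-bit hash width, and the NUL field separator are all assumptions for illustration; the real function takes a *corev1.Pod and covers the full field list above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// podFields is a hypothetical, reduced stand-in for the pod spec fields
// that feed the hash.
type podFields struct {
	Image        string
	Args         []string
	NodeSelector map[string]string
}

// specHash sketches a deterministic FNV-1a hash over selected fields.
// Assumptions: 64-bit FNV-1a and a NUL byte written after every field.
func specHash(p podFields) string {
	h := fnv.New64a()
	write := func(s string) {
		_, _ = h.Write([]byte(s)) // Write never returns an error
		_, _ = h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	}
	write(p.Image)
	for _, a := range p.Args {
		write(a)
	}
	// Go randomizes map iteration order, so sort the keys first.
	keys := make([]string, 0, len(p.NodeSelector))
	for k := range p.NodeSelector {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		write(k)
		write(p.NodeSelector[k])
	}
	return fmt.Sprintf("%x", h.Sum64())
}

func main() {
	p := podFields{
		Image:        "postgres:17",
		Args:         []string{"--pooler-dir", "/var/lib/pooler"},
		NodeSelector: map[string]string{"zone": "a", "disk": "ssd"},
	}
	before := specHash(p)
	p.Image = "postgres:18"
	fmt.Println(before != specHash(p))
}
```

Sorting map keys before hashing is the step that keeps the result deterministic; without it, the same spec could hash differently between reconciles and trigger spurious rolling updates.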

func PgHbaConfigMapName

func PgHbaConfigMapName(shardName string) string

PgHbaConfigMapName returns the per-shard ConfigMap name for the pg_hba template.

func PostgresPasswordSecretName

func PostgresPasswordSecretName(shardName string) string

PostgresPasswordSecretName returns the per-shard Secret name for the postgres password.

func ShouldDeletePVCOnShardRemoval

func ShouldDeletePVCOnShardRemoval(
	shard *multigresv1alpha1.Shard,
	poolSpec multigresv1alpha1.PoolSpec,
) bool

ShouldDeletePVCOnShardRemoval returns true when the effective PVCDeletionPolicy for a pool resolves to Delete. Used by PVC builders to conditionally set a controller ownerRef so Kubernetes GC cascade-deletes the PVC with the Shard.

func ShouldDeleteShardLevelPVCOnRemoval

func ShouldDeleteShardLevelPVCOnRemoval(shard *multigresv1alpha1.Shard) bool

ShouldDeleteShardLevelPVCOnRemoval returns true when the shard-level PVCDeletionPolicy resolves to Delete. Used for shared infrastructure PVCs (e.g., backup PVCs) that are not pool-specific.
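One plausible reading of "effective" and "resolves to" in the two functions above is a layered lookup, sketched below. The `policy` type, the precedence order (a pool-level value overrides the shard-level one), and the Retain default are all assumptions for this example; the documentation above does not spell out the resolution rules.

```go
package main

import "fmt"

// policy is a hypothetical stand-in for the PVCDeletionPolicy setting.
type policy string

const (
	policyUnset  policy = ""
	policyRetain policy = "Retain"
	policyDelete policy = "Delete"
)

// effectivePolicy assumes a pool-level value wins over the shard-level
// value, and that an entirely unset policy defaults to Retain.
func effectivePolicy(shardLevel, poolLevel policy) policy {
	if poolLevel != policyUnset {
		return poolLevel
	}
	if shardLevel != policyUnset {
		return shardLevel
	}
	return policyRetain
}

// shouldDelete reports whether the effective policy resolves to Delete,
// which is when a builder would set the controller ownerRef.
func shouldDelete(shardLevel, poolLevel policy) bool {
	return effectivePolicy(shardLevel, poolLevel) == policyDelete
}

func main() {
	fmt.Println(shouldDelete(policyDelete, policyUnset))  // shard-level Delete applies
	fmt.Println(shouldDelete(policyDelete, policyRetain)) // pool-level Retain overrides
	fmt.Println(shouldDelete(policyUnset, policyUnset))   // nothing set, data is kept
}
```

Defaulting to Retain in the sketch reflects the safer choice for stateful data: a PVC without an ownerRef is left in place when the Shard is deleted.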

Types

type ShardReconciler

type ShardReconciler struct {
	client.Client
	Scheme   *runtime.Scheme
	Recorder record.EventRecorder
	// APIReader is an uncached client that reads directly from the API server.
	// The default cached client (r.Get) only sees Secrets labeled with
	// "app.kubernetes.io/managed-by: multigres-operator" due to the informer
	// cache's label filter. External Secrets (e.g., cert-manager) lack this
	// label, so we need APIReader to validate user-provided pgBackRest TLS Secrets.
	APIReader       client.Reader
	RPCClient       rpcclient.MultiPoolerClient
	CreateTopoStore func(*multigresv1alpha1.Shard) (topoclient.Store, error)
}

ShardReconciler reconciles a Shard object.

func (*ShardReconciler) Reconcile

func (r *ShardReconciler) Reconcile(
	ctx context.Context,
	req ctrl.Request,
) (ctrl.Result, error)

Reconcile manages pool pods, PVCs, services, and data-plane topology for a Shard.

func (*ShardReconciler) SetupWithManager

func (r *ShardReconciler) SetupWithManager(mgr ctrl.Manager, opts ...controller.Options) error

SetupWithManager sets up the controller with the Manager.
