github-path-reconciler/

v0.9.1
Published: Nov 13, 2025 License: Apache-2.0


GitHub Path Reconciler Module

This module creates a GitHub path reconciliation system that monitors file paths in a GitHub repository and reconciles them when they change. It combines a regional-go-reconciler with both periodic (cron-based) and event-driven (push-based) reconciliation.

Usage

module "path-reconciler" {
  source = "chainguard-dev/terraform-infra-common//modules/github-path-reconciler"

  project_id     = var.project_id
  name           = "my-path-reconciler"
  primary-region = "us-central1"
  regions        = var.regions

  service_account = google_service_account.reconciler.email

  # Container configuration
  containers = {
    reconciler = {
      source = {
        working_dir = path.module
        importpath  = "./cmd/reconciler"
      }
      ports = [{
        container_port = 8080
      }]
      env = [{
        name  = "OCTO_IDENTITY"
        value = "my-reconciler"
      }]
    }
  }

  # Path patterns to match (with exactly one capture group each).
  # Patterns are anchored with ^ and $ automatically, so the anchors are omitted.
  path_patterns = [
    "(configs/.+\\.yaml)", # Match YAML files in configs/
    "(deployments/.+)",    # Match everything in deployments/
  ]

  # Repository configuration
  github_owner      = "my-org"
  github_repo       = "my-repo"
  octo_sts_identity = "my-reconciler"

  # Event broker for push notifications
  broker = var.github_events_broker

  # Resync every 6 hours
  resync_period_hours = 6

  notification_channels = var.notification_channels
  team                  = "platform"
  product               = "infrastructure"
}

Features

  • Path Pattern Matching: Define regex patterns to match specific file paths
  • Dual Reconciliation Modes:
    • Event-Driven: Responds immediately to push events with high priority
    • Periodic: Full repository scan on a configurable schedule
  • Built-in Workqueue: Integrated workqueue with priority support
  • Regional Deployment: Deploy reconciler services across multiple regions
  • Pausable: Single control to pause both cron and push listeners

Architecture

The module creates:

  1. Reconciler Service (via regional-go-reconciler):

    • Implements the workqueue service protocol
    • Processes path reconciliation requests
    • Deployed across all configured regions
  2. Cron Job (periodic reconciliation):

    • Runs on a schedule (configurable in hours)
    • Fetches all files from the repository at HEAD
    • Matches files against path patterns
    • Enqueues matched paths with time-bucketed delays (priority 0)
  3. Push Listener (event-driven reconciliation):

    • Subscribes to GitHub push events via CloudEvents
    • Compares commits to find changed files
    • Matches changed files against path patterns
    • Enqueues matched paths immediately (priority 100)

Path Patterns

Path patterns are regular expressions with exactly one capture group. The captured portion becomes the path in the resource URL.

Note: Patterns are automatically anchored with ^ and $, ensuring full-path matching. Do not include these anchors in your patterns.

Examples:

path_patterns = [
  # Match all files (entire path)
  "(.+)",

  # Match only YAML files (entire path)
  "(.+\\.yaml)",

  # Match files in a specific directory (entire path)
  "(infrastructure/.+)",
]

The module will create resource URLs in the format:

https://github.com/{owner}/{repo}/blob/{branch}/{captured_path}
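
To make the anchoring and capture-group rules concrete, the following sketch matches a file path against one pattern and builds the resource URL from the captured portion. It assumes Go's regexp (RE2) syntax and wraps the pattern as `^(?:...)$`; how the module itself applies the anchors is not shown in this README. The `resourceURL` helper and the `main` branch name are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// resourceURL anchors pattern, matches it against path, and builds the blob
// URL from the single capture group. ok is false when path does not match.
func resourceURL(pattern, owner, repo, branch, path string) (u string, ok bool, err error) {
	// Wrap in a non-capturing group so alternations are anchored as a whole.
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return "", false, err
	}
	if n := re.NumSubexp(); n != 1 {
		return "", false, fmt.Errorf("pattern %q has %d capture groups, want exactly 1", pattern, n)
	}
	m := re.FindStringSubmatch(path)
	if m == nil {
		return "", false, nil
	}
	// m[1] is the captured portion that becomes the path in the URL.
	return fmt.Sprintf("https://github.com/%s/%s/blob/%s/%s", owner, repo, branch, m[1]), true, nil
}

func main() {
	// In Terraform the pattern is written "(.+\\.yaml)"; a Go raw string needs one backslash.
	u, ok, err := resourceURL(`(.+\.yaml)`, "my-org", "my-repo", "main", "configs/app.yaml")
	fmt.Println(u, ok, err)
}
```

Because the capture group's content is what ends up in the URL, capturing the entire path (as in the examples above) keeps the resulting URL pointing at the real file.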

Reconciler Implementation

Your reconciler should implement the workqueue protocol. The key of each request is the GitHub resource URL of a matched file path:

import (
    "context"

    "github.com/chainguard-dev/clog"
    "github.com/chainguard-dev/terraform-infra-common/pkg/githubreconciler"
    "github.com/chainguard-dev/terraform-infra-common/pkg/workqueue"
)

func (r *Reconciler) Process(ctx context.Context, req *workqueue.ProcessRequest) (*workqueue.ProcessResponse, error) {
    log := clog.FromContext(ctx)

    // Parse the GitHub URL from the key
    res, err := githubreconciler.ParseResource(req.Key)
    if err != nil {
        return nil, err
    }

    log.Infof("Reconciling path: %s in %s/%s", res.Path, res.Owner, res.Repo)

    // Your reconciliation logic here
    // ...

    return &workqueue.ProcessResponse{}, nil
}
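
For unit tests or quick experiments it can help to see what such a key looks like when taken apart. The sketch below is a standard-library stand-in, not the behavior of `githubreconciler.ParseResource`: it assumes keys follow the `https://github.com/{owner}/{repo}/blob/{branch}/{captured_path}` format and that the branch name contains no slash. The `resource` type and `parseBlobURL` function are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// resource holds the pieces of a blob URL (hypothetical type for this sketch).
type resource struct {
	Owner, Repo, Branch, Path string
}

// parseBlobURL splits a key of the form
// https://github.com/{owner}/{repo}/blob/{branch}/{path} into its parts.
// Assumption: the branch name contains no "/".
func parseBlobURL(key string) (resource, error) {
	u, err := url.Parse(key)
	if err != nil {
		return resource{}, err
	}
	if u.Host != "github.com" {
		return resource{}, fmt.Errorf("unexpected host %q", u.Host)
	}
	// Limit the split to 5 parts so slashes inside the file path are preserved.
	parts := strings.SplitN(strings.TrimPrefix(u.Path, "/"), "/", 5)
	if len(parts) != 5 || parts[2] != "blob" {
		return resource{}, fmt.Errorf("not a blob URL: %q", key)
	}
	for _, p := range parts {
		if p == "" {
			return resource{}, fmt.Errorf("empty path segment in %q", key)
		}
	}
	return resource{Owner: parts[0], Repo: parts[1], Branch: parts[3], Path: parts[4]}, nil
}

func main() {
	res, err := parseBlobURL("https://github.com/my-org/my-repo/blob/main/configs/app.yaml")
	fmt.Printf("%+v %v\n", res, err)
}
```

In a real reconciler, prefer the library call shown above; this sketch only illustrates the shape of the key.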

Reconciliation Triggers

Periodic (Cron)

  • Runs every resync_period_hours (1-24 hours)
  • Fetches complete repository tree at HEAD
  • Uses time-bucketed delays to spread load across the period
  • Priority: 0 (normal)

Push Events

  • Triggers on GitHub push events
  • Uses CompareCommits API to get all changed files
  • Handles all merge strategies (merge commits, squash, rebase)
  • Priority: 100 (immediate)

Safe Rollout Process

To safely deploy a new path reconciler, follow these steps:

  1. Initial Deployment - Deploy with paused = true and deletion_protection = false:

    module "my-reconciler" {
      # ... other configuration ...
      paused              = true
      deletion_protection = false
    }
    
  2. Create Octo STS Identity - After applying, use the service account's unique_id output to create the Octo STS identity in the GitHub organization. This grants the reconciler access to the GitHub API.

  3. Unpause - Once the Octo STS identity is configured, set paused = false and apply:

    paused = false
    
  4. Enable Protection - After verifying the reconciler works correctly and you're confident you won't need to tear it down quickly, enable deletion protection:

    deletion_protection = true
    

Variables

See variables.tf for all available configuration options.

Key variables:

  • path_patterns: List of regex patterns (each with one capture group)
  • github_owner, github_repo: Repository to monitor
  • octo_sts_identity: Octo STS identity for GitHub authentication
  • broker: Map of region to CloudEvents broker topic
  • resync_period_hours: How often to run full reconciliation (1-24)
  • paused: Pause both cron and push listeners
  • deletion_protection: Prevent accidental deletion (disable during initial rollout)

Requirements

No requirements.

Providers

No providers.

Modules

Name                           Source                        Version
authorize-receiver-per-region  ../authorize-private-service  n/a
cron                           ../cron                       n/a
push-listener                  ../regional-go-service        n/a
push-subscription              ../cloudevent-trigger         n/a
reconciler                     ../regional-go-reconciler     n/a

Resources

No resources.

Inputs

Name Description Type Default Required
broker A map from each of the input region names to the name of the Broker topic in that region. map(string) n/a yes
concurrent-work The amount of concurrent work to dispatch at a given time. number 20 no
containers The containers to run in the service. Each container will be run in each region.
map(object({
source = object({
base_image = optional(string, "cgr.dev/chainguard/static:latest-glibc@sha256:d44809cee093b550944c1f666ff13301f92484bfdd2e53ecaac82b5b6f89647d")
working_dir = string
importpath = string
env = optional(list(string), [])
})
args = optional(list(string), [])
ports = optional(list(object({
name = optional(string, "h2c")
container_port = number
})), [])
resources = optional(
object(
{
limits = optional(object(
{
cpu = string
memory = string
}
), null)
cpu_idle = optional(bool)
startup_cpu_boost = optional(bool, true)
}
),
{}
)
env = optional(list(object({
name = string
value = optional(string)
value_source = optional(object({
secret_key_ref = object({
secret = string
version = string
})
}), null)
})), [])
regional-env = optional(list(object({
name = string
value = map(string)
})), [])
regional-cpu-idle = optional(map(bool), {})
volume_mounts = optional(list(object({
name = string
mount_path = string
})), [])
startup_probe = optional(object({
initial_delay_seconds = optional(number)
timeout_seconds = optional(number, 240)
period_seconds = optional(number, 240)
failure_threshold = optional(number, 1)
tcp_socket = optional(object({
port = optional(number)
}), null)
grpc = optional(object({
port = optional(number)
service = optional(string)
}), null)
}), null)
liveness_probe = optional(object({
initial_delay_seconds = optional(number)
timeout_seconds = optional(number)
period_seconds = optional(number)
failure_threshold = optional(number)
http_get = optional(object({
path = optional(string)
http_headers = optional(list(object({
name = string
value = string
})), [])
}), null)
grpc = optional(object({
port = optional(number)
service = optional(string)
}), null)
}), null)
}))
{} no
deletion_protection Whether to enable delete protection for the service. bool true no
egress Which type of egress traffic to send through the VPC.

- ALL_TRAFFIC sends all traffic through regional VPC network. This should be used if service is not expected to egress to the Internet.
- PRIVATE_RANGES_ONLY sends only traffic to private IP addresses through regional VPC network
string "ALL_TRAFFIC" no
enable_profiler Enable continuous profiling for the service. This has a small performance impact, which shouldn't matter for production services. bool true no
execution_environment The execution environment for the service (options: EXECUTION_ENVIRONMENT_GEN1, EXECUTION_ENVIRONMENT_GEN2). string "EXECUTION_ENVIRONMENT_GEN2" no
github_owner GitHub organization or user string n/a yes
github_repo GitHub repository name string n/a yes
labels Additional labels to add to all resources. map(string) {} no
max-retry The maximum number of times a task will be retried before being moved to the dead-letter queue. Set to 0 for unlimited retries. number 100 no
multi_regional_location The multi-regional location for the global workqueue bucket. Options: US, EU, ASIA. string "US" no
name n/a string n/a yes
notification_channels The channels to send notifications to. List of channel IDs list(string) [] no
octo_sts_identity Octo STS identity for GitHub authentication string n/a yes
otel_resources Resources to add to the OpenTelemetry resource. map(string) {} no
path_patterns List of regex patterns with one capture group each for matching paths list(string) n/a yes
paused Whether to pause both the cron service and push listener bool false no
primary-region The primary region to run the cron job in string n/a yes
product The product that this service belongs to. string "" no
project_id n/a string n/a yes
regional-volumes The volumes to make available to the containers in the service for mounting.
list(object({
name = string
gcs = optional(map(object({
bucket = string
read_only = optional(bool, true)
mount_options = optional(list(string), [])
})), {})
nfs = optional(map(object({
server = string
path = string
read_only = optional(bool, true)
})), {})
}))
[] no
regions A map from region names to a network and subnetwork. A service will be created in each region configured to egress the specified traffic via the specified subnetwork.
map(object({
network = string
subnet = string
}))
n/a yes
request_timeout_seconds The request timeout for the service in seconds. number 300 no
resync_period_hours How often to resync all paths (in hours, must be between 1 and 24) number n/a yes
scaling The scaling configuration for the service.
object({
min_instances = optional(number, 0)
max_instances = optional(number, 100)
max_instance_request_concurrency = optional(number, 1000)
})
{} no
service_account The service account as which to run the reconciler service. string n/a yes
slo Configuration for setting up SLO for the cloud run service
object({
enable = optional(bool, false)
enable_alerting = optional(bool, false)
success = optional(object(
{
multi_region_goal = optional(number, 0.999)
per_region_goal = optional(number, 0.999)
}
), null)
monitor_gclb = optional(bool, false)
})
{} no
team Team label to apply to resources (replaces deprecated 'squad'). string "" no
volumes The volumes to attach to the service.
list(object({
name = string
empty_dir = optional(object({
medium = optional(string, "MEMORY")
size_limit = optional(string, "1Gi")
}), null)
csi = optional(object({
driver = string
volume_attributes = optional(object({
bucketName = string
}), null)
}), null)
}))
[] no
workqueue_cpu_idle Set to false for a region in order to use instance-based billing for workqueue services (dispatcher and receiver). Defaults to true. To control reconciler cpu_idle, use the 'regional-cpu-idle' field in the 'containers' variable. map(map(bool))
{
"dispatcher": {},
"receiver": {}
}
no

Outputs

No outputs.

Directories

Path Synopsis
cmd
push command
resync command
internal
