caesium

command module
v0.0.0-...-511e509
Published: Apr 11, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

README

Open-source distributed job scheduler with DAG pipelines, multi-runtime support, and an embedded web UI

Caesium lets you define jobs as declarative YAML DAGs, run them on Docker, Podman, or Kubernetes, and operate them through a REST API, Prometheus metrics, an embedded React UI, and an optional GraphQL endpoint when API-key auth is disabled.

Local Developer Experience

Caesium is designed so job authors can validate, visualize, and execute pipelines locally before pushing them to a server.

Validate definitions

caesium test --path jobs/ --verbose

Use --check-images to verify local image availability.

Run executable harness scenarios

caesium test --scenario ./harness

Harness scenario files use the Harness kind and let you assert run status, task status, output fragments, schema-violation counts, cache hits, log content, Prometheus metric values, and emitted OpenLineage events against a real local execution.

Visualize a DAG

caesium job preview --path jobs/fanout-join.job.yaml

Run locally

caesium dev --once --path jobs/nightly-etl.job.yaml

caesium dev without --once watches YAML files and re-runs the DAG on save. The local runner uses an in-memory SQLite database and the same execution engine as the server path.

Quick Start

1. Write a job definition

apiVersion: v1
kind: Job
metadata:
  alias: nightly-etl
trigger:
  type: cron
  configuration:
    cron: "0 2 * * *"
    timezone: "UTC"
steps:
  - name: extract
    image: alpine:3.20
    command: ["sh", "-c", "echo extracting"]
  - name: transform
    image: alpine:3.20
    command: ["sh", "-c", "echo transforming"]
  - name: load
    image: alpine:3.20
    command: ["sh", "-c", "echo loading"]

2. Validate and preview it

caesium test --path jobs/ --verbose
caesium test --scenario ./harness
caesium job preview --path jobs/nightly-etl.job.yaml
caesium job lint --path jobs/

3. Run it locally

caesium dev --once --path jobs/nightly-etl.job.yaml

4. Start the server and apply definitions

# Start the server
just run

# Apply definitions
caesium job apply --path jobs/ --server http://localhost:8080

Features

  • Declarative YAML job definitions with validation, diffing, schema reporting, and Git sync.
  • DAG execution with fan-out, fan-in, retry controls, trigger rules, and run parameters.
  • Docker, Podman, and Kubernetes task runtimes.
  • Cron and HTTP triggers.
  • Distributed execution backed by dqlite, including mixed amd64 and arm64 clusters.
  • Embedded operator UI with live run updates, DAG inspection, backfill controls, and log streaming.
  • Smart incremental execution: cache task results and skip re-execution when inputs are unchanged.
  • OpenLineage event emission.
  • Prometheus metrics plus optional in-browser operator tools for server logs and database inspection.
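
The incremental-execution bullet above can be illustrated with a minimal Go sketch. It assumes that a task's result depends only on its image, command, and run parameters, and that results are kept in an in-memory map. The taskInputs type and the cacheKey and runCached helpers are hypothetical names for illustration; Caesium's actual cache key and storage are not described in this README.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// taskInputs is a hypothetical description of everything that could
// affect a task's result.
type taskInputs struct {
	Image   string
	Command []string
	Params  map[string]string
}

// cacheKey hashes the inputs in a stable order so that equal inputs
// always produce the same key.
func cacheKey(in taskInputs) string {
	h := sha256.New()
	fmt.Fprintf(h, "image=%q\n", in.Image)
	fmt.Fprintf(h, "command=%q\n", in.Command)
	keys := make([]string, 0, len(in.Params))
	for k := range in.Params {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort first
	for _, k := range keys {
		fmt.Fprintf(h, "param %q=%q\n", k, in.Params[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// runCached calls run only when no result is stored for these inputs.
func runCached(cache map[string]string, in taskInputs, run func() string) (result string, hit bool) {
	key := cacheKey(in)
	if out, ok := cache[key]; ok {
		return out, true
	}
	out := run()
	cache[key] = out
	return out, false
}

func main() {
	cache := map[string]string{}
	in := taskInputs{Image: "alpine:3.20", Command: []string{"sh", "-c", "echo extracting"}}
	run := func() string { return "extracting" }

	_, hit := runCached(cache, in, run)
	fmt.Println("first run cache hit:", hit)
	_, hit = runCached(cache, in, run)
	fmt.Println("second run cache hit:", hit)
}
```

Sorting the parameter names before hashing keeps the key deterministic, which is what lets an unchanged task be recognized across runs.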

Server Workflow

Prerequisites

Run the server

just run

The API and embedded UI are served from http://localhost:8080.

Using Podman

Set the following environment variable to use Podman instead of Docker:

export CAESIUM_PODMAN=true
just run

When CAESIUM_PODMAN=true, Caesium defaults to the rootless Podman socket at $XDG_RUNTIME_DIR/podman/podman.sock (or /run/user/$UID/podman/podman.sock when XDG_RUNTIME_DIR is unset) and uses the podman CLI. Override either if your setup differs:

export CAESIUM_SOCK=/custom/path/podman.sock
export CAESIUM_CONTAINER_CLI=podman
just run

| Variable | Default | Description |
| --- | --- | --- |
| CAESIUM_PODMAN | false | Prefix image references with localhost/ for Podman's local image store |
| CAESIUM_CONTAINER_CLI | docker, or podman when CAESIUM_PODMAN=true | Container CLI used by just recipes |
| CAESIUM_SOCK | /var/run/docker.sock, or $XDG_RUNTIME_DIR/podman/podman.sock when CAESIUM_PODMAN=true | Host-side container socket to mount into the container |
| CAESIUM_PORT | 8080 | Host port to expose the server on |
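
The socket defaults described above can be restated as a small Go function. This is only a sketch of the documented precedence (explicit CAESIUM_SOCK, then the Docker default, then the rootless Podman paths). The resolveSocket helper is a hypothetical name, and the project may implement this logic elsewhere, for example in its just recipes.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"strconv"
)

// resolveSocket returns the container socket path following the
// documented defaults. getenv and uid are injected so the function
// can be exercised without touching the real environment.
func resolveSocket(getenv func(string) string, uid int) string {
	// An explicit override always wins.
	if sock := getenv("CAESIUM_SOCK"); sock != "" {
		return sock
	}
	// Docker is the default runtime.
	if getenv("CAESIUM_PODMAN") != "true" {
		return "/var/run/docker.sock"
	}
	// Rootless Podman: prefer XDG_RUNTIME_DIR, else /run/user/$UID.
	if dir := getenv("XDG_RUNTIME_DIR"); dir != "" {
		return path.Join(dir, "podman", "podman.sock")
	}
	return path.Join("/run/user", strconv.Itoa(uid), "podman", "podman.sock")
}

func main() {
	fmt.Println(resolveSocket(os.Getenv, os.Getuid()))
}
```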

Load example jobs

just hydrate

Trigger a run manually

curl -X POST http://localhost:8080/v1/jobs/<job-id>/run

Backfill a cron job

caesium backfill create \
  --job-id <job-id> \
  --start 2026-03-01T00:00:00Z \
  --end 2026-03-03T00:00:00Z \
  --server http://localhost:8080

Job Definitions

Jobs use the apiVersion / kind / metadata / trigger / steps schema. For full authoring guidance see docs/job-definitions.md and the generated reference in docs/job-schema-reference.md.

Useful CLI commands:

caesium job lint --path ./jobs
caesium job diff --path ./jobs
caesium job apply --path ./jobs --server http://localhost:8080
caesium job schema --doc
caesium run retry-callbacks --job-id <job-id> --run-id <run-id>

Building and Testing

Runtime images are published as multi-arch Docker manifests. docker pull caesiumcloud/caesium:<tag> resolves to the native architecture automatically.

| Command | Description |
| --- | --- |
| just build | Build a release image for the host platform |
| CAESIUM_PLATFORM=linux/arm64 just build | Cross-build for a specific architecture |
| just build-cross linux/arm64 | Cross-build a single platform with buildx |
| just build-multiarch tag=<tag> | Build and push a multi-arch manifest |
| just unit-test | Run Go unit tests with race detector and coverage |
| just ui-test | Run UI unit tests and bundle budget checks |
| just ui-e2e | Run Playwright against the embedded UI and a real Caesium server |
| just integration-test | Run integration tests |
| just helm-lint | Validate the Helm chart |

Supported runtime image targets:

  • linux/amd64
  • linux/arm64

Operator Tools

The embedded UI exposes a few optional power-user surfaces:

  • Server log console: enabled by CAESIUM_LOG_CONSOLE_ENABLED=true and backed by GET /v1/logs/stream, GET /v1/logs/level, and PUT /v1/logs/level.
  • Database console: enabled by CAESIUM_DATABASE_CONSOLE_ENABLED=true and backed by GET /v1/database/schema and POST /v1/database/query.
  • Worker inspection: GET /v1/nodes/:address/workers.
  • Fleet-level stats: GET /v1/stats.

API Reference

The server exposes a REST API on port 8080. GraphQL is available at GET /gql only when CAESIUM_AUTH_MODE=none. When API-key auth is enabled, authentication in this release applies only to the REST API, /metrics, and the embedded UI. Webhook delivery continues to use per-trigger webhook signature configuration rather than bearer tokens. The UI determines whether login is required by calling the explicit GET /auth/status endpoint rather than by probing protected resources.

| Endpoint | Purpose |
| --- | --- |
| GET /health | Health check |
| GET /auth/status | Report whether API-key auth is enabled for the UI |
| GET /metrics | Prometheus metrics (viewer auth required when CAESIUM_AUTH_MODE=api-key) |
| GET /gql | GraphQL endpoint when CAESIUM_AUTH_MODE=none |
| GET /v1/jobs | List jobs |
| GET /v1/jobs/:id | Get one job |
| GET /v1/jobs/:id/tasks | List persisted task definitions for a job |
| GET /v1/jobs/:id/dag | Retrieve DAG nodes and edges |
| POST /v1/jobs/:id/run | Trigger a new run |
| PUT /v1/jobs/:id/pause | Pause a job |
| PUT /v1/jobs/:id/unpause | Unpause a job |
| GET /v1/jobs/:id/runs | List runs for a job |
| GET /v1/jobs/:id/runs/:run_id | Get one run |
| GET /v1/jobs/:id/runs/:run_id/logs?task_id=<task-id> | Stream or retrieve task logs |
| POST /v1/jobs/:id/runs/:run_id/callbacks/retry | Retry failed callbacks |
| POST /v1/jobs/:id/backfill | Start a backfill |
| GET /v1/jobs/:id/backfills | List backfills |
| PUT /v1/jobs/:id/backfills/:backfill_id/cancel | Cancel a backfill |
| POST /v1/jobdefs/apply | Apply one or more job definitions |
| GET /v1/triggers | List triggers |
| GET /v1/atoms | List atoms |
| GET /v1/events | Subscribe to lifecycle events over SSE |
| GET /v1/stats | Get aggregated job/run statistics |
| GET /v1/nodes/:address/workers | Inspect worker state for one node |

The log and database console endpoints are intentionally gated by environment variables because they are operator-facing debugging features rather than default public APIs.

For the auth management CLI, prefer supplying credentials through CAESIUM_API_KEY; the --api-key flag remains available but is visible in process listings.
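
A client tool could follow the same guidance with a lookup like the Go sketch below. It reads CAESIUM_API_KEY and accepts an --api-key flag, as named above. The precedence (an explicit flag overrides the environment variable) and the resolveAPIKey helper are assumptions for illustration, not documented Caesium behavior.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveAPIKey returns the key to use and where it came from.
// Assumption: an explicit flag value wins over the environment variable.
func resolveAPIKey(flagValue string, getenv func(string) string) (key, source string) {
	if flagValue != "" {
		return flagValue, "flag"
	}
	if v := getenv("CAESIUM_API_KEY"); v != "" {
		return v, "env"
	}
	return "", "none"
}

func main() {
	apiKey := flag.String("api-key", "", "API key (visible in process listings; prefer CAESIUM_API_KEY)")
	flag.Parse()

	_, source := resolveAPIKey(*apiKey, os.Getenv)
	fmt.Println("credential source:", source)
	if source == "flag" {
		fmt.Fprintln(os.Stderr, "warning: --api-key is visible in process listings")
	}
}
```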

When CAESIUM_AUTH_MODE=api-key, you must also set CAESIUM_AUTH_KEY_HASH_SECRET to a long random server-side secret. New and rotated API keys are stored as HMAC-SHA256 hashes derived from that secret. Existing legacy SHA-256 key hashes continue to validate after upgrade so you can roll the change out safely, but you should rotate those keys so the database no longer contains legacy unkeyed hashes.
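
The difference between keyed and legacy hashes can be sketched in Go as follows. The example assumes hex-encoded digests and that the raw API key is the HMAC message. The keyedHash, legacyHash, and verify helpers are hypothetical, and Caesium's actual storage format and derivation may differ.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// keyedHash computes HMAC-SHA256 of the API key under the server secret.
func keyedHash(secret, apiKey string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(apiKey))
	return hex.EncodeToString(mac.Sum(nil))
}

// legacyHash computes the unkeyed SHA-256 digest of the API key.
func legacyHash(apiKey string) string {
	sum := sha256.Sum256([]byte(apiKey))
	return hex.EncodeToString(sum[:])
}

// verify accepts a key that matches the stored hash under either scheme
// and reports whether the match was a legacy one that should be rotated.
// hmac.Equal compares in constant time.
func verify(secret, apiKey, stored string) (ok, legacy bool) {
	if hmac.Equal([]byte(keyedHash(secret, apiKey)), []byte(stored)) {
		return true, false
	}
	if hmac.Equal([]byte(legacyHash(apiKey)), []byte(stored)) {
		return true, true
	}
	return false, false
}

func main() {
	secret := "example-server-secret" // stands in for CAESIUM_AUTH_KEY_HASH_SECRET
	stored := legacyHash("old-key")   // a hash written before the upgrade

	ok, legacy := verify(secret, "old-key", stored)
	fmt.Println("valid:", ok, "needs rotation:", legacy)
}
```

An unkeyed digest can be tested against candidate keys by anyone who obtains the database, while the keyed form also requires the server secret, which is why rotating legacy keys is recommended.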

Documentation

| Guide | Description |
| --- | --- |
| docs/README.md | Documentation index |
| docs/job-definitions.md | Authoring, linting, diffing, and applying manifests |
| docs/job-schema-reference.md | Generated schema reference |
| docs/backfill.md | Backfill API, CLI, and UI behavior |
| docs/parallel-execution-operations.md | Distributed execution configuration and troubleshooting |
| docs/open_lineage.md | OpenLineage transport and configuration |
| docs/kubernetes-deployment.md | Helm-based Kubernetes deployment |
| docs/ui_implementation_plan.md | Embedded UI implementation status and remaining gaps |

Contributing

See CONTRIBUTING.md for setup, development workflow, and PR guidance.

License

See LICENSE for details.

Directories

| Path | Synopsis |
| --- | --- |
| api | |
| api/gql | |
| cmd | |
| cmd/dev | Package dev implements the caesium dev command for local DAG development. |
| cmd/job | |
| cmd/run | |
| cmd/test | Package test implements the caesium test command for dry-run DAG validation. |
| internal | |
| internal/dag | Package dag computes topology metrics for job DAGs. |
| internal/dagrender | Package dagrender renders DAG visualizations in ASCII. |
| internal/imagecheck | Package imagecheck verifies container image availability locally. |
| internal/job | |
| internal/jobdef | Package jobdef provides job definition utilities including collection and import of YAML manifests. |
| internal/localrun | Package localrun executes job DAGs locally without a running server. |
| internal/run | |
| pkg | |
| pkg/db | |
| pkg/env | |
| pkg/log | |
| tools | |
| tools/client (command) | |
