= Sandbox
:toc:
Go REST API that manages multi-cloud sandbox provisioning for the Red Hat Demo Platform (RHDP). It allocates cloud resources (AWS accounts, OpenShift namespaces, DNS zones, IBM resource groups) to requesters grouped into *placements*.
This repository contains:
* *sandbox-api*: The backend API for all things sandbox
* *sandbox-cli*: CLI client for admin and operator tasks (preferred over shell scripts)
* *sandbox-issue-jwt*: CLI to generate a new signed JWT login token — DEPRECATED, use `sandbox-cli jwt` instead
* *sandbox-list*: Interact with the AWS Sandbox DB in a read-only way
* *sandbox-metrics*: Prometheus endpoint exposing metrics
* *sandbox-rotate-vault*: Program to re-encrypt IAM keys using a new secret key (ansible-vault, AES256)
== Quick Start
=== Build
You will need Go 1.22 or later.
----
make build # All binaries (sandbox-api, sandbox-cli, etc.)
make sandbox-api # API server only
make sandbox-cli # CLI client only
----
=== Local Development
----
make run-local-pg # Start local PostgreSQL
make migrate # Run DB migrations
make tokens # Generate dev JWT tokens
make run-api # Start API server
----
=== Run Tests
----
make test # Unit tests + OpenAPI linting
----
== sandbox-cli
`sandbox-cli` is the *preferred tool* for admin and operator tasks. It replaces the legacy shell scripts in `tools/` with a single binary that handles authentication, config persistence, and error formatting.
=== Install
----
make sandbox-cli
# Binary at ./build/sandbox-cli
----
=== Usage
----
# First-time setup
sandbox-cli login --server https://sandbox-api.example.com --token <your-login-token>
# Verify connection (shows client/server version, environment, DB migration)
sandbox-cli status
# JWT token management
sandbox-cli jwt list
sandbox-cli jwt issue --name ops-user --role shared-cluster-manager
sandbox-cli jwt issue --name admin-user --role admin
sandbox-cli jwt activity <id>
sandbox-cli jwt invalidate <id>
# Cluster management
sandbox-cli cluster list
sandbox-cli cluster get <name>
sandbox-cli cluster create <name> < cluster-config.json
sandbox-cli cluster create <name> --force < cluster-config.json
sandbox-cli cluster onboard [name] [--purpose dev] [--annotations '{}'] [--config file.json]
sandbox-cli cluster enable <name>
sandbox-cli cluster disable <name>
sandbox-cli cluster health <name>
sandbox-cli cluster offboard <name> [--force]
sandbox-cli cluster offboard-status <name>
sandbox-cli cluster delete <name>
# Placement testing
sandbox-cli placement dry-run --selector purpose=dev
sandbox-cli placement dry-run --selector purpose=dev,cloud=aws-shared
sandbox-cli placement dry-run --selector purpose=dev --preference region=us-east-1
# Session management
sandbox-cli logout
# Version info
sandbox-cli version [--json]
----
=== Configuration
Config is saved to `~/.local/sandbox-cli/config.json` (permissions `0600`). The CLI auto-refreshes access tokens when they expire.
Configuration priority: CLI flags > environment variables > saved config.
[cols="1,1,1"]
|===
| Source | Server URL | Login Token
| Flag | `--server` | `--token`
| Env var | `SANDBOX_API_ROUTE` | `SANDBOX_LOGIN_TOKEN`
| Config | saved from `login` | saved from `login`
|===
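The priority order above can be pictured as a "first non-empty value wins" lookup. The following is a minimal sketch in Go, not the actual `sandbox-cli` code: the function names `firstNonEmpty` and `resolveServer` are hypothetical, and only the `SANDBOX_API_ROUTE` variable name comes from the table.

[source,go]
----
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns the first non-empty value, scanning in priority order.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveServer applies the documented order: CLI flag, then environment
// variable, then the value saved by a previous login. (Hypothetical helper.)
func resolveServer(flagValue, savedValue string) string {
	return firstNonEmpty(flagValue, os.Getenv("SANDBOX_API_ROUTE"), savedValue)
}

func main() {
	// Illustrative inputs: no --server flag given, a saved config value present.
	// The result depends on whether SANDBOX_API_ROUTE is set in the environment.
	fmt.Println(resolveServer("", "https://saved.example.com"))
}
----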
=== Environment Detection
`sandbox-cli status` detects the environment from the server URL:
----
=== Client ===
Version: 1.0.0
Commit: abc123de
Built: 2026-03-10 14:30:00 UTC
Config: /home/user/.local/sandbox-cli/config.json
=== Connection ===
Server: https://sandbox-api-dev.example.com
Environment: development
Auth: valid (expires 2026-03-10 15:30:00 UTC)
=== Server ===
Version: 1.0.0
Commit: abc123de
Built: 2026-03-10 14:00:00 UTC
DB Migration: 22
----
=== Legacy Shell Scripts
The `tools/` directory contains some legacy shell scripts. Use `sandbox-cli` equivalents when available:
[cols="1,1"]
|===
| Script | sandbox-cli equivalent
| `tools/issue-jwt.sh` | `sandbox-cli jwt list` / `jwt issue`
| `tools/list-shared-clusters.sh` | `sandbox-cli cluster list`
| `tools/get-shared-cluster.sh` | `sandbox-cli cluster get <name>`
| `tools/sandbox_api_enable_shared_cluster.sh` | `sandbox-cli cluster enable <name>`
| `tools/sandbox_api_disable_shared_cluster.sh` | `sandbox-cli cluster disable <name>`
|===
== RBAC Roles
[cols="1,3"]
|===
| Role | Access
| `admin` | Full access to all endpoints
| `app` | Placement and account operations (create, get, delete, lifecycle)
| `shared-cluster-manager` | Manage OCP clusters (own clusters with full details, others with shared view)
|===
All roles can access (paths relative to `/api/v1`): `GET /version`, `GET /health`, `POST /placements/dry-run`, `GET /requests/{id}/status`.
The `app` role can additionally: create/get/delete placements, manage accounts, read reservations.
The `shared-cluster-manager` role can:
* *Create/update* clusters via POST/PUT
* *List* all clusters (full details for own clusters, shared view for others)
* *Get* any cluster (full details for own, shared view for others)
* *Offboard* clusters it created
The *shared view* includes: name, api_url, ingress_domain, annotations, valid, quotas, limits, max_placements, created_by, timestamps. It excludes: credentials (token, kubeconfig), additional_vars, deployer SA token config, internal data.
The `shared-cluster-manager` cannot access: placement CRUD, account operations, admin-only endpoints (enable/disable, health, delete, JWT management, DNS, IBM, reservations).
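
To illustrate the difference between full details and the shared view, here is a minimal Go sketch of the redaction step. It is not the server's implementation: the `Cluster` type and `managerView` function are hypothetical, the struct carries only a subset of the fields listed above, and it assumes that "own" means the caller's name matches `created_by`.

[source,go]
----
package main

import (
	"encoding/json"
	"fmt"
)

// Cluster is a hypothetical, trimmed-down cluster record.
type Cluster struct {
	Name           string            `json:"name"`
	APIURL         string            `json:"api_url"`
	IngressDomain  string            `json:"ingress_domain"`
	Annotations    map[string]string `json:"annotations,omitempty"`
	Valid          bool              `json:"valid"`
	MaxPlacements  int               `json:"max_placements"`
	CreatedBy      string            `json:"created_by"`
	Token          string            `json:"token,omitempty"`
	Kubeconfig     string            `json:"kubeconfig,omitempty"`
	AdditionalVars map[string]any    `json:"additional_vars,omitempty"`
}

// managerView returns what a shared-cluster-manager named caller would see:
// the full record for a cluster it created, otherwise a copy with the
// credential and additional_vars fields cleared.
// Assumption: ownership is decided by comparing caller with CreatedBy.
func managerView(c Cluster, caller string) Cluster {
	if c.CreatedBy == caller {
		return c
	}
	// c is a copy, so clearing fields here leaves the caller's record intact.
	c.Token = ""
	c.Kubeconfig = ""
	c.AdditionalVars = nil
	return c
}

func main() {
	c := Cluster{
		Name:      "ocp-dev-1",
		APIURL:    "https://api.ocp-dev-1.example.com:6443",
		CreatedBy: "alice",
		Token:     "example-token",
	}
	out, err := json.Marshal(managerView(c, "bob"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // no token field: bob did not create this cluster
}
----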
== API Endpoints
=== Public
* `GET /ping` — Heartbeat
=== Authentication
* `GET /api/v1/login` — Exchange login token for access token
* `POST /api/v1/admin/jwt` — Issue login JWT (admin)
* `GET /api/v1/admin/jwt` — List tokens (admin)
* `GET /api/v1/admin/jwt/{id}/activity` — Token audit log (admin)
* `PUT /api/v1/admin/jwt/{id}/invalidate` — Revoke token (admin)
=== Placements
* `POST /api/v1/placements` — Create placement (202 async)
* `POST /api/v1/placements/dry-run` — Check availability
* `GET /api/v1/placements` — List all (admin)
* `GET /api/v1/placements/{uuid}` — Get status + resources
* `DELETE /api/v1/placements/{uuid}` — Delete (cascading cleanup)
=== OCP Shared Clusters
* `GET /api/v1/ocp-shared-cluster-configurations` — List all (admin: full, manager: own full + others shared view)
* `GET /api/v1/ocp-shared-cluster-configurations/{name}` — Get config (admin: full, manager: own full + others shared view)
* `POST /api/v1/ocp-shared-cluster-configurations` — Create (admin, manager)
* `PUT /api/v1/ocp-shared-cluster-configurations/{name}` — Upsert (admin, manager)
* `DELETE /api/v1/ocp-shared-cluster-configurations/{name}` — Delete (admin)
* `DELETE /api/v1/ocp-shared-cluster-configurations/{name}/offboard` — Offboard (admin, manager for own)
* `GET /api/v1/ocp-shared-cluster-configurations/{name}/offboard` — Offboard status (admin, manager)
* `PUT /api/v1/ocp-shared-cluster-configurations/{name}/enable` — Enable (admin)
* `PUT /api/v1/ocp-shared-cluster-configurations/{name}/disable` — Disable (admin)
* `GET /api/v1/ocp-shared-cluster-configurations/{name}/health` — Health check (admin)
=== DNS & IBM Resource Groups (admin)
* CRUD on `/api/v1/dns-account-configurations`
* CRUD on `/api/v1/ibm-resource-group-configurations`
=== Other
* `GET /api/v1/version` — Server version, build commit, DB migration
* `GET /api/v1/health` — Health check
* `POST /api/v1/graphql` — GraphQL (admin)
== Project Structure
----
cmd/
  sandbox-api/           HTTP server (handlers, middleware, GraphQL)
  sandbox-cli/           CLI client (Cobra-based)
  sandbox-issue-jwt/     JWT token generator
  sandbox-list/          Read-only account listing
  sandbox-metrics/       Prometheus exporter
  sandbox-replicate/     DynamoDB -> PostgreSQL replication (Lambda)
  sandbox-rotate-vault/  Credential re-encryption
internal/
  api/v1/                Request/response types
  cli/                   sandbox-cli implementation
  models/                Domain models, DB queries, cloud integrations
  config/                App config
  dynamodb/              AWS DynamoDB provider
  log/                   Structured logging (slog)
db/migrations/           SQL migrations (golang-migrate)
tests/                   Hurl integration tests + Python functional tests
tools/                   Legacy shell scripts (prefer sandbox-cli)
docs/api-reference/      OpenAPI 3.0 spec (swagger.yaml)
deploy/                  Helm charts
----
== Environment Variables
[cols="1,2"]
|===
| Variable | Description
| `DATABASE_URL` | PostgreSQL connection string
| `JWT_AUTH_SECRET` | HMAC-SHA256 signing key
| `VAULT_SECRET` | Credential encryption key
| `ASSUMEROLE_AWS_ACCESS_KEY_ID` | IAM user for STS AssumeRole
| `ASSUMEROLE_AWS_SECRET_ACCESS_KEY` | IAM secret
| `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` | DynamoDB access
| `PORT` | HTTP listen port (default: 8080)
| `WORKERS` | Async job workers (default: 5)
| `AUDIT_LOG_RETENTION` | Audit log retention period (default: `1y`)
| `DEBUG` | Enable debug logging
|===
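As an illustration of how the documented defaults for `PORT` and `WORKERS` might be applied, here is a minimal Go sketch. The helper `intFromEnv` is hypothetical, and silently falling back to the default on an invalid value is a design choice of this sketch, not confirmed server behavior.

[source,go]
----
package main

import (
	"fmt"
	"os"
	"strconv"
)

// intFromEnv returns the integer value of the named variable, or def when the
// variable is unset, empty, or not a valid integer. (Hypothetical helper.)
func intFromEnv(name string, def int) int {
	raw, ok := os.LookupEnv(name)
	if !ok || raw == "" {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return def
	}
	return n
}

func main() {
	port := intFromEnv("PORT", 8080)    // documented default: 8080
	workers := intFromEnv("WORKERS", 5) // documented default: 5
	fmt.Printf("listening on :%d with %d workers\n", port, workers)
}
----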
== sandbox-api Installation
Create a secrets file, then install the chart with Helm.
.secrets.yaml
----
sandbox_api_secrets:
  database:
    url: postgres://...
    dynamodb_aws_access_key_id: ...
    dynamodb_aws_secret_access_key: ...
  # Secret to generate and validate JWT tokens
  auth:
    jwt_auth_secret: ...
  # ansible-vault (AES256) key to encrypt the secret in the DB
  vault:
    vault_secret: ...
  # AWS user that can assume role into the accounts
  aws:
    assume_aws_access_key_id: ...
    assume_aws_secret_access_key: ...
----
.Install chart
----
helm install -f secrets.yaml sandbox-api deploy/helm-api/
----
.Upgrade chart
----
helm upgrade -f secrets.yaml sandbox-api deploy/helm-api/
----
=== DB Migration
To initialize or update the PostgreSQL schema:
----
oc run admin-$$ \
--image=quay.io/rhpds/sandbox-admin:latest -i -t \
--restart=Never --rm -- /bin/bash <1>
# Inside the pod shell (quote the URL so that '?' is not treated as a glob):
DATABASE_URL='postgres://postgres:PASSWORD@RDS_ADDRESS.us-west-2.rds.amazonaws.com:5432/sandbox_api_dev?sslmode=require'
migrate \
  -source github://rhpds/sandbox/db/migrations#main \
  -database "$DATABASE_URL" up
----
<1> Use the `rhpds/sandbox-admin` image, which contains all the necessary binaries and tools.
=== Bootstrap an Admin Login Token
----
oc run admin-$$ --image=quay.io/rhpds/sandbox-admin:latest -i -t --restart=Never --rm -- /bin/bash
export DATABASE_URL=postgres://postgres:PASSWORD@RDS_ADDRESS.us-west-2.rds.amazonaws.com:5432/sandbox_api_dev?sslmode=require
./sandbox-issue-jwt
JWT Auth secret: Enter Claims in the JSON format:
for example: {"kind": "login", "name": "gucore", "role": "admin"}
{"kind": "login", "name": "gucore", "role": "admin"}
token:
[TOKEN HERE]
----
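For readers unfamiliar with the token format, the following Go sketch shows how an HMAC-SHA256 (HS256) JWT is assembled and verified using only the standard library. It is not the code of `sandbox-issue-jwt` or the API: the functions `signHS256` and `verifyHS256` and the secret value are hypothetical, the claims are the example ones from the session above, and a token produced this way may lack claims the server requires.

[source,go]
----
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// signHS256 builds a compact JWT:
// base64url(header) "." base64url(claims) "." base64url(HMAC-SHA256 signature).
func signHS256(claims map[string]any, secret []byte) (string, error) {
	header, err := json.Marshal(map[string]string{"alg": "HS256", "typ": "JWT"})
	if err != nil {
		return "", err
	}
	payload, err := json.Marshal(claims)
	if err != nil {
		return "", err
	}
	enc := base64.RawURLEncoding // JWTs use unpadded base64url
	signingInput := enc.EncodeToString(header) + "." + enc.EncodeToString(payload)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return signingInput + "." + enc.EncodeToString(mac.Sum(nil)), nil
}

// verifyHS256 recomputes the signature and compares it in constant time.
// It checks the signature only, not the header or any claim.
func verifyHS256(token string, secret []byte) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return false
	}
	sig, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parts[0] + "." + parts[1]))
	return hmac.Equal(sig, mac.Sum(nil))
}

func main() {
	secret := []byte("dev-only-secret") // placeholder, never use in production
	claims := map[string]any{"kind": "login", "name": "gucore", "role": "admin"}
	token, err := signHS256(claims, secret)
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
	fmt.Println(verifyHS256(token, secret)) // true
}
----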
.Create an access token
----
logintoken=[TOKEN]
# Exchange the login token for an access token
curl -H "Authorization: Bearer ${logintoken}" sandbox-api:8080/api/v1/login
# Copy the access token from the response
token=[ACCESS TOKEN]
# Check access
curl -H "Authorization: Bearer ${token}" sandbox-api:8080/api/v1/health
----
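The same bearer-token request can be made from Go. The sketch below is illustrative only: `authGet` is a hypothetical helper, and an `httptest` stub stands in for sandbox-api so the program runs offline. The stub's response body is a placeholder, because the real JSON shape returned by `/api/v1/login` is not described in this README.

[source,go]
----
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// authGet issues a GET with a bearer token and returns the status code and
// the raw response body. (Hypothetical helper.)
func authGet(client *http.Client, url, token string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	// Stand-in for sandbox-api: it only checks the path and that a bearer
	// token was sent. The real server validates the token and answers with
	// an access token.
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/v1/login" || !strings.HasPrefix(r.Header.Get("Authorization"), "Bearer ") {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "stub-response")
	}))
	defer stub.Close()

	status, body, err := authGet(stub.Client(), stub.URL+"/api/v1/login", "login-token-placeholder")
	if err != nil {
		panic(err)
	}
	fmt.Println(status, body) // 200 stub-response
}
----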
=== Local Development Environment
All files used for local development are prefixed with `.dev` and are ignored by Git; see link:.gitignore[`.gitignore`].
[source,shell]
----
make run-local-pg # Run PostgreSQL locally in a container
make migrate # Run the DB migrations to set up the DB schema
# Set the following secrets. Note the leading space ' ', which keeps the
# commands out of shell history (in bash, when HISTCONTROL includes ignorespace).
# IAM secrets to access AWS sandboxes
 export ASSUMEROLE_AWS_SECRET_ACCESS_KEY=...
 export ASSUMEROLE_AWS_ACCESS_KEY_ID=...
# IAM secrets to access the DynamoDB table that contains info on the AWS sandboxes
 export AWS_ACCESS_KEY_ID=...
 export AWS_SECRET_ACCESS_KEY=...
# AES key to encrypt sensitive data in the different databases.
# If you're using the DynamoDB dev database for AWS sandboxes (which you probably are),
# this needs to match the one in use on the DEV environment.
 export VAULT_SECRET=...
make tokens # Issue some JWT tokens for access
make run-api # <1>
air # <2>
----
<1> When iterating, stop and relaunch this step after each change.
<2> Alternatively, use link:https://github.com/cosmtrek/air[cosmtrek/air], which watches local files and rebuilds and relaunches the API automatically whenever they change.
== OCP Shared Cluster Onboarding
Use `sandbox-cli` for cluster onboarding and offboarding. See link:docs/cli.md[sandbox-cli documentation] for full details.
----
# Onboard: connects to the target OCP cluster, creates SA + token, registers with API
sandbox-cli cluster onboard [name] [--purpose dev] [--annotations '{}']
# Offboard: disables cluster, cleans up placements, removes from API
sandbox-cli cluster offboard <name> [--force]
----
== sandbox-replicate
The `sandbox-replicate` Lambda function replicates any changes made to the DynamoDB table into a PostgreSQL database.
=== Push lambda
----
export AWS_PROFILE=infra-dev
make push-lambda
----
That will:
. Create a role, a policy and a lambda function
. Attach the policy to the role and the role to the lambda function
. Push the updated `build/sandbox-replicate` binary to the lambda function
== sandbox-metrics
=== Deploy Metrics Prometheus
. Clone this repository
+
----
git clone --depth 1 https://github.com/rhpds/sandbox sandbox
----
. If it doesn't exist yet, create an IAM user in AWS with read-only access to DynamoDB
. Create the secret file containing the key for the IAM user
+
[source,yaml]
.`aws_sandbox_readonly.yaml`
----
aws_sandbox_metrics_secrets:
  readonly:
    aws_access_key_id: ...
    aws_secret_access_key: ...
----
. Install the helm chart
+
----
helm install sandbox-metrics sandbox/deploy/helm-metrics/ -f aws_sandbox_readonly.yaml
----
== Create AWS Sandboxes
Use the link:playbooks[Ansible playbooks].
== Conan - Sandbox Cleanup Daemon
See link:conan[conan].