NOTE
While we do our best to keep the master branch in a usable state, there is no guarantee the master branch works.
Please do not use it for production!
Please have a look at the Release
Notes
for breaking changes!
ClusterCockpit REST and GraphQL API backend

This is a Golang backend implementation for a REST and GraphQL API according to
the ClusterCockpit
specifications. It also
includes a web interface for ClusterCockpit. This implementation replaces the
previous PHP Symfony based ClusterCockpit web interface. The reasons for
switching from PHP Symfony to a Golang based solution are explained
here.
Overview
This is a Golang web backend for the ClusterCockpit job-specific performance
monitoring framework. It provides a REST API and an optional NATS-based messaging
API for integrating ClusterCockpit with an HPC cluster batch system and external
analysis scripts. Data exchange between the web front-end and the back-end is
based on a GraphQL API. The web frontend is also served by the backend using
Svelte components. Layout and styling are based on
Bootstrap 5 using
Bootstrap Icons.
The backend uses SQLite 3 as the relational SQL database.
While there are metric data backends for the InfluxDB and Prometheus time series
databases, the only tested and supported setup is to use cc-metric-store as the
metric data backend. Documentation on how to integrate ClusterCockpit with other
time series databases will be added in the future.
For real-time integration with HPC systems, the backend can subscribe to
NATS subjects to receive job start/stop events and node
state updates, providing an alternative to REST API polling.
Completed batch jobs are stored in a file-based job archive according to
this specification.
The backend supports authentication via local accounts, an external LDAP
directory, and JWT tokens. Authorization for APIs is implemented with
JWT tokens created with public/private key encryption.
You find a detailed documentation on the ClusterCockpit
Webpage.
Build requirements
ClusterCockpit requires a current version of the golang toolchain and node.js.
You can check go.mod to see what is the current minimal golang version needed.
Homebrew and Archlinux usually have current golang versions. For other Linux
distros this often means that you have to install the golang compiler yourself.
Fortunately, this is easy with golang. Since much of the functionality is based
on the Go standard library, it is crucial for security and performance to use a
current version of golang. In addition, an old golang toolchain may limit the supported
versions of third-party packages.
How to try ClusterCockpit with a demo setup
We provide a shell script that downloads demo data and automatically starts the
cc-backend. You will need wget, go, node, npm in your path to
start the demo. The demo downloads 32MB of data (223MB on disk).
git clone https://github.com/ClusterCockpit/cc-backend.git
cd ./cc-backend
./startDemo.sh
You can also try the demo using the latest release binary.
Create a folder and put the release binary cc-backend into this folder.
Execute the following steps:
./cc-backend -init
vim config.json (Add a second cluster entry and name the clusters alex and fritz)
wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-demo.tar
tar xf job-archive-demo.tar
./cc-backend -init-db -add-user demo:admin:demo -loglevel info
./cc-backend -server -dev -loglevel info
You can access the web interface at http://localhost:8080.
Credentials for login are demo:demo.
Please note that some views do not work without a metric backend (e.g., the
Analysis, Systems and Status views).
How to build and run
There is a Makefile to automate the build of cc-backend. The Makefile supports
the following targets:
make: Initialize var directory and build svelte frontend and backend
binary. Note that there is no proper prerequisite handling. Any change of
frontend source files will result in a complete rebuild.
make clean: Clean go build cache and remove binary.
make test: Run the tests that are also run in the GitHub workflow setup.
A common workflow for setting up cc-backend from scratch is:
git clone https://github.com/ClusterCockpit/cc-backend.git
# Build binary
cd ./cc-backend/
make
# EDIT THE .env FILE BEFORE YOU DEPLOY (Change the secrets)!
# If authentication is disabled, it can be empty.
cp configs/env-template.txt .env
vim .env
cp configs/config.json .
vim config.json
#Optional: Link an existing job archive:
ln -s <your-existing-job-archive> ./var/job-archive
# This will first initialize the job.db database by traversing all
# `meta.json` files in the job-archive and add a new user.
./cc-backend -init-db -add-user <your-username>:admin:<your-password>
# Start a HTTP server (HTTPS can be enabled in the configuration, the default port is 8080).
# The --dev flag enables GraphQL Playground (http://localhost:8080/playground) and Swagger UI (http://localhost:8080/swagger).
./cc-backend -server -dev
# Show other options:
./cc-backend -help
Database Configuration
cc-backend uses SQLite as its database. For large installations, SQLite memory
usage can be tuned via the optional db-config section in config.json under
main:
{
"main": {
"db": "./var/job.db",
"db-config": {
"cache-size-mb": 2048,
"soft-heap-limit-mb": 16384,
"max-open-connections": 4,
"max-idle-connections": 4,
"max-idle-time-minutes": 10
}
}
}
All fields are optional. If db-config is omitted entirely, built-in defaults
are used.
Options
| Option |
Default |
Description |
cache-size-mb |
2048 |
SQLite page cache size per connection in MB. Maps to PRAGMA cache_size. Total cache memory is up to cache-size-mb × max-open-connections. |
soft-heap-limit-mb |
16384 |
Process-wide SQLite soft heap limit in MB. SQLite will try to release cache pages to stay under this limit. Queries won't fail if exceeded, but cache eviction becomes more aggressive. |
max-open-connections |
4 |
Maximum number of open database connections. |
max-idle-connections |
4 |
Maximum number of idle database connections kept in the pool. |
max-idle-time-minutes |
10 |
Maximum time in minutes a connection can sit idle before being closed. |
Sizing Guidelines
SQLite's cache_size is a per-connection setting — each connection
maintains its own independent page cache. With multiple connections, the total
memory available for caching is the sum across all connections.
In practice, different connections tend to cache different pages (e.g., one
handles a job listing query while another runs a statistics aggregation), so
their caches naturally spread across the database. The formula
DB_size / max-open-connections gives enough per-connection cache that the
combined caches can cover the entire database.
However, this is a best-case estimate. Connections running similar queries will
cache the same pages redundantly. In the worst case (all connections caching
identical pages), only cache-size-mb worth of unique data is cached rather
than cache-size-mb × max-open-connections. For workloads with diverse
concurrent queries, cache overlap is typically low.
Rules of thumb:
-
cache-size-mb: Set to DB_size_in_MB / max-open-connections to allow the
entire database to be cached in memory. For example, an 80GB database with 8
connections needs at least 10240 MB (10GB) per connection. If your workload
has many similar concurrent queries, consider setting it higher to account for
cache overlap between connections.
-
soft-heap-limit-mb: Should be >= cache-size-mb × max-open-connections to
avoid cache thrashing. This is the total SQLite memory budget for the process.
-
On small installations the defaults work well. On servers with large databases
(tens of GB) and plenty of RAM, increasing these values significantly improves
query performance by reducing disk I/O.
Example: Large Server (512GB RAM, 80GB database)
{
"main": {
"db-config": {
"cache-size-mb": 16384,
"soft-heap-limit-mb": 131072,
"max-open-connections": 8,
"max-idle-time-minutes": 30
}
}
}
This allows the entire 80GB database to be cached (8 × 16GB = 128GB page cache)
with a 128GB soft heap limit, using about 25% of available RAM.
The effective configuration is logged at startup for verification.
Project file structure
.github/
GitHub Actions workflows and dependabot configuration for CI/CD.
api/
contains the API schema files for the REST and GraphQL APIs. The REST API is
documented in the OpenAPI 3.0 format in
./api/swagger.yaml. The GraphQL schema is in
./api/schema.graphqls.
cmd/cc-backend
contains the main application entry point and CLI implementation.
configs/
contains documentation about configuration and command line options and required
environment variables. Sample configuration files are provided.
init/
contains an example of setting up systemd for production use.
internal/
contains library source code that is not intended for use by others.
api
REST API handlers and NATS integration
archiver
Job archiving functionality
auth
Authentication (local, LDAP, OIDC) and JWT token handling
config
Configuration management and validation
graph
GraphQL schema and resolvers
importer
Job data import and database initialization
metricdispatch
Dispatches metric data loading to appropriate backends
repository
Database repository layer for jobs and metadata
routerConfig
HTTP router configuration and middleware
tagger
Job classification and application detection
taskmanager
Background task management and scheduled jobs
metricstoreclient
Client for cc-metric-store queries
pkg/
contains Go packages that can be used by other projects.
archive
Job archive backend implementations (filesystem, S3, SQLite)
metricstore
In-memory metric data store with checkpointing and metric loading
tools/
Additional command line helper tools.
archive-manager
Commands for getting infos about an existing job archive, importing jobs
between archive backends, and converting archives between JSON and Parquet formats.
archive-migration
Tool for migrating job archives between formats.
convert-pem-pubkey
Tool to convert external pubkey for use in cc-backend.
gen-keypair
contains a small application to generate a compatible JWT keypair. You find
documentation on how to use it
here.
web/
Server-side templates and frontend-related files:
frontend
Svelte components and static assets for the frontend UI
templates
Server-side Go templates, including monitoring views
gqlgen.yml
Configures the behaviour and generation of
gqlgen.
startDemo.sh
is a shell script that sets up demo data, and builds and starts cc-backend.