GitLab IAM Service
This service provides Identity and Access Management (IAM) capabilities
specifically tailored for GitLab. Designed with modularity in mind, it offers a
unified codebase that can be compiled to serve distinct purposes, ensuring
flexibility and optimized deployments.
It is the main codebase for implementing the GitLab Adaptive Trust Environment.
Capabilities
- Auth Provider: Manages OAuth 2.0 and OIDC flows, enabling secure authentication and authorization for applications integrating with GitLab.
- Lookup API: Facilitates efficient retrieval of identity and access-related data stored within the database.
- Update API: Provides a robust API for securely updating identity and access information in the database.
- IAM Transform: Transforms GitLab permissions into a format optimized for rapid lookup, stored in the database.
- Envoy Control Plane: xDS APIs to integrate the IAM service with Envoy as a control plane.
- Secure Token Service: Token exchange and URT signing capabilities.
Design
The GitLab IAM service is designed for flexibility and can be deployed in
several distinct modes to suit different environments, from local development
to production.
Single Binary, Single-Process Mode
This is the default and simplest mode for development and testing. A single
iam-service binary runs all microservices concurrently as goroutines within
one process.
- How it works: Run iam-service --serve all.
- Use Case: Ideal for local development, debugging, and simple
deployments where ease of use is paramount.
- Communication: Services communicate over the network (localhost) via
their standard gRPC interfaces, ensuring consistency with distributed
deployments.
Single Binary, Multi-Process Mode
The same iam-service binary can be used to launch a single, specific microservice.
- How it works: Use the --serve flag to specify the service (e.g.,
iam-service --serve lookup).
- Use Case: Useful for deployments where services are managed
individually by an external orchestrator (e.g., Kubernetes, systemd) but
sourced from a single container image.
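Both single-binary modes come down to how the --serve value is turned into a set of services to start. The following is a minimal sketch in Go, not the service's actual startup code: the registry, the selectServices helper, the "all" flag default, and every service name other than lookup are assumptions made for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"sort"
	"sync"
)

// registry maps a --serve value to a function that runs that service.
// Only "lookup" appears in the README; the other names are assumptions.
var registry = map[string]func() error{
	"lookup": func() error { fmt.Println("lookup: serving"); return nil },
	"update": func() error { fmt.Println("update: serving"); return nil },
	"auth":   func() error { fmt.Println("auth: serving"); return nil },
	"sts":    func() error { fmt.Println("sts: serving"); return nil },
}

// selectServices expands the --serve value into a sorted list of service
// names: "all" selects every registered service, anything else must name one.
func selectServices(serve string) ([]string, error) {
	if serve == "all" {
		names := make([]string, 0, len(registry))
		for name := range registry {
			names = append(names, name)
		}
		sort.Strings(names)
		return names, nil
	}
	if _, ok := registry[serve]; !ok {
		return nil, fmt.Errorf("unknown service %q", serve)
	}
	return []string{serve}, nil
}

func main() {
	// The default value "all" is an assumption of this sketch.
	serve := flag.String("serve", "all", `service to run, or "all"`)
	flag.Parse()

	names, err := selectServices(*serve)
	if err != nil {
		fmt.Println(err)
		return
	}

	// Each selected service runs in its own goroutine inside this one process.
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := registry[name](); err != nil {
				fmt.Printf("%s stopped: %v\n", name, err)
			}
		}(name)
	}
	wg.Wait()
}
```

With this shape, the single-process and multi-process modes differ only in how many entries selectServices returns, which is why the same binary can serve both.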
Multiple Binaries, Multiple Processes Mode
For maximum separation and traditional microservice deployments, each service
can be compiled into its own standalone binary.
- How it works: Each service has a dedicated main package (e.g.,
cmd/lookup/main.go) that can be built into a separate binary (e.g.,
iam-lookup, iam-sts). The Makefile provides targets for this
(make iam-lookup, make iam-sts, etc.).
- Use Case: Production deployments where services need to be scaled,
monitored, and managed independently. This provides the highest level of
isolation.
Composite Binary Mode
For deployments that benefit from co-locating a subset of services in one
process on a shared gRPC port (e.g. a single Runway deployment), the
Makefile also produces composite binaries.
- iam-data-access: runs Lookup and Update on a shared port (:5005
in development). Used as the Runway target for the iam-data-access-grpc
deployment. Built via make iam-data-access.
Development
Run scripts/prepare-dev-env.sh to install build tools first.
Building
The Makefile provides targets for all build scenarios:
- make: Build all targets.
- make iam-service: Build the primary multi-service binary.
- make iam-lookup, make iam-sts, make iam-update, make iam-auth: Build the individual service binaries.
- make iam-data-access: Build the composite Lookup + Update binary used by Runway.
- make up, make down: Start and stop dependent services like the database.
Linting
While we catch linter and formatting errors in CI, it's best to catch these pre-CI.
Run linters once:
make lint
Database migration linting
Migrations under db/migrations/ are linted with squawk
for unsafe online-migration patterns (rule configuration lives in
.squawk.postgres.toml and .squawk.yugabyte.toml):
make lint-migrations
CI runs this automatically. Unlike the other tools, squawk has no mise plugin, so
scripts/prepare-dev-env.sh does not install it. Install the standalone binary
matching the GL_SQUAWK_VERSION pin in .gitlab-ci-other-versions.yml — for
example, with mise:
mise use "github:sbdchd/squawk@<GL_SQUAWK_VERSION>"
or download it from the releases page.
Note: the pg_version in the squawk configs tracks the Postgres major and
must be bumped manually on a major upgrade. It's a separate copy from the
postgres:18.x image tags (which renovate bumps) and isn't covered by
scripts/update-asdf-version-variables.sh.
Storage backends
We support multiple storage backends: Postgres (the development-l2 environment) and YugabyteDB (the development-l1 environment).
Before running the service or the tests, the dependent services need to be started using the command:
make up
Testing
make test: Runs the entire test suite using the development-l2 environment (Postgres).
make test-l1: Runs the entire test suite using the development-l1 environment (YugabyteDB).
Continuous Integration (.gitlab-ci.yml)
The pipeline is defined with three stages (lint, verify, test) and uses
pinned, compatible versions of Go (golang:1.24-alpine) and golangci-lint to
ensure stable, reproducible builds. This specific Go version is required by the
go.mod file.
Invoking RPCs
To invoke RPCs from any of the gRPC servers, use grpcurl.
In development, grpcurl can automatically discover RPCs via a reflection server.
However, this server only works in development builds (disabled elsewhere for security reasons), so
grpcurl cannot always discover the schema over the wire.
Passing the .proto files directly with -proto does not work either: our protos import the
protovalidate definitions
(buf/validate/validate.proto) from the Buf Schema Registry (BSR), which grpcurl can't resolve on
its own.
Instead, let buf compile a self-contained descriptor set (buf resolves the BSR deps declared in
buf.yaml) and pipe it straight into grpcurl via -protoset:
buf build -o - | grpcurl -protoset /dev/stdin -plaintext localhost:5001 lookup.v1.LookupService/Health
{
"status": "ok"
}
The same descriptor set works for discovery and to reach deployed environments. For the
sandbox, drop -plaintext (it's TLS on :443) and add the x-gitlab-svc routing header:
# List the services exposed by a server
buf build -o - | grpcurl -protoset /dev/stdin \
-H "x-gitlab-svc: iam-data-access-grpc" \
gate-sandbox.com:443 list
# List the methods on one service
buf build -o - | grpcurl -protoset /dev/stdin \
-H "x-gitlab-svc: iam-data-access-grpc" \
gate-sandbox.com:443 list lookup.v1.LookupService
# Invoke a method
buf build -o - | grpcurl -protoset /dev/stdin \
-H "x-gitlab-svc: iam-data-access-grpc" \
-d '{}' gate-sandbox.com:443 lookup.v1.LookupService/Health
# For AuthService
buf build -o - | grpcurl -protoset /dev/stdin \
-H "x-gitlab-svc: iam-auth-grpc" \
-d '{}' gate-sandbox.com:443 auth.v1.AuthService/Health
To make many calls, build the descriptor set once (buf build -o /tmp/iam.binpb) and
reuse it with -protoset /tmp/iam.binpb to avoid rebuilding it on every invocation.
Configuration
The IAM service uses a layered TOML configuration system. On startup, three files are merged in order — each layer overrides values from the previous one:
- base.toml: shared defaults for all environments (HTTP timeouts, CORS defaults, service addresses)
- Environment config file: environment-specific overrides (database host/port, CORS origins, service URLs)
- Secrets file: credentials that must never be committed (database password, HMAC secrets, OAuth client secrets)
All three files share one structure: platform-wide settings live under [platform] and per-service settings under [services.<name>],
so every service draws from the same merged config. The files are resolved from configs/ at the repo root, with environment-specific files under configs/environments/.
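The override order can be pictured as a deep merge of three nested tables. The sketch below is illustrative only: mergeLayers is a hypothetical helper rather than the loader's real code, the layers are written as Go map literals instead of parsed TOML, and every key except platform.database.password is invented for the example. It also assumes nested tables merge key by key instead of being replaced wholesale.

```go
package main

import "fmt"

// mergeLayers returns a new map in which values from override replace values
// from base. Nested tables are merged key by key rather than replaced whole.
func mergeLayers(base, override map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseTable, baseIsTable := out[k].(map[string]any)
		overrideTable, overrideIsTable := v.(map[string]any)
		if baseIsTable && overrideIsTable {
			out[k] = mergeLayers(baseTable, overrideTable)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical contents standing in for base.toml, $ENV.toml and secrets.toml.
	base := map[string]any{
		"platform": map[string]any{
			"http": map[string]any{"timeout": "30s"},
		},
	}
	env := map[string]any{
		"platform": map[string]any{
			"database": map[string]any{"host": "db.example.internal", "password": "placeholder"},
		},
	}
	secrets := map[string]any{
		"platform": map[string]any{
			"database": map[string]any{"password": "s3cret"},
		},
	}

	// Later layers win: base, then environment, then secrets.
	merged := mergeLayers(mergeLayers(base, env), secrets)
	fmt.Println(merged)
}
```

After the merge, the HTTP timeout still comes from the base layer, the database host from the environment layer, and the password from the secrets layer.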
Environment variables
| Variable | Description | Default |
|---|---|---|
| ENV | Selects the environment config file as ./configs/environments/$ENV.toml | development-l2 |
| CONFIG_FILE_PATH | Absolute path to the environment config file. Overrides the default ./configs/environments/$ENV.toml when set. | (none) |
| SECRETS_FILE_PATH | Absolute path to the secrets file. Overrides the default ./configs/environments/secrets.toml when set. | (none) |
| CONFIG_TOML | Full environment config TOML passed inline (not a path). The container entrypoint writes it to a file and sets CONFIG_FILE_PATH for you. Used on Runway. | (none) |
| PLATFORM_DATABASE_PASSWORD, … | Any config key can be overridden by an env var named after its dotted path, uppercased with . → _ (e.g. platform.database.password → PLATFORM_DATABASE_PASSWORD). Used to inject secrets from Vault. | (none) |
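How ENV, CONFIG_FILE_PATH and SECRETS_FILE_PATH combine into the list of files to load can be sketched as follows. This is an illustration under stated assumptions, not the loader itself: configPaths is a hypothetical function, and the environment lookup is injected as a parameter so the logic can be exercised without changing the real process environment.

```go
package main

import (
	"fmt"
	"os"
)

// configPaths returns the three config files in load order:
// base, environment, secrets. getenv is typically os.Getenv.
func configPaths(getenv func(string) string) []string {
	env := getenv("ENV")
	if env == "" {
		env = "development-l2" // documented default
	}

	// CONFIG_FILE_PATH, when set, replaces the ENV-derived path.
	envFile := "./configs/environments/" + env + ".toml"
	if p := getenv("CONFIG_FILE_PATH"); p != "" {
		envFile = p
	}

	// SECRETS_FILE_PATH, when set, replaces the default secrets path.
	secretsFile := "./configs/environments/secrets.toml"
	if p := getenv("SECRETS_FILE_PATH"); p != "" {
		secretsFile = p
	}

	return []string{"./configs/base.toml", envFile, secretsFile}
}

func main() {
	for _, p := range configPaths(os.Getenv) {
		fmt.Println(p)
	}
}
```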
Deployed Environments
- Set the ENV variable for the environment that is being deployed. Possible values are: sandbox-l1, sandbox-l2, staging-l1, staging-l2, production-l1, production-l2
- Mount the env config file with the name $ENV.toml and the secrets file with the name secrets.toml under /app/configs/environments.
For example, if ENV is staging-l1, the config file should be named staging-l1.toml.
File Resolution
/app/configs/
├── base.toml # always loaded first
└── environments/
├── staging-l1.toml # ENV=staging-l1
└── secrets.toml # loaded last
Special Case - Runway:
Runway does not expose configMap volumes, and a Kubernetes volume mount replaces the whole target
directory — so the env config and the secrets file cannot be injected as separate files into the same
/app/configs/environments directory. Instead of mounting files, Runway deployments inject config
without touching the filesystem layout:
- Env (non-secret) config is passed inline as a single CONFIG_TOML env var. The container
entrypoint (scripts/entrypoint.sh) writes it to /tmp/iam-config/config.toml and exports
CONFIG_FILE_PATH to point at it; base.toml stays baked in at /app/configs/base.toml.
- Secrets are injected as individual env vars from Vault (via Runway's secretEnvFrom). Viper's
AutomaticEnv overrides the matching config key: for example, PLATFORM_DATABASE_PASSWORD overrides
[platform.database] password. No secrets.toml is shipped, so the loader treats a missing
secrets file as non-fatal.
Because AutomaticEnv only consults env vars for keys that already exist in a loaded config file,
each Vault-managed secret must have a placeholder entry in the env config to register its key. The
convention is the VAULT_MANAGED sentinel value — see configs/environments/staging-l1.toml.
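The need for placeholders follows from the override rule itself, which can be modelled in a few lines. The sketch below is a simplified model, not Viper's implementation: envVarName and applyEnvOverrides are hypothetical helpers, the config is flattened to dotted keys, and the platform.cache.password key is invented to show an env var with no registered key.

```go
package main

import (
	"fmt"
	"strings"
)

// envVarName maps a dotted config key to its env var name:
// uppercased, with "." replaced by "_".
func envVarName(key string) string {
	return strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
}

// applyEnvOverrides replaces the value of every key already present in cfg
// when a matching env var is set. Env vars without a matching key are ignored.
func applyEnvOverrides(cfg, env map[string]string) map[string]string {
	out := make(map[string]string, len(cfg))
	for key, value := range cfg {
		if override, ok := env[envVarName(key)]; ok {
			value = override
		}
		out[key] = value
	}
	return out
}

func main() {
	cfg := map[string]string{
		"platform.database.host":     "db.example.internal",
		"platform.database.password": "VAULT_MANAGED", // placeholder registers the key
	}
	env := map[string]string{
		"PLATFORM_DATABASE_PASSWORD": "s3cret",
		"PLATFORM_CACHE_PASSWORD":    "ignored", // hypothetical: no placeholder, so never read
	}

	resolved := applyEnvOverrides(cfg, env)
	fmt.Println(resolved["platform.database.password"])
	_, known := resolved["platform.cache.password"]
	fmt.Println("cache password registered:", known)
}
```

In this model the database password is replaced because its key exists in the loaded config, while the second env var has nothing to attach to and is dropped.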
Example: inline config (as Runway runs it)
CONFIG_TOML="$(cat configs/environments/staging-l1.toml)" \
PLATFORM_DATABASE_USERNAME=iam PLATFORM_DATABASE_PASSWORD=… \
./iam --serve all
CONFIG_FILE_PATH / SECRETS_FILE_PATH (absolute paths to each file) remain supported for any setup
that can mount the files directly. CONFIG_TOML and CONFIG_FILE_PATH are mutually exclusive —
they supply the same env config two different ways — and the entrypoint exits with an error if both are
set.
Local Development and CI
For integration with a local GitLab Rails instance, see Rails development setup.
- Set ENV to the environment you want to work with (development-l1 or development-l2; the default is development-l2), e.g. ENV=development-l1 make run-all
- The repo ships these config files ready to use:
./configs/
├── base.toml # always loaded first
└── environments/
├── development-l1.toml # ENV=development-l1
├── development-l2.toml # ENV=development-l2 (default)
├── ci-l1.toml # ENV=ci-l1
├── ci-l2.toml # ENV=ci-l2
└── secrets.toml # shipped with placeholder values for local development; not baked into docker images
Tests that load config call testhelper.ChdirRepoRoot(t) in their setup, which changes the working directory to the repo root for the duration of the test.
This lets config.Load resolve ./configs correctly regardless of which package directory the test runs from.
Commit messages
This project uses Conventional Commits to drive automated releases via semantic-release. Releases produce a git tag and a GitLab Release, which downstream tooling (e.g., Runway) uses to deploy.
Since we generally squash on merge, the MR title is the message that gets analyzed.
See also docs/release.
<type>(<optional scope>): <description>
Common types
| Type | Triggers release | Example |
|---|---|---|
| feat | minor (1.2.0) | feat: add OAuth scope validation |
| fix | patch (1.2.3) | fix: handle expired tokens correctly |
| perf | patch (1.2.3) | perf: cache JWKS responses |
| chore | none | chore: bump go-jose to v4 |
| docs | none | docs: clarify token rotation |
| refactor | none | refactor: extract token helper |
| test | none | test: add coverage for STS endpoint |
| ci | none | ci: pin runner image version |
Breaking changes
Append ! after the type, or include a BREAKING CHANGE: footer:
feat!: drop support for v1 tokens
feat: redesign token refresh flow
BREAKING CHANGE: clients must re-authenticate after upgrade
Notes
- Non-conventional MR titles won't break the build; the analyzer silently skips them and no release happens.
- Use fix: for any change that should produce a deployable artifact, even if "fix" feels semantically loose. chore:, docs:, etc. will not produce a release.
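To check an MR title before merging, the rules above can be approximated in a few lines. This is a rough sketch, not the analyzer semantic-release runs: releaseType and its regular expression are hypothetical, and mapping breaking changes to a major release assumes semantic-release's default behavior, which this README does not state explicitly.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// header matches "<type>(<optional scope>)!: <description>".
var header = regexp.MustCompile(`^([a-z]+)(\([^)]+\))?(!)?: .+$`)

// releaseType reports the release a squashed commit message would trigger:
// "major", "minor", "patch", or "none".
func releaseType(message string) string {
	title, body, _ := strings.Cut(message, "\n")
	m := header.FindStringSubmatch(title)
	if m == nil {
		return "none" // non-conventional titles are skipped
	}
	// A "!" after the type or a BREAKING CHANGE: footer line marks a breaking change.
	if m[3] == "!" || strings.Contains("\n"+body, "\nBREAKING CHANGE:") {
		return "major" // assumption: semantic-release default
	}
	switch m[1] {
	case "feat":
		return "minor"
	case "fix", "perf":
		return "patch"
	default:
		return "none"
	}
}

func main() {
	titles := []string{
		"feat: add OAuth scope validation",
		"fix: handle expired tokens correctly",
		"chore: bump go-jose to v4",
		"feat!: drop support for v1 tokens",
		"Update README",
	}
	for _, t := range titles {
		fmt.Printf("%-40s %s\n", t, releaseType(t))
	}
}
```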
Environments and releasing changes
Release management
See docs/release.
Sandbox environment
IAM services are deployed to a sandbox environment. The sandbox environment is a Kubernetes cluster under a GCP project maintained by the team. Much of this config lives in this sandbox-config repo. Please see that repo's README for details.
Deployments to the sandbox environment are automatically triggered on a green main pipeline.
Staging environment
We're deploying just the iam-auth service via Runway GKE. We have two workloads: one serving HTTP, and one serving gRPC.
Runway
NOTE: This section only covers the iam-auth Runway deployment and is out of date. We will eventually replace this with a Runway deployment handled by the Release Platform.
- HTTP workload
  - Provisioned in this MR
  - Runway service ID: iam-auth-gke-ext
  - Runway project ID: 74293132
  - Access control: group-level (gitlab-org/software-supply-chain-security/authentication/authentication-runway-access)
- gRPC workload
  - Access control: group-level (gitlab-org/software-supply-chain-security/authentication/authentication-runway-access)
  - Provisioning TBD
Project configuration and Vault
To support Runway deploys, the iam GitLab project was onboarded to GitLab's infra-mgmt repository. infra-mgmt is a central location to manage project configs via Terraform. The iam project was onboarded in this MR.
Communication between Runway infra and the iam project's build pipeline is enabled by an access token stored in Vault. We also have a Vault path configured for the iam-auth-gke-ext workload where we store application secrets:
Links