Flyte Manager - Unified Binary
The Flyte Manager is a unified binary that runs all Flyte services in a single process:
- Runs Service - Manages workflow runs and action state
- Executor/Operator - Reconciles and transitions TaskAction CRs through their lifecycle
- Actions Service - Serves action metadata and lifecycle APIs, including enqueueing TaskAction CRs
- DataProxy Service - Proxies signed-URL and blob access for task I/O
- Events Service - Ingests and fans out task/run events
- Cache Service - Backs task output caching and lookups
- App Service (+ internal proxy) - Hosts the Flyte UI/app and routes to internal services
- Secret Service - Manages secret references used by tasks
Features
✅ Single Binary - One process to deploy and manage
✅ PostgreSQL Backend - Shared database for all services
✅ Auto Kubernetes Detection - Uses current kubeconfig
✅ Unified Configuration - One config file for all services
✅ HTTP/2 Support - Buf Connect compatible
API Endpoints
Manager (port 8090)
All Connect/gRPC services are mounted on a single port. Notable handlers:
- flyteidl2.workflow.RunService - Create / Get / List / Abort runs
- flyteidl2.workflow.InternalRunService - Internal run-control APIs used by the executor
- flyteidl2.workflow.TranslatorService - Translates user task definitions
- flyteidl2.workflow.RunLogsService - Stream logs for a run
- flyteidl2.actions.ActionsService - Action lifecycle and metadata
- flyteidl2.task.TaskService - Task registration and lookup
- flyteidl2.trigger.TriggerService - Schedules and triggers
- flyteidl2.project.ProjectService - Project management
- flyteidl2.auth.IdentityService / AuthMetadataService - Identity and auth metadata
- DataProxy, Events, Cache, Secret, and App services (see their respective packages)

- GET /healthz - Health check
- GET /readyz - Readiness check
Executor (port 8081)
- GET /healthz - Health check
- GET /readyz - Readiness check
How It Works
- CreateRun → Runs Service persists the run to PostgreSQL and calls ActionsService.Enqueue(...) to enqueue the root action
- Actions Service / Executor → The enqueue call creates the root TaskAction CR in Kubernetes, which the Executor then watches and reconciles
- Executor → Transitions: Queued → Initializing → Running → Succeeded
- Actions Service → Watches TaskAction CRs via a shared informer and forwards status updates (phase, output URI, error state) to subscribers; the SDK controller consumes these updates through WatchForUpdates to drive the run forward
- Runs Service → Persists state changes to PostgreSQL and notifies its own watchers
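The Executor's phase sequence above can be pictured as a simple lookup from one phase to the next. The sketch below is only an illustration: `next_phase` is a hypothetical helper, not part of Flyte, and it covers just the happy path listed here. The real reconciler presumably also handles failure and abort phases that this README does not enumerate.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Flyte API): prints the phase that follows
# the given one on the happy path described in this README.
next_phase() {
  case "$1" in
    Queued)       echo "Initializing" ;;
    Initializing) echo "Running" ;;
    Running)      echo "Succeeded" ;;
    Succeeded)    return 1 ;;  # terminal phase: no successor
    *)            echo "unknown phase: $1" >&2; return 2 ;;
  esac
}

# Walk the happy path from Queued until a terminal phase is reached.
phase="Queued"
path="$phase"
while next=$(next_phase "$phase"); do
  phase="$next"
  path="$path -> $phase"
done
echo "$path"
```

Running the script prints the full path on one line, which is handy to compare against the phases shown by `kubectl get taskactions -n flyte -w`.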
Testing
Check Services
# Manager (Connect services) health
curl http://localhost:8090/healthz
# Executor health
curl http://localhost:8081/healthz
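When scripting against a freshly started manager, it can help to poll the readiness endpoints rather than call them once. The following is a minimal sketch, assuming the default ports listed above; `wait_ready` is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# wait_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until curl succeeds or attempts run out.
wait_ready() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors (>= 400) as failure, -s: silent,
    # -o /dev/null: discard the response body
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Example (assumes the default ports from this README):
#   wait_ready http://localhost:8090/readyz && wait_ready http://localhost:8081/readyz
```

The function returns non-zero when the endpoint never becomes ready, so it can gate later steps in a script with `&&`.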
Watch TaskActions
# List all TaskActions
kubectl get taskactions -n flyte
# Watch TaskActions in real-time
kubectl get taskactions -n flyte -w
# Get details of a specific TaskAction
kubectl describe taskaction <name> -n flyte
Check Database
# Connect to the PostgreSQL backend (devbox defaults)
psql -h localhost -p 30001 -U postgres -d runs
# List tables
\dt
# Query runs
SELECT * FROM runs;
# Query actions
SELECT name, phase, state FROM actions;
Deployment
Local Development
make run
Troubleshooting
Connection Issues
Error: "failed to get Kubernetes config"
# Verify kubeconfig
kubectl cluster-info
# Or set explicit kubeconfig in config.yaml
manager:
  kubernetes:
    kubeconfig: "/path/to/kubeconfig"
Database Issues
Error: relation/table not found (runs schema missing)
# Restart manager so startup migrations run
./bin/flyte-manager --config config.yaml
Port Conflicts
Error: "address already in use"
# Check what's using the ports
lsof -i :8090
lsof -i :8081
# Change ports in config.yaml
manager:
  server:
    port: 9090 # Changed from 8090
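To check both default ports in one pass before starting the manager, the `lsof` calls above can be wrapped in a small loop. This is a sketch under stated assumptions: `port_in_use` and `report_ports` are hypothetical helper names, and without elevated privileges `lsof` may not show listeners owned by other users.

```shell
#!/usr/bin/env bash
# port_in_use PORT: succeeds if some process is listening on TCP PORT.
port_in_use() {
  # -t prints only PIDs; lsof exits non-zero when nothing matches.
  lsof -iTCP:"$1" -sTCP:LISTEN -t >/dev/null 2>&1
}

# report_ports PORT...: prints one status line per port.
report_ports() {
  local port
  for port in "$@"; do
    if port_in_use "$port"; then
      echo "$port: in use"
    else
      echo "$port: free"
    fi
  done
}

# Example (default manager and executor ports from this README):
#   report_ports 8090 8081
```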
Development
Build
make build
Clean
make clean
Run Tests
make test