

What is Colonies?
Colonies is a Process Orchestration framework for managing AI/ML workloads across heterogeneous computing platforms.
Key features
- Colonies makes it possible to implement a loosely coupled workflow architecture spanning many platforms and infrastructures. All coordination is managed by Colonies servers, so developers can focus on implementing Colonies workers based on a Function-as-a-Service (FaaS) event-driven execution model.
- Complex workflows are automatically broken down into events (process assignments) that are received by the workers. The system can then scale simply by deploying more workers. Failed processes are automatically re-assigned to other workers.
- Colonies functions as a distributed ledger and contains the full execution history. This traceability allows developers to keep track of the system and debug their services more easily.
- Colonies integrates well with Kubernetes and offers a more powerful alternative to traditional message-broker worker queues, e.g., RabbitMQ worker queues.
- Colonies provides functionality to establish trusted distributed computing environments and is a building block for a Meta-Operating System, an overlay built on top of existing operating systems and platforms to create compute continuums spanning devices, webapps, clouds, and edge and HPC platforms.
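The pull-based worker model and automatic re-assignment described above can be sketched in a few lines of Python. This is only a toy, in-memory illustration and not the Colonies implementation or SDK: `ToyServer`, `Process`, `run`, and the `max_attempts` limit are all hypothetical names and assumptions introduced for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Process:
    func: str
    attempts: int = 0
    history: list = field(default_factory=list)  # execution trace of this process


class ToyServer:
    """Hypothetical in-memory stand-in for a server-side process queue."""

    def __init__(self, max_attempts=3):  # retry limit is an assumption
        self.waiting = deque()
        self.done = []
        self.failed = []
        self.max_attempts = max_attempts

    def submit(self, func):
        self.waiting.append(Process(func))

    def assign(self, worker_name):
        # Hand the oldest waiting process to the requesting worker.
        if not self.waiting:
            return None
        process = self.waiting.popleft()
        process.attempts += 1
        process.history.append(f"assigned to {worker_name}")
        return process

    def close(self, process, ok):
        # Record the outcome; failed processes go back to the queue.
        if ok:
            process.history.append("successful")
            self.done.append(process)
        elif process.attempts < self.max_attempts:
            process.history.append("failed, re-queued")
            self.waiting.append(process)
        else:
            process.history.append("failed permanently")
            self.failed.append(process)


def run(server, workers):
    """Let workers take turns pulling processes until the queue is empty."""
    names = list(workers)
    turn = 0
    while server.waiting:
        name = names[turn % len(names)]
        turn += 1
        process = server.assign(name)
        server.close(process, workers[name](process.func))


server = ToyServer()
server.submit("echo sayhello")
# One worker that always fails and one that always succeeds.
run(server, {"flaky": lambda func: False, "good": lambda func: True})
for entry in server.done[0].history:
    print(entry)
```

Adding capacity in this sketch means adding another entry to the `workers` dictionary, and the per-process `history` list plays the role of the execution trace.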
Design
The core idea of Colonies is to split complex workloads into two layers: a Meta-layer and an Execution-layer.

- The Meta-layer makes it possible to describe and manage complex workflows as meta-processes independently of implementation and execution environment.
- The Execution-layer provides a serverless computing environment where developers can implement workers capable of executing certain types of meta-processes. AI applications can then be broken down into composable functions executed by remote workers anywhere on the Internet.
- A built-in zero-trust protocol makes it possible to organize remote workers as a single unit called a Colony, so users can keep control even when workloads are spread out and executed on many different platforms at the same time.
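As a rough illustration of how a meta-process description can stay independent of the workers that execute it, the sketch below matches a spec against workers by comparing a `runtimetype` condition. The spec layout and the `testworker` name follow the example in this README; the matching function `eligible_workers`, the second worker, and its `ml` runtime type are assumptions made for illustration, not the actual scheduling logic of Colonies.

```python
def eligible_workers(spec, workers):
    """Return the names of workers whose runtimetype matches the spec's condition."""
    wanted = spec["conditions"]["runtimetype"]
    return [w["name"] for w in workers if w["runtimetype"] == wanted]


# Meta-layer: what should run, and under which conditions.
spec = {"conditions": {"runtimetype": "cli"}, "func": "echo sayhello"}

# Execution-layer: workers that have registered themselves.
workers = [
    {"name": "testworker", "runtimetype": "cli"},
    {"name": "gpuworker", "runtimetype": "ml"},  # hypothetical second worker
]

print(eligible_workers(spec, workers))  # prints ['testworker']
```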
Example
Start a Colonies server
source devenv
colonies dev
Submit a process spec (saved as sayhello.json)
{
    "conditions": {
        "runtimetype": "cli"
    },
    "func": "echo sayhello"
}
colonies process submit --spec sayhello.json
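If the spec is produced by a program rather than written by hand, the file can be generated with a few lines of Python. This is a minimal sketch that uses only the standard library and the two fields shown above; the helper name `write_spec` is hypothetical.

```python
import json
from pathlib import Path


def write_spec(path, runtimetype, func):
    """Write a process spec with the fields used in this README's example."""
    spec = {"conditions": {"runtimetype": runtimetype}, "func": func}
    Path(path).write_text(json.dumps(spec, indent=4) + "\n")
    return spec


write_spec("sayhello.json", "cli", "echo sayhello")
```

The resulting sayhello.json can then be passed to the submit command above.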
Start a Unix worker (executes functions as Unix commands)
colonies worker start --name testworker --runtimetype cli
INFO[0000] Lauching process Args="[]" Func="echo sayhello"
sayhello
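Conceptually, a worker of this kind takes the `func` string from an assigned process and runs it as a Unix command. The sketch below shows only that execution step with Python's standard library; it does not talk to a Colonies server, and the `execute` function is a hypothetical name, not part of any Colonies SDK.

```python
import shlex
import subprocess


def execute(spec):
    """Run the spec's func as a Unix command and return its standard output."""
    argv = shlex.split(spec["func"])  # "echo sayhello" -> ["echo", "sayhello"]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


output = execute({"conditions": {"runtimetype": "cli"}, "func": "echo sayhello"})
print(output, end="")  # prints: sayhello
```

Passing an argument list instead of a shell string avoids shell interpretation of the command; `check=True` raises an exception on a non-zero exit code, which a real worker would report back as a failed process.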
See this guide for how to implement workers in Python, Julia, Go, and JavaScript.
Dashboard screenshots
Below are some screenshots from the Colonies Dashboard:

Installation
Presentations
Guides
Design
SDKs
Deployment
More information can also be found here.
Current users
- Colonies is currently being used by RockSigma AB to build a compute engine for automatic seismic processing in underground mines.
Running the tests
Follow the instructions in the Installation Guide, then run:
make test