cocoon-operator
Kubernetes operator that manages VM-backed pod lifecycles through two CRDs: Hibernation for suspending or waking a single VM pod, and CocoonSet for managing a group of related VM pods.
Overview
- Hibernation controller -- watches Hibernation CRDs and annotates pods with cocoon.cis/hibernate=true to trigger vk-cocoon snapshot and restore
- CocoonSet controller -- watches CocoonSet CRDs, creates and deletes agent and toolbox pods, manages suspend and unsuspend, and reports aggregate status
- Pod watcher -- detects pod changes owned by CocoonSets and triggers reconciliation
Kubernetes controllers assume pods are replaceable. VM-backed workloads are not. This operator keeps stateful VM workflows inside native Kubernetes APIs: scale out from a known main VM, keep stable slot identities, hibernate without letting ReplicaSets recreate the pod, and expose aggregate state through CRD status.
Architecture
The operator runs a single binary with three informer loops:
- Hibernation controller -- annotates pods for vk-cocoon snapshot and restore
- CocoonSet controller -- manages agent and toolbox pod groups
- Pod watcher -- triggers reconciliation on pod changes
A 30-second informer resync and 60-second periodic reconciliation catch status transitions that informer events may miss.
Installation
Prerequisites
- Kubernetes cluster (v1.26+)
- kubectl configured to talk to the cluster
- vk-cocoon virtual kubelet provider running on at least one node
Download
Download pre-built binaries from GitHub Releases.
Build from source
git clone https://github.com/cocoonstack/cocoon-operator.git
cd cocoon-operator
make build # produces ./cocoon-operator
Deploy
- Install the CRDs:
kubectl apply -f deploy/crd.yaml
kubectl apply -f deploy/cocoonset-crd.yaml
- Deploy the operator:
kubectl apply -f deploy/deploy.yaml
- Verify the operator is running:
kubectl get pods -l app=cocoon-operator
Configuration
| Variable | Default | Description |
| --- | --- | --- |
| KUBECONFIG | ~/.kube/config | Path to kubeconfig when running outside the cluster |
| LOG_LEVEL | info | Log level for the operator process |
Usage
Hibernate a pod
apiVersion: cocoon.cis/v1alpha1
kind: Hibernation
metadata:
  name: hibernate-bot-1
  namespace: prod
spec:
  podName: sre-agent-xxx
  action: hibernate
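The Hibernation controller is described above as annotating the target pod with cocoon.cis/hibernate=true. A minimal sketch of a JSON merge patch that would set that annotation follows. The function name hibernatePatch is hypothetical, and clearing the annotation to wake the pod is an assumption rather than confirmed operator behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hibernateAnnotation is the annotation key named in the overview.
const hibernateAnnotation = "cocoon.cis/hibernate"

// hibernatePatch (hypothetical) builds a JSON merge patch for a pod.
// With hibernate=true it sets the annotation to "true". With hibernate=false
// it sets the value to null, which in a merge patch removes the key; using
// removal as the wake signal is an assumption.
func hibernatePatch(hibernate bool) ([]byte, error) {
	var value any // nil marshals to JSON null
	if hibernate {
		value = "true"
	}
	patch := map[string]any{
		"metadata": map[string]any{
			"annotations": map[string]any{
				hibernateAnnotation: value,
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	p, err := hibernatePatch(true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(p))
}
```

A patch body like this could be sent with a merge-patch request against the pod named in spec.podName, which leaves the pod's other annotations untouched.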
Manage a VM group
apiVersion: cocoon.cis/v1alpha1
kind: CocoonSet
metadata:
  name: demo
spec:
  agent:
    image: ubuntu-dev-base
    replicas: 1
    resources:
      cpu: "2"
      memory: "4Gi"
  toolboxes:
    - name: windows
      os: windows
      image: windows-server-2022
      mode: static
Manifests
| File | Description |
| --- | --- |
| deploy/crd.yaml | Hibernation CRD |
| deploy/cocoonset-crd.yaml | CocoonSet CRD |
| deploy/deploy.yaml | Operator Deployment and RBAC |
Development
make build # build binary
make test # run tests
make lint # run golangci-lint
make fmt # format code
make help # show all targets
| Project | Role |
| --- | --- |
| cocoon-common | Shared metadata, Kubernetes, and logging helpers |
| cocoon-webhook | Admission webhook for sticky scheduling |
| epoch | Snapshot storage backend for hibernated VMs |
| vk-cocoon | Virtual kubelet provider that performs the actual VM lifecycle |
License
MIT