Cilium’s new Tetragon component enables powerful realtime, eBPF-based Security Observability and
Runtime Enforcement.
Tetragon detects and is able to react to security-significant events, such as:
Process execution events
System call activity
I/O activity including network & file access
When used in a Kubernetes environment, Tetragon is Kubernetes-aware: it understands
Kubernetes identities such as namespaces and pods, so that security event detection
can be configured in relation to individual workloads.
Functionality Overview
eBPF Real-Time
Tetragon is a runtime security enforcement and observability tool. It applies
policy and filtering directly in eBPF in the kernel: filtering, blocking, and
reacting to events all happen in the kernel instead of after events have been
sent to a user-space agent.
For an observability use case, applying filters directly in the kernel drastically reduces
observation overhead. Filtering in eBPF avoids expensive context switches and wake-ups,
which matters most for high-frequency events such as send, read, or write operations.
Tetragon provides rich filters (file, socket, binary names, namespace/capabilities,
etc.) in eBPF, which allow users to specify the important and relevant events in their
specific context and pass only those to the user-space agent.
eBPF Flexibility
Tetragon can hook into any function in the Linux kernel and filter on its arguments,
return value, associated metadata that Tetragon collects about processes (e.g., executable
names), files, and other properties. By writing tracing policies, users can solve various
security and observability use cases. We provide a number of examples for these in the repository and
highlight some below in the 'Getting Started Guide', but users are encouraged to create new policies that
match their use cases. The examples are just that: jumping-off points that users can
then use to create new and specific policy deployments, potentially even tracing kernel
functions we did not consider. None of the specifics about which functions are traced
and what filters are applied are hard-coded in the engine itself.
Critically, Tetragon allows hooking deep in the kernel where data structures cannot be manipulated
by user-space applications. This avoids common issues with syscall tracing, where
data is incorrectly read, maliciously altered by attackers, or missing due to page
faults and other user/kernel boundary errors.
Many of the Tetragon developers are also kernel developers. By leveraging this knowledge base
Tetragon has created a set of tracing policies that can solve many common observability
and security use cases.
eBPF Kernel Aware
Tetragon, through eBPF, has access to the Linux kernel state. Tetragon can then
join this kernel state with Kubernetes awareness or user policy to create rules
enforced by the kernel in real time. This allows annotating and enforcing process
namespaces and capabilities, and mapping sockets to processes, process file descriptors to
filenames, and so on. For example, when an application changes its privileges, we
can create a policy to trigger an alert or even kill the process before it has
a chance to complete the syscall and potentially run additional syscalls.
This Quickstart guide uses a Kind cluster and a Helm-based installation to
provide a simple way to get hands-on experience with Tetragon and
the generated events. These events include monitoring process execution,
network sockets, and file access to see what binaries are executing and making
network connections or writing to sensitive files.
In this scenario, we are going to install a demo application and then:
observe all process execution happening inside a Kubernetes workload
detect file access and writes
observe network connections that a Kubernetes workload is making
detect privileged processes inside a Kubernetes workload
While we use a Kubernetes Kind cluster in this guide, users can also apply
the same concepts in other Kubernetes platforms, bare-metal, or VM environments.
Requirements
The base kernel should support BTF or the BTF file should
be placed where Tetragon can read it.
For reference, the examples below use this Vagrantfile and we
created our Kind cluster using
the default options.
Create a cluster
Create a Kubernetes cluster using Kind or GKE.
Kind
Run the following command to create the Kubernetes cluster:
kind create cluster
GKE
Run the following command to create a GKE cluster:
Before going forward, verify that all pods are up and running - it might take
several seconds for some pods until they satisfy all the dependencies:
kubectl get pods
NAME READY STATUS RESTARTS AGE
deathstar-6c94dcc57b-7pr8c 1/1 Running 0 10s
deathstar-6c94dcc57b-px2vw 1/1 Running 0 10s
tiefighter 1/1 Running 0 10s
xwing 1/1 Running 0 10s
Explore Security Observability Events
After Tetragon and the demo application are up and running, you can examine
the security and observability events produced by Tetragon in different ways.
Raw JSON events
The first way is to observe the raw JSON output from the stdout container log:
The raw JSON events provide Kubernetes API, identity metadata, and OS-level
process visibility about the executed binary, its parent, and the execution
time.
tetra CLI
A second way is to pretty print the events using the
tetra CLI.
The tool also allows filtering by process, pod, and other fields.
If you are using Homebrew, you can install the latest release with:
brew install tetra
Or you can download and install the latest release with the following commands:
Tetragon is able to observe critical hooks in the kernel through its sensors
and generates enriched events from them. In the next sections we detail the
available sensors and the events they produce:
Generic tracing: generating process_kprobe
and process_tracepoint events.
Along the way, we present use cases that show how these events can be used as a starting point.
Process execution
Tetragon observes process creation and termination with default configuration
and generates process_exec and process_exit events:
The process_exec events include useful information about the execution of
binaries and related process information. This includes the binary image that
was executed, command-line arguments, the UID context the process was
executed with, the process parent information, the capabilities that a
process had while executed, the process start time, the Kubernetes Pod,
labels and more.
Just as process_exec events show how and when a process started, process_exit
events indicate how and when a process terminated. The information
in the event includes the binary image that was executed, command-line
arguments, the UID context the process was executed with, process parent
information, process start time, and the status codes and signals on process
exit. The status code and signal show whether a process exited normally,
failed, or was killed.
Both these events include Linux-level metadata (UID, parents, capabilities,
start time, etc.) as well as Kubernetes-level metadata (Kubernetes namespace,
labels, name, etc.). This data connects node-level concepts, such as
processes, with Kubernetes or container environments.
These events enable a full lifecycle view into a process that can aid an
incident investigation. For example, we can determine if a suspicious process
is still running in a particular environment. For concrete examples of such
events, see the next use case on process execution.
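The lifecycle idea above can be sketched in a few lines of Python: pair each process_exec event with its process_exit event and report whatever is left over. This is a minimal sketch, not part of Tetragon. It assumes one JSON event per line, that events arrive in order, and that both event types carry a `process.exec_id` and `process.binary` field; the sample lines are hypothetical and heavily trimmed.

```python
import json

# Hypothetical, trimmed events for illustration only; real events carry more fields.
SAMPLE_LINES = [
    '{"process_exec": {"process": {"exec_id": "id-1", "binary": "/bin/bash"}}}',
    '{"process_exec": {"process": {"exec_id": "id-2", "binary": "/usr/bin/whoami"}}}',
    '{"process_exit": {"process": {"exec_id": "id-2", "binary": "/usr/bin/whoami"}}}',
]


def still_running(lines):
    """Map exec_id -> binary for processes with an exec event but no exit event."""
    running = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        event = json.loads(line)
        if "process_exec" in event:
            process = event["process_exec"].get("process", {})
            running[process.get("exec_id")] = process.get("binary")
        elif "process_exit" in event:
            process = event["process_exit"].get("process", {})
            # Drop the process once its exit has been seen.
            running.pop(process.get("exec_id"), None)
    return running


if __name__ == "__main__":
    print(still_running(SAMPLE_LINES))
```

Keying on the exec_id rather than the PID avoids confusing two processes that reuse the same PID over time.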
Use case 1: Monitoring Process Execution
This first use case is monitoring process execution, which can be observed with
the Tetragon process_exec and process_exit JSON events.
These events contain the full lifecycle of processes, from fork/exec to
exit, including metadata such as:
Binary name: Defines the name of an executable file
Parent process: Helps to identify process execution anomalies (e.g., if a nodejs app forks a shell, this is suspicious)
Command-line argument: Defines the program runtime behavior
Current working directory: Helps to identify hidden malware execution from a temporary folder, which is a common pattern used by malware
Kubernetes metadata: Contains pods, labels, and Kubernetes namespaces, which are critical to identify service owners, particularly in multitenant environments
exec_id: A unique process identifier that correlates all recorded activity of a process
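As a rough illustration of how the fields listed above might be pulled out of a single event, the following Python sketch parses one JSON line and collects them into a flat dictionary. The sample event is hypothetical and trimmed, and the field names (`process_exec`, `process`, `parent`, `binary`, `arguments`, `cwd`, `pod`, `exec_id`) are assumptions about the JSON layout; check them against the events your Tetragon version emits.

```python
import json

# Hypothetical, trimmed event for illustration only; real events carry more fields.
SAMPLE_EVENT = json.dumps({
    "process_exec": {
        "process": {
            "exec_id": "example-exec-id",
            "binary": "/usr/bin/whoami",
            "arguments": "",
            "cwd": "/",
            "pod": {"namespace": "default", "name": "xwing"},
        },
        "parent": {"binary": "/bin/bash"},
    }
})


def summarize_exec(line):
    """Return the metadata discussed above for a process_exec event, or None."""
    event = json.loads(line)
    body = event.get("process_exec")
    if body is None:
        return None  # not a process_exec event
    process = body.get("process", {})
    pod = process.get("pod", {})
    return {
        "binary": process.get("binary"),
        "parent_binary": body.get("parent", {}).get("binary"),
        "arguments": process.get("arguments"),
        "cwd": process.get("cwd"),
        "namespace": pod.get("namespace"),
        "pod": pod.get("name"),
        "exec_id": process.get("exec_id"),
    }


if __name__ == "__main__":
    print(summarize_exec(SAMPLE_EVENT))
```

Missing fields come back as `None` rather than raising, which keeps the sketch usable on events from processes that run outside a pod.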
As a first step, let's start monitoring the events from the xwing pod:
Then in another terminal, let's kubectl exec into the xwing pod and execute
some example commands:
kubectl exec -it xwing -- /bin/bash
whoami
The output in the first terminal should be:
🚀 process default/xwing /bin/bash
🚀 process default/xwing /usr/bin/whoami
💥 exit default/xwing /usr/bin/whoami 0
Here you can see the binary names along with their arguments, the pod info, and
return codes in a compact one-line view of the events.
For more detailed information, use the raw JSON events. You can stop
the Tetragon CLI with Ctrl-C and parse the tetragon.log file by executing:
Tetragon also provides the ability to check process capabilities and kernel
namespace access.
This information would help us determine which process or Kubernetes pod has
started or gained access to privileges or host namespaces that it should not
have. This would help us answer questions like:
Which Kubernetes pods are running with CAP_SYS_ADMIN in my cluster?
Which Kubernetes pods have host network or pid namespace access in my
cluster?
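The first question can be approximated offline from the JSON events. The Python sketch below scans process_exec events and collects the pods whose processes list a given capability. It is only a sketch under assumptions: the `cap`, `permitted`, and `effective` field names and the sample lines are hypothetical, and capability data is only present once the credential visibility described below is enabled.

```python
import json

# Hypothetical, trimmed events for illustration only.
SAMPLE_LINES = [
    json.dumps({"process_exec": {"process": {
        "binary": "/bin/sleep",
        "pod": {"namespace": "default", "name": "test-pod"},
        "cap": {"permitted": ["CAP_SYS_ADMIN"], "effective": ["CAP_SYS_ADMIN"]},
    }}}),
    json.dumps({"process_exec": {"process": {
        "binary": "/usr/bin/whoami",
        "pod": {"namespace": "default", "name": "xwing"},
        "cap": {"permitted": [], "effective": []},
    }}}),
]


def pods_with_capability(lines, capability="CAP_SYS_ADMIN"):
    """Return sorted (namespace, pod) pairs whose exec events list `capability`."""
    found = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        process = json.loads(line).get("process_exec", {}).get("process", {})
        caps = process.get("cap", {})
        granted = set(caps.get("permitted", [])) | set(caps.get("effective", []))
        pod = process.get("pod")
        if capability in granted and pod:
            found.add((pod.get("namespace", ""), pod.get("name", "")))
    return sorted(found)


if __name__ == "__main__":
    print(pods_with_capability(SAMPLE_LINES))
```

Processes without pod metadata (for example, host processes) are skipped here because the question is about pods.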
As a first step, let's enable visibility into capability and namespace changes via
the ConfigMap by setting enable-process-cred and enable-process-ns from
false to true:
kubectl edit cm -n kube-system tetragon-config
# change "enable-process-cred" from "false" to "true"
# change "enable-process-ns" from "false" to "true"
# then save and exit
If you observe the output in the first terminal, you can see the container start with CAP_SYS_ADMIN:
🚀 process default/test-pod /bin/sleep 365d 🛑 CAP_SYS_ADMIN
🚀 process default/test-pod /usr/bin/jq -r .bundle 🛑 CAP_SYS_ADMIN
🚀 process default/test-pod /usr/bin/cp /kind/product_name /kind/product_uuid /run/containerd/io.containerd.runtime.v2.task/k8s.io/7c7e513cd4d506417bc9d97dd9af670d94d9e84161c8c8 fdc9fa3a678289a59/rootfs/ 🛑 CAP_SYS_ADMIN
Generic tracing
For more advanced use cases, Tetragon can observe tracepoints and arbitrary
kernel calls via kprobes. For that, Tetragon must be extended and configured
with custom resource objects named TracingPolicy. It can then generate
process_tracepoint and process_kprobe events.
TracingPolicy is a user-configurable Kubernetes custom resource that allows
users to trace arbitrary events in the kernel and optionally define actions to
take on a match. For example, a SIGKILL signal can be sent to the process or
the return value of a system call can be overridden. For bare metal or VM use
cases without Kubernetes, the same YAML configuration can be passed via a flag
to the Tetragon binary or via the tetra CLI to load the policies via gRPC.
For more information on TracingPolicy and how to write them, see the
TracingPolicy Guide.
Use case 1: File Access
The first use case is file access, which can be observed with the Tetragon
process_kprobe JSON events. By using kprobe hook points, these events are
able to observe arbitrary kernel calls and file descriptors in the Linux
kernel, giving you the ability to monitor every file a process opens, reads,
writes, and closes throughout its lifecycle.
In this example, we can monitor if a process inside a Kubernetes workload performs
an open, close, read or write in the /etc/ directory. The policy may further
specify additional directories or specific files if needed.
As a first step, let's apply the following TracingPolicy:
In another terminal, kubectl exec into the xwing pod:
kubectl exec -it xwing -- /bin/bash
and edit the /etc/passwd file:
vi /etc/passwd
The output in the first terminal should be:
🚀 process default/xwing /usr/bin/vi /etc/passwd
📬 open default/xwing /usr/bin/vi /etc/passwd
📚 read default/xwing /usr/bin/vi /etc/passwd 1269 bytes
📪 close default/xwing /usr/bin/vi /etc/passwd
📬 open default/xwing /usr/bin/vi /etc/passwd
📝 write default/xwing /usr/bin/vi /etc/passwd 1277 bytes
💥 exit default/xwing /usr/bin/vi /etc/passwd 0
Note that open and close events are only generated for /etc/ files because of in-kernel eBPF
filtering. The default CRD additionally filters out events associated with the
pod init process to reduce noise from pod start.
Similarly to the previous example, reviewing the JSON events provides
additional data. An example process_kprobe event observing a write can be:
In addition to the Kubernetes Identity
and process metadata from exec events, process_kprobe events contain
the arguments of the observed system call. In the above case they are:
path: the observed file path
bytes_arg: content of the observed file encoded in base64
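Because bytes_arg is base64-encoded, it has to be decoded before it can be read. A minimal Python sketch of that step follows; the sample payload is built locally for illustration and is not taken from a real event.

```python
import base64


def decode_bytes_arg(encoded):
    """Decode a base64 bytes_arg value; undecodable bytes are replaced, not dropped."""
    raw = base64.b64decode(encoded)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical payload created here for illustration only.
    sample = base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode("ascii")
    print(decode_bytes_arg(sample), end="")
```

Decoding with `errors="replace"` is a choice made for this sketch: file contents are not guaranteed to be valid UTF-8, and replacing bad bytes keeps the output printable.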
Many common Linux distributions now ship with BTF enabled and do not require any extra work.
To check if BTF is enabled on your Linux system, the standard location is:
$ ls /sys/kernel/btf/
Otherwise, the Tetragon repository provides a Vagrantfile that can
be used to install a Vagrant box for running Tetragon with the BTF requirement met. Other VM solutions
work as well.
To run with Vagrant
We provide a standard Vagrantfile with the required components enabled. Simply run:
$ vagrant up
$ vagrant ssh
This should be sufficient to create a Kind cluster and run Tetragon. For more information on the vagrant builds, see the Development Guide.
COSIGN_EXPERIMENTAL=1 is used to allow verification of images signed in KEYLESS mode. To learn more about keyless signing, please refer to Keyless Signatures.
Software Bill of Materials
A Software Bill of Materials (SBOM) is a complete, formally structured list of
components that are required to build a given piece of software. SBOM provides
insight into the software supply chain and any potential concerns related to
license compliance and security that might exist.
Starting with version 0.8.4, all Tetragon images include an SBOM. The SBOM is
generated in SPDX format using the bom tool.
If you are new to the concept of SBOM, see what an SBOM can do for you.
Download SBOM
The SBOM can be downloaded from the supplied Tetragon image using the cosign download sbom command.
The Issuer and Subject fields of the output can be used to verify that the SBOM image
was signed using GitHub Actions in the Cilium repository.
FAQ
Q: Can I install and use Tetragon in standalone mode (outside of k8s)?
A: Yes! You can run make to generate standalone binaries and run them directly.
Make sure to take a look at the Development Setup
guide for the build requirements. Then use sudo ./tetragon --bpf-lib bpf/objs
to run Tetragon.
Q: CI is complaining about Go module vendoring, what do I do?
A: You can run make vendor then add and commit your changes.
Q: CI is complaining about a missing "signed-off-by" line. What do I do?
A: You need to add a signed-off-by line to your commit messages. The easiest way to do
this is with git fetch origin main && git rebase --signoff origin/main. Then push your changes.
Join the Tetragon Slack channel to chat with developers, maintainers, and other users. This
is a good first stop to ask questions and share your experiences.
Package rthooks contains code for managing run-time hooks. Runtime hooks are hooks for (synchronously) notifying the agent of runtime events such as the creation of a container.
This package provides a Tetragon gRPC client multiplexer and an RPCChecker that wraps a MultiEventChecker and uses the gRPC multiplexer to get a stream of events from all Tetragon pods.