k8sattributesprocessor

package module

v0.59.0 Published: Aug 31, 2022 License: Apache-2.0 Imports: 22 Imported by: 19

README ¶

Documentation is published to pkg.go.dev

Documentation ¶

Overview ¶

Package k8sattributesprocessor allows automatic tagging of spans, metrics and logs with k8s metadata.

The processor automatically discovers k8s resources (pods), extracts metadata from them, and adds the extracted metadata to the relevant spans, metrics and logs. The processor uses the Kubernetes API to discover all pods running in a cluster and keeps a record of their IP addresses, pod UIDs and interesting metadata. The rules for associating the data passing through the processor (spans, metrics and logs) with specific Pod metadata are configured via the "pod_association" key. It represents a list of associations that are evaluated in the specified order until the first one produces a match.

Each association is specified as a list of sources, and each source is a rule. All rules of an association are executed, and the combination of their results forms the pod metadata cache key. For an association to be applied, all the data defined in each of its sources has to be successfully fetched from the log, trace or metric.

Each source rule is specified as a pair of `from` (representing the rule type) and `name` (representing the attribute name if `from` is set to `resource_attribute`). The following rule types are available:

from: "connection" - takes the IP attribute from the connection context (if available)
from: "resource_attribute" - specifies the name of the attribute to look up in the list of attributes of the received Resource.
                             Semantic conventions should be used for naming.

Pod association configuration:

pod_association:
  - sources:
      - from: resource_attribute
        name: k8s.pod.ip
  # below association matches for pair `k8s.pod.name` and `k8s.namespace.name`
  - sources:
      - from: resource_attribute
        name: k8s.pod.name
      - from: resource_attribute
        name: k8s.namespace.name

If Pod association rules are not configured, resources are associated with metadata only by the connection's IP address.
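The ordered evaluation described above can be sketched as follows. This is a minimal illustration, not the processor's implementation: the `source` and `association` types, the `cacheKey` function and the `|` separator are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// source mirrors one entry under `sources` (hypothetical type).
type source struct {
	From string // "connection" or "resource_attribute"
	Name string // attribute name, used only when From is "resource_attribute"
}

// association mirrors one entry of `pod_association` (hypothetical type).
type association struct {
	Sources []source
}

// cacheKey returns the lookup key of the first association whose sources can
// all be resolved. connIP is the peer IP from the connection context, or ""
// when it is unknown. The "|" separator is an assumption of this sketch.
func cacheKey(assocs []association, attrs map[string]string, connIP string) (string, bool) {
	for _, a := range assocs {
		parts := make([]string, 0, len(a.Sources))
		for _, s := range a.Sources {
			var v string
			switch s.From {
			case "connection":
				v = connIP
			case "resource_attribute":
				v = attrs[s.Name]
			}
			if v == "" {
				break // a missing value disqualifies the whole association
			}
			parts = append(parts, v)
		}
		if len(a.Sources) > 0 && len(parts) == len(a.Sources) {
			return strings.Join(parts, "|"), true
		}
	}
	return "", false
}

func main() {
	assocs := []association{
		{Sources: []source{{From: "resource_attribute", Name: "k8s.pod.ip"}}},
		{Sources: []source{
			{From: "resource_attribute", Name: "k8s.pod.name"},
			{From: "resource_attribute", Name: "k8s.namespace.name"},
		}},
	}
	attrs := map[string]string{"k8s.pod.name": "web-0", "k8s.namespace.name": "prod"}

	// The first association fails (no k8s.pod.ip), so the second one is used.
	key, ok := cacheKey(assocs, attrs, "")
	fmt.Println(key, ok) // prints "web-0|prod true"
}
```

Stopping at the first association that fully resolves is what makes the order of entries under `pod_association` significant.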

Which metadata to collect is determined by the `metadata` configuration, which defines the list of resource attributes to be added. Items in the list are named exactly the same as the resource attributes that will be added. All available attributes are enabled by default; you can reduce the list with the `metadata` configuration. The following attributes will be added if the pod is identified:

k8s.namespace.name
k8s.pod.name
k8s.pod.uid
k8s.pod.start_time
k8s.deployment.name
k8s.node.name

Not all the attributes are guaranteed to be added.

Only attribute names from `metadata` should be used for pod_association's `resource_attribute`, because empty or non-existing values will be ignored.

The following container level attributes require additional attributes to identify a particular container in a pod:

Container spec attributes - will be set only if the container-identifying attribute `k8s.container.name` is set as a resource attribute (as with all other attributes, the pod has to be identified as well):
  - container.image.name
  - container.image.tag

Container status attributes - in addition to the pod identifier and the `k8s.container.name` attribute, these attributes require the identifier of a particular container run, set as `k8s.container.restart_count` in resource attributes:
  - container.id

The k8sattributesprocessor can be used for automatic tagging of spans, metrics and logs with k8s labels and annotations from pods and namespaces. The association of the data passing through the processor (spans, metrics and logs) with specific Pod/Namespace annotations/labels is configured via the "annotations" and "labels" keys. This config represents a list of annotations/labels that are extracted from pods/namespaces and added to spans, metrics and logs. Each item is specified as a config of tag_name (the name of the tag to add), key (the key used to extract the value) and from (the Kubernetes object the value is extracted from). The "from" field has only two possible values, "pod" and "namespace", and defaults to "pod" if none is specified.

A few examples of this config are as follows:

annotations:
  - tag_name: a1 # extracts value of annotation from pods with key `annotation-one` and inserts it as a tag with key `a1`
    key: annotation-one
    from: pod
  - tag_name: a2 # extracts value of annotation from namespaces with key `annotation-two` with regexp and inserts it as a tag with key `a2`
    key: annotation-two
    regex: field=(?P<value>.+)
    from: namespace

labels:
  - tag_name: l1 # extracts value of label from namespaces with key `label1` and inserts it as a tag with key `l1`
    key: label1
    from: namespace
  - tag_name: l2 # extracts value of label from pods with key `label2` with regexp and inserts it as a tag with key `l2`
    key: label2
    regex: field=(?P<value>.+)
    from: pod
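To illustrate how a `regex` rule such as `field=(?P<value>.+)` narrows a raw annotation or label value, here is a small sketch using Go's standard `regexp` package. The `extractValue` helper is hypothetical, and its behavior on a non-matching value (reporting failure) is an assumption of this example rather than documented processor behavior.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractValue applies an optional `regex` rule to a raw annotation or label
// value. The expression is expected to define a named group called "value".
// Hypothetical helper: returning false on a non-match is an assumption.
func extractValue(raw, pattern string) (string, bool) {
	if pattern == "" {
		return raw, true // no regex configured: use the value as-is
	}
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", false
	}
	idx := re.SubexpIndex("value")
	if idx < 0 {
		return "", false // the pattern lacks the required named group
	}
	m := re.FindStringSubmatch(raw)
	if m == nil {
		return "", false
	}
	return m[idx], true
}

func main() {
	v, ok := extractValue("field=blue", `field=(?P<value>.+)`)
	fmt.Println(v, ok) // prints "blue true"
}
```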

RBAC ¶

The k8sattributesprocessor needs `get`, `watch` and `list` permissions on both `pods` and `namespaces` resources, for all namespaces and pods included in the configured filters. Here is an example of a `ClusterRole` that gives a `ServiceAccount` the necessary permissions for all pods and namespaces in the cluster (replace `<OTEL_COL_NAMESPACE>` with the namespace where the collector is deployed):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: collector
  namespace: <OTEL_COL_NAMESPACE>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
- kind: ServiceAccount
  name: collector
  namespace: <OTEL_COL_NAMESPACE>
roleRef:
  kind: ClusterRole
  name: otel-collector
  apiGroup: rbac.authorization.k8s.io

Config

k8sattributes:
k8sattributes/2:
  auth_type: "serviceAccount"
  passthrough: false
  filter:
    node_from_env_var: KUBE_NODE_NAME

  extract:
    metadata:
      - k8s.pod.name
      - k8s.pod.uid
      - k8s.deployment.name
      - k8s.namespace.name
      - k8s.node.name
      - k8s.pod.start_time

  pod_association:
   - sources:
      - from: resource_attribute
        name: k8s.pod.ip
   - sources:
      - from: resource_attribute
        name: k8s.pod.uid
   - sources:
      - from: connection

Deployment scenarios ¶

The processor supports running both in agent and collector mode.

As an agent ¶

When running as an agent, the processor detects IP addresses of pods sending spans, metrics or logs to the agent and uses this information to extract metadata from pods. When running as an agent, it is important to apply a discovery filter so that the processor only discovers pods from the same host that it is running on. Not using such a filter can result in unnecessary resource usage, especially on very large clusters. Once the filter is applied, each processor will only query the k8s API for pods running on its own node.

A node filter can be applied by setting the `filter.node` config option to the name of a k8s node. While this works as expected, in most cases it cannot be used to automatically filter pods by the node that the processor is running on, because it is not known beforehand which node a pod will be scheduled on. Luckily, Kubernetes has a solution for this called the downward API. To automatically filter pods by the node the processor is running on, you'll need to complete the following steps:

1. Use the downward API to inject the node name as an environment variable. Add the following snippet under the pod env section of the OpenTelemetry container.

spec:
  containers:
  - env:
    - name: KUBE_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName

This will inject a new environment variable into the OpenTelemetry container whose value is the name of the node the pod was scheduled to run on.

2. Set "filter.node_from_env_var" to the name of the environment variable holding the node name.

k8sattributes:
  filter:
    node_from_env_var: KUBE_NODE_NAME # this should be same as the var name used in previous step

This will restrict each OpenTelemetry agent to querying only pods running on the same node, dramatically reducing resource requirements for very large clusters.
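The two node filter options can be pictured with the sketch below. The `resolveNode` helper is hypothetical, and the precedence it applies (the environment variable wins when both options are set) is an assumption of this example; the text above does not state how the two options interact.

```go
package main

import (
	"fmt"
	"os"
)

// resolveNode picks the node name to filter pods by, given the values of
// `filter.node` and `filter.node_from_env_var`.
// Assumption: the environment variable takes precedence when both are set.
func resolveNode(node, nodeFromEnvVar string) string {
	if nodeFromEnvVar != "" {
		return os.Getenv(nodeFromEnvVar)
	}
	return node
}

func main() {
	// Stands in for the value the downward API would inject into the container.
	os.Setenv("KUBE_NODE_NAME", "node-a")

	fmt.Println(resolveNode("", "KUBE_NODE_NAME")) // prints "node-a"
	fmt.Println(resolveNode("node-b", ""))         // prints "node-b"
}
```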

As a collector ¶

The processor can be deployed either as an agent or as a collector.

When running as a collector, the processor cannot correctly detect the IP address of the pods generating the telemetry data when it receives the data from an agent instead of directly from the pods, unless one of the well-known IP attributes is present. To work around this issue, agents deployed with the k8sattributes processor can be configured to detect the IP addresses and forward them along with the telemetry data resources. The collector can then match this IP address with k8s pods and enrich the records with the metadata. To set this up, you'll need to complete the following steps:

1. Set up agents in passthrough mode. Configure the agents' k8sattributes processors to run in passthrough mode.

# k8sattributes config for agent
k8sattributes:
  passthrough: true

This will ensure that the agents detect the IP address and add it as an attribute to all telemetry resources. Agents will not make any k8s API calls, do any discovery of pods or extract any metadata.

2. Configure the collector as usual. No special configuration changes are needed on the collector. It'll automatically detect the IP address of spans, logs and metrics sent by the agents as well as directly by other services/pods.
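The division of work between a passthrough agent and the collector can be sketched as follows. Everything here is illustrative: the function names, the in-memory pod table, and the use of `k8s.pod.ip` as the forwarded attribute are assumptions of this example, not a description of the processor's internals.

```go
package main

import "fmt"

// podIPAttr is the resource attribute assumed to carry the pod IP from the
// agent to the collector in this sketch.
const podIPAttr = "k8s.pod.ip"

// agentPassthrough records the connection's peer IP on the resource and does
// nothing else: no API calls, no discovery, no metadata extraction.
func agentPassthrough(attrs map[string]string, peerIP string) {
	if _, ok := attrs[podIPAttr]; !ok && peerIP != "" {
		attrs[podIPAttr] = peerIP
	}
}

// collectorEnrich looks the forwarded IP up in a table of known pods and
// copies the pod's metadata onto the resource. It reports whether a pod matched.
func collectorEnrich(attrs map[string]string, podsByIP map[string]map[string]string) bool {
	meta, ok := podsByIP[attrs[podIPAttr]]
	if !ok {
		return false
	}
	for k, v := range meta {
		attrs[k] = v
	}
	return true
}

func main() {
	attrs := map[string]string{}
	agentPassthrough(attrs, "10.1.2.3") // runs on the agent

	pods := map[string]map[string]string{
		"10.1.2.3": {"k8s.pod.name": "web-0", "k8s.namespace.name": "prod"},
	}
	matched := collectorEnrich(attrs, pods) // runs on the collector
	fmt.Println(matched, attrs["k8s.pod.name"]) // prints "true web-0"
}
```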

Caveats ¶

There are some edge-cases and scenarios where k8sattributes will not work properly.

Host networking mode ¶

The processor cannot correctly identify pods running in host network mode, and enriching telemetry data generated by such pods is not supported at the moment unless the association rule is based on something other than the IP attribute.

As a sidecar ¶

The processor does not support detecting containers from the same pods when running as a sidecar. While this can be done, we think it is simpler to just use the kubernetes downward API to inject environment variables into the pods and directly use their values as tags.

Index ¶

func NewFactory() component.ProcessorFactory
type Config
- func (cfg *Config) Validate() error
type ExcludeConfig
type ExcludePodConfig
type ExtractConfig
type FieldExtractConfig
type FieldFilterConfig
type FilterConfig
type PodAssociationConfig
type PodAssociationSourceConfig

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func NewFactory ¶

func NewFactory() component.ProcessorFactory

NewFactory returns a new factory for the k8s processor.

Types ¶

type Config ¶

type Config struct {
	config.ProcessorSettings `mapstructure:",squash"` // squash ensures fields are correctly decoded in embedded struct

	k8sconfig.APIConfig `mapstructure:",squash"`

	// Passthrough mode only annotates resources with the pod IP and
	// does not try to extract any other metadata. It does not need
	// access to the K8S cluster API. Agent/Collector must receive spans
	// directly from services to be able to correctly detect the pod IPs.
	Passthrough bool `mapstructure:"passthrough"`

	// Extract section allows specifying extraction rules to extract
	// data from k8s pod specs
	Extract ExtractConfig `mapstructure:"extract"`

	// Filter section allows specifying filters to filter
	// pods by labels, fields, namespaces, nodes, etc.
	Filter FilterConfig `mapstructure:"filter"`

	// Association section allows to define rules for tagging spans, metrics,
	// and logs with Pod metadata.
	Association []PodAssociationConfig `mapstructure:"pod_association"`

	// Exclude section allows to define names of pod that should be
	// ignored while tagging.
	Exclude ExcludeConfig `mapstructure:"exclude"`
}

Config defines configuration for k8s attributes processor.

func (*Config) Validate ¶

func (cfg *Config) Validate() error

type ExcludeConfig ¶

type ExcludeConfig struct {
	Pods []ExcludePodConfig `mapstructure:"pods"`
}

ExcludeConfig represents a list of Pods to exclude.

type ExcludePodConfig ¶

type ExcludePodConfig struct {
	Name string `mapstructure:"name"`
}

ExcludePodConfig represents a Pod name to ignore.

type ExtractConfig ¶

type ExtractConfig struct {
	// Metadata allows to extract pod/namespace metadata from a list of metadata fields.
	// The field accepts a list of strings.
	//
	// Metadata fields supported right now are,
	//   k8s.pod.name, k8s.pod.uid, k8s.deployment.name,
	//   k8s.node.name, k8s.namespace.name, k8s.pod.start_time,
	//   k8s.replicaset.name, k8s.replicaset.uid,
	//   k8s.daemonset.name, k8s.daemonset.uid,
	//   k8s.job.name, k8s.job.uid, k8s.cronjob.name,
	//   k8s.statefulset.name, k8s.statefulset.uid
	//
	// Specifying anything other than these values will result in an error.
	// By default all of the fields are extracted and added to spans and metrics.
	Metadata []string `mapstructure:"metadata"`

	// Annotations allows extracting data from pod annotations and record it
	// as resource attributes.
	// It is a list of FieldExtractConfig type. See FieldExtractConfig
	// documentation for more details.
	Annotations []FieldExtractConfig `mapstructure:"annotations"`

	// Labels allows extracting data from pod labels and record it
	// as resource attributes.
	// It is a list of FieldExtractConfig type. See FieldExtractConfig
	// documentation for more details.
	Labels []FieldExtractConfig `mapstructure:"labels"`
}

ExtractConfig section allows specifying extraction rules to extract data from k8s pod specs.
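The rule that only the listed metadata field names are accepted can be sketched as a simple lookup. The field names below are taken from the comment on `Metadata`; the `validateMetadata` function and its error message are hypothetical and not part of the package API.

```go
package main

import "fmt"

// supportedMetadata holds the field names listed in the ExtractConfig.Metadata comment.
var supportedMetadata = map[string]bool{
	"k8s.pod.name":         true,
	"k8s.pod.uid":          true,
	"k8s.deployment.name":  true,
	"k8s.node.name":        true,
	"k8s.namespace.name":   true,
	"k8s.pod.start_time":   true,
	"k8s.replicaset.name":  true,
	"k8s.replicaset.uid":   true,
	"k8s.daemonset.name":   true,
	"k8s.daemonset.uid":    true,
	"k8s.job.name":         true,
	"k8s.job.uid":          true,
	"k8s.cronjob.name":     true,
	"k8s.statefulset.name": true,
	"k8s.statefulset.uid":  true,
}

// validateMetadata rejects any field name that is not in the supported set.
// Hypothetical helper; the error text is made up for this sketch.
func validateMetadata(fields []string) error {
	for _, f := range fields {
		if !supportedMetadata[f] {
			return fmt.Errorf("%q is not a supported metadata field", f)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateMetadata([]string{"k8s.pod.name", "k8s.node.name"})) // prints "<nil>"
	fmt.Println(validateMetadata([]string{"k8s.pod.name", "k8s.pod.ip"}))
}
```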

type FieldExtractConfig ¶

type FieldExtractConfig struct {
	TagName string `mapstructure:"tag_name"`
	Key     string `mapstructure:"key"`
	// KeyRegex is a regular expression used to extract a Key that matches the regex.
	// Out of Key or KeyRegex only one option is expected to be configured at a time.
	KeyRegex string `mapstructure:"key_regex"`
	Regex    string `mapstructure:"regex"`
	// From represents the source of the labels/annotations.
	// Allowed values are "pod" and "namespace". The default is pod.
	From string `mapstructure:"from"`
}

FieldExtractConfig allows specifying an extraction rule to extract a value from exactly one field.

The field accepts a list of FieldExtractConfig maps. Each map accepts several keys:

	from, tag_name, key, key_regex and regex

  - tag_name represents the name of the tag that will be added to the span.
    When not specified, a default tag name of the following format is used:
    k8s.pod.annotations.<annotation key>
    k8s.pod.labels.<label key>
    For example, if tag_name is not specified and the key is git_sha,
    then the attribute name will be `k8s.pod.annotations.git_sha`.
    When key_regex is present, tag_name supports back references to both named and positional capture groups.
    For example, if your pod spec contains the following labels,

    app.kubernetes.io/component: mysql
    app.kubernetes.io/version: 5.7.21

    and you'd like to add tags for all labels with the prefix app.kubernetes.io/ and also trim the prefix,
    then you can specify the following extraction rule:

    processors:
      k8sattributes:
        extract:
          labels:
            - tag_name: $1
              key_regex: kubernetes.io/(.*)

    This will add the `component` and `version` tags to the spans or metrics.

  - key represents the annotation name. This must exactly match an annotation name.

  - regex is an optional field used to extract a sub-string from a complex field value.
    The supplied regular expression must contain one named parameter with the string "value" as the name.
    For example, if your pod spec contains the following annotation,

    kubernetes.io/change-cause: 2019-08-28T18:34:33Z APP_NAME=my-app GIT_SHA=58a1e39 CI_BUILD=4120

    and you'd like to extract the GIT_SHA and the CI_BUILD values as tags, then you must specify
    the following two extraction rules:

    processors:
      k8sattributes:
        extract:
          annotations:
            - tag_name: git.sha
              key: kubernetes.io/change-cause
              regex: GIT_SHA=(?P<value>\w+)
            - tag_name: ci.build
              key: kubernetes.io/change-cause
              regex: CI_BUILD=(?P<value>[\w]+)

    This will add the `git.sha` and `ci.build` tags to the spans or metrics.
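The key_regex back reference described above can be tried out with Go's standard `regexp` package. This is a sketch under assumptions: `tagsFromKeyRegex` is a hypothetical helper, and expanding `tag_name` with `Regexp.ExpandString` is one plausible way to get the described behavior, not necessarily how the processor does it.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagsFromKeyRegex returns one tag per label whose key matches keyRegex.
// tagName may reference capture groups of keyRegex, for example "$1".
// Hypothetical helper written for this illustration.
func tagsFromKeyRegex(labels map[string]string, keyRegex, tagName string) (map[string]string, error) {
	re, err := regexp.Compile(keyRegex)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for k, v := range labels {
		m := re.FindStringSubmatchIndex(k)
		if m == nil {
			continue // the label key does not match key_regex
		}
		// Expand back references such as $1 against the matched key.
		name := re.ExpandString(nil, tagName, k, m)
		out[string(name)] = v
	}
	return out, nil
}

func main() {
	labels := map[string]string{
		"app.kubernetes.io/component": "mysql",
		"app.kubernetes.io/version":   "5.7.21",
		"team":                        "storage",
	}
	tags, err := tagsFromKeyRegex(labels, `kubernetes.io/(.*)`, "$1")
	if err != nil {
		panic(err)
	}
	fmt.Println(tags) // prints "map[component:mysql version:5.7.21]"
}
```

Note that the unanchored pattern also matches any other label key containing `kubernetes.io/`; anchoring the expression is a reasonable precaution if only one prefix is intended.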

type FieldFilterConfig ¶

type FieldFilterConfig struct {
	// Key represents the key or name of the field or labels that a filter
	// can apply on.
	Key string `mapstructure:"key"`

	// Value represents the value associated with the key that a filter
	// operation specified by the `Op` field applies on.
	Value string `mapstructure:"value"`

	// Op represents the filter operation to apply on the given
	// Key: Value pair. The following operations are supported
	//   equals, not-equals, exists, does-not-exist.
	Op string `mapstructure:"op"`
}

FieldFilterConfig allows specifying exactly one filter by a field. It can be used to represent a label or generic field filter.
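As an illustration of the four operations named in the `Op` comment, the sketch below evaluates a filter against a pod's label map. The `fieldFilter` type and `matches` function are hypothetical, and two details are assumptions of this example: an unknown or empty `Op` is rejected, and `not-equals` also matches pods that lack the key.

```go
package main

import "fmt"

// fieldFilter mirrors the three fields of FieldFilterConfig (hypothetical type).
type fieldFilter struct {
	Key, Value, Op string
}

// matches reports whether a pod's labels satisfy the filter.
// Assumptions: unknown ops are errors, and not-equals matches absent keys.
func matches(labels map[string]string, f fieldFilter) (bool, error) {
	v, present := labels[f.Key]
	switch f.Op {
	case "equals":
		return present && v == f.Value, nil
	case "not-equals":
		return !present || v != f.Value, nil
	case "exists":
		return present, nil
	case "does-not-exist":
		return !present, nil
	default:
		return false, fmt.Errorf("unsupported op %q", f.Op)
	}
}

func main() {
	labels := map[string]string{"app": "db"}

	ok, _ := matches(labels, fieldFilter{Key: "app", Value: "db", Op: "equals"})
	fmt.Println(ok) // prints "true"

	ok, _ = matches(labels, fieldFilter{Key: "tier", Op: "exists"})
	fmt.Println(ok) // prints "false"
}
```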

type FilterConfig ¶

type FilterConfig struct {
	// Node represents a k8s node or host. If specified, any pods not running
	// on the specified node will be ignored by the tagger.
	Node string `mapstructure:"node"`

	// NodeFromEnvVar can be used to extract the node name from an environment
	// variable. The value must be the name of the environment variable.
	// This is useful when the node an Otel agent will run on cannot be
	// predicted. In such cases, the Kubernetes downward API can be used to
	// add the node name to each pod as an environment variable. K8s tagger
	// can then read this value and filter pods by it.
	//
	// For example, node name can be passed to each agent with the downward API as follows
	//
	// env:
	//   - name: K8S_NODE_NAME
	//     valueFrom:
	//       fieldRef:
	//         fieldPath: spec.nodeName
	//
	// Then the NodeFromEnvVar field can be set to `K8S_NODE_NAME` to filter all pods by the node that
	// the agent is running on.
	//
	// More on downward API here: https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/
	NodeFromEnvVar string `mapstructure:"node_from_env_var"`

	// Namespace filters all pods by the provided namespace. All other pods are ignored.
	Namespace string `mapstructure:"namespace"`

	// Fields allows to filter pods by generic k8s fields.
	// Only the following operations are supported:
	//    - equals
	//    - not-equals
	//
	// Check FieldFilterConfig for more details.
	Fields []FieldFilterConfig `mapstructure:"fields"`

	// Labels allows to filter pods by generic k8s pod labels.
	// Only the following operations are supported:
	//    - equals
	//    - not-equals
	//    - exists
	//    - does-not-exist
	//
	// Check FieldFilterConfig for more details.
	Labels []FieldFilterConfig `mapstructure:"labels"`
}

FilterConfig section allows specifying filters to filter pods by labels, fields, namespaces, nodes, etc.

type PodAssociationConfig ¶

type PodAssociationConfig struct {
	// Deprecated: Sources should be used to provide From and Name.
	// If this is set, From and Name are going to be used as Sources' ones
	// From represents the source of the association.
	// Allowed values are "connection" and "resource_attribute".
	From string `mapstructure:"from"`

	// Deprecated: Sources should be used to provide From and Name.
	// If this is set, From and Name are going to be used as Sources' ones
	// Name represents extracted key name.
	// e.g. ip, pod_uid, k8s.pod.ip
	Name string `mapstructure:"name"`

	// List of pod association sources which should be taken
	// to identify pod
	Sources []PodAssociationSourceConfig `mapstructure:"sources"`
}

PodAssociationConfig contains a single rule describing how to associate Pod metadata with logs, spans and metrics.
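The relationship between the deprecated `From`/`Name` pair and `Sources` can be sketched as a normalization step. This is a guess at the intent of the field comments, not the package's code: the types and the `normalize` function are hypothetical, and the rule that a set `From` takes the place of `Sources` is an assumption.

```go
package main

import "fmt"

// sourceConfig mirrors PodAssociationSourceConfig (hypothetical type).
type sourceConfig struct {
	From, Name string
}

// associationConfig mirrors PodAssociationConfig (hypothetical type).
type associationConfig struct {
	From    string // deprecated
	Name    string // deprecated
	Sources []sourceConfig
}

// normalize returns the effective list of sources for one association.
// Assumption: when the deprecated From is set, it is used as the only source.
func normalize(a associationConfig) []sourceConfig {
	if a.From != "" {
		name := a.Name
		if a.From == "connection" {
			name = "" // a connection source carries no attribute name
		}
		return []sourceConfig{{From: a.From, Name: name}}
	}
	return a.Sources
}

func main() {
	legacy := associationConfig{From: "resource_attribute", Name: "k8s.pod.ip"}
	fmt.Println(normalize(legacy)) // prints "[{resource_attribute k8s.pod.ip}]"

	current := associationConfig{Sources: []sourceConfig{
		{From: "resource_attribute", Name: "k8s.pod.name"},
		{From: "resource_attribute", Name: "k8s.namespace.name"},
	}}
	fmt.Println(len(normalize(current))) // prints "2"
}
```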

type PodAssociationSourceConfig ¶ added in v0.55.0

type PodAssociationSourceConfig struct {
	// From represents the source of the association.
	// Allowed values are "connection" and "resource_attribute".
	From string `mapstructure:"from"`

	// Name represents extracted key name.
	// e.g. ip, pod_uid, k8s.pod.ip
	Name string `mapstructure:"name"`
}

Directories ¶

Path	Synopsis
internal
kube
observability