vmagent
vmagent is a tiny but brave agent, which helps you collect metrics from various sources
and store them in VictoriaMetrics
or any other Prometheus-compatible storage system that supports the remote_write protocol.
Motivation
While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast
and RAM friendly to scrape metrics from Prometheus-compatible exporters into VictoriaMetrics.
Also, we found that users’ infrastructures are like snowflakes - no two are alike - so we decided to add more flexibility
to vmagent (like the ability to push metrics instead of pulling them). We did our best and plan to do even more.
Features
- Can be used as a drop-in replacement for Prometheus for scraping targets such as node_exporter. See Quick Start for details.
- Can add, remove and modify labels (aka tags) via Prometheus relabeling. Can filter data before sending it to remote storage. See these docs for details.
- Accepts data via all the ingestion protocols supported by VictoriaMetrics:
  - Influx line protocol via http://<vmagent>:8429/write. See these docs.
  - Graphite plaintext protocol if -graphiteListenAddr command-line flag is set. See these docs.
  - OpenTSDB telnet and http protocols if -opentsdbListenAddr command-line flag is set. See these docs.
  - Prometheus remote write protocol via http://<vmagent>:8429/api/v1/write.
  - JSON lines import protocol via http://<vmagent>:8429/api/v1/import. See these docs.
  - Native data import protocol via http://<vmagent>:8429/api/v1/import/native. See these docs.
  - Data in Prometheus exposition format. See these docs for details.
  - Arbitrary CSV data via http://<vmagent>:8429/api/v1/import/csv. See these docs.
- Can replicate collected metrics simultaneously to multiple remote storage systems.
- Works in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics are buffered at -remoteWrite.tmpDataPath. The buffered metrics are sent to remote storage as soon as the connection to remote storage is recovered. The maximum disk usage for the buffer can be limited with -remoteWrite.maxDiskUsagePerURL.
- Uses lower amounts of RAM, CPU, disk IO and network bandwidth compared to Prometheus.
Quick Start
Download the vmutils-* archive from the releases page, unpack it
and pass the following flags to the vmagent binary in order to start scraping Prometheus targets:
- -promscrape.config with the path to Prometheus config file (it is usually located at /etc/prometheus/prometheus.yml)
- -remoteWrite.url with the remote storage endpoint such as VictoriaMetrics. The -remoteWrite.url argument can be specified multiple times in order to replicate data concurrently to an arbitrary number of remote storage systems.
Example command line:
/path/to/vmagent -promscrape.config=/path/to/prometheus.yml -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
If you only need to collect Influx data, then the following is sufficient:
/path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write
Then send Influx data to http://vmagent-host:8429. See these docs for more details.
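As a rough illustration of that last step, the following Python sketch formats one sample in Influx line protocol and prepares a POST request to the /write endpoint listed under Features. The measurement, tag and field names are made up for the example, the helper names are hypothetical, and the formatter skips the escaping that real Influx lines need for spaces and commas, so treat it as a sketch rather than a complete client.

```python
import urllib.request


def influx_line(measurement, tags, fields, timestamp_ns=None):
    """Format one sample in Influx line protocol.

    Assumes names and values contain no spaces, commas or equals signs,
    so no escaping is performed, and that all field values are numeric.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


def build_write_request(vmagent_addr, lines):
    """Build (but do not send) a POST request to vmagent's /write endpoint."""
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        f"http://{vmagent_addr}/write", data=body, method="POST"
    )


if __name__ == "__main__":
    # "cpu", "host" and "usage_idle" are illustrative names only.
    line = influx_line("cpu", {"host": "edge-1"}, {"usage_idle": 97.5})
    req = build_write_request("vmagent-host:8429", [line])
    print(line)
    # urllib.request.urlopen(req) would send it; that needs a running vmagent.
```

The request is only constructed here, so the sketch can be run without a vmagent instance.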
vmagent is also available in docker images.
Pass -help to vmagent in order to see the full list of supported command-line flags with their descriptions.
Configuration update
vmagent should be restarted in order to update config options set via command-line args.
vmagent supports multiple approaches for reloading configs from updated config files such as -promscrape.config, -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig:
- Sending SIGHUP signal to vmagent process: kill -SIGHUP `pidof vmagent`
- Sending HTTP request to http://vmagent:8429/-/reload endpoint.
There is also -promscrape.configCheckInterval command-line option, which can be used for automatic reloading configs from updated -promscrape.config file.
Use cases
IoT and Edge monitoring
vmagent can run and collect metrics in IoT and industrial networks with unreliable or scheduled connections to the remote storage.
It buffers the collected data in local files until the connection to remote storage becomes available and then sends the buffered
data to the remote storage. It re-tries sending the data to remote storage on any errors.
The maximum buffer size can be limited with -remoteWrite.maxDiskUsagePerURL.
vmagent works on various architectures from IoT world - 32-bit arm, 64-bit arm, ppc64, 386, amd64.
See the corresponding Makefile rules for details.
Drop-in replacement for Prometheus
If you use Prometheus only for scraping metrics from various targets and forwarding these metrics to remote storage,
then vmagent can replace such a Prometheus setup. Usually vmagent requires lower amounts of RAM, CPU and network bandwidth compared to Prometheus for such a setup.
See these docs for details.
Replication and high availability
vmagent replicates the collected metrics among multiple remote storage instances configured via -remoteWrite.url args.
If a single remote storage instance is temporarily out of service, then the collected data remains available in the other remote storage instances.
vmagent buffers the collected data in files at -remoteWrite.tmpDataPath until the remote storage becomes available again.
Then it sends the buffered data to the remote storage in order to prevent data gaps in the remote storage.
Relabeling and filtering
vmagent can add, remove or update labels on the collected data before sending it to remote storage. Additionally,
it can remove unwanted samples via Prometheus-like relabeling before sending the collected data to remote storage.
See these docs for details.
Splitting data streams among multiple systems
vmagent supports splitting the collected data between multiple destinations with the help of -remoteWrite.urlRelabelConfig,
which is applied independently for each configured -remoteWrite.url destination. For instance, it is possible to replicate or split
data among long-term remote storage, short-term remote storage and real-time analytical system built on top of Kafka.
Note that each destination can receive its own subset of the collected data thanks to per-destination relabeling via -remoteWrite.urlRelabelConfig.
Prometheus remote_write proxy
vmagent may be used as a proxy for Prometheus data sent via Prometheus remote_write protocol. It can accept data via remote_write API
at /api/v1/write endpoint, apply relabeling and filtering and then proxy it to other remote_write systems.
vmagent can be configured to encrypt the incoming remote_write requests with -tls* command-line flags.
Additionally, Basic Auth can be enabled for the incoming remote_write requests with -httpAuth.* command-line flags.
remote_write for clustered version
Although vmagent can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets, writes are always performed in Prometheus remote_write protocol. Therefore, for the clustered version the -remoteWrite.url command-line flag should be configured as <schema>://<vminsert-host>:8480/insert/<customer-id>/prometheus/api/v1/write
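The URL pattern above can be filled in mechanically. The Python sketch below does that for a hypothetical vminsert host and customer id; the function name, the host name and the id 0 are placeholders chosen for the example, not values taken from this document.

```python
def cluster_remote_write_url(vminsert_host, customer_id, schema="http", port=8480):
    """Fill in <schema>://<vminsert-host>:8480/insert/<customer-id>/prometheus/api/v1/write."""
    return (
        f"{schema}://{vminsert_host}:{port}"
        f"/insert/{customer_id}/prometheus/api/v1/write"
    )


if __name__ == "__main__":
    # "vminsert" and customer id 0 are placeholders for this example.
    url = cluster_remote_write_url("vminsert", 0)
    print(f"-remoteWrite.url={url}")
```

The printed string is the flag value to pass to vmagent for that tenant.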
How to collect metrics in Prometheus format
Pass the path to prometheus.yml to -promscrape.config command-line flag. vmagent takes into account the following
sections from Prometheus config file:
- global
- scrape_configs
All the other sections are ignored, including remote_write section.
Use -remoteWrite.* command-line flags instead for configuring remote write settings.
The following scrape types in scrape_config section are supported:
- static_configs - for scraping statically defined targets. See these docs for details.
- file_sd_configs - for scraping targets defined in external files aka file-based service discovery. See these docs for details.
- kubernetes_sd_configs - for scraping targets in Kubernetes (k8s). See kubernetes_sd_config for details.
- ec2_sd_configs - for scraping targets in Amazon EC2. See ec2_sd_config for details. vmagent doesn't support profile config param and aws credentials file yet.
- gce_sd_configs - for scraping targets in Google Compute Engine (GCE). See gce_sd_config for details. vmagent provides the following additional functionality for gce_sd_config:
  - if project arg is missing, then vmagent uses the project for the instance where it runs;
  - if zone arg is missing, then vmagent uses the zone for the instance where it runs;
  - if zone arg equals to "*", then vmagent discovers all the zones for the given project;
  - zone may contain arbitrary number of zones, i.e. zone: [us-east1-a, us-east1-b].
- consul_sd_configs - for scraping targets registered in Consul. See consul_sd_config for details.
- dns_sd_configs - for scraping targets discovered from DNS records (SRV, A and AAAA). See dns_sd_config for details.
- openstack_sd_configs - for scraping OpenStack targets. See openstack_sd_config for details. Only OpenStack identity API v3 is supported.
- dockerswarm_sd_configs - for scraping Docker Swarm targets. See dockerswarm_sd_config for details.
- eureka_sd_configs - for scraping targets registered in Netflix Eureka. See eureka_sd_config for details.
File feature requests at our issue tracker if you need other service discovery mechanisms to be supported by vmagent.
vmagent also supports the following additional options in scrape_config section:
- disable_compression: true - for disabling response compression on a per-job basis. By default vmagent requests compressed responses from scrape targets in order to save network bandwidth.
- disable_keepalive: true - for disabling HTTP keep-alive connections on a per-job basis. By default vmagent uses keep-alive connections to scrape targets in order to reduce overhead on connection re-establishing.
Note that vmagent doesn't support refresh_interval option for these scrape configs. Use the corresponding -promscrape.*CheckInterval
command-line flag instead. For example, -promscrape.consulSDCheckInterval=60s sets refresh_interval for all the consul_sd_configs
entries to 60s. Run vmagent -help in order to see default values for -promscrape.*CheckInterval flags.
The file pointed to by -promscrape.config may contain %{ENV_VAR} placeholders, which are substituted by the corresponding ENV_VAR environment variable values.
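To make the placeholder syntax concrete, here is a small Python sketch of this kind of substitution. It is not vmagent's implementation: the function name is hypothetical, and leaving unknown variables untouched is an assumption, since this document does not say how unset variables are handled.

```python
import os
import re

# Matches %{NAME} and captures NAME.
_PLACEHOLDER = re.compile(r"%\{([^}]+)\}")


def expand_env_placeholders(text, env=None):
    """Replace each %{NAME} in text with env[NAME].

    Unknown names are left as-is; that behavior is an assumption of
    this sketch, not documented vmagent behavior.
    """
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), text)


if __name__ == "__main__":
    # NODE_HOST is an illustrative variable name.
    snippet = "targets: ['%{NODE_HOST}:9100']"
    print(expand_env_placeholders(snippet, {"NODE_HOST": "node1"}))
```

Passing an explicit mapping instead of the process environment keeps the example reproducible.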
Adding labels to metrics
Labels can be added to metrics via the following mechanisms:
- Via global -> external_labels section in -promscrape.config file. These labels are added only to metrics scraped from targets configured in -promscrape.config file.
- Via -remoteWrite.label command-line flag. These labels are added to all the collected metrics before sending them to -remoteWrite.url.
Relabeling
vmagent supports Prometheus relabeling.
Additionally it provides the following extra actions:
- replace_all: replaces all the occurrences of regex in the values of source_labels with the replacement and stores the result in the target_label.
- labelmap_all: replaces all the occurrences of regex in all the label names with the replacement.
- keep_if_equal: keeps the entry if all label values from source_labels are equal.
- drop_if_equal: drops the entry if all the label values from source_labels are equal.
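The semantics of three of these actions can be sketched in a few lines of Python. This is an illustration of the descriptions above, not vmagent's code. It assumes that missing labels count as empty strings, that source label values are joined with ";" as in Prometheus relabeling, and that Python's regex syntax is close enough for the example; all function signatures are hypothetical.

```python
import re


def keep_if_equal(labels, source_labels):
    """Return True (keep the entry) if all source label values are equal."""
    values = [labels.get(name, "") for name in source_labels]
    return len(set(values)) <= 1


def drop_if_equal(labels, source_labels):
    """Return True (drop the entry) if all source label values are equal."""
    return keep_if_equal(labels, source_labels)


def replace_all(labels, source_labels, regex, replacement, target_label, separator=";"):
    """Replace every regex match in the joined source values.

    The result is stored in target_label of a copy of labels.
    """
    joined = separator.join(labels.get(name, "") for name in source_labels)
    result = dict(labels)
    result[target_label] = re.sub(regex, replacement, joined)
    return result


if __name__ == "__main__":
    # Label names and values below are made up for the example.
    labels = {"port_annotation": "8080", "container_port": "8080", "instance": "foo-bar-baz"}
    print(keep_if_equal(labels, ["port_annotation", "container_port"]))
    print(replace_all(labels, ["instance"], "-", "_", "instance")["instance"])
```

The difference from the standard replace action is that every match is rewritten rather than the value being matched as a whole.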
The relabeling can be defined in the following places:
- At scrape_config -> relabel_configs section in -promscrape.config file. This relabeling is applied to target labels.
- At scrape_config -> metric_relabel_configs section in -promscrape.config file. This relabeling is applied to all the scraped metrics in the given scrape_config.
- At -remoteWrite.relabelConfig file. This relabeling is applied to all the collected metrics before sending them to remote storage.
- At -remoteWrite.urlRelabelConfig files. This relabeling is applied to metrics before sending them to the corresponding -remoteWrite.url.
Read more about relabeling in the following articles:
- How to use Relabeling in Prometheus and VictoriaMetrics
- Life of a label
- Discarding targets and timeseries with relabeling
- Dropping labels at scrape time
- Extracting labels from legacy metric names
- relabel_configs vs metric_relabel_configs
Monitoring
vmagent exports various metrics in Prometheus exposition format at http://vmagent-host:8429/metrics page. It is recommended to set up regular scraping of this page
either via vmagent itself or via Prometheus, so the exported metrics can be analyzed later.
Use official Grafana dashboard for vmagent state overview.
If you have suggestions or improvements, or have found a bug, feel free to open an issue on GitHub or add a review to the dashboard.
vmagent also exports target statuses at the following handlers:
- http://vmagent-host:8429/targets. This handler returns human-readable plaintext status for every active target. This page is convenient to query from command line with wget, curl or similar tools. It accepts optional show_original_labels=1 query arg, which shows the original labels per each target before applying relabeling. This information may be useful for debugging target relabeling.
- http://vmagent-host:8429/api/v1/targets. This handler returns data compatible with the corresponding page from Prometheus API.
- http://vmagent-host:8429/ready. This handler returns http 200 status code when vmagent finishes initialization for all service_discovery configs. It may be useful for performing vmagent rolling update without scrape loss.
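For a rolling update, the /ready handler can be polled until it answers 200. The Python sketch below shows one way to do that. The function names, retry count and delay are arbitrary choices for the example, and because this document does not say what vmagent returns before it is ready, any response other than 200 is simply treated as "not ready yet".

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(url, attempts=30, delay=1.0, get_status=http_status, sleep=time.sleep):
    """Poll url until it answers 200; return True on success, False on give-up."""
    for attempt in range(attempts):
        if get_status(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Requires a reachable vmagent; the host name is a placeholder.
    print(wait_until_ready("http://vmagent-host:8429/ready", attempts=3))
```

The status getter and the sleep function are parameters so the loop can be exercised without a network.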
Troubleshooting
- It is recommended to set up the official Grafana dashboard in order to monitor vmagent state.
- It is recommended to increase the maximum number of open files in the system (ulimit -n) when scraping a big number of targets, since vmagent establishes at least a single TCP connection per each target.
- When vmagent scrapes many unreliable targets, it can flood the error log with scrape errors. These errors can be suppressed by passing -promscrape.suppressScrapeErrors command-line flag to vmagent. The most recent scrape error per each target can be observed at http://vmagent-host:8429/targets and http://vmagent-host:8429/api/v1/targets.
- The /api/v1/targets page could be useful for debugging the relabeling process for scrape targets. This page contains original labels for targets dropped during relabeling (see "droppedTargets" section in the page output). By default up to -promscrape.maxDroppedTargets targets are shown here. If your setup drops more targets during relabeling, then increase -promscrape.maxDroppedTargets command-line flag value in order to see all the dropped targets. Note that tracking each dropped target requires up to 10Kb of RAM, so big values for -promscrape.maxDroppedTargets may result in increased memory usage if a big number of scrape targets are dropped during relabeling.
- If vmagent scrapes a big number of targets, then -promscrape.dropOriginalLabels command-line option may be passed to vmagent in order to reduce memory usage. This option drops "discoveredLabels" and "droppedTargets" lists at /api/v1/targets page, which may result in reduced debuggability for improperly configured per-target relabeling.
- If vmagent scrapes targets with millions of metrics per each target (for instance, when scraping federation endpoints), then it is recommended to enable stream parsing mode in order to reduce memory usage during scraping. This mode may be enabled either globally for all the scrape targets by passing -promscrape.streamParse command-line flag or on a per-scrape target basis with stream_parse: true option. For example:

    scrape_configs:
    - job_name: 'big-federate'
      stream_parse: true
      static_configs:
      - targets:
        - big-prometeus1
        - big-prometeus2
      honor_labels: true
      metrics_path: /federate
      params:
        'match[]': ['{__name__!=""}']

  Note that sample_limit option doesn't work if stream parsing is enabled, since the parsed data is pushed to remote storage as soon as it is parsed, so sample_limit makes no sense during stream parsing.
- It is recommended to increase -remoteWrite.queues if vmagent_remotewrite_pending_data_bytes metric exported at http://vmagent-host:8429/metrics page constantly grows.
- If you see gaps in the data pushed by vmagent to remote storage when -remoteWrite.maxDiskUsagePerURL is set, then try increasing -remoteWrite.queues. Such gaps may appear because vmagent cannot keep up with sending the collected data to remote storage, so it starts dropping the buffered data if the on-disk buffer size exceeds -remoteWrite.maxDiskUsagePerURL.
- vmagent buffers scraped data at -remoteWrite.tmpDataPath directory until it is sent to -remoteWrite.url. The directory can grow large when remote storage is unavailable for extended periods of time and if -remoteWrite.maxDiskUsagePerURL isn't set. If you don't want to send all the data from the directory to remote storage, simply stop vmagent and delete the directory.
- By default vmagent masks -remoteWrite.url with secret-url values in logs and at /metrics page because the url may contain sensitive information such as auth tokens or passwords. Pass -remoteWrite.showURL command-line flag when starting vmagent in order to see all the valid urls.
- If you see "skipping duplicate scrape target with identical labels" errors when scraping Kubernetes pods, then it is likely these pods listen on multiple ports or use an init container. These errors can be either fixed or suppressed with -promscrape.suppressDuplicateScrapeTargetErrors command-line flag. See available options below if you prefer fixing the root cause of the error:

  The following relabel_configs section may help determining __meta_* labels resulting in duplicate targets:

    - action: labelmap
      regex: __meta_(.*)

  The following relabeling rule may be added to relabel_configs section in order to filter out pods with unneeded ports:

    - action: keep_if_equal
      source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]

  The following relabeling rule may be added to relabel_configs section in order to filter out init container pods:

    - action: drop
      source_labels: [__meta_kubernetes_pod_container_init]
      regex: true
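When debugging relabeling with the /api/v1/targets page mentioned in this list, it can help to pull the original labels of dropped targets out of the JSON response. The Python sketch below assumes the response follows the Prometheus targets API layout (a top-level "data" object holding a "droppedTargets" list whose items carry "discoveredLabels"); the function name and the sample payload are made up for the example.

```python
import json


def dropped_target_labels(response_text):
    """Return the discoveredLabels dict of every dropped target in the response."""
    payload = json.loads(response_text)
    dropped = payload.get("data", {}).get("droppedTargets", [])
    return [target.get("discoveredLabels", {}) for target in dropped]


if __name__ == "__main__":
    # Hand-written sample in the assumed layout, not real vmagent output.
    sample = json.dumps({
        "status": "success",
        "data": {
            "activeTargets": [],
            "droppedTargets": [
                {"discoveredLabels": {"__address__": "10.0.0.1:9100", "job": "node"}}
            ],
        },
    })
    for labels in dropped_target_labels(sample):
        print(labels)
```

If -promscrape.dropOriginalLabels is set, the lists are dropped from the page, so the function would return an empty list in that case.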
How to build from sources
It is recommended to use binary releases - vmagent is located in vmutils-* archives there.
Development build
- Install Go. The minimum supported version is Go 1.13.
- Run make vmagent from the root folder of the repository. It builds vmagent binary and puts it into the bin folder.
Production build
- Install docker.
- Run make vmagent-prod from the root folder of the repository. It builds vmagent-prod binary and puts it into the bin folder.
Building docker images
Run make package-vmagent. It builds victoriametrics/vmagent:<PKG_TAG> docker image locally.
<PKG_TAG> is auto-generated image tag, which depends on source code in the repository.
The <PKG_TAG> may be manually set via PKG_TAG=foobar make package-vmagent.
The base docker image is alpine but it is possible to use any other base image
by setting it via <ROOT_IMAGE> environment variable. For example, the following command builds the image on top of scratch image:
ROOT_IMAGE=scratch make package-vmagent
ARM build
ARM build may run on Raspberry Pi or on energy-efficient ARM servers.
Development ARM build
- Install Go. The minimum supported version is Go 1.13.
- Run make vmagent-arm or make vmagent-arm64 from the root folder of the repository. It builds vmagent-arm or vmagent-arm64 binary respectively and puts it into the bin folder.
Production ARM build
- Install docker.
- Run make vmagent-arm-prod or make vmagent-arm64-prod from the root folder of the repository. It builds vmagent-arm-prod or vmagent-arm64-prod binary respectively and puts it into the bin folder.
Profiling
vmagent provides handlers for collecting the following Go profiles:
- Memory profile. It can be collected with the following command:
curl -s http://<vmagent-host>:8429/debug/pprof/heap > mem.pprof
- CPU profile. It can be collected with the following command:
curl -s http://<vmagent-host>:8429/debug/pprof/profile > cpu.pprof
The command for collecting CPU profile waits for 30 seconds before returning.
The collected profiles may be analyzed with go tool pprof.