README
¶
SQL-based Metrics Provider 0.3.15
Purpose
The application is designed to create metrics based on SQL queries against a SQL database. The metrics are exposed in Prometheus format. Supported databases are PostgreSQL, MySQL, MS SQL Server and Oracle DB.
Queries for metrics with and without labels are supported. If queries sometimes don't produce a result set, the handling can be controlled by different strategies.
You can find an example application in the example folder.
Check the example/README.md for details.
Details
Structure of the application
The Metrics Provider uses the Go SQL implementation for Oracle from https://github.com/sijms/go-ora, for MySQL from https://pkg.go.dev/github.com/go-sql-driver/mysql, for MS SQL Server from https://github.com/microsoft/go-mssqldb, and for PostgresSQL from https://pkg.go.dev/github.com/jackc/pgx as well as the Go Prometheus Client from https://github.com/prometheus/client_golang.
You can list available options by calling the application with the --help parameters.
To configure the connection to the database, you can use a config file or environment variables.
If you work with containers, it's easier to use environment variables.
It's values overrule the settings in the configuration file.
You can find details about the configuration in the source code (server.go)
as well as in the configuration file (db_config.yml).
The file contents describes in detail the different specification options for DB connections.
The metrics and the related SQL statements which retrieve the values, are described in another configuration file (config.yml).
You can use the CONFIG environment variable to define an alternative configuration file for the application.
The wait time between the retrievals of the values for the metrics can globally set by the SLEEP environment variable.
The default value is 1 second.
However, ist possible to define an individual intervall per metric.
It's value will be preferred over the global value.
After the reading of the configuration, the initialization of the metrics, and the setup of the database connection, the application will start a background routine to retrieve the data from the database in the defined cycles. An HTTP service is working in the foreground. It's listening by default on port 8080 and provides the data at the endpoint /metrics in the correct format for a Prometheus scraper.
If the environment variable VERBOSE is set with any value during the start of the application, the values and labels of each retrieval of the metrics will be printed into the log on STDOUT.
Configuration of Metrics
The metrics and the related SQL statements are defined in YAML format in a configuration file. Metrics can be defined with or without labels.
Metric without Label
An entry in the configuration file is defined in a structure like this:
- query: orcl_active_redo_log_count
metrics:
- name: orcl_active_redo_log_count
type: gauge
help: Count of the currently active Redo Logs
strategy: drop
sql: SELECT COUNT(*) FROM v$LOG WHERE STATUS='ACTIVE'
sleep: 10
The name in query is used for logging. It must be unique among all definitions. this will help you to quickly identify problems in case of errors. Below metrics, there you list the metrics in exactly the same order as retrieved in the leading columns of the SQL query.
The name of the metrics as transferred to Prometheus is given in the field name. The type of the metric can be defined only as gauge or counter (lower case!) in the type field. The optional strategy defines how to handle metrics if in the result set of a query is no record at all. The default strategy is keep. The least recently read value for the metric is kept until a new value is read from the DB. The strategy zero sets the value of a metric to zero, if no data is available in a query. To delete metrics if no data is found in the database, the strategy drop can be used. The metric will vanish in Prometheus after a while. The metric is recreated once new data is available.
Please note that counter metrics will be incremented by read values. Use gauge metrics to provide absolute values!
In the field help, there you can define a help text for the metric. Put the SQL statement for the query into the sql field. The query must provide exactly as many numeric result columns as the number of metrics for the query. If the quantity does not match, the application generates an error.
The sleep field is optional. It can define a wait time individually for this query. Such a value has precedence over the global sleep time.
Metric with Labels
An entry in the configuration file is defined in a structure like this:
- query: orcl_wait_times
metrics:
- name: orcl_wait_times
type: gauge
help: Wait Times for several DB Events
labels:
- event
sql: >
SELECT NVL(time_waited, 0),
en.name
FROM v$system_event se,
v$event_name en
WHERE se.event(+) = en.name
AND en.name IN ('free buffer waits', 'global enqueue expand wait', 'latch free',
'log file switch completion', 'log file sync', 'log file parallel write',
'db file sequential read', 'db file scattered read', 'db file single write',
'db file parallel write', 'direct path read', 'direct path write')
sleep: 10
The fields query, metrics, name, type, help and sleep are filled similarly to metrics without labels.
Labels are listed as array below the labels field. Use an array even if you have only one label!
Put the SQL statement into the sql field. It must contain per result record at first as many numeric columns as metrics are defined, and the as many string columns as labels have been defined. If the SQL has another structure, it cannot be handled by the application. Please note that it's possible to format the SQL statement to use multiple lines if a 'greater than' character is given after the sql: field.
Internal Metrics
The Metrics Provider automatically creates internal metrics. They include counters for errors in DB queries as well as summaries for the query durations. The internal metrics will be collected per defined metric, and their name corresponds with the name of the metric or query in the configuration file.
Based on the internal metrics, it's possible to identify problems in the DB access (e.g. failing or slow queries) as well as to create alarms or todos.
Example:
The following internal metrics will be generated for the metric definitions
orcl_active_redo_log_count and orcl_wait_times as listed above:
// error count of the SQL executions in the DB
metrics_provider_metric_retrieval_errors{metric_name="orcl_active_redo_log_count"}
metrics_provider_metric_retrieval_errors{metric_name="orcl_wait_times"}
// SQL query duration in seconds when reading data from the DB
metrics_provider_sql_durations_seconds{query_name="orcl_active_redo_log_count",quantile="0.5"}
metrics_provider_sql_durations_seconds{query_name="orcl_active_redo_log_count",quantile="0.9"}
metrics_provider_sql_durations_seconds{query_name="orcl_active_redo_log_count",quantile="0.99"}
metrics_provider_sql_durations_seconds{query_name="orcl_wait_times",quantile="0.5"}
metrics_provider_sql_durations_seconds{query_name="orcl_wait_times",quantile="0.9"}
metrics_provider_sql_durations_seconds{query_name="orcl_wait_times",quantile="0.99"}
Endpoints
There are two endpoints available, one for the metrics data and one for health checks:
- /metrics
- /health
Development
Use docker-compose for local development to bring up all needed services for testing like this:
cd local_dev_env
docker-compose up -d
At this point three databases (PostgreSQL, MySQL und Oracle) are available. To run the tests, use:
go test && go test gitlab.com/agungl/sqlmetrics/utils
Change the environment variables for the DB connection to switch between the databases.
Documentation
¶
There is no documentation for this package.