metrics

package
v2.1.0-beta.3 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 24, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Constants

This section is empty.

Variables

var (

	// ControllerPanic is a counter to record the number of panics in the controller.
	ControllerPanic = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Namespace: "tidb_operator",
			Subsystem: "controller",
			Name:      "panic_total",
			Help:      "The total number of panics in the controller",
		}, []string{},
	)

	// AbnormalInstance is 1 when the named condition on the instance is False
	// (abnormal), 0 otherwise. The series stays present while the operator
	// manages the instance and is removed only when the instance is finalized.
	//
	// Use `metric == 1` together with PromQL `for: <duration>` to alert on
	// instances stuck in an abnormal state, e.g. a rolling restart that cannot
	// converge or a pod that is up but cannot serve.
	AbnormalInstance = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Namespace: "tidb_operator",
			Name:      "abnormal_instance",
			Help: "1 when the named condition on the instance is False, 0 otherwise. " +
				"Use `metric == 1` with PromQL `for: <duration>` to alert on stuck state.",
		}, InstanceAbnormalMetricLabels,
	)
)
var InstanceAbnormalMetricLabels = []string{"namespace", "cluster", "component", "group", "instance", "condition"}

InstanceAbnormalMetricLabels is the canonical label order for the per-instance abnormal-condition gauge. Keep in sync with WithLabelValues / DeleteLabelValues callers.
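
Because label values are passed positionally, a caller that swaps two values silently writes a different series. The following is a minimal sketch of one way to guard against that, using only the standard library. The `labelValues` helper and the sample values are hypothetical and are not part of this package; only the label order is taken from the declaration above.

```go
package main

import "fmt"

// instanceAbnormalMetricLabels mirrors the label order documented above.
var instanceAbnormalMetricLabels = []string{
	"namespace", "cluster", "component", "group", "instance", "condition",
}

// labelValues is a hypothetical helper (not part of the package). It turns a
// name->value map into a positional slice in the canonical label order, and
// fails if a label is missing or an unexpected one is present.
func labelValues(byName map[string]string) ([]string, error) {
	if len(byName) != len(instanceAbnormalMetricLabels) {
		return nil, fmt.Errorf("expected %d labels, got %d",
			len(instanceAbnormalMetricLabels), len(byName))
	}
	values := make([]string, 0, len(instanceAbnormalMetricLabels))
	for _, name := range instanceAbnormalMetricLabels {
		v, ok := byName[name]
		if !ok {
			return nil, fmt.Errorf("missing label %q", name)
		}
		values = append(values, v)
	}
	return values, nil
}

func main() {
	// Sample values are made up for illustration.
	values, err := labelValues(map[string]string{
		"namespace": "db", "cluster": "basic", "component": "pd",
		"group": "pd-a", "instance": "foo", "condition": "Ready",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(values) // [db basic pd pd-a foo Ready]
}
```

The resulting slice could then be spread into a positional call such as WithLabelValues or DeleteLabelValues, so the order is defined in exactly one place.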

Functions

func ClearInstanceConditionMetrics

func ClearInstanceConditionMetrics(obj client.Object)

ClearInstanceConditionMetrics removes every tracked-condition series for the given instance.

It is called from TaskInstanceFinalizerDel after the finalizer is removed, so every component that uses the standard finalize task is covered without per-builder wiring. Component builders short-circuit the deletion path with task.IfBreak around CondClusterIsDeleting / CondObjectIsDeleting, so the normal TaskInstanceConditionSynced / TaskInstanceConditionReady tasks (where ObserveCondition lives) never run during finalization. Without this explicit cleanup, the gauge series would stay at its last value indefinitely. A series stuck at 1 would trigger false-positive `metric == 1` with `for: <duration>` alerts on an instance that no longer exists, and stale series would add label cardinality with every cluster lifecycle.

func ClearInstanceConditionMetricsByKey

func ClearInstanceConditionMetricsByKey(namespace, component, instance string)

ClearInstanceConditionMetricsByKey removes every tracked-condition series matching (namespace, component, instance), regardless of the cluster and group labels. Use it from reconcile paths where the object has already been deleted from the API server and its business labels are no longer readable. Passing component, which is constant for a given reconcile scope, avoids sweeping up a different kind of instance that happens to share the same namespace and name (e.g. a PD and a TiDB both called "foo"). Because the match is partial, it also covers series written under an earlier cluster or group label value, should those labels ever have changed.
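
The partial match described above can be illustrated with an in-memory model. This is only a sketch: `seriesKey`, `clearByKey`, and the sample data are hypothetical stand-ins, and the real function operates on the Prometheus gauge vector (plausibly through the client library's partial-match deletion, though this page does not show that).

```go
package main

import "fmt"

// seriesKey is a stand-in for the label set of one gauge series.
type seriesKey struct {
	Namespace, Cluster, Component, Group, Instance, Condition string
}

// clearByKey deletes every series whose namespace, component, and instance
// match, ignoring the cluster, group, and condition labels. It returns the
// number of series removed.
func clearByKey(series map[seriesKey]float64, namespace, component, instance string) int {
	deleted := 0
	for k := range series {
		if k.Namespace == namespace && k.Component == component && k.Instance == instance {
			delete(series, k) // deleting during range is safe in Go
			deleted++
		}
	}
	return deleted
}

func main() {
	// A PD and a TiDB instance share the name "foo"; one PD series was
	// written under a different group label value.
	series := map[seriesKey]float64{
		{"db", "basic", "pd", "pd-a", "foo", "Ready"}:     1,
		{"db", "basic", "pd", "pd-b", "foo", "Synced"}:    0,
		{"db", "basic", "tidb", "tidb-a", "foo", "Ready"}: 0,
	}
	deleted := clearByKey(series, "db", "pd", "foo")
	fmt.Println(deleted, len(series)) // 2 1
}
```

Both PD series are removed even though their group labels differ, while the TiDB series with the same namespace and name is left in place.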

func ObserveCondition

func ObserveCondition(obj client.Object, conds []metav1.Condition, condType string)

ObserveCondition writes 1 to the abnormal-instance gauge when the named condition is False, and 0 otherwise (a True or absent condition is treated as healthy). The series stays present so that alerting rules with a `for:` clause can fire reliably without gaps, and so that dashboards never see missing samples for managed instances.

condType must be one of trackedConditions so the finalize-time cleanup in ClearInstanceConditionMetrics covers the same set of series this writes.
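
The mapping from condition status to gauge value can be sketched as follows. This is a simplified model and not the package's implementation: `condition` is a stand-in for metav1.Condition, `abnormalValue` is a hypothetical name, and treating an Unknown status as 0 is an assumption based on the "0 otherwise" wording above.

```go
package main

import "fmt"

// condition is a simplified stand-in for metav1.Condition.
type condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// abnormalValue returns 1 only when the named condition is present with
// status "False". A "True" or absent condition yields 0. "Unknown" also
// yields 0 here, which is an assumption read from "0 otherwise".
func abnormalValue(conds []condition, condType string) float64 {
	for _, c := range conds {
		if c.Type == condType && c.Status == "False" {
			return 1
		}
	}
	return 0
}

func main() {
	conds := []condition{
		{Type: "Ready", Status: "False"},
		{Type: "Synced", Status: "True"},
	}
	fmt.Println(
		abnormalValue(conds, "Ready"),     // False
		abnormalValue(conds, "Synced"),    // True
		abnormalValue(conds, "Suspended"), // absent
	) // 1 0 0
}
```

Writing 0 rather than deleting the series for a healthy instance is what keeps the time series continuous between state changes.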

func ObserveConditions

func ObserveConditions(obj client.Object, conds []metav1.Condition)

ObserveConditions records the gauge for every condition type tracked by this package. It is the convenience entry point for reconcile tasks that want to refresh the full picture in one call.
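
As a rough illustration of the "one call refreshes everything" behavior, the sketch below computes one value per tracked condition type. The `trackedConditions` contents ("Ready" and "Synced") are an assumption; the package's actual list is not shown on this page. `condition` and `observeAll` are likewise hypothetical stand-ins.

```go
package main

import "fmt"

// condition is a simplified stand-in for metav1.Condition.
type condition struct {
	Type   string
	Status string
}

// trackedConditions is a hypothetical tracked set; the real list lives in
// the package and is not documented here.
var trackedConditions = []string{"Ready", "Synced"}

// observeAll returns one gauge value per tracked condition type: 1 when the
// condition is present with status "False", 0 otherwise.
func observeAll(conds []condition) map[string]float64 {
	values := make(map[string]float64, len(trackedConditions))
	for _, t := range trackedConditions {
		values[t] = 0 // an absent condition still gets a series, at 0
	}
	for _, c := range conds {
		if _, tracked := values[c.Type]; tracked && c.Status == "False" {
			values[c.Type] = 1
		}
	}
	return values
}

func main() {
	values := observeAll([]condition{
		{Type: "Ready", Status: "False"},
		{Type: "Available", Status: "False"}, // untracked, so ignored
	})
	fmt.Println(values["Ready"], values["Synced"], len(values)) // 1 0 2
}
```

Restricting the output to the tracked set keeps the written series aligned with what the finalize-time cleanup removes.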

Types

This section is empty.
