monitoring

package
v1.35.3 Latest
Warning

This package is not in the latest version of its module.
Published: Jan 14, 2026 License: BSD-3-Clause Imports: 24 Imported by: 1

Documentation

Constants

const (
	DefaultMetricsNamespace = "weaviate"
)

Variables

var (

	// LatencyBuckets is the default set of histogram buckets for response time (in seconds).
	// It also covers requests that are served *very* fast and *very* slow.
	LatencyBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10, 25, 50, 100}
)
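
To make the bucket boundaries concrete, the following sketch looks up which upper bound a given latency falls under. It uses only the standard library and a local copy of the values above; `bucketFor` is a hypothetical helper, not part of this package. Prometheus histogram buckets are cumulative, so an observation is also counted in every larger bucket, and anything above 100 seconds lands only in the implicit `+Inf` bucket.

```go
package main

import (
	"fmt"
	"sort"
)

// latencyBuckets is a local copy of the LatencyBuckets values (seconds).
var latencyBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10, 25, 50, 100}

// bucketFor returns the smallest upper bound that is >= seconds.
// ok is false when the value exceeds every bound, meaning it would only
// be counted in the implicit +Inf bucket.
func bucketFor(seconds float64) (bound float64, ok bool) {
	// SearchFloat64s returns the smallest index i with latencyBuckets[i] >= seconds.
	i := sort.SearchFloat64s(latencyBuckets, seconds)
	if i == len(latencyBuckets) {
		return 0, false
	}
	return latencyBuckets[i], true
}

func main() {
	for _, s := range []float64{0.003, 0.3, 42, 250} {
		if b, ok := bucketFor(s); ok {
			fmt.Printf("%gs -> le=%g\n", s, b)
		} else {
			fmt.Printf("%gs -> le=+Inf\n", s)
		}
	}
}
```

A 300 ms request, for example, is first counted under the `0.5` bound, while a 250 s request exceeds all listed bounds.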

var NoopRegisterer prometheus.Registerer = noopRegisterer{}

NoopRegisterer is a no-op Prometheus registerer.

Functions

func AddTracingToGRPCOptions added in v1.35.0

func AddTracingToGRPCOptions(options []grpc.ServerOption, logger logrus.FieldLogger) []grpc.ServerOption

AddTracingToGRPCOptions adds tracing interceptors to gRPC server options

func AddTracingToHTTPMiddleware added in v1.35.0

func AddTracingToHTTPMiddleware(next http.Handler, logger logrus.FieldLogger) http.Handler

AddTracingToHTTPMiddleware adds tracing to the existing HTTP middleware chain

func CountingListener added in v1.27.12

func CountingListener(l net.Listener, g prometheus.Gauge) net.Listener
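
CountingListener has no description in this listing. Judging only by its signature and by the `TCPActiveConnections` gauge in `HTTPServerMetrics`, it plausibly tracks open connections; that behavior is an assumption, not something the source states. The sketch below shows the general technique with the standard library only: an atomic counter stands in for `prometheus.Gauge`, and `countingListener`, `countedConn` and `pipeListener` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"sync/atomic"
)

// countingListener wraps a net.Listener and tracks open connections.
// active is a stand-in for the prometheus.Gauge argument (assumption).
type countingListener struct {
	net.Listener
	active *atomic.Int64
}

// Accept increments the counter for every successfully accepted connection.
func (l *countingListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	l.active.Add(1)
	return &countedConn{Conn: conn, active: l.active}, nil
}

// countedConn decrements the counter exactly once, on the first Close.
type countedConn struct {
	net.Conn
	active *atomic.Int64
	once   sync.Once
}

func (c *countedConn) Close() error {
	c.once.Do(func() { c.active.Add(-1) })
	return c.Conn.Close()
}

// pipeListener is an in-memory net.Listener used only for this demo.
type pipeListener struct{ conns chan net.Conn }

func (p *pipeListener) Accept() (net.Conn, error) { return <-p.conns, nil }
func (p *pipeListener) Close() error              { return nil }
func (p *pipeListener) Addr() net.Addr            { return &net.TCPAddr{} }

// demo accepts one connection and reports the counter before and after Close.
func demo() (during, after int64) {
	server, client := net.Pipe()
	defer client.Close()

	base := &pipeListener{conns: make(chan net.Conn, 1)}
	base.conns <- server

	var active atomic.Int64
	l := &countingListener{Listener: base, active: &active}

	conn, _ := l.Accept()
	during = active.Load()
	conn.Close()
	conn.Close() // a second Close must not decrement again
	after = active.Load()
	return during, after
}

func main() {
	during, after := demo()
	fmt.Println(during, after) // prints "1 0"
}
```

Guarding the decrement with `sync.Once` keeps the count from drifting when a connection is closed more than once.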

func EnsureRegisteredMetric added in v1.31.17

func EnsureRegisteredMetric[T prometheus.Collector](reg prometheus.Registerer, metric T) (T, bool, error)

EnsureRegisteredMetric tries to register the given metric with the given registerer. If the metric is already registered, it returns the existing metric.
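
The register-or-reuse pattern described above can be sketched without Prometheus at all. Everything here is a stand-in: `collector`, `registry` and `ensureRegistered` are hypothetical, the toy registry keys collectors by name, and the meaning given to the `bool` result (whether an existing collector was reused) is a choice made for this sketch, since the listing does not document it.

```go
package main

import (
	"errors"
	"fmt"
)

// collector is a hypothetical stand-in for prometheus.Collector.
type collector interface{ Name() string }

// counter is a toy metric identified by its name.
type counter struct {
	name  string
	value int
}

func (c *counter) Name() string { return c.name }

// registry is a toy stand-in for prometheus.Registerer.
type registry struct{ byName map[string]collector }

// ensureRegistered registers metric, or returns the collector already
// registered under the same name. The bool reports whether an existing
// collector was reused (an assumption made for this sketch only).
func ensureRegistered[T collector](reg *registry, metric T) (T, bool, error) {
	existing, found := reg.byName[metric.Name()]
	if !found {
		reg.byName[metric.Name()] = metric
		return metric, false, nil
	}
	typed, ok := existing.(T)
	if !ok {
		var zero T
		return zero, false, errors.New("existing collector has a different type")
	}
	return typed, true, nil
}

func main() {
	reg := &registry{byName: map[string]collector{}}

	first, reused, _ := ensureRegistered(reg, &counter{name: "requests_total"})
	first.value++

	// A second registration under the same name yields the first instance.
	second, reusedAgain, _ := ensureRegistered(reg, &counter{name: "requests_total"})
	fmt.Println(reused, reusedAgain, second == first, second.value) // prints "false true true 1"
}
```

Returning the existing instance lets independent components share one metric instead of failing on a duplicate registration.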

func GRPCStreamTracingInterceptor added in v1.35.0

func GRPCStreamTracingInterceptor() grpc.StreamServerInterceptor

GRPCStreamTracingInterceptor creates a gRPC stream interceptor that adds OpenTelemetry tracing

func GRPCTracingInterceptor added in v1.35.0

func GRPCTracingInterceptor() grpc.UnaryServerInterceptor

GRPCTracingInterceptor creates a gRPC interceptor that adds OpenTelemetry tracing

func HTTPTracingMiddleware added in v1.35.0

func HTTPTracingMiddleware(next http.Handler) http.Handler

HTTPTracingMiddleware creates a middleware that adds OpenTelemetry tracing to HTTP requests

func InitConfig added in v1.19.7

func InitConfig(cfg Config)

func InitCounterVec added in v1.31.17

func InitCounterVec(vec *prometheus.CounterVec, labelNames [][]string)

func InitGaugeVec added in v1.31.17

func InitGaugeVec(vec *prometheus.GaugeVec, labelNames [][]string)

func InstrumentGrpc added in v1.27.10

func InstrumentGrpc(svrMetrics *GRPCServerMetrics) []grpc.ServerOption

InstrumentGrpc accepts server metrics and returns a `[]grpc.ServerOption` slice that you can pass to `grpc.NewServer` so that these metrics are instrumented automatically.

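```
// Create the gRPC server metrics under the default namespace and register them.
svrMetrics := monitoring.NewGRPCServerMetrics(monitoring.DefaultMetricsNamespace, prometheus.DefaultRegisterer)
// InstrumentGrpc takes the *GRPCServerMetrics pointer and returns server options.
grpcServer := grpc.NewServer(monitoring.InstrumentGrpc(svrMetrics)...)

grpcServer.Serve(listener)
```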

func NewTracingTransport added in v1.35.0

func NewTracingTransport(base http.RoundTripper) http.RoundTripper

NewTracingTransport creates an HTTP transport that injects OpenTelemetry trace context
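
As a rough illustration of what injecting trace context into outgoing requests looks like, the sketch below wraps an `http.RoundTripper` and sets a W3C `traceparent` header. It does not use OpenTelemetry: the header value is a fixed sample, and `tracingTransport`, `roundTripFunc` and `demo` are hypothetical names. The package's own transport may work differently.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// roundTripFunc adapts a function to http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// tracingTransport sets a traceparent header before delegating to base.
type tracingTransport struct {
	base        http.RoundTripper
	traceparent string // fixed for the demo; real code derives it from the active span
}

func (t *tracingTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	// A RoundTripper must not modify the caller's request, so work on a clone.
	out := r.Clone(r.Context())
	out.Header.Set("traceparent", t.traceparent)
	return t.base.RoundTrip(out)
}

// demo sends one request through the wrapper. It returns the header value
// seen by the base transport and the header left on the caller's request.
func demo() (seen, original string) {
	base := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		seen = r.Header.Get("traceparent")
		return &http.Response{
			StatusCode: http.StatusOK,
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader("")),
			Request:    r,
		}, nil
	})
	client := &http.Client{Transport: &tracingTransport{
		base:        base,
		traceparent: "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
	}}

	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	resp, err := client.Do(req)
	if err == nil {
		resp.Body.Close()
	}
	return seen, req.Header.Get("traceparent")
}

func main() {
	seen, original := demo()
	fmt.Printf("downstream saw %q, caller's request has %q\n", seen, original)
}
```

Because the wrapper clones the request, the downstream transport sees the header while the caller's original request stays untouched.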

func StreamServerInstrument added in v1.27.10

func StreamServerInstrument(hist *prometheus.HistogramVec) grpc.StreamServerInterceptor

func UnaryServerInstrument added in v1.27.10

func UnaryServerInstrument(hist *prometheus.HistogramVec) grpc.UnaryServerInterceptor

Types

type Config added in v1.24.8

type Config struct {
	Enabled                    bool   `json:"enabled" yaml:"enabled" long:"enabled"`
	Tool                       string `json:"tool" yaml:"tool"`
	Port                       int    `json:"port" yaml:"port" long:"port" default:"8081"`
	Group                      bool   `json:"group_classes" yaml:"group_classes"`
	MonitorCriticalBucketsOnly bool   `json:"monitor_critical_buckets_only" yaml:"monitor_critical_buckets_only"`

	// MetricsNamespace groups the metrics under a common prefix.
	MetricsNamespace string `json:"metrics_namespace" yaml:"metrics_namespace" long:"metrics_namespace" default:""`
}
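
The `json` tags determine the key names a configuration document must use. The sketch below decodes a JSON snippet into a local copy of the struct (named `config` here so the example stays self-contained) using only `encoding/json`. Note that `encoding/json` ignores the `long` and `default` tags, so an omitted `port` decodes to `0` rather than `8081`; which component applies those defaults in practice is not shown in this listing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// config is a local copy of the Config fields and their json tags.
type config struct {
	Enabled                    bool   `json:"enabled"`
	Tool                       string `json:"tool"`
	Port                       int    `json:"port"`
	Group                      bool   `json:"group_classes"`
	MonitorCriticalBucketsOnly bool   `json:"monitor_critical_buckets_only"`
	MetricsNamespace           string `json:"metrics_namespace"`
}

// parse decodes a JSON document into config.
func parse(raw string) (config, error) {
	var cfg config
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parse(`{"enabled": true, "port": 2112, "group_classes": true}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

The `Group` field is the one most easily missed, because its key is `group_classes` rather than `group`.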

type GRPCServerMetrics added in v1.27.12

type GRPCServerMetrics struct {
	RequestDuration  *prometheus.HistogramVec
	RequestBodySize  *prometheus.HistogramVec
	ResponseBodySize *prometheus.HistogramVec
	InflightRequests *prometheus.GaugeVec
}

GRPCServerMetrics exposes a set of Prometheus metrics for gRPC servers.

func NewGRPCServerMetrics added in v1.27.12

func NewGRPCServerMetrics(namespace string, reg prometheus.Registerer) *GRPCServerMetrics

type GrpcStatsHandler added in v1.27.10

type GrpcStatsHandler struct {
	// contains filtered or unexported fields
}

func NewGrpcStatsHandler added in v1.27.10

func NewGrpcStatsHandler(inflight *prometheus.GaugeVec, requestSize *prometheus.HistogramVec, responseSize *prometheus.HistogramVec) *GrpcStatsHandler

func (*GrpcStatsHandler) HandleConn added in v1.27.10

func (g *GrpcStatsHandler) HandleConn(_ context.Context, _ stats.ConnStats)

func (*GrpcStatsHandler) HandleRPC added in v1.27.10

func (g *GrpcStatsHandler) HandleRPC(ctx context.Context, rpcStats stats.RPCStats)

func (*GrpcStatsHandler) TagConn added in v1.27.10

func (*GrpcStatsHandler) TagRPC added in v1.27.10

type HTTPServerMetrics added in v1.27.12

type HTTPServerMetrics struct {
	TCPActiveConnections *prometheus.GaugeVec
	RequestDuration      *prometheus.HistogramVec
	RequestBodySize      *prometheus.HistogramVec
	ResponseBodySize     *prometheus.HistogramVec
	InflightRequests     *prometheus.GaugeVec
}

HTTPServerMetrics exposes a set of Prometheus metrics for HTTP servers.

func NewHTTPServerMetrics added in v1.27.12

func NewHTTPServerMetrics(namespace string, reg prometheus.Registerer) *HTTPServerMetrics

NewHTTPServerMetrics returns the HTTPServerMetrics that can be used in any of the gRPC or HTTP servers.

type InstrumentHandler added in v1.27.12

type InstrumentHandler struct {
	// contains filtered or unexported fields
}

func InstrumentHTTP added in v1.27.12

func InstrumentHTTP(
	next http.Handler,
	routeLabel StaticRouteLabel,
	inflight *prometheus.GaugeVec,
	duration *prometheus.HistogramVec,
	requestSize *prometheus.HistogramVec,
	responseSize *prometheus.HistogramVec,
) *InstrumentHandler

func (*InstrumentHandler) ServeHTTP added in v1.27.12

func (i *InstrumentHandler) ServeHTTP(w http.ResponseWriter, r *http.Request)
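
InstrumentHTTP is not described in this listing, but its parameters (in-flight gauge, duration, request size, response size) point to the usual handler-wrapping technique. The sketch below shows that technique with the standard library only. It is not the package's implementation: `instrument`, `recordingWriter` and `observation` are hypothetical, an atomic counter and a callback replace the Prometheus vectors, and the route label is a fixed string rather than a `StaticRouteLabel` function.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// observation is what the wrapper records per request.
type observation struct {
	route         string
	status        int
	responseBytes int
	duration      time.Duration
}

// recordingWriter captures the status code and the number of body bytes written.
type recordingWriter struct {
	http.ResponseWriter
	status int
	bytes  int
}

func (w *recordingWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

func (w *recordingWriter) Write(p []byte) (int, error) {
	n, err := w.ResponseWriter.Write(p)
	w.bytes += n
	return n, err
}

// instrument wraps next, tracks in-flight requests, and reports one
// observation per completed request.
func instrument(next http.Handler, route string, inflight *atomic.Int64, report func(observation)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		inflight.Add(1)
		defer inflight.Add(-1)

		start := time.Now()
		rw := &recordingWriter{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rw, r)
		report(observation{
			route:         route,
			status:        rw.status,
			responseBytes: rw.bytes,
			duration:      time.Since(start),
		})
	})
}

// demo serves one request and returns its observation plus the in-flight
// count seen from inside the wrapped handler.
func demo() (obs observation, during int64) {
	var inflight atomic.Int64
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		during = inflight.Load()
		w.WriteHeader(http.StatusCreated)
		w.Write([]byte("hello"))
	})
	h := instrument(inner, "/schema/{className}", &inflight, func(o observation) { obs = o })

	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodPost, "/schema/Movies", nil))
	return obs, during
}

func main() {
	obs, during := demo()
	fmt.Println(obs.route, obs.status, obs.responseBytes, during)
}
```

Wrapping the `http.ResponseWriter` is what makes the status code and response size observable, since `http.Handler` itself returns nothing.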

type OnceUponATimer

type OnceUponATimer struct {
	sync.Once
	Timer *prometheus.Timer
}

func NewOnceTimer

func NewOnceTimer(promTimer *prometheus.Timer) *OnceUponATimer

func (*OnceUponATimer) ObserveDurationOnce

func (o *OnceUponATimer) ObserveDurationOnce()
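
The struct pairs `sync.Once` with a `*prometheus.Timer`, which suggests that a duration is observed at most once even when several code paths try to record it. That reading is an assumption based on the type and method names. A standard-library sketch of the idea, with a callback standing in for the timer's histogram and all names hypothetical:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// onceTimer reports the elapsed time since start at most once.
type onceTimer struct {
	once    sync.Once
	start   time.Time
	observe func(time.Duration) // stand-in for the observer behind prometheus.Timer
}

func newOnceTimer(observe func(time.Duration)) *onceTimer {
	return &onceTimer{start: time.Now(), observe: observe}
}

// observeDurationOnce records the duration on the first call and is a
// no-op on every later call.
func (o *onceTimer) observeDurationOnce() {
	o.once.Do(func() { o.observe(time.Since(o.start)) })
}

// demo triggers the observation from two places and counts how often it fires.
func demo() int {
	count := 0
	t := newOnceTimer(func(time.Duration) { count++ })
	t.observeDurationOnce() // e.g. on the success path
	t.observeDurationOnce() // e.g. again from a cleanup path
	return count
}

func main() {
	fmt.Println(demo()) // prints "1"
}
```

This is handy when both an early-return path and a deferred cleanup may try to stop the same timer.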

type PrometheusMetrics

type PrometheusMetrics struct {
	Registerer prometheus.Registerer

	BatchTime                           *prometheus.HistogramVec
	BatchSizeBytes                      *prometheus.SummaryVec
	BatchSizeObjects                    prometheus.Summary
	BatchSizeTenants                    prometheus.Summary
	BatchDeleteTime                     *prometheus.SummaryVec
	BatchCount                          *prometheus.CounterVec
	BatchCountBytes                     *prometheus.CounterVec
	ObjectsTime                         *prometheus.SummaryVec
	AsyncOperations                     *prometheus.GaugeVec
	LSMSegmentCount                     *prometheus.GaugeVec
	LSMObjectsBucketSegmentCount        *prometheus.GaugeVec
	LSMCompressedVecsBucketSegmentCount *prometheus.GaugeVec
	LSMSegmentCountByLevel              *prometheus.GaugeVec
	LSMSegmentUnloaded                  *prometheus.GaugeVec
	LSMSegmentObjects                   *prometheus.GaugeVec
	LSMSegmentSize                      *prometheus.GaugeVec
	LSMMemtableSize                     *prometheus.GaugeVec
	LSMMemtableDurations                *prometheus.SummaryVec
	LSMBitmapBuffersUsage               *prometheus.CounterVec
	ObjectCount                         *prometheus.GaugeVec
	QueriesCount                        *prometheus.GaugeVec
	RequestsTotal                       *prometheus.GaugeVec
	QueriesDurations                    *prometheus.HistogramVec
	QueriesFilteredVectorDurations      *prometheus.SummaryVec
	QueryDimensions                     *prometheus.CounterVec
	QueryDimensionsCombined             prometheus.Counter
	GoroutinesCount                     *prometheus.GaugeVec
	BackupRestoreDurations              *prometheus.SummaryVec
	BackupStoreDurations                *prometheus.SummaryVec
	BucketPauseDurations                *prometheus.SummaryVec
	BackupRestoreClassDurations         *prometheus.SummaryVec
	BackupRestoreBackupInitDurations    *prometheus.SummaryVec
	BackupRestoreFromStorageDurations   *prometheus.SummaryVec
	BackupRestoreDataTransferred        *prometheus.CounterVec
	BackupStoreDataTransferred          *prometheus.CounterVec
	FileIOWrites                        *prometheus.SummaryVec
	FileIOReads                         *prometheus.SummaryVec
	MmapOperations                      *prometheus.CounterVec
	MmapProcMaps                        prometheus.Gauge

	// offload metric
	QueueSize                        *prometheus.GaugeVec
	QueueDiskUsage                   *prometheus.GaugeVec
	QueuePaused                      *prometheus.GaugeVec
	QueueCount                       *prometheus.GaugeVec
	QueuePartitionProcessingDuration *prometheus.HistogramVec

	VectorIndexQueueInsertCount *prometheus.CounterVec
	VectorIndexQueueDeleteCount *prometheus.CounterVec

	VectorIndexTombstones              *prometheus.GaugeVec
	VectorIndexTombstoneCleanupThreads *prometheus.GaugeVec
	VectorIndexTombstoneCleanedCount   *prometheus.CounterVec
	VectorIndexTombstoneUnexpected     *prometheus.CounterVec
	VectorIndexTombstoneCycleStart     *prometheus.GaugeVec
	VectorIndexTombstoneCycleEnd       *prometheus.GaugeVec
	VectorIndexTombstoneCycleProgress  *prometheus.GaugeVec
	VectorIndexOperations              *prometheus.GaugeVec
	VectorIndexDurations               *prometheus.SummaryVec
	VectorIndexSize                    *prometheus.GaugeVec
	VectorIndexMaintenanceDurations    *prometheus.SummaryVec
	// IVF
	VectorIndexPostings                      *prometheus.GaugeVec
	VectorIndexPostingSize                   *prometheus.HistogramVec
	VectorIndexPendingBackgroundOperations   *prometheus.GaugeVec
	VectorIndexBackgroundOperationsDurations *prometheus.SummaryVec
	VectorIndexBackgroundOperationsCount     *prometheus.GaugeVec
	VectorIndexStoreOperationsDurations      *prometheus.SummaryVec

	VectorDimensionsSum                 *prometheus.GaugeVec
	VectorSegmentsSum                   *prometheus.GaugeVec
	VectorIndexMemoryAllocationRejected prometheus.Counter

	StartupProgress  *prometheus.GaugeVec
	StartupDurations *prometheus.SummaryVec
	StartupDiskIO    *prometheus.SummaryVec

	ShardsLoaded    prometheus.Gauge
	ShardsUnloaded  prometheus.Gauge
	ShardsLoading   prometheus.Gauge
	ShardsUnloading prometheus.Gauge

	// RAFT-based schema metrics
	SchemaWrites         *prometheus.SummaryVec
	SchemaReadsLocal     *prometheus.SummaryVec
	SchemaReadsLeader    *prometheus.SummaryVec
	SchemaWaitForVersion *prometheus.SummaryVec

	TombstoneFindLocalEntrypoint  *prometheus.CounterVec
	TombstoneFindGlobalEntrypoint *prometheus.CounterVec
	TombstoneReassignNeighbors    *prometheus.CounterVec
	TombstoneDeleteListSize       *prometheus.GaugeVec

	Group bool
	// Keeping metering to only the critical buckets (objects, vectors_compressed)
	// helps cut down on noise when monitoring
	LSMCriticalBucketsOnly bool

	// Deprecated metrics, kept around because the classification features
	// seem to still use the old logic. However, those metrics are not actually
	// used for the schema anymore, but only for the classification features.
	SchemaTxOpened   *prometheus.CounterVec
	SchemaTxClosed   *prometheus.CounterVec
	SchemaTxDuration *prometheus.SummaryVec

	// Vectorization
	T2VBatches            *prometheus.GaugeVec
	T2VBatchQueueDuration *prometheus.HistogramVec
	T2VRequestDuration    *prometheus.HistogramVec
	T2VTokensInBatch      *prometheus.HistogramVec
	T2VTokensInRequest    *prometheus.HistogramVec
	T2VRateLimitStats     *prometheus.GaugeVec
	T2VRepeatStats        *prometheus.GaugeVec
	T2VRequestsPerBatch   *prometheus.HistogramVec

	TokenizerDuration           *prometheus.HistogramVec
	TokenizerRequests           *prometheus.CounterVec
	TokenizerInitializeDuration *prometheus.HistogramVec
	TokenCount                  *prometheus.CounterVec
	TokenCountPerRequest        *prometheus.HistogramVec

	// Currently targeted at OpenAI; the metrics will have to be added to every vectorizer for complete coverage.
	ModuleExternalRequests           *prometheus.CounterVec
	ModuleExternalRequestDuration    *prometheus.HistogramVec
	ModuleExternalBatchLength        *prometheus.HistogramVec
	ModuleExternalRequestSingleCount *prometheus.CounterVec
	ModuleExternalRequestBatchCount  *prometheus.CounterVec
	ModuleExternalRequestSize        *prometheus.HistogramVec
	ModuleExternalResponseSize       *prometheus.HistogramVec
	ModuleExternalResponseStatus     *prometheus.CounterVec
	VectorizerRequestTokens          *prometheus.HistogramVec
	ModuleExternalError              *prometheus.CounterVec
	ModuleCallError                  *prometheus.CounterVec
	ModuleBatchError                 *prometheus.CounterVec

	// Checksum metrics
	ChecksumValidationDuration prometheus.Summary
	ChecksumBytesRead          prometheus.Summary
}

NOTE: Do not add any new metrics to this global `PrometheusMetrics` struct. Instead, add your metrics close to the corresponding component.

func GetMetrics

func GetMetrics() *PrometheusMetrics

func (*PrometheusMetrics) DeleteClass added in v1.21.7

func (pm *PrometheusMetrics) DeleteClass(className string) error

DeleteClass deletes all metrics that match the class name, but do not have a shard-specific label. See [DeleteShard] for more information.

func (*PrometheusMetrics) DeleteShard added in v1.21.7

func (pm *PrometheusMetrics) DeleteShard(className, shardName string) error

DeleteShard deletes existing label combinations that match both the shard and class name. If a metric is not collected at the shard level, it is unaffected. This ensures that deleting a single shard (e.g. in multi-tenancy) does not affect metrics for existing shards.

In addition, there are some metrics that we explicitly keep, such as vector_dimensions_sum, because they can be used in billing decisions.
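
The deletion rule for shards can be illustrated on a toy list of label combinations. This sketch does not touch Prometheus: `series` and `deleteShard` are hypothetical, and the metric names other than `vector_dimensions_sum` are made up for the demo.

```go
package main

import "fmt"

// series is one label combination of a toy metric.
// shard is empty for series collected at the class level only.
type series struct {
	metric, class, shard string
}

// deleteShard drops every series that matches both class and shard,
// except for metrics listed in keep.
func deleteShard(all []series, class, shard string, keep map[string]bool) []series {
	var kept []series
	for _, s := range all {
		if s.class == class && s.shard == shard && !keep[s.metric] {
			continue // matches the deleted shard and is not protected
		}
		kept = append(kept, s)
	}
	return kept
}

// demo deletes shard "tenantA" of class "Movies" from a small sample.
func demo() []series {
	all := []series{
		{"object_count", "Movies", "tenantA"},
		{"object_count", "Movies", "tenantB"},
		{"vector_dimensions_sum", "Movies", "tenantA"},
		{"queries_count", "Movies", ""},
	}
	keep := map[string]bool{"vector_dimensions_sum": true}
	return deleteShard(all, "Movies", "tenantA", keep)
}

func main() {
	for _, s := range demo() {
		fmt.Printf("%s{class=%q, shard=%q}\n", s.metric, s.class, s.shard)
	}
}
```

Only the `tenantA` series of the unprotected metric disappears; the other shard, the class-level series and the explicitly kept metric all survive.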

func (*PrometheusMetrics) FinishLoadingShard added in v1.23.0

func (pm *PrometheusMetrics) FinishLoadingShard()

Move the shard from in progress to loaded

func (*PrometheusMetrics) FinishUnloadingShard added in v1.23.0

func (pm *PrometheusMetrics) FinishUnloadingShard()

Move the shard from in progress to unloaded

func (*PrometheusMetrics) NewUnloadedshard added in v1.23.0

func (pm *PrometheusMetrics) NewUnloadedshard()

Register a new, unloaded shard

func (*PrometheusMetrics) StartLoadingShard added in v1.23.0

func (pm *PrometheusMetrics) StartLoadingShard()

Move the shard from unloaded to in progress

func (*PrometheusMetrics) StartUnloadingShard added in v1.23.0

func (pm *PrometheusMetrics) StartUnloadingShard()

Move the shard from loaded to in progress
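
Taken together, the five methods describe a small state machine. The sketch below models it with plain integers in place of the `ShardsLoaded`, `ShardsUnloaded`, `ShardsLoading` and `ShardsUnloading` gauges. Mapping "in progress" to the loading and unloading gauges is an assumption drawn from the field names, and `shardGauges` is a hypothetical type.

```go
package main

import "fmt"

// shardGauges is a stand-in for the four shard state gauges.
type shardGauges struct {
	loaded, unloaded, loading, unloading int
}

// Each transition moves one shard between two states.
func (g *shardGauges) newUnloadedShard()     { g.unloaded++ }
func (g *shardGauges) startLoadingShard()    { g.unloaded--; g.loading++ }
func (g *shardGauges) finishLoadingShard()   { g.loading--; g.loaded++ }
func (g *shardGauges) startUnloadingShard()  { g.loaded--; g.unloading++ }
func (g *shardGauges) finishUnloadingShard() { g.unloading--; g.unloaded++ }

// demo registers two shards, loads one, then unloads it again.
func demo() (afterLoad, afterUnload shardGauges) {
	var g shardGauges
	g.newUnloadedShard()
	g.newUnloadedShard()

	g.startLoadingShard()
	g.finishLoadingShard()
	afterLoad = g // one loaded, one unloaded

	g.startUnloadingShard()
	g.finishUnloadingShard()
	afterUnload = g // both unloaded again
	return afterLoad, afterUnload
}

func main() {
	afterLoad, afterUnload := demo()
	fmt.Printf("%+v\n%+v\n", afterLoad, afterUnload)
}
```

In this model every transition preserves the total number of shards, which makes the four values easy to sanity-check against each other.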

type StaticRouteLabel added in v1.27.14

type StaticRouteLabel func(r *http.Request) (*http.Request, string)

Examples:

`/schema/Movies/properties` -> `/schema/{className}`
`/replicas/indices/Movies/shards/hello0/objects` -> `/replicas/indices`
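
A function with this signature might look like the sketch below, which collapses variable path segments into a fixed label so that metric label cardinality stays bounded. Only the two routes from the examples are handled; `labelRoute`, `labelFor` and the `"other"` fallback are hypothetical, and the request is returned unchanged because the listing does not say how the returned `*http.Request` is meant to be used.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// staticRouteLabel mirrors the StaticRouteLabel signature.
type staticRouteLabel func(r *http.Request) (*http.Request, string)

// labelRoute maps a request path to a static route label.
func labelRoute(r *http.Request) (*http.Request, string) {
	path := r.URL.Path
	switch {
	case strings.HasPrefix(path, "/schema/"):
		return r, "/schema/{className}"
	case strings.HasPrefix(path, "/replicas/indices/"):
		return r, "/replicas/indices"
	default:
		return r, "other" // hypothetical fallback label
	}
}

// labelFor builds a request for path and returns its label.
func labelFor(path string) string {
	var fn staticRouteLabel = labelRoute
	_, label := fn(httptest.NewRequest(http.MethodGet, path, nil))
	return label
}

func main() {
	for _, p := range []string{
		"/schema/Movies/properties",
		"/replicas/indices/Movies/shards/hello0/objects",
	} {
		fmt.Printf("%s -> %s\n", p, labelFor(p))
	}
}
```

Using the raw path as a label would create one time series per class or shard name, which is what a static label avoids.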

type TenantOffloadMetrics added in v1.28.0

type TenantOffloadMetrics struct {
	// NOTE: These ops are not GET or PUT requests to object storage.
	// Each is one of `download`, `upload` or `delete`, because we currently
	// use s5cmd to talk to object storage, and it supports these operations at a high level.
	FetchedBytes     prometheus.Counter
	TransferredBytes prometheus.Counter
	OpsDuration      *prometheus.HistogramVec
}

func NewTenantOffloadMetrics added in v1.28.0

func NewTenantOffloadMetrics(cfg Config, reg prometheus.Registerer) *TenantOffloadMetrics
