utilization

package
v0.10.10
Warning

This package is not in the latest version of its module.

Published: May 6, 2026 | License: MIT | Imports: 5 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Float64

func Float64(v float64) *float64

Float64 returns a pointer to v.

func Int

func Int(v int) *int

Int returns a pointer to v.

func Int64

func Int64(v int64) *int64

Int64 returns a pointer to v.

func ParsePrometheusMetricValue

func ParsePrometheusMetricValue(body, metric string) (float64, bool)

ParsePrometheusMetricValue returns the first numeric value for metric from a Prometheus-style plaintext metrics body.
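
The page does not show how the lookup is implemented. As a rough sketch of what "first numeric value for metric" could mean for the Prometheus text exposition format, the hypothetical helper parseMetricValue below skips comment lines, tolerates an optional label set, and ignores longer metric names that merely share a prefix. The helper name, the matching rules, and the sample metric names are assumptions for illustration, not the package's actual behavior.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMetricValue is a hypothetical stand-in, not the package's code.
// It returns the first parseable value reported for metric.
func parseMetricValue(body, metric string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blanks and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if !strings.HasPrefix(line, metric) {
			continue
		}
		rest := line[len(metric):]
		if rest == "" {
			continue
		}
		switch rest[0] {
		case '{':
			// Drop the label set: name{k="v",...} value
			end := strings.LastIndex(rest, "}")
			if end < 0 {
				continue
			}
			rest = rest[end+1:]
		case ' ', '\t':
			// Plain "name value" line.
		default:
			// A longer metric name that only shares the prefix.
			continue
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		// fields[0] is the value; an optional timestamp may follow.
		if v, err := strconv.ParseFloat(fields[0], 64); err == nil {
			return v, true
		}
	}
	return 0, false
}

func main() {
	body := "# TYPE vllm:num_requests_running gauge\n" +
		"vllm:num_requests_running{model_name=\"m\"} 3.0\n" +
		"vllm:num_requests_waiting 1\n"
	v, ok := parseMetricValue(body, "vllm:num_requests_running")
	fmt.Println(v, ok) // prints: 3 true
}
```

The boolean result lets a caller distinguish "metric absent" from "metric present with value 0", which matters when the result is copied into an optional field.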

func ServerRoot

func ServerRoot(baseURL string) string

ServerRoot strips a trailing /v1 path component from an OpenAI-compatible base URL while preserving the scheme, host, and any prefix path.
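
To make the described behavior concrete, here is a minimal sketch built on the standard net/url package. The function name serverRoot, the handling of a trailing slash, and the fallback for unparseable input are assumptions; the real ServerRoot may treat edge cases differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// serverRoot is a hypothetical stand-in that removes a trailing "/v1"
// path component and keeps scheme, host, and any prefix path.
func serverRoot(baseURL string) string {
	u, err := url.Parse(baseURL)
	if err != nil {
		// Assumption: fall back to plain string trimming on bad input.
		return strings.TrimSuffix(strings.TrimRight(baseURL, "/"), "/v1")
	}
	p := strings.TrimRight(u.Path, "/")
	u.Path = strings.TrimSuffix(p, "/v1")
	u.RawPath = "" // let url.URL re-encode the shortened path
	return u.String()
}

func main() {
	fmt.Println(serverRoot("http://localhost:8000/v1"))        // http://localhost:8000
	fmt.Println(serverRoot("https://example.test/proxy/v1/"))  // https://example.test/proxy
}
```

Matching the suffix "/v1" with its leading slash means a path such as /apiv1 is left alone, since only a whole path component should be stripped.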

func String

func String(v string) *string

String returns a pointer to v.
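
The pointer helpers (Float64, Int, Int64, String) are most useful for filling optional pointer fields from literals, since Go does not allow taking the address of a constant directly. The sketch below re-declares three of them locally with the documented signatures and uses a hypothetical trimmed-down struct named sample; the one-line bodies are the obvious implementation but are an assumption, as the page shows signatures only.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented signatures.
func Int(v int) *int             { return &v }
func Float64(v float64) *float64 { return &v }
func String(v string) *string    { return &v }

// sample is a hypothetical struct with optional fields.
type sample struct {
	ActiveRequests *int
	CacheUsage     *float64
	CacheHitType   *string
}

func main() {
	s := sample{ActiveRequests: Int(0), CacheUsage: Float64(0.25)}

	// A nil pointer means "not reported"; a pointer to zero means
	// "reported as zero".
	fmt.Println(s.ActiveRequests != nil, *s.ActiveRequests) // true 0
	fmt.Println(s.CacheHitType == nil)                      // true
}
```

Each call returns the address of its own copy of the argument, so two calls with the same value never alias each other.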

Types

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

Cache preserves the most recent successful sample so probe failures can return stale utilization instead of surfacing hard endpoint unavailability.

func (*Cache) Remember

func (c *Cache) Remember(sample EndpointUtilization) EndpointUtilization

Remember stores a fresh sample and returns a normalized copy whose Freshness is FreshnessFresh and whose ObservedAt timestamp is set.

func (*Cache) Stale

func (c *Cache) Stale() (EndpointUtilization, bool)

Stale returns the last successful sample marked stale. The boolean reports whether a previous sample existed.

type EndpointUtilization

type EndpointUtilization struct {
	ActiveRequests         *int
	QueuedRequests         *int
	CacheUsage             *float64
	MaxConcurrency         *int
	TotalPromptTokens      *int
	TotalCompletionTokens  *int
	CacheHitType           *string
	CachedTokens           *int
	GeneratedTokens        *int
	ActiveRequestPhase     *string
	TTFTSeconds            *float64
	TokensPerSecond        *float64
	MetalActiveMemoryBytes *int64
	MetalPeakMemoryBytes   *int64
	MetalCacheMemoryBytes  *int64
	Source                 Source
	Freshness              Freshness
	ObservedAt             time.Time
}

EndpointUtilization is the normalized utilization shape shared by local provider probes.

func Unknown

func Unknown(source Source) EndpointUtilization

Unknown returns a sample with unknown freshness and no numeric values.

type Freshness

type Freshness string

Freshness describes whether a sample was observed just now, reused after a failed probe, or has no known prior observation.

const (
	FreshnessFresh   Freshness = "fresh"
	FreshnessStale   Freshness = "stale"
	FreshnessUnknown Freshness = "unknown"
)
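
One way a consumer might act on these three values is sketched below. The constants are re-declared locally with the documented string values; the sample struct and the hasCapacity policy (treat unknown or incomplete data as "capacity not known") are hypothetical and only illustrate how freshness and nil fields can be combined.

```go
package main

import "fmt"

type Freshness string

const (
	FreshnessFresh   Freshness = "fresh"
	FreshnessStale   Freshness = "stale"
	FreshnessUnknown Freshness = "unknown"
)

// sample is a hypothetical subset of a utilization record.
type sample struct {
	ActiveRequests *int
	MaxConcurrency *int
	Freshness      Freshness
}

// hasCapacity reports whether the endpoint appears to have a free slot.
// known is false when the sample cannot support a decision.
func hasCapacity(s sample) (free, known bool) {
	switch s.Freshness {
	case FreshnessFresh, FreshnessStale:
		if s.ActiveRequests == nil || s.MaxConcurrency == nil {
			return false, false
		}
		return *s.ActiveRequests < *s.MaxConcurrency, true
	default:
		// FreshnessUnknown or an unrecognized value: no usable numbers.
		return false, false
	}
}

func main() {
	active, limit := 2, 4
	s := sample{ActiveRequests: &active, MaxConcurrency: &limit, Freshness: FreshnessStale}
	fmt.Println(hasCapacity(s))                                   // true true
	fmt.Println(hasCapacity(sample{Freshness: FreshnessUnknown})) // false false
}
```

A stricter policy could discount stale samples as well; the sketch accepts them because the cache exists precisely so that a failed probe still yields usable numbers.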

type Source

type Source string

Source identifies the probe path that produced a utilization sample.

const (
	SourceUnknown        Source = "unknown"
	SourceOMLXStatus     Source = "omlx.status"
	SourceVLLMMetrics    Source = "vllm.metrics"
	SourceLlamaMetrics   Source = "llama-server.metrics"
	SourceLlamaSlots     Source = "llama-server.slots"
	SourceRapidMLXStatus Source = "rapid-mlx.status"
)
