Documentation
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ParsePrometheusMetricValue ¶
ParsePrometheusMetricValue returns the first numeric value for metric from a Prometheus-style plaintext metrics body.
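The signature of ParsePrometheusMetricValue is not shown on this page, so the following is only a minimal sketch of the described behavior. The helper name firstMetricValue, its (float64, bool) return shape, and the sample metric names are assumptions made for illustration. It scans the exposition text line by line, skips comments, and returns the first sample whose name matches exactly.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// firstMetricValue is a hypothetical stand-in for ParsePrometheusMetricValue.
// It returns the first parseable sample value for metric, or false if none.
func firstMetricValue(body, metric string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if !strings.HasPrefix(line, metric) {
			continue
		}
		rest := line[len(metric):]
		// The name must be followed by a label set or whitespace, so that
		// "requests_running" does not match "requests_running_total".
		if rest == "" || (rest[0] != '{' && rest[0] != ' ' && rest[0] != '\t') {
			continue
		}
		if rest[0] == '{' {
			// Drop the label set; the value follows the last closing brace.
			end := strings.LastIndex(rest, "}")
			if end < 0 {
				continue
			}
			rest = rest[end+1:]
		}
		// The first field is the value; an optional timestamp may follow.
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		return v, true
	}
	return 0, false
}

func main() {
	body := "# TYPE requests_running gauge\n" +
		"requests_running_total 9\n" +
		"requests_running{model=\"m\"} 3\n"
	v, ok := firstMetricValue(body, "requests_running")
	fmt.Println(v, ok)
}
```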
func ServerRoot ¶
ServerRoot strips a trailing /v1 path component from an OpenAI-compatible base URL while preserving the scheme, host, and any prefix path.
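As with the function above, the signature of ServerRoot is not shown here. The sketch below illustrates the described transformation using net/url; the name serverRoot and its (string, error) return shape are assumptions, and the real function may treat trailing slashes or malformed input differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// serverRoot is a hypothetical stand-in for ServerRoot. It removes a
// trailing "/v1" path component and keeps the scheme, host, and any
// prefix path that precedes it.
func serverRoot(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	// Normalize trailing slashes first so "/v1/" and "/v1" behave alike.
	// Note that this sketch also drops a trailing slash when no /v1 is present.
	p := strings.TrimRight(u.Path, "/")
	// TrimSuffix matches the whole "/v1" component, so "/xv1" is untouched.
	u.Path = strings.TrimSuffix(p, "/v1")
	u.RawPath = ""
	return u.String(), nil
}

func main() {
	for _, in := range []string{
		"http://localhost:8000/v1",
		"https://example.com/proxy/v1/",
		"http://localhost:8000",
	} {
		root, err := serverRoot(in)
		fmt.Println(root, err)
	}
}
```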
Types ¶
type Cache ¶
type Cache struct {
	// contains filtered or unexported fields
}
Cache preserves the most recent successful sample so probe failures can return stale utilization instead of surfacing hard endpoint unavailability.
func (*Cache) Remember ¶
func (c *Cache) Remember(sample EndpointUtilization) EndpointUtilization
Remember stores a fresh sample and returns a normalized copy with its freshness marked fresh and its observed timestamp set.
func (*Cache) Stale ¶
func (c *Cache) Stale() (EndpointUtilization, bool)
Stale returns the last successful sample, marked as stale. The boolean reports whether a previous sample existed.
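The stale-on-failure pattern that Remember and Stale describe can be sketched with simplified types. Everything below is an assumption made for illustration: the lowercase type names, the "fresh" and "stale" string values, the mutex, and the choice to stamp the time inside remember are not taken from the package's unexported internals.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type freshness string

// Assumed values; the package's actual Freshness constants are not shown.
const (
	fresh freshness = "fresh"
	stale freshness = "stale"
)

// sample is a reduced stand-in for EndpointUtilization.
type sample struct {
	ActiveRequests *int
	Freshness      freshness
	ObservedAt     time.Time
}

// cache is a hypothetical stand-in for Cache.
type cache struct {
	mu   sync.Mutex
	last *sample
}

// remember stores s and returns a copy marked fresh with a timestamp.
func (c *cache) remember(s sample) sample {
	s.Freshness = fresh
	s.ObservedAt = time.Now()
	stored := s
	c.mu.Lock()
	c.last = &stored
	c.mu.Unlock()
	return s
}

// staleSample returns the last remembered sample marked stale, if any.
func (c *cache) staleSample() (sample, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.last == nil {
		return sample{}, false
	}
	out := *c.last
	out.Freshness = stale
	return out, true
}

// probe simulates an endpoint probe that can fail.
func probe(fail bool) (sample, error) {
	if fail {
		return sample{}, errors.New("endpoint unreachable")
	}
	n := 2
	return sample{ActiveRequests: &n}, nil
}

func main() {
	c := &cache{}
	for _, fail := range []bool{false, true} {
		s, err := probe(fail)
		if err == nil {
			s = c.remember(s)
		} else if old, ok := c.staleSample(); ok {
			s = old // fall back to the last good sample instead of failing
		}
		fmt.Println(s.Freshness, *s.ActiveRequests)
	}
}
```

A caller that gets false from the stale lookup has no prior observation to fall back on and would need to report the utilization as unknown.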
type EndpointUtilization ¶
type EndpointUtilization struct {
	ActiveRequests         *int
	QueuedRequests         *int
	CacheUsage             *float64
	MaxConcurrency         *int
	TotalPromptTokens      *int
	TotalCompletionTokens  *int
	CacheHitType           *string
	CachedTokens           *int
	GeneratedTokens        *int
	ActiveRequestPhase     *string
	TTFTSeconds            *float64
	TokensPerSecond        *float64
	MetalActiveMemoryBytes *int64
	MetalPeakMemoryBytes   *int64
	MetalCacheMemoryBytes  *int64
	Source                 Source
	Freshness              Freshness
	ObservedAt             time.Time
}
EndpointUtilization is the normalized utilization shape shared by local provider probes.
func Unknown ¶
func Unknown(source Source) EndpointUtilization
Unknown returns a sample with unknown freshness and no numeric values.
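Because the numeric fields of EndpointUtilization are pointers, a sample with "no numeric values" can presumably leave them nil, which lets a consumer tell "not reported" apart from a reported zero. The sketch below illustrates that reading with a reduced, hypothetical struct; the type utilization and the helper describeActive are not part of the package.

```go
package main

import "fmt"

// utilization is a reduced stand-in for EndpointUtilization.
type utilization struct {
	ActiveRequests *int
}

// describeActive distinguishes a missing value (nil) from a reported zero.
func describeActive(u utilization) string {
	if u.ActiveRequests == nil {
		return "active requests: unknown"
	}
	return fmt.Sprintf("active requests: %d", *u.ActiveRequests)
}

func main() {
	zero := 0
	fmt.Println(describeActive(utilization{}))
	fmt.Println(describeActive(utilization{ActiveRequests: &zero}))
}
```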
type Freshness ¶
type Freshness string
Freshness describes whether a sample was observed just now, reused after a failed probe, or has no known prior observation.