Documentation
Constants ¶
const (
	// KVEventsEnabledLabel indicates if KV events are enabled for this pod
	// This label was introduced specifically for KV Event Sync feature
	// Example: "model.aibrix.ai/kv-events-enabled": "true"
	KVEventsEnabledLabel = "model.aibrix.ai/kv-events-enabled"

	// LoraIDLabel specifies the LoRA adapter ID for KV sync
	// This label is used by KV Event Sync to track LoRA-specific caches
	// Example: "model.aibrix.ai/lora-id": "123"
	LoraIDLabel = "model.aibrix.ai/lora-id"
)
Label keys for KV Event Sync
const (
	// EnvPrefixCacheKVEventSyncEnabled enables KV event synchronization
	// When true, enables ZMQ-based cache event synchronization
	EnvPrefixCacheKVEventSyncEnabled = "AIBRIX_PREFIX_CACHE_KV_EVENT_SYNC_ENABLED"

	// EnvPrefixCacheKVEventPublishAddr specifies ZMQ publish address
	// Format: "tcp://*:5555" or similar ZMQ address
	EnvPrefixCacheKVEventPublishAddr = "AIBRIX_PREFIX_CACHE_KV_EVENT_PUBLISH_ADDR"

	// EnvPrefixCacheKVEventSubscribeAddrs specifies ZMQ subscribe addresses
	// Comma-separated list of ZMQ addresses to subscribe to
	EnvPrefixCacheKVEventSubscribeAddrs = "AIBRIX_PREFIX_CACHE_KV_EVENT_SUBSCRIBE_ADDRS"

	// EnvPrefixCacheLocalRouterMetricsEnabled enables prefix cache metrics
	// Added as part of KV Event Sync to control metrics registration
	EnvPrefixCacheLocalRouterMetricsEnabled = "AIBRIX_PREFIX_CACHE_LOCAL_ROUTER_METRICS_ENABLED"

	// EnvPrefixCacheUseRemoteTokenizer enables remote tokenizer usage
	// When true, uses remote tokenizer service instead of local tokenization
	EnvPrefixCacheUseRemoteTokenizer = "AIBRIX_PREFIX_CACHE_USE_REMOTE_TOKENIZER"

	// EnvPrefixCacheTokenizerType specifies the tokenizer type for prefix cache
	// Options: "character", "tiktoken", "remote"
	EnvPrefixCacheTokenizerType = "AIBRIX_PREFIX_CACHE_TOKENIZER_TYPE"

	// EnvPrefixCacheRemoteTokenizerEndpoint specifies the remote tokenizer service endpoint
	// Format: "http://service:port" - required when using remote tokenizer
	EnvPrefixCacheRemoteTokenizerEndpoint = "AIBRIX_PREFIX_CACHE_REMOTE_TOKENIZER_ENDPOINT"
)
Environment variable names for KV Event Sync
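The comments above describe a boolean switch and a comma-separated address list. A minimal sketch of reading both from the environment follows. The helpers `envBool` and `splitAddrs` are hypothetical names introduced for illustration, not part of this package, and the fallback and trimming behavior are assumptions rather than what AIBrix itself does.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Names copied from the constants documented above.
const (
	EnvPrefixCacheKVEventSyncEnabled    = "AIBRIX_PREFIX_CACHE_KV_EVENT_SYNC_ENABLED"
	EnvPrefixCacheKVEventSubscribeAddrs = "AIBRIX_PREFIX_CACHE_KV_EVENT_SUBSCRIBE_ADDRS"
)

// envBool is a hypothetical helper. It returns def when the variable
// is unset or cannot be parsed as a boolean.
func envBool(name string, def bool) bool {
	v, ok := os.LookupEnv(name)
	if !ok {
		return def
	}
	b, err := strconv.ParseBool(strings.TrimSpace(v))
	if err != nil {
		return def
	}
	return b
}

// splitAddrs is a hypothetical helper. It splits a comma-separated
// list, trims whitespace, and drops empty entries.
func splitAddrs(raw string) []string {
	var out []string
	for _, p := range strings.Split(raw, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Example values only; the addresses are placeholders.
	os.Setenv(EnvPrefixCacheKVEventSyncEnabled, "true")
	os.Setenv(EnvPrefixCacheKVEventSubscribeAddrs, "tcp://10.0.0.1:5555, tcp://10.0.0.2:5555")

	fmt.Println(envBool(EnvPrefixCacheKVEventSyncEnabled, false))
	fmt.Println(splitAddrs(os.Getenv(EnvPrefixCacheKVEventSubscribeAddrs)))
}
```

Falling back to a default on a parse error keeps a mistyped value from silently enabling the feature; a stricter caller might prefer to return the error instead.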
const (
	KVCacheLabelKeyIdentifier    = "kvcache.orchestration.aibrix.ai/name"
	KVCacheLabelKeyRole          = "kvcache.orchestration.aibrix.ai/role"
	KVCacheLabelKeyMetadataIndex = "kvcache.orchestration.aibrix.ai/etcd-index"
	KVCacheLabelKeyBackend       = "kvcache.orchestration.aibrix.ai/backend"

	KVCacheAnnotationNodeAffinityKey        = "kvcache.orchestration.aibrix.ai/node-affinity-key"
	KVCacheAnnotationNodeAffinityGPUType    = "kvcache.orchestration.aibrix.ai/node-affinity-gpu-type"
	KVCacheAnnotationPodAffinityKey         = "kvcache.orchestration.aibrix.ai/pod-affinity-workload"
	KVCacheAnnotationPodAntiAffinity        = "kvcache.orchestration.aibrix.ai/pod-anti-affinity"
	KVCacheAnnotationNodeAffinityDefaultKey = "machine.cluster.vke.volcengine.com/gpu-name"

	// This config will be deprecated in future, users should specify kvcache backend directly.
	KVCacheAnnotationMode = "kvcache.orchestration.aibrix.ai/mode"

	KVCacheLabelValueRoleCache     = "cache"
	KVCacheLabelValueRoleMetadata  = "metadata"
	KVCacheLabelValueRoleKVWatcher = "kvwatcher"

	KVCacheBackendVineyard    = "vineyard"
	KVCacheBackendHPKV        = "hpkv"
	KVCacheBackendInfinistore = "infinistore"
	KVCacheBackendDefault     = KVCacheBackendVineyard
)
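The backend constants above include a default (`KVCacheBackendDefault`, which aliases `KVCacheBackendVineyard`). One way a caller might pick a backend from an object's labels is sketched below. `resolveBackend` is a hypothetical helper, and treating an absent or unrecognized value as "use the default" is an assumption for illustration; the actual controller may validate or reject such values differently.

```go
package main

import "fmt"

// Values copied from the constants documented above.
const (
	KVCacheLabelKeyBackend    = "kvcache.orchestration.aibrix.ai/backend"
	KVCacheBackendVineyard    = "vineyard"
	KVCacheBackendHPKV        = "hpkv"
	KVCacheBackendInfinistore = "infinistore"
	KVCacheBackendDefault     = KVCacheBackendVineyard
)

// resolveBackend is a hypothetical helper. It returns the backend named
// in the labels, or KVCacheBackendDefault when the label is absent or
// holds a value that is not one of the known backends.
func resolveBackend(labels map[string]string) string {
	switch b := labels[KVCacheLabelKeyBackend]; b {
	case KVCacheBackendVineyard, KVCacheBackendHPKV, KVCacheBackendInfinistore:
		return b
	default:
		return KVCacheBackendDefault
	}
}

func main() {
	fmt.Println(resolveBackend(map[string]string{KVCacheLabelKeyBackend: KVCacheBackendInfinistore}))
	fmt.Println(resolveBackend(nil)) // reading from a nil map is safe in Go
}
```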
const (
	// ModelLabelName is the label for identifying the model name
	// Example: "model.aibrix.ai/name": "deepseek-llm-7b-chat"
	ModelLabelName = "model.aibrix.ai/name"

	// ModelLabelEngine is the label for identifying the inference engine
	// Example: "model.aibrix.ai/engine": "vllm"
	ModelLabelEngine = "model.aibrix.ai/engine"

	// ModelLabelMetricPort is the label for specifying the metrics port
	// Example: "model.aibrix.ai/metric-port": "8000"
	ModelLabelMetricPort = "model.aibrix.ai/metric-port"

	// ModelLabelPort is the label for specifying the service port
	// Example: "model.aibrix.ai/port": "8080"
	ModelLabelPort = "model.aibrix.ai/port"

	// ModelLabelAdapterEnabled is the label for enabling or disabling adapter dynamic registration
	// Example: "adapter.model.aibrix.ai/enabled": "true"
	ModelLabelAdapterEnabled = "adapter.model.aibrix.ai/enabled"
)
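Label values are always strings, so a port carried in `ModelLabelMetricPort` has to be parsed before use. The sketch below shows one way to do that with validation. `metricPort` is a hypothetical helper, not a function of this package, and the range check is an added assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// Values copied from the constants documented above.
const (
	ModelLabelName       = "model.aibrix.ai/name"
	ModelLabelMetricPort = "model.aibrix.ai/metric-port"
)

// metricPort is a hypothetical helper. It parses the metrics port label
// and rejects missing, non-numeric, or out-of-range values.
func metricPort(labels map[string]string) (int, error) {
	raw, ok := labels[ModelLabelMetricPort]
	if !ok {
		return 0, fmt.Errorf("label %q is not set", ModelLabelMetricPort)
	}
	p, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("label %q: %w", ModelLabelMetricPort, err)
	}
	if p < 1 || p > 65535 {
		return 0, fmt.Errorf("label %q: port %d out of range", ModelLabelMetricPort, p)
	}
	return p, nil
}

func main() {
	// Label values taken from the examples in the comments above.
	labels := map[string]string{
		ModelLabelName:       "deepseek-llm-7b-chat",
		ModelLabelMetricPort: "8000",
	}
	port, err := metricPort(labels)
	fmt.Println(port, err)
}
```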
Variables ¶
This section is empty.
Functions ¶
func IsKVEventsEnabled ¶ added in v0.4.0
IsKVEventsEnabled reports whether KV events are enabled for the pod.
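The signature of `IsKVEventsEnabled` is not shown on this page, so the following is only a sketch of what such a check could look like. It assumes the decision is driven by `KVEventsEnabledLabel` and that only the exact value `"true"` enables the feature, as in the label's documented example. The lowercase `kvEventsEnabled` operating on a plain label map is a hypothetical stand-in, not the package function.

```go
package main

import "fmt"

// Value copied from the constant documented above.
const KVEventsEnabledLabel = "model.aibrix.ai/kv-events-enabled"

// kvEventsEnabled is a hypothetical stand-in for IsKVEventsEnabled.
// Assumption: the feature is on only when the label is exactly "true".
func kvEventsEnabled(podLabels map[string]string) bool {
	return podLabels[KVEventsEnabledLabel] == "true"
}

func main() {
	fmt.Println(kvEventsEnabled(map[string]string{KVEventsEnabledLabel: "true"}))
	fmt.Println(kvEventsEnabled(map[string]string{}))
}
```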
Types ¶
This section is empty.