load

package
v1.4.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 15, 2026 License: MIT Imports: 9 Imported by: 0

README

AIStore clusters must remain resilient under varying resource pressure.

The operations — from EC encoding and LRU eviction to multi-object transformations and the get-batch (ML endpoint) — must adjust behavior at runtime based on current system load.

The load package provides a unified API for assessing the latter and generating an actionable throttling advice - load.Advice.

Overview

The package is built around several motivations:

  1. Load assessment and throttling policy must be independent.
  2. Callers choose which load dimensions matter: CPU, memory, goroutines, disk, FDs.
  3. Sleep duration and sampling frequency adjust automatically.
  4. Data I/O throttles aggressively; metadata operations throttle minimally.
  5. Critical memory immediately triggers strong back-off.

The package provides one core abstraction called throttling advice (load.Advice) that includes:

  • How long to sleep, if at all
  • How frequently to check (sample interval / batch size)
  • What the highest load level is among monitored load dimensions

Each of the 5 load dimensions - memory, CPU, goroutine count, disk utilization, number of open descriptors - is independently graded:

type Load uint8

const (
    Low Load = iota + 1
    Moderate
    High
    Critical
)

Design

Five-Dimensional Load Vector

The system evaluates load across up to five dimensions:

  • FlMem --- memory pressure via memsys.PageMM()
  • FlCla --- CPU load averages
  • FlGor --- goroutine count
  • FlDsk --- disk utilization (per-mountpath or max)
  • FlFdt --- file descriptors (reserved for future use)

Internally this is represented as a packed uint64 (8 bits per dimension):

┌────────┬────────┬────────┬────────┬────────┐
│  Fdt   │  Gor   │  Mem   │  Cla   │  Dsk   │
│ (8bit) │ (8bit) │ (8bit) │ (8bit) │ (8bit) │
└────────┴────────┴────────┴────────┴────────┘
    ↓        ↓        ↓        ↓        ↓
   Low     High   Critical   Low    Moderate

Priority Ordering

When computing load.Advice, the following logic takes place:

  1. Memory --- highest priority; Critical memory triggers immediate, aggressive throttling
  2. Goroutines + CPU --- compute pressure
  3. Disk utilization --- I/O saturation
  4. Workload type (RW) --- metadata vs. data I/O

Workload Type: Extra.RW

Two classes of workloads behave differently:

Data I/O (RW: true)

Used by: - PUT/GET

  • EC encoding
  • Mirroring
  • Rebalance & tcb
  • Chunk transformation, GetBatch, etc.

Behavior: - Under High load: sleep 1--10ms, check every 32--512 ops

  • Under Critical load: sleep 10--100ms, check every 16 ops
Metadata-only (RW: false)

Currently used by subsystems/operations:

  • metadata-in-memory eviction
  • storage summary
  • space cleanup
  • LRU eviction

Usage

Basic Pattern

import "github.com/NVIDIA/aistore/cmn/load"

type myOp struct {
    adv load.Advice
    n   int64
}

func (op *myOp) init(mp *fs.Mountpath, config *cmn.Config) {
    op.adv.Init(
        load.FlMem | load.FlCla | load.FlDsk,
        &load.Extra{
            Mi:  mp,
            Cfg: &config.Disk,
            RW:  true,
        },
    )
}

func (op *myOp) run() {
    for ... {
        op.n++
        if op.adv.ShouldCheck(op.n) {
            op.adv.Refresh()
            if op.adv.Sleep > 0 {
                time.Sleep(op.adv.Sleep)
            }
        }
    }
}

Low-Level API

if load.Mem() == load.Critical {
    // abort or back off
}

cpu := load.CPU()
gor := load.Gor(runtime.NumGoroutine())
dsk := load.Dsk(mp, &config.Disk)

Throttling Constants

Note: These defaults may evolve in future releases or be overridden by configuration.

Sleep Durations

Condition Sleep
Memory Critical 100ms
Memory High 10ms
Gor/CPU Critical (both) 10–100ms
Gor/CPU Critical (either) 1–10ms
Gor/CPU High 1ms
Disk Critical 10–100ms
Disk High 1ms

Batch Intervals

Level Mask Check Every
maxBatch 0x1fff 8192 ops
dfltBatch 0x1ff 512 ops
smallBatch 0x1f 32 ops
minBatch 0xf 16 ops

Future Development

  • FlFdt --- file-descriptor monitoring
  • Caching of load vector
  • Configurable thresholds

Documentation

Overview

Package load provides 5-dimensional node-pressure readings and per-dimension grading.

  • Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Package load provides 5-dimensional node-pressure readings and per-dimension grading.

  • Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Index

Constants

View Source
const (
	FlFdt = 1 << iota // file descriptors
	FlGor             // goroutines
	FlMem             // memory (memsys)
	FlCla             // CPU load averages
	FlDsk             // disk utilization

	FlAll = FlFdt | FlGor | FlMem | FlCla | FlDsk
)

load dimensions: bit flags

Variables

View Source
var (
	Text = [...]string{"", "low", "moderate", "high", "critical"}
)

Functions

func Init

func Init()

Types

type Advice

type Advice struct {

	// runtime
	Sleep time.Duration // recommended sleep at check points
	Batch int64         // bitmask of the form 2^k - 1 (e.g., 0x1f => check memory, CPU, etc. load every 32 ops)
	Load  Load          // highest load level across requested _dimensions_
	// contains filtered or unexported fields
}

func (*Advice) ClaLoad

func (a *Advice) ClaLoad() Load

func (*Advice) DskLoad

func (a *Advice) DskLoad() Load

func (*Advice) Init

func (a *Advice) Init(flags uint64, extra *Extra)

memory always included

func (*Advice) MemLoad

func (a *Advice) MemLoad() Load

func (*Advice) Refresh

func (a *Advice) Refresh()

recompute throttling recommendation; note: disk requires config and, optionally, mountpath

func (*Advice) ShouldCheck

func (a *Advice) ShouldCheck(n int64) bool

return true when the caller should perform a throttle check after the N-th op.

func (*Advice) String

func (a *Advice) String() string

type Extra

type Extra struct {
	Mi  *fs.Mountpath // when nil return max util across mountpaths
	Cfg *cmn.DiskConf
	RW  bool // true when reading or writing data
}

type Load

type Load uint8

per-dimension load level

const (
	Low Load = iota + 1
	Moderate
	High
	Critical // “extreme” in memsys terms
)

func CPU

func CPU() Load

func Dsk

func Dsk(mi *fs.Mountpath, cfg *cmn.DiskConf) Load

func Gor

func Gor(ngr int) Load

goroutines

func Mem

func Mem() Load

NOTE: may trigger free-to-OS

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL