load

package

v1.4.2 Latest Latest Go to latest Published: Jan 15, 2026 License: MIT Imports: 9 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/NVIDIA/aistore

Links

README ¶

AIStore clusters must remain resilient under varying resource pressure.

The operations — from EC encoding and LRU eviction to multi-object transformations and the get-batch (ML endpoint) — must adjust behavior at runtime based on current system load.

The load package provides a unified API for assessing the latter and generating an actionable throttling advice - load.Advice.

Overview

The package is built around several motivations:

Load assessment and throttling policy must be independent.
Callers choose which load dimensions matter: CPU, memory, goroutines, disk, FDs.
Sleep duration and sampling frequency adjust automatically.
Data I/O throttles aggressively; metadata operations throttle minimally.
Critical memory immediately triggers strong back-off.

The package provides one core abstraction called throttling advice (load.Advice) that includes:

How long to sleep, if at all
How frequently to check (sample interval / batch size)
What the highest load level is among monitored load dimensions

Each of the 5 load dimensions - memory, CPU, goroutine count, disk utilization, number of open descriptors - is independently graded:

type Load uint8

const (
    Low Load = iota + 1
    Moderate
    High
    Critical
)

Design

Five-Dimensional Load Vector

The system evaluates load across up to five dimensions:

FlMem --- memory pressure via memsys.PageMM()
FlCla --- CPU load averages
FlGor --- goroutine count
FlDsk --- disk utilization (per-mountpath or max)
FlFdt --- file descriptors (reserved for future use)

Internally this is represented as a packed uint64 (8 bits per dimension):

┌────────┬────────┬────────┬────────┬────────┐
│  Fdt   │  Gor   │  Mem   │  Cla   │  Dsk   │
│ (8bit) │ (8bit) │ (8bit) │ (8bit) │ (8bit) │
└────────┴────────┴────────┴────────┴────────┘
    ↓        ↓        ↓        ↓        ↓
   Low     High   Critical   Low    Moderate

Priority Ordering

When computing load.Advice, the following logic takes place:

Memory --- highest priority; Critical memory triggers immediate, aggressive throttling
Goroutines + CPU --- compute pressure
Disk utilization --- I/O saturation
Workload type (RW) --- metadata vs. data I/O

Workload Type: `Extra.RW`

Two classes of workloads behave differently:

Data I/O (`RW: true`)

Used by: - PUT/GET

EC encoding
Mirroring
Rebalance & tcb
Chunk transformation, GetBatch, etc.

Behavior: - Under High load: sleep 1--10ms, check every 32--512 ops

Under Critical load: sleep 10--100ms, check every 16 ops

Metadata-only (`RW: false`)

Currently used by subsystems/operations:

metadata-in-memory eviction
storage summary
space cleanup
LRU eviction

Usage

Basic Pattern

import "github.com/NVIDIA/aistore/cmn/load"

type myOp struct {
    adv load.Advice
    n   int64
}

func (op *myOp) init(mp *fs.Mountpath, config *cmn.Config) {
    op.adv.Init(
        load.FlMem | load.FlCla | load.FlDsk,
        &load.Extra{
            Mi:  mp,
            Cfg: &config.Disk,
            RW:  true,
        },
    )
}

func (op *myOp) run() {
    for ... {
        op.n++
        if op.adv.ShouldCheck(op.n) {
            op.adv.Refresh()
            if op.adv.Sleep > 0 {
                time.Sleep(op.adv.Sleep)
            }
        }
    }
}

Low-Level API

if load.Mem() == load.Critical {
    // abort or back off
}

cpu := load.CPU()
gor := load.Gor(runtime.NumGoroutine())
dsk := load.Dsk(mp, &config.Disk)

Throttling Constants

Note: These defaults may evolve in future releases or be overridden by configuration.

Sleep Durations

Condition	Sleep
Memory Critical	100ms
Memory High	10ms
Gor/CPU Critical (both)	10–100ms
Gor/CPU Critical (either)	1–10ms
Gor/CPU High	1ms
Disk Critical	10–100ms
Disk High	1ms

Batch Intervals

Level	Mask	Check Every
`maxBatch`	`0x1fff`	8192 ops
`dfltBatch`	`0x1ff`	512 ops
`smallBatch`	`0x1f`	32 ops
`minBatch`	`0xf`	16 ops

Future Development

FlFdt --- file-descriptor monitoring
Caching of load vector
Configurable thresholds

Documentation ¶

Overview ¶

Package load provides 5-dimensional node-pressure readings and per-dimension grading.

Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Package load provides 5-dimensional node-pressure readings and per-dimension grading.

Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Constants ¶

View Source

const (
	FlFdt = 1 << iota // file descriptors
	FlGor             // goroutines
	FlMem             // memory (memsys)
	FlCla             // CPU load averages
	FlDsk             // disk utilization

	FlAll = FlFdt | FlGor | FlMem | FlCla | FlDsk
)

load dimensions: bit flags

Variables ¶

View Source

var (
	Text = [...]string{"", "low", "moderate", "high", "critical"}
)

Functions ¶

func Init ¶

func Init()

Types ¶

type Advice ¶

type Advice struct {

	// runtime
	Sleep time.Duration // recommended sleep at check points
	Batch int64         // bitmask of the form 2^k - 1 (e.g., 0x1f => check memory, CPU, etc. load every 32 ops)
	Load  Load          // highest load level across requested _dimensions_
	// contains filtered or unexported fields
}

func (*Advice) ClaLoad ¶

func (a *Advice) ClaLoad() Load

func (*Advice) DskLoad ¶

func (a *Advice) DskLoad() Load

func (*Advice) Init ¶

func (a *Advice) Init(flags uint64, extra *Extra)

memory always included

func (*Advice) MemLoad ¶

func (a *Advice) MemLoad() Load

func (*Advice) Refresh ¶

func (a *Advice) Refresh()

recompute throttling recommendation; note: disk requires config and, optionally, mountpath

func (*Advice) ShouldCheck ¶

func (a *Advice) ShouldCheck(n int64) bool

return true when the caller should perform a throttle check after the N-th op.

func (*Advice) String ¶

func (a *Advice) String() string

type Extra ¶

type Extra struct {
	Mi  *fs.Mountpath // when nil return max util across mountpaths
	Cfg *cmn.DiskConf
	RW  bool // true when reading or writing data
}

type Load ¶

type Load uint8

per-dimension load level

const (
	Low Load = iota + 1
	Moderate
	High
	Critical // “extreme” in memsys terms
)

func CPU ¶

func CPU() Load

func Dsk ¶

func Dsk(mi *fs.Mountpath, cfg *cmn.DiskConf) Load

func Gor ¶

func Gor(ngr int) Load

goroutines

func Mem ¶

func Mem() Load

NOTE: may trigger free-to-OS

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL