shapegen

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 2, 2026 License: MIT Imports: 27 Imported by: 0

Documentation

Overview

Package shapegen defines a uniform contract for graph-shape generators used across property-based tests, golden corpora, and benchmarks in GoGraph.

A Shape value exposes the canonical Go construction of a particular graph topology over the LPG and adjlist backends. It reports its parameter knobs so that property-based tests can sweep them with pgregory.net/rapid and so that goldens can pin specific configurations. The Build method is fully parameterised by an adjlist.Config and by knob values supplied to the Shape implementation through its own constructor; no Shape consults hidden process-level state.

Concurrency

Shape implementations must be safe for concurrent Build calls. The Build method must be a pure function of its inputs and the supplied adjlist.Config: it must not consult, mutate, or close over any package-level state. The shapegen registry itself is safe for concurrent Register/Lookup and may be queried from any number of goroutines.

Determinism

Build calls must be reproducible from their parameters and the seed implied by knob values. Shapes that internally need pseudo-random choices derive them from the knob vector or from caller-supplied material; they must never consult global random sources directly.

Registry

shapegen maintains a typed, process-local registry keyed by Shape Name(). It is the only piece of mutable package-level state in the package, guarded by a sync.RWMutex. Tests that register transient shapes must clean up via t.Cleanup to keep the registry tidy.

Index

Constants

View Source
const (
	AdvKeyInt    = "adv.int"
	AdvKeyFloat  = "adv.float"
	AdvKeyString = "adv.string"
	AdvKeyBytes  = "adv.bytes"
	AdvKeyTime   = "adv.time"
	AdvKeyBool   = "adv.bool"
)

Property-key constants used by ApplyAdversarialProps so the same call site rewrites the same key on every invocation, preserving idempotency.

Variables

View Source
var ErrCycleTooSmall = errors.New("shapegen: cycle requires n >= 3 (undirected) or n >= 1 (directed)")

ErrCycleTooSmall is returned by Cycle.Build when n is below the minimum required by the catalogue: n >= 3 for undirected cycles and n >= 1 for directed cycles. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrEdgeCountTooHigh = errors.New("shapegen: edge count exceeds C(n, 2)")

ErrEdgeCountTooHigh is returned by ErdosRenyiNM.Build when the requested edge count m exceeds C(n, 2), the number of unordered pairs available in a simple graph on n nodes. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrGraphalyticsChecksumMismatch = errors.New("shapegen: Graphalytics archive checksum mismatch")

ErrGraphalyticsChecksumMismatch is returned when the cached or fetched archive does not match the registered SHA-256 or MD5.

View Source
var ErrGraphalyticsOffline = errors.New("shapegen: Graphalytics fetch failed; check network connectivity")

ErrGraphalyticsOffline wraps any other transport-layer failure of the HTTPS fetch (DNS, dial timeout, TLS, non-2xx response other than 409, body truncation). Callers running outside a network- enabled environment should errors.Is against it and skip the test.

View Source
var ErrGraphalyticsStaging = errors.New("shapegen: Graphalytics archive is offline (request staging via SURF API)")

ErrGraphalyticsStaging is returned when the SURF repository responds with HTTP 409 (the canonical "File is offline" status of an unstaged file). The wrapped message contains the JSON "stage" endpoint so the caller can request staging out of band.

View Source
var ErrGraphalyticsUnknownAlgorithm = errors.New("shapegen: unknown Graphalytics algorithm")

ErrGraphalyticsUnknownAlgorithm is returned by LoadGraphalyticsReference for a benchmark name that is not in the canonical algorithm set.

View Source
var ErrGraphalyticsUnknownDataset = errors.New("shapegen: unknown Graphalytics dataset")

ErrGraphalyticsUnknownDataset is returned by LoadGraphalytics for a name that is not registered.

View Source
var ErrLFRAssignmentFailed = errors.New("shapegen: LFR node-to-community assignment exhausted every community without placement")

ErrLFRAssignmentFailed is returned by LFR.Build when the greedy node-to-community assignment exhausts every community without placing a node — a condition reachable only when every realised community is at exact capacity and the unplaced node's intra- degree ceiling exceeds the per-community admission budget. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrOddDegreeSum = errors.New("shapegen: degree sequence has an odd sum")

ErrOddDegreeSum is returned by ConfigurationModel.Build when the input degree sequence has an odd total. A pairing of an odd number of half-edges is impossible, so the constructor cannot produce a graph realising the supplied sequence. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrRegularConstruction = errors.New("shapegen: random d-regular construction exhausted the retry budget")

ErrRegularConstruction is returned by RandomRegular.Build when the rejection-resampling pairing loop fails to produce a simple d-regular graph within the per-call retry budget (100 attempts). Repeated failure is overwhelmingly rare at the catalogue's (n, d) range — the asymptotic acceptance probability is bounded below by a positive constant under Bollobás's analysis — but a few pathological (n, d) tuples (notably d == n - 1 with n small) can exhaust the budget. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrSBMAsymmetric = errors.New("shapegen: SBM probPercent matrix is not symmetric")

ErrSBMAsymmetric is returned by SBM.Build when probPercent[i][j] disagrees with probPercent[j][i] for some i != j. The SBM is defined on undirected edges, so the inter-block probability must be symmetric; an asymmetric matrix would imply a directed model the catalogue does not realise here. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrSBMBlockMismatch = errors.New("shapegen: SBM blockSizes length disagrees with probPercent outer length")

ErrSBMBlockMismatch is returned by SBM.Build when the length of blockSizes does not match the outer length of probPercent. The block vector and the probability matrix must agree on the number of blocks — without that agreement the per-pair probability lookup is ill-defined. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrSBMNonSquare = errors.New("shapegen: SBM probPercent matrix is not square")

ErrSBMNonSquare is returned by SBM.Build when any row of probPercent has a length different from the number of blocks. The probability matrix must be square (k-by-k for k blocks); a ragged matrix has no consistent inter-block probability for at least one (i, j) pair. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrSBMProbOutOfRange = errors.New("shapegen: SBM probPercent entry outside [0, 100]")

ErrSBMProbOutOfRange is returned by SBM.Build when any probPercent[i][j] entry is outside the inclusive [0, 100] window. The catalogue encodes probabilities as integer percents in the closed window [0, 100]; values outside this window have no well-defined meaning under the model. Callers can errors.Is against this sentinel without unwrapping.

View Source
var ErrSNAPChecksumMismatch = errors.New("shapegen: SNAP archive SHA-256 mismatch")

ErrSNAPChecksumMismatch is returned when the cached or freshly fetched archive does not match the registered SHA-256.

View Source
var ErrSNAPOffline = errors.New("shapegen: SNAP fetch failed; check network connectivity")

ErrSNAPOffline wraps every transport-layer failure of the HTTPS fetch (DNS, dial timeout, TLS, non-2xx response, body truncation). Callers running outside a network-enabled environment should errors.Is against it and skip the test.

View Source
var ErrSNAPUnknownDataset = errors.New("shapegen: unknown SNAP dataset")

ErrSNAPUnknownDataset is returned by LoadSNAP for a name that is not registered in SNAPDatasets.

View Source
var GraphalyticsAlgorithms = []string{"BFS", "CDLP", "LCC", "PR", "SSSP", "WCC"}

GraphalyticsAlgorithms enumerates the six canonical Graphalytics reference algorithms. The string values match the LDBC file-name suffix convention so the loader can resolve a reference output by appending "-{alg}" to the dataset name.

BFS : breadth-first search distances. CDLP : community detection via label propagation. LCC : local clustering coefficient. PR : PageRank. SSSP : single-source shortest paths. WCC : weakly connected components.

View Source
var GraphalyticsDatasets = map[string]GraphalyticsDataset{
	"cit-Patents": {
		Name:   "cit-Patents",
		URL:    "https://repository.surfsara.nl/datasets/cwi/graphalytics/files/graphalytics-graph-data-sets/cit-Patents.tar.zst",
		MD5:    "3dc5bdb0e40f56ca7486140900fc1d0f",
		SHA256: "",
		Nodes:  3_774_768,
		Edges:  16_518_947,
	},
	"dota-league": {
		Name:   "dota-league",
		URL:    "https://repository.surfsara.nl/datasets/cwi/graphalytics/files/graphalytics-graph-data-sets/dota-league.tar.zst",
		MD5:    "bae15106fadb84d577b5824655fc27f0",
		SHA256: "",
		Nodes:  61_170,
		Edges:  50_870_313,
	},
	"kgs": {
		Name:   "kgs",
		URL:    "https://repository.surfsara.nl/datasets/cwi/graphalytics/files/graphalytics-graph-data-sets/kgs.tar.zst",
		MD5:    "f6c52a1b68b836410a4c6a5df821e611",
		SHA256: "",
		Nodes:  832_247,
		Edges:  17_891_698,
	},
}

GraphalyticsDatasets is the registry of supported Graphalytics datasets keyed by the canonical dataset name. The three entries pinned here are the smallest reference graphs declared by AC #1 of task #524: cit-Patents, dota-league and kgs.

TODO(#524): promote MD5 to SHA-256 once the SURF mirror has re-staged the dataset entries. The MD5 values are taken from the SURF API at retrieval time; SHA-256 will be filled by the first successful soak-layer run that completes a clean fetch+verify loop and emits the realised digest to stderr.

View Source
var SNAPDatasets = map[string]SNAPDataset{
	"web-Google": {
		Name:   "web-Google",
		URL:    "https://snap.stanford.edu/data/web-Google.txt.gz",
		SHA256: "bcac0af0471d749f4a8c010bca92b61cf2868a0570741de06892fc062f265ea6",
		Nodes:  875_713,
		Edges:  5_105_039,
	},
	"soc-LiveJournal1": {
		Name:   "soc-LiveJournal1",
		URL:    "https://snap.stanford.edu/data/soc-LiveJournal1.txt.gz",
		SHA256: "d7bcd5a87b88c896c35fdb9611e804c3f4033c39b58c4c9ea3ba53c680d516d8",
		Nodes:  4_847_571,
		Edges:  68_993_773,
	},
	"cit-HepPh": {
		Name:   "cit-HepPh",
		URL:    "https://snap.stanford.edu/data/cit-HepPh.txt.gz",
		SHA256: "917e77b3344aed33fd2d849443c9512b7c528b9dc87251d4245fb3777bbe4128",
		Nodes:  34_546,
		Edges:  421_578,
	},
}

SNAPDatasets is the registry of supported SNAP datasets keyed by the canonical basename of the archive.

Functions

func AdversarialBools

func AdversarialBools() []bool

AdversarialBools returns both boolean values, in canonical (false, true) order.

func AdversarialBytes

func AdversarialBytes() [][]byte

AdversarialBytes returns the canonical []byte adversarial soup: the empty slice, a 1 KB all-zero payload, a 1 MB all-zero payload, and a short slice of selected single-byte values (boundary cases for varint and length-prefixed codecs).

func AdversarialFloatWeights

func AdversarialFloatWeights() []float64

AdversarialFloatWeights returns the canonical float64 adversarial soup, including both infinities, both signs of zero, the largest and smallest representable finite magnitudes, and a quiet NaN. Callers comparing NaN entries must use math.IsNaN rather than ==.

func AdversarialIntWeights

func AdversarialIntWeights() []int64

AdversarialIntWeights returns the canonical int64 adversarial soup: signed extremes, signed extremes ± 1, ± 1, 0, ± 2^62, and 0 once (so the corpus is non-empty when callers index by id % len).

func AdversarialStrings

func AdversarialStrings() []string

AdversarialStrings returns the canonical string adversarial soup: empty string, a short ASCII control, NUL-embedded bytes, all-NUL payloads, mixed-script Unicode, the NFC and NFD forms of "é", a BMP emoji, a triple of mathematical-bold capitals (astral plane code points), and a 1 MB ASCII filler string. The NFC and NFD entries occupy adjacent positions so callers indexing by (id % len) cover both forms.

func AdversarialTimes

func AdversarialTimes() []time.Time

AdversarialTimes returns the canonical time.Time adversarial soup: year 1, the Unix epoch, the Y2038 boundary, and the last representable second of year 9999.

func ApplyAdversarialProps

func ApplyAdversarialProps[N comparable, W any](g *lpg.Graph[N, W], mix Mix)

ApplyAdversarialProps decorates every node and edge of g with a deterministic selection of adversarial properties chosen by mix. Calling this function a second time with the same (g, mix) is a no-op (idempotent): every (key, value) pair is recomputed deterministically from the node id and the edge iteration index, so re-writing yields the same shard contents.

Node bucket: AdversarialXxx()[id % len(corpus)]. Edge bucket: AdversarialXxx()[(srcID*7 + edgeIdx) % len(corpus)], where edgeIdx walks the in-iteration order of adjlist.AdjList.Neighbours. Multiplying by 7 mixes the source id into the edge bucket so adjacent edges of the same source do not all collide on the same corpus entry.

func CitHepPh

func CitHepPh(cacheDir string) (*lpg.Graph[int64, struct{}], error)

CitHepPh loads the cit-HepPh dataset.

func GenerateShardZeroKeys

func GenerateShardZeroKeys(n int) []string

GenerateShardZeroKeys returns n distinct strings, every one of which hashes to mapper shard 0 under the FNV-1a routing scheme pinned by graph.Mapper. The returned slice is deterministic across processes and platforms.

The first call materialises (or refreshes) the on-disk cache; later calls within the same process replay from memory in O(n).

n must be > 0; the helper panics on n <= 0 because the catalogue has no use case for an empty corpus.

func GraphalyticsDefaultCacheDir

func GraphalyticsDefaultCacheDir() string

GraphalyticsDefaultCacheDir returns the default cache directory. Resolution order:

  1. $GOGRAPH_GRAPHALYTICS_DIR (if set).
  2. $HOME/.cache/gograph-graphalytics.
  3. $TMPDIR/gograph-graphalytics (last-resort fallback).

func LoadGraphalytics

func LoadGraphalytics(name, cacheDir string) (*lpg.Graph[int64, struct{}], error)

LoadGraphalytics loads the dataset registered under name, fetching the .tar.zst archive lazily, verifying its checksum, inflating it into cacheDir/{name}/, and parsing the .v + .e files into a *lpg.Graph[int64, struct{}] over a directed adjlist. cacheDir = "" resolves to GraphalyticsDefaultCacheDir.

Network failures wrap ErrGraphalyticsOffline or ErrGraphalyticsStaging so the caller can decide whether to retry, request staging, or skip.

func LoadGraphalyticsReference

func LoadGraphalyticsReference(name, alg, cacheDir string) (io.ReadCloser, error)

LoadGraphalyticsReference returns a handle to the reference output file of `alg` for the named dataset. The caller must close the returned reader. cacheDir = "" resolves to the default; the archive must already be extracted under the cache (run LoadGraphalytics(name, cacheDir) first).

func LoadSNAP

func LoadSNAP(name, cacheDir string) (*lpg.Graph[int64, struct{}], error)

LoadSNAP loads the dataset registered under name. cacheDir = "" resolves to SNAPDefaultCacheDir. The function is the general-purpose entry point; WebGoogle, SocLiveJournal1 and CitHepPh are convenience wrappers.

func MakeKnobValues

func MakeKnobValues(g *rapid.T, knobs []Knob) []int

MakeKnobValues draws one integer for each knob in knobs using the supplied rapid.T, in declaration order. Each draw uses the knob Name as its rapid label so that shrunk-and-reported counterexamples are self-describing.

The returned slice has the same length and order as knobs. MakeKnobValues panics if any Knob has Min > Max; this indicates a programmer error in the Shape declaring the knob.

func RMATPick

func RMATPick(r *rand.Rand, n uint64, a, ab, abc int) (src, dst uint64)

RMATPick performs one Graph500 R-MAT recursive descent on the n-by-n adjacency matrix using cumulative integer-percent thresholds (a, ab, abc) = (a, a+b, a+b+c). It returns the (src, dst) cell coordinates with 0 <= src, dst < n.

The function consumes exactly log2(n) PRNG draws — one per recursion level. Every draw is an math/rand/v2.Rand.IntN(100) call; comparing the result against the cumulative thresholds implements the Graph500 quadrant split in branch-free integer arithmetic.

The function is exported so github.com/FlavioCFOliveira/GoGraph/bench/rmat.Generate can share the exact same recursive-descent core without duplicating the (a, ab, abc) split logic. Outside the gograph module RMATPick is not reachable because shapegen is under internal/.

Pre-conditions:

  • n is a positive power of two; the loop terminates iff size > 1 becomes false, which holds only for n in {1, 2, 4, 8, ...}.
  • 0 <= a <= ab <= abc <= 100. The Graph500 convention treats draws >= abc as quadrant D.

func Register

func Register[N comparable, W any](s Shape[N, W]) error

Register installs s in the package-local registry under its Name() and the (N, W) specialisation it inhabits. It returns an error if a Shape with the same Name has already been registered for the same specialisation, or if s.Name() is empty. Registration of distinct specialisations under the same name is allowed and expected — for example, "nil" may exist for both (string, int64) and (uint64, float64).

func SNAPDefaultCacheDir

func SNAPDefaultCacheDir() string

SNAPDefaultCacheDir returns the cache directory the loaders use when the caller passes an empty cacheDir. The lookup order is:

  1. $GOGRAPH_SNAP_DIR (if set).
  2. $HOME/.cache/gograph-snap (POSIX-style XDG fallback).
  3. $TMPDIR/gograph-snap (last-resort fallback when no home is resolvable — practically only on misconfigured CI runners).

func SocLiveJournal1

func SocLiveJournal1(cacheDir string) (*lpg.Graph[int64, struct{}], error)

SocLiveJournal1 loads the soc-LiveJournal1 dataset.

func Unregister

func Unregister[N comparable, W any](name string)

Unregister removes the Shape registered under name for the (N, W) specialisation. It is a no-op when no such Shape exists. Tests that register transient Shapes should call Unregister from a t.Cleanup hook.

func WebGoogle

func WebGoogle(cacheDir string) (*lpg.Graph[int64, struct{}], error)

WebGoogle loads the web-Google dataset. Equivalent to LoadSNAP("web-Google", cacheDir).

Types

type GraphalyticsDataset

type GraphalyticsDataset struct {
	// Name is the dataset key, matching the `.v`/`.e` base name and
	// the archive's basename (without the .tar.zst suffix).
	Name string

	// URL is the canonical HTTPS download URL inside the SURF
	// Data Repository.
	URL string

	// SHA256 is the hex SHA-256 of the .tar.zst archive. Empty
	// when the dataset has not yet been staged + audited; the
	// loader falls back to MD5 in that case.
	SHA256 string

	// MD5 is the hex MD5 published by the SURF repository's API.
	// It is always set for registered datasets.
	MD5 string

	// Nodes is the LDBC-published vertex count.
	Nodes uint64

	// Edges is the LDBC-published edge count (directed if the
	// graph is directed; otherwise the count of undirected pairs).
	Edges uint64
}

GraphalyticsDataset captures every static fact about one LDBC Graphalytics dataset.

type Knob

type Knob struct {
	// Name uniquely identifies the knob within its Shape. It is used
	// as the label passed to rapid.Generator.Draw, so the same Shape
	// must never declare two knobs with the same Name.
	Name string

	// Min is the inclusive lower bound for the knob value.
	Min int

	// Max is the inclusive upper bound for the knob value. It must
	// satisfy Max >= Min.
	Max int

	// Default is the value used by goldens and benchmarks when no
	// property-based sweep is in effect. It must satisfy
	// Min <= Default <= Max.
	Default int
}

Knob describes one integer-valued tunable parameter exposed by a Shape. Property-based tests sweep knobs by drawing values uniformly from [Min, Max]; goldens and benchmarks pin Default.

Knobs are intentionally restricted to integers. Non-integer parameters (for example, an edge density expressed as a float) should be derived inside Shape.Build from an integer knob — for instance a percent-of-max knob with Min=0, Max=100.

type Mix

type Mix struct {
	Ints, Floats, Strings, Bytes, Times, Bools bool
}

Mix selects which adversarial property kinds ApplyAdversarialProps stamps on each node and edge.

func AllMix

func AllMix() Mix

AllMix returns a Mix with every kind enabled.

type NilShape

type NilShape struct{}

NilShape is the canonical placeholder Shape used to bootstrap the shapegen self-test. It declares no knobs and Build always returns a fresh empty *lpg.Graph[string, int64] configured by cfg. Real graph topologies (path, cycle, complete, etc.) land in later sprints and follow the same Shape contract.

func (NilShape) Build

func (NilShape) Build(cfg adjlist.Config) (*lpg.Graph[string, int64], error)

Build returns an empty LPG specialised on (string, int64).

func (NilShape) Knobs

func (NilShape) Knobs() []Knob

Knobs reports that NilShape has no tunable parameters.

func (NilShape) Name

func (NilShape) Name() string

Name returns the canonical placeholder name.

type SNAPDataset

type SNAPDataset struct {
	// Name is the SNAP archive basename (without the .txt.gz suffix)
	// and the key under which the dataset is registered.
	Name string

	// URL is the canonical HTTPS download URL.
	URL string

	// SHA256 is the lowercase hexadecimal SHA-256 of the gzip-
	// compressed archive — the *.txt.gz file, not its inflated
	// contents.
	SHA256 string

	// Nodes is the SNAP-published node count.
	Nodes uint64

	// Edges is the SNAP-published directed edge count (i.e. the
	// number of `(FromNodeId, ToNodeId)` pairs in the archive
	// excluding the header comments).
	Edges uint64
}

SNAPDataset captures every static fact about one dataset.

type Shape

type Shape[N comparable, W any] interface {
	Name() string
	Build(cfg adjlist.Config) (*lpg.Graph[N, W], error)
	Knobs() []Knob
}

Shape is the canonical contract every graph-shape generator implements. It is generic on the node type N and the edge weight type W so it can produce graphs compatible with any LPG specialisation in the module.

Name returns a stable, package-unique identifier — used as the registry key, as the prefix in golden filenames, and as a label in benchmark reports.

Build constructs a fresh *lpg.Graph[N, W] using cfg for the underlying adjacency-list backend. Build must be safe to call concurrently with itself and with calls to Build on any other Shape instance.

Knobs returns the set of tunable parameters exposed by the Shape. The slice ordering must be stable across calls; tests rely on it to map drawn knob values back to their semantic role.

func BalancedBinary

func BalancedBinary(depth int) Shape[int, int64]

BalancedBinary returns a Shape that builds the perfect (a.k.a. balanced) binary tree of depth d: the rooted tree in which every node at depth < d has exactly two children, and every leaf lives at depth d. Nodes are numbered in BFS order starting from the root (id 0); the children of node i are 2i+1 and 2i+2.

Catalogue invariants on the returned graph:

  • Order() == uint64(1 << (d + 1) - 1) — i.e., 2^(d+1) - 1 nodes.
  • Size() == Order() - 1 (every tree has N-1 edges).
  • The graph has zero cycles (verified in the test suite via a local DFS, not via the search.* package).

BalancedBinary declares a single knob "depth" over [0, 20]. The short layer caps depth at 20 (2^21 - 1 ≈ 2.1M nodes); soak/nightly callers may drive the constructor up to depth 22 (~ 8.4M nodes). The constructor panics when depth is negative or above 20: the catalogue does not define a balanced binary tree outside that range.

Edges are inserted in ascending child id order: for every node i in [1, N), the parent edge ((i-1)/2, i) is published. This enumeration mirrors the BFS numbering and lets the goldens stay reproducible across runs.

func BarabasiAlbert

func BarabasiAlbert(n int, m0 int, seed uint64) Shape[int, int64]

BarabasiAlbert returns a Shape that builds a Barabási-Albert preferential-attachment graph on n nodes seeded with a complete graph on the first m0 nodes. Each subsequent node i in [m0, n) is connected to m0 distinct existing nodes drawn with probability proportional to their current degree (Barabási & Albert, "Emergence of Scaling in Random Networks", Science 286(5439), 1999). The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, m0, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(m0*(m0-1)/2 + (n-m0)*m0). The first term counts the edges of the initial K_{m0} clique; the second counts the m0 edges contributed by every node added after the seed.
  • The graph is undirected and simple.
  • The degree distribution fits a power-law tail with exponent gamma in [2.5, 3.5] at n=10000, m0=3 over five independent seeds. The statistical test lives in the soak layer ([TestRandom_BarabasiAlbert_PowerLawExponent]); the regression method is described in its godoc.

BarabasiAlbert declares two knobs: "n" over [2, 100000] (default 50) and "m0" over [1, 50] (default 3). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by Layered. The constructor panics when m0 < 1, when n < m0, when n > 100000, or when m0 > 50; the catalogue does not define the model outside these bounds.

Sampling strategy: standard cumulative-sum preferential attachment. For each new node i in [m0, n) the generator

  1. builds a per-step rejection set of node IDs already chosen within the same step (so the m0 edges out of i are distinct);
  2. draws u uniformly from [0, totalDegree); the half-open range is the sum of current degrees over all existing nodes;
  3. recovers the target node via cumulative-degree binary search;
  4. retries the draw when the target is already in the rejection set, until m0 distinct targets have been accepted.

The cumulative-degree array is rebuilt once per step from the per-node degree slice, taking O(i) time at step i. The binary search inside the draw is O(log i). The total cost is therefore O(n*m0 + n^2) = O(n^2) for m0 fixed, well inside the catalogue's short-layer ceiling (the soak layer goes up to n=10000, which runs comfortably in seconds on a modern machine).

Rejection is bounded because at step i there are i existing nodes and only m0 must be drawn; with m0 <= 50 and i >= m0 the expected number of retries per accepted target stays below a small constant. The catalogue does not measure or pin a worst-case retry bound — the soak-layer power-law test is the empirical guard that the sampling did not degenerate.

After each new node i is fully wired, its m0 sampled targets are sorted ascending and inserted with i as the source. Because the graph is undirected, every AddEdge updates both endpoints' adjacency lists; the per-node degree slice is therefore incremented for both i and each target after the AddEdge call succeeds, so the cumulative-sum sampling at step i+1 sees the up-to-date totals.

func BuildDepDAG

func BuildDepDAG(depth, fanIn, fanOut int, seed uint64) Shape[int, int64]

BuildDepDAG returns a Shape that builds a synthetic build-graph DAG of the kind that appears in dependency-graph analysis. The graph has a single root (id 0) at layer 0; each subsequent layer has up to fanOut * |previous layer| nodes; each child has up to fanIn parents drawn uniformly without replacement from the previous layer.

The "up to" qualifiers reflect the deterministic randomised construction: at every step, the number of children spawned at layer i+1 is min(fanOut * |layer i|, maxLayerWidth), and every child draws min(fanIn, |layer i|) parents from layer i. The concrete layer widths therefore depend on (depth, fanIn, fanOut) rather than the seed, but the parent choices depend on the seed.

Catalogue invariants on the returned graph:

  • Order() >= 1 (always at least the root).
  • The graph is acyclic — every edge points from a node at layer i to a node at layer i+1.
  • The graph is connected from the root (every node has at least one parent in the previous layer, except the root).

BuildDepDAG declares three knobs: "depth" over [0, 20], "fanIn" over [1, 10], and "fanOut" over [1, 10]. The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the PruferTree convention. The constructor panics when depth is negative or above 20, when fanIn is below 1 or above 10, or when fanOut is below 1 or above 10.

The layer-width recurrence is bounded by buildDepDAGMaxWidth (1024) so a malicious (depth, fanOut) pair cannot cause an allocation explosion; once the recurrence hits the cap, every subsequent layer holds exactly buildDepDAGMaxWidth nodes.

Edges are inserted in this deterministic order: for every layer i from 1 to depth, for every child position c in ascending order, for every drawn parent position p in ascending order, the edge (parent, child) is inserted. The PRNG is a deterministically- seeded math/rand/v2.PCG and is consumed once per parent draw.

func Caterpillar

func Caterpillar(spine int, leafDist []int) Shape[int, int64]

Caterpillar returns a Shape that builds the caterpillar tree C(spine, leafDist): a spine path of length spine joined to leafDist[i] additional leaves at each spine node i. The "spine" is the path 0 -- 1 -- ... -- (spine-1); leaf nodes are appended after the spine, with node spine+offset being the j-th leaf of spine node i for the appropriate offset computed by the build.

Catalogue invariants on the returned graph:

  • Order() == uint64(spine + sum(leafDist)).
  • Size() == Order() - 1 (every tree has N-1 edges) provided Order() >= 1; equals 0 when spine == 0 and len(leafDist) == 0.

Caterpillar declares a single knob "spine" over [1, 1000]. The leafDist slice is variadic and not exposed as a numeric knob; the constructor panics when spine < 1, when len(leafDist) != spine, or when any leafDist[i] is negative or above 50.

Edges are inserted in this deterministic order:

  1. the spine-1 path edges (0, 1), (1, 2), ..., (spine-2, spine-1);
  2. for each spine node i in ascending order, the leafDist[i] leaf edges (i, leafStart), (i, leafStart+1), ..., where leafStart counts contiguously from spine across all spine nodes.

The first AddEdge error short-circuits every loop.

func Complete

func Complete(n int, directed bool) Shape[int, int64]

Complete returns a Shape that builds the complete graph K_n on nodes 0..n-1. When directed is true every ordered pair (i, j) with i != j is inserted (the canonical "tournament on K_n"); when directed is false every unordered pair {i, j} with i < j is inserted exactly once and the adjlist.AdjList mirrors the entry. Self-loops are never inserted: the catalogue defines K_n as a simple graph.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == uint64(n) * uint64(n-1) when directed; n*(n-1)/2 otherwise.
  • Diameter is 0 when n <= 1, 1 when n >= 2.
  • Triangle count (undirected, n >= 3): n*(n-1)*(n-2)/6 == C(n, 3).

Complete declares a single knob "n" over [0, 100_000]. Sweep callers must keep in mind that Size grows as n^2 and that n=100_000 produces ~10^10 edges; the property-based test caps n below this. The constructor panics when n < 0.

func CompleteBipartite

func CompleteBipartite(m, n int) Shape[int, int64]

CompleteBipartite returns a Shape that builds the complete bipartite graph K_{m,n}: m left nodes (ids 0..m-1) and n right nodes (ids m..m+n-1) with every left node connected to every right node. The result is undirected by definition; cfg.Directed is overridden to false and cfg.Multigraph to false.

Catalogue invariants on the returned graph:

  • Order() == uint64(m + n)
  • Size() == uint64(m) * uint64(n)
  • Degree of every left node == n.
  • Degree of every right node == m.
  • Diameter is 0 when m+n <= 1, 1 when min(m,n) == 0 and m+n >= 2, 2 otherwise — note however that the brief does not pin diameter for this shape, so the unit tests focus on Order, Size, and the bipartite degree sequence.

CompleteBipartite declares two knobs "m" and "n" over [0, 1000]. The bound is chosen so the worst-case edge count m*n stays below 1_000_000 in the short layer; soak/nightly callers can drive the underlying constructor directly past this cap. The constructor panics when m < 0 or n < 0.

func CompleteKAry

func CompleteKAry(k, depth int) Shape[int, int64]

CompleteKAry returns a Shape that builds the perfect complete k-ary tree of depth d: the rooted tree in which every node at depth < d has exactly k children and every leaf lives at depth d. Nodes are numbered in BFS order starting from the root (id 0); node i (i >= 1) has parent (i-1)/k for k >= 1.

Catalogue invariants on the returned graph:

  • Order() == 1 when k == 0 (a complete 0-ary tree is just the root).
  • Order() == uint64(d + 1) when k == 1 (a chain of d+1 nodes).
  • Order() == uint64((k^(d+1) - 1) / (k - 1)) when k >= 2.
  • Size() == Order() - 1 (every tree has N-1 edges).

CompleteKAry declares two knobs "k" over [1, 10] and "depth" over [0, 12]. The lower bound on k is 1 because the knob represents a sweep parameter for property-based tests; the special case k == 0 is admitted by the constructor (Order = 1, Size = 0) but is not part of the sweep. The upper bounds bound k^(d+1) at 10^13 in the worst case; property-based callers must cap their own draws below this to stay within the short layer's budget.

The constructor panics when k < 0, k > 10, depth < 0, or depth > 12.

Edges are inserted in ascending child id order: for every node i in [1, N), the parent edge ((i-1)/k, i) is published. This enumeration mirrors the BFS numbering and lets the goldens stay reproducible across runs.

func ConfigurationModel

func ConfigurationModel(degSeq []int, allowMulti bool, seed uint64) Shape[int, int64]

ConfigurationModel returns a Shape that builds a configuration-model graph on n = len(degSeq) nodes whose i-th vertex contributes degSeq[i] half-edges to the pairing. Half-edges are paired uniformly at random and consumed in adjacent pairs; the resulting pairings become undirected edges (Newman, Strogatz & Watts, "Random graphs with arbitrary degree distributions and their applications", Phys. Rev. E 64, 026118, 2001; Molloy & Reed, "A critical point for random graphs with a given degree sequence", Random Struct. Algor. 6(2-3), 1995).

When allowMulti is true the resulting graph is a multigraph: every pairing — including self-loops (pairings of two half-edges from the same node) and parallel edges (multiple pairings of the same unordered pair) — becomes an edge in g. cfg.Multigraph is forced to true; cfg.Directed is forced to false. The realised degree sequence equals degSeq exactly.

When allowMulti is false the generator realises the Erased Configuration Model: self-loop and parallel pairings are dropped at generation time. cfg.Multigraph is forced to false; cfg.Directed is forced to false. The realised degree sequence is at most degSeq componentwise — every dropped pairing reduces the realised degree of its endpoints by one each.

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (degSeq, allowMulti, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(len(degSeq)).
  • When allowMulti == true: Size() == sum(degSeq) / 2; per-node degree (with self-loops counted twice) equals degSeq exactly.
  • When allowMulti == false: Size() <= sum(degSeq) / 2; per-node degree is at most degSeq componentwise.
  • The graph is undirected.

When sum(degSeq) is odd the Build closure returns ErrOddDegreeSum wrapped with the offending sum; callers can errors.Is against the sentinel without unwrapping. The constructor itself never panics on parity violations, matching the Cycle / ErrCycleTooSmall and ErdosRenyiNM / ErrEdgeCountTooHigh conventions for runtime- surfaced parameter errors.

ConfigurationModel declares no knobs: degSeq is variadic and property-based tests draw it directly via rapid, mirroring the convention pinned by Multipartite in the classic family. The constructor panics when any degSeq[i] is negative; the catalogue does not define the model on negative-degree inputs.

The constructor takes a defensive copy of degSeq so subsequent caller mutations cannot affect Build — mirroring Multipartite's contract.

func Cycle

func Cycle(n int, directed bool) Shape[int, int64]

Cycle returns a Shape that builds the cycle graph C_n: nodes 0,1,...,n-1 with edges (i, (i+1) mod n) for 0 <= i < n. When directed is true every edge points from i to (i+1) mod n only; when directed is false the underlying adjlist.AdjList mirrors each insertion.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == uint64(n) when n is large enough to be valid.
  • Diameter is floor(n / 2).
  • Undirected: 2-regular for n >= 3.
  • Directed: every node has in-degree 1 and out-degree 1.

Cycle requires n >= 3 in the undirected case and n >= 1 in the directed case. When n is below the threshold Build returns ErrCycleTooSmall wrapped with the offending parameter so callers get a descriptive error while still being able to errors.Is the sentinel. The constructor itself never panics on small n; this matches the brief's explicit contract for Cycle.

Special directed cases preserved by this implementation:

  • Cycle(1, true) builds the single self-loop graph (one edge 0 -> 0). Multigraph is set to false because the catalogue treats this as a simple graph with a single self-loop.
  • Cycle(2, true) builds the directed digon (two anti-parallel edges 0 -> 1 and 1 -> 0). Multigraph stays false because these two edges are not parallel — they have distinct sources.

The constructor panics when n < 0 because the catalogue does not define a cycle with a negative number of nodes.

func Diamond

func Diamond(k int) Shape[int, int64]

Diamond returns a Shape that builds the diamond DAG of parameter k: a source node, a sink node, and two parallel chains of length k connecting them. Concretely, the graph has 2k + 2 nodes when k >= 1 (a source at id 0, a sink at id 2k+1, and two internal chains occupying ids 1..k and k+1..2k respectively) and is the degenerate single-edge graph 0 -> 1 when k == 0.

For k >= 1 the two chains start at the source and converge on the sink:

chain A: 0 -> 1 -> 2 -> ... -> k -> 2k+1
chain B: 0 -> k+1 -> k+2 -> ... -> 2k -> 2k+1

Catalogue invariants on the returned graph:

  • Order() == 2 when k == 0; uint64(2*k + 2) when k >= 1.
  • Size() == 1 when k == 0; uint64(2 * (k + 1)) when k >= 1.
  • The graph is acyclic.
  • For k >= 1, there are exactly two directed paths of length k + 1 (edge count) from source 0 to sink 2k+1: one along chain A and one along chain B. AC #3 phrases this as "exactly 2 paths of length k" using the "k internal chain links" count; either phrasing identifies the same pair of paths.

The k == 0 special case is documented explicitly: Diamond(0) degenerates to the single-edge graph 0 -> 1, with no internal chains. AC #3 admits this case verbatim ("for k >= 1") so the path-count assertion is skipped at k == 0.

Diamond declares a single knob "k" over [0, 1000]. The constructor panics when k is negative or above 1000.

Edges are inserted in this deterministic order:

  1. when k == 0: the single edge (0, 1).
  2. when k >= 1: a. the k chain-A internal edges (0, 1), (1, 2), ..., (k-1, k); b. the chain-A sink edge (k, 2k+1); c. the k chain-B internal edges (0, k+1), (k+1, k+2), ..., (2k-1, 2k); d. the chain-B sink edge (2k, 2k+1).

The first AddEdge error short-circuits every loop.

func Dodecahedral

func Dodecahedral() Shape[int, int64]

Dodecahedral returns a Shape that builds the dodecahedron graph: the canonical 20-vertex, 30-edge, 3-regular planar graph that is the 1-skeleton of the regular dodecahedron polytope. It has girth 5 and chromatic number 3 (Bondy & Murty, Appendix A.4).

Catalogue invariants on the returned graph:

  • Order() == 20.
  • Size() == 30.
  • Every vertex has degree exactly 3 (3-regular).
  • Planar (1-skeleton of a polytope): documented in the godoc and AC, not asserted at runtime — the brief explicitly forbids implementing a planarity check.

Dodecahedral declares no knobs.

func DoubleStar

func DoubleStar(k1, k2 int) Shape[int, int64]

DoubleStar returns a Shape that builds the double-star graph D_{k1,k2}: two centre nodes (ids 0 and 1) joined by a single edge, with k1 leaves attached to centre 0 (ids 2..k1+1) and k2 leaves attached to centre 1 (ids k1+2..k1+k2+1).

DoubleStar is undirected: this is the conventional catalogue definition and yields the cleanest degree-sequence invariant. The build overrides cfg.Directed=false and cfg.Multigraph=false.

Catalogue invariants on the returned graph:

  • Order() == uint64(2 + k1 + k2)
  • Size() == uint64(k1 + k2 + 1)
  • Degree of centre 0 == k1 + 1.
  • Degree of centre 1 == k2 + 1.
  • Degree of every leaf == 1.

DoubleStar declares two knobs "k1" and "k2" over [0, 50_000]; property-based tests should cap their draws below this to stay within the short layer's time budget. The constructor panics when either k1 or k2 is negative.

func EmptyGraph

func EmptyGraph() Shape[int, int64]

EmptyGraph returns a Shape that builds the empty graph E0: zero nodes, zero edges. The graph is directed. EmptyGraph declares no knobs.

Catalogue invariants on the returned graph:

  • Order() == 0
  • Size() == 0

The graph respects cfg.MaxShardCapacity but always sets Directed=true: an empty directed graph is the canonical E0 across the catalogue, and any reverse-edge mirroring would be vacuous regardless.

func ErdosRenyiNM

func ErdosRenyiNM(n, m int, seed uint64) Shape[int, int64]

ErdosRenyiNM returns a Shape that builds an Erdős-Rényi G(n, m) random graph: nodes 0..n-1 with exactly m edges drawn uniformly without replacement from the C(n, 2) unordered pairs. The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, m, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(m) when m <= C(n, 2).
  • The graph is undirected and simple — every emitted edge is a distinct unordered pair (i, j) with i < j.

When m > C(n, 2) the Build closure returns ErrEdgeCountTooHigh alongside a nil graph; callers can errors.Is against the sentinel without unwrapping. The constructor itself never panics on a high m, matching the Cycle / ErrCycleTooSmall convention for runtime-surfaced parameter errors.

ErdosRenyiNM declares two knobs: "n" over [0, 1000] (default 50) and "m" over [0, 499_500] (default 50). The upper bound 499_500 is C(1000, 2), the maximum number of edges admissible by the n knob's upper bound. The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the Layered convention. The constructor panics when n is negative or above 1000, or when m is negative or above 499_500.

Sampling strategy: Floyd's combination sampling algorithm (Bentley & Floyd, "A Sample of Brilliance", CACM 30(9), 1987), which draws m unique integers from [0, C(n, 2)) in O(m) time and O(m) space using a single PCG draw per accepted element. The resulting integers are interpreted as flat indices into the lexicographically-ordered list of unordered pairs (i, j) with i < j; the inverse mapping recovers (i, j) from the flat index in O(1) via the closed-form pairIndexToIJ helper. After sampling, the flat indices are sorted ascending so edges are inserted in lexicographic (i, j) order, pinning the golden bytes regardless of the draw permutation.

Floyd's algorithm is chosen over reservoir sampling because the catalogue's typical use is m << C(n, 2); Floyd's O(m) cost beats reservoir's O(C(n, 2)) by orders of magnitude in that regime, without sacrificing uniformity (every m-subset is equally likely by induction on m).

func ErdosRenyiNP

func ErdosRenyiNP(n int, pPercent int, seed uint64) Shape[int, int64]

ErdosRenyiNP returns a Shape that builds an Erdős-Rényi G(n, p) random graph: nodes 0..n-1 with every unordered pair {i, j} connected independently by an edge with probability p = pPercent/100. The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

The pPercent knob follows the percent-of-max convention pinned by Layered (T58.8): an integer in [0, 100] that maps to the Bernoulli success probability via a single IntN(100) draw per pair. The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, pPercent, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • 0 <= Size() <= uint64(n * (n - 1) / 2).
  • The graph is undirected and simple.
  • When pPercent == 0 the graph has no edges; when pPercent == 100 every unordered pair is connected and the graph coincides with the undirected complete graph K_n.
  • For large enough n, the expected edge count C(n, 2) * p falls within the +/-3 sigma window of a Binomial(C(n, 2), p) distribution — pinned by TestRandom_ErdosRenyiNP_ExpectedEdgeCount at (n=200, pPercent=10, 100 seeds).

ErdosRenyiNP declares two knobs: "n" over [0, 1000] (default 50) and "p" over [0, 100] (default 10). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the Layered convention. The constructor panics when n is negative or above 1000, or when pPercent is negative or above 100.

Edges are inserted in lexicographic (i, j) order with i < j: for every i in [0, n), for every j in [i + 1, n), the candidate edge (i, j) is drawn from a single IntN(100); the edge is inserted iff the draw is strictly less than pPercent. The PRNG is consumed exactly once per unordered pair, regardless of the outcome, so the seed-to-output map is a pure function of the (n, pPercent, seed) tuple.

func GoldnerHarary

func GoldnerHarary() Shape[int, int64]

GoldnerHarary returns a Shape that builds the Goldner-Harary graph: the canonical 11-vertex, 27-edge simplicial 3-tree built by stellating K_4 four times then stellating three of the resulting faces. By construction it is maximal planar (3n - 6 = 27 edges on n = 11 vertices) and a 3-tree (every triangle bounds a face of the triangulation).

The Goldner-Harary graph is historically significant as a small non-Hamiltonian simplicial polytope (Goldner & Harary, 1975).

Catalogue invariants on the returned graph:

  • Order() == 11.
  • Size() == 27.
  • The graph is a simplicial 3-tree: there exists an elimination ordering where each removed vertex's neighbourhood at the time of removal is a clique of size 3 (verified in the test via a deterministic elimination scan).
  • Maximal planar (3n - 6 = 27): documented in the godoc and AC, not asserted at runtime per the brief.

GoldnerHarary declares no knobs.

func Grid

func Grid(m, n int, eightNeighbour bool) Shape[int, int64]

Grid returns a Shape that builds the m-by-n grid graph L_{m,n}. Nodes are arranged in row-major order: cell (r, c) at row r in [0, m) and column c in [0, n) has id r*n + c. When eightNeighbour is false the graph is the von Neumann (4-neighbour) lattice: each cell is connected only to its orthogonally adjacent cells. When eightNeighbour is true the graph is the Moore (8-neighbour) lattice: each cell is additionally connected to its four diagonal neighbours.

The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false. Empty rows/columns are allowed: when m == 0 or n == 0 the returned graph has zero nodes and zero edges. A 1x1 grid has one node and zero edges in either neighbourhood scheme.

Catalogue invariants for the 4-neighbour variant:

  • Order() == uint64(m * n)
  • Size() == uint64((m-1)*n + m*(n-1)) for m, n >= 1; 0 otherwise.
  • Diameter == (m-1) + (n-1) for m, n >= 1.

Catalogue invariants for the 8-neighbour variant (m, n >= 1):

  • Order() == uint64(m * n)
  • Size() == uint64(4*m*n - 3*(m+n) + 2) for m, n >= 2; the 4-neighbour size for a degenerate 1xk or kx1 strip (no diagonals possible).

Grid declares two knobs "m" and "n" over [0, 1000]. The upper bound is chosen so the worst-case node count m*n stays at 10^6 in the short layer; soak/nightly callers can drive the constructor directly. The constructor panics when m < 0 or n < 0.

Edges are inserted in ascending source order. Per cell (r, c) the 4-neighbour variant emits at most the right neighbour (r, c+1) and the down neighbour (r+1, c); the 8-neighbour variant additionally emits the down-right (r+1, c+1) and down-left (r+1, c-1) diagonals. This enumeration covers each undirected edge exactly once.

func Hypercube

func Hypercube(d int) Shape[int, int64]

Hypercube returns a Shape that builds the d-dimensional hypercube graph Q_d: nodes are the integers 0..2^d-1 interpreted as bit strings, with an undirected edge between any two nodes whose labels differ in exactly one bit. The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Catalogue invariants on the returned graph:

  • Order() == uint64(1 << d).
  • Size() == uint64(d) * uint64(1<<(d-1)) for d >= 1; 0 for d == 0.
  • Every node has degree exactly d (d-regular).
  • Diameter == d for d >= 1; 0 for d == 0.

Hypercube declares a single knob "d" over [0, 24]. The upper bound reflects the brief's worst-case knob: d=24 yields 2^24 = 16_777_216 nodes and 24 * 2^23 ≈ 2 * 10^8 edges, which is the nightly-layer ceiling. The short layer is responsible for capping its own draws (d ≤ 12 by convention; see the property-based test below). The constructor panics when d is negative or above 24 because the catalogue does not define Q_d outside that range.

Edges are inserted in ascending (u, v) order: for every node u in [0, 2^d), and for every bit position i in [0, d), if u has bit i unset, the edge (u, u | (1<<i)) is inserted. This enumerates each undirected edge exactly once.

func IsolatedOnly

func IsolatedOnly(n int) Shape[int, int64]

IsolatedOnly returns a Shape that builds the graph on n isolated nodes: nodes 0..n-1 with no edges. When n is zero the result is observationally indistinguishable from EmptyGraph; the catalogue still treats this as a distinct shape because the construction path (explicit AddNode calls) exercises a different code path than the no-op build of EmptyGraph.

n is the parameter set at construction time; the knob "n" exposes the bounded sweep [0, 1000] to property-based tests, with a default of 5 (a small but non-trivial count that any catalogue inspection can comfortably enumerate). The constructor panics when n < 0 because the catalogue does not define a graph with a negative number of nodes.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == 0
  • HasEdge(i, j) == false for every pair (i, j).

The build forces Directed=true (the catalogue's canonical orientation for this family) and Multigraph=false. cfg.MaxShardCapacity is preserved.

AddNode cannot fail in the current adjlist contract (it only interns the user value with the Mapper, never touches shard slots). Errors are returned verbatim anyway, so the closure stays branch-free.

func Kneser

func Kneser(n, k int) Shape[int, int64]

Kneser returns a Shape that builds the Kneser graph K(n, k): the graph whose vertices are the C(n, k) k-subsets of {0, 1, ..., n-1} and whose edges connect every pair of disjoint k-subsets.

Node ids are assigned in lexicographic order of subsets: subset {0,1,...,k-1} is node 0, the next subset in lexicographic order is node 1, and so on through the C(n, k)-th subset. Each vertex carries a node label of the form "{i,j,...}" (comma-separated elements in ascending order, wrapped in braces) for hand inspection against the literature (West, §1.1.7).

Catalogue invariants on the returned graph:

  • Order() == C(n, k).
  • Every vertex has degree C(n - k, k).
  • Size() == Order() * C(n - k, k) / 2.
  • Kneser's theorem (Lovász, 1978) gives χ(K(n, k)) == n - 2k + 2 when n ≥ 2k; for n < 2k the graph has no edges and χ = 1 (single colour suffices). The chromatic number is documented in the godoc; the test does not exercise a solver per the brief.
  • K(5, 2) is the Petersen graph (semantic equivalence asserted in TestSpecials_Petersen_KneserEquivalence).

Kneser declares two knobs: "n" over [1, kneserMaxN] (default 5) and "k" over [0, kneserMaxK] (default 2). The constructor panics when n is outside [1, kneserMaxN], when k is outside [0, n], or when k is above kneserMaxK. The Name() reports "specials.kneser" (parameter-free, per the brief: the knobs carry n and k).

Edges are inserted in ascending (u, v) lexicographic order of the (subset, subset) pair; combined with the lexicographic node-id assignment this fully pins the golden bytes.

func LFR

func LFR(n, gammaPercent, betaPercent, avgDeg, maxDeg, minCom, maxCom, muPercent int, seed uint64) Shape[int, int64]

LFR returns a Shape that builds a simplified LFR community benchmark on n nodes. Node degrees are drawn from a truncated power-law with exponent gamma := gammaPercent / 100 on [max(1, avgDeg/2), maxDeg]; community sizes are drawn from a truncated power-law with exponent beta := betaPercent / 100 on [minCom, maxCom] until the cumulative size reaches n. Every node is assigned to a community by a greedy degree-descending admission pass; the per-node stub split places round((1 - muPercent/100) * deg) stubs into the intra-community pool and the remainder into the global inter-community pool. Edges are realised by Fisher-Yates pairing of the stub pools under the Erased Configuration Model (self-loops and parallel edges are dropped at generation time).

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, gammaPercent, betaPercent, avgDeg, maxDeg, minCom, maxCom, muPercent, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph

  • Order() == uint64(n).
  • The graph is undirected and simple (no parallel edges, no self-loops).
  • Every node carries a "community_id" lpg.Int64Value property whose value is the zero-based index of the community that produced it.
  • At n = 5000 over 5 seeds the empirical degree distribution fits a power-law tail with exponent gamma_hat within +/- 5% of gammaPercent / 100, and the empirical community-size distribution fits a power-law with exponent beta_hat within +/- 5% of betaPercent / 100, both measured by CCDF log-log least-squares regression on the [kMin, max] tail (k_min = 5 for degrees, s_min = minCom for sizes). The statistical test lives in the soak layer; the regression method mirrors [fitPowerLawExponentCCDF] from the Barabási-Albert family.
  • The aggregate inter-edge fraction across communities at n = 5000 over 5 seeds lies within +/- 5 percentage points of muPercent / 100. The statistical test lives in the soak layer.

Parameters and validation

LFR declares eight knobs: "n" over [50, 50000] (default 1000), "gamma" over [200, 350] (default 300), "beta" over [100, 200] (default 150), "avgDeg" over [2, 50] (default 10), "maxDeg" over [5, 200] (default 50), "minCom" over [5, 100] (default 10), "maxCom" over [10, 1000] (default 50), and "mu" over [0, 100] (default 30). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by every other random-family generator. The constructor panics when:

  • n < 50 or n > 50000;
  • gammaPercent < 200 or gammaPercent > 350;
  • betaPercent < 100 or betaPercent > 200;
  • avgDeg < 2 or avgDeg > 50;
  • maxDeg < 5 or maxDeg > 200;
  • maxDeg < avgDeg (the truncated power-law tail must include the mean);
  • minCom < 5 or minCom > 100;
  • maxCom < 10 or maxCom > 1000;
  • maxCom < minCom;
  • muPercent < 0 or muPercent > 100.

func Ladder

func Ladder(n int) Shape[int, int64]

Ladder returns a Shape that builds the ladder graph L_n: two paths P_n joined rung-by-rung. The "left rail" is the path 0 -- 1 -- ... -- (n-1); the "right rail" is the path n -- (n+1) -- ... -- (2n-1); the rungs connect node i with node n+i for i in [0, n).

The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Catalogue invariants on the returned graph (n >= 1):

  • Order() == uint64(2 * n).
  • Size() == uint64(3*n - 2) for n >= 1 (2*(n-1) rail edges plus n rungs; collapses to 0 for n == 1 because there are no rail edges but there is one rung).

Note: for n == 1 the "two paths" each degenerate to a single node and the ladder coincides with K_2 (Order=2, Size=1).

Ladder declares a single knob "n" over [1, 1000]. The constructor panics when n < 1.

Edges are inserted in ascending source order: first the n-1 left-rail edges (0,1), (1,2), ..., (n-2, n-1); then the n-1 right-rail edges (n, n+1), ..., (2n-2, 2n-1); then the n rungs (0, n), (1, n+1), ..., (n-1, 2n-1).

func Layered

func Layered(L, w, density int, seed uint64) Shape[int, int64]

Layered returns a Shape that builds a layered DAG of L layers, each holding w nodes, with edges from every node in layer i to every node in layer i+1 drawn independently from a Bernoulli distribution with success probability density/100. The PRNG is a deterministically-seeded math/rand/v2.PCG so every (L, w, density, seed) tuple produces the same byte-for-byte adjacency.

Nodes are numbered in layer-major order: layer 0 holds ids 0..w-1, layer 1 holds ids w..2w-1, and so on. The pair (layer i, position j) maps to node id i*w + j.

Catalogue invariants on the returned graph:

  • Order() == uint64(L * w).
  • 0 <= Size() <= uint64((L - 1) * w * w).
  • The graph is acyclic (edges only go forward through layers).
  • When density == 0 the graph has no edges; when density == 100 every possible inter-layer edge is present and the graph has exactly (L - 1) * w * w edges.

Layered declares four knobs: "L" over [1, 100], "w" over [1, 100], and "density" over [0, 100] (interpreted as a percentage and divided by 100 inside Build). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the PruferTree convention.

The constructor panics when L is below 1 or above 100, when w is below 1 or above 100, or when density is below 0 or above 100.

Edges are inserted in this deterministic order: for every layer i from 0 to L-2, for every source position j from 0 to w-1, for every destination position k from 0 to w-1, the candidate edge (i*w + j, (i+1)*w + k) is drawn from the Bernoulli source. The PCG draw consumes one IntN(100) per candidate, regardless of the outcome, so the seed-to-output map is a pure function of the (L, w, density, seed) tuple.

The L parameter is intentionally upper-case because the brief pins it as the catalogue identifier for the layer count and the godoc / golden filenames reuse it verbatim; a lint waiver covers this single deviation from the project's lowercase-parameter convention.

func LengauerTarjanExample

func LengauerTarjanExample() Shape[int, int64]

LengauerTarjanExample returns a Shape that builds the canonical flowgraph from Lengauer & Tarjan, "A Fast Algorithm for Finding Dominators in a Flowgraph", TOPLAS 1(1), 1979. The graph has 13 nodes with user-keys 1..13 and literature labels R, A, B, C, D, E, F, G, H, I, J, K, L attached via SetNodeLabel. It carries 21 directed edges in the insertion order pinned by [lengauerTarjanEdges].

IMPORTANT: this generator is intentionally cyclic. It is included in the DAGs file because its purpose — dominator-tree verification — sits in the same algorithmic family, but it is the only entry in the family whose underlying graph is NOT a DAG. The dominator-tree pinning is encoded in the test TestDAGs_LengauerTarjanExample_DominatorTree, which computes immediate dominators from scratch via iterative dataflow (Cooper-Harvey-Kennedy) and compares against the table:

idom(A) = R, idom(B) = R, idom(C) = R, idom(D) = R,
idom(E) = R, idom(F) = C, idom(G) = C, idom(H) = R,
idom(I) = R, idom(J) = G, idom(K) = R, idom(L) = D.
(R has no idom.)

LengauerTarjanExample declares no knobs (the graph is fully specified by [lengauerTarjanEdges]).

Catalogue invariants on the returned graph:

  • Order() == 13.
  • Size() == 21.
  • The graph is directed and cyclic (the user-amended AC #5 skip-list comment cites this).

func Lobster

func Lobster(depths []int) Shape[int, int64]

Lobster returns a Shape that builds the lobster tree L(depths) per the disambiguated definition pinned in the package-level docstring: depths[i] is the height of the chain attached to spine node i. The spine has len(depths) nodes (ids 0..S-1); spine node i carries a single chain of length depths[i] rooted at it. depths[i]=0 means a bare spine node (no attachment); depths[i]=1 means one leaf; depths[i]=2 means leaf + leaf-of-leaf; etc.

Closed form: N = len(depths) + sum(depths), E = N - 1.

Example: Lobster([1,2,1]) builds a spine of 3, node 0 has one leaf, node 1 has one leaf and one grandleaf, node 2 has one leaf — N = 3 + 4 = 7, E = 6.

Catalogue invariants on the returned graph:

  • Order() == uint64(len(depths) + sum(depths)).
  • Size() == Order() - 1 when Order() >= 1; 0 otherwise.

Lobster declares no knobs: the depths slice is variadic and the length-of-slice / per-entry bounds are checked at construction time. The constructor panics when len(depths) is below 1 or above 50, or when any depths[i] is negative or above 5.

Edges are inserted in this deterministic order:

  1. the len(depths)-1 spine path edges (0, 1), (1, 2), ..., (S-2, S-1);
  2. for each spine node i in ascending order, the depths[i] chain edges (i, chainStart), (chainStart, chainStart+1), ..., (chainStart+depths[i]-2, chainStart+depths[i]-1), where chainStart counts contiguously from len(depths) across all spine nodes.

The first AddEdge error short-circuits every loop.

func Lookup

func Lookup[N comparable, W any](name string) (Shape[N, W], bool)

Lookup retrieves a previously registered Shape by its name within the (N, W) specialisation inferred from the call site. The second return value reports whether a Shape was found.

func Mobius

func Mobius(n int) Shape[int, int64]

Mobius returns a Shape that builds the Möbius ladder M_n: the cycle C_{2n} with n additional "rungs" connecting every pair of antipodal nodes (i, i+n) for i in [0, n). The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Nodes are 0..2n-1. The cycle edges run (i, (i+1) mod 2n) for i in [0, 2n); the rung edges run (i, i+n) for i in [0, n).

Catalogue invariants on the returned graph (n >= 2):

  • Order() == uint64(2 * n).
  • Size() == uint64(3 * n) (2n cycle edges plus n rungs; 3-regular).
  • M_n is non-planar for n >= 3 (M_3 is K_{3,3}).

Mobius declares a single knob "n" over [2, 1000]. The lower bound is 2 because M_n requires at least 2 antipodal pairs to be distinct from a smaller cycle; the constructor panics when n < 2.

Edges are inserted in ascending source order: first the 2n cycle edges (0,1), (1,2), ..., (2n-2, 2n-1), (2n-1, 0); then the n rung edges (0, n), (1, n+1), ..., (n-1, 2n-1).

func MoserSpindle

func MoserSpindle() Shape[int, int64]

MoserSpindle returns a Shape that builds the Moser spindle: the canonical 7-vertex, 11-edge unit-distance graph with chromatic number 4 (Moser & Moser, 1961). It is the smallest 4-chromatic unit-distance graph in the plane and was used in the original proof that the chromatic number of the plane is at least 4.

Catalogue invariants on the returned graph:

  • Order() == 7.
  • Size() == 11.
  • Chromatic number == 4. The test asserts both halves:
  • χ ≤ 4 by exhibiting a hard-coded valid 4-colouring and checking no edge is monochromatic.
  • χ ≥ 4 is documented in the godoc and AC, not asserted at runtime per the brief (the test does not implement a 3-colouring solver).

MoserSpindle declares no knobs.

func Multipartite

func Multipartite(parts []int) Shape[int, int64]

Multipartite returns a Shape that builds the complete multipartite graph K_{parts[0], parts[1], ..., parts[k-1]}: a graph in which the node set is partitioned into k groups of sizes parts[0], parts[1], ..., parts[k-1] and every pair of nodes in distinct groups is joined by an edge (no intra-group edges).

The graph is undirected; cfg.Directed and cfg.Multigraph are overridden to false. Nodes are assigned ids in contiguous blocks: group 0 takes ids 0..parts[0]-1, group 1 takes ids parts[0]..parts[0]+parts[1]-1, and so on. Empty groups (parts[i] == 0) contribute zero nodes and zero edges; they are silently ignored. A nil or empty parts slice yields the empty graph.

Catalogue invariants on the returned graph:

  • Order() == uint64(sum(parts))
  • Size() == sum_{i<j} parts[i] * parts[j].

Multipartite declares no knobs: the parts slice is variadic and property-based tests draw it directly via rapid. The constructor panics when any parts[i] is negative.

func NegativeWeightAcyclic

func NegativeWeightAcyclic(n, signMix int, seed uint64) Shape[int, int64]

NegativeWeightAcyclic returns a Shape that builds an acyclic directed graph carrying signed edge weights — specifically, the transitive-tournament skeleton on n nodes with a fraction signMix of edges given negative weights. The graph is acyclic by construction (edges only go forward in the topological order 0 < 1 < ... < n-1), so Bellman-Ford and Johnson reweighting run safely despite the negative weights.

Edge weights are drawn uniformly from [-1000, 1000] excluding 0, with the sign drawn from a Bernoulli source with probability signMix/100 of being negative. The PRNG is a deterministically- seeded math/rand/v2.PCG so every (n, signMix, seed) tuple produces the same byte-for-byte output.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(n * (n - 1) / 2).
  • The graph is acyclic.
  • Every edge weight is in [-1000, -1] or [1, 1000].
  • When signMix == 0 every weight is positive; when signMix == 100 every weight is negative.

NegativeWeightAcyclic declares two knobs: "n" over [0, 100] and "signMix" over [0, 100] (interpreted as a percentage and divided by 100 inside Build). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the PruferTree convention. The constructor panics when n is negative or above 100, or when signMix is below 0 or above 100.

Edges are inserted in lexicographic (i, j) order with i < j; the PRNG is consumed twice per edge — once for the magnitude in [1, 1000], once for the sign Bernoulli draw — so the seed-to- output map is a pure function of the (n, signMix, seed) tuple.

func ParallelDigon

func ParallelDigon(k int) Shape[int, int64]

ParallelDigon returns a Shape that builds the multigraph M_k on two nodes: two nodes (0 and 1) joined by k parallel directed edges 0 → 1. Every edge carries the same weight [unweightedSentinel]; algorithms that care about edge multiplicity rather than weight (path counts, multigraph traversals) get a clean fixture.

k is the parameter set at construction time; the knob "k" exposes the bounded sweep [1, 1000] to property-based tests, with a default of 2 (the smallest k that distinguishes this shape from the simple K2). The constructor panics when k < 1 because the catalogue does not define ParallelDigon(0) — the zero-edge degenerate case is the responsibility of EmptyGraph or, on two nodes, of IsolatedOnly(2).

Catalogue invariants on the returned graph:

  • Order() == 2
  • Size() == uint64(k)
  • HasEdge(0,1) == true; HasEdge(1,0) == false.

The build forces Directed=true and Multigraph=true regardless of cfg, because the catalogue definition fixes both. cfg.MaxShardCapacity is preserved so the caller can still bound shard growth.

Errors from g.AddEdge are returned verbatim from the first failing iteration; the build aborts at that point and the partial graph is discarded by returning nil.

func Path

func Path(n int, directed bool) Shape[int, int64]

Path returns a Shape that builds the path graph P_n: nodes 0,1,...,n-1 with edges (i, i+1) for 0 <= i < n-1. When directed is true every edge points from i to i+1 only; when directed is false the underlying adjlist.AdjList mirrors each insertion.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == 0 when n <= 1; n-1 otherwise.
  • Diameter is n-1 when n >= 1.
  • Degree sequence (undirected, n >= 2): 1, 2, 2, ..., 2, 1.
  • Out-degree sequence (directed): 1, 1, ..., 1, 0.

Path declares a single knob "n" over [0, 100_000]; property-based tests should draw a smaller range to stay within the short layer budget. Multigraph is always set to false. Build returns adjlist.ErrShardFull verbatim when cfg.MaxShardCapacity refuses an AddEdge; in that case the partially built graph is discarded.

The constructor panics when n < 0 because the catalogue does not define a path with a negative number of nodes.

func PathDegenerate

func PathDegenerate(n int) Shape[int, int64]

PathDegenerate returns a Shape that builds the path-degenerate tree on n nodes — the canonical worst case for tree algorithms that assume balanced height (LCA without preprocessing, naive centroid decomposition, recursive DFS). It delegates to Path(n, false), the undirected path P_n. The resulting Shape's Name() therefore reports "classic.path"; tests that need a catalogue-stable name for the tree-family entry must inspect the declared knobs or the Order/Size of the built graph.

Catalogue invariants on the returned graph match those of Path(n, false):

  • Order() == uint64(n)
  • Size() == 0 when n <= 1; n - 1 otherwise.
  • The graph is undirected.

PathDegenerate exposes the same single knob "n" as Path, over [0, 100_000]. The constructor panics on n < 0 via the delegated Path guard.

PathDegenerate exists so that the trees-family catalogue surfaces the path-degenerate worst case as a first-class entry. The delegation keeps a single implementation of the path topology in the package; the equivalence is verified by TestTrees_PathDegenerate_DelegatesToClassic.

func Petersen

func Petersen() Shape[int, int64]

Petersen returns a Shape that builds the Petersen graph: the canonical 10-vertex, 15-edge, 3-regular graph with girth 5, chromatic number 3, and crossing number 2 (hence non-planar). It is the smallest 3-regular graph without a Hamiltonian path (only without a Hamiltonian cycle in the standard sense; it does have a Hamiltonian path), the smallest hypohamiltonian graph, and the unique 5-cage (Diestel, 5th ed., §1.8).

The Petersen graph is isomorphic to the Kneser graph K(5, 2): every vertex is a 2-subset of {0,1,2,3,4} and two vertices are adjacent iff their subsets are disjoint. The semantic equivalence with Kneser(5, 2) is asserted by TestSpecials_Petersen_KneserEquivalence (matching Order, Size, and the degree distribution rather than the byte-for-byte adjacency, because Kneser uses a lexicographic node-id assignment over k-subsets whereas Petersen uses the 0..9 layout pinned by [petersenEdges]).

Catalogue invariants on the returned graph:

  • Order() == 10.
  • Size() == 15.
  • Every vertex has degree exactly 3 (3-regular).
  • Girth == 5.
  • Chromatic number == 3 (a valid 3-colouring is asserted in the test; the lower bound χ ≥ 3 follows from the fact that Petersen contains odd cycles and is documented in the godoc, not asserted at runtime per the brief).
  • Non-planar: contains a K_{3,3} minor (Diestel, §4.4). The non-planarity is documented in the godoc and AC, not asserted at runtime — the brief explicitly forbids implementing a planarity check.

Petersen declares no knobs (the graph is fully specified by [petersenEdges]).

func PlantedPartition

func PlantedPartition(k, blockSize, pInPercent, pOutPercent int, seed uint64) Shape[int, int64]

PlantedPartition returns a Shape that builds the planted partition model (Condon & Karp, "Algorithms for graph partitioning on the planted partition model", Random Struct. Algor. 18(2), 2001): k equal-size blocks of blockSize nodes each, with intra-block edge probability pInPercent / 100 and inter-block edge probability pOutPercent / 100. The implementation delegates to SBM with a k-block uniform-size vector and a constant probability matrix (pIn on the diagonal, pOut off-diagonal).

PlantedPartition declares four knobs:

  • "k" over [1, 20] (default 4) — number of blocks.
  • "blockSize" over [0, 500] (default 25) — nodes per block.
  • "pIn" over [0, 100] (default 50) — intra-block edge probability, as integer percent.
  • "pOut" over [0, 100] (default 1) — inter-block edge probability, as integer percent.

The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by Layered, BarabasiAlbert, WattsStrogatz, ErdosRenyiNP, RMAT, RandomRegular, and RGG. The constructor panics when:

  • k < 1 or k > 20 (catalogue out-of-range);
  • blockSize < 0 or blockSize > 500 (catalogue out-of-range);
  • pInPercent < 0 or pInPercent > 100 (catalogue out-of-range);
  • pOutPercent < 0 or pOutPercent > 100 (catalogue out-of-range).

Catalogue invariants on the returned graph

  • Order() == uint64(k * blockSize).
  • The graph is undirected and simple.
  • Every node carries a "block_id" lpg.Int64Value property in [0, k).
  • Per-block expected intra-edge count: C(blockSize, 2) * pInPercent / 100.
  • Per-block-pair expected inter-edge count: blockSize * blockSize * pOutPercent / 100.

The soak layer pins statistical recoverability at the canonical recoverability knobs (k=4, blockSize=50, pIn=50, pOut=1) over 5 seeds. The recoverability check is applied to the *aggregated* pooled mean across the 5 seeds — that is, summed over every block for intra-edges and summed over every unordered block-pair for inter-edges, then divided by the seed count. The aggregated mean of the intra-edge total must be at least 80% of its expectation (k * C(blockSize, 2) * pIn / 100) and the aggregated mean of the inter-edge total must be at most 105% of its expectation (C(k, 2) * blockSize^2 * pOut / 100). The aggregated reading is one of the two choices the brief explicitly permits ("per-block-pair OR aggregated") and is the one that stays inside the 5% leakage budget at the canonical knobs without flaking: per-block-pair inter-edge ratios fluctuate up to ~1.20 over short seed windows, while the aggregated ratio remains tightly inside [0.97, 1.04] over every observed 5-seed window. See [TestRandom_PlantedPartition_Recoverability_Soak] in sbm_test.go.

func Prism

func Prism(n int) Shape[int, int64]

Prism returns a Shape that builds the prism graph Y_n: two cycles C_n joined rung-by-rung. The "inner" cycle is the cycle 0 -- 1 -- ... -- (n-1) -- 0; the "outer" cycle is the cycle n -- (n+1) -- ... -- (2n-1) -- n; the rungs connect node i with node n+i for i in [0, n).

The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Catalogue invariants on the returned graph (n >= 3):

  • Order() == uint64(2 * n).
  • Size() == uint64(3 * n) (2n cycle edges plus n rungs; 3-regular).

Prism declares a single knob "n" over [3, 1000]. The lower bound is 3 because C_n is only defined for n >= 3; the constructor panics when n < 3.

Edges are inserted in ascending source order: first the n inner cycle edges (0,1), ..., (n-1, 0); then the n outer cycle edges (n, n+1), ..., (2n-1, n); then the n rungs (0, n), (1, n+1), ..., (n-1, 2n-1).

func PruferTree

func PruferTree(n int, seed uint64) Shape[int, int64]

PruferTree returns a Shape that builds a uniformly random labelled tree on n nodes, decoded via Cayley's formula from a Prüfer sequence drawn from a deterministically seeded math/rand/v2.PCG source.

The Prüfer encoding (Heinz Prüfer, 1918) is a bijection between labelled trees on n nodes and integer sequences of length n-2 with entries in [0, n-1]. Decoding a uniform sequence therefore yields a uniform tree by Cayley's formula, and supports the AC#2 distribution test that 10_000 seeds for n == 10 produce uniformly-distributed edge histograms.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == 0 when n <= 1; uint64(n - 1) otherwise.

PruferTree declares a single knob "n" over [2, 5000]. Soak callers may drive the constructor up to n == 50_000 directly. The constructor panics when n is negative or above 5000: the catalogue does not define a Prüfer tree outside that range.

Edges are emitted in the order produced by the standard Prüfer decoding algorithm: every iteration pairs the smallest-indexed node with degree 1 with the head of the remaining sequence, and the final pair joins the last two remaining nodes. This is independent of the AddEdge insertion order from the perspective of the resulting graph, but it pins the golden bytes because the formatter sorts the output lexicographically.

func RGG

func RGG(n, radiusPercent, dim int, seed uint64) Shape[int, int64]

RGG returns a Shape that builds a random geometric graph on n nodes in the unit hyper-cube [0, 1]^dim. Each node i is assigned a uniformly-distributed point p_i in [0, 1]^dim; an undirected edge (i, j) with i < j is added whenever the Euclidean distance ||p_i - p_j||_2 is at most radius := radiusPercent / 100.0 (Gilbert, "Random plane networks", J. SIAM 9(4), 1961; Penrose, "Random Geometric Graphs", Oxford University Press, 2003). The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

Coordinate properties

Every node carries its sampled coordinates as lpg.Float64Value properties: "x" and "y" for every dim, plus "z" when dim == 3. These properties are the source of truth for the Euclidean heuristic used by A* over RGG fixtures; they are also reproducible from (n, radiusPercent, dim, seed) — the point-sampling phase consumes exactly dim PRNG Float64 draws per node, in node order.

Parameters and validation

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, radiusPercent, dim, seed) tuple yields the same byte-for-byte adjacency. The constructor panics when:

  • n < 0 or n > 1000 (catalogue out-of-range);
  • radiusPercent < 0 or radiusPercent > 100 (catalogue out-of- range);
  • dim < 2 or dim > 3 (catalogue out-of-range).

Catalogue invariants on the returned graph

  • Order() == uint64(n).
  • The graph is undirected and simple.
  • When radiusPercent == 0 the graph has zero edges: no pair of distinct points lies within distance 0 (the sampler draws from a continuous distribution; coincident points form a measure- zero event and the catalogue accepts the implicit assumption).
  • When radiusPercent reaches the unit-cube diameter (sqrt(dim) * 100), every pair is within radius and the graph is K_n. The knob ceiling is pinned at 100 by the brief, so the d=2 case at radius=1.0 is *approximately* K_n: the expected edge fraction is the probability that a uniform pair in the unit square lies within distance 1, which integrates to pi/3 - sqrt(3)/4 + 1 ~= 0.7965 of C(n, 2). The catalogue's AC #1 therefore pins the contract at the empirically-bounded floor 0.85 * C(n, 2) for n <= 20; see the dedicated test [TestRandom_RGG_NearComplete_AtFullRadius] for the rationale and the per-seed slack.
  • Edge count is monotone non-decreasing in radiusPercent: a pair within distance r is also within any r' >= r. The catalogue test [TestRandom_RGG_EdgeCountMonotoneInRadius] pins the contract over the grid {0, 25, 50, 75, 100} at fixed seed.
  • Every node carries an "x" and "y" property; nodes with dim == 3 additionally carry a "z" property. All three values are reproducible per (n, dim, seed) — the per-coordinate Float64 draw order is fixed at "x first for node 0, then y, then z, then x for node 1, ...".

RGG declares three knobs: "n" over [0, 1000] (default 50), "r" over [0, 100] (default 30), and "dim" over [2, 3] (default 2). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by Layered, BarabasiAlbert, WattsStrogatz, ErdosRenyiNP, RMAT, and RandomRegular.

func RMAT

func RMAT(scale, edgeFactor int, a, b, c, d, seed uint64) Shape[int, int64]

RMAT returns a Shape that builds a Graph500-style R-MAT graph with 2^scale nodes and edgeFactor * 2^scale edge-emission attempts. The per-emission recursive descent of the 2x2 (a, b, c, d) probability matrix is the canonical Chakrabarti/Zhan/Faloutsos placement (SDM 2004); the integer-percent encoding of (a, b, c, d) pins the quadrant decisions to a single IntN(100) draw per recursion level, giving a deterministic, allocation-free hot path.

Quadrant convention

At every recursion level the unit square (representing the current 2^k-by-2^k sub-matrix) splits into four equal quadrants:

+-----+-----+
|  A  |  B  |
+-----+-----+
|  C  |  D  |
+-----+-----+

where A is top-left (src in lower half, dst in lower half), B is top-right (src in lower half, dst in upper half), C is bottom-left (src in upper half, dst in lower half), and D is bottom-right (src in upper half, dst in upper half). A draw v ~ Uniform[0, 100) is classified as A iff v < a, B iff a <= v < a+b, C iff a+b <= v < a+b+c, D otherwise. This is the [Graph500] split; the previous bench/rmat implementation split A/B at ab/2 instead of at a, which is correct only under a == b and silently diverged for the canonical (57, 19, 19, 5) tuple.

Parameters and validation

The constructor panics when:

  • scale < 1 or scale > 30 (the knob range);
  • edgeFactor < 1 or edgeFactor > 64 (the knob range);
  • any of a, b, c, d exceeds 100;
  • a + b + c + d != 100.

The Graph500 reference values are (a, b, c, d) = (57, 19, 19, 5).

Catalogue invariants on the returned graph

  • Order() == uint64(1) << scale.
  • Size() <= uint64(edgeFactor) * (uint64(1) << scale). The upper bound is exact; the lower bound depends on the dedup rate of the (a, b, c, d) tuple, which under canonical Graph500 parameters concentrates a large share of draws on the A quadrant and therefore on the (0, 0) corner of the adjacency. Retention grows monotonically with scale (more destination cells dilute collisions) and decreases with edgeFactor (more draws into the same cells inflate dedup). The empirical retention floor is documented in the "Empirical retention" section below.
  • The graph is directed and may contain self-loops.
  • Determinism: same (scale, edgeFactor, a, b, c, d, seed) tuple yields a byte-identical *lpg.Graph[int, int64].

Empirical retention

The empirical floor across five independent seeds for the canonical (57, 19, 19, 5) tuple, measured against the upper bound m = edgeFactor * 2^scale, is:

scale  ef   m         min retention
    6   4    256          0.6797
    7   4    512          0.7402
    8   4   1024          0.8066
    9   4   2048          0.8496
   10   4   4096          0.8738
    8   8   2048          0.7349
    8  16   4096          0.6262
   10  16  16384          0.7334
   12   8  32768          0.8736

[TestRandom_RMAT_EdgeCountWithinTolerance] pins the >= 80 percent retention contract at the AC #1 configuration (scale = 8, edgeFactor = 4), where the empirical floor is 0.8066 across the three reference seeds; the 0.80 threshold is the published Graph500 contract under canonical parameter dedup tolerance and matches the documented behaviour at the [TestRandom_RMAT_Soak] (scale = 12, edgeFactor = 8) point where the empirical floor is 0.8736. Configurations outside this range (notably (scale = 8, edgeFactor = 16)) drift below the 0.80 floor and are not part of the pinned contract.

RMAT declares two knobs: "scale" over [1, 30] (default 10) and "edgeFactor" over [1, 64] (default 16). The (a, b, c, d, seed) arguments are supplied at construction time and are not exposed as knobs, mirroring the convention pinned by Layered and the [a, b, c, d] joint-sum constraint documented at the head of this file.

func RandomRegular

func RandomRegular(n, d int, seed uint64) Shape[int, int64]

RandomRegular returns a Shape that builds a uniformly-random d-regular graph on n nodes: nodes 0..n-1, every node with degree exactly d, drawn from the configuration-model pairing distribution conditioned on the absence of self-loops and parallel edges (Bollobás, "A probabilistic proof of an asymptotic formula for the number of labelled regular graphs", European J. Combin. 1(4), 1980). The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, d, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(n * d / 2).
  • Every node has undirected degree exactly d.
  • The graph is undirected and simple.
  • When d == 0 the graph has n isolated nodes and zero edges.

RandomRegular declares two knobs: "n" over [1, 1000] (default 20) and "d" over [0, 50] (default 3). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by Layered, BarabasiAlbert, WattsStrogatz, and ErdosRenyiNP. The constructor panics when:

  • n < 1 or n > 1000 (catalogue out-of-range);
  • d < 0 or d > 50 (catalogue out-of-range);
  • n * d is odd (a d-regular graph on n vertices requires the degree sum n * d to be even, by the handshake lemma);
  • d >= n (a simple graph cannot have a vertex of degree >= n).

Pairing strategy: the practical configuration-model construction with incremental rejection-by-swap (Newman, "The Structure and Function of Complex Networks", SIAM Review 45(2), 2003, section IV.A). Materialise n * d half-edges as a flat slice in which each node id i appears d times consecutively, shuffle the slice with Fisher-Yates, then walk adjacent pairs (half-edges 2k and 2k+1) left-to-right. Whenever a candidate pair is a self-loop or a duplicate of an earlier pair, search forward in the slice for a swap partner that produces a valid pair at the current position; if no such partner exists the attempt fails and the helper restarts from a fresh shuffle. The pure-restart variant of the pairing model (Bollobás 1980) has asymptotic acceptance probability exp(-(d^2 - 1) / 2), which collapses to ~5e-4 at d = 4 and 2e-8 at d = 6 — far below any reasonable retry budget. The rejection-by-swap variant recovers practical success: a single shuffle followed by an O(n*d^2) sweep is typically sufficient, so the retry budget here serves only as a safety net for the few pathological d-close-to-n corners.

The retry budget is capped at [randomRegularRetryBudget] = 100; on exhaustion Build returns ErrRegularConstruction wrapped with the offending (n, d) pair. The first successful attempt is materialised as a set of (min(u,v), max(u,v)) unordered pairs and sorted ascending; the edges are inserted into g in this lexicographic order so the golden bytes stay stable regardless of the underlying shuffle permutation.

func Rook

func Rook(n int) Shape[int, int64]

Rook returns a Shape that builds the n-by-n rook graph R_n: the n^2 squares of an n-by-n chessboard, with two squares connected when they share a row or a column (the set of squares a rook can reach in one move). Equivalently, R_n is the line graph of the complete bipartite graph K_{n,n}, or the Cartesian product K_n □ K_n.

Nodes are arranged in row-major order: square (r, c) at row r in [0, n) and column c in [0, n) has id r*n + c. The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Catalogue invariants on the returned graph (n >= 1):

  • Order() == uint64(n^2).
  • Size() == uint64(n^2 * (n-1)) (each node has degree 2(n-1); the n^2 nodes each contribute n-1 row neighbours and n-1 column neighbours, and we count each undirected edge once).
  • Diameter == 2 for n >= 2; 0 for n == 1.

Rook declares a single knob "n" over [1, 1000]. The constructor panics when n < 1: the catalogue does not define a zero-rook graph (it would coincide with the empty graph).

Edges are inserted in ascending source order. Per square (r, c) the constructor emits the row neighbours (r, c') for c' > c and the column neighbours (r', c) for r' > r. This enumerates each undirected edge exactly once.

func SBM

func SBM(blockSizes []int, probPercent [][]int, seed uint64) Shape[int, int64]

SBM returns a Shape that builds a stochastic block model graph (Holland, Laskey & Leinhardt, "Stochastic blockmodels: First steps", Social Networks 5(2), 1983; Karrer & Newman, "Stochastic blockmodels and community structure in networks", Phys. Rev. E 83, 016107, 2011). The node set is partitioned into k = len(blockSizes) blocks; block i contains blockSizes[i] nodes. Every unordered pair (u, v) with u in block A and v in block B receives an undirected edge independently with probability probPercent[A][B] / 100. Self-loops are forbidden (the i < j pair iteration enforces this) and the probability matrix must be symmetric (probPercent[i][j] == probPercent[j][i] for all i, j).

Nodes are assigned ids in contiguous blocks: block 0 takes ids 0..blockSizes[0]-1, block 1 takes ids blockSizes[0]..blockSizes[0]+blockSizes[1]-1, and so on. Empty blocks (blockSizes[i] == 0) contribute zero nodes and zero edges; they are silently skipped. A nil or empty blockSizes slice yields the empty graph (assuming probPercent is also empty — otherwise ErrSBMBlockMismatch surfaces).

Ground-truth community labels

Every node carries its zero-based block index as an lpg.Int64Value property under the key "block_id". Consumers running community- detection algorithms (Leiden, label propagation, spectral partitioning) read the labels back through lpg.Graph.GetNodeProperty to score recovery quality (NMI, ARI, accuracy) against the planted partition.

Parameters and validation

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (blockSizes, probPercent, seed) tuple yields the same byte-for-byte adjacency. The constructor panics when any blockSizes[i] is negative (the catalogue does not define the model on negative-size inputs); every other validation is deferred to Build because the failure modes are caller-provided data errors rather than catalogue out-of- range conditions. Build surfaces four distinct sentinels:

All four are checked in the order above; the first failure short-circuits the others. Each is wrapped with the offending coordinates so callers can format diagnostic messages without unwrapping.

Catalogue invariants on the returned graph

  • Order() == uint64(sum(blockSizes)).

  • The graph is undirected and simple (no parallel edges, no self-loops).

  • Every node carries a "block_id" lpg.Int64Value property whose value is the zero-based index of the block that produced it.

  • The number of edges is a random variable; the expected count of edges between block i and block j is:

    blockSizes[i] * blockSizes[j] * probPercent[i][j] / 100 (i != j) C(blockSizes[i], 2) * probPercent[i][i] / 100 (i == j)

    The soak layer pins the pooled grand mean of the block-edge counts — one per seed averaged across 100 seeds — against these expectations within +/- 3 * sigma / sqrt(N), the standard error of the sample mean. The pooled interpretation is required because per-seed pointwise +/- 3 sigma checks have a non-trivial false-positive rate over 100 independent draws (about 27%); pooling shifts the test from a per-seed concentration test to a concentration test on the average, which is the correct statistical reading of "matches the expectation across 100 seeds". See [TestRandom_SBM_BlockEdgeSums_Soak] in sbm_test.go.

SBM declares no knobs: the blockSizes and probPercent slices are variadic-style and property-based tests draw them directly via rapid, mirroring the convention pinned by Multipartite in the classic family and ConfigurationModel in the random family. The constructor takes a defensive copy of both inputs so subsequent caller mutations cannot affect Build.

func SingleEdge

func SingleEdge(directed, weighted, selfLoop bool) Shape[int, int64]

SingleEdge returns a Shape that builds K2 or — when selfLoop is true — the single self-loop graph. The three boolean flags select the variant:

  • directed: true → cfg.Directed=true; false → cfg.Directed=false.
  • weighted: true → edge weight is [weightedSentinel] (1); false → edge weight is [unweightedSentinel] (0).
  • selfLoop: true → one node with one (0→0) edge; the directed flag is ignored in this case because a single self-loop has the same topology in directed and undirected graphs.

Name is "trivial.k2" in the non-selfLoop case and "trivial.k1.selfloop" when selfLoop=true.

Catalogue invariants on the returned graph:

  • selfLoop=false: Order=2, Size=1, HasEdge(0,1)=true. When directed=false, HasEdge(1,0)=true as well; when directed=true, HasEdge(1,0)=false.
  • selfLoop=true: Order=1, Size=1, HasEdge(0,0)=true.

SingleEdge declares no Knobs: every combination of (directed, weighted, selfLoop) is a discrete shape rather than a numeric sweep, so the property-based test enumerates the eight-cell constructor matrix directly. Multigraph is always set to false because K2 and the self-loop graph are simple by definition.

The single g.AddEdge call inside Build can in principle return adjlist.ErrShardFull when the caller has set a tight cfg.MaxShardCapacity; that error is returned verbatim with no extra wrapping. With the canonical node ids 0 and 1 the error path is not reachable for any maxCap >= 1.

func SingleNode

func SingleNode() Shape[int, int64]

SingleNode returns a Shape that builds K1: one node, zero edges. The canonical node value is 0. SingleNode declares no knobs.

Catalogue invariants on the returned graph:

  • Order() == 1
  • Size() == 0

The underlying graph is directed; the lone node has no outgoing adjacency entry yet (it is interned only via the mapper).

The g.AddNode call cannot fail in the current adjlist contract — AddNode does not touch the shard slot array, so cfg.MaxShardCapacity has no observable effect on it. SingleNode therefore returns the error verbatim, with no wrapping, in the interest of leaving zero dead branches in the build closure.

func Spider

func Spider(legs, legLen int) Shape[int, int64]

Spider returns a Shape that builds the spider graph S(legs, legLen): a central node (id 0) joined to legs distinct paths each of length legLen. The j-th leg occupies node ids 1 + j*legLen .. (j+1)*legLen, with an edge from the centre to the first node of every leg and chain edges along each leg.

Catalogue invariants on the returned graph:

  • Order() == uint64(1 + legs * legLen).
  • Size() == Order() - 1 (every tree has N-1 edges).
  • The centre has degree legs; each leg-end leaf has degree 1; every interior leg node has degree 2.

Spider declares two knobs "legs" over [1, 100] and "legLen" over [1, 100]. The constructor panics when either is below 1 or above 100; the catalogue does not define a spider with zero or excessively long legs in the short layer.

Edges are inserted in this deterministic order: for each leg j in ascending order, first the centre-to-first-node edge (0, 1 + j*legLen), then the chain edges (1 + j*legLen, 2 + j*legLen), ..., ((j+1)*legLen - 1, (j+1)*legLen). The first AddEdge error short-circuits every loop.

func Star

func Star(n int, outgoing bool) Shape[int, int64]

Star returns a Shape that builds the star graph S_n: a centre node (id 0) and n-1 leaves (ids 1..n-1). When outgoing is true every edge is 0 -> v; when outgoing is false every edge is v -> 0.

The Star catalogue entry is always directed: the orientation of edges is the whole point of the shape, and the "outgoing" flag pins it. Multigraph is always set to false. cfg.Directed and cfg.Multigraph are overridden accordingly.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == 0 when n <= 1; n-1 otherwise.
  • When outgoing == true: out-degree(0)=n-1, out-degree(v)=0 for v != 0.
  • When outgoing == false: in-degree(0)=n-1, in-degree(v)=0 for v != 0.
  • Diameter: 0 when n <= 1, 1 when n == 2, 2 when n >= 3 (treating the graph as undirected for diameter purposes — the standard catalogue definition).

Star declares a single knob "n" over [0, 100_000]. The constructor panics when n < 0.

func Theta

func Theta(a, b, c int) Shape[int, int64]

Theta returns a Shape that builds the theta graph θ_{a,b,c}: two "pole" vertices joined by three internally vertex-disjoint paths of lengths a, b, c (lengths measured in edges). The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Node ids:

  • 0 is pole s.
  • 1 is pole t.
  • 2..a are the a-1 internal vertices of the first path (a edges, so a-1 internal vertices between s and t).
  • a+1..a+b-1 are the b-1 internal vertices of the second path.
  • a+b..a+b+c-2 are the c-1 internal vertices of the third path.

Total Order = 2 + (a-1) + (b-1) + (c-1) = a + b + c - 1. Total Size = a + b + c (each path contributes its length in edges; the paths share only s and t, which contribute zero duplicates because the internal vertices are disjoint).

Theta declares three knobs "a", "b", "c" over [1, 1000]. The lower bound is 1 because a length-1 path collapses to the single edge (s, t). When two or more of (a, b, c) are 1 the simple-graph regime collapses the duplicate (s, t) edges to one, which would violate the closed-form Size: the constructor therefore panics when more than one of (a, b, c) equals 1. The remaining degenerate case (exactly one length-1 path) is well-defined and produces a "lollipop with two arcs and one chord" graph.

Edges are inserted path by path in ascending source order:

  1. path A: (0, 2), (2, 3), ..., (a, 1) — a edges total.
  2. path B: (0, a+1), (a+1, a+2), ..., (a+b-1, 1)
  3. path C: (0, a+b), (a+b, a+b+1), ..., (a+b+c-2, 1)

When a path has length 1 its only edge is (0, 1) directly.

func Torus

func Torus(m, n int) Shape[int, int64]

Torus returns a Shape that builds the m-by-n torus graph T_{m,n}. It is the 4-neighbour grid with wrap-around: every cell (r, c) is joined to (r, (c+1) mod n), ((r+1) mod m, c), and so on. The graph is undirected and simple; cfg.Directed and cfg.Multigraph are overridden to false.

Catalogue invariants on the returned graph (m, n >= 1):

  • Order() == uint64(m * n).
  • Size() == 2 * uint64(m * n) when m, n >= 3 (the canonical case: every node has degree exactly 4).
  • For m == 2 (or n == 2) the row wrap-around coincides with the non-wrap row neighbour, so Size collapses to m*n + the column contribution. The unit tests pin the exact value via a closed-form helper; property-based assertions cap m, n at 3 to keep the closed form clean.
  • Diameter == floor(m/2) + floor(n/2) for m, n >= 1.

Torus declares two knobs "m" and "n" over [1, 1000] (1 is the smallest valid dimension; m=1, n=1 collapses to a single node with no edges in the simple-graph regime because every neighbour coincides with the cell itself). The constructor panics when m < 1 or n < 1.

Edges are inserted in ascending source order. Per cell (r, c) the constructor emits the right neighbour (r, (c+1) mod n) only when that target is not the cell itself (n >= 2); analogously for the down neighbour ((r+1) mod m, c). When m == 2 or n == 2 the wrap neighbour coincides with the non-wrap neighbour, so the simple graph collapses the duplicate to a single edge: HasEdge is idempotent and AddEdge on the second call is a no-op in the non-multigraph regime.

func TransitiveTournament

func TransitiveTournament(n int) Shape[int, int64]

TransitiveTournament returns a Shape that builds the transitive tournament on n nodes: the unique acyclic orientation of the complete graph K_n in which every pair (i, j) with i < j is joined by the single directed edge i -> j. Equivalently, it is the comparability graph of the total order 0 < 1 < ... < n-1.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(n * (n - 1) / 2).
  • The graph is acyclic (verified in the test suite via a local Kahn topological sort, not via the search.* package).
  • The unique topological order is the identity permutation 0, 1, ..., n-1.

TransitiveTournament declares a single knob "n" over [0, 200]. The short layer caps n at 200 because the edge count grows as O(n^2) — n=200 gives 19_900 edges, comfortably inside the short budget. The constructor panics when n is negative or above 200: the catalogue does not define a transitive tournament outside that range.

Edges are inserted in lexicographic order: (0,1), (0,2), ..., (0,n-1), (1,2), ..., (n-2,n-1). This is also the order in which the goldens enumerate them.

func UniversalSelfLoops

func UniversalSelfLoops(n int, weighted bool) Shape[int, int64]

UniversalSelfLoops returns a Shape that builds the "universal self-loops" graph on n nodes: n nodes 0..n-1 with every node carrying exactly one self-loop. When weighted is true every loop carries weight [weightedSentinel] (1); otherwise every loop carries weight [unweightedSentinel] (0). When n is zero the build is a no-op and the result coincides with EmptyGraph.

n is the parameter set at construction time; the knob "n" exposes the bounded sweep [0, 1000] to property-based tests, with a default of 4. The constructor panics when n < 0; see IsolatedOnly for the rationale.

Catalogue invariants on the returned graph:

  • Order() == uint64(n)
  • Size() == uint64(n)
  • HasEdge(v, v) == true for every v in 0..n-1.
  • HasEdge(u, v) == false for every u != v.

The build forces Directed=true and Multigraph=false; a self-loop is its own reverse so the directed flag has no observable effect, but fixing it removes a useless degree of freedom from the configuration matrix. cfg.MaxShardCapacity is preserved.

func WattsStrogatz

func WattsStrogatz(n, k, betaPercent int, seed uint64) Shape[int, int64]

WattsStrogatz returns a Shape that builds a Watts-Strogatz small-world graph on n nodes. The construction starts from a k-nearest-neighbour ring lattice — every node i is connected to k/2 left-neighbours and k/2 right-neighbours on the ring — and rewires each edge independently with probability betaPercent / 100 (Watts & Strogatz, "Collective dynamics of 'small-world' networks", Nature 393(6684), 1998). The graph is undirected and simple (no parallel edges, no self-loops); cfg.Directed and cfg.Multigraph are overridden to false.

The PRNG is a deterministically-seeded math/rand/v2.PCG, so every (n, k, betaPercent, seed) tuple yields the same byte-for-byte adjacency.

Catalogue invariants on the returned graph:

  • Order() == uint64(n).
  • Size() == uint64(n * k / 2). The rewiring preserves edge count exactly: each rewire replaces one edge with another, never adds or removes one.
  • The graph is undirected and simple.
  • When betaPercent == 0 the graph equals the k-nearest-neighbour ring lattice (byte-equal across runs at fixed seed).
  • When betaPercent == 100 the rewiring distribution is approximately uniform over the C(n, 2) candidate unordered pairs — Watts-Strogatz at beta=1 is approximately G(n, p), not strictly so: the ring-origin bias (every rewire starts from a ring-lattice edge and the rejection-sampling step forbids duplicates against the current neighbour set) leaves a small but measurable excess on pairs of small ring distance. The soak-layer chi-squared test [TestRandom_WattsStrogatz_UniformAtBeta100ChiSquared] pins the empirical guarantee at the alpha ~= 1e-100 tail so catastrophic regressions are caught without falsely rejecting the documented approximate-uniformity contract.
  • The global clustering coefficient decreases monotonically as betaPercent sweeps {0, 1, 10, 100}; the soak-layer test [TestRandom_WattsStrogatz_ClusteringDecreasesMonotonically] pins the catalogue contract.

WattsStrogatz declares three knobs: "n" over [4, 10000] (default 20), "k" over [2, 50] (default 4), and "beta" over [0, 100] (default 10). The "seed" parameter is supplied at construction time as a uint64 and is not exposed as a knob, mirroring the convention pinned by Layered and BarabasiAlbert. The constructor panics when:

  • n < 4 or n > 10000 (catalogue out-of-range);
  • k < 2 or k > 50 (catalogue out-of-range);
  • k is odd (the ring lattice requires k/2 left + k/2 right neighbours, which forces k to be even);
  • k >= n (every node would have to connect to itself or duplicate a neighbour);
  • betaPercent < 0 or betaPercent > 100.

Rewire algorithm: the canonical Watts & Strogatz (1998) procedure. For each shift distance s in [1, k/2], for each node i in [0, n), consider the edge (i, (i + s) mod n) from the original ring lattice. Draw u from IntN(100): if u < betaPercent, replace the endpoint j = (i + s) mod n with a uniformly drawn t != i that is not already a neighbour of i in the current adjacency. Otherwise keep the original edge. The PRNG is consumed exactly once per edge for the rewire decision, plus zero or more times per rewire to resolve a valid target — the latter is by rejection sampling (draw a candidate from IntN(n), retry on self-loop or duplicate against the current per-node neighbour set).

The rejection-sampling loop is bounded because at any point in the rewire the source node has at most k neighbours and n - 1 - k targets remain admissible. Under the constructor's k < n invariant the rejection probability per attempt is at most (k + 1) / n < 1; for the catalogue's (n, k) sweep this stays well below 0.1, so the expected number of attempts per accepted rewire is below 1.2. The catalogue does not pin a worst-case retry bound — the soak-layer uniformity test is the empirical guard that the sampling did not degenerate.

After the rewire phase, the per-node neighbour sets are walked in ascending node order; for each source u, the neighbour ids v with v > u are emitted in ascending order so the final adjacency is inserted into the lpg graph in lexicographic (u, v) order with u < v. This pins the golden bytes regardless of the rewire draw permutation.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL