extern

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 2, 2026 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package extern provides graph algorithms that operate directly on a Tier 2 (mmap-backed) csrfile.Reader without first materialising the CSR in memory. The algorithms are semi-external: vertex-sized auxiliary structures (visited bitsets, level frontiers) live in RAM while edge data is streamed from the mapped file.

Index

Examples

Constants

This section is empty.

Variables

View Source
var ErrInvalidInput = errors.New("extern: input option contains NaN or Inf")

ErrInvalidInput is returned by extern algorithms when their float options carry NaN or +/-Inf.

Functions

func BFS

func BFS(r *csrfile.Reader, src graph.NodeID, visit func(node graph.NodeID, depth int) bool) error

BFS performs breadth-first traversal of the graph captured by r, starting at src. visit is invoked for every reached node in non-decreasing depth order; returning false aborts the traversal.

The implementation keeps the visited bitset and per-level frontiers in RAM and streams adjacency directly from the mmap region. The current frontier is sorted before expansion so that edge reads stay sequential, maximising the benefit of any MADV_SEQUENTIAL hint configured on the reader.

Example

ExampleBFS runs a semi-external breadth-first traversal directly over an mmap-backed csrfile.Reader, reporting the depth at which each node is reached. Adjacency is streamed from the file; only the visited set and frontiers live in RAM.

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/FlavioCFOliveira/GoGraph/graph"
	"github.com/FlavioCFOliveira/GoGraph/graph/adjlist"
	"github.com/FlavioCFOliveira/GoGraph/graph/csr"
	"github.com/FlavioCFOliveira/GoGraph/search/extern"
	"github.com/FlavioCFOliveira/GoGraph/store/csrfile"
)

// writeDiamond builds the directed diamond 0->{1,2}->3, serialises it
// to a Tier 2 csrfile in a fresh temp directory, and returns an open
// mmap-backed reader, the mapper, and a cleanup func the caller must
// defer. The reader streams adjacency from the mapped file; extern's
// algorithms never materialise the CSR in memory.
func writeDiamond() (*csrfile.Reader, *graph.Mapper[int], func()) {
	a := adjlist.New[int, struct{}](adjlist.Config{Directed: true})
	for _, e := range [][2]int{{0, 1}, {0, 2}, {1, 3}, {2, 3}} {
		_ = a.AddEdge(e[0], e[1], struct{}{})
	}
	c := csr.BuildFromAdjList(a)

	dir, err := os.MkdirTemp("", "extern-example-")
	if err != nil {
		panic(err)
	}
	path := filepath.Join(dir, "diamond.csr")
	if _, err := csrfile.WriteToFile(path, c); err != nil {
		_ = os.RemoveAll(dir)
		panic(err)
	}
	r, err := csrfile.Open(path)
	if err != nil {
		_ = os.RemoveAll(dir)
		panic(err)
	}
	cleanup := func() {
		_ = r.Close()
		_ = os.RemoveAll(dir)
	}
	return r, a.Mapper(), cleanup
}

func main() {
	r, m, cleanup := writeDiamond()
	defer cleanup()

	src, _ := m.Lookup(0)
	depth := map[int]int{}
	if err := extern.BFS(r, src, func(node graph.NodeID, d int) bool {
		v, _ := m.Resolve(node)
		depth[v] = d
		return true // keep traversing
	}); err != nil {
		fmt.Println("error:", err)
		return
	}

	for v := 0; v < 4; v++ {
		fmt.Printf("node %d at depth %d\n", v, depth[v])
	}
}
Output:
node 0 at depth 0
node 1 at depth 1
node 2 at depth 1
node 3 at depth 2

func BFSCtx

func BFSCtx(ctx context.Context, r *csrfile.Reader, src graph.NodeID, visit func(node graph.NodeID, depth int) bool) error

BFSCtx is the context-aware variant of BFS. ctx.Err() is checked once per BFS level; on cancellation returns the wrapped ctx.Err.

func PageRank

func PageRank(r *csrfile.Reader, opts PageRankOptions) (ranks []float64, iterations int, err error)

PageRank runs the power-iteration form of PageRank over the graph captured by an mmap-backed csrfile.Reader. Rank arrays live in RAM (size = nVertices); adjacency is streamed sequentially from the file each iteration.

The returned slice is indexed by NodeID; only NodeIDs that participate in at least one edge (live nodes) carry non-zero rank. The sum over the slice equals 1.0 within numerical tolerance.

Algorithm. Mirrors the in-memory centrality.PageRank: at every step the mass currently held by dangling nodes (out-degree 0) is redistributed uniformly across all live nodes via baseShare = (1-d)/live + d * danglingMass / live, before mass from non-dangling sources is forwarded along outgoing edges. This guarantees total-mass conservation.

Concurrency: safe to invoke from any number of goroutines on a shared csrfile.Reader.

Example

ExamplePageRank runs PageRank over the same mmap-backed reader. The rank vector lives in RAM while edges stream from the file each iteration; the ranks sum to 1.0, and the sink (node 3, the only destination with no out-edges) carries more mass than the source.

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/FlavioCFOliveira/GoGraph/graph"
	"github.com/FlavioCFOliveira/GoGraph/graph/adjlist"
	"github.com/FlavioCFOliveira/GoGraph/graph/csr"
	"github.com/FlavioCFOliveira/GoGraph/search/extern"
	"github.com/FlavioCFOliveira/GoGraph/store/csrfile"
)

// writeDiamond builds the directed diamond 0->{1,2}->3, serialises it
// to a Tier 2 csrfile in a fresh temp directory, and returns an open
// mmap-backed reader, the mapper, and a cleanup func the caller must
// defer. The reader streams adjacency from the mapped file; extern's
// algorithms never materialise the CSR in memory.
func writeDiamond() (*csrfile.Reader, *graph.Mapper[int], func()) {
	a := adjlist.New[int, struct{}](adjlist.Config{Directed: true})
	for _, e := range [][2]int{{0, 1}, {0, 2}, {1, 3}, {2, 3}} {
		_ = a.AddEdge(e[0], e[1], struct{}{})
	}
	c := csr.BuildFromAdjList(a)

	dir, err := os.MkdirTemp("", "extern-example-")
	if err != nil {
		panic(err)
	}
	path := filepath.Join(dir, "diamond.csr")
	if _, err := csrfile.WriteToFile(path, c); err != nil {
		_ = os.RemoveAll(dir)
		panic(err)
	}
	r, err := csrfile.Open(path)
	if err != nil {
		_ = os.RemoveAll(dir)
		panic(err)
	}
	cleanup := func() {
		_ = r.Close()
		_ = os.RemoveAll(dir)
	}
	return r, a.Mapper(), cleanup
}

func main() {
	r, m, cleanup := writeDiamond()
	defer cleanup()

	ranks, _, err := extern.PageRank(r, extern.DefaultPageRankOptions())
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	var total float64
	for _, x := range ranks {
		total += x
	}
	src, _ := m.Lookup(0)
	sink, _ := m.Lookup(3)

	fmt.Printf("ranks sum to 1.0: %v\n", total > 0.999 && total < 1.001)
	fmt.Printf("sink outranks source: %v\n", ranks[sink] > ranks[src])
}
Output:
ranks sum to 1.0: true
sink outranks source: true

func PageRankCtx

func PageRankCtx(ctx context.Context, r *csrfile.Reader, opts PageRankOptions) (ranks []float64, iterations int, err error)

PageRankCtx is the context-aware variant of PageRank. ctx.Err() is checked at every iteration boundary; on cancellation returns (nil, 0, wrapped ctx.Err()).

Types

type PageRankOptions

type PageRankOptions struct {
	// Damping is the random-jump probability (typical 0.85).
	Damping float64
	// MaxIterations bounds the iteration count.
	MaxIterations int
	// Tolerance is the convergence threshold on the L1 norm of the
	// rank delta.
	Tolerance float64
}

PageRankOptions configures PageRank.

func DefaultPageRankOptions

func DefaultPageRankOptions() PageRankOptions

DefaultPageRankOptions returns the classic Brin-Page configuration.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL