materialization

package
v0.0.0-...-090f458
Warning

This package is not in the latest version of its module.

Published: Aug 7, 2025 License: MIT Imports: 16 Imported by: 0

README

Graph Materialization Service

Converts heterogeneous graphs (multiple node/edge types) into homogeneous graphs (single node type) using meta-path traversal. Perfect for feeding into community detection algorithms like Louvain, Leiden, or SCAR.

Quick Start

import "github.com/gilchrisn/graph-clustering-service/pkg/materialization"

// Convert SCAR-format input directly to edge list
err := materialization.SCARToMaterialization(
    "graph.txt",      // Input graph edges
    "properties.txt", // Node type assignments
    "path.txt",       // Meta-path specification
    "output.txt",     // Edge list output (trailing comma is required before a newline)
)

Input Formats

Option 1: SCAR Format (Simple)

Graph File (graph.txt) - Edge list:

0 1
1 2
2 0
3 4
4 5

Properties File (properties.txt) - Node types:

0 0
1 1
2 0
3 0
4 1
5 1

Path File (path.txt) - Meta-path as type sequence:

0
1
0
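
To make the three files above concrete, here is a minimal, standard-library-only sketch of what a meta-path traversal over them could look like. It is not the package's implementation: the names `pair`, `buildAdj`, `contains` and `countInstances` are hypothetical, and it assumes that edges are undirected and that a path may not revisit a node.

```go
package main

import "fmt"

// pair identifies a (start, end) node pair; hypothetical helper for this sketch.
type pair struct{ from, to int }

// buildAdj turns an edge list into an undirected adjacency map (an assumption).
func buildAdj(edges [][2]int) map[int][]int {
	adj := make(map[int][]int)
	for _, e := range edges {
		adj[e[0]] = append(adj[e[0]], e[1])
		adj[e[1]] = append(adj[e[1]], e[0])
	}
	return adj
}

// contains reports whether x already appears in the partial path.
func contains(path []int, x int) bool {
	for _, v := range path {
		if v == x {
			return true
		}
	}
	return false
}

// countInstances expands partial paths breadth-first and counts, per
// (start, end) pair, the cycle-free paths whose node types match metaPath.
func countInstances(adj map[int][]int, nodeType map[int]int, metaPath []int) map[pair]int {
	counts := make(map[pair]int)
	if len(metaPath) == 0 {
		return counts
	}
	var queue [][]int
	for n, t := range nodeType {
		if t == metaPath[0] {
			queue = append(queue, []int{n})
		}
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if len(cur) == len(metaPath) {
			counts[pair{cur[0], cur[len(cur)-1]}]++
			continue
		}
		for _, next := range adj[cur[len(cur)-1]] {
			// Skip neighbors of the wrong type and nodes already on the path.
			if nodeType[next] != metaPath[len(cur)] || contains(cur, next) {
				continue
			}
			ext := append(append([]int{}, cur...), next)
			queue = append(queue, ext)
		}
	}
	return counts
}

func main() {
	edges := [][2]int{{0, 1}, {1, 2}, {2, 0}, {3, 4}, {4, 5}} // graph.txt
	types := map[int]int{0: 0, 1: 1, 2: 0, 3: 0, 4: 1, 5: 1}  // properties.txt
	counts := countInstances(buildAdj(edges), types, []int{0, 1, 0})
	fmt.Println(counts[pair{0, 2}], counts[pair{2, 0}], len(counts)) // prints "1 1 2"
}
```

Under these assumptions only nodes 0 and 2 end up connected (through node 1); node 3 has no second type-0 node reachable through a type-1 neighbor.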

Option 2: JSON Format (Advanced)

Graph JSON:

{
  "nodes": {
    "alice": {"id": "alice", "type": "Author", "properties": {}},
    "paper1": {"id": "paper1", "type": "Paper", "properties": {}},
    "bob": {"id": "bob", "type": "Author", "properties": {}}
  },
  "edges": [
    {"from": "alice", "to": "paper1", "type": "writes", "weight": 1.0},
    {"from": "bob", "to": "paper1", "type": "writes", "weight": 1.0}
  ]
}

Meta-path JSON:

{
  "id": "author-collaboration",
  "node_sequence": ["Author", "Paper", "Author"],
  "edge_sequence": ["writes", "writes"]
}
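
The two JSON documents above can be sanity-checked before materialization. The sketch below uses only `encoding/json`; the struct types are hypothetical mirrors of the JSON shown here (the real types live in the module's `models` package and may differ), and the rule that `node_sequence` has exactly one more entry than `edge_sequence` is an assumption drawn from the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical mirrors of the JSON shapes shown above.
type node struct {
	ID         string                 `json:"id"`
	Type       string                 `json:"type"`
	Properties map[string]interface{} `json:"properties"`
}

type edgeRec struct {
	From   string  `json:"from"`
	To     string  `json:"to"`
	Type   string  `json:"type"`
	Weight float64 `json:"weight"`
}

type graph struct {
	Nodes map[string]node `json:"nodes"`
	Edges []edgeRec       `json:"edges"`
}

type metaPath struct {
	ID           string   `json:"id"`
	NodeSequence []string `json:"node_sequence"`
	EdgeSequence []string `json:"edge_sequence"`
}

// checkInputs decodes both documents and applies two structural checks:
// every edge endpoint must be a known node, and a path over n node types
// must name n-1 edge types.
func checkInputs(graphJSON, pathJSON []byte) error {
	var g graph
	if err := json.Unmarshal(graphJSON, &g); err != nil {
		return fmt.Errorf("graph: %w", err)
	}
	var mp metaPath
	if err := json.Unmarshal(pathJSON, &mp); err != nil {
		return fmt.Errorf("meta-path: %w", err)
	}
	if len(mp.NodeSequence) != len(mp.EdgeSequence)+1 {
		return fmt.Errorf("meta-path %q: %d node types need %d edge types, got %d",
			mp.ID, len(mp.NodeSequence), len(mp.NodeSequence)-1, len(mp.EdgeSequence))
	}
	for _, e := range g.Edges {
		if _, ok := g.Nodes[e.From]; !ok {
			return fmt.Errorf("edge references unknown node %q", e.From)
		}
		if _, ok := g.Nodes[e.To]; !ok {
			return fmt.Errorf("edge references unknown node %q", e.To)
		}
	}
	return nil
}

func main() {
	g := []byte(`{"nodes": {"alice": {"id": "alice", "type": "Author"},
		"paper1": {"id": "paper1", "type": "Paper"}},
		"edges": [{"from": "alice", "to": "paper1", "type": "writes", "weight": 1.0}]}`)
	mp := []byte(`{"id": "author-collaboration",
		"node_sequence": ["Author", "Paper", "Author"],
		"edge_sequence": ["writes", "writes"]}`)
	fmt.Println(checkInputs(g, mp))
}
```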

Output Format

Simple edge list compatible with most community detection tools:

alice bob 2.000000
alice charlie 1.000000
bob charlie 1.000000
diana alice 1.000000
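
A downstream tool can read this format with a few lines of standard-library Go. The following is a sketch that assumes whitespace-separated `from to weight` lines, as in the sample above; `weightedEdge` and `parseEdgeList` are hypothetical names, not part of the package.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// weightedEdge is one parsed line of the edge list.
type weightedEdge struct {
	From, To string
	Weight   float64
}

// parseEdgeList reads "from to weight" lines, skipping blank lines and
// reporting the line number of the first malformed entry.
func parseEdgeList(text string) ([]weightedEdge, error) {
	var edges []weightedEdge
	sc := bufio.NewScanner(strings.NewReader(text))
	for line := 1; sc.Scan(); line++ {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		if len(fields) != 3 {
			return nil, fmt.Errorf("line %d: want 3 fields, got %d", line, len(fields))
		}
		w, err := strconv.ParseFloat(fields[2], 64)
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", line, err)
		}
		edges = append(edges, weightedEdge{From: fields[0], To: fields[1], Weight: w})
	}
	return edges, sc.Err()
}

func main() {
	edges, err := parseEdgeList("alice bob 2.000000\nalice charlie 1.000000\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, e := range edges {
		fmt.Printf("%s -> %s (%.1f)\n", e.From, e.To, e.Weight)
	}
}
```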

API Reference

Basic Usage

// Parse SCAR input
graph, metaPath, err := materialization.ParseSCARInput("graph.txt", "properties.txt", "path.txt")

// Configure materialization
config := materialization.DefaultMaterializationConfig()
config.Aggregation.Strategy = materialization.Count
config.Aggregation.Symmetric = true

// Run materialization
engine := materialization.NewMaterializationEngine(graph, metaPath, config, nil)
result, err := engine.Materialize()

// Save as edge list
err = materialization.SaveAsSimpleEdgeList(result.HomogeneousGraph, "output.txt")

Configuration Options

type MaterializationConfig struct {
    Traversal   TraversalConfig   // Path finding settings
    Aggregation AggregationConfig // Edge weight calculation
    Progress    ProgressConfig    // Progress reporting
}

// Key settings:
config.Aggregation.Strategy = materialization.Count // Count, Sum, Average, Maximum, Minimum
config.Aggregation.Symmetric = true                 // Force symmetric edges
config.Traversal.MaxInstances = 1000000             // Cap on path instances (memory safety limit)
config.Traversal.AllowCycles = false                // Disallow node revisits within a path

Advanced Pipeline

// Custom progress tracking
progressCb := func(current, total int, message string) {
    fmt.Printf("Progress: %d/%d - %s\n", current, total, message)
}

engine := materialization.NewMaterializationEngine(graph, metaPath, config, progressCb)
result, err := engine.Materialize()

// Multiple output formats
materialization.SaveAsSimpleEdgeList(result.HomogeneousGraph, "edges.txt")
materialization.SaveAsCSV(result.HomogeneousGraph, "edges.csv") 
materialization.SaveAsJSON(result.HomogeneousGraph, "graph.json")
materialization.SaveMaterializationResult(result, "detailed_result.json")

Common Use Cases

Author Collaboration Network

Input: Authors write Papers
Meta-path: Author → Paper → Author
Output: Author-Author collaboration edges

Citation Co-occurrence

Input: Papers cite Papers
Meta-path: Paper → Paper → Paper
Output: Papers connected by shared citations

User-Item Similarity

Input: Users rate Items
Meta-path: User → Item → User
Output: User-User similarity based on shared items

Pipeline Integration

With Louvain Community Detection

# Step 1: Materialize graph
./materialization -graph="network.txt" -props="props.txt" -path="path.txt" -output="edges.txt"

# Step 2: Run Louvain
./louvain edges.txt

With SCAR Clustering

# Step 1: Materialize
./materialization -graph="data.txt" -props="types.txt" -path="metapath.txt" -output="homogeneous.txt"

# Step 2: Run SCAR  
./scar.exe "homogeneous.txt" -louvain -sketch-output

Programmatic Pipeline

func runPipeline(graphFile, propsFile, pathFile, outputDir string) error {
    // Materialize
    graph, metaPath, err := materialization.ParseSCARInput(graphFile, propsFile, pathFile)
    if err != nil {
        return err
    }
    
    config := materialization.DefaultMaterializationConfig()
    engine := materialization.NewMaterializationEngine(graph, metaPath, config, nil)
    result, err := engine.Materialize()
    if err != nil {
        return err
    }
    
    // Save edge list
    edgeFile := filepath.Join(outputDir, "edges.txt")
    err = materialization.SaveAsSimpleEdgeList(result.HomogeneousGraph, edgeFile)
    if err != nil {
        return err
    }
    
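    // Run community detection. With cmd.Dir set, relative paths (including
    // "./louvain") resolve against outputDir, so pass the bare file name.
    cmd := exec.Command("./louvain", filepath.Base(edgeFile))
    cmd.Dir = outputDir
    return cmd.Run()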
}

Performance Tips

For Large Graphs (>100K nodes)

config.Traversal.MaxInstances = 500000    // Reduce memory usage
config.Aggregation.MinWeight = 0.1        // Filter weak edges
config.Aggregation.MaxEdges = 1000000     // Limit output size

For High Quality Results

config.Aggregation.Strategy = materialization.Average         // Average instance weights instead of counting paths
config.Traversal.AllowCycles = false                          // Cleaner paths
config.Aggregation.Normalization = materialization.DegreeNorm // Normalize weights by node degree

For Speed

config.Traversal.MaxInstances = 100000              // Early termination
config.Aggregation.Strategy = materialization.Count // Fastest aggregation
config.Progress.EnableProgress = false              // Disable progress reporting

Error Handling

Common issues and solutions:

// Check input validity
if err := graph.Validate(); err != nil {
    log.Fatalf("Invalid graph: %v", err)
}

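// Handle memory limits
estimated, err := engine.GetMemoryEstimate()
if err != nil {
    log.Fatalf("Memory estimate failed: %v", err)
}
if estimated > maxMemoryMB {
    config.Traversal.MaxInstances = 50000 // Reduce limit
    // The engine received config by value, so rebuild it to apply the new limit
    engine = materialization.NewMaterializationEngine(graph, metaPath, config, nil)
}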

// Verify meta-path traversability  
generator := materialization.NewInstanceGenerator(graph, metaPath, config.Traversal)
if err := generator.ValidateMetaPathTraversability(); err != nil {
    log.Fatalf("Meta-path not traversable: %v", err)
}

Build and Install

# Build the library
go build ./pkg/materialization

# Build CLI tool (if available)
go build -o materialization ./cmd/materialization

# Run tests
go test ./pkg/materialization/...

# Run with verification
go test ./pkg/materialization/ -run TestVerifyMaterialization

Examples

See materialization_test.go for complete examples including:

  • Author collaboration networks
  • Citation networks
  • User-item recommendation graphs
  • Performance benchmarks
  • Verification and debugging tools

License

MIT License - see LICENSE file for details.

Documentation

Functions

func MaterializeToFile

func MaterializeToFile(graph *models.HeterogeneousGraph, metaPath *models.MetaPath,
	outputPath string, config MaterializationConfig) error

MaterializeToFile performs materialization and saves result to file

func ParseSCARInput

func ParseSCARInput(graphFile, propertiesFile, pathFile string) (*models.HeterogeneousGraph, *models.MetaPath, error)

ParseSCARInput parses SCAR-format input files and creates a heterogeneous graph

func PrintMissingNodesReport

func PrintMissingNodesReport(investigation map[string]interface{})

PrintMissingNodesReport prints a detailed report of missing nodes

func PrintVerificationSummary

func PrintVerificationSummary(result *VerificationResult)

PrintVerificationSummary prints a summary of verification results

func SCARToMaterialization

func SCARToMaterialization(graphFile, propertiesFile, pathFile, outputFile string) error

SCARToMaterialization runs the complete conversion from SCAR input to edge list output

func SaveAsCSV

func SaveAsCSV(graph *HomogeneousGraph, outputPath string) error

SaveAsCSV saves as CSV with header: from,to,weight

func SaveAsEdgeList

func SaveAsEdgeList(graph *HomogeneousGraph, outputPath string) error

SaveAsEdgeList saves in standard graph format: first line = "nodes edges", then edge list

func SaveAsJSON

func SaveAsJSON(graph *HomogeneousGraph, outputPath string) error

SaveAsJSON saves the complete graph with metadata as JSON

func SaveAsSimpleEdgeList

func SaveAsSimpleEdgeList(graph *HomogeneousGraph, outputPath string) error

func SaveHomogeneousGraph

func SaveHomogeneousGraph(graph *HomogeneousGraph, outputPath string) error

SaveHomogeneousGraph saves a homogeneous graph to file. Format is determined by file extension: .csv, .json, .edgelist, .txt

func SaveMaterializationResult

func SaveMaterializationResult(result *MaterializationResult, outputPath string) error

SaveMaterializationResult saves the complete materialization result

func SaveVerificationResult

func SaveVerificationResult(result *VerificationResult, outputPath string) error

SaveVerificationResult saves verification results to a file

Types

type AggregationConfig

type AggregationConfig struct {
	Strategy       AggregationStrategy    `json:"strategy"`       // Count, Sum, Average, etc.
	Interpretation MetaPathInterpretation `json:"interpretation"` // DirectTraversal, MeetingBased
	Normalization  NormalizationType      `json:"normalization"`  // None, Degree, Max, etc.
	MinWeight      float64                `json:"min_weight"`     // Filter weak edges
	MaxEdges       int                    `json:"max_edges"`      // Keep only top-k edges (0 = no limit)
	Symmetric      bool                   `json:"symmetric"`      // Force symmetric edges
}

AggregationConfig contains configuration for edge weight calculation
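
The `MinWeight` and `MaxEdges` fields describe a threshold filter followed by a top-k cut. The sketch below shows one way such a step could work; `weighted` and `filterEdges` are hypothetical names, and both the inclusive comparison (`>=`) and the descending sort by weight are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// weighted is a hypothetical edge record used only in this sketch.
type weighted struct {
	From, To string
	Weight   float64
}

// filterEdges drops edges lighter than minWeight, then keeps the maxEdges
// heaviest ones. maxEdges == 0 means "no limit", matching the field comment.
func filterEdges(edges []weighted, minWeight float64, maxEdges int) []weighted {
	kept := make([]weighted, 0, len(edges))
	for _, e := range edges {
		if e.Weight >= minWeight {
			kept = append(kept, e)
		}
	}
	// Stable sort keeps the input order among edges of equal weight.
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Weight > kept[j].Weight })
	if maxEdges > 0 && len(kept) > maxEdges {
		kept = kept[:maxEdges]
	}
	return kept
}

func main() {
	edges := []weighted{
		{"alice", "bob", 2.0},
		{"alice", "charlie", 0.05},
		{"bob", "charlie", 1.0},
		{"diana", "alice", 3.0},
	}
	for _, e := range filterEdges(edges, 0.1, 2) {
		fmt.Println(e.From, e.To, e.Weight)
	}
}
```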

func DefaultAggregationConfig

func DefaultAggregationConfig() AggregationConfig

DefaultAggregationConfig returns sensible default configuration

type AggregationStats

type AggregationStats struct {
	EdgeGroupsProcessed  int            `json:"edge_groups_processed"`
	InstancesAggregated  int            `json:"instances_aggregated"`
	EdgesFiltered        int            `json:"edges_filtered"`
	WeightDistribution   map[string]int `json:"weight_distribution"`
	NormalizationApplied bool           `json:"normalization_applied"`
}

AggregationStats contains statistics about the aggregation process

type AggregationStrategy

type AggregationStrategy int

AggregationStrategy defines how to combine multiple path instances into edge weights

const (
	Count   AggregationStrategy = iota // Count number of instances
	Sum                                // Sum of instance weights
	Average                            // Average of instance weights
	Maximum                            // Maximum instance weight
	Minimum                            // Minimum instance weight
)
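
As an illustration of what the five strategies mean for a single edge, the sketch below reduces the weights of all path instances between one node pair to one number. The lower-case names are hypothetical stand-ins for the package's constants, and returning 0 for an empty group is an assumption.

```go
package main

import "fmt"

// strategy mirrors the five documented options with local, hypothetical names.
type strategy int

const (
	count strategy = iota
	sum
	average
	maximum
	minimum
)

// aggregate reduces the weights of all path instances between one node
// pair to a single edge weight.
func aggregate(weights []float64, s strategy) float64 {
	if len(weights) == 0 {
		return 0
	}
	total, lo, hi := 0.0, weights[0], weights[0]
	for _, w := range weights {
		total += w
		if w < lo {
			lo = w
		}
		if w > hi {
			hi = w
		}
	}
	switch s {
	case count:
		return float64(len(weights))
	case sum:
		return total
	case average:
		return total / float64(len(weights))
	case maximum:
		return hi
	case minimum:
		return lo
	}
	return 0
}

func main() {
	instances := []float64{1, 2, 3} // three path instances between the same pair
	fmt.Println(aggregate(instances, count), aggregate(instances, sum),
		aggregate(instances, average), aggregate(instances, maximum),
		aggregate(instances, minimum)) // prints "3 6 2 3 1"
}
```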

type EdgeKey

type EdgeKey struct {
	From string `json:"from"`
	To   string `json:"to"`
}

EdgeKey represents a directed edge between two nodes

func (EdgeKey) Reverse

func (ek EdgeKey) Reverse() EdgeKey

Reverse returns the reverse edge key

func (EdgeKey) String

func (ek EdgeKey) String() string

String returns a string representation of the edge key
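
Because a struct of two strings is comparable, a key like this can index a Go map directly, and a reverse operation makes it easy to mirror edges. The sketch below shows one possible way to force symmetry; `edgeKey` and `symmetrize` are hypothetical, and keeping the larger weight when both directions already exist is an assumption, since the documentation does not say how `Symmetric` resolves conflicts.

```go
package main

import "fmt"

// edgeKey is a local stand-in for a directed (from, to) key.
type edgeKey struct{ From, To string }

// reverse returns the key for the opposite direction.
func (k edgeKey) reverse() edgeKey { return edgeKey{From: k.To, To: k.From} }

// symmetrize returns a copy in which every edge also exists in the reverse
// direction; when both directions are present the larger weight wins.
func symmetrize(edges map[edgeKey]float64) map[edgeKey]float64 {
	out := make(map[edgeKey]float64, 2*len(edges))
	for k, w := range edges {
		for _, key := range []edgeKey{k, k.reverse()} {
			if cur, ok := out[key]; !ok || w > cur {
				out[key] = w
			}
		}
	}
	return out
}

func main() {
	edges := map[edgeKey]float64{
		{From: "alice", To: "bob"}:     2,
		{From: "bob", To: "alice"}:     1,
		{From: "alice", To: "charlie"}: 3,
	}
	sym := symmetrize(edges)
	fmt.Println(len(sym), sym[edgeKey{From: "bob", To: "alice"}], sym[edgeKey{From: "charlie", To: "alice"}])
}
```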

type GraphStatistics

type GraphStatistics struct {
	NodeCount          int            `json:"node_count"`
	EdgeCount          int            `json:"edge_count"`
	InstanceCount      int            `json:"instance_count"`      // Total meta path instances
	Density            float64        `json:"density"`             // Edge density
	AverageWeight      float64        `json:"average_weight"`      // Average edge weight
	MaxWeight          float64        `json:"max_weight"`          // Maximum edge weight
	MinWeight          float64        `json:"min_weight"`          // Minimum edge weight
	DegreeDistribution map[int]int    `json:"degree_distribution"` // degree -> count
	WeightDistribution map[string]int `json:"weight_distribution"` // weight_bucket -> count
}

GraphStatistics contains metrics about the homogeneous graph

type GraphStats

type GraphStats struct {
	NodeCount           int            `json:"node_count"`
	EdgeCount           int            `json:"edge_count"`
	NodeTypes           map[string]int `json:"node_types"` // type -> count
	EdgeTypes           map[string]int `json:"edge_types"` // type -> count
	AvgDegree           float64        `json:"avg_degree"`
	MaxDegree           int            `json:"max_degree"`
	ConnectedComponents int            `json:"connected_components"`
}

GraphStats contains basic graph statistics

type GraphVerificationStats

type GraphVerificationStats struct {
	OriginalGraph     GraphStats    `json:"original_graph"`
	MaterializedGraph GraphStats    `json:"materialized_graph"`
	MetaPath          MetaPathStats `json:"meta_path"`
}

GraphVerificationStats contains statistics for verification

type GraphVerifier

type GraphVerifier struct {
	// contains filtered or unexported fields
}

GraphVerifier handles verification of graph materialization

func NewGraphVerifier

func NewGraphVerifier() *GraphVerifier

NewGraphVerifier creates a new graph verifier

func (*GraphVerifier) InvestigateMissingNodes

func (gv *GraphVerifier) InvestigateMissingNodes() map[string]interface{}

InvestigateMissingNodes provides detailed analysis of why nodes are missing

func (*GraphVerifier) LoadFromFiles

func (gv *GraphVerifier) LoadFromFiles(graphFile, metaPathFile string) error

LoadFromFiles loads graph and meta path from files

func (*GraphVerifier) LoadFromObjects

func (gv *GraphVerifier) LoadFromObjects(graph *models.HeterogeneousGraph, metaPath *models.MetaPath)

LoadFromObjects loads graph and meta path from objects

func (*GraphVerifier) VerifyMaterialization

func (gv *GraphVerifier) VerifyMaterialization(config MaterializationConfig) (*VerificationResult, error)

VerifyMaterialization performs the materialization and verifies the results

type HomogeneousBuilder

type HomogeneousBuilder struct {
	// contains filtered or unexported fields
}

HomogeneousBuilder converts path instances into a homogeneous graph

func NewHomogeneousBuilder

func NewHomogeneousBuilder(metaPath *models.MetaPath, config AggregationConfig) *HomogeneousBuilder

NewHomogeneousBuilder creates a new homogeneous graph builder

func (*HomogeneousBuilder) AddInstance

func (hb *HomogeneousBuilder) AddInstance(instance PathInstance)

AddInstance adds a path instance to the builder

func (*HomogeneousBuilder) Build

Build constructs the final homogeneous graph from all added instances

func (*HomogeneousBuilder) GetAggregationStatistics

func (hb *HomogeneousBuilder) GetAggregationStatistics() AggregationStats

GetAggregationStatistics returns current aggregation statistics

func (*HomogeneousBuilder) GetEdgeGroupCount

func (hb *HomogeneousBuilder) GetEdgeGroupCount() int

GetEdgeGroupCount returns the number of unique edge groups

func (*HomogeneousBuilder) GetEdgeInstanceCount

func (hb *HomogeneousBuilder) GetEdgeInstanceCount(from, to string) int

GetEdgeInstanceCount returns the number of instances for a specific edge

func (*HomogeneousBuilder) GetInstanceGroups

func (hb *HomogeneousBuilder) GetInstanceGroups() map[EdgeKey][]PathInstance

GetInstanceGroups returns the current instance groups (for debugging/analysis)

func (*HomogeneousBuilder) GetNodeCount

func (hb *HomogeneousBuilder) GetNodeCount() int

GetNodeCount returns the number of unique nodes

func (*HomogeneousBuilder) Reset

func (hb *HomogeneousBuilder) Reset()

Reset clears all stored data (useful for reusing the builder)

func (*HomogeneousBuilder) ValidateConfiguration

func (hb *HomogeneousBuilder) ValidateConfiguration() error

ValidateConfiguration checks if the aggregation configuration is valid

type HomogeneousGraph

type HomogeneousGraph struct {
	NodeType   string              `json:"node_type"`  // "Author" for symmetric paths
	Nodes      map[string]Node     `json:"nodes"`      // Nodes in result graph
	Edges      map[EdgeKey]float64 `json:"edges"`      // (from,to) -> weight
	Statistics GraphStatistics     `json:"statistics"` // Graph metrics
	MetaPath   models.MetaPath     `json:"meta_path"`  // Original meta path used
}

HomogeneousGraph represents the materialized graph result

func (*HomogeneousGraph) AddEdge

func (hg *HomogeneousGraph) AddEdge(from, to string, weight float64)

AddEdge adds an edge to the homogeneous graph

func (*HomogeneousGraph) AddNode

func (hg *HomogeneousGraph) AddNode(nodeID string, originalNode models.Node)

AddNode adds a node to the homogeneous graph

func (*HomogeneousGraph) CalculateStatistics

func (hg *HomogeneousGraph) CalculateStatistics()

CalculateStatistics computes graph statistics

func (*HomogeneousGraph) GetNeighbors

func (hg *HomogeneousGraph) GetNeighbors(nodeID string) []string

GetNeighbors returns all neighbors of a node

func (*HomogeneousGraph) GetWeight

func (hg *HomogeneousGraph) GetWeight(from, to string) float64

GetWeight returns the weight of an edge, or 0 if it doesn't exist

func (*HomogeneousGraph) HasEdge

func (hg *HomogeneousGraph) HasEdge(from, to string) bool

HasEdge checks if an edge exists

type InstanceGenerator

type InstanceGenerator struct {
	// contains filtered or unexported fields
}

InstanceGenerator finds all instances of a meta path in a heterogeneous graph using BFS

func NewInstanceGenerator

func NewInstanceGenerator(graph *models.HeterogeneousGraph, metaPath *models.MetaPath, config TraversalConfig) *InstanceGenerator

NewInstanceGenerator creates a new instance generator

func (*InstanceGenerator) EstimateInstanceCount

func (ig *InstanceGenerator) EstimateInstanceCount() (int, error)

EstimateInstanceCount provides a rough estimate of the number of path instances

func (*InstanceGenerator) FindAllInstances

func (ig *InstanceGenerator) FindAllInstances(progressCb func(int, string)) ([]PathInstance, TraversalStats, error)

FindAllInstances finds all instances of the meta path using BFS

func (*InstanceGenerator) GetConnectedComponents

func (ig *InstanceGenerator) GetConnectedComponents() map[string][]string

GetConnectedComponents analyzes the connectivity of nodes relevant to the meta path

func (*InstanceGenerator) GetTraversalStatistics

func (ig *InstanceGenerator) GetTraversalStatistics() TraversalStats

GetTraversalStatistics returns current traversal statistics

func (*InstanceGenerator) ValidateMetaPathTraversability

func (ig *InstanceGenerator) ValidateMetaPathTraversability() error

ValidateMetaPathTraversability checks if the meta path can be traversed in the graph

type MaterializationConfig

type MaterializationConfig struct {
	Traversal   TraversalConfig   `json:"traversal"`
	Aggregation AggregationConfig `json:"aggregation"`
	Progress    ProgressConfig    `json:"progress"`
}

MaterializationConfig combines all configuration options

func DefaultMaterializationConfig

func DefaultMaterializationConfig() MaterializationConfig

DefaultMaterializationConfig returns sensible defaults

type MaterializationEngine

type MaterializationEngine struct {
	// contains filtered or unexported fields
}

MaterializationEngine is the main component that orchestrates graph materialization

func NewMaterializationEngine

func NewMaterializationEngine(graph *models.HeterogeneousGraph, metaPath *models.MetaPath,
	config MaterializationConfig, progressCb ProgressCallback) *MaterializationEngine

NewMaterializationEngine creates a new materialization engine

func (*MaterializationEngine) CanMaterialize

func (me *MaterializationEngine) CanMaterialize(maxMemoryMB int64) (bool, string, error)

CanMaterialize checks if materialization is feasible given memory constraints

func (*MaterializationEngine) EstimateComplexity

func (me *MaterializationEngine) EstimateComplexity() (int, error)

EstimateComplexity estimates the computational complexity of materialization

func (*MaterializationEngine) GetMemoryEstimate

func (me *MaterializationEngine) GetMemoryEstimate() (int64, error)

GetMemoryEstimate estimates memory usage for materialization

func (*MaterializationEngine) Materialize

func (me *MaterializationEngine) Materialize() (*MaterializationResult, error)

Materialize performs the complete materialization process

type MaterializationError

type MaterializationError struct {
	Component string `json:"component"` // "traversal", "aggregation", "memory", etc.
	Message   string `json:"message"`
	Details   string `json:"details,omitempty"`
}

MaterializationError represents materialization-specific errors

func (MaterializationError) Error

func (me MaterializationError) Error() string
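
A structured error with a component field lets callers branch on where a failure happened. The sketch below uses a local stand-in type with the same three fields and recovers it from a wrapped error with `errors.As`. The message format is hypothetical, and it assumes the error is returned as a value; if the package returns a pointer instead, the `errors.As` target would need to be a pointer to that pointer type.

```go
package main

import (
	"errors"
	"fmt"
)

// materializationError is a local stand-in with the documented fields.
type materializationError struct {
	Component string
	Message   string
	Details   string
}

// Error formats the error; this exact layout is an assumption.
func (e materializationError) Error() string {
	if e.Details != "" {
		return fmt.Sprintf("%s: %s (%s)", e.Component, e.Message, e.Details)
	}
	return fmt.Sprintf("%s: %s", e.Component, e.Message)
}

// componentOf reports which component failed, if err wraps a
// materializationError anywhere in its chain.
func componentOf(err error) (string, bool) {
	var me materializationError
	if errors.As(err, &me) {
		return me.Component, true
	}
	return "", false
}

func main() {
	base := materializationError{Component: "memory", Message: "instance limit exceeded"}
	wrapped := fmt.Errorf("materialize: %w", base)
	if c, ok := componentOf(wrapped); ok {
		fmt.Println("failed component:", c) // prints "failed component: memory"
	}
}
```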

type MaterializationResult

type MaterializationResult struct {
	HomogeneousGraph *HomogeneousGraph     `json:"homogeneous_graph"`
	Statistics       ProcessingStatistics  `json:"statistics"`
	Config           MaterializationConfig `json:"config"`
	Success          bool                  `json:"success"`
	Error            string                `json:"error,omitempty"`
}

MaterializationResult contains the complete result of materialization

func BatchMaterialize

func BatchMaterialize(graph *models.HeterogeneousGraph, metaPaths []*models.MetaPath,
	config MaterializationConfig, progressCb ProgressCallback) ([]*MaterializationResult, error)

BatchMaterialize performs materialization on multiple meta paths

func MaterializeWithDefaults

func MaterializeWithDefaults(graph *models.HeterogeneousGraph, metaPath *models.MetaPath,
	progressCb ProgressCallback) (*MaterializationResult, error)

MaterializeWithDefaults is a convenience function that uses default configuration

type MetaPathInterpretation

type MetaPathInterpretation int

MetaPathInterpretation defines how the meta path is interpreted

const (
	DirectTraversal MetaPathInterpretation = iota // Alice → Paper → Bob = Alice ↔ Bob
	MeetingBased                                  // Alice → Venue, Bob → Venue = Alice ↔ Bob
)

type MetaPathStats

type MetaPathStats struct {
	Length             int      `json:"length"`
	IsSymmetric        bool     `json:"is_symmetric"`
	StartNodeType      string   `json:"start_node_type"`
	EndNodeType        string   `json:"end_node_type"`
	NodeTypes          []string `json:"node_types"`
	EdgeTypes          []string `json:"edge_types"`
	EstimatedInstances int      `json:"estimated_instances"`
}

MetaPathStats contains meta path statistics

type Node

type Node struct {
	ID         string                 `json:"id"`
	Type       string                 `json:"type"`
	Properties map[string]interface{} `json:"properties"`
	Degree     int                    `json:"degree"` // Number of connections
}

Node represents a node in the homogeneous graph

type NormalizationType

type NormalizationType int

NormalizationType defines edge weight normalization strategies

const (
	NoNormalization NormalizationType = iota // No normalization
	DegreeNorm                               // Normalize by node degrees
	MaxNorm                                  // Normalize to [0,1] range
	StandardNorm                             // Z-score normalization
)
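
Two of these options have widely used textbook definitions, sketched below on a plain slice of weights. This is an illustration under stated assumptions, not the package's code: `maxNorm` divides by the largest weight, and `zScore` uses the population standard deviation. `DegreeNorm` is omitted because its exact formula is not documented here.

```go
package main

import (
	"fmt"
	"math"
)

// maxNorm scales weights by the largest one, so positive weights land in (0, 1].
// If the maximum is not positive, the weights are returned unchanged.
func maxNorm(ws []float64) []float64 {
	out := append([]float64{}, ws...)
	if len(out) == 0 {
		return out
	}
	max := out[0]
	for _, w := range out {
		if w > max {
			max = w
		}
	}
	if max <= 0 {
		return out
	}
	for i := range out {
		out[i] /= max
	}
	return out
}

// zScore subtracts the mean and divides by the population standard
// deviation. When all weights are equal, every score is 0.
func zScore(ws []float64) []float64 {
	out := make([]float64, len(ws))
	if len(ws) == 0 {
		return out
	}
	mean := 0.0
	for _, w := range ws {
		mean += w
	}
	mean /= float64(len(ws))
	variance := 0.0
	for _, w := range ws {
		variance += (w - mean) * (w - mean)
	}
	std := math.Sqrt(variance / float64(len(ws)))
	if std == 0 {
		return out
	}
	for i, w := range ws {
		out[i] = (w - mean) / std
	}
	return out
}

func main() {
	fmt.Println(maxNorm([]float64{1, 2, 4})) // prints "[0.25 0.5 1]"
	fmt.Println(zScore([]float64{2, 4, 4, 4, 5, 5, 7, 9}))
}
```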

type PathInstance

type PathInstance struct {
	Nodes    []string               `json:"nodes"`    // Sequence of node IDs [a1, p1, a2]
	Edges    []string               `json:"edges"`    // Sequence of edge types used
	Weight   float64                `json:"weight"`   // Accumulated weight along path
	Metadata map[string]interface{} `json:"metadata"` // Additional information
}

PathInstance represents a single instance of the meta path in the graph

func (PathInstance) GetEndNode

func (pi PathInstance) GetEndNode() string

GetEndNode returns the last node in the path

func (PathInstance) GetStartNode

func (pi PathInstance) GetStartNode() string

GetStartNode returns the first node in the path

func (PathInstance) IsValid

func (pi PathInstance) IsValid() bool

IsValid checks if the path instance is structurally valid

func (PathInstance) String

func (pi PathInstance) String() string

String returns a human-readable representation of the path instance

type PathState

type PathState struct {
	Nodes   []string        // Current path of nodes
	Edges   []string        // Current path of edge types
	Weight  float64         // Accumulated weight
	Step    int             // Current step in meta path (0 to len(edges))
	Visited map[string]bool // Visited nodes (for cycle detection)
}

PathState represents the current state during BFS traversal

type ProcessingStatistics

type ProcessingStatistics struct {
	RuntimeMS             int64            `json:"runtime_ms"`
	MemoryPeakMB          int64            `json:"memory_peak_mb"`
	InstancesGenerated    int              `json:"instances_generated"`
	InstancesFiltered     int              `json:"instances_filtered"`
	EdgesCreated          int              `json:"edges_created"`
	NodesInResult         int              `json:"nodes_in_result"`
	TraversalStatistics   TraversalStats   `json:"traversal_stats"`
	AggregationStatistics AggregationStats `json:"aggregation_stats"`
}

ProcessingStatistics contains detailed metrics about the materialization process

type ProgressCallback

type ProgressCallback func(current, total int, message string)

ProgressCallback is a function type for progress reporting

type ProgressConfig

type ProgressConfig struct {
	EnableProgress bool `json:"enable_progress"` // Whether to report progress
	ReportInterval int  `json:"report_interval"` // Report every N instances
}

ProgressConfig contains configuration for progress reporting
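
The idea behind a report interval is to call the callback only every N instances rather than on each one. Below is a minimal sketch of such throttling; `reportEvery` is a hypothetical wrapper, and always reporting the final step is an assumption.

```go
package main

import "fmt"

// progressCallback has the same shape as the documented callback type.
type progressCallback func(current, total int, message string)

// reportEvery wraps cb so it fires only on every interval-th step and on
// the final step. A nil callback or non-positive interval disables reporting.
func reportEvery(interval int, cb progressCallback) progressCallback {
	return func(current, total int, message string) {
		if cb == nil || interval <= 0 {
			return
		}
		if current%interval == 0 || current == total {
			cb(current, total, message)
		}
	}
}

func main() {
	report := reportEvery(4, func(current, total int, message string) {
		fmt.Printf("Progress: %d/%d - %s\n", current, total, message)
	})
	for i := 1; i <= 10; i++ {
		report(i, 10, "finding instances") // fires at 4, 8 and 10
	}
}
```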

type TestResult

type TestResult struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Passed      bool   `json:"passed"`
	Expected    string `json:"expected"`
	Actual      string `json:"actual"`
	ErrorMsg    string `json:"error_msg,omitempty"`
	Severity    string `json:"severity"` // "critical", "warning", "info"
}

TestResult represents a single test result

type TraversalConfig

type TraversalConfig struct {
	Strategy       TraversalStrategy `json:"strategy"`        // BFS, DFS, etc.
	MaxPathLength  int               `json:"max_path_length"` // Prevent infinite recursion
	AllowCycles    bool              `json:"allow_cycles"`    // Whether to allow node revisits
	MaxInstances   int               `json:"max_instances"`   // Memory safety limit
	TimeoutSeconds int               `json:"timeout_seconds"` // Processing timeout
	Parallelism    int               `json:"parallelism"`     // Number of parallel workers
}

TraversalConfig contains configuration for meta path traversal

func DefaultTraversalConfig

func DefaultTraversalConfig() TraversalConfig

DefaultTraversalConfig returns sensible default configuration

type TraversalStats

type TraversalStats struct {
	StartingNodes     int         `json:"starting_nodes"`
	NodesVisited      int         `json:"nodes_visited"`
	EdgesTraversed    int         `json:"edges_traversed"`
	PathsExplored     int         `json:"paths_explored"`
	CyclesDetected    int         `json:"cycles_detected"`
	TimeoutOccurred   bool        `json:"timeout_occurred"`
	WorkerUtilization map[int]int `json:"worker_utilization"` // worker_id -> instances_processed
	RuntimeMS         int64       `json:"runtime_ms"`         // Total traversal time
}

TraversalStats contains statistics about the traversal process

type TraversalStrategy

type TraversalStrategy int

TraversalStrategy defines how to traverse the meta path

const (
	BFS TraversalStrategy = iota // Breadth-First Search
	DFS                          // Depth-First Search (for future extension)
)

type VerificationResult

type VerificationResult struct {
	Passed           bool                   `json:"passed"`
	TotalTests       int                    `json:"total_tests"`
	PassedTests      int                    `json:"passed_tests"`
	FailedTests      int                    `json:"failed_tests"`
	TestResults      []TestResult           `json:"test_results"`
	GraphStats       GraphVerificationStats `json:"graph_stats"`
	RecommendedFixes []string               `json:"recommended_fixes"`
}

VerificationResult contains the results of verification

func QuickVerify

func QuickVerify(graphFile, metaPathFile string) (*VerificationResult, error)

QuickVerify performs a quick verification with default settings

type WeightCalculator

type WeightCalculator struct {
	// contains filtered or unexported fields
}

WeightCalculator handles weight processing and normalization for homogeneous graphs

func NewWeightCalculator

func NewWeightCalculator(config AggregationConfig) *WeightCalculator

NewWeightCalculator creates a new weight calculator

func (*WeightCalculator) ApplyCustomNormalization

func (wc *WeightCalculator) ApplyCustomNormalization(graph *HomogeneousGraph,
	normFunc func(EdgeKey, float64, *HomogeneousGraph) float64) error

ApplyCustomNormalization allows applying a custom normalization function

func (*WeightCalculator) GetNormalizationSummary

func (wc *WeightCalculator) GetNormalizationSummary(graph *HomogeneousGraph) string

GetNormalizationSummary returns a summary of the normalization that would be applied

func (*WeightCalculator) GetWeightStatistics

func (wc *WeightCalculator) GetWeightStatistics(graph *HomogeneousGraph) WeightStatistics

GetWeightStatistics calculates detailed statistics about edge weights

func (*WeightCalculator) ProcessGraph

func (wc *WeightCalculator) ProcessGraph(graph *HomogeneousGraph) error

ProcessGraph applies all weight processing steps to the homogeneous graph

func (*WeightCalculator) ValidateWeights

func (wc *WeightCalculator) ValidateWeights(graph *HomogeneousGraph) error

ValidateWeights checks if all weights in the graph are valid

type WeightStatistics

type WeightStatistics struct {
	Count             int                `json:"count"`
	Sum               float64            `json:"sum"`
	Mean              float64            `json:"mean"`
	Min               float64            `json:"min"`
	Max               float64            `json:"max"`
	Variance          float64            `json:"variance"`
	StandardDeviation float64            `json:"standard_deviation"`
	Percentiles       map[string]float64 `json:"percentiles"`
}

WeightStatistics contains detailed statistics about edge weights
