Benchmarking with NWGraph

This guide covers how to build and run the NWGraph benchmarks, including the GAP Benchmark Suite implementations.

Building Benchmarks

NWGraph benchmarks are disabled by default. To build them:

# From the NWGraph root directory
mkdir -p build && cd build

# Configure with benchmarks enabled
cmake .. -DNWGRAPH_BUILD_BENCH=ON

# Build all benchmarks
make bench -j8

# Or build individual benchmarks
make bfs bc cc pr sssp tc -j8

The benchmark executables are placed in build/bench/gapbs/.

Quick Test

To verify benchmarks are working, run with a small test graph:

# From the build directory
./bench/gapbs/bfs -f ../test/data/karate.mtx -n 3 --verify
./bench/gapbs/cc -f ../test/data/karate.mtx --verify
./bench/gapbs/tc -f ../test/data/karate.mtx --verify

GAP Benchmark Suite

NWGraph implements the six kernels from the GAP Benchmark Suite:

Table 1 GAP Benchmark Kernels

Kernel

Description

Best Version

Notes

BFS

Breadth-First Search

--version 11

Direction-optimizing BFS

SSSP

Single-Source Shortest Path

--version 12

Delta-stepping algorithm

PR

PageRank

--version 11

1000 iterations typical

CC

Connected Components

--version 7

Afforest algorithm

BC

Betweenness Centrality

--version 5

Brandes’ algorithm

TC

Triangle Counting

--version 4

Requires --relabel --upper

GAP Benchmark Datasets

The GAP Benchmark Suite uses five standard graphs for evaluation. These graphs are available from the SuiteSparse Matrix Collection.

Dataset Overview

Table 2 GAP Benchmark Graphs

Name

Vertices

Edges

Download Size

Description

twitter

61,578,415

1,468,365,182

~26 GB

Twitter follower network snapshot. Exhibits realistic social network properties with highly skewed degree distribution (power-law).

web

50,636,154

1,949,412,601

~7 GB

Web crawl of the .sk (Slovakia) domain from 2005. From the Laboratory for Web Algorithmics. Shows substantial locality.

road

23,947,347

58,333,344

~1 GB

Road network of the entire USA. Notable for being much smaller but having very high diameter (~6000 hops).

kron

134,217,726

2,111,632,938

Generated

Kronecker synthetic graph using Graph 500 parameters (scale 27, edge factor 16). Exhibits scale-free properties.

urand

134,217,728

2,147,483,648

Generated

Erdos-Renyi uniform random graph (2^27 vertices, degree 16). Represents worst-case memory access locality.

Detailed Dataset Information

Twitter (GAP-twitter)

Source

ANLAB-KAIST GitHub repository

Original

Twitter follower graph (Kwak et al., WWW 2010)

Format

Edge list (.el)

Type

Directed, unweighted

Characteristics
  • Power-law degree distribution

  • Maximum in-degree: 2,997,469

  • Maximum out-degree: 770,155

  • Average degree: ~48

The Twitter graph is split into 4 compressed parts for download. Use the download script to automatically fetch and combine them.

Web (GAP-web / sk-2005)

Source

SuiteSparse Matrix Collection

Original

Laboratory for Web Algorithmics

Format

Matrix Market (.mtx)

Type

Directed, unweighted

Characteristics
  • Web crawl locality patterns

  • Maximum out-degree: 8,563

  • High clustering coefficient

  • Contains many small SCCs

Road (GAP-road / USA-road-d)

Source

9th DIMACS Implementation Challenge

Original

TIGER/Line road network data

Format

DIMACS graph format (.gr)

Type

Directed, weighted (distances in meters)

Characteristics
  • Very high diameter (~6000)

  • Low average degree (~2.4)

  • Nearly planar structure

  • Weights represent travel distances

This graph tests algorithms on high-diameter, low-degree networks.

Kron (GAP-kron)

Source

Generated using Graph 500 Kronecker generator

Parameters

Scale 27, edge factor 16 (-g27 -k16)

Format

Edge list or serialized binary

Type

Undirected, unweighted

Characteristics
  • Scale-free degree distribution

  • Small-world properties

  • Self-similar structure

  • Synthetic but realistic

Generate using the GAPBS reference converter tool (included in bench/gapbs-reference/):

# Generate Kronecker graph as edge list (for NWGraph)
./bench/gapbs-reference/converter -g27 -k16 -e kron.el

# Or generate as GAPBS serialized format
./bench/gapbs-reference/converter -g27 -k16 -b kron.sg

For smaller test graphs, use a lower scale (e.g., -g20 for ~1M vertices).

Urand (GAP-urand)

Source

Generated using Erdos-Renyi random generator

Parameters

2^27 vertices, degree 16 (-u27 -k16)

Format

Edge list or serialized binary

Type

Undirected, unweighted

Characteristics
  • Uniform random edge placement

  • Worst-case memory locality

  • No clustering structure

  • Binomial degree distribution around 16

Generate using the GAPBS reference converter tool:

# Generate uniform random graph as edge list (for NWGraph)
./bench/gapbs-reference/converter -u27 -k16 -e urand.el

# Or generate as GAPBS serialized format
./bench/gapbs-reference/converter -u27 -k16 -b urand.sg

For smaller test graphs, use a lower scale (e.g., -u20 for ~1M vertices).

Downloading Graphs

Using the Download Script

NWGraph provides a script to download the GAP benchmark graphs:

# Show available graphs and sizes
./scripts/download_gap_graphs.sh --list

# Download road network only (smallest, ~1GB)
./scripts/download_gap_graphs.sh road

# Download multiple graphs
./scripts/download_gap_graphs.sh road web

# Download all real-world graphs
./scripts/download_gap_graphs.sh all

# Download small test graphs (for quick testing)
./scripts/download_gap_graphs.sh --small

# Specify custom download directory
./scripts/download_gap_graphs.sh -d /data/gap road web

Manual Download

Graphs can also be downloaded manually from SuiteSparse:

  1. Visit https://sparse.tamu.edu/GAP

  2. Click on the desired graph

  3. Download in Matrix Market format

  4. Extract and place in your data directory

Storage Requirements

Table 3 Disk Space Requirements

Graph

Compressed

Uncompressed

Notes

Small test set

~5 MB

~10 MB

Good for development

road

~1 GB

~2 GB

Recommended starting point

web

~7 GB

~20 GB

Moderate size

twitter

~26 GB

~80 GB

Largest real-world graph

kron (generated)

N/A

~25 GB

Generated in memory

urand (generated)

N/A

~25 GB

Generated in memory

Total for all graphs: ~150 GB (plus ~64 GB RAM for generation)

File Formats

NWGraph supports multiple graph file formats:

Matrix Market (.mtx)

The standard format for sparse matrices. Human-readable but slower to load.

%%MatrixMarket matrix coordinate pattern general
% Comment lines start with %
5 5 8
1 2
1 3
2 3
...

Edge List (.el)

Simple format with one edge per line:

0 1
0 2
1 2
...

DIMACS (.gr)

Format from the DIMACS Implementation Challenges:

c Comment lines
p sp 5 8
a 1 2 10
a 1 3 5
...

NWGraph Binary (.nw)

Native binary format for fast loading. Convert using:

./bench/gapbs/process_edge_list -d output.nw input.mtx

Running Benchmarks

Standard Benchmark Parameters

For reproducible benchmarking matching the GAP methodology:

# BFS: 64 random sources
./bench/gapbs/bfs -f graph.mtx -n 64 --version 11

# SSSP: 64 random sources, delta=2 (or 50000 for road)
./bench/gapbs/sssp -f graph.wsg -n 64 -d 2 --version 12

# PR: 1000 iterations
./bench/gapbs/pr -f graph.mtx -i 1000 --version 11

# CC: 16 trials
./bench/gapbs/cc -f graph.mtx -n 16 --version 7

# BC: 4 sources, 16 trials
./bench/gapbs/bc -f graph.mtx -i 4 -n 16 --version 5

# TC: Requires undirected, upper triangular
./bench/gapbs/tc -f graph.mtx -n 3 --relabel --upper --version 4

Multi-threading

Thread count is specified as a positional argument:

# Run with 16 threads
./bench/gapbs/bfs -f graph.mtx -n 10 16

# Run with 64 threads
./bench/gapbs/cc -f graph.mtx 64

Verification

Add --verify to check correctness:

./bench/gapbs/bfs -f ../test/data/karate.mtx --verify

Logging Results

Output results to a file:

./bench/gapbs/bfs -f graph.mtx -n 64 --log results.csv --log-header

References

See Also