
Quickstart

NWGraph is a high-performance header-only generic C++ graph library, built on the C++20 concepts and ranges language features. It provides multiple graph algorithms for well-known graph kernels along with supporting data structures. Both sequential and parallel algorithms are available.

Project Organization

The organization of our library is shown as follows:

$NWGraph_HOME/
├── README.md
├── CMakeLists.txt
├── apb/
├── bench/
├── examples/
│   └── imdb/
├── include/
│   └── nwgraph/
│       ├── adaptors/
│       ├── algorithms/
│       ├── containers/
│       ├── experimental/
│       │   └── algorithms/
│       ├── graphs/
│       ├── io/
│       ├── util/
│       ├── graph_concepts.hpp
│       └── ...
├── test/
└── ...

The genericity of the algorithms in the NWGraph library stems from a taxonomy of graph concepts. The definitions of these concepts can be found in the include/nwgraph/graph_concepts.hpp file. The header files containing the sequential and parallel graph algorithms for well-known graph kernels can be found under the $NWGraph_HOME/include/nwgraph/algorithms/ directory (some experimental algorithms are located in the $NWGraph_HOME/include/nwgraph/experimental/algorithms/ subdirectory). The header files for the range adaptors are under the $NWGraph_HOME/include/nwgraph/adaptors/ directory. The code for the benchmark applications is located in the $NWGraph_HOME/bench/ directory. The abstraction penalty benchmarks, which compare different containers and a variety of ways to iterate through a graph (including the use of graph adaptors), are under the $NWGraph_HOME/apb/ directory. Examples of how to use NWGraph can be found in the $NWGraph_HOME/examples/imdb/ directory.

How to Compile

NWGraph uses Intel oneTBB as its parallel backend.

Requirements

  • CMake >= 3.20

  • g++ >= 11, with support for oneTBB as the parallel backend

  • oneTBB >= 2021

You should be able to install cmake and g++ with your system’s package manager (e.g., apt or homebrew). oneTBB appears to be available on homebrew for macOS 11.6 and later (and perhaps earlier).

Instructions for installing oneTBB with various Linux package managers can be found here:

https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html

Installation packages for oneAPI for Linux are available on intel.com:

https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#onetbb

Compilation

Basic build (tests only):

$ mkdir -p build && cd build
$ cmake ..
$ make -j8

Building Everything

To build tests, examples, and benchmarks together:

$ cmake .. -DNWGRAPH_BUILD_TESTS=ON \
           -DNWGRAPH_BUILD_EXAMPLES=ON \
           -DNWGRAPH_BUILD_BENCH=ON
$ make -j8

Building Specific Components

# Build all benchmarks (GAP Benchmark Suite + abstraction penalty)
$ make bench -j8

# Build individual benchmark executables
$ make bfs cc pr sssp bc tc -j8

# Build BGL book examples
$ make ch3_toposort ch4_kevin_bacon ch5_dijkstra ch6_kruskal -j8

Build Output Locations

After building, executables are located in:

build/
├── test/                        # Unit tests (Catch2)
│   ├── adjacency_test
│   ├── bfs_test_0
│   └── ...
├── examples/
│   └── bgl-book/                # BGL Book examples
│       ├── ch3_toposort
│       ├── ch4_kevin_bacon
│       ├── ch5_dijkstra
│       └── ...
└── bench/
    ├── gapbs/                   # GAP Benchmark Suite
    │   ├── bfs
    │   ├── sssp
    │   ├── pr
    │   ├── cc
    │   ├── bc
    │   └── tc
    └── abstraction_penalty/     # Abstraction penalty benchmarks
        ├── plain
        ├── spmv
        └── ...

CMake Options

To specify compiler:

$ cmake .. -DCMAKE_CXX_COMPILER=g++-11

To specify build type as Release or Debug (default is Release):

$ cmake .. -DCMAKE_BUILD_TYPE=Release

To enable/disable components:

# Unit tests (ON by default)
$ cmake .. -DNWGRAPH_BUILD_TESTS=ON

# GAP benchmarks and abstraction penalty benchmarks
$ cmake .. -DNWGRAPH_BUILD_BENCH=ON

# BGL book examples and IMDB examples
$ cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON

If cmake is not able to find TBB in its expected places, you may get an error during the cmake step. In this case, you need to set the TBBROOT environment variable to the location where oneTBB was installed. For example:

$ TBBROOT=/opt/intel/oneapi/tbb/2021.5.1 cmake ..

To see verbose information during compilation:

$ make VERBOSE=1

Building Documentation

NWGraph documentation is built using Sphinx with Doxygen integration via Breathe and Exhale.

Documentation Dependencies

Install Python dependencies:

$ pip install sphinx sphinx_rtd_theme breathe exhale sphinxcontrib-bibtex

Install Doxygen:

# macOS
$ brew install doxygen

# Ubuntu/Debian
$ sudo apt install doxygen

Building via CMake

The recommended way to build documentation:

$ cmake .. -DNWGRAPH_BUILD_DOCS=ON
$ make docs

This runs Doxygen to extract API documentation and Sphinx to build the HTML pages.

Documentation Targets

Target            Description
make docs         Build complete documentation (Doxygen + Sphinx)
make docs-html    Build HTML only (faster, uses cached Doxygen output)
make docs-clean   Clean all built documentation
make docs-open    Build and open in browser (macOS/Linux)

Building Directly with Sphinx

Alternatively, build directly from the sphinx directory:

$ cd doc-src/sphinx
$ pip install -r requirements.txt
$ make html

The built documentation is located at doc-src/sphinx/_build/html/index.html.

Running code in NWGraph

NWGraph uses the command-line interface description language docopt to define the interfaces of its command-line applications and abstraction penalty experiments.

A typical interface of a benchmark driver looks like this:

bfs: breadth first search benchmark driver.
  Usage:
      bfs (-h | --help)
      bfs -f FILE [-r NODE | -s FILE] [-i NUM] [-a NUM] [-b NUM] [-B NUM] [-n NUM] [--seed NUM] [--version ID...] [--log FILE] [--log-header] [-dvV] [THREADS]...

  Options:
      -h, --help              show this screen
      -f FILE                 input file path
      -i NUM                  number of iteration [default: 1]
      -a NUM                  alpha parameter [default: 15]
      -b NUM                  beta parameter [default: 18]
      -B NUM                  number of bins [default: 32]
      -n NUM                  number of trials [default: 1]
      -r NODE                 start from node r (default is random)
      -s, --sources FILE      sources file
      --seed NUM              random seed [default: 27491095]
      --version ID            algorithm version to run [default: 0]
      --log FILE              log times to a file
      --log-header            add a header to the log file
      -d, --debug             run in debug mode
      -v, --verify            verify results
      -V, --verbose           run in verbose mode

The applications take options, followed by the arguments of those options, as inputs. A minimal example that takes a graph as input is as follows:

$ ./build/bench/gapbs/bfs -f test/data/karate.mtx

Supported graph file format

NWGraph recognizes the following file formats:

  • Matrix Market Exchange Formats

Running benchmarks

We have six main benchmarks: Breadth-first Search, Betweenness Centrality, Connected Component Decomposition, PageRank, Single Source Shortest Path, and Triangle Counting.

First, build the benchmarks:

$ cd build
$ cmake .. -DNWGRAPH_BUILD_BENCH=ON
$ make bench -j8

Connected Component Decomposition

The default sequential version of CC is version 0. The fastest parallel version of CC is version 7, Afforest.

$ ./bench/gapbs/cc -f ../test/data/karate.mtx --relabel --direction ascending

PageRank

The fastest parallel version of PR is version 11 (the default). The maximum number of iterations can be set with -i.

$ ./bench/gapbs/pr -f ../test/data/karate.mtx -i 1000

Single Source Shortest Path

The default sequential version of SSSP is version 0. The fastest parallel version of SSSP is version 12, Delta-stepping. As an alternative to specifying one seed at a time, one or more sources can be provided to the SSSP driver in a Matrix Market format file. The number of trials can be specified with -n; if no seed or sources file is provided, each trial draws one random source uniformly from 0 to |V|-1.

$ ./bench/gapbs/sssp -f ../test/data/karate.mtx --seed 0 -n 3

Triangle Counting

The default sequential version of TC is version 0. The fastest parallel version of TC is version 4.

$ ./bench/gapbs/tc -f ../test/data/karate.mtx --version 4 --relabel --upper

Betweenness Centrality

The default sequential version of BC is version 0. The fastest parallel version of BC is version 5. As an alternative to specifying one seed at a time, one or more sources can be provided to the BC driver in a Matrix Market format file.

$ ./bench/gapbs/bc -f ../test/data/karate.mtx --version 5 --seed 0

Other useful things

Note that the following features may or may not be available for every benchmark.

Relabel-by-degree

Relabeling vertices by degree (also known as column/row permutation in matrix-matrix multiplication) may speed up a graph algorithm by improving its workload distribution and memory access pattern. To enable relabel-by-degree and relabel vertices in ascending degree order:

$ ./bench/gapbs/cc -f ../test/data/karate.mtx --relabel --direction ascending

Upper Triangular Order

Triangle counting allows relabeling the graph in upper (or lower) triangular order, which greatly improves the performance of the algorithm. To enable relabel-by-degree with upper triangular order:

$ ./bench/gapbs/tc -f ../test/data/karate.mtx --relabel --upper

Verifier

We implement a verifier in each benchmark to check the correctness of the algorithm. To enable verification:

$ ./bench/gapbs/cc -f ../test/data/karate.mtx -v

or

$ ./bench/gapbs/cc -f ../test/data/karate.mtx --verify

Multi-threading

Each algorithm/benchmark has both a sequential version and a parallel version. When a parallel algorithm is selected, multi-threading is enabled by default, with the number of threads set to the number of available cores on the machine. To run with a different number of threads, such as 128:

$ ./bench/gapbs/cc -f ../test/data/karate.mtx 128

Benchmarking with GAP Datasets

For detailed information about the GAP Benchmark Suite datasets, see Benchmarking with NWGraph.

To download the GAP graphs:

# Show available graphs
$ ./scripts/download_gap_graphs.sh --list

# Download the road network (smallest, good for testing)
$ ./scripts/download_gap_graphs.sh road

# Download all real-world graphs
$ ./scripts/download_gap_graphs.sh all

Then run benchmarks with the downloaded graphs. Note that BFS and SSSP are run with 64 sources, BC with 4 sources, and PR with 1000 iterations.

Benchmarking abstraction penalties

What is abstraction penalty?

There are two kinds of abstraction penalty considered here. First, a range-based interface admits a variety of ways to iterate through a graph (including the use of graph adaptors). While ranges and range-based for loops are useful programming abstractions, it is important to consider any performance penalties associated with their use. We benchmark these penalties to ensure they do not significantly limit performance compared with a raw for-loop implementation.

Second, we evaluate the abstraction penalty incurred by storing a graph in different containers. In particular, we have selected the struct_of_array, vector_of_vector, vector_of_list, and vector_of_forward_list containers.

Running abstraction penalty experiments

First, build the abstraction penalty benchmarks:

$ cmake .. -DNWGRAPH_BUILD_BENCH=ON
$ make bench -j8

For example, consider the sparse matrix-dense vector multiplication (SpMV) kernel used in PageRank, which multiplies the adjacency matrix representation of a graph by a dense vector x and stores the result in another vector y. To experimentally evaluate the abstraction penalty of different ways to iterate through a graph:

$ ./bench/abstraction_penalty/spmv -f ../test/data/karate.mtx

To experimentally evaluate the abstraction penalty of different containers for storing a graph:

$ ./bench/abstraction_penalty/containers -f ../test/data/karate.mtx --format CSR --format VOV --format VOL --format VOF