Connected Components (BGL Book Chapter 7)

Based on “The Boost Graph Library” by Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine

Overview

One basic question about a network is which vertices are reachable from one another. For example, a well-designed Web site should have enough links between Web pages so that all pages can be reached from the home page.

A connected component is a maximal group of vertices in an undirected graph that are all reachable from one another. In a directed graph, maximal groups of mutually reachable vertices are called strongly connected components.
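The difference between one-way and mutual reachability in a directed graph can be made concrete with a small sketch. The code is illustrative only: the AdjList alias and the functions reachable_from and mutually_reachable are hypothetical names, not part of BGL or NWGraph, and the brute-force check is meant for tiny graphs rather than as a practical strongly connected components algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed representation: adj[u] lists the targets of the directed edges leaving u.
using AdjList = std::vector<std::vector<std::size_t>>;

// Mark every vertex that can be reached from `source` by following directed edges.
std::vector<bool> reachable_from(const AdjList& adj, std::size_t source) {
  assert(source < adj.size());
  std::vector<bool> seen(adj.size(), false);
  std::vector<std::size_t> pending{source};
  seen[source] = true;
  while (!pending.empty()) {
    std::size_t u = pending.back();
    pending.pop_back();
    for (std::size_t v : adj[u]) {
      if (!seen[v]) {
        seen[v] = true;
        pending.push_back(v);
      }
    }
  }
  return seen;
}

// Brute-force check: u and v lie in the same strongly connected component
// exactly when each one can reach the other.
bool mutually_reachable(const AdjList& adj, std::size_t u, std::size_t v) {
  return reachable_from(adj, u)[v] && reachable_from(adj, v)[u];
}
```

For a directed graph with edges 0 -> 1, 1 -> 2, 2 -> 0, and 2 -> 3, mutually_reachable(g, 0, 2) holds, while mutually_reachable(g, 0, 3) does not, even though vertex 3 is reachable from vertex 0.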

Internet Connectivity Study

A study of 200 million Web pages has shown interesting connectivity properties:

56 million Web pages form one large strongly connected component
When viewed as an undirected graph, 150 million pages are in one large connected component
About 50 million pages are disconnected from the large component (they reside in much smaller connected components of their own)

This demonstrates how connected components analysis can reveal the structure of large networks.

Definitions

A path is a sequence of vertices where there is an edge connecting each vertex to the next vertex in the path. If there exists a path from vertex u to w, then we say that vertex w is reachable from vertex u.

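The reachable relation for undirected graphs is an equivalence relation: it is reflexive (every vertex reaches itself by a path of length zero), symmetric (an undirected path can be walked in either direction), and transitive (two paths that share an endpoint can be joined). Connected components are therefore equivalence classes with respect to the reachable relation, and they partition the vertices of a graph into disjoint subsets.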
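One way to see the equivalence classes directly is to build them by merging, as in the sketch below. It uses a disjoint-set (union-find) structure, which is a different technique from the traversal-based method this chapter describes; it is shown only to illustrate the partition. The class DisjointSets and the function same_component are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: a minimal disjoint-set structure over vertices 0..n-1.
class DisjointSets {
public:
  explicit DisjointSets(std::size_t n) : parent_(n) {
    // Every vertex starts in its own class (reflexivity).
    std::iota(parent_.begin(), parent_.end(), std::size_t{0});
  }

  // Representative of the class containing v, shortening the path as it goes.
  std::size_t find(std::size_t v) {
    assert(v < parent_.size());
    while (parent_[v] != v) {
      parent_[v] = parent_[parent_[v]];  // path halving
      v = parent_[v];
    }
    return v;
  }

  // Merge the classes of u and w: an undirected edge u - w makes them mutually reachable.
  void unite(std::size_t u, std::size_t w) {
    std::size_t root_u = find(u);
    std::size_t root_w = find(w);
    parent_[root_u] = root_w;
  }

private:
  std::vector<std::size_t> parent_;
};

// Two vertices are in the same component exactly when they share a representative.
bool same_component(DisjointSets& sets, std::size_t u, std::size_t w) {
  return sets.find(u) == sets.find(w);
}
```

Calling unite once per edge leaves one class per connected component. After unite(0, 1) and unite(1, 2), vertices 0 and 2 share a class although no edge joins them directly, which is transitivity at work.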

Algorithm

Computing the connected components of an undirected graph is a straightforward application of depth-first search:

Run DFS on the graph
Mark all vertices in the same DFS tree as belonging to the same connected component
Increment the component number at each “start vertex” event point

The BGL implementation calls depth_first_search() with a special visitor object that labels each discovered vertex with the current component number.
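The steps above can be sketched without any graph library. The following is a minimal recursive version, assuming the graph is stored as a plain adjacency list in which every undirected edge appears in both directions; AdjList, kUnvisited, dfs_label, and label_components_dfs are hypothetical names for this example, and the code is not the BGL visitor-based implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumed representation: adj[u] lists the neighbors of u, with each
// undirected edge stored in both directions.
using AdjList = std::vector<std::vector<std::size_t>>;

constexpr std::size_t kUnvisited = std::numeric_limits<std::size_t>::max();

// Visit u and everything reachable from it, labeling each vertex as it is discovered.
void dfs_label(const AdjList& adj, std::size_t u, std::size_t id,
               std::vector<std::size_t>& component) {
  component[u] = id;  // corresponds to the "discover vertex" event
  for (std::size_t v : adj[u]) {
    assert(v < adj.size());
    if (component[v] == kUnvisited) {
      dfs_label(adj, v, id, component);
    }
  }
}

// Return component[v] for every vertex v; labels are 0, 1, 2, ... in order
// of the lowest-numbered vertex of each component.
std::vector<std::size_t> label_components_dfs(const AdjList& adj) {
  std::vector<std::size_t> component(adj.size(), kUnvisited);
  std::size_t current = 0;
  for (std::size_t start = 0; start < adj.size(); ++start) {
    if (component[start] != kUnvisited) {
      continue;  // already part of an earlier DFS tree
    }
    // Corresponds to the "start vertex" event: a new DFS tree, so a new component.
    dfs_label(adj, start, current, component);
    ++current;
  }
  return component;
}
```

The recursion depth can reach the number of vertices in a component, so a very large graph would call for an explicit stack instead. The design point is that the outer loop only starts a new search at a vertex no earlier search has reached, which is why the component counter advances exactly once per component.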

NWGraph Implementation

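The NWGraph example below defines its own function, compute_connected_components_bfs, which labels the connected components using breadth-first search (BFS) rather than DFS. Either traversal visits exactly the vertices reachable from its start vertex, so both produce the same partition into components.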

Listing 8 Complete source code

/**
 * @file ch7_connected.cpp
 *
 * @brief Connected Components (BGL Book Chapter 7)
 *
 * This example demonstrates finding connected components in a graph.
 * A connected component is a maximal subgraph where every vertex is
 * reachable from every other vertex.
 *
 * Application: Network analysis, image segmentation, social network clusters.
 *
 * @copyright SPDX-FileCopyrightText: 2022 Battelle Memorial Institute
 * @copyright SPDX-FileCopyrightText: 2022 University of Washington
 *
 * SPDX-License-Identifier: BSD-3-Clause
 *
 * @authors
 *   Andrew Lumsdaine
 *
 */

#include <cstddef>
#include <iostream>
#include <limits>
#include <map>
#include <utility>
#include <vector>

#include "nwgraph/adjacency.hpp"
#include "nwgraph/edge_list.hpp"

using namespace nw::graph;

/**
 * @brief Simple BFS-based connected components algorithm
 *
 * This implementation uses BFS to explore each component. The BGL book
 * uses DFS for the same purpose; either traversal labels the same components.
 */
template <typename Graph>
std::vector<size_t> compute_connected_components_bfs(const Graph& G) {
  size_t N = G.size();
  std::vector<size_t> component(N, std::numeric_limits<size_t>::max());
  size_t current_component = 0;

  for (size_t start = 0; start < N; ++start) {
    if (component[start] != std::numeric_limits<size_t>::max()) {
      continue;  // Already assigned to a component
    }

    // BFS from this vertex
    component[start] = current_component;
    std::vector<size_t> frontier;
    frontier.push_back(start);

    while (!frontier.empty()) {
      std::vector<size_t> next_frontier;
      for (auto u : frontier) {
        for (auto&& [v] : G[u]) {
          if (component[v] == std::numeric_limits<size_t>::max()) {
            component[v] = current_component;
            next_frontier.push_back(v);
          }
        }
      }
      frontier = std::move(next_frontier);
    }

    ++current_component;
  }

  return component;
}

int main() {
  std::cout << "=== Connected Components ===" << std::endl;
  std::cout << "Based on BGL Book Chapter 7" << std::endl << std::endl;

  // Create a graph with multiple connected components
  // Component 0: vertices 0, 1, 2 (triangle)
  // Component 1: vertices 3, 4 (edge)
  // Component 2: vertex 5 (isolated)
  // Component 3: vertices 6, 7, 8 (path)

  std::cout << "Graph structure:" << std::endl;
  std::cout << "  Component 0: 0 - 1 - 2 - 0 (triangle)" << std::endl;
  std::cout << "  Component 1: 3 - 4 (edge)" << std::endl;
  std::cout << "  Component 2: 5 (isolated vertex)" << std::endl;
  std::cout << "  Component 3: 6 - 7 - 8 (path)" << std::endl;
  std::cout << std::endl;

  edge_list<directedness::undirected> edges(9);
  edges.open_for_push_back();
  // Component 0: triangle
  edges.push_back(0, 1);
  edges.push_back(1, 2);
  edges.push_back(2, 0);
  // Component 1: single edge
  edges.push_back(3, 4);
  // Component 2: vertex 5 is isolated (no edges)
  // Component 3: path
  edges.push_back(6, 7);
  edges.push_back(7, 8);
  edges.close_for_push_back();

  adjacency<0> G(edges);

  // Find connected components
  auto component = compute_connected_components_bfs(G);

  // Display results
  std::cout << "Vertex assignments:" << std::endl;
  for (size_t v = 0; v < G.size(); ++v) {
    std::cout << "  Vertex " << v << " -> Component " << component[v] << std::endl;
  }

  // Count components and their sizes
  std::map<size_t, std::vector<size_t>> comp_members;
  for (size_t v = 0; v < G.size(); ++v) {
    comp_members[component[v]].push_back(v);
  }

  std::cout << std::endl;
  std::cout << "Number of connected components: " << comp_members.size() << std::endl;
  std::cout << std::endl;

  std::cout << "Component details:" << std::endl;
  for (auto&& [comp_id, members] : comp_members) {
    std::cout << "  Component " << comp_id << " (size " << members.size() << "): {";
    for (size_t i = 0; i < members.size(); ++i) {
      if (i > 0) std::cout << ", ";
      std::cout << members[i];
    }
    std::cout << "}" << std::endl;
  }

  // Example with a fully connected graph
  std::cout << std::endl;
  std::cout << "=== Fully Connected Graph ===" << std::endl;

  edge_list<directedness::undirected> edges2(5);
  edges2.open_for_push_back();
  for (size_t i = 0; i < 5; ++i) {
    for (size_t j = i + 1; j < 5; ++j) {
      edges2.push_back(i, j);
    }
  }
  edges2.close_for_push_back();

  adjacency<0> G2(edges2);
  auto component2 = compute_connected_components_bfs(G2);

  std::map<size_t, size_t> comp_counts;
  for (auto c : component2) {
    comp_counts[c]++;
  }

  std::cout << "Complete graph K5 has " << comp_counts.size() << " component(s)" << std::endl;
  std::cout << "All " << G2.size() << " vertices belong to component 0" << std::endl;

  return 0;
}

Building and Running the Example

First, configure and build the example:

# From the NWGraph root directory
mkdir -p build && cd build
cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON
make ch7_connected

Then run the example:

./examples/bgl-book/ch7_connected

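The program builds a nine-vertex undirected graph made of a triangle, a single edge, an isolated vertex, and a three-vertex path, then labels its connected components and lists the members of each one. It finishes by repeating the computation on the complete graph K5, which forms a single component.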

Sample Output

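=== Connected Components ===
Based on BGL Book Chapter 7

Graph structure:
  Component 0: 0 - 1 - 2 - 0 (triangle)
  Component 1: 3 - 4 (edge)
  Component 2: 5 (isolated vertex)
  Component 3: 6 - 7 - 8 (path)

Vertex assignments:
  Vertex 0 -> Component 0
  Vertex 1 -> Component 0
  Vertex 2 -> Component 0
  Vertex 3 -> Component 1
  Vertex 4 -> Component 1
  Vertex 5 -> Component 2
  Vertex 6 -> Component 3
  Vertex 7 -> Component 3
  Vertex 8 -> Component 3

Number of connected components: 4

Component details:
  Component 0 (size 3): {0, 1, 2}
  Component 1 (size 2): {3, 4}
  Component 2 (size 1): {5}
  Component 3 (size 3): {6, 7, 8}

=== Fully Connected Graph ===
Complete graph K5 has 1 component(s)
All 5 vertices belong to component 0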

Key NWGraph Features Demonstrated

Undirected graphs: Modeling bidirectional network connections
BFS-based traversal: Using breadth-first search to find components
Component labeling: Assigning each vertex to its component
Equivalence classes: Partitioning vertices into disjoint subsets

References

Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, 4th Edition (2022), Chapter 20.3: Depth-First Search (connected components via DFS forest)
Siek, Lee, and Lumsdaine. The Boost Graph Library (2002), Chapter 7.2