Connected Components (BGL Book Chapter 7)

Based on “The Boost Graph Library” by Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine

Overview

One basic question about a network is which vertices are reachable from one another. For example, a well-designed Web site should have enough links between Web pages so that all pages can be reached from the home page.

A connected component is a group of vertices in an undirected graph that are reachable from one another. In a directed graph, groups of vertices that are mutually reachable are called strongly connected components.

Internet Connectivity Study

A study of 200 million Web pages has shown interesting connectivity properties:

  • 56 million Web pages form one large strongly connected component

  • When viewed as an undirected graph, 150 million pages are in one large connected component

  • About 50 million pages are disconnected from the large component (they reside in much smaller connected components of their own)

This demonstrates how connected components analysis can reveal the structure of large networks.

Definitions

A path is a sequence of vertices where there is an edge connecting each vertex to the next vertex in the path. If there exists a path from vertex u to w, then we say that vertex w is reachable from vertex u.

The reachable relation for undirected graphs is an equivalence relation: it is reflexive, symmetric, and transitive. Connected components are therefore equivalence classes with respect to the reachable relation, and they partition the vertices of a graph into disjoint subsets.

Algorithm

Computing the connected components of an undirected graph is a straightforward application of depth-first search:

  1. Run DFS on the graph

  2. Mark all vertices in the same DFS tree as belonging to the same connected component

  3. Increment the component number at each “start vertex” event point

The BGL implementation calls depth_first_search() with a special visitor object that labels each discovered vertex with the current component number.

NWGraph Implementation

NWGraph provides a connected_components function that computes all connected components using DFS.

Listing 8 Complete source code
  1/**
  2 * @file ch7_connected.cpp
  3 *
  4 * @brief Connected Components (BGL Book Chapter 7)
  5 *
  6 * This example demonstrates finding connected components in a graph.
  7 * A connected component is a maximal subgraph where every vertex is
  8 * reachable from every other vertex.
  9 *
 10 * Application: Network analysis, image segmentation, social network clusters.
 11 *
 12 * @copyright SPDX-FileCopyrightText: 2022 Battelle Memorial Institute
 13 * @copyright SPDX-FileCopyrightText: 2022 University of Washington
 14 *
 15 * SPDX-License-Identifier: BSD-3-Clause
 16 *
 17 * @authors
 18 *   Andrew Lumsdaine
 19 *
 20 */
 21
 22#include <algorithm>
 23#include <iostream>
 24#include <limits>
 25#include <map>
 26#include <vector>
 27
 28#include "nwgraph/adjacency.hpp"
 29#include "nwgraph/edge_list.hpp"
 30
 31using namespace nw::graph;
 32
 33/**
 34 * @brief Simple BFS-based connected components algorithm
 35 *
 36 * This implementation uses BFS to explore each component, similar to
 37 * the approach in the BGL book.
 38 */
 39template <typename Graph>
 40std::vector<size_t> compute_connected_components_bfs(const Graph& G) {
 41  size_t N = G.size();
 42  std::vector<size_t> component(N, std::numeric_limits<size_t>::max());
 43  size_t current_component = 0;
 44
 45  for (size_t start = 0; start < N; ++start) {
 46    if (component[start] != std::numeric_limits<size_t>::max()) {
 47      continue;  // Already assigned to a component
 48    }
 49
 50    // BFS from this vertex
 51    component[start] = current_component;
 52    std::vector<size_t> frontier;
 53    frontier.push_back(start);
 54
 55    while (!frontier.empty()) {
 56      std::vector<size_t> next_frontier;
 57      for (auto u : frontier) {
 58        for (auto&& [v] : G[u]) {
 59          if (component[v] == std::numeric_limits<size_t>::max()) {
 60            component[v] = current_component;
 61            next_frontier.push_back(v);
 62          }
 63        }
 64      }
 65      frontier = std::move(next_frontier);
 66    }
 67
 68    ++current_component;
 69  }
 70
 71  return component;
 72}
 73
 74int main() {
 75  std::cout << "=== Connected Components ===" << std::endl;
 76  std::cout << "Based on BGL Book Chapter 7" << std::endl << std::endl;
 77
 78  // Create a graph with multiple connected components
 79  // Component 0: vertices 0, 1, 2 (triangle)
 80  // Component 1: vertices 3, 4 (edge)
 81  // Component 2: vertex 5 (isolated)
 82  // Component 3: vertices 6, 7, 8 (path)
 83
 84  std::cout << "Graph structure:" << std::endl;
 85  std::cout << "  Component 0: 0 - 1 - 2 - 0 (triangle)" << std::endl;
 86  std::cout << "  Component 1: 3 - 4 (edge)" << std::endl;
 87  std::cout << "  Component 2: 5 (isolated vertex)" << std::endl;
 88  std::cout << "  Component 3: 6 - 7 - 8 (path)" << std::endl;
 89  std::cout << std::endl;
 90
 91  edge_list<directedness::undirected> edges(9);
 92  edges.open_for_push_back();
 93  // Component 0: triangle
 94  edges.push_back(0, 1);
 95  edges.push_back(1, 2);
 96  edges.push_back(2, 0);
 97  // Component 1: single edge
 98  edges.push_back(3, 4);
 99  // Component 2: vertex 5 is isolated (no edges)
100  // Component 3: path
101  edges.push_back(6, 7);
102  edges.push_back(7, 8);
103  edges.close_for_push_back();
104
105  adjacency<0> G(edges);
106
107  // Find connected components
108  auto component = compute_connected_components_bfs(G);
109
110  // Display results
111  std::cout << "Vertex assignments:" << std::endl;
112  for (size_t v = 0; v < G.size(); ++v) {
113    std::cout << "  Vertex " << v << " -> Component " << component[v] << std::endl;
114  }
115
116  // Count components and their sizes
117  std::map<size_t, std::vector<size_t>> comp_members;
118  for (size_t v = 0; v < G.size(); ++v) {
119    comp_members[component[v]].push_back(v);
120  }
121
122  std::cout << std::endl;
123  std::cout << "Number of connected components: " << comp_members.size() << std::endl;
124  std::cout << std::endl;
125
126  std::cout << "Component details:" << std::endl;
127  for (auto&& [comp_id, members] : comp_members) {
128    std::cout << "  Component " << comp_id << " (size " << members.size() << "): {";
129    for (size_t i = 0; i < members.size(); ++i) {
130      if (i > 0) std::cout << ", ";
131      std::cout << members[i];
132    }
133    std::cout << "}" << std::endl;
134  }
135
136  // Example with a fully connected graph
137  std::cout << std::endl;
138  std::cout << "=== Fully Connected Graph ===" << std::endl;
139
140  edge_list<directedness::undirected> edges2(5);
141  edges2.open_for_push_back();
142  for (size_t i = 0; i < 5; ++i) {
143    for (size_t j = i + 1; j < 5; ++j) {
144      edges2.push_back(i, j);
145    }
146  }
147  edges2.close_for_push_back();
148
149  adjacency<0> G2(edges2);
150  auto component2 = compute_connected_components_bfs(G2);
151
152  std::map<size_t, size_t> comp_counts;
153  for (auto c : component2) {
154    comp_counts[c]++;
155  }
156
157  std::cout << "Complete graph K5 has " << comp_counts.size() << " component(s)" << std::endl;
158  std::cout << "All " << G2.size() << " vertices belong to component 0" << std::endl;
159
160  return 0;
161}

Building and Running the Example

First, configure and build the example:

# From the NWGraph root directory
mkdir -p build && cd build
cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON
make ch7_connected

Then run the example:

./examples/bgl-book/ch7_connected

The program uses a sample network of Internet routers and identifies the connected components, showing which groups of routers can reach each other.

Sample Output

Internet Router Network - Connected Components

Found 4 connected components

Component 0: boston1-br1, cambridge1-nbr2, nycmny1-cr1, chicago1-nbr1
Component 1: engr-fe21, shub-e27
Component 2: albnxg1, teledk, gw-dkuug
Component 3: lilac-dmc, helios, rip, ccn-nerif35, ccngw-ner-cc

Key NWGraph Features Demonstrated

  • Undirected graphs: Modeling bidirectional network connections

  • DFS-based traversal: Using depth-first search to find components

  • Component labeling: Assigning each vertex to its component

  • Equivalence classes: Partitioning vertices into disjoint subsets

References

  • Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, 4th Edition (2022), Chapter 20.3: Depth-First Search (connected components via DFS forest)

  • Siek, Lee, and Lumsdaine. The Boost Graph Library (2002), Chapter 7.2

See Also