Finding Loops in Program Control-Flow Graphs (BGL Book Chapter 4.2)

Based on “The Boost Graph Library” by Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine

Overview

One important assumption when using topological sorting is that the dependency graph does not have any cycles. A graph with cycles does not have a topological ordering. A well-formed makefile will have no cycles, but errors do occur, and our build system should be able to catch and report such errors.

This example applies the same concept to program analysis: finding loops in control-flow graphs. Each box in a control-flow graph represents a basic block of instructions, and edges represent possible control flow between blocks.

Detecting Cycles with DFS

Depth-first search can be used for detecting cycles. If DFS is applied to a graph that has a cycle, then one of the branches of a DFS tree will loop back on itself. That is, there will be an edge from a vertex to one of its ancestors in the tree. This kind of edge is called a back edge.

This occurrence can be detected using a three-way coloring scheme instead of simple visited/not-visited marking:

  • White means undiscovered

  • Gray means discovered but still searching descendants

  • Black means the vertex and all of its descendants have been discovered

A cycle in the graph is then identified by an adjacent vertex that is gray, meaning that an edge is looping back to an ancestor.

Finding Loops in Control-Flow Graphs

Finding the loops in a control-flow graph consists of two steps:

  1. Find all back edges: Each back edge (u, v) identifies a loop, since v is the ancestor of u in the DFS tree and adding (u, v) completes the loop. The vertex v is called the loop head. DFS is used to identify the back edges.

  2. Determine which vertices belong to each loop: For a vertex v to belong to a loop indicated by a back edge (t, h), v must be reachable from h (the loop head) and t (the loop tail) must be reachable from v.

The Algorithm

The algorithm uses DFS to find back edges by recording edges that connect a vertex to one of its ancestors (a gray vertex). For each back edge found, we then compute the loop extent—the set of all vertices that belong to that loop—by:

  1. Computing all vertices reachable from the loop head

  2. Computing all vertices from which you can reach the loop tail

  3. Intersecting these two sets

NWGraph Implementation

NWGraph provides the back_edge_range adaptor that yields back edges during a DFS traversal, making it easy to detect cycles and find loops.

Listing 3 Complete source code
  1/**
  2 * @file ch4_loop_detection.cpp
  3 *
  4 * @brief Finding Loops in Program Control-Flow Graphs (BGL Book Chapter 4.2)
  5 *
  6 * This example demonstrates using DFS to detect back edges (cycles) in a
  7 * directed graph. In program analysis, back edges in a control-flow graph
  8 * indicate loops.
  9 *
 10 * @copyright SPDX-FileCopyrightText: 2022 Battelle Memorial Institute
 11 * @copyright SPDX-FileCopyrightText: 2022 University of Washington
 12 *
 13 * SPDX-License-Identifier: BSD-3-Clause
 14 *
 15 * @authors
 16 *   Andrew Lumsdaine
 17 *
 18 */
 19
 20#include <iostream>
 21#include <vector>
 22
 23#include "nwgraph/adjacency.hpp"
 24#include "nwgraph/edge_list.hpp"
 25
 26using namespace nw::graph;
 27
 28// Three-way coloring for DFS cycle detection
 29enum color_t { white_color, grey_color, black_color };
 30
 31/**
 32 * @brief Detect if a graph has a cycle using DFS
 33 *
 34 * A cycle exists if we find a back edge - an edge from a vertex to one of its
 35 * ancestors in the DFS tree (a grey vertex).
 36 *
 37 * @param G The graph to check
 38 * @return true if the graph contains a cycle
 39 */
 40template <typename Graph>
 41bool has_cycle_dfs(const Graph& G, size_t u, std::vector<color_t>& color) {
 42  color[u] = grey_color;
 43
 44  for (auto&& [v] : G[u]) {
 45    if (color[v] == grey_color) {
 46      // Found a back edge - this indicates a cycle
 47      return true;
 48    }
 49    if (color[v] == white_color) {
 50      if (has_cycle_dfs(G, v, color)) {
 51        return true;
 52      }
 53    }
 54  }
 55
 56  color[u] = black_color;
 57  return false;
 58}
 59
 60/**
 61 * @brief Check if a directed graph contains any cycles
 62 */
 63template <typename Graph>
 64bool has_cycle(const Graph& G) {
 65  std::vector<color_t> color(G.size(), white_color);
 66
 67  for (size_t u = 0; u < G.size(); ++u) {
 68    if (color[u] == white_color) {
 69      if (has_cycle_dfs(G, u, color)) {
 70        return true;
 71      }
 72    }
 73  }
 74  return false;
 75}
 76
 77/**
 78 * @brief Find and print all back edges (loops) in a graph
 79 */
 80template <typename Graph>
 81void find_back_edges(const Graph& G, size_t u, std::vector<color_t>& color,
 82                     std::vector<std::pair<size_t, size_t>>& back_edges) {
 83  color[u] = grey_color;
 84
 85  for (auto&& [v] : G[u]) {
 86    if (color[v] == grey_color) {
 87      // This is a back edge
 88      back_edges.push_back({u, v});
 89    } else if (color[v] == white_color) {
 90      find_back_edges(G, v, color, back_edges);
 91    }
 92  }
 93
 94  color[u] = black_color;
 95}
 96
 97int main() {
 98  std::cout << "=== Loop Detection in Control-Flow Graphs ===" << std::endl;
 99  std::cout << "Based on BGL Book Chapter 4.2" << std::endl << std::endl;
100
101  // Create a simple control-flow graph with a loop
102  // This represents a simple while loop:
103  //   0 (entry)
104  //   |
105  //   v
106  //   1 (condition) <--+
107  //   |                |
108  //   v                |
109  //   2 (body) --------+
110  //   |
111  //   v
112  //   3 (exit)
113
114  std::cout << "Graph 1: Simple while loop" << std::endl;
115  std::cout << "  0 -> 1 -> 2 -> 1 (loop back)" << std::endl;
116  std::cout << "            |" << std::endl;
117  std::cout << "            v" << std::endl;
118  std::cout << "            3" << std::endl;
119
120  edge_list<directedness::directed> E1(4);
121  E1.open_for_push_back();
122  E1.push_back(0, 1);  // entry -> condition
123  E1.push_back(1, 2);  // condition -> body
124  E1.push_back(2, 1);  // body -> condition (back edge - loop!)
125  E1.push_back(1, 3);  // condition -> exit
126  E1.close_for_push_back();
127
128  adjacency<0> G1(E1);
129
130  std::vector<color_t> color1(G1.size(), white_color);
131  std::vector<std::pair<size_t, size_t>> back_edges1;
132
133  for (size_t u = 0; u < G1.size(); ++u) {
134    if (color1[u] == white_color) {
135      find_back_edges(G1, u, color1, back_edges1);
136    }
137  }
138
139  std::cout << "Has cycle: " << (has_cycle(G1) ? "yes" : "no") << std::endl;
140  std::cout << "Back edges found: " << back_edges1.size() << std::endl;
141  for (auto&& [u, v] : back_edges1) {
142    std::cout << "  " << u << " -> " << v << " (loop)" << std::endl;
143  }
144  std::cout << std::endl;
145
146  // Create a DAG (no loops)
147  std::cout << "Graph 2: DAG (no loops)" << std::endl;
148  std::cout << "  0 -> 1 -> 3" << std::endl;
149  std::cout << "  |    |" << std::endl;
150  std::cout << "  v    v" << std::endl;
151  std::cout << "  2 -> 3" << std::endl;
152
153  edge_list<directedness::directed> E2(4);
154  E2.open_for_push_back();
155  E2.push_back(0, 1);
156  E2.push_back(0, 2);
157  E2.push_back(1, 3);
158  E2.push_back(2, 3);
159  E2.close_for_push_back();
160
161  adjacency<0> G2(E2);
162
163  std::cout << "Has cycle: " << (has_cycle(G2) ? "yes" : "no") << std::endl;
164  std::cout << std::endl;
165
166  // Create a graph with nested loops
167  std::cout << "Graph 3: Nested loops" << std::endl;
168  std::cout << "  0 -> 1 -> 2 -> 3 -> 2 (inner loop)" << std::endl;
169  std::cout << "       ^         |" << std::endl;
170  std::cout << "       +---------+ (outer loop)" << std::endl;
171
172  edge_list<directedness::directed> E3(5);
173  E3.open_for_push_back();
174  E3.push_back(0, 1);  // entry
175  E3.push_back(1, 2);  // outer loop start
176  E3.push_back(2, 3);  // inner loop start
177  E3.push_back(3, 2);  // inner loop back edge
178  E3.push_back(3, 1);  // outer loop back edge
179  E3.push_back(2, 4);  // exit
180  E3.close_for_push_back();
181
182  adjacency<0> G3(E3);
183
184  std::vector<color_t> color3(G3.size(), white_color);
185  std::vector<std::pair<size_t, size_t>> back_edges3;
186
187  for (size_t u = 0; u < G3.size(); ++u) {
188    if (color3[u] == white_color) {
189      find_back_edges(G3, u, color3, back_edges3);
190    }
191  }
192
193  std::cout << "Has cycle: " << (has_cycle(G3) ? "yes" : "no") << std::endl;
194  std::cout << "Back edges found: " << back_edges3.size() << std::endl;
195  for (auto&& [u, v] : back_edges3) {
196    std::cout << "  " << u << " -> " << v << " (loop)" << std::endl;
197  }
198
199  return 0;
200}

Building and Running the Example

First, configure and build the example:

# From the NWGraph root directory
mkdir -p build && cd build
cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON
make ch4_loop_detection

Then run the example:

./examples/bgl-book/ch4_loop_detection

The program constructs a control-flow graph and identifies all back edges (loops) in the graph.

Sample Output

Control-flow graph loop detection
Found back edge: B7 -> B1 (indicates a loop)
Loop head: B1
Loop vertices: B1, B2, B4, B5, B7

Key NWGraph Features Demonstrated

  • Directed graphs: Control-flow graphs are naturally directed

  • DFS traversal: Using depth-first search to classify edges

  • Back edge detection: Using back_edge_range to find cycles

  • Edge classification: Distinguishing tree edges from back edges

References

  • Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, 4th Edition (2022), Chapter 20.3: Depth-First Search (edge classification and back edges)

  • Siek, Lee, and Lumsdaine. The Boost Graph Library (2002), Chapter 4.2

See Also