Finding Loops in Program Control-Flow Graphs (BGL Book Chapter 4.2)
Based on “The Boost Graph Library” by Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine
Overview
One important assumption when using topological sorting is that the dependency graph does not have any cycles. A graph with cycles does not have a topological ordering. A well-formed makefile will have no cycles, but errors do occur, and our build system should be able to catch and report such errors.
This example applies the same concept to program analysis: finding loops in control-flow graphs. Each box in a control-flow graph represents a basic block of instructions, and edges represent possible control flow between blocks.
Detecting Cycles with DFS
Depth-first search can be used for detecting cycles. If DFS is applied to a graph that has a cycle, then one of the branches of a DFS tree will loop back on itself. That is, there will be an edge from a vertex to one of its ancestors in the tree. This kind of edge is called a back edge.
This occurrence can be detected using a three-way coloring scheme instead of simple visited/not-visited marking:
White means undiscovered
Gray means discovered but still searching descendants
Black means the vertex and all of its descendants have been discovered
A cycle in the graph is then identified by an adjacent vertex that is gray, meaning that an edge is looping back to an ancestor.
Finding Loops in Control-Flow Graphs
Finding the loops in a control-flow graph consists of two steps:
Find all back edges: Each back edge (u, v) identifies a loop, since v is the ancestor of u in the DFS tree and adding (u, v) completes the loop. The vertex v is called the loop head. DFS is used to identify the back edges.
Determine which vertices belong to each loop: For a vertex v to belong to a loop indicated by a back edge (t, h), v must be reachable from h (the loop head) and t (the loop tail) must be reachable from v.
The Algorithm
The algorithm uses DFS to find back edges by recording edges that connect a vertex to one of its ancestors (a gray vertex). For each back edge found, we then compute the loop extent—the set of all vertices that belong to that loop—by:
Computing all vertices reachable from the loop head
Computing all vertices from which you can reach the loop tail
Intersecting these two sets
NWGraph Implementation
NWGraph provides the back_edge_range adaptor that yields back edges during a DFS
traversal, making it easy to detect cycles and find loops.
1/**
2 * @file ch4_loop_detection.cpp
3 *
4 * @brief Finding Loops in Program Control-Flow Graphs (BGL Book Chapter 4.2)
5 *
6 * This example demonstrates using DFS to detect back edges (cycles) in a
7 * directed graph. In program analysis, back edges in a control-flow graph
8 * indicate loops.
9 *
10 * @copyright SPDX-FileCopyrightText: 2022 Battelle Memorial Institute
11 * @copyright SPDX-FileCopyrightText: 2022 University of Washington
12 *
13 * SPDX-License-Identifier: BSD-3-Clause
14 *
15 * @authors
16 * Andrew Lumsdaine
17 *
18 */
19
20#include <iostream>
21#include <vector>
22
23#include "nwgraph/adjacency.hpp"
24#include "nwgraph/edge_list.hpp"
25
26using namespace nw::graph;
27
28// Three-way coloring for DFS cycle detection
29enum color_t { white_color, grey_color, black_color };
30
31/**
32 * @brief Detect if a graph has a cycle using DFS
33 *
34 * A cycle exists if we find a back edge - an edge from a vertex to one of its
35 * ancestors in the DFS tree (a grey vertex).
36 *
37 * @param G The graph to check
38 * @return true if the graph contains a cycle
39 */
40template <typename Graph>
41bool has_cycle_dfs(const Graph& G, size_t u, std::vector<color_t>& color) {
42 color[u] = grey_color;
43
44 for (auto&& [v] : G[u]) {
45 if (color[v] == grey_color) {
46 // Found a back edge - this indicates a cycle
47 return true;
48 }
49 if (color[v] == white_color) {
50 if (has_cycle_dfs(G, v, color)) {
51 return true;
52 }
53 }
54 }
55
56 color[u] = black_color;
57 return false;
58}
59
60/**
61 * @brief Check if a directed graph contains any cycles
62 */
63template <typename Graph>
64bool has_cycle(const Graph& G) {
65 std::vector<color_t> color(G.size(), white_color);
66
67 for (size_t u = 0; u < G.size(); ++u) {
68 if (color[u] == white_color) {
69 if (has_cycle_dfs(G, u, color)) {
70 return true;
71 }
72 }
73 }
74 return false;
75}
76
77/**
78 * @brief Find and print all back edges (loops) in a graph
79 */
80template <typename Graph>
81void find_back_edges(const Graph& G, size_t u, std::vector<color_t>& color,
82 std::vector<std::pair<size_t, size_t>>& back_edges) {
83 color[u] = grey_color;
84
85 for (auto&& [v] : G[u]) {
86 if (color[v] == grey_color) {
87 // This is a back edge
88 back_edges.push_back({u, v});
89 } else if (color[v] == white_color) {
90 find_back_edges(G, v, color, back_edges);
91 }
92 }
93
94 color[u] = black_color;
95}
96
97int main() {
98 std::cout << "=== Loop Detection in Control-Flow Graphs ===" << std::endl;
99 std::cout << "Based on BGL Book Chapter 4.2" << std::endl << std::endl;
100
101 // Create a simple control-flow graph with a loop
102 // This represents a simple while loop:
103 // 0 (entry)
104 // |
105 // v
106 // 1 (condition) <--+
107 // | |
108 // v |
109 // 2 (body) --------+
110 // |
111 // v
112 // 3 (exit)
113
114 std::cout << "Graph 1: Simple while loop" << std::endl;
115 std::cout << " 0 -> 1 -> 2 -> 1 (loop back)" << std::endl;
116 std::cout << " |" << std::endl;
117 std::cout << " v" << std::endl;
118 std::cout << " 3" << std::endl;
119
120 edge_list<directedness::directed> E1(4);
121 E1.open_for_push_back();
122 E1.push_back(0, 1); // entry -> condition
123 E1.push_back(1, 2); // condition -> body
124 E1.push_back(2, 1); // body -> condition (back edge - loop!)
125 E1.push_back(1, 3); // condition -> exit
126 E1.close_for_push_back();
127
128 adjacency<0> G1(E1);
129
130 std::vector<color_t> color1(G1.size(), white_color);
131 std::vector<std::pair<size_t, size_t>> back_edges1;
132
133 for (size_t u = 0; u < G1.size(); ++u) {
134 if (color1[u] == white_color) {
135 find_back_edges(G1, u, color1, back_edges1);
136 }
137 }
138
139 std::cout << "Has cycle: " << (has_cycle(G1) ? "yes" : "no") << std::endl;
140 std::cout << "Back edges found: " << back_edges1.size() << std::endl;
141 for (auto&& [u, v] : back_edges1) {
142 std::cout << " " << u << " -> " << v << " (loop)" << std::endl;
143 }
144 std::cout << std::endl;
145
146 // Create a DAG (no loops)
147 std::cout << "Graph 2: DAG (no loops)" << std::endl;
148 std::cout << " 0 -> 1 -> 3" << std::endl;
149 std::cout << " | |" << std::endl;
150 std::cout << " v v" << std::endl;
151 std::cout << " 2 -> 3" << std::endl;
152
153 edge_list<directedness::directed> E2(4);
154 E2.open_for_push_back();
155 E2.push_back(0, 1);
156 E2.push_back(0, 2);
157 E2.push_back(1, 3);
158 E2.push_back(2, 3);
159 E2.close_for_push_back();
160
161 adjacency<0> G2(E2);
162
163 std::cout << "Has cycle: " << (has_cycle(G2) ? "yes" : "no") << std::endl;
164 std::cout << std::endl;
165
166 // Create a graph with nested loops
167 std::cout << "Graph 3: Nested loops" << std::endl;
168 std::cout << " 0 -> 1 -> 2 -> 3 -> 2 (inner loop)" << std::endl;
169 std::cout << " ^ |" << std::endl;
170 std::cout << " +---------+ (outer loop)" << std::endl;
171
172 edge_list<directedness::directed> E3(5);
173 E3.open_for_push_back();
174 E3.push_back(0, 1); // entry
175 E3.push_back(1, 2); // outer loop start
176 E3.push_back(2, 3); // inner loop start
177 E3.push_back(3, 2); // inner loop back edge
178 E3.push_back(3, 1); // outer loop back edge
179 E3.push_back(2, 4); // exit
180 E3.close_for_push_back();
181
182 adjacency<0> G3(E3);
183
184 std::vector<color_t> color3(G3.size(), white_color);
185 std::vector<std::pair<size_t, size_t>> back_edges3;
186
187 for (size_t u = 0; u < G3.size(); ++u) {
188 if (color3[u] == white_color) {
189 find_back_edges(G3, u, color3, back_edges3);
190 }
191 }
192
193 std::cout << "Has cycle: " << (has_cycle(G3) ? "yes" : "no") << std::endl;
194 std::cout << "Back edges found: " << back_edges3.size() << std::endl;
195 for (auto&& [u, v] : back_edges3) {
196 std::cout << " " << u << " -> " << v << " (loop)" << std::endl;
197 }
198
199 return 0;
200}
Building and Running the Example
First, configure and build the example:
# From the NWGraph root directory
mkdir -p build && cd build
cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON
make ch4_loop_detection
Then run the example:
./examples/bgl-book/ch4_loop_detection
The program constructs a control-flow graph and identifies all back edges (loops) in the graph.
Sample Output
Control-flow graph loop detection
Found back edge: B7 -> B1 (indicates a loop)
Loop head: B1
Loop vertices: B1, B2, B4, B5, B7
Key NWGraph Features Demonstrated
Directed graphs: Control-flow graphs are naturally directed
DFS traversal: Using depth-first search to classify edges
Back edge detection: Using
back_edge_rangeto find cyclesEdge classification: Distinguishing tree edges from back edges
References
Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, 4th Edition (2022), Chapter 20.3: Depth-First Search (edge classification and back edges)
Siek, Lee, and Lumsdaine. The Boost Graph Library (2002), Chapter 4.2
See Also
File Dependencies - Topological Sort (BGL Book Chapter 3) - Topological sorting (requires acyclic graphs)
Six Degrees of Kevin Bacon (BGL Book Chapter 4.1) - BFS for shortest paths
Boost Graph Library Examples (Rewritten for NW Graph) - BGL Book examples overview