Six Degrees of Kevin Bacon (BGL Book Chapter 4.1)
Based on “The Boost Graph Library” by Jeremy Siek, Lie-Quan Lee, and Andrew Lumsdaine
Overview
An amusing application of breadth-first search comes up in the popular game “Six Degrees of Kevin Bacon.” The idea of the game is to connect an actor to Kevin Bacon through a trail of actors who appeared together in movies, and do so in less than six steps.
For example, Theodore Hesburgh (President Emeritus of the University of Notre Dame) was in the movie Rudy with the actor Gerry Becker, who was in the movie Sleepers with Kevin Bacon. Why Kevin Bacon? For some reason, the three students who invented the game— Mike Ginelli, Craig Fass, and Brian Turtle—decided that Kevin Bacon was the center of the entertainment world.
Mathematicians play a similar game; they keep track of their Erdős number, which is the number of co-authored publications that separate them from the famous Paul Erdős.
The Graph Model
The “Six Degrees of Kevin Bacon” game is really a graph problem. The graph representing the problem can be modeled by:
Assigning a vertex for each actor
Creating an edge between two vertices if the corresponding actors have appeared together in a movie
Since the relationship between actors appearing together in a movie is symmetric, edges between actors can be undirected, resulting in an undirected graph.
Algorithm Description
The problem of finding a trail of actors to Kevin Bacon becomes a traditional graph problem—that of finding a path between two vertices. Since we wish to find a path that is shorter than six steps, ideally we would like to find the shortest path between vertices.
Breadth-first search (BFS) can be used to find shortest paths in unweighted graphs. Similar to the Erdős number, we use the term Bacon number to mean the shortest path length from a given actor to Kevin Bacon.
BFS works by exploring vertices in order of their distance from a source vertex:
Start at the source vertex (Kevin Bacon)
Visit all vertices one edge away
Then visit all vertices two edges away
Continue until all reachable vertices are visited
The distance at which each vertex is discovered is its shortest distance from the source.
NWGraph Implementation
NWGraph provides a bfs_range adaptor that yields vertices in BFS order, making it
easy to compute shortest path distances.
1/**
2 * @file ch4_kevin_bacon.cpp
3 *
4 * @brief Six Degrees of Kevin Bacon (BGL Book Chapter 4.1)
5 *
6 * This example demonstrates BFS to compute the "Bacon number" - the number
7 * of connections from any actor to Kevin Bacon through co-starring in movies.
8 *
9 * @copyright SPDX-FileCopyrightText: 2022 Battelle Memorial Institute
10 * @copyright SPDX-FileCopyrightText: 2022 University of Washington
11 *
12 * SPDX-License-Identifier: BSD-3-Clause
13 *
14 * @authors
15 * Andrew Lumsdaine
16 *
17 */
18
19#include <fstream>
20#include <iostream>
21#include <map>
22#include <sstream>
23#include <string>
24#include <vector>
25
26#include "nwgraph/adjacency.hpp"
27#include "nwgraph/adaptors/bfs_edge_range.hpp"
28#include "nwgraph/edge_list.hpp"
29
30using namespace nw::graph;
31
32auto parse_buffer(const std::string& buffer) {
33 std::stringstream link(buffer);
34 std::string actor_one, movie_name, actor_two;
35 getline(link, actor_one, ';');
36 getline(link, movie_name, ';');
37 getline(link, actor_two);
38 return std::make_tuple(actor_one, movie_name, actor_two);
39}
40
41auto read_imdb(const std::string& path, std::map<std::string, size_t>& actor_id_map) {
42 std::string buffer;
43 size_t actor_vertex_counter = 0;
44 std::ifstream datastream(path);
45
46 edge_list<directedness::undirected, std::string> imdb(0);
47 imdb.open_for_push_back();
48 while (getline(datastream, buffer)) {
49 auto&& [actor_one, movie_name, actor_two] = parse_buffer(buffer);
50
51 auto&& [key_val_one, inserted_one] = actor_id_map.insert({ actor_one, actor_vertex_counter });
52 if (inserted_one) ++actor_vertex_counter;
53 auto&& [dummy_one, index_one] = *key_val_one;
54
55 auto&& [key_val_two, inserted_two] = actor_id_map.insert({ actor_two, actor_vertex_counter });
56 if (inserted_two) ++actor_vertex_counter;
57 auto&& [dummy_two, index_two] = *key_val_two;
58
59 imdb.push_back(index_one, index_two, movie_name);
60 }
61 imdb.close_for_push_back();
62
63 adjacency<0, std::string> A(imdb);
64 return A;
65}
66
67int main(int argc, char* argv[]) {
68 std::cout << "=== Six Degrees of Kevin Bacon ===" << std::endl;
69 std::cout << "Based on BGL Book Chapter 4.1" << std::endl << std::endl;
70
71 if (argc != 2) {
72 std::cerr << "Usage: " << argv[0] << " scsv_file" << std::endl;
73 return -1;
74 }
75
76 std::map<std::string, size_t> actor_id_map;
77 auto&& A = read_imdb(argv[1], actor_id_map);
78
79 size_t N = A.size();
80 std::vector<size_t> bacon_number(N);
81
82 auto it = actor_id_map.find("Kevin Bacon");
83 if (it == actor_id_map.end()) {
84 std::cerr << "Kevin Bacon not found in the database" << std::endl;
85 return -1;
86 }
87 auto&& [kb, kb_id] = *it;
88
89 for (auto&& [v, u, movie] : bfs_edge_range(A, kb_id)) {
90 bacon_number[u] = bacon_number[v] + 1;
91 }
92
93 std::cout << "Bacon numbers:" << std::endl;
94 for (auto&& [actor, id] : actor_id_map) {
95 std::cout << " " << actor << " has Bacon number of " << bacon_number[id] << std::endl;
96 }
97
98 return 0;
99}
Building and Running the Example
First, configure and build the example:
# From the NWGraph root directory
mkdir -p build && cd build
cmake .. -DNWGRAPH_BUILD_EXAMPLES=ON
make ch4_kevin_bacon
Then run the example:
./examples/bgl-book/ch4_kevin_bacon
The program uses built-in sample data and computes the Bacon number for each actor.
Sample Output
Kevin Bacon has a Bacon number of 0
Patrick Stewart has a Bacon number of 1
Steve Martin has a Bacon number of 1
Glenn Close has a Bacon number of 1
Tom Hanks has a Bacon number of 1
...
Key NWGraph Features Demonstrated
Undirected graphs: Using symmetric edges for actor relationships
BFS traversal: Using
bfs_rangeto compute shortest pathsProperty maps: Associating actor names with vertices
Unweighted shortest paths: Computing Bacon numbers via BFS distances
References
Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, 4th Edition (2022), Chapter 20.2: Breadth-First Search
Siek, Lee, and Lumsdaine. The Boost Graph Library (2002), Chapter 4.1
See Also
Finding Loops in Program Control-Flow Graphs (BGL Book Chapter 4.2) - Using DFS to detect cycles
Internet Routing with Dijkstra’s Algorithm (BGL Book Chapter 5.4) - Weighted shortest paths
Boost Graph Library Examples (Rewritten for NW Graph) - BGL Book examples overview