Generic Graph Algorithms
In this section we analyze the requirements for graph algorithms in
order to derive generic graph algorithms. In NWGraph, these generic
algorithms are realized as function templates, and the type requirements
are realized as C++20 concept
s. Our process centers on defining
type requirements at the interfaces to algorithms based on what the
algorithms actually need from their types, rather than starting with
graph types and building algorithms to those types.
Algorithm Requirements
Algorithms in the STL operate over containers. The concepts defined for the STL have to do with mechanisms for traversing a container and accessing the data therein. Since graphs in some sense are also containers of data, we can reuse the mechanisms from the STL for traversing graphs and accessing graph data, to the extent that makes sense. However, graphs are structured data and graph algorithms traverse that structure in various ways. Accordingly, our graph concepts must support structured traversal of graphs.
Most (but not all) graph algorithms traverse a graph vertex to vertex by following the edges that connect vertices. For implementing such algorithms, it is assumed that a graph \(G = \{V, E\}\) is represented with an adjacency-list structure [1]_ as defined in .
Requirements for Concrete Algorithms
A prototypical algorithm in this class is the breadth-first search (BFS) algorithm. The pseudocode for this algorithm, along with its C++ implementation is shown in Fig. [fig:bfs]. The algorithm is abbreviated from:cite:cormen2009introduction. Modulo some type declarations that would be necessary for real code to compile, but which can be omitted from pseudocode, the C++ code, using out of the box language mechanisms and library components, has essentially a one-one correspondence to the pseudocode.
\(\displaystyle color[u] \gets \const{\textsc{white}}\) \(\displaystyle u\gets\proc{Dequeue}{(Q)}\) \(\displaystyle color[v] \gets \const{gray}\)
void bfs(const Graph& G, int s) { ...
for (int u = 0; u < size(G); ++u)
color[u] = WHITE;
color[s] = GREY;
std::queue<int> Q;
Q.push(s);
while (!Q.empty()) {
auto u = Q.front(); Q.pop();
for (auto&& v : G[u]) {
if (color[v] == WHITE) {
color[v] == GREY;
Q.push(v); }}
color[u] = BLACK; }}
From this implementation we can extract an initial set of requirements for the BFS algorithm:
The graph
G
is a random access range, meaning it can be indexed into with an object (of its difference type) and it has a size.The value type of
G
(the inner range ofG
) is a forward range, meaning it is something that can be iterated over and have values extracted.The value type of the inner range is something that can be used to index into
G
.All elements stored in
G
must be able to correctly index into it, meaning their value are between 0 andsize(G)-1
, inclusive.
Associated with the concepts of a random access range and forward range
are complexity guarantees (which are also implied by the theoretical
algorithm). Indexing into G
is a constant-time operation and
iterating over the elements in G[u]
is linear in the number of
elements stored in G[u]
.
As an example, we could use any of the following compositions of
standard library components for the Graph
datatype above:
using Graph = std::vector<std::list<int>>;
using Graph = std::vector<std::vector<unsigned>>;
using Graph = std::vector<std::forward_list<size_t>>;
In fact, any Graph
data structure meeting the above requirements
could be used.
We now have a set of requirements for a concrete implementation of BFS. Following the generic programming process, there are various aspects of the implementation that we could begin lifting. Ultimately, as with the STL, we want a set of concepts useful across families of graph algorithms. So rather than lifting BFS in isolation, we now examine concrete implementations of other algorithms in order to identify common functionality that can be lifted in order to unify abstractions.
\(\displaystyle d[u] \gets \infty\) \(\displaystyle u\gets\proc{Extract-Min}{(Q)}\) \(d[v] \gets d[u] + w(u,v)\)
void dijkstra(const Graph& G, int s) {...
for (int u = 0; u < size(G); ++u) {
d[u] = INF;
pi[u] = NIL; }
d[s] = 0
for (int u = 0; u < size(G); ++u)
Q.push({u, d[u]});
while (!Q.empty()) {
auto [u, x] = Q.top(); Q.pop();
for (auto&& [v, w] : G[u]) {
if (d[u] + w < d[v]) {
d[v] = d[u] + w;
pi[v] = u; }}}}
Fig. [fig:dijkstra] shows the pseudocode and
corresponding C++ implementation for Dijksra’s algorithm for solving the
single-source shortest paths problem. From this implementation we can
extract an initial set of requirements for the concrete dijkstra
algorithm:
The graph
G
is a random access range.The value type of
G
(the inner range ofG
) is a forward range.The value type of the inner range is a pair, consisting of a something we will call a vertex type and something we will call a weight.
The vertex type is something that can be used to index into
G
.All values stored as vertex types in
G
must be able to correctly index intoG
meaning their value are between 0 andsize(G)-1
, inclusive.
Just as the code of dijkstra
is similar to bfs
, some of these
requirements are also the same. However, the key difference is in what
is stored inside of the graph. This implementation of dijkstra
assumes that the graph stores a tuple consisting of a vertex value and
an edge weight. That is, rather than the Graph
types shown above,
we could use the following for dijkstra
:
using Graph = std::vector<std::list<std::tuple<size_t, int>>>;
using Graph = std::vector<std::vector<std::tuple<unsigned, double>>>;
using Graph = std::vector<std::forward_list<std::tuple<int, float>>>;
This is a different kind of graph than we had for bfs
, which only
stored a value. Yet, even a graph that stores a weight on its edge is
suitable for BFS exploration. Similarly, a graph without a weight on its
edge should be suitable for Dijkstra’s algorithm, provided a weight
value can be provided in some way, or a default value, say, 1, used.
Lifting
From the foregoing discussion, we have two pieces of functionality we need to lift. First, we need to lift how the neighbor vertex is stored so that whether it is stored as a direct value or as part of a tuple (or any other way), it can be obtained. Second, we need to lift how weights (or, more generally, properties) are stored on edges. And, finally, implied when we say we want to use different kinds of graphs with these algorithms, we need to parameterize them on the graph type (make them function templates rather than functions).
Parameterizing the Graph Type
In this lifting process we will be building up to a concept, which we
will illustrate by lifting dijkstra
. We begin by presenting its type
parameterization. The prototype for a dijkstra
function template
based on our previous definition would be
template <class Graph>
auto dijkstra(const Graph& G, vertex_id_t<G> s);
Note that we have parameterized two things: the Graph
type itself,
as well as the type of the starting vertex s
. In this case, the
vertex type is not arbitrary, it is related to the type of the graph,
and so we have a type primitive vertex_id_t
that returns the type of
the vertex associated with graph G
.
We can update some of the previous requirements for the
type-parameterized dijkstra
:
Graph
must meet the requirements ofrandom_access_range
.The value type of
Graph
(the inner range ofGraph
) must meet the requirements offorward_range
.The type
vertex_id_t<Graph>
is an associated type ofGraph
.The type
vertex_id_t<Graph>
is convertible to therange_difference_t
ofGraph
(that is, it can be used to index into aGraph
).
Both classes of graphs that we had previously seen for bfs
and
dijkstra
satisfy these requirements (which are more general than
either of the previous requirements). For example, both of the following
compound structures
std::vector<std::vector<int>>;
std::vector<std::vector<std::tuple<int, float, double>>>;
satisfy the lifted requirements, (provided a suitable overload of
vertex_id_t
is defined) though each would have only satisfied one of
the previous requirements. Note however, we still need to do more
lifting before we can actually compose bfs
or dijkstra
with
these types.
Lifting Neighbor Access
How a neighbor is stored is dependent on the graph structure itself. We
want the mechanism for accessing it therefore to vary based on the graph
type. In keeping with standard C++ practice – and since we want to be
able to use with C++ standard library containers, we adopt a polymorphic
free function interface to abstract the process of accessing a neighbor.
In particular, we define a target
customization point object (CPO)
to abstract how a neighbor vertex is accessed, given an object obtained
from traversing the neighbor list.
If variable
G
is of typeGraph
and variablee
is of the value type of the inner range ofGraph
, thentarget(G, e)
is a valid expression that returns a type ofvertex_id_t<Graph>
.All values returned by
target(G, e)
must be able to correctly index into aGraph
G
.
With this abstraction, the loop and neighbor access in bfs
and
dijkstra
(respectively at
lines [bfs:foreach_v_in_Adj]
and [code:foreach-dijkstra]) are replaced
by
for (auto&& e : G[u]) {
auto v = target(G, e);
... }
Now, provided that suitable overloads for target
are defined, the
two model graph types
std::vector<std::vector<int>>;
std::vector<std::vector<std::tuple<int, float, double>>>;
will satisfy the above requirements, and we can compose them with the
bfs
and dijkstra
we have lifted to this point.
We can, for example, define overloads for target thusly:
int target(const std::vector<std::vector<int>>& G, int e) { return e; }
using E = std::tuple<int, float, double>;
int target(const std::vector<std::vector<E>>& G, E& e) { return std::get<0>(e); }
Note that these overloads are each specific to a single graph type. In
practice we can define generalized overloads for entire classes of
containers. In NWGraph we opted to realize target
as a CPO,
implemented using the tag_invoke
mechanism:cite:_tag_invoke.
Graph Concepts: Encapsulating Lifted Requirements
We can encapsulate (and formalize) the above requirements in the form of a concept (which is almost a direct translation of the stated requirements to code).
We first capture the very fundamental requirements of a graph, that it has an associated vertex id type:
template <typename G>
concept graph = std::semiregular<G>
&& requires(G g) { typename vertex_id_t<G>; };
We define this as a separate concept since we may wish to define other concepts that reuse these requirements.
Next, we define some convenience type aliases to capture the type of the inner range of a graph as well as the type that is stored by the inner range:
template <typename G>
using inner_range = std::ranges::range_value_t<G>;
template <typename G>
using inner_value = std::ranges::range_value_t<inner_range<G>>;
Now we can define the concept that captures the requirements from the
lifted bfs
and dijkstra
:
template <typename G>
concept adjacency_list = graph<G>
&& std::ranges::random_access_range<G>
&& std::ranges::forward_range<inner_range_t<G>>
&& std::convertible_to<vertex_id_t<G>, std::ranges::range_difference_t<G>>
&& requires(G g, vertex_id_t<G> u, inner_value_t<G> e) {
{ g[u] } -> std::convertible_to<inner_range_t<G>>;
{ target(g, e) } -> std::convertible_to<vertex_id_t<G>>; };
Although we restricted our illustration of lifting to bfs
and
dijkstra
in this paper, this concept applies to almost all of the
algorithms in NWGraph (and, since it represents an adjacency list, we
expect it could apply to the majority of, if not all, graph algorithms
based on adjacency lists).
We can use this concept to constrain the interface to bfs
in the
following two ways:
template <class Graph>
requires adjacency_list<Graph>
void bfs(const Graph& G);
template <adjacency_list Graph>
void bfs(const Graph& G);
Generally there is a fully general declaration when using concepts via
the requires
keyword, and a number of abbreviated forms. In the
NWGraph library, the second syntax above is preferred.
Lifting Edge Weight
In the concrete implementation of Dijkstra’s algorithm shown above, we
assumed the container associated with each vertex in the graph (i.e.,
the container obtained by G[u]
) provided tuples containing the
vertex id and the edge weight. In fact, there are numerous ways to
associate a weight with each edge. We could, for example, store an edge
index with each neighbor and use that to index into an array that we
also pass into dijkstra
. In such a case the (unconstrained)
prototype for the algorithm might be
template <class Graph, class Range>
auto dijkstra(const Graph& G, vertex_id_t<Graph> s, Range wt);
The inner loop might then look like
for (auto&& e : G[u]) {
auto v = target(e);
auto w = wt[v];
if (d[u] + w < d[v]) {
d[v] = d[u] + w;
pi[v] = u; }}
To lift this version and the version with the directly-stored property
on edges, we introduce a weights
callable WeighFunction as a
parameter at the interface of dijkstra
, using concepts to ensure
that the function meets the required use.
template <adjacency_list Graph, class WeightFunction>
auto dijkstra(const Graph& G, vertex_id_t<Graph> s, WeightFunction wt);
In this case, the inner loop would look like
for (auto&& e : G[u]) {
auto v = target(e);
auto w = wt(e);
if (d[u] + w < d[v]) {
d[v] = d[u] + w;
pi[v] = u; }}
Non-Type Constraints
We have already seen in lifting the edge weight that not all constraints
for an algorithm are encapsulated in the type requirements for the input
graph. There are other requirements that an algorithm may have that
cannot be captured as a type requirement, or as any compile-time
checkable requirement. For example, some algorithms, such as triangle
counting, may require that the edges within each neighborhood be sorted.
Or, as we have seen, directedness of a graph is not a type property, but
rather a runtime property. Such requirements become part of the
specification of the API, but cannot be made part of type checking. This
is similar to, say, binary_search
in the C++ standard library, which
requires that the elements of the container to which it is applied be
sorted. Yet, there is no such thing as a sorted container type in the
standard library.
Other Concepts
So far we have developed a single concept (adjacency_list
) and the
reader may ask how broad that concept is, given the wide variety of
potential graph algorithms. In fact, adjacency_list
concept is
surprisingly broad in its applicability; only a few supplemental
concepts are required to cover all of the algorithms implemented in
NWGraph and, based on our analysis of the comprehensive Boost Graph
Library:cite:Siek_Lee_Lumsdaine_2002, probably all of the
algorithms that are likely to be implemented in NWGraph in the future.
This is perhaps not so surprising since the adjacency list
\(Adj(G)\) is also the primary theoretical construct upon which the
majority of graph algorithms are built.
There are two additional concepts that we introduce briefly here which
we found necessary for algorithms in NWGraph: degree_enumerable
and
edge_list
. The former extends adjacency_list
with the
requirement that there be a valid expession degree
, necessary in
some algorithms. The latter is basically a container of objects for
which source
and target
are valid expressions. Algorithms such
as Bellman-Ford and Kruskal’s MST use an edge list rather and adjacency
list:cite:Cormen_2009.
Our confidence that these concepts are sufficient is based on an
extensive study of the concepts in the Boost Graph
Library:cite:Siek_Lee_Lumsdaine_2002. The Boost Graph
Library (BGL) has five essential concepts that cover all of its
algorithms: VertexListGraph
, EdgeListGraph
, AdjacencyGraph
,
IncidenceGraph
, and BiIncidenceGraph
. Of these, the design
decisions of NWGraph to require vertex identifiers to be indices
obviates VertexListGraph
. We don’t need to iterate through a list of
vertices provided by the graph. The NWgraph adjacency_list
and
degree_enumerable
concepts subsume the essential functionality of
AdjacencyGraph
and IncidenceGraph
. adjacency_list
does not
have a source
function requirement, but that is in fact only rarely
used in the BGL algorithms requiring IncidenceGraph
(and when it is
used, there are other ways of obtaining the same information). The
BiIncidenceGraph
concept is essentially a composite of a graph and
its transpose. This is obviated in the NWGraph design. Algorithms
requiring a graph and its transpose take them as two arguments rather
than one (there is no data that can be shared between a graph and its
transpose, so there is no loss of efficiency in that design decision).
Finally, the NWgraph edge_list
concept is essentially identical to
the BGL EdgeListGraph
concept.