Graph Background

In order to describe the NWGraph library and its abstractions and interfaces, it is important to be precise about what we mean when we say “graph,” not only in terms of the mathematical abstraction, but also in terms of the corresponding software abstractions. To define our terminology and to amplify our point regarding “graphs are not for storing data,” we walk through the steps involved in representing the implicit relationships in data tables as a graph.

Remark. Understanding graphs is necessary to develop requirements for algorithms. However, it should be noted that we don’t derive those requirements from the graph model, but instead from the algorithms. This is a key distinction between generic programming and, say, OO requirements analysis.

Graph Terminology

Abstractly, we define a graph \(G\) as comprising two finite sets, \(G =\{ V, E \}\), where the set \(V\) is a set of entities of interest, “vertices” or “nodes,” and \(E\) is a set of pairs of entities from \(V\), “edges” or “links.” Without loss of generality we label the entities in \(V\) as \(v_i\) so that \(V = \{ v_0, v_1, \ldots v_{n-1} \}\). The set of edges, also labeled, can be constructed using the labeled entities from \(V\) so that \(E = \{ e_0, e_1, \ldots e_{m-1} \}\). The edges may be ordered pairs, denoted as \((v_i, v_j)\), which have equality defined such that \((v_i,v_j) = (v_m,v_n) \leftrightarrow v_i = v_m \wedge v_j = v_n\). Or, the edges may be unordered sets, denoted as \(\{v_i, v_j\}\) which have equality defined as \((v_i,v_j) = (v_m,v_n) \leftrightarrow\left( v_i = v_m \wedge v_j = v_n\right) \vee \left( v_i = v_n \wedge v_j = v_m\right)\). If a graph is defined with ordered edges we say the graph is directed; if the graph is defined with unordered edges we say the graph is undirected.

Graph Models

Graphs are powerful abstractions because they allow us to reason about the relationships between entities, irrespective of what the entities actually are. But, when we use graph algorithms in practice, we are using them to model some specific problem. Since one of the motivations behind NWGraph is to support graph computing in the context of real programs, we briefly describe the first part of the abstraction process when modeling with graphs.

Electrical circuit modeled as a directed graph. The nodes in the circuit are modeled as vertices and the circuit elements are modeled as edges.

Fig. 1 Electrical circuit modeled as a directed graph. The nodes in the circuit are modeled as vertices and the circuit elements are modeled as edges.

Airport route table modeled as an undirected graph. Airports are modeled as vertices and routes between cities are modeled as (weighted) edges.

Fig. 2 Airport route table modeled as an undirected graph. Airports are modeled as vertices and routes between cities are modeled as (weighted) edges.

Fig. 1 shows a model of an electrical circuit as a directed graph, both schematically, as a circle and line diagram, and mathematically, as the sets \(V\) and \(E\). Two-terminal circuit elements connect to each other at given circuit nodes. We thus model circuit connection points as graph vertices, and the connections between them as edges. In the case of circuits, orientation of circuit elements matters and so we use directed edges in the graph.

Fig. 2 similarly shows a model of an airport route table as an undirected graph. We begin with a table of airports and a table of distances in kilometers between pairs of airports. We model this situation as a graph by identifying graph nodes with airports and graph edges with pairs of cities that are given as pairs in the distance table.

Representing Graphs

To define algorithms on graphs and to be able to reason about those algorithms, we need to define some representations for graphs (and corresponding terminology)—–not much can be done computationally with abstract sets of vertices and edges. Various characteristics of these representations are what we use to express algorithms (still abstractly) but when those algorithms are implemented as generic library functions, those characteristics will in turn become the basis for the library’s concepts.

Circuit index graph.

Fig. 3 Index graph and associated index edge list and adjacency list corresponding to the circuit graph example. Also shown is the translation table from vertex to index.

Airport index graph.

Fig. 4 Index graph and associated index edge list and adjacency list corresponding to the airport graph example. Also shown is the translation table from vertex to index.

One of the fundamental operations in graph algorithms is a traversal. That is, given a vertex \(u\), we would like to find the neighbors of \(u\), i.e., all vertices \(v\) such that the edge \((u,v)\) is in the graph. Then, for each of those edges, we would like to find their neighbors, and so on. The representation that we can define to make this efficient is an adjacency list.

Given a graph \(G = (V,E)\), we can define an adjacency-list representation in the following way. Assign to each element of \(V\) a unique index from the range \([0,|V|)\) and denote the vertex identified with index \(i\) as \(V[i]\). We can now define a new graph with the same structure as \(G\), but in terms of the indices in \([0,|V|)\), rather than with the elements in \(V\). Let the index graph of \(G\) be the graph \(G'=(V',E')\), where \(V'=[0,|V|)\) and \(E'\) consists of \(|E|\) pairs of indices from \(V\), such that a pair \((i,j)\) is in E’ if and only if \((V[i],V[j])\) is in \(E\). Which is all to say, the index graph of \(G\) is the graph we get by replacing all elements of \(G\) with their corresponding indices. Fig. 3 and Fig. 4 show the progression from an index graph to an index adjacency list (compare also to Fig. 1 and Fig. 2). Since edges are given in terms of vertex names, in order to create a list of edge indices, we need to translate from vertex name to vertex index. Accordingly, Fig. 3 and Fig. 4 also show the translation table from vertex to index.

Of course, we don’t need an underlying graph to define what an index graph itself is. We can say that a graph \(G = (V, E)\) is an index graph if its vertex set is a set of contiguous indices, i.e., with \(V=[0,|V|-1)\). Since an index graph is just a graph, in cases where the context is clear, we may refer to an index graph simply as a graph. We note that an adjacency list can only be defined over an index graph.

Finally, we can make the following precise definition: An adjacency list of an index graph \(G=(V,E)\) is an array \(Adj(G)\) of size \(|V|\) (the array is indexed from \(0\) to \(|V|-1\)) with the following properties:

  • \(Adj(G)\) is a container of \(|V|\) containers, one container for each vertex in \(V\), and

  • The container \(Adj(G)[u]\) contains all vertices \(v\) for which there is an edge \((u,v)\in E\).

This structure, an adjacency list of an index graph, or an index adjacency list, is the fundamental structure used by almost all graph algorithms. and show the index graph and the adjacency list representation of our airport and circuit examples.

Remark (1): Although the standard term for this kind of abstraction is “adjacency list”, and although it is often drawn schematically with linked lists as elements, it is not necessary that this abstraction be implemented as an actual linked list. In fact, other representations are significantly more efficient. What is important is that the items that are stored, vertex indices, can be used to index into the adjacency list to obtain other lists of neighbors.

Remark (2): The index adjacency list does not store edges per se, rather it stores lists of reachable neighbors. Therefore, the index adjacency list is neither inherently directed nor undirected. That is, for a given vertex \(u\), the container \(Adj(G)[u]\) contains the vertex \(v\) if the edge \((u,v)\) is contained in \(E\). This means that for a directed graph with edge \((u,v)\) in , \(E\), \(Adj(G)[u]\) will contain \(v\). For an undirected graph with edge \((u,v)\) is contained in \(E\), \(Adj(G)[u]\) will contain \(v\) and \(Adj(G)[v]\) will contain \(u\). Directedness of the original graph is thus made manifest in the values stored in the index adjacency list.

Compare, for instance, the adjacency lists in Fig. 3 and Fig. 4. The graphs have the same structure in the schematics and in the mathematical notations (shown in Fig. 1 and Fig. 2). However, when realized as adjacency list, the adjacency list in Fig. 2 is symmetrized. Every edge \(\{u,v\}\) in the graph is inserted twice into the adjacency list: once as \(\{u,v\}\) and once as \(\{v,u\}\).