Skip to content

Graph Data Type

Kevin Lin edited this page Aug 27, 2025 · 4 revisions

Note

Objectives

  1. Identify the features and representation for an example graph or graph problem.
  2. Analyze the runtime of a graph representation in terms of vertices and edges.
  3. Analyze the affordances of a graph interface or method for a particular problem.

Code: Graph.java, Digraph.java

All the data structures we've studied so far organize elements according to the properties of data in order to implement abstract data types like sets, maps, and priority queues. Clients (users) of data structures treat them as abstractions that they can use without knowing about the implementation details. In fact, the client can't assume any particular implementation because they're writing their code against more generic interfaces like Set, Map, or MinPQ.

Graphs, however, represent a totally different kind of abstract data type with different goals. Rather than focus on enabling efficient access to data, graphs focus on representing client-defined relationships between data. Watch the following video through the end.

Graphs video screencapture

How do graph data types provide different functionality from tree data structures?

The greatest difference lies in how a graph is used by a client. Instead of only specifying the element to add in a graph, we often will also specify the edges or relationships between elements too. The usefulness of a graph comes from the explicit relationships between elements.

Husky Maps

In Husky Maps, the MapGraph class represents all the streets in the Husky Maps app. How is data represented in a graph?

public class MapGraph {
    private final Map<Point, List<Edge<Point>>> neighbors;
}

The map data is stored in the neighbors variable: a map where the keys are unique Point objects and the values are the "neighbors" of each Point. A Point is an object that has an $x$ coordinate and a $y$ coordinate representing a physical place in the world, like a street intersection. The List<Edge<Point>>, or list of edges to other points, represents the "neighbors" of each point such as an adjacent street intersection.

U District map with two markers on NE 43rd St

In this image, the Point labeled A represents the intersection Brooklyn Ave NE & NE 43rd St while the Point labeled B represents the intersection University Way NE & NE 43rd St. We say that the two points are "neighbors" because there's a stretch of NE 43rd St in front of the light rail station that directly connects the two points.

Note

In MapGraph, we included an edge representing the stretch of NE 43rd St in front of the light rail station. But this stretch of road is only accessible to buses, cyclists, and pedestrians. Cars are not allowed on this stretch of road, so the inclusion of this edge suggests that this graph would afford applications that emphasize public transit, bicycling, or walking. But there are also other ways we can flag a stretch of road as disallowing certain types of mobility. Graph designers make key decisions about what data is included and represented in their graph, and these decisions affect what kinds of problems can be solved.

Vertices and edges

More generally, a graph is a data type composed of vertices and edges defined by the client. Each vertex (or node) is a basic element of a graph. Each edge is a direct connection between a pair of vertices $v$ and $w$ in a graph, usually written as $(v, w)$ with an optional weight associated with the edge. Unlike lists, deques, sets, maps, and priority queues that are defined largely by how they organize elements, graphs are defined by vertices and edges that define relationships between vertices.

What are the vertices and edges of MapGraph?

In MapGraph, the vertices are Point objects representing real-world locations while the edges are represented using the Edge class. The weight of an edge represents the physical distance between the two points adjacent to the edge.

Given a vertex, its edges lead to neighboring (or adjacent) vertices. In this course, we will always assume two restrictions on edges in a graph.

  • No self-loops: a vertex $v$ cannot have an edge $(v, v)$ back to itself.
  • No parallel (duplicate) edges: there can only exist at most one edge $(v, w)$.

In addition to edges being weighted or unweighted, there are two further categorizations for edges. An undirected edge represents a pairwise connection between vertices allowing movement in both directions (like a two-way street). A directed edge represents a pairwise connection between vertices allowing movement in a single direction (like a one-way street) and visualized as an arrow rather than a plain line.

Undirected graph ADT

An undirected graph (or simply "graph" without any qualifiers) is an abstract data type that contains zero or more vertices and zero or more undirected edges between pairs of vertices. The following slides (from Graphs and Digraphs by Robert Sedgewick and Kevin Wayne) describes some undirected graph terminology.

Undirected graph terminology

  • A path is a sequence of vertices connected by undirected edges with no repeated edges.
  • Two vertices are connected if there is a path between them.
  • A cycle is a path (with 1 or more edges) whose first and last vertices are the same.
  • The degree of a vertex is the number of undirected edges associated with it.

A minimal interface for an undirected graph could require an addEdge and neighbors method.

public interface UndirectedGraph<V> {

    /** Add an undirected edge between vertices (v, w). */
    public void addEdge(V v, V w);

    /** Returns a list of the edges adjacent to the given vertex. */
    public List<Edge<V>> neighbors(V v);
}

Directed graph ADT

A directed graph (or "digraph") is an abstract data type that contains zero or more vertices and zero or more directed edges between pairs of vertices. The following slide describes some directed graph terminology.

Directed graph terminology

Although parallel (duplicate) edges are not allowed, a directed graph can have edges in both directions between each pair of nodes. In the example above, note there are edges $(2, 3)$ and $(3, 2)$, as well as edges $(6, 8)$ and $(8, 6)$. These two pairs of edges allow for movement in both directions.

  • A directed path is a sequence of vertices connected by directed edges with no repeated edges.
  • Vertex $w$ is reachable from vertex $v$ if there is a directed path from $v$ to $w$.
  • A directed cycle is a directed path (with 1 or more edges) whose first and last vertices are the same.
  • The outdegree of a vertex is the number of directed edges outgoing from it.
  • The indegree of a vertex is the number of directed edges incoming to it.

A minimal interface for an directed graph could also require an addEdge and neighbors method, just as in the undirected graph example.

public interface DirectedGraph<V> {

    /** Add an directed edge between vertices (v, w). */
    public void addEdge(V v, V w);

    /** Returns a list of the outgoing edges from the given vertex. */
    public List<Edge<V>> neighbors(V v);
}

Adjacency lists data structure

The adjacency lists data structure that associates each vertex with a list of edges can implement both undirected graphs and directed graphs. MapGraph uses an adjacency lists data structure: the neighbors map associates each Point with a List<Edge<Point>>. Adjacency lists are a very direct implementation of the DirectedGraph interface methods like addEdge and neighbors.

public class MapGraph implements DirectedGraph<Point> {
    private final Map<Point, List<Edge<Point>>> neighbors;

    /** Constructs a new map graph. */
    public MapGraph(...) {
        neighbors = new HashMap<>();
    }

    /** Adds a directed edge to this graph using distance as weight. */
    public void addEdge(Point from, Point to) {
        if (!neighbors.containsKey(from)) {
            neighbors.put(from, new ArrayList<>());
        }
        neighbors.get(from).add(new Edge<>(from, to, estimatedDistance(from, to)));
    }

    /** Returns a list of the outgoing edges from the given vertex. */
    public List<Edge<Point>> neighbors(Point point) {
        if (!neighbors.containsKey(point)) {
            return new ArrayList<>();
        }
        return neighbors.get(point);
    }
}
What is a one-line addition that could make this directed graph an undirected graph?

In the addEdge method, add another edge to the neighbors map in the reverse direction. Then, we will have edges going in both directions between every pair of adjacent vertices, making the result an undirected graph.

Note

The MapGraph class doesn't provide methods for adding a single vertex or getting a list of all the vertices or all the edges. These aren't necessary for Husky Maps. But in other situations, you might like having these methods that provide different functionality. The Java developers did not provide a standard Graph interface like they did for List, Set, or Map because graphs are often custom-designed for specific problems. What works for Husky Maps might not work well for other graph problems.

We'll often visualize adjacency lists by drawing the neighbors map as an array associating each vertex with a linked list representing its neighbors. The following graph visualization on the left matches with the data structure visualization on the right, with the edge $(6, 9)$ marked red in both the left and the right side.

Adjacency lists representation

What other graph data structures are available?

Adjacency lists aren't the only way to implement graphs. Two other common approaches are adjacency matrices and edge sets. Both of these approaches provide different affordances (making some graph methods easier to implement or more efficient in runtime), but adjacency lists are still the most popular option for the graph algorithms we'll see in this class. Keith Schwarz writes on StackOverflow about a few graph problems where you might prefer using an adjacency matrix over an adjacency list but summarizes at the top:

Adjacency lists are generally faster than adjacency matrices in algorithms in which the key operation performed per node is "iterate over all the nodes adjacent to this node." That can be done in time $O(deg(v))$ time for an adjacency list, where $deg(v)$ is the degree of node $v$, while it takes time $\Theta(n)$ in an adjacency matrix. Similarly, adjacency lists make it fast to iterate over all of the edges in a graph: it takes time $O(m + n)$ to do so, compared with time $\Theta(n^2)$ for adjacency matrices.

Some of the most-commonly-used graph algorithms (BFS, DFS, Dijkstra's algorithm, A* search, Kruskal's algorithm, Prim's algorithm, Bellman-Ford, Karger's algorithm, etc.) require fast iteration over all edges or the edges incident to particular nodes, so they work best with adjacency lists.

Clone this wiki locally