Graphs

A graph $G$ is a set $V$ of vertices and a set $E$ of edges that connect vertices. We can write this as $G = (V, E)$.

The difference between trees and graphs is that a graph can have more than one path between two nodes, whereas a tree is connected and acyclic (it cannot contain a cycle).

There are two main types of graphs:

In directed graphs, every edge $e$ is directed from some vertex $v$ to some vertex $w$.

In undirected graphs, edges connect vertices in no particular order.


Both directed and undirected graphs can have self-edges of the form $(v, v)$, which connect a vertex to itself.

A cycle is a path whose first and last vertices are the same.
In an undirected graph, a path $\langle v_0, v_1, v_2, ..., v_k \rangle$ forms a cycle if $k \geq 3$ and $v_0 = v_k$.
In a directed graph, a path $\langle v_0, v_1, v_2, ..., v_k \rangle$ forms a cycle if $v_0 = v_k$ and the path contains at least one edge.
The cycle is simple if, in addition, $v_0, v_1, v_2, ..., v_k$ are distinct. A self-loop is a cycle of length $1$. A graph with a cycle is called cyclic. A graph without a cycle is called acyclic.

An undirected graph is connected if every vertex is reachable from all other vertices. A directed graph is strongly connected if every two vertices are reachable from each other.

Let’s now take a look at several examples of graphs.

Here is a graph representing the Paris metro.^{[1]} It is connected, undirected and cyclic.

Here is a graph in which each node represents an intersection in Ann Arbor and each (labelled) edge represents a street. Notice that while you can drive in both directions on most streets, you can only drive west on part of South U. (due to construction), and you cannot drive at all on East U. This graph is connected, directed and cyclic.

Finally, here’s a graph representing states (and one province!), where edges show the ability to cross borders between them. This graph is connected, undirected and cyclic.


A path is a sequence of vertices such that each adjacent pair of vertices is connected by an edge. If the graph is directed, the edges that form the path must all be aligned with the direction of the path.

The length of a path is the number of edges traversed by the path. Above, $\langle 1, 4, 3, 2 \rangle$ is a path of length $3$. A path can also be of length $0$, such as $\langle 3 \rangle$.
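
The path and length definitions can be checked mechanically. The following is a minimal sketch, assuming the intersection graph has the directed edges given under List of Edges in this section; the names `EDGES`, `is_path` and `path_length` are illustrative only.

```python
# Directed edges of the intersection graph, as (from, to) pairs.
# Assumption: these match the edge list given later in this section.
EDGES = {(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)}


def is_path(vertices, edges):
    """Return True if every consecutive pair of vertices is joined by a directed edge."""
    if len(vertices) == 0:
        return False
    return all((v, w) in edges for v, w in zip(vertices, vertices[1:]))


def path_length(vertices):
    """The length of a path is its number of edges: one fewer than its vertex count."""
    return len(vertices) - 1


print(is_path([1, 4, 3, 2], EDGES), path_length([1, 4, 3, 2]))
print(is_path([3], EDGES), path_length([3]))
```

Because the graph is directed, a sequence such as `[2, 3, 4]` is not a path here: the edge between intersections 3 and 4 only runs from 4 to 3.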

The distance from one vertex to another is the length of the shortest path from one to the other.
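
One way to compute this distance in an unweighted graph is a breadth-first search, which visits vertices in order of increasing path length. The sketch below is illustrative rather than definitive; it again assumes the directed edges of the intersection graph listed under List of Edges, and the function name `distance` is hypothetical.

```python
from collections import deque

# Assumption: the directed edges of the intersection graph from this section.
EDGES = [(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)]


def distance(edges, source, target):
    """Return the fewest edges on any path from source to target, or None if unreachable."""
    # Group outgoing neighbors by vertex.
    neighbors = {}
    for v, w in edges:
        neighbors.setdefault(v, []).append(w)

    dist = {source: 0}          # shortest known distance to each visited vertex
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in neighbors.get(v, []):
            if w not in dist:   # first visit is along a shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


print(distance(EDGES, 1, 2))
print(distance(EDGES, 3, 4))
```

Note that distance in a directed graph need not be symmetric: here the distance from 4 to 3 is 1, while the distance from 3 to 4 is 4.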

The degree of a vertex is the number of edges incident on that vertex. In the graph above, Michigan has degree 4 and Ontario has degree 2. A vertex in a directed graph has an indegree (the number of edges directed toward it) and an outdegree (the number of edges directed away). Intersection 3 above has indegree 2 and outdegree 1.
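
Indegree and outdegree can be tallied directly from a list of directed edges. This is a small sketch, assuming the intersection graph's edges as listed under List of Edges; `degrees` is an illustrative name.

```python
from collections import Counter

# Assumption: the directed edges of the intersection graph from this section.
EDGES = [(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)]


def degrees(edges):
    """Return (indegree, outdegree) counters for a list of directed (from, to) edges."""
    outdegree = Counter(v for v, _ in edges)  # edges directed away from v
    indegree = Counter(w for _, w in edges)   # edges directed toward w
    return indegree, outdegree


indegree, outdegree = degrees(EDGES)
print(indegree[3], outdegree[3])
```

For intersection 3 this gives indegree 2 and outdegree 1, matching the counts stated above.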

Several kinds of graphs have special names.

A complete graph is an undirected graph in which every pair of vertices is adjacent.

A bipartite graph is an undirected graph $G = (V, E)$ in which $V$ can be partitioned into two sets $V_1$ and $V_2$ such that $(v, w) \in E$ implies either $v \in V_1$ and $w \in V_2$ or $v \in V_2$ and $w \in V_1$. That is, all edges go between the two sets $V_1$ and $V_2$.
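
The definition translates into a direct check on a proposed partition. The sketch below only verifies a given pair of sets $V_1$, $V_2$; it does not search for one. The function name and the two small example graphs (a 4-cycle and a triangle) are hypothetical.

```python
def is_valid_bipartition(edges, v1, v2):
    """Check that v1 and v2 are disjoint and that every edge joins one set to the other."""
    v1, v2 = set(v1), set(v2)
    if v1 & v2:
        return False  # a vertex cannot belong to both sets
    return all(
        (v in v1 and w in v2) or (v in v2 and w in v1)
        for v, w in edges
    )


# Hypothetical example: a 4-cycle a-b-c-d-a splits into {a, c} and {b, d}.
square = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(is_valid_bipartition(square, {"a", "c"}, {"b", "d"}))

# Hypothetical example: in a triangle, this split leaves edge (3, 1) inside one set.
triangle = [(1, 2), (2, 3), (3, 1)]
print(is_valid_bipartition(triangle, {1, 3}, {2}))
```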

An acyclic, undirected graph is a forest.

A connected, acyclic, undirected graph is a free tree (as opposed to a rooted tree).

We often call a directed acyclic graph a DAG.

Graph Representations

When we choose how to represent a graph $G = (V, E)$, it is important to consider the relationship between $V$ (the number of vertices) and $E$ (the number of edges).

The maximum possible number of edges is $V^2$ for a directed graph, and slightly more than half of that for an undirected graph. However, in many applications the number of edges is much less than $V^2$.
For instance, planar graphs (those that can be drawn without edges crossing, such as street maps) have $O(V)$ edges: a simple planar graph with $V \geq 3$ vertices has at most $3V - 6$ edges.
In a complete graph, by contrast, every vertex is connected to every other vertex, so the number of edges is $\Theta(V^2)$.
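
As a quick check of the maximum edge counts, here is a worked example for $V = 5$, the number of vertices in the intersection graph. It assumes self-edges are counted, which is what makes the directed maximum exactly $V^2$:

$$
\text{directed: } V^2 = 25, \qquad \text{undirected: } \frac{V(V+1)}{2} = \frac{5 \cdot 6}{2} = 15
$$

The undirected count is the $\binom{5}{2} = 10$ edges between distinct vertices plus $5$ self-edges, which is slightly more than half of $25$.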

A graph is sparse if $E$ is much less than $V^2$. A graph is dense if $E$ is close to $V^2$.

Besides simply listing the edges, there are two standard ways to represent a graph $G = (V, E)$: as an adjacency matrix or as a collection of adjacency lists.

List of Edges

The simplest way to represent a graph is to just list all of its edges.

So for the graph of intersections above, we would list the edges as $(0,1), (0,2), (1,0), (1,4), (2,0), (2,3), (3,2), (4,1), (4,3)$.
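
In code, the edge list above could be stored as a plain list of pairs. This is a minimal sketch with illustrative helper names (`has_edge`, `neighbors`); it shows that both queries require scanning the whole list.

```python
# The directed edges of the intersection graph, as (from, to) pairs.
EDGES = [(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)]


def has_edge(edges, v, w):
    """Return True if there is an edge from v to w (a linear scan of the list)."""
    return (v, w) in edges


def neighbors(edges, v):
    """Return every vertex w with an edge (v, w), again by scanning all edges."""
    return [w for (u, w) in edges if u == v]


print(has_edge(EDGES, 4, 3), has_edge(EDGES, 3, 4))
print(neighbors(EDGES, 4))
```

Each query takes time proportional to the number of edges, which is the main drawback of this representation.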

Adjacency Matrix

In the adjacency-matrix representation of a graph $G = (V, E)$, we assume that the vertices are numbered $1, 2, ..., V$ in some arbitrary manner. ($V$ is the number of vertices in the graph.) Then the adjacency-matrix representation of $G$ consists of a $V$ by $V$ matrix $A = (a_{ij})$ such that
$a_{ij} = \begin{cases} 1, & \text{if } (i, j) \in E \\ 0, & \text{otherwise } \end{cases}$

Here is how we could represent intersections of Ann Arbor shown above as an adjacency matrix:

The total memory used for representing a graph as an adjacency matrix is $\Theta(V^2)$.

We prefer the adjacency-matrix representation for dense graphs.
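
A minimal sketch of building an adjacency matrix from the intersection graph's edge list follows. It assumes the vertices are numbered $0$ to $V - 1$ (as in the edge list) rather than $1$ to $V$, so that vertex numbers can be used directly as indices; `adjacency_matrix` is an illustrative name.

```python
# The directed edges of the intersection graph, as (from, to) pairs.
EDGES = [(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)]


def adjacency_matrix(num_vertices, edges):
    """Build a V-by-V matrix with a 1 at [v][w] exactly when (v, w) is an edge."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for v, w in edges:
        matrix[v][w] = 1
    return matrix


A = adjacency_matrix(5, EDGES)
for row in A:
    print(row)
```

Checking whether an edge $(v, w)$ exists is then a single lookup, `A[v][w]`, at the cost of storing all $V^2$ entries even when most are 0.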

Adjacency List

The adjacency-list representation of a graph is an array of size $V$. Each element in the array is a list, one for each vertex in $V$. For each vertex $v \in V$, the adjacency list contains all the vertices $w$ (or pointers to them) such that there is an edge $(v, w) \in E$. In other words, the list contains all vertices adjacent to $v$ in $G$.

Here is how we could represent intersections of Ann Arbor shown above as an adjacency list:

If vertices are not numbered $0, 1, 2, ..., V - 1$ (e.g., if they are strings), we can use a hash table to map names to lists. Each entry in the hash table uses the object representing the vertex (e.g., its name) as a key, and a list as the associated value.

The total memory used for representing a graph as an adjacency list is $\Theta(V + E)$.

We prefer the adjacency-list representation for sparse graphs. And most real-world graphs are in fact sparse.
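
Both variants described in this section can be sketched briefly: an array of lists for vertices numbered $0$ to $V - 1$, and a hash table (a Python `dict`) for vertices with arbitrary names. The function names and the string labels in the second example are hypothetical.

```python
# The directed edges of the intersection graph, as (from, to) pairs.
EDGES = [(0, 1), (0, 2), (1, 0), (1, 4), (2, 0), (2, 3), (3, 2), (4, 1), (4, 3)]


def adjacency_list(num_vertices, edges):
    """Array of lists: entry v holds every w such that (v, w) is an edge."""
    adj = [[] for _ in range(num_vertices)]
    for v, w in edges:
        adj[v].append(w)
    return adj


def adjacency_dict(edges):
    """Hash-table variant for vertices that are not numbered 0..V-1."""
    adj = {}
    for v, w in edges:
        adj.setdefault(v, []).append(w)
        adj.setdefault(w, [])  # make sure vertices with no outgoing edges appear
    return adj


print(adjacency_list(5, EDGES))
print(adjacency_dict([("A", "B"), ("B", "C")]))
```

The array variant stores one list per vertex and one entry per edge, which is where the $\Theta(V + E)$ memory bound comes from.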

Weighted Graphs

A weighted graph is a graph in which each edge is labeled with a numerical weight. A weight might express the distance between two nodes, the cost of moving from one to the other or many other things.

In an adjacency matrix, weights are stored in the matrix. Whereas an unweighted graph uses an array of booleans, a weighted graph uses an array of numbers (int, double or another type). Edges missing from the graph can be represented by a sentinel value (such as 0, -1 or INT_MIN). Alternatively, we might use an array of booleans and an array of numbers to allow all numerical values to be valid weights.

In an adjacency list, recall that the edges leaving each vertex are represented by an array or a linked list of adjacent vertices. We could augment each node of that linked list to include a weight, in addition to the reference to the destination vertex. (Or if we are using arrays instead of linked lists, we could use two separate arrays: one for weights and one for destinations.)
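
Both weighted representations can be sketched as follows. The edge weights here are made up for illustration, and the sketch uses `None` as the missing-edge sentinel (an assumption of this example, not something the text prescribes), which leaves every number available as a valid weight.

```python
NO_EDGE = None  # sentinel for "no edge"; an assumption of this sketch

# Hypothetical weighted directed edges, as (from, to, weight) triples.
WEIGHTED_EDGES = [(0, 1, 2.5), (1, 2, 1.0), (0, 2, 4.0)]


def weighted_matrix(num_vertices, weighted_edges):
    """Matrix whose entry [v][w] is the weight of edge (v, w), or NO_EDGE."""
    matrix = [[NO_EDGE] * num_vertices for _ in range(num_vertices)]
    for v, w, weight in weighted_edges:
        matrix[v][w] = weight
    return matrix


def weighted_adjacency_list(num_vertices, weighted_edges):
    """Each list entry pairs a destination vertex with the weight of that edge."""
    adj = [[] for _ in range(num_vertices)]
    for v, w, weight in weighted_edges:
        adj[v].append((w, weight))
    return adj


M = weighted_matrix(3, WEIGHTED_EDGES)
L = weighted_adjacency_list(3, WEIGHTED_EDGES)
print(M[0][1], M[2][0])
print(L[0])
```

Storing `(destination, weight)` pairs corresponds to augmenting each list node with a weight; the two-array alternative would keep destinations and weights in parallel lists instead.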

There are two common problems involving weighted graphs:

The shortest path problem: find a path between two vertices in a graph such that the sum of the weights of the edges along the path is minimized.

The minimum spanning tree (MST) problem: find which edges in the graph must remain so that the graph stays connected while the total weight of the remaining edges is minimized.

Graph Algorithms

See the Graph Algorithms section!