A spanning tree of a graph G is a subgraph that is a tree and contains every vertex of G. Informally, the minimum spanning tree, MST, is to find a free tree T of a given graph G that contains all the vertices of G and has the minimum total weight of the edges of G over all such trees.
Problem
A town has set of houses and a set of roads. A road connects two and only two houses. A road connecting houses u and v has a repair cost w(u, v). The goal is to repair enough (and no more) roads such that
1. everyone stays connected: can reach every house from all
other houses, and
2. total repair cost is minimum.
Model as a Graph
Given undirected graph G = (V, E) and weight w(u, v) on each edge (u, v) ∈ E. The problem is to find T ⊆ E such that
1. T connects all vertices (T is a spanning tree), and
2. w(T) = ∑(u, v)∈T w(u, v) is minimized.
A spanning tree whose weight is minimum over all spanning trees is called a minimum spanning tree, or MST.
The example of such a graph is as follows:
Note that the edges in MST are shaded. In this example, there is more than one MST. Replace edge (e, f) by (c, e). We get a different spanning tree with the same weight.
Growing a minimum spanning tree
Some important properties of an MST are as follows:
It has |V| − 1 edges.
It has no cycles.
It might not be unique.
Building up the Solution
We will build a set A of edges.
Initially, set A has no edges.
As we add edges to the set A, maintain a loop invariant:
Loop invariant: The set A is a subset of some MST.
Add only edges that maintain the invariant. If a set A is a subset of some MST, an edge (u, v) is safe for A if and only if A ∪ {(u, v)} is also a subset of some MST. So, we will add only safe edges.
Generic MST Algorithm
There are several different methods for computing minimum spanning trees, but really they are all instances of the following generic algorithm.
GENERIC-MST (G, w)
A = { }
while A is not a spanning tree
do find an edge (u, v) that is safe for set A
A = A ∪ {(u,
v)}
return A
Use the loop invariant to show that this generic algorithm works.
Initialization: The empty set trivially
satisfies the loop invariant.
Maintenance: Since we add only safe edges, set A
remains a subset of some MST.
Termination: All edges added to the set A are
in an MST, so when we stop, A is a spanning tree that is also
an MST.
Finding a Safe Edge
How do we find safe edges?
Let's look at the example. Edge (c, f) has the lowest weight of any edge in the graph. Is it safe for A = { }?
Intuitively: Let S ⊂ V be any set of vertices that includes c but not f (so that f is in V − S). In any MST, there has to be one edge (at least) that connects S with V − S. Why not choose the edge with minimum weight? (Which would be (c, f) in this case.)
Some definitions: Let S ⊂ V and A ⊆ E.
A cut (S, V − S) is a partition of vertices into disjoint sets V and V − S.
Edge (u, v) in E crosses cut (S, V − S) if one endpoint is in S and the other is in V − S.
A cut respects A if and only if no edge in A crosses the cut.
An edge is a light edge crossing a cut if and only if its weight is minimum over all edges crossing the cut. For a given cut, there can be more than one light edge crossing it.
Theorem Let A be a subset of some MST, (S, V − S) be a cut that respects A, and (u, v) be a light edge crossing (S, V − S). Then (u, v) is safe for A.
Proof Let T be an MST that includes A.
If T contains an edge (u, v), we are done.
So now assume that T does not contain (u, v). We will construct a different MST T that include A = A ∪ {(u, v)}.
Recall: a tree has unique path between each pair of vertices. Since T is an MST, it contains a unique path p between u and v. Path p must cross the cut (S, V − S) at least once. Let (x, y) be an edge of p that crosses the cut. From how we chose (u, v), must have w(u, v) < w(x, y).
[Note carefully: Except for the dashed edge (u, v), all edges shown are in T. A is some subset of the edges of T, but A cannot contain any edges that cross the cut (S, V − S), since this cut respects A. Shaded edges are the path p.]
Since the cut respects A, edge (x, y) is not in A.
To form T' from T:
Remove the edge (x, y). Breaks T into two components.
Add edge (u, v). Reconnects.
So T' = T − {(x, y)} ∪ {(u, v)}.
T' is a spanning tree.
w(T') = w(T) − w(x, y) + w(u, v)
≤ w(T),
since w(u, v) ≤ w(x, y). Since T' is a spanning tree, w(T') ≤ w(T), and T is an MST, then T' must be an MST.
We need to show that edge (u, v) is safe for the set A:
A ⊆ T and (x, y) ∉ A ⇒ A ⊆ T'.
A ∪ {(u, v)} ⊆ T'.
Since T' is an MST, edge (u, v) is safe for the set A.
This completes the proof.
So, in GENERIC-MST:
A is a forest containing connected components into one. Initially, each component is a single vertex.
Any safe edge merges two of these components into one. Each components is a tree.
Since an MST has exactly |V| − 1 edges, the for loop iterates |V| − 1 times. Equivalently, after adding |V| − 1 safe edges, we're down to just one components.
Corollary If C = (VC, EC) is a connected component in the forest GA = (V, A) and (u, v) is a light edge connecting C to some other component in GA [i.e., (u, v) is a light edge crossing the cut (VC, V − VC)], then (u, v) is safe for A.
Proof Set S = VC in the theorem. And this completes the proof.
This is naturally leads to the algorithm called Kruskal's algorithm to solve the minimum-spanning -tree problem.
Dated: February 9, 2010.