Kruskal's Algorithm

This minimum spanning tree algorithm was first described by Kruskal in 1956 in the same paper where he rediscovered Jarnik's algorithm. This algorithm was also rediscovered in 1957 by Loberman and Weinberger, but somehow avoided being renamed after them. The basic idea of the Kruskal's algorithms is as follows: scan all edges in increasing weight order; if an edge is safe, keep it (i.e. add it to the set A).

Overall Strategy

Kruskal's Algorithm, as described in CLRS, is directly based on the generic MST algorithm. It builds the MST in forest. Initially, each vertex is in its own tree in forest. Then, algorithm consider each edge in turn, order by increasing weight. If an edge (u, v) connects two different trees, then (u, v) is added to the set of edges of the MST, and two trees connected by an edge (u, v) are merged into a single tree on the other hand, if an edge (u, v) connects two vertices in the same tree, then edge (u, v) is discarded.

A little more formally, given a connected, undirected, weighted graph with a function w : E → R.

Starts with each vertex being its own component.
Repeatedly merges two components into one by choosing the light edge that connects them (i.e., the light edge crossing the cut between them).
Scans the set of edges in monotonically increasing order by weight.
Uses a disjoint-set data structure to determine whether an edge connects vertices in different components.

Data Structure

Before formalizing the above idea, lets quickly review the disjoint-set data structure from Chapter 21.

Make_SET(v): Create a new set whose only member is pointed to by v. Note that for this operation v must already be in a set.
FIND_SET(v): Returns a pointer to the set containing v.
UNION(u, v): Unites the dynamic sets that contain u and v into a new set that is union of these two sets.

Algorithm

Start with an empty set A, and select at every stage the shortest edge that has not been chosen or rejected, regardless of where this edge is situated in the graph.

KRUSKAL(V, E, w)

A ← { }           ▷ Set A will ultimately contains the edges of the MST
for each vertex v in V
    do MAKE-SET(v)
sort E into nondecreasing order by weight w
for each (u, v) taken from the sorted list
    do if FIND-SET(u) = FIND-SET(v)
        then A ← A ∪ {(u, v)}
            UNION(u, v)
return A

Illustrative Examples

Lets run through the following graph quickly to see how Kruskal's algorithm works on it:

Kruskal's algorithms Example

We get the shaded edges shown in the above figure.

Edge (c, f) : safe
Edge (g, i) : safe
Edge (e, f) : safe
Edge (c, e) : reject
Edge (d, h) : safe
Edge (f, h) : safe
Edge (e, d) : reject
Edge (b, d) : safe
Edge (d, g) : safe
Edge (b, c) : reject
Edge (g, h) : reject
Edge (a, b) : safe

At this point, we have only one component, so all other edges will be rejected. [We could add a test to the main loop of KRUSKAL to stop once |V| − 1 edges have been added to A.]

Note Carefully: Suppose we had examined (c, e) before (e, f ). Then would have found (c, e) safe and would have rejected (e, f ).

Example (CLRS) Step-by-Step Operation of Kurskal's Algorithm.

Step 1. In the graph, the Edge(g, h) is shortest. Either vertex g or vertex h could be representative. Lets choose vertex g arbitrarily.

Step 2. The edge (c, i) creates the second tree. Choose vertex c as representative for second tree.

Step 3. Edge (g, g) is the next shortest edge. Add this edge and choose vertex g as representative.

Step 4. Edge (a, b) creates a third tree.

Step 5. Add edge (c, f) and merge two trees. Vertex c is chosen as the representative.

Step 6. Edge (g, i) is the next next cheapest, but if we add this edge a cycle would be created. Vertex c is the representative of both.

Step 7. Instead, add edge (c, d).

Step 8. If we add edge (h, i), edge(h, i) would make a cycle.

Step 9. Instead of adding edge (h, i) add edge (a, h).

Step 10. Again, if we add edge (b, c), it would create a cycle. Add edge (d, e) instead to complete the spanning tree. In this spanning tree all trees joined and vertex c is a sole representative.

Analysis

Initialize the set A: O(1)

First for loop: |V| MAKE-SETs

Sort E: O(E lg E)

Second for loop: O(E) FIND-SETs and UNIONs

Assuming the implementation of disjoint-set data structure, already seen in Chapter 21, that uses union by rank and path compression: O((V + E) α(V)) + O(E lg E)
Since G is connected, |E| ≥ |V| − 1⇒ O(E α(V)) + O(E lg E).
α(|V|) = O(lg V) = O(lg E).
Therefore, total time is O(E lg E).
|E| ≤ |V|² ⇒lg |E| = O(2 lg V) = O(lg V).
Therefore, O(E lg V) time. (If edges are already sorted, O(E α(V)), which is almost linear.)

II Kruskal's Algorithm Implemented with Priority Queue Data Structure

MST_KRUSKAL(G)

for each vertex v in V[G]
     do define set S(v) ← {v}
Initialize priority queue Q that contains all edges of G, using the weights as keys
A ← { }                               ▷ A will ultimately contains the edges of the MST
while A has less than n − 1 edges
    do Let set S(v) contains v and S(u) contain u
        if S(v) ≠ S(u)
            then Add edge (u, v) to A
                    Merge S(v) and S(u) into one set i.e., union
return A

Analysis

The edge weight can be compared in constant time. Initialization of priority queue takes O(E lg E) time by repeated insertion. At each iteration of while-loop, minimum edge can be removed in O(log E) time, which is O(log V), since graph is simple. The total running time is O((V + E) log V), which is O(E lg V) since graph is simple and connected.

Dated: February 9, 2010.