Run Times - FSU Computer Science

Run Time Analysis
Jordan Snow
Kruskal’s Algorithm
Finds minimum spanning tree
Kruskal’s adds edges to a forest that starts with one tree per vertex
Sorts the edges by cost
Adds edges by ascending order of cost
Edge is only added if it connects two different trees
 • Cycles are not formed
Finishes with a single spanning tree
Pseudo Code for Kruskal
MST-Kruskal(G, w)
A = ∅
for each vertex v ∈ G.V
    Make-Set(v) //each vertex starts as its own tree
sort the edges of G.E into ascending order by cost w
for each edge (u, v) ∈ G.E, considered in ascending order:
    if Find-Set(u) ≠ Find-Set(v) //vertices not in the same set
        A = A ∪ {(u, v)} //it is ok to add the edge to A
        Union(u, v) //union the two trees
return A
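The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name `kruskal_mst`, the `(cost, u, v)` edge format, and the 0-based vertex numbering are assumptions made for the example, and path halving is used as a simple form of path compression.

```python
def kruskal_mst(num_vertices, edges):
    """Return the MST edges of a connected, undirected graph.

    edges is a list of (cost, u, v) tuples (assumed format),
    with vertices numbered 0 .. num_vertices - 1.
    """
    parent = list(range(num_vertices))  # Make-Set for every vertex
    rank = [0] * num_vertices

    def find_set(x):
        # Path halving, a simple variant of path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        # Union by rank: attach the shallower tree under the deeper one
        rx, ry = find_set(x), find_set(y)
        if rx == ry:
            return
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    mst = []
    for cost, u, v in sorted(edges):       # ascending order of cost
        if find_set(u) != find_set(v):     # endpoints in different trees
            mst.append((cost, u, v))       # safe to add: no cycle is formed
            union(u, v)
    return mst


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
mst = kruskal_mst(4, edges)
total_cost = sum(cost for cost, _, _ in mst)
print(mst, total_cost)  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)] 6
```

The sort dominates the running time, which is why the analysis that follows reduces to O(E lgE).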
Single Linkage
Analysis done in terms of Kruskal’s algorithm
• Start by placing each point in its own cluster O(V)
• Sort the edges O(E lgE)
• Finding sets and unions: O(E) operations
 • We assume a disjoint-set (union-find) implementation
   with path compression and union by rank
 • So O((V+E) α(V))
Total time is O((V+E) α(V)) + O(E lgE)
α(V) = O(lgV) = O(lgE), since E ≥ V − 1 in a connected graph
Total time reduces to O(E lgE)
Single Linkage
After using Kruskal’s, we start cutting the minimum
spanning tree
We want to cut the k-1 longest edges
This results in k clusters
If k = 3, we cut the two longest edges
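The cutting step can be sketched as below. This is only an illustration under stated assumptions: the function name `single_linkage_from_mst` and the `(cost, u, v)` edge format are made up for the example, and the input is assumed to already be a minimum spanning tree on points numbered from 0.

```python
def single_linkage_from_mst(num_points, mst_edges, k):
    """Cut the k-1 longest MST edges and label the resulting k clusters.

    mst_edges is a list of (cost, u, v) tuples forming a spanning tree
    (assumed format). Returns one cluster label per point.
    """
    # Keep everything except the k-1 most expensive edges
    kept = sorted(mst_edges)[: len(mst_edges) - (k - 1)]

    # Build adjacency lists for the remaining forest
    neighbors = {p: [] for p in range(num_points)}
    for _, u, v in kept:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # Label each connected component with a depth-first search
    labels = [None] * num_points
    next_label = 0
    for start in range(num_points):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in neighbors[node]:
                if labels[nb] is None:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return labels


mst_edges = [(1, 0, 1), (2, 1, 2), (9, 2, 3), (1, 3, 4), (8, 4, 5)]
labels = single_linkage_from_mst(6, mst_edges, k=3)
print(labels)  # [0, 0, 0, 1, 1, 2]
```

With k = 3 the edges of cost 9 and 8 are removed, leaving the clusters {0, 1, 2}, {3, 4}, and {5}.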
Complete Linkage
Start by placing each point in its own cluster O(n)
Store the distance between each pair of clusters O(n²)
While there are more than k clusters O(n) iterations
Let A, B be the two closest clusters, where the distance between
two clusters is the distance between their farthest points O(n²)
Add cluster A ∪ B O(n)
Remove clusters A and B O(n)
Find the farthest distance from A ∪ B to every other cluster O(n²)
Total time comes to O(n³)
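A naive version of this loop might look like the sketch below. It is an assumption-laden illustration rather than the slide's exact procedure: instead of storing a distance table, it recomputes cluster distances on every pass. Since the products of cluster sizes over all pairs sum to at most n², each pass still costs O(n²) point distances, and with O(n) merges the total stays at O(n³). The function name `complete_linkage` and the tuple-of-coordinates point format are hypothetical.

```python
import math


def complete_linkage(points, k):
    """Naive agglomerative clustering with complete linkage."""
    clusters = [[p] for p in points]  # each point starts in its own cluster

    def cluster_distance(a, b):
        # Complete linkage: distance between the farthest pair of points
        return max(math.dist(p, q) for p in a for q in b)

    while len(clusters) > k:
        # Find the pair of clusters with the smallest complete-linkage distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        # Delete j first (j > i) so that index i stays valid
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters


points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters = complete_linkage(points, k=3)
print(sorted(sorted(c) for c in clusters))
# [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(10, 0)]]
```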
Average Linkage
The same analysis applies to average linkage as to complete linkage
Average linkage takes all pairwise distances between two clusters,
averages them, and then compares the averages
Total time is still O(n³)
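The only change from complete linkage is the cluster-to-cluster distance, which can be sketched as two small functions. The names `complete_distance` and `average_distance` are hypothetical, and Euclidean distance between coordinate tuples is assumed.

```python
import math


def complete_distance(a, b):
    # Complete linkage: the farthest pair between the two clusters
    return max(math.dist(p, q) for p in a for q in b)


def average_distance(a, b):
    # Average linkage: mean over all |a| * |b| pairwise distances
    total = sum(math.dist(p, q) for p in a for q in b)
    return total / (len(a) * len(b))


a = [(0, 0), (0, 4)]
b = [(3, 0)]
print(complete_distance(a, b))  # 5.0
print(average_distance(a, b))   # 4.0
```

Both functions visit every pair of points once, so swapping one for the other leaves the asymptotic cost unchanged.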
Lloyd’s Method
Pick k random points O(k)
Until convergence: ? (the number of iterations is not known in advance)
Assign each point to its closest center O(kn)
Compute the mean of each cluster O(n)
Let these means be the new centers O(k)
In practice, Lloyd’s converges so quickly that the running
time is effectively linear
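The steps above can be sketched as follows. This is a minimal illustration under assumptions, not a definitive implementation: the function name `lloyd`, the `max_iters` cap, the `seed` parameter, and the choice to keep the old center when a cluster becomes empty are all made up for the example.

```python
import math
import random


def lloyd(points, k, max_iters=100, seed=0):
    """Lloyd's method on a list of coordinate tuples (assumed format)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k random points as centers
    clusters = []

    for _ in range(max_iters):  # cap is an assumption, in case of slow convergence
        # Assignment step: O(kn) distance evaluations
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)

        # Update step: the mean of each cluster, O(n) in total
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members)
                             for d in range(dim))
                new_centers.append(mean)
            else:
                new_centers.append(centers[c])  # empty cluster: keep old center

        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = lloyd(points, k=2)
print(centers)
print(clusters)
```

Each pass through the loop costs O(kn), so the total is O(kn) times the number of iterations, which is the unknown marked with "?" above.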
Deterministic Lloyd’s Method
Furthest Centroids:
Pick a random center C1 O(1)
Set C2 as the farthest point from C1 O(n)
Set Ci to the point with the largest minimum distance to the
centers already chosen O(kn)
Running time of seeding is O(kn)
After seeding, runs the same as the previous method, with linear run time in practice
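The furthest-centroids seeding can be sketched as below. It is an illustration under assumptions: the function name `furthest_first_centers` and the `first_index` and `seed` parameters are hypothetical, and the sketch caches each point's distance to its nearest chosen center so that adding a center costs O(n), which is one way to reach the O(kn) seeding total.

```python
import math
import random


def furthest_first_centers(points, k, first_index=None, seed=0):
    """Pick k centers by repeatedly taking the point farthest from all chosen centers."""
    if first_index is None:
        # C1 is a random point; first_index exists only to make runs reproducible
        first_index = random.Random(seed).randrange(len(points))
    centers = [points[first_index]]

    # min_dist[i] = distance from points[i] to its nearest chosen center
    min_dist = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Next center: the point whose nearest chosen center is farthest away
        idx = max(range(len(points)), key=lambda i: min_dist[i])
        centers.append(points[idx])
        # Refresh the cache against the new center only: O(n) per center
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[idx]))
    return centers


points = [(0, 0), (1, 0), (10, 0), (4, 0)]
centers = furthest_first_centers(points, k=3, first_index=0)
print(centers)  # [(0, 0), (10, 0), (4, 0)]
```

Starting from (0, 0), the farthest point is (10, 0); after that, (4, 0) has the largest minimum distance (4) to the two chosen centers.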