MCS 401 Exam 2 1 April 2015 Name: • Do not start until instructed to

MCS 401
Exam 2
• Do not start until instructed to do so.
• In order to get full credit, you need to show your work.
• You have 50 minutes to complete the exam.
• Good Luck!
1 April 2015
Problem 1 (25 points) Show that the solution of T (n) = T (n/3) + 2 is O(log n). To get full
credit, you must clearly state what you are doing. Don’t just write a bunch of inequalities
without a few sentences of explanation.
Answer. Substitution Method To show this, we need to find a constant d such that
T (n) ≤ d log2 n. (You could use any base you like, I just chose base two.) We will make the
computations with d as a variable and then at the end decide what d should be.
T (n) = T (n/3) + 2
≤ d log2
= d log2 n − d log2 3 + 2.
We win if −d log2 3 + 2 is negative since then T (n) ≤ d log2 n. So
−d log2 3 + 2 ≤ 0
2 ≤ d log2 3
≤ d.
log2 3
So choose a d like d = 10. The computation then shows T (n) ≤ 10 log2 n which implies
T (n) = O(log n).
Recursion Tree You could also use the tree. In the tree, each node has exactly one child
(the number of children is the coefficient in front of T (n/3) which is 1). The depth of the
tree is O(log n) and each node has a cost of 2, so the total is O(log n).
Problem 2 (35 points) You are given an undirected unweighted graph G where in addition
each edge is colored either red or blue. You are also given an integer k. Describe an algorithm
that either produces a spanning tree of G consisting of exactly k red edges, or outputs that
no such spanning tree exists. Also, briefly justify why your algorithm works correctly. Hint:
either Kruskal’s or Prim’s algorithms can be modified slightly to solve this problem, but I
think modifying Kruskal is easier. To do so, you will need to run Kruskal twice with two
different orderings of edges. Hint 2: the first time you run Kruskal, you should determine
the tree which uses the fewest number of red edges.
Answer. First run Kruskal with all blue edges appearing before all red edges to produce a
tree T . Say X is the set of red edges in T . Now order the edges by putting all red edges
in X first, then the other red edges, and finally the blue edges. But during this second run,
stop adding red edges once you get to k red edges and immediately skip to the blue edges
once you get to k red edges.
It works because the first tree T produced is the tree which has the fewest number of red
edges since all blue edges appear first. If this tree has more than k red edges, we can
immediately return that no tree exists.
During the second run of Kruskal, we put the red edges X first in the ordering, so Kruskal is
guaranteed to pick them. (They are part of the tree T so Kruskal will not skip any of them
because they have no cycles.) Another property is that from X, we can complete a spanning
tree using only blue edges. So we start adding more red edges, eventually reaching k, and
then immediately jumping to the blue edges. We are guaranteed that jumping to the blue
edges we can still get to a spanning tree.
Problem 3 (15 points) Construct a Huffman code for the following alphabet with the given
frequencies. You should draw the Huffman tree with labeled vertices, and also give the
assignment of codes to letters.
a: 6
b: 2
c: 5
d: 2
e: 7
The codes are then a = 00, b = 110, c = 10, d = 111, e = 01. (This is not the only choice
for codes, there are other correct answers.)
Problem 4 (25 points) Consider the following greedy algorithm to solve the weighted interval
scheduling problem: sort intervals by weight with the largest weight first. Go through the
intervals in this ordering, adding them one by one if the interval is compatible with all the
previously chosen intervals. Explain why this greedy algorithm does not work and give an
example set of intervals for which it does not work.
Answer. Consider a single interval with time [0, 10] with weight 5000, and 10 intervals [0, 1],
and [1, 2], and [2, 3], and so on up to [9, 10]. Each of these shorter intervals has weight 2000.
The greedy algorithm will pick the weight 5000 interval and no other intervals. The optimal
solution is to take all the small intervals, for a total weight of 10, 000.