Python代写 | INF553 Foundations and Applications of Data Mining


Question 1: Divide and Conquer Minimum Spanning Tree (30 points)

Suppose that we are given a graph G = (V, E), with V = {v1, v2, . . . , vn} where n = 2c for some
integer c > 0, and every edge e ∈ E has a distinct edge weight w(e) > 0. For any X ⊆ V , let
G(X) = (X, EX) where EX = {(u, v) | (u, v) ∈ E ∧ u, v ∈ X}. Consider my proposed algorithm
to compute the minimum spanning tree of G:


if E = {e}, return {e};
X := {v1, v2, . . . , v|V |/2};
Find the minimum weight edge e between X and V − X;
T1 := MST(G(X));
T2 := MST(G(V − X));
return T1 ∪ T2 ∪ {e};

(a) (10 points) Suppose that in each recursive call, we find the minimum edge e by searching
sequentially through the list of edges for each vertex in X; what is the work of my algorithm?
Please explain how you obtain the solution.

(b) (10 points) What is the span of my algorithm? Please explain how you obtain the solution.

(c) (10 points) Does my algorithm always compute a minimum spanning tree? Give a proof
of correctness, or provide a counterexample.

Question 2: Bounded Shortest Paths (25 points)

In many real-world graphs, the shortest paths between vertices include no more than a few
edges. Suppose we know we’re only looking at graphs where the shortest path between any two
vertices has t or fewer edges. That is, any path with more than t edges is not a shortest path.

We can assume the graph can contain negative edges, but will not contain negative cycles.

(a) (5 points) Describe an algorithm to solve the single-source shortest path problem for a
graph in which the shortest path between any vertices has t or fewer edges. Your algorithm
should be efficient with cost bounds that depend on t (and other graph parameters).

(b) (10 points) Argue that your algorithm is correct.

(c) (10 points) Suppose G is a directed graph with n vertices and m edges. What is the work
and span of your algorithm? Justify your answer, in a few words.

Question 3: Improving Dijkstra (30 points)

In our analysis of the work done by Dijkstra’s algorithm, we ended up with a bound of
O(|E| log |E|). Let’s take a closer look at how changing the type of heap used affects this work

(a) (5 points) A d-ary heap is a generalization of a binary heap in which we have a d-ary tree
as the data structure. The heap and shape properties are still maintained, but each internal
node now has d children (except possibly the rightmost leaf). What is the maximum depth
of a d-ary heap?

(b) (10 points) In a binary heap the “delete-min” operation removes the root, places the
rightmost leaf at the root position and restores the heap property by swapping downward.
Similarly the “insert” operation places the new element as the rightmost leaf and swaps
upward to restore the heap property. What is the work done by ‘delete-min‘ and ‘insert‘
operations in a d-ary heap? Note that the work differs for each operation.

(c) (10 points) Now, suppose we use a d-ary heap for Dijkstra’s algorithm. What is the new
bound on the work? Your bound will be a function of |V |, |E|, and d and will account for
the ‘delete-min‘ and ‘insert‘ operations separately. Please explain why.

(d) (5 points) Now that we have a characterization of how Dijkstra’s algorithm performs with
a d-ary heap, let’s look at how we might be able to optimize the choice of d under certain
assumptions. Let’s suppose that we have a moderate number of edges, that is |E| = |V |1+
for 0 <  < 1. What value of d yields an overall running time of O(|E|)?

Question 4: Splitting the Load (25 points)

You and a friend are going on a hike. You have n items that you need to split between the two
of you. Each item has an integer weight and you want to split them as close as possible to be
the same total weight. Let’s say the total weight of the items is t. For simplicity, all you have
to return is the total weight wh of the heavier pile (wh ≥ t/2). You can assume the input is a
list with n integers.

(a) (5 points) A greedy algorithm will not work for this problem. Prove this by giving a

(b) (10 points) For a dynamic programming approach, devise an optimal substructure property
for an optimal solution to this problem, and prove that it is correct. Write a recurrence, or
recursive program that solves this problem.