Problem 1. In various areas of theory, one often needs an encoding (into the natural numbers) of various objects. Consider encoding finite sorted lists of natural numbers. We want, naturally, an encoding that is injective, reasonably easy to construct (and to decode later), and such that nonsense codes, if any (there will be some if the encoding function is not surjective), are easily detected. Someone proposed the following encoding scheme, unfortunately defined ``in reverse.''
The function f maps natural numbers into finite sorted lists of natural numbers (sorted in decreasing order) and is defined recursively as follows
Problem 2.
You are given a query routine which, when given the parameters i and j (two
distinct positive integers) returns the result of the game played between
players i and j. (Every one of the possible (n || 2) games has been
played.) Design and analyze an algorithm that makes Q(nlogn) queries
and orders the players in such a way that the ith player in the resulting
ordering has defeated the i+1st, for each i, 1 £ i < n.
Problem 3. Analyze the time and space complexity of the following algorithm, intended to compute the maximum and minimum of a set by the divide-and-conquer method.
function MaxMin(S,k)
if ( |S| <= k )
then return (max(S), min(S))
else begin
{S1,S2,...,Sk} <- partition(S,k)
for i=1 to k do
(mxi,mni) <- MaxMin(Si,k)
return (max(mx1,mx2,...,mxk), min(mn1,mn2,...,mnk))
end
end
The functions max (which returns the maximum of a set of no more than
k elements), min (which returns the minimum of such a set),
and partition (which divides a set into k subsets, as nearly equal
in size as possible) are predefined.
Specifically, you are to find, as a function of |S| and k:
Consider the following generic problem. Some process (details later) has created an unrooted binary tree (in which every node has degree 1 or 3) with every node (internal nodes and leaves) labelled. The space of labels is a metric space, that is, there exists a distance measure on the space of labels that obeys the properties of a metric (non-negative, obeys triangle inequality, etc.).
Unfortunately, you have not been given the labels of the internal nodes; indeed, you have not even been given the tree-you only know the labels of the leaves and the distance metric. From this information, you are to reconstruct the tree (and, as a bonus, the labels of the internal nodes). This is a common problem in evolutionary biology, linguistics, and other areas: reconstruct from modern characteristics the evolutionary history of a collection of species, languages, etc.
In most cases, you will also be given a specification of the stochastic process by which new labels are created from existing ones-i.e., of the process by which the label at one end of a tree edge is transformed into the label at the other end of the tree edge. If the data are DNA sequences, for instance, the process is basically the same as in string editing: insertion, deletion, and substitution are the three mechanisms.
Simplify the problem to assume that all labels are strings over a small alphabet (as in DNA, over a 4-character alphabet) and that all strings have exactly the same length. Assume that we want to produce the tree with the smallest possible sum of edge lengths.
Propose a method for computing plausible distances from the new entry to the other n-2 entries. Does your method provide a way to assign a label to the new internal node? Analyze the running time of the entire algorithm. Finally, provide an example showing that this greedy algorithm, unlike Huffman's algorithm from which it is inspired, is not optimal.
Problem 1 (algorithm design). You are given a rooted (but not necessarily binary) tree of items, where each item has a positive weight. (For instance, the tree can be specified by a collection of pairs of the form (i,p(i)), where i is an item and p(i) its parent in the tree.) Devise (and analyze and prove the correctness of) an efficient algorithm to produce a total ordering of the items that
|
Problem 2 (computational geometry).
Devise, prove the correctness of, and analyze a fast algorithm to compute
exactly (i.e., not through probabilistic or approximate methods)
the area of a convex polygon. The polygon is given as a list of vertices,
where each vertex is a pair of coordinates, and where the list follows a
counterclockwise tour of the perimeter.
Assume that an unbounded precision rational arithmetic package is available.
If the input is a collection of one-word integers, how many words might
you need to represent the answer?
How does your answer change if the polygon is simple, but not necessarily convex?
Problem 3 (computational geometry).
What is a kinetic data structure? How does it differ from a more conventional
dynamic data structure? What purpose does it serve? What assumptions are
usually made about their usage? What are the main tradeoffs to be considered
in the design of a kinetic data structure? Since we can often dynamize
typical geometric data structures, do you think we can kinetize them as well?
Write a few paragraphs, using various examples (which can be very artificial and need not be from the literature) to illustrate
Problem 4 (optimization algorithms).
Consider the following problem. An n×n array of positive integers
is wrapped around in a cylinder (so that its nth row is followed by its
1st row). A path across the cylinder is a succession of n array elements,
such that the first element in the path is an element of the first column,
the last element in the path is an element of the last column,
and an element (i,j)
can be followed only by one of the three adjacent elements in the next
column, namely, (i,j+1), (i-1,j+1), or (i+1,j+1) (where all arithmetic
is done modulo n). Finally, the cost of a path is just the sum of the values
of the elements in the squares crossed on the path.
Design an efficient (decent polynomial time) algorithm which finds a minimum-cost path across the cylinder. Carefully analyze your algorithm; in particular, find the total number of possible paths, the number of paths examined by your algorithm, and its running time.
Problem 5 (complexity). The most celebrated result in Complexity Theory in the last 25 years is the PCP theorem. This theorem states that every problem in NP has a certificate that can be checked probabilistically by looking only at a constant number of bits (chosen at random with the help of a logarithmic number of random bits) in the certificate. The main use of this theorem so far has been to show inapproximability results-that is, to show that some problem P cannot be approximated in polynomial time within a better ratio than 1+a (for some constant a > 0) unless we have P=NP. Explain the connection between the PCP theorem and the inapproximability results in your own words: no proof, of course, but discuss informally the path that links the two types of results.
Problem 6 (complexity).
Prove that the following problem is NP-complete. Does your proof imply
that the problem is NP-complete even for graphs of degree 3? for planar graphs?
Instance: a graph, G=(V,E), and a positive integer, k £ |V|.
Question: is there a coloring of the vertices with two colors, say {r,b}, such that, for each vertex colored r, there is at least one adjacent vertex colored b, and where there are at least k vertices colored r?
Problem 7 (computability).
Define a real number to be computable if there exists a total function
f such that: (i) f(0) is the integer part of the number; and (ii) for
n > 0, f(n) is the nth decimal digit of the fractional part of the number.
Problem 8 (computability).
For each of the following sets and their complements,
establish whether it is recursive, r.e., or non-r.e.
Problem 9 (parallel algorithms). A standard randomized parallel algorithm for symmetry-breaking runs on constant time and linear total work on an EREW PRAM model as follows:
Use this algorithm to develop a list-ranking algorithm that runs in O(logloglogn) time with high-probability.