#+TITLE: Theory of Computation

* meta
- website :: http://www.santafe.edu/~moore/500/500.html
- book :: Nature of Computation, print shop, Dane Smith Hall
- mailing list :: listinfo/cs500
- office hours :: available by email and Thursday 1:15-2:00 and 3:30-4:30
- grade ::
  - 50% homework
  - 16.66% midterm
  - 33.33% final
- recommended book :: Sipser's /Introduction to the Theory of Computation/

** hw and midterm statistics
|         | Average | Median | Highest | Lowest |
|---------+---------+--------+---------+--------|
| Hw#1    |      76 |     79 |     100 |     52 |
| Hw#2    |    79.5 |     78 |      97 |     62 |
| Hw#3    |      81 |   82.5 |      96 |   72.5 |
| Midterm |      62 |     65 |      97 |     21 |
| Hw#4    |      82 |     87 |      96 |     51 |

* class notes
** 2010-01-19 Tue
This course will largely deal with /computational complexity/, specifically drawing /qualitative distinctions/ between problems in terms of their complexity.

Look at Martin Gardner's collection of math games.

*** Konigsberg's bridges -- Eulerian paths
file:data/konigsberg-bridges.png

Is it possible to cross every bridge exactly once? Euler turned this into a graph problem and found an analytic solution -- *no*, because there are more than 2 vertices with odd degree.

#+begin_src dot
graph simple {
  top -- i1;
  top -- i2;
  top -- i1;
  i1 -- i2;
  bottom -- i1;
  bottom -- i2;
  bottom -- i1;
}
#+end_src
file:data/konigsberg-graph.png

A connected graph G contains an Eulerian path iff G has at most 2 vertices of odd degree (and a closed Eulerian tour iff it has none).

Problem statement
- name :: Eulerian Tour (decision problem)
- input :: a graph G
- question :: does G have an Eulerian tour
- complexity :: P

*** Hamiltonian Paths
Like Eulerian paths, only you must visit each vertex exactly once rather than each edge.
Problem statement:
- name :: Hamiltonian tour
- input :: a graph G
- question :: does G have a Hamiltonian tour
- complexity :: NP

*** degrees of complexity
- Computable :: can be solved in finite time
- PSPACE :: can be solved with a polynomial amount of memory
- NP-complete :: every problem in NP can be transformed into an NP-complete problem
- NP :: can _check_ a solution quickly -- a needle in a haystack, in that you know when you've found the needle
- P :: can be solved in polynomial time
- Log-space :: like finishing a maze, can be solved with $O(\log n)$ memory

*** asymptotic notation
| O | $\Theta$ | o | $\Omega$ | $\omega$ |
|---+----------+---+----------+----------|
|   |          |   |          |          |

** 2010-01-21 Thu
Moore's law (no relation to the professor) -- everything about computers improves exponentially, roughly doubling every 1.5 years
- for polynomial problems this means the size of the problem we can solve in the same time grows by a constant factor (it doubles for a linear-time algorithm)
- for exponential problems (say $2^n$) this means the size of the problem can only grow by 1

*** models of computation -- we don't care
We don't care about polynomial changes in runtime -- as long as my computer can simulate yours in polynomial time they're equal.

- problem representation :: The run-time can vary based on the graph representation. For example, in the bridges of Konigsberg, checking the number of odd-degree vertices would be
  - $\Theta(n^2)$ for an n by n vertex-to-vertex matrix
  - $\Theta(m)$ for a list of m edges
  We won't care about the representation of our problems or about small changes in the run time -- we just care that the problem can be solved in polynomial time.
- models of computation ::
  - RAM: has constant time for any memory access
  - Turing machine: has various access times based on the location of memory on the tape -- even in the worst case, this could take a program running in time $t$ and push the time up to $t^2$, and again we don't care about these small changes

Some models of computation *do* matter.
For example we /believe/ that factoring large integers is outside of P for normal computers, but it is /known/ that it is in BQP, the analog of polynomial time for quantum computers.

The take home point is that P is robust across almost all models of computation.

*** worst case complexity -- is what we care about
We always care about worst-case complexity -- as if selected by the /adversary/ who has god-like abilities and will always serve up the worst possible example for our algorithm.

Part of CS's preoccupation with adversarial thinking could be its birth in the cryptography of WWII.

*** Euclid's algorithm for gcd
#+begin_src haskell
euclid(a,b) = if (b == 0) then a else euclid(b, a `mod` b)
#+end_src

This works because any common divisor of a and b is also a common divisor of b and =a mod b= (and vice versa) -- basically an inductive proof, the base case and inductive step of which come directly from the above algorithm.

How long does this take to run?
- suppose a and b are n-bit numbers (n normally is the number of bits required to pose a question)
- =a mod b= can be computed in poly(n) time
- claim: if $b \leq a$ then $a \bmod b \leq \frac{a}{2}$
  -> a halves every 2 steps
  -> the number of bits decreases by 1 every two steps
  -> linear number of operations
- linear * poly = poly, so gcd is in P

The above is a good example of the level at which we will compute the running time of algorithms.

The worst case turns out to be when a and b are adjacent Fibonacci numbers.
- $F_t \sim \phi^t$
- $t \sim \log_\phi{a}$
- n, the number of bits, is $\log_2(a)$
- $t = O(n)$

*** multiplication -- a cautionary tale
How can you do better than $O(n^2)$ running time for multiplication of n digit numbers?
The solution is divide and conquer -- recursively multiply n/2 digit numbers
- $x = 10^{n/2}a + b$
- $y = 10^{n/2}c + d$
- $xy = 10^{n}ac + 10^{n/2}(ad+bc) + bd$
- $T(n) = 4T(n/2) + O(n)$, which is still $\Theta(n^2)$
- however, given that $(a+b)(c+d) = ac + bd + ad + bc$, and we only need $(ad + bc)$ and we're already calculating $ac$ and $bd$, we can just subtract those from $(a+b)(c+d)$, meaning we only need to do 3 instead of 4 multiplications, so
- $T(n) = 3T(n/2) + O(n)$, which gives $T(n) = \Theta(n^{\log_2 3}) \approx \Theta(n^{1.585})$
- if you continually divide into smaller sections this turns into the /convolution/ of two sequences

The take home point is that a lower bound on running time is very difficult to prove.

*** P vs. NP
We can't prove that NP problems can't be solved in polynomial time, we can just relate the hardness of all of these NP problems to each other.

** 2010-01-26 Tue
*** checkerboard domino trick
Question
- suppose I remove two opposite corners from a checkerboard
- is it possible to cover the remaining places on the board with dominoes?

Answer
- no: there are two more squares of one color than the other, and each domino will cover one square of each color

*** Hamiltonian paths on grids
Prove that a rectangular grid has a Hamiltonian cycle (a closed tour) iff one side is even.

proof: the total number of vertices must be even, just like the checkerboard coloring problem above

*** review and dealing with big-O
$$f(n) = O(g(n))$$ does not imply $$2^{f(n)} = O(2^{g(n)})$$
For example take $f(n) = 2n$ and $g(n) = n$: then $f = O(g)$ but $2^{f(n)} \neq O(2^{g(n)})$ because
$$\frac{2^{2n}}{2^n} = 2^n \rightarrow \infty \text{ as } n \rightarrow \infty$$

When the limit exists,
- $f=O(g)$ means $\lim(\frac{f}{g}) < \infty$
- $f=o(g)$ means $\lim(\frac{f}{g}) = 0$
- $f=\Omega(g)$ means $\lim(\frac{f}{g}) > 0$
- $f=\Theta(g)$ means $A \leq \lim(\frac{f}{g}) \leq B$ for constants $0 < A \leq B < \infty$

*** finite state automata (the flatworms of theoretical computer science)
I have a string of a's and b's, and a rule that says no two b's in a row.
The following creature can check this rule:

#+begin_src dot
digraph fsa {
  1 -> 2 [ label = "b" ];
  1 -> 1 [ label = "a" ];
  2 -> 1 [ label = "a" ];
  2 -> 3 [ label = "b" ];
  3 -> 3 [ label = "a or b" ];
}
#+end_src
file:data/simple-fsa-a-b.png

- alphabet $\Sigma = \{a, b\}$
- set of states $Q = \{1, 2, 3\}$
- transition function $\delta : Q \times \Sigma \rightarrow Q$
- start state $q_0 \in Q$
- accept states $F = \{1, 2\} \subseteq Q$
- language $L \subseteq \Sigma^*$
- the language "recognized" by M is the set of words it accepts (e.g. no consecutive b's)

A language L is _regular_ if there is a DFA that recognizes it.

What would be an FSA which accepts any string where the 3rd to last symbol is a 1?

#+begin_src dot
// each state is the last three symbols read; on input x, state "abc" moves to "bcx"
// accepting states are those whose first symbol is 1
digraph fsa {
  "111" -> "110" [label = "0"];
  "110" -> "100" [label = "0"];
  "101" -> "010" [label = "0"];
  "100" -> "000" [label = "0"];
  "011" -> "110" [label = "0"];
  "010" -> "100" [label = "0"];
  "001" -> "010" [label = "0"];
  "000" -> "000" [label = "0"];
  "111" -> "111" [label = "1"];
  "110" -> "101" [label = "1"];
  "101" -> "011" [label = "1"];
  "100" -> "001" [label = "1"];
  "011" -> "111" [label = "1"];
  "010" -> "101" [label = "1"];
  "001" -> "011" [label = "1"];
  "000" -> "001" [label = "1"];
}
#+end_src
file:data/another-simple-fsa-a-b.png

*** machinery for proving things about FSA
- fix a language L
- for two words $u,v \in \Sigma^*$
- say that $u \sim v$ if $\forall w \in \Sigma^{*}$, $uw \in L \Leftrightarrow vw \in L$

To prove that a language is not regular it is sufficient to provide an infinite set of mutually inequivalent words.

Punchline for today -- a language is _regular_ iff it has a finite number of equivalence classes under this $\sim$ relation.

** 2010-01-28 Thu
Say that $u \sim v$ if $\forall w : uw \in L \Leftrightarrow vw \in L$.

The negation would be $u \nsim v$ if $\exists w : uw \in L \wedge vw \notin L$ (or vice versa).

Using this $\sim$ relation we can divide $\Sigma^*$ into equivalence classes.
In the smallest possible FSA there is a one-to-one and onto correspondence between its states and the equivalence classes.

L is regular $\Leftrightarrow$ $\sim_L$ has a finite number of equivalence classes.

If M and M' are both minimal machines for L, then $M \cong M'$.

This is the Myhill-Nerode Theorem.

*** intersections of regular languages
If $L_1$ and $L_2$ are regular then is $L_1 \cap L_2$ regular? Yes.

The number of states in the (product) machine for $L_1 \cap L_2$ is at most the product of the numbers of states in the machines for $L_1$ and $L_2$.

Once you know that the complements of regular languages are regular, and the intersections of regular languages are regular, then you know that the complement of the intersection of the complements of regular languages (which is the union, by De Morgan's law) is regular.

*** concatenation of regular languages
is not as straightforward

$L_1 L_2 = \{w_1 w_2 \mid w_1 \in L_1, w_2 \in L_2\}$

*** non-deterministic finite automata (NFA)
All that matters is that $\exists$ an accepting path.

Is the set of languages recognized by NFAs *bigger* than the set of languages recognized by DFAs? The answer is no: given any NFA $\exists$ a DFA which recognizes the same language.

** 2010-02-02 Tue
*** two points from the homework
1) when things are too obvious they can be hard to prove (e.g. /Euclid's algorithm/). Inductive proofs are structurally identical to recursive algorithms; exploit this and convert the recursive =Euclid= algorithm to an inductive proof of its validity for solving GCD.

   #+begin_src haskell
   primes a = [x | x <- facts a, prime x]
       where
         facts a = [x | x <- [1..(a - 1)], a `mod` x == 0]
         prime a = facts a == [1]
   #+end_src

2) in the questions about regular languages the alphabet of pairs of bits can be combined into words which express two binary integers.
$$\binom{1}{0} \binom{0}{1} \binom{1}{1} = \binom{x}{y}$$
   Here the top row spells $x = 101$ and the bottom row spells $y = 011$.

*** NFA and DFA
NFA: a non-deterministic finite state automaton consists of:
- $\Sigma$ alphabet
- $Q$ finite set of states
- $q_0 \in Q$ start state
- $F \subseteq Q$ accepting states
- $\delta : Q \times \Sigma \rightarrow P(Q)$ transition function

$L$, the language recognized by an NFA, is $L = \{w \in \Sigma^* \mid \text{a possible path defined by w leads from start to F}\}$

Let's apply this to another of our familiar languages -- the language over $\Sigma = \{0, 1\}$ where the third-to-last symbol is a 1.

#+begin_src latex
% Define block styles
\tikzstyle{rstate} = [circle, draw, text centered, font=\footnotesize, fill=red!25]
\tikzstyle{astate} = [circle, draw, text centered, font=\footnotesize, fill=blue!25]
\begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
  \node [rstate] (a) at (0,0) {a}; % start state
  \node [rstate] (b) at (1,0) {b}; % guessed the third-to-last symbol
  \node [rstate] (c) at (2,0) {c}; % second-to-last symbol read
  \node [astate] (d) at (3,0) {d}; % last symbol read, accepting
  \path (a) edge [loop above] node {0,1} (a);
  \path (a) edge node {1} (b);
  \path (b) edge node {0,1} (c);
  \path (c) edge node {0,1} (d);
\end{tikzpicture}
#+end_src
file:data/fsa.pdf

In this NFA we /guess/ at some point that we're on the third-to-last symbol in the word and jump to state $b$. Note that in the above there is *no* legal transition out of state $d$.

*** let's prove that every NFA can be converted to a DFA
In effect our DFA would need to track the set of all states that we could be in were we using our NFA, and whether any of those states accept.
So to define our new DFA in terms of the elements from our old NFA we get the following
- $Q' = P(Q)$
- $q_0' = \{q_0\}$
- $F' = \{S : S \cap F \neq \emptyset \}$
- $\delta'(S,a) = \bigcup_{q \in S}{\delta(q,a)}$

Note that $|Q'| = 2^m$ when $|Q| = m$ (problem 10 on hw1).

Recall our language of concatenated words $L_1L_2 = \{w : w = w_1w_2, w_1 \in L_1, w_2 \in L_2\}$.

Notice that while the statement "if $L$ is reg., so is $\bar{L}$" is obvious in the world of DFAs (swap the accepting and non-accepting states) it is not in the world of NFAs.

*** regular expressions
Regular expressions over the alphabet $\Sigma$
- $\emptyset$ the empty set
- $\epsilon$ the empty word
- $a$ s.t. $a \in \Sigma$
- if $\phi$ and $\phi'$ are regular expressions then
  - $\phi + \phi'$, their union, is also a regexp
  - $\phi\phi'$, their concatenation, is also a regexp
  - $(\phi)^*$, zero or more repetitions of $\phi$, is also a regexp

The languages recognized by regular expressions are equivalent to the languages recognized by DFAs and NFAs etc...

Partial proof by induction
- base cases -- can be recognized by DFAs
  - $\emptyset$
  - $\{\epsilon\}$
  - $\{a\}$
- inductive step
  - if $\phi$ and $\phi'$ can be recognized by DFAs then so can $\phi + \phi'$
  - if $\phi$ can be recognized by DFAs then so can $\phi^*$; for this it's more convenient to use NFAs -- we just wire an $\epsilon$ transition from each accepting state back to the initial state.

** 2010-02-04 Thu
*** pumping lemma
A method of proving that a language $L$ is not regular.

If L is regular, then:
\exists an integer p s.t.
\forall strings s \in L with |s| \geq p
\exists strings x,y,z s.t. s = xyz, |y| \geq 1, |xy| \leq p
and \forall integers i \geq 0, $xy^iz \in L$.
- basically you can /pump-up/ the inner part of the word and continually produce words in the language
- this corresponds to loops in the FSA defining the language
- $p$ is the minimum number of steps required before you are retracing previously visited states
- *note* the above only has to hold for strings where $|s| \geq p$ and there is no requirement that there need be any such strings in the language
- in languages with long words the existence of a loop in the FSA is guaranteed because the FSA must have finitely many states, and once $|s| \geq p \geq |Q|$ you're set

This can be used to prove languages are /not/ regular through the contrapositive.

*** application of the pumping lemma
Negation of the pumping lemma: just flip all of the quantifiers...

Using the pumping lemma to prove that the language consisting of words with an equal number of a's and b's is not regular: $\forall p$ just select the word of length $2p$ composed of p a's followed by p b's. Then it is not possible to select a sub-string in the first p letters which can be repeated -- because the first p letters are all a's, pumping it changes the number of a's but not the number of b's.

An important take home point is that we have nothing corresponding to the pumping lemma for which problems are in P (solvable in polynomial time). We don't have anything that we know *must* be true $\forall$ problems in P.

*** context free grammars
An example: consider the following rules
- $S \rightarrow aSb, \epsilon$ which describes the language of words with a number of a's followed by that same number of b's.
- $S \rightarrow x,y,(S + S), (S * S)$ which results in all grammatically correct algebraic statements with parens, +'s and *'s

These context free grammars can be used to describe the programming languages which we use.

This comes from linguists associated with Noam Chomsky, who believed that rules like this were how humans thought about and manipulated language.

Regular languages are to FSAs as these grammars are to FSAs augmented with simple stacks (push-down automata).

These grammars are context free because the left side of every $\rightarrow$ is always a single symbol (no context).

Type checking makes real programming languages *not* truly context free.

Where are linguists now? How does our brain really process/generate language?

** 2010-02-09 Tue
*** office hours question -- FSA
How to tell if an automaton is the smallest possible?

There are well known algorithms for minimizing an existing DFA -- either saying yes/no this is/isn't the smallest possible, or suggesting states to merge.

Two states q and q' are equivalent, q \sim q', iff \forall w: $\delta^*(q, w) \in F \Leftrightarrow \delta^*(q',w) \in F$

It turns out that finding the minimal NFA is *much* harder because the notion of state equivalence is more complicated on an NFA.

and thus ends FSA

*** P, NP, and NP-completeness
NP problems are equivalent to finding a needle in a haystack -- what is it about some problems that allows you to skip the exhaustive search (i.e. why can some of these problems be solved in polynomial time)?

We will repeat some material from cs561 as we discuss why some problems can be pulled down from NP into P.

*** Towers of Hanoi
file:data/hanoi.png

#+begin_src clojure
(defn hanoi
  "Move N disks from peg I to peg J (pegs assumed to be numbered 0, 1, 2)."
  [n i j]
  (when (pos? n)
    (let [k (- 3 i j)]                  ; k is the other peg
      (hanoi (dec n) i k)               ; clear the n-1 smaller disks out of the way
      (println "move disk" n "from" i "to" j)
      (hanoi (dec n) k j))))            ; put them back on top
#+end_src

How many moves does it take to move n disks?
$f(n) = 2f(n-1)+1$ or $f(n) = 2^n-1$

This can be proved optimal through induction on the number of disks.

Look at the figure on page 85 of the text to see some of the state space of this problem represented as a graph in which vertices are states and edges are moves.

If we think similarly about our computer as a *large* graph in which nodes are memory states and edges are moves, then the amount of memory needed is the log of the number of vertices and the runtime is the length of a path.

The optimal Towers of Hanoi algorithm is not known for more than 3 pegs.

*** mergesort
:PROPERTIES:
:CUSTOM_ID: mergesort
:END:
The canonical divide and conquer algorithm

#+begin_src clojure
(defn merge-sorted
  "Our sorting zipper: merge two sorted seqs with at most n-1 comparisons."
  [a b]
  (cond (empty? a) b
        (empty? b) a
        (<= (first a) (first b)) (cons (first a) (merge-sorted (rest a) b))
        :else (cons (first b) (merge-sorted a (rest b)))))

(defn mergesort [l]
  (if (<= (count l) 1)
    l
    (let [[lefthalf righthalf] (split-at (quot (count l) 2) l)]
      (merge-sorted (mergesort lefthalf)        ; f(n/2) comparisons
                    (mergesort righthalf)))))   ; f(n/2) comparisons
#+end_src

What's the runtime of mergesort? Let's just count the number of comparisons.
$$f(n) = 2f(\frac{n}{2})+n$$
the solution ends up being
$$f(n) = n\log_2{n}$$

*** quicksort
:PROPERTIES:
:CUSTOM_ID: quicksort
:END:
#+begin_src clojure
(defn quicksort [l]
  (if (empty? l)
    l
    (let [p  (first l)                       ; choose our pivot
          lp (filter #(< % p) (rest l))      ; elements less than p
          gp (filter #(>= % p) (rest l))]    ; elements greater than or equal to p
      (concat (quicksort lp) [p] (quicksort gp)))))
#+end_src

- n comparisons to split the list into elements greater and less than the pivot
- if our pivot is really in the middle then we have $2f(\frac{n}{2})+n$ comparisons
- if our pivot is the smallest element, then we have $f(n-1)+n$ comparisons, which becomes the arithmetic series $1 + 2 + 3 + \ldots$, which is $\Theta(n^2)$
- in the *average* case, where p is randomly placed in our list and $a$ is the fractional position of p in our list, we have $f(an)+f((1-a)n)+n$ -- then we set $f(n)$ to the average over all possible values of $a$.
$$f(n) = (n - 1) + \frac{1}{n} \sum_{i=0}^{n-1}{\left( f(i) + f(n - 1 - i) \right)}$$
When $n$ is large we can replace this sum by an integral
$$f(n) = n + \frac{1}{n} \int_0^{n}{\left( f(x) + f(n - x) \right) dx}$$
We can try to substitute in $f(n) = An\ln{n}$ and solve for $A$ (which gives $A = 2$).

This is our first example of a _randomized algorithm_.

Be sure to be explicit about what your input could be
- designed by an adversary
- truly random
- real world

** 2010-02-11 Thu
*** sorting runtimes
Can we sort n things in fewer than $n\log_2{n}$ comparisons?

To distinguish N possibilities with binary (yes/no) questions you will need to ask $\log_2{N}$ questions.

Since there are n! orderings of a list, selecting the correct one will require $\log_2{n!}$ questions
$$\log_2{n!} = n\log_2{n} - n\log_2{e} + O(\log_2{n})$$
which is $\Theta(n\log_2{n})$, so any comparison-based sort needs $\Omega(n\log_2{n})$ comparisons.

/note/: this argument is based upon the minimum amount of time taken for our sorting algorithm to access the information in the list -- not the trivial computation performed on the list info after it is known to the algorithm.

radix-sort and bin-sort are faster /non-comparison/ based sorting algorithms that are applicable in some cases.

*** modular exponentiation and discrete log
:PROPERTIES:
:CUSTOM_ID: modular-exponentiation
:END:
- mod. exponentiation
  - input :: n-digit integers x, y, p
  - output :: $x^y \bmod{p}$
  - discussion :: if $y=1024$ then since $1024 = 2^{10}$ we can just do $x \leftarrow x^2 \bmod{p}$ 10 times. For values of y which are not a power of 2 we can just run our powers-of-2 trick up to the nearest power of 2 below y and multiply together the powers matching the binary digits of y. This is another divide and conquer algorithm; it runs in poly time and is in P. If we have time at the end of the semester we'll look at some cryptography stuff which will relate here.
- discrete log
  - input :: n-digit integers x, z, p
  - output :: y s.t. $z = x^y \bmod{p}$
  - discussion :: this function doesn't appear to be in P even though its inverse /above/ is in P

These functions in which one direction is in P while the inverse isn't are called /one-way functions/.
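The repeated-squaring trick described above can be sketched in Python. This is only an illustration under assumptions, not code from the course: the names =mod_exp= and =discrete_log_brute_force= are made up here, and the brute-force search is included just to contrast the fast forward direction with the naive inverse, which tries every exponent and so takes time exponential in the number of digits of p.

```python
def mod_exp(x: int, y: int, p: int) -> int:
    """Compute x**y mod p using O(log y) squarings (hypothetical helper)."""
    result = 1 % p          # running product of the powers selected so far
    base = x % p            # holds x^(2^k) mod p at step k
    while y > 0:
        if y & 1:           # this binary digit of y is set: include the power
            result = (result * base) % p
        base = (base * base) % p   # square to get the next power of two
        y >>= 1
    return result


def discrete_log_brute_force(x: int, z: int, p: int):
    """Try every exponent y in turn; returns the first y with x^y = z (mod p),
    or None if there is none. Exponential in the number of digits of p."""
    for y in range(p):
        if mod_exp(x, y, p) == z % p:
            return y
    return None
```

For example, =mod_exp(2, 10, 1000)= squares and multiplies only a handful of times, while the brute-force inverse may have to walk through every candidate exponent below p.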
There are some cool applications of one-way functions, like generating sequences which are so random-looking that *no* poly-time algorithm can find a pattern in them.

*** fast Fourier transforms
are very important for many day-to-day applications, and are vital to understanding quantum computing and its ability to crack RSA keys, etc...

*** dynamic programming
For example, putting line breaks into a paragraph.

We need to assign some cost to each line based on how stretched its words are, namely the total space in the line minus the amount of space taken by the words and the spaces between them.
$$c(i,j) = \mathit{line\_space} - \sum_{k=i}^{j}{\mathit{length}(w_k)} - (j-i)$$

Taking a /divide-and-conquer/ approach, we would continually place a line break into the paragraph, dividing the paragraph into two sub-paragraphs which we can then typeset. However it is not at all clear a priori where the best initial divisions will be.

Taking a /dynamic programming/ approach, we place a line break after the first line and assign that break the cost of that line plus the cost of the remainder of the paragraph typeset as well as possible.

side note: short-vs-long term costs -- there is a relevant book by the guy who talked on Colbert recently

#+begin_src clojure
(defn typeset-cost
  "Return the lowest cost of typesetting a paragraph of WORDS as well as possible.
COST maps the words of one line to the cost of that line."
  [words cost]
  (if (empty? words)
    0
    (apply min
           (map (fn [break]
                  (+ (cost (take break words))
                     (typeset-cost (drop break words) cost)))
                (range 1 (inc (count words)))))))
#+end_src

This would be very inefficient because we are continually recalculating the cost of the same sub-paragraphs. However we can cache our intermediate results as in the following -- also, since it's in Clojure, it's multithreaded with safe access to the =cache=.

#+begin_src clojure
(def cache (ref {}))

(defn typeset-cost
  "Return the lowest cost of typesetting a paragraph of WORDS as well as
possible -- with thread-safe caching."
  [words cost]
  (or (@cache words)
      (let [best (if (empty? words)
                   0
                   (apply min
                          (pmap (fn [break]
                                  (+ (cost (take break words))
                                     (typeset-cost (drop break words) cost)))
                                (range 1 (inc (count words))))))]
        (dosync (alter cache assoc words best)) ; remember the answer for this suffix
        best)))
#+end_src

This brings us down from an exponential runtime to a polynomial runtime. So
- dynamic programming :: recursion with memoization

This is typically applicable to strings and to trees -- problems which can be cut into separate problems in a polynomial number of places.

** 2010-02-16 Tue
*** minimum spanning tree
:PROPERTIES:
:CUSTOM_ID: minimum-spanning-tree
:END:
minimum spanning tree
- input :: a weighted graph $G = (V,E)$
- question :: find the spanning tree T with the smallest total weight

#+begin_src dot
graph simple {
  1 -- 2 [style=bold, label = "2"];
  1 -- 3 [label = "8"];
  4 -- 2 [style=bold, label = "1"];
  3 -- 2 [style=bold, label = "3"];
  3 -- 5 [style=bold, label = "2"];
  1 -- 4 [label = "16"];
}
#+end_src
file:data/minimum-spanning-tree.png

greedy algorithm: Kruskal's alg., sort E from lightest to heaviest and add each edge in turn if this doesn't create a cycle.

- *proof*: we will maintain the invariant that the set of edges we have so far, $F \subseteq E$, is contained in some minimum spanning tree (MST) $T$.
  - initialization (/base case/): $F = \emptyset$
  - termination: left as an exercise
  - maintenance (/inductive step/): if $F \subseteq T$ for some MST $T$, and $e$ is the lightest remaining edge that does not create a cycle with $F$, then $F \cup \{e\} \subseteq T'$ for some MST $T'$. If $e \in T$ we are done. Otherwise suppose that $e \notin T$; then $T \cup \{e\}$ has a cycle, which means that *any* of the edges in that cycle could be removed and you would still have a spanning tree. The cycle contains some edge $e' \notin F$ (otherwise $e$ would close a cycle with $F$), and since $e$ was the lightest remaining edge, $e'$ has weight greater than or equal to that of $e$. So swapping $e'$ for $e$ gives a spanning tree $T'$ no heavier than $T$ which contains $F \cup \{e\}$, $\square$.

Note that for the traveling salesman problem (a simple restriction of this problem) a greedy algorithm performs very poorly.
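A minimal Python sketch of Kruskal's algorithm as described above may make the greedy loop concrete. The notes do not say how to test whether an edge creates a cycle; the union-find structure used here is one common choice and is an assumption of this sketch, as are the function name =kruskal= and the =(weight, u, v)= edge format with vertices numbered from 0.

```python
def kruskal(num_vertices, edges):
    """Greedy MST sketch. edges is a list of (weight, u, v) tuples with
    vertices numbered 0..num_vertices-1. Returns (total_weight, tree_edges)."""
    parent = list(range(num_vertices))   # union-find forest (assumed cycle test)

    def find(v):
        # Walk up to the representative of v's component, halving the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree, total = [], 0
    for weight, u, v in sorted(edges):   # lightest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:             # different components: no cycle created
            parent[root_u] = root_v
            tree.append((u, v, weight))
            total += weight
    return total, tree


# The example graph from the figure, with vertices 1..5 renumbered to 0..4.
example_edges = [(2, 0, 1), (8, 0, 2), (1, 3, 1), (3, 2, 1), (2, 2, 4), (16, 0, 3)]
example_total, example_tree = kruskal(5, example_edges)
```

On the example graph this picks the four bold edges from the figure and skips the edges of weight 8 and 16, each of which would close a cycle.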
*** max flow
max flow
- input :: directed graph with two special vertices, the /source/ (s) and the /sink/ (t), and each edge has a capacity
- question :: what is the maximum flow from s to t in the graph

#+begin_src dot
digraph simple {
  s -> 0 [style=bold, label = "2"];
  s -> 1 [style=bold, label = "2"];
  1 -> t [style=bold, label = "2"];
  0 -> t [style=bold, label = "2"];
  0 -> 1 [label = "1"];
}
#+end_src
file:data/max-flow.png

improvement algorithm: if I have a flow $f$ (an amount of flow on each edge, within its capacity, carrying flow from s to t), I can tell if $f$ is optimal, and if it isn't then I can tell how to improve it.

All of the parts of this algorithm will be polynomial in the size of the graph -- including the bits needed to encode the capacities of the edges.

proof: $f$ is optimal unless $\exists$ a path $p$ from s to t s.t. $\forall e \in p$, $e$ has nonzero residual capacity -- not quite true

residual graph: given a current flow $f$, the graph $G_f$ has forward edges $e$ with capacity $c_f(e) = c(e) - f(e)$, and reverse edges $\bar{e}$ with capacity $c_f(\bar{e}) = f(e)$

amended proof: $f$ is optimal unless $\exists$ a path $p$ in the residual graph from s to t s.t. $\forall e \in p$, $e$ has nonzero residual capacity. Flow along a reverse edge cancels out flow along the related forward edge. Refer to the book for the proof.

*note* that the number of iterations of this path -> flow -> residual -> path loop could be infinite with real-number capacities, and the loop can take an exponential number of iterations if the capacities are exponentially large.
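The path -> flow -> residual -> path loop can be sketched in Python as follows. This is an illustration under assumptions rather than the book's algorithm: the notes leave the choice of augmenting path open, and this sketch picks a shortest path by breadth-first search (the Edmonds-Karp rule), which is one known way to avoid the exponential behaviour mentioned above. The function name =max_flow= and the dictionary format for capacities are made up for the example.

```python
from collections import deque


def max_flow(capacity, s, t):
    """capacity maps (u, v) -> edge capacity. Returns the max flow value s -> t."""
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c   # forward edge
        residual.setdefault((v, u), 0)                   # reverse edge starts empty
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    flow = 0
    while True:
        # Breadth-first search for a path of nonzero residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow            # no augmenting path left: the flow is optimal

        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck   # use up forward capacity
            residual[(v, u)] += bottleneck   # allow this flow to be cancelled later
            v = u
        flow += bottleneck


# The example graph from the figure above.
example_capacity = {("s", 0): 2, ("s", 1): 2, (1, "t"): 2, (0, "t"): 2, (0, 1): 1}
example_value = max_flow(example_capacity, "s", "t")
```

The reverse entries in =residual= are what let a later path undo part of an earlier one, which is exactly the amendment the residual graph makes to the first, not-quite-true proof.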
*note* in a fitness landscape, local optima only exist if there is an idea of /small/ changes, so broadening the set of /small/ changes can remove local optima and smooth a fitness landscape

*** reduction/transformation between problems
min cut
- input :: a weighted graph
- question :: find the _cut_ $C \subseteq E$ which eliminates all paths between s and t and minimizes the capacity of the edges cut

In every case the weight of the minimum cut is equal to the maximum flow -- intuitively this should be clear, each problem finds the bottleneck between two subgraphs containing s and t.

file:data/min-cut.png

A _reduction_ from problem a to problem b is a poly-time translation of instances of a to instances of b.

Here's one more example of a problem amenable to reduction/translation

Bipartite perfect matching
- input :: bipartite graph $G$
- question :: find a set of edges s.t. every vertex is contained in exactly one edge.

This is reducible to max flow: add s with an edge to every vertex in one half of the bipartite graph, add t with an edge from every vertex in the other half, give every edge (including those of the compatibility graph) a capacity of 1, and ask if there is a flow of value n.

so, /Perfect Matching/ $\leq$ /Max Flow/

** 2010-02-18 Thu
*** un-skipping part of section three -- Reachability
Reachability
- input :: directed graph G and two vertices s, t
- question :: is there a path from s -> t

It is common to ask for the shortest path (either weighted or not)
- middle first search -- as opposed to breadth first or depth first

We will be using an adjacency matrix
\begin{displaymath}
A_{ij} = \left\{
  \begin{array}{lr}
    1 & : (i,j) \in E\\
    0 & : (i,j) \notin E
  \end{array}
\right.
\end{displaymath}

Raising A to powers, $(A^2)_{ij} = \sum_k{A_{ik}A_{kj}}$, and in general $(A^n)_{ij}$ gives us the number of paths of length $n$ from $i$ to $j$.
We can quickly get to high powers of $A$ using the repeated-squaring trick from modular exponentiation.

How would this look as code?

#+begin_src clojure
(defn matrix-square [A]                 ; boolean square: entries are capped at 1
  (let [r (range (count A))]
    (mapv (fn [i] (mapv (fn [j] (if (some #(= 1 (* (get-in A [i %]) (get-in A [% j]))) r) 1 0)) r)) r)))

(defn reachable? [A s t]                ; assumes A is a vector of vectors with 1s on its diagonal
  (loop [A A, n 0]                      ; after n squarings A covers paths of length <= 2^n
    (cond (= 1 (get-in A [s t])) true
          (>= (bit-shift-left 1 n) (count A)) false
          :else (recur (matrix-square A) (inc n)))))
#+end_src

If you're looking for the shortest path your initialization may want to look something like
\begin{displaymath}
A_{ij} = \left\{
  \begin{array}{ll}
    0 & i = j\\
    1 & (i,j) \in E\\
    \infty & (i,j) \notin E
  \end{array}
\right.
\end{displaymath}
Repeated squaring of this matrix, with $\min$ in place of the sum and $+$ in place of the product, would solve the /all pairs shortest path/ problem.

*** on to Chapter 4 -- NP
decision problems (yes/no)
- P :: polynomial time problems -- \exists a program running in poly(n) time which solves the problem, where n is the size of the input measured in bits
- NP :: class of problems where _checking_ a solution is in *P* -- the class of problems where the answer is "yes" iff \exists w : B(x,w) where B \in *P* (we call w the witness)
- coNP :: the class of problems whose complement is in *NP*, for example proving that a graph does not have a Hamiltonian path

*** a tour of problems in NP
Graph k-colorability
- input :: graph
- question :: is there a coloring of the vertices using k colors s.t. no two vertices of the same color share an edge

This is in NP as the witness can be checked in poly time.

We think this takes exponential time.

The 4-colorability of planar graphs was proved with a computer-aided search in the 1970s.

Graph 3-colorability $\leq$ planar graph 3-colorability -- through the introduction of little /gadget/ graphs at each intersection

** 2010-02-23 Tue
*** some points related to the homework
- problem 2
  the point of problem 2 was a language which is not regular, but which does satisfy the pumping lemma. Using closure properties means taking the language's union, intersection, or complement (or any other operation which preserves regularity) and then showing that the resulting language is not regular.
- factoring
  - input :: n-bit integer x
  - output :: a list of prime factors $p_i$ and integers $t_i$ s.t.
    $x = \prod_i{p_i^{t_i}}$
  see the hint on the list -- note that factoring can be reduced to the /find a factor/ problem. So the easiest setup is FACTORING $\leq$ FIND A FACTOR $\leq$ MOD. FACTORIAL
- there is also the divide and conquer problem with Fibonacci numbers -- note that if the given recursion is used directly the running time is poly(l), but that is not polynomial in the number of bits in l -- it needs to be polynomial in the number of bits in l, $poly(n)$ where $n=\log_2{l}$
- finally some terminology related to dynamic programming, /shared subproblems/ -- means basically exactly what the name sounds like -- it's related to the Hamiltonian path problem

  naively this would be checking the n! vertex orders, where $n! \leq n^n = 2^{O(n\log{n})}$

*** more Chapt. 4 -- problems in NP
**** k colorability
NP: \forall yes instances \exists a /witness/, /example/, or /certificate/ of the solution which can be checked in poly time

Graph k-coloring
- input :: G
- output :: is G k-colorable

Last time we mentioned the surprising fact that graph 3-coloring $\leq$ planar graph 3-coloring

**** satisfiability
CNF (conjunctive normal form): an AND of clauses, each of which is an OR of literals; any formula/truth-table can be represented in CNF

A truth assignment is an assignment of each variable to either true or false.

\phi is _satisfiable_ if \exists a truth assignment for which \phi is true

SAT
- input :: a CNF formula \phi
- output :: is \phi satisfiable

This is clearly in NP, it's easy to check a truth assignment.

Proving unsatisfiability is pretty hard.

k-SAT
- input :: a CNF formula \phi with k literals in each clause
- output :: is \phi satisfiable

graph 3-coloring $\leq$ SAT
- one variable for each vertex and color combination
- one clause for each edge and color combination
- four clauses for each vertex (one saying it has at least one color, three saying it does not have two colors at once)

Once you get used to this you realize that it's easy to convert most constraint satisfaction problems into a SAT problem -- and this is something that is actually done in the real world, where smart people spend real time working on efficient SAT solvers.
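The reduction from graph 3-coloring to SAT listed above can be sketched in Python. This is a small illustration, not a practical solver: the variable numbering, the signed-integer literal convention (positive for a variable, negative for its negation), and the names =coloring_to_cnf= and =brute_force_sat= are all assumptions made for the example, and the exhaustive satisfiability check is there only so the reduction can be tried on tiny graphs.

```python
from itertools import product


def coloring_to_cnf(num_vertices, edges, k=3):
    """Translate 'is this graph k-colorable?' into a list of CNF clauses.
    Variable var(v, c) is true when vertex v gets color c."""
    def var(v, c):
        return v * k + c + 1          # variables are numbered from 1

    clauses = []
    for v in range(num_vertices):
        clauses.append([var(v, c) for c in range(k)])        # v has some color
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1), -var(v, c2)])   # v has no two colors
    for u, v in edges:
        for c in range(k):
            clauses.append([-var(u, c), -var(v, c)])         # endpoints differ
    return clauses


def brute_force_sat(clauses, num_vars):
    """Try all 2^num_vars truth assignments (exponential; tiny inputs only)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
triangle_clauses = coloring_to_cnf(3, triangle)
k4_clauses = coloring_to_cnf(4, k4)
```

With k = 3 each vertex contributes four clauses and each edge three, matching the counts in the list above; a triangle yields a satisfiable formula, while the complete graph on four vertices does not.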
** 2010-02-25 Thu
*** 2 and 3, and SAT -> graph
- coloring
  - 2-coloring is in P
  - 3-coloring is NP-complete (so not in P unless P=NP)
- SAT
  - 2-SAT is in P
  - 3-SAT is NP-complete (so not in P unless P=NP) and is equivalent to every other k-SAT with k \geq 3

p. 112
$\phi(p,q,r) = (p \vee \bar{q}) \wedge (\bar{p} \vee \bar{r}) \wedge (q \vee r) \wedge (p \vee q)$

#+begin_src latex
  \usetikzlibrary{shapes,arrows}
  % Define block styles
  \tikzstyle{state} = [circle, draw, text centered, font=\footnotesize]
  \begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
    \node [state] (p) at (0,2) {p};
    \node [state] (q) at (1,2) {q};
    \node [state] (r) at (2,2) {r};
    \node [state] (np) at (0,0) {$\bar{p}$};
    \node [state] (nq) at (1,0) {$\bar{q}$};
    \node [state] (nr) at (2,0) {$\bar{r}$};
    \path (p) edge node {} (nr);
    \path (q) edge node {} (p);
    \path (r) edge node {} (np);
    \path (np) edge node {} (q);
    \path (np) edge node {} (nq);
    \path (nq) edge node {} (r);
    \path (nq) edge node {} (p);
    \path (nr) edge node {} (q);
  \end{tikzpicture}
#+end_src
file:data/cnf-graph.png

each 2-literal clause $(a \vee b)$ becomes the two implications $\bar{a} \rightarrow b$ and $\bar{b} \rightarrow a$, drawn as directed edges

the formula is satisfiable iff no cycle contains both $x$ and $\bar{x}$ for any variable $x$.

while there are unset vars...
- choose unset x
- if path x -> $\bar{x}$, set x false
- if path $\bar{x}$ -> x, set x true
- else set x however you want
then do unit clause propagation

note that edges in this graph come in pairs, so x -> y means $\bar{y}$ -> $\bar{x}$

it's tempting to do something similar for 3-SAT, however we can't: a clause of three literals is not a single implication between two literals

*** k-SAT <= 3-SAT
Thus far we've only done /gadget/ reductions, where we make simple substitutions to get from one problem to another, however for problem reduction we can do /anything/ which can be accomplished in polynomial time

reduction of a 5-literal clause to 3-literal clauses
$$(x_1 \vee x_2 \vee x_3 \vee x_4 \vee x_5)$$
goes to
$$(x_1 \vee x_2 \vee z_1) \wedge (\bar{z}_1 \vee x_3 \vee z_2) \wedge (\bar{z}_2 \vee x_4 \vee x_5)$$

what's qualitatively different between 2 and 3?

*** NP-completeness -- enough beating around the bush, Chapt.
5

a problem A is NP-complete if
1) A \in NP
2) \forall B \in NP, B $\leq$ A (there is a poly-time reduction from B to A)

- Prove 3-SAT is NP complete

  if B is in NP, then \exists a program C(x,w) that returns true iff w is a valid witness for x, where x is a yes-instance of B.

  let's replace the word /program/ above with /circuit/. so we compile our program all the way down to a Boolean circuit converting the input bits to output bits.

  claim: given an instance x of B, we can generate a circuit c'(w) s.t. c'(w)=true iff w is a valid witness for x. this is a reduction from B to CIRCUIT-SAT

  CIRCUIT-SAT
  - input :: a Boolean circuit c'
  - output :: is there an input w s.t. c'(w) = true

  so we've shown CIRCUIT-SAT is NP-complete

  reduction is transitive, so if CIRCUIT-SAT $\leq$ 3-SAT then 3-SAT is also NP-complete

  WITNESS EXISTENCE $\leq$ CIRCUIT-SAT $\leq$ 3-SAT

  we can take an instance of CIRCUIT-SAT and assign variables to all internal wires; we can then in a fairly straightforward manner turn the circuit into a k-SAT formula which ends in $\wedge (z)$ where $z$ is the variable for our output.

  How do we know this formula is polynomial in the size of the original circuit? Each gate contributes only a constant number of clauses relating its input and output wires, so the formula is linear in the size of the circuit.

  Summary: for any problem in NP, compile its witness-checker to a circuit, convert that circuit to a 3-SAT formula, and that formula is satisfiable iff a witness exists.

** 2010-03-02 Tue
*** NAE-k-SAT
NAE-k-SAT -- not all equal satisfiability
- input :: a finite conjunction of clauses of k literals
- output :: is there an assignment of variables s.t.
  each clause contains at least one literal that is true and one that is false

note that true and false are totally equivalent in this specification, so for any solution, swapping true and false will yield another solution

*** NAE-2-SAT \in P
NAE-2-SAT $\leq$ Graph 2-coloring

just say that every literal is a vertex, every literal is connected by an edge to its complement, and every clause is an edge

#+begin_src latex
  \usetikzlibrary{shapes,arrows}
  % Define block styles
  \tikzstyle{ts} = [circle, draw, text centered, font=\large, fill=red!25]
  \tikzstyle{fs} = [circle, draw, text centered, font=\large, fill=blue!25]
  \begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
    \node [ts] (y) at (0,2) {$y$};
    \node [fs] (ny) at (1,2) {$\bar{y}$};
    \node [fs] (x) at (0,1) {$x$};
    \node [ts] (nx) at (1,1) {$\bar{x}$};
    \node [ts] (z) at (0,0) {$z$};
    \node [fs] (nz) at (1,0) {$\bar{z}$};
    \path (y) edge node {} (ny);
    \path (x) edge node {} (nx);
    \path (z) edge node {} (nz);
    \path (y) edge node {} (x);
  \end{tikzpicture}
#+end_src
file:data/nae-2-sat.png

*** 3-SAT <= NAE-SAT
3-SAT $\leq$ NAE-4-SAT $\leq$ NAE-3-SAT

so this $\leq$ relation in NP problem reductions requires that we map /yes/ instances to /yes/ instances and /no/ instances to /no/ instances between the two problems -- in this case 3-SAT and NAE-4-SAT

**** to convert from 3-SAT to NAE-4-SAT
$$(x_1 \vee y_1 \vee z_1) \wedge (x_2 \vee y_2 \vee z_2)$$
becomes
$$(x_1, y_1, z_1, b) \wedge (x_2, y_2, z_2, b)$$
where the same new variable $b$ is added to *every* clause, and can be set to either true or false

so, the intuition here: fix an assignment of the original variables. With $b$ false, a clause fails only if its three original literals are all false; with $b$ true, a clause fails only if its three original literals are all true. If there *is* a clause of all false and there *is not* a clause of all true, set $b$ true; swapping true and false everywhere then gives a solution with $b$ false, in which every original clause has a true literal, i.e. a satisfying assignment for the 3-SAT formula. In the other direction, a satisfying assignment for the 3-SAT formula together with $b$ false is a NAE-4-SAT solution. So for a fixed assignment of the original variables the above is not NAE-4-SAT iff there is a clause of all true and one of all false.
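
The yes-to-yes and no-to-no requirement can be checked by brute force on tiny instances. The sketch below is an illustration only: the function names and the signed-integer encoding of literals are assumptions made here, and both checkers simply enumerate all assignments.

#+begin_src python
from itertools import product

def literal_value(literal, assignment):
    """Literals are nonzero ints: +v is variable v, -v is its negation."""
    value = assignment[abs(literal)]
    return value if literal > 0 else not value

def assignments(num_vars):
    for bits in product([False, True], repeat=num_vars):
        yield dict(enumerate(bits, start=1))

def sat(clauses, num_vars):
    """True iff some assignment makes at least one literal true per clause."""
    return any(all(any(literal_value(l, a) for l in c) for c in clauses)
               for a in assignments(num_vars))

def nae_sat(clauses, num_vars):
    """True iff some assignment gives every clause both a true and a false literal."""
    return any(all(len({literal_value(l, a) for l in c}) == 2 for c in clauses)
               for a in assignments(num_vars))

def to_nae4(clauses, num_vars):
    """Add the same fresh variable b to every clause."""
    b = num_vars + 1
    return [list(c) + [b] for c in clauses], num_vars + 1

yes_instance = [[1, 2, 3], [-1, -2, -3]]
# all eight sign patterns over three variables: unsatisfiable
no_instance = [[s1 * 1, s2 * 2, s3 * 3]
               for s1, s2, s3 in product([1, -1], repeat=3)]
#+end_src

On these two instances the 3-SAT answer and the NAE-4-SAT answer of the converted formula agree.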
**** now to show that NAE-4-SAT $\leq$ NAE-3-SAT
we add variables to reduce the size of clauses
$$(x_1 \vee y_1 \vee z_1 \vee t_1)$$
becomes (just need to know what new variables are inserted)
$$(x_1 \vee y_1 \vee \_) \wedge (x_1 \vee z_1 \vee \_) \wedge \ldots$$

*** 3-SAT <= 3-coloring
another gadget reduction, here generating graphical representations of clauses

types of gadgets
- choice :: where you set the values to one of the possible values
- constraint :: where you force two or more variables to obey a constraint

so we can make one color true, one false, and then the other can be used to enforce constraints, so for example

#+begin_src latex
  \usetikzlibrary{shapes}
  % Define block styles
  \tikzstyle{ts} = [circle, draw, text centered, font=\large, fill=red!25]
  \tikzstyle{os} = [circle, draw, text centered, font=\large, fill=yellow!25]
  \tikzstyle{fs} = [circle, draw, text centered, font=\large, fill=blue!25]
  \begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
    \node [os] (o) at (0,0) {other};
    \node [ts] (x) at (1,1) {$x$};
    \node [fs] (nx) at (1,-1) {$\bar{x}$};
    \path (o) edge node {} (x);
    \path (o) edge node {} (nx);
    \path (x) edge node {} (nx);
  \end{tikzpicture}
#+end_src
file:data/choice-gadget-sat.png

with this gadget forcing each variable and its complement to be different colors, how do we convert our clauses into subgraphs of our graph?

turns out we'll use NAE-3-SAT to generate these subgraphs, then the subgraphs just turn into fully connected graphs of three vertices, or triangles. that way they will *not* be 3-colorable if all three vertices outside the subgraph with incoming edges are the same color -- or NAE.
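
Both gadgets are small enough to check exhaustively. The sketch below is only an illustration under assumptions made here: colors are numbered 0 (true), 1 (false), 2 (other), and the clause gadget is modelled as a triangle whose corner i is adjacent to one outside literal vertex with a fixed color.

#+begin_src python
from itertools import product

TRUE, FALSE, OTHER = 0, 1, 2

def three_colorings(num_vertices, edges):
    """All proper 3-colorings of a small graph, by exhaustive search."""
    return [c for c in product(range(3), repeat=num_vertices)
            if all(c[u] != c[v] for u, v in edges)]

# choice gadget: vertex 0 is "other", vertex 1 is x, vertex 2 is x-bar
choice_gadget = [(0, 1), (0, 2), (1, 2)]
choice_colorings = three_colorings(3, choice_gadget)

def clause_triangle_colorable(literal_colors):
    """Can the triangle be colored when corner i touches a vertex colored literal_colors[i]?"""
    for corners in product(range(3), repeat=3):
        distinct = len(set(corners)) == 3
        differs = all(corners[i] != literal_colors[i] for i in range(3))
        if distinct and differs:
            return True
    return False
#+end_src

In every coloring of the choice gadget, x and its complement get different colors and neither matches "other". The clause triangle cannot be colored when its three neighbors are all equal, and can be colored otherwise.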
** 2010-03-04 Thu
reduction of sorting to graphs: consider a graph where each vertex is a number, and we draw directed edges between vertices from the smaller to the greater (representing the less than relation), then sorting can be reduced to finding a Hamiltonian path through this graph

this was to make a point about the directions of reductions: sorting is not as hard as Hamiltonian paths

*** sidebar -- DAGs and topological orderings
if a DAG is a partial ordering, then the /topological orderings/ related to that DAG are all of the possible total orderings which do not violate the partial ordering of the DAG.

*** back to SAT
- clause :: a disjunction of literals
- satisfying assignment :: a truth value for each variable which makes the conjunction of a collection of clauses true

since we know 3-SAT is NP-Hard we'll use it to prove that other problems are NP-Hard

*** independent set is NP-Hard
INDEPENDENT SET
- input :: a graph G and an integer k
- question :: is there a set of k vertices no two of which share an edge
- in NP :: this is trivially in NP, because we can check any set of vertices in polynomial time
- in NP-Hard :: can we reduce 3-SAT to independent set? for each clause introduce a connected subgraph (triangle) where the vertices are the literals in the clause. Then connect each vertex to each of its opposites, so $x$ is connected to every $\bar{x}$. then finding an independent set with size equal to the number of clauses will result in a satisfying truth assignment for 3-SAT.
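
The 3-SAT to INDEPENDENT SET construction above can be sketched directly. This is a minimal illustration with hypothetical names (=sat_to_independent_set=, =has_independent_set=) and a brute-force search that is only sensible for tiny formulas; literals are signed integers.

#+begin_src python
from itertools import combinations

def sat_to_independent_set(clauses):
    """One vertex per literal occurrence; returns (vertices, edges, k)."""
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for a, b in combinations(range(len(vertices)), 2):
        (clause_a, lit_a), (clause_b, lit_b) = vertices[a], vertices[b]
        same_clause = clause_a == clause_b   # triangle inside a clause
        opposite = lit_a == -lit_b           # x connected to every x-bar
        if same_clause or opposite:
            edges.add((a, b))
    return vertices, edges, len(clauses)

def has_independent_set(num_vertices, edges, k):
    """Exhaustive search over all k-subsets of vertices."""
    for subset in combinations(range(num_vertices), k):
        if not any(pair in edges for pair in combinations(subset, 2)):
            return True
    return False

def reduces_to_yes(clauses):
    vertices, edges, k = sat_to_independent_set(clauses)
    return has_independent_set(len(vertices), edges, k)
#+end_src

For example, $(x_1 \vee x_2 \vee x_3) \wedge (\bar{x}_1 \vee \bar{x}_2 \vee \bar{x}_3)$ is satisfiable and its graph has an independent set of size 2, while $(x \vee x \vee x) \wedge (\bar{x} \vee \bar{x} \vee \bar{x})$ is unsatisfiable and its graph has none.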
*** clique is also NP-Hard
CLIQUE
- input :: a graph G and an integer k
- question :: is there a fully connected subgraph of size k

this is exactly the same as independent set on the complement of the graph

#+begin_src latex
  \usetikzlibrary{shapes}
  % Define block styles
  \tikzstyle{vertex} = [circle, draw, text centered, font=\large]
  \begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
    \node [vertex] (1) at (-1,0) {};
    \node [vertex] (2) at (1,0) {};
    \node [vertex] (3) at (-1,1) {};
    \node [vertex] (4) at (1,1) {};
    \node [vertex] (5) at (0,2) {};
    \node [vertex] (10) at (4,0) {};
    \node [vertex] (20) at (6,0) {};
    \node [vertex] (30) at (4,1) {};
    \node [vertex] (40) at (6,1) {};
    \node [vertex] (50) at (5,2) {};
    \path (1) edge node {} (2);
    \path (1) edge node {} (3);
    \path (2) edge node {} (4);
    \path (4) edge node {} (5);
    \path (3) edge node {} (5);
    \path (10) edge node {} (40);
    \path (40) edge node {} (30);
    \path (30) edge node {} (20);
    \path (20) edge node {} (50);
    \path (10) edge node {} (50);
  \end{tikzpicture}
#+end_src
file:data/self-complimentary-graph.png

*** vertex cover
VERTEX COVER
- input :: a graph G and an integer k
- output :: is there a set of vertices of size k s.t. every edge in G touches one of those vertices

If you have a clique of size k in the complement graph of G, then the other |V|-k vertices form a vertex cover in G.

proof -- if there were an edge of G not covered by the non-clique vertices, then both of its endpoints would be in the clique. But an edge of G is a non-edge of the complement of G, so the clique in the complement of G would not be fully connected.
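
The clique / vertex cover correspondence can be checked exhaustively on the 5-cycle drawn above. This is a small sketch with hypothetical helper names; vertices are renumbered 0 to 4, which is an assumption of the example and not part of the figure.

#+begin_src python
from itertools import combinations

def complement(n, edges):
    """Edges of the complement graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in present]

def is_clique(vertices, edges):
    present = {frozenset(e) for e in edges}
    return all(frozenset(pair) in present for pair in combinations(vertices, 2))

def is_vertex_cover(vertices, edges):
    chosen = set(vertices)
    return all(u in chosen or v in chosen for u, v in edges)

def correspondence_holds(n, edges):
    """S is a clique in the complement iff V - S is a vertex cover of G."""
    co_edges = complement(n, edges)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            rest = [v for v in range(n) if v not in subset]
            if is_clique(subset, co_edges) != is_vertex_cover(rest, edges):
                return False
    return True

# the 5-cycle from the figure, with vertices 1..5 renumbered 0..4
cycle = [(0, 1), (0, 2), (1, 3), (3, 4), (2, 4)]
#+end_src

The complement of this 5-cycle again has five edges, and the correspondence holds for every subset of vertices.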
*** set cover
SET COVER
- input :: a set A of n elements, a family F of subsets of A, and an integer k
- question :: is there a sub-family of F of size k whose union is A

trivial (reduce from VERTEX COVER: the elements are the edges, and each vertex gives the set of edges it touches)

** 2010-03-09 Tue
*** NP-COMPLETE review
WITNESS-EXISTENCE
- input :: an instance and a program that checks witnesses to the instance
- question :: is there a witness that satisfies this instance/program
  - we can compile this to an instance of CIRCUIT-SAT
  - which we can convert to a 3-SAT problem
  - which we can convert to a NAE-3-SAT problem (through NAE-4-SAT)
  - which we can convert to GRAPH-3-COLORING

**** 3-SAT to NAE-3-SAT
looking once more at the 3-SAT to NAE-3-SAT reduction (through NAE-4-SAT)
- we can take any 3-SAT instance and add a variable $S$ to each clause, generating an instance of NAE-4-SAT
- and some more...

just be sure that you can map yes instances to yes instances, and no instances to no instances

**** NAE-3-SAT to GRAPH-3-COLORING
#+begin_src latex
  \usetikzlibrary{shapes}
  % Define block styles
  \tikzstyle{ts} = [circle, draw, text centered, font=\large, fill=red!25]
  \tikzstyle{os} = [circle, draw, text centered, font=\large, fill=yellow!25]
  \tikzstyle{fs} = [circle, draw, text centered, font=\large, fill=blue!25]
  \begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
    \node [os] (o) at (0,3) {other};
    \node [ts] (x) at (-3,0) {$x$};
    \node [fs] (nx) at (-2,0){$\bar{x}$};
    \node [ts] (y) at (3,0) {$y$};
    \node [fs] (ny) at (2,0){$\bar{y}$};
    \path (o) edge node {} (x);
    \path (o) edge node {} (nx);
    \path (x) edge node {} (nx);
    \path (o) edge node {} (y);
    \path (o) edge node {} (ny);
    \path (y) edge node {} (ny);
  \end{tikzpicture}
#+end_src
file:data/nae-sat-to-graph-color.png

*** Now for some problems with a different flavor
TILING
- input :: set of rotatable tile shapes T, and a finite region R
- question :: can I tile R with tiles from T w/o gaps or overlaps

for simplicity we'll say both the tile shapes and the region are made of unit squares, and they will be conveyed as gif images (basically images of
bits)

our tile set will be little elbows and squares
: |     +--+
: +--   |  |
:       +--+

we can use these shapes to make wires and gates (see the book) s.t. truth values are based on how the little elbows are aligned in the wires...

the last output can be set up so that it's only covered if the wire heading to it is aligned as true, so the whole shape is tilable *iff* the analogous circuit would have returned true.

so tiling with these shapes is NP-Complete

*** tiling with dominoes is in P
1) convert R to a bipartite graph by coloring the squares as a checkerboard
2) then domino covering is equivalent to bipartite perfect matching, which is equivalent to max flow

there is a close relationship between improving an imperfect domino matching and the Ford-Fulkerson algorithm for improving a flow

once again the difference between *2* and *3* is made manifest. if someone really understood this basic difference that insight should lead to a proof that $P \neq NP$.

*** Integer Partition
introduced in section 4.2.3

INTEGER PARTITIONING
- input :: a list of integers $\{x_1, \ldots, x_l\}$, note: n is the total number of bits, so it's possible for $x_i \gg n$
- question :: is there a balanced partition of this list of integers? $A \subseteq \{1, \ldots, l\}$ s.t. $\sum_{i \in A}{x_i} = \frac{1}{2}\sum_{i}{x_i}$

this is a special case of SUBSET SUM in which we want the sum of elements in $A$ to equal some sum $t$ -- this is in NP

we can try this with dynamic programming...

** 2010-03-23 Tue
*** review of the reduction tree
:   Integer Partition      Independent Set = Clique = Vertex Cover
:           ^                      ^
:           |                      |
:           |                      |
:      Subset Sum                3-Col
:           ^                      ^
:           |                      |
:           |                      |
:        Tiling                 NAE-SAT
:           ^                      ^
:           |                      |
:           |                      |
:  Planar Circuit SAT            3-SAT
:           ^                      ^
:           |                      |
:           |                      |
:      Circuit SAT-----------------+
:           ^
:           |
:         W.E.
file:data/np-reduction-tree.png

*** Cosine Integrals
NP-Complete problem from calculus

COSINE INTEGRALS
- input :: list of integers $a_1, a_2, \ldots, a_n$
- question :: is
  $$\int_{-\pi}^{\pi}{d\theta \, (\cos{a_1 \theta})(\cos{a_2 \theta})\ldots(\cos{a_n \theta})} \neq 0$$

this is actually integer partitioning in disguise
1) recall
   $$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}$$
2) then we have
   \begin{eqnarray*}
     \prod_{j=1}^{n}{\cos{a_j \theta}} &=& \frac{1}{2^n}\prod_{j=1}^{n}{\left(e^{ia_j\theta} + e^{-ia_j\theta}\right)}\\
     &=& \frac{1}{2^n} \sum_{A \subseteq \{1,\ldots,n\}}{\left(\prod_{j \in A}{e^{ia_j\theta}} \prod_{j \notin A}{e^{-ia_j\theta}}\right)}\\
     &=& \frac{1}{2^n} \sum_{A \subseteq \{1,\ldots,n\}}{e^{i\theta \left( \sum_{j \in A}{a_j} - \sum_{j \notin A}{a_j}\right)}}
   \end{eqnarray*}
3) the exponent in a term equals 0 iff A is a balanced partition, and $\int_{-\pi}^{\pi}{e^{im\theta} d\theta}$ is $2\pi$ when $m = 0$ and 0 for any other integer $m$
4) so, in fact the entire integral is equal to
   $$\frac{2\pi}{2^n}(\text{\# balanced partitions})$$

so, telling whether the integral is non-zero is NP-complete, however actually /computing/ the integral is much harder. in general the counting version of an NP problem is in #P (pronounced /sharp P/)

*** primality is in NP
if /p/ is a prime, then the set of non-zero integers mod(p), or the set {1,...,p-1}, forms a _group_ under multiplication

a _group_ requires an operator =.= which is
- closed in the group
- /associative/ meaning a.(b.c)=(a.b).c
- has an /identity/ element e, s.t. a.e=e.a=a
- has /inverses/, \forall a \exists $a^{-1}$ s.t. $a.a^{-1}=a^{-1}.a=e$
- (abelian groups also have this property) a.b=b.a

p has to be prime to ensure the existence of /multiplicative inverses/. generally every element that is relatively prime to n has an inverse mod(n).

a _cyclic group_ is a _group_ generated by a single element a:
$$\{1, a, a^2, \ldots, a^{r-1}\}, \quad a^r=1$$

if p is prime, $\mathbb{Z}^*_p$ is cyclic. the /generator/ = "primitive root"

example
- p = 5
- a = 2 is a primitive root because its powers generate everything in the group: the powers are {1, 2, 4, 3, 1, 2...}

Theorem: p is prime, iff \exists a primitive root a.
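
The p = 5, a = 2 example can be reproduced with a few lines. This is only a sketch of the definition (list the powers and see whether they hit every element of {1,...,p-1}); the function names are made up here, and listing all powers takes time exponential in the number of bits of p, so it is not the poly-time check discussed in these notes.

#+begin_src python
def powers_mod(a, p):
    """The first p-1 powers a^0, a^1, ..., a^(p-2), reduced mod p."""
    seen, x = [], 1
    for _ in range(p - 1):
        seen.append(x)
        x = (x * a) % p
    return seen

def is_primitive_root(a, p):
    """True iff the powers of a cover all of {1, ..., p-1}."""
    return sorted(powers_mod(a, p)) == list(range(1, p))

def has_primitive_root(p):
    return any(is_primitive_root(a, p) for a in range(1, p))
#+end_src

With these definitions =powers_mod(2, 5)= gives the sequence 1, 2, 4, 3 from the example, 4 is not a primitive root mod 5, and a composite such as 8 has no element whose powers cover {1,...,7}.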
a is primitive implies,
- $$a^{p-1} \equiv_p 1$$
- $$\nexists t : 0<t<p-1 \text{ s.t. } a^t \equiv_p 1$$

this is all to show that using a as our witness we can show p is prime in poly-time
- easy to check $$a^{p-1} \equiv_p 1$$, because modular exponentiation is in P
- checking all of the values of t
  - we only have to check values of t which divide p-1
  - however an n-bit number can have more than a polynomial number of divisors
  - so we claim: if \exists t<p-1 s.t. $$a^t \equiv_p 1$$, then \exists a prime q which divides p-1, s.t. $$a^{\frac{(p-1)}{q}} \equiv_p 1$$
  - luckily the /prover/ who gave us our witness will need to give us *both* the primitive root a, and the prime factors of p-1, the combination of which is called /Pratt's primality certificate for p/
- /Pratt's primality certificate for p/
  - a primitive root a
  - prime factorization of p-1, $p-1=q_1^{t_1}q_2^{t_2}\ldots q_l^{t_l}$
  - as well as Pratt certificates for $q_1$, $q_2$, etc...

  so we just make sure that the total size of all these /Pratt certificates/ is poly-size
  - the total number of bits in $q_1 \ldots q_l$ is at most n
  - each time we recurse things get significantly smaller, and we'll only recurse down at most n levels
  - so n levels of n bits = $O(n^2)$

_primality *is* in NP_

- 70s -- primality is in NP (Pratt)
- 70s -- randomized algorithms for testing primality in poly time
- 04 -- deterministic algorithm (AKS) in something like $n^{12}$ time

** 2010-03-25 Thu
next two chapters are both fun/philosophical -- conceptual depth with technical ease

*Note:* definitely read section 6.1.3

*** why is P vs. NP so hard?
Seems intuitively obvious, but seems very hard to prove.

The /Clay Mathematics Institute/ poses 7 questions, great unsolved problems in mathematics, including this problem.

*** what if P=NP
polynomial hierarchy -- $$PH = \bigcup_{i=1}^{\infty}{(\Sigma_i P \cup \Pi_i P)}$$

Polynomial Hierarchy ...
(nested-box diagram: P at the center, inside both NP = \Sigma_1 P (\exists) and coNP = \Pi_1 P (\forall), which both sit inside \Sigma_2 P (\exists \forall) and \Pi_2 P (\forall \exists), and so on upward)

file:data/complexity-classes.png

SMALLEST BOOLEAN CIRCUIT
- input :: a Boolean circuit C
- question :: is C the smallest possible circuit that computes $f_c$?

how many quantifiers would this problem require?
- C is not the smallest: $\exists c'<c: \forall w: f_{c'}(w) = f_c(w)$
- C is the smallest: $\forall c'<c: \exists w: f_{c'}(w) \neq f_c(w)$
- in fact the /circuit difference/ sub-problem is in NP

the building of \forall and \exists quantifiers is similar to claiming a winning strategy in chess, you need to be able to say that
- \forall moves by your opponent \exists a move by you s.t. some poly-time property is still true
- or \forall opponent moves, \exists a move s.t., \forall opponent moves, \exists a move s.t. etc...

*if P=NP then the entire polynomial hierarchy collapses into P*

because P is closed under complement, NP=P -> coNP=P=NP, meaning you could just start absorbing \exists and \forall quantifiers and everything else would also end up in P

*** P-space
: ------------------------------------
:             NEXPEXPTIME
: -------------------------------
:             EXPEXPTIME
:                ...
: ---------------------------
:             NEXPTIME
: ------------------------
:             EXPTIME
: ------------------
:             PSPACE
file:data/p-space.png

*if P=NP then TIME(f(n))=NTIME(f(n))* (for larger f as well, e.g. EXPTIME=NEXPTIME)
- proof :: suppose A \in NEXPTIME, input is n bits long and a witness can be checked in time $t(n)=2^{O(n^c)}$.

  pad out the input: add t(n)-n 0's, now it has length $n'=t(n)$ bits, and the witness can now be checked in time $t(n)=n'$.
  this new padded problem is then in NP, but if P=NP then it's in P, which means it can be solved in poly(n') time, however $poly(t(n))=2^{O(n^c)}$ which means that A \in EXPTIME

*** cryptography
P=NP -> modern cryptography does not work

encryption in polynomial time -> decryption is in NP

*** theorem proof
PROOF CHECKING
- input :: set of axioms A, statement S, proof P (collection of axiomatic statements)
- question :: is P a valid proof of the statement S

SHORT PROOF
- input :: axioms A, statement S, integer L (in unary, to make things easier)
- question :: is there a valid proof P of S which is < L statements long

so, if P=NP then we can tell if proofs exist at arbitrary length in poly(length) time.

*** Goedel's question to von Neumann
let \phi(n) = time it takes for the optimal machine to search for proofs up to length n.

if \phi(n) grows only polynomially (he suggested like n or $n^2$), then the mental effort of mathematicians in resolution of yes/no questions could be replaced by machine.

he had a note in the margin that mathematicians could still be creative in creating axioms

** 2010-03-30 Tue
*** introduction to time hierarchies
It is surprising how few ways we have for proving lower bounds on the runtime of an algorithm.

one of these is /diagonalization/.

we will construct some artificial problems which can be solved in $n^{2.0000001}$ time, but can not be solved in $n^2$ time.

PREDICTION
- input :: a program \Pi and an input x
- question :: if \Pi(x) halts within f(|x|) steps, return \Pi(x) (the output), however if \Pi(x) takes > f(|x|) steps then return "don't know"

is there a faster way to get the output of a program than running the program itself?

in the above "f(|x|)" is the running time of the complexity class you want to "get out of". in that case PREDICTION is outside of the class of problems which can be solved within f(n) steps, or TIME(f(n)), but it is inside of a larger class TIME(g(n)).
by this the existence of PREDICTION proves that \exists g(n) s.t. $TIME(f(n)) \subset TIME(g(n))$

*** diagonalization
CATCH22
- input :: a program \Pi which returns yes/no answers
- question :: suppose \Pi is given its own source code as input, \Pi(\Pi). if it halts within f(|\Pi|) steps, then return the opposite of \Pi(\Pi), else return "don't know"

notice that CATCH22 is a special case of PREDICTION

if CATCH22 itself ran within f(n) steps, then feeding CATCH22 to itself would be a contradiction (it would have to return the opposite of its own answer), so it must take more than f(n) time.

*** time hierarchy theorem
If our programming language model of computation lets us simulate t steps of an arbitrary program \Pi, while running a clock that goes off after t steps, in S(t) time, and if g(n)=o(f(n)), then $TIME(g(n)) \subset TIME(S(f(n)))$

*** why can't we prove P \subset NP w/diagonalization
this could happen if PREDICTION were in NP

it's not in NP because to check programs running in higher and higher poly times, there is no fixed poly time which can check *all* fixed poly times, sort of like how there is no greatest $n \in \mathbb{N}$.

Relativized complexity
- $P^A$ :: class of problems we can solve in poly(n) time given access to an _oracle_ for A. (call subroutine for A in poly time)
- $NP^A$ :: ditto, only for checking in $P^A$

\exists problems A,B s.t. $P^A=NP^A$ but $P^B \neq NP^B$

A proof technique _relativizes_ if it works in all possible worlds, i.e. if it proves that C \neq D, then $C^A \neq D^A$

diagonalization relativizes, and no relativizing technique can prove that P \neq NP.

*** Q-SAT
recall the hierarchy of NP and coNP classes differentiated by their quantifiers (\forall and \exists)

Quantified SAT
- input :: a quantified Boolean formula $\Phi = \exists x_1: \forall x_2 \ldots \exists x_n : \phi(x_1, \ldots, x_n)$
- question :: is \Phi true?

this problem lives in P-SPACE, above our hierarchy. in fact it is P-SPACE complete, meaning it is the hardest problem solvable with polynomial space and unbounded time.
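
To see why polynomial space is enough, here is a minimal sketch of a recursive evaluator for quantified Boolean formulas. The representation is an assumption made for the example: the quantifier prefix is a string of =E= (\exists) and =A= (\forall), and \phi is passed as an ordinary Python function. The recursion is only n levels deep, although the running time is exponential.

#+begin_src python
def eval_qbf(quantifiers, phi, assignment=()):
    """Evaluate Q1 x1 : Q2 x2 : ... : phi(x1, ..., xn).

    quantifiers -- string over 'E' (exists) and 'A' (for all)
    phi         -- function taking a tuple of n booleans
    """
    depth = len(assignment)
    if depth == len(quantifiers):
        return phi(assignment)
    # try both values of the next variable, reusing the same stack frame depth
    branches = (eval_qbf(quantifiers, phi, assignment + (value,))
                for value in (False, True))
    if quantifiers[depth] == "E":
        return any(branches)
    return all(branches)
#+end_src

For instance $\exists x \forall y : x \vee y$ is true (take x true), $\forall x \exists y : x \neq y$ is true, and $\exists x \forall y : x \neq y$ is false.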
we claim that $P^{QSAT}=NP^{QSAT}$, this is true because no matter our world, NP is just P with one more quantifier in front of it, but QSAT with another quantifier is just another instance of QSAT

*** haystack oracle, B
The oracle will say "yes" to at most one string $s_n$ of each length n.

\forall n > 0 we flip a coin
- heads :: choose a random bit string $s_n$ of length n and add it to S (the set of strings B says "yes" to)
- tails :: we don't add anything to S of length n

FINICKY ORACLE
- input :: n in unary
- question :: does B (haystack oracle) say yes to any string of length n

trivially in $NP^B$, however not in $P^B$ because you have to guess a single random string out of $2^n$ possible strings, so you can't reliably find the random string with a poly number of guesses

** 2010-04-01 Thu
review of the midterm questions (see the midterm-solutions.pdf)

we are strongly urged to convince ourselves of the following
$$ \sum_{t=0}^{n}{\binom{n}{t}2^t} = 3^n $$

note that in problem 5, the "insert a vertex in each edge" gadget needs to be extended by completely connecting all of the inserted vertices.

start reading chapter 7 -- it's very fun

** 2010-04-06 Tue
- we will *not* cancel class on Thursday, it will be up on video (link will be sent to the email list, http://mts.unm.edu/Cs_courses.html)
- we should really do ourselves a favor and read Chapt. 7

*** a couple of tidbits from Chapt. 6
the take home point of the following is that there is some significant inner structure inside of P and NP
- if P \neq NP then \exists problems which are in between, i.e. are not in P and are not NP-Complete.
  a couple of problems people believe are in between are
  - factoring
  - graph isomorphism -- is almost always easy in practice
- if any problem in NP \cap coNP is NP-complete, then NP=coNP, this would mean that whenever you have a poly-time property P with a \exists P you could change it to a poly-time property with a \forall P, or rather existence statements and non-existence statements would be equivalent

  This means that the entire polynomial hierarchy would collapse because two consecutive quantifiers of the same type can be collapsed, e.g. (\exists \forall) would be equal to (\exists \exists) which collapses to (\exists)
- total function NP (TFNP) -- witness always exists but is hard to find
  - pigeonhole subset sum
    - input: a list of integers $x_1 \ldots x_l$
    - output: a pair of subsets A \neq B \subset {1, ..., l} s.t. $$\sum_{i \in A}{x_i} \equiv_{2^l} \sum_{i \in B}{x_i}$$
  - see Chpt. 6 for more information, but this relates to non-constructive proofs, and to a complexity class (PPP) of things which we don't know how to find in polynomial time, but which the pigeonhole principle proves must exist. if P and NP collapse, then pigeonhole proofs can be used as constructive proofs

*** some early programming history
the grand unification of 1936
- 1800s
  - Leibniz (earlier, in the 1670s) was among the first to build machines to compute functions
  - Babbage was the first to try to build a machine which could compute a wide variety of functions, namely polynomials (his Difference Engine), and he wanted to be able to mechanically compute series of instructions (his Analytical Engine) (1840s), he was inspired by a type of programmable loom
  - Ada Augusta, Lady Lovelace (the daughter of Lord Byron) can be considered the first programmer as she wrote a non-trivial program for Babbage's Analytical Engine.
    She was also among the first to imagine the use of computers beyond simply numerical functions
- 1900s -- (Hilbert, Church, Turing, Godel)
  - Hilbert was a formalist -- meaning he hoped that mathematics could be "completed", that given the right axioms and enough work every true mathematical statement could be proven. He is responsible for the "Decision Problem" (Entscheidungsproblem).

around this time people were trying to "formalize" math with Set Theory.

on Thursday we'll prove Godel's incompleteness theorem.

Recommended Reading
- /logicomix/ is a comic book about Bertrand Russell and the foundations of mathematics.
- _Godel, Escher, Bach_ -- Hofstadter

some discussion of the different cardinalities of \infty (see cardinality of sets -- sizes of infinity in the cs550 notes)

Russell's Paradox: The set of all sets that do not contain themselves -- does it contain itself?

this paradox led to a stratified structure of sets s.t. no set can refer to sets on the same or higher levels

| \emptyset         |
| integers          |
| sets of the above |
| $\vdots$          |

** 2010-04-08 Thu
- see video http://mts.unm.edu/Cs_courses.html
- ensure comfort with /recursive enumerability/

** 2010-04-13 Tue
skim section 7.4, read 7.5

*** a couple of words about the homework
- for any f(n)
  $$ NTIME (f(n)) \subseteq TIME(2^{O(f(n))}) $$
  - yes-instances have witnesses w of size |w|=O(f(n)) which can be checked in O(f(n)) time
  - there are $2^{|w|} = 2^{O(f(n))}$ possible witnesses, each of which takes O(f(n)) time to check

  so $2^{O(f(n))} \times O(f(n)) = 2^{O(f(n))}$ time to check all witnesses
  $$ NTIME (f(n)) \subseteq TIME(2^{O(f(n))}) \subseteq NTIME(2^{O(f(n))}) \subseteq TIME(2^{2^{O(f(n))}}) \ldots$$
- Monien-Speckenmeyer -- 3-SAT solver with better than $2^n$ time, roughly $1.84^n \ll 2^n$

  branching on a clause:
  | x1 | x2 | x3 | <- a clause and its variable assignments     |
  |----+----+----+----------------------------------------------|
  | T  |    |    | if this leads to a contradiction then try... |
  | F  | T  |    | if this leads to a contradiction then try... |
  | F  | F  | T  |                                              |
  is better than naively trying all possible assignments to each variable.
- we can prove problems are undecidable by reducing the halting problem to them

Rice's Theorem: any non-trivial question about the long-term behavior of a program is undecidable

*** foundations
programs being both code and data, similar to DNA/RNA being both the passive information storage /data/ and also being enzymes which are active and can modify the original DNA data like a /program/

*** main models of computation
initial explorations into programming were performed by logicians trying to build up complex functions from a primitive set of basic functions.

**** primitive recursive functions :: building functions on $\mathbb{N}$ from the following primitive set
- 0(x) = 0
- S(x) = x + 1 -- note "+" is not yet defined in this language, just used for the gist of its meaning
- I(x) = x
- $I^3_2(x, y, z) = y$

Some functions on functions
- composition. $(f \circ g)(x) = f(g(x))$
- primitive recursion. given f(x), g(x,y,z)
  - base case h(x,0) = f(x)
  - recursive step h(x,y+1) = g(x,y,h(x,y)) -- note that by definition the value of the recursive variable "y" must decrease with every nesting of recursion.
- examples with simple arithmetic
  - addition
    #+begin_src emacs-lisp
      ;; successor and predecessor are assumed to be the primitives S(x)
      ;; and its inverse; recursion is on y, with base case x + 0 = x
      (defun add (x y)
        (if (= y 0)
            x
          (successor (add x (predecessor y)))))
    #+end_src
  - multiplication
    #+begin_src emacs-lisp
      ;; recursion is on y, with base case x * 0 = 0
      (defun mult (x y)
        (if (= y 0)
            0
          (add x (mult x (predecessor y)))))
    #+end_src
- by definition there is no primitive recursive function which does not terminate
- there can be no "universal" primitive recursive function (one able to simulate every primitive recursive function) -- count the number of loops (recursions) in the "universal" function, then hand it a function with one more loop $\lightning$

**** Ackermann function
- $A_1(x,y)$ = x + y = x + 1 + 1 + ... y times
- $A_2(x,y)$ = x * y = x + x + x + ... y times
- $A_3(x,y)$ = $x^y$ = x * x * x * ...
  y times
- $$A_4(x,y) = x \uparrow^2 y = x^{x^{x^{\ldots^{x}}}}$$ y times

let's use 1 as our base case
$$
A_n(x, y) = \left\{
  \begin{array}{lr}
    1 & : y = 0\\
    A_{n-1}(x, A_n(x,y-1)) & : y \neq 0
  \end{array}
\right.
$$

so let's see what $A_3(2,2)$ is equal to...
- $A_2(2,A_3(2,1))$
- $A_2(2,A_2(2,A_3(2,0)))$
- $A_2(2,A_2(2,1))$
- $A_2(2,A_1(2,A_2(2,0)))$
- ...

if we look at $\bar{A}(n)= A_n(n,n)$
- $\bar{A}(1) = 1 + 1 = 2$
- $\bar{A}(2) = 2 \times 2 = 4$
- $\bar{A}(3) = 3^{3^3} = 3^{27} = 7625597484987$
- $\bar{A}(4) = BIG$

so Ackermann is computable, but *not* primitive recursive, because it has a variable number of loops (/points of recursion/) depending on its argument.

**** partial recursive functions -- primitive recursion \cup \mu-recursion
\mu-recursion is like =while= loops in imperative languages, it is not guaranteed to terminate
- if f(x,y) is computable
- then so is $g(x) = \mu_y f(x,y) = \min\{ y: f(x,y) = 0 \}$

however if there is no such y then g would run forever

primitive recursion \cup \mu-recursion can compute *any* computable function

**** \lambda-calculus
Alonzo Church, with Rosser and Kleene

a different view of computability -- all syntax

the /add/ function in \lambda calculus
- \lambda x. \lambda y. x + y
- (\lambda x. \lambda y. x + y) 3 $\rightarrow$ \lambda y. 3+y
- (\lambda x. \lambda y. x + y) 3 5 $\rightarrow$ 3+5

notice that the above /curries/ its arguments

see cs558 and cs550 for more information on \lambda-calculus

- fixed point theorem :: \forall R, \exists f s.t. R(f) = f meaning R(f)(x) = f(x) *and* \exists Y s.t. Y(R) = f

computable in \lambda-calculus \equiv computable in partial recursion

** 2010-04-15 Thu
*** homework stuffs
- a reduction from (e.g.) 3SAT \rightarrow B converts *any* instance of 3SAT to an instance of B
- proving undecidability of B consists of reducing *any* instance of the halting problem \rightarrow an instance of B
  - for example, let B = is there an input y of \phi s.t.
  \phi(y)=17

  our input program \phi is just a program, and we can make any changes to the program we like. e.g., we can change \phi s.t. \phi runs \pi(x) and then returns 17; then the "returning 17" property of \phi depends on the halting of \pi(x), and we've reduced halting of \pi to "returning 17" of \phi
  $$
  f(\pi_1, \pi_2) = \left\{
  \begin{array}{ll}
  1 & : \pi_1 \text{ halts first}\\
  2 & : \pi_2 \text{ halts first}\\
  \text{undefined} & : \text{neither halts}
  \end{array}
  \right.
  $$
- if f is a total function (defined on all inputs), then f is /computable/ if \exists \pi s.t. \forall x \pi(x)=f(x) and \pi always halts

  if B is a decision problem
  $$
  f_B(x) = \left\{
  \begin{array}{ll}
  \text{"yes"} &: x \text{ is a yes-instance of } B\\
  \text{"no"} &: \text{otherwise}
  \end{array}
  \right.
  $$
  B is /decidable/ \leftrightarrow $f_B$ is /computable/
- halting problem
  $$
  haltp(\pi, x) = \left\{
  \begin{array}{ll}
  \text{"yes"} &: \pi(x) \text{ halts}\\
  \text{"undefined"} &: \pi(x) \text{ never halts}
  \end{array}
  \right.
  $$
  the above is computable, the below is not computable because you can't firmly say "no" w/o infinite computation
  $$
  haltp(\pi, x) = \left\{
  \begin{array}{ll}
  \text{"yes"} &: \pi(x) \text{ halts}\\
  \text{"no"} &: \pi(x) \text{ never halts}
  \end{array}
  \right.
  $$
- suppose there was a computable function f(|x|) s.t. if \pi(x) ever halts then it will halt in f(|x|) steps; then we could decide halting by running \pi(x) for f(|x|) steps and answering "no" if it has not halted by then, so no such f can exist
*** computing maximum run times
1) partial recursive functions \rightarrow imperative functions
2) \lambda-calculus \rightarrow lisp, ml, Haskell
3) Turing machine

   +-------+
   |   Q   |
   +-------+
       |
   +-----+-----+-----+-----+-----+-----+
   | a_1 | a_2 | a_3 | a_4 | a_5 | ... |
   +-----+-----+-----+-----+-----+-----+

file:data/turing-machine.png

infinite toilet roll of paper, each square has a symbol, can always get more squares.

finite alphabet of square symbols (sometimes called \gamma)

the /head/ of our Turing machine is a FSA (sometimes called Q)

\exists a /universal Turing machine/ which can simulate *any* Turing machine. Just encode the FSA (Q) of any Turing machine onto tape, and feed that tape + input to the universal Turing machine.
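A minimal sketch of the "machine as data" idea above: the simulator below takes a transition table as an ordinary Python dictionary and runs it on a tape. The names =run_tm= and =FLIP=, the state names, the blank symbol "_" and the step limit are all assumptions made for illustration, not anything from the lecture or the book.

```python
def run_tm(delta, tape, state="q0", accept="halt", blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine whose transition table is plain data.

    delta maps (state, symbol) -> (new_state, symbol_to_write, "L" or "R").
    Returns (final_state, tape_contents_without_surrounding_blanks).
    """
    cells = dict(enumerate(tape))      # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):         # step limit, since halting is undecidable
        if state == accept:
            break
        symbol = cells.get(head, blank)
        if (state, symbol) not in delta:
            break                      # no applicable rule: the machine stops
        state, write, move = delta[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    if not cells:
        return state, ""
    lo, hi = min(cells), max(cells)
    contents = "".join(cells.get(i, blank) for i in range(lo, hi + 1))
    return state, contents.strip(blank)


# Hypothetical example machine: flip every bit, moving right until a blank.
FLIP = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("halt", "_", "R"),
}

print(run_tm(FLIP, "0110"))
```

Because =FLIP= is just data, the same =run_tm= function can run any other table handed to it, which is the sense in which one fixed machine simulates all the others.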
once you have this /universal Turing machine/ all of the snake-eats-tail paradoxes arise.

Turing actually wrote out this universal Turing machine, the same year Church did the same with \lambda-calculus.

\gamma and |Q| are relatively fungible, with enough symbols you can get the number of states down to 2 and with enough states you can get the number of symbols down to 2

this is basically a FSA with access to a data structure (the tape). what if we replace the tape with a set of counters s.t. with each counter it can
- increment
- decrement
- check if equal to 0

the resulting counter machine can still compute anything a Turing machine can (there is a very cute proof of this in the book)

*Church Turing Thesis*: these above 3 definitions capture anything which could be called an "algorithm" or "procedure" or "program"

*Physical Church Turing Thesis*: no physical device can compute anything that can't be computed by one of the above 3 definitions

- fractran :: John Conway, consists of
  - a big list of fractions (program)
  - a start number
  - continually
    1) move down the list of fractions
    2) check if the fraction times your number is an integer
    3) if so, replace your number with that product and start again from the top of the list
  - there is a list of fractions given in the book which computes the prime numbers or some such
- Collatz problem :: we don't know whether repeatedly applying the following function always reaches 1
  $$
  f(x) = \left\{
  \begin{array}{ll}
  \frac{x}{2} &: even(x)\\
  3x+1 &: odd(x)
  \end{array}
  \right.
  $$
** 2010-04-20 Tue
:PROPERTIES:
:ID: 9b681d5b-344e-4630-9acf-e8a4bdc3c227
:END:
we'll end the semester by devoting each day to a specific topic. today's topic is *memory* (Chpt. 8 in the text).

| <2010-04-22 Thu> | may not have class, prof. in Mexico |
| <2010-04-27 Tue> | randomized algorithms               |

we will have 1 more homework, and we will have another 3-4 day takehome final, around the weekend right before finals.
*** memory
Including the hard drive your computer will include roughly $10^{12}$ bits, resulting in $2^{10^{12}}$ possible states.
SPACE(f(n)) is the spatial analog to TIME(f(n)), it originally referred to the length of the tape in your Turing machine.
- SPACE(f(n)) \subseteq TIME($2^{O(f(n))}$)
- similarly TIME(f(n)) \subseteq SPACE(f(n)) -- assuming you have a random access machine.
- PSPACE = SPACE(poly(n))
- LSPACE = LOGSPACE = SPACE(O(log(n))) -- this only counts the workspace to which you have read/write access, not the space required to store the problem, to which you only have read access
- given the above LSPACE \subseteq PTIME
- NSPACE = set of problems where, if the input is a yes-instance, \exists a path through the space of possible machine states of your non-deterministic program to an accepting state that ends in returning "yes"
- Reachability is NLOGSPACE-complete. given (G, s, t): does \exists a path from s \rightarrow t? the following program will fit this bill

    u = s
    repeat
      if (u == t) return true
      guess v
      if ((u, v) in E) u = v; else return false
    end

- NTIME(f(n)) \subseteq TIME($2^{O(f(n))}$)
- NSPACE(f(n)) \subseteq SPACE($f(n)^2$) -- space can be re-used -- /Savitch's Theorem/
  - \rightarrow NPSPACE = PSPACE
*** Savitch's Theorem
Reachability \in SPACE($\log^2(n)$)

For Reachability you only need to keep track of the "horizon" of all of the possible paths from s to t to find out if there is a path, which can be stored in $\log^2(n)$ space. note that $2^{\log^2(n)} = n^{\log(n)}$

now to refine our Reachability problem

REACH(G,s,t,l) = \exists a path s \rightarrow t with length \leq l

remember /middle first search/ from our shortest path problem, basically works as follows
- Reach(G,i,j,l) = \exists k : Reach(G,i,k,l/2) \wedge Reach(G,k,j,l/2)
- algorithm (taking l to be a power of 2)

    reach(i, j, l)
      if (i == j) return true
      if (l == 1) return E.include?(i,j)
      for k=1 to n do
        if (reach(i,k,l/2) and reach(k,j,l/2)) return true
      end
      return false

this algorithm runs in SPACE $O(\log^2(n))$: the recursion is at most log(n) levels deep and each level stores O(log(n)) bits (i, j, k and l). it is constantly forgetting and recomputing the many recursive calls to itself.
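The middle-first search above can be sketched in Python as follows. This is only an illustration under assumptions: vertices are numbered 0..n-1, =edges= is a set of directed pairs, and odd path lengths are handled by splitting l into =l // 2= and =l - l // 2= (a detail the notes leave out).

```python
def reach(edges, n, i, j, l):
    """True iff there is a path from i to j using at most l edges.

    Only i, j, k and l live on each stack frame and the recursion depth is
    about log2(l), which is where the O(log^2 n) space bound comes from.
    """
    if i == j:
        return True
    if l <= 1:
        return l == 1 and (i, j) in edges
    first, second = l // 2, l - l // 2      # the two halves add up to l
    for k in range(n):                      # try every possible midpoint k
        if reach(edges, n, i, k, first) and reach(edges, n, k, j, second):
            return True
    return False


# A directed path 0 -> 1 -> 2 -> 3 plus an isolated vertex 4.
E = {(0, 1), (1, 2), (2, 3)}
print(reach(E, 5, 0, 3, 4), reach(E, 5, 0, 3, 2), reach(E, 5, 3, 0, 4))
```

The price of the small memory footprint is time: the same sub-questions are recomputed over and over, exactly as the notes say.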
this version of Reachability also generalizes to programs moving through state space
*** one last surprising difference between space and time
_coNL = NL_

there is a reduction from non-Reachability to Reachability, and vice-versa

somehow existence and checking are equal for space

coNSPACE(f(n)) = NSPACE(f(n))
** 2010-04-27 Tue
*** games
in the following game tree
- memory needed = t \times memory(one position), for a game lasting t moves
- alternating rows in the following switch between \exists and \forall

\begin{tikzpicture}
  \tikzstyle{every node}=[fill=red!30,rounded corners]
  \node{$\vee$}
    child {node {$\wedge$}
      child {node {$\vee$}}
      child {node {$\vee$}}}
    child {node {$\wedge$}
      child {node {$\vee$}}
      child {node {$\vee$}}}
\end{tikzpicture}

file:data/game-tree.png

- p.368 it is possible to build positions in GO which encode arbitrary QSAT formulas, thus GO is PSPACE-hard.
- computers recently got better at GO by searching as far as they could, and then filling open space up randomly with stones and seeing how the territory breaks out
- it seems that humans search /deeply/ but /selectively/
*** walk sat
random walking through a 3-SAT problem

solves 3-SAT with n variables $x_1, \ldots, x_n$ in time $(\frac{4}{3})^n poly(n)$

/the following is all in the book/

given a formula \phi

    start with a random truth assignment B
    if out_of_time return "don't know"
    if B.satisfies(phi) then return B
    else
      choose clause C randomly from all the unsat clauses
      choose x randomly from C.variables
      flip x in B
      recur

no-one is able to prove that this completes through an analysis of the total number of satisfied clauses.

- proof :: Assume \phi is satisfiable, \rightarrow \exists A s.t. A satisfies \phi. Let d(A,B) be the /Hamming distance/ between A and B (the number of variables on which they differ). We will analyze the change in the Hamming distance. We'll compute the probability that \delta(d) is positive or negative (i.e. further from or closer to the solution) with each change.
In the worst case B already agrees with A in 2/3 of the variables in C (they must differ on at least one, since A satisfies C and B does not), so
- Pr[\delta(d) = +1] \leq 2/3
- Pr[\delta(d) = -1] \geq 1/3

so with 2-SAT where the above Pr's are both 1/2, it will generally take $n^2$ steps to find a satisfying assignment (see the math-aside)

however in our case we're more likely to move away from than towards a Hamming distance of 0. We can look at p(d): if we start at a distance of d from A, p(d) is the probability that we will *ever* reach d=0 instead of drifting infinitely far away from the best solution.

$p(d) = \frac{1}{3}p(d-1) + \frac{2}{3}p(d+1)$

/left as an exercise/, given the above $p(d)=\frac{1}{2^d}$

if you will _ever_ touch 0, then you probably will within the first O(d) steps, in fact 3d steps is generally all you need.

this is all important because we will wrap our algorithm in another outer loop. A /random restart/ loop, which will restart our algorithm from time to time. Basically we will start over every 3n steps.

so (back to our running time), each attempt takes poly(n) time (running 3n steps) and succeeds with likelihood $(\frac{3}{4})^n$, so our average number of attempts will be the inverse of the probability of success, i.e. we will restart about $(\frac{4}{3})^n$ times.

our average value of p(d) will be...

\begin{eqnarray*}
P_{success} &=& \sum_{d=0}^n{Pr[d(A,B)=d]\,p(d)}\\
P_{success} &=& \frac{1}{2^n}\sum_{d=0}^n{{{n}\choose{d}} \frac{1}{2^d}} \\
P_{success} &=& \frac{1}{2^n}\left(\frac{1}{2}+1\right)^n\\
P_{success} &=& \left(\frac{3}{4}\right)^n
\end{eqnarray*}

this is *very* close to the best known algorithm for 3-SAT, the best is $\alpha^n$ with \alpha=1.332 whereas this one is \alpha=1.333...
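A rough Python sketch of the random walk with restarts analyzed above. The clause encoding (signed integers, DIMACS style), the function name =walksat= and the default number of restarts are assumptions for illustration; a faithful run would use on the order of $(4/3)^n$ restarts.

```python
import random


def walksat(clauses, n, restarts=200, rng=None):
    """Random-walk SAT search, restarting from a fresh assignment every 3n flips.

    clauses: list of tuples of non-zero ints over variables 1..n;
             literal v means "x_v is true", -v means "x_v is false".
    Returns a satisfying assignment as a dict, or None for "don't know".
    """
    rng = rng or random.Random()

    def satisfied(clause, b):
        return any(b[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(restarts):
        # start with a random truth assignment B
        b = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(3 * n):
            unsat = [c for c in clauses if not satisfied(c, b)]
            if not unsat:
                return b
            # random variable of a random unsatisfied clause
            lit = rng.choice(rng.choice(unsat))
            b[abs(lit)] = not b[abs(lit)]
        if all(satisfied(c, b) for c in clauses):
            return b
    return None


# Small satisfiable formula (for example x1 = x2 = x3 = True works).
PHI = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3), (-1, -2, 3)]
print(walksat(PHI, 3, rng=random.Random(0)))
```

Note that a =None= result only means the walk gave up; like the pseudocode's "don't know", it is not a proof of unsatisfiability.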
the other one is super-complicated, and uses this as a subroutine

- some random walk stuff (/homework relevant/), we should *really* know this stuff

  when we go left or right with equal probability, after 2 steps we will be at our starting point with probability 1/2, after four steps it would be with probability 6/16

  in general after t steps we could be anywhere from -t to +t from our starting point.

  lots of ${t \choose n} \times \frac{1}{2^t}$ (the probability of n steps to the right out of t), which when graphed looks like a normal distribution around t/2 with width sqrt(t), i.e. a relative width of 1/sqrt(t).

  given that $n! \simeq n^n e^{-n}$, (see math appendix)
*** math aside
:PROPERTIES:
:CUSTOM_ID: math-aside
:END:
Random Walk: in a random walk on a line of n steps, starting in the middle, it will take about $n^2$ steps to reach 0.

when flipping t random coins the resulting number of heads will be a bell curve centered around t/2 with a width of sqrt(t).

when reporting error from a set of trials, e.g. p plus or minus \epsilon, then \epsilon \sim 1/sqrt(t) where t is the number of trials.

*read the math appendix!*
** 2010-04-29 Thu
*** counting
in SPACE m and NSPACE m
- stronger than TIME m
- still limited
- counting? up to $2^m$

we can count higher with randomness

Randomized state transition

    +-----+            +------+
    |     |    1/2     |      |
    |     |----------->|      |
    +-----+            +------+
       |
       |1/2
       v
    +-----+
    |     |
    |     |
    +-----+

file:data/non-det-state-trans.png

- w/deterministic machine with m bits of memory ($2^m$ states), after $2^m$ steps we've repeated something and are in a loop.
- w/randomized machine with m bits of memory ($2^m$ states), it is possible that \exists unvisited states after $2^m$ steps

\usetikzlibrary{shapes}
% Define block styles
\tikzstyle{state} = [circle, draw, text centered, fill=blue!25]
\begin{tikzpicture}[->,>=stealth', shorten >=1pt, auto, node distance=2.8cm, semithick]
  \node [state] (0) at (0,0) {0};
  \node [state] (1) [right of=0] {1};
  \node [state] (d) [right of=1] {$\ldots$};
  \node [state] (n) [right of=d] {n};
  \path (0) edge node {} (1);
  \path (1) edge node {} (d);
  \path (d) edge node {} (n);
  \path (1) edge [bend right] (0);
  \path (d) edge [bend right] (0);
  \path (n) edge [bend right] (0);
\end{tikzpicture}

file:data/non-det-counter.png

the expected time to get to any state $i$, $$\mathbb{E}T_i = 2(\mathbb{E}T_{i-1} +1) \sim 2^i$$

so with a randomized machine we can /count/ to $2^{2^m}$
*** improved counting
using $\mathbb{E}T_i = 2^i$ we can output an update every time we enter a previously unseen state (suppose our output screen has sufficient memory to handle this part)
- can't get better than factor of 2 accuracy
- additional "noise" due to randomness

ideas/solutions:
- changing the probabilities to back with $\frac14$ and forward with $\frac34$.
- _controlling variance_: if we split our clock up into t pieces of size m/t, and independently run a clock in each piece, then the average of these clock times will be closer to the expected time. how close will these be? we can apply Chebyshev's inequality (below).
\forall clocks i, let $y_i$ be the clock's time, then for any $s \geq 0$
$$
Pr\left(\left|\frac{y_1 + \ldots + y_t}{t}-\mathbb{E}y_i\right| \geq s\sqrt{Var\left(\frac{y_1 + \ldots + y_t}{t}\right)}\right) \leq \frac{1}{s^2}
$$
(s here plays the role of t in the statement of Chebyshev's inequality further down, where t is no longer the number of clocks)

definition of variance, $Var(x) = \mathbb{E}((x - \mathbb{E}x)^2)$, the expected squared distance from the average value
- if x is a coin flip
  - $\mathbb{E}x=\frac{1}{2}$
  - $(0-\frac{1}{2})^2 = \frac{1}{4}$
  - $(1-\frac{1}{2})^2 = \frac{1}{4}$
  - so $Var(x) = \frac{1}{4}$
- 2 coins, x and y
  - $\mathbb{E}(x+y)=1=2\mathbb{E}(x)$
  - $Var(x+y)=\frac{1}{4}(-1)^2+\frac{1}{2}0^2+\frac{1}{4}1^2=\frac{1}{2}=2Var(x)$
  - $(Var(x+y))^{\frac{1}{2}}=2^{\frac{1}{2}} \times (Var(x))^{\frac{1}{2}}$
- so with k flips, the expectation grows by a factor of k, the variance grows by a factor of k, and the width $\sqrt{Var}$ grows by a factor of $k^{1/2}$

_Chebyshev Inequality_: \forall t \geq 0, $Pr(|z-\mathbb{E}(z)| \geq t\sqrt{Var(z)}) \leq \frac{1}{t^2}$

_Law of Large Numbers_: for independent random variables $x_1, x_2, x_3, \ldots$ with the same distribution, the average value converges to the expected value, also stated as
$$
\lim_{t \rightarrow \infty}{\frac{x_1+x_2+\ldots+x_t}{t}}=\mathbb{E}x
$$

2 facts:
- \forall x, $1-x \leq e^{-x}$
- \forall x, $1+x \leq e^{x}$
*** application to streaming algorithms
Alon, Matias, Szegedy 1996 -- approximating frequency moments

you have some vast amount of stuff (say google web searches) flying past you, and you just want to update a couple of bits as these gigs fly by.

stream of numbers from the set {1, ..., N}, and we want an idea of the number of distinct elements in the stream (the 0th frequency moment)
- $m_i$ = # times i appears in the stream
- the kth frequency moment $$F_k=\sum_{i=1}^{N}{m_i^k}$$

One approach for $F_0$ (# distinct) would be to track the smallest element seen thus far.
- let J = the smallest element in the stream
- $\mathbb{E}J \approx \frac{N}{F_0}$ (assuming the distinct elements are spread uniformly over {1, ..., N}), so if J is close to its expectation, then a good estimate for $F_0$ is $\frac{N}{J}$
** 2010-05-04 Tue
*** approximation algorithms
we've spent a lot of time saying how all NP-complete problems are equally hard, however when you are _approximating_ the solutions they are *not* all equally hard.

/branch and bound/ and /branch and cut/ are popular approaches for real-world approximations of the solutions of NP-complete problems
**** vertex cover
:PROPERTIES:
:CUSTOM_ID: vertex-cover
:END:
Vertex Cover
- input :: a graph G=(V,E) and an integer i
- question :: what is the smallest vertex cover S \subseteq V

B is NP-hard if A \leq B \forall A \in NP (i.e. every problem in NP reduces to B)

Algorithm for a /decent/ vertex cover
- start: S = \emptyset
- while \exists an uncovered edge (u,v), i.e. u,v \notin S
  - add both u and v to S

This algorithm A is a /2-approximation/ for a minimal vertex cover, so
$$
\frac{|S_A|}{|S_{opt}|} \leq 2
$$
proof: the edges picked by this method are disjoint (a /partial matching/), the optimal vertex cover (VC) must include at least 1 of the ends of each of these edges, or at least \frac12 as many vertices as included in this cover.

the kicker here is that no known poly-time algorithm does better than this silly algorithm by a constant factor.
**** fuzzy vertex cover
:PROPERTIES:
:CUSTOM_ID: fuzzy-vertex-cover
:END:
one other approach for vertex cover is the following

/Fuzzy/ vertex cover; variables, \forall v \in V, $0 \leq x_v \leq 1$ s.t. \forall (u,v) \in E, $x_u + x_v \geq 1$

here we want to minimize the sum $\sum_v x_v$ of the values on the vertices rather than the number of vertices in the cover

this is called a /linear programming/ relaxation of this problem

from the above we can get a real vertex cover in the following way; v \in S \leftrightarrow $x_v \geq \frac12$

this *also* results in a 2-approximation of the minimal VC
**** continuously approximable problems -- Fully Poly Time Approx. Scheme (FPTAS)
\forall \epsilon > 0, \exists a (1+\epsilon)-approximation that takes poly(n,1/\epsilon) time
**** Traveling Salesman Problem (TSP)
:PROPERTIES:
:CUSTOM_ID: tsp
:END:
Traveling Salesman Problem
- input :: n by n matrix $d_{ij}$
- question :: a tour $i_1, i_2, \ldots, i_n$ which minimizes $\sum_{j=1}^{n}{d_{i_j,i_{j+1}}}$ (with $i_{n+1} = i_1$)

Hamiltonian Path \leq TSP_{threshold} \leq TSP_{optimization}
$$
d_{ij} = \left\{
\begin{array}{ll}
1 &: (i,j) \in E\\
1000000 &: (i,j) \notin E
\end{array}
\right.
$$
\exists a Hamiltonian tour \leftrightarrow the shortest tour above has distance \leq n

note that the above could violate the triangle inequality, or \forall i,j,k, $d_{ik} \leq d_{ij} + d_{jk}$

we can use a minimal spanning tree (which can be found \in P) to build a /not so bad/ Hamiltonian tour

$MST_{opt} \leq T_{opt} \leq 2MST_{opt}$

The above uses the triangle inequality when short-circuiting a tour along the MST, by skipping previously visited cities.

traveling out and back on all edges in MST (doubling the edges into a multigraph) leads to an Eulerian tour.

\exists an Eulerian tour \leftrightarrow each vertex has even degree, we can force each vertex to have even degree by only adding edges between vertices which have odd degree -- this is a more efficient way of generating a short tour ($T_B$) from an MST

$T_B \leq MST + MM \leq \frac{3}{2} T_{opt}$ -- where MM is the minimum matching of the odd degree vertices

Euclidean TSP (1+\epsilon)-approximation in $\sim n^{\frac{1}{\epsilon}}$ -- done w/dynamic programming
** 2010-05-06 Thu
*** Quantum Mechanics
**** The "two slit" experiment
performable with waves of light or water.
- Light of some frequency hits a screen with two holes in it, and then hits a second screen on the other side of the first screen.
- the light spreads out from each hole as a new wave
- at different points on the second screen the two waves will either arrive in phase, or out of phase with each other -- as a result the light on the second screen forms alternating bright and dark bands (an interference pattern)

in the late 1800s this experiment was carried out with very faint light sources -- such that small numbers of individual particles should be hitting the back screen, however the continuous wave effect was surprisingly still observed.

similarly this experiment has been performed with the light replaced with _single electrons_ passing through the slit screen one at a time, and the electrons still build up the same interference pattern, which is not what you get by adding the probabilities of moving through each slit.

so rather than /probability/ the /amplitude/ of arriving on the second screen at some point is the sum of the amplitudes (measured in complex numbers) of the electron moving through each slit.
$$
probability = \left|\sum amplitude\right|^2
$$

some funny facts -- when placing a detector on the slits which detects which of the slits the electron has moved through, the results on the back screen do not show the sum of amplitudes of both slits but rather only of the detected slit. This is due to /decoherence/: the act of measuring the electron intricately links it with the remainder of the universe.
**** quantum computing
due to /decoherence/ it is necessary both to couple the computing elements so that they interact in a controlled (not truly random) way, and to isolate them so that stray electrons moving by the computer don't bounce off an element and inadvertently measure its state, thereby removing its /quantum/ state.

the above may be harder than landing on Mars but easier than constructing a space elevator.

physics at the microscopic level is reversible, meaning the /machine code/ level operations may also need to be reversible (i.e. \oplus instead of \wedge).
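A small sketch of why \oplus can serve as a reversible primitive while \wedge cannot. The function names =cnot= and =and_gate= are illustrative choices, not notation from the lecture; the point is only that a gate keeping one input and XOR-ing it into the other permutes the four 2-bit states, whereas AND collapses three of them onto the same output.

```python
from itertools import product


def cnot(a, b):
    """Reversible XOR: keep a, replace b with a XOR b."""
    return a, a ^ b


def and_gate(a, b):
    """Ordinary AND: two bits in, one bit out, so information is lost."""
    return a & b


inputs = list(product((0, 1), repeat=2))          # (0,0), (0,1), (1,0), (1,1)

# The XOR gate maps the four inputs onto the four inputs, one-to-one ...
cnot_is_bijection = sorted(cnot(a, b) for a, b in inputs) == inputs
# ... and applying it twice gives back the original bits.
cnot_undoes_itself = all(cnot(*cnot(a, b)) == (a, b) for a, b in inputs)
# AND sends three different inputs to 0, so it cannot be run backwards.
and_outputs = [and_gate(a, b) for a, b in inputs]

print(cnot_is_bijection, cnot_undoes_itself, and_outputs)
```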
**** quantum operations
computational state changes through reversible matrix multiplication

    b----------+-----------b
               |
               |
               |
    a---------xor----------a xor b

xor applied to a and b
$$
\left(
\begin{array}{llll}
1 & 0 & 0 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0\\
0 & 1 & 0 & 0
\end{array}
\right)
\left(
\begin{array}{ll}
1 & 0\\
0 & 1\\
1 & 0\\
1 & 1
\end{array}
\right)
$$
if our computer is represented as a large vector of bits

an example quantum operator
$$
\frac{1}{\sqrt{2}}
\left(
\begin{array}{lr}
1 & 1\\
1 & -1
\end{array}
\right)
\left(
\begin{array}{l}
1\\
0
\end{array}
\right)
=
\left(
\begin{array}{l}
\frac{1}{\sqrt{2}}\\
\frac{1}{\sqrt{2}}
\end{array}
\right)
$$
the above applied to
$$
\left(
\begin{array}{l}
\frac{1}{\sqrt{2}}\\
\frac{1}{\sqrt{2}}
\end{array}
\right)
$$
yields
$$
\left(
\begin{array}{l}
1\\
0
\end{array}
\right)
$$
and applied to
$$
\left(
\begin{array}{l}
\frac{1}{\sqrt{2}}\\
-\frac{1}{\sqrt{2}}
\end{array}
\right)
$$
yields
$$
\left(
\begin{array}{l}
0\\
1
\end{array}
\right)
$$
**** reversible computation
every erased bit _must_ result in some generation of heat (entropy), however reversible computation need not theoretically generate heat
**** example quantum computation
f:{0,1} \rightarrow {0,1}, is f(0) == f(1)?
    a----------+------------a
               |
               |
             +---+
    b--------| f |----------b xor f(a)
             +---+

you can in effect run f on 0 and 1 at the same time with
- $a = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$
- $b = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$

the following is true if f(0)==f(1) (the rows on the left list the basis states ab with their signs)
$$
\frac{1}{2}
\left(
\begin{array}{rr}
0 & 0\\
-0 & 1\\
1 & 0\\
-1 & 1
\end{array}
\right)
\rightarrow
\left(
\begin{array}{r}
-\frac{1}{2}\\
\frac{1}{2}\\
-\frac{1}{2}\\
\frac{1}{2}
\end{array}
\right)
$$
we don't know what the values are, but we can play tricks with interference to find out if they are the same value

aside from all these artificial problems, we found out that factoring was qualitatively different on quantum computers
**** quantum factoring
want to factor N=pq where p and q are prime, let n=log(N) -- or the bits required to store N
- choose a random c \in {2, ..., N-1}
- if gcd(c,N) \neq 1 then we're done
- else compute powers of c, mod N, and find the /order/ of c, or the smallest r s.t. $c^r \equiv 1$ mod N -- this is the period of this sequence
  - if r is odd then start over
  - else r is even
    - $c^r \equiv 1$ mod N \rightarrow $c^r-1$ is a multiple of N, \exists k s.t. $c^r-1=kN$, since r is even we can do $(c^{\frac{r}{2}}-1)(c^{\frac{r}{2}}+1)=kN$
    - now we compute $gcd(c^{\frac{r}{2}}-1, N)$ and $gcd(c^{\frac{r}{2}}+1, N)$
    - if one of these is a multiple of N, then try again, else done

do the above a small number of times and you will win

the only thing here that _can't_ be done in poly time is finding the order of c mod N, which could take exponential time classically

on a quantum computer we
1) put a register x into a superposition of all possible values from 0 to N.
2) have an empty register set to 0
3) we run a program which computes $c^x$ mod N, and feed it our super-positioned x, and compute $c^x$ mod N for all of these values.
4) we now measure the output of this program, when we measure one particular output the wave function collapses in x, and everything that doesn't map to that particular output falls to 0.
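The classical bookkeeping around the order-finding step can be sketched as below. This is an illustration only: =order= finds r by brute force, which is exactly the exponential-time part that the quantum steps are meant to replace, and the function names are made up for this example.

```python
from math import gcd


def order(c, N):
    """Smallest r >= 1 with c**r = 1 (mod N); assumes gcd(c, N) == 1.

    Brute force: this loop is the slow step that period finding replaces.
    """
    r, x = 1, c % N
    while x != 1:
        x = (x * c) % N
        r += 1
    return r


def factor_from(c, N):
    """Try to split N using base c; return a non-trivial factor or None."""
    g = gcd(c, N)
    if g != 1:
        return g                       # lucky: c already shares a factor with N
    r = order(c, N)
    if r % 2 == 1:
        return None                    # odd order: start over with another c
    half = pow(c, r // 2, N)           # c^(r/2) mod N
    for candidate in (gcd(half - 1, N), gcd(half + 1, N)):
        if 1 < candidate < N:
            return candidate
    return None                        # c^(r/2) = -1 (mod N): try another c


print(factor_from(7, 15), factor_from(2, 21), factor_from(14, 15))
```

For N = 15 and c = 7 the powers are 7, 4, 13, 1, so r = 4, $c^{r/2} = 4$, and gcd(4 - 1, 15) = 3 is a factor; c = 14 has $c^{r/2} \equiv -1$ and has to be retried.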
x is now in a periodic state, and r is the period of this state

we can take the Fourier transform of x to find its period, this can be done in $O(\log^2(N))$ quantum steps
* misc
** giving a good colloquium talk
file:reading/how to give a good colloquium.pdf
** terms
:PROPERTIES:
:CUSTOM_ID: terms
:END:
- simple graph :: every pair of vertices share at most 1 edge
- turing reduction :: A $\leq$ B if A can be solved with a polynomial number of calls to B
- karp reduction :: A $\leq$ B if each instance of A can be converted to an instance of B s.t. yes(A) iff yes(B)
- CNF :: /conjunctive normal form/ conjunction of clauses each of which is a disjunction of literals
- linear programming :: an optimization problem where the goal is to minimize some linear combination of a series of variables subject to linear constraints (see fuzzy-vertex-cover)
** math appendix
:PROPERTIES:
:CUSTOM_ID: math-appendix
:END:
mathwriting.pdf