The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

Good afternoon, everyone. Let's get started. Today we're going to be talking about races and parallelism, and you'll be doing a lot of parallel programming for the next homework assignment and in the project. One thing I want to point out is that it's important to meet with your MITPOSSE as soon as possible if you haven't done so already, since that's going to be part of the evaluation for the Project 1 grade. If you have trouble reaching your MITPOSSE members, please contact your TA and also make a post on Piazza as soon as possible.

OK, so as a reminder, let's look at the basics of Cilk. We have cilk_spawn and cilk_sync statements in Cilk. This was the code that we saw in the last lecture, which computes the nth Fibonacci number. When we say cilk_spawn, it means that the named child function, the function right after the cilk_spawn keyword, can execute in parallel with the parent caller. So it says that fib of n minus 1 can execute in parallel with the fib function that called it. And then cilk_sync says that control cannot pass this point until all of the spawned children have returned, so this is going to wait for fib of n minus 1 to finish before it goes on and returns the sum of x and y. Recall that the Cilk keywords grant permission for parallel execution, but they don't actually force parallel execution. So this code says that we can execute fib of n minus 1 in parallel with its parent caller, but it doesn't say that we necessarily have to execute them in parallel, and it's up to the runtime system to decide whether these different functions will be executed in parallel. We'll talk more about the runtime system today.

We also talked about this example where we wanted to do an in-place matrix transpose, and this used the cilk_for keyword, which says that we can execute the iterations of this cilk_for loop in parallel. Again, this says that the runtime system is allowed to schedule these iterations in parallel; it doesn't say that they have to execute in parallel. Under the hood, cilk_for statements are translated into nested cilk_spawn and cilk_sync calls: the compiler is going to divide the iteration space in half, do a cilk_spawn on one of the two halves, call the other half, and then do this recursively until we reach a certain size for the number of iterations in a loop, at which point it just creates a single task for that range. Any questions on the Cilk constructs?

Yes -- so Cilk is actually not going to figure out whether the iterations are independent for you; the programmer actually has to reason about that. But Cilk does have a nice tool, which we'll talk about, that will tell you which places in your code might possibly be reading and writing the same memory location, and that allows you to localize any possible race bugs in your code. We'll talk about races shortly, but if you just compile this code, Cilk isn't going to know whether the iterations are independent.
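To make the recap concrete, here is roughly what the Fibonacci code from the slide looks like in Cilk. This is a sketch from memory rather than the exact course handout; the header choices, the integer type, and the main function are my own.

```c
#include <cilk/cilk.h>
#include <stdio.h>
#include <stdlib.h>

long fib(long n) {
  if (n < 2) return n;
  long x = cilk_spawn fib(n - 1);  // the spawned child may run in parallel with the caller
  long y = fib(n - 2);             // the parent keeps working on fib(n - 2) in the meantime
  cilk_sync;                       // control cannot pass here until the spawned child returns
  return x + y;
}

int main(int argc, char *argv[]) {
  long n = (argc > 1) ? atol(argv[1]) : 10;
  printf("fib(%ld) = %ld\n", n, fib(n));
  return 0;
}
```

Remember that cilk_spawn only grants permission to run the child in parallel; whether it actually runs in parallel is up to the runtime system.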
All right, so determinacy races. Race conditions are the bane of concurrency, so you don't want to have race conditions in your code, and there are two famous race bugs that caused disasters. There's the Therac-25 radiation therapy machine, where a race condition in the software led to three people being killed and many more being seriously injured. The North American blackout of 2003 was also caused by a race bug in the software, and it left 50 million people without power. So these are very bad, and they're notoriously difficult to discover by conventional testing. Race bugs aren't going to appear every time you execute your program; in fact, the hardest ones to find, including the ones that caused these events, are actually very rare events. Most of the time when you run your program you're not going to see the race bug -- only very rarely will you see it -- and that makes these race bugs very hard to find. Furthermore, when you do see a race bug, it doesn't necessarily always happen in the same place in your code, which makes it even harder.

OK, so what is a race? A determinacy race is one of the most basic forms of race. A determinacy race occurs when two logically parallel instructions access the same memory location and at least one of those instructions performs a write to that location. Let's look at a simple example. In this code, I'm first setting x equal to 0, then I have a cilk_for loop with two iterations, and each of the two iterations increments this variable x, and then at the end I assert that x is equal to 2. There's actually a race in this program. To understand where the race occurs, let's look at the execution graph. I'm going to label each of these statements with a letter. The first statement, A, is just setting x equal to 0, and after that we're actually going to have two parallel paths, because we have two iterations of this cilk_for loop which can execute in parallel, and each of these paths is going to increment x by 1. Then finally we assert that x is equal to 2 at the end. This sort of graph is known as a dependency graph: it tells you which instructions have to finish before you execute the next instruction. Here it says that B and C must wait for A to execute before they proceed, but B and C can happen in parallel because there's no dependency between them, and then D has to happen after both B and C finish.

To understand why there's a race bug here, we need to take a closer look at this dependency graph. When you run this code, x++ is actually going to be translated into three steps: first we load the value of x into some processor's register r1, then we increment r1, and then we store the result in r1 back into x. The same thing happens with r2: we load x into register r2, increment r2, and then store r2 back into x. So here we have a race, because both of these stores -- the store of r1 into x and the store of r2 into x -- are writing to the same memory location.
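Before walking through a possible execution, here is roughly what the racy example looks like as code; this is my reconstruction of the slide's example, not a verbatim copy.

```c
#include <assert.h>
#include <cilk/cilk.h>

int main(void) {
  int x = 0;
  cilk_for (int i = 0; i < 2; ++i) {
    x++;           // both logically parallel iterations read and write x: a determinacy race
  }
  assert(x == 2);  // may fail, depending on how the loads and stores interleave
  return 0;
}
```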
So let's look at one possible execution of this computation graph, and we're going to keep track of the values of x, r1, and r2. The first instruction we execute is x = 0, so we just set x equal to 0, and everything's good so far. Next, we can actually pick one of two instructions to execute, because both of them have their predecessors satisfied already -- their predecessors have already executed. Let's say I pick r1 = x to execute; this places the value 0 into register r1. Now I increment r1, which changes the value in r1 to 1. Then let's say I execute r2 = x; that reads x, which has a value of 0, and places the value 0 into r2. Incrementing r2 changes that value to 1. Now let's say I write r2 back to x, so I place a value of 1 into x. Then, when I execute the instruction x = r1, it also places a value of 1 into x. And finally, when I do the assertion, the value here is not equal to 2, which is wrong, because if you executed this sequentially you would get a value of 2. As I said, the reason this occurs is that we have multiple writes to the same shared memory location which could execute in parallel.

One of the nasty things about this example is that the race bug doesn't necessarily always occur. Does anyone see why this race bug doesn't necessarily always show up? Right -- the answer is that if one of these two branches executes all three of its instructions before we start the other one, then the final result in x is going to be 2, which is correct. So if I executed these instructions in the order 1, 2, 3, 7, 4, 5, 6, and then finally 8, the value in x would be 2. The race bug here doesn't necessarily always occur, and this is one of the things that makes these bugs hard to find. Any questions?

OK, so there are two different types of determinacy races, and they're shown in this table. Suppose that instruction A and instruction B both access some location x, and suppose A is parallel to B, so both of the instructions can execute in parallel. If A and B are just reading that location, then that's fine -- you don't actually have a race. But if one of the two instructions is writing to that location while the other one is reading from it, then you have what's called a read race, and the program might have a non-deterministic result, because the final answer might depend on whether the read saw the location before or after the other instruction updated its value. So the order of execution of A and B can affect the final result that you see. And finally, if both A and B write to the same shared location, then you have a write race, and again this will cause non-deterministic behavior in your program, because the final answer could depend on whether A did the write first or B did the write first. We say that two sections of code are independent if there are no determinacy races between them: the two pieces of code can't have a shared location where one computation writes to it and the other computation reads from it, or where both computations write to it. Any questions on the definition?

OK, so races are really bad, and you should avoid having races in your program. Here are some tips on how to avoid them. I can tell you not to write races in your program, and you know that races are bad, but sometimes when you're writing code you just end up with races anyway, so here are some tips. First, the iterations of a cilk_for loop should be independent, so you should make sure that the different iterations of a cilk_for loop aren't writing to the same memory location. Secondly, between a cilk_spawn statement and the corresponding cilk_sync, the code of the spawned child should be independent of the code of the parent, and this includes code that's executed by additional spawned or called children of the spawned child. So you should make sure that these pieces of code are independent and that there are no read or write races between them.
One thing to note is that the arguments to a spawned function are evaluated in the parent before the spawn actually occurs, so you can't get a race in the argument evaluation: the parent is going to evaluate those arguments, and there's only one strand doing that, so it's fine. Another thing to note is that the machine word size matters, so you need to watch out for races when you're reading and writing packed data structures. Here's an example: I have a struct x with two chars, a and b, and updating x.a and x.b in parallel may possibly cause a race. This is a nasty race, because whether it happens can depend on the compiler optimization level. Fortunately, this is safe on the Intel machines that we're using in this class -- you can't get a race in this example -- but there are other architectures that might have a race when you're updating the two variables a and b this way. With the Intel machines that we're using, if you're using standard data types like chars, shorts, ints, and longs inside a struct, you won't get races; but if you're using non-standard types -- for example, the C bit-field facilities with field sizes that are not one of the standard sizes -- then you could possibly get a race. In particular, if you're updating individual bits inside a word in parallel, then you might see a race there, so you need to be careful. Questions?

Yes -- that's because the architecture might actually be updating the struct at a granularity of more than one byte, so if you're updating single bits or single bytes inside that larger word, it might cause a race; but fortunately this doesn't happen on the Intel machines we're using.

OK, so fortunately the Cilk platform has a very nice tool called the Cilksan race detector. If you compile your code using the flag -fsanitize=cilk, it's going to generate a Cilksan-instrumented program, and then if an ostensibly deterministic Cilk program run on a given input could possibly behave any differently than its serial elision, Cilksan is guaranteed to report and localize the offending race. Cilksan will tell you which memory location there might be a race on and which instructions were involved in the race. Cilksan employs a regression-test methodology, where the programmer provides it different test inputs, and for each test input, if there could possibly be a race in the program, it will report those races, identifying the file names, the lines, and the variables involved, including the stack traces. So it's very helpful when you're trying to debug your code and find out where there's a race. One thing to note is that you should ensure that all of your program files are instrumented, because if you only instrument some of your files and not the others, you may miss some of these race bugs. One of the nice things about the Cilksan race detector is that it's always going to report a race if there could possibly be one, unlike many other race detectors, which are best-effort: they might report a race some of the time, when the race actually occurs during that run, but they don't necessarily report it every time, because in some executions the race doesn't occur. The Cilksan race detector will report the race whenever there's potentially a race in there. Cilksan is your best friend, so use it when you're debugging your homeworks and projects.
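As a sketch of how you might use it, here is a small, deliberately racy program; the file name and the extra compiler flags besides -fsanitize=cilk are placeholders of mine, not something the lecture specifies.

```c
// Hypothetical file cilksan_demo.c. Compile with the course's Tapir/LLVM clang,
// something like:  clang -fsanitize=cilk -O1 -g cilksan_demo.c -o cilksan_demo
#include <cilk/cilk.h>
#include <stdio.h>

int sum = 0;

void add(int k) { sum += k; }   // every call updates the same global variable

int main(void) {
  cilk_for (int i = 0; i < 1000; ++i) {
    add(i);                     // logically parallel iterations race on sum
  }
  printf("sum = %d\n", sum);
  return 0;
}
```

Running the instrumented binary should produce a report naming the racing accesses to sum, with the file, line, and stack trace, along the lines of the matrix-multiplication output shown on the slide.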
Here's an example of the output that's generated by Cilksan. You can see that it's saying there's a race detected at this memory address, and the function and the line of code that caused this race are shown here, as well as the file name -- this is a matrix multiplication example -- and then it also tells you how many races were detected. OK, any questions on determinacy races?

All right, so let's now talk about parallelism. What is parallelism? Can we quantitatively define what parallelism is? What does it mean when somebody tells you that their code is highly parallel? To have a formal definition of parallelism, we first need to look at the Cilk execution model. This is the code that we saw before for Fibonacci. Let's now look at what a call to fib of 4 looks like. Here I've color-coded the different lines of code so that I can refer to them when I'm drawing this computation graph, and now I'm going to draw the computation graph corresponding to how the computation unfolds during execution.

The first thing I'm going to do is call fib of 4, and that generates this magenta node here corresponding to the call to fib of 4, representing the pink code here. This illustration is similar to the computation graphs that you saw in the previous lecture, but this one is happening in parallel. I'm only labeling the argument here, but you could also write down the local variables; I didn't, because I wanted to fit everything on this slide. So what happens when you call fib of 4? It gets to this cilk_spawn statement, and then it's going to call fib of 3. When I get to a cilk_spawn statement, I create another node that corresponds to the child that I spawned -- this is the magenta node here in this blue box -- and I also have a continuation, a green node, that represents the computation after the cilk_spawn statement. This green node corresponds to the green line of code in the code snippet.

Now I can unfold this computation graph one more step. We see that fib of 3 is going to call fib of 2, so I create another node here, and the green node, which corresponds to the green line of code, is also going to make a function call -- it calls fib of 2 -- and that also creates a new node. In general, when I do a spawn, I'm going to have two outgoing edges out of a magenta node, and when I do a call, I'm going to have one outgoing edge out of a green node. For the green node, the outgoing edge corresponds to a function call, and for the magenta node, its first outgoing edge corresponds to a spawn and its second outgoing edge goes to the continuation strand.

I can unfold this one more time, and here I see that I'm creating some more spawns and calls to fib, and if I do this one more time, I've actually reached the base case, because once n is equal to 1 or 0, I'm not going to make any more recursive calls. By the way, the color of these boxes corresponds to whether I called that function or spawned it: a box with a white background corresponds to a function that I called, whereas a box with a blue background corresponds to a function that I spawned. OK, so now that I've gotten to the base case, I need to execute this blue statement, which sums up x and y and returns the result to the parent caller. So here I have a blue node, and it takes the results of the two recursive calls and sums them together.
Then I have another blue node here, and it's going to pass its value to the parent that called it, so I pass this result up to its parent, and then I pass this one up as well, and finally I have a blue node at the top level which computes my final result, and that's the output of the program.

One thing to note is that this computation DAG unfolds dynamically during the execution: the runtime system isn't going to create this graph at the beginning; it creates it on the fly as you run the program. So this graph unfolds dynamically. Also, this graph is processor-oblivious: nowhere in this computation DAG did I mention the number of processors I have for the computation, and similarly, in the code I never mention the number of processors I'm using. The runtime system is going to figure out how to map these tasks to the number of processors that you give the computation, dynamically at runtime. For example, I can run this on any number of processors. If I run it on one processor, it's just going to execute these tasks one at a time -- in fact, it's going to execute them in a depth-first order, which corresponds to what the sequential algorithm would do: I start with fib of 4, go to fib of 3, then fib of 2, then fib of 1, pop back up, do fib of 0, and so on. So if I use one processor, it's going to create and execute this computation DAG in a depth-first manner, and if I have more than one processor, it's not necessarily going to follow a depth-first order, because I could have multiple computations going on. Any questions on this example?

I'm now going to formally define some terms so that we can formalize the notion of a computation DAG. DAG stands for directed acyclic graph, and this is a directed acyclic graph, so we call it a computation DAG. A parallel instruction stream is a DAG G with vertices V and edges E, and each vertex in this DAG corresponds to a strand. A strand is a sequence of instructions not containing a spawn, a sync, or a return from a spawn, so the instructions inside a strand are executed sequentially -- there's no parallelism within a strand. We call the first strand the initial strand -- this is the magenta node up at the top -- the last strand is the final strand, and everything else we just call a strand. Then there are four types of edges: spawn edges, call edges, return edges, and continue edges. A spawn edge corresponds to an edge to a function that you spawned, so the spawn edges go to a magenta node. A call edge corresponds to an edge that goes to a function that you called; in this example these come out of the green nodes and go to a magenta node. A return edge corresponds to an edge going back up to the parent caller; here it goes into one of the blue nodes. And finally, a continue edge is just the other edge when you spawn a function -- it's the edge that goes to the green node representing the computation after the spawn. Notice that in this computation DAG we never explicitly represent cilk_for, because, as I said before, cilk_for loops are converted to nested cilk_spawn and cilk_sync statements, so we don't need to represent cilk_for explicitly in the computation DAG. Any questions on this definition? We're going to be using this computation DAG throughout the lecture to analyze how much parallelism there is in a program.
So, assuming that each of these strands executes in unit time -- this assumption isn't always true in practice, and in practice strands will take different amounts of time, but let's assume for simplicity that each strand here takes unit time -- does anyone want to guess what the parallelism of this computation is? How parallel do you think this is? What's the maximum speed-up you might get on this computation? Five -- somebody said five. Any other guesses? Who thinks this is going to be less than five? A couple of people. Who thinks it's going to be more than five? A couple of people. Who thinks there's any parallelism at all in this computation? It seems like a lot of people think there is some parallelism here. We're actually going to analyze how much parallelism is in this computation, so I'm not going to tell you the answer now, but I'll tell you in a couple of slides. First I need to go over some terminology.

Whenever you start talking about parallelism, somebody is almost always going to bring up Amdahl's Law. Amdahl's Law says that if 50% of your application is parallel and the other 50% is serial, then you can't get more than a factor-of-2 speed-up, no matter how many processors you run the computation on. Does anyone know why this is the case? Right -- you have to spend at least 50% of the time in the serial portion. In the best case, if I gave you an infinite number of processors and you could reduce the parallel portion of your code to zero running time, you'd still have the 50% of serial time that you have to execute, and therefore the best speed-up you can get is a factor of 2. In general, if a fraction alpha of an application must be run serially, then the speed-up can be at most 1 over alpha. So if one third of your program has to be executed sequentially, then the speed-up can be at most 3, because even if you reduce the parallel portion of your code to a running time of zero, you still have to wait for the sequential part.

OK, so let's try to quantify the parallelism in this computation. How many of these nodes have to be executed sequentially? Nine? It turns out to be less than nine. Seven? Less than seven. Six? Less than six. Four? Less than four. Two? You're getting close -- it turns out to be more than two. What's left? Three. OK, so three of these nodes have to be executed sequentially, because when you're executing those nodes, there's nothing else that can happen in parallel. For all the remaining nodes, while you're executing them you could potentially be executing some of the other nodes in parallel, but for the three nodes I've colored in yellow, you have to execute them sequentially, because there's nothing else going on in parallel. According to Amdahl's Law, this says that the serial fraction of the program is 3 over 18 -- there are 18 nodes in this graph -- so the serial fraction is one sixth, and the speed-up is upper bounded by one over that, which is 6. So Amdahl's Law tells us that the maximum speed-up we can get is 6. Any questions on how I got this number? It turns out that Amdahl's Law gives us a pretty loose upper bound on the parallelism, and it's not that useful in many practical cases, so we're going to look at a better definition of parallelism that will give us a better upper bound on the maximum speed-up we can get.
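Written out, the Amdahl's Law calculation for this 18-node example is just:

```latex
\text{speedup} \;\le\; \frac{1}{\alpha},
\qquad
\alpha \;=\; \frac{3}{18} \;=\; \frac{1}{6}
\;\;\Longrightarrow\;\;
\text{speedup} \;\le\; 6 .
```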
So we're going to define T sub P to be the execution time of the program on P processors, and T sub 1 is just the work: T sub 1 is how much stuff you have to do if you execute this program on one processor, and we define that to be the work. Recall that in lecture 2 we looked at many ways to optimize the work -- this is the work term. In this example the number of nodes is 18, so the work is just 18. We also define T of infinity to be the span. The span is also called the critical-path length, or the computational depth, of the graph, and it's equal to the longest directed path you can find in the graph. In this example the longest path is 9 -- one of the students answered 9 earlier, and that's actually the span of this graph; there are nine nodes along this path, and that's the longest one you can find. We call this T of infinity because it's the execution time of the program if you had an infinite number of processors.

There are two laws that relate these quantities. The work law says that T sub P is greater than or equal to T sub 1 divided by P: the execution time on P processors has to be at least the work of the program divided by the number of processors you have. Does anyone see why the work law is true? The answer is that if you have P processors, then on each time step you can do at most P work. If you multiply both sides by P, you get that P times T sub P is greater than or equal to T sub 1; if P times T sub P were less than T sub 1, that would mean you're not done with the computation, because you haven't done all the work yet. So the work law says that T sub P has to be at least T sub 1 over P. Any questions on the work law?

All right, let's look at another law -- this one is called the span law. It says that T sub P has to be greater than or equal to T sub infinity: the execution time on P processors has to be at least the execution time on an infinite number of processors. Does anyone know why the span law has to be true? One way to see this is that if you had an infinite number of processors, you could simulate a P-processor system: you just use P of the processors and leave all the remaining processors idle, and that can't slow down your program. Therefore T sub P has to be at least T sub infinity -- adding more processors can't make the running time go up. Any questions?
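In symbols, the two laws are:

```latex
T_P \;\ge\; \frac{T_1}{P} \quad \text{(work law: $P$ processors do at most $P$ units of work per step)}
\\[4pt]
T_P \;\ge\; T_\infty \quad \text{(span law: adding processors can never increase the running time)}
```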
So let's see how we can compose the work and the span of different computations. Let's say I have two computations, A and B, and A has to execute before B -- everything in A has to be done before I start the computation in B -- and let's say I know the work of A and the work of B individually. What would be the work of A union B? It's just the sum of the work of A and the work of B, because you have to do all the work of A and then all the work of B, so you add them together. What about the span? If I know the span of A and the span of B, the span of A union B is again just the sum of the span of A and the span of B, because I have to execute everything in A before I start B, so I sum the spans. This is series composition.

What if I do parallel composition? Let's say here I'm executing the two computations in parallel. What's the work of A union B? It's not the maximum -- it's still the sum of T 1 of A and T 1 of B, because you still have the same amount of work to do; it's just that you're doing it in parallel. The work is the time if you had one processor, and with one processor you wouldn't be executing these in parallel. What about the span? If I know the span of A and the span of B, the span of the parallel composition is the max of the span of A and the span of B, because I'm going to be bottlenecked by the slower of the two computations -- I take the one that has the longer span, and that gives me the overall span. Any questions?

OK, here's another definition: T 1 divided by T P is the speed-up on P processors. If T 1 over T P is less than P, then I have sub-linear speed-up -- I'm not making use of all of the processors, because I'm using P processors but not getting a speed-up of P. If T 1 over T P is equal to P, then I'm getting perfect linear speed-up: I'm making use of all my processors, I'm putting P times as many resources into the computation, and it becomes P times faster -- that's the good case. And finally, if T 1 over T P is greater than P, we have something called super-linear speed-up. In our simple performance model this can't actually happen, because of the work law: the work law says that T P has to be at least T 1 divided by P, so if you rearrange the terms you'll see that we get a contradiction in our model. In practice you might sometimes see super-linear speed-up, because when you're using more processors you might have access to more cache, and that can improve the performance of your program; but in general you'll see only a little bit of super-linear speed-up, if any. In our simplified model we're just going to assume you can't have it -- getting perfect linear speed-up is already very good.

Because the span law says that T P has to be at least T infinity, the maximum possible speed-up is just T 1 divided by T infinity, and that's the parallelism of your computation -- the maximum possible speed-up you can get. Another way to view this is that it's equal to the average amount of work you have to do per step along the span: for every step along the span you're doing this much work, and after all the steps you've done all of the work. So what's the parallelism of this computation DAG here? Two. Why is it two? T 1 is 18 -- there are 18 nodes in this graph -- T infinity is 9, and the last time I checked, 18 divided by 9 is 2. So the parallelism here is 2.

OK, so now we can go back to our Fibonacci example and also analyze its work and span and compute the maximum parallelism. Again, for simplicity, let's assume that each of these strands takes unit time to execute -- in practice that's not necessarily true, but let's just assume it. What's the work of this computation? 17 -- right, the work is just the number of nodes in this graph, and you can count them up and get 17. What about the span? Somebody said 8 -- yes, the span is 8, and here's the longest path: this is the path that has 8 nodes in it, and it's the longest one you can find here. Therefore the parallelism is 17 divided by 8, which is 2.125, so for all of you who guessed that the parallelism was 2, you were very close. This tells us that using many more than 2 processors can only yield marginal performance gains, because the maximum speed-up we can get is 2.125 -- even if we throw 8 processors at this computation, we're not going to get a speed-up beyond 2.125.
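Putting the composition rules and the two worked examples together:

```latex
\text{series: } \quad T_1(A \cup B) = T_1(A) + T_1(B), \qquad T_\infty(A \cup B) = T_\infty(A) + T_\infty(B)
\\[4pt]
\text{parallel: } \quad T_1(A \cup B) = T_1(A) + T_1(B), \qquad T_\infty(A \cup B) = \max\bigl(T_\infty(A),\, T_\infty(B)\bigr)
\\[4pt]
\text{parallelism} \;=\; \frac{T_1}{T_\infty}: \qquad \frac{18}{9} = 2 \ \text{for the earlier dag}, \qquad \frac{17}{8} = 2.125 \ \text{for fib(4)} .
```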
OK, so to figure out how much parallelism is in your computation, you need to analyze the work of your computation and the span of your computation and then take the ratio between the two quantities. But for large computations it's pretty tedious to do this by hand -- you don't want to draw these graphs out by hand for a very large computation. Fortunately, Cilk has a tool called the Cilkscale scalability analyzer. It's integrated into the Tapir/LLVM compiler that you'll be using for this course, and Cilkscale uses compiler instrumentation to analyze a serial execution of a program and generate the work and span quantities, and then it uses those quantities to derive upper bounds on the parallel speed-up of your program. You'll have a chance to play around with Cilkscale in homework 4.

So let's try to analyze the parallelism of quicksort, and here we're using a parallel quicksort algorithm. The quicksort function takes two inputs, which are two pointers: left points to the beginning of the array we want to sort, and right points to one element after the end of the array. We first check if left is equal to right; if so, then we just return, because there are no elements to sort. Otherwise, we call this partition function. The partition function picks a random pivot -- so this is a randomized quicksort algorithm -- and then it moves everything that's less than the pivot to the left part of the array and everything that's greater than or equal to the pivot to the right part of the array, and it also returns a pointer to the pivot. Then we can execute two recursive calls: quicksort on the left side and quicksort on the right side, and these can happen in parallel, so we use cilk_spawn here to spawn off one of these calls to quicksort, and therefore the two recursive calls run in parallel. Finally, we sync up before we return from the function.
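Here is a sketch of that parallel quicksort in Cilk. The recursive structure follows the slide; the serial partition helper (random pivot, Lomuto-style scan) is my own filler, so its details shouldn't be read as the course's exact code.

```c
#include <cilk/cilk.h>
#include <stdlib.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// Serial partition: pick a random pivot, move smaller elements to its left,
// and return a pointer to the pivot's final position.
static int *partition(int *left, int *right) {
  int n = right - left;
  swap(left + rand() % n, right - 1);   // move a random pivot to the last slot
  int pivot = *(right - 1);
  int *store = left;
  for (int *p = left; p < right - 1; ++p) {
    if (*p < pivot) swap(p, store++);
  }
  swap(store, right - 1);               // put the pivot between the two sides
  return store;
}

// left points at the first element; right points one past the last element.
void quicksort(int *left, int *right) {
  if (right - left < 2) return;         // zero or one element: nothing to sort
  int *pivot = partition(left, right);
  cilk_spawn quicksort(left, pivot);    // the two recursive calls may run in parallel
  quicksort(pivot + 1, right);
  cilk_sync;                            // wait for the spawned call before returning
}
```

Note that the partition here runs serially over the whole range, which is exactly the assumption the analysis that follows makes.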
All right, so let's say we wanted to sort 1 million numbers with this quicksort algorithm, and let's also assume that the partition function is written sequentially, so it has to go through all the elements one by one. Can anyone guess what the parallelism is in this computation? The guess was 1 million. Any other guesses? Fifty thousand. Any other guesses? Two -- that's a good guess. Log base 2 of a million. Any other guesses? So we have 2, log base 2 of a million, 50,000, and 1 million. Does anyone think it's more than 1 million? No takers on more than 1 million, OK.

If you run this program using Cilkscale, it will generate a plot that looks like this, and there are several lines on the plot, so let's talk about what each of them means. This purple line is the speed-up that you observe in your computation when you run it -- you get that by taking the single-processor running time and dividing it by the running time on P processors; that's the observed speed-up. The blue line is the bound you get from the span law -- this is T 1 over T infinity -- and here it gives us a bound of about 6 on the parallelism. The green line is the bound from the work law: it's just a linear line with a slope of 1, and it says that on P processors you can't get more than a factor-of-P speed-up. So the maximum speed-up you can get has to be below the green line and below the blue line -- you're stuck in this lower-right region of the plot. There's also this orange line, which is the speed-up you would get if you used a greedy scheduler; we'll talk more about the greedy scheduler later in this lecture. So this is the plot that you would get, and we see here that the maximum speed-up is about 5, so for those of you who guessed 2 and log base 2 of a million, you were the closest. You can also generate a plot that just shows the execution time versus the number of processors, and you can get that quite easily with a simple transformation of the previous plot. So Cilkscale gives you these useful plots that you can use to figure out how much parallelism is in your program.

Let's see why the parallelism here is so low. I said that we were going to execute the partition function sequentially, and it turns out that that's actually the bottleneck to the parallelism. The expected work of quicksort is order n log n -- some of you might have seen this in previous algorithms courses, and if you haven't, you can take a look at your favorite textbook, Introduction to Algorithms. It turns out that the parallel version of quicksort also has an expected work bound of order n log n if you pick a random pivot; the analysis is similar. The expected span bound turns out to be at least n, because on the first level of recursion we have to call the partition function, which goes through the elements one by one, so that already has linear span; and it turns out the overall span is also order n, because the span works out to be a geometrically decreasing sequence that sums to order n. Therefore the maximum parallelism you can get is order log n -- you just take the work divided by the span. For the student who guessed that the parallelism is log base 2 of n, that's very good; it turns out it's not exactly log base 2 of n, because there are constants in these work and span bounds, but it is on the order of log n.

It turns out that order log n parallelism is not very high -- in general you want the parallelism to be much higher, something polynomial in n. In order to get more parallelism in this algorithm, what you have to do is parallelize the partition function, because right now I'm just executing it sequentially. You can indeed write a parallel partition function that takes linear work and order log n span, and this gives you an overall span bound of order log squared n, and if you take n log n divided by log squared n, that gives you an overall parallelism of n over log n, which is much higher than order log n. Similarly, if you were to implement a merge sort, you would also need to make sure that the merging routine is implemented in parallel if you want to see significant speed-up -- not only do you have to execute the two recursive calls in parallel, you also need to make sure that the merging portion of the code is done in parallel. Any questions on this example?

Yes -- I believe that dip is just due to noise; there's some noise going on in the machine, so if you ran it enough times and took the average or the median, it should always be going up, or at least it shouldn't be decreasing. And yes -- at one level of recursion the parallel partition function takes order log n span, and you can show that there are order log n levels of recursion in this quicksort algorithm -- I didn't go over the details of this analysis, but you can show that -- and therefore the overall span is going to be order log squared n. I can show you on the board after class if you're interested, or I can give you a reference. Other questions?
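Summarizing the expected bounds from this discussion, for an input of n elements:

```latex
\text{serial partition:}\quad T_1 = O(n \log n), \quad T_\infty = O(n)
\;\;\Longrightarrow\;\; \text{parallelism} = \frac{T_1}{T_\infty} = O(\log n)
\\[4pt]
\text{parallel partition:}\quad T_1 = O(n \log n), \quad T_\infty = O(\log^2 n)
\;\;\Longrightarrow\;\; \text{parallelism} = O\!\left(\frac{n}{\log n}\right)
```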
OK, so it turns out that in addition to quicksort, there are also many other interesting practical parallel algorithms out there. Here I've listed a few of them, and by practical I mean that the Cilk program running on one processor is competitive with the best sequential program for that problem. You can see that I've listed the work and the span of merge sort here: if you implement the merge in parallel, the span of the overall computation is order log cubed n, and n log n divided by log cubed n is n over log squared n, which is the parallelism -- pretty high. In general, all of these computations have pretty high parallelism. Another thing to note is that these algorithms are practical because their work bound is asymptotically equal to the work of the corresponding sequential algorithm. That's known as a work-efficient parallel algorithm, and it's actually one of the goals of parallel algorithm design to come up with work-efficient parallel algorithms, because it means that even if you have a small number of processors, you can still be competitive with the sequential algorithm running on one processor. In the next lecture we'll see some examples of these other algorithms, possibly even ones not listed on the slide, and we'll go over the work and span analysis and figure out the parallelism.

Now I want to move on to talk about some scheduling theory. I talked about these computation DAGs earlier and analyzed their work and span, but I never talked about how these different strands actually get mapped to processors at runtime, so let's talk a little bit about scheduling theory. It turns out that scheduling theory is very general -- it's not just limited to parallel programming; it's used all over the place in computer science, operations research, and math. As a reminder, Cilk allows the programmer to express potential parallelism in an application, and a Cilk scheduler maps these strands onto the processors you have available, dynamically at runtime. Cilk actually uses a distributed scheduler, but since the theory of distributed schedulers is a little bit complicated, we'll first explore the ideas of scheduling using a centralized scheduler. A centralized scheduler knows everything about what's going on in the computation, and it can use that to make good decisions. So let's first look at what a centralized scheduler does, and then I'll talk a little bit about the Cilk distributed scheduler; we'll learn more about that in a future lecture as well.

We're going to look at a greedy scheduler, and the idea of a greedy scheduler is to just do as much as possible on every step of the computation. Has anyone seen greedy algorithms before? Right, many of you have. The idea is similar here: we're just going to do as much as possible at the current time step, and we're not going to think too much about the future. We define a ready strand to be a strand all of whose predecessors in the computation DAG have already executed. In this example, let's say I've already executed all of these blue strands; then the ones shaded in yellow are going to be my ready strands, because all of their predecessors have already executed.
There are two types of steps in a greedy scheduler. The first kind of step is called a complete step, and in a complete step we have at least P strands ready. If we had P equal to 3, then this is a complete step, because we have 5 strands ready, which is greater than 3. So what would a greedy scheduler do in a complete step? Right -- a greedy scheduler just does as much as it can, so it runs any three of these, or any P in general. Let's say I pick these three to run -- it turns out these are actually the worst three to run, because they don't enable any new strands to become ready, but I can pick them. An incomplete step is one where I have fewer than P strands ready. Here I have two strands ready and three processors, so what do I do in an incomplete step? Right -- just run all of them, so here I execute these two strands.

We're going to use complete steps and incomplete steps to analyze the performance of the greedy scheduler. There's a famous theorem, first shown by Ron Graham in 1968, which says that any greedy scheduler achieves the following time bound: T sub P is less than or equal to T 1 over P plus T infinity. You might recognize the terms on the right-hand side: T 1 is the work and T infinity is the span that we saw earlier. Here's a simple proof of why this time bound holds. We can upper bound the number of complete steps in the computation by T 1 over P, because each complete step performs P work, so after T 1 over P complete steps we will have done all of the work in our computation -- that means the number of complete steps can be at most T 1 over P. Any questions on this?

Now let's look at the number of incomplete steps. The number of incomplete steps is upper bounded by the span, T infinity. The reason is that if you look at the unexecuted DAG right before you execute an incomplete step and you measure the span of that unexecuted DAG, then executing the incomplete step reduces the span of that DAG by one. Here, the span of our unexecuted DAG, which contains just these seven nodes, is five, and when we execute an incomplete step, we process all of the roots of the unexecuted DAG and delete them from the DAG, so we reduce the length of the longest path by one -- executing the incomplete step decreases the span from five to four. The time bound T sub P is just upper bounded by the sum of these two types of steps, because after you execute T 1 over P complete steps and T infinity incomplete steps, you must have finished the entire computation. Any questions?
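To summarize the argument in symbols:

```latex
T_P \;\le\;
\underbrace{\frac{T_1}{P}}_{\text{complete steps: each does $P$ work}}
\;+\;
\underbrace{T_\infty}_{\text{incomplete steps: each shortens the span by 1}}
```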
OK, a corollary of this theorem is that any greedy scheduler achieves within a factor of two of the optimal running time -- the optimal running time of a scheduler that knows everything, that can predict the future, and so on. Let T P star be the execution time produced by an optimal scheduler. We know that T P star has to be at least the max of T 1 over P and T infinity, due to the work and span laws -- it has to be at least the max of these two terms, otherwise we wouldn't have finished the computation. Now we can take the inequality we had before for the greedy scheduler bound: T P is less than or equal to T 1 over P plus T infinity, and that is upper bounded by two times the max of these two terms, since a plus b is upper bounded by two times the max of a and b. And the max of T 1 over P and T infinity is upper bounded by T P star, so we can substitute that in, and we get that T P is upper bounded by two times T P star, the running time of the optimal scheduler. So the greedy scheduler achieves within a factor of two of the optimal scheduler.

Here's another, more interesting corollary. It says that any greedy scheduler achieves near-perfect linear speed-up whenever T 1 over T infinity is much greater than P. To see why this is true: if T 1 over T infinity is much greater than P -- the double arrow here means the left-hand side is much greater than the right-hand side -- then this means that the span is much less than T 1 over P. The greedy scheduling theorem gives us that T P is less than or equal to T 1 over P plus T infinity, but T infinity is much less than T 1 over P, so the first term dominates and T P is approximately equal to T 1 over P. Therefore the speed-up you get is T 1 over T P, which is approximately P, and that is near-perfect linear speed-up. The quantity T 1 divided by P times T infinity is known as the parallel slackness; it basically measures how much more parallelism you have in the computation than the number of processors you have. If the parallel slackness is very high, then this corollary holds and you're going to see near-linear speed-up. As a rule of thumb, you usually want the parallel slackness of your program to be at least 10, because if you have a parallel slackness of just 1, you can't amortize the overheads of the scheduling mechanism. So you want the parallel slackness to be at least 10 when you're programming in Cilk.

So that was the greedy scheduler; let's talk a little bit about the Cilk scheduler. Cilk uses a work-stealing scheduler and achieves an expected running time of T P equal to T 1 over P plus order T infinity. So instead of just summing the two terms, we have a big O in front of the T infinity, and this is used to account for the overheads of scheduling -- with the greedy scheduler I presented earlier, I didn't account for any of the overheads of scheduling; I just assumed it could figure out which tasks to execute. The Cilk work-stealing scheduler has this expected time bound provably; you can prove it using random variables and tail bounds of distributions, and Charles Leiserson has a paper that shows how to prove this. Empirically, we usually see that T P is more like T 1 over P plus T infinity -- we usually don't see any big constant in front of the T infinity term in practice -- and therefore we can get near-perfect linear speed-up as long as the number of processors is much less than T 1 over T infinity, the maximum parallelism. As I said earlier, the instrumentation in Cilkscale will allow you to measure the work and span terms, so that you can figure out how much parallelism is in your program. Any questions?
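As equations, the slackness condition and what it buys you:

```latex
\text{parallel slackness} \;=\; \frac{T_1 / T_\infty}{P},
\qquad
\frac{T_1}{T_\infty} \gg P
\;\;\Longrightarrow\;\;
T_P \;\approx\; \frac{T_1}{P}
\;\;\Longrightarrow\;\;
\text{speedup} \;=\; \frac{T_1}{T_P} \;\approx\; P .
```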
OK, so let's talk a little bit about how the Cilk runtime system works. In the Cilk runtime system, each worker, or processor, maintains a work deque -- deque is short for double-ended queue -- of ready strands, and it manipulates the bottom of the deque just like you would the stack of a sequential program. Here I have four processors, and each of them has its own deque, with these frames on it: function calls, saved return addresses, local variables, and so on. A processor can call a function, and when it calls a function it places that function's frame at the bottom of its stack. It can also spawn things, in which case it places a spawn frame at the bottom of its deque. These things can happen in parallel, so multiple processors can be spawning and calling things at the same time. You can also return from a spawn or a call -- here I return from a call, then I return from a spawn, and at this point the second processor doesn't actually have anything left to do. So what do I do now, when I'm left with nothing to do?

Right -- the idea is to steal some work from another processor. When a worker runs out of work to do, it steals from the top of a random victim's deque: it picks one of the other processors at random -- it rolls some dice to determine whom to steal from -- and let's say it picked the third processor. It takes all of the stuff at the top of that deque, up until the next spawn, and places it into its own deque, and now it has stuff to do again. Now it can continue executing this code; it can spawn stuff, call stuff, and so on. So the idea is that whenever a worker runs out of work to do, it steals some work from other processors; but if it always has enough work to do, then it's happy and it doesn't need to steal from anyone. This is why MIT gives us so much work to do -- so we don't have to steal work from other people.

A famous theorem says that with sufficient parallelism, workers steal very infrequently, and this gives us near-linear speed-up: with sufficient parallelism, the T 1 over P term in our running time bound is going to dominate, and that gives us near-linear speed-up. Let me show you a pseudo-proof of this theorem -- and I'm allowed to do a pseudo-proof, because I'm not the author of an algorithms textbook. Here's the pseudo-proof. Each processor is either working or stealing at every time step. The total time that all processors spend working is just T 1, because that's the total work you have to do. When a processor isn't working, it's stealing, and each steal has a 1 over P chance of reducing the span by one, because one of the processors is holding the strand that contributes to the longest path in the remaining computation DAG, and there's a 1 over P chance that I pick that processor, steal some work from it, and reduce the span of my remaining computation by one. Therefore the expected cost of all steals is order P times T infinity, because in expectation I have to do P steals before I get to the processor that holds the critical path, and I'm going to do this T infinity times. Since there are P processors, I divide the expected total time by P: T 1 plus order P times T infinity, all divided by P, and that gives me the bound T 1 over P plus order T infinity. This pseudo-proof ignores issues with independence, but it still gives you the intuition for why we get this expected running time. If you want to see the full proof, it's actually quite interesting -- it uses random variables and tail bounds of distributions -- and this is the paper that has it, by Blumofe and Charles Leiserson.
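To make the deque operations concrete, here is a toy, single-threaded sketch of one worker's deque. This is only an illustration of the push/pop/steal discipline described above -- the real Cilk runtime uses a concurrent deque with careful synchronization between the owning worker and thieves, which this sketch leaves out entirely, and for simplicity it steals one frame rather than everything up to the next spawn.

```c
#include <stddef.h>

#define DEQUE_CAPACITY 1024

typedef struct frame frame_t;   // stands in for a spawned or called frame record

typedef struct {
  frame_t *frames[DEQUE_CAPACITY];
  int top;      // thieves steal the oldest frames from here
  int bottom;   // the owning worker pushes and pops here, like a stack
} deque_t;

// The owning worker pushes a frame when it calls or spawns a function.
void push_bottom(deque_t *d, frame_t *f) { d->frames[d->bottom++] = f; }

// The owning worker pops from the bottom when it returns from a spawn or call.
frame_t *pop_bottom(deque_t *d) {
  if (d->bottom == d->top) return NULL;   // nothing left: time to go steal
  return d->frames[--d->bottom];
}

// A worker with an empty deque steals from the top of a random victim's deque.
frame_t *steal_top(deque_t *victim) {
  if (victim->top == victim->bottom) return NULL;   // victim has nothing to steal
  return victim->frames[victim->top++];
}
```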
All right, another thing I want to talk about is that Cilk supports C's rules for pointers: a pointer to stack space can be passed from a parent to a child, but not from a child to a parent, and this is the same as the stack rule for sequential C programs. Let's say I have this computation on the left: A spawns off B and then continues executing C, and C spawns off D and executes E. On the right-hand side we see the views of the stacks for each of the tasks. A sees its own stack. B sees its own stack, but it also sees A's stack, because A is its parent. C sees its own stack, but again it sees A's stack, because A is its parent. And finally, D and E see the stack of C, and they also see the stack of A. In general, a task can see the stacks of all of its ancestors in this computation graph. We call this a cactus stack, because it sort of looks like a cactus if you draw it upside down, and Cilk's cactus stack supports these multiple views of the stacks in parallel -- this is what makes parallel calls to functions work.

We can also bound the stack space used by a Cilk program. Let S 1 be the stack space required by a serial execution of a Cilk program; then the stack space required by a P-processor execution, S P, is bounded by P times S 1: S P is less than or equal to P times S 1. Here's a high-level proof of why this is true. It turns out that the work-stealing algorithm in Cilk maintains what's called the busy-leaves property, which says that each of the leaves that are still active in the computation DAG has some worker executing on it. In this example, the vertices shaded in blue and purple are the ones in my remaining computation DAG, and all of the gray nodes have already finished; for each of the leaves here, I have one processor executing the tasks associated with it. Cilk guarantees this busy-leaves property. Now, for each of these processors, the amount of stack space it needs is the stack space for its own tasks plus everything above it in this computation DAG, and we can bound that by the stack space needed by a single-processor execution of the Cilk program, S 1, because S 1 is the maximum stack space we'd need, which is basically what's needed along the longest path in this graph. We do this for every processor, so the upper bound on the stack space required by a P-processor execution is P times S 1. In general, this is quite a loose upper bound, because you're not necessarily going all the way down in this computation DAG every time -- usually you'll be much higher up in the DAG.
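Going back to the pointer rule for a moment, here's a minimal sketch of what it allows and forbids; the function names are just for illustration.

```c
#include <cilk/cilk.h>

void child(int *p) {       // the child can safely use a pointer into its parent's stack
  *p += 1;
}

int *bad(void) {
  int local = 42;
  return &local;           // illegal even in serial C: local dies when bad() returns
}

void parent(void) {
  int x = 0;
  cilk_spawn child(&x);    // fine: a pointer to the parent's stack is passed down to a child
  // int *p = cilk_spawn bad();   // not fine: a child cannot hand its stack space back up
  cilk_sync;               // x is safe to use again once the spawned child has returned
}
```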
Any questions? Yes -- in practice, if you have enough parallelism, then you're not actually going to steal very much in your algorithm. If you guarantee that there's a lot of parallelism, then each processor has plenty of its own work to do, and it doesn't need to steal very frequently. But if your parallelism is very low compared to the number of processors -- if it's about equal to the number of processors -- then you're going to spend a significant amount of time stealing, and the overheads of the work-stealing algorithm are going to show up in your running time. And yes -- the standard Cilk work-stealing scheduler takes everything at the top of the deque up until the next spawn, so basically that's one strand. There are variants that take more than that, but the Cilk work-stealing scheduler we'll be using in this class just takes the top strand. Any other questions?

OK, so that's actually all I have for today. If you have any additional questions, you can come talk to us after class, and remember to meet with your MITPOSSE mentors soon.
00:19:24,440 --> 00:19:26,370 410 00:19:26,370 --> 00:19:27,900 411 00:19:27,900 --> 00:19:30,840 412 00:19:30,840 --> 00:19:33,330 413 00:19:33,330 --> 00:19:35,460 414 00:19:35,460 --> 00:19:37,500 415 00:19:37,500 --> 00:19:42,510 416 00:19:42,510 --> 00:19:44,100 417 00:19:44,100 --> 00:19:46,050 418 00:19:46,050 --> 00:19:49,350 419 00:19:49,350 --> 00:19:51,360 420 00:19:51,360 --> 00:19:53,010 421 00:19:53,010 --> 00:19:56,370 422 00:19:56,370 --> 00:19:58,650 423 00:19:58,650 --> 00:20:02,340 424 00:20:02,340 --> 00:20:05,550 425 00:20:05,550 --> 00:20:07,710 426 00:20:07,710 --> 00:20:09,240 427 00:20:09,240 --> 00:20:12,300 428 00:20:12,300 --> 00:20:18,300 429 00:20:18,300 --> 00:20:20,880 430 00:20:20,880 --> 00:20:24,240 431 00:20:24,240 --> 00:20:26,130 432 00:20:26,130 --> 00:20:29,310 433 00:20:29,310 --> 00:20:32,040 434 00:20:32,040 --> 00:20:34,140 435 00:20:34,140 --> 00:20:37,110 436 00:20:37,110 --> 00:20:40,590 437 00:20:40,590 --> 00:20:43,410 438 00:20:43,410 --> 00:20:46,290 439 00:20:46,290 --> 00:20:48,270 440 00:20:48,270 --> 00:20:50,280 441 00:20:50,280 --> 00:20:53,340 442 00:20:53,340 --> 00:20:54,990 443 00:20:54,990 --> 00:20:58,470 444 00:20:58,470 --> 00:21:00,090 445 00:21:00,090 --> 00:21:02,550 446 00:21:02,550 --> 00:21:06,750 447 00:21:06,750 --> 00:21:12,540 448 00:21:12,540 --> 00:21:15,930 449 00:21:15,930 --> 00:21:19,320 450 00:21:19,320 --> 00:21:20,940 451 00:21:20,940 --> 00:21:24,450 452 00:21:24,450 --> 00:21:26,670 453 00:21:26,670 --> 00:21:30,210 454 00:21:30,210 --> 00:21:32,340 455 00:21:32,340 --> 00:21:34,769 456 00:21:34,769 --> 00:21:36,840 457 00:21:36,840 --> 00:21:39,629 458 00:21:39,629 --> 00:21:41,310 459 00:21:41,310 --> 00:21:43,919 460 00:21:43,919 --> 00:21:49,019 461 00:21:49,019 --> 00:21:52,049 462 00:21:52,049 --> 00:21:54,060 463 00:21:54,060 --> 00:21:56,190 464 00:21:56,190 --> 00:21:58,860 465 00:21:58,860 --> 00:22:05,879 466 00:22:05,879 --> 00:22:08,310 467 00:22:08,310 --> 00:22:10,919 468 00:22:10,919 --> 00:22:13,230 469 00:22:13,230 --> 00:22:16,409 470 00:22:16,409 --> 00:22:18,960 471 00:22:18,960 --> 00:22:22,710 472 00:22:22,710 --> 00:22:24,619 473 00:22:24,619 --> 00:22:28,230 474 00:22:28,230 --> 00:22:30,299 475 00:22:30,299 --> 00:22:32,070 476 00:22:32,070 --> 00:22:38,639 477 00:22:38,639 --> 00:22:41,249 478 00:22:41,249 --> 00:22:43,710 479 00:22:43,710 --> 00:22:46,919 480 00:22:46,919 --> 00:22:48,990 481 00:22:48,990 --> 00:22:51,330 482 00:22:51,330 --> 00:22:54,570 483 00:22:54,570 --> 00:22:59,100 484 00:22:59,100 --> 00:23:01,409 485 00:23:01,409 --> 00:23:04,110 486 00:23:04,110 --> 00:23:07,409 487 00:23:07,409 --> 00:23:09,389 488 00:23:09,389 --> 00:23:11,580 489 00:23:11,580 --> 00:23:13,680 490 00:23:13,680 --> 00:23:15,119 491 00:23:15,119 --> 00:23:17,460 492 00:23:17,460 --> 00:23:18,720 493 00:23:18,720 --> 00:23:21,210 494 00:23:21,210 --> 00:23:23,279 495 00:23:23,279 --> 00:23:25,049 496 00:23:25,049 --> 00:23:27,180 497 00:23:27,180 --> 00:23:29,430 498 00:23:29,430 --> 00:23:31,049 499 00:23:31,049 --> 00:23:32,879 500 00:23:32,879 --> 00:23:35,340 501 00:23:35,340 --> 00:23:37,980 502 00:23:37,980 --> 00:23:41,580 503 00:23:41,580 --> 00:23:43,769 504 00:23:43,769 --> 00:23:43,779 505 00:23:43,779 --> 00:23:44,010 506 00:23:44,010 --> 00:23:47,010 507 00:23:47,010 --> 00:23:49,920 508 00:23:49,920 --> 00:23:51,900 509 00:23:51,900 --> 00:23:54,270 510 00:23:54,270 --> 00:23:56,730 511 00:23:56,730 --> 00:23:58,470 512 00:23:58,470 --> 00:24:00,300 513 00:24:00,300 --> 
00:24:00,310 514 00:24:00,310 --> 00:24:05,220 515 00:24:05,220 --> 00:24:08,280 516 00:24:08,280 --> 00:24:10,500 517 00:24:10,500 --> 00:24:12,540 518 00:24:12,540 --> 00:24:15,150 519 00:24:15,150 --> 00:24:18,780 520 00:24:18,780 --> 00:24:20,880 521 00:24:20,880 --> 00:24:22,230 522 00:24:22,230 --> 00:24:23,930 523 00:24:23,930 --> 00:24:27,120 524 00:24:27,120 --> 00:24:31,200 525 00:24:31,200 --> 00:24:35,190 526 00:24:35,190 --> 00:24:37,350 527 00:24:37,350 --> 00:24:40,320 528 00:24:40,320 --> 00:24:42,810 529 00:24:42,810 --> 00:24:45,000 530 00:24:45,000 --> 00:24:46,830 531 00:24:46,830 --> 00:24:50,280 532 00:24:50,280 --> 00:24:53,100 533 00:24:53,100 --> 00:24:55,440 534 00:24:55,440 --> 00:24:57,480 535 00:24:57,480 --> 00:24:59,370 536 00:24:59,370 --> 00:25:03,090 537 00:25:03,090 --> 00:25:06,360 538 00:25:06,360 --> 00:25:08,640 539 00:25:08,640 --> 00:25:12,840 540 00:25:12,840 --> 00:25:16,410 541 00:25:16,410 --> 00:25:19,740 542 00:25:19,740 --> 00:25:24,090 543 00:25:24,090 --> 00:25:25,680 544 00:25:25,680 --> 00:25:28,410 545 00:25:28,410 --> 00:25:31,200 546 00:25:31,200 --> 00:25:36,540 547 00:25:36,540 --> 00:25:38,460 548 00:25:38,460 --> 00:25:42,360 549 00:25:42,360 --> 00:25:45,150 550 00:25:45,150 --> 00:25:48,300 551 00:25:48,300 --> 00:25:50,160 552 00:25:50,160 --> 00:25:52,640 553 00:25:52,640 --> 00:25:55,380 554 00:25:55,380 --> 00:25:59,200 555 00:25:59,200 --> 00:26:03,080 556 00:26:03,080 --> 00:26:05,930 557 00:26:05,930 --> 00:26:08,000 558 00:26:08,000 --> 00:26:11,420 559 00:26:11,420 --> 00:26:12,920 560 00:26:12,920 --> 00:26:15,080 561 00:26:15,080 --> 00:26:20,210 562 00:26:20,210 --> 00:26:22,970 563 00:26:22,970 --> 00:26:24,590 564 00:26:24,590 --> 00:26:26,780 565 00:26:26,780 --> 00:26:28,490 566 00:26:28,490 --> 00:26:28,500 567 00:26:28,500 --> 00:26:38,169 568 00:26:38,169 --> 00:26:40,549 569 00:26:40,549 --> 00:26:45,020 570 00:26:45,020 --> 00:26:46,310 571 00:26:46,310 --> 00:26:47,750 572 00:26:47,750 --> 00:26:49,850 573 00:26:49,850 --> 00:26:52,520 574 00:26:52,520 --> 00:26:54,980 575 00:26:54,980 --> 00:27:04,190 576 00:27:04,190 --> 00:27:06,380 577 00:27:06,380 --> 00:27:08,240 578 00:27:08,240 --> 00:27:12,230 579 00:27:12,230 --> 00:27:15,650 580 00:27:15,650 --> 00:27:20,409 581 00:27:20,409 --> 00:27:22,700 582 00:27:22,700 --> 00:27:28,789 583 00:27:28,789 --> 00:27:30,320 584 00:27:30,320 --> 00:27:37,159 585 00:27:37,159 --> 00:27:38,960 586 00:27:38,960 --> 00:27:41,090 587 00:27:41,090 --> 00:27:42,740 588 00:27:42,740 --> 00:27:45,049 589 00:27:45,049 --> 00:27:48,289 590 00:27:48,289 --> 00:27:50,930 591 00:27:50,930 --> 00:27:54,110 592 00:27:54,110 --> 00:27:56,330 593 00:27:56,330 --> 00:27:58,250 594 00:27:58,250 --> 00:28:02,169 595 00:28:02,169 --> 00:28:05,480 596 00:28:05,480 --> 00:28:08,630 597 00:28:08,630 --> 00:28:10,400 598 00:28:10,400 --> 00:28:13,370 599 00:28:13,370 --> 00:28:17,080 600 00:28:17,080 --> 00:28:27,050 601 00:28:27,050 --> 00:28:29,970 602 00:28:29,970 --> 00:28:32,490 603 00:28:32,490 --> 00:28:35,700 604 00:28:35,700 --> 00:28:37,140 605 00:28:37,140 --> 00:28:39,480 606 00:28:39,480 --> 00:28:41,970 607 00:28:41,970 --> 00:28:43,860 608 00:28:43,860 --> 00:28:46,140 609 00:28:46,140 --> 00:28:48,690 610 00:28:48,690 --> 00:28:53,040 611 00:28:53,040 --> 00:28:55,500 612 00:28:55,500 --> 00:28:58,260 613 00:28:58,260 --> 00:29:01,170 614 00:29:01,170 --> 00:29:03,180 615 00:29:03,180 --> 00:29:05,820 616 00:29:05,820 --> 00:29:08,130 617 00:29:08,130 --> 00:29:12,030 618 
00:29:12,030 --> 00:29:13,770 619 00:29:13,770 --> 00:29:15,150 620 00:29:15,150 --> 00:29:22,590 621 00:29:22,590 --> 00:29:24,960 622 00:29:24,960 --> 00:29:27,330 623 00:29:27,330 --> 00:29:40,240 624 00:29:40,240 --> 00:29:52,919 625 00:29:52,919 --> 00:29:57,220 626 00:29:57,220 --> 00:30:03,450 627 00:30:03,450 --> 00:30:10,020 628 00:30:10,020 --> 00:30:12,100 629 00:30:12,100 --> 00:30:18,100 630 00:30:18,100 --> 00:30:28,630 631 00:30:28,630 --> 00:30:33,370 632 00:30:33,370 --> 00:30:37,000 633 00:30:37,000 --> 00:30:38,350 634 00:30:38,350 --> 00:30:40,330 635 00:30:40,330 --> 00:30:42,730 636 00:30:42,730 --> 00:30:44,919 637 00:30:44,919 --> 00:30:47,289 638 00:30:47,289 --> 00:30:49,180 639 00:30:49,180 --> 00:30:52,149 640 00:30:52,149 --> 00:30:53,710 641 00:30:53,710 --> 00:30:55,149 642 00:30:55,149 --> 00:30:58,630 643 00:30:58,630 --> 00:31:01,840 644 00:31:01,840 --> 00:31:04,720 645 00:31:04,720 --> 00:31:07,740 646 00:31:07,740 --> 00:31:11,110 647 00:31:11,110 --> 00:31:13,090 648 00:31:13,090 --> 00:31:17,320 649 00:31:17,320 --> 00:31:19,659 650 00:31:19,659 --> 00:31:24,010 651 00:31:24,010 --> 00:31:30,600 652 00:31:30,600 --> 00:31:33,159 653 00:31:33,159 --> 00:31:35,440 654 00:31:35,440 --> 00:31:38,649 655 00:31:38,649 --> 00:31:41,049 656 00:31:41,049 --> 00:31:42,460 657 00:31:42,460 --> 00:31:44,740 658 00:31:44,740 --> 00:31:46,509 659 00:31:46,509 --> 00:31:52,060 660 00:31:52,060 --> 00:31:54,610 661 00:31:54,610 --> 00:31:57,039 662 00:31:57,039 --> 00:32:01,480 663 00:32:01,480 --> 00:32:04,180 664 00:32:04,180 --> 00:32:06,730 665 00:32:06,730 --> 00:32:08,289 666 00:32:08,289 --> 00:32:10,919 667 00:32:10,919 --> 00:32:13,720 668 00:32:13,720 --> 00:32:21,419 669 00:32:21,419 --> 00:32:24,879 670 00:32:24,879 --> 00:32:31,570 671 00:32:31,570 --> 00:32:35,110 672 00:32:35,110 --> 00:32:37,210 673 00:32:37,210 --> 00:32:39,610 674 00:32:39,610 --> 00:32:43,180 675 00:32:43,180 --> 00:32:46,000 676 00:32:46,000 --> 00:32:51,039 677 00:32:51,039 --> 00:32:53,409 678 00:32:53,409 --> 00:32:56,259 679 00:32:56,259 --> 00:32:59,289 680 00:32:59,289 --> 00:33:01,629 681 00:33:01,629 --> 00:33:09,340 682 00:33:09,340 --> 00:33:11,289 683 00:33:11,289 --> 00:33:13,899 684 00:33:13,899 --> 00:33:16,120 685 00:33:16,120 --> 00:33:19,840 686 00:33:19,840 --> 00:33:22,450 687 00:33:22,450 --> 00:33:26,379 688 00:33:26,379 --> 00:33:28,360 689 00:33:28,360 --> 00:33:30,909 690 00:33:30,909 --> 00:33:33,639 691 00:33:33,639 --> 00:33:35,470 692 00:33:35,470 --> 00:33:37,600 693 00:33:37,600 --> 00:33:40,870 694 00:33:40,870 --> 00:33:43,990 695 00:33:43,990 --> 00:33:47,140 696 00:33:47,140 --> 00:33:50,230 697 00:33:50,230 --> 00:33:54,280 698 00:33:54,280 --> 00:33:57,580 699 00:33:57,580 --> 00:34:00,940 700 00:34:00,940 --> 00:34:02,260 701 00:34:02,260 --> 00:34:04,060 702 00:34:04,060 --> 00:34:07,210 703 00:34:07,210 --> 00:34:08,740 704 00:34:08,740 --> 00:34:15,780 705 00:34:15,780 --> 00:34:18,190 706 00:34:18,190 --> 00:34:21,540 707 00:34:21,540 --> 00:34:24,010 708 00:34:24,010 --> 00:34:26,470 709 00:34:26,470 --> 00:34:28,690 710 00:34:28,690 --> 00:34:31,150 711 00:34:31,150 --> 00:34:34,810 712 00:34:34,810 --> 00:34:37,750 713 00:34:37,750 --> 00:34:40,090 714 00:34:40,090 --> 00:34:41,770 715 00:34:41,770 --> 00:34:43,870 716 00:34:43,870 --> 00:34:46,240 717 00:34:46,240 --> 00:34:48,940 718 00:34:48,940 --> 00:34:51,130 719 00:34:51,130 --> 00:34:54,490 720 00:34:54,490 --> 00:34:56,470 721 00:34:56,470 --> 00:34:58,330 722 00:34:58,330 --> 
00:35:03,610 723 00:35:03,610 --> 00:35:10,840 724 00:35:10,840 --> 00:35:13,030 725 00:35:13,030 --> 00:35:15,310 726 00:35:15,310 --> 00:35:18,400 727 00:35:18,400 --> 00:35:22,930 728 00:35:22,930 --> 00:35:24,910 729 00:35:24,910 --> 00:35:28,030 730 00:35:28,030 --> 00:35:30,640 731 00:35:30,640 --> 00:35:33,310 732 00:35:33,310 --> 00:35:44,210 733 00:35:44,210 --> 00:35:44,220 734 00:35:44,220 --> 00:35:49,509 735 00:35:49,509 --> 00:35:59,220 736 00:35:59,220 --> 00:36:01,870 737 00:36:01,870 --> 00:36:04,299 738 00:36:04,299 --> 00:36:06,819 739 00:36:06,819 --> 00:36:09,549 740 00:36:09,549 --> 00:36:13,420 741 00:36:13,420 --> 00:36:15,430 742 00:36:15,430 --> 00:36:18,069 743 00:36:18,069 --> 00:36:21,279 744 00:36:21,279 --> 00:36:23,589 745 00:36:23,589 --> 00:36:26,289 746 00:36:26,289 --> 00:36:29,289 747 00:36:29,289 --> 00:36:33,940 748 00:36:33,940 --> 00:36:36,339 749 00:36:36,339 --> 00:36:38,380 750 00:36:38,380 --> 00:36:40,989 751 00:36:40,989 --> 00:36:45,279 752 00:36:45,279 --> 00:36:55,599 753 00:36:55,599 --> 00:37:01,740 754 00:37:01,740 --> 00:37:05,410 755 00:37:05,410 --> 00:37:07,690 756 00:37:07,690 --> 00:37:09,370 757 00:37:09,370 --> 00:37:11,980 758 00:37:11,980 --> 00:37:15,160 759 00:37:15,160 --> 00:37:17,260 760 00:37:17,260 --> 00:37:18,940 761 00:37:18,940 --> 00:37:21,580 762 00:37:21,580 --> 00:37:23,800 763 00:37:23,800 --> 00:37:26,260 764 00:37:26,260 --> 00:37:38,970 765 00:37:38,970 --> 00:37:42,130 766 00:37:42,130 --> 00:37:44,710 767 00:37:44,710 --> 00:37:46,450 768 00:37:46,450 --> 00:37:48,790 769 00:37:48,790 --> 00:37:51,700 770 00:37:51,700 --> 00:37:53,680 771 00:37:53,680 --> 00:38:05,530 772 00:38:05,530 --> 00:38:09,070 773 00:38:09,070 --> 00:38:14,320 774 00:38:14,320 --> 00:38:17,560 775 00:38:17,560 --> 00:38:19,540 776 00:38:19,540 --> 00:38:21,280 777 00:38:21,280 --> 00:38:23,140 778 00:38:23,140 --> 00:38:24,750 779 00:38:24,750 --> 00:38:29,530 780 00:38:29,530 --> 00:38:32,110 781 00:38:32,110 --> 00:38:34,030 782 00:38:34,030 --> 00:38:37,030 783 00:38:37,030 --> 00:38:38,920 784 00:38:38,920 --> 00:38:41,500 785 00:38:41,500 --> 00:38:42,370 786 00:38:42,370 --> 00:38:45,400 787 00:38:45,400 --> 00:38:47,740 788 00:38:47,740 --> 00:38:50,620 789 00:38:50,620 --> 00:38:52,090 790 00:38:52,090 --> 00:38:54,040 791 00:38:54,040 --> 00:38:57,130 792 00:38:57,130 --> 00:38:59,860 793 00:38:59,860 --> 00:39:01,180 794 00:39:01,180 --> 00:39:04,150 795 00:39:04,150 --> 00:39:06,190 796 00:39:06,190 --> 00:39:08,650 797 00:39:08,650 --> 00:39:10,690 798 00:39:10,690 --> 00:39:12,940 799 00:39:12,940 --> 00:39:13,630 800 00:39:13,630 --> 00:39:16,029 801 00:39:16,029 --> 00:39:18,039 802 00:39:18,039 --> 00:39:20,559 803 00:39:20,559 --> 00:39:22,660 804 00:39:22,660 --> 00:39:24,730 805 00:39:24,730 --> 00:39:26,319 806 00:39:26,319 --> 00:39:35,549 807 00:39:35,549 --> 00:39:38,859 808 00:39:38,859 --> 00:39:41,710 809 00:39:41,710 --> 00:39:43,930 810 00:39:43,930 --> 00:39:46,450 811 00:39:46,450 --> 00:39:50,319 812 00:39:50,319 --> 00:39:52,120 813 00:39:52,120 --> 00:39:55,259 814 00:39:55,259 --> 00:39:58,210 815 00:39:58,210 --> 00:40:00,190 816 00:40:00,190 --> 00:40:03,430 817 00:40:03,430 --> 00:40:05,890 818 00:40:05,890 --> 00:40:07,839 819 00:40:07,839 --> 00:40:13,839 820 00:40:13,839 --> 00:40:26,480 821 00:40:26,480 --> 00:40:33,920 822 00:40:33,920 --> 00:40:37,390 823 00:40:37,390 --> 00:40:41,120 824 00:40:41,120 --> 00:40:44,000 825 00:40:44,000 --> 00:40:48,740 826 00:40:48,740 --> 00:40:51,620 827 
00:40:51,620 --> 00:40:54,829 828 00:40:54,829 --> 00:40:59,540 829 00:40:59,540 --> 00:41:01,609 830 00:41:01,609 --> 00:41:03,680 831 00:41:03,680 --> 00:41:05,390 832 00:41:05,390 --> 00:41:07,430 833 00:41:07,430 --> 00:41:10,730 834 00:41:10,730 --> 00:41:21,770 835 00:41:21,770 --> 00:41:26,030 836 00:41:26,030 --> 00:41:28,340 837 00:41:28,340 --> 00:41:30,700 838 00:41:30,700 --> 00:41:39,710 839 00:41:39,710 --> 00:41:42,770 840 00:41:42,770 --> 00:41:46,820 841 00:41:46,820 --> 00:41:48,590 842 00:41:48,590 --> 00:41:49,570 843 00:41:49,570 --> 00:41:52,640 844 00:41:52,640 --> 00:41:57,950 845 00:41:57,950 --> 00:42:00,500 846 00:42:00,500 --> 00:42:05,030 847 00:42:05,030 --> 00:42:07,790 848 00:42:07,790 --> 00:42:09,980 849 00:42:09,980 --> 00:42:12,950 850 00:42:12,950 --> 00:42:15,140 851 00:42:15,140 --> 00:42:17,660 852 00:42:17,660 --> 00:42:20,600 853 00:42:20,600 --> 00:42:28,970 854 00:42:28,970 --> 00:42:30,410 855 00:42:30,410 --> 00:42:32,960 856 00:42:32,960 --> 00:42:35,360 857 00:42:35,360 --> 00:42:37,610 858 00:42:37,610 --> 00:42:40,340 859 00:42:40,340 --> 00:42:42,290 860 00:42:42,290 --> 00:42:43,970 861 00:42:43,970 --> 00:42:46,520 862 00:42:46,520 --> 00:42:48,530 863 00:42:48,530 --> 00:42:51,100 864 00:42:51,100 --> 00:42:54,290 865 00:42:54,290 --> 00:42:56,570 866 00:42:56,570 --> 00:42:58,310 867 00:42:58,310 --> 00:42:58,320 868 00:42:58,320 --> 00:42:59,320 869 00:42:59,320 --> 00:43:02,510 870 00:43:02,510 --> 00:43:05,930 871 00:43:05,930 --> 00:43:07,400 872 00:43:07,400 --> 00:43:09,020 873 00:43:09,020 --> 00:43:11,000 874 00:43:11,000 --> 00:43:13,460 875 00:43:13,460 --> 00:43:17,480 876 00:43:17,480 --> 00:43:19,820 877 00:43:19,820 --> 00:43:25,910 878 00:43:25,910 --> 00:43:27,830 879 00:43:27,830 --> 00:43:30,070 880 00:43:30,070 --> 00:43:33,350 881 00:43:33,350 --> 00:43:35,569 882 00:43:35,569 --> 00:43:38,089 883 00:43:38,089 --> 00:43:39,739 884 00:43:39,739 --> 00:43:41,930 885 00:43:41,930 --> 00:43:47,329 886 00:43:47,329 --> 00:43:49,699 887 00:43:49,699 --> 00:43:52,489 888 00:43:52,489 --> 00:43:55,249 889 00:43:55,249 --> 00:43:56,900 890 00:43:56,900 --> 00:43:59,180 891 00:43:59,180 --> 00:44:02,719 892 00:44:02,719 --> 00:44:05,120 893 00:44:05,120 --> 00:44:08,449 894 00:44:08,449 --> 00:44:10,759 895 00:44:10,759 --> 00:44:12,160 896 00:44:12,160 --> 00:44:14,059 897 00:44:14,059 --> 00:44:15,799 898 00:44:15,799 --> 00:44:17,630 899 00:44:17,630 --> 00:44:20,390 900 00:44:20,390 --> 00:44:23,299 901 00:44:23,299 --> 00:44:25,430 902 00:44:25,430 --> 00:44:27,079 903 00:44:27,079 --> 00:44:29,749 904 00:44:29,749 --> 00:44:31,339 905 00:44:31,339 --> 00:44:33,739 906 00:44:33,739 --> 00:44:36,109 907 00:44:36,109 --> 00:44:38,420 908 00:44:38,420 --> 00:44:45,439 909 00:44:45,439 --> 00:44:49,219 910 00:44:49,219 --> 00:44:52,640 911 00:44:52,640 --> 00:44:54,529 912 00:44:54,529 --> 00:44:56,959 913 00:44:56,959 --> 00:44:59,199 914 00:44:59,199 --> 00:45:02,029 915 00:45:02,029 --> 00:45:09,499 916 00:45:09,499 --> 00:45:20,040 917 00:45:20,040 --> 00:45:28,070 918 00:45:28,070 --> 00:45:34,140 919 00:45:34,140 --> 00:45:39,150 920 00:45:39,150 --> 00:45:44,520 921 00:45:44,520 --> 00:45:46,530 922 00:45:46,530 --> 00:45:50,550 923 00:45:50,550 --> 00:45:56,460 924 00:45:56,460 --> 00:45:59,250 925 00:45:59,250 --> 00:46:01,860 926 00:46:01,860 --> 00:46:04,230 927 00:46:04,230 --> 00:46:06,900 928 00:46:06,900 --> 00:46:11,400 929 00:46:11,400 --> 00:46:13,320 930 00:46:13,320 --> 00:46:16,020 931 00:46:16,020 --> 
00:46:18,840 932 00:46:18,840 --> 00:46:20,340 933 00:46:20,340 --> 00:46:22,950 934 00:46:22,950 --> 00:46:24,840 935 00:46:24,840 --> 00:46:30,060 936 00:46:30,060 --> 00:46:34,620 937 00:46:34,620 --> 00:46:37,890 938 00:46:37,890 --> 00:46:42,030 939 00:46:42,030 --> 00:46:44,220 940 00:46:44,220 --> 00:46:47,070 941 00:46:47,070 --> 00:46:51,900 942 00:46:51,900 --> 00:46:53,490 943 00:46:53,490 --> 00:46:56,640 944 00:46:56,640 --> 00:46:58,860 945 00:46:58,860 --> 00:47:01,650 946 00:47:01,650 --> 00:47:05,040 947 00:47:05,040 --> 00:47:08,310 948 00:47:08,310 --> 00:47:10,740 949 00:47:10,740 --> 00:47:12,840 950 00:47:12,840 --> 00:47:14,040 951 00:47:14,040 --> 00:47:18,120 952 00:47:18,120 --> 00:47:20,520 953 00:47:20,520 --> 00:47:24,390 954 00:47:24,390 --> 00:47:28,680 955 00:47:28,680 --> 00:47:30,960 956 00:47:30,960 --> 00:47:34,560 957 00:47:34,560 --> 00:47:38,530 958 00:47:38,530 --> 00:47:40,240 959 00:47:40,240 --> 00:47:44,200 960 00:47:44,200 --> 00:47:46,360 961 00:47:46,360 --> 00:47:50,140 962 00:47:50,140 --> 00:47:51,580 963 00:47:51,580 --> 00:47:53,740 964 00:47:53,740 --> 00:47:55,920 965 00:47:55,920 --> 00:48:02,130 966 00:48:02,130 --> 00:48:06,550 967 00:48:06,550 --> 00:48:08,770 968 00:48:08,770 --> 00:48:11,590 969 00:48:11,590 --> 00:48:12,820 970 00:48:12,820 --> 00:48:18,850 971 00:48:18,850 --> 00:48:21,970 972 00:48:21,970 --> 00:48:24,220 973 00:48:24,220 --> 00:48:26,920 974 00:48:26,920 --> 00:48:29,110 975 00:48:29,110 --> 00:48:30,340 976 00:48:30,340 --> 00:48:34,750 977 00:48:34,750 --> 00:48:36,370 978 00:48:36,370 --> 00:48:39,160 979 00:48:39,160 --> 00:48:40,900 980 00:48:40,900 --> 00:48:45,820 981 00:48:45,820 --> 00:48:48,880 982 00:48:48,880 --> 00:48:51,880 983 00:48:51,880 --> 00:48:53,890 984 00:48:53,890 --> 00:48:56,470 985 00:48:56,470 --> 00:48:58,780 986 00:48:58,780 --> 00:49:02,140 987 00:49:02,140 --> 00:49:04,080 988 00:49:04,080 --> 00:49:07,420 989 00:49:07,420 --> 00:49:09,610 990 00:49:09,610 --> 00:49:14,410 991 00:49:14,410 --> 00:49:16,510 992 00:49:16,510 --> 00:49:19,720 993 00:49:19,720 --> 00:49:23,320 994 00:49:23,320 --> 00:49:24,580 995 00:49:24,580 --> 00:49:27,690 996 00:49:27,690 --> 00:49:30,490 997 00:49:30,490 --> 00:49:32,110 998 00:49:32,110 --> 00:49:34,690 999 00:49:34,690 --> 00:49:37,630 1000 00:49:37,630 --> 00:49:41,020 1001 00:49:41,020 --> 00:49:42,940 1002 00:49:42,940 --> 00:49:44,560 1003 00:49:44,560 --> 00:49:46,299 1004 00:49:46,299 --> 00:49:50,799 1005 00:49:50,799 --> 00:49:53,170 1006 00:49:53,170 --> 00:49:57,699 1007 00:49:57,699 --> 00:49:59,650 1008 00:49:59,650 --> 00:50:01,390 1009 00:50:01,390 --> 00:50:02,559 1010 00:50:02,559 --> 00:50:05,079 1011 00:50:05,079 --> 00:50:06,969 1012 00:50:06,969 --> 00:50:09,009 1013 00:50:09,009 --> 00:50:12,880 1014 00:50:12,880 --> 00:50:14,650 1015 00:50:14,650 --> 00:50:16,809 1016 00:50:16,809 --> 00:50:18,609 1017 00:50:18,609 --> 00:50:20,559 1018 00:50:20,559 --> 00:50:23,559 1019 00:50:23,559 --> 00:50:25,929 1020 00:50:25,929 --> 00:50:27,670 1021 00:50:27,670 --> 00:50:29,650 1022 00:50:29,650 --> 00:50:31,390 1023 00:50:31,390 --> 00:50:33,189 1024 00:50:33,189 --> 00:50:34,870 1025 00:50:34,870 --> 00:50:36,519 1026 00:50:36,519 --> 00:50:38,349 1027 00:50:38,349 --> 00:50:42,339 1028 00:50:42,339 --> 00:50:54,230 1029 00:50:54,230 --> 00:50:54,240 1030 00:50:54,240 --> 00:50:58,320 1031 00:50:58,320 --> 00:51:02,770 1032 00:51:02,770 --> 00:51:05,020 1033 00:51:05,020 --> 00:51:06,220 1034 00:51:06,220 --> 00:51:08,050 1035 
00:51:08,050 --> 00:51:09,940 1036 00:51:09,940 --> 00:51:12,640 1037 00:51:12,640 --> 00:51:27,790 1038 00:51:27,790 --> 00:51:30,160 1039 00:51:30,160 --> 00:51:33,280 1040 00:51:33,280 --> 00:51:34,990 1041 00:51:34,990 --> 00:51:37,020 1042 00:51:37,020 --> 00:51:39,490 1043 00:51:39,490 --> 00:51:42,700 1044 00:51:42,700 --> 00:51:44,470 1045 00:51:44,470 --> 00:51:46,599 1046 00:51:46,599 --> 00:51:48,040 1047 00:51:48,040 --> 00:51:52,079 1048 00:51:52,079 --> 00:52:00,190 1049 00:52:00,190 --> 00:52:04,180 1050 00:52:04,180 --> 00:52:05,980 1051 00:52:05,980 --> 00:52:07,839 1052 00:52:07,839 --> 00:52:09,640 1053 00:52:09,640 --> 00:52:12,040 1054 00:52:12,040 --> 00:52:14,020 1055 00:52:14,020 --> 00:52:15,970 1056 00:52:15,970 --> 00:52:21,010 1057 00:52:21,010 --> 00:52:23,140 1058 00:52:23,140 --> 00:52:25,780 1059 00:52:25,780 --> 00:52:27,730 1060 00:52:27,730 --> 00:52:30,460 1061 00:52:30,460 --> 00:52:32,290 1062 00:52:32,290 --> 00:52:34,000 1063 00:52:34,000 --> 00:52:35,829 1064 00:52:35,829 --> 00:52:37,480 1065 00:52:37,480 --> 00:52:40,030 1066 00:52:40,030 --> 00:52:41,980 1067 00:52:41,980 --> 00:52:44,109 1068 00:52:44,109 --> 00:52:46,960 1069 00:52:46,960 --> 00:52:48,790 1070 00:52:48,790 --> 00:52:50,950 1071 00:52:50,950 --> 00:52:52,750 1072 00:52:52,750 --> 00:52:54,099 1073 00:52:54,099 --> 00:52:56,079 1074 00:52:56,079 --> 00:52:57,970 1075 00:52:57,970 --> 00:52:59,770 1076 00:52:59,770 --> 00:53:01,300 1077 00:53:01,300 --> 00:53:04,210 1078 00:53:04,210 --> 00:53:07,300 1079 00:53:07,300 --> 00:53:09,780 1080 00:53:09,780 --> 00:53:12,010 1081 00:53:12,010 --> 00:53:14,470 1082 00:53:14,470 --> 00:53:16,300 1083 00:53:16,300 --> 00:53:17,890 1084 00:53:17,890 --> 00:53:20,470 1085 00:53:20,470 --> 00:53:26,530 1086 00:53:26,530 --> 00:53:28,810 1087 00:53:28,810 --> 00:53:30,280 1088 00:53:30,280 --> 00:53:33,310 1089 00:53:33,310 --> 00:53:34,540 1090 00:53:34,540 --> 00:53:37,090 1091 00:53:37,090 --> 00:53:39,400 1092 00:53:39,400 --> 00:53:42,340 1093 00:53:42,340 --> 00:53:44,109 1094 00:53:44,109 --> 00:53:45,550 1095 00:53:45,550 --> 00:53:48,010 1096 00:53:48,010 --> 00:53:50,560 1097 00:53:50,560 --> 00:53:52,540 1098 00:53:52,540 --> 00:53:58,150 1099 00:53:58,150 --> 00:53:59,920 1100 00:53:59,920 --> 00:54:02,140 1101 00:54:02,140 --> 00:54:04,570 1102 00:54:04,570 --> 00:54:08,250 1103 00:54:08,250 --> 00:54:10,840 1104 00:54:10,840 --> 00:54:15,640 1105 00:54:15,640 --> 00:54:17,800 1106 00:54:17,800 --> 00:54:19,780 1107 00:54:19,780 --> 00:54:22,150 1108 00:54:22,150 --> 00:54:23,980 1109 00:54:23,980 --> 00:54:25,510 1110 00:54:25,510 --> 00:54:29,140 1111 00:54:29,140 --> 00:54:31,570 1112 00:54:31,570 --> 00:54:33,820 1113 00:54:33,820 --> 00:54:35,500 1114 00:54:35,500 --> 00:54:37,840 1115 00:54:37,840 --> 00:54:39,490 1116 00:54:39,490 --> 00:54:40,900 1117 00:54:40,900 --> 00:54:42,790 1118 00:54:42,790 --> 00:54:48,370 1119 00:54:48,370 --> 00:54:50,740 1120 00:54:50,740 --> 00:54:52,900 1121 00:54:52,900 --> 00:54:54,790 1122 00:54:54,790 --> 00:54:57,609 1123 00:54:57,609 --> 00:55:00,430 1124 00:55:00,430 --> 00:55:02,230 1125 00:55:02,230 --> 00:55:03,460 1126 00:55:03,460 --> 00:55:04,810 1127 00:55:04,810 --> 00:55:06,700 1128 00:55:06,700 --> 00:55:12,070 1129 00:55:12,070 --> 00:55:14,800 1130 00:55:14,800 --> 00:55:16,900 1131 00:55:16,900 --> 00:55:19,790 1132 00:55:19,790 --> 00:55:22,520 1133 00:55:22,520 --> 00:55:24,410 1134 00:55:24,410 --> 00:55:28,070 1135 00:55:28,070 --> 00:55:30,050 1136 00:55:30,050 --> 
00:55:31,850 1137 00:55:31,850 --> 00:55:35,900 1138 00:55:35,900 --> 00:55:38,150 1139 00:55:38,150 --> 00:55:40,760 1140 00:55:40,760 --> 00:55:45,110 1141 00:55:45,110 --> 00:55:50,390 1142 00:55:50,390 --> 00:55:53,690 1143 00:55:53,690 --> 00:55:55,280 1144 00:55:55,280 --> 00:55:58,520 1145 00:55:58,520 --> 00:56:00,380 1146 00:56:00,380 --> 00:56:07,280 1147 00:56:07,280 --> 00:56:10,430 1148 00:56:10,430 --> 00:56:13,160 1149 00:56:13,160 --> 00:56:16,520 1150 00:56:16,520 --> 00:56:20,870 1151 00:56:20,870 --> 00:56:22,790 1152 00:56:22,790 --> 00:56:24,470 1153 00:56:24,470 --> 00:56:27,110 1154 00:56:27,110 --> 00:56:30,560 1155 00:56:30,560 --> 00:56:32,870 1156 00:56:32,870 --> 00:56:35,060 1157 00:56:35,060 --> 00:56:37,340 1158 00:56:37,340 --> 00:56:40,040 1159 00:56:40,040 --> 00:56:47,210 1160 00:56:47,210 --> 00:56:49,820 1161 00:56:49,820 --> 00:56:53,870 1162 00:56:53,870 --> 00:56:56,090 1163 00:56:56,090 --> 00:56:57,950 1164 00:56:57,950 --> 00:57:02,240 1165 00:57:02,240 --> 00:57:05,390 1166 00:57:05,390 --> 00:57:07,160 1167 00:57:07,160 --> 00:57:10,070 1168 00:57:10,070 --> 00:57:12,950 1169 00:57:12,950 --> 00:57:16,700 1170 00:57:16,700 --> 00:57:19,250 1171 00:57:19,250 --> 00:57:22,760 1172 00:57:22,760 --> 00:57:27,200 1173 00:57:27,200 --> 00:57:33,830 1174 00:57:33,830 --> 00:57:35,690 1175 00:57:35,690 --> 00:57:39,230 1176 00:57:39,230 --> 00:57:41,630 1177 00:57:41,630 --> 00:57:44,060 1178 00:57:44,060 --> 00:57:46,550 1179 00:57:46,550 --> 00:57:49,460 1180 00:57:49,460 --> 00:57:50,960 1181 00:57:50,960 --> 00:57:53,710 1182 00:57:53,710 --> 00:58:01,800 1183 00:58:01,800 --> 00:58:05,050 1184 00:58:05,050 --> 00:58:09,130 1185 00:58:09,130 --> 00:58:11,170 1186 00:58:11,170 --> 00:58:13,690 1187 00:58:13,690 --> 00:58:17,740 1188 00:58:17,740 --> 00:58:20,440 1189 00:58:20,440 --> 00:58:23,320 1190 00:58:23,320 --> 00:58:26,440 1191 00:58:26,440 --> 00:58:28,450 1192 00:58:28,450 --> 00:58:30,790 1193 00:58:30,790 --> 00:58:32,680 1194 00:58:32,680 --> 00:58:38,110 1195 00:58:38,110 --> 00:58:40,090 1196 00:58:40,090 --> 00:58:42,040 1197 00:58:42,040 --> 00:58:44,710 1198 00:58:44,710 --> 00:58:46,900 1199 00:58:46,900 --> 00:58:48,550 1200 00:58:48,550 --> 00:58:51,220 1201 00:58:51,220 --> 00:58:52,780 1202 00:58:52,780 --> 00:58:55,240 1203 00:58:55,240 --> 00:58:57,490 1204 00:58:57,490 --> 00:59:03,670 1205 00:59:03,670 --> 00:59:05,920 1206 00:59:05,920 --> 00:59:08,970 1207 00:59:08,970 --> 00:59:12,430 1208 00:59:12,430 --> 00:59:14,650 1209 00:59:14,650 --> 00:59:16,390 1210 00:59:16,390 --> 00:59:27,029 1211 00:59:27,029 --> 00:59:30,400 1212 00:59:30,400 --> 00:59:32,410 1213 00:59:32,410 --> 00:59:34,539 1214 00:59:34,539 --> 00:59:36,670 1215 00:59:36,670 --> 00:59:39,640 1216 00:59:39,640 --> 00:59:43,779 1217 00:59:43,779 --> 00:59:46,960 1218 00:59:46,960 --> 00:59:49,150 1219 00:59:49,150 --> 00:59:53,499 1220 00:59:53,499 --> 00:59:56,710 1221 00:59:56,710 --> 00:59:58,809 1222 00:59:58,809 --> 01:00:03,249 1223 01:00:03,249 --> 01:00:05,319 1224 01:00:05,319 --> 01:00:09,160 1225 01:00:09,160 --> 01:00:12,900 1226 01:00:12,900 --> 01:00:16,839 1227 01:00:16,839 --> 01:00:18,670 1228 01:00:18,670 --> 01:00:21,609 1229 01:00:21,609 --> 01:00:24,309 1230 01:00:24,309 --> 01:00:26,710 1231 01:00:26,710 --> 01:00:31,120 1232 01:00:31,120 --> 01:00:33,539 1233 01:00:33,539 --> 01:00:37,239 1234 01:00:37,239 --> 01:00:40,900 1235 01:00:40,900 --> 01:00:43,839 1236 01:00:43,839 --> 01:00:45,309 1237 01:00:45,309 --> 01:00:47,529 1238 
01:00:47,529 --> 01:00:50,019 1239 01:00:50,019 --> 01:00:56,300 1240 01:00:56,300 --> 01:00:59,990 1241 01:00:59,990 --> 01:01:01,520 1242 01:01:01,520 --> 01:01:03,590 1243 01:01:03,590 --> 01:01:06,530 1244 01:01:06,530 --> 01:01:12,980 1245 01:01:12,980 --> 01:01:16,250 1246 01:01:16,250 --> 01:01:18,800 1247 01:01:18,800 --> 01:01:24,170 1248 01:01:24,170 --> 01:01:26,240 1249 01:01:26,240 --> 01:01:27,650 1250 01:01:27,650 --> 01:01:29,690 1251 01:01:29,690 --> 01:01:33,020 1252 01:01:33,020 --> 01:01:37,250 1253 01:01:37,250 --> 01:01:40,130 1254 01:01:40,130 --> 01:01:42,140 1255 01:01:42,140 --> 01:01:43,730 1256 01:01:43,730 --> 01:01:45,890 1257 01:01:45,890 --> 01:01:48,560 1258 01:01:48,560 --> 01:01:52,610 1259 01:01:52,610 --> 01:01:56,150 1260 01:01:56,150 --> 01:02:03,350 1261 01:02:03,350 --> 01:02:06,170 1262 01:02:06,170 --> 01:02:08,840 1263 01:02:08,840 --> 01:02:10,970 1264 01:02:10,970 --> 01:02:12,380 1265 01:02:12,380 --> 01:02:14,780 1266 01:02:14,780 --> 01:02:17,440 1267 01:02:17,440 --> 01:02:20,090 1268 01:02:20,090 --> 01:02:21,830 1269 01:02:21,830 --> 01:02:25,250 1270 01:02:25,250 --> 01:02:27,500 1271 01:02:27,500 --> 01:02:30,290 1272 01:02:30,290 --> 01:02:34,190 1273 01:02:34,190 --> 01:02:36,100 1274 01:02:36,100 --> 01:02:38,990 1275 01:02:38,990 --> 01:02:40,430 1276 01:02:40,430 --> 01:02:50,120 1277 01:02:50,120 --> 01:02:53,910 1278 01:02:53,910 --> 01:02:55,710 1279 01:02:55,710 --> 01:02:58,260 1280 01:02:58,260 --> 01:03:01,800 1281 01:03:01,800 --> 01:03:05,670 1282 01:03:05,670 --> 01:03:08,880 1283 01:03:08,880 --> 01:03:10,650 1284 01:03:10,650 --> 01:03:12,720 1285 01:03:12,720 --> 01:03:14,820 1286 01:03:14,820 --> 01:03:17,160 1287 01:03:17,160 --> 01:03:19,140 1288 01:03:19,140 --> 01:03:20,640 1289 01:03:20,640 --> 01:03:22,230 1290 01:03:22,230 --> 01:03:25,250 1291 01:03:25,250 --> 01:03:28,110 1292 01:03:28,110 --> 01:03:31,680 1293 01:03:31,680 --> 01:03:34,830 1294 01:03:34,830 --> 01:03:37,280 1295 01:03:37,280 --> 01:03:39,660 1296 01:03:39,660 --> 01:03:42,930 1297 01:03:42,930 --> 01:03:46,650 1298 01:03:46,650 --> 01:03:49,170 1299 01:03:49,170 --> 01:03:52,230 1300 01:03:52,230 --> 01:03:55,640 1301 01:03:55,640 --> 01:03:58,470 1302 01:03:58,470 --> 01:04:00,450 1303 01:04:00,450 --> 01:04:03,840 1304 01:04:03,840 --> 01:04:05,730 1305 01:04:05,730 --> 01:04:10,110 1306 01:04:10,110 --> 01:04:12,180 1307 01:04:12,180 --> 01:04:14,370 1308 01:04:14,370 --> 01:04:16,380 1309 01:04:16,380 --> 01:04:20,210 1310 01:04:20,210 --> 01:04:27,640 1311 01:04:27,640 --> 01:04:30,770 1312 01:04:30,770 --> 01:04:36,290 1313 01:04:36,290 --> 01:04:39,380 1314 01:04:39,380 --> 01:04:42,560 1315 01:04:42,560 --> 01:04:44,420 1316 01:04:44,420 --> 01:04:45,700 1317 01:04:45,700 --> 01:04:48,589 1318 01:04:48,589 --> 01:04:51,380 1319 01:04:51,380 --> 01:04:53,750 1320 01:04:53,750 --> 01:04:56,540 1321 01:04:56,540 --> 01:04:58,400 1322 01:04:58,400 --> 01:05:01,250 1323 01:05:01,250 --> 01:05:04,370 1324 01:05:04,370 --> 01:05:07,040 1325 01:05:07,040 --> 01:05:10,700 1326 01:05:10,700 --> 01:05:12,380 1327 01:05:12,380 --> 01:05:15,650 1328 01:05:15,650 --> 01:05:19,849 1329 01:05:19,849 --> 01:05:22,280 1330 01:05:22,280 --> 01:05:24,230 1331 01:05:24,230 --> 01:05:26,900 1332 01:05:26,900 --> 01:05:28,970 1333 01:05:28,970 --> 01:05:32,920 1334 01:05:32,920 --> 01:05:37,670 1335 01:05:37,670 --> 01:05:39,560 1336 01:05:39,560 --> 01:05:43,579 1337 01:05:43,579 --> 01:05:44,990 1338 01:05:44,990 --> 01:05:47,150 1339 01:05:47,150 --> 
01:05:50,089 1340 01:05:50,089 --> 01:05:59,150 1341 01:05:59,150 --> 01:06:03,200 1342 01:06:03,200 --> 01:06:06,020 1343 01:06:06,020 --> 01:06:08,360 1344 01:06:08,360 --> 01:06:11,240 1345 01:06:11,240 --> 01:06:12,740 1346 01:06:12,740 --> 01:06:14,360 1347 01:06:14,360 --> 01:06:18,160 1348 01:06:18,160 --> 01:06:22,160 1349 01:06:22,160 --> 01:06:24,710 1350 01:06:24,710 --> 01:06:27,620 1351 01:06:27,620 --> 01:06:29,810 1352 01:06:29,810 --> 01:06:33,230 1353 01:06:33,230 --> 01:06:35,120 1354 01:06:35,120 --> 01:06:37,810 1355 01:06:37,810 --> 01:06:43,370 1356 01:06:43,370 --> 01:06:45,440 1357 01:06:45,440 --> 01:06:47,330 1358 01:06:47,330 --> 01:06:49,490 1359 01:06:49,490 --> 01:06:52,640 1360 01:06:52,640 --> 01:06:54,080 1361 01:06:54,080 --> 01:06:57,650 1362 01:06:57,650 --> 01:06:59,690 1363 01:06:59,690 --> 01:07:03,550 1364 01:07:03,550 --> 01:07:06,470 1365 01:07:06,470 --> 01:07:09,290 1366 01:07:09,290 --> 01:07:12,080 1367 01:07:12,080 --> 01:07:14,030 1368 01:07:14,030 --> 01:07:16,460 1369 01:07:16,460 --> 01:07:17,960 1370 01:07:17,960 --> 01:07:20,810 1371 01:07:20,810 --> 01:07:27,260 1372 01:07:27,260 --> 01:07:32,270 1373 01:07:32,270 --> 01:07:34,250 1374 01:07:34,250 --> 01:07:35,990 1375 01:07:35,990 --> 01:07:37,820 1376 01:07:37,820 --> 01:07:41,230 1377 01:07:41,230 --> 01:07:48,470 1378 01:07:48,470 --> 01:07:50,690 1379 01:07:50,690 --> 01:07:54,140 1380 01:07:54,140 --> 01:07:56,300 1381 01:07:56,300 --> 01:07:58,970 1382 01:07:58,970 --> 01:08:01,640 1383 01:08:01,640 --> 01:08:04,280 1384 01:08:04,280 --> 01:08:07,190 1385 01:08:07,190 --> 01:08:09,020 1386 01:08:09,020 --> 01:08:10,730 1387 01:08:10,730 --> 01:08:11,940 1388 01:08:11,940 --> 01:08:14,730 1389 01:08:14,730 --> 01:08:16,499 1390 01:08:16,499 --> 01:08:18,300 1391 01:08:18,300 --> 01:08:20,519 1392 01:08:20,519 --> 01:08:23,090 1393 01:08:23,090 --> 01:08:25,530 1394 01:08:25,530 --> 01:08:27,720 1395 01:08:27,720 --> 01:08:29,579 1396 01:08:29,579 --> 01:08:31,620 1397 01:08:31,620 --> 01:08:35,430 1398 01:08:35,430 --> 01:08:39,419 1399 01:08:39,419 --> 01:08:41,729 1400 01:08:41,729 --> 01:08:43,979 1401 01:08:43,979 --> 01:08:46,919 1402 01:08:46,919 --> 01:08:50,419 1403 01:08:50,419 --> 01:08:54,599 1404 01:08:54,599 --> 01:08:58,260 1405 01:08:58,260 --> 01:09:00,599 1406 01:09:00,599 --> 01:09:04,289 1407 01:09:04,289 --> 01:09:07,640 1408 01:09:07,640 --> 01:09:10,430 1409 01:09:10,430 --> 01:09:14,459 1410 01:09:14,459 --> 01:09:15,899 1411 01:09:15,899 --> 01:09:17,999 1412 01:09:17,999 --> 01:09:20,550 1413 01:09:20,550 --> 01:09:23,039 1414 01:09:23,039 --> 01:09:26,430 1415 01:09:26,430 --> 01:09:32,979 1416 01:09:32,979 --> 01:09:36,110 1417 01:09:36,110 --> 01:09:38,240 1418 01:09:38,240 --> 01:09:41,300 1419 01:09:41,300 --> 01:09:43,910 1420 01:09:43,910 --> 01:09:46,129 1421 01:09:46,129 --> 01:09:48,970 1422 01:09:48,970 --> 01:09:52,120 1423 01:09:52,120 --> 01:09:55,310 1424 01:09:55,310 --> 01:09:58,910 1425 01:09:58,910 --> 01:10:01,189 1426 01:10:01,189 --> 01:10:03,709 1427 01:10:03,709 --> 01:10:08,149 1428 01:10:08,149 --> 01:10:10,490 1429 01:10:10,490 --> 01:10:14,030 1430 01:10:14,030 --> 01:10:17,390 1431 01:10:17,390 --> 01:10:19,520 1432 01:10:19,520 --> 01:10:22,820 1433 01:10:22,820 --> 01:10:24,890 1434 01:10:24,890 --> 01:10:27,620 1435 01:10:27,620 --> 01:10:29,600 1436 01:10:29,600 --> 01:10:32,959 1437 01:10:32,959 --> 01:10:35,689 1438 01:10:35,689 --> 01:10:41,629 1439 01:10:41,629 --> 01:10:43,550 1440 01:10:43,550 --> 01:10:46,310 1441 
01:10:46,310 --> 01:10:49,250 1442 01:10:49,250 --> 01:10:51,500 1443 01:10:51,500 --> 01:10:54,770 1444 01:10:54,770 --> 01:11:00,050 1445 01:11:00,050 --> 01:11:03,560 1446 01:11:03,560 --> 01:11:06,680 1447 01:11:06,680 --> 01:11:09,439 1448 01:11:09,439 --> 01:11:12,680 1449 01:11:12,680 --> 01:11:15,470 1450 01:11:15,470 --> 01:11:19,310 1451 01:11:19,310 --> 01:11:21,979 1452 01:11:21,979 --> 01:11:23,899 1453 01:11:23,899 --> 01:11:28,520 1454 01:11:28,520 --> 01:11:31,370 1455 01:11:31,370 --> 01:11:33,140 1456 01:11:33,140 --> 01:11:35,479 1457 01:11:35,479 --> 01:11:38,870 1458 01:11:38,870 --> 01:11:41,600 1459 01:11:41,600 --> 01:11:44,390 1460 01:11:44,390 --> 01:11:46,360 1461 01:11:46,360 --> 01:11:50,720 1462 01:11:50,720 --> 01:11:52,670 1463 01:11:52,670 --> 01:11:54,400 1464 01:11:54,400 --> 01:11:56,450 1465 01:11:56,450 --> 01:11:59,600 1466 01:11:59,600 --> 01:12:02,060 1467 01:12:02,060 --> 01:12:05,240 1468 01:12:05,240 --> 01:12:07,070 1469 01:12:07,070 --> 01:12:11,900 1470 01:12:11,900 --> 01:12:14,120 1471 01:12:14,120 --> 01:12:16,640 1472 01:12:16,640 --> 01:12:18,380 1473 01:12:18,380 --> 01:12:20,000 1474 01:12:20,000 --> 01:12:23,150 1475 01:12:23,150 --> 01:12:25,280 1476 01:12:25,280 --> 01:12:29,180 1477 01:12:29,180 --> 01:12:31,610 1478 01:12:31,610 --> 01:12:33,830 1479 01:12:33,830 --> 01:12:37,340 1480 01:12:37,340 --> 01:12:40,640 1481 01:12:40,640 --> 01:12:43,220 1482 01:12:43,220 --> 01:12:45,530 1483 01:12:45,530 --> 01:12:48,890 1484 01:12:48,890 --> 01:12:51,110 1485 01:12:51,110 --> 01:12:53,150 1486 01:12:53,150 --> 01:12:56,000 1487 01:12:56,000 --> 01:12:58,610 1488 01:12:58,610 --> 01:13:01,280 1489 01:13:01,280 --> 01:13:03,590 1490 01:13:03,590 --> 01:13:09,830 1491 01:13:09,830 --> 01:13:11,920 1492 01:13:11,920 --> 01:13:14,930 1493 01:13:14,930 --> 01:13:17,810 1494 01:13:17,810 --> 01:13:18,890 1495 01:13:18,890 --> 01:13:22,250 1496 01:13:22,250 --> 01:13:26,000 1497 01:13:26,000 --> 01:13:28,130 1498 01:13:28,130 --> 01:13:30,590 1499 01:13:30,590 --> 01:13:30,600 1500 01:13:30,600 --> 01:13:31,430 1501 01:13:31,430 --> 01:13:33,080 1502 01:13:33,080 --> 01:13:34,310 1503 01:13:34,310 --> 01:13:36,230 1504 01:13:36,230 --> 01:13:38,330 1505 01:13:38,330 --> 01:13:41,300 1506 01:13:41,300 --> 01:13:42,740 1507 01:13:42,740 --> 01:13:48,640 1508 01:13:48,640 --> 01:13:51,650 1509 01:13:51,650 --> 01:13:53,680 1510 01:13:53,680 --> 01:13:56,360 1511 01:13:56,360 --> 01:13:59,030 1512 01:13:59,030 --> 01:14:00,800 1513 01:14:00,800 --> 01:14:02,600 1514 01:14:02,600 --> 01:14:04,220 1515 01:14:04,220 --> 01:14:09,800 1516 01:14:09,800 --> 01:14:14,510 1517 01:14:14,510 --> 01:14:16,880 1518 01:14:16,880 --> 01:14:18,830 1519 01:14:18,830 --> 01:14:20,930 1520 01:14:20,930 --> 01:14:34,060 1521 01:14:34,060 --> 01:14:34,070 1522 01:14:34,070 --> 01:14:36,130