The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

Good afternoon, everyone. Today we're going to talk about storage allocation. This is a continuation from last lecture, where we talked about serial storage allocation, so I'll also talk a little bit more about serial allocation, but then I'll talk more about parallel allocation and also garbage collection.

I want to start with a review of some memory allocation primitives. Recall that you can use malloc to allocate memory from the heap. If you call malloc with a size of s, it's going to allocate and return a pointer to a block of memory containing at least s bytes. You might actually get more than s bytes, even though you asked for s bytes, but it's guaranteed to give you at least s bytes. The return value is a void*, but good programming practice is to typecast this pointer to whatever type you're using the memory for when you receive it from the malloc call.

There's also aligned allocation. You can do an aligned allocation with memalign, which takes two arguments, a size a as well as a size s, where a has to be an exact power of two. It's going to allocate and return a pointer to a block of memory, again containing at least s bytes, but this time the memory is going to be aligned to a multiple of a, so the address where this memory block starts is going to be a multiple of a. Does anyone know why we might want to do an aligned memory allocation? One reason is that you can align memory to cache lines, so that when you access an object that fits within a cache line, it's not going to cross two cache lines, and you only need one cache access instead of two. So one reason is to align the memory to cache lines to reduce the number of cache misses you get. Another reason is that the vectorization operations also require memory addresses that are aligned to some power of two, so if you align your memory allocation with memalign, that's also good for the vector units.

We also talked about deallocation. You can free memory back to the heap with the free function: if you pass it a pointer p to some block of memory, it's going to deallocate the block and return it to the storage allocator. We also talked about some anomalies of freeing. What is it called when you fail to free some memory that you allocated? Right, if you fail to free something that you allocated, that's called a memory leak, and this can cause your program to use more and more memory; eventually your program is going to use up all the memory on your machine, and it's going to crash. We also talked about freeing something more than once. Does anyone remember what that's called? Yes, that's called double freeing. Double freeing is when you free something more than once, and the behavior is undefined. You might get a segfault immediately, or you'll free something that was allocated for some other purpose, and then later down the road your program is going to have some unexpected behavior.
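Here's a minimal sketch of these primitives in C. The sizes are arbitrary, and memalign (declared in malloc.h on glibc systems) is used just because it's the call named above; on other systems you might use the C11 aligned_alloc or posix_memalign instead.

```c
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   // for memalign on glibc; aligned_alloc is the C11 alternative

int main(void) {
    // Allocate at least 1000 doubles from the heap and typecast the void*.
    double *x = (double *) malloc(1000 * sizeof(double));
    if (x == NULL) return 1;

    // Aligned allocation: the returned address is a multiple of 64, so an
    // object that fits in one 64-byte cache line won't straddle two lines.
    double *y = (double *) memalign(64, 1000 * sizeof(double));
    if (y == NULL) { free(x); return 1; }

    x[0] = 3.14;
    y[0] = 2.71;
    printf("%p %p\n", (void *) x, (void *) y);

    // Forgetting these frees would be a memory leak; calling free twice on
    // the same pointer would be a double free, with undefined behavior.
    free(x);
    free(y);
    return 0;
}
```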
OK, I also want to talk about mmap. mmap is a system call, and usually mmap is used to treat some file on disk as part of memory, so that when you write to that memory region, it also gets backed up on disk. In the context here, I'm actually using mmap to allocate virtual memory without having any backing file. mmap has a whole bunch of parameters. The second-to-last parameter indicates the file I want to map in; by passing a negative one, I'm saying there's no backing file, so I'm just using this to allocate some virtual memory. The first argument is where I want to allocate it, and zero means that I don't care. The size, in number of bytes, is how much memory I want to allocate. Then there are also permissions; here it says I can read and write this memory region. MAP_PRIVATE means that this memory region is private to the process that's allocating it, and MAP_ANON means the mapping is anonymous, that is, there's no file associated with this memory region. Then, as I said, negative one means there's no backing file, and the last parameter is just 0 when there's no backing file; normally it would be an offset into the file that you're trying to map.

What mmap does is find a contiguous unused region in the address space of the application that's large enough to hold size bytes, then update the page table so that it now contains entries for the pages you allocated, and then create the necessary virtual memory management structures within the operating system so that the user's accesses to this area are legal and won't result in a segfault. If you try to access some region of memory without having the OS set up these structures, you might get a segfault, because the program might not have permission to access that area, but mmap is going to make sure that the user can access this area of virtual memory. mmap is a system call, whereas malloc, which we talked about last time, is a library call; these are two different things, and malloc actually uses mmap under the hood to get more memory from the operating system.

So let's look at some properties of mmap. mmap is lazy: when you request a certain amount of memory, it doesn't immediately allocate physical memory for the requested allocation. Instead, it just populates the page table with entries pointing to a special zero page, and it marks these pages as read-only. The first time you write to such a page, it causes a page fault, and at that point the OS modifies the page table, gets the appropriate physical memory, stores the mapping from the virtual address space to the physical address space for the particular page that you touched, and then restarts the instruction so that it can continue to execute. It turns out that you can actually mmap a terabyte of virtual memory even on a machine with just a gigabyte of DRAM, because when you call mmap, it doesn't actually allocate the physical memory. But then you should be careful, because a process might die from running out of physical memory well after you call mmap, since mmap only allocates the physical memory when you first touch each page, and this could be much later than when you actually made the call to mmap. Any questions so far?
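Here's what the anonymous mapping described above might look like in C. This is a sketch: the one-gigabyte size is arbitrary, and on Linux the flag is usually spelled MAP_ANONYMOUS (MAP_ANON is the older synonym).

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t size = 1UL << 30;  // 1 GiB of virtual memory, illustrative only

    // addr = NULL: we don't care where it goes; PROT_READ|PROT_WRITE: readable
    // and writable; MAP_PRIVATE|MAP_ANONYMOUS: private to this process, no
    // backing file; fd = -1 and offset = 0 because there is no file.
    char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    // Physical pages are only assigned lazily, when first touched.
    p[0] = 'x';

    munmap(p, size);
    return 0;
}
```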
So what's the difference between malloc and mmap? As I said, malloc is a library call. malloc and free are part of the memory allocation interface of the heap-management code in the C library, and the heap-management code uses the available system facilities, including the mmap function, to get virtual address space from the operating system. The heap-management code within malloc then attempts to satisfy user requests for heap storage by reusing the memory that it got from the OS as much as possible, until it can't do that anymore, and then it calls mmap to get more memory from the operating system. So the malloc implementation invokes mmap and other system calls to expand the size of the user's heap storage. The responsibility of malloc is to reuse memory so that fragmentation is reduced and you have good temporal locality, whereas the responsibility of mmap is actually getting memory from the operating system. Any questions on the differences between malloc and mmap?

One question is: why don't we just call mmap all the time, instead of using malloc? The answer is that you might have free storage from before that you would want to reuse, and mmap is relatively heavyweight. It works on a page granularity, so if you want to do a small allocation, it's quite wasteful to allocate an entire page for that allocation and not reuse it; you'll get very bad external fragmentation. And when you call mmap, it has to go through all of the overhead of the OS security checks, updating the page table, and so on, whereas malloc is actually pretty fast for most allocations, especially if you have temporal locality where you allocate something that you just recently freed. So your program would be pretty slow if you used mmap all the time, even for small allocations. For big allocations it's fine, but for small allocations you should use malloc. Any questions on mmap versus malloc?

OK, I just want to do a little bit of review of how address translation works; some of you might have seen this before in a computer architecture course. When you access a memory location, you access it via its virtual address, and the virtual address can be divided into two parts, where the lower-order bits store the offset and the higher-order bits store the virtual page number. To get the physical address associated with this virtual address, the hardware looks up the virtual page number in what's called the page table, and if it finds the corresponding entry for the virtual page number, that tells us the physical frame number, which corresponds to where the physical memory is in DRAM. You can then take the frame number and use the same offset as before to get the appropriate offset into the physical memory frame. If the virtual page that you're looking for doesn't reside in physical memory, a page fault occurs. When a page fault occurs, either the operating system sees that the process actually has permission to access that memory region, and it sets the permissions and places the entry into the page table so that you can get the appropriate physical address, or the operating system sees that the process can't access that region of memory, and you get a segmentation fault. It turns out that the page table search, also called a page walk, is pretty expensive, and that's why we have the translation lookaside buffer, or TLB, which is essentially a cache for the page table. The hardware caches recent page table lookups in the TLB, so that later, when you access the same page, it doesn't have to go all the way to the page table to find the physical address; it can first look in the TLB.
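As a concrete illustration of the virtual-address split, here's a small C snippet that pulls out the page offset and the virtual page number. The 4-kilobyte page size (a 12-bit offset) and the example address are assumptions for the sake of the example.

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_BITS 12                        // assume 4 KiB pages: 2^12 bytes
#define PAGE_SIZE (1UL << PAGE_BITS)

int main(void) {
    uintptr_t vaddr = 0x7f3a12345678UL;     // an example virtual address

    uintptr_t offset = vaddr & (PAGE_SIZE - 1);  // low-order bits: offset within the page
    uintptr_t vpn    = vaddr >> PAGE_BITS;       // high-order bits: virtual page number

    // The page table (or TLB) maps vpn -> physical frame number; the physical
    // address is then (frame_number << PAGE_BITS) | offset.
    printf("vpn = %#lx, offset = %#lx\n", (unsigned long) vpn, (unsigned long) offset);
    return 0;
}
```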
So why would you expect something you've recently accessed to be in the TLB? What's one property of a program that will give you a lot of TLB hits? Right: the page table stores pages, which are typically four kilobytes nowadays (there are also huge pages, which can be a couple of megabytes), and most of the accesses in your program are going to be near each other, so accesses that happen close together in time are likely to reside on the same page. Therefore you'd expect many of your recent accesses to be covered by entries in the TLB, as long as your program has locality, either spatial or temporal locality, or both.

How this architecture works is that the processor first checks whether the virtual address you're looking for is in the TLB. If it's not, it goes to the page table and looks it up, and if it finds it there, it stores that entry in the TLB. Next, it takes the physical address it got from the TLB and looks it up in the CPU cache; if it finds it there, it gets it, and if it doesn't, it goes to DRAM to satisfy the request. Most modern machines actually have an optimization that lets you do the TLB access in parallel with the L1 cache access: the L1 cache uses virtual addresses instead of physical addresses, and this reduces the latency of a memory access. So that's a brief review of address translation.

All right, let's talk about stacks. When you execute a serial C or C++ program, you're using a stack to keep track of the function calls and local variables that you have to save. Say we have this invocation tree, where a function A calls function B, which then returns, and then A calls function C, which calls D, returns, calls E, returns, and then returns again. Here are the different views of the stack at different points of the execution. Initially, when we call A, we have a stack frame for A. Then when A calls B, we place a stack frame for B right below the stack frame of A, so these are linearly ordered. When we're done with B, that part of the stack is no longer used for B, and when A calls C, C's stack frame is allocated right below A on the stack. This space is actually the same space that B was using before, but that's fine, because we're already done with the call to B. When C calls D, we create a stack frame for D right below C; when D returns, we're not going to use that space anymore, so we can reuse it for the stack frame when we call E. Eventually all of these pop back off. All of these views share the same view of the stack frame for A, and C, D, and E all share the same view of the stack frame for C. This is how a traditional linear stack works when you run a serial C or C++ program, and you can view it as a serial walk over the invocation tree.

There's one rule for pointers with traditional linear stacks: a parent can pass pointers to its stack variables down to its children, but not the other way around; a child can't pass a pointer to one of its local variables back up to its parent. If you do that, you'll get a bug in your program. How many of you have tried doing that before? Yeah, a lot of you. So let's see why that causes a problem. If A calls B, and B passes a pointer to some local variable on B's stack up to A, then when A calls C, C is going to overwrite the space that B was using, and if B's local variable was stored in the space that C has now overwritten, then you're just going to see garbage, and when you try to access that variable, you're not going to get the correct value.
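Here's a minimal sketch of the bug just described, a child handing a pointer to its own stack variable back up to its parent. The function names are made up for illustration.

```c
#include <stdio.h>

// BUG: b() returns the address of a local variable.  Once b() returns, its
// stack frame can be reused by the next call, so the pointer dangles.
int *b(void) {
    int local = 42;
    return &local;            // compilers typically warn about this
}

void c(void) {
    int scratch[16];          // c()'s frame may occupy the space b() just used
    for (int i = 0; i < 16; i++) scratch[i] = i;
}

int main(void) {
    int *p = b();             // p points into b()'s dead stack frame
    c();                      // c() likely overwrites that space
    printf("%d\n", *p);       // undefined behavior: garbage, not 42
    return 0;
}
```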
So you can pass a pointer to a local variable down to any of your descendant function calls, because they all see the same view of the stack, and that space isn't going to be overwritten while the descendant calls are proceeding. But if you pass it the other way, the variable that you had a pointer to can be overwritten. So here's a question: if you want to pass memory from a child back to the parent, where would you allocate it? You could allocate it in the parent; what's another option? Another way is to allocate it on the heap. If you allocate it on the heap, then even after you return from the function call, that memory persists. You can also allocate it on the parent's stack if you want; in fact, some programs are written that way. One of the reasons why many C functions require you to pass in memory where the function will store its return value is to avoid an expensive heap allocation in the child: if the parent allocates the space to store the result, the child can just put whatever it computes into that space, and the parent will see it. Then the responsibility is up to the parent to decide whether it wants to allocate that memory on the stack or on the heap. This is one of the reasons why you'll see many C functions where one of the arguments is a memory location where the result should be stored.
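Here's a sketch of that caller-provided-buffer convention. The function and its size parameter are hypothetical; the point is just that the parent chooses where the result lives, on its stack or on the heap, and passes the pointer down.

```c
#include <stdio.h>
#include <stdlib.h>

// The child writes its result into memory supplied by the caller, so it never
// needs to return a pointer to its own stack or do a heap allocation itself.
void make_greeting(const char *name, char *out, size_t out_len) {
    snprintf(out, out_len, "hello, %s", name);
}

int main(void) {
    char on_stack[64];                         // parent allocates on its stack
    make_greeting("world", on_stack, sizeof(on_stack));

    char *on_heap = malloc(64);                // or the parent allocates on the heap
    if (on_heap == NULL) return 1;
    make_greeting("heap", on_heap, 64);

    printf("%s / %s\n", on_stack, on_heap);
    free(on_heap);
    return 0;
}
```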
OK, so that was the serial case. What happens in parallel? In parallel, we have what's called a cactus stack, which supports multiple views of the stack in parallel. Say we have a program that calls function A, and then A spawns B and C, so B and C are potentially running in parallel, and C spawns D and E, which can also potentially run in parallel. For this program, we could have functions B, D, and E all executing in parallel, and a cactus stack allows all of these functions to see the same view of the stack as they would have if the program were executed serially. The Cilk runtime system supports a cactus stack to make it easy to write parallel programs: you just have to obey the same rules as for programming in serial C and C++ with regard to the stack, and you'll still get the intended behavior. And there's no copying of stacks here; all of these different views see the same virtual memory addresses for A.

But now there's an issue of how to implement this cactus stack, because in the serial case we could have the later stack frames overwriting the earlier ones, but in parallel, how can we do this? Does anyone have any simple ideas for how to implement a cactus stack? One suggestion: have each thread use a separate stack, and store pointers to the different stack frames across the different stacks. There's another way to do this that's easier. Another suggestion: if the stacks all have a maximum depth, you could allocate a whole bunch of stacks separated by that maximum depth. There's actually another way, which is to not use the stack at all: the easiest way is just to allocate the frames off the heap. Instead of allocating the frames on the stack, you do a heap allocation for each stack frame, and each stack frame has a pointer to its parent's stack frame. Whenever you do a function call, you do a memory allocation from the heap to get a new stack frame, and when you return from a function, you pop the frame off and free it back to the heap. In fact, a lot of early systems for parallel programming used this strategy of heap-based cactus stacks, and it turns out you can minimize the performance impact of this strategy if you optimize the code enough.

But there's actually a bigger problem with using a heap-based cactus stack, and it doesn't have to do with performance. Does anybody have any guesses about what this potential issue is? Let's assume we can do parallel heap allocation correctly; we'll talk about that later. So assuming we can do that, what's the issue with this approach? Let's also assume you can get whatever stack frames you need from the heap, so you don't actually need to know the maximum depth if you're allocating off the heap. Any other guesses? The problem is: what happens if I try to run a program that uses this heap-based cactus stack together with something that uses the regular stack? Say I have some old legacy code that was already compiled to use the traditional linear stack. There's a problem with interoperability, because the traditional code assumes that when you make a function call, the stack frame for the function being called appears right after the stack frame of the calling function. So if you try to mix code that uses the traditional stack with this heap-based cactus stack approach, it's not going to work well together. One approach is to just recompile all your code to use the heap-based cactus stack, but even if you could do that, even if all the source code were available, there are some legacy programs that actually do manipulations of the stack in the source code, because they assume you're using the traditional stack, and those programs would no longer work with a heap-based cactus stack. So the problem is interoperability with legacy code. It turns out you can fix this using an approach called thread-local memory mapping; one of the students mentioned memory mapping, but that approach requires changes to the operating system, so it's not general purpose. The heap-based cactus stack turns out to be very simple, and we can prove nice bounds about it, so besides the interoperability issue, heap-based cactus stacks are pretty good in practice as well as in theory.
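Here's a rough sketch of what a heap-based cactus stack frame might look like. This is not how the Cilk runtime actually represents frames; the struct and its fields are made up. It just illustrates heap-allocating each frame with a pointer to its parent, so that several children can share the same view of an ancestor.

```c
#include <stdlib.h>

// Hypothetical frame layout: each "call" heap-allocates a frame that points
// back to its parent, so several spawned children can share the parent view.
typedef struct frame {
    struct frame *parent;    // the caller's frame
    int locals[8];           // this function's local variables would live here
} frame;

frame *push_frame(frame *parent) {
    frame *f = malloc(sizeof(frame));   // "function call": allocate a new frame
    if (f) f->parent = parent;
    return f;
}

void pop_frame(frame *f) {
    free(f);                            // "function return": free the frame
}

int main(void) {
    frame *a = push_frame(NULL);        // A is the root of the invocation tree
    frame *b = push_frame(a);           // A spawns B ...
    frame *c = push_frame(a);           // ... and C; both point back to A's frame
    pop_frame(b);
    pop_frame(c);
    pop_frame(a);
    return 0;
}
```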
With a heap-based cactus stack, we can actually prove a space bound for a Cilk program. Let S1 be the stack space required by a serial execution of the Cilk program; then the stack space SP of a P-worker execution using a heap-based cactus stack is upper bounded by P times S1, that is, SP <= P * S1. To understand how this works, we need to understand a little bit about how the Cilk work-stealing algorithm works. In the Cilk work-stealing algorithm, a worker that spawns a new task immediately works on the task it spawned. Therefore, for any leaf in the invocation tree that currently exists, there's always a worker working on it; there are no leaves in the tree with no worker on them, because when a worker spawns a task, it creates a new leaf, but then it works immediately on that leaf. So in this invocation tree, every leaf has a processor working on it, and with this busy-leaves property we can easily show the space bound: for each one of these processors, the maximum stack space it uses is upper bounded by S1, because that's the maximum stack space across a serial execution of the whole program, and since we have P of these leaves, we just multiply S1 by P, which gives us an upper bound on the overall space used by a P-worker execution. This can be a loose upper bound, because we're double counting: some of the memory is shared among the different processors and is counted more than once, which is why we have the less-than-or-equal-to. So this is one of the nice things about using a heap-based cactus stack: you get this good space bound. Any questions on the space bound?

Let's try to apply this theorem to a real example. This is the divide-and-conquer matrix multiplication code that we saw in a previous lecture. In this code we're making eight recursive calls to the divide-and-conquer function, each of size n/2, and before we make any of those calls, we do a malloc to get some temporary space of size order n^2, and we free this temporary space at the end. Notice that the allocations of the temporary matrix obey a stack discipline: we allocate before making the recursive calls and free right before returning from the function, so all the allocations are nested. It turns out that even if you're allocating off the heap, if you follow a stack discipline, you can still use the space bound from the previous slide to bound the P-worker space.

OK, so let's analyze this code. First, what are the work and span? This is just review: what's the work of this divide-and-conquer matrix multiply? It's order n^3: we have eight subproblems of size n/2, and we do order n^2 work to add together the matrices, so the recurrence is T1(n) = 8 T1(n/2) + order n^2, which solves to order n^3 if you pull out your master theorem card. What about the span; what's the recurrence there? The span satisfies T-infinity(n) = T-infinity(n/2) plus the span of the addition. And what's the span of the addition? Let's assume the addition is done with nested cilk_for loops: the span of one cilk_for loop is log n, and when you nest them, you just add the spans, so the recurrence is T-infinity(n) = T-infinity(n/2) + order log n, and that solves to order log^2 n; again, you can pull out your master theorem card and look at one of the three cases.
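To summarize the recurrences just stated (base cases left implicit), assuming the matrix addition is done with nested cilk_for loops:

```latex
\begin{align*}
\text{Work: } & T_1(n) = 8\,T_1(n/2) + \Theta(n^2) = \Theta(n^3) \\
\text{Span: } & T_\infty(n) = T_\infty(n/2) + \Theta(\log n) = \Theta(\log^2 n)
\end{align*}
```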
OK, now let's look at the space. What's the recurrence for the space? The only place we're generating new space is the malloc here; all the recursive calls see the same original matrices. The n^2 term is right, but do we actually need space for eight subproblems of size n/2 at once? What happens after we finish one of these subproblems; are we still using its space? No. You can actually reuse the memory, because you free the memory you allocated after each one of the recursive calls. Therefore the recurrence is just S(n) = S(n/2) + theta(n^2), and that solves to theta(n^2): the n^2 term dominates, you have a decreasing geometric series, so it's dominated at the root, and you get theta(n^2). Therefore, by the busy-leaves property and the theorem for the space bound, on P processors the space is bounded by P times n^2, and that's already pretty good, since we have a bound at all.

It turns out we can actually prove a stronger bound for this particular example, and I'll walk you through how. Order P times n^2 is already pretty good, but we can do better if we look internally at how the algorithm is structured. On each level of recursion we branch eight ways, and most of the space is used near the top of the recursion tree, so branching as much as possible near the top gives the worst-case space bound, because the space decreases geometrically as you go down the tree. So suppose I branch eight ways until I get to some level k in the recursion tree where I have P nodes, and at that point I don't branch anymore, because I've already used up all P nodes, which is the number of workers I have. What's the value of k; if I branch eight ways, how many levels until I get to P nodes? Log base 8 of P: we have 8^k = P, so with some algebra, k = log base 8 of P, which equals (log base 2 of P) / 3. From level k downwards the space decreases geometrically, and it also decreases geometrically going up from level k, so the space is dominated by level k. The space at that level is P times the temporary space per node, and each node at that level has size n / 2^((log base 2 of P)/3), which we square because we use n^2 temporary space. If you solve that, it gives you P^(1/3) times n^2, which is better than the upper bound we saw earlier of order P times n^2. You can work out the details for this example; not all the details are on this slide, and you need to show that level k actually dominates all the other levels in the recursion tree. But in general, if you know what the structure of the algorithm is, you can potentially prove a stronger space bound than you get by just applying the general theorem from the previous slide. Any questions on this? One question: how do we know the computation branches in exactly this way? You don't actually know that, but this turns out to be the worst case; if it branches in any other way, the space is just going to be lower. You have to argue that this is the worst case, and intuitively it is, because you're using most of the memory near the root of the recursion tree, so putting all P nodes as close as possible to the root makes the space as high as possible. It's a good question.
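Here's the space calculation written out. The level-k term below is the dominating one; as noted above, one still has to check that the other levels are dominated by it.

```latex
\begin{align*}
S(n) &= S(n/2) + \Theta(n^2) = \Theta(n^2)
  &&\text{(serial space: geometric sum dominated at the root)}\\
S_P  &\le P \cdot S_1 = O(P\,n^2)
  &&\text{(general busy-leaves bound)}\\
\intertext{Stronger bound: branch $8$ ways down to level $k$ with $8^k = P$, so $k = \log_8 P = \tfrac{1}{3}\log_2 P$, and}
S_P &= \Theta\!\left(P \cdot \Bigl(\frac{n}{2^{(\log_2 P)/3}}\Bigr)^{2}\right)
     = \Theta\!\left(P \cdot \frac{n^2}{P^{2/3}}\right)
     = \Theta\!\left(P^{1/3}\, n^2\right).
\end{align*}
```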
OK, so as I said before, the problem with heap-based linkage is that parallel functions fail to interoperate with legacy and third-party serial binaries. Even if you can recompile all of the code, which isn't always possible, you can still have issues if the legacy code takes advantage of the traditional linear stack inside its source code. So our implementation of Cilk uses a less space-efficient strategy that is interoperable with legacy code: it uses a pool of linear stacks instead of a heap-based strategy. We maintain a pool of linear stacks, more than P of them, and whenever a worker tries to steal something, it acquires one of the stacks from this pool, returning it when it's done. But when a worker finds that there are no more linear stacks in the pool, it doesn't steal anymore. This still preserves the space bound as long as the number of stacks is a constant times the number of processors, but it can affect the time bound of the work-stealing algorithm, because now an idle worker might not have the chance to steal if there are no more stacks lying around. This strategy doesn't require any changes to the operating system. There is a way to preserve both the space and the time bounds using thread-local memory mapping, but that does require changes to the operating system. So our implementation of Cilk uses a pool of linear stacks, and it's based on the Intel implementation.

All right, we've talked about stacks, and we just reduced the problem to heap allocation, so now we have to talk about heaps. Let's review some basic properties of heap storage allocators. Here's a definition: the allocator speed is the number of allocations and deallocations per second that the allocator can sustain. And here's a question: is it more important to maximize the allocator speed for large blocks or for small blocks? Small blocks. Why? One answer is that you're going to be doing a lot more allocations and deallocations of small blocks than of large blocks. There's actually a more fundamental reason why it's more important to optimize for small blocks. Another answer: it's more likely to lead to fragmentation if you don't optimize for small blocks. What's another reason? The main reason is this: when you allocate a block, the user program is typically going to write to all the bytes in the block, and for a large block it takes so much time to write those bytes that the allocator time has little effect on the overall running time; whereas if a program allocates many small blocks, the amount of useful work it does on each block can be comparable to the overhead of the allocation, so all the allocation overhead can add up to a significant amount. Essentially, for large blocks you can amortize away the overhead of storage allocation, whereas for small blocks it's harder to do that; therefore it's important to optimize for small blocks.
Here's another definition: the user footprint is the maximum over time of the number U of bytes in use by the user program, that is, the bytes that are allocated and not freed. This measures the peak memory usage; it's not necessarily equal to the sum of the sizes you've allocated so far, because you might have reused some of that memory. So the user footprint is the peak memory usage in bytes. The allocator footprint is the maximum over time of the number A of bytes of memory provided to the allocator by the operating system. The reason the allocator footprint can be larger than the user footprint is that when you ask the OS for memory, it can give you more than you asked for, and similarly, when you ask malloc for some amount of memory, it can also give you more than you asked for. The fragmentation is defined to be A divided by U, and a program with low fragmentation keeps this ratio as low as possible, keeping the allocator footprint as close as possible to the user footprint. In the best case, this ratio is one: you're using all of the memory that the operating system allocated to you.

One remark is that the allocator footprint A usually grows monotonically for many allocators. Many allocators mmap to get more memory, but they don't always free this memory back to the OS. You can actually free memory using munmap, which is the opposite of mmap, to give memory back to the OS, but this turns out to be pretty expensive; in modern operating systems the implementation is not very efficient, so many allocators don't use munmap. You can also use something called madvise. What madvise does is tell the operating system that you're not going to be using a page anymore, but to keep it around in virtual memory. This has less overhead, because the OS doesn't have to clear the entry from the page table; it just has to mark that the program isn't using the page right now. So some allocators use madvise with the MADV_DONTNEED option to free memory, but A usually still grows monotonically over time, because allocators don't necessarily free everything back to the OS that they allocated.
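Here's a sketch of the madvise call mentioned above, applied to an anonymous mapping. The region size is arbitrary, and how MADV_DONTNEED interacts with a particular allocator's bookkeeping is beyond this sketch.

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t size = 64 * 4096;   // a 64-page region, size chosen only for illustration

    char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    p[0] = 1;                  // touch the region so physical pages get assigned

    // Tell the OS we don't need the contents anymore.  The virtual mapping
    // stays in place (no page-table teardown as with munmap), but the OS is
    // free to reclaim the physical pages behind it.
    if (madvise(p, size, MADV_DONTNEED) != 0) { perror("madvise"); return 1; }

    munmap(p, size);
    return 0;
}
```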
Here's a theorem that we proved in last week's lecture: the fragmentation for binned free lists is order log base 2 of U, or just order log U. The reason is that you can have log base 2 of U bins, and each bin can basically contain U bytes of storage, so overall you could have allocated U log U storage while only using U of those bytes; therefore the fragmentation is order log U. Another thing to note is that modern 64-bit processors only provide about 2^48 bytes of virtual address space. This may be news; you'd probably expect a 64-bit processor to give you 2^64 bytes of virtual address space, but that turns out not to be the case. They only support 2^48 bytes, which turns out to be enough for all the programs you'd want to write, and it's also much more than the physical memory you'd have on a machine; nowadays you can get a big server with a terabyte or two of physical memory, which is still much less than the size of the virtual address space. Any questions?

OK, here are some more definitions. The space overhead of an allocator is the space used for bookkeeping; for example, you might store headers with the blocks you allocate to keep track of the size and other information, and that contributes to the space overhead. Internal fragmentation is waste due to allocating larger blocks than the user requested: you get internal fragmentation when you call malloc and get back a block that's actually larger than what you asked for. In the binned free list algorithm, we round up to the nearest power of two, so if you allocate 9 bytes, you actually get back 16 bytes, and that contributes to internal fragmentation. It turns out that not all binned free list implementations use powers of two; some use bases smaller than two in order to reduce the internal fragmentation. Then there's external fragmentation, which is waste due to the inability to use storage because it's not contiguous. For example, if I allocate a whole bunch of one-byte objects consecutively in memory, then free every other byte, and now I want to allocate a two-byte object, I don't actually have contiguous memory to satisfy that request, because all of my free bytes are in one-byte chunks that aren't next to each other. That's one example of how external fragmentation can arise as you allocate and free. Then there's blow-up, which applies to a parallel allocator: the additional space beyond what a serial allocator would require. If a serial allocator requires S space and a parallel allocator requires T space, then the blow-up is just T over S.

OK, so now let's look at some parallel heap allocation strategies. The first strategy is to use a global heap, and this is how the default C allocator works; if you just use the default C allocator out of the box, this is how it's implemented. It uses a global heap, where all accesses to the global heap are protected by a mutex; you can also use lock-free synchronization primitives to implement this, and we'll actually talk about some of these synchronization primitives later in the semester. This is done to preserve atomicity, because you can have multiple threads trying to access the global heap at the same time, and you need to ensure that races are handled correctly. So what's the blow-up for this strategy; how much more space am I using than a serial allocator? The blow-up is one, because I'm not actually using any more space than the serial allocator: I'm just maintaining one global heap, and everybody goes to that heap to do allocations and deallocations. But what's a potential issue with this approach? Right, you take a performance hit for acquiring the lock. Basically every time you do an allocation or a deallocation, you have to acquire this lock, and that's pretty slow, and it gets slower as you increase the number of processors. Roughly speaking, acquiring a lock performs similarly to an L2 cache access. If you run a serial allocator, many of your requests are satisfied just by going to the L1 cache, because you're allocating things you recently freed, and those things are residing in L1 cache; but here, before you even get started, you have to grab a lock and pay a performance hit similar to an L2 cache access. That's bad, and it gets worse as you increase the number of processors: the contention increases as you increase the number of threads, and you're not going to get good scalability.
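Here's a toy sketch of the global-heap idea: one shared free list protected by a single pthread mutex. This is not how the real C allocator is written; the structures are made up. It just shows the single lock that every thread contends on.

```c
#include <pthread.h>
#include <stdlib.h>

// One global free list of fixed-size blocks, shared by every thread.
typedef struct block { struct block *next; } block;

static block *global_free_list = NULL;
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

#define BLOCK_SIZE 64   // fixed allocation size, for illustration only

void *toy_alloc(void) {
    pthread_mutex_lock(&heap_lock);          // every allocation contends here
    block *b = global_free_list;
    if (b != NULL) {
        global_free_list = b->next;          // reuse a freed block
    }
    pthread_mutex_unlock(&heap_lock);
    return b != NULL ? (void *) b : malloc(BLOCK_SIZE);  // else get more memory
}

void toy_free(void *p) {
    block *b = (block *) p;
    pthread_mutex_lock(&heap_lock);          // ... and every deallocation, too
    b->next = global_free_list;
    global_free_list = b;
    pthread_mutex_unlock(&heap_lock);
}
```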
Ideally, as the number of threads or processors grows, the time to perform an allocation or deallocation shouldn't increase, but with a global heap it does, and the most common reason for loss of scalability is lock contention. Here, all the processors are trying to acquire the same lock, which is the same memory address, and if you recall from the caching lecture, or the multicore programming lecture, every time you acquire a memory location you have to bring that cache line into your own cache and invalidate the same cache line in the other processors' caches. If all the processors are doing this, the cache line is going to be bouncing around among all the processors' caches, and this can lead to very bad performance. Here's a question: is lock contention more of a problem for large blocks or for small blocks? Small blocks. And why? One reason is that when you're doing small allocations, your request rate is going to be pretty high, so your processors spend a lot of time acquiring this lock, which exacerbates the lock contention. Another reason is that when you allocate a large block, you're doing a lot of work, because most of the time you're going to write to all the bytes in that large block, and so you can amortize the overhead of the storage allocator across all that work. It's worse for small blocks: in addition to the increased rate of memory requests, there's much less work to amortize the overhead across. Any questions? OK, good.

All right, here's another strategy: use local heaps. Each thread maintains its own heap and allocates out of its own heap, and no locking is necessary. When you allocate something, you get it from your own heap, and when you free something, you put it back into your own heap; there's no synchronization required, so that's a good thing, it's very fast. What's a potential issue with this approach? Right, with this approach you can end up using a lot of extra space, first of all because you have to maintain multiple heaps. And what's one phenomenon you might see if you execute a program with this local-heap approach; could the space keep growing over time? Yes: you can actually get unbounded blow-up, because if you do all of the allocations on one thread and free everything on another thread, then whenever the first heap does an allocation, there's actually free space sitting around in another heap, but it just grabs more memory from the operating system. So your blow-up can be unbounded. This phenomenon is called memory drift: blocks allocated by one thread are freed by another thread, and if you run your program for long enough, your memory consumption can keep increasing. It's sort of like a memory leak: if you have a memory drift problem, your program running on multiple processors could eventually run out of memory, whereas if you just ran it on a single core, it wouldn't. And it happens because the allocator isn't smart enough to reuse things sitting in other heaps. A student asked about this, so to restate: if you do all of your allocations on one thread and all of your deallocations on another thread, then every time you allocate from the first thread there's actually memory sitting around in the system, but the first thread isn't going to see it, because it only sees its own heap, and it just keeps grabbing more memory from the OS; the second thread has this extra memory sitting around, but it's not using it, because it's only doing the frees, not the allocations. And recall that the definition of blow-up is how much more space you're using compared to a serial execution of the program: if you executed this program on a single core, you'd have a single heap doing both the allocations and the frees, so your memory wouldn't blow up, it would just stay constant over time, whereas if you use two threads to execute it, the memory can just keep growing over time.
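Here's a toy sketch of the local-heap idea, using C11 thread-local storage so each thread gets its own free list and no lock is needed on the fast path. The structures are made up. Note that because toy_free pushes onto the calling thread's list, a producer/consumer pattern across threads exhibits exactly the memory drift just described.

```c
#include <stdlib.h>
#include <threads.h>   // C11 threads header, for the thread_local macro

typedef struct block { struct block *next; } block;

#define BLOCK_SIZE 64  // fixed allocation size, for illustration only

// Each thread has its own free list: no synchronization required.
static thread_local block *my_free_list = NULL;

void *toy_alloc(void) {
    block *b = my_free_list;
    if (b != NULL) {                 // reuse something this thread freed earlier
        my_free_list = b->next;
        return b;
    }
    return malloc(BLOCK_SIZE);       // otherwise get more memory from the system
}

void toy_free(void *p) {
    // The block goes onto the *caller's* list, even if another thread
    // allocated it -- which is how memory drift arises.
    block *b = (block *) p;
    b->next = my_free_list;
    my_free_list = b;
}
```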
Another question: when a thread frees a block that some other thread allocated, how does it actually add it to its own heap? If you remember the binned free list approach, say we're using that, then all you have to do is set some pointers in your bin free list data structure, as well as in the block you're freeing, so that it appears in one of the linked lists; you can do that even if some other processor allocated that block.

OK, so what's another strategy that can avoid this issue of memory drift? One suggestion: you could periodically rebalance the memory; that's a good idea. What's a simpler approach? Another suggestion: all the processors could know about all the free memory, and every time a processor grabs something, it looks in all the other heaps; but that requires a lot of synchronization overhead and might not perform that well. What's an easier way? You could restructure your program so that the same thread does the allocations and frees for the same memory block; but what if you don't want to restructure your program? How can you change the allocator so that we get that behavior without changing the program? You could have a single free list, but that gets back to the first strategy of having a global heap, with its high synchronization overheads. Or: free the object back to the thread that allocated it. That's exactly right. In this strategy, each object, when you allocate it, is labeled with an owner, and whenever you free it, you return it to the owner. So objects that are allocated eventually go back to the owner's heap if they're not in use, rather than lying around free in somebody else's heap. The advantage of this approach is that you get fast allocation and freeing of local objects, that is, objects you allocated yourself; freeing remote objects requires some synchronization, because you have to coordinate with the other thread's heap that you're sending the memory object back to, but this synchronization isn't as bad as having a global heap, since you only have to talk to one other thread. You can also bound the blow-up by P. The reason the blow-up is upper bounded by P is that if the serial allocator uses at most X memory, then each of the heaps can use at most X memory, because that's how much the serial program would have used, and you have P of these heaps, so overall you're using at most P times X memory, and therefore the ratio is upper bounded by P. Another question: when you free an object, where does it go? If you allocated the object, it goes back to your own heap; and if your heap is empty, it actually gets more memory from the operating system rather than taking something from another thread's heap. But the maximum amount of memory that any one heap allocates is upper bounded by X, because the sequential, serial program took that much. So the upper bound for the blow-up is P.
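Here's a rough sketch of the local-ownership idea: each block carries a header naming the heap that allocated it, and free pushes the block back onto the owner's list, which does need a lock (or an atomic) since a remote thread may be pushing at the same time. The structures, the fixed number of heaps, and the 64-byte payload are all made up for illustration; call heaps_init() once before use.

```c
#include <pthread.h>
#include <stdlib.h>

#define NUM_HEAPS 8            // hypothetical fixed number of per-thread heaps

typedef struct block {
    int owner;                 // which heap allocated this block
    struct block *next;
} block;

typedef struct {
    block *free_list;
    pthread_mutex_t lock;      // protects pushes from remote frees
} local_heap;

static local_heap heaps[NUM_HEAPS];

void heaps_init(void) {
    for (int i = 0; i < NUM_HEAPS; i++) {
        heaps[i].free_list = NULL;
        pthread_mutex_init(&heaps[i].lock, NULL);
    }
}

void *owner_alloc(int my_heap) {
    local_heap *h = &heaps[my_heap];
    pthread_mutex_lock(&h->lock);
    block *b = h->free_list;
    if (b != NULL) h->free_list = b->next;
    pthread_mutex_unlock(&h->lock);
    if (b == NULL) {                       // nothing local: get more from the system
        b = malloc(sizeof(block) + 64);
        if (b == NULL) return NULL;
        b->owner = my_heap;                // label the object with its owner
    }
    return (void *) (b + 1);               // payload follows the header
}

void owner_free(void *p) {
    block *b = ((block *) p) - 1;
    local_heap *h = &heaps[b->owner];      // return it to the owner's heap,
    pthread_mutex_lock(&h->lock);          // coordinating only with that one heap
    b->next = h->free_list;
    h->free_list = b;
    pthread_mutex_unlock(&h->lock);
}
```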
Another advantage of this approach is that it has resilience to false sharing, so let me talk a little bit about false sharing. True sharing is when two processors are trying to access the same memory location; false sharing is when multiple processors are accessing different memory locations, but those locations happen to be on the same cache line. Here's an example: say we have two variables, x and y, and the compiler happens to place x and y on the same cache line. When the first processor writes to x, it brings this cache line into its cache; when the other processor writes to y, since y is on the same cache line, it brings the cache line into its own cache, invalidating the first processor's copy; then the first processor writes x again and brings the cache line back to its own cache, and you can see this phenomenon keep happening. Even though the processors are writing to different memory locations, because those locations happen to be on the same cache line, the cache line is going to be bouncing back and forth between the different processors' caches, and this problem gets worse as more processors access the cache line. This can be quite hard to debug, because if you're just using variables on the stack, you don't necessarily know where the compiler is going to place them; the compiler could just happen to place x and y in the same cache block, and then you get this performance hit even though it looks like you're accessing different memory locations. If you're using the heap for memory allocation, you have more knowledge, because if you allocate a big block, you know all the memory locations in it are contiguous, so you can space your accesses far enough apart that different processors aren't going to touch the same cache line. A more general approach is to pad the object: first you align the object on a cache line boundary, and then you pad out the remaining memory of the object so that it fills up the entire cache line, and now there's only one thing on that cache line. But this does lead to wasted space, because of the padding.

A program can induce false sharing by having different threads process nearby objects, both on the stack and on the heap, and an allocator can also induce false sharing, in two ways. It can actively induce false sharing, which is when the allocator satisfies memory requests from different threads using the same cache block; and it can do so passively, which is when the program passes objects lying on the same cache line to different threads, and the allocator then reuses the objects' storage, after the objects are freed, to satisfy requests from those different threads. The local ownership approach tends to reduce false sharing, because the thread that allocates an object is eventually going to get it back; an object isn't permanently split among multiple processors' heaps. So even if you do get false sharing with local ownership, it's usually temporary: eventually the object goes back to the heap it was allocated from, and the false sharing goes away. A question about how the local heaps are implemented: you can implement them in various ways; for example, each one could have its own bin free list allocator. There's no restriction on where the blocks have to appear in memory, and you can basically plug in any serial allocator for the local heaps.
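Here's a small sketch of the padding fix in C11: each counter is aligned to a 64-byte boundary and padded to fill the line, so two threads updating adjacent counters don't ping-pong a shared cache line. The 64-byte line size is an assumption about the machine.

```c
#include <stdalign.h>
#include <stdint.h>

// Without padding, adjacent counters in an array would likely share a cache
// line, and two threads incrementing counters[0] and counters[1] would
// falsely share it.
struct padded_counter {
    alignas(64) int64_t value;        // start each counter on its own 64-byte line
    char pad[64 - sizeof(int64_t)];   // fill the rest of the line (wasted space)
};

struct padded_counter counters[8];

void bump(int i) {
    // Threads incrementing different counters no longer invalidate each
    // other's cache lines, at the cost of the padding space.
    counters[i].value++;
}
```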
So let's go back to parallel heap allocation. I talked about three approaches already; here's a fourth approach, called the Hoard allocator. This was actually a pretty good allocator when it was introduced almost two decades ago, and it has inspired a lot of further research on how to improve parallel memory allocation. Let me talk about how it works. In the Hoard allocator, we have P local heaps, but we also have a global heap. Memory is organized into large superblocks of size S, where S is usually a multiple of the page size; this is the granularity at which storage is moved around in the allocator, and superblocks can move between the local heaps and the global heap. When a local heap has a lot of superblocks that are not being fully used, a superblock can move to the global heap; when a local heap doesn't have enough memory, it goes to the global heap to get more; and when the global heap doesn't have any more memory, it gets more memory from the operating system. So this is sort of a combination of the approaches we saw before. The advantages are that it's a pretty fast allocator, it's scalable (as you add more processors, the performance improves), you can bound the blow-up, and it has resilience to false sharing because it uses local heaps.

Let's look at how an allocation using the Hoard allocator works. Let's assume, without loss of generality, that all the blocks are the same size, so we have fixed-size allocation. Say thread i calls malloc. We check whether there's a free object in heap i that can satisfy the request, and if so, we get an object from the fullest non-full superblock in i's heap. Does anyone know why we want to get the object from the fullest non-full superblock? Right: so that when a superblock needs to be moved, it's as dense as possible, and, more importantly, to reduce external fragmentation. As we saw in the last lecture, if you skew the distribution of allocated memory objects onto as few pages, or in this case as few superblocks, as possible, that reduces your external fragmentation. So if the allocator finds a free object in its own heap, it allocates from there; otherwise, it checks the global heap. If the global heap is empty, it gets a new superblock from the OS; otherwise, it gets a superblock from the global heap. Then it sets the owner of the block it got, either from the OS or from the global heap, to i, and returns a free object to the program. That's how malloc works in the Hoard allocator.

Now let's look at deallocation. Let u_i be the in-use storage in heap i, the heap for thread i, and let a_i be the storage owned by heap i. The Hoard allocator maintains the following invariant for all heaps i: u_i >= min(a_i - 2S, a_i / 2), where S is the superblock size. Here's how it's implemented: when we call free(x), where x is owned by thread i, we put x back into heap i, and then we check whether the in-use storage u_i in heap i has dropped below min(a_i - 2S, a_i / 2). If that condition holds, it means the heap is at most half utilized, because if u_i is smaller than this, it has to be smaller than a_i / 2, which means there's more than twice as much allocated as is in use in local heap i. Therefore there must be some superblock that's at least half empty, and we move that superblock, or one of those superblocks, to the global heap. Any questions on how allocation and deallocation work?
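Here's a sketch of the bookkeeping check that a free might do to maintain this invariant. The types and the helper stubs (return_object_to_heap, find_half_empty_superblock, move_to_global_heap) are hypothetical stand-ins; the Hoard paper's real data structures are more involved.

```c
#include <stddef.h>

#define S (64 * 1024)   /* hypothetical superblock size, a multiple of the page size */

typedef struct heap {
    size_t u;           /* u_i: bytes in use in this heap              */
    size_t a;           /* a_i: bytes owned by this heap (superblocks) */
    /* ... free lists of superblocks would live here ...               */
} heap;

/* Placeholder stubs standing in for Hoard's real machinery. */
static void return_object_to_heap(heap *h, void *x, size_t sz) { (void)h; (void)x; (void)sz; }
static void *find_half_empty_superblock(heap *h) { (void)h; return NULL; }
static void move_to_global_heap(void *sb, heap *h) { (void)sb; (void)h; }

void hoard_free(heap *h, void *x, size_t sz) {
    return_object_to_heap(h, x, sz);   /* put x back into its owner's heap */
    h->u -= sz;

    /* Invariant: u_i >= min(a_i - 2S, a_i / 2).  If the heap has fallen     */
    /* below that, it is at most half utilized, so some superblock must be   */
    /* at least half empty; move one of them to the global heap.             */
    if (h->a > 2 * S) {
        size_t threshold = (h->a - 2 * S < h->a / 2) ? (h->a - 2 * S) : (h->a / 2);
        if (h->u < threshold) {
            void *sb = find_half_empty_superblock(h);
            if (sb != NULL) {
                move_to_global_heap(sb, h);
                h->a -= S;
            }
        }
    }
}
```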
Since we're maintaining this invariant, we can prove a bound on the blow-up, and I'll show you that now. Let's look at how we can bound the blow-up of the Hoard allocator. There's a lemma that we're going to use and not prove: the maximum storage allocated in the global heap is at most the maximum storage allocated in the local heaps. So we just need to analyze how much storage is allocated in the local heaps, because the total amount of storage is at most twice that, since the global heap storage is dominated by the local heap storage. You can prove this lemma by case analysis; the Hoard paper is available on Learning Modules, and you're free to look at how it's proved. Here I'm just going to use the lemma to prove this theorem: let U be the user footprint for a program, and let A be Hoard's allocator footprint; then A is upper bounded by O(U + SP), and therefore A divided by U, which is the blow-up, is 1 + O(SP / U).

Let's see how the proof works. We just analyze the storage in the local heaps. Recall that we always satisfy the invariant u_i >= min(a_i - 2S, a_i / 2). The first term says that we can have 2S unutilized storage per heap; it's basically giving two superblocks for free to each heap, which the heap can use however it wants, so the total amount of storage contributed by the first term is O(SP), because each processor can have up to 2S unutilized storage. That's where the second term of the bound comes from. The second term of the invariant, a_i / 2, gives us the first term of the bound, O(U): it says that the allocated storage is at most twice the used storage, and if you sum that up across all the processors, there's a total of O(U) storage allocated, because the allocated storage can be at most twice the used storage. So that's the proof of the blow-up bound for Hoard, and it's pretty good: it's one plus a lower-order term.
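In symbols, the argument just given is, summing the invariant over the P local heaps and using the lemma about the global heap:

```latex
\begin{align*}
&u_i \ge \min(a_i - 2S,\; a_i/2)
  \;\Longrightarrow\; a_i \le u_i + 2S \ \text{ or } \ a_i \le 2u_i
  \;\Longrightarrow\; a_i \le 2u_i + 2S \\
&A \le O\!\left(\sum_{i=1}^{P} a_i\right)
   \le O\!\left(\sum_{i=1}^{P} (2u_i + 2S)\right)
   = O(U + SP)
 \qquad\Longrightarrow\qquad
 \frac{A}{U} \le 1 + O\!\left(\frac{SP}{U}\right).
\end{align*}
```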
Nowadays there are some other allocators that people use. jemalloc is a pretty popular one. It has a few differences from Hoard: it has a separate global lock for each different allocation size, it allocates the object with the smallest address among all the objects of the requested size, and it releases empty pages using madvise, which I talked about earlier. It's pretty popular because it has good performance and it's pretty robust across different allocation traces. There's also another one called SuperMalloc, which is an up-and-coming contender, and it was developed by Bradley Kuszmaul. Here are some allocator speeds for the allocators we looked at, on a particular benchmark. For this benchmark, SuperMalloc actually does really well: it's more than three times faster than jemalloc, and jemalloc is more than twice as fast as Hoard, and then the default allocator, which uses a global heap, is pretty slow because it can't get good speedup. All of these experiments are on 32 threads. I also have the lines of code, and we can see that SuperMalloc has very few lines of code compared to the other allocators, so it's relatively simple. OK, I also have some slides on garbage collection, but since we're out of time, I'll just put these slides online and you can read them.
00:17:58,300 379 00:17:58,300 --> 00:18:00,160 380 00:18:00,160 --> 00:18:03,750 381 00:18:03,750 --> 00:18:06,640 382 00:18:06,640 --> 00:18:13,540 383 00:18:13,540 --> 00:18:15,730 384 00:18:15,730 --> 00:18:19,720 385 00:18:19,720 --> 00:18:22,270 386 00:18:22,270 --> 00:18:24,669 387 00:18:24,669 --> 00:18:26,560 388 00:18:26,560 --> 00:18:28,990 389 00:18:28,990 --> 00:18:31,870 390 00:18:31,870 --> 00:18:33,610 391 00:18:33,610 --> 00:18:36,430 392 00:18:36,430 --> 00:18:39,280 393 00:18:39,280 --> 00:18:42,460 394 00:18:42,460 --> 00:18:44,440 395 00:18:44,440 --> 00:18:46,300 396 00:18:46,300 --> 00:18:46,310 397 00:18:46,310 --> 00:18:47,710 398 00:18:47,710 --> 00:18:50,620 399 00:18:50,620 --> 00:18:52,720 400 00:18:52,720 --> 00:18:54,790 401 00:18:54,790 --> 00:18:56,410 402 00:18:56,410 --> 00:19:02,200 403 00:19:02,200 --> 00:19:04,270 404 00:19:04,270 --> 00:19:06,730 405 00:19:06,730 --> 00:19:13,060 406 00:19:13,060 --> 00:19:17,310 407 00:19:17,310 --> 00:19:19,840 408 00:19:19,840 --> 00:19:22,630 409 00:19:22,630 --> 00:19:24,460 410 00:19:24,460 --> 00:19:26,640 411 00:19:26,640 --> 00:19:30,160 412 00:19:30,160 --> 00:19:31,330 413 00:19:31,330 --> 00:19:33,640 414 00:19:33,640 --> 00:19:38,860 415 00:19:38,860 --> 00:19:41,920 416 00:19:41,920 --> 00:19:44,290 417 00:19:44,290 --> 00:19:46,990 418 00:19:46,990 --> 00:19:49,810 419 00:19:49,810 --> 00:19:51,760 420 00:19:51,760 --> 00:19:54,190 421 00:19:54,190 --> 00:19:56,440 422 00:19:56,440 --> 00:19:58,360 423 00:19:58,360 --> 00:20:01,290 424 00:20:01,290 --> 00:20:04,630 425 00:20:04,630 --> 00:20:06,430 426 00:20:06,430 --> 00:20:08,860 427 00:20:08,860 --> 00:20:11,530 428 00:20:11,530 --> 00:20:13,600 429 00:20:13,600 --> 00:20:16,930 430 00:20:16,930 --> 00:20:23,190 431 00:20:23,190 --> 00:20:26,820 432 00:20:26,820 --> 00:20:30,120 433 00:20:30,120 --> 00:20:33,000 434 00:20:33,000 --> 00:20:34,770 435 00:20:34,770 --> 00:20:37,080 436 00:20:37,080 --> 00:20:41,070 437 00:20:41,070 --> 00:20:44,010 438 00:20:44,010 --> 00:20:45,780 439 00:20:45,780 --> 00:20:47,970 440 00:20:47,970 --> 00:20:49,410 441 00:20:49,410 --> 00:20:53,070 442 00:20:53,070 --> 00:20:55,410 443 00:20:55,410 --> 00:20:58,230 444 00:20:58,230 --> 00:21:02,040 445 00:21:02,040 --> 00:21:04,470 446 00:21:04,470 --> 00:21:07,520 447 00:21:07,520 --> 00:21:13,040 448 00:21:13,040 --> 00:21:16,020 449 00:21:16,020 --> 00:21:18,330 450 00:21:18,330 --> 00:21:20,580 451 00:21:20,580 --> 00:21:23,400 452 00:21:23,400 --> 00:21:26,850 453 00:21:26,850 --> 00:21:29,610 454 00:21:29,610 --> 00:21:36,900 455 00:21:36,900 --> 00:21:39,330 456 00:21:39,330 --> 00:21:41,130 457 00:21:41,130 --> 00:21:44,570 458 00:21:44,570 --> 00:21:49,440 459 00:21:49,440 --> 00:21:53,190 460 00:21:53,190 --> 00:21:55,800 461 00:21:55,800 --> 00:21:58,260 462 00:21:58,260 --> 00:22:00,630 463 00:22:00,630 --> 00:22:03,710 464 00:22:03,710 --> 00:22:07,290 465 00:22:07,290 --> 00:22:14,450 466 00:22:14,450 --> 00:22:20,670 467 00:22:20,670 --> 00:22:22,909 468 00:22:22,909 --> 00:22:27,730 469 00:22:27,730 --> 00:22:31,860 470 00:22:31,860 --> 00:22:36,039 471 00:22:36,039 --> 00:22:39,100 472 00:22:39,100 --> 00:22:41,740 473 00:22:41,740 --> 00:22:43,810 474 00:22:43,810 --> 00:22:58,480 475 00:22:58,480 --> 00:23:01,130 476 00:23:01,130 --> 00:23:03,550 477 00:23:03,550 --> 00:23:07,280 478 00:23:07,280 --> 00:23:10,430 479 00:23:10,430 --> 00:23:11,930 480 00:23:11,930 --> 00:23:17,300 481 00:23:17,300 --> 00:23:20,540 482 00:23:20,540 --> 00:23:29,320 483 
00:23:29,320 --> 00:23:32,180 484 00:23:32,180 --> 00:23:35,900 485 00:23:35,900 --> 00:23:38,690 486 00:23:38,690 --> 00:23:41,090 487 00:23:41,090 --> 00:23:43,070 488 00:23:43,070 --> 00:23:46,580 489 00:23:46,580 --> 00:23:51,890 490 00:23:51,890 --> 00:23:54,080 491 00:23:54,080 --> 00:23:56,000 492 00:23:56,000 --> 00:23:58,070 493 00:23:58,070 --> 00:23:59,870 494 00:23:59,870 --> 00:24:02,660 495 00:24:02,660 --> 00:24:06,470 496 00:24:06,470 --> 00:24:09,530 497 00:24:09,530 --> 00:24:11,810 498 00:24:11,810 --> 00:24:16,010 499 00:24:16,010 --> 00:24:17,990 500 00:24:17,990 --> 00:24:20,990 501 00:24:20,990 --> 00:24:22,850 502 00:24:22,850 --> 00:24:25,340 503 00:24:25,340 --> 00:24:28,010 504 00:24:28,010 --> 00:24:31,130 505 00:24:31,130 --> 00:24:37,690 506 00:24:37,690 --> 00:24:44,270 507 00:24:44,270 --> 00:24:45,890 508 00:24:45,890 --> 00:24:48,020 509 00:24:48,020 --> 00:24:50,750 510 00:24:50,750 --> 00:24:55,820 511 00:24:55,820 --> 00:24:55,830 512 00:24:55,830 --> 00:24:58,160 513 00:24:58,160 --> 00:25:01,350 514 00:25:01,350 --> 00:25:03,300 515 00:25:03,300 --> 00:25:05,730 516 00:25:05,730 --> 00:25:18,330 517 00:25:18,330 --> 00:25:20,220 518 00:25:20,220 --> 00:25:23,760 519 00:25:23,760 --> 00:25:25,200 520 00:25:25,200 --> 00:25:31,580 521 00:25:31,580 --> 00:25:53,820 522 00:25:53,820 --> 00:25:58,560 523 00:25:58,560 --> 00:26:00,660 524 00:26:00,660 --> 00:26:02,430 525 00:26:02,430 --> 00:26:05,360 526 00:26:05,360 --> 00:26:08,010 527 00:26:08,010 --> 00:26:10,560 528 00:26:10,560 --> 00:26:14,700 529 00:26:14,700 --> 00:26:16,590 530 00:26:16,590 --> 00:26:18,600 531 00:26:18,600 --> 00:26:21,900 532 00:26:21,900 --> 00:26:23,310 533 00:26:23,310 --> 00:26:25,170 534 00:26:25,170 --> 00:26:28,440 535 00:26:28,440 --> 00:26:31,770 536 00:26:31,770 --> 00:26:33,450 537 00:26:33,450 --> 00:26:37,170 538 00:26:37,170 --> 00:26:39,930 539 00:26:39,930 --> 00:26:42,630 540 00:26:42,630 --> 00:26:45,720 541 00:26:45,720 --> 00:26:49,230 542 00:26:49,230 --> 00:26:51,450 543 00:26:51,450 --> 00:26:54,330 544 00:26:54,330 --> 00:26:57,390 545 00:26:57,390 --> 00:26:59,280 546 00:26:59,280 --> 00:27:00,810 547 00:27:00,810 --> 00:27:03,090 548 00:27:03,090 --> 00:27:04,950 549 00:27:04,950 --> 00:27:07,260 550 00:27:07,260 --> 00:27:11,419 551 00:27:11,419 --> 00:27:13,879 552 00:27:13,879 --> 00:27:16,220 553 00:27:16,220 --> 00:27:17,950 554 00:27:17,950 --> 00:27:21,350 555 00:27:21,350 --> 00:27:22,940 556 00:27:22,940 --> 00:27:27,499 557 00:27:27,499 --> 00:27:30,289 558 00:27:30,289 --> 00:27:32,210 559 00:27:32,210 --> 00:27:36,259 560 00:27:36,259 --> 00:27:39,259 561 00:27:39,259 --> 00:27:42,950 562 00:27:42,950 --> 00:27:44,509 563 00:27:44,509 --> 00:27:47,210 564 00:27:47,210 --> 00:27:50,749 565 00:27:50,749 --> 00:27:53,029 566 00:27:53,029 --> 00:27:56,899 567 00:27:56,899 --> 00:27:59,119 568 00:27:59,119 --> 00:28:01,310 569 00:28:01,310 --> 00:28:03,739 570 00:28:03,739 --> 00:28:06,200 571 00:28:06,200 --> 00:28:08,269 572 00:28:08,269 --> 00:28:13,489 573 00:28:13,489 --> 00:28:16,580 574 00:28:16,580 --> 00:28:18,169 575 00:28:18,169 --> 00:28:20,749 576 00:28:20,749 --> 00:28:23,119 577 00:28:23,119 --> 00:28:25,519 578 00:28:25,519 --> 00:28:28,159 579 00:28:28,159 --> 00:28:32,060 580 00:28:32,060 --> 00:28:35,060 581 00:28:35,060 --> 00:28:36,859 582 00:28:36,859 --> 00:28:38,509 583 00:28:38,509 --> 00:28:40,460 584 00:28:40,460 --> 00:28:42,320 585 00:28:42,320 --> 00:28:45,109 586 00:28:45,109 --> 00:28:46,970 587 00:28:46,970 --> 
00:28:52,279 588 00:28:52,279 --> 00:28:55,220 589 00:28:55,220 --> 00:28:57,769 590 00:28:57,769 --> 00:29:00,919 591 00:29:00,919 --> 00:29:04,220 592 00:29:04,220 --> 00:29:07,340 593 00:29:07,340 --> 00:29:10,279 594 00:29:10,279 --> 00:29:12,320 595 00:29:12,320 --> 00:29:14,869 596 00:29:14,869 --> 00:29:16,989 597 00:29:16,989 --> 00:29:20,299 598 00:29:20,299 --> 00:29:22,700 599 00:29:22,700 --> 00:29:25,130 600 00:29:25,130 --> 00:29:28,790 601 00:29:28,790 --> 00:29:30,710 602 00:29:30,710 --> 00:29:32,390 603 00:29:32,390 --> 00:29:35,330 604 00:29:35,330 --> 00:29:37,160 605 00:29:37,160 --> 00:29:40,340 606 00:29:40,340 --> 00:29:42,740 607 00:29:42,740 --> 00:29:47,270 608 00:29:47,270 --> 00:29:48,440 609 00:29:48,440 --> 00:29:50,510 610 00:29:50,510 --> 00:29:55,430 611 00:29:55,430 --> 00:30:03,650 612 00:30:03,650 --> 00:30:06,680 613 00:30:06,680 --> 00:30:09,110 614 00:30:09,110 --> 00:30:11,150 615 00:30:11,150 --> 00:30:18,080 616 00:30:18,080 --> 00:30:20,720 617 00:30:20,720 --> 00:30:23,050 618 00:30:23,050 --> 00:30:26,800 619 00:30:26,800 --> 00:30:29,000 620 00:30:29,000 --> 00:30:31,070 621 00:30:31,070 --> 00:30:33,950 622 00:30:33,950 --> 00:30:36,380 623 00:30:36,380 --> 00:30:38,540 624 00:30:38,540 --> 00:30:40,640 625 00:30:40,640 --> 00:30:43,670 626 00:30:43,670 --> 00:30:45,590 627 00:30:45,590 --> 00:30:47,660 628 00:30:47,660 --> 00:30:50,630 629 00:30:50,630 --> 00:30:53,030 630 00:30:53,030 --> 00:30:55,310 631 00:30:55,310 --> 00:30:57,170 632 00:30:57,170 --> 00:30:59,330 633 00:30:59,330 --> 00:31:00,920 634 00:31:00,920 --> 00:31:03,170 635 00:31:03,170 --> 00:31:05,840 636 00:31:05,840 --> 00:31:10,310 637 00:31:10,310 --> 00:31:13,850 638 00:31:13,850 --> 00:31:16,520 639 00:31:16,520 --> 00:31:18,380 640 00:31:18,380 --> 00:31:20,240 641 00:31:20,240 --> 00:31:22,580 642 00:31:22,580 --> 00:31:28,340 643 00:31:28,340 --> 00:31:31,370 644 00:31:31,370 --> 00:31:34,160 645 00:31:34,160 --> 00:31:38,370 646 00:31:38,370 --> 00:31:42,990 647 00:31:42,990 --> 00:31:46,440 648 00:31:46,440 --> 00:31:50,400 649 00:31:50,400 --> 00:31:51,900 650 00:31:51,900 --> 00:31:59,930 651 00:31:59,930 --> 00:32:03,510 652 00:32:03,510 --> 00:32:07,590 653 00:32:07,590 --> 00:32:11,039 654 00:32:11,039 --> 00:32:12,750 655 00:32:12,750 --> 00:32:19,470 656 00:32:19,470 --> 00:32:24,500 657 00:32:24,500 --> 00:32:26,760 658 00:32:26,760 --> 00:32:30,180 659 00:32:30,180 --> 00:32:32,700 660 00:32:32,700 --> 00:32:35,580 661 00:32:35,580 --> 00:32:37,980 662 00:32:37,980 --> 00:32:40,830 663 00:32:40,830 --> 00:32:46,230 664 00:32:46,230 --> 00:32:48,419 665 00:32:48,419 --> 00:32:50,220 666 00:32:50,220 --> 00:32:52,500 667 00:32:52,500 --> 00:32:55,919 668 00:32:55,919 --> 00:32:59,070 669 00:32:59,070 --> 00:33:14,550 670 00:33:14,550 --> 00:33:17,580 671 00:33:17,580 --> 00:33:20,630 672 00:33:20,630 --> 00:33:24,120 673 00:33:24,120 --> 00:33:35,770 674 00:33:35,770 --> 00:33:35,780 675 00:33:35,780 --> 00:33:37,799 676 00:33:37,799 --> 00:33:37,809 677 00:33:37,809 --> 00:33:39,080 678 00:33:39,080 --> 00:33:39,090 679 00:33:39,090 --> 00:33:51,539 680 00:33:51,539 --> 00:33:54,840 681 00:33:54,840 --> 00:33:57,120 682 00:33:57,120 --> 00:34:00,480 683 00:34:00,480 --> 00:34:03,360 684 00:34:03,360 --> 00:34:10,320 685 00:34:10,320 --> 00:34:12,060 686 00:34:12,060 --> 00:34:14,070 687 00:34:14,070 --> 00:34:16,950 688 00:34:16,950 --> 00:34:19,409 689 00:34:19,409 --> 00:34:24,540 690 00:34:24,540 --> 00:34:34,889 691 00:34:34,889 --> 00:34:37,560 692 
00:34:37,560 --> 00:34:41,070 693 00:34:41,070 --> 00:34:43,320 694 00:34:43,320 --> 00:34:46,139 695 00:34:46,139 --> 00:34:49,070 696 00:34:49,070 --> 00:34:51,839 697 00:34:51,839 --> 00:34:54,359 698 00:34:54,359 --> 00:34:57,300 699 00:34:57,300 --> 00:34:59,460 700 00:34:59,460 --> 00:35:01,829 701 00:35:01,829 --> 00:35:04,320 702 00:35:04,320 --> 00:35:07,800 703 00:35:07,800 --> 00:35:10,320 704 00:35:10,320 --> 00:35:13,530 705 00:35:13,530 --> 00:35:16,109 706 00:35:16,109 --> 00:35:18,030 707 00:35:18,030 --> 00:35:19,710 708 00:35:19,710 --> 00:35:22,650 709 00:35:22,650 --> 00:35:28,829 710 00:35:28,829 --> 00:35:31,050 711 00:35:31,050 --> 00:35:34,890 712 00:35:34,890 --> 00:35:37,589 713 00:35:37,589 --> 00:35:40,170 714 00:35:40,170 --> 00:35:42,420 715 00:35:42,420 --> 00:35:44,820 716 00:35:44,820 --> 00:35:46,589 717 00:35:46,589 --> 00:35:48,540 718 00:35:48,540 --> 00:35:51,960 719 00:35:51,960 --> 00:35:54,450 720 00:35:54,450 --> 00:35:57,390 721 00:35:57,390 --> 00:35:58,740 722 00:35:58,740 --> 00:36:00,599 723 00:36:00,599 --> 00:36:03,630 724 00:36:03,630 --> 00:36:07,890 725 00:36:07,890 --> 00:36:12,690 726 00:36:12,690 --> 00:36:18,510 727 00:36:18,510 --> 00:36:20,730 728 00:36:20,730 --> 00:36:22,410 729 00:36:22,410 --> 00:36:36,630 730 00:36:36,630 --> 00:36:39,780 731 00:36:39,780 --> 00:36:44,250 732 00:36:44,250 --> 00:36:46,050 733 00:36:46,050 --> 00:36:48,000 734 00:36:48,000 --> 00:36:53,010 735 00:36:53,010 --> 00:36:57,990 736 00:36:57,990 --> 00:37:01,410 737 00:37:01,410 --> 00:37:03,360 738 00:37:03,360 --> 00:37:06,360 739 00:37:06,360 --> 00:37:08,490 740 00:37:08,490 --> 00:37:12,870 741 00:37:12,870 --> 00:37:15,570 742 00:37:15,570 --> 00:37:19,310 743 00:37:19,310 --> 00:37:23,370 744 00:37:23,370 --> 00:37:26,970 745 00:37:26,970 --> 00:37:29,460 746 00:37:29,460 --> 00:37:31,950 747 00:37:31,950 --> 00:37:33,990 748 00:37:33,990 --> 00:37:36,540 749 00:37:36,540 --> 00:37:39,360 750 00:37:39,360 --> 00:37:43,070 751 00:37:43,070 --> 00:37:45,630 752 00:37:45,630 --> 00:37:49,230 753 00:37:49,230 --> 00:37:52,050 754 00:37:52,050 --> 00:37:55,470 755 00:37:55,470 --> 00:37:58,770 756 00:37:58,770 --> 00:38:02,460 757 00:38:02,460 --> 00:38:04,770 758 00:38:04,770 --> 00:38:07,050 759 00:38:07,050 --> 00:38:08,880 760 00:38:08,880 --> 00:38:10,560 761 00:38:10,560 --> 00:38:12,450 762 00:38:12,450 --> 00:38:15,680 763 00:38:15,680 --> 00:38:29,980 764 00:38:29,980 --> 00:38:32,450 765 00:38:32,450 --> 00:38:35,780 766 00:38:35,780 --> 00:38:37,339 767 00:38:37,339 --> 00:38:40,420 768 00:38:40,420 --> 00:38:50,730 769 00:38:50,730 --> 00:38:50,740 770 00:38:50,740 --> 00:39:01,410 771 00:39:01,410 --> 00:39:01,420 772 00:39:01,420 --> 00:39:04,089 773 00:39:04,089 --> 00:39:06,279 774 00:39:06,279 --> 00:39:08,469 775 00:39:08,469 --> 00:39:10,719 776 00:39:10,719 --> 00:39:13,989 777 00:39:13,989 --> 00:39:16,329 778 00:39:16,329 --> 00:39:18,489 779 00:39:18,489 --> 00:39:20,739 780 00:39:20,739 --> 00:39:22,479 781 00:39:22,479 --> 00:39:25,150 782 00:39:25,150 --> 00:39:27,459 783 00:39:27,459 --> 00:39:30,219 784 00:39:30,219 --> 00:39:36,479 785 00:39:36,479 --> 00:39:38,890 786 00:39:38,890 --> 00:39:40,420 787 00:39:40,420 --> 00:39:43,509 788 00:39:43,509 --> 00:39:44,920 789 00:39:44,920 --> 00:39:47,319 790 00:39:47,319 --> 00:39:49,989 791 00:39:49,989 --> 00:39:52,329 792 00:39:52,329 --> 00:39:56,079 793 00:39:56,079 --> 00:39:59,170 794 00:39:59,170 --> 00:40:02,650 795 00:40:02,650 --> 00:40:05,709 796 00:40:05,709 --> 
00:40:08,279 797 00:40:08,279 --> 00:40:11,469 798 00:40:11,469 --> 00:40:14,439 799 00:40:14,439 --> 00:40:16,269 800 00:40:16,269 --> 00:40:19,209 801 00:40:19,209 --> 00:40:21,969 802 00:40:21,969 --> 00:40:23,829 803 00:40:23,829 --> 00:40:26,469 804 00:40:26,469 --> 00:40:29,349 805 00:40:29,349 --> 00:40:31,199 806 00:40:31,199 --> 00:40:33,759 807 00:40:33,759 --> 00:40:35,499 808 00:40:35,499 --> 00:40:37,749 809 00:40:37,749 --> 00:40:40,089 810 00:40:40,089 --> 00:40:41,859 811 00:40:41,859 --> 00:40:44,469 812 00:40:44,469 --> 00:40:46,870 813 00:40:46,870 --> 00:40:48,759 814 00:40:48,759 --> 00:40:51,900 815 00:40:51,900 --> 00:40:54,189 816 00:40:54,189 --> 00:40:56,979 817 00:40:56,979 --> 00:40:59,170 818 00:40:59,170 --> 00:41:01,719 819 00:41:01,719 --> 00:41:04,390 820 00:41:04,390 --> 00:41:08,709 821 00:41:08,709 --> 00:41:11,380 822 00:41:11,380 --> 00:41:13,479 823 00:41:13,479 --> 00:41:13,489 824 00:41:13,489 --> 00:41:16,910 825 00:41:16,910 --> 00:41:24,239 826 00:41:24,239 --> 00:41:26,220 827 00:41:26,220 --> 00:41:27,870 828 00:41:27,870 --> 00:41:31,079 829 00:41:31,079 --> 00:41:36,359 830 00:41:36,359 --> 00:41:38,430 831 00:41:38,430 --> 00:41:40,289 832 00:41:40,289 --> 00:41:42,720 833 00:41:42,720 --> 00:41:48,299 834 00:41:48,299 --> 00:41:49,650 835 00:41:49,650 --> 00:41:51,749 836 00:41:51,749 --> 00:42:03,020 837 00:42:03,020 --> 00:42:06,710 838 00:42:06,710 --> 00:42:06,720 839 00:42:06,720 --> 00:42:10,940 840 00:42:10,940 --> 00:42:14,299 841 00:42:14,299 --> 00:42:14,309 842 00:42:14,309 --> 00:42:16,580 843 00:42:16,580 --> 00:42:18,830 844 00:42:18,830 --> 00:42:22,160 845 00:42:22,160 --> 00:42:24,950 846 00:42:24,950 --> 00:42:27,440 847 00:42:27,440 --> 00:42:29,900 848 00:42:29,900 --> 00:42:31,910 849 00:42:31,910 --> 00:42:44,630 850 00:42:44,630 --> 00:42:46,970 851 00:42:46,970 --> 00:42:49,430 852 00:42:49,430 --> 00:42:51,730 853 00:42:51,730 --> 00:42:58,040 854 00:42:58,040 --> 00:43:05,530 855 00:43:05,530 --> 00:43:10,000 856 00:43:10,000 --> 00:43:13,070 857 00:43:13,070 --> 00:43:15,440 858 00:43:15,440 --> 00:43:16,970 859 00:43:16,970 --> 00:43:19,580 860 00:43:19,580 --> 00:43:21,050 861 00:43:21,050 --> 00:43:23,930 862 00:43:23,930 --> 00:43:25,840 863 00:43:25,840 --> 00:43:28,550 864 00:43:28,550 --> 00:43:31,670 865 00:43:31,670 --> 00:43:34,460 866 00:43:34,460 --> 00:43:38,270 867 00:43:38,270 --> 00:43:40,460 868 00:43:40,460 --> 00:43:42,620 869 00:43:42,620 --> 00:43:45,290 870 00:43:45,290 --> 00:43:48,650 871 00:43:48,650 --> 00:43:50,630 872 00:43:50,630 --> 00:43:52,880 873 00:43:52,880 --> 00:43:54,650 874 00:43:54,650 --> 00:43:56,660 875 00:43:56,660 --> 00:44:02,180 876 00:44:02,180 --> 00:44:05,870 877 00:44:05,870 --> 00:44:08,810 878 00:44:08,810 --> 00:44:12,410 879 00:44:12,410 --> 00:44:13,850 880 00:44:13,850 --> 00:44:16,010 881 00:44:16,010 --> 00:44:18,110 882 00:44:18,110 --> 00:44:20,660 883 00:44:20,660 --> 00:44:22,970 884 00:44:22,970 --> 00:44:25,760 885 00:44:25,760 --> 00:44:27,830 886 00:44:27,830 --> 00:44:29,760 887 00:44:29,760 --> 00:44:31,530 888 00:44:31,530 --> 00:44:33,600 889 00:44:33,600 --> 00:44:35,820 890 00:44:35,820 --> 00:44:38,610 891 00:44:38,610 --> 00:44:40,710 892 00:44:40,710 --> 00:44:42,330 893 00:44:42,330 --> 00:44:44,400 894 00:44:44,400 --> 00:44:47,150 895 00:44:47,150 --> 00:44:50,430 896 00:44:50,430 --> 00:44:52,320 897 00:44:52,320 --> 00:44:54,180 898 00:44:54,180 --> 00:44:57,930 899 00:44:57,930 --> 00:45:00,750 900 00:45:00,750 --> 00:45:03,150 901 
00:45:03,150 --> 00:45:05,310 902 00:45:05,310 --> 00:45:07,530 903 00:45:07,530 --> 00:45:09,930 904 00:45:09,930 --> 00:45:11,520 905 00:45:11,520 --> 00:45:13,110 906 00:45:13,110 --> 00:45:19,200 907 00:45:19,200 --> 00:45:21,210 908 00:45:21,210 --> 00:45:26,580 909 00:45:26,580 --> 00:45:29,970 910 00:45:29,970 --> 00:45:31,740 911 00:45:31,740 --> 00:45:35,220 912 00:45:35,220 --> 00:45:37,050 913 00:45:37,050 --> 00:45:38,940 914 00:45:38,940 --> 00:45:41,850 915 00:45:41,850 --> 00:45:43,560 916 00:45:43,560 --> 00:45:46,980 917 00:45:46,980 --> 00:45:49,530 918 00:45:49,530 --> 00:45:52,010 919 00:45:52,010 --> 00:45:54,870 920 00:45:54,870 --> 00:45:57,810 921 00:45:57,810 --> 00:46:00,000 922 00:46:00,000 --> 00:46:01,770 923 00:46:01,770 --> 00:46:04,350 924 00:46:04,350 --> 00:46:06,810 925 00:46:06,810 --> 00:46:08,580 926 00:46:08,580 --> 00:46:11,370 927 00:46:11,370 --> 00:46:13,680 928 00:46:13,680 --> 00:46:16,260 929 00:46:16,260 --> 00:46:20,540 930 00:46:20,540 --> 00:46:25,110 931 00:46:25,110 --> 00:46:27,150 932 00:46:27,150 --> 00:46:29,280 933 00:46:29,280 --> 00:46:30,990 934 00:46:30,990 --> 00:46:31,000 935 00:46:31,000 --> 00:46:36,290 936 00:46:36,290 --> 00:46:39,260 937 00:46:39,260 --> 00:46:41,000 938 00:46:41,000 --> 00:46:44,109 939 00:46:44,109 --> 00:46:48,200 940 00:46:48,200 --> 00:46:51,400 941 00:46:51,400 --> 00:46:54,290 942 00:46:54,290 --> 00:46:58,880 943 00:46:58,880 --> 00:47:03,050 944 00:47:03,050 --> 00:47:06,380 945 00:47:06,380 --> 00:47:08,870 946 00:47:08,870 --> 00:47:11,990 947 00:47:11,990 --> 00:47:19,670 948 00:47:19,670 --> 00:47:22,250 949 00:47:22,250 --> 00:47:25,820 950 00:47:25,820 --> 00:47:29,150 951 00:47:29,150 --> 00:47:31,550 952 00:47:31,550 --> 00:47:34,430 953 00:47:34,430 --> 00:47:37,940 954 00:47:37,940 --> 00:47:39,710 955 00:47:39,710 --> 00:47:42,530 956 00:47:42,530 --> 00:47:44,510 957 00:47:44,510 --> 00:47:46,700 958 00:47:46,700 --> 00:47:49,940 959 00:47:49,940 --> 00:47:52,190 960 00:47:52,190 --> 00:47:54,349 961 00:47:54,349 --> 00:47:57,170 962 00:47:57,170 --> 00:47:59,990 963 00:47:59,990 --> 00:48:01,760 964 00:48:01,760 --> 00:48:04,430 965 00:48:04,430 --> 00:48:18,059 966 00:48:18,059 --> 00:48:21,519 967 00:48:21,519 --> 00:48:24,549 968 00:48:24,549 --> 00:48:28,420 969 00:48:28,420 --> 00:48:30,579 970 00:48:30,579 --> 00:48:32,109 971 00:48:32,109 --> 00:48:33,849 972 00:48:33,849 --> 00:48:36,190 973 00:48:36,190 --> 00:48:40,289 974 00:48:40,289 --> 00:48:43,739 975 00:48:43,739 --> 00:48:45,849 976 00:48:45,849 --> 00:48:48,339 977 00:48:48,339 --> 00:48:51,069 978 00:48:51,069 --> 00:48:52,630 979 00:48:52,630 --> 00:48:55,299 980 00:48:55,299 --> 00:48:57,039 981 00:48:57,039 --> 00:48:58,569 982 00:48:58,569 --> 00:49:01,299 983 00:49:01,299 --> 00:49:04,210 984 00:49:04,210 --> 00:49:06,099 985 00:49:06,099 --> 00:49:08,049 986 00:49:08,049 --> 00:49:11,440 987 00:49:11,440 --> 00:49:14,349 988 00:49:14,349 --> 00:49:16,210 989 00:49:16,210 --> 00:49:18,249 990 00:49:18,249 --> 00:49:20,410 991 00:49:20,410 --> 00:49:24,190 992 00:49:24,190 --> 00:49:26,289 993 00:49:26,289 --> 00:49:28,390 994 00:49:28,390 --> 00:49:33,009 995 00:49:33,009 --> 00:49:35,079 996 00:49:35,079 --> 00:49:37,450 997 00:49:37,450 --> 00:49:39,370 998 00:49:39,370 --> 00:49:42,190 999 00:49:42,190 --> 00:49:45,370 1000 00:49:45,370 --> 00:49:47,680 1001 00:49:47,680 --> 00:49:50,440 1002 00:49:50,440 --> 00:49:51,700 1003 00:49:51,700 --> 00:49:54,579 1004 00:49:54,579 --> 00:49:57,609 1005 00:49:57,609 
--> 00:50:01,329 1006 00:50:01,329 --> 00:50:04,359 1007 00:50:04,359 --> 00:50:07,210 1008 00:50:07,210 --> 00:50:09,819 1009 00:50:09,819 --> 00:50:13,059 1010 00:50:13,059 --> 00:50:16,690 1011 00:50:16,690 --> 00:50:19,960 1012 00:50:19,960 --> 00:50:21,640 1013 00:50:21,640 --> 00:50:21,650 1014 00:50:21,650 --> 00:50:25,359 1015 00:50:25,359 --> 00:50:28,189 1016 00:50:28,189 --> 00:50:33,499 1017 00:50:33,499 --> 00:50:36,169 1018 00:50:36,169 --> 00:50:39,199 1019 00:50:39,199 --> 00:50:42,229 1020 00:50:42,229 --> 00:50:43,789 1021 00:50:43,789 --> 00:50:47,539 1022 00:50:47,539 --> 00:50:50,509 1023 00:50:50,509 --> 00:50:54,259 1024 00:50:54,259 --> 00:50:56,209 1025 00:50:56,209 --> 00:50:58,099 1026 00:50:58,099 --> 00:50:59,529 1027 00:50:59,529 --> 00:51:01,519 1028 00:51:01,519 --> 00:51:03,589 1029 00:51:03,589 --> 00:51:05,449 1030 00:51:05,449 --> 00:51:07,159 1031 00:51:07,159 --> 00:51:08,870 1032 00:51:08,870 --> 00:51:12,469 1033 00:51:12,469 --> 00:51:19,870 1034 00:51:19,870 --> 00:51:24,769 1035 00:51:24,769 --> 00:51:32,550 1036 00:51:32,550 --> 00:51:34,980 1037 00:51:34,980 --> 00:51:37,800 1038 00:51:37,800 --> 00:51:39,270 1039 00:51:39,270 --> 00:51:41,010 1040 00:51:41,010 --> 00:51:42,840 1041 00:51:42,840 --> 00:51:47,760 1042 00:51:47,760 --> 00:51:49,500 1043 00:51:49,500 --> 00:52:03,300 1044 00:52:03,300 --> 00:52:05,850 1045 00:52:05,850 --> 00:52:09,270 1046 00:52:09,270 --> 00:52:12,240 1047 00:52:12,240 --> 00:52:14,430 1048 00:52:14,430 --> 00:52:17,040 1049 00:52:17,040 --> 00:52:19,760 1050 00:52:19,760 --> 00:52:22,830 1051 00:52:22,830 --> 00:52:25,820 1052 00:52:25,820 --> 00:52:29,340 1053 00:52:29,340 --> 00:52:31,500 1054 00:52:31,500 --> 00:52:33,420 1055 00:52:33,420 --> 00:52:35,730 1056 00:52:35,730 --> 00:52:37,950 1057 00:52:37,950 --> 00:52:40,080 1058 00:52:40,080 --> 00:52:42,420 1059 00:52:42,420 --> 00:52:45,300 1060 00:52:45,300 --> 00:52:48,240 1061 00:52:48,240 --> 00:52:51,060 1062 00:52:51,060 --> 00:52:54,890 1063 00:52:54,890 --> 00:52:57,840 1064 00:52:57,840 --> 00:53:00,330 1065 00:53:00,330 --> 00:53:02,190 1066 00:53:02,190 --> 00:53:08,940 1067 00:53:08,940 --> 00:53:10,740 1068 00:53:10,740 --> 00:53:12,690 1069 00:53:12,690 --> 00:53:16,080 1070 00:53:16,080 --> 00:53:18,300 1071 00:53:18,300 --> 00:53:19,920 1072 00:53:19,920 --> 00:53:24,630 1073 00:53:24,630 --> 00:53:26,580 1074 00:53:26,580 --> 00:53:29,400 1075 00:53:29,400 --> 00:53:33,030 1076 00:53:33,030 --> 00:53:35,010 1077 00:53:35,010 --> 00:53:37,260 1078 00:53:37,260 --> 00:53:39,060 1079 00:53:39,060 --> 00:53:41,700 1080 00:53:41,700 --> 00:53:44,130 1081 00:53:44,130 --> 00:53:46,080 1082 00:53:46,080 --> 00:53:47,820 1083 00:53:47,820 --> 00:53:50,610 1084 00:53:50,610 --> 00:53:51,960 1085 00:53:51,960 --> 00:53:55,680 1086 00:53:55,680 --> 00:53:57,300 1087 00:53:57,300 --> 00:54:11,910 1088 00:54:11,910 --> 00:54:27,760 1089 00:54:27,760 --> 00:54:33,700 1090 00:54:33,700 --> 00:54:35,859 1091 00:54:35,859 --> 00:54:37,600 1092 00:54:37,600 --> 00:54:40,330 1093 00:54:40,330 --> 00:54:41,710 1094 00:54:41,710 --> 00:54:45,900 1095 00:54:45,900 --> 00:54:48,900 1096 00:54:48,900 --> 00:54:51,190 1097 00:54:51,190 --> 00:54:53,740 1098 00:54:53,740 --> 00:54:56,200 1099 00:54:56,200 --> 00:54:57,490 1100 00:54:57,490 --> 00:55:00,070 1101 00:55:00,070 --> 00:55:01,930 1102 00:55:01,930 --> 00:55:06,100 1103 00:55:06,100 --> 00:55:07,780 1104 00:55:07,780 --> 00:55:09,990 1105 00:55:09,990 --> 00:55:13,530 1106 00:55:13,530 --> 00:55:16,180 
1107 00:55:16,180 --> 00:55:27,160 1108 00:55:27,160 --> 00:55:30,970 1109 00:55:30,970 --> 00:55:33,550 1110 00:55:33,550 --> 00:55:36,490 1111 00:55:36,490 --> 00:55:40,090 1112 00:55:40,090 --> 00:55:42,820 1113 00:55:42,820 --> 00:55:44,230 1114 00:55:44,230 --> 00:55:45,820 1115 00:55:45,820 --> 00:55:47,560 1116 00:55:47,560 --> 00:55:49,060 1117 00:55:49,060 --> 00:55:52,330 1118 00:55:52,330 --> 00:55:53,760 1119 00:55:53,760 --> 00:55:56,290 1120 00:55:56,290 --> 00:55:56,300 1121 00:55:56,300 --> 00:56:04,089 1122 00:56:04,089 --> 00:56:10,010 1123 00:56:10,010 --> 00:56:13,730 1124 00:56:13,730 --> 00:56:14,960 1125 00:56:14,960 --> 00:56:17,300 1126 00:56:17,300 --> 00:56:19,930 1127 00:56:19,930 --> 00:56:23,000 1128 00:56:23,000 --> 00:56:27,410 1129 00:56:27,410 --> 00:56:42,849 1130 00:56:42,849 --> 00:56:45,560 1131 00:56:45,560 --> 00:56:47,270 1132 00:56:47,270 --> 00:56:50,120 1133 00:56:50,120 --> 00:56:53,240 1134 00:56:53,240 --> 00:56:54,650 1135 00:56:54,650 --> 00:56:56,240 1136 00:56:56,240 --> 00:56:57,770 1137 00:56:57,770 --> 00:56:59,300 1138 00:56:59,300 --> 00:57:01,130 1139 00:57:01,130 --> 00:57:03,859 1140 00:57:03,859 --> 00:57:06,950 1141 00:57:06,950 --> 00:57:09,050 1142 00:57:09,050 --> 00:57:11,690 1143 00:57:11,690 --> 00:57:14,150 1144 00:57:14,150 --> 00:57:16,670 1145 00:57:16,670 --> 00:57:18,680 1146 00:57:18,680 --> 00:57:20,630 1147 00:57:20,630 --> 00:57:22,370 1148 00:57:22,370 --> 00:57:24,710 1149 00:57:24,710 --> 00:57:29,000 1150 00:57:29,000 --> 00:57:31,609 1151 00:57:31,609 --> 00:57:33,109 1152 00:57:33,109 --> 00:57:39,290 1153 00:57:39,290 --> 00:57:41,630 1154 00:57:41,630 --> 00:57:49,560 1155 00:57:49,560 --> 00:57:58,060 1156 00:57:58,060 --> 00:58:01,480 1157 00:58:01,480 --> 00:58:03,580 1158 00:58:03,580 --> 00:58:05,860 1159 00:58:05,860 --> 00:58:07,330 1160 00:58:07,330 --> 00:58:09,310 1161 00:58:09,310 --> 00:58:11,830 1162 00:58:11,830 --> 00:58:13,180 1163 00:58:13,180 --> 00:58:14,830 1164 00:58:14,830 --> 00:58:16,480 1165 00:58:16,480 --> 00:58:19,510 1166 00:58:19,510 --> 00:58:21,340 1167 00:58:21,340 --> 00:58:22,780 1168 00:58:22,780 --> 00:58:24,760 1169 00:58:24,760 --> 00:58:26,650 1170 00:58:26,650 --> 00:58:29,110 1171 00:58:29,110 --> 00:58:31,270 1172 00:58:31,270 --> 00:58:33,100 1173 00:58:33,100 --> 00:58:36,730 1174 00:58:36,730 --> 00:58:38,470 1175 00:58:38,470 --> 00:58:41,080 1176 00:58:41,080 --> 00:58:43,330 1177 00:58:43,330 --> 00:58:44,950 1178 00:58:44,950 --> 00:58:46,930 1179 00:58:46,930 --> 00:58:49,840 1180 00:58:49,840 --> 00:58:59,470 1181 00:58:59,470 --> 00:59:05,420 1182 00:59:05,420 --> 00:59:08,479 1183 00:59:08,479 --> 00:59:10,220 1184 00:59:10,220 --> 00:59:13,190 1185 00:59:13,190 --> 00:59:15,229 1186 00:59:15,229 --> 00:59:17,029 1187 00:59:17,029 --> 00:59:19,249 1188 00:59:19,249 --> 00:59:21,170 1189 00:59:21,170 --> 00:59:27,319 1190 00:59:27,319 --> 00:59:29,660 1191 00:59:29,660 --> 00:59:37,700 1192 00:59:37,700 --> 00:59:39,109 1193 00:59:39,109 --> 00:59:41,749 1194 00:59:41,749 --> 00:59:53,600 1195 00:59:53,600 --> 01:00:02,210 1196 01:00:02,210 --> 01:00:04,460 1197 01:00:04,460 --> 01:00:06,980 1198 01:00:06,980 --> 01:00:08,690 1199 01:00:08,690 --> 01:00:10,700 1200 01:00:10,700 --> 01:00:12,770 1201 01:00:12,770 --> 01:00:15,500 1202 01:00:15,500 --> 01:00:26,300 1203 01:00:26,300 --> 01:00:28,630 1204 01:00:28,630 --> 01:00:31,130 1205 01:00:31,130 --> 01:00:33,650 1206 01:00:33,650 --> 01:00:35,720 1207 01:00:35,720 --> 01:00:39,500 1208 01:00:39,500 
--> 01:00:41,870 1209 01:00:41,870 --> 01:00:47,450 1210 01:00:47,450 --> 01:00:49,880 1211 01:00:49,880 --> 01:00:52,010 1212 01:00:52,010 --> 01:00:54,230 1213 01:00:54,230 --> 01:01:05,140 1214 01:01:05,140 --> 01:01:05,150 1215 01:01:05,150 --> 01:01:12,910 1216 01:01:12,910 --> 01:01:17,300 1217 01:01:17,300 --> 01:01:20,540 1218 01:01:20,540 --> 01:01:24,260 1219 01:01:24,260 --> 01:01:26,750 1220 01:01:26,750 --> 01:01:28,820 1221 01:01:28,820 --> 01:01:32,030 1222 01:01:32,030 --> 01:01:33,800 1223 01:01:33,800 --> 01:01:35,840 1224 01:01:35,840 --> 01:01:37,520 1225 01:01:37,520 --> 01:01:40,280 1226 01:01:40,280 --> 01:01:44,000 1227 01:01:44,000 --> 01:01:46,880 1228 01:01:46,880 --> 01:01:48,800 1229 01:01:48,800 --> 01:01:53,720 1230 01:01:53,720 --> 01:01:55,400 1231 01:01:55,400 --> 01:01:57,320 1232 01:01:57,320 --> 01:02:00,560 1233 01:02:00,560 --> 01:02:02,960 1234 01:02:02,960 --> 01:02:06,020 1235 01:02:06,020 --> 01:02:09,410 1236 01:02:09,410 --> 01:02:11,930 1237 01:02:11,930 --> 01:02:15,140 1238 01:02:15,140 --> 01:02:21,380 1239 01:02:21,380 --> 01:02:23,630 1240 01:02:23,630 --> 01:02:27,320 1241 01:02:27,320 --> 01:02:30,950 1242 01:02:30,950 --> 01:02:32,600 1243 01:02:32,600 --> 01:02:35,540 1244 01:02:35,540 --> 01:02:37,850 1245 01:02:37,850 --> 01:02:39,650 1246 01:02:39,650 --> 01:02:42,380 1247 01:02:42,380 --> 01:02:44,360 1248 01:02:44,360 --> 01:02:44,370 1249 01:02:44,370 --> 01:02:51,490 1250 01:02:51,490 --> 01:02:56,020 1251 01:02:56,020 --> 01:02:57,850 1252 01:02:57,850 --> 01:03:00,220 1253 01:03:00,220 --> 01:03:02,170 1254 01:03:02,170 --> 01:03:03,670 1255 01:03:03,670 --> 01:03:06,280 1256 01:03:06,280 --> 01:03:09,040 1257 01:03:09,040 --> 01:03:10,720 1258 01:03:10,720 --> 01:03:12,790 1259 01:03:12,790 --> 01:03:15,040 1260 01:03:15,040 --> 01:03:20,830 1261 01:03:20,830 --> 01:03:24,850 1262 01:03:24,850 --> 01:03:26,890 1263 01:03:26,890 --> 01:03:31,870 1264 01:03:31,870 --> 01:03:33,370 1265 01:03:33,370 --> 01:03:36,640 1266 01:03:36,640 --> 01:03:38,320 1267 01:03:38,320 --> 01:03:42,850 1268 01:03:42,850 --> 01:03:44,980 1269 01:03:44,980 --> 01:03:46,840 1270 01:03:46,840 --> 01:03:49,210 1271 01:03:49,210 --> 01:03:51,910 1272 01:03:51,910 --> 01:03:55,420 1273 01:03:55,420 --> 01:03:58,060 1274 01:03:58,060 --> 01:04:01,750 1275 01:04:01,750 --> 01:04:04,150 1276 01:04:04,150 --> 01:04:09,130 1277 01:04:09,130 --> 01:04:11,380 1278 01:04:11,380 --> 01:04:13,930 1279 01:04:13,930 --> 01:04:17,290 1280 01:04:17,290 --> 01:04:19,240 1281 01:04:19,240 --> 01:04:22,390 1282 01:04:22,390 --> 01:04:25,180 1283 01:04:25,180 --> 01:04:27,700 1284 01:04:27,700 --> 01:04:29,680 1285 01:04:29,680 --> 01:04:31,450 1286 01:04:31,450 --> 01:04:34,300 1287 01:04:34,300 --> 01:04:37,000 1288 01:04:37,000 --> 01:04:38,520 1289 01:04:38,520 --> 01:04:42,640 1290 01:04:42,640 --> 01:04:44,920 1291 01:04:44,920 --> 01:04:48,040 1292 01:04:48,040 --> 01:04:52,500 1293 01:04:52,500 --> 01:04:55,470 1294 01:04:55,470 --> 01:04:58,250 1295 01:04:58,250 --> 01:05:00,510 1296 01:05:00,510 --> 01:05:02,280 1297 01:05:02,280 --> 01:05:04,770 1298 01:05:04,770 --> 01:05:07,170 1299 01:05:07,170 --> 01:05:09,960 1300 01:05:09,960 --> 01:05:12,500 1301 01:05:12,500 --> 01:05:14,790 1302 01:05:14,790 --> 01:05:16,349 1303 01:05:16,349 --> 01:05:20,010 1304 01:05:20,010 --> 01:05:22,470 1305 01:05:22,470 --> 01:05:24,120 1306 01:05:24,120 --> 01:05:26,580 1307 01:05:26,580 --> 01:05:28,380 1308 01:05:28,380 --> 01:05:31,890 1309 01:05:31,890 --> 01:05:34,980 
1310 01:05:34,980 --> 01:05:36,630 1311 01:05:36,630 --> 01:05:44,340 1312 01:05:44,340 --> 01:05:46,140 1313 01:05:46,140 --> 01:05:48,930 1314 01:05:48,930 --> 01:05:50,490 1315 01:05:50,490 --> 01:05:52,770 1316 01:05:52,770 --> 01:05:55,050 1317 01:05:55,050 --> 01:05:57,060 1318 01:05:57,060 --> 01:06:01,260 1319 01:06:01,260 --> 01:06:04,530 1320 01:06:04,530 --> 01:06:06,450 1321 01:06:06,450 --> 01:06:10,380 1322 01:06:10,380 --> 01:06:11,910 1323 01:06:11,910 --> 01:06:15,359 1324 01:06:15,359 --> 01:06:18,450 1325 01:06:18,450 --> 01:06:21,210 1326 01:06:21,210 --> 01:06:24,599 1327 01:06:24,599 --> 01:06:25,980 1328 01:06:25,980 --> 01:06:28,140 1329 01:06:28,140 --> 01:06:30,599 1330 01:06:30,599 --> 01:06:33,480 1331 01:06:33,480 --> 01:06:35,820 1332 01:06:35,820 --> 01:06:37,500 1333 01:06:37,500 --> 01:06:39,750 1334 01:06:39,750 --> 01:06:41,970 1335 01:06:41,970 --> 01:06:44,670 1336 01:06:44,670 --> 01:06:47,910 1337 01:06:47,910 --> 01:06:51,150 1338 01:06:51,150 --> 01:06:54,210 1339 01:06:54,210 --> 01:06:56,520 1340 01:06:56,520 --> 01:06:58,080 1341 01:06:58,080 --> 01:07:00,030 1342 01:07:00,030 --> 01:07:02,010 1343 01:07:02,010 --> 01:07:05,430 1344 01:07:05,430 --> 01:07:06,299 1345 01:07:06,299 --> 01:07:09,150 1346 01:07:09,150 --> 01:07:11,819 1347 01:07:11,819 --> 01:07:15,120 1348 01:07:15,120 --> 01:07:17,429 1349 01:07:17,429 --> 01:07:29,370 1350 01:07:29,370 --> 01:07:31,289 1351 01:07:31,289 --> 01:07:33,390 1352 01:07:33,390 --> 01:07:36,029 1353 01:07:36,029 --> 01:07:37,859 1354 01:07:37,859 --> 01:07:40,049 1355 01:07:40,049 --> 01:07:41,819 1356 01:07:41,819 --> 01:07:44,789 1357 01:07:44,789 --> 01:07:50,729 1358 01:07:50,729 --> 01:07:54,059 1359 01:07:54,059 --> 01:07:57,120 1360 01:07:57,120 --> 01:07:59,849 1361 01:07:59,849 --> 01:08:03,900 1362 01:08:03,900 --> 01:08:05,729 1363 01:08:05,729 --> 01:08:09,569 1364 01:08:09,569 --> 01:08:12,059 1365 01:08:12,059 --> 01:08:14,009 1366 01:08:14,009 --> 01:08:17,309 1367 01:08:17,309 --> 01:08:19,109 1368 01:08:19,109 --> 01:08:21,959 1369 01:08:21,959 --> 01:08:26,160 1370 01:08:26,160 --> 01:08:29,010 1371 01:08:29,010 --> 01:08:33,299 1372 01:08:33,299 --> 01:08:35,579 1373 01:08:35,579 --> 01:08:37,169 1374 01:08:37,169 --> 01:08:41,669 1375 01:08:41,669 --> 01:08:44,220 1376 01:08:44,220 --> 01:08:46,499 1377 01:08:46,499 --> 01:08:49,499 1378 01:08:49,499 --> 01:08:52,200 1379 01:08:52,200 --> 01:08:53,970 1380 01:08:53,970 --> 01:08:56,430 1381 01:08:56,430 --> 01:08:57,899 1382 01:08:57,899 --> 01:09:00,269 1383 01:09:00,269 --> 01:09:01,589 1384 01:09:01,589 --> 01:09:03,510 1385 01:09:03,510 --> 01:09:06,990 1386 01:09:06,990 --> 01:09:09,539 1387 01:09:09,539 --> 01:09:12,299 1388 01:09:12,299 --> 01:09:15,059 1389 01:09:15,059 --> 01:09:17,280 1390 01:09:17,280 --> 01:09:19,680 1391 01:09:19,680 --> 01:09:22,939 1392 01:09:22,939 --> 01:09:25,890 1393 01:09:25,890 --> 01:09:28,950 1394 01:09:28,950 --> 01:09:32,519 1395 01:09:32,519 --> 01:09:35,160 1396 01:09:35,160 --> 01:09:36,930 1397 01:09:36,930 --> 01:09:39,479 1398 01:09:39,479 --> 01:09:42,990 1399 01:09:42,990 --> 01:09:46,530 1400 01:09:46,530 --> 01:09:49,530 1401 01:09:49,530 --> 01:09:51,150 1402 01:09:51,150 --> 01:09:56,130 1403 01:09:56,130 --> 01:09:59,850 1404 01:09:59,850 --> 01:10:01,650 1405 01:10:01,650 --> 01:10:06,209 1406 01:10:06,209 --> 01:10:08,729 1407 01:10:08,729 --> 01:10:17,090 1408 01:10:17,090 --> 01:10:20,100 1409 01:10:20,100 --> 01:10:22,080 1410 01:10:22,080 --> 01:10:24,479 1411 01:10:24,479 
--> 01:10:27,720 1412 01:10:27,720 --> 01:10:30,180 1413 01:10:30,180 --> 01:10:32,399 1414 01:10:32,399 --> 01:10:35,490 1415 01:10:35,490 --> 01:10:37,530 1416 01:10:37,530 --> 01:10:41,520 1417 01:10:41,520 --> 01:10:43,439 1418 01:10:43,439 --> 01:10:45,800 1419 01:10:45,800 --> 01:10:48,149 1420 01:10:48,149 --> 01:10:51,149 1421 01:10:51,149 --> 01:10:54,810 1422 01:10:54,810 --> 01:10:56,580 1423 01:10:56,580 --> 01:10:59,490 1424 01:10:59,490 --> 01:11:01,800 1425 01:11:01,800 --> 01:11:06,000 1426 01:11:06,000 --> 01:11:09,300 1427 01:11:09,300 --> 01:11:11,220 1428 01:11:11,220 --> 01:11:13,830 1429 01:11:13,830 --> 01:11:18,450 1430 01:11:18,450 --> 01:11:20,729 1431 01:11:20,729 --> 01:11:25,169 1432 01:11:25,169 --> 01:11:28,560 1433 01:11:28,560 --> 01:11:32,189 1434 01:11:32,189 --> 01:11:34,979 1435 01:11:34,979 --> 01:11:39,479 1436 01:11:39,479 --> 01:11:40,740 1437 01:11:40,740 --> 01:11:43,470 1438 01:11:43,470 --> 01:11:45,840 1439 01:11:45,840 --> 01:11:47,520 1440 01:11:47,520 --> 01:11:50,729 1441 01:11:50,729 --> 01:11:54,660 1442 01:11:54,660 --> 01:12:00,660 1443 01:12:00,660 --> 01:12:04,169 1444 01:12:04,169 --> 01:12:06,720 1445 01:12:06,720 --> 01:12:09,479 1446 01:12:09,479 --> 01:12:12,240 1447 01:12:12,240 --> 01:12:15,330 1448 01:12:15,330 --> 01:12:20,610 1449 01:12:20,610 --> 01:12:23,640 1450 01:12:23,640 --> 01:12:26,949 1451 01:12:26,949 --> 01:12:31,899 1452 01:12:31,899 --> 01:12:33,399 1453 01:12:33,399 --> 01:12:35,560 1454 01:12:35,560 --> 01:12:37,179 1455 01:12:37,179 --> 01:12:40,239 1456 01:12:40,239 --> 01:12:42,100 1457 01:12:42,100 --> 01:12:44,409 1458 01:12:44,409 --> 01:12:47,139 1459 01:12:47,139 --> 01:12:52,629 1460 01:12:52,629 --> 01:12:54,969 1461 01:12:54,969 --> 01:12:58,419 1462 01:12:58,419 --> 01:13:00,129 1463 01:13:00,129 --> 01:13:02,139 1464 01:13:02,139 --> 01:13:03,909 1465 01:13:03,909 --> 01:13:08,649 1466 01:13:08,649 --> 01:13:10,299 1467 01:13:10,299 --> 01:13:12,969 1468 01:13:12,969 --> 01:13:14,169 1469 01:13:14,169 --> 01:13:16,509 1470 01:13:16,509 --> 01:13:18,219 1471 01:13:18,219 --> 01:13:20,379 1472 01:13:20,379 --> 01:13:22,810 1473 01:13:22,810 --> 01:13:24,699 1474 01:13:24,699 --> 01:13:27,909 1475 01:13:27,909 --> 01:13:29,469 1476 01:13:29,469 --> 01:13:32,189 1477 01:13:32,189 --> 01:13:34,689 1478 01:13:34,689 --> 01:13:36,909 1479 01:13:36,909 --> 01:13:40,029 1480 01:13:40,029 --> 01:13:42,069 1481 01:13:42,069 --> 01:13:43,449 1482 01:13:43,449 --> 01:13:45,250 1483 01:13:45,250 --> 01:13:47,049 1484 01:13:47,049 --> 01:13:50,049 1485 01:13:50,049 --> 01:13:52,540 1486 01:13:52,540 --> 01:13:55,179 1487 01:13:55,179 --> 01:13:59,799 1488 01:13:59,799 --> 01:14:04,299 1489 01:14:04,299 --> 01:14:06,549 1490 01:14:06,549 --> 01:14:10,389 1491 01:14:10,389 --> 01:14:16,750 1492 01:14:16,750 --> 01:14:19,810 1493 01:14:19,810 --> 01:14:22,659 1494 01:14:22,659 --> 01:14:24,609 1495 01:14:24,609 --> 01:14:27,850 1496 01:14:27,850 --> 01:14:30,100 1497 01:14:30,100 --> 01:14:33,759 1498 01:14:33,759 --> 01:14:37,299 1499 01:14:37,299 --> 01:14:39,790 1500 01:14:39,790 --> 01:14:40,780 1501 01:14:40,780 --> 01:14:42,790 1502 01:14:42,790 --> 01:14:46,210 1503 01:14:46,210 --> 01:14:48,360 1504 01:14:48,360 --> 01:14:51,480 1505 01:14:51,480 --> 01:14:54,160 1506 01:14:54,160 --> 01:14:55,900 1507 01:14:55,900 --> 01:14:59,590 1508 01:14:59,590 --> 01:15:01,750 1509 01:15:01,750 --> 01:15:06,780 1510 01:15:06,780 --> 01:15:11,380 1511 01:15:11,380 --> 01:15:13,950 1512 01:15:13,950 --> 01:15:16,630 
1513 01:15:16,630 --> 01:15:19,300 1514 01:15:19,300 --> 01:15:22,960 1515 01:15:22,960 --> 01:15:25,870 1516 01:15:25,870 --> 01:15:28,420 1517 01:15:28,420 --> 01:15:30,220 1518 01:15:30,220 --> 01:15:35,320 1519 01:15:35,320 --> 01:15:38,410 1520 01:15:38,410 --> 01:15:40,750 1521 01:15:40,750 --> 01:15:48,780 1522 01:15:48,780 --> 01:15:51,670 1523 01:15:51,670 --> 01:15:53,890 1524 01:15:53,890 --> 01:15:55,990 1525 01:15:55,990 --> 01:15:58,540 1526 01:15:58,540 --> 01:16:00,840 1527 01:16:00,840 --> 01:16:02,830 1528 01:16:02,830 --> 01:16:04,810 1529 01:16:04,810 --> 01:16:06,520 1530 01:16:06,520 --> 01:16:09,130 1531 01:16:09,130 --> 01:16:13,090 1532 01:16:13,090 --> 01:16:15,430 1533 01:16:15,430 --> 01:16:17,680 1534 01:16:17,680 --> 01:16:21,250 1535 01:16:21,250 --> 01:16:22,990 1536 01:16:22,990 --> 01:16:25,840 1537 01:16:25,840 --> 01:16:30,820 1538 01:16:30,820 --> 01:16:33,040 1539 01:16:33,040 --> 01:16:35,620 1540 01:16:35,620 --> 01:16:39,280 1541 01:16:39,280 --> 01:16:40,900 1542 01:16:40,900 --> 01:16:42,520 1543 01:16:42,520 --> 01:16:44,590 1544 01:16:44,590 --> 01:16:47,230 1545 01:16:47,230 --> 01:16:50,130 1546 01:16:50,130 --> 01:16:53,110 1547 01:16:53,110 --> 01:16:54,520 1548 01:16:54,520 --> 01:16:56,650 1549 01:16:56,650 --> 01:17:01,030 1550 01:17:01,030 --> 01:17:03,160 1551 01:17:03,160 --> 01:17:05,830 1552 01:17:05,830 --> 01:17:07,750 1553 01:17:07,750 --> 01:17:12,640 1554 01:17:12,640 --> 01:17:14,110 1555 01:17:14,110 --> 01:17:16,360 1556 01:17:16,360 --> 01:17:18,430 1557 01:17:18,430 --> 01:17:18,440 1558 01:17:18,440 --> 01:17:20,650