The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

So today we're going to talk about assembly language and computer architecture. It's interesting: these days, most software courses don't bother to talk about these things, and the reason is that, as much as possible, people have been insulated in writing their software from performance considerations. But if you want to write fast code, you have to know what is going on underneath so you can exploit the strengths of the architecture, and the best interface that we have to that is the assembly language. So that's what we're going to talk about today.

So when you take a particular piece of code, like fib here, to compile it you run it through clang, as I'm sure you're familiar with at this point. What it produces is a binary, machine language that the computer is hardware-programmed to interpret and execute. It looks at the bits as instructions, as opposed to as data, and it executes them, and that's what we see when we execute. This process is not one step. There are actually four stages to compilation: preprocessing, compiling (sorry for the redundancy, that's sort of a bad name conflict, but that's what they call it), assembling, and linking. So I want to take us through those stages.

So the first thing is you go through a preprocessing stage, and you can invoke that with clang manually. For example, if you do clang -E, that will run the preprocessor and nothing else, and you can take a look at the output there and see how all your macros got expanded and such before the compilation actually goes through. Then you compile it, and that produces assembly code. So assembly is a
mnemonic structure of the machine code that makes it more human-readable than the machine code itself would be. And once again, you can produce the assembly yourself with clang -S. And then finally, or penultimately maybe, you can assemble that assembly language code to produce an object file. And since we like to have separate compilations, so you don't have to compile everything as one big monolithic hunk, there's typically a linking stage to produce the final executable. For that we are using ld; for the most part we're actually using the gold linker, but ld is the command that calls it.

OK, so let's go through each of those steps and see what's going on. First, the preprocessing is really straightforward, so I'm not going to do that. That's just a textual substitution. The next stage is source code to assembly code. When we do clang -S, we get this symbolic representation, and it looks something like this, where we have some labels on the side, we have some operations, we may have some directives, and then we have a lot of gibberish, which won't seem like so much gibberish after you've played with it a little bit. But to begin with, it looks kind of like gibberish.

From there, we assemble that assembly code, and that produces the binary. Once again, you can invoke it just by running clang. Clang will recognize that it doesn't have a C file or a C++ file. It says, oh goodness, I've got an assembly language file, and it'll produce the binary. Now, the other thing that turns out to be the case is that assembly and machine code are really very similar in structure. Things like the opcodes, which are the things that are here in blue or purple, whatever that color is, like these guys, correspond to specific bit patterns over here in the machine code. And these are the addresses and the registers that we're operating on, the
operands. Those correspond to other bit codes over there. It's not exactly one-to-one, but it's pretty close to one-to-one, compared to if you had C and you looked at the binary, where it's way, way different.

So one of the things that it turns out you can do, if you have the machine code, and especially if the machine code was produced with so-called debug symbols, that is, it was compiled with -g, is use this program called objdump, which will produce a disassembly of the machine code. It will tell you, here's what the mnemonic, more human-readable code is, the assembly code, from the binary. And that's really useful, especially if you're trying to do things... well, let's see, why do we bother looking at the assembly?

So why would you want to look at the assembly of your program? Does anybody have some ideas? Yeah. Yeah, you can see whether certain optimizations are made or not. Other reasons? Everybody's going to say that one. OK, so here are some reasons. The assembly reveals what the compiler did and did not do, because you can see exactly what the assembly is that is going to be executed as machine code.

The second reason, which turns out to happen more often than you would think, is that, hey, guess what, the compiler is a piece of software. It has bugs. So your code isn't operating correctly. Oh goodness, what's going wrong? Maybe the compiler made an error. And we have certainly found that, especially when you start using some of the less frequently used features of a compiler, you may discover, oh, it's actually not that well broken in. And, as it mentions here, a bug may only have an effect when compiling at -O3, but if you compile at -O0 or -O1, everything works out just fine. So then it says, gee, somewhere in the optimizations, they did an optimization wrong. So one of the first principles of optimization is do
it right, and then the second is make it fast. And sometimes the compiler doesn't do that. It's also the case that sometimes you cannot write code that produces the assembly that you want, and in that case you can actually write the assembly by hand. Now, it used to be, many, many years ago, that a lot of software was written in assembly. In fact, in my first job out of college, I spent about half the time programming in assembly language. And it's not as bad as you would think. But it certainly is easier to have high-level languages, that's for sure. You get a lot more done a lot quicker.

And the last one is reverse engineering. You can figure out what a program does when you only have access to its binary. So, for example, the matrix multiplication example that I gave on day one: we had the overall outer structure, but the inner loop we could not match to the Intel Math Kernel Library code. So what did we do? We didn't have the source for it, so we looked to see what it was doing. We said, oh, is that what they're doing? And then we were able to do it ourselves without having to get the source from them. So we reverse engineered what they did. So all those are good reasons.

Now, in this class we have some expectations. One thing is, assembly is complicated, and you needn't memorize the manual. In fact, the manual has over a thousand pages. But here's what we do expect of you. You should understand how a compiler implements various C linguistic constructs with x86 instructions, and that's what we'll see in the next lecture. And you should be able to read x86 assembly language with the aid of an architecture manual. On a quiz, for example, we would give you snippets or explain what the opcodes are that are being used, in case it's not there. But you should have some understanding of that so you can see what's actually happening. You
should understand the high-level performance implications of common assembly patterns. So what does it mean to do things in a particular way in terms of performance? Some of them are quite obvious: vector operations tend to be faster than doing the same thing with a bunch of scalar operations. If you do write in assembly, typically what we use is a bunch of compiler intrinsic functions, so-called built-ins, that allow you to use the assembly language instructions. And you should be able, after we've done this, to write code from scratch if the situation demands it at some time in the future. We won't do that in this class, but we expect that you will be in a position to do that afterwards. You should get a mastery to the level where that would not be impossible for you to do. You'd be able to do that with a reasonable amount of effort.

So for the rest of the lecture, I'm going to first start by talking about the instruction set architecture of the x86-64, which is the one that we are using for the cloud machines. Then I'm going to talk about floating-point and vector hardware, and then I'm going to do an overview of computer architecture.

Now, this is a software class, right? Software performance engineering is what we're doing. So the reason we're doing all of this is so you can write code that better matches the hardware. In order to do that, I could give things at a high level. My experience is that if you really want to understand something, you want to understand it to the level that's necessary and then one level below that. It's not that you'll necessarily use that one level below it, but that gives you insight as to why that layer is what it is and what's really going on. And so that's kind of what we're going to do. We're going to do a dive that takes us one level beyond what you probably will need to know in the class, so that you have a robust
foundation for understanding. Does that make sense? That's part of my learning philosophy: go one step beyond, and then you can come back.

OK, the ISA primer. So the ISA talks about the syntax and semantics of assembly. There are four important concepts in the instruction set architecture: the notion of registers, the notion of instructions, the data types, and the memory addressing modes. Those are sort of indicated, for example, here. We're going to go through those one by one.

So let's start with the registers. The registers are where the processor stores things, and there are a bunch of x86 registers, so many that you don't need to know most of them. The ones that are important are these. First of all, there are general-purpose registers, and those typically have width 64, and there are many of those. There is a so-called flags register, called RFLAGS, which keeps track of things like whether there was an overflow, whether the last arithmetic operation resulted in a zero, whether there was a carry out of a word, or what have you. The next one is the instruction pointer. The assembly language is organized as a sequence of instructions, and the hardware marches linearly through that sequence, one after the other, unless it encounters a conditional jump or an unconditional jump, in which case it'll branch to whatever the location is. But for the most part, it's just running straight through memory.

Then there are some registers that were added quite late in the game, namely the SSE registers and the AVX registers, and these are vector registers. The XMM registers are from when they first did vectorization; they used 128 bits. For AVX there are the YMM registers, and in the most recent processors, which we're not using this term, there's another level of AVX that gives you 512-bit registers. Maybe we'll use that for the final project, because it's just a little more power
for the game-playing project. But for most of what you'll be doing, we'll just be keeping to the basics, to the c4 instances in AWS that you guys have been using.

Now, the x86-64 didn't start out as x86-64. It started out as x86, and it was used for machines, in particular the 8086, which had a 16-bit word. So really short. How many things can you index with a 16-bit word? About how many? Yeah, about 65,000: 65,536 words that you can address, or bytes. This is byte addressing. So that's 65K bytes that you can address. How could they possibly use that for machines? Well, the answer is that's how much memory was on the machine. You didn't have gigabytes. So as Moore's law marched along and we got more and more memory, the words had to become wider to be able to index them.

Yeah? Yeah, but here's the thing: if you're building stuff that's too expensive and you can't get memory that's big enough, then if you build a wider word, like a word of 32 bits, your processor just costs twice as much as the next guy's processor. So instead, what they did is they went along as long as that was the common size, and then had some growth pains and went to 32, and from there they had some more growth pains and went to 64. Those are two separate things.

And in fact, they did some really weird stuff. What they did is, when they made these longer registers, they have registers that are aliased to exactly the same thing for the lower bits. So they could address them by a byte (you can do the lower and upper half of the short word), or you can do the 32-bit word, or you can do the 64-bit word. If you were doing this today, you wouldn't do that. You wouldn't have all these registers that alias and such. But that's what they did, because this is
history, not design. And the reason was that when they were doing that, they were not designing for the long term. Now, are we going to go to 128-bit addressing? Probably not. 64 bits addresses a spectacular amount of stuff. 2 to the 64th is, what, how many gazillions? It's a lot of gazillions. So yeah, we're probably not going to have to go beyond 64.

So here are the general-purpose registers. As I mentioned, they have different names, but they refer to the same thing. So if you change eax, for example, that also changes rax. And you see, they originally all had functional purposes. Now they're all pretty much the same thing, but the names have stuck because of history. Instead of calling them register 0, register 1, or whatever, they all have these funny names. Some of them are still used for a particular purpose: rsp is used as the stack pointer, and rbp is used to point to the base of the frame, for those who remember their 6.004 stuff. So anyway, there are a whole bunch of them, and there are different names depending upon which part of the register you're accessing.

Now, the format of an x86-64 instruction code is to have an opcode and then an operand list. The operand list is typically 0, 1, 2, or rarely 3 operands, separated by commas. Typically all operands are sources, and one operand might also be the destination. So, for example, if you take a look at this add instruction, the operation is an add, and the operand list is these two registers. One is edi and the other is ecx, and the destination is the second one. When you add in this case, what's going on is it's taking the value in ecx, adding the value in edi into it, and the result is in ecx.

Yes? Funny you should ask. Yes. So what does op A, B mean? It turns out, naturally, that the literature is inconsistent about how it refers to operations, and there are two major ways that are used. One is the AT&T
syntax and the other is the Intel syntax. In the AT&T syntax, the second operand is the destination; the last operand is the destination. In the Intel syntax, the first operand is the destination. Is that confusing? So almost all the tools that we're going to use are going to use the AT&T syntax. But you will read documentation, which is Intel documentation, and it will use the other syntax. Don't get confused. I can't help that this is the state of the world.

Yeah? Oh yeah, in particular, you could compile it and undo it. But this is not a hard translation thing. I'll bet if you just Google it, you can in two minutes, in two seconds, find somebody who'll translate from one to the other. This is not a complicated translation process.

Now, here are some very common x86 opcodes, and let me just mention a few of these, because these are ones that you'll often see in the code. So move: what do you think move does? Yeah, it puts something in one register into another register. Of course, this is computer science move, not real move. When I move my belongings in my house to my new house, they're no longer in the old place, right? But in computer science, for some reason, when we move things, we leave a copy behind. So they may call it move, but... yeah, why don't they call it copy? You got me.

Then there's conditional move. This is a move based on a condition, and we'll see some of the ways this works, like move if a flag is equal to zero, and so forth. So basically, a conditional move doesn't always do the move. Then you can extend the sign. For example, suppose you're moving from a 32-bit register into a 64-bit register. Then the question is, what happens to the higher-order bits? There are two basic mechanisms that can be used. Either they can be filled with zeros or
remember that the first bit there, the leftmost bit as we think of it, is the sign bit, right, from our lecture on binary? That bit will be extended through the high-order part of the word, so that if the number is negative, the whole number will be negative, and if it's positive, it'll be zeros, and so forth. Does that make sense?

Then there are things like push and pop to do stacks. There's a lot of integer arithmetic: addition, subtraction, multiplication, division, various shifts, address calculation, rotations, incrementing, decrementing, negating, etc. There's also a lot of binary logic: AND, OR, XOR, NOT. Those are all doing bitwise operations. And then there is Boolean logic, like testing to see whether some value has a given value, or comparing. There's unconditional jump, which is jump, and there are conditional jumps, which is jump with a condition. And then there are things like subroutines. And there are a bunch more, which the manual will have and which will undoubtedly show up. For example, there's the whole set of vector operations we'll talk about a little bit later.

Now, the opcodes may be augmented with a suffix that describes the data type of the operation or a condition code. An opcode for data movement, arithmetic, or logic uses a single-character suffix to indicate the data type, and if the suffix is missing, it can usually be inferred. So take a look at this example. This is a move with a q at the end. What do you think q stands for? Quad word. How many bytes are in a quad word? Eight. That's because originally it started out with a 16-bit word, so they said a quad word was four of those 16-bit words. So that's 8 bytes. You get the idea, right? But let me tell you, this is all over the x86 instruction set: all these historical things and all these mnemonics that, if you don't understand what they really mean, can get you very confused. So in this case we're moving a 64-bit integer, because a quad
word has 8 bytes, or 64 bits. Whenever I prepare this lecture, I just go into spasms of laughter as I look and say, oh my God, they really did that? Like, for example, on the last page, when I did subtract. So the sub operator, if it's a two-argument operator, subtracts, I think, the first one from the second. But there is no way of subtracting the other way around. It puts the destination in the second one. It basically takes the second one minus the first one and puts that in the second one. But if you wanted to have it the other way around, to save yourself a cycle... anyway, it doesn't matter. You can't do it that way. And all this stuff the compiler has to understand.

So here are the x86-64 data types. The way I've done it is to show you the difference between C and x86-64. So, for example, here are the declarations in C: there's a char, a short, int, unsigned int, long, etc. Here's an example of a C constant that does those things, and here's the size in bytes that you get when you declare that. And then the assembly suffix is one of these things. So in the assembly, it says b for a byte, w for a word, l or d for a double word, q for a quad word, i.e., 8 bytes, and then single precision, double precision, extended precision.

So sign extension uses two data type suffixes. Here's an example. The first one says we're going to move, and now you see I can't read this without my cheat sheet. So what is this saying? This is saying we're going to move with a zero extend, and the first operand is a byte and the second operand is a long. Is that right? If I'm wrong, it's like, I've got to look at the chart too. And of course, we don't hold you to that. But the z there says extend with zeros, and the s says preserve the sign. So that's how that works.

Now, that would all be well and good, except that what they did is, if you do 32-bit operations where you're moving it to a
64-bit value, it implicitly zero-extends. If you do it for smaller values and you store it in, it simply overwrites the values in those registers; it doesn't touch the higher bits. But when they did the 32-to-64-bit extension of the instruction set, they decided that they wouldn't do what had been done in the past, and they decided that they would zero-extend things unless there was something explicit to the contrary. You got me.

I have a friend who worked at Intel, and he had a joke about the Intel instruction set. He discovered the Intel instruction set is really complicated. He says, here's the idea of the Intel instruction set: to become an Intel Fellow, you need to have an instruction in the Intel instruction set, an instruction that you invented and that's now used in Intel. He says nobody becomes an Intel Fellow for removing instructions. So it just grows and grows and grows and gets more and more complicated with each thing.

Now, once again, for extension, you can sign-extend, and here are two examples: in one case, moving an 8-bit integer to a 32-bit integer and zero-extending it, versus preserving the sign.

Conditional jumps and conditional moves also use suffixes to indicate the condition code. So here, for example, the ne indicates the jump should only be taken if the arguments of the previous comparison are not equal. So ne is not equal. You do a comparison, and that's going to set a flag in the RFLAGS register. Then the jump will look at that flag and decide whether it's going to jump or not, or just continue the sequential execution of the code. And there are a bunch of things that you can jump on, which are status flags, and you can see the names here. There's carry. There's parity; parity is the XOR of all the bits in the word. There's adjust; I don't even know what that's for. There's the zero flag, which tells whether it's a zero. There's a sign flag, whether it's positive or negative. There's a trap
flag, and interrupt enable, and direction, and overflow. So anyway, you can see there are a whole bunch of these. So, for example, here, this is going to decrement rbx, and it sets the zero flag if the result is equal to zero. And then the conditional jump jumps to the label if the ZF flag is not set, in this case. Makes sense? After a fashion. It doesn't make rational sense, but it does make sense.

Here are the main ones that you're going to need. The carry flag is whether you got a carry or a borrow out of the most significant bit. The zero flag is whether the ALU operation was zero. The sign flag is whether the last ALU operation had the sign bit set. And the overflow flag says it resulted in arithmetic overflow.

As for the condition codes, if you put one of these condition codes on your conditional jump or whatever, this tells you exactly what the flag is that is being checked. So, for example, the easy ones are if it's equal, but there are some other ones there. So, for example, why do the condition codes e and ne check the zero flag? The answer is that what they've done is separate the branch from the comparison itself. But it also needn't be a compare instruction. It could be that the result of the last arithmetic operation was a zero, and therefore it can branch without having to do a comparison with zero. So, for example, if you have a loop where you're decrementing a counter till it gets to zero, that's actually faster by one instruction, to check whether the loop index hits zero, than it is if you have the loop going up to n and then every time through the loop having to compare with n before you can branch. These days that optimization doesn't mean anything because, as we'll talk about in a little bit, these machines are so powerful that doing an extra integer arithmetic operation like that probably has no bearing on the
overall cost. Yeah? It just looks at the flags. Yep, it just looks at the flags. It doesn't take any arguments.

Now, the next aspect of this is that you can give registers, but you can also address memory. There are three direct addressing modes and three indirect addressing modes. At most one operand may specify a memory address. So here are the direct addressing modes. For immediate, what you do is give it a constant, like 172, a random constant, to store into the register. That's called an immediate. If you look at the machine language, 172 is right in the instruction. It's right in the instruction, that number 172. Register says we'll move the value from the register, in this case %rcx, and then the index of the register is put in that part of the instruction. And direct memory says use a particular memory location, and you can give a hex value. When you do direct memory, it's going to use the value at that place in memory. And to indicate that memory location is going to take you, on a 64-bit machine, 8 bytes, whereas, for example, with the movq, 172 will fit in 1 byte. So I'll have spent a lot less storage in order to do it. Plus, I can do it directly from the instruction stream, and I avoid having an access to memory, which is very expensive.

So how many cycles does it take if the value that you're fetching from memory is not in cache, or whatever, or a register? If I'm fetching something from memory, how many cycles of the machine does it typically take these days? Yeah. Yeah, a couple hundred or more. A couple hundred cycles to fetch something from memory. It's so slow. No, it's that the processors are so fast. And so clearly, if you can get things into registers... most registers you can access in a single cycle. So we want to move things close to the processor, operate on them, and shove them back. And while we pull things from memory, we want other things to be working on. And so the hardware is all organized to do that. Now, of course, we spend a lot of time fetching stuff from memory, and that's one reason we use caching. Caching is really important. We're going to spend a bunch of time on how to get the best out of your cache.

There's also indirect addressing. So instead of just giving a location, you say, oh, let's go to some other place, for example a register, and get the value, and the address is going to be stored in that location. So, for example, here, register indirect says, in this case, move the contents of rax into... sorry, the contents of rax is the address of the thing that you're going to move into rdi. So if rax held location 172, then it would take whatever is in location 172 and put it in rdi. Register indexed says, well, do the same thing, but while you're at it, add an offset. So once again, if rax had 172, in this case it would go to 344 to fetch the value out of that location 344 for this particular instruction. And then instruction-pointer relative: instead of indexing off of a general-purpose register, you index off the instruction pointer. That usually happens in the code. For example, you can jump to where you are in the code plus four instructions. So you can jump down some number of instructions in the code. Usually you'll see that only used with control, because you're talking about things in the code. But sometimes they'll put some data in the instruction stream, and then it can index off the instruction pointer to get those values without having to soil another register.

Now, the most general form is base indexed scale displacement addressing. Wow. This is a move that has a constant plus three terms, and this is the most complicated instruction that is supported. The mode refers to
the address, whatever the base is. So the base is a general-purpose register, in this case rdi, and then it adds the index times the scale. The scale is 1, 2, 4, or 8. And then there's a displacement, which is that number on the front. This gives you very general indexing of things off of a base pointer. You'll often see this kind of accessing when you're accessing stack memory, because you can say, here's the base of my frame on the stack, and now for anything that I want to access, I'm going to be going up a certain amount and scaling by a certain amount to get the value that I want. So once again, you will become familiar with these with a manual. You don't have to memorize all these, but you do have to understand that there are a lot of these complex addressing modes.

The jump instructions take a label as their operand, which identifies a location in the code. The labels can be symbols; in other words, you can say, here's a symbol that I want to jump to. It might be the beginning of a function, or it might be a label that's generated to be at the beginning of a loop, or whatever. They can be exact addresses: go to this place in the code. Or they can be relative addresses: jump to someplace, as I mentioned, that's indexed off the instruction pointer. And then an indirect jump takes as its operand an indirect address. Oop, I've got "as its operand as its operand." That's a typo. It just takes an operand as an indirect address. So basically, you can say, jump to whatever is pointed to by that register, using whatever indexing method you want.

So that's kind of an overview of the assembly language. Now let's take a look at some idioms. The XOR opcode computes the bitwise XOR of A and B. We saw XOR was a great trick for swapping numbers, for example, the other day. So often in the code you will see something like this: xor %rax, %rax. What does that do? Yeah, it zeros the register. Why does that zero the register? Yeah.
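As a minimal sketch of why that idiom works, the following C fragment mirrors it in software: XORing any value with itself clears every bit. The function name xor_self is hypothetical and not from the lecture, and a compiler is free to emit something other than an xor instruction for it.

```c
#include <stdint.h>

/* Hypothetical helper (not from the lecture): mirrors the effect of
   xor %rax, %rax by XORing a 64-bit value with itself. Each bit is
   XORed with an identical bit, so every bit of the result is 0. */
uint64_t xor_self(uint64_t x) {
    return x ^ x;
}
```

Calling xor_self with any argument, for example 172 or UINT64_MAX, returns 0, which is the property the register-zeroing idiom relies on.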
It's basically taking the contents of rax and the contents of rax, XORing them, and when you XOR something with itself, you get zero, and storing that back into it. So that's actually how you zero things. Whenever you see that, hey, what are they doing? They're zeroing the register. And that's actually quicker and easier than having a zero constant that they put into the instruction. It saves a byte, because this ends up being a very short instruction. I don't remember how many bytes that instruction is.

Here's another one. The test opcode, test A, B, computes the bitwise AND of A and B and discards the result, preserving the RFLAGS register. So what does the test instruction do for these things? What is the first one doing? So it takes rcx... yeah, it takes the bitwise AND of A and B. And then it's saying jump if equal. Right, the AND is nonzero if any of the bits are set. That's right. So if the zero flag is set, then rcx is zero. So this is going to jump to that location if rcx holds the value 0. In all the other cases, it won't set the zero flag, because the result of the AND will be nonzero. So once again, that's kind of an idiom that they use. What about the second one? This is a conditional move. So both of them are basically checking to see if the register is 0, and then doing something if it is or isn't. But those are just idioms that you sort of have to look at to see how it is that they accomplish the particular thing.

Here's another one. The ISA can include several no-op, no-operation, instructions, including nop, nop A (that's a no-operation with an argument), and data16, which sets aside 2 bytes of a no-op. So here's a line of assembly that we found in some of our code: data16 data16 data16 nopw, and then %cs and so on. So nopw is going to take this argument, which has got all this address calculation in it. So what do you think this is doing? What's
the effect of this? By the way, they're all no-ops, so the effect is nothing. Now, it does set the RFLAGS, but basically, mostly, it does nothing. Why would the compiler generate assembly with these idioms? Why would you get that? That's crazy, right? Yeah? Yeah, it's actually doing alignment optimization, typically, or code size. It may want to start the next instruction on the beginning of a cache line. In fact, there's a directive to do that if you want all your functions to start at the beginning of a cache line. Then it wants to make sure that if code gets to that point, it will just proceed to continue through memory. So mainly it's to optimize memory. You'll see those things, and you just have to realize, oh, that's the compiler generating some no-ops.

So that's our brief excursion over x86 assembly language. Now I want to dive into floating-point and vector hardware, which is going to be the main part. Then, if there's any time at the end, I have a bunch of other slides on how branch prediction works and a variety of other machine sorts of things. If we don't get to them, it's no problem; you can take a look at the slides, and there's also the architecture manual.

So, floating-point instruction sets. Mostly, the scalar floating-point operations are accessed via a couple of different instruction sets. The history of floating point is interesting, because originally the 8086 did not have a floating-point unit. Floating point was done in software. Then they made a companion chip that would do floating point, and then they started integrating, and so forth, as miniaturization took hold. The SSE and AVX instructions do both single- and double-precision scalar floating point, i.e., floats or doubles. And then the x87 instructions, that's the 8087 that was attached to the 8086 and
that's where they get the name, support single-, double-, and extended-precision scalar floating-point arithmetic, including float, double, and long double. So you can actually get a great big result of a multiply if you use the x87 instruction set. And they also include vector instructions, so you can multiply or add there as well. So there are all these places on the chip where you can decide to do one thing or another.

Compilers generally like the SSE instructions over the x87 instructions, because they're simpler to compile for and to optimize. The SSE opcodes are similar to the normal x86 opcodes, and they use the XMM registers and floating-point types. So you'll see stuff like this, where you've got a movsd and so forth. The suffix there is saying what the data type is; in this case it's saying it's a double-precision floating-point value, i.e., a double. Once again, they're using a suffix, and the sd in this case is a scalar double-precision floating-point value. The first letter says whether it's single, i.e., a scalar operation, or packed, i.e., a vector operation, and the second letter says whether it's single or double precision. So when you see one of these operations, you can decode it: oh, this is operating on a 64-bit value or a 32-bit floating-point value, or on a vector of those values.

Now, what about these vectors? When you start using the packed representation and you start using vectors, you have to understand a little bit about the vector units that are on these machines. The way a vector unit works is that there is the processor issuing instructions, and it issues the instructions to all of the vector units. For example, if you take a look at a typical thing, you may have a vector width of four vector units. Each of them is often called a lane, l-a-n-e, and k is the vector width. When the instruction is given, it's given to all of the vector units, and they all do it on their own local copy of the register. So the register you can think
of as a very wide thing broken into several words. When I say add two vectors together, it'll add four words together and store the result back into another vector register. Whatever k is, and in the example I just gave k was four, the lanes are the things that each contain the integer or floating-point arithmetic. But the important thing is that they all operate in lockstep. It's not like one is going to do one thing and another is going to do another thing. They all have to do exactly the same thing. The basic idea here is that for the price of one instruction, I can command a bunch of operations to be done.

Now, generally, vector instructions operate in an element-wise fashion, where you take the i-th element of one vector and operate on it with the i-th element of another vector, and all the lanes perform exactly the same operation. Depending upon the architecture, the operands may need to be aligned; that is, you've got to have the beginnings at exactly the same place in memory, a multiple of the vector length. There are others where the vectors can be shifted in memory. Usually there's a performance difference between the two. Some of them will not support unaligned vector operations, so if the compiler can't figure out that the operands are aligned, I'm sorry, your code will end up being executed in a scalar fashion. If they are aligned, it's got to be able to figure that out. And if they're not aligned but the machine does support unaligned vector operations, it's usually slower than if they are aligned. For some machines now, they actually have good performance on both. So it really depends upon the machine. Then there are also some architectures that support cross-lane operations, such as inserting or extracting subsets of vector elements, permuting, shuffling, and scatter-gather types of operations.

x86 supports several instruction sets, as I mentioned. There's SSE, there's AVX, there's AVX2, and then there's
now AVX-512, sometimes called AVX3, which is not available on the machines that we'll be using, the Haswell machines. Generally, AVX and AVX2 extend the SSE instruction set by using wider registers. The SSE instructions use wide registers and operate on at most two operands. The AVX ones can use the 256-bit registers and also have three operands, not just two. So you can say, add A to B and store it in C, as opposed to saying, add A to B and store it in B. So it can also support three.

Most of them are similar to traditional opcodes, with minor differences. If you have an SSE instruction, it basically looks just like the traditional name, like add in this case, but you can then say, do a packed add, a vector add with packed data. The V prefix says that it's AVX. So if you see the V, you go to the part in the manual that says AVX. If you see the P, that says it's packed data, and you go to SSE if it doesn't have the V. And the P prefix distinguishes an integer vector instruction. You got me; I tried to think why P distinguishes an integer. It's like, P is no good mnemonic for integer, right?

In addition, they do this aliasing trick again, where the YMM registers actually alias the XMM registers. So you can use both operations, but you've got to be careful about what's going on, because they just extended them. And now, of course, with AVX-512, they did another extension to 512 bits.

So that's vector stuff. You can use those explicitly, or the compiler will vectorize for you. The homework this week takes you through some vectorization exercises. It's actually a lot of fun. We were just going over it in the staff meeting, and I think it's a really fun exercise. We introduced that last year, by the way, or maybe two years ago. But in any case, it's a fun one, for my definition of fun, which I hope is your definition of fun.
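As a rough illustration of the element-wise, lockstep model described above, here is a minimal sketch in C of the kind of loop a compiler may be able to vectorize on its own. The function name vec_add and the use of restrict are assumptions made for this example; they do not come from the lecture or the homework.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise add: z[i] = x[i] + y[i] for every i in [0, n).
 * vec_add is a hypothetical name chosen for this sketch.
 * `restrict` promises the compiler that the three arrays do not
 * overlap, which removes one common obstacle to auto-vectorization. */
void vec_add(float *restrict z, const float *restrict x,
             const float *restrict y, size_t n) {
    assert(n == 0 || (z != NULL && x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i) {
        /* Each iteration is independent and performs the same
         * operation, which is what lets several iterations map
         * onto the lanes of one packed instruction. */
        z[i] = x[i] + y[i];
    }
}
```

Compiling something like this with clang -O2 -mavx2 -S and reading the assembly is one way to check whether packed instructions on the YMM registers (for example vaddps) were generated. Whether they appear depends on the compiler version and flags, so treat that as something to verify rather than a guarantee.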
Now I want to talk generally about computer architecture. I'm not going to get through all of these slides, as I say, but I want to get started on them and give you a sense of other things going on in the processor that you should be aware of. In 6.004, you probably talked about a 5-stage processor. Does anybody remember that? OK, a 5-stage processor: there's an instruction fetch, there's an instruction decode, there's an execute, then there's a memory addressing, and then you write back the values. This is done as a pipeline. You could do all of this in one thing, but then you would have a long clock cycle, and you'd only be able to do one thing at a time. Instead, they stack them together.

Here's a block diagram of the 5-stage processor. We read the instruction from memory in the instruction fetch cycle. Then we decode it: basically, it takes a look at what the opcode is, what the addressing modes are, and so on, and figures out what it actually has to do. Then it performs the ALU operations, then it reads and writes the data memory, and then it writes back the results into registers. That's a common way that these things go for a 5-stage processor. By the way, this is vastly oversimplified. You can take 6.823 if you want to learn the truth. I'm going to tell you nothing but white lies for this lecture.

Now, if you look at the Intel Haswell, the machine that we're using, it actually has between 14 and 19 pipeline stages. The 14 to 19 reflects the fact that there are different paths through it that take different amounts of time. It also, I think, reflects a little bit that nobody has published the Intel internal stuff, so maybe we're not sure if it's 14 to 19, but it's somewhere in that range. But I think it's actually because of the different lengths of time, as I'll explain. So you've seen the 5-stage pipeline, and what I want to do is talk about the difference between that and a modern
processor by looking at several design features. We already talked about vector hardware. I then want to talk about superscalar processing, out-of-order execution, and branch prediction, a little bit. For the out-of-order part, I'm going to skip a bunch of that, because it has to do with scoreboarding, which is really interesting and fun, but it's also time-consuming. That's what you learn in 6.823.

Historically, there are two ways that people make processors go faster: by exploiting parallelism and by exploiting locality. For parallelism, well, we already did word-level parallelism in the bit tricks thing, but there's also instruction-level parallelism, so-called ILP, vectorization, and multicore. For locality, the main thing that's used is caching. I would say also that the fact that you have a design with registers reflects locality, because the way the processor wants to do things is to fetch stuff from memory. It doesn't want to operate on it in memory; that's very expensive. It wants to fetch things into registers, get enough of them there that you can do a whole bunch of calculations, and then put them back out there. In this lecture we're talking about ILP and vectorization.

So let me talk about instruction-level parallelism. When you have, let's say, a 5-stage pipeline, you're interested in finding opportunities to execute multiple instructions simultaneously. Instruction 1 is going to do an instruction fetch, then it does its decode, and so on, so it takes five cycles for this instruction to complete. Ideally, what you'd like is that you can start instruction 2 on cycle 2, instruction 3 on cycle 3, and so forth, and, once you get into the steady state, have five instructions executing all the time. That would be ideal, where each one takes just one stage. That would be really pretty good, and it would improve the throughput. Even though it
might take a long time to get one instruction done, I can have many instructions in the pipeline at the same time. So each pipeline stage is executing a different instruction. However, in practice this isn't what happens. In practice, you discover that there are what are called pipeline stalls. When it comes time to execute an instruction, for some correctness reason, it cannot execute the instruction; it has to wait. That's a pipeline stall. That's what you want to try to avoid, and the compiler tries to produce code that will avoid stalls.

So why do stalls happen? They happen because of what are called hazards. There are actually two notions of hazard, and this is one of them. The other is a race condition hazard; this is the dependency hazard. But people call them both hazards, just like they call the second stage of compilation compiling. It's like they make up these words. So here are three types of hazards that can prevent an instruction from executing. First of all, there's what's called a structural hazard: two instructions attempt to use the same functional unit at the same time. If there's, for example, only one floating-point multiplier, and two of them try to use it at the same time, one has to wait. In modern processors, there's a bunch of each of those. But if you have k functional units and k plus 1 instructions want to access them, you're out of luck; one of them is going to have to wait.

The second is a data hazard. This is when an instruction depends on the result of a prior instruction in the pipeline. So one instruction is computing a value that it's going to stick in rcx, say. The other one has to read the value from rcx, and it comes later. That other instruction has to wait until that value is written there before it can read it. That's a data hazard. And a control hazard is where you decide that you need to make a jump, and you can't execute the next instruction
because you don't know which way the jump is going to go. If you have a conditional jump, it's like, well, what's the next instruction after that jump? I don't know. So I have to wait to execute that. I can't go ahead and do the jump and then do the next instruction after it, because I don't know what happened to the previous one.

Now, of all these, we're going to mostly talk about data hazards. An instruction i can create a data hazard with a later instruction j due to a dependence between i and j. The first type is called a true dependence, or a read-after-write dependence. This is where, as in this example, I'm adding something and storing it into rax, and the next instruction wants to read from rax. So the second instruction can't get going until the previous one finishes, or it may stall until the result of the previous one is known. There's another one called an anti-dependence. This is where I want to write into a location, but I have to wait until the previous instruction has read the value, because otherwise I'm going to clobber the value before it gets read. So that's an anti-dependence. And then the final one is an output dependence, where they're both trying to move something to rax.

So why would two things want to move things to the same location? After all, one of them is going to be lost. Why not just skip that instruction? Yeah? Maybe because it wants to set some flags. That's one reason it might do this: the first instruction sets some flags in addition to moving the output to that location. And there's one other reason. What's the other reason? I'm blanking. There are two reasons, and I didn't put them in my notes. I don't remember. But anyway, that's a good question for a quiz, then: give me two reasons. Yeah? There could, but of course then, if you're going to use that register... oh, I know the other reason. So this is
still good for a quiz. The other reason is there may be aliasing going on. Maybe an intervening instruction uses one of the values, and it's aliased, so it uses part of the result or whatever. There still could be a dependency.

Anyway, some arithmetic operations are complex to implement in hardware and have long latencies. Here are some sample opcodes and how much latency they take. They take different numbers of cycles. For example, integer division is actually variable, but a multiply takes about three times what most of the integer operations do, and floating-point multiply is like five. And then FMA. What's FMA? Fused multiply-add. This is where you're doing both a multiply and an add. And why do we care about fused multiply-adds? Not for memory, actually. This is a floating-point multiply and add. It's called linear algebra. When you do matrix multiplication, you're doing dot products; you're doing multiplies and adds. That's where you do a lot of those.

So how does the hardware accommodate these complex operations? The strategy that much hardware tends to use is to have separate functional units for complex operations, such as floating-point arithmetic. There may in fact be separate registers, for example the XMM registers, that only work with floating point. So you have your basic 5-stage pipeline, and you have another pipeline that's off on the side, and it's going to take multiple cycles sometimes for things, and it may be pipelined to a different depth. So you basically separate these operations. The functional units may be pipelined fully, partially, or not at all. So I now have a whole bunch of different functional units, and there are different paths that I'm going to be able to take through the data path of the processor. In Haswell, they have integer, vector, and floating-point units distributed among eight different ports, which
is sort of the entry to them. Given that, things get really complicated. If we go back to our simple diagram, suppose we have all these additional functional units. How can I now exploit more instruction-level parallelism? Right now we can start up one operation at a time. What might I do to get more parallelism out of the hardware that I've got? What do you think computer architects did? OK, yeah. So, even simpler than that, but which is implied in what you're saying, is that you can just fetch and issue multiple instructions per cycle. Rather than just doing one per cycle, as we showed with a typical pipelined processor, let me fetch several that use different parts of the processor pipeline, because they're not going to interfere, to keep everything busy. That's basically what's called a superscalar processor, where it's not executing one thing at a time; it's executing multiple things at a time.

Haswell, in fact, breaks up the instructions into simpler operations called micro-ops, and it can emit four micro-ops per cycle to the rest of the pipeline. The fetch and decode stages implement optimizations on micro-op processing, including special cases for common patterns. For example, if it sees the XOR of rax and rax, it knows that rax is being set to zero. It doesn't even use a functional unit for that; it just does it, and it's done. It has special logic that observes that, because it's such a common thing to zero things out. That means that now your processor can execute a lot of things at one time, and those are the machines that you're using. That's why, when I said that if you save one add instruction it probably doesn't make any difference in today's processors, it's because there's probably an idle adder lying around. Did I record how many? Where do we go here? Yeah, so if you look here, you can discover that there are actually a bunch of ALUs that are capable of doing an add. So you
know, they're all over the map in Haswell. Now, still, we are insisting that the processor execute things in order. And that's kind of the next stage: how do you free yourself from the tyranny of one instruction after the other? The first thing is a strategy called bypassing. Suppose that you have an instruction writing into rax, and then you're going to use that to read. Well, why bother waiting for it to be stored into the register file and then pulled back out for the second instruction? Instead, let's have a bypass, a special circuit that identifies that kind of situation and feeds the value directly to the next instruction without requiring that it go into the register file and back out. That's called bypassing. There are lots of places where things are bypassed, and we'll talk about it more. Normally you would stall waiting for it to be written back. When you eliminate that, I can move it way forward, because I just use the bypass path to execute, and that allows the second instruction to get going earlier.

What else can we do? Well, let's take a large code example. Given the amount of time, what I'm going to do is basically say you can go through and figure out what the read-after-write dependences and the write-after-read dependences are. They're all over the place. If you look at the dependencies that I just flashed through, you can discover, oh, there are all these things, and each one right now has to wait for the previous one before it can get started. For example, the first one is just issue order: if it's in order, you can't start the second until you've started the first, until it's finished the first stage. But the other thing here is there's a data dependence between the second and third instructions. If you look at the
second and third instructions, they're both using xmm2, and so we're prevented. So one of the questions there is, why not do a little bit better by taking a look at this as a graph and figuring out what's the best way through the graph? There are a bunch of tricks you can do there, which I'll run through here very quickly, and you can take a look at these. You can discover that some of these dependences are not real dependences, and as long as you're willing to execute things out of order and keep track of that, it's perfectly fine. If you're not actually dependent on it, then just go ahead and execute it, and then you can advance things.

The other trick you can use is what's called register renaming. If I want to write to something but I have to wait for something else to read from it, the write-after-read dependence, then what I can do is just rename the register, so that I have something to write to that is the same thing. There's a very complex mechanism called scoreboarding that does that. So anyway, you can take a look at all of these tricks. This is the part I was going to skip over, and indeed I don't have time to do it. You don't have to know any of the details of that part, but it's in there if you're interested. So it does renaming and reordering.

The last thing I just want to mention, which is worthwhile, is branch prediction. When you come to a branch, you can have a hazard because the outcome is known too late. In that case, what they do is what's called speculative execution, which you've probably heard of. Basically, that says I'm going to guess the outcome of the branch and execute. If it's encountered, you assume it's taken; otherwise you exit, and you execute
normally and if you're right everything is hunky-dory if you're wrong it costs you something like a you know you have to undo that speculative computation and the effect is sort of like stalling so you don't want that to happen and so there are a mispredicted branch on on has well cost about 15 to 20 cycles most of the machines use a branch predictor to tell whether or not it's going to do there's a little bit of stuff here about how you tell tell about whether something is going to be branch is going to be predicted or not okay and you can take a look at that on your own so sorry to rush a little bit at the end but I I knew I wasn't gonna get through all of this but it's in the notes you know in the slides when we put it up and this is really kind of interesting stuff once again I remember that I'm dealing with this at one level below what you really need to do but it is really helpful to understand that layer so you have a deep understanding of why certain software optimizations work and don't work sound good okay good luck on finishing your project ones you 2 00:00:03,189 --> 00:00:05,769 the following content is provided under a Creative Commons license your support will help MIT OpenCourseWare continue to offer high quality educational resources for free to make a donation or to view additional materials from hundreds of MIT courses visit MIT opencourseware at ocw.mit.edu so today we're going to talk about assembly language and computer architecture it's interesting these days most software courses don't bother to talk about these things and the reason is because as much as possible people have been insulated in writing their software from performance considerations but if you want to write fast code you have to know what is going on underneath so you can exploit the strengths of the architecture and the interface the best interface that we have to that is the assembly language so that's what we're going to talk about today so when you take a particular piece 
of code like fib here to compile it you run it through clang as I'm sure you're familiar at this point and what it produces it is a binary machine language that the computer is hardware programmed to interpret and execute ok it it looks at the bits as instructions as opposed to as data and it executes them and and that's what we see when we execute this process is not one step it's actually there for stages two compilation pre-processing compiling sorry for the redundancy that's sort of a bad name conflict but that's what they call it assembling and linking so I want to take us through those stages so so the first thing that goes through is you go through a pre process stage and you can invoke that with clang manually so you can say for example if you do clang - II okay that will show that will run the preprocessor and nothing else and you can take a look at the output there and look to see how all your macros got expanded and such okay before the compilation actually goes it goes through then you compile it and that produces assembly code okay so assembly is a mnemonic structure of the machine code that makes it more human readable than the machine code itself would be and then and once again you can produce the assembly yourself with clang - s and then finally you was penultimately maybe you can assemble that that assembly language code to produce an object file and since we like to have separate compilations you don't have to compile everything is one big monolithic hunk okay then there's typically a linking stage to produce the final executable and for that we are using LD for for the most part we're actually using the gold linker but but LD is the command that calls it okay so let's go through each of those steps and see what's going on so first the the the pre-processing is really straightforward so I'm not going to do that that's just a textual substitution the next stage is the source code - assembly code so when we do clang - at s we get this symbolic 
representation and it looks something like like this okay where we have some labels on the side some labels on the side and we have some operations we may have some directives and then we have a lot of gibberish which won't seem like so much gibberish after you've played with it a little bit okay but to begin with looks kind of like gibberish from there we assemble that assembly code and that produces the binary okay and once again you can invoke it just by doing running clang clang will recognize that it doesn't have a CE file or a C++ file it says oh goodness I've got a an assembly language file and they'll produce the the the binary now the other thing that turns out to be the case is because assembly and machine the code they're really very similar in in structure okay just things like the OP codes which are the the things that are here in in blue or purple whatever that color is like these guys okay those correspond to specific bit patterns over here in the machine code okay so and these are the the addresses and the registers that were operating on the operands okay those correspond to other two other bit codes over there okay and there's very close there's a very much a it's not exactly one to one but it's pretty close one to one compared to if you had C and you look at the binary it's like it's it's way way different okay so so one of the things that turns out you can do is if you have the if you have the machine code and especially if the machine code that was produced with so called debug symbols that is it was compiled with dash G you can use this program called ABS okay which will produce a disassembly of the machine code so to tell you okay here's what the mnemonic human more human readable code is the assembly code from the binary and that's really useful especially if you're trying to do things well let's see why do we bother looking at the assembly so why would you want to look at the assembly of your program does anybody have some ideas yeah yeah 
you can see whether certain optimizations are made or not so other right reasons everybody's gonna say that one okay another one is well let's see so here's some reasons okay the assembly reveals what the compiler did and did not do because you can see exactly what the machine what the assembly is that it's going to be executed as a machine code the second reason which turns out to happen more often than you would think is that hey guess what compiler is a piece of software it has bugs so your code isn't operating correctly oh goodness what's going wrong maybe the compiler made an error and we have certainly found found that for especially when you start using some of the less frequently used features of a compiler you may discover all it's actually not that well broken in and mentions here you may only have an effect when compiling at - oh three but if you compile it - oh zero - zero one everything works out just fine so then it says cheese somewhere in the optimizations they did an optimization wrong okay so one of the first principles of optimization is do it right and then the second is make it fast and then and so sometimes the compiler doesn't that it's also the case that sometimes you cannot write code that produces the assembly that you want and in that case you can actually write the assembly by hand okay now used to be many years ago many many years ago that a lot of software was written in assembly okay in fact I had a my first job out of college I spent about half the time programming in assembly language okay and it's not as bad as you would think okay but it certainly is easier to have high level languages that's for sure you get a lot more done a lot quicker okay and the last one is reverse engineer you can figure out what a program does when you have only have access to its source so for example the matrix multiplication example that I gave on day one okay you know we had the overall outer structure but the inner loop we could not match the Intel 
and math kernal library code so what do we do we look to see what it was we didn't have the source for it we look to see what it was doing we said oh is that what they're doing okay and then we were able to do it ourselves okay without having to without having to you know get the source from them so we reverse engineered what they did so all those are good reasons now in this class we have some expectations so one thing is you know assembly is complicated and you needn't memorize the manual in fact the manual has has like over a thousand pages okay it's like okay but here's what we do expect of you you should understand how a compiler implements various seed linguistic constructs with x86 instructions and that's what we'll see in the next lecture and you should be able to read x86 assembly language with the aid of an architecture manual and on a quiz for example we would give you you know snippets or explain what the op codes are that are being used in case it's not there but you should have some understanding of that so you can see what's actually happening you should understand the high level performance implications of common assembly patterns okay so what happens in you know what does it mean to do things in a particular way in terms of performance so some of them are quite obvious vector operations tend to be faster than you know doing the same thing with a bunch of scalar operations okay if you do write an assembly typically what we do ziz they're a bunch of compiler intrinsic functions built-ins so called that that allowed you allow you to use the assembly language instructions and you should be after we've done this able to write code from scratch if the situation demands at some time in the future we won't do that in this class but we expect that you will be in a position to do that after after you should get a mastery to the level where that would not be impossible for you to do you'd be able to do that with a reasonable amount of effort so the rest of 
the lecture here is I'm going to first start by talking about the instruction set architecture of the x86-64 which is the one that we are using for the cloud machines that we're using and then I'm going to talk about floating-point and vector hardware and then I'm gonna do an overview of computer architecture now all of this I'm doing this is a software class right okay software performance engineering we're doing so the reason we're doing this is so you can write code that better matches the hardware and therefore get more out of it now in order to do that I could give things at a high level but my experience is that if you really want to understand something you want to understand it to the level that's necessary and then one level below that okay it's not that you'll necessarily use that one level below it but that gives you insight as to why that layer is what it is and what's really going on okay and so that's kind of what we're gonna do we're going to do a dive that takes us one level beyond what you probably will need to know in the class so that you have a robust foundation for understanding does that make sense okay that's part of my learning philosophy you know go one step beyond and then you can you know come back okay the ISA primer okay so the ISA talks about the syntax and semantics of assembly there are four important concepts in the instruction set architecture the notion of registers the notion of instructions the data types and the memory addressing modes and those are sort of indicated for example here we're going to go through those one by one so let's start with the registers so the registers are where the processor stores things and there are a bunch of x86 registers so many that you don't need to know most of them okay the ones that are important are these okay so first of all there are general purpose registers and those typically have width 64 and there are many of those there is a so called flags register called RFLAGS
which keeps track of things like whether there was an overflow whether the last arithmetic operation resulted in a zero whether there was a carry out of a word or what-have-you the next one is the instruction pointer so the assembly language is organized as a sequence of instructions and the hardware marches linearly through that sequence one after the other unless it encounters a conditional jump or an unconditional jump in which case it'll branch to whatever the location is but for the most part it's just running straight through memory then there are some registers that were added quite late in the game namely the SSE registers and the AVX registers and these are vector registers so the xmm registers were when they first did vectorization they used 128 bits there's also for AVX the ymm registers and in the most recent processors which we're not using this term there's another level of AVX that gives you 512 bit registers but maybe we'll use that for the final project because it's just like a little more power okay for the game-playing project okay but for most of what you'll be doing we'll just be keeping to the basic c4 instances in AWS that you guys have been using okay now the x86-64 didn't start out as x86-64 it started out as x86 and it was used for machines in particular the 8086 which had a 16-bit word okay so really short okay how many things can you index with a 16-bit word about how many yeah about 65,000 65536 okay words that sorry you can address or bytes this is byte addressing okay so that's 65k bytes that you can address how could they possibly use that for machines well the answer is that's how much memory was on the machine you didn't have gigabytes so as Moore's law you know marched along and we got more and more memory then the words had to become wider to be able to index them yeah yeah but here's the thing is if you're building stuff that's
going to be too expensive and you can't get memory that's big enough then if you build a wider word like if you build a word of 32 bits then your processor just costs twice as much as the next guy's processor so instead what they did is they went along as long as that was the common size and then had some growth pains and went to 32 and from there they had some more growth pains and went to 64 okay those are two separate things and in fact they did some really weird stuff okay so what they did in fact is when they made these longer registers they have registers that are aliased to exactly the same thing for the lower bits so they could address them either by a byte okay so these registers all have the same you can do the lower and upper half of the short word or you can do the 32-bit word or you can do the 64-bit word okay and that's just like if you're doing this today you wouldn't do that you wouldn't have all these registers that alias and such okay but that's what they did because this is history not design and the reason was because when they were doing that they were not designing for the long term now are we gonna go to 128 bit addressing probably not 64 bits addresses a spectacular amount of stuff you know not quite as many 2 to the 64th is what it's like how many gazillions it's a lot of gazillions okay so yeah we're not going to have to go beyond 64 probably okay so here are the general-purpose registers and as I mentioned you know they have different names for the same thing so if you change EAX for example that also changes RAX okay and so you see they originally all had functional purposes now they're all pretty much the same thing but the names have stuck because of history okay instead of calling them register 0 register 1 or whatever they all have these funny names okay some of them still are used for a particular purpose like
RSP is used as the stack pointer and RBP is used to point to the base of the frame for those who remember their 6.004 stuff so anyway there are a whole bunch of them and there are different names depending upon which part of the register you're accessing now the format of an x86-64 instruction code is to have an opcode and then an operand list and the operand list is typically 0 1 2 or rarely three operands separated by commas typically all operands are sources and one operand might also be the destination so for example if you take a look at this add instruction the operation is an add and the operand list is these two registers one is EDI and the other is ECX and the destination is the second one okay when you add in this case what's going on is it's taking the value in ECX adding the value in EDI into it and the result is in ECX yes funny you should ask yes okay so what does op A B mean it turns out naturally that the literature is inconsistent about how it refers to operands and there's two major ways that are used one is the AT&T syntax and the other is the Intel syntax so in the AT&T syntax the second operand is the destination the last operand is the destination in the Intel syntax the first operand is the destination okay is that confusing okay so almost all the tools that we're going to use are going to use the AT&T syntax okay but you will read documentation which is Intel documentation it will use the other syntax don't get confused okay I can't help you know it's like I can't help that this is the way the state of the world is okay yeah oh yeah in particular if you you know you could compile it yeah and undo but I'm sure there's a I mean this is not a hard translation thing I'll bet if you just google it you can in two minutes in two seconds find somebody who'll translate from one to the other okay yeah this is not a complicated translation process now here are some very common x86 op codes and
so let me just mention a few of these because these are ones that you'll often see in the code so move what do you think move does yeah it puts something in one register into another register of course when it moves it this is computer science move not real move you know when I move my belongings in my house to my new house they're no longer in the old place right but in computer science for some reason when we move things we leave a copy behind okay so they may call it move but yeah why don't they call it copy you got me okay okay then there's conditional move so this is move based on a condition and we'll see some of the ways that this works like move if a flag is equal to zero and so forth so basically conditional move it doesn't always do the move then you can extend the sign so for example suppose you're moving from a 32-bit value register into a 64-bit register okay then the question is what happens to the higher order bits so there's two basic mechanisms that can be used either it can be filled with zeros or remember that the first bit the leftmost bit as we think of it is the sign bit right from our lecture on binary that bit will be extended through the high order part of the word okay so that the whole number if it's negative will be negative and if it's positive it'll be zeros and so forth okay does that make sense then there are things like push and pop to do stacks there's a lot of integer arithmetic there's you know addition subtraction multiplication division various shifts address calculation rotations incrementing decrementing negating etc there's also a lot of binary logic AND OR XOR NOT those are all doing bitwise operations and then there is boolean logic like testing to see whether some value has a given value or comparing there's unconditional jump which is jump and there's conditional jumps which is jump with a condition and then things like
subroutines and there are a bunch more which the manual will have and which will undoubtedly show up like for example there's the whole set of vector operations we'll talk about a little bit later okay now the opcodes may be augmented with a suffix that describes the data type of the operation or a condition code okay so an opcode for data movement arithmetic or logic uses a single character suffix to indicate the data type and if the suffix is missing it can usually be inferred so take a look at this example so this is a move with a q at the end what do you think q stands for quad word okay how many bytes in a quad word eight that's because originally it started out with a 16-bit word so they said a quad word was four of those 16-bit words so that's eight bytes okay you get the idea right but let me tell you this is all over the x86 instruction set all these historical things and all these mnemonics that if you don't understand what they really mean you can get very confused okay so in this case we're moving a 64-bit integer because a quad word has eight bytes or 64 bits okay this is one of my it's like whenever I prepare this lecture I just go into spasms of laughter okay as I look and I say oh my god they really did that like for example on the last page when I did subtract okay so the sub operator if it's a two argument operator it subtracts I think it's the first one from the second but there is no way of subtracting the other way around okay it puts the destination in the second one it basically takes the second one minus the first one and puts that in the second one okay but if you wanted to have it the other way around to save yourself a cycle anyway it doesn't matter you can't do it that way okay and all this stuff the compiler has to understand okay so here are the x86-64 data types okay the way I've done it is to show you the difference between C and x86-64 so for example here are the declarations in C so
there's a char a short int unsigned int long etc here's an example of a C constant that does those things and here's the size in bytes that you get when you declare that okay and then the assembly suffix is one of these things okay so in the assembly it says b for a byte w for a word l or d for a double word q for a quad word ie 8 bytes single precision double precision extended precision okay so sign extension uses two data type suffixes so here's an example so the first one says we're going to move and now you see I can't read this without my cheat sheet so what is this saying this is saying we're gonna move with a zero extend and the first operand is a byte and the second operand is a long is that right if I'm wrong it's like I've got to look at the chart too okay and of course we don't hold you to that but the z there says extend with zeros and the s says preserve the sign okay so that's the thing now that would all be well and good except that then what they did is if you do 32-bit operations where you're moving it to a 64-bit value it implicitly zero extends if you do it for smaller values and you store it in it simply overwrites the values in those registers and doesn't touch the higher bits but when they did the 32 to 64 bit extension of the instruction set they decided that they wouldn't do what had been done in the past and they decided that they would zero extend things unless there was something explicit to the contrary okay you got me okay yeah I have a friend who worked at Intel and he had a joke about the Intel instruction set he discovered the Intel instruction set is really complicated he says here's the idea of the Intel instruction set he said to become an Intel fellow you need to have an instruction in the Intel instruction set okay you have an instruction that you invented and that's now used in Intel he says nobody becomes an Intel fellow for removing instructions so
it just grows and grows and grows and gets more and more complicated for each thing now once again for extension you can sign extend and here's two examples in one case moving an 8-bit integer to a 32-bit integer and zero extending it versus preserving the sign conditional jumps and conditional moves also use suffixes to indicate the condition code so here for example the ne indicates the jump should only be taken if the arguments of the previous comparison are not equal so ne is not equal so you do a comparison and that's going to set a flag in the RFLAGS register then the jump will look at that flag and decide whether it's going to jump or not or just continue the sequential execution of the code okay and there are a bunch of things that you can jump on which are status flags and you can see the names here there's carry there's parity parity is the XOR of all the bits in the word there's adjust I don't even know what that's for okay there's the zero flag which tells whether it's zero there's a sign flag whether it's positive or negative there's a trap flag and interrupt enable and direction and overflow so anyway you can see there are a whole bunch of these okay so for example here this is going to decrement RBX and then it sets the zero flag if the result is zero and then the conditional jump jumps to the label if the ZF flag is not set in this case okay makes sense after a fashion okay it doesn't make rational sense but it does make sense okay here are the main ones that you're going to need the carry flag is whether you got a carry or borrow out of the most significant bit the zero flag is whether the ALU result was zero the sign flag is whether the last ALU operation had the sign bit set and the overflow flag says it resulted in arithmetic overflow the condition codes are if you put one of these condition codes on your conditional jump or whatever this tells you exactly what the flag is that is being checked so for example you know the
easy ones are if it's equal but you know there are some other ones there so for example you know if you say why for example do the condition codes e and ne check the zero flag okay and the answer is typically rather than having a separate comparison what they've done is separate the branch from the comparison itself but it also needn't be a compare instruction it could be that the result of the last arithmetic operation was a zero and therefore it can branch without having to do a comparison with zero okay so for example if you have a loop where you're decrementing a counter till it gets to zero that's actually faster by one instruction to check whether the loop index hits zero than it is if you have the loop going up to n and then every time through the loop having to compare with n before you can branch okay so these days that optimization doesn't mean anything because as we'll talk about in a little bit these machines are so powerful that you know doing an extra integer arithmetic operation like that probably has no bearing on the overall cost yeah just looks at the flags yep just looks at the flags doesn't take any arguments okay now the next aspect of this is that you can give registers but you also can address memory and there are three direct addressing modes and three indirect addressing modes okay at most one operand may specify a memory address so here are the direct addressing modes so for immediate what you do is you give it a constant like 172 a random constant to store into the register in this case that's called an immediate if you look at the machine language 172 is right in the instruction okay it's right in the instruction that number 172 okay register says move the value from the register in this case RCX and then the index of the register is put in that part and direct memory says use a particular memory location okay and you can give
a hex value when you do direct memory it's going to use the value at that place in memory and to indicate that memory location is going to take you on a 64-bit machine 64 bits eight bytes to specify that memory whereas for example for the movq with the immediate 172 will fit in one byte okay and so I'll have spent a lot less storage in order to do it plus I can do it directly from the instruction stream and I avoid having an access to memory which is very expensive so how many cycles does it take if the value that you're fetching from memory is not in cache or whatever okay or a register if I'm fetching something from memory how many cycles of the machine does it typically take these days yeah a couple hundred or more yeah a couple hundred cycles to fetch something from memory it's so slow no it's that the processors are so fast okay and so clearly if you can get things into registers most registers you can access in a single cycle okay so we want to move things close to the processor operate on them shove them back and while we pull things from memory we want other things to be working on okay and so the hardware is all organized to do that okay now of course we spend a lot of time fetching stuff from memory and that's one reason we use caching and caching is really important we're gonna spend a bunch of time on how to get the best out of your cache there's also indirect addressing so instead of just giving a location you say oh let's go to some other place for example a register and get the value and the address is going to be stored in that location so for example here register indirect says in this case move the contents of RAX into sorry the contents is the address of the thing that you're going to move into RDI okay so if RAX held location 172 then it would take whatever's in location 172 and put it in RDI okay
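A rough way to picture the immediate, register, direct memory and register indirect modes described so far is a toy model in C. This is only a sketch under stated assumptions: the `toy_machine` struct and the `mov_*` helper names are hypothetical, memory is modeled as an array of 64-bit words indexed by "address" (real x86-64 memory is byte addressed), and the assembly in the comments is approximate AT&T syntax.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: a real x86-64 machine is byte addressed and has many
 * more registers. Here an "address" is just an index into an array of
 * 64-bit words (an assumption made to keep the sketch short). */
enum { MEM_WORDS = 512 };

typedef struct {
    uint64_t rax, rcx, rdi;   /* three general-purpose registers */
    uint64_t mem[MEM_WORDS];  /* simulated main memory */
} toy_machine;

/* Immediate: the constant lives in the instruction itself.
 * Roughly: movq $172, %rdi */
void mov_immediate(toy_machine *m, uint64_t constant) {
    m->rdi = constant;
}

/* Register: copy one register into another (the source keeps its value).
 * Roughly: movq %rcx, %rdi */
void mov_register(toy_machine *m) {
    m->rdi = m->rcx;
}

/* Direct memory: the address is a constant encoded in the instruction.
 * Roughly: movq ADDRESS, %rdi */
void mov_direct(toy_machine *m, uint64_t address) {
    assert(address < MEM_WORDS);
    m->rdi = m->mem[address];
}

/* Register indirect: the address is whatever %rax currently holds.
 * Roughly: movq (%rax), %rdi */
void mov_register_indirect(toy_machine *m) {
    assert(m->rax < MEM_WORDS);
    m->rdi = m->mem[m->rax];
}
```

In this model, if `rax` holds 172 and `mem[172]` holds 99, `mov_register_indirect` leaves 99 in `rdi`, which mirrors the "location 172" example from the lecture. Only the two memory variants touch `mem`, which is where the couple-hundred-cycle cost would show up on real hardware when the value is not cached.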
register indexed says well do the same thing but while you're at it add an offset okay so once again if RAX had 172 in this case it would go to 344 okay to fetch the value out of that location 344 for this particular instruction okay and then instruction pointer relative okay instead of indexing off of a general purpose register you index off the instruction pointer okay that usually happens in the code where for example you can jump to where you are in the code plus four instructions okay so you can jump down some number of instructions in the code usually you'll see that only in use with control because you're talking about code but sometimes they'll put some data in the instruction stream and then it can index off the instruction pointer to get those values without having to soil another register now the most general form is base indexed scale displacement addressing wow okay this is a move that has a constant plus three terms okay and this is the most complicated addressing mode that is supported the mode refers to the address whatever the base is okay so the base is a general purpose register in this case RDI and then it adds the index times the scale so the scale is one two four or eight okay and then a displacement which is that number on the front okay and this gives you very general indexing of things off of a base pointer so you'll often see this kind of accessing when you're accessing stack memory okay because you can say here's the base of my frame on the stack and now for anything that I want to add I'm gonna be going up a certain amount I'm going to be scaling by a certain amount to get the value that I want okay so once again you know you will become familiar with these with a manual okay you don't have to memorize all these but you do have to understand that there are a lot of these complex addressing modes the jump instructions take a label as their operand which identifies a location
in the code the labels can be symbols in other words you can say here's a symbol that I want to jump to it might be the beginning of a function or it might be a label that's generated to be at the beginning of a loop or whatever they can be exact addresses go to this place in the code or they can be relative addresses jump to someplace as I mentioned that's indexed off the instruction pointer okay and then an indirect jump takes as its operand an indirect address oh I've got as its operand as its operand okay so that's a typo it just takes an operand as an indirect address so basically you can say you know jump to whatever is pointed to by that register using whatever indexing method that you want okay so that's kind of an overview of the assembly language now let's take a look at some idioms so the XOR opcode computes the bitwise XOR of A and B we saw XOR was a great trick for swapping numbers for example the other day so often in the code you will see something like this xor RAX RAX what does that do yeah it zeros the register why does that zero the register yeah it's basically taking the contents of RAX and the contents of RAX XORing them and when you XOR something with itself you get zero and storing that back into it so that's actually how you zero things so whenever you see that hey what are they doing they're zeroing the register okay and that's actually quicker and easier than having a zero constant that they put into the instruction it saves a byte because this ends up being a very short instruction I don't remember how many bytes that instruction is but okay here's another one the test opcode test A B computes the bitwise AND of A and B and discards the result preserving the RFLAGS register okay so basically it says what does the test instruction do for these things okay so what is the first one doing so it takes RCX yeah so it takes the bitwise AND of A and B right and so then it's saying jump if
equal so right the AND is nonzero if any of the bits are set that's right so if the zero flag is set then RCX is zero so this is going to jump to that location if RCX holds the value 0 okay in all the other cases it won't set the zero flag because the result of the AND will be nonzero so once again that's kind of an idiom that they use what about the second one so this is a conditional move so both of them are basically checking to see if the register is 0 okay and then doing something if it is or isn't okay but those are just idioms that you sort of have to look at to see you know how it is that they accomplish the particular thing okay here's another one so the ISA can include several no-op no operation instructions including nop nop A that's a no-op with an argument and data16 which sets aside 2 bytes of a no-op so here's a line of assembly that we found in some of our code okay data16 data16 data16 nopw and then %cs you know so nopw is going to take this argument which has got all this address calculation in it so what do you think this is doing what's the effect of this by the way they're all no-ops so what the effect is is nothing okay the effect is nothing okay now it does set the RFLAGS but basically mostly it does nothing why would the compiler generate assembly with these idioms why would you get that kind of that's crazy right yeah yeah it's actually doing alignment optimization typically okay or code size so it may want to start the next instruction on the beginning of a cache line and in fact there's a directive to do that if you want all your functions to start at the beginning of a cache line then it wants to make sure that if code gets to that point you'll just proceed to continue through memory okay so mainly it's to optimize memory so you'll see those things I mean you just have to realize oh that's the compiler generating some no-ops so that's sort of our brief excursion over
assembly language x86 assembly language now I want to dive into floating-point and vector hardware which is going to be the main part and then if there's any time at the end I'll show the slides I have a bunch of other slides on how branch prediction works and a variety of other machine sorts of things that if we don't get to it's no problem you can take a look at the slides and there's also the architecture manual so floating point instruction sets so mostly the scalar floating point operations are accessed via a couple of different instruction sets so the history of floating point is interesting because originally the 8086 did not have a floating point unit okay floating point was done in software and then they made a companion chip that would do floating point and then they started integrating and so forth as miniaturization took hold so the SSE and AVX instructions do both single and double precision scalar floating point ie floats or doubles and then the x87 instructions that's the 8087 that was attached to the 8086 and that's where they get them support single double and extended precision scalar floating-point arithmetic including float double and long double so you can actually get a great big result of a multiply if you use the x87 instruction sets and they also include vector instructions you can multiply or add there as well so there are all these places on the chip where you can decide to do one thing or another okay compilers generally like the SSE instructions over the x87 instructions because they're simpler to compile for and to optimize and the SSE op codes are similar to the normal x86 op codes and they use the xmm registers and floating-point types and so you'll see stuff like this where you've got a movsd and so forth okay the suffix there is saying what the data type is in this case it's saying it's a double precision floating point value ie a double okay once again they're using a suffix the sd in this case
is a double precision floating point the other option is the first letter says whether it's single ie a scalar operation or packed ie a vector operation okay and the second letter says whether it's single or double precision okay and so when you see one of these operations you can decode oh this is operating on a 64-bit value or a 32-bit floating point value or on a vector of those values now what about these vectors so when you start using the packed representation and you start using vectors you have to understand a little bit about the vector units that are on these machines so the way a vector unit works is that there is the processor issuing instructions and it issues the instructions to all of the vector units okay so for example if you take a look at a typical thing you may have a vector width of four vector units each of them is often called a lane l-a-n-e and k is the vector width and so when the instruction is given it's given to all of the vector units and they all do it on their own local copy of the register so the register you can think of as a very wide thing broken into several words and when I say add two vectors together it'll add four words together okay and store it back into another vector register okay and so whatever k is you know in the example I just said k was four and the lanes are the things each of which contains the integer or floating-point arithmetic but the important thing is that they all operate in lockstep okay it's not like one is going to do one thing and another is going to do another thing they all have to do exactly the same thing and the basic idea here is for the price of one instruction okay I can command a bunch of operations to be done now generally vector instructions operate in an element-wise fashion where you take the ith element of one vector and operate on it with the ith element of another vector and all the lanes perform exactly the same operation depending upon the architecture on some architectures
the operands need to be aligned that is you've got to have the beginnings at exactly the same place in memory a multiple of the vector length there are others where the vectors can be shifted in memory usually there's a performance difference between the two okay some of them will not support unaligned vector operations so if it can't figure out that they're aligned I'm sorry your code will end up being executed in a scalar fashion if they are aligned okay it's got to be able to figure that out and in that case sorry if it's not aligned but you do support vector operations on unaligned data it's usually slower than if they are aligned okay and for some machines now they actually have good performance on both okay so it really depends upon the machine and then also there are some architectures that will support cross lane operations such as inserting or extracting subsets of vector elements permuting shuffling scatter gather types of operations so x86 supports several instruction sets as I mentioned there's SSE there's AVX there's AVX2 and then there's now the AVX-512 or sometimes called AVX3 which is not available on the machines that we'll be using the Haswell machines generally the AVX and AVX2 extend the SSE instruction set by using wider registers the SSE ones use 128-bit registers and operate on at most two operands the AVX ones can use the 256 and also have three operands not just two operands so you can say you know add a to b and store it in c as opposed to saying add a to b and store it in b okay so it can also support three yeah most of them are similar to traditional op codes with minor differences so if you look at them if you have an SSE it basically looks just like the traditional name like add in this case but you can then say do a packed add or a vector with packed data so the v prefix says it's AVX so if you see it's v
you go to the part in the manual that says AVX okay if you see the p that says it's packed data then you go to SSE if it doesn't have the v okay and the p prefix distinguishes an integer vector instruction you got me I tried to think why does p distinguish an integer it's like p is no good mnemonic for integer right okay then in addition they do this aliasing trick again where the ymm registers actually alias the xmm registers okay so you can use both operations but you've got to be careful what's going on okay because they just extended them and now of course with the AVX-512 they did another extension to 512 bits okay that's the vector stuff so you can use those explicitly the compiler will vectorize for you and the homework this week takes you through some vectorization exercises it's actually a lot of fun we were just going over it in the staff meeting and it's really fun I think it's a really fun exercise we introduced that last year by the way or maybe two years ago but in any case it's a fun one for my definition of fun which I hope is your definition of fun okay now I want to talk generally about computer architecture and I'm not going to get through all of these slides as I say but I want to get started on them and give you a sense of other things going on in the processor that you should be aware of so in 6.004 you probably talked about a five stage processor does anybody remember that okay five stage processor there's an instruction fetch there's an instruction decode there's an execute then there's a memory addressing and then you write back the values and this is done as a pipeline you could do all of this in one thing but then you would have a long clock cycle and you'd only be able to do one thing at a time instead they stack them together so here's a block diagram of the five stage processor we read the instruction from memory in the instruction fetch cycle then we decode it basically it takes a look at what is the opcode
what are the addressing modes, etc., and figures out what it actually has to do. Then it actually performs the ALU operations, and then it reads and writes the data memory, and then it writes back the results into registers. That's typically a common way that these things go for a five-stage processor. By the way, this is vastly oversimplified. You can take 6.823 if you want to learn the truth. I'm going to tell you nothing but white lies for this lecture.

Now, if you look at the Intel Haswell, the machine that we're using, it actually has between 14 and 19 pipeline stages. The 14 to 19 reflects the fact that there are different paths through it that take different amounts of time. It also, I think, reflects a little bit that nobody has published the Intel internal stuff, so maybe we're not sure if it's 14 to 19, but it's somewhere in that range. But I think it's actually because of the different lengths of time, as I'll explain. So you've seen the five-stage pipeline. What I want to do is talk about the difference between that and a modern processor by looking at several design features. We already talked about vector hardware. I then want to talk about superscalar processing, out-of-order execution, and branch prediction, a little bit. And on the out-of-order part, I'm going to skip a bunch of that, because it has to do with scoreboarding, which is really interesting and fun, but it's also time consuming. That's what you learn in 6.823.

So historically, there are two ways that people make processors go faster: by exploiting parallelism and by exploiting locality. In parallelism, well, we already did word-level parallelism in the bit tricks thing, but there's also instruction-level parallelism, so-called ILP, vectorization, and multicore. And for locality, the main thing that's used there is caching. I would say also the fact that you have a design with registers, that also reflects locality,
because the way that the processor wants to do things is fetch stuff from memory. It doesn't want to operate on it in memory; that's very expensive. It wants to fetch things into registers, get enough of them there that you can do some calculations, do a whole bunch of calculations, and then put them back out there. So this lecture, we're talking about ILP and vectorization.

So let me talk about instruction-level parallelism. When you have, let's say, a five-stage pipeline, you're interested in finding opportunities to execute multiple instructions simultaneously. So instruction one is going to do an instruction fetch, then it does its decode, and so on, so it takes five cycles for this instruction to complete. Ideally, what you'd like is that you can start instruction two on cycle two, instruction three on cycle three, and so forth, and, once you get into the steady state, have five instructions executing all the time. That would be ideal, where each one is in just one stage at a time. That would be really pretty good, and it would improve the throughput. Even though it might take a long time to get one instruction done, I can have many instructions in the pipeline at the same time. So each pipeline stage is executing a different instruction.

However, in practice, this isn't what happens. In practice, you discover that there are what are called pipeline stalls. When it comes time to execute an instruction, for some correctness reason it cannot execute the instruction. It has to wait, and that's a pipeline stall. That's what you want to try to avoid, and the compiler tries to produce code that will avoid stalls. So why do stalls happen? They happen because of what are called hazards. There are actually two notions of hazard, and this is one of them. The other is a race condition hazard; this is a dependency hazard. But people call them both hazards, just like they call the second stage of compilation compiling. It's like they make up these words. So here are three types of hazards that can
prevent an instruction from executing. First of all, there's what's called a structural hazard: two instructions attempt to use the same functional unit at the same time. If there's, for example, only one floating-point multiplier and two of them try to use it at the same time, one has to wait. In modern processors, there are a bunch of each of those, but if you have k functional units and k plus one instructions want to access them, you're out of luck. One of them is going to have to wait.

The second is a data hazard. This is when an instruction depends on the result of a prior instruction in the pipeline. So one instruction is computing a value that it's going to stick into RCX, say. The other one has to read the value from RCX, and it comes later. That other instruction has to wait until that value is written there before it can read it. That's a data hazard.

And a control hazard is where you decide that you need to make a jump, and you can't execute the next instruction because you don't know which way the jump is going to go. So if you have a conditional jump, it's like, well, what's the next instruction after that jump? I don't know. So I have to wait to execute that. I can't go ahead and do the jump and then do the next instruction after it, because I don't know what happened to the previous one.

Now, of all these, we're going to mostly talk about data hazards. An instruction i can create a data hazard with a later instruction j due to a dependence between i and j. The first type is called a true dependence, or a read-after-write dependence. This is where, as in this example, I'm adding something and storing into RAX, and the next instruction wants to read from RAX. So the second instruction can't get going until the previous one is done, or rather it may stall until the result of the previous one is known. There's another one called an anti-dependence. This is
where I want to write into a location, but I have to wait until the previous instruction has read the value, because otherwise I'm going to clobber the value before it gets read. So that's an anti-dependence. And then the final one is an output dependence, where they're both trying to move something to RAX. So why would two things want to move things to the same location? After all, one of them is going to be lost. Why not just not do that instruction? Yeah? Maybe because it wants to set some flags. So that's one reason that it might do this: the first instruction sets some flags in addition to moving the output to that location. And there's one other reason. What's the other reason? I'm blanking. There are two reasons, and I didn't put them in my notes. I don't remember. But anyway, that's a good question for a quiz, then: give me two reasons. Yeah? There could, but of course then, if you're going to use that register, then... oh, I know the other reason. So this is still good for a quiz. The other reason is there may be aliasing going on. Maybe an intervening instruction uses one of the values and it's aliased, so it uses part of the result or whatever. There still could be a dependency.

Anyway, some arithmetic operations are complex to implement in hardware and have long latencies. So here are some sample opcodes and how many cycles of latency they take. They take a different number. So, for example, integer division actually is variable, but a multiply takes about three times what most of the integer operations take, and floating-point multiply is like five. And then FMA. What's FMA? Fused multiply-add. This is where you're doing both a multiply and an add. And why do we care about fused multiply-adds? Not for memory accesses; this is actually floating-point multiply and add. It's called linear algebra. So when you do matrix
multiplication, you're doing dot products, you're doing multiplies and adds. So that's where you do a lot of those.

So how does the hardware accommodate these complex operations? The strategy that much hardware tends to use is to have separate functional units for complex operations, such as floating-point arithmetic. There may in fact be separate registers, for example, the XMM registers, that only work with floating point. So you have your basic five-stage pipeline, and you have another pipeline that's off on the side, and it's going to take multiple cycles sometimes for things, and it may be pipelined to a different depth. So you basically separate these operations. The functional units may be pipelined fully, partially, or not at all. And so I now have a whole bunch of different functional units, and there are different paths that I'm going to be able to take through the data path of the processor. So in Haswell, they have integer, vector, and floating-point units distributed among eight different ports, which is sort of the entry point.

So given that, things get really complicated. If we go back to our simple diagram, suppose we have all these additional functional units. How can I now exploit more instruction-level parallelism? Right now, we can start up one operation at a time. What might I do to get more parallelism out of the hardware that I've got? What do you think computer architects did? Yeah? So even simpler than that, but which is implied in what you're saying, is you can just fetch and issue multiple instructions per cycle. So rather than just doing one per cycle, as we showed with a typical pipelined processor, let me fetch several that use different parts of the processor pipeline, because they're not going to interfere, to keep everything busy. And so that's basically what's called a superscalar processor, where it's not
executing one thing at a time, it's executing multiple things at a time. So Haswell, in fact, breaks up the instructions into simpler operations called micro-ops, and it can emit four micro-ops per cycle to the rest of the pipeline. And the fetch and decode stages implement optimizations on micro-op processing, including special cases for common patterns. For example, if it sees the XOR of RAX and RAX, it knows that RAX is being set to zero. It doesn't even use a functional unit for that. It just does it, and it's done. It has special logic that observes that, because it's such a common thing to set things to zero. And so that means that now your processor can execute a lot of things at one time, and those are the machines that you're using. That's why, when I said if you save one add instruction, it probably doesn't make any difference in today's processors: there's probably an idle adder lying around. Did I record how many? Where do we go here? Yeah, so if you look here, you can discover that there are actually a bunch of ALUs that are capable of doing an add. They're all over the map in Haswell.

Now, still we are insisting that the processor execute things in order. And that's kind of the next stage: how do you make it so that you can free yourself from the tyranny of one instruction after the other? The first thing is a strategy called bypassing. So suppose that you have an instruction writing into RAX, and then you're going to use that to read. Well, why bother waiting for it to be stored into the register file and then pulled back out for the second instruction? Instead, let's have a bypass, a special circuit that identifies that kind of situation and feeds it directly to the next instruction without requiring that it go into the register file and back out. So that's called bypassing. There are lots of places where
things are bypassed, and we'll talk about it more. So normally you would stall waiting for it to be written back, and now, when you eliminate that, I can move it way forward, because I just use the bypass path to execute, and that allows the second instruction to get going earlier.

What else can we do? Well, let's take a large code example. Given the amount of time, what I'm going to do is basically say you can go through and figure out what are the read-after-write dependences and the write-after-read dependences. They're all over the place. And if you look at what the dependences are that I just flashed through, you can discover, oh, there are all these things. Each one right now has to wait for the previous one before it can get started. But there are some, for example, the first one is just issue order: if it's in order, you can't start the second until you've started the first, that is, until it's finished the first stage. But the other thing here is there's a data dependence between the second and third instructions. If you look at the second and third instructions, they're both using XMM2, and so we're prevented. So one of the questions there is, well, why not do a little bit better by taking a look at this as a graph and figuring out what's the best way through the graph? There are a bunch of tricks you can do there, which I'll run through here very quickly, and you can take a look at these. You can discover that some of these dependences are not real dependences, and as long as you're willing to execute things out of order and keep track of that, it's perfectly fine. If you're not actually dependent on it, then just go ahead and execute it, and then you can advance things. And then the other trick you can use is what's called register renaming. If I want to write to something, but
I have to wait for something else to read from it (the write-after-read dependence), then what I can do is just rename the register, so that I have something to write to that is the same thing. And there's a very complex mechanism called scoreboarding that does that. So anyway, you can take a look at all of these tricks. This is the part I was going to skip over, and indeed I don't have time to do it. You don't have to know any of the details of that part, but it's in there if you're interested. So it does renaming and reordering.

And then the last thing I just want to mention, which is worthwhile, is branch prediction. When you come to a branch, you can have a hazard because the outcome is known too late. And so in that case, what they do is what's called speculative execution, which you've probably heard of. Basically, that says I'm going to guess the outcome of the branch and execute. When a branch is encountered, you assume it's taken; otherwise, you execute normally. If you're right, everything is hunky-dory. If you're wrong, it costs you something: you have to undo that speculative computation, and the effect is sort of like stalling. So you don't want that to happen. A mispredicted branch on Haswell costs about 15 to 20 cycles. Most of the machines use a branch predictor to guess which way the branch is going to go. There's a little bit of stuff here about how you tell whether a branch is going to be predicted or not, and you can take a look at that on your own.

So sorry to rush a little bit at the end, but I knew I wasn't going to get through all of this. It's in the slides when we put them up, and this is really kind of interesting stuff. Once again, remember that I'm dealing with this at one level below what you really need to know, but it is really
helpful to understand that layer, so you have a deep understanding of why certain software optimizations work and don't work. Sound good? OK, good luck on finishing your project ones.
00:33:03,250 645 00:33:03,250 --> 00:33:05,590 646 00:33:05,590 --> 00:33:07,870 647 00:33:07,870 --> 00:33:10,570 648 00:33:10,570 --> 00:33:13,000 649 00:33:13,000 --> 00:33:16,210 650 00:33:16,210 --> 00:33:20,080 651 00:33:20,080 --> 00:33:24,340 652 00:33:24,340 --> 00:33:25,450 653 00:33:25,450 --> 00:33:27,490 654 00:33:27,490 --> 00:33:32,950 655 00:33:32,950 --> 00:33:35,470 656 00:33:35,470 --> 00:33:37,720 657 00:33:37,720 --> 00:33:39,220 658 00:33:39,220 --> 00:33:42,250 659 00:33:42,250 --> 00:33:46,029 660 00:33:46,029 --> 00:33:48,220 661 00:33:48,220 --> 00:33:51,539 662 00:33:51,539 --> 00:33:57,220 663 00:33:57,220 --> 00:34:00,310 664 00:34:00,310 --> 00:34:03,250 665 00:34:03,250 --> 00:34:06,100 666 00:34:06,100 --> 00:34:11,889 667 00:34:11,889 --> 00:34:14,859 668 00:34:14,859 --> 00:34:16,419 669 00:34:16,419 --> 00:34:22,119 670 00:34:22,119 --> 00:34:23,740 671 00:34:23,740 --> 00:34:29,409 672 00:34:29,409 --> 00:34:34,149 673 00:34:34,149 --> 00:34:36,099 674 00:34:36,099 --> 00:34:37,810 675 00:34:37,810 --> 00:34:40,589 676 00:34:40,589 --> 00:34:43,570 677 00:34:43,570 --> 00:34:47,220 678 00:34:47,220 --> 00:34:50,050 679 00:34:50,050 --> 00:34:52,120 680 00:34:52,120 --> 00:34:56,350 681 00:34:56,350 --> 00:34:59,080 682 00:34:59,080 --> 00:35:01,390 683 00:35:01,390 --> 00:35:04,330 684 00:35:04,330 --> 00:35:11,290 685 00:35:11,290 --> 00:35:13,810 686 00:35:13,810 --> 00:35:16,030 687 00:35:16,030 --> 00:35:19,900 688 00:35:19,900 --> 00:35:22,180 689 00:35:22,180 --> 00:35:26,320 690 00:35:26,320 --> 00:35:28,360 691 00:35:28,360 --> 00:35:30,340 692 00:35:30,340 --> 00:35:30,350 693 00:35:30,350 --> 00:35:30,760 694 00:35:30,760 --> 00:35:33,820 695 00:35:33,820 --> 00:35:39,010 696 00:35:39,010 --> 00:35:41,110 697 00:35:41,110 --> 00:35:44,790 698 00:35:44,790 --> 00:35:47,590 699 00:35:47,590 --> 00:35:50,160 700 00:35:50,160 --> 00:35:54,370 701 00:35:54,370 --> 00:35:56,560 702 00:35:56,560 --> 00:36:00,780 703 
00:36:00,780 --> 00:36:05,230 704 00:36:05,230 --> 00:36:09,790 705 00:36:09,790 --> 00:36:12,720 706 00:36:12,720 --> 00:36:16,210 707 00:36:16,210 --> 00:36:18,730 708 00:36:18,730 --> 00:36:21,930 709 00:36:21,930 --> 00:36:26,680 710 00:36:26,680 --> 00:36:27,940 711 00:36:27,940 --> 00:36:31,080 712 00:36:31,080 --> 00:36:32,830 713 00:36:32,830 --> 00:36:32,840 714 00:36:32,840 --> 00:36:33,490 715 00:36:33,490 --> 00:36:37,810 716 00:36:37,810 --> 00:36:40,080 717 00:36:40,080 --> 00:36:43,660 718 00:36:43,660 --> 00:36:47,500 719 00:36:47,500 --> 00:36:50,440 720 00:36:50,440 --> 00:36:53,800 721 00:36:53,800 --> 00:36:56,680 722 00:36:56,680 --> 00:36:59,440 723 00:36:59,440 --> 00:37:02,920 724 00:37:02,920 --> 00:37:05,950 725 00:37:05,950 --> 00:37:07,990 726 00:37:07,990 --> 00:37:12,820 727 00:37:12,820 --> 00:37:16,270 728 00:37:16,270 --> 00:37:19,690 729 00:37:19,690 --> 00:37:23,590 730 00:37:23,590 --> 00:37:27,220 731 00:37:27,220 --> 00:37:29,290 732 00:37:29,290 --> 00:37:32,260 733 00:37:32,260 --> 00:37:34,810 734 00:37:34,810 --> 00:37:37,150 735 00:37:37,150 --> 00:37:39,070 736 00:37:39,070 --> 00:37:41,350 737 00:37:41,350 --> 00:37:44,200 738 00:37:44,200 --> 00:37:48,850 739 00:37:48,850 --> 00:37:51,670 740 00:37:51,670 --> 00:37:52,900 741 00:37:52,900 --> 00:37:55,030 742 00:37:55,030 --> 00:38:01,300 743 00:38:01,300 --> 00:38:02,440 744 00:38:02,440 --> 00:38:05,590 745 00:38:05,590 --> 00:38:09,850 746 00:38:09,850 --> 00:38:12,760 747 00:38:12,760 --> 00:38:15,220 748 00:38:15,220 --> 00:38:17,260 749 00:38:17,260 --> 00:38:20,350 750 00:38:20,350 --> 00:38:21,790 751 00:38:21,790 --> 00:38:23,670 752 00:38:23,670 --> 00:38:26,140 753 00:38:26,140 --> 00:38:28,750 754 00:38:28,750 --> 00:38:30,520 755 00:38:30,520 --> 00:38:34,570 756 00:38:34,570 --> 00:38:36,040 757 00:38:36,040 --> 00:38:37,780 758 00:38:37,780 --> 00:38:39,520 759 00:38:39,520 --> 00:38:41,320 760 00:38:41,320 --> 00:38:43,420 761 00:38:43,420 --> 
00:38:45,490 762 00:38:45,490 --> 00:38:49,180 763 00:38:49,180 --> 00:38:51,550 764 00:38:51,550 --> 00:38:57,100 765 00:38:57,100 --> 00:39:02,760 766 00:39:02,760 --> 00:39:05,590 767 00:39:05,590 --> 00:39:08,310 768 00:39:08,310 --> 00:39:11,430 769 00:39:11,430 --> 00:39:17,010 770 00:39:17,010 --> 00:39:20,440 771 00:39:20,440 --> 00:39:22,150 772 00:39:22,150 --> 00:39:28,680 773 00:39:28,680 --> 00:39:30,820 774 00:39:30,820 --> 00:39:35,080 775 00:39:35,080 --> 00:39:37,360 776 00:39:37,360 --> 00:39:39,580 777 00:39:39,580 --> 00:39:43,870 778 00:39:43,870 --> 00:39:49,920 779 00:39:49,920 --> 00:39:54,460 780 00:39:54,460 --> 00:39:56,710 781 00:39:56,710 --> 00:40:00,160 782 00:40:00,160 --> 00:40:02,050 783 00:40:02,050 --> 00:40:05,620 784 00:40:05,620 --> 00:40:07,330 785 00:40:07,330 --> 00:40:10,240 786 00:40:10,240 --> 00:40:12,460 787 00:40:12,460 --> 00:40:18,610 788 00:40:18,610 --> 00:40:20,680 789 00:40:20,680 --> 00:40:23,950 790 00:40:23,950 --> 00:40:26,530 791 00:40:26,530 --> 00:40:28,180 792 00:40:28,180 --> 00:40:30,100 793 00:40:30,100 --> 00:40:31,780 794 00:40:31,780 --> 00:40:34,840 795 00:40:34,840 --> 00:40:36,460 796 00:40:36,460 --> 00:40:38,290 797 00:40:38,290 --> 00:40:41,740 798 00:40:41,740 --> 00:40:45,970 799 00:40:45,970 --> 00:40:48,970 800 00:40:48,970 --> 00:40:54,660 801 00:40:54,660 --> 00:40:59,230 802 00:40:59,230 --> 00:41:01,420 803 00:41:01,420 --> 00:41:05,260 804 00:41:05,260 --> 00:41:09,700 805 00:41:09,700 --> 00:41:13,480 806 00:41:13,480 --> 00:41:17,790 807 00:41:17,790 --> 00:41:20,860 808 00:41:20,860 --> 00:41:22,440 809 00:41:22,440 --> 00:41:26,410 810 00:41:26,410 --> 00:41:30,400 811 00:41:30,400 --> 00:41:33,670 812 00:41:33,670 --> 00:41:35,829 813 00:41:35,829 --> 00:41:38,260 814 00:41:38,260 --> 00:41:40,870 815 00:41:40,870 --> 00:41:42,280 816 00:41:42,280 --> 00:41:45,670 817 00:41:45,670 --> 00:41:48,040 818 00:41:48,040 --> 00:41:49,900 819 00:41:49,900 --> 00:41:51,609 820 
00:41:51,609 --> 00:41:56,650 821 00:41:56,650 --> 00:41:59,970 822 00:41:59,970 --> 00:42:02,800 823 00:42:02,800 --> 00:42:05,410 824 00:42:05,410 --> 00:42:06,700 825 00:42:06,700 --> 00:42:10,870 826 00:42:10,870 --> 00:42:12,579 827 00:42:12,579 --> 00:42:14,910 828 00:42:14,910 --> 00:42:18,040 829 00:42:18,040 --> 00:42:20,470 830 00:42:20,470 --> 00:42:22,300 831 00:42:22,300 --> 00:42:24,640 832 00:42:24,640 --> 00:42:26,849 833 00:42:26,849 --> 00:42:29,200 834 00:42:29,200 --> 00:42:31,750 835 00:42:31,750 --> 00:42:33,910 836 00:42:33,910 --> 00:42:36,130 837 00:42:36,130 --> 00:42:37,420 838 00:42:37,420 --> 00:42:40,180 839 00:42:40,180 --> 00:42:44,109 840 00:42:44,109 --> 00:42:50,859 841 00:42:50,859 --> 00:42:52,900 842 00:42:52,900 --> 00:42:55,900 843 00:42:55,900 --> 00:42:58,150 844 00:42:58,150 --> 00:43:01,450 845 00:43:01,450 --> 00:43:03,160 846 00:43:03,160 --> 00:43:05,710 847 00:43:05,710 --> 00:43:09,790 848 00:43:09,790 --> 00:43:12,730 849 00:43:12,730 --> 00:43:15,420 850 00:43:15,420 --> 00:43:18,700 851 00:43:18,700 --> 00:43:20,980 852 00:43:20,980 --> 00:43:24,310 853 00:43:24,310 --> 00:43:26,470 854 00:43:26,470 --> 00:43:30,160 855 00:43:30,160 --> 00:43:35,140 856 00:43:35,140 --> 00:43:35,150 857 00:43:35,150 --> 00:43:35,500 858 00:43:35,500 --> 00:43:42,130 859 00:43:42,130 --> 00:43:45,550 860 00:43:45,550 --> 00:43:47,950 861 00:43:47,950 --> 00:43:49,930 862 00:43:49,930 --> 00:43:52,150 863 00:43:52,150 --> 00:43:54,099 864 00:43:54,099 --> 00:43:55,960 865 00:43:55,960 --> 00:43:58,900 866 00:43:58,900 --> 00:44:00,340 867 00:44:00,340 --> 00:44:02,650 868 00:44:02,650 --> 00:44:05,680 869 00:44:05,680 --> 00:44:10,030 870 00:44:10,030 --> 00:44:14,320 871 00:44:14,320 --> 00:44:15,700 872 00:44:15,700 --> 00:44:18,700 873 00:44:18,700 --> 00:44:21,099 874 00:44:21,099 --> 00:44:24,340 875 00:44:24,340 --> 00:44:26,740 876 00:44:26,740 --> 00:44:30,760 877 00:44:30,760 --> 00:44:33,310 878 00:44:33,310 --> 
00:44:38,740 879 00:44:38,740 --> 00:44:53,359 880 00:44:53,359 --> 00:44:56,499 881 00:44:56,499 --> 00:45:01,329 882 00:45:01,329 --> 00:45:12,979 883 00:45:12,979 --> 00:45:20,479 884 00:45:20,479 --> 00:45:22,849 885 00:45:22,849 --> 00:45:24,529 886 00:45:24,529 --> 00:45:31,370 887 00:45:31,370 --> 00:45:33,109 888 00:45:33,109 --> 00:45:34,819 889 00:45:34,819 --> 00:45:37,640 890 00:45:37,640 --> 00:45:39,499 891 00:45:39,499 --> 00:45:44,410 892 00:45:44,410 --> 00:45:46,609 893 00:45:46,609 --> 00:45:49,459 894 00:45:49,459 --> 00:45:53,479 895 00:45:53,479 --> 00:45:55,039 896 00:45:55,039 --> 00:45:58,849 897 00:45:58,849 --> 00:46:00,440 898 00:46:00,440 --> 00:46:03,829 899 00:46:03,829 --> 00:46:08,660 900 00:46:08,660 --> 00:46:11,329 901 00:46:11,329 --> 00:46:13,969 902 00:46:13,969 --> 00:46:17,299 903 00:46:17,299 --> 00:46:21,709 904 00:46:21,709 --> 00:46:23,329 905 00:46:23,329 --> 00:46:28,239 906 00:46:28,239 --> 00:46:34,609 907 00:46:34,609 --> 00:46:36,920 908 00:46:36,920 --> 00:46:39,259 909 00:46:39,259 --> 00:46:41,930 910 00:46:41,930 --> 00:46:44,839 911 00:46:44,839 --> 00:46:48,920 912 00:46:48,920 --> 00:46:52,779 913 00:46:52,779 --> 00:46:56,660 914 00:46:56,660 --> 00:47:00,440 915 00:47:00,440 --> 00:47:02,570 916 00:47:02,570 --> 00:47:04,790 917 00:47:04,790 --> 00:47:07,370 918 00:47:07,370 --> 00:47:14,740 919 00:47:14,740 --> 00:47:17,030 920 00:47:17,030 --> 00:47:20,390 921 00:47:20,390 --> 00:47:23,960 922 00:47:23,960 --> 00:47:26,150 923 00:47:26,150 --> 00:47:28,580 924 00:47:28,580 --> 00:47:31,430 925 00:47:31,430 --> 00:47:32,720 926 00:47:32,720 --> 00:47:34,640 927 00:47:34,640 --> 00:47:39,160 928 00:47:39,160 --> 00:47:42,800 929 00:47:42,800 --> 00:47:44,860 930 00:47:44,860 --> 00:47:47,900 931 00:47:47,900 --> 00:47:49,850 932 00:47:49,850 --> 00:47:51,470 933 00:47:51,470 --> 00:47:53,800 934 00:47:53,800 --> 00:47:58,460 935 00:47:58,460 --> 00:48:01,670 936 00:48:01,670 --> 00:48:04,700 937 
00:48:04,700 --> 00:48:07,820 938 00:48:07,820 --> 00:48:09,770 939 00:48:09,770 --> 00:48:11,330 940 00:48:11,330 --> 00:48:13,280 941 00:48:13,280 --> 00:48:15,830 942 00:48:15,830 --> 00:48:17,690 943 00:48:17,690 --> 00:48:20,120 944 00:48:20,120 --> 00:48:21,980 945 00:48:21,980 --> 00:48:25,250 946 00:48:25,250 --> 00:48:28,430 947 00:48:28,430 --> 00:48:35,150 948 00:48:35,150 --> 00:48:38,020 949 00:48:38,020 --> 00:48:40,160 950 00:48:40,160 --> 00:48:43,040 951 00:48:43,040 --> 00:48:44,210 952 00:48:44,210 --> 00:48:47,180 953 00:48:47,180 --> 00:48:49,130 954 00:48:49,130 --> 00:48:51,440 955 00:48:51,440 --> 00:48:54,050 956 00:48:54,050 --> 00:48:55,940 957 00:48:55,940 --> 00:48:58,930 958 00:48:58,930 --> 00:49:01,460 959 00:49:01,460 --> 00:49:05,270 960 00:49:05,270 --> 00:49:07,370 961 00:49:07,370 --> 00:49:09,590 962 00:49:09,590 --> 00:49:14,150 963 00:49:14,150 --> 00:49:15,680 964 00:49:15,680 --> 00:49:19,309 965 00:49:19,309 --> 00:49:20,750 966 00:49:20,750 --> 00:49:22,280 967 00:49:22,280 --> 00:49:24,280 968 00:49:24,280 --> 00:49:26,960 969 00:49:26,960 --> 00:49:28,849 970 00:49:28,849 --> 00:49:33,190 971 00:49:33,190 --> 00:49:35,480 972 00:49:35,480 --> 00:49:37,550 973 00:49:37,550 --> 00:49:39,980 974 00:49:39,980 --> 00:49:41,599 975 00:49:41,599 --> 00:49:42,579 976 00:49:42,579 --> 00:49:45,559 977 00:49:45,559 --> 00:49:47,809 978 00:49:47,809 --> 00:49:49,849 979 00:49:49,849 --> 00:49:52,910 980 00:49:52,910 --> 00:49:58,099 981 00:49:58,099 --> 00:50:00,140 982 00:50:00,140 --> 00:50:01,579 983 00:50:01,579 --> 00:50:04,190 984 00:50:04,190 --> 00:50:07,130 985 00:50:07,130 --> 00:50:09,950 986 00:50:09,950 --> 00:50:11,809 987 00:50:11,809 --> 00:50:14,780 988 00:50:14,780 --> 00:50:20,329 989 00:50:20,329 --> 00:50:23,930 990 00:50:23,930 --> 00:50:25,250 991 00:50:25,250 --> 00:50:29,510 992 00:50:29,510 --> 00:50:31,760 993 00:50:31,760 --> 00:50:37,640 994 00:50:37,640 --> 00:50:39,079 995 00:50:39,079 --> 
00:50:42,500 996 00:50:42,500 --> 00:50:44,359 997 00:50:44,359 --> 00:50:47,170 998 00:50:47,170 --> 00:50:50,359 999 00:50:50,359 --> 00:50:52,309 1000 00:50:52,309 --> 00:50:57,020 1001 00:50:57,020 --> 00:50:58,960 1002 00:50:58,960 --> 00:51:01,040 1003 00:51:01,040 --> 00:51:02,300 1004 00:51:02,300 --> 00:51:03,980 1005 00:51:03,980 --> 00:51:08,120 1006 00:51:08,120 --> 00:51:11,359 1007 00:51:11,359 --> 00:51:14,120 1008 00:51:14,120 --> 00:51:17,240 1009 00:51:17,240 --> 00:51:21,410 1010 00:51:21,410 --> 00:51:23,900 1011 00:51:23,900 --> 00:51:26,990 1012 00:51:26,990 --> 00:51:29,420 1013 00:51:29,420 --> 00:51:29,430 1014 00:51:29,430 --> 00:51:30,050 1015 00:51:30,050 --> 00:51:33,560 1016 00:51:33,560 --> 00:51:35,480 1017 00:51:35,480 --> 00:51:37,640 1018 00:51:37,640 --> 00:51:39,950 1019 00:51:39,950 --> 00:51:41,780 1020 00:51:41,780 --> 00:51:43,670 1021 00:51:43,670 --> 00:51:47,000 1022 00:51:47,000 --> 00:51:49,430 1023 00:51:49,430 --> 00:51:52,910 1024 00:51:52,910 --> 00:51:56,750 1025 00:51:56,750 --> 00:52:00,020 1026 00:52:00,020 --> 00:52:04,640 1027 00:52:04,640 --> 00:52:07,430 1028 00:52:07,430 --> 00:52:09,290 1029 00:52:09,290 --> 00:52:12,500 1030 00:52:12,500 --> 00:52:15,740 1031 00:52:15,740 --> 00:52:17,510 1032 00:52:17,510 --> 00:52:18,740 1033 00:52:18,740 --> 00:52:20,480 1034 00:52:20,480 --> 00:52:22,820 1035 00:52:22,820 --> 00:52:26,000 1036 00:52:26,000 --> 00:52:29,320 1037 00:52:29,320 --> 00:52:32,120 1038 00:52:32,120 --> 00:52:34,100 1039 00:52:34,100 --> 00:52:36,940 1040 00:52:36,940 --> 00:52:39,350 1041 00:52:39,350 --> 00:52:41,930 1042 00:52:41,930 --> 00:52:46,070 1043 00:52:46,070 --> 00:52:48,980 1044 00:52:48,980 --> 00:52:52,220 1045 00:52:52,220 --> 00:52:55,220 1046 00:52:55,220 --> 00:52:57,260 1047 00:52:57,260 --> 00:52:59,930 1048 00:52:59,930 --> 00:53:02,630 1049 00:53:02,630 --> 00:53:05,360 1050 00:53:05,360 --> 00:53:08,180 1051 00:53:08,180 --> 00:53:09,590 1052 00:53:09,590 --> 
00:53:12,650 1053 00:53:12,650 --> 00:53:14,000 1054 00:53:14,000 --> 00:53:16,790 1055 00:53:16,790 --> 00:53:19,670 1056 00:53:19,670 --> 00:53:23,450 1057 00:53:23,450 --> 00:53:25,690 1058 00:53:25,690 --> 00:53:29,840 1059 00:53:29,840 --> 00:53:32,420 1060 00:53:32,420 --> 00:53:34,340 1061 00:53:34,340 --> 00:53:38,330 1062 00:53:38,330 --> 00:53:40,550 1063 00:53:40,550 --> 00:53:42,690 1064 00:53:42,690 --> 00:53:44,880 1065 00:53:44,880 --> 00:53:48,030 1066 00:53:48,030 --> 00:53:50,100 1067 00:53:50,100 --> 00:53:52,320 1068 00:53:52,320 --> 00:53:54,270 1069 00:53:54,270 --> 00:53:58,080 1070 00:53:58,080 --> 00:54:05,130 1071 00:54:05,130 --> 00:54:06,990 1072 00:54:06,990 --> 00:54:10,620 1073 00:54:10,620 --> 00:54:13,200 1074 00:54:13,200 --> 00:54:14,340 1075 00:54:14,340 --> 00:54:16,560 1076 00:54:16,560 --> 00:54:18,150 1077 00:54:18,150 --> 00:54:20,630 1078 00:54:20,630 --> 00:54:24,900 1079 00:54:24,900 --> 00:54:28,590 1080 00:54:28,590 --> 00:54:31,890 1081 00:54:31,890 --> 00:54:34,350 1082 00:54:34,350 --> 00:54:37,560 1083 00:54:37,560 --> 00:54:42,180 1084 00:54:42,180 --> 00:54:45,270 1085 00:54:45,270 --> 00:54:47,700 1086 00:54:47,700 --> 00:54:49,530 1087 00:54:49,530 --> 00:54:55,700 1088 00:54:55,700 --> 00:54:58,590 1089 00:54:58,590 --> 00:55:02,010 1090 00:55:02,010 --> 00:55:04,140 1091 00:55:04,140 --> 00:55:05,430 1092 00:55:05,430 --> 00:55:08,070 1093 00:55:08,070 --> 00:55:10,740 1094 00:55:10,740 --> 00:55:12,780 1095 00:55:12,780 --> 00:55:16,860 1096 00:55:16,860 --> 00:55:21,770 1097 00:55:21,770 --> 00:55:23,970 1098 00:55:23,970 --> 00:55:25,950 1099 00:55:25,950 --> 00:55:27,350 1100 00:55:27,350 --> 00:55:31,260 1101 00:55:31,260 --> 00:55:35,040 1102 00:55:35,040 --> 00:55:39,360 1103 00:55:39,360 --> 00:55:41,700 1104 00:55:41,700 --> 00:55:44,250 1105 00:55:44,250 --> 00:55:46,590 1106 00:55:46,590 --> 00:55:52,020 1107 00:55:52,020 --> 00:55:54,490 1108 00:55:54,490 --> 00:55:58,760 1109 00:55:58,760 --> 
00:56:01,849 1110 00:56:01,849 --> 00:56:04,099 1111 00:56:04,099 --> 00:56:07,580 1112 00:56:07,580 --> 00:56:09,020 1113 00:56:09,020 --> 00:56:12,170 1114 00:56:12,170 --> 00:56:13,460 1115 00:56:13,460 --> 00:56:17,540 1116 00:56:17,540 --> 00:56:22,240 1117 00:56:22,240 --> 00:56:26,630 1118 00:56:26,630 --> 00:56:28,280 1119 00:56:28,280 --> 00:56:30,740 1120 00:56:30,740 --> 00:56:32,839 1121 00:56:32,839 --> 00:56:34,940 1122 00:56:34,940 --> 00:56:36,230 1123 00:56:36,230 --> 00:56:38,480 1124 00:56:38,480 --> 00:56:40,460 1125 00:56:40,460 --> 00:56:42,140 1126 00:56:42,140 --> 00:56:44,480 1127 00:56:44,480 --> 00:56:51,470 1128 00:56:51,470 --> 00:56:54,950 1129 00:56:54,950 --> 00:56:58,640 1130 00:56:58,640 --> 00:57:01,609 1131 00:57:01,609 --> 00:57:03,530 1132 00:57:03,530 --> 00:57:06,530 1133 00:57:06,530 --> 00:57:09,109 1134 00:57:09,109 --> 00:57:10,940 1135 00:57:10,940 --> 00:57:13,760 1136 00:57:13,760 --> 00:57:16,190 1137 00:57:16,190 --> 00:57:19,400 1138 00:57:19,400 --> 00:57:22,849 1139 00:57:22,849 --> 00:57:24,710 1140 00:57:24,710 --> 00:57:26,839 1141 00:57:26,839 --> 00:57:31,160 1142 00:57:31,160 --> 00:57:34,130 1143 00:57:34,130 --> 00:57:37,640 1144 00:57:37,640 --> 00:57:39,620 1145 00:57:39,620 --> 00:57:41,210 1146 00:57:41,210 --> 00:57:42,800 1147 00:57:42,800 --> 00:57:46,010 1148 00:57:46,010 --> 00:57:49,790 1149 00:57:49,790 --> 00:57:52,730 1150 00:57:52,730 --> 00:57:54,670 1151 00:57:54,670 --> 00:57:58,400 1152 00:57:58,400 --> 00:58:00,349 1153 00:58:00,349 --> 00:58:02,810 1154 00:58:02,810 --> 00:58:05,480 1155 00:58:05,480 --> 00:58:07,340 1156 00:58:07,340 --> 00:58:08,990 1157 00:58:08,990 --> 00:58:11,120 1158 00:58:11,120 --> 00:58:13,100 1159 00:58:13,100 --> 00:58:16,220 1160 00:58:16,220 --> 00:58:19,610 1161 00:58:19,610 --> 00:58:22,280 1162 00:58:22,280 --> 00:58:24,080 1163 00:58:24,080 --> 00:58:26,570 1164 00:58:26,570 --> 00:58:28,580 1165 00:58:28,580 --> 00:58:33,350 1166 00:58:33,350 --> 
00:58:37,490 1167 00:58:37,490 --> 00:58:40,850 1168 00:58:40,850 --> 00:58:45,800 1169 00:58:45,800 --> 00:58:49,220 1170 00:58:49,220 --> 00:58:50,870 1171 00:58:50,870 --> 00:58:53,420 1172 00:58:53,420 --> 00:58:55,310 1173 00:58:55,310 --> 00:58:58,040 1174 00:58:58,040 --> 00:59:01,060 1175 00:59:01,060 --> 00:59:03,590 1176 00:59:03,590 --> 00:59:04,880 1177 00:59:04,880 --> 00:59:07,150 1178 00:59:07,150 --> 00:59:11,180 1179 00:59:11,180 --> 00:59:12,890 1180 00:59:12,890 --> 00:59:15,050 1181 00:59:15,050 --> 00:59:17,060 1182 00:59:17,060 --> 00:59:18,980 1183 00:59:18,980 --> 00:59:21,350 1184 00:59:21,350 --> 00:59:23,360 1185 00:59:23,360 --> 00:59:26,420 1186 00:59:26,420 --> 00:59:29,180 1187 00:59:29,180 --> 00:59:30,770 1188 00:59:30,770 --> 00:59:32,660 1189 00:59:32,660 --> 00:59:35,080 1190 00:59:35,080 --> 00:59:37,880 1191 00:59:37,880 --> 00:59:39,050 1192 00:59:39,050 --> 00:59:44,480 1193 00:59:44,480 --> 00:59:47,180 1194 00:59:47,180 --> 00:59:50,210 1195 00:59:50,210 --> 00:59:54,860 1196 00:59:54,860 --> 00:59:56,660 1197 00:59:56,660 --> 00:59:59,210 1198 00:59:59,210 --> 01:00:01,340 1199 01:00:01,340 --> 01:00:03,590 1200 01:00:03,590 --> 01:00:07,970 1201 01:00:07,970 --> 01:00:09,680 1202 01:00:09,680 --> 01:00:12,200 1203 01:00:12,200 --> 01:00:13,700 1204 01:00:13,700 --> 01:00:16,070 1205 01:00:16,070 --> 01:00:18,020 1206 01:00:18,020 --> 01:00:20,420 1207 01:00:20,420 --> 01:00:21,860 1208 01:00:21,860 --> 01:00:23,840 1209 01:00:23,840 --> 01:00:25,640 1210 01:00:25,640 --> 01:00:27,170 1211 01:00:27,170 --> 01:00:28,640 1212 01:00:28,640 --> 01:00:32,180 1213 01:00:32,180 --> 01:00:33,980 1214 01:00:33,980 --> 01:00:35,510 1215 01:00:35,510 --> 01:00:40,310 1216 01:00:40,310 --> 01:00:44,300 1217 01:00:44,300 --> 01:00:47,810 1218 01:00:47,810 --> 01:00:49,370 1219 01:00:49,370 --> 01:00:53,300 1220 01:00:53,300 --> 01:00:56,480 1221 01:00:56,480 --> 01:00:58,460 1222 01:00:58,460 --> 01:01:02,570 1223 01:01:02,570 --> 
01:01:05,750 1224 01:01:05,750 --> 01:01:07,850 1225 01:01:07,850 --> 01:01:11,390 1226 01:01:11,390 --> 01:01:13,490 1227 01:01:13,490 --> 01:01:15,920 1228 01:01:15,920 --> 01:01:18,140 1229 01:01:18,140 --> 01:01:20,960 1230 01:01:20,960 --> 01:01:24,770 1231 01:01:24,770 --> 01:01:27,140 1232 01:01:27,140 --> 01:01:28,940 1233 01:01:28,940 --> 01:01:30,380 1234 01:01:30,380 --> 01:01:33,580 1235 01:01:33,580 --> 01:01:35,690 1236 01:01:35,690 --> 01:01:41,900 1237 01:01:41,900 --> 01:01:43,160 1238 01:01:43,160 --> 01:01:45,080 1239 01:01:45,080 --> 01:01:49,160 1240 01:01:49,160 --> 01:01:51,440 1241 01:01:51,440 --> 01:01:54,110 1242 01:01:54,110 --> 01:01:57,230 1243 01:01:57,230 --> 01:01:59,450 1244 01:01:59,450 --> 01:02:01,910 1245 01:02:01,910 --> 01:02:03,890 1246 01:02:03,890 --> 01:02:06,170 1247 01:02:06,170 --> 01:02:11,330 1248 01:02:11,330 --> 01:02:12,410 1249 01:02:12,410 --> 01:02:14,990 1250 01:02:14,990 --> 01:02:16,850 1251 01:02:16,850 --> 01:02:19,250 1252 01:02:19,250 --> 01:02:21,410 1253 01:02:21,410 --> 01:02:24,700 1254 01:02:24,700 --> 01:02:27,880 1255 01:02:27,880 --> 01:02:30,920 1256 01:02:30,920 --> 01:02:34,019 1257 01:02:34,019 --> 01:02:36,269 1258 01:02:36,269 --> 01:02:38,909 1259 01:02:38,909 --> 01:02:40,889 1260 01:02:40,889 --> 01:02:42,359 1261 01:02:42,359 --> 01:02:45,239 1262 01:02:45,239 --> 01:02:50,699 1263 01:02:50,699 --> 01:02:53,309 1264 01:02:53,309 --> 01:02:55,129 1265 01:02:55,129 --> 01:02:58,079 1266 01:02:58,079 --> 01:02:59,819 1267 01:02:59,819 --> 01:03:03,809 1268 01:03:03,809 --> 01:03:06,419 1269 01:03:06,419 --> 01:03:08,369 1270 01:03:08,369 --> 01:03:11,939 1271 01:03:11,939 --> 01:03:13,469 1272 01:03:13,469 --> 01:03:15,719 1273 01:03:15,719 --> 01:03:18,869 1274 01:03:18,869 --> 01:03:21,419 1275 01:03:21,419 --> 01:03:24,719 1276 01:03:24,719 --> 01:03:27,779 1277 01:03:27,779 --> 01:03:29,759 1278 01:03:29,759 --> 01:03:32,849 1279 01:03:32,849 --> 01:03:34,259 1280 01:03:34,259 --> 
01:03:36,599 1281 01:03:36,599 --> 01:03:37,979 1282 01:03:37,979 --> 01:03:41,429 1283 01:03:41,429 --> 01:03:46,499 1284 01:03:46,499 --> 01:03:48,419 1285 01:03:48,419 --> 01:03:50,159 1286 01:03:50,159 --> 01:03:51,719 1287 01:03:51,719 --> 01:03:54,199 1288 01:03:54,199 --> 01:03:56,039 1289 01:03:56,039 --> 01:03:57,539 1290 01:03:57,539 --> 01:03:59,849 1291 01:03:59,849 --> 01:04:01,919 1292 01:04:01,919 --> 01:04:03,539 1293 01:04:03,539 --> 01:04:04,919 1294 01:04:04,919 --> 01:04:09,989 1295 01:04:09,989 --> 01:04:13,099 1296 01:04:13,099 --> 01:04:15,539 1297 01:04:15,539 --> 01:04:18,839 1298 01:04:18,839 --> 01:04:21,569 1299 01:04:21,569 --> 01:04:23,639 1300 01:04:23,639 --> 01:04:28,019 1301 01:04:28,019 --> 01:04:31,109 1302 01:04:31,109 --> 01:04:33,269 1303 01:04:33,269 --> 01:04:35,159 1304 01:04:35,159 --> 01:04:39,809 1305 01:04:39,809 --> 01:04:42,059 1306 01:04:42,059 --> 01:04:44,099 1307 01:04:44,099 --> 01:04:45,479 1308 01:04:45,479 --> 01:04:47,299 1309 01:04:47,299 --> 01:04:49,109 1310 01:04:49,109 --> 01:04:51,660 1311 01:04:51,660 --> 01:04:53,609 1312 01:04:53,609 --> 01:04:56,069 1313 01:04:56,069 --> 01:04:59,670 1314 01:04:59,670 --> 01:05:02,280 1315 01:05:02,280 --> 01:05:04,460 1316 01:05:04,460 --> 01:05:07,109 1317 01:05:07,109 --> 01:05:11,370 1318 01:05:11,370 --> 01:05:14,670 1319 01:05:14,670 --> 01:05:18,150 1320 01:05:18,150 --> 01:05:22,230 1321 01:05:22,230 --> 01:05:25,079 1322 01:05:25,079 --> 01:05:27,270 1323 01:05:27,270 --> 01:05:33,390 1324 01:05:33,390 --> 01:05:36,569 1325 01:05:36,569 --> 01:05:38,579 1326 01:05:38,579 --> 01:05:41,849 1327 01:05:41,849 --> 01:05:43,140 1328 01:05:43,140 --> 01:05:45,000 1329 01:05:45,000 --> 01:05:49,079 1330 01:05:49,079 --> 01:05:54,299 1331 01:05:54,299 --> 01:05:57,120 1332 01:05:57,120 --> 01:06:03,930 1333 01:06:03,930 --> 01:06:03,940 1334 01:06:03,940 --> 01:06:04,440 1335 01:06:04,440 --> 01:06:06,809 1336 01:06:06,809 --> 01:06:11,789 1337 01:06:11,789 --> 
01:06:19,900 1338 01:06:19,900 --> 01:06:22,400 1339 01:06:22,400 --> 01:06:24,250 1340 01:06:24,250 --> 01:06:29,060 1341 01:06:29,060 --> 01:06:30,890 1342 01:06:30,890 --> 01:06:32,270 1343 01:06:32,270 --> 01:06:34,340 1344 01:06:34,340 --> 01:06:38,360 1345 01:06:38,360 --> 01:06:42,230 1346 01:06:42,230 --> 01:06:43,790 1347 01:06:43,790 --> 01:06:50,500 1348 01:06:50,500 --> 01:06:52,940 1349 01:06:52,940 --> 01:06:54,770 1350 01:06:54,770 --> 01:06:57,350 1351 01:06:57,350 --> 01:06:59,950 1352 01:06:59,950 --> 01:06:59,960 1353 01:06:59,960 --> 01:07:01,280 1354 01:07:01,280 --> 01:07:03,620 1355 01:07:03,620 --> 01:07:06,800 1356 01:07:06,800 --> 01:07:08,750 1357 01:07:08,750 --> 01:07:10,640 1358 01:07:10,640 --> 01:07:13,280 1359 01:07:13,280 --> 01:07:15,530 1360 01:07:15,530 --> 01:07:22,040 1361 01:07:22,040 --> 01:07:23,480 1362 01:07:23,480 --> 01:07:25,430 1363 01:07:25,430 --> 01:07:33,080 1364 01:07:33,080 --> 01:07:34,720 1365 01:07:34,720 --> 01:07:40,670 1366 01:07:40,670 --> 01:07:43,970 1367 01:07:43,970 --> 01:07:46,010 1368 01:07:46,010 --> 01:07:48,590 1369 01:07:48,590 --> 01:07:50,990 1370 01:07:50,990 --> 01:07:53,420 1371 01:07:53,420 --> 01:07:55,310 1372 01:07:55,310 --> 01:07:59,360 1373 01:07:59,360 --> 01:08:01,520 1374 01:08:01,520 --> 01:08:03,740 1375 01:08:03,740 --> 01:08:05,360 1376 01:08:05,360 --> 01:08:08,110 1377 01:08:08,110 --> 01:08:10,820 1378 01:08:10,820 --> 01:08:12,830 1379 01:08:12,830 --> 01:08:14,780 1380 01:08:14,780 --> 01:08:16,610 1381 01:08:16,610 --> 01:08:18,410 1382 01:08:18,410 --> 01:08:20,390 1383 01:08:20,390 --> 01:08:23,150 1384 01:08:23,150 --> 01:08:26,540 1385 01:08:26,540 --> 01:08:30,740 1386 01:08:30,740 --> 01:08:33,410 1387 01:08:33,410 --> 01:08:35,840 1388 01:08:35,840 --> 01:08:40,459 1389 01:08:40,459 --> 01:08:44,059 1390 01:08:44,059 --> 01:08:46,130 1391 01:08:46,130 --> 01:08:48,410 1392 01:08:48,410 --> 01:08:52,910 1393 01:08:52,910 --> 01:08:55,220 1394 01:08:55,220 --> 