Transcript

Okay, well, if you have any questions you'd like to ask before I start today... If you had a problem reaching the recorded lectures on the Internet: I've been told there was a problem with the link that had been posted on the course website. That link took you directly to the login page for WebCT, but apparently you have to start a few pages earlier, otherwise you don't get in. So the link has been updated, and hopefully you won't have a problem getting to the recorded lectures starting from the new link.

Okay, so we were looking at improving the performance of merge sort, and to that end we developed the tournament tree structure last time. As we said, one of the applications of the tournament tree was merging; another application we saw last time was first-fit bin packing. Today we're going to look at another aspect of merge sort that we want to improve on, and that's the run generation aspect. Here we said that one thing we could try to do is overlap the time spent inputting data from the disk, the time spent outputting data, and the time spent internally generating the sorted runs. The other thing we wanted to do is reduce the number of runs that are being generated.

To do the first item, overlap, you need to have hardware that supports overlap of those three operations. In particular, we're going to need two disks: say the lower disk is the one from which we are reading, and the upper disk is the one to which we are writing. The hardware also has to support what we might call a non-blocking read from a disk: you simply issue a command which says "read this block from the disk," and once you've issued the command you can go ahead and do other things; you don't wait for
that block to actually come in. Similarly, we assume we can do a non-blocking write: you issue a command which says "write this buffer load out from memory," and while the computer is busy writing it, you can do other things. So we have non-blocking reads and writes, and then you have to have synchronization commands which say "now wait until the read and write that are going on finish." So we assume that you have these capabilities.

All right, so let's see first how we might meet some of these objectives by using a sort scheme that we already know of. Let's say we have quicksort, and we say, well, I'd like to overlap input/output and internal processing. Now in quicksort we need to first select a pivot element, and if you're using the median-of-3 rule, then to select the pivot you need to know the leftmost item, the item in the middle, and the item at the end. But since all of this data is sitting out on a disk, in order for you to be able to select the pivot quickly, you would read the blocks in in a different order than the one dictated by trying to get in the whole data set that you're trying to sort. So you might first read in the leftmost block of data, read in a middle block of data, and then read in the last block of data; from those three blocks you can select the first, middle, and last record and find the pivot. But still, the very first step, finding the pivot, would have to wait until you have read in three blocks of data.

All right, then, once you've found the pivot, in order to do the partitioning, you kind of sweep from the left to the right and you also sweep from the right to the left, looking for a swappable pair. So, in order
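As an aside (my own construction, not from the lecture), the non-blocking read, non-blocking write, and synchronization commands just described can be sketched with Python threads: lists stand in for the two disks, `start()` plays the role of issuing the I/O command, and `join()` is the synchronization point that waits for the in-flight operations to finish.

```python
import threading

# Toy model of overlapped I/O: issue a non-blocking read and a non-blocking
# write, do internal processing while both are in flight, then synchronize.
# The "disks" are just lists; real code would issue asynchronous disk I/O.

def read_block(disk, index, dest):
    dest.append(disk[index])            # stand-in for a slow disk read

def write_block(disk, block):
    disk.append(block)                  # stand-in for a slow disk write

input_disk = [[3, 1, 2], [6, 5, 4]]    # two block loads on the input disk
output_disk = []
in_buf = []

# Issue the commands; control returns immediately, as in a non-blocking call.
reader = threading.Thread(target=read_block, args=(input_disk, 0, in_buf))
writer = threading.Thread(target=write_block, args=(output_disk, [7, 8, 9]))
reader.start()
writer.start()

internal_result = sorted([5, 2, 9])    # internal processing, overlapped

# Synchronization point: wait until the read and the write have finished.
reader.join()
writer.join()
```

After the two `join()` calls, the newly read buffer load is available and the written buffer is free to be reused, which is exactly the state the schemes below rely on at each synchronization point.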
to do the sweep, it's necessary for you to have these blocks of data in memory, and so to facilitate the sweep you might read your blocks in in the order in which the sweep is going to access them. So in the first phase of the quicksort, which is select the pivot and then partition, you can achieve some limited amount of overlapping of input/output and internal processing, but in order to finish this first phase you must have read in all of the records that are going to be involved in the quicksort. After that you're going to sort the two parts recursively, and at that time there won't be any input or output taking place until you have, let's say, generated the leftmost sorted block; at that time you can begin to write that block out. So you recursively sort the left part and the right part; once you've done the leftmost block you can write that out, and then you could be working on the next block and writing that out, and while you're writing a block out you could be reading in the next block of data for the next quicksort. So you can get some amount of overlapping between input, output, and internal processing here, but it takes somewhat tricky management of the buffers. So that would be one strategy, but this of course would not increase the length of your runs; you'd still be limited by the size of memory. You're not generating runs whose length is more than the memory capacity, though you would be overlapping some of the input/output and internal processing.

An alternative strategy, which is somewhat simpler, is to take the memory that you have and partition it into three buffers, b1, b2, and b3. These buffers are rather large; in
previous examples our buffers have tended to be one block long, but now these buffers are pretty big. The strategy here, in the steady state, is that we may be sorting one of these buffers internally, say using an internal sort on b1; while you are internally sorting that one, you may be writing another one out (that one was sorted on a previous pass), and you might be filling the third one with many blocks from the disk. So a steady-state diagram would look like this: you're reading a whole bunch of blocks from the disk into one of the three buffers; while this is taking place, you're writing out many blocks of data to the disk from one of the other buffers; and while these two activities are taking place, you're sorting the third buffer. Then you have a synchronization point where you wait for these three activities to finish, and once they finish you change the roles of the buffers: the one that you were reading into becomes the one that you're going to be sorting next, the one you were sorting becomes the one you're going to be writing next, and the one you were writing becomes the one you're going to be inputting into next. So you have a synchronization point, and once you synchronize and all these activities are done, you reassign roles to the buffers and repeat the process.

All right, so the buffer management becomes a lot easier here, and it's easier to see that you're getting a lot of parallelism. However, what's happening here is that each of your buffers is one-third the size of memory. So while we have a simple strategy to effectively overlap input/output and internal processing, the size of the runs is one-third
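The buffer-role rotation just described can be sketched as a toy, sequential model (names and structure are mine; a real implementation would run the three roles concurrently with non-blocking I/O):

```python
# Toy sketch of the three-buffer scheme: at each step one buffer is being
# filled from the input, one is being sorted, and one (sorted on the
# previous pass) is written out. After each synchronization point the
# roles rotate: filled -> to be sorted -> to be written.

def three_buffer_runs(blocks):
    """Generate one sorted run per buffer load, rotating buffer roles."""
    runs = []
    filling, sorting, writing = None, None, None
    for block in blocks + [None, None]:       # two extra steps to drain the pipeline
        if writing is not None:
            runs.append(writing)              # "write" the previously sorted buffer
        writing = sorted(sorting) if sorting is not None else None
        sorting = filling                     # the buffer just filled is sorted next
        filling = block                       # start "reading" the next buffer load
    if writing is not None:
        runs.append(writing)
    return runs
```

Note that each run produced here is a single buffer load, one-third of memory, which is exactly the loss the lecture points out next.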
of what it was when we used our simplistic scheme. So we've lost out on the size of the runs.

All right, so let's try yet another strategy. Here the buffers will be made one block long, and we'll have two input buffers and two output buffers. We assume we have enough memory for four buffers plus some additional stuff; typically the amount of memory you'll need for four buffers is small, so you'd have a lot of additional memory left. This additional memory will be used in our example to maintain a loser tree. Certainly you could use other data structures here, other than a loser tree, and get the same asymptotic performance as we're going to get with a loser tree; the difference would be some constant factors. All right, so we take our memory and partition it into four buffers, two input and two output, and the balance of the memory will be used to build a loser tree. Typically the loser tree here is going to be a large loser tree, in that the number of players involved could be in the thousands or tens of thousands.

The steady-state operation is similar to the diagram we had before. We will be reading from the disk into one of the two input buffers; while that read is taking place, we'll be writing one of the two output buffers to the disk; and while this read and write are taking place, we'll be generating runs into the other output buffer, and those runs will be generated from the other input buffer. So the input buffer that's not being read into here, and the output buffer that's not being written out here, will be active over here in the run generation. And then, when this read finishes, this write finishes, and a buffer load of output is generated here,
we will synchronize, and at that time we will swap the roles of the input buffers and the output buffers. Okay, let's see this as an example; examples give you a better idea of how things are working. So I've got my two input buffers down here, I've got two output buffers up there, and then I've got a loser tree that I'm going to build in the middle. I start by reading a bunch of records from the disk, and those are used to fill the player nodes of the loser tree. Once I fill the player nodes, I then initiate a read to fill a buffer; so I'm going to start reading into this buffer here, and while this read is taking place, I'm going to initialize my loser tree.

So I start playing matches as we did before. You play a match out here: the four loses, the smaller one moves forward and waits over there. You play a match here between the six and the eight: the eight loses, the six comes here to play; the six loses, the three moves up there to play. And while this is going on, this buffer is getting filled from the disk. So the one and the five will play: the five loses, the one moves up and waits. The seven and the three will play: the seven loses, the three moves up in place of the one and loses; the one moves up, plays with the three and wins, and then the one moves up and waits out there. Then the two and the six will play: the six loses, the two waits. The nine and the four play: the nine loses, the four goes up in place, plays with the two, and loses; the two moves up and waits. And then we start out here: the five and the two will play; the five loses, the two goes up. Then the five and the eight play: the eight loses, the five plays with the two and loses; the
two moves up and plays with the one, and the one is the overall winner. So I initialize the tree; while I'm initializing the tree, this buffer is being read into, and now I come to a synchronization point: I'm done with what I had to do, so I wait for this input to finish. All right, so the input's finished; I've got a buffer load — in this example a buffer load is three records. Then I assign roles to the buffers: I initiate a non-blocking read into i1, and at this time I have nothing to write, so I do not initiate a write. I assign this buffer as the buffer into which I'm going to generate output, and this is the buffer from which I will feed my loser tree.

So now what I'll do is take the winner, which was the one, and move it from its player node into the output buffer. So the one moves into the output buffer, and then I'm going to replace the player here with the next fellow in the active input buffer; that's the three. So the three is going to enter the tree, and the tree gets readjusted. When the three enters the tree, I need to first decide whether or not it's possible for me to output the three as part of the current run that's being generated. So I give my runs numbers. First I'm generating, let's say, run number one. Run number one already has a key value one that's been output into this buffer, so the succeeding keys have to be at least one. If the record that you're moving from here to here had a value of 0, then you should not put it out as part of run one; it's too late. And you have to be careful, because if you put a zero here and you play matches, the zero will win and come out right away, and you will not get a sorted sequence. All right, so when a player enters the tournament
tree, you need to assign it a run number: it's either going to be part of the current run or part of the next run. The three can be part of the current run, because its value is not smaller than the last fellow that's gone out. All right, so we take the three out; it enters the tree as part of the current run, and it plays matches up toward the root. The three will play with the five and win; it plays with this three — it's a tie, one of the two wins — it comes up here and plays; it's another tie, one of the two threes wins; then it goes up over there and loses to the two. And now the two moves to the output buffer: the two is copied from its player node into the output buffer, and then the next fellow in the input buffer moves into the vacated spot.

The next fellow is a five. Since 5 is bigger than 2, it can come out as part of the current run, so the five enters the tree as part of the current run. It plays with the six and wins; then it plays with the four and loses; the four moves up, plays with the two, and loses; the two moves up, plays with the three, and wins, and then it is transmitted to the output buffer. (That two is really a pointer to a player node which contains the record; the record is copied from the player node into the output buffer.) Then the next fellow is moved from the input buffer into a player node; that's a four. The four is greater than or equal to the two, so it can come out as part of the current run; it's moved from the input buffer into a player node.

At this time the output buffer is full and the input buffer is empty, so we go into our synchronization mode: we wait for the input and output that were taking place concurrent with this activity to
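The run-numbering rule applied in this walk-through can be condensed into a short sketch. The lecture uses a loser tree; as noted above, other structures give the same asymptotic behavior, so this illustration (my own simplification, with the I/O buffers omitted) uses Python's `heapq`, keyed on the pair (run number, key). That ordering encodes exactly the match rule: a player tagged for the next run always loses to one from the current run, and within a run the smaller key wins.

```python
import heapq

# Sketch of run generation with a "tree" of k players. A binary heap keyed
# on (run_number, key) behaves like the loser tree: current-run records
# always beat next-run records, and within a run smaller keys win.

def generate_runs(records, k):
    """Return the list of sorted runs produced with k player nodes."""
    heap = [(1, key) for key in records[:k]]      # everyone starts in run 1
    heapq.heapify(heap)
    pending = iter(records[k:])
    runs, current = [], []
    current_run = 1
    while heap:
        run, key = heapq.heappop(heap)            # overall winner of the matches
        if run != current_run:                    # first record of the next run
            runs.append(current)
            current, current_run = [], run
        current.append(key)                       # winner goes to the output buffer
        nxt = next(pending, None)
        if nxt is not None:
            # The rule from the lecture: if the incoming key is smaller than
            # the last key sent to the output buffer, it must wait for the
            # next run; otherwise it joins the current run.
            heapq.heappush(heap, (run + 1 if nxt < key else run, nxt))
    runs.append(current)
    return runs
```

For example, `generate_runs([4, 2, 5, 1, 3], 2)` produces the runs `[2, 4, 5]` and `[1, 3]`: the 1 arrives after the 4 has been output, so it is tagged for run 2.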
complete. All right, so at this point we don't have any output taking place, we only have input. So when the input is complete, we interchange the roles of the input buffers: this will be the buffer into which you read, so you initiate a non-blocking read into it, and that's the buffer that will feed the loser tree. We also have to change the roles of the output buffers: this one you're going to start writing out, so you initiate a non-blocking write, and that's the one you're going to start filling. So: non-blocking write from here, non-blocking read into here, and that's the one that's going to get filled from the tree.

All right, so I continue. Now the four plays matches: the four plays with the five at this node and wins, it plays with the five at that node and wins, it plays with this four — one of the two fours wins, the other one loses, the winner moves up — it plays with the three and loses. The three is the winner, and the three gets copied out into the output buffer. We're still generating run number one; this is the second block of run one being generated.

All right, so I've got to find a replacement for the fellow at this node, and I go into the current active input buffer, which is i1. The next record there has key one. I compare that with the last key I put into the output buffer; that's a three. So one being smaller than three tells me this record cannot come out as part of the current run; if it came out as part of run one, you'd have a problem — you've already sent out things with bigger keys. So this has to be part of the next run, and you assign it a new run number, run two. Yes?

[Student] At what time are we checking the values, with the values all in the
output buffer? This is with respect to assigning a run number — is that your question? Why do we assign run numbers? So: when a record enters the tree from the input buffer into a player node, at that time you assign it a run number. Right now this node was empty — the three moved out — and I'm going to replace what was there before with the next fellow from the input buffer. It had a value one, so I compare this one with the three at this time, and since it's smaller, I give it a new run number.

[Student] Suppose the output buffer is almost full — how do you compare against the next value? That's an implementation question: if the buffer has a large capacity, how do you find the last fellow you put in the buffer? Well, you have a variable in your program which says the last key in the buffer is three; whenever you move somebody into the buffer, you update that variable.

All right, so when somebody enters the tree, you check against the last key that's in the buffer; if it is smaller, you give it the next run number. I'm showing in blue the fellows that belong in the next run; everybody else is in black — they are part of the current run. When you play matches, you first compare run numbers: if somebody is part of the next run, they have to lose when they play against someone who's part of the current run. If you're part of the same run, then you compare the keys. All right, so these two are part of different runs, so this guy is going to lose even though its value is smaller; the one gets recorded there as the loser, and the five moves up. They're part of the same run, so you actually compare the keys: the five loses, the three moves up. They're part of the same run, so you actually compare the keys: it's a tie, one
of the threes moves up. You compare against the four, because they're part of the same run; the four loses, the three wins and goes into the output buffer.

All right, so then we're going to replenish that spot with somebody from the buffer; the next fellow in the buffer is the 9. The 9 is taken out and moved over here, and you again compare 9 with the last key that was sent to the output buffer. The last key sent to the output buffer is a 3, so the nine can still come out as part of the current run. Even though we have one record here that has been designated as next run, it's still possible for fellows to enter the tree and come out as part of the current run. So the nine comes in as a current-run record. All right, so you play matches: the 9 plays with the 7 — they're both part of the same run, so you compare the keys — the 9 loses; the 7 comes up and plays with the 5 — they're part of the same run, so you compare the keys — the 7 loses; the 5 plays here with the 3 — the 5 loses; the 3 plays there with the 4 — the 4 loses, and the 3 is copied out.

All right, then the two is going to enter the tree. You compare the two with the last key that went out, a three; two is smaller, so the two cannot come out as part of the current run, it has to come out as part of the next run, so it's assigned the next run number.

All right, so at this time the output buffer is full and the input buffer is empty; it's time to synchronize. You wait for input and output to complete, and then you need to change the roles of the buffers. You're now going to be reading into that buffer; you're going to use this one to feed the tree; you're going to write that one to disk; and then the tree is going to be filling this one.

[Student] The last one that came in here was
a two, yeah. Okay, suppose this had been a four: if it had been a four, it would have come in as part of the current run, right? So the rule is: whenever somebody moves from the input buffer into a player node, you check its value against the last key in the output buffer. If the value is less than the last key, then it has to be part of the next run; otherwise it's part of the current run. Yeah?

[Student] So we can't decide the number of runs? Right — when you use this scheme, you don't know how many runs you're going to get. The length of a run depends a lot on the sequence of keys that you start with.

[Student] If the four is assigned to the same run and then the buffer is flushed, won't we lose the previous key? Well, again, you're getting into an implementation detail, and as I said, it's very easy to implement these things: you just keep a program variable which keeps track of the last thing you put out. In fact, you're going to need that variable, because as we noted earlier, if this is a large buffer, the position of the last fellow keeps changing, and you don't want to keep making an access into that place to get it. So the simplest thing to do, really, is just to have a variable and save the last item.

All right, so we interchange the roles of the buffers, and then we continue to fill this one from the tree; as positions get emptied, you use this one to feed it, and while that activity is taking place, you're writing to disk with a non-blocking write and reading from disk with a non-blocking read. All right, so we're still generating run one. The two entered the tree, and we're going to play matches. In playing a match, as always, you first compare run numbers; if
you're from the same run number, then you compare the keys; otherwise the fellow from the current run always wins. So you play a match here between the two and the four. They're from different runs: the four is from the current run, the two is from the next one, so the two is going to lose. The four will play with the six; they're from the same run, so you compare the keys: the four wins. It comes here and plays with the five; it wins. It goes up and plays with that four; it's a tie, one of the two fours wins and is moved to the output buffer.

All right, then you're going to fill this spot from here, so the six is going to enter the tree. You compare the six with the last fellow that was moved to an output buffer; that's a four. Six is bigger than four, so it can come out as part of the current run; it enters as a current-run record, and then you play matches along that path. Let's see — I don't know if I have any more in this example... yeah, I do. All right, so let me go back a bit.

So the six enters as part of the current run. It plays with the two: you first compare the run numbers, and this two is part of the next run, so it has to lose when it plays with someone from the current run. So the six wins and the two loses. The six comes here and plays with that six; they're part of the same run, so you compare the keys; it's a tie, one of the two sixes wins. The other one moves up and plays with the five; the run numbers are the same, so you compare the keys, and the six loses. The five moves up to play with the four; the run numbers are the same, so the five loses; the four wins and is moved into the output buffer.

All right, then the one here enters the tree. You compare it with the last fellow that went out; that's a four. You can't put this one out as
part of the current run, so it's assigned to the next run, and then you play matches upwards. The one will play with the nine; they're from different runs, and the nine, even though it's bigger, is part of the current run, so it will win and the one will lose. The nine goes up and loses against the five; the five loses against the four; the four wins against the five and moves to an output buffer. So the four wins, it gets copied over here, and then the process continues.

Okay, questions about how this scheme works? At some point in time the first record from the next run will exit the tree, and at that time you start generating run number two.

All right, now we already noted that with this scheme you can't really predict how many runs you're going to get; that's just another way of saying you can't really predict the size of the runs. If your loser tree has K players in it — K external nodes — we know that you will not get a run whose size is less than K. When you start, everybody is assigned run number 1, so you've got K people who will be part of run 1. When the first fellow assigned to run 2 comes out, everybody in the tree has to be assigned to run 2 — if there were a single run-1 record left, it would have come out first — so at least K people are assigned to run 2, and similarly for runs 3 and 4. So every run is going to have at least K records in it.

If your input was sorted to begin with, nobody gets assigned to run 2; you get only one run. In fact, if your input is close to being sorted, nobody gets assigned to run 2: so long as you never put somebody into the tree whose value is smaller than something already sent to an output buffer, everybody gets assigned to run 1. So you can get several nearly sorted sequences all coming out as one run, and you never
get to phase two. If the input was the reverse of a sorted sequence, then every time somebody comes into the tree, it is smaller than what went out, so it gets assigned to the next run; in that case every run is of size exactly K. So the best that can happen to you is one run, and the worst that can happen to you is you get n over K runs.

Now, what you can show is that if you have a random sequence, then the expectation is that the size of your runs will be two times the number of players in your tree. We're not going to prove that, but we'll just take that result. So the expectation is two times K, and let's see what that means. Suppose we have a memory that has the capacity to hold M records, plus some additional memory to store the program and so on. If you were using the simple scheme we looked at a few lectures ago, we could read in M records, sort them, and write them out; so using the simple scheme, our runs would be of size M.

When we use the scheme just described, if your block size is B records, then we need memory for four buffers, so 4B records' worth of memory is designated for our four buffers. That means the loser tree's number of player nodes, K, will be at most M minus 4B. Then of course you've got the internal nodes — there are K minus 1 of those — but those are just pointers to records; we'll assume the pointers are small compared to records and that we've got enough memory for those K minus 1 pointers. All right, so for the loser tree, K is going to be M minus 4B, and the expectation is that the run size will be two times K, which is about 2M minus 8B. So if you use the simple scheme, where we have no overlap between input, output, and internal processing — we just fill up memory, sort, and
write out — or if you go with the somewhat more involved scheme where we try to bring in blocks at the right time, as we did in quicksort, where we can overlap some of the input/output but not a whole lot — we get runs whose size is M. Here we're going to get runs of size 2M minus 8B, and if you compare the two expressions, M and 2M minus 8B, you'll see that the loser tree scheme is better when M is at least 8B. The expectation is better performance when M is at least 8B — eight block loads.

So here's a numeric example. Suppose that your block size is 100 records. If your memory has a capacity of 600 records, that's six blocks; at this size we don't expect the loser tree scheme to be better on average, since it's got to be at least 8B. With capacity 600, you need four blocks of 100, so 400 records' worth of memory goes to the buffers and you're left with 200 for your loser tree; 2K is 400, so we expect runs of size 400, whereas the simple scheme would have given 600. If you have 1,000 units of memory, that's 10B, so now we expect the loser tree scheme to be better: take out 400 units for the four buffers, you're left with K equal to 600, 2K is 1,200, and we're doing better than the fill-memory, sort, and output scheme. If you go to M equal to 10,000, we're doing about 19,200. So as you get to bigger and bigger memories relative to 8B, you get closer to runs of two times the size of memory, and you're doing better than the simple scheme by a factor approaching two.

All right, so this strategy effectively overlaps input/output and internal processing, and at the same time it gives us runs that are expected to be about twice the size of memory, assuming memory is large compared to the block size. However,
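The arithmetic behind these numbers is just the two expressions above; here is a trivial check (the function name is mine):

```python
# Run sizes for memory capacity m (in records) and block size b (in records):
#   simple scheme:     fill memory, sort, write  -> runs of size m
#   loser-tree scheme: k = m - 4b players        -> expected run size 2k

def expected_run_sizes(m, b):
    """Return (simple_scheme_run_size, expected_loser_tree_run_size)."""
    k = m - 4 * b                 # four one-block buffers come out of memory
    return m, 2 * k

# The loser-tree scheme wins on average exactly when 2(m - 4b) >= m,
# i.e. when m >= 8b (eight block loads), as stated above.
```

With b = 100 this reproduces the lecture's examples: (600, 400) at m = 600, (1000, 1200) at m = 1000, and (10000, 19200) at m = 10000.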
it does give us runs whose size varies: you may get some runs of size 200, some of size 1,000, some of size 10,000.

What about the total amount of time taken to generate these runs? If you use the simple scheme, sorting M records using, say, a logarithmic sort takes M log M time; if the number of records is n, you have to go through n over M rounds of that, so your total time is order of n log M. If you use the loser tree scheme, moving a record from an input buffer to an output buffer takes log of K, base 2, amount of time, and all n records have to move from input to output buffers, so it's n times log K. K is not going to be more than M — in fact, K is about M minus 4B — so this is really the same as that. Asymptotically you're not spending any more time in the process.

All right, so when you get runs of different lengths, that introduces a new problem, which is: when you go to the next phase, which is merging the runs, how should we merge these runs together to reduce the time spent in the second phase? When we discussed the adaptation of merge sort, we were using a merging scheme which corresponded to a nicely balanced binary tree: in the first pass we take pairs of runs and merge them together, and then we go to another pass where we take the output from the first pass and merge those together. Adapted to the case where you have four runs — suppose we ran this new run generation scheme and we got runs of size 4, 3, 6, and 9 blocks — I do a merge pass over these. To do the merge here, I read in four blocks of records from this run and three blocks from that run, so seven block inputs; I write out seven blocks, and I internally merge seven blocks. So let's say that's seven units of time to do this one. Then I read in these six
blocks and those nine blocks — that's 15 blocks — I write out fifteen blocks, and I internally merge fifteen blocks, so that's 15 units of time. Then I go to the second pass: I read the results from the first pass, that's seven blocks plus fifteen blocks, 22 blocks; I merge 22 blocks internally, and I write out 22 blocks; that's 22 units of time. So my total cost is 7 plus 15 plus 22, which is 44.

Okay, now suppose instead I did it this way — I'm still staying with a two-way merge. The first time, merge these two runs: read in four blocks and three blocks, seven blocks, merge them, write them out; that takes seven units. Then I take the result from this merge and an original run: that's seven plus six, thirteen, so thirteen inputs, thirteen merges, thirteen outputs. And then I take this result and that original run of nine, so that's twenty-two inputs, twenty-two merges, twenty-two outputs. So now my time becomes 7 plus 13 plus 22, which is 42. So merging using that pattern was better than merging using this pattern.

So when you have runs of different lengths, this nice pass-by-pass merging strategy is not the best way to merge your runs. When they're all of the same length it is the best, but when they're not of the same length, it's not. So the new problem that arises is: how do you figure out the best pattern to use to merge these runs together? Okay — we'll study that problem in the next lecture: optimal merging of runs. Any questions? Okay.
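The two merge patterns in this example can be compared mechanically. Here is a small sketch (the encoding is mine: a pattern is either a run length in blocks or a pair of sub-patterns, and each two-way merge of a and b blocks costs a + b units, exactly the accounting used above):

```python
# A merge pattern is either an int (a run's length in blocks) or a pair of
# sub-patterns. Merging material totalling a and b blocks reads, internally
# merges, and writes a + b blocks, i.e. costs a + b units of time.

def merge_cost(pattern):
    """Return (total_blocks, total_cost) for a two-way merge pattern."""
    if isinstance(pattern, int):
        return pattern, 0                  # a single run: no merging yet
    left, right = pattern
    llen, lcost = merge_cost(left)
    rlen, rcost = merge_cost(right)
    return llen + rlen, lcost + rcost + (llen + rlen)
```

For the runs of 4, 3, 6, and 9 blocks, the balanced pass-by-pass pattern `((4, 3), (6, 9))` costs 7 + 15 + 22 = 44 units, while the skewed pattern `(((4, 3), 6), 9)` costs 7 + 13 + 22 = 42, matching the numbers above; choosing the cheapest pattern is the "optimal merging of runs" problem of the next lecture.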