Transcript

Okay, well, if you have any questions you'd like to ask before I start today... If you had a problem reaching the recorded lectures on the Internet: I've been told there was a problem with the link that had been posted on the course website. That link took you directly to the login page for WebCT, but apparently you have to start a few pages earlier, otherwise you don't get in. So the link has been updated, and hopefully you won't have a problem getting to the recorded lectures starting from the new link.

Okay, so we were looking at improving the performance of merge sort, and to that end we developed the tournament tree structure last time. As we said, one of the applications of the tournament tree was merging; another application we saw last time was first-fit bin packing. Today we're going to look at another aspect of merge sort that we want to improve on, and that's the run generation aspect. Here we said that one thing we could try to do is overlap the time spent inputting data from the disk, the time spent outputting data, and the time spent internally generating the sorted runs. The other thing we wanted to do is reduce the number of runs that are being generated.

To do the first item, overlap, you need to have hardware that supports overlap of those three operations. In particular, we're going to need two disks: say the lower disk is the one from which we are reading, and the upper disk is the one to which we are writing. The hardware also has to support what we might call a non-blocking read from a disk: you simply issue a command which says "read this block from the disk," and once you've issued the command you can go ahead and do other things; you don't wait for
that block to actually come in. Similarly, we assume we can do a non-blocking write: you issue a command which says "write this buffer load out from memory," and while the computer is busy writing it, you can do other things. So we have non-blocking reads and writes, and then you have to have synchronization commands which say "now wait until the read and write that are going on finish." So we assume that you have these capabilities.

All right, so let's see first how we might meet some of these objectives by using a sort scheme that we already know of. Let's say we have quicksort, and we say, well, I'd like to overlap input/output and internal processing. Now in quicksort we need to first select a pivot element, and if you're using the median-of-3 rule, then to select the pivot you need to know the leftmost item, the item in the middle, and the item at the end. But since all of this data is sitting out on a disk, in order for you to be able to select the pivot quickly, you would read the blocks in in a different order than the one dictated by trying to get in the whole data set that you're trying to sort. So you might first read in the leftmost block of data, read in a middle block of data, and then read in the last block of data; from those three blocks you can select the first, middle, and last record and find the pivot. But still, the very first step, finding the pivot, would have to wait until you have read in three blocks of data.

All right, then, once you've found the pivot, in order to do the partitioning, you kind of sweep from the left to the right and you also sweep from the right to the left, looking for a swappable pair. So, in order
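As an aside (my own construction, not from the lecture), the non-blocking read, non-blocking write, and synchronization commands just described can be sketched with Python threads: lists stand in for the two disks, `start()` plays the role of issuing the I/O command, and `join()` is the synchronization point that waits for the in-flight operations to finish.

```python
import threading

# Toy model of overlapped I/O: issue a non-blocking read and a non-blocking
# write, do internal processing while both are in flight, then synchronize.
# The "disks" are just lists; real code would issue asynchronous disk I/O.

def read_block(disk, index, dest):
    dest.append(disk[index])            # stand-in for a slow disk read

def write_block(disk, block):
    disk.append(block)                  # stand-in for a slow disk write

input_disk = [[3, 1, 2], [6, 5, 4]]    # two block loads on the input disk
output_disk = []
in_buf = []

# Issue the commands; control returns immediately, as in a non-blocking call.
reader = threading.Thread(target=read_block, args=(input_disk, 0, in_buf))
writer = threading.Thread(target=write_block, args=(output_disk, [7, 8, 9]))
reader.start()
writer.start()

internal_result = sorted([5, 2, 9])    # internal processing, overlapped

# Synchronization point: wait until the read and the write have finished.
reader.join()
writer.join()
```

After the two `join()` calls, the newly read buffer load is available and the written buffer is free to be reused, which is exactly the state the schemes below rely on at each synchronization point.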
to do the sweep, it's necessary for you to have these blocks of data in memory, and so to facilitate the sweep you might read your blocks in in the order in which the sweep is going to access them. So in the first phase of the quicksort, which is select the pivot and then partition, you can achieve some limited amount of overlapping of input/output and internal processing, but in order to finish this first phase you must have read in all of the records that are going to be involved in the quicksort. After that you're going to sort the two parts recursively, and at that time there won't be any input or output taking place until you have, let's say, generated the leftmost sorted block; at that time you can begin to write that block out. So you recursively sort the left part and the right part; once you've done the leftmost block you can write that out, and then you could be working on the next block and writing that out, and while you're writing a block out you could be reading in the next block of data for the next quicksort. So you can get some amount of overlapping between input, output, and internal processing here, but it takes somewhat tricky management of the buffers. So that would be one strategy, but this of course would not increase the length of your runs; you'd still be limited by the size of memory. You're not generating runs whose length is more than the memory capacity, though you would be overlapping some of the input/output and internal processing.

An alternative strategy, which is somewhat simpler, is to take the memory that you have and partition it into three buffers, b1, b2, and b3. These buffers are rather large; in
previous examples our buffers have tended to be one block long, but now these buffers are pretty big. The strategy here, in the steady state, is that we may be sorting one of these buffers internally, say using an internal sort on b1; while you are internally sorting that one, you may be writing another one out (that one was sorted on a previous pass), and you might be filling the third one with many blocks from the disk. So a steady-state diagram would look like this: you're reading a whole bunch of blocks from the disk into one of the three buffers; while this is taking place, you're writing out many blocks of data to the disk from one of the other buffers; and while these two activities are taking place, you're sorting the third buffer. Then you have a synchronization point where you wait for these three activities to finish, and once they finish you change the roles of the buffers: the one that you were reading into becomes the one that you're going to be sorting next, the one you were sorting becomes the one you're going to be writing next, and the one you were writing becomes the one you're going to be inputting into next. So you have a synchronization point, and once you synchronize and all these activities are done, you reassign roles to the buffers and repeat the process.

All right, so the buffer management becomes a lot easier here, and it's easier to see that you're getting a lot of parallelism. However, what's happening here is that each of your buffers is one-third the size of memory. So while we have a simple strategy to effectively overlap input/output and internal processing, the size of the runs is one-third
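The buffer-role rotation just described can be sketched as a toy, sequential model (names and structure are mine; a real implementation would run the three roles concurrently with non-blocking I/O):

```python
# Toy sketch of the three-buffer scheme: at each step one buffer is being
# filled from the input, one is being sorted, and one (sorted on the
# previous pass) is written out. After each synchronization point the
# roles rotate: filled -> to be sorted -> to be written.

def three_buffer_runs(blocks):
    """Generate one sorted run per buffer load, rotating buffer roles."""
    runs = []
    filling, sorting, writing = None, None, None
    for block in blocks + [None, None]:       # two extra steps to drain the pipeline
        if writing is not None:
            runs.append(writing)              # "write" the previously sorted buffer
        writing = sorted(sorting) if sorting is not None else None
        sorting = filling                     # the buffer just filled is sorted next
        filling = block                       # start "reading" the next buffer load
    if writing is not None:
        runs.append(writing)
    return runs
```

Note that each run produced here is a single buffer load, one-third of memory, which is exactly the loss the lecture points out next.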
of what it was when we used our simplistic scheme. So we've lost out on the size of the runs.

All right, so let's try yet another strategy. Here the buffers will be made one block long, and we'll have two input buffers and two output buffers. We assume we have enough memory for four buffers plus some additional stuff; typically the amount of memory you'll need for four buffers is small, so you'd have a lot of additional memory left. This additional memory will be used in our example to maintain a loser tree. Certainly you could use other data structures here, other than a loser tree, and get the same asymptotic performance as we're going to get with a loser tree; the difference would be some constant factors. All right, so we take our memory and partition it into four buffers, two input and two output, and the balance of the memory will be used to build a loser tree. Typically the loser tree here is going to be a large loser tree, in that the number of players involved could be in the thousands or tens of thousands.

The steady-state operation is similar to the diagram we had before. We will be reading from the disk into one of the two input buffers; while that read is taking place, we'll be writing one of the two output buffers to the disk; and while this read and write are taking place, we'll be generating runs into the other output buffer, and those runs will be generated from the other input buffer. So the input buffer that's not being read into here, and the output buffer that's not being written out here, will be active over here in the run generation. And then, when this read finishes, this write finishes, and a buffer load of output is generated here,
we will synchronize, and at that time we will swap the roles of the input buffers and the output buffers. Okay, let's see this as an example; examples give you a better idea of how things are working. So I've got my two input buffers down here, I've got two output buffers up there, and then I've got a loser tree that I'm going to build in the middle. I start by reading a bunch of records from the disk, and those are used to fill the player nodes of the loser tree. Once I fill the player nodes, I then initiate a read to fill a buffer; so I'm going to start reading into this buffer here, and while this read is taking place, I'm going to initialize my loser tree.

So I start playing matches as we did before. You play a match out here: the four loses, the smaller one moves forward and waits over there. You play a match here between the six and the eight: the eight loses, the six comes here to play; the six loses, the three moves up there to play. And while this is going on, this buffer is getting filled from the disk. So the one and the five will play: the five loses, the one moves up and waits. The seven and the three will play: the seven loses, the three moves up in place of the one and loses; the one moves up, plays with the three and wins, and then the one moves up and waits out there. Then the two and the six will play: the six loses, the two waits. The nine and the four play: the nine loses, the four goes up in place, plays with the two, and loses; the two moves up and waits. And then we start out here: the five and the two will play; the five loses, the two goes up. Then the five and the eight play: the eight loses, the five plays with the two and loses; the
two moves up and plays with the one, and the one is the overall winner. So I initialize the tree; while I'm initializing the tree, this buffer is being read into, and now I come to a synchronization point: I'm done with what I had to do, so I wait for this input to finish. All right, so the input's finished; I've got a buffer load — in this example a buffer load is three records. Then I assign roles to the buffers: I initiate a non-blocking read into i1, and at this time I have nothing to write, so I do not initiate a write. I assign this buffer as the buffer into which I'm going to generate output, and this is the buffer from which I will feed my loser tree.

So now what I'll do is take the winner, which was the one, and move it from its player node into the output buffer. So the one moves into the output buffer, and then I'm going to replace the player here with the next fellow in the active input buffer; that's the three. So the three is going to enter the tree, and the tree gets readjusted. When the three enters the tree, I need to first decide whether or not it's possible for me to output the three as part of the current run that's being generated. So I give my runs numbers. First I'm generating, let's say, run number one. Run number one already has a key value one that's been output into this buffer, so the succeeding keys have to be at least one. If the record that you're moving from here to here had a value of 0, then you should not put it out as part of run one; it's too late. And you have to be careful, because if you put a zero here and you play matches, the zero will win and come out right away, and you will not get a sorted sequence. All right, so when a player enters the tournament
tree, you need to assign it a run number: it's either going to be part of the current run or part of the next run. The three can be part of the current run, because its value is not smaller than the last fellow that's gone out. All right, so we take the three out; it enters the tree as part of the current run, and it plays matches up toward the root. The three will play with the five and win; it plays with this three — it's a tie, one of the two wins — it comes up here and plays; it's another tie, one of the two threes wins; then it goes up over there and loses to the two. And now the two moves to the output buffer: the two is copied from its player node into the output buffer, and then the next fellow in the input buffer moves into the vacated spot.

The next fellow is a five. Since 5 is bigger than 2, it can come out as part of the current run, so the five enters the tree as part of the current run. It plays with the six and wins; then it plays with the four and loses; the four moves up, plays with the two, and loses; the two moves up, plays with the three, and wins, and then it is transmitted to the output buffer. (That two is really a pointer to a player node which contains the record; the record is copied from the player node into the output buffer.) Then the next fellow is moved from the input buffer into a player node; that's a four. The four is greater than or equal to the two, so it can come out as part of the current run; it's moved from the input buffer into a player node.

At this time the output buffer is full and the input buffer is empty, so we go into our synchronization mode: we wait for the input and output that were taking place concurrent with this activity to
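The run-numbering rule applied in this walk-through can be condensed into a short sketch. The lecture uses a loser tree; as noted above, other structures give the same asymptotic behavior, so this illustration (my own simplification, with the I/O buffers omitted) uses Python's `heapq`, keyed on the pair (run number, key). That ordering encodes exactly the match rule: a player tagged for the next run always loses to one from the current run, and within a run the smaller key wins.

```python
import heapq

# Sketch of run generation with a "tree" of k players. A binary heap keyed
# on (run_number, key) behaves like the loser tree: current-run records
# always beat next-run records, and within a run smaller keys win.

def generate_runs(records, k):
    """Return the list of sorted runs produced with k player nodes."""
    heap = [(1, key) for key in records[:k]]      # everyone starts in run 1
    heapq.heapify(heap)
    pending = iter(records[k:])
    runs, current = [], []
    current_run = 1
    while heap:
        run, key = heapq.heappop(heap)            # overall winner of the matches
        if run != current_run:                    # first record of the next run
            runs.append(current)
            current, current_run = [], run
        current.append(key)                       # winner goes to the output buffer
        nxt = next(pending, None)
        if nxt is not None:
            # The rule from the lecture: if the incoming key is smaller than
            # the last key sent to the output buffer, it must wait for the
            # next run; otherwise it joins the current run.
            heapq.heappush(heap, (run + 1 if nxt < key else run, nxt))
    runs.append(current)
    return runs
```

For example, `generate_runs([4, 2, 5, 1, 3], 2)` produces the runs `[2, 4, 5]` and `[1, 3]`: the 1 arrives after the 4 has been output, so it is tagged for run 2.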
complete. All right, so at this point we don't have any output taking place, we only have input. So when the input is complete, we interchange the roles of the input buffers: this will be the buffer into which you read, so you initiate a non-blocking read into it, and that's the buffer that will feed the loser tree. We also have to change the roles of the output buffers: this one you're going to start writing out, so you initiate a non-blocking write, and that's the one you're going to start filling. So: non-blocking write from here, non-blocking read into here, and that's the one that's going to get filled from the tree.

All right, so I continue. Now the four plays matches: the four plays with the five at this node and wins, it plays with the five at that node and wins, it plays with this four — one of the two fours wins, the other one loses, the winner moves up — it plays with the three and loses. The three is the winner, and the three gets copied out into the output buffer. We're still generating run number one; this is the second block of run one being generated.

All right, so I've got to find a replacement for the fellow at this node, and I go into the current active input buffer, which is i1. The next record there has key one. I compare that with the last key I put into the output buffer; that's a three. So one being smaller than three tells me this record cannot come out as part of the current run; if it came out as part of run one, you'd have a problem — you've already sent out things with bigger keys. So this has to be part of the next run, and you assign it a new run number, run two. Yes?

[Student] At what time are we checking the values, with the values all in the
output buffer? This is with respect to assigning a run number — is that your question? Why do we assign run numbers? So: when a record enters the tree from the input buffer into a player node, at that time you assign it a run number. Right now this node was empty — the three moved out — and I'm going to replace what was there before with the next fellow from the input buffer. It had a value one, so I compare this one with the three at this time, and since it's smaller, I give it a new run number.

[Student] Suppose the output buffer is almost full — how do you compare against the next value? That's an implementation question: if the buffer has a large capacity, how do you find the last fellow you put in the buffer? Well, you have a variable in your program which says the last key in the buffer is three; whenever you move somebody into the buffer, you update that variable.

All right, so when somebody enters the tree, you check against the last key that's in the buffer; if it is smaller, you give it the next run number. I'm showing in blue the fellows that belong in the next run; everybody else is in black — they are part of the current run. When you play matches, you first compare run numbers: if somebody is part of the next run, they have to lose when they play against someone who's part of the current run. If you're part of the same run, then you compare the keys. All right, so these two are part of different runs, so this guy is going to lose even though its value is smaller; the one gets recorded there as the loser, and the five moves up. They're part of the same run, so you actually compare the keys: the five loses, the three moves up. They're part of the same run, so you actually compare the keys: it's a tie, one
of the threes moves up. You compare against the four, because they're part of the same run; the four loses, the three wins and goes into the output buffer.

All right, so then we're going to replenish that spot with somebody from the buffer; the next fellow in the buffer is the 9. The 9 is taken out and moved over here, and you again compare 9 with the last key that was sent to the output buffer. The last key sent to the output buffer is a 3, so the nine can still come out as part of the current run. Even though we have one record here that has been designated as next run, it's still possible for fellows to enter the tree and come out as part of the current run. So the nine comes in as a current-run record. All right, so you play matches: the 9 plays with the 7 — they're both part of the same run, so you compare the keys — the 9 loses; the 7 comes up and plays with the 5 — they're part of the same run, so you compare the keys — the 7 loses; the 5 plays here with the 3 — the 5 loses; the 3 plays there with the 4 — the 4 loses, and the 3 is copied out.

All right, then the two is going to enter the tree. You compare the two with the last key that went out, a three; two is smaller, so the two cannot come out as part of the current run, it has to come out as part of the next run, so it's assigned the next run number.

All right, so at this time the output buffer is full and the input buffer is empty; it's time to synchronize. You wait for input and output to complete, and then you need to change the roles of the buffers. You're now going to be reading into that buffer; you're going to use this one to feed the tree; you're going to write that one to disk; and then the tree is going to be filling this one.

[Student] The last one that came in here was
a two, yeah. Okay, suppose this had been a four: if it had been a four, it would have come in as part of the current run, right? So the rule is: whenever somebody moves from the input buffer into a player node, you check its value against the last key in the output buffer. If the value is less than the last key, then it has to be part of the next run; otherwise it's part of the current run. Yeah?

[Student] So we can't decide the number of runs? Right — when you use this scheme, you don't know how many runs you're going to get. The length of a run depends a lot on the sequence of keys that you start with.

[Student] If the four is assigned to the same run and then the buffer is flushed, won't we lose the previous key? Well, again, you're getting into an implementation detail, and as I said, it's very easy to implement these things: you just keep a program variable which keeps track of the last thing you put out. In fact, you're going to need that variable, because as we noted earlier, if this is a large buffer, the position of the last fellow keeps changing, and you don't want to keep making an access into that place to get it. So the simplest thing to do, really, is just to have a variable and save the last item.

All right, so we interchange the roles of the buffers, and then we continue to fill this one from the tree; as positions get emptied, you use this one to feed it, and while that activity is taking place, you're writing to disk with a non-blocking write and reading from disk with a non-blocking read. All right, so we're still generating run one. The two entered the tree, and we're going to play matches. In playing a match, as always, you first compare run numbers; if
you're from the same run number, then you compare the keys; otherwise the fellow from the current run always wins. So you play a match here between the two and the four. They're from different runs: the four is from the current run, the two is from the next one, so the two is going to lose. The four will play with the six; they're from the same run, so you compare the keys: the four wins. It comes here and plays with the five; it wins. It goes up and plays with that four; it's a tie, one of the two fours wins and is moved to the output buffer.

All right, then you're going to fill this spot from here, so the six is going to enter the tree. You compare the six with the last fellow that was moved to an output buffer; that's a four. Six is bigger than four, so it can come out as part of the current run; it enters as a current-run record, and then you play matches along that path. Let's see — I don't know if I have any more in this example... yeah, I do. All right, so let me go back a bit.

So the six enters as part of the current run. It plays with the two: you first compare the run numbers, and this two is part of the next run, so it has to lose when it plays with someone from the current run. So the six wins and the two loses. The six comes here and plays with that six; they're part of the same run, so you compare the keys; it's a tie, one of the two sixes wins. The other one moves up and plays with the five; the run numbers are the same, so you compare the keys, and the six loses. The five moves up to play with the four; the run numbers are the same, so the five loses; the four wins and is moved into the output buffer.

All right, then the one here enters the tree. You compare it with the last fellow that went out; that's a four. You can't put this one out as
part of the current run, so it's assigned to the next run, and then you play matches upwards. The one will play with the nine; they're from different runs, and the nine, even though it's bigger, is part of the current run, so it will win and the one will lose. The nine goes up and loses against the five; the five loses against the four; the four wins against the five and moves to an output buffer. So the four wins, it gets copied over here, and then the process continues.

Okay, questions about how this scheme works? At some point in time the first record from the next run will exit the tree, and at that time you start generating run number two.

All right, now we already noted that with this scheme you can't really predict how many runs you're going to get; that's just another way of saying you can't really predict the size of the runs. If your loser tree has K players in it — K external nodes — we know that you will not get a run whose size is less than K. When you start, everybody is assigned run number 1, so you've got K people who will be part of run 1. When the first fellow assigned to run 2 comes out, everybody in the tree has to be assigned to run 2 — if there were a single run-1 record left, it would have come out first — so at least K people are assigned to run 2, and similarly for runs 3 and 4. So every run is going to have at least K records in it.

If your input was sorted to begin with, nobody gets assigned to run 2; you get only one run. In fact, if your input is close to being sorted, nobody gets assigned to run 2: so long as you never put somebody into the tree whose value is smaller than something already sent to an output buffer, everybody gets assigned to run 1. So you can get several nearly sorted sequences all coming out as one run, and you never
get to phase two. If the input was the reverse of a sorted sequence, then every time somebody comes into the tree, it is smaller than what went out, so it gets assigned to the next run; in that case every run is of size exactly K. So the best that can happen to you is one run, and the worst that can happen to you is you get n over K runs.

Now, what you can show is that if you have a random sequence, then the expectation is that the size of your runs will be two times the number of players in your tree. We're not going to prove that, but we'll just take that result. So the expectation is two times K, and let's see what that means. Suppose we have a memory that has the capacity to hold M records, plus some additional memory to store the program and so on. If you were using the simple scheme we looked at a few lectures ago, we could read in M records, sort them, and write them out; so using the simple scheme, our runs would be of size M.

When we use the scheme just described, if your block size is B records, then we need memory for four buffers, so 4B records' worth of memory is designated for our four buffers. That means the loser tree's number of player nodes, K, will be at most M minus 4B. Then of course you've got the internal nodes — there are K minus 1 of those — but those are just pointers to records; we'll assume the pointers are small compared to records and that we've got enough memory for those K minus 1 pointers. All right, so for the loser tree, K is going to be M minus 4B, and the expectation is that the run size will be two times K, which is about 2M minus 8B. So if you use the simple scheme, where we have no overlap between input, output, and internal processing — we just fill up memory, sort, and
write out — or if you go with the somewhat more involved scheme where we try to bring in blocks at the right time, as we did in quicksort, where we can overlap some of the input/output but not a whole lot — we get runs whose size is M. Here we're going to get runs of size 2M minus 8B, and if you compare the two expressions, M and 2M minus 8B, you'll see that the loser tree scheme is better when M is at least 8B. The expectation is better performance when M is at least 8B — eight block loads.

So here's a numeric example. Suppose that your block size is 100 records. If your memory has a capacity of 600 records, that's six blocks; at this size we don't expect the loser tree scheme to be better on average, since it's got to be at least 8B. With capacity 600, you need four blocks of 100, so 400 records' worth of memory goes to the buffers and you're left with 200 for your loser tree; 2K is 400, so we expect runs of size 400, whereas the simple scheme would have given 600. If you have 1,000 units of memory, that's 10B, so now we expect the loser tree scheme to be better: take out 400 units for the four buffers, you're left with K equal to 600, 2K is 1,200, and we're doing better than the fill-memory, sort, and output scheme. If you go to M equal to 10,000, we're doing about 19,200. So as you get to bigger and bigger memories relative to 8B, you get closer to runs of two times the size of memory, and you're doing better than the simple scheme by a factor approaching two.

All right, so this strategy effectively overlaps input/output and internal processing, and at the same time it gives us runs that are expected to be about twice the size of memory, assuming memory is large compared to the block size. However,
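The arithmetic behind these numbers is just the two expressions above; here is a trivial check (the function name is mine):

```python
# Run sizes for memory capacity m (in records) and block size b (in records):
#   simple scheme:     fill memory, sort, write  -> runs of size m
#   loser-tree scheme: k = m - 4b players        -> expected run size 2k

def expected_run_sizes(m, b):
    """Return (simple_scheme_run_size, expected_loser_tree_run_size)."""
    k = m - 4 * b                 # four one-block buffers come out of memory
    return m, 2 * k

# The loser-tree scheme wins on average exactly when 2(m - 4b) >= m,
# i.e. when m >= 8b (eight block loads), as stated above.
```

With b = 100 this reproduces the lecture's examples: (600, 400) at m = 600, (1000, 1200) at m = 1000, and (10000, 19200) at m = 10000.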
it does give us runs whose size varies: you may get some runs of size 200, some of size 1,000, some of size 10,000.

What about the total amount of time taken to generate these runs? If you use the simple scheme, sorting M records using, say, a logarithmic sort takes M log M time; if the number of records is n, you have to go through n over M rounds of that, so your total time is order of n log M. If you use the loser tree scheme, moving a record from an input buffer to an output buffer takes log of K, base 2, amount of time, and all n records have to move from input to output buffers, so it's n times log K. K is not going to be more than M — in fact, K is about M minus 4B — so this is really the same as that. Asymptotically you're not spending any more time in the process.

All right, so when you get runs of different lengths, that introduces a new problem, which is: when you go to the next phase, which is merging the runs, how should we merge these runs together to reduce the time spent in the second phase? When we discussed the adaptation of merge sort, we were using a merging scheme which corresponded to a nicely balanced binary tree: in the first pass we take pairs of runs and merge them together, and then we go to another pass where we take the output from the first pass and merge those together. Adapted to the case where you have four runs — suppose we ran this new run generation scheme and we got runs of size 4, 3, 6, and 9 blocks — I do a merge pass over these. To do the merge here, I read in four blocks of records from this run and three blocks from that run, so seven block inputs; I write out seven blocks, and I internally merge seven blocks. So let's say that's seven units of time to do this one. Then I read in these six
blocks and those nine blocks — that's 15 blocks — I write out fifteen blocks, and I internally merge fifteen blocks, so that's 15 units of time. Then I go to the second pass: I read the results from the first pass, that's seven blocks plus fifteen blocks, 22 blocks; I merge 22 blocks internally, and I write out 22 blocks; that's 22 units of time. So my total cost is 7 plus 15 plus 22, which is 44.

Okay, now suppose instead I did it this way — I'm still staying with a two-way merge. The first time, merge these two runs: read in four blocks and three blocks, seven blocks, merge them, write them out; that takes seven units. Then I take the result from this merge and an original run: that's seven plus six, thirteen, so thirteen inputs, thirteen merges, thirteen outputs. And then I take this result and that original run of nine, so that's twenty-two inputs, twenty-two merges, twenty-two outputs. So now my time becomes 7 plus 13 plus 22, which is 42. So merging using that pattern was better than merging using this pattern.

So when you have runs of different lengths, this nice pass-by-pass merging strategy is not the best way to merge your runs. When they're all of the same length it is the best, but when they're not of the same length, it's not. So the new problem that arises is: how do you figure out the best pattern to use to merge these runs together? Okay — we'll study that problem in the next lecture: optimal merging of runs. Any questions? Okay.
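The two merge patterns in this example can be compared mechanically. Here is a small sketch (the encoding is mine: a pattern is either a run length in blocks or a pair of sub-patterns, and each two-way merge of a and b blocks costs a + b units, exactly the accounting used above):

```python
# A merge pattern is either an int (a run's length in blocks) or a pair of
# sub-patterns. Merging material totalling a and b blocks reads, internally
# merges, and writes a + b blocks, i.e. costs a + b units of time.

def merge_cost(pattern):
    """Return (total_blocks, total_cost) for a two-way merge pattern."""
    if isinstance(pattern, int):
        return pattern, 0                  # a single run: no merging yet
    left, right = pattern
    llen, lcost = merge_cost(left)
    rlen, rcost = merge_cost(right)
    return llen + rlen, lcost + rcost + (llen + rlen)
```

For the runs of 4, 3, 6, and 9 blocks, the balanced pass-by-pass pattern `((4, 3), (6, 9))` costs 7 + 15 + 22 = 44 units, while the skewed pattern `(((4, 3), 6), 9)` costs 7 + 13 + 22 = 42, matching the numbers above; choosing the cheapest pattern is the "optimal merging of runs" problem of the next lecture.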