Transcript


00:12
okay all right before starting I just
00:16
want to remind you a search for our
00:18
second exam for this Friday syllabus is
00:22
posted on the web as our sample tests
00:27
okay you have any questions you want to
00:29
ask alright so today we're going to
00:37
start looking at what you might call the
00:41
analog of radix sort to the search
00:43
problem we look at different ways of
00:47
organizing dictionaries so far the
00:51
dictionary organizations who looked at
00:52
whether it was binary search trees or
00:56
higher-order search trees were based on
00:59
making full key comparisons amongst the
01:01
search key and the items currently
01:03
stored in your dictionary the structures
01:07
we want to explore now are those where
01:10
the organization is based not on the
01:15
full key but on parts of the key so you
01:18
do a decomposition of the key using some
01:21
radix like you would in a radix sort and
01:25
initially what you're going to be
01:27
looking at is decompositions using a
01:29
radix of two so binary decompositions of
01:32
keys and then organizing such structures
01:35
based on the bits in a key okay so as I
01:42
said it's an analog of radix sort to
01:44
searching will interpret each key as
01:49
being a string of bits so you got zeros
01:53
and ones and we may be dealing either
01:58
with fixed length keys so for example
02:01
here all of the keys are four bits long
02:06
or you may be dealing with a situation
02:09
where the keys are of variable length so
02:13
some fields
02:14
maybe short and some maybe long okay
02:18
most of our discussion will be centered
02:21
around peace today around fixed-length
02:24
keys and part of it will make some
02:28
reference to variable length keys as
02:30
well the dictionary is organized using
02:41
say the bits in a key find a fair amount
02:47
of application one particular
02:49
application area where a large number of
02:52
variants of the fundamental structure
02:54
we're going to look at today have been
02:55
developed and are in commercial use is
02:58
in the implementation of router tables
03:01
for packet forwarding in the internet or
03:04
for packet classifiers and firewalls
03:07
so I guess sometime next week we'll have
03:13
a lecture where I'll just kind of review
03:14
the application of tries to do packet
03:19
forwarding and packet classification
03:20
applications okay but we'll see the
03:23
details of that next week and when we
03:29
get there we'll see that if you're
03:32
dealing with IP version 4 then the keys
03:35
will be variable lengths going up to 32
03:38
bits in length and if you're dealing
03:41
with IP version 6 the keys would be
03:44
variable length going up to 128 bits
03:49
okay so so what binary decomposition of
03:52
keys actually gets used quite a bit in
03:54
practice we'll also look at
03:57
decompositions using higher rate X's red
04:00
x's than two in a subsequent lecture all
04:05
right so let's start by looking at
04:06
probably the simplest type of such
04:09
structure based on key decomposition
04:11
this will be a digital search tree and
04:16
then we'll take a look at an alternative
04:20
to a digital surgery something called a
04:21
try
04:23
right so in a digital search tree okay
04:29
it's the search tree and if it's not
04:31
empty
04:32
then one of the pairs of your
04:36
dictionaries resides in the root you can
04:39
place any one you like out there and the
04:42
remaining ones get partitioned into the
04:44
left and right subtree depending upon
04:46
the first bit okay so if the first bit
04:50
is a 0 then the room those pairs will be
04:52
in the left subtree and if the first bit
04:54
is a 1 they will go into the right
04:56
subtree okay so the partitioning into
04:59
left and right subtrees isn't done by
05:02
comparing the entire key with what's in
05:05
the root but only by looking at the
05:06
first bit so if the first bit is a 0 you
05:09
go into the left subtree if the first
05:11
bit is a 1 you go into the right subtree
05:12
and then the left and right subtrees are
05:16
similarly organized right the best way
05:20
to get a feel for that definition is to
05:23
simply look at an example all right and
05:28
we'll actually build the example as a
05:30
sequence of inserts starting with an MT
05:33
digital search tree the number of nodes
05:36
in a digital search tree will be exactly
05:38
equal to the number of entries in the
05:40
dictionary so when you start with an
05:42
empty tree we've got no nodes and then
05:45
we're going to insert a pair whose keys
05:47
is 0 1 1 0 this would result in a 1 node
05:52
search tree and that one node search
05:54
tree will have a dictionary pair in it
05:55
this the single one that's in the
05:57
dictionary so you just end up with that
06:01
right now suppose you want to insert the
06:05
key 0 0 1 0 so the first thing we'll do
06:11
is we'll search for the key okay so will
06:14
we have only one node so look at the key
06:17
in this node and find that they're
06:18
different ok so it's not a duplicate you
06:22
put it into the tree I will look at its
06:24
first bit which is a 0 so you have to
06:26
put it in the left subtree ok and since
06:29
the left subtree is empty there's
06:30
nothing to search there we just get a
06:32
new node and put that in over there to
06:35
end up with the structure
06:42
all right now suppose you want to put in
06:44
a 1 0 0 1 so you started the route to
06:50
make a comparison okay they're not the
06:53
same okay to decide whether to go to the
06:56
left or the right sub-tree when you have
06:57
the root level you look at bit 1 ok the
07:00
leftmost bit which is a 1 so I go into
07:02
the right subtree the right subtree is
07:05
empty so that's where you make the
07:07
insert and we end up with that structure
07:17
all right now suppose we want to insert
07:20
a 1 0 1 1 so you start at the root do
07:23
you compare they different then you look
07:27
at the first bit it's a 1 so we end up
07:28
here okay you compare the two they're
07:32
different at this level you look at bit
07:34
to the 2nd bit out there's a 0 so we'll
07:37
go down here you fall off the tree
07:39
that's the place to make the insert all
07:51
right suppose you want to insert 4 zeros
07:54
so you've started the root you don't get
07:58
a match so you look at the first bit
08:00
it's a 0 you end up here you compare the
08:02
two keys they're not equal so you get
08:05
the second bit from there which is a 0
08:07
and you go into the left subtree here
08:09
okay since it's empty you know that 4
08:12
zeros is not currently present and
08:13
that's the place to insert the new pair
08:24
okay so in a digital search tree the
08:29
root can contain any one of the
08:30
dictionary pairs the remaining pairs are
08:32
partitioned depending upon the first bit
08:35
if it's a zero you come here if it's a
08:38
one you go there okay at this level here
08:42
the petitioning will be done based on
08:44
bit too
08:45
then on bit three than bit four and so
08:47
on
08:59
right so if you want to search a digital
09:02
search tree
09:02
they said you start at the root okay you
09:06
compared the search key with what's in
09:08
here if it's equal you're done if it's
09:10
not you look at bit one if it's a zero
09:12
you go there otherwise you go to the
09:14
other side and you kind of repeat that
09:16
process at each level we saw how to do
09:19
an insert you basically do a search and
09:22
then wherever you fall off you insert a
09:24
new node with the new pair if you want
09:32
to do a delete so suppose you want to
09:42
delete the item in the root so to delete
09:52
the item in the root you have to find a
09:56
replacement for that item okay and you
10:03
have to get a replacement in such a way
10:05
that you don't mess up the partitioning
10:06
x' at the remaining levels at each level
10:09
your elements or pairs are being
10:11
partitioned by a particular bit so a
10:16
suitable replacement for this fellow
10:19
here would be anybody who's sitting in a
10:23
leaf below it okay so if I took the four
10:30
zeros and put it up here
10:32
then I could throw this node away and I
10:34
would not affect the level numbers of
10:36
any other node if I took this fellow and
10:40
put it up here I could throw the leaf
10:41
away okay now so some things that don't
10:54
work for example is if you're trying to
10:59
remove let's say this one one zero zero
11:01
one then if you just change the pointer
11:05
from here to here okay and assume for a
11:09
moment there are sub trees down here you
11:11
will end up with a problem
11:12
okay because the petitioning here in two
11:15
subtrees was done previously by level
11:17
three but by promoting this up to level
11:19
two the petitioning is now done here by
11:22
bit three rather than bit too okay so
11:26
you would have a problem okay so when
11:31
you do a delete you need to go down
11:35
until you find a leaf find the item in
11:37
the leaf move it up to where you've got
11:39
a vacancy and then throw away that leaf
11:43
okay so it's not quite the same as
11:46
deletion from a binary search tree right
11:52
the time required to perform any of the
11:55
three operation search insert and delete
11:56
depends on the number of bits and the
11:58
key it's linear in the number of bits so
12:04
if the number of bits is small this is a
12:06
good way to go if the number of bits is
12:08
large this isn't such a good way to go
12:13
aren't any questions about digital
12:15
search trees yeah right the question is
12:36
what happens if the bits and the key are
12:40
exhausted and you still haven't fallen
12:41
off the tree okay and the assumption we
12:44
made was that all keys are of the same
12:48
length and when all keys are the same
12:51
length that case cannot arise so that
12:56
that's the reason to make that
12:57
assumption that all keys are of the same
13:00
length now that there's a way around it
13:04
and we will see that I guess when you
13:07
look at variants of this structure where
13:10
you make provision to hold different
13:12
kinds of elements in the interior or you
13:16
take a key and at the end you put in a
13:18
special digit so then you really
13:22
increase the the radix by one okay so we
13:29
may take binary Keys and suffix them
13:33
with a special symbol say a pound symbol
13:36
now you cannot have one key which is a
13:40
proper prefix of another and that
13:42
prevents the case that you mentioned but
13:46
we'll see that later when we look at
13:48
higher degree tries okay so under the
13:53
assumption that all keys are of the same
13:54
length the insertion algorithm always
13:58
works the way I described if the item is
14:01
not the you will always fall off the
14:03
tree before you run out of bits okay
14:08
other questions
14:14
now in this scheme okay when you're
14:21
doing a search or any of the other
14:22
operations you make one comparison at
14:25
each level so you compare the key stored
14:27
here with the search key so the total
14:29
number of key comparisons is also of the
14:33
order of the number of bits
14:35
okay now this makes the strategy its
14:40
present form not very efficient when
14:44
you're dealing with long keys because
14:47
then the key comparisons are expensive
14:49
even though you doing a small number of
14:51
them each one is expensive
14:55
all right so people came up with
14:59
alternative structures which are also
15:01
based on this radix decomposition idea
15:03
where the total number of full key
15:07
comparisons that is needed is limited to
15:10
one so let's take a look at those
15:15
alternative structures the simplest is a
15:18
binary try and then you can go to higher
15:20
order trice and variants of higher order
15:23
try say we look at later all right so
15:27
the the word try was derived from the
15:31
word retrieval the structure was
15:33
initially designed for information
15:35
retrieval applications where people were
15:37
indexing large texts so maybe you're
15:41
indexing articles in a magazine and
15:43
there could be thousands of characters
15:46
one so you didn't really want to be
15:48
comparing thousand character Keys at
15:52
each level of the try of the structure
15:55
or the nice property about these is you
16:01
can perform your operations by using at
16:05
most one comparison again we'll assume
16:10
there are the our keys are all of the
16:12
same length but as we'll see later
16:16
that's really not a hard requirement we
16:18
can now relax that to accommodate
16:21
variable length keys
16:25
so when you're dealing with fixed-length
16:28
keys we'll have two types of nodes in
16:31
the system one is a branch node and a
16:35
branch node has a left child and a right
16:37
child pointer but no data fields okay so
16:43
that's different from a binary for a
16:47
digital search tree where the nodes had
16:49
both left child right child and a data
16:51
field so here a branch node cannot store
16:54
data you can only have pointers then you
16:58
have element nodes where element nodes
17:02
can't have children but they can only
17:06
store data all right so the fixed-length
17:13
case we will have a node structure you
17:16
have really two types of nodes branch
17:18
nodes and element nodes let's look at an
17:25
example all right so the green nodes
17:28
here are the branch nodes in the yellow
17:32
ones are the element nodes the
17:36
dictionary pairs or keys are divided out
17:40
again by looking at the bits okay
17:43
so the root level you look at the first
17:45
bit if the first bit is 0 it's going to
17:47
show up on the left side if the first
17:49
bit is a 1 it's going to show up in the
17:51
right side if you have more than one
17:55
dictionary pair in a subtree then again
17:58
your petition but now we're going to do
18:01
here with bit too so if it's a zero you
18:04
come here if it's a one you go then
18:06
again if this subtree is got more than
18:08
two pairs
18:10
you'll partition if it's a zero in bit
18:13
three you come here with someone you
18:14
come down okay so the partitioning goes
18:17
on until you're left with a single pair
18:21
and at that time you put in a data node
18:28
you might call this a pure try in
18:32
practice people will not do this
18:35
petitioning all the way down to single
18:37
entities or single pairs you may stop
18:40
the petitioning when you reach a number
18:43
of pairs that maybe say 5 or 10 so you
18:48
have a concept of a bucket and once the
18:51
number of pairs here is small enough
18:53
that it fits into a bucket you stop the
18:55
petitioning if you look at a search so
19:11
suppose you are searching for a 0 0 1 1
19:13
okay well then you start at the root
19:16
okay you look at the first bit it's a 0
19:19
so you come here you look at the second
19:21
bidder to 0 you come there you look at
19:22
the third bit it's a 1 you come here so
19:26
once you reach a data node then you have
19:30
to do a compare to make sure there is a
19:33
match because you could come here for
19:36
any key that begins with 0 0 1 so if
19:41
even if it had been 0 0 1 0 it could
19:44
have been stored here so you follow a
19:49
path dictated by the bits in your search
19:51
key and that path would either fall off
19:57
the try so for example if you looking
20:00
for something that begins with 0 1 you'd
20:02
fall off here in which case no
20:04
comparison is done or the path will end
20:08
at a data node in which case you have to
20:10
compare the search key with the key
20:12
stored there to see if there's a match
20:20
all right so searches can now be done
20:23
with at most one comparison whereas the
20:25
previous structure digital search trees
20:27
required a number of comparisons that
20:30
was equal to the number of bits
20:40
all right we have variable-length keys
20:43
then I will use a single node type which
20:52
could have say left and right child
20:54
fields and each node will be able to
21:00
store two pairs okay a left and a right
21:04
there and the strategy I will use is
21:08
that the left pair will have a key that
21:14
terminates at the root of the left
21:16
subtree of this node or if the left
21:21
subtree is only going to contain one
21:23
pair then I'll stop here won't do any
21:25
further partitioning similar strategy
21:28
for the right pair again it's probably
21:32
easier to look at an example than to try
21:34
and follow the wording of the thing all
21:41
right so here I guess I've got two
21:47
fields in a node this I guess colors not
21:50
showing up too good yeah so I got a
21:52
yellow field that's the left one and
21:54
then I've got a light blue field that's
21:56
the right one so at the root level I can
22:03
store pairs whose keys are one bit long
22:07
okay so if the key was zero the pair
22:11
would be in the store as the left pair
22:13
if the key was one it'd be stored as the
22:15
right pet
22:16
in this case I don't have a pair whose
22:18
key is one at the next level I can store
22:22
pairs whose keys are two bits long okay
22:25
so on this side the first bit would have
22:28
to be zero and then this node would have
22:30
a zero zero pair and a zero one pair an
22:34
exception is if there is only one pair
22:37
to go in the subtree then I'll just
22:39
store it here instead of going down one
22:41
two three four five levels to storage
22:44
okay so I have a provision to store
22:47
early in case there's only one pair left
22:49
in the subtree
22:52
again here this node is nominally four
22:56
keys that are two bits long okay they
22:59
have to begin with the one so nominally
23:01
you'd put a one zero here now one one
23:02
here exception being that if that
23:05
subtree is going to have only one key
23:08
then you might as well store it here
23:09
instead of having a long path where
23:12
you're trying to partition that single
23:14
key further okay all right so this would
23:19
be one way to handle keys of different
23:21
lengths in a binary scenario
23:37
again you would search these in a
23:40
similar fashion to the previous
23:41
structure so you'd follow the bits and
23:48
the last node you see on the path take a
23:52
look at the content do a compare if it
23:56
matches your okay otherwise it's not in
23:58
the tree all right any questions about
24:05
the fixed length of the variable length
24:10
try we'll take a look at insert and
24:12
delete from the fixed length in a moment
24:16
yes what do you mean as an index well
24:35
actually both the this thing is just one
24:38
big node okay
24:40
so the yellow part represents the left
24:42
data and the light blue part represents
24:45
the right data and then you have two
24:47
child pointers the left child and the
24:49
right child pointer okay so this could I
24:53
mean if you didn't have you know these
24:56
fellows here okay then this thing here
24:59
could have kept a 0 0 1 0 0 just as well
25:02
if there were no other fellows that
25:05
began with 0 0 so the partitioning
25:10
strategy is again you look at the bits
25:12
left to right and if you're left with
25:16
only one pair then you store it in that
25:19
node instead of continuing to partition
25:21
a single term if you're not left with
25:26
one pair then what you can store at a
25:29
level is determined by the length of the
25:32
key so at level one you can store keys
25:34
of length 1 then length to length 3
25:40
all right other questions okay let's go
25:54
back to the fixed-length case and look
25:56
at how you would do an insert all right
26:00
suppose you want to insert 0 followed by
26:02
3 ones so we'd follow a path from the
26:05
root first bit is a 0 so you go into the
26:08
left subtree next bit is a 1 so we fall
26:11
off over here that tells us that I don't
26:15
already have 0 1 1 1 in my dictionary
26:18
and that the place to insert it would be
26:22
out here
26:34
all right so this insert required no
26:36
compares right next lists insert 1 1 0 1
26:44
so again you start at the root the first
26:47
bit is a 1 the second bit is a 1 and you
26:50
reach a data node so you would compare
26:54
the keys and we find they're different
26:57
so to find the place to make the
27:00
insertion I need to continue to compare
27:03
these two keys together so I've got a 1
27:05
1 0 1 and a 1 1 0 0 and I need to find
27:09
out where they differ and up to that
27:11
point I'll have to introduce branch
27:15
nodes ok ok so I put in a branch node
27:18
there because now the subtree that is
27:21
rooted here is going to have whatever it
27:25
had before plus this one ok and before
27:28
it had one I was good once you got to so
27:30
you go to have a branch node so that
27:33
level you're going to branch at bit 3
27:35
bit 3 for both of these guys 1 1 0 0 & 1
27:40
1 0 1 is a 0 so you don't end up
27:43
petitioning ok so now I'm going to have
27:51
a branch node at level 4 where I'm going
27:54
to branch looking at bit for and bit for
27:59
for the old fellow's a 0 so show up in
28:02
the left subtree and for the new forward
28:04
so 1 so it'll show up in the right
28:06
subtree
28:06
ok so this is where the sets break down
28:10
into size one each and you put in yellow
28:13
nodes
28:22
right so this case needs one comparison
28:36
all right so any questions about an
28:38
insert you perform a search if you fall
28:43
off the try you insert a new node I did
28:45
a note there if you reach a data node
28:49
you have to do a compare and that could
28:52
involve inserting a bunch of new branch
28:54
nodes one per level
29:03
all right the total time for that insert
29:06
of course is limited by the number of
29:08
bits okay so it's order the number of
29:10
bits and the keys now let's take a look
29:14
at the delete operation all right
29:19
suppose you want to delete a zero
29:20
followed by three ones so you first
29:22
search for it first bit is a zero you
29:25
end up here a second bit is a one you
29:27
reach a data node you compare the keys
29:29
you find there's a match as you go to
29:31
get rid of this fellow okay alright so
29:36
that data node disappears and when that
29:39
data node disappears you got to decide
29:41
whether its parent if there is one
29:44
okay this branch note here is that a
29:46
legitimate branch node or should that
29:49
also disappear now this branch node
29:56
should disappear if it's other subtree
30:01
of has only one data node in it okay now
30:08
if this subtree had only one data node
30:10
in it then the sibling of the node II
30:12
threw away would be a data node but it's
30:15
a branch node which tells you there's
30:17
more than one down here okay every
30:19
branch node is the root of a subtree
30:21
that has at least two data nodes in it
30:26
so in this case this fall is going to
30:29
stay because the sibling of the yellow
30:31
node that went is a branch node telling
30:34
you that there at least two people here
30:36
and so you got to keep that guy okay
30:44
let's try another delete all right so
30:48
this deletion took one comparison you
30:50
reach a NATO node you have to compare I
30:54
suppose you want to delete a one one
30:56
zero zero so you followed the search
31:01
path one one zero zero you end up here
31:05
you do a comparison you find a match so
31:11
you get rid of that data node and now
31:14
you got to decide whether this branch
31:16
node stays or goes to make that decision
31:20
you look at the other child or the
31:22
sibling of the fellow who went that's
31:24
the data node which tells you if you
31:26
left this fellow here it would be the
31:28
root of a subtree that has only one they
31:30
don't note in it and that's not legal
31:33
right so this fellow's got to go well
31:39
then you back up here now you go to
31:41
decide whether this fall has got to go
31:44
okay now if this fellow stays this
31:47
uptight down here is going to have only
31:48
one data node in it the subtype down
31:52
there is empty so the subtree rooted
31:55
here is going to have only one data node
31:57
and so that node has got to go as well
32:01
now if this fellow had a non-empty child
32:05
well then this node would stay because
32:08
you have one fellow on this side and
32:10
some number here one or more on that
32:13
side okay but since you have an empty
32:15
sub try to the right this branch node
32:20
here has got to go since this which is
32:22
now going to be the root of a subtree
32:25
with only one yellow node in it all
32:28
right so that fellas got to go as well
32:32
so now I'm going to ask the question
32:35
whether this green node is got to go if
32:37
this one stays you'll have one yellow
32:39
node here with this guy in it and then
32:42
you'll have this subtree down here which
32:45
has one or more well since it's a branch
32:47
node there at least two people down here
32:49
so this green node has got to stay even
32:53
if you had a yellow node here that green
32:55
node would stay because then you'd have
32:57
one
32:57
yellow node and once that this subtree
32:59
would have two elements all right so
33:02
that fellow's got to stay we just put it
33:04
in that
33:17
all right so when you do a delete you
33:20
follow the search path and if the item
33:24
you're looking for is actually there
33:25
you'll end up at a data node you do a
33:28
comparison and it'll be a match we'll
33:31
throw away that data node then you have
33:34
to retrace a path going back up toward
33:36
the root looking for these branch nodes
33:38
that need to be thrown away okay a
33:43
branch node is going to get thrown away
33:45
if it's other child is empty all right
33:55
so we back up some number of levels
33:57
throwing away branch nodes as necessary
34:00
total time required is again linear in
34:04
the number of bits in the key and you
34:06
make it most one comparison
34:18
all right any questions about a delete
34:24
let's take a look at join and then the
34:30
split operation we'll take a look at in
34:32
the next lecture all right so for the
34:36
join if you recall look at a three-way
34:37
join you're given a dictionary of small
34:41
keys a dictionary of big keys and an
34:45
intermediate key so everybody in s is
34:47
smaller than M and everybody in B is
34:50
bigger than M we want to combine these
34:52
three entities into a single dictionary
34:56
okay
34:58
all right so the strategy that we're
35:01
going to adopt is we're going to first
35:03
really convert it into a two-way joint
35:07
by getting rid of that middle key so
35:12
I'll take that middle fellow M and I'll
35:15
insert it into the big tree and end up
35:18
with a new big tree B prime so they have
35:26
to take me time equal to the number of
35:27
bits now if my small tree is empty
35:33
then I'm done B prime is the answer if
35:38
my small tree just has a data node in it
35:41
so if the root is a data node I'll just
35:43
take that data and insert it into B
35:46
Prime and then I'm done so the time is
35:54
still order of the number of bits just
35:56
two times number of bits you did one
35:58
insert here and one insert up there now
36:05
if B prime is an element node which
36:10
means that B was empty okay well in that
36:13
case I'll just take that element m and
36:16
instead of I'll insert it into s because
36:20
you can organize this different you
36:21
could probably check out here if one of
36:24
the two is empty insert M into the non
36:27
empty one and you're done
36:31
basically all of this stuff here is to
36:34
take care of is the one reduce the
36:38
three-way joint to to a joint and to
36:40
take care of the case when one of the
36:42
two trees s and B is empty
36:45
not empty but has a it could be empty or
36:50
it has only one element in it and if
36:57
you're not done as a result of these
36:58
then when you get down here you
37:01
guaranteed that the root of both s and B
37:05
prime is a branch node okay so the first
37:12
four steps are to eliminate all the
37:15
simple cases and then if you get to the
37:17
fifth one down here both s and B prime
37:23
have a root that is a branch node which
37:26
means they both have two or more keys in
37:28
them all right so let's look at that
37:33
situation okay so now both s and B prime
37:38
have a branch node as the key as the
37:42
root and we break this thing down into
37:46
two cases depending upon whether s has
37:50
an empty right subtree or it doesn't
37:53
okay so if the small tree has an empty
37:56
right subtree okay so it looks like this
37:59
okay that site is empty
38:03
and we're trying to join it to B Prime
38:07
and B prime may have an empty left
38:13
subtree or it may not okay may have an
38:16
empty right subtree or it may not
38:17
doesn't really matter but the key thing
38:19
here is that s has an empty right
38:22
subtree so the result of joining these
38:30
two fellows when it looks like this
38:35
is this try here the right sub-tree is
38:41
going to be C C's got everything that
38:43
begins with a 1 nobody and B begins with
38:47
a one and nobody and a begins with a one
38:49
so C is going to be unchanged as a
38:53
result of the join and then what you're
38:57
going to get down here would be the
39:00
stuff that's here plus the stuff over
39:03
there but since everybody here and this
39:07
whole thing is smaller than everybody
39:09
here it follows that everybody here is
39:12
smaller than everybody there because
39:14
they both begin with zeros ok so we can
39:19
recursively do a two-way join for a and
39:22
B tries of a smaller height
39:40
okay all right any questions about this
39:42
case
39:52
yeah I didn't in this case the
39:56
assumption is that the small tree has an
39:59
empty right subtree and so it has a left
40:04
subtree a everybody in a begins with a
40:06
zero by definition of try the big 3 B
40:10
prime has sub trees B and C one or both
40:15
of well both can't be empty but one of
40:18
them could be empty doesn't really
40:19
matter so when you're doing the joint we
40:24
say the result of the joint is you're
40:26
going to have a root node which is a
40:29
branch node the right subtree is going
40:31
to be C because the right subtree is go
40:33
to have everybody that begins with a 1
40:34
and it's only the fellows here that
40:36
begin with a 1 the left subtree is going
40:39
to have everybody that begins with a 0
40:40
and those are contained in a as well as
40:43
B so I'll do a recursive join on a and B
40:48
ok and I can do a recursive joined on a
40:51
and B because since everybody in s is
40:55
smaller than everybody in B prime and
40:58
everybody in a begins with a 0 and
41:01
everybody in B begins with a 0 it
41:03
follows that if you forget about the 0
41:05
then everybody in a is smaller than
41:08
everybody and B again ok because once
41:10
you go down one level you lop off one
41:13
bit
41:14
ok so it's still true that everybody
41:17
here if you disregard the leftmost bit
41:19
everybody here is smaller than everybody
41:20
there and so you can make it because of
41:23
join with one bit less ok and so once
41:32
you come here you again have to check
41:33
all the cases I had before where what
41:38
where the root of you know this fellow's
41:40
root may be a data node ok in which case
41:45
you can just take that item and insert
41:46
it into B and so on
41:55
okay all right so let's look at the
41:59
second cases where s has a non-empty
42:02
right subtree so if s has a non-empty
42:12
right subtree let me just get some keys
42:16
that begin with a 1 okay so if it has
42:19
some keys that begin with a 1 then B
42:22
prime cannot have any keys that begin
42:24
with a 0 if b prime had some keys that
42:28
began with a 0 then everybody in b would
42:31
be bigger than everybody down here but
42:34
that's not possible
42:36
since everybody in this whole tree is
42:38
smaller than everybody and that whole
42:40
tree okay so if the right subtree here
42:47
is non-empty the left subtree here has
42:50
to be empty okay so it's got to look
42:53
like that and you can handle this in a
42:58
symmetric fashion to what we had before
43:01
okay so the left subtree of the join
43:04
will just have the left subtree of s and
43:07
the right subtree of the join you do a
43:09
two-way joint between b and c
43:25
all right so each time you make a
43:26
recursive call you go one level down to
43:28
try and the number of levels is the
43:32
number of bits so the time required is
43:35
also linear in the number of bits or
43:37
linear in the height of the try or any
43:50
questions about the joint
44:02
okay so we'll stop here and then in the
44:04
next lecture we'll take a look at a
44:05
split and we also look at things called
44:08
compressed binary tries we will get rid
44:10
of these branch nodes that have only one
44:12
child okay