Transcript 00:12 okay all right before starting I just 00:16 want to remind you a search for our 00:18 second exam for this Friday syllabus is 00:22 posted on the web as our sample tests 00:27 okay you have any questions you want to 00:29 ask alright so today we're going to 00:37 start looking at what you might call the 00:41 analog of radix sort to the search 00:43 problem we look at different ways of 00:47 organizing dictionaries so far the 00:51 dictionary organizations who looked at 00:52 whether it was binary search trees or 00:56 higher-order search trees were based on 00:59 making full key comparisons amongst the 01:01 search key and the items currently 01:03 stored in your dictionary the structures 01:07 we want to explore now are those where 01:10 the organization is based not on the 01:15 full key but on parts of the key so you 01:18 do a decomposition of the key using some 01:21 radix like you would in a radix sort and 01:25 initially what you're going to be 01:27 looking at is decompositions using a 01:29 radix of two so binary decompositions of 01:32 keys and then organizing such structures 01:35 based on the bits in a key okay so as I 01:42 said it's an analog of radix sort to 01:44 searching will interpret each key as 01:49 being a string of bits so you got zeros 01:53 and ones and we may be dealing either 01:58 with fixed length keys so for example 02:01 here all of the keys are four bits long 02:06 or you may be dealing with a situation 02:09 where the keys are of variable length so 02:13 some fields 02:14 maybe short and some maybe long okay 02:18 most of our discussion will be centered 02:21 around peace today around fixed-length 02:24 keys and part of it will make some 02:28 reference to variable length keys as 02:30 well the dictionary is organized using 02:41 say the bits in a key find a fair amount 02:47 of application one particular 02:49 application area where a large number of 02:52 variants of the fundamental structure 02:54 we're going to look at today have been 02:55 developed and are in commercial use is 02:58 in the implementation of router tables 03:01 for packet forwarding in the internet or 03:04 for packet classifiers and firewalls 03:07 so I guess sometime next week we'll have 03:13 a lecture where I'll just kind of review 03:14 the application of tries to do packet 03:19 forwarding and packet classification 03:20 applications okay but we'll see the 03:23 details of that next week and when we 03:29 get there we'll see that if you're 03:32 dealing with IP version 4 then the keys 03:35 will be variable lengths going up to 32 03:38 bits in length and if you're dealing 03:41 with IP version 6 the keys would be 03:44 variable length going up to 128 bits 03:49 okay so so what binary decomposition of 03:52 keys actually gets used quite a bit in 03:54 practice we'll also look at 03:57 decompositions using higher rate X's red 04:00 x's than two in a subsequent lecture all 04:05 right so let's start by looking at 04:06 probably the simplest type of such 04:09 structure based on key decomposition 04:11 this will be a digital search tree and 04:16 then we'll take a look at an alternative 04:20 to a digital surgery something called a 04:21 try 04:23 right so in a digital search tree okay 04:29 it's the search tree and if it's not 04:31 empty 04:32 then one of the pairs of your 04:36 dictionaries resides in the root you can 04:39 place any one you like out there and the 04:42 remaining ones get partitioned into the 04:44 left and right subtree depending upon 04:46 the first bit okay so if the first bit 04:50 is a 0 then the room those pairs will be 04:52 in the left subtree and if the first bit 04:54 is a 1 they will go into the right 04:56 subtree okay so the partitioning into 04:59 left and right subtrees isn't done by 05:02 comparing the entire key with what's in 05:05 the root but only by looking at the 05:06 first bit so if the first bit is a 0 you 05:09 go into the left subtree if the first 05:11 bit is a 1 you go into the right subtree 05:12 and then the left and right subtrees are 05:16 similarly organized right the best way 05:20 to get a feel for that definition is to 05:23 simply look at an example all right and 05:28 we'll actually build the example as a 05:30 sequence of inserts starting with an MT 05:33 digital search tree the number of nodes 05:36 in a digital search tree will be exactly 05:38 equal to the number of entries in the 05:40 dictionary so when you start with an 05:42 empty tree we've got no nodes and then 05:45 we're going to insert a pair whose keys 05:47 is 0 1 1 0 this would result in a 1 node 05:52 search tree and that one node search 05:54 tree will have a dictionary pair in it 05:55 this the single one that's in the 05:57 dictionary so you just end up with that 06:01 right now suppose you want to insert the 06:05 key 0 0 1 0 so the first thing we'll do 06:11 is we'll search for the key okay so will 06:14 we have only one node so look at the key 06:17 in this node and find that they're 06:18 different ok so it's not a duplicate you 06:22 put it into the tree I will look at its 06:24 first bit which is a 0 so you have to 06:26 put it in the left subtree ok and since 06:29 the left subtree is empty there's 06:30 nothing to search there we just get a 06:32 new node and put that in over there to 06:35 end up with the structure 06:42 all right now suppose you want to put in 06:44 a 1 0 0 1 so you started the route to 06:50 make a comparison okay they're not the 06:53 same okay to decide whether to go to the 06:56 left or the right sub-tree when you have 06:57 the root level you look at bit 1 ok the 07:00 leftmost bit which is a 1 so I go into 07:02 the right subtree the right subtree is 07:05 empty so that's where you make the 07:07 insert and we end up with that structure 07:17 all right now suppose we want to insert 07:20 a 1 0 1 1 so you start at the root do 07:23 you compare they different then you look 07:27 at the first bit it's a 1 so we end up 07:28 here okay you compare the two they're 07:32 different at this level you look at bit 07:34 to the 2nd bit out there's a 0 so we'll 07:37 go down here you fall off the tree 07:39 that's the place to make the insert all 07:51 right suppose you want to insert 4 zeros 07:54 so you've started the root you don't get 07:58 a match so you look at the first bit 08:00 it's a 0 you end up here you compare the 08:02 two keys they're not equal so you get 08:05 the second bit from there which is a 0 08:07 and you go into the left subtree here 08:09 okay since it's empty you know that 4 08:12 zeros is not currently present and 08:13 that's the place to insert the new pair 08:24 okay so in a digital search tree the 08:29 root can contain any one of the 08:30 dictionary pairs the remaining pairs are 08:32 partitioned depending upon the first bit 08:35 if it's a zero you come here if it's a 08:38 one you go there okay at this level here 08:42 the petitioning will be done based on 08:44 bit too 08:45 then on bit three than bit four and so 08:47 on 08:59 right so if you want to search a digital 09:02 search tree 09:02 they said you start at the root okay you 09:06 compared the search key with what's in 09:08 here if it's equal you're done if it's 09:10 not you look at bit one if it's a zero 09:12 you go there otherwise you go to the 09:14 other side and you kind of repeat that 09:16 process at each level we saw how to do 09:19 an insert you basically do a search and 09:22 then wherever you fall off you insert a 09:24 new node with the new pair if you want 09:32 to do a delete so suppose you want to 09:42 delete the item in the root so to delete 09:52 the item in the root you have to find a 09:56 replacement for that item okay and you 10:03 have to get a replacement in such a way 10:05 that you don't mess up the partitioning 10:06 x' at the remaining levels at each level 10:09 your elements or pairs are being 10:11 partitioned by a particular bit so a 10:16 suitable replacement for this fellow 10:19 here would be anybody who's sitting in a 10:23 leaf below it okay so if I took the four 10:30 zeros and put it up here 10:32 then I could throw this node away and I 10:34 would not affect the level numbers of 10:36 any other node if I took this fellow and 10:40 put it up here I could throw the leaf 10:41 away okay now so some things that don't 10:54 work for example is if you're trying to 10:59 remove let's say this one one zero zero 11:01 one then if you just change the pointer 11:05 from here to here okay and assume for a 11:09 moment there are sub trees down here you 11:11 will end up with a problem 11:12 okay because the petitioning here in two 11:15 subtrees was done previously by level 11:17 three but by promoting this up to level 11:19 two the petitioning is now done here by 11:22 bit three rather than bit too okay so 11:26 you would have a problem okay so when 11:31 you do a delete you need to go down 11:35 until you find a leaf find the item in 11:37 the leaf move it up to where you've got 11:39 a vacancy and then throw away that leaf 11:43 okay so it's not quite the same as 11:46 deletion from a binary search tree right 11:52 the time required to perform any of the 11:55 three operation search insert and delete 11:56 depends on the number of bits and the 11:58 key it's linear in the number of bits so 12:04 if the number of bits is small this is a 12:06 good way to go if the number of bits is 12:08 large this isn't such a good way to go 12:13 aren't any questions about digital 12:15 search trees yeah right the question is 12:36 what happens if the bits and the key are 12:40 exhausted and you still haven't fallen 12:41 off the tree okay and the assumption we 12:44 made was that all keys are of the same 12:48 length and when all keys are the same 12:51 length that case cannot arise so that 12:56 that's the reason to make that 12:57 assumption that all keys are of the same 13:00 length now that there's a way around it 13:04 and we will see that I guess when you 13:07 look at variants of this structure where 13:10 you make provision to hold different 13:12 kinds of elements in the interior or you 13:16 take a key and at the end you put in a 13:18 special digit so then you really 13:22 increase the the radix by one okay so we 13:29 may take binary Keys and suffix them 13:33 with a special symbol say a pound symbol 13:36 now you cannot have one key which is a 13:40 proper prefix of another and that 13:42 prevents the case that you mentioned but 13:46 we'll see that later when we look at 13:48 higher degree tries okay so under the 13:53 assumption that all keys are of the same 13:54 length the insertion algorithm always 13:58 works the way I described if the item is 14:01 not the you will always fall off the 14:03 tree before you run out of bits okay 14:08 other questions 14:14 now in this scheme okay when you're 14:21 doing a search or any of the other 14:22 operations you make one comparison at 14:25 each level so you compare the key stored 14:27 here with the search key so the total 14:29 number of key comparisons is also of the 14:33 order of the number of bits 14:35 okay now this makes the strategy its 14:40 present form not very efficient when 14:44 you're dealing with long keys because 14:47 then the key comparisons are expensive 14:49 even though you doing a small number of 14:51 them each one is expensive 14:55 all right so people came up with 14:59 alternative structures which are also 15:01 based on this radix decomposition idea 15:03 where the total number of full key 15:07 comparisons that is needed is limited to 15:10 one so let's take a look at those 15:15 alternative structures the simplest is a 15:18 binary try and then you can go to higher 15:20 order trice and variants of higher order 15:23 try say we look at later all right so 15:27 the the word try was derived from the 15:31 word retrieval the structure was 15:33 initially designed for information 15:35 retrieval applications where people were 15:37 indexing large texts so maybe you're 15:41 indexing articles in a magazine and 15:43 there could be thousands of characters 15:46 one so you didn't really want to be 15:48 comparing thousand character Keys at 15:52 each level of the try of the structure 15:55 or the nice property about these is you 16:01 can perform your operations by using at 16:05 most one comparison again we'll assume 16:10 there are the our keys are all of the 16:12 same length but as we'll see later 16:16 that's really not a hard requirement we 16:18 can now relax that to accommodate 16:21 variable length keys 16:25 so when you're dealing with fixed-length 16:28 keys we'll have two types of nodes in 16:31 the system one is a branch node and a 16:35 branch node has a left child and a right 16:37 child pointer but no data fields okay so 16:43 that's different from a binary for a 16:47 digital search tree where the nodes had 16:49 both left child right child and a data 16:51 field so here a branch node cannot store 16:54 data you can only have pointers then you 16:58 have element nodes where element nodes 17:02 can't have children but they can only 17:06 store data all right so the fixed-length 17:13 case we will have a node structure you 17:16 have really two types of nodes branch 17:18 nodes and element nodes let's look at an 17:25 example all right so the green nodes 17:28 here are the branch nodes in the yellow 17:32 ones are the element nodes the 17:36 dictionary pairs or keys are divided out 17:40 again by looking at the bits okay 17:43 so the root level you look at the first 17:45 bit if the first bit is 0 it's going to 17:47 show up on the left side if the first 17:49 bit is a 1 it's going to show up in the 17:51 right side if you have more than one 17:55 dictionary pair in a subtree then again 17:58 your petition but now we're going to do 18:01 here with bit too so if it's a zero you 18:04 come here if it's a one you go then 18:06 again if this subtree is got more than 18:08 two pairs 18:10 you'll partition if it's a zero in bit 18:13 three you come here with someone you 18:14 come down okay so the partitioning goes 18:17 on until you're left with a single pair 18:21 and at that time you put in a data node 18:28 you might call this a pure try in 18:32 practice people will not do this 18:35 petitioning all the way down to single 18:37 entities or single pairs you may stop 18:40 the petitioning when you reach a number 18:43 of pairs that maybe say 5 or 10 so you 18:48 have a concept of a bucket and once the 18:51 number of pairs here is small enough 18:53 that it fits into a bucket you stop the 18:55 petitioning if you look at a search so 19:11 suppose you are searching for a 0 0 1 1 19:13 okay well then you start at the root 19:16 okay you look at the first bit it's a 0 19:19 so you come here you look at the second 19:21 bidder to 0 you come there you look at 19:22 the third bit it's a 1 you come here so 19:26 once you reach a data node then you have 19:30 to do a compare to make sure there is a 19:33 match because you could come here for 19:36 any key that begins with 0 0 1 so if 19:41 even if it had been 0 0 1 0 it could 19:44 have been stored here so you follow a 19:49 path dictated by the bits in your search 19:51 key and that path would either fall off 19:57 the try so for example if you looking 20:00 for something that begins with 0 1 you'd 20:02 fall off here in which case no 20:04 comparison is done or the path will end 20:08 at a data node in which case you have to 20:10 compare the search key with the key 20:12 stored there to see if there's a match 20:20 all right so searches can now be done 20:23 with at most one comparison whereas the 20:25 previous structure digital search trees 20:27 required a number of comparisons that 20:30 was equal to the number of bits 20:40 all right we have variable-length keys 20:43 then I will use a single node type which 20:52 could have say left and right child 20:54 fields and each node will be able to 21:00 store two pairs okay a left and a right 21:04 there and the strategy I will use is 21:08 that the left pair will have a key that 21:14 terminates at the root of the left 21:16 subtree of this node or if the left 21:21 subtree is only going to contain one 21:23 pair then I'll stop here won't do any 21:25 further partitioning similar strategy 21:28 for the right pair again it's probably 21:32 easier to look at an example than to try 21:34 and follow the wording of the thing all 21:41 right so here I guess I've got two 21:47 fields in a node this I guess colors not 21:50 showing up too good yeah so I got a 21:52 yellow field that's the left one and 21:54 then I've got a light blue field that's 21:56 the right one so at the root level I can 22:03 store pairs whose keys are one bit long 22:07 okay so if the key was zero the pair 22:11 would be in the store as the left pair 22:13 if the key was one it'd be stored as the 22:15 right pet 22:16 in this case I don't have a pair whose 22:18 key is one at the next level I can store 22:22 pairs whose keys are two bits long okay 22:25 so on this side the first bit would have 22:28 to be zero and then this node would have 22:30 a zero zero pair and a zero one pair an 22:34 exception is if there is only one pair 22:37 to go in the subtree then I'll just 22:39 store it here instead of going down one 22:41 two three four five levels to storage 22:44 okay so I have a provision to store 22:47 early in case there's only one pair left 22:49 in the subtree 22:52 again here this node is nominally four 22:56 keys that are two bits long okay they 22:59 have to begin with the one so nominally 23:01 you'd put a one zero here now one one 23:02 here exception being that if that 23:05 subtree is going to have only one key 23:08 then you might as well store it here 23:09 instead of having a long path where 23:12 you're trying to partition that single 23:14 key further okay all right so this would 23:19 be one way to handle keys of different 23:21 lengths in a binary scenario 23:37 again you would search these in a 23:40 similar fashion to the previous 23:41 structure so you'd follow the bits and 23:48 the last node you see on the path take a 23:52 look at the content do a compare if it 23:56 matches your okay otherwise it's not in 23:58 the tree all right any questions about 24:05 the fixed length of the variable length 24:10 try we'll take a look at insert and 24:12 delete from the fixed length in a moment 24:16 yes what do you mean as an index well 24:35 actually both the this thing is just one 24:38 big node okay 24:40 so the yellow part represents the left 24:42 data and the light blue part represents 24:45 the right data and then you have two 24:47 child pointers the left child and the 24:49 right child pointer okay so this could I 24:53 mean if you didn't have you know these 24:56 fellows here okay then this thing here 24:59 could have kept a 0 0 1 0 0 just as well 25:02 if there were no other fellows that 25:05 began with 0 0 so the partitioning 25:10 strategy is again you look at the bits 25:12 left to right and if you're left with 25:16 only one pair then you store it in that 25:19 node instead of continuing to partition 25:21 a single term if you're not left with 25:26 one pair then what you can store at a 25:29 level is determined by the length of the 25:32 key so at level one you can store keys 25:34 of length 1 then length to length 3 25:40 all right other questions okay let's go 25:54 back to the fixed-length case and look 25:56 at how you would do an insert all right 26:00 suppose you want to insert 0 followed by 26:02 3 ones so we'd follow a path from the 26:05 root first bit is a 0 so you go into the 26:08 left subtree next bit is a 1 so we fall 26:11 off over here that tells us that I don't 26:15 already have 0 1 1 1 in my dictionary 26:18 and that the place to insert it would be 26:22 out here 26:34 all right so this insert required no 26:36 compares right next lists insert 1 1 0 1 26:44 so again you start at the root the first 26:47 bit is a 1 the second bit is a 1 and you 26:50 reach a data node so you would compare 26:54 the keys and we find they're different 26:57 so to find the place to make the 27:00 insertion I need to continue to compare 27:03 these two keys together so I've got a 1 27:05 1 0 1 and a 1 1 0 0 and I need to find 27:09 out where they differ and up to that 27:11 point I'll have to introduce branch 27:15 nodes ok ok so I put in a branch node 27:18 there because now the subtree that is 27:21 rooted here is going to have whatever it 27:25 had before plus this one ok and before 27:28 it had one I was good once you got to so 27:30 you go to have a branch node so that 27:33 level you're going to branch at bit 3 27:35 bit 3 for both of these guys 1 1 0 0 & 1 27:40 1 0 1 is a 0 so you don't end up 27:43 petitioning ok so now I'm going to have 27:51 a branch node at level 4 where I'm going 27:54 to branch looking at bit for and bit for 27:59 for the old fellow's a 0 so show up in 28:02 the left subtree and for the new forward 28:04 so 1 so it'll show up in the right 28:06 subtree 28:06 ok so this is where the sets break down 28:10 into size one each and you put in yellow 28:13 nodes 28:22 right so this case needs one comparison 28:36 all right so any questions about an 28:38 insert you perform a search if you fall 28:43 off the try you insert a new node I did 28:45 a note there if you reach a data node 28:49 you have to do a compare and that could 28:52 involve inserting a bunch of new branch 28:54 nodes one per level 29:03 all right the total time for that insert 29:06 of course is limited by the number of 29:08 bits okay so it's order the number of 29:10 bits and the keys now let's take a look 29:14 at the delete operation all right 29:19 suppose you want to delete a zero 29:20 followed by three ones so you first 29:22 search for it first bit is a zero you 29:25 end up here a second bit is a one you 29:27 reach a data node you compare the keys 29:29 you find there's a match as you go to 29:31 get rid of this fellow okay alright so 29:36 that data node disappears and when that 29:39 data node disappears you got to decide 29:41 whether its parent if there is one 29:44 okay this branch note here is that a 29:46 legitimate branch node or should that 29:49 also disappear now this branch node 29:56 should disappear if it's other subtree 30:01 of has only one data node in it okay now 30:08 if this subtree had only one data node 30:10 in it then the sibling of the node II 30:12 threw away would be a data node but it's 30:15 a branch node which tells you there's 30:17 more than one down here okay every 30:19 branch node is the root of a subtree 30:21 that has at least two data nodes in it 30:26 so in this case this fall is going to 30:29 stay because the sibling of the yellow 30:31 node that went is a branch node telling 30:34 you that there at least two people here 30:36 and so you got to keep that guy okay 30:44 let's try another delete all right so 30:48 this deletion took one comparison you 30:50 reach a NATO node you have to compare I 30:54 suppose you want to delete a one one 30:56 zero zero so you followed the search 31:01 path one one zero zero you end up here 31:05 you do a comparison you find a match so 31:11 you get rid of that data node and now 31:14 you got to decide whether this branch 31:16 node stays or goes to make that decision 31:20 you look at the other child or the 31:22 sibling of the fellow who went that's 31:24 the data node which tells you if you 31:26 left this fellow here it would be the 31:28 root of a subtree that has only one they 31:30 don't note in it and that's not legal 31:33 right so this fellow's got to go well 31:39 then you back up here now you go to 31:41 decide whether this fall has got to go 31:44 okay now if this fellow stays this 31:47 uptight down here is going to have only 31:48 one data node in it the subtype down 31:52 there is empty so the subtree rooted 31:55 here is going to have only one data node 31:57 and so that node has got to go as well 32:01 now if this fellow had a non-empty child 32:05 well then this node would stay because 32:08 you have one fellow on this side and 32:10 some number here one or more on that 32:13 side okay but since you have an empty 32:15 sub try to the right this branch node 32:20 here has got to go since this which is 32:22 now going to be the root of a subtree 32:25 with only one yellow node in it all 32:28 right so that fellas got to go as well 32:32 so now I'm going to ask the question 32:35 whether this green node is got to go if 32:37 this one stays you'll have one yellow 32:39 node here with this guy in it and then 32:42 you'll have this subtree down here which 32:45 has one or more well since it's a branch 32:47 node there at least two people down here 32:49 so this green node has got to stay even 32:53 if you had a yellow node here that green 32:55 node would stay because then you'd have 32:57 one 32:57 yellow node and once that this subtree 32:59 would have two elements all right so 33:02 that fellow's got to stay we just put it 33:04 in that 33:17 all right so when you do a delete you 33:20 follow the search path and if the item 33:24 you're looking for is actually there 33:25 you'll end up at a data node you do a 33:28 comparison and it'll be a match we'll 33:31 throw away that data node then you have 33:34 to retrace a path going back up toward 33:36 the root looking for these branch nodes 33:38 that need to be thrown away okay a 33:43 branch node is going to get thrown away 33:45 if it's other child is empty all right 33:55 so we back up some number of levels 33:57 throwing away branch nodes as necessary 34:00 total time required is again linear in 34:04 the number of bits in the key and you 34:06 make it most one comparison 34:18 all right any questions about a delete 34:24 let's take a look at join and then the 34:30 split operation we'll take a look at in 34:32 the next lecture all right so for the 34:36 join if you recall look at a three-way 34:37 join you're given a dictionary of small 34:41 keys a dictionary of big keys and an 34:45 intermediate key so everybody in s is 34:47 smaller than M and everybody in B is 34:50 bigger than M we want to combine these 34:52 three entities into a single dictionary 34:56 okay 34:58 all right so the strategy that we're 35:01 going to adopt is we're going to first 35:03 really convert it into a two-way joint 35:07 by getting rid of that middle key so 35:12 I'll take that middle fellow M and I'll 35:15 insert it into the big tree and end up 35:18 with a new big tree B prime so they have 35:26 to take me time equal to the number of 35:27 bits now if my small tree is empty 35:33 then I'm done B prime is the answer if 35:38 my small tree just has a data node in it 35:41 so if the root is a data node I'll just 35:43 take that data and insert it into B 35:46 Prime and then I'm done so the time is 35:54 still order of the number of bits just 35:56 two times number of bits you did one 35:58 insert here and one insert up there now 36:05 if B prime is an element node which 36:10 means that B was empty okay well in that 36:13 case I'll just take that element m and 36:16 instead of I'll insert it into s because 36:20 you can organize this different you 36:21 could probably check out here if one of 36:24 the two is empty insert M into the non 36:27 empty one and you're done 36:31 basically all of this stuff here is to 36:34 take care of is the one reduce the 36:38 three-way joint to to a joint and to 36:40 take care of the case when one of the 36:42 two trees s and B is empty 36:45 not empty but has a it could be empty or 36:50 it has only one element in it and if 36:57 you're not done as a result of these 36:58 then when you get down here you 37:01 guaranteed that the root of both s and B 37:05 prime is a branch node okay so the first 37:12 four steps are to eliminate all the 37:15 simple cases and then if you get to the 37:17 fifth one down here both s and B prime 37:23 have a root that is a branch node which 37:26 means they both have two or more keys in 37:28 them all right so let's look at that 37:33 situation okay so now both s and B prime 37:38 have a branch node as the key as the 37:42 root and we break this thing down into 37:46 two cases depending upon whether s has 37:50 an empty right subtree or it doesn't 37:53 okay so if the small tree has an empty 37:56 right subtree okay so it looks like this 37:59 okay that site is empty 38:03 and we're trying to join it to B Prime 38:07 and B prime may have an empty left 38:13 subtree or it may not okay may have an 38:16 empty right subtree or it may not 38:17 doesn't really matter but the key thing 38:19 here is that s has an empty right 38:22 subtree so the result of joining these 38:30 two fellows when it looks like this 38:35 is this try here the right sub-tree is 38:41 going to be C C's got everything that 38:43 begins with a 1 nobody and B begins with 38:47 a one and nobody and a begins with a one 38:49 so C is going to be unchanged as a 38:53 result of the join and then what you're 38:57 going to get down here would be the 39:00 stuff that's here plus the stuff over 39:03 there but since everybody here and this 39:07 whole thing is smaller than everybody 39:09 here it follows that everybody here is 39:12 smaller than everybody there because 39:14 they both begin with zeros ok so we can 39:19 recursively do a two-way join for a and 39:22 B tries of a smaller height 39:40 okay all right any questions about this 39:42 case 39:52 yeah I didn't in this case the 39:56 assumption is that the small tree has an 39:59 empty right subtree and so it has a left 40:04 subtree a everybody in a begins with a 40:06 zero by definition of try the big 3 B 40:10 prime has sub trees B and C one or both 40:15 of well both can't be empty but one of 40:18 them could be empty doesn't really 40:19 matter so when you're doing the joint we 40:24 say the result of the joint is you're 40:26 going to have a root node which is a 40:29 branch node the right subtree is going 40:31 to be C because the right subtree is go 40:33 to have everybody that begins with a 1 40:34 and it's only the fellows here that 40:36 begin with a 1 the left subtree is going 40:39 to have everybody that begins with a 0 40:40 and those are contained in a as well as 40:43 B so I'll do a recursive join on a and B 40:48 ok and I can do a recursive joined on a 40:51 and B because since everybody in s is 40:55 smaller than everybody in B prime and 40:58 everybody in a begins with a 0 and 41:01 everybody in B begins with a 0 it 41:03 follows that if you forget about the 0 41:05 then everybody in a is smaller than 41:08 everybody and B again ok because once 41:10 you go down one level you lop off one 41:13 bit 41:14 ok so it's still true that everybody 41:17 here if you disregard the leftmost bit 41:19 everybody here is smaller than everybody 41:20 there and so you can make it because of 41:23 join with one bit less ok and so once 41:32 you come here you again have to check 41:33 all the cases I had before where what 41:38 where the root of you know this fellow's 41:40 root may be a data node ok in which case 41:45 you can just take that item and insert 41:46 it into B and so on 41:55 okay all right so let's look at the 41:59 second cases where s has a non-empty 42:02 right subtree so if s has a non-empty 42:12 right subtree let me just get some keys 42:16 that begin with a 1 okay so if it has 42:19 some keys that begin with a 1 then B 42:22 prime cannot have any keys that begin 42:24 with a 0 if b prime had some keys that 42:28 began with a 0 then everybody in b would 42:31 be bigger than everybody down here but 42:34 that's not possible 42:36 since everybody in this whole tree is 42:38 smaller than everybody and that whole 42:40 tree okay so if the right subtree here 42:47 is non-empty the left subtree here has 42:50 to be empty okay so it's got to look 42:53 like that and you can handle this in a 42:58 symmetric fashion to what we had before 43:01 okay so the left subtree of the join 43:04 will just have the left subtree of s and 43:07 the right subtree of the join you do a 43:09 two-way joint between b and c 43:25 all right so each time you make a 43:26 recursive call you go one level down to 43:28 try and the number of levels is the 43:32 number of bits so the time required is 43:35 also linear in the number of bits or 43:37 linear in the height of the try or any 43:50 questions about the joint 44:02 okay so we'll stop here and then in the 44:04 next lecture we'll take a look at a 44:05 split and we also look at things called 44:08 compressed binary tries we will get rid 44:10 of these branch nodes that have only one 44:12 child okay