Transcript 00:12 [Music] 00:20 okay yeah we were looking at binary 00:24 tries and we had seen how to perform 00:28 pretty much all of the standard 00:30 dictionary operations on a binary try 00:32 other than the split operation so today 00:37 we're going to start by taking a look at 00:38 the split operation and then we're going 00:41 to look at a variant of the binary try 00:47 which is called a compressed binary try 00:49 which uses potentially a smaller number 00:53 of ranch nodes then is used by an 00:55 ordinary binary try okay so and actually 01:00 in the next lecture we'll take a look at 01:03 a rather interesting method to represent 01:07 a compressed binary tree okay alright so 01:12 we're looking at binary tries the 01:15 outstanding operation to look at now is 01:17 the split in a split operation you're 01:20 given a key and you're supposed to now 01:23 create two tries one containing elements 01:28 whose key is smaller than the split key 01:30 K the small try and then another try 01:34 called the big try containing elements 01:36 whose key is bigger than K and if you 01:38 happen to have an element whose key is 01:39 equal to K then that's not present in 01:42 either of these two tries it's returned 01:44 as a separate entity right the split 01:50 process is very similar to how you would 01:52 split in in an unbalanced binary search 01:56 tree namely you'd use the key and follow 01:58 a path looking for the split key and as 02:02 you follow this path you'll separate out 02:04 in this case they try into two parts 02:09 yeah once you're done splitting your try 02:14 into the small and big try on the way 02:16 down we will have a pass that is 02:19 normally not there when you're splitting 02:23 an unbalanced binary tree a backward 02:25 pass where we're going to do some 02:27 cleanup to end up with the proper binary 02:30 tries 02:32 let's the forward pass which is the 02:40 splitting pass going down the Tri so 02:43 when you're sitting at a tri node say 02:45 you're sitting at node X then if this 02:47 node is at say level X of the tree then 02:50 you're going to be splitting based on 02:52 bit so if it's at level J you and we're 02:56 splitting based on bit J of the key okay 02:59 so we're sitting at a node in the Tri if 03:05 you recall the forward pass of splitting 03:08 a binary search tree you started the 03:10 root and you follow a path okay and then 03:13 when you're at a node you perform some 03:16 action and in the case of a try the 03:21 action you perform is based on the bit 03:24 the appropriate bit in the key K rather 03:27 than on a comparison between the key K 03:30 and whatever key may be stored in this 03:32 node okay so if the jet bit of the key 03:36 is a 1 then we'll be moving to the side 03:40 ok and that would also tell us that 03:43 everything in a is smaller than the 03:46 split key K because everything in a has 03:50 the j thread 0 whereas the key k has the 03:54 Jareth bit equal to 1 okay so if you're 03:58 moving to the side then the action we 04:01 will take is into the small try I will 04:06 add this branch node together with the 04:09 subtree a with a sub try a and then into 04:18 the big try I will still need to put in 04:22 a branch node ok so I'll put in a branch 04:25 node that doesn't have oh oh that has an 04:29 empty left subtree and then you kind of 04:32 move down towards the right 04:40 all right similarly if the Jets bit of K 04:43 is a zero so we're moving down this side 04:45 then I know that the fellows here are 04:48 bigger than the split key and so into 04:52 the big try I will add X with the sub 04:55 try B and do the small try I'll just add 04:58 a node X okay so we'll put this into the 05:03 big try and we'll put a branch node at 05:09 that level into the small try alright so 05:15 those are the two basic operations or 05:17 the two basic cases to know let's take a 05:19 look at an example to see how that works 05:25 all right in this example okay 05:29 I have a try and I'm going to split 05:33 based upon the key here so in this 05:37 example if we can I guess assume that G 05:43 actually is an element node so there's a 05:46 key in there okay so I'm splitting based 05:50 upon the key that's in G and then it 05:54 doesn't really matter how you interpret 05:56 the other fellows they could be 05:57 interpreted as being sub tries with many 06:00 nodes in them or you could interpret 06:02 them as just being element nodes okay 06:05 but as far as this fellow is concerned 06:07 this is an element node it's got a 06:10 single key in there and that key value 06:12 is 1 0 1 0 1 1 and we're going to be 06:15 splitting based upon this key so I'm 06:20 going to follow a path from here 06:21 I look at the bits and K the first bit 06:24 is a one's you end up here the next bit 06:26 is a 0 then a 1 0 a 1 and a 1 you 06:30 finally end up here okay each time I 06:32 move down the try I will create a small 06:36 and big try component here ok 06:39 placing all of the elements in this try 06:42 into either S or B when I'm done 06:46 everybody will be in either sob except 06:48 for the fellow here 06:51 okay this gal be returned as a separate 06:53 entity all right so when we start the 06:58 small tries empty and the big try is 07:00 empty we look at the first bit okay any 07:04 move here so now my rule says this node 07:09 and that subtree are going to get added 07:14 into the appropriate level here which 07:16 means at the root level and into the big 07:21 fellow we'll just put a branch node with 07:24 no subtree okay right so that guy goes 07:26 away you get this in a small try and you 07:31 just get a branch node in the big try 07:34 okay all right then I'm sitting at this 07:37 node in my input try and now I'm going 07:41 to use the next bit which is a zero so 07:44 I'm going to move this way 07:45 and my rule says when you move this way 07:48 then into the big try you add this sub 07:53 try be together with a branch node okay 07:56 so down here I'm going to add a branch 07:59 node together with the sub try B and 08:03 into the small try I'm going to just put 08:07 in a branch node here okay all right so 08:10 that node goes away here you will just 08:14 put in a branch node and over here we'll 08:17 put in both a branch node and the sub 08:19 troy V all right then I move to this 08:27 level 3 node now I use the third bit 08:29 from the left in my key which is a 1 08:32 somewhere to come down here so a branch 08:36 node together with the sub traicee will 08:38 get added into the small try at level 3 08:49 and then here at level three you just 08:55 put in a branch note when I'm sitting at 09:01 this node so I look at the next bit 09:03 which is the fourth bit from the left 09:05 that's a zero so that's going to cause 09:06 me to move this way so into the big try 09:10 I'm going to place a branch node 09:14 together with this sub tri-d 09:16 and into the small try I'm going to just 09:18 place a branch node 09:27 again exactly where you place this node 09:30 is determined by where this node is 09:34 present in this fellow so you just put 09:36 it in the corresponding place out there 09:40 all right so now we're going to look at 09:42 the next bit which is a 1 so that's 09:46 going to cause us to move that way and 09:48 then into the small try I'll put a 09:52 branch node together with the subtree II 09:54 and the big try a branch node with no 09:57 sub tries ok so this fellow goes away 10:00 you get that fellow there and you get 10:04 this guy here all right we come here I 10:10 use the fifth bit I guess the sixth bit 10:14 from the left and that's a 1 so I'm 10:17 going to move down here so this a branch 10:20 node together with this subtree will go 10:22 into the small try and into the big try 10:28 you'll put a branch node right then you 10:37 reach the node G and that's an element 10:40 node is contained 10:41 it contains the element that you were 10:44 splitting on and that's not placed 10:45 anywhere all right so that's the forward 10:52 pass right any questions about the 10:56 forward pass if you look at the result 11:02 of the forward pass what you'll see is 11:04 that you don't really have to properly 11:09 structured tries okay they do satisfy 11:15 the property that whatever elements are 11:17 here the keys are smaller than the split 11:19 key whatever elements are here 11:21 the keys are bigger than the split key 11:23 all of the original elements except for 11:26 G is are either here or there so 11:28 everybody's been taken care of okay 11:30 structurally it looks like a binary try 11:33 but it fails to be a binary try because 11:36 a binary try has an important property 11:39 we mentioned last time 11:41 that every branch node has at least two 11:45 elements in it in that subtree okay so 11:49 if you were to look at say this branch 11:52 node there are no elements in the 11:53 subtree rooted here this branch node 11:56 here has no elements of the subtree 11:58 rooted at this branch node has only if d 12:04 is an element then this one has only one 12:07 element on the other hand if d is a 12:09 subtree then you've got at least two 12:11 because d is guaranteed to have two 12:18 similarly I feel here this node here 12:21 this branch node has only one element in 12:24 its subtree if F is an element node but 12:27 if F is a branch node then this node 12:30 here has more than one in which case 12:35 you're okay okay so in the yellow try 12:40 you may or may not have a problem 12:42 depending upon whether F is a branch 12:45 node or an element node in the blue try 12:47 you at least have a problem here and 12:49 here you may or may not have a problem 12:52 here depending upon whether D is a 12:54 branch node or an element node okay so 12:58 we need to make a backward pass fixing 13:00 these possible violations of a binary 13:05 tribe property all right so you have a 13:10 backward cleanup pass you move back on 13:13 both the small try the yellow try and 13:16 the bigger blue try getting rid of 13:21 branch nodes that aren't the root of sub 13:25 tries that have at least two elements 13:33 all right so we will start at the 13:37 bottommost branch node so this node in 13:40 this case and the corresponding node 13:42 here and move upwards okay all right so 13:45 let's suppose we do the yellow try first 13:47 they were here then I'll verify whether 13:50 F is a branch node or an element node if 13:54 it's an element node I have to eliminate 13:56 this fellow and move this guy one up 14:01 okay so let's assume in this example 14:04 that F is an element node okay so we 14:09 will get rid of the yellow branch node 14:13 shown in that red box and create this 14:17 situation right from here I don't need 14:22 to go any further up since there are two 14:24 elements in this sub try going up will 14:27 only possibly increase but not decrease 14:29 the number of elements in the subtree 14:33 all right turning our attention to the 14:35 blue fellow the blue try okay all right 14:38 this branch node has no elements so that 14:41 node has to go away then you come here 14:47 this guy's got no children so that goes 14:51 away then you come out here you have a 14:54 child if this is an element node you 14:57 have to get rid of this branch node if 14:59 this is a branch node you're done okay 15:03 now let's assume D is a branch node okay 15:06 then we don't have to go any further up 15:14 all right so there is a backward cleanup 15:18 pass you need to retrace your path on 15:20 both the yellow and the blue try 15:22 eliminating branch nodes that are not 15:25 the root of sub tries with at least two 15:27 elements and that would complete the 15:32 process or any questions about a split 15:38 it works in order of height time yeah 15:52 I do we what 16:08 okay oh okay so as you're retracing your 16:14 path upwards okay so this branch node 16:17 will disappear so you just get rid of it 16:19 and in this case F then becomes a child 16:23 of this branch note here was that your 16:25 question yeah okay 16:38 well let's turn our attention to a 16:40 variant of a binary try called a 16:42 compressed binary try such a try has the 16:48 property that it contains no branch node 16:51 that has only one child every branch 16:57 node is required to have two children 17:02 and to make the scheme work will augment 17:06 every branch node that survives with a 17:09 bit number field that tells you which 17:11 bit was used to branch at that node in 17:15 an uncompressed binary tree as we've 17:18 been seeing so far the level of the node 17:21 tells you which bit to use okay but in a 17:24 compressed binary try we're going to get 17:26 rid of some of the branch nodes 17:28 specifically those branch nodes that had 17:31 a degree one in an uncompressed try and 17:33 now you no longer have a match between 17:35 the level number and the bit that was 17:37 used for the branching okay so that 17:40 information is stored explicitly in a 17:42 node alright so here's an uncompressed 17:49 binary tree it has some nodes whose 17:52 degree is 1 some branch nodes ok so you 17:55 got a branch node over there it has only 17:58 one child you have another one over 18:02 there and yet another one there okay so 18:09 in a compressed binary try the these 3 18:14 branch nodes of degree 1 will be absent 18:17 so you just toss them away and replace 18:21 so you kind of merge these two pointers 18:24 you have a pointer from here coming 18:25 directly to this node a pointer from 18:27 here going to that node and a pointer 18:30 from here going to that node 18:38 bit number fields are stored with each 18:40 node so here the branching is done using 18:46 bit one of the key over here you were 18:49 using bit two over there you were using 18:51 bit too you're still going to use bit 18:53 too here you are using bit three you're 18:56 still going to use bit three okay so 19:00 I'll store with each node the bit that's 19:04 used so bit two is used here then you 19:08 use bit three over there over here we 19:10 use bit four and here you use bit for 19:18 okay all right so once we throw away the 19:21 nodes that are encircled in blue we end 19:25 up with this structure alright so 19:32 compressed binary tree is a binary tree 19:34 in which each node each branch node has 19:37 exactly two children each branch node is 19:41 augmented with a bit number field 19:50 already should be able to convince 19:53 yourself that the number of branch nodes 19:55 in a compressed binary try is always 1 19:58 less than the number of elements this is 20:02 not true in an uncompressed binary try 20:13 alright let's take a look at first how 20:18 you would perform a search now let's 20:24 suppose we're looking for 0 0 1 1 okay 20:27 so we're looking for the key that is 20:29 stored in this node okay 20:31 so you started the root that tells you 20:33 to use bit number 1 so that's a 0 you 20:37 end up here 20:37 this fellow tells you to use bit number 20:39 3 which is a 1 you end up out here when 20:44 you end up here you need to compare the 20:47 search key with the fellow that is 20:49 stored out here ok it's not enough to 20:52 just compare bits 4 and beyond the bits 20:55 that weren't used on the way down may 20:57 also be places where you differ and end 21:00 up here but basically the search is 21:04 similar to what was done in an 21:06 uncompressed binary try you follow a 21:08 path looking at bits as dictated by the 21:11 bit numbers and when you reach a branch 21:14 node so when you reach an element node 21:16 you compare the search key with the key 21:18 stored here 21:25 all right so you make one comparison and 21:28 the time is order of height let's do an 21:33 insert suppose you want to insert zero 21:37 zero one zero so you follow the search 21:40 path and see where you which which 21:43 element know the end up and so you use 21:47 bit one or two zero then you use bit 21:49 three which is a one you end up here now 21:53 you need to sample these two fellows to 21:58 determine the first place where these 22:00 two guys differ okay 22:03 they could have differed on bit too 22:05 because in this case they don't or they 22:08 could differ on a bit number higher than 22:10 three in this case the first place they 22:13 differ is bit number four okay so in 22:18 this search path that you took we're 22:21 going to insert a branch node with bit 22:23 number field equals four and then 0 0 1 22:28 1 will be one child of that branch node 22:30 and 0 0 1 0 will be the other child of 22:34 that branch node to determine where to 22:38 put that branch node we note that all 22:40 the bit numbers increase as you go down 22:42 okay so the bit number sequence here was 22:45 1 3 we're going to add a node with bit 22:47 number field 4 so it's going to come 22:49 over here ok 22:53 so I'll put a branch node here where the 22:55 yellow node is give it a left child 22:58 which has 0 0 1 0 in it and the right 23:01 child which has this fellow on it 23:12 let's try another one zero one zero zero 23:15 so you follow a search path you look at 23:17 bit one which is a zero you end up here 23:19 then you look at bit three which is a 23:21 zero you end up over there okay once you 23:25 reach the element node you need to 23:27 compare the two keys the insert key and 23:30 the key you found define the first place 23:32 where they differ and in this case these 23:35 two keys differed bit too okay so we're 23:38 going to introduce a branch node with 23:40 bit number field equals two and it'll be 23:43 placed on the path that you took okay 23:46 but it has to be placed so that the bit 23:49 numbers increase which means you're 23:50 going to have to place it here between 23:52 the 1 and the 3 okay so we're going to 23:59 place it here one child of the branch 24:03 node you put in will be an element node 24:08 containing this guy so this of course 24:11 will be the right child because this has 24:13 got bit two equals one and then the left 24:16 child will just point to this branch 24:18 node here all right so we're going to 24:22 break this pointer and place our branch 24:25 node in there and so you end up with a 24:32 configuration like this this is the 24:34 newly inserted branch node it's right 24:37 child is the newly inserted key its left 24:39 child is the old left subtree all right 24:47 so the strategy to insert is follow the 24:50 search path you reach an element node 24:52 compare the search key or the insert key 24:55 with the element key find the first 24:57 place they differ insert a branch node 25:00 on the search path so that the bit 25:02 numbers are increasing okay or any 25:09 questions about how an insert is done 25:13 again this is an order height process 25:19 look at the delete 25:22 I suppose you want to delete zero zero 25:26 one zero okay you first do a search for 25:29 it so you use bit number one which is a 25:32 zero then you use bit number two which 25:34 is a zero then you use bit number three 25:37 which is a one and then you use bit 25:39 number four which is a zero so you end 25:40 up here then you compare the two keys to 25:46 make sure you've reached the fellow you 25:48 have to remove in this case they match 25:50 so you're going to take this fellow out 25:59 all right so we'll take this note out 26:04 yeah when you take this note out I know 26:07 that the parent of this node if it 26:08 exists that has to go also because the 26:12 parent now has only one subtree every 26:16 branch node is required to have two sub 26:18 trees okay so the parent always vanishes 26:22 the grandparent if it exists never 26:27 vanishes okay because the grandparent 26:30 already has another subtree is 26:32 guaranteed okay so the the back trace 26:37 here cannot go too far it can only go up 26:42 one level here okay all right so this 26:47 node is going to vanish and this one 26:49 will get promoted up to this level 26:56 okay giving you this configuration here 27:01 let's try another one 27:02 suppose you want to remove one zero zero 27:04 one so bit one is a one then you look at 27:09 bit two is a zero then you look at bit 27:12 four as a one so you end up over there 27:15 compare the two keys make sure it's the 27:18 right fellow then we're going to remove 27:22 this node the parent if it exists is now 27:28 going to be deficient it's going to have 27:30 only one child so the parent has to go 27:32 away the grandparent if it exists will 27:36 not become deficient because you're 27:37 going to give back a child who just 27:40 gonna change this pointer to the 27:42 surviving child of the parent 27:51 alright so delete also runs an order of 27:54 right time or any questions about a 28:03 delete all right the split operation is 28:11 very similar to splitting an 28:12 uncompressed binary try so we're not 28:14 going to go through that again we'll 28:18 take a look at the joint operation which 28:20 is a little bit different from how you 28:25 would join uncompressed binary tries as 28:34 in the case of joining uncompressed 28:35 binary tries the middle key that's 28:38 provided to you we're going to insert 28:40 that into B to get a B Prime and then 28:47 we're going to handle the cases where 28:51 the small tri is either empty or has one 28:54 item or the big try now the modified big 28:57 try has exactly one item as special 29:00 cases just like we did in the case of 29:02 uncompressed rice which simply means 29:07 that if s is empty the answer is B prime 29:10 if s has one item you insert it into B 29:13 Prime 29:13 if B prime has only one item you insert 29:17 it into s okay so those are done exactly 29:20 the way you would for an uncompressed 29:22 trial right the interesting case is when 29:28 the small tri has more than one item in 29:30 it and the modified big tri has more 29:32 than one mido minute in this case what 29:36 we're going to do is we're going to find 29:38 the largest key in s okay call that s 29:42 max 29:46 and we're going to find the smallest key 29:51 and b-prime call that b-prime men so you 29:57 can find these in order of height time 29:59 but following a path from the root 30:01 downwards and we're going to do this 30:10 only one so even though the algorithm 30:13 we're going to give is recursive we will 30:15 find s max and B prime once and having 30:21 found them we'll find the first place 30:23 where they differ and that's also done 30:25 only once ok so let D be the first bit 30:30 where s max and B prime men differ okay 30:40 so here's our small try and here's the 30:46 modified big try bit number s refers to 30:49 the bit number that's used in the root 30:51 and a bit number B prime refers to the 30:54 bit number used in the root of B prime 30:57 to do the branching so the cases will 31:01 look at a 1 the first place where s Max 31:03 and and and B men differ position D is 31:10 smaller than the men of these two 31:13 numbers ok in this case we're going to 31:17 have to create a branch node which is 31:18 further up the tree then either this or 31:20 that the case when D is greater than 31:25 equal to the men gets subdivided into 3 31:29 sub cases ok so there are we're going to 31:33 look at 3 sub cases ok 31:36 so for all three of these D is greater 31:40 than equal to men so the first sub cases 31:42 these two-bit numbers are equal the 31:45 second one is this bit number is less 31:48 than that in the third one is the 31:52 reverse ok right so this is kind of an 31:54 all-inclusive set of cases for D 31:58 two men all right so let's take a look 32:05 then at these four cases all right the 32:12 first one is where all of the fellows in 32:18 the small try differ from all of those 32:24 in the big try in a bit number which is 32:26 smaller than the bit number used in the 32:30 roots of these two tries so the first 32:36 observation we can make is that all of 32:41 the fellows in a small try must have 32:43 this bit equal to zero because they're 32:45 all supposed to be smaller elements than 32:47 those in the big try so we're going to 32:55 create this structure you create a new 32:57 branch node at a higher level or smaller 33:00 bit number field and was used before 33:02 you're branching now and bit D which is 33:05 smaller than the bit number here and is 33:08 smaller than the bit number there okay 33:10 everybody on this side must have bit D 33:13 equals zero and everybody on this side 33:15 must have bit D equals one 33:21 all right any questions about this case 33:27 okay 33:31 all right the next case is where the bit 33:36 number fields of the small and modified 33:38 big try are the same so both of them are 33:46 first branching on bit number s write 33:53 the claim here is that this case is 33:56 impossible since my none of these sub 34:02 tries can be empty ABCD if you had this 34:07 situation then we know that the keys in 34:12 C okay 34:15 are smaller than the keys and B because 34:22 the first place where the keys differ is 34:24 on a bit number s are bigger so they so 34:28 all of the keys in your system agree on 34:30 the first s minus 1 bits because D is 34:34 greater than equal to the men of well in 34:37 this case D is great negative s okay so 34:40 all of the keys agree on the first s 34:43 minus 1 bits and since none of these are 34:48 empty we know that the keys here are 34:50 smaller than the keys there okay so 34:55 that's not possible 35:06 all right the third case bit number of s 35:11 is smaller than the bit number of B 35:13 Prime 35:25 okay so alright so maybe this is a three 35:36 and that's a ten okay so this guy's got 35:42 some fellas who's bit number three is a 35:45 zero okay this guy here okay can't 35:51 possibly have people who's bit number 35:53 three is a zero okay if it did then 35:59 those guys would be smaller than the 36:01 guys and B which isn't possible okay so 36:08 this fellow can't have anybody with bit 36:12 three equal to zero so you only get 36:13 these guys and then on this side we're 36:16 going to take the fellows and B and join 36:18 them with everybody here recursively of 36:28 course when you do that recursive joined 36:31 here we already know the value of D so 36:33 we're going to see if D is smaller than 36:34 the bit number in the root here and the 36:36 bit number there 36:37 if so you're back in case one otherwise 36:39 you know one of the other cases right 36:46 the last case is similar to the previous 36:50 case so maybe now this is a ten and 36:56 that's a three okay so the first 37:01 branching node is going to be using bit 37:05 number three okay this fellow has people 37:10 with bit number three equal to one as 37:12 well as people with bit number three 37:14 equal to zero nobody here can really 37:21 have bit number three equal to one if 37:23 they did you'd have the same 37:24 contradiction we had before that this 37:27 guy's got some people bigger than the 37:29 people here okay so everybody here must 37:32 have bit number three equal to zero so 37:35 then we recursively do a join 37:38 of the small try this whole thing and 37:43 the subtree rooted at sea or any 37:52 questions about the joint right the 38:02 complexity of the join is again linear 38:05 in the heights of the two tries you're 38:08 trying to join and said it's important 38:13 to keep in mind that s max and B prime 38:15 are computed only once and so D is 38:17 computed only once all right now in our 38:26 next lecture what we're going to do is 38:28 we're going to take a look at the 38:29 compressed binary try and come up with 38:33 an alternative representation for it 38:36 the alternative representation will use 38:39 only one type of node right now we have 38:42 two types of nodes we have branch nodes 38:44 and we have element nodes so the 38:51 alternative representation will have 38:53 only one type of node and the 38:55 alternative representation is a data 38:59 structure whose name is patricia so 39:02 we'll see how you set up that 39:05 alternative representation perform 39:07 inserts and deletes okay I'm going to 39:11 stop here now