Transcript 00:37 last couple of lectures we saw how to 00:41 set up a balanced binary search tree 00:44 namely an AVL tree to represent a 00:47 dictionary now today we're gonna take a 00:52 look at another variety of balance trees 00:54 and at this one we're going to allow 00:57 nodes to have a degree higher than two 01:00 specifically some nodes will have a 01:03 degree that is 2 and some will have a 01:04 degree that is 3 now after we look at 01:09 this balance tree which is called a 2 3 01:12 3 in the next lecture we'll take a look 01:15 at a balance tree which is called a 2 3 01:19 4 tree in which in addition to notice 01:22 they have a degree that's 2 & 3 also 01:24 allowed notice of degrees 4 and one of 01:29 the reasons for looking at say the 2 3 4 01:32 tree which is going to be kind of a 01:34 generalization of the two three tree 01:36 looking at today is to arrive at another 01:40 balanced binary search tree which is 01:42 called a red black tree which as we'll 01:44 see is just a binary representation of 01:47 the 2 3 4 3 so in some sense all of 01:51 these three topics 2 3 3 2 3 4 and red 01:55 black trees are all related to one 01:57 another 01:59 so today we want to look at two three 02:02 trees and to formally define it 2 3 3 we 02:07 need to talk about extended trees we 02:11 used extended trees and some of our 02:14 previous discussions for example when we 02:16 talked about leftist trees we had to 02:19 talk about extended binary trees in 02:21 which we added an external node whatever 02:24 you would otherwise have a null pointer 02:27 so the concept of an extended tree 02:30 remains the same here wherever you have 02:35 a null pointer or an empty subtree 02:36 you throw in and new node a new node 02:41 which we'll call an external node 02:45 the original nodes are called internal 02:48 nodes so for example if this is your 02:53 original binary tree then all of the 02:56 nodes that the binary tree has to start 02:59 with these are all internal nodes 03:01 wherever you have an empty subtree for 03:03 example here there's an empty right 03:04 subtree here there's an empty left 03:06 subtree this guy's got an empty left and 03:08 right so at all those points you throw 03:11 in an external node which shown here in 03:15 red writes all of the red nodes our 03:23 external nodes and all of these gold 03:26 nodes here our internal nodes so an 03:30 extended tree in this case an extended 03:32 binary tree is simply a tree in which 03:35 all empty sub trees are replaced by 03:38 special nodes called external nodes all 03:44 right so a 2-3 tree is defined with 03:46 respect to an extended tree and this 03:50 notion of extending the tree is 03:52 convenient mainly for definitional 03:55 purposes just like when we define the 03:57 leftist tree it was convenient to 03:59 introduce the concept of an external 04:01 node just so that we would have a clean 04:03 definition as far as implementation goes 04:06 external nodes are not implemented they 04:08 don't exist in any implementation all 04:13 right send a 2/3 tree there are two 04:16 kinds of internal nodes one of these 04:19 will call a 2 node and the other one we 04:21 call a 3 node the properties of two 04:28 nodes and three nodes first are for two 04:30 node will have exactly one key or in our 04:33 case is going to have one dictionary 04:34 pair unless you have a key and an 04:36 associated element and a two node will 04:41 have two children or two subtrees so for 04:45 example a two node would always look 04:48 something like this you would have a 04:50 node it has one dictionary pair in it it 04:53 has a left subtree and it has a right 04:56 subtree 05:02 requirements are that all keys in the 05:04 left sub-tree is smaller than eight all 05:06 keys in the right subtree are bigger 05:08 than eight all right then you have three 05:14 nodes the three node has two keys of two 05:18 dictionary pairs in it and has three 05:20 children or sub trees the first key is 05:26 required to be smaller than the second 05:28 one the three sub trees labeled here is 05:34 the left subtree the middle subtree in 05:36 the right subtree so everybody in the 05:39 left subtree has to be smaller than one 05:41 in this case everybody in the middle 05:44 subtree has to be between one and three 05:47 and everybody in the right subtree has 05:50 to be bigger than three okay alright so 05:57 you have two nodes and you have three 06:00 nodes and then we just seen what the 06:01 properties of two nodes and three nodes 06:03 are in addition to requiring that a two 06:08 three tree be such that all internal 06:12 nodes are either two nodes or three 06:13 nodes we also require that all of the 06:15 external nodes should reside on the same 06:17 level let's take a look at an example 06:25 alright so the two nodes shown here in 06:29 bold so you've got a two node here a two 06:32 node there a two node and a two node 06:34 okay so four of the internal nodes the 06:36 two nodes then these four internal nodes 06:39 shown in yellow these four are three 06:42 nodes so all of the internal nodes are 06:46 either two nodes or three nodes so we 06:47 satisfy the first requirement and then 06:51 if you look at all of the external nodes 06:52 they all lie on level four of this tree 06:56 okay so they're all on the same level so 06:59 we satisfy the last requirement also 07:07 all right so that's a two three tree but 07:13 before we look at how you might 07:16 represent a two three tree and perform 07:18 operations you have any questions on the 07:20 definition what is a two three tree I 07:28 should point out there is a variant to 07:30 the definition we have depending upon 07:33 what part of the literature you're 07:36 looking at in the version of the two 07:38 three tree that I talked about just now 07:41 each dictionary pair is stored in well I 07:46 guess here you have one dictionary pair 07:47 its key is 8 into there's an element you 07:50 know one dictionary pair here with key 07:51 four and an element in here at this node 07:53 you've got two additional repairs one 07:56 with key one one with key three and so 07:57 on so dictionary pairs can be stored in 08:01 any internal node in a variant of the 08:06 two three tree called a leaf pushed to 08:08 three three the dictionary pairs are 08:11 stored only in the leaves of the tree 08:13 okay the internal nodes are used just to 08:16 direct the search to the appropriate 08:18 leaf okay so in that case it's only 08:22 these bottom nodes here that will 08:23 contain dictionary pairs and here for 08:26 example I might keep the key three but 08:29 no element associated with it which 08:31 would tell me that the largest key in 08:32 this subtree is three here I might keep 08:35 the key six which would tell me the 08:37 largest key in the left subtree is six 08:39 similarly here I would keep the number 08:41 nine - tell me the largest key in the 08:42 left subtree is mine here that would 08:45 keep the number seventeen telling me the 08:46 largest in the left in the middle 08:47 subtree is 17 okay so no leaf pushed two 08:52 three tree the dictionary pairs are 08:54 stored only in the leaves the remaining 08:56 internal nodes are used to direct the 08:59 search to the appropriate leaf and the 09:01 remaining nodes will contain only keys 09:04 and these keys will actually be replicas 09:06 of keys that are stored in the leaves 09:10 and whatever we say about the version 09:16 that we're looking at now we're 09:18 repairs are stored anywhere you can 09:20 extend all of that to apply to leaf 09:22 pushed to three trees as well okay 09:25 alright so just keep in mind that there 09:28 are two versions of two three trees 09:29 which differ in the restrictions you 09:32 place and where the dictionary pairs can 09:34 be placed okay so first let's take a 09:40 look at what's the smallest number of 09:42 pairs you might be able to store in a 09:44 2-3 tree of a given height you know your 09:50 internal nodes are right the two nodes 09:51 or three nodes a two node has only one 09:55 pair a three node as to so what you seem 09:57 natural to expect that the smallest 09:59 number of pairs would arise when all of 10:01 your nodes are two nodes right so all of 10:08 the internal nodes are two nodes well 10:11 then you're required that all of the 10:12 external nodes be at the same level so 10:15 if you work with those two restrictions 10:18 all internal nodes the two nodes all 10:20 external nodes are the same level the 10:22 only trees that satisfy those two 10:25 requirements are for binary trees 10:29 because of the two node criteria you 10:31 have to be a binary tree if you're not 10:33 full then some external nodes would be 10:35 at levels different from other external 10:37 nodes so you have to be a full binary 10:40 tree each of the internal nodes has one 10:43 pair in it a full binary tree of height 10:47 H has to raise to H minus one internal 10:49 nodes right so the number of internal 10:55 nodes is 2 raise to H minus 1 H is the 10:57 height and when you count the height the 11:00 external nodes are not counted in 11:02 determining the height so it's got to be 11:05 a full binary tree you got to have that 11:07 many internal nodes which means you got 11:09 to have that many dictionary pairs so a 11:16 2 3 3 whose height is H cannot have 11:19 fewer than two of the H minus 1 11:21 Dictionary pairs stored in it ok the 11:25 worst that could happen to you is a full 11:26 binary tree which was the best that 11:29 could happen to you when you dealt with 11:30 AVL trees 11:39 well the best that can happen to you is 11:42 that each internal node instead of being 11:44 a two node is a three node so now each 11:47 node has two pairs in it so if you take 11:53 that requirement to every node is the 11:55 three node couple it with the 11:56 requirement that every external node is 11:58 at the same level you'll end up with a 12:00 full tree with internal nodes having a 12:03 degree three if you count the number of 12:09 internal nodes at the root level you 12:11 have one node that node can have or will 12:15 have three children 12:16 each of these three nodes will have 12:18 three children so you end up with nine 12:20 nodes at level three each of these mine 12:23 nodes will have three children so you'll 12:25 end up with 27 nodes at level for each 12:27 of these 27 will have three children so 12:29 you'll end up but I guess twenty-seven 12:32 times three so 81 and that level and so 12:36 on so you end up with this formula which 12:40 tells you that if the height is H the 12:42 total number of internal nodes is three 12:44 raised to H minus one divided by two now 12:48 each internal node has two pairs and 12:50 therefore the number of pairs you're 12:52 storing is three raised to H minus one 13:04 all right so the total number of pairs 13:06 in a 2-3 tree dictionary pairs stored in 13:09 a 2-3 tree whose height is H lies 13:11 somewhere between 2 to the H minus 1 the 13:14 minimum and 3 to the H minus 1 the 13:16 maximum okay so the number of dictionary 13:21 pairs lies between the min and Max we 13:24 just determined and if you turn this 13:27 equation around we get that the height 13:29 of a two three tree lies somewhere 13:32 between log n plus one base two so 13:35 that's a full binary tree and log of n 13:39 plus one base three this would be at 13:41 this end a full tree of degree three if 13:48 you compare this with an AVL tree for an 13:54 AVL tree the the height could get as 13:58 large as one point four four log of n 14:02 plus one okay so in terms of worst case 14:05 you may be doing about thirty percent 14:07 better here the best case was log n plus 14:12 one base two and here we can go down to 14:14 log n plus one base three all right so 14:20 as far as height goes we will always be 14:22 doing better using this structure and 14:29 that becomes important if you're working 14:31 on a machine where the cash penalty is 14:33 very high if you're trying to do a 14:35 search as we see in a moment you'll be 14:37 going down the tree looking at one node 14:39 per level so the number of node accesses 14:42 would be equal to the height of the 14:44 structure in the worst case and here the 14:47 smaller right height there with smaller 14:48 node accesses which would mean a smaller 14:51 number of cache misses each node is 14:53 small enough that you can retrieve the 14:55 whole node with a single cache miss so 14:59 you could see a fairly noticeable 15:03 difference in performance by reducing 15:06 the height of the structure 15:11 all right the note structure that we can 15:16 employ to represent a two three three is 15:20 to use a node of just one type 15:24 independent of whether you're 15:25 representing a two node or a three node 15:30 okay so so rather than have one data 15:34 structure for a two node and a different 15:36 one for a three node we were typically 15:38 employed just a single structure for 15:41 both types of nodes and that structure 15:43 is designed to hold a three node as 15:46 we'll see when we start inserting and 15:49 deleting items insertion and deletion 15:52 changes nodes that were two nodes to 15:55 three nodes and three nodes back to two 15:57 nodes and if you had different 15:59 structures for a two node and a three 16:02 node well then you'll be having to make 16:04 more calls to new and delete each time 16:08 you were adding an item changing from a 16:12 two node representation to a three node 16:14 representation by using a single 16:17 structure if you're simply taking a two 16:20 node and changing it to a three node you 16:22 don't have to invoke the new method or 16:24 the delete method you can simply use the 16:27 fields that were not previously used all 16:31 right so we're going to use a node 16:33 structure that looks like this 16:35 you've got three children field a left 16:38 child field a middle child and a right 16:40 child and you got three fields two so 16:42 dictionary pairs per one and pair two if 16:48 you if this structure is used to 16:51 represent a two node then we'll employ 16:53 the first three fields okay so for a two 17:00 node we just use these three fields for 17:03 a three node will employ all five fields 17:07 so there's some wastage in space for two 17:10 nodes you have these two fields that are 17:12 not getting used but there is 17:14 improvement in terms of runtime 17:16 performance because you aren't 17:18 constantly allocating and deallocating 17:19 nodes to change from to no two to three 17:23 notes or three notes to - no 17:38 at times you might find it useful to 17:41 augment the structure by having a parent 17:44 field my parent fields are not needed to 17:50 implement the search insert and delete 17:53 operations you can do them fairly 17:55 effectively without parent fields but in 17:58 some cases you may find it advantageous 17:59 to have a parent field maybe for the 18:02 purposes or even for inserts and deletes 18:04 you may be able to run a little bit 18:06 faster again keep in mind that when you 18:14 implement two three trees we explicitly 18:19 represent only the internal nodes the 18:22 external nodes was something we 18:24 introduced just so that we could have a 18:26 definition and because the definition 18:28 has a component to it which says all 18:30 external nodes are at the same level 18:37 well let's look at the operations say 18:39 you want to do a search ok all right so 18:44 here's my example 2 3 3 and let's say 18:49 you want to search for the key 17 so you 18:54 start at the root it's a two node you'd 18:58 compare with the single key stored here 19:00 8:17 is bigger than 8 it can't be in the 19:03 left subtree if at all it's present it 19:05 must be in the right okay so you move 19:07 into the right subtree you reach this 3 19:10 node here you compare the 17 with the 19:17 first key 15 17 is bigger than 15 so it 19:20 can't be in the left subtree you compare 19:24 the 17 with the next key that's 20 it's 19:26 smaller than 20 so it can't be in the 19:28 right subtree if at all it's present it 19:30 must be in the middle subtree so you 19:33 move down into the middle tree and it's 19:37 a 2 node you can pull with the first key 19:38 or the only key it's a match 19:40 you found the pair you're looking for 19:47 if you're looking for a seven it's 19:52 smaller than eight you come over here 19:55 it's bigger than four you come down here 19:58 it's bigger than six you follow the 20:00 right child pointer here and you fall 20:02 off the tree that tells you seven is not 20:05 present 20:14 all right questions about how a search 20:16 is done again you can see the time 20:24 required to perform a search depends 20:27 upon the height of the tree and the 20:31 height is logarithmic in the number of 20:33 pairs so it's an order of log and search 20:35 at some nodes you might look at just one 20:38 pair and at some nodes you might be 20:40 comparing against two keys so if you if 20:49 you look at the total number of 20:50 comparisons that you might perform that 20:52 number could become larger than the 20:54 number for an AVL tree the height 20:57 difference is one point four four in the 21:00 worst case so by a factor of one point 21:03 four four 21:04 but each node may contain two keys in it 21:07 so you could be doing two comparisons 21:09 for node as you go down the thing that 21:14 makes this thing really good is that in 21:17 most of the machines out there your 21:20 runtime is determined not by the number 21:22 of comparisons you perform but by the 21:24 number of cache misses you encounter 21:26 okay and since the height of the 21:29 structure in the worst case is log of n 21:31 base 2 you could have only log n base 2 21:34 cache misses whereas with an AVL tree 21:37 you could get one point four four times 21:39 as many cache misses 21:47 all right so we're left with two other 21:49 operations to consider one is the insert 21:52 and the other one is the delete let's 21:55 take a look at insert suppose you want 22:02 to insert a new dictionary pair and this 22:05 one has a key value 16 so you perform a 22:09 search if you find 16 already exists 22:13 well then it's a duplicate entry 22:16 so perhaps you update the element 22:18 associated with 16 no you throw an error 22:21 if it's not a duplicate as in this case 22:24 it's not a duplicate well then the 22:26 number of pairs stored in the dictionary 22:27 go up by one all right so I follow a 22:31 search for 16 it's bigger than eight 22:33 lies between 15 and 20 s you end up here 22:36 smaller than 17 you fall off you fell 22:41 off from a two node 2 node has enough 22:43 capacity to accommodate the new pair 22:46 okay so the new pair is going to be 22:49 thrown into this particular node here 22:52 the new power is stored in this two node 22:54 which is now going to become a three 22:56 node in such a fashion that the keys are 22:58 in ascending order so this will change 23:02 from a two node to a three node you put 23:07 in the new pair whose key is 16 that 23:09 occupies the first position the 17 which 23:12 is previously in that node gets moved to 23:14 the second position because we're using 23:20 a common structure for two nodes and 23:21 three nodes I don't have to go through 23:23 this time-consuming process first 23:27 allocating the new node which is a three 23:29 node and then D allocating the old two 23:31 node all right so what I need to do is 23:35 previously this pair was stored in 23:37 position p1 of the node it has to be 23:39 copied from p1 to p2 and then the new 23:45 fellow of this pair with the key 16 is 23:47 inserted into position p1 okay so you've 23:50 got some copying to do p1 has moved over 23:52 to p2 and then the new pairs tossed into 23:55 the position p1 24:03 let's take a look at another insert 24:05 suppose we want to insert two okay 24:07 so smaller than eight smaller than four 24:13 lies between one and three you fall off 24:15 the middle subtree here so that tells 24:19 you you got to insert it into this node 24:21 the node II fell off of okay well this 24:24 is a three node a three node doesn't 24:29 have any excess capacity for you to 24:32 actually put that new fellow in there so 24:34 insertion is a little bit more involved 24:37 than it was in the previous case the way 24:45 we're going to handle this insert okay 24:48 we're inserting a new pair into a three 24:50 node that is a leaf we will first make 24:57 that insertion into that node at least 24:59 logically how physically you can't do it 25:01 the fields aren't there but logically 25:03 you can do anything okay so logically 25:06 we'll put it in so that the keys are in 25:07 ascending order okay you had one three 25:10 the two lies in between so you end up 25:12 with one two three or having done that 25:18 logically then whatever ends up being 25:20 the rightmost key we're going to take 25:23 that out and put it into a new node 25:25 which is a which will become a two node 25:32 then we'll take the middle key out of 25:36 here too to have this logical structure 25:43 and the intent now is the middle key 25:46 together with a pointer to this newly 25:48 created two node will be inserted into 25:51 the parent of this node here okay so we 25:57 started with the three node we just had 25:59 one and three we inserted the new pair 26:02 into this node so that the keys were in 26:04 ascending order there's a logical 26:06 insertion then we took out the largest 26:10 pair and moved it into a two node so for 26:14 that you have to allocate space then we 26:17 took out the middle key okay and said 26:20 we're going to take that middle key of 26:22 the middle pair together with a pointer 26:24 to the newly created node and insert it 26:26 into the parent of that original three 26:28 node alright so looking at our picture 26:35 here okay 26:37 doing all of those steps in one shot the 26:41 three will be separated out okay and 26:44 then the middle fellas here you go two 26:47 pointer and I were trying to insert the 26:49 two with this pointer into the parent 26:52 node 27:02 all right so let's see how we insert the 27:04 two with that right child pointer into 27:06 the parent node okay 27:08 fortunately the parent node is a two 27:11 node and so it has excess capacity it's 27:16 got enough space in there to accommodate 27:18 both the two and this right pointer when 27:23 we do the insertion into the parent node 27:25 we have to make sure that the keys are 27:26 in ascending order so this fellow which 27:28 is currently occupying position p1 would 27:30 have to get copied over to p2 this 27:32 pointer which is currently using the 27:34 middle child pointer will get copied 27:35 over to the right child and then this 27:37 fellow will come in and take the p1 27:38 position and this fellow will 27:40 automatically fall into the middle chart 27:55 okay or any questions about that we will 27:57 look at a few more examples to cover all 28:00 the cases 28:09 whereas you can notice this insert even 28:14 though did create a new node the new 28:15 node was created at an existing level 28:18 the trick in this thing is not to create 28:20 new nodes at these low levels because 28:22 then you're going to have external nodes 28:23 at different levels and you no longer 28:26 have a two three tree okay so you can 28:29 create new nodes as you go upwards just 28:31 don't create new nodes a new nodes say 28:35 at this level here level four let's try 28:41 another example suppose you want to 28:43 insert an eighteen okay so you follow 28:45 the search path from eight you move down 28:47 here you go into the middle and you fall 28:49 off here there right okay so you want to 28:55 insert it into the node you fell off 28:57 from but that's already a three node so 28:59 you don't have the space to go through 29:01 the same process make a logical 29:02 insertion split off the rightmost pair 29:04 and so on all right so exactly the same 29:11 process logically put the fellow in 29:15 split off the rightmost pair into a two 29:19 node then take the Keen in the second 29:25 position take it out and have a right 29:28 pointer to the 18 and then the 17 29:33 together with this right pointer you 29:35 want to insert into the parent of the 29:36 original three node that was over here 29:40 so that's the same as what we did in our 29:42 previous example all right so we're here 29:48 we're going to put in an 18 the 29:50 rightmost key is 18 you split it off you 29:52 take off the next key that 17 17 with a 29:55 pointer to that new two node is to be 29:57 inserted into the parent node 30:09 unfortunately this time the parent node 30:12 is a 3-node okay 30:14 on this side we were lucky it was a 2 30:16 node we were inserting into here it's 30:17 the 3 node so after going to a similar 30:19 process to what I went through one level 30:21 lower I'll put the 17 in logically I'll 30:28 split off the rightmost key I'll split 30:30 off the middle key okay and then that 30:32 middle key and a pointer to the newly 30:34 created two node will get inserted in 30:35 the parent node okay so let's see how we 30:42 insert the 17 into that three node so we 30:47 first put in the 17 together with the 30:49 pointer together with this right pointer 30:52 okay then we're going to split off the 30:57 rightmost key but when we split off the 30:59 rightmost key we're going to take these 31:00 two pointers with us when we're doing it 31:04 and this operation and the leaf node we 31:06 didn't really talk about taking those 31:08 two pointers but in principle you were 31:10 it's just that those two pointers were 31:11 now okay here those two pointers are not 31:16 now so it's important that you 31:17 explicitly carry those over okay so I 31:22 take over the rightmost key together 31:23 with those two right pointers and put 31:25 them into a new two node then I'm going 31:29 to take the second key that together 31:32 with a right pointer to this fellow is 31:35 going to get inserted in the parent okay 31:39 so I end up with a new two node the only 31:46 difference from the previous example is 31:47 we explicitly carried over two pointers 31:49 which in the previous example were not 31:52 and then a key and a pointer are to be 31:56 inserted into the parent 32:04 all right so performing all of that here 32:10 we saw the 20 is going to go off into a 32:12 new node okay it's gonna end up what 32:15 this the 20 goes off it borrows two 32:18 pointers from the previous place and 32:20 then you have a 17 and a pointer to 32:24 insert it into the parent well the 32:31 parent is a 2 node so this insertion is 32:34 going to be doable here without 32:36 exceeding capacity this 2 node will 32:38 become a 3 node and you end up with that 32:46 configuration 32:56 get in this particular case we created a 32:59 new node at level three we created a new 33:02 node at level two and nowhere else we 33:05 didn't create nodes at a level that 33:07 didn't exist before 33:08 so if previously all external nodes were 33:10 the same level they still are but 33:19 questions about that example we're going 33:21 to look at one more example which will 33:23 increase the height of the structure 33:30 okay all right let's insert a seven okay 33:37 you can visualize what's going to happen 33:40 the seven goes here that's a three node 33:42 so that'll split then you have to insert 33:44 something out here that's a three nodes 33:46 that'll split then you have to insert 33:49 something here that's a three node 33:50 that's going to split then you're going 33:51 to go up and that's going to cause you 33:56 to have an extra level now let's go 34:00 through the details right we have to 34:04 insert in that node we follow the 34:06 procedure to insert into a with the 34:10 inserting into a leaf or non-league 34:11 that's really the same thing we just 34:14 carry over to null pointers so we do the 34:20 logical insertion you exceed the 34:22 capacity you split off the rightmost key 34:26 together with the two pointers on the 34:28 right into a new two node the second key 34:31 together with a pointer to the new node 34:34 or to be inserted at the next level 34:38 they're doing that here split off the 34:43 seven will become the rightmost case you 34:46 split it off the six you'd need to 34:48 insert into the parent node which is 34:52 this fellow together with a pointer to 34:54 the newly created two node that's the 34:56 seventh so the six is now going to fall 35:00 into the rightmost position there 35:07 the six falls into the rightmost 35:10 position because we have now four 35:11 pointers here take off the rightmost 35:16 pair together with the third and fourth 35:18 pointers form a new two node and then 35:22 the second key for together with a 35:25 pointer to this new node to be inserted 35:28 at the next level up so performing that 35:35 on this diagram here the three node gets 35:40 split okay so this node here is going to 35:48 get split and we're going to have to 35:49 insert something up here the what you're 35:53 going to insert there is the middle key 35:54 from here which is the four because six 35:55 is bigger for together with a pointer to 35:58 the newly created two node and the newly 36:01 created two node will have a six in it 36:04 okay so four together with a pointer to 36:07 the newly newly created two node alright 36:12 that's a three node so the process is 36:15 the same as before you make a logical 36:21 insertion if I put the four end it's 36:28 going to occupy the leftmost position 36:30 we're going to split off the 17 together 36:33 with the pointer to 15 and 20 the eight 36:39 is the middle fellows that's going to 36:40 get removed and we'll try and insert 36:42 that at the next level up together with 36:44 a pointer to the 17 okay what you're 36:49 going to be left with on this side the 36:50 four will be here a pointer to the two 36:52 and a pointer to the six so that's what 36:57 we got for a pointer to the two and the 36:59 six the 17 got into a new two node you 37:02 borrowed two pointers from the overflow 37:04 node one going to 15 1 to 20 the middle 37:08 key 8 has to be inserted at the next 37:10 level up I 37:14 this case there is no next level up to 37:15 insert into and so what we do is we just 37:19 create a new route at that level okay so 37:24 when you reach all the way to the top 37:26 there's no place to insert that new key 37:29 we create a route which is a two node 37:40 so when you do this all of the notes 37:44 that were previously at level 3 jump or 37:48 drop down to level 4 nodes previously at 37:51 level 2 drop down to level 3 those 37:53 previous at level 1 the route they drop 37:55 down to level 2 and you create a new 37:57 node at the root level Y so this tree 38:03 really grows by moving the route further 38:05 up rather than by adding fellows down in 38:09 this level okay and because of this way 38:12 of growing you can see that external 38:14 nodes always remain at the same level 38:15 you will never have external nodes at 38:19 two different levels or any questions 38:27 about insertion 38:38 but again shouldn't be hard to see that 38:41 the time required to insert depends on 38:44 the height of the tree it's linear in 38:45 the height which is logarithmic cells an 38:47 old log an insert 38:59 look turning to the last operation you 39:02 want to look at the delete but the 39:09 delete operation at the highest level 39:11 breaks down to two cases one way we're 39:14 trying to remove from the interior a 39:16 node other than a leaf and the other one 39:19 you're trying to remove from a leaf the 39:23 first case removal from the interior is 39:25 handled by transforming it into the 39:27 second case removal from a leaf so for 39:32 example if you want to remove the eight 39:35 it's not in a leaf node it's in the 39:37 interior okay to handle this kind of a 39:41 delete we will actually replace the 39:44 eight with a suitable pair from a leaf 39:46 node we can replace the eight either 39:51 with this pair the six then also you 39:54 will have you'll satisfy all of the 39:56 requirements of a two three tree except 39:58 that something is missing out here okay 40:00 or you can replace the eight with the 40:02 nine okay so you can either take the 40:06 largest key in the left subtree the left 40:08 of the eight or the smallest key to the 40:12 right of the eight and move that fellow 40:15 up so if we choose to take the largest 40:21 to the left of eight this pair with the 40:24 key six will move up and now our problem 40:27 becomes one of deleting the six from 40:29 this node okay so deletion from the 40:32 interior in this case the root is 40:34 transformed into deletion from the leaf 40:41 and you can do that independent of which 40:44 internal node you're trying to remove 40:45 from okay just do a replacement from a 40:48 leaf node so really the thing to look at 40:51 is how do you remove from a leaf 40:58 look at a few examples to cover all of 40:59 the possibilities that might arise 41:04 suppose you want to remove the 16 right 41:09 so again in these examples we only have 41:10 to look at removing from a leaf because 41:12 we've seen what to do in case what you 41:14 want to remove is not in a leaf okay so 41:16 all of my examples now will focus on 41:18 items that are sitting in leaves right 41:22 so the 16 is sitting in this leaf it's a 41:23 three node well you can just throw it 41:26 away okay and just throwing it away 41:29 would in this case mean that you 41:31 actually have to copy the 17 from 41:33 position p2 to position p1 and okay 41:40 alright so the three node becomes a two 41:42 node we just toss away the 16 or so 41:51 deletion from a three node leaf is 41:52 pretty simple the three node becomes a 41:54 two node and that's the end of it the 41:56 interesting case is deleting from a two 41:58 node leaf so for example suppose you 42:03 were removing the 17 okay when you 42:08 delete the 17 that node becomes empty 42:10 and you aren't allowed to have empty 42:12 nodes only two nodes and three nodes so 42:17 at this time we need a strategy what do 42:20 we do because this fellow has become 42:21 empty so when a node or the leaf becomes 42:26 empty you delete it from a two node our 42:30 strategy is going to be to check one of 42:32 its siblings in this case this 2 node 42:37 has two siblings it's got this full on 42:39 the left and that full on the right but 42:41 if you were deleting the 9 you would 42:43 have only one sibling so you would have 42:45 a choice you'd have to check that fellow 42:46 ok similarly here well I guess this 42:49 guy's about two siblings so deleting the 42:52 9 you don't have a choice if this photo 42:54 was a 2 node it deleted orders in there 42:55 you don't have a choice you'd have to 42:56 look at the left sibling if you're 42:58 deleting from the middle child you have 43:01 a choice you have to look in the left or 43:02 the right you typically don't look into 43:05 both ok you don't look into both this 43:08 we're trying to minimize the number of 43:09 cache misses in a delete 43:11 okay so you look into one of the two 43:14 siblings and there see if the sibling 43:18 you decide to look at is a 3-node okay 43:24 so suppose for a moment that here you 43:27 chose to look at this node then you 43:30 would discover it is a three node if you 43:33 chose to look at this node you would 43:34 discover it's a two node and then you'd 43:36 go into a different action okay but for 43:39 the moment suppose you chose to go there 43:40 you found us a three node if it's a 43:44 three node well then we're going to do 43:46 some borrowing okay we say you don't 43:48 really need both of your pairs to 43:52 survive I'll take away one in this case 43:54 I'll take away the 30 I can't take that 43:57 and move it directly here if I did that 44:00 I would violate the property which said 44:02 that the fellows in this node must lie 44:04 between 15 and 20 okay so I'm going to 44:07 borrow from this three node and the 44:10 borrowing is always done through the 44:11 parent so the 30 will move into the 44:13 parent and then the 20 will move down 44:15 here so we borrow through the parent and 44:24 you end up with this configuration 44:36 another case to look at is your removing 44:40 the 20 again here so you're removing 44:44 from a 2 node but now it doesn't really 44:46 matter which sibling you choose to look 44:47 at there's no possibility of hitting a 3 44:50 node okay so the sibling you look at is 44:52 a 2 node and now you can't borrow okay 44:58 so we're deleting from a 2 node and the 45:01 sibling that we're going to check for a 45:03 3 node in this case doesn't matter which 45:05 one we check either way you're going to 45:07 determine it's not a 3 node okay and if 45:12 it's not a 3 node let's suppose we check 45:14 this fellow here okay it's not a sibling 45:16 so what I'll do is I'll combine I'll 45:18 take the single pair here the in-between 45:21 pair and the parent this one is empty 45:23 and create a 3 node at this level so I 45:27 have a combined operation in this case 45:34 or in all cases when you do the combine 45:37 the occupancy of the parent decreases by 45:40 1 because you moved a pair out of the 45:42 parent into this node yes 45:45 in this example the parent was a 3 node 45:47 it dropped to a 2 node but had the 45:50 parent been a 2 node you'd be in trouble 45:53 ok let's look at another example 45:57 okay suppose you're removing the 30 ok 46:00 it's coming out of a three node a 3 node 46:03 just transforms into a 2 node there's no 46:09 borrowing or combining as suppose you're 46:15 taking out the 3 okay you're taking it 46:18 out from a 2 node so you check one of 46:20 the siblings let's suppose you check 46:22 that one you determine if it's a 3 node 46:25 so you do a borrow the borrows done 46:28 through the parent so the 5 moves up the 46:30 4 moves down ok a borrow is fine a 46:37 borrower never disturbs the parent it's 46:39 capacities or its occupancy is whatever 46:41 it was before 46:42 it's the combines that messed things up 46:46 potentially suppose you're removing the 46:49 sex you are deleting from a two node you 46:54 check the nearest sibling it's also a 46:57 two node you can't borrow so in that 46:59 case you do a combine the sibling the 47:01 in-between node or the in-between pair 47:04 okay in this case it was a three nodes 47:07 that drops to a two node and you're 47:09 still fine all right let's take out a 40 47:19 it's a two node you check a nearest 47:23 sibling to see if it's a three node it's 47:26 not so you have to do a combined so the 47:32 nine and the in-between key combined to 47:34 give you a three node at this level but 47:37 now the parent which is previously two 47:39 node becomes deficient it doesn't have a 47:43 single pair in it okay so when a node 47:47 becomes deficient you check one of its 47:51 nearest siblings to see if you can 47:53 borrow the newer sibling is not a three 47:57 node so you can't borrow if you could 47:59 have borrowed your Donna borrow you'd be 48:00 okay when you borrow you also have to 48:03 borrow subtrees when you move up the 48:05 tree so in this case you can't borrow so 48:09 since you can't borrow you do a combine 48:11 so the pair in the sibling the 48:14 in-between pair combine together to give 48:16 you a three node at this level 48:26 so you took somebody in a combine you 48:30 always decrease the occupancy in the 48:32 parent the parent previously had one key 48:35 now it has zero keys okay so when the 48:40 parent becomes deficient you check a 48:42 sibling to see if you can borrow if you 48:45 can borrow you borrow a pair through the 48:47 parent but you also borrow a subtree if 48:50 you can't borrow you would do a combine 48:54 in this case you don't have a sibling to 48:58 borrow from or to combine with and that 49:01 only happens when you reach the root 49:02 level when the root becomes deficient 49:06 you just throw away the root so in this 49:16 case I'm going to throw away the root 49:20 and the height of my structure will 49:24 decrease by one again you can see that 49:30 this process of deletion where nodes 49:35 change levels only when you throw away 49:36 the root ensures that your external 49:40 nodes are always on the same level so 49:41 you satisfy the requirements for a 2/3 49:43 tree 49:51 all right so we've pretty much covered 49:53 all of the important operations for a 49:55 2/3 tree and hopefully it convinced that 49:58 the operations perhaps are more simpler 50:00 than they were in an AVL tree certainly 50:02 took us only half the time to cover them 50:04 and the height structure is better the 50:08 worst case height here is better than 50:09 the worst case height there in fact the 50:12 worst case height here is the same as 50:14 the best case height for AVL trees and 50:16 so as far as cache mish cache miss 50:19 properties are concerned you got better 50:20 cache miss properties here as I said in 50:25 the next lecture we're going to extend 50:26 this to two three four trees okay thank 50:30 you