Transcript 00:11 okay all right you have any questions 00:14 before we start today 00:18 the programming project coming along 00:20 okay yeah all right today we're going to 00:27 talk about two structures that are 00:29 closely related to the B tree first 00:32 we'll talk about the B+ tree which is a 00:38 structure that is primarily used in 00:40 almost every almost every relational 00:44 database that's out there as an index 00:46 structure into the collection of records 00:49 stored in the database and then we'll 00:52 take a look at the B star tree which is 00:56 an attempt to provide B tree-like 01:00 structures with a reduced height and so 01:04 they improve the search performance 01:06 because the height is less but as you'll 01:08 see the insert and deletes get more 01:09 involved so the number of disk accesses 01:11 involve for insults and deletes could be 01:13 larger okay all right so let's start 01:16 with the B+ tree in some sense it's the 01:24 same structure as a B tree there are 01:27 some important differences though the in 01:32 a B+ tree the actual dictionary pairs or 01:38 records are stored only in the leaves of 01:42 the B tree okay the remaining nodes are 01:46 used as an index to get to the right 01:49 leaf also another difference is that 01:54 typically you tend to link all of the 01:57 leaves together to form a W linked list 01:59 and as we'll see that these two things 02:03 together putting all of the records in 02:07 your database only in the leaves of the 02:09 B tree rather than all over the tree and 02:15 linking them together like this provides 02:17 an easy way to perform range searches in 02:22 addition to searching by key all right 02:26 now if you're not a leaf node then the 02:30 remaining nodes are used only to index 02:32 to get you to the right leaf because 02:34 that's where the real data is okay so 02:37 the remaining nodes have a format that 02:41 looks like this 02:41 the first item tells you how many keys 02:44 are stored in the node this node only 02:46 has keys and point those two subtrees 02:48 okay so this tells you how many keys are 02:51 stored the A's are pointers two subtrees 02:54 and then the Khazar the keys and a key 02:59 is chosen so that all of the keys in the 03:05 subtree towards right okay all of these 03:07 must be greater than equal to this 03:08 fellow and everybody on this side must 03:11 be smaller than that guy okay so 03:17 everybody in AI or a one is greater than 03:20 equal to k1 and everybody in the subtree 03:23 a 0 is smaller than k1 and the same 03:26 applies everywhere else okay all right 03:31 so an example okay so the leaf nodes are 03:34 down here in yellow these contain the 03:36 actual records of the database and here 03:42 I've only shown the keys of these 03:44 records the leaves are linked together 03:48 in a doubly linked list the remaining 03:54 nodes of the B tree are used to index to 03:59 get you to the right leaf node so the 04:03 remaining nodes contain only keys they 04:06 don't contain the data associated with 04:08 the key if you check any node like this 04:13 one okay so the keys in the left subtree 04:17 must be smaller than this and the keys 04:19 on the subject to the right must be 04:21 greater than equal to Bank okay same is 04:24 true up here everybody to the left is 04:26 smaller than mine you've got 1 3 5 6 04:28 everybody to the right is greater than 04:30 equal to 9 here on this side everybody 04:36 must be less than 16 ok and of course 04:39 also Gordon equal to 9 in here you have 04:44 to be good 04:45 equal to 16 but also less than 30 and 04:48 here you must be grinning eagle to 30 04:52 okay alright so the full records are 04:59 stored only in the leaves the other 05:01 nodes of the B tree or now we call it a 05:04 B+ tree contain only keys or any 05:12 questions about what a B+ tree is before 05:14 we look at the operations now you have 05:22 the same requirements in terms of 05:25 degrees okay so the root degree has to 05:28 be at least two and at most M other 05:31 people must have a degree that's at 05:33 least M over two and at most M now when 05:39 you go to this kind of structure you can 05:43 make a variation where the number of 05:46 records stored in a leaf okay or the 05:49 capacity of a leaf is different from the 05:51 capacity of the other nodes okay so if 05:55 you're talking about a B+ tree of degree 05:58 or order M we could have the requirement 06:01 that roots got at least two children at 06:03 most M this guy's got at least M over 06:08 two at most M at least M over two at 06:10 most M but when it comes here children 06:13 are measured by number of external nodes 06:14 which is related to number of Records 06:17 and here the requirement here could 06:19 change okay so if you got very long 06:22 records the number you might store here 06:27 may not be as many as you could get here 06:30 here you're only storing keys okay so 06:33 it's possible that here you could have 06:34 very high degrees because typically you 06:36 would choose the size of these nodes to 06:39 be the same as that of a disk block okay 06:42 so you might be working with high 06:44 degrees here maybe 500 or 600 or 06:47 something but then when you come down 06:48 here because the records get longer you 06:51 might be working with much smaller 06:52 capacities maybe only 20 records per 06:54 disk work okay so you have that 06:57 flexibility the 06:59 capacity of these yellow nodes the leaf 07:01 nodes may be very different from the 07:03 capacity of everybody else now in our 07:07 example we'll assume that the capacities 07:10 of both types of nodes is the same but 07:12 that doesn't really change the 07:13 algorithms much you can work with both 07:17 with different capacities for the leaves 07:19 as from the intent from the remaining 07:22 nodes okay all right now if you're going 07:26 to search for something 07:27 suppose you're searching for the record 07:32 whose key is five okay so you start up 07:34 here okay if if five is present it's got 07:39 to be in the left subtree of mine so you 07:40 come down here if it's present it's got 07:43 even the right because the right has 07:44 things going legal - okay so even though 07:46 you've got a match you can't stop out 07:48 here there's no information here to get 07:50 you to the record okay so you have to 07:52 always go to a leaf okay so you come 07:55 down over here you reach a leaf node we 08:00 call that a data node okay the other 08:02 nodes we can call them index nodes so 08:04 once you get to a leaf node then you 08:07 read in that block 08:08 and you have to search the leaf node 08:11 okay so the leaf node is searched 08:14 thoroughly to see if the five is in 08:16 there in this case we find the five 08:26 but I suppose you want to do a range 08:28 search get me all the records whose Keys 08:31 lie between six and twenty well take the 08:39 low end of the range for just six and 08:41 I'll do a search for six okay all right 08:44 so it's smaller than nine 08:46 it's bigger than five you end up here so 08:48 I bring this block in this leaf node I 08:51 search it I find a sex then I follow its 08:54 right pointer and bring that block in 08:57 and see which one's which of those 09:00 records have keys between the six and 09:03 twenty so nine does then I fall at this 09:07 point and bring this fellow in get 16 09:09 and 17 I fall at that point and I bring 09:11 that fellow in and find nobody there and 09:14 I stop okay so there may be many yellow 09:17 notes there right there but I don't need 09:18 to bring those in all right so to do a 09:23 range search you search for the low end 09:26 of the range that takes you number of 09:28 access is equal to the height of the 09:30 structure okay and then you bring in one 09:32 block at a time until you bring in a 09:36 block that has no one or if you stop 09:39 somewhere in the middle if you're 09:40 looking for people in the range 16 6 to 09:43 16 once you brought this fellow in you 09:45 don't have to bring in that block all 09:51 right any questions about how you do a 09:53 search or a range search 10:02 it's a lot harder to do a rain search in 10:05 a standard b-tree 10:17 okay yeah yeah right right so if the 10:26 lower range was seven the process is the 10:28 same we kind of come here we bring this 10:30 block and it doesn't have anybody but 10:32 the things increase as you go to the 10:34 right so you may still find someone so 10:36 we still bring in this block okay so you 10:39 have to keep going until you bring in a 10:42 block that has something that is bigger 10:43 than your range the only two bad blocks 10:53 you bring in would be the first one may 10:55 not contain anything and then you may 10:57 bring in a vast block that doesn't 10:59 contain anything but all the other 11:01 blocks have to contain records that are 11:03 part of the answer and actually the 11:07 other ones all of them would be part of 11:09 the answer 11:16 all right any other questions about a 11:18 range search or a search by key all 11:24 right so let's take a look at insertion 11:26 okay in insertion again has a lot of 11:31 similarity with insertion into a 11:33 standard B tree but there are also 11:35 differences to come about because 11:39 records only stored in the leaves and so 11:42 the way you handle insertions is a 11:45 little bit different okay all right so 11:49 suppose you're going to insert a temp so 11:53 you follow the search path they take 11:55 determined by the ten 11:57 it's bigger than nine you come out here 12:00 okay so greater than equal to nine you'd 12:02 come here less than sixteen you come out 12:04 here you bring in this block and that 12:07 block has got capacity and you can put 12:09 the ten in there okay when you put the 12:13 ten in here we don't have to change any 12:15 of the keys because we followed a path 12:18 that was determined by those keys okay 12:21 so you can leave those exactly the way 12:23 they are and you'll be able to find the 12:24 ten again it's the same path okay so the 12:28 ten will go into you bring that picture 12:33 back all right so the ten will go into 12:40 this leaf node here it's got the 12:42 capacity and none of the keys in the 12:46 interior nodes change that's always true 12:49 when you make an insert that doesn't 12:52 cause a leaf to overflow it's only when 12:59 leaves overflow that you may need to 13:02 write out some of the non leaf nodes 13:05 because we're going to make changes to 13:06 them 13:13 all right let's look at a case with the 13:16 leaf into which you do the insert or 13:19 overflow we will try and insert a two 13:21 it's going to go into this node here you 13:24 follow the path less than nine less than 13:26 five you end up out here we're going to 13:28 put it in here so so when you put it in 13:37 logically you get a one two three and of 13:40 course we said the capacity we're 13:42 assuming for relief is the same as for 13:44 the interior so you have an over full 13:46 node and we have to split it alright so 13:52 you put it in logically it's overflowed 13:55 we're going to split the split is 13:57 slightly different from splitting in the 14:00 case of a standard B tree in the 14:04 standard B tree we would get a new node 14:06 and put the three in there and then take 14:08 this fellow and push it into the parent 14:09 well you can't really do that here 14:11 because the parent can't hold a record 14:13 it can only hold the key okay so you 14:18 still split into two but you got to keep 14:22 the records only in the yellow leaf 14:24 nodes okay so the split takes place like 14:27 that and then since we've created a new 14:33 leaf node we have to insert a new key in 14:36 the parent okay so the number of keys in 14:38 a node is one less than the number of 14:40 children it has okay so I'll take the 14:45 smallest key in the new node two and 14:47 I'll insert that into the parent 14:49 together with a pointer to this new node 14:54 okay 14:55 so the process is very similar to what 14:59 happened before except that I still have 15:02 to put this record here previously it 15:05 was you had a one you had a three and 15:07 then you took the two and put it up but 15:08 now I've got this record to which is got 15:11 us go into that leaf node 15:18 all right so this is what I look like I 15:22 got to take the two and put it in here 15:24 and now insertion into an index node 15:27 works exactly the way it did for a 15:29 b-tree okay so if you put this in here 15:34 it's going to become 2 5 and then you 15:36 have a pointer to that new node 15:46 well let's try another insert okay 15:49 suppose you want to put in an 18 all 15:53 rights greater than equal to nine 15:55 great and equal to 16 you end up here 15:58 you reach a leaf so the new records 16:01 going to go into that leaf and it's a 16:04 three node it's at capacity so the leaf 16:07 will split okay so when the leaf splits 16:10 you get like 16 17 18 when it's 16:16 overflowed so the other half will have 16:18 the 17 18 in it okay so you split that 16:24 node the old node contains only the 16:27 first key in this case in general yes 16:30 Green was splitting down the middle okay 16:32 and the new node will have the 17 18 and 16:36 then into the parent we're going to 16:38 stick a key corresponding to the 16:40 smallest fellow here together with a 16:42 pointer to that new knowledge all right 16:46 so that's going to come in here and from 16:50 now on it's going to be the same as 16:51 dealing with the B tree okay so this 16:54 will become sixteen seventeen thirty 16:58 okay 16:59 so we'll split it you'll get a sixteen 17:01 you'll get a 30 and then a 17 which will 17:04 come over there 17:13 okay so we get the sixteen and it has 17:16 two of the sub-trees 17:17 you get the thirty it gets two of the 17:19 sub-trees the middle key is moved up 17:21 just like before and that's a two node 17:24 so it becomes a nine seventeen and the 17:28 works so it's only at the leaf where the 17:35 split is a little bit different because 17:38 records can only remain at leaves let's 17:46 try another one let's insert a seven 17:51 right smaller than mine so you come here 17:54 Goodin equal to five you come out here 17:57 you bring this fellow in and you find 17:59 that it doesn't have any capacity you 18:01 put the seven and logically okay so you 18:05 get five six seven you split it this 18:09 physical node will contain only the five 18:11 and the new node will have the 6 and the 18:14 7 and then a pointer to that node 18:20 together with the value six is going to 18:22 get inserted here okay let me see if I 18:29 have pictures for this 18:30 I don't okay so let me just explain it 18:34 alright so so into this node I'll put in 18:40 a six together with a pointer to the six 18:42 seven when you put the six in here it 18:45 becomes two five six you split it into a 18:47 two and a six 18:48 each will have two subtrees the five 18:51 will come up here okay so this will 18:56 become a five nine seventeen and that'll 18:58 get split into a five and a 17 and the 19:00 nine will go up and form a new route 19:18 ok any questions about inserting into a 19:20 B+ tree 19:30 right you can analyze the number of disk 19:32 accesses it's pretty similar to what you 19:36 have in a standard b-tree let's take a 19:43 look at a delete as far as delete goes 19:51 the first thing to notice is that 19:55 deletions automatically end up being 19:58 deletions from a leaf okay so when we 20:00 were handling deletions from a b-tree we 20:05 had to first worry about where they were 20:06 deleting from the interior an on leaf or 20:09 a leaf and we had to transform deletion 20:11 from an on leaf into deletion from a 20:13 leaf now since the interior contains 20:17 only keys you're never deleting from the 20:20 interior you always deleting a record so 20:22 deletions are always from the from a 20:25 leaf so that case of interior deletion 20:28 kind of goes away 20:36 all right so you want to delete the 20:39 record whose key is 16 so you search for 20:42 the 16 ok greater than equal to 9 20:45 greater than equal to 16 you end up here 20:49 okay so you pull out and pull it out of 20:52 this node and when you pull it out of 20:55 that node you end up with a node whose 20:57 capacity is I mean that's still occupied 21:00 sufficiently it's not deficient so you 21:03 just write out the new node okay we 21:07 don't have to worry about any of the 21:08 keys on the path because it's still true 21:12 that everybody here is written equal to 21:14 this fellow so that property is still 21:17 holds so we don't have to change any of 21:19 the keys 21:36 alright next let's consider deleting 21:41 somebody whose key is one so less than 21:51 nine less than two you end up here okay 21:56 so when you take it out that node 22:00 becomes deficient so when it becomes 22:08 deficient we will look at a nearest 22:12 sibling and see if we can borrow from 22:16 there okay so the nearest sibling will 22:20 also be a leaf so take that leaf and I 22:24 want to take some records from there now 22:27 when we were dealing with B trees we 22:30 only talked about borrowing one but 22:32 typically when people are looking at B 22:34 plus trees they don't really stop at 22:36 just taking one you try and balance out 22:39 the load between these two leaf nodes 22:42 okay so you may take half or well you 22:46 look at the difference between the two 22:48 if this one's got ten and that one's got 22:50 20 you might borrow five and balance out 22:53 the stuff okay but you take at least one 22:56 assuming this fellow's got more than it 22:58 needs okay so you check the nearest 23:01 sibling if it has more than it needs 23:02 then you borrow some number from there 23:10 and when you borrow some number from 23:13 there let's see what happens okay so 23:15 when you borrow some from here you can't 23:17 leave that key the same as it was before 23:19 okay because previously this was a two 23:22 and once you you borrow things then you 23:25 have some people here who might violate 23:28 the property that people here must be 23:30 only smaller than the - okay so you move 23:34 some number from here then you take the 23:36 smallest key that's left out here and 23:38 move it up there okay all right so 23:44 deviations from b-tree you don't 23:47 necessarily bar only one you try and 23:49 balance out the load you borrow maybe 23:52 half the difference and then you have to 23:58 change the key up here okay previously 24:00 it was boring when turkey moved up here 24:02 and then that fellow moved here that's 24:04 not what takes place yeah all right 24:13 questions about this case all right now 24:25 suppose we're going to delete the two 24:27 okay so again you search for it you come 24:30 down here okay you take it out and the 24:35 node becomes deficient okay so when it 24:38 becomes deficient you check a near a 24:40 sibling in this case this fellow and 24:43 that guy doesn't have anything extra 24:45 okay so when he doesn't have anything 24:48 extra you combine and in the case of a 24:52 b-tree combining meant the deficient 24:54 node this fellow and the in bin in 24:57 between key combined good there's no 25:00 place for a key okay you can only have 25:02 records right so this fellow and that 25:04 fellow combine and the in-between key is 25:06 just thrown away so that would cause the 25:14 occupants are here to go down by one and 25:16 this guy might become deficient and then 25:18 you'd had to continue in this case we 25:19 don't have a problem we combine this 25:21 fellow and that fellow throw away that 25:23 key and the parent node is not deficient 25:28 so we're okay 25:43 okay all right now suppose you want to 25:46 delete the three okay so you look at a 25:52 nearest sibling see if you can borrow 25:58 this fellas back more than it needs so 26:01 we borrow some to balance out the load 26:03 and in the case of a two three three you 26:08 can only borrow one okay with the 26:09 capacities is 2 so we can borrow one we 26:12 borrow the five and move it here but 26:14 then I go to change the key out here to 26:16 be the smallest one or what's left okay 26:28 but I suppose we're going to remove the 26:30 nine so that's over here okay you check 26:38 it's near a sibling the 17 doesn't have 26:41 anything to give you 26:43 so we're going to do a combined okay so 26:47 when you do a combine the deficient node 26:50 is near a sibling which is barely you 26:53 know that minimum combined together the 26:55 in-between key gets thrown away okay and 27:05 of course you don't have to go any 27:06 further of the tree none of those keys 27:08 are impacted 27:16 okay let's go back to this state and 27:20 delete a six so you're gonna check a 27:29 near a sibling which is this guy and it 27:36 doesn't have anything for you to borrow 27:37 so we're going to do a combined so 27:43 what's a deficient node barely legal 27:46 near a sibling combine in between key 27:49 gets thrown away now this index node 27:57 becomes deficient okay when an index 28:01 node becomes deficient we work exactly 28:05 as if you had a be tree rather than a B+ 28:08 tree from now on okay so you check a 28:11 near a sibling okay it's got more than 28:15 what it needs so we borrow a key 28:19 together with a subtree from there and 28:21 when you borrow from here you got to 28:22 borrow through the parent okay so the 28:25 sixteen is going to move up here and the 28:27 nine is going to come down here alright 28:37 so okay so you borrow from your sibling 28:44 which has more than it needs the 28:46 borrowing is done through the parent the 28:48 sixteen moves up the line comes down and 28:50 you're fine 28:59 again something to notice that in 29:01 practice people may not just borrow only 29:04 one what they were doing at the leaf 29:07 level where you borrow many you may do 29:10 that here also if this fellow had 20 or 29:14 30 more than it needed and this became 29:16 deficient maybe we take half of the 29:18 excess and take all of those sub trees 29:20 out here the in-between key and then 29:24 we'll have to change the key up here to 29:26 be the last key that's being left out 29:30 there so as far as this six okay right 29:40 so let me just say that people do borrow 29:44 more than one item okay so so you're 29:54 deleting a nine 30:00 so that's found in this node here the 30:06 node becomes deficient you check it's 30:08 near a sibling it doesn't have anything 30:10 to give you so we're going to do a 30:12 combine when you combine the in-between 30:16 key disappears the parent becomes 30:20 deficient you check the nearest sibling 30:23 it's got nothing extra so you do a 30:26 combine but now you're doing a combine 30:30 at an index node so combining at an 30:36 index node was deficient node in between 30:39 key barely legal node they all get 30:42 together to create a new node 30:54 then you move up one level this becomes 30:57 deficient index node you check an error 30:59 sibling if you can borrow you borrow if 31:03 you can't borrow you do a combine if you 31:07 have no nearer sibling you have to be 31:09 the root and the root is deficient only 31:12 when it's empty you throw the root away 31:30 okay alright so again it's very similar 31:33 to what takes place in a b-tree except 31:36 that at the leaf level you you have a 31:38 slight deviation all right any questions 31:45 about a delete from a B+ tree or I'll 31:54 spend a few minutes talking about 31:56 another variant of a be tree this is a B 32:01 star tree right this one differs from a 32:09 bee tree only in the degree constraint 32:12 of the nodes right the placement of 32:15 records is the same as in a bee tree all 32:22 right so the degree constraints here are 32:24 the roots got to have at least two 32:27 children and it can have up to this 32:36 money if you look at this forget about 32:38 the floor you're looking at about four M 32:40 over three so you can the root can 32:44 actually get very fat it can be 33% more 32:46 than normally allowed in a bee tree of 32:49 order M okay so the route can have a 32:57 very small degree just like in a bee 33:00 tree up to two but it can also have a 33:03 degree higher than you would normally 33:05 expect in a bee tree of order M okay so 33:08 the route can go up to roughly four M 33:11 over three the remaining nodes at the 33:20 lower end instead of M over two we now 33:21 have 2/3 of M so I've increased the 33:25 lower bound on the degree from M over 33:31 two to about 2m over 3 but the upper 33:36 bound is the same you can only have M 33:48 as before all external nodes are 33:53 required to be at the same level 34:07 all right so the net effect of this is 34:09 if you do your height analysis like we 34:12 did for a b-tree if you recall there the 34:19 bounds that we had we had log of em but 34:25 then the base there was M over 2 okay so 34:29 we had something like log of n plus 1 34:32 and then the base was M over 2 now if 34:34 you do it the base instead of being M 34:36 over 2 because the lower bound on the 34:38 degree is 2 m over 3 the base will 34:40 become 2 m over 3 so the worst case 34:45 height is going to be smaller here than 34:48 it was in the case of a B 3 and because 34:53 the worst case height is smaller for the 34:56 same degree which means for the same 34:58 node size the number of disk accesses 35:02 needed to do a search is reduced in the 35:04 worst case and so you get somewhat 35:06 better performance as far as search goes 35:12 I'll quickly run through what happens 35:15 for an insert and a delete and there 35:18 you'll see that the number of disk 35:19 accesses made per level actually goes up 35:21 and so typically the total number of 35:25 disk accesses goes up for an insert or a 35:27 delete so if your primary concern is 35:30 just such maybe 99% of the time all 35:33 you're doing is search you get a reduced 35:35 height so you can reduce the number of 35:36 disk access accesses may be y1 maybe by 35:39 2 okay but then you would end up paying 35:42 for it when you're doing insert and 35:44 delete and if that happens what is 35:46 readily it doesn't really matter right 35:52 so if you're doing an insert the process 35:54 is very similar to when you're handling 35:56 a b-tree you insert it goes into a leaf 36:01 if the knee if the leaf becomes over 36:05 full then you're going to check in 36:08 adjacent sibling ok yeah you're going to 36:14 check in adjacent sibling to see if you 36:15 can offload something to that guy so 36:17 it's kind of the inverse of a combine 36:19 that 36:19 explain delete so so rather than split 36:24 directly if you try to split right now 36:26 we can't meet that 2m over three lower 36:29 bound on the degree of a node if you 36:33 split it off full node you're going to 36:37 end up with one node that's got M over 2 36:40 or less items in it okay so to meet that 36:44 lower bound of 2 m over 3 whenever you 36:46 do a split you have to split two full 36:49 nodes into 3 rather than one over full 36:53 node into two okay and that's why we 36:56 have to do this check check the adjacent 36:59 node and if that node has got capacity 37:06 to take more then you offload something 37:08 to that guy doing the inverse of the 37:10 borrow operations is sending it in the 37:13 other direction okay so if the adjacent 37:16 sibling is not full you move something 37:18 from this overflow node so the adjacent 37:21 siblings on the right you could take the 37:22 largest fellow and the overflow node 37:25 move it up as the in-between key tag in 37:27 between key and move it into the fellow 37:29 on the right okay 37:31 so you do the inverse okay but if the 37:35 adjacent sibling is full okay now I've 37:38 got one over full node and I've got one 37:41 full node plus I've got an in-between 37:44 key so I'll take these fellows okay and 37:48 I'm going to split them up into two 37:51 nodes and two in-between keys okay so we 37:56 take two nodes a full node and a know 37:58 full node and it in between key divide 38:01 it out to get one node that's going to 38:07 have this degree so one node is going to 38:13 have that many records in it 38:15 okay another node will have that many 38:17 and a third node will have that many and 38:20 there will be two keys left over which 38:24 will serve as the in-between keys for 38:26 these three nodes 38:31 so the splitting is done two nodes into 38:34 three rather than one into two and you 38:37 can check this out for different 38:40 possibilities for M whether m mod 3 is a 38:44 0 1 or 2 and you'll see that in all 38:47 cases you're going to be able to get 38:48 this to work out this will satisfy that 38:55 the sum of this will not exceed the 38:57 total number of keys that you have ok 39:01 all right so this thing works out you 39:03 can do a split of this way 39:04 and then into the parent node 39:08 you took one in-between key out you put 39:10 back to so that occupancy goes up by one 39:13 so that might become over full and then 39:16 you apply the same thing at that level 39:21 okay all right so the insert strategy is 39:24 very similar but we have to now split 2 39:28 into 3 which means you got to check an 39:31 adjacent sibling you try and borrow from 39:34 there if it's not full sorry you don't 39:37 borrow from there but you sent into 39:38 there if it's not full if it's full you 39:41 do a 2 into 3 split and when you do a 2 39:44 and 2 3 split the parent node may become 39:48 over full and that propagates back up 39:56 again when you reach the root okay 40:03 if we hadn't allowed for a root to get 40:06 very fat when he split a root if it only 40:09 has M minus one keys in it there's no 40:12 way to split it so that you end up with 40:15 at least 2 and minus 3 in the kids okay 40:19 that's why we allow the leaf sorry we 40:21 allowed the root to get over fat 40:24 something like 4 M over 3 so that when a 40:27 root of when a root has to split you can 40:29 create two children that have the 40:31 minimum occupancy or any questions about 40:36 insertion and a B star tree yeah 40:42 oh can I draw one not sure we get a 40:49 camera out here okay let's see can you 40:52 get the camera here ah 40:55 good okay all right so basically what's 40:58 happening is okay you have a node that 41:01 becomes over full you check a sibling 41:06 okay all right so we're going to take 41:10 the in-between key here now if this 41:16 fellow has space then I'll take the 41:19 biggest fellow here so maybe you got a 41:21 20 here a 25 here and a 30 out there 41:26 okay plus more things okay so I'll take 41:29 the 20 I'll move it here and I'll move 41:31 the 25 here so that's the reverse of a 41:35 borrow if there are sub trees okay then 41:40 this sub tree moves along with the 20 so 41:46 the so the first case where this is not 41:50 full is the inverse of a borrow for a be 41:54 tree in the second case this is full 42:01 so let's say this is full if you're 42:11 thinking about let's say order three 42:15 so that's full this has become over full 42:18 so maybe it's got a twenty twenty-one 42:21 twenty-two okay so now we're going to 42:25 take this this and that 42:27 and split it okay so if I'm going to 42:31 split it I can get a let's say 20 out 42:35 here and get a twenty-year get a maybe 42:44 21 22 well actually let's see how many 42:52 do I need I've got M is three so I need 42:55 them so I need to have two fellows in 43:01 each probably right so maybe you get a 43:04 20 and a 21 I'll move the 22 in between 43:09 I'll get a 25 and a 30 and then I'll get 43:16 a 33 well actually probably trying to do 43:25 this up with order three isn't going to 43:26 work we need to go with a higher order 43:28 but let me just try and explain it 43:29 logically okay so maybe you had somebody 43:33 else okay so we make three fellows then 43:37 you're going to have an in-between T 43:38 that's left or maybe a 22 here and a 31 43:42 so I'll take these two fellows and 43:45 insert it into the parent okay so when 43:50 you because this fellow was taken out 43:53 okay so the idea is you take two 43:55 adjacent nodes one is ofö the other one 43:59 is full you take the in-between key okay 44:02 and then we're going to split these into 44:04 three nodes that have the minimum 44:07 required for a B star tree okay now this 44:11 doesn't work when M is 3 we need to be 44:13 at a higher 44:14 degree than three together to work and 44:17 then when he split it into three nodes 44:19 will have two keys left over in this one 44:22 whose value is between these another one 44:25 whose values between these they will 44:27 take these two guys and put them into 44:28 the current so into the parent we went 44:32 from one key to two so that occupancy 44:35 here increased by one so this one might 44:37 become over full then you had to repeat 44:40 this strategy at this level okay and 44:43 that can propagate all the way up to the 44:45 root okay or as far as the delete goes 45:00 again when you do a delete in some sense 45:04 it's very similar to what's happening in 45:05 a b-tree but again because of this 45:07 minimum requirement of 2m over three 45:10 occupancy you can't really combine two 45:14 nodes into one so you have to really 45:21 combine three adjacent nodes and two 45:25 in-between keys so it's the reverse of a 45:30 split where you went from two into three 45:33 here we'll go from 3 into 2 so when you 45:41 go from 3 into 2 you'll be dealing with 45:50 two fellows who are at bare minimum ok 45:54 so that's these guys one fella who is 45:58 one below by minimums become deficient 46:01 and then there'll be two in between keys 46:05 because we've got three nodes that we're 46:06 talking about ok so this is the number 46:09 of keys we're going to have that we're 46:10 going to have to distribute over to 46:12 nodes plus 1 in between key ok and so 46:19 we'll distribute those out ok this is 46:21 the total number of keys that we have in 46:24 the three nodes that we are combining 46:25 plus the two in-between keys 46:28 we'll end up with two nodes and one in 46:32 between key and depending upon whether M 46:38 is divisible by 3 or not the size of 46:41 these two nodes will be different so if 46:46 M is dual by 3 each of the two nodes 46:49 will have n minus 1 keys if M R 3 is 1 46:54 it will have one node with n minus 1 one 46:56 with n minus 2 and if it's 2 then both 47:03 those nodes will have will have n minus 47:05 2 47:21 alright so the deletion operation is has 47:31 the same basic steps when you become 47:33 deficient you try to borrow from either 47:37 the nearest sibling or two siblings away 47:39 because you gotta have three nodes 47:40 involved if you're going to end up doing 47:42 something other than a borrow if a 47:48 borrow doesn't work we will do a combine 47:53 combining now has to be three nodes into 47:55 two because if you're trying to do two 47:58 into one you'll end up with too many 48:00 keys going to that one node a node is 48:03 deficient when you have less than 2m 48:05 over 3 ok so if you do 2 into one you 48:08 got 4m over 3 that you can't put into 48:10 one node except at the root level okay 48:15 so it's got to be 3 into 2 and when you 48:17 have 3 into 2 you have exactly this many 48:20 keys and now it's possible to distribute 48:23 them so that no nodes capacity is 48:27 exceeded and you have one key left to 48:31 put back in the parent okay since you 48:35 took 2 out of the parent and put one 48:36 back in the occupancy of the parent may 48:39 fall below the minimum required then you 48:42 have to repeat this process at the 48:43 parent all right any questions about 48:52 deletion from a be star tree 49:00 okay so we'll stop here