Okay, we were looking at tree structures to represent dictionaries, and in the last lecture we took a look at building a least-cost binary search tree to represent a static dictionary. Today we're going to start our discussion of dynamic dictionaries, and for dynamic dictionaries we'll see a range of data structures, each having its own pluses and minuses.

To begin with: in a dynamic dictionary, the primary operations are that you want to be able to perform a search, you want to be able to insert a new pair, and you want to be able to remove a pair from your dictionary. Some additional operations that tree-like data structures for dynamic dictionaries support are the ability to list the contents of the dictionary in ascending order of key, to perform a get by index, and to perform a remove by index.

Now look at the complexities of the primary operations get, put, and remove. If you use a hash table data structure to represent the dictionary, your worst case is O(n); your expected case, and really what you would see in practice almost all the time, is O(1). Using a binary search tree, it's O(n) in the worst case and O(log n) in the expected case. The unfortunate thing about this worst case is that it corresponds to situations likely to happen in practice, for instance when pairs arrive in sorted order, so you can't safely use this and say "we're going to consistently get log n behavior"; there are real applications where you would get close to the O(n). That's unlike the hash table, where the worst case of O(n) tends to happen only in unrealistic situations. And then there are balanced binary search trees,
where the worst case and the expected complexity are both O(log n).

Now look at the additional operations I've listed. If you use a hash table, and here D is the number of buckets in the table, and you want to list all of the entries in ascending order, the only way is to first go and find all of the elements in the table; for that you would spend at least D units of time going through all the buckets locating the elements. Having found the elements, in O(D) or O(D + n) time depending on the data structure you're using, you would then have to sort them into ascending order, spending O(n log n) time, n being the number of elements. So for the ascend operation that's what you spend, O(D + n log n), and that's not just the worst case; it's also the expected time to perform the ascend. If you want to do gets and removes by index, a simple approach is to do what you did here: sort them, and then pick up the i-th one. A more involved approach, which brings the complexity down from D + n log n to D + n, is to collect all the elements and then use a selection algorithm, which in O(n) time can find the i-th element of an unsorted collection. So instead of the n log n you can do it in O(n), but there's really no way to get it below the D + n.

If you're looking at indexed binary search trees, again these are not balanced, we just throw on the indexing operation that we discussed a few lectures ago, then you're still spending O(n) to do the ascend (that's just an inorder traversal), and O(n) also to do the get and the remove by index, because the height could be O(n).
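As a concrete illustration, here is a minimal sketch of the ascend operation on a chained hash table; the bucket layout and the function name are hypothetical, not from the lecture. Collecting the pairs costs O(D + n), and the sort contributes the O(n log n).

```python
def ascend(buckets):
    """List all (key, value) pairs in ascending key order: O(D + n log n)."""
    pairs = []
    for bucket in buckets:              # O(D) bucket visits ...
        pairs.extend(bucket)            # ... touching all n pairs
    pairs.sort(key=lambda kv: kv[0])    # O(n log n) comparison sort
    return pairs

# A tiny chained hash table with D = 5 buckets and n = 4 pairs.
buckets = [[(14, 'a')], [], [(3, 'b'), (10, 'c')], [], [(7, 'd')]]
print(ascend(buckets))  # [(3, 'b'), (7, 'd'), (10, 'c'), (14, 'a')]
```

Picking up the i-th pair of the sorted result is the simple O(D + n log n) get-by-index the lecture mentions; swapping the sort for a linear-time selection algorithm is what brings it down to O(D + n).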
If you go to the balanced mode, the ascend remains O(n) and the indexed get and remove become O(log n). So for the secondary operations, a balanced binary search tree is going to give us very good performance, even compared to a hash table as far as expected performance is concerned.

So we really want to take a look at balanced binary search tree structures, and all of these are really based on unbalanced binary search trees. So what I'd like to do is a quick review of how the operations are performed on an unbalanced binary search tree, so that we're all on the same page with respect to the background material.

Here is a binary search tree. If you want to insert an item, you just search for that item's key, and wherever you fall off the tree, that's where you put it in. So if you want to insert a pair whose key is 35, you search for 35: it's bigger than 20, smaller than 40, bigger than 30, and you fall off here; that's where you put it in. You find out where to put it, get a node, and link it in. So that's an O(height) operation.

If you want to do a remove, the remove breaks down into three cases: the pair you're removing resides in a leaf, it resides in a node whose degree is 1, or it resides in a node whose degree is 2. Those are the only three possibilities for the remove, assuming the remove is successful; if the item is not there, you don't have to do any work, you just can't do the remove.
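Before working through the remove cases, the insert just described, search for the key and link a new node in wherever the search falls off, can be sketched as follows. The Node class is a minimal stand-in; the lecture's trees of course store key-value pairs, not bare keys.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Unbalanced BST insert: O(height) search, then link in a new node."""
    if root is None:
        return Node(key)
    p = root
    while True:
        if key < p.key:
            if p.left is None:
                p.left = Node(key)
                return root
            p = p.left
        else:
            if p.right is None:
                p.right = Node(key)
                return root
            p = p.right

# The lecture's example: 35 is bigger than 20, smaller than 40,
# bigger than 30, so it falls off as 30's right child.
root = None
for k in [20, 10, 40, 30, 35]:
    root = insert(root, k)
print(root.right.left.right.key)  # 35
```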
So if you're removing from a leaf, for example you could be removing the 35, the 25, the 18, the 7, or the 2, then you first need to find the leaf: you're given the key, you perform a search, and you end up at that node. Once you've found the leaf, you just change the pointer from the parent to null, and you've gotten rid of the node. That's all you have to do there: change the pointer from the parent to null and you're done.

If you're removing from a node whose degree is 1, for example the 40, the 15, or the 8: say you are removing the 40, then you simply change the pointer from the parent, assuming you have a parent; if you're removing the root, its single child just becomes the new root. If you do have a parent, you change the pointer from your parent to your single child. You change that pointer, and that node is no longer part of your tree. Here's another example: we take out the 15 by changing the pointer from its parent to its single child.

The final case is removing from a node whose degree is 2. For example, you could be removing the 20, the 10, the 6, or the 30. So let's say we're asked to remove the 10. To remove the 10 you can't really play the strategy we had for the previous two cases; you can't just change the pointer from the parent, because you've got two children out here to worry about. So the strategy you adopt is: find the pair with the largest key in the left subtree, in this case that would be the 8, and move it up to fill the vacancy. Alternatively, you could find the smallest in the right subtree, in this case the 15, and move that up.
Either way preserves the binary search tree property. So we replace the pair being removed with the largest in the left subtree or the smallest in the right; in the examples I consistently replace with the largest in the left subtree, so in this case the 8 moves up. Now, really what you've done is move the vacant-node problem further down the tree. But since you took the largest in the left subtree, you know that the node which held that element cannot have a right child, because whatever is in the right child would be bigger, and there's nobody bigger. So this node is guaranteed to have no right child, its degree is either 0 or 1, and we know how to handle that from the previous two cases. In this example the node's degree is 1, so you change the pointer from its parent to the surviving child. So there's only one pair that has to be relocated; it's not a problem that percolates all the way down the tree.

One last example: you want to take out the 20. The strategy is to find the largest in the left subtree, which is the 18; you find it by moving into the left subtree and then making a sequence of right-child moves, and that largest item is moved up. The largest item has to be in a node whose degree is either 0 or 1; in this case it's degree 0, so we handle it with the leaf-removal strategy: you just change the pointer from the parent to null. Incidentally, that's the same as the strategy for the degree-1 case, because you could say you're changing the pointer to one of the children, which happens to be null.
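The three remove cases can be sketched as follows. This is a minimal single-key version (no values), with a recursive insert and an inorder generator just to build and check the lecture's example tree; all names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def remove(root, key):
    """Remove key from the BST, handling the degree-0, -1, and -2 cases."""
    parent, p = None, root
    while p is not None and p.key != key:
        parent, p = p, (p.left if key < p.key else p.right)
    if p is None:                         # key absent: nothing to do
        return root
    if p.left is not None and p.right is not None:
        # Degree 2: the largest key in the left subtree moves up; its
        # node has no right child, so deleting it is a degree-0/1 case.
        rparent, r = p, p.left
        while r.right is not None:
            rparent, r = r, r.right
        p.key = r.key
        parent, p = rparent, r
    # Degree 0 or 1: splice the single child (or None) into the parent.
    child = p.left if p.left is not None else p.right
    if parent is None:
        return child                      # removed the root itself
    if parent.left is p:
        parent.left = child
    else:
        parent.right = child
    return root

def inorder(root):
    if root is not None:
        yield from inorder(root.left)
        yield root.key
        yield from inorder(root.right)

# Rebuild the lecture's tree and remove the 10 (degree 2): the 8 moves up.
root = None
for k in [20, 10, 40, 6, 15, 30, 2, 8, 18, 25, 35]:
    root = insert(root, k)
root = remove(root, 10)
print(list(inorder(root)))  # [2, 6, 8, 15, 18, 20, 25, 30, 35, 40]
```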
So, very briefly, that's how you perform puts and removes on binary search trees when you don't care about the height of the resulting tree. Any questions about those? Okay. The complexity of each operation is O(height).

All right, some other operations that you might want to perform on binary search trees, operations motivated by those performed on priority queues. You may want to find the largest item or the smallest item; you may want to remove the largest item or the smallest item; those are the standard priority queue operations. You may be interested in initializing a binary search tree with n pairs, and the other operation we had for priority queues was meld, so you may want to do the same thing here: given two binary search trees, meld them into one. All right, so let's take a look at how you might perform these operations on binary search trees.

If you want to find the max item, we know that you can start at the root and keep making right-child moves, getting to bigger and bigger items, until you can go no further. So you can find the max item in O(height) time (not logarithmic, O(height)), whereas in a priority queue you can find the max item in O(1) time; it's sitting at the root. You can find the min item by just following the leftmost path; that's also O(height). If you want to remove the item once you've found it, you can remove it in O(1) time, because it's going to be in a node of degree 0 or 1; but if you include the time needed to find the item, then it's O(height).
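The max and min searches are just the rightmost and leftmost paths; a sketch, with a minimal hypothetical Node class:

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def find_max(root):
    """Follow right children to the biggest key: O(height)."""
    while root.right is not None:
        root = root.right
    return root.key

def find_min(root):
    """Follow left children to the smallest key: O(height)."""
    while root.left is not None:
        root = root.left
    return root.key

root = Node(20)
root.left, root.right = Node(10), Node(40)
root.right.left = Node(30)
print(find_max(root), find_min(root))  # 40 10
```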
So if you had to implement a remove-max or a remove-min, you could do it in O(height) time: it takes O(height) for the search to find the item, and then O(1) with our simple algorithms to actually take it out. If you went to a balanced version of this, that would translate into O(log n) time to perform the remove, which asymptotically is the same as for, say, binary heaps.

If you want to initialize, we know that we can initialize priority queues in linear time: if you go through any one of the many priority queue structures we looked at, binary heaps, leftist heaps, Fibonacci heaps, binomial heaps, in all of them the initialize time is O(n). Now, as far as initializing binary search trees is concerned, we can prove there's no way to initialize these in O(n) time, and you prove that by making use of a known result for sorting, which says you cannot sort n elements in less than n log n time by comparisons. You can use a binary search tree to sort: given n elements, you first construct the binary search tree; that's the initialize step. Having constructed the binary search tree using the initialize operation, you then perform an inorder traversal and list the elements in sorted order. The inorder traversal takes O(n) time, but we know from the theoretical results on sorting that these two steps together have to take n log n time. So if the traversal step takes O(n), then the initialize has to take the n log n. So you really can't initialize these fellows in anything less than n log n time.
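The sorting argument, written out as a hypothetical tree sort: if the n inserts (the initialize) took only O(n) compares, the whole routine would sort with fewer than n log n compares, contradicting the lower bound. All the names here are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    if root is not None:
        yield from inorder(root.left)
        yield root.key
        yield from inorder(root.right)

def tree_sort(keys):
    """Initialize a BST with n inserts, then read it back in O(n) time."""
    root = None
    for k in keys:              # the "initialize" step: all compares live here
        root = insert(root, k)
    return list(inorder(root))  # the traversal step: zero compares

print(tree_sort([31, 4, 15, 9, 26, 5]))  # [4, 5, 9, 15, 26, 31]
```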
Turning our attention to the meld operation: we know that we can meld two priority queues, depending on the data structure you're using, in O(1) time if you have Fibonacci heaps, for example, or in O(log n) time if you're using, say, a leftist tree. It turns out that you can't meld binary search trees in logarithmic time; you have to spend linear time to do the melding. So for example, if you've got these two binary search trees and you want to meld them into one, you'd create a new binary search tree that looks like this.

To see why you can't do this in logarithmic time: the meld operation here is one that can be used to perform a merge of two sorted lists. I'll go into the details in a moment, but there is a known result for merging which says that if you have two lists, say each having n elements, then there's no way to merge these two lists performing fewer than 2n − 1 comparisons in the worst case.

So, coming back to this picture, here's the way we're going to do a merge. You're given one sorted list that has n elements, and from a sorted list you can construct a binary search tree with 0 compares; in fact you can build a left-skewed tree or a right-skewed tree. Just take the elements in ascending order, put the first one in the root, the next one as its right child, and so on, and you've got a right-skewed binary search tree. Do the same thing for the second list. So you can construct the two binary search trees with 0 compares if the elements are given to you in ascending order. Then you perform the meld.
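Building a skewed tree from a sorted list really does take zero key comparisons, as this sketch shows: the structure depends only on the positions of the elements, never on their values.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def right_skewed(sorted_keys):
    """Sorted list -> right-skewed BST, performing zero key comparisons."""
    root = p = None
    for k in sorted_keys:
        node = Node(k)
        if root is None:
            root = node          # first element becomes the root
        else:
            p.right = node       # each next element is the right child
        p = node
    return root

def inorder(root):
    if root is not None:
        yield from inorder(root.left)
        yield root.key
        yield from inorder(root.right)

print(list(inorder(right_skewed([2, 3, 5, 8]))))  # [2, 3, 5, 8]
```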
And then you do an inorder traversal, which also does 0 compares, and you list everything out in ascending order. So there are no comparisons creating the initial orange and yellow trees, and there are no comparisons taking the melded red tree and producing the result in ascending order via the inorder traversal. The only place where compares can be done is in the meld operation, but the theory says that to perform this merge you must do 2n − 1 compares in the worst case. So the meld operation has to perform on the order of n compare operations, and therefore must take at least that much time. So we can prove that operations like meld and initialize, which were relatively fast for priority queues, cannot be done in a similarly fast fashion for binary search trees. Any questions about those two claims? No? Okay.

All right, so let's turn our attention then to trees whose height is O(log n). If you want to perform inserts, searches, and removes in O(log n) time, we know that they take O(height) time, so we have to limit the binary search trees that we use to a subclass of all binary search trees, a subclass in which every instance has a height that is O(log n). And certainly we may first look for binary trees that we already know have a height with this property. So you might say, let's limit ourselves to full binary trees. We can see that's not going to work, because full binary trees don't exist for every value of n: they only exist when the number of elements is one short of a power of two. If you want to support a dynamic dictionary, then you have to be able to go from seven elements to eight, to nine, to ten, to eleven, to twelve.
You can't just say, "I'm now at seven; I won't do anything until you give me eight more elements." All right, so full binary trees aren't going to work. We could look at complete binary trees: we know that these exist for all values of n, and we know that their height is logarithmic. But we can show that if you stay with complete binary trees, then you cannot perform insert and delete in logarithmic time. To see a simple example, here's a complete binary search tree with five elements in it, and we're going to perform an insert: I'm going to insert an item that is smaller than the smallest you presently have, so anything smaller than 2. Once you're done, having started with the five-element complete binary tree, you have to end up with a six-element complete binary tree, so the structure is well defined. Also, the placement of elements has to be such that an inorder traversal gets them out in ascending order, so the placement of elements is also unique. The structure is unique and the placement is unique, and, like it or not, this is what you have to produce. And what you see is that to go from here to there, you have to relocate almost every element; all elements change their position in the tree. Even if you keep this as a linked binary tree, you'll see that for every element either its parent has changed, or one or both of its children have changed. So for every node you're going to have to make some change, and that's going to take you O(n) time. So complete binary trees are not suited to dynamic dictionaries either.
That pretty much eliminates the tree types we already know of that have the log n height property. Now, as far as balanced search trees are concerned, a balanced search tree, very loosely speaking, is anything in a class of trees whose height is logarithmic in the number of elements. There are many different categories of balanced search trees, or balanced trees in general. We have a category which we'll call height-balanced trees, and in that category there's a specific variety called the AVL tree, after Adelson-Velskii and Landis, the two Russians who came up with this class of trees. This really represents the first known class of log-n-height trees for which people could perform inserts and deletes in logarithmic time, so historically it's of great significance: the first example of a dynamic tree structure where you could perform these operations in logarithmic time. We will be studying these. Then there are weight-balanced trees; there are a few varieties, but we aren't going to study any variety of weight-balanced tree in this class. Then there are degree-balanced trees, and in this category there's a variety called 2-3 trees that we'll be looking at, a generalization of them called 2-3-4 trees, which we'll also look at, and a category called red-black trees, which are really the same as 2-3-4 trees, just a different way of representing a 2-3-4 tree than the natural way. And then there are B-trees; really, when you look at B-trees, all of these other three are just special cases of a B-tree. So in some sense there's only one different type of tree here: we're looking at B-trees, 2-3 and 2-3-4 trees are direct special cases, and red-black trees are just another way to represent a 2-3-4 tree, which is itself a special case of a B-tree.
All right, so we're going to look at quite a few search tree varieties, and depending on the application, one or the other may prove to be the better way to go. I'll start by looking at the first of these, the AVL tree. It's a binary tree, and as I said a little while ago, it's a height-balanced binary tree. To understand what we mean by height balancing, we'll define a quantity called the balance factor for every node in the binary tree: the balance factor is simply the difference in heights between the left and right subtrees. We're going to define the balance factor as height(left subtree) minus height(right subtree); in some books you might find it defined the other way around, height of right minus height of left. It doesn't really make any difference; it just means that you change the sign of the balance factor. Regardless of how you define it, the requirement is the same: to be an AVL tree, the balance factor of every node must be −1, 0, or +1. So what I said is that, basically, the height difference between the left and right subtrees is at most 1 for every single node.

As an example, consider this binary tree; we can fill in the balance factors. The balance factor of a leaf is 0: height of left subtree 0, height of right subtree 0, height difference 0. Come here: the height of the left subtree is 1 and the height of the right subtree is 1, so the height difference is 0. If you come to this node, the balance factor is 0.
Moving along, here the height of the left subtree is 2 and of the right subtree is 1, so the balance factor is +1. Then a 0 out there, a −1 over here, a 0 over here, a +1 over there; then here a 0 and a −1, and over here a +1. The height here is 3 and the height here is 2. Coming over here, the left subtree has three levels, 1, 2, 3, and down here it's 1, 2, 3, 4, so 3 minus 4 gives the −1. So in this particular example the balance factor of every node is one of the permissible values −1, 0, or +1, and therefore this is an AVL tree. It's not difficult to convert it into something that is not an AVL tree: just throw a node down here. If you give this fellow a right child, then the balance factor of this node would become −1, and the balance factor up there would become −2.

So in a height-balanced binary tree, the essential property is that you place a bound on the height difference between the left and right subtrees of every single node. In an AVL tree that bound is 1: the absolute value of the height difference cannot be more than 1. But you could go to other height-balanced trees and say the absolute difference in height is no more than 2, or no more than 3. It doesn't really matter what bound you place; you can prove that the class of trees you define has logarithmic height, it's just that the base of the logarithm changes. Now we're going to go through a proof showing that when the height-difference bound is 1, namely when you have AVL trees, the height of these trees cannot exceed roughly 1.44 log2 n.
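The balance-factor condition can be checked directly from the definitions. A sketch (the names are illustrative, and the repeated height calls make it quadratic, which is fine for checking small examples):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(t):
    """Number of levels: 0 for the empty tree, matching the lecture's n_0 = 0."""
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

def is_avl(t):
    """Balance factor height(left) - height(right) must be -1, 0, or +1 everywhere."""
    if t is None:
        return True
    return (abs(height(t.left) - height(t.right)) <= 1
            and is_avl(t.left) and is_avl(t.right))

balanced = Node(2, Node(1), Node(4, Node(3), None))
chain = Node(1, None, Node(2, None, Node(3)))  # right-skewed: root's balance factor is -2
print(is_avl(balanced), is_avl(chain))  # True False
```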
And since we know that every n-node binary tree must have a height that's at least about log2 n, we know that the height of AVL trees lies between the lower bound for every binary tree and this upper bound for AVL trees; your height can be anywhere in that range. And then, if you're able to perform your inserts and deletes in O(height) time... well, we're going to have to modify how we do inserts and deletes, because if you stick with the simple algorithms, even though you may start with an AVL tree, after an insert the result may not be an AVL tree; you might move out of the class. So you have to preserve the class property.

All right, so let's come up with the upper bound on the height, the 1.44 log2 n. The analysis is really very similar to the analysis done for, say, binomial heaps, showing that if the degree is d then you have to have 2^d nodes, or for Fibonacci heaps, which said that if the degree is i then you have at least F_{i+2} nodes, roughly φ^{i+2} of them. So here too we're going to say: suppose you have a particular height; what is the minimum number of nodes the AVL tree could possibly have, given that height? And that minimum number of nodes needs to be some exponential function of the height. As in the Fibonacci heap analysis, we'll let n_h be the minimum number of nodes in an AVL tree whose height is h; there, of course, we were looking at a Fibonacci heap node whose degree was i, so instead of degree we're now talking about height. If the height of your tree is 0, you have no nodes, so n_0 = 0. If the height is 1, all you have is the root, so n_1 = 1.
If the height is more than 1, we draw a little picture: you've got a root, the root has a left subtree L, and the root has a right subtree R, and we know from the definition of AVL trees that L must be an AVL tree and R must also be an AVL tree. Now suppose the tree shown here is one that has the minimum number of nodes possible in an AVL tree of height h. The height of the whole tree is h, so we know that the height of the left subtree is at most h − 1, and the height of the right subtree is at most h − 1, and one of the two has to be exactly h − 1: if both were h − 2, for example, the whole thing couldn't have height h. The other one could have height h − 1 or h − 2. Now we can reason that the value of n_h increases as you increase h: the number of nodes in a minimal tree of height 0 is less than in one of height 1, and if you go higher the minimum number keeps going up. So if this whole tree has the minimum number of nodes for an AVL tree of height h, and one subtree has height h − 1, then the other one will not also be h − 1, because we could reduce the number of nodes by changing it to have height h − 2. So one subtree has height h − 1 and the other h − 2; it doesn't really matter which is which. And if the whole tree has the minimum number of nodes, then the subtree of height h − 1 must have exactly that minimum number of nodes for its height; if it had more than that, we could throw out L and put in an AVL tree with fewer nodes.
So since the whole tree is minimal, L and R are both minimal for their heights: one of them has n_{h−1} nodes and the other n_{h−2}. That gives us a familiar recurrence: the minimum number of nodes when the height is h is the minimum number when the height is h − 1, plus the minimum when it's h − 2, plus 1, where one term takes care of L, the other takes care of R, and the 1 takes care of the root:

n_h = n_{h−1} + n_{h−2} + 1,   n_0 = 0,   n_1 = 1.

That looks a lot like the equation we had for Fibonacci heaps. We've seen the Fibonacci numbers before, given by F_0 = 0, F_1 = 1, F_i = F_{i−1} + F_{i−2}; from these two recurrences we can derive, and prove by induction, that n_h = F_{h+2} − 1. Then, knowing how fast the Fibonacci numbers grow: the i-th Fibonacci number is roughly φ^i divided by √5, where φ = (1 + √5)/2. So from here, basically what you get is that the minimum number of nodes in an AVL tree whose height is h is about φ^{h+2}/√5. The number of nodes grows exponentially in the height, and if you then turn around and bring the n to one side and the height to the other, you come up with the equation we had before: the height is no more than about 1.44 log2 n. That's almost identical to what we went through for Fibonacci heaps.

An AVL search tree is simply an AVL tree with the binary search tree property added to it, so you don't have to use AVL trees only in the context of search trees.
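Here is the recurrence computed directly; the values are exactly the Fibonacci numbers shifted and reduced by one, n_h = F_{h+2} − 1, which is where the φ^{h+2}/√5 estimate and the 1.44 log2 n height bound come from.

```python
def min_nodes(h):
    """n_h = n_{h-1} + n_{h-2} + 1, with n_0 = 0 and n_1 = 1."""
    n = [0, 1]
    for _ in range(2, h + 1):
        n.append(n[-1] + n[-2] + 1)
    return n[h]

print([min_nodes(h) for h in range(8)])  # [0, 1, 2, 4, 7, 12, 20, 33]

# Cross-check n_h = F_{h+2} - 1 against the plain Fibonacci sequence.
fib = [0, 1]
for _ in range(20):
    fib.append(fib[-1] + fib[-2])
assert all(min_nodes(h) == fib[h + 2] - 1 for h in range(15))
```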
But when we do use them in the context of search trees, they're binary search trees with the property that the height difference between left and right subtrees is at most 1 for every node. Most of the time we won't use the full name "AVL search tree"; we'll just call these AVL trees, since we're only going to use them in the context of binary search trees in this class. Now, as far as searching these is concerned, there's no difference between searching an AVL tree and searching any other binary search tree. The same algorithm works, you don't have to make any change, and we don't care about the balance factors. The difference is going to come in how you insert and how you remove; we'll take a look at those operations in our next meeting.

All right, we were looking at AVL trees as a first example of a balanced tree last time; this was a height-balanced tree. Today we want to look at how you would perform inserts and removes on an AVL tree while preserving the AVL property. First, a quick review of AVL trees. We saw that an AVL tree is a binary tree, and you define a balance factor for every node, where the balance factor is simply the height difference between the left and right subtrees of that node. AVL trees have the property that the balance factor of every node is −1, 0, or +1, and as a consequence of this property we saw that the height of an AVL tree lies between log2 n and 1.44 log2 n; so they have the right height property, all AVL trees have a height that is O(log n). As an example, this is really an AVL search tree: the elements are placed to satisfy the binary search tree property, and the structure of the underlying binary tree is that of an AVL tree.
that of an AVL tree the 42:42 numbers given outside are the balance 42:44 factors so all of the balance factors 42:46 have one of the allowable three values 42:48 plus one minus one and zero and inside 42:52 we've got the keys of the elements 42:54 satisfying the binary search tree 42:56 property ok alright so with that as a 43:01 recap of what we did last time let's 43:03 take a look at how to perform first the 43:06 insert operation 43:09 alright so suppose you want to put into 43:14 your collection a pair whose key value 43:17 is nine okay so we want to insert a nine 43:21 we'll proceed to perform the insertion 43:24 exactly the way we would 43:27 if we didn't care about balance okay so 43:29 let's just use the algorithm we were 43:31 using when we talked about binary search 43:34 trees a couple of lectures ago and that 43:37 algorithm if you recall would search for 43:39 the nine okay so smaller than ten you 43:43 end up here 43:43 bigger than seven you end up there 43:45 bigger than eight you fall off here and 43:47 where you fall off is where you put it 43:50 in okay so you get a new node you put in 43:53 the key value and then you make the 43:55 connection okay now we need to adjust 44:00 the balance factors on nodes in the tree 44:03 so that the balance factors are recorded 44:06 correctly we know that the only balance 44:09 factors we need to be concerned about 44:11 are those on the path from the root to 44:13 this newly inserted pink node okay it's 44:18 only along this path that the 44:20 height of subtrees could change down 44:22 here you didn't do anything so none of 44:23 those subtrees could have a height 44:25 change same thing around here you didn't 44:27 do anything so none of those subtrees 44:29 have a height change okay so we move 44:32 downwards we make the insert okay and 44:34 then we're going to retrace the path 44:36 backwards adjusting the balance factors 44:40 okay we would typically not adjust the 44:43 balance factors on the way
down because 44:45 the insert might fail in the sense that 44:47 you might already have a pair whose key 44:50 is nine in which case you won't insert a 44:52 node you might just update the associated 44:53 element okay all right so having made 44:58 sure that I'm really going to add a new 44:59 node I will now set balance factors on 45:02 the way back okay 45:03 the balance factor of the new node of 45:05 course is zero okay that's always zero 45:08 it doesn't have a left or right 45:09 subtree okay when you move backwards 45:14 from a right subtree okay the balance 45:18 factor of this node will decrease by one 45:21 okay the balance factor is height of left minus height of right so 45:25 since the height of the right subtree 45:26 went up it was previously zero it's now 45:28 one the balance factor here will 45:30 decrease by one so this becomes minus 45:33 one okay the height of this subtree was 45:37 previously one it has now become two 45:39 okay 45:40 so the same thing you're coming from the 45:41 right subtree so the balance factor 45:43 there is going to decrease by one so 45:47 that balance factor becomes zero when a 45:49 balance factor becomes zero we know that 45:52 the height of the subtree rooted at 45:54 this node has not changed 45:56 it was previously one the right subtree 46:00 increased its height by one the balance 46:01 factor becomes zero the overall height 46:03 of the subtree is unchanged okay so 46:05 when you reach a node whose balance 46:08 factor becomes zero then you don't need 46:11 to go any further up because none of 46:14 those balance factors will change okay 46:17 all right so to change the balance 46:19 factors you start from the new node and 46:21 you move back up okay and some nodes 46:28 that had a balance factor of zero their 46:30 balance factor would change to plus 1 or 46:32 minus 1 because the height of a subtree 46:34 changed okay when you encounter the first 46:37 node whose balance factor was plus 1 46:40 or minus 1 before the insert
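the backward pass just described — the new node starts with balance factor zero, each node on the path back up has its balance factor increased by one when you come up from its left subtree and decreased by one when you come up from its right subtree, and you stop as soon as a balance factor becomes zero — can be sketched in Python roughly as follows (a minimal sketch, not the lecture's code; the names `Node`, `bst_insert`, and `fix_balance_factors` and the parent-pointer representation are my own, and duplicate keys are assumed not to occur):

```python
class Node:
    """AVL search tree node; bf = height(left) - height(right)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.bf = 0  # a new node has no subtrees, so its balance factor is 0

def bst_insert(root, key):
    """Ordinary (unbalanced) BST insert; returns (root, new_node).
    Assumes key is not already present."""
    node = Node(key)
    if root is None:
        return node, node
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = node
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = node
                break
            cur = cur.right
    node.parent = cur
    return root, node

def fix_balance_factors(node):
    """Backward pass from the newly inserted node: adjust balance factors
    on the path toward the root.  Stops when a balance factor becomes 0
    (subtree height unchanged, nothing above can change) or reaches +2 or
    -2 (that node is returned and a rotation is needed there), or when we
    back up past the root."""
    child, cur = node, node.parent
    while cur is not None:
        cur.bf += 1 if child is cur.left else -1  # from left: +1, from right: -1
        if cur.bf == 0:
            return None   # height of this subtree is unchanged; done
        if cur.bf in (2, -2):
            return cur    # imbalance found at this node
        child, cur = cur, cur.parent
    return None           # backed up past the root; no imbalance
```

for example, inserting 10, then 5, then 15 leaves the root's balance factor at zero after the third insert, so the pass stops immediately; inserting 20 and then 25 after that makes `fix_balance_factors` stop at the node with key 15 with balance factor minus 2, signaling an imbalance that must be repaired.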
its balance 46:43 factor would now change to zero or of 46:46 course it would become plus 2 or minus 2 46:48 it's going to change by 1 okay so if it is 46:51 a plus 1 it could go to a 2 it could go 46:53 to a zero if it's a minus 1 it could go 46:55 to a zero or go to minus 2 if it goes 46:59 to 0 the height hasn't changed so you 47:01 don't need to go any further let's try 47:05 another one okay let's try and insert a 47:08 29 all right so the process is the same 47:12 you start at the root follow the search 47:15 path okay bigger than 10 smaller than 40 47:19 smaller than 30 bigger than 20 bigger 47:22 than 25 you fall off here so wherever 47:25 you fall off that's where you put in the 47:28 new node okay so you get the node you 47:31 put in the key and then you make the 47:33 connection okay having done the insert 47:37 we're now going to set the balance 47:39 factors starting from the new node going 47:42 back up the tree okay the balance factor 47:45 of the new node is always zero and the 47:48 new node always causes the height of 47:49 that subtree to increase by one okay so 47:52 long as the height 47:53 changes you have to go further up the 47:55 tree once the height doesn't change you 47:57 can stop okay all right so the height of 48:00 this subtree has changed it has gone up by 48:02 one you're coming from the right subtree 48:05 so you have to decrease the balance 48:07 factor by one if you were coming from the 48:09 left subtree you would increase the balance 48:10 factor by one okay all right so the 48:13 balance factor becomes minus one okay 48:17 when the balance factor changes from 48:20 zero to something the height has 48:22 increased so you have to keep going up 48:26 okay all right so from this node I now 48:29 move up to that node and coming from the 48:32 right subtree so the balance factor 48:33 decreases by one it becomes minus two 48:38 all right so my backward pass up here is 48:44 going to stop at least initially at the 48:47 first place where the
balance factor 48:49 becomes plus 2 or minus 2 signaling I 48:51 have a problem okay I'm not going to 48:53 continue to fix balance factors I'm 48:55 going to immediately move into a 48:57 recovery or readjustment mode for the 48:59 tree all right so this backward pass 49:09 where you readjust balance factors stops 49:13 at the first node where the balance 49:15 factor becomes plus 2 or minus 2 if 49:17 there is such a node otherwise it will 49:21 keep going until you reach a node whose 49:23 balance factor becomes 0 at which time 49:26 you can stop or until you find your way all 49:30 the way out of the tree ok all right so 49:33 in this example we reached a node whose 49:36 balance factor became minus 2 okay and 49:39 at that time we'll say we have an 49:42 imbalance okay the imbalance that we 49:46 have is characterized based upon the 49:51 location of the new node the pink node 49:53 relative to the white node the node 49:56 whose balance factor becomes plus 2 or 49:58 minus 2 in this particular case this 50:02 pink node was first inserted into the 50:05 right subtree of the white node 50:07 okay so that will give me the first 50:11 letter in the characterization of the 50:13 imbalance so identify the white node 50:16 also called the a node as the nearest 50:20 ancestor of the newly inserted node 50:22 whose balance factor becomes plus 2 or 50:25 minus 2 if the pink node was inserted 50:30 into the right subtree of the white node 50:32 then it's an imbalance of type R that 50:35 gives the first letter in the 50:36 characterization the second letter comes 50:39 relative to the root of the subtree you 50:44 went into okay so from here I went into 50:46 the right subtree and then from here I 50:49 further went into its right subtree so 50:50 that gives me the second R if this pink 50:54 node was sitting down here this would be 50:56 called an RL type of imbalance okay all 51:02 right so we characterize the imbalance 51:03 based on the a node the nearest ancestor 51:08 of
the newly inserted node whose balance 51:09 factor becomes plus or minus two with 51:12 the two letter characterization in this 51:14 case it's an RR imbalance okay we know 51:18 that the pink node cannot be a 51:31 child of the a node it's got to be a 51:34 grandchild or further down okay if there 51:38 was a possibility that the pink node 51:39 could be a child then you wouldn't be 51:41 able to come up with the second letter 51:42 in the characterization okay it can't be 51:46 a child because the balance factor here 51:48 had to be a plus one or minus one before 51:52 the insert okay and therefore this node 51:56 must have 51:58 had a non-empty subtree on the side 52:00 where you did the insert okay all right 52:03 so you can convince yourself that 52:07 whenever you get a plus 2 or minus 2 52:09 this characterization is well defined 52:12 the first letter is easier to see and then 52:15 you need to convince yourself that it 52:16 will actually have a child that was 52:18 there from before whose subtree you move 52:20 into alright so that kind of imbalance 52:27 is called an RR imbalance and the 52:30 way you fix an RR imbalance is you 52:34 perform what's called an RR rotation in 52:37 an RR rotation you simply rotate the 52:40 subtree rooted here okay 52:44 counter clockwise okay so this node the 52:49 right child of the a node 52:51 takes the position of the a node and the a node 52:53 becomes the left child of this node 52:56 okay there are some other changes which 52:59 we'll see in the more general picture 53:02 that we need to look at in a moment okay 53:05 so in this case you perform an RR 53:07 rotation and in this particular example 53:09 once you do the RR rotation you find 53:12 that the height of the subtree rooted 53:14 here is the same as it was before the 53:16 insert okay so there's no need to go any 53:20 further
up the tree okay so if you 53:22 compare with the previous picture so 53:27 before you did the insert the height of 53:28 the subtree rooted at this position was 53:30 two you had the white node and that 53:32 gold node okay after you 53:35 do the rotation at that same position 53:39 the height is still two okay the balance 53:43 factor becomes zero and there's no need 53:46 to go further up the tree so in this 53:48 particular case the rotation fixes the 53:51 problem alright so as far as inserts are 53:59 concerned the general strategy is you 54:04 first insert the item as though you 54:08 were performing an insert in an unbalanced 54:10 tree then you have a backward pass from 54:14 that newly inserted node in which you 54:17 change or you adjust balance factors to 54:21 the correct current value 54:27 this backward pass will stop when you 54:31 either reach a node whose balance factor 54:33 becomes zero that represents a subtree 54:36 whose height doesn't change and so you 54:38 don't need to go any further up or it 54:41 would stop when you reach an a node okay 54:44 a node whose balance factor becomes 2 or 54:47 minus 2 or it would stop when you 54:53 fall off the tree backing up 54:55 from the root alright so if you reach a 55:02 node whose balance factor becomes 2 or minus 55:04 2 then you have a problem and that's 55:07 when you need to do something special 55:09 otherwise if you stop at a zero or once 55:14 you've backed up past the root 55:15 you're done you don't have to perform 55:17 any corrective measures on your tree all 55:25 right so if you do have a node whose 55:29 balance factor becomes plus 2 or minus 2 55:31 we'll say the tree is unbalanced we 55:36 defined the notion of an a node it's the 55:39 white node that we had in that example 55:41 the nearest ancestor of the newly inserted node 55:44 whose balance factor becomes plus 2 or 55:46 minus 2 but something to note or 55:57 observe is that if you look at the nodes
56:01 between the a node and the newly 56:03 inserted node that pink node then prior 56:07 to the insertion all of those nodes must 56:11 have had a balance factor that is 0 to 56:19 see that suppose somebody on that path had a 56:22 balance factor that was one then on 56:24 your way back if you had a 1 you would 56:27 either change the 1 to a 2 or a 0 if you 56:30 changed it to a 2 well that node would 56:32 have become the a node 56:33 so you wouldn't have gone further up if you 56:35 change it to a zero you'll stop 56:39 so between the a node and that new pink 56:43 node everybody must have had a balance 56:45 factor of zero before the insert okay are there 56:53 any questions about that observation 56:55 do you want to see a picture or are you 56:58 okay yeah everybody okay all right okay 57:05 so let's take a look at the different 57:09 types of imbalance and how you fix the 57:14 problem resulting from that imbalance 57:19 well we saw an example of an RR 57:22 RR simply means that your new node that 57:25 pink node is in the right subtree of the 57:27 right subtree of a okay and then you can 57:30 have an LL it's kind of symmetric to the 57:32 RR you can have an RL so you went into 57:39 the right subtree of a the first letter 57:41 tells you whether you go into the left 57:42 or right subtree of a so you went into 57:44 the right subtree of a and after that 57:46 you went into the left subtree and then 57:51 the fourth type is the LR alright so LR 57:55 and RL are symmetric
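the two-letter characterization and the RR rotation described above can be sketched in Python roughly as follows (again a rough illustration rather than the lecture's own code; `classify` and `rotate_rr` are my names, and classifying by key comparisons assumes distinct keys, which in a binary search tree is equivalent to tracking the insertion path):

```python
class Node:
    """minimal AVL search tree node; bf = height(left) - height(right)"""
    def __init__(self, key, bf=0):
        self.key = key
        self.bf = bf
        self.left = self.right = None

def classify(a, new_key):
    """Two-letter imbalance type at the a node: the first letter says
    which subtree of the a node the new key went into, the second says
    which subtree of that child it went into.  The new node is always at
    least a grandchild of the a node, so that child exists."""
    first = 'L' if new_key < a.key else 'R'
    child = a.left if first == 'L' else a.right
    second = 'L' if new_key < child.key else 'R'
    return first + second

def rotate_rr(a):
    """RR rotation: rotate the subtree rooted at the a node
    counterclockwise.  The right child b takes a's position, a becomes
    b's left child, and b's old left subtree becomes a's right subtree.
    After an insert both balance factors become 0, so the subtree height
    is back to what it was before the insert and nothing further up the
    tree changes.  Returns the new subtree root for the caller to
    reattach.  (LL is the mirror image.)"""
    b = a.right
    a.right = b.left   # b's old left subtree moves over to a
    b.left = a
    a.bf = b.bf = 0
    return b
```

for the second example in the lecture, where inserting 29 drove the balance factor of the node with key 20 to minus 2 via its right child 25, `classify` reports an RR imbalance and `rotate_rr` makes 25 the root of that subtree with 20 as its left child and 29 as its right child.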