Transcript 00:11 okay yeah well you have any questions 00:15 before I start on yet a new data 00:18 structure today all right so today we're 00:24 going to take a look at a data structure 00:27 called a priority search tree we're 00:29 actually going to spend two lectures on 00:31 this data structure and in today's 00:34 lecture I'll just talk about the tasks 00:39 of functions you can perform efficiently 00:41 using the structure and we'll take a 00:43 look at four or five applications of 00:47 this structure we'll see how these tasks 00:49 are useful in solving problems of 00:52 interest and then on Monday we will take 00:56 a look at how you might implement the 00:59 structure to get the kind of complexity 01:01 we're talking about okay all right so 01:04 first let's just take a look at what 01:08 this structure can do what kind of data 01:10 does it store all right so it's used to 01:15 store elements where the key associated 01:21 with the element is actually a pair okay 01:23 so you have two quantities an x and a y 01:26 which describe the key and of course in 01:30 addition to the key you have some other 01:31 data or payload that's being stored and 01:34 in the examples it typically may not 01:36 show the Associated data okay right so 01:42 the fundamental operations that you want 01:46 to perform of course the usual dynamic 01:48 stuff you want to be able to search for 01:51 the element whose key is given by this 01:54 pair X Y you want to be able to remove 01:59 from your collection and the element 02:03 associated with a given key you want to 02:07 be able to insert an element given its 02:10 associated key so again the key is given 02:13 by two numbers x and y so you have the 02:17 standard what you might call dictionary 02:20 operations get delete and insert except 02:23 that now a key is given by two numbers 02:26 rather than a singleton okay all right 02:31 those operations themselves are not of 02:34 terrible interest if that's all you 02:36 wanted to do whether you have two 02:38 numbers of one you can handle it with 02:40 the old kind of structures that we had 02:42 the old dictionary structures okay 02:45 the interesting ones are what we'll call 02:48 the rectangle operations and the 02:53 rectangle operations supported by a 02:54 priori search tree there are three of 02:56 them the I guess there are three that 03:05 return a single value and then there's 03:07 another one which will allow us to 03:08 return many values okay so the first one 03:11 is to find the leftmost point in a 03:15 rectangle okay so the Nexen rectangle so 03:21 the rectangle itself is specified by 03:24 providing the left boundary of the 03:27 rectangle that's excelled the right 03:29 boundary of the rectangle and a top 03:31 boundary in all of the rectangle 03:34 operations supported by the structure 03:36 the bottom boundary of the rectangle is 03:38 assumed to be y equals 0 and all the 03:40 points lie above that bottom boundary 03:44 okay so we want to find the point that 03:47 is the leftmost point in a rectangle 03:50 given its left and right boundaries and 03:52 its top boundary the bottom boundary is 03:54 implicit it is 0 okay so here's my 03:59 collection of points because only the 04:01 keys are shown XY z-- okay so why I 04:04 interpret a key XY as a point in 04:06 two-dimensional space and these blue 04:09 dots here represent all of the points in 04:11 my collection this is the y equals 0 04:17 line down here and this is the x equals 04:20 0 so all points lie in the 04:23 upper-right quadrant of your 04:24 two-dimensional space okay so I want to 04:30 find I'm going to specify a rectangle in 04:34 this upper right quadrant by giving you 04:36 a left line a right line and a top line 04:45 okay so I specify my rectangle by 04:48 providing XL XR & YT and then all of the 04:51 points that lie within the rectangle are 04:54 candidate points for the curry depending 05:00 upon how you implement the structure by 05:01 wood then you could include the points 05:04 that lie on the line okay or exclude 05:07 points that lie on the line for most of 05:10 the applications we're going to look at 05:11 here today the assumption is that points 05:14 on the line are part of the answer okay 05:18 all right so I'm given a rectangle and 05:21 now I want to find the leftmost point 05:23 inside the rectangle and it's going to 05:26 be this one here 05:29 okay so Maine accent rectangle reports 05:33 the leftmost point in a rectangle 05:35 defined by the parameters x LX r and y t 05:40 right you cannot stake that in a 05:44 different mathematical form also okay so 05:48 from a collection of points find the one 05:49 that has the smallest X X lies between 05:52 XL and XR and Y lies between 0 & YT okay 05:56 same thing on our data structure will 06:01 allow us to perform this operation 06:03 efficiently generally in log logarithm 06:07 of the number of points 06:13 right corresponding one which allows you 06:16 to find the rightmost point in the same 06:18 rectangle okay so you're given the left 06:21 and right boundaries and the top 06:22 boundary and now I want to find the 06:24 rightmost point rather than the leftmost 06:26 1 so the rightmost one is this fellow 06:27 here there and that's the one I want to 06:30 report okay again you can find that in 06:33 the same amount of time it takes to find 06:35 the leftmost point again you can restate 06:39 it in a different mathematical form all 06:44 right the third rectangle operation this 06:47 one also returns a single point is to 06:50 find a bottom most point and here 06:57 instead of spoiling a rectangle we 06:59 provide a just arrange so left and right 07:02 range for X values so you kind of get 07:04 maybe a semi infinite rectangle with the 07:07 bottom being at zero and it goes all the 07:08 way to infinity 07:09 so in this semi infinite rectangle find 07:12 the bottom most point okay 07:15 all right so here's my collection of 07:18 points you're given a left boundary and 07:21 a right boundary in all the points 07:23 within this semi infinite rectangular 07:26 candidates you find you want to report 07:29 the bottom most one so it's that one up 07:30 there which gets reported okay again you 07:35 can specify that in a different 07:38 mathematical form because that went very 07:40 quick so well okay right return the 07:44 smallest Y with X lying in between XL 07:48 and XR all right so those are three 07:53 rectangle operations which returns 07:57 single elements from your collection of 08:00 elements and there's in addition to that 08:02 we have one which can report a large 08:05 number of points and this simply returns 08:08 all of the points in a rectangle 08:11 rectangle is specified in the same way 08:14 as it was for the first two operations 08:16 you given a left and a right line and a 08:18 cup line the bottom line is always y 08:20 equals zero 08:21 okay all right so my collection of 08:24 points you given the left boundary the 08:27 right 08:27 Andre the top boundary and I want to 08:30 report all the points in the rectangle 08:32 so it's going to be these four points 08:35 here so all four of them get reported 08:42 right mathematically you can state it 08:45 like that now certainly a problem such 08:50 as this you don't expect to be able to 08:52 do that in log n time because it's 08:54 possible that all endpoints are part of 08:56 the answer so problems where you have to 08:59 report more than one point would 09:02 necessarily take time that's at least 09:04 linear in the size of the output so 09:07 something like this runs in log n plus 09:09 number of items in the output okay so 09:13 that's about the best you could expect 09:17 alright so the complexity is logarithmic 09:21 for all of the operations except for the 09:24 enumerate rectangle so with eating a get 09:28 insert delete min X and rectangle max X 09:31 and rectangle min Y okay all of those 09:34 take logarithmic time but then enumerate 09:36 rectangle takes log n plus s as being 09:40 the number of elements in the answer 09:43 okay so it's a very very efficient 09:46 structure from the asymptotic complexity 09:48 point of view for the operations that it 09:51 supports now since some of these 09:57 operations are probably new to you it's 10:00 probably a good idea to take a look at 10:04 users for these applications before we 10:07 try and look at how to implement the 10:09 structure okay and really that's what 10:11 we're going to be doing for the rest of 10:12 the lecture just take a look at a 10:14 variety of places where you may want to 10:16 use these operations all right so the 10:22 first application is a visibility 10:25 problem and basically here what you have 10:29 is you have a bunch of vertical lines 10:34 okay so you might think of these as say 10:37 say curtains hanging off of a ceiling 10:42 okay so each curtain is being 10:43 represented by a single line okay so 10:48 I've got these fellows you might think 10:50 of these as curtains or blinds or 10:53 whatever hanging off of a ceiling and so 10:56 for practical purposes the top of these 11:00 is infinity they go as far as the 11:02 ceiling and it's the bottom that varies 11:04 okay so some of them are long some are 11:07 short all right now you've got an 11:12 observer who's positioned somewhere in 11:14 the room say okay and the observers 11:16 looking in that direction and what you 11:21 would like to be able to report is when 11:24 you're looking in this direction what 11:27 does this person see okay so if these 11:30 curtains are opaque you can't see 11:32 through them then he's going to seek 11:34 this one if they are translucent and you 11:37 can see through them then he's going to 11:38 see all he's going to see this one that 11:40 one and that one but he won't see that 11:42 because it finishes before its line of 11:44 sight okay so in the wizard in the 11:48 visibility problem you're given a bunch 11:51 of semi infinite lines so they start 11:53 somewhere in infinity up at the top but 11:56 they end at some finite spots at the 11:59 bottom okay you have an observer looking 12:03 in toward the right you either want to 12:07 report the first line that is seen or 12:10 all of the lines that are seen at that 12:14 line of sight 12:20 all right so let's first take a look at 12:22 the case we want to report all of the 12:25 lines you would see so that's the lines 12:27 are translucent you're looking toward 12:29 the right report everything that gets 12:31 seen all right so one way to solve this 12:41 problem is to use a priori search tree 12:47 in which we store these endpoints okay 12:53 so in this case I will store 2 1 3 4 4 2 12:56 5 6 & 6 3 that's the x value and that's 12:59 the Y value and the query that I'm going 13:08 to ask is report all the points that 13:12 line a rectangle where the left boundary 13:15 is X the right boundary is infinity and 13:19 the top boundary is y ok let's see what 13:22 that translates into okay so a left 13:25 boundary is X so the eye is at X Y so 13:30 left boundary is X would eliminate 13:32 everything that's behind me okay so only 13:36 looking at things to my right 13:40 so whatever rectangle i define will 13:45 include only points that are to my right 13:47 cannot exclude include points to my left 13:50 okay all points to my right eligible so 13:55 the right boundary is infinity okay so 13:59 somewhere there's the right boundary 14:01 that makes all of these lines eligible 14:05 for my answer but I only want those 14:08 lines that end at or below my line of 14:12 sight okay so I put a Y boundary out 14:16 there which corresponds to my line of 14:18 sight okay so now only the points that 14:21 are inside well I've got my zero thing 14:23 here so only the points that lie within 14:26 this rectangle correspond to those 14:30 curtains that you're going to see if 14:32 they were translucent 14:33 yeah what's the application of 14:39 visibility yeah you could have several 14:42 for example let's say you want to shoot 14:49 a bullet through here okay so you could 14:52 find all the barriers you can have to go 14:53 to go through okay and then if you knew 14:56 the amount of power needed to go through 14:59 age you could compute how much you have 15:00 to shoot okay so you can think of a 15:02 laser going through and you want to 15:04 reach the other end 15:05 well how strong must the laser be to get 15:07 to the other end okay and you can change 15:09 the strength of the laser depending upon 15:12 you know where you are shooting it from 15:15 okay that's one application we could 15:19 probably think of several hundred 15:20 applications okay all right so this 15:27 would find all of the visible lines if 15:34 they were translucent 15:41 okay 15:44 any questions about that application 15:49 well let's suppose we had the opaque 15:51 lines okay so in this case you will see 15:55 only this one the first one so now I 16:05 want to report only the first one that 16:07 is seen not all of them and again this 16:15 could be important okay may be you've 16:17 got a flying object you want to know the 16:19 first obstacle it's going to hit okay 16:21 and once you know that you can change 16:23 its path when it comes near the obstacle 16:25 okay so you could use it in a in a 16:29 computer graphics or a computer game 16:31 environment all right 16:34 so the AI is the same place I want to 16:37 report the first point I'll use the same 16:40 products which we had before based on 16:43 the end points of these things and this 16:47 time the curry is going to be fine the 16:49 leftmost point in that same rectangle 16:54 okay so the same rectangles I had before 16:56 but now find the leftmost point in it 16:59 and that's going to give me the first 17:01 obstacle encountered okay all right so 17:05 it's the same rectangle as before find 17:10 the leftmost point and that would be the 17:12 first one all right let's take a look at 17:19 an application we've seen a couple of 17:21 times before in this class that's been 17:23 packing we know that there are two 17:26 popular heuristics for bin packing first 17:28 fit and best fit 17:34 before time with Robi back up to the 17:36 previous one the visibility one okay so 17:38 we only looked at how to report 17:40 everything that gets intersected but you 17:43 can do this in a dynamic environment 17:45 because the structure supports inserts 17:48 and deletes okay so you could be putting 17:51 in new lines and taking out all lines 17:53 also okay and that's true for pretty 18:00 much all of the application are talking 18:01 about I'm not going to show you how the 18:04 insert and delete take part in it but 18:06 you could be doing inserts and deletes 18:08 while you are doing the other operations 18:11 okay alright so so I guess getting back 18:18 to bin packing 18:20 those are the two common heuristics used 18:24 for bin packing now in several 18:30 applications we don't want to use either 18:33 this or that but we want to use a 18:34 combination okay so for example if you 18:39 look at say at some at some memory 18:46 allocation systems okay so suppose you 18:48 have a computer which is running 18:49 multiple programs so as program as a 18:53 program starts unity allocates our 18:55 memory to it when it finished as you 18:56 release some memory to it if you don't 19:00 have enough contiguous memory you cannot 19:01 satisfy a request for memory now people 19:05 are done studies and found that when the 19:09 request for memory size is small so 19:13 small may be defined as let's say 120th 19:16 of the total size of memory that you 19:17 have okay well then you you're better 19:20 off if you use let's say first fit and 19:23 when the requests are bigger than that 19:26 you're better off using what's a best 19:27 fit okay so then you may want to 19:30 implement a memory allocation system 19:32 which takes a look at the request size 19:34 and if it's small you want to use best 19:36 fit if it's big you want to use best fit 19:38 say so you want to be able to do both 19:40 things okay 19:42 well then the data structures we've seen 19:44 so far aren't good enough they don't do 19:46 it 19:47 if you always want to do first fit well 19:50 we know we can use a winner tree if you 19:52 always want to do best fit you can use a 19:54 balanced binary search tree but if you 19:57 want to sometimes do one sometimes the 19:59 other well neither of those helps you 20:02 alright so in that case we turn to a 20:06 priority search tree that allows us to 20:08 handle both things efficiently okay 20:14 so the way we do this is okay all right 20:19 so we're doing bin packing so we're 20:20 going to number the bins okay 20:25 and I guess bin packing is really just 20:27 an abstraction for a memory allocation 20:29 type thing so we have the bins that are 20:31 numbered let's suppose the capacity of a 20:34 bin is C okay 20:39 we initialize our priority search tree 20:43 to represent our bins and the keys okay 20:50 the first coordinate of a key is the 20:52 capacity and of course that changes in 20:56 time and the second coordinate is the 20:58 bin number all right so so I've got a 21:07 privately search tree which contains 21:10 these keys the current capacity of a bin 21:13 is the x coordinate and it's number is 21:15 the second coordinate leftmost bin is 21:18 one which has the smallest coordinate 21:21 all right 21:23 so this represents say situation at some 21:26 point in time different bins have 21:30 different available capacity all right 21:34 so the bins indexes go up that's the y 21:37 coordinate and the capacity increases to 21:41 the right that's the x coordinate 21:45 alright so I wanted to find so if you're 21:49 given a request so you want to you have 21:52 an object of this size that you want to 21:54 pack and I've decided by my heuristic I 21:58 want to do it using first foot 22:00 so I want to find the leftmost bin that 22:04 has at least this much capacity to do 22:08 that I will throw this query at my 22:10 private research tree find the min Y in 22:16 annex range which is given by the needed 22:19 size okay so the left point of the range 22:25 of the left boundary is this so this 22:26 eliminates everybody whose capacity is 22:28 too little okay 22:30 and the range goes on to infinity so 22:34 everybody within the range has enough 22:37 capacity so from the people that have 22:40 enough capacity I want to find the 22:42 leftmost bin okay so leftmost bin would 22:45 be the one with the smallest index so 22:48 it's the smallest value of y okay so I 22:53 want to do am in Y in X range okay so 22:58 that gives me the first fit bit and that 23:03 takes logarithmic time all right now 23:08 suppose you want to do a best fit okay 23:13 so in a best fit I'm looking for a bin 23:19 with the least capacity that is 23:22 sufficient okay so I'm going to do that 23:27 by using a min X and rectangle okay so 23:32 the rectangle is the left boundary is 23:35 the needed size how much size you need 23:36 okay so that eliminates everybody who's 23:41 ineligible okay everybody on this side 23:44 has enough capacity okay from the people 23:49 that have enough capacity I want to find 23:52 the one that has the least capacity okay 23:56 so that's the MINIX the smallest X okay 24:00 and that's going to be this fellow here 24:02 okay so the top boundary is infinity and 24:05 the right boundary is infinity and this 24:08 is the fellow that's the answer to the 24:11 best fit 24:18 okay so once I found the bin into which 24:22 you want to pack you can delete it from 24:25 the structure change its capacity 24:27 reinsert it into the structure and then 24:29 go to pack the next item alternatively 24:38 you could implement an new operation 24:41 which is called reduce the x-value okay 24:48 and so do all three things delete and 24:52 insert and reduce in one step instead of 24:55 three separate steps alright so we could 25:01 implement a combination of best fit and 25:03 first fit all right problems we've been 25:10 looking at in the last few lectures 25:12 using segment trees and interval trees 25:16 so you given a collection of line 25:20 segments so intervals and we want to 25:23 solve intersection type problems where 25:25 you're given a Curie interval find all 25:28 the intervals in your collection that 25:29 overlap this one either partial overlap 25:32 a full overlap alright so same kind of 25:38 intervals as we had before the increase 25:40 unit length okay so could represent like 25:47 we had before a machine that's busy from 25:49 2 p.m. to 3 p.m. you represented as the 25:52 interval to 3 and you're given a Curie 25:57 interval you V in one form you may want 26:01 to find all machines that are busy at 26:03 any time between noon and 4:00 p.m. in 26:07 another form you might want to find all 26:09 machines that are busy continuously from 26:11 noon to 4:00 p.m. the first kind of 26:16 query we could do efficiently using 26:17 interval trees report everybody that has 26:20 an intersection between two well between 26:22 noon and 4:00 26:26 for the second kind you could use 26:28 segments trees but that only worked for 26:30 unit in tools okay here it won't matter 26:34 whether it's unit or the interval sizes 26:36 three or six or ten will be able to 26:38 handle it okay so let's take a look at 26:46 the first kind so list all the machines 26:49 that are busy at any time between two 26:51 and four all right so here's my 26:54 collection of machines I've got six of 26:56 them this one's busy from one two three 26:59 that's busy from two to four and so on 27:05 okay and let's say I want to find old 27:09 machines that are busy at any time 27:10 between one and three okay so this one 27:16 is busy at one that one's busy at one 27:19 this one is busy it - okay this one's 27:23 busy at three this one's also busy at 27:26 three so between one and three the 27:28 machines that are busy you've got a II 27:31 see okay now I guess this example is 27:39 changed so if the interval is 1 3 ok 27:43 alright so I want to find the machines 27:45 that are busy between 1 and 3 it would 27:48 be a II C and F if you want to find the 27:53 machines that are busy between 2 and 4 27:54 okay then he is busy okay it just stops 28:03 being busy a little bit after 2 o'clock 28:05 so I'm going to count this as being busy 28:07 at 2 now 1c is busy at - that one's busy 28:10 it - okay this one is busy well I've 28:14 already got that one this one is busy in 28:16 that interval and this one is busy in 28:18 that in - okay so if the couriers report 28:21 all machines busy in the interval to 4 28:23 then the answer is going to be a b c e 28:29 all right so that's the same problem we 28:30 were looking at in our last lecture but 28:33 we solved it using interval trees all 28:38 right here what I'm going to do for a 28:40 priority search tree solution I'm going 28:42 to map these intervals into points and 28:45 then store these two-dimensional points 28:48 in my priority search tree all right so 28:53 if you have an interval IJ so for 28:55 example this is one three okay 28:57 and we'll do store that as a point by 29:00 inverting the endpoints of the interval 29:03 okay so one three is going to get stored 29:05 as the point three y similarly one two 29:09 will get stored is the point two one 29:11 this will get stored as the point four 29:12 two so I map these intervals into points 29:19 and this represents the data that's 29:21 being stored in my primary search tree 29:23 so for example e is 2 1 so X is 2 y is 1 29:28 e is going to be 3 1 so it's the same 29:31 y-value but the x value is 3 ok so I 29:35 invert the left and right points of my 29:37 intervals to get my pairs and I put them 29:44 into a privately search tree if you want 29:46 to insert more intervals you use the 29:48 private search tree insertion algorithm 29:50 I'm going to remove intervals you take 29:51 them out and then you'll be asked it 29:57 trying to perform a curry which says 30:00 give me all of the intervals that are 30:02 busy at some time between the curry 30:04 interval u-v and the way I'm going to 30:11 answer that is by running the enumerate 30:13 rectangle thing that's the only one that 30:16 reports more than one point so for 30:21 example here I want to find all the 30:24 machines that are busy at some time 30:26 between 2 & 4 so this is the query that 30:28 I'm going to pose ok now let's see why 30:31 this query works 30:36 all right so I'm looking at all the 30:38 points in a rectangle that begins at x 30:41 equals two so the points that I'm 30:46 eliminating okay the ones outside the 30:49 rectangle by this line are those that 30:53 finish before - because the x value is 30:59 the time at which the interval is over 31:02 okay so if you finish before - you can't 31:06 be busy between two and four okay 31:09 so that boundary takes care of points 31:12 that are can't possibly be part of the 31:14 answer 31:15 okay all right then all of the points on 31:21 the right 31:22 oh within that haven't been excluded 31:25 points will that finish at two or after 31:29 two and so they're potentially part of 31:31 the answer okay I was simply because you 31:35 finished at eight hundred doesn't mean 31:37 you can't be part of the answer you can 31:39 still be part of the answer so to 31:42 eliminate the remaining points that are 31:44 not part of the answer I got to remove 31:47 the points that start after four if you 31:54 start after four well then you again 31:55 can't be busy between two and four okay 31:58 so to remove the points that start after 32:00 four I put a vertical line here at four 32:04 okay so if you look at the rectangle 32:10 okay the lower boundary is zero the 32:13 right boundaries that infinity left is 32:15 at two and this is at four okay so all 32:20 of the points that have been excluded 32:22 okay so excluded on this side they 32:26 finished too early excluded up here they 32:30 started too late the people that are 32:32 left here they start or finish in that 32:37 interval and so they have to be busy now 32:39 that in that interval 32:43 okay alright so this query finds for me 32:47 all of the points that are busy at some 32:49 time between 2:00 and 4:00 and so you 32:56 can do that very fast same amount of 32:58 time that an interval tree took 33:02 asymptotically at least okay 33:07 interestingly if I just switch these two 33:13 things around vu I can find all the 33:16 machines that are busy continuously in 33:18 that interval using exactly the same 33:20 data structure so I can do both these 33:22 operations with the same structure ok 33:25 let's take a look at that so that's 33:27 interval containment okay so you want to 33:30 find all intervals in your collection 33:32 that fully contain a given interval all 33:42 right so for example if you're given a 33:45 unit interval 5/6 find all machines that 33:47 are busy from 5 through 6 ok then we 33:54 would report D F and B 34:06 okay so with the unit interval we could 34:10 do this using a segment tree but if I 34:12 change intervals to five nine a segment 34:15 tree wouldn't really do the job without 34:19 reporting duplicates okay all right so 34:24 the mapping into my practice socially 34:27 the same as before you just invert the 34:29 right and left points of the interval in 34:32 order to get your X Y coordinates okay 34:38 so the same set of points going into the 34:40 tree the query is changed okay again you 34:44 have to use a new minute rectangle 34:46 because there's more than one point 34:47 potentially in the answer and the only 34:49 difference between this and the previous 34:51 one is I've switched the coordinates 34:55 previously it was a U and a V and I've 34:59 made it a V na you okay let's see why 35:04 this works all right so I suppose I want 35:11 to find everybody who's busy from 6:00 35:14 from 5:00 to 6:00 okay all right so I 35:21 draw my left line is 6:00 35:24 okay so when I draw my rectangles left 35:28 line at 6:00 I'm eliminating everybody 35:31 who finishes before 6:00 okay so if you 35:35 finish before 6:00 there's no way you 35:36 can be busy continuously from 5:00 to 35:38 6:00 okay so I've only eliminating 35:44 points that cannot be part of the answer 35:47 if you finish at 6:00 or after 6:00 then 35:51 you're all candidates it doesn't matter 35:53 when you finish you could finish at 35:54 1,000 or 2,000 so the right boundary 35:56 goes to infinity okay do you eliminate 35:59 the other bad points I'm going to use my 36:03 top of rectangle and that's given by 36:07 five okay when I put this here okay what 36:12 am I eliminating I'm eliminating people 36:15 who start after five 36:18 if you start after five there's no way 36:19 you can be busy continuously between 36:21 five and six okay so this one takes care 36:25 of people who start after five this one 36:27 takes care of people who finish before 36:28 six what you left where there's people 36:30 who start before five and finish after 36:33 six and so they're busy continuously 36:35 between five and six okay and it doesn't 36:39 matter that my interval was of length 36:41 one it could have been at length ten the 36:43 same construction would do the job 37:00 or another problem from computational 37:02 geometry okay so we want to find inter 37:06 sank intersecting rectangle pairs so you 37:12 have a collection of rectangles again 37:14 the collection could be changing it 37:15 would be inserting and removing 37:16 rectangles and then once in a while you 37:19 get asked to report all of the 37:20 rectangles that intersect so let's say 37:31 this is my current state of the 37:32 rectangles and I want to in this case 37:36 tell you that a and B intersect a and C 37:41 intersect D and G so account a 37:44 containment as an intersection and then 37:46 ENF intersect so here at the time you 38:00 are asked to report all the 38:01 intersections we will employ a priority 38:04 search tree to answer the question okay 38:07 so these rectangles are not sitting in 38:09 my priority search tree to begin with 38:11 you've got some other structure that 38:12 keeps track of the rectangles and then 38:14 somebody says report everything that 38:16 intersects will now write a piece of 38:19 code that does that and that piece of 38:21 code will use a priority search tree all 38:27 right so the the way we will do this is 38:34 by employing one of the solutions we 38:36 already saw and this is the interval 38:40 intersection solution okay let's see how 38:42 it's going to work okay so I'm going to 38:49 look at the horizontal lines of these 38:52 rectangles in order of Y so start by 38:55 looking at the smallest wise and then go 38:57 up okay so I'm going to first see this 39:00 line here okay then I'll see this line 39:03 and then I'll see the bottom edge of F 39:06 and then this blue one that blue one and 39:09 so on okay and at some point I'll begin 39:11 to see the top edge 39:13 okay so depending upon the kind of edge 39:16 you see we'll perform a different action 39:19 the vertical edges are not of importance 39:23 it's only the horizontal edges in each 39:27 horizontal edge is just an interval it's 39:30 got a start point in the finish point so 39:35 if you see a bottom edge okay then what 39:37 I will do is I'll just insert the edge 39:39 into my private research tree when you 39:43 insert an edge into the private search 39:44 tree you have to transform it into a 39:46 point and that's done by inverting the 39:48 left and right as we were doing in our 39:50 interval intersection problem so this 39:52 would be the x-coordinate of the point 39:53 that would be the y coordinate of the 39:55 point okay so when you see a bottom edge 39:58 you just put it into the priority search 40:00 tree when you see a top edge then I'm 40:04 going to report all of the intersecting 40:09 segments using that top edge as the 40:13 query okay so to see how that works okay 40:17 so we see this that gets put into my 40:19 priority search tree I see that that 40:21 gets put into the priority search tree I 40:24 see the bottom edge out there that gets 40:26 put into the priority search tree then I 40:28 see this one it gets in I see that one 40:31 it gets in I see this one it gets N and 40:33 then I suppose I see this guy so first I 40:38 see a whole bunch of bottom edges they 40:40 got in the next edge I see is this one 40:43 it's a top edge so now I run my on line 40:47 intersection using this interval as the 40:49 query interval report all intervals in 40:52 my collection that intersect this guy 40:54 okay so it's going to report this one 40:57 okay but that's a self intersection so I 40:59 discard it it's going to report the blue 41:01 one so report an intersection between 41:03 two rectangles a and B and it's not 41:06 going to report any others so I report 41:12 that and then this rectangle is finished 41:14 so I delete this horizontal edge and you 41:18 lost all memory of this so I get all of 41:20 the intersections made by B and the 41:24 current open 41:25 angles and that's all intersections we 41:27 can make with anybody 41:28 you can't intersect a rectangle that 41:30 hasn't been seen yet okay all right so 41:34 then I go on and I look at the next edge 41:37 and maybe the next edge you see is this 41:41 one out here it's a top edge so again 41:44 you do the same thing report all current 41:47 edges with which you have an 41:49 intersection so you get this one which 41:51 is the self edge but you just delete it 41:54 from the collection you get the red one 41:57 so you report a overlap here and that's 42:00 it okay so you kind of continue in that 42:03 way and you're able to report all of the 42:07 intersections amongst the rectangles 42:10 including containments all right any 42:18 questions about how that works before we 42:20 look at the complexity 42:27 alright the complexity is in order to 42:31 run this algorithm we need to sort all 42:35 of our edges by y-values okay so you've 42:37 got this big bag of edges sitting 42:40 somewhere that represents our rectangles 42:41 we sort them by y-value and the time you 42:44 need to perform this query that could 42:46 take your n log n then each time you see 42:54 a bottom edge you need to insert it if 42:55 there are n rectangles you need to do n 42:57 insertions so that's n log n for the N 43:01 insertions when you see a top edge you 43:05 need to do a deletion of the 43:07 corresponding bottom edge so that's n 43:09 log n but you also need to do this 43:12 enumerate rectangle to get everybody 43:15 with whom you have an intersection okay 43:18 and the enumerate rectangle takes log n 43:21 plus the number of people in the answer 43:24 okay so you do it n times so it's n log 43:27 n plus the sum of the number of people 43:28 in each answer that's the total number 43:30 of people in the overall answer that 43:33 works out to n log n plus s so the total 43:39 time to perform this query is then n log 43:42 n plus s 43:50 okay and that's really the fastest 43:52 anybody knows how to do it 44:05 let's look at a last example it has to 44:09 do with packet forwarding so we've seen 44:17 earlier packet forwarding is typically 44:20 done using longest prefix matching so 44:23 for example your router table has a 44:25 bunch of prefixes you get a destination 44:27 address several prefixes match okay you 44:32 find of the ones that match in this case 44:34 I guess all I guess these these two 44:37 match that one doesn't okay so the ones 44:39 that match you find the longest one 44:41 that's this guy so the action associated 44:43 with this filter is the one that you 44:46 perform all right so to use priority 44:52 search trees to find the longest 44:56 matching prefix and also to support 44:58 insertion and deletion of prefixes each 45:05 prefix is represented as an interval 45:06 again we've seen that before you take 45:10 the smallest destination address matched 45:12 and the largest matched that defines an 45:14 interval so you take these intervals and 45:23 we're going to store them in a priority 45:24 search tree the property of our 45:28 intervals that we're going to use is 45:29 that two prefixes if you look at the 45:33 range that they define or the interval 45:36 they define they can nest but you can't 45:38 have a non nesting kind of intersection 45:42 okay so if I look at a picture okay so 45:46 you can have a prefix like this yellow 45:48 one which is fully contained in this 45:50 black one which is fully contained in 45:52 that blue one fully contained the red 45:53 one you can't have a prefix that kind of 45:56 starts somewhere in the middle of this 45:58 black one and hops over somewhere else 46:00 okay that's not possible for prefixes 46:03 you either have disjointness or you have 46:05 can containment okay so our intervals 46:10 are going to look like this okay you're 46:14 given a destination address when a 46:17 packet comes in 46:20 you can find all of the prefixes that 46:24 match this packet by simply doing a 46:28 point containment okay so B is contained 46:31 in this is contained in that it's 46:32 contained in that that's the same as 46:35 interval containment with the start and 46:36 end point of the interval being the same 46:39 so if I run an interval containment with 46:42 the interval starting at D and ending at 46:44 D I'll get everybody who intersects or 46:47 everybody who matches this packet okay 46:53 so if you do an enumerator rectangle 46:56 okay this is the same setup as I had 46:58 before for the interval containment 47:00 problem we looked at a little while ago 47:02 I can find everybody who matches should 47:07 you have a need to find all of the 47:09 prefixes that match but you don't really 47:12 want to find all of them we just want to 47:14 find the longest matching prefix all 47:17 right define the longest matching prefix 47:19 okay if you look at this example this 47:24 prefix is longer than that okay longer 47:28 prefixes represent shorter intervals 47:30 they're more specific okay so this 47:33 prefix is longer than that and this one 47:34 is longer than that so in this case I 47:37 want to report this fellow here okay so 47:40 if everybody of everyone who matches 47:42 since everyone who matches has to be 47:44 nested okay the longest matching prefix 47:47 has the largest start point sorry this 47:52 has a larger start point and the 47:55 smallest finish point okay so what if 47:59 you take the max of the start points 48:02 you'll get this you take the min of the 48:04 finish points you'll get that one which 48:07 is also equal to that one okay so it's 48:09 defined by that and the way I'm going to 48:15 solve this problem is by doing a min X 48:17 and rectangle what I'm going to do is 48:23 take my intervals map them into points 48:26 like I was doing before okay and I'm 48:30 going to find 48:32 the interval okay which it the the 48:37 rectangle gives me everybody who matches 48:39 okay so from the points in the rectangle 48:42 I want to find the fellow with the 48:47 smallest finish point so that's going to 48:50 be the MINIX because you're inverting 48:53 the start and finish points okay so the 48:56 main accent rectangle would give me the 48:58 fellow with the smallest finish point in 49:00 that rectangle now this would work fine 49:04 if everybody had different finish points 49:07 but here I got a tie so I don't know 49:09 who's going to get returned to ensure 49:12 that I actually get the longest matching 49:14 prefix I need to enforce a tie breaker 49:16 and to enforce a tie breaker what I'll 49:20 do is I'm going to change the scale of 49:22 the endpoints okay so I'm going to make 49:24 this guy's end point just a little bit 49:25 more than that guy's end point okay so 49:30 I'll remap the endpoints okay to remap 49:34 the endpoints I'm going to use this 49:35 function given here which takes the 49:38 original finished point and start point 49:39 W is the width of a destination address 49:44 so an IP for W would be 32 in IP 6 at 49:48 128 okay so remap the finish points 49:52 before inserting into my private search 49:55 tree so that in case of a tie the longer 50:01 prefix has a smaller finish point than 50:04 the shorter prefix okay and now when you 50:07 do the min X you're guaranteed to find 50:09 the longest prefix that matches okay so 50:15 by properly mapping my prefixes into a 50:19 priority search tree I get a dynamic 50:22 router table that works in logarithmic 50:23 time you can insert you can delete and 50:26 you can find the longest matching prefix 50:37 all right so the primary search tree is 50:41 a very versatile data structure we've 50:44 only seen a very small fraction of the 50:48 applications for the data structure but 50:53 we've seen I think a good sampling from 50:55 computational geometry to router tables 50:57 and memory allocation alright so in the 51:00 next lecture we'll take a look at how 51:01 you can actually implement a priority 51:05 search tree to work in the time bounds 51:08 we've claimed all right any questions ok 51:16 so happy Thanksgiving