Transcript


00:11
okay yeah well you have any questions
00:15
before I start on yet a new data
00:18
structure today all right so today we're
00:24
going to take a look at a data structure
00:27
called a priority search tree we're
00:29
actually going to spend two lectures on
00:31
this data structure and in today's
00:34
lecture I'll just talk about the tasks
00:39
of functions you can perform efficiently
00:41
using the structure and we'll take a
00:43
look at four or five applications of
00:47
this structure we'll see how these tasks
00:49
are useful in solving problems of
00:52
interest and then on Monday we will take
00:56
a look at how you might implement the
00:59
structure to get the kind of complexity
01:01
we're talking about okay all right so
01:04
first let's just take a look at what
01:08
this structure can do what kind of data
01:10
does it store all right so it's used to
01:15
store elements where the key associated
01:21
with the element is actually a pair okay
01:23
so you have two quantities an x and a y
01:26
which describe the key and of course in
01:30
addition to the key you have some other
01:31
data or payload that's being stored and
01:34
in the examples it typically may not
01:36
show the Associated data okay right so
01:42
the fundamental operations that you want
01:46
to perform of course the usual dynamic
01:48
stuff you want to be able to search for
01:51
the element whose key is given by this
01:54
pair X Y you want to be able to remove
01:59
from your collection and the element
02:03
associated with a given key you want to
02:07
be able to insert an element given its
02:10
associated key so again the key is given
02:13
by two numbers x and y so you have the
02:17
standard what you might call dictionary
02:20
operations get delete and insert except
02:23
that now a key is given by two numbers
02:26
rather than a singleton okay all right
02:31
those operations themselves are not of
02:34
terrible interest if that's all you
02:36
wanted to do whether you have two
02:38
numbers of one you can handle it with
02:40
the old kind of structures that we had
02:42
the old dictionary structures okay
02:45
the interesting ones are what we'll call
02:48
the rectangle operations and the
02:53
rectangle operations supported by a
02:54
priori search tree there are three of
02:56
them the I guess there are three that
03:05
return a single value and then there's
03:07
another one which will allow us to
03:08
return many values okay so the first one
03:11
is to find the leftmost point in a
03:15
rectangle okay so the Nexen rectangle so
03:21
the rectangle itself is specified by
03:24
providing the left boundary of the
03:27
rectangle that's excelled the right
03:29
boundary of the rectangle and a top
03:31
boundary in all of the rectangle
03:34
operations supported by the structure
03:36
the bottom boundary of the rectangle is
03:38
assumed to be y equals 0 and all the
03:40
points lie above that bottom boundary
03:44
okay so we want to find the point that
03:47
is the leftmost point in a rectangle
03:50
given its left and right boundaries and
03:52
its top boundary the bottom boundary is
03:54
implicit it is 0 okay so here's my
03:59
collection of points because only the
04:01
keys are shown XY z-- okay so why I
04:04
interpret a key XY as a point in
04:06
two-dimensional space and these blue
04:09
dots here represent all of the points in
04:11
my collection this is the y equals 0
04:17
line down here and this is the x equals
04:20
0 so all points lie in the
04:23
upper-right quadrant of your
04:24
two-dimensional space okay so I want to
04:30
find I'm going to specify a rectangle in
04:34
this upper right quadrant by giving you
04:36
a left line a right line and a top line
04:45
okay so I specify my rectangle by
04:48
providing XL XR & YT and then all of the
04:51
points that lie within the rectangle are
04:54
candidate points for the curry depending
05:00
upon how you implement the structure by
05:01
wood then you could include the points
05:04
that lie on the line okay or exclude
05:07
points that lie on the line for most of
05:10
the applications we're going to look at
05:11
here today the assumption is that points
05:14
on the line are part of the answer okay
05:18
all right so I'm given a rectangle and
05:21
now I want to find the leftmost point
05:23
inside the rectangle and it's going to
05:26
be this one here
05:29
okay so Maine accent rectangle reports
05:33
the leftmost point in a rectangle
05:35
defined by the parameters x LX r and y t
05:40
right you cannot stake that in a
05:44
different mathematical form also okay so
05:48
from a collection of points find the one
05:49
that has the smallest X X lies between
05:52
XL and XR and Y lies between 0 & YT okay
05:56
same thing on our data structure will
06:01
allow us to perform this operation
06:03
efficiently generally in log logarithm
06:07
of the number of points
06:13
right corresponding one which allows you
06:16
to find the rightmost point in the same
06:18
rectangle okay so you're given the left
06:21
and right boundaries and the top
06:22
boundary and now I want to find the
06:24
rightmost point rather than the leftmost
06:26
1 so the rightmost one is this fellow
06:27
here there and that's the one I want to
06:30
report okay again you can find that in
06:33
the same amount of time it takes to find
06:35
the leftmost point again you can restate
06:39
it in a different mathematical form all
06:44
right the third rectangle operation this
06:47
one also returns a single point is to
06:50
find a bottom most point and here
06:57
instead of spoiling a rectangle we
06:59
provide a just arrange so left and right
07:02
range for X values so you kind of get
07:04
maybe a semi infinite rectangle with the
07:07
bottom being at zero and it goes all the
07:08
way to infinity
07:09
so in this semi infinite rectangle find
07:12
the bottom most point okay
07:15
all right so here's my collection of
07:18
points you're given a left boundary and
07:21
a right boundary in all the points
07:23
within this semi infinite rectangular
07:26
candidates you find you want to report
07:29
the bottom most one so it's that one up
07:30
there which gets reported okay again you
07:35
can specify that in a different
07:38
mathematical form because that went very
07:40
quick so well okay right return the
07:44
smallest Y with X lying in between XL
07:48
and XR all right so those are three
07:53
rectangle operations which returns
07:57
single elements from your collection of
08:00
elements and there's in addition to that
08:02
we have one which can report a large
08:05
number of points and this simply returns
08:08
all of the points in a rectangle
08:11
rectangle is specified in the same way
08:14
as it was for the first two operations
08:16
you given a left and a right line and a
08:18
cup line the bottom line is always y
08:20
equals zero
08:21
okay all right so my collection of
08:24
points you given the left boundary the
08:27
right
08:27
Andre the top boundary and I want to
08:30
report all the points in the rectangle
08:32
so it's going to be these four points
08:35
here so all four of them get reported
08:42
right mathematically you can state it
08:45
like that now certainly a problem such
08:50
as this you don't expect to be able to
08:52
do that in log n time because it's
08:54
possible that all endpoints are part of
08:56
the answer so problems where you have to
08:59
report more than one point would
09:02
necessarily take time that's at least
09:04
linear in the size of the output so
09:07
something like this runs in log n plus
09:09
number of items in the output okay so
09:13
that's about the best you could expect
09:17
alright so the complexity is logarithmic
09:21
for all of the operations except for the
09:24
enumerate rectangle so with eating a get
09:28
insert delete min X and rectangle max X
09:31
and rectangle min Y okay all of those
09:34
take logarithmic time but then enumerate
09:36
rectangle takes log n plus s as being
09:40
the number of elements in the answer
09:43
okay so it's a very very efficient
09:46
structure from the asymptotic complexity
09:48
point of view for the operations that it
09:51
supports now since some of these
09:57
operations are probably new to you it's
10:00
probably a good idea to take a look at
10:04
users for these applications before we
10:07
try and look at how to implement the
10:09
structure okay and really that's what
10:11
we're going to be doing for the rest of
10:12
the lecture just take a look at a
10:14
variety of places where you may want to
10:16
use these operations all right so the
10:22
first application is a visibility
10:25
problem and basically here what you have
10:29
is you have a bunch of vertical lines
10:34
okay so you might think of these as say
10:37
say curtains hanging off of a ceiling
10:42
okay so each curtain is being
10:43
represented by a single line okay so
10:48
I've got these fellows you might think
10:50
of these as curtains or blinds or
10:53
whatever hanging off of a ceiling and so
10:56
for practical purposes the top of these
11:00
is infinity they go as far as the
11:02
ceiling and it's the bottom that varies
11:04
okay so some of them are long some are
11:07
short all right now you've got an
11:12
observer who's positioned somewhere in
11:14
the room say okay and the observers
11:16
looking in that direction and what you
11:21
would like to be able to report is when
11:24
you're looking in this direction what
11:27
does this person see okay so if these
11:30
curtains are opaque you can't see
11:32
through them then he's going to seek
11:34
this one if they are translucent and you
11:37
can see through them then he's going to
11:38
see all he's going to see this one that
11:40
one and that one but he won't see that
11:42
because it finishes before its line of
11:44
sight okay so in the wizard in the
11:48
visibility problem you're given a bunch
11:51
of semi infinite lines so they start
11:53
somewhere in infinity up at the top but
11:56
they end at some finite spots at the
11:59
bottom okay you have an observer looking
12:03
in toward the right you either want to
12:07
report the first line that is seen or
12:10
all of the lines that are seen at that
12:14
line of sight
12:20
all right so let's first take a look at
12:22
the case we want to report all of the
12:25
lines you would see so that's the lines
12:27
are translucent you're looking toward
12:29
the right report everything that gets
12:31
seen all right so one way to solve this
12:41
problem is to use a priori search tree
12:47
in which we store these endpoints okay
12:53
so in this case I will store 2 1 3 4 4 2
12:56
5 6 & 6 3 that's the x value and that's
12:59
the Y value and the query that I'm going
13:08
to ask is report all the points that
13:12
line a rectangle where the left boundary
13:15
is X the right boundary is infinity and
13:19
the top boundary is y ok let's see what
13:22
that translates into okay so a left
13:25
boundary is X so the eye is at X Y so
13:30
left boundary is X would eliminate
13:32
everything that's behind me okay so only
13:36
looking at things to my right
13:40
so whatever rectangle i define will
13:45
include only points that are to my right
13:47
cannot exclude include points to my left
13:50
okay all points to my right eligible so
13:55
the right boundary is infinity okay so
13:59
somewhere there's the right boundary
14:01
that makes all of these lines eligible
14:05
for my answer but I only want those
14:08
lines that end at or below my line of
14:12
sight okay so I put a Y boundary out
14:16
there which corresponds to my line of
14:18
sight okay so now only the points that
14:21
are inside well I've got my zero thing
14:23
here so only the points that lie within
14:26
this rectangle correspond to those
14:30
curtains that you're going to see if
14:32
they were translucent
14:33
yeah what's the application of
14:39
visibility yeah you could have several
14:42
for example let's say you want to shoot
14:49
a bullet through here okay so you could
14:52
find all the barriers you can have to go
14:53
to go through okay and then if you knew
14:56
the amount of power needed to go through
14:59
age you could compute how much you have
15:00
to shoot okay so you can think of a
15:02
laser going through and you want to
15:04
reach the other end
15:05
well how strong must the laser be to get
15:07
to the other end okay and you can change
15:09
the strength of the laser depending upon
15:12
you know where you are shooting it from
15:15
okay that's one application we could
15:19
probably think of several hundred
15:20
applications okay all right so this
15:27
would find all of the visible lines if
15:34
they were translucent
15:41
okay
15:44
any questions about that application
15:49
well let's suppose we had the opaque
15:51
lines okay so in this case you will see
15:55
only this one the first one so now I
16:05
want to report only the first one that
16:07
is seen not all of them and again this
16:15
could be important okay may be you've
16:17
got a flying object you want to know the
16:19
first obstacle it's going to hit okay
16:21
and once you know that you can change
16:23
its path when it comes near the obstacle
16:25
okay so you could use it in a in a
16:29
computer graphics or a computer game
16:31
environment all right
16:34
so the AI is the same place I want to
16:37
report the first point I'll use the same
16:40
products which we had before based on
16:43
the end points of these things and this
16:47
time the curry is going to be fine the
16:49
leftmost point in that same rectangle
16:54
okay so the same rectangles I had before
16:56
but now find the leftmost point in it
16:59
and that's going to give me the first
17:01
obstacle encountered okay all right so
17:05
it's the same rectangle as before find
17:10
the leftmost point and that would be the
17:12
first one all right let's take a look at
17:19
an application we've seen a couple of
17:21
times before in this class that's been
17:23
packing we know that there are two
17:26
popular heuristics for bin packing first
17:28
fit and best fit
17:34
before time with Robi back up to the
17:36
previous one the visibility one okay so
17:38
we only looked at how to report
17:40
everything that gets intersected but you
17:43
can do this in a dynamic environment
17:45
because the structure supports inserts
17:48
and deletes okay so you could be putting
17:51
in new lines and taking out all lines
17:53
also okay and that's true for pretty
18:00
much all of the application are talking
18:01
about I'm not going to show you how the
18:04
insert and delete take part in it but
18:06
you could be doing inserts and deletes
18:08
while you are doing the other operations
18:11
okay alright so so I guess getting back
18:18
to bin packing
18:20
those are the two common heuristics used
18:24
for bin packing now in several
18:30
applications we don't want to use either
18:33
this or that but we want to use a
18:34
combination okay so for example if you
18:39
look at say at some at some memory
18:46
allocation systems okay so suppose you
18:48
have a computer which is running
18:49
multiple programs so as program as a
18:53
program starts unity allocates our
18:55
memory to it when it finished as you
18:56
release some memory to it if you don't
19:00
have enough contiguous memory you cannot
19:01
satisfy a request for memory now people
19:05
are done studies and found that when the
19:09
request for memory size is small so
19:13
small may be defined as let's say 120th
19:16
of the total size of memory that you
19:17
have okay well then you you're better
19:20
off if you use let's say first fit and
19:23
when the requests are bigger than that
19:26
you're better off using what's a best
19:27
fit okay so then you may want to
19:30
implement a memory allocation system
19:32
which takes a look at the request size
19:34
and if it's small you want to use best
19:36
fit if it's big you want to use best fit
19:38
say so you want to be able to do both
19:40
things okay
19:42
well then the data structures we've seen
19:44
so far aren't good enough they don't do
19:46
it
19:47
if you always want to do first fit well
19:50
we know we can use a winner tree if you
19:52
always want to do best fit you can use a
19:54
balanced binary search tree but if you
19:57
want to sometimes do one sometimes the
19:59
other well neither of those helps you
20:02
alright so in that case we turn to a
20:06
priority search tree that allows us to
20:08
handle both things efficiently okay
20:14
so the way we do this is okay all right
20:19
so we're doing bin packing so we're
20:20
going to number the bins okay
20:25
and I guess bin packing is really just
20:27
an abstraction for a memory allocation
20:29
type thing so we have the bins that are
20:31
numbered let's suppose the capacity of a
20:34
bin is C okay
20:39
we initialize our priority search tree
20:43
to represent our bins and the keys okay
20:50
the first coordinate of a key is the
20:52
capacity and of course that changes in
20:56
time and the second coordinate is the
20:58
bin number all right so so I've got a
21:07
privately search tree which contains
21:10
these keys the current capacity of a bin
21:13
is the x coordinate and it's number is
21:15
the second coordinate leftmost bin is
21:18
one which has the smallest coordinate
21:21
all right
21:23
so this represents say situation at some
21:26
point in time different bins have
21:30
different available capacity all right
21:34
so the bins indexes go up that's the y
21:37
coordinate and the capacity increases to
21:41
the right that's the x coordinate
21:45
alright so I wanted to find so if you're
21:49
given a request so you want to you have
21:52
an object of this size that you want to
21:54
pack and I've decided by my heuristic I
21:58
want to do it using first foot
22:00
so I want to find the leftmost bin that
22:04
has at least this much capacity to do
22:08
that I will throw this query at my
22:10
private research tree find the min Y in
22:16
annex range which is given by the needed
22:19
size okay so the left point of the range
22:25
of the left boundary is this so this
22:26
eliminates everybody whose capacity is
22:28
too little okay
22:30
and the range goes on to infinity so
22:34
everybody within the range has enough
22:37
capacity so from the people that have
22:40
enough capacity I want to find the
22:42
leftmost bin okay so leftmost bin would
22:45
be the one with the smallest index so
22:48
it's the smallest value of y okay so I
22:53
want to do am in Y in X range okay so
22:58
that gives me the first fit bit and that
23:03
takes logarithmic time all right now
23:08
suppose you want to do a best fit okay
23:13
so in a best fit I'm looking for a bin
23:19
with the least capacity that is
23:22
sufficient okay so I'm going to do that
23:27
by using a min X and rectangle okay so
23:32
the rectangle is the left boundary is
23:35
the needed size how much size you need
23:36
okay so that eliminates everybody who's
23:41
ineligible okay everybody on this side
23:44
has enough capacity okay from the people
23:49
that have enough capacity I want to find
23:52
the one that has the least capacity okay
23:56
so that's the MINIX the smallest X okay
24:00
and that's going to be this fellow here
24:02
okay so the top boundary is infinity and
24:05
the right boundary is infinity and this
24:08
is the fellow that's the answer to the
24:11
best fit
24:18
okay so once I found the bin into which
24:22
you want to pack you can delete it from
24:25
the structure change its capacity
24:27
reinsert it into the structure and then
24:29
go to pack the next item alternatively
24:38
you could implement an new operation
24:41
which is called reduce the x-value okay
24:48
and so do all three things delete and
24:52
insert and reduce in one step instead of
24:55
three separate steps alright so we could
25:01
implement a combination of best fit and
25:03
first fit all right problems we've been
25:10
looking at in the last few lectures
25:12
using segment trees and interval trees
25:16
so you given a collection of line
25:20
segments so intervals and we want to
25:23
solve intersection type problems where
25:25
you're given a Curie interval find all
25:28
the intervals in your collection that
25:29
overlap this one either partial overlap
25:32
a full overlap alright so same kind of
25:38
intervals as we had before the increase
25:40
unit length okay so could represent like
25:47
we had before a machine that's busy from
25:49
2 p.m. to 3 p.m. you represented as the
25:52
interval to 3 and you're given a Curie
25:57
interval you V in one form you may want
26:01
to find all machines that are busy at
26:03
any time between noon and 4:00 p.m. in
26:07
another form you might want to find all
26:09
machines that are busy continuously from
26:11
noon to 4:00 p.m. the first kind of
26:16
query we could do efficiently using
26:17
interval trees report everybody that has
26:20
an intersection between two well between
26:22
noon and 4:00
26:26
for the second kind you could use
26:28
segments trees but that only worked for
26:30
unit in tools okay here it won't matter
26:34
whether it's unit or the interval sizes
26:36
three or six or ten will be able to
26:38
handle it okay so let's take a look at
26:46
the first kind so list all the machines
26:49
that are busy at any time between two
26:51
and four all right so here's my
26:54
collection of machines I've got six of
26:56
them this one's busy from one two three
26:59
that's busy from two to four and so on
27:05
okay and let's say I want to find old
27:09
machines that are busy at any time
27:10
between one and three okay so this one
27:16
is busy at one that one's busy at one
27:19
this one is busy it - okay this one's
27:23
busy at three this one's also busy at
27:26
three so between one and three the
27:28
machines that are busy you've got a II
27:31
see okay now I guess this example is
27:39
changed so if the interval is 1 3 ok
27:43
alright so I want to find the machines
27:45
that are busy between 1 and 3 it would
27:48
be a II C and F if you want to find the
27:53
machines that are busy between 2 and 4
27:54
okay then he is busy okay it just stops
28:03
being busy a little bit after 2 o'clock
28:05
so I'm going to count this as being busy
28:07
at 2 now 1c is busy at - that one's busy
28:10
it - okay this one is busy well I've
28:14
already got that one this one is busy in
28:16
that interval and this one is busy in
28:18
that in - okay so if the couriers report
28:21
all machines busy in the interval to 4
28:23
then the answer is going to be a b c e
28:29
all right so that's the same problem we
28:30
were looking at in our last lecture but
28:33
we solved it using interval trees all
28:38
right here what I'm going to do for a
28:40
priority search tree solution I'm going
28:42
to map these intervals into points and
28:45
then store these two-dimensional points
28:48
in my priority search tree all right so
28:53
if you have an interval IJ so for
28:55
example this is one three okay
28:57
and we'll do store that as a point by
29:00
inverting the endpoints of the interval
29:03
okay so one three is going to get stored
29:05
as the point three y similarly one two
29:09
will get stored is the point two one
29:11
this will get stored as the point four
29:12
two so I map these intervals into points
29:19
and this represents the data that's
29:21
being stored in my primary search tree
29:23
so for example e is 2 1 so X is 2 y is 1
29:28
e is going to be 3 1 so it's the same
29:31
y-value but the x value is 3 ok so I
29:35
invert the left and right points of my
29:37
intervals to get my pairs and I put them
29:44
into a privately search tree if you want
29:46
to insert more intervals you use the
29:48
private search tree insertion algorithm
29:50
I'm going to remove intervals you take
29:51
them out and then you'll be asked it
29:57
trying to perform a curry which says
30:00
give me all of the intervals that are
30:02
busy at some time between the curry
30:04
interval u-v and the way I'm going to
30:11
answer that is by running the enumerate
30:13
rectangle thing that's the only one that
30:16
reports more than one point so for
30:21
example here I want to find all the
30:24
machines that are busy at some time
30:26
between 2 & 4 so this is the query that
30:28
I'm going to pose ok now let's see why
30:31
this query works
30:36
all right so I'm looking at all the
30:38
points in a rectangle that begins at x
30:41
equals two so the points that I'm
30:46
eliminating okay the ones outside the
30:49
rectangle by this line are those that
30:53
finish before - because the x value is
30:59
the time at which the interval is over
31:02
okay so if you finish before - you can't
31:06
be busy between two and four okay
31:09
so that boundary takes care of points
31:12
that are can't possibly be part of the
31:14
answer
31:15
okay all right then all of the points on
31:21
the right
31:22
oh within that haven't been excluded
31:25
points will that finish at two or after
31:29
two and so they're potentially part of
31:31
the answer okay I was simply because you
31:35
finished at eight hundred doesn't mean
31:37
you can't be part of the answer you can
31:39
still be part of the answer so to
31:42
eliminate the remaining points that are
31:44
not part of the answer I got to remove
31:47
the points that start after four if you
31:54
start after four well then you again
31:55
can't be busy between two and four okay
31:58
so to remove the points that start after
32:00
four I put a vertical line here at four
32:04
okay so if you look at the rectangle
32:10
okay the lower boundary is zero the
32:13
right boundaries that infinity left is
32:15
at two and this is at four okay so all
32:20
of the points that have been excluded
32:22
okay so excluded on this side they
32:26
finished too early excluded up here they
32:30
started too late the people that are
32:32
left here they start or finish in that
32:37
interval and so they have to be busy now
32:39
that in that interval
32:43
okay alright so this query finds for me
32:47
all of the points that are busy at some
32:49
time between 2:00 and 4:00 and so you
32:56
can do that very fast same amount of
32:58
time that an interval tree took
33:02
asymptotically at least okay
33:07
interestingly if I just switch these two
33:13
things around vu I can find all the
33:16
machines that are busy continuously in
33:18
that interval using exactly the same
33:20
data structure so I can do both these
33:22
operations with the same structure ok
33:25
let's take a look at that so that's
33:27
interval containment okay so you want to
33:30
find all intervals in your collection
33:32
that fully contain a given interval all
33:42
right so for example if you're given a
33:45
unit interval 5/6 find all machines that
33:47
are busy from 5 through 6 ok then we
33:54
would report D F and B
34:06
okay so with the unit interval we could
34:10
do this using a segment tree but if I
34:12
change intervals to five nine a segment
34:15
tree wouldn't really do the job without
34:19
reporting duplicates okay all right so
34:24
the mapping into my practice socially
34:27
the same as before you just invert the
34:29
right and left points of the interval in
34:32
order to get your X Y coordinates okay
34:38
so the same set of points going into the
34:40
tree the query is changed okay again you
34:44
have to use a new minute rectangle
34:46
because there's more than one point
34:47
potentially in the answer and the only
34:49
difference between this and the previous
34:51
one is I've switched the coordinates
34:55
previously it was a U and a V and I've
34:59
made it a V na you okay let's see why
35:04
this works all right so I suppose I want
35:11
to find everybody who's busy from 6:00
35:14
from 5:00 to 6:00 okay all right so I
35:21
draw my left line is 6:00
35:24
okay so when I draw my rectangles left
35:28
line at 6:00 I'm eliminating everybody
35:31
who finishes before 6:00 okay so if you
35:35
finish before 6:00 there's no way you
35:36
can be busy continuously from 5:00 to
35:38
6:00 okay so I've only eliminating
35:44
points that cannot be part of the answer
35:47
if you finish at 6:00 or after 6:00 then
35:51
you're all candidates it doesn't matter
35:53
when you finish you could finish at
35:54
1,000 or 2,000 so the right boundary
35:56
goes to infinity okay do you eliminate
35:59
the other bad points I'm going to use my
36:03
top of rectangle and that's given by
36:07
five okay when I put this here okay what
36:12
am I eliminating I'm eliminating people
36:15
who start after five
36:18
if you start after five there's no way
36:19
you can be busy continuously between
36:21
five and six okay so this one takes care
36:25
of people who start after five this one
36:27
takes care of people who finish before
36:28
six what you left where there's people
36:30
who start before five and finish after
36:33
six and so they're busy continuously
36:35
between five and six okay and it doesn't
36:39
matter that my interval was of length
36:41
one it could have been at length ten the
36:43
same construction would do the job
37:00
or another problem from computational
37:02
geometry okay so we want to find inter
37:06
sank intersecting rectangle pairs so you
37:12
have a collection of rectangles again
37:14
the collection could be changing it
37:15
would be inserting and removing
37:16
rectangles and then once in a while you
37:19
get asked to report all of the
37:20
rectangles that intersect so let's say
37:31
this is my current state of the
37:32
rectangles and I want to in this case
37:36
tell you that a and B intersect a and C
37:41
intersect D and G so account a
37:44
containment as an intersection and then
37:46
ENF intersect so here at the time you
38:00
are asked to report all the
38:01
intersections we will employ a priority
38:04
search tree to answer the question okay
38:07
so these rectangles are not sitting in
38:09
my priority search tree to begin with
38:11
you've got some other structure that
38:12
keeps track of the rectangles and then
38:14
somebody says report everything that
38:16
intersects will now write a piece of
38:19
code that does that and that piece of
38:21
code will use a priority search tree all
38:27
right so the the way we will do this is
38:34
by employing one of the solutions we
38:36
already saw and this is the interval
38:40
intersection solution okay let's see how
38:42
it's going to work okay so I'm going to
38:49
look at the horizontal lines of these
38:52
rectangles in order of Y so start by
38:55
looking at the smallest wise and then go
38:57
up okay so I'm going to first see this
39:00
line here okay then I'll see this line
39:03
and then I'll see the bottom edge of F
39:06
and then this blue one that blue one and
39:09
so on okay and at some point I'll begin
39:11
to see the top edge
39:13
okay so depending upon the kind of edge
39:16
you see we'll perform a different action
39:19
the vertical edges are not of importance
39:23
it's only the horizontal edges in each
39:27
horizontal edge is just an interval it's
39:30
got a start point in the finish point so
39:35
if you see a bottom edge okay then what
39:37
I will do is I'll just insert the edge
39:39
into my private research tree when you
39:43
insert an edge into the private search
39:44
tree you have to transform it into a
39:46
point and that's done by inverting the
39:48
left and right as we were doing in our
39:50
interval intersection problem so this
39:52
would be the x-coordinate of the point
39:53
that would be the y coordinate of the
39:55
point okay so when you see a bottom edge
39:58
you just put it into the priority search
40:00
tree when you see a top edge then I'm
40:04
going to report all of the intersecting
40:09
segments using that top edge as the
40:13
query okay so to see how that works okay
40:17
so we see this that gets put into my
40:19
priority search tree I see that that
40:21
gets put into the priority search tree I
40:24
see the bottom edge out there that gets
40:26
put into the priority search tree then I
40:28
see this one it gets in I see that one
40:31
it gets in I see this one it gets N and
40:33
then I suppose I see this guy so first I
40:38
see a whole bunch of bottom edges they
40:40
got in the next edge I see is this one
40:43
it's a top edge so now I run my on line
40:47
intersection using this interval as the
40:49
query interval report all intervals in
40:52
my collection that intersect this guy
40:54
okay so it's going to report this one
40:57
okay but that's a self intersection so I
40:59
discard it it's going to report the blue
41:01
one so report an intersection between
41:03
two rectangles a and B and it's not
41:06
going to report any others so I report
41:12
that and then this rectangle is finished
41:14
so I delete this horizontal edge and you
41:18
lost all memory of this so I get all of
41:20
the intersections made by B and the
41:24
current open
41:25
angles and that's all intersections we
41:27
can make with anybody
41:28
you can't intersect a rectangle that
41:30
hasn't been seen yet okay all right so
41:34
then I go on and I look at the next edge
41:37
and maybe the next edge you see is this
41:41
one out here it's a top edge so again
41:44
you do the same thing report all current
41:47
edges with which you have an
41:49
intersection so you get this one which
41:51
is the self edge but you just delete it
41:54
from the collection you get the red one
41:57
so you report a overlap here and that's
42:00
it okay so you kind of continue in that
42:03
way and you're able to report all of the
42:07
intersections amongst the rectangles
42:10
including containments all right any
42:18
questions about how that works before we
42:20
look at the complexity
42:27
alright the complexity is in order to
42:31
run this algorithm we need to sort all
42:35
of our edges by y-values okay so you've
42:37
got this big bag of edges sitting
42:40
somewhere that represents our rectangles
42:41
we sort them by y-value and the time you
42:44
need to perform this query that could
42:46
take your n log n then each time you see
42:54
a bottom edge you need to insert it if
42:55
there are n rectangles you need to do n
42:57
insertions so that's n log n for the N
43:01
insertions when you see a top edge you
43:05
need to do a deletion of the
43:07
corresponding bottom edge so that's n
43:09
log n but you also need to do this
43:12
enumerate rectangle to get everybody
43:15
with whom you have an intersection okay
43:18
and the enumerate rectangle takes log n
43:21
plus the number of people in the answer
43:24
okay so you do it n times so it's n log
43:27
n plus the sum of the number of people
43:28
in each answer that's the total number
43:30
of people in the overall answer that
43:33
works out to n log n plus s so the total
43:39
time to perform this query is then n log
43:42
n plus s
43:50
okay and that's really the fastest
43:52
anybody knows how to do it
44:05
let's look at a last example it has to
44:09
do with packet forwarding so we've seen
44:17
earlier packet forwarding is typically
44:20
done using longest prefix matching so
44:23
for example your router table has a
44:25
bunch of prefixes you get a destination
44:27
address several prefixes match okay you
44:32
find of the ones that match in this case
44:34
I guess all I guess these these two
44:37
match that one doesn't okay so the ones
44:39
that match you find the longest one
44:41
that's this guy so the action associated
44:43
with this filter is the one that you
44:46
perform all right so to use priority
44:52
search trees to find the longest
44:56
matching prefix and also to support
44:58
insertion and deletion of prefixes each
45:05
prefix is represented as an interval
45:06
again we've seen that before you take
45:10
the smallest destination address matched
45:12
and the largest matched that defines an
45:14
interval so you take these intervals and
45:23
we're going to store them in a priority
45:24
search tree the property of our
45:28
intervals that we're going to use is
45:29
that two prefixes if you look at the
45:33
range that they define or the interval
45:36
they define they can nest but you can't
45:38
have a non nesting kind of intersection
45:42
okay so if I look at a picture okay so
45:46
you can have a prefix like this yellow
45:48
one which is fully contained in this
45:50
black one which is fully contained in
45:52
that blue one fully contained the red
45:53
one you can't have a prefix that kind of
45:56
starts somewhere in the middle of this
45:58
black one and hops over somewhere else
46:00
okay that's not possible for prefixes
46:03
you either have disjointness or you have
46:05
can containment okay so our intervals
46:10
are going to look like this okay you're
46:14
given a destination address when a
46:17
packet comes in
46:20
you can find all of the prefixes that
46:24
match this packet by simply doing a
46:28
point containment okay so B is contained
46:31
in this is contained in that it's
46:32
contained in that that's the same as
46:35
interval containment with the start and
46:36
end point of the interval being the same
46:39
so if I run an interval containment with
46:42
the interval starting at D and ending at
46:44
D I'll get everybody who intersects or
46:47
everybody who matches this packet okay
46:53
so if you do an enumerator rectangle
46:56
okay this is the same setup as I had
46:58
before for the interval containment
47:00
problem we looked at a little while ago
47:02
I can find everybody who matches should
47:07
you have a need to find all of the
47:09
prefixes that match but you don't really
47:12
want to find all of them we just want to
47:14
find the longest matching prefix all
47:17
right define the longest matching prefix
47:19
okay if you look at this example this
47:24
prefix is longer than that okay longer
47:28
prefixes represent shorter intervals
47:30
they're more specific okay so this
47:33
prefix is longer than that and this one
47:34
is longer than that so in this case I
47:37
want to report this fellow here okay so
47:40
if everybody of everyone who matches
47:42
since everyone who matches has to be
47:44
nested okay the longest matching prefix
47:47
has the largest start point sorry this
47:52
has a larger start point and the
47:55
smallest finish point okay so what if
47:59
you take the max of the start points
48:02
you'll get this you take the min of the
48:04
finish points you'll get that one which
48:07
is also equal to that one okay so it's
48:09
defined by that and the way I'm going to
48:15
solve this problem is by doing a min X
48:17
and rectangle what I'm going to do is
48:23
take my intervals map them into points
48:26
like I was doing before okay and I'm
48:30
going to find
48:32
the interval okay which it the the
48:37
rectangle gives me everybody who matches
48:39
okay so from the points in the rectangle
48:42
I want to find the fellow with the
48:47
smallest finish point so that's going to
48:50
be the MINIX because you're inverting
48:53
the start and finish points okay so the
48:56
main accent rectangle would give me the
48:58
fellow with the smallest finish point in
49:00
that rectangle now this would work fine
49:04
if everybody had different finish points
49:07
but here I got a tie so I don't know
49:09
who's going to get returned to ensure
49:12
that I actually get the longest matching
49:14
prefix I need to enforce a tie breaker
49:16
and to enforce a tie breaker what I'll
49:20
do is I'm going to change the scale of
49:22
the endpoints okay so I'm going to make
49:24
this guy's end point just a little bit
49:25
more than that guy's end point okay so
49:30
I'll remap the endpoints okay to remap
49:34
the endpoints I'm going to use this
49:35
function given here which takes the
49:38
original finished point and start point
49:39
W is the width of a destination address
49:44
so an IP for W would be 32 in IP 6 at
49:48
128 okay so remap the finish points
49:52
before inserting into my private search
49:55
tree so that in case of a tie the longer
50:01
prefix has a smaller finish point than
50:04
the shorter prefix okay and now when you
50:07
do the min X you're guaranteed to find
50:09
the longest prefix that matches okay so
50:15
by properly mapping my prefixes into a
50:19
priority search tree I get a dynamic
50:22
router table that works in logarithmic
50:23
time you can insert you can delete and
50:26
you can find the longest matching prefix
50:37
all right so the primary search tree is
50:41
a very versatile data structure we've
50:44
only seen a very small fraction of the
50:48
applications for the data structure but
50:53
we've seen I think a good sampling from
50:55
computational geometry to router tables
50:57
and memory allocation alright so in the
51:00
next lecture we'll take a look at how
51:01
you can actually implement a priority
51:05
search tree to work in the time bounds
51:08
we've claimed all right any questions ok
51:16
so happy Thanksgiving