Transcript


00:19
okay but at this point you should have
00:25
got back your second exams so hopefully
00:28
you've all picked them up from the TA
00:31
the performances on the second exam was
00:34
rather good the average is about 83 84 %
00:39
so that's a good performance you have
00:44
any questions you want to ask before I
00:46
start today's lecture okay
00:53
are we far we've had a few lectures on
00:55
tries and today we're going to really
00:58
take a look at an important application
01:01
of tries and this is in the
01:04
representation of of tables that are
01:08
used in the internet for things like
01:12
packet forwarding or for implementation
01:15
of firewalls and so on right so the
01:20
tables we're talking about these are
01:23
really sets of each table is a set of
01:27
rules and a rule has two components to
01:30
it there's a filter which is a boolean
01:32
expression and the components of the
01:36
boolean expression are really pieces of
01:38
data that you can find in the header of
01:40
a packet that's trying to move through
01:41
the network okay so a filter is a
01:45
boolean expression that might be
01:47
comprised of variables such as the
01:49
source address where did this packet
01:50
originate the destination was the packet
01:53
going and it could include things like
01:56
the port number where the packet
01:59
originated the port number where it is
02:01
going you could even have the protocol
02:04
being used here is it a tcp or UDP you
02:09
could have even things like the time of
02:12
day okay so for example you may might
02:18
make up a make up a filter which says
02:24
for packets that
02:29
that started originated at this
02:32
particular PC so that's got its own IP
02:34
address and it's going to some other
02:37
machine some ways you've got a
02:38
destination address and you might use
02:41
the time of day so between 8 a.m. and 5
02:45
p.m. so so packets that start somewhere
02:50
and they're going somewhere between 8
02:52
and 5 you may want to handle in some
02:54
special way and the special way on the
02:57
handle is given by the action what do
02:59
you want to do with packets that satisfy
03:01
the constraints of this filter ok
03:05
so for example you may put a rule in
03:07
your table that says packets originating
03:09
from from the database lab between 8 & 5
03:13
and that are going to this web server
03:15
which is used to download music well
03:18
they should be dropped ok we won't allow
03:20
you to download songs between 8 & 5 if
03:23
you're trying to download it from the
03:25
database lab ok but if it's outside of
03:28
business hours we might say oh that's ok
03:30
right so so we could set up a rule which
03:35
the filter part describes what packets
03:38
the rule applies to and then the action
03:41
part tells what should you do with that
03:43
packet instructions could vary things
03:46
like drop the packet in the example I
03:48
just gave you
03:49
or it could be forward than the packet
03:51
from the current location to a
03:55
particular destination location and the
03:59
destination that you're sending it to
04:00
which you might call the next hop where
04:02
does it go to the next may not agree
04:04
with that because the way you get from
04:07
the true source to the true destination
04:09
may be by going through a bunch of
04:11
intermediate nodes ok so so you could
04:17
have a next hop information okay so if
04:19
you're sending packets from Gainesville
04:22
Florida to Los Angeles California and if
04:26
right now I'm at the router at the
04:28
University of Florida well from here I
04:31
might send it to a router say in Atlanta
04:33
and that router might forward it to a
04:35
router in Texas which might eventually
04:37
forward it to a router in
04:39
California which might send it to the
04:41
final destination of the packet okay so
04:45
so each of these routers that I'm
04:48
talking about would have a table of this
04:50
sort collection of filters and
04:52
associated actions and the actions would
04:56
be different depending upon where this
04:59
table is located so the action for a
05:02
packet distant for California in the
05:05
University of Florida router could be
05:07
send it to Atlanta but for the router
05:10
sitting in Atlanta the corresponding
05:12
action would not be send it to Atlanta
05:14
to be send it somewhere else okay so the
05:16
tables are different depending upon the
05:20
location of the table the set of rules
05:23
could vary and also the set of actions
05:28
we've gone through some of these already
05:32
depending upon the type of application
05:35
the filters vary in complexity the most
05:39
complex filters around generally tend to
05:42
have about four or five components to
05:44
them so typically a QoS filter quality
05:51
of service filter would include the
05:52
source where is it coming from where is
05:54
it going
05:55
the two ports as well as the protocol
05:57
being used but you could get more
05:59
complex if you want to throw in say time
06:01
of day for example firewall filters
06:07
typically vary in their complexity you
06:11
may have a filter as simple as having
06:13
only one variable in it which could say
06:16
if you have an employee that's been
06:19
fired or is about to be fired you might
06:21
go in and insert a rule into your table
06:23
which says all packets originating from
06:25
this employees workstation are to be
06:28
dropped okay so you just have a single
06:33
variable in that filter but you could
06:36
have more variables in the firewall
06:39
filter the most common kinds of filters
06:45
around our simple one-dimensional
06:49
filters that are based only on the
06:51
destination address
06:52
so if you're sending packets
06:57
around the internet they typically do
07:00
their job by just looking at the
07:02
destination field of the header so the
07:08
most studied type of tables are
07:12
one-dimensional tables where the single
07:15
field is the destination address all
07:21
right so let's first look at destination
07:23
address filters and then towards the end
07:25
of the lecture we'll take a look at
07:26
higher dimensional filters now a 2d
07:33
filter a sorry a one-dimensional filter
07:37
you could specify that filter in one of
07:40
many ways so for example you could give
07:43
a range of destination addresses to
07:46
which it applies so any packet that's
07:49
going to a destination whose IP address
07:52
lies between 35 and 2096 is matched by
07:55
this specification more typical is to
08:01
give the filter as a pair which is an
08:04
address and a mask and the mask here
08:08
basically the zero signified don't care
08:11
and the ones mean that we do care so
08:14
wherever you get a 1 here the
08:16
corresponding bit out here is important
08:18
in the address okay so the zero says I
08:21
don't really care what this first bit is
08:23
the one says the next bits gotta be is
08:24
zero that one says then you got to have
08:26
a 1 and you could have a 1 and then you
08:29
don't care what it is and then you do
08:31
care it's got to be a zero so there are
08:33
two places where you have don't cares
08:35
you have a don't care out here and you
08:39
have a don't care out there okay but the
08:41
other three four bits have to be 0 1 1
08:45
and 0 okay so this specification will
08:50
satisfy exactly for destination
08:53
addresses you can vary this through it's
08:56
two possibilities 0 and 1 you can vary
08:58
that bit through its two possibilities 0
09:00
& 1 and so that gives you these four
09:02
destination addresses that are mash
09:05
by this specification now in a real
09:08
filter it's not going to look as simple
09:10
as this because in a real filter you're
09:14
generally dealing with say IP version 4
09:15
in which case these addresses are 32
09:18
bits long if you're dealing with version
09:20
6 then 128 bits long okay but most of
09:24
what's out there today is really version
09:25
4
09:25
so you've got 32 bits okay now in
09:33
reality people don't really deal with
09:36
filters that as general as this if you
09:38
pop open any commercial router out there
09:41
it's really looking at very simple
09:44
filters which have prefix filters so
09:47
prefix filter if you're looking at it in
09:50
this notation would have the property
09:52
that all the zeros are at the right hand
09:54
so they don't care at the right end
09:55
before several bits are fixed okay so
09:59
for example here I've got a 1-1 which
10:01
means I care about these two bits they
10:03
got to be a 1 and 0 then I got 4 zeros I
10:06
don't care about the other four bits
10:09
was it 5 I get 5 zeros okay so this one
10:14
with these two ones means I've got to
10:16
match that those twos I get a 1-0 stop
10:19
and you can convert that into a range
10:22
the 1-0 star the smallest address it
10:26
matches is 1 0 followed by as many zeros
10:28
as needed to get to your address length
10:31
and the largest address it matches is 1
10:34
0 followed by as many ones as needed to
10:36
get it to a complete destination address
10:39
okay so a prefix filter can be specified
10:44
as an address mass pair the mass will
10:46
have the property that you're going to
10:48
have the ones at the left and then 0 is
10:51
at the right or can be specified as a
10:54
range so it's really a special case of
10:57
both of these two types okay as I said
11:01
this is really what's what you're going
11:03
to find in real-world router tables a
11:06
whole bunch of prefixes
11:15
all right so an example of a a very very
11:20
small router table okay I have a bunch
11:24
of photos and of course there are
11:26
associated actions I haven't really
11:28
listed the actions out here okay alright
11:33
so we have a bunch of photos which is
11:35
specified as prefixes okay and so for
11:40
example this one here will match all
11:42
destination addresses that begin with 1
11:44
and 0 you know of course that's not the
11:47
only filter I've got in this table that
11:49
matches addresses that begin with 1 & 0
11:53
this guy here also matches some of the
11:55
begin with 1 & 0 except that they got to
11:57
be followed by two more zeros okay so
12:00
also here and so also there in fact
12:02
every address matched by p8 is also
12:05
matched by p7 and by p6 and by p4 and by
12:10
p1 okay and and that's quite common if
12:14
you look into real tables given a
12:17
destination address typically there will
12:18
be many photos that match that
12:21
destination address and so you need to
12:24
have a tiebreaker to determine which
12:27
rule actually applies and certainly you
12:32
could think of many tiebreakers for
12:36
example the first matching rule so your
12:39
table is a serial or a sequence of
12:43
filters the topmost filter in the table
12:46
that matches that's the first filter is
12:48
the one that applies you have a highest
12:52
priority rule so to each rule you assign
12:55
priorities of the ones that match you
12:58
select the one with the highest priority
13:00
you can have something called a most
13:03
specific rule so for example here if a
13:08
filter specified as a range okay two
13:10
four and you get another one that's one
13:12
six okay then
13:18
everything that's matched by 2 4 is also
13:21
matched by 1 6 but 1 6 matches some
13:24
additional things
13:25
okay so say this one is more specific
13:27
than that okay
13:29
so for example if you have a filter that
13:34
matches all the IP addresses in
13:36
Gainesville and you have another one
13:39
that matches all the IP addresses in
13:40
Florida then the gains will filter is
13:43
more specific than the Florida filter
13:45
see if you got a packet that's going to
13:47
Gainesville you'd rather use the
13:49
Gainesville rule versus the overall
13:52
Florida rule okay so the most specific
13:56
rule says if you have something that's
13:58
more specific than something else then
14:00
you'd rather use the most specific one
14:02
okay now this rule becomes somewhat
14:05
ambiguous because you may have filters
14:08
like this where you do have an overlap
14:11
so if you have a packet that's going to
14:14
let's say the destination address is 12
14:16
okay it's matched by this one is matched
14:19
by that but neither is more specific
14:21
than the other okay so if you're using
14:25
the most specific rule you have to be
14:27
careful that you don't have ambiguities
14:29
of this type okay so that leads to
14:32
problems like given a bunch of rules how
14:34
do you make sure you don't have this
14:36
kind of ambiguity in there all right the
14:42
one that's really used in practice for
14:44
router tables is the longest prefix rule
14:48
okay so if I go back to my previous
14:52
example
15:03
so if you have a that's a packet that's
15:07
matched by P 8 P 7 P 6 P 4 and P 1 then
15:13
the longest matching rule would say use
15:16
P 8 and that's the same as the most
15:19
specific rule okay because the address
15:24
is matched by P 8 are a subset of those
15:26
matched by P 7 which are a subset of
15:28
those matched by P 6 which are a subset
15:31
of those matched by P 1 which are a
15:33
subset of those matched by P 4 ok so the
15:37
longest matching prefix is really the
15:40
most specific rule translated into the
15:43
prefix domain and when you have prefixes
15:47
you cannot have ambiguities as far as
15:50
the most specific rule is concerned
15:52
because there's no way for two prefixes
15:54
to intersect and have and not have this
15:58
containment property ok so you can
16:01
verify that on your own ok all right so
16:05
typically then what we're dealing with
16:07
is the longest matching prefix and of
16:11
course our filters are prefixes ok
16:22
alright so that's I guess the example
16:25
that we just went through ok all right
16:28
static and dynamic router tables now in
16:32
the application we're talking about
16:34
performance is very crucial you need to
16:39
be able to if you want to keep if you
16:42
want to operate at the speed of the
16:44
transmission lines that are used in
16:47
networks you you have to be able to
16:51
process hundreds of millions of packets
16:54
a second and in order to process
16:57
hundreds of millions of packets per
16:59
second you need to have very very fast
17:03
fast data structures to do this the
17:07
trying to do it on a single high-speed
17:10
processor then you're looking at maybe
17:17
being able to afford of the order of
17:19
four five or six cache misses
17:23
so whatever structure you use is got to
17:27
be searchable very very quickly
17:29
okay small number of memory accesses now
17:33
it turns out you really can't get that
17:37
kind of performance with the known
17:39
structures today unless you limit
17:44
yourself to static structures so you
17:47
build a structure knowing the table
17:48
ahead of time and you optimize that
17:50
structure for search now in reality
17:55
you're a low tables never really static
17:57
because downstream nodes where you're
18:01
trying to send packets to might go down
18:03
in which case you've got to be able to
18:04
change the rules in your table or new
18:08
intermediate nodes come up you want to
18:10
be able to put rules into your table to
18:12
account for that so in reality the
18:15
tables do change or in the example of an
18:17
employee getting fired you have to
18:19
instantly be able to go in and update
18:21
your firewall okay so it's not static
18:25
okay so in reality things are not static
18:27
but for many situations you can kind of
18:32
behave like it is static okay and what I
18:35
mean by that is you have a structure
18:39
that is optimized for search and sure
18:41
there are changes taking place but you
18:43
don't record that change in this
18:46
structure okay well that means is that
18:50
at times you'll end up sending packets
18:53
to intermediate nodes that no longer
18:54
exist and that's really not a problem
18:57
because if you don't get an
18:59
acknowledgment back from a node you can
19:00
retransmit the packet okay so you can
19:03
have some failure recovery schemes
19:05
instead you accumulate changes as you
19:07
discover like some routers are down you
19:10
want to put in new rules take out old
19:11
rules you accumulate them in a shadow
19:13
structure which can be updated in batch
19:16
mode maybe every twenty minutes thirty
19:19
minutes whatever your
19:21
cycle timers and then you swap the
19:23
shadow structure with the working
19:25
structure with some periodicity so for a
19:28
while you're sending packets where you
19:30
shouldn't be but the alternative is
19:33
everything is slowed down all right so
19:36
so many of these structures that are in
19:41
use tend to be static okay or at least
19:44
semi static along the lines that I
19:46
described and they're optimized for
19:48
lookup okay you need to have reasonable
19:50
pre-processing time time required to
19:53
create the structure because with some
19:55
frequency you're going to be creating a
19:57
new structure and storage is important
20:01
you're an event of my storage because
20:03
then you can take this structure and you
20:05
can put it into high-speed memory
20:06
instead of into into larger low-speed
20:09
memory okay so the important things here
20:13
are lookup time you got to be able to
20:14
process several hundred million facts a
20:16
second and part of accomplishing that is
20:19
to use high speed memory which is
20:20
expensive so storage becomes important
20:24
and then of course you have dynamic
20:27
structures and dynamic structures every
20:29
time you get a change you got to
20:31
incorporate that change right away so
20:33
for example of this employee who was
20:36
going to get fired
20:37
you really can't wait 30 minutes from
20:40
the time you give him a pink slip
20:41
because in that 30 minutes all of your
20:43
corporate secrets would be transmitted
20:45
out of the company so you've got to be
20:48
able to change the rules instantly give
20:57
you some idea of the size of the tables
20:58
we're talking about
20:59
all right this table really maybe
21:03
reflects a few databases a few router
21:08
tables the size of tables actually
21:09
changes in time so if you went to a
21:12
particular ISP and you looked at the
21:14
router table at this point in time maybe
21:18
they've got 85,000 rules in there but if
21:21
you come back three hours later it might
21:23
have become 90,000 rules because the
21:25
rules were inserted or it might have
21:27
dropped down to 70,000 rules so they do
21:29
change in time even if they're working
21:31
in that semi static mode
21:35
the the largest tables that are out
21:37
there currently really of the order of
21:39
200,000 rules so this doesn't represent
21:42
the biggest tables just some sample of
21:45
tables if you represent these tables
21:50
using tries like we've been discussing
21:52
the last few lectures the number of
21:54
nodes in the tries tend to be roughly
21:56
two times the number of rules again the
21:59
analysis we did in class said that the
22:03
number of nodes in the worst case could
22:05
be the length of a key times the number
22:07
of keys now the length of the keys here
22:10
these are IP four tables are like 32
22:13
bits so you really don't get something
22:15
like 32 times then you typically get
22:18
like 2 times okay for a while the common
22:25
solution to router tables was to you was
22:28
a hardware solution which was using
22:30
something called ternary content
22:32
addressable memories and basically in a
22:36
ternary cam each bit can be set into one
22:39
of three states so in this case I've got
22:42
a ternary cam which has been organized
22:43
into two four six words each word being
22:47
two four five bits long okay
22:50
each bit can be either a 0 or a 1 or a
22:53
question mark question mark means don't
22:54
care so this setting for example would
22:58
represent the prefix 0 0 1 0 stop in a
23:03
scheme of destination addresses being
23:07
five bits long okay this would represent
23:10
the prefix 0 1 star now if you're doing
23:14
really IP 4
23:15
these have to be 32 bits long okay all
23:21
right so you you initialize your ternary
23:24
cam with the prefixes that you have and
23:27
then when you get a packet you look at
23:30
the destination address and you load it
23:31
into a register ok you press a button
23:35
and the hardware searches all of its
23:39
words in parallel for matches with that
23:42
ok so in one cycle of the cam the
23:45
matching fellows light up in this case
23:47
these
23:47
three fellows like that okay so you got
23:52
a parallel search of all the woods then
23:56
the cameras set up so that it's got
23:59
hardware to select one of the fellows
24:02
that lights up and the one that is
24:04
selected is the first one in the table
24:07
okay so you can think of it as like two
24:11
cycles of the cam or one big cycle of
24:13
the cam initially all the words that
24:16
match light up and then the first one
24:18
remains letting everybody else dies off
24:21
and this becomes the selector alright so
24:27
you can use this for first word and
24:30
table match okay you can use it for
24:32
longest prefix match by organizing this
24:35
table in order of prefix length you can
24:39
use it for highest priority match by
24:40
putting in the prefixes by priorities
24:43
okay so search is very quick it's
24:46
running at hardware speed inserts and
24:51
deletes a little bit problematic so if
24:54
you're doing longest prefix match in
24:55
ipv4 then you've really got thought you
24:58
could have let's say 32 blocks okay if
25:01
you ignore a zero length prefix so here
25:04
you could have prefixes of length 32
25:05
than 31 30 and so on and if you want to
25:09
insert a new prefix of length 32 you
25:12
could put it in at the end of the 32
25:14
block by putting it here you'll have to
25:17
take out the first 31 fellow and move
25:20
that by replacing the fellow at the top
25:23
of the 30 block and the 29 block in 28
25:26
block okay so what about 32 moves you
25:29
could affect an insert or you could do a
25:34
delete in about 32 moves okay but that's
25:36
a lot of moves but certainly as far as
25:40
search goes looks like a great way to go
25:43
okay now there are problems with using
25:49
this with very large tables first verse
25:55
comes at the capacity okay so if you
25:56
want to represent two hundred thousand
25:59
prefix says an IP
26:01
for you got to have a cam with 200,000
26:04
times 32 bits capacity of cams tends to
26:09
be around 100,000 prefixes there's a
26:15
cost associated with it you can imagine
26:18
these things are expensive and there's a
26:21
lot of hardware in there for the
26:25
arbitration to find the first one that
26:27
matched and also to find to get all of
26:29
these guys to be searched in parallel
26:31
there's a lot of power being consumed
26:33
which means a lot of heat being
26:34
generated they take up a lot of space on
26:36
your boards and if you want to go to ip6
26:40
okay so now instead of 32 bits the
26:43
prefix you got hundred and twenty eight
26:45
so if you could handle hundred thousand
26:48
prefixes or IP four you can only do
26:50
25,000 in IP six the capacities going
26:53
down to one if you want to handle ranges
26:56
then you got to take each range and
26:58
decompose it into a bunch of prefixes
27:00
okay so that gives you an expansion in
27:03
the size of the table so very few ranges
27:06
can be handled if you go into multi
27:08
dimensions so like if you have source
27:11
and destination
27:12
well then each entry in IP four instead
27:15
of 32 bits becomes 64 bits so so there
27:21
are problems with scalability ranges and
27:24
going with multi dimensions but there
27:26
are also other problems in terms of the
27:28
power consumed and their constant all
27:33
right so there are many other solutions
27:34
that people have pursued which range
27:37
from well perhaps which all start
27:40
somewhere in software and sometimes that
27:42
software gets implemented in hardware
27:44
maybe as an ASIC to get you better
27:47
performance
27:51
right the solutions which are more
27:55
sufferer oriented there's a wide range
27:57
of them things using hash tables to give
28:00
you good perform expected performance
28:02
solutions using binary search trees
28:05
using B trees using privately search
28:08
trees that we're going to talk about in
28:09
a subsequent lecture but the most
28:12
dominant structure that you see out
28:14
there is really tries so let's start by
28:20
looking at a simple try solution we'll
28:22
call this a 1-bit try because here at
28:25
each level of the try
28:27
we'll be branching by looking at one bit
28:28
of the destination address and then we
28:31
can generalize this to higher degree
28:33
tries like we did in our last lecture
28:35
okay so no one bit try we're going to be
28:41
representing keys that are of different
28:43
lengths okay so you've got a key here
28:45
which is of length to forget the start a
28:47
key of length three key length one and
28:50
zero so these prefixes are going to get
28:52
stored in a one bit try these are
28:54
different length prefixes and we're not
28:57
going to use the trick of appending a
28:59
special character at the end the
29:03
strategy is going to be that in each
29:04
node of the try will have capacity to
29:06
store two prefixes okay and the root
29:10
level will store prefixes of length one
29:13
it's children can store prefixes of
29:15
length two prefixes of length three four
29:18
five six seven and so on okay so each
29:22
level is able to store a prefix of a
29:24
particular length so p5 stored here p5
29:31
is going to be 0 star this has to be one
29:34
star if you begin with a 1 you go on
29:36
this side if you begin with a 0 you go
29:38
on that side okay so this one here's got
29:40
to be 1 0 star there's no 1 1 star so
29:44
this part is empty this has to be 1 1 1
29:48
star ok this is coming here it's a 1
29:53
that's a 0 0 ok it's 1 0 0 star ok so
29:58
depending upon whether you're a 1 or a 0
30:00
you can find your way through and if
30:03
once you get to
30:05
appropriate level you just Park yourself
30:07
in that spot if you want to search for
30:14
the longest length prefix here it's
30:16
fairly easy okay if you're given a
30:19
destination address you follow the path
30:22
given by the address if the first bit in
30:24
the address is one you go this way if
30:27
the next bit is zero you come here after
30:28
the next bit is zero you come here if
30:30
it's zero you come there if it's one you
30:32
go then okay all of the matching
30:36
prefixes will be found on this path okay
30:40
so you follow a path given by the
30:42
destination address and every matching
30:44
prefix on the table is found on this
30:45
path the longest matching prefix is the
30:48
last prefix you see on this path okay so
30:52
you keep track of prefixes as you see
30:53
them and the last one you saw is the
30:56
answer to the search you can insert the
31:04
prefix using the same strategy we know
31:07
how to insert into a try it's fairly
31:09
simple and we can remove a prefix with
31:12
great simplicity in fact the complexity
31:17
of all of the operations is order of W
31:20
where W is the length of the longest
31:22
prefix it's an IP for you'd be running
31:27
in about 32 units of time the
31:34
unfortunate thing about this is as
31:36
you're going down okay when you're
31:39
coming out here to decide on the next
31:43
node to move to I go to access this
31:44
field from memory okay so I know which
31:48
field to get whether to get the left
31:49
child or the right child field because I
31:51
know the bit in the destination address
31:53
so by looking in that bit I can figure
31:56
out which field I want I get it and then
31:58
I can move down here but now we go to
32:00
make another access okay so if you're
32:05
dealing with IP four you got to make
32:07
about 32 memory accesses in the worst
32:09
case in order to do a search and that's
32:13
too many memory accesses
32:17
okay alright so to get better
32:23
performance we're going to go into
32:27
higher order trice now the interesting
32:32
thing about the high order tries they
32:34
get used in router tables is that
32:36
they're not the same kind of high order
32:39
tries we studied in our last lecture in
32:43
the last lecture when we talked about a
32:44
try whose degree was let's say 10 like
32:47
the Social Security try well every
32:49
single node had that degree here we're
32:52
going to vary the degree of nodes so
32:55
some nodes might be of degree 2 some may
32:58
be of degree 4 some may be of degree 16
33:01
okay so it's really a variable degree
33:04
try and you have two versions here okay
33:08
oops
33:10
all right so we've called these multi
33:11
but tries because we're going to use a
33:14
different number of bits to branch on if
33:15
a node has degree 2 we'll branch on one
33:18
bit if a node has degree 4 we'll branch
33:21
on two bits if a node has degree 8 will
33:23
branch on eight bits of 3 3 bits of the
33:25
destination address now
33:29
alright so branching is done using one
33:31
or more bits where we have two versions
33:33
of multi bit tries one is a fixed tried
33:35
try and this has the property that if
33:39
you look at nodes on the same level so
33:42
all the children of the root well
33:44
they're all of the same degree they all
33:45
use the same number of bits all the
33:48
grandchildren use the same number of
33:50
bits but that could be different from
33:52
the number of bits used by the children
33:53
of the root okay so the number of bits
33:56
used varies by level but not within a
33:59
level again that's called a fixed tried
34:01
try then you have a variable stride try
34:04
where you relax that and say well the
34:07
number of bits can vary within a level
34:08
as well as between levels okay all right
34:14
so let's look first at fixed tried tries
34:21
the number of levels we need in a try
34:25
okay to figure that out we've got to
34:29
have a place for every prefix okay and
34:33
if you're looking at a fixed tried try
34:36
so if the routes you're going to be
34:40
branching on two bits well then in the
34:42
route I'll be able to store only
34:43
prefixes of length two and if the
34:47
children of the route of branching on
34:49
three then in the children I'll be able
34:52
to store prefixes of length five the two
34:54
bits used at the root in the three bits
34:56
being is the child the number of levels
35:00
that you need and a fixed right try to
35:03
house all the prefixes you have is
35:04
actually equal to the number of
35:06
different lengths in your prefixes and
35:10
if you want to reduce the number of
35:13
levels what you have to do is you have
35:15
to somehow control the different lengths
35:18
and to control the different lengths we
35:21
use a technique called prefix expansion
35:24
and what prefix expansion is if I look
35:27
at this stuff here this is what I had
35:29
before
35:29
well here there are seven different
35:31
lengths okay you got prefixes of length
35:35
one two three I don't have a four I
35:38
suppose I've got a five six out of seven
35:41
okay now I can come up with an
35:46
equivalent table which has only three
35:50
different lengths by expanding some of
35:52
the prefixes so I'm I'm going to have
35:54
lengths of two five and seven okay so
35:59
all the other lengths have to vanish now
36:01
I've got a prefix here of length one
36:04
that's not permissible because I'm going
36:05
to the smallest length eggman allow us
36:07
to but the prefix 1 star is equivalent
36:12
to having two prefixes 1 0 star and 1 1
36:14
star okay so I replace this with two
36:18
prefixes 1 0 star and 1 1 star I got a 1
36:22
1 star here when I replace it with 1 0
36:25
start kind of collides with this fellow
36:28
I can't have two copies in my thing but
36:32
between a 1 0 star coming
36:35
here in a 1-0 star from here this guy's
36:37
going to win because this is a longer
36:38
length prefix okay so in this case my
36:42
expansion of this didn't increase the
36:44
number of prefixes when I expand this
36:46
fellow I get 0 1 star and 0 0 star and
36:50
they don't collide with anybody so you
36:52
end up with 2 copies now here a p5 a and
36:56
a few 5b ok so you can expand prefixes
37:01
of a length that don't exist in your try
37:03
to the next available length if the next
37:06
available length is 1 more you expand to
37:08
two prefixes if it's 2 more you expand
37:10
it to 4 if it's 3 more you expand into 8
37:14
ok so by using prefix expansion I can
37:17
control the number of lengths in my
37:20
table and with that I can control my
37:24
number of levels in the try so this
37:27
would translate into something like this
37:31
the route we're going to branch on 2
37:34
bits then the children in the root 2
37:37
branch on 3 bits if you branch in 2 bits
37:40
you have 4 pointer fields and for data
37:42
fields if you branch on 3 bits you have
37:44
8 point of views and 8 data fields and
37:47
then the grandchildren branch it to bits
37:50
ok so here you can store a prefix of
37:53
length 2 here are prefixes of length 5
37:55
and then here prefixes of length 7 now
38:01
if you only search this fellow okay well
38:06
you search this fellow I'll first look
38:08
at two bits of the destination address
38:10
and knowing the starting point of the
38:14
root I can compute the field that I need
38:16
and make one memory access to get that
38:18
field okay then if it's this one the
38:22
data stored here gives me the starting
38:24
point of this node using the next three
38:27
bits I can figure out which field here
38:31
is needed okay so it doesn't matter that
38:34
this node doesn't fit in a cache because
38:35
we're not going to bring the whole init
38:36
node in I'm going to get the three bits
38:41
from my destination address use the
38:43
start address given here by this pointer
38:45
and figure out which field
38:48
need bring that field in and then
38:51
that'll give me the next pointer okay so
38:54
so long as each field fits into one
38:56
cache line and the number of memory
38:58
accesses needed is equal to the height
39:02
of this structure okay all right so now
39:06
instead of seven memory accesses I can
39:08
search that table with three memory
39:10
accesses so when you're designing fixed
39:21
droid tries you're faced with a design
39:25
problem really and the design problem
39:27
says I first do an analysis which says
39:30
I'm going to run this router on this
39:34
particular computer or on this chip or
39:37
whatever and this chip is going to
39:39
support memory accesses at this rate I
39:42
need to be able to process 200 million
39:44
packets per second and so I can afford
39:47
only four memory accesses per packet so
39:52
by doing that kind of analysis you get
39:55
something which says I need a try whose
39:57
height is at most cakes and my example K
39:59
was 4 so subject to that constraint and
40:03
given my table I want to find the best
40:07
fixed right try best meaning it takes
40:09
the smallest amount of memory okay
40:11
depending upon how you pick the strides
40:13
you get different amounts of memory so
40:16
for example I could pick the stride of
40:18
the route just 32 you need only one
40:21
access okay but you need to raise to 32
40:25
fields so depending upon how you
40:29
structure this you could need different
40:31
amount of memory even if you look at
40:34
within things of height exactly three
40:36
you can have a stride of two here three
40:39
here and two there but I could have had
40:41
a stride of three here or two there and
40:43
two there instead okay and those would
40:45
require different amount of memory okay
40:47
so I want to pick the strides of the
40:49
levels having at most K levels and using
40:53
the smallest amount of memory
40:59
if you go into I think I'm going to
41:04
probably skip this part so that we can
41:06
finish on time you can set up a
41:09
formulation using dynamic programming
41:11
using concepts of different levels of
41:14
the try covering levels of the of the
41:18
initial one bit
41:19
try that you started with and you can
41:23
set up dynamic programming equations let
41:27
me just skip all of this stuff okay and
41:32
you can solve these equations to figure
41:34
out the optimal try spending about this
41:38
much time case the is the height of the
41:42
try that's it committed and W is the
41:45
length of the longest prefix okay so you
41:49
can find the optimal strides in a fairly
41:52
small amount of time now there are
41:57
multiple dynamic programming
42:00
formulations you can come up with I
42:02
don't really go through the details of
42:04
this but there's another one which is a
42:06
little more complex and this one runs
42:11
about two to four times as fast as the
42:13
previous one okay now what's interesting
42:18
about this whole topic of using tries in
42:23
the router and firewall area is there's
42:27
a huge amount of dynamic programming
42:28
that gets used to build the optimal
42:31
structures but if you look at the amount
42:36
of memory that's taken by the fixed
42:38
tried try okay for the sample databases
42:42
we had up there if you allowed two
42:46
accesses okay then it's pretty big okay
42:50
but if you go from two to three there's
42:52
a huge drop in the amount of memory
42:54
that's needed okay so that seems to kind
42:56
of be a critical place certainly we go
42:59
from three to four there's another big
43:01
drop but not really as much it's out
43:02
there and once you go beyond four
43:05
there's not a whole lot of change in
43:07
particular when you're around six and
43:09
seven its flattens out
43:11
and what that tells you is that by
43:15
paying virtually no memory penalty if
43:18
you're flattening out this is what you
43:20
were taking you and when you had
43:21
thirty-two accesses which meant one bit
43:23
for the traditional one bit try okay so
43:28
by using about the same amount of memory
43:30
as used by the simple one but try using
43:32
the scheme here you can reduce the
43:34
number of memory accesses from 32 to
43:37
about six so you get a factor of five
43:39
speed-up without paying any memory
43:41
penalty alright this just talks about
43:48
runtime let's forget that you can go
43:51
into variable stride tries so in a
43:54
variable stride try just said nodes are
43:58
the same level can have a different
44:01
number of bits that they branch on so
44:04
over here I'm branching on two bits
44:06
two bits means I'm covering these two
44:10
levels of my original try then on this
44:13
side I got five bits so I'm going to
44:15
cover all five of these levels are going
44:19
to be mapped into this fellow and here
44:21
I've got three bits so I'm going to
44:23
cover these three fellows okay another
44:27
way to look at it is prefixes of length
44:28
two prefixes of length seven provided
44:33
they begin with 1 0 and then prefixes of
44:37
length 5 provided they begin with 1 1
44:42
you get the same optimization problem
44:46
here find the strides for all of the
44:49
nodes ok number of bits being used to
44:52
branch on on every node out here so that
44:55
you minimize the total memory against
44:58
subject to a height constraint so what
45:02
you would expect here is that the
45:06
optimal variable stride tries would take
45:10
less memory than the optimal fixed tried
45:12
tries for the same number of memory
45:15
accesses
45:16
all right so you can solve this using
45:18
dynamic programming I will skip the
45:21
formulation okay and actually you can
45:32
get clever about the algorithms and come
45:33
up with better algorithms for the case
45:36
where the two accesses are allowed and
45:38
also for the case of only three accesses
45:40
are allowed okay if you look at the
45:50
experimental studies okay this is a
45:52
fixed tried try out here okay sorry this
46:00
is the fixed tried try okay then you go
46:03
and do variable stride
46:04
the butler is just a technique again
46:06
that we talked about when we were
46:08
dealing with tries which says that
46:10
instead of using the Tri structure all
46:12
the way down to one key you may stop
46:15
when you have let's say five keys or six
46:16
keys and put them in a bucket okay and
46:19
so the same thing is done out here you
46:22
can have memory savings by stopping when
46:25
the number of keys is say small enough
46:27
to fit in a cache line you just don't
46:30
keep branching all the way down to one
46:31
key so that's a butler node in this
46:34
particular field
46:36
so with two accesses you can get a huge
46:41
amount of savings over fixed right tries
46:43
with three accesses you can still get
46:46
good amount of saving and then the
46:48
amount of memory saved decreases as you
46:51
go to higher numbers so when you get to
46:53
save seven or so there isn't a whole lot
46:57
of improvement you know variables try to
46:59
try over a fixed ride try okay
47:05
it's just spent a few minutes talking
47:06
about higher dimensions now even though
47:10
filters could go up to five dimensions
47:13
really for about the largest filters the
47:17
interesting case is really looking at
47:19
two dimensions okay a couple of reasons
47:22
for that
47:22
one is that if you just look at the
47:27
source and destination and then
47:28
everybody all filled
47:30
that have the same source prefix and
47:31
destination prefix put them into an
47:33
equivalence class well then these
47:35
equivalence classes tend to be very
47:36
small in practice so they tend to be
47:39
only like five filters in them so you
47:41
can put them all in a bucket and search
47:43
them serially okay so for practical
47:47
tables if you use this truck just play
47:51
with two and regard equivalence classes
47:54
where the source and prefix are the same
47:56
you get a very good solution okay and
48:00
there's no point trying to differentiate
48:03
or have a sophisticated structure on the
48:04
other three dimensions the other thing
48:07
is depending upon how people resolve
48:09
security issues that are being
48:11
considered at least for general purpose
48:15
routers you may not have access to
48:17
anything other than the source and the
48:18
destination so if you're working on an
48:21
intra enterprise router let's say you're
48:24
doing it internally within a company you
48:26
can do whatever you like and even read
48:28
the text or the payload of a packet and
48:31
use that for routing but once you get
48:33
outside you will not be permitted to
48:34
read payloads and some people are
48:37
arguing pretty locally that you
48:39
shouldn't be allowed to see things like
48:40
sauce protocol the source port and
48:43
destination port and so on either okay
48:45
so there are reasons to focus on two
48:47
dimensions and we've skipped this so
48:52
when you get to two dimensions you could
48:55
certainly extend tries to two
48:56
dimensional tries okay so what I've done
48:59
out here is I've taken the destination
49:03
prefixes which are the first ones out
49:04
here and represented them with a one bit
49:08
try up here okay then all of the filters
49:12
they have the same destination prefix
49:15
get mapped into a secondary try based on
49:18
the source prefix so for example this
49:22
would be where you'd store 0 stop ok so
49:26
0 Star has F 1 with this this with this
49:30
sauce prefix F 2 F 3 and F 7 okay so
49:35
those will show up here using the
49:38
destination so I've got an F 7 and F 1 F
49:41
2 and F 3
49:43
okay so you can build a two-dimensional
49:46
try in this fashion and of course you
49:53
could extend this to multi-bit tries
49:55
okay so for example here the destination
49:58
try here your branching on one bit two
50:01
bits one bit and then the source tries
50:04
here I'm branching on one bit here on
50:06
three bits here I've just got a single
50:09
root in this source try branching on two
50:11
bits I've got a two-level source try
50:13
here branching on one bit and two bits
50:16
okay so this leads to an even bigger
50:20
optimization problem you're told how
50:22
many memory accesses are allowed for the
50:24
search build a try that can be searched
50:27
using at most that many build a
50:29
two-dimensional try they can be searched
50:31
with that most that many accesses again
50:37
you can formulate dynamic programming
50:39
algorithms to build these things the
50:42
dynamic programming algorithms known
50:43
here don't find the true optimal they
50:45
find the true optimal under certain
50:46
restrictions but even those restricted
50:50
ones similar to what we had for
50:54
one-dimensional case without paying any
50:57
memory penalty over the one bit tries
50:59
you can reduce the number of accesses
51:01
somewhere between 1/2 to 1/4 okay but if
51:06
you're willing to pay a memory penalty
51:08
use about 50% more memory you can
51:10
improve the performance by a factor of
51:12
between 4 to 8 ok all right so as far as
51:19
price go there are several applications
51:22
of tries an application that has seen a
51:25
lot of research over the last several
51:27
years is in the design of router tables
51:30
firewalls packet classifiers and stuff
51:32
and that area has dealt a lot with multi
51:36
bit tries with a very heavy use of
51:39
dynamic programming there's a lot more
51:42
to tries than I've been able to get
51:44
through in this one hour
51:45
you worry about say succinct
51:47
representation of Christ compact
51:49
representation because I guess memory is
51:54
very important in this application
51:56
and while we've optimized here using the
52:01
multi bit schemes there are other
52:04
strategies you start with a one bit try
52:06
and figure out some compact way to
52:08
represent it and we haven't talked about
52:11
those okay all right any questions okay