Transcript 00:19 okay but at this point you should have 00:25 got back your second exams so hopefully 00:28 you've all picked them up from the TA 00:31 the performances on the second exam was 00:34 rather good the average is about 83 84 % 00:39 so that's a good performance you have 00:44 any questions you want to ask before I 00:46 start today's lecture okay 00:53 are we far we've had a few lectures on 00:55 tries and today we're going to really 00:58 take a look at an important application 01:01 of tries and this is in the 01:04 representation of of tables that are 01:08 used in the internet for things like 01:12 packet forwarding or for implementation 01:15 of firewalls and so on right so the 01:20 tables we're talking about these are 01:23 really sets of each table is a set of 01:27 rules and a rule has two components to 01:30 it there's a filter which is a boolean 01:32 expression and the components of the 01:36 boolean expression are really pieces of 01:38 data that you can find in the header of 01:40 a packet that's trying to move through 01:41 the network okay so a filter is a 01:45 boolean expression that might be 01:47 comprised of variables such as the 01:49 source address where did this packet 01:50 originate the destination was the packet 01:53 going and it could include things like 01:56 the port number where the packet 01:59 originated the port number where it is 02:01 going you could even have the protocol 02:04 being used here is it a tcp or UDP you 02:09 could have even things like the time of 02:12 day okay so for example you may might 02:18 make up a make up a filter which says 02:24 for packets that 02:29 that started originated at this 02:32 particular PC so that's got its own IP 02:34 address and it's going to some other 02:37 machine some ways you've got a 02:38 destination address and you might use 02:41 the time of day so between 8 a.m. and 5 02:45 p.m. so so packets that start somewhere 02:50 and they're going somewhere between 8 02:52 and 5 you may want to handle in some 02:54 special way and the special way on the 02:57 handle is given by the action what do 02:59 you want to do with packets that satisfy 03:01 the constraints of this filter ok 03:05 so for example you may put a rule in 03:07 your table that says packets originating 03:09 from from the database lab between 8 & 5 03:13 and that are going to this web server 03:15 which is used to download music well 03:18 they should be dropped ok we won't allow 03:20 you to download songs between 8 & 5 if 03:23 you're trying to download it from the 03:25 database lab ok but if it's outside of 03:28 business hours we might say oh that's ok 03:30 right so so we could set up a rule which 03:35 the filter part describes what packets 03:38 the rule applies to and then the action 03:41 part tells what should you do with that 03:43 packet instructions could vary things 03:46 like drop the packet in the example I 03:48 just gave you 03:49 or it could be forward than the packet 03:51 from the current location to a 03:55 particular destination location and the 03:59 destination that you're sending it to 04:00 which you might call the next hop where 04:02 does it go to the next may not agree 04:04 with that because the way you get from 04:07 the true source to the true destination 04:09 may be by going through a bunch of 04:11 intermediate nodes ok so so you could 04:17 have a next hop information okay so if 04:19 you're sending packets from Gainesville 04:22 Florida to Los Angeles California and if 04:26 right now I'm at the router at the 04:28 University of Florida well from here I 04:31 might send it to a router say in Atlanta 04:33 and that router might forward it to a 04:35 router in Texas which might eventually 04:37 forward it to a router in 04:39 California which might send it to the 04:41 final destination of the packet okay so 04:45 so each of these routers that I'm 04:48 talking about would have a table of this 04:50 sort collection of filters and 04:52 associated actions and the actions would 04:56 be different depending upon where this 04:59 table is located so the action for a 05:02 packet distant for California in the 05:05 University of Florida router could be 05:07 send it to Atlanta but for the router 05:10 sitting in Atlanta the corresponding 05:12 action would not be send it to Atlanta 05:14 to be send it somewhere else okay so the 05:16 tables are different depending upon the 05:20 location of the table the set of rules 05:23 could vary and also the set of actions 05:28 we've gone through some of these already 05:32 depending upon the type of application 05:35 the filters vary in complexity the most 05:39 complex filters around generally tend to 05:42 have about four or five components to 05:44 them so typically a QoS filter quality 05:51 of service filter would include the 05:52 source where is it coming from where is 05:54 it going 05:55 the two ports as well as the protocol 05:57 being used but you could get more 05:59 complex if you want to throw in say time 06:01 of day for example firewall filters 06:07 typically vary in their complexity you 06:11 may have a filter as simple as having 06:13 only one variable in it which could say 06:16 if you have an employee that's been 06:19 fired or is about to be fired you might 06:21 go in and insert a rule into your table 06:23 which says all packets originating from 06:25 this employees workstation are to be 06:28 dropped okay so you just have a single 06:33 variable in that filter but you could 06:36 have more variables in the firewall 06:39 filter the most common kinds of filters 06:45 around our simple one-dimensional 06:49 filters that are based only on the 06:51 destination address 06:52 so if you're sending packets 06:57 around the internet they typically do 07:00 their job by just looking at the 07:02 destination field of the header so the 07:08 most studied type of tables are 07:12 one-dimensional tables where the single 07:15 field is the destination address all 07:21 right so let's first look at destination 07:23 address filters and then towards the end 07:25 of the lecture we'll take a look at 07:26 higher dimensional filters now a 2d 07:33 filter a sorry a one-dimensional filter 07:37 you could specify that filter in one of 07:40 many ways so for example you could give 07:43 a range of destination addresses to 07:46 which it applies so any packet that's 07:49 going to a destination whose IP address 07:52 lies between 35 and 2096 is matched by 07:55 this specification more typical is to 08:01 give the filter as a pair which is an 08:04 address and a mask and the mask here 08:08 basically the zero signified don't care 08:11 and the ones mean that we do care so 08:14 wherever you get a 1 here the 08:16 corresponding bit out here is important 08:18 in the address okay so the zero says I 08:21 don't really care what this first bit is 08:23 the one says the next bits gotta be is 08:24 zero that one says then you got to have 08:26 a 1 and you could have a 1 and then you 08:29 don't care what it is and then you do 08:31 care it's got to be a zero so there are 08:33 two places where you have don't cares 08:35 you have a don't care out here and you 08:39 have a don't care out there okay but the 08:41 other three four bits have to be 0 1 1 08:45 and 0 okay so this specification will 08:50 satisfy exactly for destination 08:53 addresses you can vary this through it's 08:56 two possibilities 0 and 1 you can vary 08:58 that bit through its two possibilities 0 09:00 & 1 and so that gives you these four 09:02 destination addresses that are mash 09:05 by this specification now in a real 09:08 filter it's not going to look as simple 09:10 as this because in a real filter you're 09:14 generally dealing with say IP version 4 09:15 in which case these addresses are 32 09:18 bits long if you're dealing with version 09:20 6 then 128 bits long okay but most of 09:24 what's out there today is really version 09:25 4 09:25 so you've got 32 bits okay now in 09:33 reality people don't really deal with 09:36 filters that as general as this if you 09:38 pop open any commercial router out there 09:41 it's really looking at very simple 09:44 filters which have prefix filters so 09:47 prefix filter if you're looking at it in 09:50 this notation would have the property 09:52 that all the zeros are at the right hand 09:54 so they don't care at the right end 09:55 before several bits are fixed okay so 09:59 for example here I've got a 1-1 which 10:01 means I care about these two bits they 10:03 got to be a 1 and 0 then I got 4 zeros I 10:06 don't care about the other four bits 10:09 was it 5 I get 5 zeros okay so this one 10:14 with these two ones means I've got to 10:16 match that those twos I get a 1-0 stop 10:19 and you can convert that into a range 10:22 the 1-0 star the smallest address it 10:26 matches is 1 0 followed by as many zeros 10:28 as needed to get to your address length 10:31 and the largest address it matches is 1 10:34 0 followed by as many ones as needed to 10:36 get it to a complete destination address 10:39 okay so a prefix filter can be specified 10:44 as an address mass pair the mass will 10:46 have the property that you're going to 10:48 have the ones at the left and then 0 is 10:51 at the right or can be specified as a 10:54 range so it's really a special case of 10:57 both of these two types okay as I said 11:01 this is really what's what you're going 11:03 to find in real-world router tables a 11:06 whole bunch of prefixes 11:15 all right so an example of a a very very 11:20 small router table okay I have a bunch 11:24 of photos and of course there are 11:26 associated actions I haven't really 11:28 listed the actions out here okay alright 11:33 so we have a bunch of photos which is 11:35 specified as prefixes okay and so for 11:40 example this one here will match all 11:42 destination addresses that begin with 1 11:44 and 0 you know of course that's not the 11:47 only filter I've got in this table that 11:49 matches addresses that begin with 1 & 0 11:53 this guy here also matches some of the 11:55 begin with 1 & 0 except that they got to 11:57 be followed by two more zeros okay so 12:00 also here and so also there in fact 12:02 every address matched by p8 is also 12:05 matched by p7 and by p6 and by p4 and by 12:10 p1 okay and and that's quite common if 12:14 you look into real tables given a 12:17 destination address typically there will 12:18 be many photos that match that 12:21 destination address and so you need to 12:24 have a tiebreaker to determine which 12:27 rule actually applies and certainly you 12:32 could think of many tiebreakers for 12:36 example the first matching rule so your 12:39 table is a serial or a sequence of 12:43 filters the topmost filter in the table 12:46 that matches that's the first filter is 12:48 the one that applies you have a highest 12:52 priority rule so to each rule you assign 12:55 priorities of the ones that match you 12:58 select the one with the highest priority 13:00 you can have something called a most 13:03 specific rule so for example here if a 13:08 filter specified as a range okay two 13:10 four and you get another one that's one 13:12 six okay then 13:18 everything that's matched by 2 4 is also 13:21 matched by 1 6 but 1 6 matches some 13:24 additional things 13:25 okay so say this one is more specific 13:27 than that okay 13:29 so for example if you have a filter that 13:34 matches all the IP addresses in 13:36 Gainesville and you have another one 13:39 that matches all the IP addresses in 13:40 Florida then the gains will filter is 13:43 more specific than the Florida filter 13:45 see if you got a packet that's going to 13:47 Gainesville you'd rather use the 13:49 Gainesville rule versus the overall 13:52 Florida rule okay so the most specific 13:56 rule says if you have something that's 13:58 more specific than something else then 14:00 you'd rather use the most specific one 14:02 okay now this rule becomes somewhat 14:05 ambiguous because you may have filters 14:08 like this where you do have an overlap 14:11 so if you have a packet that's going to 14:14 let's say the destination address is 12 14:16 okay it's matched by this one is matched 14:19 by that but neither is more specific 14:21 than the other okay so if you're using 14:25 the most specific rule you have to be 14:27 careful that you don't have ambiguities 14:29 of this type okay so that leads to 14:32 problems like given a bunch of rules how 14:34 do you make sure you don't have this 14:36 kind of ambiguity in there all right the 14:42 one that's really used in practice for 14:44 router tables is the longest prefix rule 14:48 okay so if I go back to my previous 14:52 example 15:03 so if you have a that's a packet that's 15:07 matched by P 8 P 7 P 6 P 4 and P 1 then 15:13 the longest matching rule would say use 15:16 P 8 and that's the same as the most 15:19 specific rule okay because the address 15:24 is matched by P 8 are a subset of those 15:26 matched by P 7 which are a subset of 15:28 those matched by P 6 which are a subset 15:31 of those matched by P 1 which are a 15:33 subset of those matched by P 4 ok so the 15:37 longest matching prefix is really the 15:40 most specific rule translated into the 15:43 prefix domain and when you have prefixes 15:47 you cannot have ambiguities as far as 15:50 the most specific rule is concerned 15:52 because there's no way for two prefixes 15:54 to intersect and have and not have this 15:58 containment property ok so you can 16:01 verify that on your own ok all right so 16:05 typically then what we're dealing with 16:07 is the longest matching prefix and of 16:11 course our filters are prefixes ok 16:22 alright so that's I guess the example 16:25 that we just went through ok all right 16:28 static and dynamic router tables now in 16:32 the application we're talking about 16:34 performance is very crucial you need to 16:39 be able to if you want to keep if you 16:42 want to operate at the speed of the 16:44 transmission lines that are used in 16:47 networks you you have to be able to 16:51 process hundreds of millions of packets 16:54 a second and in order to process 16:57 hundreds of millions of packets per 16:59 second you need to have very very fast 17:03 fast data structures to do this the 17:07 trying to do it on a single high-speed 17:10 processor then you're looking at maybe 17:17 being able to afford of the order of 17:19 four five or six cache misses 17:23 so whatever structure you use is got to 17:27 be searchable very very quickly 17:29 okay small number of memory accesses now 17:33 it turns out you really can't get that 17:37 kind of performance with the known 17:39 structures today unless you limit 17:44 yourself to static structures so you 17:47 build a structure knowing the table 17:48 ahead of time and you optimize that 17:50 structure for search now in reality 17:55 you're a low tables never really static 17:57 because downstream nodes where you're 18:01 trying to send packets to might go down 18:03 in which case you've got to be able to 18:04 change the rules in your table or new 18:08 intermediate nodes come up you want to 18:10 be able to put rules into your table to 18:12 account for that so in reality the 18:15 tables do change or in the example of an 18:17 employee getting fired you have to 18:19 instantly be able to go in and update 18:21 your firewall okay so it's not static 18:25 okay so in reality things are not static 18:27 but for many situations you can kind of 18:32 behave like it is static okay and what I 18:35 mean by that is you have a structure 18:39 that is optimized for search and sure 18:41 there are changes taking place but you 18:43 don't record that change in this 18:46 structure okay well that means is that 18:50 at times you'll end up sending packets 18:53 to intermediate nodes that no longer 18:54 exist and that's really not a problem 18:57 because if you don't get an 18:59 acknowledgment back from a node you can 19:00 retransmit the packet okay so you can 19:03 have some failure recovery schemes 19:05 instead you accumulate changes as you 19:07 discover like some routers are down you 19:10 want to put in new rules take out old 19:11 rules you accumulate them in a shadow 19:13 structure which can be updated in batch 19:16 mode maybe every twenty minutes thirty 19:19 minutes whatever your 19:21 cycle timers and then you swap the 19:23 shadow structure with the working 19:25 structure with some periodicity so for a 19:28 while you're sending packets where you 19:30 shouldn't be but the alternative is 19:33 everything is slowed down all right so 19:36 so many of these structures that are in 19:41 use tend to be static okay or at least 19:44 semi static along the lines that I 19:46 described and they're optimized for 19:48 lookup okay you need to have reasonable 19:50 pre-processing time time required to 19:53 create the structure because with some 19:55 frequency you're going to be creating a 19:57 new structure and storage is important 20:01 you're an event of my storage because 20:03 then you can take this structure and you 20:05 can put it into high-speed memory 20:06 instead of into into larger low-speed 20:09 memory okay so the important things here 20:13 are lookup time you got to be able to 20:14 process several hundred million facts a 20:16 second and part of accomplishing that is 20:19 to use high speed memory which is 20:20 expensive so storage becomes important 20:24 and then of course you have dynamic 20:27 structures and dynamic structures every 20:29 time you get a change you got to 20:31 incorporate that change right away so 20:33 for example of this employee who was 20:36 going to get fired 20:37 you really can't wait 30 minutes from 20:40 the time you give him a pink slip 20:41 because in that 30 minutes all of your 20:43 corporate secrets would be transmitted 20:45 out of the company so you've got to be 20:48 able to change the rules instantly give 20:57 you some idea of the size of the tables 20:58 we're talking about 20:59 all right this table really maybe 21:03 reflects a few databases a few router 21:08 tables the size of tables actually 21:09 changes in time so if you went to a 21:12 particular ISP and you looked at the 21:14 router table at this point in time maybe 21:18 they've got 85,000 rules in there but if 21:21 you come back three hours later it might 21:23 have become 90,000 rules because the 21:25 rules were inserted or it might have 21:27 dropped down to 70,000 rules so they do 21:29 change in time even if they're working 21:31 in that semi static mode 21:35 the the largest tables that are out 21:37 there currently really of the order of 21:39 200,000 rules so this doesn't represent 21:42 the biggest tables just some sample of 21:45 tables if you represent these tables 21:50 using tries like we've been discussing 21:52 the last few lectures the number of 21:54 nodes in the tries tend to be roughly 21:56 two times the number of rules again the 21:59 analysis we did in class said that the 22:03 number of nodes in the worst case could 22:05 be the length of a key times the number 22:07 of keys now the length of the keys here 22:10 these are IP four tables are like 32 22:13 bits so you really don't get something 22:15 like 32 times then you typically get 22:18 like 2 times okay for a while the common 22:25 solution to router tables was to you was 22:28 a hardware solution which was using 22:30 something called ternary content 22:32 addressable memories and basically in a 22:36 ternary cam each bit can be set into one 22:39 of three states so in this case I've got 22:42 a ternary cam which has been organized 22:43 into two four six words each word being 22:47 two four five bits long okay 22:50 each bit can be either a 0 or a 1 or a 22:53 question mark question mark means don't 22:54 care so this setting for example would 22:58 represent the prefix 0 0 1 0 stop in a 23:03 scheme of destination addresses being 23:07 five bits long okay this would represent 23:10 the prefix 0 1 star now if you're doing 23:14 really IP 4 23:15 these have to be 32 bits long okay all 23:21 right so you you initialize your ternary 23:24 cam with the prefixes that you have and 23:27 then when you get a packet you look at 23:30 the destination address and you load it 23:31 into a register ok you press a button 23:35 and the hardware searches all of its 23:39 words in parallel for matches with that 23:42 ok so in one cycle of the cam the 23:45 matching fellows light up in this case 23:47 these 23:47 three fellows like that okay so you got 23:52 a parallel search of all the woods then 23:56 the cameras set up so that it's got 23:59 hardware to select one of the fellows 24:02 that lights up and the one that is 24:04 selected is the first one in the table 24:07 okay so you can think of it as like two 24:11 cycles of the cam or one big cycle of 24:13 the cam initially all the words that 24:16 match light up and then the first one 24:18 remains letting everybody else dies off 24:21 and this becomes the selector alright so 24:27 you can use this for first word and 24:30 table match okay you can use it for 24:32 longest prefix match by organizing this 24:35 table in order of prefix length you can 24:39 use it for highest priority match by 24:40 putting in the prefixes by priorities 24:43 okay so search is very quick it's 24:46 running at hardware speed inserts and 24:51 deletes a little bit problematic so if 24:54 you're doing longest prefix match in 24:55 ipv4 then you've really got thought you 24:58 could have let's say 32 blocks okay if 25:01 you ignore a zero length prefix so here 25:04 you could have prefixes of length 32 25:05 than 31 30 and so on and if you want to 25:09 insert a new prefix of length 32 you 25:12 could put it in at the end of the 32 25:14 block by putting it here you'll have to 25:17 take out the first 31 fellow and move 25:20 that by replacing the fellow at the top 25:23 of the 30 block and the 29 block in 28 25:26 block okay so what about 32 moves you 25:29 could affect an insert or you could do a 25:34 delete in about 32 moves okay but that's 25:36 a lot of moves but certainly as far as 25:40 search goes looks like a great way to go 25:43 okay now there are problems with using 25:49 this with very large tables first verse 25:55 comes at the capacity okay so if you 25:56 want to represent two hundred thousand 25:59 prefix says an IP 26:01 for you got to have a cam with 200,000 26:04 times 32 bits capacity of cams tends to 26:09 be around 100,000 prefixes there's a 26:15 cost associated with it you can imagine 26:18 these things are expensive and there's a 26:21 lot of hardware in there for the 26:25 arbitration to find the first one that 26:27 matched and also to find to get all of 26:29 these guys to be searched in parallel 26:31 there's a lot of power being consumed 26:33 which means a lot of heat being 26:34 generated they take up a lot of space on 26:36 your boards and if you want to go to ip6 26:40 okay so now instead of 32 bits the 26:43 prefix you got hundred and twenty eight 26:45 so if you could handle hundred thousand 26:48 prefixes or IP four you can only do 26:50 25,000 in IP six the capacities going 26:53 down to one if you want to handle ranges 26:56 then you got to take each range and 26:58 decompose it into a bunch of prefixes 27:00 okay so that gives you an expansion in 27:03 the size of the table so very few ranges 27:06 can be handled if you go into multi 27:08 dimensions so like if you have source 27:11 and destination 27:12 well then each entry in IP four instead 27:15 of 32 bits becomes 64 bits so so there 27:21 are problems with scalability ranges and 27:24 going with multi dimensions but there 27:26 are also other problems in terms of the 27:28 power consumed and their constant all 27:33 right so there are many other solutions 27:34 that people have pursued which range 27:37 from well perhaps which all start 27:40 somewhere in software and sometimes that 27:42 software gets implemented in hardware 27:44 maybe as an ASIC to get you better 27:47 performance 27:51 right the solutions which are more 27:55 sufferer oriented there's a wide range 27:57 of them things using hash tables to give 28:00 you good perform expected performance 28:02 solutions using binary search trees 28:05 using B trees using privately search 28:08 trees that we're going to talk about in 28:09 a subsequent lecture but the most 28:12 dominant structure that you see out 28:14 there is really tries so let's start by 28:20 looking at a simple try solution we'll 28:22 call this a 1-bit try because here at 28:25 each level of the try 28:27 we'll be branching by looking at one bit 28:28 of the destination address and then we 28:31 can generalize this to higher degree 28:33 tries like we did in our last lecture 28:35 okay so no one bit try we're going to be 28:41 representing keys that are of different 28:43 lengths okay so you've got a key here 28:45 which is of length to forget the start a 28:47 key of length three key length one and 28:50 zero so these prefixes are going to get 28:52 stored in a one bit try these are 28:54 different length prefixes and we're not 28:57 going to use the trick of appending a 28:59 special character at the end the 29:03 strategy is going to be that in each 29:04 node of the try will have capacity to 29:06 store two prefixes okay and the root 29:10 level will store prefixes of length one 29:13 it's children can store prefixes of 29:15 length two prefixes of length three four 29:18 five six seven and so on okay so each 29:22 level is able to store a prefix of a 29:24 particular length so p5 stored here p5 29:31 is going to be 0 star this has to be one 29:34 star if you begin with a 1 you go on 29:36 this side if you begin with a 0 you go 29:38 on that side okay so this one here's got 29:40 to be 1 0 star there's no 1 1 star so 29:44 this part is empty this has to be 1 1 1 29:48 star ok this is coming here it's a 1 29:53 that's a 0 0 ok it's 1 0 0 star ok so 29:58 depending upon whether you're a 1 or a 0 30:00 you can find your way through and if 30:03 once you get to 30:05 appropriate level you just Park yourself 30:07 in that spot if you want to search for 30:14 the longest length prefix here it's 30:16 fairly easy okay if you're given a 30:19 destination address you follow the path 30:22 given by the address if the first bit in 30:24 the address is one you go this way if 30:27 the next bit is zero you come here after 30:28 the next bit is zero you come here if 30:30 it's zero you come there if it's one you 30:32 go then okay all of the matching 30:36 prefixes will be found on this path okay 30:40 so you follow a path given by the 30:42 destination address and every matching 30:44 prefix on the table is found on this 30:45 path the longest matching prefix is the 30:48 last prefix you see on this path okay so 30:52 you keep track of prefixes as you see 30:53 them and the last one you saw is the 30:56 answer to the search you can insert the 31:04 prefix using the same strategy we know 31:07 how to insert into a try it's fairly 31:09 simple and we can remove a prefix with 31:12 great simplicity in fact the complexity 31:17 of all of the operations is order of W 31:20 where W is the length of the longest 31:22 prefix it's an IP for you'd be running 31:27 in about 32 units of time the 31:34 unfortunate thing about this is as 31:36 you're going down okay when you're 31:39 coming out here to decide on the next 31:43 node to move to I go to access this 31:44 field from memory okay so I know which 31:48 field to get whether to get the left 31:49 child or the right child field because I 31:51 know the bit in the destination address 31:53 so by looking in that bit I can figure 31:56 out which field I want I get it and then 31:58 I can move down here but now we go to 32:00 make another access okay so if you're 32:05 dealing with IP four you got to make 32:07 about 32 memory accesses in the worst 32:09 case in order to do a search and that's 32:13 too many memory accesses 32:17 okay alright so to get better 32:23 performance we're going to go into 32:27 higher order trice now the interesting 32:32 thing about the high order tries they 32:34 get used in router tables is that 32:36 they're not the same kind of high order 32:39 tries we studied in our last lecture in 32:43 the last lecture when we talked about a 32:44 try whose degree was let's say 10 like 32:47 the Social Security try well every 32:49 single node had that degree here we're 32:52 going to vary the degree of nodes so 32:55 some nodes might be of degree 2 some may 32:58 be of degree 4 some may be of degree 16 33:01 okay so it's really a variable degree 33:04 try and you have two versions here okay 33:08 oops 33:10 all right so we've called these multi 33:11 but tries because we're going to use a 33:14 different number of bits to branch on if 33:15 a node has degree 2 we'll branch on one 33:18 bit if a node has degree 4 we'll branch 33:21 on two bits if a node has degree 8 will 33:23 branch on eight bits of 3 3 bits of the 33:25 destination address now 33:29 alright so branching is done using one 33:31 or more bits where we have two versions 33:33 of multi bit tries one is a fixed tried 33:35 try and this has the property that if 33:39 you look at nodes on the same level so 33:42 all the children of the root well 33:44 they're all of the same degree they all 33:45 use the same number of bits all the 33:48 grandchildren use the same number of 33:50 bits but that could be different from 33:52 the number of bits used by the children 33:53 of the root okay so the number of bits 33:56 used varies by level but not within a 33:59 level again that's called a fixed tried 34:01 try then you have a variable stride try 34:04 where you relax that and say well the 34:07 number of bits can vary within a level 34:08 as well as between levels okay all right 34:14 so let's look first at fixed tried tries 34:21 the number of levels we need in a try 34:25 okay to figure that out we've got to 34:29 have a place for every prefix okay and 34:33 if you're looking at a fixed tried try 34:36 so if the routes you're going to be 34:40 branching on two bits well then in the 34:42 route I'll be able to store only 34:43 prefixes of length two and if the 34:47 children of the route of branching on 34:49 three then in the children I'll be able 34:52 to store prefixes of length five the two 34:54 bits used at the root in the three bits 34:56 being is the child the number of levels 35:00 that you need and a fixed right try to 35:03 house all the prefixes you have is 35:04 actually equal to the number of 35:06 different lengths in your prefixes and 35:10 if you want to reduce the number of 35:13 levels what you have to do is you have 35:15 to somehow control the different lengths 35:18 and to control the different lengths we 35:21 use a technique called prefix expansion 35:24 and what prefix expansion is if I look 35:27 at this stuff here this is what I had 35:29 before 35:29 well here there are seven different 35:31 lengths okay you got prefixes of length 35:35 one two three I don't have a four I 35:38 suppose I've got a five six out of seven 35:41 okay now I can come up with an 35:46 equivalent table which has only three 35:50 different lengths by expanding some of 35:52 the prefixes so I'm I'm going to have 35:54 lengths of two five and seven okay so 35:59 all the other lengths have to vanish now 36:01 I've got a prefix here of length one 36:04 that's not permissible because I'm going 36:05 to the smallest length eggman allow us 36:07 to but the prefix 1 star is equivalent 36:12 to having two prefixes 1 0 star and 1 1 36:14 star okay so I replace this with two 36:18 prefixes 1 0 star and 1 1 star I got a 1 36:22 1 star here when I replace it with 1 0 36:25 start kind of collides with this fellow 36:28 I can't have two copies in my thing but 36:32 between a 1 0 star coming 36:35 here in a 1-0 star from here this guy's 36:37 going to win because this is a longer 36:38 length prefix okay so in this case my 36:42 expansion of this didn't increase the 36:44 number of prefixes when I expand this 36:46 fellow I get 0 1 star and 0 0 star and 36:50 they don't collide with anybody so you 36:52 end up with 2 copies now here a p5 a and 36:56 a few 5b ok so you can expand prefixes 37:01 of a length that don't exist in your try 37:03 to the next available length if the next 37:06 available length is 1 more you expand to 37:08 two prefixes if it's 2 more you expand 37:10 it to 4 if it's 3 more you expand into 8 37:14 ok so by using prefix expansion I can 37:17 control the number of lengths in my 37:20 table and with that I can control my 37:24 number of levels in the try so this 37:27 would translate into something like this 37:31 the route we're going to branch on 2 37:34 bits then the children in the root 2 37:37 branch on 3 bits if you branch in 2 bits 37:40 you have 4 pointer fields and for data 37:42 fields if you branch on 3 bits you have 37:44 8 point of views and 8 data fields and 37:47 then the grandchildren branch it to bits 37:50 ok so here you can store a prefix of 37:53 length 2 here are prefixes of length 5 37:55 and then here prefixes of length 7 now 38:01 if you only search this fellow okay well 38:06 you search this fellow I'll first look 38:08 at two bits of the destination address 38:10 and knowing the starting point of the 38:14 root I can compute the field that I need 38:16 and make one memory access to get that 38:18 field okay then if it's this one the 38:22 data stored here gives me the starting 38:24 point of this node using the next three 38:27 bits I can figure out which field here 38:31 is needed okay so it doesn't matter that 38:34 this node doesn't fit in a cache because 38:35 we're not going to bring the whole init 38:36 node in I'm going to get the three bits 38:41 from my destination address use the 38:43 start address given here by this pointer 38:45 and figure out which field 38:48 need bring that field in and then 38:51 that'll give me the next pointer okay so 38:54 so long as each field fits into one 38:56 cache line and the number of memory 38:58 accesses needed is equal to the height 39:02 of this structure okay all right so now 39:06 instead of seven memory accesses I can 39:08 search that table with three memory 39:10 accesses so when you're designing fixed 39:21 droid tries you're faced with a design 39:25 problem really and the design problem 39:27 says I first do an analysis which says 39:30 I'm going to run this router on this 39:34 particular computer or on this chip or 39:37 whatever and this chip is going to 39:39 support memory accesses at this rate I 39:42 need to be able to process 200 million 39:44 packets per second and so I can afford 39:47 only four memory accesses per packet so 39:52 by doing that kind of analysis you get 39:55 something which says I need a try whose 39:57 height is at most cakes and my example K 39:59 was 4 so subject to that constraint and 40:03 given my table I want to find the best 40:07 fixed right try best meaning it takes 40:09 the smallest amount of memory okay 40:11 depending upon how you pick the strides 40:13 you get different amounts of memory so 40:16 for example I could pick the stride of 40:18 the route just 32 you need only one 40:21 access okay but you need to raise to 32 40:25 fields so depending upon how you 40:29 structure this you could need different 40:31 amount of memory even if you look at 40:34 within things of height exactly three 40:36 you can have a stride of two here three 40:39 here and two there but I could have had 40:41 a stride of three here or two there and 40:43 two there instead okay and those would 40:45 require different amount of memory okay 40:47 so I want to pick the strides of the 40:49 levels having at most K levels and using 40:53 the smallest amount of memory 40:59 if you go into I think I'm going to 41:04 probably skip this part so that we can 41:06 finish on time you can set up a 41:09 formulation using dynamic programming 41:11 using concepts of different levels of 41:14 the try covering levels of the of the 41:18 initial one bit 41:19 try that you started with and you can 41:23 set up dynamic programming equations let 41:27 me just skip all of this stuff okay and 41:32 you can solve these equations to figure 41:34 out the optimal try spending about this 41:38 much time case the is the height of the 41:42 try that's it committed and W is the 41:45 length of the longest prefix okay so you 41:49 can find the optimal strides in a fairly 41:52 small amount of time now there are 41:57 multiple dynamic programming 42:00 formulations you can come up with I 42:02 don't really go through the details of 42:04 this but there's another one which is a 42:06 little more complex and this one runs 42:11 about two to four times as fast as the 42:13 previous one okay now what's interesting 42:18 about this whole topic of using tries in 42:23 the router and firewall area is there's 42:27 a huge amount of dynamic programming 42:28 that gets used to build the optimal 42:31 structures but if you look at the amount 42:36 of memory that's taken by the fixed 42:38 tried try okay for the sample databases 42:42 we had up there if you allowed two 42:46 accesses okay then it's pretty big okay 42:50 but if you go from two to three there's 42:52 a huge drop in the amount of memory 42:54 that's needed okay so that seems to kind 42:56 of be a critical place certainly we go 42:59 from three to four there's another big 43:01 drop but not really as much it's out 43:02 there and once you go beyond four 43:05 there's not a whole lot of change in 43:07 particular when you're around six and 43:09 seven its flattens out 43:11 and what that tells you is that by 43:15 paying virtually no memory penalty if 43:18 you're flattening out this is what you 43:20 were taking you and when you had 43:21 thirty-two accesses which meant one bit 43:23 for the traditional one bit try okay so 43:28 by using about the same amount of memory 43:30 as used by the simple one but try using 43:32 the scheme here you can reduce the 43:34 number of memory accesses from 32 to 43:37 about six so you get a factor of five 43:39 speed-up without paying any memory 43:41 penalty alright this just talks about 43:48 runtime let's forget that you can go 43:51 into variable stride tries so in a 43:54 variable stride try just said nodes are 43:58 the same level can have a different 44:01 number of bits that they branch on so 44:04 over here I'm branching on two bits 44:06 two bits means I'm covering these two 44:10 levels of my original try then on this 44:13 side I got five bits so I'm going to 44:15 cover all five of these levels are going 44:19 to be mapped into this fellow and here 44:21 I've got three bits so I'm going to 44:23 cover these three fellows okay another 44:27 way to look at it is prefixes of length 44:28 two prefixes of length seven provided 44:33 they begin with 1 0 and then prefixes of 44:37 length 5 provided they begin with 1 1 44:42 you get the same optimization problem 44:46 here find the strides for all of the 44:49 nodes ok number of bits being used to 44:52 branch on on every node out here so that 44:55 you minimize the total memory against 44:58 subject to a height constraint so what 45:02 you would expect here is that the 45:06 optimal variable stride tries would take 45:10 less memory than the optimal fixed tried 45:12 tries for the same number of memory 45:15 accesses 45:16 all right so you can solve this using 45:18 dynamic programming I will skip the 45:21 formulation okay and actually you can 45:32 get clever about the algorithms and come 45:33 up with better algorithms for the case 45:36 where the two accesses are allowed and 45:38 also for the case of only three accesses 45:40 are allowed okay if you look at the 45:50 experimental studies okay this is a 45:52 fixed tried try out here okay sorry this 46:00 is the fixed tried try okay then you go 46:03 and do variable stride 46:04 the butler is just a technique again 46:06 that we talked about when we were 46:08 dealing with tries which says that 46:10 instead of using the Tri structure all 46:12 the way down to one key you may stop 46:15 when you have let's say five keys or six 46:16 keys and put them in a bucket okay and 46:19 so the same thing is done out here you 46:22 can have memory savings by stopping when 46:25 the number of keys is say small enough 46:27 to fit in a cache line you just don't 46:30 keep branching all the way down to one 46:31 key so that's a butler node in this 46:34 particular field 46:36 so with two accesses you can get a huge 46:41 amount of savings over fixed right tries 46:43 with three accesses you can still get 46:46 good amount of saving and then the 46:48 amount of memory saved decreases as you 46:51 go to higher numbers so when you get to 46:53 save seven or so there isn't a whole lot 46:57 of improvement you know variables try to 46:59 try over a fixed ride try okay 47:05 it's just spent a few minutes talking 47:06 about higher dimensions now even though 47:10 filters could go up to five dimensions 47:13 really for about the largest filters the 47:17 interesting case is really looking at 47:19 two dimensions okay a couple of reasons 47:22 for that 47:22 one is that if you just look at the 47:27 source and destination and then 47:28 everybody all filled 47:30 that have the same source prefix and 47:31 destination prefix put them into an 47:33 equivalence class well then these 47:35 equivalence classes tend to be very 47:36 small in practice so they tend to be 47:39 only like five filters in them so you 47:41 can put them all in a bucket and search 47:43 them serially okay so for practical 47:47 tables if you use this truck just play 47:51 with two and regard equivalence classes 47:54 where the source and prefix are the same 47:56 you get a very good solution okay and 48:00 there's no point trying to differentiate 48:03 or have a sophisticated structure on the 48:04 other three dimensions the other thing 48:07 is depending upon how people resolve 48:09 security issues that are being 48:11 considered at least for general purpose 48:15 routers you may not have access to 48:17 anything other than the source and the 48:18 destination so if you're working on an 48:21 intra enterprise router let's say you're 48:24 doing it internally within a company you 48:26 can do whatever you like and even read 48:28 the text or the payload of a packet and 48:31 use that for routing but once you get 48:33 outside you will not be permitted to 48:34 read payloads and some people are 48:37 arguing pretty locally that you 48:39 shouldn't be allowed to see things like 48:40 sauce protocol the source port and 48:43 destination port and so on either okay 48:45 so there are reasons to focus on two 48:47 dimensions and we've skipped this so 48:52 when you get to two dimensions you could 48:55 certainly extend tries to two 48:56 dimensional tries okay so what I've done 48:59 out here is I've taken the destination 49:03 prefixes which are the first ones out 49:04 here and represented them with a one bit 49:08 try up here okay then all of the filters 49:12 they have the same destination prefix 49:15 get mapped into a secondary try based on 49:18 the source prefix so for example this 49:22 would be where you'd store 0 stop ok so 49:26 0 Star has F 1 with this this with this 49:30 sauce prefix F 2 F 3 and F 7 okay so 49:35 those will show up here using the 49:38 destination so I've got an F 7 and F 1 F 49:41 2 and F 3 49:43 okay so you can build a two-dimensional 49:46 try in this fashion and of course you 49:53 could extend this to multi-bit tries 49:55 okay so for example here the destination 49:58 try here your branching on one bit two 50:01 bits one bit and then the source tries 50:04 here I'm branching on one bit here on 50:06 three bits here I've just got a single 50:09 root in this source try branching on two 50:11 bits I've got a two-level source try 50:13 here branching on one bit and two bits 50:16 okay so this leads to an even bigger 50:20 optimization problem you're told how 50:22 many memory accesses are allowed for the 50:24 search build a try that can be searched 50:27 using at most that many build a 50:29 two-dimensional try they can be searched 50:31 with that most that many accesses again 50:37 you can formulate dynamic programming 50:39 algorithms to build these things the 50:42 dynamic programming algorithms known 50:43 here don't find the true optimal they 50:45 find the true optimal under certain 50:46 restrictions but even those restricted 50:50 ones similar to what we had for 50:54 one-dimensional case without paying any 50:57 memory penalty over the one bit tries 50:59 you can reduce the number of accesses 51:01 somewhere between 1/2 to 1/4 okay but if 51:06 you're willing to pay a memory penalty 51:08 use about 50% more memory you can 51:10 improve the performance by a factor of 51:12 between 4 to 8 ok all right so as far as 51:19 price go there are several applications 51:22 of tries an application that has seen a 51:25 lot of research over the last several 51:27 years is in the design of router tables 51:30 firewalls packet classifiers and stuff 51:32 and that area has dealt a lot with multi 51:36 bit tries with a very heavy use of 51:39 dynamic programming there's a lot more 51:42 to tries than I've been able to get 51:44 through in this one hour 51:45 you worry about say succinct 51:47 representation of Christ compact 51:49 representation because I guess memory is 51:54 very important in this application 51:56 and while we've optimized here using the 52:01 multi bit schemes there are other 52:04 strategies you start with a one bit try 52:06 and figure out some compact way to 52:08 represent it and we haven't talked about 52:11 those okay all right any questions okay