00:00 The following content is provided under 00:01 a Creative Commons license. Your support 00:04 will help MIT OpenCourseWare continue to 00:06 offer high-quality educational resources 00:08 for free. 00:09 To make a donation or view additional 00:12 materials from hundreds of MIT courses, 00:14 visit MIT OpenCourseWare at ocw.mit.edu. 00:30 So my name is Hari Balakrishnan, 00:32 and I'm gonna take you through the rest 00:34 of 6.02, doing the remaining lectures 00:36 in the class. So far in 6.02, what we've 00:40 looked at are ways in which we design a 00:44 single communication link. So we know how 00:47 to take two computers or two nodes and 00:53 design what a link between them might 00:56 look like, and this link might be an 00:57 actual wired link, or it might be a radio 01:01 link, or it might be an acoustic link. 01:04 There's some medium over which these two 01:06 guys communicate, and the main ideas 01:08 we've looked at have to do with coding, 01:12 in particular channel coding, which is a 01:14 strategy to combat noise and errors that 01:17 might show up on the channel. And then, in 01:20 order to match what we communicate 01:24 to the characteristics of the channel, 01:26 for example the ability of a 01:27 channel to deal in sinusoids, we studied 01:31 modulation and demodulation. So those are 01:37 the two main elements that we studied, 01:39 and in both of these we looked at 01:41 how you do this to achieve reliability, 01:44 because ultimately we want to 01:45 communicate information in a way that's 01:47 reliable, and do it efficiently. In 01:51 particular, with modulation we looked at 01:53 a scheme to share a medium amongst 01:55 multiple conversations, 01:58 frequency division multiplexing, which is 02:00 the topic of one of the tasks on this lab. 02:02 And with coding we looked at ways in 02:03 which you do this coding in a way 02:05 that isn't just replicating every bit 02:07 but involves some, you know,
linear 02:10 algebra operations that allow you to 02:12 gain efficiency. So the rest of the class 02:15 is really about taking for granted our 02:19 ability to design communication links 02:21 and putting them together and composing 02:23 them to build networks. So the basic 02:25 problem is actually very, very easy. The 02:28 problem is you're given a set of nodes, 02:31 let's say computers, and the problem you 02:38 want to solve is to come up with a way 02:39 by which you can allow any computer or 02:41 any phone or any device on this network 02:43 to communicate with any other device on 02:45 the network. That's the problem. So you're 02:49 given n nodes and you want all-to-all 02:54 communication. Now this is a little 03:00 different from the 03:02 other networks you could design. You 03:04 could design a network where you're 03:05 given n nodes and you have one-to-n 03:06 communication: there's one transmitter, 03:08 many, many receivers, and you want to 03:11 design a network for that purpose. What's 03:13 an example of a network where you have 03:14 one transmitter, many receivers, and you 03:15 just want to build something that makes 03:17 that work? Radio is one example; 03:20 television is another example. And those 03:23 are good examples. In fact, for those 03:26 kinds of one-to-many networks, 03:28 or k-to-n networks where you 03:30 have k sources of information and n 03:32 receivers and k is a lot smaller than n, 03:35 it'll turn out that the basic frequency 03:37 division approach makes sense. I mean, 03:39 that's how radio stations or TV stations 03:41 work. In the U.S., the Federal 03:43 Communications Commission has decided to 03:45 allocate different chunks of frequency 03:47 to different TV stations and different 03:49 radio stations, and the assumption is 03:51 they're always going to be using it. It 03:53 turns out that assumption may or may not 03:54 be true, but under the assumption that 03:56 they're always going to be
using it and 03:58 you have many, many, many receivers, you 03:59 just divide up frequencies and allot 04:01 them each to transmit in their own 04:04 frequencies, and then you have a receiver 04:05 that's capable of tuning to different 04:07 frequencies, and you get the information 04:10 of the channel that you want. We'll 04:12 actually come back to that problem a 04:14 little bit in the next two lectures. But 04:17 for today, and going 04:19 forward, the design problem you should 04:21 have in mind is you want a network where 04:22 you have all-to-all communication and you 04:24 want to be able to support any 04:26 application. 04:28 This is a big deal. We're not just 04:31 designing a network to allow telephone 04:33 calls to work, and we're not just 04:34 designing a network that allows you to do 04:36 video conferencing. We're trying to 04:38 design a network where any application 04:40 can run on it, in particular applications 04:42 that you might not have envisioned. This 04:45 is the reason why the Internet works 04:46 really well: when they 04:48 designed the Internet, they designed it 04:50 under some set of assumptions, but they 04:52 were really, really smart to design a 04:54 network that made minimal assumptions 04:56 about the application. So it's a network 04:59 that's good enough for almost any 05:00 application, though it isn't perfectly 05:05 optimal for any application. It's just 05:07 good enough for everything. And that's a 05:09 really good characteristic of a 05:10 well-designed network: 05:12 it can work even for things you 05:14 didn't even dream of. When they 05:16 built the Internet, they certainly didn't 05:17 dream that the web would exist; 05:19 they didn't dream that, you know, people 05:20 would be tweeting and telling 05:23 people they're going to the bathroom or 05:24 whatever they do on Twitter. I mean, they 05:26 designed a network, and it's just kind of 05:29 amazing that all these
applications 05:30 can work. So the question is, 05:32 what did they do right, 05:34 and what 05:37 general lessons can we learn from it? And 05:39 the general high-level lessons you learn 05:41 actually apply to any system you build. 05:43 It'll turn out that whenever 05:45 you're confronted with a real-world 05:46 problem in industry or research or 05:48 wherever, very often you're trying to 05:51 make decisions on what you need to be 05:53 doing, and it's very tempting to make 05:56 decisions based on what you think it's 05:58 going to be used for. But very often what 06:00 you end up eventually using it for is 06:03 very different from what you thought in 06:04 the beginning. So it's good to have 06:05 applications in mind, but it's good not 06:07 to embed too much about those 06:08 applications into the design of networks. 06:10 So the high-level principle here is how 06:12 you can do something that works well 06:15 enough without making too many 06:17 assumptions about what's running on top 06:18 of it. 06:20 There are two big themes. 06:22 They're the same two themes that we 06:23 studied before that we're going to keep 06:25 coming back to. The first is efficiency, 06:28 and the second is reliability, the same 06:39 two themes we come back to over and over 06:40 again. There's a third important theme 06:42 about network design, which has to do 06:44 with scalability: how can you make 06:46 it work so this network can work for 06:49 millions or billions of devices and 06:51 billions of computers? That's a topic 06:53 we're not really going to talk about. 06:54 I'll get to it in the last lecture, but 06:57 6.033 and 6.829 06:59 will talk about those issues. So let me 07:01 start first with efficiency. If I tell 07:05 you how to build a communication link 07:07 that can communicate between any two 07:09 devices or any two computers, it should 07:12 be pretty straightforward to now
design 07:13 a network that allows all-to-all 07:15 communication. Something... oh, that's a 07:20 mouse. Great. 07:26 One way you can design this network is 07:28 to simply take your communication link 07:30 that we know how to build and do this: 07:32 just connect every pair of computers, or 07:35 every pair of nodes, to each other. I'm 07:37 probably missing a few of these, but this 07:42 is a great network design because it 07:46 composes a bunch of links to build a 07:48 network. So why don't we do this? Or maybe 07:51 we should do this, right? 07:58 It's too expensive. Why is it too 08:00 expensive? Sorry, 08:03 how many of you are in Professor 08:06 [inaudible]'s recitation? Great. I understand 08:09 he gives you guys money to answer, or if 08:11 he makes a mistake. I'm gonna do the same 08:13 thing: whenever I make a mistake, 08:15 Professor [inaudible] will give you some money. 08:20 So actually, I mean, don't hold me to 08:23 this. But why don't you guys answer? This 08:26 is pretty straightforward: how many links 08:27 do you need? n choose 2; it's about n 08:31 squared, right? So n squared, depending on 08:34 the context, may be too big or too 08:35 small, but on the order of n choose 2, n squared 08:37 links, it turns out that's actually a 08:39 pretty large number of links. 08:41 The notes talk about some of 08:44 the reasons why this is too expensive, 08:46 but the other reason it's a problem is 08:48 that, you know, it's one thing to design a 08:49 network where every computer in this 08:51 room can talk to each other, and 08:52 conceivably we might get 08:53 tangled up in all these wires, but we 08:55 could imagine laying wires between every 08:57 pair of our computers and communicating. 08:59 But there are two reasons this is a big 09:01 problem. I want to communicate with 09:02 computers in California or China or 09:04 wherever, and individual links 09:07 going across the world, you know, my 09:09 computer to China or your computer to 09:10 another
computer in China, just doesn't 09:12 scale; it doesn't work very well. 09:14 The second problem, the reason why this 09:16 issue matters, is that not all 09:18 communication links are wires. In fact, 09:20 right now the most dominant 09:23 mode by which people gain access to the 09:25 Internet, including right now in this 09:26 room, is through radio; it's 09:29 through wireless, and this is a shared 09:30 medium. So it's not like we can 09:33 somehow put these wires 09:35 together. We're gonna have to share this 09:37 communication medium, we're gonna have to 09:39 share this communication network, and 09:40 somehow we have to come up with a 09:42 strategy to do this efficiently. 09:44 There's a few different principles 09:46 involved in how you design networks, but 09:48 the main one is that we're going to 09:50 construct a special computer called a 09:53 switch, 09:56 and a lot of what we're going to be 09:58 doing has to do with what we do in the 10:01 switch. The other part of what we're 10:03 going to be doing is what we do in the 10:05 computers themselves. So our network is going 10:07 to be designed using a set of rules that 10:09 are obeyed and implemented and followed 10:12 by the computers: a special set of 10:14 rules that are implemented by these 10:16 computers called switches, and a special 10:17 set of rules that are implemented by the 10:19 end computers, by the devices on the 10:20 network, and together they're going to 10:22 make our communication work. So the 10:25 high-level plan is going to be that we 10:26 take these computers, and rather than put 10:28 wires between every pair of them, we're 10:30 going to connect them together. 10:32 Perhaps there's lots and lots of 10:34 computers, and many of them get connected 10:36 to one of these boxes, which is a switch, 10:39 and a switch may connect to other 10:42 switches, and some of these switches may 10:47 have other computers attached to them, 10:49 and then eventually you
might get to 10:55 other end computers. And when you 10:58 build a network like this, a structure 11:00 like this, this kind of a picture is 11:02 called the network topology. 11:05 A switch has one or more links attached 11:10 to it. These links could be wires; they 11:13 could be shared things. Like, this 11:16 thing here is a switch. It has no visible 11:20 links, but it probably has one wired link 11:23 connecting it via Ethernet to the rest 11:24 of the MIT campus, and out here, you know, 11:27 lots of computers right now are 11:28 connected to it. It gives the illusion 11:30 that each of your computers has a 11:31 separate link to the switch, and we'll look 11:33 at how that illusion is maintained 11:35 next time, next lecture. But this is 11:38 an example of a switch. Probably the 11:39 world's... you know, this thing is made, I 11:41 think, by Cisco, so they charge, you know, 11:43 six or eight hundred dollars for it, but 11:45 really you can buy it for 11:46 forty bucks. When you put the word 11:49 "Enterprise" next to anything you sell, you 11:50 pay the price. But anyway, the world's 11:56 cheapest switches are Wi-Fi access 11:57 points. So you connect the stuff together 12:00 into a topology, and the job of the 12:02 switch is to look at messages that come 12:04 in from these links and figure out 12:07 what to do with those messages, and make 12:09 sure that together they coordinate to 12:12 get messages to the destinations to 12:15 which you wish to send those messages. So 12:18 here's the picture that I got today 12:21 from MIT IS&T, which is the picture 12:26 of MIT's network. So I just want to give 12:28 you a sense for what this looks like for 12:33 a campus like MIT. So the first thing to 12:36 notice is that it's got 12:38 some redundancy built in. You don't see 12:39 it in the picture, but really what's 12:41 going on here is that we have these two 12:42 routers here. In the context of the 12:46 Internet, these switches are also
called 12:48 routers. It's taken me 10 years to 12:50 pronounce it "router," because where I was 12:52 brought up 12:53 they pronounced it "rooter," and many people 12:55 say that, but in the U.S. they say "router." 12:57 So anyway, these routers here, there are 12:59 two backbone routers, and 13:01 each of these guys, these other 13:03 routers in these different buildings, is 13:06 connected actually to both of these. So 13:08 the idea here is that if one of those 13:09 links were to fail, or if one of these 13:11 were to fail, the other guy would take 13:13 over and handle this traffic. Under 13:16 normal conditions, traffic is kind of 13:18 balanced between these two different 13:20 routers, so some 13:21 of these other routers are connected to 13:23 one of them, some of the other routers 13:24 are connected to the other, and together 13:26 they work to provide connectivity. These 13:29 backbone routers get connected to these 13:30 things that are called external routers, 13:32 which are routers that connect to 13:35 various other networks and Internet 13:36 service providers that MIT uses. MIT is 13:41 extremely well connected. The amount of 13:42 bandwidth coming in and out, as you might 13:44 have noticed doing, you know, I don't 13:46 know, BitTorrent or whatever the cool 13:47 people do these days with networks, 13:49 is phenomenal. MIT commercially uses 13:55 Sprint, which is an Internet service 13:56 provider. It uses Level 3, which is probably 13:58 the biggest Internet service provider in 14:00 the U.S.
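The redundancy argument above, where each building router connects to both backbone routers so that either one can fail without cutting a building off, can be sketched as a small reachability check. This is only an illustrative sketch under stated assumptions: the router names (`backbone1`, `bldg_a`, `external`, and so on) and the helper `reachable` are hypothetical and are not taken from MIT's actual network or from the lecture.

```python
from collections import deque

def reachable(links, src, dst, failed=frozenset()):
    """Breadth-first search: is there a path from src to dst avoiding failed nodes?"""
    if src in failed or dst in failed:
        return False
    # Build an adjacency map from the undirected link list.
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen and nxt not in failed:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical topology: two building routers, two backbone routers,
# and one external router reachable from both backbones.
buildings = ["bldg_a", "bldg_b"]
backbones = ["backbone1", "backbone2"]
uplinks = [(bb, "external") for bb in backbones]

# Each building router is connected to BOTH backbone routers.
dual_homed = [(b, bb) for b in buildings for bb in backbones] + uplinks
# For comparison: every building hangs off backbone1 only.
single_homed = [(b, "backbone1") for b in buildings] + uplinks

for name, links in [("dual-homed", dual_homed), ("single-homed", single_homed)]:
    ok = all(reachable(links, b, "external", failed={"backbone1"}) for b in buildings)
    print(name, "survives loss of backbone1:", ok)
```

With the dual-homed links every building still reaches the external router after `backbone1` fails; with the single-homed links it does not, which is the point of connecting each building router to both backbones.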
This thing called PAETEC is, I 14:04 found... so MIT now does telephony 14:06 through the Internet, so it's voice over 14:09 IP as opposed to the old telephone 14:11 system, so a lot of that voice 14:14 traffic goes through that network 14:17 service provider. Other things here: this 14:21 NoX, I think it stands for the 14:24 Northeast Crossroads or something like 14:26 that. It connects to a network called 14:28 Internet2, which is the network 14:30 connecting many universities in the U.S., 14:31 and it's a very, very high-bandwidth 14:33 network. And so, you know, if you 14:36 were to communicate with, say, Stanford or 14:38 something like that, it wouldn't go over 14:40 the public Internet; it goes over a network 14:41 that's essentially not commercially paid 14:44 for but is the private network 14:46 connecting different universities. And 14:49 it has a connection to Comcast, so many 14:52 people who have Comcast in their homes 14:54 in this area 14:55 tend to have, or are supposed to in theory 14:57 have, good delay, low delay, to MIT. 15:02 Out here on this side, MIT is connected 15:05 to other research and education networks. 15:07 It has high connectivity to Fermilab and 15:12 to CERN, because I'm assuming there's a 15:14 huge amount of data flowing because of 15:16 things like the LHC experiments; they 15:18 send terabytes or petabytes of data back 15:21 and forth, so you need high bandwidth, so 15:23 they have their own network connection 15:24 to do that. 15:26 This NLR is something called the 15:29 National LambdaRail, which is another 15:30 high-speed network connecting a 15:32 bunch of East Coast universities. And 15:34 then out here on the edges you have MIT 15:36 connecting to other 15:39 Internet service providers. This thing 15:41 here is funny. It's called Big APE, which 15:43 is actually the Big Apple 15:48 Peering Exchange. It's this place in New 15:50 York City where a lot of 15:53 companies and
Internet service providers 15:55 have gotten together, and you can just 15:56 connect to other networks. So MIT 15:59 connects to, I think, 13 other networks on 16:01 a non-payment basis. Whereas to Internet 16:03 service providers you have to pay money, 16:05 you can peer with other networks 16:07 essentially on a bilateral agreement: 16:09 I carry your traffic, you carry my 16:10 traffic. So it turns out that out in New 16:13 York there is this building where a lot 16:17 of these different networks have gotten 16:18 together, and MIT is one among those 16:20 networks, so it has extremely good 16:21 connectivity. But you can see that 16:23 already, you know, MIT is a tiny campus 16:25 and already it's got such rich 16:27 connectivity to the rest of the Internet. 16:29 I guess as far as college campuses go 16:31 it's a big campus, but still, in the grand 16:33 scale of the Internet, it's a tiny thing, 16:35 and you can already see that there's so 16:36 much complexity and so many things 16:39 going on inside the network. So the 16:42 question is, how does this network get 16:44 designed? 16:45 And the main idea that I want to get at 16:49 today is this idea of packets and packet 16:52 switching. So the design principle that's 16:56 used in communication networks is this 17:01 idea of packets and packet switching. 17:11 There are some special rules, simple 17:14 special rules, that you have to follow to 17:16 allow these switches to send messages 17:19 back and forth, and in fact these are 17:22 fairly obvious rules, but what's 17:26 remarkable about them is how simple they 17:28 are and that they work. The main idea is 17:31 that you take your message and you have 17:36 to decide who it needs to be sent to, and 17:39 you have to decide who it's coming from. 17:42 So if I decide that I want to send a 17:44 message to you in this network, my 17:47 computer and your computer have to 17:48 somehow have names associated with them, 17:50 and in the context of packet-switched 17:54 networks, these
names that we associate... 17:55 Ideally these names should be 17:58 associated with computers, but they turn 18:00 out to be names that are associated with 18:02 the link that you use from your computer 18:05 to send these messages. These names are 18:08 called addresses. 18:12 So very concretely, if I have a computer 18:14 here, my computer may have a name, but 18:17 this computer here has two or three 18:19 different links coming out of it. If I 18:21 connect this thing here, this 18:24 Ethernet link, to the USB port here 18:26 and I connect a cable to it, that's one 18:28 link. The Wi-Fi on this is another link. 18:31 If I turn the Bluetooth on and use that, 18:34 it's a third link. Each of those links 18:36 has a different name. The name here is 18:39 equivalent to an address; each of these 18:40 things is an address. So when I send a 18:43 packet I have to tell you my address, and 18:44 similarly, if I want to send 18:47 some other computer a packet, I have to 18:48 specify the address that I wish to send 18:51 it to. So that's the first rule of packet 18:54 switching: 18:54 specify an address. In particular, you 18:58 specify a destination address and you 19:02 specify a source address. OK. 19:10 Now the idea is, once I specify the 19:13 addresses and I construct a message, my 19:19 message has some bits in it, maybe it's a 19:24 file, maybe it's a piece of video, 19:25 whatever. I add something to that message 19:29 which I call the header. The 19:33 header has a bunch of fields in it 19:35 specifying something about what should 19:38 be done with the message, but the only 19:40 important things here... there's three or 19:42 four things that you need, but the 19:44 non-negotiable part that you need is that a 19:46 part of this 19:48 header should specify the destination 19:51 address. 19:59 Well, there's another part of it that 20:01 specifies the source address as well. The 20:06 basic structure is very simple: I send a 20:08 message in
which I specify a destination 20:10 address, and my job is done 20:13 as the source for the time being. I send 20:17 it to some switch. I'm connected to a 20:20 bunch of switches; my computer picks a 20:21 switch to send it to, and the switch it 20:23 picks is typically the switch that 20:25 that link is connected to. So if I'm 20:27 connected right now through Ethernet and 20:28 Wi-Fi, there's some rule on my computer 20:30 that decides whether to use Ethernet or 20:32 Wi-Fi, and let's say it decides to use 20:36 Wi-Fi. It sends this message 20:40 with this destination address to that 20:43 access point, and that's the first switch 20:45 it goes to, and then it becomes the 20:46 switch's job to figure out how to get 20:48 this message to the actual destination. 20:52 This combination of a header that 20:54 includes the destination address and 20:56 some number of bits that 20:59 correspond to the message, this 21:01 entire bag of bits, is called a packet. 21:06 And for something technically to be 21:08 considered a packet, it needs to have an 21:10 address on it, or it needs to have 21:12 something that's equivalent to an 21:14 address on it, that then allows the rest 21:16 of the network to decide how to send 21:19 that packet onward. This is a lot like 21:21 the way the post office works. 21:23 You write your letter, 21:25 you write who it's from and you 21:26 write who it's to, you put it in the mailbox, 21:27 your job is done, and maybe at some later 21:30 point, if it's registered post, you get an 21:32 acknowledgement that the other guy 21:34 received the message. Packet-switched 21:36 networks are very much like that; they 21:39 just work a little bit faster. Now why is 21:45 this idea good? The reason this idea 21:50 is good is that it's extremely robust 21:52 at dealing with failures, at least in 21:54 theory, because it becomes the job of the 21:58 switches in the network to talk to each 22:00 other and
run some sort of algorithm 22:02 between each other that allows them to 22:04 always construct and maintain some 22:08 information that allows them to always, 22:09 no matter what the failures are, as long 22:11 as there is some path that takes you 22:13 from here to there in the network, 22:16 regardless of failure, as long as the 22:17 underlying topology allows you at least 22:21 one path to get from one place to 22:23 another, the switches figure that out. And 22:26 if you want to make a network more 22:28 reliable, you add more switches and more 22:29 links, and you've figured out how to make it 22:31 reliable; neither the endpoints nor anything else 22:33 has to really bother with that problem. 22:35 And you can take portions of the network 22:38 that are unreliable and add some 22:39 redundancy to it, add more paths to it, and 22:41 run some other algorithm that allows the 22:44 switches to figure out how to divert, how 22:46 to route packets, or how to move these 22:48 messages across. And this idea is a 22:51 brilliant idea. It looks completely 22:53 obvious in retrospect, like all brilliant 22:55 ideas, but it's actually quite recent. 22:57 You know, I think they celebrated 22:59 its 50th anniversary quite recently. In 23:02 1959, Paul Baran, who was at the RAND 23:05 Corporation at the time, wrote a set 23:07 of one or two... you know, 23:11 it's not often you can call a paper 23:12 seminal. This is seminal, this is really 23:14 important; it just changed the way 23:17 communication worked. His paper is called "On 23:19 Distributed Communications"; 23:22 the first one was "Introduction to 23:24 Distributed Communications Networks," where 23:26 he looked at various ways you could 23:27 design these network topologies and 23:29 completely theoretically argued that 23:32 this design would allow you to build a 23:36 network that could withstand various 23:38 kinds of failures, in particular even 23:40 adversarial failures caused by, you know, 23:42 enemy attacks. And the second
part of the 23:46 story with these messages that are in 23:49 packets is, he said that if you want to 23:53 communicate a large amount of data, what 23:56 you should do is break it up into 23:58 smaller pieces. So you take a message; if 24:00 you have a big file to transfer, don't 24:02 put it in one big packet, but instead 24:06 break it up into smaller pieces and send 24:10 each piece into the network. So a big 24:12 file gets broken up into many packets; 24:14 each packet becomes an independent 24:15 atomic unit of delivery. Packets could be 24:20 sent along very different paths in 24:22 principle between any point in the 24:24 network and any other point in the 24:25 network, and at the other end packets 24:28 could arrive along different paths, 24:30 and as long as there's some working path, 24:32 it's the job of the network to figure 24:33 out how to get those packets through. 24:35 That's the basic idea. So the first one 24:38 is this idea of using an address on 24:40 messages; the second one is the idea of 24:43 breaking it up into packets, 24:49 and in particular these packets could 24:54 all take different paths. The sources and the 24:58 destinations don't determine the path; 25:00 the switches determine the paths that you 25:02 have to use, using some algorithms that 25:05 we're going to be studying. So is this idea 25:08 clear? Does everybody understand kind of 25:10 what a packet-switched network is? The 25:12 textbook, the notes, also talk about other 25:14 ways of doing it. The other big way of 25:16 doing it, which predates this, was what 25:18 was done in the Bell telephone network. 25:21 It's called circuit switching. It's a 25:23 different idea; I'm not going to talk 25:24 about it in lecture. You can read about it. 25:26 It's important stuff to read about, 25:29 but mostly cultural at this point, 25:31 because almost every network is packet 25:33 switched today. So any questions about 25:36 this idea? It's pretty simple. OK, so 25:43 here's an example of the world's 25:44 simplest
packet header. This is the 6.02 25:46 reference design, so for the labs 25:49 and everything else, this is the packet 25:51 header we're going to be using. It has 25:52 just four fields: a destination address, 25:55 which specifies where the packet should 25:58 be sent; it has something called the hop 26:01 limit, which I will talk about in a 26:04 couple of lectures from now, as to why we 26:06 need it; it has a source address, mainly 26:09 because when 26:11 this computer receives a message from 26:13 someone, it often wants to send a message 26:16 back in response, and having the source 26:18 address allows it to send a message back 26:20 to the person who sent the message. It's 26:23 just for, you know, two-way communication. 26:26 And it has a length, and the reason for 26:29 having the length is convenience: you 26:30 know, once the header is 26:32 done, how many bits you need, how big 26:35 the actual data corresponding to the 26:37 packet is. It's also called the payload: how 26:39 big is the payload in the packet. 26:42 Now, real-world packet headers are 26:46 a little more complicated. Just for 26:47 concreteness, this is what IP version 6, 26:49 which is the version of IP everybody's 26:51 trying to move to, the Internet Protocol, 26:53 looks like. It has the destination and 26:56 source addresses, it has the hop limit, it 26:58 has the length, and it's got a few other 26:59 things that we're really not gonna worry 27:01 about. They have to do with allowing 27:03 switches to prioritize certain kinds of 27:05 packets, so that, I guess, you know, things 27:11 like if you were doing, you know, 27:13 Skype or voice telephony, you might want 27:15 to schedule those packets differently in 27:18 the switch so you get low delay, or 27:21 maybe the CEO's packets 27:25 get higher priority, whatever. You could 27:26 come up with policies on deciding how 27:28 you switch these, how you schedule these 27:30 packets. So that's the
main idea in 27:35 packet switching. For the rest of today I 27:37 actually want to talk about two 27:42 performance metrics that people use to 27:44 evaluate how well a packet-switched 27:46 network is doing in terms of, you 27:49 know, properties that users care about, 27:50 and I want to also explain to you why 27:53 this idea works, this idea that nodes 27:57 just send data. You know, all these nodes 28:00 are sharing a communication medium, I'm 28:02 sorry, sharing resources in the switch. So 28:05 this node can send packets, this node can 28:08 send packets, this node can send packets, 28:09 and the switch must have a plan in mind. 28:12 Let's say that all these packets are 28:14 going to some destination and have to go 28:16 on this link. This switch must have a 28:18 plan in mind for deciding how to take 28:20 all these packets that are coming in and 28:22 send them along this link. I mean, like, 28:25 for example, what happens if packets come 28:27 too fast for the switch to handle? The 28:30 speed of these links, when they all 28:31 simultaneously send packets, could be 28:33 bigger than the speed of the link going 28:35 this way. What does the switch do with 28:37 that? Does it just drop the packets? Does 28:39 it hold on to them for some time? What 28:41 does it actually do? 28:42 And I want to do this first bit 28:45 with a very simple picture that 28:46 tries to get at why this idea really, 28:49 really actually works. This idea that 28:52 makes packet switching work has a fancy 28:55 name: it's called statistical 28:58 multiplexing. So let me explain what that 29:00 means. Let's take it with a very simple 29:06 picture. So let's say that you have a 29:08 switch with one link coming out of it, 29:12 and let's say that the speed of this 29:14 link... um, I need to get into some metrics 29:17 here. So links are measured in terms of 29:19 how quickly, "how quickly" is the wrong 29:22 word, in terms of the rate at which they 29:24 can send data, and there's another metric
29:26 which is the delay of the link. So I'll 29:29 get to both of these more carefully in a 29:31 bit, but the important thing right now I 29:33 want you to keep in mind is the rate of the 29:35 link. This is the rate at which it can 29:38 send bits per second. OK, 29:41 it's a metric, it's a measure of 29:43 throughput, so it's typically measured in 29:45 bits per second. So let me actually 29:48 imagine that the rate of this link is 29:49 one megabyte per second, which is 10 to 29:54 the 6, which is a million bytes per 29:55 second, or about 10 million bits per 29:57 second. Let's imagine a simple network 30:01 that looks like this. 30:03 Let's imagine that all these links are 30:06 also coming in at one megabyte per 30:08 second. If somebody came and told you, 30:18 "Here's the design of my network: I have a 30:21 switch, it's connected to three computers, 30:23 each of which is connected with a 30:26 link whose maximum speed is one megabyte 30:28 per second, and this switch is going to 30:31 connect to something else downstream, 30:33 maybe another switch, and it goes 30:34 somewhere else, and the speed of this 30:36 link is one megabyte per second," is this 30:39 a good network design? How would you go 30:41 about assessing that question? 30:47 Is this good or bad? How would you know? 30:49 Yes? Right. 30:57 So let me ask this before we answer this 30:59 question: let's say this was ten 31:00 megabytes per second. Is this a good 31:01 network design? It is. They're paying too 31:04 much, though, because, I mean, really this 31:06 link is too fast for the amount of load 31:08 that is coming in. But yeah, you know, it's 31:10 a reasonable network design. But the real 31:12 question is, if it's one here, is it a 31:14 good network design? And the answer, as 31:15 the gentleman here pointed out, is that it 31:17 really depends on how much traffic, how 31:21 many packets per second or bits per 31:22 second, these different computers are 31:23 going to be sending. Let's say that they 31:26 all actually
send when they send traffic 31:30 they send at one megabyte per second and 31:32 when they don't send traffic they're 31:34 quiet how would you determine whether 31:36 this is a good network design whether 31:38 this works or not like in practice on 31:40 average how often can each of these guys 31:42 be sending before you determine that 31:44 this network probably 31:46 isn't gonna work yeah yeah 31:54 right and they may or may not be equal 31:56 ideally what you'd like is just to make 31:58 sure that over some window of time they 31:59 send slower than the rate at which this 32:02 link can ship packets now the reason why 32:05 packet switching works is that when you 32:10 build a network like this and you scale 32:11 it out to bigger numbers it turns out to 32:14 be extremely unlikely that everybody 32:16 using the network exercises the network 32:18 at exactly the same time I mean a bunch 32:20 of people might have their computer on 32:21 but if you think about how it's used you 32:23 click on a link and you get a bunch of 32:25 stuff showing up and then 32:26 you read it for some time you click 32:28 on a link and something else shows up or 32:30 if you're watching a video stream you 32:32 know video is compressed so if the scene 32:35 changes very often you end up using a 32:37 lot more in terms of the 32:40 bit rate but then every once in a while 32:41 it's one of these you know old Russian 32:43 movies where nothing's changing for ten 32:44 minutes and yeah it's very heavily 32:46 compressed and then you get the 32:47 Schwarzenegger and you know it's 32:49 blowing your bandwidth limit so I mean 32:52 it's kind of like that traffic is bursty 32:56 so traffic arrives in bursts and the 32:59 users are not all highly correlated with 33:01 each other I mean from time to time you 33:03 do get these correlations like these are 33:05 called flash crowds presumably this 33:08 happened last night everybody's hitting
33:09 refresh on the New York Times website 33:11 and you know presumably what's happening 33:14 there of course is that these websites 33:15 really know what they're doing so 33:18 you know they've actually provisioned 33:19 with the expectation that you know 33:21 starting from 8 p.m. everybody's sitting 33:23 there glued nothing's changing but you 33:25 know everybody's hitting reload and 33:27 they've designed this network they've 33:29 provisioned their network to allow for 33:31 people to get the answers they want or 33:35 the results they want to see so here's 33:37 some pictures so what I did was 33:39 I sniffed on the traffic in this 33:43 room 33:44 so here's the kind of stuff that 33:46 you see so this is the traffic in this 33:48 room during a lecture now this 33:54 is actually not this semester but I 33:57 would assume that it's fairly typical 33:59 I should also say this was doing you 34:02 know the x-axis in these pictures is 34:08 time the y-axis is the number of bytes 34:10 that were sent okay so you can see that 34:13 what I've done on top is I've broken 34:14 time into 10 millisecond windows so 34:19 initially on top it's every 10 34:21 milliseconds I just count the total 34:22 number of bits actually number of 34:23 bytes that were sent now you can't read 34:26 the scale on the y-axis on top but 34:28 the top curve goes up to 34:29 200,000 bytes in a small 10 millisecond 34:33 window then the curve down here does the 34:36 same thing but I've picked a 100 34:38 millisecond window now you can see 34:41 what has happened when you picked a 34:43 bigger window of time has it become 34:44 smoother or less smooth what can you 34:46 say about it 34:47 it's become a little smoother but 34:49 surely there still are these peaks the 34:51 bursts do become smoother but they don't 34:52 completely disappear and what's 34:54 remarkable about network traffic is that 34:55 these bursts never completely
disappeared 34:57 but they do get a little smoother as you 34:59 aggregate over more time over 100 35:02 millisecond windows that's what it looks 35:04 like over one second windows it looks 35:06 smoother but you can actually see that 35:08 from time to time there are these big 35:09 bursts that you know take up a lot 35:11 so over any window of time 35:17 that you expand out to there's still some 35:21 probability with which you'll see a big 35:22 burst of traffic showing up in that 35:24 window that's kind of a nice and 35:26 noteworthy characteristic of 35:28 real-world data traffic in fact even 35:31 when you go to 10-second windows which 35:33 says look I'm looking at 10 seconds at a 35:35 time you get stuff that looks like this 35:37 MIT runs a website you can get 35:39 access to using your web certificates 35:41 it's called 35:44 mrtg mrtg.mit.edu you can actually 35:48 go to this website and you can see for 35:51 different switches including ones in 35:52 your dorm or wherever you are living if 35:53 you live on campus you can actually look 35:55 at the statistics from your router they 35:57 do this on a per switch level and it's 36:00 kind of interesting to see when people 36:02 use this network and when they don't I 36:04 think an interesting characteristic of MIT's 36:06 networks as it turns out is if you look 36:08 at some of the dorm network traffic it 36:10 peaks at like between one and three or 36:12 one and four in the morning which is 36:15 probably good because honestly I think 36:18 MIT should negotiate preferential 36:19 pricing with ISPs because no one else is 36:21 using those ISP networks at that time 36:23 so it would be actually yeah it turns 36:26 out I learned that the Amazon Kindle 36:27 kind of does that when you do your 36:30 newspaper subscription they actually 36:31 send it to you through wireless 36:33 networks through these commercial 3G and 36:35 4G wireless networks and I believe that 36:37 what
they do is they send it to you in 36:41 the middle of the night when not many 36:42 people other than at MIT are using those 36:44 networks so you know you could take 36:47 advantage of some of these time-varying 36:48 properties so why did I tell you the 36:51 story I showed you these 36:54 time windows the same thing applies when 36:55 you bring many many users together the 36:57 odds that we all are going to 37:00 click on you know some link at 37:02 exactly the same time and all of us 37:05 cause a burst of traffic to happen 37:06 exactly at the same time is extremely 37:08 small now it can happen if there's an 37:11 adversary in the network if there are 37:12 bad guys and how many of you have heard 37:14 of denial of service attacks yeah DDoS 37:16 is distributed denial of service attacks you 37:17 know I understand if you know Russian 37:20 you get an edge in doing it so you know 37:23 so these things are launched because 37:25 they commandeer a whole bunch of 37:27 machines and they coordinate an attack 37:29 they destroy the assumptions that make 37:31 statistical multiplexing work 37:33 because the normal assumption is 37:35 people are not exercising the network at 37:37 the same time so you're not attacking 37:38 some website or whatever at the same 37:40 time but if you coordinate an attack 37:43 then you make that assumption not 37:45 hold causing congestion to happen 37:48 causing traffic to exceed what your 37:50 network link can support but under 37:53 normal non-adversarial conditions the 37:55 assumption is that people are you know 37:57 randomly gaining access to the network 37:59 which means that you can actually get 38:01 away with the design of a network that 38:02 looks like this as long as you study 38:05 statistics like the average amount of 38:07 traffic like on average 38:08 this node's 38:09 not gonna be sending more than a certain 38:12 amount of traffic when
measured over 38:13 some period of time what happens when 38:17 people send traffic in bursts what 38:21 happens when from time to time in fact 38:22 you see these bursts of traffic right 38:24 you look at this picture here you do it 38:27 over a one second window or one 38:28 hundred millisecond window and you see 38:29 these big peaks of traffic lots of bytes 38:33 you know 200 millisecond window what 38:36 that really means is that this switch 38:38 here is going to be getting traffic from 38:40 different users that probably exceeds 38:44 you know is perhaps the sum of all of 38:46 the input links so it's a large amount 38:49 of traffic if you have a design like 38:50 this something's gotta give because 38:53 you're getting water or packets coming 38:55 in at one megabyte per second times 38:57 three and you got a link that can only 38:59 send one megabyte per second so what can 39:02 you do what can the switch do 39:07 the easiest thing it can do is just drop 39:10 it just say you know what just drop 39:14 it and you know you laugh but 39:16 I'm telling you sometimes dropping it 39:18 and letting the end point deal with it 39:20 is a better strategy than holding on to 39:21 it and simply keeping it on line it's 39:23 like you gotta be careful right I like 39:25 the idea of storing it but for how long 39:27 do you store it and how much do you store 39:32 for example if I look at that burst of 39:35 traffic here and I have a network like 39:37 this and I look at this big burst of 39:38 traffic here over a ten-second window 39:40 I'm seeing traffic that's probably in 39:42 this example perhaps 10 or 100 times 39:45 bigger than the average the average is 39:47 sitting down somewhere and maybe this is 39:50 ten times the average the peak to 39:51 average ratio might be ten to one or 39:52 twenty to one so how much should you 39:55 store inside the switch if you were 40:00 designing a network and I told you well 40:01 alright good idea why don't
you store 40:03 the packets you're gonna put these 40:05 packets into a data structure called a 40:07 queue right packets come in packets go 40:10 out packets go out whenever the link is 40:13 able to send packets you keep shipping 40:14 packets out in the meantime traffic is 40:15 coming faster than you can handle 40:17 you're gonna put stuff in a queue how 40:19 much do you want to keep everything 40:24 if you did you'd be like Disney World 40:26 because they have these lines that go 40:28 forever like nothing's moving and 40:29 everybody just piles on the end of the 40:30 line this is a tough question we're 40:35 gonna answer this question somewhat 40:37 there's no single easy answer to this 40:40 question but the rule of thumb that I'm 40:42 gonna have you keep in mind for now is 40:43 you're probably going to keep between 10 40:44 milliseconds and 100 milliseconds worth 40:47 of traffic I'll get to why later for now 40:50 it's some small amount of time's worth 40:52 of traffic the reason why you 40:57 need this queue is to absorb a burst of 41:00 traffic that you're not able to 41:01 immediately send but the important 41:04 principle in a packet-switched network is 41:05 you need queues but they're a necessary 41:07 evil because the only good thing that the queue 41:10 is doing for you is absorbing the bursts 41:12 but the bad thing that 41:14 it's doing for you is adding delay just 41:17 because you have a queue the network ain't 41:19 gonna move faster the network is moving 41:21 the link's moving at the same speed 41:22 whether you have a queue or not the only 41:25 thing the queue is doing is it absorbs a 41:26 burst so that whenever the 41:29 network link is able to send packets 41:31 you can ship packets from the queue and 41:33 you don't want to drop too many packets 41:35 now if you're lucky the size of the 41:38 queue is enough to absorb all of the burst 41:40 and then the traffic eases and you get 41:42 to send the rest but if you're
unlucky 41:43 the queue overflows and you drop some 41:45 packets and then the endpoints have to 41:46 somehow deal with it so what are the 41:51 things we've looked at packet-switched 41:53 networks are defined by a header which 41:56 includes a destination address the way 41:58 the network works is that the sources 42:01 just ship a packet with the header that 42:03 includes the destination address the 42:04 switches somehow are going to figure out 42:05 how to get those packets to 42:07 the destination 42:09 the reason why this stuff works is because 42:11 of the statistical multiplexing and 42:14 finally the reason we need a queue in a 42:18 packet-switched network is to absorb these 42:20 bursts so what I want to do in the 42:23 remaining six or seven minutes is to 42:26 tell you about the other metric by which 42:29 we're going to evaluate our networks 42:31 the first metric I introduced 42:32 already is the rate of a link when you 42:36 have links of different rates you can 42:38 also define the rate for an actual 42:39 communication when a source sends 42:41 a packet to a destination you can 42:43 measure the rate at which bits are 42:45 arriving at the destination that's the 42:47 throughput of the data transfer or the 42:50 bit rate the other metric we're going to 42:52 care a lot about is called the delay the 42:59 fancy term for delay is latency I really 43:04 don't know why they have two terms but 43:05 you know from time to time people use 43:08 the word delay or latency and by the way 43:10 I'm going to try hard to use the word 43:12 rate here or bitrate or throughput often 43:16 you see the word bandwidth like oh my 43:17 bandwidth is 10 megabits per second and 43:20 that's actually fine to use except it's 43:22 confusing in a real communication system 43:24 because we've already 43:25 used the word bandwidth to refer to a 43:27 frequency and so we've already said that 43:30 bandwidth is defined in terms of say
43:32 Hertz or something like that and it's 43:36 just a little confusing to also use 43:38 bandwidth for rate so we're going to try 43:40 to use words like bitrate and throughput 43:41 to refer to bits per second so delay is 43:44 measured in seconds or milliseconds or 43:46 microseconds and what we want is 43:50 you have a source that sends 43:53 a packet or set of packets and let's say 43:56 a single packet to a receiver going 43:57 through a network of switches and I want 44:01 to ask if I send a packet at some point 44:04 in time let's say at time zero 44:07 when does that packet reach the receiver 44:10 okay 44:12 that's the delay for a single packet so 44:15 I just want to explain to you how to 44:17 calculate this or how to measure 44:19 this so let's say that the packet has a 44:24 size of L bits so what does the answer 44:31 depend on 44:38 let's take an even simpler example let's 44:40 say that I have a sender I have a 44:42 receiver I have one link between them no 44:44 switches and the packet has size L bits 44:48 I send a packet at time zero when does 44:53 the last bit of the 44:54 packet show up at the receiver yes good 45:03 so I need to define this thing here 45:05 let's say that the bitrate of this link 45:08 is C bits per second so I have L bits 45:13 and I have a link that can send packets 45:15 at C bits per second therefore something 45:19 here should be L divided by C seconds 45:23 that is from the moment I start shipping 45:26 these bits the time delay between 45:29 when the first bit 45:32 arrives at the receiver and the last bit 45:34 arrives at the receiver that time 45:36 distance or time difference is 45:40 L divided by C seconds 45:45 right because I ship these bits these 45:49 bits go back-to-back over the link if I 45:51 look at when the first bit arrives and I 45:53 look at when the last bit arrives that 45:54 time difference is the spacing you know 45:57
the time difference between any two consecutive bits 45:59 showing up at the receiver is one over C 46:02 seconds because if the link can send C 46:05 bits per second any two consecutive bits are 46:06 separated by one 46:08 over C seconds therefore from the time 46:12 at which the first bit arrives to the 46:14 time at which the last bit arrives 46:16 that difference is 46:19 L over C seconds this L over C has a 46:22 name associated with it this is called 46:24 the transmission delay now let's say I 46:34 want to send just one bit 46:37 I send a bit at some point in time and 46:39 that bit shows up at some point in the 46:42 future right because it can't show up 46:44 immediately if it did you know we'd 46:47 probably have to change the laws of 46:48 physics because you know speed of light 46:51 is no longer valid as a finite limit so 46:55 what is the time between when I send the 46:57 first bit and when the first bit shows 46:58 up here what does that depend on 47:03 let's think I want to communicate to 47:05 the moon I send one bit of information I 47:09 put it out or even one sample I 47:11 put it out on the radio or 47:13 whatever how long before it gets to the 47:14 moon 47:18 what does it depend on does it depend on the 47:22 rate at which I can communicate no 47:24 what does it depend on sorry you guys 47:30 said it what is it speed the speed of 47:33 what it's the speed of light in that 47:37 medium or speed of whatever the signal 47:38 you use is if it's acoustic then it's 47:40 the speed of sound over the medium so it 47:42 depends on the distance and depends on 47:44 the speed at which a signal can 47:46 propagate through that communication 47:47 medium for example the speed of light so 47:49 the distance is D and the speed of the 47:53 communication medium is let's say V that 47:57 thing is called the propagation delay so 48:03 let me organize this properly so I'm not 48:05 confusing
everybody with these different 48:08 terms so so far we've hit two sources of 48:10 delay the first source of delay which I 48:13 said second is the propagation delay 48:18 this is the time it takes for the first 48:20 bit to get to the other side it depends 48:23 on the speed at which a signal 48:24 propagates through the medium and the 48:26 distance between sender and receiver so 48:28 sound travels at one foot per 48:29 millisecond I think or roughly something 48:31 like that so if I'm doing acoustic 48:33 that dictates the propagation delay the 48:36 second delay is the transmission delay 48:43 which depends on this L over C the third 48:48 delay is the processing delay 48:50 that for example is when a 48:56 switch gets a packet it has to look at 48:57 the packet header figure out the 48:59 destination do something to that there's 49:01 some computation time that the switches 49:03 you know have to work with that delay is 49:06 called the processing delay this is 49:07 purely some sort of computing delay and 49:10 it's usually very very small and the 49:13 fourth delay is the queuing delay 49:15 because it could be that you get these 49:18 packets in and they have to sit behind 49:19 in a queue and that imposes a delay in 49:22 communication so that's called the 49:25 queuing delay and it's usually a very 49:26 variable source of delay on many 49:29 networks these other delays are constant 49:31 not always but generally constant the 49:34 transmission delay may or may not be 49:35 constant but usually these are more 49:37 constant the queuing delay is not a 49:39 constant delay and the actual delays 49:41 that you experience when you click on a 49:43 link there are other reasons 49:45 why the website is slow but these are a 49:48 principal dominant factor in many many 49:50 cases so we'll pick up on this next week 49:53 after the quiz stuff you deal with 49:55 quiz 2 and pset 6 and we'll continue 49:58 with multi hop
networks
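
The delay terms named in this lecture (propagation, transmission, processing, queuing) and the 10 to 100 millisecond queue rule of thumb can be sketched in a few lines of Python. This is only an illustrative sketch and was not shown in the lecture: the function names, the 10,000-bit packet, the Earth-Moon distance of roughly 384,000 km, and treating one megabyte per second as exactly 8 million bits per second (the lecture rounds this to about 10 million) are all assumptions picked for the example.

```python
# Illustrative sketch of the delay terms discussed in the lecture.
# All function names and numeric values are assumptions for the example.

SPEED_OF_LIGHT_M_PER_S = 3.0e8  # approximate speed of light in vacuum


def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Time to send L bits over a link of rate C bits per second: L / C."""
    return packet_bits / rate_bps


def propagation_delay(distance_m: float, speed_m_per_s: float) -> float:
    """Time for one bit to cross distance D at signal speed V: D / V."""
    return distance_m / speed_m_per_s


def one_link_delay(packet_bits: float, rate_bps: float,
                   distance_m: float, speed_m_per_s: float,
                   processing_s: float = 0.0, queuing_s: float = 0.0) -> float:
    """Sum of the four delay sources for a single link."""
    return (propagation_delay(distance_m, speed_m_per_s)
            + transmission_delay(packet_bits, rate_bps)
            + processing_s
            + queuing_s)


def queue_bytes(rate_bytes_per_s: float, buffer_seconds: float) -> float:
    """Bytes of buffering needed to hold `buffer_seconds` worth of traffic."""
    return rate_bytes_per_s * buffer_seconds


if __name__ == "__main__":
    link_rate_bps = 8.0e6        # assumed: 1 megabyte/s taken as 8e6 bits/s
    packet_bits = 10_000         # assumed packet size L
    moon_distance_m = 3.84e8     # assumed approximate Earth-Moon distance

    print("transmission delay (s):",
          transmission_delay(packet_bits, link_rate_bps))
    print("propagation delay to the moon (s):",
          propagation_delay(moon_distance_m, SPEED_OF_LIGHT_M_PER_S))
    print("total one-link delay (s):",
          one_link_delay(packet_bits, link_rate_bps,
                         moon_distance_m, SPEED_OF_LIGHT_M_PER_S))
    # Rule of thumb from the lecture: keep 10 ms to 100 ms worth of traffic.
    print("queue size range (bytes):",
          queue_bytes(1.0e6, 0.010), "to", queue_bytes(1.0e6, 0.100))
```

With these assumed numbers the propagation term is far larger than the transmission term on the Earth-Moon link, and the queuing term (left at zero here) is the one that would vary from packet to packet.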