Transcript


00:00
the following content is provided under
00:01
a Creative Commons license your support
00:04
will help MIT OpenCourseWare continue to
00:06
offer high quality educational resources
00:08
for free
00:09
to make a donation or view additional
00:12
materials from hundreds of MIT courses
00:14
visit MIT opencourseware at ocw.mit.edu
00:30
so my name is Hari Balakrishnan
00:32
and I'm gonna take you through the rest
00:34
of 6.02 I'm doing the remaining lectures
00:36
in the class so far in 6.02 what we've
00:40
looked at are ways in which we design a
00:44
single communication link so we know how
00:47
to take two computers or two nodes and
00:53
design what a link between them might
00:56
look like and this link might be an
00:57
actual wired link or it might be a radio
01:01
link or it might be an acoustic link
01:04
there's some medium over which these two
01:06
guys communicate and the main ideas
01:08
we've looked at have to do with coding
01:12
in particular channel coding which is a
01:14
strategy to combat noise and errors that
01:17
might show up on the channel and then in
01:20
order to match what we communicate
01:24
to the characteristics of the channel
01:26
you know for example the ability of a
01:27
channel to deal in sinusoids we studied
01:31
modulation and demodulation so those are
01:37
the two main elements that we studied
01:39
and in both of these we looked at both
01:41
how you do this to achieve reliability
01:44
because ultimately we want to
01:45
communicate information in a way that's
01:47
reliable and do it efficiently in
01:51
particular with modulation we looked at
01:53
a scheme to share a medium amongst
01:55
multiple conversations
01:58
frequency division multiplexing which is
02:00
the topic of one of the tasks in this lab
02:02
and with coding we looked at ways in
02:03
which you do this coding in a way
02:05
that isn't just replicating every bit
02:07
but involves some you know linear
02:10
algebra operations that allow you to
02:12
gain efficiency so the rest of the class
02:15
is really about taking for granted our
02:19
ability to design communication links
02:21
and putting them together and composing
02:23
them to build networks so the basic
02:25
problem is actually very very easy the
02:28
problem is you're given a set of nodes
02:31
let's say computers and the problem you
02:38
want to solve is to come up with a way
02:39
by which you can allow any computer or
02:41
any phone or any device on this network
02:43
to communicate with any other device on
02:45
the network that's the problem so you're
02:49
given n nodes and you want all to all
02:54
communication now this is a little
03:00
different from the kind of you know the
03:02
other networks you could design you
03:04
could design a network where you're
03:05
given n nodes and you have one-to-n
03:06
communication there's one transmitter
03:08
many many receivers and you want to
03:11
design a network for that purpose what's
03:13
an example of a network where you have
03:14
one transmitter many receivers and you
03:15
just want to build something that makes
03:17
that work radio is one example
03:20
television is another example and those
03:23
are good examples in fact for those
03:26
kinds of one-to-many networks where you
03:28
have or K-to-n networks where you
03:30
have K sources of information and n
03:32
receivers and K is a lot smaller than n
03:35
it'll turn out that the basic frequency
03:37
division approach makes sense I mean
03:39
that's how radio stations or TV stations
03:41
work someone in the US the Federal
03:43
Communications Commission has decided to
03:45
allocate different chunks of frequency
03:47
to different TV stations and different
03:49
radio stations and the assumption is
03:51
they're always going to be using it it
03:53
turns out that assumption may or may not
03:54
be true but under the assumption that
03:56
they're always going to be using it and
03:58
you have many many many receivers you
03:59
just divide up frequencies and allot
04:01
them each to transmit in their own
04:04
frequencies and then you have a receiver
04:05
that's capable of tuning to different
04:07
frequencies and you get the information
04:10
of the channel that you want we'll
04:12
actually come back to that problem a
04:14
little bit in the next two lectures but
04:17
for today the design problem and going
04:19
forward the design problem you should
04:21
have in mind is you want a network where
04:22
you have all-to-all communication and you
04:24
want to be able to support any
04:26
application
04:28
this is a big deal we're not just
04:31
designing a network to allow telephone
04:33
calls to work or we're not just
04:34
designing network that allows you to do
04:36
video conferencing we're trying to
04:38
design a network where any application
04:40
can run on it in particular applications
04:42
that you might not have envisioned this
04:45
is the reason why the internet works
04:46
really well is because when they
04:48
designed the internet they designed it
04:50
under some set of assumptions but they
04:52
were really really smart to design a
04:54
network that made minimal assumptions
04:56
about the application so it's a network
04:59
that's good enough for almost any
05:00
application though it isn't perfectly
05:05
optimal for any application it's just
05:07
good enough for everything and that's a
05:09
really good characteristic of a
05:10
well-designed network is if you can make
05:12
if it can work even for things you
05:14
didn't even dream of when they
05:16
built the Internet they certainly didn't
05:17
dream that the web would have to exist
05:19
they didn't dream that you know people
05:20
would be you know tweeting and telling
05:23
people they're going to the bathroom or
05:24
whatever they do on Twitter I mean they
05:26
designed a network and it just kind of
05:29
is amazing that all these applications
05:30
can work so the question is what did
05:32
they do correct what did they do right
05:34
and what are things that what what
05:37
general lessons can we learn from it and
05:39
the general high level lessons you learn
05:41
actually apply to any system you build
05:43
it'll turn out that whenever you know if
05:45
you're confronted with a real-world
05:46
problem in industry or research or
05:48
wherever very often you're trying to
05:51
make decisions on what you need to be
05:53
doing and it's very tempting to make
05:56
decisions based on what you think it's
05:58
going to be used for but very often what
06:00
you end up eventually using it for is
06:03
very different from what you thought in
06:04
the beginning so it's good to have
06:05
applications in mind but it's good not
06:07
to embed too much about those
06:08
applications into the design of networks
06:10
so the high-level principle here is how
06:12
you can do something that works well
06:15
enough without making too many
06:17
assumptions about what's running on top
06:18
of it
06:20
there are there are two big themes
06:22
they're the same two themes that we
06:23
studied before that we're going to keep
06:25
coming back to the first is efficiency
06:28
and the second is reliability the same
06:39
two themes we come back to over and over
06:40
again there's a third important theme
06:42
about network design which has to do
06:44
with scalability I mean how can you make
06:46
it work so this network can work for
06:49
millions or billions of devices and
06:51
billions of computers that's a topic
06:53
we're not really going to talk about
06:54
I'll get to it in the last lecture but
06:57
6.033 and 6.829
06:59
we'll talk about those issues so let me
07:01
start first with efficiency if I tell
07:05
you how to build a communication link
07:07
that can communicate between any two
07:09
devices or any two computers it should
07:12
be pretty straightforward to now design
07:13
a network that allows all to all
07:15
communication something out that's a
07:20
mouse great
07:26
one way you can design this network is
07:28
to simply take your communication link
07:30
that we know how to build and do this
07:32
just connect every pair of computers or
07:35
every pair of nodes to each other I'm
07:37
probably missing a few of these but this
07:42
is a great network design because it's
07:46
composed of a bunch of links to build a
07:48
network so why don't we do this or maybe
07:51
we should do this right
07:58
it's too expensive why is it too
08:00
expensive sorry
08:03
you know um how many fewer in professors
08:06
lose the recitation great I understand
08:09
he gives you guys money to answer or if
08:11
he makes a mistake I'm gonna do the same
08:13
thing whenever I make a mistake
08:15
professors will give you some money so
08:20
so actually I mean don't hold me to
08:23
this but why don't you guys answer this
08:26
is pretty straightforward how many links
08:27
do you need n choose 2 it's about n
08:31
squared right so N squared depending on
08:34
the context is not too big or too
08:35
small but on the order of n choose 2 N squared
08:37
links it turns out that's actually a
08:39
pretty large number of links because
08:41
and the notes talk about some of
08:44
the reasons why this is too expensive
08:46
but the other reason it's a problem is
08:48
that you know it's one thing to design a
08:49
network where every computer in this
08:51
room can talk to each other and
08:52
conceivably you know we might get
08:53
tangled up in all these wires but we
08:55
could imagine laying wires between every
08:57
pair of our computers and communicating
08:59
but there are two reasons this is a big
09:01
problem I want to communicate with
09:02
computers in California or China or
09:04
wherever and you know individual links
09:07
going across the world and you know my
09:09
computer to China and your computer to
09:10
another computer in China just doesn't
09:12
scale it doesn't work very well
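To make the link-count argument concrete, here is a minimal sketch comparing a full mesh (a link between every pair of nodes, n choose 2) with the case where every node connects to one shared switch. The function names are illustrative and not from the lecture, and the single-switch case is only the simplest alternative, not a real network design.

```python
from math import comb

def full_mesh_links(n: int) -> int:
    """Links needed to wire every pair of n nodes directly: n choose 2."""
    return comb(n, 2)

def single_switch_links(n: int) -> int:
    """Links needed if every node instead connects to one shared switch."""
    return n

# The mesh grows roughly as n squared; the switched design grows as n.
for n in (10, 1_000, 1_000_000):
    print(n, full_mesh_links(n), single_switch_links(n))
```

For a thousand nodes the mesh already needs about half a million links, which is the sense in which the pairwise design does not scale.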
09:14
the second problem the reason why this
09:16
issue matters is that not all
09:18
communication links are wires in fact
09:20
right now the most dominant
09:23
mode by which people gain access to the
09:25
Internet including right now in this
09:26
room is through radio it's
09:29
through wireless and this is a shared
09:30
medium so it's not like you know we can
09:33
somehow you know put these wires
09:35
together we're gonna have to share this
09:37
communication medium we're gonna have to
09:39
share this communication network and
09:40
somehow we have to come up with a
09:42
strategy to do this efficiently and
09:44
there's a few different principles
09:46
involved in how you design networks but
09:48
the main one is that we're going to
09:50
construct a special computer called a
09:53
switch
09:56
and a lot of what we're going to be
09:58
doing has to do with what we do in the
10:01
switch the other part of what we're
10:03
going to be doing is what we do in the
10:05
computers themselves so our network is going
10:07
to be designed using a set of rules that
10:09
are obeyed and implemented and followed
10:12
by the computers okay a special set of
10:14
rules that are implemented by these
10:16
computers called switches and a special
10:17
set of rules that are implemented by the
10:19
end computers by the devices on the
10:20
network and together they're going to
10:22
make our communication work so the
10:25
high-level plan is going to be that we
10:26
take these computers and rather than put
10:28
wires between every pair of them we're
10:30
going to connect them together into
10:32
perhaps there's lots and lots of
10:34
computers and many of them get connected
10:36
to one of these boxes which is a switch
10:39
and a switch may connect to other
10:42
switches and some of these switches may
10:47
have other computers attached to them
10:49
and then eventually you might get to
10:55
other end computers and when you
10:58
build a network like this a structure
11:00
like this this kind of a picture is
11:02
called the network topology
11:05
a switch has one or more links attached
11:10
to it these links could be wires they
11:13
could be shared things like like this
11:16
thing here is a switch it has no visible
11:20
links but it probably has one wired link
11:23
connecting it via ethernet to the rest
11:24
of the MIT campus and out here you know
11:27
lots of computers right now are
11:28
connected to it it gives the illusion
11:30
that each of your computers has a
11:31
separate link to the switch and we look
11:33
at how that illusion is maintained and
11:35
done next time next lecture but this is
11:38
an example of a switch probably the
11:39
world's you know this thing is made I
11:41
think by Cisco so they charge you know
11:43
six or eight hundred dollars for it but
11:45
really you know it's you can buy it for
11:46
forty bucks when you put the word
11:49
Enterprise next to anything you sell you
11:50
pay the price but anyway the world's
11:56
cheapest switches are Wi-Fi access
11:57
points so you connect the stuff together
12:00
into a topology and the job of the
12:02
switch is to look at messages that come
12:04
in from from these links and figure out
12:07
what to do with those messages and make
12:09
sure that together they coordinate to
12:12
get messages to the destinations to
12:15
which you wish to send those messages so
12:18
here's the picture that I got today
12:21
from MIT IS&T which is the picture
12:26
of MIT's network so I just want to give
12:28
you a sense for what this looks like for
12:33
a campus like MIT so the first thing to
12:36
notice is that this is actually it's got
12:38
some redundancy built in you don't see
12:39
it in the picture but really what's
12:41
going on here is that we have these two
12:42
routers here in the context of the
12:46
Internet these switches are also called
12:48
routers it's taken me 10 years to
12:50
pronounce it "rowter" because where I was
12:52
brought up
12:53
they pronounced it "rooter" and many people
12:55
say that but in the US they say "rowter"
12:57
so anyway these routers here there are
12:59
two backbone routers and they're
13:01
actually each of these guys these other
13:03
routers in these different buildings are
13:06
connected actually to both of these so
13:08
the idea here is that if one of those
13:09
links were to fail or if one of these
13:11
were to fail the other guy would take
13:13
over and handle this traffic under
13:16
normal conditions traffic is kind of
13:18
balanced between these two different
13:20
routers so some of these computers some
13:21
of these other routers are connected to
13:23
one of them some of the other routers
13:24
are connected to the other and together
13:26
they work to provide connectivity these
13:29
backbone routers get connected to these
13:30
things that are called external routers
13:32
which are routers that connect to
13:35
various other networks and Internet
13:36
service providers that MIT uses MIT is
13:41
extremely well connected the amount of
13:42
bandwidth coming in and out as you might
13:44
have noticed doing you know I don't
13:46
know BitTorrent or whatever the cool
13:47
people do these days with networks
13:49
is phenomenal MIT commercially uses
13:55
Sprint which is an Internet service
13:56
provider it uses Level 3 which is probably
13:58
the biggest internet service provider in
14:00
the US this thing called PAETEC is I
14:04
found is that so MIT now does telephony
14:06
through the internet so it's voice over
14:09
IP as opposed to the old telephone
14:11
system so a lot of that voice
14:14
traffic goes through that network
14:17
service provider other things here this
14:21
NoX is I think it stands for the
14:24
Northeast crossroads or something like
14:26
that it connects to a network called the
14:28
Internet2 which is the network
14:30
connecting many universities in the US
14:31
and it's a very very high bandwidth
14:33
network and so you can you know if you
14:36
were to communicate with say Stanford or
14:38
something like that it wouldn't go over
14:40
the public internet it goes over network
14:41
that's essentially not commercially paid
14:44
for but is the private network
14:46
connecting different universities so and
14:49
it has a connection to Comcast so many
14:52
people who have Comcast in their homes
14:54
in this area
14:55
tend to have or are supposed to in theory
14:57
have low delay to MIT
15:02
out here on this side MIT is connected
15:05
to other research and education networks
15:07
it has high connectivity to Fermilab and
15:12
to CERN because I'm assuming there's a
15:14
huge amount of data flowing because of
15:16
things like the LHC experiments they
15:18
send terabytes or petabytes of data back
15:21
and forth so you need high bandwidth so
15:23
they have their own network connection
15:24
to do that
15:26
this NLR is something called the
15:29
National LambdaRail which is another
15:30
high-speed network connecting a
15:32
bunch of East Coast universities and
15:34
then out here on the edges you have MIT
15:36
connecting out here to other
15:39
Internet service providers this thing
15:41
here is funny it's called BigAPE which
15:43
is actually it's called the Big Apple
15:48
peering exchange it's this place in New
15:50
York City where a lot of people a lot of
15:53
companies and Internet service providers
15:55
have gotten together and you can just
15:56
connect to other networks so MIT
15:59
connects to I think 13 other networks on
16:01
a non-payment basis whereas to Internet
16:03
service providers you have to pay money
16:05
you can peer with other networks
16:07
essentially on a bilateral agreement so
16:09
I carry your traffic you carry my
16:10
traffic so it turns out that out in New
16:13
York there is this building where a lot
16:17
of these different networks have gotten
16:18
together and MIT is one among those
16:20
networks so it has extremely good
16:21
connectivity but you can see that
16:23
already you know MIT is a tiny campus
16:25
and already it's got such rich
16:27
connectivity to the rest of the internet
16:29
I guess as far as college campuses go
16:31
it's a big campus but still in the grand
16:33
scale of the internet it's a tiny thing
16:35
and you can already see that there's so
16:36
much complexity and and so many things
16:39
going on inside the network so the
16:42
question is how does this network get
16:44
designed
16:45
and the main idea that I want to get at
16:49
today is this idea of packets and packet
16:52
switching so the design principle that's
16:56
used in communication networks is this
17:01
idea of packets and packet switching
17:11
there are some special rules simple
17:14
special rules that you have to follow to
17:16
allow these switches to send messages
17:19
back and forth and in fact these are
17:22
fairly obvious rules but what's
17:26
remarkable about them is how simple they
17:28
are and they can work the main idea is
17:31
that you take your message and you have
17:36
to decide who it needs to be sent to and
17:39
you have to decide who it's coming from
17:42
so if I decide that I want to send a
17:44
message to you in this network my
17:47
computer and your computer have to
17:48
somehow have names associated with them
17:50
and in the context of packet switched
17:54
networks these names that we associate
17:55
with ideally these names should be
17:58
associated with computers but they turn
18:00
out to be names that are associated with
18:02
the link that you use from your computer
18:05
to send these messages these names are
18:08
called addresses
18:12
so very concretely if I have a computer
18:14
here my computer may have a name but
18:17
this computer here has two or three
18:19
different links coming out of it if I
18:21
connect this even this thing here this
18:24
Ethernet link to the USB port here
18:26
and I connect a cable to it that's one
18:28
link the Wi-Fi on this is another link
18:31
if I turn the Bluetooth on and use that
18:34
it's a third link each of those links
18:36
has a different name the name here is
18:39
equivalent to an address each of these
18:40
things is an address so when I send a
18:43
packet I have to tell you my address and
18:44
similarly if I want to send someone else
18:47
some other computer a packet I have to
18:48
specify the address that I wish to send
18:51
it to so that's the first rule of packet
18:54
switching
18:54
is to specify an address in particular
18:58
specify a destination address and you
19:02
specify a source address okay
19:10
now the idea is once I specify the
19:13
addresses and I construct a message my
19:19
message has some bits in it maybe it's a
19:24
file maybe it's a piece of video
19:25
whatever I add something to that message
19:29
which I call the header the
19:33
header has a bunch of fields in it
19:35
specifying something about what should
19:38
be done with the message but the only
19:40
two important things here there's three
19:42
four things that you need but the
19:44
non-negotiable part that you need is a
19:46
part of this address a part of this
19:48
header should specify the destination
19:51
address
19:59
and there's another part of it that
20:01
specifies the source address as well the
20:06
basic structure is very simple I send a
20:08
message in which I specify a destination
20:10
address and my job is done
20:13
as the source for the time being I send
20:17
it to some switch I'm connected to a
20:20
bunch of switches my computer picks a
20:21
switch to send it to and the switch it
20:23
picks is typically the switch that
20:25
that link is connected to so if I'm
20:27
connected right now through Ethernet and
20:28
Wi-Fi there's some rule on my computer
20:30
that decides whether to use Ethernet or
20:32
Wi-Fi and let's say it decides to use
20:36
Wi-Fi it sends this thing this message
20:40
with this destination address to that
20:43
access point and that's the first switch
20:45
it goes to and then it becomes the
20:46
switch's job to figure out how to get
20:48
this message to the actual destination
20:52
this combination of a header that
20:54
includes the destination address and
20:56
some number of bits that correspond
20:59
to the message this
21:01
entire bag of bits is called a packet
21:06
and for something technically to be
21:08
considered a packet it needs to have an
21:10
address on it or it needs to have
21:12
something that's equivalent to an
21:14
address on it that then allows the rest
21:16
of the network to decide how to send
21:19
that packet onward this is a lot like
21:21
the way the post office works when you
21:23
deliver you know you write your letter
21:25
you write who it's from and you
21:26
write who it's to you put it in the mailbox
21:27
your job is done and maybe at some later
21:30
point if it's registered post you get an
21:32
acknowledgement that the other guy
21:34
received the message packet-switched
21:36
networks are very much like that they
21:39
just work a little bit faster now why is
21:45
this idea good now the reason this idea
21:50
is good is that it's extremely robust
21:52
at dealing with failures at least in
21:54
theory because it becomes the job of the
21:58
switches in the network to talk to each
22:00
other and run some sort of algorithm
22:02
between each other that allows them to
22:04
always construct and maintain some
22:08
information that allows them to always
22:09
no matter what the failures are as long
22:11
as there is some path that takes you
22:13
from here to there in the network
22:16
regardless of failure as long as the
22:17
underlying topology allows you at least
22:21
one path to get between one place to
22:23
another the switches figure that out and
22:26
if you want to make a network more
22:28
reliable you add more switches and more
22:29
links and you figure out how to make it
22:31
reliable the end points and nothing else
22:33
have to really bother with that problem
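The claim that traffic gets through as long as some path survives can be illustrated with a small sketch. This uses a centralized breadth-first search over a hypothetical topology; it is a stand-in for, not a version of, the distributed algorithms the switches actually run, which come later in the course. The node names and the `failed` parameter are assumptions made for the example.

```python
from collections import deque

def find_path(links, src, dst, failed=frozenset()):
    """Breadth-first search for any working path from src to dst.

    links  -- iterable of (a, b) pairs, each an undirected link
    failed -- set of frozenset({a, b}) links to treat as down
    Returns a list of nodes, or None if no path survives.
    """
    neighbors = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue  # skip links that are currently down
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical topology: hosts A and B, switches S1..S3, one redundant path.
LINKS = [("A", "S1"), ("S1", "S2"), ("S2", "B"), ("S1", "S3"), ("S3", "S2")]

print(find_path(LINKS, "A", "B"))
print(find_path(LINKS, "A", "B", failed={frozenset(("S1", "S2"))}))
```

With the S1 to S2 link down, the search falls back to the longer route through S3; the end hosts A and B did nothing differently, which is the point being made about where the reliability work happens.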
22:35
and you can take portions of the network
22:38
that are unreliable and add some
22:39
redundancy to it add more paths to it and
22:41
run some other algorithm that allows the
22:44
switches to figure out how to divert how
22:46
to route packets or how to move these
22:48
messages across and this idea is a
22:51
brilliant idea it looks completely
22:53
obvious in retrospect like all brilliant
22:55
ideas but it's actually quite recent
22:57
it's you know I think they celebrated
22:59
its 50th anniversary quite recently in
23:02
1959 Paul Baran who was at the RAND
23:05
Corporation at the time wrote a series
23:07
of one or two you know
23:11
it's not often you can call a paper
23:12
seminal this is seminal this is really
23:14
important it just changed the way
23:17
communication worked his papers are called On
23:19
Distributed Communications
23:22
the first one was Introduction to
23:24
Distributed Communications Networks where
23:26
he looked at various ways you could
23:27
design these network topologies and
23:29
completely theoretically argued that
23:32
this design would allow you to build a
23:36
network that could withstand various
23:38
kinds of failures in particular even
23:40
adversarial failures caused by you know
23:42
enemy attacks and the second part of the
23:46
story with these messages that are in
23:49
packets is he said that if I want to
23:53
communicate a large amount of data what
23:56
you should do is break it up into
23:58
smaller pieces so you take a message if
24:00
you have a big file to transfer don't
24:02
put it in one big packet but instead you
24:06
break it up into smaller pieces and send
24:10
each piece into the network so a big
24:12
file gets broken up into many packets
24:14
each packet becomes an independent
24:15
atomic unit of delivery packets could be
24:20
sent along very different paths in
24:22
principle between any point in the
24:24
network and any other point in the
24:25
network and at the other end packets
24:28
could arrive along different paths
24:30
and as long as there's some working path
24:32
it's the job of the network to figure
24:33
out how to get those packets through
24:35
that's the basic idea so the first one
24:38
is this idea of using an address on
24:40
messages the second one is the idea of
24:43
breaking it up into packets
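A minimal sketch of the second idea, breaking a large message into independently delivered packets. The `seq` field is an assumption added here so the receiver can restore order when packets arrive along different paths; the lecture's header does not include one, and the dictionary layout and payload size are purely illustrative.

```python
import random

def packetize(message: bytes, dst: int, src: int, payload_size: int = 4):
    """Split a message into independent packets, each with its own addresses.

    'seq' is a hypothetical sequence number, not part of the lecture's header.
    """
    return [
        {"dst": dst, "src": src, "seq": i // payload_size,
         "payload": message[i:i + payload_size]}
        for i in range(0, len(message), payload_size)
    ]

def reassemble(packets) -> bytes:
    """Rebuild the message from packets that may arrive in any order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)

packets = packetize(b"a big file to transfer", dst=7, src=3)
random.shuffle(packets)  # stands in for packets taking different paths
print(reassemble(packets))
```

Every packet carries the full destination and source addresses, so each one can be forwarded on its own without reference to the others.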
24:49
and in particular these packets could
24:54
all take different paths the sources and the
24:58
destinations don't determine the path
25:00
the switches determine the paths that you
25:02
have to use using some algorithms that
25:05
we're going to be studying so is this idea
25:08
clear does everybody understand kind of
25:10
what a packet-switched network is the
25:12
textbook the notes also talk about other
25:14
ways of doing it the other big way of
25:16
doing it which predates this was what
25:18
was done in the Bell Telephone network
25:21
it's called circuit switching it's a
25:23
different idea I'm not going to talk
25:24
about in lecture you can read about it
25:26
as it's important stuff to read about
25:29
but mostly cultural at this point
25:31
because almost every network is packet
25:33
switched today so any questions about
25:36
this idea it's pretty simple ok so
25:43
here's an example of the world's
25:44
simplest packet header this is the 6.02
25:46
reference design so for the labs
25:49
and everything else this is the packet
25:51
header we're going to be using it has
25:52
just four fields a destination address
25:55
which specifies where the packet should
25:58
be sent it has something called the hop
26:01
limit which I will talk about in a
26:04
couple lectures from now as to why we
26:06
need it it has a source address mainly
26:09
because when I receive a message when
26:11
this computer receives a message from
26:13
someone it often wants to send a message
26:16
back in response and having the source
26:18
address allows it to send a message back
26:20
to the person who sent the message it's
26:23
just for you know two-way communication
26:26
and it has a length and the reason for
26:29
having the length is convenience you
26:30
know you kind of know once the header is
26:32
done how many bits do you need how big
26:35
is the actual data corresponding to the
26:37
packet it's also called the payload how
26:39
big is the payload in the packet
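The four-field header described above could be sketched as follows. The lecture names the fields (destination address, hop limit, source address, length) but the transcript gives no field widths or byte order, so the 16-bit fields, network byte order, and the default hop limit of 16 here are all assumptions for illustration, not the actual 6.02 lab format.

```python
import struct

# Assumed layout: four unsigned 16-bit fields in network byte order.
# Order follows the lecture: destination, hop limit, source, length.
HEADER = struct.Struct("!HHHH")

def make_packet(dst: int, src: int, payload: bytes, hop_limit: int = 16) -> bytes:
    """Prepend a header to the payload; length counts payload bytes only."""
    return HEADER.pack(dst, hop_limit, src, len(payload)) + payload

def parse_packet(packet: bytes) -> dict:
    """Read the header, then use its length field to slice out the payload."""
    dst, hop_limit, src, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return {"dst": dst, "hop_limit": hop_limit, "src": src, "payload": payload}

packet = make_packet(dst=7, src=3, payload=b"hello")
print(parse_packet(packet))
```

The length field is what lets the receiver know where the payload ends once the fixed-size header has been read, which is the convenience mentioned in the lecture.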
26:42
now you know real-world packet headers are
26:46
a little more complicated just for
26:47
concreteness this is what IP version 6
26:49
which is the version of IP everybody's
26:51
trying to move to the internet protocol
26:53
looks like it has the destination and
26:56
source addresses it has the hop limit it
26:58
has the length and it's got a few other
26:59
things that we're really not gonna worry
27:01
about they have to do with allowing
27:03
switches to prioritize certain kinds of
27:05
packets so that I guess you know things
27:11
like if you were talking doing you know
27:13
Skype or voice telephony you might want
27:15
to schedule those packets differently in
27:18
the switch so you get low delay or if
27:21
you were you know maybe the CEOs packets
27:25
get higher priority whatever you could
27:26
come up with policies on deciding how
27:28
you switch these how you schedule these
27:30
packets so that's the main idea in in
27:35
packet switching for the rest of today I
27:37
actually want to talk about two
27:42
performance metrics that people use to
27:44
evaluate how well a packet-switched
27:46
network is doing in terms of how you
27:49
know properties that users care about
27:50
and I want to also explain to you why
27:53
this idea works like this idea that nodes
27:57
just send data you know all these nodes
28:00
are sharing a communication medium I'm
28:02
sorry sharing resources in the switch so
28:05
this node can send packets this node can
28:08
send packets this node can send packets
28:09
and the switch must have a plan in mind
28:12
to let's say that all these packets are
28:14
going to some destination and have to go
28:16
on this link this switch must have a
28:18
plan in mind for deciding how to take
28:20
all these packets that are coming in and
28:22
sending them along this link I mean like
28:25
for example what happens if packets come
28:27
too fast for the switch to handle the
28:30
speed of these links when they all
28:31
simultaneously send packets could be
28:33
bigger than the speed of the link going
28:35
this way what does the switch do with
28:37
that does it just drop the packets does
28:39
it hold on to them for some time what
28:41
does it actually do
28:42
and I want to do this first
28:45
with a very simple picture that
28:46
tries to get at why this idea really
28:49
really actually works this idea that
28:52
makes packet-switching work has a fancy
28:55
name it's called statistical
28:58
multiplexing so let me explain what that
29:00
means let's take it with a very simple
29:06
picture so let's say that you have a
29:08
switch with one link coming out of it
29:12
and let's say that the speed of this
29:14
link um I need to get into some metrics
29:17
here so links are measured in terms of
29:19
how quickly how quickly is the wrong
29:22
word in terms of the rate at which they
29:24
can send data and there's another metric
29:26
which is the delay of the link so I'll
29:29
get to both of these more carefully in a
29:31
bit but the important thing right now I
29:33
want to keep in mind is the rate of the
29:35
link this is the rate at which it can
29:38
send bits per second okay it's a it's a
29:41
it's a metric it's a measure of
29:43
throughput so it's typically measured in
29:45
bits per second so let me actually
29:48
imagine that the rate of this link is
29:49
one megabyte per second which is 10 to
29:54
the 6 which is a million bytes per
29:55
second or 8 million bits per
29:57
second let's imagine a simple network
30:01
that looks like this
30:03
let's imagine that all these links are
30:06
also coming in at one megabyte per
30:08
second if somebody came and told you
30:18
here's the design of my network I have a
30:21
switch it's connected to three computers
30:23
each of which is connected with a
30:26
link whose maximum speed is one megabyte
30:28
per second and this switch is going to
30:31
connect to something else downstream
30:33
maybe another switch and it goes
30:34
somewhere else and the speed of this
30:36
link is one megabyte per second is this
30:39
a good network design how would you go
30:41
about assessing that question
30:47
this is good or bad how would you know
30:49
yes right
30:57
so let me ask this before we answer this
30:59
question let's say this was ten
31:00
megabytes per second is this a good
31:01
network design it is they're paying too
31:04
much though because I mean really this
31:06
link is too fast for the amount of load
31:08
that is coming in but yeah you know it's
31:10
a reasonable network design but the real
31:12
question is if it's one here is it a
31:14
good network design and the answer as
31:15
the gentleman here pointed out is it
31:17
really depends on how much traffic how
31:21
many packets per second or bits per
31:22
second these different computers are
31:23
going to be sending let's say that they
31:26
all actually send when they send traffic
31:30
they send at one megabyte per second and
31:32
when they don't send traffic they're
31:34
quiet how would you determine whether
31:36
this is a good network design whether
31:38
this works or not like in practice on
31:40
average how often can each of these guys
31:42
be sending before you determine that
31:44
this is probably or not this network
31:46
isn't gonna work yeah yeah
31:54
right and they may or may not be equal
31:56
ideally what you'd like is just to make
31:58
sure that over some window of time they
31:59
send slower than the rate at which this
32:02
link can ship packets now the reason why
32:05
packet switching works is that when you
32:10
build a network like this and you scale
32:11
it out to bigger numbers it turns out to
32:14
be extremely unlikely that everybody
32:16
using the network exercises the network
32:18
at exactly the same time I mean a bunch
32:20
of people might have their computer on
32:21
but if you think about how it's used you
32:23
click on a link and you get a bunch of
32:25
stuff showing up and then you click on
32:26
you read read it for some time you click
32:28
on a link and something else shows up or
32:30
if you're watching a video stream you
32:32
know video is compressed so if the scene
32:35
changes very often you end up using a
32:37
lot more of the of the in terms of the
32:40
bit rate but then every once in a while
32:41
it's one of these you know old Russian
32:43
movies where nothing's changing for ten
32:44
minutes and yeah it's very heavily
32:46
compressed and then you get the
32:47
Schwarzenegger and you know it's it's
32:49
blowing your bandwidth limit so I mean
32:52
it's kind of like that traffic is bursty
32:56
so traffic arrives in bursts and the
32:59
users are not all highly correlated with
33:01
each other I mean from time to time you
33:03
do get these correlations like these are
33:05
called flash crowds presumably this
33:08
happened last night everybody's hitting
33:09
refresh on the New York Times website
33:11
and you know presumably what's happening
33:14
there of course is that these websites
33:15
really know what they're doing so
33:18
you know they've actually provisioned
33:19
with the expectation that you know
33:21
starting from 8 p.m. everybody's sitting
33:23
there glued nothing's changing but you
33:25
know everybody's hitting reload and
33:27
they've designed this network they've
33:29
provisioned their network to allow for
33:31
people to get the answers they want or
33:35
the results they want to see so here's
33:37
some pictures so what I did was I took I
33:39
sniffed on the traffic in this in this
33:43
room
33:44
so here's here's the kind of stuff that
33:46
you see so this is the traffic in this
33:48
in this room during a lecture now this
33:54
is actually not this semester but I
33:57
would assume that it's fairly typical
33:59
I should also say this was during you
34:02
know the x-axis in these pictures is
34:08
time the y-axis is the number of bytes
34:10
that were sent okay so you can see that
34:13
what I've done on top is I've broken
34:14
time into 10 millisecond windows so
34:19
initially on top it's every 10
34:21
milliseconds I just count the total
34:22
number of bits or rather the number of
34:23
bytes that were sent now you can't read
34:26
the scale on the y-axis on top but it
34:28
goes on the top curve that goes up to
34:29
200,000 bytes in a small 10 millisecond
34:33
window then the curve down here does the
34:36
same thing but I've picked 100
34:38
millisecond window now you can see that
34:41
what has happened when you picked a
34:43
bigger window of time has it become
34:44
smoother or less smoother what can you
34:46
say about it
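The windowing described here can be sketched in a few lines of Python. This is only an illustration: the packet trace below is synthetic (made-up timestamps and sizes, not the measurements shown in the lecture), and the function names are hypothetical. It groups packets into fixed windows and compares the peak-to-average ratio for two window sizes.

```python
def bytes_per_window(packets, window_ms):
    """Sum packet sizes into consecutive windows of window_ms milliseconds.

    packets is a list of (timestamp_ms, size_bytes) pairs with integer
    timestamps; the result has one byte total per window, starting at time 0.
    """
    if not packets:
        return []
    n_windows = max(t for t, _ in packets) // window_ms + 1
    totals = [0] * n_windows
    for t, size in packets:
        totals[t // window_ms] += size
    return totals


def peak_to_average(totals):
    """Ratio of the busiest window to the mean window."""
    return max(totals) / (sum(totals) / len(totals))


# Synthetic trace (an assumption for illustration): two short bursts
# and one isolated packet, each packet 1500 bytes.
trace = [(1, 1500), (3, 1500), (4, 1500), (55, 1500), (120, 1500), (121, 1500)]

for window_ms in (10, 100):
    totals = bytes_per_window(trace, window_ms)
    print(window_ms, "ms windows, peak/average =", round(peak_to_average(totals), 2))
```

On this toy trace the ratio drops as the window grows, but it stays above one: the longer window smooths the bursts without removing them.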
34:47
it's become a little smoother but there
34:49
surely still are these peaks the
34:51
bursts do become smoother but they don't
34:52
completely disappear and what's
34:54
remarkable about network traffic is that
34:55
these bursts never completely disappear
34:57
but they do get a little smoother as you
34:59
aggregate over more time over 100
35:02
millisecond windows that's what it looks
35:04
like over one second window it looks
35:06
smoother but you can actually see that
35:08
from time to time there are these big
35:09
bursts that you know take up a lot of
35:11
the that end up over any window of time
35:17
that you expand out there's still some
35:21
probability with which you'll see a big
35:22
burst of traffic showing up in that
35:24
window that's kind of a nice and
35:26
noteworthy characteristic of kind of
35:28
real-world data traffic in fact even
35:31
when you go to 10-second windows which
35:33
says look I'm looking at 10 seconds at a
35:35
time you get stuff that looks like this
35:37
MIT runs a website you can get
35:39
access to using your web certificates
35:41
it's called
35:44
MRTG mrtg.mit.edu you can actually
35:48
go to this website and you can see for
35:51
different switches including ones in
35:52
your dorm or wherever you are living if
35:53
you live on campus you can actually look
35:55
at the statistics from your router they
35:57
do this on a per switch level and it's
36:00
kind of interesting to see when people
36:02
use this network and when they don't I
36:04
think an interesting characteristic of MIT's
36:06
networks it turns out is if you look
36:08
at some of the dorm network traffic it
36:10
peaks at like between one and three or
36:12
one and four in the morning which is
36:15
probably good because honestly I think
36:18
MIT should negotiate preferential
36:19
pricing with ISPs because no one else is
36:21
using those ISP networks for that time
36:23
so it would be actually yeah it turns
36:26
out I learned that the Amazon Kindle
36:27
kind of does that when you do your
36:30
newspaper subscription they actually
36:31
send it to you through wireless
36:33
networks through these commercial 3G and
36:35
4G wireless networks and I believe that
36:37
what they do is they send it to you in
36:41
the middle of the night when not many
36:42
people other than at MIT are using those
36:44
networks so you know you could take
36:47
advantage of some of these time-varying
36:48
properties so why did I tell you the
36:51
story the same thing I showed you these
36:54
time windows the same thing applies when
36:55
you bring many many users together the
36:57
odds that we all are going to
37:00
click on you know some link at
37:02
exactly the same time and all of us
37:05
cause a burst of traffic to happen
37:06
exactly at the same time is extremely
37:08
small now it can happen if there's an
37:11
adversary in the network if there are
37:12
bad guys how many of you have heard
37:14
of denial of service attacks yeah DDoS
37:16
is distributed denial of service attacks you
37:17
know I understand if you know Russian
37:20
you get an edge in doing it so you know
37:23
so these things are launched because
37:25
they commandeer a whole bunch of
37:27
machines and they coordinate an attack
37:29
they destroy the assumptions that make
37:31
statistical multiplexing work
37:33
because the normal assumption is
37:35
people are not exercising the network at
37:37
the same time so you're not attacking
37:38
some website or whatever at the same
37:40
time but if you coordinate an attack
37:43
then you make that assumption not
37:45
hold causing congestion to happen
37:48
causing traffic to exceed what your
37:50
network link can support but under
37:53
normal non-adversarial conditions the
37:55
assumption is that people are you know
37:57
randomly gaining access to the network
37:59
which means that you can actually get
38:01
away with the design of a network that
38:02
looks like this as long as you study
38:05
statistics like the average amount of
38:07
traffic like on average this node is
38:08
not going to be sending
38:09
more than a certain
38:12
amount of traffic when measured over
38:13
some period of time what happens when
38:17
people send traffic in you know bursts what
38:21
happens when from time to time in fact
38:22
you see these bursts of traffic right
38:24
you look at this picture here you do it
38:27
over a one second window or a one
38:28
hundred millisecond window and you see
38:29
these big peaks of traffic lots of bytes
38:33
you know 200 millisecond window what
38:36
that really means is that this switch
38:38
here is going to be getting traffic from
38:40
different users that probably exceeds
38:44
you know is perhaps the sum of all of
38:46
the input links so it's a large amount
38:49
of traffic if you have a design like
38:50
this something's gotta give because
38:53
you're getting water or packets coming
38:55
in at one megabyte per second times
38:57
three and you got a link that can only
38:59
send one megabyte per second so what can
39:02
you do what can the switch do
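To put a number on how often this three-into-one design gets into trouble, here is a minimal Python sketch. It assumes, purely for illustration, that each sender is independently either silent or sending at the full link rate, with some probability p of being active at any instant; the independence assumption and the values of p are not from the lecture, and the function name is hypothetical. The outgoing link is overloaded whenever two or more of the three senders are active at once.

```python
from math import comb


def overload_probability(n_senders, p_active, max_simultaneous):
    """Probability that more than max_simultaneous senders are active.

    Assumes each of the n_senders is active independently with
    probability p_active (a binomial model, chosen for illustration).
    """
    return sum(
        comb(n_senders, k) * p_active**k * (1 - p_active) ** (n_senders - k)
        for k in range(max_simultaneous + 1, n_senders + 1)
    )


# Three senders at 1 megabyte per second each, one outgoing link at
# 1 megabyte per second: the link keeps up with at most one active sender.
for p in (0.1, 0.3, 0.5):
    print(f"p_active={p}: overloaded {overload_probability(3, p, 1):.3f} of the time")
```

Under these assumptions a sender that is active 10 percent of the time overloads the link about 2.8 percent of the time, which is the kind of gap a queue is meant to cover. A coordinated attack breaks the independence assumption, so this estimate no longer applies.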
39:07
the easiest thing it can do is just drop
39:10
it just say you know what just just drop
39:14
it and it's not you know you laugh but
39:16
I'm telling you sometimes dropping it
39:18
and letting the end point deal with it
39:20
is a better strategy than holding on to
39:21
it and simply keeping it on line it's
39:23
like you gotta be careful right I like
39:25
the idea of storing it but for how long
39:27
you store it and how much do you store
39:32
for example if I look at that burst of
39:35
traffic here and I have a network like
39:37
this and I look at this big burst of
39:38
traffic here over a ten-second window
39:40
I'm seeing traffic that's probably in
39:42
this example perhaps 10 or 100 times
39:45
bigger than the average the average is
39:47
sitting down somewhere and maybe this is
39:50
ten times the average the peak to
39:51
average ratio might be ten to one or
39:52
twenty to one so how much should you
39:55
store inside the switch if you were
40:00
designing a network and I told you well
40:01
alright good idea why don't you store
40:03
the packets you're gonna put these
40:05
packets into a data structure called a
40:07
queue right packets come in packets go
40:10
out packets go out whenever the link is
40:13
able to send packets you keep shipping
40:14
packets out in the meantime traffic is
40:15
coming faster than you can handle
40:17
you're gonna put stuff in a queue how
40:19
much you want to keep everything
40:24
if you did you'd be like Disney World
40:26
because they have these lines that go
40:28
forever like nothing's moving and
40:29
everybody just piles on the end of the
40:30
line this is a tough question we're
40:35
gonna answer this question somewhat
40:37
there's no single easy answer to this
40:40
question but the rule of thumb that I'm
40:42
gonna have you keep in mind for now is
40:43
you're probably going to keep between 10
40:44
milliseconds and 100 milliseconds worth
40:47
of traffic I'll get to why later for now
40:50
it's some small amount of time's worth
40:52
of traffic the reason why you
40:57
need this queue is to absorb a burst of
41:00
traffic that you're not able to
41:01
immediately send but the important
41:04
principle in packet-switched networks is
41:05
you need a queue but queues are a necessary
41:07
evil because the only good thing that the queue
41:10
is doing for you is absorbing the bursts
41:12
but the bad thing that
41:14
it's doing for you is adding delay just
41:17
because you have a queue the network ain't
41:19
gonna move faster the network is moving
41:21
the link's moving at the same speed
41:22
whether you have a queue or not the only
41:25
thing the queue is doing is it absorbs a
41:26
burst so that whenever the
41:29
network link is able to send packets
41:31
you can ship packets from the queue and
41:33
you don't want to drop too many packets
41:35
now if you're lucky the size of the
41:38
queue is enough to absorb all of the burst
41:40
and then the traffic eases when you get
41:42
to send the rest but if you're unlucky
41:43
the queue overflows and you drop some
41:45
packets and then the endpoints have to
41:46
somehow deal with it so what are the
41:51
things we've looked at packet-switched
41:53
networks as defined by a header which
41:56
includes a destination address the way
41:58
the network works is that the sources
42:01
just ship a packet with the header that
42:03
includes the destination address the
42:04
switches somehow are going to figure out
42:05
how to ship how to get those packets to
42:07
the destination
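The queue behavior described above (absorb a burst, add delay, drop on overflow) can be sketched as a small time-slotted simulation. This is a toy model under stated assumptions, not how any particular switch is built: time is divided into slots, the link sends a fixed number of packets per slot, and the function names, arrival pattern and capacities are all hypothetical. A second helper applies the 10 to 100 millisecond rule of thumb to a link rate.

```python
from collections import deque


def simulate_queue(arrivals, link_rate, capacity):
    """Run a drop-tail FIFO queue over time slots.

    arrivals[t] is the number of packets arriving in slot t, link_rate is
    the number of packets the link can send per slot, and capacity is the
    most packets the queue can hold. Returns (sent, dropped, max_wait),
    where max_wait is the longest time any sent packet spent queued, in slots.
    """
    queue = deque()  # each entry is the slot in which that packet arrived
    sent = dropped = max_wait = 0
    t = 0
    while t < len(arrivals) or queue:
        incoming = arrivals[t] if t < len(arrivals) else 0
        for _ in range(incoming):
            if len(queue) < capacity:
                queue.append(t)
            else:
                dropped += 1  # queue is full: the packet is lost
        for _ in range(min(link_rate, len(queue))):
            max_wait = max(max_wait, t - queue.popleft())
            sent += 1
        t += 1
    return sent, dropped, max_wait


def buffer_bytes(link_rate_bytes_per_s, buffer_ms):
    """Bytes needed to hold buffer_ms milliseconds of traffic at the link rate."""
    return link_rate_bytes_per_s * buffer_ms // 1000


burst = [3, 3, 0, 0, 0, 0]  # two slots of 3x overload, then quiet
print(simulate_queue(burst, link_rate=1, capacity=4))   # small queue
print(simulate_queue(burst, link_rate=1, capacity=10))  # larger queue
print(buffer_bytes(1_000_000, 10), buffer_bytes(1_000_000, 100))
```

In this toy run the small queue drops one packet, while the larger queue drops none but makes the last packet wait longer. That is the trade-off: a bigger queue absorbs more of the burst, and the price is delay, because the link itself is no faster.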
42:09
the reason why this stuff works is because
42:11
of the statistical multiplexing and
42:14
finally the reason we need a queue in a
42:18
packet-switched network is to absorb these
42:20
bursts so what I want to do in the
42:23
remaining six or seven minutes is to
42:26
tell you about the other metric by which
42:29
we're going to evaluate our networks
42:31
the first metric I introduced
42:32
already is the rate of a link when you
42:36
have links of different rates you can
42:38
also define the rate for an actual
42:39
communication when a sender source sends
42:41
a packet to a destination you can
42:43
measure the rate at which bits are
42:45
arriving at the destination that's the
42:47
throughput of the data transfer or the
42:50
bit rate the other metric we're going to
42:52
care a lot about is called the delay the
42:59
fancy term for delay is latency I really
43:04
don't know why they have two terms but
43:05
you know from time to time people use
43:08
the word delay or latency and by the way
43:10
I'm going to try hard to use the word
43:12
rate here or bitrate or throughput often
43:16
you see the word bandwidth like oh my
43:17
bandwidth is 10 megabits per second and
43:20
that's actually fine to use except it's
43:22
confusing in a real communication system
43:24
because we've already
43:25
used the word bandwidth to refer to a
43:27
frequency and so we've already said that
43:30
bandwidth is defined in terms of say
43:32
Hertz or something like that and it's
43:36
just a little confusing to also use
43:38
bandwidth for rate so we're going to try
43:40
to use words like bitrate and throughput
43:41
to refer to bits per second so delay is
43:44
measured in seconds or milliseconds or
43:46
microseconds and what we want is
43:50
you have a source that sends a
43:53
packet or set of packets and let's say
43:56
a single packet to a receiver going
43:57
through a network of switches and I want
44:01
to ask if I send a packet at some point
44:04
in time let's say at time zero
44:07
when does that packet reach the receiver
44:10
okay
44:12
that's the delay for a single packet so
44:15
I just want to explain to you how to
44:17
calculate this or how to how to measure
44:19
this so let's say that the packet has a
44:24
size of L bits so what is the answer
44:31
depend on
44:38
let's take an even simpler example let's
44:40
say that I have a sender I have a
44:42
receiver I have one link between them no
44:44
switches and the packet has size L bits
44:48
I send a packet at time zero when does
44:53
the packet when does the last bit of the
44:54
packet show up at the receiver yes good
45:03
so I need to define this thing here
45:05
let's say that the bitrate of this link
45:08
is C bits per second so I have L bits
45:13
and I have a link that can send packets
45:15
at C bits per second therefore something
45:19
here should be L divided by C seconds
45:23
that is from the moment I start shipping
45:26
these bits the time delay between
45:29
when the first bit
45:32
arrives at the receiver and the last bit
45:34
arrives at the receiver that time
45:36
distance or time difference is C divided
45:40
by L sorry L divided by C seconds
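As a worked instance of L divided by C, here is a minimal Python sketch. The packet size (1000 bytes) is an assumed example value, and the link rate reuses the 1 megabyte per second figure from earlier, written as 8,000,000 bits per second; the function name is hypothetical.

```python
def transmission_delay(packet_bits, link_bits_per_s):
    """Time in seconds to push packet_bits onto a link of link_bits_per_s (L / C)."""
    return packet_bits / link_bits_per_s


L = 1000 * 8    # assumed packet size: 1000 bytes, in bits
C = 8_000_000   # 1 megabyte per second, in bits per second

print(f"transmission delay: {transmission_delay(L, C) * 1000:.3f} ms")
print(f"spacing between consecutive bits: {1 / C * 1e6:.3f} microseconds")
```

Note that the result depends only on the packet size and the link rate, not on how long the link is.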
45:45
right because I ship these bits these
45:49
bits go back-to-back over the link if I
45:51
look at when the first bit arrives and I
45:53
look at when the last bit arrives that
45:54
time difference is the spacing you know
45:57
the time difference between any two bits
45:59
showing up at the receiver is one over C
46:02
seconds because if the link can send C
46:05
bits per second any two consecutive bits are
46:06
separated by one
46:08
over C seconds therefore from the time
46:12
at which the first bit arrives to the
46:14
time at which the last bit arrives that
46:16
distance is L over C that difference is
46:19
L over C seconds this L over C has a
46:22
name associated with it this is called
46:24
the transmission delay now let's say I
46:34
want to send just one bit I have to now
46:37
I send a bit at some point in time and
46:39
that bit shows up at some point in the
46:42
future right because it can't show up
46:44
immediately if it did you know we'd have
46:47
probably have to change the laws of
46:48
physics because you know speed of light
46:51
is no longer valid as a finite limit so
46:55
what is the time between when I send the
46:57
first bit and when the first bit shows
46:58
up here what does that depend on
47:03
but let's think I want to communicate to
47:05
the moon I send one bit of information I
47:09
put it out onto the or even one sample I
47:11
put it out on the on the radio or
47:13
whatever how long before it gets to the
47:14
moon
47:18
depends on what does it depend on the
47:22
rate at which I can communicate no
47:24
what does it depend on sorry you guys
47:30
said it what is it speed in the speed of
47:33
what it's the speed of light in that
47:37
medium or speed of whatever the signal
47:38
you use is if it's acoustic then it's
47:40
the speed of sound over the medium so it
47:42
depends on the distance and depends on
47:44
the speed at which a signal can
47:46
propagate through that communication
47:47
medium for example the speed of light so
47:49
the distance is D and the speed of the
47:53
communication medium is let's say V then D over V
47:57
is called the propagation delay so
48:03
let me organize this properly so I'm not
48:05
confusing everybody with these different
48:08
terms so so far we've hit two sources of
48:10
delay the first source of delay which I
48:13
said second is the propagation delay
48:18
this is the time it takes for the first
48:20
bit to get to the other side it depends
48:23
on the speed at which a signal
48:24
propagates through the medium and the
48:26
distance between sender and receiver so
48:28
sound travels at one foot per
48:29
millisecond I think or roughly something
48:31
like that so if I'm doing acoustic
48:33
that dictates the propagation delay the
48:36
second delay is the transmission delay
48:43
which depends on this L over C the third
48:48
delay is whatever processing delays
48:50
there are that for example is when a
48:56
switch gets a packet it has to look at
48:57
the packet header figure out the
48:59
destination do something to that there's
49:01
some computation time that the switches
49:03
you know have to work with that delay is
49:06
called the processing delay this is
49:07
purely some sort of computing delay and
49:10
it's usually very very small and the
49:13
fourth delay is the queuing delay
49:15
because it could be that you get these
49:18
packets in and they have to sit behind
49:19
in a queue and that imposes a delay in
49:22
communication so that's called the
49:25
queuing delay and it's usually a very
49:26
variable source of delay on many
49:29
networks these other delays are constant
49:31
not always but generally constant the
49:34
transmission delay may or may not be
49:35
constant but usually these are more
49:37
constant the queuing delay is not a
49:39
constant delay and the actual delays
49:41
that you experience when you click on a
49:43
web link there are other reasons
49:45
why the website is slow but these are a
49:48
principal dominant factor in many many
49:50
cases so we'll pick up on this next week
49:53
after the quiz 2 stuff you deal with
49:55
quiz 2 and pset 6 and we'll continue
49:58
with multi hop networks
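The four sources of delay listed above can be added up for a single link in a short Python sketch. The function names and the example packet and link values are assumptions for illustration; the moon example uses the average Earth to moon distance of about 384,400 km and the speed of light in vacuum. Processing and queuing delay are left as inputs because they depend on the switch and on the traffic at that moment.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum
EARTH_MOON_M = 384_400_000            # average Earth to moon distance, about 384,400 km


def propagation_delay(distance_m, signal_speed_m_per_s):
    """Time for one bit to cross the medium: distance divided by signal speed."""
    return distance_m / signal_speed_m_per_s


def transmission_delay(packet_bits, link_bits_per_s):
    """Time between the first and last bit leaving the sender: L / C."""
    return packet_bits / link_bits_per_s


def one_link_delay(distance_m, signal_speed_m_per_s, packet_bits,
                   link_bits_per_s, processing_s=0.0, queuing_s=0.0):
    """Total delay over one link as the sum of the four components."""
    return (propagation_delay(distance_m, signal_speed_m_per_s)
            + transmission_delay(packet_bits, link_bits_per_s)
            + processing_s
            + queuing_s)


moon_propagation = propagation_delay(EARTH_MOON_M, SPEED_OF_LIGHT_M_PER_S)
total = one_link_delay(EARTH_MOON_M, SPEED_OF_LIGHT_M_PER_S,
                       packet_bits=8000, link_bits_per_s=8_000_000)
print(f"propagation to the moon: {moon_propagation:.2f} s")
print(f"one 1000-byte packet, no processing or queuing delay: {total:.4f} s")
```

For this long link the propagation term dominates and the link rate hardly matters; on a short link with a busy switch, the queuing term is usually the one that varies.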
50:06
you