00:00
the following content is provided under
00:01
a Creative Commons license your support
00:04
will help MIT OpenCourseWare continue to
00:06
offer high quality educational resources
00:08
for free
00:09
to make a donation or view additional
00:12
materials from hundreds of MIT courses
00:14
visit MIT OpenCourseWare at ocw.mit.edu
00:25
so good afternoon continuing our story
00:28
about networks what we've seen so far
00:30
is a story where you have a network that
00:34
you're trying to connect computers to
00:36
communicate together and we use a
00:39
network with switches arranged in some
00:43
topology to allow us to find paths
00:50
between computers and so we looked at
00:52
the routing problem we looked at two
00:53
different routing protocols to solve
00:54
that earlier we talked about this idea
00:58
of a packet-switched network and there
01:00
are queues in packet-switched networks
01:01
and when traffic comes in too fast and
01:04
the queues overflow packets may get
01:06
dropped we also looked before at links
01:09
which have errors on them and so if you
01:11
have errors on links too and your coding
01:13
scheme isn't able to correct for those
01:15
errors packets may get lost and when we
01:18
looked at shared media networks with MAC
01:19
protocols you know depending on the MAC
01:21
protocol you use you may have collisions
01:22
which means that packets may get
01:25
lost so what you have is a packet
01:28
switched network that has the property
01:30
of something called a best-effort
01:31
network and what best-effort means is
01:39
that the network has a few properties
01:41
that you have to cope with the first
01:43
property of a best-effort
01:45
network is that packets may get lost
01:53
the second problem in a best-effort
01:55
issue that arises in a best-effort
01:57
network is that packets have delays but
02:02
the delay is variable in particular
02:08
queuing delays that happen in
02:09
switches are variable delays the third
02:15
property of a packet-switched network is
02:16
that you know each packet is treated
02:19
independently by the network so it could
02:21
be that you have a stream of packets you
02:23
want to send say belonging to a video
02:25
stream or a file and the sender sends
02:28
them in some sequential order
02:30
but these packets may take different
02:32
paths through the network and in fact
02:35
there may be switches in the network for
02:36
whatever reason that may not treat
02:37
packets in first-in first-out order they
02:41
may reorder packets but more generally
02:43
packets may take different paths through
02:44
the network because the routing protocol
02:47
may decide to change the paths on you and
02:49
so packets may get reordered
02:58
and the fourth issue in a best-effort
03:01
network is that in fact packets may get
03:04
duplicated so you may have the
03:07
same packet show up multiple times even
03:09
though it was sent only once for a bunch
03:11
of reasons one of them is there could be
03:13
problems in the implementation of the
03:15
switches or the nodes that cause packets
03:17
to get duplicated but it could also be
03:19
that you may have a link with a high
03:22
packet loss rate or this may be a shared
03:24
medium where you have a MAC protocol
03:27
that has collisions so you may have a
03:29
retransmission protocol that may try to
03:31
resend a packet a few times at the
03:33
lowest layer over a shared
03:35
medium or a link and sometimes multiple
03:39
copies of the
03:40
same packet may get through and we
03:42
should understand why that happens more
03:44
today so packets may get duplicated so
03:48
in a way a packet-switched network in fact
03:51
the Internet is great because it's very
03:53
easy to build and the reason it's easy
03:55
to build in some sense is because about
03:57
the only property that you're providing
03:59
from the design of the network is to
04:01
tell the endpoints oh I might get your
04:03
packets through there's no guarantees on
04:06
anything as long as there's some nonzero
04:08
probability of getting a packet through
04:10
from one end point to the other that's
04:12
pretty much all it takes to declare that
04:14
you have a conforming best-effort
04:15
network so it's easy to build but of
04:18
course it means that you have all these
04:19
issues that you want to deal with if you
04:21
actually want to run applications so an
04:23
example of an application is let's say
04:25
you are trying to download a
04:27
web page or a set of web pictures
04:31
and text on a page what you would like
04:34
is an abstraction that you can implement
04:37
some sort of a scheme you can implement
04:39
in the network or between the
04:42
endpoints that makes it so that an
04:44
application sends a bunch of bytes or
04:46
packets or sends a message and at the
04:50
receiving side you get those bytes
04:52
reliably
04:54
so that's what we're going to understand
04:56
today we're going to look at today and
04:57
next week we're going to look at how to
04:59
implement a protocol that provides
05:01
reliable data transport
05:10
and the ideas we're going to look at are
05:13
probably the ideas that are in the
05:15
world's most popular computer program
05:17
it's the most popular in that it
05:20
runs in the most number of places and
05:21
it's a protocol called TCP which stands
05:25
for the transmission control protocol
05:27
now we're not going through all the gory
05:29
details of TCP we're going to look at a
05:33
simplified version of this protocol so
05:38
maybe we'll call it TCP Lite but it'll cover the
05:43
main idea of how you can achieve
05:46
reliability and this is you know this
05:48
this particular program runs
05:51
on pretty much every computer and every
05:54
phone and every little device that's on
05:56
the Internet today so it's really really
05:58
popular in fact we're going to start
06:00
with a simpler protocol that is a
06:03
reliable data transport protocol that
06:06
isn't used between endpoints TCP is used
06:09
between endpoints now we're going to
06:11
look at a version of a protocol that's a
06:12
simple version that actually runs in
06:14
every 802.11 device like your laptop
06:17
phone and access point so we will study
06:19
that protocol too first in the context of
06:22
end-to-end you know between endpoints
06:24
reliable data transport the problem is
06:26
the following you have some network here
06:30
it's a best-effort network with those
06:32
properties and what you want is you have
06:35
an application at one end and you have
06:38
an application at another end running on
06:41
some endpoints
06:45
what this system provides that we're
06:48
going to study is an
06:51
abstraction where you run software here
06:53
you run software at this end and all the
06:56
stuff sits on your end node this is your
07:00
endpoint and the abstraction provides
07:08
some nice properties the application
07:11
writes some data in here at the sender
07:13
sending end so let's call this the
07:14
sender and the other guy is the receiver
07:17
the application writes stuff inside here
07:19
the network is a best-effort network and
07:22
there's some protocol between these two
07:25
pieces of software that make sure that
07:29
no matter what the network does what
07:31
goes up here into the application is
07:35
exactly the data that was written from
07:38
this application in exactly the same
07:40
order in which it was written so it
07:43
provides reliable and in order delivery
07:46
of data so reliable and in order so
07:54
every piece of data that's sent shows
07:56
up in exactly the same order exactly
07:57
once at the receiver and these two ends
08:01
are called transport
08:05
these two ends constitute the
08:07
transport layer and they run at the
08:13
endpoints okay so the application writes
08:16
in here the transport protocol delivers
08:18
up to the application stuff that's
08:20
reliable and in order and in particular
08:24
it provides the semantics that you can
08:27
think of as exactly once semantics in
08:29
other words anything that's sent is
08:31
delivered exactly once to the receiver and
08:34
it's delivered in order now that's the
08:36
abstraction that TCP as well provides
08:38
and that's the abstraction of our 6.02
08:41
protocol is reliable in order exactly
08:44
once delivery now there are other
08:46
implementations you can have there are
08:48
protocols you could have which provide
08:49
reliability but not in order so you know
08:51
I'll give you all the data that you send
08:53
but it may show up in different order
08:54
and it's your problem to fix it or you
08:56
might provide a protocol that provides
08:58
in order but not reliable so I mean if
09:00
I'm doing a real-time video conferencing
09:05
say Skype Skype would probably want to
09:08
provide a protocol that's in order but
09:10
not reliable because you know if I speak
09:12
you'd like to actually get those things
09:14
into the Skype application in order but
09:17
it's not really required that it be
09:19
reliable because if the you know if a
09:22
message shows up say more than 100 or
09:24
200 milliseconds after I spoke it you
09:27
know it's going to start the
09:28
conversation it's not going to be
09:29
intelligible to you and the human ear is
09:30
wonderful the human brain is
09:32
wonderful at dealing with some clipping
09:34
in the voice you know occasional
09:36
packets get lost it's not the end of the
09:37
world so there are lots of interesting
09:39
applications where in order is useful
09:41
but not perfectly reliable and applications
09:43
where reliable is useful but not
09:45
necessarily in order BitTorrent would be
09:47
an example of an application where you
09:49
know eventually you want all of those
09:50
movies that you're trying to get but who
09:52
cares what order they come in you're not
09:53
going to start watching it until the
09:54
whole file is assembled and so the
09:57
protocol that BitTorrent uses in effect
09:59
you know it's a complicated protocol
10:00
it's not point-to-point but in effect it
10:02
provides reliability without worrying
10:04
about ordering so there's lots of
10:05
combinations the combination we care
10:07
about is reliable in order essentially
10:09
giving you the illusion that
10:10
you have a circuit between the two end
10:12
points or a wire between the two end
10:14
points okay so is the abstraction clear
10:17
everyone understands what we're trying
10:19
to solve and in between this is just
10:21
think there's an adversary or some
10:23
network in the middle but you know you
10:24
send packets and the thing is just
10:26
throwing packets away and every once in
10:28
a while just for the heck of it it
10:29
decides to delay a packet for a long
10:32
time and every once in a while it
10:33
decides to send packets in
10:35
different you know along different paths
10:37
and your job is to deal with all of that
10:39
and design the sending side and the
10:41
receiving side so stuff shows up
10:43
reliably and in the same order in which
10:46
it was sent so we're gonna try to solve
10:48
this problem we're gonna solve it first
10:49
by coming up with a protocol it has a
10:52
nice name to it called stop-and-wait
10:53
it's a very simple idea and this will be
10:58
a protocol that works but it's slow but
11:01
the good news is it works it's correct
11:02
it gives the semantics we want and then
11:04
we will try to improve its performance
11:06
it's a very very simple idea I'm sure
11:08
you you know you think about this for
11:10
three minutes you'll come up with
11:11
something that looks like this you take
11:13
the message you want to send you know
11:15
whatever file stream of video whatever
11:16
it is and break it up into packets so
11:19
far there's nothing new here the main
11:21
first idea is we're going to number
11:23
every packet with a sequence number so
11:25
that's what's shown here as data one
11:27
data two data three and so forth so we're
11:29
going to use a sequence number on every
11:31
packet
11:34
now again there are many ways to
11:36
implement sequence numbers the way
11:37
that's the simplest and most
11:39
conceptually clean is every packet has a
11:42
sequence number that increments by 1 for
11:44
every subsequent packet that's sent and
11:49
you might initially start the sequence
11:51
numbering at 0 or 1 or whatever the
11:53
sender and receiver have to agree on
11:55
that now in reality TCP in the real
11:57
world is a little more complicated TCP
11:59
provides sequence numbering by numbering
12:02
the bytes by the byte offset in the
12:04
stream so if you send a packet with just
12:06
25 bytes and the next packet is 200
12:08
bytes the first packet's gonna have a
12:11
sequence number of let's say 0 the
12:13
second packet is gonna have a sequence
12:14
number of 25
12:15
because it numbers the starting of the
12:17
byte offset but these are all details
12:19
that to first order we don't have to
12:21
really worry about the important point
12:23
is there's a sequence number and a
12:25
sequence number is a unique identifier
12:27
for the packet in other words if I later
12:29
send a packet with the same sequence
12:31
number I have to guarantee that the
12:33
material inside the packet is the same
12:35
as it was before so the assumption is
12:37
that this is a unique identifier for the
12:39
contents of the packet so it's a unique
12:42
identifier
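The two numbering schemes just described can be sketched in a few lines. This is only an illustration: the function names are made up for this example, and the 25-byte and 200-byte payload sizes are the ones from the spoken example.

```python
def packet_sequence_numbers(num_packets, start=0):
    """Per-packet numbering: each packet's number is one more than the last."""
    return [start + i for i in range(num_packets)]


def byte_offset_sequence_numbers(payload_sizes, start=0):
    """Byte-offset numbering: a packet is numbered by the offset of its first byte."""
    numbers = []
    offset = start
    for size in payload_sizes:
        numbers.append(offset)
        offset += size  # the next packet starts where this one ended
    return numbers


# Three packets numbered 1, 2, 3 (sender and receiver must agree on the start).
print(packet_sequence_numbers(3, start=1))

# A 25-byte packet at offset 0 is followed by a packet at offset 25.
print(byte_offset_sequence_numbers([25, 200]))
```

Either way the number identifies the contents, so a retransmission carries the same number and the same bytes.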
12:43
we won't reuse it for some other set of
12:46
bytes we will always use it again for
12:48
the same set of bytes if we have a
12:49
retransmitted packet with the same
12:50
sequence number when the receiver gets
12:54
the packet with a certain sequence
12:55
number it does what the post-office does
12:57
if you send registered post you turn
12:59
around and send an acknowledgment and to
13:02
allow the sender to know which packet's
13:05
being acknowledged you stick in the
13:07
sequence number of the packet that's
13:08
being acknowledged so you send sequence
13:10
one data one you get ACK one data
13:12
two you get ACK two and everything is
13:14
wonderful it's easy easy protocol
13:18
so what happens when a packet's lost you
13:20
get this data's lost what's going to
13:22
happen is the sender is not going to get
13:24
an acknowledgement and after some period
13:27
of time called the timeout the sender
13:30
decides that it wants to retry that
13:31
packet and it tries to resend the packet
13:33
and if it works it gets an
13:35
acknowledgment when it gets that
13:36
acknowledgement that's when it goes and
13:38
sends the next packet so the property of
13:40
stop-and-wait protocol is that you send
13:43
a packet only after you get an
13:45
acknowledgment you send packet K plus-1
13:47
only after you get an acknowledgement
13:48
for packet K if you don't get an
13:51
acknowledgement for packet K you wait
13:53
for a period of time called the timeout
13:55
and after that timeout elapses you
13:58
retransmit the packet that you
14:00
considered was lost that you thought was
14:03
lost okay simple now is this protocol
14:08
reliable and and when I ask that
14:11
question you have to assume that the
14:12
network may drop and reorder and do
14:15
whatever it does to packets but there's
14:16
always a nonzero probability that any
14:19
packet or data packet or acknowledgment
14:21
packet sent on the network has a nonzero
14:24
probability of reaching the other side
14:25
because if the probability of packet
14:26
loss is one I mean no one can
14:28
help us
14:28
so is this protocol reliable
14:33
okay is this protocol in order it is in
14:37
order in the way I've not actually
14:38
described what the receiver does but I
14:41
should tell you that the receiver's
14:43
semantics here are when the receiver
14:45
gets a packet it delivers it to the
14:47
application now it'll turn out that this
14:51
protocol is not necessarily in order the
14:54
way I described it now I'll come back to
14:56
why but so far it looks like the
14:58
protocol is in order but remember what I
14:59
said about the receiver when the
15:01
receiver gets the data packet it
15:03
delivers it up to the application so is
15:08
the protocol potentially not in order
15:10
it's not actually in order we'll get
15:12
back to why you have a question yeah
15:23
yes
15:29
right so I haven't specified that and
15:32
you are one step ahead you're at the
15:36
next picture here what happens in this
15:37
case you get a duplicate packet and in
15:40
fact that's precisely for this reason
15:41
that this protocol is not actually it's
15:44
kind of in order but in order means that
15:45
you deliver packets in the same order in
15:47
which they were sent and in the way I've
15:49
described the description given the
15:50
description of this protocol this
15:52
protocol does not provide exactly once
15:54
semantics right it provides at least once
15:57
semantics in other words every packet's
15:58
delivered at least once to the application
16:00
and what you would like is to deliver
16:02
every packet exactly once to the
16:03
application in order so what would you
16:05
have to do at the receiver in that in
16:07
the software that you write at the
16:09
receiver transport to take the same idea
16:12
and make it be a reliable in order
16:15
exactly once protocol yes
16:21
sorry second
16:25
look up if you receive that sequence
16:26
number good
16:28
so one implementation is you perhaps
16:31
keep track of all the sequence numbers
16:33
you've ever received and delivered up to
16:34
the application if the new guy comes in
16:36
you look and see if it's in your list
16:38
and deliver it only if it's not you could do
16:41
better do you have to do all that work
16:46
do you have to keep track of the list of
16:48
all the sequence numbers you've ever
16:50
received in order for this protocol to
16:51
work yeah is it enough to keep track of
16:56
simply the very last one you've
16:57
delivered and also guarantee that you'll
16:59
only deliver stuff in order so if you
17:01
get up to packet number 17 and you now
17:04
get 18 you deliver it up to the
17:06
application and update your counter to
17:07
be set from 17 to 18 as the last
17:10
sequence number you've delivered if your
17:12
last sequence number delivered in order
17:14
is 17 and you get 16 throw it out if you
17:17
get 17 you throw it out if you get
17:18
anything if you get 19 which probably
17:21
shouldn't happen in this protocol unless
17:22
there's a mistake in the implementation
17:24
at the sending end if the last sequence
17:26
number I got was 17 can the sender send
17:29
19
17:33
why not
17:36
because that's right so unless there's a
17:38
bug in either side of the implementation
17:40
which trust me when you implement it
17:41
you'll probably end up having some you
17:43
know bugs and you'd know something is
17:45
amiss but there are these invariants
17:46
that have to hold the sender can send K
17:48
plus 1 only if it gets an ACK for K the
17:50
sender gets an ACK for K only if the
17:52
receiver got K and therefore if the
17:54
receiver's last in-order sequence number
17:56
received and delivered to the application
17:58
was 17 it can't actually get a 19 in a
18:01
correctly implemented protocol but if it
18:04
does because you know in the real world
18:05
you don't know who the heck wrote the
18:06
sending side you know you might have
18:07
done your receiver and the sender might
18:09
have been done by oh I don't know
18:10
Microsoft and it may have an issue with
18:12
it or Apple or whoever I mean you don't
18:14
want to trust it right so you have to be
18:16
careful about making sure that you might
18:18
want to assume the protocols you don't
18:21
want to assume necessarily that the
18:22
other guy's implemented the protocol
18:23
right because he might not have and so
18:26
who knows what might happen so your role
18:28
at the receiver is to rigidly obey
18:31
whatever the you know the discipline is
18:32
which is you deliver up a packet exactly
18:34
in order okay so we wanted
18:39
exactly once semantics and the way you
18:43
get that is you get that by keeping
18:44
track of the very last sequence number
18:46
that you received so this protocol so
18:49
the first idea is sequence numbers the
18:51
second idea is a retransmission
18:56
after a timeout now how big should this
19:01
timeout be this whole protocol rests on
19:04
this magic timeout what should it be 15
19:07
17 what are the units of the timeout
19:10
actually what are the units of this
19:12
timeout it's time so it's like seconds
19:17
or milliseconds or something how big
19:18
should it be what five milliseconds
19:27
yeah units of seconds or milliseconds
19:29
good but how would you pick it
19:35
okay good so that's a good idea there's
19:37
this thing I've written on the left
19:38
called the round-trip time but
19:40
you don't know the round-trip time but
19:41
you could measure the round-trip time
19:42
and I'll talk about how
19:44
you measure it a little bit later but
19:46
it's important to realize that if you
19:48
make the timeout be smaller than the
19:50
round-trip time where the round-trip
19:52
time is defined as the time at which you
19:54
sent a packet to when you got an
19:56
acknowledgement for that packet if you
19:58
make the timeout smaller than the
19:59
round-trip time what happens in this
20:00
protocol let me first ask is the
20:03
protocol still correct by correct I mean
20:07
does it provide reliable in order
20:09
delivery okay it's correct because I
20:12
mean the correctness does
20:14
not rest on how we pick the timeout
20:16
however what is the problem with making
20:18
the timeout much smaller than the
20:19
round-trip time yeah you know if the
20:23
protocol is gonna be you know you're
20:25
gonna be retransmitting and
20:26
retransmitting and using up a lot more
20:28
of the network's resources than you need
20:30
to in order for you to actually get your
20:32
protocol to work correctly and you might
20:33
if the timeout is really really small
20:35
you would probably congest the network
20:37
okay
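The pieces discussed so far (sequence numbers, ACKs, retransmission after a timeout, and a receiver that remembers only the last sequence number it delivered) can be put together in a small simulation. This is a rough sketch under stated assumptions, not the course's reference code: the names Receiver and stop_and_wait are invented here, losses are independent with the same probability in each direction, and time is not modeled, so a lost data packet or lost ACK simply stands for "the timeout expired and the sender retransmits".

```python
import random


class Receiver:
    """Delivers each packet to the application exactly once and in order."""

    def __init__(self):
        self.last_delivered = 0   # sequence numbers start at 1 in this sketch
        self.delivered = []       # stands in for the application

    def on_data(self, seq, payload):
        if seq == self.last_delivered + 1:
            self.delivered.append(payload)
            self.last_delivered = seq
        if seq <= self.last_delivered:
            return seq            # ACK new packets and duplicates alike
        return None               # unexpected future packet: do not ACK


def stop_and_wait(messages, loss_rate, rng):
    """Send messages one at a time; move on only after the current one is ACKed."""
    receiver = Receiver()
    transmissions = 0
    for seq, payload in enumerate(messages, start=1):
        acked = False
        while not acked:
            transmissions += 1
            if rng.random() < loss_rate:
                continue          # data lost: sender times out and retransmits
            ack = receiver.on_data(seq, payload)
            if rng.random() < loss_rate:
                continue          # ACK lost: looks the same to the sender
            acked = (ack == seq)
    return receiver.delivered, transmissions


rng = random.Random(6)            # fixed seed so the run is repeatable
messages = ["d1", "d2", "d3", "d4", "d5"]
delivered, transmissions = stop_and_wait(messages, loss_rate=0.3, rng=rng)
print(delivered == messages, transmissions)
```

When an ACK is lost the receiver sees the same data packet again; it acknowledges it again but does not deliver it a second time, which is what turns at-least-once delivery into exactly-once delivery.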
20:38
so the timer has to be bigger than the
20:40
round-trip time the trouble in a
20:41
packet-switched network is that delay is
20:43
a variable in a best-effort network and
20:45
in fact packets may be reordered there
20:47
may be weird things going on in the
20:48
network which means that the round-trip
20:50
times are actually not constant they
20:51
vary with time they vary with other
20:53
traffic they vary with lots of other you
20:55
know factors and so what you want is an
20:58
adaptive method that would measure the
21:01
round-trip time estimate the round-trip
21:02
time and then come up with some sort of
21:04
an algorithm to compute or to set the
21:07
timeout as a function of the
21:09
observations of the round-trip time I'll
21:11
get back to that later on today and
21:13
we'll also talk about this in recitation
21:15
tomorrow it's actually a very nice
21:17
application of a very simple low-pass
21:19
filter so we will actually come back to
21:21
this idea but what I want you
21:23
to have in your head right now is
21:25
this idea that there's a timeout and the
21:26
timeout has to be which I'll call RTO
21:29
for retransmission timeout we have this
21:32
idea that the retransmission timer has
21:35
to be bigger than the round-trip time
21:37
okay so what I need to tell you still is
21:40
how to measure estimate the round-trip
21:42
time and how to use these estimates of
21:43
the round-trip time to pick the timeout
21:45
but let's subcontract that problem to
21:48
someone let's say that there's a
21:50
black box that will tell you what the
21:52
timeout should be and now you have this
21:54
protocol so assuming we have that black
21:57
box and someone telling you the
21:59
retransmission timeout what I would
22:01
like to do now is to spend some time
22:03
telling you how well this protocol works
22:05
I'd like to understand what is the
22:07
throughput which is the data rate that
22:10
you get if you run the stop-and-wait
22:13
protocol so that's what I want to do now
22:16
throughput of stop-and-wait so the input
22:24
here is I'm going to assume a very very
22:27
simple model I'm going to assume for a
22:28
minute that the round-trip time you know
22:31
doesn't change a whole lot this is a
22:33
very simplifying assumption but there's
22:36
some average round-trip time and I'm
22:38
gonna assume that the amount of time is
22:39
RTT the same result holds if the round-trip
22:43
time varies but just as a simple model let's
22:45
just assume that the round-trip time is fixed
22:46
and let's assume that somebody tells us
22:49
what the retransmission timeout is and I
22:51
need one more parameter I'm going to
22:53
assume that I know the network's packet
22:58
loss rate because intuitively if the
23:01
network's packet loss rate is zero that
23:03
is no packets are lost no data packets
23:06
no acknowledgments are lost then you
23:08
would expect this protocol has higher
23:09
throughput than if packets were lost
23:12
right if the packet loss rate is 50% you
23:15
would expect that what would happen is
23:16
well you know half the packets or ACKs
23:18
are getting lost which means you have to
23:20
retransmit the packet and every time you
23:22
retransmit the packet the protocol comes
23:23
to a wait and you have to wait until the
23:26
timeout happens so the bigger the packet
23:29
loss rate you would expect the protocol
23:31
to be slower so I'm going to assume that
23:32
we have RTT and RTO
23:36
and we have a packet loss rate of L so
23:45
what does what does that mean what it
23:47
means is that if I send a large number
23:48
of packets through the network a
23:50
fraction L of them will get lost and
23:53
I'll just assume in the simplifying
23:55
model that the packet losses are
23:57
independent so these are sort of
23:59
Bernoulli losses you know every packet
24:00
gets lost independently with
24:02
some probability now I also will assume
24:06
in this protocol does it matter to the
24:09
performance of the protocol if the
24:10
data packet is
24:12
lost or if the ACK packet is lost it
24:16
doesn't matter
24:18
this is an important point to understand
24:20
as far as the sender is concerned if a
24:22
timeout happens it has no way of
24:24
knowing whether the timeout happened
24:26
because the data was lost or because the
24:28
ACK was lost now of course the
24:30
receiver knows all right so the receiver
24:32
doesn't know if a timeout happened but
24:34
the receiver does know whether it got a
24:35
data packet or not but the sender the
24:37
only thing it's acting on is the
24:39
absence of an ACK and the absence of an
24:41
ACK indicates either that the data was
24:43
lost or the ACK was lost and it has no
24:45
idea which therefore we could assume
24:49
for this analysis that this packet
24:51
loss rate of L is actually a
24:53
bi-directional packet loss rate what I
24:58
mean by that is L is the probability
25:01
that either a data packet is lost or
25:03
its ACK was lost okay now if I give you
25:06
the one-way loss probability you can do
25:08
the calculation that's a probability
25:10
calculation to find out what is the
25:12
probability that either the packet was
25:13
lost or the ACK was lost that's an
25:15
easy calculation but let me just assume
25:16
that the probability that either a
25:18
data packet was lost or
25:20
its ACK was lost is L so given these
25:23
numbers what I want to do is given these
25:26
things I want to know what the
25:27
throughput is
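The probability calculation mentioned here, going from one-way loss rates to the bi-directional rate L, can be written out as a short sketch. It assumes the data packet and its ACK are lost independently; the function name and the 10% figures are made up for illustration.

```python
def bidirectional_loss(p_data, p_ack):
    """Probability that a data packet or its ACK is lost (independent losses).

    The exchange succeeds only if both the data packet and the ACK get
    through, which happens with probability (1 - p_data) * (1 - p_ack).
    """
    return 1.0 - (1.0 - p_data) * (1.0 - p_ack)


# With a 10% loss rate in each direction, about 19% of exchanges fail.
print(bidirectional_loss(0.1, 0.1))
```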
25:29
in other words how
25:34
many packets per second
25:36
am i transmitting am i able to transmit
25:39
or am i able to receive at the receiver
25:41
so if you want to look at what happens
25:43
in this picture if you draw time like
25:48
that you know you send a packet and
25:50
maybe you get an ACK here so d1 a1 you
25:56
send d2 you send d2 immediately and
26:00
you get a2 after some time and maybe you
26:04
have a timeout so you send d3 then you
26:10
have a period of time which is the RTO
26:15
no ACK happens you send d3 again and
26:20
maybe no ACK happens for a while you have
26:22
another RTO I'll assume that the RTO is
26:24
fixed here and you send d3 again and you
26:29
get an ACK here and then you send d4 here
26:33
and so forth right that's an example of
26:36
what could happen in a particular
26:37
evolution a time evolution of the
26:40
protocol what I mean by throughput is
26:43
that I would like to run such an
26:46
experiment for a very long time or run
26:47
many many such experiments which is sort
26:50
of equivalent to running an experiment
26:51
for a very long time and then count how
26:53
many packets did I successfully get at
26:56
the receiver or equivalently I can ask
26:58
how many ACKs did I get at the sender
27:00
over that long experiment right and the
27:03
number of ACKs that I get at the sender
27:05
divided by the time of that experiment
27:08
will tell me the number of packets per
27:10
second
27:12
right or put another way if I run the
27:17
experiment for some long period of time
27:19
and I receive n ACKs coming
27:22
back right if I receive n
27:27
acknowledgments
27:33
and if the expected time here between
27:37
when I send a data packet I send a data
27:41
packet I get an ACK I send a data
27:43
packet I get an ACK I send a data packet
27:44
and I get an ACK I sent a data packet
27:46
and I get an ACK
27:47
if I take the expected value of that
27:49
time that is the expected time between
27:51
when I send a packet and when I get an
27:53
ACK then one over that number 1 over the
27:59
expected time is equal to my throughput
28:05
in packets per second because if I run
28:12
the experiment for a long time I'm going
28:14
to get a you know some number of
28:16
acknowledgments so if I run it for some
28:18
period of time n times E of T
28:22
where E of T is this number here and I
28:24
get back n acknowledgments n divided by
28:26
n times E of T is my throughput and
28:29
therefore 1 over the expected time is
28:31
the throughput of my experiment right so
28:35
this should be intuitive because what's
28:36
really happening is with a little bit of
28:39
hand waving actually that you know
28:42
I send data I get an ACK send data I get
28:43
an ACK there's a certain expected
28:45
amount of time so I'm able to send one
28:47
over that packets per second okay so in
28:50
other words the throughput is the
28:51
reciprocal of the expected amount of
28:53
time between when I send a packet and
28:54
when I get an acknowledgment so it's
28:56
enough for us to compute the expected
28:58
value of this time right or the mean
29:00
value of that time
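The long-experiment argument above can be checked with a small simulation. This is a sketch under the same simplifying assumptions as the analysis: fixed RTT, fixed RTO, and independent losses with bi-directional probability L. The function names and the numbers (100 ms RTT, 200 ms RTO) are hypothetical.

```python
import random


def time_until_ack(rtt, rto, loss, rng):
    """Time from the first transmission of a packet until its ACK arrives."""
    elapsed = 0.0
    while rng.random() < loss:
        elapsed += rto          # no ACK: wait out the timeout, then retransmit
    return elapsed + rtt        # the successful attempt takes one round trip


def simulate_throughput(num_packets, rtt, rto, loss, seed=0):
    """ACKs received divided by total elapsed time, in packets per second."""
    rng = random.Random(seed)
    total_time = sum(time_until_ack(rtt, rto, loss, rng)
                     for _ in range(num_packets))
    return num_packets / total_time


# With no loss every packet takes one RTT, so throughput is about 1 / RTT.
print(simulate_throughput(10_000, rtt=0.1, rto=0.2, loss=0.0))

# With loss, timeouts stretch the average time per packet and throughput drops.
print(simulate_throughput(10_000, rtt=0.1, rto=0.2, loss=0.2))
```

Dividing the number of ACKs by the total time is the same as taking one over the average time between sending a packet and getting its ACK, which is why it is enough to compute that expected time.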
29:03
all right so we can do that calculation
29:06
in a simple way there's the sort of
29:09
tedious way to do it and there's a very
29:10
simple nice way to do it so we want to
29:12
calculate the expected time between data and
29:20
ACK and one way to do this is to say that
29:25
let's say I send a data packet one of
29:29
two things can happen
29:30
I either get an ACK for it or I don't
29:32
get an ACK for it what's the
29:34
probability that if I send a data packet
29:36
I get an ACK for it well the
29:41
probability that I send a packet and I
29:47
don't get an ACK for it is L therefore
29:49
the probability that if I send a data
29:51
packet I get an ACK for it is 1 minus L
29:53
right so with probability 1 minus L I
29:58
send a data packet and I immediately get
30:04
and when I say immediately I get an ACK
30:06
I get an ACK for that data packet right
30:10
and how long does that take if I get an
30:13
ACK for it the ACK comes back to me in a
30:15
time which is equal to RTT the
30:17
round-trip time right so therefore I can
30:20
write a formula that looks like this I
30:22
can write this expected time which I'm
30:25
trying to calculate as being equal to 1
30:27
minus L with probability 1 minus L the
30:31
expected time between when I send a data
30:32
packet and when I get an ACK for it is
30:35
equal to the RTT
30:39
right because one minus L is by
30:43
definition the probability that I send a
30:45
packet and I get an ACK for it send
30:47
data get an ACK right now what
30:49
happens with probability L with
30:52
probability L I send a data packet and I
30:54
don't get an ACK for it so now I want
30:57
to compute the expected time given that
31:00
I don't get an ACK for it the first
31:03
thing that has to happen is I need to
31:04
take a time out so I have to wait for a
31:06
period of time shown in this picture
31:08
given by the RTO and then once I wait
31:12
for that RTO and I now start by sending
31:15
a data packet the expected amount of
31:17
time before I get an ACK for that data
31:19
packet is exactly equal to the original
31:21
expected time that I'm trying to
31:23
calculate right because it doesn't
31:24
matter what happened in the past let's
31:26
say I take a time out and now I come
31:27
back here and I say I'm now going to
31:30
send a data packet what's the expected
31:32
time before I get an ACK well that's
31:33
exactly equal to the same answer that
31:36
we're trying to calculate this expected
31:37
time over here therefore I could write
31:39
this recursion type of relationship the
31:42
expected time is 1 minus L times the RTT
31:44
plus L times the RTO plus the same
31:50
expected time that I'm trying to
31:51
calculate right what this says is with
31:54
probability 1 minus L the time
31:56
it'll take for me to get an ACK is equal
31:59
to the RTT and with probability L it's
32:01
equal to first of all this RTO I have to
32:03
wait for that retransmission timeout and
32:05
then once I do that well I have to add
32:08
some more time and that time that I have
32:09
to add is exactly equal to the same
32:11
expected time from the left-hand side
32:14
that I'm trying to calculate does this
32:17
make sense
32:18
you could kind of do this in a more
32:20
tedious way you could say well with
32:21
probability 1 minus L my time is RTT
32:24
with probability L times 1 minus L the
32:28
time is equal to RTT plus RTO with
32:31
probability L times 1 minus L sorry L
32:34
squared times 1 minus L that's like two
32:36
losses and then a retransmission the
32:38
time is 2 times the RTO plus RTT with
32:42
probability L cubed times 1 minus L it's
32:44
that you do all of that stuff you get
32:45
the same thing but this is the more this
32:50
is a simple way to do it so if you run
32:52
you take the expected time over to one
32:54
side and solve this equation what you'll
32:56
end up with is that the expected time is
33:00
equal to RTT plus L over 1 minus L
33:04
times the RTO I mean as the packet loss
33:11
rate becomes larger and larger and
33:13
larger this term starts to dominate
33:14
because L over 1 minus L starts to be
33:17
bigger and bigger and bigger which is
33:20
what you expect if the packet loss if
33:21
the bi-directional packet loss rate is
33:23
large you'd expect the RTO terms to
33:25
start to dominate and the expected time
33:27
is larger and larger and larger if the
33:29
packet loss rate is zero then you would
33:32
find the expected time is exactly equal
33:33
to the RTT you send a packet you get an
33:35
ACK within an RTT you send the next
33:36
packet you get an ACK and of course the
33:38
throughput is equal to 1 over the
33:40
expected time that's the reciprocal of
33:43
the expected time okay now what's the
33:47
best case here the best case here is
33:48
that you get one packet per round-trip
33:50
time the worst case is you know
33:52
arbitrarily bad depending on the packet
33:53
loss rate but the important point here
33:56
is that even in the best case you're able
33:59
to send only one packet at most one
34:00
packet per round-trip time
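Written out, the recursion and its solution look roughly like this. The symbols are shorthand for the quantities on the board: E[T] is the expected time between sending a data packet and getting its ACK, and L is the bidirectional packet loss probability. The last line is the "tedious" sum over the number of losses, which gives the same answer.

```latex
\begin{align*}
E[T] &= (1-L)\,\mathrm{RTT} + L\,\bigl(\mathrm{RTO} + E[T]\bigr) \\
(1-L)\,E[T] &= (1-L)\,\mathrm{RTT} + L\,\mathrm{RTO} \\
E[T] &= \mathrm{RTT} + \frac{L}{1-L}\,\mathrm{RTO},
\qquad \text{throughput} = \frac{1}{E[T]} \\[4pt]
\sum_{k=0}^{\infty} L^{k}(1-L)\,\bigl(\mathrm{RTT} + k\,\mathrm{RTO}\bigr)
 &= \mathrm{RTT} + (1-L)\,\mathrm{RTO}\,\frac{L}{(1-L)^{2}}
  = \mathrm{RTT} + \frac{L}{1-L}\,\mathrm{RTO}
\end{align*}
```

With L equal to zero the second term vanishes and the expected time is exactly one RTT, which is the best case of one packet per round-trip time.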
34:03
so question is how good or bad is one
34:06
packet per round-trip time is this
34:10
clear the situation behind why this is
34:12
one packet per round-trip time in the
34:13
best case that that should be pretty
34:15
obvious right I send a packet get an
34:16
ACK send a packet get an ACK this
34:18
calculation just shows a little bit more
34:20
detail about what happens when the
34:22
packet loss rate is you know nonzero so if
34:24
the packet loss rate is say 20 percent
34:26
or you take you know 1/5 over 4/5 so
34:29
it's RTT plus 1/4 of the retransmission
34:33
timeout that's what it says the
34:35
expected time is and one over that is
34:36
the throughput now how bad or good is it clear any
34:40
questions ok so now how good or bad is
34:46
this one over the round-trip time so
34:49
let's say that you have you know a
34:53
network between Boston and I don't know
34:57
San Francisco and if you do these you
35:00
know pings or whatever it doesn't matter
35:02
I don't know the real numbers but let's
35:03
say it's 80 milliseconds just for the
35:06
calculation to be easier let's assume
35:07
it's 100 milliseconds and let's say that
35:10
a packet you know on the internet it's
35:14
about 10,000 bits so let's make it bytes
35:17
let's say that it's a thousand say 1,500
35:22
bytes
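With numbers like these, the stop-and-wait throughput is a one-line calculation, and it can be sketched in a few lines of Python. This is only an illustration of the formula from earlier in the lecture: the function names are made up here, and the RTO value of 200 milliseconds is an assumption, since the lecture does not give one.

```python
def expected_time(rtt, rto, loss):
    """Expected seconds from sending a data packet to getting its ACK.

    Implements E[T] = RTT + (L / (1 - L)) * RTO for stop-and-wait.
    """
    if not 0 <= loss < 1:
        raise ValueError("loss must satisfy 0 <= loss < 1")
    return rtt + (loss / (1 - loss)) * rto


def throughput(packet_bytes, rtt, rto, loss):
    """Stop-and-wait throughput in bytes per second: one packet per E[T]."""
    return packet_bytes / expected_time(rtt, rto, loss)


RTT = 0.100          # seconds, the round number used in the lecture
PACKET_BYTES = 1500  # bytes per packet, from the lecture
RTO = 0.200          # seconds; ASSUMPTION, the lecture gives no RTO value

for loss in (0.0, 0.2):
    rate = throughput(PACKET_BYTES, RTT, RTO, loss)
    print(f"loss={loss:.0%}: about {rate:,.0f} bytes per second")
```

With no loss this is 1500 bytes every 100 milliseconds. With 20 percent loss the expected time becomes RTT plus one quarter of the RTO, so under the assumed RTO the rate drops by a third.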
35:25
so what this says is that the throughput
35:27
that I would get with the stop and wait
35:29
protocol if I ran it on this internet
35:30
path would be 1500 bytes divided by 100
35:36
milliseconds so that's 15,000 bytes per
35:38
second 15 kilobytes a second which might
35:45
have been really really good in 1985 but
35:47
you know no one's gonna be happy with
35:48
this today I mean you might have a link
35:53
that's a megabyte a second or a gigabyte
35:55
a second or 10 you know bigger than that
35:57
but no matter how fast the networks the
36:00
network links are this protocol is
36:02
completely dominated by the delay or the
36:04
latency the round-trip latency between
36:06
the sender and the receiver and you end
36:07
up with a throughput that's pegged to
36:10
a small value and so people don't like
36:12
that so question is how can you do
36:15
better what can you do now to this
36:17
protocol or come up with a new method a
36:19
new protocol that would improve the
36:21
throughput of this system because if
36:25
people pay money for network links
36:27
they'd like to actually get higher
36:28
performance from it so what could you do
36:35
what larger packets well yeah you know
36:40
larger packets is yeah why don't we make
36:44
our packets as big as the file we want
36:46
to send I should I digress why don't we
36:49
make packets really big like I got a
36:51
megabyte file or a gigabyte file to send
36:53
why do I have to break it up into
36:54
smaller packets
37:01
well just send the data no matter if you
37:03
break it up small or big you're gonna
37:05
use the same bandwidth okay that's a
37:07
good question what yeah you have an
37:09
answer that's kind of true you know if a
37:15
packets if a packet is you know let's
37:17
say a gigabyte file you want to transfer
37:19
and you send that in one atomic unit and
37:20
goes through four hops in the network
37:22
and then it gets dropped on the fifth
37:23
hop you end up having to send an entire
37:25
gigabyte again over all those other hops
37:27
that's actually not good but in fact
37:29
really large packets are probably a bad
37:31
idea
37:31
even for networks which don't drop any
37:34
packets think of the case when I have
37:37
a gigabyte file to send and you have a
37:38
gigabyte file to send the problem with
37:40
these really big if you make these
37:42
packets really big is that one of us is
37:44
you know on a shared link only one of us
37:46
can send that packet which means the
37:47
other guy's going to be waiting a really
37:49
really long time for him to send that
37:50
packet so the reason why in the end
37:53
packets are modest size has to do with
37:56
our wanting to share the network
37:58
evenly over smaller timescales it's
38:00
because we want to give fairness across
38:02
smaller timescales allowing everybody
38:04
who's competing access to the network so
38:06
even if we have big amounts of data to
38:08
send we prefer to break them up into
38:09
smaller chunks among other reasons one
38:12
reason being we don't want to starve
38:14
other connections and prevent them from
38:18
gaining access to the network because
38:20
there's some you know huge transfers
38:22
sitting in front so that's that's part
38:25
of the reason so anyway so bigger
38:28
packets don't quite cut it so what
38:30
else could you do yes
38:38
yeah
38:43
okay you know well I'll come back to
38:45
this on Monday that's actually a really
38:47
good idea but when would you stop four
38:49
eight sixteen thirty-two I mean at some
38:51
point this is like because packets are
38:55
lost okay this is a really good idea
38:58
we're not actually going to teach that
38:59
here in this course this is an this is
39:02
actually what TCP does in the beginning
39:04
of the connection but before we what
39:09
else could you do
39:10
that's a good idea yeah yeah you could
39:15
do a fixed number you know it somebody
39:17
could pick I actually kind of it is a
39:19
really good good idea to do one two four
39:22
eight and then if it fails you come back
39:24
down to say 1 or 1/2 of whatever worked
39:27
the last time and then continue from
39:28
that that particular thing has a name to
39:31
it that protocol is called slow start
39:32
it's ironic because it's really fast
39:35
it's exponential right one two four
39:37
eight but yet it's called slow start
39:39
I'll probably tell you more about it on
39:42
Wednesday but we'll you know ease into
39:45
that solution we will do something
39:47
simpler we'll use something called a
39:48
sliding window protocol with a fixed
39:50
size window you just make that one be
39:52
seven or four or six or eight I'll tell
39:55
you later next time how you pick that
39:58
value okay and one way to pick the value
40:00
is to do it dynamically like the gentleman
40:02
in front said it's it's more complicated
40:05
but let's just pick a fixed size value
40:07
so the idea is actually very very simple
40:09
rather than have one packet outstanding
40:11
use this idea you know in computer
40:13
science we use this over and over again
40:14
pipelining so you just send multiple of
40:17
them and have multiple outstanding
40:19
packets by outstanding I mean a packet
40:22
that hasn't yet been acknowledged a data
40:24
packet that hasn't yet been acknowledged
40:25
this is called an outstanding data
40:27
packet and you have multiple of these
40:29
outstanding and every time you get an
40:31
acknowledgment you send one more packet
40:34
so that's shown in this time line here
40:36
right so you start here you send a
40:38
packet
40:39
I don't know why this isn't working
40:46
you send a packet you get an
40:48
acknowledgment when you get an
40:49
acknowledgment you send another packet
40:50
get an acknowledgement you send another
40:52
packet but in the meantime there are
40:53
these other acknowledgments coming in
40:55
and the rule is very simple every time
40:57
you get an acknowledgment that you have
40:58
not seen before send the next packet in
41:01
sequence so the sender just keeps
41:03
sending packets in sequence order every
41:05
time it gets an acknowledgment that it
41:07
hasn't seen before for a
41:09
packet that it had sent before it sends
41:12
the next incrementing sequence number
41:15
so uh this painstaking animation will
41:18
attempt to show you that assuming it's
41:20
correct so the window here is five
41:23
packets okay I'll tell you later how you
41:25
should some guidelines on how to pick
41:27
this window size but this number of
41:29
packets here is called the window the
41:31
number of outstanding packets okay or
41:33
the number of unacknowledged packets
41:35
it's always going to be five
41:36
under sorry it's going to be five in
41:38
this example it's always going to be a
41:40
fixed value in our protocol okay so you
41:43
send the first packet when you get an
41:45
acknowledgment for that first packet you
41:47
slide the window forward by one and you
41:49
send packet six when you get an
41:52
acknowledgment for packet two you slide
41:54
the window forward and you send packet
41:56
seven when you get an acknowledgment for
41:58
three you slide the window forward and
42:00
you've got an acknowledgment for three and you
42:02
send packet eight
42:04
this is sorry yeah
42:10
that's a good question I'll get to that
42:13
in a moment the answer is that the
42:15
senders rule is always the same yes
42:17
you'll get acknowledgments out of order
42:19
as long as it's an acknowledgement for a
42:20
packet sorry as long as it's an
42:22
acknowledgement that you have not seen
42:24
before for a packet that you have
42:26
actually sent you slide the window
42:29
forward by one and send a new packet
42:31
okay and you keep track of the fact that
42:35
you've received an acknowledgment so you
42:36
know that you should never resend
42:37
that packet I want to define this
42:42
thing and pause here I want you to
42:43
understand this definition of a window
42:44
and internalize it if the window size is
42:48
W what it means is that the maximum
42:51
number of unacknowledged packets that
42:53
you can have in the connection is W
42:55
there are many different ways of
42:57
implementing defining a window in fact
42:59
TCP inside it has two windows this
43:02
definition is one of those windows I
43:03
won't talk about the second definition
43:06
here I'll get to it next week it's
43:08
not important for us all right now so
43:11
again to repeat if the window size is W
43:13
it means that the maximum number of
43:15
unacknowledged packets in the in the
43:16
system in the protocol is W so the rule
43:19
at the sender is going to be to very
43:21
religiously adhere to this to this rule
43:23
in other words every time it gets an
43:25
acknowledgment it waits and sees whether
43:27
it's an acknowledgment for a packet it
43:29
has sent before that it had not seen
43:30
before if you get an acknowledgment like
43:33
that it means that some packet has been
43:36
received which means you can get rid of
43:38
that packet from the stack of
43:39
unacknowledged packets that you have and
43:41
send a new packet because you can send a
43:44
new packet because you know that the
43:47
number of unacknowledged packets reduced
43:49
by one because you got an ACK which means
43:52
you can now send a new packet okay it's a
43:54
very simple rule if you just follow that
43:56
idea to implement it but it's also surprisingly
43:58
easy to get wrong
44:05
the window doesn't have to be
44:07
consecutive this is a really really good
44:08
point and it's very tempting to
44:10
implement a window that's consecutive
44:11
and you'll find that after a while if
44:12
you follow that idea and you do it
44:15
wrongly the protocol just stalls and you
44:17
know every term there's about a quarter
44:18
of the students for whom the first time they
44:20
implement it it just stops working after a
44:21
while as the packet loss rates go up so
44:23
it's important that in this definition
44:25
of the protocol in the way it's defined
44:26
here the window of unacknowledged
44:28
packets it's not necessarily consecutive
44:31
so you could have packets 1 2 3 4 8 9 10
44:35
11 outstanding if your window size is 8
44:37
the other guys may have gotten
44:38
acknowledged that's absolutely true
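One way to keep the window from being accidentally consecutive is to track the unacknowledged packets as a set of sequence numbers rather than as a range. The sketch below is a minimal illustration under that assumption, not the lab's reference code: the class name, the `transmit` callback, and the convention that sequence numbers start at 1 are all made up for the example.

```python
class SlidingWindowSender:
    """Fixed-size window sender that tracks unacknowledged packets as a set."""

    def __init__(self, window_size, transmit):
        self.window_size = window_size
        self.transmit = transmit      # hypothetical callback: transmit(seq)
        self.next_seq = 1             # assumption: sequence numbers start at 1
        self.outstanding = set()      # sent but not yet acknowledged
        self.fill_window()

    def fill_window(self):
        # Send new packets in sequence order until W are outstanding.
        while len(self.outstanding) < self.window_size:
            self.outstanding.add(self.next_seq)
            self.transmit(self.next_seq)
            self.next_seq += 1

    def on_ack(self, seq):
        # Only an ACK not seen before, for a packet actually sent, counts.
        if seq not in self.outstanding:
            return
        self.outstanding.remove(seq)
        self.fill_window()

    def on_timeout(self, seq):
        # Retransmit only if the packet is still unacknowledged.
        if seq in self.outstanding:
            self.transmit(seq)


sent = []
sender = SlidingWindowSender(window_size=8, transmit=sent.append)
for ack in (5, 6, 7):                 # ACKs for 1 through 4 have not arrived
    sender.on_ack(ack)
print(sorted(sender.outstanding))     # packets 1 2 3 4 8 9 10 11
```

The demo reproduces the case just described: with a window of 8, ACKs for 5, 6 and 7 arrive first, so the outstanding set is 1 2 3 4 8 9 10 11. Its size is still exactly 8, which is the only thing the window rule constrains.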
44:41
yes ok all right now what happens
44:46
understand all these are the weird cases
44:48
that are going to happen here so let me
44:50
first show you a timeline of how a
44:52
timeout is dealt with so let's say in
44:56
this case the window size is 5 again
44:58
like it was before so everything is
45:01
going wonderfully well here and let's
45:03
say now you move on you've sent packets
45:05
6 7 8 and let's say packet 8 is lost
45:07
what the sender is going to do is it's going
45:09
to send packet 9 it's going to send
45:11
packet 10 based on acknowledgments for 4 &
45:16
3 that it received before so sorry
45:19
when it got acknowledgment 3 it sent
45:20
packet 8 and 8 was lost the sender didn't
45:22
know that at this point when it got an
45:24
acknowledgment for 4 it sent 9 when it
45:26
got an acknowledgment for 5 it sent 10
45:28
when it gets an acknowledgment for 6
45:30
it goes ahead and sends 11 when it
45:32
gets an acknowledgement for 7 it goes
45:36
ahead and sends 12 so at this point the
45:39
sender actually has outstanding 12 11
45:42
10 9 & 8 okay
45:45
now at some point it discovers that in
45:49
fact this picture continues when it in
45:51
this picture what happened is that you
45:52
sent out nine you got an acknowledgement
45:54
for nine and at that point you send out
45:56
13 because whenever you get an
45:58
acknowledgment you send out the next
45:59
consecutive packet you should be sending
46:01
out so at this point in time the sender
46:03
has a bunch of outstanding packets in it
46:05
and it's got acknowledgement and this is
46:06
an interesting case because a packet 8
46:08
was lost 9 was sent later and 9 got
46:11
acknowledged but we still haven't timed
46:13
out on it so at this point in time the
46:15
outstanding packets in the window are 13
46:17
12 11 10 and 8 giving you that non
46:21
consecutive observation that you notice
46:24
and then at some point in time based on
46:26
the round-trip time based on that black
46:27
box the guy timed out and it got
46:30
retransmitted and then it got
46:31
retransmitted you got an acknowledgment
46:33
for 8 and the protocol sort of continues
46:35
in that in that fashion does this make
46:38
sense so there's a way this is actually
46:40
not that hard when you think about it
46:42
what the sender does is a very simple
46:46
idea which is every time it gets a new
46:48
acknowledgment for a packet it had sent
46:49
before but hadn't seen an acknowledgment
46:51
for before it just goes ahead and sends
46:52
the next packet and then it has a
46:54
separate process by which it maintains
46:56
this timeout and whenever an
46:58
acknowledgment does not arrive within a
46:59
timeout it goes ahead and
47:00
retransmits that packet now when it
47:03
retransmits the packet the assumption
47:05
here is that the original packet was
47:07
actually lost so the timer has to be
47:09
bigger than for the system to work well
47:14
the timer has to be bigger than the
47:16
maximum time that a packet can sit
47:18
around in the system so if the timeout
47:20
is too small and you retransmitted 8 too
47:22
early it could be that 8 is not lost but 8
47:25
is just being reordered in the network
47:26
or going on some very long circuitous
47:28
path you know I recently read that
47:29
someone got a letter in New York 70
47:31
years late it was sent in 1943 and it showed
47:33
up like two weeks ago so I mean it could
47:36
happen on the internet too I mean quite
47:37
literally if you're on an Amtrak train
47:39
and you're using their wireless network
47:40
some packets come to you in you know in
47:43
300 or 400 milliseconds which
47:46
arguably very long but literally there
47:47
are packets that will come back to you a
47:49
minute after you sent them or reach
47:51
the receiver a minute after you sent
47:52
them and they could be out of order so
47:54
in fact this is you know my students
47:55
and I call this the Greater Amtrak
47:56
Network it's really good because it
47:58
gives us really interesting research
47:59
problems to work on but the people who
48:02
are on that train probably you know it's
48:03
miserable so anyway this could happen
48:06
but and so you know the timeout is a
48:08
heuristic it could be that 8 was
48:10
retransmitted wrongly in a spurious way
48:12
but our hope is that if the timeout is
48:14
long enough it could be that if the
48:17
timer is long enough the idea is that
48:19
you retransmit it because the original 8
48:20
was lost and now the outstanding packets
48:23
in the window are 8 13 12 11 and 10 but
48:27
the 8 here is not that 8 but this 8 but
48:32
as long as you know this the contents of
48:34
the 8 are the same it's sort of you know
48:35
it doesn't matter which 8 it is but of
48:37
course if the timeout is too small there
48:39
are two eights sitting in the network and
48:40
now you actually have more than W
48:43
packets in the window but as far as the
48:45
sender is concerned it has exactly those
48:47
eight 10 11 12 and 13 it's true that there's
48:50
one more packet if the timeout happened
48:51
too early
48:52
it's the network's problem as far as the
48:54
sender is concerned it has five
48:55
outstanding packets the receiver is a
48:58
little trickier than in the other case
48:59
because what he has to do I mean it's
49:01
it's trickier in that it has to maintain
49:02
a buffer of packets so the receiver has
49:05
a little more of a job to do
49:07
in the previous stop-and-wait protocol
49:10
if it ever got an out-of-order packet
49:12
it's probably because the sender is
49:14
badly implemented right if the last
49:16
sequence number I delivered to the
49:17
application was 17 and I got a 19 it's a
49:20
bug whereas here if the last sequence
49:22
number I deliver to the application is
49:24
17 and I get a 19 it means that well
49:26
there's a window and maybe 18 was lost
49:28
or who knows what happened right maybe
49:30
18 will show up later
49:31
so the receiver now has an interesting
49:33
job and this is important because when
49:35
you implement this stuff in the spirit
49:36
comes alive you got to make sure that
49:37
whenever you deliver packets from the
49:40
receiver protocol to the application you
49:41
deliver it in in order and update the
49:44
last sequence number you delivered
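The receiver's job can be sketched the same way: buffer whatever arrives, acknowledge it, and hand packets to the application only in sequence order. This is a minimal illustration with assumptions labeled in the comments. The class name and both callbacks are hypothetical, sequence numbers are assumed to start at 1, and acknowledging duplicates is a choice made here so that a lost ACK does not leave the sender retransmitting forever.

```python
class SlidingWindowReceiver:
    """Buffers out-of-order packets and delivers them in sequence order."""

    def __init__(self, deliver, send_ack):
        self.deliver = deliver        # hypothetical callback to the application
        self.send_ack = send_ack      # hypothetical callback: send_ack(seq)
        self.last_delivered = 0       # assumption: sequence numbers start at 1
        self.buffer = {}              # seq -> payload, received but not delivered

    def on_data(self, seq, payload):
        # Acknowledge every data packet, duplicates included (assumption).
        self.send_ack(seq)
        if seq <= self.last_delivered or seq in self.buffer:
            return                    # duplicate: already delivered or buffered
        self.buffer[seq] = payload
        # Deliver as long as the next packet in sequence is available.
        while self.last_delivered + 1 in self.buffer:
            self.last_delivered += 1
            self.deliver(self.buffer.pop(self.last_delivered))


delivered, acks = [], []
receiver = SlidingWindowReceiver(deliver=delivered.append, send_ack=acks.append)
for seq in (1, 2, 4, 3):              # packet 4 arrives before packet 3
    receiver.on_data(seq, f"payload-{seq}")
print(delivered)
print(acks)
```

In the demo packet 4 is acknowledged as soon as it arrives but sits in the buffer until packet 3 shows up, at which point 3 and 4 are delivered back to back and the last delivered sequence number moves to 4.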
49:46
okay so that's important to do but if
49:49
you do that and then acknowledge a
49:50
packet when it's received just send an
49:53
acknowledgement for it the protocol will
49:54
continue and it'll work well so this is
49:56
what you'll be looking at in the in the
49:58
lab the piece that's going to go out
49:59
today this is the last lab and then
50:01
tomorrow in recitation we'll look at
50:03
how the timeout is selected and then on
50:05
Monday I'll talk more about an analysis
50:07
of this protocol