0:00 so where we left off last time was we ended up with this picture of how modern systems work, where you have 0:07 multiple processes at any given point in time that are sharing the machine, and between 0:14 those processes and the hardware is an operating system that essentially 0:20 mediates between these processes and the hardware to achieve three basic 0:28 goals. The first one is protection and privacy: we want to give a private address space to each process, so that 0:33 processes cannot access each other's data. The second one is to 0:39 abstract or hide away the details of the underlying hardware, so we're giving higher-level abstractions than the hardware 0:45 provides: for example, we use files instead of exposing the raw hard drive, or sockets instead of 0:52 exposing the raw network card. And finally, the last piece of 0:58 the puzzle was resource management: the last goal is essentially to control how these different processes 1:05 share the multiple hardware resources in the system. And we saw last time that 1:13 there are three key enabling technologies. The first one is this combination of two modes, user mode and 1:19 supervisor mode; supervisor mode has some privileged instructions and some privileged state 1:26 that is not accessible to user-mode programs, and the operating system kernel runs in this supervisor mode, 1:34 which is why it has these special privileges and direct access to hardware. The second bit is 1:40 this business of exceptions, or interrupts, to safely transition from user mode into supervisor mode. 1:48 That's what we saw last lecture. Today we're going to see the other big component of hardware support for operating systems, which is virtual 1:54 memory. Virtual memory essentially lets us abstract the storage resources 2:00 of the machine and also implement this protection and privacy feature, 2:05 essentially giving each process a private address space. Let's see this in a little bit 2:11 more detail. So essentially what the virtual memory system provides is the 2:18 illusion of a large private address space for each process. Each 2:25 process believes that it's running alone in the machine, and it looks to the process like it has the entire memory of 2:31 the machine available, or at least a certain amount of memory available. So that's 2:39 one big feature, but there's another one that we didn't discuss 2:44 last lecture which is actually quite important, which is what we call demand paging. Demand 2:51 paging essentially lets us use main memory, or DRAM, 2:57 as a cache for disk. So 3:04 essentially we can support programs that use and access more 3:09 data than we actually have main memory for, and the strategy is the same one that we 3:16 use when we implement caches: we're going to have this primary memory hold a portion of the process's data, and then 3:24 we'll have some mechanism to go to this slower disk, which is what we call 3:30 secondary storage, to retrieve data that's not present in main memory. You'll see 3:36 that there are a lot of similarities between virtual memory and 3:42 caching; at the same time, there are some key differences, driven by
3:47 constant factors: some operations are much more expensive. Accesses to disk are much more expensive than 3:53 accesses to main memory, and that's going to drive us to quite different 3:59 designs, but conceptually it is very similar. OK, so the price for these two 4:05 wonderful features is that every access that the program performs is going to 4:11 have to be translated: we're going to have to perform some 4:17 operations to convert an address that is in the scope of 4:22 the process into an address that's in physical memory. And just to be more 4:29 concrete, we actually give three different names to memory locations, of which the two important ones 4:35 for this lecture are the virtual address and the physical address. But 4:41 just for generality, let's begin with the machine language address. So what does a machine 4:47 language address mean? If you write, say, lw x12, 0x10(x11), 5:02 this is the machine language address; this is what you write in assembly. 5:12 The processor will translate this into what 5:20 we call the virtual address: 5:29 in this case it will be 0x10 plus the contents of register x11. So 5:39 the processor will perform some operations to compute the virtual address, which we sometimes also call the 5:45 effective address, and that is the address that the program, that the process, wants to access. 5:52 But what we're going to do is insert this address mapping machinery here to 5:59 translate these virtual addresses into physical addresses: addresses that actually map to the main memory that 6:06 you have in your machine. And I 6:15 think the key mechanism here, the key thing to understand, is that this address mapping hardware is going to be 6:21 under the control of the operating system. So we're going to take this virtual address over here, 6:28 translate it into a physical address, and then access main memory with it. OK, so 6:36 there are two key mechanisms that we're going to see. First we're going to look at segmentation, which is a very simple technique that has some drawbacks, so 6:42 it's not used today, or at least not used frequently, and then we'll look at paging, which is what modern systems 6:47 actually do. So let's start with the simple technique: segmentation, which is also sometimes called base-and-bound 6:54 address translation, implements a very simple mechanism. 6:59 Suppose that you start with some chunk of memory, so the process 7:07 has addresses from 0 to 0xFFF. What we're going to do is map 7:14 this set of virtual addresses, what we call the address space, into a contiguous 7:21 chunk of physical memory: we're going to place it at some starting 7:27 location, in some contiguous chunk, and this is what we call a segment. But 7:33 because we want to have multiple processes, and we want each process to believe 7:40 that its address space starts at address zero, we need to perform some 7:46 translation, because not all the processes can have their memory starting 7:51 at physical address zero. So what we do is introduce a base 7:56 register in the CPU, and then we compute the physical address by adding the 8:02 contents of that base register and the virtual address. So some address 8:07 over here is going to map to the same offset,
the same distance 8:13 if you start counting from zero, but the actual physical address is computed by 8:19 adding the contents of this base register. So now this means that we can have multiple processes that all 8:26 believe that their address space starts at address zero, and yet they're all coexisting in 8:31 different locations of physical memory. The other thing that we need to do is ensure that a process doesn't try to 8:37 access memory that doesn't belong to it. What if the 8:43 process tries to access something that is out of bounds? For that, we add a second register, which 8:49 we call a bound register, and we check that the virtual address is 8:55 below a certain bound; if it is not, then we trigger an exception. So if the 9:00 process tries to access some location over here, which would correspond perhaps to some data that's outside the process, 9:06 maybe the kernel's data, maybe the data of a different process, maybe some free physical 9:11 memory, whatever it is, we do not want this process to access it, so we trigger an exception if it tries 9:17 to go out of bounds. And of course, a key component of this is that 9:22 these base and bound registers are privileged: they are under the control of the operating system but not 9:30 under the control of the application. So each user process cannot access 9:36 these registers directly; all its memory accesses go through these registers, but it cannot modify 9:44 them. OK, any questions? 9:52 All right, so let's see a slight extension of this 9:58 technique, which actually starts having some interesting features. Suppose that instead of having a single segment 10:06 for each program, we have two segments, which we'll call the code segment and the data segment. 10:12 Each segment is exactly like what I described before, but the difference is that the program counter, 10:18 whenever we're fetching an instruction, goes through the code base 10:25 and bound registers, and whenever we're doing loads and stores, we go through the data base and bound registers. 10:31 So essentially we're reading and writing a data segment which 10:37 is independent from the code segment. Can you tell me what advantages 10:43 this separation has? What do we gain by having these two different segments? There are some good things that happen. 10:54 Yes? Yes, that's a great one. So 11:02 the first benefit: we've talked about protecting applications 11:07 from each other, but there's a nice extra benefit of getting the 11:13 application to protect itself from its own data accesses. If you 11:19 have a bug, this separation guarantees that the buggy program 11:24 will not go and overwrite some code; this is essentially preventing 11:30 self-modifying code, since code cannot overwrite itself. There's another benefit 11:37 that's less obvious: if you're running many copies of the same program, you can actually share the code 11:44 segment across all of them. You can have multiple processes, all of 11:49 which share the code segment, and then give an independent data 11:54 segment to each process, and that essentially saves physical memory. 12:01 OK, any questions?
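To make the mechanics concrete, here is a minimal C sketch of base-and-bound translation with the separate code and data segments just described. All the names and segment values are made up for illustration; real hardware performs the bound check on every access, in parallel with the add.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint32_t base;   /* privileged: set only by the OS */
    uint32_t bound;  /* privileged: set only by the OS */
} segment_t;

/* Translate a virtual address through one segment, or "trap" on overflow. */
uint32_t translate(segment_t seg, uint32_t va) {
    if (va >= seg.bound) {
        fprintf(stderr, "segmentation fault: va 0x%x out of bounds\n", va);
        exit(1);  /* stand-in for raising an exception to the OS */
    }
    return seg.base + va;  /* same offset, shifted by the segment base */
}

int main(void) {
    segment_t code = { .base = 0x4000, .bound = 0x1000 };
    segment_t data = { .base = 0x9000, .bound = 0x2000 };
    /* Instruction fetches go through the code segment;
     * loads and stores go through the data segment. */
    printf("fetch va 0x100 -> pa 0x%x\n", translate(code, 0x100));
    printf("load  va 0x100 -> pa 0x%x\n", translate(data, 0x100));
    translate(data, 0x3000);  /* out of bounds: triggers the "exception" */
    return 0;
}
```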
12:07 So as I said, there's... oh yes, that's a 12:16 good point; the next slide covers exactly this. 12:22 The problem is that we're giving some fixed space to each process, and this leads to what we call 12:28 fragmentation, memory fragmentation, and this is one of the two big problems that prevent us from 12:34 using segmentation. There are two sources of fragmentation: one is 12:39 programs that allocate more data as time goes on; the second one is 12:45 simply having multiple processes that start and finish over time. Suppose that 12:51 you're in this situation where you have three processes running in the machine at these locations: process 1 has a chunk of 16 kilobytes of 12:58 memory, process 2 has a 24-kilobyte chunk, then there's this free chunk of 24 kilobytes of 13:04 memory, and process 3 has a 32-kilobyte chunk. Let's say we start processes 4 and 5. At this point we 13:11 allocate process 4 over here, leaving an 8-kilobyte chunk of free memory, and then process 5 fits in 13:19 this free 24-kilobyte chunk. Now assume that processes 2 and 5 finish, and so now 13:25 we're left with the following situation. Over time, what happens is that now we have this 8-13:31 kilobyte segment here, which might be too small to accommodate any new processes, 13:37 and so as processes start and finish, we're going to be left with a 13:42 fragmented memory layout: we're going to be left with a potentially 13:50 very large number of small free chunks that cannot individually support 13:58 any particular process. For example, here we have 56 kilobytes of free memory, and so if we wanted to run 14:04 anything that needed, say, 32 kilobytes, there's no contiguous 32-kilobyte segment that we could 14:10 allocate, so we would need to move, for example, process 3 eight kilobytes up to 14:15 create a chunk of 32 kilobytes, and that's expensive. This 14:22 example shows 32 kilobytes, but if you're talking about 2 gigabytes, moving an entire chunk of 2 gigabytes 14:28 of data from one place to another is very, very expensive; that's going to take a lot of time. Another issue is that 14:38 these pictures show processes taking a fixed amount of space, but as you asked, processes don't actually 14:46 take a fixed amount of space: they start with some allocation and then they demand more resources. For example, when 14:51 you're allocating data on the heap, or when you want to grow your stack, or in any number of other cases, you want 14:58 to give processes the ability to request more memory and get it, and this 15:04 scheme is not flexible enough for that. OK, so to solve this problem we're going 15:11 to shift to what we call paging, or paged memory systems. The key idea 15:17 here is to divide physical memory into a number of fixed-size 15:22 blocks that we call pages, and these pages are reasonably small; typical page sizes are 4 or 8 15:30 kilobytes in most systems. What we're going to do is 15:37 interpret each virtual address as an offset into a particular 15:43 page, plus a page number. So if this is a virtual address, we call this part the virtual 15:48 page number, which we're going to translate into a physical page number to produce the physical address. So we're 15:55 going
to have a new data structure that we call a page table, which stores these translations from virtual pages into 16:02 physical pages. The key advantage of doing this is that now we can map any virtual page number to any physical 16:09 page number, which means that we can store the different pages of a process, 16:15 the different chunks of memory of a process, in a non-contiguous fashion. Basically, we now 16:24 have a more complex translation mechanism, but we 16:29 have a much more flexible mapping function. Let's see how this works with a more concrete example. 16:36 To implement the private address space per process that we want, we have a page table for each process, 16:43 and this page table has as many page table entries, at least in 16:50 the simple design that we're going to see first, as the process has virtual pages, and each entry contains the 16:59 physical page number for that virtual page number of that particular process. For example, in 17:07 this particular case we have three processes, all of which use this second 17:12 page table entry over here. Let's say that there's some virtual address VA1 over 17:17 here; to translate this virtual address of process 17:24 1, we access entry 1 of its page table, and that entry has a pointer to some location 17:31 in physical memory, to a page of physical memory, the page shown in solid blue over here. Now for process 2, for the same 17:39 address, we go to the same location in its page table, and now you see that it's pointing to a different location. And the same 17:46 thing for process 3: it's pointing to yet another location. So note that the pages of process 17:52 1, the pages of process 2, and the pages of process 3 are all stored non-contiguously; we just have a 17:59 bunch of 4-kilobyte chunks which we're allocating independently, so we no longer have this restriction of needing one big 18:05 chunk of memory per process. Yes? How do we find a free page in 18:20 physical memory? 18:25 Right, so apart from the page tables, the kernel needs to keep some sort of free list, some list of available 18:32 pages, and then it's going to allocate from there. 18:38 Yeah, this is actually a poor 18:45 choice of colors: this fill pattern is slightly different from 18:52 this one and this one and this one, so I'll fix this in the handout. 18:58 Sorry, nothing is pointing to a free page. Yes? Ah, 19:05 great question; we'll see that in a minute. For now, assume the non-resident pages are somewhere we don't know yet. So the key thing 19:19 here is that these page tables are under the control of the operating system; that's the key technique here, 19:25 because otherwise we would allow processes to alter which 19:30 physical memory they're accessing. All of these translation mechanisms are always 19:37 under the control of the operating system.
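Here is a rough C sketch of this per-process page table lookup: one entry per virtual page, each holding a physical page number. The page size, the table contents, and the two-process setup are all invented for the example.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12                      /* 4 KB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_VPAGES 8                      /* tiny address space for the demo */

/* Each process has its own table mapping virtual page -> physical page. */
uint32_t page_table[2][NUM_VPAGES] = {
    /* process 1: its pages are scattered, not contiguous */
    { 5, 9, 2, 7, 0, 3, 6, 1 },
    /* process 2: the same virtual pages map to different physical pages */
    { 8, 4, 11, 10, 13, 12, 15, 14 },
};

uint32_t translate(int pid, uint32_t va) {
    uint32_t vpn    = va >> PAGE_BITS;         /* virtual page number */
    uint32_t offset = va & (PAGE_SIZE - 1);    /* offset within the page */
    uint32_t ppn    = page_table[pid][vpn];    /* look up the mapping */
    return (ppn << PAGE_BITS) | offset;        /* physical address */
}

int main(void) {
    /* The same virtual address lands on different physical pages
     * depending on which process issues it. */
    printf("p1: va 0x1abc -> pa 0x%x\n", translate(0, 0x1abc));
    printf("p2: va 0x1abc -> pa 0x%x\n", translate(1, 0x1abc));
    return 0;
}
```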
OK, we're going to see a few more 19:43 examples of page tables and address translation, but at a high level, let's 19:49 look at the pros and cons of paging versus segmentation and start getting into the issues of paging. 19:56 As I said, the obvious good thing 20:01 about paging is that now you don't have fragmentation; we don't have 20:08 this issue of having to move large chunks of memory. Can you think of 20:14 another benefit? 20:20 Yes, thank you: you don't need contiguous sequences in memory; it just 20:27 depends on what's free. Right, so because 20:32 you're managing everything in fixed-size blocks, management of physical memory is much easier, though that 20:40 sort of follows from avoiding the fragmentation issues. This goes back to what I mentioned at the beginning: 20:45 the second big feature that we want from virtual memory is to use main 20:50 memory, to use the DRAM, as a cache for disk. So what if we design our page table 20:55 so that an entry could say, well, this page is at 21:01 some location in memory, but for some virtual pages it could just say: it's not here, it's somewhere on disk, I 21:07 just don't have it; if you want to access this particular page, go get it from disk. That's precisely what demand paging, 21:15 the technique that we're going to discuss next, gets us: the ability to use 21:21 main memory as a cache for disk. OK, so of course this comes with a 21:30 really large caveat. We've gone from 21:35 using, well, not one but two registers, a base and a bound 21:43 register, to using a page table, and these page tables are big: they're 21:48 much larger than the base and bound registers, which brings us to the 21:53 obvious question of where we store these page tables. Clearly not in registers; we're talking 21:59 about at least megabytes of storage here, so they're not going to be in registers. What makes sense is to have 22:06 them in main memory: we store the page tables themselves in main memory. 22:13 So to access main memory, you first need to access main memory. OK, here's how it works. First 22:21 of all, the kernel is going to have some array that stores 22:28 pointers to all the page tables; it stores the addresses of all the page tables of all the running processes. 22:35 Then, at this particular starting address, for example, we'll 22:40 have the page table for process 1, and at this starting address we'll have the page table for process 2. What we're 22:47 going to do for a single access is: 22:53 we divide the virtual address into the offset and the virtual page 22:59 number; we take the virtual page number and add it to the base register of the page table of the process that we're 23:05 currently running, in this case let's say process 1, and this gives us some memory address 23:11 which we're going to access to get the physical page number, and therefore produce the physical address. 23:18 In this particular case, these entries hold the 23:25 physical page numbers, which are basically the addresses of the different pages of 23:30 physical memory backing our virtual pages. So in this particular case, let's say we go through here; 23:37 we'll get to this particular page over here, and then we'll use 23:42 the offset to figure out which word we need to access within this physical page. So basically, different processes' 23:52 page tables are going to have different physical page numbers, and then, 23:58 as I said, this array over here is going to
store the starting addresses 24:03 of each page table. So a translation is basically one memory 24:08 access to get the physical page number, and then one memory access to get the data at the physical 24:17 address that we want. And by the way, all these arrows 24:22 here are physical addresses: we're not doing virtual-to-physical address translation on any of these accesses. 24:30 Because you have multiple processes, the last piece of the puzzle is that on a process switch you need to change this 24:35 base register: if, for example, we deschedule process 1 and 24:41 schedule process 2, we would have to change the base register to point to here, so that we get the right translations. 25:06 Absolutely, and what we're going to do to simplify our lives, for now, is assume 25:12 that you map the whole virtual address space. For example, if you have 32 bits 25:18 of virtual addresses, assume that your page table has one entry for every 25:24 possible virtual page number that you can have; you're not restricting it in any way. That's in fact how the 25:31 simplest possible page table organization works: for every possible virtual page number, we're going 25:38 to have a very large array with an entry 25:43 that gives us a physical page number, or just says: it's not allocated, or it's not here. 25:51 Yes? 26:00 Yes, so from the virtual address we get the virtual page number, which 26:06 we run through the page table to get the physical page number, which we then combine with the offset to get the physical address. So it's a 26:13 multi-step process. OK, so the key issue 26:21 here is that to access one piece of data, 26:26 each memory reference now takes two accesses to main memory, and in the rest of the lecture we're 26:32 essentially going to address these efficiency issues and look at how demand 26:37 paging works. There are two main issues. The first one is how we reduce this overhead of memory accesses: 26:44 we want paging-based systems that are very 26:52 flexible, but at the same time we want them to be fast, and we don't want them to take a ton of space to store the page tables; we'll see how to do that 26:59 toward the end of the lecture.
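A small simulation can make that cost visible. In this sketch, assuming the in-memory page table organization just described, the page table itself lives in a simulated physical memory, so every load costs two memory reads: one for the page table entry and one for the data. The addresses and layout are invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 10
#define PAGE_SIZE (1u << PAGE_BITS)

uint32_t mem[1 << 16];       /* simulated physical memory, word-addressed */
uint32_t pt_base;            /* page table base register; the OS changes it
                                on a context switch to the new process */
int      accesses;           /* count memory reads to make the cost visible */

uint32_t mem_read(uint32_t paddr) { accesses++; return mem[paddr]; }

uint32_t load(uint32_t va) {
    uint32_t vpn    = va >> PAGE_BITS;
    uint32_t offset = va & (PAGE_SIZE - 1);
    uint32_t ppn    = mem_read(pt_base + vpn);     /* access 1: the PTE */
    return mem_read((ppn << PAGE_BITS) + offset);  /* access 2: the data */
}

int main(void) {
    pt_base = 0x100;          /* pretend the kernel put this table here */
    mem[0x100 + 3] = 7;       /* virtual page 3 -> physical page 7 */
    mem[(7 << PAGE_BITS) + 5] = 42;              /* the data we want */
    printf("load(0x%x) = %u (%d memory accesses)\n",
           (3u << PAGE_BITS) + 5, load((3u << PAGE_BITS) + 5), accesses);
    return 0;
}
```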
But first, we're going to address the issue of what happens if we cannot fit all the pages in 27:04 RAM, if we can't fit everything in main memory and we need to use secondary storage. Now, there's a related 27:10 issue that is quite interesting, which is what happens if the page tables themselves cannot fit in DRAM; we're not going to cover that 27:16 here, but if you're interested, there are more advanced courses that will teach you some of the 27:23 rigmarole that you have to go through to store the page tables 27:29 outside of DRAM. OK, so let's see this in more detail with demand paging, and now we're getting 27:35 to the specific organization of the page table. The key idea here is that because 27:42 not all the pages fit in main memory, we're going to use some part of disk, 27:48 which is much larger than main memory, as storage for main 27:55 memory pages. We call this swap space, and we'll see why it's called swap space: because we're always swapping 28:01 pages between disk and main memory. So the page table is just an array of 28:07 page table entries that's indexed by the virtual page number, and each 28:15 entry, which we call the page table entry, contains a resident bit to tell us: is this page in DRAM or is 28:23 this page on disk? If it's resident, the entry has the physical 28:31 page number for that memory-resident page; otherwise, it has a disk page number: 28:38 it basically points to the location in the swap space on disk that 28:44 holds the page that we want. Apart from these, there are some protection and usage bits, things like: has this 28:50 page been written, so that we know whether we need to write it back to 28:56 disk, or can this page be written at all; if it's code, maybe you don't want to allow programs to write to 29:04 that memory. And again, the way this works is: you start from the page table base register, 29:12 which gives you the starting address; using the virtual page number as the 29:17 offset, you index the particular page table entry that you're interested in; in this case we have a translation to a physical page number 29:23 that gives us some location in physical memory; and then we use the offset to figure out which particular word within 29:30 the page we want to access. One nice thing about this is that even if 29:37 everything fits in memory, demand paging allows us to be lazy. Suppose that you 29:43 have a program that is a hundred megabytes of code, but when you 29:49 run it, it doesn't do much: it just looks at the input and 29:54 says, you know, I can't process this, and perhaps throws an error. You don't really want to load the 30:00 whole program just to run 10,000 instructions and then bail out, and so 30:06 what demand paging allows us to do is initially have everything resident on disk and then, as 30:13 we execute the parts of the code that we care about, bring them into memory on demand. OK, any 30:24 questions? Yes? OK, so when I mention 30:36 main memory, I'm always talking about DRAM, which is this dynamic random-access memory technology that 30:42 gives you accesses in about 100 cycles, and in current technology you can have a few gigabytes 30:48 of it. When I say disk, this can be either a hard drive or a solid-state drive, HDD or SSD. 30:57 Solid-state drives, which are more common today, are flash-based; 31:03 they take about 10 to 100 microseconds to serve a read operation, so we're talking about two to 31:09 three orders of magnitude of difference to read some data. A hard disk is basically a spinning 31:15 platter that's read with a magnetic head; it's pretty old at this 31:20 point, but it's still used because it's very dense. It's really, really slow, though: we're talking about 10 milliseconds per read. So there's this whole 31:27 hierarchy of storage technologies that we saw a glimpse 31:32 of in the memory hierarchy lecture. OK, 31:38 any other questions?
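As a sketch, a page table entry with the fields just listed might be packed like this in C. The exact field widths and layout here are invented, not those of any real ISA.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    uint32_t ppn_or_disk : 20;  /* PPN if resident, else swap location */
    uint32_t writable    : 1;   /* can this page be written? (0 for code) */
    uint32_t dirty       : 1;   /* written since it was brought in? */
    uint32_t resident    : 1;   /* in DRAM (1) or out on disk (0)? */
} pte_t;

int main(void) {
    pte_t code_page = { .ppn_or_disk = 5,  .writable = 0, .resident = 1 };
    pte_t swapped   = { .ppn_or_disk = 99, .writable = 1, .resident = 0 };
    printf("code page: resident, PPN %u, read-only\n", code_page.ppn_or_disk);
    if (!swapped.resident)
        printf("swapped page: at swap slot %u -> page fault on access\n",
               swapped.ppn_or_disk);
    return 0;
}
```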
All right, so let's see a more 31:45 concrete example; in particular, let's use something that has 31:51 small virtual and physical addresses. In general, with RISC-V processors you'll have 32-bit virtual 31:57 addresses, but here let's assume that we have a system with 256 32:05 bytes per page, so we're going to have eight offset bits, and then sixteen 32:10 virtual pages, which is 2 to the 4. So overall we're going to have virtual addresses that are 12 bits; 12 32:18 bits is just 4,096 locations, which is not very big, but some systems actually use small memories. And then we're going to 32:25 have the physical addresses; this is the actual amount of physical memory that we have: eight 32:31 physical pages of 256 bytes each, so two kilobytes of memory. So 32:36 again, we're going to have 12-bit virtual addresses and 11-bit physical addresses, and because all of these numbers are pretty 32:43 small, we can actually show the entire page table here. It has 16 entries because we have 16 virtual pages, one entry per 32:51 virtual page, and then we have the eight physical memory pages over here. 32:59 Some entries are populated; there are three bits here, 33:05 the dirty bit, the writable bit, and the resident bit, so D, W, R, and for whatever is 33:11 resident, like this entry over here for page E, we have a 33:18 physical page number: it's saying that this virtual page number E maps to physical page number 33:24 five. Yes? 33:31 The DRAM is not a cache in the sense that there's no hardware and there are 33:36 no hardware tags, but in terms of how we're using it, thanks to virtual memory, it is one: misses 33:50 are served from disk. We will actually talk more 33:57 about the interaction of caches and main memory in the next lecture, but for now 34:02 we're going to assume that you just have the CPU and memory. OK, yes? 34:10 Right, these bits implement something similar to the protection that we saw when you have two separate 34:16 segments. For example, when you have code, you don't want anyone to write to it, and so you'll mark it as not 34:22 writable, so that a store to that page triggers an exception. OK, 34:30 so let's assume that you have a load that issues an access to 34:38 virtual address 0x2C8. What is the virtual page number for 34:47 this virtual address? We want to translate this virtual address into a physical address, and the first step is to 34:53 divide it into the virtual page number and the offset. In this 34:59 particular case it's very nice, because every field is a multiple of four bits, so it's whole hex digits: the virtual page 35:05 number is 2. And what do we do with this 35:11 virtual page number? We go to location 2 in the page table, page table entry 2, and here we see that 35:19 the resident bit is set to 1 and the physical page number is 4. 35:25 So our physical page number is 4, and therefore this corresponds to physical 35:30 address 0x4C8, and now we can go and access memory at that 35:36 particular address.
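Here's the same 0x2C8 translation written out as a few lines of C, just to double-check the arithmetic (8 offset bits; page table entry 2 resident with PPN 4, as in the lecture's table):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t va     = 0x2C8;
    uint32_t offset = va & 0xFF;        /* low 8 bits: 0xC8 */
    uint32_t vpn    = va >> 8;          /* remaining bits: 0x2 */
    uint32_t ppn    = 4;                /* page table entry 2 says PPN = 4 */
    uint32_t pa     = (ppn << 8) | offset;
    printf("va 0x%03X -> vpn %u, offset 0x%02X -> pa 0x%03X\n",
           va, vpn, offset, pa);        /* prints pa 0x4C8 */
    return 0;
}
```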
OK, so we've seen what happens when you 35:46 actually have the data in memory, but what happens when you miss, 35:51 when you access a virtual page 35:56 number that's not actually resident, that is on disk? There are lots of 36:03 similarities between the process that we're going to see and the caches which 36:09 we saw a few lectures ago. The setup that we've seen 36:14 so far is: we have a cache, and then main memory or primary memory. The setup 36:19 that we're going to see today is: we have primary memory acting as a cache for secondary memory, which is on disk. And as I said, the concepts are very 36:28 similar. Here we talk about a cache entry; here we talk about a page, or a page frame. There we have cache blocks; 36:34 here we have pages, which are also large so that we can amortize the 36:39 overheads of keeping all this metadata. There's cache miss rate and page miss rate; cache hit and page hit; cache miss and 36:46 page miss. But there are differences. A cache miss in this case 36:52 is served from DRAM, the primary memory, and takes around 100 cycles; a page miss here means that we need to 37:01 go to disk, and this is going to take milliseconds to reply, or, if you have a solid-state drive, 37:07 hundreds of microseconds, so we're talking about hundreds of thousands to millions of cycles. Since 37:14 misses are so slow, it actually doesn't make sense to handle the misses in 37:20 hardware, so we design our system to handle the misses in software. That's simpler, and it's 37:27 actually not any more expensive than handling them in hardware, because we're going to be waiting for 37:33 the data to come back anyway: if we're going to wait for five million cycles, what does it matter that we spend a hundred instructions 37:38 processing the miss? Why not just do it in software, which is more flexible? OK, so here's 37:46 what happens, and again, we resort to exceptions. When the 37:55 CPU issues an access to a page that doesn't have a valid translation, that's not resident, this 38:01 causes an exception: it triggers what we call a page fault, or a page fault exception, and that again invokes 38:07 the OS, and the OS is going to handle this miss. Let's see first, 38:12 before diving into the details, what the state of the system is before and after servicing the page fault. 38:19 Before the page fault, suppose that the CPU wants to access this virtual page number here, virtual page number 5, 38:25 which is not resident, so it's pointing somewhere on disk. The OS is going to 38:31 have to choose something to replace; in this case, let's assume that it's virtual page 1, 38:40 which is resident, pointing somewhere over here. You're going to have to implement some replacement policy; 38:45 let's say it's least recently used, and this page is the one that was accessed least recently. 38:52 After the page fault, we're going to have the following setup. First, the OS is 38:58 going to write out the contents of virtual page 1; it's 39:06 going to evict that page to disk. Then it's going to load virtual page 5 39:13 from disk into the location where virtual page 39:19 1 was, and then it's going to alter the page table so that virtual page 1, 39:25 page table entry 1, points to the right location on disk, and 39:32 virtual page 5 points to the right location in memory. Essentially, we've swapped the pages' 39:39 locations between disk and main memory. The way this happens is the same process I 39:46 just described, essentially exactly what you implemented in Lab 6 for a 39:51 cache, but in software. Basically, the 39:58 operating system chooses the page to replace; if it's dirty, it writes it back, and 40:05 marks the page table entry to show that that page is no longer resident; then it reads the page from disk into the available physical page, 40:10 and updates the page table to show that the new page is resident and has the right physical page number; 40:17 and finally, as with all exceptions, once we're done servicing it, we return control to the program, which 40:24 will re-execute the memory access. Now this memory access goes through the page table, 40:30 finds a valid translation, and actually completes.
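Putting those servicing steps into code, a software page-fault handler might look roughly like the following sketch. All the helpers here (pick_victim_vpn, disk_read, disk_write, page_table) are hypothetical stand-ins for kernel internals, not real APIs.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool     resident;   /* in DRAM (true) or out on disk (false)? */
    bool     dirty;      /* written since it was brought in? */
    uint32_t ppn;        /* physical page number, valid when resident */
    uint32_t disk_page;  /* location in swap space when not resident */
} pte_t;

/* Hypothetical kernel internals, assumed to exist for this sketch. */
extern pte_t    page_table[];
extern uint32_t pick_victim_vpn(void);                    /* e.g. LRU */
extern void     disk_write(uint32_t disk_pg, uint32_t ppn);
extern void     disk_read(uint32_t disk_pg, uint32_t ppn);

void handle_page_fault(uint32_t faulting_vpn) {
    /* 1. Choose a resident page to evict (no free physical pages left). */
    uint32_t victim = pick_victim_vpn();
    uint32_t ppn    = page_table[victim].ppn;

    /* 2. If the victim is dirty, write it back to its swap slot. */
    if (page_table[victim].dirty)
        disk_write(page_table[victim].disk_page, ppn);

    /* 3. Mark the victim as no longer resident. */
    page_table[victim].resident = false;

    /* 4. Read the faulting page from swap into the freed physical page. */
    disk_read(page_table[faulting_vpn].disk_page, ppn);

    /* 5. Update the page table: the new page is resident at this PPN. */
    page_table[faulting_vpn].resident = true;
    page_table[faulting_vpn].ppn      = ppn;
    page_table[faulting_vpn].dirty    = false;

    /* 6. Returning from the exception re-executes the memory access,
     *    which now finds a valid translation and completes. */
}
```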
Yes? Well, because in 40:45 general, the reason that you have virtual page 5 out on disk is that you're out of physical memory: every entry here, 40:51 although not all mappings are shown, every page of physical memory is used, so you have to pick something to 40:58 replace. Of course, if you had a free physical page, then you would just use the free physical page, but in this 41:05 example we're assuming that you don't have any free physical pages. OK, 41:10 all right, so this concludes our discussion of demand paging. Now remember, 41:16 there are two other issues that we need to take care of, and these, again, are performance 41:23 optimizations that are very important. The first one is that it's completely 41:29 impractical to have two (or more) memory accesses per 41:34 reference: we don't want each page table access to be a memory reference. 41:40 So what we do is introduce a cache for the page table, and 41:47 this cache for the page table is called a translation lookaside buffer, or TLB. 41:54 Despite this fairly complicated name, it's just a cache: it's 42:01 a cache where the tag is the virtual page number, and then we're going to have all the 42:07 same bits that we have in a page table entry: a physical page number, the resident bit, the 42:15 dirty bit, the writable bit, and so on. And we're going to have a valid bit, just the same way that we have 42:21 a valid bit in our caches. So 42:27 this is going to be a small cache, and for every memory reference we're 42:33 going to first access this cache and see if we have a match. If we have a match, then 42:38 we can perform the translation in a single cycle without going to memory; otherwise, we 42:44 have to go access the page table and fill the right state into the TLB, 42:49 essentially filling in the new translation in the TLB. OK, so we have a 42:56 cache for page table entries. Now, these caches, as I said, are typically small, 43:03 32 to 128 entries; they don't store a lot of translations, so that they can be 43:08 very, very fast. In fact, modern processors have multiple levels of TLB, just like modern processors have multiple levels 43:15 of caches. Then there are two details that we need to take care of. The first one is that when you 43:23 switch from one process to another, if you don't do 43:28 anything special, the operating system actually needs to flush the TLB, because the TLB is now caching page 43:35 table entries that are stale; you have this problem of staleness. We could actually 43:42 solve this by including the process ID in the tag, so that we can have context switching among multiple 43:48 processes and we only get a
match if the process ID matches 43:54 and the virtual page number matches, and some systems do this. Then 44:02 there's another big design decision, which is what happens on a miss. Here, 44:07 all systems need to check the page table, which we call walking the page table, and 44:15 the key question is how you do this: you could do it in hardware, or you could do it 44:20 in software. It's not that frequent, so it's OK to do it in software, but in fact most modern 44:27 systems do it in hardware: they have what we call a memory management unit that issues the accesses to the page table 44:33 in hardware and avoids taking an exception. You could take an exception, though, and there are other architectures, like MIPS, that do 44:41 this in software; but in particular RISC-V, and actually x86, do these page table accesses in hardware. OK, so 44:48 at this point there are a lot of design decisions; let's go back and see how this all fits together. 44:56 The program issues memory 45:02 references with virtual addresses. The first thing we do is 45:07 look up the TLB and see if we have a translation there. Best-case scenario, we have a 45:13 translation. Now, even if we have a translation, we need to figure out whether we can actually perform the access: for example, if this is a store and we do not have write permission for 45:23 this page, then we need to throw an exception. If we pass 45:30 the protection checks, we basically have the physical address and go directly to memory; if we hit one of 45:37 these cases where the translation is there but we don't have the right permissions, then we cause 45:45 an exception, which we call a protection fault. OK, the other path 45:51 is when you have a miss in the TLB, when your 45:56 translation is not in the TLB. Now you need to do a page table walk, which we said can happen either in hardware or in software; we will do it in 46:03 hardware. Then there are two options: either your page table has the translation, in which case we go and 46:10 update the TLB and we have the right physical address, or it's not in memory, in which case again we need to 46:15 trigger a page fault so that the operating system can load the page from disk 46:22 and insert the right translation in the page table; then we'll replay the memory 46:28 access, we'll replay the instruction, and we'll go through this other path 46:33 over here.
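Here is a sketch of this whole flow in C: TLB lookup, protection check, page table walk on a miss, page fault if the page isn't resident, then replay. The four-entry fully associative TLB and round-robin refill are simplifications, and walk_page_table, protection_fault, and page_fault are hypothetical stand-ins.

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 4
#define PAGE_BITS   10

typedef struct { bool valid, writable; uint32_t vpn, ppn; } tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static int next_victim;  /* round-robin refill, just for the sketch */

/* Hypothetical back ends, assumed to exist for this sketch. */
extern bool walk_page_table(uint32_t vpn, uint32_t *ppn, bool *writable);
extern void protection_fault(void);      /* raise an exception to the OS */
extern void page_fault(uint32_t vpn);    /* OS loads the page from disk */

uint32_t translate(uint32_t va, bool is_store) {
    uint32_t vpn    = va >> PAGE_BITS;
    uint32_t offset = va & ((1u << PAGE_BITS) - 1);

    /* 1. TLB lookup: a hit translates in a "single cycle". */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            if (is_store && !tlb[i].writable)
                protection_fault();      /* translation OK, access not */
            return (tlb[i].ppn << PAGE_BITS) | offset;
        }

    /* 2. TLB miss: walk the page table (in hardware on RISC-V and x86). */
    uint32_t ppn; bool writable;
    if (!walk_page_table(vpn, &ppn, &writable)) {
        page_fault(vpn);                 /* not resident: OS brings it in */
        return translate(va, is_store);  /* then the access is replayed */
    }

    /* 3. Refill the TLB with the new translation, then replay. */
    tlb[next_victim] = (tlb_entry_t){ true, writable, vpn, ppn };
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    return translate(va, is_store);
}
```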
OK, so let's finish up with an example that 46:41 makes this much more concrete. Suppose that you have a virtual memory system with four 46:49 gigabytes, that is, 32 bits, of virtual address space, but you only have 2^24 bytes of physical memory, so 46:58 that's 16 megabytes, and we have one-kilobyte pages. Not only 47:04 that, we have a four-entry TLB, which is shown here. So first, some questions. How many 47:11 pages can be stored in physical memory at once? Well, you have 2^24 bytes and 2^10 bytes per page, so 2^24 47:19 divided by 2^10 is 2^14, so we can have 2^14 47:25 pages in physical memory. Now, how many entries are in the page table? We 47:35 have 2^32 possible byte addresses and 2^10 bytes per 47:43 page, so that's 2^22: 47:54 you divide the two to get 2^22 virtual page numbers. Yes? For which one? OK, so this is because we have 48:12 2^24 bytes over 2^10 bytes 48:19 per page, which gives us 2^14 pages: this is bytes per page, this is 48:34 physical memory bytes, so this is 48:42 physical memory pages. Same thing 48:51 for this one, except now we're considering virtual memory bytes. 48:56 OK, so you can go through and answer some of these questions about how much 49:02 storage this takes: we're going to need 14 bits for the physical page number and 49:07 2 bits for the resident and dirty bits, so we're going to have 16 bits per page 49:14 table entry, and from that you can compute the size of the page table. But let's do 49:20 one of these examples: what is the physical address for virtual address 0x1804? If you take 0x1804, again, 49:31 the first step is to compute the 49:37 virtual page number. Here things are not so nicely aligned, so 49:46 you put it in binary: the low ten bits are your offset, 49:56 and the remaining bits are your virtual page number. What is this virtual page 50:04 number? 0b0110, that is, six. And what components are involved in this 50:12 translation? Well, we go to the TLB: do we have a translation for 6? Yes. And 50:18 what is the physical page number? Two. Given that, you can go back and compute the physical address for this 50:27 virtual address, and in this particular case you only have to check the TLB, because you already had a valid 50:33 translation. For some of these other ones, you will see that you actually need to go to the page table, and in one of the 50:40 cases you go to the page table and don't find a translation at all. Yes? 50:46 Yes, not a very efficient design, and we'll see how to fix that at the 50:52 beginning of next lecture. Yes? 50:59 Right, you absolutely can access this through the page table; 51:14 the problem is that that would take you an extra memory access, and you don't want that. That's why we added the TLB: 51:19 this page table access is very expensive, while the TLB access we can do in a single 51:26 cycle; that's why we have this very small cache. This is a performance 51:32 optimization. Oh yeah, great question; 51:38 we'll discuss this next time. Here I assume that there's no cache, and so you access 51:44 the TLB before main memory; the TLB is typically before the cache, but we'll see this in detail in the next lecture. All right, 51:50 that's it for today. In the next lecture we will start discussing how virtual memory interacts 51:59 with caches. Thank you.
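And one last sketch to check the arithmetic of this final example (1 KB pages, so 10 offset bits; the TLB maps VPN 6 to PPN 2, as in the table shown in lecture):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Page counts for the example system. */
    printf("physical pages: 2^%d\n", 24 - 10);     /* 2^14 */
    printf("page table entries: 2^%d\n", 32 - 10); /* 2^22 virtual pages */

    /* Translating virtual address 0x1804. */
    uint32_t va     = 0x1804;
    uint32_t offset = va & 0x3FF;   /* low 10 bits: 0x004 */
    uint32_t vpn    = va >> 10;     /* 0b0110 = 6 */
    uint32_t ppn    = 2;            /* the TLB hit for VPN 6 gives PPN 2 */
    uint32_t pa     = (ppn << 10) | offset;
    printf("va 0x%04X -> vpn %u, offset 0x%03X -> pa 0x%03X\n",
           va, vpn, offset, pa);    /* pa = 0x804 */
    return 0;
}
```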