0:00 so where we left off last time was we ended up with this picture of how modern systems work, where you have 0:07 multiple processes at any given point in time that are sharing the machine, and between 0:14 those processes and the hardware is an operating system that essentially 0:20 mediates between these processes and the hardware to achieve three basic 0:28 goals. The first one is protection and privacy: we want to give a private address space to each process, so that 0:33 processes cannot access each other's data. The second one is to 0:39 abstract or hide away the details of the underlying hardware, so we're giving higher-level abstractions than the hardware 0:45 provides: for example, we use files instead of exposing the raw hard drive, or sockets instead of 0:52 exposing the raw network card. And finally, the last piece of 0:58 the puzzle was resource management: the last goal is essentially to control how these different processes 1:05 share the multiple hardware resources in the system. And we saw last time that 1:13 there are three key enabling technologies. The first one is this combination of two modes, user mode and 1:19 supervisor mode; supervisor mode has some privileged instructions and some privileged state 1:26 that is not accessible to user-mode programs, and the operating system kernel runs in this supervisor mode, 1:34 which is why it has these special privileges and direct access to hardware. The second bit is 1:40 this business of exceptions, or interrupts, to safely transition from user mode into supervisor mode. 1:48 That's what we saw last lecture. Today we're going to see the other big component of hardware support for operating systems, which is virtual 1:54 memory. Virtual memory essentially lets us abstract the storage resources 2:00 of the machine and also implement this protection and privacy feature, 2:05 essentially giving each process a private address space. Let's see this in a little bit 2:11 more detail. So essentially what the virtual memory system provides is the 2:18 illusion of a large private address space for each process. Each 2:25 process believes that it's running alone in the machine, and it looks to the process like it has the entire memory of 2:31 the machine available, or at least a certain amount of memory available. So that's 2:39 one big feature, but there's another one that we didn't discuss 2:44 last lecture which is actually quite important, which is what we call demand paging. Demand 2:51 paging essentially lets us use main memory, or DRAM, 2:57 as a cache for disk. So 3:04 essentially we can support programs that use and access more 3:09 data than we actually have main memory for, and the strategy is the same one that we 3:16 use when we implement caches: we're going to have this primary memory hold a portion of the process's data, and then 3:24 we'll have some mechanism to go to this slower disk, which is what we call 3:30 secondary storage, to retrieve data that's not present in main memory. You'll see 3:36 that there are a lot of similarities between virtual memory and 3:42 caching; at the same time, there are some key differences, driven by
3:47 constant factors: some operations are much more expensive. Accesses to disk are much more expensive than 3:53 accesses to main memory, and that's going to drive us to quite different 3:59 designs, but conceptually it is very similar. OK, so the price for these two 4:05 wonderful features is that every access that the program performs is going to 4:11 have to be translated: we're going to have to perform some 4:17 operations to convert an address that is in the scope of 4:22 the process into an address that's in physical memory. And just to be more 4:29 concrete, we actually give three different names to memory locations, of which the two important ones 4:35 for this lecture are the virtual address and the physical address. But 4:41 just for generality, let's begin with the machine language address. So what does a machine 4:47 language address mean? If you write, say, lw x12, 0x10(x11), 5:02 this is the machine language address; this is what you write in assembly. 5:12 The processor will translate this into what 5:20 we call the virtual address: 5:29 in this case it will be 0x10 plus the contents of register x11. So 5:39 the processor will perform some operations to compute the virtual address, which we sometimes also call the 5:45 effective address, and that is the address that the program, that the process, wants to access. 5:52 But what we're going to do is insert this address mapping machinery here to 5:59 translate these virtual addresses into physical addresses: addresses that actually map to the main memory that 6:06 you have in your machine. And I 6:15 think the key mechanism here, the key thing to understand, is that this address mapping hardware is going to be 6:21 under the control of the operating system. So we're going to take this virtual address over here, 6:28 translate it into a physical address, and then access main memory with it. OK, so 6:36 there are two key mechanisms that we're going to see. First we're going to look at segmentation, which is a very simple technique that has some drawbacks, so 6:42 it's not used today, or at least not used frequently, and then we'll look at paging, which is what modern systems 6:47 actually do. So let's start with the simple technique: segmentation, which is also sometimes called base-and-bound 6:54 address translation, implements a very simple mechanism. 6:59 Suppose that you start with some chunk of memory, so the process 7:07 has addresses from 0 to 0xFFF. What we're going to do is map 7:14 this set of virtual addresses, what we call the address space, into a contiguous 7:21 chunk of physical memory: we're going to place it at some starting 7:27 location, in some contiguous chunk, and this is what we call a segment. But 7:33 because we want to have multiple processes, and we want each process to believe 7:40 that its address space starts at address zero, we need to perform some 7:46 translation, because not all the processes can have their memory starting 7:51 at physical address zero. So what we do is introduce a base 7:56 register in the CPU, and then we compute the physical address by adding the 8:02 contents of that base register and the virtual address. So some address 8:07 over here is going to map to the same offset,
the same distance 8:13 if you start counting from zero, but the actual physical address is computed by 8:19 adding the contents of this base register. So now this means that we can have multiple processes that all 8:26 believe that their address space starts at address zero, and yet they're all coexisting in 8:31 different locations of physical memory. The other thing that we need to do is ensure that a process doesn't try to 8:37 access memory that doesn't belong to it. What if the 8:43 process tries to access something that is out of bounds? For that, we add a second register, which 8:49 we call a bound register, and we check that the virtual address is 8:55 below a certain bound; if it is not, then we trigger an exception. So if the 9:00 process tries to access some location over here, which would correspond perhaps to some data that's outside the process, 9:06 maybe the kernel's data, maybe the data of a different process, maybe some free physical 9:11 memory, whatever it is, we do not want this process to access it, so we trigger an exception if it tries 9:17 to go out of bounds. And of course, a key component of this is that 9:22 these base and bound registers are privileged: they are under the control of the operating system but not 9:30 under the control of the application. So each user process cannot access 9:36 these registers directly; all its memory accesses go through these registers, but it cannot modify 9:44 them. OK, any questions? 9:52 All right, so let's see a slight extension of this 9:58 technique, which actually starts having some interesting features. Suppose that instead of having a single segment 10:06 for each program, we have two segments, which we'll call the code segment and the data segment. 10:12 Each segment is exactly like what I described before, but the difference is that the program counter, 10:18 whenever we're fetching an instruction, goes through the code base 10:25 and bound registers, and whenever we're doing loads and stores, we go through the data base and bound registers. 10:31 So essentially we're reading and writing a data segment which 10:37 is independent from the code segment. Can you tell me what advantages 10:43 this separation has? What do we gain by having these two different segments? There are some good things that happen. 10:54 Yes? Yes, that's a great one. So 11:02 the first benefit: we've talked about protecting applications 11:07 from each other, but there's a nice extra benefit of getting the 11:13 application to protect itself from its own data accesses. If you 11:19 have a bug, this separation guarantees that the buggy program 11:24 will not go and overwrite some code; this is essentially preventing 11:30 self-modifying code, since code cannot overwrite itself. There's another benefit 11:37 that's less obvious: if you're running many copies of the same program, you can actually share the code 11:44 segment across all of them. You can have multiple processes, all of 11:49 which share the code segment, and then give an independent data 11:54 segment to each process, and that essentially saves physical memory. 12:01 OK, any questions?
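To make the mechanics concrete, here is a minimal C sketch of base-and-bound translation with the separate code and data segments just described. All the names and segment values are made up for illustration; real hardware performs the bound check on every access, in parallel with the add.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint32_t base;   /* privileged: set only by the OS */
    uint32_t bound;  /* privileged: set only by the OS */
} segment_t;

/* Translate a virtual address through one segment, or "trap" on overflow. */
uint32_t translate(segment_t seg, uint32_t va) {
    if (va >= seg.bound) {
        fprintf(stderr, "segmentation fault: va 0x%x out of bounds\n", va);
        exit(1);  /* stand-in for raising an exception to the OS */
    }
    return seg.base + va;  /* same offset, shifted by the segment base */
}

int main(void) {
    segment_t code = { .base = 0x4000, .bound = 0x1000 };
    segment_t data = { .base = 0x9000, .bound = 0x2000 };
    /* Instruction fetches go through the code segment;
     * loads and stores go through the data segment. */
    printf("fetch va 0x100 -> pa 0x%x\n", translate(code, 0x100));
    printf("load  va 0x100 -> pa 0x%x\n", translate(data, 0x100));
    translate(data, 0x3000);  /* out of bounds: triggers the "exception" */
    return 0;
}
```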
12:07 So as I said, there's... oh yes, that's a 12:16 good point; the next slide covers exactly this. 12:22 The problem is that we're giving some fixed space to each process, and this leads to what we call 12:28 fragmentation, memory fragmentation, and this is one of the two big problems that prevent us from 12:34 using segmentation. There are two sources of fragmentation: one is 12:39 programs that allocate more data as time goes on; the second one is 12:45 simply having multiple processes that start and finish over time. Suppose that 12:51 you're in this situation where you have three processes running in the machine at these locations: process 1 has a chunk of 16 kilobytes of 12:58 memory, process 2 has a 24-kilobyte chunk, then there's this free chunk of 24 kilobytes of 13:04 memory, and process 3 has a 32-kilobyte chunk. Let's say we start processes 4 and 5. At this point we 13:11 allocate process 4 over here, leaving an 8-kilobyte chunk of free memory, and then process 5 fits in 13:19 this free 24-kilobyte chunk. Now assume that processes 2 and 5 finish, and so now 13:25 we're left with the following situation. Over time, what happens is that now we have this 8-13:31 kilobyte segment here, which might be too small to accommodate any new processes, 13:37 and so as processes start and finish, we're going to be left with a 13:42 fragmented memory layout: we're going to be left with a potentially 13:50 very large number of small free chunks that cannot individually support 13:58 any particular process. For example, here we have 56 kilobytes of free memory, and so if we wanted to run 14:04 anything that needed, say, 32 kilobytes, there's no contiguous 32-kilobyte segment that we could 14:10 allocate, so we would need to move, for example, process 3 eight kilobytes up to 14:15 create a chunk of 32 kilobytes, and that's expensive. This 14:22 example shows 32 kilobytes, but if you're talking about 2 gigabytes, moving an entire chunk of 2 gigabytes 14:28 of data from one place to another is very, very expensive; that's going to take a lot of time. Another issue is that 14:38 these pictures show processes taking a fixed amount of space, but as you asked, processes don't actually 14:46 take a fixed amount of space: they start with some allocation and then they demand more resources. For example, when 14:51 you're allocating data on the heap, or when you want to grow your stack, or in any number of other cases, you want 14:58 to give processes the ability to request more memory and get it, and this 15:04 scheme is not flexible enough for that. OK, so to solve this problem we're going 15:11 to shift to what we call paging, or paged memory systems. The key idea 15:17 here is to divide physical memory into a number of fixed-size 15:22 blocks that we call pages, and these pages are reasonably small; typical page sizes are 4 or 8 15:30 kilobytes in most systems. What we're going to do is 15:37 interpret each virtual address as an offset into a particular 15:43 page, plus a page number. So if this is a virtual address, we call this part the virtual 15:48 page number, which we're going to translate into a physical page number to produce the physical address. So we're 15:55 going
to have a new data structure that we call a page table, which stores these translations from virtual pages into 16:02 physical pages. The key advantage of doing this is that now we can map any virtual page number to any physical 16:09 page number, which means that we can store the different pages of a process, 16:15 the different chunks of memory of a process, in a non-contiguous fashion. Basically, we now 16:24 have a more complex translation mechanism, but we 16:29 have a much more flexible mapping function. Let's see how this works with a more concrete example. 16:36 To implement the private address space per process that we want, we have a page table for each process, 16:43 and this page table has as many page table entries, at least in 16:50 the simple design that we're going to see first, as the process has virtual pages, and each entry contains the 16:59 physical page number for that virtual page number of that particular process. For example, in 17:07 this particular case we have three processes, all of which use this second 17:12 page table entry over here. Let's say that there's some virtual address VA1 over 17:17 here; to translate this virtual address of process 17:24 1, we access entry 1 of its page table, and that entry has a pointer to some location 17:31 in physical memory, to a page of physical memory, the page shown in solid blue over here. Now for process 2, for the same 17:39 address, we go to the same location in its page table, and now you see that it's pointing to a different location. And the same 17:46 thing for process 3: it's pointing to yet another location. So note that the pages of process 17:52 1, the pages of process 2, and the pages of process 3 are all stored non-contiguously; we just have a 17:59 bunch of 4-kilobyte chunks which we're allocating independently, so we no longer have this restriction of needing one big 18:05 chunk of memory per process. Yes? How do we find a free page in 18:20 physical memory? 18:25 Right, so apart from the page tables, the kernel needs to keep some sort of free list, some list of available 18:32 pages, and then it's going to allocate from there. 18:38 Yeah, this is actually a poor 18:45 choice of colors: this fill pattern is slightly different from 18:52 this one and this one and this one, so I'll fix this in the handout. 18:58 Sorry, nothing is pointing to a free page. Yes? Ah, 19:05 great question; we'll see that in a minute. For now, assume the non-resident pages are somewhere we don't know yet. So the key thing 19:19 here is that these page tables are under the control of the operating system; that's the key technique here, 19:25 because otherwise we would allow processes to alter which 19:30 physical memory they're accessing. All of these translation mechanisms are always 19:37 under the control of the operating system.
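Here is a rough C sketch of this per-process page table lookup: one entry per virtual page, each holding a physical page number. The page size, the table contents, and the two-process setup are all invented for the example.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 12                      /* 4 KB pages */
#define PAGE_SIZE (1u << PAGE_BITS)
#define NUM_VPAGES 8                      /* tiny address space for the demo */

/* Each process has its own table mapping virtual page -> physical page. */
uint32_t page_table[2][NUM_VPAGES] = {
    /* process 1: its pages are scattered, not contiguous */
    { 5, 9, 2, 7, 0, 3, 6, 1 },
    /* process 2: the same virtual pages map to different physical pages */
    { 8, 4, 11, 10, 13, 12, 15, 14 },
};

uint32_t translate(int pid, uint32_t va) {
    uint32_t vpn    = va >> PAGE_BITS;         /* virtual page number */
    uint32_t offset = va & (PAGE_SIZE - 1);    /* offset within the page */
    uint32_t ppn    = page_table[pid][vpn];    /* look up the mapping */
    return (ppn << PAGE_BITS) | offset;        /* physical address */
}

int main(void) {
    /* The same virtual address lands on different physical pages
     * depending on which process issues it. */
    printf("p1: va 0x1abc -> pa 0x%x\n", translate(0, 0x1abc));
    printf("p2: va 0x1abc -> pa 0x%x\n", translate(1, 0x1abc));
    return 0;
}
```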
OK, we're going to see a few more 19:43 examples of page tables and address translation, but at a high level, let's 19:49 look at the pros and cons of paging versus segmentation and start getting into the issues of paging. 19:56 As I said, the obvious good thing 20:01 about paging is that now you don't have fragmentation; we don't have 20:08 this issue of having to move large chunks of memory. Can you think of 20:14 another benefit? 20:20 Yes, thank you: you don't need contiguous sequences in memory; it just 20:27 depends on what's free. Right, so because 20:32 you're managing everything in fixed-size blocks, management of physical memory is much easier, though that 20:40 sort of follows from avoiding the fragmentation issues. This goes back to what I mentioned at the beginning: 20:45 the second big feature that we want from virtual memory is to use main 20:50 memory, to use the DRAM, as a cache for disk. So what if we design our page table 20:55 so that an entry could say, well, this page is at 21:01 some location in memory, but for some virtual pages it could just say: it's not here, it's somewhere on disk, I 21:07 just don't have it; if you want to access this particular page, go get it from disk. That's precisely what demand paging, 21:15 the technique that we're going to discuss next, gets us: the ability to use 21:21 main memory as a cache for disk. OK, so of course this comes with a 21:30 really large caveat. We've gone from 21:35 using, well, not one but two registers, a base and a bound 21:43 register, to using a page table, and these page tables are big: they're 21:48 much larger than the base and bound registers, which brings us to the 21:53 obvious question of where we store these page tables. Clearly not in registers; we're talking 21:59 about at least megabytes of storage here, so they're not going to be in registers. What makes sense is to have 22:06 them in main memory: we store the page tables themselves in main memory. 22:13 So to access main memory, you first need to access main memory. OK, here's how it works. First 22:21 of all, the kernel is going to have some array that stores 22:28 pointers to all the page tables; it stores the addresses of all the page tables of all the running processes. 22:35 Then, at this particular starting address, for example, we'll 22:40 have the page table for process 1, and at this starting address we'll have the page table for process 2. What we're 22:47 going to do for a single access is: 22:53 we divide the virtual address into the offset and the virtual page 22:59 number; we take the virtual page number and add it to the base register of the page table of the process that we're 23:05 currently running, in this case let's say process 1, and this gives us some memory address 23:11 which we're going to access to get the physical page number, and therefore produce the physical address. 23:18 In this particular case, these entries hold the 23:25 physical page numbers, which are basically the addresses of the different pages of 23:30 physical memory backing our virtual pages. So in this particular case, let's say we go through here; 23:37 we'll get to this particular page over here, and then we'll use 23:42 the offset to figure out which word we need to access within this physical page. So basically, different processes' 23:52 page tables are going to have different physical page numbers, and then, 23:58 as I said, this array over here is going to
store the starting addresses 24:03 of each page table. So a translation is basically one memory 24:08 access to get the physical page number, and then one memory access to get the data at the physical 24:17 address that we want. And by the way, all these arrows 24:22 here are physical addresses: we're not doing virtual-to-physical address translation on any of these accesses. 24:30 Because you have multiple processes, the last piece of the puzzle is that on a process switch you need to change this 24:35 base register: if, for example, we deschedule process 1 and 24:41 schedule process 2, we would have to change the base register to point to here, so that we get the right translations. 25:06 Absolutely, and what we're going to do to simplify our lives, for now, is assume 25:12 that you map the whole virtual address space. For example, if you have 32 bits 25:18 of virtual addresses, assume that your page table has one entry for every 25:24 possible virtual page number that you can have; you're not restricting it in any way. That's in fact how the 25:31 simplest possible page table organization works: for every possible virtual page number, we're going 25:38 to have a very large array with an entry 25:43 that gives us a physical page number, or just says: it's not allocated, or it's not here. 25:51 Yes? 26:00 Yes, so from the virtual address we get the virtual page number, which 26:06 we run through the page table to get the physical page number, which we then combine with the offset to get the physical address. So it's a 26:13 multi-step process. OK, so the key issue 26:21 here is that to access one piece of data, 26:26 each memory reference now takes two accesses to main memory, and in the rest of the lecture we're 26:32 essentially going to address these efficiency issues and look at how demand 26:37 paging works. There are two main issues. The first one is how we reduce this overhead of memory accesses: 26:44 we want paging-based systems that are very 26:52 flexible, but at the same time we want them to be fast, and we don't want them to take a ton of space to store the page tables; we'll see how to do that 26:59 toward the end of the lecture.
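A small simulation can make that cost visible. In this sketch, assuming the in-memory page table organization just described, the page table itself lives in a simulated physical memory, so every load costs two memory reads: one for the page table entry and one for the data. The addresses and layout are invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS 10
#define PAGE_SIZE (1u << PAGE_BITS)

uint32_t mem[1 << 16];       /* simulated physical memory, word-addressed */
uint32_t pt_base;            /* page table base register; the OS changes it
                                on a context switch to the new process */
int      accesses;           /* count memory reads to make the cost visible */

uint32_t mem_read(uint32_t paddr) { accesses++; return mem[paddr]; }

uint32_t load(uint32_t va) {
    uint32_t vpn    = va >> PAGE_BITS;
    uint32_t offset = va & (PAGE_SIZE - 1);
    uint32_t ppn    = mem_read(pt_base + vpn);     /* access 1: the PTE */
    return mem_read((ppn << PAGE_BITS) + offset);  /* access 2: the data */
}

int main(void) {
    pt_base = 0x100;          /* pretend the kernel put this table here */
    mem[0x100 + 3] = 7;       /* virtual page 3 -> physical page 7 */
    mem[(7 << PAGE_BITS) + 5] = 42;              /* the data we want */
    printf("load(0x%x) = %u (%d memory accesses)\n",
           (3u << PAGE_BITS) + 5, load((3u << PAGE_BITS) + 5), accesses);
    return 0;
}
```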
But first, we're going to address the issue of what happens if we cannot fit all the pages in 27:04 RAM, if we can't fit everything in main memory and we need to use secondary storage. Now, there's a related 27:10 issue that is quite interesting, which is what happens if the page tables themselves cannot fit in DRAM; we're not going to cover that 27:16 here, but if you're interested, there are more advanced courses that will teach you some of the 27:23 rigmarole that you have to go through to store the page tables 27:29 outside of DRAM. OK, so let's see this in more detail with demand paging, and now we're getting 27:35 to the specific organization of the page table. The key idea here is that because 27:42 not all the pages fit in main memory, we're going to use some part of disk, 27:48 which is much larger than main memory, as storage for main 27:55 memory pages. We call this swap space, and we'll see why it's called swap space: because we're always swapping 28:01 pages between disk and main memory. So the page table is just an array of 28:07 page table entries that's indexed by the virtual page number, and each 28:15 entry, which we call the page table entry, contains a resident bit to tell us: is this page in DRAM or is 28:23 this page on disk? If it's resident, the entry has the physical 28:31 page number for that memory-resident page; otherwise, it has a disk page number: 28:38 it basically points to the location in the swap space on disk that 28:44 holds the page that we want. Apart from these, there are some protection and usage bits, things like: has this 28:50 page been written, so that we know whether we need to write it back to 28:56 disk, or can this page be written at all; if it's code, maybe you don't want to allow programs to write to 29:04 that memory. And again, the way this works is: you start from the page table base register, 29:12 which gives you the starting address; using the virtual page number as the 29:17 offset, you index the particular page table entry that you're interested in; in this case we have a translation to a physical page number 29:23 that gives us some location in physical memory; and then we use the offset to figure out which particular word within 29:30 the page we want to access. One nice thing about this is that even if 29:37 everything fits in memory, demand paging allows us to be lazy. Suppose that you 29:43 have a program that is a hundred megabytes of code, but when you 29:49 run it, it doesn't do much: it just looks at the input and 29:54 says, you know, I can't process this, and perhaps throws an error. You don't really want to load the 30:00 whole program just to run 10,000 instructions and then bail out, and so 30:06 what demand paging allows us to do is initially have everything resident on disk and then, as 30:13 we execute the parts of the code that we care about, bring them into memory on demand. OK, any 30:24 questions? Yes? OK, so when I mention 30:36 main memory, I'm always talking about DRAM, which is this dynamic random-access memory technology that 30:42 gives you accesses in about 100 cycles, and in current technology you can have a few gigabytes 30:48 of it. When I say disk, this can be either a hard drive or a solid-state drive, HDD or SSD. 30:57 Solid-state drives, which are more common today, are flash-based; 31:03 they take about 10 to 100 microseconds to serve a read operation, so we're talking about two to 31:09 three orders of magnitude of difference to read some data. A hard disk is basically a spinning 31:15 platter that's read with a magnetic head; it's pretty old at this 31:20 point, but it's still used because it's very dense. It's really, really slow, though: we're talking about 10 milliseconds per read. So there's this whole 31:27 hierarchy of storage technologies that we saw a glimpse 31:32 of in the memory hierarchy lecture. OK, 31:38 any other questions?
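As a sketch, a page table entry with the fields just listed might be packed like this in C. The exact field widths and layout here are invented, not those of any real ISA.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    uint32_t ppn_or_disk : 20;  /* PPN if resident, else swap location */
    uint32_t writable    : 1;   /* can this page be written? (0 for code) */
    uint32_t dirty       : 1;   /* written since it was brought in? */
    uint32_t resident    : 1;   /* in DRAM (1) or out on disk (0)? */
} pte_t;

int main(void) {
    pte_t code_page = { .ppn_or_disk = 5,  .writable = 0, .resident = 1 };
    pte_t swapped   = { .ppn_or_disk = 99, .writable = 1, .resident = 0 };
    printf("code page: resident, PPN %u, read-only\n", code_page.ppn_or_disk);
    if (!swapped.resident)
        printf("swapped page: at swap slot %u -> page fault on access\n",
               swapped.ppn_or_disk);
    return 0;
}
```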
All right, so let's see a more 31:45 concrete example; in particular, let's use something that has 31:51 small virtual and physical addresses. In general, with RISC-V processors you'll have 32-bit virtual 31:57 addresses, but here let's assume that we have a system with 256 32:05 bytes per page, so we're going to have eight offset bits, and then sixteen 32:10 virtual pages, which is 2 to the 4. So overall we're going to have virtual addresses that are 12 bits; 12 32:18 bits is just 4,096 locations, which is not very big, but some systems actually use small memories. And then we're going to 32:25 have the physical addresses; this is the actual amount of physical memory that we have: eight 32:31 physical pages of 256 bytes each, so two kilobytes of memory. So 32:36 again, we're going to have 12-bit virtual addresses and 11-bit physical addresses, and because all of these numbers are pretty 32:43 small, we can actually show the entire page table here. It has 16 entries because we have 16 virtual pages, one entry per 32:51 virtual page, and then we have the eight physical memory pages over here. 32:59 Some entries are populated; there are three bits here, 33:05 the dirty bit, the writable bit, and the resident bit, so D, W, R, and for whatever is 33:11 resident, like this entry over here for page E, we have a 33:18 physical page number: it's saying that this virtual page number E maps to physical page number 33:24 five. Yes? 33:31 The DRAM is not a cache in the sense that there's no hardware and there are 33:36 no hardware tags, but in terms of how we're using it, thanks to virtual memory, it is one: misses 33:50 are served from disk. We will actually talk more 33:57 about the interaction of caches and main memory in the next lecture, but for now 34:02 we're going to assume that you just have the CPU and memory. OK, yes? 34:10 Right, these bits implement something similar to the protection that we saw when you have two separate 34:16 segments. For example, when you have code, you don't want anyone to write to it, and so you'll mark it as not 34:22 writable, so that a store to that page triggers an exception. OK, 34:30 so let's assume that you have a load that issues an access to 34:38 virtual address 0x2C8. What is the virtual page number for 34:47 this virtual address? We want to translate this virtual address into a physical address, and the first step is to 34:53 divide it into the virtual page number and the offset. In this 34:59 particular case it's very nice, because every field is a multiple of four bits, so it's whole hex digits: the virtual page 35:05 number is 2. And what do we do with this 35:11 virtual page number? We go to location 2 in the page table, page table entry 2, and here we see that 35:19 the resident bit is set to 1 and the physical page number is 4. 35:25 So our physical page number is 4, and therefore this corresponds to physical 35:30 address 0x4C8, and now we can go and access memory at that 35:36 particular address.
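Here's the same 0x2C8 translation written out as a few lines of C, just to double-check the arithmetic (8 offset bits; page table entry 2 resident with PPN 4, as in the lecture's table):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t va     = 0x2C8;
    uint32_t offset = va & 0xFF;        /* low 8 bits: 0xC8 */
    uint32_t vpn    = va >> 8;          /* remaining bits: 0x2 */
    uint32_t ppn    = 4;                /* page table entry 2 says PPN = 4 */
    uint32_t pa     = (ppn << 8) | offset;
    printf("va 0x%03X -> vpn %u, offset 0x%02X -> pa 0x%03X\n",
           va, vpn, offset, pa);        /* prints pa 0x4C8 */
    return 0;
}
```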
OK, so we've seen what happens when you 35:46 actually have the data in memory, but what happens when you miss, 35:51 when you access a virtual page 35:56 number that's not actually resident, that is on disk? There are lots of 36:03 similarities between the process that we're going to see and the caches which 36:09 we saw a few lectures ago. The setup that we've seen 36:14 so far is: we have a cache, and then main memory or primary memory. The setup 36:19 that we're going to see today is: we have primary memory acting as a cache for secondary memory, which is on disk. And as I said, the concepts are very 36:28 similar. Here we talk about a cache entry; here we talk about a page, or a page frame. There we have cache blocks; 36:34 here we have pages, which are also large so that we can amortize the 36:39 overheads of keeping all this metadata. There's cache miss rate and page miss rate; cache hit and page hit; cache miss and 36:46 page miss. But there are differences. A cache miss in this case 36:52 is served from DRAM, the primary memory, and takes around 100 cycles; a page miss here means that we need to 37:01 go to disk, and this is going to take milliseconds to reply, or, if you have a solid-state drive, 37:07 hundreds of microseconds, so we're talking about hundreds of thousands to millions of cycles. Since 37:14 misses are so slow, it actually doesn't make sense to handle the misses in 37:20 hardware, so we design our system to handle the misses in software. That's simpler, and it's 37:27 actually not any more expensive than handling them in hardware, because we're going to be waiting for 37:33 the data to come back anyway: if we're going to wait for five million cycles, what does it matter that we spend a hundred instructions 37:38 processing the miss? Why not just do it in software, which is more flexible? OK, so here's 37:46 what happens, and again, we resort to exceptions. When the 37:55 CPU issues an access to a page that doesn't have a valid translation, that's not resident, this 38:01 causes an exception: it triggers what we call a page fault, or a page fault exception, and that again invokes 38:07 the OS, and the OS is going to handle this miss. Let's see first, 38:12 before diving into the details, what the state of the system is before and after servicing the page fault. 38:19 Before the page fault, suppose that the CPU wants to access this virtual page number here, virtual page number 5, 38:25 which is not resident, so it's pointing somewhere on disk. The OS is going to 38:31 have to choose something to replace; in this case, let's assume that it's virtual page 1, 38:40 which is resident, pointing somewhere over here. You're going to have to implement some replacement policy; 38:45 let's say it's least recently used, and this page is the one that was accessed least recently. 38:52 After the page fault, we're going to have the following setup. First, the OS is 38:58 going to write out the contents of virtual page 1; it's 39:06 going to evict that page to disk. Then it's going to load virtual page 5 39:13 from disk into the location where virtual page 39:19 1 was, and then it's going to alter the page table so that virtual page 1, 39:25 page table entry 1, points to the right location on disk, and 39:32 virtual page 5 points to the right location in memory. Essentially, we've swapped the pages' 39:39 locations between disk and main memory. The way this happens is the same process I 39:46 just described, essentially exactly what you implemented in Lab 6 for a 39:51 cache, but in software. Basically, the 39:58 operating system chooses the page to replace; if it's dirty, it writes it back, and 40:05 marks the page table entry to show that that page is no longer resident; then it reads the page from disk into the available physical page, 40:10 and updates the page table to show that the new page is resident and has the right physical page number; 40:17 and finally, as with all exceptions, once we're done servicing it, we return control to the program, which 40:24 will re-execute the memory access. Now this memory access goes through the page table, 40:30 finds a valid translation, and actually completes.
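Putting those servicing steps into code, a software page-fault handler might look roughly like the following sketch. All the helpers here (pick_victim_vpn, disk_read, disk_write, page_table) are hypothetical stand-ins for kernel internals, not real APIs.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool     resident;   /* in DRAM (true) or out on disk (false)? */
    bool     dirty;      /* written since it was brought in? */
    uint32_t ppn;        /* physical page number, valid when resident */
    uint32_t disk_page;  /* location in swap space when not resident */
} pte_t;

/* Hypothetical kernel internals, assumed to exist for this sketch. */
extern pte_t    page_table[];
extern uint32_t pick_victim_vpn(void);                    /* e.g. LRU */
extern void     disk_write(uint32_t disk_pg, uint32_t ppn);
extern void     disk_read(uint32_t disk_pg, uint32_t ppn);

void handle_page_fault(uint32_t faulting_vpn) {
    /* 1. Choose a resident page to evict (no free physical pages left). */
    uint32_t victim = pick_victim_vpn();
    uint32_t ppn    = page_table[victim].ppn;

    /* 2. If the victim is dirty, write it back to its swap slot. */
    if (page_table[victim].dirty)
        disk_write(page_table[victim].disk_page, ppn);

    /* 3. Mark the victim as no longer resident. */
    page_table[victim].resident = false;

    /* 4. Read the faulting page from swap into the freed physical page. */
    disk_read(page_table[faulting_vpn].disk_page, ppn);

    /* 5. Update the page table: the new page is resident at this PPN. */
    page_table[faulting_vpn].resident = true;
    page_table[faulting_vpn].ppn      = ppn;
    page_table[faulting_vpn].dirty    = false;

    /* 6. Returning from the exception re-executes the memory access,
     *    which now finds a valid translation and completes. */
}
```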
Yes? Well, because in 40:45 general, the reason that you have virtual page 5 out on disk is that you're out of physical memory: every entry here, 40:51 although not all mappings are shown, every page of physical memory is used, so you have to pick something to 40:58 replace. Of course, if you had a free physical page, then you would just use the free physical page, but in this 41:05 example we're assuming that you don't have any free physical pages. OK, 41:10 all right, so this concludes our discussion of demand paging. Now remember, 41:16 there are two other issues that we need to take care of, and these, again, are performance 41:23 optimizations that are very important. The first one is that it's completely 41:29 impractical to have two (or more) memory accesses per 41:34 reference: we don't want each page table access to be a memory reference. 41:40 So what we do is introduce a cache for the page table, and 41:47 this cache for the page table is called a translation lookaside buffer, or TLB. 41:54 Despite this fairly complicated name, it's just a cache: it's 42:01 a cache where the tag is the virtual page number, and then we're going to have all the 42:07 same bits that we have in a page table entry: a physical page number, the resident bit, the 42:15 dirty bit, the writable bit, and so on. And we're going to have a valid bit, just the same way that we have 42:21 a valid bit in our caches. So 42:27 this is going to be a small cache, and for every memory reference we're 42:33 going to first access this cache and see if we have a match. If we have a match, then 42:38 we can perform the translation in a single cycle without going to memory; otherwise, we 42:44 have to go access the page table and fill the right state into the TLB, 42:49 essentially filling in the new translation in the TLB. OK, so we have a 42:56 cache for page table entries. Now, these caches, as I said, are typically small, 43:03 32 to 128 entries; they don't store a lot of translations, so that they can be 43:08 very, very fast. In fact, modern processors have multiple levels of TLB, just like modern processors have multiple levels 43:15 of caches. Then there are two details that we need to take care of. The first one is that when you 43:23 switch from one process to another, if you don't do 43:28 anything special, the operating system actually needs to flush the TLB, because the TLB is now caching page 43:35 table entries that are stale; you have this problem of staleness. We could actually 43:42 solve this by including the process ID in the tag, so that we can have context switching among multiple 43:48 processes and we only get a
match if the process ID matches 43:54 and the virtual page number matches, and some systems do this. Then 44:02 there's another big design decision, which is what happens on a miss. Here, 44:07 all systems need to check the page table, which we call walking the page table, and 44:15 the key question is how you do this: you could do it in hardware, or you could do it 44:20 in software. It's not that frequent, so it's OK to do it in software, but in fact most modern 44:27 systems do it in hardware: they have what we call a memory management unit that issues the accesses to the page table 44:33 in hardware and avoids taking an exception. You could take an exception, though, and there are other architectures, like MIPS, that do 44:41 this in software; but in particular RISC-V, and actually x86, do these page table accesses in hardware. OK, so 44:48 at this point there are a lot of design decisions; let's go back and see how this all fits together. 44:56 The program issues memory 45:02 references with virtual addresses. The first thing we do is 45:07 look up the TLB and see if we have a translation there. Best-case scenario, we have a 45:13 translation. Now, even if we have a translation, we need to figure out whether we can actually perform the access: for example, if this is a store and we do not have write permission for 45:23 this page, then we need to throw an exception. If we pass 45:30 the protection checks, we basically have the physical address and go directly to memory; if we hit one of 45:37 these cases where the translation is there but we don't have the right permissions, then we cause 45:45 an exception, which we call a protection fault. OK, the other path 45:51 is when you have a miss in the TLB, when your 45:56 translation is not in the TLB. Now you need to do a page table walk, which we said can happen either in hardware or in software; we will do it in 46:03 hardware. Then there are two options: either your page table has the translation, in which case we go and 46:10 update the TLB and we have the right physical address, or it's not in memory, in which case again we need to 46:15 trigger a page fault so that the operating system can load the page from disk 46:22 and insert the right translation in the page table; then we'll replay the memory 46:28 access, we'll replay the instruction, and we'll go through this other path 46:33 over here.
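Here is a sketch of this whole flow in C: TLB lookup, protection check, page table walk on a miss, page fault if the page isn't resident, then replay. The four-entry fully associative TLB and round-robin refill are simplifications, and walk_page_table, protection_fault, and page_fault are hypothetical stand-ins.

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 4
#define PAGE_BITS   10

typedef struct { bool valid, writable; uint32_t vpn, ppn; } tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static int next_victim;  /* round-robin refill, just for the sketch */

/* Hypothetical back ends, assumed to exist for this sketch. */
extern bool walk_page_table(uint32_t vpn, uint32_t *ppn, bool *writable);
extern void protection_fault(void);      /* raise an exception to the OS */
extern void page_fault(uint32_t vpn);    /* OS loads the page from disk */

uint32_t translate(uint32_t va, bool is_store) {
    uint32_t vpn    = va >> PAGE_BITS;
    uint32_t offset = va & ((1u << PAGE_BITS) - 1);

    /* 1. TLB lookup: a hit translates in a "single cycle". */
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            if (is_store && !tlb[i].writable)
                protection_fault();      /* translation OK, access not */
            return (tlb[i].ppn << PAGE_BITS) | offset;
        }

    /* 2. TLB miss: walk the page table (in hardware on RISC-V and x86). */
    uint32_t ppn; bool writable;
    if (!walk_page_table(vpn, &ppn, &writable)) {
        page_fault(vpn);                 /* not resident: OS brings it in */
        return translate(va, is_store);  /* then the access is replayed */
    }

    /* 3. Refill the TLB with the new translation, then replay. */
    tlb[next_victim] = (tlb_entry_t){ true, writable, vpn, ppn };
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    return translate(va, is_store);
}
```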
OK, so let's finish up with an example that 46:41 makes this much more concrete. Suppose that you have a virtual memory system with four 46:49 gigabytes, that is, 32 bits, of virtual address space, but you only have 2^24 bytes of physical memory, so 46:58 that's 16 megabytes, and we have one-kilobyte pages. Not only 47:04 that, we have a four-entry TLB, which is shown here. So first, some questions. How many 47:11 pages can be stored in physical memory at once? Well, you have 2^24 bytes and 2^10 bytes per page, so 2^24 47:19 divided by 2^10 is 2^14, so we can have 2^14 47:25 pages in physical memory. Now, how many entries are in the page table? We 47:35 have 2^32 possible byte addresses and 2^10 bytes per 47:43 page, so that's 2^22: 47:54 you divide the two to get 2^22 virtual page numbers. Yes? For which one? OK, so this is because we have 48:12 2^24 bytes over 2^10 bytes 48:19 per page, which gives us 2^14 pages: this is bytes per page, this is 48:34 physical memory bytes, so this is 48:42 physical memory pages. Same thing 48:51 for this one, except now we're considering virtual memory bytes. 48:56 OK, so you can go through and answer some of these questions about how much 49:02 storage this takes: we're going to need 14 bits for the physical page number and 49:07 2 bits for the resident and dirty bits, so we're going to have 16 bits per page 49:14 table entry, and from that you can compute the size of the page table. But let's do 49:20 one of these examples: what is the physical address for virtual address 0x1804? If you take 0x1804, again, 49:31 the first step is to compute the 49:37 virtual page number. Here things are not so nicely aligned, so 49:46 you put it in binary: the low ten bits are your offset, 49:56 and the remaining bits are your virtual page number. What is this virtual page 50:04 number? 0b0110, that is, six. And what components are involved in this 50:12 translation? Well, we go to the TLB: do we have a translation for 6? Yes. And 50:18 what is the physical page number? Two. Given that, you can go back and compute the physical address for this 50:27 virtual address, and in this particular case you only have to check the TLB, because you already had a valid 50:33 translation. For some of these other ones, you will see that you actually need to go to the page table, and in one of the 50:40 cases you go to the page table and don't find a translation at all. Yes? 50:46 Yes, not a very efficient design, and we'll see how to fix that at the 50:52 beginning of next lecture. Yes? 50:59 Right, you absolutely can access this through the page table; 51:14 the problem is that that would take you an extra memory access, and you don't want that. That's why we added the TLB: 51:19 this page table access is very expensive, while the TLB access we can do in a single 51:26 cycle; that's why we have this very small cache. This is a performance 51:32 optimization. Oh yeah, great question; 51:38 we'll discuss this next time. Here I assume that there's no cache, and so you access 51:44 the TLB before main memory; the TLB is typically before the cache, but we'll see this in detail in the next lecture. All right, 51:50 that's it for today. In the next lecture we will start discussing how virtual memory interacts 51:59 with caches. Thank you.
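And one last sketch to check the arithmetic of this final example (1 KB pages, so 10 offset bits; the TLB maps VPN 6 to PPN 2, as in the table shown in lecture):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Page counts for the example system. */
    printf("physical pages: 2^%d\n", 24 - 10);     /* 2^14 */
    printf("page table entries: 2^%d\n", 32 - 10); /* 2^22 virtual pages */

    /* Translating virtual address 0x1804. */
    uint32_t va     = 0x1804;
    uint32_t offset = va & 0x3FF;   /* low 10 bits: 0x004 */
    uint32_t vpn    = va >> 10;     /* 0b0110 = 6 */
    uint32_t ppn    = 2;            /* the TLB hit for VPN 6 gives PPN 2 */
    uint32_t pa     = (ppn << 10) | offset;
    printf("va 0x%04X -> vpn %u, offset 0x%03X -> pa 0x%03X\n",
           va, vpn, offset, pa);    /* pa = 0x804 */
    return 0;
}
```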