CS 481: Solutions for Test #2

Problem 1. Suppose you have a byte-addressed computer system with a 44-bit logical address. The page size is 64Kbytes and each page entry takes 4 bytes.

  • How many pages are there in the full logical address space?

    64Kbytes is 2**16 bytes; thus there are (2**44)/(2**16)=2**28 (roughly a quarter billion) pages in the logical address space.

  • We shall use 2-level paging and want all page tables to fit within one page. How many bits will be use to index a master page? a secondary page?

    64Kbytes is 2**16 bytes; since each page entry takes 4 bytes, a page of 2**16 bytes can store 2**14 page entries. Since the machine is byte addressed, the displacement within a page of 2**16 bytes requires a 16-bit field. Since the full logical address has 44 bits, everything works out just right: we use a 14-bit master page index, a 14-bit secondary page index, and a 16-bit displacement, for a total of 44 bits.

  • You are running a 4Gbytes program on this system. How many page frames would you need in order to put your entire program (including its page tables) in memory?

    4Gbytes is 2**32 bytes; since the size of a page is 2**16 bytes, we will need a total of (2**32)/(2**16)=2**16 page frames. A secondary page table (which fits within one frame) can address 2**14 page frames, so we will need (2**16)/(2**14)=4 such page tables, plus the master page table itself, for a total of 5 additional page frames. So the program needs (2**16)+5=65541 page frames in all.

  • Problem 2. You have a RISC machine with a paged virtual memory system. Because the machine is a RISC architecture, every machine instruction references at most one location in memory---of course, you also have to bring in the next instruction. Moreover, these instructions cannot branch---they are simply transfers between a CPU register and a memory location. Assume that, in a typical program, 30% of the instructions reference a memory location and 20% are branching instructions (i.e., cause the program counter to change drastically).

    The TLB takes 20ns to respond; a memory access costs you 200ns; and a disk access costs you 10ms.

  • Assume that each memory reference (regardless of whether it's for data or for the next instruction) has a 95% hit ratio in the TLB and that, of the TLB misses, 95% are resolved in the page table. What is the time needed to access memory?

    Further assume that the address decoding is serial: we first go to the TLB; if it fails, we go to the page table; if it fails, we go to the disk. In all cases, we go to the memory last to get what we need. Then the average time for one memory access is

    20ns + 0.05*200ns + 0.0025*10ms + 200ns = 25,230ns
    But 30% of the instructions require two memory accesses, so the average time spent in memory accesses per instruction is
    1.3*25,230ns=32,799ns
    or 32.8 microseconds. The page fault rate is much too high or the swap device much too slow: we were only penalized 13ns on average for TBL failures, but the page faults cost us 32.5 microseconds!
  • Now assume that the paging system is modified so that, every time we get a page fault, we bring in both the faulty page and the next page as well. This change cuts in half the page fault rate for instructions, but only reduces the page fault rate for data by 20%. What is now the time needed to access memory?

    This should improve things, but of course not enough, since it cannot do better than cut in half the contribution of page faults -- which would still leave us in the neighborhood of 16 microseconds... Let us carry out the exact computation. This time, we divide the instructions into three categories: the branching instructions, the memory transfer instructions, and all others.

    The branching instructions (20% of the running code) are not affected at all: the one memory access they cause still takes an average of 25.23 microseconds and so they contribute 0.2*25.23=5.046 microseconds to the average. The "other" instructions (50% of the running code) are improved the most: their one memory access (for the next instruction) has half the original page fault rate and so takes an average of

    20ns + 0.05*200ns + 0.00125*10ms + 200ns = 12,730ns
    so that they contribute 0.5*12.73=6.365 microseconds to the average. The memory transfer instructions cause one memory access for the next isntruction, which takes 12.73 microseconds as just computed, plus one memory access for data, where the page fault rate when down only by 20% and which thus takes
    20ns + 0.05*200ns + 0.002*10ms + 200ns = 20,230ns
    Thus a memory transfer instructions spends on average 32.96 microseconds in memory access; since they make up 30% of the running code, they contribute 0.3*32.96=9.888 microseconds to the average.

    Thus the overall average time spent in memory access per instruction is now 5.046+6.365+9.888=21.299 microseconds. Compare this figure with the best we could do: 100% hit in the TLB and thus 220ns per memory access and thus 1.3*210=286ns per instruction. The ratio remains very high: we are about 74 times slower than the optimal.

  • Problem 3. Which of the following programs are likely to do well in a paging environment and which are likely to do poorly (assume that all operate on very large data sets)? For each, indicate briefly why.

  • a stack in an array: excellent, because all work is highly localized (one page, at most two);
  • a queue in an array: excellent, because all work is localized in two places (two pages, at most four);
  • a hash table in an array: miserable, since the hash function is all over the array, which is too big to allow us to page it all into main memory;
  • linear search in an array: very good with sufficiently large pages, since we page fault only once per number of entries in a page;
  • binary search in an array: miserable at first (ranging all over the array), getting somewhat better as the search progresses (narrower range of pages), and finishing very well (last range in memory, since it needs only a few pages); with large pages, good overall, since narrowing down happens very quickly;
  • graph search with linked lists: miserable, since the pointers are sure to lead us to pages not in memory;
  • matrix multiplication with arrays: depends very much on the way it is done; standard method is very poor (needs one row and one column, then moves to another row or another column and so is sure to fault for every new matrix entry computed), but other methods can do well;
  • indirect addressing: miserable in principle -- any pointer-based system is bad, since the references need not be localized at all; how it does in specific cases is another story.
  • Problem 4. Your paged computer system is performing miserably. On measuring various parameters, you find the following: CPU utilization is 20%, swap device utilization is 98%, and other I/O device utilization is 5%. Which of the following measures (if any) would be likely to improve your system and why?

  • install a faster CPU: nope -- it'll only drop the CPU utilization even lower and move the swap device utilization even closer to 100%;
  • install a larger swap device: nope -- the system is thrashing; more room on the swap device won't help (in fact, it could make matters worse by allowing the system to increase the degree of multiprogramming);
  • install a faster swap device: that will help some; it will not remove the thrashing problem -- we will still observe huge page fault rates -- but it will enable better progress;
  • decrease the degree of multiprogramming: yes, that's the best solution in terms of price/performance ratio; it will control the thrashing, allow good CPU utilization, and cost nothing;
  • increase the size of the main memory: yes, that will help even more than the previous solution, but it's expensive and is not a permanent fix: if the system, has no control over the degree of multiprogramming, thrashing is sure to occur again at some other time, no matter how much more memory has been added.
  • Problem 5. You find out that the page fault rate in your paging system is extremely low -- say on the order of 0.00001%. Is that entirely good news? Can you think of reasons why you might want to take another look at your system and perhaps modify it?

    That was something of a bonus question, because there is no single good answer. If paging is that low, there is some reason to think things are just cool; but you may also suspect that the degree of multiprogramming is lower than it could be -- that you could raise it and increase responsiveness without seriously affecting performance. A "back-of-the-envelope" calculation shows that, even with fast 100ns memory and average 10ms disk, we can afford a page fault rate on the order of 100ns/10ms or about 1 in 100,000, or about 0.001%; that exact page fault rate would roughly double the memory access time, so something like 0.0001% would be utterly safe, adding only 10% to the memory access time. That is 10 times the observed rate, so there is quite a margin for increasing the degree of multiprogramming.

    Another suspiscion is that your paging system is so good because it adopts the old IBM philosophy, soaking up 90% of the resources of your system to do a superb job of managing what's left. (It might spend enormous amounts of time running intricate prediction schemes, running various diligent daemons, prepaging, etc.) So you should check on speed/turnaround time/throughput and see where your other resources go.

    Then there are other observations: a system that is nearly that quiescent may in fact have serious problems! For instance, there is only one job running and it is caught in a tight infinite loop -- no paging at all! Or one big job somehow hogged all page frames, runs comfortably with little or no I/O, and has not yet been swapped out -- very little paging. Other less likely failures may suggest themselves as well.

    Back to CS 481 home page