CS 481: Sample In-Class Final
Notes: (i) the real final will have 6 problems in all, not nearly as many as here; (ii) many of the problems here are longer and require more computations than what will be on the final; (iii) a couple of the problems here actually require you to write on something we did not cover much in lecture, but that you should be able to reason through from previous classes (such as 341) and from other topics in this class---the final will have no such question.
The key to the problem is understanding how to set up a state diagram for it. The state diagram's states were already described in the handout on queueing analysis; basically, in a ``static'' picture of the airport, one simply sees how many taxi slots are full or how many customers are waiting. Service is effectively instantaneous, so that we cannot have a customer waiting while there is a full taxi slot. So, we have our standard unbounded queue for the waiting passengers, but with some additional states, due to the taxi slots. In effect, the taxi slots allow the system to accumulate a reserve when taxi service runs ahead of passenger arrivals. (Recall our analogy of the random choice of forward or backward steps: with 2 taxi slots, we can take up to 2 steps back of the ``origin'' before we have to discard further backward moves---i.e., turn taxis away.)
The transitions between adjacent states are dictated by the arrival rate of
passengers and the arrival rate of taxis; they are not related to the number
of slots, since the probability of a taxi arriving to fill an empty slot
is simply the probability of a taxi arriving---it does not depend on the
current number of empty slots. All forward transitions occur at the
arrival rate of passengers, lambda=1; all backward transitions occur at
the arrival rate of taxis, mu=1.5.
Number states from the left starting with 0; thus q_0 is the state
in which both taxi slots are full and no passenger is waiting, q_1
the state in which one slot is full and no passenger is waiting,
q_2 the state in which neither taxi nor passenger is present, and
q_i, for i>=3, the state in which i-2 passengers are waiting
and both taxi slots are empty. The handout gives the solution to such
a system as p_i=rho^i(1-rho); in our case, rho=lambda/mu=2/3,
so we have p_i = 1/3 (2/3)^i.
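As a quick sanity check, the closed form can be tested against the balance equations of the state diagram (flow forward out of q_i equals flow backward out of q_(i+1)). This is only a sketch using exact fractions; the names lam, mu, rho and p are our own shorthand for the quantities above.

```python
from fractions import Fraction

lam = Fraction(1)       # passenger arrival rate (per minute)
mu = Fraction(3, 2)     # taxi arrival rate (per minute)
rho = lam / mu          # 2/3

def p(i):
    """Stationary probability of state q_i: rho^i * (1 - rho)."""
    return (1 - rho) * rho ** i

# Balance between adjacent states: lam * p_i == mu * p_(i+1).
balanced = all(lam * p(i) == mu * p(i + 1) for i in range(50))

# The first 50 terms of the geometric series sum to 1 - rho^50.
partial_sum = sum(p(i) for i in range(50))

print(p(0), balanced, float(partial_sum))
```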
In particular, a taxi arriving at the airport will find both slots full
exactly when the system is in state q_0, which happens with probability
p_0=1/3 and thus will be turned away with probability 1/3.
Note that this finding is independent of the number of slots:
p_0 depends only on rho and a taxi is turned away only when all slots
are full, i.e., only when the system is in state q_0. (Another way to
see this is to realize that taxis can only be used for passengers; since
passengers arrive at the rate of lambda (less than mu), only a fraction of
lambda/mu=rho of the taxis will be used and the others, a fraction of
1-rho of them, will be turned away.) We can thus
assert that the probability of a taxi being turned away is independent
of the number of taxi slots---which tells us that at least one of the claims
made by the taxi company is false. An arriving passenger finds a taxi waiting
exactly when the system is in one of the two states q_0 and q_1; hence
the probability that a passenger finds at least one taxi waiting is
p_0+p_1=(1+rho)*(1-rho)=1-rho^2=5/9, or about 55.6%. Now this
probability is affected by the number of slots: with k taxi slots, it
would be p_0+p_1+...+p_(k-1); thus this probability grows slowly
(and ever more slowly) with increasing slots. What about average waiting
time? We never saw a direct formula for it, but we did see Little's
formula, which relates it and lambda (the customer arrival rate) to
the average queue length. The average queue length in our case is
0*p_0+0*p_1+0*p_2+1*p_3+2*p_4+... = sum_(i=1)^(infinity) i*p_(i+2).
This sum is not quite the one we know, which is sum_(i=1)^(infinity) i*rho^i,
but we can easily express it in terms of the known sum:
n_av = sum_(i=1)^(infinity) i*p_(i+2) = sum_(i=1)^(infinity) i*rho^(i+2)(1-rho) = rho^2(1-rho) * sum_(i=1)^(infinity) i*rho^i = rho^3/(1-rho).
Note that the effect of the two slots has simply been to increase the power of
rho in the numerator, from rho to rho^3---i.e., by the number of slots.
We can correctly conclude (a formal proof is easy, but boring) that
the average queue length with k slots would be n_av = rho^(k+1)/(1-rho).
Since we have rho less than 1, this formula shows that increasing the number
of taxi slots decreases the average length of the queue. In our problem,
with two slots, the average queue length is 3*(2^3)/(3^3)= 8/9,
or about 0.89. Now, by Little's formula, we can write the average waiting
time when we have k slots: W = 1/lambda * rho^(k+1)/(1-rho).
In our problem, the average waiting time with two slots is 8/9 minutes,
or about 53.3s. With no slot at all, this time would be 2 minutes.
Note that a taxi arrives to the airport every 40s on average; the effects
of queueing in the passenger line lengthen the waiting time for passengers
to 2 minutes. (We can get the 8/9 or 2 minutes in a different way.
We have seen that only one taxi per minute is used, matching the passenger
arrival rate; the queue length times the interval between taxis used then
gives us the waiting time.) Adding a large number of slots, say
increasing the taxi queue to 10 slots, would reduce the waiting time to about 2.1s.
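The formulas derived above are easy to evaluate for any number of slots. The following is a minimal sketch, assuming rates are given per minute; the function name taxi_stand and the dictionary keys are our own labels, not part of the handout.

```python
from fractions import Fraction

def taxi_stand(lam, mu, k):
    """Evaluate the closed forms for k taxi slots (rates per minute)."""
    rho = Fraction(lam) / Fraction(mu)
    n_av = rho ** (k + 1) / (1 - rho)          # average passenger queue
    return {
        "turned_away": 1 - rho,                # p_0, independent of k
        "taxi_waiting": 1 - rho ** k,          # p_0 + ... + p_(k-1)
        "queue_length": n_av,
        "wait_seconds": 60 * n_av / Fraction(lam),   # Little's formula
    }

for k in (0, 2, 10):
    m = taxi_stand(1, Fraction(3, 2), k)
    print(k, m["turned_away"], m["taxi_waiting"], float(m["wait_seconds"]))
```

Running it for k = 0, 2, and 10 shows the turn-away probability staying fixed while the waiting time shrinks geometrically with k.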
Since two minutes appears quite reasonable and since there is nothing we can do about the probability that a taxi will be turned away, we would most likely agree reluctantly with the hotels and rent-a-car companies, at least on these narrow grounds.
Deadlock cannot occur, because at least one (in fact two) of the necessary and sufficient conditions for deadlock are denied. The one condition is ``no preemption:'' the procedure described preempts resources from blocked processes. (The other is the cyclic pattern, but the fact that it cannot arise is really a consequence of preemption.)
Starvation can easily occur---but is also easily prevented by some simple queueing. A job that requires a large number of different resources, each in a large number, may well never run to completion after it gets blocked for the first time: not only could it have trouble acquiring the rest of what it needs, but it will gradually lose what it has accumulated so far to different processes and thus may never get all of it back at one time. Specific scenarios are easily contrived.
Use semaphores (binary or counting, but assumed to follow a FIFO discipline) to write the bodies of the two procedures; indicate any necessary initialization. Your solution must be safe and prevent deadlock; it need not be fair nor prevent starvation -- although you should mention whether or not starvation can occur.
The process number may be regarded as a resource request on a single resource,
n units of which are available initially. A simple solution is to proceed
a la Dijkstra: whenever a resource release is made, wake up all
sleeping processes and have them try again. This will not be fair and could
even lead to starvation, but is sure to avoid deadlock and be safe.
A fancier solution wakes up only as many processes as necessary, but this is
very tricky, as efficiency will conflict with fairness: when the release
(i.e., the number of the completing process) is small, the head of the waiting
queue may not be able to proceed, while some process farther back in the queue
can---are we then to awake that farther process, at the risk of creating a
situation in which the head of the queue can starve? Using a strict FIFO
discipline, on the other hand, creates serious inefficiency in case the head
of the waiting queue has a very large request (number). The solution below
is the Dijkstra-style easy solution; it uses a binary semaphore, mutex,
to protect the resource count (an integer variable named sum and
initialized to n), and a counting semaphore, sleep, to put
processes to sleep; the first is initialized to true, the second to 0.
(Note: it is tempting to use a semaphore initialized to n and to have P
and V operations which use arbitrary increments; this may indeed lead to
an elegant solution, but has many pitfalls.) The program assumes that each
counting semaphore has an associated record, here sleep.count, which
keeps track of the length of the queue associated with the semaphore.
procedure StartUse(number)
1:  P(mutex)
    if number <= sum
       then sum <- sum-number
            V(mutex)
       else V(mutex)
            P(sleep)
            goto 1
procedure EndUse(number)
    P(mutex)
    sum <- sum+number
    while sleep.count > 0 do
        V(sleep)
    V(mutex)
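For readers who want to run the idea, here is a sketch of the same wake-everyone scheme with Python's threading primitives. It treats sum as the number of units still available (as the prose describes) and departs from the pseudocode in one respect, which is our own choice: instead of relying on sleep.count, it keeps an explicit waiting counter that is updated while mutex is held, so that a process that has released mutex but not yet reached P(sleep) still gets its wake-up. The class and attribute names are hypothetical. A request larger than n would block forever; we assume no such request is made.

```python
import threading

class SharedResource:
    """Wake-everyone scheme; `available` plays the role of `sum`."""

    def __init__(self, n):
        self.available = n                      # units still free
        self.mutex = threading.Lock()           # binary semaphore
        self.sleep = threading.Semaphore(0)     # counting semaphore
        self.waiting = 0                        # sleepers, guarded by mutex

    def start_use(self, number):
        while True:                             # the "goto 1" loop
            self.mutex.acquire()
            if number <= self.available:
                self.available -= number
                self.mutex.release()
                return
            self.waiting += 1                   # register before letting go
            self.mutex.release()
            self.sleep.acquire()                # P(sleep), then retry

    def end_use(self, number):
        self.mutex.acquire()
        self.available += number
        while self.waiting > 0:                 # wake every registered sleeper
            self.waiting -= 1
            self.sleep.release()                # V(sleep)
        self.mutex.release()
```

As in the pseudocode, nothing prevents a large request from being overtaken repeatedly by small ones, so starvation remains possible.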
In a closed task system, we can devise a mechanical scheme for implementing the precedence order as follows. There is a semaphore for each task except for the initial one; each task (except the initial one) starts with one P() operation on its own semaphore for each immediate predecessor task; each task (except the final one) ends with one V() operation on each of its immediate successors' semaphores. For instance, the system of five tasks with the precedence order given by the relations T_1<T_2, T_1<T_3, T_2<T_5, T_3<T_4, and T_4<T_5 is implemented as follows:
T_1: body, V(sema_2), V(sema_3)
T_2: P(sema_2), body, V(sema_5)
T_3: P(sema_3), body, V(sema_4)
T_4: P(sema_4), body, V(sema_5)
T_5: P(sema_5), P(sema_5), body
This implementation has complexity 4+5+5=14.
A slightly informal proof follows. Since there are n tasks, we need n-1 semaphores. Since each precedence arc specifies an immediate predecessor and an immediate successor, it requires one P and one V. Hence the complexity is (n-1) + 2e, as desired. A formal proof uses induction, but is rather messy, as the induction must preserve the ``closedness'' of the task system.
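The mechanical scheme is simple enough to generate automatically. The sketch below builds the operation list of each task from a list of precedence arcs and counts semaphores plus operations; the function name and the arc list (read off the five-task listing above) are our own illustration.

```python
def semaphore_scheme(tasks, arcs):
    """Return ({task: [operations]}, complexity) for the mechanical scheme.

    arcs is a list of (predecessor, successor) pairs; each non-initial
    task owns one semaphore named after itself.
    """
    ops = {t: [] for t in tasks}
    for _pred, succ in arcs:
        ops[succ].append(f"P(sema_{succ})")     # one P per incoming arc
    for t in tasks:
        ops[t].append("body")
    for pred, succ in arcs:
        ops[pred].append(f"V(sema_{succ})")     # one V per outgoing arc
    semaphores = {succ for _pred, succ in arcs}
    operations = sum(len(seq) - 1 for seq in ops.values())  # exclude "body"
    return ops, len(semaphores) + operations

five_arcs = [(1, 2), (1, 3), (2, 5), (3, 4), (4, 5)]
scheme, complexity = semaphore_scheme([1, 2, 3, 4, 5], five_arcs)
for task, seq in scheme.items():
    print(f"T_{task}:", ", ".join(seq))
print("complexity:", complexity)
```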
The system of n+2 tasks needs only two semaphores with the following
scheme:
T_1: body, V(sema_1), ..., V(sema_1) (n operations)
T_i: P(sema_1), body, V(sema_2) for 2<=i<=n+1
T_(n+2): P(sema_2), ..., P(sema_2), body (n operations)
The first example needs only three semaphores with the following scheme:
T_1: body, V(sema_1), V(sema_1)
T_2: P(sema_1), body, V(sema_3)
T_3: P(sema_1), body, V(sema_2)
T_4: P(sema_2), body, V(sema_3)
T_5: P(sema_3), P(sema_3), body
Both schemes are based on the idea of associating semaphores with tasks that
have a large number of in or out arcs: since the number of semaphore
operations is dictated by the number of arcs, the only gain possible is in
the number of semaphores, which this strategy attempts to minimize.
Note that we cannot use initialization as a substitute for semaphore
operations, at least not with our definition of P and V, since
any V operation will allow some task to pass through its P,
regardless of the value of the semaphore.
The complexity of the system of n+2 tasks is now 4n+2 (instead of 5n+1) and that of the 5-task example is 13 (instead of 14). The complexity of 4n+2 is minimal: reducing the number of semaphore operations is not possible, as it would result in ignoring one of the precedence relations, and using only one semaphore is only possible with a (large) increase in the number of semaphore operations.
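To convince oneself that the three-semaphore scheme for the five-task example still enforces the precedence order, one can run it with real threads. This is a sketch only: each body just records the task number, and the threads are deliberately started in reverse order so that the semaphores, not the start order, determine the outcome.

```python
import threading

def run_five_tasks():
    """Run the 3-semaphore scheme; return the order in which bodies ran."""
    sema = {i: threading.Semaphore(0) for i in (1, 2, 3)}
    order = []
    log_lock = threading.Lock()

    def body(task):
        with log_lock:
            order.append(task)

    def t1():
        body(1); sema[1].release(); sema[1].release()

    def t2():
        sema[1].acquire(); body(2); sema[3].release()

    def t3():
        sema[1].acquire(); body(3); sema[2].release()

    def t4():
        sema[2].acquire(); body(4); sema[3].release()

    def t5():
        sema[3].acquire(); sema[3].acquire(); body(5)

    threads = [threading.Thread(target=f) for f in (t5, t4, t3, t2, t1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order

print(run_five_tasks())
```

The relative order of T_2 against T_3 and T_4 may vary from run to run, which is exactly the concurrency the precedence order allows.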
A linear order of n tasks is an example---although one could argue that such an order, which disallows any concurrency, should really be modelled with a single task. Although the number of semaphores can be decreased arbitrarily (down to 1), every decrease brings a concomitant increase in the number of required semaphore operations. Since all tasks have the same in and out degrees (except for initial and final), no gain can be made by permuting the association of tasks and semaphores. Other examples can be devised: they need only ensure that the first task have only one immediate descendant and that all other tasks have the same in and out degrees (except, that is, for the final task, with its fixed out degree).
Briefly state the pros and cons of write-through, then discuss why write-through is often associated with interleaved memory.
The advantage of write-through is that it speeds things up at the critical time of a cache fault -- the victim chosen for replacement can just be overwritten. Now, this is not necessarily all that useful: for instance, paging does not ever use write-through, and one reason for it is that the page replacement policy is good enough that it will almost always choose ``clean'' victims, which do not require saving on disk. But cache replacement policies are necessarily primitive and thus are much more likely to pick a dirty block as victim; hence the use of write-through. This comes at a price, though: every change must be immediately reflected and thus thousands of changes to the same location could be reflected before the cache block gets kicked out: this results in thousands of updates where one would have done. However, the overhead of the thousands of updates is distributed throughout the computation as opposed to that of the write-back, which is concentrated at the time of a cache fault. Each update is a relatively minor operation (it can be done just on the word that is changed, not on the entire cache block) and, since it takes place independently of the cache access, it need not slow the machine down. However, the updating rate must follow the cache rate; for that, since the main memory is considerably slower, the updating must be done in parallel. Well, it is quite possible to address several memory locations in parallel, simply by treating main memory as a collection of independently accessible modules. But then we must make sure that updates to consecutive locations -- the most common occurrence -- are parallelizable: hence the interleaving, which puts consecutive addresses on different and separately addressable modules.
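The effect of interleaving on consecutive addresses is easy to see in a small sketch. We assume here the usual low-order interleaving (module chosen by the address modulo the number of modules); the function name and the choice of four modules are ours.

```python
def module_of(address, modules):
    """Low-order interleaving: return (module index, offset within module)."""
    return address % modules, address // modules

# Consecutive addresses land on different modules, so a burst of
# write-through updates to neighboring words can proceed in parallel.
for address in range(8):
    print(address, module_of(address, 4))
```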
The main differences are
Since control returns to the main program between any two subroutine calls and since the array is referenced in all contexts, it is clear that both the main program and the array should stay in main memory throughout. This is sure to happen with LRU (at most one subroutine could be more recently referenced than either of these two) and very likely to happen in LFU (unless the averaging window of LFU and the program's subroutines are both very long, in which case the main program could have a much lower frequency), but is sure not to happen with FIFO. So FIFO is the worst; even if it does reasonably well for the subroutines (it might or might not), it does so poorly for these two pieces that the whole performance is sure to be poor. As to LRU and LFU, all depends on the pattern of reference of the subroutines. Since we are in a loop, the best possible thing is to keep in memory those routines that are called the most often within the loop and to swap the others in and out of memory; this uses one page frame for the swapping and effectively locks all other frames. Both LRU and LFU can achieve this effect; of the two, LRU is the better bet if the number of available frames is not too far below the total number of frames required.
The algorithm working with a memory of size M+1 will keep in memory (on top of other pages) all of the pages it would keep if it were working with a memory of size M. Thus, stack algorithms are, in some sense, well-behaved: decreasing memory does not fundamentally change their decisions. Rather, with a smaller memory, a stack algorithm keeps in memory a subset of the pages it kept with the larger memory.
The first three are stack algorithms; the last two are not. Here is a counter-example for FIFO: choose memory sizes of 5 and 6 page frames and the request pattern 1234562721; then the FIFO algorithm keeps pages {1,2,5,6,7} in a memory of size 5, while it keeps pages {1,3,4,5,6,7} with a memory of size 6, so that the first is not a subset of the second. A similar counter-example can be devised for NUR. Let us prove formally that LRU is a stack algorithm. We proceed by induction.
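The FIFO counter-example can be replayed mechanically. The following sketch simulates FIFO replacement on the request pattern above and reports the resident set for each memory size; the function name is our own.

```python
from collections import deque

def fifo_resident(requests, frames):
    """Return the set of pages resident after serving all requests."""
    queue = deque()                  # oldest loaded page at the left
    for page in requests:
        if page in queue:
            continue                 # hit: FIFO order is unchanged
        if len(queue) == frames:
            queue.popleft()          # evict the page loaded earliest
        queue.append(page)
    return set(queue)

pattern = [int(c) for c in "1234562721"]
small = fifo_resident(pattern, 5)
large = fifo_resident(pattern, 6)
print(sorted(small), sorted(large), small <= large)
```

Page 2 is resident in the smaller memory but not in the larger one, so the inclusion fails.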
Basis: We start the LRU algorithm on 2 machines in parallel with some request sequence. One machine has M page frames, the other M+1; all frames are initially empty. The first M requests will bring in the same first M pages for both machines; the M+1st request will bring in the M+1st page in both machines, but at the expense of some page in {1,2,...,M} in the M machine. Thus machine M will have pages M+1 and all but one of pages {1,2,...,M}, clearly a subset of {1,2,...,M+1}, the pages in machine M+1. Note that this holds for all policies.
Step: Assume that the subset (stack) property has been maintained up to some point in the sequence. Now comes a new page request. Three cases are possible.
It now remains to show that the LRU algorithm obeys this paging out mechanism. Suppose that the LRU algorithm flushes page j out of machine M; then that page is the least recently used of machine M's pages, hence also the least recently used page of all pages of machine M+1, with the possible exception of the extra page (about which we do not know anything). Hence the LRU algorithm will flush either page j or page M+1 out of machine M+1, whichever is the less recently used. This guarantees the subset property, thereby completing our proof.
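The inclusion argued in the proof can also be checked empirically: run LRU side by side with M and M+1 frames and compare the resident sets after every request. This is a sketch, not a substitute for the proof; the function names are ours.

```python
def lru_states(requests, frames):
    """Return the resident set after each request under LRU."""
    stack = []                       # least recently used page first
    states = []
    for page in requests:
        if page in stack:
            stack.remove(page)       # hit: page becomes most recent
        elif len(stack) == frames:
            stack.pop(0)             # miss with full memory: evict LRU page
        stack.append(page)
        states.append(set(stack))
    return states

def lru_inclusion_holds(requests, frames):
    """True if machine M's pages are always a subset of machine M+1's."""
    smaller = lru_states(requests, frames)
    larger = lru_states(requests, frames + 1)
    return all(s <= l for s, l in zip(smaller, larger))

print(lru_inclusion_holds([int(c) for c in "1234562721"], 5))
```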
With pages 1,2,3,...,n,n+1 in memory, a stack algorithm either moves out page n+1 or moves out the same page that it would have chosen with pages 1,2,...,n in memory. That is, the monotonicity of behavior affects the choice of pages to remove as well as that of pages to keep.
Done above in the formal proof for LRU.
This is an immediate consequence of the definition: since the machine with extra page frames will keep in memory all of the pages that the machine with fewer page frames would have kept, it cannot incur page faults that the smaller machine did not incur.
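A fault counter makes the consequence concrete. In the sketch below, LRU fault counts never increase as frames are added, whereas FIFO, which is not a stack algorithm, can do worse with more memory. The second reference string (1,2,3,4,1,2,5,1,2,3,4,5) is not from this exam; it is a well-known example that we bring in as an assumption for illustration.

```python
def count_faults(requests, frames, policy):
    """Count page faults under policy 'fifo' or 'lru'."""
    resident = []                    # next victim is always at index 0
    faults = 0
    for page in requests:
        if page in resident:
            if policy == "lru":      # a hit refreshes recency under LRU only
                resident.remove(page)
                resident.append(page)
            continue
        faults += 1
        if len(resident) == frames:
            resident.pop(0)
        resident.append(page)
    return faults

pattern = [int(c) for c in "1234562721"]
lru_faults = [count_faults(pattern, m, "lru") for m in range(1, 7)]
print("LRU faults for 1..6 frames:", lru_faults)

classic = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print("FIFO faults with 3 and 4 frames:",
      count_faults(classic, 3, "fifo"), count_faults(classic, 4, "fifo"))
```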
This could be pretty long if done in great detail, so I'll only sketch it. Assume that the file is meant to exist (open to read). First, the system will check if the file is already open for this process: from the PCB, it will get the address of the table of opened files for this process and scan the table. Assume then that the file is not opened for this process; it will be opened and added to this table. To open a file, the system will expand the filename into a complete pathname and go through the named directories. For simplicity, assume that the named file is in the user's current directory. The inode for this directory is probably cached in main memory; if it is not, it will be read into main memory. Access rights for this inode are checked; if not authorized, the user's request is rejected with an error message. A linear search of the directory is then conducted until the next name in the given pathname is located. The inode for this entry is then retrieved and the process repeated if this is another directory. Eventually, we get a file's inode, for which an entry is created in the file structure table of the system, with a pointer set in the user's table of opened files to this entry. The entry points to the memory copy of the file's inode, called an inode structure (which contains a reference count of how many file structure pointers point to it); the file structure entry also contains an offset (in the logical file), which is set to 0 on opening. (Thus the file's inode may not have to be read in from disk, if it is already present in main memory as an inode structure -- due to some other process request for the same file.)
This leaves the matter of locating an inode. Well, an inode contains direct disk block pointers, which can be used to access disk and either further inodes (from a directory) or data. However, the disk pointers point to logical disk devices, not to the physical device. Thus the block address given must first be mapped to the physical device, using a mount table for the file system and the file system's characteristics (such as the size of a block).