Results Report and Analysis Questions

The designer MUST provide a written report that answers the following questions. All answers MUST be substantiated with empirical evidence. The designer MAY choose any experiments she or he wishes in order to answer the questions, but MUST describe the experiments that were performed.
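
As one common way of producing such empirical evidence, learning-performance comparisons can be reported as learning curves: the per-episode return, averaged over several independent trials. The Python sketch below illustrates only that bookkeeping; the run_episode, make_agent, and make_env names are hypothetical helpers introduced here for illustration and are not part of the assignment's required interface.

    def average_learning_curve(run_episode, make_agent, make_env,
                               num_trials=30, num_episodes=500):
        """Average per-episode return over independent learning trials.

        run_episode(agent, env) is assumed to run a single episode, let the
        agent learn from it, and return the total reward collected during
        that episode.
        """
        curves = []
        for _ in range(num_trials):
            agent, env = make_agent(), make_env()   # fresh learner per trial
            curves.append([run_episode(agent, env) for _ in range(num_episodes)])
        # element-wise mean across trials yields one learning curve
        return [sum(returns) / num_trials for returns in zip(*curves)]

Reporting the number of trials and the parameter settings alongside such curves makes the comparisons asked for below easier to interpret and reproduce.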

The report need not be any fixed length, but it MUST be long enough to answer all of the questions. The report MUST be in an 11 or 12 point Times Roman font, double-spaced, on $8\frac{1}{2} \times 11$-inch paper with one-inch margins on all sides.

The designer MUST answer the following questions:

  1. For a fixed RL algorithm, fixed $rX$ and $rY$, and using the SingleGoalReward REWARD function, is there any difference between learning in the BoringRect(rX,rY), RoomRect(rX,rY,5), and OutdoorRect(rX,rY,0.25,0.25,0.25,0.25) environments? Why or why not?
  2. In an OutdoorRect(30,30,0.25,0.25,0.25,0.25) environment with a
    SingleGoalReward(7,23), is there a difference in learning performance between the $Q$-learning algorithm and the SARSA($\lambda$) algorithm? Why or why not? (A sketch of the two algorithms' update rules follows this list.)
  3. For a fixed RL algorithm, the BoringRect(rX,rY) environment, and the
    SingleGoalReward(gx,gy) REWARD function, is there a difference in learning performance between $(rX,rY)=(5,5)$, $(rX,rY)=(10,10)$, $(rX,rY)=(20,20)$, and $(rX,rY)=(100,100)$? Why or why not?
  4. For a fixed RL algorithm, fixed $rX$ and $rY$, and the
    OutdoorRect(rX,rY,0.25,0.25,0.25,0.25) environment, is there a difference in learning performance between using the RandomGoalReward function and a fixed START STATE versus using the SingleGoalReward function and a random START STATE? Why or why not?
  5. For a fixed learning algorithm, a fixed $rX$ and $rY$, and using the RandomGoalReward REWARD function, is there a difference between learning in the BoringRect(rX,rY) and the BoringTorus(rX,rY) environments? Why or why not? What about in RoomRect(rX,rY,6) versus RoomTorus(rX,rY,6)? Or in OutdoorRect(rX,rY,0.25,0.25,0.25,0.25) versus OutdoorTorus(rX,rY,0.25,0.25,0.25,0.25)? If there are different answers for Boring* versus Room* versus Outdoor*, describe why those differences occur.
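
For reference in answering question 2, the key distinction between the two algorithms lies in their update targets: $Q$-learning backs up toward the greedy value of the next state (off-policy), while SARSA($\lambda$) backs up toward the value of the action the current policy actually takes and spreads that error backward along eligibility traces (on-policy). The tabular Python sketch below illustrates only these update rules; the class layout, the $\epsilon$-greedy action selection, and the default parameter values are assumptions made for illustration and are not the assignment's required implementation.

    from collections import defaultdict
    import random

    class QLearning:
        """Off-policy TD control: backs up toward max_a Q(s', a)."""
        def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
            self.Q = defaultdict(float)
            self.actions, self.alpha = actions, alpha
            self.gamma, self.epsilon = gamma, epsilon

        def choose_action(self, s):
            if random.random() < self.epsilon:
                return random.choice(self.actions)          # explore
            return max(self.actions, key=lambda a: self.Q[(s, a)])

        def update(self, s, a, r, s_next, done=False):
            best_next = 0.0 if done else max(self.Q[(s_next, b)] for b in self.actions)
            self.Q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.Q[(s, a)])

    class SarsaLambda:
        """On-policy TD control with accumulating eligibility traces."""
        def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1, lam=0.9):
            self.Q = defaultdict(float)
            self.e = defaultdict(float)                      # eligibility traces
            self.actions, self.alpha = actions, alpha
            self.gamma, self.epsilon, self.lam = gamma, epsilon, lam

        def start_episode(self):
            self.e.clear()                                   # traces reset each episode

        def choose_action(self, s):
            if random.random() < self.epsilon:
                return random.choice(self.actions)           # explore
            return max(self.actions, key=lambda a: self.Q[(s, a)])

        def update(self, s, a, r, s_next, a_next, done=False):
            # a_next is the action the current policy actually takes in s_next
            target = 0.0 if done else self.Q[(s_next, a_next)]
            delta = r + self.gamma * target - self.Q[(s, a)]
            self.e[(s, a)] += 1.0
            for key in list(self.e):
                self.Q[key] += self.alpha * delta * self.e[key]
                self.e[key] *= self.gamma * self.lam

Note that the two update methods take different arguments: SARSA($\lambda$) needs the next action chosen by the behavior policy, which is exactly what makes it on-policy.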

In addition, the designer MAY choose to investigate other properties of the RL algorithms, environments, reward functions, or parameters. Please discuss what motivates each investigation that is reported.

Terran Lane 2005-10-18