Requirements

This section describes the elements that MUST be developed as part of this project. The designer MAY also choose to implement additional Java source files, programs, and/or shell scripts in support of the following items. This section only describes the general performance requirements for each element; for specific deliverable requirements, please refer to Section [*].

The JRoboExplorer software suite comprises three principal components:

At a high level, the WORLD SIMULATOR is responsible for maintaining the ``map'' of the world - the LOCATIONs of objects and terrain types, the LOCATION of the AGENT, and so on. Beyond simply holding a MAP, however, the WORLD SIMULATOR is also responsible for simulating the results of ACTIONs - deciding what happens when an AGENT executes an ACTION.
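The two responsibilities described above (holding the MAP and simulating ACTIONs) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not part of the required design: the class name GridWorldSketch, the rectangular grid, the four compass ACTIONs, the rule that entering a cell yields that cell's REWARD, and the fixed penalty for walking off the MAP are all hypothetical.

```java
/**
 * Hypothetical sketch of a WORLD SIMULATOR. The grid layout, the ACTION set,
 * and the edge penalty are illustrative assumptions, not project requirements.
 */
class GridWorldSketch {

    /** Assumed ACTION set: one-cell moves in the four compass directions. */
    enum Action {
        NORTH(0, -1), SOUTH(0, 1), EAST(1, 0), WEST(-1, 0);

        final int dx;
        final int dy;

        Action(int dx, int dy) {
            this.dx = dx;
            this.dy = dy;
        }
    }

    /** Assumed REWARD for trying to leave the MAP. */
    static final double EDGE_PENALTY = -1.0;

    /** The MAP: rewards[y][x] is the REWARD for entering the cell at (x, y). */
    private final double[][] rewards;

    /** Current LOCATION of the AGENT. */
    private int agentX;
    private int agentY;

    GridWorldSketch(double[][] rewards, int startX, int startY) {
        this.rewards = rewards;
        this.agentX = startX;
        this.agentY = startY;
    }

    /**
     * Simulates one ACTION. If the target cell lies inside the MAP, the AGENT
     * moves there and receives that cell's REWARD; otherwise the AGENT stays
     * where it is and receives EDGE_PENALTY.
     */
    double step(Action action) {
        int nx = agentX + action.dx;
        int ny = agentY + action.dy;
        boolean inside = ny >= 0 && ny < rewards.length
                && nx >= 0 && nx < rewards[ny].length;
        if (!inside) {
            return EDGE_PENALTY;
        }
        agentX = nx;
        agentY = ny;
        return rewards[ny][nx];
    }

    int getAgentX() {
        return agentX;
    }

    int getAgentY() {
        return agentY;
    }
}
```

In this sketch the simulator, not the AGENT, decides the outcome of each ACTION: the AGENT only names the ACTION it wants and observes the REWARD that comes back.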

The AGENT is the decision-maker and the learner. It is based on a reinforcement learning algorithm that allows the AGENT to build a model of the best ACTION to take at any LOCATION in the environment. In order to define ``best'', we need a sense of what the AGENT ``wants'' to do. This is defined in terms of REWARDs - units of feedback that the WORLD SIMULATOR provides to AGENTs. The AGENT ``likes'' positive REWARDs and learns to seek them out, while it ``dislikes'' negative REWARDs and learns to avoid them. Thus, the AGENT's goal is to learn to navigate through the environment in such a way as to pick up positive REWARDs and avoid negative REWARDs. To do so, it starts by exploring the environment, learning local knowledge about the structure of the environment at each STEP. After many STEPs of experience, the AGENT will have learned a good POLICY for acting in the world.
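The learning loop described above can be sketched with tabular Q-learning. This section does not say which reinforcement learning algorithm the AGENT uses, so Q-learning is substituted here purely as a familiar example. The class name QLearningAgentSketch, the integer encoding of LOCATIONs and ACTIONs, and the parameters alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) are all hypothetical.

```java
import java.util.Random;

/**
 * Hypothetical sketch of an AGENT using tabular Q-learning. The algorithm
 * choice and all names here are illustrative assumptions.
 */
class QLearningAgentSketch {

    /** q[location][action]: current estimate of long-run REWARD. */
    private final double[][] q;
    private final double alpha;   // learning rate (assumed parameter)
    private final double gamma;   // discount factor (assumed parameter)
    private final double epsilon; // probability of a random exploratory ACTION
    private final Random rng;

    QLearningAgentSketch(int numLocations, int numActions,
                         double alpha, double gamma, double epsilon, long seed) {
        this.q = new double[numLocations][numActions]; // all estimates start at 0
        this.alpha = alpha;
        this.gamma = gamma;
        this.epsilon = epsilon;
        this.rng = new Random(seed);
    }

    /** The learned POLICY: the ACTION with the highest estimate at a LOCATION. */
    int bestAction(int location) {
        int best = 0;
        for (int a = 1; a < q[location].length; a++) {
            if (q[location][a] > q[location][best]) {
                best = a;
            }
        }
        return best;
    }

    /** Explores with probability epsilon; otherwise follows the POLICY. */
    int chooseAction(int location) {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(q[location].length);
        }
        return bestAction(location);
    }

    /** One STEP of learning from an observed (LOCATION, ACTION, REWARD, next LOCATION). */
    void learn(int location, int action, double reward, int nextLocation) {
        double target = reward + gamma * q[nextLocation][bestAction(nextLocation)];
        q[location][action] += alpha * (target - q[location][action]);
    }

    double value(int location, int action) {
        return q[location][action];
    }
}
```

Each call to learn uses only the REWARD and next LOCATION from a single STEP, which matches the idea of accumulating local knowledge; the POLICY is simply what bestAction returns once the estimates have settled.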



Terran Lane 2005-09-29