Thesis available on adaptive robot control

Lisa Meeden meeden at cs.swarthmore.edu
Mon Jan 2 13:51:02 EST 1995


FTP-host: archive.cis.ohio-state.edu
FTP-filename: /pub/neuroprose/Thesis/meeden.thesis.ps.Z

		  **DO NOT FORWARD TO OTHER GROUPS**

	 Ph.D. Thesis available by anonymous ftp (124 pages)

			  Towards Planning:
	Incremental Investigations into Adaptive Robot Control

			     Lisa Meeden
		    Department of Computer Science
			  Indiana University

ABSTRACT:

Traditional models of planning have adopted a top-down perspective by
focusing on the deliberative, conscious qualities of planning at the
expense of having a system that is connected to the world through its
perceptions.  This thesis takes the opposing, bottom-up perspective
that being firmly situated in the world is the crucial starting point
to understanding planning.  The central hypothesis of this thesis is
that the ability to plan developed from the more primitive capacity of
reactive control.

Neural networks offer the most promising mechanism for investigating
robot control and planning because connectionist methodology allows
the task demands, rather than the designer's biases, to be the primary
force in shaping a system's development.  Input can come directly from
the sensors, and output can feed directly into the actuators, creating
a close coupling of perception and action.  This interplay between
sensing and acting fosters a dynamic interaction between the
controller and its environment that is crucial to producing reactive
behavior.  Because adaptation is fundamental to the connectionist
paradigm, the designer need not posit what form the internal knowledge
will take or what specific function it will serve.  Instead, based on
the training task, the system will construct its own internal
representations built directly from the sensor readings to achieve the
desired control behavior.  Once the system has reached an adequate
level of performance on the task, its method can be dissected and a
high-level understanding of its control principles can be extracted.

This thesis takes an incremental approach towards understanding
planning using a simple recurrent network model.  In the initial
phase, several ways of representing goals are explored using a
simulated robot in a one-dimensional environment.  Next, the model is
extended to accommodate a physical robot and two reinforcement
learning methods for adapting the network controllers are compared: a
gradient descent algorithm and a genetic algorithm.  Then, the model's
reactive behavior and representations are analyzed to reveal that it
contains the potential building blocks necessary for planning, called
protoplans.
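
As a rough, purely illustrative sketch of the kind of controller
described above, the Python fragment below implements an Elman-style
simple recurrent network that maps the current sensor readings (plus a
copy of its previous hidden state) to motor outputs.  The layer sizes,
encodings, and activation functions are assumptions made for this
example, not details taken from the thesis; either training method
mentioned above (gradient descent or a genetic algorithm) would
operate on the three weight matrices.

import numpy as np

# Minimal Elman-style simple recurrent network used as a robot controller.
# Layer sizes and activations are illustrative assumptions, not thesis values.
class SRNController:
    def __init__(self, n_sensors=8, n_hidden=5, n_motors=2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in  = rng.normal(0.0, 0.5, (n_hidden, n_sensors))  # sensors -> hidden
        self.W_ctx = rng.normal(0.0, 0.5, (n_hidden, n_hidden))   # context -> hidden
        self.W_out = rng.normal(0.0, 0.5, (n_motors, n_hidden))   # hidden  -> motors
        self.context = np.zeros(n_hidden)                         # previous hidden state

    def step(self, sensors):
        """Map current sensor readings (plus context) to motor activations."""
        hidden = np.tanh(self.W_in @ sensors + self.W_ctx @ self.context)
        motors = np.tanh(self.W_out @ hidden)
        self.context = hidden      # feed hidden state back on the next time step
        return motors

controller = SRNController()
motor_command = controller.step(np.zeros(8))   # placeholder sensor readings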

Finally, to show that protoplans can be used to guide behavior, a
learning transfer experiment is conducted.  The protoplans constructed
in one network controller are stored in an associative memory and
retrieved by a new controller as it learns the same task from
scratch.  In this way, strategies discovered in the original controller
bias the strategies developed in the new controller.  The results show
that controllers trained with protoplans and without goals are able to
converge more quickly to successful solutions than controllers trained
with goals.  Furthermore, changes in the protoplans over time
reveal that particular fluctuations in the protoplan values are highly
correlated with switches in the robot's behavior.  In some instances,
very minor disturbances to the protoplan at these fluctuation points
severely disrupt the normal pattern of behavior.  Thus protoplans can
play a key role in determining behavior.
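
As a minimal sketch of the transfer mechanism, and assuming (only for
illustration) that protoplans are fixed-length vectors keyed by the
perceptual state and retrieved by nearest-neighbour lookup, an
associative memory of the kind described above might look like this:

import numpy as np

# Toy associative memory: protoplan vectors stored under perceptual-state keys
# and retrieved by nearest-neighbour match.  The keying and retrieval scheme
# are assumptions for this example, not the implementation used in the thesis.
class ProtoplanMemory:
    def __init__(self):
        self.keys = []      # perceptual states seen by the original controller
        self.values = []    # protoplan vectors recorded at those states

    def store(self, percept, protoplan):
        self.keys.append(np.asarray(percept, dtype=float))
        self.values.append(np.asarray(protoplan, dtype=float))

    def retrieve(self, percept):
        """Return the protoplan stored under the closest perceptual state."""
        dists = [np.linalg.norm(k - np.asarray(percept, dtype=float))
                 for k in self.keys]
        return self.values[int(np.argmin(dists))]

In use, the memory would be filled while the original controller
performs the task; the new controller then queries it with its own
perceptions on each time step and receives the retrieved protoplan as
an additional input.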

The success of these protoplan experiments supports a new set of
intuitions about planning. Rather than static, stand-alone procedures,
plans can be seen as dynamic, context-dependent guides, and the
process of planning may be more like informed improvisation than
deliberation.  It is not fruitful to spend processing time reasoning
about an inherently unpredictable world; with the protoplan model, a
fresh protoplan is computed on every time step.  Although each
protoplan offers only sketchy guidance, any more information might
actually be misleading. Once the chosen action is executed, the
subsequent perceptions are used to retrieve a new, more appropriate
protoplan.  Therefore it is possible to continually replan based on
the best information available--the robot's current perceptual state.
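
The replanning cycle sketched in the paragraph above can be summarized
in a few lines of Python; every function here is an illustrative stub
standing in for the corresponding component, not code from the thesis:

import numpy as np

def sense():                           # read the robot's sensors
    return np.random.rand(8)

def retrieve_protoplan(percept):       # associative lookup keyed by the percept
    return np.zeros(4)

def choose_action(percept, protoplan): # controller maps percept + protoplan to motors
    return np.tanh(percept[:2])        # placeholder motor command

def act(action):                       # send the command to the actuators
    pass

for t in range(100):
    percept = sense()
    protoplan = retrieve_protoplan(percept)  # replan from the current perceptual state
    action = choose_action(percept, protoplan)
    act(action)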

-------------------

Hard copies are not available.

Thanks to Jordan Pollack for maintaining neuroprose.

Lisa Meeden
Computer Science Program
Swarthmore College
meeden at cs.swarthmore.edu





