[ACT-R-users] Statement of Interest: Workshop on Evaluating Architectures for Intelligence
Gal Kaminka
galk at cs.biu.ac.il
Sat Sep 30 18:03:23 EDT 2006
(Apologies if you receive this more than once.)
Hello,
We are proposing to AAAI a workshop on Evaluating Architectures for
Intelligence (see draft proposal below). To convince AAAI of the need
for the workshop, we would like to get a count of tentatively
interested participants.
If you feel that you or your students may find this workshop interesting, and
may wish to attend or actively participate, please let us know at
galk at cs.biu.ac.il
We would also appreciate any comments you may have about the draft proposal
for the workshop and its suggested format.
Thank you!
Gal Kaminka and Catherina Burghart
------------------- Draft proposal below -------------------
Workshop Proposal: Evaluating Architectures for Intelligence
(Draft proposal)
1 Purpose and Scope
===================
Cognitive architectures form an integral part of robots and agents.
Architectures structure and organize the knowledge and algorithms used
by the agents to select actions in dynamic environments, plan and
solve problems, learn, and coordinate with others. Architectures
enable intelligent behavior by agents, and serve to integrate general
capabilities expected of an intelligent agent (e.g., planning and
learning), to implement and test theories about natural or synthetic
agent cognition, and to explore domain-independent mechanisms for
intelligence.
The evaluation of architectures has always been challenging. Several
common methodologies have been applied: showing that the architecture
allows behavior not previously demonstrated, demonstrating generality
by application to several different domains or tasks, and showing
compatibility with psychological data. Only in a few cases have
architectures been evaluated comparatively, in the context of a
specific task. This is due, on the one hand, to the differing natures
of the cognitive architectures applied; on the other hand, few
appropriate methods exist.
As AI research has improved in formal and empirical rigor, traditional
evaluation methodologies for architectures have sometimes proved
insufficient. On the formal side, rigorous analysis has often proven
elusive; we seem to be missing the notation required for formally
proving properties of architectures. On the empirical side,
experiments which demonstrate generality are notoriously expensive to
perform, and are not sufficiently informative. And at a high level,
evaluation is difficult because the criteria are not well defined: Is
it generality? Ease of programmability? Compatibility with data from
biology and psychology?
Yet interest in architectures has not died, and in fact has even
increased in recent years. There are several major research thrusts by
funders (e.g., DARPA, Europe's FP6 program) and by researchers (e.g.,
in cognitive modeling and cognitive robotics).
Recognizing that scientific progress depends on the ability to conduct
informative evaluation (by experiment or formal analysis), we are
proposing a workshop that will address the methodologies needed for
evaluating architectures. The focus will be on methodology, rather
than specific architectures. The workshop will have two goals: To
promote discussion and to propose accepted evaluation criteria.
1.1 Key Issues for Discussion
-----------------------------
The following key questions will be raised to motivate the workshop
discussion, with the goal of providing answers (or at least steps
towards answers) within a two-day workshop:
o Which functions/characteristics turn an architecture into a
cognitive architecture?
o Are different types of evaluation needed for different types of
cognitive architectures?
o Are there any relevant formal methods? Can we prove properties of
architectures?
o What are the criteria and scales of evaluation?
o How are architectures to be compared in an informative manner?
o What are the underlying hypotheses one may explore with
architectures?
o How should we validate the design of a cognitive architecture?
o How can data-sets and benchmarks (standardized tasks) be used to
evaluate architectures?
1.2 Impact: Usable Guidelines for Evaluation
--------------------------------------------
The goal of the workshop is to propose guidelines for the evaluation of
architectures that would be acceptable to the AI community, and that
would allow researchers both to evaluate their own work and to better
assess the progress of others. The format of the workshop (see below)
will focus on developing guidelines for evaluating architectural
features, conducting comparative studies, and proving architectural
properties. The format will take into account the experiences of all
attendees. The intention is to produce a citeable source of evaluation
guidelines, which will have significant impact on designers and
investigators of cognitive architectures. Such guidelines would
facilitate objective and reproducible evidence of an architecture's
capability to solve intricate problems in the intended manner.
To this end, we intend to publish the results in a special issue of an
international journal. We will set up a web site with the presentation
slides and explanatory material. This online resource will serve as a
portal to evaluation methodology, much as similar pages have been used
in the past to promote standards or the appropriate use of empirical
methods. We will also consider making an edited version of the
workshop video recordings available.
2 Proposed Format
=================
There are many researchers investigating architectures, but
surprisingly little published work on evaluation methodology. As the
aim of the proposed workshop is to produce evaluation guidelines and
to address the key questions above, we have chosen a setting that
enables a good working atmosphere. The format combines information
about different architecture types and evaluation methods, panels, and
moderated group discussions.
We see two alternative perspectives on evaluating architectures. On
one hand, it makes sense to divide up the talks by architecture types
(e.g., architectures implementing cognitive psychological theories,
architectures inspired by biology, architectures exploiting concepts
from control or rationality, architectures for coordinating agents,
etc.). This makes sense because different types of architectures may
require different evaluation methodologies. But it is also
problematic because architectures may fall between the cracks, or fit
more than one category. On the other hand, dividing up the talks by
evaluation methods (e.g., comparison to human performance data,
comparative studies, formal methods, etc.) may also be inappropriate,
as not all evaluation methods apply everywhere.
2.1 Workshop Format
-------------------
We are therefore considering a format in which we will mix these two
perspectives. The first day's morning sessions will include talks
grouped by different types of architectures. We believe that most of
these will be contributed through the submission process. Although
the talks will necessarily touch on specific architectures, their
focus will be on evaluation, not on the features of the architectures.
An invited panel will conclude these sessions.
We will then switch gears to sessions that focus on evaluation
methodologies, outside the context of architecture types. These
sessions will consist of invited and contributed talks, grouped by
evaluation approach: Formal methods, empirical comparative studies,
benchmarks and datasets, other empirical methods, and case studies
(the classification depending on the contributions). Each group of
talks will be arranged as a moderated panel, with the moderator posing
prepared questions and opening questions to the audience. In
particular, we will ask the panels to also consider the preceding
sessions on the different types of architectures.
The final phase of the workshop is contingent on the size of the
audience. The plan is to have the audience participate in generalizing
and summarizing the lessons learned. After an opening session defining
three key challenges (selected from the preceding panels), we will
follow an iterative process of (1) splitting into groups for
discussion, each headed by an invited moderator; (2) presenting the
results to everyone in the workshop (plenum); and (3) getting feedback
and questions. These three steps will be repeated. We may request
re-splitting into groups, to make sure everyone has a chance to
provide input on each type of architecture or evaluation methodology.
The moderator of each group will be responsible for generating an
outline for a research note explaining the evaluation methods and how
they should be applied. A short summary of these outlines will
conclude the workshop.
We believe the workshop will run for two days, but this depends on
the level of interest and the number of contributed talks. We will
record all proceedings. The number of attendees will not be limited,
but registration will be required.
2.2 Post-workshop Documentation
-------------------------------
All the panel results, video recordings and presentation materials
will be collected for the web site. After the workshop, we will
solicit articles from the different groups for the special issue, and
for the workshop web page.
3 Organizers and Relevant Expertise
===================================
The workshop will be co-chaired by Gal A. Kaminka and Catherina
Burghart. The organization committee includes other members (listed
below) who have all expressed interest in participating and
contributing.
Gal A. Kaminka (galk at cs.biu.ac.il)
----------------------------------
Gal Kaminka has been working with autonomous agents and robots for the
past ten years. Much of his research has focused on developing
distributed teamwork architectures for multi-agent and multi-robot
systems. He has been a chair and co-chair of the annual MOO workshops
(at AAMAS, IJCAI, and AAAI), which focus on plan and activity
recognition, program co-chair of the European Workshop on Multi-Agent
Systems (EUMAS 2005), and co-chair of the RoboCup 2002 Symposium. He has also
served in other organizational roles, including AAMAS doctoral
mentoring chair (2004) and Senior Program Committee member at AAAI and
AAMAS (2005-2007).
Catherina Burghart (burghart at ira.uka.de)
----------------------------------------
Catherina Burghart has been working with intelligent robots for the
past eight years. She is a member of the German Humanoids Project,
where she heads the working group on the cognitive architecture of
their robot.
Organizing Committee
--------------------
The co-chairs are supported by a distinguished list of organizers, all
of whom have had significant experience both in organization and in
relevant research. The list includes:
TBD (To Be Determined)
--
-----------------------------------------------------------------------
Gal A. Kaminka, Ph.D. http://www.cs.biu.ac.il/~galk
Assistant Professor Computer Science Dept. Bar Ilan University
Only those who see the invisible can do the impossible
"Death is an engineering problem." -- Bart Kosko, "Fuzzy Thinking"
"But life is not an engineering task." -- Gal A. Kaminka