[ACT-R-users] Statement of Interest: Workshop on Evaluating Architectures for Intelligence

Gal Kaminka galk at cs.biu.ac.il
Sat Sep 30 18:03:23 EDT 2006

	(Apologies if you receive this more than once.)


We are proposing to AAAI a workshop on Evaluating Architectures for 
Intelligence (see draft proposal below).  To convince AAAI of the need
for the workshop, we would like to get a count of the number of tentatively 
interested participants. 

If you feel that you or your student may find this workshop interesting, and 
may wish to attend or actively participate, please let us know at 

	galk at cs.biu.ac.il

We would also appreciate any comments you may have about the draft proposal 
for the workshop and its suggested format.  

Thank you!

Gal Kaminka and Catherina Burghart

------------------- Draft proposal below -------------------

Workshop Proposal:  Evaluating Architectures for Intelligence

		    (Draft proposal) 

1 Purpose and Scope 
Cognitive architectures  form an integral  part of robots  and agents.
Architectures structure and organize the knowledge and algorithms used
by  the agents  to select  actions in  dynamic environments,  plan and
solve  problems,  learn,  and  coordinate with  others.  Architectures
enable intelligent behavior by  agents, and serve to integrate general
capabilities  expected of  an  intelligent agent  (e.g., planning  and
learning), to  implement and test theories about  natural or synthetic
agent  cognition,  and to  explore  domain-independent mechanisms  for

The evaluation of architectures has always been challenging. Several
common methodologies have been applied: showing that the architecture
allows behavior not previously demonstrated, demonstrating generality
by application to several different domains or tasks, and showing
compatibility with psychological data. Only in a few cases have
architectures been evaluated comparatively, in the context of a
specific task. On the one hand, this is due to the different nature of
the cognitive architectures applied; on the other hand, few
appropriate methods exist.

As AI research has improved in formal and empirical rigor, traditional
evaluation  methodologies  for  architectures   have  sometimes proved
insufficient. On  the formal side, rigorous analysis  has often proven
elusive;  we seem  to be  missing the  notation required  for formally
proving   properties  of  architectures.    On  the   empirical  side,
experiments which demonstrate  generality are notoriously expensive to
perform, and  are not sufficiently  informative. And at  a high level,
evaluation is difficult because the  criteria are not well defined: Is
it generality?  Ease of  programmability? Compatibility with data from
biology and  psychology? 

Yet  interest in  architectures has  not died,  and in  fact  has even
increased in recent years. There are several major research thrusts by
funders (e.g., DARPA, Europe's  FP6 program) and by researchers (e.g.,
in cognitive modeling and cognitive robotics).

Recognizing that scientific progress depends on the ability to conduct
informative  evaluation (by  experiment  or formal  analysis), we  are
proposing a  workshop that will  address the methodologies  needed for
evaluating architectures.   The focus  will be on  methodology, rather
than  specific architectures.  The  workshop will  have two  goals: To
promote discussion and to propose accepted evaluation criteria.

1.1 Key Issues for Discussion 
The following  key questions will  be raised to motivate  the workshop
discussion,  with the  goal of  providing answers  (or at  least steps
towards answers) within a two-day workshop:
 o Which functions/characteristics turn an architecture into a
   cognitive architecture?
 o Are different types of evaluation needed for different types of
   cognitive architectures?
 o Are there any relevant formal methods? Can we prove properties of
 o What are the criteria and scales of evaluation?
 o How are architectures to be compared in an informative manner?
 o What are the underlying hypotheses one may explore with
 o How should we validate the design of a cognitive architecture?
 o How can data-sets and benchmarks (standardized tasks) be used to
   evaluate architectures?

1.2 Impact: Usable Guidelines for Evaluation 
The goal of the workshop is to propose guidelines for the evaluation
of architectures that would be acceptable to the AI community, and
that allow researchers both to evaluate their own work and to better
assess the progress of others. The format of the workshop (see below)
will focus on developing guidelines for evaluating architectural
features, conducting comparative studies, and proving architectural
properties.  The
format will take  into account the experiences of  all attendees.  The
intention is  to produce a  citeable source of  evaluation guidelines,
which will  have significant impact on designers  and investigators of
cognitive  architectures.  Such  guidelines  facilitate objective  and
reproducible  evidence  of   an  architecture's  capability  to  solve
intricate problems in the intended manner.

To do this, we intend to publish  the results in a special issue of an
international journal. We will set up a web site with the presentation
slides and explanatory material.  This online resource will serve as a
portal to evaluation methodology, much  as similar pages have been used
in the past to promote  standards or appropriate use  of empirical
methods.   We will  also  consider  making an  edited  version of  the
workshop  video recordings  available.  

2 Proposed Format
There   are   many   researchers  investigating   architectures,   but
surprisingly little  published work on evaluation  methodology. As the
aim of the proposed workshop  is to produce the evaluation guidelines,
and to address the key questions,  we choose a setting enabling a good
working atmosphere.   The format combines  information about different
architecture types and evaluation methods, panels, and moderated group

We see  two alternative perspectives on  evaluating architectures.  On
one hand, it makes sense to  divide up the talks by architecture types
(e.g.,  architectures implementing  cognitive  psychological theories,
architectures  inspired by  biology, architectures  exploiting concepts
from  control or rationality,  architectures for  coordinating agents,
etc.).  This makes sense  because different types of architectures may
require   different  evaluation   methodologies.   But   it   is  also
problematic because architectures may  fall between the cracks, or fit
more than one  category.  On the other hand, dividing  up the talks by
evaluation  methods  (e.g.,  comparison  to  human  performance  data,
comparative studies, formal methods, etc.)  may also be inappropriate,
as not all evaluation methods apply everywhere.

2.1 Workshop Format
We are therefore  considering a format in which we  will mix these two
perspectives.   The first  day's morning  sessions will  include talks
grouped by different types of  architectures.  We believe that most of
these will  be contributed  through the submission  process.  Although
the  talks will  necessarily  touch on  specific architectures,  their
focus will be on evaluation, not on the features of the architectures.
An invited  panel will conclude  these sessions.  

We  will  then switch  gears  to  sessions  that focus  on  evaluation
methodologies,  outside  the  context  of architecture  types.   These
sessions  focus  on  invited  and  contributed  talks,  grouped  along
evaluation approaches: Formal  methods, empirical comparative studies,
benchmarks  and  datasets,   other  empirical  methods,  case  studies
(classification dependent on contributions).  Each group of talks will
be  arranged as  a moderated  panel with  a moderator  posing prepared
questions and  opening questions to  the audience.  In  particular, we
will  ask the  panels  to  also consider  the  preceding sessions,  on
different  types of  architectures.  

Our  final  phase  of the  work  is  contingent  on  the size  of  the
audience. The  plan is to have the audience participate in generalizing
and summarizing  the lessons learned. After an  opening defining three
key  challenges (selected  from  the  preceding  panels), we  will
follow  an  iterative  process   of  (1)  splitting  into  groups  for
discussions, each  headed by an invited moderator;  (2) presenting the
results to everyone in the workshop (plenum); (3) getting feedback and
questions.   These  three steps  will  be  repeated.   We may  request
re-splitting  into groups,  to  make  sure everyone  has  a chance  to
provide input on each  type of architecture or evaluation methodology.

The  moderator of  each group  will be  responsible for  generating an
outline for a research note  explaining the evaluation methods and how
they should  be applied. A  short summary of these  outlines will conclude
the workshop.

We believe  the workshop will  run for two  days, but this  depends on
interest  and  number  of  contributed  talks.   We  will  record  all
proceedings.   The  number  of  attendees  will not  be  limited,  but
registration will be required.

2.2 Post-workshop Documentation
All  the panel  results, video  recordings and  presentation materials
will  be collected  for the  web site.   After the  workshop,  we will
solicit articles from the different  groups for the special issue, and
for the  workshop web  page. 

3 Organizers and Relevant Expertise
The  workshop will  be co-chaired  by  Gal A.   Kaminka and  Catherina
Burghart.  The  organization committee includes  other members (listed
below)  who   have  all   expressed  interest  in   participating  and

Gal A. Kaminka (galk at cs.biu.ac.il)
Gal Kaminka  has been  working with autonomous  agents and robots for the
past  ten years.   Much  of  his research  has  focused on  developing
distributed  teamwork architectures  for  multi-agent and  multi-robot
systems. He has been a chair  and co-chair of the MOO annual workshops
(at  AAMAS,  IJCAI, and  AAAI)  which  focus  on plan-  and  activity-
recognition, program co-chair of  the European workshop on Multi-Agent
Systems (EUMAS 2005), and of  the RoboCup 2002 Symposium.  He has also
served  in  other   organizational  roles,  including  AAMAS  doctoral
mentoring chair (2004) and Senior Program Committee member at AAAI and
AAMAS (2005-2007).

Catherina Burghart (burghart at ira.uka.de)
Catherina Burghart  has been working  with intelligent robots  for the
past eight  years.  She is a  member of the  German Humanoids Project,
where she  heads the  working group on  the cognitive  architecture of
their robot.

Organizing Committee 
The co-chairs are supported by a distinguished list of organizers, all of whom
have had significant experience both in organization and in relevant research.
The list 
    TBD (To Be Determined)

Gal A. Kaminka, Ph.D.                     http://www.cs.biu.ac.il/~galk
Assistant Professor     Computer Science Dept.      Bar Ilan University
        Only those who see the invisible can do the impossible
   "Death is an engineering problem." -- Bart Kosko, "Fuzzy Thinking"
        "But life is not an engineering task." -- Gal A. Kaminka
