Performance evaluations: request for comments
Radford Neal
radford at cs.toronto.edu
Wed Nov 22 21:18:03 EST 1995
Announcing a draft document on
ASSESSING LEARNING PROCEDURES USING DELVE
The DELVE development group, University of Toronto
http://www.cs.utoronto.ca/neuron/delve/delve.html
The DELVE development group requests comments on the draft manual
for the DELVE environment from researchers who are interested in how
to assess the performance of learning procedures. This manual is
available via the DELVE homepage, at the URL above.
Carl Rasmussen and Geoffrey Hinton will be talking about the DELVE
environment at the NIPS workshop on Benchmarking of Neural Net
Learning Algorithms. We would be pleased to hear any comments that
attendees of this workshop, or other interested researchers, might
have on the current design of the DELVE environment, as described in
this draft manual.
Here is the introduction to the DELVE manual:
DELVE --- Data for Evaluating Learning in Valid Experiments --- is a
collection of datasets from many sources, and an environment within
which this data can be used to assess the performance of procedures
that learn relationships using such data.
Many procedures for learning from empirical data have been developed
by researchers in statistics, pattern recognition, artificial
intelligence, neural networks, and other fields. Learning procedures
in common use include simple linear models, nearest neighbor methods,
decision trees, multilayer perceptron networks, and many others of
varying degrees of complexity. Comparing the performance of these
learning procedures in realistic contexts is a surprisingly difficult
task, requiring both an extensive collection of real-world data and a
carefully designed scheme for performing experiments.
The aim of DELVE is to help researchers and potential users to assess
learning procedures in a way that is relevant to real-world problems
and that allows statistically valid comparisons of different
procedures. Improved assessments will make it easier to determine
which learning procedures work best for various applications, and will
promote the development of better learning procedures by allowing
researchers to easily determine how the performance of a new procedure
compares to that of existing procedures.
This manual describes the DELVE environment in detail. First,
however, we provide an overview of DELVE's capabilities, describe
briefly how DELVE organizes datasets and learning tasks, and give an
example of how DELVE can be used to assess the performance of a
learning procedure.
---------------------------------------------------------------------------
Members of the DELVE Development Group:
G. E. Hinton       R. M. Neal     R. Tibshirani    M. Revow
C. E. Rasmussen    D. van Camp    R. Kustra        Z. Ghahramani
----------------------------------------------------------------------------
Radford M. Neal                                       radford at cs.toronto.edu
Dept. of Statistics and Dept. of Computer Science     radford at utstat.toronto.edu
University of Toronto                                 http://www.cs.toronto.edu/~radford
----------------------------------------------------------------------------