MLC++ utilities version 1.1

Tue Jan 24 00:08:14 EST 1995

[ Summary paragraph moved to beginning, for clarity.  -- The Moderator ]

MLC++ is a Machine Learning library of C++ classes being developed at
Stanford.  More information about the library can be obtained at URL
http://robotics.stanford.edu:/users/ronnyk/mlc.html.  The utilities
are available by anonymous ftp to starry.stanford.edu:pub/ronnyk/mlc/util.
They are currently distributed only as object code for Sun, but source
code will be distributed in the future, or sooner to sites that wish
to attempt a port of the code to other compilers.

			   MLC++ Utilities 1.1
                           ___________________

Since the release of MLC++ utilities 1.0 in December 1994, over
40 sites have copied the utilities.  We are now releasing
version 1.1.  New features include:

*. Options now prompt for values, with help text explaining the
   option values.  Options are divided into common options and
   "nuisance" options, which users (especially first-time users)
   should not change often.

*. New inducers include Naive-Bayes and 1R (Holte).

*. The nearest-neighbor (IB) inducer has many new options.  It supports
   nominals, interquartile normalization (as opposed to extreme),
   voting of neighbors, k distances (as opposed to k neighbors), and more.

*. A new utility, discretize, is available to discretize continuous
   features.  Either binning or Holte's discretization can be used.

*. Estimated performance on a test set now includes a confidence
   interval, assuming an i.i.d. sample (details in the manual).
   People are often surprised by how wide the interval is for
   some of the toy datasets.

*. Confusion matrices can be displayed for MLC++ inducers.

*. When the tree induced by ID3 is displayed using X-windows, it can
   now show the distribution and entropy.  (This option requires
   that you install dotty from AT&T, which is free for research.)

*. The learning curve gives an honest estimate of error by testing
   only on the unseen instances.  The accuracy reports for regular
   induction now also give memorization accuracy and generalization
   accuracy separately (following Schaffer and Wolpert's recent papers).
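
As an illustration of the confidence interval on test-set accuracy
mentioned above, here is a minimal Python sketch.  It is not the MLC++
code and not necessarily the formula given in the manual: it assumes
the usual normal approximation to the binomial for an i.i.d. test set.
The function name accuracy_interval and the default z = 1.96 (roughly
a 95% interval) are made up for this example.

```python
import math


def accuracy_interval(correct, total, z=1.96):
    """Return (accuracy, low, high) for a test set of `total` instances.

    Assumption: normal approximation to the binomial, i.e.
    accuracy +/- z * sqrt(accuracy * (1 - accuracy) / total),
    clipped to the range [0, 1].
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    acc = correct / total
    # Standard error of a proportion estimated from `total` i.i.d. trials.
    half_width = z * math.sqrt(acc * (1.0 - acc) / total)
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)
```

Under these assumptions, 45 correct out of 50 test instances gives an
interval of roughly 0.82 to 0.98, which is the kind of width that tends
to surprise people on small datasets.  With 4500 correct out of 5000
the interval shrinks to roughly 0.89 to 0.91.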

--

   Ronny Kohavi (ronnyk at CS.Stanford.EDU, http://robotics.stanford.edu/~ronnyk)