MLC++: Machine Learning Utilities Available

Ronny Kohavi ronnyk at CS.Stanford.EDU
Sat Dec 3 10:26:20 EST 1994


			   MLC++ Utilities                
                           _______________


MLC++ is a Machine Learning library of C++ classes being developed at
Stanford.  More information about the library can be obtained at URL
http://robotics.stanford.edu:/users/ronnyk/mlc.html.

We are now releasing the object code for some utilities written using MLC++.
These run on Sun workstations under either SunOS or Solaris.


Included in the current release are the following induction algorithms:

1. Majority (baseline).
2. Basic ID3 for inducing decision trees.  The output can be sent
     to a mail server to get a PostScript picture of the resulting
     tree, which is very useful for inspecting the final tree and
     for teaching.
3. Basic nearest neighbor.
4. Decision Table.
5. Interface to C4.5 (for utilities below).
6. Feature subset selection : wraps around any of the above and
     selects a good subset of the features, usually improving
     performance and comprehensibility.
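
To illustrate how a wrapper around an induction algorithm can select
features, here is a minimal sketch in Python.  It is not the MLC++
implementation: the greedy forward search, the leave-one-out accuracy
estimate, and all function names are assumptions made for the example,
since the announcement does not say which search or estimate is used.

```python
from collections import Counter

def majority_inducer(train_rows, train_labels):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda row: most_common

def nearest_neighbor_inducer(train_rows, train_labels):
    """Basic 1-nearest-neighbor using squared Euclidean distance."""
    def predict(row):
        def distance(i):
            return sum((a - b) ** 2 for a, b in zip(train_rows[i], row))
        return train_labels[min(range(len(train_rows)), key=distance)]
    return predict

def project(rows, features):
    """Keep only the listed feature columns of every row."""
    return [[row[f] for f in features] for row in rows]

def leave_one_out_accuracy(inducer, rows, labels):
    """Estimate accuracy by holding out each instance in turn."""
    correct = 0
    for i in range(len(rows)):
        predict = inducer(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        correct += int(predict(rows[i]) == labels[i])
    return correct / len(rows)

def forward_select(inducer, rows, labels):
    """Greedy forward search (assumed strategy): repeatedly add the
    single feature that most improves the estimated accuracy."""
    selected = []
    best_acc = leave_one_out_accuracy(inducer, project(rows, selected), labels)
    improved = True
    while improved:
        improved = False
        for f in range(len(rows[0])):
            if f in selected:
                continue
            acc = leave_one_out_accuracy(
                inducer, project(rows, selected + [f]), labels)
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return sorted(selected), best_acc
```

Because the search only calls the inducer as a black box, the same
forward_select can be wrapped around either inducer shown here.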
  


Utilities released are:

1. Cross validation : cross validate any of the above induction
   algorithms on a data file.  Allows regular or stratified CV.
   You can also generate the cross validation files in order to
   compare your own induction algorithm.
  
2. Learning curve : generate a learning curve for any of the above
   induction algorithms.

3. Project : project the data onto a subset of attributes.

4. Convert : convert nominal attributes to unary encoding or binary
   encoding.
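
For readers unfamiliar with two of the ideas above, stratified cross
validation and conversion of nominal attributes, here is a small
Python sketch.  It is only an illustration under stated assumptions:
the function names are hypothetical, "unary" is taken to mean one 0/1
indicator per attribute value, "binary" is taken to mean the value's
index written in base 2, and the actual MLC++ utilities may differ.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds, seed=0):
    """Assign instance indices to folds so that each fold has roughly
    the same class proportions as the whole file."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the instances of this class round-robin across folds.
        for index in indices:
            folds[position % n_folds].append(index)
            position += 1
    return folds

def unary_encode(value, values):
    """One indicator per possible value (assumed meaning of 'unary')."""
    return [1 if value == v else 0 for v in values]

def binary_encode(value, values):
    """Index of the value in base 2, most significant bit first
    (assumed meaning of 'binary')."""
    width = max(1, (len(values) - 1).bit_length())
    code = values.index(value)
    return [(code >> bit) & 1 for bit in reversed(range(width))]
```

Each fold in turn serves as the test set while the remaining folds
form the training set.  Writing the folds out as files would let a
different induction algorithm be run on exactly the same splits.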


Quick starter guide:
--------------------

The MLC++ utilities are available by anonymous ftp from
  starry.stanford.edu:pub/ronnyk/mlc/
There are currently two kits, one for Solaris (MLCutil-solaris.tar.Z)
   and one for SunOS (MLCutil-sunos.tar.Z).  To unpack a kit:

cd <directory>
zcat <kit-name> | tar xvf -

where <directory> is the directory under which the mlc directory will
   be built (e.g., /usr/local), and <kit-name> is the kit appropriate
   for your machine.

The documentation is in utils.ps.
The environment variable MLCDIR must be set to the directory where
    the utilities are installed.

Databases in the MLC++ format, which is very similar to the C4.5 format,
  can be found in starry.stanford.edu:pub/ronnyk/mlc/db.
  Most data files were converted from the repository at UC Irvine.

Questions or help requests related to the utilities should be
    addressed to mlcpp-help at CS.Stanford.EDU.

--

   Ronny Kohavi (ronnyk at CS.Stanford.EDU, http://robotics.stanford.edu/~ronnyk)



