MLC++: Machine Learning Utilities Available
Ronny Kohavi
ronnyk at CS.Stanford.EDU
Sat Dec 3 10:26:20 EST 1994
MLC++ Utilities
_______________
MLC++ is a Machine Learning library of C++ classes being developed at
Stanford. More information about the library can be obtained at URL
http://robotics.stanford.edu:/users/ronnyk/mlc.html.
We are now releasing the object code for some utilities written using MLC++.
These will run on Suns, either on SunOS or Solaris.
Included in the current release are the following induction algorithms:
1. Majority (baseline).
2. Basic ID3 for inducing decision trees. The output can be sent
to a mail server to get a PostScript picture of the resulting
tree, which is useful for inspecting the final tree and for teaching.
3. Basic nearest neighbor.
4. Decision Table.
5. Interface to C4.5 (for utilities below).
6. Feature subset selection : wraps around any of the above and
selects a good subset of the features, usually improving
performance and comprehensibility.
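To illustrate what "wraps around" means for feature subset selection, here
is a minimal Python sketch. It is not MLC++ code. It assumes a greedy
forward search, which is only one possible search strategy and is not stated
in this announcement. The function name forward_select and the toy scoring
function are hypothetical; in a real wrapper, evaluate would return an
accuracy estimate (for example, from cross validation) of the chosen
induction algorithm run on the given feature subset.

```python
def forward_select(features, evaluate):
    """Greedy forward selection: repeatedly add the single feature that
    most improves evaluate(subset); stop when no feature improves it."""
    selected = []
    best_score = evaluate(selected)
    while True:
        best_candidate = None
        for feature in features:
            if feature in selected:
                continue
            score = evaluate(selected + [feature])
            if score > best_score:  # strict improvement only
                best_score = score
                best_candidate = feature
        if best_candidate is None:
            return selected, best_score
        selected.append(best_candidate)


# Toy stand-in for "estimated accuracy of the inducer on this subset":
# relevant features help, irrelevant ones hurt slightly. (Hypothetical.)
RELEVANT = {"outlook", "humidity"}

def toy_evaluate(subset):
    chosen = set(subset)
    return 0.5 + 0.2 * len(chosen & RELEVANT) - 0.05 * len(chosen - RELEVANT)


chosen_features, chosen_score = forward_select(
    ["outlook", "temp", "humidity", "id"], toy_evaluate
)
```

Because the search only ever calls evaluate, the same loop works with any
induction algorithm that can be scored on a subset of the features.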
Utilities released are:
1. Cross validation : cross-validate any of the above induction
algorithms on a data file. Allows regular or stratified CV.
You can also generate the cross validation files to compare
against your own induction algorithm.
2. Learning curve : generate a learning curve for any of the above
induction algorithms.
3. Project : project the data onto a subset of attributes.
4. Convert : convert nominal attributes to unary encoding or binary
encoding.
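The difference between regular and stratified cross validation can be
sketched in a few lines of Python. This is an illustration under stated
assumptions, not the MLC++ implementation: the function names are
hypothetical, and instances are dealt to folds in a fixed order so the
result is reproducible (a real run would normally shuffle within each class
first).

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign instance indices to k folds so that each class is spread
    as evenly as possible across the folds."""
    if k < 2:
        raise ValueError("k must be at least 2")
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0
    # Deal the instances of each class round-robin over the folds.
    for label in sorted(by_class):
        for index in by_class[label]:
            folds[position % k].append(index)
            position += 1
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train, test


labels = ["a"] * 6 + ["b"] * 3
folds = stratified_folds(labels, 3)
splits = list(train_test_splits(folds))
```

With six instances of class "a" and three of class "b", each of the three
folds receives two "a" instances and one "b" instance, so every test fold
mirrors the class distribution of the full file.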
Quick starter guide:
--------------------
The MLC++ utilities are accessible by anonymous ftp to
starry.stanford.edu:pub/ronnyk/mlc/
There are currently two kits, one for Solaris (MLCutil-solaris.tar.Z)
and one for SunOS (MLCutil-sunos.tar.Z). To unpack a kit, run

    cd <directory>
    zcat <kit-name> | tar xvf -

where <directory> is the directory under which the mlc directory will
be built (e.g., /usr/local), and <kit-name> is the kit appropriate
for your machine.
The documentation is in utils.ps.
The environment variable MLCDIR must be set to the directory where
the utilities are installed.
Databases in the MLC++ format, which is very similar to the C4.5 format,
can be found in starry.stanford.edu:pub/ronnyk/mlc/db.
Most data files were converted from the repository at UC Irvine.
Questions or help requests related to the utilities should be
addressed to mlcpp-help at CS.Stanford.EDU
--
Ronny Kohavi (ronnyk at CS.Stanford.EDU, http://robotics.stanford.edu/~ronnyk)