new and improved family of boosting algorithms

Yoram Singer singer at research.att.com
Mon Jun 8 17:31:39 EDT 1998


	The following papers introduce, analyze, and describe applications
	of a new and improved family of boosting algorithms. The papers are
	available from:
		http://www.research.att.com/~schapire/boost.html
		and 
		http://www.research.att.com/~singer/pub.html
 
	Questions and comments are welcome.

- Rob Schapire and Yoram Singer
{schapire,singer}@research.att.com


-----------------------------------------------------------------------------

Improved Boosting Algorithms Using Confidence-rated Predictions

Robert E. Schapire and Yoram Singer

We describe several improvements to Freund and Schapire's AdaBoost
boosting algorithm, particularly in a setting in which hypotheses may
assign confidences to each of their predictions. We give a simplified
analysis of AdaBoost in this setting, and we show how this analysis can
be used to find improved parameter settings as well as a refined
criterion for training weak hypotheses. We give a specific method for
assigning confidences to the predictions of decision trees, a method
closely related to one used by Quinlan. This method also suggests a
technique for growing decision trees which turns out to be identical to
one proposed by Kearns and Mansour.

We focus next on how to apply the new boosting algorithms to multiclass
classification problems, particularly to the multi-label case in which
each example may belong to more than one class. We give two boosting
methods for this problem. One of these leads to a new method for
handling the single-label case which is simpler than, but as effective
as, techniques suggested by Freund and Schapire. Finally, we give some
experimental results comparing a few of the algorithms discussed in
this paper.
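The confidence-rated setting admits a compact sketch. The Python code below is our own illustration, not the paper's implementation: it boosts domain-partitioning stumps whose per-block confidence is 0.5*ln(W+/W-) (smoothed by a small constant, here an arbitrary 1e-6), choosing the stump that minimizes the normalizer Z on each round. Function names and the toy setup are assumptions for the sketch.

```python
import math

def train_stump(xs, ys, d, eps=1e-6):
    """Pick the threshold minimizing the normalizer Z, and assign each
    block the smoothed confidence 0.5*ln(W+/W-)."""
    best = None
    for theta in sorted(set(xs)):
        # total weight of positive/negative examples in each block
        w = {(s, y): 0.0 for s in (0, 1) for y in (-1, 1)}
        for x, y, di in zip(xs, ys, d):
            w[(int(x >= theta), y)] += di
        c = {s: 0.5 * math.log((w[(s, 1)] + eps) / (w[(s, -1)] + eps))
             for s in (0, 1)}
        z = sum(di * math.exp(-y * c[int(x >= theta)])
                for x, y, di in zip(xs, ys, d))
        if best is None or z < best[0]:
            best = (z, theta, c)
    return best

def boost(xs, ys, rounds=5):
    n = len(xs)
    d = [1.0 / n] * n          # uniform initial distribution
    hyps = []
    for _ in range(rounds):
        z, theta, c = train_stump(xs, ys, d)
        hyps.append((theta, c))
        # multiplicative update; z is exactly the normalizer
        d = [di * math.exp(-y * c[int(x >= theta)]) / z
             for x, y, di in zip(xs, ys, d)]
    return hyps

def predict(hyps, x):
    """Sign of the summed real-valued (confidence-rated) predictions."""
    score = sum(c[int(x >= theta)] for theta, c in hyps)
    return 1 if score >= 0 else -1
```

On a toy one-dimensional problem the combined hypothesis separates the classes after a few rounds; the per-block confidences also make each weak hypothesis's vote proportional to how pure its block is.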

-----------------------------------------------------------------------------

BoosTexter: A System for Multiclass Multi-label Text Categorization

Robert E. Schapire and Yoram Singer

This work focuses on algorithms which learn from examples to perform
multiclass text and speech categorization tasks. We first show how to
extend the standard notion of classification by allowing each instance
to be associated with multiple labels. We then discuss our approach for
multiclass multi-label text categorization which is based on a new and
improved family of boosting algorithms. We describe in detail an
implementation, called BoosTexter, of the new boosting algorithms for
text categorization tasks. We present results comparing the performance
of BoosTexter and a number of other text-categorization algorithms on a
variety of tasks. We conclude by describing the application of our
system to automatic call-type identification from unconstrained spoken
customer responses.
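The multi-label extension of classification mentioned above can be illustrated by the standard reduction to binary questions over (instance, label) pairs: "is label l relevant to instance x?" The sketch below is our own illustration of that reduction, not BoosTexter's code; names are hypothetical.

```python
def to_binary(examples, labels):
    """Expand each multi-labeled example (x, Y) into one signed binary
    example ((x, l), +1/-1) per possible label l, so that a binary
    boosting algorithm can be run over the expanded set."""
    out = []
    for x, y_set in examples:
        for l in labels:
            out.append(((x, l), 1 if l in y_set else -1))
    return out
```

A document carrying two labels thus contributes a positive binary example for each of its labels and a negative one for every other label in the vocabulary.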

-----------------------------------------------------------------------------

An Efficient Boosting Algorithm for Combining Preferences

Yoav Freund, Raj Iyer, Robert E. Schapire, Yoram Singer

The problem of combining preferences arises in several applications,
such as combining the results of different search engines. This work
describes an efficient algorithm for combining multiple preferences.
We first give a formal framework for the problem. We then describe and
analyze a new boosting algorithm for combining preferences called
RankBoost. We also describe an efficient implementation of the
algorithm for a restricted case. We discuss two experiments we carried
out to assess the performance of RankBoost. In the first experiment,
we used the algorithm to combine different WWW search strategies, each of
which is a query expansion for a given domain. For this task, we
compare the performance of RankBoost to the individual search
strategies. The second experiment is a collaborative-filtering task
for making movie recommendations. Here, we present results comparing
RankBoost to nearest-neighbor and regression algorithms.
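The pairwise setup behind RankBoost can be sketched compactly. In the sketch below (our own illustration, not the paper's implementation; names, tie-breaking, and the log guard are assumptions), a distribution is kept over ordered pairs; each round picks the weak ranker h maximizing |r|, where r sums D-weighted differences h(preferred) - h(other), weights it by alpha = 0.5*ln((1+r)/(1-r)), and upweights the pairs it still mis-orders.

```python
import math

def rank_boost(items, pairs, weak_rankers, rounds=3):
    """pairs: (lo, hi) index pairs meaning items[hi] should rank above
    items[lo]; weak_rankers: functions item -> {0, 1}.
    Returns a real-valued scoring function (higher = ranked above)."""
    d = {p: 1.0 / len(pairs) for p in pairs}   # distribution over pairs
    combined = []
    for _ in range(rounds):
        # choose the weak ranker maximizing |r|
        best_h, best_r = None, 0.0
        for h in weak_rankers:
            r = sum(w * (h(items[hi]) - h(items[lo]))
                    for (lo, hi), w in d.items())
            if abs(r) > abs(best_r):
                best_h, best_r = h, r
        if best_h is None:
            break
        best_r = max(min(best_r, 1 - 1e-9), -1 + 1e-9)  # guard the log
        alpha = 0.5 * math.log((1 + best_r) / (1 - best_r))
        combined.append((alpha, best_h))
        # pairs the new ranker orders correctly are downweighted
        for lo, hi in d:
            d[(lo, hi)] *= math.exp(
                alpha * (best_h(items[lo]) - best_h(items[hi])))
        z = sum(d.values())
        d = {p: w / z for p, w in d.items()}
    return lambda x: sum(a * h(x) for a, h in combined)
```

With threshold-style weak rankers on a toy list, the learned score reproduces the preferred ordering after a few rounds.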


More information about the Connectionists mailing list