ML Papers & Cora: Two search engines for postscript papers

Andrew McCallum mccallum at sandbox.jprc.com
Mon Jan 11 19:15:50 EST 1999


We are pleased to announce the availability of two search engines for
Postscript papers on the Web.  "ML Papers" provides access to Machine
Learning papers.  "Cora" provides access to papers on computer science
as a whole.  Both allow keyword searches over partial text of
postscript-formatted papers they have found by spidering the Web.

  * For ML Papers:     http://gubbio.cs.berkeley.edu/mlpapers/
  * For Cora:          http://www.cora.justresearch.com


About ML Papers:

"ML Papers", first released in 1997, is a search engine that
automatically extracts titles, authors and abstracts from postscript
papers found on the Web; it was (to our knowledge) the first of its
kind.  Its index currently consists of about 12,000 postscript papers,
mostly related to Machine Learning, Datamining, Statistics, etc., and a
Web interface provides search functionality over them.  Links to
postscript files and their referring pages are returned in response to
queries.  Titles/authors/abstracts of these papers are also displayed.
You can see ML Papers at http://gubbio.cs.berkeley.edu/mlpapers/

"ML Papers," which was recently moved from MIT to UC Berkeley, was
created by Andrew Ng.  Its companion "Vision Papers" search engine can
also be accessed at http://www.ai.mit.edu/people/ayn/cgi/vpapers.


About Cora:

"Cora" provides access to over 50,000 research papers on all computer
science subjects.  Search queries can include special operators such
as +, -, "", title:, author:, reference:, and url: (all with their
typical meanings).  Citation references have been processed to
provide forward and backward crosslinks---showing both (1) papers
referenced by the current paper, and (2) papers that reference the
current paper.  References have also been parsed in order to provide
automatically-generated BibTeX entries.  The papers are categorized
into a "Yahoo-like" topic hierarchy with 75 leaves.  In the near
future, the citation structure will be analyzed in order to
automatically identify seminal and survey articles in each category.
Cora is at http://www.cora.justresearch.com
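
To illustrate the forward and backward crosslinks described above, here
is a minimal sketch in Python.  It is not Cora's actual code: the
function name build_crosslinks, the paper identifiers and the toy data
are all hypothetical, and it assumes each parsed reference has already
been resolved to a paper in the index.

```python
from collections import defaultdict


def build_crosslinks(references):
    """Invert a 'paper -> papers it cites' map into 'paper -> papers citing it'.

    `references` is a hypothetical dict mapping a paper id to the list of
    paper ids found in its reference section (the backward links).
    The returned dict holds the forward links.
    """
    cited_by = defaultdict(list)
    for paper, cited in references.items():
        for target in cited:
            cited_by[target].append(paper)
    return dict(cited_by)


# Toy data, made up for illustration only.
references = {
    "paper_a": ["paper_b", "paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": [],
}

cited_by = build_crosslinks(references)

# (1) papers referenced by the current paper
print(references["paper_a"])
# (2) papers that reference the current paper
print(cited_by.get("paper_c", []))
```

Storing both maps lets a result page show either direction with a
single lookup, at the cost of rebuilding the inverted map whenever new
papers are indexed.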

"Cora" is the result of a continuing research project at Just Research, 
led by Andrew McCallum with interns Kamal Nigam, Jason Rennie and
Kristie Seymore.  Just Research is the U.S. research organization of
Justsystem Corporation, the leading independent software company in
Japan, and is located near the Carnegie Mellon campus.  A paper
describing Cora will be presented at the AAAI Spring Symposium, and
can be found at http://www.cs.cmu.edu/~mccallum/papers/cora-aaaiss98.ps.


Feel free to share this announcement with others.  Enjoy and please
send feedback.

  Andrew Ng
  ang at cs.berkeley.edu
  "ML Papers"

  Andrew McCallum
  mccallum at justresearch.com, mccallum at cs.cmu.edu
  "Cora"

