Darrin Dimmick X4147 dld at
Mon Apr 13 08:09:43 EDT 1992


 NIST Announces a New Database

 "NIST Special Database 2"

 Structured Forms Reference Set

The NIST database of structured forms contains 5,590 full-page images of
simulated tax forms completed using machine print. THERE IS NO REAL TAX DATA IN
THIS DATABASE. The structured forms used in this database are 12 different
forms from the 1988 IRS 1040 Package X. These include Forms 1040, 2106, 2441,
4562, and 6251 together with Schedules A, B, C, D, E, F, and SE. Eight of these
forms contain two pages, or form faces, making a total of 20 form faces
represented in the database.
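
The form-face total can be checked with a few lines of arithmetic. This is
only a sketch: the form names and counts come from the paragraph above, while
the function name and the assumption that each remaining form has exactly one
page are illustrative.

```python
# Form names as listed in the announcement.
FORMS = ["1040", "2106", "2441", "4562", "6251"]
SCHEDULES = ["A", "B", "C", "D", "E", "F", "SE"]

TOTAL_FORMS = len(FORMS) + len(SCHEDULES)  # 12 different forms
TWO_PAGE_FORMS = 8                         # stated in the announcement


def count_form_faces(total_forms, two_page_forms):
    """Count form faces, assuming every form has either one or two pages."""
    single_page_forms = total_forms - two_page_forms
    return single_page_forms * 1 + two_page_forms * 2


FORM_FACES = count_form_faces(TOTAL_FORMS, TWO_PAGE_FORMS)
print(FORM_FACES)  # 4 single-page faces + 16 two-page faces = 20
```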

Each image is stored in bi-level black and white raster format. The images in
this database appear to be real forms prepared by individuals, but they
have been automatically derived and synthesized using a computer and contain no
"real" tax data. The entry field values on the forms have been automatically
generated by a computer in order to make the data available without the danger
of distributing privileged tax information.

In addition to the images, the database includes 5,590 answer files, one for
each image. Each answer file contains an ASCII representation of the data found
in the entry fields on the corresponding image. Image format documentation and
example software are also provided.
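
Because every image is meant to have exactly one answer file, a simple
consistency check can confirm that a copy of the database is complete. The
sketch below is hypothetical: the announcement does not describe the directory
layout or file naming, so it assumes that an image and its answer file sit in
the same directory and share a base name, and it takes both file extensions as
parameters rather than guessing them.

```python
from pathlib import Path


def find_unpaired(root, image_ext, answer_ext):
    """Return (images lacking an answer file, answer files lacking an image).

    Assumption: paired files share a directory and base name and differ
    only in extension. Results are paths relative to root, extension removed.
    """
    root = Path(root)
    images = {p.relative_to(root).with_suffix("")
              for p in root.rglob(f"*{image_ext}")}
    answers = {p.relative_to(root).with_suffix("")
               for p in root.rglob(f"*{answer_ext}")}
    return sorted(images - answers), sorted(answers - images)
```

For a complete copy, both returned lists would be empty and each set would
hold 5,590 entries.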

The uncompressed database totals approximately 5.9 gigabytes of data.

"NIST Special Database 2" has the following features:
+ 5,590 full-page images
+ 5,590 answer files
+ 12 pixels per millimeter resolution
+ image format documentation and example software
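
The stated resolution and image count allow a rough sanity check of the
uncompressed size. The sketch below assumes, without confirmation from the
announcement, that each form face is a US letter page (8.5 x 11 inches)
scanned edge to edge and stored at one bit per pixel with rows padded to whole
bytes; any file headers are ignored. Under those assumptions the estimate
comes out near 6 gigabytes, in line with the approximate total quoted above.

```python
import math

MM_PER_INCH = 25.4
PIXELS_PER_MM = 12   # from the feature list
IMAGE_COUNT = 5590   # from the announcement

# Assumption (not stated in the announcement): US letter pages.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def bilevel_image_size(width_in, height_in, pixels_per_mm):
    """Return (width_px, height_px, bytes) for a 1-bit-per-pixel raster."""
    width_px = round(width_in * MM_PER_INCH * pixels_per_mm)
    height_px = round(height_in * MM_PER_INCH * pixels_per_mm)
    bytes_per_row = math.ceil(width_px / 8)  # pad each row to a whole byte
    return width_px, height_px, bytes_per_row * height_px


width_px, height_px, image_bytes = bilevel_image_size(
    PAGE_WIDTH_IN, PAGE_HEIGHT_IN, PIXELS_PER_MM)
total_bytes = image_bytes * IMAGE_COUNT

print(f"{width_px} x {height_px} pixels, {image_bytes} bytes per image")
print(f"estimated total: {total_bytes / 1e9:.2f} GB")
```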

Suitable for automated document processing system research and development, the
database can be used for:
+ algorithm development
+ system training and testing

The system requirements are a 5.25" CD-ROM drive and software that can read
the ISO-9660 format.

If you have any further technical questions, please contact:

Darrin L. Dimmick
dld at

If you wish to order the database, please contact:

Standard Reference Data
National Institute of Standards and Technology
Gaithersburg, MD 20899
(301)926-0416 (FAX)
