QUESTION: What is OCR?
ANSWER: OCR stands for "Optical Character Recognition." It is a
computerized process that converts a paper document into a computer
file that you can search and manipulate.
An OCR system reads text from paper, translates the images of
letters, numbers, punctuation marks, etc. into a text-based form,
and creates a computer file that contains the translated
information. The file that gets created stores each character as
an ASCII code, along with font information.
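
To make "ASCII codes" concrete, here is a minimal Python sketch. The
sample string is a made-up example, not output from any real OCR
product; it only shows that text in a computer file is stored as one
number per character.

```python
# Hypothetical sample text, standing in for text recognized from a page.
text = "OCR 101"

# Each character is stored as a number: its ASCII code.
codes = [ord(ch) for ch in text]
print(codes)  # [79, 67, 82, 32, 49, 48, 49]

# Converting the codes back to characters recovers the original text.
restored = "".join(chr(code) for code in codes)
print(restored)  # OCR 101
```

Because the file holds character codes rather than a picture, a
program can search it for a word or edit it like any other text.
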
All OCR systems include a machine called a "scanner." This is a
device with a clear glass surface on it and a camera inside it.
You put a document face-down on the glass, and the camera takes a
picture of the document and stores that picture as a bitmap file
(also known as an "image file").
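
A bitmap can be pictured as a grid of dots. The Python sketch below is
an illustration only: the 5x5 grid and the names BITMAP_T and render
are invented for this example, and a real scanner produces far larger
grids in a proper image file format.

```python
# A tiny 5x5 one-bit bitmap of the letter "T".
# 1 = dark dot (ink), 0 = blank paper.
BITMAP_T = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

def render(bitmap):
    """Draw the bitmap as text: '#' for a dark dot, '.' for a blank one."""
    return "\n".join(
        "".join("#" if dot else "." for dot in row) for row in bitmap
    )

print(render(BITMAP_T))
```

Note that the grid records only where the dots are. Nothing in it says
"this is the letter T"; working that out is the job of the OCR
software.
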
Then, the OCR software in your computer uses artificial
intelligence to examine the patterns of dots in the image file and
create a file that contains text represented as fonts and ASCII
codes.
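
One simple way to examine patterns of dots is template matching:
compare the scanned dots against a stored pattern for each known
character and pick the closest one. The Python sketch below assumes
that approach purely for illustration. The templates, the names
TEMPLATES, score and recognize, and the noisy sample are all
hypothetical, and commercial OCR software generally uses more
sophisticated methods.

```python
# Hypothetical 5x5 dot patterns for three letters ('#' = dark dot).
TEMPLATES = {
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "O": ["#####", "#...#", "#...#", "#...#", "#####"],
}

def score(glyph, template):
    """Count the dots that agree between a scanned glyph and a template."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )

def recognize(glyph):
    """Return the character whose template agrees with the most dots."""
    return max(TEMPLATES, key=lambda ch: score(glyph, TEMPLATES[ch]))

# A scanned "T" with one stray dot of noise in the third row.
noisy_t = ["#####", "..#..", "..#.#", "..#..", "..#.."]

letter = recognize(noisy_t)
print(letter, ord(letter))  # T 84
```

The stray dot costs the "T" template one point (24 of 25 dots agree),
but it still beats the other templates, so the glyph is stored as
ASCII code 84.
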
With most OCR systems, the image file that
is created by the scanner is discarded after the final file
(the file containing the fonts and ASCII codes) has been created.