Text Scanning With OmniPage Pro

OmniPage Pro is optical character recognition (OCR) software that lets you scan printed text documents, convert them to editable text, and then edit them in a word processor or other application of your choice.


What is OCR?

Optical character recognition (OCR) is the process of turning an image into computer-editable text. OCR software analyzes binary images of character shapes, identifies each shape as a particular character, and outputs the characters as a text data stream. After OCR is performed, the resulting text can be exported to a variety of word-processing, page-layout, and spreadsheet applications.
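To make the idea concrete, here is a toy sketch of that process in Python. It is not how OmniPage Pro works internally; it only assumes that each character arrives as a tiny binary image and that recognition means picking the closest known shape. The TEMPLATES table and all function names are invented for this illustration.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Each glyph is a tiny binary image (1 = ink, 0 = background).
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def pixel_distance(glyph_a, glyph_b):
    """Count the pixels that differ between two equally sized binary glyphs."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(glyph_a, glyph_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize_glyph(glyph):
    """Return the template character whose shape is closest to the glyph."""
    return min(TEMPLATES, key=lambda char: pixel_distance(glyph, TEMPLATES[char]))


def recognize_line(glyphs):
    """Turn a sequence of glyph images into a text string."""
    return "".join(recognize_glyph(glyph) for glyph in glyphs)


# A "T" with one stray ink pixel, as a scanner might produce.
noisy_t = ("111", "010", "010", "011", "010")
scanned = [TEMPLATES["L"], TEMPLATES["I"], noisy_t]
text = recognize_line(scanned)
print(text)
```

The noisy glyph is still read as a T because it differs from the T template by only one pixel, which is the basic reason clean, machine-printed originals give the best results.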


Basic Steps of Performing OCR

  1. Scan - create an image of your document
  2. Create zones - identify areas to recognize as text or retain as graphics
  3. Perform OCR - convert text information into editable text characters
  4. Export - save your document in the desired format


Scanning the Document

  1. Place your document face down on the scanner, being sure to match the upper right-hand corner of the document with the upper right-hand corner of the scanner.
  2. Select Scan Image from the Image button's drop-down list.
  3. Click the Image button.
  4. Your document now appears as an image on the screen.


Creating Zones

Zones are borders that identify areas of an image that will be recognized as text or retained as graphics. Any part of an image not enclosed by a zone is ignored during OCR. OmniPage Pro can analyze a page and create zones automatically for you. It uses the setting selected in the Zone button to determine the text flow on the page, and divides the page into ordered zones. You can choose any of the following settings: Single-Column, Multiple-Column, Tables, or Mixed Pages.

  1. Choose a setting in the Zone button's drop-down list that most closely matches the format of your document.
  2. OmniPage Pro automatically draws zones on the current page in the image viewer. Each zone is numbered, and includes an A or G in the top right corner to indicate whether the zone is alphanumeric (A) or a graphic (G).

To resize a zone, click on a corner of the zone, hold down the left mouse button, and drag the zone outline. To delete a zone, click inside it and press the Delete key.
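If it helps to think of zones as data, the sketch below models them in Python as numbered rectangles marked A or G. This is only an illustration of the rules described above, assuming pixel coordinates measured from the top left of the page; the Zone class, the plan_page function, and the sample page items are hypothetical and are not part of OmniPage Pro.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A numbered rectangle on the page (hypothetical model)."""
    number: int
    kind: str    # "A" = alphanumeric, "G" = graphic
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        """True if the point (x, y) lies inside this zone."""
        return self.left <= x < self.right and self.top <= y < self.bottom


def zone_for_point(zones, x, y):
    """Return the first zone enclosing the point, or None if no zone does."""
    for zone in zones:
        if zone.contains(x, y):
            return zone
    return None


def plan_page(zones, items):
    """Decide what happens to each (label, x, y) item on the page."""
    plan = []
    for label, x, y in items:
        zone = zone_for_point(zones, x, y)
        if zone is None:
            action = "ignore"            # outside every zone
        elif zone.kind == "A":
            action = "recognize"         # converted to editable text
        else:
            action = "keep as graphic"   # retained as an image
        plan.append((label, action))
    return plan


zones = [Zone(1, "A", 0, 0, 100, 50), Zone(2, "G", 0, 60, 100, 120)]
items = [("headline", 10, 10), ("photo", 50, 90), ("margin note", 150, 10)]
page_plan = plan_page(zones, items)
print(page_plan)
```

The margin note falls outside both zones, so it is ignored, which is why it is worth checking the automatically drawn zones before you perform OCR.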


Performing OCR

Performing OCR converts an image to computer-editable text. OmniPage Pro recognizes only machine-printed characters, such as laser-printed or typewritten text. Handwritten text cannot be recognized, but it can be retained as a graphic.

  1. Choose Options in the Tools menu and click on the Page Format tab. Select an output format for your document. The output format tells OmniPage Pro how much, or how little, of the document's original formatting to retain.
  2. Set Perform OCR as the command in the OCR button's drop-down list. (Alternatively, choose OCR and Check if you would like to spell-check your document after recognition.)
  3. Click the OCR button. The page is recognized according to the current Zone settings. If there are no zones on the page, zones are created according to the Zone button's current command. Recognized text now appears in the text viewer.
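
The fallback described in step 3 (use the zones already on the page, and create zones only when there are none) can be sketched in Python as follows. The auto_create_zones function is a deliberately crude, hypothetical stand-in for automatic zoning; it simply assumes that Multiple-Column means two equal columns and that every other setting means one zone covering the page. It does not reflect OmniPage Pro's actual page analysis.

```python
def auto_create_zones(zone_setting, page_width, page_height):
    """Hypothetical stand-in for automatic zoning.

    Returns a list of (left, top, right, bottom) rectangles.
    """
    if zone_setting == "Multiple-Column":
        middle = page_width // 2
        return [(0, 0, middle, page_height), (middle, 0, page_width, page_height)]
    # Assumption: treat every other setting as one zone covering the page.
    return [(0, 0, page_width, page_height)]


def zones_to_recognize(existing_zones, zone_setting, page_width, page_height):
    """Use the zones already drawn; create zones only when the page has none."""
    if existing_zones:
        return list(existing_zones)
    return auto_create_zones(zone_setting, page_width, page_height)


manual = [(10, 10, 200, 80)]
kept = zones_to_recognize(manual, "Multiple-Column", 600, 800)
created = zones_to_recognize([], "Multiple-Column", 600, 800)
print(kept)
print(created)
```

In the first call the manually drawn zone is kept untouched; in the second, the page has no zones, so they are generated from the current Zone setting.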


Saving the Results

Recognized text and graphics can be saved to disk in a variety of file formats, including WordPerfect, MS Word, HTML, ASCII, dBase, and Quattro Pro.

  1. Choose Save As in the Export button's drop-down list, or choose Save As in the File menu.
  2. Select a folder location and file type.
  3. Type a file name.
  4. If you will be scanning more pages into the same document, select Create One File for All Pages under Save Options.
  5. Click OK.



To scan additional pages:

Place the next page on the scanner and follow the above instructions for Scanning the Document. Before continuing to Creating Zones, you must click on the thumbnail image of the new page in the thumbnail viewer (see left). This tells OmniPage Pro which page you are going to work with. Next, create zones and perform OCR as above.

Because you have already saved your file and told OmniPage Pro to create one file for all pages, you can now just click the Save icon (the picture of the floppy disk) in the top menu bar to save additional pages after performing OCR.
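
The "one file for all pages" idea can be pictured with a short Python sketch: the first save creates the file, and each later save rewrites it with every page recognized so far. This is only an illustration; the MultiPageFile class, the file name report.txt, and the use of a form-feed character as the page separator are assumptions made for the example, not a description of OmniPage Pro's file formats.

```python
import os
import tempfile

PAGE_BREAK = "\f"  # ASCII form feed, a common plain-text page separator


class MultiPageFile:
    """Hypothetical stand-in for a 'one file for all pages' export."""

    def __init__(self, path):
        self.path = path
        self.pages = []

    def add_page(self, recognized_text):
        """Queue the recognized text of one more page."""
        self.pages.append(recognized_text)

    def save(self):
        """Rewrite the file so it always holds every page added so far."""
        with open(self.path, "w", encoding="utf-8") as handle:
            handle.write(PAGE_BREAK.join(self.pages))


with tempfile.TemporaryDirectory() as folder:
    doc = MultiPageFile(os.path.join(folder, "report.txt"))
    doc.add_page("Text of page one")
    doc.save()  # first save, comparable to Save As
    doc.add_page("Text of page two")
    doc.save()  # later saves, comparable to clicking the Save icon
    with open(doc.path, encoding="utf-8") as handle:
        saved = handle.read()

print(saved.split(PAGE_BREAK))
```

Because the file location and name are chosen once, later saves need no further input, which mirrors clicking the Save icon after each additional page.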