GCN, 04/14/08 issue
Reading between the lines
By John Breeden II
OmniPage Professional 16 from Nuance Communications is designed to work with paper in all of its forms, but it could also be the key to eliminating paper from your agency, saving both time and the environment.

At its heart, OmniPage is an advanced optical character recognition (OCR) package, and older versions of the software did little else. But the latest version is a complete system that can read text from almost any source and, if you choose, create electronic documents that many different programs can open and edit.

To start, we decided to try out the new OCR engine using a standard Epson flatbed scanner attached to a modest test system in the lab.

We dug up a bunch of graphics-laden promotional fliers from monitor companies and pulled out our infamous printer test document, which contains letters running through, over and under graphics. Could OmniPage's OCR pull text under these challenging circumstances? And what would happen to the graphics?

OmniPage was extremely accurate in decoding the complicated mess of text on the pages we sent through it. In 45 pages of mixed media scanned, it got only two words wrong, and both blended into the dark background graphic. Even with those questionable pages factored in, missing two words out of 800 gives the program an impressive accuracy rate of 99.75 percent. And this was under less-than-ideal circumstances. Most documents don't have words printed over a picture of a fog-enshrouded Golden Gate Bridge.
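
For readers who want to check the math, here is a minimal sketch in Python of how that accuracy figure works out. The function name `word_accuracy` is our own illustration and has nothing to do with OmniPage itself; the only inputs are the 800 words and two misses reported above.

```python
def word_accuracy(total_words: int, wrong_words: int) -> float:
    """Return the percentage of words that were recognized correctly."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= wrong_words <= total_words:
        raise ValueError("wrong_words must be between 0 and total_words")
    # Correct words divided by total words, expressed as a percentage.
    return 100.0 * (total_words - wrong_words) / total_words


# Figures from the test above: 800 words, two misread.
accuracy = word_accuracy(800, 2)
print(f"{accuracy:.2f} percent")  # prints "99.75 percent"
```

The same calculation applies to any scan batch: count the words, count the misses and plug them in.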

OmniPage picks up graphics and assigns them their own element numbers, which makes deleting them easy. When we wanted to keep a graphic, we could select the True Page option from the Save menu, which keeps everything in the same order as it was captured. We could also select the Flowing Text option, which keeps graphics and text intact but lays them out in a line down the page.

Once you have captured the information, which takes about 20 seconds longer than a standard scan, you can save it in several formats. If you only need to convert it to an electronic format, you can save the file as a PDF. If you want to be able to edit it later, you can save it as a Microsoft Word file. Or you can save your data in a format that most spreadsheet applications can open. OmniPage 16 files can also be saved in Corel WordPerfect, HTML and native Excel 2007 formats.
