For me, technology falls into two broad categories: "Wow!
That's cool. I can use that!" and "Wow. That's interesting. But I don't
need it." Until recently, voice-transcription software, which turns
spoken words into typed words, and optical character recognition, the
noble art of turning paper documents into editable digital documents,
had always belonged in the latter category.
Why? Because I'm a
touch-typist with several billion words under my belt, and I can type at
a fair clip. I can usually keep the thread of an interview by taking
down phone conversations as they happen.
But I recently snapped a
ligament in my E finger (which is also useful for typing D, C, and a
couple of numbers too), and the orthopedic surgeon told me it needed to
be splinted for six weeks. Trust me, you can't type properly with your E
finger in a splint. So I turned to the very products I had once smugly
dismissed: Omnipage Professional 15, a leader in the field of optical
character recognition, and the premier dictation software, Dragon
NaturallySpeaking 8.1. And I'm glad I did.
Both products are
published by Nuance,
which doesn't sell either on the cheap. NaturallySpeaking Standard
edition is $99, but a more feature-rich Preferred version with support
for macro scripting and saving audio and text (for later proofing) costs
$199. Versions for the medical and legal professions cost $1,099, with
multi-seat and site license discounts available. Omnipage standard
edition is $149, but a much more capable workflow-enhanced Professional
version--with some heavy-duty PDF making options and support for
barcode-driven work processes--balloons the price up to $499. But for
volume production, both these products are worth every dime.
Speak Up!
NaturallySpeaking is a bit of a
misnomer in my case. Naturally speaking, I go very fast and mumble,
swallowing as many consonant sounds as I can. I also tend to preface a
thought with at least one "uh" or "um." The software NaturallySpeaking
turned into a bit of a "My Fair Lady" experience. I learned to speak a
lot more clearly.
After reading out a longish passage of text
into the bundled Andrea noise-canceling headset so that the software
could get a handle on my pronunciation, I was ready to embark on the
acid test of the product: actually producing a report without touching
the keyboard. It doesn't come naturally to include punctuation marks
(comma) or paragraph breaks as you speak (comma) but with
NaturallySpeaking, that's what you have to do. (Period. New
paragraph.)
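To make the spoken-punctuation idea concrete, here is a minimal Python sketch of how dictated commands might be turned into marks on the page. The command phrases and the render_dictation function are my own illustrative assumptions, not NaturallySpeaking's actual command list or internals.

```python
import re

# Hypothetical spoken commands and the text each one produces.
# These phrases are assumptions for illustration only.
SPOKEN_COMMANDS = {
    "new paragraph": "\n\n",
    "comma": ",",
    "period": ".",
}


def render_dictation(spoken: str) -> str:
    """Replace spoken punctuation commands with the marks they stand for."""
    text = spoken
    # Try longer phrases first so a multi-word command is matched whole.
    for phrase in sorted(SPOKEN_COMMANDS, key=len, reverse=True):
        mark = SPOKEN_COMMANDS[phrase]
        # Swallow the whitespace before the command so the mark
        # sits directly against the preceding word.
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, mark, text, flags=re.IGNORECASE)
    # Drop spaces left over at the start of a new paragraph.
    return re.sub(r"\n\n[ \t]+", "\n\n", text)


if __name__ == "__main__":
    print(render_dictation("It is hard comma but you adapt period"))
```

A sketch this naive would also mangle the word "period" when you mean a stretch of time, and it does not capitalize the next sentence, which hints at how much more a real dictation product has to work out from context.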
That said, it's easy to get into the flow after a
while. And when the product fails to recognize something you say (it
will, but less often as you get better at enunciating clearly), you can
issue a voice command to highlight the word, and pick from a list of
other possible options. If the word you were trying to say isn't there,
you can spell it out (or type it, if you wish) and the program will "get
it" from there on out.
I had a little more difficulty making the
program transcribe some oral history tapes I have. Even the most
well-spoken interviewees tend to talk a little too fast for the product to
catch every word, and ill-mannered interviewers who interrupt tend to
confuse the product, too. That said, it's handy for pre-processing
recordings of speeches or interviews if you don't mind going back and
editing them afterwards.
The real market for this product (apart
from temporarily disabled professional writers) is nontypists and
people who need to transcribe recordings of random ideas. It may take a
fair amount of work after the fact, but as any legal or medical
transcriber knows, it takes a lot more work to do it from scratch.
Three-Dimensional Photocopying
There's a
peculiar magic to optical character recognition. It takes scanned text
and turns it from a flat image of text into something that you can look
at and edit. Practically every scanner comes with a basic or
limited-time OCR program, but the difference between them and a full-on
Omnipage Professional is the difference between the animation effects in
"South Park" and those in "Harry Potter and the Goblet of Fire."
The professional version of Omnipage that I used can convert a
scanned document into a faithfully formatted file in Word, Excel,
PowerPoint, HTML, XML, or searchable and taggable PDF format. If you
have a particularly bad document (such as a blurry purple cyclostyle
duplicate from the 1970s), you'll have to help the product along as it
tries to recognize blobby letters.
But in general, I've been
pleasantly surprised by how smart the product has become since I last
used it regularly in the late 1990s. Those extra iterations have really
cranked up the accuracy, which was already pretty good back then. And
the addition of little niceties (like a form creation tool that can turn
paper forms into electronic versions to fill in on-screen, and a
character map window that lets you click on characters like the
copyright or trademark symbol during the proofing cycle) makes the
product even more compelling.
Like previous versions, Omnipage
Professional 15 can be set to watch folders (including Outlook or Lotus
e-mail folders) for documents, and convert them into whatever format you
set (PDF is a particularly useful one) using batch files. Such workflow
automation is uncannily useful, and a step-by-step Job Wizard makes it
much easier to set up. I also appreciated the addition of Google
Desktop Search so you can search your converted documents for text they
contain.
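For readers curious about what a watched folder amounts to, here is a rough Python sketch of the general idea: look in an inbox folder at intervals and hand anything new to a conversion step. It is not how Omnipage does it. The function names, the file extensions, and the convert callback are all assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set

# File types treated as scanned documents (an assumption for this sketch).
SCAN_SUFFIXES = (".tif", ".tiff", ".pdf")


def find_new_documents(inbox: Path, seen: Set[str]) -> List[Path]:
    """Return scan files in inbox that have not been seen before."""
    fresh = []
    for path in sorted(inbox.iterdir()):
        is_scan = path.is_file() and path.suffix.lower() in SCAN_SUFFIXES
        if is_scan and path.name not in seen:
            seen.add(path.name)  # remember it so it is handled only once
            fresh.append(path)
    return fresh


def watch(inbox: Path, convert: Callable[[Path], None],
          interval: float = 5.0, max_polls: Optional[int] = None) -> None:
    """Poll inbox and pass each new document to convert.

    convert is a hypothetical stand-in for the real OCR/conversion step.
    max_polls limits the loop; None means keep watching indefinitely.
    """
    seen: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for document in find_new_documents(inbox, seen):
            convert(document)
        polls += 1
        time.sleep(interval)
```

Calling watch(Path("scans"), print, max_polls=1) on an existing folder named scans would simply print the path of each scan it finds once. A real setup would swap print for whatever actually performs the conversion.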
As a bonus, the Pro version also includes two
Acrobat-related tools called PDF Create and PDF Converter. Create is a
handy tool that works from right-click menus in Windows Explorer to
convert text or image documents into PDF files, and can also combine and
overlay multiple PDFs into a single document. It also includes a Word
add-in for converting Word documents as you go. PDF Converter does the
opposite: turning PDF files into Word documents.
Well, my
throat's getting a bit sore from all this writing now. So I'll just sign
off and go tend to my aching E finger. I'll type my next column when
this splint's off. Or maybe I'll stick with dictation. We'll see.
(Period. End column.)
Contributing Editor Matt Lake
writes SOHO Advisor monthly for ComputerUser.