Comparing Speech Recognition Programs
By James A. Eidelman
(For “Second Opinions” Column, Law Technology News, February 2000)
There are three leading speech recognition software packages used in the U.S. –
Dragon NaturallySpeaking, IBM ViaVoice, and Lernout & Hauspie’s
VoiceXpress. The latest versions of
each of these packages are all significantly improved, and I can recommend each
of them as “good enough.” Dragon used
to stand out as the clear best choice for most users, and it is probably still
the leader. But which one will work
best in your office will depend on the personal preferences of your lawyers and
staff, what kind of hardware you use, how you implement them, and how you use
them.
The software DOES work, and CAN improve the efficiency of the secretary, and
sometimes even the lawyer.
However, with Dragon, IBM or L&H, there are dozens of complicating factors involving
personal taste, skill, practice, patience, and the right combination of
compatible hardware and software.  These
are not merely matters of preference.  Unless everything
(including the user) is working together properly, frustration rather than
creativity will result.
Here are some guidelines:
- Assuming your hardware will
support it, be sure to use a recent version. IBM’s Millennium Edition is far better than the 98 version,
and Dragon and L&H have each improved their products as well. There are improvements in three
areas: (1) accuracy, (2) reduced
time from setup through training to production, and (3) improved
user-friendliness.
- If an attorney is doing his
or her own proofreading and correction of errors, he or she will lose in
correction time any efficiency gained.
It is very important to let your secretary or someone else correct
the dictation unless you have no choice.
IBM is the only package that lets you save the speech files with
the automatically converted text for “deferred correction.” (You need to have plenty of disk space,
and turn on this option.) With
Dragon, if the attorney is using NaturallySpeaking interactively, the
secretary cannot hear the speech when she corrects the work. With Dragon, the attorney should
capture the speech first, and then have the secretary do the automated
transcription and correction.
- If an attorney is speaking
directly to the PC, hardware is of paramount importance. First, SPEED. You must have a very fast PC. Both accuracy and usability improve with speed. You should
have at least a 400 MHz Pentium II, Celeron or K6-II, and you should get a
Pentium III for Dragon to use the BestMatch III technology – a significant
enhancement in NaturallySpeaking 4.0.
And be sure to have enough memory.
Allow 128 MB of RAM just for the speech recognition software, which
means 256 MB total if you will be running a Microsoft or Corel office suite
with lots of windows open at a time.
Unless you use a USB mike, make sure you have a high quality sound
card that is compatible with the software and the mike. Some experimentation may be
required.
- The general rule is that if
you work with multiple machines, you need to “enroll” (train the machine)
and maintain separate voice files for each machine. You can avoid this problem if you use a
USB (Universal Serial Bus) mike, which bypasses the sound card. This is a new development.
- When speaking to the
computer, some lawyers prefer a headset, while others prefer a hand-held
mike that is similar to that used with a traditional dictating
machine. The Philips SpeechMike
Pro is the most popular with Dragon and IBM, and Dictaphone markets its
own Boomerang hand-held mike. If
you use a telephone headset, which I do recommend for those who spend lots
of time on the phone, a headset from Andrea or VXI will work very
well. A high-quality,
noise-canceling headset is a must in order to achieve excellent results.
- If a lawyer is used to using
a dictating machine, an excellent way to benefit from the technology is to
have the lawyer continue to use a dictating machine, and have the
secretary be the one who is automated with speech recognition software. Dragon and L&H each market a mobile
version that uses a digital voice recorder, and Dictaphone and Olympus
each market their dictating machines with a special version of IBM
ViaVoice. (Note that both of them
currently support only ViaVoice 98, but they are expected to release
ViaVoice Millennium versions soon.)
Norcom’s tape-based recorders are not as convenient, but are the
most accurate. And new digital
recorders are being released from Panasonic, Sony and Grundig. The Grundig looks the best on paper, and
has a traditional thumb slide control.
The Olympus D1000 recorder comes with a removable memory card that can be
read by a laptop or desktop computer’s optional memory card reader. This makes it even easier to transfer
voice files to a secretary by email or “sneaker net,” without having to
mess with downloading through a cable.
An external microphone of high quality will significantly improve the accuracy
when dictating to a digital dictating machine.
You can either use a headset or a “lollipop mike” that plugs into the
hand-held recorder.  None of the
digital recorders is good enough on its own without the external mike!
Dictaphone, Philips and Olympus each market optional transcription units for a
secretary that let her use a foot pedal to move forward and back through the
text. She can follow the voice in her
ears as she sees the cursor move through the text on the screen. This is a great feature. I wish Dragon had it.
If Dictaphone successfully integrates with the IBM ViaVoice Millennium edition
this year, then the Dictaphone Boomerang products for lawyer and secretary may
offer the solution that most easily lets attorneys and secretaries benefit from
the technology while continuing to work the way they have always worked in the
past.
I am a big believer in the idea that this technology is useful, but does more
to improve the effectiveness of secretaries than that of attorneys.
- L&H clearly has the best
integration with Microsoft Office.
If you want to use the software interactively with Word and use
voice commands to operate the system, L&H is for you. L&H is also the best in handling
languages other than English.
- Dragon’s legal and
professional versions ($1,000 and $700 respectively) are significantly
more expensive than IBM and L&H.
This is because they are marketed through value-added
resellers. There are fancy voice
macros you can create with the Dragon Pro and Legal editions, but you
don’t need to spend the extra money unless you plan to use those
features. For litigation, the
legal editions come in handy in correctly formatting case citations. For non-litigators, the legal edition
doesn’t add much.  You are more
likely to have the system miss on a client’s name than on res ipsa
loquitur.
- In the latest tests, Dragon
was just behind IBM in accuracy, but it was considered the
easiest to use overall, given the process of training, teaching the PC new
words, and making corrections.
- To improve accuracy, all of the packages offer a
"vocabulary builder." You can feed the software a
whole directory of your documents. The software does two things
as part of this process: First, it identifies words that are in
your documents but not in its dictionary, and gives you the opportunity to
add them. Second, it learns about the phrases
and patterns that you use. It will use this information to make
fewer mistakes as it interprets what you say. This
will make a difference, whether you are working interactively or in "mobile
mode." Another trick is that running the same files through the
vocabulary builder several times will improve accuracy.
- A computer-literate lawyer
who has a very fast computer and takes the time to learn how to use the
software can become very quick in dictating not only word processing documents,
but also time entries, case notes in a case management or litigation
support system, calendar entries and emails. In this setting, I like Dragon’s Professional version the
best, working with macros created to speed time entry. With all of them, stored paragraphs and
macros can significantly improve the productivity of any lawyer or staff
member.
- I strongly recommend that
you visit the various discussion groups on the Internet to see what users
are saying about the specific hardware and software you are
considering. Most people are very
generous with their advice. For
Dragon’s forum, see www.dragonsys.com/support/discforum.html. Obviously, all of this can be
confusing, and trial and error will be required. Be patient, start with a pilot group composed of people who
are tolerant, and try both deferred dictation with a hand-held recorder
and speaking directly to the PC.
Finally, get training and support from a specialist. Those who work regularly with these
products can help get your firm off to a good start.
- If you want to try this technology without having
to buy software or train anyone, you can use CyberSecretaries at www.youdictate.com. The most efficient way
to work with them is to dictate using a voice recorder, or use a mike to
save your dictation on your PC. Then email your voice files to
them. An army of home-based transcriptionists is available to
clean up the automatically transcribed dictation, or to type your words if
they need to, and they will turn your work around in about an
hour.
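For readers curious about what the "vocabulary builder" described above is doing, here is a rough Python sketch of its two steps: flagging words that are in your documents but not in the dictionary, and counting the word patterns you tend to use. This is only an illustration, not how Dragon, IBM or L&H actually implement the feature. The function names, the tiny dictionary and the sample sentences are all made up, and the real products rely on far more sophisticated language models.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents, dictionary):
    """Return (unknown-word counts, adjacent word-pair counts).

    Hypothetical helper for illustration only.
    """
    unknown = Counter()
    pairs = Counter()
    for text in documents:
        words = tokenize(text)
        # Step 1: words found in the documents but not in the dictionary.
        unknown.update(w for w in words if w not in dictionary)
        # Step 2: which words tend to follow which (a crude phrase model).
        pairs.update(zip(words, words[1:]))
    return unknown, pairs


# Made-up dictionary and documents, standing in for a directory of files.
dictionary = {"the", "plaintiff", "filed", "a", "motion", "against"}
documents = [
    "The plaintiff filed a motion against Acme.",
    "Acme filed a motion.",
]

unknown, pairs = build_vocabulary(documents, dictionary)
print(unknown.most_common())   # candidate words to add, most frequent first
print(pairs[("filed", "a")])   # how often "filed a" appeared
```

In this toy run, the client name "Acme" is the only word flagged for addition, which is exactly the kind of word a speech recognition package is most likely to miss.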
Jim Eidelman is President of Eidelman Associates, technology consultants to the
legal profession, based in Ann Arbor, Michigan. Eidelman@lawtech.com. For additional information, visit www.lawtech.com/jimtips/voice.