We opened this month's Q&A SIG with some interesting questions. One question dealt with a Win modem that stopped working after a computer system was upgraded. As most of you know, I am not a big fan of Win modems. I prefer what I think of as a "real" modem, what is now referred to as the controller type of modem, as opposed to the controller-less modem. In this instance, one of the last details mentioned was that the operating system had changed from Win9x to Win2K, and that the Win9x software driver was what had been installed under Win2K. No attempt had been made to check the web site for an updated driver written specifically for Win2K. It is fairly well known that a lot of the older Win9x drivers do not work properly with Win2K. My guess is that if he can find a Win2K driver, his problem will be fixed.
The primary topic of discussion for this SIG, however, was the IBM ViaVoice for Windows release 9 voice recognition software. Over the last several years I have tried several different versions of the ViaVoice software. Each seemed pretty impressive at the time, and it has gotten steadily better. Not only has the software improved in its ability to run the algorithms that process the sounds from the microphone, but the horsepower of the computer has increased greatly, which makes it much easier to run the rather complex software.
The manual notes that the computer is at a severe disadvantage compared to human beings when it comes to understanding speech. At the very least, people can fill in the speech they did not hear correctly based on the context of the subject under discussion. In addition, people often get visual clues from body language and facial expressions.
The computer is limited to the sound information alone. It has to figure out what was said from speech delivered at the proper volume, from the correct pronunciation of each word, and, to a minor extent, from context and grammar. For example, the program knows that you go "to" the store, not "two" the store.
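For readers who like to see the idea in code, here is a toy sketch of how context might settle the "to" versus "two" choice. It is only an illustration, not how ViaVoice actually works: the word-pair counts are made up, and the names `PAIR_COUNTS`, `pick_word`, and `fix_sentence` are hypothetical.

```python
# Toy sketch only. The counts below are invented for illustration and
# are NOT taken from ViaVoice or any real speech recognition product.

# Groups of words that sound alike.
HOMOPHONE_SETS = [{"to", "two", "too"}]

# Made-up counts of how often one word follows another.
PAIR_COUNTS = {
    ("go", "to"): 50, ("to", "the"): 60,
    ("go", "two"): 1, ("two", "the"): 1,
    ("go", "too"): 3, ("too", "the"): 0,
    ("bought", "two"): 20, ("two", "apples"): 15,
    ("bought", "to"): 1, ("to", "apples"): 0,
    ("bought", "too"): 1, ("too", "apples"): 0,
}


def candidates(word):
    """Return every word that sounds like `word`, or just the word itself."""
    for group in HOMOPHONE_SETS:
        if word in group:
            return sorted(group)
    return [word]


def pick_word(prev_word, heard, next_word):
    """Pick the sound-alike that fits best between its two neighbors."""
    def score(cand):
        before = PAIR_COUNTS.get((prev_word, cand), 0)
        after = PAIR_COUNTS.get((cand, next_word), 0)
        return before + after
    return max(candidates(heard), key=score)


def fix_sentence(words):
    """Walk through the sentence and re-choose each word using context."""
    fixed = []
    for i, word in enumerate(words):
        prev_word = fixed[i - 1] if i > 0 else ""
        next_word = words[i + 1] if i + 1 < len(words) else ""
        fixed.append(pick_word(prev_word, word, next_word))
    return fixed


if __name__ == "__main__":
    print(" ".join(fix_sentence("you go two the store".split())))
    print(" ".join(fix_sentence("i bought to apples".split())))
```

With these made-up counts, the sketch turns "go two the store" into "go to the store" because "go to" and "to the" are far more common pairs. A real product would presumably rely on much larger statistics combined with the acoustic evidence.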
As with all speech recognition software, you must go through a training session before you can use it. The training session consists of reading two or more short stories while the software listens and tries to match the words it knows you are reading from the screen with the sounds it hears through the microphone. The directions suggest that you read all four of the available stories and say that this will take about an hour. Being in a bit of a rush to get ready for the SIG, I only finished two of them. The manual also warns that a different environment can affect the recognition ability of the software. For instance, the reverberation present in the auditorium where we hold the SIG can significantly reduce the accuracy of the program's recognition. As I did not have time to go through a separate training session that I could then select for the auditorium, I was reluctant to try the program in that setting. I did give it a try, though, and for the most part I thought it did pretty well. The audience did get a few laughs when some interesting or funny lines showed up on the screen.
The program has an extensive repertoire of voice commands, both for dictation and for control purposes. As with any large, complex program, the learning curve to become an efficient user is significant. On the other hand, there is no question that I can speak a great deal faster than I can type. When I finish dictating to SpeakPad and transfer the text to the word processor, I get a second check of the grammar. Some of the dictation errors that are difficult to catch are small things, like a missing S on the end of a word that you wanted to be plural. Another common error is the substitution of a word that sounds very similar to the word spoken but is in fact totally different. At times, this can result in some very humorous sentences. There are, however, a lot of situations where that type of humor would not be appreciated. It is highly recommended that you proofread your dictation.
The box suggests that a 266 MHz Pentium with 64 MB of RAM is the minimum computer that this software should be run on. The minimum computer may run the program OK, but as a general rule more horsepower never hurts. My demonstration machine is approximately four times the required minimum, and the program works noticeably better on it than on a slower machine. Next month I will talk a bit more about speech recognition and about some of the USB problems that I encountered this week.