The recently disseminated article on Verbal Compass was fascinating. But I think the researchers don't know speech recognition products very well. NaturallySpeaking has several "verbal compass" commands (of sorts) built in, yet I suspect most people don't use them. (The people I train are the exception!) I doubt the researchers knew about these commands. (The last I checked, ViaVoice had very little in the way of this kind of command.)

The two primary commands in NaturallySpeaking to move around in this way are "Insert Before X" and "Insert After X." For example, saying "Insert After University of Maryland" places the cursor between Maryland and the comma in the first big paragraph below. Once you have placed the cursor in this way, there is no need to move the cursor letter by letter (or word by word). You are where you want to be.

In addition, the "Select" command can be used to navigate to a word or expression when revising texts. To change "modern fad" to "recent phenomenon," say "Select modern fad," pause a fraction of a second, and say, "recent phenomenon." It's not even necessary to issue a Delete command.

It is often beneficial to choose two or more target words because NaturallySpeaking recognizes expressions better than single words, especially single-syllable words. I even encourage my students to include in the target one or more words that they don't want to delete. It's simply faster to dictate a word again than to move the cursor exactly where it is needed. So for example, to replace "the modern fad" with "the modern phenomenon," I would say "Select the modern fad," pause, and say, "the modern phenomenon."

I encourage people to lose their "typewriter mentality" when using speech recognition: moving the cursor a line or a word at a time is how one revises with a typewriter. Zipping directly to one's target is a strategy that may seem counterintuitive because it is not a familiar mechanical approach.

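For readers who think in code, the effect of these commands can be modeled as a plain text search. The sketch below is only an illustration: the function names are made up, it ignores the speech recognition step entirely, and it assumes the first match from the start of the text is used, which may not be how NaturallySpeaking actually picks among several matches.

```python
def insert_after(text: str, phrase: str) -> int:
    """Return the cursor index just past the first occurrence of phrase.

    Assumption: first match from the start of the text (hypothetical model,
    not NaturallySpeaking's documented search order).
    """
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return start + len(phrase)


def insert_before(text: str, phrase: str) -> int:
    """Return the cursor index at the start of the first occurrence of phrase."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return start


def select_and_replace(text: str, target: str, replacement: str) -> str:
    """Model 'Select <target>', pause, then dictating <replacement> over it."""
    start = text.find(target)
    if start == -1:
        raise ValueError(f"target not found: {target!r}")
    return text[:start] + replacement + text[start + len(target):]


sentence = (
    "Now Jinjuan Feng and Andrew Sears at the University of Maryland, "
    "Baltimore County, have shown that confidence scores can also be used "
    "to accelerate the correction process."
)
context = "Extreme multitasking is the modern fad, but no person has enough hands."

# "Insert After University of Maryland": the cursor lands just before the comma.
cursor = insert_after(sentence, "University of Maryland")

# "Select the modern fad" <pause> "the modern phenomenon": no Delete needed.
revised = select_and_replace(context, "the modern fad", "the modern phenomenon")
print(revised)
```

Note that the target in the second call includes "the modern," words that are dictated again unchanged, just as suggested above.
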
To make it work well, the user has to think about how likely it is that the target words will be found. "Insert Before an" will probably fail. "Insert Before an obvious alternative" will probably work. "Select for" may work, but will likely find the wrong instance of the word. "Select for further improvement" is not only more likely to be found, but may be easier to say.

Submitted by Alan Cantor
acantor@cantoraccess.com

---

Verbal Compass article:

Verbal Compass
Better speech-based error correction for dictation tools
From: Technology Review - March 2005 - page 80-81

Context: Extreme multitasking is the modern fad, but no person has enough hands to manage a cell phone, a digital organizer, a steering wheel, and coffee all at the same time. Accordingly, people want a hands-free way to interact with computers. Although speech recognition systems are more accurate than ever, typical users still spend more time correcting errors than dictating text; half of their correction time is spent just moving a cursor to errors identified in, say, a dictated e-mail. Confidence scores (the software's estimates of how likely it is to have captured the right word) can be used to identify possible errors. Now Jinjuan Feng and Andrew Sears at the University of Maryland, Baltimore County, have shown that confidence scores can also be used to accelerate the correction process.

Methods and Results: Twelve participants dictated 400-word documents using a speech recognition system. It interpreted 17 percent of the words incorrectly, a typical rate; it was the correction process that was atypical. The software used confidence scores to tag words throughout the text as navigation anchors. Users could quickly jump to each anchor with short voice commands and then move a cursor word by word to the error. The researchers measured the number of navigation commands the participants used, the failure rates of the navigation commands, and the time spent dictating and navigating.

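The anchor idea can be sketched roughly in code. This is not Feng and Sears's implementation: the synopsis does not say how anchors are chosen, so the sketch assumes that any word whose confidence falls below a threshold becomes an anchor, and that reaching an anchor costs a single command. The words, confidence values, threshold, and function names are all invented for illustration.

```python
from typing import List, Tuple

# (recognized word, made-up confidence score between 0 and 1)
Word = Tuple[str, float]


def find_anchors(words: List[Word], threshold: float = 0.5) -> List[int]:
    """Return positions of words treated as anchors.

    Assumption: an anchor is any word with confidence below the threshold.
    """
    return [i for i, (_, conf) in enumerate(words) if conf < threshold]


def commands_to_reach(target: int, anchors: List[int]) -> int:
    """Count commands: one jump to the nearest anchor, then word-by-word moves.

    Assumption: jumping to any anchor costs exactly one command.
    """
    if not anchors:
        raise ValueError("no anchors available")
    nearest = min(anchors, key=lambda a: abs(a - target))
    return 1 + abs(nearest - target)


# Hypothetical recognizer output for part of a dictated sentence.
words = [
    ("people", 0.93), ("want", 0.91), ("a", 0.88), ("hands", 0.42),
    ("free", 0.80), ("weigh", 0.35), ("to", 0.90), ("interact", 0.86),
]

anchors = find_anchors(words)
print(anchors)                        # positions of "hands" and "weigh"
print(commands_to_reach(6, anchors))  # one jump to "weigh", one move to "to"
```

Under these assumptions, an error that is itself an anchor is reached in one command, and an error next to an anchor in two.
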
Average failure rates reported for other techniques are about 5 percent for direction-based navigation ("move right") and 10 to 20 percent for word-based navigation ("select December"). In a test of Feng and Sears's technique, the failure rate was only 3.2 percent. Even better, the time users spent navigating to errors was cut by nearly a fifth. This is significant compared with other error-correction techniques and it is promising, because this work suggests the means for further improvement.

Why it Matters: The Lilliputian buttons on PDAs and other pocket-sized wonders are quickly shrinking under a constant-sized thumb. Multitasking is on the rise, and more people with physical disabilities are entering the workforce. Both trends will steer users away from computer systems with manual interfaces. Speech recognition, but for its high error rate and long correction times, is an obvious alternative. This work clearly shows that using confidence scores for navigation can shrink users' correction times. With further improvements, the technique promises to boost the usability of hands-free error correction and so engender a surge of new gadgets and applications.

Source: Feng, J., and A. Sears. 2004. Using confidence scores to improve hands-free speech based navigation in continuous dictation systems. ACM Transactions on Computer-Human Interaction 11:329-356.

From: http://www2.technologyreview.com/articles/05/03/issue/synopsis_info.asp

Links: Andrew L. Sears http://userpages.umbc.edu/~asears/