WEB SURFING WITHOUT A MONITOR
by T. V. Raman
Scientific American, March 1997, page 73

When I hook up to the Internet to check out the news on CNN, to peruse a colleague's latest paper, or to see how Adobe's stock price is doing, I leave the display on my laptop turned off. The batteries last much longer that way. Besides, because I cannot see, a monitor is of no use to me.

That a blind person can navigate the Internet just as efficiently and effectively as any sighted person attests to the profound potential of digital documents to improve human communication. Printed documents are fixed snapshots of changing ideas; they limit the means of communication to the paper on which they are stored. But in electronic form, documents can become raw material for computers that can extract, catalogue and rearrange the ideas in them. Used properly, technology can separate the message from the medium so that we can access information wherever, whenever and in whatever form we want.

In my case - and in the case of someone who, while using a telephone or a tiny handheld computer, becomes functionally blind - it is much easier to hear material spoken out loud than to try to view it on a screen. But it is no good to have the computer simply recite the page from top to bottom, as conventional screen-reading programs do. Imagine trying to read a book using only a one-line, 40-character display through which text marches continuously.

Ideally, an aural interface would preserve the best features of its visual counterpart. Consider this special section of Scientific American. By reading the table of contents, skimming the introductory article and flipping through the section, you can quickly obtain a high-level overview and decide which parts you want to read in detail. This works because the information has structure: the contents are arranged in a list, titles are set in large type and so on. But the passive, linear nature of listening typically prohibits such multiple views: it is impossible to survey the whole first and then "zoom in" on portions of interest.

Computers can remove this roadblock, but they need help. The author scribbling on his virtual page must tag each block of text with a code indicating its function (title, footnote, summary, and so on) rather than its mere appearance (24-point boldface, six-point roman, indented italics, and so on). Programs can then interpret the structural tags to present documents as the reader, rather than the creator, wishes. Your software may render titles in large type, whereas mine could read them à la James Earl Jones. Moreover, listeners can browse through a structured text selectively in the same way you skim this printed magazine.

Fortunately, Hypertext Markup Language (HTML) - the encoding system used to prepare text for the World Wide Web - was designed to capture the structure of documents. If it were used as originally intended, the same electronic source could be rendered in fine detail on a printer, at lower resolution on a screen, in spoken language for functionally blind users, and in myriad other ways to suit individual preferences. Unfortunately, HTML is steadily evolving under commercial pressure to enable the design of purely visual Web pages that are completely unusable unless one can see the color graphics and can rapidly download large images.
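The difference between these two uses of HTML can be made concrete with a small fragment (an illustrative sketch only; the tags are standard HTML, but the sample text is invented for the example). The first version records what each block of text is; the second records only how it should look:

    <!-- Structural markup: says what each block of text is -->
    <h1>Web Surfing without a Monitor</h1>
    <p>Printed documents are fixed snapshots of changing ideas.</p>

    <!-- Appearance-only markup: says only how the text should look -->
    <font size="7"><b>Web Surfing without a Monitor</b></font>
    <br>
    Printed documents are fixed snapshots of changing ideas.

A speech program reading the first fragment knows that the opening line is a heading, so it can announce it in a distinctive voice, include it in a spoken table of contents or jump to it on request; reading the second, it finds only typesetting instructions and can do no better than recite the words in order.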
This current rush to design Web pages that can be viewed properly only with the most popular programs and standard displays, and that moreover lack important structural information, threatens to undermine the usefulness of the documents archived on the Internet. For users with special needs, the only efficient way to obtain certain information is to get it on-line. In the end, unprocessable digital documents are not only useless to the blind but also represent a missed opportunity for everyone. Archiving texts in a structurally rich form ensures that this vast repository of knowledge can be reused, searched, and displayed in ways that best suit individuals' needs and abilities, using software not yet invented or even imagined.

EMACSPEAK, a speech interface for desktop computers developed and distributed freely by the author, supports a Web browser that translates the visual structure and style of a document into intuitive audio cues, such as distinctive tones and voices. A listener being read a long report by Emacspeak can request an overview of subtitles, then interrupt the summary to listen to the full text of any section.

T. V. RAMAN is a researcher in the Advanced Technology Group at Adobe Systems in San Jose, CA.