Computer Talk
The idea of computerized speech has been around since the 1970s. The first attempts sounded nothing like a real person talking, and today scientists are still working to create systems that can mimic human speech. Researchers from IBM’s synthetic-speech group have created software called “Supervoices”. The Supervoices system “…is based on the premise that speech is composed of a finite number of linguistic building blocks called phonemes and that these can be arranged in new sequences to create any word…English contains about 40 unique phonemes. For example, the word ‘please’ is composed of four: P, L, EE and Z.” These phonemes are captured by recording a person speaking each of them many times. In fact, it can take an entire week of speaking and recording to gather a sufficient sample of phonemes. The Supervoices “…database contains an average of 10,000 recorded samples of each English phoneme.” Storing roughly 400,000 recordings of about 40 unique sounds may not seem efficient, but it is what gives Supervoices the ability to speak new words that are not contained in its database, by piecing together existing phonemes into new patterns. Supervoices is also aware of punctuation such as commas and question marks, which changes how a sentence is pronounced. Through a series of complex calculations, Supervoices pieces together the phonemes and then works to level the pitch and inflections, “like a carpenter sanding a series of glued joints to create a smooth, pleasing surface.” Even with all this complexity, the Supervoices system was designed to be fast enough to converse with a human in real time.
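To make the idea of piecing phonemes together more concrete, here is a rough Python sketch. It is not IBM's method or code. The lexicon, the function names, the fake "recordings" (plain records with a random pitch value), and the simple averaging used for smoothing are all assumptions made up for illustration. It only shows the general shape of the process described above: look up a word's phonemes, pick one stored sample per phoneme, then even out the pitch across the joins.

```python
import random

# Hypothetical pronunciation lexicon (an assumption for this sketch).
# "please" uses the four phonemes quoted from the article; "leap" reuses
# three of them to stand in for a "new" word.
LEXICON = {
    "please": ["P", "L", "EE", "Z"],
    "leap": ["L", "EE", "P"],
}


def build_database(phonemes, samples_per_phoneme, seed=0):
    """Create stand-in 'recordings': one record per take, with a made-up pitch."""
    rng = random.Random(seed)
    return {
        p: [
            {"phoneme": p, "take": i, "pitch_hz": rng.uniform(90.0, 130.0)}
            for i in range(samples_per_phoneme)
        ]
        for p in phonemes
    }


def choose_samples(word, database, target_pitch_hz=110.0):
    """For each phoneme in the word, pick the take closest to the target pitch."""
    return [
        min(database[p], key=lambda s: abs(s["pitch_hz"] - target_pitch_hz))
        for p in LEXICON[word]
    ]


def smooth_pitch(samples, weight=0.5):
    """Pull every pitch part of the way toward the average (a toy 'sanding' step)."""
    mean = sum(s["pitch_hz"] for s in samples) / len(samples)
    return [
        {**s, "pitch_hz": (1 - weight) * s["pitch_hz"] + weight * mean}
        for s in samples
    ]


if __name__ == "__main__":
    # A small database keeps the demo quick; the article's figures would give
    # about 40 phonemes * 10,000 samples = 400,000 recordings.
    db = build_database(["P", "L", "EE", "Z"], samples_per_phoneme=50)
    for word in ("please", "leap"):
        pieces = smooth_pitch(choose_samples(word, db))
        print(word, [(s["phoneme"], s["take"], round(s["pitch_hz"], 1)) for s in pieces])
```

The word "leap" is never stored as a whole, yet the sketch can still assemble it from takes recorded for other words, which is the point the article makes about speaking words outside the database.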
I think this is a very exciting technology that could well revolutionize the way humans interact with computer systems. I can imagine a computer system with no keyboard or mouse that accepts user input through speech (keyboard replacement), a device that tracks the user’s eye movements (mouse replacement), and maybe even a touch screen. I agree with the authors of the article, who feel the first application of this type of technology will be refining telephone help systems such as airline reservation systems, but I feel it will continue to enter new fields and applications.
References:
ScientificAmerican.com, 3-17-2003, “Making Computers Talk”, Andy Aaron, Ellen Eide and John F. Pitrelli.