Keywords: speech generation, intonation control, dialogic process, information retrieval
Start Date: 1 May 94 / Status: finished / Duration: 24 months
The aim of SPEAK is to construct a proof-of-concept prototype of a multimodal information system that combines graphics and spoken language, primarily in German, with demonstrable extensibility to other languages.
The multimodal user interface constructed in the project can now automatically generate meta-communication about the information retrieval process from a deep representation of the information and of speaker and hearer goals. Particular attention is paid to the intonational adequacy of the synthesised speech in the context of the information retrieval dialogue between information seeker and machine.
Following initial specifications for the functional, dialogue-appropriate control of intonation for German and for a general model of information retrieval as a dialogic process of negotiation between seeker and giver, these components have been instantiated in a complex working prototype. The speech synthesis component has been extensively redesigned to provide high-level, parameterisable control of the required intonation contours and their segmental consequences. Reusable subcomponents of the complete system have been clearly isolated for subsequent take-up. These include: an intonation control algorithm, a marked-up-text-to-speech component, a semantics-to-marked-up-text component (German), and a dialogue model.
Significant improvements in the naturalness of the resulting speech have been achieved as a consequence of the rich information about communicative intentions that is maintained by the interface and of special (but generalisable) extensions made to the speech synthesiser adopted (Multivox).
Three workshops (Budapest 1994 and 1996, Darmstadt 1995) have been held to report on the work in the project and to offer opportunities for presentations of related work. Project results have been presented in journal articles and at workshops.
The reusable components are being evaluated and used in a variety of industrial contexts. Some of these components are expected to be incorporated into several products, in particular those marketed by the company that supplies the speech synthesis equipment used in the project.
Technische Hochschule Darmstadt and GMD/IPSI
D-64293 Darmstadt, D
Prof. Dr. Erich J. Neuhold
Tel: +49 6151 869 800
Fax: +49 6151 869 818
SPEAK - CP93-10393, May 1997
please address enquiries to the ESPRIT Information Desk
html version of synopsis by Nick Cook