Speech Understanding and Dialogue


SUNDIAL - 2218

Keywords: speech recognition, speech understanding, speech dialogue modelling


Start Date: 01-SEP-88 / Duration: 60 months



Objectives and Approach

SUNDIAL addresses the problem of speech-based cooperative dialogue as an interface for computer-based information services. The main technologies to be developed are continuous speech recognition and understanding, and oral dialogue modelling and management.

Speech input will consist of naturally spoken, telephone-quality utterances, with a vocabulary of 1000-2000 words for each application. The grammar will be based on a subset of each of the partners' four languages (English, French, German and Italian). The project has begun with speaker-independent recognition of sub-word units. The second phase will consider automatic online speaker adaptation with a view to improving performance. The dialogue manager will allow users to express themselves in a restricted natural language.

Prototypes will demonstrate the technology for three main information service applications: intercity train timetables (German), flight enquiries and reservations (English and French) and a hotel database (Italian). The spoken language phenomena to be covered will be determined from analysis of both human dialogue corpora and human-machine simulations. Each demonstration system will be evaluated through extensive user trials.

For all demonstrators, the project has to define a common general architecture, common formalisms for grammar representation across languages, and common semantic representations for dialogue management and message generation.

Progress and Results

The project started with a number of definition studies for the general architecture and studies of application scenarios. A common architecture has been defined, together with the interfaces between the major modules; this will facilitate comparative evaluation and exchange between partners.

A small 50-word vocabulary for the telephone speaker-independent recogniser has been developed, suitable for a banking-by-phone application. Tests on the recogniser using the Recogniser Sensitivity Analysis (RSA) technique (being developed in project 2589, SAM) have shown 95.6% correct recognition (+/- 0.7% at the 95% confidence level) on the RSA 31-word vocabulary.
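As a rough illustration of how a recognition rate and its margin relate to the size of a test, the sketch below uses the normal-approximation (Wald) interval for a proportion. This is an assumption: the synopsis does not say how the RSA confidence interval was computed, and the function names are hypothetical.

```python
import math


def wald_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion p over n trials."""
    return z * math.sqrt(p * (1.0 - p) / n)


def trials_for_half_width(p, half_width, z=1.96):
    """Number of independent trials that would yield the given half-width."""
    return (z / half_width) ** 2 * p * (1.0 - p)


p = 0.956  # reported correct-recognition rate
h = 0.007  # reported +/- margin at the 95% confidence level
n = trials_for_half_width(p, h)  # implied test size, under the assumptions above
print(f"implied number of test tokens: about {n:.0f}")
```

Under these assumptions the reported figures correspond to a test of roughly 3,300 independent tokens; the actual test size is not given in the synopsis.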

Preliminary results for the acoustic-phonetic decoding module, obtained for Italian with 275 phonetic units and a vocabulary of nearly 1000 words, show that continuous density HMMs (CDHMM) achieve 77.6% word accuracy on sentences, compared with 68.5% for discrete density HMMs. These results are for speaker-independent recognition of telephone quality sentences, but do not take into account the effect of the linguistic processing module on sentence understanding performance.
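For readers unfamiliar with the measure, the sketch below shows the conventional definition of word accuracy, (N - S - D - I) / N, computed from a minimum edit-distance alignment of a reference and a recognised word sequence. The synopsis does not state how the project scored its results, so this definition, the function names and the example sentences are all assumptions for illustration.

```python
def edit_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of a minimum-cost alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (total_errors, substitutions, deletions, insertions).
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            e, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, s, d, n)          # match, no new error
            else:
                diag = (e + 1, s + 1, d, n)  # substitution
            e, s, d, n = table[i - 1][j]
            deletion = (e + 1, s, d + 1, n)
            e, s, d, n = table[i][j - 1]
            insertion = (e + 1, s, d, n + 1)
            table[i][j] = min(diag, deletion, insertion)
    return table[-1][-1][1:]


def word_accuracy(ref, hyp):
    """Conventional word accuracy: (N - S - D - I) / N, with N reference words."""
    s, d, n = edit_counts(ref, hyp)
    return (len(ref) - s - d - n) / len(ref)


def relative_error_reduction(acc_baseline, acc_new):
    """Fraction of the baseline word-error rate removed by the new system."""
    return (acc_new - acc_baseline) / (1.0 - acc_baseline)


# Hypothetical example sentences, not taken from the project corpora.
ref = "which trains leave for rome today".split()
hyp = "which train leave for rome today please".split()
acc = word_accuracy(ref, hyp)

# Figures reported above: 68.5% (discrete density) vs 77.6% (continuous density).
reduction = relative_error_reduction(0.685, 0.776)
```

With the reported figures, the move from discrete to continuous densities removes roughly 29% of the word errors, if both results were scored in this way.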

Results for the English language using CDHMM show that phoneme recognition accuracy on the DARPA TIMIT database is comparable to that achieved by Kai-Fu Lee in the Carnegie Mellon SPHINX system.

A common dialogue manager architecture has been defined and work is in progress on its implementation.

Exploitation

SUNDIAL is targeted at natural-language oral dialogues, particularly for information services. This technology will find its prime application in telephone-based services. Applications range from business-related services (e.g. calling the company computer through a telephone) to services aimed at the general user, such as banking-by-phone. Most of these applications are currently dominated by technology from the USA. One of the partners is already planning early exploitation of the isolated-word speaker-independent recognition work in applications such as telephone banking and mail order; this represents a major advance for European industry in providing a home-grown source for the technology.


CONTACT POINT

Mr Jeremy Peckham
LOGICA LTD
Betjeman House, 104 Hills Road
UK - CAMBRIDGE CB2 1LQ
tel: + 44/ 223-66343
fax: + 44/ 223-322315
telex: 27200

Participants

LOGICA LTD - UK - C
DAIMLER-BENZ AG - D - P
CNET - F - P
SARITEL-SARIN TELEMATICA - I - P
UNIVERSITÄT ERLANGEN-NÜRNBERG - D - P
UNIVERSITY OF SURREY - UK - P
SIEMENS AG - D - P
IRISA-RENNES - F - P
CSELT - I - P
CAP GEMINI INNOVATION - F - P
POLITECNICO DI TORINO - I - A



SUNDIAL - 2218, December 1993


please address enquiries to the ESPRIT Information Desk

html version of synopsis by Nick Cook