1

TLUMACH

Part B: Description of scientific and technological activity

B0. / Title and Contents Page

Proposal Full Name / Multilingual Spoken Dialogue Word/Phrase-Interpreter

Proposal Acronym / TLUMACH

Proposal No.

Call Identifier

Research Programme

Thematic Priorities.

State category of RTD project: Research

B1. / Title page
Multilingual Spoken Dialogue Word/Phrase-Interpreter
TLUMACH
Jun 4 1999
Proposal number (if applicable)
B2. / Content list (Part B only):
Objectives
Contribution to programme/key action objectives
Innovation
Project workplan
Workpackage list
Deliverables list
Workpackage description
B3. / Objectives.
The objectives of this R&D project are to research and develop new models, methods, technologies and tools for embedding spoken language technologies into global information and communication systems, and to develop new language engineering services and tools that support users and businesses. The scientific and technological objectives are to propose information technologies and PC tools that let a computer adapt quickly to a native speaker's language and speech, and to give the speaker a convenient interface for communicating by natural language and voice in a multilingual environment.
At present no spoken dialogue system combines high recognition and understanding accuracy with continuous and spontaneous speech, a large vocabulary and natural language. Moreover, existing systems are not multilingual and cannot be used as a speech interpreter, even at the level of a word-book or phrase-book.
This project plans to create an information technology for a robust multilingual spoken dialogue word/phrase-interpreter with fast adaptation to natural language, dialogue domain, speaker voice and communication channel. A prototype of such an interpreter will be developed.
B4. / Contribution to programme/key action objectives
The project is the first stage of work towards tools for automatic spoken translation from one natural language to another. The proposed information technologies for a multilingual spoken dialogue word/phrase-interpreter will contribute to a new generation of user-friendly general-interest systems and services, particularly for tourism and citizens, by giving a speaker a convenient interface for communicating by natural language and voice in a multilingual environment.
B5. / Innovation
The innovation of the project is based on two new main ideas:
1) Despite the variability of natural languages and speech, all speech is generated by the same articulatory (speech production) model. Consequently, there exists a universal system of articulatory features and a corresponding universal speech-oriented articulatory-phonetic alphabet for specifying natural language and speech;
2) Each speaker has his/her own speech and voice peculiarities, which may be recorded as an Individual Voice File or Passport (IVF).
The innovations of the project are the following:
- Universal speech-oriented phonetic alphabet (USOPA) and tools for automatic phonetic transcription of texts in various natural languages into this alphabet.
- Universal means for speech signal analysis and description based on both a speech production model and one-quasi-period segmentation.
- Universal articulatory-phonetic code table for speech signal description.
- Transformation of the speech signal into multiple USOPA phonetic streams.
- IVF formation from a short training sample.
- Compilation of multilingual word/phrase-books.
- Information technology for a multilingual spoken word/phrase-interpreter with fast adaptation to language, dialogue domain and user voice.
B6. / Project workplan:
The planned research and development is directed at creating the information technologies and prototypes for a multilingual spoken dialogue word/phrase-interpreter.
The tools are intended to be used as follows. The user pronounces a word, headword or phrase in his/her native language. The result of automatic recognition and translation is shown on the monitor screen and spoken aloud at the same time. Different usage variants of the translation are also offered in the target language.
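The interaction described above can be sketched as a lookup in a shared phrase-book. This is only an illustration under stated assumptions: the concept identifiers, the two example languages (English and German), the sample phrases and the function names `recognise` and `interpret` are hypothetical and do not come from the proposal, and the recogniser is a text-matching stand-in for real speech recognition and synthesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    text: str      # the word or phrase in one language
    usage: tuple   # example sentences showing how it is used


# Hypothetical two-language phrase-book; each concept has one entry per language.
PHRASE_BOOK = {
    "greeting.morning": {
        "en": Entry("good morning", ("Good morning, how are you?",)),
        "de": Entry("guten Morgen", ("Guten Morgen, wie geht es Ihnen?",)),
    },
    "thanks": {
        "en": Entry("thank you", ("Thank you for your help.",)),
        "de": Entry("danke", ("Danke für Ihre Hilfe.",)),
    },
}


def recognise(utterance, language):
    """Stand-in for speech recognition: match the utterance against the phrase-book."""
    wanted = utterance.strip().lower()
    for concept, entries in PHRASE_BOOK.items():
        if entries[language].text.lower() == wanted:
            return concept
    return None


def interpret(utterance, source, target):
    """Return the translation and its usage examples, or None if nothing is recognised."""
    concept = recognise(utterance, source)
    if concept is None:
        return None
    entry = PHRASE_BOOK[concept][target]
    # "display" goes to the screen, "speak" would be passed to a speech synthesiser.
    return {"display": entry.text, "speak": entry.text, "usage": list(entry.usage)}
```

Keying every language's entry on a shared concept identifier is one possible design; it keeps translation a dictionary lookup once recognition has succeeded.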
The project foresees the participation of a wide circle of highly qualified specialists with expertise in acoustics, physics, phonetics, linguistics, computer science, mathematics, information technology, software, expert systems, etc. It is a multi-disciplinary project.
The project consists of nine basic workpackages, which run largely in parallel. The basic workpackages are:
W1. Universal speech-oriented articulatory-phonetic alphabet (USOPA) and tools for automatic phonetical transcribing of text and speech for various natural languages.
W2. Universal tools for speech signal analysis and description by multiple USOPA streams, which are based on speech production model and one-quasi-periodical segmentation.
W3. The universal articulatory-phonetic code table forming for speech signals description by multiple streams in USOPA alphabet.
W4. Multilingual USOPA-phonetical word/phrase-book compiling.
W5. Technologies for the individual speech file (passport) forming based on a short training sample.
W6. Information technology for multilingual automatic word and phrase recognition, based on USOPA phonetical transcriptions, from large word/phrase-books.
W7. Information technology for multilingual automatic word and phrase synthesis based on USOPA phonetical transcriptions.
W8. Information technology for fast adaptation to natural language, dialogue domain and user voice.
W9. Support and treatment of integrated information technology for multilingual spoken dialogue word/phrase-interpreter.
Project time table
[Chart: workpackages W9 to W1 plotted against months 0 / 6 / 12 / 18 / 24 / 30 / 36]
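The time table can be redrawn as text from the start and end months given in the workpackage list. The sketch below is a hypothetical helper, not part of the proposal; it assumes one column per six-month period and marks a period when the workpackage is active during it.

```python
# (workpackage, start month, end month), taken from the workpackage list.
WORKPACKAGES = [
    ("W1", 0, 12), ("W2", 0, 12), ("W3", 0, 24),
    ("W4", 0, 24), ("W5", 6, 30), ("W6", 12, 30),
    ("W7", 12, 36), ("W8", 18, 36), ("W9", 24, 36),
]


def timetable(workpackages, step=6, end=36):
    """Return one text row per workpackage, W9 first, one cell per `step` months."""
    rows = []
    for name, start, stop in reversed(workpackages):
        # '#' marks a period in which the workpackage runs, '.' an idle period.
        cells = "".join("#" if start <= month < stop else "."
                        for month in range(0, end, step))
        rows.append(f"{name} {cells}")
    return rows


if __name__ == "__main__":
    print("\n".join(timetable(WORKPACKAGES)))
```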


B1. / Workpackage list
Workpackage No[1] / Workpackage title / Lead contractor No[2] / Person-months[3] / Start month[4] / End month[5] / Phase[6] / Deliverable No[7]
W1 / Universal speech-oriented articulatory-phonetic alphabet (USOPA) and tools for automatic phonetical transcribing of text and speech for various natural languages / 1 / 60 / 0 / 12 / - / D1, D2
W2 / Universal tools for speech signal analysis and description by multiple USOPA streams, which are based on speech production model and one-quasi-periodical segmentation / 1 / 60 / 0 / 12 / - / D3, D4
W3 / The universal articulatory-phonetic code table forming for speech signals description by multiple streams in USOPA alphabet / 1 / 60 / 0 / 24 / - / D5, D6
W4 / Multilingual USOPA-phonetical word/phrase-book compiling / 1 / 80 / 0 / 24 / - / D7, D8
W5 / Technologies for the individual speech file (passport) forming based on a short training sample / 1 / 90 / 6 / 30 / - / D9, D10
W6 / Information technology for multilingual automatic word and phrase recognition, based on USOPA phonetical transcriptions, from large word/phrase-books / 1 / 90 / 12 / 30 / - / D11, D12
W7 / Information technology for multilingual automatic word and phrase synthesis based on USOPA phonetical transcriptions / 1 / 90 / 12 / 36 / - / D13, D14
W8 / Information technology for fast adaptation to natural language, dialogue domain and user voice / 1 / 60 / 18 / 36 / - / D15, D16
W9 / Support and treatment of integrated information technology for multilingual spoken dialogue word/phrase-interpreter / 1 / 90 / 24 / 36 / - / D17
TOTAL / 680
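The total can be cross-checked against the per-participant split stated in the workpackage descriptions. The snippet below is a sketch of that arithmetic using only figures from this document; the variable names are illustrative, and the "-" entry for participant 2 in W3 is treated as 0.

```python
# Person-months per workpackage, from the workpackage list.
PERSON_MONTHS = {"W1": 60, "W2": 60, "W3": 60, "W4": 80, "W5": 90,
                 "W6": 90, "W7": 90, "W8": 60, "W9": 90}

# (participant 1, participant 2) person-months, from the workpackage descriptions.
SPLIT = {"W1": (40, 20), "W2": (40, 20), "W3": (60, 0), "W4": (50, 30),
         "W5": (70, 20), "W6": (70, 20), "W7": (60, 30), "W8": (40, 20),
         "W9": (60, 30)}

total = sum(PERSON_MONTHS.values())
per_participant = [sum(pair[i] for pair in SPLIT.values()) for i in range(2)]

# Workpackages whose split does not add up to the listed person-months.
mismatches = [wp for wp, pm in PERSON_MONTHS.items() if sum(SPLIT[wp]) != pm]
```

With the figures above, the split is consistent with the listed total of 680 person-months.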
B2. / Deliverables list
Deliverable No[8] / Deliverable title / Delivery date[9] / Nature[10] / Dissemination level[11]
D1 / Universal speech-oriented phonetic alphabet / 6 / R / PP
D2 / Software for phonetical transcribing of natural languages / 12 / P / PP
D3 / Speech signal pre-processing technology / 10 / R / PU
D4 / Speech signal pre-processing software / 12 / P / RE
D5 / The universal articulatory-phonetic speech production model for speech signal analysis, description and synthesis / 18 / R / PU
D6 / Software for conversion of speech signal to multiple phonetic streams / 24 / P / CO
D7 / Phonetical word/phrase-book for first five languages / 18 / P / PU
D8 / Phonetical word/phrase-book for next five languages / 24 / P / PU
D9 / Individual speech peculiarities model / 18 / R / PU
D10 / Software for speaker voice adaptation / 30 / P / PP
D11 / Information technology and tools for multilingual word and phrase recognition (1st model) / 24 / P / PP
D12 / Information technology and tools for multilingual word and phrase recognition (2nd model) / 30 / P / PP
D13 / Information technology and tools for multilingual word and phrase synthesis (1st model) / 24 / P / PP
D14 / Information technology and tools for multilingual word and phrase synthesis (2nd model) / 30 / P / PP
D15 / Information technology and tools for fast adaptation to language and dialogue domain / 24 / P / PP
D16 / Information technology and tools for fast adaptation to language, dialogue domain and speaker voice / 30 / P / PP
D17 / Integrated information technology for multilingual spoken dialogue word/phrase-interpreter / 36 / P / PP
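A simple consistency check on the two lists is that every delivery date falls between the start and end month of the workpackage that produces the deliverable. The sketch below uses only the months and workpackage assignments given in this document; the function name `outside_window` is hypothetical.

```python
# (start month, end month) per workpackage, from the workpackage list.
WINDOWS = {"W1": (0, 12), "W2": (0, 12), "W3": (0, 24), "W4": (0, 24),
           "W5": (6, 30), "W6": (12, 30), "W7": (12, 36), "W8": (18, 36),
           "W9": (24, 36)}

# deliverable -> (producing workpackage, delivery month)
DELIVERABLES = {
    "D1": ("W1", 6), "D2": ("W1", 12), "D3": ("W2", 10), "D4": ("W2", 12),
    "D5": ("W3", 18), "D6": ("W3", 24), "D7": ("W4", 18), "D8": ("W4", 24),
    "D9": ("W5", 18), "D10": ("W5", 30), "D11": ("W6", 24), "D12": ("W6", 30),
    "D13": ("W7", 24), "D14": ("W7", 30), "D15": ("W8", 24), "D16": ("W8", 30),
    "D17": ("W9", 36),
}


def outside_window(deliverables, windows):
    """Return deliverables dated before the start or after the end of their workpackage."""
    flagged = []
    for name, (workpackage, month) in deliverables.items():
        start, end = windows[workpackage]
        if not start <= month <= end:
            flagged.append(name)
    return flagged
```

For the dates listed here the check returns an empty list.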
B3. / Workpackage description
Workpackage number : / W1
Start date or starting event: / 0
Participant number: / 1 / 2
Person-months per participant: / 40 / 20
Objectives
To develop a universal speech-oriented articulatory-phonetic alphabet (USOPA) and tools for automatic phonetical transcribing of text and speech for various natural languages in this alphabet
Description of work
Theoretical research and software development for text processing.
Deliverables
The universal speech-oriented articulatory-phonetic alphabet (USOPA) for natural language and speech specification.
Software for phonetic transcription of 10 natural languages in the USOPA alphabet
Milestones and expected result
The universal speech-oriented articulatory-phonetic alphabet (USOPA) for natural language and speech specification. Software for phonetical transcribing of texts for various natural languages
Workpackage number : / W2
Start date or starting event: / 0
Participant number: / 1 / 2
Person-months per participant: / 40 / 20
Objectives
To research and develop universal tools for speech signal analysis and description by multiple USOPA streams which are based on both speech production model and one-quasi-period segmentation
Description of work
Creating mathematical models and software for speech signal segmentation into one-quasi-periods, speech signal partition into large quasi-periodical and non-quasi-periodical segments and their description by multiple USOPA streams
Deliverables
Speech signal pre-processing technology and software for speech description by multiple USOPA streams
Milestones and expected result
Pre-processing technology for speech signal partition into one-quasi-periods, and quasi-periodical and non-quasi-periodical segments. Mathematical models for automatic speech signal description by multiple streams in USOPA
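The proposal does not specify how one-quasi-periods are to be found. As an illustration only, the sketch below uses normalised autocorrelation, a standard pitch-period estimator swapped in here as an assumption, to estimate the period of a quasi-periodic stretch and then cuts the signal into segments of one period each. The function names, the lag range and the fixed-length cut are hypothetical simplifications; real speech periods vary from cycle to cycle.

```python
import math


def estimate_period(signal, min_lag, max_lag):
    """Return (lag, score): the lag in samples with the highest normalised autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(signal) - lag
        num = sum(signal[i] * signal[i + lag] for i in range(n))
        energy_a = sum(signal[i] ** 2 for i in range(n))
        energy_b = sum(signal[i + lag] ** 2 for i in range(n))
        den = math.sqrt(energy_a * energy_b)
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


def split_into_periods(signal, period):
    """Cut the signal into consecutive, non-overlapping segments of one period each."""
    return [signal[start:start + period]
            for start in range(0, len(signal) - period + 1, period)]


# Synthetic test tone with a period of 50 samples (a stand-in for a voiced segment).
tone = [math.sin(2 * math.pi * i / 50) for i in range(400)]
period, score = estimate_period(tone, min_lag=20, max_lag=80)
segments = split_into_periods(tone, period)
```

A low score for every lag in the range could serve as a rough indicator of a non-quasi-periodic segment, although any threshold would have to be chosen experimentally.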
Workpackage number : / W3
Start date or starting event: / 0
Participant number: / 1 / 2
Person-months per participant: / 60 / -
Objectives
To develop the universal articulatory-phonetic speech production models and propose a technology for speech signals description by multiple streams in USOPA alphabet
Description of work
Modelling, researching and filling in an articulatory-phonetic code table for speech signal description
Deliverables
The universal articulatory-phonetic speech production model for speech signal analysis, description and synthesis.
Information technology for creating the articulatory-phonetic code table on a training sample.
Software for conversion of speech signal to multiple phonetic streams
Milestones and expected result
Information technology for creating the articulatory-phonetic code table
Information technology for automatic speech signal description by multiple streams in USOPA alphabet
Workpackage number : / W4
Start date or starting event: / 0
Participant number: / 1 / 2
Person-months per participant: / 50 / 30
Objectives
To develop and compile multilingual USOPA-phonetical word/phrase-books for ten languages
Description of work
Multilingual phonetical word/phrase-book compiling
Deliverables
Phonetical word/phrase-books for ten languages
Milestones and expected result
Information technology for multilingual USOPA-phonetical word/phrase-book compiling
Workpackage number : / W5
Start date or starting event: / 6
Participant number: / 1 / 2
Person-months per participant: / 70 / 20
Objectives
To elaborate a model of speaker voice peculiarities
Description of work
Creating technologies for the individual speech file (passport) forming based on a short training sample
Deliverables
Individual speech peculiarities model.
Software for speaker voice adaptation
Milestones and expected result
Information technology for the individual speech file forming based on the universal articulatory-phonetic speech production model
Workpackage number : / W6
Start date or starting event: / 12
Participant number: / 1 / 2
Person-months per participant: / 70 / 20
Objectives
To develop an information technology for multilingual automatic word and phrase recognition, based on USOPA phonetical transcriptions, from large word/phrase-books
Description of work
Creating models, methods, algorithms and software
Deliverables
Information technology and tools for multilingual word and phrase recognition
Milestones and expected result
Information technology for multilingual automatic word and phrase recognition
Workpackage number : / W7
Start date or starting event: / 12
Participant number: / 1 / 2
Person-months per participant: / 60 / 30
Objectives
To develop an information technology for multilingual automatic word and phrase synthesis
Description of work
Creating models, algorithms and software for automatic word and phrase synthesis based on phonetical transcriptions
Deliverables
Information technology and tools for multilingual word and phrase synthesis based on USOPA phonetical transcriptions
Milestones and expected result
Information technology and tools for multilingual word and phrase synthesis based on USOPA phonetical transcriptions
Workpackage number : / W8
Start date or starting event: / 18
Participant number: / 1 / 2
Person-months per participant: / 40 / 20
Objectives
To develop an information technology for the fast adaptation to natural language, dialogue domain and user voice
Description of work
Creating models, algorithms and software for fast adaptation to natural language, dialogue domain and user voice
Deliverables
Information technology, software and tools for fast adaptation to natural language, dialogue domain, and speaker voice
Milestones and expected result
Information technology and tools for fast adaptation to natural language, dialogue domain, and speaker voice
Workpackage number : / W9
Start date or starting event: / 24
Participant number: / 1 / 2
Person-months per participant: / 60 / 30
Objectives
To integrate the developed technologies into a multilingual spoken dialogue word/phrase-interpreter
Description of work
System compiling
Deliverables
Integrated information technology for multilingual spoken dialogue word/phrase-interpreter
Milestones and expected result
Integrated information technology and prototypes for multilingual spoken dialogue word/phrase-interpreter


[1] Workpackage number: WP 1 – WP n.

[2] Number of the contractor leading the work in this workpackage.

[3] The total number of person-months allocated to each workpackage.

[4] Relative start date for the work in the specific workpackages, month 0 marking the start of the project, and all other start dates being relative to this start date.

[5] Relative end date, month 0 marking the start of the project, and all end dates being relative to this start date.

[6] Only for combined research and demonstration projects: Please indicate R for research and D for demonstration.

[7] Deliverable number: Number for the deliverable(s)/result(s) mentioned in the workpackage: D1 - Dn.

[8] Deliverable numbers in order of delivery dates: D1 – Dn

[9] Month in which the deliverables will be available. Month 0 marking the start of the project, and all delivery dates being relative to this start date.

[10] Please indicate the nature of the deliverable using one of the following codes:

R = Report

P = Prototype

D = Demonstrator

O = Other

[11] Please indicate the dissemination level using one of the following codes:

PU = Public

PP = Restricted to other programme participants (including the Commission Services).

RE = Restricted to a group specified by the consortium (including the Commission Services).

CO = Confidential, only for members of the consortium (including the Commission Services).