
Using narrow phonetic transcription to improve the performance of an Arabic speech recognition system


If you have a question about this talk, please contact Lars Kunze.

Host: Prof. Xin Yao

To train a speech recognition system, you have to provide it with appropriate data. The most widely used systems (HTK, Sphinx, CSLU toolkit) require three parallel sets of data—recordings, textual transcriptions, and phonetic transcriptions. The easiest way to obtain phonetic transcriptions is to use a dictionary linking textual forms to phonetic forms. This approach, however, fails to take into account the effect of the surrounding phonetic context on the way individual phonemes are realised. The talk will report on some experiments using phonological rules, including rules relating to local stress, to provide more accurate transcriptions, with significant (beneficial!) effects on performance and on the time taken for training.
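The contrast between the two approaches can be sketched as follows. This is a minimal illustration, not the talk's actual rule set: the example words and the single assimilation rule are hypothetical stand-ins for the kind of context-sensitive phonological rules described above.

```python
# Dictionary linking textual forms to canonical phoneme sequences
# (hypothetical example entries).
LEXICON = {
    "ten": ["t", "e", "n"],
    "pin": ["p", "i", "n"],
}

BILABIALS = {"p", "b", "m"}

def dictionary_transcription(words):
    """Plain dictionary lookup: concatenate per-word canonical forms,
    ignoring the surrounding phonetic context."""
    phones = []
    for w in words:
        phones.extend(LEXICON[w])
    return phones

def apply_assimilation(phones):
    """One illustrative context-sensitive rule: /n/ is realised as [m]
    immediately before a bilabial consonant."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in BILABIALS:
            out[i] = "m"
    return out

canonical = dictionary_transcription(["ten", "pin"])
narrow = apply_assimilation(canonical)
print(canonical)  # ['t', 'e', 'n', 'p', 'i', 'n']
print(narrow)     # ['t', 'e', 'm', 'p', 'i', 'n']
```

The dictionary output treats each word in isolation; the rule pass produces a narrower transcription that reflects how the /n/ is actually realised across the word boundary, which is closer to what the recordings contain.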

The talk will provide a very brief introduction to speech recognition and an overview of how HTK works, since it is not possible to describe the experiments without describing the tools we used to conduct them. If time permits, I will also sketch a novel way of using speech synthesis to provide training data for speech recognition.

Speaker’s homepage: http://www.cs.man.ac.uk/~ramsay

This talk is part of the Artificial Intelligence and Natural Computation seminars series.



