
Analyzing and Correction of Errors to Improve the Speech Recognition Accuracy for Telugu Language

N. Usha Rani, P.N. Girija

Abstract


Dramatic advances have been made in improving the accuracy of speech recognition. Nevertheless, differences in user articulation, acoustic environment, transmission channel, microphone, word collection likelihood, dialectal variation, and speech rate all reduce the performance of a speech recognition system. Misrecognition may occur when one word is substituted for another, and such confusable words lower the recognition accuracy of a large-vocabulary speech recognition system. Error recovery procedures are therefore essential for improving speech recognition accuracy. The pronunciation dictionary is modified based on the confusion words obtained from the decoder of the speech recognition system. A mismatch occurs when the decoder is given a new acoustic signal that differs from the training data, so that the test speech is not consistent with the acoustic and language models built in the training phase. In such cases, a statistical method is applied to the output of the decoder. Error-correcting procedures are thus applied on the basis of an analysis of the errors that occur in the decoder of the speech recognition system.
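The abstract does not say how the confusion words are extracted from the decoder output. As a minimal sketch, assuming reference transcripts are available alongside the decoder hypotheses, substitution pairs can be collected with a word-level Levenshtein alignment and then counted. The function names `align_substitutions` and `confusion_counts` and the placeholder tokens (`w1`, `w2`, ...) are hypothetical and used only for illustration; they are not taken from the paper.

```python
from collections import Counter


def align_substitutions(reference, hypothesis):
    """Return (ref_word, hyp_word) substitution pairs from a Levenshtein alignment."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = edit distance between reference[:i] and hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back through the table, keeping only the substitutions.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        mismatch = reference[i - 1] != hypothesis[j - 1]
        if cost[i][j] == cost[i - 1][j - 1] + mismatch:
            if mismatch:
                pairs.append((reference[i - 1], hypothesis[j - 1]))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1  # word deleted by the decoder
        else:
            j -= 1  # word inserted by the decoder
    pairs.reverse()
    return pairs


def confusion_counts(utterance_pairs):
    """Count substitution pairs over (reference, hypothesis) transcript strings."""
    counts = Counter()
    for ref, hyp in utterance_pairs:
        counts.update(align_substitutions(ref.split(), hyp.split()))
    return counts


if __name__ == "__main__":
    # Placeholder tokens stand in for real vocabulary entries (assumption).
    data = [("w1 w2 w3", "w1 w9 w3"), ("w2 w4", "w9 w4"), ("w5", "w5")]
    for (ref_word, hyp_word), count in confusion_counts(data).most_common():
        print(ref_word, "->", hyp_word, count)
```

The most frequent pairs in such a count would be natural candidates for the dictionary modification the abstract describes, although the selection criterion actually used by the authors is not given here.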

Keywords


Pronunciation Dictionary, Pronunciation Variation, Speech Recognition, Statistical Modification Method



