Speech Enhancement Models Suited for Speech Recognition using Composite Source Model

P. S. RajaKumar, S. Ravi, R. M. Suresh

Abstract


To compare the performance of two speech coders, one needs some indicator of the intelligibility and quality of the speech each coder produces. Intelligibility usually refers to whether the output speech is easily understood, while quality indicates how natural the speech sounds. A coder can produce highly intelligible speech that is nevertheless of low quality, in that it sounds machine-like and the speaker is not identifiable. Conversely, unintelligible speech is unlikely to be called high quality, but there are situations in which perceptually pleasing speech does not have high intelligibility. We briefly discuss here the most common measures of intelligibility and quality used in formal tests of speech coders.

Keywords


Speech Coder, Speech Communication, Speech Enhancement, Speech Recognition, Speech Signal

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.