
A Novel Approach to the Isolated Words Speech Recognition based on Features Derived from Wavelet Packets using a New Class of Triplet Halfband Filter Bank

Yogesh S. Angal, Amol D. Rahulkar, Raghunath S. Holambe, Rajan H. Chile

Abstract


This paper presents a new technique for extracting speech features in order to improve recognition accuracy in various types of noisy environments. Most speech recognition systems suffer from high computational complexity. In this paper, a new class of triplet halfband wavelet packets (THWP) is designed based on the generalized halfband polynomial. These packets are used in a speech recognition system to derive effective and efficient speech features. The proposed THWP satisfies perfect reconstruction (PR) and provides linear phase, regularity, better frequency selectivity, and near-orthogonality. These properties are exploited to closely approximate the desirable speech features. The proposed technique computes features from the energy, mean, and variance of each sub-band of the THWP, which yields low-dimensional feature vectors for speech recognition. The performance of the proposed algorithm has been evaluated on the Texas Instruments-46 (TI-46) speech database in various noisy environments. The proposed technique performs better than existing popular speech recognition algorithms.
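The feature computation described in the abstract (energy, mean, and variance of each wavelet packet sub-band) can be illustrated with a minimal sketch. This is not the authors' implementation: the THWP filters are not given on this page, so the sketch substitutes the two-tap Haar filter pair as a stand-in. The function names, the three-level tree depth, and the synthetic input frame are all assumptions made for illustration.

```python
import math
from statistics import fmean, pvariance


def haar_split(x):
    """One analysis step: split x into low-pass and high-pass halves.

    Assumption: the Haar pair stands in for the THWP analysis filters,
    which are not specified in the abstract.
    """
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet_subbands(x, levels):
    """Full wavelet packet tree: every sub-band is split again at each level."""
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def subband_features(x, levels=3):
    """Concatenate (energy, mean, variance) of each sub-band into one vector."""
    features = []
    for band in wavelet_packet_subbands(x, levels):
        energy = sum(c * c for c in band)
        features.extend([energy, fmean(band), pvariance(band)])
    return features


# Hypothetical input: a 64-sample synthetic frame, not real speech data.
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
vec = subband_features(frame, levels=3)
print(len(vec))  # 8 sub-bands x 3 statistics = 24
```

A three-level packet tree gives 8 sub-bands, so the feature vector has only 24 entries per frame, which is the sense in which such features are low dimensional. Because the Haar pair is orthogonal, the sub-band energies here sum exactly to the frame energy; a biorthogonal, near-orthogonal bank such as the THWP would preserve energy only approximately.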


Keywords


Filter Bank, Half Band Filters, Feature Extraction, Wavelet Transform, THWP, Speech Recognition.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.