A Novel Approach to the Isolated Words Speech Recognition based on Features Derived from Wavelet Packets using a New Class of Triplet Halfband Filter Bank
Abstract
This paper presents a new technique for extracting speech features in order to improve recognition accuracy in various types of noisy environments. Most speech recognition systems suffer from high computational complexity. In this paper, a new class of triplet halfband wavelet packets (THWP) is designed based on the generalized halfband polynomial. These packets are used in a speech recognition system to derive effective and efficient speech features. The proposed THWP satisfies perfect reconstruction (PR) and provides linear phase, regularity, better frequency selectivity, and near-orthogonality. These properties are exploited to closely approximate desirable speech features. The proposed technique computes features from the energy, mean, and variance of each sub-band of the THWP, which yields low-dimensional feature vectors for speech recognition. The performance of the proposed algorithm has been evaluated on the Texas Instruments-46 (TI-46) speech database in various noisy environments. In these evaluations, the proposed technique performs better than existing popular speech recognition algorithms.
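The per-sub-band feature computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the THWP filter coefficients, so a Haar filter pair stands in for them, and the function names (`haar_split`, `wavelet_packet_subbands`, `subband_features`), the tree depth, and the synthetic input frame are all hypothetical.

```python
import math


def haar_split(x):
    """One analysis step: split x into low-pass and high-pass halves.

    ASSUMPTION: Haar filters are a stand-in; the paper's THWP filters
    are not specified in the abstract.
    """
    if len(x) % 2:
        x = x + [x[-1]]  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet_subbands(x, depth):
    """Full wavelet packet tree: return the 2**depth leaf sub-bands.

    Leaves are in natural (filter-bank) order, not sorted by frequency.
    """
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for b in bands:
            low, high = haar_split(b)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def subband_features(x, depth=3):
    """Concatenate (energy, mean, variance) of every leaf sub-band."""
    features = []
    for b in wavelet_packet_subbands(x, depth):
        n = len(b)
        mean = sum(b) / n
        energy = sum(v * v for v in b)
        variance = sum((v - mean) ** 2 for v in b) / n
        features.extend([energy, mean, variance])
    return features


# Hypothetical input: a 64-sample synthetic frame (5 cycles of a sine).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
feats = subband_features(frame, depth=3)  # 8 sub-bands x 3 statistics
```

With a depth-3 tree the feature vector has 8 × 3 = 24 entries per frame, which illustrates why this kind of descriptor stays low-dimensional. Swapping in a different filter bank would only change `haar_split`; the statistics step is unaffected.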
This work is licensed under a Creative Commons Attribution 3.0 License.