
Security-Based Speaker Verification for Lip-Password Using Learning Multi-Boosted HMMs

A. Jebaselvi, Kumar Parasuraman, T. Arumuga Maria Devi

Abstract


A lip-password is a private password embedded in the motions of the lips, so that it carries both the password content and the characteristics of the speaker's lip motion. It secures a speaker verification system by exploiting the private password and the behavioral biometrics of lip motion simultaneously: a target speaker uttering a wrong password is rejected, while the target speaker uttering the correct password is accepted. Here, a Hidden Markov Model (HMM) learning approach based on a multi-boosted scheme is presented for such a secure speaker verification system. The method first extracts visual features to characterize each frame, and a lip-password segmentation algorithm partitions each lip sequence into subunits. The boosted HMM learning framework incorporates a random subspace method (RSM) and a data sharing scheme (DSS). Finally, the lip-password is verified by combining the verification results of all subunits learned by the multi-boosted HMMs, checking whether the spoken password matches the already-recorded password of the claimed speaker.
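To make the final verification step concrete, the sketch below (a minimal illustration under assumed discrete-observation HMMs, not the authors' implementation) combines per-HMM log-likelihoods with boosting-style weights and accepts the claim only if the weighted score exceeds a decision threshold. The function names, parameters, and threshold here are hypothetical.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM
    with start distribution pi, transition matrix A, emission matrix B."""
    alpha = pi * B[:, obs[0]]           # joint prob. of state and first symbol
    loglik = 0.0
    for t in range(1, len(obs) + 1):
        c = alpha.sum()                 # scaling factor avoids numeric underflow
        loglik += np.log(c)
        alpha = alpha / c
        if t < len(obs):
            alpha = (alpha @ A) * B[:, obs[t]]
    return loglik

def verify_lip_password(obs, models, weights, threshold):
    """Accept iff the boosting-weighted sum of per-HMM log-likelihoods
    of the observed subunit sequence exceeds a decision threshold."""
    score = sum(w * forward_loglik(obs, *m) for w, m in zip(weights, models))
    return score >= threshold
```

Each element of `models` is a `(pi, A, B)` triple for one boosted subunit HMM; in the paper's scheme these would be trained on random feature subspaces (RSM) with a data sharing scheme (DSS), both of which this sketch omits.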

Keywords


Lip Motion, HMM, GMM, RSM, DSS

Full Text:

PDF





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.