
Recognition of Handwritten Tamil Characters Using Statistical Classifiers

R. Jagadeesh Kannan, R.M. Suresh, M. Saravanan

Abstract


This paper describes a system that recognizes handwritten Tamil characters using a statistical classifier approach, for a subset of the Tamil alphabet. Handwriting recognition involves extracting defined characteristics, called features, that are used to classify an unknown handwritten character into one of the known classes. A typical handwriting recognition system consists of several steps: preprocessing, segmentation, feature extraction and recognition. The input data (handwritten Tamil characters) were collected from different writers on A4-sized documents. The documents were scanned using a flat-bed scanner at a resolution of 100 dpi and stored as 256-bit color scale images. We trained the system with 500 character samples belonging to 10 character classes. The testing data contained a separate set of 50 characters. The training data were also used to test the system, to see how well the system represents the data it has been trained on. On the test set, a recognition rate of 90% was achieved.
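The four steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state which features or which classifier the authors used, so everything below is an assumption chosen for brevity: a fixed threshold for preprocessing, a bounding-box crop standing in for segmentation, zoning (ink density per grid cell) as the feature set, and a nearest-class-mean rule as a simple statistical classifier. All function names are hypothetical.

```python
# Illustrative sketch only: the features (zoning) and the classifier
# (nearest class mean) are assumptions, not the method used in the paper.

def binarize(image, threshold=128):
    """Preprocessing: map grayscale pixels (0-255) to 1 (ink) or 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

def crop_to_ink(binary):
    """Segmentation (simplified): crop to the bounding box of the ink pixels."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return binary  # blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]

def zone_features(binary, grid=2):
    """Feature extraction: ink density in each cell of a grid x grid partition."""
    h, w = len(binary), len(binary[0])
    feats = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * h // grid, (gr + 1) * h // grid
            c0, c1 = gc * w // grid, (gc + 1) * w // grid
            cells = [binary[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    return feats

def extract(image):
    """Run preprocessing, segmentation and feature extraction on one image."""
    return zone_features(crop_to_ink(binarize(image)))

def train(samples):
    """samples: list of (image, label). Returns one mean feature vector per class."""
    by_label = {}
    for image, label in samples:
        by_label.setdefault(label, []).append(extract(image))
    return {label: [sum(col) / len(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def classify(model, image):
    """Recognition: pick the class whose mean vector is closest (squared Euclidean)."""
    f = extract(image)
    return min(model, key=lambda lb: sum((a - b) ** 2 for a, b in zip(f, model[lb])))

def recognition_rate(model, samples):
    """Fraction of (image, label) pairs that are classified correctly."""
    correct = sum(classify(model, img) == label for img, label in samples)
    return correct / len(samples)
```

Calling `recognition_rate` on the training samples corresponds to the check described in the abstract of how well the system represents the data it was trained on, while calling it on held-out samples gives the test-set figure. With 50 test characters, the reported 90% rate corresponds to 45 correctly recognized characters.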

 


Keywords


Preprocessing, Segmentation, Feature Extraction, Character Recognition





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.