Chapter 7. Conclusions and Future Scope


This thesis has proposed techniques for the recognition of handwritten Hindi text by segmenting and classifying its characters. The problems that arise in handwritten Hindi text written by different persons were identified by carefully analyzing the text. To solve these problems, new techniques have been developed for segmentation, feature extraction and recognition.

Four new segmentation algorithms have been proposed in the present work: text line segmentation, segmentation of half characters, segmentation of lower modifiers, and segmentation of a touching left modifier from a consonant in the middle region of the word. A new technique based on header line and base line detection has been proposed to segment overlapped lines of handwritten Hindi text. Determining the header line is difficult, because the position of the header line of a text line and the position of the header line of a particular word in that line may differ. It is equally difficult to determine whether a word contains a lower modifier, a half character, a left modifier touching a consonant in the middle region, or touching characters. New threshold values have been proposed for detecting their presence in the word. For segmentation of half characters from consonants, the structural properties of the text are considered. For segmentation of lower modifiers, a new technique based on the shape of the lower modifiers is proposed. For segmentation of a left modifier from a touching consonant in the middle region of the word, a technique based on the position and length of the left modifier is proposed. To validate the proposed algorithms, they were also tested on printed Hindi text, with satisfactory results.

After segmentation of the text, features are extracted for recognition. A new feature set based on topological features or structural properties of the text has been proposed. A new feature extraction technique called merging of features has also been proposed in the present work. A particular feature of a character may depend on another feature of that character. In such cases, the next feature is extracted only if the previous feature is present. This reduces the number of features to be extracted. Further, the problems in feature extraction were identified and many heuristics were applied to solve them. The overall results obtained with the proposed algorithms for segmentation and recognition of handwritten Hindi text are encouraging. SVM and rule-based classifiers are used for the classification of the characters of handwritten Hindi text.

7.1 Contributions of the Work

This thesis has made the following contributions in the field of handwritten Hindi text recognition:

i) To the best of the researcher's knowledge, this is the first attempt towards the development of OCR for handwritten Hindi text.
ii) New techniques are proposed for segmentation of overlapped lines of handwritten Hindi text, segmentation of conjuncts (half characters), segmentation of lower modifiers, and segmentation of touching modifiers or consonants in the middle region.
iii) A new feature set has been proposed which contains slant-invariant and size-invariant features. The topological features that are extracted are very robust.
iv) A new technique is proposed for feature extraction to increase the speed of recognition. Not all features are extracted from each character; only the main features and the unique feature of that character are extracted.
v) A new technique is proposed for word recognition.
vi) A rule-based classifier and an SVM classifier are used for character recognition.

7.2 Future Scope

The proposed algorithms for segmentation of handwritten Hindi text can be extended to the recognition of other Indian scripts. The segmentation algorithms can be modified further to improve the accuracy of segmentation. New features can be added to improve the accuracy of recognition. The algorithms can be tried on a large database of handwritten Hindi text, and there is a need to develop a standard database for recognition of handwritten Hindi text. The proposed work can be extended to handle degraded text or broken characters. Recognition of digits in the text, half characters and compound characters can be added to improve the word recognition rate.
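As an illustration of the merging-of-features idea summarized earlier in this chapter, where a feature is extracted only when the feature it depends on is present, a minimal Python sketch is given here. The feature names `has_vertical_bar` and `bar_position`, the 0.8 height ratio and the two-thirds position rule are assumptions chosen for illustration only; they are not the feature set or the thresholds used in this thesis.

```python
from typing import Dict, List

Glyph = List[List[int]]  # binary character image: 1 = ink, 0 = background


def longest_run(pixels: List[int]) -> int:
    """Return the length of the longest run of consecutive ink pixels."""
    best = current = 0
    for p in pixels:
        current = current + 1 if p else 0
        best = max(best, current)
    return best


def extract_features(glyph: Glyph, bar_ratio: float = 0.8) -> Dict[str, object]:
    """Extract features in dependency order (hypothetical feature names)."""
    height, width = len(glyph), len(glyph[0])
    columns = [[row[c] for row in glyph] for c in range(width)]
    runs = [longest_run(col) for col in columns]

    # Primary feature: does any column hold a near full-height stroke?
    bar_col = max(range(width), key=lambda c: runs[c])
    has_bar = runs[bar_col] >= bar_ratio * height
    features: Dict[str, object] = {"has_vertical_bar": has_bar}

    # Dependent feature: the bar position is meaningful only when a bar
    # exists, so it is extracted only in that case.
    if has_bar:
        features["bar_position"] = "end" if bar_col >= 2 * width / 3 else "middle"
    return features
```

A character without a vertical bar therefore yields a shorter feature dictionary, which is the source of the saving: the dependent feature is never computed for it.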
