Standard Databases for Recognition of Handwritten Digits, Numerical Strings, Legal Amounts, Letters and Dates in Farsi Language


To cite this version: Farshid Solimanpour, Javad Sadri, Ching Y. Suen. Standard Databases for Recognition of Handwritten Digits, Numerical Strings, Legal Amounts, Letters and Dates in Farsi Language. Guy Lorette. Tenth International Workshop on Frontiers in Handwriting Recognition, Oct 2006, La Baule (France), Suvisoft, 2006. <inria-00098>

HAL Id: inria-00098
https://hal.inria.fr/inria-00098
Submitted on 5 Oct 2006

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Farshid Solimanpour, Javad Sadri, Ching Y. Suen
CENPARMI (Center for Pattern Recognition and Machine Intelligence), Computer Science Department, Concordia University, 1455 de Maisonneuve Blvd. West, Montreal, Quebec, Canada, H3G 1M8
Emails: {f_solim, j_sadri, suen}@cs.concordia.ca

Abstract

This paper describes an important step towards the standardization of research on Optical Character Recognition (OCR) in the Farsi language. It describes the formation of novel, standard handwritten databases that include isolated digits, letters, numerical strings, legal amounts (used for cheques), and dates. Despite a conventional literature search and an Internet search, no publicly accessible Farsi database was found. Hence, it was decided that it would be a worthwhile academic effort to create several Farsi databases that could stand on their own merit as useful tools for OCR researchers. To show the potential uses of our new databases, we also conducted some experiments on the recognition of handwritten isolated Farsi digits.

Keywords: Farsi OCR, Farsi Handwritten Databases, Arabic Handwritten Databases, Indian Database.

1. Introduction

An essential part of the development and evaluation of every offline character recognition technique is the comparison of results using the same standard database as other researchers [1]. There are many examples of widely used databases in the field of handwriting recognition, such as NIST [2], CEDAR [3], CENPARMI [4], UNIPEN [5], CENPARMI Arabic Cheques [6], ETL9 (Japan) [7], and PE92 (Korea) [8]. But to the best of our knowledge, no standard database for the Farsi language is available. The Farsi language is spoken by tens of millions of people, mainly in Iran, Afghanistan, Tajikistan, and partly in some other countries. Other languages, such as Arabic, Urdu, and Pashto, use the same alphabet and digits or subsets of them.
In Farsi, words, sentences and dates are written from right to left, but numbers are written from left to right, which matches the way numbers are written in English. Farsi has 32 letters in its alphabet and is a cursive language, which means that within one word letters can be connected. Due to this connectivity, the shape of a Farsi letter may change significantly depending on its position in a word, the identity of neighboring letters, the font, or the way the writer connects successive letters. Considering these facts, it is crucial to have standard databases in order to improve research on Farsi handwritten recognition.

In this paper, we describe the details of the formation of the following databases: Farsi isolated digits, numerical strings, isolated letters, legal amounts, Farsi dates (called Hijri Shamsi), and a small set of English digits (written by native Farsi speakers). To show the usefulness of our database, we also report the results of some of our experiments on the recognition of isolated handwritten Farsi digits taken from this database.

The rest of this paper is organized as follows. Section 2 describes our steps towards collecting the data. Section 3 covers the data extraction methods, which include the pre-processing of the images. Section 4 details our experiments on the recognition of Farsi isolated digits. In Section 5, we discuss the output of our work and compare it with some other works. Finally, in Section 6 we present some concluding remarks and suggestions for future research.

Note: The authors have the same contribution.

2. Data Collection

Two data entry forms were designed for our data collection process. The first form contained Farsi numerical strings, isolated letters, the date, and English digits. The Farsi digits database was formed by segmenting the numerical strings in this form. The second form was completely dedicated to cursive legal amounts. To automate the process of cutting the fields out of the scanned forms, two types of anchoring marks were added to the forms: the form identifiers and the edge identifiers.
The form identifiers consisted of 8 squares, each of which can have two states: empty or blackened. Therefore, they could represent 255 binary numbers and could serve as the identity of 255 different forms. In our case, a different set of three squares was blackened on each form: the set for form 1 includes squares 5 and 8, and the set for form 2 includes squares 4 and 7. By detecting these squares, our program could automatically identify the form it was working on. The edge identifier marks consisted of four squares, one located at each corner of the form, and detecting them enabled the program to correctly determine the coordinates of the region that contained the actual data. Two samples of the data entry forms are shown in Figure 1 and Figure 2.
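The identifier scheme can be illustrated with a short sketch. It assumes, purely for illustration, that square k (counted from 1) stands for bit k-1 of the form number; the paper does not state the bit order, and the function name and the example square sets are hypothetical. With 8 squares there are 2^8 - 1 = 255 patterns that contain at least one blackened square.

```python
def decode_form_id(blackened, n_squares=8):
    """Return the integer encoded by the 1-based positions of the blackened squares.

    Assumption (not from the paper): square k contributes bit k-1.
    """
    form_id = 0
    for pos in blackened:
        if not 1 <= pos <= n_squares:
            raise ValueError(f"square {pos} is outside 1..{n_squares}")
        form_id |= 1 << (pos - 1)
    return form_id


# Hypothetical square sets, chosen only to show the decoding.
print(decode_form_id({1, 4, 7}))
print(decode_form_id({2, 5, 8}))

# Number of distinct non-empty patterns that 8 squares can encode.
print(2 ** 8 - 1)
```

Under this assumed bit order, any two different sets of blackened squares decode to different numbers, which is all the extraction program needs in order to tell the forms apart.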

Figure 1. Sample of filled form 1.

The data entry forms were filled in by 175 writers of different ages, genders, and jobs. Among those, 105 writers were randomly assigned to our training set, 50 writers to the testing set, and 20 writers to the verifying set. We ensured that the data in each set was completely genuine and that there would be no relation between sets. Our final work includes these databases: numerical strings, isolated digits, Farsi letters, cursive legal amounts, and a small set of English isolated digits. In the following subsections we give details on each database.

2.1. Farsi numerical strings database

Each participant wrote 42 numerical strings in form 1, which were used to form our database of Farsi numerical strings. In Farsi, the normal height of the numeral 0 is approximately one fifth of that of other characters, and it is written differently every time, either because of its location in a numerical string or because of its repetition in a numerical string. To cover all forms, we had to repeat it more times than other numerals. In our databases, we have samples of the numeral zero at the beginning, middle or end of a numerical string, as well as repeated two, three or six times in a string. In Figure 3, samples of two different writing styles of repeated zeros can be viewed.

Figure 3. Different styles of writing zeros in a numerical string.

Table 1. Statistics of the numerical strings database.
Classes   Writers   Training   Verifying   Testing
42        175       4,410      840         2,100

2.2. Farsi isolated digits database

A simple segmentation algorithm was developed to separate the digits in the numerical strings and to create the Farsi digits database. The data entry form for the numerical strings was designed so that, across all the strings, each of the digits 1 to 9 was repeated the same fixed number of times, digit 0 was repeated more often, and the decimal point was also repeated. This way we could control the number of isolated digits that we could extract from the numerical strings. Samples of Farsi isolated digits are shown in Figure 4, and statistics of this database are included in Table 2.

Figure 4. Samples of Farsi isolated digits.
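The paper does not describe its segmentation algorithm. As one possible illustration, the sketch below splits a binarized string image at the empty columns of its vertical projection, which is a common baseline for well-separated characters. The function name and the 0/1 image representation (1 = ink) are assumptions made for this example.

```python
def segment_columns(image):
    """Split a binary image (list of rows, 1 = ink) at columns without ink.

    Returns a list of (start, end) column ranges, with end exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # A column "has ink" if any row contains an ink pixel in that column.
    has_ink = [any(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new character begins
        elif not ink and start is not None:
            segments.append((start, x))    # the character ended at column x
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


example = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
print(segment_columns(example))
```

A projection-based split like this cannot separate digits that touch or overlap horizontally, so some strings yield fewer isolated digits than were written.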
Because separating all the digits was not possible, writers did not participate equally in the database for each digit. Therefore, some of the digits written by the writers with the highest participation were randomly removed from the database in order to normalize the participation. The algorithm is shown in Figure 5. Note that every time a digit is removed, the most participating writer may change. This procedure was executed for each digit. Table 2 shows the final statistics for this database.

Figure 2. Sample of filled form 2.

Table 2. Statistics of the isolated digits database.
Classes   Writers   Training   Verifying   Testing
10        175       11,000     2,000       5,000
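The normalization procedure can be sketched as follows. This is a minimal illustration rather than the authors' program: the (writer, image) pair representation, the function name, and the tie-breaking rule (the first writer found with the highest count) are assumptions.

```python
import random
from collections import defaultdict


def normalize_participation(images, target, rng=None):
    """Reduce the images of one digit class to `target` samples.

    `images` is a list of (writer_id, image_id) pairs. While there are too
    many images, one image is deleted at random from the writer who
    currently contributes the most.
    """
    rng = rng or random.Random()
    by_writer = defaultdict(list)
    for writer, image in images:
        by_writer[writer].append(image)

    total = len(images)
    while total > target:
        # The most participating writer is re-evaluated after every deletion.
        top = max(by_writer, key=lambda w: len(by_writer[w]))
        by_writer[top].pop(rng.randrange(len(by_writer[top])))
        total -= 1

    return [(w, img) for w, imgs in by_writer.items() for img in imgs]


# Hypothetical data: writer A contributed 5 images, B 2 images, C 1 image.
samples = [("A", i) for i in range(5)] + [("B", i) for i in range(2)] + [("C", 0)]
kept = normalize_participation(samples, target=5, rng=random.Random(0))
print(sorted(kept))
```

In this example only writer A loses images, ending with two images each for A and B and one for C, so the remaining set is spread more evenly across writers.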

Figure 5. Algorithm for normalizing the participation. Flowchart steps: include all the images; if the designated count has been reached, finish; otherwise determine the most participating writer, randomly delete one image from that writer, and check the count again.

2.3. Farsi isolated letters database

Although Farsi consists of 32 letters, people filling out data entry forms use two different styles for the letters ه (pronounced Heh) and ا (pronounced Alef); samples of those styles are shown in Figure 6 and Figure 7. With these styles, the number of isolated letters that we included in the form reached 34.

Figure 6. Two styles of writing the letter ه.

Figure 7. Two styles of writing the letter ا.

Each writer wrote the isolated letters included in the form twice. The statistics of this database are included in Table 3.

Table 3. Statistics of the Farsi isolated letters database.
Classes   Writers   Training   Verifying   Testing
34        175       7,140      1,360       3,400

2.4. Farsi legal amounts database

Two types of data were included in our second data entry form. The first type consisted of 41 words that are normally used for writing the legal amount on bank cheques, plus four additional words consisting of currency units and the words "Over" and "Equal to" (in Farsi). The second type consisted of four worded number strings, of which three were pre-determined fields and one was a free field. In the free field, writers could write a worded number of their own. When including these images in the database, the free field was labeled manually. A sample of a worded number can be seen in Figure 8. Table 4 shows statistics of this database.

Figure 8. Example of a cursive worded number, which reads: One Hundred and Forty Toumans Over.

Table 4. Statistics of the cursive worded number database (Writers = 175).
              Classes   Training   Verifying   Testing
Fields        48        5,040      960         2,400
Free Field    175       105        20          50
Total         8         5,145      980         2,450

2.5. Farsi dates database

Countries that have Farsi language speakers use a type of date called Hijri Shamsi. The format for writing the date in Farsi is year/month/day. A sample date is shown in Figure 9. The statistics of this database are included in Table 5.

Figure 9. Example of a Farsi date.

Table 5. Statistics of the Farsi dates database.
Classes   Writers   Training   Verifying   Testing
175       175       105        20          50

2.6. English digits

English digits have already been collected and included in different databases; however, a small set was included in the first form (each digit from 0 to 9 was repeated twice in each form) in order to capture the style of writing English digits by non-native English speakers (Iranians). Table 6 shows statistics of this database.

Table 6. Statistics of the English isolated digits database.
Classes   Writers   Training   Verifying   Testing
10        175       2,100      400         1,000

3. Data Extraction

3.1. Preprocessing

Each form was scanned in full using a Lexmark-P80 scanner at a grey level of 8 bits. The images were saved as PNG (Portable Network Graphics) indexed-color files. PNG provides a patent-free replacement for GIF and also replaces many common uses of TIFF [9]. All the databases consist of grayscale and binary versions of the images, and each set is included in a separate folder. First, the grayscale images were extracted; then all of them were converted to binary in a separate folder, keeping the same filenames and the same folder structure. To convert each file to binary, the threshold of the grayscale image is calculated from the gray-level histogram [10]; all the pixels with brightness less than that value are then set to black, and the rest to white. Before the images were extracted from the scanned forms, their salt-and-pepper noise was removed using the algorithm presented in [11].
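The binarization step can be sketched with Otsu's histogram-based threshold selection, the method cited for this step. The code below is a minimal pure-Python illustration over a flat list of 8-bit grey levels; the function names are hypothetical, and the original implementation is not described in the paper.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value < t form the dark class, the rest the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    dark_count = 0   # number of pixels darker than the candidate threshold
    dark_sum = 0     # sum of their grey levels
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        dark_count += hist[t - 1]
        dark_sum += (t - 1) * hist[t - 1]
        bright_count = total - dark_count
        if dark_count == 0 or bright_count == 0:
            continue
        mean_dark = dark_sum / dark_count
        mean_bright = (sum_all - dark_sum) / bright_count
        between = dark_count * bright_count * (mean_dark - mean_bright) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Set pixels darker than the threshold to black (0), the rest to white (255)."""
    return [0 if p < threshold else 255 for p in pixels]


grey = [10, 12, 11, 200, 205, 210]   # three ink pixels, three paper pixels
t = otsu_threshold(grey)
print(binarize(grey, t))
```

Because the threshold is derived from each image's own histogram, forms scanned with slightly different brightness are still split into ink and paper without a hand-tuned constant.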

3.2. Data Preparation

A computer program was developed to automatically extract images of the fields from the pre-processed scanned forms, using a manually designed template that identifies the data entry fields relative to the anchoring marks at the corners of the forms. The program first recognized the edge identifier anchor marks on the scanned image by a simple template matching technique. It then tried to match the template coordinates to the anchor marks of the image by scaling and/or rotating the template if necessary. After that, all the fields were cut from the image, based on the boundaries in the matched template. The fields were saved as individual image files according to the set they belonged to and the naming convention of the database.

To determine the set to which an image belongs, the writers were selected from different ages, genders, and jobs to serve in the training, testing, or verifying set. All the images extracted from a particular writer's forms were saved to the same set, to make sure that the data sets are totally unrelated.

For each image, a record was inserted into a Microsoft Access database that includes the path to the image file relative to the base folder, the label of the image, the number of characters in the image, the number of words in the image, the type of the contents (numerical, date, cursive worded number or letter), and some other information. By querying this detailed information, future researchers will be able to find the proper set of images more easily.

4. Experimental Results

In order to show the application of our databases, we conducted some experiments on the recognition of handwritten isolated Farsi digits. We used our isolated digits database, which contains 1,100 training, 200 verifying, and 500 testing samples per digit.

4.1. Feature Extraction

In order to compare our results with some previous works, we used the features presented in [12]. Eight sets of features were used to represent the images of digits: the outer profile from each of four directions, and the crossing counts and the projection histogram from each of two directions.

Figure 10. A sample of the features. a: outer profiles, b: crossing counts, c: projection histogram.

Each set produced an array that was later normalized to an array of size eight. The normalization was done using linear interpolation for up-sampling and averaging for down-sampling the array. The combination of all the feature sets produced a 64-member array that was used as our feature vector. A sample of the features used in our experiment can be viewed in Figure 10.

4.2. Classification

For classification we used Support Vector Machines (SVM) [13] with a Radial Basis Function (RBF) kernel, whose parameters are C and σ; the parameter σ was set to 0.05. To find the best parameter values, we adjusted the parameters on the training set and tested them on the verifying set. The parameters that gave the best results on the verifying set were used for classifying the testing set. We used LIBSVM [14] for the implementation of our SVM classifier. Table 7 shows the overall results of our classifier compared to the results of [12] and [15]. The confusion matrix of the testing set is also shown in Table 8. Each row of this table shows how isolated digits in the testing set were classified or misclassified.

Table 7. Our results compared with [12] and [15].
                  Our Results   Results of [12]   Results of [15]
Training Set      11,000        4500              790
Verifying Set     2,000         -                 -
Testing Set       5,000         600               05
nSV*              577           69                -
Training Error    0.85%         0.00%             0
RR**              97.%          99.44%            94
* Number of Support Vectors, ** Recognition Rate

Table 8. The confusion matrix of the testing set using SVM with polynomial kernel.

5. Discussion

This research effort has produced six databases. Each database is divided into training, verifying, and testing sets, which include approximately 60%, 11%, and 28% of the available data respectively. All the databases are available in grayscale and binary versions. Table 9 and Table 10 show a comparison between our two important databases (Farsi isolated digits and Farsi isolated letters) and other similar available databases. Although our recognition rate in Section 4 is a little lower than that of [12], the databases were not the same, and our isolated digits database has more samples than theirs. Also, we used unseen data to test our classifier, whereas in [12] the testing set was used for adjusting the parameters of the classifier. As our database is available to the research community, we hope that it can function as a standard comparison basis for Farsi handwritten recognition research.

Table 9. Comparison of the number of samples in our Farsi isolated digits database with other databases.
Database          Language   Training Set   Verifying Set   Testing Set
MNIST             English    60,000         0               10,000
CEDAR             English    5,80           0               707
CENPARMI          English    4,000          0               2,000
CENPARMI          Arabic     0,56           0               4,4
Our Database      Farsi      11,000         2,000           5,000

Table 10. Comparison of the number of samples in our Farsi isolated letters database with other databases.
Database          Language          Training Set   Verifying Set   Testing Set
CEDAR             English Letters   9,45           0               ,8
Our Database      Farsi Letters     7,140          1,360           3,400

6. Conclusion and Future Works

We have presented six new standard databases consisting of handwritten Farsi numerical strings, digits, letters, legal amounts and dates, which can serve as a basis for future research in offline Farsi handwritten recognition. These databases are available to the research community upon request to the Center for Pattern Recognition and Machine Intelligence (CENPARMI) of Concordia University. Our database contains binary and grayscale versions of the images, allowing for experimentation and comparison with both grayscale and binary preprocessing and recognition techniques. In the future, the databases may be expanded by collecting more data entry forms and by adding more sets, such as Farsi words, sub-words and sentences. Furthermore, the sets may be easily adopted for Farsi-based cheque-processing systems.
Later, we would like to develop sophisticated segmentation and recognition algorithms for processing samples of these databases.

7. References

[1] I. Guyon, R. Haralick, J. Hull, and I. Phillips, Database and benchmarking, in H. Bunke and P. Wang, editors, Handbook of Character Recognition and Document Image Analysis, World Scientific, 1997, Chapter 30, pp. 779-799.
[2] R. Wilkinson, J. Geist, S. Janet, P. Grother, C. Burges, R. Creecy, B. Hammond, J. Hull, N. Larsen, T. Vogl, and C. Wilson, The first census optical character recognition systems conf. #NISTIR 4912, The U.S. Bureau of Census and the National Institute of Standards and Technology, Gaithersburg, MD, 1992.
[3] J. Hull, A database for handwritten text recognition research, IEEE Trans. on Pattern Analysis and Machine Intelligence, May 1994, Volume 16, Issue 5, pp. 550-554.
[4] C. Y. Suen, C. Nadal, R. Legault, T. Mai, and L. Lam, Computer recognition of unconstrained handwritten numerals, Proc. of the IEEE, 1992, Volume 80, Issue 7, pp. 1162-1180.
[5] I. Guyon, L. Schomaker, R. Plamondon, M. Liberman, and S. Janet, Unipen project of on-line data exchange and benchmarks, Proc. of the 12th IAPR Int. Conf. on Pattern Recognition, Jerusalem, Israel, Oct. 1994, pp. 29-33.
[6] Yousef Al-Ohali, Mohamed Cheriet, and C. Y. Suen, Databases for recognition of handwritten Arabic cheques, Proceedings of the Seventh Int. Workshop on Frontiers in Handwritten Recognition, Sep. 2000, pp. 601-606.
[7] F. Jelinek, Self-organized language modeling for speech recognition, in A. Waibel and K.-F. Lee, editors, Readings in Speech Recognition, Morgan Kaufmann Publishers, Inc., 1990, pp. 450-506.
[8] D. Kim, Y. Hwang, S. Park, E. Kim, S. Paek, and S. Bang, Handwritten Korean character image database PE92, in Proceedings of the Second Int. Conference on Document Analysis and Recognition, 1993, pp. 470-473.
[9] Chris Lilley, PNG (Portable Network Graphics), The World Wide Web Consortium (W3C). Details available at http://www.w3.org/Graphics/PNG/
[10] N. Otsu, A thresholding selection method from gray-level histogram, IEEE Transactions on Systems, Man, and Cybernetics, 1979, Volume 9, pp. 62-66.
[11] Jae S. Lim, Two-Dimensional Signal and Image Processing, Prentice Hall, Englewood Cliffs, USA, 1990, pp. 469-476.
[12] H. Soltanzadeh and M. Rahmati, Recognition of Persian handwritten digits using image profiles of multiple orientations, Pattern Recognition Letters, 2004, Volume 25, pp. 1569-1576.
[13] C. J. C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery, 1998, Volume 2, pp. 121-167.
[14] Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[15] J. Sadri, C. Y. Suen, and T. D. Bui, Application of Support Vector Machines for Recognition of Handwritten Arabic/Persian Digits, Proceedings of the Second Conference on Machine Vision and Image Processing & Applications (MVIP2003), Vol. 1, pp. 300-307, Feb. 2003, Tehran, Iran.
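Supplementary sketch for Section 4.1. The code below illustrates how eight feature sets can each be normalized to eight values (linear interpolation when up-sampling, averaging when down-sampling) and concatenated into a 64-member vector. The exact feature definitions of the cited work are not reproduced in this paper, so the profile, crossing-count and projection definitions used here, as well as all function names, are assumptions made for illustration.

```python
def resample(values, size=8):
    """Normalize an array to `size` entries."""
    n = len(values)
    if n == 0:
        raise ValueError("cannot resample an empty array")
    if n == size:
        return [float(v) for v in values]
    if n < size:
        # Up-sampling: linear interpolation at evenly spaced positions.
        if n == 1:
            return [float(values[0])] * size
        out = []
        for i in range(size):
            pos = i * (n - 1) / (size - 1)
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            out.append(values[lo] * (1 - frac) + values[hi] * frac)
        return out
    # Down-sampling: average the entries that fall into each output bin.
    out = []
    for i in range(size):
        chunk = values[i * n // size:(i + 1) * n // size]
        out.append(sum(chunk) / len(chunk))
    return out


def first_ink(line):
    """Distance from the start of a line to its first ink pixel."""
    for i, v in enumerate(line):
        if v:
            return i
    return len(line)


def crossings(line):
    """Number of background-to-ink transitions along a line."""
    count, prev = 0, 0
    for v in line:
        if v and not prev:
            count += 1
        prev = v
    return count


def feature_vector(image, size=8):
    """Build the concatenated vector from a binary image (1 = ink)."""
    rows = [list(r) for r in image]
    cols = [list(c) for c in zip(*rows)]
    sets = [
        [first_ink(r) for r in rows],        # outer profile seen from the left
        [first_ink(r[::-1]) for r in rows],  # outer profile seen from the right
        [first_ink(c) for c in cols],        # outer profile seen from the top
        [first_ink(c[::-1]) for c in cols],  # outer profile seen from the bottom
        [crossings(r) for r in rows],        # crossing counts along rows
        [crossings(c) for c in cols],        # crossing counts along columns
        [sum(r) for r in rows],              # projection histogram over rows
        [sum(c) for c in cols],              # projection histogram over columns
    ]
    vector = []
    for s in sets:
        vector.extend(resample(s, size))
    return vector


digit = [[0, 1, 0], [1, 1, 1], [0, 1, 0], [0, 1, 0]]
print(len(feature_vector(digit)))
```

Resampling every set to the same length makes the vector size independent of the image dimensions, which lets digit images of different sizes share one classifier input format.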