An Automation System to Recognize Sinhala Handwritten Characters Using Artificial Neural Networks

Abstract

In the Government of Sri Lanka, most information-based activities are still carried out manually. This research proposes a new way to automate an important public service which is fundamental by nature: issuing the National Identity Card (NIC). It presents an approach for recognizing Sinhala handwritten characters in the application forms. Initially, a set of handwriting samples from 30 individuals was collected; two thirds of those samples were used for the training process and the remaining one third for the testing process. The scanned images of the characters went through preprocessing before further processing. Finding the boundaries and normalizing the characters are handled by the preprocessor. After preprocessing, segmentation is performed in order to obtain the individual characters from the list of characters. Standard image processing techniques were employed to accomplish these tasks. The characters were then used to train an Artificial Neural Network (ANN). The recognition of Sinhala characters is done by an ANN, which is widely used in applications involving uncertainty. Rules are imposed on the outputs of the neural network (NN) to make the recognition process more accurate. Finally, the details of the applicant are appended to the database. The outcome of this research will be beneficial to the general public at large.

Introduction

In many countries, e-government strategies are being used to make government processes more efficient and accurate. Typically, in order to get something done through the Grama Niladhari (GN), citizens have to fill out the required application forms. In the current scenario, most of the forms are in the medium of Sinhala. Therefore, when the GN receives those filled application forms, he has to go through them manually before the processing can be done. If the Government had an integrated system capable of connecting all the basic entities in the process, automation would become easier. This study is therefore a starting point supporting the concept of e-government in the Sri Lankan context. In Sri Lanka, citizens are most commonly connected to the Government through the GN, so automation of the services performed by the GN plays an important role in developing a realistic e-government strategy. In order to achieve these goals, this paper proposes a system that allows the extraction of Sinhala handwritten characters from the above-mentioned application forms.

The most important component of this task is extracting data from forms filled in by citizens in the Sinhala language, which involves Sinhala handwritten character recognition. Usually, people submit the applications by filling them in in their own handwriting. Nandasara (1995) states that it is a challenging task to identify handwritten characters, since the variation that has to be captured among the characters is high. Furthermore, the special structure of Sinhala characters makes the recognition process complex [1]. Much effort has previously been devoted to making a computer recognize both handwritten and typed characters automatically. Until quite recently, this effort was focused on recognizing English characters; for Asian languages such as Sinhala and Tamil there have been few efforts. Methods widely used for character recognition in such languages include pattern matching using image processing techniques.

Material & Methods

1. Data Acquisition

In the data acquisition stage, handwriting samples of 30 people were collected. The handwriting of 20 people was used for training the neural network and the remaining 10 handwritings were used for testing. When collecting sample letters from individuals, blank A4 sheets with dotted pencil lines were used, and each person was asked to write a given set of letters on those dotted lines. Subsequently, the pencil lines on the sheets were erased and the sheets were scanned with an HP ScanJet scanner at 200 dpi resolution. It should also be mentioned that only a limited number of characters was collected, since some characters are rarely used in the context.
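The writer-level split described above might look like this in outline; the data structure is a hypothetical illustration, since the paper does not describe how the samples were stored:

```python
# Hypothetical sketch of the split: handwriting of the first 20 writers
# goes to training, the remaining 10 writers to testing.
def split_by_writer(samples_by_writer, n_train=20):
    """samples_by_writer maps a writer id to that writer's character images."""
    writers = sorted(samples_by_writer)
    train = [s for w in writers[:n_train] for s in samples_by_writer[w]]
    test = [s for w in writers[n_train:] for s in samples_by_writer[w]]
    return train, test

# Dummy data: 30 writers, 2 sample letters each.
data = {w: [f"w{w}_s{i}" for i in range(2)] for w in range(30)}
train, test = split_by_writer(data)
# len(train) == 40, len(test) == 20
```

Splitting by writer rather than by individual letter keeps all samples from one hand on the same side of the split, which is what testing on the handwriting of 10 unseen people requires.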

Implementation of the NN consists of four main steps, which can be represented in the following diagram:

  1. Preprocessing
  2. Segmentation
  3. Training
  4. Post-processing

Figure 1.0 – Main Steps of Word Processing

2. Preprocessing

In this stage the image is prepared for further processing. Initially, the image goes through a filtering process to remove the noise that could be added during the scanning process. The term noise is to be understood as anything that prevents the recognition system from fulfilling its objective. Noise can be introduced into the image by the roughness of the paper. It was observed that the scanned images contain salt-and-pepper noise; therefore, median filtering was used to remove it.
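The median filtering step can be sketched as follows; the paper used MATLAB's built-in functions, so this 3×3 NumPy version is only illustrative:

```python
import numpy as np

def median_filter_3x3(img):
    """Remove salt-and-pepper noise by replacing each pixel with the
    median of its 3x3 neighborhood (edges padded by replication)."""
    padded = np.pad(img, 1, mode="edge")
    # Gather the 9 shifted views of the image and take the per-pixel median.
    stack = np.stack([padded[r:r + img.shape[0], c:c + img.shape[1]]
                      for r in range(3) for c in range(3)])
    return np.median(stack, axis=0)

# A flat gray patch with one salt (255) and one pepper (0) pixel:
noisy = np.full((5, 5), 100.0)
noisy[2, 2] = 255.0   # salt
noisy[1, 3] = 0.0     # pepper
clean = median_filter_3x3(noisy)
# Both outliers are replaced by the neighborhood median, 100.
```

Because the median ignores extreme values, isolated specks vanish while edges of the strokes are preserved, which is why median filtering suits this noise type better than averaging.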

After filtering, the image is binarized, i.e. converted to black and white, to ease processing. Different people write the letters in diverse colors; binarization [10] is done to remove this effect. Most typical character recognition systems follow these steps before the processing stage. After binarization, the next step is to make the characters thin. The goal of thinning is to eliminate differences in pen thickness by reducing the strokes to one pixel thick. When writing, the letters are blotted with ink and hence become much thicker; thinning avoids this effect and brings all the letters into one standard format. Morphological operations were applied for the thinning. These are the three steps followed in the preprocessing stage. For each of these image processing techniques there are built-in functions in MATLAB [3], and those built-in methods were used in the implementation process.
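The paper does not name the thresholding technique behind its binarization step; as an illustrative stand-in, here is a sketch of Otsu's method in NumPy (the actual system used MATLAB built-ins):

```python
import numpy as np

def otsu_binarize(gray):
    """Binarize a grayscale image (0-255): Otsu's method picks the
    threshold that maximizes between-class variance. Returns a 0/1
    image with ink (dark pixels) mapped to 1."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * hist[:t]).sum() / w0
        m1 = (np.arange(t, 256) * hist[t:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return (gray < best_t).astype(np.uint8)  # dark strokes become 1

# Dark strokes (20) on a light background (200):
img = np.full((4, 4), 200, dtype=np.uint8)
img[1:3, 1:3] = 20
binary = otsu_binarize(img)
# The four stroke pixels become 1, the background 0.
```

An automatically chosen threshold is what makes binarization robust to the diverse pen colors mentioned above, since a fixed cutoff would fail on light-colored ink.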

3. Segmentation

In the segmentation stage, the image is divided into individual characters. Then all the white space around each character is removed.

Figure 2.0 - Finding Boundaries of a Character

Projection profiles of the image are used to crop the image into text lines and then into individual letters. Initially, the horizontal projection profile is used to detect the text lines of the image, and the image is segmented into text lines accordingly. The vertical projection profiles of those text lines are then used to segment them into individual characters.

Since the scanned image consists of 9 text lines, the horizontal histogram also consists of 9 bars corresponding to those text lines. The boundaries of those bars can then be obtained and used to crop the image into text lines. After obtaining a text line, the letters have to be cropped so that they can be input to the system for processing. They can be isolated using the vertical histogram of the text line, which shows how the letters are distributed within it. From this histogram the boundaries of the characters can be found, and each character can then be cropped and isolated for input to the system.
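The line-then-letter segmentation described above can be sketched with NumPy, assuming a binary image where ink pixels are 1 (the paper's implementation used MATLAB built-ins):

```python
import numpy as np

def profile_runs(profile):
    """Return (start, end) index pairs of the non-zero runs in a
    projection profile; each run is one text line (or one letter)."""
    nz = profile > 0
    edges = np.flatnonzero(np.diff(nz.astype(int)))
    starts = list(edges[~nz[edges]] + 1)   # rising edges: run begins
    ends = list(edges[nz[edges]] + 1)      # falling edges: run ends
    if nz[0]:
        starts.insert(0, 0)
    if nz[-1]:
        ends.append(len(profile))
    return list(zip(starts, ends))

def segment(binary):
    """Split a 0/1 page image into characters: horizontal profile
    (row sums) -> text lines, vertical profile (column sums) -> letters."""
    chars = []
    for top, bottom in profile_runs(binary.sum(axis=1)):
        line = binary[top:bottom]
        for left, right in profile_runs(line.sum(axis=0)):
            chars.append(line[:, left:right])
    return chars

# Two one-row "text lines", each containing two one-pixel "letters":
page = np.zeros((5, 5), dtype=int)
page[1, [1, 3]] = 1
page[3, [1, 3]] = 1
# segment(page) yields 4 single-pixel characters.
```

Cropping at the run boundaries also removes the surrounding white space in one pass, which is the boundary-finding step shown in Figure 2.0.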

This procedure was applied to all the characters in the collected data set. Each isolated character was then converted into a column vector, and the input vector was created from those column vectors. A NN then had to be created for the segmented characters, so the input vector and the test vector were prepared, and the NN was trained with that input vector.
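A sketch of how the isolated characters could be turned into the column vectors that make up the NN's input matrix; the 8×8 grid and nearest-neighbor resampling are assumptions, since the paper does not state its normalization size:

```python
import numpy as np

def to_input_matrix(char_images, grid=(8, 8)):
    """Resample each 0/1 character image onto a fixed grid by
    nearest-neighbor sampling, flatten it into a column vector,
    and stack the columns into the NN input matrix."""
    cols = []
    for img in char_images:
        h, w = img.shape
        rr = np.arange(grid[0]) * h // grid[0]   # sampled row indices
        cc = np.arange(grid[1]) * w // grid[1]   # sampled column indices
        resized = img[np.ix_(rr, cc)]            # fixed-size grid
        cols.append(resized.reshape(-1))         # column vector, length 64
    return np.stack(cols, axis=1)                # shape: (64, n_characters)

chars = [np.ones((16, 12)), np.zeros((10, 10))]
X = to_input_matrix(chars)
# X.shape == (64, 2); first column all ones, second all zeros.
```

Fixing the grid size is what normalizes characters of different dimensions into vectors of equal length, so that every column of the input matrix matches the NN's input layer.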

Results and Discussion

The most salient feature of NNs is their massive number of processing units and their interconnectivity. Unless handled carefully, the various parameters involved in the architecture of the NN may cause the training process (adjusting the weights) to slow down considerably. Some of these parameters are: the number of layers, the number of neurons in each layer, the initial values of the weights, the training coefficient, and the correctness tolerance. The optimal selection of parameters varies depending on the alphabet. To train the weights, an initial set of weights is tested against each input vector. If an input vector is found for which recognition fails, the weights are adjusted to suit that particular input vector. However, this adjustment may also affect the recognition of input vectors that have already been tested, so the entire model needs to be tested all over again from the beginning.

ANNs are capable of abstracting the essence of a set of inputs. For example, a network can be trained on a sequence of distorted versions of a letter. After adequate training, presenting such a distorted example will cause the network to produce a perfectly formed letter. Experimental results revealed that training on more than 20 such distorted versions of the same letter produces correct results with a very high percentage of accuracy.
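The distorted versions in this study came from natural handwriting variation; purely for illustration, the following sketch synthesizes comparable positional distortions of one letter by random shifts (not something the paper describes doing):

```python
import numpy as np

def shifted_versions(letter, n=20, max_shift=2, seed=0):
    """Produce n randomly shifted copies of a 0/1 letter image,
    mimicking the positional variation between handwriting samples."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        out.append(np.roll(np.roll(letter, dy, axis=0), dx, axis=1))
    return out

letter = np.zeros((8, 8), dtype=int)
letter[2:6, 3:5] = 1           # a crude vertical stroke
variants = shifted_versions(letter)
# Every variant contains the same ink pixels, just moved.
```

Each variant keeps the letter's shape while changing its position, which is exactly the kind of variation the network must abstract away.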

Backpropagation is a systematic method for training multilayer ANNs (perceptrons). The sigmoid compresses the range of NET so that OUT lies between zero and one. Since backpropagation uses the derivative of the squashing function [2], the function has to be differentiable everywhere. The sigmoid has this property, and the additional advantage of providing a form of automatic gain control. Properly trained backpropagation networks tend to give reasonable answers when presented with inputs they have never seen. Typically, a new input leads to an output similar to the correct output for the training input vectors that are most similar to the new input. This generalization property makes it possible to train a network on a representative set of input/target pairs and get good results without training the network on all possible input/output pairs.
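The mechanics described here, with the sigmoid squashing NET and its derivative driving the weight updates, can be illustrated by a tiny two-layer backpropagation network. The XOR task, layer sizes, and learning rate are arbitrary illustrative choices, not the paper's MATLAB configuration:

```python
import numpy as np

def sigmoid(net):
    """Squash NET so that OUT lies between zero and one."""
    return 1.0 / (1.0 + np.exp(-net))

def dsigmoid(out):
    """Derivative of the sigmoid, expressed in terms of OUT itself."""
    return out * (1.0 - out)

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)   # input vectors
T = np.array([[0], [1], [1], [0]], float)               # target outputs (XOR)
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)          # hidden layer
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)          # output layer

def forward():
    H = sigmoid(X @ W1 + b1)
    return H, sigmoid(H @ W2 + b2)

mse_before = np.mean((forward()[1] - T) ** 2)
for _ in range(5000):                        # adjust weights by backpropagation
    H, O = forward()
    dO = (O - T) * dsigmoid(O)               # error signal at the output layer
    dH = (dO @ W2.T) * dsigmoid(H)           # error propagated back to hidden
    W2 -= 0.5 * H.T @ dO; b2 -= 0.5 * dO.sum(axis=0)
    W1 -= 0.5 * X.T @ dH; b1 -= 0.5 * dH.sum(axis=0)
mse_after = np.mean((forward()[1] - T) ** 2)
# mse_after typically ends up well below its initial value.
```

Note how both error terms multiply by `dsigmoid`: where a unit's output saturates near 0 or 1 the derivative shrinks, which is the automatic gain control mentioned above.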

Conclusions

One of the major problems with Sinhala handwritten characters is that strokes do not appear at the same relative location in a letter, due to the different proportions in which characters are written by different writers of the language [6]. Even the same person may not always write the same letter with the same proportions. Normalizing the characters to a standard size does not completely eliminate this effect, although it does help to some extent.

Training is the most important and most time-consuming activity of NN implementations. An efficient system should take the minimum training time possible. To minimize the training time, experiments should be carried out on the values of the parameters to choose a better set of values that reduces it. There are certain factors that affect the training time and the performance of the networks. The following parameters could be adjusted to minimize the training time:

a) Initial values of the weights

b) Number of neurons in the hidden layer

c) Training coefficient

d) Tolerance

e) Grid size used to extract bit patterns from the input image

f) Size of the training data set

g) Constituent characters in the training set

h) Form of the input (i.e. individual handwriting)

i) How representative the training set is

j) How representative the test set is for generalization

Therefore, training is a process which has to be carried out carefully in order to obtain a good recognition rate.

This does not indicate any major problems with the training. The validation and test curves are very similar. If the test curve had increased significantly before the validation curve increased, it is possible that some overfitting might have occurred. According to the performance plot (Figure 3.0), the mean squared error decreases over time during training, validation and testing, which is a good performance measure.

Figure 3.0 - Performance Plot

From the whole exercise of attempting to use NN techniques for the recognition of characters in the Sinhala alphabet, it was discovered that a separate approach could be developed by employing NN techniques together with image processing techniques. The reasons could be the unreliability of the data and of the segments. Several attempts were made to obtain a better output and improve the results. Since the network was not sufficiently accurate, it was reinitialized and trained again. Each time a feed-forward network is initialized, the network parameters are different and may produce different solutions. However, this did not achieve a considerable recognition rate.

Since that was not successful, an attempt was made to improve the results by increasing the number of hidden neurons above 20. Larger numbers of neurons in the hidden layer give the network more flexibility, because the network has more parameters it can optimize. However, if the hidden layer is made too large, the problem may become under-characterized: the network must then optimize more parameters than there are data vectors to constrain them. This effort was not successful either.

The third option was to try a different training function. Bayesian regularization training with trainbr, for example, can sometimes produce better generalization than early stopping. In addition, training functions such as trainscg and trainrp, which are more appropriate for character recognition systems, were used. However, a considerable recognition rate could not be achieved from these either. Since none of the above methods produced a better result, it was decided to use additional data for the training and testing stages. Sometimes the handwriting styles in the training data set may be similar to each other, which could be a possible reason for not obtaining a higher testing rate. Providing additional data to the network is more likely to produce a network that generalizes well to new data. By increasing the number of character samples, better results can be expected; this will help to achieve a more generalized trained network.

Training a neural network to a higher testing rate is also a challenge, because overfitting can occur. In this scenario, the network has been trained to classify only the items in the training set: if it is fitted too closely to the patterns in the training set, it cannot classify items it has never seen. When selecting the training sets, it is better to group the character sets according to the shape of the character (such as round or squared) or the size of the characters. The results would then be more accurate, and the result would be a complete system for character training and testing. More training sets and training programs are required to develop such a system. A user interface could be introduced to make it user-friendly; a menu-driven program with push-button controls and pictures would be attractive. In its current state, the NN can be used only for training eight characters of the Sinhala alphabet with initial guidance. A trainee can continue training while enjoying it as a game.

References

Rajapakse, Jagath (2000): "Neural Networks and Pattern Recognition", Course Notes, Nanyang Technological University, Singapore, December.

Aleksander, Igor and Morton, Helen (1991): "An Introduction to Neural Computing", Chapman & Hall, ISBN 0 412 37780 2.

MATLAB Documentation, Version 7.1.2 (R11), The MathWorks, Inc., Jan. 21, 1999.

Nandasara, S. T., Disanayake, J. B., Samaranayake, V. K., Seneviratne, E. K. and Koannantakool, T. (1990): "Draft Standards for the Use of Sinhala in Computer Technology", Computer & Information Technology Council of Sri Lanka (CINTEC).

Beale, R. and Jackson, T. (1990): "Neural Computing – An Introduction", IOP Publishing Ltd., ISBN 0 85274 262 2.

Disanayake, J. B. (1993): "Let's Read and Write Sinhala", Pioneer Lanka.

Valluru, R. and Hayagriva, R. (1996): "C++ Neural Networks and Fuzzy Logic", BPB Publications.

Gose, Earl, Johnsonbaugh, Richard and Jost, Steve (2003): "Pattern Recognition and Image Analysis", Prentice-Hall, India.

Premaratne, Hemakumar L. and Bigun, J. (2002): "Recognition of Printed Sinhala Characters Using Linear Symmetry", The 5th Asian Conference on Computer Vision.

Manning, Christopher D. and Schütze, Hinrich (2000): "Foundations of Statistical Natural Language Processing", MIT Press.
