Mini-project 1: "Handwriting Recognition" (« Reconnaissance d'écritures manuscrites »)
Vincenzo Bazzucchi, Barbara Jobstmann, Jamila Sam
24.10.2018

Outline
• Administrative
  • Information / Starting point
  • Submission and Groups
  • Submission Server and Tokens
• Project
  • Goal
  • Overview
  • Provided Code
  • Project Details:
    • Representation of Images
    • Part 1: Parser for IDX format ("Parseur pour le format IDX")
    • Part 2: Distances between Images ("Distances entre images")
    • Part 3: Aggregation ("Choix de l'étiquette")
    • Part 4: Evaluation ("Outils d'évaluation")
    • Part 5: Bonus ("Réduction du dataset")

Information about the Project
• Detailed project description and provided material: under "Project 1" → "Description" at https://proginsc.epfl.ch/wwwhiver/

Submission
• Deadline: Nov 12th, 1 pm
• Groups of (at most) 2 students
• Submission: under "Project 1" → "Rendu" at https://proginsc.epfl.ch/wwwhiver/

Submission Content
• Eclipse archive file (zip file < 20 kB) that includes
  • KNN.java (required)
  • KNNTest.java (optional)
  • KMeansClustering.java (optional, Bonus)

Submission Server
• Will open one week before the deadline:
  • From Mon, Nov 5th until Fri, Nov 9th, 4 pm.
  • No submissions over the weekend!
  • Reopens on Mon, Nov 12th from 9 am to 1 pm (strict deadline).
• Each student will need a token (a specific key) to submit.
• Tokens will be sent out by email one week before the submission deadline.
• Submission requires two tokens: one from each group member. If you work alone, you need to use your token twice.
• You can submit a new version using the same token.
• TODO: submit an initial (incomplete) version before the deadline to get familiar with the submission process.

Submission Server – Examples
• Example tokens: p1-11111 and p1-12345
• Example of submission with 2 students
• Example of submission with 1 student

Submission – Cheating
• The project is graded.
• The exchange of ideas between groups or with third parties is permitted and even recommended.
• The exchange of code is strictly forbidden!
• Plagiarism will be checked for and will be considered cheating.
• In case of cheating, you will receive a rating of "NA": Art. 18 "Fraude de l'ordonnance sur la discipline" (fraud, ordinance on discipline), https://www.admin.ch/opc/fr/classified-compilation/20041650/index.html
• Note that at any time, you will need to be able to explain your code.

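Our Approach: K-Nearest Neighbour
• We use a large database (called MNIST: Modified National Institute of Standards and Technology database) with handwritten digits + labels.
• Given a new digit d, we use the k images in the database that look the most similar to d to label d.
[Figure: an unknown digit (?) and the labels of its most similar database images: 4, 4, 4, 4, 7]
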
Main Questions
• How to treat images in the MNIST database?
• How different or similar are two images?
• How to aggregate?
• How to evaluate?
[Figure: neighbour labels 4, 4, 4, 4, 7]

Project Overview
1. Parser for IDX format (images in the MNIST database): "Parseur pour le format IDX"
2. Distances between Images: "Distances entre images"
3. Aggregation: "Choix de l'étiquette"
4. Evaluation: "Outils d'évaluation"
5. Bonus
[Figure: an image as a matrix of pixel values (rows such as 0 0 0 190 200 200 201 …), compared with database images labelled 4, 4, 4, 4, 7, leading to the guess "4?"]

Provided Code (1)
class KNN
• Headers and empty bodies of the methods that need to be implemented. Write your code here!
class KNNTest
• Examples of how to use and test the methods.
• These tests are not exhaustive. Write your own tests!
class KMeansClustering
• Headers and empty bodies of the methods that can be implemented. Write your code here!
class SignatureChecks
• Checks that the signatures of the required methods are correct (to simplify automatic testing).
• Does not check any functionality!
• Do not modify!

Provided Code (2)
class Helpers (Do not modify! Changes will not be taken into account.)
• Read a sequence of bytes (byte array) from a file, or write one to a file:
    public static byte[] readBinaryFile(String filename)
    public static void writeBinaryFile(String filename, byte[] data)
• Display images (and labels) in a rows x columns grid:
    public static void show(String title, byte[][][] tensor, int rows, int columns)
    public static void show(String title, byte[][][] tensor, byte[] labels, int rows, int columns)
    public static void show(String title, byte[][][] tensor, byte[] labels, byte[] trueLabels, int rows, int columns)
  (A window will pop up and the program will be paused until the window is closed.)
• Byte to binary string conversion and signed/unsigned conversion:
    public static String byteToBinaryString(byte b)
    public static byte binaryStringToByte(String bits)
    public static String interpretSigned(String bits)
    public static String interpretUnsigned(String bits)
• Note about type byte: a variant of integer with range [-128, 127] (see handouts).

Examples
show("Example", tensor, labels, 5, 10);
interpretSigned("11111111") will return "-1"
interpretUnsigned("11111111") will return "255"
• The IDX format uses the unsigned interpretation of bytes.
• Java uses the signed interpretation of bytes (more about this later).

Handling Multiple Files (Classes)
• Up to now, all your programs were contained in a single file.
• In this project you will be using several files.
• Given a static method m1() defined in a file A.java and a static method m2() defined in a file B.java:
  • If you want to call m2 in the body of m1, you must use the following syntax: B.m2();
• E.g., in KNN.java: ... Helpers.show("Train Images", trainImages, trainLabels, 20, 35); ...

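A minimal sketch of the B.m2() call syntax. The class names Greeter and Caller and their methods are hypothetical and not part of the provided code; in the project each class would sit in its own .java file, and both appear in one block here only to keep the example self-contained.

```java
// Hypothetical classes for illustration only (not part of the provided project code).
// In a real project, Greeter would be in Greeter.java and Caller in Caller.java.
final class Greeter {
    // A static method belongs to the class itself, so no object is needed to call it.
    static String greet(String name) {
        return "Hello, " + name + "!";
    }
}

final class Caller {
    static String welcome() {
        // Calling a static method of another class: ClassName.methodName(arguments).
        return Greeter.greet("KNN");
    }

    public static void main(String[] args) {
        System.out.println(welcome());
    }
}
```

Inside Greeter itself, greet("KNN") would be enough; the class-name prefix is only required when calling from another class.
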
Refresher: Arrays in Java
Example              Functionality
image.length         Length of an array (height of image = number of rows)
image[4]             Access the element at position 4. Recall: the first element is at position 0; the last element is at position length-1
image[4].length      Length of the element at position 4 (width of row 4)
image[4][1]          Access the element at row 4 and column 1
new float[7]         Create a new 1-dim. float array with 7 entries (0-6)
new byte[4][5]       Create a new 2-dim. byte array with 4 rows (0-3) and 5 columns (0-4)
new byte[3][4][5]    Create a new 3-dim. byte array with 3 tables (e.g., images) with 4 rows (0-3) and 5 columns (0-4)
(Note: a byte is a variant of integer with range [-128, 127].)

Representation of IDX Image
• Matrix of pixels in greyscale
• Each pixel has one of 256 grey values (1 byte / 8 bits)
• Values = [0, 255]: 0 = "00000000", 255 = "11111111"
• 0 is black, 255 is white

Representation of Images
• Digital image = raster of pixels (or picture elements)
• Resolution = number of pixels used to represent an image, e.g., 1024x768 means
  • 1024 pixels from left to right
  • 768 pixels from top to bottom
• In this project: images are represented as two-dimensional arrays (of bytes)
• E.g., byte[][] image = new byte[30][50]; is an image with 30 rows (from top to bottom) and 50 columns (left to right) and a total of 30 x 50 = 1500 pixels.
• Order in the project: first row, then column!

Representation of a Set of Images
byte[][][] tensor
byte[][] firstImage = tensor[0]
byte[] row = tensor[0][15]
byte pixel = tensor[0][15][0]
General indexing: tensor[imageId][rowId][colId]

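The indexing order above can be tried out on a small tensor. This is only a sketch: the class TensorDemo and its helper methods are hypothetical names, and the 3 x 28 x 28 size is chosen for illustration.

```java
// Hypothetical helper class for illustration; not part of the provided code.
final class TensorDemo {
    // Creates `count` images of `height` rows and `width` columns; Java fills them with 0.
    static byte[][][] emptyTensor(int count, int height, int width) {
        return new byte[count][height][width];
    }

    // Total number of pixels, assuming every image has the same rectangular size.
    static int pixelCount(byte[][][] tensor) {
        if (tensor.length == 0 || tensor[0].length == 0) {
            return 0;
        }
        return tensor.length * tensor[0].length * tensor[0][0].length;
    }

    public static void main(String[] args) {
        byte[][][] tensor = emptyTensor(3, 28, 28);
        tensor[0][15][0] = 42;             // image 0, row 15, column 0
        byte[][] firstImage = tensor[0];   // one whole image (28 rows)
        byte[] row = firstImage[15];       // one row of that image (28 pixels)
        byte pixel = row[0];               // one pixel, the same one written above
        System.out.println(pixel);
        System.out.println(pixelCount(tensor));
    }
}
```

Note that firstImage and row are references into the tensor, not copies: writing to row[0] also changes tensor[0][15][0].
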
File Structure of IDX Format
• Each IDX file is an archive: it stores multiple images or labels in one file.
• Bytes must be interpreted as unsigned bytes.
• Each IDX archive begins with a magic number.
• Magic numbers indicate the type of the file:
  • 2049 => labels archive
  • 2051 => images archive

Labels IDX Format
• Sequence of bytes: 0, 0, 8, 1, 0, 0, 0, 100, 4, 8, 4, 5, 3, 0, 8
• First 8 bytes: 2 integers
  • magic number: 0, 0, 8, 1 => 2049
  • number of labels: 0, 0, 0, 100 => 100
• Afterwards, each byte is a label.
• Ten possible values of a label: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

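A possible way to read this layout, as a sketch rather than the required solution. The names IdxLabels, readInt and parseLabels are assumptions (the signatures in the provided KNN class may differ), and returning null for malformed data is a choice made for this example only.

```java
// Hypothetical sketch; method names and the null-on-error convention are assumptions.
final class IdxLabels {
    static final int LABELS_MAGIC = 2049;

    // Combines four bytes, most significant first, into one int.
    // "& 0xFF" reads each (signed) Java byte as an unsigned value in [0, 255].
    static int readInt(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24)
             | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8)
             |  (data[offset + 3] & 0xFF);
    }

    // Returns the labels, or null if the data is not a well-formed labels archive.
    static byte[] parseLabels(byte[] data) {
        if (data == null || data.length < 8 || readInt(data, 0) != LABELS_MAGIC) {
            return null;
        }
        int count = readInt(data, 4);
        if (count < 0 || data.length - 8 < count) {
            return null;
        }
        byte[] labels = new byte[count];
        for (int i = 0; i < count; i++) {
            labels[i] = data[8 + i];   // labels are 0..9, so signed and unsigned agree
        }
        return labels;
    }

    public static void main(String[] args) {
        // Magic number 2049, then 3 labels: 4, 8, 4.
        byte[] data = {0, 0, 8, 1, 0, 0, 0, 3, 4, 8, 4};
        for (byte label : parseLabels(data)) {
            System.out.println(label);
        }
    }
}
```

With the bytes 0, 0, 8, 1 the shifts give 8 * 256 + 1 = 2049, matching the magic number on the slide.
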
Images in IDX Format
• Sequence of bytes (byte[]): 0, 0, 8, 3, 0, 0, 0, 100, 0, 0, 0, 28, 0, 0, 0, 28, 0, 0, 0, 0, 0, 0, 0, 0
• First 16 bytes = 4 integers:
  • magic number: 0, 0, 8, 3 => 2051
  • # images: 0, 0, 0, 100 => 100
  • height of images (# rows): 0, 0, 0, 28 => 28
  • width of images (# columns): 0, 0, 0, 28 => 28
• Then each byte is the grayscale value of a pixel:
  • Unsigned bytes have values [0, 255]: 0 = black, 255 = white
  • Translate to Java signed bytes [-128, 127]: -128 = black, 127 = white
• Details in the handout

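The header layout and the unsigned-to-signed translation could look roughly like this. It is a sketch under assumptions: the names IdxImages, toSignedPixel and parseImages are hypothetical, and rejecting malformed input with null is a choice made here, not something the slides specify.

```java
// Hypothetical sketch; names and error handling are assumptions, not the provided API.
final class IdxImages {
    static final int IMAGES_MAGIC = 2051;

    // Combines four bytes, most significant first, into one int (bytes read as unsigned).
    static int readInt(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24)
             | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8)
             |  (data[offset + 3] & 0xFF);
    }

    // Maps an unsigned pixel (0 = black, 255 = white) to a signed one (-128 = black, 127 = white).
    static byte toSignedPixel(byte raw) {
        return (byte) ((raw & 0xFF) - 128);
    }

    // Returns tensor[imageId][rowId][colId], or null if the data is malformed.
    static byte[][][] parseImages(byte[] data) {
        if (data == null || data.length < 16 || readInt(data, 0) != IMAGES_MAGIC) {
            return null;
        }
        int count = readInt(data, 4);
        int height = readInt(data, 8);
        int width = readInt(data, 12);
        if (count < 0 || height <= 0 || width <= 0) {
            return null;
        }
        long pixelsPerImage = (long) height * width;   // computed as long to avoid int overflow
        if (count > (data.length - 16) / pixelsPerImage) {
            return null;                               // not enough pixel bytes in the file
        }
        byte[][][] tensor = new byte[count][height][width];
        int position = 16;                             // first byte after the 4 header integers
        for (int i = 0; i < count; i++) {
            for (int r = 0; r < height; r++) {
                for (int c = 0; c < width; c++) {
                    tensor[i][r][c] = toSignedPixel(data[position++]);
                }
            }
        }
        return tensor;
    }

    public static void main(String[] args) {
        // One 2 x 2 image with unsigned pixel values 0, 255, 128, 127.
        byte[] data = {0, 0, 8, 3, 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 2,
                       0, (byte) 255, (byte) 128, 127};
        byte[][][] tensor = parseImages(data);
        System.out.println(tensor[0][0][0] + " " + tensor[0][0][1]);
        System.out.println(tensor[0][1][0] + " " + tensor[0][1][1]);
    }
}
```

The pixels are stored image by image and, within an image, row by row, which is why a single running position is enough.
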
Refresher: Numbers in Java
• All integer number types in Java (byte, short, int, long) are signed!
• Decimal (base 10): int decValue = 13;
• Binary (base 2: each digit is 1 bit):
    int binValue1 = 0b00000000000000000000000000001101; // 32 bits
    int binValue2 = 0b1101; // leading zeros are not required
  Starting with Java 7 you can use underscores for readability. Underscores are optional.
    int binValue1 = 0b00000000_00000000_00000000_00001101;
• Bytes in Java:
    byte a = 125; // 01_11_11_01
    byte b = 123; // 01_11_10_11
    byte sumClassic = a + b; // compile error
    byte sum = (byte) (a + b); // conversion to int + truncation
    System.out.println(sum); // prints -8
    // sum = 11_11_10_00 = 248 (unsigned) = -8 (signed)

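The byte arithmetic above can be checked with a few lines. The class ByteDemo and its method names are hypothetical; the sketch only uses plain Java operators.

```java
// Hypothetical demo class for illustration.
final class ByteDemo {
    // a + b is computed as an int; the cast keeps only the lowest 8 bits.
    static byte wrappingSum(byte a, byte b) {
        return (byte) (a + b);
    }

    // Reads the same 8 bits as an unsigned value in [0, 255].
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte sum = wrappingSum((byte) 125, (byte) 123);
        System.out.println(sum);                 // signed view of the 8 bits
        System.out.println(unsignedValue(sum));  // unsigned view of the same 8 bits
    }
}
```

Since Java 8, Byte.toUnsignedInt(b) gives the same result as b & 0xFF.
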
Similarity: Inner Product
• Image = a vector (with height x width dimensions)
• The inner product of two vectors (of length 1) indicates how similar they are w.r.t. the angle between them
[Figure: example inner products (⋅) of unit vectors]

Distance = Inverted Similarity
• We use 1 - (zero-normalized) inner product
[Formula annotations: mean of an image; subtracting the mean; dividing by the standard deviation; inner product]

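The formula itself did not survive in this transcript, so the sketch below is one common reading of the annotations: subtract each image's mean, take the inner product, divide by the product of the two norms of the centred images (equivalent to dividing by the standard deviations), and return 1 minus the result. The names ImageDistance and invertedSimilarity, and the value 2 returned for constant images, are assumptions; the handout is the authority.

```java
// Hypothetical sketch of "1 - zero-normalized inner product"; check the handout for the exact definition.
final class ImageDistance {
    // Mean pixel value of a non-empty image.
    static double mean(byte[][] image) {
        double sum = 0;
        int n = 0;
        for (byte[] row : image) {
            for (byte p : row) {
                sum += p;
                n++;
            }
        }
        return sum / n;
    }

    // Assumes a and b are non-empty and have the same dimensions.
    // Result is in [0, 2]: 0 for identical patterns, 2 for exactly opposite ones.
    static double invertedSimilarity(byte[][] a, byte[][] b) {
        double meanA = mean(a);
        double meanB = mean(b);
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                double da = a[i][j] - meanA;   // subtracting the mean
                double db = b[i][j] - meanB;
                dot += da * db;                // inner product of the centred images
                normA += da * da;
                normB += db * db;
            }
        }
        double denominator = Math.sqrt(normA * normB);
        if (denominator == 0) {
            return 2;   // assumption: a constant image is treated as maximally distant
        }
        return 1 - dot / denominator;
    }

    public static void main(String[] args) {
        byte[][] a = {{-10, 10}};
        byte[][] b = {{10, -10}};
        System.out.println(invertedSimilarity(a, a));
        System.out.println(invertedSimilarity(a, b));
    }
}
```

Because both images are centred and normalized, uniformly brightening an image or scaling its contrast does not change the distance.
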
The Machine Learning Task
• The tools to import the data and to compare images are in place. We can now use a machine learning algorithm to label unknown digits.
• In ML jargon, our task is an example of supervised classification.
  • Classification: the set of labels is finite and known.
  • Supervised: the learning dataset contains labeled images.
• Typical ML approach: the model should be tested on data it has not seen during training:
  • 2 datasets: one for learning, one for testing.
  • We provide training datasets of multiple sizes and 1 testing dataset.

The K-Nearest Neighbors Algorithm
• Algorithm: sort the images by their distance to the image to classify, then find the most common label among the labels of the K images which are the most similar to it.
• Intuition: consider a simpler example: what color do you think the grey circle should have?
  • In this case the possible labels are RED/BLUE, and the points are in a 2D space. This seems obvious.
  • What about this one? We have to decide on the number of neighbors to consider: with K=1 → RED, with K=5 → BLUE.

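The sort-then-vote idea can be sketched on precomputed distances. Everything named here (KnnVote, indicesByDistance, vote, classify) is hypothetical, and breaking ties in favour of the smallest label is an assumption made for this example; the handout may prescribe a different rule.

```java
// Hypothetical sketch of the K-nearest-neighbours vote on precomputed distances.
final class KnnVote {
    static final int LABEL_COUNT = 10;   // digits 0..9

    // Returns the indices of `distances`, ordered from smallest to largest distance.
    // Insertion sort keeps the sketch short; it is slow for large datasets.
    static int[] indicesByDistance(float[] distances) {
        int[] order = new int[distances.length];
        for (int i = 0; i < order.length; i++) {
            int j = i;
            while (j > 0 && distances[order[j - 1]] > distances[i]) {
                order[j] = order[j - 1];   // shift farther neighbours one slot to the right
                j--;
            }
            order[j] = i;
        }
        return order;
    }

    // Counts the labels of the k closest images and returns the most frequent one.
    // Assumption: on a tie, the smallest label wins.
    static byte vote(int[] order, byte[] labels, int k) {
        int[] votes = new int[LABEL_COUNT];
        for (int n = 0; n < k && n < order.length; n++) {
            votes[labels[order[n]]]++;
        }
        int best = 0;
        for (int label = 1; label < LABEL_COUNT; label++) {
            if (votes[label] > votes[best]) {
                best = label;
            }
        }
        return (byte) best;
    }

    static byte classify(float[] distances, byte[] labels, int k) {
        return vote(indicesByDistance(distances), labels, k);
    }

    public static void main(String[] args) {
        // Distance from the unknown image to each training image, and the training labels.
        float[] distances = {0.4f, 0.1f, 0.3f, 0.5f, 0.2f};
        byte[] labels = {4, 7, 4, 4, 4};
        System.out.println(classify(distances, labels, 1));
        System.out.println(classify(distances, labels, 5));
    }
}
```

In the example data the single closest image is labelled 7 while four of the five closest are labelled 4, so the answer depends on K, just like the RED/BLUE case on the slide.
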
On the choice of K
• The value of K determines the shape of the decision boundary:
  • Large K → smoother surface, simpler model
  • Small K → sharper surface, more complex model
• The choice is a tradeoff between flexibility and stability. In practice: choose the value which performs best (see next section for performance evaluation).

Accuracy
• The model predicts (guesses) the labels for a set of images.
• How to measure how well it performed?
  • Use training images + labels to classify n test images.
  • Use test labels to compute the accuracy.
• Accuracy: # correct predictions / n
• Perfect predictions → accuracy = 1

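The accuracy ratio translates almost directly into code. The class and method names below are hypothetical, and throwing an exception on mismatched or empty inputs is a choice made for this sketch.

```java
// Hypothetical sketch of the accuracy computation.
final class Evaluation {
    // Fraction of positions where the predicted label equals the true label.
    static double accuracy(byte[] trueLabels, byte[] predictedLabels) {
        if (trueLabels.length == 0 || trueLabels.length != predictedLabels.length) {
            throw new IllegalArgumentException("label arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < trueLabels.length; i++) {
            if (trueLabels[i] == predictedLabels[i]) {
                correct++;
            }
        }
        // Cast before dividing: int / int would silently truncate to 0 or 1.
        return (double) correct / trueLabels.length;
    }

    public static void main(String[] args) {
        byte[] truth = {4, 4, 7, 1};
        byte[] predicted = {4, 4, 4, 1};
        System.out.println(accuracy(truth, predicted));
    }
}
```
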
Bonus: Dataset reduction
• Classification is slow because there are many training images. Many of them look similar!
• Is it possible to combine similar images into a single one?
• Machine Learning to the rescue!
  • K-Means Clustering: identify clusters (groups) of similar images and replace each group by its centroid.
  • Unsupervised clustering: the algorithm learns the structure of the data without any predefined target.

Bonus: KMeans Clustering
Simple 2D Scenario
• Our brain can quickly identify groups!
• Can we teach a machine to do the same?
• Side note: each MNIST image is a vector with 784 components. Can you even imagine such a vector space? It is very hard for us to comprehend a vector space with dimension > 3.
• Intuition: the images of each digit should form groups in their vector space (all the 1s in one group, the 2s in another one, ...).
• Create some groups and reduce each group to 1 image.
• This image is representative of the group: its centroid.
• Let's visualize the algorithm on a simple 2D vector space.

Bonus: KMeans Algorithm
1. Find initial random centroids. Trick: use points in the dataset as initial centroids (provided code).
2. Assign each point to the cluster whose centroid is the closest. Trick: reuse the distance function defined earlier.
3. Compute the new centroids by averaging the components of the points assigned to the cluster.
4. Go back to 2. until convergence.
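
The four steps can be sketched on the simple 2D scenario. This is not the structure of the provided KMeansClustering class: the names are hypothetical, points are double[2] arrays instead of images, squared Euclidean distance stands in for the project's distance function, and "convergence" is taken to mean that no point changes cluster.

```java
// Hypothetical 2D sketch of K-means; the project version works on images instead.
final class KMeans2D {
    static double squaredDistance(double[] p, double[] q) {
        double dx = p[0] - q[0];
        double dy = p[1] - q[1];
        return dx * dx + dy * dy;
    }

    // Step 2: index of the closest centroid for every point.
    static int[] assign(double[][] points, double[][] centroids) {
        int[] cluster = new int[points.length];
        for (int i = 0; i < points.length; i++) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (squaredDistance(points[i], centroids[c]) < squaredDistance(points[i], centroids[best])) {
                    best = c;
                }
            }
            cluster[i] = best;
        }
        return cluster;
    }

    // Step 3: each new centroid is the component-wise average of its points.
    static double[][] recompute(double[][] points, int[] cluster, double[][] old) {
        double[][] next = new double[old.length][2];
        int[] counts = new int[old.length];
        for (int i = 0; i < points.length; i++) {
            next[cluster[i]][0] += points[i][0];
            next[cluster[i]][1] += points[i][1];
            counts[cluster[i]]++;
        }
        for (int c = 0; c < next.length; c++) {
            if (counts[c] == 0) {
                next[c] = old[c].clone();   // assumption: an empty cluster keeps its old centroid
            } else {
                next[c][0] /= counts[c];
                next[c][1] /= counts[c];
            }
        }
        return next;
    }

    static boolean same(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        return true;
    }

    // Step 1 is left to the caller: `initial` holds points taken from the dataset.
    static double[][] run(double[][] points, double[][] initial, int maxIterations) {
        double[][] centroids = initial;
        int[] previous = null;
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            int[] cluster = assign(points, centroids);
            if (previous != null && same(previous, cluster)) {
                break;                      // step 4: nothing moved, so we have converged
            }
            centroids = recompute(points, cluster, centroids);
            previous = cluster;
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 2}, {10, 10}, {10, 12}};
        double[][] centroids = run(points, new double[][] {points[0], points[2]}, 100);
        for (double[] c : centroids) {
            System.out.println(c[0] + ", " + c[1]);
        }
    }
}
```

The maxIterations bound is a safety net so the loop always terminates, even if the assignments keep changing for a long time.
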