Mini-project 1: "Handwriting recognition" (« Reconnaissance d'écritures manuscrites »)
Vincenzo Bazzucchi, Barbara Jobstmann, Jamila Sam
24.10.2018


Outline
• Administrative
  • Information / starting point
  • Submission and groups
  • Submission server and tokens
• Project
  • Goal
  • Overview
  • Provided code
  • Project details:
    • Representation of images
    • Part 1: Parser for the IDX format ("Parseur pour le format IDX")
    • Part 2: Distances between images ("Distances entre images")
    • Part 3: Aggregation ("Choix de l'étiquette")
    • Part 4: Evaluation ("Outils d'évaluation")
    • Part 5: Bonus ("Réduction du dataset")

Information about the Project
• Detailed project description and provided material: under "Project 1" → "Description" at https://proginsc.epfl.ch/wwwhiver/

Submission
• Deadline: Nov 12th, 1pm
• Groups of (at most) 2 students
• Submission: under "Project 1" → "Rendu" at https://proginsc.epfl.ch/wwwhiver/

Submission Content
• Eclipse archive file (zip file < 20 kB) that includes
  • KNN.java (required)
  • KNNTest.java (optional)
  • KMeansClustering.java (optional, bonus)


Submission Server
• Opens one week before the deadline:
  • From Mon, Nov 5th until Fri, Nov 9th, 4pm.
  • No submissions over the weekend!
  • Reopens on Mon, Nov 12th from 9am to 1pm (strict deadline).
• Each student will need a token (specific key) to submit.
• Tokens will be sent out by email one week before the submission deadline.
• A submission requires two tokens: one from each group member. If you work alone, you need to use your token twice.
• You can submit a new version using the same token.
• TODO: submit an initial (incomplete) version before the deadline to get familiar with the submission process.

Submission Server: Examples
• Example tokens: p1-11111 and p1-12345
• Example of a submission with 2 students (screenshot on the slide)
• Example of a submission with 1 student (screenshot on the slide)

Submission: Cheating
• The project is graded.
• The exchange of ideas between groups or with third parties is permitted and even recommended.
• The exchange of code is strictly forbidden!
• Submissions will be checked for plagiarism, and plagiarism will be considered cheating.
• In case of cheating, you will receive a rating of "NA": Art. 18 ("Fraude") of the "Ordonnance sur la discipline", https://www.admin.ch/opc/fr/classified-compilation/20041650/index.html
• Note that at any time, you will need to be able to explain your code.


Goal
• Our goal is to recognize handwritten numbers, i.e., add a correct label (0-9) to every digit below
(Figure: sample handwritten digits.)

Our Approach: K-Nearest Neighbour
• We use a large database (called MNIST, the Modified National Institute of Standards and Technology database) with handwritten digits + labels
• Given a new digit d, we use the k images in the database that look the most similar to d to label d
(Figure: an unlabeled digit "?" and similar database images labeled 4, 4, 4, 4, 7.)

Main Questions
• How to treat images in the MNIST database?
• How different or similar are two images?
• How to aggregate?
• How to evaluate?
(Figure: similar database images labeled 4, 4, 4, 4, 7.)

Project Overview
1. Parser for the IDX format (images in the MNIST database): "Parseur pour le format IDX"
2. Distances between images: "Distances entre images"
3. Aggregation: "Choix de l'étiquette"
4. Evaluation: "Outils d'évaluation"
5. Bonus
(Figure: an image shown as a matrix of grey values, with rows such as "0 0 0 0 0 0 0 …", "0 0 0 190 200 200 201 …", "0 0 170 204 240 250 250 …"; similar images labeled 4, 4, 4, 4, 7; predicted label "4?".)

Provided Code (1)
class KNN
• Headers and empty bodies of the methods that need to be implemented. Write your code here!
class KNNTest
• Examples of how to use and test the methods
• These tests are not exhaustive. Write your own tests!
class KMeansClustering
• Headers and empty bodies of the methods that can be implemented. Write your code here!
class SignatureChecks
• Checks that the signatures of the required methods are correct (to simplify automatic testing).
• Does not check any functionality!
• Do not modify!

Provided Code (2)
class Helpers (Do not modify! Changes will not be taken into account.)
• Read a sequence of bytes (byte array) from a file, or write one to a file
    public static byte[] readBinaryFile(String filename)
    public static void writeBinaryFile(String filename, byte[] data)
• Display images (and labels) in a rows x columns grid
    public static void show(String title, byte[][][] tensor, int rows, int columns)
    public static void show(String title, byte[][][] tensor, byte[] labels, int rows, int columns)
    public static void show(String title, byte[][][] tensor, byte[] labels, byte[] trueLabels, int rows, int columns)
  (A window will pop up and the program will be paused until the window is closed.)
• Byte to binary string conversion and signed/unsigned conversion
    public static String byteToBinaryString(byte b)
    public static byte binaryStringToByte(String bits)
    public static String interpretSigned(String bits)
    public static String interpretUnsigned(String bits)
• Note about type byte: variant of integer with range [-128, 127] (see handouts)

Examples
• show("Example", tensor, labels, 5, 10);
• interpretSigned("11111111") will return "-1"
• interpretUnsigned("11111111") will return "255"
• The IDX format uses an unsigned interpretation of bytes; Java uses a signed interpretation of bytes (more about this later)
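The two interpretations can also be reproduced in plain Java, without the provided helper class. The following is a minimal sketch; the class name `ByteInterpretation` and its method names are made up for this illustration and are not part of the provided code.

```java
// Illustrative sketch: class and method names are hypothetical,
// they are not part of the provided code.
class ByteInterpretation {

    // Java's native view of a byte: a signed value in [-128, 127].
    static int signedValue(byte b) {
        return b; // widening to int keeps the sign
    }

    // IDX view of the same 8 bits: an unsigned value in [0, 255].
    static int unsignedValue(byte b) {
        return b & 0xFF; // the mask keeps only the low 8 bits
    }

    public static void main(String[] args) {
        byte allOnes = (byte) 0b11111111; // bit pattern "11111111"
        System.out.println(signedValue(allOnes));   // prints -1
        System.out.println(unsignedValue(allOnes)); // prints 255
    }
}
```

The mask `& 0xFF` is the usual Java idiom for reading a byte as unsigned, because the language has no unsigned byte type.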

Handling Multiple Files (Classes)
• Up to now all your programs were contained in a single file.
• In this project you will be using several files.
• Given a static method m1() defined in a file A.java, and a static method m2() defined in a file B.java:
  • If you want to call m2 in the body of m1 you must use the following syntax: B.m2();
  • E.g., in KNN.java: ... Helpers.show("Train Images", trainImages, trainLabels, 20, 35); ...


Refresher: Arrays in Java
(Note: a byte is a variant of integer with range [-128, 127].)

Example              Functionality
image.length         Length of an array (height of image = number of rows)
image[4]             Access the element at position 4. Recall: the first element is at position 0; the last element is at position length-1
image[4].length      Length of the element at position 4 (width of row 4)
image[4][1]          Access the element at row 4 and column 1
new float[7]         Create a new 1-dim. float array with 7 entries (0-6)
new byte[4][5]       Create a new 2-dim. byte array with 4 rows (0-3) and 5 columns (0-4)
new byte[3][4][5]    Create a new 3-dim. byte array with 3 tables (e.g., images) with 4 rows (0-3) and 5 columns (0-4)

Representation of IDX Image
• Matrix of pixels in greyscale
• Each pixel has one of 256 grey values (1 byte / 8 bits)
• Values = [0, 255]: 0 = "00000000", 255 = "11111111"
• 0 is black, 255 is white


Representation of Images
• Digital image = raster of pixels (or picture elements)
• Resolution = number of pixels used to represent an image, e.g., 1024x768 means
  • 1024 pixels from left to right
  • 768 pixels from top to bottom
• In this project: images are represented as two-dimensional arrays (of bytes)
  • E.g., byte[][] image = new byte[30][50]; is an image with 30 rows (from top to bottom) and 50 columns (left to right) and a total of 30x50 = 1500 pixels.
  • Order in the project: first row, then column!

Representation of a Set of Images
• byte[][][] tensor
• byte[][] firstImage = tensor[0]
• byte[] row = tensor[0][15]
• byte pixel = tensor[0][15][0] (zoomed)
• General form: tensor[imageId][rowId][colId]
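As a small illustration of this indexing order, the sketch below builds a tiny tensor and reads it back. The class name `TensorDemo`, the method `pixelCount`, and the 2 x 3 x 4 dimensions are assumptions chosen for the example only.

```java
// Illustrative sketch: the class name, method name and the tiny
// 2 x 3 x 4 tensor are made up for this example.
class TensorDemo {

    // Counts the pixels stored in a set of images by walking
    // image by image, then row by row.
    static int pixelCount(byte[][][] tensor) {
        int count = 0;
        for (byte[][] image : tensor) {
            for (byte[] row : image) {
                count += row.length;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[][][] tensor = new byte[2][3][4]; // 2 images, 3 rows, 4 columns
        tensor[1][2][0] = 42;                  // image 1, row 2, column 0

        byte[][] secondImage = tensor[1];      // one image (2-dim. array)
        byte[] lastRow = secondImage[2];       // one row (1-dim. array)
        System.out.println(lastRow[0]);         // prints 42
        System.out.println(pixelCount(tensor)); // prints 24
    }
}
```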

Part 1: Parser for the IDX format

File Structure of IDX Format
• Each IDX file is an archive: it stores multiple images or labels in one file.
• Bytes must be interpreted as unsigned bytes
• Each IDX archive begins with a magic number
• Magic numbers indicate the type of the file:
  • 2049 => labels archive
  • 2051 => images archive

Labels in IDX Format
• Sequence of bytes: 0, 0, 8, 1, 0, 0, 0, 100, 4, 8, 4, 5, 3, 0, 8
• First 8 bytes: 2 integers
  • magic number: 0, 0, 8, 1 => 2049
  • number of labels: 0, 0, 0, 100 => 100
• Afterwards each byte is a label
• Ten possible values of a label: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
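The header layout described above can be decoded along the following lines. This is only a sketch under assumptions: the names `IdxLabelsDemo`, `toInt` and `parseLabels` are hypothetical (the method headers to implement are the ones given in the provided KNN class), and returning null on invalid input is a choice made for this example.

```java
import java.util.Arrays;

// Illustrative sketch: names and error handling are assumptions,
// not the signatures required by the project.
class IdxLabelsDemo {

    // Assembles four bytes (most significant first) into one int,
    // reading each byte as unsigned.
    static int toInt(byte b3, byte b2, byte b1, byte b0) {
        return ((b3 & 0xFF) << 24) | ((b2 & 0xFF) << 16)
             | ((b1 & 0xFF) << 8) | (b0 & 0xFF);
    }

    // Returns the labels stored after the 8-byte header, or null if
    // the data is too short or the magic number is not 2049.
    static byte[] parseLabels(byte[] data) {
        if (data == null || data.length < 8) {
            return null;
        }
        int magic = toInt(data[0], data[1], data[2], data[3]);
        int count = toInt(data[4], data[5], data[6], data[7]);
        if (magic != 2049 || count < 0 || data.length - 8 < count) {
            return null;
        }
        return Arrays.copyOfRange(data, 8, 8 + count);
    }

    public static void main(String[] args) {
        // Header: magic number 2049, then 3 labels: 4, 8, 5.
        byte[] data = {0, 0, 8, 1, 0, 0, 0, 3, 4, 8, 5};
        System.out.println(Arrays.toString(parseLabels(data))); // prints [4, 8, 5]
    }
}
```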

Images in IDX Format
• Sequence of bytes (byte[]): 0, 0, 8, 3, 0, 0, 0, 100, 0, 0, 0, 28, 0, 0, 0, 28, 0, 0, 0, 0, 0, 0, 0, 0
• First 16 bytes = 4 integers:
  • magic number: 0, 0, 8, 3 => 2051
  • # images: 0, 0, 0, 100 => 100
  • height of images (# rows): 0, 0, 0, 28 => 28
  • width of images (# columns): 0, 0, 0, 28 => 28
• Then each byte is the grayscale value of a pixel:
  • Unsigned bytes have values in [0, 255]: 0 = black, 255 = white
  • Translate to Java signed bytes [-128, 127]: -128 = black, 127 = white
• Details in the handout
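A possible way to turn such a byte sequence into a tensor is sketched below. It assumes that the translation to signed bytes is done by subtracting 128 from the unsigned value, which maps 0 to -128 and 255 to 127 as stated above; the handout remains the reference. The names `IdxImagesDemo`, `toInt` and `parseImages` are hypothetical, and input validation is kept minimal.

```java
// Illustrative sketch: names are hypothetical and validation is minimal.
class IdxImagesDemo {

    // Reads 4 bytes starting at offset as one big-endian int,
    // treating each byte as unsigned.
    static int toInt(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24) | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8) | (data[offset + 3] & 0xFF);
    }

    // Returns tensor[imageId][rowId][colId], or null on invalid input.
    static byte[][][] parseImages(byte[] data) {
        if (data == null || data.length < 16 || toInt(data, 0) != 2051) {
            return null;
        }
        int count = toInt(data, 4);
        int height = toInt(data, 8);
        int width = toInt(data, 12);
        if (count < 0 || height < 0 || width < 0
                || data.length - 16 < (long) count * height * width) {
            return null;
        }
        byte[][][] tensor = new byte[count][height][width];
        int position = 16; // first pixel comes right after the header
        for (int i = 0; i < count; i++) {
            for (int r = 0; r < height; r++) {
                for (int c = 0; c < width; c++) {
                    int unsigned = data[position++] & 0xFF;    // 0..255
                    tensor[i][r][c] = (byte) (unsigned - 128); // -128..127
                }
            }
        }
        return tensor;
    }

    public static void main(String[] args) {
        // 1 image with 1 row and 2 columns: one black and one white pixel.
        byte[] data = {0, 0, 8, 3, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 2, 0, (byte) 255};
        byte[][][] tensor = parseImages(data);
        System.out.println(tensor[0][0][0]); // prints -128 (black)
        System.out.println(tensor[0][0][1]); // prints 127 (white)
    }
}
```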

Refresher: Numbers in Java
• All numbers in Java are signed!
• Decimal (base 10):
    int decValue = 13;
• Binary (base 2, 1 bit per digit):
    int binValue1 = 0b00000000000000000000000000001101; // 32 bits
    int binValue2 = 0b1101; // leading zeros are not required
  Starting with Java 7 you can use underscores for readability. Underscores are optional.
    int binValue1 = 0b00000000_00000000_00000000_00001101;
• Bytes in Java
    byte a = 125; // 01_11_11_01
    byte b = 123; // 01_11_10_11
    byte sumClassic = a + b; // compile error
    byte sum = (byte) (a + b); // conversion to int + truncated
    System.out.println(sum); // prints -8
    // sum = 11_11_10_00 = 248 (unsigned) = -8 (signed)

Part 2: Distances between Images

Distance between Two Images
• Sum of the Euclidean distances between corresponding pixels
• In this project, we will use: (formula shown on the slide)
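The text above does not spell out the formula. Assuming the distance meant here is the squared Euclidean distance, i.e. the sum of squared differences between corresponding pixels, a minimal sketch could look as follows. The class name `EuclideanDemo` and the method name are illustrative, and both images are assumed to have the same dimensions.

```java
// Illustrative sketch, assuming the squared Euclidean distance is meant.
// Names are hypothetical; both images must have the same dimensions.
class EuclideanDemo {

    // Sums the squared differences of corresponding pixels.
    static float squaredEuclideanDistance(byte[][] a, byte[][] b) {
        float sum = 0f;
        for (int r = 0; r < a.length; r++) {
            for (int c = 0; c < a[r].length; c++) {
                float diff = a[r][c] - b[r][c]; // computed as int, no byte overflow
                sum += diff * diff;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[][] a = {{0, 3}, {0, 0}};
        byte[][] b = {{4, 0}, {0, 0}};
        System.out.println(squaredEuclideanDistance(a, b)); // prints 25.0
    }
}
```

Identical images get distance 0, and the value grows as the pixels differ more.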

Similarity: Inner Product
• Image = a vector (with height x width dimensions)
• The inner product of two vectors (of length 1) indicates how similar they are w.r.t. the angle
(Figure: example vectors and their inner products.)
• Assume images with only two pixels
(Figure: example pairs of two-pixel images, with inner products 1, 0.95 and 0.30.)


Distance = Inverted Similarity
• We use 1 - (zero-normalized) inner product
• The formula on the slide is annotated with its components:
  • Mean of an image
  • Subtracting the mean
  • Dividing by the standard deviation
  • Inner product
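Read together, these components suggest the following computation: subtract each image's mean from its pixels, take the inner product of the two results, divide by the product of their norms, and subtract the quotient from 1. The sketch below follows that reading; the names are hypothetical, and returning 2 when an image is constant (zero denominator) is an assumption made for this example, so check the handout for the expected behaviour.

```java
// Illustrative sketch of "1 - zero-normalized inner product".
// Names and the zero-denominator convention are assumptions.
class InvertedSimilarityDemo {

    // Mean grey value of a non-empty image.
    static double mean(byte[][] image) {
        double sum = 0;
        int n = 0;
        for (byte[] row : image) {
            for (byte pixel : row) {
                sum += pixel;
                n++;
            }
        }
        return sum / n;
    }

    static double invertedSimilarity(byte[][] a, byte[][] b) {
        double meanA = mean(a);
        double meanB = mean(b);
        double product = 0, normA = 0, normB = 0;
        for (int r = 0; r < a.length; r++) {
            for (int c = 0; c < a[r].length; c++) {
                double da = a[r][c] - meanA; // subtracting the mean
                double db = b[r][c] - meanB;
                product += da * db;          // inner product
                normA += da * da;
                normB += db * db;
            }
        }
        double denominator = Math.sqrt(normA * normB);
        if (denominator == 0) {
            return 2.0; // assumption: constant images count as maximally distant
        }
        return 1.0 - product / denominator;
    }

    public static void main(String[] args) {
        byte[][] a = {{1, 2}, {3, 4}};
        byte[][] scaled = {{10, 20}, {30, 40}}; // same pattern, more contrast
        byte[][] reversed = {{4, 3}, {2, 1}};   // opposite pattern
        System.out.println(invertedSimilarity(a, scaled));   // approximately 0
        System.out.println(invertedSimilarity(a, reversed)); // approximately 2
    }
}
```

The result lies between 0 (same pattern) and 2 (opposite pattern), and it does not change when an image is made uniformly brighter or given more contrast.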

Part 3: Aggregation ("Choix de l'étiquette")

The Machine Learning Task
• The tools to import the data and to compare images are in place. We can now use a machine learning algorithm to label unknown digits.
• In ML jargon our task is an example of supervised classification.
  • Classification: the set of labels is finite and known.
  • Supervised: the learning dataset contains labeled images.
• Typical ML approach: the model should be tested on data it has not seen during training:
  • 2 datasets: one for learning, one for testing.
  • We provide training datasets of multiple sizes and 1 testing dataset

The K-Nearest Neighbors Algorithm
• Algorithm: sort the training images by their distance to the image to classify, then find the most common label among the labels of the K most similar images
• Intuition: consider a simpler example: what color do you think the grey circle should have?
  • In this case the possible labels are RED/BLUE, and the points are in a 2D space. This seems obvious.
  • What about this one? We have to decide on the number of neighbors to consider: with K=1 → RED, with K=5 → BLUE
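The sort-then-vote idea can be sketched as follows, assuming the distances from the image to classify to every training image have already been computed. All names are hypothetical, labels are assumed to be digits 0-9, and breaking ties in favour of the smaller label is an arbitrary choice made for this example.

```java
import java.util.Arrays;

// Illustrative sketch: names and the tie-breaking rule are assumptions.
class KnnDemo {

    // Returns the indices of the training images, closest first.
    static Integer[] sortedIndices(float[] distances) {
        Integer[] indices = new Integer[distances.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        Arrays.sort(indices, (x, y) -> Float.compare(distances[x], distances[y]));
        return indices;
    }

    // Majority vote among the labels of the k closest images.
    static byte vote(Integer[] sorted, byte[] labels, int k) {
        int[] counts = new int[10]; // one counter per digit 0-9
        for (int i = 0; i < k && i < sorted.length; i++) {
            counts[labels[sorted[i]]]++;
        }
        int best = 0;
        for (int label = 1; label < counts.length; label++) {
            if (counts[label] > counts[best]) { // ties keep the smaller label
                best = label;
            }
        }
        return (byte) best;
    }

    static byte classify(float[] distances, byte[] labels, int k) {
        return vote(sortedIndices(distances), labels, k);
    }

    public static void main(String[] args) {
        float[] distances = {0.9f, 0.1f, 0.4f, 0.2f, 0.8f};
        byte[] labels = {7, 4, 4, 7, 7};
        System.out.println(classify(distances, labels, 1)); // prints 4
        System.out.println(classify(distances, labels, 3)); // prints 4
        System.out.println(classify(distances, labels, 5)); // prints 7
    }
}
```

As in the RED/BLUE example, the predicted label in this toy data changes with K: 4 for K=1 and K=3, but 7 for K=5.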

On the Choice of K
• The value of K determines the shape of the decision boundary:
  • Large K → smoother surface, simpler model
  • Small K → sharper surface, more complex model
• The choice is a tradeoff between flexibility and stability. In practice: choose the value which performs best (see next section for performance evaluation)

Part 4: Evaluation ("Outils d'évaluation")

Accuracy
• The model predicts (guesses) the labels for a set of images.
• How to measure how well it performed?
  • Use training images + labels to classify n test images
  • Use test labels to compute the accuracy.
• Accuracy: # correct predictions / n
• Perfect predictions → accuracy = 1
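The accuracy formula translates directly into code. The following is a minimal sketch with hypothetical names; it assumes both arrays have the same, non-zero length n.

```java
// Illustrative sketch: names are hypothetical; both arrays are assumed
// to have the same, non-zero length.
class AccuracyDemo {

    // Fraction of predictions that match the true labels.
    static double accuracy(byte[] predictedLabels, byte[] trueLabels) {
        int correct = 0;
        for (int i = 0; i < trueLabels.length; i++) {
            if (predictedLabels[i] == trueLabels[i]) {
                correct++;
            }
        }
        return (double) correct / trueLabels.length; // cast avoids integer division
    }

    public static void main(String[] args) {
        byte[] predicted = {4, 4, 7, 1};
        byte[] truth = {4, 9, 7, 1};
        System.out.println(accuracy(predicted, truth)); // prints 0.75
    }
}
```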

Part 5: Bonus ("Réduction du dataset")

Bonus: Dataset Reduction
• Classification is slow because there are many training images. Many of them look similar!
• Is it possible to combine similar images into a single one?
• Machine Learning to the rescue!
  • K-Means Clustering: identify clusters (groups) of similar images and replace each group by its centroid.
  • Unsupervised clustering: the algorithm learns the structure of the data without any predefined target.

Bonus: KMeans Clustering
Simple 2D Scenario
• Our brain can quickly identify groups!
• Can we teach a machine to do the same?
• Side note: each MNIST image is a vector with 784 components. Can you even imagine such a vector space? It is very hard for us to comprehend a vector space with dimension > 3
• Intuition: the images of each digit should form groups in their vector space (all the 1s in a group, the 2s in another one, ...)
• Create some groups and reduce each group to 1 image.
• This image is representative of the group: its centroid.
• Let's visualize the algorithm on a simple 2D vector space

Bonus: KMeans Algorithm
1. Find initial random centroids. Trick: use points in the dataset as initial centroids (provided code)
2. Assign each point to the cluster whose centroid is the closest. Trick: reuse the distance function defined earlier
3. Compute the new centroids by averaging the components of the points assigned to the cluster
4. Go back to 2. until convergence
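The four steps can be tried out on a simple 2D vector space before moving to images. The sketch below is only an illustration under assumptions: all names are hypothetical, points are `double[2]` arrays rather than images, the distance is a plain squared Euclidean distance instead of the project's distance function, and "convergence" is taken to mean that no point changes cluster.

```java
import java.util.Arrays;

// Illustrative 2D sketch of the four steps; names are hypothetical.
class KMeansDemo {

    static double squaredDistance(double[] p, double[] q) {
        double dx = p[0] - q[0];
        double dy = p[1] - q[1];
        return dx * dx + dy * dy;
    }

    // Step 2: assign each point to the cluster with the closest centroid.
    static int[] assign(double[][] points, double[][] centroids) {
        int[] assignment = new int[points.length];
        for (int i = 0; i < points.length; i++) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (squaredDistance(points[i], centroids[c])
                        < squaredDistance(points[i], centroids[best])) {
                    best = c;
                }
            }
            assignment[i] = best;
        }
        return assignment;
    }

    // Step 3: each new centroid is the average of the points in its cluster.
    static double[][] recompute(double[][] points, int[] assignment, double[][] old) {
        double[][] centroids = new double[old.length][2];
        int[] sizes = new int[old.length];
        for (int i = 0; i < points.length; i++) {
            centroids[assignment[i]][0] += points[i][0];
            centroids[assignment[i]][1] += points[i][1];
            sizes[assignment[i]]++;
        }
        for (int c = 0; c < centroids.length; c++) {
            if (sizes[c] == 0) {
                centroids[c] = old[c].clone(); // empty cluster keeps its centroid
            } else {
                centroids[c][0] /= sizes[c];
                centroids[c][1] /= sizes[c];
            }
        }
        return centroids;
    }

    // Step 4: repeat steps 2 and 3 until the assignment stops changing.
    static int[] cluster(double[][] points, double[][] initial, int maxIterations) {
        double[][] centroids = initial;
        int[] assignment = assign(points, centroids);
        for (int it = 0; it < maxIterations; it++) {
            centroids = recompute(points, assignment, centroids);
            int[] next = assign(points, centroids);
            if (Arrays.equals(next, assignment)) {
                break; // converged
            }
            assignment = next;
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}};
        double[][] initial = {points[0], points[1]}; // step 1: dataset points as centroids
        System.out.println(Arrays.toString(cluster(points, initial, 100)));
        // prints [0, 0, 0, 1, 1, 1]
    }
}
```

Even though both initial centroids lie in the same group of points, the iteration separates the two groups after two rounds.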

Remarks
• Do not be scared by the length of the handout!
  • It does not only list the tasks that you have to complete
  • But also some theory and explanations
    • How to do X?
    • Why do it this way?
• Use the Moodle forum for questions and discussions
  • DO NOT POST CODE
• TAs will help you if you are stuck or if something is not clear

Good luck with the project!