Going Deeper with Convolutions
Source: cis.csuohio.edu/~sschung/cis601/ (CIS 601 Presentation 2-2)
TRANSCRIPT
Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich
PRESENTED BY: KAYLEE YUHAS AND KYLE COFFEY
About Neural Networks
• Neural networks can be used in many different capacities, often overlapping with broader AI tasks:
• Object classification, such as with images
- Given images of 2 different wolves, can identify the subspecies
• Speech recognition
• Through interactive mediums such as video games, identifying how people respond to different stimuli in various environments and situations
• This work requires a hefty amount of resources to run smoothly
• Traditional neural network architecture has remained mostly constant
How to improve on traditional neural network setups?
• Increasing the performance of a neural network by increasing its size, while seemingly logically sound, has severe drawbacks:
• Increased number of parameters makes the network prone to overfitting
• Larger network size requires more computational resources
[Figure: illustration of overfitting; the green line shows an overfitted model.]
How to improve on traditional neural network setups?
• How to improve performance without more hardware?
• By moving to sparsely connected architectures while still exploiting fast computations on dense matrices
• This sparse architecture's name is Inception, based on the 2010 film of the same name
• Introducing sparsity into the architecture by replacing fully connected layers with sparse ones, even inside convolutions, is key.
• Mimics biological systems
Inception Architecture: Naïve Version
• The paper's authors determined this was the optimal spatial spread, "the decision based more on convenience than necessity"
• This can be repeated spatially for scaling
• This alignment also avoids patch-alignment issues
• However, 5x5 modules quickly become prohibitively expensive on convolutional layers with a large number of filters
In short: Inputs come from the previous layer, and go through various convolutional layers. The pooling layer serves to control overfitting by reducing spatial size.
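To make the cost concern concrete, here is a minimal Python sketch (not the authors' code; the filter counts are hypothetical) of the naive module's channel bookkeeping. Every branch sees the full input depth, the branch outputs are concatenated along the channel axis, and because the pooling branch passes the input depth through unchanged, the concatenated output only ever grows:

```python
# Naive Inception module: channel and parameter bookkeeping (illustrative).

def naive_inception_output_channels(in_ch, n1x1, n3x3, n5x5):
    # The 1x1, 3x3, and 5x5 branches each emit their own filter count;
    # the max-pooling branch passes the input depth through unchanged.
    return n1x1 + n3x3 + n5x5 + in_ch

def naive_inception_params(in_ch, n1x1, n3x3, n5x5):
    # Weight count per conv branch: kernel_h * kernel_w * in_ch * out_ch
    # (biases ignored). The 5x5 term dominates as in_ch grows.
    return (1 * 1 * in_ch * n1x1
            + 3 * 3 * in_ch * n3x3
            + 5 * 5 * in_ch * n5x5)

# Stacking modules: the pooling branch makes depth grow monotonically,
# so each later module is strictly more expensive than the one before.
ch = 192
for _ in range(3):
    ch = naive_inception_output_channels(ch, 64, 128, 32)
    print(ch)  # 416, 640, 864
```

This monotone depth growth is exactly why the naive version "quickly becomes prohibitively expensive", motivating the dimensionality-reduction variant on the next slide.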
Inception Architecture: Dimensionality Reduction
• By computing reductions with 1x1 convolutions before reaching the more expensive 3x3 and 5x5 convolutions, the necessary processing power is tremendously reduced
• The use of dimensionality reductions allows for significant increases in the number of units at each stage without a sharp increase in necessary computational resources at later, more complex stages
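As an illustration (with hypothetical channel counts, not figures from the paper), the sketch below compares the weight count of a 5x5 branch applied directly to a 256-channel input against the same branch placed behind a 1x1 reduction layer:

```python
# Cost of a 5x5 branch with and without a 1x1 "reduce" layer (illustrative).

def conv_params(k, in_ch, out_ch):
    # Weights of a k x k convolution, biases ignored.
    return k * k * in_ch * out_ch

in_ch, n5x5, reduce_ch = 256, 64, 32  # hypothetical filter counts

# Naive: the 5x5 convolution reads the full 256-channel input.
naive = conv_params(5, in_ch, n5x5)                               # 409,600

# Reduced: a cheap 1x1 conv squeezes 256 channels down to 32 first.
reduced = conv_params(1, in_ch, reduce_ch) + conv_params(5, reduce_ch, n5x5)

print(naive, reduced)  # 409600 59392
```

With these numbers the reduced branch needs roughly a seventh of the weights, which is the "tremendous reduction" the slide refers to.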
GoogLeNet
• An iteration of Inception the paper's authors used as their submission to the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC).
• The network was designed to be so efficient it could run with a low memory footprint on individual devices that have limited computational resources.
• If CNNs are to gain a foothold in private industry, having low overhead costs is especially important.
Here is a small sample of the architecture of GoogLeNet, where you can note the usage of dimensionality reduction as opposed to the naïve version.
GoogLeNet
• Only a sample is shown, because the entirety of the architecture is far too large to fit legibly on one slide.
GoogLeNet
• The GoogLeNet incarnation of the Inception architecture.
• "#3x3 reduce" and "#5x5 reduce" stand for the number of 1x1 filters in the reduction layer used before the 3x3 and 5x5 convolutions.
• While there are many layers to this, the main goal is to have the final "softmax" layers give "scores" to the image classes.
• e.g. dogs, skin diseases, etc.
• The loss function determines how good or bad each score is.
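As a sketch of that scoring step (a generic softmax plus cross-entropy loss in plain Python, not GoogLeNet's actual training code), the final layer's raw scores are turned into class probabilities, and the loss is small exactly when the true class receives high probability:

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_idx):
    # Low loss when the network puts high probability on the true class,
    # large loss when it puts probability elsewhere.
    return -math.log(probs[true_idx])

scores = [2.0, 0.5, -1.0]   # raw scores for, say, dog / wolf / cat
probs = softmax(scores)     # probabilities that sum to 1
print(probs, cross_entropy(probs, 0))
```

If the true class is the one with the highest score, the loss is small; picking a low-scoring class as the truth makes the same formula return a much larger penalty.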
GoogLeNet
• GoogLeNet was 22 layers deep, when counting only layers with parameters.
• 27 if you count pooling
• About 100 total layers
• Could be trained to convergence with a few high-end GPUs in about a week
• The main limitation would be memory usage
• It was trained to classify images into one of 1000 leaf-node image categories in the ImageNet hierarchy
• ImageNet is a large visual database designed specifically for visual recognition software research
• GoogLeNet performed quite well in this contest
GoogLeNet
• Left: GoogLeNet's performance at the 2014 ILSVRC: it came in first place.
• Right: A breakdown of its classification performance.
• Using multiple different CNNs and averaging their scores to get a prediction class for an image results in better scores than just 1 CNN. See: the instance with 7 CNNs.
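A minimal sketch of that averaging scheme (the probabilities below are made up for illustration, not actual contest numbers): each network's softmax output for one image is averaged per class, and the arg-max of the mean becomes the ensemble's prediction.

```python
# Ensemble prediction by averaging per-model class probabilities (illustrative).

def ensemble_predict(per_model_probs):
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    # Average each class's probability across all models.
    avg = [sum(m[c] for m in per_model_probs) / n_models
           for c in range(n_classes)]
    # Predict the class with the highest averaged probability.
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three hypothetical models: model 0 alone would pick class 0,
# but the average favors class 1.
probs = [
    [0.50, 0.45, 0.05],
    [0.10, 0.80, 0.10],
    [0.30, 0.60, 0.10],
]
pred, avg = ensemble_predict(probs)
print(pred, avg)
```

Averaging smooths out the idiosyncratic mistakes of individual models, which is why the 7-CNN ensemble scores better than any single network.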
Summary
• Convolutional neural networks are still top performers among neural network approaches.
• The Inception framework allows for large scaling while minimizing processing bottlenecks, as well as "choke points" where, past a certain scale, the network becomes inefficient.
• It also runs well on machines without powerful hardware.
• Reducing with 1x1 convolutions before passing data to 3x3 and 5x5 convolutions has proven efficient and effective.
• Further study: is mimicking the actual biological conditions universally the best case for neural network architecture?

References
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2015.7298594
Chabacano. (2008, February). Overfitting. Retrieved April 08, 2017, from https://en.wikipedia.org/wiki/Overfitting