Going Deeper with Convolutions
Source: cis.csuohio.edu/~sschung/cis601/ (CIS 601 Presentation 2-2)
TRANSCRIPT
Going Deeper with Convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich
PRESENTED BY: KAYLEE YUHAS AND KYLE COFFEY
About Neural Networks
• Neural networks can be used in many different capacities, often overlapping with broader AI tasks:
• Object classification, such as with images
- Given images of 2 different wolves, can identify the subspecies
• Speech recognition
• Through interactive mediums such as video games, identifying how people respond to different stimuli in various environments and situations
• This work requires a hefty amount of resources to run smoothly
• Traditional neural network architecture has remained mostly constant
How to improve on traditional neural network setups?
• Increasing the performance of a neural network by increasing its size, while seemingly logically sound, has severe drawbacks:
• Increased number of parameters makes the network prone to overfitting
• Larger network size requires more computational resources
[Figure: illustration of overfitting; the green line shows an overfitted model.]
How to improve on traditional neural network setups?
• How to improve performance without more hardware?
• By moving to sparsely connected architectures while still exploiting fast computations on dense matrices
• This sparse architecture's name is Inception, based on the 2010 film of the same name
• Introducing sparsity into the architecture by replacing fully connected layers with sparse ones, even inside convolutions, is key.
• Mimics biological systems
Inception Architecture: Naïve Version
• The paper's authors determined this was the optimal spatial spread, "the decision based more on convenience than necessity"
• This can be repeated spatially for scaling
• This alignment also avoids patch-alignment issues
• However, 5x5 modules quickly become prohibitively expensive on convolutional layers with a large number of filters
In short: Inputs come from the previous layer, and go through various convolutional layers. The pooling layer serves to control overfitting by reducing spatial size.
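To make the cost concern concrete, here is a minimal Python sketch (not the authors' code; the filter counts are hypothetical) of the naive module's channel bookkeeping. Every branch sees the full input depth, the branch outputs are concatenated along the channel axis, and because the pooling branch passes the input depth through unchanged, the concatenated output only ever grows:

```python
# Naive Inception module: channel and parameter bookkeeping (illustrative).

def naive_inception_output_channels(in_ch, n1x1, n3x3, n5x5):
    # The 1x1, 3x3, and 5x5 branches each emit their own filter count;
    # the max-pooling branch passes the input depth through unchanged.
    return n1x1 + n3x3 + n5x5 + in_ch

def naive_inception_params(in_ch, n1x1, n3x3, n5x5):
    # Weight count per conv branch: kernel_h * kernel_w * in_ch * out_ch
    # (biases ignored). The 5x5 term dominates as in_ch grows.
    return (1 * 1 * in_ch * n1x1
            + 3 * 3 * in_ch * n3x3
            + 5 * 5 * in_ch * n5x5)

# Stacking modules: the pooling branch makes depth grow monotonically,
# so each later module is strictly more expensive than the one before.
ch = 192
for _ in range(3):
    ch = naive_inception_output_channels(ch, 64, 128, 32)
    print(ch)  # 416, 640, 864
```

This monotone depth growth is exactly why the naive version "quickly becomes prohibitively expensive", motivating the dimensionality-reduction variant on the next slide.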
Inception Architecture: Dimensionality Reduction
• By computing reductions with 1x1 convolutions before reaching the more expensive 3x3 and 5x5 convolutions, the necessary processing power is tremendously reduced
• The use of dimensionality reductions allows for significant increases in the number of units at each stage without a sharp increase in necessary computational resources at later, more complex stages
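As an illustration (with hypothetical channel counts, not figures from the paper), the sketch below compares the weight count of a 5x5 branch applied directly to a 256-channel input against the same branch placed behind a 1x1 reduction layer:

```python
# Cost of a 5x5 branch with and without a 1x1 "reduce" layer (illustrative).

def conv_params(k, in_ch, out_ch):
    # Weights of a k x k convolution, biases ignored.
    return k * k * in_ch * out_ch

in_ch, n5x5, reduce_ch = 256, 64, 32  # hypothetical filter counts

# Naive: the 5x5 convolution reads the full 256-channel input.
naive = conv_params(5, in_ch, n5x5)                               # 409,600

# Reduced: a cheap 1x1 conv squeezes 256 channels down to 32 first.
reduced = conv_params(1, in_ch, reduce_ch) + conv_params(5, reduce_ch, n5x5)

print(naive, reduced)  # 409600 59392
```

With these numbers the reduced branch needs roughly a seventh of the weights, which is the "tremendous reduction" the slide refers to.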
GoogLeNet
• An iteration of Inception the paper's authors used as their submission to the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC).
• The network was designed to be so efficient it could run with a low memory footprint on individual devices that have limited computational resources.
• If CNNs are to gain a foothold in private industry, having low overhead costs is especially important.
Here is a small sample of the architecture of GoogLeNet, where you can note the usage of dimensionality reduction as opposed to the naïve version.
GoogLeNet
• Only a sample is shown, because the entirety of the architecture is far too large to fit legibly on one slide.
GoogLeNet
• The GoogLeNet incarnation of the Inception architecture.
• "#3x3 reduce" and "#5x5 reduce" stand for the number of 1x1 filters in the reduction layer used before the 3x3 and 5x5 convolutions.
• While there are many layers to this, the main goal is to have the final "softmax" layers give "scores" to the image classes.
• e.g. dogs, skin diseases, etc.
• The loss function determines how good or bad each score is.
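As a sketch of that scoring step (a generic softmax plus cross-entropy loss in plain Python, not GoogLeNet's actual training code), the final layer's raw scores are turned into class probabilities, and the loss is small exactly when the true class receives high probability:

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_idx):
    # Low loss when the network puts high probability on the true class,
    # large loss when it puts probability elsewhere.
    return -math.log(probs[true_idx])

scores = [2.0, 0.5, -1.0]   # raw scores for, say, dog / wolf / cat
probs = softmax(scores)     # probabilities that sum to 1
print(probs, cross_entropy(probs, 0))
```

If the true class is the one with the highest score, the loss is small; picking a low-scoring class as the truth makes the same formula return a much larger penalty.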
GoogLeNet
• GoogLeNet was 22 layers deep, when counting only layers with parameters.
• 27 if you count pooling
• About 100 total layers
• Could be trained to convergence with a few high-end GPUs in about a week
• The main limitation would be memory usage
• It was trained to classify images into one of 1000 leaf-node image categories in the ImageNet hierarchy
• ImageNet is a large visual database designed specifically for visual recognition software research
• GoogLeNet performed quite well in this contest
GoogLeNet
• Left: GoogLeNet's performance at the 2014 ILSVRC: it came in first place.
• Right: A breakdown of its classification performance.
• Using multiple different CNNs and averaging their scores to get a prediction class for an image results in better scores than just 1 CNN. See: the instance with 7 CNNs.
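A minimal sketch of that averaging scheme (the probabilities below are made up for illustration, not actual contest numbers): each network's softmax output for one image is averaged per class, and the arg-max of the mean becomes the ensemble's prediction.

```python
# Ensemble prediction by averaging per-model class probabilities (illustrative).

def ensemble_predict(per_model_probs):
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    # Average each class's probability across all models.
    avg = [sum(m[c] for m in per_model_probs) / n_models
           for c in range(n_classes)]
    # Predict the class with the highest averaged probability.
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three hypothetical models: model 0 alone would pick class 0,
# but the average favors class 1.
probs = [
    [0.50, 0.45, 0.05],
    [0.10, 0.80, 0.10],
    [0.30, 0.60, 0.10],
]
pred, avg = ensemble_predict(probs)
print(pred, avg)
```

Averaging smooths out the idiosyncratic mistakes of individual models, which is why the 7-CNN ensemble scores better than any single network.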
Summary
• Convolutional neural networks are still top performers among neural network approaches.
• The Inception framework allows for large scaling while minimizing processing bottlenecks, as well as "choke points" where, past a certain scale, the network becomes inefficient.
• It also runs well on machines without powerful hardware.
• Reducing with 1x1 convolutions before passing data to 3x3 and 5x5 convolutions has proven efficient and effective.
• Further study: is mimicking the actual biological conditions universally the best case for neural network architecture?

References
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2015.7298594
Chabacano. (2008, February). Overfitting. Retrieved April 08, 2017, from https://en.wikipedia.org/wiki/Overfitting