top 6 data science and analytics trends for 2021

12
EBOOK How the Data Cloud accelerates machine learning TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

Upload: others

Post on 16-Oct-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

EBOOKHow the Data Cloud accelerates machine learning

TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

Page 2: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

3 Introduction

4 Shifttowardspredictiveandprescriptiveanalyticscontinues

5 Trend#1:Easy-to-useMLtoolsempowerdataanalystsanddatascientists AutoML AIServicesviaanAPI

6 Trend#2:AconsolidatedplatformclosesthegapbetweenanalyticsandML

7 Trend#3:Snowflake’sDataCloudexpandsaccesstonewdata

8 Trend#4:Dataengineeringtoolsdecreasedrainondatascientist’stime

9 Trend#5:NewgenerationofdistributedtrainingframeworksofferscompellingalternativetoSpark

10 Trend#6:ContinuousreleasesprovidenewoptionsforMLlibraries,tools,andframeworks

11 Accelerateyourmachinelearningin2021

12 AboutSnowflake

Page 3: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

Data science has evolved dramatically over the last 10 years. However, very few organizations have experienced the full business impact or competitive advantage from their advanced analytics, despite significant investments in data science and machine learning (ML). The reason? Many of the tools needed to scale ML are too complicated, and necessary skill sets are in short supply. But change is now afoot. Technology advancements in 2021 will significantly impact the way in which data scientists and data analysts work. In 2021, six trends have the potential to accelerate ML and move organizations from descriptive and diagnostic analytics (explaining what happened and why) towards predictive and prescriptive analytics that forecast what will happen and also provide powerful pointers on how to change the future.

INTRODUCTION

In this ebook, you will learn how:• �Easy-to-use�ML�tools�and�consolidated�data�platforms�empower�data�analysts�and�bridge�the�gap�between�analytics�and�ML

• �Snowflake’s�Data�Cloud�can�expand�data�access�and�data�sharing�through�a�secure�ecosystem�with�access�to�ready-to-use�third-party�data

• �Data�engineering�tools�remove�the�burden�of�data�prep�for�data�scientists�and�make�repurposing�existing�work�easy

• �New�distributed�training�frameworks�offer�an�alternative�superior�to�Spark�while�delivering�up�to�2,000x�faster�performance

• �Rapid�advancements�in�ML�libraries,�tools,�and�frameworks�demonstrate�the�need�for�a�solution�that�future-proofs�data�science�and�ML�investments

3

Page 4: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

4

SHIFT TOWARDS PREDICTIVE AND PRESCRIPTIVE ANALYTICS CONTINUES

In 2021, the field of data science is poised to finally live up to the high expectations that many organizations have held for years. Over the last 10 years, huge investments have been made in data science and ML, guided by the hope that it would transform the way companies do business. However, many organizations continue to feel challenged to drive real business impact with analytics.

Recentstudieslendcredencetothiscommonsentiment.AreportfromMITSloanManagementReviewandBostonConsultingGroupfoundthatonly10%oforganizationsareseeingsignificantfinancialbenefitsfromtheirinvestmentsinAI,1andVentureBeatAIreportedthat87%ofdatascienceprojectsnevermakeitintoproduction.2Andin2019,Gartnerpointedtooneofthebiggerchallengesorganizationsface:“Through2020,80%ofAIprojectswillremainalchemy,runbywizardswhosetalentswillnotscaleintheorganization.”3

Organizationsinvestindatasciencebecauseitpromisestobringcompetitiveadvantages,butmanyofthetoolsandskillsetsneededtoscaleMLhavebeenmissingorinshortsupply.Datascientistscontinuetobeasought-afterandexpensiveresource,andtheirvaluableeffortstendtoberelegatedtotime-consumingtaskssuchasdataselectionanddatapreparation.Conversely,dataanalystsareinabundantsupplyinmostcompaniesandalreadyknowhowtoaddressbusinessproblemsdirectly,buttheylackthetechnicalbackgroundrequiredtomakethejumpfromanalyticstodatasciencetobuildtheirownMLmodels.

Remarkably,advancementsmadein2020pointtosixexcitingtrendsfordatascienceandMLin2021.Newtoolsandtechnologieshaveemerged—andcontinuetobereleasedeverymonth—thatacceleratetheworkofdatascientistswhilealsoempoweringdataanalyststomovebeyonddescriptiveanalyticsandconductlightdatascienceandML.

Underlyingthisaccelerationisthecloud.Datascientistsanddataanalystsbenefitfromcloudtechnologiesthatprovidevirtuallyunlimitedamountsofcomputeresources.Inaddition,thecloudenablestheeliminationofdatasilosbyconsolidatingdatalakes,datawarehouses,anddatamartsforfast,secure,andeasydatasharingandanalysisina singlelocation.

Inshort,dataistransformingintoanactionableasset,andnewtoolsareusingthatrealitytomovetheneedlewithML.Asaresult,organizationsareonthebrinkofmobilizingdatatonotonlypredictthefuturebuttoalsoincreasethelikelihoodofcertainoutcomesthroughprescriptiveanalytics.

Herearesixtrendsthatwillshapedatasciencein2021andcontinuetheevolutionofanalyticstowardsML.

CH

AM

PIO

N G

UID

ES

Page 5: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

5

TREND #1: EASY-TO-USE ML TOOLS EMPOWER DATA ANALYSTS AND DATA SCIENTISTST

Most organizations employ an abundance of data analysts and a limited number of data scientists, due in large part to the limited supply and high costs associated with data scientists. Since analysts lack the data science expertise required to build ML models, data scientists have become the de facto bottleneck for broadening the use of ML.

However, new and improved ML tools are opening the floodgates on ML by automating the technical aspects of data science. Data analysts are empowered with access to powerful models without needing to manually build them. Specifically, automated machine learning (AutoML) and AI services via APIs are removing the need to manually prepare data and then build and train models.

AUTOML

AutoMLtoolsareaptlynamed:TheyautomatethetasksassociatedwithdevelopinganddeployingMLmodels.Theirdevelopmentisgamechangingforbothdatascientistsanddataanalysts.Byautomatingtaskstraditionallydonebydatascientists,AutoMLtoolsenabledataanalyststoaccessmodelsthroughanentirelygraphicalinterface—withoutrequiringadatascientist’sinvolvementortheneedtowritecode.

ButAutoMLisn’tjustforanalysts.Thankstoitspower,AutoMLismakingahugedifferencefordatascientistsbyaddressingthebusyworkthatcantakeupto80%oftheirtime,accordingtoHarvardBusinessReview:loading,selecting,preparing,andcleaningdata.4 Byeliminatingthesetime-consumingdatachores,AutoMLincreasesdatascientists’productivityandprovidesmoretimetoconductanalyses.Additionally,humanerrorsfoundinmanualmodelingprocessesareeliminated,whichimprovesaccuracy.

Inaddition,theonehistoricalflawofAutoMLwasthatitwasseenasablackbox,butthatchallengehasbeensolved.AutoMLservicesnowprovidetransparencyandexplanationsfortheirmodels,whichiskeyforauditinganddetectingbias.Fordatascientists,AutoMLtransformshowquicklytheycanbuildandtestmultiplemodelssimultaneously.

In2020,AutoMLtoolsfromproviderssuchasDataRobot,Dataiku,andH2Osawsignificantadvancements,andnewsolutionswereintroducedsuchasAmazonSageMakerAutopilot.

AI SERVICES VIA AN API

AnotherapproachgrowinginpopularityisAIservices,whichareready-mademodelsavailablethroughAPIs.Ratherthanuseyourowndatatobuildandtrainmodelsforcommonactivities,organizationsaccesspre-trainedmodelsthataccomplishspecifictasks.Whetheranorganizationneedsnaturallanguageprocessing(NLP),automaticspeechrecognition(ASR),

orimagerecognition,AIservicessimplyplug-and-playintoanapplicationthroughanAPI,whichrequiresnoinvolvementfromadatascientist.

AmazonprovidesavarietyoffullymanagedAIservices,includingAmazonLex,Polly,Rekognition,Forecast,andTranslate.5Toillustratethevalue,RekognitionallowsanimagetobesentfromanappthroughtheAPItoAmazon;theAIservicethenreturnsaclassificationanddescriptionofwhattheimageis.Thesetypesofutilitiesnotonlysavetimeandeffort,buttheyalsofreeupdatascientiststofocusonbuildingandtrainingmodelsthatarehighlycustomizedtotheirbusiness,ratherthanre-creatingcommonlyusedservices.

AutoMLtoolsandAIserviceslowerthebarriertoentryforML,soalmostanyonecannowaccessandusedatasciencewithoutrequiringanacademicbackground.However,thetruepowerofthesetoolsisunleashedwhentheyareintegratedseamlesslywithyourexistingtechnologies.WithSnowflake Partner Connect,organizationscanreceivefasterinsightsfromtheirdatathroughpre-builtintegrationsbetweenSnowflakeandtechnologypartners’products.SnowflakePartnerConnectmakesitfastandeasytotrynewMLtoolsandservicesandthenadoptthosethatbestmeetyourbusinessneeds.

CH

AM

PIO

N G

UID

ES

Page 6: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

6

TREND #2: A CONSOLIDATED PLATFORM CLOSES THE GAP BETWEEN ANALYTICS AND ML

Everyone knows data silos exist within and across organizations. However, few realize that these siloes also take the form of “analytics siloes,” particularly between data scientists and data analysts. These analytics silos have formed as a result of the different ways the two roles work and their respective skill sets. Data silos are just one part of the difference: Data scientists and data analysts use different data (raw versus processed), data sources (data lakes versus warehouses and marts), languages (Python and Spark versus SQL), and tools (ML versus BI).

Muchlikeorganizationalsilos,analyticssilosthwartcollaborationandintegrationopportunitiesbetweendatascientistsanddataanalysts.Thissituationresultsinorganizationsmissingoutonthecombinedpowerofthesetwoteams,whichisexponentiallystrongerthansimplythesumofthetwoparts.

Forexample,dataanalystsleveragedatatoprovidekeybusinessmetricsandanswerquestionsaroundwhysomethinghappened.AccordingtoSisu,dataanalysts’superpowerisspeed,andtheyuseittoanalyzedatasetsquicklyandworkwithbusinessstakeholderstouncoverpotentialinsights.6Whiletheirgoalsaretohelpcompaniesmonetizemarketopportunitiesandimprovecompetitiveadvantage,mostoftheworkdataanalystsdoisbackwardslookingbecausetheylackthedatascienceskillsnecessarytobuildpredictiveMLmodels.Instead,dataanalystsrelyonBItoolswhosedashboardshavebuilt-inlimitations.Whiletheycanusedatatounderstandwhathasalreadyhappened,it’schallengingfordataanalyststobeproactiveandexploredatadeeplytofigureoutwhatwillhappenandhowtoinfluenceit.

Conversely,datascientistshavetheabilitytobuildMLmodelsthatnotonlypredictbutalsoinfluencebusinessoutcomes.However,theyarenotaswellversedinthedynamicandfluctuatingbusinessenvironmentasdataanalystsare.Sisudescribesdatascientistsas“narrow-and-deepworkers,”andtheirfocusfrequentlyresultsinorganizationstryingtofocusdatascienceeffortsatknownproblems(oftenuncoveredbydataanalysts)tomaximizetheirvalueandcontributionsratherthanpotentiallywastingtimeandeffortontheunknown.7

Snowflake’s Data Cloud providesthetoolstohelpdeliverstrongeroutcomesandscale.ThroughSnowflake,analyticssilosareeliminated.Thesameconsistent,governedmetricsanddataareavailableforbothanalyticsandMLthroughasharedfeaturestoreandreuseofdataengineeringpipelines.WhendatascienceinsightsaresharedinSnowflake’splatform,dataanalystscanaccessandincorporatethemintodashboardsandanalysis,thusbroadeningthescopeofimpactofthemodelsthedatascienceteambuilds.

CH

AM

PIO

N G

UID

ES

Page 7: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

7

CH

AM

PIO

N G

UID

ES

TREND #3: SNOWFLAKE’S DATA CLOUD EXPANDS ACCESS TO NEW DATA

In an IDC report sponsored by Seagate, IDC reports that by 2025, it expects 175 zettabytes of data to be created worldwide each year,8 which is approximately four times the amount produced in 2020 according to the World Economic Forum.9 The explosion of data produced by technologies such as IoT, social media, and mobile devices is opening up vast opportunities for data-driven insights into every area of inquiry.

Ofcourse,it’svirtuallyimpossibleforanyorganizationtoproduceorcollectallthedataneededtouncoverbusinessandcompetitivetrends.Increasingly,theabilitytoshareandjoindatasets,bothwithinandacrossorganizations,isviewedasthebestwaytoderivemorevaluefromdata.That’swhydatascientistsanddataanalystsarecontinuallyonthehuntformoredatatosupplementtheirMLmodelsandanalyseswithexternaldatatoimprovetheaccuracyofresults.

Snowflakeenablessecure,governed,compliant,andseamlessaccesstothird-partydatainthreeways.

1 Snowflake’s Data CloudisanecosystemwhereSnowflakecustomers,partners,dataproviders,anddataserviceprovidersconnecttotheirowndataandseamlesslyshareandconsumedataanddataservicessharedbyotherusers.UnderpinnedbySnowflake’splatform,theDataCloudeliminatesbarrierspresentedbysiloeddataandenablesorganizationstounifyandconnecttoasinglecopyofdata.Inaddition,theDataCloudisaseamlesswaytoderivevaluefromrapidlygrowingcommercializeddatasetswithfast,easy,andgovernedaccess.

2� EmpoweringtheDataCloudisSnowflake Secure Data Sharing,whichremovestraditionaldatatransferbarriers.WithSnowflake,dataisgenerallynevercopiedandtransmitted.Instead,userscansharelivedatafromitsoriginallocation.Thosegrantedaccesssimplyreferencedatainacontrolledandsecuremannerwithoutlatencyorcontentionfromconcurrentusers.Becausechangestodataaremadetoasingleversion,dataremainsuptodateforallconsumers,whichensuresdatamodelsarealwaysusingthelatestversion.

3 SnowflakeSecureDataSharingisthetechnologyfoundationforSnowflake Data Marketplace, whichservesasasinglelocationtoaccesslive,ready-to-querydata.Secure,governeddatacanbesharedwith,andreceivedfrom,anecosystem

ofbusinesspartners,suppliers,andcustomers,orfromthird-partydataprovidersanddataserviceproviders.SnowflakeDataMarketplaceremovesthearduousprocessesinvolvedinlocatingtherightdatasets,signingcontractswithvendors,andmanagingthedatatomakeitcompatiblewithinternaldata.Instead,datascientistsanddataanalystscansourcenewdatawithease.InadditiontoSnowflakeDataMarketplace,organizationscanuseprivate data exchangestosharedatawithtrustedpartners,suppliers,vendors,andcustomersthroughSnowflakeSecureDataSharing.

ExternaldataisavailableandaccessibletoallDataClouduserswithjustafewclicks.Onceit’sintheDataCloud,dataisreadytobesharedandconsumed.There’snosendingofCSVfilesormanualversioncontrol.Datascientistscanenrichmodelswithseamlessaccesstoalmost-unlimiteddataonanytopic,includingreal-timeandevolvingcircumstances.Forexample,in2020,COVIDdatasetswereuniversallyinhighdemandacrossallsectorsandindustriesbecauseorganizationsneededtoanalyzetheimpactofthevirusonanhour-by-hourbasis.

CH

AM

PIO

N G

UID

ES

Page 8: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

8

TREND #4: DATA ENGINEERING TOOLS DECREASE DRAIN ON DATA SCIENTISTS’ TIME

Data engineering tasks require an inordinately large amount of a data scientist’s attention. The time commitment for data preparation varies. It can range from 45%, according to the Anaconda “2020 State of Data Science” survey reported by Datanami,10 to 80%, according to a survey conducted by CrowdFlower and reported by Forbes,11 but there is little disagreement among data scientists that collecting, organizing, and cleaning data are the least enjoyable tasks they undertake. Minimizing this burden is paramount not only for keeping data scientists productive and happy but also for broadening access to ML.

Finally,directintegrationsbetweendataengineeringtoolsandMLtoolsarestartingtobringthesetwoworldstogether.Forexample,DataRobotacquiredPaxatatoadddatapreparationtoolsalongsideitsAutoMLoffering,12andAlteryxisshiftingitsfocusfromdatapreptowardsanautomated,assistedMLmodelingoffering.13Inaddition,AmazonSageMakerlaunchedtwonewservices:DataWrangler,14 to acceleratedataprepforML,andPipelines,15asacontinuousintegrationandcontinuousdelivery(CI/CD)serviceforML.

Datascientistsarealsobenefitingfrom“featurestores,”whichmakeiteasytorepurposeexistingwork.Forexample,onceadatascientisthasconvertedrawdataintoametric(forexample,“costofgoodssold”),thisuniversalmetriccanbefoundquicklyandusedbyeveryoneelseforquickanalysisagainstthatdata.Notonlydoesthispracticesavedatascientiststimeandeffort,butitalsoreinforcesBImetrics,maintainsdatagovernance,andensurestherearenodiscrepanciesacrosstheworkdonebydataanalystsanddatascientists.

Dataengineeringtoolsnotonlyenablereuseofprepareddatabyanyonewithinanorganization,buttheycanleveragethescalabilityandefficiencyofSnowflaketoprovidetheprocessing.Snowflakeworkswithvariouspartnersthatprovidefeaturestoresolutionstoensuredatascientistsanddataanalystscanreuseconsistentfeatures.

CH

AM

PIO

N G

UID

ES

Page 9: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

9

TREND #5: NEW GENERATION OF DISTRIBUTED TRAINING FRAMEWORKS OFFERS COMPELLING ALTERNATIVE TO SPARK

Data scientists are always looking for strategic ways to inject efficiency into training and deploying models. Recently, a new generation of distributed training engines has surfaced that delivers on that goal by providing tremendous speed and performance gains over Apache Spark.

Oneapproachthat’sgainingattentionisDask,adistributedtrainingframeworkbuiltinPython.16Daskisdesignedtoenabledatascientiststoimprovemodelaccuracyfaster.DatascientistscandoeverythinginPythonendtoend,whichmeanstheynolongerneedtoconverttheircodetoexecuteinSpark.Theresultisreducedcomplexityandincreasedefficiency.

AnotheropensourcePythonframeworkisRAPIDS,whichisbuiltontopofDask.17RAPIDSoptimizescomputetimeandspeedbyprovidingdatapipelinesandexecutingdatasciencecodeentirelyongraphicsprocessingunits(GPUs)ratherthanCPUs.SaturnCloudrecentlycomparedRAPIDStoSparkanddiscoveredthatmodeltrainingwithRAPIDStookonesecondona20-nodeGPUcluster,whileSparktook37minutesonasimilarlypriced20-nodeCPUcluster.SaturnCloudconcludedthatRAPIDSenables2,000xfasterprocessingusingGPUswhilecostingafractionoftheprice.18

Theimpactofthesedistributedtrainingframeworksisalreadybeingseenintherealworld.WalmartusesRAPIDSwithDaskandXGBoost(anMLalgorithm)foritsdataanalyticsandML,andNVIDIAreportsthatWalmarthasfoundthat“oneGPUserverrequiresonlyfourpercentofthetimeneededtorunthesameforecastingmodelsvsa20-nodeCPUserver.”19ThattranslatestoWalmartrunningmodelsinfourhoursthatpreviouslytookseveralweeksusingCPUs.

Whileorganizationsarethinkingstrategicallyabouttrainingframeworks,somehaverunintobarriersinthepast.Today,newtechnologiesareunlockingwhat’spossibleanddemonstratinghowmuchfasterthingscanbewheneverythingisdonedirectlywithPython.ByeliminatingtheneedtoconvertmodelsintoSpark,organizationsarereducingcomplexityandincreasingefficiency.Andit’seasytotrydifferentdistributedtrainingframeworksonSnowflake’splatformtofindwhatworksbest.

CH

AM

PIO

N G

UID

ES

Page 10: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

10

TREND #6: CONTINUOUS RELEASES PROVIDE NEW OPTIONS FOR ML LIBRARIES, TOOLS, AND FRAMEWORKS

The field of data science is evolving rapidly. Not only are new ML and AI developments released every month, but new startups, tools, and solutions emerge regularly. With the rapid pace of innovation occurring in this space, it’s imperative not to get locked into using a single tool.

That’swhyit’simportanttoselectaplatformthatisvendor-,framework-,andalgorithm-agnostic.Bychoosingafuture-proofplatform,youensurethatupcomingMLtoolswillcontinuetoworkseamlesslywiththeplatformyouhave.Afterall,thelastthingyouwanttodoisre-platforminordertousethenextgenerationoftools.

WhatmakesSnowflake’splatformuniqueisitsmodernarchitecture.Designedwithseparate,butlogicallyintegrated,computeandstorage,Snowflakeeliminatesthemanualcluster-buildingeffortsthatothersystemsmustperformtomakeseparatelayersworktogether.Asaresult,Snowflakeprovidesamulti-cluster, shared data architecturethatprovidesnearlyinfinitescalability,instantelasticity,andextremelyhighlevelsofconcurrencytopowertheDataCloud.

Inadditiontotheunderlyingarchitecture,Snowflakesupportsdatascienceinavarietyofways.

• �Snowflake’s�External Functions�allow�any�third-party,�hosted,�or�custom�ML�service�to�be�accessed�easily�using�SQL.�

• �Recognizing�that�various�teams�may�prefer�languages�other�than�SQL,�Snowpark�extends�language�support�for�Java,�Scala,�and,�soon,�Python.�Snowpark�allows�data�scientists�to�write�code�in�their�language�of�choice�using�familiar�programming�concepts,�such�as�DataFrames,�and�then�execute�data�preparation�and�workloads�directly�on�Snowflake.

• Java user-defined functions (UDF)�are�supported�to�enable�trained�models�to�run�within�Snowflake.�That�means�models�built�and�trained�in�an�ML�partner’s�technology�can�be�brought�into�and�run�directly�on�Snowflake�resources.

CH

AM

PIO

N G

UID

ES

Page 11: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

11

ACCELERATE YOUR MACHINE LEARNING IN 2021

It’s remarkable how quickly data science has become mainstream. In the last 10 years, companies have shifted their focus from reporting and historical analysis to conducting data science with advanced mathematical models and ML. The cloud changed everything. With the ability to inexpensively collect and store more and more data came the need to build data models powered by ML.

Today,amodernplatformisanecessityifyouwanttoanalyzeandsharedataquicklyandscalablywithsecurityandgovernancebuiltin.Snowflakeprovidesanarchitecturethatenablesdataconsolidation,efficientdatapreparation,andanextensivepartnerecosystem.Yourdataismobilized,whichallowsyoutobenefitimmediatelyfromnewtrendsindatascienceandML.

With Snowflake, limitations on data science are removed. Are you ready to accelerate your machine learning?

CH

AM

PIO

N G

UID

ES

Page 12: TOP 6 DATA SCIENCE AND ANALYTICS TRENDS FOR 2021

ABOUT SNOWFLAKE SnowflakedeliverstheDataCloud—aglobalnetworkwherethousandsoforganizationsmobilizedatawithnear-unlimitedscale,concurrency,andperformance.InsidetheDataCloud,organizationsunitetheirsiloeddata,easilydiscoverandsecurelysharegoverneddata,andexecutediverseanalyticworkloads.Whereverdataoruserslive,Snowflakedeliversasingleandseamlessexperienceacrossmultiplepublicclouds.Snowflake’splatformistheenginethatpowersandprovidesaccesstotheDataCloud,creatingasolutionfordatawarehousing,datalakes,dataengineering,

datascience,dataapplicationdevelopment,anddatasharing.JoinSnowflakecustomers,partners,anddataprovidersalreadytakingtheirbusinessestonewfrontiersintheDataCloud.

snowflake.com

©2021SnowflakeInc.Allrightsreserved.Snowflake,theSnowflakelogo,andallotherSnowflakeproduct,featureandservicenamesmentionedhereinareregisteredtrademarksortrademarksofSnowflakeInc.intheUnitedStatesandothercountries.Allotherbrandnamesorlogosmentionedorusedhereinareforidentificationpurposesonlyandmaybethetrademarksoftheirrespectiveholder(s).Snowflakemaynotbe

associatedwith,orbesponsoredorendorsedby,anysuchholder(s).

CITATIONS1 on.bcg.com/2Kh3grW 2 bit.ly/3qpSFKQ 3 gtnr.it/2N4FZul4 bit.ly/39Cnaq8

5 amzn.to/3sv5qFx6 bit.ly/2XJOa1u7 bit.ly/3swABAn8 bit.ly/2LURiog

9 bit.ly/3ss1tld10 bit.ly/3oQvitn11 bit.ly/3qpT9AI 12 tcrn.ch/2PFKZDf

13 zdnet.com/article/where-is-alteryx-heading14 aws.amazon.com/sagemaker/data-wrangler15 aws.amazon.com/sagemaker/pipelines16 dask.org

17 rapids.ai18 bit.ly/3srNbRv 19 blogs.nvidia.com/blog/2019/07/11/walmart-nvidia