nosql and column store · why choose rdbms over nosql : 1/3 1.if new relational systems can do...

50
CompSci 516 Database Systems Lecture 20 NoSQL and Column Store Instructor: Sudeepa Roy Duke CS, Fall 2018 CompSci 516: Database Systems 1

Upload: others

Post on 27-May-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

CompSci 516DatabaseSystems

Lecture20NoSQLand

ColumnStore

Instructor:Sudeepa Roy

DukeCS,Fall2018 CompSci516:DatabaseSystems 1

Page 2: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ReadingMaterialNOSQL:• “ScalableSQLandNoSQLDataStores”RickCattell,SIGMODRecord,December2010(Vol.39,No.4)• seewebpagehttp://cattell.net/datastores/ forupdatesandmorepointers• MongoDBmanual:https://docs.mongodb.com/manual/

ColumnStore:• D.Abadi,P.Boncz,S.Harizopoulos,S.Idreos andS.Madden.TheDesignandImplementationof

ModernColumn-OrientedDatabaseSystems.FoundationsandTrendsinDatabases,vol.5,no.3,pp.197–280,2012.

• SeeVLDB2009tutorial:http://nms.csail.mit.edu/~stavros/pubs/tutorial2009-column_stores.pdf

Optional:• “Dynamo:Amazon’sHighlyAvailableKey-valueStore”ByGiuseppeDeCandia et.al.SOSP

2007

• “Bigtable:ADistributedStorageSystemforStructuredData”FayChanget.al.OSDI2006

DukeCS,Fall2018 CompSci516:DatabaseSystems 2

Page 3: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

NoSQL

DukeCS,Fall2018 CompSci516:DatabaseSystems 3

Seetheoptional/additionalslidesonMongoDBonthecoursewebsiteMaybeusefulforHW3

Page 4: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 4

Page 5: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

Sofar-- RDBMS

• RelationalDataModel• RelationalDatabaseSystems(RDBMS)• RDBMSshave– acompletepre-definedfixedschema– aSQLinterface– andACIDtransactions

DukeCS,Fall2018 CompSci516:DatabaseSystems 5

Page 6: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

Today• NoSQL:”new”databasesystems– nottypicallyRDBMS– relaxonsomerequirements,gainefficiencyandscalability

• Newsystemschoosetouse/notuseseveralconceptswelearntsofar– e.g.“System---”doesnotuselocksbutusesmulti-versionCC(MVCC)or,

– “System---”usesasynchronousreplication• therefore,itisimportanttounderstandthebasics(Lectures1-18)eveniftheyarenotusedinsomenewsystems!

DukeCS,Fall2018 CompSci516:DatabaseSystems 6

Page 7: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

Warnings!

• MaterialfromCattell’spaper(2010-11)–someinfowillbeoutdated– seewebpagehttp://cattell.net/datastores/ forupdatesandmorepointers

• WewillfocusonthebasicideasofNoSQLsystems

• Optional readingslidesattheendonMongoDB– maybeusefulforHW3– therearealsocomparisontablesintheCattell’spaperifyouareinterested

DukeCS,Fall2018 CompSci516:DatabaseSystems 7

Page 8: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

OLAPvs.OLTP

• OLTP(OnLine TransactionProcessing)– Recalltransactions!– Multipleconcurrentread-writerequests– Commercialapplications(banking,onlineshopping)– Datachangesfrequently– ACIDproperties,concurrencycontrol,recovery

• OLAP(OnLine AnalyticalProcessing)– Manyaggregate/group-byqueries– multidimensionaldata– Datamostlystatic– WillstudyOLAPCubesoon

DukeCS,Fall2018 CompSci516:DatabaseSystems 8

Page 9: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

NewSystems• WewillexamineanumberofSQLandso- called“NoSQL”systemsor“datastores”

• DesignedtoscalesimpleOLTP-styleapplicationloads– todoupdatesaswellasreads– incontrasttotraditionalDBMSsanddatawarehouses– toprovidegoodhorizontalscalability(?)forsimpleread/writedatabaseoperationsdistributedovermanyservers

• OriginallymotivatedbyWeb2.0applications– thesesystemsaredesignedtoscaletothousandsormillionsofusers

DukeCS,Fall2018 CompSci516:DatabaseSystems 9

Page 10: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

NewSystemsvs.RDMS• Whenyoustudyanewsystem,compareitwithRDBMS-sonits– datamodel– consistencymechanisms– storagemechanisms– durabilityguarantees– availability– querysupport

• Thesesystemstypicallysacrificesomeofthesedimensions– e.g.database-widetransactionconsistency,inordertoachieveothers,e.g.higheravailabilityandscalability

DukeCS,Fall2018 CompSci516:DatabaseSystems 10

Page 11: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

NoSQL

• Manyofthenewsystemsarereferredtoas“NoSQL”datastores

• NoSQLstandsfor“NotOnlySQL”or“NotRelational”– notentirelyagreedupon

• Next:sixkeyfeaturesofNoSQLsystems

DukeCS,Fall2018 CompSci516:DatabaseSystems 11

Page 12: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

NoSQL:SixKeyFeatures

1. theabilitytohorizontallyscale“simpleoperations”throughputovermanyservers

2. theabilitytoreplicateandtodistribute(partition)dataovermanyservers

3. asimplecalllevelinterfaceorprotocol(incontrasttoSQLbinding)

4. aweakerconcurrencymodelthantheACIDtransactionsofmostrelational(SQL)databasesystems

5. efficientuseofdistributedindexesandRAMfordatastorage6. theabilitytodynamicallyaddnewattributestodatarecords

DukeCS,Fall2018 CompSci516:DatabaseSystems 12

Page 13: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ImportantExamplesofNewSystems

• Threesystemsprovideda“proofofconcept”andinspiredmanyotherdatastores

1. Memcached2. Amazon’sDynamo3. Google’sBigTable

DukeCS,Fall2018 CompSci516:DatabaseSystems 13

Page 14: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

1.Memcached:mainfeatures

• popularopensourcecache

• supportsdistributedhashing(later)

• demonstratedthatin-memoryindexes canbehighlyscalable,distributing andreplicatingobjectsovermultiplenodes

DukeCS,Fall2018 CompSci516:DatabaseSystems 14

Page 15: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

2.Dynamo:mainfeatures

• pioneeredtheideaofeventualconsistencyasawaytoachievehigheravailabilityandscalability

• datafetchedarenotguaranteedtobeup-to-date

• butupdatesareguaranteedtobepropagatedtoallnodeseventually

DukeCS,Fall2018 CompSci516:DatabaseSystems 15

Page 16: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

3.BigTable :mainfeatures

• demonstratedthatpersistentrecordstoragecouldbescaledtothousandsofnodes

• “columnfamilies”

• https://cloud.google.com/bigtable/• https://static.googleusercontent.com/media/research.google.co

m/en//archive/bigtable-osdi06.pdf

DukeCS,Fall2018 CompSci516:DatabaseSystems 16

Page 17: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

BASE(notACIDJ)

• RecallACIDforRDBMSdesiredpropertiesoftransactions:– Atomicity,Consistency,Isolation,andDurability

• NOSQLsystemstypicallydonotprovideACID

• BasicallyAvailable• Softstate• Eventuallyconsistent

DukeCS,Fall2018 CompSci516:DatabaseSystems 17

Page 18: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ACIDvs.BASE• TheideaisthatbygivingupACIDconstraints,onecanachievemuchhigherperformanceandscalability

• Thesystemsdifferinhowmuchtheygiveup– e.g.mostofthesystemscallthemselves“eventuallyconsistent”,meaningthatupdatesareeventuallypropagatedtoallnodes

– butmanyofthemprovidemechanismsforsomedegreeofconsistency,suchasmulti-versionconcurrencycontrol(MVCC)

DukeCS,Fall2018 CompSci516:DatabaseSystems 18

Page 19: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

“CAP”Theorem

• OftenEricBrewer’sCAPtheoremcitedforNoSQL

• A systemcanhaveonlytwooutofthreeofthefollowingproperties:• Consistency– doallclientsseethesamedata?

• Availability– isthesystemalwayson?

• Partition-tolerance– evenifcommunicationisunreliable,doesthesystemfunction?

• TheNoSQLsystemsgenerallygiveupconsistency– However,thetrade-offsarecomplex

DukeCS,Fall2018 CompSci516:DatabaseSystems 19

Page 20: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

TwofociforNoSQLsystems

1. “Simple”operations

2. HorizontalScalability

DukeCS,Fall2018 CompSci516:DatabaseSystems 20

Page 21: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

1.“Simple”Operations

• Readingorwritingasmallnumberofrelatedrecordsineachoperation– e.g.keylookups– readsandwritesofonerecordorasmallnumberofrecords

• Thisisincontrasttocomplexqueries,joins,orread-mostlyaccess

• Inspiredbyweb,wheremillionsofusersmaybothreadandwritedatainsimpleoperations– e.g.searchandupdatemulti-serverdatabasesofelectronic

mail,personalprofiles,webpostings,wikis,customerrecords,onlinedatingrecords,classifiedads,andmanyotherkindsofdata

DukeCS,Fall2018 CompSci516:DatabaseSystems 21

Page 22: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

2.HorizontalScalability

• Shared-NothingHorizontalScaling

• Theabilitytodistributeboththedataandtheloadofthesesimpleoperationsovermanyservers– withnoRAMordisksharedamongtheservers

• Not“vertical”scaling– whereadatabasesystemutilizesmanycoresand/orCPUsthatshareRAManddisks

• Someofthesystemswedescribeprovidebothverticalandhorizontalscalability

DukeCS,Fall2018 CompSci516:DatabaseSystems 22

Page 23: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

2.Horizontalvs.VerticalScaling

• Effectiveuseofmultiplecores(verticalscaling)isimportant– butthenumberofcoresthatcansharememoryislimited

• horizontalscalinggenerallyislessexpensive– canusecommodityservers

• Note:horizontalandverticalpartitioningarenotrelatedtohorizontalandverticalscaling (Lecture18)– exceptthattheyarebothusefulforhorizontalscaling

DukeCS,Fall2018 CompSci516:DatabaseSystems 23

Page 24: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhatisdifferentinNOSQLsystems

• WhenyoustudyanewNOSQLsystem,noticehowitdiffersfromRDBMSintermsof

1. ConcurrencyControl2. DataStorageMedium3. Replication4. Transactions

DukeCS,Fall2018 CompSci516:DatabaseSystems 24

Page 25: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ChoicesinNOSQLsystems:1.ConcurrencyControl

a) Locks– somesystemsprovideone-user-at-a-timereadorupdatelocks– MongoDBprovideslockingatafieldlevel

b) MVCCc) None– donotprovideatomicity– multipleuserscaneditinparallel– noguaranteewhichversionyouwillread

d) ACID– pre-analyzetransactionstoavoidconflicts– nodeadlocksandnowaitsonlocks

DukeCS,Fall2018 CompSci516:DatabaseSystems 25

Page 26: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ChoicesinNOSQLsystems:2.DataStorageMedium

a) StorageinRAM– snapshotsorreplicationtodisk– poorperformancewhenoverflowsRAM

b) Diskstorage– cachinginRAM

DukeCS,Fall2018 CompSci516:DatabaseSystems 26

Page 27: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ChoicesinNOSQLsystems:3.Replication

• whethermirrorcopiesarealwaysinsynca) Synchronousb) Asynchronous– faster,butupdatesmaybelostinacrash

c) Both– localcopiessynchronously,remotecopies

asynchronously

DukeCS,Fall2018 CompSci516:DatabaseSystems 27

Page 28: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ChoicesinNOSQLsystems:4.TransactionMechanisms

a) supportb) donotsupportc) inbetween– supportlocaltransactionsonlywithinasingle

objector“shard”– shard=ahorizontalpartitionofdataina

database

DukeCS,Fall2018 CompSci516:DatabaseSystems 28

Page 29: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ComparisonfromCattell’spaper(2011)

DukeCS,Fall2018 CompSci516:DatabaseSystems 29

Page 30: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DataModelTerminologyforNoSQL

• UnlikeSQL/RDBMS,theterminologyforNoSQLisofteninconsistent– wearefollowingnotationsinCattell’spaper

• Allsystemsprovideawaytostorescalarvalues– e.g.numbersandstrings

• Someofthemalsoprovideawaytostoremorecomplexnestedorreferencevalues

DukeCS,Fall2018 CompSci516:DatabaseSystems 30

Page 31: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DataModelTerminologyforNoSQL

• Thesystemsallstoresetsofattribute-valuepairs– butusefourdifferentdatastructures

1. Tuple2. Document3. ExtensibleRecord4. Object

DukeCS,Fall2018 CompSci516:DatabaseSystems 31

Page 32: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

1.Tuple

• Sameasbefore• A“tuple”isarowinarelationaltable– attributenamesarepre-definedinaschema– thevaluesmustbescalar– thevaluesarereferencedbyattributename– incontrasttoanarrayorlist,wheretheyarereferencedbyordinalposition

DukeCS,Fall2018 CompSci516:DatabaseSystems 32

Page 33: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

2.Document

• Allowsvaluestobenesteddocumentsorlistsaswellasscalarvalues– thinkaboutXMLorJSON

• Theattributenamesaredynamicallydefinedforeachdocumentatruntime

• Adocumentdiffersfromatupleinthattheattributesarenotdefinedinaglobalschema– anda widerrangeofvaluesarepermitted

DukeCS,Fall2018 CompSci516:DatabaseSystems 33

Page 34: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

3.ExtensibleRecord

• A hybrid betweenatupleandadocument• familiesofattributesaredefinedinaschema• butnewattributescanbeadded(withinanattributefamily)onaper-recordbasis

• Attributesmaybelist-valued

DukeCS,Fall2018 CompSci516:DatabaseSystems 34

Page 35: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

4.Object

• Analogoustoanobjectinprogramminglanguages– butwithouttheproceduralmethods

• Valuesmaybereferencesornestedobjects

DukeCS,Fall2018 CompSci516:DatabaseSystems 35

Page 36: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ExampleNOSQLsystems

• Key-valueStores:– ProjectVoldemort,Riak,Redis,Scalaris,TokyoCabinet,Memcached/Membrain/Membase

• DocumentStores:– AmazonSimpleDB,CouchDB,MongoDB,Terrastore

• ExtensibleRecordStores:– Hbase,HyperTable,Cassandra,Yahoo’sPNUTS

• RelationalDatabases:– MySQLCluster,VoltDB,Clustrix,ScaleDB,ScaleBase,NimbusDB,GoogleMegastore(alayeronBigTable)

DukeCS,Fall2018 CompSci516:DatabaseSystems 36

Page 37: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

SQLvs.NOSQL

DukeCS,Fall2018 CompSci516:DatabaseSystems 37

Argumentsforbothsidesstillacontroversialtopic

Page 38: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseRDBMSoverNoSQL:1/31. Ifnewrelationalsystemscandoeverything

thataNoSQLsystemcan,withanalogousperformanceandscalability(?),andwiththeconvenienceoftransactionsandSQL,NoSQLisnotneeded

2. RelationalDBMSshavetakenandretainedmajoritymarketshareoverothercompetitorsinthepast30years– (network,object,andXMLDBMSs)

DukeCS,Fall2018 CompSci516:DatabaseSystems 38

Page 39: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseRDBMSoverNoSQL:2/33. SuccessfulrelationalDBMSshavebeenbuilt

tohandleotherspecificapplicationloads inthepast:– read-onlyorread-mostlydatawarehousing– OLTPonmulti-coremulti-diskCPUs– in-memorydatabases– distributeddatabases,and– nowhorizontallyscaleddatabases

DukeCS,Fall2018 CompSci516:DatabaseSystems 39

Page 40: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseRDBMSoverNoSQL:3/3

4. Whileno“onesizefitsall”intheSQLproductsthemselves,thereisacommoninterfacewithSQL,transactions,andrelationalschemathatgiveadvantagesintraining,continuity,anddatainterchange

DukeCS,Fall2018 CompSci516:DatabaseSystems 40

Page 41: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseNoSQLoverRDBMS:1/31. Wehaven’tyetseengoodbenchmarksshowing

thatRDBMSscanachievescaling comparablewithNoSQLsystemslikeGoogle’sBigTable

2. Ifyouonlyrequirealookupofobjectsbasedonasinglekey– thenakey-valuestoreisadequateandprobablyeasiertounderstand

thanarelationalDBMS– Likewiseforadocumentstoreonasimpleapplication:youonlypay

thelearningcurveforthelevelofcomplexityyourequire

DukeCS,Fall2018 CompSci516:DatabaseSystems 41

Page 42: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseNoSQLoverRDBMS:2/3

3. Someapplicationsrequireaflexibleschema– allowingeachobjectinacollectiontohavedifferentattributes

– WhilesomeRDBMSsallowefficient“packing”oftupleswithmissingattributes,andsomeallowaddingnewattributesatruntime,thisisuncommon

DukeCS,Fall2018 CompSci516:DatabaseSystems 42

Page 43: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

WhychooseNoSQLoverRDBMS:3/3

4. ArelationalDBMSmakes“expensive”(multi- nodemulti-table)operations“tooeasy”– NoSQLsystemsmakethemimpossibleorobviouslyexpensiveforprogrammers

5. WhileRDBMSshavemaintainedmajoritymarketshareovertheyears,otherproductshaveestablishedsmallerbutnon-trivialmarketsinareaswherethereisaneedforparticularcapabilities– e.g.indexedobjectswithproductslikeBerkeleyDB,orgraph-following

operationswithobject-orientedDBMSs

DukeCS,Fall2018 CompSci516:DatabaseSystems 43

Page 44: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

ColumnStore

DukeCS,Fall2018 CompSci516:DatabaseSystems 44

Page 45: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

Rowvs.ColumnStore

• Rowstore– storeallattributesofatupletogether– storagelike“row-majororder”inamatrix

• Columnstore– storeallrowsforanattribute(column)together– storagelike“column-majororder”inamatrix

• e.g.– MonetDB,Vertica(earlier,C-store),SAP/SybaseIQ,GoogleBigtable (withcolumngroups)

DukeCS,Fall2018 CompSci516:DatabaseSystems 45

Page 46: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 46

Ack:SlidefromVLDB2009tutorialonColumnstore

Page 47: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 47

Ack:SlidefromVLDB2009tutorialonColumnstore

Page 48: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 48

Ack:SlidefromVLDB2009tutorialonColumnstore

Page 49: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 49

Ack:SlidefromVLDB2009tutorialonColumnstore

Page 50: NoSQL and Column Store · Why choose RDBMS over NoSQL : 1/3 1.If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?),

DukeCS,Fall2018 CompSci516:DatabaseSystems 50

Ack:SlidefromVLDB2009tutorialonColumnstore