data warehouses and nosql -...

68
Data Warehouses and NoSQL Sharing Administra6ve Informa6on Carmen Barandela So-ware Engineer CERN / GS‐AIS October 24 ‐ 28, 2011 JINR/CERN Grid and Management Informa6on Systems

Upload: others

Post on 27-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

DataWarehousesandNoSQLSharingAdministra6veInforma6on

CarmenBarandelaSo-wareEngineerCERN/GS‐AIS

October24‐28,2011JINR/CERNGridandManagementInforma6onSystems

Page 2: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Agenda•  DataWarehousesinAdministra6veCompu6ng

•  Recap:DataWarehousesTheory

•  DataWarehousesandInforma6onSystemsinAIS–  Founda6on,HRandFIInforma6onSystems

–  ComplexDataExtrac6onProcesses–  Pixel‐PerfectRepor6ng– Dashboards

•  NoSQL–  ...

CarmenBarandela:"DataWarehousesandNoSQL” 2

Page 3: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Agenda•  DataWarehousesinAdministra6veCompu6ng

•  Recap:DataWarehousesTheory

•  DataWarehousesandInforma6onSystemsinAIS–  Founda6on,HRandFIInforma6onSystems

–  ComplexDataExtrac6onProcesses–  Pixel‐PerfectRepor6ng– Dashboards

•  NoSQL–  ...

3CarmenBarandela:"DataWarehousesandNoSQL”

Page 4: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Ca.16,000People

4CarmenBarandela:"DataWarehousesandNoSQL”

Page 5: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Mankind’sLargestMachine

5CarmenBarandela:"DataWarehousesandNoSQL”

Page 6: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

EnormousAmountofData

JanJanke:"DataWarehousesandAnaly6calDataProcessing..." 6

Page 7: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Administra6veCompu6ng•  ProvidesmeanstoadministrateCERN•  Enablesphysiciststofocusontheirwork•  Allowsmanagementtomaketherightmoves

7CarmenBarandela:"DataWarehousesandNoSQL”

Page 8: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

WhyDataWarehouses?•  Heterogeneouscompu6nglandscape•  VariousspecialisedOLTPsystems•  Planningneeds•  LegalRequirements

8

  Supportadministra6vestaff  Enforcesecurityandsafetyonsite  Allowmanagementtomakedecisions

CarmenBarandela:"DataWarehousesandNoSQL”

Page 9: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Example:KeepFinancesUnderControl

•  SpecialisedSystems– Accoun6ng,ERPforCERNstores– Externalcontractsmanagement– Payroll,treasurymanagement,…

9

Specialisedsmallusergroups

Dis6nctdatabases

Highavailabilityandperformance,real‐6medata

Systemsonlyaccessibletoauthorisedspecialists

CarmenBarandela:"DataWarehousesandNoSQL”

Page 10: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Example:KeepFinancesUnderControl

•  GeneralFinancialInforma6onSystem– Singlesystem

– Accesstodatafrommul6plesources– Differentlevelsofcomplexity

10

Specialisedsmallusergroups

Dis6nctdatabases

Highavailabilityandperformance,real‐6medata

Systemsonlyaccessibletoauthorisedspecialists

CarmenBarandela:"DataWarehousesandNoSQL”

Page 11: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Example:KeepFinancesUnderControl

•  GeneralFinancialInforma6onSystem– Singlesystem

– Accesstodatafrommul6plesources– Differentlevelsofcomplexity

11

UsersfromallareasofCERN

Singledatawarehouse

Highavailabilityandperformance,butnonecessityforreal‐6medata

Securityisextremelyimportant!SystemisaccessibleCERNwide.

CarmenBarandela:"DataWarehousesandNoSQL”

Page 12: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

AIS’FinancialDataWarehouse

•  Keepdatainsyncwithdataproviders•  Mastercomplexdataextrac6onprocess

•  Ensurehighqueryperformance

•  Basefordetaileddataanalysis

12

Technologies:o ORACLERACdatabaseo JavaEnterprisewebapplica6onso In‐housedevelopedframeworkso Third‐partyBIandrepor6ngtools

CarmenBarandela:"DataWarehousesandNoSQL”

Page 13: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Agenda•  DataWarehousesinAdministra6veCompu6ng

•  Recap:DataWarehousesTheory

•  DataWarehousesandInforma6onSystemsinAIS–  Founda6on,HRandFIInforma6onSystems

–  ComplexDataExtrac6onProcesses–  Pixel‐PerfectRepor6ng– Dashboards

•  DetailedDataWarehouseExample– ManagementDataLayer(MDL)

13CarmenBarandela:"DataWarehousesandNoSQL”

Page 14: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

FindtheNeedleintheHay…

14CarmenBarandela:"DataWarehousesandNoSQL”

Page 15: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

OLTPvsOLAP

OLTP OLAP

Datasource Opera6ons OLTP(consolidated)

Datapurpose Runthebusiness Repor6ng,analysis

Inserts,updates High Periodicbatchjobs

Querycomplexity Low High

DBdesign Normalized Star,snowflake

Availability Cri6cal Lesscri6cal

Target Opera6onalstaff Middle/higherMgmt.

15CarmenBarandela:"DataWarehousesandNoSQL”

Page 16: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

OLTPvsOLAP

OLTP OLAP

Datasource Opera6ons OLTP(consolidated)

Datapurpose Runthebusiness Repor6ng,analysis

Inserts,updates High Periodicbatchjobs

Querycomplexity Low Depends…

DBdesign Normalized Snowflakeandothers

Availability Cri6cal MaybeverycriOcal

Target Opera6onalstaff Mgmt.+OperaOons

16

That’stheory!

Realworldisnotthat

easy…

CarmenBarandela:"DataWarehousesandNoSQL”

Page 17: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Normalisa6on(Codd/Boyce)

•  1NF– 1table=1rela6on,norepea6nggroupsorduplicaterows

•  2NF– Allnonprimeanributesdependonallparts(anributes)ofacompositekey

•  3NF– Allnonprimeanributesdependonlyonthe(whole)key

17CarmenBarandela:"DataWarehousesandNoSQL”

Page 18: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Normalisa6on(Codd/Boyce)

•  1NF– 1table=1rela6on,norepea6nggroupsorduplicaterows

•  2NF– Allnonprimeanributesdependonallparts(anributes)ofacompositekey

•  3NF– Allnonprimeanributesdependonlyonthe(whole)key)

18

Course Category Winner Origin

Monaco‘10 Formula1 M.Webber Australia

Japan‘10 Formula1 S.Venel Germany

Japan‘10 Rally S.Ogier France

Notin3NF,why?

CarmenBarandela:"DataWarehousesandNoSQL”

Page 19: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

StarSchema

19

Source:hnp://www.execu6onmih.com/data‐warehouse/star‐snowflake‐schema.php(16/10/2010)

item_key

branch_key

loca6on_key

units_sold

dollars_sold

avg_sales

Measures

branch_keybranch_namebranch_type

Branch

SalesFactTable item_keyitem_namebrandtypesupplier_type

item

loca6onloca6on_keyStreetcitystate_or_provincecountry

CarmenBarandela:"DataWarehousesandNoSQL”

Page 20: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

SnowflakeSchema

20

Source:hnp://www.execu6onmih.com/data‐warehouse/star‐snowflake‐schema.php(16/10/2010)

city_keycitystate_or_provincecountry

city

item_key

branch_key

loca6on_key

units_sold

dollars_sold

avg_sales

Measures

branch_keybranch_namebranch_type

Branch

SalesFactTableitem_keyitem_namebrandtypesupplier_key

item

loca6on

loca6on_keystreetcity_key

supplier_keySupplier_type

supplier

CarmenBarandela:"DataWarehousesandNoSQL”

Page 21: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

FromOpera6onstoRepor6ng

21

Source:hnp://www.deakin.edu.au/ddw/what‐is.php(16/10/2010)

ERP

FI

HR

CarmenBarandela:"DataWarehousesandNoSQL”

Page 22: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Analysis

•  DataMining•  Drilldown– Finerdetailgranularity(e.g.addagroup‐bycolumn)

•  Slice&dice– Playwiththedimensions•  Combinedifferentdimensions•  Remove/addadimension

•  Analysefactchanges

22CarmenBarandela:"DataWarehousesandNoSQL”

Page 23: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Agenda•  DataWarehousesinAdministra6veCompu6ng

•  Recap:DataWarehousesTheory

•  DataWarehousesandInforma6onSystemsinAIS–  Founda6on,HRandFIInforma6onSystems

–  ComplexDataExtrac6onProcesses–  Pixel‐PerfectRepor6ng– Dashboards

•  NoSQL–  ...

23CarmenBarandela:"DataWarehousesandNoSQL”

Page 24: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

CERN/AISBusinessMap

24CarmenBarandela:"DataWarehousesandNoSQL”

Page 25: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Founda6on•  CommondatalayerforvariousAISservices•  DatainterfacesforotherCERNservices•  Commonapplica6ons(e.g.mgmt.ofroles)

25

HRInformaOonSystem(HRT)

FIInformaOonSystem(CET)

…moredomainspecificinformaOonsystems

OperaOvesystems

CarmenBarandela:"DataWarehousesandNoSQL”

Page 26: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

VariousSpecialisedSystems

•  ORACLEHR•  CERNTrainingApplica6on•  Safety&accesssystems•  EDH(ElectronicDocumentHandling)•  Accoun6ngApplica6on•  ERPsystemforCERNstores•  Contractfollow‐up•  …

26CarmenBarandela:"DataWarehousesandNoSQL”

Page 27: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

TechnicalEnvironment

•  Sourcedatabases:– ORACLE10/11g– MicrosorExcel

•  HR/FIInforma6onSystems:– ORACLE10/11g–  JavaEnterprisewebapplica6ons– SAPBusinessObjectstoolfamily

27CarmenBarandela:"DataWarehousesandNoSQL”

Page 28: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

DataExtrac6ons

•  Nightlyscheduledbatchjobs•  Extrac6onsorganisedinSQLscripts•  Runbyself‐developed“batchrunner”– Controls•  Orderofexecu6on(sequen6al,parallel)•  Cri6cality•  Logging•  Problemescala6on(automa6cemails)

28CarmenBarandela:"DataWarehousesandNoSQL”

Page 29: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Defini6onofExtrac6onProcess(1)

29

Generaldefini6ons

CarmenBarandela:"DataWarehousesandNoSQL”

Page 30: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Defini6onofExtrac6onProcess(2)

30

Batches&commands

CarmenBarandela:"DataWarehousesandNoSQL”

Page 31: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

ImportanceofMonitoring

31

NewhardwareforDEVdatabases(gain>1h)

CarmenBarandela:"DataWarehousesandNoSQL”

Page 32: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

TurtleorLeopard?

32CarmenBarandela:"DataWarehousesandNoSQL”

Page 33: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

ORACLEMaterialisedViews

•  Pre‐aggregatedsummaries•  Benefitfromqueryrewrite

33

Source:ORACLE10gDocumenta6on/DataWarehousingGuide

CarmenBarandela:"DataWarehousesandNoSQL”

Page 34: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

ORACLEMaterialisedViews

•  Pre‐aggregatedsummaries•  Benefitfromqueryrewrite

34

Source:ORACLE10gDocumenta6on/DataWarehousingGuide

CarmenBarandela:"DataWarehousesandNoSQL”

Page 35: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Materialised(Summary)Views

•  Don’tuseremotetablesifyouneedqueryrewrite

•  Creatematerializedviewlogonallsourcetables

35CarmenBarandela:"DataWarehousesandNoSQL”

Page 36: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Materialised(Summary)Views

•  Don’tuseremotetablesifyouneedqueryrewrite

•  Creatematerializedviewlogonallsourcetables

36CarmenBarandela:"DataWarehousesandNoSQL”

Page 37: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Snapshots

•  Usesnapshotstoefficientlyaccessremotetables– Syntax:CREATESNAPSHOT…AS[YourQuery]– Refreshop6ons:•  FAST•  COMPLETE

•  FORCE

37CarmenBarandela:"DataWarehousesandNoSQL”

Page 38: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Snapshots

•  Usesnapshotstoefficientlyaccessremotetables– Syntax:CREATESNAPSHOT…AS[YourQuery]– Refreshop6ons:•  FAST•  COMPLETE

•  FORCE

38CarmenBarandela:"DataWarehousesandNoSQL”

Page 39: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

PipelinedFunc6ons

•  PL/SQLisdatasourceinsteadofatable•  MayincreaseperformanceinenvironmentswithheavyPL/SQLuse

39CarmenBarandela:"DataWarehousesandNoSQL”

Page 40: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

PipelinedFunc6ons

•  PL/SQLisdatasourceinsteadofatable•  MayincreaseperformanceinenvironmentswithheavyPL/SQLuse

40

CREATE OR REPLACE TYPE myTableFormat AS OBJECT( col_a NUMBER, col_b DATE, col_c VARCHAR2(25) ) /

CREATE OR REPLACE TYPE myTableType AS TABLE OF myTableFormat

/ CarmenBarandela:"DataWarehousesandNoSQL”

1

Page 41: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

PipelinedFunc6ons

41

CREATE OR REPLACE FUNCTION myFunc RETURN myTableType PIPELINED IS

BEGIN

FOR i in 1 .. 5 LOOP PIPE ROW ( myTableFormat( i, SYSDATE+i, 'Row '||i ) ); END LOOP; RETURN;

END;

END;

/

2

CarmenBarandela:"DataWarehousesandNoSQL”

Page 42: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

PipelinedFunc6ons

42

SELECT * FROM TABLE( myFunc() );

col_a col_b col_c --------- ---------- ---------- 1 27/10/2010 Row 1 2 28/10/2010 Row 2 3 29/10/2010 Row 3 4 30/10/2010 Row 4 5 31/10/2010 Row 5

3

Useapipelinedfunc.onifyourequireadatasourceotherthanatable!

CarmenBarandela:"DataWarehousesandNoSQL”

Page 43: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

DatabaseDesign

•  Starschemalike•  Highlyde‐normalisedincl.duplica6onofdata•  Usesingle‐anributekeyswhereverpossible•  Performancemaners!–  Becarefulwhenextrac6ngoverdatabaselinks–  Certaintablesfromopera6onalsystemsarecopied– Dele6on&recrea6onofindexes– Usepar66ons– Manualcontrolofsta6s6cscollec6on– Op6mizingexecu6onplansvery6me‐consuming

43CarmenBarandela:"DataWarehousesandNoSQL”

Page 44: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Repor6ngApplica6onFramework

•  Columnandorderingselec6on•  Subreports•  Variousoutputformats(e.g.HTML,PDF)

•  Charts•  Self‐servicerepor6ng•  Automatedscheduledreportexecu6on

•  Rowandcolumnbasedaccesscontrol

44CarmenBarandela:"DataWarehousesandNoSQL”

Page 45: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

DataAccess

45

Name Unit Tel Salary Category

Meyer A 12345 $4,900 3

Schmidt B 23456 $6,400 1

Cook B 34567 $5,700 2

Whichrowsarevisibletome?UnitleaderofBonlyseespersonsfromUnitB.

Whichdata(columns)amIallowedtosee?AsasupervisorImaynotbeen6tledtoseethehealthinsurancecategory.Asafetyormedicalofficermaynotseethesalary,etc.

CarmenBarandela:"DataWarehousesandNoSQL”

Page 46: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

UserInterface

46CarmenBarandela:"DataWarehousesandNoSQL”

Page 47: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

PixelPerfectForms•  UseofApacheFOPlibrary–  Examples:

•  Employment&traininganesta6ons•  Swiss/Frenchcardapplica6onforms

•  BusinessObjectsXIEnterprise– Directuse–  IndirectuseviaBusinessObjectsJavaSDK–  Examples:

•  Salaryslips•  Cars6ckers•  Workorders

47CarmenBarandela:"DataWarehousesandNoSQL”

Page 48: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

BusinessObjects

•  CommercialtoolfamilyfromSAP•  Advantages–  Richrepor6ngpossibili6es(interac6veorviaSDK)– AppealingdashboardsusingXcelsius– Onlyafewusersneedtheknowledgetodesignreports

•  Drawbacks–  Two‐waydatastorage(filesystem&database)–  Some6messtabilityproblems–  Time‐intensiveadministra6onandmaintenance–  Expensive

48CarmenBarandela:"DataWarehousesandNoSQL”

Page 49: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

ManagementDashboards

49

DesignedlocallyusingMSOfficeandXcelsius.

DatacomesfromtheMDLdatawarehouse.

PublishedasFlashtotheBOServer.

CarmenBarandela:"DataWarehousesandNoSQL”

Page 50: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Remember:

•  Highdatavolumes+analysis=datawarehouse

•  OLTPvs.OLAP•  Usethefacili6esthetoolprovides– Materializedviews,snapshots,pipelinedfunc6ons

•  Keepthingsextensibleandsimple!

•  Par66onsareveryhelpful

50CarmenBarandela:"DataWarehousesandNoSQL”

Page 51: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Agenda•  DataWarehousesinAdministra6veCompu6ng

•  Recap:DataWarehousesTheory

•  DataWarehousesandInforma6onSystemsinAIS–  Founda6on,HRandFIInforma6onSystems

–  ComplexDataExtrac6onProcesses–  Pixel‐PerfectRepor6ng– Dashboards

•  NoSQL–  ...

51CarmenBarandela:"DataWarehousesandNoSQL”

Page 52: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQL

52CarmenBarandela:"DataWarehousesandNoSQL”

NotOnlySQL

Page 53: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

WhyNoSQL?(1)Tradi6onalRDBMSsystems:‐Defini6onofarigidschema

‐GuaranteesACIDtransac6ons:

.Atomicity

.Consistency

.Isola6on

.Durability

CarmenBarandela:"DataWarehousesandNoSQL” 53

Page 54: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

WhyNoSQL?(1)Tradi6onalRDBMSsystems:‐Defini6onofarigidschema

‐GuaranteesACIDproper6es:

.Atomicity

.Consistency

.Isola6on

.Durability

Scalability

CarmenBarandela:"DataWarehousesandNoSQL” 54

Page 55: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

WhyNoSQL?(2)‐  Newwebappsdifferentneeds:.Scalability&elas6city(atlowcost!).Highavailability.Flexibleschemas

.Geographicdistribu6on

‐  Donotnecessarilyneed:.Strongconsistency/integrity.Complexqueries

CarmenBarandela:"DataWarehousesandNoSQL” 55

Page 56: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLusecases1. Massivedatavolumes.Massivelydistributedarchitecture.Google,Amazon,Yahoo,Facebook...

2.  Extremequeryworkload.Impossibletoefficientlydojoinsatthatscalewith

anRDBMS

3.  Schemaevolu6on.Schemaflexibilityisnottrivialatlargescale.Schemachangescanbegraduallyintroducedwith

NoSQL.

CarmenBarandela:"DataWarehousesandNoSQL” 56

Page 57: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLTaxonomy

•  Keyvaluestores•  Documentdatabases

•  Column‐orienteddatabases

•  Graphdatabases

CarmenBarandela:"DataWarehousesandNoSQL” 57

Page 58: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLTaxonomyKeyvaluestores

‐1key(uniqueid)‐>1value(binaryobject,blob)

‐TheDBdoesnotunderstandit:nodatamodel

‐Veryfast

‐Examples:AmazonDynamo,MemcacheDB

CarmenBarandela:"DataWarehousesandNoSQL” 58

Page 59: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLTaxonomyDocumentdatabases

‐  Key‐valuestore:thevalueisusuallystructured‐  Queryingwithmorethanakeyispossible

‐  Examples:AmazonSimpleDB,CouchDB...

CarmenBarandela:"DataWarehousesandNoSQL” 59

Page 60: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLTaxonomyColumn‐orienteddatabases

‐Distributed,persistentmul6dimensionalsortedmap.

‐Mapindexedby:rowkey,columnkeyanda6mestamp

‐Examples:GoogleBigTable,Cassandra(Facebook)

CarmenBarandela:"DataWarehousesandNoSQL” 60

Googlepaper:hnp://sta6c.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en//papers/bigtable‐osdi06.pdf

Page 61: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoSQLTaxonomyGraphdatabases

‐  Basedonthegraphtheory:‐Ver6ces:likeen66es

‐Edges:rela6onshipsbetweentheen66es

‐  Examples:Neo4j,FlockDB(Twiner)

CarmenBarandela:"DataWarehousesandNoSQL” 61

Page 62: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

CAPTheorem

CarmenBarandela:"DataWarehousesandNoSQL” 62

Consistency Availability

ParOOonTolerance

Page 63: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

CAPTheorem

CarmenBarandela:"DataWarehousesandNoSQL” 63

Consistency Availability

ParOOonTolerance

Only2canbesaOsfiedattheOme

Page 64: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoConsistency?

CarmenBarandela:"DataWarehousesandNoSQL”

Page 65: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

NoConsistency?

CarmenBarandela:"DataWarehousesandNoSQL”

EventualConsistency!!!

Page 66: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

EventualConsistency

Thestoragesystemguaranteesthatifnonewupdatesaremadetotheobjecteventuallyallaccesseswillreturnthelastupdatedvalue.

CarmenBarandela:"DataWarehousesandNoSQL” 66

Page 67: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

RDBMSvsNOSQL

•  StrongConsistency•  BigDatasets•  Scalingispossible•  SQL•  Goodavailability•  Consolidatedtech.

•  EventualConsistency•  HugeDatasets•  Scalingiseasy•  API•  Highavailability•  S6llimmaturetech.

CarmenBarandela:"DataWarehousesandNoSQL” 67

Page 68: Data Warehouses and NoSQL - ais-grid-2011.jinr.ruais-grid-2011.jinr.ru/docs/2011-10-24_jinr-cern_school_datawarehous… · • Data Warehouses in Administrave Compung • Recap: Data

Спасибо!•  Ques6ons?•  MoreInfo– [email protected]

CarmenBarandela:"DataWarehousesandNoSQL”