part 1 - data warehousing lecture at bw cooperative state university (dhbw)
TRANSCRIPT
![Page 1: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/1.jpg)
Andreas Buckenhofer
Data Warehouse (Datenbanken II)
![Page 2: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/2.jpg)
Daimler TSS GmbH
Zur Person
Data Warehouse / DHBW / Fall 2016 / Seite 2
https://de.linkedin.com/in/buckenhofer
https://twitter.com/ABuckenhofer
https://www.xing.com/profile/Andreas_Buckenhofer2
Andreas Buckenhofer
Senior DB Professional
Seit 2009 bei Daimler TSS
im Fachgebiet Big Data
Business Unit Analytics
![Page 3: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/3.jpg)
Daimler TSS GmbH
Interner IT-Partner für Daimler
Wir sind Spezialist und strategischer Business-Partner für
innovative IT-Gesamtlösungen im Daimler-Konzern.
Als 100%ige Daimler-Tochter leben wir die Kultur der Exzellenz
mit dem Anspruch der Innovations- und Technologieführerschaft.
Mit herausragender Technologie- und Methodenkompetenz
verstehen wir uns als Anbieter von wettbewerbs-
differenzierenden Dienstleistungen und sind
Impulsgeber in anspruchsvollen IT-Fragestellungen,
speziell in den Kernthemen Car IT und Mobility,
Information Security, Analytics und
Shared Services.
Data Warehouse / DHBW / Fall 2016 / Seite 3
![Page 4: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/4.jpg)
Daimler TSS GmbH
Standorte
Data Warehouse / DHBW / Fall 2016 / Seite 4
Daimler TSS Deutschland
6 Standorte
774 Mitarbeiter
Ulm (Hauptsitz)
Raum Stuttgart
Böblingen, Echterdingen,
Leinfelden, Möhringen
Berlin
Daimler TSS India
Hub Bangalore
19 Mitarbeiter
Daimler TSS China
Hub Beijing
6 Mitarbeiter
Daimler TSS Malaysia
Hub Kuala Lumpur
42 Mitarbeiter
![Page 5: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/5.jpg)
Daimler TSS GmbH Data Warehouse / DHBW / Fall 2016 / Page 5
Ganzheitliche Betreuung (Erhebung, Auswertung,
Visualisierung und Interpretation), unabhängige
Beratung und Optimierung der Geschäftsabläufe.
Von klassischer BI bis hin zu predictive und prescriptive
Analytics bieten wir Leistungen unter Berücksichtigung
der Datensicherheit.
Dabei verknüpfen wir fachliche Erfahrung und IT-Know-
how im Daimler-Kontext mit dem Blick fürs große
Ganze.
Analytics. Das große Ganze verstehen, um Daten nutzbar machen.
![Page 6: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/6.jpg)
Daimler TSS GmbH
DWH, Big Data, Data Mining
Data Warehouse / DHBW / Fall 2016 / Page 6
• This lecture is about the classical DWH
• 4 sessions
• Mr. Bollinger’s lecture is about Big Data and Data Mining
![Page 7: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/7.jpg)
Daimler TSS GmbH
Overview of the lecture
Data Warehouse / DHBW / Fall 2016 / Page 7
1. Introduction to DWH, DWH Architectures - 20.10.2016
2. Data Modeling, OLAP 1 - 27.10.2016
3. OLAP 2, ETL - 03.11.2016
4. Metadata, DWH Projects, Advanced Topics - 10.11.2016
![Page 8: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/8.jpg)
Daimler TSS GmbH
About this Lecture
Data Warehouse / DHBW / Fall 2016 / Page 8
• Structure of the lecture
• Review of the preceding lecture
• Presentation of content
• Group tasks, exercises
• 14:45 – 18:15
• 2x15min breaks
![Page 9: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/9.jpg)
Daimler TSS GmbH
What you will learn today
Data Warehouse / DHBW / Fall 2016 / Page 9
• Data Warehousing is a major topic of computer science.
• After the end of this lecture you will be able to
• Understand the basic business and technology drivers for data warehousing
• Describe the characteristics of a data warehouse
• Describe the differences between production and data warehouse systems
• Understand DWH architecture(s)
![Page 10: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/10.jpg)
Daimler TSS GmbH
Motivation
Data Warehouse / DHBW / Fall 2016 / Page 10
• Data as the new oil
• Data as Asset, getting more and more important also in production industries
• Not only classical data-intensive companies like Google or Facebook
• Increasing interest e.g. in insurance, health care, automotive, …
• Connected cars, Smart Home, Tailor-made insurances, etc.
• Hype technologies
• Databases (RDBMS, NoSQL)
• Big Data
• DWH still growing (additional stimuli coming also from Big Data, Analytics, NoSQL, IOT,
RealTime, TimeToMarket, etc)
![Page 11: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/11.jpg)
Daimler TSS GmbH
Many employment opportunities
Data Warehouse / DHBW / Fall 2016 / Page 11
• DWH department in every (bigger) end user company, also in many medium-sized or
small-sized companies
• DWH department in every (bigger) consulting company
• DWH-only specialized consulting companies
• DWH tool vendors
![Page 12: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/12.jpg)
Daimler TSS GmbH
Many employment opportunities with comprehensive
requirements
Data Warehouse / DHBW / Fall 2016 / Page 12
• Comprehensive requirements
• DWHs are complex
• Specialists (e.g. Data Integration, Data Visualization, Tool XYZ)
• Generalists (e.g. Data Architects)
![Page 13: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/13.jpg)
Daimler TSS GmbH
Job description examples
Data Warehouse / DHBW / Fall 2016 / Page 13
![Page 14: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/14.jpg)
Daimler TSS GmbH
Job description examples
Data Warehouse / DHBW / Fall 2016 / Page 14
![Page 15: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/15.jpg)
Daimler TSS GmbH
Job description examples
Data Warehouse / DHBW / Fall 2016 / Page 15
![Page 16: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/16.jpg)
Daimler TSS GmbH
• Often used as synonym
• DWH more technical focus
• BI more business / process focus
• “Business intelligence is a set of methodologies, processes, architectures, and
technologies that transform raw data into meaningful and useful information used
to enable more effective strategic, tactical, and operational insights and decision
making.” (Boris Evelson, Forrester Research, 2008)
Data Warehouse (DWH) or Business Intelligence (BI)?
Data Warehouse / DHBW / Fall 2016 / Page 16
![Page 17: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/17.jpg)
Daimler TSS GmbH
Information Technology (1960‘s – 80‘s)
Data Warehouse / DHBW / Fall 2016 / Page 17
Support of few business processes in separated applications
• applications separated
• target: replace manual and time consuming activities
• support daily transactions
• data embedded in process / process-specific application
• process-orientation
![Page 18: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/18.jpg)
Daimler TSS GmbH
Information Technology (1960‘s – 80‘s)
Data Warehouse / DHBW / Fall 2016 / Page 18
• Many systems throughout the enterprises for dedicated purposes
• Customer data, order data, etc spread over many systems in many variations and with
contradictions
![Page 19: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/19.jpg)
Daimler TSS GmbH
Information Technology (1960‘s – 80‘s)
Data Warehouse / DHBW / Fall 2016 / Page 19
• Storage of data
• mostly on tape
• very little space, therefore data encoded and compressed
• one separate tape for each process / application
• data spread
• later introduction of databases – however, no consolidation of data throughout the
company
![Page 20: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/20.jpg)
Daimler TSS GmbH
Information Technology (1960‘s – 80‘s)
Data Warehouse / DHBW / Fall 2016 / Page 20
• Databases accessed through common unique keys, numbers, primary keys, ...
• More and more applications access the different databases
• Complex structure of systems and databases, but no overall view of company
possible (silos)
![Page 21: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/21.jpg)
Daimler TSS GmbH
Information Technology (1960‘s – 80‘s)
Data Warehouse / DHBW / Fall 2016 / Page 21
• „Management Information Systems“ (MIS)
• since late 60es
• did not really work
• Can be characterized as “Unplanned decision support”
• Management needs reports / combined data from different systems to make
decisions for company
• Reports all manually written by IT people
• extract, combine, cache, accumulate data
• can take several days
• Error prone
• relevant information may be forgotten or combined in a wrong way
• No ad-hoc questions possible
![Page 22: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/22.jpg)
Daimler TSS GmbH
Major problems for effective decision support
Data Warehouse / DHBW / Fall 2016 / Page 22
• Distributed data
• Different data structures
• Historic data
• System workload
• Inadequate technology
![Page 23: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/23.jpg)
Daimler TSS GmbH
Distributed data
Data Warehouse / DHBW / Fall 2016 / Page 23
• Data resides on
• different systems / storages
• different applications
• different technologies
• Has to be accumulated on one system for further analysis
• Data is inhomogeneous, e.g. each system has their own customer number or order
number, etc.
• How to combine the data?
![Page 24: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/24.jpg)
Daimler TSS GmbH
Different data structures
Data Warehouse / DHBW / Fall 2016 / Page 24
• Systems developed independently from each other
• Different data types
• E.g.; zip-code as integer or character string
• Different encodings
• E.g.; m-f, m-w for gender
• Different field lengths
• E.g.; address field
• Different data modeling
• E.g. last name / first name in different fields vs last name / first name (badly
modelled) in one single field
![Page 25: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/25.jpg)
Daimler TSS GmbH
Issues with historic data
Data Warehouse / DHBW / Fall 2016 / Page 25
• Usually: Data archived after max. 3 months
• daily transactions produce lots of data
• limited size of storage → high amounts of data fill up systems
• For reports, already archived data may have to be accessed
![Page 26: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/26.jpg)
Daimler TSS GmbH
Issues with system workload
Data Warehouse / DHBW / Fall 2016 / Page 26
• Systems constructed for daily transaction business
• handle a constant (large) amount of transactions
• handle transactions concurrently
• each transaction accesses small amount of data
• very simple
• Reports:
• few reports at one time
• access lots of data
• not on a regular basis
• complex arithmetic operations
![Page 27: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/27.jpg)
Daimler TSS GmbH
Issues with system workload
Data Warehouse / DHBW / Fall 2016 / Page 27
• Consequences:
• systems stressed by additional load (due to reports)
• not optimized for this kind of workload
• performance of daily transaction business jeopardized
• may possibly lead to system failure!
![Page 28: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/28.jpg)
Daimler TSS GmbH
Inadequate technology
Data Warehouse / DHBW / Fall 2016 / Page 28
• Capacity and cost of hard disk drives
• No SAN, inflexible scaling of storage
• E.g. 1979: 8″-Winchester hard disk drive of IBM
• Capacity 5 MB ! Cost 10 000 DM.
• Performance and cost of processors
• No graphical front-ends
• Network technology in its beginnings
![Page 29: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/29.jpg)
Daimler TSS GmbH
Textual and visual Reports
Data Warehouse / DHBW / Fall 2016 / Seite 29
![Page 30: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/30.jpg)
Daimler TSS GmbH
Visual Data Integration Tool
Data Warehouse / DHBW / Fall 2016 / Seite 30
![Page 31: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/31.jpg)
Daimler TSS GmbH
Conclusion
Data Warehouse / DHBW / Fall 2016 / Page 31
• "operative systems" not suitable for analytical evaluations
• Need for a new, separated system
• fast answers, ad-hoc questions possible
• no interference with daily transaction business
� Data Warehouse
![Page 32: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/32.jpg)
Daimler TSS GmbH
Group Task
Data Warehouse / DHBW / Fall 2016 / Page 32
• Assume that you are the CIO of a bank, a retail company, a telco provider or an online
store.
• From what you have learned so far list possible (functional and non-functional)
requirements for a data warehouse. Think of the deficiencies of transactional
systems like
• Distributed data
• Different data structures
• Problem with historic data
• Problem with system workload
• Inadequate technology !
• What are requirements from an end-user perspective.
![Page 33: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/33.jpg)
Daimler TSS GmbH
Data Warehouse User
Data Warehouse / DHBW / Fall 2016 / Page 33
• Wants to access and analyze all important data from desktop
• Uses data analysis as facility for daily work
• Wants data to be fast and dynamically available
• Identifies data correlations by browsing through the data
• does possibly not know what (s)he is looking for
![Page 34: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/34.jpg)
Daimler TSS GmbH
Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 34
• Answer business-critical questions
• Decision support for management
• Overcomes difficulties when using existing transaction systems for those tasks
• Not a product, but a overall concept
• Applications come, applications go. The data, however, lives forever. It is not about
building applications; it really is about the data underneath these application (Tom
Kyte)
![Page 35: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/35.jpg)
Daimler TSS GmbH
Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 35
• Contains data from different systems
• Imports data from different systems on a regular basis
• summarize data
• provide historic data
• generate metadata
• Completely new system
![Page 36: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/36.jpg)
Daimler TSS GmbH
High-Level Data Warehouse Architecture
Data Warehouse / DHBW / Fall 2016 / Page 36
Staging
OLTP
OLTP
OLTP
Core Warehouse Mart
Data Warehouse
![Page 37: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/37.jpg)
Daimler TSS GmbH
Definitions DWH
Data Warehouse / DHBW / Fall 2016 / Page 37
Ralph Kimball Bill Inmon
„A data warehouse is a copy of transaction
data specifically structured for querying
and reporting“
“A data warehouse is a subject-oriented,
integrated, time-variant, nonvolatile
collection of data in support of
management’s decision-making process”
![Page 38: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/38.jpg)
Daimler TSS GmbH
• A data warehouse is organized around the major subjects of the enterprise like
• Customer
• Vendor
• Car
• In contrast to the process/functional orientation of applications such as
• Offer
• Booking
• Fulfillment
• Delivery
Subject-oriented
Data Warehouse / DHBW / Fall 2016 / Page 38
![Page 39: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/39.jpg)
Daimler TSS GmbH
• Data contained in the warehouse are integrated.
• Aspects of integration
• consistent naming conventions,
• consistent measurement of variables,
• consistent encoding structures,
• consistent physical attributes of data
• „One version of truth“
Integrated
Data Warehouse / DHBW / Fall 2016 / Page 39
![Page 40: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/40.jpg)
Daimler TSS GmbH
• All data in the data warehouse is accurate as of some moment in time.
• Has to be associated with a time stamp.
• In the operational environment data is accurate as of the moment of access.
• Once data is correctly recorded in the data warehouse, it cannot be updated or
deleted.
• Data warehouse data is, for all practical purposes, a long series of snapshots.
• If data is deleted in the OLTP system, the data is the DWH is marked with e.g. a
deletion flag but never actually deleted.
• Operational data, being accurate as of the moment of access, can be updated as the
need arises.
Time-variant
Data Warehouse / DHBW / Fall 2016 / Page 40
![Page 41: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/41.jpg)
Daimler TSS GmbH
• Operations in operational environment
• insert
• delete
• update
• select
• Operations in a data warehouse
• the initial and additional loading of data by (batch) processes
• the access of data
• (almost) no updates and deletes (technical updates / deletes only)
Nonvolatile
Data Warehouse / DHBW / Fall 2016 / Page 41
![Page 42: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/42.jpg)
Daimler TSS GmbH
Online Transaction Processing Online Analytical Processing
Transaction-oriented system Query-oriented system
Optimized for insert and update Optimized for complex queries with short
response times; ad-hoc queries
Many users change data Only ETL process writes data
Selective queries on the data Evaluations of all data (complex queries)
Avoid redundancy Redundant data storage
Normalized data management De-normalized data management
Relational Data Modeling Several layers with different data models,
one model usually Dimensional Data
Modeling
OLTP vs OLAP
Data Warehouse / DHBW / Fall 2016 / Page 42
![Page 43: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/43.jpg)
Daimler TSS GmbH
Operative data Dispositive data
Handling structured, parallel processes
with short and isolated ("atomic")
transactions
information for management (decision
support)
Modeling process- and function oriented,
individual for each application
Different data models in one DWH;
historic, stable and summarized, data
# users many few(er) but increasing user base
System
return time
Milliseconds Seconds to minutes (even hours)
Operative vs dispositive data
Data Warehouse / DHBW / Fall 2016 / Page 43
![Page 44: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/44.jpg)
Daimler TSS GmbH
Operative DBs Dispositive DBs
Purpose processing of daily business
transactions
information for management (decision
support)
Content detailed, complete, most recent
data
historic, stable and summarized, data
Data
amount
small amount of data per
transaction
large amount of data for load, and often
per query
Data
structure
suitable for operational
transactions
Several models; suitable for long term
storage and business analyses
Trans
actions
ACID; very short read/write
transactions
long load operations, longer read
transactions
Operative vs dispositive databases
Data Warehouse / DHBW / Fall 2016 / Page 44
![Page 45: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/45.jpg)
Daimler TSS GmbH
• Information oriented
• Analysis oriented
• Campaign oriented
• Planning oriented
Data Warehousing Usage Types
Data Warehouse / DHBW / Fall 2016 / Page 45
![Page 46: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/46.jpg)
Daimler TSS GmbH
Information oriented
Data Warehouse / DHBW / Fall 2016 / Page 46
• Computation of various key figures
• Revenue, profit, ..
• Used for „business performance management“
• User groups
• Higher management
• Controlling
• Mostly fixed reports „standard reporting“
• No ad hoc queries
• 70 – 80 % of all users of a data warehouse
• Web-based front-end
• Enterprise Portals
![Page 47: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/47.jpg)
Daimler TSS GmbH
Analysis oriented applications
Data Warehouse / DHBW / Fall 2016 / Page 47
• Purpose analysis of complex problems
• Example: analysis of quality problems
• Using the information of warranty cases to find underlying reasons for product
failure
• Identification of outliers or quality issues
![Page 48: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/48.jpg)
Daimler TSS GmbH
Campaign oriented applications
Data Warehouse / DHBW / Fall 2016 / Page 48
• Executed as specific projects
• Campaign planning
• Campaign execution
• Campaign evaluation of results
• Requires
• a broad spectrum of data
• Not all data may be in a data warehouse
• Sophisticated analysis techniques
• Data mining
• Examples
• Marketing campaigns
• Churn prevention
![Page 49: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/49.jpg)
Daimler TSS GmbH
Planning Oriented
Data Warehouse / DHBW / Fall 2016 / Page 49
• Use data from past to predict trends
• Using trends in the historical data for setting up plan figures
• Futures product sales
• For production planning
• Stock replenishment
• ...
![Page 50: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/50.jpg)
Daimler TSS GmbH
Exercise
Data Warehouse / DHBW / Fall 2016 / Page 50
• For one of the following companies
• Bank, Telecommunication company, Online store (like Amazon.com), Car
manufacturer, Supermarket
1. Outline the operational systems
• characterize which operations are performed by them
• mostly triggered by actions of a customer
• which information is stored by these systems
• which questions can be answered by these systems
2. For a data warehouse
• describe what information can be stored in it
• which questions can/should be answered with this information
![Page 51: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/51.jpg)
Daimler TSS GmbH
Data Warehouse
FrontendBackend
External data
sources
Internal data
sources
Standard Data Warehouse Architecture
Data Warehouse / DHBW / Fall 2016 / Page 51
Staging Layer
(Input Layer)
Core Warehouse
Layer
(Storage Layer)
Reporting Layer
(Output Layer)
(Mart Layer)
![Page 52: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/52.jpg)
Daimler TSS GmbH
Data Sources
Data Warehouse / DHBW / Fall 2016 / Page 52
� Providing internal and external data out of the source systems.
� Enabling data through Push (source is generating extracts) or Pull (BI Data Backend is
requesting or directly accessing data).
� Example for Push practice (deliver csv or text data through file interface; Change
Data Capture (CDC)).
� Example for Pull practice (direct access to the source system via ODBC, JDBC, API
and so on).
![Page 53: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/53.jpg)
Daimler TSS GmbH
Staging Layer
Data Warehouse / DHBW / Fall 2016 / Page 53
� Major target is to decouple source and target system (repeating extraction run,
additional delivery).
� Granular data (no pre-aggregation or filtering in the Data Source Layer, i.e. the source
system).
� Usually not persistent, therefore regular housekeeping is necessary (for instance after a
few days or weeks or at the latest once a correct upload to Core Warehouse Layer is
ensured).
![Page 54: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/54.jpg)
Daimler TSS GmbH
Core Warehouse Layer
Data Warehouse / DHBW / Fall 2016 / Page 54
� Data storage in an integrated, consolidated, consistent and often non-redundant
(normalized) data model.
� Application / Reporting neutral data storage on the most detailed level of granularity
(incl. historic data).
� Size of database can be several TB and can grow rapidly due to data historization.
![Page 55: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/55.jpg)
Daimler TSS GmbH
Data Mart Layer
Data Warehouse / DHBW / Fall 2016 / Page 55
� For performance reasons the Data Mart Layer is providing aggregated data in a
denormalized data model.
� Created through filtering or aggregating the Core Warehouse Layer content on specific
dimensions (e.g. time, region), granularity or business content specific aspects.
� Technically the Data Mart Layer can also be a part of an Analytical Frontend product
(such as QlickTech QlikView or IBM Cognos TM1) and need not to be stored in a
relational database.
� The Data Mart Layer also contains security permissions as end users are granted
access to this layer only.
![Page 56: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/56.jpg)
Daimler TSS GmbH
Other architectures
Data Warehouse / DHBW / Fall 2016 / Page 56
• Independent Data Marts
• Central data warehouse based on data marts (Kimball approach)
• Virtual Data Warehouse
• Data Vault 2.0 Architecture (Dan Linstedt)
• DW 2.0: The Architecture for the Next Generation of Data Warehousing (Bill Inmon)
![Page 57: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/57.jpg)
Daimler TSS GmbH
Data Warehouse
Frontend
Frontend
Backend
Backend
FrontendBackend
External data
sources
Internal data
sources
Independent Data Marts
Data Warehouse / DHBW / Fall 2016 / Page 57
Data Mart 1
Data Mart 2
Data Mart 3Staging Layer
(Input Layer)
Core Warehouse
Layer
(Storage Layer)
![Page 58: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/58.jpg)
Daimler TSS GmbH
Independent Data Marts
Data Warehouse / DHBW / Fall 2016 / Page 58
• Situation is often encountered in enterprises
• Data marts created to solve specific analytical requirements
• Fast implementation and lower cost
• Often Staging Layer and/or Core Warehouse Layer omitted
• Not recommended
• Possible problems
• No integration with other data marts
• No common data model for all data marts
• Cross data mart queries difficult
![Page 59: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/59.jpg)
Daimler TSS GmbH
Data Warehouse
FrontendBackend
External data
sources
Internal data
sources
Central data warehouse based on data marts (Kimball
approach)
Data Warehouse / DHBW / Fall 2016 / Page 59
Core Warehouse Layer = Mart Layer
Data Mart 1
Data Mart 2Data Mart 3
Staging Layer
(Input Layer)
![Page 60: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/60.jpg)
Daimler TSS GmbH
Central data warehouse based on data marts (Kimball
approach)
Data Warehouse / DHBW / Fall 2016 / Page 60
• Bottom-up approach
• Leverages existing data marts and integration processes
• Data marts may need to be redesigned if incompatibilities between the data models
exist
• Data mart changes can have an impact on the central data warehouse
• Enterprise service bus / conformed dimensions for integration purposes
![Page 61: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/61.jpg)
Daimler TSS GmbH
Data Warehouse
FrontendBackend
External data
sources
Internal data
sources
Virtual Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 61
Query Management
![Page 62: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/62.jpg)
Daimler TSS GmbH
Virtual Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 62
• Data not extracted from operational systems and stored separately
• Standardized interface for all operational data sources
• One "GUI" for all existing data
• Generates combined queries
• Query Processor joins query result data from different sources
• Can also access data in Hadoop (Polybase, Big SQL, BigData SQL, etc)
![Page 63: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/63.jpg)
Daimler TSS GmbH
Virtual Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 63
• Query Management manages metadata about all operational systems
• (physical) location of data
• Algorithms for extracting data from operational system
• Implementation easier
• Low cost: can use existing hardware infrastructure
• Optimized access to existing infrastructure
![Page 64: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/64.jpg)
Daimler TSS GmbH
Virtual Data Warehouse
Data Warehouse / DHBW / Fall 2016 / Page 64
• Queries cause significant performance problems in operational systems
• Known problems when analyzing operational data directly
• Same query is processed multiple times (if queried multiple times)
• Same query delivers different results when processed at different times
![Page 65: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/65.jpg)
Daimler TSS GmbH
Data Warehouse
FrontendBackend
External data
sources
Internal data
sources
Data Vault 2.0 Architecture (Dan Linstedt)
Data Warehouse / DHBW / Fall 2016 / Page 65
Staging Layer
(Input Layer)
Core Warehouse
Layer
(Storage Layer)
Reporting Layer
(Output Layer)
(Mart Layer)
![Page 66: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/66.jpg)
Daimler TSS GmbH
Data Vault 2.0 Architecture
Data Warehouse / DHBW / Fall 2016 / Page 66
• Core Warehouse Layer is modeled with Data Vault
• Business rules are applied from Core Warehouse Layer to Mart Layer and not earlier
• Or alternatively within Core Warehouse Layer (Raw Data Vault containing data
without transformations and Business Data Vault containing transformed data)
• Data is fully auditable
• Real-time capable architecture
![Page 67: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/67.jpg)
Daimler TSS GmbH
Data Vault 2.0 Architecture
Data Warehouse / DHBW / Fall 2016 / Page 67
• Data Vault is optimized for the following requirements:
• Flexibility
• Agility
• Data historization
• Data integration
• Auditability
• Bill Inmon wrote in 2008: “Data Vault is the optimal approach for modeling the EDW in
the DW2.0 framework.” (DW2.0)
![Page 68: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/68.jpg)
Daimler TSS GmbH
DW 2.0: The Architecture for the Next Generation of Data
Warehousing (Bill Inmon)
Data Warehouse / DHBW / Fall 2016 / Page 68
![Page 69: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/69.jpg)
Daimler TSS GmbH
DW 2.0: The Architecture for the Next Generation of Data
Warehousing (Bill Inmon)
Data Warehouse / DHBW / Fall 2016 / Page 69
• Structured and unstructured data
• Different storage areas
• Hot data: High speed, expensive storage (RAM, SSDs) for most recent data
• …
• Cold data: Low speed, inexpensive storage (e.g. hard disks) for old data
![Page 70: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/70.jpg)
Daimler TSS GmbH
Architecture from an actual Project
Data Warehouse / DHBW / Fall 2016 / Seite 70
ETL Engine
Fro
nte
nd
Standard
Reports
AdHoc
ReportsLogs
TSM
IIDR
ReplEngine
Source
Datastore
Source
Spiegel
DB
OLTP
DB
IIDR ReplEngine
Spiegel
Datastore
Spiegel
IIDR ReplEngine
DWH
Datastore
DWH
DWH DB
Staging Layer
Core Layer / Data Vault
Mart Layer
![Page 71: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/71.jpg)
Daimler TSS GmbH
Group Task
Data Warehouse / DHBW / Fall 2016 / Page 71
• Review the presented data warehouse architectures.
• Which architecture would you recommend for
• A multinational telco provider
• A holding of 3 car manufacturers
• An online store represented in almost all countries
• A machine tools manufacturer with plants in Europe, America and Asia
• You may recommend a combination of the architecures or you may recommend as well
different architectures for different subject areas
• List advantages and drawbacks of your proposal.
![Page 72: Part 1 - Data Warehousing Lecture at BW Cooperative State University (DHBW)](https://reader031.vdocuments.us/reader031/viewer/2022022203/5871116b1a28abac6d8b5de1/html5/thumbnails/72.jpg)
Thank you!
Daimler TSS GmbH
Wilhelm-Runge-Straße 11, 89081 Ulm, Germany / Phone +49 731 505-06 / Fax +49 731 505-65 99
[email protected] / Internet: www.daimler-tss.com / Intranet portal code: @TSS
Domicile and Court of Registry: Ulm / Commercial Register No.: 3844 / Management: Christoph Röger (Vorsitzender), Steffen Bäuerle