opensap hana1 warmup unit 01
DESCRIPTION
OpenSAP HANA1 WarmUp Unit 01
TRANSCRIPT
-
Lecture: Trends and Concepts in the Software Industry
Introduction
Dr.-Ing. Jürgen Müller
Course Statistics (http://www.openHPI.de)
13,126 participants, 4,161 active contributors in discussions
6 course modules for 6 weeks
35 individual module items (videos / slides / reading material)
300 exam / assignment / self-test questions
309 slides explicitly created for this course
215 pages book reading material
8h cut video material (12h raw material)
Teaching Team: 4 colleagues
[Figure: Number of Participants in Assignments & Exam (axis 0 to 4000) and Average Results in % (axis 0 to 100), for Week 1 through Week 6 and the Exam]
-
Goals
Deep technical understanding of a column-oriented dictionary-encoded in-memory database and its application in enterprise computing
The future of enterprise computing
Foundations of database storage techniques
In-memory database operators
Advanced database storage techniques
Foundations for a new enterprise application development era
December 2006: The Basic Idea
-
Learning Map
Foundations for a New Enterprise Application Development Era
Foundations of Database Storage Techniques
The Future of Enterprise Computing
Advanced Database Storage Techniques
In-Memory Database Operators
Chapter 1: The Future of Enterprise Computing
-
Dr.-Ing. Jürgen Müller
New Requirements for Enterprise Computing
Sensors
Tracing Pharmaceutical Packages in Europe
15 billion packages / 34 billion read events per year
Distributed repositories for storing read events
References to read events are stored in central discovery service
Monitoring F1 Car Performance
Between 300 and 600 sensors per car
Multiple events per second per sensor
Tracking every Grand Prix lap or test run
-
Combination of Structured and Unstructured Data
Airplane Maintenance at Boeing
Maintenance workers at Boeing write reports after repairs
Reports and mentioned part numbers get indexed
Analytics reveal which parts in other planes may be defective
Medical Diagnosis at Charité
Doctors write medical reports after every diagnosis
Diagnosis and mentioned symptoms get indexed
Comparison with similar cases for optimal treatment
Mobile
Mobile inverts traditional corporate structures
Enables customer-facing personnel
Response times < 1 second
Example: Dunning
[Figure: corporate pyramid with the labels CEO, Strategy, Consolidation, Controlling, Operations, Sales / Service, and Consumers / Customers]
-
Enterprise Application Characteristics
Dr.-Ing. Jürgen Müller
-
Challenge: Diverse Applications
Transactional Data Entry
Sources: machines, transactional apps, user interaction, etc.
Text Analytics, Unstructured Data
Sources: web, social, logs, support systems, etc.
Event Processing, Stream Data
Sources: machines, sensors, high-volume systems
Real-time Analytics, Structured Data
Sources: reporting, classical analytics, planning, simulation
Data Management
CPUs (Multi-Core + Cache)
Main Memory
OLTP vs. OLAP
Online Transaction Processing (OLTP)
Online Analytical Processing (OLAP)
Modern enterprise resource planning (ERP) systems are challenged by mixed workloads, including OLAP-style queries. For example:
OLTP-style: create sales order, invoice, accounting documents, display customer master data or sales order
OLAP-style: dunning, available-to-promise, cross selling, operational reporting (list open sales orders)
But: Today's data management systems are optimized either for daily transactional or for analytical workloads, storing their data along rows or along columns
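The row versus column distinction can be illustrated with a small sketch. This is only an illustration in plain Python, not how any particular database stores data; the sales-order table, its attribute names, and all function names are made up for the example.

```python
# Hypothetical sales-order table with three attributes (illustrative data).
rows = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "B", "amount": 80.0},
    {"order_id": 3, "customer": "A", "amount": 50.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per attribute."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)


def display_order(rows, position):
    """OLTP-style: fetch one complete record (a single access in row layout)."""
    return rows[position]


def reconstruct_order(columns, position):
    """The same record from column layout: one access per attribute."""
    return {name: values[position] for name, values in columns.items()}


def total_amount(columns):
    """OLAP-style: aggregate one attribute by scanning a single column."""
    return sum(columns["amount"])
```

In this sketch the row layout favors `display_order`, while the column layout favors `total_amount`, which touches only the one attribute it needs.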
-
Drawbacks of the Separation
OLAP system does not have the latest data
OLAP system has only a predefined subset of the data
Cost-intensive ETL process has to sync both systems
There is a lot of redundancy
Different data schemas introduce complexity for applications combining sources
OLTP Access Pattern Myth
Workload analysis of enterprise systems shows: OLTP and OLAP workloads are not that different
-
Vision
Combine OLTP and OLAP data using modern hardware and database systems to create a single source of truth, enable real-time analytics, and simplify applications and database structures.
Additionally, extraction, transformation, and loading (ETL) processes, pre-computed aggregates, and materialized views become obsolete.
Enterprise Data Characteristics
Many columns are not used even once
Many columns have a low cardinality of values
NULL values/default values are dominant
Sparse distribution facilitates high compression
Standard enterprise software data is sparse and wide
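Low cardinality is what makes the dictionary encoding named in the course goals pay off. The following is a minimal sketch of the general technique, not SanssouciDB's actual implementation; the function names and the sample column are assumptions made for illustration.

```python
def dictionary_encode(values):
    """Replace each value by its position in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    position = {value: i for i, value in enumerate(dictionary)}
    attribute_vector = [position[value] for value in values]
    return dictionary, attribute_vector


def bits_per_entry(dictionary):
    """Bits needed to address every dictionary entry (at least 1)."""
    return max(1, (len(dictionary) - 1).bit_length())


def decode(dictionary, attribute_vector):
    """Recover the original column from dictionary and attribute vector."""
    return [dictionary[value_id] for value_id in attribute_vector]


# Hypothetical low-cardinality column: 8 entries, 3 distinct values.
country = ["DE", "US", "DE", "DE", "FR", "US", "DE", "DE"]
dictionary, attribute_vector = dictionary_encode(country)
```

Here each entry of the attribute vector needs only `bits_per_entry(dictionary)` bits instead of a full string, and the saving grows as the number of rows increases relative to the number of distinct values.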
-
Low Cardinality of Values Within Many Columns
Results from analyzing financials
Distinct values in accounting document headers (99 attributes)
[Figure: chart covering the industries CPG, Logistics, High tech, Discrete manufacturing, and Banking]
Many Columns are not Used Even Once
55% unused columns per company on average
40% unused columns across all companies
[Figure: bar chart of % of Columns (axis 0% to 80%) by Number of Distinct Values (buckets 1 - 32, 33 - 1023, 1024 - 100000000) for Inventory Management and Financial Accounting; value labels in extraction order: 13%, 9%, 78% and 24%, 12%, 64%]
-
Wide Tables
Analysis of width of 144 most used* tables
* Largest in terms of cardinality
[Figure: histogram of # Tables (axis 0 to 30) by # Columns (buckets from 1-9 up to 399)]
-
Changes in Hardware
Dr.-Ing. Jürgen Müller
Advances in Hardware
64-bit address space: 4 TB in current server boards
25 GB/s data throughput, CPU to DRAM
Cost-performance ratio rapidly declining
Multi-core architecture: 8 x (8-16) core CPU per blade
Parallel scaling across blades
One blade $50,000 = 1 enterprise-class server
Copy 50 GB of data via InfiniBand: 10 s
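As a back-of-the-envelope check, the hardware figures on this slide can be turned into transfer rates and scan times. The sketch below assumes decimal units (1 GB = 10^9 bytes) and treats the quoted numbers as sustained rates, which the slide does not state; the function and variable names are illustrative.

```python
GB = 10**9   # assumption: decimal gigabyte
TB = 10**12  # assumption: decimal terabyte


def transfer_seconds(num_bytes, bytes_per_second):
    """Time to move num_bytes at a constant rate."""
    return num_bytes / bytes_per_second


# Rate implied by "copy 50 GB via InfiniBand in 10 s".
infiniband_bytes_per_second = 50 * GB / 10

# Time for one CPU to stream 4 TB of main memory at 25 GB/s.
full_memory_scan_seconds = transfer_seconds(4 * TB, 25 * GB)

print(infiniband_bytes_per_second / GB)  # 5.0 (GB/s)
print(full_memory_scan_seconds)          # 160.0 (seconds)
```

Under these assumptions, a single sequential pass over a fully populated 4 TB board takes on the order of minutes, which is one reason the slide also lists multi-core CPUs and parallel scaling across blades.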
-
Memory Hierarchy
[Figure: hierarchy with the levels CPU Registers, CPU Caches, Main Memory, Flash, Hard Disk; arrows labeled "Higher Performance" and "Lower Price / Higher Latency"]
Latency Numbers
L1 cache reference (cached data word)             0.5 ns
Branch mispredict                                   5 ns
L2 cache reference                                  7 ns
Mutex lock/unlock                                  25 ns
Main memory reference                             100 ns     0.1 µs
Send 2K bytes over 1 Gb/s network              20,000 ns      20 µs
SSD random read                               150,000 ns     150 µs
Read 1 MB sequentially from memory            250,000 ns     250 µs
Disk seek                                  10,000,000 ns      10 ms
Send packet CA to Netherlands to CA       150,000,000 ns     150 ms
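The gaps between the levels are easier to appreciate as ratios. The short sketch below only re-uses a few of the latency figures listed above; the dictionary and function names are illustrative.

```python
# Latencies in nanoseconds, copied from the slide.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
    "Disk seek": 10_000_000,
}


def ratio(slow, fast):
    """How many times longer the slow operation takes than the fast one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


print(ratio("Main memory reference", "L1 cache reference"))       # 200.0
print(ratio("Disk seek", "Main memory reference"))                # 100000.0
print(ratio("Disk seek", "Read 1 MB sequentially from memory"))   # 40.0
```

By these numbers, a single disk seek costs as much as reading about 40 MB sequentially from main memory, and a main memory reference costs about 200 L1 cache references.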
-
A Blueprint of SanssouciDB
Dr.-Ing. Jürgen Müller
-
SanssouciDB: An In-Memory Database for Enterprise Applications
[Figure: SanssouciDB architecture. Interface Services and Session Management; Query Execution, Metadata, TA Manager; Distribution Layer at Blade i; Main Memory at Blade i holding Active Data (Main Store and Differential Store, connected by Merge, each with Column and Combined Column structures) and Indexes (Inverted, Object Data Guide); Non-Volatile Memory holding Log, Snapshots, and Passive Data (History); functions: Recovery, Logging, Time travel, Data aging]
In-Memory Database (IMDB)
Data resides permanently in main memory
Main memory is the primary persistence
Still: logging to disk / recovery from disk
Main memory access is the new bottleneck
Cache-conscious algorithms / data structures are crucial (locality is king)
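Why locality matters can be estimated by counting the cache lines a scan of one attribute has to load. The sketch below is a simplified model, not a measurement: it assumes 64-byte cache lines, aligned fixed-width values, and no prefetching effects, and the table dimensions and function names are made up for the example.

```python
import math

CACHE_LINE_BYTES = 64  # assumption: typical cache line size


def lines_touched_column_layout(num_rows, value_bytes):
    """Values of one attribute are contiguous, so every loaded byte is useful."""
    return math.ceil(num_rows * value_bytes / CACHE_LINE_BYTES)


def lines_touched_row_layout(num_rows, row_bytes):
    """One value per row is needed, but rows lie row_bytes apart in memory."""
    if row_bytes <= CACHE_LINE_BYTES:
        # Several rows share a cache line, so the whole table gets loaded.
        return math.ceil(num_rows * row_bytes / CACHE_LINE_BYTES)
    # Rows are wider than a cache line: about one line per row.
    return num_rows


# Hypothetical table: 1 million rows, 100 attributes of 4 bytes each.
column_lines = lines_touched_column_layout(1_000_000, 4)
row_lines = lines_touched_row_layout(1_000_000, 400)
```

In this model the column layout loads 62,500 cache lines and the row layout 1,000,000, a factor of 16, which is exactly the number of 4-byte values that fit into one 64-byte line.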
Main Memory at Server i
Distribution Layer at Server i