opensap hana1 warmup unit 01

Lecture: Trends and Concepts in the Software Industry. Introduction. Dr.-Ing. Jürgen Müller. Course Statistics (http://www.openHPI.de): 13,126 participants, 4,161 active contributors in discussions; 6 course modules for 6 weeks; 35 individual module items (videos / slides / reading material); ≈ 300 exam / assignment / self-test questions; 309 slides explicitly created for this course; 215 pages of book reading material; ≈ 8h cut video material (12h raw material). Teaching Team: 4 colleagues.

Upload: aabid-abdul-majeed

Post on 23-Nov-2015

DESCRIPTION

OpenSAP HANA1 WarmUp Unit 01

TRANSCRIPT

  • Lecture: Trends and Concepts in the Software Industry

    Introduction

    Dr.-Ing. Jürgen Müller

    Course Statistics (http://www.openHPI.de)

    13,126 participants, 4,161 active contributors in discussions
    6 course modules for 6 weeks
    35 individual module items (videos / slides / reading material)
    300 exam / assignment / self-test questions
    309 slides explicitly created for this course
    215 pages of book reading material
    8h cut video material (12h raw material)

    Teaching Team: 4 colleagues

    [Chart: Number of Participants in Assignments & Exam and Average Results in %, for Week 1 to Week 6 and the Exam]

  • Goals

    Deep technical understanding of a column-oriented, dictionary-encoded in-memory database and its application in enterprise computing

    The future of enterprise computing
    Foundations of database storage techniques
    In-memory database operators
    Advanced database storage techniques
    Foundations for a new enterprise application development era

    December 2006: The Basic Idea

  • Learning Map

    The Future of Enterprise Computing
    Foundations of Database Storage Techniques
    In-Memory Database Operators
    Advanced Database Storage Techniques
    Foundations for a New Enterprise Application Development Era

    Chapter 1: The Future of Enterprise Computing

  • Dr.-Ing. Jürgen Müller

    New Requirements for Enterprise Computing

    Sensors

    Tracing Pharmaceutical Packages in Europe
    15 billion packages / 34 billion read events per year
    Distributed repositories for storing read events
    References to read events are stored in a central discovery service

    Monitoring F1 Car Performance
    Between 300 and 600 sensors per car
    Multiple events per second per sensor
    Tracking every Grand Prix lap or test run

  • Combination of Structured and Unstructured Data

    Airplane Maintenance at Boeing
    Maintenance workers at Boeing write reports after repairs
    Reports and mentioned part numbers get indexed
    Analytics reveal which parts in other planes may be defective

    Medical Diagnosis at Charité
    Doctors write medical reports after every diagnosis
    Diagnosis and mentioned symptoms get indexed
    Comparison with similar cases for optimal treatment

    Mobile

    Mobile inverts traditional corporate structures
    Enables customer-facing personnel
    Response times < 1 second
    Example: Dunning

    [Figure: corporate structure diagram with the labels Sales, Service; Operations; Controlling; Consolidation; Strategy; Consumers / Customers; CEO]

  • Enterprise Application Characteristics

    Dr.-Ing. Jürgen Müller

  • Challenge: Diverse Applications

    Transactional Data Entry. Sources: machines, transactional apps, user interaction, etc.
    Text Analytics, Unstructured Data. Sources: web, social, logs, support systems, etc.
    Event Processing, Stream Data. Sources: machines, sensors, high-volume systems
    Real-time Analytics, Structured Data. Sources: reporting, classical analytics, planning, simulation

    Data Management
    CPUs (multi-core + cache)
    Main Memory

    OLTP vs. OLAP

    Modern enterprise resource planning (ERP) systems are challenged by mixed workloads, including OLAP-style queries. For example:
    OLTP-style: create sales orders, invoices, and accounting documents; display customer master data or sales orders
    OLAP-style: dunning, available-to-promise, cross-selling, operational reporting (list open sales orders)

    But: Today's data management systems are optimized either for daily transactional or for analytical workloads, storing their data along rows or columns.

    OLTP: Online Transaction Processing
    OLAP: Online Analytical Processing
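    The difference between storing data along rows or along columns can be sketched in a few lines of Python. The table, its attribute names, and its values are hypothetical and not taken from the lecture; the sketch only illustrates why displaying one record suits a row layout while an aggregate over one attribute suits a column layout.

```python
# Hypothetical sales-order table; names and values are illustrative only.
rows = [
    # (order_id, customer, amount)
    (1, "ACME", 120.0),
    (2, "Globex", 80.0),
    (3, "ACME", 200.0),
]

# Row layout: all attributes of one record are stored together.
row_store = list(rows)

# Column layout: all values of one attribute are stored together.
column_store = {
    "order_id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def display_order_row(store, position):
    """OLTP-style access on the row layout: one lookup returns the record."""
    return store[position]


def display_order_column(store, position):
    """The same record from the column layout: one lookup per attribute."""
    return tuple(store[name][position] for name in ("order_id", "customer", "amount"))


def total_amount_row(store):
    """OLAP-style access on the row layout: every record is visited."""
    return sum(record[2] for record in store)


def total_amount_column(store):
    """OLAP-style access on the column layout: only one column is scanned."""
    return sum(store["amount"])
```

    Both layouts return the same answers; what differs is how much unrelated data has to be touched for each kind of request.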

  • Drawbacks of the Separation

    OLAP system does not have the latest data
    OLAP system only has a predefined subset of the data
    Cost-intensive ETL process has to synchronize both systems
    There is a lot of redundancy
    Different data schemas introduce complexity for applications combining sources

    OLTP Access Pattern Myth

    Workload analysis of enterprise systems shows: OLTP and OLAP workloads are not that different

  • Vision

    Combine OLTP and OLAP data using modern hardware and database systems to create a single source of truth, enable real-time analytics, and simplify applications and database structures.

    Additionally, extraction, transformation, and loading (ETL) processes, pre-computed aggregates, and materialized views become obsolete.

    Enterprise Data Characteristics

    Many columns are not used even once
    Many columns have a low cardinality of values
    NULL values / default values are dominant
    Sparse distribution facilitates high compression

    Standard enterprise software data is sparse and wide
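    Why low cardinality facilitates compression can be illustrated with dictionary encoding, the technique named in the course goals. The following is a minimal sketch, not the implementation discussed in the lecture; the function names and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value by its position in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    position = {value: index for index, value in enumerate(dictionary)}
    encoded = [position[value] for value in values]
    return dictionary, encoded


def bits_per_value(dictionary):
    """Minimum number of bits needed to address every dictionary entry."""
    return max(1, (len(dictionary) - 1).bit_length())


# Hypothetical low-cardinality column (for example, a country code).
column = ["DE", "US", "DE", "DE", "FR", "US", "DE", "DE"]
dictionary, encoded = dictionary_encode(column)

# Decoding is a plain lookup, so no information is lost.
decoded = [dictionary[code] for code in encoded]
```

    Under this scheme a column with at most 32 distinct values needs at most 5 bits per stored value, regardless of how long the original values are.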

  • Low Cardinality of Values Within Many Columns

    Results from analyzing financials
    Distinct values in accounting document headers (99 attributes)

    [Chart: industries shown are CPG, Logistics, High tech, Discrete manufacturing, Banking]

    Many Columns are not Used Even Once

    55% unused columns per company on average
    40% unused columns across all companies

    [Chart: % of Columns (axis 0% to 80%) by Number of Distinct Values (1 - 32, 33 - 1023, 1024 - 100000000) for Inventory Management and Financial Accounting; data labels as extracted: 13%, 9%, 78% and 24%, 12%, 64%]

  • Wide Tables

    Analysis of width of 144 most used* tables

    * Largest in terms of cardinality

    [Histogram: # Tables (axis 0 to 30) by # Columns; column-count buckets: 1-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 82, 99, 110-119, 120-129, 138, 140-149, 156, 180-189, 200-209, 230, 312, 399]

  • Changes in Hardware

    Dr.-Ing. Jürgen Müller

    Advances in Hardware

    64-bit address space: 4 TB in current server boards
    25 GB/s data throughput, CPU to DRAM
    Cost-performance ratio rapidly declining
    Multi-core architecture: 8 x (8-16) core CPUs per blade
    Parallel scaling across blades
    One blade at $50,000 = 1 enterprise-class server
    Copy 50 GB of data via InfiniBand: 10s

  • Memory Hierarchy

    [Figure: memory hierarchy with the levels CPU Registers, CPU Caches, Main Memory, Flash, Hard Disk; one direction is labeled "Higher Performance", the other "Lower Price / Higher Latency"]

    Latency Numbers

    L1 cache reference (cached data word)           0.5 ns
    Branch mispredict                                 5 ns
    L2 cache reference                                7 ns
    Mutex lock/unlock                                25 ns
    Main memory reference                           100 ns    0.1 µs
    Send 2K bytes over 1 Gb/s network            20,000 ns     20 µs
    SSD random read                             150,000 ns    150 µs
    Read 1 MB sequentially from memory          250,000 ns    250 µs
    Disk seek                                10,000,000 ns     10 ms
    Send packet CA to Netherlands to CA     150,000,000 ns    150 ms
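    The latency numbers become easier to compare when expressed as ratios. The short calculation below uses only figures from the list above; the variable names are illustrative, and treating 1 MB as 1,000,000 bytes is an assumption made for the bandwidth estimate.

```python
# Latency figures from the list above, in nanoseconds.
LATENCY_NS = {
    "main_memory_reference": 100,
    "ssd_random_read": 150_000,
    "read_1mb_sequentially_from_memory": 250_000,
    "disk_seek": 10_000_000,
}

# How many main memory references fit into the time of one disk seek?
memory_refs_per_disk_seek = (
    LATENCY_NS["disk_seek"] // LATENCY_NS["main_memory_reference"]
)

# How many main memory references fit into the time of one SSD random read?
memory_refs_per_ssd_read = (
    LATENCY_NS["ssd_random_read"] // LATENCY_NS["main_memory_reference"]
)

# Implied sequential read bandwidth of main memory in bytes per second.
# Assumption: 1 MB is taken as 1,000,000 bytes.
BYTES_PER_MB = 1_000_000
NS_PER_SECOND = 1_000_000_000
sequential_bytes_per_second = (
    BYTES_PER_MB * NS_PER_SECOND // LATENCY_NS["read_1mb_sequentially_from_memory"]
)
```

    With these figures, one disk seek costs as much time as 100,000 main memory references, one SSD random read as much as 1,500, and sequential reads from memory run at roughly 4 GB/s.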

  • A Blueprint of SanssouciDB

    Dr.-Ing. Jürgen Müller

  • SanssouciDB: An In-Memory Database for Enterprise Applications

    [Figure: SanssouciDB architecture. Labels: Interface Services and Session Management; Query Execution, Metadata, TA Manager; Distribution Layer at Blade i; Main Memory at Blade i; Active Data; Main Store and Differential Store with Column and Combined Column structures; Merge; Indexes (Inverted, Object Data Guide); Data aging; Time travel; Logging; Recovery; Non-Volatile Memory with Log, Snapshots, Passive Data (History)]

    In-Memory Database (IMDB)

    Data resides permanently in main memory
    Main memory is the primary persistence
    Still: logging to disk / recovery from disk
    Main memory access is the new bottleneck
    Cache-conscious algorithms / data structures are crucial (locality is king)
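    Why locality matters once main memory access is the bottleneck can be shown by counting how many cache lines a scan of a single attribute touches. This is a simplified model, not a measurement: the 64-byte cache line, the 4-byte attribute, and the 256-byte record width are assumptions chosen for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: a common cache line size


def cache_lines_touched(num_values, stride_bytes, value_bytes):
    """Count distinct cache lines touched when reading num_values values
    whose start addresses are stride_bytes apart, beginning at address 0."""
    lines = set()
    for i in range(num_values):
        start = i * stride_bytes
        end = start + value_bytes - 1
        first_line = start // CACHE_LINE_BYTES
        last_line = end // CACHE_LINE_BYTES
        lines.update(range(first_line, last_line + 1))
    return len(lines)


NUM_ROWS = 1_000
VALUE_BYTES = 4    # hypothetical 4-byte attribute
ROW_BYTES = 256    # hypothetical record width of a wide table

# Row layout: consecutive values of the attribute are one full record apart.
row_layout_lines = cache_lines_touched(NUM_ROWS, ROW_BYTES, VALUE_BYTES)

# Column layout: consecutive values of the attribute are adjacent in memory.
column_layout_lines = cache_lines_touched(NUM_ROWS, VALUE_BYTES, VALUE_BYTES)
```

    Under these assumptions the row layout touches 1,000 cache lines to read 1,000 values, while the column layout touches 63, because 16 adjacent values share each cache line.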
