
             

Annual  Report  2014      

   

     

Enterprise  Platform  and  Integration  Concepts    

Research  Group  of  Prof.  Dr.  Hasso  Plattner  

Hasso Plattner Institute, August-Bebel-Str. 88

14482  Potsdam    

http://epic.hpi.de  


Annual  Report  |  2014  

 

Enterprise  Platform  and  Integration  Concepts  |  Hasso  Plattner  Institute    

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Contact  

Dr. Matthias Uflacker
Hasso Plattner Institute
August-Bebel-Str. 88
14482 Potsdam, Germany
Tel.: +49 (331) 5509-566
E-Mail: [email protected]

 

   


Table  of  Contents  

 

1. OUR TEAM

2. RESEARCH AREAS AND SELECTED PROJECTS
2.1 IN-MEMORY DATA MANAGEMENT FOR ENTERPRISE SYSTEMS
2.2 TOOLS AND METHODS FOR ENTERPRISE SYSTEMS DESIGN AND ENGINEERING
2.3 IN-MEMORY DATA MANAGEMENT FOR LIFE-SCIENCE

3. SUPERVISED MASTER THESES

4. COMPLETED PH.D. DISSERTATIONS

5. PUBLICATIONS
5.1 BOOKS
5.2 BOOK CHAPTERS
5.3 JOURNAL ARTICLES
5.4 CONFERENCE CHAPTERS
5.5 WORKSHOP ARTICLES

6. TEACHING
6.1 SUMMER TERM 2014
6.2 WINTER TERM 2014/2015
6.3 ME310 GLOBAL TEAM-BASED PRODUCT INNOVATION & ENGINEERING
6.4 OPENHPI

7. EVENTS, SPEECHES, AND PRESENTATIONS

8. INDUSTRY PARTNERSHIPS

9. ACADEMIC PARTNERSHIPS

 

 


1. OUR TEAM

Chair
Prof. Dr. h.c. mult. Hasso Plattner

Chair Representative
Dr. Matthias Uflacker

Assistant
Andrea Lange

PostDoc Researchers
Dr. Mariana Neves: Natural Language Processing in In-Memory Database
Dr. Matthieu-P. Schapranow: In-Memory Data Management for Life Science Applications

Research Assistants
Martin Boissier: Data Tiering and Access Based Data Partitioning
Lars Butzmann: Business Simulations using In-Memory Databases

 

 


 

Martin Faust: Indices for In-Memory Column Stores
Cindy Fähnrich: Application of In-Memory Database Technology to Population-Specific Genome Data Analysis
Franziska Häger: Design Thinking in Software Development Processes
Stefan Klauck: Generic What-If-Analyses Using In-Memory Column Stores
Thomas Kowark: Query-Level Replication of Software Repository Analyses
Martin Lorenz: The Impact of Column-Orientation on the Quality of Class Inheritance Mapping
Carsten Meyer: Dynamic Data Tiering for Mixed-Workload In-Memory Databases
Stephan Müller: Aggregates Caching for Enterprise Applications


Keven Richly: Geo-Spatial Analyses on In-Memory Column Stores
David Schwalb: Leveraging Non-Volatile Memory Technologies for In-Memory Column Stores
Christian Schwarz: Predictive Analytics on In-Memory Databases
Ralf Teusner: Teaching Software Development in Massive Open Online Courses
Arian Treffer: Omniscient Debugging in Database Applications

Student Assistants
Aechtner, Sten; Berning, Tim; Brauer, Janos; Franke, Alexander; Hopstock, Michael; Horschig, Siegfried; Jankrift, Marcel; Klauck, Stefan; Kotschenreuther, Leo; Matthies, Christoph; Rehbein, Cornelia; Bärtig, Andrea; Bock, Cornelius; Enderlein, Jonas; Frick, Jakob; Höroldt, Carolin; Ihrke, Sebastian; Kastius, Alexander; Kohlhagen, Marco; Lehmann, Sven; Marten, Jannik; Reißaus, Benjamin; Benson, Lawrence; Bothe, Max; Flemming, Pedro; Hesse, Hubert; Horschig, Friedrich; Jacoby, Janusch; Keller, Marvin; Koßmann, Jan; Liedke, Franz; Nack, Tobias; Ruhrländer, Rui Paulo; Schmidt, Christopher; Wolff, Felix; Matysik, Jan-Tobias; Schulze, Alexander; Dreseler, Markus; Schumann, David; Wacke, Markus; Illi, Cornelius

   


2. RESEARCH AREAS AND SELECTED PROJECTS

Our research activities focus on the principles of in-memory data management for enterprise systems and the integration of different software systems to meet customer requirements. This involves studying the conceptual and technological aspects of systems for data management and business process support. In customer-centered business software development, the focus is on the users. Developing solutions tailored to user needs in a timely manner requires well-designed tools and methods for enterprise system design and engineering. We apply our findings to real-world scenarios and showcase future enterprise applications by developing and evaluating functional prototypes in close collaboration with our industry partners. One particular domain of interest is the application of in-memory data management in life sciences and eHealth systems.

2.1 In-Memory Data Management for Enterprise Systems

The traditional market division into online transaction processing (OLTP) and online analytical processing (OLAP) has been justified by the different workloads of the two kinds of systems. While OLTP workloads are characterized by a mix of reads and writes of a few rows at a time, OLAP applications are characterized by complex read queries with joins and large sequential scans spanning few columns but many rows of the database. Those two workloads are typically addressed by separate systems: transaction processing systems and business intelligence or data warehousing systems.

Our research investigates the re-unification of enterprise architectures, uniting transactional and analytical systems to significantly reduce application complexity and data redundancy, to simplify IT landscapes, and to enable real-time reporting on the transactional data. The following figure outlines our proposed system architecture.

It is a common belief that a columnar data layout is not well suited for transactional processing and should mainly be used for analytical processing. We postulate that this is not the case, as column-based system architectures can even be superior for transactional business processing if a data layout without transaction-maintained aggregates is chosen. By dropping all transaction-maintained aggregates, indices, and other redundant data structures, we can significantly simplify data entry transactions. There is no need to update additional summarization tables or secondary indices. Consequently, the only necessary steps for the booking of a vendor invoice are the inserts of the accounting document header and its line items, as depicted in the following graphic.

Although single inserts into a column store generally take longer, the reduction of complexity eliminates most of the work during data entry and results in significant performance advantages on the simplified data schema. As a result, data entry actually becomes faster on an in-memory column store. We measured the runtime of both transactions in a production setting and found the simplified data entry transaction on an in-memory column store to be 2.5 times faster than the classic data entry transaction on a disk-based row store.

In summary, our research investigates the impact of a redundancy-free, column-based architecture without transaction-maintained aggregates on the way enterprise applications are being built. This includes a dramatic simplification of applications, a reduced data footprint, and advanced partitioning techniques based on classifying data into actual and historical.
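The effect of dropping transaction-maintained aggregates can be illustrated with a minimal sketch. It assumes a toy column layout built from Python lists; the names `header`, `line_items`, `post_invoice`, and `account_totals` are hypothetical and are not taken from the group's systems. Data entry only appends the document header and its line items, and totals are computed on demand by scanning two columns.

```python
from collections import defaultdict

# Toy column store: one list per column (hypothetical schema, for illustration only).
header = {"doc_id": [], "vendor": [], "posting_date": []}
line_items = {"doc_id": [], "account": [], "amount": []}


def post_invoice(doc_id, vendor, posting_date, lines):
    """Book a vendor invoice by inserting the header and its line items.

    No summary table or secondary index is touched during data entry.
    """
    header["doc_id"].append(doc_id)
    header["vendor"].append(vendor)
    header["posting_date"].append(posting_date)
    for account, amount in lines:
        line_items["doc_id"].append(doc_id)
        line_items["account"].append(account)
        line_items["amount"].append(amount)


def account_totals():
    """Aggregate on demand by scanning the account and amount columns."""
    totals = defaultdict(float)
    for account, amount in zip(line_items["account"], line_items["amount"]):
        totals[account] += amount
    return dict(totals)


post_invoice(1, "ACME", "2014-03-01", [("expenses", 100.0), ("payables", -100.0)])
post_invoice(2, "ACME", "2014-03-05", [("expenses", 50.0), ("payables", -50.0)])
totals = account_totals()
```

In this sketch the write path stays short because the aggregate is never stored; the cost moves to the read path, where a real column store would rely on fast sequential scans.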

Selected Publications in this Research Area in 2014

• Hasso Plattner, Martin Faust, Stephan Müller, David Schwalb, Matthias Uflacker, Johannes Wust: The Impact of Columnar In-Memory Databases on Enterprise Systems, VLDB, 2014

• Hasso Plattner: A Course in In-Memory Data Management: The Inner Mechanics of In-Memory Databases, Second Edition, ISBN: 978-3-642-55269-4, 2014

• Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner: Materialized View Maintenance Leveraging In-Memory Data Structures, International Journal On Advances in Software, vol. 7, no. 3&4, 2014

• Christian Tinnefeld, Donald Kossmann, Joos-Hendrik Boese, Hasso Plattner: Parallel Join Executions in RAMCloud, CloudDB - In conjunction with ICDE 2014, 2014


 

Project Highlight - Bachelor Project 2013/14
Enterprise Workload Analysis for Hot and Cold Data Classification

In this project, students addressed the topic of classifying data into hot and cold by analyzing production workload traces. The goal of the project was to get a better understanding of data relevance in a realistic system. To this end, the ERP system of a large global company was traced for several days, resulting in 50 GB of raw query logs. A framework called EWA (Explorative Workload Analyzer) was implemented to visualize the massive workload traces in a meaningful way. With the help of this framework, we could not only interactively analyze queries and find access patterns in a production workload, but also execute a full replay on a copy of the production system. With the help of this replay, we were able to tell for every single tuple in the database when and how often it had been accessed. Ultimately, this allowed us to analyze which tuples were most relevant and to answer questions such as "Does the date of a tuple correlate with its relevance?". An SAP UI5 frontend was developed to allow easy exploration of workload characteristics based on the raw trace data consisting of billions of tuples.
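As a rough illustration of the per-tuple analysis described above, the following sketch builds an access profile from a replayed trace and splits tuples into hot and cold. The trace format (pairs of tuple id and timestamp), the function names, and the access-count threshold are assumptions made for this example; they do not describe the EWA framework itself.

```python
from collections import defaultdict


def access_profile(trace):
    """Map each tuple id to the list of timestamps at which it was accessed."""
    profile = defaultdict(list)
    for tuple_id, timestamp in trace:
        profile[tuple_id].append(timestamp)
    return profile


def classify(profile, all_tuples, min_accesses=2):
    """Label tuples with at least min_accesses accesses as hot, the rest as cold."""
    hot = {t for t in all_tuples if len(profile.get(t, [])) >= min_accesses}
    cold = set(all_tuples) - hot
    return hot, cold


# Hypothetical replayed trace: (tuple id, logical timestamp).
trace = [("inv-1", 10), ("inv-2", 11), ("inv-1", 12),
         ("inv-3", 13), ("inv-1", 14), ("inv-3", 15)]
all_tuples = ["inv-1", "inv-2", "inv-3", "inv-4"]

profile = access_profile(trace)
hot, cold = classify(profile, all_tuples)
```

A fixed threshold is only one possible rule; the same profile could also be used to relate access counts to tuple age, which is the kind of question the project investigated.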

 


Project Highlight - Master Project 2014/15
HANA Load Simulator

The HANA Load Simulator is a tool that generates a realistic enterprise workload of thousands of concurrent users and executes that workload on different database configurations simultaneously. A dashboard monitors several performance indicators of each database, including data footprint, transaction latencies, throughput, and overall CPU utilization. The dashboard can also be used to configure several workload parameters like OLTP and OLAP query frequencies or the ratio of actual and historical queries. This provides a simple and interactive tool to assess key performance characteristics of different database setups (e.g., single- vs. multi-node) side-by-side and in real-time.


Using the Load Simulator, we compared a) a single HANA node with b) a multi-node setup consisting of a master node (actual data only), one replica node of the master for running OLAP transactions, and a cold node for historical data. Both setups have the same total number of cores and the same amount of main memory. The workload consists of three types of transactions: invoice postings, read-only transactions including OLTP queries, and OLAP transactions including read-heavy analytical queries. With the partitioning into actual and historical data and the replication of the actual data, we observed the following improvements (90% actual-only OLAP transactions, 100% actual-only OLTP transactions, 1% of queries being analytical):

Improved performance:

• Transactional processing is improved even without the use of a replica due to the smaller data set. With the replica activated, the multi-node setup is faster by a factor of ~4 for mixed workloads.

• The more the workload is skewed towards actual-only queries, the more the partitioned system outperforms the traditional setup.

• When analytical users are added to the system, a replica of the actual master node significantly lowers the latency of OLTP transactions compared to the single HANA setup due to better load distribution.

Reduced costs:

• Historical data can be purged and better compressed, which reduces the main memory footprint.

• Overall system costs decrease as smaller servers can be deployed, avoiding the disproportionately high prices of large server systems.

   

The screen shows OLTP latency, OLAP latency, and CPU load for both setups with 10000 transactional users and 500 analytical users. The single-node setup on the left shows higher OLTP latencies and significantly higher OLAP latencies, both violating the SLA (service level agreement) thresholds. The partitioned setup on the right shows significantly better latencies while using only half of its CPU resources.


2.1.1 SSICLOPS – Scalable and Secure Infrastructures for Cloud Operations

Starting in February 2015, our research group, together with the research group of Prof. Polze and a consortium of 12 academic and industry partners from 7 countries, will participate in a three-year collaboration project titled SSICLOPS – Scalable and Secure Infrastructures for Cloud Operations. The project is funded by the European Commission under the Horizon 2020 program.

The SSICLOPS project focuses on techniques for the management of federated private cloud infrastructures, in particular cloud networking techniques within software-defined data centers and across wide-area networks. SSICLOPS will empower enterprises to create and operate high-performance private cloud infrastructure that allows flexible scaling through federation with other private clouds without compromising on their service level and security requirements. SSICLOPS federation will support the efficient integration of clouds regardless of whether they are geographically collocated or spread out, or belong to the same or different administrative entities or jurisdictions: in all cases, SSICLOPS will enforce legal and security constraints and minimize the overall resource consumption. In such a federation, individual enterprises will be able to dynamically scale their private cloud services in and out. This allows each federation member to maximize the utilization of its own infrastructure while minimizing its need for excess capacity.

The project will design, implement, demonstrate, and evaluate three specific use cases, namely a cloud-based in-memory database, the analysis of physics experiment data, and the prototypical extension of network stacks for a telecom provider.

 Partners  and  countries  of  the  SSICLOPS  consortium.  


2.1.2 In-Memory Research Laboratory

The In-Memory Research Laboratory supports all research activities on main-memory, multi-core, and coprocessor technology at our research group. The lab offers physical and virtual resources in order to build a solid foundation for experiments and teaching activities around the topic of in-memory databases and enterprise applications. We currently maintain a pool of 50 high-end servers of different generations and one petabyte of permanent storage. In 2014, our team made several improvements in terms of manageability and availability of those resources. We rolled out a solution for automatic configuration, installation, and maintenance for physical and virtual servers. User accounts are now handled globally, reducing the time to initial system availability to a minimum. Resources to provide a more flexible testing environment have been increased as well.

In November 2014, we installed an SGI UV300 for HANA, a system with 240 enterprise-level CPU cores, 12 TB of main memory, and 50 TB of high-performance permanent storage. The machine has been integrated into the existing landscape and was ready to use within one day. It is used in several research projects related to NUMA, in-memory databases, and predictive applications.

 In-­‐Memory  Research  Laboratory:  Bringing  the  SGI  UV300  for  HANA  into  service.  

   

2.2 Tools and Methods for Enterprise Systems Design and Engineering

We consider the balance of technological, business, and human factors to be the driver for innovation. Software development often lacks the required emphasis on human values, even though a shift towards more user-centricity is noticeable in many companies. Therefore, we focus on the influence of the human element in software engineering. We observe, analyze, and seek to understand how individuals, teams, and organizations work and in which ways tools and processes can support them in creating better software with an improved user experience. We also consider in-memory technology to be a driver for more efficient application programming, intelligent tools, and user-friendly software, ultimately influencing the way we will design, develop, and operate business applications in the future.

2.2.1 Data- and Performance-Aware Development of Business Applications on SAP HANA

The Bachelor project "Modern Computer-aided Software Engineering" tested novel concepts to ease the process of creating and debugging enterprise applications built on an in-memory database. Students built a prototype as an extension of a web-based development environment that integrates database logic and application code in a single view. Developers get immediate, detailed information about queries, such as query plans, estimated performance measures, and result set sizes, while writing and modifying code. These estimates help spot potential performance bottlenecks early in the development process and thus prevent cost-intensive changes later on. Using sampling and clustering approaches, the estimates of query performance are available within fractions of a second. A visual representation of the program flow makes it easier to comprehend the program and the effects of code changes. Early results and feedback from professional developers are promising and emphasize the need for more intelligent, reactive development environments.

2.2.2 Code Better, Run Faster: Tools for Performance-Driven Enterprise Application Development

In this Master project, we addressed the problem of writing application logic that performs efficiently on columnar main memory databases.
The research was based on the open-source database Hyrise, which on the one hand allowed us to integrate new functionality easily and on the other hand offered access to key performance metrics. Leveraging expected result set sizes, number of cache misses, core utilization, used CPU cycles, and total execution times, our improved Hyrise development environment supported developers in improving query logic and response times. Custom operators can be programmed directly in the browser-based IDE using JavaScript, providing a flexible way to integrate custom operators into complex query execution plans.

The team implemented intuitive visualizations for query result sizes, previews of the expected result sets, and breakdowns of the total query runtimes to identify performance-intensive tasks. Based on a spatio-temporal dataset of multiple soccer matches, analytical operators were implemented to detect offside situations. The operators were optimized with the developed tools, which allowed for high-performance scans of the whole data set.

2.2.3 Object-Relational Mapping Strategies for New Enterprise Applications

Specialization/generalization relationships are a common pattern in enterprise system domain models. In object-oriented programming, such relationships can be expressed as inheritance between entities. Persisting entities of the domain model that are part of an inheritance relationship is not trivial. Research has proposed three different strategies to map inheritance structures to relational data models. What they have in common is an inherent trade-off between memory consumption and query performance. Depending on the actual characteristics of the inheritance structure at hand, each strategy has its strengths and weaknesses. Consequently, the combination of inheritance characteristics and prioritization of non-functional requirements (memory consumption and query performance) determines the strategy to implement. Unfortunately, not all characteristics of the inheritance hierarchy can be defined in advance. In this ongoing research project, we look at how column-orientation as a means to physically structure data in memory influences the determination of the best mapping strategy for a given data model.

2.2.4 Omniscient Debugging in Database Applications

Omniscient debugging is an approach to improve the efficiency of debugging activities, thereby increasing overall developer productivity. While a regular debugger can only show the program state at the current point in time and only allows the developer to move execution forward, an omniscient debugger (ODB) can immediately produce the state at any point in time and allows the developer to move in any direction through the execution. In this research project, we want to bring omniscient debugging into the database layer.
Typical  ODB   implementations   record   every   execution   step   to   be   able   to   reproduce   previous  program  states.  However,  with  a  stored  procedure  that  touches  millions  of  tuples,  this  is  not  feasible.   Instead,   we   only   trace   scalar   variables   and   leverage   the   speed   of   in-­‐memory  databases   to   quickly   reproduce   query   results   on   demand.   Combining   these   dynamic   and  static  analysis  techniques,  we  aim  to  build  a  debugger  with  useful  visualizations  that  help  the  developer  to  understand  program  and  data  flow  in  complex  stored  procedures.      Selected  Publications  in  this  Research  Area  in  2014    ! Martin  Lorenz,  Johannes  Albrecht:  Object-­‐Relational  Mapping  Strategies  revised  –  A  

comparison  of  Row-­‐  and  Column-­‐  oriented  Database  Systems,  International  Conference  on  Challenges  in  IT,  Engineering  and  Technology  (ICCIET),  2014  

! Franziska  Häger,  Thomas  Kowark,  Jens  Krüger,  Christophe  Vetterli,  Falk  Übernickel,  Matthias  Uflacker:  DT@Scrum:  Integrating  Design  Thinking  with  Software  Development  Processes,  Understanding  Innovation  -­‐  Building  Innovators,  2014  

! Thomas  Kowark,  Hasso  Plattner:  Collective,  Incremental  Ontology  Alignment  Through  Query  Translation,  The  8th  International  Conference  On  Web  Reasoning  And  Rule  Systems,  Athens,  Greece,  2014  

! Franziska  Häger,  Ralf  Teusner:  From  theory  to  practice  -­‐  Using  a  multi-­‐team  design  thinking  workshop  to  kickstart  software  projects,  DTBIS,  2014  
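The tracing idea described in Section 2.2.4 (record only scalar variables, then re-run queries on demand) can be illustrated with a minimal sketch. It is not the project's implementation: the `ScalarTrace` class, the `orders` table, and the stand-in `procedure` function are hypothetical, and an in-memory SQLite database is assumed here in place of the in-memory database used in the research.

```python
import sqlite3


class ScalarTrace:
    """Hypothetical trace: stores a snapshot of scalar variables per step."""

    def __init__(self):
        self.steps = []  # list of (label, {variable name: value})

    def record(self, label, **scalars):
        self.steps.append((label, dict(scalars)))


QUERY = "SELECT id, amount FROM orders WHERE amount > ? ORDER BY id"


def procedure(conn, trace, threshold):
    """Stand-in for a stored procedure; only scalars are traced, never rows."""
    trace.record("start", threshold=threshold)
    row_count = len(conn.execute(QUERY, (threshold,)).fetchall())
    trace.record("after_first_query", threshold=threshold, row_count=row_count)
    threshold = threshold * 2
    trace.record("after_doubling", threshold=threshold, row_count=row_count)
    return row_count


def query_result_at(conn, trace, step):
    """Reproduce the query result for a recorded step by re-running the query
    with the scalars captured at that step (assumes unchanged table data)."""
    _, scalars = trace.steps[step]
    return conn.execute(QUERY, (scalars["threshold"],)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 50), (3, 200)])

trace = ScalarTrace()
procedure(conn, trace, threshold=30)

# Move to any recorded point in time, in either direction.
earlier = query_result_at(conn, trace, 0)  # threshold was 30
later = query_result_at(conn, trace, 2)    # threshold was 60
```

The sketch stores three small dictionaries rather than any result rows, which is the point of the approach: the cost of tracing stays independent of the number of tuples a query touches. It deliberately ignores data that changes between steps, which a real debugger would have to handle.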

Project Highlight: HPI Business Simulator

Companies invest a significant amount of time in the yearly budgeting process and the resulting quarterly or monthly forecasts. Management often sees this process as inefficient given the volatility of markets and enterprise structures. In this context, what-if analysis has been established with the goal of closely modeling cause and effect in an enterprise and its environment. This functionality can be used for the budgeting process or as part of a forecast to evaluate scenarios in terms of their goal fulfillment. However, while this theoretical model has well-defined semantics, it still lacks proper tool support.

We have connected with a Fortune 500 company in the consumer goods industry in order to discover their needs for enterprise simulation and create a new simulation tool. The following key requirements were identified: flexibility, i.e., the adaptability to new use cases without additional programming efforts; interactivity, i.e., sufficient performance for interactive decision-making during planning runs; and collaboration, i.e., multi-user support for collaborative development of joint simulation scenarios.

These identified requirements can be addressed effectively with HANA's capability for direct execution of analytical queries on transaction data. To meet the challenge, we have developed the HPI Business Simulator, a proof-of-concept tool that allows companies to define and calculate what-if analyses in seconds. The main idea behind this tool is to enable companies to flexibly simulate scenarios directly on the transactional data. Users can easily configure and perform their simulations without the development overhead of custom-built simulations and pre-calculated data cubes. This serves as an enabler for ad-hoc decision support, planning, and forecasting, positively impacting multiple areas within a company:

Purchasing: Material costs can be simulated depending on commodity prices and currency fluctuations.
Production: Costs can be simulated based on production paths, machine allocations, transportation costs, rejection rates, energy consumption, energy prices, and more.
Sales: The sales volume can be simulated using drivers such as unit price and economic factors such as buying power and competing vendors.
Controlling: The value drivers can also be consolidated and used for profitability simulation.

On the management level, executives gain more process transparency through real-time information access and can see the impacts of strategic decisions.

The HPI Business Simulator builds on the concept of value driver models. Value driver trees, such as the DuPont model, are well-known methodologies to model Key Performance Indicators (KPIs) with independent linear equations or, in the case of input-output structures, with systems of linear equations. Using value driver models, activities and decisions in a company are focused on the core factors that drive the KPIs, e.g., the operating profit. Furthermore, their usage increases collaboration across departments, leading to more aligned operations. This results in less effort being required for planning
and the development of more realistic plans, as the KPIs are directly connected to the operational drivers. Drivers can influence multiple other drivers, e.g., an increased sales volume influences both net sales and variable costs.

A value driver model is a directed graph consisting of a set of nodes and their connecting edges. Each node is a value driver that either represents a data source or is calculated based on other value drivers. Nodes can drive multiple other nodes. Our model not only supports simple operations like additions and multiplications but also complex equation systems which, e.g., describe the influence of a bill of material, raw material prices, and cost center rates on product costs. Models can be filtered, e.g., along the product, customer, location, and time dimensions.

The simulation model can be easily configured by domain experts who define the relevant drivers and their dependencies. As an example, for the value driver 'sales volume', the attribute 'quantity' would be the value that has to be aggregated. Customer, location, product, and time would specify the supported filter dimensions.

Once the value driver model has been configured, simulations are initiated by adjusting (overriding) values of the nodes of the driver model. The simulated impact is instantly visualized in relation to the actuals, plans, and forecasts. Users can set filters on the dimensions, e.g., by customer, location, or product, to explore the impact of the simulation run in detail.

Figure: Drill down in a Profit & Loss simulation. With the HPI Business Simulator it becomes feasible to directly use the transaction data for enterprise simulations.
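To make the value driver model described above more concrete, the following is a minimal sketch of a driver graph with source nodes, calculated nodes, and simulation by overriding node values. It is an illustration under stated assumptions, not the HPI Business Simulator's code: the class `ValueDriverModel`, the driver names, and all numbers are hypothetical, the graph is assumed to be acyclic, and source values are plain constants rather than aggregates over transaction data.

```python
class ValueDriverModel:
    """Hypothetical directed graph of value drivers (assumed acyclic)."""

    def __init__(self):
        self.sources = {}     # driver name -> base value (stands in for a data source)
        self.calculated = {}  # driver name -> (input driver names, function)

    def add_source(self, name, value):
        self.sources[name] = value

    def add_calculated(self, name, inputs, func):
        self.calculated[name] = (inputs, func)

    def evaluate(self, name, overrides=None):
        """Compute a driver; overridden nodes replace their normal value."""
        overrides = overrides or {}
        cache = {}

        def value_of(node):
            if node in overrides:
                return overrides[node]
            if node not in cache:
                if node in self.sources:
                    cache[node] = self.sources[node]
                else:
                    inputs, func = self.calculated[node]
                    cache[node] = func(*(value_of(i) for i in inputs))
            return cache[node]

        return value_of(name)


model = ValueDriverModel()
model.add_source("sales_volume", 1000)
model.add_source("unit_price", 5.0)
model.add_source("variable_cost_per_unit", 3.0)
model.add_source("fixed_costs", 1200.0)
# Sales volume drives two other nodes, as in the example from the text.
model.add_calculated("net_sales", ["sales_volume", "unit_price"], lambda v, p: v * p)
model.add_calculated("variable_costs", ["sales_volume", "variable_cost_per_unit"],
                     lambda v, c: v * c)
model.add_calculated("operating_profit",
                     ["net_sales", "variable_costs", "fixed_costs"],
                     lambda s, vc, fc: s - vc - fc)

baseline = model.evaluate("operating_profit")
simulated = model.evaluate("operating_profit", overrides={"sales_volume": 1200})
```

Here `baseline` reflects the unmodified model and `simulated` the what-if scenario with a higher sales volume. Filter dimensions and systems of linear equations, both mentioned in the text, are left out of this sketch.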

2.3 In-Memory Data Management for Life Sciences and eHealth Systems

In addition to our research activities in the field of "In-Memory Data Management for Enterprise Systems", our group focuses on applying in-memory technology to the field of Life Sciences. The volume of scientific data in this area typically exceeds that of data sets used in traditional enterprises. Building on our long-lasting experience in applying in-memory technology to selected enterprise challenges, we also want to improve the processing and analysis of large scientific data sets in real time.

2.3.1 EBOKON - Surveillance of Ebola Outbreaks

Prompted by the 2014 Ebola outbreak in West Africa, this ongoing project aims to support the identification and management of (suspected) Ebola infections and the follow-up surveillance of their contacts to prevent the disease from spreading further. The project is a cooperation between HPI, Helmholtz Centre for Infection Research (HZI), Robert Koch Institute (RKI), Bernhard Nocht Institute for Tropical Medicine (BNITM), and Nigeria Field Epidemiology & Laboratory Training Program (NFELTP).

During several Design Thinking workshops in both Germany and Nigeria, we systematically analyzed experiences from field workers and the Ebola Emergency Operations Centre (EOC) after their successful control of the Ebola outbreak in Nigeria. From those insights, we identified relevant personas and developed process models depicting their interactions during an Ebola outbreak.

In one of our ongoing Bachelor projects, we implement these process models and requirements in a software system. The students are building a mobile application for contact tracers, i.e., field workers who visit contacts of a (suspected) Ebola case daily to detect new Ebola cases early and initiate corresponding measures. The mobile application supports contact tracers in contact management and provides interview guides and a simple user interface for data collection. The contact tracing app will run within a private cloud that is based on standard SAP software such as SAP HANA, SAP Afaria, and SAP Mobile Platform.

2.3.2 MediTweet

In the winter term 2013/14, students of our seminar "Next Generation Clinical Information Systems" had the task to improve the workflow of clinical personnel. After several cycles of interviewing, brainstorming, prototyping, and testing, one team developed the concept of a messaging system for clinical environments, called MediTweet.

MediTweet is an open messaging system for clinical environments. It connects Clinical Information Systems (CIS) with both medical devices and personnel. MediTweet enables users to send structured messages to other users in order to automate
documentation and task synchronization. Additionally, medical devices automatically inform users about their status and results. During the summer term 2014, a working prototype of MediTweet was implemented, including a mobile client for iOS, a messaging server, and a connector for the SAP IS-H and i.s.h.med system. Semi-structured messages containing event and context information from clinical devices, IT systems, sensors, and users are broadcast automatically in predefined streams within the MediTweet network. Users can subscribe to and unsubscribe from streams that are relevant to them. MediTweet fully integrates into all clinical processes handled within the SAP IS-H and i.s.h.med system.

MediTweet also allows the clinical staff to create tasks (actions) based on received messages. These tasks and their status are synchronized between all recipients of the message stream, making it easy to coordinate their work and improving the clinical information flow. The Solution Experience Infrastructure and Healthcare Demo Team of SAP supported the students' initiative by providing a copy of the actual SAP Demo Cloud Healthcare System. Our prototype automatically publishes messages on the patients' channel as soon as new information about the patients' medical results enters the IS-H system. The SAP Healthcare Demo Team will continue the cooperation with the HPI team and is planning to integrate MediTweet into the SAP Demo Cloud starting in February 2015.

 

Selected Publications in this Research Area in 2014

• Hasso Plattner, Matthieu-P. Schapranow: High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, ISBN: 978-3-319-03034-0, 2014

• Matthieu-P. Schapranow, Franziska Häger, Cindy Fähnrich, Emanuel Ziegler, Hasso Plattner: In-Memory Computing Enabling Real-time Genome Data Analysis, International Journal on Advances in Life Sciences, Vol 6, Nr 1-2, 2014

• Cindy Fähnrich, Matthieu-P. Schapranow, Hasso Plattner: Towards Integrating the Detection of Genetic Variants into an In-Memory Database, Proceedings of the International Conference on Big Data, 2014

   

3. SUPERVISED MASTER'S THESES

The following Master's theses have been supervised, submitted, and successfully defended in our research group in 2014:

• Tim Berning: nvm_malloc: Memory Allocation in the NVRAM Era.
• Lars Butzmann: Efficient Aggregate Cache Revalidation.
• Ralf Diestelkämper: Cache Management for Aggregates in Columnar In-Memory Databases.
• Markus Dreseler: Leveraging NVRAM for the In-Memory Database HYRISE.
• Ekaterina Gavrilova: Alternative Data Models to Leverage the Features of In-Memory Column-oriented Databases.
• Philipp Giese: Eliciting Expertise based on Time Series Analyses of Code Complexity Metrics.
• Sebastian Hillig: HyDispatch: Type Dispatch for Performance and Extensibility.
• Kai Höwelmeyer: Pipelining Parallelism for Main Memory Databases.
• Cornelius Illi: Understanding Information Sharing and the Development of Shared Understanding in Virtual New Product Development Teams.
• Sebastian Meyer: Testing Mobile Prototypes in Enterprise-Scale Software Development Processes.
• Paul Möller: Leveraging Enterprise Application Characteristics to Optimize Incremental Materialized View Maintenance on Columnar In-Memory Databases.
• Stefan Schäfer: A Cost Model for Optimized Coprocessor Integration.
• Björn Wagner: Mixed Workload Processing in a RDMA-Enabled Parallel Main-Memory DBMS.
• Johannes Albrecht: Mapping Inheritance Hierarchies. A Cost Model for mapping Object Inheritance Hierarchies to Relational Databases.
• Sebastian Oergel: The Integration of Relational Languages into Object-Oriented Programming Languages.
• Daniel Taschik: Elastic In-Memory Computing. Quantifying the Elasticity of Relational Database Management Systems.

   

4. COMPLETED PH.D. DISSERTATIONS

• Jens Krüger: Enterprise-specific In-Memory Data Management: HYRISEc - An In-Memory Column Store Engine for OLXP

Abstract: Enterprise applications are presently built on a 20-year-old data management infrastructure that was designed to meet a specific set of requirements for OLTP systems. In the meantime, enterprise applications have become more sophisticated, data set sizes have increased, requirements on the freshness of data have been strengthened, and the time allotted for completing business processes has been reduced. To meet these challenges, enterprise applications have become increasingly complicated to make up for shortcomings in the data management infrastructure. These complications increase the total cost of ownership of the applications and make them harder to use. This thesis pursues the idea of designing an enterprise application-specific database engine, which is better optimized for the observed workload and data characteristics, while leveraging the latest hardware trends and advances in data processing algorithms. As a result, the actual requirements, data characteristics, and workloads of today's enterprise applications are extracted, a novel workload category called Online Mixed Workload Processing (OLXP) is defined, and the enterprise application-specific database engine HYRISEc is presented. HYRISEc uses read-optimized in-memory data structures since today's database systems are designed for a more update-intensive workload than they are actually facing. Traditional read-optimized databases use a dictionary-encoded, compressed, column-oriented approach, especially in combination with an in-memory architecture. Inserting a tuple in such a compressed store is as complex as inserting a value in a sorted column, because the entire compression has to be rebuilt. Furthermore, traditional index structures cannot be applied efficiently as these databases are not based on a page structure.

To handle updates in a compressed storage efficiently, HYRISEc implements a technique called differential store, maintaining a small write-optimized delta partition that accumulates all updates. Periodically, this delta partition is combined with the read-optimized main partition. This merge process involves decompressing the compressed main partition, merging the delta and main partitions, and re-compressing the resulting main partition. For transactional enterprise applications, it is crucial that their data is always available in a 24/7 environment and that system downtimes do not occur. As such, the merge process must be performed online and fast enough so as not to degrade the update throughput. The update performance of such a system is limited by two factors: a) the insert rate for the write-optimized structure, and b) the speed with which the system can merge the accumulated updates back into the read-optimized partition, while keeping the system online without any downtime. The merge process becomes the main bottleneck for the system and needs to be optimized by orders of magnitude to support fast updates. Consequently, a fast attribute merging algorithm is introduced that performs a linear-time update of the compressed main partition and performs multi-core aware optimizations to exploit the underlying high compute and bandwidth resources of modern multi-core CPUs.
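The delta/main merge described above can be sketched for a single dictionary-encoded column. This is a simplified illustration, not the thesis's linear-time, multi-core merge algorithm: it uses sorting and binary search for brevity, and the function names and sample values are hypothetical.

```python
from bisect import bisect_left


def encode(values):
    """Dictionary-encode a column: sorted dictionary plus one value ID per row."""
    dictionary = sorted(set(values))
    value_ids = [bisect_left(dictionary, v) for v in values]
    return dictionary, value_ids


def decode(dictionary, value_ids):
    """Reconstruct the original column values from the encoded form."""
    return [dictionary[i] for i in value_ids]


def merge(main_dict, main_ids, delta_values):
    """Merge an uncompressed delta into the compressed main partition.

    New values can shift positions in the sorted dictionary, so every old
    value ID has to be remapped. This is why the whole main partition is
    rewritten on merge instead of being updated in place.
    """
    new_dict = sorted(set(main_dict) | set(delta_values))
    # Old value ID -> new value ID, computed once per dictionary entry.
    remap = [bisect_left(new_dict, v) for v in main_dict]
    new_ids = [remap[i] for i in main_ids]
    # Delta rows are appended after the existing main rows.
    new_ids.extend(bisect_left(new_dict, v) for v in delta_values)
    return new_dict, new_ids


main_dict, main_ids = encode(["berlin", "potsdam", "berlin"])
delta = ["aachen", "potsdam"]  # inserts accumulated in the write-optimized delta
merged_dict, merged_ids = merge(main_dict, main_ids, delta)
```

In this example the new value "aachen" sorts before both existing entries, so every value ID in the main partition changes even though no existing row was modified. That illustrates why the merge cost grows with the size of the main partition and why the abstract treats it as the main bottleneck.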

With regard to fast lookups, memory bandwidth and latency are limiting the execution speed of queries and, therefore, this scarce resource has to be used economically to
maximize performance. Scanning a complete column results in the transfer of the entire column from memory to the processor, and the costs depend linearly on the column length. A common approach to speed up the access to highly selective subsets of the data is to use indices, which enable searches in logarithmic time. Thus, an index is introduced that leverages the proposed architecture. This includes the compressed column-oriented storage for the actual index data structures and the attribute merge algorithm for index maintenance.

To summarize, this thesis presents research results illustrating how an in-memory database engine can be implemented for OLXP by introducing data structures and algorithms that enable fast updates and lookups while at the same time leveraging the potential of a read-optimized store.

• Christian Tinnefeld: Building a Columnar Database on Shared Main Memory-Based Storage: Database Operator Placement in a Shared Main Memory-Based Storage System that Supports Data Access and Code Execution

Abstract: In the field of disk-based parallel database management systems, there exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach, as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this unilateral development is going to cease due to the combination of the following two trends: a) Today's network technology features remote direct memory access (RDMA) and narrows the performance gap between accessing main memory inside a server and of a remote server to, and even below, a single order of magnitude. b) Modern storage systems are elastic, provide durability as well as high availability, and (e.g., in the case of Stanford's RAMCloud) keep all data resident in main memory. Exploiting these characteristics in the context of a main-memory parallel database management system is desirable, and the advent of RDMA-enabled network technology makes the creation of a parallel main memory DBMS based on a shared-storage approach feasible. This thesis describes building a columnar database on shared main memory-based storage. It discusses the resulting architecture (Part I) and the implications on query processing (Part II), and presents an evaluation (Part III) of the resulting solution in terms of performance, high availability, and elasticity.

In our architecture, we use Stanford's RAMCloud as shared storage and the self-designed and self-developed in-memory AnalyticsDB as relational query processor on top. AnalyticsDB encapsulates data access and operator execution via an interface that allows seamless switching between local and remote main memory. RAMCloud provides not only storage capacity but also processing power; combining both aspects allows for pushing down the execution of database operators into the storage system. We describe how the columnar data processed by AnalyticsDB is mapped to RAMCloud's key-value data model and how the performance advantages of columnar data storage can be preserved.

The combination of fast network technology and the possibility to execute database operators in the storage system opens the discussion for site selection. We construct a
system model that allows the estimation of operator execution costs in terms of network transfer, data processed in memory, and wall time. This can be used for database operators that work on one relation at a time (such as a scan or materialize operation) to discuss the site selection problem (data pull vs. operator push). Since a database query translates to the execution of several database operators, it is possible that the optimal site selection varies per operator. For the execution of a database operator that works on two (or more) relations at a time (such as a join), the system model is enriched by additional factors such as the chosen algorithm (e.g., Grace Join vs. Distributed Block Nested Loop Join vs. Cyclo-Join), the data partitioning of the respective relations and their overlap, as well as the allowed resource allocation.
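The site selection trade-off for a single-relation operator can be illustrated with a deliberately simple cost sketch. It is not the system model from the thesis: the cost formulas, parameter names, and numbers below are assumptions chosen only to show how data pull and operator push can be compared.

```python
from dataclasses import dataclass


@dataclass
class ScanCostInputs:
    """Hypothetical inputs for estimating the wall time of one scan operator."""
    column_bytes: float              # size of the scanned column
    selectivity: float               # fraction of the column in the result (0..1)
    network_bytes_per_s: float       # throughput between query node and storage
    local_scan_bytes_per_s: float    # scan speed on the query processing node
    storage_scan_bytes_per_s: float  # scan speed inside the storage system


def data_pull_cost(c):
    """Ship the whole column to the query node, then scan it there."""
    return (c.column_bytes / c.network_bytes_per_s
            + c.column_bytes / c.local_scan_bytes_per_s)


def operator_push_cost(c):
    """Scan inside the storage system, then ship only the result."""
    return (c.column_bytes / c.storage_scan_bytes_per_s
            + c.selectivity * c.column_bytes / c.network_bytes_per_s)


def choose_site(c):
    """Pick the cheaper strategy under this simplified model."""
    if operator_push_cost(c) < data_pull_cost(c):
        return "operator push"
    return "data pull"


# A highly selective scan: little data has to travel after filtering.
selective_scan = ScanCostInputs(1e9, 0.01, 5e9, 10e9, 10e9)
# A materialize of the full column on slower (or busier) storage nodes.
full_materialize = ScanCostInputs(1e9, 1.0, 5e9, 10e9, 2e9)
```

Under these assumed numbers, the selective scan favors operator push and the full materialization favors data pull, which matches the observation in the text that the best site can differ per operator. Join operators would need the additional factors listed in the abstract and are not modeled here.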

We present an evaluation on a cluster with 60 nodes where all nodes are connected via RDMA-enabled network equipment. We show that query processing performance is about 2.4x slower if everything is done via the data pull operator execution strategy (i.e., RAMCloud is being used only for data access) and about 27% slower if operator execution is also supported inside RAMCloud (in comparison to operating only on main memory inside a server without any network communication at all). The fast crash recovery feature of RAMCloud can be leveraged for providing high availability; e.g., a server crash during query execution only delays the query response by about one second. Our solution is elastic in the sense that it can adapt to changing workloads a) within seconds, b) without interruption of the ongoing query processing, and c) without manual intervention.

     

   

   

5. PUBLICATIONS

5.1 Books

• Hasso Plattner: A Course in In-Memory Data Management: The Inner Mechanics of In-Memory Databases, Second Edition, ISBN: 978-3-642-55269-4, 2014

Recent achievements in hardware and software development, such as multi-core CPUs and DRAM capacities of multiple terabytes per server, have enabled the introduction of a revolutionary technology: in-memory data management. This technology supports the flexible and extremely fast analysis of massive amounts of enterprise data. Professor Hasso Plattner and his research group at the Hasso Plattner Institute in Potsdam, Germany, have been investigating and teaching the corresponding concepts and their adoption in the software industry for years. This book is based on the first online course on the openHPI e-learning platform, which was launched in autumn 2012 with more than 13,000 learners. The book is designed for students of computer science, software engineering, and IT-related subjects. However, it addresses business experts, decision makers, software developers, technology experts, and IT analysts alike. Plattner and his group focus on exploring the inner mechanics of a column-oriented dictionary-encoded in-memory database. Topics covered include, among others, physical data storage and access, basic database operators, compression mechanisms, and parallel join algorithms. Beyond that, implications for future enterprise applications and their development are discussed. Readers are led to understand the radical differences and advantages of the new technology over traditional row-oriented disk-based databases.

• Hasso Plattner, Matthieu-P. Schapranow: High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, ISBN: 978-3-319-03034-0, 2014

Recent achievements in hardware and software developments have enabled the introduction of a revolutionary technology: in-memory data management. This technology supports the flexible and extremely fast analysis of massive amounts of data, such as diagnoses, therapies, and human genome data. This book shares the latest research results of applying in-memory data management to personalized medicine, changing it from
computational   possibility   to   clinical   reality.   The   authors   provide   details   on   innovative  approaches   to   enabling   the   processing,   combination,   and   analysis   of   relevant   data   in  real-­‐time.   The   book   bridges   the   gap   between   medical   experts,   such   as   physicians,  clinicians,   and   biological   researchers,   and   technology   experts,   such   as   software  developers,  database  specialists,  and  statisticians.  Topics  covered   in  this  book   include  -­‐  amongst   others   -­‐   modeling   of   genome   data   processing   and   analysis   pipelines,   high-­‐throughput   data   processing,   exchange   of   sensitive   data   and   protection   of   intellectual  property.   Beyond   that,   it   shares   insights   on   research   prototypes   for   the   analysis   of  patient   cohorts,   topology   analysis   of   biological   pathways,   and   combined   search   in  structured  and  unstructured  medical  data,  and  outlines  completely  new  processes  that  have  now  become  possible  due  to  interactive  data  analyses.    

5.2 Book Chapters

• Franziska Häger, Thomas Kowark, Jens Krüger, Christophe Vetterli, Falk Übernickel, Matthias Uflacker: DT@Scrum: Integrating Design Thinking with Software Development Processes, Understanding Innovation - Building Innovators, 2014

5.3 Journal Articles

• Hasso Plattner, Martin Faust, Stephan Müller, David Schwalb, Matthias Uflacker, Johannes Wust: The Impact of Columnar In-Memory Databases on Enterprise Systems, VLDB, 2014

• Matthieu-P. Schapranow, Franziska Häger, Cindy Fähnrich, Emanuel Ziegler, Hasso Plattner: In-Memory Computing Enabling Real-time Genome Data Analysis, International Journal on Advances in Life Sciences, vol. 6, no. 1-2, 2014

• Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner: Materialized View Maintenance Leveraging In-Memory Data Structures, International Journal on Advances in Software, vol. 7, no. 3-4, 2014

5.4 Conference Articles

• Stephan Müller, Lars Butzmann, Stefan Klauck, Hasso Plattner: An Adaptive Aggregate Maintenance Approach for Mixed Workloads in Columnar In-Memory Databases, The 37th Australasian Computer Science Conference (ACSC), Auckland, New Zealand, 2014

• Johannes Wust, Carsten Meyer, Hasso Plattner: DAC: Database Application Context Analysis applied to Enterprise Applications, The 37th Australasian Computer Science Conference (ACSC), Auckland, New Zealand, 2014

• Christian Tinnefeld, Donald Kossmann, Joos-Hendrik Boese, Hasso Plattner: Parallel Join Executions in RAMCloud, CloudDB - In conjunction with ICDE 2014, 2014

• Christian Tinnefeld, Daniel Taschik, Hasso Plattner: Quantifying the Elasticity of a Database Management System, DBKDA, 2014

• Stephan Müller, Hasso Plattner: Aggregates Caching for Enterprise Applications, 30th International Conference on Data Engineering (ICDE), PhD Symposium, Chicago, USA, 2014

• Johannes Wust, Martin Grund, Kai Hoewelmeyer, David Schwalb, Hasso Plattner: Concurrent Execution of Mixed Enterprise Workloads on In-Memory Databases, DASFAA, 2014

• Stephan Müller, Ralf Diestelkämper, Hasso Plattner: Cache Management for Aggregates in Columnar In-Memory Databases, The 6th International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA), Chamonix, France, 2014

• Stephan Müller, Lars Butzmann, Hasso Plattner: Efficient Aggregate Cache Revalidation in an In-Memory Column Store, The 6th International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA), Chamonix, France, 2014

• Martin Faust, Martin Grund, Tim Berning, David Schwalb, Hasso Plattner: Vertical Bit-Packing: Optimizing Operations on Bit-Packed Vectors Leveraging SIMD Instructions, BDMA in conjunction with DASFAA, 2014

• Franziska Häger, Ralf Teusner: From theory to practice - Using a multi-team design thinking workshop to kickstart software projects, DTBIS, 2014

• David Schwalb, Markus Dreseler, Martin Faust, Johannes Wust, Hasso Plattner: Split Dictionaries for In-Memory Column Stores in Mixed Workload Environments, ADC, 2014

• Martin Lorenz, Johannes Albrecht: Object-Relational Mapping Strategies revised – A comparison of Row- and Column-oriented Database Systems, International Conference on Challenges in IT, Engineering and Technology (ICCIET), 2014

• Franziska Häger, Thomas Kowark, Matthias Uflacker: Pay it forward - Planning and Assessment of a Coaching Seminar for Global-Design Team Alumni, The 10th NordDesign Conference, 2014

• Matthieu-P. Schapranow, Konrad Klinghammer, Cindy Fähnrich, Hasso Plattner: An Optimized Research Process for Real-time Drug Response Analysis, The 3rd International Conference on Global Health Challenges, 2014

• Thomas Kowark, Hasso Plattner: Collective, Incremental Ontology Alignment Through Query Translation, The 8th International Conference on Web Reasoning and Rule Systems, Athens, Greece, 2014

• Matthieu-P. Schapranow, Konrad Klinghammer, Cindy Fähnrich, Hasso Plattner: In-Memory Technology Enables Interactive Drug Response Analysis, 16th International Conference on e-Health Networking, Applications and Services (Healthcom 2014), 2014

• Cindy Fähnrich, Matthieu-P. Schapranow, Hasso Plattner: Towards Integrating the Detection of Genetic Variants into an In-Memory Database, Proceedings of the International Conference on Big Data, 2014

• Martin Boissier, Jens Krüger, Johannes Wust, Hasso Plattner: An Integrated Data Management for Enterprise Systems, Proceedings of the 16th International Conference on Enterprise Information Systems (ICEIS), 2014

 

5.5 Workshop Articles

• David Schwalb, Martin Faust, Jens Krüger, Hasso Plattner: Leveraging In-Memory Technology for Interactive Analyses of Point-of-Sales Data, BDCA in conjunction with ICDE 2014, 2014

• Stephan Müller, Paul Möller, Hasso Plattner: Leveraging Enterprise Application Characteristics to Optimize Incremental Aggregate Maintenance in a Columnar In-Memory Database, Second International DASFAA Workshop on Big Data Management and Analytics (BDMA), in conjunction with DASFAA, Bali, Indonesia, 2014

• Mariana Neves, Konrad Herbst, Matthias Uflacker, Hasso Plattner: Preliminary evaluation of passage retrieval in biomedical multilingual question answering, BioTxtM 2014, Fourth Workshop on Building and Evaluating Resources for Health and Biomedical Text Processing, 2014

• David Schwalb, Martin Faust, Johannes Wust, Martin Grund, Hasso Plattner: Efficient Transaction Processing for Hyrise in Mixed Workload Environments, IMDM in conjunction with VLDB, 2014

• Martin Faust, David Schwalb, Hasso Plattner: Composite Group-Keys: Space-efficient Indexing of Multiple Columns for Compressed In-Memory Column Stores, IMDM in conjunction with VLDB, 2014

• Ralf Teusner, Malte Appeltauer, Michael Perscheid, Jonas Enderlein, Thomas Klingbeil, Michael Kusber: PopulAid: In-Memory Data Generation for Customized Benchmarks, Workshop on Big Data Benchmarking (WBDB), 2014

• Martin Boissier: Optimizing Main Memory Utilization of Columnar In-Memory Databases Using Data Eviction, Proceedings of PhD Workshop @ VLDB 2014, Hangzhou, 2014

• Konrad Herbst, Cindy Fähnrich, Mariana Neves, Matthieu-P. Schapranow: Applying In-Memory Technology for Automatic Template Filling in the Clinical Domain, CLEF 2014 Evaluation Labs and Workshop, Online Working Notes, 2014

• Mariana Neves: HPI in-memory-based database system in Task 2b of BioASQ, Working Notes for the CLEF BioASQ Challenge, 2014

• Thomas Kowark, Hasso Plattner: One Query at a Time: Incremental, Collective Ontology Matching, The Ninth International Workshop on Ontology Matching, Riva del Garda, Trentino, Italy, 2014

6. TEACHING

In 2014, our group was again responsible for numerous teaching activities, covering the full range of our three research areas “In-Memory Data Management for Enterprise Systems”, “Tools & Methods for Enterprise Systems Design and Engineering”, and “In-Memory Data Management for Life Sciences and eHealth Systems”. Prof. Plattner taught the “Trends and Concepts” series, consisting of a lecture in the summer term and a seminar in the winter term. In the lecture, Prof. Plattner covers the basic principles and advanced use case scenarios of in-memory databases as well as current trends in enterprise computing. In the seminar, the students are given a concrete enterprise application scenario and design challenge that they need to address with prototypes while incorporating end-user feedback. In the following, we summarize our teaching activities in 2014.

6.1 Summer Term 2014

Bachelor
• Enterprise Software Systems: Programming Concepts and Application Characteristics (Seminar)
• Enterprise Workload Analysis for Hot and Cold Data Classification (Bachelor Project)
• Data and Performance Aware Development of Business Applications on SAP HANA (Bachelor Project)

Master
• Trends and Concepts in the Software Industry I – Principles of In-Memory Databases (Lecture)
• In-Memory Data Management Research (Seminar)
• In-Memory Computing for Life Science (Seminar)
• Designing and Programming Applications for In-Memory Databases (Exercise)
• ME310: Global Team-based Product Innovation & Engineering (Project Seminar)
• Code Better, Run Faster – Tools for Performance-Driven Enterprise Application Development (Master Project)

6.2 Winter Term 2014/2015

Bachelor
• Software Engineering II (Lecture)
• Advanced Enterprise Applications using In-Memory Databases (Bachelor Project)
• Real-time Analysis of Big Medical Data (Bachelor Project)

Master
• Trends and Concepts in the Software Industry II – Exploiting Point-of-Sales Data (Seminar)
• Advanced Topics on In-Memory Database Servers (Seminar)

• ME310: Global Team-based Product Innovation & Engineering (Project Seminar)
• ME310: Global Team-based Product Innovation & Engineering – Coaching Research (Seminar)
• HOT or NOT? Data Aging Re-defined (Master Project)

6.3 ME310: Global Team-based Product Innovation & Engineering

Our group again participated in co-teaching Stanford’s ME310 class, where students have the opportunity to work on real design challenges posed by industry partners in globally distributed teams. In 2013/2014, twelve HPI students participated in this nine-month project-based innovation course. In cooperation with Stanford University and Siemens AG, one team created the boardroom of the future. Another team partnered with École des Ponts ParisTech and redesigned the bathroom for the elderly, in cooperation with the furniture manufacturer Lapeyre. A third global team, in collaboration with Aalto University and Bayer AG, tackled the question of how to use self-tracking devices for pharmaceutical research.

• Boardroom of the Future (with Siemens AG and Stanford University)

Meetings are a fundamental part of corporate culture, and they are necessary to disseminate information, exchange ideas, formulate strategy, and make executive decisions that “steer a company’s fortune.” Siemens AG gave us the task to “redesign the experience for decision makers” and create the boardroom of the future. Our findings from user research revealed that executives are frustrated with the inefficiencies in today’s meetings, like tedious technical setup and a non-collaborative environment. Furthermore, decision makers seek to leverage data in order to provide quantifiable insight into past and current operations. As a solution, we propose “The Q”, a meeting experience that reinvents executive decision making to be more productive, data-centric, and enjoyable. We have created a physical environment that fosters teamwork, combined with screens, tablet devices, and a voice-controlled software system that provides instant access to a company’s live data.

• Bathroom for the Elderly (with Lapeyre and ENPC Paris)

An average French person spends one hour per day in the bathroom for daily hygiene, dressing, and wellness. There are currently 9 million French people over 75 years old. At this age, physical impairments increase: the muscles are weaker, the balance is less steady, the body is stiffer, and the senses degenerate gradually. The devices, materials, and items in the bathroom are often not adapted to deal with such impairments. Since the bathroom is the place with the highest number of accidents at home, the autonomy of the elderly is linked considerably with the adaptation of their bathroom. This need for adaptation and the demographic changes provide a significant business opportunity for Lapeyre, one of the major bathroom distributors and manufacturers in France.

By interviewing medical experts and users, we quickly understood that a bathroom for the elderly not only needs to address functional needs but also has to be aesthetically desirable. It is difficult to accept physical difficulties when getting older, and thus products should not stigmatize with a poor design and a purely clinical look. We developed Intemporel, a new piece of bathroom furniture that provides comfort by offering a relaxing seating position while giving easy access to all products of daily use. Unlike common clinic-like “elderly furniture”, our product is for all age groups, whether they simply want to relax and sit back while they are brushing their teeth or wish to rest after standing for a long time. Due to some reluctance in the bathroom furniture market and a need for a simple and feasible product, we have chosen a clean and unobtrusive solution that integrates into a traditional dressing table, thus combining the comfort of a coiffeuse with the functional and hygienic requirements of a sink. Intemporel is going to be produced commercially by Lapeyre and will be available in stores in early 2015.

• Real-life Evidence for Pharmaceutics (with Bayer AG and Aalto University)

Drug development is a time-consuming and costly process for pharmaceutical vendors, posing considerable risks to their business. Billion-dollar investments and more than 10 years of research, development, and clinical testing are typical. Once a product has entered the market, health insurers, clinicians, and patients demand proof of efficacy and information on long-term consequences and interferences with other medication. Therefore, drug manufacturers need to monitor how their products function in a real-life environment. Leading pharma companies like Bayer put much effort into the collection of “real-life evidence data”, but managing and analyzing this data is difficult. That is why Bayer Healthcare challenged us to improve the gathering, managing, and merging of real-life evidence data.

After exploring the problem space, we found that data structure and quality vary heavily among different sources, institutions, and countries. Thus, instead of trying to merge existing sources, we propose to create a crowd-powered, open, and combined data source by leveraging the increasing popularity of self-tracking (“quantified self”).

With LINK, we propose a platform for everyone to contribute to healthcare research by participating in studies posed by researchers. LINK makes it easy for people to contribute data through interactive questionnaires, blood samples, and self-tracking devices. Study participants can easily track their medicine intake, provide feedback, and share data with researchers. With appropriate incentives provided, researchers are able to reach out to the right participants at very low cost and in short cycles. LINK reduces the distance and disconnect between patients and drug vendors and proposes real-life evidence monitoring in which patients and pharma companies together improve the development of better drugs.

6.4 openHPI – Online Courses

In 2014, our research group conducted and supervised a number of online courses on openHPI. For the first time, Prof. Plattner’s course on In-Memory Data Management was offered not only in English but also in Chinese.

• In-Memory Data Management – Implications on Enterprise Systems
  Date: 1st September – 3rd November
  Language: English

• In-Memory Data Management
  Date: 16th February – 14th April
  Language: Chinese

• In-Memory Data Management – Implications on Enterprise Systems
  Date: 3rd November – 31st December
  Language: Chinese

 

 

7. EVENTS, SPEECHES, AND PRESENTATIONS

In addition to presenting our work at international conferences and workshops, Prof. Plattner and members of our research group attended several events, delivered speeches, and represented HPI at special occasions. A selection is listed below.

• Plenary Keynote at VLDB Conference

Prof. Plattner gave the plenary keynote at the Very Large Data Bases Conference (VLDB) on September 2nd, 2014 in Hangzhou, China. The title of his talk was “The impact of In-Memory Databases on Enterprise Systems”.

Prof. Dr. Hasso Plattner at VLDB’14 in Hangzhou, China.

• Keynote Talk at the Technical University of Munich

On July 7, 2014, Prof. Plattner visited the Technical University of Munich (TUM) and gave a colloquium speech at the computer science department. Prof. Krcmar hosted the event, which was attended by the professors of the CS department and about 100 IT students.

Prof. Dr. Helmut Krcmar (TUM), Prof. Dr. Hasso Plattner (HPI)

• SAPPHIRE NOW 2014

SAPPHIRE NOW is one of the world’s premier business technology conferences for senior executives, line-of-business and IT decision makers, business managers, and project managers involved in deploying business technology initiatives. It took place on June 3-5, 2014 at the Orange County Convention Center in Orlando, Florida. The Hasso Plattner Institute was again well represented on the show floor. We met new and old project partners at the HPI booth and presented our latest work results, parts of which were referenced in the keynote speech of Prof. Plattner.

Our team presenting at the HPI booth on the SAPPHIRE show floor.

 

• SAP TechEd and d-code

Our team again attended the SAP TechEd and d-code event in Las Vegas on October 20-24, 2014, where we presented our work on the show floor and in technical talks. We also presented at the TechEd and d-code in Berlin, which took place on November 11-13, 2014. There, our team members Lars Butzmann and Stefan Klauck, together with our students Michael Weisz, Stephan Schulz, and Leo Kotschenreuther, won the SAP InnoJam and the SAP DemoJam 2014 contests with their concept and HANA-based prototype for “Remote Farming”.

Winners of the SAP InnoJam and DemoJam 2014: Michael Weisz, Lars Butzmann, Stephan Schulz, Leo Kotschenreuther, Stefan Klauck (from left to right)

• Fifth Workshop on Big Data Benchmarking

We hosted the Fifth Workshop on Big Data Benchmarking (5th WBDB) on August 5-6, 2014. The objective of the WBDB workshops is to make progress towards the development of industry-standard, application-level benchmarks for evaluating hardware and software systems for big data applications. Dr. Uflacker, local organization chair of the event, welcomed the international attendees on the HPI campus.

 

   

8. INDUSTRY PARTNERSHIPS

We would like to thank our industry partners for the trusting and fruitful collaboration in 2014.

• Audi AG
• Bayer AG
• Charité - Universitätsmedizin Berlin
• Colgate-Palmolive
• Intel
• Lapeyre
• SAP SE
• Siemens AG
• ThyssenKrupp AG

 

9. ACADEMIC PARTNERSHIPS

In 2014, our research group collaborated closely with the following universities and institutes.

• Stanford University, USA

Joint Class – Global Project-based Engineering Design, Innovation and Development (ME310), Prof. Larry Leifer and Prof. Mark Cutkosky

HPI-Stanford Design Thinking Research Program, Prof. Larry Leifer

• École des Ponts Business School, Paris, France

Joint Class – Global Project-based Engineering Design, Innovation and Development (ME310)

• Aalto University, Finland

Joint Class – Global Project-based Engineering Design, Innovation and Development (ME310)

• University of St. Gallen, Switzerland

Joint Class – Global Project-based Engineering Design, Innovation and Development (ME310)

 

                                     

http://epic.hpi.de