fase 2015 - map-based transparent persistence for very large models

MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS, by Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot


Post on 13-Apr-2017


Page 1: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS

Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot

Page 2: Fase 2015 - Map-based Transparent Persistence for Very Large Models

OUTLINE

▌ The landscape in MDE
▌ Motivation: running example and current persistence approaches
▌ Towards a simple EMF-based persistence layer
▌ NEOEMF/MAP: A transparent persistence layer for EMF models
▌ Our experimental evaluation in a nutshell
▌ Conclusions and future work

© ATLANMOD - [email protected]

Page 3: Fase 2015 - Map-based Transparent Persistence for Very Large Models

INTRODUCTION

Why another persistence solution?


Page 4: Fase 2015 - Map-based Transparent Persistence for Very Large Models

THE LANDSCAPE IN MDE

▌ Models and code generation are the center of the software-engineering processes
▌ Modeling tools are built around modeling frameworks (EMF has become the de facto standard)
▌ The technologies at the core of modeling frameworks were designed to support simple modeling activities
▌ Since its publication, the XMI standard has been the preferred format for storing and sharing models and metamodels
▌ Clear limits arise when current technologies are applied to VLMs: XML is not the right technology for VLMs (verbosity, costly serialization/deserialization…)
▌ Some solutions exist, but problems in managing memory and persisting data are still under-studied in MDE


Page 5: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MOTIVATION

Running example
Current persistence approaches

Page 6: Fase 2015 - Map-based Transparent Persistence for Very Large Models

RUNNING EXAMPLE

[Figure: Java metamodel (excerpt), nsURI: ’http://java’]

Page 7: Fase 2015 - Map-based Transparent Persistence for Very Large Models

RUNNING EXAMPLE

[Figure: Java metamodel (excerpt), nsURI: ’http://java’, together with an instance of it]

Page 8: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MOTIVATION

▌ Within a modeling ecosystem, all tools that need to access or manipulate models have to pass through a single model management interface

▌ In some of these ecosystems (e.g. EMF) the model management interface is automatically generated from the metamodel


Page 9: Fase 2015 - Map-based Transparent Persistence for Very Large Models

THE GENERATED MODEL MANAGEMENT INTERFACE

// Creation of objects
Package p1 = Factory.createPackage();
ClassDeclaration c1 = Factory.createClassDeclaration();
BodyDeclaration b1 = Factory.createBodyDeclaration();
BodyDeclaration b2 = Factory.createBodyDeclaration();
Modifier m1 = Factory.createModifier();
Modifier m2 = Factory.createModifier();

// Initialization of attributes
p1.setName("package1");
c1.setName("class1");
b1.setName("bodyDecl1");
b2.setName("bodyDecl2");
m1.setVisibility(VisibilityKind.PUBLIC);
m2.setVisibility(VisibilityKind.PUBLIC);

// Initialization of references
p1.getOwnedElements().add(c1);
c1.getBodyDeclarations().add(b1);
c1.getBodyDeclarations().add(b2);
b1.setModifier(m1);
b2.setModifier(m2);


Page 10: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MOTIVATION

▌ Without any specific memory-management solution, the model would need to be fully contained in memory for any access or modification

▌ Models that exceed the main memory would cause a significant performance drop or crash the application


Page 11: Fase 2015 - Map-based Transparent Persistence for Very Large Models

STANDARD TECHNOLOGIES FOR PERSISTING MODELS IN EMF

▌ XML-based (XMI)
│ Pros: Readability, fast for small models
│ Cons: Needs to load/keep the whole model in memory
▌ Connected Data Objects (CDO)
│ Pros: on-demand loading, transactions, versioning, notifications
│ Cons: Only the relational mapping is regularly maintained, does not scale well with VLMs

Page 12: Fase 2015 - Map-based Transparent Persistence for Very Large Models

NEW TRENDS IN PERSISTING MODELS IN EMF

▌ Morsa (document-oriented)
│ On-demand loading, incremental updates, fully compatible with the EMF API
│ Requires its own query language to get good performance
▌ MongoEMF (document-oriented)
│ Uses the standard EMF API
│ Behaves differently from the standard back-ends
▌ EMF fragments
│ Uses the standard proxy mechanism to partition models into small chunks
│ Requires modifications to the metamodels to get the benefits of partitioning
▌ NeoEMF/Graph, a.k.a. Neo4EMF (graph-based)
│ Models are a set of highly interconnected elements → graphs are the most natural way to represent them
│ The generated API only performs one-step navigations → a significant gain in performance is obtained only when using native queries on the underlying persistence back-end

Page 13: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MOTIVATION

▌ We need a transparent persistence layer able to automatically persist, load and unload model elements with no changes to the application code


Page 14: Fase 2015 - Map-based Transparent Persistence for Very Large Models

NEOEMF/MAP DESIGN GOALS

Towards a simple EMF-based persistence layer

Page 15: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MODEL-PERSISTENCE LAYER

▌ NEOEMF/MAP must…

Interoperability requirements:
│ … be an exact replacement
│ … use a replaceable underlying engine
│ … allow different types of caching

Performance requirements:
│ … be memory friendly
│ … provide on-demand load capabilities
│ … free unused memory
│ … outperform current persistence layers using the standard API

Page 16: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MODEL-PERSISTENCE LAYER

[Figure: layered architecture of the model-persistence framework. Layer labels, top to bottom: Client Code (Model-based Tools), Model Access API, ModelManager (EMF), Persistence API, PersistenceManager (XMI Serialization, CDO, NeoEMF/Graph, NeoEMF/Map), Backend API, PersistenceBackend (XMI File, RelationalDB, GraphDB, MapDB). A CachingStrategy component is also shown.]
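The "replaceable underlying engine" and "different types of caching" goals can be illustrated with a small sketch. Everything named here (`Backend`, `InMemoryBackend`, `LruCachingBackend`) is hypothetical and is not the NeoEMF API; the sketch only assumes that a back-end can be reduced to key-value `get`/`put` calls, so that a cache can be layered in front of any engine.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReplaceableBackendSketch {

    /** Hypothetical minimal back-end contract (an assumption, not the real Backend API). */
    interface Backend {
        Object get(String key);
        void put(String key, Object value);
    }

    /** Simplest possible engine: a HashMap. Any other engine could implement Backend instead. */
    static final class InMemoryBackend implements Backend {
        private final Map<String, Object> map = new HashMap<>();
        int reads = 0; // number of reads that reached the engine

        public Object get(String key) { reads++; return map.get(key); }
        public void put(String key, Object value) { map.put(key, value); }
    }

    /** One possible caching strategy: a bounded LRU cache in front of any Backend. */
    static final class LruCachingBackend implements Backend {
        private final Backend inner;
        private final Map<String, Object> cache;

        LruCachingBackend(Backend inner, final int capacity) {
            this.inner = inner;
            // Access-ordered LinkedHashMap that evicts the least recently used entry.
            this.cache = new LinkedHashMap<String, Object>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                    return size() > capacity;
                }
            };
        }

        public Object get(String key) {
            if (cache.containsKey(key)) {
                return cache.get(key);      // cache hit: the engine is not touched
            }
            Object value = inner.get(key);  // cache miss: delegate to the engine
            cache.put(key, value);
            return value;
        }

        public void put(String key, Object value) {
            inner.put(key, value);          // write-through to the engine
            cache.put(key, value);
        }
    }
}
```

Because client code only sees `Backend`, swapping the engine or the caching strategy does not change the callers, which is the property the design goals ask for.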

Page 17: Fase 2015 - Map-based Transparent Persistence for Very Large Models

NEOEMF/MAP: A TRANSPARENT PERSISTENCE LAYER FOR EMF MODELS

Memory management
Map-based data model
Model operations as map operations

Page 18: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MEMORY MANAGEMENT

▌ Decoupling dependencies among objects by assigning a unique identifier to all model objects allows:
▌ Lightweight on-demand loading
│ Each live model object has a lightweight delegate object that is in charge of loading the element data on demand and keeping track of the element’s state
▌ Efficient garbage collection in the JRE
│ No hard Java references are kept among model objects. Any model object not directly referenced by the application will be deallocated
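The two points above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the NeoEMF implementation: `Store` and `Element` are hypothetical names, and the store is a plain in-memory map standing in for the persistent back-end.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyElementSketch {

    /** Stand-in for the persistent store: element id -> (feature name -> value). */
    static final class Store {
        private final Map<String, Map<String, Object>> data = new HashMap<>();
        int loads = 0; // counts how many times element data was fetched

        void put(String id, String feature, Object value) {
            data.computeIfAbsent(id, k -> new HashMap<>()).put(feature, value);
        }

        Map<String, Object> load(String id) {
            loads++;
            return data.getOrDefault(id, new HashMap<>());
        }
    }

    /** Lightweight delegate: holds only an id and the store until data is requested. */
    static final class Element {
        private final String id;
        private final Store store;
        private Map<String, Object> loaded; // null until the first access

        Element(String id, Store store) {
            this.id = id;
            this.store = store;
        }

        boolean isLoaded() { return loaded != null; }

        Object get(String feature) {
            if (loaded == null) {
                loaded = store.load(id); // on-demand loading
            }
            return loaded.get(feature);
        }

        /**
         * References are stored as ids, so no Element holds a hard Java reference
         * to another Element; the target is materialized only when asked for.
         */
        Element resolve(String referenceFeature) {
            Object targetId = get(referenceFeature);
            return targetId == null ? null : new Element((String) targetId, store);
        }
    }
}
```

Since an `Element` never points at another `Element`, one that the application no longer references is unreachable and can be reclaimed by the garbage collector.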


Page 19: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED DATA MODEL

▌ The unique identifier allows flattening the graph structure into a set of key-value mappings
▌ Operations on hash-maps have a constant cost
▌ Three different (hash-)maps are used to store models’ information:
│ Property map: keeps all objects’ data in a centralized place
│ Type map: tracks how objects interact with the meta-level (e.g. instance of)
│ Containment map: defines the models’ structure in terms of containment references

Page 20: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED DATA MODEL

▌ Property map
│ Key: OID + EStructuralFeature
│ Value: data

Key                            Value
{ ‘c1’, ‘name’ }               ‘class1’
{ ‘c1’, ‘bodyDeclarations’ }   { ‘b1’, ‘b2’ }

Page 21: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED DATA MODEL

▌ Type map
│ Key: OID
│ Value: nsURI + EObject’s EClass

Key    Value
‘c1’   ⟨ nsUri=‘http://java’, class=‘ClassDeclaration’ ⟩

Page 22: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED DATA MODEL

▌ Containment map
│ Key: OID
│ Value: Container’s OID + EStructuralFeature (from parent to child)

Key    Value
‘c1’   ⟨ container=‘p1’, featureName=‘ownedElements’ ⟩
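As a rough illustration of the three maps, the entries shown for ‘c1’ can be loaded into ordinary Java `HashMap`s. This is only a sketch: the class and method names are hypothetical, composite keys and values are modeled as lists of strings for brevity, and a real back-end such as MapDB would replace the in-memory maps.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ThreeMapsSketch {
    // Property map: (OID, feature name) -> value
    final Map<List<String>, Object> properties = new HashMap<>();
    // Type map: OID -> (nsURI, EClass name)
    final Map<String, List<String>> types = new HashMap<>();
    // Containment map: OID -> (container OID, containing feature name)
    final Map<String, List<String>> containers = new HashMap<>();

    /** Composite key; List gives value-based equals/hashCode. */
    static List<String> key(String oid, String feature) {
        return Arrays.asList(oid, feature);
    }

    /** Fills the maps with the entries for 'c1' from the running example. */
    static ThreeMapsSketch runningExample() {
        ThreeMapsSketch m = new ThreeMapsSketch();
        m.properties.put(key("c1", "name"), "class1");
        m.properties.put(key("c1", "bodyDeclarations"), Arrays.asList("b1", "b2"));
        m.types.put("c1", Arrays.asList("http://java", "ClassDeclaration"));
        m.containers.put("c1", Arrays.asList("p1", "ownedElements"));
        return m;
    }

    /** Each accessor below is a single hash-map lookup. */
    Object get(String oid, String feature) { return properties.get(key(oid, feature)); }
    String typeOf(String oid) { return types.get(oid).get(1); }
    String containerOf(String oid) { return containers.get(oid).get(0); }
}
```

Note that the multi-valued reference `bodyDeclarations` stores the identifiers ‘b1’ and ‘b2’, not the objects themselves, which is what keeps the graph flattened.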


Page 23: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MODEL OPERATIONS AS MAP OPERATIONS

                                      LOOKUPS        INSERTS
METHOD                                MIN.   MAX.    MIN.   MAX.

OPERATIONS ON OBJECTS
getType                               1      1       0      0
getContainer                          1      1       0      0
getContainerFeature                   1      1       0      0

OPERATIONS ON PROPERTIES
get*                                  1      1       0      0
set*                                  0      3       1      3
isSet*                                1      1       0      0
unset*                                1      1       0      1

OPERATIONS ON MULTI-VALUED FEATURES
add                                   1      3       1      3
remove                                1      2       1      2
clear                                 0      0       1      1
size                                  1      1       0      0
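The simplest rows of the table can be checked with a map wrapper that counts its own operations. The sketch below is an assumption-laden illustration, not NeoEMF code: `CountingMap` and the accessor names are hypothetical, and only the minimum-cost cases are modeled (a `get*`, a `set*` on a plain attribute, and `getContainer`); the maximum-cost cases, which the slide does not detail, are not reproduced.

```java
import java.util.HashMap;
import java.util.Map;

final class CountingMapSketch {

    /** HashMap wrapper that counts lookups and inserts (hypothetical helper). */
    static final class CountingMap<K, V> {
        private final Map<K, V> delegate = new HashMap<>();
        int lookups = 0;
        int inserts = 0;

        V get(K key) { lookups++; return delegate.get(key); }
        void put(K key, V value) { inserts++; delegate.put(key, value); }
    }

    final CountingMap<String, Object> properties = new CountingMap<>();
    final CountingMap<String, String> containers = new CountingMap<>();

    private static String key(String oid, String feature) {
        return oid + "#" + feature; // simplified composite key
    }

    /** get*: one lookup in the property map, no insert. */
    Object getAttribute(String oid, String feature) {
        return properties.get(key(oid, feature));
    }

    /** set* on a plain attribute: one insert and no lookup (the minimum in the table). */
    void setAttribute(String oid, String feature, Object value) {
        properties.put(key(oid, feature), value);
    }

    /** getContainer: one lookup in the containment map, no insert. */
    String getContainer(String oid) {
        return containers.get(oid);
    }
}
```

Counting operations this way makes the cost model explicit: every model-level call translates into a small, bounded number of map operations.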


Page 24: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENTAL EVALUATION

Conditions of the experiments
Results
Summary

Page 25: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENTAL EVALUATION

▌ Based on our joint experience with industrial partners:
│ We obtained three models from OSS using reverse engineering…
│ … that resemble models from real-world scenarios
│ We defined a set of queries (GraBaTs’09 and industry-like)
│ Only the standard EMF API is used → Queries are backend-agnostic
│ Three heap sizes: 8GB, 512MB and 256MB

#   MODEL                          SIZE IN XMI   ELEMENTS
1   org.eclipse.gmt.modisco.java   19.3MB        80.665
2   org.eclipse.jdt.core           420.6MB       1.557.007
3   org.eclipse.jdt.*              984.7MB       3.609.454

Page 26: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENTAL EVALUATION

▌ Selected back-ends:
│ NEOEMF/MAP (MapDB)
│ NEOEMF/GRAPH (Neo4j embedded)
│ CDO (H2 embedded)
▌ Discarded back-ends:
│ MongoEMF → does not strictly comply with the standard EMF behavior
│ EMF-fragments → requires manual modifications in the source models or metamodels
│ Morsa → only a small subset of the experiments ran successfully

Configuration details: Intel Core i7 3740QM (2.70GHz), 16 GB of DDR3 SDRAM (800MHz), Samsung SM841 SATA3 SSD (6Gb/s), Windows 7 Enterprise 64, JRE 1.7.0_40-b43, Eclipse 4.4.0, EMF 2.10.1, NeoEMF/Map uses MapDB 0.9.10, NeoEMF/Graph uses Neo4j 1.9.2, CDO 4.3.1 runs on top of H2 1.3.168

Page 27: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT I

Import model from XMI (8GB)

               Model 1   Model 2   Model 3
NeoEMF/Map     9 s       161 s     412 s
NeoEMF/Graph   41 s      1161 s    3767 s
CDO            12 s      120 s     301 s

Page 28: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT II

Model traversal 8GB (incl. loading & unloading)

               Model 1   Model 2   Model 3
XMI            4 s       35 s      79 s
NeoEMF/Map     3 s       25 s      62 s
NeoEMF/Graph   16 s      201 s     708 s
CDO            14 s      133 s     309 s

Page 29: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT II

[Bar chart: Model traversal 512MB (incl. loading & unloading), for XMI, NeoEMF/Map, NeoEMF/Graph and CDO on Models 1 to 3. Bar labels as extracted, in order: 4, 3, 42, 366, 15, 235, 763, 13, 550, 548.]

Page 30: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT III

Model queries that do not traverse the model 8GB

               Model 1   Model 2   Model 3
NeoEMF/Map     0 s       0 s       0 s
NeoEMF/Graph   0 s       2 s       19 s
CDO            0 s       0 s       2 s

Page 31: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT IV

GraBaTs’09 8GB

               Model 1   Model 2   Model 3
NeoEMF/Map     1 s       24 s      61 s
NeoEMF/Graph   11 s      188 s     717 s
CDO            9 s       48 s      367 s

Unused Methods 8GB

               Model 1   Model 2   Model 3
NeoEMF/Map     2 s       36 s      101 s
NeoEMF/Graph   17 s      359 s     1328 s
CDO            9 s       131 s     294 s

Page 32: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT V

Model modification and saving 8GB

               Model 1   Model 2   Model 3
NeoEMF/Map     1 s       24 s      62 s
NeoEMF/Graph   11 s      191 s     677 s
CDO            9 s       118 s     296 s

Page 33: Fase 2015 - Map-based Transparent Persistence for Very Large Models

EXPERIMENT V

[Bar chart: Model modification and saving 256MB, for NeoEMF/Map, NeoEMF/Graph and CDO on Models 1 to 3. Bar labels as extracted, in order: 1 s, 160 s, 472 s, 11 s, 224 s, 9 s, 723 s.]

Page 34: Fase 2015 - Map-based Transparent Persistence for Very Large Models

SUMMARY

▌ NeoEMF/Map performs better than any other solution when using the standard API
▌ NeoEMF/Map presents import times of the same order of magnitude as CDO, but it is about 33% slower for the largest model → NeoEMF/Map is affected by the overhead produced by modifications on big lists (>100.000 elements) that grow monotonically (caching is needed)
▌ The simple data model with low-cost operations implemented by NeoEMF/Map contrasts with the more complex data model implemented by NeoEMF/Graph (consistently slower by a factor between 7 and 9)

Page 35: Fase 2015 - Map-based Transparent Persistence for Very Large Models

SUMMARY

▌ Traversal of a very large model is much faster (up to 9×) with NeoEMF/Map

▌ If load and unload times are considered, NeoEMF/Map also outperforms XMI

▌ The fast model-traversal ability of NeoEMF/Map is exploited by the pattern followed by most of the queries in the modernization domain

▌ Queries that traverse the model to apply and persist changes perform significantly better on NeoEMF/Map (5× faster on big models, 9× on small models).


Page 36: Fase 2015 - Map-based Transparent Persistence for Very Large Models

CONCLUSIONS

Conclusions
Future work

Page 37: Fase 2015 - Map-based Transparent Persistence for Very Large Models

CONCLUSIONS

▌ Map-based persistence layer to handle VLMs
▌ Comparison against relational-based and graph-based alternatives
▌ EMF as the implementation technology
▌ We used queries from some of our industrial partners in the model-driven modernization domain as experiments
▌ Typical model-access APIs, with fine-grained methods with one-step-navigation queries, do not benefit from complex relational or graph-based data structures.
▌ Low-level data structures, like hash-tables, with low and constant access times provide better results

Page 38: Fase 2015 - Map-based Transparent Persistence for Very Large Models

FUTURE WORK

▌ Caching strategies:
│ Element unloading (which element is not needed anymore?)
│ Element prefetching (which element will be needed in future?)
▌ Benefits of other backends depending on the specific application scenario:
│ Graph-based persistence solutions when some of our requirements can be dropped
│ Bypassing the model access API by translating the queries to high performance native graph-database queries may provide great benefits

Page 39: Fase 2015 - Map-based Transparent Persistence for Very Large Models

MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS

Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot