generation of random emf models for benchmarks

Generation of Random Software Models for Benchmarks

Markus Scheidgen

1

Agenda

▶ Benchmarks for MDE

▶ Input models for MDE benchmarks

▶ Generation of random models

■ Language

■ Examples

▶ Related Work

▶ Conclusion

2

Benchmarks (I)

▶ In smallMDE, technology (1) is evaluated solely by its functionality

▶ In BigMDE, technology is evaluated by its functionality and its performance (execution time, memory consumption, ...)

▶ Benchmarks enable sound comparison of technologies based on their performance

1) technology = algorithms ∪ methods ∪ tools ∪ frameworks

3

Benchmarks (II)

▶ A benchmark describes the measurement ...

■ of a well-defined property

■ acquired in a well-defined process

■ with a well-defined workload (tasks and inputs)

■ in a well-defined environment

4

All measurements were performed on a Notebook computer with Intel Core i5 2.4GHz CPU, 8 GB 1067 MHz DDR3 RAM, running Mac OS 10.7.3.

+ some environment

- software versions, JVM configuration

M. Scheidgen, A. Zubow, J. Fischer, T. H. Kolbe: Automated and Transparent Model Fragmentation for Persisting Large Models; ACM/IEEE 15th International Conference on Model Driven Engineering Languages & Systems (MODELS); Innsbruck; 2012; LNCS Springer

http://www.markus-scheidgen.de/wp-content/uploads/2012/07/Automated-and-Transparent-Model-Fragmentation-for-Persisting-Large-Models.pdf




All experiments were repeated at least 20 times, and all present results are respective averages.

+ some process

- distribution, variation, outlier

- warmup: JIT, caches, GC
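The remarks on the measurement process (warm-up, and reporting the distribution rather than only averages) can be illustrated with a small harness. This is a minimal sketch, not the procedure used in the cited paper: the function name `measure`, its parameters, and the stand-in workload are assumptions made for illustration.

```python
import statistics
import time


def measure(task, repetitions=20, warmup=5):
    """Run `task` repeatedly and summarise the timing distribution."""
    # Warm-up runs are discarded so that caches and lazy initialisation
    # do not distort the first measured repetitions.
    for _ in range(warmup):
        task()

    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)

    # Report spread and extremes, not only the average.
    return {
        "n": len(samples),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical workload standing in for "instantiating objects".
    result = measure(lambda: [object() for _ in range(10_000)])
    for key, value in result.items():
        print(key, value)
```

Reporting median, standard deviation, minimum, and maximum alongside the mean makes outliers and variation visible to the reader of the benchmark.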



We measured the performance of instantiating and persisting objects.

+ some property description

- exact task?

- comparable between technologies








[Figure: objects per second (× 10^4) for XMI, CDO, Morsa, and EMFFrag, w/ and w/o cross references; axis ticks 10^3, 10^4, 10^5]

We created test models with 10^5 objects, a binary containment hierarchy, and two different densities of cross references: one cross reference per object and no cross references.

+ two specific shapes

- two specific shapes

- real world likeness



Input for Benchmarks (I)

▶ A benchmark input model should

■ include no bias

■ invoke real-world behavior

■ cover different scenarios

■ have properties on a metric scale

6

Input for Benchmarks – In MDE

▶ For MDE technology, the input is usually a software engineering artifact, commonly referred to as a model

▶ Usually the models from the GraBaTs 2009 graph transformation contest are used

■ MoDisco models of JDT

■ different sizes and shapes (with and without method implementations)

■ sizes do not scale linearly

7

Input for Benchmarks – Properties: Size & Shape

▶ different properties to mimic different scenarios and invoke different behavior/performance characteristics

▶ goal: understand correlation between performance properties and model sizes and shapes

▶ ordinal vs. metric scale

▶ What defines a shape?

■ metrics (depending on the language, e.g. methods per class in OO programming)

■ graph/tree properties (degree, connectedness, sparse vs. dense, etc.)

▶ What defines size?

■ # objects

■ # values

■ # links
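The three size measures listed above can be computed with a single traversal of the containment hierarchy. The following is a minimal sketch; the `Obj` class is a hypothetical stand-in for a model element and is not an EMF API.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical model element: attribute values plus outgoing links."""
    values: dict = field(default_factory=dict)
    children: list = field(default_factory=list)    # containment links
    references: list = field(default_factory=list)  # cross references


def model_size(root):
    """Count objects, values, and links reachable via containment."""
    objects = values = links = 0
    stack = [root]
    while stack:
        obj = stack.pop()
        objects += 1
        values += len(obj.values)
        links += len(obj.children) + len(obj.references)
        # Only containment is followed, so every object is visited once.
        stack.extend(obj.children)
    return {"objects": objects, "values": values, "links": links}


leaf_a = Obj(values={"name": "a"})
leaf_b = Obj(values={"name": "b"}, references=[leaf_a])
root = Obj(values={"name": "root"}, children=[leaf_a, leaf_b])
print(model_size(root))  # {'objects': 3, 'values': 3, 'links': 3}
```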

8


Input for Benchmarks – Approaches

▶ handcraft input models – no scalability

▶ take existing models – only given shapes

▶ generate models – do not mimic the real world

▶ bias?

■ bias in creation, selection, algorithm

■ bias is a social problem; technology cannot solve social problems

10

Input for Benchmarks – Random Models

▶ random ≠ arbitrary, and random ≠ uniform

▶ surprise element

▶ probability distributions as abstractions for typical usage of language constructs

■ e.g. the number of methods per class typically follows a negative binomial distribution (with certain parameters) [1]

▶ distribution parameters to define shapes

▶ random models can be sensible representatives of a large class of models
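To illustrate how a distribution and its parameters define a shape, the sketch below draws a "methods per class" count from a negative binomial distribution. The parameter values r = 5 and p = 0.5 are illustrative assumptions and are not taken from [1]; the function `neg_binomial` is written out by hand because Python's standard `random` module offers no such sampler.

```python
import random


def neg_binomial(rng, r, p):
    """Number of failures before the r-th success, success probability p."""
    failures = 0
    successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures


rng = random.Random(42)  # fixed seed so the generated shape is reproducible
methods_per_class = [neg_binomial(rng, r=5, p=0.5) for _ in range(10_000)]
mean = sum(methods_per_class) / len(methods_per_class)
# The theoretical mean of this parameterisation is r * (1 - p) / p = 5.
print(round(mean, 2))
```

Changing r and p shifts and stretches the distribution, which is how distribution parameters translate into differently shaped models.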

11

[1] Tetsuo Tamai, Takako Nakatani: Analysis of Software Evolution Processes Using Statistical Distribution Models, IWPSE '02

http://iwpse2002.ics.es.osaka-u.ac.jp/


Generation of Random Models – A Generator DSL (I)

12

generator RandomEcore for ecore in "...ecore/model/Ecore.ecore" {
  ePackage: EPackage ->
    name := RandomID(Normal(8,3))
    eClassifiers += eClass#NegBinomial(5,0.5)
  ;
  eClass: EClass ->
    name := RandomID(Normal(10,4))
    abstract := UniformBool(0.2)
    eStructuralFeatures += eReference(UniformBool(0.3))#NegBinomial(4,0.7)
    eStructuralFeatures += eAttribute#NegBinomial(6,0.5)
  ;
  eReference(boolean composite):EReference ->
    name := RandomID(Normal(10,4))
    upperBound := if (UniformBool(0.5)) -1 else 1
    ordered := UniformBool(0.2)
    containment := composite
    eType:EClass := Uniform(model.EClassifiers.filter[it instanceof EClass])
  ;
  ...
}

http://github.com/markus1978/RandomEMF


Generation of Random Models – A Generator DSL (II)

13

▶ Maps the meta-model to a grammar-like description

▶ Rule based

▶ Each rule creates an object of a certain meta-class

▶ Each rule calls other rules to create features

▶ Rules can have parameters

▶ Expressions with random values

■ different distributions for random number generation

■ random number of rule applications

■ random values (e.g. identifier, choices)

▶ Xtext + Xbase DSL
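The rule-based idea can be mimicked in plain Python to make the mechanics concrete. This is a sketch under stated assumptions and not the RandomEMF implementation: plain dictionaries stand in for EMF objects, `NegBinomial(r, p)` is read as "failures before the r-th success", `UniformBool(p)` as "true with probability p", and `RandomID` as a syllable string of roughly normally distributed length. All function names and the syllable list are hypothetical.

```python
import random

rng = random.Random(1)
SYLLABLES = ["da", "bo", "gus", "li", "tor", "bu", "fri", "nus"]


def neg_binomial(r, p):
    """Failures before the r-th success (assumed reading of NegBinomial)."""
    failures = successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures


def random_id(mean, stddev):
    """Identifier whose length is roughly normally distributed."""
    length = max(2, round(rng.gauss(mean, stddev)))
    name = ""
    while len(name) < length:
        name += rng.choice(SYLLABLES)
    return name[:length]


def e_attribute():
    return {"kind": "EAttribute", "name": random_id(10, 4)}


def e_reference(composite):
    # A rule with a parameter, like eReference(boolean composite).
    return {
        "kind": "EReference",
        "name": random_id(10, 4),
        "upperBound": -1 if rng.random() < 0.5 else 1,
        "containment": composite,
    }


def e_class():
    # Each rule creates one object and calls other rules a random number of times.
    features = [e_reference(rng.random() < 0.3) for _ in range(neg_binomial(4, 0.7))]
    features += [e_attribute() for _ in range(neg_binomial(6, 0.5))]
    return {"kind": "EClass", "name": random_id(10, 4),
            "abstract": rng.random() < 0.2, "eStructuralFeatures": features}


def e_package():
    classes = [e_class() for _ in range(neg_binomial(5, 0.5))]
    return {"kind": "EPackage", "name": random_id(8, 3), "eClassifiers": classes}


package = e_package()
print(package["name"], len(package["eClassifiers"]))
```

Each Python function corresponds to one rule, and the distribution arguments play the role of the shape parameters.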

Generation of Random Models – Generated Example

14

package dabobobues;

class Dues {

    DuBoBuTus begubicus;
    ELius brauguslus;

    void Dues(Alius donus, FanulAudaCio aubetin) {
    }

    void baGusFritus() {
        eudaguslius = "";
        bigusdaGubolius();
        if ("") {
            annulAugusaugusfrigustin("");
            albucio = Dues()<=++12;
            bi();
            eBoTor();
        } else {
            brauguslus = 9;
            baGusFritus();
            duLus = ""=="";
        }
    }

    void aufribonulAubufrinus(Dues e) {
        dobubogutor();
        aubiguTus = 9;
    }
}

[Figure: element kinds (classes/interfaces, methods, statements, expressions, others) in randomly generated code, actual Java code, and synthetically generated code]

15

[Figure: generated models vs. actual projects]

Generation of Random Models – Problems

▶ randomness is a tool to reduce bias, but users have to apply it correctly

▶ hard to generate models that are correct with respect to static semantics

16

Related Work

▶ Test-Model generation with SAT-Solvers

■ Meta-Model/Constraint divided into small partitions that cover test-cases

■ translation into logical equations

■ SAT-Solver

■ translation of results into model-fragments

■ composition of test-models from model fragments

➡ small, valid models with statistically proven test coverage

17

Sagar Sen, Benoit Baudry, Jean-Marie Mottu: Automatic Model Generation Strategies for Model Transformation Testing, Theory and Practice of Model Transformations, Springer, 2009

Erwan Brottier, Franck Fleurey, Jim Steel, Benoit Baudry, Yves Le Traon: Metamodel-based Test Generation for Model Transformations: an Algorithm and a Tool, ISSRE’06, IEEE, 2006

Related Work

▶ Translation into a constructive formalism

■ Meta-modeling is not constructive (the full set of instances cannot be generated from a meta-model)

■ translation into context-free or graph-grammars

■ random application of rules to generate random models

➡ large models, shape can be influenced via probability distributions on rule selection
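How rule-selection probabilities influence shape can be seen in a toy grammar. The sketch below is not taken from the cited work: it assumes a hypothetical grammar `Tree -> Leaf | Node(Tree, Tree)`, where `p_node` is the probability of choosing the `Node` rule, and adds a depth limit so that generation always terminates.

```python
import random


def generate(rng, p_node, depth=0, max_depth=12):
    """Expand Tree -> Leaf | Node(Tree, Tree), choosing Node with probability p_node."""
    if depth >= max_depth or rng.random() >= p_node:
        return "leaf"
    return (generate(rng, p_node, depth + 1, max_depth),
            generate(rng, p_node, depth + 1, max_depth))


def size(tree):
    """Total number of leaves and inner nodes."""
    return 1 if tree == "leaf" else 1 + size(tree[0]) + size(tree[1])


rng = random.Random(3)
for p_node in (0.2, 0.45):
    sizes = [size(generate(rng, p_node)) for _ in range(1_000)]
    # A higher probability for the recursive rule yields larger trees on average.
    print(p_node, sum(sizes) / len(sizes))
```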

18

K. Ehrig, J. M. Küster, G. Taentzer: Generating instance models from meta models, Formal Methods for Open Object-Based Distributed Systems, Springer, 2006

http://link.springer.com/chapter/10.1007/11768869_13




Related Work

▶ Fitting meta-model instances onto randomly generated tree/graph structures

■ existing methods for random tree or graph generation

■ interpretation of randomly generated trees/graphs as meta-model instances

➡ large but uniform models, not aware of static semantics

19

A. Mougenot, A. Darrasse, X. Blanc, M. Soria: Uniform random generation of huge metamodel instances, ECMDA, Springer, 2009

http://link.springer.com/chapter/10.1007/978-3-642-02674-4_10




Related Work

▶ benchmark definitions for graph transformations

▶ different distributions for graph edges to create different shapes

■ binomial

■ hypergeometric

■ uniform

■ preferential attachment

➡ large models, not aware of static semantics
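Two of the listed edge distributions can be contrasted in a few lines. This is an illustrative sketch and not the generator from the cited benchmark: the function names are hypothetical, and preferential attachment is simplified to one new edge per added node.

```python
import random
from collections import Counter


def uniform_edges(rng, n_nodes, n_edges):
    """Each edge picks source and target uniformly at random."""
    return [(rng.randrange(n_nodes), rng.randrange(n_nodes)) for _ in range(n_edges)]


def preferential_edges(rng, n_nodes):
    """Each new node links to an existing node chosen proportionally to its degree."""
    edges = [(1, 0)]
    endpoints = [1, 0]  # every node appears once per incident edge
    for node in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((node, target))
        endpoints += [node, target]
    return edges


def max_degree(edges):
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return max(degree.values())


rng = random.Random(7)
n = 2_000
print("uniform:", max_degree(uniform_edges(rng, n, n - 1)))
print("preferential:", max_degree(preferential_edges(rng, n)))
```

With the same number of nodes and edges, preferential attachment tends to produce a few highly connected hubs, while uniform selection keeps degrees close together; this is one way the choice of distribution changes the shape.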

20

Izso, B., Szatmari, Z., Bergmann, G., Horvath, A., & Rath, I.: Towards precise metrics for predicting graph query performance. 2013 28th IEEE/ACM International Conference on Automated Software Engineering, ASE 2013

Conclusions

▶ benchmarking in MDE can be improved

▶ there are other options for input models than the GraBaTs '09 contest models

▶ different shapes (preferably on a metric scale) should be used to find distinctive merits and flaws in compared technologies

▶ generators for random models

■ parameters to create differently shaped models

■ randomness and suitable distributions for real-world-like input

■ linearly scaled sizes

21
