EMF large-scale modeling outside Eclipse. EclipseCon Europe 2011
DESCRIPTION
EMF is used successfully in almost every large Eclipse project, but its adoption outside the Eclipse ecosystem is very low. In this talk, Renat Zubairov from Talend discusses how EMF is used for large-scale meta-modeling in a non-Eclipse project: Smooks, the extensible data binding and processing framework. In Smooks, EMF and Eclipse modeling technologies are used for data processing applications that model the large legacy UN/EDIFACT standards. UN/EDIFACT reference models built with EMF store records for 40 different directories versioned over 10 years. With 2 releases per year, the system contains around 800 large, interconnected models. During my talk I will cover the following aspects and challenges:
‣ Cultural challenge - Eclipse modeling is highly coupled and can't be used without Eclipse
‣ Using EMF artifacts in a Maven build - where to find artifacts, how to use them
‣ Re-use of artifacts with and without the Eclipse platform - tips and tricks for packaging and deployment
‣ Deploying an EMF-based application on Google cloud - App Engine and EMF
TRANSCRIPT
EMF large-scale modeling outside of Eclipse
by Renat Zubairov
Thursday, 3 November 2011
© Talend 2011 follow me on @zubairov
About me
Product Owner at Talend (ex. SOPERA)
Open source contributions to:
‣ Apache Tapestry
‣ Emf4Swing
‣ Eclipse BPMN Designer
‣ Eclipse Swordfish + Swordfish Tooling
‣ Smooks
‣ Talend AI Tooling
@zubairov
github.com/zubairov
What am I going to talk about?
Smooks Project
‣ Smooks is a data integration framework for ‘building applications for processing XML and non XML data using Java’
‣ Main features
  ‣ Java Binding
  ‣ Transformation
  ‣ Large message processing
  ‣ Message enrichment
  ‣ Validation
  ‣ EDI & UN/EDIFACT support
EDI & UN/EDIFACT
‣ Major EDI standards
  ‣ UN/EDIFACT (outside US)
  ‣ US ANSI X12
  ‣ TRADACOMS
  ‣ ODETTE
  ‣ IATA standards
‣ UN/EDIFACT
  ‣ 26 directories, ~160 message types each
UNA:+.? '
UNB+IATB:1+6XPPC+LHPPC+940101:0950+1'
UNH+1+PAORES:93:1:IA'
MSG+1:45'
IFT+3+XYZCOMPANY AVAILABILITY'
ERC+A7V:1:AMD'
IFT+3+NO MORE FLIGHTS'
ODI'
TVL+240493:1000::1220+FRA+JFK+DL+400+C'
PDI++C:3+Y::3+F::1'
APD+74C:0:::6++++++6X'
TVL+240493:1740::2030+JFK+MIA+DL+081+C'
PDI++C:4'
APD+EM2:0:1630::6+++++++DA'
UNT+13+1'
UNZ+1+1'
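To make the structure of the sample message easier to see, here is a minimal Java sketch that splits an interchange into segments. It assumes the standard UNA service string layout (release character at position 7, segment terminator at position 9) and falls back to `?` and `'` when no UNA is present. The class and method names are hypothetical and are not part of Smooks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Smooks code: splits a UN/EDIFACT interchange into segments.
final class EdifactSegments {

    /** Splits on the segment terminator, leaving released (escaped) characters intact. */
    static List<String> split(String interchange) {
        char release = '?';      // default release character
        char terminator = '\'';  // default segment terminator
        int start = 0;
        // An optional UNA service string advice redefines the delimiters.
        if (interchange.startsWith("UNA") && interchange.length() >= 9) {
            release = interchange.charAt(6);
            terminator = interchange.charAt(8);
            start = 9;
        }
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = start; i < interchange.length(); i++) {
            char c = interchange.charAt(i);
            if (c == release && i + 1 < interchange.length()) {
                // Keep the release character and the character it escapes.
                current.append(c).append(interchange.charAt(++i));
            } else if (c == terminator) {
                addIfNotBlank(segments, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotBlank(segments, current); // tolerate a missing final terminator
        return segments;
    }

    /** Returns the segment tag, i.e. everything before the first '+'. */
    static String tag(String segment) {
        int plus = segment.indexOf('+');
        return plus < 0 ? segment : segment.substring(0, plus);
    }

    private static void addIfNotBlank(List<String> segments, StringBuilder current) {
        String s = current.toString().trim();
        if (!s.isEmpty()) {
            segments.add(s);
        }
    }
}
```

Calling `EdifactSegments.split(...)` on the message above would yield one string per segment, starting with the `UNB` interchange header and ending with `UNZ`.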
Old UN/EDIFACT processing approach with Smooks
[Diagram. Model side: Dictionary, ECT, proprietary EDI file model, EJC, Java sources. Runtime side: EDI file, EDIParser, SAX events, Mapper, Java instances. Legend: Smooks code.]
New processing approach with Smooks and EMF
[Diagram. Model side: Dictionary, ECT, Ecore model, Genmodel, Java sources. Runtime side: EDI file, EDIParser, SAX events, EMF runtime, Java instances. Legend: Smooks code, EMF code.]
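In both the old and the new approach the EDIParser hands SAX events to the next stage. The following sketch illustrates that idea with the JDK's `org.xml.sax` API only: each segment becomes an element and each data element becomes a `field` child. It is an assumption-laden illustration, not the actual Smooks EDIParser. The element names, the `SegmentSaxEmitter` class and the `Recorder` handler are all hypothetical, and release characters are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not Smooks code: surfaces EDI segments as SAX events.
final class SegmentSaxEmitter {

    /** Emits one element per segment and one "field" child per data element. */
    static void emit(List<String> segments, ContentHandler handler) throws SAXException {
        AttributesImpl noAttrs = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "interchange", "interchange", noAttrs);
        for (String segment : segments) {
            // Simplification: split on '+' without honouring the release character.
            String[] parts = segment.split("\\+", -1);
            String tag = parts[0];
            handler.startElement("", tag, tag, noAttrs);
            for (int i = 1; i < parts.length; i++) {
                handler.startElement("", "field", "field", noAttrs);
                char[] text = parts[i].toCharArray();
                handler.characters(text, 0, text.length);
                handler.endElement("", "field", "field");
            }
            handler.endElement("", tag, tag);
        }
        handler.endElement("", "interchange", "interchange");
        handler.endDocument();
    }

    /** Records events as strings; stands in for whatever consumes the events downstream. */
    static final class Recorder extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add("text:" + new String(ch, start, length));
        }
    }

    /** Convenience wrapper that returns the recorded event trace. */
    static List<String> trace(List<String> segments) {
        Recorder recorder = new Recorder();
        try {
            emit(segments, recorder);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return recorder.events;
    }
}
```

Because the consumer only sees a `ContentHandler`, the stage behind the parser can be swapped without touching the parser itself, which is the property both diagrams rely on.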
By-product: Eclipse EDI Editor
Challenges
Culture and positioning
Apache Planet
Build challenges
‣ Maven is the de facto standard.
‣ The latest EMF JAR file available from http://mvnrepository.com is 2.6.0, built in June 2010
‣ Missing sources
‣ Broken dependency tree
Coupling
For example, to parse an XML Schema with EMF I would need:
‣ org.eclipse.core.runtime
‣ org.eclipse.core.jobs
‣ org.eclipse.osgi
‣ org.eclipse.equinox.app
‣ org.osgi.foundation
‣ servlet-api
Do we need all of it?
Re-using resulting artifact
‣ Artifacts should be usable in four runtime environments
  ‣ Java standalone
  ‣ Eclipse (as parts of XML Catalog)
  ‣ OSGi runtime (together with Apache Camel)
  ‣ WAR file deployed on Google App Engine
‣ Different configuration discovery mechanisms
  ‣ Extension points from plugin.xml / fragment.xml in the Eclipse environment
  ‣ Manual classpath discovery in Java standalone + WAR
  ‣ OSGi Configuration Admin and Blueprint in OSGi
‣ As a result, the same information is duplicated across all three mechanisms.
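As an illustration of the "manual classpath discovery" variant, the sketch below looks up every copy of a named resource visible to a class loader, using only the JDK. The class name and the idea of a per-JAR index resource are assumptions made for this example. The slides do not say how the discovery is actually implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of manual classpath discovery for Java standalone and WAR deployments.
final class ClasspathModelDiscovery {

    /**
     * Returns the URLs of all resources with the given name that the loader can see.
     * Each JAR contributing models would ship its own copy of the resource.
     */
    static List<URL> discover(ClassLoader loader, String resourceName) {
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as `ClasspathModelDiscovery.discover(Thread.currentThread().getContextClassLoader(), "META-INF/edi-models.index")` (resource name invented for this example) works with a flat classpath. In OSGi each bundle has its own class loader, which is one reason a separate mechanism is needed there.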
Deploying to Google App-Engine
‣ A prototype service deployed on App Engine converts UN/EDIFACT into XML: http://edi-to-xml.appspot.com
‣ The XML produced by that service references schemas generated from the Ecore model.
‣ WARNING: App Engine does not support signed JAR files, and all EMF JAR files are signed. See issue #3754 in the Google App Engine issue tracker.
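To find out which JARs in a WAR are affected by the signed-JAR limitation, one could scan them for signature files. The sketch below follows the JAR file specification's naming rules for signature-related entries directly under `META-INF/` (`.SF`, `.RSA`, `.DSA`, `.EC`, or a `SIG-` prefix). The class name is hypothetical, and the check only detects signatures. It does not change the JAR.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.jar.JarFile;

// Hypothetical helper: detects whether a JAR carries signature files.
final class SignedJarCheck {

    /** True if the entry name denotes a signature-related file directly under META-INF/. */
    static boolean isSignatureEntry(String entryName) {
        String upper = entryName.toUpperCase(Locale.ROOT);
        String prefix = "META-INF/";
        if (!upper.startsWith(prefix)) {
            return false;
        }
        String file = upper.substring(prefix.length());
        if (file.isEmpty() || file.indexOf('/') >= 0) {
            return false; // entries in subdirectories are not signature files
        }
        return file.endsWith(".SF") || file.endsWith(".RSA") || file.endsWith(".DSA")
                || file.endsWith(".EC") || file.startsWith("SIG-");
    }

    /** True if the JAR at the given path contains at least one signature-related entry. */
    static boolean isSigned(Path jar) throws IOException {
        // verify=false: we only inspect entry names, not the signatures themselves.
        try (JarFile jarFile = new JarFile(jar.toFile(), false)) {
            return jarFile.stream().anyMatch(e -> isSignatureEntry(e.getName()));
        }
    }
}
```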
EMF scalability issues (for our use-cases)
‣ We have 26 directories with ~160 message types each, so altogether that's ~80k classifiers.
‣ Serializing an Ecore model with 336 classifiers and 1155 structural features as an annotated XML Schema takes 5 minutes on an Intel Core i7.
‣ And that's only one message type out of ~4160.
‣ Quickly got an answer on the EMF Forum:
  ‣ Answer from Ed: ‘I don't imagine folks change their models so often that a few minutes for a large schema is a big concern...’ See http://www.eclipse.org/forums/index.php/m/663952/
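The scale of the problem follows from the numbers above. As a rough estimate, assuming every message type takes about as long as the one measured and that serialization runs sequentially on a single machine (both are assumptions, not measurements from the talk):

```latex
% Rough estimate; assumes uniform cost per message type and sequential execution.
\begin{align*}
N_{\text{message types}} &\approx 26 \times 160 = 4160 \\
T_{\text{total}} &\approx 4160 \times 5\ \text{min} = 20\,800\ \text{min}
  \approx 347\ \text{h} \approx 14.4\ \text{days}
\end{align*}
```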
Why would you care?
Questions?
Mail them to Renat.Zubairov at gmail.com or tweet to @zubairov