OVERVIEW OF OPEN XML & ACCESSING OPEN XML DATA FROM JAVA Fukiat Julnual [ http://www.narisa.com/blog/fuju ]

Upload: lillie-stuart

Post on 31-Mar-2015

TRANSCRIPT

Page 1: Fukiat Julnual [  ]

OVERVIEW OF OPEN XML &

ACCESSING OPEN XML DATA FROM JAVA

Fukiat Julnual[ http://www.narisa.com/blog/fuju ]

Page 2: Fukiat Julnual [  ]

Fukiat Julnual

@Microsoft Thailand
• Platform Strategy Manager
• ITPro community

@narisa.com (fuju)
• Voluntary Consultant (VC) in JEE, Oracle, LAMP and Microsoft technology stacks
• Participate in Blog / Discussion Forums

Page 3: Fukiat Julnual [  ]

ECMA Office Open XML File Formats

             Macro-Free              Macro-Enabled
             Document   Template     Document   Template
Word         docx       dotx         docm       dotm
PowerPoint   pptx       potx         pptm       potm
Excel        xlsx       xltx         xlsm       xltm

Open Packaging Conventions
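
The extension table above maps naturally onto a small lookup. The following is only a sketch: the `OpenXmlExtension` enum, its fields, and `fromFileName` are hypothetical names introduced for illustration, not part of any Office or Java API.

```java
import java.util.Locale;

/** Hypothetical helper: classifies a file name using the Open XML extension table. */
enum OpenXmlExtension {
    DOCX("Word", false, false), DOTX("Word", true, false),
    DOCM("Word", false, true), DOTM("Word", true, true),
    PPTX("PowerPoint", false, false), POTX("PowerPoint", true, false),
    PPTM("PowerPoint", false, true), POTM("PowerPoint", true, true),
    XLSX("Excel", false, false), XLTX("Excel", true, false),
    XLSM("Excel", false, true), XLTM("Excel", true, true);

    final String application;   // Word, PowerPoint or Excel
    final boolean template;     // true for the "Template" columns
    final boolean macroEnabled; // true for the "Macro-Enabled" columns

    OpenXmlExtension(String application, boolean template, boolean macroEnabled) {
        this.application = application;
        this.template = template;
        this.macroEnabled = macroEnabled;
    }

    /** Returns the matching entry, or null when the extension is not in the table. */
    static OpenXmlExtension fromFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toUpperCase(Locale.ROOT);
        for (OpenXmlExtension candidate : values()) {
            if (candidate.name().equals(extension)) {
                return candidate;
            }
        }
        return null;
    }
}
```

For example, `OpenXmlExtension.fromFileName("budget.xlsm")` yields `XLSM`, whose `macroEnabled` flag is true, while a legacy name such as `old.doc` yields `null`.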

Page 4: Fukiat Julnual [  ]

Basic Components of the New Formats

Package – ZIP container

Part – The “files” inside the ZIP
• Most parts are XML
• Binary files can be included
• Each XML part is a discrete, compressed component

Content Types – Each part has a content type that is enforced on open

Relationships – Any part that references another part or plays a certain role in the application must do so via a relationship
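
The four components can be seen with nothing more than the Java standard library. The sketch below is an illustration under stated assumptions, not a reference implementation: the `PackageAnatomy` class and its methods are hypothetical, and the package is a tiny hand-made one built in memory (a real .docx contains many more parts). It lists the parts, follows the package-level relationship to the main part, and looks up that part's content type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative sketch; class and method names are hypothetical. Needs Java 9+. */
class PackageAnatomy {
    static final String TYPES_NS = "http://schemas.openxmlformats.org/package/2006/content-types";
    static final String RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships";

    /** Writes a tiny hand-made package to memory so the sketch needs no sample file. */
    static byte[] buildSamplePackage() {
        Map<String, String> parts = new LinkedHashMap<>();
        parts.put("[Content_Types].xml", "<Types xmlns='" + TYPES_NS + "'>"
            + "<Default Extension='rels'"
            + " ContentType='application/vnd.openxmlformats-package.relationships+xml'/>"
            + "<Override PartName='/word/document.xml' ContentType='application/"
            + "vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml'/>"
            + "</Types>");
        parts.put("_rels/.rels", "<Relationships xmlns='" + RELS_NS + "'>"
            + "<Relationship Id='rId1' Target='word/document.xml' Type='http://"
            + "schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument'/>"
            + "</Relationships>");
        parts.put("word/document.xml", "<document/>"); // body is irrelevant to this sketch
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> part : parts.entrySet()) {
                zip.putNextEntry(new ZipEntry(part.getKey()));
                zip.write(part.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (Exception e) { throw new IllegalStateException(e); }
        return bytes.toByteArray();
    }

    /** Package = ZIP container; every ZIP entry is one part. */
    static Map<String, byte[]> readParts(byte[] pkg) {
        Map<String, byte[]> parts = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                parts.put(entry.getName(), zip.readAllBytes());
            }
        } catch (Exception e) { throw new IllegalStateException(e); }
        return parts;
    }

    static Element parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml))
                    .getDocumentElement();
        } catch (Exception e) { throw new IllegalStateException(e); }
    }

    /** Relationships: follows the package-level "officeDocument" relationship to the main part. */
    static String mainPartName(Map<String, byte[]> parts) {
        NodeList rels = parse(parts.get("_rels/.rels"))
                .getElementsByTagNameNS(RELS_NS, "Relationship");
        for (int i = 0; i < rels.getLength(); i++) {
            Element rel = (Element) rels.item(i);
            if (rel.getAttribute("Type").endsWith("/officeDocument")) {
                String target = rel.getAttribute("Target");
                return target.startsWith("/") ? target : "/" + target;
            }
        }
        return null;
    }

    /** Content types: an Override for the part name wins over a Default for its extension. */
    static String contentTypeOf(Map<String, byte[]> parts, String partName) {
        Element types = parse(parts.get("[Content_Types].xml"));
        String extension = partName.substring(partName.lastIndexOf('.') + 1);
        String override = lookup(types, "Override", "PartName", partName);
        return override != null ? override : lookup(types, "Default", "Extension", extension);
    }

    static String lookup(Element types, String tag, String attribute, String value) {
        NodeList nodes = types.getElementsByTagNameNS(TYPES_NS, tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            if (element.getAttribute(attribute).equalsIgnoreCase(value)) {
                return element.getAttribute("ContentType");
            }
        }
        return null;
    }
}
```

Calling `readParts(buildSamplePackage())` returns three parts; `mainPartName` then resolves to `/word/document.xml`, and `contentTypeOf` reports the WordprocessingML main-document content type for it. The point of the design is that the main part is found through its relationship rather than by a hard-coded path.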

Page 5: Fukiat Julnual [  ]

Why Open XML?

Page 6: Fukiat Julnual [  ]

How does Microsoft support its customers?

• Products – Deliberate delivery of interoperability in Microsoft products and technologies
• Community – Listen to, and work with, customers, partners, and competitors to build bridges and coexist
• Access – Make MS technology assets available to others
• Standards – Participate in standards bodies and support standards in products to foster interop

Page 7: Fukiat Julnual [  ]

Interoperability for Documents
The role of XML-based document formats in the Microsoft® Office system

XML structure: data and presentation

Document Archival
• Archived & consumed long into the future without vendor-specific clients or applications

Improved Data Access
• Web service, formatting, intelligent receipt
• Customer details, costs, business requirements, company standards

Business Process Efficiency
• Efficient capture
• Validated information

Searching & Repurposing Content
• Query & extract: text, document fragment, image
• Finding previous RFP content

Document Assembly
• Auto-creating a polished document from data and formatting

Page 8: Fukiat Julnual [  ]

XML-based formats enable you to do things you couldn’t before

• Better value from existing infrastructure
• Information security
• Regulatory and process compliance
• Information integration
• Retention, discovery and content management
• Documents as digital assets – what are you worth?
• Legacy documents & archives
• Lifecycle cost vs. implementation cost

Documents that manage themselves

Managing documents with systems

Page 9: Fukiat Julnual [  ]

Different scenarios of XML-based formats

Past – preservation and archiving

Present – fidelity versus interchange

Future – document engineering and systems integration

Page 10: Fukiat Julnual [  ]

Ecma Office Open XML

Specifications published by Ecma International TC-45.

Freely available for download and implementation

TC-45 comprises many companies, including Apple, Toshiba, Novell, Statoil, and others, and is chaired by Microsoft

Microsoft offers the Open Specification Promise to alleviate IP-related concerns for the Open XML formats

Page 11: Fukiat Julnual [  ]

Ecma Office Open XML Formats

Open File Formats

• Ecma standard for document formats (December 2006)
• Designed to enable interoperability
• Open Specifications enable broad access to technology

The future of file format technology

• ZIP compression of the format reduces file sizes
• Segmented data storage improves data recovery and programmatic access
• Full accessibility support

Compatible file formats

• Standardized file formats supporting 100% of Microsoft Office functionality
• Supported in Microsoft Office 2000, XP, 2003 and 2007, as well as OpenOffice.org and Corel WordPerfect Suite*
• Document conversion, deployment and migration tools available

Page 12: Fukiat Julnual [  ]

Adoption in major Office suites

2007 Microsoft Office system - Default Save Format is Open XML (+ free updates for Office 2000, XP, 2003) – Dec 2006/Jan 2007

Open Office – Novell announcement of support of Open XML in Open Office – Novell edition

Corel announcement of support of Open XML - Availability mid 2007

Page 13: Fukiat Julnual [  ]

File Format Compatibility
Ensuring Free Document Exchange With Prior Office Releases

Office XP and 2003 will open, edit and save new Office formats
• Will recognize new Word, Excel and PowerPoint file format extensions
• Enables users to work with Open XML Formats across multiple versions
• Windows 2000 SP4 and later can convert between binary and Open XML Formats

Office 2007 users can change the default file format if desired
• Current .doc, .xls, .ppt file formats will be supported in 2007 Office system
• Default file format can be set by users during deployment or after
• Advanced policy controls for enabling and disabling the use of specific formats

Office release: User experience with new file formats

2007 Office system
• Default save as Open XML Formats
• Office 97-2003 formats fully supported
• Compatibility Mode ensures features are also compatible

Office 2003, Office XP
• Native support for Open XML Formats (Compatibility Pack installed)
• Open, Edit and Save Open XML Formats

Office 2000, Windows 2000 SP4 and later (non-Office)
• Support for Open XML formats via standalone compatibility pack
• Format conversion within Windows Explorer

Related blog about the Microsoft Office Compatibility Pack: http://www.narisa.com/blog/fuju/index.php?showentry=924

Page 14: Fukiat Julnual [  ]

How is this different from the Apache POI solution?

“Apache POI - Java API To Access Microsoft Format Files

The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you can read and write MS Excel files using Java. Soon, you'll be able to read and write Word files using Java. POI is your Java Excel solution as well as your Java Word solution. However, we have a complete API for porting other OLE 2 Compound Document formats and welcome others to participate.

OLE 2 Compound Document Format based files include most Microsoft Office files such as XLS and DOC as well as MFC serialization API based file formats.”

(Source: http://poi.apache.org/)
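
The practical difference shows up in the first bytes of a file: OLE 2 Compound Documents (.doc, .xls, .ppt) start with a fixed eight-byte signature, while Open XML packages are ZIP archives and start with the ZIP local-file-header signature. A minimal sketch of telling them apart follows; the `ContainerSniffer` class and its method names are hypothetical, and a ZIP signature alone only proves the file is a ZIP, not that it is a valid Open XML package.

```java
import java.util.Arrays;

/** Hypothetical helper: tells the two container types apart by their leading bytes. */
class ContainerSniffer {
    // OLE 2 Compound Document header signature (binary .doc, .xls, .ppt).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\3\4" (Open XML packages are ZIP files).
    static final byte[] ZIP_MAGIC = {'P', 'K', 3, 4};

    /** Returns "OLE2", "ZIP" or "UNKNOWN" for the first bytes of a file. */
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(head, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In practice the first eight bytes of the file would be read (for instance with `InputStream.read`) and passed to `sniff`; an "OLE2" result points to the binary formats that POI targets in the quotation above, and a "ZIP" result points to a candidate Open XML package.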

Page 15: Fukiat Julnual [  ]

Open XML Interoperability

Platform     ZIP Library                                  XML Library
Linux        Minizip, zLib                                Apache Xerces
Java         J2SE java.util.zip                           JAXP
Microsoft    .NET Framework 3.0 System.IO.Packaging *,    .NET Framework 3.0 System.Xml
             Xceed .NET controls
COM          Xceed ActiveX controls                       MSXML

* Also includes abstractions for OPC concepts (Open Packaging Conventions)
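
For the Java row, the two libraries combine as follows: `java.util.zip` opens the package and JAXP parses the XML part. This is a rough sketch rather than the approach of any particular sample: `DocxTextReader` and its methods are hypothetical names, the sample package contains only a hand-written `word/document.xml`, and the part path is hard-coded for brevity where a fuller reader would locate it through the package relationships.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Illustrative sketch; class and method names are hypothetical. Needs Java 9+. */
class DocxTextReader {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Concatenates the text runs (w:t elements) of word/document.xml; null if the part is absent. */
    static String extractText(InputStream docx) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                if (!entry.getName().equals("word/document.xml")) {
                    continue; // skip every other part
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Parse from a byte copy so the parser never touches the ZIP stream itself.
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(zip.readAllBytes()));
                NodeList runs = doc.getElementsByTagNameNS(W_NS, "t");
                StringBuilder text = new StringBuilder();
                for (int i = 0; i < runs.getLength(); i++) {
                    text.append(runs.item(i).getTextContent());
                }
                return text.toString();
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read package", e);
        }
    }

    /** Builds a stripped-down stand-in for a .docx (one part only) for demonstration. */
    static byte[] sampleDocx() {
        String xml = "<w:document xmlns:w='" + W_NS + "'><w:body><w:p>"
            + "<w:r><w:t xml:space='preserve'>Hello, </w:t></w:r>"
            + "<w:r><w:t>Open XML</w:t></w:r>"
            + "</w:p></w:body></w:document>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Prints: Hello, Open XML
        System.out.println(extractText(new ByteArrayInputStream(sampleDocx())));
    }
}
```

To try it on a real document, pass a `FileInputStream` for a .docx file to `extractText`. Paragraph boundaries are ignored here; a fuller reader would walk the `w:p` elements and insert line breaks between them.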

Page 16: Fukiat Julnual [  ]

OpenXmlDeveloper.org

Formed by 40 companies to share developer information about the Office Open XML file formats

Articles with full source code for C#, VB, Java, XSLT

Forums for posting technical questions

Page 17: Fukiat Julnual [  ]

Demo

NetBeans and the samples of Java and Open XML file format from [ http://openxmldeveloper.org/articles/OpenXMLandJava.aspx ]

Page 18: Fukiat Julnual [  ]

Fukiat Julnual
http://www.narisa.com/blog/fuju

Page 19: Fukiat Julnual [  ]

Appendix

Page 20: Fukiat Julnual [  ]

For more information

www.microsoft.com/office/preview
www.OpenXMLDeveloper.org
www.ecma-international.org
msdn.microsoft.com/office/xml
www.microsoft.com/technet/prodtechnol/office

Page 21: Fukiat Julnual [  ]

Example Solutions

Company – Solution

The Advisory Board Company – XML data-driven charting and presentation data for automated presentation development.

Northumberland College – XML-based tool to automate the processing of self-assessment reports.

CLE British Columbia – XML-based authoring and document-publishing system for book publishing.

Wortmann AG – Extract geographic data from Navision and import it into Excel 2003.

Moore Medical – Drug and product information entered via InfoPath 2003 updates a legacy ERP system.

The Open University – Content Authoring Tool: create XML structured documents that can be easily published via print and the Web.

Ohio State University Medical Center – Uses Web services to improve the flow of clinical and other data related to operating room activities.

McGraw-Hill Construction – An online service creates customized customer-defined views of construction information and integrated data housed in previously isolated databases.

PGGM – Document info system automates doc handling, updates data, and archives electronically. XML Web services provide integration between desktop and server.

Austrian Broadcast Corporation – Journalists use InfoPath 2003 forms to write and save stories in an offline mode, then submit and route automatically.

Austrian Ministry of Interior (BMI) – InfoPath forms are used with SQL Server to collect info about a work item, upload the information to the database, and drive workflow.

Danish InfoStructure Base – Open publishing format for documents endorsed by government.