OVERVIEW OF OPEN XML &
ACCESSING OPEN XML DATA FROM JAVA
Fukiat Julnual [ http://www.narisa.com/blog/fuju ]
Fukiat Julnual
@Microsoft Thailand: Platform Strategy Manager, ITPro community
@narisa.com ( fuju ): Voluntary Consultant (VC) in JEE, Oracle, LAMP and Microsoft technology stacks
Participate in Blog / Discussion Forums
ECMA Office Open XML File Formats
Application   Macro-Free            Macro-Enabled
              Document  Template    Document  Template
Word          docx      dotx        docm      dotm
PowerPoint    pptx      potx        pptm      potm
Excel         xlsx      xltx        xlsm      xltm
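The naming pattern in the table above can be read mechanically: the first three letters give the application and whether the file is a document or a template, and the last letter is x (macro-free) or m (macro-enabled). The following is a minimal sketch of that reading. The class and method names (OpenXmlExtensions, describe) are hypothetical, and it covers only the twelve extensions listed above, not every extension Office uses.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: decodes only the twelve extensions from the table above.
class OpenXmlExtensions {
    enum App { WORD, POWERPOINT, EXCEL }

    static final class Info {
        final App app;
        final boolean template;
        final boolean macroEnabled;

        Info(App app, boolean template, boolean macroEnabled) {
            this.app = app;
            this.template = template;
            this.macroEnabled = macroEnabled;
        }
    }

    /** Returns a description of the file's extension, or empty if it is not in the table. */
    static Optional<Info> describe(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return Optional.empty();
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (ext.length() != 4) return Optional.empty();

        String stem = ext.substring(0, 3);   // e.g. "doc", "dot"
        char suffix = ext.charAt(3);         // 'x' = macro-free, 'm' = macro-enabled
        if (suffix != 'x' && suffix != 'm') return Optional.empty();

        App app;
        boolean template;
        switch (stem) {
            case "doc": app = App.WORD;       template = false; break;
            case "dot": app = App.WORD;       template = true;  break;
            case "ppt": app = App.POWERPOINT; template = false; break;
            case "pot": app = App.POWERPOINT; template = true;  break;
            case "xls": app = App.EXCEL;      template = false; break;
            case "xlt": app = App.EXCEL;      template = true;  break;
            default: return Optional.empty();
        }
        return Optional.of(new Info(app, template, suffix == 'm'));
    }
}
```

A caller could use this, for example, to flag macro-enabled attachments before opening them: describe("Report.docm") yields a Word document with macroEnabled set to true.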
Open Packaging Convention
Basic Components of the New Formats
Package – ZIP container
Part – The “files” inside the ZIP
• Most parts are XML
• Binary files can be included
• Each XML part is a discrete, compressed component
Content Types – Each part has a content type that is enforced on open
Relationships – Any part that references another part or plays a certain role in the application must do so via a relationship
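Because the package is an ordinary ZIP container, its contents can be listed from Java with java.util.zip alone. The following is a minimal sketch, not a full Open Packaging Convention reader. The class name OpcPartLister and the in-memory sample package are assumptions made so the example runs without a real .docx file; the entry names [Content_Types].xml, _rels/.rels and word/document.xml follow the usual layout of a Word package.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: lists the ZIP entries of a package.
class OpcPartLister {

    /** Returns the names of all non-directory ZIP entries in the package stream. */
    static List<String> listParts(InputStream packageStream) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(packageStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    /** Builds a tiny illustrative package in memory (not a valid Word document). */
    static byte[] samplePackage() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // Content types stream: declares the content type of each part.
            put(zip, "[Content_Types].xml",
                "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\"/>");
            // Package-level relationships: point to the main document part.
            put(zip, "_rels/.rels",
                "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\"/>");
            // An XML part.
            put(zip, "word/document.xml",
                "<w:document xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\"/>");
        }
        return bytes.toByteArray();
    }

    private static void put(ZipOutputStream zip, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(xml.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        for (String name : listParts(new ByteArrayInputStream(samplePackage()))) {
            System.out.println(name);
        }
    }
}
```

To inspect a real file, pass a FileInputStream for the .docx, .xlsx or .pptx file to listParts instead of the sample bytes.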
Why Open XML?
Community, Access, Products, Standards
How does Microsoft support its customers?
Participate in standards bodies and support standards in products to foster interop
Make MS technology assets available to others
Deliberate delivery of interoperability in Microsoft products and technologies
Listen to, and work with customers, partners, and competitors to build bridges and coexist
Interoperability for Documents
The role of XML-based document formats in the Microsoft® Office system
XML structure: data and presentation
Document Archival
Archived & consumed long into the future without vendor-specific clients or applications
Improved Data Access
Web service, formatting, intelligent receipt
Customer details, costs, business requirements, company standards
Business Process Efficiency
Efficient capture, validated information
Searching & Repurposing Content
Query & extract: text, document fragment, image
Finding previous RFP content
Document Assembly
Auto-creating a polished document from data and formatting
XML-based formats enable you to do things you couldn’t before
Better value from existing infrastructure
Information security
Regulatory and process compliance
Information integration
Retention, discovery and content management
Documents as digital assets – what are you worth?
Legacy documents & archives
Lifecycle cost vs. implementation cost
Documents that manage themselves
Managing documents with systems
Different scenarios of XML-based formats
Past: preservation and archiving
Present: fidelity versus interchange
Future: document engineering and systems integration
Ecma Office Open XML
Specifications published by Ecma International TC-45.
Freely available for download and implementation
TC-45 comprises many companies (Apple, Toshiba, Novell, Statoil, and others) and is chaired by Microsoft
Microsoft offers the Open Specification Promise to alleviate IP-related concerns for the Open XML formats
Ecma Office Open XML Formats
Open File Formats
• Ecma standard for document formats (December 2006)
• Designed to enable interoperability
• Open Specifications enable broad access to technology
The future of file format technology
• ZIP compression of the format reduces file sizes
• Segmented data storage improves data recovery and programmatic access
• Full accessibility support
Compatible file formats
• Standardized file formats supporting 100% of Microsoft Office functionality
• Supported in Microsoft Office 2000, XP, 2003 and 2007, as well as OpenOffice.org and Corel WordPerfect Suite*
• Document conversion, deployment and migration tools available
Adoption in major Office suites
2007 Microsoft Office system - Default Save Format is Open XML (+ free updates for Office 2000, XP, 2003) – Dec 2006/Jan 2007
Open Office – Novell announcement of support of Open XML in Open Office – Novell edition
Corel announcement of support of Open XML - Availability mid 2007
File Format Compatibility
Ensuring Free Document Exchange With Prior Office Releases
Office XP, 2003 will open, edit and save new Office formats
• Will recognize new Word, Excel and PowerPoint file format extensions
• Enables users to work with Open XML Formats across multiple versions
• Windows 2000 SP4 and later can convert between binary and Open XML Formats
Office 2007 users can change the default file format if desired
• Current .doc, .xls, .ppt file formats will be supported in 2007 Office system
• Default file format can be set by users during deployment or after
• Advanced policy controls for enabling and disabling the use of specific formats
Office release: User Experience with new file formats
2007 Office system
• Default save as Open XML Formats
• Office 97-2003 formats fully supported
• Compatibility Mode ensures features are also compatible
Office 2003, Office XP
• Native support for Open XML Formats (Compatibility Pack installed)
• Open, Edit and Save Open XML Formats
Office 2000, Windows 2000 SP4 and later (non-Office)
• Support for Open XML formats via standalone compatibility pack
• Format conversion within Windows Explorer
Related blog about the Microsoft Office Compatibility Pack: http://www.narisa.com/blog/fuju/index.php?showentry=924
What Is the Difference from the Apache POI Solution?
“Apache POI - Java API To Access Microsoft Format Files
The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you can read and write MS Excel files using Java. Soon, you'll be able to read and write Word files using Java. POI is your Java Excel solution as well as your Java Word solution. However, we have a complete API for porting other OLE 2 Compound Document formats and welcome others to participate.
OLE 2 Compound Document Format based files include most Microsoft Office files such as XLS and DOC as well as MFC serialization API based file formats.”
(Source : http://poi.apache.org/ )
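The practical difference shows up in the first bytes of a file: OLE 2 Compound Document files (the format POI targets in the quote above) begin with the signature D0 CF 11 E0 A1 B1 1A E1, while Open XML packages are ZIP files and begin with PK\3\4. The following is a minimal sketch of telling the two apart. The class name OfficeFormatSniffer and its Kind enum are hypothetical. A ZIP signature only shows that the file is a ZIP container, not that it is a valid Open XML package.

```java
// Hypothetical helper: classifies a file by its leading signature bytes.
class OfficeFormatSniffer {
    enum Kind { OLE2_COMPOUND, ZIP_PACKAGE, UNKNOWN }

    // Signature of OLE 2 Compound Document files (.doc, .xls, .ppt).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature found at the start of a ZIP file.
    private static final byte[] ZIP = { 'P', 'K', 3, 4 };

    /** Classifies a file from its first bytes (at least 8 bytes are needed to detect OLE 2). */
    static Kind sniff(byte[] header) {
        if (startsWith(header, OLE2)) return Kind.OLE2_COMPOUND;
        if (startsWith(header, ZIP)) return Kind.ZIP_PACKAGE;
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A program that must handle both generations of files could read the first eight bytes, call sniff, and then route OLE 2 files to a binary-format library and ZIP packages to java.util.zip plus an XML parser.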
Open XML Interoperability
ZIP Library
• Linux: Minizip, zlib
• Java: J2SE java.util.zip
• Microsoft: .NET Framework 3.0 System.IO.Packaging *, Xceed .NET controls
• COM: Xceed ActiveX controls
XML Library
• Linux: Apache Xerces
• Java: JAXP
• Microsoft: .NET Framework 3.0 System.Xml
• COM: MSXML
* Also includes abstractions for OPC concepts (Open Packaging Convention)
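For the Java entries above, java.util.zip and JAXP together are enough to read text out of a Word document. The following is a minimal sketch under stated assumptions: the main document part is taken to be word/document.xml (a robust reader would locate it through the package relationships instead), the class name DocxTextReader is hypothetical, and the sample package is built in memory so the example runs without a real file. It joins the w:t text of each w:p paragraph and may repeat text in documents that nest paragraphs, for example inside text boxes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: extracts plain text from a WordprocessingML package.
class DocxTextReader {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static final String MAIN_PART = "word/document.xml"; // assumed location of the main part

    /** Returns the document text, one line per paragraph. */
    static String extractText(InputStream packageStream) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(packageStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (MAIN_PART.equals(entry.getName())) {
                    return textOf(readCurrentEntry(zip));
                }
            }
        }
        throw new IllegalArgumentException("No " + MAIN_PART + " part found");
    }

    // Copies the current ZIP entry into memory; read() returns -1 at the end of the entry.
    private static byte[] readCurrentEntry(ZipInputStream zip) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = zip.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Parses the part with JAXP and concatenates the w:t text of each w:p paragraph.
    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < paragraphs.getLength(); i++) {
            if (i > 0) text.append('\n');
            NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
        }
        return text.toString();
    }

    /** Builds a tiny illustrative package containing only the main document part. */
    static byte[] sampleDocx() throws Exception {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
            + "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>Open XML</w:t></w:r></w:p>"
            + "<w:p><w:r><w:t>from Java</w:t></w:r></w:p>"
            + "</w:body></w:document>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(MAIN_PART));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

The same two libraries apply to spreadsheets and presentations; only the part names and XML namespaces differ.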
OpenXmlDeveloper.org
Formed by 40 companies to share developer information about the Office Open XML file formats
Articles with full source code for C#, VB, Java, XSLT
Forums for posting technical questions
Demo
NetBeans and the samples of Java and Open XML file format from [ http://openxmldeveloper.org/articles/OpenXMLandJava.aspx ]
Fukiat Julnual [ http://www.narisa.com/blog/fuju ]
Appendix
For more information
www.microsoft.com/office/preview
www.OpenXMLDeveloper.org
www.ecma-international.org
msdn.microsoft.com/office/xml
www.microsoft.com/technet/prodtechnol/office
Example Solutions
Company: Solution
Advisory Board Company, The: XML data-driven charting and presentation data for automated presentation development.
Northumberland College: XML-based tool to automate the processing of self-assessment reports.
CLE British Columbia: XML-based authoring and document-publishing system for book publishing.
Wortmann AG: Extract geographic data from Navision and import it into Excel 2003.
Moore Medical: Drug and product information entered via InfoPath 2003 updates a legacy ERP system.
Open University, The: Content Authoring Tool; create XML structured documents that can be easily published via print and the Web.
Ohio State University Medical Center: Uses Web services to improve the flow of clinical and other data related to operating room activities.
McGraw-Hill Construction: An online service creates customized customer-defined views of construction information and integrated data housed in previously isolated databases.
PGGM: Document info system automates doc handling, updates data, and archives electronically. XML Web services provide integration between desktop and server.
Austrian Broadcast Corporation: Journalists use InfoPath 2003 forms to write and save stories in an offline mode, then submit and route automatically.
Austrian Ministry of Interior (BMI): InfoPath forms are used with SQL Server to collect info about a work item, upload the information to the database, and drive workflow.
Danish InfoStructure Base: Open publishing format for documents endorsed by government.