STIS
Statistical Information Systems
Consortium INTRASOFT INTERNATIONAL S.A. and AGILIS S.A.
European Commission – EUROSTAT/B3
Framework Contract 14200/2005/007-2005/699 - Lot 1
Specific Contract 17101.2006.001-2006.457
‘XML-Publishing’
Implementation Strategy of an XML-based publishing in Eurostat
D2.1: Analysis & Evaluation of existing standards
May 2007
Project: XML-Publishing - Implementation Strategy
Contract: Specific Contract 17101.2006.001-2006.457
Prepared by: VBE, CBO
Reviewed by: MFE
Version 2.0
Date Updated: 18/05/2007
Status: Company Approved
Document Service Data
Type of Document Project deliverable
Reference: document.doc
Issue: 2 Revision: 0 Status: Company Approved
Created by: Victorio Bentivogli, Christian Boudot
Date: 18/05/2007
Distribution: EU-Eurostat, Intrasoft International S.A.
Contract Full Title: XML-Publishing - Implementation Strategy
Service contract number: Specific Contract 17101.2006.001-2006.457
For Internal Use Only
Reviewed by: CBO, VJB
Approved by: MFE
Document Change Record
Issue/Revision Date Change
0.1 15/01/2007 First draft document
0.2 05/02/2007 Updated draft
0.3 23/02/2007 Updated draft
0.4 07/03/2007 Updated draft
1.0 07/04/2007 Added information about CoSSI
1.1 10/04/2007 Updated draft
1.2 15/04/2007 Updated draft
1.3 30/04/2007 Updated draft
1.4 09/05/2007 Delivery version
2.0 18/05/2007 Delivery version
Table of contents
1 Introduction
  1.1 Purpose
  1.2 References
2 Existing Standards
  2.1 CoSSI
    2.1.1 Introduction
    2.1.2 Specification
    2.1.3 Usage
  2.2 Formex 4
    2.2.1 Introduction
    2.2.2 Specification
    2.2.3 Usage
  2.3 DocBook
    2.3.1 Introduction
    2.3.2 Specification
    2.3.3 Usage
  2.4 ODF
    2.4.1 Introduction
    2.4.2 Specification
    2.4.3 Usage
  2.5 MS OOXML
    2.5.1 Introduction
    2.5.2 Specification
    2.5.3 Usage
  2.6 Comparison Tables
    2.6.1 General Information
    2.6.2 Characteristics
  2.7 Conclusion
3 ODF vs OOXML
  3.1 Advantages of OpenDocument over Office Open XML formats
  3.2 Advantages of Office Open XML formats over OpenDocument
  3.3 Shortcomings of OpenDocument
  3.4 Shortcomings of Office Open XML
  3.5 Cross-platform interoperability
  3.6 Conclusion
Table of Figures
Table 1: References
Table 2: Document types
1 Introduction
1.1 Purpose
This document constitutes the deliverable “D2.1 Analysis & Evaluation of existing standards” and is the outcome of Task 2.1 “Analysis & Evaluation of existing standards for publications”.
1.2 References
This document references:
Reference   Document/Resource Name   Filename
R1          OASIS foundation         http://opendocument.xml.org/
                                     http://www.oasis-open.org/committees/download.php/18630/06-06-08-bidi-appendix
                                     http://books.evc-cit.info/odbook/ch05.html#table-value-table
R2          Ecma International       http://www.ecma-international.org/publications/standards/Ecma-376.htm
                                     http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm
R3          Microsoft                http://www.microsoft.com/office/xml/covenant.mspx
R4          ISO                      http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=43485
R5          W3C                      http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm
Table 1: References
2 Existing Standards
2.1 CoSSI
2.1.1 Introduction
CoSSI stands for “Common Structure of Statistical Information”. This model covers different ways of statistical data organisation (statistical data matrix and statistical table), statistical publications (monthly and quarterly publications, press releases, etc.) and quality declarations. The structuring of the metadata connected to statistical data is also implemented within this system.
2.1.2 Specification
The CoSSI model defines the structures of statistical data (matrices and tables), metadata (document and statistical metadata, and quality declarations), and publications using XML DTDs. The CoSSI model is comprised of several DTDs that can be modularly combined for different types of documents. The basic document types are a statistical table, a statistical matrix and a publication. These documents are XML documents that are compatible with the CoSSI model and also contain the metadata and the language versions necessary for describing a set of statistics.
2.1.3 Usage
As an in-house initiative of Statistics Finland, it is not used elsewhere.
CoSSI is a model that was designed to fulfil the requirements of Statistics Finland. According to our research it has not been widely adopted.
Furthermore, integration into office products was not part of the implementation scope.
2.2 Formex 4
2.2.1 Introduction
Formex stands for “Formalised exchange of electronic documents”. It is the document exchange format for the Office for Official Publications of the European Union. It is used for the delivery of documents to the Official Journal of the European Union. Formex was developed in-house by the Office of Publications and it is not a standard used elsewhere in the industry.
The first version of the specification was started in 1985. Initially it was a mixture of SGML (Standard Generalized Markup Language) and CCF (Common Communication Format). By 1999, the slow take-up of SGML and the lack of tools available on the market led to a move to XML. Formex v4 came out on February 9, 2004. It included three major initiatives: the migration to XML, the character migration to Unicode and the adoption of XML Schema. The migration to XML also gave the opportunity to streamline the specification: instead of about 1200 tags in Formex version 3, Formex version 4 consists of only about 260 tags.
2.2.2 Specification
The latest revision of Formex 4 was published on October 31, 2006 and entered into force on January 1, 2007. It includes changes for the new structure of the Official Journal.
2.2.3 Usage
As an in-house initiative of the Office for Official Publications of the European Union, it is not used elsewhere.
Formex 4 is a model that was designed to fulfil the requirements of OPOCE. According to our research it has not been widely adopted.
As with CoSSI, integration into office products was not part of the implementation scope.
2.3 DocBook
2.3.1 Introduction
DocBook is a markup language for technical documentation. It was originally intended for authoring technical documents related to computer hardware and software, but it can be used for other sorts of documentation as well.
DocBook began in 1991 as a joint project of HaL Computer Systems and O'Reilly & Associates and eventually spawned its own maintenance organization before moving in 1998 to the SGML Open consortium, which subsequently became OASIS. DocBook is currently maintained by the DocBook Technical Committee at OASIS.
2.3.2 Specification
As of December 2006, DocBook version 5.0 is in its first release candidate. There are many changes between the older 4.x versions and version 5.0. Chief among them is that DocBook is now defined by a RELAX NG + Schematron schema. While there is a W3C XML Schema + Schematron version available, it is not considered the definitive or "normative" version of the schema. There is also a DTD available, though it lacks the power to truly validate all DocBook 5 documents.
DocBook 5 markup is not a strict superset of DocBook 4.x. Many of the redundancies that grew from DocBook's origins have been consolidated. For example, in DocBook 4.x there was a set of info elements (bookinfo, chapterinfo, appendixinfo, etc.) that describe information about that particular kind of element (book, chapter, appendix). In most cases, the contents of these elements were identical. However, because DocBook 4.x was defined by a DTD, any differences between these info elements based on their context required a new element name, as DTDs can only describe the content model of an element based on its name. RELAX NG has no such limitation, so all of these elements are called info in DocBook 5.
Because DocBook 5 is defined by a RELAX NG schema rather than a DTD, versioning became an issue. As such, in DocBook 5, the version of a document is defined by a version attribute, which is required on the root element of a DocBook 5 document. This attribute specifies the version of DocBook 5 that the document is written against. Through Schematron rules, the schema requires that it appear on the root, though it may appear on other elements.
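The two points above (a single info element and a mandatory version attribute on the root) can be illustrated with a minimal sketch. It uses only Python's standard library; the helper names make_docbook5_book and has_required_version are hypothetical, and the check merely imitates the Schematron rule rather than running a real RELAX NG or Schematron validator.

```python
import xml.etree.ElementTree as ET

# Namespace used by DocBook 5 documents.
DOCBOOK_NS = "http://docbook.org/ns/docbook"
ET.register_namespace("", DOCBOOK_NS)


def make_docbook5_book(title: str) -> ET.Element:
    """Build a small DocBook 5 <book> (hypothetical helper, illustration only)."""
    # The version attribute sits on the root element.
    book = ET.Element(f"{{{DOCBOOK_NS}}}book", {"version": "5.0"})
    # DocBook 5 uses one <info> element where 4.x used <bookinfo>.
    info = ET.SubElement(book, f"{{{DOCBOOK_NS}}}info")
    ET.SubElement(info, f"{{{DOCBOOK_NS}}}title").text = title
    chapter = ET.SubElement(book, f"{{{DOCBOOK_NS}}}chapter")
    # The same <info> name is reused where 4.x used <chapterinfo>.
    chapter_info = ET.SubElement(chapter, f"{{{DOCBOOK_NS}}}info")
    ET.SubElement(chapter_info, f"{{{DOCBOOK_NS}}}title").text = "Introduction"
    ET.SubElement(chapter, f"{{{DOCBOOK_NS}}}para").text = "Sample paragraph."
    return book


def has_required_version(root: ET.Element) -> bool:
    """Imitate the Schematron rule: the root must carry a version attribute."""
    return bool(root.get("version"))


book = make_docbook5_book("Sample publication")
print(ET.tostring(book, encoding="unicode"))
print("version present:", has_required_version(book))
```

A production workflow would validate against the normative RELAX NG + Schematron schema instead of this simplified check.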
2.3.3 Usage
DocBook was adopted by O’Reilly and the open source community and was used for creating documentation for many projects, including FreeBSD, KDE, GNOME desktop documentation, the GTK+ API references, the Linux kernel documentation, and the work of the Linux Documentation Project.
DocBook is a model that was originally designed for hardware and software documentation. The format is widely used in the open source community.
As with the previously listed standards, integration into office products was not part of the implementation scope.
2.4 ODF
2.4.1 Introduction
OpenDocument, or ODF, is a document file format used for describing electronic documents such as memos, reports, books, spreadsheets, charts, presentations and word processing documents. This standard was developed by a Technical Committee of the Organization for the Advancement of Structured Information Standards (OASIS) consortium and is based upon the XML format originally created and implemented by the OpenOffice.org office suite. OpenDocument is an OASIS Standard and a published ISO and IEC International Standard referred to as ISO/IEC 26300:2006.
2.4.2 Specification
Document and Template
The most common file extensions used for OpenDocument documents are .odt for text documents, .ods for spreadsheets, .odp for presentations, and .odg for graphics. OpenDocument also supports a set of template types that represent formatting information (including styles) for documents, without the content itself.
Here is the complete list of document types, showing the type of file, the recommended file extension, and the MIME type:
File type         Extension   MIME type
Text              .odt        application/vnd.oasis.opendocument.text
Spreadsheet       .ods        application/vnd.oasis.opendocument.spreadsheet
Presentation      .odp        application/vnd.oasis.opendocument.presentation
Drawing           .odg        application/vnd.oasis.opendocument.graphics
Chart             .odc        application/vnd.oasis.opendocument.chart
Formula           .odf        application/vnd.oasis.opendocument.formula
Image             .odi        application/vnd.oasis.opendocument.image
Master Document   .odm        application/vnd.oasis.opendocument.text-master
Table 2: Document types
Metadata
The OpenDocument format supports storing metadata by having a set of pre-defined metadata elements, as well as allowing user-defined and custom metadata.
Content
OpenDocument's text content format supports both typical and advanced capabilities. Headings of various levels, lists of various kinds (numbered and not), numbered paragraphs, and change tracking are all supported. Page sequences and section attributes can be used to control how the text is displayed. Hyperlinks, bookmarks, and references are supported as well. Text fields (for autogenerated content), and mechanisms for automatically generating tables such as tables of contents, indexes, and bibliographies, are included as well.
In the OpenDocument format, spreadsheets are an example of a set of tables. Thus, there are extensive capabilities for formatting the display of tables and spreadsheets. Database ranges, filters, and data pilots (known to Excel users as "pivot tables") are also supported. Change tracking is available for spreadsheets as well.
The graphics format supports a vector graphic representation, in which a set of layers and the contents of each layer are defined. Available drawing shapes include Rectangle, Line, Polyline, Polygon, Regular Polygon, Path, Circle, Ellipse, and Connector. 3D Shapes are also available; the format includes information about the Scene, Light, Cube, Sphere, Extrude, and Rotate. Custom shapes can also be defined.
Presentations are supported. Animations can be included in presentations, with control over the Sound, showing a shape or text, hiding a shape or text, or dimming something, and these can be grouped. In OpenDocument, much of the format capabilities are reused from the text format, simplifying implementations. However, tables are not supported within OpenDocument as drawing objects, so may only be included in presentations as embedded tables.
Charts define how to create graphical displays from numerical data. They support titles, subtitles, footer, and a legend to explain the chart. The format defines the series of data that is to be used for the graphical display, and a number of different kinds of graphical display (such as line charts, pie charts, and so on).
Formatting
The style and formatting controls are numerous, providing a number of controls over how information is displayed.
Page layout is controlled by a variety of attributes. These include page size, number format, paper tray, print orientation, margins, border (and its line width), padding, shadow, background, columns, print page order, first page number, scale, table centering, maximum footnote height and separator, and many layout grid properties.
Headers and footers can have defined fixed and minimum heights, margins, border line width, padding, background, shadow, and dynamic spacing.
There are many attributes for specific text, paragraphs, ruby text, sections, tables, columns, lists, and fills. Specific characters can have their fonts, sizes, and other properties set. Paragraphs can have their vertical space controlled through attributes on keep together, widow, and orphan, and have other attributes such as "drop caps" to provide special formatting.
Format internals
An OpenDocument file is a ZIP-compressed (JAR) archive containing a number of files and directories. This simple compression mechanism means that OpenDocument files are normally significantly smaller than equivalent Microsoft ".doc" or ".ppt" files. This smaller size is important for organizations that store a vast number of documents for long periods of time, and for organizations that must exchange documents over low-bandwidth connections. Once uncompressed, most data is contained in simple text-based XML files, so the data contents have the typical ease of modification and processing of XML files. The standard also allows for the creation of a single XML document, which uses <office:document> as the root element, for use in document processing.
Directories can be included to store non-SVG images, non-SMIL animations, and other files that are used by the document but cannot be expressed directly in the XML.
Due to the openly specified compression format used, it is possible for a user to extract the container file to manually edit the contained files. This allows a corrupted file to be repaired or low level manipulation of the contents. It is known that many programs which implement the OpenDocument format do not utilise high compression levels. It is therefore possible for the user to optimise the file sizes by using more aggressive compression programs. This may be coupled with a number of image optimisation programs being used on the contained pictures and has been seen to give over 40% reduction in file size over a file directly saved from an OpenDocument compatible program.
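The recompression idea described above can be sketched with Python's standard zipfile module. This is an illustration under stated assumptions, not a tested optimisation tool: the function name recompress_odf and the sample package are hypothetical, and the mimetype entry is deliberately left uncompressed and first in the archive, as ODF packages expect.

```python
import io
import zipfile


def recompress_odf(source: bytes, level: int = 9) -> bytes:
    """Rewrite an ODF (ZIP) package using a higher deflate level (sketch)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(source)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        # infolist() preserves entry order, so mimetype stays first.
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "mimetype":
                # Keep the mimetype entry stored (uncompressed).
                dst.writestr(item.filename, data,
                             compress_type=zipfile.ZIP_STORED)
            else:
                dst.writestr(item.filename, data,
                             compress_type=zipfile.ZIP_DEFLATED,
                             compresslevel=level)
    return out.getvalue()


def build_sample_package() -> bytes:
    """Create a small stand-in package saved with a low compression level."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                     compress_type=zipfile.ZIP_STORED)
        pkg.writestr("content.xml", "<p>Statistics</p>" * 2000,
                     compress_type=zipfile.ZIP_DEFLATED, compresslevel=1)
    return buf.getvalue()


original = build_sample_package()
smaller = recompress_odf(original)
print(len(original), "bytes before,", len(smaller), "bytes after")
```

The actual saving depends on the document; image optimisation, as mentioned above, is a separate step not covered here.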
The zipped set of files and directories includes the following:
* XML files
    o content.xml
    o meta.xml
    o settings.xml
    o styles.xml
* Other files
    o mimetype
* Directories
    o META-INF/
    o Thumbnails/
The OpenDocument format provides a strong separation between content, layout and metadata. The most notable components of the format are described in the subsections below. The files in XML format are further defined using the RELAX NG language for defining XML schemas. RELAX NG is itself defined by an OASIS specification, as well as by part two of the international standard ISO/IEC 19757: Document Schema Definition Languages (DSDL).
content.xml is the most important file. It carries the actual content of the document (except for binary data, like images). The base format is inspired by HTML, and though far more complex, it is reasonably legible to humans.
styles.xml contains style information. OpenDocument makes heavy use of styles for formatting and layout. Most of the style information is here (though some is in content.xml). Style types include:
Paragraph styles.
Page Styles.
Character Styles.
Frame Styles.
List styles.
meta.xml contains the file metadata. For example, Author, "Last modified by", date of last modification, etc.
settings.xml includes settings such as the zoom factor or the cursor position. These are properties that are not content or layout.
mimetype is just a one-line file with the mimetype of the document. One implication of this is that the file extension is actually immaterial to the format. The file extension is only there for the benefit of the user.
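The roles of mimetype, content.xml and meta.xml described above can be illustrated with a small in-memory package. This is a minimal sketch: the package omits META-INF/manifest.xml, styles.xml and settings.xml, so it is not a complete conforming document, and the helper names and sample text are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DC_NS = "http://purl.org/dc/elements/1.1/"
ODT_MIME = "application/vnd.oasis.opendocument.text"

CONTENT_XML = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" '
    f'xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Hello, Eurostat.</text:p>"
    "</office:text></office:body></office:document-content>"
)
META_XML = (
    f'<office:document-meta xmlns:office="{OFFICE_NS}" xmlns:dc="{DC_NS}">'
    "<office:meta><dc:creator>VBE</dc:creator></office:meta>"
    "</office:document-meta>"
)


def build_package() -> bytes:
    """Assemble a reduced ODF-style ZIP package in memory (sketch only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        # mimetype is written first and left uncompressed.
        pkg.writestr("mimetype", ODT_MIME, compress_type=zipfile.ZIP_STORED)
        pkg.writestr("content.xml", CONTENT_XML)
        pkg.writestr("meta.xml", META_XML)
    return buf.getvalue()


def read_mimetype(package: bytes) -> str:
    """Return the document type from the one-line mimetype file."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        return pkg.read("mimetype").decode("ascii").strip()


def read_paragraphs(package: bytes) -> list:
    """Return the text of every text:p element in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        root = ET.fromstring(pkg.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def read_creator(package: bytes) -> str:
    """Return the Dublin Core creator stored in meta.xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        root = ET.fromstring(pkg.read("meta.xml"))
    return root.findtext(f".//{{{DC_NS}}}creator")


package = build_package()
print(read_mimetype(package), read_paragraphs(package), read_creator(package))
```

Because the type is read from the mimetype entry rather than from the file name, the sketch also shows why the file extension is immaterial to the format.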
2.4.3 Usage
OpenDocument is designed to reuse parts of existing XML standards whenever they are available, and it creates new tags only where no existing standard can provide the needed functionality. Thus OpenDocument uses Dublin Core for metadata, MathML for displayed formulas, SVG for vector graphics, SMIL for multimedia, XLink for hyperlinks, etc.
This new standard is largely supported by the open source community and the industry, which explains why the list of applications supporting or based on OpenDocument is continually growing. Large organisations and governments around the world have started evaluating the use of OpenDocument for saving and exchanging editable office documents.
The Open Document Format was developed as an open standard, since adopted by ISO, to represent office documents. Due to its origins, ODF can naturally be integrated into any office product. Plug-ins and conversion tools are freely available.
The format is widely used, supported and adopted by the open source community, the industry and national bodies.
2.5 MS OOXML
2.5.1 Introduction
Office Open XML (commonly abbreviated as OOXML) is a file format specification for the storage of electronic documents such as memos, reports, books, spreadsheets, charts, presentations and word processing documents. The specification was developed by Microsoft for its Microsoft Office 2007 product suite and was standardized by Ecma International as ECMA-376 in December 2006.
2.5.2 Specification
File format and structure
The Office Open XML file is a ZIP package containing the individual files that form the basis of the document. As well as XML files the ZIP package can also include embedded (binary) files in formats such as PNG, BMP, AVI or PDF.
Since this is a complete break with the previous binary based Microsoft Office file formats, and is completely new, it isn't entirely clear what the extent of Microsoft's backwards compatibility claims for OOXML are. It cannot be backwards compatible with existing Microsoft Office documents, by virtue of it being a completely new format, and it isn't backwards compatible with versions of Microsoft Office prior to 2007 without the Microsoft Office Compatibility Pack.
Document format
Office Open XML is a container format for several specialized XML-based document markup languages, roughly corresponding to individual applications within the Microsoft Office product line:
* WordprocessingML for word processing documents
* SpreadsheetML for spreadsheets
* PresentationML for presentations
* DataDiagramingML for technical diagrams
* FormTemplate for electronic forms
Container structure
A basic Office Open XML file contains an XML file called [Content_Types].xml at the root level of the ZIP package, along with three folders: _rels, docProps, and a directory specific for the document type (for example, in a .docx word processing file that would be a word directory). The word directory contains the document.xml file which is the core content of the document.
[Content_Types].xml file
This file describes the content of the ZIP package. It also contains a mapping for file extensions and overrides for specific URIs.
_rels Folder
The _rels folders are where one goes to find the relationships for any given part within the package. To find the relationships for a specific part, one looks for the _rels folder that is a sibling of that part. If the part has relationships, the _rels folder will contain a file named after the original part with .rels appended to it. For example, if the content types part had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels folder.
_rels/.rels
The root-level _rels folder always contains a part called .rels. This URI (/_rels/.rels) and /[Content_Types].xml are the only two reserved URIs for parts in files that adhere to Office Open XML conventions. This is where the "package relationships" are located. Whenever one opens a file using these conventions, one always starts by going to the _rels/.rels file. All relationship files are represented with XML. Opened in a text editor, such a file shows XML that outlines each relationship for that part. In a minimal Word document containing only the basic document.xml, the top-level parts are two metadata parts and the document.xml part.
word/document.xml
This is the main part for any Word document. If one views it in an XML editor, one will see a pretty basic XML file. The body of the word processing document is contained in this part.
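The lookup order described above (start at _rels/.rels, follow the relationship to the main part, then read word/document.xml) can be sketched as follows. This is a reduced illustration: the in-memory package omits the docProps metadata parts, and the helper names and sample text are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
OFFICE_DOC_REL = ("http://schemas.openxmlformats.org/officeDocument/2006/"
                  "relationships/officeDocument")

CONTENT_TYPES = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/'
    'content-types">'
    '<Default Extension="rels" ContentType='
    '"application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" ContentType='
    '"application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    "</Types>"
)
ROOT_RELS = (
    f'<Relationships xmlns="{REL_NS}">'
    f'<Relationship Id="rId1" Type="{OFFICE_DOC_REL}" '
    'Target="word/document.xml"/>'
    "</Relationships>"
)
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello, Eurostat.</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_docx() -> bytes:
    """Assemble a reduced .docx-style package in memory (sketch only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("[Content_Types].xml", CONTENT_TYPES)
        pkg.writestr("_rels/.rels", ROOT_RELS)
        pkg.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def main_part_name(package: bytes) -> str:
    """Resolve the main document part through the package relationships."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        rels = ET.fromstring(pkg.read("_rels/.rels"))
    for rel in rels.iter(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOC_REL:
            return rel.get("Target").lstrip("/")
    raise ValueError("no officeDocument relationship found")


def paragraph_texts(package: bytes) -> list:
    """Return the text of each w:p paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        body = ET.fromstring(pkg.read(main_part_name(package)))
    return ["".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
            for p in body.iter(f"{{{W_NS}}}p")]


docx = build_docx()
print(main_part_name(docx), paragraph_texts(docx))
```

Resolving the main part through the relationship, rather than hard-coding word/document.xml, follows the convention that only /_rels/.rels and /[Content_Types].xml are reserved names.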
2.5.3 Usage
Office Open XML is the default Office 2007 format if macros are not enabled. Microsoft has also released a compatibility pack for older versions. Using the compatibility pack users can create and edit Office Open XML files from within Office 2000, Office XP and Office 2003. The compatibility pack can also be used as a stand alone converter in combination with Office 97.
OOXML was developed as a standard to represent office documents within MS Office 2007. According to Microsoft, the format is not backward compatible with the document formats of previous versions of MS Office.
As the format has not yet been released to the public, it has not been adopted anywhere outside the MS Office 2007 product.
2.6 Comparison Tables
2.6.1 General Information
Language   Creator               First public release date   Latest stable version   Editor         Viewer
CoSSI      Statistics Finland    --                          0.91                    XML Editor     --
Formex 4   OPOCE                 2004                        4                       XML Editor     --
DocBook    The Davenport Group   1992                        5.0                     XML Editor     Output to HTML, PDF, CHM, javadoc, others
ODF        OASIS                 2005                        1.1                     Office suite   Office suite
MS OOXML   Microsoft             --                          --                      Office suite   Office suite
2.6.2 Characteristics
Language   Major purpose                        Based on     Structural markup   Presentational markup
CoSSI      Statistical Information              SGML / XML   Yes                 No
Formex 4   Document exchange format for OPOCE   SGML / XML   Yes                 No
DocBook    Technical documents                  SGML / XML   Yes                 No
ODF        Multi-purpose                        XML/ZIP      Yes                 Yes
MS OOXML   Multi-purpose                        XML/ZIP      Yes                 Yes
2.7 Conclusion
In order to fulfil the requirements of Eurostat, the chosen standard will have to provide functionality to separate content from layout and to integrate fully into office suite tools. From the above analysis, only two of the standards match these expectations, namely ODF and MS OOXML.
We recommend choosing ODF as it is strongly supported by the industry, well documented, easy to use and a recognised ISO standard.
We want to highlight that, being based on XML, ODF can be transformed into other XML-based formats if that specific requirement arises. These formats could include Formex 4, DocBook, etc.
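As an illustration of such a transformation, the sketch below maps ODF headings and paragraphs to DocBook 5 sections and paragraphs. It is a deliberately small example under stated assumptions: it uses Python's ElementTree as a stand-in for an XSLT stylesheet, the function name odf_to_docbook and the sample content are hypothetical, and the output is not validated against the DocBook schema.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DOCBOOK_NS = "http://docbook.org/ns/docbook"

SAMPLE_CONTENT = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" '
    f'xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    '<text:h text:outline-level="1">Results</text:h>'
    "<text:p>Output grew by <text:span>2 %</text:span>.</text:p>"
    "</office:text></office:body></office:document-content>"
)


def odf_to_docbook(content_xml: str) -> ET.Element:
    """Map text:h to section/title and text:p to para (sketch only)."""
    article = ET.Element(f"{{{DOCBOOK_NS}}}article", {"version": "5.0"})
    parent = article  # paragraphs before any heading go to the article
    for node in ET.fromstring(content_xml).iter():
        if node.tag == f"{{{TEXT_NS}}}h":
            # Each heading opens a new section holding later paragraphs.
            parent = ET.SubElement(article, f"{{{DOCBOOK_NS}}}section")
            title = ET.SubElement(parent, f"{{{DOCBOOK_NS}}}title")
            title.text = "".join(node.itertext())
        elif node.tag == f"{{{TEXT_NS}}}p":
            para = ET.SubElement(parent, f"{{{DOCBOOK_NS}}}para")
            # Inline markup such as text:span is flattened to plain text.
            para.text = "".join(node.itertext())
    return article


article = odf_to_docbook(SAMPLE_CONTENT)
print(ET.tostring(article, encoding="unicode"))
```

A real conversion would also have to handle lists, tables, nested heading levels and inline styles, which this sketch ignores.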
3 ODF vs OOXML
Office Open XML and OpenDocument are two competing XML-based formats for documents intended for use in office productivity software. Both formats combine XML content with other files into compressed ZIP archives. In both formats, the main office document content and presentation information is stored as XML, with the ability to reference embedded and external binary content such as PNG, BMP, GIF, and JPEG.
3.1 Advantages of OpenDocument over Office Open XML formats
1. OpenDocument uses a mixed content model whereas the Office Open XML format does not. Non-mixed documents usually represent structured data, and mixed documents are usually used to represent narrative. Office Open XML uses the non-mixed model to represent narrative, which in certain cases leads to ambiguous markup. The mixed-content model is closer to what a developer will be familiar with.
2. OpenDocument is similar to XHTML, while Office Open XML is not. OpenDocument uses mixed content and marks styles in a similar way. This makes it easier to transform data accurately between OpenDocument and XHTML.
3. OpenDocument gives better separation of style and content.
4. OpenDocument hyperlink URLs are embedded in the main file, whereas in Office Open XML the URL is placed in a separate file. This causes problems when manipulating OOXML with standard tools such as XSLT.
5. OpenDocument reuses existing standards whenever possible, whereas Office Open XML implements its own definitions. ODF uses parts of SVG for drawings, MathML for equations, XLink for linking, Dublin Core for metadata, etc. This makes the format considerably more transparent to someone familiar with XML technologies. It also allows existing tools that understand these standards to be reused.
6. OpenDocument is an approved ISO standard; Office Open XML is not.
7. OpenDocument is royalty-free and can be used without charge by anyone, whereas Microsoft has only released a covenant not to sue for the use of its schemas.
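The difference between the mixed and non-mixed content models in point 1 can be made concrete with a short sketch. The two fragments below are hand-written illustrations of the same sentence, not output from any office suite, and the helper names are hypothetical. In the ODF fragment, text and inline elements are siblings inside the paragraph; in the WordprocessingML fragment, every piece of text is wrapped in its own run.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Mixed content: character data and <text:span> sit side by side.
ODF_PARAGRAPH = (
    f'<text:p xmlns:text="{TEXT_NS}">Inflation rose by '
    '<text:span text:style-name="Bold">2.1 %</text:span> in April.</text:p>'
)

# Non-mixed content: all character data lives inside w:r/w:t runs.
OOXML_PARAGRAPH = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:r><w:t xml:space="preserve">Inflation rose by </w:t></w:r>'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>2.1 %</w:t></w:r>"
    '<w:r><w:t xml:space="preserve"> in April.</w:t></w:r>'
    "</w:p>"
)


def odf_paragraph_text(xml_text: str) -> str:
    """Concatenate the text nodes of a mixed-content paragraph."""
    return "".join(ET.fromstring(xml_text).itertext())


def ooxml_paragraph_text(xml_text: str) -> str:
    """Concatenate the w:t elements of a run-based paragraph."""
    para = ET.fromstring(xml_text)
    return "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))


odf_p = ET.fromstring(ODF_PARAGRAPH)
span = odf_p.find(f"{{{TEXT_NS}}}span")
# In ElementTree, mixed content appears as .text and .tail around children.
print(repr(odf_p.text), repr(span.text), repr(span.tail))
print(odf_paragraph_text(ODF_PARAGRAPH) == ooxml_paragraph_text(OOXML_PARAGRAPH))
```

Both fragments carry the same sentence; the ODF form resembles XHTML inline markup, which is the similarity point 2 refers to.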
3.2 Advantages of Office Open XML formats over OpenDocument
1. Microsoft Excel has a well-known formula language that has been defined in its entirety in the new XML formats. The ODF implementation of spreadsheet formulas is specific to each vendor.
2. The Office Open XML spreadsheet format is faster than the ODF spreadsheet format.
3.3 Shortcomings of OpenDocument
1. OpenDocument has no macro language specification.
2. ODF 1.1 has no digital signature.
3.4 Shortcomings of Office Open XML
1. The specification is incomplete and not entirely publicly available.
2. The markup language for spreadsheets used in Office Open XML has two numeric formats for storing dates.
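The practical consequence of point 2 can be sketched as follows. The text above does not name the two formats; this example assumes they are the 1900-based and 1904-based serial date systems used by spreadsheet applications, in which the same calendar day is stored as two different numbers. The function name serial_to_date is hypothetical.

```python
from datetime import date, timedelta

# Assumption: 1900 date system. The offset of 30 Dec 1899 absorbs the
# fictitious 29 Feb 1900, so it is only valid for serials from 61 upwards.
EPOCH_1900 = date(1899, 12, 30)
# Assumption: 1904 date system, where serial 0 is 1 Jan 1904.
EPOCH_1904 = date(1904, 1, 1)


def serial_to_date(serial: int, date1904: bool = False) -> date:
    """Convert a spreadsheet serial day number to a calendar date (sketch)."""
    if date1904:
        return EPOCH_1904 + timedelta(days=serial)
    if serial < 61:
        raise ValueError("serials below 61 need special handling in the "
                         "1900 system")
    return EPOCH_1900 + timedelta(days=serial)


# The same day is stored as two different numbers in the two systems.
print(serial_to_date(39220))
print(serial_to_date(37758, date1904=True))
```

A consumer therefore has to know which date system a workbook uses before interpreting any stored date value.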
3.5 Cross-platform interoperability
1. Microsoft Office 2007 for Windows uses Office Open XML as its native file format. Microsoft Office 2008 for Mac OS X, scheduled for release in late summer 2007, will also use Office Open XML as its native file format. An ODF converter plug-in for Microsoft Office XP/2003/2007 for Windows allows one to open and save OpenDocument word processing (.odt) files.
2. Corel has announced that the WordPerfect Office X3 suite will include support for OpenDocument Format as well as Office Open XML by mid-2007.
3. Gnumeric has included support for the OpenDocument spreadsheet format and preliminary support for the Microsoft Office Open XML spreadsheet format since version 1.7.
4. IBM announced that Lotus Notes will use OpenDocument as the native format for its office productivity editors in the next release, due in 2007. IBM Workplace 2.6 already supports the OpenDocument format.
5. Google Docs and Spreadsheets supports OpenDocument word processing and spreadsheet formats.
6. AbiWord 2.4 supports the OpenDocument word processing format.
7. Scribus 1.3.3, a multi-platform, open source page layout application, supports import of OpenDocument word processing files.
8. OpenDocument Format is currently supported in several office suites and individual applications, including as the native file format for KOffice 1.5, OpenOffice.org 2.0 and StarOffice 8.
3.6 Conclusion
Based on the arguments for and against presented in the previous sections, we can conclude that Open Document Format is the leading format, with the greatest potential and flexibility. Its advantages clearly outweigh its disadvantages. As an ISO standard, ODF has been widely adopted by the industry and is well supported by the most common office tools. Furthermore, from a technical point of view, the schema is clear, well structured and easily convertible into other XML-based formats using XSLT.