From Planning to Publishing: How Business Objects Migrated Documentation to DITA One Step at a Time
DESCRIPTION
Presented by Dave Holmes at Documentation and Training West, May 6-9, 2008, in Vancouver, BC.
In 2006, Business Objects faced a major challenge: how to migrate over 50,000 pages of unstructured, non-topic-based documentation that it had acquired through rapid growth and acquisitions. The answer was to use DITA to standardize content creation, management, translation, and publishing processes company-wide. In this session, you will learn how they went from planning to publishing using an iterative approach, and how you can use this method to see the results of a content migration sooner in your project cycle.
TRANSCRIPT
COPYRIGHT © 2008, BUSINESS OBJECTS S.A.
FROM PLANNING TO PUBLISHING
How Business Objects migrated to DITA
SLIDE 2 BUSINESS OBJECTS CONFIDENTIAL. COPYRIGHT © 2008 BUSINESS OBJECTS S.A.
AGENDA
1. About us
2. Reasons for change
3. Migrating to DITA
4. Other Changes Required
5. How did we do?
6. Lessons Learned
SLIDE 3
About Business Objects
Business Objects, an SAP company, is the world leader in business intelligence (BI) software
Headquarters in San Jose, CA and Paris, France
SAP is the world's leading provider of business software
Headquarters in Walldorf, Germany
SAP acquired Business Objects in 2007
SLIDE 4
About Business Objects Documentation
Some quick numbers
76 authors + 16 Production Staff
Nine sites
All content is written in English and localized to up to 10 other languages. Many documents are sim-shipped
Documentation teams have undergone rapid growth due to acquisition in the past 3 years
Began a move to XML based authoring in 2005
First complete release in mid 2007
SLIDE 5
Reasons for Change
SLIDE 6
Motivation for Change (1/2)
Fast growth due to acquisitions of other companies brought inconsistencies
Supported 6+ file formats
Different team structures and cultures
Different styles and guidelines
Authors suffered from inefficient processes
Manual processes made up a large portion of an author's work
Inconsistent tools and processes increased overhead
Writers spent time recreating existing content and manually copying/pasting instead of single sourcing
SLIDE 7
Motivation for Change (2/2)
Translation process was expensive
Had to manage multiple character encodings
Large number of queries about English source content
Simultaneous shipment of software in 8 languages meant complex schedules and tight deadlines
Publishing process was overly complicated
Build times were as high as 2 days per deliverable per language
Multiple tools meant supporting multiple publishing processes
SLIDE 8
Why did we choose XML?
Support for end-to-end Unicode encoding
Improved reporting on content and automated workflows
Separation of content and format: centralized, standardized output
But we needed a DTD…
SLIDE 9
Why did we choose DITA?
DITA is a robust, industry-standard DTD
Topic based:
Provides a better experience to our users
Makes reuse easier
Allows easier division of workload
Allows for rolling translation
Extensible architecture allows us to grow our information types with minimal effort
Allows us to impose constraints on topic and element structure, which encourages:
Minimalism: less extraneous information in standalone topics
Structural and stylistic consistency
SLIDE 10
Migration Goals
Reduce production times
Support a single file format
Support a single publishing process
Minimize writing effort
Reuse content between deliverables and Business Units
End-to-end Unicode character encoding
Reduce the amount of required interaction between the localization, documentation and publishing teams
SLIDE 11
Challenges and Restrictions
Existing knowledge of DITA was very low
Delivery of our largest doc set, in 9 languages, by 2007
In order to make the migration financially feasible, we had to migrate all of our content by the end of 2007
SLIDE 12
Migrating to DITA
How did we get from there, to here?
SLIDE 13
Components of a Successful Migration
Content and Reuse analysis
DITA specialization
New Authoring tools
Content Management System
Publishing process
Automated migration process
SLIDE 14
Content and Reuse analysis
Conducted initial analysis, including:
Content analysis
Reuse analysis
Tools analysis
Designed a roadmap that included all teams
Decided to migrate to DITA 1.0 with no specialization
Reuse strategy would be implemented later in the project
Created rough plans for content alignment and rework
SLIDE 15
Authoring Tool
Authoring tools may have to change
Any XML Friendly Editor should be fine
We selected XMetaL for DITA authoring
Highly visual XML Editor
Direct integration with our Content Management System
Several extension points
SLIDE 16
Content Management System
A good Content Management Tool:
Is a common place to store files
Allows multiple people to work on the same files
Includes tools to find, group, sort and categorize information
Includes tools to publish information to other sources
A good Globalization Management Tool:
Supports the maintenance and deployment of multi-lingual versions of the same content
Custom formatting for each language
Provides additional tools for translation to the people that need them
We Selected Idiom’s WorldServer
SLIDE 17
Publishing System
Nearly all publishing is handled through WorldServer
Exceptions for some release material or API material
We single source to the following formats:
PDF x 3 styles (for review, product whitepaper, product guide)
CHM x 3 styles (for review, help, .NET2005)
HxS
Eclipse Help
HTML
Flat XML
All languages published using the same workflows
Initial customization of publishing process done in house
Hired an XSL Developer to work on publishing full time
SLIDE 18
Automated Migration to DITA
SLIDE 19
Creating an Automated Process
Most migration tasks can be automated
Some tools freely available
Any automation will require customization
Any customization will require technical expertise
No automation is perfect
SLIDE 20
Process Overview
SLIDE 21
Content Analysis
Authors conducted an analysis per deliverable
Identified content that would be obsolete for next major release
Analyzed content for appropriate structure and for potential reuse
Flagged difficult passages
SLIDE 22
Pre-Processing
Make the input as simple, and uniform, as possible
Ensure adherence to current templates
Ensure adherence to corporate style guide
Remove ‘complicated’ constructions
Remove, or minimize, variables in text and call-outs in images
Move towards Topic Oriented style
Remove ‘book-isms’ where possible
Remove phrases such as ‘In this chapter’ or ‘on page…’
Structure content as much as possible
Consistent styles for blocks of text or inline elements improved the results of automation
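A pre-processing pass of this kind lends itself to a simple scan for ‘book-isms’. The following is a minimal Python sketch, not the tooling described in the talk: the pattern list and the find_bookisms function are hypothetical, and a real list would come from the team's own style guide.

```python
import re

# Hypothetical patterns for 'book-isms'; a real list would come from the style guide.
BOOKISM_PATTERNS = [
    re.compile(r"\bin this (chapter|section|book)\b", re.IGNORECASE),
    re.compile(r"\bon page\b", re.IGNORECASE),
]


def find_bookisms(text):
    """Return (line_number, line) pairs for lines that match any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in BOOKISM_PATTERNS):
            hits.append((number, line.strip()))
    return hits


sample = """In this chapter, you learn how to create a report.
Click New to open the report wizard.
For details, see the table on page 42."""

# Print each flagged line so an author can rewrite it before migration.
for number, line in find_bookisms(sample):
    print(f"{number}: {line}")
```

A scan like this only flags candidates; deciding how to rewrite each passage stays with the author.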
SLIDE 23
Scripted Migration
Migration to DITA was handled by our Publishing team
Scripts validated input files against the expected style sheets and templates
FrameMaker content
Frame files ‘published’ to DITA using a WebWorks template
Non-FrameMaker content
Files were published to a simplified HTML template
Content was converted to DITA using XSLT
Perl and XSLT were used to fine tune the output based on input from the author
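The talk says the conversion from simplified HTML to DITA was done with XSLT and Perl. As an illustration only, the same idea is sketched below in Python with the standard library, which is a swapped-in technique and not the team's scripts. The tag mapping, the html_to_topic function, and the sample input are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from simplified HTML tags to DITA tags.
TAG_MAP = {"p": "p", "ul": "ul", "ol": "ol", "li": "li",
           "b": "b", "i": "i", "pre": "codeblock"}


def convert_element(html_el):
    """Recursively copy an HTML element into its mapped DITA element.

    An unmapped tag raises KeyError, so unexpected input is reported
    rather than silently passed through.
    """
    dita_el = ET.Element(TAG_MAP[html_el.tag])
    dita_el.text = html_el.text
    dita_el.tail = html_el.tail
    for child in html_el:
        dita_el.append(convert_element(child))
    return dita_el


def html_to_topic(html_source, topic_id):
    """Build a generic DITA <topic> from well-formed, simplified HTML."""
    html_body = ET.fromstring(html_source).find("body")
    topic = ET.Element("topic", {"id": topic_id})
    title = ET.SubElement(topic, "title")
    body = ET.SubElement(topic, "body")
    for child in html_body:
        if child.tag == "h1":
            title.text = child.text  # the heading becomes the topic title
        else:
            body.append(convert_element(child))
    return topic


SAMPLE_HTML = (
    "<html><body>"
    "<h1>Creating a report</h1>"
    "<p>Reports summarize data from a <b>universe</b>.</p>"
    "<ul><li>Open the designer.</li><li>Choose a data source.</li></ul>"
    "</body></html>"
)

topic = html_to_topic(SAMPLE_HTML, "creating_a_report")
print(ET.tostring(topic, encoding="unicode"))
```

Failing on unmapped tags is a deliberate choice in this sketch: it mirrors the point that stricter, more uniform input gives better automated output.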
SLIDE 24
Three stages of DITA
Considered “Well Formed” if:
All tags that open are closed
All tags are properly nested (closed in the reverse of the order in which they were opened)
All attributes are quoted
Considered “Valid” if:
It is well formed
It conforms to the rules of the DITA DTD
Considered “Well Written” if:
It is well formed, and valid
Content conforms to our Style Guide
All tags are used correctly
Adheres to the correct Information Architecture (topic based, correct topic types are used when appropriate)
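The first stage can be checked mechanically. The sketch below, in Python, uses the standard library parser to test well-formedness only; the is_well_formed function and the sample strings are hypothetical. The standard library parser does not validate against a DTD, so the “Valid” stage would need a validating parser plus the DITA DTD files, and the “Well Written” stage still needs human review.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Well formed: every tag is closed, nesting is correct, attributes are quoted.
good = '<topic id="t1"><title>Install</title><body><p>Run <b>setup</b>.</p></body></topic>'

# Not well formed: <b> is closed after its parent <p>.
bad_nesting = '<topic id="t1"><title>Install</title><body><p>Run <b>setup.</p></b></body></topic>'

# Not well formed: the id attribute value is not quoted.
unquoted = '<topic id=t1><title>Install</title></topic>'

for label, text in [("good", good), ("bad_nesting", bad_nesting), ("unquoted", unquoted)]:
    print(label, is_well_formed(text))
```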
SLIDE 25
Post-Processing
Content from migration was:
Guaranteed to be Valid
80% Structurally correct
20-80% topic based
Authors examined resulting files and improved content as necessary
No more than 10% of the content required re-writing
Most rewriting occurred because the input files were not topic based
SLIDE 26
Final Steps
Content moved to new CMS
New published output compared against input files
Authors published content in familiar file formats, and compared the output against the original files
Authors published content to unfamiliar formats, and examined the output for oddities
Localization teams scoped the new files for loc impact
Translation Memories were adjusted programmatically where possible to reduce the impact of the changes
Input files changed programmatically to filter out some content from translation
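The talk does not say how content was filtered out of translation. One plausible approach, sketched here in Python as an assumption, is to set DITA's translate attribute to "no" on elements that hold code-like content. The element list, the mark_non_translatable function, and the sample command are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: these DITA element types hold content that should not be translated.
NON_TRANSLATABLE = {"codeblock", "codeph", "filepath"}


def mark_non_translatable(topic):
    """Set translate="no" on matching elements; return how many were changed."""
    changed = 0
    for element in topic.iter():
        if element.tag in NON_TRANSLATABLE and element.get("translate") is None:
            element.set("translate", "no")
            changed += 1
    return changed


SAMPLE_TOPIC = (
    '<topic id="t1"><title>Run the tool</title><body>'
    "<p>Type <codeph>sampletool start</codeph> and press Enter.</p>"
    "<codeblock>sampletool start --all</codeblock>"
    "</body></topic>"
)

topic = ET.fromstring(SAMPLE_TOPIC)
changed = mark_non_translatable(topic)
print(changed)
print(ET.tostring(topic, encoding="unicode"))
```

Elements that already carry a translate attribute are left alone, so an author's explicit choice is not overwritten.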
SLIDE 27
Other changes required
SLIDE 28
Process Refinement
Continual improvement of migration process
Write scripts to migrate content to DITA
Write scripts to fine tune results
Test scripts on a sample set
Work with authors to identify pain points
Repeat…
Began enforcing stricter limitations on input files
SLIDE 29
Changes for Authors
New Authoring Tool
New Content Management System
Direct integration with our authoring tool made managing files easier
New Content Management System easier to use, but less robust, than previous system
Software strings extracted from source code for use in error message guides
New Style Guide
Created new roles to handle concerns or confusion about the new format
SLIDE 30
Changes for Localization
TM adjusted programmatically to reduce the impact of the new file format
Filters put in place to restrict the type of content that is exposed for translation
Workflows introduced to automate translation process
Interactions with vendors changed
New translation tools
New systems for translating graphics and screenshots (graphics now translated as text)
SLIDE 31
Changes for Publishing
All content uses a single file format
Redesigned our publishing layer (several times) to be more extensible
Had to develop custom transforms for formats that were previously produced with proprietary software
Introduced tools for automated QA testing
Created processes to automate publishing of content, and incorporate output into the product build
SLIDE 32
How did we do?
SLIDE 33
Migration Goals: Revisited
Reduce production times
Support a single file format
Support a single publishing process
Minimize writing effort
Reuse content between deliverables and Business Units
End-to-end Unicode character encoding
Reduce the amount of required interaction between the localization, documentation and publishing teams
SLIDE 34
Documentation in 2008
Criteria | 2005 | 2008
Teams | 6 | 14
Tools/Formats supported | Word, FrameMaker 6/7, RoboHelp, (ForeHelp), JavaDoc, .NET XML | XMetaL and DITA
Content Management | Perforce | WorldServer
Translation | Trados | WorldServer
Publishing | Combination of people, and WebWorks | WorldServer
Managing Published Content | Fully manual | 50% Automation (and more on the way!)
SLIDE 35
Unexpected Benefits
Less source content
Increased adherence to standards and style guidelines
Collaboration across the sites
Improved flexibility with published output
The technology has given us more flexibility
Pulling content directly from source code
Direct integration with the build system
SLIDE 36
Room for improvement
Lost some doc-related features
Process automation needs review
Some workflows not effective
Some workflows take too long, or are too tedious
Discovered commonalities in content that can be better represented through topic specialization
Information Architecture still fairly rudimentary
SLIDE 37
Lessons Learned
Some additional wisdom we picked up along the way
SLIDE 38
Education
General education should be provided early
Theoretical DITA
Topic Oriented Writing
Structured Writing Principles
Specific education should be provided as needed
DITA tag reference
Specific tools training
Classroom training can help improve confidence
Some material should always be available for onboarding
Skill with DITA is not yet common – some degree of training will need to be provided for any new hires
SLIDE 39
DITA is not ‘Just XML’
DITA implies a content architecture and necessitates Information Typing
The DITA DTD is not simple
The Open Toolkit Transformations are not trivial
SLIDE 40
Planning
Plan extra time for:
Migration workload for writers
Rewriting of content
Bug resolution before first release
Analyze the cost and the business case
Is it a worthwhile investment?
Get 100% commitment
Upper management commits to cost
Writers commit to change and to migration schedule
SLIDE 41
Communication
Separate tools and content architecture decisions
Create a dedicated tools team
Leverage the tools as much as possible
Create a single point of contact for style changes
Determine tagging rules and ‘special cases’ as early as possible
With no guidance, authors are forced to make their own decisions
Not everything needs to be done at once, but clear milestones need to be set for when things will be done
SLIDE 42
General
The migration requires some initial investment from all parties
The most difficult move for us was the move to Topic Oriented Authoring
The ‘cleaner’ your input, the better your output will be
Dedicate resources for customizing publishing output
SLIDE 43
Questions?
Feel free to email me at [email protected]