documentation costs avoided using python and other open standards andrew jonathan fine operating...
TRANSCRIPT
Documentation Costs Avoided using Python and other Open
Standards
Andrew Jonathan FineOperating Systems Software Organization
Engines, Systems, and ServicesHoneywell International
Original Core Data Flow
Single Python application• set of front end translators• content inserter• post-processing formatter
CompanyDocument
CompanyDatabase
Translators
Paragraphs,Tables,Pictures
Inserter FormatterRaw.doc Final.doc
CompanyTemplate.dot
GeneratorApplication
Front End Translator
• Selected by caller
• Caller specifies input file containing corporate data
• Extracts components from file Pictures Tables Paragraphs
• Saves to Python dictionary
Inserter
• Caller selects components from Python dictionaries made by front-ends for respective documents.
• Inserter creates a Word document
• Inserter uses Python/Com to insert components into document
Back End Formatter
• Scans corporate Word document template
• Scans Word document made by inserter
• Makes final style corrections.
Why?
The flow was designed to cope with changes in requirements!
• New projects
• New teams
• New data source formats
• New standards for existing formats
First front-end translator
Take pictures, tables, and data from a recursive property list constructed by an aerospace industry software visual programming tool called BEACON.
(… actual design of translator outside the scope of this paper…)
Initial Design of Inserter
• Straightforward use of principles demonstrated by Mark Hammond's book, Python Programming in Win32.
• Chapter containing a thorough treatment of how to have Python use the Word 97 COM object model to create and manipulate a Word Document.
Problems!!!
• Must cope with huge amounts of corporate data such as table cells..
• Speed of COM interface for new individual elements.
• Reuse issues for detailed typesetting of elements.
What I wanted:
• Faster conversion
• Existing standard
• Callable from Python
What I found:
• Faster conversion (OpenJade)
• Existing standard (DocBook SGML)
Why Call from Python?
• New scripting language to replace islands of automation (Perl, MSDOS, internal test stand controller language).
• Easier to connect islands after writing in Python.• Open source thus continuously peer reviewed.• Tremendous user base! Plenty of wrappers
written in Python around open source libraries supporting open standards.
… so I wrote a Python wrapper around some DocBook rules …
Revised Core Data Flow
CompanyDocument
CompanyDatabase
Translators
Paragraphs,Tables,Pictures
DocBook.py
OpenJade
DocBook.smgl Result.rtf
CompanyTemplate.dot
GeneratorApplication
TypesettingText in
DocBookSGML
LocalDocbookDSSSL
stylesheets
DocBookSGML
definition anddefault
stylesheets
Local.dslLocal.dtd
\usr\packages\sgml
Final.docCleanup.py
• Python wrapper writes DocBook SGML• OpenJade translates DocBook SGML
to Word RTF
A DocBook Informal table rendered by OpenJade into Word
Name Type
statex Integer
statey Long
Input to OpenJade as local DocBook SGML
<!DOCTYPE informaltable SYSTEM "C:\Local.dtd"><informaltable frame='all'><tgroup cols='2' colsep='1' rowsep='1' align='center'><colspec colname='Name' colwidth='75' align='left'></colspec><colspec colname='Type' colwidth='64' align='center'></colspec><thead><row><entry><emphasis role='bold'>Name</emphasis></entry><entry><emphasis role='bold'>Type</emphasis></entry></row></thead><tbody><row><entry><phrase role='xe' condition='italic'>statex</phrase></entry><entry>Integer</entry></row><row><entry><phrase role='xe' condition='italic'>statey</phrase></entry><entry>Long</entry></row></tbody></tgroup></informaltable>
from DocBook import DocBook
class ItalicIndexPhrase (DocBook.Rules.Phrase):
"italic indexible text phrase" TITLE = DocBook.Rules.Phrase def __init__ (self, text): DocBook.Rules.Phrase.__init__ (self, 'xe', 'italic') self.data = [ text ]
class NameCell (DocBook.Rules.Entry):
"table row cell describing name of identifier (italic and indexible text!)" TITLE = DocBook.Rules.Entry
def __init__ (self, text): DocBook.Rules.Entry.__init__ (self) self.data = [ ItalicIndexPhrase (text) ]
class StorageCell (DocBook.Rules.Entry):
"table row cell describing storage type of identifier (ordinary text)" TITLE = DocBook.Rules.Entry def __init__ (self, text): DocBook.Rules.Entry.__init__ (self) self.data = text
class TRow (DocBook.Rules.Row):
"each row in application's informal table body" TITLE = DocBook.Rules.Row def __init__ (self, binding): (identifier, storage) = binding DocBook.Rules.Row.__init__ (self, [ NameCell (identifier), StorageCell (storage) ])
class TBody (DocBook.Rules.TBody):
"application's informal table body" TITLE = DocBook.Rules.TBody
def __init__ (self, items): DocBook.Rules.TBody.__init__ (self, map (TRow, items))
class TGroup (DocBook.Rules.TGroup):
"application's informal table group"
COLSPECS = [ DocBook.Rules.ColSpec ('Name', 75, 'left'), DocBook.Rules.ColSpec ('Type', 64, 'center') ]
SHAPE = [ '2', '1', '1', 'center' ] TBODY = TBody
class InformalTable (DocBook.Rules.InformalTable):
"application's informal table" TGROUP = TGroup
class Example (DocBook):
'example application of DocBook formatting class' SECTION = str (InformalTable) def __call__ (self): self.data = [ InformalTable ()(self.data) ] return DocBook.__call__ (self) if __name__ == '__main__': print Example ([('statex', 'Integer'), ('statey', 'Long')]) ()
Python code to translate data into OpenJade input in local DocBook SGML
(based on Python to DocBook sample wrapper class DocBook)
Using class DocBook
• class DocBook from DocBook.py in Appendix F is the top-level interface callable class
• Application inherits from class DocBook
• Contents of application inherit from classes contained by DocBook.Rules
• Use overrides to specify structure, formatting, and text.
OpenJade
• OpenJade is an open source DSSSL execution engine available from SourceForge.
• DSSSL is an ISO standard for typesetting specification and document conversion.
• OpenJade reads DocBook DSSSL stylesheets and our local DSSSL stylesheets if any.
• The DSSSL is executed by OpenJade upon SGML source text to write a final document for later loading into a word processor.
DocBook Post-Processing using Word Automation
with Python/COM
• DocBook/OpenJade emits RTF with different Word document style identifier names than in corporate Word DOT file.
• Much faster to change document using Python/COM than to create document!
• Cannibalized Python code from inserter first draft to create post-processor.
• Reads RTF, changes, saves as final DOC.
Return on Investment5 projects ranging from 30 BEACON files to 150, average about 75 filesEach project has 2 releases per year where each file must generate hard copy.
Previously (cut/paste by hand):Each project release:
1/5 * 75 * 4 hours = 60 hours3/5 * 75 * 8 hours = 360 hours1/5 * 75 * 16 hours = 240 hours
----- 660 hours
Two releases per year: * 2 = 1,320 hoursFive projects needing releases: * 5 = 6,600 hoursTwo year period (2002-2003) * 2 = 13,200 hours
------Total effort avoided: 13,200 hours
Automated:
Automated releases over 2 year period: 160 hoursMy effort (12 * 140 hours per labor month): 1 680 hoursTotal investment: 1 840 hours
Net effort avoided, 2002-3: 11 360 hoursNet avoided by customers 2002-3 at $100/hour: 1 136 000 dollars Net labor years avoided 2002-3 at 1680 hours/year: 6.76 yearsHeadcount avoided per year: 3.38 people
ROI (Total effort avoided / total invested) 2002-3: 7.17
Python and DocBook together
• Python connects our department’s engineering specific islands of automation.
• Python with DocBook created Word documents from engineering data.
• The combination of an open language with an open standard eliminated a real-world business process bottleneck.
• The return on investment was substantial.