documentation costs avoided using python and other open standards andrew jonathan fine operating...

20
Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services Honeywell International

Upload: moses-mccormick

Post on 17-Jan-2016

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Documentation Costs Avoided using Python and other Open

Standards

Andrew Jonathan FineOperating Systems Software Organization

Engines, Systems, and ServicesHoneywell International

Page 2: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Original Core Data Flow

Single Python application• set of front end translators• content inserter• post-processing formatter

CompanyDocument

CompanyDatabase

Translators

Paragraphs,Tables,Pictures

Inserter FormatterRaw.doc Final.doc

CompanyTemplate.dot

GeneratorApplication

Page 3: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Front End Translator

• Selected by caller

• Caller specifies input file containing corporate data

• Extracts components from file Pictures Tables Paragraphs

• Saves to Python dictionary

Page 4: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Inserter

• Caller selects components from Python dictionaries made by front-ends for respective documents.

• Inserter creates a Word document

• Inserter uses Python/Com to insert components into document

Page 5: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Back End Formatter

• Scans corporate Word document template

• Scans Word document made by inserter

• Makes final style corrections.

Page 6: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Why?

The flow was designed to cope with changes in requirements!

• New projects

• New teams

• New data source formats

• New standards for existing formats

Page 7: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

First front-end translator

Take pictures, tables, and data from a recursive property list constructed by an aerospace industry software visual programming tool called BEACON.

(… actual design of translator outside the scope of this paper…)

Page 8: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Initial Design of Inserter

• Straightforward use of principles demonstrated by Mark Hammond's book, Python Programming in Win32.

• Chapter containing a thorough treatment of how to have Python use the Word 97 COM object model to create and manipulate a Word Document.

Page 9: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Problems!!!

• Must cope with huge amounts of corporate data such as table cells..

• Speed of COM interface for new individual elements.

• Reuse issues for detailed typesetting of elements.

Page 10: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

What I wanted:

• Faster conversion

• Existing standard

• Callable from Python

What I found:

• Faster conversion (OpenJade)

• Existing standard (DocBook SGML)

Page 11: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Why Call from Python?

• New scripting language to replace islands of automation (Perl, MSDOS, internal test stand controller language).

• Easier to connect islands after writing in Python.• Open source thus continuously peer reviewed.• Tremendous user base! Plenty of wrappers

written in Python around open source libraries supporting open standards.

… so I wrote a Python wrapper around some DocBook rules …

Page 12: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Revised Core Data Flow

CompanyDocument

CompanyDatabase

Translators

Paragraphs,Tables,Pictures

DocBook.py

OpenJade

DocBook.smgl Result.rtf

CompanyTemplate.dot

GeneratorApplication

TypesettingText in

DocBookSGML

LocalDocbookDSSSL

stylesheets

DocBookSGML

definition anddefault

stylesheets

Local.dslLocal.dtd

\usr\packages\sgml

Final.docCleanup.py

• Python wrapper writes DocBook SGML• OpenJade translates DocBook SGML

to Word RTF

Page 13: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

A DocBook Informal table rendered by OpenJade into Word

Name Type

statex Integer

statey Long

Page 14: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Input to OpenJade as local DocBook SGML

<!DOCTYPE informaltable SYSTEM "C:\Local.dtd"><informaltable frame='all'><tgroup cols='2' colsep='1' rowsep='1' align='center'><colspec colname='Name' colwidth='75' align='left'></colspec><colspec colname='Type' colwidth='64' align='center'></colspec><thead><row><entry><emphasis role='bold'>Name</emphasis></entry><entry><emphasis role='bold'>Type</emphasis></entry></row></thead><tbody><row><entry><phrase role='xe' condition='italic'>statex</phrase></entry><entry>Integer</entry></row><row><entry><phrase role='xe' condition='italic'>statey</phrase></entry><entry>Long</entry></row></tbody></tgroup></informaltable>

Page 15: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

from DocBook import DocBook

class ItalicIndexPhrase (DocBook.Rules.Phrase):

"italic indexible text phrase" TITLE = DocBook.Rules.Phrase def __init__ (self, text): DocBook.Rules.Phrase.__init__ (self, 'xe', 'italic') self.data = [ text ]

class NameCell (DocBook.Rules.Entry):

"table row cell describing name of identifier (italic and indexible text!)" TITLE = DocBook.Rules.Entry

def __init__ (self, text): DocBook.Rules.Entry.__init__ (self) self.data = [ ItalicIndexPhrase (text) ]

class StorageCell (DocBook.Rules.Entry):

"table row cell describing storage type of identifier (ordinary text)" TITLE = DocBook.Rules.Entry def __init__ (self, text): DocBook.Rules.Entry.__init__ (self) self.data = text

class TRow (DocBook.Rules.Row):

"each row in application's informal table body" TITLE = DocBook.Rules.Row def __init__ (self, binding): (identifier, storage) = binding DocBook.Rules.Row.__init__ (self, [ NameCell (identifier), StorageCell (storage) ])

class TBody (DocBook.Rules.TBody):

"application's informal table body" TITLE = DocBook.Rules.TBody

def __init__ (self, items): DocBook.Rules.TBody.__init__ (self, map (TRow, items))

class TGroup (DocBook.Rules.TGroup):

"application's informal table group"

COLSPECS = [ DocBook.Rules.ColSpec ('Name', 75, 'left'), DocBook.Rules.ColSpec ('Type', 64, 'center') ]

SHAPE = [ '2', '1', '1', 'center' ] TBODY = TBody

class InformalTable (DocBook.Rules.InformalTable):

"application's informal table" TGROUP = TGroup

class Example (DocBook):

'example application of DocBook formatting class' SECTION = str (InformalTable) def __call__ (self): self.data = [ InformalTable ()(self.data) ] return DocBook.__call__ (self) if __name__ == '__main__': print Example ([('statex', 'Integer'), ('statey', 'Long')]) ()

Python code to translate data into OpenJade input in local DocBook SGML

(based on Python to DocBook sample wrapper class DocBook)

Page 16: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Using class DocBook

• class DocBook from DocBook.py in Appendix F is the top-level interface callable class

• Application inherits from class DocBook

• Contents of application inherit from classes contained by DocBook.Rules

• Use overrides to specify structure, formatting, and text.

Page 17: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

OpenJade

• OpenJade is an open source DSSSL execution engine available from SourceForge.

• DSSSL is an ISO standard for typesetting specification and document conversion.

• OpenJade reads DocBook DSSSL stylesheets and our local DSSSL stylesheets if any.

• The DSSSL is executed by OpenJade upon SGML source text to write a final document for later loading into a word processor.

Page 18: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

DocBook Post-Processing using Word Automation

with Python/COM

• DocBook/OpenJade emits RTF with different Word document style identifier names than in corporate Word DOT file.

• Much faster to change document using Python/COM than to create document!

• Cannibalized Python code from inserter first draft to create post-processor.

• Reads RTF, changes, saves as final DOC.

Page 19: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Return on Investment5 projects ranging from 30 BEACON files to 150, average about 75 filesEach project has 2 releases per year where each file must generate hard copy.

Previously (cut/paste by hand):Each project release:

1/5 * 75 * 4 hours = 60 hours3/5 * 75 * 8 hours = 360 hours1/5 * 75 * 16 hours = 240 hours

----- 660 hours

Two releases per year: * 2 = 1,320 hoursFive projects needing releases: * 5 = 6,600 hoursTwo year period (2002-2003) * 2 = 13,200 hours

------Total effort avoided: 13,200 hours

Automated:

Automated releases over 2 year period: 160 hoursMy effort (12 * 140 hours per labor month): 1 680 hoursTotal investment: 1 840 hours

Net effort avoided, 2002-3: 11 360 hoursNet avoided by customers 2002-3 at $100/hour: 1 136 000 dollars Net labor years avoided 2002-3 at 1680 hours/year: 6.76 yearsHeadcount avoided per year: 3.38 people

ROI (Total effort avoided / total invested) 2002-3: 7.17

Page 20: Documentation Costs Avoided using Python and other Open Standards Andrew Jonathan Fine Operating Systems Software Organization Engines, Systems, and Services

Python and DocBook together

• Python connects our department’s engineering specific islands of automation.

• Python with DocBook created Word documents from engineering data.

• The combination of an open language with an open standard eliminated a real-world business process bottleneck.

• The return on investment was substantial.