Can I Have a Word:
Managing Shared Glossaries and
References to Terms With DITA
Eliot Kimber, Contrext
Tekom 2017
About the Author
• Independent consultant focusing on DITA analysis, design, and implementation
• Doing SGML and XML for cough 30 years cough
• Founding member of the DITA Technical Committee
• Founding member of the XML Working Group
• Co-editor of HyTime standard (ISO/IEC 10744)
• Primary developer and founder of the DITA for Publishers project
• Author of DITA for Practitioners, Vol 1 (XML Press)
Agenda
• DITA glossary markup
• Glossary challenges
• Managing and using glossary entries
• Glossary processing
Glossary is…
• Terms and their definitions
• For presentation to readers
• May include definitions of acronyms and abbreviations
• May include lexicographic details: part of speech, etc.
• Source for use-by-reference of <term> elements in content
Glossary is not…
• Formal term list as used in terminology management tools like Congree or Acrolinx
– Terminology management is a separate concern from glossary authoring and presentation
General Requirements
• Provide glossary of terms in publications
• Get terms by reference in content (mentions of terms)
• Links from uses of terms to their glossary entries
• Show expansions of acronyms and abbreviations on first use
• Reuse glossary entries in multiple publications
• Publish master glossary with links to it from other publications
<glossentry>
• Topic type for glossary entries
• Captures:
– Term
– Definition
– Abbreviated forms
– Parts of speech
– Surface form
– Other details
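A minimal sketch of a glossary entry topic covering the items listed above. The ID, term, and definition text are invented for illustration; the element names are from the DITA 1.3 glossary vocabulary.

```xml
<glossentry id="gloss-usb" xml:lang="en">
  <!-- Term: the preferred full form (hypothetical example term) -->
  <glossterm>Universal Serial Bus</glossterm>
  <!-- Definition -->
  <glossdef>A standard interface for connecting peripheral devices
    to a computer.</glossdef>
  <glossBody>
    <!-- Part of speech -->
    <glossPartOfSpeech value="noun"/>
    <!-- Surface form: how the term should appear in running text -->
    <glossSurfaceForm>Universal Serial Bus (USB)</glossSurfaceForm>
    <!-- Abbreviated form -->
    <glossAlt>
      <glossAcronym>USB</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
```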
<glossgroup>
• Topic type for grouping glossary entries together into one source document
• Allows nested <glossentry> elements
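As an illustration, a single source document holding several entries might look like the following sketch. The IDs, title, and placeholder definitions are assumptions made for this example.

```xml
<glossgroup id="glossary-group-a">
  <title>A</title>
  <!-- Each nested glossentry is still a complete glossary topic -->
  <glossentry id="gloss-apple">
    <glossterm>apple</glossterm>
    <glossdef>Placeholder definition, for illustration only.</glossdef>
  </glossentry>
  <glossentry id="gloss-apricot">
    <glossterm>apricot</glossterm>
    <glossdef>Placeholder definition, for illustration only.</glossdef>
  </glossentry>
</glossgroup>
```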
<glossref>
• Topicref type for referring to glossary topics
• DO NOT USE
• Sets @toc to “no”
• Sets @print to “no”
– Nobody knows why
• Requires @keys attribute
<abbreviated-form>
• Reference to a glossary entry
– Specialization of <term>
• Intended to produce abbreviation and expansion on “first use”
• Produces just abbreviation on other occurrences
• Challenge: When is a use the “first use”?
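A short usage sketch, assuming a hypothetical key named gloss-usb that is bound to a glossary entry whose term is "Universal Serial Bus" and whose acronym is "USB". How each occurrence is rendered depends on the processor and on what it treats as "first use".

```xml
<p>Connect the device to a free
  <abbreviated-form keyref="gloss-usb"/> port.
  <!-- A processor that treats the reference above as the first use
       would be expected to render the expanded form there and only
       the abbreviation for the reference below. -->
  Any <abbreviated-form keyref="gloss-usb"/> cable can be used.</p>
```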
<term>
• Can use @keyref to use a glossary term by reference
• Reflects the term if no local content
• Should be a link to the glossary entry
• Example:
<p>The <term keyref="gloss-framitz"/> …</p>
<sort-as>
• Can be used in topic prolog to provide sorting key
– Often required for Japanese
– May be required for Simplified Chinese
– Other languages, terms with special characters, etc.
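A sketch of a Japanese glossary entry that supplies a sort key in the topic prolog. The term, its kana reading, and the ID are chosen for illustration; whether the sort key is honored depends on the processor.

```xml
<glossentry id="gloss-yougo" xml:lang="ja">
  <glossterm>用語</glossterm>
  <glossdef>Placeholder definition, for illustration only.</glossdef>
  <prolog>
    <!-- Kana reading used as the sort key in place of the kanji -->
    <sort-as>ようご</sort-as>
  </prolog>
</glossentry>
```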
Glossary Entries as Resources
• Manage glossentry topics as individual docs
– Typical DITA practice for topics in general
• Must have associated keys
• Challenges:
– Where to define the keys?
– Defining naming conventions for keys
Maps for Glossaries
• Glossary entries MUST be part of the publication navigation tree
• <keydef> is either not appropriate or not sufficient
– <keydef> has processing role of “resource-only”
– Does not put referenced topic in the navigation tree
• Need normal-role topicrefs to glossary entries
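The difference can be sketched in a single map. The file names and key names here are hypothetical.

```xml
<map>
  <title>Hypothetical publication</title>
  <!-- Resource-only: the key resolves, but the entry is NOT placed
       in the navigation tree -->
  <keydef keys="gloss-apple" href="glossary/apple-gloss.dita"/>
  <!-- Normal role: the key resolves AND the entry is part of the
       navigation tree -->
  <topicref keys="gloss-banana" href="glossary/banana-gloss.dita"/>
</map>
```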
Grouping Entries
• Obvious approach is to use topicheads to group entries:
<topichead>
  <topicmeta>
    <navtitle>Glossary</navtitle>
  </topicmeta>
  <topichead>
    <topicmeta>
      <navtitle>A</navtitle>
    </topicmeta>
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
    …
  </topichead>
  …
</topichead>
• Doesn’t always work the way you might expect
Topichead Chunking Rule
• @chunk="to-content" on <topichead> makes the topichead act like a reference to a title-only topic
– DITA Spec: Clause 2.4.5.1 “Using the @chunk attribute”
• Unfortunately, includes all child topics in the resulting chunk
– Probably not what you want for glossaries
– Have to specify @chunk on each subordinate topicref
– Very annoying
• Bugs in Open Toolkit as of 2.5.4 produce incorrect results in both HTML and PDF
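A sketch of what specifying @chunk on each subordinate topicref might look like, assuming hypothetical file and key names. Actual output depends on the processor, so treat this as an illustration of the markup burden, not as a tested recipe.

```xml
<topichead chunk="to-content">
  <topicmeta>
    <navtitle>Glossary</navtitle>
  </topicmeta>
  <!-- Each child carries its own @chunk so that it is intended to
       become a separate chunk instead of being absorbed into the
       parent's chunk -->
  <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"
            chunk="to-content"/>
  <topicref keys="gloss-banana" href="glossary/banana-gloss.dita"
            chunk="to-content"/>
</topichead>
```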
Workaround for Grouping
• Create title-only topics for what would otherwise be topicheads
– Glossary top-level topic
– Each group
• Will need these for each language-specific group for localized glossaries
• Easy enough to generate
– Could do as extension to Open Toolkit preprocessing
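A sketch of the workaround, with hypothetical file names. First, a title-only topic for one group:

```xml
<!-- glossary/group-a.dita: a topic with a title and nothing else -->
<topic id="glossary-group-a">
  <title>A</title>
</topic>
```

Then the map refers to title-only topics where it would otherwise have used topicheads:

```xml
<!-- glossary-title.dita is assumed to be another title-only topic,
     titled "Glossary" -->
<topicref href="glossary/glossary-title.dita">
  <topicref href="glossary/group-a.dita">
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
  </topicref>
</topicref>
```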
Challenge:
How to Define Glossaries in Maps?
• Two basic options:
1. Use normal-role topicrefs only
2. Use both resource-only topicrefs and normal-role topicrefs that refer to the resource-only topicrefs by key
• Depends on your reuse requirements
Map Organization Option 1:
Just Normal-Role Topicrefs
• Publication map has normal topicrefs to the glossary entries
• Can have a single reusable submap
• Or can author separately for each publication
• Advantage: Keeps it simple
• Disadvantage: May have redundant or duplicate authoring in different publications
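A sketch of the reusable-submap variant of Option 1. All file names, titles, and keys are hypothetical.

```xml
<!-- glossary.ditamap: reusable submap with normal-role topicrefs -->
<map>
  <title>Glossary</title>
  <!-- glossary-title.dita is assumed to be a title-only topic -->
  <topicref href="glossary/glossary-title.dita">
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
    <topicref keys="gloss-banana" href="glossary/banana-gloss.dita"/>
  </topicref>
</map>
```

```xml
<!-- publication.ditamap: pulls in the glossary submap as-is -->
<map>
  <title>Some Publication</title>
  <topicref href="topics/overview.dita"/>
  <mapref href="glossary.ditamap"/>
</map>
```

Every publication that includes the submap gets the same entries in the same structure, which is the simplicity the slide refers to.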
Map Organization Option 2:
Keydefs + Normal Topicrefs
• Have a master map that uses <keydef> to refer to glossary entry topics
– These <keydef> keys are NOT to be used as target of <term> and <abbreviated-form> elements
– Reflects “exactly one topicref with URI reference to a given topic” policy
• In each publication:
– Grouping topicrefs
– Normal-role topicrefs with keys and keyref to <keydef> keys
• Advantage: Makes reuse easier to manage
• Disadvantages:
– Two keys where there was one
– May still have per-publication navigation structures for glossaries
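A sketch of Option 2, assuming a hypothetical naming convention in which master keys carry a "master-" prefix. File names and keys are invented for illustration.

```xml
<!-- master-glossary-keys.ditamap: the only place that points at the
     glossary entry files by URI -->
<map>
  <title>Master glossary keys</title>
  <keydef keys="master-gloss-apple" href="glossary/apple-gloss.dita"/>
  <keydef keys="master-gloss-banana" href="glossary/banana-gloss.dita"/>
</map>
```

```xml
<!-- publication.ditamap -->
<map>
  <title>Some Publication</title>
  <mapref href="master-glossary-keys.ditamap"/>
  <!-- Grouping topicref; glossary-title.dita is assumed to be a
       title-only topic -->
  <topicref href="glossary/glossary-title.dita">
    <!-- Normal-role topicref: @keyref points at the master key,
         @keys defines the key that <term> and <abbreviated-form>
         references use -->
    <topicref keys="gloss-apple" keyref="master-gloss-apple"/>
  </topicref>
</map>
```

Content would then use keyref="gloss-apple", never the master key.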
Master Glossaries
• Separate publication that is just the glossary
• Cross-deliverable links from other publications to glossary entries
• Cross-deliverable links are always a challenge
• DITA 1.3 provides cross-deliverable linking feature
– Probably not implemented in your tools as of November 2017
• Can use deliverable-specific topicrefs
– Requires that you know how glossary entries will be delivered
– Would expect to generate them automatically
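Both approaches can be sketched in map markup. The paths, key scope name, and URL are hypothetical. With the DITA 1.3 feature, the publication map makes a peer-scoped reference to the glossary publication's root map and gives it a key scope:

```xml
<map>
  <title>Some Publication</title>
  <!-- Peer map reference: keys from the glossary publication become
       addressable as "glossary.<key>" -->
  <mapref href="../glossary/master-glossary.ditamap"
          scope="peer" keyscope="glossary"/>
  <topicref href="topics/overview.dita"/>
</map>
```

A topic in this publication could then use <term keyref="glossary.gloss-apple"/>. The deliverable-specific alternative binds each key directly to the published location of the entry:

```xml
<!-- Assumes the glossary is published as HTML at this made-up URL -->
<keydef keys="gloss-apple"
        href="https://example.com/glossary/apple-gloss.html"
        scope="external" format="html"/>
```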
Processing Challenges
• Determining “first use” for abbreviated form references
• Automatic grouping and sorting
• Producing minimum glossary for a given publication
First Use Problem
• What is the scope?
– Single topic?
– “Chapter”?
– Entire publication?
• Scope may be different for different deliverable types
• May have different editorial rules
• Difficult to have a general solution
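To make the scope question concrete, here is a minimal XSLT 1.0 sketch that treats the nearest enclosing topic as the scope. This is one possible rule, not how any particular processor does it. It assumes @class attributes are present on the input (as they are after DTD-aware parsing), and the two modes it dispatches to are hypothetical names that a real stylesheet would have to implement.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match abbreviated-form by DITA class token, not element name -->
  <xsl:template
      match="*[contains(@class, ' abbrev-d/abbreviated-form ')]">
    <xsl:variable name="key" select="@keyref"/>
    <!-- Scope assumption: the nearest enclosing topic -->
    <xsl:variable name="scope-id" select="generate-id(
        ancestor::*[contains(@class, ' topic/topic ')][1])"/>
    <!-- Earlier references to the same key within the same scope -->
    <xsl:variable name="earlier" select="
        preceding::*[contains(@class, ' abbrev-d/abbreviated-form ')]
          [@keyref = $key]
          [generate-id(
             ancestor::*[contains(@class, ' topic/topic ')][1])
           = $scope-id]"/>
    <xsl:choose>
      <xsl:when test="$earlier">
        <!-- Not the first use: abbreviation only (hypothetical mode) -->
        <xsl:apply-templates select="." mode="abbreviation-only"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- First use in this topic: expansion (hypothetical mode) -->
        <xsl:apply-templates select="." mode="expanded-form"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Changing the scope to a chapter or to the whole publication means changing how the scope is computed, and at publication level it also requires processing topics in reading order, which is part of why a general solution is hard.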
Automated Grouping and
Sorting
• Nothing in standard-defined map markup that says unambiguously “this branch of the map is a glossary”
• Need locale-specific configuration for grouping
• Need locale-specific configuration for sorting
• Simplified Chinese needs special support
– DITA Community i18n project provides necessary features
– Somebody needs to implement Open Toolkit plugin for doing glossary sorting
Generating Glossary Based on
Terms Used
• Possible to generate a glossary that reflects only those terms actually used in the topics included in a publication
• Requires synthesizing normal-role topicrefs so key references will work properly
• Could be implemented as an extension to Open Toolkit preprocessing
• Could be a separate process that generates otherwise-normal map and topic components
Resources
• Me: [email protected]
• DITA specification: http://docs.oasis-open.org/dita/dita/v1.3/dita-v1.3-part0-overview.html
• DITA Community i18n project: https://github.com/dita-community/org.dita-community.i18n
• Sample files: https://github.com/dita-community/dita-test-cases/tree/master/glossaries/realistic-glossary/wipo-glossary