OLIF2 Consortium: Organizational Meeting
April 6, 2000
SAP AG
Walldorf, Germany

Agenda
9.00 – 9.15 Welcome and introductory remarks: Daniel Grasmick
9.15 – 9.45 Structure of the OLIF2 Consortium: Daniel Grasmick, Susan McCormick
9.45 – 10.30 Time frame for OLIF2: Daniel Grasmick, Susan McCormick
Financial issues for the consortium: Daniel Grasmick
10.30 – 10.45 Coffee break
10.45 – 12.00 Current status of OLIF: Gregor Thurmair
12.00 – 13.00 Discussion of changes to OLIF currently envisaged for OLIF2: Susan McCormick
13.00 – 14.00 Lunch
14.00 – 14.30 Review of current level of support for OLIF among tool vendors: Daniel Grasmick, Susan McCormick
14.30 – 15.30 Review of other interchange formats and initiatives: all participants
Discussion of interaction of OLIF2 Consortium with SALT and/or OSCAR: all participants
15.30 – 15.45 Coffee break
15.45 – 17.00 Task descriptions for work groups to review current OLIF and suggest changes/additions in linguistic, terminology, and technical specifications; recommendations to be completed in April/May 2000: all participants

Consortium Participants
Gregor Thurmair, Sail Labs
Johannes Ritzke, Sail Labs
Alex Muzarku, Logos
Pierre-Yves Foucou, Systran
Yves Mahe, Xerox
Paolo Martins, EU
Chris Pyne, L10NBRIDGE
Jörgen Danielsen, L10NBRIDGE
Nils van der Laan, Trados
Peter Quartier, Lotus
Ulrike Irmler, Microsoft
Daniel Grasmick, SAP
Susan McCormick, SAP
Jennifer Brundage, SAP
Christian Lieske, SAP
Christoph Pahlke-Lerch, SAP

Welcome and Introductions
• Company
• Professional background
• Terminology volume
• Languages supported
• Organization of terminology management in your company
• Terminology database(s) used
• Other tools related to terminology
• Any exchange formats?
• Future plans for terminology/lexicon management

Purpose of OLIF2
To upgrade the current OLIF standard so that it can be supported by tool vendors and applied by users in 2001

Why a New Consortium?
• OLIF was developed in the OTELO project as a prototype, but is not usable in its current form
• The SALT project plans to use the OLIF format as part of its XLT standard, but will not edit OLIF1 for content
• LISA TBX will be based on SALT XLT
• None of the other formats supports MT requirements
• Thus, usable OLIF is required
e.g., SAP will double its terminology volume by the end of 2000 and add additional NLP tools needing term data

Structure of the Consortium
• OTELO participants: SAIL Labs, Logos, Lotus, SAP
• New MT representative: Systran
• Term Management representatives: Trados, Xerox
• Service (and tool) providers: L10NBRIDGE, L&H via SAIL Labs
• Users: EU, Microsoft...
• ... And open to interested parties

Time Frame for OLIF2
Phase I: Specification
• Working groups make recommendations for changes to OLIF format by May 31, 2000
• Specifications for OLIF2 complete by September, 2000
Phase II: Implementation
• Tool vendors support new format in 2001
• Maintenance tools developed by end of 2000/beginning of 2001

Changes to OLIF for OLIF2

OLIF to OLIF2
Review current OLIF format for changes to:
• technical structure
• linguistic analysis
• terminology handling

XML
Make OLIF compliant with XML:
• well-supported industry standard
• extensible - new element types easily defined
• well-suited for data exchange formats
• SALT project already working on XML-based standard in which they want to embed OLIF
technical

Achieving XML-Compliance
• OLIF2 is primarily a ‘rewrite’ of OLIF, but with XML-compliance
• OLIF entry structure remains basically the same for OLIF2
technical

XML-Driven Design Changes
Use some of the features of XML to make design changes for OLIF2:
• reanalyze some current tags as attributes of XML element types, e.g.,
<LINK=“synonym”>
• allow for more embedding of structure
technical
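To illustrate what reanalyzing a tag as an attribute could look like, the sketch below builds a small entry in which the kind of link is carried by an attribute rather than by a dedicated tag. The element and attribute names (`entry`, `headword`, `link`, `type`, `target`) are hypothetical and are not taken from the OLIF specification. Well-formed XML needs a named attribute, so the shorthand `<LINK=“synonym”>` would be written along the lines of `<link type="synonym">`.

```python
import xml.etree.ElementTree as ET


def build_entry(headword, links):
    """Build a hypothetical entry; each link's kind is an attribute, not its own tag."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "headword").text = headword
    for kind, target in links:
        # One generic <link> element type; the relation is stored in "type".
        ET.SubElement(entry, "link", {"type": kind, "target": target})
    return entry


entry = build_entry("car", [("synonym", "automobile"), ("synonym", "motorcar")])
xml_text = ET.tostring(entry, encoding="unicode")

# Reading the data back: select links by attribute value.
parsed = ET.fromstring(xml_text)
synonyms = [link.get("target") for link in parsed.findall("link[@type='synonym']")]

print(xml_text)
print(synonyms)
```

With this layout a new link kind needs only a new attribute value, not a new element type.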

Character Sets
Current OLIF: ISO-Latin-1
OLIF2 functionality:
• double-byte characters
• bidirectionality
XML supports ISO/IEC 10646, whose character repertoire and code points match those of Unicode
technical
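The limitation of ISO-Latin-1 can be shown with a short sketch: a German term fits in Latin-1, while Japanese and Arabic terms do not, but all three survive a round trip through UTF-8 encoded XML. The `terms` and `term` element names and the `lang` attribute are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample terms (all meaning "translation"); the Arabic term is written right-to-left.
samples = {"de": "Übersetzung", "ja": "翻訳", "ar": "ترجمة"}

latin1_ok = {lang: can_encode(term, "latin-1") for lang, term in samples.items()}

# Serialize as UTF-8 XML (hypothetical element names) and read it back.
root = ET.Element("terms")
for lang, term in samples.items():
    ET.SubElement(root, "term", {"lang": lang}).text = term
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
roundtrip = {t.get("lang"): t.text for t in ET.fromstring(data)}

print(latin1_ok)
print(roundtrip == samples)
```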

Changes to the OLIF Concept
Make substantive changes to the structure:
• company-code as part of central entry base
• formally distinguish bilingual from monolingual links
• develop protocol for user-defined fields
technical

Converging with Other Standards
Coordination with other standardization initiatives such as SALT
• Achieve as much overlap as possible with, e.g.,
» names of element types
» structure of entries
technical

Review of Linguistic Features
Comprehensive review of linguistic features
• are features in correct feature groups?
• are all of the features that are essential for the different vendors covered?
» transitivity for Logos
» Systran requirements
» Xerox
• what about other NLP products or users?
linguistic

Morphology
Review the current morphological analysis
• currently includes only German, Danish and English
• theoretical underpinnings of analysis are inconsistent
linguistic

Syntax and Semantics
Special attention to:
• selectional restrictions (transfer conditions) - representation should be improved
• syntactic frames - currently for German, Danish and English only
• semantic types - should be reviewed and expanded
linguistic

Features and Values
• Make sure feature names and values conform to general practice
• Make sure all element types that we want to cover are actually in DTD
linguistic

Canonical Forms
Conventions for formulating canonical forms
• defined for formulation of entry string in given language
• necessary for optimal convergence of entries from different systems
• based on language-specific lexical conventions
• published as part of formal specification
linguistic
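A minimal sketch of why canonical forms help entries from different systems converge: two systems that store the same term with different spacing or a leading article end up under one key once a shared convention is applied. The conventions used here (collapsing whitespace, Unicode NFC normalization, dropping a leading article) and the names `LEADING_ARTICLES`, `canonical_form` and `merge` are assumptions made for illustration; the slides do not state the actual OLIF2 conventions.

```python
import unicodedata

# Hypothetical language-specific conventions, not taken from the OLIF2 specification.
LEADING_ARTICLES = {
    "en": ("the ", "a ", "an "),
    "de": ("der ", "die ", "das "),
}


def canonical_form(entry, lang):
    """Normalize an entry string under the assumed conventions for lang."""
    text = unicodedata.normalize("NFC", " ".join(entry.split()))
    for article in LEADING_ARTICLES.get(lang, ()):
        if text[: len(article)].lower() == article:
            return text[len(article):]
    return text


def merge(entries_by_system, lang):
    """Map each canonical form to the set of systems that contain it."""
    merged = {}
    for system, entries in entries_by_system.items():
        for entry in entries:
            merged.setdefault(canonical_form(entry, lang), set()).add(system)
    return merged


merged = merge(
    {"mt": ["the  hard disk", "printer"], "termbase": ["hard disk"]},
    "en",
)
print(merged)
```

Without the shared convention, "the  hard disk" and "hard disk" would be treated as two unrelated entries.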

Structure of Terminology
Expand current structure?
• allow for deeper structure, more embedding (in line with MARTIF?)
• expand on feature/value pairs to allow more admin detail
terminology

Entry Identifier
Add unique entry identifier
• current OLIF does not support a unique identifier for each entry, although many termbanks require this
terminology
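One way to picture the requirement is a step that gives every entry an identifier that no other entry in the same collection uses. The sketch below is only an illustration: the identifier shape `<prefix>-<number>`, the dictionary keys `term` and `id`, and the function name `assign_ids` are assumptions, not part of OLIF or OLIF2.

```python
from itertools import count


def assign_ids(entries, prefix):
    """Return copies of entries in which every entry has a unique "id".

    Existing ids are kept; missing ones get a hypothetical "<prefix>-<n>" value
    that does not collide with any id already present.
    """
    used = {e["id"] for e in entries if e.get("id")}
    counter = count(1)
    result = []
    for entry in entries:
        if entry.get("id"):
            result.append(dict(entry))
            continue
        new_id = f"{prefix}-{next(counter):06d}"
        while new_id in used:
            new_id = f"{prefix}-{next(counter):06d}"
        used.add(new_id)
        result.append({**entry, "id": new_id})
    return result


entries = [
    {"term": "invoice"},
    {"term": "Rechnung", "id": "ACME-000001"},
    {"term": "facture"},
]
with_ids = assign_ids(entries, "ACME")
ids = [e["id"] for e in with_ids]
print(ids)
```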

Review of OLIF Support Among Tool Vendors

Overview of Other Exchange Formats and Initiatives

MARTIF
ISO 12200:1999 Standard
• SGML-based
• strictly terminology
• formal concept-orientation
• extensive DTD
• lots of administrative information
• relatively complex embedding in structure

X-MARTIF
Proposal ISO/TC 37/SC 3 N 318
• extended MARTIF - attempt to coordinate with TMX and OLIF
• adapted to XML
• extends MARTIF to include some NLP features

SALT
SALT Project - Currently funded by the EU
XLT (lex/term exchange)
OLIF (lex)
MSC (term < MARTIF)

OSCAR
Group within LISA Organization
• TMX - format for re-use of translation memory data
• TBX - lex/termbase exchange (subset of XLT)

Geneter
“Generic model for the distribution and reuse of heterogeneous terminological data”
• for DB management
• compatibility with internet
• fairly complex hierarchical structure
• reworked to allow multiple word senses alongside concept model

Meeting Results:
Participation of all companies invited
working in 3 action groups ...

TG1: Technical Structure
Goal: provide formal structure of the format
• Review for XML compliance
• Redundancy
• Links representation
• Definition of the header
• Incorporation of user-defined fields
= Output: OLIF DTD

TG2: Linguistic Analysis
Goal: provide a “final” list of feature-value pairs for the linguistic component
• Canonical form formulation
• Morphology, syntax and semantics
• Transfer conditions and transformations
• Cross-references (based on ISO)

TG3: Terminology Handling
Goal: to provide a “final” list of feature-value pairs for terminology
• Concordance with other standards
• Administrative information

Languages Supported in OLIF2
Priority 1
• EN
• DE
• DA
• FR
• ES
• PT
• JA
Priority 2
• RU
• IT
• NL
Other priorities...
• EL
• HU
• ZH
• ZF
• KO
• AR

Other Items
• Terminology samples from all participants
» at least 100 entries incl. description
» at least 2 languages and different categories