The LEGO Project Brent Miller, The LINGUIST List

Post on 17-Dec-2015

The LEGO Project
Brent Miller, The LINGUIST List

Overview

• Introduction
• Doing LEGO
• Current Status
• Future of LEGO

Introduction

LEGO and the Need for Interoperability

A Variety of Data

• Standards
  • LIFT
  • LMF
  • TEI
• File Formats
  • PDF
  • Excel/Access
  • MDF (Toolbox)
  • .doc/.odt (Word/OpenOffice)

Why Interoperate?

• Greater access to language data
• More intelligent searches
• Ease of comparison between lexicons

What is LEGO?

• Three-year project sponsored by the NSF
• Participants: LINGUIST List, University at Buffalo
• Goal: Create a datanet of interoperable lexicons
  • Map grammatical information to GOLD
  • Map structure to a common schema (LL-LIFT)
  • Output in XML where lexicon contributor allows
  • Preserve source’s integrity
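
As a rough illustration of the first and last goals, the sketch below maps a contributor's own part-of-speech labels onto GOLD-style concept names while keeping the original label next to the mapping. The mapping table, the `gold:` prefix, the field names, and the sample entries are all assumptions made for illustration; none of them is taken from the LEGO codebase or the LL-LIFT schema.

```python
# Hypothetical mapping from one contributor's labels to GOLD-style concept
# names. In practice such a table would be drawn up separately per lexicon.
AUTHOR_TO_GOLD = {
    "n": "gold:Noun",
    "v": "gold:Verb",
    "adj": "gold:Adjective",
}


def map_entry(entry):
    """Return a copy of the entry with a GOLD concept added.

    The contributor's original label stays untouched under "pos", so the
    source can always be recovered from the mapped record.
    """
    mapped = dict(entry)
    label = entry["pos"].strip().lower()
    # None marks a label without a known mapping; nothing is guessed.
    mapped["gold"] = AUTHOR_TO_GOLD.get(label)
    return mapped


# Invented sample entries, not real lexicon data.
entries = [
    {"form": "form1", "pos": "n"},
    {"form": "form2", "pos": "V"},
    {"form": "form3", "pos": "ideophone"},
]
mapped_entries = [map_entry(e) for e in entries]

for e in mapped_entries:
    print(e["form"], e["pos"], "->", e["gold"])
```

Leaving unmapped labels as `None` rather than forcing them into the nearest category is one way to respect the source's integrity: the search layer can still find the entry by the author's own token.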

LEGO’s Purpose

• Not intended to develop a lexicon creation or display tool

• Will support multi-lexicon searches and comparisons

• Will demonstrate the value of digital standards in linguistic research

Doing LEGO

Team Structure and Workflow

Team Structure

• Three principal investigators
  • Jeff Good, University at Buffalo
  • Helen Aristar-Dry and Anthony Aristar, Eastern Michigan University
• Three graduate students
  • Brent Miller, Justin Petro, Erica Wicks
• One undergraduate, Lili Xia
• One programmer, Lily Zheng

Workflow

Current Status

Our Data, Website, and Faceted Search

Lexical Data

• Completed
  • 11 wordlists (10 Qiang dialects, Saliba)
  • 7 lexicons (Western Sisaala, Potawatomi, Udi, Ibibio, Wichita, Tuva, Shoshone)
• 10 nearing completion (Fulfulde, Archi, Udi, Mocovi, Jarawara, Nhirrpi, Titan, Maa, Mbodomo, Western Pantar, Mocho’)

The LEGO Site

• Homepage (in development)
  • http://lego.linguistlist.org
• Browse lexicons
  • Each lexicon has a homepage
• Browse entries
  • Each entry has its own page
• Faceted search
  • Allows for fine-grained GOLD-aware searches of morphological information across lexicons

Faceted Search

• Choose lexicons
• Text search
  • Search across forms, variants, glosses, definitions, etymology, examples, notes
  • Displays keyword in context
• Filters
  • Easily added/removed
  • Narrow search in real time
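
The text-search behaviour listed above can be sketched in a few lines. This is only a minimal illustration, assuming entries are plain dictionaries; the field names, the `text_search` and `keyword_in_context` helpers, and the sample data are hypothetical and do not reflect how the LEGO site is implemented.

```python
# Fields consulted by the search; a subset of those named on the slide.
SEARCH_FIELDS = ("form", "gloss", "definition", "notes")


def keyword_in_context(text, keyword, width=15):
    """Return the keyword with up to `width` characters on each side,
    or None when the keyword does not occur (case-insensitive)."""
    pos = text.lower().find(keyword.lower())
    if pos == -1:
        return None
    start = max(0, pos - width)
    end = min(len(text), pos + len(keyword) + width)
    return text[start:end]


def text_search(entries, keyword, lexicons=None):
    """Search the chosen lexicons (all of them if None) across fields."""
    hits = []
    for entry in entries:
        if lexicons is not None and entry["lexicon"] not in lexicons:
            continue
        for field in SEARCH_FIELDS:
            snippet = keyword_in_context(entry.get(field, ""), keyword)
            if snippet is not None:
                hits.append((entry["lexicon"], field, snippet))
    return hits


# Invented sample entries, not real lexicon data.
sample = [
    {"lexicon": "Lexicon A", "form": "form1", "gloss": "water",
     "definition": "clear liquid, water from a river"},
    {"lexicon": "Lexicon B", "form": "form2", "gloss": "fire"},
    {"lexicon": "Lexicon B", "form": "form3", "gloss": "rainwater"},
]

all_hits = text_search(sample, "water")
only_b = text_search(sample, "water", lexicons={"Lexicon B"})
for lexicon, field, snippet in all_hits:
    print(f"{lexicon} [{field}]: ...{snippet}...")
```

Each hit records which lexicon and which field matched, which is what lets a results page show the keyword in context rather than just a list of entries.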

Filters

• GOLD concepts
• Author grammatical information tokens
• Language codes
• Note types
• Entry relation types
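
One simple way to picture filters that can be added and removed freely is as a dictionary of facet/value pairs applied to the current result set. The sketch below assumes exactly that; the facet names, the `apply_filters` and `facet_counts` helpers, and the sample entries are hypothetical and are not drawn from the LEGO site itself.

```python
from collections import Counter

# Invented sample entries; facet names and values are placeholders.
ENTRIES = [
    {"form": "form1", "lexicon": "Lexicon A", "gold": "Noun", "note_type": "usage"},
    {"form": "form2", "lexicon": "Lexicon A", "gold": "Verb", "note_type": "usage"},
    {"form": "form3", "lexicon": "Lexicon B", "gold": "Noun", "note_type": "grammar"},
]


def apply_filters(entries, filters):
    """Keep only entries that match every active facet/value pair."""
    return [
        e for e in entries
        if all(e.get(facet) == value for facet, value in filters.items())
    ]


def facet_counts(entries, facet):
    """Count how many of the given entries carry each value of a facet."""
    return Counter(e[facet] for e in entries if facet in e)


filters = {}
filters["gold"] = "Noun"            # add a filter
nouns = apply_filters(ENTRIES, filters)

filters["lexicon"] = "Lexicon B"    # narrow the search further
nouns_in_b = apply_filters(ENTRIES, filters)

del filters["gold"]                 # remove a filter again
all_in_b = apply_filters(ENTRIES, filters)

print(facet_counts(nouns, "lexicon"))
```

Recomputing the counts on the filtered set after every change is what would let an interface show, next to each remaining facet value, how many results the next click will leave.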

Future of LEGO

Immediate and Long-Term Plans

2011-2012

• Create a lexicon creator log-in
• Allow users to edit and add to their data
• User-tagging of GOLD concepts
• Upload of user’s original lexicon documents
• Enhance publicly available datanet of lexicons
• Facilitate open participation of linguists
• Solicit a large number of new lexicons
• Refine the import/export facility
• Publicize the site

2012 and Beyond

• Continue to solicit new data and refine the interface

• The more data that’s present on the site, the more useful it will become to semanticists, typologists, lexicographers, translators, and other researchers