Sharing linguistic multi-media resources

Jacquelijn Ringersma

Paul Trilsbeek

Max Planck Institute for Psycholinguistics

Nijmegen, The Netherlands

Max Planck Institute for Psycholinguistics

Max Planck Gesellschaft

78 research institutes (Germany)

3 outside Germany:

2 Italy (art)

1 The Netherlands (psycholinguistics)

The study of mental processes involved in language production, language comprehension and language acquisition, as well as the relation between language, thought, and culture

Documenting (endangered) languages

Creation of a representative, long-lasting, multipurpose record of natural languages

It contributes to maintaining, consolidating or revitalizing endangered languages and thus safeguards the full range of their uses …

and it also

contributes to the description of cultural elements of a language community

Documenting (endangered) languages

Audio resources: represent spoken language

Video resources: information on the socio-linguistic environment

Enrichments: Annotations, transcriptions, translations, lexica

Sharing resources

Where is the data stored?

Digital (online) archives:

DoBeS (MPI archive), AILLA (Austin), PARADISEC (Sydney)

Archive for linguistic resources (MPI)

Archive for linguistic resources

Different types of linguistic material:

Endangered languages archive (DoBeS)

MPI language documentation corpora

External corpora (Carib, Narragansett, Slavonic etc.)

Total amount of data in the archive:

More than 230,000 objects, 25 TB of data

Digitized audio and video

Images

Annotations

Organization: metadata descriptions, database

Archive for linguistic resources (DoBeS)

Multimedia Lexicon

Typed Relations within the Lexicon

Annotated Media

Described Corpus

Archive for linguistic resources (MPI)

Photos

Sharing resources

Issues in the access debate

(Culturally) sensitive data

Ownership

Research purposes

National and institutional regulations

Codes of conduct

Specific groups or individual users have specific access rules to resources
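The point above, that specific groups or individual users have specific access rules, can be illustrated with a minimal sketch. The slides do not describe how the archive implements this; the class names, group labels and resource names below are all hypothetical and only show the general idea of per-resource rules.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRule:
    """Hypothetical per-resource rule: who may open the resource."""
    open_to_all: bool = False
    groups: set = field(default_factory=set)  # permitted group labels
    users: set = field(default_factory=set)   # permitted individual user ids


@dataclass
class User:
    user_id: str
    groups: set = field(default_factory=set)


def may_access(user: User, rule: AccessRule) -> bool:
    """Grant access if the resource is open, the user is listed
    individually, or the user belongs to at least one permitted group."""
    if rule.open_to_all:
        return True
    if user.user_id in rule.users:
        return True
    return bool(user.groups & rule.groups)


# Illustrative data only: resource names, groups and user ids are invented.
rules = {
    "ritual_recording.wav": AccessRule(groups={"speech-community"}),
    "word_list.pdf": AccessRule(open_to_all=True),
    "field_notes.txt": AccessRule(users={"collector01"}),
}

visitor = User("guest42", groups={"general-public"})
collector = User("collector01", groups={"researchers"})

print(may_access(visitor, rules["word_list.pdf"]))
print(may_access(visitor, rules["ritual_recording.wav"]))
```

Keeping the rule on the resource rather than on the user makes it possible to restrict a single culturally sensitive recording without changing anyone's general role.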

Who is the data for?

Collector (team) - researcher

Colleague researchers

General public – education, information

Speech communities – knowledge sharing, education, revitalization etc.

Sharing resources

How can the data be accessed?

1. Browsing

When represented in an organized manner (e.g. tree or by category)

2. Metadata search

3. Content search (only when users have access rights)

Collector (team) - researcher

Colleague researchers

Trained general public – education, information

4. Geographic browsing

Colleague researchers

General public – education, information

5. Lexicon and conceptual spaces

LEXUS - Lexicon tool

LEXUS

Web based lexicon tool

Word lists and detailed views of information in the lexical entries

Linking of multi-media fragments (images, video, sound files)

Linking of multi-media fragments stored in digital archives

Toolbox/XML compatibility (import and export)

ViCoS
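The Toolbox/XML import listed among the LEXUS features above can be sketched roughly as follows. This is not the LEXUS importer; it assumes plain backslash-coded Toolbox text using the widely used `\lx` (lexeme), `\ps` (part of speech) and `\ge` (English gloss) markers, and the XML element names and sample words are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parse_toolbox(text):
    """Split backslash-coded text into entries (marker -> list of values).
    A new entry starts at each lexeme marker (lx)."""
    entries = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # blank lines and unmarked continuation lines are skipped
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            current = {}
            entries.append(current)
        if current is not None:
            current.setdefault(marker, []).append(value.strip())
    return entries


def to_xml(entries):
    """Build a simple XML tree; element names mirror the markers
    (an assumption, not a published schema)."""
    root = ET.Element("lexicon")
    for entry in entries:
        node = ET.SubElement(root, "entry")
        for marker, values in entry.items():
            for value in values:
                ET.SubElement(node, marker).text = value
    return root


# Made-up sample data, not taken from any real lexicon.
sample = r"""
\lx kasa
\ps n
\ge house

\lx mata
\ps n
\ge eye
\ge face
"""

lexicon = to_xml(parse_toolbox(sample))
print(ET.tostring(lexicon, encoding="unicode"))
```

A real import would also need to handle multi-line field values, header records, and markers that are not valid XML element names.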

ViCoS – Visualizing conceptual spaces

Conceptual spaces in a multi-media encyclopedia

Conventional paper dictionaries: network of meanings less visible

Paper dictionaries are of limited usefulness in language maintenance and language revival (Manning et al., 2000)

Members of the speech community prefer following semantic links of different types (synonyms, antonyms, lexical, taxonomies)

Complement lexical spaces with ontological spaces

Allow users to construct a space of culturally relevant concepts

Concepts as centres for all sorts of information

relations to other concepts

anchored in the language to express them

linked to multimedia archive to describe them
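The idea of concepts as centres of information, with typed relations to other concepts, anchors in the lexicon and links into the multimedia archive, can be pictured as a small graph structure. The sketch below is an assumption about how such a space might be modelled, not a description of the ViCoS data model; all class names, relation types and identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    lexical_entries: list = field(default_factory=list)  # words that express the concept
    media: list = field(default_factory=list)            # references to archived objects
    relations: list = field(default_factory=list)        # (relation_type, target_name) pairs


class ConceptSpace:
    def __init__(self):
        self.concepts = {}

    def add(self, concept):
        self.concepts[concept.name] = concept

    def relate(self, source, relation_type, target):
        """Add a typed, directed relation between two existing concepts."""
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both concepts must exist before relating them")
        self.concepts[source].relations.append((relation_type, target))

    def neighbours(self, name, relation_type=None):
        """Return related concept names, optionally filtered by relation type."""
        return [target for (rel, target) in self.concepts[name].relations
                if relation_type is None or rel == relation_type]


# Invented example data.
space = ConceptSpace()
space.add(Concept("canoe", lexical_entries=["entry:canoe"],
                  media=["archive:canoe_building_video"]))
space.add(Concept("boat"))
space.add(Concept("paddle"))
space.relate("canoe", "is_a", "boat")
space.relate("canoe", "used_with", "paddle")

print(space.neighbours("canoe"))
print(space.neighbours("canoe", "is_a"))
```

Because the relation type is stored with each link, a user can follow only taxonomic links, only part-whole links, or any culturally specific relation type the community chooses to define.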

Visualizing Conceptual Spaces

Collector (team) - researcher

Speech community

Sharing resources

How can the data be accessed?

Direct access to archive through: browse, metadata search

Access through content search: collector (team) – researcher, colleague researchers, trained general public

Geographic browsing: colleague researchers, general public

Lexicon and conceptual spaces: collector (team) – researcher, members of the speech community
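The summary above pairs access routes with the user groups they are aimed at. As a minimal sketch, the pairing can be written as a lookup table. The group labels are shortened from the slides, the function name is hypothetical, and browsing and metadata search are left out because the slides name no specific audience for them.

```python
# Access routes and their intended user groups, as listed in the summary.
ACCESS_ROUTES = {
    "content search": {
        "collector/researcher", "colleague researcher", "trained general public",
    },
    "geographic browsing": {
        "colleague researcher", "general public",
    },
    "lexicon and conceptual spaces": {
        "collector/researcher", "speech community member",
    },
}


def routes_for(group):
    """Return the access routes aimed at the given user group, sorted by name."""
    return sorted(route for route, groups in ACCESS_ROUTES.items()
                  if group in groups)


print(routes_for("colleague researcher"))
print(routes_for("speech community member"))
```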