mpi tla din a0 2011-11-16
The Language Archive
Language
Data
Experts Collaboration
Tools
Projects
Max Planck Institute for Psycholinguistics
TLA's Mission
• digitize and archive language resources
• support access to language resources
• develop tools, services and infrastructures
• set up regional archives worldwide
• organize education and training activities
• give help and support
The Language Archive
Max Planck Institute for Psycholinguistics
P.O. Box 310, 6500 AH Nijmegen
Wundtlaan 1, 6525 XD Nijmegen
The Netherlands
Phone: (+31) (0)24-3521911
Fax: (+31) (0)24-3521213
eMail: [email protected]/tla
State of the Archive
• 60+ terabytes, 500,000+ files
• 73,000+ metadata sessions
• 20,000+ hours of audio/video recordings
• 60,000+ annotation files
• 4.5 million+ annotated segments
• 45+ lexica
• speech, multimodal, acquisition, multilingual, language and cognition, brain imaging, ethnological and other data
TLA is jointly funded by the Max-Planck-Society, the Berlin-Brandenburg Academy of Sciences and the Royal Netherlands Academy of Arts and Sciences
With substantial contributions by the Volkswagen Foundation, the European Commission, the German Ministry for Education and Research, the Dutch Science Foundation and the Max Planck Institute for Psycholinguistics.
November 2011
Language Archiving Technology LAT
TLA builds on a large archive of language resources, including primary data (multimedia recordings), secondary data (annotation, lexica, comments, etc.), and metadata. To prevent its loss, the Archive is copied to various locations, including a growing number of regional archives, preserving relations, contexts and provenance information.
To keep the data interpretable in the long run, adherence to standards and a continuous curation procedure are very important. Access to the data, in the spirit of the Live Archives idea and regulated by a code of conduct and other agreements, is guaranteed to those who hold access permissions for the individual resources. Depositors define these permissions in four levels (from fully open to closed).
Besides the fieldwork data of about 60 DOBES projects, TLA continues to digitize and archive an increasing amount of other language-related data. Currently the archive holds data on more than 200 languages.
Archive
Technology
The LAT software suite, started in 2000 with the multimedia annotation tool ELAN and the IMDI metadata infrastructure, covers about 15 components and tools. It is continuously being debugged, adapted and extended.
It includes tools for Resource Creation & Organization (ELAN, LEXUS, IMDI/CMDI, ARBIL, AV Recognizers), tools for Management, Upload & Infrastructure (LAMUS, IMDI/CMDI, AMS, COSIX, HANDLE, REPLIX), and tools for basic and complex resource access (IMDI/CMDI, VLO, ANNEX, IMEX, LEXUS, GIS, TROVA, VICOS).
2 Computer Centers in Munich (one from MPG)
2 Computer Centers in Göttingen (one from MPG)
2 Copies at MPI Nijmegen
Activities
TLA is involved in a number of initiatives devoted to the archiving of digital language data, to the improvement of technologies to create, manage and access language data, and to the construction of infrastructures that facilitate cross-institutional and cross-corpora access. The resulting infrastructures will allow researchers to build virtual collections and workflows to improve data access in the direction of eHumanities usage scenarios. TLA also contributes to standardization in ISO TC37/SC4 (www.tc37sc4.org), which aims at facilitating interoperability in the language resources domain.
Past Projects: MUMIS, INTERA, ISLE, LIRICS, DAM-LR (all EC), CGN (NWO), HARVE, INTER, ROR (all MPG), REPLIX (DEISA, CLARIN-EU).
Running Projects: DOBES (VWS), CLARIN (NL, DE), DASISH, INNET, CLARA, EUDAT (all EC), AVATecH (MPG-FhG), RELISH (DFG/NEH).
[Figure: Data Life Cycle Support]
Phases: preparation, integration, utilization
• RELcat / ISOcat: Ontology management framework
• Archive federation, Infrastructures
• Data Archiving and Copying
• IMDI / CMDI / GIS / VLO: Metadata Browsing & Searching
• IMDI / CMDI / ARBIL: Data Organization, Metadata Description
• ELAN / LEXUS: Annotation + Lexicon
• ANNEX / LEXUS / IMEX / TROVA: Complex Access via Web
• VICOS: Semantic Access and Enrichment
• LAMUS: Data Uploading and Management
• Access Management
Dokumentation Bedrohter Sprachen / Documentation of Endangered Languages (DOBES)
DéĮine
Beaver
Hoocąk
Wichita
Chontal
Lacandón
Aikanã/Kwazá
Tsafiki
People of the Center
Cashinahua
Baure
Movima
Yuracaré
Uru-Chipaya
Chaco Languages
Marquesan
Tuamotuan
Minderico
Bainouk
Laal
Beezen
Bubia / Isubu
Bakola
Tima
Oyda
ǂAkhoe Haiǁom
Taa
Lower Sorbian
Kola-Sámi
Enets / Nenets
Svan / Udi / Tsova-Tush
Gorani
Khinalug
Semoq Beri / Batek
Semang
Totoli
Waima'a
Wooi
Teop
Saliba / Logea
Savosavo
Vurës / Vera'a
Iwaidja
Jaminjung
Nen/Tonda
Ambrym Languages
Tofa
Even
Salar / Monguor
Chintang / Puma
Tangsa / Tai / Singpho
Kurumba Languages
Sri Lanka Malay
Katxuyana
Mawé
Trumai
Kuikuro
Awetí
Bakairí
Ache
Regional archives
DOBES
MPI
Archive