Archives for communities of interest, the Pacific And Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)

Nick Thieberger
Department of Linguistics & Applied Linguistics
The University of Melbourne

PNC Conference November 2005

Collaborative digital research resource set up by University of Sydney, University of Melbourne & Australian National University, 2003 (University of New England joined 2004)

75% funding from Australian Research Council Linkage Infrastructure and Equipment Fund Scheme (3 successful applications)

Communities of interest

A group of linguists and musicologists recognised that large collections of recorded material were not being properly archived. The other parts of the community are speakers and their descendants. Shared needs in the current group, and need for training of new researchers.

At least 3000 hours of analog fieldtapes

New technologies have a steep learning curve

Need for specialised assistance

Applied for research funds to establish an archive

Communities of interest

Safeguarding unique data with rights management

Ability to store all metadata but to expose only parts of it to search engines

All data subject to password access

Location of speakers on legacy tapes is not possible, hence liaison with regional cultural centres and museums to act as clearinghouses for repatriated data.

Communities of interest

Collaboration across universities and disciplines

Support from computing specialists (data grid, mass data store, programming), government agencies (E-research, Australian Partnership for Sustainable Repositories, GrangeNet)

International links - similar initiatives (OLAC/DELAMAN)

Regional cultural centres and museums (targets for repatriation of digital recordings)

International standards - Metadata (OLAC/OAI)

All of this requires coordination or project management

To preserve and make accessible Australian researchers’ field recordings of endangered languages and musics from the Asia-Pacific together with other digital material related to cultures of the region (theses, wordlists, texts, etc)

Preservation: to adopt world’s best practice standards and formats to maximise sustainability and future usability of the collection

Access: to take advantage of emerging information and communication technologies to maximise access to our collection by both researchers and cultural heritage communities

Over 2000 of the world’s 6000 languages in the Asia-Pacific region

Number likely to fall to a few hundred by 2100 (UNESCO)

Australian researchers active in region since 1950s - making unique recordings of unrepeatable events

Recordings now themselves endangered (format obsolescence, media deterioration, loss of metadata)

2500 records in PARADISEC catalogue with data on 390 languages from 50 countries including: American Samoa, Australia, Bangladesh, Botswana, Cambodia, Chile, China, Cook Islands, Fiji, French Polynesia, Greenland, Hong Kong, Iceland, India, Indonesia, Israel, Italy, Japan, Kiribati, Republic Of Korea, Lao People’s Democratic Republic, Madagascar, Malaysia, Malta, Marshall Islands, Mexico, Federated States Of Micronesia, Myanmar, Nauru, Nepal, New Caledonia, New Zealand, Nigeria, Niue, Palau, Papua New Guinea, Philippines, Reunion, Samoa, Singapore, Solomon Islands, South Africa, Taiwan, Province of China, Thailand, Tonga, Uganda, United States of America, Vanuatu, Viet Nam, Wallis And Futuna (data as of September 2005)

Locating data in the collection

Metadata complying with international standards

Open Language Archives Community (OLAC)

Geographic data entered via a map interface for later geographic querying

Open Archives Initiative (OAI)

Metadata Catalogue

SQL/PHP password access

OAI/DC compliant via the Open Language Archives Community

Controlled vocabularies (language name, contributor role, data type, coverage, etc)

Link to repository data stored at the Australian Partnership for Advanced Computing (APAC) in Canberra
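As an illustration of what OAI compliance makes possible, the sketch below builds a harvesting request of the kind defined by the OAI Protocol for Metadata Harvesting. The base URL is a placeholder, not PARADISEC's actual endpoint, and the function name is invented for this example; only the `verb`, `metadataPrefix` and `from` arguments come from the protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the base URL an archive actually publishes.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    The prefix "oai_dc" requests plain Dublin Core; OLAC data providers
    also offer the prefix "olac" for the richer OLAC record format.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, "olac", from_date="2005-09-01"))
```

A harvester would fetch this URL and parse the XML response; that step is left out here because it depends on the service being queried.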

Typical data

Stephen Wurm’s several hundred tapes, including 120 Solomon Islands tapes from the 1970s and transcripts/fieldnotes

Arthur Capell’s 114 tapes, Pacific and PNG 1950s (and 30 archive boxes of fieldnotes)

Bert Voorhoeve’s 180 tapes - West Papua

Tom Dutton’s 295 PNG tapes

Imaging fieldnotes

To date over 10,000 pages of fieldnotes have been photographed and are being put online

Crucial that links between fieldnotes and field recordings be maintained

Aim to allow trusted users to build links between dynamic media and fieldnotes

Wurm collection, Solomon Islands, 1979. Digitised cassette tape with page image of transcript, and Wurm’s language map

Archival data

Linking transcripts to media

Creation of time-aligned data that acts as fine-grained metadata

Searchable time-aligned media corpus

Citation of primary media
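A minimal sketch of how time-aligned data can serve as fine-grained metadata: each transcript segment carries start and end times, so a text search returns positions in the recording that can then be cited. The item identifier, the segment texts and the function names are all invented for illustration and do not describe PARADISEC's own tools.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript chunk tied to a span of the recording (seconds)."""
    start: float
    end: float
    text: str


# Invented sample data: neither the item name nor the text comes from the archive.
ITEM_ID = "EXAMPLE-ITEM-01"
SEGMENTS = [
    Segment(0.0, 4.2, "greeting and introduction of the speaker"),
    Segment(4.2, 11.8, "story about the canoe voyage"),
    Segment(11.8, 19.5, "song performed after the canoe voyage"),
]


def search(segments, term):
    """Return every segment whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return [seg for seg in segments if needle in seg.text.lower()]


def cite(item_id, seg):
    """Format a citation that points at a time span within the primary media."""
    return f"{item_id}, {seg.start:.1f}s-{seg.end:.1f}s"


for hit in search(SEGMENTS, "canoe"):
    print(cite(ITEM_ID, hit), "->", hit.text)
```

Because the times live in the data, not in a page layout, the same segments can drive both corpus search and playback from the cited offset.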

Training, resources and advocacy

Use of new technological approaches requires training, resources and advocacy

Training in use of new tools

Resources such as software, archiving, advice on tools and methods

Advocacy of the benefits of these new approaches and tools and the reasons for engaging with them

Training, resources and advocacy

Great need for training in particular expressed by postgraduate students

Training is critical as tools are constantly emerging (recording techniques and equipment, software tools)

Training, resources and advocacy

Methods for development of:

Time-aligned transcripts (in XML)

Interlinearised text

Dictionary production

Crucial separation of content and form to allow well-formed archival data
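The separation of content and form can be sketched as follows: the archival XML stores only utterances and their time codes, while any presentation is generated from it afterwards. The element and attribute names (`transcript`, `segment`, `start`, `end`) and the sample utterances are assumptions made for this example, not a schema named in the talk.

```python
import xml.etree.ElementTree as ET

# Invented sample content: (start seconds, end seconds, transcribed text).
SEGMENTS = [
    (0.0, 4.2, "first utterance"),
    (4.2, 9.0, "second utterance"),
]


def to_xml(segments):
    """Content: store each utterance with its time codes and nothing about layout."""
    root = ET.Element("transcript")
    for start, end, text in segments:
        seg = ET.SubElement(root, "segment", start=f"{start:.1f}", end=f"{end:.1f}")
        seg.text = text
    return ET.tostring(root, encoding="unicode")


def render_plain(xml_string):
    """Form: one possible presentation, derived from the XML and not stored in it."""
    # ET.fromstring raises ParseError if the input is not well-formed XML.
    root = ET.fromstring(xml_string)
    return "\n".join(
        f"[{seg.get('start')}-{seg.get('end')}] {seg.text}"
        for seg in root.iter("segment")
    )


archival = to_xml(SEGMENTS)
print(archival)
print(render_plain(archival))
```

Other renderings (an interlinear layout, a web page) could be written against the same XML without touching the archived file.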

Training, resources and advocacy

Training in creation of archival sources by fieldworkers

Naming conventions and persistent identification of data

Metadata sets and tools

Data formats: WAV, Text/XML, etc

Community of interest, support and training

Training, resources and advocacy

We have run training workshops in the use of appropriate linguistic tools for archival output (Toolbox, Transcriber etc)

University campuses in Melbourne, Sydney, Brisbane, University of Hawai’i

In community language centres in Melbourne, Kalgoorlie, Nambucca Heads and Sydney

Batchelor Institute (Aboriginal training centre)

RNLD mailing list

131 subscribers (August 2005)

Searchable archive at LinguistList (only around 140 messages over 12 months)

Topics covered include:

Digital audio/video recording equipment and tools

Scanning images - management of photographs

Shoebox/Toolbox issues

Transcriber issues

Consent forms

Unicode and orthography issues

Global research community

LACITO (Paris)

ANLC (Alaska)

EMELD (Michigan)

AILLA (Texas)

PARADISEC

AMPM (Auckland)

AIATSIS (Canberra)

ELAR (London)

DOBES (Netherlands)

DELAMAN archives

Digital Endangered Languages and Musics Archives Network

We are cited as an exemplar using Digital Mass Storage Systems in the International Association of Sound and Audiovisual Archives (IASA) Guidelines on the Production and Preservation of Digital Audio Objects (IASA-TC04). Aarhus, Denmark: International Association of Sound and Audiovisual Archives (IASA), 2004, p. 51.

"The Sub Committee on Technology of the Memory of the World Programme of UNESCO recommends these guidelines as best practice for Audio-Visual Archives."

Current size of collection

As at October 28th 2005 - 12,160 files in the collection totaling 1.74 TB

Total file counts by file type:

".jpg" : 7791 files

".mp3" : 2061 files

".pdf" : 34 files

".rtf" : 8 files

".tif" : 171 files

".txt" : 3 files

".wav" : 2061 files

".xml" : 31 files

Total file sizes by file type:

".jpg" : 10.91 GB

".mp3" : 55.59 GB

".pdf" : 5.70 MB

".rtf" : 1.04 MB

".tif" : 848.57 MB

".txt" : 2.15 MB

".wav" : 1.67 TB

".xml" : 1.20 MB
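The per-type figures can be checked against the stated totals with a few lines of arithmetic. The numbers below are copied from the slide; the one assumption is that the sizes use binary multiples (1 TB = 1024 GB), which the slide does not state. Decimal multiples give the same result after rounding.

```python
# Figures copied from the slide (collection as at 28 October 2005).
FILE_COUNTS = {
    ".jpg": 7791, ".mp3": 2061, ".pdf": 34, ".rtf": 8,
    ".tif": 171, ".txt": 3, ".wav": 2061, ".xml": 31,
}
FILE_SIZES = {
    ".jpg": (10.91, "GB"), ".mp3": (55.59, "GB"), ".pdf": (5.70, "MB"),
    ".rtf": (1.04, "MB"), ".tif": (848.57, "MB"), ".txt": (2.15, "MB"),
    ".wav": (1.67, "TB"), ".xml": (1.20, "MB"),
}

# Assumption: binary multiples (1 TB = 1024 GB = 1024 * 1024 MB).
MB_PER_UNIT = {"MB": 1, "GB": 1024, "TB": 1024 * 1024}

total_files = sum(FILE_COUNTS.values())
total_mb = sum(value * MB_PER_UNIT[unit] for value, unit in FILE_SIZES.values())
total_tb = total_mb / MB_PER_UNIT["TB"]

print(total_files)         # 12160
print(round(total_tb, 2))  # 1.74
```

Both results agree with the slide's totals of 12,160 files and 1.74 TB, and the uncompressed WAV files account for nearly all of the storage.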

Staff

Director - 2 days per week

Project Manager - 1 day per week

Admin - 4 days per week

Audio Engineer - fulltime

Occasional contract work

i.e. just over 2 fulltime positions

Conclusion

Successful cooperation between technical expertise and discipline based practitioners

Need for involvement of practitioners

Lack of resources to curate such collections means training practitioners to produce well-formed data for longterm accessibility

Change in practice to reflect new technological possibilities for creation of innovative research objects

Further information

http://paradisec.org.au