Metadata Extraction Projects for Education Network Australia


Upload: pru-mitchell

Post on 22-Nov-2014


DESCRIPTION

Overview of proof of concept projects presented at Metadata 2010 conference: Sharing Data, Sharing Ideas Canberra, 26-27 May 2010.

TRANSCRIPT

Page 1: Metadata Extraction Projects for Education Network Australia

edna is partly funded by the Australian Government Department of Education, Employment and Workplace Relations. Managed and maintained by Education Services Australia

Metadata Extraction Projects

Pru Mitchell & Sarah Hayman

Education Network Australia

Page 2: Metadata Extraction Projects for Education Network Australia

delivering innovative, cost-effective services across all sectors of education

formed 1 March 2010

not-for-profit, ministerial company (MCEECDYA)

www.esa.edu.au

Page 3: Metadata Extraction Projects for Education Network Australia

VETADATA

ANZ-LOM

IEEE LOM

ASCED

ASCO

Page 5: Metadata Extraction Projects for Education Network Australia

Metadata is not scalable

We can no longer be comprehensive or meet the standards set by our collection policy, because now we have:

more content

less funding

fewer cataloguers

same old clunky metadata tools

Page 6: Metadata Extraction Projects for Education Network Australia

Solutions

reduce quantity of metadata

reduce quality of metadata

get someone else to create/pay for metadata: users, other organisations

improve metadata creation tools

program machines to create metadata

Page 7: Metadata Extraction Projects for Education Network Australia

edna proof of concepts

me.edu.au professional networking

edna sustainable collections (ESC)

faceted search: rights and user level

Flinders University-edna AI research

Page 8: Metadata Extraction Projects for Education Network Australia

professional networking site for educators

users bookmark and discuss resources and these are aggregated to own url

the system collects, manages and maps metadata

Page 9: Metadata Extraction Projects for Education Network Australia
Page 10: Metadata Extraction Projects for Education Network Australia
Page 11: Metadata Extraction Projects for Education Network Australia

person - resource - tag - community

Page 12: Metadata Extraction Projects for Education Network Australia
Page 13: Metadata Extraction Projects for Education Network Australia

edna Sustainable Collections (ESC)

harvests bookmarks from key educators in me.edu.au and external services

links to OpenCalais entity data

Page 14: Metadata Extraction Projects for Education Network Australia

How does it do this?

takes an RSS feed

extracts available metadata

checks for duplicates

maps it to edna metadata profile in DSpace metadata management system
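The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ESC implementation: the sample feed, the function names, the URL-based duplicate check and the Dublin Core style field names (standing in for the edna metadata profile) are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item RSS 2.0 feed; both items point at the same resource.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example bookmarks</title>
  <item>
    <title>Fractions game</title>
    <link>http://example.org/fractions/</link>
    <description>Interactive fractions activity</description>
    <category>maths</category>
  </item>
  <item>
    <title>Fractions game (again)</title>
    <link>HTTP://EXAMPLE.ORG/fractions</link>
    <category>numeracy</category>
  </item>
</channel></rss>"""


def extract_items(feed_xml):
    """Step 2: pull whatever metadata each RSS item actually provides."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        yield {
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            "tags": [c.text.strip() for c in item.findall("category") if c.text],
        }


def normalise_url(url):
    """Step 3 helper: crude duplicate key (lower-case, no trailing slash)."""
    return url.strip().lower().rstrip("/")


def to_record(item):
    """Step 4: map to Dublin Core style fields (assumed, not the edna profile)."""
    record = {
        "dc.title": item["title"],
        "dc.identifier.uri": item["link"],
        "dc.subject": item["tags"],
    }
    if item["description"]:
        record["dc.description"] = item["description"]
    return record


def harvest(feed_xml, seen=None):
    """Steps 1-4 together; `seen` holds keys of resources already collected."""
    seen = set() if seen is None else seen
    records = []
    for item in extract_items(feed_xml):
        key = normalise_url(item["link"])
        if not key or key in seen:
            continue  # skip items without a link and duplicates
        seen.add(key)
        records.append(to_record(item))
    return records


records = harvest(SAMPLE_FEED)
```

Passing a persistent `seen` set between runs would let repeated harvests of the same feed skip resources that are already in the collection.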

Page 15: Metadata Extraction Projects for Education Network Australia
Page 16: Metadata Extraction Projects for Education Network Australia
Page 17: Metadata Extraction Projects for Education Network Australia
Page 18: Metadata Extraction Projects for Education Network Australia

Outcomes

increasing efficiency for information managers

freeing information managers to focus on higher-end work, e.g. subject and user level metadata

adding user suggestions to the collection

widening the range of resources being captured and evaluated

Page 19: Metadata Extraction Projects for Education Network Australia

Faceted search

use metadata to help solve a stakeholder issue: the cost of educational copying

harvest rights/licence metadata

make this meaningful to educators ‘what can I do with this resource?’

preference openly licensed content
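As a rough sketch of how harvested rights metadata might be turned into an educator-facing facet, the Python below reads the licence code out of a Creative Commons licence URL, attaches a plain-language label, and sorts openly licensed results first. The label wording, the choice of which licences count as "open", and the record layout are assumptions made for illustration, not the edna search implementation.

```python
from urllib.parse import urlparse

# Plain-language answers to "what can I do with this resource?" (illustrative wording).
FACETS = {
    "by": "copy, adapt and share with attribution",
    "by-sa": "copy, adapt and share alike with attribution",
    "by-nd": "copy and share unchanged with attribution",
    "by-nc": "copy, adapt and share for non-commercial use",
    "by-nc-sa": "copy, adapt and share alike for non-commercial use",
    "by-nc-nd": "copy and share unchanged for non-commercial use",
}

# Assumption: only these licences are treated as "openly licensed" for ranking.
OPEN_CODES = {"by", "by-sa"}


def licence_code(rights_url):
    """Return the licence code from a Creative Commons licence URL, else None."""
    parsed = urlparse(rights_url or "")
    if not parsed.netloc.lower().endswith("creativecommons.org"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "licenses":
        return parts[1].lower()
    return None


def facet(rights_url):
    """Plain-language facet label, with a cautious default for unknown rights."""
    return FACETS.get(licence_code(rights_url), "check copyright terms before copying")


def rank(results):
    """Stable sort that puts openly licensed records ahead of everything else."""
    return sorted(results, key=lambda r: licence_code(r.get("rights")) not in OPEN_CODES)


results = [
    {"title": "A", "rights": None},
    {"title": "B", "rights": "http://creativecommons.org/licenses/by/3.0/au/"},
    {"title": "C", "rights": "http://creativecommons.org/licenses/by-nc-nd/3.0/"},
]
ranked_titles = [r["title"] for r in rank(results)]
```

Because the sort is stable, records within the open and non-open groups keep their original relevance order.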

Page 20: Metadata Extraction Projects for Education Network Australia
Page 22: Metadata Extraction Projects for Education Network Australia

AI proof of concept

Flinders University Artificial Intelligence and Knowledge Laboratory and Education.au 2008-09

partial automation of categorisation and annotation of web pages

Page 23: Metadata Extraction Projects for Education Network Australia

Elements of the project

text analysis

automatic classification: edna category

suggestion of categories from controlled vocabulary

classification data capture tool

Page 25: Metadata Extraction Projects for Education Network Australia

Findings

35% accuracy for mapping category from title alone, 60% accuracy using WordNet-based semantic relatedness

confirmation of the need for a human eye/expertise

classification information may be contained in images/style/tone not text
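To make "accuracy for mapping category from title alone" concrete, here is a toy Python sketch of a title-only classifier and the accuracy calculation. It uses plain keyword overlap as a stand-in, not the WordNet-based semantic relatedness the project used, and the category terms and labelled titles are invented for the example; they are not project data or results.

```python
# Hypothetical controlled vocabulary: category -> indicative terms.
CATEGORY_TERMS = {
    "Mathematics": {"fractions", "algebra", "geometry", "numeracy"},
    "Science": {"physics", "chemistry", "biology", "energy"},
    "English": {"poetry", "grammar", "reading", "writing"},
}


def tokens(title):
    """Lower-case the title and strip simple punctuation from each word."""
    return {w.strip(".,:;!?()").lower() for w in title.split()} - {""}


def suggest_category(title):
    """Suggest the category whose terms overlap the title most, or None."""
    words = tokens(title)
    scores = {cat: len(words & terms) for cat, terms in CATEGORY_TERMS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def accuracy(labelled):
    """Share of titles whose suggested category matches the human label."""
    correct = sum(1 for title, cat in labelled if suggest_category(title) == cat)
    return correct / len(labelled)


# Invented examples: (title, category assigned by a human cataloguer).
labelled = [
    ("Teaching fractions with pizza", "Mathematics"),
    ("Renewable energy experiments", "Science"),
    ("A guide to the solar system", "Science"),
    ("Poetry writing workshop", "English"),
]
score = accuracy(labelled)
```

The third title contains none of the listed terms, so the sketch offers no suggestion for it. That is the kind of case where a title alone is not enough and a human eye, or a richer relatedness measure, is needed.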

Page 26: Metadata Extraction Projects for Education Network Australia

Conclusions

consider new approaches and keep pace with developments, cultural and technical

find opportunities to involve users in discovery, evaluation and description of content

continue to explore smart tools to help build and manage collections