Metadata Extraction Projects for Education Network Australia

Post on 22-Nov-2014


DESCRIPTION

Overview of proof of concept projects presented at Metadata 2010 conference: Sharing Data, Sharing Ideas Canberra, 26-27 May 2010.

TRANSCRIPT

edna is partly funded by the Australian Government Department of Education, Employment and Workplace Relations. Managed and maintained by Education Services Australia.

Metadata Extraction Projects

Pru Mitchell & Sarah Hayman

Education Network Australia

delivering innovative, cost-effective services across all sectors of education

formed 1 March 2010

not-for-profit, ministerial company (MCEECDYA)

www.esa.edu.au

VETADATA

ANZ-LOM

IEEE LOM

ASCED

ASCO

Metadata is not scalable

We can no longer be comprehensive or meet the standards set by our collection policy, because now we have:

more content

less funding

fewer cataloguers

same old clunky metadata tools

Solutions

reduce quantity of metadata

reduce quality of metadata

get someone else to create/pay for metadata: users, other organisations

improve metadata creation tools

program machines to create metadata

edna proof of concepts

me.edu.au professional networking

edna sustainable collections (ESC)

faceted search: rights and user level

Flinders University-edna AI research

professional networking site for educators

users bookmark and discuss resources and these are aggregated to own url

the system collects, manages and maps metadata

person - resource - tag - community
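The person, resource, tag and community relationship can be pictured as a small data structure. The following is only an illustrative sketch: the `Bookmark` class, its field names and the sample data are hypothetical and do not describe how me.edu.au actually stores its metadata.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    """One bookmarking event (hypothetical structure, for illustration only)."""
    person: str
    resource: str          # URL of the bookmarked resource
    tags: tuple = ()
    community: str = ""


# Made-up sample data.
BOOKMARKS = [
    Bookmark("teacher_a", "http://example.org/fractions", ("maths", "fractions"), "primary-maths"),
    Bookmark("teacher_b", "http://example.org/fractions", ("numeracy", "maths"), "primary-maths"),
    Bookmark("teacher_a", "http://example.org/water-cycle", ("science",), "science-teachers"),
]


def aggregate_by_resource(bookmarks):
    """Merge everything users have said about each resource under its URL."""
    merged = defaultdict(lambda: {"people": set(), "tags": set(), "communities": set()})
    for b in bookmarks:
        entry = merged[b.resource]
        entry["people"].add(b.person)
        entry["tags"].update(b.tags)
        entry["communities"].add(b.community)
    return dict(merged)


aggregated = aggregate_by_resource(BOOKMARKS)
for url, entry in aggregated.items():
    print(url, sorted(entry["tags"]), len(entry["people"]), "people")
```

Keying the aggregate on the resource URL is one way to gather many users' tags for the same resource in a single place.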

edna Sustainable Collections (ESC)

harvests bookmarks from key educators in me.edu.au and external services

links to OpenCalais entity data

How does it do this?

takes an RSS feed

extracts available metadata

checks for duplicates

maps it to edna metadata profile in DSpace metadata management system
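The steps above (take a feed, extract metadata, check for duplicates, map to a profile) can be sketched in a few lines of Python. This is a minimal illustration, not the ESC implementation: the sample feed, the URL normalisation rule and the Dublin Core style field names standing in for the edna metadata profile are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for a harvested bookmark feed.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example bookmarks</title>
    <item>
      <title>Fractions activity</title>
      <link>http://example.org/fractions</link>
      <description>Interactive fractions practice</description>
      <category>maths</category>
    </item>
    <item>
      <title>Fractions activity (repost)</title>
      <link>http://example.org/fractions/</link>
      <description>Same resource bookmarked again</description>
      <category>numeracy</category>
    </item>
    <item>
      <title>Water cycle video</title>
      <link>http://example.org/water-cycle</link>
      <description>Short explainer video</description>
      <category>science</category>
    </item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Take an RSS feed and extract the metadata each item carries."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            "tags": [c.text.strip() for c in item.findall("category") if c.text],
        })
    return items


def normalise_url(url):
    """Crude duplicate key (an assumption): lower-case, drop trailing slash."""
    return url.lower().rstrip("/")


def drop_duplicates(items, known_urls=()):
    """Skip items whose URL is already in the collection or earlier in the feed."""
    seen = {normalise_url(u) for u in known_urls}
    unique = []
    for item in items:
        key = normalise_url(item["link"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique


def map_to_profile(item):
    """Map to Dublin Core style fields (placeholder for the edna profile)."""
    return {
        "dc.title": item["title"],
        "dc.identifier.uri": item["link"],
        "dc.description": item["description"],
        "dc.subject": item["tags"],
    }


records = [map_to_profile(i) for i in drop_duplicates(extract_items(SAMPLE_RSS))]
for record in records:
    print(record["dc.title"], "->", record["dc.identifier.uri"])
```

Passing the URLs already held in the collection as `known_urls` would let the same check skip resources harvested on an earlier run.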

Outcomes

increasing efficiency for information managers

freeing information managers to focus on higher-end work, e.g. subject and user-level metadata

adding user suggestions to the collection

widening the range of resources being captured and evaluated

Faceted search

use metadata to help solve issue for stakeholder - cost of educational copying

harvest rights/licence metadata

make this meaningful to educators ‘what can I do with this resource?’

preference openly licensed content
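A rough sketch of how rights metadata could drive a facet and answer "what can I do with this resource?" follows. The licence codes, the plain-language summaries, the choice of which licences count as open and the sample records are all illustrative assumptions, not the edna implementation.

```python
from collections import Counter

# Hypothetical licence codes mapped to plain-language statements for educators.
LICENCE_SUMMARIES = {
    "cc-by": "Copy, adapt and share, with attribution",
    "cc-by-nc-nd": "Copy and share unchanged for non-commercial use, with attribution",
    "all-rights-reserved": "Check copyright terms before copying",
}

# Assumption: only this licence is treated as open for ranking purposes.
OPEN_LICENCES = {"cc-by"}

# Made-up records with harvested rights and user-level metadata.
RESOURCES = [
    {"title": "Poetry unit", "licence": "all-rights-reserved", "user_level": "secondary"},
    {"title": "Fractions activity", "licence": "cc-by", "user_level": "primary"},
    {"title": "Water cycle video", "licence": "cc-by-nc-nd", "user_level": "primary"},
    {"title": "Algebra worksheet", "licence": "cc-by", "user_level": "secondary"},
]


def facet_counts(resources, field):
    """Count how many resources fall under each value of a facet field."""
    return Counter(r[field] for r in resources)


def search(resources, **filters):
    """Filter on exact facet values, then list openly licensed content first."""
    hits = [r for r in resources if all(r.get(k) == v for k, v in filters.items())]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(hits, key=lambda r: r["licence"] not in OPEN_LICENCES)


print(facet_counts(RESOURCES, "licence"))
for hit in search(RESOURCES, user_level="primary"):
    print(hit["title"], "-", LICENCE_SUMMARIES[hit["licence"]])
```

Sorting open licences to the top, rather than hiding everything else, keeps restricted resources discoverable while still giving preference to content that can be copied freely.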

AI proof of concept

Flinders University Artificial Intelligence and Knowledge Laboratory and Education.au 2008-09

partial automation of categorisation and annotation of web pages

Elements of the project

text analysis

automatic classification into an edna category

suggestion of categories from controlled vocabulary

classification data capture tool

Findings

35% accuracy for mapping category from title alone, 60% accuracy using WordNet-based semantic relatedness

confirmation of the need for a human eye/expertise

classification information may be contained in images/style/tone not text
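To make the idea of mapping a title to a category concrete, here is a deliberately simple sketch. It uses plain keyword overlap as a stand-in; it is not the WordNet-based semantic relatedness measure used in the project, and the category names and keyword lists are hypothetical rather than the edna controlled vocabulary.

```python
import re

# Hypothetical categories and keywords, not the edna controlled vocabulary.
CATEGORY_KEYWORDS = {
    "Mathematics": {"fractions", "algebra", "geometry", "numeracy"},
    "Science": {"water", "cycle", "energy", "plants"},
    "English": {"poetry", "grammar", "reading", "writing"},
}


def tokens(text):
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_categories(title, min_score=1):
    """Rank categories by how many of their keywords appear in the title.

    An empty list means the title alone gave no signal, so the page
    would be left for a human cataloguer to classify.
    """
    words = tokens(title)
    scored = [(len(words & keywords), category)
              for category, keywords in CATEGORY_KEYWORDS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [category for score, category in ranked if score >= min_score]


print(suggest_categories("The water cycle explained"))
print(suggest_categories("Reading about plants"))
print(suggest_categories("A portrait of the artist"))
```

Returning a ranked list of suggestions rather than a single answer fits the finding that a human eye is still needed: the tool proposes, the cataloguer confirms.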

Conclusions

consider new approaches and keep pace with developments, cultural and technical

find opportunities to involve users in discovery, evaluation and description of content

continue to explore smart tools to help build and manage collections

Questions, feedback

Pru Mitchell

pru.mitchell@esa.edu.au

Sarah Hayman

sarah.hayman@esa.edu.au
