Attribution and impact for social science data

ODIN conference, Cologne, October 2013. Louise Corti, Collections Development and Producer Support



Overview

• Introducing the UK Data Service

• Our data portfolio and users

• Citation, impact measurement and DOIs

• Challenges for social science citation

The UK Data Archive

• Based at the University of Essex, since 1967

• 45 years of selecting, ingesting, curating and providing access to social science data

• designated as Place of Deposit by The National Archives

• Data and data support services for higher and further education for research, teaching and learning

• Recently attained the highest information security standard, ISO 27001

University of Essex

The Archive

SISTER DATA ARCHIVES

Council of European Social Science Data Archives (CESSDA)

ADA Australian Social Science Data Archive

ICPSR (USA) Inter-University Consortium for Political and Social Research

What is the UK Data Service?

• Comprehensive data resource funded by the UK Economic and Social Research Council

• Single virtual point of access to a wide range of secondary data for social science research (Directed from Essex)

• Offer promotion, support, training and guidance

What does the UK Data Service do?

• Put together a collection of the most valuable data

• Preserve data for the long term for future research purposes

• Make the data and documentation available for reuse

• Provide data management advice for data creators

• Provide training and support for users of the service

• Bring together owners, producers and users

• Demonstrate impact through evidence of usage

• Easy access through website - ukdataservice.ac.uk

Who is our service for?

• Data for secondary analysis, research, policy making

• Teaching and learning

• Academic researchers and students

• Government analysts

• Charities and foundations

• Business consultants

• Independent research centres

• Think tanks

Our data portfolio

• Over 6,000 datasets in the collection

• 230 new datasets added each year

• Official agencies - mainly central government

• International statistical time series

• Individual academics' research grants

• Market research agencies

• Public records/historical sources

• Access to international data via links with other data archives worldwide

UK survey series

• High quality repeated cross-sectional surveys

• Individual or household level data

• Cover many topics including health, work, crime, social attitudes, family expenditure, living costs, housing etc.

• Labour Force Survey

• British Crime Survey

• Health Survey for England

• British Social Attitudes

• Annual Population Survey

….

Cross-national surveys and macro databanks

• Eurobarometers

• European Social Survey

• European Values Survey

• International Social Survey Programme

• Time series data aggregated to country/region

• International governmental organisations (IMF, OECD, IEA, World Bank)

Longitudinal studies

• British Household Panel Survey and Understanding Society

• Understanding Society (2009-)

• English Longitudinal Study of Ageing

• Families and Children Study

• Growing Up in Scotland

• Longitudinal Study of Young People in England

UK census data

• 1971-2011 census data

• Baseline for other statistics

• Detailed combinations of characteristics

• Small geographies

• Census outputs

• Aggregate data

• Boundary data

• Flow data

• Microdata

Business data

• Collected through a wide range of surveys and administrative sources:

• productivity, innovation, workforce skills, earnings

• international trade, foreign direct investment

• research and development

• business demography

• industrial relations

Qualitative data

• Interviews, focus groups

• Essays, diaries, open-ended survey questions

• Observations, case notes etc.

• Family Life and Work Experience before 1918, Middle and Upper Class Families in the Early 20th Century, 1870-1977

• Gender Difference, Anxiety and the Fear of Crime, 1995

• Mothers Alone: Poverty and the Fatherless Family, 1955-1966

Usage of data

• Operate a spectrum of access

• Web download under End User Licence

• Permission only via Special Licence access

• ‘Approved researcher’ access via remote secure access

• End User Licence includes:

• Appropriate data usage

• Full citation of data and informing us of re-use

• Have always provided a citation format

• Over 22,000 registered users

• Approximately 60,000 downloads worldwide p.a.

• 3,000+ user support queries

Evidence of access and re-use

User access information

• Collect user information and ‘projects’ upon registration

• Collate data and documentation download statistics

• Users can share project information for others to see

• Report data access stats on demand

Usage information

• Email all users every 6 months after registration about activity

• Manually add references to all reported research outputs to the data record

• Reporting rate of publications is poor!

• Prior to DOIs, we scanned the citation literature for dataset mentions – very manual and unreliable, and the data were poorly cited

Impactful case studies of use

• Identify and seek out case studies of re-use: research or teaching.

• Very successful!

• 125 case studies in our database

• Can help provide impact stories for data owners/producers and users

• And can inspire others!

• Some are harvested by ESRC for their website

• Often include ongoing work – no need to wait for publications

Our persistent identifiers approach

• Our data collections are not digital objects

• Need to capture changes made to data

• Versioning data in a commonly understood manner

• Needed a rule-based definition of a ‘significant’ change

• Integrate processes with digital preservation activities & workflows

• In 2011 we assigned DataCite DOIs for all of our collections

• Mint and update DOIs with our metadata management infrastructure

Recording significant change

• Approx. 15% of UKDA data collections are altered within the first year after first publication

• We have distinguished between major and minor changes to a data collection = high impact vs. low impact

• DOI allocated to a metadata instance of a data collection

• DOIs resolve to jump page pointing to all external instances

• New DOI = High Impact change, with explicit logging

• Provided access only to most up-to-date version of data

Major changes – high impact

• New variable added

• New labels/value codes added

• Weighting variables reconstructed

• Wrong data supplied (e.g., March not April)

• Mis-coded data (e.g., Don’t know/Refused confused)

• Change in format (file migration)

• Significant changes in documentation

• Change in access conditions
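
The rule that a high-impact change triggers a new DOI can be sketched in a few lines of Python. This is a minimal illustration, not the Archive's actual implementation: the change-type labels, the `classify` and `needs_new_doi` functions, and the choice to treat every unlisted change as low impact are all assumptions made for the example.

```python
from enum import Enum


class Impact(Enum):
    HIGH = "high"
    LOW = "low"


# Hypothetical labels for the major (high impact) changes listed on the slide.
HIGH_IMPACT_CHANGES = {
    "new_variable",
    "new_labels_or_value_codes",
    "weighting_variables_reconstructed",
    "wrong_data_supplied",
    "miscoded_data",
    "format_change",
    "significant_documentation_change",
    "access_conditions_change",
}


def classify(change_types):
    """Return HIGH if any change is on the major list, otherwise LOW.

    Assumption: anything not on the list counts as a minor change.
    """
    if any(change in HIGH_IMPACT_CHANGES for change in change_types):
        return Impact.HIGH
    return Impact.LOW


def needs_new_doi(change_types):
    """A new DOI is minted only for a high-impact change."""
    return classify(change_types) is Impact.HIGH
```

For instance, `needs_new_doi(["new_variable"])` is true, while a batch containing only an unlisted label such as `"catalogue_typo_fix"` (a made-up example) would keep the existing DOI.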

Raising awareness in the social sciences

• ESRC funding for short-term project on citation

• Advocacy for best practice in citing research data

• Audiences

• Professional organisations

• Academic publishers and journal editors

• Researchers and postgraduates

• Key activities

• Data citation principles for social sciences

• Personal communications

• Events with BL DataCite, JISC and wider PI community

• Outreach through Doctoral Training Centres


Demonstrating impact with citation

• Assuming better use of DOIs…

• Starting to search for use of our DOIs – Google

• Automate this process and compile reports; promote

• Gather data citation statistics from Thomson Reuters Data Citation Index. One of the early 20 feeder repositories, but our own access limited!

• Work with BL DataCite and ODIN to gain connectivity between identifiers & outputs – early adopters
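
Automating the search for DOI mentions could look roughly like the sketch below, which scans plain text for DOI-shaped strings and counts how many documents mention each one. The prefix `10.1234/EXAMPLE-` is a placeholder rather than a real prefix, the function names are hypothetical, and the regular expression is a common approximation of DOI syntax, not a full validator.

```python
import re
from collections import Counter

# Placeholder prefix for illustration only; substitute the repository's own.
ASSUMED_PREFIX = "10.1234/EXAMPLE-"

# Approximate DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_dois(text, prefix=ASSUMED_PREFIX):
    """Return DOIs in the text that start with the given prefix."""
    found = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;:)")  # drop trailing sentence punctuation
        if doi.startswith(prefix):
            found.append(doi)
    return found


def count_mentions(documents, prefix=ASSUMED_PREFIX):
    """Count how many documents mention each DOI (once per document)."""
    counts = Counter()
    for text in documents:
        counts.update(set(find_dois(text, prefix)))
    return counts
```

Counting each DOI once per document keeps a paper that repeats the same citation from inflating the totals; a report could then be compiled from the resulting `Counter`.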

CHALLENGES FOR THE FUTURE

• Citing parts (fragments) of data collections

• single files

• subsets of quantitative data

• extracts of textual data

• ESRC project Digital Futures will enable extract level citation within a web-based browsing system

• Using rich highly structured XML metadata

• GUIDs for everything
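
One way to give every citable fragment its own GUID is to derive it deterministically from a locator, so the same extract always receives the same identifier. The sketch below is only an illustration of that idea: the slides do not describe how Digital Futures generates its identifiers, and the namespace URL, the locator format and the `fragment_guid` function are all assumptions.

```python
import uuid

# Hypothetical namespace; any stable URL controlled by the repository would do.
FRAGMENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/fragments")


def fragment_guid(collection_id, file_name, start, end):
    """Derive a repeatable GUID for an extract of a file.

    The locator format (collection/file#start-end) is an assumption.
    """
    locator = f"{collection_id}/{file_name}#{start}-{end}"
    return uuid.uuid5(FRAGMENT_NAMESPACE, locator)
```

Because `uuid.uuid5` hashes its input, calling `fragment_guid` twice with the same arguments yields the same GUID, while a different character range or file yields a different one. Random identifiers (`uuid.uuid4`) stored in a lookup table would be an equally plausible design.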

UK Quali Bank

Resolving citation objects

CONTACT

UK Data Service

University of Essex

Wivenhoe Park

Colchester

Essex CO4 3SQ

T +44 (0)1206 872001

E [email protected]