BL Doctoral Open Days Feb 2012 - Social Science Data and Digital Resources
Social Science Data and Digital Resources
February 2012
John Kaye – Lead Curator Digital Social Science
http://www.slideshare.net/johnkayebl
Overview
• British Library Datasets Strategy
• ESDS
• Census Resources
• Spatial Data
• Open Data
• UK Web Archive
• Other Data and Resources
• Tools, Software and Visualisation
• Identifying, Citing and Sharing Data
What is a dataset?
Seismic measurements taken by a geologist.
Genetic data collected by a medical researcher.
A survey of public opinions collected by a sociologist.
The Foundation for Research
Data is a crucial component of the scholarly record.
Re-acquisition may be impossible.
Datasets are essential to the British Library’s mission to advance the world’s knowledge.
The British Library Datasets Strategy
We envision a future where researchers can:
Discover, access, reuse, and reference datasets.
Track the impact of the data that they generate and receive appropriate credit.
Our approach is to:
Provide a focus for the community to establish needs, requirements and agreement.
Explore novel technology and creative solutions.
Datasets in Explore The British Library
Explore The British Library (Portals)
Economic and Social Data Service - (ESDS) www.esds.ac.uk
Data search and download
Research method guides
Thematic guides
Online analysis
Secure Data Service http://securedata.data-archive.ac.uk/
UK Data Service
Economic and Social Data Service - (ESDS) www.esds.ac.uk
ESDS Government: large-scale government surveys, such as the Labour Force Survey and the General Household Survey
ESDS International: multi-nation databanks, such as the World Bank's World Development Indicators, and survey data including Eurobarometer
ESDS Longitudinal: major UK surveys following individuals over time, such as the British Household Panel Survey
ESDS Qualidata: a range of multimedia qualitative data sources
2011 Census
Data available on www.ons.gov.uk - latest release is output area key statistics
Academic releases will be made available through the Census Dissemination Unit's InFuse tool http://cdu.mimas.ac.uk/
Lots of information and support at http://census.ac.uk/
Experian Geodemographic Data http://cdu.mimas.ac.uk/experian/index.htm
Previous Censuses
Data available for 1981-2011 on http://www.nomisweb.co.uk/
Academic data releases from 1971 to 2001 on Casweb (also contains geographic boundary data) http://casweb.mimas.ac.uk/
Histpop – the Online Historical Population Reports (OHPR) website provides online access to population reports for Britain and Ireland from 1801 to 1937 http://www.histpop.org/
Look at changes between census questions, structures and geographies
BL Official Publications Collection – Census Reports
UK Census Reports: the BL holds statistical reports relating to each census
Reports for 1921-1991 are in the reading room on open shelves
National and county aggregate reports for England and Wales, Scotland, Northern Ireland and Great Britain
Aggregate statistical information at each level for all census questions
Complements Histpop, which has digitised reports from 1801 to 1937, and Casweb, which covers 1971 to 2001
Some older reports can be found in parliamentary papers
[Chart: Number of Households Lacking or Sharing Amenities (England and Wales). Scale: 0 to 3,500,000 households. Categories: 1951 (lack or share flushing toilet); 1961 (lack or share flushing toilet); 1971 (lack or share inside toilet); 1981 (lack or share inside toilet); 1991 (lack or share inside toilet and/or bath or shower); 2001 (without sole use of toilet and/or bath or shower)]
[Chart: 1851 Population Pyramid. Horizontal axis: Number of People, 0 to 1,500,000 on each side (Male, Female). Vertical axis: Age Group, in five-year bands from 0-4 to 85+]
Maps
The Library holds a number of maps generated with census and population data from the UK and around the world
Augustus Petermann, Map of the British Isles, elucidating the distribution of the population based on the 1841 census. London, 1861.
Ireland map for railways
Spatial Data
Edina Digimap and UK Borders
http://edina.ac.uk/digimap/
http://edina.ac.uk/ukborders/
Go Geo! Search
http://www.gogeo.ac.uk/cgi-bin/index.cgi
Ordnance Survey OpenData
http://www.ordnancesurvey.co.uk/oswebsite/products/os-opendata.html
Landmap
http://landmap.mimas.ac.uk/
UK Government Open Data
http://data.gov.uk/
Admin and Statistical data portal
Office for National Statistics
http://www.statistics.gov.uk/default.asp
http://www.neighbourhood.statistics.gov.uk/dissemination/
https://www.nomisweb.co.uk/Default.asp
National Digital Archive of Datasets
http://www.ndad.nationalarchives.gov.uk/
Regional
http://data.london.gov.uk/
http://datagm.org.uk/
International open data
United Nations
http://data.un.org/
European Union
http://epp.eurostat.ec.europa.eu/portal/page/portal/eurostat/home/
OECD
http://www.oecd.org/statsportal/
World Bank
http://data.worldbank.org/
IMF
http://www.imf.org/external/data.htm
Public Data EU
http://publicdata.eu/
UK Web Archive http://www.webarchive.org.uk
Selective web archive: over 11,000 websites collected since 2004
Over 50,000 instances
Over 16TB of compressed data
British Library, National Library of Wales, JISC
Also National Library of Scotland, the National Archives, Wellcome Library
Many collaborators, e.g. Women’s Library, Live Arts Development Agency, Quakers in Britain
Collect, preserve, and make accessible web sites of cultural and scholarly importance from the UK domain
A typical event-based special collection
A comprehensive special collection
JISC UK Web Domain Dataset (1996-2010)
Funded by JISC to create a research collection of UK websites
Collaboration between the Internet Archive, JISC and the British Library
Copy of subset of the Internet Archive’s web collection that relates to the UK
470,466 files, mostly arc.gz, with 4,494 warc.gz. Total size: 32TB
No local access – possible through the Internet Archive
Can be used to generate secondary datasets and make these available
Analytical access is the main route
Other Data and Resources
Arts and Humanities Data Service (AHDS)
http://ahds.ac.uk
Guardian Data Store
http://www.guardian.co.uk/data-store
Financial Times
http://www.ft.com/home/uk
Economist Intelligence Unit
http://www.eiu.com/Default.aspx
UK Government Web Archive
http://www.nationalarchives.gov.uk/webarchive/
The Mass Observation Archive: specialises in material about everyday life in Britain. It contains papers generated by the original Mass Observation social research organisation (1937 to early 1950s), and newer material collected continuously since 1981.
http://www.massobs.org.uk/index.htm
A Vision of Britain through Time: contains historical maps, census reports, election reports and other historical material, searchable by local area.
http://www.visionofbritain.org.uk/
Charles Booth Online Archive: gives access to archive material from the Booth collections of the London School of Economics and Political Science and the Senate House Library.
http://booth.lse.ac.uk/
Images from The Mass Observation Archive
Analysis Tools and Software
Statistical - SPSS, Stata, R (open source)
GIS - ArcGIS, MapInfo, Quantum GIS (open source)
Excel
Online Tools
Examples of Online Analysis Tools
ESDS NESSTAR
http://nesstar.esds.ac.uk
ESDS Spatial Tools
http://www.ccsr.ac.uk/esds/gis/
Economists Online Dataverse
http://dvn.iq.harvard.edu/dvn/dv/NEEO
United Nations
http://data.un.org/Explorer.aspx
London Profiler
http://www.londonprofiler.org/
London Heat Map
http://www.londonheatmap.org.uk/Mapping/
Online Mapping Tools using Google Maps
MapTube
http://www.maptube.org/
Google Drive
https://drive.google.com
Gmap Creator
http://www.casa.ucl.ac.uk/software/gmapcreator.asp
Other, more advanced online mapping (requires coding):
Open Layers http://openlayers.org/
OS Openspace http://www.ordnancesurvey.co.uk/oswebsite/web-services/os-openspace/index.html
Data Visualization
Presenting data in a useful and interesting manner
Allowing concepts to be easily understood
Lots of examples online, e.g.:
http://flowingdata.com/
http://datavisualization.ch/
http://www.guardian.co.uk/news/datablog
Citing Data
DataCite
DataCite is an international consortium which aims to:
Establish easier access to research data on the Internet
Increase acceptance of research data as legitimate, citable contributions to the scholarly record
Support data archiving that will permit results to be verified and re-purposed for future study
http://datacite.org/
Digital Object Identifiers (DOIs) offer a solution
Most widely used identifier for scientific articles
Researchers, authors, publishers know how to use them
Put datasets on the same playing field as articles
Connecting an Article with the Underlying Data
Dataset: Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840
URLs are not persistent
(e.g. Wren JD: URL decay in MEDLINE: a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5).
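As a rough illustration of why a DOI is easier to cite than a bare URL, the sketch below tidies a DOI string, turns it into a link through the public https://doi.org/ resolver, and assembles a citation in the same order as the PANGAEA example. The function names and the loose pattern check are assumptions made for this example only; they are not part of any DataCite or British Library tool, and real DOIs can contain characters that would need URL-encoding.

```python
import re

# Loose shape check only (an assumption for this sketch, not the full DOI syntax):
# "10.", a registrant code, "/", then a non-empty suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste in front of a DOI.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Return the bare DOI, with surrounding whitespace and any known prefix removed."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not DOI_SHAPE.match(doi):
        raise ValueError(f"does not look like a DOI: {raw!r}")
    return doi


def doi_to_url(raw):
    """Build a resolver link; the DOI stays the same even if the landing page moves."""
    return "https://doi.org/" + normalise_doi(raw)


def format_data_citation(creators, year, title, publisher, doi):
    """Assemble 'Creators (Year). Title. Publisher. doi:...' as in the slide's example."""
    return f"{creators} ({year}). {title}. {publisher}. doi:{normalise_doi(doi)}"


if __name__ == "__main__":
    print(doi_to_url("doi:10.1594/PANGAEA.587840"))
    print(format_data_citation(
        "Yancheva et al", 2007, "Analyses on sediment of Lake Maar",
        "PANGAEA", "10.1594/PANGAEA.587840",
    ))
```

The citation keeps the DOI rather than the landing-page URL, so the reference still resolves if the data centre reorganises its website.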
Metadata Search http://search.datacite.org/ui
Open Researcher and Contributor ID (ORCID) http://about.orcid.org/
• Infrastructure is being created for researchers to build up an open portfolio of research objects
• Register for an ORCID iD at www.orcid.org and link published papers using ORCID’s tools
Sharing Data - Figshare
• Unpublished outputs (working papers, datasets) can be deposited in figshare (http://figshare.com/), given a DataCite DOI, then linked back to and added to an ORCID profile
• ODIN (the ORCID and DataCite Interoperability Network) wants to expand on this principle and engage with data centres and institutional repositories to allow easier, more open discovery of non-traditional research outputs.
Impact of Data
• View the impact of your work using traditional citation metrics and social citations: http://www.impactstory.org/
Depositing and Archiving Data
Why Archive?
Institutional Repositories
UK Data Archive/ESDS
Metadata and Code!
BL Social Science Research Blog
http://britishlibrary.typepad.co.uk/socialscience/
Contact Details
John Kaye
Lead Curator – Digital Social Science
Social Sciences
The British Library
96 Euston Road
London NW1 2DB
Telephone: 020 7412 7450
Email: [email protected]
Twitter: @johnkayebl
http://britishlibrary.typepad.co.uk/socialscience/
Slides - http://www.slideshare.net/johnkayebl