Social Science Datasets and Digital Resources
http://www.slideshare.net/johnkayebl
www.bl.uk 2
Overview
• British Library Datasets Strategy
• UK Data Service
• Census Resources
• Spatial Data
• Open Data
• UK Web Archive
• Other Data and Resources
• Tools, Software and Visualisation
• Identifying, Citing and Sharing Data
What is a dataset?
Seismic measurements taken by a geologist.
Genetic data collected by a medical researcher.
A survey of public opinions collected by a sociologist.
A collection of tweets about events.
The Foundation for Research
Data is a crucial component of the scholarly record.
Re-acquisition may be impossible.
Datasets are essential to the British Library’s mission to advance the world’s knowledge.
The British Library Datasets Strategy
We envision a future where researchers can:
Discover, access, reuse, and reference datasets.
Track the impact of the data that they generate and receive appropriate credit.
Our approach is to:
Provide a focus for the community to establish needs, requirements and agreement.
Explore novel technology and creative solutions.
Datasets in Explore The British Library
Explore The British Library (Portals)
UK Data Service http://ukdataservice.ac.uk/
Data search and download
Research method guides
Thematic guides
Online analysis
Secure Data Service http://securedata.data-archive.ac.uk/
Administrative Data Service
UK Data Service
Government – large-scale government surveys, such as the Labour Force Survey and the General Household Survey
International – multi-nation databanks, such as the World Bank's World Development Indicators, and survey data including Eurobarometer
Longitudinal – major UK surveys following individuals over time, such as the British Household Panel Survey and Birth Cohort Studies
Qualidata – a range of multimedia qualitative data sources; new portal (UK Quali Bank) to be launched Dec 2013
2011 Census
Data available on www.ons.gov.uk - latest release is output area key statistics
Academic releases 1971 - 2011 are made available via http://census.ukdataservice.ac.uk/
Experian Geodemographic Data http://cdu.mimas.ac.uk/experian/index.htm
Previous Censuses
Data available for 1981-2011 on http://www.nomisweb.co.uk/
Academic data release from 1971 to 2011 on Casweb (also contains geographic boundary data) http://census.ukdataservice.ac.uk/
Histpop – the Online Historical Population Reports (OHPR) collection provides online access to population reports for Britain and Ireland from 1801 to 1937 http://www.histpop.org/
Look at changes between census questions, structures and geographies
BL Official Publications Collection – Census Reports
UK Census Reports
– BL holds statistical reports relating to each census.
– Reports for 1921-1991 in the reading room on open shelves
– National and county aggregate reports for England and Wales, Scotland, Northern Ireland and Great Britain
– Aggregate statistical information at each level for all census questions
– Complements Histpop, which has digitised reports between 1801 and 1937, and Casweb: 1971 – 2001
– Some older reports can be found in parliamentary papers
[Chart: Number of Households Lacking or Sharing Amenities (England and Wales). Vertical axis: 0 to 3,500,000 households. Categories: 1951 (lack or share flushing toilet); 1961 (lack or share flushing toilet); 1971 (lack or share inside toilet); 1981 (lack or share inside toilet); 1991 (lack or share inside toilet and/or bath or shower); 2001 (without sole use of toilet and/or bath or shower).]
[Chart: 1851 Population Pyramid. Horizontal axis: Number of People (0 to 1,500,000 on each side, Male and Female). Vertical axis: Age Group, in five-year bands from 0-4 to 85+.]
Maps
The library holds a number of maps generated with census and population data from the UK and all over the world
Augustus Petermann, Map of the British Isles, elucidating the distribution of the population based on the 1841 census. London, 1861.
Ireland map for railways
Spatial Data
Edina Digimap and UK Borders
http://edina.ac.uk/digimap/
http://edina.ac.uk/ukborders/
Go Geo! Search
http://www.gogeo.ac.uk/cgi-bin/index.cgi
Spatial Data
Ordnance Survey Open Data
http://www.ordnancesurvey.co.uk/oswebsite/products/os-opendata.html
Landmap
http://landmap.mimas.ac.uk/
UK Government Open Data
• http://data.gov.uk/
• Admin and Statistical data portal
• Office for National Statistics
• http://www.statistics.gov.uk/default.asp
• http://www.neighbourhood.statistics.gov.uk/dissemination/
• https://www.nomisweb.co.uk/Default.asp
• National Digital Archive of Datasets
• http://www.ndad.nationalarchives.gov.uk/
• Regional
• http://data.london.gov.uk/
• http://datagm.org.uk/
International open data
• United Nations
• http://data.un.org/
• European Union
• http://epp.eurostat.ec.europa.eu/portal/page/portal/eurostat/home/
• OECD
• http://www.oecd.org/statsportal/
• World Bank
• http://data.worldbank.org/
• IMF
• http://www.imf.org/external/data.htm
• Public Data EU
• http://publicdata.eu/
UK Web Archive http://www.webarchive.org.uk
Selective Web Archive
Over 11,000 websites collected since 2004
Over 50,000 instances
Over 16TB of compressed data
British Library, National Library of Wales, JISC
Also National Library of Scotland, the National Archives, Wellcome Library
Many collaborators, e.g. Women’s Library, Live Arts Development Agency, Quakers in Britain
UK Web Archive: The lost web
UK Web Archive - event-based special collections
Collect, preserve, and make accessible web sites of cultural and scholarly importance from the UK domain
JISC UK Web Domain Dataset (1996-2010)
Funded by JISC to create a research collection of UK websites
Collaboration between the Internet Archive, JISC and the British Library
Copy of subset of the Internet Archive’s web collection that relates to the UK
470466 files, mostly arc.gz, with 4494 warc.gz. Total size: 32TB
No local access – possible through the Internet Archive
Can be used to generate secondary datasets and make these available
Analytical access the main route
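As a rough illustration of what generating a secondary dataset could look like, the sketch below counts the distinct hosts captured in each year. It assumes, hypothetically, that capture records have already been extracted as (timestamp, URL) pairs with 14-digit timestamps. The function name and the sample records are illustrative only and are not part of the JISC dataset or of any Internet Archive tool.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def hosts_per_year(captures):
    """Count distinct hosts captured in each year.

    `captures` is an iterable of (timestamp, url) pairs, where timestamp is a
    14-digit string such as '20040115123000' (an assumed input format).
    """
    hosts = defaultdict(set)
    for timestamp, url in captures:
        host = urlsplit(url).hostname
        if host is None:
            continue  # skip records without a parseable host
        hosts[int(timestamp[:4])].add(host)
    return {year: len(found) for year, found in sorted(hosts.items())}


# Made-up sample records, for illustration only.
sample = [
    ("19961201000000", "http://www.example.co.uk/"),
    ("19961215000000", "http://www.example.co.uk/about.html"),
    ("20100301000000", "http://www.example.org.uk/"),
    ("20100302000000", "http://data.example.ac.uk/index.html"),
]
print(hosts_per_year(sample))
```

A small summary table like this is far easier to share than the 32TB of source files, which is the appeal of the analytical access route.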
Other Data and Resources
• Guardian Data Store
• http://www.guardian.co.uk/data-store
• Financial Times
• http://www.ft.com/home/uk
• Economist Intelligence Unit
• http://www.eiu.com/Default.aspx
• UK Government Web Archive
• http://www.nationalarchives.gov.uk/webarchive/
Other Data and Resources
The Mass Observation Archive
– Specialises in material about everyday life in Britain. It contains papers generated by the original Mass Observation social research organisation (1937 to early 1950s), and newer material collected continuously since 1981
– http://www.massobs.org.uk/index.htm
A Vision of Britain through Time
– Contains historical maps, census reports, election reports and other historical material, searchable by local area.
– http://www.visionofbritain.org.uk/
Charles Booth Online Archive
– Gives access to archive material from the Booth collections of the London School of Economics and Political Science and the Senate House Library
– http://booth.lse.ac.uk/
Images from The Mass Observation Archive
Analysis Tools and Software
Statistical - SPSS, STATA, R (open source)
GIS - ArcGIS, MapInfo, Quantum GIS (open source)
Excel
Online Tools
Examples of Online Analysis Tools
• UK Data Service NESSTAR
• http://nesstar.esds.ac.uk
• ESDS Spatial Tools
• http://www.ccsr.ac.uk/esds/gis/
• Economists Online Dataverse
• http://dvn.iq.harvard.edu/dvn/dv/NEEO
• United Nations
• http://data.un.org/Explorer.aspx
• London Profiler
• http://www.londonprofiler.org/
• London Heat Map
• http://www.londonheatmap.org.uk/Mapping/
Online Mapping Tools using Google Maps
• MapTube
• http://www.maptube.org/
• Google Drive, KML and Google Earth
• https://drive.google.com
• Gmap Creator
• http://www.casa.ucl.ac.uk/software/gmapcreator.asp
• Other, more advanced online mapping (requires coding):
• Open Layers http://openlayers.org/
• OS Openspace http://www.ordnancesurvey.co.uk/oswebsite/web-services/os-openspace/index.html
Large and Big Data
Traditional tools don’t work!
University resources?
Cloud services (Amazon or other)
Coding languages
http://www.dominoup.com/
Data Visualization
Presenting data in a useful and interesting manner
Allowing concepts to be easily understood
Lots of examples online, e.g.:
• http://flowingdata.com/
• http://datavisualization.ch/
• http://www.guardian.co.uk/news/datablog
Current Data Citation
DataCite
DataCite is an international consortium which aims to:
Establish easier access to research data on the Internet
Increase acceptance of research data as legitimate, citable contributions to the scholarly record
Support data archiving that will permit results to be verified and re-purposed for future study
http://datacite.org/
Digital Object Identifiers (DOIs) offer a solution
Most widely used identifier for scientific articles
Researchers, authors, publishers know how to use them
Put datasets on the same playing field as articles
Connecting an Article with the Underlying Data
Dataset: Yancheva et al. (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840
URLs are not persistent
(e.g. Wren JD: URL decay in MEDLINE- a 4-year follow-up study. Bioinformatics. 2008, Jun 1;24(11):1381-5).
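To make the DOI point concrete, here is a minimal sketch of turning a dataset DOI into a resolver link and a citation string. It assumes the general doi.org resolver and a "Creator (Year). Title. Publisher. Identifier" layout like the PANGAEA example above. The function names and the validation pattern are illustrative assumptions, not part of any DataCite tool.

```python
import re
from urllib.parse import quote

# Loose pattern for the common "10.<registrant>/<suffix>" shape (an assumption,
# not an official validation rule).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(raw):
    """Strip a leading 'doi:' or resolver URL and check the basic DOI shape."""
    doi = raw.strip()
    for prefix in PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def doi_to_url(raw):
    """Build a persistent link that resolves through doi.org."""
    return "https://doi.org/" + quote(normalise_doi(raw), safe="/")


def format_citation(creators, year, title, publisher, doi):
    """Format a dataset citation in the style of the example on this slide."""
    return f"{creators} ({year}). {title}. {publisher}. doi:{normalise_doi(doi)}"


print(doi_to_url("doi:10.1594/PANGAEA.587840"))
print(format_citation("Yancheva et al.", 2007,
                      "Analyses on sediment of Lake Maar",
                      "PANGAEA", "10.1594/PANGAEA.587840"))
```

The link stays valid even if the data centre reorganises its website, because the resolver is updated rather than the citation.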
Open Researcher and Contributor ID (ORCID) http://about.orcid.org/
• Infrastructure is being created for researchers to build up an open portfolio of research objects
Open Researcher and Contributor ID (ORCID)
Register an ORCID iD at www.orcid.org and link published papers and data (and anything else) using ORCID’s tools
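An ORCID iD is a 16-character identifier whose final character is a check character calculated with ISO 7064 MOD 11-2. The sketch below shows how a spreadsheet or script of collected iDs might be sanity-checked before linking outputs. It assumes iDs are supplied in the usual hyphenated form, and the function names are illustrative rather than part of ORCID's own tools.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the iD has 16 characters and a matching check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False (wrong check character)
```

A check like this only catches typing errors. It does not confirm that the iD is registered or belongs to a particular researcher.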
Sharing Data - Figshare
Non-published outputs (working papers, datasets) can be deposited in figshare (http://figshare.com/), given a DataCite DOI, then linked back to and added to an ORCID profile
• ODIN wants to expand on this principle and engage with data centres and institutional repositories to allow easier, more open discovery of non-traditional research outputs.
Impact of Data
• View the impact of your work using traditional citation metrics and social citations: http://www.impactstory.org/
Depositing and Archiving Data
Why Archive?
Institutional Repositories
UK Data Archive/ESDS
Metadata and Code!
Contact Details
John Kaye
Lead Curator – Digital Social Science
Social Sciences
The British Library
96 Euston Road
London NW1 2DB
Telephone: 020 7412 7450
Email: [email protected]
@johnkayebl
http://britishlibrary.typepad.co.uk/socialscience/
Slides - http://www.slideshare.net/johnkayebl