Big Data and Official Statistics: The UN Global Working Group
http://unstats.un.org/unsd/trade/bigdata/
Ronald Jansen
Chief, Trade Statistics Branch
United Nations Statistics Division
[email protected] and [email protected]
Overview
What is Big Data?
What is Big Data for Official Statistics?
What are the Challenges?
UN Global Working Group on Big Data for Official Statistics
What is Big Data?
Big Data characteristics
o Volume
o Velocity
o Variety
Big Data sources:
o Mobile Devices
o Digital Transactions
o Sensors
o Social Networking
Big Data benefits
o Auto-generated
o Timely
o Granular
Big Data for Development
UN Global Pulse
o Research & Development
o Big Data Partnerships
o Pulse Lab Network: New York, Jakarta, Kampala
Big Data for Development
UN Global Pulse
o Early warning
o Real-time awareness
o Real-time feedback
What is Big Data for Official Statistics?
Big Data sources / Statistics
• Mobile Phone Positioning Data / Tourism statistics
• Supermarket scanner data / Price statistics
• Vehicle tracking device / Transport statistics
• Twitter / Consumer confidence statistics
• Satellite imagery / Agriculture statistics
What are the Challenges?
Big Data Challenges
• Methodology
• Privacy
• IT infrastructure
• Human skills
• Partnerships
Methodological Challenges
Methodology
• Representativeness?
• Volatility (no time series?)
• Standardization?
• Modelling
Big Data partnerships opportunities
Big Data providers
Data Sources
Research institutes and technology providers
- Upstream processing
- IT infrastructure
Academia and scientific communities
- Analytical capacity and modelling
National Statistical Offices
- Standards and methodology
Partnership Challenges
Example of use of Mobile Phone data for Official Statistics
Mobile Phone Positioning Data
Call Detail Record (CDR)
Data Detail Record (DDR) (Internet Protocol Detail Record)
Location Updates (LA)
Radio coverage updates (Abis data)
Example 1: Estonia
Data provider/ source
• Mobile phone operator
• Provides raw data to research company
Research company
• Provides IT infrastructure
• Preparation of data
• Ensuring privacy and confidentiality
Academia
• Providing modeling and analytics
National Statistical Office
• Receives pre-processed and aggregated data
• Prepares and disseminates statistics
Shared responsibility for data processing and analysis
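The hand-off described in this example, where records are prepared and aggregated before the statistical office receives them, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record fields (subscriber_id, region, day), the function name aggregate_visitors, the small-cell suppression rule and its threshold are hypothetical and are not taken from the Estonian setup.

```python
from collections import defaultdict


def aggregate_visitors(records, min_count=5):
    """Count distinct subscribers per (region, day) cell.

    Cells with fewer than `min_count` distinct subscribers are dropped,
    a simple (assumed) confidentiality rule applied before hand-over.
    """
    subscribers = defaultdict(set)
    for rec in records:
        # Repeated events from one subscriber count only once per cell.
        subscribers[(rec["region"], rec["day"])].add(rec["subscriber_id"])
    return {
        cell: len(ids)
        for cell, ids in subscribers.items()
        if len(ids) >= min_count
    }


# Made-up records, for illustration only.
sample = [
    {"subscriber_id": "a", "region": "region_A", "day": "day_1"},
    {"subscriber_id": "a", "region": "region_A", "day": "day_1"},
    {"subscriber_id": "b", "region": "region_A", "day": "day_1"},
    {"subscriber_id": "c", "region": "region_B", "day": "day_1"},
]

if __name__ == "__main__":
    print(aggregate_visitors(sample, min_count=2))
```

In this sketch only the aggregated counts leave the processing step; subscriber identifiers stay with the party that holds the raw data.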
Example 2: Netherlands
Data source/ provider
• Vodafone
• Makes data available to research institute
Research company
• Intermediate commercial partner
• Provides IT platform for data filtering and privacy
National Statistical Office
• Receives pre-processed Big Data
• Analyzes, prepares and disseminates statistics
Responsibility for data
UN Global Working Group on Big Data for Official Statistics
Milestones
Feb 2013: Friday Seminar on Big Data
Mar 2014: Report to Statistical Commission; Decision 45/110 – GWG on Big Data
Oct 2014: Conference on Big Data – Beijing
Mar 2015: Report of GWG to Statistical Commission
Oct 2015: Conference on Big Data – Abu Dhabi
UN Global Working Group on Big Data
Mandate (Decision 45/110 – 2014)
1. Provide strategic vision, direction and coordination of a global programme on Big Data for official statistics;
2. Promote practical use of sources of Big Data for official statistics, while building on the existing precedents in Big Data and finding solutions for
   i. Methodological issues;
   ii. Legal issues of access to data sources;
   iii. Privacy issues;
   iv. Data security issues;
   v. Cost benefit analysis
UN Global Working Group on Big Data
Mandate (continued)
3. Promote capacity building, training and sharing of experience;
4. Foster communication and advocacy of use of Big Data for policy applications;
5. Build public trust in the use of private sector Big Data for official statistics including issues of confidentiality and privacy.
Composition of the GWG
Countries
Australia
Denmark
Italy
Mexico
Netherlands
USA
Bangladesh
China
Colombia
Morocco
Philippines
Tanzania
Egypt
UAE
Oman
International Organizations
ITU
OECD
UN Global Pulse
UNECE
UNESCAP
UNSD
World Bank
Task Teams
1. Advocacy and Communication
2. Big Data and SDG indicators
3. Access and Partnerships
4. Skills, Training, Capacity Building
5. Cross-cutting issues (Quality framework)
6. Mobile Phone Data
7. Satellite Imagery
8. Social Media Data
Strong participation
o Orange
o Telenor
o World Economic Forum
o NASA
o Google
o Data Pop
o World Pop
o Flowminder
o UNU-EHS
o ODI (UK)
o Positium
o UPenn
o Harvard
Advocacy and Communication
1. Advocacy/information materials on Big Data for official statistics, such as brochures, papers, presentations, videos, etc.;
2. Good practices of using big data in official statistics, such as use cases; and
3. Joint materials with other task teams to promote and communicate the outcomes of their work
Big Data and SDG indicators
1. Assessment of the potential use of Big Data for monitoring progress on the 169 SDG targets by conducting a survey, taking account of the draft proposals of indicators,
2. An inventory of research work on Big Data that could be used to estimate one or more SDGs, and
3. Good practices and lessons learned from pilot research in 1-2 countries on estimating 2-3 SDG indicators using Big Data.
Access and Partnerships
1. Assessment of existing good practices of data access
2. A set of principles as conditions for access to Big Data sources
3. Templates for MOUs or services agreements
4. Good practices and examples of partnerships
Training, Skills and Capacity Building
1. Assessment tool for the needs of Big Data skills and for the institutional ‘readiness’ for Big Data skills;
2. Corresponding training curriculum;
3. Technical seminar on Big Data in 2015; and
4. MOU for a global network of training institutions on Big Data
Cross-cutting issues
1. Inventory of methodologies, data analytics / visualisation tools as well as quality assurance frameworks and refined classification of Big Data;
2. Examples and good practices for the repository of Big Data projects;
3. Assessment of Big Data sources for the production of official statistics, determining their impact on the production process and identifying best practices for managing the necessary changes within an NSO
Big Data sources
Lessons learned from one or more pilot projects using mobile phone data / satellite imagery / social media data;
Guidelines on the use of Big Data sources for official statistics, including template of a project plan
Big Data and Classifications
Classification of Type of Big Data
1. Social Networks (human-sourced information)
2. Traditional Business systems (process-mediated data)
3. Internet of Things (machine-generated data)
Classification of Type of Big Data
Social Networks (human-sourced information)
1100. Social Networks: Facebook, Twitter, Tumblr etc.
1200. Blogs and comments
1300. Personal documents
1400. Pictures: Instagram, Flickr, Picasa etc.
1500. Videos: Youtube etc.
1600. Internet searches
1700. Mobile data content: text messages
1800. User-generated maps
1900. E-Mail
Classification of Type of Big Data
Traditional Business systems (process-mediated data)
21. Data produced by Public Agencies
  2110. Medical records
22. Data produced by businesses
  2210. Commercial transactions
  2220. Banking/stock records
  2230. E-commerce
  2240. Credit cards
Classification of Type of Big Data
Internet of Things (machine-generated data)
31. Data from sensors
  311. Fixed sensors
    3111. Home automation
    3112. Weather/pollution sensors
    3113. Traffic sensors/webcam
    3114. Scientific sensors
    3115. Security/surveillance videos/images
  312. Mobile sensors (tracking)
    3121. Mobile phone location
    3122. Cars
    3123. Satellite images
32. Data from computer systems
  3210. Logs
  3220. Web logs
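Because the leading digit of each code identifies the top-level type, the classification lends itself to a simple lookup. The sketch below is one possible encoding, not an official tool: the names TOP_LEVEL, LEAF_CODES, top_level_category and describe are hypothetical, and only a handful of codes from the slides are included.

```python
# Top-level types from the classification, keyed by the leading digit.
TOP_LEVEL = {
    "1": "Social Networks (human-sourced information)",
    "2": "Traditional Business systems (process-mediated data)",
    "3": "Internet of Things (machine-generated data)",
}

# A few codes copied from the slides; deliberately not exhaustive.
LEAF_CODES = {
    "1100": "Social Networks: Facebook, Twitter, Tumblr etc.",
    "1600": "Internet searches",
    "2210": "Commercial transactions",
    "2240": "Credit cards",
    "3121": "Mobile phone location",
    "3123": "Satellite images",
}


def top_level_category(code: str) -> str:
    """Return the top-level type for a classification code."""
    if not code or code[0] not in TOP_LEVEL:
        raise ValueError(f"unknown classification code: {code!r}")
    return TOP_LEVEL[code[0]]


def describe(code: str) -> str:
    """Return 'code label -> top-level type' for display."""
    label = LEAF_CODES.get(code, "(label not listed in this sketch)")
    return f"{code} {label} -> {top_level_category(code)}"


if __name__ == "__main__":
    for code in ("1600", "2210", "3121"):
        print(describe(code))
```

A prefix-based lookup keeps the sketch short; a fuller version could store the intermediate levels (for example 31 and 311) as well.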
Global survey (June-Aug 2015)
Big data are data sources with a high volume, velocity and variety of data, which require new tools and methods to capture, curate, manage, and process them in an efficient way
Part I: Office Management of Big Data
• Big Data strategy
• Big Data strategy – Advocacy and Communication
• Big Data strategy – linking to the SDG indicators
• Access, privacy and confidentiality
Part II: Current status of Big Data projects
• Data sources
• Technologies
• Quality assessment
Part III: Guidelines on Big Data for Official Statistics