
Project funded by the European Union’s Horizon 2020 Research and Innovation Programme (2014 – 2020)

Support Action

Big Data Europe – Empowering Communities with Data Technologies

Deliverable 2.1: Community Building, Coordination and Planning

Dissemination Level: Public
Due Date of Deliverable: Month 3, 31/03/2015
Actual Submission Date: Month 4, 16/04/2015
Work Package: WP2, Community Building & Requirements
Task: T2.1
Type: Report
Approval Status: Approved
Version: V 1.0, Final
Number of Pages: 38
Filename: D2.1_CommunityBuidling-Coordination-and-Planning.pdf

Abstract: This report summarises the methods, materials and actions that will be used to approach relevant stakeholders and to establish the required communication channels with the seven communities. The deliverable includes a list of projects and organisations already contacted, as well as a target list for future contact.

The information in this document reflects only the author’s views and the European Community is not liable for any use that may be made of the information contained therein. The information in this document is provided “as is” without guarantee or warranty of any kind, express or implied, including but not limited to the fitness of the information for a particular purpose. The user thereof uses the information at his/her sole risk and liability.

Project Number: 644564
Start Date of Project: 01/01/2015
Duration: 36 months

D2.1– v 1.0


History

Version | Date | Reason | Revised by
0.0 | 15.03.2015 | Content Provision | WP2 Working Group
0.1 | 20.03.2015 | Content Provision for First Draft | Thomas Thurner (SWC)
0.2 | 23.03.2015 | Content Provision for Final Draft | Dr. Marie-Claire Forgue (ERCIM), Spyros Andronopoulos (NCSR-D), Diamando Vlachogiannis (NCSR-D)
0.3 | 27.03.2015 | Update following Peer Review | Simon Scerri (Fraunhofer), Thomas Thurner (SWC)
0.4 | 31.03.2015 | Final Version | Sören Auer (Fraunhofer)

Author List

Organisation | Name | Contact Information
Fraunhofer | Simon Scerri | [email protected]
SWC | Thomas Thurner | [email protected]
OpenPHACTS | Bryn Williams-Jones | [email protected]
FAO | Valeria Pesce | [email protected]
CRES | Fragiskos Mouzakis | [email protected]
NCSR-D | Diamando Vlachogiannis | [email protected]
NCSR-D | Spyros Andronopoulos | [email protected]
CESSDA | Paul Jackson | [email protected]
SatCen | Sergio Albani | [email protected]


Executive Summary

In this deliverable we outline the Community Building and Stakeholder Engagement strategy adopted in the BigDataEurope project. The strategy targets stakeholders in the seven communities associated with the seven H2020 Societal Challenges. The known characteristics of these ad-hoc communities are described to give some background before outlining the chosen methodology for stakeholder identification and engagement.

Stakeholder identification will be carried out in a number of iterations (the first of which is already complete at the time of this report’s delivery), and will consist of the collection, analysis, mapping and prioritisation of stakeholders in each sector (multi-dimensional). This exercise will yield five different engagement levels (multi-level), each with different targets and expectations of stakeholders. Engagement will be carried out through different means (multi-channel).

The plans for initiating and maintaining multi-dimensional, multi-level and multi-channel stakeholder engagement are then discussed in detail. These include:

● the setup of seven W3C interest groups for shaping the seven communities;
● plans (including a blueprint) for the yearly workshops to be organised per community;
● plans for other synchronous and asynchronous methods for stakeholder engagement.

Also included are some metrics that provide internal targets and directions and that will ultimately be used to determine the level of success attained in this exercise throughout the project’s lifetime.


Abbreviations and Acronyms

SC Societal Challenge

EC European Commission

RE Requirement Elicitation

RS Requirement Specification

WP Work Package


Table of Contents

1. INTRODUCTION
2. CHARACTERISATIONS OF COMMUNITIES ENGAGED IN SOCIETAL CHALLENGES DATA
  2.1 SC1: HEALTH, DEMOGRAPHIC CHANGE AND WELLBEING - DATA COMMUNITY
    2.1.1 General Description
    2.1.2 Sectoral Structure of the Community
    2.1.3 Size of the Community
    2.1.4 Formal Networks
    2.1.5 Informal or Upcoming Networks
  2.2 SC2: FOOD SECURITY, SUSTAINABLE AGRICULTURE AND FORESTRY, MARINE AND MARITIME AND INLAND WATER RESEARCH, AND THE BIOECONOMY - DATA COMMUNITY
    2.2.1 General Description
    2.2.2 Sectoral Structure of the Community
    2.2.3 Size of the Community
    2.2.4 Formal Networks
    2.2.5 Degree of Formal Networking
  2.3 SC3: SECURE, CLEAN AND EFFICIENT ENERGY - DATA COMMUNITY
    2.3.1 General Description
    2.3.2 Sectoral Structure of the Community
    2.3.3 Size of the Community
    2.3.4 Formal Networks
    2.3.5 Degree of Formal Networking
    2.3.6 Informal or Upcoming Networks
  2.4 SC4: SMART, GREEN AND INTEGRATED TRANSPORT - DATA COMMUNITY
  2.5 SC5: CLIMATE ACTION - DATA COMMUNITY
    2.5.1 General Description
    2.5.2 Sectoral Structure of the Community
    2.5.3 Size of the Community
    2.5.4 Formal Networks
  2.6 SC6: EUROPE IN A CHANGING WORLD - INCLUSIVE, INNOVATIVE AND REFLECTIVE SOCIETIES - DATA COMMUNITY
    2.6.1 General Description and Sectoral Structure of the Community
    2.6.2 Size of the Community
    2.6.3 Formal Networks
    2.6.4 Degree of formal networking
    2.6.5 Specific challenges and opportunities in SC6 functional groups
  2.7 SC7: SECURE SOCIETIES - PROTECTING FREEDOM AND SECURITY OF EUROPE AND ITS CITIZENS - DATA COMMUNITY
    2.7.1 General Description
    2.7.2 Sectoral Structure of the Community
    2.7.3 Size of the Community
    2.7.4 Formal Networks
    2.7.5 Degree of Formal Networking
    2.7.6 Informal or Upcoming Networks
3. STAKEHOLDER IDENTIFICATION METHODOLOGY
  3.1 IDENTIFICATION PHASE
  3.2 ANALYSIS PHASE
  3.3 MAPPING PHASE
  3.4 PRIORITISATION PHASE
4. ENGAGEMENT STRATEGY AND METRICS
  4.1 MULTI-DIMENSIONAL ENGAGEMENT
  4.2 MULTI-LEVEL ENGAGEMENT
  4.3 MULTI-CHANNEL ENGAGEMENT
    4.3.1 Community Stakeholder Workshops
    4.3.2 Community-Based Requirements Elicitation (T2.2)
    4.3.3 Asynchronous Engagement
    4.3.4 Synchronous Engagement
5. CURRENT AND FUTURE CONTACTS
6. SUMMARY
7. REFERENCES
8. APPENDIX
  8.1 STAKEHOLDER INVITATION LETTER


List of Tables

Table 1: Technical and domain leads for each of the seven societal challenge areas
Table 2: Workshop Plans, Year 1
Table 3: Requirement Elicitation Matrix
Table 4: Channels and Goals for Asynchronous Engagement
Table 5: Channels and Goals for Synchronous Engagement
Table 6: Contacted Stakeholder Count and Future Target Count


1. Introduction

This report details the BigDataEurope strategy for reaching existing communities linked to the 7 Horizon 2020 Societal Challenges (SCs): Health, Food & Agriculture, Energy, Transport, Climate, Social Sciences and Security, with the intent of establishing new Big Data management working groups on top of them. This exercise is crucial for the objectives of WP2, which include the elicitation and analysis of the Big Data technological demands and requirements of stakeholders from all of Europe’s SCs. It will also ensure strong visibility in all 7 SC domains, so that the resulting Big Data Integrator Platform is used by as many stakeholders as possible. In addition, stakeholders will be carefully selected so that they can act as multipliers.

The deliverable describes the WP2 working group’s methodology for identifying the European networks and communities (including H2020 projects, participating organisations and industrial players) that are to be contacted and involved in the BDE project and its activities. In particular, the identified stakeholders will be organised in a new set of W3C interest groups, and will be involved in project events, including the 21 planned workshops (3 per SC domain). Outreach, which covers the initial approach, community creation and enabling, and communication, will be coordinated through the following consortium members (from here on referred to as the seven societal consortium members):

● OpenPHACTS: SC1 - Health, demographic change and wellbeing;
● FAO: SC2 - Food security, sustainable agriculture and forestry, marine and maritime and inland water research, and the Bioeconomy;
● CRES: SC3 - Secure, clean and efficient energy;
● Fraunhofer (via Ertico): SC4 - Smart, green and integrated transport;
● NCSR-D: SC5 - Climate action, environment, resource efficiency and raw materials;
● CESSDA: SC6 - Europe in a changing world - inclusive, innovative and reflective societies;
● SatCen: SC7 - Secure societies - protecting freedom and security of Europe and its citizens.

The above consortium members (and subcontracted partner) are all important umbrella organisations in their respective domains. Aside from organising workshops, conferences and other events, each will be responsible for engaging with stakeholders in its SC area, raising awareness of Big Data and Data Value Chain aspects, and maintaining the communication channels established.

The rest of the report describes in more detail the methodology for identifying and engaging stakeholders in each of the seven communities, an outline of the activities to be executed by WP2 partners, and measurable criteria for determining success.


2. Characterisations of Communities Engaged in Societal Challenges Data

2.1 SC1: Health, Demographic Change and Wellbeing - Data Community

2.1.1 General Description

The big data challenges in this sector are driven by the variety and, increasingly, the volume of data generated, stored, accessed and analysed in the understanding of biomedical science. In the context of health and wellbeing, the intensive data generation involved in genetic profiling and other technologies used to gather information on health and disease represents a significant hurdle for the understanding of disease and health. Indeed, an understanding of the biology of the normal situation is mostly lacking, let alone an understanding of how this changes in disease, how disease progression or therapeutic intervention can be measured, and how data can be used in new ways to improve health and wellbeing.

The variety of publicly accessible data relating to biomedical science is significant, and is itself a major barrier to developing an understanding of biology and disease. Standardisation of data relating to genetics, genomics, other ‘omic technologies, drugs, drug targets, clinical measurements, diagnostic testing, biomarkers or the development of biomarkers is in many cases lacking. Integrating all of these data into platforms which can be used to explore findings, generate hypotheses or otherwise generate knowledge is complex, if it is currently possible at all.

Developing widely applicable interoperable data standards is the key problem limiting the impact of big data approaches in healthcare. Interoperable data standards across the value chain will drive new insights in biomarkers, disease categorisation and patient segmentation by enabling the integration of diverse and heterogeneous data sets. Addressing the fundamental questions in health through big data necessitates the interoperability of diverse and complex data types, which in isolation are arguably not enough to develop new insights into disease.

2.1.2 Sectoral Structure of the Community

There is a wide and diverse range of stakeholders in this sector, both public and private. Beyond the provision of healthcare through hospitals, there are also diagnostics companies, technology vendors, pharmaceutical companies, universities, SMEs, charities, rare disease societies and many others. In as far as it applies to this project, we limit the stakeholders to those who both generate and reuse data relating to basic science and drug research [1]. In all cases, there is a significant amount of openly available public data which is curated and maintained by specific repositories. There are also many stakeholders who, directly or otherwise, enable the mixing of public open data with proprietary or otherwise restricted data.

The community can be classified into the following groups:

● Health care providers - for clarity, limited to those engaged in clinical research
● EU research funders - the various DGs which fund research and data generation relating to healthcare, knowledge management, technology development or basic research
● National research funders, e.g. UK BBSRC, Max Planck Institute
● Data repositories, e.g. EMBL-EBI, NCBI - organisations specifically funded to collect, curate and build services for public research data
● Proprietary data providers, e.g. Thomson Reuters - commercial organisations which publish, curate or otherwise generate structured data and content for subscribers
● Technology-driven data generators, e.g. BGI (Beijing Genomics Institute) - focussing on the generation of large amounts of a particular emerging data type
● Standards bodies, e.g. CDISC - organisations, public or private, which develop and promote data standards for particular use cases
● Drug companies - both consumers and producers of large amounts of data, with an emphasis on leveraging large internal data sets
● Biopharma/Biotech companies - working on emerging technologies in therapeutics or diagnostics, and possibly more dependent on CROs or the reuse of existing open data
● Contract Research Organisations (CROs) - produce data for other companies, with an emphasis on interoperability and integration on the client side; increasingly developing their own data platforms
● Academic institutions - researchers in data and computational sciences, generating large amounts of emerging technology data or utilising knowledge-driven approaches to better understand disease; likely to produce and consume large amounts of data.

[1] Although the scope of stakeholder collection in this domain will remain as general as possible, given its dimensions we prefer to focus on stakeholders that can benefit most from Open PHACTS’ current and future data sources.

2.1.3 Size of the Community

The size of the community is vast, global and diverse and spans public and private organisations.

2.1.4 Formal Networks

There is a very large number of formal networks in this space. Some are geographically limited, but there is an increasing number of global networks. The following list highlights those of particular relevance to the application of data to further the understanding of disease and/or the development of new medicines. All have an emphasis on data interoperability, development of standards, and generation of new knowledge. Many of them are overlapping or complementary, and may or may not collaborate. In the early research space, there is more diversity and activities may be focussed on specific technologies.

● Global Alliance for Genomics and Health - genomicsandhealth.org
● Research Data Alliance - rd-alliance.org
● Force11 - force11.org
● ELIXIR - elixir-europe.org
● NCBO - bioontology.org
● BioSharing - biosharing.org
● Pistoia Alliance - pistoiaalliance.org
● Innovative Medicines Initiative - imi.europa.eu
● Sage Bionetworks - sagebase.org
● Critical Path - c-path.org
● CDISC - cdisc.org
● BD2K - bd2k.nih.gov


2.1.5 Informal or Upcoming Networks

There is a vast number of informal networks in this space, and their level of activity varies significantly. Many technology vendors (e.g. in genome sequencing) also form networks around their particular technology, which may develop into other areas or participate in state funding mechanisms.

2.2 SC2: Food Security, Sustainable Agriculture and Forestry, Marine and Maritime and Inland Water Research, and the Bioeconomy - Data Community

2.2.1 General Description

The main big data societal challenge identified in this community is around the improvement of productivity in agriculture and throughout the food value chain. Normally, big data in agriculture are associated with information collected by sensors, satellites or drones combined with genomic information or climate data, which can all help farmers to optimize their farms’ operations.

However, the main issues identified by existing communities of data managers in this area are more around the heterogeneity of the data that need to be combined and integrated for both fostering new research and innovation and providing meaningful information for decision making.

The initiatives and networks described below were born mostly around these challenges, while the private sector and governments are more concerned with “big data” in terms of volume and velocity.

2.2.2 Sectoral Structure of the Community

The key players in the domain are of different types depending on the area considered:

● standardisation: international organizations (FAO, ISO, EC directives – e.g. INSPIRE);
● observational data / accessions: government, applied research institutions (Bioversity International and all CGIAR Centers, national research institutes / Ministries);
● research datasets, raw data: applied research (CGIAR, INRA, Wageningen University), academic research;
● precision agriculture, sensor data: private sector;
● IT support / extension: NGOs, private sector.

2.2.3 Size of the Community

● Agricultural information managers registered on the Agricultural Information Management Standards and Services website (AIMS): 1,992
● Institutions partners in the GODAN initiative: 138
● Institutions partners in the CIARD initiative: 499

2.2.4 Formal Networks

● Global Open Data for Agriculture and Nutrition (GODAN): high-level advocacy for open data.
● CIARD movement: technical framework for agricultural institutions and data managers.
● Project consortia: agINFRA, SemaGrow, VIBRANT, iMarine: infrastructure, software.


2.2.5 Degree of Formal Networking

● GODAN: being established, very formal: Secretariat networking focal point; working groups, regular Assemblies. Outreach especially to governments and private sector.
● CIARD: e-discussions. Outreach especially to directors of institutions, data managers.
● Project consortia: mailing lists, events. Outreach especially to EC partners.

2.3 SC3: Secure, Clean and Efficient Energy - Data Community

2.3.1 General Description

The prospects for applying Big Data technology within the Energy domain are focused on the following sectors:

● Electricity production, transmission and distribution
● Renewable energy production
● Distributed production and smart grids
● Energy saving
● Energy policy planning

The data produced and processed in the Energy domain come from highly diverse origins:

● Monitoring complex electro-mechanical systems (O&M, condition and health monitoring CM/SHM, preventive maintenance, optimization based on historic data etc.)

● Monitoring of energy flow on transmission and distribution grids (Remote Terminal Units - RTUs, smart metering, etc.)

● Forecasting of energy demand and renewable energy production (localized weather, access to historic reanalysis data, control of Internet-connected distributed systems or components such as inverters)

● Monitoring and optimizing energy management systems (Building Management Systems - BMS etc.)

● Monitoring Energy Policy related data from a variety of sources and formats, such as socioeconomic, geospatial, resource or even legislation data

The current applications of Big Data technology in the domain are mainly private industrial use cases.

2.3.2 Sectoral Structure of the Community

The key players in each sector are the following:

Electricity production, transmission and distribution sector:

● Utilities and Operators (system monitoring, forecasting)
● Transmission System Operators (TSOs) (grid and substation monitoring, energy flow, smart grids at transmission level, forecasting)
● Distribution System Operators (DSOs) & aggregators (grid and substation monitoring, automated metering infrastructure (AMI), historic data management and forecasting)

Renewable energy production sector:

● Manufacturers (fleet monitoring)
● Wind Farm operators (system monitoring, resource forecasting and day-ahead bidding)

Distributed production and smart grids sector:

● DSOs & aggregators (grid and substation monitoring, energy flow and balancing, smart metering, forecasting, demand side management)

Energy saving:

● Industrial sector (energy MS, large distributed installations)
● Building & commercial sector (building envelope, audit data, user preferences and behaviour, etc.)

Energy policy planning:

● Service suppliers in resource estimation (wind atlases, climate effects, etc.)
● Service suppliers in processing and analysis of socioeconomic, geospatial and legislation data from various sources and formats

2.3.3 Size of the Community

The size of the community is extensive, and is primarily composed of private companies. Independent institutional organisations also play a significant role in system operation and regulation.

2.3.4 Formal Networks

In the Energy domain, there is a significant number of associations, networks and working groups. Although these networks are not focused on specific technology solutions, they are the focal point for investigating current practices in data management, Big Data case studies, and future developments and needs that will require Big Data technologies. Among the existing networks, those of the Transmission System Operators (TSOs) and the Distribution System Operators (DSOs), together with smart grid related networks, are the primary target of the BDE community building task.

2.3.5 Degree of Formal Networking

The primary focus of the formal networks is on technology issues, process homogenisation, standard development, system interoperability and policy making. Access to the network members will be obtained through their active working groups, especially those related to data management.

2.3.6 Informal or Upcoming Networks

Other existing informal or upcoming networks will be identified in the process of community building during the course of the project.

2.4 SC4: Smart, Green and Integrated Transport - Data Community

Although Fraunhofer is the consortium member responsible for this domain, as per the Description of Work (DoW) the WP2 efforts will be delegated to a subcontracted partner. Since the process of formally bringing this partner on board is still underway (ERTICO remains the primary contender, as per the DoW), we defer the characterisation and description of the community, together with the big data challenges and opportunities it faces, to a later date. The contribution will instead be included as an addendum to Deliverable 2.3 (1st document), due in month 6.

2.5 SC5: Climate Action - Data Community

2.5.1 General Description

Climate research investigates global climatic changes, taking into account natural and anthropogenic forcing, as well as the interaction of climate with atmospheric composition. The purpose of climate change research is to investigate whether there is actual climate change due to human activities, and to forecast the magnitude of the climate change and its impact on the economy, the environment (e.g., changes of land use, availability of water resources) and health, on a global as well as a local scale. A further aim of climate change research is to increase resilience against the above, mitigate effects, and propose measures and policies to reduce the causal factors.

A universal and common belief is that improved climate prediction and information services and broad assessment of climate impacts are essential for targeted adaptation and risk management measures and strategies that would facilitate the mainstreaming of adaptation into sustainable development and resilience strategies at local, national and regional scales.

Current and future climate change and climate variability studies focus on providing a sound basis for the continued development and application of new tools and weather and climate services. Such efforts require enhanced research, weather and climate observations, monitoring, analysis and computational modelling. The aim is to transform this immense volume of information (data sets) into sector-specific products and applications and to ensure their widest possible use in decision making by all interested sectors of society.

Climate research is heavily based on computer models that simulate the earth’s climate for time periods spanning several decades. The simulations are performed considering different scenarios of worldwide emissions of anthropogenic pollutants that affect the climate. In addition, the simulations assimilate a very large number of current and past weather observations from ground stations and satellites.

The above shows how climate change research is connected to Big Data issues. Global climate models discretize the entire earth’s surface at a resolution that has recently reached a few kilometres, resulting in billions of grid cells. In addition, worldwide weather observations and satellite data are collected and assimilated on a daily basis, and past observational data are re-analysed. Repeated climate simulations are carried out with different anthropogenic emissions scenarios spanning very long time periods (decades). In conclusion, these climate simulations produce massive amounts of data that need to be stored, analysed, visualized and combined.
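As a rough illustration of the scale described above, the following sketch estimates the number of cells in a uniform global grid. The surface area figure (about 510 million km²) is a standard approximation; the function name, the uniform-grid simplification and the choice of 50 vertical levels are assumptions made purely for illustration and are not taken from any specific climate model.

```python
# Approximate total surface area of the Earth in square kilometres.
EARTH_SURFACE_KM2 = 510_000_000


def grid_cell_count(resolution_km: float, vertical_levels: int = 1) -> int:
    """Rough cell count for a uniform grid with the given horizontal spacing.

    Assumes square cells of side `resolution_km` covering the whole surface,
    stacked in `vertical_levels` layers (a hypothetical simplification).
    """
    columns = EARTH_SURFACE_KM2 / (resolution_km ** 2)
    return round(columns * vertical_levels)


if __name__ == "__main__":
    # Compare coarse and fine horizontal resolutions with 50 assumed levels.
    for resolution in (100, 25, 5, 2):
        cells = grid_cell_count(resolution, vertical_levels=50)
        print(f"{resolution:>4} km spacing: about {cells:,} cells")
```

Under these assumptions, a horizontal spacing of a few kilometres combined with some tens of vertical levels already gives a cell count on the order of billions, and every stored variable and time step multiplies the resulting data volume further.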

Therefore, managing and manipulating the results of climate model simulations is a Big Data challenge that involves techniques and tools for storage, analysis and visualisation in order to extract useful conclusions. It also requires techniques and tools for combining climate model results with data from other areas, e.g., agricultural production, population distribution and economic activities.

Big Data management and analytics of global climate model results can be used to address real-world impacts of climate change. Characteristic examples of potential pilot cases to be considered in the frame of BigDataEurope are the prediction of the frequency and intensity of extreme weather events, the prediction of the frequency and intensity of flooding events, and the prediction of sea level rise. Big Data can reveal critical insights for strengthening resilience against these effects of climate change. Other potential pilot cases include effects on agricultural production at the local scale, effects on economic activities, and effects on the frequency and intensity of forest fires. Big data visualization techniques can help to increase people's awareness of climate change effects at the local scale.

2.5.2 Sectoral Structure of the Community

The key players in the domain can be governmental and non-governmental organisations, an indicative list of which is the following:

IPCC

The Intergovernmental Panel on Climate Change (IPCC) (http://www.ipcc.ch/index.htm) is the leading international body for the assessment of climate change. It was established by the United Nations Environment Programme (UNEP) and the World Meteorological Organization (WMO) in 1988 to provide the world with a clear scientific view on the current state of knowledge in climate change and its potential environmental and socio-economic impacts.

The main activity of the IPCC is to provide at regular intervals Assessment Reports of the state of knowledge on climate change. The latest one is the Fifth Assessment Report which was finalized in November 2014.

DDC

The Data Distribution Centre (DDC) (http://www.ipcc-data.org/) of the IPCC provides climate data (observations and climate model results), as well as socio-economic and environmental data, both from the past and in scenarios projected into the future. The DDC is jointly managed by the British Atmospheric Data Centre (BADC) in the United Kingdom (http://badc.nerc.ac.uk/home/index.html), the CSU World Data Center Climate (WDCC) in Germany (http://www.dkrz.de/daten/wdcc/), and the Center for International Earth Science Information Network (CIESIN) at Columbia University, New York, USA (http://www.ciesin.columbia.edu/). The data are provided by co-operating modelling and analysis centres.

Divisions of National Meteorological Services, e.g.,

● The Climate and Large scale Meteorology Department (GMGEC, Toulouse) of Meteo France http://www.cnrm-game.fr/spip.php?rubrique89

● The German Weather Service (Deutscher Wetterdienst - DWD) Integrated Climate Data Centre (ICDC) (http://icdc.zmaw.de/dwd_station.html?&L=1) and the Climate Data Centre (CDC) https://kwz.me/GJ

Research centres, e.g.,

● German Research Centre for Climate – Deutsches Klimarechenzentrum (DKRZ) http://www.dkrz.de/ that hosts the World Data Center for Climate (WDCC) / Climate and Environmental Retrieval and Archive (CERA) http://cera-www.dkrz.de/ and http://www.dkrz.de/daten-en/cera/portal

● Center for International Climate and Environmental Research (CICERO), Oslo, Norway http://www.cicero.uio.no/

● National Observatory of Athens, Institute for Environmental Research and Sustainable Development, Athens, Greece http://www.meteo.noa.gr/research_area_03.html

ECMWF

The European Centre for Medium-Range Weather Forecasts (ECMWF) is an independent intergovernmental organisation supported by 34 states.

ECMWF is both a research institute and a 24/7 operational service, producing and disseminating numerical weather predictions to its Member States. This data is fully available to the national meteorological services in the Member States. The Centre also offers a catalogue of forecast data that can be purchased by businesses worldwide and other commercial customers. The supercomputer facility (and associated data archive) at ECMWF is one of the largest of its type in Europe and Member States can use 25% of its capacity for their own purposes.

The organisation was established in 1975 and now employs around 280 staff from more than 30 countries. ECMWF is one of the six members of the Co-ordinated Organisations, which also include the North Atlantic Treaty Organisation (NATO), the Council of Europe (CoE), the European Space Agency (ESA), the Organisation for Economic Co-operation and Development (OECD), and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT).

UN

The United Nations have recently launched an initiative called “The Big Data Climate Challenge” (http://www.unglobalpulse.org/big-data-climate) with the aim to bring forward current evidence of the economic dimensions of climate change around the world using big data and analytics.

NASA

Research at the NASA Goddard Institute for Space Studies (GISS) (http://www.giss.nasa.gov/, and http://data.giss.nasa.gov/gistemp/) emphasizes a broad study of global change, which is an interdisciplinary initiative addressing natural and man-made changes in our environment that occur on various time scales — from one-time forcings such as volcanic explosions, to seasonal and annual effects such as El Niño, and on up to the millennia of ice ages — and that affect the habitability of our planet.

GISS is located at Columbia University in New York City. The institute is a laboratory in the Earth Sciences Division of NASA's Goddard Space Flight Center and is affiliated with the Columbia Earth Institute and School of Engineering and Applied Science.

The Global Change Master Directory of NASA provides a list of atmosphere and climate data websites in the USA http://gcmd.gsfc.nasa.gov/learn/pointers/meteo.html.

2.5.3 Size of the Community

The size of the community is vast, involving thousands of researchers, users and policy makers.

2.5.4 Formal Networks

Examples of formal networks are listed below:

WCRP CMIP3 Multi-Model Dataset

In response to a proposed activity of the World Climate Research Programme's (WCRP's) Working Group on Coupled Modelling (WGCM), model output contributed by leading modelling centres around the world has been collected in http://www-pcmdi.llnl.gov/ipcc/about_ipcc.php. The climate model output from simulations of the past, present and future climate was collected mostly during the years 2005 and 2006, and this archived data constitutes phase 3 of the Coupled Model Intercomparison Project (CMIP3). This activity was organized in part to enable those outside the major modelling centres to perform research of relevance to climate scientists preparing the Fourth Assessment Report (AR4) of the Intergovernmental Panel on Climate Change (IPCC).

The CMIP3 multi-model dataset is open and free for non-commercial purposes, after registering and agreeing to the "terms of use".

As of January 2007, over 35 terabytes of data were in the archive and over 337 terabytes of data had been downloaded by the more than 1200 registered users. Over 250 journal articles, based at least in part on the dataset, have been published or have been accepted for peer-reviewed publication.

Earth System Grid Federation

ESGF (http://esgf-data.dkrz.de/esgf-web-fe/) is a worldwide federation for climate data with data nodes in Europe, USA, Canada, China, Japan and Australia. It provides data from climate research generated for and within climate model intercomparison projects: global model output (e.g. CMIP5), regional model output (e.g. CORDEX) and selected observational data (e.g. obs4MIPs). User registration is easy and immediately effective. Registration and data download are free of charge. Most data are also accessible for purposes other than research, even for commercial use.

IS-ENES - Infrastructure for the European Network for Earth System Modelling

The ENES data services (https://verc.enes.org/data) will help you to find and access climate data provided by the distributed data centers within the ENES data federation. Information on the different climate models and tools developed in Europe and on European high performance computing facilities is also provided.

KomFor

The KomFor data services (http://www.komfor.net/data-portal.html) help to find high quality research data from the Earth system sciences provided by the WDC Cluster Earth System Research. The members are Pangaea (AWI, MARUM), WDC-Climate (DKRZ) and WDC-RSAT (DLR), with GFZ (GeoForschungsZentrum) as an associated member.

EUDAT - European Data Infrastructure

In the context of the EUDAT project (http://b2find.eudat.eu), metadata from various operational scientific databases in Europe is harvested and combined into a joint multi-disciplinary scientific data catalogue.

C3-Grid

The C3-Grid Portal (https://c3portal.awi.de/home, https://verc.enes.org/c3web) provides access to various climate data centers in Germany. DKRZ integrates WDCC and ESGF data into C3-Grid and makes this data available for a set of climate processing workflows.

2.6 SC6: Europe in a Changing World - Inclusive, Innovative and Reflective Societies - Data Community

2.6.1 General Description and Sectoral Structure of the Community

This community is made up of practitioners in 4 basic functional groups: data aggregators, data service providers, data analysts, and users of analytical results.

It is sometimes the case that the same organisation, or indeed a single project or even scientist, can be a practitioner of all four functions. For example, a government agency may conduct a survey, provide access to the raw data, analyse the data, and make policy decisions based on that evidence.

However, this community does not typically operate in such silos. On the contrary, it is widely recognised that these functions require particular skills to be performed expertly, in the interests of good science and good evidence-based decision-making. Public trust in the information relevant to this societal challenge is of vital importance, and is best achieved when there is visible independence between practitioners operating in these functional groups. The structure of this group has evolved at least in part due to this pressure.

Each of the above-mentioned basic functional groups is defined below:

The data aggregators’ group consists of the official statistics offices of each country, and the consortiums and organisations conducting large-scale quantitative and qualitative surveys (e.g. European Social Survey, Survey of Health, Ageing, and Retirement in Europe). Increasingly these data aggregators are seeking data sources that pre-exist in the administrative systems of public administration, and in the big data of commercial and other person-interactive industries, services and media. These sources may offer different quality attributes and different costs to data aggregators and are being actively explored.

The data service providers take the raw data from data aggregators, ingest the data into managed storage, and offer preservation and access services to other practitioners further along the value chain. In this community they are represented by CESSDA and DARIAH, for the Social Sciences and the Humanities, respectively, with many other practitioners independently offering such services in all European countries. Considerable work is underway to improve the cooperation between data aggregators and data service providers (for example, the FP7 project “Data Without Boundaries”). The data service providers appreciate that their mandate extends to the ingest, preservation, and provision of access to all relevant data for their user community, and therefore they must learn how to use administrative and big data sources.

Social Sciences’ and Humanities’ researchers, who use data to gain insights into the key issues facing society today, represent the data analysts group. Their aim is to inform public decision-making, public administrations, and commercial organisations with good scientific evidence and insight. Universities, NGOs, political think tanks, public administrations, and business analysis all have data analysts using data to derive insight, and this may be the most numerous and dispersed of the 4 functional communities. Data analysts are interested in the different qualities that big data can bring to their menu of sources.

The users of analytics are the decision makers of public and private administrations, as well as members of the public in their engagement with public services, as consumers of commercial products and services, and in the democratic process. They need to understand the effect of adopting big data as a source for the analytical insights that they use to understand the world around them.

For the purposes of this project, the key players are Eurostat, the statistics office of the Commission; the national statistics offices in each ERA country; the Research Infrastructures of DARIAH, ESS, SHARE, and CESSDA; the Social Sciences and Humanities scientific community representatives dealing with research infrastructure; and the European Commission and policy making departments of central and local government.

2.6.2 Size of the Community

The European Statistical System is a very substantial community, with a statutory function and budget in every ERA country. In each country these institutions deploy the majority of the professionals in the fields of Official Statistics and Research Methodology in the public sector.

The Research Infrastructure communities of the European Social Survey, SHARE, DARIAH, and CESSDA each have representative organisations in half or more of the ERA member countries.

Each substantial university and research institution in each city of Europe has a faculty of social sciences and humanities. The majority already are (mainly within the fields of Economics and some other Social Sciences, e.g. Psychology), or will become, users of data from data aggregators, sourced through data service providers.

2.6.3 Formal Networks

The European Statistical System is a permanent statutory network with collaborative research initiatives underway into the issues arising from adopting big data as a source.

CESSDA, SHARE, ESS, and DARIAH are permanent research infrastructures as data aggregators and/or data service providers.

2.6.4 Degree of Formal Networking

There are many networks used by these communities, such as IASSIST, European Conference on the Social Sciences, the European Research Association, and the Research Data Alliance. Given the functional interdependencies of the elements of this societal challenge, these are important forums for searching for interoperability, efficiency, and effectiveness in the data value chain.

2.6.5 Specific Challenges and Opportunities in SC6 Functional Groups

The recent Impact Assessment of EU funding on the SSH identified several critical weaknesses in the European Research Area: barriers to cooperation formed by the low compatibility and interoperability of national research programmes, and restricted circulation of, and uneven access to, scientific knowledge, to mention the few that are crucial for the data analysts group. Recent research on the use of Big Data in SSH [1] questions how automating research changes the definition of knowledge, notes that bigger data are not always better data, raises the question of ethics (being accessible does not make data ethical), and points to new digital divides created by limited access to Big Data.

Some further criticism states that the availability of Big Data, coupled with new data analytics, challenges established epistemologies across the sciences (including SSH). It argues that the creation of data-driven rather than knowledge-driven science, and the development of digital Humanities and computational Social Sciences, propose radically different ways to make sense of culture, history, economy and society [2].

On the other hand, the EU promotes interdisciplinary and pan-European research, on the assumption that contributions from various fields can only enrich outputs and support the sustainability of results. Indeed, in recent decades it has been shown that teams rather than individual authors dominate the production of knowledge [3]. Teams typically produce more frequently cited research, showing that results can be applied and re-evaluated across fields and time. Teams now also produce exceptionally high-impact research, even in areas traditionally reserved for single authors or small national teams. Research is increasingly done in teams across nearly all fields, suggesting that the process of knowledge creation has fundamentally changed.

A position paper of DG RTD setting priorities for the 2016 and 2017 work programme in SC6 [4] reveals a considerable diversity of conditions in member states that can limit the adoption of identical practices and mechanisms at the level of the data providers group. It recommends further development of tools for analysis and for access to publications and the underlying data, and emphasises the need to address the unresolved challenge of assessing research results in the European Humanities and Social Sciences, which arises from the number of languages and the different data formats. Therefore, the following recommendations (many of which the BDE project has already taken into account) were formulated for developing an information infrastructure that builds on what has already been achieved, including the many medium- and small-scale national research infrastructures that need to be networked at a European level [5]:

● the cataloguing of journals, monographs and other publications;

● searchable database of contents, with multilingual input and output;

● ensuring standards and meta-data for digitised records and tools for analysing objects within texts, pictures, tones and multi-modal media;

● open and, as far as possible, free access to published outputs and controlled access to primary data;

● enduring support for the conservation of data and the migration of data to different platforms;

● incentives for participation and maintaining comparability of information within longitudinal research;

● incentives for national data collection to ensure high levels of country participation;

● mapping of research expertise across Europe and in other regions.

2.7 SC7: Secure Societies - Protecting Freedom and Security of Europe and its Citizens - Data Community

2.7.1 General Description

The Secure Societies H2020 Societal Challenge is related to the protection of freedom and security of Europe and its citizens. Existing datasets that could be used in the Security field are very heterogeneous: possible examples are Earth Observation (EO) satellite data, aerial imagery (e.g. from Remotely Piloted Aircraft), long-term archives, in-situ data and data from collateral sources (e.g. media, public data, web-based communities, social networks, user-generated content, video sharing sites, wikis, blogs and publicly available sources).

In particular, space assets play an important role in the Security domain, and activities related to their use for this purpose are currently running under international programmes within and outside Europe; for instance the EU and its Member States, driven by principles such as the ones stated in the European Space Policy or in the European Security Strategy, are supporting programmes such as Copernicus and Galileo that can have an application in the Security domain.

The rapidly increasing availability, precision and variety of data to be analysed and used in the Security field (e.g. in support of EU decision-making) require among other things new or refocused initiatives to support the development of capacities for the whole data lifecycle.

2.7.2 Sectoral Structure of the Community

The mission of the EU SatCen is to support the decision making and actions of the EU in the field of the Common Foreign and Security Policy (CFSP) by providing products and services resulting from the exploitation of relevant space assets and collateral data. Thus, in the framework of the BigDataEurope (BDE) project and in line with the Secure Societies H2020 Societal Challenge, the EU SatCen can represent the stakeholders involved in the EU decision-making process in the CFSP field.

A possible list of key players in the Secure Societies domain could include EU Member states, EU agencies and other relevant entities, e.g.:

● EU Member States: Ministries of Defence and of Foreign Affairs;

● EC entities: DG Migration and Home Affairs, EC Joint Research Centre (JRC);

● EU entities: European External Action Service (including its relevant units);

● EU Agencies: European Defence Agency (EDA), European Maritime Safety Agency (EMSA), European Agency for the Management of Operational Cooperation at the External Borders of the Member States of the EU (FRONTEX), European Union Agency for Network and Information Security (ENISA), Europol European Cybercrime Centre (EC3);

● International Organisations: European Space Agency, United Nations;

● Industry Associations: European Association of Remote Sensing Companies (EARSC), Trade Association of the European Space Industry (EUROSPACE).

2.7.3 Size of the Community

The Secure Societies community comprises, at a rough estimate, thousands of people.

2.7.4 Formal Networks

Given that Big Data is a fairly recent issue, further investigation at EC level is needed to identify the current initiatives and fora; some existing examples in the field of EO are described below.

Group on Earth Observations (GEO)

GEO is a voluntary partnership of 97 Governments (including the European Commission) and 87 International Organizations with a mandate in Earth Observation. It provides a framework within which these partners can develop new projects and coordinate their strategies and investments.

Open Geospatial Consortium (OGC)

Open Geospatial Consortium (OGC) is a worldwide organisation with the aim to advance the development and use of international standards and supporting services that promote geospatial data interoperability. It is also the body that standardizes access to Earth Observation data.

2.7.5 Degree of Formal Networking

Group on Earth Observations (GEO)

GEO is coordinating efforts to build a Global Earth Observation System of Systems (GEOSS); GEOSS will provide decision support tools to a wide variety of users and will be a global and flexible network of content providers allowing decision makers to access an extraordinary range of information at their desk.

Open Geospatial Consortium

To accomplish its mission, OGC serves as the global forum for the collaboration of geospatial data / solution providers and users; several OGC activities focus on the geospatial aspects of Big Data processing.

2.7.6 Informal or Upcoming Networks

Big Data from Space

Big Data from Space is a conference co-organised by ESA, JRC and the EU SatCen which brings together researchers, engineers, developers and users in the area of Big Data in the space sector. The focus is on the whole data life cycle, ranging from data acquisition by space-borne and ground-based sensors to data management, analysis and exploitation in the domains of Earth Observation, Space Science, Space Engineering and Space Weather. The conference is also used as a requirements/ideas collector for ESA and the European Commission to support actions in this field. The last edition of the Big Data from Space Conference was held in Frascati at ESA-ESRIN in November 2014 with nearly 400 participants coming from more than 25 different countries; the next event is foreseen in early 2016 in Tenerife and will include a session on Big Data for Secure Societies.

EU Space Policy Conference

This annual conference offers to stakeholders from the space sector, industry, users and European and national decision-makers the opportunity: to debate the current and future implementation of EU space programmes; to illustrate the contributions of space infrastructures and services to the delivery of effective EU joint action; to discuss the increasing role of space infrastructure and services in meeting the new security and defence challenges and threats facing Europe, including energy security.

3. Stakeholder Identification Methodology

Our methodology for stakeholder identification and engagement follows the best practices laid out by the BIG project² consortium, as detailed in Deliverables 3.5.1 and 3.5.2 (Final Stakeholder engagement activities, October 2014).

Since the start of the project, the BDE consortium has sought to identify all relevant stakeholders associated with each of the seven SCs, in both the respective industrial sectors (demand) and technological areas (supply) in the existing European Big Data value chain. A collaborative stakeholder collection process was initiated within the first month, in view of the requirement to engage stakeholders as early as possible and the decision to invite a large number of them to the public project launch (27.02.2015, Brussels). Following this first iteration, which successfully yielded around 60 participants from across the entire Big Data stakeholder spectrum despite a lack of travel funding, the identification exercise will follow a more methodical process. Taking a cue from the above-referenced project, each iteration will consist of four phases:

1. Identification: listing relevant groups, companies, organisations, etc.

2. Analysis: understanding stakeholder background, perspectives and relevance.

3. Mapping: identifying relationships amongst stakeholders, and mapping to the objectives.

4. Prioritisation: ranking stakeholder relevance.

Plans for each phase are described in more detail below.

3.1 Identification Phase

We adapt the BIG project definition to define stakeholders as “European-based entities that have a stake, interest or right in the big data value chain and that are affected (negatively or positively) by Big Data opportunities, challenges, activities and trends, and who therefore have a high likelihood of influencing (and benefitting from) the Big Data Platform (architecture, components, guidelines and best practices) resulting from BigDataEurope”.

The BDE consortium members will continue the process of identifying stakeholders. The process will be transparent and open so that all interested parties may participate. Contacted entities will be encouraged to invite additional stakeholders in order to achieve a multiplier effect.

The identification process is being spearheaded by the 7 established SC domain chairs (2 each, as shown in Table 1), whose networking activities throughout the project will yield a continuously expanding list of stakeholders, especially in the first year. Several sources of information are being considered for community building:

● BDE partners’ existing network.

● European Commission's (via Project Officer and other key people) contacts to relevant Organisational Units.

● European Commission’s public list of funded projects, including recent projects funded under both FP7 and H2020 programmes.

● Other CSAs in ICT-15, since they will need to reach the same communities.

● Dissemination plans published by related projects as early deliverables, as these act as a guide to the events where several projects will be represented.

● Existing lists created by recent related projects and initiatives, such as the European Big Data Directory and Map³ generated by the BIG project.

● National and European-wide information days, and publicly-available attendee lists (including anchors, organising committee and affiliations).

● Umbrella organisations such as national funding agencies, European Innovation Partnerships and/or national and European-wide associations.

² http://www.big-project.eu/

³ http://big-project.eu/content/european-big-data-map

Table 1: Technical and domain leads for each of the seven societal challenge areas

| Societal Challenge | Leading partner | Technical chair | Domain chair |
|---|---|---|---|
| Healthcare | Open PHACTS | Victor de Boer (VU) | Bryn Williams-Jones (Open PHACTS) |
| Food & Agriculture | FAO | Timea Turdean (SWC) & Nikos Manouselis (Agro-Know) | Valeria Pesce (FAO/GFAR) |
| Energy | CRES and NCSR-D | Andreas Ikonomopoulos (NCSR-D) | Fragiskos Mouzakis (CRES) |
| Intelligent Transport | Fraunhofer, entity to be subcontracted | Simon Scerri (Fraunhofer) | ERTICO t.b.c. |
| Climate & Environment | NCSR-D | Spyros Andronopoulos (NCSR-D) | Diamando Vlachogiannis (NCSR-D) |
| Inclusive & Reflective Societies | CESSDA | Martin Kaltenböck (SWC) & Simon Scerri (Fraunhofer) | Paul Jackson (CESSDA) |
| Secure Societies | EU SatCen | Manolis Koubarakis (UoA) | Sergio Albani (EU SatCen) |

The exercise described in Section 2, namely the characterisation of communities engaged in each societal challenge domain, enabled the following categorisation of stakeholders. Identified stakeholders are considered as actors along the Big Data Value chain, with a potential to create value in the data economy. The identified stakeholder categories include entities from the respective industrial sectors (demand) and technological areas (supply) as well as government agencies and other big data communities, projects and initiatives:

● Industry
  ○ Large Companies
  ○ SMEs & Start-ups
  ○ Data generators
  ○ Industrial Associations

● Academia
  ○ Researchers
  ○ Universities
  ○ Research Associations, Institutes and Centres
  ○ H2020 Project Coordinators

● Public Administration, Regulatory bodies & Governance
  ○ Governmental agencies
  ○ Intra-governmental networks
  ○ European Commission Units
  ○ European Union Entities
  ○ Policy makers

● Projects, Standardisation bodies and other initiatives
  ○ Networking/Lobbying Associations
  ○ Standardisation bodies
  ○ Societal Initiatives
  ○ International Organisations
  ○ Running Projects

Industry will be represented by large companies in the domain, as well as start-ups, SMEs and entrepreneurs that have a stake in the Big Data Value chain and possess specific technical, application and business competences. Candidates consist of enterprises in all seven sectors and of all sizes that wish to improve their services or products using Big Data technology. They also include technology providers who already provide data management tools, platforms and analysis services, including services for the aggregation, generation and transformation of non-public data for business use-cases.

The academic sector will be represented by individual researchers or teams of researchers at universities and research centres related to the SCs. Amongst these will be coordinators of running and relevant H2020 projects. Research centres, institutes and universities related to the SCs will also be identified.

Governmental agencies, together with various European Commission Units and Directorates-General, will be targeted if they generate or aggregate national and European-wide data sources; if they have an interest in exploiting networked data for various intents and purposes; or if they act as regulatory bodies concerned with privacy and legal issues related to mixed data usage.

H2020 and FP7 projects related to each sector are considered relevant if they are still running, or have only finished recently, if they target one of the seven sectors, and if they handle or in any way deal with data subject to one of the three original big data v’s - volume, velocity and/or variety. Other entities and initiatives considered include standardisation bodies seeking to promote new big data concepts, systems and solutions for global adoption, societal initiatives and lobbying/networking associations promoting data value services through the coordination of initiatives between various key stakeholders related to a single sector.

Once a stakeholder has been identified, they will be invited to follow the project (at the very minimum), stay up-to-date with dissemination and engagement activities, and join the W3C interest groups. An invitation letter that is used as a template to invite identified stakeholders through postal mail or email is included in the Appendix Section.

3.2 Analysis Phase

Following the stakeholder identification phase, we will analyse each stakeholder’s relevance in the sector, their interest in engaging in our activities and the value of said engagement. The key parameters are adaptations of those identified by the earlier-referenced BIG project deliverable, namely:

● Identification of big data use, expertise
● Relevance and fit to one (or more) of the 7 SC sectors
● Relevance and fit to one (or more) of the project objectives (w.r.t. WP2)

D2.1– v 1.0

Page 26

● Legitimacy of stakeholder’s claim for engagement
● Willingness to engage
● Influence level - who they influence
● Necessity of involvement

3.3 Mapping Phase

Stakeholders will be mapped i) to each other, based on the results of the above analysis, and ii) to the objectives, based on the five core project, community building and networking-related objectives:

● Creating SC communities
● Identifying issues and eliciting requirements for a Big Data architecture and components
● Identifying use-cases
● Participating in pilots, trials and prototype testing
● Dissemination

The result of this mapping exercise will only be used internally, and it will guide tasks along WPs 2-7 (community building, requirement elicitation, BDE platform, instances, evaluation and dissemination).

3.4 Prioritisation Phase

Following the mapping stage, activities with each of the identified stakeholders will be prioritised and ranked by relevance. Different priorities for the three different kinds of WP2 activities will be established for each stakeholder:

1. Communication - Indirect interaction by means of project updates, workshop announcements and invitations, and project result dissemination.

2. Contribution - Direct contribution by means of general interviews, surveys, discussions about issues and concerns, technical discussions about challenges and technological needs, etc.

3. Involvement - Direct involvement in the project’s technological activities by means of specific/use-case discussions, trials, feedback collection.

The results of this phase will indicate the engagement level appropriate for each stakeholder, at identification stage. However, this will then be fine-tuned and updated during the engagement process, as described in Section 4.2.

4. Engagement Strategy and Metrics

The consortium aims to constantly engage the identified stakeholders and groups throughout the different project stages. Stakeholder engagement activities will follow a multi-dimensional, multi-level and multi-channel approach. These aspects are described in more detail below.

4.1 Multi-Dimensional Engagement

The multiple dimensions primarily correspond to the seven European Big Data Communities to be established through the W3C interest groups. Within each of these seven dimensions, engagement activities will then target:

● The extended networks of the seven societal challenge project partners outlined in the introduction;

● Other identified stakeholders (including those identified via the European Commission and the relevant units), and their extended networks.

4.2 Multi-Level Engagement

Multiple levels result from the identification of the three activity priorities established in Section 3.4. Repeating the exercise set out by the BIG project, these priorities can be mapped to a Pyramid of engagement⁴ which distinguishes between the different levels. This explicitly defines what to expect from the stakeholders, what they are willing to commit to, what they are interested in, and what characteristics they require to move up to the next level. The pyramid of engagement can be adapted to the project to generate the following list (where level 1 here refers to the lowest level of engagement and level 6 to the highest):

1. Observers: Stakeholders with a general, non-committal but confirmed interest in the project’s contributions.

2. Followers: Stakeholders that subscribe to notifications, activities and publications by the consortium.

3. Endorsers: Stakeholders that forward project contributions, notifications, activities and publications amongst their extended networks.

4. Contributors: Stakeholders that attend project activities, contribute to discussions, problem identification, requirements elicitation.

5. Owners: Stakeholders that are involved in the project’s activities, including participating in pilots and prototype trials, and who provide continuous feedback.

6. Leaders: Representatives from each societal domain that manage activities and communications with the other stakeholders.

Note that in this case, the 6th (and highest) level of engagement only applies to the seven identified societal consortium members/subcontracted partners, who are not merely stakeholders but full consortium members tasked with leading the communities. The ‘Communication’ activity priority in Section 3.4 can be mapped to levels 1-3, the ‘Contribution’ priority to level 4 and ‘Involvement’ to level 5. However, these priorities are initial indications of the engagement level, and the best fit to levels 1-5 above will be determined throughout the course of the engagement process. Stakeholders can be promoted (higher level of engagement) or demoted accordingly. In addition, the lowest level of engagement will exclude stakeholders who express interest in the project without ever following up on any events, progress and activities.
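The mapping between the Section 3.4 activity priorities and the engagement levels can be sketched in a few lines of Python. This is an illustration only: the names `EngagementLevel`, `PRIORITY_TO_LEVELS`, `promote` and `demote` are hypothetical, and the one-step promotion and demotion rule is an assumption, since the report does not prescribe how level changes are recorded.

```python
from enum import IntEnum


class EngagementLevel(IntEnum):
    """Pyramid-of-engagement levels from Section 4.2 (1 = lowest)."""
    OBSERVER = 1
    FOLLOWER = 2
    ENDORSER = 3
    CONTRIBUTOR = 4
    OWNER = 5
    LEADER = 6  # reserved for consortium members, not ordinary stakeholders


# Initial mapping of activity priorities (Section 3.4) to candidate levels.
PRIORITY_TO_LEVELS = {
    "Communication": [EngagementLevel.OBSERVER,
                      EngagementLevel.FOLLOWER,
                      EngagementLevel.ENDORSER],
    "Contribution": [EngagementLevel.CONTRIBUTOR],
    "Involvement": [EngagementLevel.OWNER],
}


def promote(level):
    """Move one level up, capped at OWNER (assumed rule for stakeholders)."""
    return EngagementLevel(min(level + 1, EngagementLevel.OWNER))


def demote(level):
    """Move one level down, never below OBSERVER (assumed rule)."""
    return EngagementLevel(max(level - 1, EngagementLevel.OBSERVER))
```

The cap at level 5 reflects the note that level 6 applies only to consortium members; everything else about the update rule would need to be agreed within the consortium.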

In order to determine which of the stakeholders are promoted to levels 4 and 5, we will consider the following criteria:

● expertise, experience and background
● skills and knowledge
● influence within the respective community
● access to representative data sources
● technology-business balance
● potential of exploiting and disseminating project results
● C-level of the contact in the company, or role in the project/institution
● commitment and interest to the project
● recommendation from the representative societal consortium member

4 Groundwire’s 10 Rules of Engagement, Available at: http://www.salesforcefoundation.org/groundwires-10-rules-of-engagement/

4.3 Multi-Channel Engagement

We will consider multiple channels to engage with the communities and individual stakeholders. We group these channels into three categories: the planned stakeholder workshops, asynchronous communication channels and synchronous communication channels. More details are provided in the subsections below.

4.3.1 Community Stakeholder Workshops

Stakeholder engagement will be focussed around the yearly workshops, with at least one organised per year for each of the seven SC communities. As per the DoW, in the first round (Year 1) the workshops’ main objective will be to elicit requirements for the big data integrator platform. In the second and third years, the focus will be on reviewing the architecture for prototype implementation, and on platform evaluation and showcasing, respectively. To facilitate the organisation of the resulting (at least) 21 workshops and ensure consistency in results across all SCs, we have established a workshop blueprint, described below.

Note: The below blueprint is optimised for the first round of workshops in 2015, in particular, the structure and agenda. The blueprint will be updated for the second and third rounds (2016, 2017) accordingly and updates reported in the interim WP2 deliverables.

Societal Challenge Stakeholder Workshop ‘Blueprint’

General Guidelines

● A 1-day workshop (spread over multiple days if necessary) will be organised:
  ○ (preferably) in the course of/collocated with a specific event, or
  ○ as an independently organised event

● In either case, a crucial requirement is the high likelihood of attendance by EC representatives (including Units and DGs), together with a good balance of influential stakeholders in the SC sector.

● We will strive to target all areas and sub-domains of each SC in terms of coverage and stakeholder attendance, even if we will then only focus on some of them (as per DoW).

Consortium Workshop Team

● At least 1 person from the respective SC domain representative
● At least 1 person from the respective SC technical representative
● At least 1 local organiser providing organisational support
● At least 1 extra person for noting minutes and providing additional organisational support

Participants

We envisage between 10 and 30 participants for each workshop, depending on whether they will need to travel specifically for the event or will be attending a main collocated event. Attendees will include individuals from:

● Research & Innovation (H2020 projects)
● Academia
● EC representatives
● EU entities
● Industry (large companies, startups, SMEs)
● Public administration/Policy Makers
● Networking/Lobbying/Associations/Societal initiatives
● International Organisations
● Data community - Open Data & Data Science
● Other relevant stakeholders

Other considerations

● Gender balance

● Regional balance (EU28)

Workshop Structure

● Workshop can be managed within 4 to 6 hours.
● Workshops can consist of up to four interactive sessions, each with a maximum of 3 input talks, or a panel discussion, as follows:
  ○ BDE and existing data-centric initiatives in the SC
  ○ Existing and potential Big Data use-cases and applications in the SC
  ○ Technical requirements for a Big Data Platform in the SC
  ○ Industrial session can be organised in parallel, or back-to-back, to collect feedback on EU policy

Draft Workshop Agenda

● Welcome & Introduction, 1-2 hours
  ○ Tour de Table
    ■ Name and affiliation
    ■ Role in organisation
    ■ Connection to big data & data management
    ■ Expectations for the workshop (what to take home)
  ○ Introductory Talk
    ■ BigDataEurope
    ■ Big Data in a nutshell
    ■ (To be extended per each SC, as relevant)

● Break, 0.5 hours

● Interactive Sessions (as per structure above), 3 hours

● Session 1
  ○ [15’ Input] Data-centric initiatives in the SC
  ○ [20’ Interactive] Stories and persona

● Session 2
  ○ [15’ Input] Big Data use-cases in the SC
  ○ [15’ Interactive] Pilots

● Session 3
  ○ [15’ Input] Technologies and tools used and envisaged
  ○ [20’ Interactive] Data requirements
  ○ [20’ Interactive] Technology requirements

● Session 4
  ○ [10’ Input] Industrial session/ EU policy requirements
  ○ [10’ Input] Legal issues around (big) data, Governance, Data portability
  ○ [20’ Interactive] Other requirements

● Summary, outreach & farewell, 0.5 hours
  ○ Summary of results
  ○ Give participants clear picture of the workshop’s outcomes
  ○ Q&A session
  ○ Closing note, outreach plans and farewell
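As a quick plausibility check of the draft agenda, the listed durations can be added up. The Python sketch below is illustrative only; the dictionary and function names are hypothetical, and the figures are simply those stated in the agenda above.

```python
# Agenda blocks in hours as (minimum, maximum), taken from the draft agenda.
AGENDA_HOURS = {
    "Welcome & Introduction": (1.0, 2.0),
    "Break": (0.5, 0.5),
    "Interactive Sessions": (3.0, 3.0),
    "Summary, outreach & farewell": (0.5, 0.5),
}

# Slot lengths in minutes for each interactive session, as listed above.
SESSION_MINUTES = {
    "Session 1": [15, 20],
    "Session 2": [15, 15],
    "Session 3": [15, 20, 20],
    "Session 4": [10, 10, 20],
}


def total_hours(agenda):
    """Return the (shortest, longest) total workshop duration in hours."""
    shortest = sum(low for low, _ in agenda.values())
    longest = sum(high for _, high in agenda.values())
    return shortest, longest


def scheduled_session_minutes(sessions):
    """Return the sum of all explicitly scheduled session slots in minutes."""
    return sum(sum(slots) for slots in sessions.values())


shortest, longest = total_hours(AGENDA_HOURS)
scheduled = scheduled_session_minutes(SESSION_MINUTES)
unallocated = int(AGENDA_HOURS["Interactive Sessions"][0] * 60) - scheduled
print(f"Workshop length: {shortest}-{longest} hours")
print(f"Session slots: {scheduled} min, unallocated: {unallocated} min")
```

With these figures the agenda totals between 5 and 6 hours, and the session slots add up to 160 minutes, which leaves 20 minutes of the 3-hour interactive block unallocated.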

In the following table, we include the current plans for the first series of workshops (Year 1):

Table 2: Workshop Plans, Year 1

SC | Collocation Event (optional) | Audience | Targeted Date (2015) | Planned venue | Possible anchor invitees/keynotes (excluding BDE consortium) | BDE contact

SC1 Health | n/a | EU Directorate General-focussed meeting | 21.05 | Brussels | T.b.d. | OpenPhacts

SC2 Food | Research Data Alliance (RDA), Agricultural Data Interest Group meeting | Agriculture, food & environment data managers and scientists; also participants from other disciplines related to our SC. | t.b.c. Main event date: 23-25.09 | Paris | Odile Hologne, INRA, France. Sander Janssen, Alterra WUR, The Netherlands | FAO, Agro-Know

SC3 Energy | EU Sustainable Energy week: Policy Conference | Utilities/operators, TSOs, ENTSO-e, DSOs, EDSO for smart grids, ENTSOG, manufacturers, Smart-Grids.eu platform, institutions/associations, EC officers in related units (JRC, DG R&I G2/G3. INEA, DG Connect G3/H5, DG Energy), EERA | 15.06 Main event date: 16-18.06 | Brussels | T.b.d. | CRES

SC4 Transport - technical | Intelligent Transport Systems World Congress | All aspects of transport stakeholders - technical | t.b.c. Main event date: 5-9.10 | Bordeaux | Dr. Fastenrath, BMW | Fraunhofer/ERTICO

SC4ii Transport - policy | EC Policy Seminar | Policy-oriented meeting | 19.05 | Brussels | T.b.d. | Fraunhofer/ERTICO

SC5 Climate | JPI CLIMATE workshop | Representatives of National Meteorological Services, ECMWF, Organisations supporting IPCC, Research Centres for Climate | 16.06 Main event date: 16-17.06 | Brussels | T.b.d. | NSCR-D

SC6 Societies | (No event, but GESIS are ideal hosts as sectoral leaders) | Officials from Eurostat, national official statistics departments, from the Social Science and Humanities Data Archives, in particular those who are already working on big data investigation projects or who have expressed an interest in response to our questionnaire. Plus representatives from DG Research and Innovation. | t.b.c 09-10 | Cologne (location of hosts) | Alexia Katsandiou, GESIS/CESSDA Training; Kamel Gadouche, CASD, Paris; Jane Naylor, Office for National Statistics, UK | CESSDA

SC7 Security | SatCen Technical Working Group | EU Agencies (EDA, EMSA, FRONTEX), EEAS (including EUMS, CPCC), EU Ministries of Defence and Foreign Affairs, ESA, JRC, EARSC, EUROSPACE. | 22.10 Main event date: 22.10 | Madrid | EU SatCen Director | SatCen

4.3.2 Community-Based Requirements Elicitation (T2.2)

Within the workshops, participants will be guided through interactive sessions to gather input and robust quantitative and qualitative material that feeds the Requirement Elicitation (RE) and, further on, drives the Requirement Specification (RS).

4.3.2.1 Requirement Elicitation Matrix

These interactions are laid out using several methods, based on a matrix of the questions to be asked, the stakeholders involved and the elements of the RE. The following matrix (table), which will be filled with concrete questions, will be the basis for the RE process.

Table 3: Requirement Elicitation Matrix

Elements of the RE model | Business | Strategical | Technical | Domain experts
(Columns 2-5: questions to people within the specific Societal Challenge, grouped by type of interviewee)

Stories | Question | Question | Question | Question
In this element, stories which describe the current status and future development are asked | Question | Question | Question | Question

Personas | Question | Question | Question | Question
In this element, typical personas which play a role are described | Question | Question | Question | Question

Pilots | Question | Question | Question | Question
In this element, the special facts of the pilots are described | Question | Question | Question | Question

Functional requirements | Question | Question | Question | Question
In this element, the functional requirements to our specific solution are described | Question | Question | Question | Question

Non-functional requirements | Question | Question | Question | Question
In this element, the non-functional requirements to our specific solution are described | Question | Question | Question | Question

Evaluation | Question | Question | Question | Question
In this element, we ask for the results which are crucial to be achieved (quantitative and qualitative) | Question | Question | Question | Question

Other | Question | Question | Question | Question
 | Question | Question | Question | Question
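The matrix in Table 3 is essentially a two-dimensional lookup from RE element and interviewee type to a list of questions. A minimal Python sketch of such a structure is given below; the function names are hypothetical, and the sketch says nothing about how the consortium actually stores the matrix.

```python
# Rows and columns as named in Table 3.
RE_ELEMENTS = [
    "Stories",
    "Personas",
    "Pilots",
    "Functional requirements",
    "Non-functional requirements",
    "Evaluation",
    "Other",
]
INTERVIEWEE_TYPES = ["Business", "Strategical", "Technical", "Domain experts"]


def empty_matrix():
    """Create a matrix with an empty question list in every cell."""
    return {
        element: {kind: [] for kind in INTERVIEWEE_TYPES}
        for element in RE_ELEMENTS
    }


def add_question(matrix, element, kind, question):
    """Add a concrete question to one cell; unknown names raise KeyError."""
    matrix[element][kind].append(question)


def open_cells(matrix):
    """List the (element, interviewee type) cells that still lack questions."""
    return [
        (element, kind)
        for element, row in matrix.items()
        for kind, questions in row.items()
        if not questions
    ]
```

A helper such as `open_cells` makes it easy to see which parts of the matrix still need concrete questions before a workshop.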

4.3.2.2 Interactions

To ensure consistent results when collecting data for the RE, we are building an interconnected question matrix (see above). To choose the form of collection appropriate to the stakeholders, we vary the interaction according to the dialogue partner. At the moment, we build on the following forms of interaction:

4.3.2.2.1 Affinity diagram / K-J Method⁵

The affinity diagram organizes input using the following steps:

1. Choose the questions relevant to the stakeholder group and put them on a screen. If a workshop involves mixed stakeholder groups, provide an enlarged set of questions and relate the answers to the stakeholders, using colours to mark categories.
2. Invite participants to pin their cards to the respective question and comment on them.
3. Group or cluster the cards (using feedback from the participants).
4. Conclude the results and give space to comment (note the comments on cards and pin them to the board).
5. Leave unresolved questions open.
6. Document the board (take a photo, for example) and circulate it to the participants.

Limit the number of answers by limiting the cards per participant, but inform participants that they have to budget their cards and spread answers over various questions.

In many cases, the best results tend to be achieved when the activity is completed by a cross-functional team.
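For workshops that record the cards digitally, the bookkeeping part of the method (the card limit per participant and the pinning of cards to questions by stakeholder category) could be sketched as follows. This is a hypothetical Python illustration; the `Card` fields and function names are assumptions, and the clustering in step 3 remains a manual, discussion-based activity that the sketch does not attempt to automate.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    participant: str
    stakeholder_group: str  # plays the role of the colour category
    question: str
    text: str


def accept_cards(cards, max_per_participant):
    """Keep cards in submission order, up to the limit for each participant."""
    used = Counter()
    accepted = []
    for card in cards:
        if used[card.participant] < max_per_participant:
            used[card.participant] += 1
            accepted.append(card)
    return accepted


def pin_to_questions(cards):
    """Build a board: question -> stakeholder group -> list of card texts."""
    board = defaultdict(lambda: defaultdict(list))
    for card in cards:
        board[card.question][card.stakeholder_group].append(card.text)
    return board
```

The nested board structure keeps answers from mixed stakeholder groups separable per question, mirroring the use of colours on a physical board.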

4.3.2.2.2 Round Table / World Cafe

Small groups of four or five participants sit around a table and discuss a set of questions for a structured amount of time. Notes and drawings are often made by participants on the paper tablecloths used in most events. Individuals switch tables after the agreed upon amount of time, where (if they are being used) a "table host" at the new table briefly welcomes people and fills them in on highlights of the previous discussion.

5 http://www.pmhut.com/affinity-diagram-kawakita-jiro-or-kj-method

Participants have multiple rounds of conversation in response to defined questions, taking the ideas from one group and adding to them, developing insights through multiple conversations with a diverse range of people.

Tables can be structured by stakeholder group or by RE element - but not mixed.

A "table host" may be used to anchor each table, welcoming incoming participants and relaying any key insights from the last round of conversation. The table host also has the duty to formulate the outcome of the conversation series, answering the RE matrix for their table.

4.3.2.2.3 F2F Interview

In standardized open-ended interviews, the same open-ended questions are asked to all interviewees. As questions and answers are not strictly formalized, the interview is the most extensive exercise to get answers for the RE.

This technique is appropriate for exclusive or prominent interviewees, as they are used to receiving individual attention, which an F2F interview can provide.

The interviewer has to distil answers from notes or audio recordings after the interview (ideally soon afterwards), together with a general conclusion.

4.3.2.2.4 Plenary Session

A plenary session can be chosen as an open interactive format. After the questions/topics have been presented to the participants (possibly also beforehand via e-mail), the floor will be opened for inputs. Moderation, a speakers’ list and time discipline are crucial to get outcomes from a plenary session.

A minute-taker has to write precise minutes, which make it possible to derive qualitative findings from the plenary statements and discussion.

4.3.3 Asynchronous Engagement

This form of engagement includes general project progress communications, workshop announcements and requests for expressions of interest, as well as direct requests for information and discussions over regular emails, through the project Web site⁶, as well as over project blog-posts and other social media accounts. In particular, ERCIM will support the setting up and maintenance of the seven dedicated W3C SC-oriented Community Groups that can access the global community. The infrastructure provides three mailing lists:

● a publicly archived list writable by group members which is the primary list for discussion of issues;
● a publicly archived, publicly writable list that allows non-group members to submit comments;
● a member-only list for administrative messages (in practice this list is rarely used).

Community Groups are entitled to publish reports and other documents that, like all W3C documents, are highly stable and citable, and these can sometimes be seen as ‘pre-standardisation’ documents. The Community Group behind Open Digital Rights Language, ODRL, is a prime example of this⁷. There are other optional elements of the infrastructure, notably a wiki, but it is expected that the Big Data Europe CGs will use the project’s existing infrastructure and website.

6 http://www.big-data-europe.eu/
7 https://www.w3.org/community/odrl/

It’s important to note that participation in W3C Community Groups requires participants to agree to the contributor licence⁸, the primary aim of which is to ensure that the output of the group is royalty free.

In the table below, we outline current plans for this form of engagement, clearly identifying the target stakeholders (refer to Section 4.2) and also providing metrics for determining whether we are reaching our goals.

Table 4: Channels and Goals for Asynchronous Engagement

Channels | Targeted Engagement Level Group⁹ | Goal (indicative averages throughout project lifetime)

Website | All | 100 views/day
Blog (through Web site) | All | 250 shares/month; 80 blog posts/year
Emails (Direct messaging) | Endorsers, Contributors, Owners | 1 initial invitation; follow-ups, direct communication
Newsletter - General Mail. List (MailChimp) | All | 1000 communications/month
W3C Interest Groups & Fora | Followers, Endorsers, Contributors, Owners | 7 SC Interest Groups; 20 active members per SC group; 200 other members (followers) per SC group
Emails - W3C Mailing Lists | | 1 per SC
Twitter account (General) | All | 20 shares/day
SlideShare account (General) | All | 1000 views/month
LinkedIn profile (General) | All | 500 visits/month; 1000 members

4.3.4 Synchronous Engagement

In parallel to holding the workshops and sustaining continuous asynchronous engagement, the consortium also plans the following activities. If the identified stakeholders lack the appropriate knowledge about issues to be consulted upon, proper steps need to be followed to provide them with the required information prior to the start of the discussions, interviews and consultations. Below, we also indicate the targeted engagement level groups (refer to Section 4.2) and metrics for determining the success rate.

8 https://www.w3.org/community/about/agreements/cla/
9 Leaders are by default always included since they are part of the consortium.

Table 5: Channels and Goals for Synchronous Engagement

Channels | Targeted Engagement Level Group¹⁰ | Goal

Webinars/Training Sessions on platform, components and applications | All | 30-35
Wiki (W3C) | All | 1 per SC
Interest group calls/Hangouts | Followers, Endorsers, Contributors, Owners | 1 per month per SC
Interest group face-to-face workshops | Endorsers, Contributors, Owners | 1-2 per year per SC
Interviews | Contributors, Owners | 6 per SC
Presence at Big Data events & conferences | Leaders | 20 per year

Since the hangouts organised for members of the SC Domain Interest Groups are crucial forms of direct engagement with a high impact on the project’s success, we here also provide a blueprint for the organisation of such sessions.

Societal Challenge Stakeholder Hangout ‘Blueprint’

General Guidelines

Between the workshops, each interest group will organize regular video hangouts, where the progress of the project is presented and discussed with the societal stakeholders. These hangouts will be recorded and made available online.

Consortium Workshop Team & Participants

● At least 1 person from the respective SC domain representative
● At least 1 person from the respective SC technical representative
● A 3rd person for noting minutes and providing organisational support
● We envisage between 15 and 100 participants for each hangout, with a similar distribution to workshop attendees (refer to Workshop blueprint)

Hangout Structure & Agenda

● Hangouts can be managed from 40 minutes to maximum 1 hour
● Hangouts can consist of thematic inputs, as follows:
  ○ BDE and existing Data-centric initiatives in the SC
  ○ Existing and potential Big Data use-cases and applications in the SC
  ○ Technical Requirements for a Big Data Platform in the SC
  ○ (Only for relevant SCs) Industrial session can be organised in parallel, or back-to-back, to collect feedback on EU policy

10 Leaders are by default always included since they are part of the consortium.

Draft Hangout Agenda

● Welcome & introduction
  ○ About the hangout (aim, agenda, frequency)
  ○ Introduction round (one sentence per person, via chat)
● Thematic input talk (see above)
  ○ relevant to the development of BD in the SC
● Lightning talks
  ○ max. 5 minutes each; call for these in advance, in the invitation to the hangout
● Conclusion and call for the next hangout (which has to be scheduled already)

5. Current and Future Contacts

Given the intent to invite a number of stakeholders already to the public project launch (where we have successfully attracted 45 stakeholders and 7 EC representatives), we started the stakeholder collection process very early in year 1. The exercise was in part delegated to the 7 SC consortium representatives, who compiled a list of stakeholders in their extended networks, and in part supported by all other consortium members and directed by the project officer.

For the stakeholder compilation exercise, a balance between technical and business stakeholders is being targeted. Industrial contacts and large business players, as well as startups and SMEs across the data economy, are actively targeted. A yardstick is to aim for a 1:1 ratio of technical to business stakeholders; however, this will vary considerably depending on the SC, with some being more technical- or business-oriented.

In the table below, we indicate the number of stakeholders that have already been contacted by the submission date of this report, and the number of stakeholders targeted (to be contacted) by the end of the first year, across the seven solid and diverse stakeholder communities. The aim is to implement the engagement strategy and operate the W3C groups, plan the next workshops and carry out regular events as indicated in Section 4.3. Note that some stakeholders can be categorised under more than one SC, and therefore the ‘contacted’ figures may contain some duplicates. In addition, around 50 as yet unclassified stakeholders (i.e., those not yet assigned to one specific SC sector) are omitted from the contacted counts shown. The stakeholders are also categorised by type, including: Startups/SMEs, Industry (large corporations/companies), Public Administration/Policy (Governmental agencies, Intra-governmental networks, policy makers), EC Representatives (Directorates-General, Units, EU Entities), Academia (Universities, Research Associations, Centres and Institutions), Networking/Association/Initiatives (Networking/Lobbying Groups, Industrial Associations, Societal Initiatives), Projects (Running FP7/H2020 projects), International Organisations and others.

Table 6: Contacted Stakeholder Count and Future Target Count

Societal Challenge | Startup/SME | Industry | Public Admin./Policy | EC Reps. | Academia | Networking/Associations/Initiatives | Organisations | Projects | Other

HEALTH
Contacted | 2 | 2 | 1 | 6 | 8 | 3 | 0 | 6 | -
Target | 100 | 50 | 30 | 10 | 100 | 30 | 20 | 20 | -

FOOD
Contacted | 1 | 0 | 4 | 4 | 13 | 5 | 5 | 1 | 5
Target | 70 | 40 | 20 | 10 | 80 | 30 | 20 | 20 | -

ENERGY
Contacted | 0 | 0 | 0 | 23 | 3 | 13 | 2 | 0 | 0
Target | 20 | 50 | 20 | 10 | 100 | 30 | 20 | 20 | -

TRANSPORT
Contacted | 1 | 19 | 3 | 2 | 10 | 1 | 0 | 2 | -
Target | 100 | 50 | 30 | 10 | 90 | 30 | 15 | 20 | -

CLIMATE
Contacted | 0 | 0 | 1 | 5 | 3 | 1 | 8 | 14 | 10
Target | 15 | 10 | 20 | 10 | 120 | 30 | 20 | 20 | -

SOCIETIES
Contacted | 0 | 0 | 16 | 5 | 1 | 1 | 1 | 0 | 0
Target | 10 | 5 | 30 | 10 | 80 | 30 | 20 | 20 | -

SECURITY
Contacted | 7 | 8 | 0 | 7 | 7 | 0 | 3 | 2 | 1
Target | 10 | 5 | 30 | 10 | 100 | 30 | 20 | 20 | -
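To track progress against the Year 1 targets, the rows of Table 6 can be totalled per Societal Challenge. The Python sketch below is a hypothetical illustration: the variable and function names are assumptions, cells shown as "-" in the table are treated as missing and skipped, and, as noted above, contacted counts may include duplicates across SCs, so the totals are indicative only.

```python
# Rows of Table 6; column order follows the table header, None stands for "-".
TABLE_6 = {
    "Health":    {"contacted": [2, 2, 1, 6, 8, 3, 0, 6, None],
                  "target":    [100, 50, 30, 10, 100, 30, 20, 20, None]},
    "Food":      {"contacted": [1, 0, 4, 4, 13, 5, 5, 1, 5],
                  "target":    [70, 40, 20, 10, 80, 30, 20, 20, None]},
    "Energy":    {"contacted": [0, 0, 0, 23, 3, 13, 2, 0, 0],
                  "target":    [20, 50, 20, 10, 100, 30, 20, 20, None]},
    "Transport": {"contacted": [1, 19, 3, 2, 10, 1, 0, 2, None],
                  "target":    [100, 50, 30, 10, 90, 30, 15, 20, None]},
    "Climate":   {"contacted": [0, 0, 1, 5, 3, 1, 8, 14, 10],
                  "target":    [15, 10, 20, 10, 120, 30, 20, 20, None]},
    "Societies": {"contacted": [0, 0, 16, 5, 1, 1, 1, 0, 0],
                  "target":    [10, 5, 30, 10, 80, 30, 20, 20, None]},
    "Security":  {"contacted": [7, 8, 0, 7, 7, 0, 3, 2, 1],
                  "target":    [10, 5, 30, 10, 100, 30, 20, 20, None]},
}


def row_total(values):
    """Sum a table row, skipping cells that were left empty ("-")."""
    return sum(value for value in values if value is not None)


def progress(table):
    """Return {SC: (contacted total, target total)} for every SC."""
    return {
        sc: (row_total(rows["contacted"]), row_total(rows["target"]))
        for sc, rows in table.items()
    }


for sc, (contacted, target) in progress(TABLE_6).items():
    print(f"{sc}: {contacted} of {target} contacted ({contacted / target:.0%})")
```

Because the "Other" column has no target, its contacted figures count towards the contacted total but not the target total; a stricter comparison could exclude that column on both sides.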

6. Summary

In this deliverable we outline the Community Building and Stakeholder Engagement strategy for the BigDataEurope project. The efforts described have already started. In particular, after describing the characteristics of the seven targeted individual sectors (corresponding to the seven societal challenges), we present our methodology for stakeholder identification and engagement. In order to form and maintain the resulting communities, we show the different levels of the planned engagement, including details about the 21 (or more) foreseen workshops, the interest groups to be set up and the various additional activities planned. The WP2 efforts of the coming months (Year 1) will focus on stakeholder identification (Task 2.1) in order to attain a widespread community base. After that, the efforts will shift to maintaining the appropriate engagement levels and carrying out the planned activities in view of the requirements elicitation process (Task 2.2), the results of which (reported in two subsequent versions of Deliverable 2.3, M6, M10) are a prerequisite for the architecture design, implementation and instantiation activities in WPs 3-5. Three follow-ups on the community building activities described in this report will be provided, one at the end of each year (Deliverable 2.2, M12, M24, M36).

7. References

1. Boyd, D. et al. (2012). Critical questions for Big Data. Information, Communication & Society, Vol 15, Issue 5, p. 662-679.
2. Kitchin, R. (2014). Big Data, new epistemologies and paradigm shifts. Big Data & Society, Sage Publications, April-June issue, p. 1-12.
3. Wuchty, S. et al. (2007). The Increasing Dominance of Teams in Production of Knowledge. Science 316, p. 1036-1039. Accessible at: www.sciencemag.org
4. External advice and societal engagement: Towards the 2016 and 2017 work programme of "Inclusive, Innovative and Reflective Societies" of Horizon 2020. European Commission DG RTD, 2014. p.1-50.

8. Appendix

8.1 Stakeholder Invitation Letter