Researchers’ needs for transnational access to confidential microdata
Survey preliminary results

Marie Cros (1), Frédérique Cornuau (1), Roxane Silberman (2)
(1) Université Lille 1; (2) CNRS, Réseau Quetelet

DwB workshop, NTTS conference, satellite event, Brussels, 4 March 2013

Why survey researchers?

• DwB objective: enhancing transnational access to official microdata, particularly confidential microdata
• Increasing number of RDCs offering remote access at national level
• Access to confidential microdata also increasingly available to non-resident researchers, though in many cases “on site”
• Building a European Remote Access Network (ERAN) would improve the situation
  - No (or less) travelling required
  - Would also allow research teams located in different sites/countries to work together and to conduct comparative projects
• Crucial to understand
  - How do the different solutions used by RDCs impact the researchers?
  - What would the researchers require from an ERAN?

Survey perimeter and bases for the survey

• Users of 5 European RDCs offering remote access
• CASD/GENES, CBS, IAB, ONS, SDS (DwB partners)
• Bases for the survey:
  - DwB survey on the different remote access solutions (D4.1)
  - Previous analysis of the different phases a researcher has to go through for a project
  - Preliminary discussions with researchers involved in an international project/team and DwB/RDCs experiences

Structure and rationale for the survey

• Researchers and research project characteristics that may impact their needs
• Selected phases of a research project
• For each phase, selected components of remote access solutions to ensure security that may impact the researchers
• Researchers’ expectations

Structure and rationale for the survey

1. Researcher’s profile

• Assumption that researchers’ needs and the way they are impacted by various solutions may vary depending on:
  - Discipline, type of institution and context of the institution
    - Differences in type of outputs needed, complexity of analysis, way of working, support from colleagues, support from local IT …
  - Structure of research team and organisation of work in research teams
    - Work alone or not
    - Work with colleagues located in other places
  - Previous experience of remote access
  - Type of datasets used
    - Business, households, persons: different issues for output anonymisation
    - More or less large data files: storage capacity

Structure and rationale for the survey

Selection of phases of a research project

• We selected 4 phases out of the 8 phases identified, focusing on phases the RDCs are involved in
  - Information phase
  - Accreditation phase
  - Access phase
  - Data phase
  - Support phase
  - Output checking phase
  - Closure phase
• Although we did not investigate the information and accreditation phases (DwB WP8 and 3), it must be underlined that most researchers spontaneously complained about the time needed before they could access the data

Structure and rationale for the survey

Selection of the components of the security solutions

• All RDCs providing remote access aim at guaranteeing security, though with different solutions for the different components of security (DwB D4.1)
• Different technical solutions as well as different interpretations of and requirements for security
• We divided these components into 2 subsets based on their possible impact (neutral or non-neutral) on researchers’ performance, depending on the type of solution adopted

Access phase

Components that may impact the researchers

• Access from where and secure environment (physically)
• Regular PC or thin client
• Which operating system (OS) is supported for the user’s workstation
• Installations that have to be done on the user’s side of the connection
• Researcher authentication

Access phase

Component: place for access

• The requirements concerning the place from which access is possible (researcher's office, locked room, dedicated space in researcher’s institution, space only in national institute …) and the security regulations for such a place
• May impact the researchers to various degrees, depending on the nature and intensity of these constraints and on the support the researchers may have from their institution
  - Researcher’s office: more friendly
  - Dedicated space in researcher’s institution: not available in all institutions
  - Only in RDCs in NSI: may need travelling inside the country

Access phase

Starting point for the connection

• Own computer (regular PC) vs dedicated equipment (thin client)
• Regular PC more friendly
• Yet secure access requires some installation on the user’s workstation
• Different problems may arise depending on the support needed from external IT staff or the local IT, and on costs
• Fewer problems may be expected with a thin client solution
• Yet the researcher has to deal with a solution that differs from his or her routine way of working

Data phase: “Work with data”

Components that may impact the researcher’s work

• Different methods for authentication
  - Smartcard, login/password, biometric
  - Frequency of authentication, methods of diverse levels of complexity, and success of the authentication may impact the researcher
• How researchers organize their work
• Constraints when travel to a specific place is required or if fees are high
  - Need to concentrate/shorten the work
• Working with other researchers located in different places/countries?
• Combining datasets from different RDCs?

Support phase

Components that may impact the researcher’s work

• Available data, metadata and support from the RDC team
• Available software for the researcher
• User surveillance
• Upload of data into the user’s workspace on the data provider’s server

Output checking phase

• A major issue for both sides, RDCs and researchers
• Similar principles for output checking, yet differences in procedures as well as SDC rules
• More or less time consuming
• Assumption that impact may depend on:
  - Checking intermediary outputs or final outputs
  - Disciplines (if more descriptive and detailed outputs, such as demography, urban sociology)

In addition we investigated …

• Language issues
• Anticipated time
• Fees issues
• And
  - Many questions allowed free comments
  - Final questions about general feelings about their experience and expectations

Survey administration

• Web-based survey (a few paper-based)
• Sent by the RDCs to researchers who experienced existing remote access solutions in France, Germany, the Netherlands and the United Kingdom
• Researchers completed the questionnaire anonymously and submitted it directly online to the survey design team
• 90 researchers completed the questionnaire
• No feedback yet from the RDCs about how many researchers received the questionnaire
  - Not a “representative” survey, yet providing useful information

Some preliminary results

Not all issues

Respondents’ profile (n=90)

Institution
  Public university          55%
  Public research center     27%
  Private research center     9%
  Other                       7%

% economists
  CASD   65%
  VML    54%
  IAB    93%
  SDS    88%
  CBS    25% (n=8)
  ALL    73%
Also: geography, sociology, health …

Remote access system described in survey
  CASD   28%
  ONS    14%
  FDZ    16%
  SDS    33%
  CBS     9%

Who do researchers work with? *
  Alone                                                     17%
  With other researchers from the same institution          57%
  With other researchers from other national institutions   22%
  With other researchers from other countries                8%
* Multiple answers possible

Not surprisingly, most researchers used the remote access solution of their own country, for national (non-comparative) projects

Points of access

• A majority of researchers could access data from their own institution. Researchers who could not had to go to accredited points of access.

Points of access
  Researcher’s own institution                      71%
  Data centre of a National Statistical Institute   22%
  Another research institution                       6%

Equipment
  Dedicated equipment   73%
  Own computer          27%

Comments from researchers who had to go to special points of access in their country outside their institution

• Material conditions: travelling, time-consuming and costly
  “Pre-booking was needed, Needed time and other resources to travel.”
  “Coming to the location to use data is expensive, time-consuming, and leads to inaccurate research (inability to check things later).”
  “Time lost travelling to and from the data centre. Inefficiency in having to spend solid chunks of time out of the office.”
• Work organisation constraints
  “(…) not all info (books, papers) around, no chance to ask colleagues for help,....”
  “(…) need to be organized and prepared, you don't have all your files, literature, software, less spontaneous - in particular difficult, because you never know what to find in the data, need time to get to know the data etc.”
  “It’s not very problematic but annoying that you don't have the usual tools and settings that are there in your own office”
  - Only possible during office hours

Comments from researchers working on own computer or dedicated equipment

• Own computer
  - Most comments: “Friendly”
  - A few had some installation problems
• Dedicated equipment
  - No installation problems
• Yet
  - “Personal files not available”
  - “Time consuming, need to be organized and prepared”
  - “Because we could not go back and forth easily between the drafting and empirical work on the data”
  - “Limited choice of software”
• And more problems when dedicated equipment is combined with a place of access outside the researcher’s institution

Authentication

• 93% of researchers approved of the type of authentication required and 86% of the frequency with which they had to log in.
• Success of authentication procedures: opinions are more mixed. For 60% it worked each time, but others encountered problems, with differences between the RDCs

Researchers’ comments on authentication

“Authentication was no problem for me. But for one of my colleagues on the project it would fail more than 50% of the time. It seems that his skin was too thin...!”

“It was stressing as if it was not working for a certain number of time in a row I would have had to change the authentication card and it was time consuming.”

“Passwords frequently expired and could only be reset by an assistant, who was not always available.”

“There were problems of authentication due to the sensitivity of the fingerprint sensor.”

“Difficult at the very beginning, but it was solve.”

Researchers’ organisation of work

• A large majority of researchers used remote access over a long period (several months); this was mainly a choice
• A minority worked over a shorter period and on a daily basis. Most of them (2/3) indicated that this organisation was a constraint
• Constraints linked to:
  - monthly costs
  - location of the point of access and need to travel (time and costs)
  - organisation issues (need time to work on other projects, teaching, other activities…)
• For those working with researchers from other institutions in the country or outside the country

Checking of outputs: what about delays?

Delays are generally a few days but can range from a few minutes to a few days, depending on:
- the RDC
- whether intermediate outputs are checked or only final outputs
- complexity of analysis

73% of researchers happy with delays for output checking
• Understanding / positive comments
  “Ideally it would be quicker of course, but I understand the labour constraints.”
  “A few years ago output checking took several days. The service is improved, and takes 1 day nowadays”
  “Understand it is a trade off”

Researchers with previous, different experiences are happier

Yet negative comments on delays and on the way output checking works

“Such delay is not convenient for quality research - such delay is not acceptable for a paid service - we are not kept informed about why this takes sometimes so long and what is checked”

“It’s expensive as we have very limited research resources”

“RAs are generally not experienced enough to judge that output is correct.”

“I understand that it is time consuming but (…) waiting just to learn that the program crashed is very frustrating.”

Output restrictions

• ¾ of researchers did not experience annoying restrictions on outputs.

• Several comments show that they understand that these restrictions have to apply and they adapt their work to these constraints. Kind of « learning process ».

• However…
  - Objections about methodology when checking also applies to some intermediary outputs; researchers feel it should apply only to final outputs
  - Some restrictions are judged excessive

Researchers’ comments on outputs restrictions

• Frequencies in cells
  “Outputs on max, mean, min were not allowed as they could breach confidentiality, which I felt was OK”.
  “Some cells in output contains less than 10 persons. This is forbidden, even if there is no risk of violating confidentiality”
  “Unable to report summary statistics (maxima and minima) and scatterplots because of (excessive?) ‘confidentiality requirements’”
  “which rarely really an indicated actual possibility to identify single entire, though, they acted partly too strict”
• Software for outputs, complexity of analysis, methodology
  “We were not allowed to produce and use certain graphics using Stata because the RDC could not check this thoroughly enough”
  “…Because our methodology was not accepted; but this is not what the remote access should be about: they should not check our methodology”
• Delays
  “Because of delays, we left some works”
  “We needed detailed information for mapping and further analysis; this wasn't always possible. Even though we did not intend to publish the data”.
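
To make the cell-frequency comments above concrete, here is a minimal sketch of a minimum-frequency check of the kind respondents describe. It is an illustration only: the threshold of 10 is taken from one respondent's comment, the function and variable names are hypothetical, and the actual SDC rules differ between RDCs and are not described in these slides.

```python
from typing import Dict, Optional

# Assumption: threshold of 10 taken from a respondent's comment;
# each RDC applies its own SDC rules and thresholds.
MIN_CELL_COUNT = 10


def suppress_small_cells(
    table: Dict[str, int], threshold: int = MIN_CELL_COUNT
) -> Dict[str, Optional[int]]:
    """Return a copy of a frequency table with counts below the threshold set to None."""
    return {
        cell: (count if count >= threshold else None)
        for cell, count in table.items()
    }


def is_releasable(table: Dict[str, int], threshold: int = MIN_CELL_COUNT) -> bool:
    """True only if every cell count reaches the threshold."""
    return all(count >= threshold for count in table.values())


if __name__ == "__main__":
    # Hypothetical frequency table (number of persons per cell).
    example = {"region A": 42, "region B": 7, "region C": 10}
    print(suppress_small_cells(example))
    print(is_releasable(example))
```

A check like this only covers primary suppression of small cells. When row or column totals are also released, suppressed cells can sometimes be recovered by subtraction, which is one reason output checking may take longer than such a simple rule suggests.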

Overall comments

• Positive
  “Remote access systems have improved substantially over the last years”
• Negative
  - Timing issue taking into consideration the overall process
    “We work for organizations that always want the result ASAP. It is simply unacceptable for them that you have to wait for sometimes a few months to get a project going and access the files”
    Particularly bureaucracy, accreditation process, time for checking outputs
  - Costs
    “Prices for a small project too high”
    “Because of the slow process the analysis cost about ten times more that expected. Cost increased”
    “Restriction in the number of outputs”

Conclusions

• Preliminary conclusions
  - Remote access friendly
  - Security constraints mostly accepted by researchers
  - Yet interpretation and solutions impact the researchers differently
  - Work organization also problematic, even more so for researchers involved in teams with researchers from other institutions in the country or across borders

Results to be refined and complemented

• Only preliminary results, to be refined
• Only a few researchers involved in multi-institution/multi-country projects/teams
• Get more researchers involved in such multinational projects/teams to complete the online survey
• Further in-depth interviews to be conducted with researchers involved in such teams (important for the Virtual Research Environment for the DwB ERAN project)
• Idea to focus more on how researchers would work together
  - access to microdata for all
  - access to intermediary outputs
  - comparing outputs from analyses conducted in different RDCs
  - combining datasets from different RDCs to run a single analysis

Thanks for Listening

Contact: [email protected]

Website: http://www.dwbproject.org/