surveys in software engineering

Surveys in Software Engineering Rafael Maiani de Mello Marco Torchiano Daniel Méndez Guilherme H. Travassos Further coaches

Upload: daniel-mendez

Post on 28-Jan-2018


Page 1: Surveys in Software Engineering

Surveys in Software Engineering

Rafael Maiani de Mello

Marco Torchiano

Daniel Méndez

Guilherme H. Travassos

Further coaches

Page 2: Surveys in Software Engineering

Ground rules for today

1. Whenever you have a question/remark: share them with the group!

2. Stay flexible

3. There are no further rules :-)

Page 3: Surveys in Software Engineering

Feel free to…

copy, share, and alter,

film and photograph (as long as we look great),

tweet and live blog

today's presentations, given that you attribute them to their author(s) and respect the rights and licenses of their parts.

Page 4: Surveys in Software Engineering

Who are we?

Marco Torchiano, Politecnico di Torino, http://softeng.polito.it/torchiano/

Rafael Maiani de Mello, Pontifical Catholic University of Rio de Janeiro, http://inf.puc-rio.br/~rmaiani

Guilherme Horta Travassos, COPPE, Federal University of Rio de Janeiro, http://www.cos.ufrj.br/~ght

Daniel Méndez, Technical University of Munich, http://www4.in.tum.de/~mendezfe/

Page 5: Surveys in Software Engineering

Who are you?

Quick round:

• Who are you?

• What is your experience in conducting survey research?

• What are your expectations?

Page 6: Surveys in Software Engineering

What do you think?

Why do we need survey research in software engineering?

Page 7: Surveys in Software Engineering

Agenda (Time, Topic)

09:00–11:00 Session I: Introduction to surveys. We will provide the basic theoretical concepts of population surveys: general method, sources of error, sampling, instrument design.

11:00–11:30 Morning break

11:30–13:00 Session II: Best practices. We will focus on the key aspects of designing and conducting software engineering surveys and present observed issues and evidence-based lessons.

13:00–14:30 Lunch

14:30–16:30 Session III: Hands-on (BYOL). The participants are expected to design and implement a simple survey on a real online tool. Bring Your Own Laptop, or a tablet at least.

16:30–17:00 Afternoon break

17:00–18:30 Session IV: Q&A. We will discuss the most important issues with the participants and come up with some general recommendations.

19:30–22:00 Tour Ciudad Real. Reception at the Town Hall of Ciudad Real at Casa-Museo López-Villaseñor

Page 8: Surveys in Software Engineering

Session I

Introduction to Survey Research

Page 9: Surveys in Software Engineering

Big Picture… 1st layer

Philosophy of science

Principle ways of working

Methods and Strategies

Fundamental tools

Epistemology

Empirical methods

Statistics

Hypothesis testing

Case studies

Logic

Examples

Theories

Page 10: Surveys in Software Engineering

ESE relies on every layer!

Philosophy of science

Principle ways of working

Methods and Strategies

Fundamental tools

Setting of Empirical Software Engineering:

• Theory building and evaluation

are supported by

• Methods and Strategies

Analogy: Theoretical and Experimental Physics

Page 11: Surveys in Software Engineering

Big Picture… 2nd layer

Theory / System of theories

(Tentative) Hypotheses

Observations / Evaluations

Study Population

Induction

Pattern Building

Deduction

Falsification / Support

Theory Building

Page 12: Surveys in Software Engineering

Big Picture 3rd layer: Methods

• Each method I can apply…
  • Has a specific purpose
  • Relies on a specific data type

Purposes

• Exploratory
• Descriptive
• Explanatory
• Improving

Data Types

• Qualitative
• Quantitative

Example: Grounded Theory

(Tentative) Hypotheses

Study Population

Qualitative Data

Descriptive, Exploratory, or Explanatory

Page 13: Surveys in Software Engineering

Big Picture 3rd layer: Methods

Further reading: Vessey et al., A unified classification system for research in the computing disciplines

Theory / System of theories

(Tentative) Hypotheses

Observations / Evaluations

Study Population

Pattern Building

Falsification / Support

Theory Building

Methods:

• Formal / Conceptual Analysis
• Grounded Theory
• Confirmatory: Case & Field Studies, Experiments, Simulations
• Survey and Interview Research
• Ethnographic Studies
• Folklore Gathering
• Exploratory: Case & Field Studies, Data Analysis

Page 14: Surveys in Software Engineering

Observational Studies

Survey (Cross-Sectional), Case study, Case-Control

Page 15: Surveys in Software Engineering

Survey

Systematic observational method to gather qualitative and/or quantitative data from (a sample of) entities to characterize information, attitudes and/or behaviors from different groups of subjects regarding an object of study

Descriptive statistics + Analytic statistics

Page 16: Surveys in Software Engineering

Survey process

Planning

• Define research objectives
• Choose collection mode
• Choose sampling frame
• Questionnaire construction and pretest
• Design and select sample
• Recruit and measure
• Data coding and editing
• Post-survey adjustments
• Analysis

Page 17: Surveys in Software Engineering

Survey process

Research question + Characterization of constructs and population

What is the productivity of Java software developers?

Parameter / Construct

Target population

Page 18: Surveys in Software Engineering

Survey process

How many lines of Java code have you written in the last week?

Measurement

Page 19: Surveys in Software Engineering

Survey process

A couple hundreds

Response

Page 20: Surveys in Software Engineering

Survey process

200 LOCs

Edited response

Page 21: Surveys in Software Engineering

Survey process

Research question + Characterization of constructs and population

What is the productivity of Java software developers?

Parameter / Construct

Target population

Page 22: Surveys in Software Engineering

Survey process

Developers working for companies in the region having NACE activity code J 62.0.x

Frame population

Page 23: Surveys in Software Engineering

Survey process

2 developers in each of 100 randomly selected companies

Sample

Page 24: Surveys in Software Engineering

Survey process

[email protected], [email protected]

Respondents

Page 25: Surveys in Software Engineering

Measurement vs. Representation

Measurement: Construct → Measurement → Response → Edited response

Representation: Target population → Frame population → Sample → Respondents → Post-survey adjustments

Instrument

Page 26: Surveys in Software Engineering

Measurement perspective

• Construct: $\mu_i$
• Measurement: $Y_i$
• Response: $y_i$
• Edited response: $y_{ip}$

Page 27: Surveys in Software Engineering

Construct ($\mu_i$)

• Element of information sought by researchers
• Examples
  • How many new jobs created
  • How many incidents of crime with victims
  • Which development tools used
• Formulation
  • Easy to understand
  • Imprecise
  • Abstract

Page 28: Surveys in Software Engineering

Construct

• Level of abstraction / measurement perspective
  • Directly observable
    • E.g. staff for a project
    • A few defined ways to measure
  • Not directly observable
    • E.g. intention to adopt a technology
    • No single well-defined measure

Page 29: Surveys in Software Engineering

Measurement ($Y_i$)

• How to gather information about constructs
• Objective measures
  • Electronic
  • Physical
• Answers to questions
  • Visual (formal questionnaires)
  • Oral (structured or semi-structured interviews)

Page 30: Surveys in Software Engineering

Validity

• Gap between constructs and measurement
• Ideally, the measure is the result of just one among several possible trials
• In practice, the measurement may introduce an error
  • Each trial introduces a different error
• Validity := correlation between $Y$ and $\mu$

$$Y_{it} = \mu_i + \epsilon_{it}$$
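The definition of validity above can be illustrated with a small sketch. The construct values, the error terms, and the helper `pearson` are invented for this example and are not taken from the slides; the only point is that validity, read as the correlation between Y and μ, drops as the error term grows.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical true construct values mu_i for five respondents.
mu = [10.0, 20.0, 30.0, 40.0, 50.0]

# Hypothetical error terms epsilon_it for one trial of two instruments.
small_errors = [1.0, -1.0, 0.5, -0.5, 1.0]
large_errors = [15.0, -20.0, 25.0, -15.0, -10.0]

# Measurement model from the slide: Y_it = mu_i + epsilon_it
y_precise = [m + e for m, e in zip(mu, small_errors)]
y_noisy = [m + e for m, e in zip(mu, large_errors)]

validity_precise = pearson(y_precise, mu)
validity_noisy = pearson(y_noisy, mu)
print(validity_precise > validity_noisy)
```

In this toy data the instrument with small errors correlates almost perfectly with the construct, while the noisy one does not.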

Page 31: Surveys in Software Engineering

Response ($y_i$)

• The actual data collected through the survey
• A question may require
  • Searching one's own memory
  • Accessing records
  • Asking other persons
• Closed questions already contain possible answers
• Sometimes a response is not provided

Page 32: Surveys in Software Engineering

Measurement error

• Gap between the ideal measurement outcome and the response obtained
• Response bias
  • Systematic misreporting
• Reliability
  • Variability over several trials

$$y_i - Y_i$$

Page 33: Surveys in Software Engineering

Edited response ($y_{ip}$)

• Review process before using data
  • Range checks
  • Consistency checks
  • Illegible answers detection
  • Skipped questions
  • Outlier detection
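A minimal sketch of such an editing step, applied to the lines-of-code question used earlier. The function name `edit_response`, the flag names, and the plausibility threshold are assumptions made for illustration, not part of the tutorial material.

```python
def edit_response(raw, max_plausible_loc=10_000):
    """Edit one raw answer to "How many lines of Java code have you
    written in the last week?".

    Returns (value, flags); value is None when the answer cannot be
    used as-is. The threshold is a hypothetical range check."""
    flags = []
    text = raw.strip()
    if not text:
        flags.append("skipped")              # skipped question
        return None, flags
    try:
        value = int(text.replace(",", ""))
    except ValueError:
        flags.append("needs manual coding")  # free text such as "A couple hundreds"
        return None, flags
    if value < 0 or value > max_plausible_loc:
        flags.append("out of range")         # range check
        return None, flags
    return value, flags

print(edit_response("200"))
print(edit_response("A couple hundreds"))
```

Answers flagged for manual coding are where a human coder would turn "A couple hundreds" into an edited value such as 200 LOCs, which is also where processing errors can creep in.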

Page 34: Surveys in Software Engineering

Processing error

• Gap between the variables used in analysis and those provided by the respondent
  • Erroneous outlier identification
  • Coding error

Page 35: Surveys in Software Engineering

Errors – measurement point of view

• Construct → Measurement: validity
• Measurement → Response: measurement error
• Response → Edited response: processing error

Page 36: Surveys in Software Engineering

Representation perspective

• Target population
• Frame population
• Sample
• Respondents
• Post-survey adjustments

Symbols: $\bar{Y}$, $\bar{Y}_C$, $\bar{y}_a$, $\bar{y}_r$

Page 37: Surveys in Software Engineering

Representation perspective

Target

Frame

Sample

Respondents

Page 38: Surveys in Software Engineering

Target population

• The set of units to be studied
• Abstract population definition
• E.g. software projects
  • Time?
  • In software companies only?
  • Italian companies only?
  • Completed or just started projects?

Page 39: Surveys in Software Engineering

Frame population

• All units in the sampling frame constitute the frame population
• In theory
  • The subset of the target population that has a chance to be selected
• In practice
  • A set of units imperfectly linked to the target population members
  • E.g. telephone numbers

Page 40: Surveys in Software Engineering

Framing instrument

• The (conceptual) instrument used to identify the units of study
  • Household phone numbers to get persons
  • Company records to get employees
  • Customer IDs to get customers
  • Social network IDs to get members
• Warning: often the frame elements are indirectly linked to the units of analysis (UoA), through respondents. E.g.
  • UoA: software projects
  • Respondents: developers
  • Frame elements: software companies

Page 41: Surveys in Software Engineering

Coverage error

Target

Frame

Undercoverage

Ineligible units

Covered population

Page 42: Surveys in Software Engineering

Reality check

• During the 2012 US presidential election campaign, federal regulations made polling cell phones more expensive
• As a result, many public polls left cell phone users out of their samples
• Due to the growing popularity of cell phones as the only point of contact for young voters and minorities, pollsters left key constituencies for Obama out of the polls and skewed the numbers towards Romney in some samples

“That's why some polls looked so difficult for the president, because they were under-polling the electorate for the president”

J. Messina (Campaign Manager for Obama)

http://www.politico.com/news/stories/1112/84103.html?hp=l1

Page 43: Surveys in Software Engineering

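Coverage bias

• Two factors
  1. Difference between covered and not covered population
     • $\bar{Y}$: mean of target
     • $\bar{Y}_C$: mean of covered; $\bar{Y}_U$: mean of uncovered
  2. Proportion of non-covered population
     • $C$: number of covered units; $U$: number of uncovered units

$$\bar{Y}_C - \bar{Y} = \frac{U}{C + U}\left(\bar{Y}_C - \bar{Y}_U\right)$$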
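The coverage bias relation can be checked numerically. The sketch below uses made-up values for ten developers (all numbers are hypothetical) and compares the bias computed directly with the product of the two factors, i.e. the proportion of uncovered units U / (C + U) times the difference between the covered and uncovered means.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical weekly LOC for a target population of 10 developers.
covered = [200, 250, 300, 350, 400, 300]  # units in the sampling frame
uncovered = [100, 150, 50, 100]           # units missing from the frame

C, U = len(covered), len(uncovered)
y_target = mean(covered + uncovered)      # mean of the target population
y_covered = mean(covered)                 # mean of the covered population
y_uncovered = mean(uncovered)             # mean of the uncovered population

# Bias computed directly, and via (proportion uncovered) * (difference of means).
bias_direct = y_covered - y_target
bias_formula = U / (C + U) * (y_covered - y_uncovered)
print(bias_direct, bias_formula)
```

Either factor can drive the bias to zero: a frame that misses nobody (U = 0), or uncovered units that do not differ from the covered ones.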

Page 44: Surveys in Software Engineering

Sample

• Units selected from the frame population
• Time and cost opportunity
• Sampling := deliberate non-observation
• May introduce deviation between
  • Sample statistic
  • Full frame statistic

Page 45: Surveys in Software Engineering

Sampling design

• Strategy followed to establish the sample from the frame population
• Non-probabilistic sampling designs include:
  1. Accidental sampling (simply use convenience)
  2. Judgement sampling (apply some technical criteria to sample)
  3. Snowballing (share the sampling decision with part of the subjects)
  4. Quota sampling (establish fixed quota by groups)

Page 46: Surveys in Software Engineering

Sampling design

• Probabilistic sampling designs include:
  1. Simple random sampling (randomly select "n" units from the frame population)
  2. Stratified sampling (simple random sampling from each stratum established in the frame population)
     • Example: to sample Java developers by country
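Both probabilistic designs can be sketched in a few lines. The frame below (developer IDs with a country code), the function names, and the per-stratum sample size are hypothetical; the sketch only assumes that the frame is available as a list of units.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n distinct units from the frame, each with equal probability."""
    return rng.sample(frame, n)

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Simple random sampling within each stratum of the frame."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(units, min(n_per_stratum, len(units)))
        for name, units in strata.items()
    }

# Hypothetical frame: (developer id, country) pairs.
countries = ["BR"] * 50 + ["IT"] * 30 + ["DE"] * 20
frame = [(i, country) for i, country in enumerate(countries)]

rng = random.Random(42)  # fixed seed so the draw is repeatable
srs = simple_random_sample(frame, 10, rng)
by_country = stratified_sample(frame, lambda unit: unit[1], 5, rng)
print(len(srs), {name: len(units) for name, units in by_country.items()})
```

With simple random sampling a small country may end up with no units at all; stratifying by country guarantees each stratum is represented.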

Page 47: Surveys in Software Engineering

Sample size formula

• Recommended when working with probabilistic sampling designs
• SS: sample size
• Z: Z-value, established through a specific table (Z = 2.58 for a 99% confidence level, Z = 1.96 for a 95% confidence level)
• p: percentage selecting a choice, expressed as a decimal (0.5 is used as the default for calculating the sample size, since it represents the worst case)
• c: desired confidence interval, expressed as a decimal (e.g. 0.04)

Page 48: Surveys in Software Engineering

Sample size formula

• Correction formula based on a finite population with a known size (pop)

Population   Confidence level   Confidence interval   Sample size
10,000       95%                0.01                  4,899
10,000       95%                0.05                  370
500          95%                0.01                  475
500          95%                0.05                  217
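A minimal sketch of the calculation, assuming the commonly used formulas SS = Z² · p · (1 − p) / c² and, for a finite population, SS' = SS / (1 + (SS − 1) / pop). The function names are hypothetical; with Z = 1.96 and p = 0.5 these two formulas reproduce the four rows of the table above.

```python
def sample_size(z, c, p=0.5):
    """Sample size for a very large population: Z^2 * p * (1 - p) / c^2."""
    return (z ** 2) * p * (1 - p) / (c ** 2)

def corrected_sample_size(ss, pop):
    """Finite population correction: SS / (1 + (SS - 1) / pop)."""
    return ss / (1 + (ss - 1) / pop)

Z_95 = 1.96  # Z-value for a 95% confidence level

for pop, c in [(10_000, 0.01), (10_000, 0.05), (500, 0.01), (500, 0.05)]:
    ss = sample_size(Z_95, c)
    print(pop, c, round(corrected_sample_size(ss, pop)))
```

Note how little the population size matters once it is large: narrowing the confidence interval from 0.05 to 0.01 changes the required sample far more than moving from 500 to 10,000 units.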

Page 49: Surveys in Software Engineering

Sampling error

• Sampling bias
  • Systematic exclusion of some members
  • Or significantly reduced chance of selection
• Sampling variance
  • Ideal set of samples all drawn from the same frame

$$V_s = \frac{\sum_{s=1}^{S} \left(\bar{y}_s - \bar{Y}_C\right)^2}{S}$$
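For a tiny frame, the "ideal set of samples" can be enumerated exhaustively, which makes the sampling variance formula concrete. The frame values and the sample size of 2 are hypothetical choices for illustration.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

# Hypothetical frame population of four units.
frame_values = [1, 2, 3, 4]
frame_mean = mean(frame_values)  # mean of the covered (frame) population

# All S possible samples of size 2 drawn from the same frame.
samples = list(combinations(frame_values, 2))
S = len(samples)

# V_s: average squared deviation of the sample means from the frame mean.
sampling_variance = sum((mean(s) - frame_mean) ** 2 for s in samples) / S
print(S, sampling_variance)
```

In a real survey only one of these samples is ever drawn, so the variance is estimated rather than enumerated; the enumeration only shows what the formula averages over.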

Page 50: Surveys in Software Engineering

Sampling error reduction

• Probabilistic sampling
  ✓ All units have nonzero selection probability
• Stratified sampling
  ✓ Representation of key sub-populations is controlled
• Element samples
  ✓ As opposed to cluster samples
• Sample size

Page 51: Surveys in Software Engineering

Respondents

• The subset of the sample for which a measurement could be collected
  ✓ Item missing data: incomplete measures
• Full participation (i.e. a 100% response rate) is realistically possible only for inanimate units

Page 52: Surveys in Software Engineering

Respondents

[Figure: matrix of sample cases (rows) by data items (columns), split into frame data and interview data. Rows are grouped into respondents and nonrespondents; isolated gaps within respondents' rows mark item missing data.]

Page 53: Surveys in Software Engineering

Non-response error

• Non-response bias
  ✓ Non-response rate: $m_s / n_s$
  ✓ Difference between respondents and non-respondents

$$\bar{y}_r - \bar{y}_s = \frac{m_s}{n_s}\left(\bar{y}_r - \bar{y}_m\right)$$

Page 54: Surveys in Software Engineering

Post-survey adjustment

• Weighting
  ✓ Compensates for under-representation due to
    ✓ Nonresponse patterns
    ✓ Mismatch between frame and target population
• Imputation
  ✓ Item missing data are replaced by estimates
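A minimal sketch of nonresponse weighting, under the assumption that respondents in a group resemble the nonrespondents of the same group. The groups, counts, and values are hypothetical: juniors respond less often than seniors, so each junior respondent is given more weight.

```python
def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample composition, known from the frame data.
sample_counts = {"junior": 6, "senior": 4}

# Hypothetical respondents as (group, reported weekly LOC).
respondents = [
    ("junior", 100), ("junior", 120),
    ("senior", 300), ("senior", 320), ("senior", 280), ("senior", 300),
]

respondent_counts = {}
for group, _ in respondents:
    respondent_counts[group] = respondent_counts.get(group, 0) + 1

# Weight = sampled units in the group / respondents in the group.
weights = [sample_counts[g] / respondent_counts[g] for g, _ in respondents]
values = [v for _, v in respondents]

unweighted = sum(values) / len(values)
adjusted = weighted_mean(values, weights)
print(unweighted, adjusted)
```

The unweighted respondent mean over-represents seniors; the weighted mean restores the junior/senior proportions of the original sample.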

Page 55: Surveys in Software Engineering

Session II

Best Practices on Planning Surveys

Page 56: Surveys in Software Engineering

Disclaimer

There is no universal silver bullet!

Page 57: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 58: Surveys in Software Engineering

Design research objectives

Challenge: Know the limitations of survey research

→ Survey research opts for answers that rely on the experiences, opinions, and observations (folklore) of the respondents

• Develop internal questions to help you depict the research objective
• Opt for descriptive questions (“what is happening?”) or explanatory questions (“why is this happening?”) rather than normative questions (“what should we do?”)

Page 59: Surveys in Software Engineering

Design research objectives

Challenge: Identify the real target population

→ Avoid restricting the target population based on factors such as its size or its availability.

• Based on the research objectives, answer the question “Who can best provide you with the information you need?” instead of “Who is probably available to participate?”

Page 60: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 61: Surveys in Software Engineering

Characterizing the Target Population

Challenge: Identifying the survey unit of analysis

→ The survey subjects (individuals) are typically the survey unit of analysis. However, in some cases it may be a group of individuals, such as households or, in SE research, organizational units or project teams

• Based on the research objective, identify which entity should be used to guide sampling and data analysis
  • For instance, investigating Java developers' programming practices is a different research objective from investigating Java programming practices in software houses

Page 62: Surveys in Software Engineering

Characterizing the Target Population

Unit of analysis

• How do developers perform code debugging? → List of software developers
• How has code debugging been performed by developers from software houses? → List of software houses

Image source: http://www.telegraph.co.uk/finance/personalfinance/8080294/Poorest-households-hit-15-times-harder-by-Government-cuts.html

Page 63: Surveys in Software Engineering

Characterizing the Target Population

Challenge: Characterizing the subjects and units of analysis in SE surveys

→ Different research objectives may demand different attributes to characterize the individuals or groups of individuals involved in the surveys

→ What attributes are necessary to identify a “representative” population?

• Standards can be especially helpful for providing scales and even nominal values
  • For instance, the CMMI-DEV maturity level can be used to characterize organizational units regarding their software process maturity. RUP roles can be used to characterize subjects' current position

Page 64: Surveys in Software Engineering

Characterizing the Target Population

Challenge: Characterizing the subjects and units of analysis in SE surveys

• Individuals can be characterized through attributes such as experience in the research context, experience in SE, current professional role, location, and highest academic degree
• Organizations can be characterized through attributes such as size (scale typically based on the number of employees), industry segment (software factory, avionics, finance, health, telecommunications, etc.), location, and organization type (government, private company, university, etc.)
• Project teams can be characterized through attributes such as project size, team size, client/product domain (avionics, finance, health, telecommunications, etc.), and physical distribution

Page 65: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 66: Surveys in Software Engineering

Sampling

Challenge: Looking for the Frame Population

→ Suitable sampling frames are rarely available in SE research. We often need to perform “indirect sampling” (for instance, there are no yellow pages for software projects in a country).

• First of all, search for candidate sources of population. Avoid searching by convenience; instead, try to answer: “Where is a representative subset of the survey target population, or even the whole target population, available?”

Image source: http://www.bryan-allen.com/Photography/General/i-fsVh3h2

Page 67: Surveys in Software Engineering

Sampling

A universe of imperfect alternatives!

Page 68: Surveys in Software Engineering

Sampling

A good source of population...

• ...should not intentionally represent a segregated subset of the target population, i.e., for a target population “X”, it is not adequate to search for units in a source intentionally designed to comprise a specific subset of “X”
• ...should not be biased towards including only certain subsets of the target population in its database. Unequal criteria for including search units mean unequal sampling opportunities
• ...should allow identifying each of its units by a distinct logical or numerical ID
• ...should allow accessing all its units. If there are hidden elements, it is not possible to contextualize the population

Page 69: Surveys in Software Engineering

Sampling

Challenge: Looking for the Frame Population

• For surveys with SE researchers as the target population, you can use results from previously conducted systematic literature reviews (SLRs) on your research theme
  • Social networks aimed at academics, such as ResearchGate and Academia.edu, can also be useful in this context.

Page 70: Surveys in Software Engineering

Sampling

Challenge: Looking for the Frame Population

• Look for catalogues provided by recognized institutes, associations, or governments to retrieve relevant sets of SE professionals or organizations. Some examples:
  • SEI (www.sei.cmu.edu) provides an open list of organizations and organizational units certified at each CMMI-DEV level.
  • FIPA (www.tivia.fi/in-english) provides information regarding Finnish IT organizations and their professionals.
  • CAPES (www.capes.gov.br/) provides a tool for accessing information regarding Brazilian research groups.

Page 71: Surveys in Software Engineering

Sampling

Challenge: Looking for the Frame Population

• Sources available on the web, such as discussion groups, project repositories, and worldwide professional social networks, can be helpful for identifying representative populations of SE professionals

Such sources can restrict access to the available content at any moment!

Page 72: Surveys in Software Engineering

Sampling

Challenge: Looking for the Frame Population

• How to find the frame population in the source of population?
  • Once you have identified a source of population, you need to establish steps/procedures to systematically depict the survey sampling frame.
  • Such practice is important for assessing the samples' representativeness, and it also supports future re-executions

Page 73: Surveys in Software Engineering

Sampling

Challenge: Establishing the survey sample size

• Participation rates in voluntary SE surveys performed over random samples tend to be small (lower than 10%). What are the implications for response rates, but also for representativeness?
• Prefer probabilistic sampling designs.
  • Regardless of the number of respondents, it will be possible to calculate the confidence of the results
• In voluntary surveys with practitioners, establish significantly larger sample sizes, considering the expectation of a very low participation rate

Image source:http://www.playbuzz.com/viralpx/a-what-do-people-really-think-about-you
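A back-of-the-envelope sketch of what a low participation rate means for the number of invitations. The function name and the example rates are assumptions; the target of 370 respondents is the table value for a population of 10,000 at a 95% confidence level and a 0.05 confidence interval.

```python
def invitations_needed(target_respondents, expected_rate_percent):
    """Smallest number of invitations whose expected number of responses
    reaches the target. Uses integer ceiling division to avoid
    floating-point rounding surprises."""
    return -(-target_respondents * 100 // expected_rate_percent)

# Hypothetical expected participation rates of 10% and 5%.
at_10_percent = invitations_needed(370, 10)
at_5_percent = invitations_needed(370, 5)
print(at_10_percent, at_5_percent)
```

This only covers the arithmetic: inviting more people does not by itself address the representativeness question raised above, since those who choose to answer may still differ from those who do not.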

Page 74: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 75: Surveys in Software Engineering

Questionnaire Design

Challenge: To design a clear, simple and consistent survey questionnaire

Remember:

Bad questionnaires can lead subjects initially willing to participate to give up!

Page 76: Surveys in Software Engineering

Questionnaire Design

Challenge: To design a clear, simple and consistent survey questionnaire

• Use simple and appropriate wording for the survey questions
  • Avoid technical terms as much as possible, or define them in the questionnaire, according to the survey target population
• Prefer short questions that address a single concept
  • Avoid double-barreled questions
• Avoid vague sentences when writing survey questions

Image source: http://www.fooj.it/wp-content/uploads/2015/07/boring-office.jpg

Page 77: Surveys in Software Engineering

Questionnaire Design

In your opinion, do you agree or disagree that code refactoring is a need? And what about code smell detection?

a) I strongly agree
b) I partially agree
c) I agree
d) I disagree

Page 78: Surveys in Software Engineering

Questionnaire Design

Code refactoring is an essential practice for improving the understanding of object-oriented code.

a) Totally agree
b) Partially agree
c) Neither agree nor disagree
d) Partially disagree
e) Totally disagree

Page 79: Surveys in Software Engineering

Questionnaire Design

Challenge: To design a clear, simple and consistent survey questionnaire

• Avoid biased questions by carefully phrasing the questions so that they do not suggest likely answers or responses
• Avoid sensitive questions
  • In the SE context, sensitive questions can be about respondents' income, their opinion about the organization or management, etc.
• Avoid asking about events from the distant past

Page 80: Surveys in Software Engineering

Questionnaire Design

Do you prefer working in projects following agile methods or those following usual non-agile approaches?

Considering the main characteristics of the last 10 software projects you have worked on, please answer the following questions:

Asking age, gender, marital status for characterizing requirements engineers

Page 81: Surveys in Software Engineering

Questionnaire Design

Challenge: To design a clear, simple and consistent survey questionnaire

• It is important to avoid demanding questions (requiring too much effort from respondents to answer)
• Avoid double negatives

Page 82: Surveys in Software Engineering

Questionnaire Design

After reading the attached papers regarding non-functional requirements (NFR), please answer the following questions:

1. Which of the following NFR do you disagree are not relevant in the context of real-time systems?

Page 83: Surveys in Software Engineering

Questionnaire Design

Challenge: To design a clear, simple and consistent survey questionnaire

• Be careful when selecting the response format!
  • Wrong choices of response format may lead you to:
    • Lose precious data
    • Lose the opportunity of applying relevant statistical tests
    • Significantly (and unnecessarily) increase data analysis efforts

Page 84: Surveys in Software Engineering

Questionnaire Design

Free-text
• Open questions
• Allow coding
• Content analysis
• High effort on data analysis

Numeric values
• Open questions
• Allow a wide range of statistical analysis

Interval scale
• Closed questions
• Not necessarily equally distributed intervals
• Significantly restricts statistical analysis

Ordinal / Likert scale
• Closed questions
• Intervals are considered equally distributed
• Statistical analysis is less restrictive than for the interval scale

Nominal
• Closed questions
• Statistical analysis based on frequency

Page 85: Surveys in Software Engineering

Questionnaire Design

How much experience do you have in Java programming?

a) Very high experience
b) High experience
c) Little experience
d) Very little experience

How much experience do you have in Java programming?

a) Less than one year
b) 1 year to 3 years
c) 3 years to 5 years
d) More than 5 years

How much experience do you have in Java programming?

__5__ years

How much experience do you have in Java programming?

I have been working with Java programming at companies since 2011. Before, I got my first Java certification in 2009, when I started working in personal projects. But I have difficult with object-oriented parts…_________

Do you have experience in Java programming?

( ) Yes ( ) No

Page 86: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 87: Surveys in Software Engineering

Recruiting

Challenge: Controlling recruitment and participation

• Send individual but standard invitation messages
  • It is expected that the great majority of the individual messages sent will be read
• Avoid a "spreading spree": mailing lists, forum invitation messages, crowdsourcing tools (such as Amazon Mechanical Turk)
  • You will have little or no control over who reads the invitation. So, who was effectively recruited?
• Never allow forwarding (which is different from snowballing)!
  • It will violate the sample
• Send an individual questionnaire token to each subject
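One way to realize individual tokens, sketched with Python's standard `secrets` module. The function names and the subject IDs are hypothetical, and a real survey tool will typically offer its own token mechanism; the sketch only shows the idea that each sampled subject gets a distinct token and that a token is accepted once.

```python
import secrets

def assign_tokens(subject_ids, nbytes=16):
    """Give every sampled subject a distinct, hard-to-guess token."""
    tokens = {}
    used = set()
    for subject in subject_ids:
        token = secrets.token_urlsafe(nbytes)
        while token in used:  # collisions are extremely unlikely
            token = secrets.token_urlsafe(nbytes)
        used.add(token)
        tokens[subject] = token
    return tokens

def is_valid_submission(token, tokens, already_answered):
    """Accept a response only for a known token that was not used before."""
    return token in tokens.values() and token not in already_answered

# Hypothetical sampled subjects.
tokens = assign_tokens(["subject-1", "subject-2", "subject-3"])
answered = set()
first = tokens["subject-1"]
print(is_valid_submission(first, tokens, answered))
```

Because a forwarded invitation carries the original subject's token, at most one response can arrive per sampled subject, which keeps the sample under control and lets reminders skip those who already answered.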

Page 88: Surveys in Software Engineering

Recruiting

Challenge: Stimulating participation

• Reminders should be used with care.
  • Avoid reminding those who have already participated
  • Avoid reminding more than once
• The invitation message should clearly characterize the researchers involved and the research context, and present the recruitment parameters
• Include in the invitation message a compliment and a remark on the relevance of the subject's participation

Image source: http://quotesgram.com/you-are-important-to-me-quotes/

Page 89: Surveys in Software Engineering

Recruiting

Challenge: Stimulating participation

• Establish a finite and not too long period for answering the survey
  • One to two weeks
• Offer rewards (raffles, donations, payments, sharing results)
  • Take into account the local policies

Image source: http://quotesgram.com/you-are-important-to-me-quotes/

Page 90: Surveys in Software Engineering

Best Practices

Defining research objectives

Sampling

Questionnaire Design

Recruiting

Characterizing the Target Population

Page 91: Surveys in Software Engineering

Piloting the Survey

Challenge: You have only one shot! Once you have started the survey, there is usually no way back

• Pilot the population and sampling activities
  ✓ Use a (smaller) sample of the sampling frame, reproducing all planned steps
  ✓ This will allow you to check the adequacy of the frame population for your survey.
• Pilot the questionnaire
  ✓ Is it clear and unambiguous? Did you maybe miss some questions?
• Pilot the recruitment
  ✓ Is it working effectively?
• Pilot the data analysis
  ✓ Have you planned for the proper data analysis techniques? What is the necessary data quantity and quality?

Page 92: Surveys in Software Engineering

Session III

Hands-On!

Page 93: Surveys in Software Engineering

The “Plan”

1. Define up to 4 research objectives
2. Team assignment

<lunch>

1. Short introduction into the tool

For each team:

2. (Online) Survey design and implementation
3. Piloting / Testing
4. Wrap-up
5. (Optional: Running survey with ESEM participants)

Page 94: Surveys in Software Engineering

Definition of...

1. Your research objective
2. Your (coarse) research questions
3. Survey target population
4. Survey unit of analysis
5. Possible sources of population

→ Everyone sketches his/her idea for a survey on a sticky note (15-20 minutes)

→ We make a (plenary) selection and team assignment (5 minutes)

Page 95: Surveys in Software Engineering

Break

Page 96: Surveys in Software Engineering

Back-Up Questions

• Are functional and non-functional requirements really distinct?
• Can current software testing techniques support the testing of context-aware systems?
• What development practices can contribute more to decreasing software technical debt?
• What could be the software engineering gap when developing scientific (e-science) software?
• What is the impact of Continuous Software Engineering on software productivity and quality?
• What is software?

Page 97: Surveys in Software Engineering

The “Plan”

1. Define up to 4 research objectives
2. Team assignment

<lunch>

1. Short introduction into the tool

For each team:

2. (Online) Survey design and implementation
3. Piloting
4. Wrap-up
5. (Optional: Running survey with ESEM participants)

Page 98: Surveys in Software Engineering

Login

http://ww2.unipark.de/www/

Page 99: Surveys in Software Engineering

Login

http://ww2.unipark.de/www/

Team 1  User: iasese_1  Password: A2C04aq.

Team 2  User: iasese_2  Password: A2C04sw.

Team 3  User: iasese_3  Password: A2C04de.

Team 4  User: iasese_4  Password: A2C04fr.

Page 100: Surveys in Software Engineering

Select your project

• Please stick to your own project (watch the team number)

• Please take care with what you do: you have admin rights in the system

Page 101: Surveys in Software Engineering

Start editing

Participation link

(Page-based) questionnaire editor

Data export

Don't touch :-)

Page 102: Surveys in Software Engineering

Questionnaire

Page 103: Surveys in Software Engineering

Exporting data

Data export (takes a bit)

Codebook

Page 104: Surveys in Software Engineering

Before you start

Test and reset

Page 105: Surveys in Software Engineering

Acclimatization phase (10 minutes)

→ Every team gets to know the tool
→ In case of questions, ask :-)

http://ww2.unipark.de/www/

Team 1  User: iasese_1  Password: A2C04aq.

Team 2  User: iasese_2  Password: A2C04sw.

Team 3  User: iasese_3  Password: A2C04de.

Team 4  User: iasese_4  Password: A2C04fr.

Page 106: Surveys in Software Engineering

The “Plan”

1. Define up to 4 research objectives
2. Team assignment

<lunch>

1. Short introduction into the tool

For each team:

2. (Online) Survey design and implementation
3. Piloting
4. Wrap-up
5. (Optional: Running survey with ESEM participants)


Page 109: Surveys in Software Engineering

Wrap-Up

Each group: Briefly introduce the survey (research objective, target population, questionnaire)

→ Ask questions
→ See what you can learn from other designs

All together

• Shall we run one of the surveys during ESEM?

15-20 minutes

Page 110: Surveys in Software Engineering

Session IV

Q&A

Page 111: Surveys in Software Engineering

Further reading…

• Groves, Fowler, Couper, Lepkowski, Singer and Tourangeau, 2009. “Survey Methodology – 2nd edition” John Wiley and Sons

• Conradi R., Li J., Slyngstad O. P. N., Kampenes V. B., Bunse C., Morisio M., Torchiano M. “Reflections on conducting an international survey of CBSE in ICT industry” IEEE 4th International Symposium on Empirical Software Engineering November, 2005

• Linåker, J. Sulaman, S. M., de Mello, R. M. and Höst, M. 2015. Guidelines for Conducting Surveys in Software Engineering. TR 5366801, Lund University Publications. http://lup.lub.lu.se/record/5366801/file/5366839.pdf

• de Mello, R. M. and Travassos, G. H. 2016. Surveys in Software Engineering: Identifying Representative Samples. Proc. of 10th ACM/IEEE ESEM, Ciudad Real.

• de Mello, R. M. 2016. Conceptual Framework for Supporting the Identification of Representative Samples for Surveys in Software Engineering. Doctoral Thesis, COPPE/UFRJ, 138p. http://www.cos.ufrj.br/uploadfile/publicacao/2611.pdf