Curating GitHub for Engineered Software Projects… · An Empirical Study of Goto in C Code from...

Post on 16-Jun-2020


Curating GitHub for Engineered Software Projects

Mei Nagappan

https://reporeapers.github.io/

Craig Cabrey, Steven Kroh, Nuthan Munaiah

Why is this topic interesting today?

Access to Data


Computing Power, Access to Data


What can we do with this data?


Data-Driven Decision Support for Software Stakeholders

• Developer
• Maintainer
• Operator
• Manager
• Build Engineer

An Empirical Study of Goto in C Code from GitHub Repositories

Mei Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, Ahmed E. Hassan

However, there is a lurking issue

What are these projects on GitHub?

85 ± 5% of files are system or networking files

Noise

• Student projects
• Tutorial projects
• Personal projects
• Forked projects

We need to choose engineered software projects

So how are we finding engineered software projects?


Stargazers / Watchers / Forks

Can we do better?


Curate GitHub to find the engineered software projects


How do we define an engineered software project?


Engineered Software Project

• Architecture
• Continuous Integration
• Community
• Documentation
• History
• Issues
• License
• Unit Testing

Metric, Threshold, Weight
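The metric, threshold, and weight headings suggest a weighted scoring scheme: each dimension contributes its weight when its metric reaches a threshold. The following is a minimal sketch under that assumption, not the authors' implementation. The `Dimension` class, the `score` function, and every threshold and weight value below are hypothetical, chosen only so that the weights sum to 100.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Dimension:
    name: str
    threshold: float  # minimum metric value for the dimension to count
    weight: float     # points credited when the threshold is met


# Hypothetical thresholds and weights (not taken from the slides).
DIMENSIONS = [
    Dimension("architecture", 0.5, 20),
    Dimension("continuous_integration", 1.0, 10),
    Dimension("community", 2.0, 20),
    Dimension("documentation", 0.1, 10),
    Dimension("history", 2.0, 10),
    Dimension("issues", 0.01, 5),
    Dimension("license", 1.0, 10),
    Dimension("unit_testing", 0.05, 15),
]


def score(metrics: Mapping[str, float]) -> float:
    """Sum the weights of every dimension whose metric meets its threshold.

    A dimension missing from `metrics` is treated as not meeting its threshold.
    """
    return sum(
        d.weight
        for d in DIMENSIONS
        if d.name in metrics and metrics[d.name] >= d.threshold
    )


# A repository that only has a license and some unit tests (made-up values).
sparse_repo = {"license": 1.0, "unit_testing": 0.1}
# A repository that meets every hypothetical threshold exactly.
full_repo = {d.name: d.threshold for d in DIMENSIONS}
```

With these made-up numbers, `score(sparse_repo)` credits only the license and unit testing weights, while `score(full_repo)` credits all eight. A repository would then be labeled engineered when its score reaches some cutoff, which is also an assumption here.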

Thresholds – 150 GitHub Projects

High Stars, High Score (100)

High Stars, but very low score (25)

Low Stars, but very high score (100)

Evaluation – Ground truth data

384 GitHub repos were manually analyzed


Perfect precision, but very low recall
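To make the precision and recall trade-off concrete, here is a small sketch of how the two measures are computed against manually labeled ground truth. The function name `precision_recall` and the ten labels are hypothetical illustrations, not data from the evaluation.

```python
from typing import Sequence, Tuple


def precision_recall(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Return (precision, recall) for boolean predictions against ground truth."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # correctly flagged as engineered
    fp = sum(1 for p, a in pairs if p and not a)    # flagged, but not engineered
    fn = sum(1 for p, a in pairs if not p and a)    # engineered, but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for ten repositories (True = engineered).
actual = [True, True, True, True, True, True, True, True, False, False]
# A very conservative classifier: it flags only two repositories.
predicted = [True, True, False, False, False, False, False, False, False, False]

precision, recall = precision_recall(predicted, actual)
```

In this toy case every flagged repository is truly engineered, so precision is 1.0, but only two of the eight engineered repositories are found, so recall is 0.25.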

Data: https://reporeapers.github.io/

Source Code: https://github.com/RepoReapers/reaper
