Leveraging the Crowd: Supporting Newcomers to Build an OSS Community Marco Aurélio Gerosa University of São Paulo (USP) Northern Arizona University (NAU) Keynote @ PARIS Workshop (Methods and Tools for Project / Architecture / Risk Management in Globally Distributed Software Development Projects) August 2, 2016 Irvine, California, US


Page 1: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Marco Aurélio Gerosa

University of São Paulo (USP)

Northern Arizona University (NAU)

Keynote @ PARIS Workshop (Methods and Tools for Project / Architecture / Risk Management in Globally Distributed Software Development Projects)

August 2, 2016

Irvine, California, US

Page 2: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Context…

Companies are open sourcing their code after using and contributing to open source software projects

2

Page 12: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

But why are companies open sourcing their code?

12

Page 13: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Leveraging the Crowd

“The World Wide Web became a tool for bringing together the small contributions of millions of people and making them matter”

Collaboration on a scale never seen before

13

Page 14: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Where do they find the time?

• Wikipedia project: 100,000,000 hours of human thought
• Television watching in the U.S. (every year): 200,000,000,000 hours, or 2,000 Wikipedia projects (per year)
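The comparison on this slide comes down to one division. A quick check using the two hour counts as stated (the variable names are only illustrative):

```python
# Figures as stated on the slide.
WIKIPEDIA_HOURS = 100_000_000            # hours of human thought in the Wikipedia project
US_TV_HOURS_PER_YEAR = 200_000_000_000   # hours of television watched in the U.S. each year

# How many Wikipedia-sized projects fit into one year of U.S. television watching?
wikipedia_projects_per_year = US_TV_HOURS_PER_YEAR // WIKIPEDIA_HOURS
print(wikipedia_projects_per_year)  # 2000
```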

14

Page 15: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

"given enough eyeballs, all bugs are shallow"

15

Page 16: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

16

Page 17: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Distributed and collaborative software development

• OpenStack:
  • 1.7M lines of code
  • 19 programming languages
  • 17K community members
  • 4.5K code contributors (2.1K in the last 12 months)
  • 38K e-mail messages
  • 51K followers on Twitter
  • took an estimated 507 years of effort (COCOMO model) - first commit in 2006

• Mozilla Firefox:
  • 13.5M lines of code
  • 37 programming languages
  • 4K contributors (1K in the last 12 months)
  • 4,231 years of effort (COCOMO model) - first commit in 2002

• Swift:
  • 445K lines of code
  • Over the past 12 months, 428 developers (403 in the last 12 months)

https://www.openhub.net/p/openstack
https://opensource.com/business/14/6/openstack-numbers
https://www.openhub.net/p/apple_swift
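The "years of effort" figures come from a COCOMO-style estimate. The slide does not say which variant or coefficients were used, so the sketch below assumes the textbook Basic COCOMO formula in "organic" mode (effort in person-months = 2.4 × KLOC^1.05). With the rounded line counts from the slide it lands in the same range as the quoted numbers, not exactly on them.

```python
def basic_cocomo_person_years(lines_of_code: int, a: float = 2.4, b: float = 1.05) -> float:
    """Basic COCOMO effort estimate, converted to person-years.

    effort (person-months) = a * KLOC ** b
    The defaults a=2.4 and b=1.05 are the textbook "organic mode" coefficients;
    using them here is an assumption, not something the slide states.
    """
    kloc = lines_of_code / 1000
    person_months = a * kloc ** b
    return person_months / 12


# Rounded line counts taken from the slide.
for project, loc in [("OpenStack", 1_700_000), ("Mozilla Firefox", 13_500_000)]:
    print(f"{project}: ~{basic_cocomo_person_years(loc):.0f} person-years")
```

Because the exponent is above 1, the estimate grows faster than the code size: roughly eight times the code yields close to nine times the effort.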

Page 18: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Meet the community

https://github.com/about/press

> 14 million users

http://www.alexa.com/siteinfo/github.com

56th most accessed site

Page 19: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

19

Page 20: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

L. F. DIAS, Igor STEINMACHER, Gustavo PINTO, D. A. COSTA, and Marco GEROSA, “How does the shift to GitHub impact project collaboration?”, ICSME 2016 ERA Track

20

Page 21: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Casual contributors

Pinto, Steinmacher & Gerosa (2016) ”More Common Than You Think: An In-Depth Study of Casual Contributors” 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016)

• We analyzed the most popular projects in each language (300 in total)
• 49% of contributors contributed only once (casual contributor)
• They are responsible for 2% of the total number of commits
• Casual contributions:
  • bug fixes (30%)
  • fixing typos and grammar issues (29%)
  • adding new features (19%)
  • code refactoring (9%)
• Both casual contributors and project maintainers believe that casual contributions have more benefits than drawbacks (survey)
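The two headline percentages (share of contributors who are casual, share of commits they account for) can be computed from a plain list of commit authors. The sketch below is a minimal illustration on a toy log with hypothetical author names. It assumes each author string identifies one person, which real studies usually have to establish by merging identities first.

```python
from collections import Counter


def casual_contributor_stats(commit_authors):
    """Return (share of contributors with exactly one commit, share of commits they made)."""
    if not commit_authors:
        return 0.0, 0.0
    commits_per_author = Counter(commit_authors)
    # A casual contributor is an author with exactly one commit.
    casual = [author for author, n in commits_per_author.items() if n == 1]
    contributor_share = len(casual) / len(commits_per_author)
    # Each casual contributor made exactly one commit, so len(casual) is their commit count.
    commit_share = len(casual) / len(commit_authors)
    return contributor_share, commit_share


# Toy commit log (hypothetical authors), one entry per commit.
toy_log = ["ana"] * 6 + ["bob"] * 2 + ["carol", "dave"]
contributor_share, commit_share = casual_contributor_stats(toy_log)
print(f"{contributor_share:.0%} of contributors are casual; they made {commit_share:.0%} of commits")
```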

21

Page 22: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Heterogeneity
• Meet multiple and specific demands
• Cloud computing
• Micro-services

SPEED

22

Page 23: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Ok, Let’s do it!

Let’s attract new developers!

Let’s open our code!

23

Page 24: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

However…

24

Page 25: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

“I opened my browser and typed the website address: http://www.libreoffice.org/. I will need to contribute to LibreOffice but I don’t have any clue on how to do it”

“... I am a little lost, so I will try a bug that I think I can work with...”

“I don’t know what I was supposed to do after finishing the compilation process. I will watch the video tutorial once again to find it out. I need to define my next steps, I don’t know what these steps are.”

“The information I found in the project website are long and confusing. I felt really lost and concerned.”

Igor STEINMACHER, Tayana CONTE, Marco GEROSA, David REDMILES (2015) “Social barriers faced by newcomers placing their first contribution in open source software projects”, 18th ACM Conference on Computer Supported Cooperative Work (CSCW 2015)

25

Page 26: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Why do newcomers drop out of OSS projects?

We analyzed:
• 60 months of the Hadoop project
• Mailing lists (50K messages), Issue tracker (8K issues, 76K comments), VCS
• Survey

Absence of response, politeness, usefulness, and type of the author influence the retention of newcomers in an open source project

Steinmacher, Wiese, Chaves & Gerosa, ”Why do newcomers abandon open source software projects?”, 6th Int. Workshop on Cooperative and Human Aspects of Software Engineering (CHASE 2013)

82% of dropouts!!!

26

Page 27: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

How to better support newcomers?

27

Page 28: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Our research goal

To understand the entrance of newcomers in open source software projects by means of empirical studies and mitigate the barriers they face by means of processes and tools, leveraging sociotechnical information from software repositories

28

Page 29: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Our general approach

[Diagram: Method (Understand, Engineer, Evaluate), Model & Theories, Executable code, Research]

29

Page 30: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

1. Empirical studies using mixed-methods approach to understand the phenomenon

2. Engineering of innovative tool support for different stakeholders based on the understanding obtained

3. Evaluation using rigorous and systematic scientific studies

30

Engineer

Understand

Evaluate

Page 31: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

[Diagram: Method (Understand, Engineer, Evaluate), Model & Theories, Executable code, Research]

31

Page 32: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Systematic literature review

STEINMACHER, I.; SILVA, M.A.; GEROSA, M.A.; REDMILES, D.F. “A systematic literature review on the barriers faced by newcomers to open source software projects.” Information and Software Technology, v. 59, p. 67-85, 2015

(“OSS” OR “Open Source” OR “Free Software” OR FLOSS OR FOSS) AND (newcomer OR “joining process” OR newbie OR “new developer” OR “new member” OR “new contributor” OR novice OR beginner OR “potential participant” OR retention OR joiner OR onboarding OR “new committer”)

291 papers initially found

20 papers selected

32

RQ: What are the barriers that hinder the contribution of newcomers in OSS projects?
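The search string is a conjunction of two OR-groups: one naming the open source context, one naming newcomers. As a rough illustration of what it selects, the sketch below applies the same boolean structure to a title or abstract. The matching rules (case-insensitive, whole words, optional plural "s") are assumptions made for this example; each digital library applies its own rules.

```python
import re

# The two OR-groups of the search string.
SCOPE_TERMS = ["OSS", "Open Source", "Free Software", "FLOSS", "FOSS"]
NEWCOMER_TERMS = [
    "newcomer", "joining process", "newbie", "new developer", "new member",
    "new contributor", "novice", "beginner", "potential participant",
    "retention", "joiner", "onboarding", "new committer",
]


def _any_term(terms, text):
    # Whole-word, case-insensitive match; the optional trailing "s" is an assumption.
    return any(
        re.search(r"\b" + re.escape(term) + r"s?\b", text, re.IGNORECASE)
        for term in terms
    )


def matches_search_string(text):
    """True if the text satisfies (scope term) AND (newcomer term)."""
    return _any_term(SCOPE_TERMS, text) and _any_term(NEWCOMER_TERMS, text)


print(matches_search_string("Barriers faced by newcomers to open source software projects"))
```

A text that mentions only one of the two groups, such as a paper on open source licensing, is rejected by the AND.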

Understand

Page 33: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Empirical studies
• Interviews: 36 subjects, 14 projects
• Survey: 24 answers, 9 projects
• Ethnography study: 2 courses

Igor STEINMACHER, Tayana CONTE, Marco GEROSA, David REDMILES (2015) “Social barriers faced by newcomers placing their first contribution in open source software projects”, 18th ACM Conference on Computer Supported Cooperative Work (CSCW 2015)

In collaboration with: Prof. David Redmiles

Understand

Page 34: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Put everything together

Igor STEINMACHER, Tayana CONTE, Marco GEROSA, David REDMILES (2015) “Social barriers faced by newcomers placing their first contribution in open source software projects”, 18th ACM Conference on Computer Supported Cooperative Work (CSCW 2015)

34

Understand

Page 35: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Igor STEINMACHER, Tayana CONTE, Marco GEROSA, David REDMILES (2015) “Social barriers faced by newcomers placing their first contribution in open source software projects”, 18th ACM Conference on Computer Supported Cooperative Work (CSCW 2015)

The Barriers Model

35

Understand

Page 36: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

[Diagram: Method (Understand, Engineer, Evaluate), Model & Theories, Executable code, Research]

36

Page 37: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

FLOSSCoach: a portal for newcomers

http://www.flosscoach.com/

Engineer

37

Page 38: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

”Awareness denotes the practices through which actors tacitly and seamlessly align and integrate their distributed and yet interdependent activities.” Kjeld Schmidt (2002)

Page 39: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

(Big)Data!

Page 40: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Mining software repositories

Mining

Information about a project

Information about an ecosystem

Information about Software Engineering

Decision making

Software understanding

Support maintenance

Empirical validation of ideas & techniques

Collaboration and software production

Practitioner Researcher

Applications

Tag cloud from MSR 2014 CFP

The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.

40

Page 41: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Software repositories

https://github.com/about/press
http://octoverse.github.com/

31 million repositories
12 million users
In a single year:
• 3 million new users
• 152 million pushes
• 25 million comments
• 14 million issues
• 7 million pull requests

36K projects
http://en.wikipedia.org/wiki/CodePlex

30K projects
https://launchpad.net

324K projects
3.4 million developers
http://sourceforge.net/apps/trac/sourceforge/wiki/What%20is%20SourceForge.net

250K projects
http://en.wikipedia.org/wiki/Comparison_of_open_source_software_hosting_facilities

93K projects
1 million users

200 projects
http://projects.apache.org/indexes/alpha.html

661K projects
29 billion lines of code
3 million users

33K projects

http://www.ohloh.net/

41

Page 42: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

42

Page 43: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

[Diagram: Method (Understand, Engineer, Evaluate), Model & Theories, Executable code, Research]

43

Page 44: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

FLOSSCoach evaluation
• Deployed the portal for 6 different projects
• Developers reported their progress on user diaries
• Surveyed developers using the Technology Acceptance Model and Self-efficacy instruments.

44

Evaluate

Page 45: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Evaluation
• The portal improved newcomers’ experiences of the contribution process

Igor STEINMACHER, Tayana CONTE, Christoph TREUDE, Marco GEROSA, "Overcoming Open Source Project Entry Barriers with a Portal for Newcomers", International Conference on Software Engineering (ICSE 2016), Austin, Texas.

Evaluate

Page 46: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Evaluation
• The portal improved newcomers’ experiences of the contribution process

Igor STEINMACHER, Tayana CONTE, Christoph TREUDE, Marco GEROSA, "Overcoming Open Source Project Entry Barriers with a Portal for Newcomers", International Conference on Software Engineering (ICSE 2016), Austin, Texas.

“The tool seems to be good, because it solves doubts that range from the skills needed to start to pointing how to submit a contribution.”

“I could check what newcomers need to know regarding the development environment, accessing the links to documentation and relevant guidelines, understanding how to search for help and who to talk to in case of problems.”

“…the tool helped me a lot, because it gave me an outstanding guidance about what I needed to do and, consequently, made me spend less time and made me more confident”

“The flow was great. I always used it, and from here I accessed the other information. It is easy”

46

Evaluate

Page 47: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Next steps

[Diagram: Method (Understand, Engineer, Evaluate), Model & Theories, Executable code, Research]

47

Page 48: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

How does the shift to GitHub impact projects’ collaboration?

L. F. DIAS, Igor STEINMACHER, Gustavo PINTO, D. A. COSTA, and Marco GEROSA, “How does the shift to GitHub impact project collaboration?”, ICSME 2016 ERA Track

We also investigated number of pull requests and issues

48

Understand

Page 49: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

What are the benefits and challenges of open-sourcing a proprietary software project?

Prof. Gustavo Pinto (IFPA)

In collaboration with:

49

Understand

Page 50: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

What are the benefits and barriers of contributing to OSS in a Software Engineering course?

50

Page 51: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Understanding newcomers’ motivations and engagement programs

Code Engagement Programs… may motivate students to engage in Open Source

GSoC 2014

Analysis per project tool: Before, During, After -> only 2 got back!

Country                                                            Participants
Sri Lanka                                                          13
China                                                              5
India                                                              3
USA                                                                2
Spain                                                              2
Ireland                                                            2
United Kingdom, France, South Korea, Portugal, Hungary, Estonia    1 each

33 students

In collaboration with: Prof. Daniel German (UVic)

Understand

Page 52: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

What about mentors?

• What are the benefits?
• Is it worth it?
• What is the process?
• What are the motivations?
• What are the challenges?

Understand

In collaboration with: Anita Sarma (Oregon State University)

Page 53: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Documentation is everywhere

53

Page 54: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Next steps – more techniques

• Natural Language Processing
  • deals with analyzing, understanding, and generating languages that humans use naturally
• Information Retrieval
  • obtains information resources relevant to an information need from a collection of resources
• Mining Software Repositories
  • uncovers interesting and actionable information about software systems and projects

summarize, visualize, search, sort, filter

Engineer

In collaboration with: Prof. Christoph Treude

54
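To make the Information Retrieval item concrete: given a newcomer's question, rank a project's documents by relevance. The sketch below uses a simple TF-IDF score with smoothed inverse document frequency. The scoring scheme and the document names and contents are assumptions chosen for illustration, not the approach of any tool named in the talk.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(query, documents):
    """Rank document names by a simple TF-IDF score against the query."""
    tokenized = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n_docs = len(tokenized)

    def idf(term):
        # Smoothed inverse document frequency: rarer terms weigh more.
        df = sum(1 for counts in tokenized.values() if term in counts)
        return math.log((1 + n_docs) / (1 + df)) + 1

    query_terms = set(tokenize(query))
    scores = {
        name: sum(counts[term] * idf(term) for term in query_terms)
        for name, counts in tokenized.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Documents sharing no term with the query are left out.
    return [name for name, score in ranked if score > 0]


# Hypothetical project documents.
docs = {
    "CONTRIBUTING": "how to submit a patch and open a pull request",
    "BUILD": "how to compile the project and set up the build environment",
    "CODE_OF_CONDUCT": "expected behavior on the mailing list",
}
print(rank_documents("set up build environment", docs))
```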

Page 55: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Improving engagement

Engineer

In collaboration with: Sabrina Marczak (PUCRS)

55

Page 56: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Thomas Zimmermann, Peter Weissgerber, Stephan Diehl, and Andreas Zeller. 2005. Mining Version Histories to Guide Software Changes. IEEE Trans. Software Eng. 31, 6 (June 2005), 429-445.

Solving specific barriers: recommending co-changes

Engineer

Gustavo OLIVA, Marco GEROSA, Change coupling between software artifacts: learning from past changes - in: The Art and Science of Analyzing Software Data (2015)
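The idea behind recommending co-changes is that files which were often committed together in the past probably need to change together again. The sketch below is a deliberately simplified version that only counts how often pairs of files appear in the same commit. It is not the rule-mining approach of the works cited on this slide; the file names and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files was changed in the same commit."""
    pair_counts = Counter()
    for files in commits:
        # Sort so that each pair has a single canonical order.
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return pair_counts


def recommend_co_changes(commits, changed_file, min_support=2):
    """Files that changed with `changed_file` at least `min_support` times, most frequent first."""
    suggestions = Counter()
    for (a, b), n in co_change_counts(commits).items():
        if n >= min_support and changed_file in (a, b):
            suggestions[b if a == changed_file else a] = n
    return [name for name, _ in suggestions.most_common()]


# Toy history (hypothetical file names): one list of changed files per commit.
history = [
    ["parser.py", "lexer.py"],
    ["parser.py", "lexer.py", "README"],
    ["parser.py", "cli.py"],
    ["lexer.py", "parser.py"],
]
print(recommend_co_changes(history, "parser.py"))  # ['lexer.py']
```

A newcomer editing `parser.py` would be pointed to `lexer.py`, while the one-off pairings with `README` and `cli.py` fall below the support threshold.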

Page 57: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Social information to predict co-changes

• Using context information from software change collected from communication (comments), coordination (issues), and cooperation (artifacts)
• Random forest classifier for each specific co-change

Engineer

Igor WIESE, Reginaldo RÉ, Igor STEINMACHER, Rodrigo KURODA, Gustavo OLIVA, Christoph TREUDE, Marco GEROSA, “Using contextual information to predict co-change”, Journal of Systems and Software (JSS), Elsevier (2016)

57

Page 58: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Detecting code issues

Different architectural roles have different metrics distribution

Engineer

Mauricio ANICHE, Gabriele BAVOTA, Christoph TREUDE, Arie van DEURSEN, Marco GEROSA, A validated set of smells in Model-View-Controller architectures - ICSME 2016

Specific smells to specific architectural roles

58

Page 59: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Training the next generation of software engineers

Yorah Bosse, PhD candidate

[Chart: % Pass 70.69%, % Fail and Abort 29.31%]

        C      Java   Python   VBA
2010    78%    10%    3%       13%
2011    78%    8%     0%       13%
2012    75%    9%     3%       13%
2013    28%    6%     53%      13%
2014    48%    0%     40%      13%

29.3% of these enrollments resulted in fail and abort

C and Python were the most used programming languages

One of these courses had 62.2% of fail and abort in this period

[Chart: Fail/Abort vs. Pass, axis from 0% to 100%]

[Chart: 1: 73.31%, 2: 19.60%, 3: 4.49%, 4: 1.81%, 5: 0.43%, >5: 0.36%]

Engineer

Leonidas BRANDÃO, Yorah BOSSE, Marco GEROSA, “Visual programming and automatic evaluation of exercises: an experience with a STEM course”, Frontiers in Education conference (FIE), Erie, PA, October, 2016

Page 60: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Next steps – more evaluation

• Free and Open Source Competence Center
• OSS in Education
• Federal Government Public Software Portal
• In the wild…

Evaluate

60

Page 61: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Vision: Bug Exchange

Bug Exchange: helping newcomers familiarize themselves with sociotechnical aspects of an ecosystem of projects, generating a workforce of contributors who can transfer knowledge across projects

• Identifying required skill
• Determining task complexity
• Identifying information needs and providing documentation
• Recommending tasks
• Supporting peer mentor networks
• Transferring knowledge across projects
• Crowdsourcing tasks

Anita SARMA, Marco GEROSA, Igor STEINMACHER, R. LEANO, “Training the future workforce through task curation in an OSS ecosystem”, Foundations of Software Engineering (FSE 2016), Visions and Reflections Track (FSE-VaR)

In collaboration with: Anita Sarma

Page 62: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Recap

62

Page 63: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Recap

63

Page 64: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

Recap

64

Page 65: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

For existing projects:

Lower the barriers to boost contributions, especially from newcomers

Steinmacher, I., Gerosa, M.A., “Fostering Free/Libre Open Source Software community formation: guidelines for communities to support newcomers’ onboarding,” in: XVI International Free Software Workshop (WSL 2015)

65

YOU NEVER GET A SECOND CHANCE TO MAKE A FIRST IMPRESSION

Page 66: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

For companies opening their code:

Steinmacher, I., Gerosa, M.A., “Fostering Free/Libre Open Source Software community formation: guidelines for communities to support newcomers’ onboarding,” in: XVI International Free Software Workshop (WSL 2015)

66

YOU NEVER GET A SECOND CHANCE TO MAKE A FIRST IMPRESSION

Lower the barriers to boost contributions, especially from newcomers

Page 67: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

For researchers:

64 barriers faced by newcomers

We still need methods and tools for Project/Architecture/Risk Management

Page 68: Leveraging the Crowd: Supporting Newcomers to Build an OSS Community

For researchers:

64 barriers faced by newcomers

We still need methods and tools for Project/Architecture/Risk Management

Thank you!

Marco Aurelio Gerosa ([email protected])
@gerosa_marco
http://www.ime.usp.br/~gerosa