130719 sebastiano panichella - who is going to mentor newcomers in open source projects

Post on 05-Dec-2014


Category:

Technology


DESCRIPTION

Software developers, empirical studies, social studies, mentoring

TRANSCRIPT

Who is going to Mentor Newcomers in Open Source Projects?

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella

• Context and Motivations: Software Development

• How? Training via Mentoring

• Case Study: Explorative analysis, Recommendation system evaluation

Training Project Newcomers

With GOOD TRAINING, newcomers can immediately start to work ACTIVELY

Newcomer

Zhou and Mockus

Better training from

Senior Developers

Newcomer

Previous Work...

Low Sociability

Previous Work...

Dagenais et al.

MENTOR

Newcomer

Mentoring of project newcomers is highly desirable

Characteristics of a good Mentor

Sources of Information

SVN GIT CVS

Small Projects: finding mentors is a trivial problem

Large Projects: finding mentors is not a trivial problem

Mentoring Small/large Projects


YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

SVN GIT CVS

YODA: two phases

What factors can be used to identify mentors?

RQ1: Identify past mentors

How does ArnetMiner work?

Ranks pairs of researchers according to four factors:

• f1: they published many papers together
• f2: the advisor published more than the student
• f3: the advisor is older than the student
• f4: the student published her first paper(s) with the advisor

Heuristics to identify Mentors (illustrated on a project timeline)

• F1: Exchanged emails (timeline marker: when Alice joins the project)
• F2: Overall amount of emails
• F3: Project age
• F4: Newcomer early emails (timeline marker: first emails by Alice in the project)
• F5: Commits (timeline marker: when Alice joins the project)

What factors can be used to identify mentors?

Aggregating the factors

$$\sum_{i=1}^{5} w_i \cdot f_i$$
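The aggregation above is a weighted sum of the five factors. A minimal sketch of that sum follows; the function name, the example factor values, and the assumption that factors are already normalized to a common range are illustrative and not taken from the slides.

```python
from typing import Sequence


def aggregate_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of factor values, i.e. sum_i w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical factor values f1..f5 for one newcomer-mentor pair
# (assumed to be normalized beforehand; the slides do not say how).
pair_factors = [0.8, 0.4, 0.0, 0.6, 0.0]

# Hypothetical weights w1..w5; a weight of 0 switches a factor off.
pair_weights = [0.5, 0.25, 0.0, 0.25, 0.0]

score = aggregate_score(pair_factors, pair_weights)
print(f"score = {score:.2f}")
```

Setting some weights to zero gives a configuration that uses only a subset of the factors, so different factor combinations can be compared with the same function.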

Recommend Mentors

Time t: mentor with adequate skills

Inspired by the work on bug triaging by J. Anvik et al. (2011)

DICE SIMILARITY
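The Dice similarity between two sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch below shows one way it could be used to rank candidate mentors. It assumes the compared sets are terms taken from a newcomer's request and from each candidate's past contributions, which the slides do not specify. All names and terms are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def dice_similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Dice coefficient 2*|A & B| / (|A| + |B|) over two term collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def rank_candidates(
    request_terms: Iterable[str],
    candidate_terms: Dict[str, Iterable[str]],
) -> List[Tuple[str, float]]:
    """Sort candidates by decreasing Dice similarity to the request terms."""
    request = set(request_terms)
    scored = [
        (name, dice_similarity(request, terms))
        for name, terms in candidate_terms.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical terms from a newcomer's first request.
newcomer_terms = {"locking", "buffer", "wal"}

# Hypothetical terms associated with two candidate mentors.
candidates = {
    "carol": {"docs", "build"},
    "bob": {"wal", "buffer", "replication", "locking"},
}

ranked = rank_candidates(newcomer_terms, candidates)
for name, similarity in ranked:
    print(f"{name}: {similarity:.3f}")
```

The coefficient ranges from 0 (no shared terms) to 1 (identical sets), so it can be compared across candidates with different amounts of history.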

Empirical Study

• Goal: analyze data from mailing lists and versioning systems

• Purpose: investigate which factors can be used to identify mentors

• Quality focus: recommending mentors in software projects

• Context: mailing lists and versioning systems of five software projects: Apache, FreeBSD, PostgreSQL, Python and Samba

Context: training and test sets for evaluating YODA.

                               Apache           FreeBSD          PostgreSQL       Python           Samba
Period (Training set)          08/2001-03/2002  11/1998-02/2000  10/1998-05/2001  05/2000-05/2001  04/1998-09/2000
Period (Test set)              04/2002-12/2008  03/2000-10/2008  06/2001-03/2008  06/2001-12/2008  10/2000-12/2008
# of Mentors (Training set)    19               65               10               28               17
# of Newcomers (Training set)  13               33               8                32               33
# of Newcomers (Test set)      13               33               7                31               33

Research Questions

RQ1: How can we identify mentors from the past history of a software project?

[Table: newcomer-mentor COUPLES with their SCORE; values shown: 2.5, 1.5, 1.5, 1.5, 1.5, 1.5, ...]

$$\sum_{i=1}^{5} w_i \cdot f_i$$

Manual Validation

Possible Configurations: f1; f1 + f2 + f3; f1 + f2 + f4; f5 (Baseline)

[Figure: Precision (0% to 100%) versus number of newcomer-mentor pairs for the four configurations, one panel per project: Apache (18 to 24 pairs), PostgreSQL (12 to 22 pairs), FreeBSD (23 to 41 pairs), Python (24 to 48 pairs), Samba (30 to 42 pairs)]
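The plots above report precision against the number of top-ranked newcomer-mentor pairs considered. A minimal sketch of that measurement follows. It assumes precision is the fraction of the top-k ranked pairs that manual validation confirmed, and every pair and name in it is hypothetical.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (newcomer, mentor)


def precision_at_k(ranked_pairs: List[Pair], validated: Set[Pair], k: int) -> float:
    """Fraction of the k highest-ranked pairs that are in the validated set."""
    top = ranked_pairs[:k]
    if not top:
        return 0.0
    confirmed = sum(1 for pair in top if pair in validated)
    return confirmed / len(top)


# Hypothetical pairs, already sorted by decreasing score.
ranked = [
    ("alice", "dave"),
    ("bob", "erin"),
    ("carol", "dave"),
    ("frank", "erin"),
]

# Hypothetical outcome of the manual validation step.
confirmed_pairs = {("alice", "dave"), ("carol", "dave")}

for k in range(1, len(ranked) + 1):
    print(k, round(precision_at_k(ranked, confirmed_pairs, k), 2))
```

Sweeping k in this way yields one precision value per cut-off, which is the shape of curve the figure panels describe.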

USEFUL FACTORS FOR MENTORS IDENTIFICATION

0.5*f1 + 0.25*f2 + 0.25*f4

0.5*f1 + 0.25*f2 + 0.25*f3

f1

RQ2: To what extent would it be possible to recommend mentors to newcomers joining a software project?


YODA makes it possible to recommend mentors

Why not just use Top Committers?

Not all Committers Are Good Mentors

Surveying Project Developers

Questions asked:

- Done/received mentoring

- Perceived importance of mentoring

- What makes a good Mentor

Sent to 114 Subjects…

Projects: FreeBSD, PostgreSQL, Python, Apache, Samba

Figures shown: 37, 37, 15, 23, 23

Obtained Answers…

Projects: FreeBSD, PostgreSQL, Python, Apache, Samba

Done/received mentoring?

[Bar chart: YES/NO answers to "Did mentoring?" and "Had a mentor?"; values shown: 92%, 58%, 8%, 42%]

Yes, I received Mentoring. My mentor was…

Yes, I did mentoring…

Perceived importance of mentoring

[Bar chart: series "Effect of mentor" and "Effect on newcomer" over the categories Very important, Important, Neutral, Not important, Useless at all; values shown: 18%, 36%, 45%, 0%, 0% and 33%, 56%, 11%, 0%, 0%]

It is very important that a mentor shares knowledge with a mentee…

What makes a good Mentor

[Bar chart over the categories Experience, Communication skills, Project knowledge, Others; values shown: 19%, 42%, 38%, 0%]

My first Mentor had a very strong and technical background

Conclusion


Future Work
