End-User Programmers and their Communities

End-User Programmers and their Communities: An Artifact-based Analysis

Kathryn T. Stolee, Sebastian Elbaum, and Anita Sarma
University of Nebraska–Lincoln

{kstolee, elbaum, asarma}@cse.unl.edu

September 22, 2011

This work is supported by the NSF GRFP under CFDA#47.076, NSF Award #0915526, and AFOSR Award #9550-10-1-0406.

Introduction

End-User Programming

End User Programmers

People who engage in programming activities to support their hobbies and work.

                     Professionals   End Users
Number in U.S.       3 million       13 million
Typical Education    C.S. Degree     Other Degree
Role of Programming  It’s their job  It supports their job

End-User Communities

Many Domains and Applications

Domain          Web Mashups    Educational Games  Scientific Computing
Environment     Yahoo! Pipes   Scratch            MATLAB
# Artifacts     100,000        700,000            13,717
# Participants  90,000         500,000            5,356

... yet we know little about how the repositories are utilized

Empirical Study

Motivation

Empirical Study Details

Research Goal
Study Context
Research Questions
Variables and Metrics
Methods
Results

Research Goal

To better understand end-user programmer communities

Learn how communities and artifact repositories evolve
Uncover needs for support in: development, maintenance, search, program understanding, ...

Web Mashups

Why Mashup Communities?

Web mashups are applications that compose and manipulate existing data sources or services to create new data or services.

Why study mashups?
Many environments (e.g., Apatar, DERI Pipes, IBM Mashup Center, Kivati, Yahoo! Pipes, ...)
Potential impact (many users, growth)

About Yahoo! Pipes

This example mashup fetches and filters news from news.google.com
Information page shows the pipe output and descriptive information
Clicking Publish adds the pipe to the public repository
Clicking Edit Source loads the Pipes Editor

Visual mashup creation environment
Within a browser
Drag and drop interface

Study Setup

Research Questions

RQ1: What are the characteristics of the Yahoo! Pipes community?
  1a,b: author attrition and author contributions
  1c: artifact sharing, abstraction, complexity, and degree of overlap among pipes in the repository

RQ2: How do pipe attributes change as authors gain experience?
  2a: experience measured by time
  2b: experience measured by total contributions

RQ3: What are the characteristics of the most prolific authors?
  3a: author activities
  3b: author skills
  3c: awareness of the community

Metrics

Study Details

Concept to Capture                  Variable
artifact sharing/impact             popularity
abstraction                         configurability
complexity                          size
overlap of artifacts in repository  diversity

Variables: size, configurability, popularity, diversity

Data sources: Pipe Source, Pipe Information

Size (Pipe Source): 6 modules
Significance: Size is related to complexity

Configurability (Pipe Source, Pipe Information): 3 modules
Significance: Configurability is related to abstraction and language mastery

Popularity (Pipe Source, Pipe Information): 190 clones
Significance: Popularity is related to impact on community

Diversity (Pipe Source):
1 Same structure, fields, content
2 Same structure, field counts
3 Same structure
4 Same bag of modules
5 Same set of modules
6 Same type bag
7 Same size
8 No match
Highlighted examples: 3 Same structure; 5 Same set of modules

Significance: Diversity is related to contribution novelty
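
As a rough illustration of how two pipes might be assigned one of the matching levels listed above, here is a minimal Python sketch. It is not the authors' implementation. It assumes a pipe can be represented as a plain list of module names (the names `fetch`, `filter`, and `sort` are hypothetical), and it covers only levels 4, 5, 7, and 8, because the slides do not define how structure, fields, or module types are compared for levels 1 to 3 and 6.

```python
from collections import Counter

def match_level(pipe_a, pipe_b):
    """Return the closest matching level for two pipes.

    Each pipe is a list of module names (an assumed representation).
    Only levels 4, 5, 7, and 8 from the slide are modeled here.
    """
    if Counter(pipe_a) == Counter(pipe_b):
        return 4  # same bag of modules: same names with the same counts
    if set(pipe_a) == set(pipe_b):
        return 5  # same set of modules: same names, counts may differ
    if len(pipe_a) == len(pipe_b):
        return 7  # same size: same number of modules
    return 8      # no match

# Hypothetical example pipes
same_bag = match_level(["fetch", "filter", "sort"], ["sort", "fetch", "filter"])
same_set = match_level(["fetch", "fetch", "filter"], ["fetch", "filter", "filter"])
same_size = match_level(["fetch", "filter"], ["fetch", "sort"])
no_match = match_level(["fetch"], ["fetch", "sort"])
```

A lower level means a closer match, so under this reading a pipe whose best match in the repository has a high level would count as a more novel contribution.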

Study Methods

Data Collection

Artifacts: 32,887
Authors: 20,313

Threats: public repository offers limited visibility (internal); sampling bias (external); generalizability to other domains (external)

Results

RQ1: Characteristics of Yahoo! Pipes Community

Summary

Metric           Average
Size             8.20 modules per pipe
Configurability  0.65 modules per pipe
Popularity       5.67 clones per pipe
Diversity        3.62 cluster level

34% of pipes are configurable
54% of pipes have been cloned
5% of pipes are exact duplicates, yet 46% have a match if field values are relaxed

Take Aways:

There is a lot of reuse of shared pipes
Participants often submit pipes that are highly similar to other pipes in the repository

RQ2: Analysis of artifacts as authors gain experience
Comparisons based on experience (time)

Procedure: for each pipe, get the days of experience for its author; if days < 31, add the pipe to Early; otherwise add it to Late.

Characteristic      µ_early  µ_late
# of Pipes          27,555   5,332
Diversity***        3.519    4.126
Popularity***       4.984    9.254
Configurability***  0.614    0.838
Size***             7.919    9.587

H0: µ_early > µ_late
Ha: µ_early ≤ µ_late

Signif. codes: *** 0.001, ** 0.01
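
The Early/Late split described on this slide could be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: the record fields (`author`, `created`, `size`) are hypothetical, and an author's days of experience is assumed to be the number of days between that author's earliest pipe in the data set and the pipe in question. Only the 31-day threshold comes from the slide.

```python
from datetime import date
from statistics import mean

EARLY_CUTOFF_DAYS = 31  # from the slide: days < 31 goes to Early

def split_by_experience(pipes):
    """Partition pipes into (early, late) by author experience in days."""
    # Assumption: experience starts at the author's earliest pipe.
    first_seen = {}
    for pipe in pipes:
        author = pipe["author"]
        if author not in first_seen or pipe["created"] < first_seen[author]:
            first_seen[author] = pipe["created"]

    early, late = [], []
    for pipe in pipes:
        days = (pipe["created"] - first_seen[pipe["author"]]).days
        (early if days < EARLY_CUTOFF_DAYS else late).append(pipe)
    return early, late

# Hypothetical sample records
sample_pipes = [
    {"author": "a", "created": date(2010, 1, 1), "size": 4},
    {"author": "a", "created": date(2010, 1, 20), "size": 6},
    {"author": "a", "created": date(2010, 3, 1), "size": 10},
    {"author": "b", "created": date(2010, 2, 1), "size": 8},
]
early, late = split_by_experience(sample_pipes)
mean_size_early = mean(p["size"] for p in early)
mean_size_late = mean(p["size"] for p in late)
```

The per-group means correspond to the µ columns in the table; the significance test itself is not reproduced here because the slide does not name the test that was used.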

Take Away: More experience results in pipes that are larger, more popular, more configurable, and more diverse

RQ2: Analysis of artifacts as authors gain experience
Comparisons based on contributions

Procedure: for each author, count all pipes created; if pipes > 15, add the author's pipes to Many; otherwise add them to Few.

Characteristic      µ_few   µ_many
# of Pipes          30,503  2,384
Diversity           3.639   3.355
Popularity***       4.302   23.250
Configurability***  0.644   0.729
Size**              8.194   8.136

H0: µ_few > µ_many
Ha: µ_few ≤ µ_many

Signif. codes: *** 0.001, ** 0.01
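
The contribution-based grouping on this slide works per author rather than per pipe. A minimal Python sketch, assuming each pipe record carries a hypothetical `author` field, might look like this. Only the threshold of 15 pipes comes from the slide.

```python
from collections import Counter

MANY_THRESHOLD = 15  # from the slide: pipes > 15 goes to Many

def split_by_contributions(pipes):
    """Partition pipes into (few, many) by their author's total pipe count."""
    pipes_per_author = Counter(pipe["author"] for pipe in pipes)
    few, many = [], []
    for pipe in pipes:
        if pipes_per_author[pipe["author"]] > MANY_THRESHOLD:
            many.append(pipe)
        else:
            few.append(pipe)
    return few, many

# Hypothetical data: author "x" has 16 pipes, author "y" has 15
sample = [{"author": "x"} for _ in range(16)] + [{"author": "y"} for _ in range(15)]
few, many = split_by_contributions(sample)
```

Because the comparison is strictly greater than 15, an author with exactly 15 pipes lands in the Few group.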

Take Away: The most prolific authors create pipes that are larger, more popular, and more configurable

... what about diversity?

RQ3: Characteristics of most prolific authors
Study Set-up

Authors: 20,313

Prolific Authors: 81

RQ3: Characteristics of most prolific authors
Rolling Cluster Analysis

26 / 31

Page 79: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Author Activities

[Figure: Rolling Diversity Analysis Over Time. y-axis: Diversity (0 to 8). x-axis: time in days, 806 total, with interval labels 19, 14, 3, 45, 14, 58, 6, 10, 8, 9, 8, 35, 69, 140, 368.]
27 / 31

Page 80: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Author Activities

Level 2: Same structure and field counts; relax field values

43% of pipes submitted by prolific authors represent tweaks

For example: change a URL, filter criterion, sort order, ...
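
A minimal sketch of what a Level 2 comparison could look like, assuming a pipe is modeled as an ordered list of modules, each with a type and a dictionary of fields. This representation and the name is_level2_match are hypothetical, not the authors' implementation; real pipes also have wires between modules, which this sketch ignores.

```python
from typing import Any, Dict, List

# Hypothetical model: a pipe is an ordered list of modules; each module has
# a "type" name and a "fields" dict mapping field names to values.
Pipe = List[Dict[str, Any]]


def is_level2_match(pipe_a: Pipe, pipe_b: Pipe) -> bool:
    """Return True when two pipes share module structure and field counts.

    Field values are deliberately ignored (the "relaxed" part of Level 2).
    """
    if len(pipe_a) != len(pipe_b):
        return False
    for mod_a, mod_b in zip(pipe_a, pipe_b):
        if mod_a["type"] != mod_b["type"]:
            return False
        if len(mod_a["fields"]) != len(mod_b["fields"]):
            return False
    return True


# Illustrative data only: the tweak changes a URL, a filter criterion,
# and a sort order, but keeps the same modules and field counts.
original = [
    {"type": "fetch", "fields": {"url": "http://example.com/news.rss"}},
    {"type": "filter", "fields": {"contains": "python"}},
    {"type": "sort", "fields": {"key": "pubDate"}},
]
tweak = [
    {"type": "fetch", "fields": {"url": "http://example.com/sports.rss"}},
    {"type": "filter", "fields": {"contains": "football"}},
    {"type": "sort", "fields": {"key": "title"}},
]
different = [
    {"type": "fetch", "fields": {"url": "http://example.com/news.rss"}},
]
```

Under these assumptions, original and tweak match at Level 2, while original and different do not.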

27 / 31

Page 81: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Author Activities

Level 8: No structural similarities

43% of pipes submitted by prolific authors represent tweaks

52% of pipes submitted by prolific authors represent new initiatives

27 / 31

Page 82: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Author Activities

[Figure: Rolling Diversity Analysis Over Time. y-axis: Diversity (0 to 8). x-axis: time in days, 713 total, with interval labels 0, 513, 16, 148, 2, 0, 0, 0, 0, 0, 1, 0, 0, 31, 0, 0, 1, 0, 1.]

56% of prolific authors consistently submit new initiatives

27 / 31

Page 83: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Author Activities

[Figure: Rolling Diversity Analysis Over Time. y-axis: Diversity (0 to 8). x-axis: time in days, 19 total, with interval labels 0, 0, 0, 0, 0, 0, 0, 0, 11, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 5.]

27% of prolific authors consistently submit tweaks

27 / 31

Page 84: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Empirical Study

Results

RQ3: Characteristics of most prolific authors

Take Away #1: 1/2 of participants submit pipes that are novel to their previous contributions

Take Away #2: 1/4 of participants submit pipes that are tweaks of their other pipes

28 / 31

Page 85: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Discussion

Implications

The real take away

End-user programmer communities may need . . .

moderators.
→ Repository is cluttered with highly similar artifacts (RQ1)

more sophisticated repository search.
→ Many pipes are very structurally similar to other pipes in the repository (RQ1)
→ Early authors create less diverse pipes than later authors (RQ2)

artifact development support.
→ Tweaks represent missed opportunities for parameterization (RQ3)
→ Many shared pipes are tweaks on previously-committed pipes by the same author (RQ3)
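
One way to read the parameterization point: pipes that differ only in a URL, filter criterion, or sort order could come from a single template that takes those values as inputs. The sketch below is illustrative only; build_feed_pipe and the module and field names are hypothetical and are not part of Yahoo! Pipes.

```python
from typing import Any, Dict, List


def build_feed_pipe(url: str, keyword: str, sort_key: str = "pubDate") -> List[Dict[str, Any]]:
    """Build one pipe description from parameters instead of copying a pipe.

    The structure (fetch, then filter, then sort) is fixed; only the field
    values vary, which is what a tweak changes.
    """
    return [
        {"type": "fetch", "fields": {"url": url}},
        {"type": "filter", "fields": {"contains": keyword}},
        {"type": "sort", "fields": {"key": sort_key}},
    ]


# Two "tweaks" expressed as two calls to the same template.
news_pipe = build_feed_pipe("http://example.com/news.rss", "python")
sports_pipe = build_feed_pipe("http://example.com/sports.rss", "football", sort_key="title")
```

With a template like this, the repository would hold one shared artifact rather than many near-identical copies.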

29 / 31

Page 88: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Discussion

Threats

Threats to Validity

Internal
→ History (the pipes were sampled at different times)
→ Selection (the repository only provides public pipes)

Construct
→ Interaction of different factors
→ Mono-method bias on diversity (only consider structural diversity, not semantic)

External
→ Generalizability (only studied one community)
→ Sampling bias (could not control search results when sampling)

30 / 31

Page 89: End-User Programmers and their Communities: An Artifact ... · Environment Yahoo! Pipes Scratch MATLAB # Artifacts 100,000 700,000 13,717 # Participants 90,000 500,000 5,356:::yet

End-User Programmers and their Communities

Discussion

Conclusion

Conclusion

Authors utilize the repository in different ways

As authors gain experience in the environment, they tend to make more valuable contributions to the repository

There is a need for better support to help end-user programmer communities continue to progress and grow

To generalize the results, we are interested in extending the metrics to other languages and repositories

To facilitate replication, the data used in this analysis is available: http://cse.unl.edu/~kstolee/esem2011/artifacts.html

31 / 31