How Do Developers Blog? An Exploratory Study
DESCRIPTION
We report on an exploratory study that aims to understand how software developers use social media compared to conventional development infrastructures. We analyzed the blogging and committing behavior of 1,100 developers in four large open source communities. We observed that these communities use blogs intensively, with one new entry about every 8 hours. A blog entry includes 14 times more words than a commit message. When analyzing the content of the blogs, we found that the most popular topics represent high-level concepts such as functional requirements and domain concepts. Source code related topics are covered in less than 15% of the posts. Our results also show that developers are more likely to blog after corrective engineering and management activities than after forward engineering and re-engineering activities. Our findings call for hypothesis-driven research to further understand the role of social media in software engineering and integrate it into development processes and tools.
TRANSCRIPT
How Do Developers Blog?
An Exploratory Study
Dennis Pagano, Technische Universität München
Walid Maalej, Technische Universität München
Integrating Social Media in Software Development
Executive Summary
Pagano, Maalej, May 2011 How Do Developers Blog? – MSR 2011 2
• We analyzed blogging frequency, post content, and commit messages in 4 open source communities
• Blog posts exhibit recurrent usage patterns and common topics
• Relationship between developers’ blogging and committing behavior
Outline
1 Motivation
2 Research Setting
3 Research Results
4 Summary
Motivation
• Many software developers publish information related to their work in blogs
• We do not know how they blog and what they blog
• We also do not understand the connection to “conventional” development activities
Towards an empirical framework on the role of blogs and other social media in software engineering
Research Questions
1. Blog Usage
• How often do committers and other bloggers post?
• What are typical elements in a post?
2. Blog Content
• Which topics do developers blog about?
• How popular are these topics across the projects?
3. Blog Integration
• Are there publishing patterns in developers’ workflows?
• Is there a semantic relationship between work performed and information blogged?
Research Method
Collected Data Sets
                        Eclipse   GNOME     PostgreSQL  Python
# posts                 10,333    18,323    2,691       18,660
# bloggers              328       342       95          405
# commits               239,659   252,831   30,745      45,116
# committers            467       2,294     34          178
# blogging committers   93        250       12          34
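The raw counts above are easier to compare as ratios. The following is a minimal sketch that derives a few of them; the figures are copied from the table, while the ratio names (`posts_per_blogger`, `blogging_committer_share`, and so on) are illustrative choices and do not appear in the slides.

```python
# Figures copied from the "Collected Data Sets" table.
DATA = {
    "Eclipse":    {"posts": 10_333, "bloggers": 328, "commits": 239_659,
                   "committers": 467, "blogging_committers": 93},
    "GNOME":      {"posts": 18_323, "bloggers": 342, "commits": 252_831,
                   "committers": 2_294, "blogging_committers": 250},
    "PostgreSQL": {"posts": 2_691, "bloggers": 95, "commits": 30_745,
                   "committers": 34, "blogging_committers": 12},
    "Python":     {"posts": 18_660, "bloggers": 405, "commits": 45_116,
                   "committers": 178, "blogging_committers": 34},
}


def derived_ratios(row):
    """Return illustrative ratios for one community (names are assumptions)."""
    return {
        "posts_per_blogger": row["posts"] / row["bloggers"],
        "commits_per_committer": row["commits"] / row["committers"],
        "blogging_committer_share": row["blogging_committers"] / row["committers"],
    }


RATIOS = {name: derived_ratios(row) for name, row in DATA.items()}

if __name__ == "__main__":
    for name, r in RATIOS.items():
        print(f"{name:<11}"
              f"{r['posts_per_blogger']:7.1f} posts/blogger"
              f"{r['commits_per_committer']:9.1f} commits/committer"
              f"{r['blogging_committer_share']:8.1%} of committers blog")
```

The share of committers who also blog differs noticeably between communities, which is worth keeping in mind when reading the per-community results later in the deck.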
Blog Usage: Publishing Frequency
• Mean time between successive posts within a community is about 8.1 hours
• Committers blog on average more frequently (every 26 days) than other bloggers (every 28 days)
• Committers use blogs for longer time periods (about 2.2 years) than other bloggers (about 1.6 years)
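A figure such as "mean time between successive posts" can be computed from post timestamps alone. The sketch below shows one straightforward way to do it; the function name `mean_gap_hours` and the example timestamps are hypothetical, and the slides do not state how the authors computed their value.

```python
from datetime import datetime


def mean_gap_hours(timestamps):
    """Mean time in hours between successive posts of one community."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two posts to measure a gap")
    gaps = [(later - earlier).total_seconds() / 3600
            for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical timestamps: gaps of 6 and 10 hours.
EXAMPLE = [
    datetime(2011, 5, 1, 0, 0),
    datetime(2011, 5, 1, 6, 0),
    datetime(2011, 5, 1, 16, 0),
]

if __name__ == "__main__":
    print(mean_gap_hours(EXAMPLE))
```

The same function applied to the posts of a single blogger gives a per-person interval, which is the kind of quantity behind the 26-day and 28-day figures.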
Blog Usage: Post Structure
• Posts comprise 150 words on average
• 95% are shorter than 1,000 words
• 1.8% of posts contain source code paragraphs
• 80% of posts contain links
– Committers include more links to Wikis (11%) than other bloggers (8%)
– Committers more frequently link to other blogs (28%) than other bloggers (25%)
• 29% of posts contain images
– Committers post more screenshots (22%) than other bloggers (18%)
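Structural features like these can be approximated with simple pattern matching on the HTML of each post. The sketch below is an assumption-laden illustration, not the authors' extraction method: it treats `<pre>` or `<code>` tags as a sign of source code and counts whitespace-separated tokens as words. All names (`post_features`, `share`) are hypothetical.

```python
import re

# Heuristic patterns (assumptions): real blog HTML may need a proper parser.
LINK_RE = re.compile(r"<a\s[^>]*href=", re.IGNORECASE)
IMAGE_RE = re.compile(r"<img\b", re.IGNORECASE)
CODE_RE = re.compile(r"<(pre|code)\b", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")


def post_features(html):
    """Return word count and link/image/code flags for one post."""
    text = TAG_RE.sub(" ", html)  # strip tags before counting words
    return {
        "words": len(text.split()),
        "has_link": bool(LINK_RE.search(html)),
        "has_image": bool(IMAGE_RE.search(html)),
        "has_code": bool(CODE_RE.search(html)),
    }


def share(posts, feature):
    """Fraction of posts for which the given boolean feature is set."""
    flags = [post_features(p)[feature] for p in posts]
    return sum(flags) / len(flags)


EXAMPLE_POSTS = [
    '<p>See the <a href="http://example.org/wiki">wiki</a> page</p>',
    '<pre>int x = 1;</pre><img src="shot.png">',
]

if __name__ == "__main__":
    for post in EXAMPLE_POSTS:
        print(post_features(post))
    print("share with links:", share(EXAMPLE_POSTS, "has_link"))
```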
Findings on Blog Usage
1. Regular social activities in the studied open source communities
2. Posts are less frequent than commit messages but comprise more content
3. Posts rarely contain source code, but frequently contain high-level information and images
4. Committers reuse more knowledge than other bloggers
Method Used to Analyze Blog Content
Set of words per community -> Common topic labels -> Topic category
Examples:
• radio, listen, player, sync, song, music, play, ipod, album, artist, band -> Functional requirements & domain concepts -> Requirements
• cache, memory, perform, high, quality, limit, secure, cost -> Non-functional requirements -> Requirements
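The two-level mapping on this slide (word set to topic label, topic label to category) can be represented as two lookup tables. The sketch below uses the example word sets from the slide; the keyword-overlap rule in `label_post` and its `min_overlap` threshold are stand-in assumptions for illustration only, since the slides do not describe how the word sets were derived or how posts were assigned to topics.

```python
# Word sets and labels taken from the slide's two examples.
TOPIC_WORDS = {
    "functional requirements & domain concepts": {
        "radio", "listen", "player", "sync", "song", "music",
        "play", "ipod", "album", "artist", "band",
    },
    "non-functional requirements": {
        "cache", "memory", "perform", "high", "quality",
        "limit", "secure", "cost",
    },
}

# Topic label -> topic category, as shown on the slide.
TOPIC_CATEGORY = {
    "functional requirements & domain concepts": "Requirements",
    "non-functional requirements": "Requirements",
}


def label_post(words, min_overlap=2):
    """Assign topic labels by keyword overlap (a hypothetical stand-in rule)."""
    tokens = {w.lower() for w in words}
    return {label for label, vocab in TOPIC_WORDS.items()
            if len(tokens & vocab) >= min_overlap}


def categories_of(labels):
    """Map a set of topic labels to their topic categories."""
    return {TOPIC_CATEGORY[label] for label in labels}


if __name__ == "__main__":
    labels = label_post("sync my ipod music library".split())
    print(labels, categories_of(labels))
```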
Popular Topics Identified in Blogs
#    Topic description                           Popularity  Examples of influential words
1    functional requirements & domain concepts   42.2%       radio, listen, player, sync, song, music, play, ipod, album, artist, band
2    community & contributions                   37.7%       people, community, contribute, group, help, news, post, comment
3    API usage & project documentation           30.0%       wiki, write, project, api, document, use, review, text, output
...  ...                                         ...         ...
15   source code                                 14.7%       void, new, import, public, final, string, class, return, private, true
...  ...                                         ...         ...
23   continuous integration                      2.5%        resource, test, source, build, configure, hudson, project, generate
Topic Categories Identified in Blogs
#    Category        Popularity  Example topics
1    Requirements    51%         functional requirements & domain concepts, non-functional requirements
2    Community       45%         community & contributions, conferences
...  ...             ...         ...
6    Implementation  29%         source code, solution concepts
...  ...             ...         ...
10   Business        15%         licensing
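The popularity values in both tables add up to more than 100%, which suggests that a post can cover several topics at once. Under that reading (an assumption, not something the slides state), popularity is the share of posts touching a topic, and a category counts a post once even if it matches several of the category's topics. A minimal sketch with hypothetical posts:

```python
from collections import Counter

# Topic -> category pairs taken from the category table.
TOPIC_CATEGORY = {
    "functional requirements & domain concepts": "Requirements",
    "non-functional requirements": "Requirements",
    "community & contributions": "Community",
    "source code": "Implementation",
}


def topic_popularity(post_labels):
    """Share of posts that touch each topic."""
    counts = Counter(label for labels in post_labels for label in labels)
    return {label: n / len(post_labels) for label, n in counts.items()}


def category_popularity(post_labels, topic_category):
    """Share of posts that touch at least one topic of each category."""
    counts = Counter()
    for labels in post_labels:
        # A set ensures a post is counted once per category.
        for category in {topic_category[label] for label in labels}:
            counts[category] += 1
    return {category: n / len(post_labels) for category, n in counts.items()}


# Hypothetical topic assignments for four posts.
EXAMPLE_POSTS = [
    {"functional requirements & domain concepts", "community & contributions"},
    {"functional requirements & domain concepts", "non-functional requirements"},
    {"source code"},
    {"community & contributions"},
]

if __name__ == "__main__":
    print(topic_popularity(EXAMPLE_POSTS))
    print(category_popularity(EXAMPLE_POSTS, TOPIC_CATEGORY))
```

In this toy data the second post matches two Requirements topics but counts only once for the category, so the category share is lower than the sum of its topic shares.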
Findings on Blog Content
1. More than half of all posts discuss requirements
2. Source code and technical content are less popular
3. Blogs comprise more high-level than low-level concepts
4. Community-related topics are orthogonal to other topics
Blog Integration
Figure: Categories of Commits Before Blog Posts
Blog Integration Results
• Most blog posts occur after corrective commits (~36.5%), least after management commits (~19%)
• Conversely, the probability that a commit is followed by a blog post is highest for corrective (2.6%) and management (2.3%) commits
• 15.5% of posts included information previously described in commit messages
• Dependency between commit messages and blog posts decreases as the time between them increases
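The second bullet describes a conditional probability: given a commit of a certain category, how likely is it that the same developer blogs shortly afterwards? The sketch below shows one way such a value could be estimated. The one-day `window`, the data layout, and the function name are assumptions made for illustration; the slides do not state the time window or matching rule the authors used.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def blog_probability_by_category(commits, posts, window=timedelta(days=1)):
    """Estimate P(post follows within `window` | commit category).

    commits: iterable of (author, time, category)
    posts:   iterable of (author, time)
    """
    posts_by_author = defaultdict(list)
    for author, when in posts:
        posts_by_author[author].append(when)

    total = defaultdict(int)
    followed = defaultdict(int)
    for author, when, category in commits:
        total[category] += 1
        # Count the commit if the same author posts inside the window.
        if any(when < p <= when + window for p in posts_by_author[author]):
            followed[category] += 1
    return {category: followed[category] / total[category] for category in total}


# Hypothetical data: one of two corrective commits is followed by a post.
EXAMPLE_COMMITS = [
    ("ann", datetime(2011, 5, 1, 10, 0), "corrective"),
    ("ann", datetime(2011, 5, 3, 10, 0), "corrective"),
    ("bob", datetime(2011, 5, 1, 9, 0), "forward"),
]
EXAMPLE_POSTS = [("ann", datetime(2011, 5, 1, 18, 0))]

if __name__ == "__main__":
    print(blog_probability_by_category(EXAMPLE_COMMITS, EXAMPLE_POSTS))
```

Widening `window` would let older commits count as "followed", which is one way to probe the last bullet's claim that the dependency weakens over time.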
Findings on Blog Integration
1. Bug fixes and management activities are frequently shared with all stakeholders in a software community
2. Developers are more likely to publish information about their recent activities than about old activities
Summary of the Talk
• Integration of blogging (social) activities
• Tools for knowledge reuse and topic annotation in blogs
Blog Usage
• Blogging is a regular social activity
• High-level information (screenshots) and knowledge reuse (links)
• Short documentation, tutorials, howtos
Blog Content
• High-level topics like requirements are predominant
• Low-level topics like source code are less popular
Blog Integration
• Posts are most likely after corrective or management commits
• Developers use blogs to elaborate recent commit messages
Feedback, Questions, Suggestions and
Collaboration are Welcome!
Dennis Pagano
TUM
Walid Maalej
TUM
BACKUPS
Blog Content – Most Popular Topics
Eclipse
1. functional requirements & domain concepts (47%)
2. community & contributions (33%)
3. architecture & packages (31%)
4. target platform (26%)
5. solution concepts & technology (26%)
GNOME
1. community & contributions (36%)
2. functional requirements & domain concepts (33%)
3. user interface & user interaction (32%)
4. architecture & packages (32%)
5. development activities (31%)
PostgreSQL
1. functional requirements & domain concepts (47%)
2. non-functional requirements (40%)
3. community & contributions (39%)
4. release management & announcements (38%)
5. conferences (25%)
Python
1. API usage & project documentation (50%)
2. functional requirements & domain concepts (42%)
3. community & contributions (42%)
4. deployment & dependencies (36%)
5. architecture & packages (36%)
Blog Integration
Figure: Frequencies of Commit Categories
Blog Integration
Blogging Probability after Commits
Commit type             Eclipse  GNOME   PostgreSQL  Python
Corrective Engineering  1.49%    2.67%   2.17%       4.20%
Forward Engineering     1.35%    1.64%   1.16%       2.81%
Management              0.90%    2.95%   2.46%       2.79%
Re-engineering          1.13%    1.95%   0.99%       2.87%
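The 2.6% (corrective) and 2.3% (management) values quoted on the results slide coincide with the unweighted means of the corresponding rows in this table. This is an observation from the numbers, not a computation the slides describe, and the sketch below merely reproduces the arithmetic.

```python
# Per-community blogging probabilities (%) copied from the table,
# in the order Eclipse, GNOME, PostgreSQL, Python.
PROBABILITY = {
    "Corrective Engineering": [1.49, 2.67, 2.17, 4.20],
    "Forward Engineering":    [1.35, 1.64, 1.16, 2.81],
    "Management":             [0.90, 2.95, 2.46, 2.79],
    "Re-engineering":         [1.13, 1.95, 0.99, 2.87],
}

# Unweighted mean across the four communities (an assumed aggregation).
MEANS = {kind: sum(values) / len(values) for kind, values in PROBABILITY.items()}

if __name__ == "__main__":
    for kind, mean in MEANS.items():
        print(f"{kind:<24}{mean:.2f}%")
```

An unweighted mean gives each community equal influence regardless of its commit volume; weighting by the number of commits would shift the result toward Eclipse and GNOME.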
Blog Integration
Time Dependencies between Blogs and Commits