Security and Emotion: Sentiment Analysis of Security Discussions on GitHub

@DanielPletea @b_vasilescu @aserebrenik
Eindhoven University of Technology, NL

Upload: alexander-serebrenik

Post on 19-May-2015


DESCRIPTION

Application security is becoming increasingly prevalent during software and especially web application development. Consequently, countermeasures are continuously being discussed and built into applications, with the goal of reducing the risk that unauthorized code will be able to access, steal, modify, or delete sensitive data. We gauged the presence and atmosphere surrounding security-related discussions on GitHub, as mined from discussions around commits and pull requests. First, we found that security-related discussions account for approximately 10% of all discussions on GitHub. Second, we found that more negative emotions are expressed in security-related discussions than in other discussions. These findings confirm the importance of properly training developers to address security concerns in their applications as well as the need to test applications thoroughly for security vulnerabilities in order to reduce frustration and improve overall project atmosphere.

TRANSCRIPT

Page 1: Security and Emotion: Sentiment Analysis of Security Discussions on GitHub

Security and Emotion: Sentiment Analysis of Security Discussions on GitHub

@DanielPletea @b_vasilescu @aserebrenik
Eindhoven University of Technology, NL

Page 3

SEC NEG: “Blocking a handful of very specific exploits is less useful, it gives the appearance of security when there may be many other vulnerabilities not protected against.”

SEC POS: “woot! one more exploit gone!”

Page 4

Security = more negative emotions

Similar results
• commits/pull requests
• individual comments/discussions

Page 5

Glossary of Key Information Security Terms

Co-occurring tags

Final list of security terms: exploit, ldap, spoofing, ...

Challenge data

Comments, Discussions

Security/other comments

Security/other discussions

NLTK: Neutral %, Pos/Neg %
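The pipeline on this slide (term list, then a security/other split, then sentiment percentages) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: only the three terms shown on the slide are used, matching on word prefixes is an assumption, and `classify` is a hypothetical stand-in for whatever sentiment tool is plugged in (the slides name NLTK).

```python
import re
from collections import Counter

# Only these three terms are visible on the slide; the real list is longer.
SECURITY_TERMS = ["exploit", "ldap", "spoofing"]

# Assumption: a term matches at the start of a word, so "exploits" also counts.
_TERM_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in SECURITY_TERMS) + r")",
    re.IGNORECASE,
)


def is_security(text):
    """Return True if the text mentions at least one security term."""
    return _TERM_RE.search(text) is not None


def sentiment_shares(texts, classify):
    """Percentage of texts per sentiment label.

    `classify` is a hypothetical callable that maps a text to a label
    such as 'neutral', 'pos' or 'neg'.
    """
    counts = Counter(classify(t) for t in texts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


def split_and_score(texts, classify):
    """Split texts into security/other and compute sentiment shares for each."""
    security = [t for t in texts if is_security(t)]
    other = [t for t in texts if not is_security(t)]
    return {
        "security": sentiment_shares(security, classify),
        "other": sentiment_shares(other, classify),
    }
```

Passing the classifier in as a parameter keeps the split step independent of the sentiment tool, so either part can be swapped out without touching the other.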

Page 6

Challenge data ≠ GitHub

Recognition of security comments/discussions might be imperfect

NLTK was trained on movie reviews & tweets

Commit messages were cut to 256 characters
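To illustrate why the 256-character cut matters for recognizing security text, here is a small sketch. The message and the helper names `truncate` and `mentions` are made up for illustration; the only detail taken from the slide is the 256-character limit.

```python
MAX_LEN = 256  # limit stated on the slide


def truncate(message, limit=MAX_LEN):
    """Keep only the first `limit` characters, as in the truncated data."""
    return message[:limit]


def mentions(term, message):
    """Case-insensitive substring check for a security term."""
    return term.lower() in message.lower()


# Hypothetical commit message whose security-relevant part comes late.
message = (
    "Refactor session handling. "
    + "x" * 240
    + " Also fixes an ldap spoofing issue."
)

found_in_full = mentions("ldap", message)
found_in_cut = mentions("ldap", truncate(message))
```

In this constructed case the term is present in the full message but absent after truncation, so a keyword-based split would label the commit as "other".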
