The Ultimate Debian Database
DESCRIPTION
Some comments about the sources of data stored in the Ultimate Debian Database
TRANSCRIPT
The Ultimate Debian Database
Israel Herraiz
Davis, CA, July 26th 2012
Download these slides at http://slideshare.net/herraiz/the-ultimate-debian-database
Outline
1. Debian: what is it and sources of data
2. The UDD: what is it and where to get it
3. What has been done and what we can do
1. Debian: what is it and sources of data
Debian
• GNU/Linux software distribution
• Goal: to deliver an entirely and exclusively free distribution
• Maintained by volunteers
• Bureaucratic organization (policies, constitution, social contract)
• Release when ready
• > 10 years history
• > 500 MSLOC
• > 15k packages
Debian Releases
Debian Source Packages
Source and Binary Packages
• A source package generates one or more binary packages
octave
octave-core
octave-doc
liboctave
liboctave-dev
Package uploads
• There are no repositories as in other software projects
• Although developers may privately use version control systems
• When a bug is fixed, a new version is uploaded
• Uploads == commits
Source Packages metadata
Source: octave
Section: math
Priority: extra
Maintainer: Debian Octave Group <[email protected]>
Uploaders: Thomas Weber <[email protected]>, Sébastien Villemot
DM-Upload-Allowed: yes
Build-Depends: gfortran, debhelper (>= 9), automake, dh-autoreconf, texinfo ….
Standards-Version: 3.9.3
Homepage: http://www.octave.org/
Vcs-Git: git://git.debian.org/git/pkg-octave/octave.git
Vcs-Browser: http://git.debian.org/?p=pkg-octave/octave.git
Binary Packages metadata
Package: octave
Priority: extra
Section: math
Installed-Size: 4760
Maintainer: Ubuntu Developers <[email protected]>
Architecture: amd64
Version: 3.6.1-1ubuntu1ppa1~precise1
Recommends: gnuplot, libatlas3gf-base
Replaces: octave3.2
Suggests: octave-info, octave-doc, octave-htmldoc
Depends: libamd2.2.0 (>= 1:3.4.0), libarpack2 (>= 2.1), …
Conflicts: octave3.2
Filename: pool/main/o/octave/octave_3.6.1-1ubuntu1ppa1~precise1_amd64.deb
Size: 1746050
MD5sum: 2c431556d6cf98fd8a341e865ac63058
SHA1: b333c49e6f6cb7d4445378020dfffdb5a1626de7
Description: GNU Octave language for numerical computations…
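The metadata above is plain "Field: value" text, so it is easy to turn into records before loading it anywhere. The following is a minimal sketch in Python, not the tooling the UDD itself uses: the helper names parse_stanza and split_list are made up for illustration, and the sample stanza is a shortened copy of the slide (the truncated Depends line is left out on purpose).

```python
def parse_stanza(text):
    """Parse one control-file stanza ("Field: value" lines) into a dict."""
    fields = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line[0] in " \t" and last is not None:
            # Continuation line: append it to the previous field's value
            fields[last] += "\n" + line.strip()
            continue
        # Split on the first colon only, so values such as "1:3.4.0" survive
        name, _, value = line.partition(":")
        last = name.strip()
        fields[last] = value.strip()
    return fields


def split_list(value):
    """Split a comma-separated relationship field into package names.

    Version constraints such as "(>= 2.1)" are dropped. Alternatives
    written as "a | b" are not handled; only the first name is kept.
    """
    names = []
    for item in value.split(","):
        item = item.strip()
        if item:
            names.append(item.split()[0])
    return names


# Shortened copy of the binary package stanza shown on the slide
stanza = """\
Package: octave
Priority: extra
Section: math
Installed-Size: 4760
Architecture: amd64
Version: 3.6.1-1ubuntu1ppa1~precise1
Recommends: gnuplot, libatlas3gf-base
Suggests: octave-info, octave-doc, octave-htmldoc
Conflicts: octave3.2
"""

record = parse_stanza(stanza)
print(record["Package"], record["Version"])
print(split_list(record["Suggests"]))
```

All values come back as strings; converting Installed-Size or Size to integers is left to the caller.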
Debian Popcon: Tracking Installations
• Popularity: total install counts
• Recent Use (< 30 days)
• Old Use (Beyond 30 days)
• Data collected daily
• Users voluntarily opt-in
• Source of bias
Debian Bugs
• People find bugs in binary packages
• ~500 bugs per month
• But bugs are linked to source packages
• Bugs can be
  • Accepted and solved in Debian
  • Rejected
  • Forwarded to upstream
• Everything else, similar to other bug tracking systems
  • Life cycle, comments, severity levels…
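Because bugs are found in binary packages but linked to source packages, any per-package bug count first needs a binary-to-source mapping. A rough sketch of that aggregation follows. The mapping is one hypothetical reading of the octave example from the earlier slide, the bug numbers are invented for illustration, and bugs_per_source is not a UDD function.

```python
from collections import Counter

# Hypothetical mapping: assumes these binaries are all built from the
# "octave" source package, as the earlier slide suggests.
BINARY_TO_SOURCE = {
    "octave-core": "octave",
    "octave-doc": "octave",
    "liboctave": "octave",
    "liboctave-dev": "octave",
}


def bugs_per_source(reports, binary_to_source):
    """Count bug reports per source package.

    `reports` is an iterable of (bug_id, package) pairs, where the
    package is whatever name the reporter filed the bug against.
    """
    counts = Counter()
    for _bug_id, package in reports:
        # A package missing from the mapping is assumed to be its own source
        source = binary_to_source.get(package, package)
        counts[source] += 1
    return counts


# Invented bug numbers, for illustration only
reports = [
    (1001, "octave-doc"),
    (1002, "liboctave-dev"),
    (1003, "octave"),
    (1004, "gnuplot"),
]
per_source = bugs_per_source(reports, BINARY_TO_SOURCE)
print(per_source)
```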
2. The UDD: what is it and where to get it
Research work: main paper (at MSR 2010)
Other papers at MSR 2010
What is the UDD?
• PostgreSQL database with all the information from the sources described so far
• http://udd.debian.org
• New dumps available every two days
• ~ 500 MB bz2
• Used for some Debian internal services
• Schema too complex and too big for a slide
• Technical detail: you need a Debian-based system to load the dump of the UDD
Debian sources of data
• Sources / Packages metadata
• Bugs
  • including *all* archived bugs
  • 1995-96-97
• Carnivore
• Debtags
• Popularity Contest
• DEHS
• Lintian
• Migrations to testing
• Uploads
  • All the way back to 1998!
• New packages queue
• Translations status
• Orphaned packages
• Screenshots
Bear in mind!
• You can also obtain the source code of the packages
• Easy to automate
• And the modifications done by the Debian maintainers
• So add product metrics to the set of data sources
• But this is not included in the UDD
3. What has been done and what we can do
What kind of questions does Debian answer with the UDD?
• High priority packages that have release-critical (RC) bugs
• Developers with very buggy and/or outdated packages
• Who uploaded this package to the unstable release?
• Who reported the RC bugs since the last release?
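The first question in this list (high priority packages with open RC bugs) comes down to a join between package metadata and bugs. Here is a sketch of such a query, run against a toy in-memory SQLite database standing in for the PostgreSQL UDD. The table and column names are assumptions made for this example and are not the real UDD schema; the rows are invented; and "high priority" is taken to mean the required, important and standard priorities. RC bugs are those with severity serious, grave or critical.

```python
import sqlite3

# Toy stand-in for the UDD: schema and rows are invented for illustration
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages (package TEXT, source TEXT, priority TEXT);
CREATE TABLE bugs (id INTEGER, source TEXT, severity TEXT, done INTEGER);
""")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("libfoo1", "foo", "required"),
        ("foo-utils", "foo", "required"),
        ("bar", "bar", "optional"),
        ("baz", "baz", "important"),
    ],
)
conn.executemany(
    "INSERT INTO bugs VALUES (?, ?, ?, ?)",
    [
        (1, "foo", "serious", 0),
        (2, "foo", "grave", 0),
        (3, "foo", "minor", 0),
        (4, "bar", "critical", 0),
        (5, "baz", "serious", 1),  # already closed
        (6, "baz", "grave", 0),
    ],
)

# COUNT(DISTINCT ...) is needed because one source package can have
# several binary packages, which would otherwise duplicate each bug.
QUERY = """
SELECT p.source, COUNT(DISTINCT b.id) AS open_rc_bugs
FROM packages AS p
JOIN bugs AS b ON b.source = p.source
WHERE p.priority IN ('required', 'important', 'standard')
  AND b.severity IN ('serious', 'grave', 'critical')
  AND b.done = 0
GROUP BY p.source
ORDER BY open_rc_bugs DESC, p.source
"""

rows = conn.execute(QUERY).fetchall()
for source, open_rc_bugs in rows:
    print(source, open_rc_bugs)
```

Against a real UDD import the same idea applies, but the query would have to be rewritten for the actual table layout.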
Some questions solved in the literature
• The popularity bias
• http://oa.upm.es/9585/
• Open source projects get more bug reports if they are popular
• The actual number of bugs is not related to the number of bugs reported
• So more bugs actually means more quality
• Well, at least more people who decide to use the software
The popularity bias
[Figure: Log(Bugs) versus Log(installations); label: Required packages]
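One simple way to examine a relationship between Log(Bugs) and Log(installations) is an ordinary least-squares line on the logarithms. The sketch below shows that technique only; it is not necessarily the analysis used in the cited paper, loglog_fit is a made-up helper name, and the data are synthetic values generated from a known power law rather than measurements from the UDD.

```python
import math


def loglog_fit(installations, bugs):
    """Least-squares fit of log(bugs) = intercept + slope * log(installations).

    Both inputs must contain strictly positive numbers; packages with zero
    bugs or zero installations have to be filtered out beforehand.
    """
    xs = [math.log(n) for n in installations]
    ys = [math.log(b) for b in bugs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data from bugs = 2 * installations ** 0.5 (not real measurements)
installations = [100, 400, 2500, 10000, 40000]
bugs = [2 * n ** 0.5 for n in installations]

slope, intercept = loglog_fit(installations, bugs)
print(f"slope={slope:.3f}, scale={math.exp(intercept):.3f}")
```

On the log-log scale the slope is the exponent of the power law, so the fit on this synthetic data should recover an exponent close to 0.5 and a scale factor close to 2.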
Summary
• Packages and sources metadata
• And source code
• Bugs
• All the way back to 1995-96-97!
• Popularity contest
• Maintainers activity (uploads)
• All the way back to 1998!
• And much more…
• Now, what do you think we can do with this?