co-evolving code-related and database-related changes in data intensive software system

15
CoEvolving CodeRelated and DatabaseRelated Changes in a DataIntensive SoEware System Mathieu Goeminne, Alexandre Decan, Tom Mens Service de Génie Logiciel, Université de Mons FNRS Projet de Recherche “DataIntensive SoEware System EvoluIon” in collaboraIon with A. Cleve and L. Meurice (Université de Namur) hPp://informaIque.umons.ac.be/genlog/projects/disse

Upload: mathieu-goeminne

Post on 06-Jul-2015

129 views

Category:

Technology


0 download

DESCRIPTION

Current empirical studies on the evolution of software systems are primarily analyzing source code. Very few studies, however, focus on data-intensive software systems (DISS), in which a significant part of the total development effort is devoted to maintaining and evolving the database schema, typically through the use of a specific database management system and database technology (such as Hibernate). Adding new functionality to the system may affect the database structure and, conversely, modifying the database structure may impact the source code associated to it. Because of this, evolution of the DISS requires the source code and the database schema to co-evolve. Very little empirical studies have been carried out to study the co-evolution between source code changes and database schema changes in a DISS.

TRANSCRIPT

Page 1: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

Co-­‐Evolving  Code-­‐Related  and  Database-­‐Related  Changes  in  aData-­‐Intensive  SoEware  System

Mathieu  Goeminne,  Alexandre  Decan,  Tom  MensService  de  Génie  Logiciel,  Université  de  Mons

FNRS  Projet  de  Recherche  “Data-­‐Intensive  SoEware  System  EvoluIon”in  collaboraIon  with  A.  Cleve  and  L.  Meurice  (Université  de  Namur)

hPp://informaIque.umons.ac.be/genlog/projects/disse

Page 2: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Context

• Focus  on  data-­‐intensive  so0ware  systems  (DISS)

• Expand  empirical  MSR  research  to  include  database-­‐related  acBviBes

• Study  co-­‐evoluBon  between  code  and  database

• Carry  out  empirical  studies  on  open  source  DISS

2

Page 3: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Research  QuesIons

• RQ1:  Is  there  any  relaIon  between  how  source  code  files  and  database-­‐related  files  evolve?

• RQ2:  What  is  the  effect  of  migraIng  to  new  database  technology?

• RQ3:  How  do  developers  divide  their  work  and  how  does  this  evolve  over  Ime?

3

Page 4: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Case  Study:  OSCAR

• Canadian  research  network  SCOOP– Social  Collaboratory  for  Outcome  Oriented  Primary  care

• hPp://scoop.leadlab.ca

• Open  source  tool  infrastructure  for  Electronic  Medical  Records  (EMR)

• hPp://github.com/scoophealth

• OSCAR:  EMR  system  for  healthcare– Support  for  billing,  chronic  disease  management  tools,  prescripIon  module,  scheduling,  ...• Data  available  on  hPps://github.com/scoophealth/oscar.git

4

Page 5: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Case  Study:  OSCAR

5

characteris/c value

duraIon 3,939  days  (  >  129  months)

dates from  Nov  2002  Ill  Aug  2013

number  of  commits 18,727

number  of  disInct  files 20,718  (of  which  54%  code  files)

number  of  file  touches 93,721

number  of  disInct  developers 100

Page 6: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

EvoluIon  of  OSCAR

• Monthly  aggregated  proporIon  of  JSP  and  Java  files  in  OSCAR

6

0%#

10%#

20%#

30%#

40%#

50%#

60%#

70%#

80%#

90%#

100%#

2003-07#

2004-01#

2004-07#

2005-01#

2005-07#

2006-01#

2006-07#

2007-01#

2007-07#

2008-01#

2008-07#

2009-01#

2009-07#

2010-01#

2010-07#

2011-01#

2011-07#

2012-01#

2012-07#

2013-01#

jsp#

java#

Page 7: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

EvoluIon  of  OSCAR  -­‐  Social  Dimension

• Monthly  number  of  disInct  acIve  developers  for  OSCAR

7

0"

5"

10"

15"

20"

25"

2003'07"

2004'01"

2004'07"

2005'01"

2005'07"

2006'01"

2006'07"

2007'01"

2007'07"

2008'01"

2008'07"

2009'01"

2009'07"

2010'01"

2010'07"

2011'01"

2011'07"

2012'01"

2012'07"

2013'01"

Page 8: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

EvoluIon  of  OSCAR

• Growth  of  source  code  files  and  database-­‐related  files

8

0"

1000"

2000"

3000"

4000"

5000"

6000"

2003)07"

2004)01"

2004)07"

2005)01"

2005)07"

2006)01"

2006)07"

2007)01"

2007)07"

2008)01"

2008)07"

2009)01"

2009)07"

2010)01"

2010)07"

2011)01"

2011)07"

2012)01"

2012)07"

2013)01"

pure"

sql"

Page 9: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

EvoluIon  of  OSCAR  -­‐  Social  Dimension

• How  does  the  acIvity  of  developers  evolve  over  Ime?

9

Develop

er

Page 10: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

IntroducIon  of  Persistence  Provider

• Hibernate  (introduced  in  OSCAR  since  July  2006)– Java  object-­‐relaIonal  mapping  (ORM)  library

• XML  files  map  Java  classes  to  database  tables  and  Java  data  types  to  SQL  data  types

• facilitates  data  query  and  retrieval• generates  SQL  calls  and  relieves  the  developer  from  manual  result  set  handling  and  object  conversion

• JPA  (introduced  in  OSCAR  since  July  2008)– Java  Persistence  API

– Uses  Java  annotaIons  instead  of  XML  files  for  ORM

10

Page 11: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

IntroducIon  of  Persistence  Provider

• SQL  =  code  file  containing  embedded  SQL  query

• HIB  =  Java  file  targeted  by  Hibernate  XML  file

• JPA  =  Java  file  containing  JPA  annotaIon

• pure  =  code  files  not  containing  any  of  these

11

0%#10%#20%#30%#40%#50%#60%#70%#80%#90%#

100%#

2003-07-01#

2004-01-01#

2004-07-01#

2005-01-01#

2005-07-01#

2006-01-01#

2006-07-01#

2007-01-01#

2007-07-01#

2008-01-01#

2008-07-01#

2009-01-01#

2009-07-01#

2010-01-01#

2010-07-01#

2011-01-01#

2011-07-01#

2012-01-01#

2012-07-01#

2013-01-01#

2013-07-01#

JPA# HIB#

SQL# pure#

Monthly  aggregated  number  of  acIve  files

0"

200"

400"

600"

800"

1000"

1200"

1400"

1600"

2003)07)01"

2003)11)01"

2004)03)01"

2004)07)01"

2004)11)01"

2005)03)01"

2005)07)01"

2005)11)01"

2006)03)01"

2006)07)01"

2006)11)01"

2007)03)01"

2007)07)01"

2007)11)01"

2008)03)01"

2008)07)01"

2008)11)01"

2009)03)01"

2009)07)01"

2009)11)01"

2010)03)01"

2010)07)01"

2010)11)01"

2011)03)01"

2011)07)01"

2011)11)01"

2012)03)01"

2012)07)01"

2012)11)01"

2013)03)01"

2013)07)01"

JPA" HIB" SQL"

Page 12: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

IntroducIon  of  Persistence  Provider

• Who  is  involved  in  introducing  changes  in  database-­‐related  code?

12

Develop

er

Page 13: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

EvoluIon  of  OSCAR  -­‐  Social  Dimension

• How  do  developers  divide  their  work?

13

Number of developers that introduce database-related code in some file for the first time

Java$(87)$ JSP$(86)$

OSCAR$developers$(100)$3"

24"

10" 10"

1" 0"

SQL$(53)$

HIB$

24"

JPA$9" 8" 11"

Page 14: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Preliminary  Conclusions

• RQ1:  Code-­‐related  and  database-­‐related  files  evolve  together  (no  “phased”  co-­‐evoluIon)

• RQ2:  MigraIon  to  Hibernate,  then  JPA,  but  embedded  SQL  sIll  remains  important

• RQ3:  No  clear  separaIon  of  acIviIes  between  developers– The  majority  of  developers  changes  both  db-­‐related  and  db-­‐unrelated  code

– No  observed  periods  dedicated  to  a  specific  acIvity

14

Page 15: Co-Evolving Code-Related and Database-related Changes in Data Intensive Software System

16  December  2013  -­‐  BENEVOL,  Mons

Future  Work

• ConInue  studying  co-­‐evoluIon  between  code-­‐related  and  db-­‐related  changes– Refine  our  results  by  analysing  changes  at  finer  granularity

• Analyse  database  schema  changes  and  their  impact  on  source  code  (collaboraIon  with  UNamur)– Detect  change  paPerns  in  code  and  database  schema

• Study  impact  of  introducing  persistence  providers– Analyse  migraIon  paPerns  in  code

– How  do  persistence  providers  reduce  impact  of  changes  in  database  schema?

• Study  and  compare  with  other  DISS

15