doing science properly in the digital age - rutgers seminar

Software Sustainability Institute www.software.ac.uk Doing Science Properly in the Digital Age 2 October 2012, Rutgers University Neil Chue Hong (@npch) [email protected]

Upload: neil-chue-hong

Post on 09-May-2015

DESCRIPTION

Seminar given at Rutgers University on 2nd October 2012.

TRANSCRIPT

Page 1: Doing Science Properly In The Digital Age - Rutgers Seminar

Software Sustainability Institute

www.software.ac.uk

Doing Science Properly in the Digital Age
2 October 2012, Rutgers University
Neil Chue Hong (@npch) [email protected]

Page 2: Doing Science Properly In The Digital Age - Rutgers Seminar

Four Paradigms of Research

Empirical

Theoretical

Computational

Data Exploration

Page 3: Doing Science Properly In The Digital Age - Rutgers Seminar

Software is pervasive in research

Page 4: Doing Science Properly In The Digital Age - Rutgers Seminar

Just the Nature of the problem?

Maintenance is not fun
Hacking new stuff is fun

Statistics courtesy of Jo Hannay et al., “How Do Scientists Develop and Use Scientific Software?”

Published online 13 October 2010 | Nature 467, 775-777 (2010) doi:10.1038/467775a

Page 5: Doing Science Properly In The Digital Age - Rutgers Seminar

The Software Sustainability Institute

A national facility for cultivating world-class research through software

• Better software enables better research
• Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
• Providing the expertise and services needed to negotiate to the next stage
• Developing the policy and tools to support the community developing and using research software

Supported by EPSRC Grant EP/H043160/1

Page 6: Doing Science Properly In The Digital Age - Rutgers Seminar

UK Research Computing Ecosystem

[Diagram labels: People; Computing Communities; Network/Collaboration; Instruments; Software Data Centres]

Page 7: Doing Science Properly In The Digital Age - Rutgers Seminar

SSI Organisation

• Community Engagement (Shoaib Sufi)
  - Fellowship Programme
• Consultancy (Steve Crouch)
  - Open Call for Projects
  - Software Evaluation
• Policy (Simon Hettrick)
  - Guides and Case Studies
• Training (Mike Jackson)
  - Software Carpentry
  - Software Surgeries
• Collaboration between the universities of Edinburgh, Manchester, Oxford and Southampton

Page 8: Doing Science Properly In The Digital Age - Rutgers Seminar

Case Study: Ligand Binding

• Centre for Computational Chemistry, Bristol
  - New methods for rapid MC sampling of biomolecular systems modelled using QM/MM
  - Developed two codes: ProtoMS (F77) + Sire (C++)
  - Water-Swap Reaction Coordinate method to calculate absolute protein-ligand binding free energies
• SSI’s work is helping to scale development
  - ProtoMS and Sire are both single-developer codes
  - ASPIRE/ACQUIRE framework has multiple devs
  - Split architecture between ASPIRE (adaptive multiresolution hybrid MD simulation) and ACQUIRE (WorkPacket scheduling system with optimisation for time to result vs “green-ness”)
• http://www.siremol.org/adaptive_dynamics

Page 9: Doing Science Properly In The Digital Age - Rutgers Seminar

Case Study: Brain Imaging

• Brain Research Imaging Centre, Edinburgh
  - Develop PrivacyGuard software, a DICOM image deidentification toolkit
  - Created software to support new multispectral colouring modulation and variance identification technique (“MCMxxxVI”) to identify white matter lesions that are indicative of declining cognitive ability
  - BRIC are not principally software developers, but do provide software to other researchers
• SSI’s work means the software has been reviewed and refactored
  - Looked at exploitation
    - Usability review, naming/trademark review
  - Made it easier for BRIC staff to maintain and develop
    - Move to standard repositories, testing and documentation processes
    - Examination of licensing for MCMxxxVI
    - Extraction and refactoring to create standalone tools
• http://www.software.ac.uk/who-do-we-work/brain-research-imaging-centre-edinburgh
• http://www.bric.ed.ac.uk/

Page 10: Doing Science Properly In The Digital Age - Rutgers Seminar

Case Study: Climate Policy Modelling

• CIAS team at Tyndall Centre for Climate Change Research, University of East Anglia
  - Develop linked climate and economic models for detailed analysis
  - Their software was not ready to be used by other groups
    - One researcher/developer at UEA, several users
• SSI’s work means the software is robust enough that it can be installed and used by others
  - Enabled use of the software by the WWFN’s Climascope project and James Cook University
    - Documented software to allow extensions by contributors
    - Made it easier to maintain and back up
    - Added job scheduling to improve modelling throughput
    - New modelling framework enables new models, i.e. new science
• http://www.tyndall.ac.uk/research/cias

Page 11: Doing Science Properly In The Digital Age - Rutgers Seminar

Case Study: Textual Studies

• TextVRE team at CeRCH, King’s College London
  - Developed an environment which is used to integrate various tools used in the e-Humanities textual studies lifecycle
  - Builds on the German TextGrid project, and many other existing tools
• SSI’s work means the software can be run “out of the box” – an important requirement for the researchers
  - Developed a VM image containing the TextVRE installation
    - Improve installation instructions
    - Develop tests to check each installed component
    - Improve modularisation to allow others to contribute and maintain
  - Feeding back work to TextGrid
• http://textvre.cerch.kcl.ac.uk

Page 12: Doing Science Properly In The Digital Age - Rutgers Seminar

The modern researcher…

• … worries about:
  - Data management and analysis
  - Reproducible research
  - Scalable simulations
  - Integration of models and workflows
  - Collaboration

Picture of Otto Stern courtesy of Emilio Segre Visual Archives

Where do they learn how to do this?

Page 13: Doing Science Properly In The Digital Age - Rutgers Seminar

Observation 1:
Software is pervasive across research

Corollary: software is bleeding edge and long-tail. Demanding users are coming from arts + humanities, economics, and social science as well as sciences

Page 14: Doing Science Properly In The Digital Age - Rutgers Seminar

Observation 2:
A culture of re-use rather than re-invention is not widespread

Corollary: we have wasted effort and increased siloing

Page 15: Doing Science Properly In The Digital Age - Rutgers Seminar

Observation 3:
Many people are “embarrassed” about software

Corollary: something is broken in the way we regard, recognise and reward software

Page 16: Doing Science Properly In The Digital Age - Rutgers Seminar

SSI Drivers and Themes

• Two key drivers which cause people to seek the SSI’s advice:
  - They want to be more productive in their research
  - They don’t want to be embarrassed by appearing worse than their peers
• Broadly, our work falls into a few key themes:
  - The role and reward of software in research
  - Recognition of software career paths
  - Developing the scientific computing / software development skill base

Page 17: Doing Science Properly In The Digital Age - Rutgers Seminar

The Foundations of Digital Research

[Diagram labels: Software (repeated three times); Re-usable; Re-producible]

Page 18: Doing Science Properly In The Digital Age - Rutgers Seminar

Gap 1: Software Skills Training

[Chart: training offerings placed on two axes, Basic to Advanced and Programming Focussed (Tools) to Research Focussed (Methods). Offerings shown: Software Carpentry, Programming 101, Programming 201, Summer Schools, HPC Short Courses, Advanced HPC Training, Doctoral Training, MSc in HPC / scientific computing]

Who fills this gap?

Page 19: Doing Science Properly In The Digital Age - Rutgers Seminar

Software philosophy as part of the process

• Foundations of scientific computing in undergraduate courses
  - Like presentation skills
• Methods of scientific computing in postgraduate courses
  - Like statistics and ethics
• Show the benefits from the knowledge and methods of digital research
  - Not just programming 101

Page 20: Doing Science Properly In The Digital Age - Rutgers Seminar

Best Practices for Scientific Computing

1. Write programs for people, not computers
2. Automate repetitive tasks
3. Use the computer to record history
4. Make incremental changes
5. Use version control
6. Don’t repeat yourself (or others)
7. Plan for mistakes
8. Optimise software only after it works correctly
9. Document the design and purpose of the code, rather than its mechanics
10. Conduct code reviews

Paper (including the evidence) being submitted to arXiv and PNAS
http://arxiv.org/abs/1210.0530
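A few of the practices listed on this slide can be illustrated in a handful of lines. The sketch below is not taken from the talk or the paper: the function names `mean` and `normalise` and the test are hypothetical, chosen only to show practices 1, 6 and 7 side by side.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # Practice 7 (plan for mistakes): fail early with a clear message
    # instead of letting a ZeroDivisionError surface later.
    if len(values) == 0:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


def normalise(values):
    """Shift the values so that their mean is zero."""
    # Practice 6 (don't repeat yourself): reuse mean() rather than
    # writing the same sum/len expression a second time.
    centre = mean(values)
    return [v - centre for v in values]


def test_normalise():
    # Practice 7 again: a small automated check that can be re-run
    # after every incremental change (practice 4).
    assert normalise([1.0, 2.0, 3.0]) == [-1.0, 0.0, 1.0]


if __name__ == "__main__":
    test_normalise()
    print("all checks passed")
```

The descriptive names and docstrings are a nod to practice 1 (write programs for people); keeping a file like this under version control would cover practice 5.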

Page 21: Doing Science Properly In The Digital Age - Rutgers Seminar

Gap 2: Lack of recognition and reward

• There is an anachronism in the way we conduct and recognise research?
  - REF references software as an output but it is still not easy to get recognition – peer review fails
• Software careers
  - Researchers who use software
  - Researcher-Developers
  - Research Software Engineers
  - Research Software Support
  - Research Systems Providers

Page 22: Doing Science Properly In The Digital Age - Rutgers Seminar

No recognition without reward, no reward without reproducibility?

• How do we reward people for important software contributions?
• Traditionally: publish a research paper that happens to mention software
  - Can we provide more direct, acceptable software citations?
• A Research Software Impact Manifesto
  - http://www.software.ac.uk/blog/2011-05-02-publish-or-be-damned-alternative-impact-manifesto-research-software
  - NB Authorship is hard
• It works for data!
  - C.f. Heather Piwowar’s work http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0000308

Page 23: Doing Science Properly In The Digital Age - Rutgers Seminar

Software Metapapers

• Create a complete scholarly record including “standard” publication, method, dataset and models, and software
  - e.g. modelling and simulation, statistical analysis
  - Enable replay, reproduction and reuse
• Pragmatic approach is to create a metadata record for the software, and link it to a copy of the software in some storage infrastructure
  - This is a software metapaper
  - Peer-review the metadata, not the software
• Journal of Open Research Software: http://openresearchsoftware.metajnl.com/

See: http://openresearchsoftware.metajnl.com/faq/ and the work by B. Matthews et al: The Significant Properties of Software: A Study
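The idea of a metadata record that points at an archived copy of the software can be sketched as a small data structure. Everything in this example is an assumption made for illustration: the class name `SoftwareMetapaper`, its field names and the `example.org` locations are hypothetical and do not reproduce the journal's actual submission schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SoftwareMetapaper:
    # Field names are illustrative assumptions, not a journal schema.
    title: str
    authors: list
    version: str
    licence: str
    archive_location: str  # where the archived copy of the software lives
    related_outputs: list = field(default_factory=list)  # papers, datasets

    def missing_fields(self):
        """Return the names of required fields that were left empty."""
        required = ["title", "authors", "version", "licence", "archive_location"]
        return [name for name in required if not getattr(self, name)]

    def to_json(self):
        """Serialise the record so it can be reviewed and stored."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    record = SoftwareMetapaper(
        title="Example analysis tool",  # placeholder values throughout
        authors=["A. Researcher"],
        version="1.0",
        licence="BSD-3-Clause",
        archive_location="https://example.org/archive/example-tool-1.0",
        related_outputs=["https://example.org/paper"],
    )
    print(record.missing_fields())
    print(record.to_json())
```

Reviewing a record like this, rather than the code itself, is what "peer-review the metadata, not the software" would amount to in practice; the `missing_fields` check stands in for the simplest part of that review.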

Page 24: Doing Science Properly In The Digital Age - Rutgers Seminar

Gap 3: Lack of support infrastructure

• For example: no digital repository which satisfies the criteria:
  - Open to anyone in the UK to archive software
  - Software associated with an OSI license
  - Provide a unique, permanent identifier
  - Publishes a preservation/curation/sustainability plan
• This is just deposit, not even preservation or sustainability

Page 25: Doing Science Properly In The Digital Age - Rutgers Seminar

5 Stars of Software?

• Do we need a 5 stars for software?
  - Existence – there is accurate metadata that defines the software
  - Availability – you can access and run the software
  - Openness – the software has an open permissible license
  - Assured – the software provides ways of assuring its correctness
  - Linked – the related data, dependencies and papers are indicated

c.f.
5 Stars of Linked Data (Berners-Lee)
5 Stars of Online Journals (Shotton)
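One way to make the five proposed criteria concrete is a checklist that awards a star per criterion met. This is only a sketch under stated assumptions: the slide does not say how stars would be counted, so the rule of one independent star per criterion, the function name `star_rating` and the lowercase keys are all hypothetical.

```python
# The five criteria named on the slide, used here as dictionary keys.
CRITERIA = ("existence", "availability", "openness", "assured", "linked")


def star_rating(checks):
    """Count how many of the five criteria a piece of software meets.

    `checks` maps a criterion name to True or False. Criteria that are
    not mentioned count as unmet. Assumption: each criterion earns one
    star independently of the others.
    """
    unknown = set(checks) - set(CRITERIA)
    if unknown:
        raise KeyError(f"unknown criteria: {sorted(unknown)}")
    return sum(1 for name in CRITERIA if checks.get(name, False))


if __name__ == "__main__":
    # Hypothetical project: described and runnable, but no licence,
    # no tests and no links to related papers or data.
    example = {"existence": True, "availability": True}
    print(star_rating(example), "out of", len(CRITERIA))
```

A cumulative scheme, where each star requires all the previous ones, would be closer to the Linked Data model cited on the slide; the independent count is used here only because it is simpler to show.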

Page 26: Doing Science Properly In The Digital Age - Rutgers Seminar

Gap 4: Software Maturity and Management

[Chart: software proliferation (vertical axis) against time (horizontal axis), with stages labelled Customisation, Innovation and Consolidation]

Not all software should make it to the next stage
Management changes through time, requiring planning

Page 27: Doing Science Properly In The Digital Age - Rutgers Seminar

A More Manageable Ecosystem

• Discourage duplicative software development in research grants by rewarding reuse and long-term development
  - Need to change perceptions so that software is seen as valuable
  - But understand when it should not proceed to next stage
• Different stages should be managed and funded separately
  - Maintenance vs. research vs. development
• A skilled researcher base is the key in the digital age
  - Create a larger proportion of enabled researchers and provide the ramps to go from desktop to high-end infrastructure
  - Allow and encourage specialism and collaboration

Page 28: Doing Science Properly In The Digital Age - Rutgers Seminar

Take home points

1) Researchers are developing more software than ever, and trying to do it better
2) We are not adequately providing the training, recognition and reward, and career paths to enable a step change improvement in research software
3) This is hindering digital research
4) The only people who can change this situation are people like you!

Page 29: Doing Science Properly In The Digital Age - Rutgers Seminar

A national facility for cultivating world-class research through software

Become our next collaborators!
Website: www.software.ac.uk
Email: [email protected]
Twitter: twitter.com/SoftwareSaved

Some current collaborations