The life-sciences as a pathfinder in data-intensive research practice
DESCRIPTION
Presentation given at UQ Winterschool 2014. The advent of the Internet is bringing about fundamental changes in the ways that research is performed and communicated. These have been particularly driven by the growing importance of data, as well as the tools available to work with this data. This presentation will examine this shift, drawing on examples from the life‐sciences, and try to make some predictions about the next five years.
TRANSCRIPT
The life-sciences as a pathfinder in data-intensive research practice
Dr Andrew Treloar, Director of Technology
11 April 2023 CC-BY-SA atreloar 1
Structure of presentation
- Research Lifecycles
- Functions of Scholarly Communication
- Pointers to the future
- Characterising the future
- Pathfinder problems
- Conclusions
So many lifecycles…
11 April 2023 CC-BY-SA hvdsomp and atreloar 3
Minimal Research Lifecycle
Think
Do
Share
Sharing: Scholarly Communication System and its Functions
- Registration
- Certification
- Awareness
- Archiving
(Roosendaal and Geurts 1997)
System of Journals
- Registration: submission of manuscript
- Certification: peer-review (pre-publication), commentary (post-publication)
- Awareness: discovery services
- Archiving: libraries (print), publishers (electronic), special purpose organisations (e.g. Portico)
Pointers to the future
“the future is already here – it’s just not very evenly distributed”
William Gibson, NPR interview
Registration: bioRxiv
Registration: GitHub
Registration: WikiPathways
Registration: NeuroLex
Registration: Nanopublications
Registration: some observations
- Decoupling registration from certification
- Timestamping, versioning
- Registration of various types of objects
- Machines as creators and contributors
Certification: PubMed Commons
Certification: PubPeer
Certification: Publons
Certification: some observations
- Peer-review decoupled from publication process
- Certification of various types of objects
- Machines validating form
- Social endorsement
Awareness: myExperiment
Awareness: eLabNotebook RSS
Awareness: Twitter
Awareness: some observations
- Awareness for various types of objects
- Real time awareness
- Awareness support targeted at machines
- Awareness through social media
Archiving: PDB
Archiving: GenBank
Characterising the future
Fundamental changes
- The research process (objects, social dimension) is becoming more exposed
- Articles, books are no longer the only relevant objects for research communication
- Objects are no longer static
- Machines are joining humans as (co-)creators and consumers of research objects
Pathfinder problems
- Integrity of the scholarly record
- The three obsolescences: hardware, file format, software
System of Journals: Archiving
Web of Objects: Archiving
Not just citation relationships
The problem of obsolescence
- The life-science research environment can be viewed as undergoing a process of accelerated evolution
- Other disciplines will hit these problems in time
Cambrian explosion
Hardware obsolescence: Roche 454
Software obsolescence: too much choice, not enough support
Abandonware
“Last summer, a member of the biology department of the University of Udine in Italy approached Nicola Vitacolonna with an intriguing project. The ANREP program, which annotates structural motifs in gene or protein sequences, was out of date, having been written more than a decade ago. Although still used by molecular biologists, its slow computing ability meant a straightforward multiple search could take all night on a desktop PC. The Udine biologist wanted Vitacolonna, a postdoctoral fellow in computational biology, to write a program that could do the job more quickly.”
Sam Jaffe, Scientists Abandon their Software, The Scientist, Feb 16 2004
File format obsolescence: Illumina
- Probability of error in base calling is encoded using ASCII codes to reduce file size
- The meaning of the ASCII codes changed over the life cycle, so for data generated at different time points the quality might be encoded differently
“If you get an error like ‘Invalid quality score value’, your FASTQ file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores. You'll need to add the option -Q33 to your FASTX Toolkit arguments.”
Obviously…
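The offset mix-up in the quote above can be made concrete with a small sketch. It is not taken from the slides or from the FASTX Toolkit: the function names are hypothetical, and the detection rule is a common heuristic that assumes characters below ASCII 59 never appear in the older offset-64 encodings. Files containing only high-quality reads remain ambiguous under this rule.

```python
# Illustrative sketch only: these function names are hypothetical and are
# not part of the FASTX Toolkit or any other tool named in the slides.

def guess_phred_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Heuristic assumption: characters below ASCII 59 cannot occur in the
    older offset-64 encodings, so seeing one implies offset 33. Input with
    only high-quality reads is ambiguous and falls through to 64.
    """
    lowest = min(ord(ch) for line in quality_strings for ch in line)
    return 33 if lowest < 59 else 64


def decode_quality(quality_string, offset):
    """Convert one quality string into a list of integer quality scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(score):
    """Phred relation: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


if __name__ == "__main__":
    quality = "IIII5555"  # made-up quality string for illustration
    offset = guess_phred_offset([quality])
    print("guessed offset:", offset)
    print("scores:", decode_quality(quality, offset))
    # The same characters decoded with the other offset give different scores.
    print("scores with offset 64:", decode_quality(quality, 64))
```

Decoding with the wrong offset shifts every score by 31 (64 minus 33), which is why a tool either rejects the file or misjudges read quality unless the encoding is stated explicitly.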
Everett Rogers, Diffusion of Innovations, 1962
Conclusions
- Need to move to a smaller number of standard file formats
- Need to move to a more sustainable model of software development and maintenance
- Need to encourage platform manufacturers to innovate around the hardware, not the software
- NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
On best practices in the development of bioinformatics software, Front Genet, 02 Jul 14
- Source code available to reviewers
- Software indexed, citable, available
- Source code documented
- Source code managed
- Test libraries, sample data and dataset repositories available
Questions
andrew.treloar@ands.org.au
atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycleshellip
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
-
Structure presentation Research Lifecycles Functions of Scholarly Communication Pointers to the future Characterising the future Pathfinder problems Conclusions
11 April 2023 CC-BY-SA atreloar 2
So many lifecycleshellip
11 April 2023 CC-BY-SA hvdsomp and atreloar 3
Minimal Research Lifecycle
Think
DoShare
11 April 2023 CC-BY-SA atreloar 4
Sharing Scholarly Communication System and its Functions
Registration Certification Awareness Archiving
(Rosendaal and Geurts 1997)
11 April 2023 CC-BY-SA hvdsomp and atreloar 5
System of Journals Registration
submission of manuscript
Certification peer-review (pre-publication) commentary (post-publication)
Awareness discovery services
Archiving libraries (print) publishers (electronic) special purpose organisations (eg Portico)
11 April 2023 CC-BY-SA hvdsomp and atreloar 6
Pointers to the future
ldquothe future is already here ndash itrsquos just not very evenly distributedrdquo
William Gibson NPR interview
11 April 2023 CC-BY-SA hvdsomp and atreloar 7
Registration BioRxiv
11 April 2023 CC-BY-SA hvdsomp and atreloar 8
Registration Github
11 April 2023 CC-BY-SA hvdsomp and atreloar 9
Registration WikiPathways
11 April 2023 CC-BY-SA hvdsomp and atreloar 10
Registration NeuroLex
11 April 2023 CC-BY-SA hvdsomp and atreloar 11
Registration Nanopublications
11 April 2023 CC-BY-SA hvdsomp and atreloar 12
Registration some observations Decoupling registration from certification Timestamping versioning Registration of various types of objects Machines as creators and contributors
11 April 2023 CC-BY-SA hvdsomp and atreloar 13
Certification PubMed Commons
11 April 2023 CC-BY-SA hvdsomp and atreloar 14
Certification PubPeer
11 April 2023 CC-BY-SA hvdsomp and atreloar 15
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
11 April 2023 CC-BY-SA atreloar 34
File format obsolescence Illumina Probability of error in basecalling encoded using ascii code
to reduce file size Meaning of the ascii code changed along the life cycle and
for data generated at different time points the quality might be encoded differently
ldquoIf you get an error like Invalid quality score value your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores Youll need to add the option -Q33 to your FASTX Toolkit argumentsrdquo Obviouslyhellip
11 April 2023 CC-BY-SA atreloar 35
Everett Rogers Diffusion of Innovation 1962
11 April 2023 CC-BY-SA atreloar 36
Conclusions Need to move to a smaller number of standard file
formats Need to move to a more sustainable model of
software development and maintenance Need to encourage platform manufacturers to
innovate around the hardware not the software NOTE other disciplines are looking to lifesciences
to work out how to solve some of these problems11 April 2023 CC-BY-SA atreloar 37
On best practices in the development of bioinformatics software Front Genet 02 Jul 14
Source code available to reviewers Software indexed citable available Source code documented Source code managed Test libraries sample data and dataset repositories
available
11 April 2023 CC-BY-SA atreloar 38
Questions andrewtreloarandsorgau
atreloar
httpswwwslidesharenetatreloarthe-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
11 April 2023 CC-BY-SA atreloar 39
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycleshellip
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
-
So many lifecycleshellip
11 April 2023 CC-BY-SA hvdsomp and atreloar 3
Minimal Research Lifecycle
Think
DoShare
11 April 2023 CC-BY-SA atreloar 4
Sharing Scholarly Communication System and its Functions
Registration Certification Awareness Archiving
(Rosendaal and Geurts 1997)
11 April 2023 CC-BY-SA hvdsomp and atreloar 5
System of Journals Registration
submission of manuscript
Certification peer-review (pre-publication) commentary (post-publication)
Awareness discovery services
Archiving libraries (print) publishers (electronic) special purpose organisations (eg Portico)
11 April 2023 CC-BY-SA hvdsomp and atreloar 6
Pointers to the future
ldquothe future is already here ndash itrsquos just not very evenly distributedrdquo
William Gibson NPR interview
11 April 2023 CC-BY-SA hvdsomp and atreloar 7
Registration BioRxiv
11 April 2023 CC-BY-SA hvdsomp and atreloar 8
Registration Github
11 April 2023 CC-BY-SA hvdsomp and atreloar 9
Registration WikiPathways
11 April 2023 CC-BY-SA hvdsomp and atreloar 10
Registration NeuroLex
11 April 2023 CC-BY-SA hvdsomp and atreloar 11
Registration Nanopublications
11 April 2023 CC-BY-SA hvdsomp and atreloar 12
Registration some observations Decoupling registration from certification Timestamping versioning Registration of various types of objects Machines as creators and contributors
11 April 2023 CC-BY-SA hvdsomp and atreloar 13
Certification PubMed Commons
11 April 2023 CC-BY-SA hvdsomp and atreloar 14
Certification PubPeer
11 April 2023 CC-BY-SA hvdsomp and atreloar 15
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
11 April 2023 CC-BY-SA atreloar 34
File format obsolescence Illumina Probability of error in basecalling encoded using ascii code
to reduce file size Meaning of the ascii code changed along the life cycle and
for data generated at different time points the quality might be encoded differently
ldquoIf you get an error like Invalid quality score value your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores Youll need to add the option -Q33 to your FASTX Toolkit argumentsrdquo Obviouslyhellip
11 April 2023 CC-BY-SA atreloar 35
Everett Rogers Diffusion of Innovation 1962
11 April 2023 CC-BY-SA atreloar 36
Conclusions Need to move to a smaller number of standard file
formats Need to move to a more sustainable model of
software development and maintenance Need to encourage platform manufacturers to
innovate around the hardware not the software NOTE other disciplines are looking to lifesciences
to work out how to solve some of these problems11 April 2023 CC-BY-SA atreloar 37
On best practices in the development of bioinformatics software Front Genet 02 Jul 14
Source code available to reviewers Software indexed citable available Source code documented Source code managed Test libraries sample data and dataset repositories
available
11 April 2023 CC-BY-SA atreloar 38
Questions andrewtreloarandsorgau
atreloar
httpswwwslidesharenetatreloarthe-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
11 April 2023 CC-BY-SA atreloar 39
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycleshellip
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
-
Minimal Research Lifecycle
Think
DoShare
11 April 2023 CC-BY-SA atreloar 4
Sharing Scholarly Communication System and its Functions
Registration Certification Awareness Archiving
(Rosendaal and Geurts 1997)
11 April 2023 CC-BY-SA hvdsomp and atreloar 5
System of Journals Registration
submission of manuscript
Certification peer-review (pre-publication) commentary (post-publication)
Awareness discovery services
Archiving libraries (print) publishers (electronic) special purpose organisations (eg Portico)
11 April 2023 CC-BY-SA hvdsomp and atreloar 6
Pointers to the future
ldquothe future is already here ndash itrsquos just not very evenly distributedrdquo
William Gibson NPR interview
11 April 2023 CC-BY-SA hvdsomp and atreloar 7
Registration BioRxiv
11 April 2023 CC-BY-SA hvdsomp and atreloar 8
Registration Github
11 April 2023 CC-BY-SA hvdsomp and atreloar 9
Registration WikiPathways
11 April 2023 CC-BY-SA hvdsomp and atreloar 10
Registration NeuroLex
11 April 2023 CC-BY-SA hvdsomp and atreloar 11
Registration Nanopublications
11 April 2023 CC-BY-SA hvdsomp and atreloar 12
Registration some observations Decoupling registration from certification Timestamping versioning Registration of various types of objects Machines as creators and contributors
11 April 2023 CC-BY-SA hvdsomp and atreloar 13
Certification PubMed Commons
11 April 2023 CC-BY-SA hvdsomp and atreloar 14
Certification PubPeer
11 April 2023 CC-BY-SA hvdsomp and atreloar 15
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
11 April 2023 CC-BY-SA atreloar 34
File format obsolescence Illumina Probability of error in basecalling encoded using ascii code
to reduce file size Meaning of the ascii code changed along the life cycle and
for data generated at different time points the quality might be encoded differently
ldquoIf you get an error like Invalid quality score value your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores Youll need to add the option -Q33 to your FASTX Toolkit argumentsrdquo Obviouslyhellip
11 April 2023 CC-BY-SA atreloar 35
Everett Rogers Diffusion of Innovation 1962
11 April 2023 CC-BY-SA atreloar 36
Conclusions Need to move to a smaller number of standard file
formats Need to move to a more sustainable model of
software development and maintenance Need to encourage platform manufacturers to
innovate around the hardware not the software NOTE other disciplines are looking to lifesciences
to work out how to solve some of these problems11 April 2023 CC-BY-SA atreloar 37
On best practices in the development of bioinformatics software Front Genet 02 Jul 14
Source code available to reviewers Software indexed citable available Source code documented Source code managed Test libraries sample data and dataset repositories
available
11 April 2023 CC-BY-SA atreloar 38
Questions andrewtreloarandsorgau
atreloar
httpswwwslidesharenetatreloarthe-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
11 April 2023 CC-BY-SA atreloar 39
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycleshellip
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
-
Sharing Scholarly Communication System and its Functions
Registration Certification Awareness Archiving
(Rosendaal and Geurts 1997)
11 April 2023 CC-BY-SA hvdsomp and atreloar 5
System of Journals Registration
submission of manuscript
Certification peer-review (pre-publication) commentary (post-publication)
Awareness discovery services
Archiving libraries (print) publishers (electronic) special purpose organisations (eg Portico)
11 April 2023 CC-BY-SA hvdsomp and atreloar 6
Pointers to the future
ldquothe future is already here ndash itrsquos just not very evenly distributedrdquo
William Gibson NPR interview
11 April 2023 CC-BY-SA hvdsomp and atreloar 7
Registration BioRxiv
11 April 2023 CC-BY-SA hvdsomp and atreloar 8
Registration Github
11 April 2023 CC-BY-SA hvdsomp and atreloar 9
Registration WikiPathways
11 April 2023 CC-BY-SA hvdsomp and atreloar 10
Registration NeuroLex
11 April 2023 CC-BY-SA hvdsomp and atreloar 11
Registration Nanopublications
11 April 2023 CC-BY-SA hvdsomp and atreloar 12
Registration some observations Decoupling registration from certification Timestamping versioning Registration of various types of objects Machines as creators and contributors
11 April 2023 CC-BY-SA hvdsomp and atreloar 13
Certification PubMed Commons
11 April 2023 CC-BY-SA hvdsomp and atreloar 14
Certification PubPeer
11 April 2023 CC-BY-SA hvdsomp and atreloar 15
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware
“Last summer, a member of the biology department of the University of Udine in Italy approached Nicola Vitacolonna with an intriguing project. The ANREP program, which annotates structural motifs in gene or protein sequences, was out of date, having been written more than a decade ago. Although still used by molecular biologists, its slow computing ability meant a straightforward multiple search could take all night on a desktop PC. The Udine biologist wanted Vitacolonna, a postdoctoral fellow in computational biology, to write a program that could do the job more quickly.”
Sam Jaffe, “Scientists Abandon their Software”, The Scientist, Feb 16, 2004
File format obsolescence Illumina
- Probability of error in basecalling is encoded using ASCII codes to reduce file size
- The meaning of the ASCII code changed along the life cycle, so for data generated at different time points the quality might be encoded differently
“If you get an error like ‘Invalid quality score value’, your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores. You'll need to add the option -Q33 to your FASTX Toolkit arguments.”
Obviously…
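To make the offset problem concrete, here is a minimal Python sketch of how the same quality string decodes under the two offsets named in the quote. The function names (decode_phred, error_probability, guess_offset) are hypothetical and not taken from the FASTX Toolkit or any other tool. The guessing thresholds (59 and 74) are illustrative assumptions about typical Illumina-era data, and the sketch assumes plain Phred scores of zero or more.

```python
from typing import List, Optional


def decode_phred(quality: str, offset: int) -> List[int]:
    """Turn a FASTQ quality string into Phred scores for a given ASCII offset."""
    scores = [ord(ch) - offset for ch in quality]
    if any(q < 0 for q in scores):
        # A negative score usually means the wrong offset was assumed.
        raise ValueError("Invalid quality score value for offset %d" % offset)
    return scores


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def guess_offset(quality: str) -> Optional[int]:
    """Heuristic guess at the offset; returns None when the string is ambiguous.

    Assumption: characters below ';' (59) only occur with offset 33, and
    characters above 'J' (74) are unlikely with offset 33.
    """
    if not quality:
        return None
    codes = [ord(ch) for ch in quality]
    if min(codes) < 59:
        return 33
    if max(codes) > 74:
        return 64
    return None


if __name__ == "__main__":
    line = "II5!"  # made-up quality string for illustration
    print(guess_offset(line), decode_phred(line, 33))
```

A string such as "@ABC" is valid under either offset, which is why the heuristic returns None for it: the file alone cannot say which encoding was meant, and that is the obsolescence problem in miniature.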
Everett Rogers, Diffusion of Innovations, 1962
Conclusions
- Need to move to a smaller number of standard file formats
- Need to move to a more sustainable model of software development and maintenance
- Need to encourage platform manufacturers to innovate around the hardware, not the software
- NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
On best practices in the development of bioinformatics software (Front. Genet., 02 Jul 14)
- Source code available to reviewers
- Software indexed, citable, available
- Source code documented
- Source code managed
- Test libraries, sample data and dataset repositories available
Questions
andrew.treloar@ands.org.au
atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
Registration WikiPathways
11 April 2023 CC-BY-SA hvdsomp and atreloar 10
Registration NeuroLex
11 April 2023 CC-BY-SA hvdsomp and atreloar 11
Registration Nanopublications
11 April 2023 CC-BY-SA hvdsomp and atreloar 12
Registration some observations Decoupling registration from certification Timestamping versioning Registration of various types of objects Machines as creators and contributors
11 April 2023 CC-BY-SA hvdsomp and atreloar 13
Certification PubMed Commons
11 April 2023 CC-BY-SA hvdsomp and atreloar 14
Certification PubPeer
11 April 2023 CC-BY-SA hvdsomp and atreloar 15
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
11 April 2023 CC-BY-SA atreloar 34
File format obsolescence Illumina Probability of error in basecalling encoded using ascii code
to reduce file size Meaning of the ascii code changed along the life cycle and
for data generated at different time points the quality might be encoded differently
ldquoIf you get an error like Invalid quality score value your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores Youll need to add the option -Q33 to your FASTX Toolkit argumentsrdquo Obviouslyhellip
11 April 2023 CC-BY-SA atreloar 35
Everett Rogers Diffusion of Innovation 1962
11 April 2023 CC-BY-SA atreloar 36
Conclusions Need to move to a smaller number of standard file
formats Need to move to a more sustainable model of
software development and maintenance Need to encourage platform manufacturers to
innovate around the hardware not the software NOTE other disciplines are looking to lifesciences
to work out how to solve some of these problems11 April 2023 CC-BY-SA atreloar 37
On best practices in the development of bioinformatics software, Front. Genet., 02 Jul 2014
Source code available to reviewers
Software indexed, citable, available
Source code documented
Source code managed
Test libraries, sample data and dataset repositories available
Questions?
andrew.treloar@ands.org.au
atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycles…
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence: Roche 454
- Software obsolescence: too much choice, not enough support
- Abandonware
- File format obsolescence: Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
Certification Publons
11 April 2023 CC-BY-SA hvdsomp and atreloar 16
Certification some observations Peer-review decoupled from publication process Certification of various types of objects Machines validating form Social endorsement
11 April 2023 CC-BY-SA hvdsomp and atreloar 17
Awareness myExperiment
11 April 2023 CC-BY-SA hvdsomp and atreloar 18
Awareness eLabNotebook RSS
11 April 2023 CC-BY-SA hvdsomp and atreloar 19
Awareness Twitter
11 April 2023 CC-BY-SA hvdsomp and atreloar 20
Awareness some observations Awareness for various types of objects Real time awareness Awareness support targeted at machines Awareness through social media
11 April 2023 CC-BY-SA hvdsomp and atreloar 21
Archiving PDB
11 April 2023 CC-BY-SA hvdsomp and atreloar 22
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
11 April 2023 CC-BY-SA hvdsomp and atreloar 24
Fundamental changes The research process (objects social
dimension) is becoming more exposed Articles books are no longer the only
relevant objects for research communication Objects are no longer static Machines are joining humans as
(co-)creators and consumers of research objects
11 April 2023 CC-BY-SA hvdsomp and atreloar 25
Pathfinder problems Integrity of the scholarly record The three obsolescences
hardware file format software
11 April 2023 CC-BY-SA atreloar 26
System of Journals Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 27
Web of Objects Archiving
11 April 2023 CC-BY-SA hvdsomp and atreloar 28
Not just citation relationships
11 April 2023 CC-BY-SA hvdsomp and atreloar 29
Your Text Here
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
File format obsolescence: Illumina
- The probability of error in basecalling is encoded using ASCII codes to reduce file size
- The meaning of the ASCII code changed along the life cycle, so for data generated at different time points the quality might be encoded differently
"If you get an error like 'Invalid quality score value', your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores. You'll need to add the option -Q33 to your FASTX Toolkit arguments."
Obviously…
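The offset confusion in the quoted error message can be made concrete with a small sketch. The Python below is illustrative only and is not part of the talk or of the FASTX Toolkit: the function names guess_phred_offset, decode_quality and error_probability are hypothetical, and the heuristic assumes just two encodings (offset 33 and offset 64) with the commonly cited character ranges. It also treats every score as Phred-scaled, Q = -10 log10(P).

```python
def guess_phred_offset(quality_lines):
    """Guess the ASCII offset of FASTQ quality strings: 33, 64, or None if ambiguous."""
    codes = [ord(ch) for line in quality_lines for ch in line]
    if not codes:
        return None
    if min(codes) < 59:
        # Assumption: characters below ';' (59) do not occur in offset-64 files.
        return 33
    if max(codes) > 74:
        # Assumption: characters above 'J' (74) do not occur in offset-33 files.
        return 64
    return None  # every character seen is valid under both encodings


def decode_quality(quality, offset):
    """Convert one quality string into a list of integer quality scores."""
    return [ord(ch) - offset for ch in quality]


def error_probability(score):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-score / 10)


if __name__ == "__main__":
    line = "IIIIHGF#"  # made-up quality string for illustration
    offset = guess_phred_offset([line])
    print(offset, decode_quality(line, offset))
```

The same character means different things under the two offsets: 'I' decodes to 40 with offset 33 and to 9 with offset 64, which is why the encoding of a file has to be recorded or inferred before its scores can be trusted.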
Everett Rogers, Diffusion of Innovations, 1962
Conclusions
- Need to move to a smaller number of standard file formats
- Need to move to a more sustainable model of software development and maintenance
- Need to encourage platform manufacturers to innovate around the hardware, not the software
- NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
On best practices in the development of bioinformatics software (Front. Genet., 02 Jul 2014)
- Source code available to reviewers
- Software indexed, citable, available
- Source code documented
- Source code managed
- Test libraries, sample data and dataset repositories available
Questions
andrew.treloar@ands.org.au
@atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
- The life-sciences as a pathfinder in data-intensive research pr
- Structure presentation
- So many lifecycleshellip
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
-
Archiving GenBank
11 April 2023 CC-BY-SA hvdsomp and atreloar 23
Characterising the future
Fundamental changes
- The research process (objects, social dimension) is becoming more exposed
- Articles and books are no longer the only relevant objects for research communication
- Objects are no longer static
- Machines are joining humans as (co-)creators and consumers of research objects
Pathfinder problems
- Integrity of the scholarly record
- The three obsolescences: hardware, file format, software
System of Journals: Archiving
Web of Objects: Archiving
Not just citation relationships
The problem of obsolescence
- The life-science research environment can be viewed as undergoing a process of accelerated evolution
- Other disciplines will hit these problems in time
Cambrian explosion
Hardware obsolescence: Roche 454
Software obsolescence: too much choice, not enough support
Abandonware
“Last summer, a member of the biology department of the University of Udine in Italy approached Nicola Vitacolonna with an intriguing project. The ANREP program, which annotates structural motifs in gene or protein sequences, was out of date, having been written more than a decade ago. Although still used by molecular biologists, its slow computing ability meant a straightforward multiple search could take all night on a desktop PC. The Udine biologist wanted Vitacolonna, a postdoctoral fellow in computational biology, to write a program that could do the job more quickly.”
Sam Jaffe, “Scientists Abandon their Software”, The Scientist, Feb 16, 2004
File format obsolescence: Illumina
- Probability of error in base-calling is encoded using ASCII codes to reduce file size
- The meaning of the ASCII codes changed over the life cycle, so data generated at different points in time might encode quality differently
“If you get an error like ‘Invalid quality score value’, your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores. You'll need to add the option -Q33 to your FASTX Toolkit arguments.”
Obviously...
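To make the offset problem concrete, here is a minimal Python sketch of how the same quality string decodes under the two offsets mentioned in the quote. It is not taken from the slides: the function names (`decode_quality`, `error_probability`, `guess_offset`) are hypothetical, the conversion assumes the standard Phred relation p = 10^(-Q/10), and the offset guess is only a rough heuristic that treats any character below ASCII 59 as evidence of offset 33.

```python
def decode_quality(quality_string, offset):
    """Convert an ASCII-encoded FASTQ quality string to integer scores.

    offset is 33 for Sanger-style files and 64 for older Illumina files.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if min(scores) < 0:
        # This is the situation behind errors such as
        # "Invalid quality score value": the offset is too large.
        raise ValueError(f"negative score: offset {offset} looks wrong")
    return scores


def error_probability(score):
    """Phred relation: probability that the base call is wrong."""
    return 10 ** (-score / 10)


def guess_offset(quality_string):
    """Heuristic guess (an assumption, not a guarantee).

    Offset-64 encodings never use characters below ASCII 59, so seeing
    one implies offset 33. Otherwise the string is ambiguous and we
    fall back to 64.
    """
    lowest = min(ord(ch) for ch in quality_string)
    return 33 if lowest < 59 else 64


if __name__ == "__main__":
    quality = "IIII"  # hypothetical quality line from a FASTQ record
    for offset in (33, 64):
        scores = decode_quality(quality, offset)
        print(offset, scores, error_probability(scores[0]))
```

Read with offset 33, the character `I` means a score of 40. Read with offset 64, the same character means a score of 9, so a file decoded with the wrong offset can look far better or worse than it is without raising any error at all.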
Everett Rogers, Diffusion of Innovations (1962)
Conclusions
- Need to move to a smaller number of standard file formats
- Need to move to a more sustainable model of software development and maintenance
- Need to encourage platform manufacturers to innovate around the hardware, not the software
- NOTE: other disciplines are looking to the life-sciences to work out how to solve some of these problems
“On best practices in the development of bioinformatics software”, Front. Genet., 02 Jul 14
- Source code available to reviewers
- Software indexed, citable, available
- Source code documented
- Source code managed
- Test libraries, sample data and dataset repositories available
Questions?
andrew.treloar@ands.org.au
atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice
- The life-sciences as a pathfinder in data-intensive research practice
- Structure presentation
- So many lifecycles...
- Minimal Research Lifecycle
- Sharing Scholarly Communication System and its Functions
- System of Journals
- Pointers to the future
- Registration BioRxiv
- Registration Github
- Registration WikiPathways
- Registration NeuroLex
- Registration Nanopublications
- Registration some observations
- Certification PubMed Commons
- Certification PubPeer
- Certification Publons
- Certification some observations
- Awareness myExperiment
- Awareness eLabNotebook RSS
- Awareness Twitter
- Awareness some observations
- Archiving PDB
- Archiving GenBank
- Characterising the future
- Fundamental changes
- Pathfinder problems
- System of Journals Archiving
- Web of Objects Archiving
- Not just citation relationships
- The problem of obsolescence
- Cambrian explosion
- Hardware obsolescence Roche 454
- Software obsolescence too much choice not enough support
- Abandonware
- File format obsolescence Illumina
- Everett Rogers Diffusion of Innovation 1962
- Conclusions
- On best practices in the development of bioinformatics software
- Questions
The problem of obsolescence Lifescience research environment can be viewed as
undergoing a process of accelerated evolution Other disciplines will hit these problems in time
11 April 2023 CC-BY-SA atreloar 30
Cambrian explosion
11 April 2023 31
Hardware obsolescence Roche 454
11 April 2023 CC-BY-SA atreloar 32
Software obsolescence too much choice not enough support
11 April 2023 CC-BY-SA atreloar 33
Abandonware ldquoLast summer a member of the biology department of the
University of Udine in Italy approached Nicola Vitacolonna with an intriguing project The ANREP program which annotates structural motifs in gene or protein sequences was out of date having been written more than a decade ago Although still used by molecular biologists its slow computing ability meant a straightforward multiple search could take all night on a desktop PC The Udine biologist wanted Vitacolonna a postdoctoral fellow in computational biology to write a program that could do the job more quicklyrdquo Sam Jaffe Scientists Abandon their Software The Scientist Feb 16 2004
11 April 2023 CC-BY-SA atreloar 34
File format obsolescence Illumina Probability of error in basecalling encoded using ascii code
to reduce file size Meaning of the ascii code changed along the life cycle and
for data generated at different time points the quality might be encoded differently
ldquoIf you get an error like Invalid quality score value your fastq file probably has Sanger (offset 33) instead of Illumina (ASCII offset 64) quality scores Youll need to add the option -Q33 to your FASTX Toolkit argumentsrdquo Obviouslyhellip
11 April 2023 CC-BY-SA atreloar 35
Everett Rogers Diffusion of Innovation 1962
11 April 2023 CC-BY-SA atreloar 36
Conclusions Need to move to a smaller number of standard file
formats Need to move to a more sustainable model of
software development and maintenance Need to encourage platform manufacturers to
innovate around the hardware not the software NOTE other disciplines are looking to lifesciences
to work out how to solve some of these problems11 April 2023 CC-BY-SA atreloar 37
On best practices in the development of bioinformatics software (Front Genet, 02 Jul 14)
Source code available to reviewers
Software indexed, citable, available
Source code documented
Source code managed
Test libraries, sample data and dataset repositories available
Questions
andrew.treloar@ands.org.au
atreloar
https://www.slideshare.net/atreloar/the-lifesciences-as-a-pathfinder-in-dataintensive-research-practice