Conceptual basis for critical thinking, data analysis and problem solving
EMBL-EBI
STRATEGY
Conceptual basis for critical thinking, data analysis and problem solving
(and I don’t know what this is either!)
Challenges for bioinformatics
With the sequence/structure deficit, the challenges are to rationalise the mass of sequence data; derive more efficient means of data storage; and design more reliable analysis tools.
Imperative - to convert sequence information into biochemical & biophysical knowledge
What we cannot do well
“Give us sequence, we do rest”
What is the function of this structure?
What is the function of this sequence?
What is the function of this motif?
The fold provides a scaffold, which can be decorated in different ways by different sequences to confer different functions. Knowing the fold & function allows us to rationalise how the structure effects its function at the molecular level.
Complication – Multiprotein Complexes
ATPase
1H8E: (ADP.ALF4)2(ADP.SO4) BOVINE F1-ATPASE (ALL THREE CATALYTIC SITES OCCUPIED)
MENZ, R.I., WALKER, J.E., LESLIE, A.G.W.
Multiprotein transcription complexes: RNA Polymerase
1NT9: COMPLETE 12-SUBUNIT RNA POLYMERASE II
ARMACHE, K.-J., KETTENBERGER, H., CRAMER, P.
P. Cramer et al., Science 288, 640 (2000)
STRING: a database of predicted functional associations between proteins. von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B
http://string.embl.de/
Prolinks: a database of protein functional linkages derived from coevolution. P.M. Bowers, M. Pellegrini, M.J. Thompson, J. Fierro, T.O. Yeates, D. Eisenberg
http://dip.doe-mbi.ucla.edu/pronav (?)
Ground rules for bioinformatics
Don't always believe what programs tell you: they're often misleading & sometimes wrong!
Don't always believe what databases tell you: they're often misleading & sometimes wrong!
Don't always believe what lecturers tell you: they're often misleading & sometimes wrong!
In short, don't be a naive user.
When computers are applied to biology, it is vital to understand the difference between mathematical & biological significance.
Computers don’t do biology: they do sums quickly!
General Evaluation Criteria
Be sceptical and cynical!
When you are searching for information you need to judge its quality and suitability.
Think critically about each piece of information you find and how you found it.
Relevance: Does the information you have found adequately support your research? Does it answer the question, or support one of your arguments? How general or specific is the information about the topic?
Building a search protocol
The usual starting point: searching the primary data sources (NRDB, SPTR, etc.)
Pattern recognition methods: searching the secondary sources (patterns, profiles, blocks, fingerprints & HMMs)
Estimating significance: when do we believe a result?
A central goal is to predict protein function from sequence
Given a sequence, we want to know: What is my protein? To what family does it belong? What is its function? How can we explain its function in structural terms?
By searching pattern dbs & fold libraries, we may recognise patterns that allow us to infer relationships with previously-characterised families & folds
Given the variety of dbs to search, how do we use them to build a sensible search protocol?
Planning a database search
To find various aspects of your query sequence, you may have to search a number of databases.
1. Identify the sequence. Search for a matching or similar sequence using a 'BLAST' program.
2. Find related sequences.
(a) For a protein sequence, find the mRNA sequence that produces the protein, and the DNA sequence that codes for the mRNA.
(b) For an mRNA sequence, find the protein it produces, and the DNA sequence that codes for the mRNA.
(c) For a DNA sequence, find the mRNA it is transcribed to, and the protein that the mRNA produces.
3. If the sequence is from a protein, find a structural image.
4. Research the functionality of the sequence:
(a) What is its function in different tissues (homology)?
(b) What is its function in different organisms (phylogeny)?
(c) Are there any mutations, and what are their consequences?
(d) What is the role of the protein in cell function?
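Step 2 above relies on the DNA, mRNA and protein views of a sequence being interconvertible. A minimal sketch of that relationship, assuming the standard genetic code, a coding-strand DNA input and reading frame 0; the example sequence is made up for illustration:

```python
# Standard genetic code packed in TCAG order (a common compact encoding).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(dna: str) -> str:
    """Coding-strand DNA to mRNA: T becomes U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA to protein in frame 0, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


dna = "ATGGCCTTTTAA"  # invented example, not a real gene
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna, protein)
```

Real genes add complications that this sketch ignores (introns, alternative genetic codes, the unknown reading frame), which is one reason the database lookups in step 2 are still needed.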
Protein sequence database identity search (e.g., for short fragments): pinpoints identical matches to the probe; may identify the correct reading frame
Protein sequence database similarity search (e.g., nrdb, OWL, SP+SPTrEMBL): identifies homologues to the probe
Protein pattern database search (e.g., PROSITE, profiles, PRINTS, BLOCKS, Pfam): identifies family relationships or pinpoints key structural or functional sites
Known structure: structure classification database query (e.g., scop, CATH, FSSP): provides details of ligand binding, etc.
Unknown structure: protein fold pattern library search (e.g., threading): identifies compatibility with a structural class
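The first stage of this protocol, the identity search, amounts to exact substring matching of a short probe against every database sequence. A minimal sketch, assuming a tiny in-memory database; the accessions and sequences are invented for illustration:

```python
# Hypothetical mini-database: accession -> protein sequence (made-up data).
DATABASE = {
    "SEQ_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "SEQ_B": "MSFVKAYIAKQRGGSHFSR",
    "SEQ_C": "MLLPPRRTTWWYY",
}


def identity_search(probe, database):
    """Return (accession, 0-based offset) for every exact occurrence of probe."""
    hits = []
    for accession, sequence in database.items():
        start = sequence.find(probe)
        while start != -1:
            hits.append((accession, start))
            start = sequence.find(probe, start + 1)
    return hits


hits = identity_search("AYIAKQR", DATABASE)
print(hits)
```

An exact match says nothing about more distant relatives, which is why the similarity and pattern searches follow.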
iGAP
http://eol.sdsc.edu
[Figure: iGAP annotation pipeline, Steps 1 to 6, drawing on sequence info (NR, PFAM) and structure info (SCOP, PDB)]
Protein sequences
Prediction of: signal peptides (SignalP, PSORT); transmembrane regions (TMHMM, PSORT); coiled coils (COILS); low complexity regions (SEG)
Structural assignment of domains by PSI-BLAST profiles on FOLDLIB
Structural assignment of domains by 123D on FOLDLIB
Structural assignment of domains by WU-BLAST
Data Warehouse
Functional assignment by PFAM, NR assignments
Building FOLDLIB: PDB chains; SCOP domains; PDP domains; CE matches PDB vs. SCOP; 90% sequence non-identical; minimum size 25 aa; coverage (90%, gaps <30, ends <30)
Domain location prediction by sequence
http://harvester.embl.de/
“Harvester” collects information from selected public databases
Similarity searching
Whether or not an identity search finds a match, the next step is to look for similar sequences (e.g., you may wish to know if a wider family exists).
The most rapid option is to use BLAST & variants and look for: high scores with low P-values (unlikely to be random); clusters of high scores at the top of the hitlist (a family?); trends in the type of sequences matched.
Use a composite database, e.g., UNIPROT.
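To see why "high scores with low P-values" depends on the database as well as the alignment, the sketch below uses the standard relation between a BLAST bit score S' and the expected number of chance hits, E = m * n * 2^(-S'), together with P = 1 - exp(-E). The query length and database sizes are illustrative assumptions, not values from any real search:

```python
import math


def expected_chance_hits(bit_score, query_len, db_len):
    """E-value: expected number of chance alignments scoring at least bit_score."""
    return query_len * db_len * 2.0 ** (-bit_score)


def p_value(e_value):
    """Probability of seeing at least one such chance alignment."""
    return -math.expm1(-e_value)  # equals 1 - exp(-E), accurate for tiny E


query_len = 250        # residues in the probe (assumed)
small_db = 10 ** 7     # total residues in a small database (assumed)
large_db = 10 ** 9     # total residues in a large database (assumed)

for db_len in (small_db, large_db):
    e = expected_chance_hits(40, query_len, db_len)
    print(f"db={db_len:>12,d}  E={e:.4g}  P={p_value(e):.4g}")
```

The same bit score is far less convincing against the larger database. This is a statement about mathematical significance only; whether a hit is biologically meaningful still has to be judged separately.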
Structural & functional interpretation
Db searches often do little more than identify a protein family. This only scratches the surface: we still want to know what our protein does & what it might look like.
The first step is to examine the detailed family in InterPro, which may help to elucidate function.
The next step is to examine the fold classification & structure summary resources, e.g., SCOP, CATH.
Gene prediction, structure & function prediction are non-trivial: structure & function prediction tools are, at best, 70% accurate.
What are the lessons for sequence analysis?
When searching for distant homologues, several dbs should be searched: different methods provide different perspectives; dbs aren’t complete & their contents don’t fully overlap.
The more dbs searched, the more difficult it can be to interpret results.
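One simple way to keep multi-database results interpretable is to tally which family assignments are supported by which sources. A minimal sketch, assuming each database search has already been reduced to a set of family labels; the labels below are invented for illustration and are not real search output:

```python
from collections import Counter

# Hypothetical per-database results for one query (made-up labels).
results = {
    "PROSITE": {"kinase", "SH2"},
    "PRINTS": {"kinase"},
    "Pfam": {"kinase", "SH3"},
}


def tally_support(results):
    """Count how many databases report each family label."""
    counts = Counter()
    for families in results.values():
        counts.update(families)
    return counts


support = tally_support(results)
consensus = sorted(f for f, n in support.items() if n == len(results))
single_source = sorted(f for f, n in support.items() if n == 1)
print("all databases agree on:", consensus)
print("reported by one database only:", single_source)
```

Agreement raises confidence, but a single-source hit is not automatically wrong: it may simply reflect incomplete coverage in the other databases.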
Thinking about your Topic
Can you identify what you already know about the topic, and what you do not know?
Can you create questions based on these knowledge 'gaps'? That is, can you identify your information needs?
What do you need to know about your protein sequence?
Develop a concept map to organise your ideas and structure your approach to the topic.
Discuss your topic with others.
Identifying the Type of Information you need
As well as thinking about your topic, you need to consider the type of information you will need.
Which information tools are best suited to your inquiry?
How much information do you need, and to what degree of detail?
Appreciate how difficult it is to draw a complex 3-D object and appreciate the complexity of the requirements for storing sequence and structural information of molecules in a database.
There are a lot of interrelated pieces of information about a biomolecule, such as: sequence similarities; genome location; protein structure; expression; chemistry.
Not all information on a molecule or sequence will be found in one record, or even in one database.
Be prepared to search in several databases for information on your query sequence.
As different organisations create databases to suit their own purposes, there will not be a great deal of similarity between these databases.
Some of the obstacles to searching databases are:
Data formats are not standard.
The nomenclature is not standard.
There is more than one database offering the same information (data redundancy).
Links between databases may not be easy to follow.
The number of databases available makes it confusing to choose between them.
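Two of these obstacles, non-standard nomenclature and data redundancy, can be illustrated with a small sketch. It assumes records have already been fetched into a common structure; the source names, spellings and sequences are invented, and the normalisation rule is deliberately crude:

```python
from collections import defaultdict

# Hypothetical records from two sources (all values made up).
RECORDS = [
    {"source": "db_one", "name": "HSP-70", "sequence": "MAKAAAIGIDLG"},
    {"source": "db_two", "name": "hsp70", "sequence": "MAKAAAIGIDLG"},
    {"source": "db_two", "name": "Hsp 90", "sequence": "MPEETQTQDQPM"},
]


def normalise_name(name):
    """Crude normalisation: lower-case and drop spaces, hyphens, underscores."""
    return "".join(ch for ch in name.lower() if ch not in " -_")


def group_redundant(records):
    """Group records by identical sequence to expose duplicated entries."""
    groups = defaultdict(list)
    for record in records:
        key = record["sequence"]
        groups[key].append((record["source"], normalise_name(record["name"])))
    return dict(groups)


groups = group_redundant(RECORDS)
for sequence, members in groups.items():
    print(sequence, members)
```

Grouping by sequence finds the duplicate even though the two sources spell the name differently; real cross-referencing is harder because names, formats and identifiers all vary at once.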
Once you have found some information on your query sequence, it will give your research a new focus.
Through exploring any linked text in the databases, you can ask:
What function does the protein/mRNA/DNA have?
Do mutations occur and what are their effects?
Does it play a role in disease?
Homologies: Does it have the same function in different tissues?
Phylogenies: Does it have the same function in different organisms?
What role does structure play in the protein's function?
Does it have a similar function to other molecules with similar structure?
Pitfalls of searching databases
Remember that you are looking for information about a molecule, not database records.
Duplication of information (even within the same database)
Links that are not always intuitive (or self-explanatory)
Nomenclature that is not always standard
Accuracy or Validity
You need to determine whether the information is reliable or not.
Quality Control Issues
The quality of archived data is no better than the data determined in the contributing laboratories.
Curation of the data can help to identify errors.
Disagreement between duplicate determinations is a clear warning of an error in one or the other.
Similarly, results that disagree with established principles may contain errors.
It is useful, for instance, to flag deviations from expected stereochemistry in protein structures, but such “outliers” are not necessarily wrong.
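The check on duplicate determinations can be sketched as a position-by-position comparison. This assumes two determinations of the same sequence that are already the same length (no insertions or deletions); both sequences are invented for illustration:

```python
def disagreements(first, second):
    """Return (1-based position, residue in first, residue in second) per mismatch."""
    if len(first) != len(second):
        raise ValueError("sequences differ in length; align them first")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(first, second), start=1)
        if a != b
    ]


run_1 = "MKTAYIAKQR"  # first determination (made-up)
run_2 = "MKTAYLAKQR"  # duplicate determination (made-up)
flags = disagreements(run_1, run_2)
print(flags)
```

A flagged position only says that one of the two determinations is wrong; deciding which one still needs curation.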
Data quality
Data consistency
Data models
Reliability: evidence? Level of confidence?
Assignment of function by similarity: recursive process; propagation of errors
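A rough way to see how errors propagate when function is assigned by similarity, and that assignment is then reused for the next transfer: if each transfer is correct with probability p, and transfers are treated as independent (a simplifying assumption), a chain of n transfers is fully correct with probability p^n. The accuracy values below are illustrative, apart from 0.7, which echoes the "at best, 70% accurate" figure quoted earlier:

```python
def chain_reliability(per_transfer_accuracy, transfers):
    """Probability that every transfer in an annotation chain was correct."""
    return per_transfer_accuracy ** transfers


for accuracy in (0.9, 0.7):
    for n in (1, 2, 3, 5):
        print(f"accuracy={accuracy}  transfers={n}  "
              f"chain reliability={chain_reliability(accuracy, n):.3f}")
```

Even with optimistic per-step accuracy, confidence decays quickly along the chain, which is why the evidence behind an annotation matters as much as the annotation itself.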
Data quality
It’s hard to judge whether something “makes sense”.
The lack of labeling on many web pages makes it hard to know the source.
Calculations based on databases are even harder to deal with
Logical deductions may be worse.
“tacR gene regulates the human nervous system”
“tacQ gene is similar to tacR but is found in E. coli”
“so tacQ gene regulates the E. coli nervous system”
E. coli nervous system
Who spotted it?
Evaluating database records
For your research to be reliable, you must use reliable sources of information.
It is important to evaluate the information you find in databases as you would any other type of information.
In the case of sequencing research, however, peer review does not necessarily happen prior to publication.
Significance
Appreciating that mathematical & biological significance are different is crucial.
It is important in understanding the limitations of: database search algorithms; multiple sequence alignment algorithms; pattern recognition techniques; functional site & structure prediction tools.
Contrary to popular opinion, there is currently still: no biologically-reliable automatic multiple alignment algorithm; no infallible pattern-recognition technique; no reliable gene, function or structure prediction algorithm.
Summary
Difficult questions on big data
Data and Information
Database and Databanks
Organise the data to provide a service
Visualization and Rendering
Keep it up-to-date
Provide a means to ask questions
Provide a useful service to a large and diverse scientific field
Data & Information
Data: a collection of facts, e.g., X-ordinate, B-value, sequence
Information: acquired knowledge; data within a scientific “context”; the meaning of the data
Sequence/structure alignment
Databases & Databanks
Databank: a (usually large) collection of data
Database: a (usually large) set of data organized to allow rapid retrieval of information
Organized for a reason
Rapid retrieval: human short term memory is ~5 seconds
WHAT IS THE PDB?
Databanks and Databases
The PDB Archive is a “databank”: a series of flat files that have a format originally designed for Fortran card readers.
The MSD, RCSB, and PDBj provide “databases”: collections of data (1000s of attributes) organized into relational tables and held in an RDBMS.
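The practical difference between the two can be sketched in a few lines: a flat file has to be read through to find an entry, whereas an organised structure is built once and then answers lookups directly. The pipe-separated layout below is a made-up stand-in, not the PDB file format; the two entry IDs and titles are the ones shown on the earlier slides:

```python
# Hypothetical flat file: one "ID|title" record per line (invented layout).
FLAT_FILE = """\
1H8E|BOVINE F1-ATPASE
1NT9|COMPLETE 12-SUBUNIT RNA POLYMERASE II
"""


def scan(flat_text, wanted_id):
    """Databank-style lookup: read every line until the ID turns up."""
    for line in flat_text.splitlines():
        entry_id, title = line.split("|", 1)
        if entry_id == wanted_id:
            return title
    return None


def build_index(flat_text):
    """Database-style organisation: parse once, then look up by key."""
    return dict(line.split("|", 1) for line in flat_text.splitlines())


index = build_index(FLAT_FILE)
print(scan(FLAT_FILE, "1NT9"))
print(index["1H8E"])
```

A dictionary is the simplest possible index; a relational system adds typed tables, joins and a query language on top of the same idea.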
Data & information
ATOM 2567 N   PHE B 175  7.821 -25.530 -22.848 1.00  8.71
ATOM 2568 CA  PHE B 175  8.845 -25.172 -21.877 1.00  9.41
ATOM 2569 C   PHE B 175  9.449 -23.798 -22.169 1.00 10.02
ATOM 2570 O   PHE B 175 10.664 -23.613 -22.103 1.00 10.37
ATOM 2571 CB  PHE B 175  9.928 -26.251 -21.848 1.00  9.53
ATOM 2572 CG  PHE B 175 10.969 -26.137 -22.982 1.00 10.03
ATOM 2573 CD1 PHE B 175 12.356 -25.819 -22.988 1.00 10.51
ATOM 2574 CD2 PHE B 175 11.725 -27.211 -23.402 1.00 10.25
ATOM 2575 CE1 PHE B 175 11.821 -27.095 -22.869 1.00 11.17
ATOM 2576 CE2 PHE B 175 12.282 -26.086 -24.008 1.00 10.95
ATOM 2577 CZ  PHE B 175 10.953 -26.335 -23.622 1.00 11.38
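The ATOM records above are data; turning them into information means giving the numbers a context. A minimal sketch that parses the first two records and computes the N to CA bond length. It splits on whitespace, which suits the records as transcribed here; genuine PDB files are fixed-column, so this is a simplification rather than a general parser:

```python
import math

RECORDS = """\
ATOM 2567 N PHE B 175 7.821 -25.530 -22.848 1.00 8.71
ATOM 2568 CA PHE B 175 8.845 -25.172 -21.877 1.00 9.41
"""


def parse_atom(line):
    """Split one whitespace-separated ATOM record into named fields."""
    f = line.split()
    return {
        "serial": int(f[1]),
        "name": f[2],
        "res_name": f[3],
        "chain": f[4],
        "res_seq": int(f[5]),
        "xyz": (float(f[6]), float(f[7]), float(f[8])),
        "occupancy": float(f[9]),
        "b_value": float(f[10]),
    }


atoms = {atom["name"]: atom for atom in map(parse_atom, RECORDS.splitlines())}
bond = math.dist(atoms["N"]["xyz"], atoms["CA"]["xyz"])  # in Angstroms
print(f"N-CA distance in PHE B 175: {bond:.3f} A")
```

The computed distance is close to the usual N to CA bond length of about 1.46 Å, a small example of checking data against established stereochemistry.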
http://oca.ebi.ac.uk/oca-docs/oca-home.html
http://srs.ebi.ac.uk/
http://www.rcsb.org/pdb/
http://www.ebi.ac.uk/msd/
http://www.pdbj.org/
wwPDB are service providers
We provide a service to the scientific community:
24/7 (almost): parallel DB with fail-over, etc.
Service “ping” baseline check several times/day
Data is incremented with new data weekly
Systems are extensible
Query capabilities
Browsing (click and read)
Simple search: select records with some constraints
More elaborate search: select specific fields of some records with constraints on some fields
Complex querying: ability to return an answer that results from a "live" computation, and was not part of any record of the database
Interfaces
User interfaces: user-friendly; convenient browsing; intuitive query forms; visualization (graphical output)
Programmatic interfaces (communication with external programs): other databases (concept of distributed database); analysis tools
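The query levels listed under "Query capabilities" can be tried programmatically against a small relational table. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table name and schema are assumptions made for illustration, and the four rows reuse serial numbers and B-values from the ATOM records on the "Data & information" slide:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE atom (serial INTEGER, name TEXT, res_name TEXT, "
    "chain TEXT, b_value REAL)"
)
conn.executemany(
    "INSERT INTO atom VALUES (?, ?, ?, ?, ?)",
    [
        (2567, "N", "PHE", "B", 8.71),
        (2568, "CA", "PHE", "B", 9.41),
        (2569, "C", "PHE", "B", 10.02),
        (2570, "O", "PHE", "B", 10.37),
    ],
)

# Simple search: whole records that satisfy a constraint.
simple = conn.execute(
    "SELECT * FROM atom WHERE b_value > 10 ORDER BY serial"
).fetchall()

# More elaborate search: specific fields, constraints on other fields.
elaborate = conn.execute(
    "SELECT name, b_value FROM atom "
    "WHERE chain = 'B' AND b_value < 10 ORDER BY serial"
).fetchall()

# Complex query: an answer computed "live", not stored in any record.
(mean_b,) = conn.execute(
    "SELECT AVG(b_value) FROM atom WHERE res_name = 'PHE'"
).fetchone()

print(simple)
print(elaborate)
print(mean_b)
conn.close()
```

The mean B-value exists in no single record; it is produced at query time, which is the distinction the "complex querying" item draws.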
Annotation Issues
Annotation
Problem: the flow of available data is increasing exponentially
Strategies: internal curators; selected external experts; public submission; computer-based extraction of information from biological texts
Annotation of the data
Annotation is a weak component of the enterprise.
Automation of annotation is possible only to a limited extent, and getting annotation right remains labor-intensive.
The importance of proper annotation, however, cannot be overestimated.
P. Bork has commented that, for people interested in analysing the protein sequences implicit in genome sequence information, errors in gene assignment corrupt the high quality of the sequence data.
Distributed Annotation
A possible solution is a distributed and dynamic error-correction and annotation process.
The workload must be distributed because databank staff have neither the time nor the expertise for the job; specialists will have to act as curators.
Progress in automation of annotation and error identification/correction will permit re-annotation of databanks.
New Problems
As a result, we will have to give up the “safe” idea of a stable databank composed of entries that are correct when they are first distributed in mature form and stay fixed thereafter.
Databanks are dynamic in information content, growing in size, and maturing in quality.
Maintaining local copies, largely by “top up”, is not sufficient.
Proliferation of various copies in various states with out-of-date linkages
The more computers are involved in automating genome annotation, the greater the need for collaboration with biologists
The more data we have to handle, the more rigorous we must be in our thinking (& writing) if we are to make sense of the complexities
We are still a long way from having reliable tools for deducing protein function from sequence
but with the right approach, there is hope
Conclusion
What can you do with bioinformatics?
Not much without intervention; however, a lot if you know how to apply it right!
Referencing - and Plagiarism
http://www.library.cqu.edu.au/chemcompass/index.htm
Terri Attwood
School of Biological Sciences
University of Manchester, Oxford Road
Manchester M13 9PT, UK
http://www.bioinf.man.ac.uk/dbbrowser/
http://www.vts.rdn.ac.uk/tutorial/biores