Basics on bioinformatics Lecture 2
TRANSCRIPT
![Page 2: Basics on bioinformatics Lecture 2 - unina.it Bioinf. (1).pdfBasics on bioinformatics Lecture 2 Nunzio D’Agostino nunzio.dagostino@entecra.it; nunzio.dagostino@gmail.com Database](https://reader033.vdocuments.us/reader033/viewer/2022043010/5fa0a1395f57c20a700b4017/html5/thumbnails/2.jpg)
Database or databank?
Initially
o Databank (UK)
o Database (USA)
Solution
The abbreviation db
Entity-Relationship (ER) modeling
Notation uses three main constructs:
o Data entities
An entity represents a set or collection of objects in the real world that share the
same properties: a person, place, object, event or concept about which data is
to be maintained.
o Attributes
A named property or characteristic of an entity.
o Relationships
An association between the instances of one or more entity types.
Connectivity
Relationships can be classified as:
o one-to-one (1:1)
o one-to-many (1:N)
o many-to-many (N:N)
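The three constructs can be sketched in code. The snippet below is only an illustration, assuming two hypothetical entities, `Cultivar` and `Sequence` (these class names are not from the lecture), linked by a one-to-many relationship. The second sequence record is made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sequence:
    # Attributes: named properties of the entity
    accession: str
    length: int


@dataclass
class Cultivar:
    name: str
    # Relationship (1:N): one cultivar is associated with many sequences
    sequences: List[Sequence] = field(default_factory=list)


cultivar = Cultivar(name="TA496")
cultivar.sequences.append(Sequence(accession="BI421671", length=492))
cultivar.sequences.append(Sequence(accession="EXAMPLE01", length=100))  # hypothetical record
print(len(cultivar.sequences))
```

Each instance of a class is one entity instance; the list held by `Cultivar` plays the role of the "many" side of the relationship.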
Cardinality
o 1 : 1
o 1 : N
o N : M
ER example
Database: basic structure

| Gi | Accession | Length | Cultivar | Dev. stage | Tissue | Sequence |
|---|---|---|---|---|---|---|
| 30320090 | CD003352 | 356 | - | Turning stage of fruit ripening | Pericarp | GTACTCCTAAAC….. |
| 15195408 | BI421671 | 492 | TA496 | 25-40 days old | callus | CCACAACCACA….. |
| 50892290 | AJ784669 | 346 | West Virginia 106 | 8 days post anthesis | fruit | CAAATTTA….. |

Databases are composed of tables of data.
Tables hold logically related sets of data. A table is essentially
the same thing as a spreadsheet: a set of rows and columns.
Each table has several records or entries:
a record stores all the information for a given individual.
Records are the rows of a data table.
Each record has several fields:
a field is an individual piece of data, a single attribute of the
record.
Fields are the columns of a data table.
Each record (row) has a unique identifier, the primary key.
The primary key serves to identify the data stored in this
record across all the tables in the database.
Databases are manipulated with a language called SQL (Structured
Query Language). It's a "baby English" type of language: it uses real
words, but is rigid in terms of their order and placement.
Various database software: Oracle, MS Access, MySQL, etc.
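To make tables, records, fields and primary keys concrete, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite is not one of the systems named in the lecture; it is used here only because it needs no installation. The table name `est` and the choice of the Gi column as primary key are assumptions made for the example; the three records come from the table in this lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database

# Fields are the columns; gi is declared as the primary key
conn.execute(
    """
    CREATE TABLE est (
        gi        INTEGER PRIMARY KEY,
        accession TEXT NOT NULL,
        length    INTEGER,
        cultivar  TEXT,
        tissue    TEXT
    )
    """
)

# Records are the rows (None stands for the missing cultivar)
records = [
    (30320090, "CD003352", 356, None, "Pericarp"),
    (15195408, "BI421671", 492, "TA496", "callus"),
    (50892290, "AJ784669", 346, "West Virginia 106", "fruit"),
]
conn.executemany("INSERT INTO est VALUES (?, ?, ?, ?, ?)", records)

# SQL reads like rigid English: SELECT fields FROM table WHERE condition
row = conn.execute(
    "SELECT accession, length FROM est WHERE gi = ?", (15195408,)
).fetchone()
print(row)

long_ests = [
    r[0]
    for r in conn.execute(
        "SELECT accession FROM est WHERE length > 350 ORDER BY length"
    )
]
print(long_ests)

# The primary key must be unique: inserting the same gi twice is rejected
try:
    conn.execute("INSERT INTO est VALUES (?, ?, ?, ?, ?)", records[0])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The last step shows why the primary key can act as a unique identifier: the database itself refuses a second record with the same key.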
Why biological databases?
o Make biological data available to scientists
  Consolidation of data (gather data from different sources)
  Provide access to large datasets that cannot be published explicitly (genome, …)
o Make biological data available in computer-readable format
  Make data accessible for automated analysis
Biological db
o Vary in size, quality, coverage, level of interest
o Many of the major ones are covered in the annual Database Issue of
Nucleic Acids Research (2010)
What makes a good db?
o comprehensiveness
o accuracy
o is up-to-date
o good interface
o batch search/download
o API (web services, DAS, etc.)
"Must have" items when using a db
o Remember the server, the database, and the program
version used
o Write down sequence identification numbers
o Databases are not like good wine
(use up-to-date builds)
o Use local installs when it becomes necessary
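One way to follow the first two points is to save a small provenance record next to every result. The sketch below is an assumption about how this could be done, not a procedure from the lecture: the function name `provenance_record` and the field names are hypothetical, and the release, program and version values are placeholders. The two identifiers are the accession numbers used in the exercises of this lecture.

```python
import json
from datetime import date


def provenance_record(server, database, release, program, version, identifiers):
    """Bundle the details needed to repeat a database search later."""
    return {
        "date": date.today().isoformat(),
        "server": server,
        "database": database,
        "database_release": release,
        "program": program,
        "program_version": version,
        "identifiers": list(identifiers),
    }


record = provenance_record(
    server="www.ncbi.nlm.nih.gov",
    database="GenBank",
    release="RELEASE-PLACEHOLDER",   # hypothetical value
    program="PROGRAM-PLACEHOLDER",   # hypothetical value
    version="VERSION-PLACEHOLDER",   # hypothetical value
    identifiers=["BC043443", "NG_000007"],
)
print(json.dumps(record, indent=2))
```

Writing the record as JSON keeps it readable by both people and programs.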
Primary and derived data
Primary databases:
Databases consisting of experimentally derived data, such as
nucleotide sequences and three-dimensional structures.
Secondary databases:
Databases consisting of data derived from the analysis or treatment of
primary data.
Nucleotide sequence databases
GenBank www.ncbi.nlm.nih.gov/GenBank
EMBL www.ebi.ac.uk/embl
DDBJ www.ddbj.nig.ac.jp
The 3 databases are synchronized on a daily basis, and the accession numbers are consistent.
There are no legal restrictions on the usage of these databases. However, there are some patented sequences in the databases.
GenBank sample record
http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
Sections of the record: header, title, taxonomy, citation, features, sequence

LOCUS       AF115338 591 bp DNA linear BCT 19-AUG-1999
DEFINITION  Pseudomonas fluorescens ECF sigma factor SigX (sigX) gene, complete cds.
ACCESSION   AF115338
VERSION     AF115338.1 GI:4959391
KEYWORDS    .
SOURCE      Pseudomonas fluorescens.
  ORGANISM  Pseudomonas fluorescens
            Bacteria; Proteobacteria; gamma subdivision; Pseudomonadaceae; Pseudomonas.
REFERENCE   1 (bases 1 to 591)
  AUTHORS   Brinkman,F.S., Schoofs,G., Hancock,R.E. and De Mot,R.
  TITLE     Influence of a putative ECF sigma factor on expression of the major outer membrane protein, OprF, in Pseudomonas aeruginosa and Pseudomonas fluorescens
  JOURNAL   J. Bacteriol. 181 (16), 4746-4754 (1999)
  MEDLINE   99369842
  PUBMED    10438740
REFERENCE   2 (bases 1 to 591)
  AUTHORS   De Mot,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-DEC-1998) F.A. Janssens Laboratory of Genetics, Applied Plant Sciences, K. Mercierlaan 92, Heverlee B-3001, Belgium
FEATURES             Location/Qualifiers
     source          1..591
                     /organism="Pseudomonas fluorescens"
                     /strain="M114"
                     /db_xref="taxon:294"
     gene            1..591
                     /gene="sigX"
     CDS             1..591
                     /gene="sigX"
                     /codon_start=1
                     /transl_table=11
                     /product="ECF sigma factor SigX"
                     /protein_id="AAD34329.1"
                     /db_xref="GI:4959392"
                     /translation="MNKAQTLSTRYDPRELSDEELVARSHTELFHVTRAYEELMRRYQ
                     RTLFNVCARYLGNDRDADDVCQEVMLKVLYGLKNLEGKSKFKTWLYSITYNECITQYR
                     KERRKRRLMDALSLDPLEEASEEKALQPEEKGGLDRWLVYVNPIDRGILVLRFVAELE
                     FQEIADIMHMGLSATKMRYKRALDKLREKFAGETET"
BASE COUNT  157 a 133 c 170 g 131 t
ORIGIN
        1 atgaataaag cccaaacgct atccacgcgc tacgaccccc gcgagctctc tgatgaggag
       61 ttggtcgcgc gctcgcatac cgagcttttt cacgtaacgc gcgcctatga agaactgatg
      121 cggcgttacc agcgaacatt atttaacgtt tgtgcgagat atcttgggaa cgatcgcgac
      181 gcagacgatg tctgtcagga agtcatgttg aaggtgctgt atggcctgaa gaacctcgag
      241 gggaaatcga agttcaaaac gtggctctac agcatcacgt acaacgaatg tattacgcag
      301 tatcggaagg aacggcgaaa gcgtcgcttg atggacgcat tgagtcttga ccccctcgag
      361 gaagcgtccg aagaaaaggc gcttcaaccc gaggagaagg gcgggcttga tcgctggctg
      421 gtgtatgtga acccgattga ccgtggaatt ctggtgcttc gatttgtcgc agagctggaa
      481 tttcaggaga tcgcagacat catgcacatg ggtttgagtg cgacaaaaat gcgttacaaa
      541 cgtgctctag ataaattgcg tgagaaattt gcaggcgaga ctgaaactta g
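Because a GenBank record is plain text with keywords in fixed columns, its header can be read by a short program. The sketch below is a simplified illustration, not a full parser: the function name `parse_header` is hypothetical, and it assumes the standard flat-file layout in which the keyword occupies the first 12 columns and continuation lines start with spaces. The excerpt is taken from the sample record above.

```python
SAMPLE = """\
LOCUS       AF115338                 591 bp    DNA     linear   BCT 19-AUG-1999
DEFINITION  Pseudomonas fluorescens ECF sigma factor SigX (sigX) gene,
            complete cds.
ACCESSION   AF115338
VERSION     AF115338.1  GI:4959391
"""


def parse_header(text):
    """Map each keyword found in columns 1-12 to its (joined) value."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        label, value = line[:12].strip(), line[12:].strip()
        if label:                      # a new keyword starts here
            key = label
            fields[key] = value
        elif key is not None:          # continuation of the previous keyword
            fields[key] += " " + value
    return fields


header = parse_header(SAMPLE)
print(header["ACCESSION"])
print(header["DEFINITION"])
```

For real work a maintained parser is the safer choice; this sketch only shows why a computer-readable layout matters.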
Protein sequence database
The mission of UniProt is to provide the
scientific community with a comprehensive,
high-quality and freely accessible resource of
protein sequence and functional information.
The UniProt Knowledgebase (UniProtKB)
is the central hub for the collection of functional information on proteins, with accurate,
consistent and rich annotation. It consists of:
o Swiss-Prot, which is manually annotated and reviewed.
o TrEMBL, which is automatically annotated and is not reviewed.
The UniProt Reference Clusters (UniRef), which are
used to speed up sequence similarity searches.
UniProt entry
Protein data bank
The PDB archive contains information about experimentally
determined structures of proteins, nucleic acids, and complex
assemblies (X-ray, NMR, computationally predicted).
Mission: maintain a single archive of macromolecular structural data that is freely
and openly available to the global community.
Number of Structures Available
PDB entry
Protein structure levels
The Gene Ontology (GO)
GO goals
The GO website: http://www.geneontology.org
GO is divided into 3 domains (levels of annotation):
o Molecular function - basic activities of a gene product at the molecular level
o Biological process - set of molecular events with a defined beginning and end
o Cellular component - the parts of a cell or its extracellular environment
GO structure
The structure of GO can be described in terms of a directed acyclic graph (DAG), where each
GO term is a node, and the relationships between the terms are arcs between the nodes.
Figure: nucleus, chromosome and mitochondrion as parent terms; nuclear chromosome and mitochondrial chromosome as child terms, connected by is_a and part_of arcs.
GO currently has 2 relationship types:
o is_a
An is_a child of a parent means that the child is a complete type of its parent, but can be discriminated in some way from other children of the parent.
o part_of
A part_of child of a parent means that the child is always a constituent of the parent that, in combination with other constituents of the parent, makes up the parent.
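A DAG of this kind can be sketched with a plain dictionary. The arcs below are an assumed reading of the slide's diagram (nuclear chromosome is_a chromosome and part_of nucleus; mitochondrial chromosome is_a chromosome and part_of mitochondrion). The names `EDGES`, `ancestors` and `parents_by_relation` are hypothetical and are not part of any GO software.

```python
from collections import deque

# child term -> list of (relationship, parent term)
EDGES = {
    "nuclear chromosome": [("is_a", "chromosome"), ("part_of", "nucleus")],
    "mitochondrial chromosome": [("is_a", "chromosome"), ("part_of", "mitochondrion")],
    "chromosome": [],
    "nucleus": [],
    "mitochondrion": [],
}


def ancestors(term, edges=EDGES):
    """Return every term reachable from `term` by following arcs to parents."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relation, parent in edges.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def parents_by_relation(term, relation, edges=EDGES):
    """Return the direct parents linked to `term` by one relationship type."""
    return [parent for rel, parent in edges.get(term, []) if rel == relation]


print(sorted(ancestors("nuclear chromosome")))
print(parents_by_relation("mitochondrial chromosome", "part_of"))
```

Unlike a tree, a DAG lets a term have more than one parent, which is why "nuclear chromosome" has both an is_a parent and a part_of parent here.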
Searching for papers
http://www.ncbi.nlm.nih.gov/pubmed
http://scholar.google.com/
http://www.scopus.com/home.url
http://portal.isiknowledge.com/
Querying GenBank
http://www.ncbi.nlm.nih.gov/sites/gquery
Search from the Entrez main page for the gene whose accession
number is BC043443.
o How many results do we get in the Gene db?
o What is the official name of the gene? Other possible names?
o On which DNA strand is it located?
o How many splicing variants does it have?
o Which disease is the gene associated with?
o Is it involved in the apoptosis process?
o How long is the coding sequence of the first splicing variant?
Querying GenBank
http://www.ncbi.nlm.nih.gov/genbank/
NG_000007
What kind of molecule is it? Genomic DNA
Where is the promoter of the gene HBB located? Upstream of nucleotide 70545
Indicate the number of exons = 3
Indicate the length of the second exon = 71039-70817+1 = 223 nts
Indicate the number of introns = 2
Indicate the length of the first intron = 70816-70685+1 = 132 nts
Indicate the location of the 5' UTR = 70545..70594
Indicate the length of the 5' UTR = 70594-70545+1 = 50 nts
Indicate the location of the 3' UTR = 72019..72150
Indicate the length of the 3' UTR = 72150-72019+1 = 132 nts
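The lengths above all follow the same rule: for a feature given as start..end with both positions included, the length is end - start + 1. A minimal sketch of that arithmetic follows; the function name `interval_length` is hypothetical, and the coordinates are the ones from this exercise.

```python
def interval_length(start, end):
    """Length in nucleotides of an inclusive, 1-based interval start..end."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


print(interval_length(70817, 71039))  # second exon
print(interval_length(70545, 70594))  # 5' UTR
print(interval_length(72019, 72150))  # 3' UTR
```

The "+1" is needed because both the first and the last nucleotide belong to the feature.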
Indicate the nucleotide positions of the start codon = 70595,70596,70597
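These positions can be checked against the sequence itself. The sketch below assumes the region starts at nucleotide 70545 and uses the first line of the HBB FASTA sequence from this lecture; the names `REGION_START`, `FIRST_LINE` and `to_index` are hypothetical.

```python
REGION_START = 70545  # coordinate of the first nucleotide of the HBB region

# First line of the FASTA sequence for region 70545-72150
FIRST_LINE = "ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA"


def to_index(position, region_start=REGION_START):
    """Convert a 1-based coordinate on the reference into a 0-based string index."""
    return position - region_start


codon = "".join(FIRST_LINE[to_index(p)] for p in (70595, 70596, 70597))
print(codon)  # ATG
```

The start codon sits right after the 50 nucleotides of the 5' UTR, so it occupies string indexes 50 to 52.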
Download the sequence of the HBB gene in FASTA format
Selected region: from 70545 to 72150
>gi|28380636:70545-72150 Homo sapiens beta globin region (HBB@); and hemoglobin, beta (HBB); and hemoglobin, delta (HBD); and hemoglobin, epsilon 1 (HBE1); and hemoglobin, gamma A (HBG1); and hemoglobin, gamma G (HBG2), RefSeqGene on chromosome 11
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
AGGTTGGTATCAAGGTTACAAGACAGGTTTAAGGAGACCAATAGAAACTGGGCATGTGGAGACAGAGAAG
ACTCTTGGGTTTCTGATAGGCACTGACTCTCTCTGCCTATTGGTCTATTTTCCCACCCTTAGGCTGCTGG
TGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGG
CAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGAC
AACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACT
TCAGGGTGAGTCTATGGGACGCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTTAAGTTCATGTCATAG
GAAGGGGATAAGTAACAGGGTACAGTTTAGAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCT
CAGGATCGTTTTAGTTTCTTTTATTTGCTGTTCATAACAATTGTTTTCTTTTGTTTAATTCTTGCTTTCT
TTTTTTTTCTTCTCCGCAATTTTTACTATTATACTTAATGCCTTAACATTGTGTATAACAAAAGGAAATA
TCTCTGAGATACATTAAGTAACTTAAAAAAAAACTTTACACAGTCTGCCTAGTACATTACTATTTGGAAT
ATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTATTTTCTTTTATTTTTAATTGATACATAAT
CATTATACATATTTATGGGTTAAAGTGTAATGTTTTAATATGTGTACACATATTGACCAAATCAGGGTAA
TTTTGCATTTGTAATTTTAAAAAATGCTTTCTTCTTTTAATATACTTTTTTGTTTATCTTATTTCTAATA
CTTTCCCTAATCTCTTTCTTTCAGGGCAATAATGATACAATGTATCATGCCTCTTTGCACCATTCTAAAG
AATAACAGTGATAATTTCTGGGTTAAGGCAATAGCAATATCTCTGCATATAAATATTTCTGCATATAAAT
TGTAACTGATGTAAGAGGTTTCATATTGCTAATAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTTT
ATGGTTGGGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATACCTCTT
ATCTTCCTCCCACAGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCA
CCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCA
CTAAGCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACT
GGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC
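A FASTA record is a header line starting with ">" followed by the sequence split over several lines. The sketch below shows one way to read such a record and recover the coordinate range from the header. The function names `read_fasta` and `header_range` are hypothetical, the header pattern is assumed to look like the one above, and the example keeps only a shortened header and the first two sequence lines.

```python
import re


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def header_range(header):
    """Extract (start, end) from a header such as 'gi|28380636:70545-72150 ...'."""
    match = re.search(r":(\d+)-(\d+)", header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


example = """>gi|28380636:70545-72150 Homo sapiens beta globin region
ACATTTGCTTCTGACACAACTGTGTTCACTAGCAACCTCAAACAGACACCATGGTGCATCTGACTCCTGA
GGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGC
"""

header, sequence = read_fasta(example)[0]
start, end = header_range(header)
print(end - start + 1)  # length of the full region announced by the header
print(len(sequence))    # length of the shortened example sequence
```

Comparing the length announced by the header with the length of the downloaded sequence is a quick check that the file is complete.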
Querying GenBank: link to geneID
Querying PUBMED
http://www.ncbi.nlm.nih.gov/pubmed
How many articles did Nunzio D'Agostino publish?
D'Agostino, Nunzio [Full Author Name] OR D Agostino, Nunzio [Full Author Name]
How many of these are related to EST?
D'Agostino, Nunzio [Full Author Name] AND EST [Title/Abstract]
How many of these are in the journal BMC Genomics?
D'Agostino, Nunzio [Full Author Name] OR D Agostino, Nunzio [Full Author Name] AND BMC Genomics [journal]
How many articles in PubMed include the word "RNA-Seq" in the title?
RNA-Seq [title]
How many reviews containing the word "transcriptome" were published in 2008?
transcriptome [title] AND review [Publication Type] AND 2008 [publication date]
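The same queries can also be sent programmatically. The lecture uses the PubMed web page only; the sketch below assumes NCBI's E-utilities ESearch service as an alternative and merely builds the request URL without contacting the server. The helper names `field` and `esearch_url` are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field(term, tag):
    """Attach a PubMed search field tag to a term, e.g. 'RNA-Seq [title]'."""
    return f"{term} [{tag}]"


def esearch_url(query, db="pubmed"):
    """Build an ESearch URL that asks only for the number of matching records."""
    return ESEARCH + "?" + urlencode({"db": db, "term": query, "rettype": "count"})


query = " AND ".join(
    [
        field("transcriptome", "title"),
        field("review", "Publication Type"),
        field("2008", "publication date"),
    ]
)
url = esearch_url(query)
print(query)
print(url)
```

Opening the printed URL in a browser, or fetching it from a script, should return a short XML document with the count; counts change over time, so none is quoted here.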