INTRODUCTION TO BIOLOGICAL DATABASES
Basic Genomic Characteristic
AIM: to collect as much general information as possible about your gene.
Nucleotide sequence databases
○ NCBI GenBank
○ EMBL Nucleotide Sequence Database
○ DDBJ
For protein sequences
○ UniProtKB
○ NCBI Reference Sequence (RefSeq)
Nucleotide sequence DB
The 3 databases form an international collaboration. Each of the three groups collects a portion of the total sequence data reported worldwide, and all new and updated database entries are exchanged between the groups on a daily basis.
You do not need to check all of them!
NCBI Entrez
Entrez presents all the information available at NCBI for a gene. It is an integrated search tool that works across all the NCBI databases.
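As a rough illustration of searching one term across several NCBI databases, the sketch below builds query URLs for the NCBI E-utilities ESearch service, the programmatic counterpart of the Entrez search box. The helper name `entrez_search_url`, the gene symbol `HBB` and the choice of databases are assumptions made for the example. The code only constructs URLs and sends no request.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_search_url(db: str, term: str, retmax: int = 5) -> str:
    """Build (but do not send) an ESearch URL for one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical example: the same gene symbol queried in several databases.
EXAMPLE_DATABASES = ["gene", "nuccore", "protein", "omim"]
example_urls = {db: entrez_search_url(db, "HBB") for db in EXAMPLE_DATABASES}

if __name__ == "__main__":
    for db, url in example_urls.items():
        print(db, url)
```

A real search would usually restrict the term with field tags (for example by organism), which this sketch leaves out.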
Genome Browsers
NCBI Sequence Viewer
UCSC Genome Browser
ENSEMBL
NCBI Sequence Viewer
This is an example view of the human beta-globin region on chromosome 11.
UCSC Genome Browser
ENSEMBL
ENSEMBL – genome view
ENSEMBL – Gene tree
NCBI OMIM database
Nucleotide databases and genome browsers provide information on the gene's nucleotide sequence (exons, introns, alternative splicing sites…) but give you very little information on gene function.
The OMIM database provides a summary of the literature concerning a gene.
Protein Databases
Protein databases provide useful information about the function of a gene, e.g. conserved protein domains.
UniProt is the reference database
InterPro offers automatic protein annotation based on conserved domains
RefSeq
Protein databases - UniProt
Similarity search
If your gene has no protein information:
Protein sequence available: BLASTP against a non-redundant protein database
Protein sequence unavailable: BLASTX against a non-redundant protein database
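The choice between the two programs can be sketched in a few lines. BLASTX accepts a nucleotide query because it translates the query in all six reading frames before comparing it with the protein database. The example below mimics that translation step with the standard genetic code. The function names and the short test sequence are assumptions made for illustration, and no BLAST search is actually run.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def choose_blast_program(has_protein_sequence: bool) -> str:
    """BLASTP for a protein query, BLASTX for a nucleotide query."""
    return "blastp" if has_protein_sequence else "blastx"


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna: str) -> dict:
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    print(choose_blast_program(has_protein_sequence=False))
    print(six_frame_translations("ATGGCCTAA"))
```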
Protein 3D structure
Many proteins have an experimentally determined 3D structure. The biggest databases are:
PDB
NCBI Structure Group
Dali
They offer tools for visualization.
PDB database
The visualization tools allow you to see the structure and the ligands (if present), rotate the image and zoom in.
3D structure prediction
Structures are still available for only a limited number of proteins
Efforts are under way to predict protein structures based on sequence similarity
Still not very accurate!
Swiss-Model
PSIPRED
PredictProtein
Swiss-Model
Protein interaction databases
AIM: find proteins that interact with your target
IntAct: an EBI resource for finding interactors
BioGRID: a freely available interaction database for model organisms and humans.
IntAct
miRNA specific resources
Databases:
miRNAMap: presents useful information such as secondary structure, tissue-specific expression and predicted target genes
HMDD: specific for disease-miRNA associations
miRBase: a searchable database of published miRNA sequences and annotation.
Target prediction tools:
miRecords: a good repository that shows confirmed target genes and predictions from several other software tools
C. elegans specific tools
WormBase: the main resource of information on C. elegans.
Expression pattern databases:
Hope lab Expression Pattern Database
The Nematode Expression Pattern DataBase
Caenorhabditis elegans Genetics and Genomics: provides links to many useful resources for C. elegans
Expression databases
Allow exploratory analyses of multiple experiments
Experiments need to be linked
Require much information about how experiments were conducted = sources of variation
Very different from genomic databases
MIAME standard
MIAME
Experimental design
Microarray design
Extraction, preparation and labelling
Hybridisation conditions
Measurements: images, quantifications, parameters
Systematic error adjustments and transformations
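One way to picture the MIAME items is as a checklist that a submission either fills in or leaves blank. The sketch below is only an illustration of that idea: the class `MiameChecklist`, its field names and the example values are assumptions derived from the headings on this slide, not part of any official MIAME schema or submission tool.

```python
from dataclasses import dataclass, fields


@dataclass
class MiameChecklist:
    """One free-text description per MIAME section listed on the slide."""
    experimental_design: str = ""
    microarray_design: str = ""
    extraction_preparation_labelling: str = ""
    hybridisation_conditions: str = ""
    measurements: str = ""
    error_adjustments_and_transformations: str = ""


def missing_sections(record: MiameChecklist) -> list:
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


if __name__ == "__main__":
    # Hypothetical, partially documented experiment.
    draft = MiameChecklist(
        experimental_design="dose response, 3 doses, 3 biological replicates",
        measurements="scanned images plus quantification files",
    )
    print(missing_sections(draft))
```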
Gene Expression Omnibus
NCBI administered
~280,000 samples
>100 organisms
>1,000,000,000 measurements
ArrayExpress
EBI administered
>7000 experiments
Provides p-values
Bioconductor package
GEO and ArrayExpress Databases provide:
The raw data for each hybridization (e.g., CEL or GPR files)
The final processed (normalized) data for the set of hybridizations in the experiment (study) (e.g., the gene expression data matrix used to draw the conclusions from the study)
The essential sample annotation including experimental factors and their values (e.g., compound and dose in a dose response experiment)
The experimental design including sample data relationships (e.g., which raw data file relates to which sample, which hybridizations are technical, which are biological replicates)
Sufficient annotation of the array (e.g., gene identifiers, genomic coordinates, probe oligonucleotide sequences or reference commercial array catalog number)
The essential laboratory and data processing protocols (e.g., what normalization method has been used to obtain the final processed data)
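The sample and data relationships in the list above (which raw file belongs to which sample, and which hybridizations are technical or biological replicates) can be pictured as a small table. The sketch below is a hypothetical model, not the GEO or ArrayExpress file format: the `Hybridization` class, the file names and the compound and dose values are all invented for illustration. Distinct samples under the same condition are counted as biological replicates, and extra raw files for the same sample are counted as technical replicates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybridization:
    raw_file: str   # e.g. a CEL or GPR file name
    sample: str     # biological sample the extract came from
    compound: str   # experimental factor value
    dose: str       # experimental factor value


def group_by_condition(hybs):
    """Map (compound, dose) -> sample -> list of raw files."""
    grouped = defaultdict(lambda: defaultdict(list))
    for h in hybs:
        grouped[(h.compound, h.dose)][h.sample].append(h.raw_file)
    return {cond: dict(samples) for cond, samples in grouped.items()}


def replicate_counts(hybs):
    """Count biological and technical replicates per condition."""
    counts = {}
    for cond, samples in group_by_condition(hybs).items():
        counts[cond] = {
            "biological": len(samples),
            "technical": sum(len(files) - 1 for files in samples.values()),
        }
    return counts


if __name__ == "__main__":
    # Hypothetical dose-response design.
    design = [
        Hybridization("a1.CEL", "mouse1", "drugX", "10mg"),
        Hybridization("a2.CEL", "mouse1", "drugX", "10mg"),
        Hybridization("b1.CEL", "mouse2", "drugX", "10mg"),
        Hybridization("c1.CEL", "mouse3", "drugX", "0mg"),
    ]
    print(replicate_counts(design))
```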
Problems:
Difficult to compare experiments
Significant genes not highlighted
Poor visualization of results
ArrayExpress is trying to solve these problems with its Atlas
Genevestigator
It is a Java visualization tool that summarizes results from thousands of high-quality transcriptomic experiments
Much easier to compare samples
Open access covers only some of the data and one probe set per gene
ONCOMINE