Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

Upload: keegan-pettitt

Post on 15-Jan-2016


Page 1: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

Web Resources for Bioinformatics

Vadim Alexandrov and Mark Gerstein

Page 2: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

What is Bioinformatics?

• (Molecular) Bio - informatics
• One idea for a definition?

Bioinformatics is conceptualizing biology in terms of molecules (in the sense of physical chemistry) and then applying “informatics” techniques (derived from disciplines such as applied math, CS, and statistics) to understand and organize the information associated with these molecules, on a large scale.

• Bioinformatics is “MIS” for Molecular Biology Information. It is a practical discipline with many applications.

Page 3: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

Web Resources:

• Molecules
  – Sequence, Structure, Function
• Algorithms
  – HMMs
  – alignments
  – simulations
• Databases


Web tour of UCL tools and resources

Page 17: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

1. PDBsum capabilities

PDBsum: www.biochem.ucl.ac.uk/bsm/pdbsum

Starting point for looking at a PDB structure. Each entry contains:

a. View
- Schematic pictures of the entry
- Interactive views (RasMol/VRML)

b. Details
- Name, date and description of macromolecules in PDB entry
- Authors, resolution and R-factor

c. Links
- PDB header information
- PDB, NDB, SWISSPROT
- PQS (protein quaternary structure), MMDB
- CATH, SCOP, FSSP
- Structure check reports: PROCHECK, WHATIF
- Many others: enzyme, PRINTS etc.

Page 18: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

PDBsum capabilities, continued

d. Each chain
- CATH classification
- Plot of sequence, secondary structure and domain assignments
- PROMOTIF analysis
- TOPS topology diagram
- SAS: annotated FASTA alignment of related sequences in PDB
- PROSITE pattern

e. Nucleic acid ligands
- Base sequence
- NUCPLOT diagram of interactions

f. Small molecule ligands
- Schematic diagram of ligand
- LIGPLOT diagram of interactions
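The "Details" fields listed for a PDBsum entry (description, date, resolution) ultimately come from the header records of the PDB file. As a rough illustration only, and not PDBsum's own code, the sketch below pulls a few of these fields out of PDB-format text. It assumes the legacy fixed-column HEADER record and the conventional "REMARK   2 RESOLUTION." line; the function name `parse_pdb_header` and the example records (including the ID `1ABC`) are made up for this example.

```python
import re

# Matches the conventional resolution remark, e.g.
# "REMARK   2 RESOLUTION.    1.80 ANGSTROMS."
RESOLUTION_RE = re.compile(r"REMARK\s+2\s+RESOLUTION\.\s+([\d.]+)\s+ANGSTROMS")


def parse_pdb_header(lines):
    """Collect a few summary fields from PDB-format header records."""
    info = {}
    for line in lines:
        if line.startswith("HEADER"):
            # Legacy PDB format: classification in columns 11-50,
            # deposition date in columns 51-59, ID code in columns 63-66.
            info["classification"] = line[10:50].strip()
            info["deposited"] = line[50:59].strip()
            info["id_code"] = line[62:66].strip()
            continue
        match = RESOLUTION_RE.match(line)
        if match:
            info["resolution_angstrom"] = float(match.group(1))
    return info


# Hypothetical records, built with explicit column widths so the
# fixed-column slicing above lines up.
example = [
    f"{'HEADER':<10}{'HYDROLASE':<40}{'01-JAN-99':<9}   1ABC",
    "REMARK   2 RESOLUTION.    1.80 ANGSTROMS.",
]
summary = parse_pdb_header(example)
print(summary)
```

Entries without a resolution remark (for example NMR structures) would simply lack the `resolution_angstrom` key in this sketch.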

Page 19: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

2. SAS (Sequence Annotated by Structure): www.biochem.ucl.ac.uk/bsm/sas

Annotation of protein sequences by structural information.

a. Input for FASTA search of rest of PDB
- PDB code
- SWISS-PROT code
- Paste sequence
- Upload own alignment

b. Annotation
- Residue type
- Ligand contacts
- Active site residues
- CATH domains
- Residue similarity

c. Options
- Select inclusion in alignment
- Colour/b&w, secondary structure

d. View 3D structural superposition
- coloured by SAS annotation

Page 21: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

3. CATH: www.biochem.ucl.ac.uk/bsm/cath

Hierarchical domain classification of protein structures in the PDB. Four basic levels (a-d), plus a sequence-family level (e):

a. Class (automated): secondary structure composition and packing within the structure

- mainly-α, mainly-β, mixed α/β, low secondary structure

b. Architecture (manual): overall shape of the domain structure as determined by the orientations of the secondary structures. Connectivity is ignored

- e.g. barrel, sandwich etc.

c. Topology (semi-automated): fold families determined by shape and connectivity of secondary structures

- e.g. mainly-β two-layer sandwich

d. Homologous superfamily (semi-automated): domains sharing a common ancestor, as determined by sequence and structural similarity

e. Sequence family (automated): domains with highly similar structure and function, as determined by sequence identity
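Because the hierarchy is nested, a CATH domain is commonly referred to by a dotted code with one number per level (Class.Architecture.Topology.Homologous superfamily). The sketch below is only an illustration of working with such codes; the helper names `parse_cath_code` and `deepest_shared_level` are invented here, and the two codes in the usage lines are used purely as examples.

```python
from typing import NamedTuple, Optional

# Class numbers as used in CATH codes.
CATH_CLASSES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "mixed alpha-beta",
    4: "low secondary structure",
}

LEVEL_NAMES = ("class", "architecture", "topology", "homologous superfamily")


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted C.A.T.H code into its four integer levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Name the deepest level at which two domains are classified together."""
    shared = None
    for name, x, y in zip(LEVEL_NAMES, a, b):
        if x != y:
            break
        shared = name
    return shared


first = parse_cath_code("3.40.50.720")
second = parse_cath_code("3.40.50.300")
print(CATH_CLASSES[first.cath_class])       # class name of the first domain
print(deepest_shared_level(first, second))  # deepest level the two share
```

Two domains that differ already at the class level share nothing in the hierarchy, so the helper returns `None` for them.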

Page 23: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

4. Other classification databases

a. Enzyme structures database: www.biochem.ucl.ac.uk/bsm/enzymes

- PDB enzyme structures classified by E.C. number

b. Protein-DNA database: www.biochem.ucl.ac.uk/bsm/prot_dna/prot_dna.html

- PDB complex structures classified by binding motif
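Classification by E.C. number, as in item (a), works because an E.C. number is itself a four-level hierarchy whose first digit names the enzyme class. The following sketch shows how entries could be grouped by that top level. It is an illustration only: `parse_ec` and `group_by_class` are hypothetical helpers, and the entry identifiers in the example are placeholders, not real PDB entries.

```python
from collections import defaultdict

# Top-level E.C. classes 1-6 (a seventh class, translocases, was added in 2018).
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def parse_ec(ec: str):
    """Return the four E.C. levels as a tuple; '-' (unassigned) becomes None."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    return tuple(None if p == "-" else int(p) for p in parts)


def group_by_class(entries):
    """Group entry identifiers by the name of their top-level E.C. class."""
    groups = defaultdict(list)
    for entry_id, ec in entries.items():
        top_level = parse_ec(ec)[0]
        groups[EC_CLASSES.get(top_level, "unknown")].append(entry_id)
    return dict(groups)


# Placeholder identifiers; 3.4.21.4 is trypsin, 1.1.1.1 alcohol dehydrogenase.
entries = {"entryA": "3.4.21.4", "entryB": "EC 1.1.1.1", "entryC": "3.4.-.-"}
grouped = group_by_class(entries)
print(grouped)
```

Partially assigned numbers such as `3.4.-.-` still fall into the right class, since only the first field is needed for the top-level grouping.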

Page 24: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

5. Protein sequence analysis: www.biochem.ucl.ac.uk/bsm/dbbrowser

Protein sequence search using protein fingerprints - groups of conserved sequence motifs used to characterize a protein family.
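To make the fingerprint idea concrete: a family is characterized by several motifs, and a sequence is a good candidate when most or all of them occur, in order, along the chain. The sketch below is a deliberately simplified stand-in that uses regular expressions; real fingerprint databases score motifs with residue frequency tables rather than regexes. The function `scan_fingerprint`, the motif patterns and the test sequence are all invented for illustration.

```python
import re


def scan_fingerprint(sequence, motifs):
    """Find fingerprint motifs in order along a sequence.

    Each motif is searched for downstream of the previous hit; motifs that
    are not found are skipped. Returns a list of (pattern, start) tuples.
    """
    hits = []
    position = 0
    for pattern in motifs:
        match = re.compile(pattern).search(sequence, position)
        if match is None:
            continue
        hits.append((pattern, match.start()))
        position = match.end()
    return hits


# Hypothetical three-motif fingerprint and a made-up sequence.
motifs = ["G.GK[ST]", "DE.H", "W..Y"]
sequence = "MAGSGKSTLLDEAHRRWKLY"

hits = scan_fingerprint(sequence, motifs)
coverage = len(hits) / len(motifs)
print(hits, coverage)
```

A partial match (some motifs missing) can still be informative, which is why the sketch reports the fraction of motifs found instead of a yes/no answer.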

Page 26: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

7. Atomic-level protein properties

a. PROCAT: www.biochem.ucl.ac.uk/bsm/PROCAT/PROCAT.html

- Database of 3D enzyme active sites

b. Hydrogen bond atlas: www.biochem.ucl.ac.uk/~mcdonald/atlas

- Graphical summary of hydrogen-bonding properties of amino acids

c. Atlas of side chain-side chain/side chain-base interactions: www.biochem.ucl.ac.uk/bsm/sidechains

- Interaction geometries of side chain-side chain and side chain-base pairs

Page 27: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

8. Publicly available software (protein structure/interaction)

a. HBPLUS - calculation of interactions in PDB structures

b. LIGPLOT - schematic diagrams of protein-ligand interactions

c. NUCPLOT - schematic diagrams of protein-DNA interactions

d. PROMOTIF - analyze protein secondary structural motifs

e. NACCESS - calculate atomic accessibilities of protein surfaces

f. SURFNET - visualization of molecular surfaces, cavities etc

g. PROCHECK - check stereochemical quality of protein structures

h. THREADER - prediction of protein tertiary structure

i. MEMSAT - prediction of transmembrane protein structure

j-z. BROWSE THE WEB IN YOUR SPARE TIME AND BOOKMARK ‘EM!

Page 28: Web Resources for Bioinformatics Vadim Alexandrov and Mark Gerstein

‘Domestic’ resources: http://bioinfo.mbb.yale.edu/partslist/