(52925 Protein analysis practicals) Advanced practical course in bioinformatics. HOW – hands-on workshop on protein analysis. Liisa Holm

Post on 14-Jan-2016

TRANSCRIPT

Page 1: Liisa Holm

(52925 Protein analysis practicals)

Advanced practical course in bioinformatics

HOW – hands-on workshop on protein analysis

Liisa Holm

Page 2: Liisa Holm

Instructors

• Patrik Koskinen
• Petri Törönen

Page 3: Liisa Holm

Course web page

http://ekhidna.biocenter.helsinki.fi/how

– Schedule
– Talks
– Exercises
– Course assignments
– Instructions for computer use

Page 4: Liisa Holm

Mode of work

• Course assignments
  – Two sequences assigned to each team
• Sessions (12-16)
  – Demonstrations (~ 1 hour)
  – Practical exercises
    • Structured questions
    • You should first try yourself, then ask your team mate, then ask an instructor
    • Discuss results with your team mate
  – Try out tools on your assigned sequences during the course
    • Second-last session reserved solely for working on course assignments
• Presentations
  – Course grade based on presentation (2 March)
    • Two sequence assignments per team

Page 5: Liisa Holm

Objectives

• Infer function and/or structure starting from the amino acid sequence of a query protein
  – Identify related sequences, place in family
  – Identify conserved positions in sequence and structure

• Learn to use representative web-based tools

• No programming, no Unix/Linux

Page 6: Liisa Holm

Introduction

• Most cellular functions are performed or facilitated by proteins.
  – Primary biocatalyst
  – Cofactor transport/storage
  – Mechanical motion/support
  – Immune protection
  – Control of growth/differentiation

Page 7: Liisa Holm

Linear DNA

Watson & Crick (1953)

Page 8: Liisa Holm

3D structure

1mbn

Myoglobin

Kendrew & Perutz (1957)

Page 9: Liisa Holm

Function = interactions

Page 10: Liisa Holm

Evolution

Sequence – Structure – Function

[Diagram labels: DNA sequence; Protein sequence; Protein structure; Protein function; Natural selection]

Page 11: Liisa Holm

What can sequence analysis do?

• Homology
  – Inference of inherited complex features: what is conserved is important
  – Most powerful approach
  – Good tertiary structure prediction
• Diagnostic patterns
  – E.g. subcellular localization signals
• Physical preferences
  – Good secondary structure prediction
  – Prediction of transmembrane segments
  – Poor ab initio tertiary structure prediction

Page 12: Liisa Holm

Application: Finding Homologs

Page 13: Liisa Holm

Application: Finding Homologues

• Find similar ones in different organisms
• Human vs. mouse vs. yeast
  – Easier to do experiments on the latter!

(Section from NCBI Disease Genes Database Reproduced Below.)

Best Sequence Similarity Matches to Date Between Positionally Cloned Human Genes and S. cerevisiae Proteins

Human Disease | MIM # | Human Gene | GenBank Acc# for Human cDNA | BLASTX P-value | Yeast Gene | GenBank Acc# for Yeast cDNA | Yeast Gene Description
Hereditary Non-polyposis Colon Cancer | 120436 | MSH2 | U03911 | 9.2e-261 | MSH2 | M84170 | DNA repair protein
Hereditary Non-polyposis Colon Cancer | 120436 | MLH1 | U07418 | 6.3e-196 | MLH1 | U07187 | DNA repair protein
Cystic Fibrosis | 219700 | CFTR | M28668 | 1.3e-167 | YCF1 | L35237 | Metal resistance protein
Wilson Disease | 277900 | WND | U11700 | 5.9e-161 | CCC2 | L36317 | Probable copper transporter
Glycerol Kinase Deficiency | 307030 | GK | L13943 | 1.8e-129 | GUT1 | X69049 | Glycerol kinase
Bloom Syndrome | 210900 | BLM | U39817 | 2.6e-119 | SGS1 | U22341 | Helicase
Adrenoleukodystrophy, X-linked | 300100 | ALD | Z21876 | 3.4e-107 | PXA1 | U17065 | Peroxisomal ABC transporter
Ataxia Telangiectasia | 208900 | ATM | U26455 | 2.8e-90 | TEL1 | U31331 | PI3 kinase
Amyotrophic Lateral Sclerosis | 105400 | SOD1 | K00065 | 2.0e-58 | SOD1 | J03279 | Superoxide dismutase
Myotonic Dystrophy | 160900 | DM | L19268 | 5.4e-53 | YPK1 | M21307 | Serine/threonine protein kinase
Lowe Syndrome | 309000 | OCRL | M88162 | 1.2e-47 | YIL002C | Z47047 | Putative IPP-5-phosphatase
Neurofibromatosis, Type 1 | 162200 | NF1 | M89914 | 2.0e-46 | IRA2 | M33779 | Inhibitory regulator protein
Choroideremia | 303100 | CHM | X78121 | 2.1e-42 | GDI1 | S69371 | GDP dissociation inhibitor
Diastrophic Dysplasia | 222600 | DTD | U14528 | 7.2e-38 | SUL1 | X82013 | Sulfate permease
Lissencephaly | 247200 | LIS1 | L13385 | 1.7e-34 | MET30 | L26505 | Methionine metabolism
Thomsen Disease | 160800 | CLC1 | Z25884 | 7.9e-31 | GEF1 | Z23117 | Voltage-gated chloride channel
Wilms Tumor | 194070 | WT1 | X51630 | 1.1e-20 | FZF1 | X67787 | Sulphite resistance protein
Achondroplasia | 100800 | FGFR3 | M58051 | 2.0e-18 | IPL1 | U07163 | Serine/threonine protein kinase
Menkes Syndrome | 309400 | MNK | X69208 | 2.1e-17 | CCC2 | L36317 | Probable copper transporter

Page 14: Liisa Holm

What you will learn

• Multiple alignment
  – Used as input to many prediction tools
  – Improves sequence-structure alignment
  – Identify functional sites
• Protein structure
  – Visualisation
  – Comparative modelling
• Using phylogeny in function assignment
  – Family classifications

Page 15: Liisa Holm

Flowchart

Query = protein sequence

Sequence similarity to other proteins?
  Yes: does similarity imply homology?
    Yes: place query in family tree
      Known function(s) in family?
        Yes: transfer function; verify conservation of functional motifs
        No: motif search; use other data
      Known structure in family?
        Yes: comparative modelling; validate motifs against 3D model
        No: secondary structure prediction
    No: use single sequence methods
  No: single sequence methods
    Motif search; secondary structure prediction; use other data
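The course itself requires no programming, but the decision logic of the flowchart can be written down as a short Python sketch for readers who find that easier to follow. The class `Evidence`, the function `plan_analysis`, and all field names are hypothetical and are not part of the course material. The branch order follows one reading of the slide, and the two "No" branches are assumed to lead to the same single-sequence methods.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Evidence:
    """Answers to the yes/no questions in the flowchart (hypothetical names)."""
    similar_sequences_found: bool
    similarity_implies_homology: bool = False
    family_function_known: bool = False
    family_structure_known: bool = False


def plan_analysis(ev: Evidence) -> list[str]:
    """Return the analysis steps the flowchart suggests for a query protein."""
    # No similarity, or similarity that does not imply homology:
    # fall back to single sequence methods.
    if not (ev.similar_sequences_found and ev.similarity_implies_homology):
        return ["motif search", "secondary structure prediction", "use other data"]

    steps = ["place query in family tree"]

    # Function branch
    if ev.family_function_known:
        steps += ["transfer function", "verify conservation of functional motifs"]
    else:
        steps += ["motif search", "use other data"]

    # Structure branch
    if ev.family_structure_known:
        steps += ["comparative modelling", "validate motifs against 3D model"]
    else:
        steps += ["secondary structure prediction"]

    return steps


if __name__ == "__main__":
    # A homologous family with known function but no known structure.
    for step in plan_analysis(Evidence(True, True, family_function_known=True)):
        print(step)
```

The function and structure branches are independent, so a query in a well-characterised family collects steps from both.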

Page 16: Liisa Holm

Course assignments

• Goal: using the flowchart, what can you say, with what confidence, about the structure and function of the protein?

• Max length of presentation is 25 minutes. No need to dwell on negative results.

• More detailed guidelines given in Session 7.

Page 17: Liisa Holm

Teams

• Team n, n=1,…,12, works on both sequence_nA and sequence_nB
• A and B sequences have been selected to present different challenges. The team members should work together on both sequences, discussing the findings between them and making notes for the final report (presentation).

Sequences are here: http://ekhidna.biocenter.helsinki.fi/how/proteinlist.fasta
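The sequences can simply be copied from the file by hand. As an optional illustration, the Python sketch below shows how a team's two records could be picked out of FASTA text. It assumes that each header line starts with an identifier of the form `sequence_nA` or `sequence_nB`; the actual header format of proteinlist.fasta is not shown in these slides, and the example records and the helper names `parse_fasta` and `team_sequences` are invented for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record id to sequence string."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The id is taken as the first whitespace-separated word of the header.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


def team_sequences(records, team):
    """Select the A and B records for one team (assumed id naming scheme)."""
    wanted = (f"sequence_{team}A", f"sequence_{team}B")
    return {name: records[name] for name in wanted if name in records}


# Made-up example records, not taken from the course file.
example = """>sequence_1A hypothetical example record
MKTAYIAKQR
QISFVKSHFS
>sequence_1B hypothetical example record
MSEQNNTEMT
>sequence_2A hypothetical example record
MAAAK
"""

if __name__ == "__main__":
    for name, seq in team_sequences(parse_fasta(example), 1).items():
        print(name, len(seq), seq)
```

If the real identifiers differ, only the `wanted` tuple in `team_sequences` needs to change.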