guide sheet for tics lab 1 - 4

Adopted (with minor modifications) from: www.nslc.wustl.edu/elgin/Bio3055/manual.pdf www.nslc.wustl.edu/elgin/genomics/Bio3055/3055/projects/PAH/PAH_manual.pdf ______________________________________________________________________________ SCHEDULE ACTIVITY Bioinformatics Lab 1: Tutorial on bioinformatics tools (COX-2 or PTGS2) Bioinformatics Lab 2: Start mini-project/exploration on a disease related to (Guide sheet 1) PAH (explore Gene, ExPASy, and ClustalW)

Answers to Tutorial Sheet Questions Due Answers to Reading Assignment Due

Bioinformatics Lab 3: Swiss-Pdb Viewer/DeepView (Guide sheet 2) Structure Problem Set due

Answers to Guide Sheet 1 Questions Due Bioinformatics Lab 4: OMIM and KEGG (Guide sheet 3) Answers to Guide Sheet 2 Questions Due

One week after Answers to Guide Sheet 3 Questions Due Bioinformatics Lab4 Report for Bioinformatics Lab 1-4 Due Note: There is a glossary of terms available in the course website through LMS (if it is not

available on the computer desktop). ______________________________________________________________________________ BIOINFORMATICS LAB 1: Tutorial to Basic Bioinformatics: COX-2 (PTGS2) This tutorial involves web-based bioinformatics exercise. The tutorial is based on the enzyme cyclooxygenase-2 (COX-2), also known as prostaglandin synthase-2 (PTGS2). The bioinformatics tools from the NCBI (National Center for Biotechnology Information) website will be used. NCBI is a division of the National Institute of Health (NIH), USA. These tools include Gene, GenBank, RefSeq, and PubMed. Gene = database of genes with a brief summary, the common gene symbol, information about the gene function, and links to websites, articles, and sequence information for that gene

GenBank = historical database of gene sequences, which means it contains every sequence that was published, even if the same sequence was published more than once. Therefore, GenBank is considered a redundant database. RefSeq = database of sequences that is edited by NCBI and is NON-redundant, meaning that it contains what NCBI determines is the strongest sequence data for each gene. ClustalW, which is a multiple sequence alignment program will also be introduced. ClustalW allows the entry of a series of gene or protein sequences that one believes to be similar and may be evolutionarily related. These sequences are usually obtained by performing a BLAST search. ClustalW then aligns the sequences, so that the lowest number of gaps is introduced and the highest numbers of similar residues are aligned with each other. About COX-2 (PTGS2) This enzyme is important in prostaglandin synthesis. Prostaglandins have a wide range of roles in our body from aiding in digestion to propagating pain and inflammation. Aspirin is a general inhibitor of prostaglandin synthesis and therefore, helps reduce pain. However, aspirin also inhibits the synthesis of prostaglandins that aid in digestion. Therefore, aspirin is a poor choice for pain and inflammation management for those with ulcers or other digestion problems. Recent advances in targeting specific prostaglandin-synthesizing enzymes have lead to the development of Celebrex, which is marketed as an arthritis therapy. Celebrex is a potent and specific inhibitor of COX-2. Celebrex is considered specific because it doesn’t inhibit COX-1, which is involved in synthesizing prostaglandins that aid in digestion. This is a remarkable accomplishment given the great similarity between COX-1 and COX-2. This achievement has paved the way for developing new therapies that bind more specifically to their target and therefore have fewer side effects. Understanding the enzyme structures of COX-1 and COX-2 helped researchers develop a drug that would only bind and inhibit COX-2. PART A: Background on Cox-1 (PTGS1) and Cox-2 (PTGS2) step1. Go to the NCBI homepage at http://www.ncbi.nlm.nih.gov step2. Select “Gene” from the database pulldown menu. Type “PTGS” in the

search box, then click “Go.” step3. Scan the results for the “Homo sapiens” entries. There should be one

called “PTGS1” and one called “PTGS2.” step4. Select each entry by clicking on its name, then read the paragraph

under the “Summary” section for each entry, and provide answers to the questions below.

Q1. PTGS1 and PTGS2 are isozymes. Isozymes catalyze the same reaction, but are separate genes. What types of reactions do PTGS enzymes catalyze? What biochemical pathway(s) are these enzymes involved?

Q2. How is the expression of PTGS1 and PTGS2 different? Q3. Which isozyme would you want to inhibit to stop inflammation? Q4. The drug Celebrex selectively inhibits PTGS2 while aspirin and other non-steroidal anti-

inflammatory drugs (NSAID’s) inhibit both PTGS1 and PTGS2 in the same way. Why do you think researchers wanted to discover a selective inhibitor to PTGS2?

Q5. Describe how studying 3-D structures of PTGS1 and PTGS2 could help researchers

design a drug that binds to PTGS1, but not to PTGS2.

(Note: Q4 & Q5 may require an internet search, but be very brief in your answer). PART B: Getting Sequence Information and Viewing Database Entries NCBI – Gene step1. Go back to the “Gene” entry for Homo sapiens PTGS2. step2. What is the gene name? ______________________ step3. What is the GeneID number? _______________________ step4. Where in the human genome is this gene located? ________________________ step5. What is the RefSeq accession number for the mRNA sequence of Homo sapiens

prostaglandin-endoperoxide synthase 2?__________________. step6. Open the entry, then choose “FASTA” from the pull-down menu. Copy the sequence

(including the title line designated by the “>” symbol) and paste it into a word document.

step7. Select the “Replace” tool under the EDIT menu. In the “find” box, type “^p” to find all

paragraph marks. Don’t type anything into the “replace” box. Then click “Replace All.” This will eliminate all the paragraph marks in the document. If you still see white spaces in the sequence, use the same procedure, but type “^w” in the “find” box to represent white spaces.

step8. You now should add back a paragraph mark after the title line (that starts with “>”) and

before the sequence starts. Save the file as Cox2rna.doc in your desktop folder (create one).

step 9. What is the RefSeq accession number for the Homo sapiens PTGS2 protein sequence?_________________. Open the entry. Follow the steps given above to save the sequence in FASTA format as a Word document called PTGS2prot.doc file on your desktop.

step10. Start a new browser, and go to the Expasy website (Swiss-Prot Entry) and search for the

Swiss-Prot entry for PTGS2. (Hint: use the gene name to search and be sure to select the HUMAN protein from the search results). step11. Write at least three alternate names for this protein. ____________________________ step12. Where in the cell is this protein located? _________________________________ step13. What types of drugs target this protein? _______________________________ step14. What amino acid is acetylated by aspirin (amino acid type and number)? _____________ step15. What His residue is in the active site? ____________________ Sequence Manipulation step16. Go to the Sequence Manipulation Suite (http://bioinformatics.org/sms/). step17. Click on “Translate” under “DNA Analysis” heading from the menu. step18. Clear the data entry box by hitting “Clear”. step19. Copy the mRNA sequence from your Word file and Paste it into the data entry box on

the Sequence Manipulation website. step20. Select “Reading Frame 3” and “direct” from the pull-down menus, then click “Submit”. step21. When the Output window opens with your results, copy and paste the sequence into a

Word document and save it as, “translate.doc” on your desktop folder. step22. Compare this sequence in the “translate.doc” file with the sequence in the

“Cox2rna.doc”. What are the first residues that are the same in the sequences? Do the sequences look like they are the same? ______________________

step23. Open a doc. file on your desktop, “SIXclustalW.doc”. (or alternatively cox2sixseq.pdf).

Using the “Text select” tool, select all the text and copy or go under the “Edit” menu and “Select All.” This file contains six FASTA formatted sequences of PTGS2 from different organisms. The top sequence is the human PTGS2 protein sequence you have been working with.

step24. Go to the ClustalW website and enter (by using “copy” and “paste”) all of the FASTA

formatted sequences into the data entry box. Select “input” for the Output order so the human sequences will stay at the top in the alignment. Press “Run.”

step25. Copy the alignment and paste it into a Word document. To make this file readable, do the

following things: a) Go to “Page Set-up” under “File” and change the page orientation to landscape. b) Select all text and change to “Courier” font, size 10. Courier is the best font for alignments because all the letters are the same width. c) Save this file to the desktop as “ClustalW.doc”

step26. Review the alignment.

What symbols are used for positions in the alignment that contain: identical, _________________ highly homologous, ________________ homologous, ________________ and non-homologous residues ________________ Are the residue numbers mentioned in steps 13 and 14 conserved? Why would you expect them to be conserved? _________________________________

BIOINFORMATICS LAB 2: Working with Primary Protein Structure Information In this lab session a protein from the SwissProt database will be used. One can search for the protein of interest by using the gene name given in Gene. A protein entry in the SwissProt contains some of the same information that is found in Gene, as well as information about the protein sequence, structure, and function that is summarized in a short, easy-to-read format. GUIDE SHEET 1 (below) will provide instructions and guidelines for the internet-based mini project you will be doing today. The ultimate goal for today’s lab is to create a multiple sequence alignment for the protein of interest using ClustalW. This alignment ‘software’ helps to identify the protein mutation, to observe regions of high sequence conservation, and to map secondary structure predictions. The protein mutation is important to identify since it is the basis for your mini project exploration in the bioinformatics component of this lab course and is important for understanding the link between the protein and the disease that will be studied. The regions of high sequence conservation are important because they often correspond to regions in the protein that are important to the protein’s function. GUIDE SHEET 1 PART 1 – OBTAINING THE BASICS: Getting sequence information and viewing the SwissProt and GenBank entries for the protein of interest. Instructions and Guidelines: Follow this guide sheet and answer the questions that accompany this guide as you work through each section. Be sure to refer to hints whenever there are given. Translating the patient’s cDNA step1. Obtain the mutant cDNA sequence, “PAHmutseq.pdf” from the course website through

LMS (if the sequence is not available on the desktop). Open the file and copy the sequence.

step2. Go to the Sequence Manipulation Site (http://bioinformatics.org/sms/). step3. In the menu to the left, Click on “Translate” found under the heading “DNA analysis”. step4. Clear the search box, then paste your patient’s cDNA sequence into the search box.

Choose a reading frame from the pull-down menu. Click “Submit.” step5. You should be able to find the sequence of your protein by finding the first methionine

(M), then continuing until you see the first “*” which is a stop codon. Copy the protein

sequence in that region, starting with the first “M” and paste it into a word document. Save the document in your folder on the desktop. Now you have saved the file of the mutant protein sequence.

NCBI – Gene step6. Using Gene on the NCBI website, find the entry for the protein you are studying by

searching with the protein name. Be sure to select the Homo sapiens protein from the list of results.

step7. Open the entry for the RefSeq protein sequence and save the sequence in FASTA format

to your desktop folder. Name this file “WTprotein.doc” so that you will know it is the un-mutated protein sequence.

Swiss-Prot Entry step8. Go to the ExPASy website and search for the SwissProt entry for your protein using

either the protein name or the gene name. Be sure to select the human protein from the list of results. Make sure the information in the entry is the same as you saw in the Gene entry. If your protein is an enzyme, the EC number is a good way to double-check. You may want to record the SwissProt entry number in case you want to find this entry again.

PART 2: BLAST SEQUENCE: FINDING HOMOLOGOUS PROTEINS Protein-protein BLAST step9. Perform a BLAST search using the RefSeq un-mutated protein sequence. First, select

PSI-PHI BLAST. Then paste the FASTA formatted protein sequence in the search box. Select the nrprotein database. Click “BLAST” to begin.

On the next page that appears, select “Format.” You may need to wait a few minutes before the results page opens. After obtaining the results, choose 5 sequences from various positions in the results. The goal is to choose a variety of sequences that differ in evolutionary distance from the human protein. Be sure not to choose any sequences that are human, since they are the same as your search sequence. For each of the 5 sequences, click on the sequence name to view the GenBank entry for the sequence. Then view the sequence in FASTA format. Copy and paste all the FASTA formatted sequences into the same Word file and save it to your desktop folder. At the beginning of this file, add your mutant protein sequence, also in FASTA format.

step10. This Word file will be used to create the multiple sequence alignment, so the formatting

is very important. The format for this file should be like the example used in the tutorial for COX-2. Review the entry for FASTA format in the Glossary. You should end up with a Word file that contains the 5 sequences from the BLAST search plus the un-mutated human protein sequence and your mutant sequence for a total of 7 sequences. Each sequence should be in FASTA format and contain a title line (starting

with >, then text, then a return). Shorten the text to contain JUST the species information so it will fit in the alignment (next step). For example, you should erase the “gi” line and add in something simpler like “rat,” “cat,” etc. Your mutant sequence should read “>mutant”. At the end of each title, be sure to press return to separate it from the rest of the sequence.

PART 3 – MULTIPLE SEQUENCE ALIGNMENT step11. Go to the ClustalW website and enter (by using “copy” and “paste”) all your FASTA

formatted sequence into the data entry box. The default parameters should be OK, except for the output order. a) Select “input” for the Output order b) Press “run”

step12. Save the alignment by copying and pasting the alignment into a Word document. At first

the alignment will look broken up. Follow these steps to make it readable again. a) Select all b) Change the font to size 10 and Courier c) Change the page set-up to landscape d) Save the file to your desktop

step13. Scroll through the alignment and make sure none of the blocks of sequences are

separated by a page break. Save and print the alignment. (We will be working with this alignment next week and it will be part of your final report.)

step14. On your alignment, compare the wild-type protein sequence with your mutant sequence

and mark any differences that you observe. PART 4 - SECONDARY STRUCTURE PREDICTIONS – USING PSIPRED step15. To save time, the PSIPRED and MEMSTAT predictions for your protein are available

for you in the same format as we received them. Obtain these results from the course website through the LMS (if the files are not on the desktop). Refer to the Glossary if you want to know how to submit requests for PSIPRED and MEMSTAT predictions.

step16. On your Multiple Sequence Alignment, draw the regions of secondary structure and

transmembrane predictions on top the corresponding sequence. Use the following symbols to represent predictions:

h = helix b = beta sheet

Mark nothing for coil predictions. Use the symbol above and an arrow to show how long each region of secondary structure spans on the alignment. See an instructor for examples if these directions are unclear.

At the end of this lab session, you should save the multiple sequence alignment for your Final Report. The multiple sequence alignment should have the secondary structure predictions and the mutation mapped.

BEFORE THE NEXT LAB SESSION: Solve the structure problem set below. It’s important that you are familiar with such structures before the next lab session. Structure Problem Set Directions – Draw the chemical structures for the following amino acids. They are represented in cpk color mode (see Glossary document for more information). 1.

2.

3.

4. Draw the chemical representation of the following tripeptide.

5. Draw the chemical representation and represent H-bonds as dotted lines between the

atoms where distances have been measured. You will need to add hydrogens that don’t appear in the picture below.

6. What distance must two atoms be in order to be involved in hydrogen bonds and ionic

bonds (refer to a biochemistry textbook, if needed)?

BIOINFORMATICS LABORATORY 3: Investigation of Protein Crystal Structure INTRODUCTION The goal of this lab is to analyze the crystal structure of protein in order find the functionally important areas of protein and predict the effect of the mutation on protein function. You may determine that the mutation makes the protein more active, less active, or that the mutation is likely to have no effect of protein function. A protein structure-viewing program called DeepView/Swiss-Pdb Viewer will be used. This program has some similarities to other programs like Chime and Protein Explorer. DeepView can easily model mutations and is easy to learn to use in one day. It is not a web-based program. (At the beginning of this lab, a brief tutorial on how to use DeepView will be given using the COX-2 protein. The protein structure is identified as 1CX2.pdb) After the brief tutorial on how to use DeepView, the first step in today’s lab session is to obtain the data file of the crystal structure for the protein that you will be investigating. In some cases, the crystal structure has not been solved for your exact protein, so you will analyze the structure and model the mutation in a homologous protein. In this case, the amino acid numbering will be slightly different. The data files for crystal structures are called pdb files and are all stored in the Protein Data Bank. You can search the Protein Data Bank website for your protein and download the pdb file to your desktop folder. In analyzing the crystal structure, the focus is on identifying the noncovalent interactions (van der Waals, H-bonds, and ionic bonds) of the amino acid side chain before and after it is mutated. How do the interactions change or how are they maintained when the amino acid is mutated? Be sure to review both the distances and nature of these non-covalent interactions before coming to this lab. Crystal structures are the results of experiments, and so it is important to consider experimental error when analyzing a crystal structure. One way experimental error is reported in a crystal structure is by giving the resolution. Resolution of a crystal structure is the accuracy of the prediction of each atom location. For example, if the resolution is 2 angstroms, you can be confident that the atom is located within a 2 angstrom radius of where it is shown in the pdb file. This is important to consider when measuring distances, since each distance measured in a crystal structure is actually ± the resolution of the crystal structure. PHENYLALANINE HYDROXYLASE (PAH): A brief note Phenylketonuria (PKU) is a metabolic disease that results from a deficiency in phenylalanine hydroxylase (PAH). This metabolic disease was one of the first to be well-characterized, and today is tested for newborns. In this lab session you will use web-based biotechnology tools to research how a genetic mutation in phenylalanine hydroxylase leads to the symptoms of PKU.

In today’s lab session, you will be analyzing a cDNA sequence of a patient “PAHmutseq” (in FASTA format) who may be a carrier for deficient PAH. For your mini-project/exploration, you will analyze the mutant PAH protein using the bioinformatics tools presented in lab. You will investigate the structure of the PAH protein, model the mutation, and find out what is known, if anything, about the biological impact of the mutation. Through your studies, you will form a hypothesis about what the structural and biological effects are of this mutation, and organize the results of your research into a report. PRE-LAB ASSIGNMENT: Complete the questions accompanying the reading assignment. Retrieve the necessary information/questions from “Reading Assignment and Questions for Reading Assignment” word document (course website through LMS or deskstop)

GUIDE SHEET 2 Obtaining 3D Structure Information: Searching for Structure Files (pdb files): The crystal structure of phenylalanine hydroxlase (PAH) bound to its cofactor and a substrate analog has been solved. This is the structure you read about in the reading assignment due for this lab. To obtain the crystal structure data file (pdb file), follow these steps: step1. Go to the Protein Data Bank website www.rcsb.pdb.org (see Glossary) which contains all

of the macromolecule 3-D structure files (pdb files). Pdb files are named in 4 characters (numbers and letters).

a) Search for 1KW0.pdb file. The summary information page for 1KW0 contains a title

for the entry, the compound crystallized, and the species of the source of the protein. Use this entry to answer questions 1 – 3 accompanying Guide Sheet 2.

b) Click on “Download/Display” file at the left of the screen.

c) On this page, choose to download the structure file in PDB format with no

compression. It will be the “none” option in the second table. The “1KW0.pdb” file should now be on your desktop.

Viewing the structure file: NOTE: The biopterin cofactor and a substrate analog are bound in the active site along with an iron (Fe) atom. In the crystal structure file, the tetrahydrobiopterin is abbreviated “BH4426”, the substrate analog is labeled “TIH427”, and the Fe2+ is labeled, “Fe2425” in the control panel. step2. Swiss-Pdb Viewer/DeepView has been loaded on your desktop. To open 1KW0.pdb in

this program, drag the file to the Swiss-Pdb Viewer/DeepView icon and drop it on the icon. In some cases, double-clicking on the file will also open the pdb file in DeepView.

step3. A black screen should appear with the protein shown in wire form. This is a difficult form

to view the protein, so we are going to change it to the ribbon form mode. To do this, follow these steps:

a) First make sure the control panel is open. If you don’t see it, select “control panel”

under “Wind”

b) Click on the control panel window. You can see that all the amino acid residues in the protein are listed in the first column by 3-letter code and residue number. The next

columns allow you to change what is displayed. In order to clean up the display of the enzyme, follow these steps:

c) Erase all the check marks in the “show” column and the “side” (meaning side chain)

column by clicking on them. For now, we are only going to view the protein backbone in a ribbon diagram.

d) Put check marks in the “ribbon” column for all the residues including the substrate

analog, the iron, and the tetrahydrobiopterin.

e) Locate the substrate analog, the iron, and the tetrahydrobiopterin in the control panel and put checks in the “show” column to these molecules to the display.

f) Go to the main window and click on the “Display” menu and select “Render in Solid

3-D”. You should now be viewing a ribbon diagram of your protein.

g) You can change the ribbon colors to any color you think looks best by selecting “ribbon” under “Prefs”. In this window, make sure the “render as solid ribbon” option (near the top) is selected. You can select different colors for the top, side, and bottom of the ribbons.

This allows you to choose a darker version of the same color for the bottom of the ribbon to enhance the 3-D viewing. Take a minute to play around with this option and to color your protein the way you want. You can also change the background to any color by choosing “Colors” under “Prefs”, then “background”.

h) Click in the display window to make sure that window is the active one. The tool bar

for this window is located at the top and is described in your lab manual. Select the “rotate” tool. To rotate the protein, click and hold on the picture while moving the mouse. The other two buttons are “zoom” and “transverse” for zooming in on the protein and for moving the protein from side to side across the screen.

i) Once you have a view that you like of your protein, save it by going to “File” then

“Save”. Then select “Layer”. Name your file hprt1.pdb and save to desktop. When you open this file, all your colors and the orientation should by saved, but you will have to select “Render in Solid 3D” again under “Display” to see it.

Answer questions 4 – 5 accompanying Guide Sheet 2. Printing the 3D Figure of Your Protein step4. To save the pdb file as a photo file, we will use the program Grab. You can open Grab by

clicking on the scissors icon in the toolbar of your desktop. step5. Make sure the figure is visible exactly the way you want it in Swiss-Pdb Viewer. Then, in

Grab, go to “Capture” then “Selection”. You can now draw a box around the part of the view in Swiss-Pdb Viewer that you want to save. Save the file as “pah.tiff”. Save it to your desktop. You can use this same method to create a protein structure figure for a PowerPoint presentation.

step6. Open the “.tiff” file in Preview. Choose Page Setup under “File” and change the scale to 70% to make sure the figure prints on one page. Print a copy of your “.tiff” file.

Viewing an amino acid side chain step7. Locate the Glu280 (the residue that is mutated in your patient’s protein) in the structure.

Show the side chain by clicking on the “show” and “side” columns in the control panel for that amino acid. The side chain will appear in the CPK coloring mode which is based on atom type:

red = oxygen blue = nitrogen orange = phosphorous yellow = sulfur and phosphorous gray = carbon light blue = hydrogen

At this point, you can erase the check marks in the “show” column for all the other residues in order to focus on the Glu280, the cofactor (BH4426), the substrate (TIH427), and the Fe2+ (Fe2425).

step8. Zoom in on the Glu280. Rotate the structure until you can get a good view of the

glutamate side chain. step9. Confirm the identity the Glu side chain by using the identity tool on the toolbar and

clicking on the side chain in the display window. Investigating the environment of the side chain

In order to investigate the non-covalent interactions of the side chain, use the distance tool on the toolbar to find atoms that are close enough to atoms in the side chain to be involved in H-bonds, ionic bonds, or van der Waal’s interactions. Keep in mind the atom type when determining what type of interactions may be occurring.

step10. Use the radius tool to identify all the residues within 4 angstroms of the glutamate side

chain. To do this, select the radius tool. The command line will then instruct you to click on an atom of the glutamate side chain. When you do this, a window should open where you can enter in the radius. Enter 4 angstroms. All the side chains within 4 angstroms of the glutamate should now appear.

step11. Use the distance tool to measure the distances between the glutamate side chain and the

closest residue. You may need to measure several distances until you find the one that is closest. Measure the distances between the atoms that appear closest to the glutamate side chain. You will click on the two atoms that you want to measure the distance between, then the distance, in angstroms, should appear. This step may take several attempts. If

you need to erase distances or labels, you can go under “Display” to “labels”, then select “erase user labels.” Keep in mind the resolution of the crystal structure provides the error in the distances that you are measuring. For example, if the distance is 5 angstroms and the resolution is 2 angstroms, the distance between the atoms is estimated to be 5 angstroms ± 2 angstroms.

step12. Once you have identified possible non-covalent interactions involving the Glu side chain,

print this view with the distances using the same method as in steps 4 - 6. Save this view as something “.tiff” You will need the distance measured here for question 6 of your guide sheet 2 questions.

Modelling the Mutation step13. To change the glutamate side chain to a different side chain, use the mutate tool on the

toolbar. Select the amino acid lysine (Lys) to mimic your patient’s mutation. The Lys side chain will appear in the lowest energy conformation (most stable). Some strange green lines may appear which represent potential H-bonds. You can make these disappear under “Display”, then deselect “Show H-bonds”.

step14. Use the “distance” tool to measure the distance between the Lys side chain and the

closest residue(s) as you did in step 11 with glutamate. You will click on the two atoms that you want to measure the distance between, then the distance, in angstroms, should appear. Save and print this view following steps 4 - 6 and naming the file, “Lys.tiff”. Answer questions 7-8 accompanying Guide Sheet 2.

To put in report: For this lab session, you will need the three figures printed in steps 6, 12, and 14. Make sure the residue numbers and distances are labeled. The distances and labels can be added by hand to the figure if they are difficult to see in the print-out.

BIOINFORMATICS LABORATORY 4: Genetic Mutation and Consequence in Metabolic Pathway OMIM search: step1. Go to the NCBI homepage and search the OMIM database for “Phenylketonuria”. Select

the search result that states this exact disease name.

Read the text under “Allelic Variants” and find the entry for the mutation you are studying. Answer questions 1 – 3 accompanying Guide Sheet 3.

step2. Go to the NCBI home page. Select “PubMed” from the pull-down menu and search for

PMID 12655545 (just enter the number in the search box). step3. Click on the author line to view the abstract for this article. Read this abstract and Answer

question 4 accompanying Guide Sheet 3. step4. Go to the website that is listed in the abstract. At this website, click on “Mutation Search”

from the menu on the left. Then click on the “Human Data Listing” link under “In Vitro Expression”. Scroll until you find E280K (this entry will be more than halfway down the page). Answer questions 5 and 6 accompanying Guide Sheet 3.

KEGG pathway step1. Go back to the Gene entry for human PAH. Scroll down to the “General gene

information” section and select the KEGG pathway link for “Phenylalanine, tyrosine and tryptophan biosynthesis.” You should see a nice graphic for the enzyme complexes in the pathway. The diagram also contains boxes with the EC numbers for enzymes in each of the reactions. Each box is a link to a data base entry similar to GenBank. Find the EC number for PAH (1.14.16.1). Answer question 7 accompanying Guide Sheet 3.

step2. Click on “Tyrosine Metabolism.” This will take you to a new pathway figure. In this

figure, locate “dopamine” and click on click on the circle next to dopamine. Answer questions 8 – 9 accompanying Guide Sheet 3.

guide sheet for tics lab 1 - 4

Documents