bio in for matics
7/16/2019 Bio in for Matics
Updated: February 2011
The Gene Gateway Workbook
A collection of activities introducing new users to the web resources that scientists access to learn about genetic disorders, genes, and proteins.
Using hereditary hemochromatosis as a model, access a variety of websites and databases to:
- Learn about a genetic disorder and its associated gene.
- Identify mutations that cause the disorder.
- Find the gene on a chromosome map.
- Examine the gene's sequence and structure.
- Access the amino acid sequence of a gene's protein product.
- Explore the 3-D structure of the gene's protein product.
To view the chromosomes of the Human Genome
Landmarks poster online, order your free copy of
the poster, or download additional copies of this
workbook, go to the Gene Gateway website:
genomics.energy.gov/genegateway/
U.S. Department of Energy Office of Biological and Environmental Research
The Gene Gateway Workbook genomics.energy.gov/genegateway/ Updated: Feb 2011
Acknowledgements
This workbook was last updated February 2011. With much appreciation, we would like to acknowledge Carrie Morjan, Aurora University, who facilitated the creation of this latest version by sharing her updates to The Gene Gateway Workbook for use in her genetics classes.

This workbook was first produced by the Biological and Environmental Research Information System at Oak Ridge National Laboratory, Oak Ridge, Tennessee, July 2003, with support from the U.S. Department of Energy Office of Biological and Environmental Research.

Office of Biological and Environmental Research
Office of Science
U.S. Department of Energy (DOE)
For More Information
This workbook is freely downloadable from the Gene Gateway website (see link below). For questions or comments concerning this document, contact Jennifer Bownas by email at [email protected]

Gene Gateway
genomics.energy.gov/genegateway/
Human Genome Project Information
www.ornl.gov/hgmis/home.shtml
DOE Genomic Science Program
genomicscience.energy.gov
DOE Office of Biological and Environmental Research
science.energy.gov/ber/
Table of Contents
Introduction
  Why Use Hereditary Hemochromatosis as a Model?
  Some Basic Concepts to Understand Before Starting
Activity 1
  Online Mendelian Inheritance in Man (OMIM)
  GeneTests
Activity 2
  NCBI Map Viewer
Activity 3
  NCBI Entrez Gene and GenBank
Activity 4
  UniProt Protein Knowledgebase and BLAST Searching
Activity 5
  Protein Data Bank
Table of Standard Genetic Code for Translating DNA Sequence Records
Hereditary Hemochromatosis Worksheet
Introduction
The Gene Gateway Workbook is a collection of activities with screenshots and step-by-step instructions designed to introduce new users to genetic-disorder and bioinformatics resources freely available on the Web. It should take about 3 hours to complete all five activities.

The workbook activities were derived from more detailed guides and tutorials available at the Gene Gateway website (genomics.energy.gov/genegateway/). This website was created as a resource for learning more about the genes, traits, and disorders listed on the Human Genome Landmarks (HGL) poster, but it can be used to investigate any gene or genetic disorder of interest.

Many guides to using bioinformatics resources are designed for bioscience researchers and are too technical for nonexperts. This workbook and other Gene Gateway resources target a more general audience: teachers, high school and college students, patients with disorders and their families, and anyone else who wants to learn more about how life works at a molecular level.

This workbook shows you how to get started using bioinformatics resources that often intimidate and overwhelm new users. It also demonstrates how information from one resource, such as annotated protein sequence data from the UniProt Protein Knowledgebase, can be used to reinforce and clarify information available from another resource, such as three-dimensional (3-D) structures from Protein Data Bank (PDB). Gene Gateway provides users with a systematic approach to using multiple bioinformatics databases to gain a better understanding of how genes and proteins can contribute to the development of a particular genetic condition.

Using the genetic disorder hereditary hemochromatosis as a model, this workbook shows you how to access:
- Online Mendelian Inheritance in Man (OMIM) and GeneReviews to learn about a genetic disorder, its associated gene or genes, and common disease-causing mutations.
- NCBI Map Viewer to find a gene locus on a chromosome map.
- NCBI Entrez Gene and GenBank to examine the sequence and structure of a gene.
- UniProt Protein Knowledgebase to find the annotated amino acid sequence of a gene's protein product.
- Protein Data Bank to view and modify the 3-D structure of the gene's protein product.

Skills gained by working through the activities in this workbook can be applied to learning about other genetic disorders, genes, and proteins.

This workbook and other genome science resources are available from the website for the genome programs of the Office of Biological and Environmental Research, U.S. Department of Energy Office of Science (genomics.energy.gov/).
Why Use Hereditary Hemochromatosis as a Model?
- Hereditary hemochromatosis, a disorder in which too much iron accumulates in certain tissues and organs, is caused by changes in the DNA sequence of a single gene, so the genetic basis of this condition is easier to understand than more complex disorders caused by alterations in multiple genes.
- The gene and its protein product are relatively well studied. Three-dimensional structures of the protein product are available in PDB, the international repository for macromolecular structure data.
- Hereditary hemochromatosis is the most common autosomal recessive disorder affecting individuals of Northern European descent (about 1 in 200 Caucasians develop hereditary hemochromatosis).
- Effective methods for treatment are available with early diagnosis.
Some Basic Concepts to Understand Before Starting
- Genes are the basic physical and functional units of heredity. Each gene is located on a particular region of a chromosome and has a specific ordered sequence of nucleotides (the building blocks of DNA).
- Central dogma of molecular biology: DNA → RNA → Protein
  - Genetic information is stored in DNA.
  - Segments of DNA that encode proteins or other functional products are called genes.
  - Gene sequences are transcribed into messenger RNA intermediates (mRNA).
  - mRNA intermediates are translated into proteins that perform most life functions.
- Eukaryotic genes have introns and exons. Exons contain nucleotides that are translated into amino acids of proteins. Exons are separated from each other by intervening segments of DNA called introns. Introns do not code for protein, and they are removed when eukaryotic mRNA is processed. Exons are spliced back together to form the intron-free mRNA strand that is used as a template to make proteins.
- Special cellular components (ribosomes) use the triplet genetic code to translate the nucleotides of an mRNA sequence into the amino acid sequence of a protein. A Table of Standard Genetic Code is provided on page 50 of this workbook.
- There are 20 different amino acids. Proteins are created by linking amino acids together in a linear fashion to form polypeptide chains. See the Table of Standard Genetic Code on page 50 for single-letter and three-letter abbreviations for the 20 different amino acids.
- Polypeptide chains fold into 3-D structures that can associate with other molecular structures to perform specific functions.
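
The concepts above (splicing exons together, transcribing DNA into mRNA, and translating codons into amino acids) can be sketched in a few lines of Python. This is a minimal illustration, not part of the workbook's activities: the function names and the 18-base toy gene are hypothetical, and the codon table is the standard genetic code written in the conventional T, C, A, G ordering.

```python
# Standard genetic code in TCAG order (first, second, third codon position).
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def splice(genomic_dna, exons):
    """Join exons given as 1-based, inclusive (start, end) coordinates."""
    return "".join(genomic_dna[start - 1:end] for start, end in exons)


def transcribe(coding_dna):
    """Write the coding strand as mRNA by replacing T with U."""
    return coding_dna.replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: two 6-base exons separated by a 6-base intron.
TOY_GENE = "ATGTGC" + "GTAAGT" + "TACTAA"
TOY_EXONS = [(1, 6), (13, 18)]

coding = splice(TOY_GENE, TOY_EXONS)   # intron removed
mrna = transcribe(coding)
print(coding, mrna, translate(mrna))
```

For the toy gene, the spliced sequence codes for methionine, cysteine, and tyrosine (single-letter codes M, C, Y) followed by a stop codon.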
Activity 1
Online Resources: OMIM and GeneTests
Learn about the genetic disorder and its associated gene.
Identify mutations that cause the disorder.
Online Mendelian Inheritance in Man (OMIM)
OMIM is a comprehensive database of human genes, genetic traits, and disorders created by researchers at Johns Hopkins University. The OMIM database, which is updated daily, is accessible through the National Center for Biotechnology Information (NCBI) suite of online resources. Each record in OMIM summarizes the body of research relevant to a particular gene, trait, or disorder.

To access OMIM, let's go to the NCBI homepage (www.ncbi.nlm.nih.gov) shown below, and then click on OMIM in the box on the upper right.

A screenshot of the OMIM homepage is shown on the following page. The easiest way to begin a search is to simply type a disorder name in the search box at the top of the OMIM page and submit your search. However, NCBI also supports a variety of features for narrowing a search and browsing disorders alphabetically (using OMIM Morbid Map) or by chromosomal location (using OMIM Gene Map).

To narrow a search, NCBI has options for typing search field qualifiers into the search box [see OMIM Help (www.ncbi.nlm.nih.gov/Omim/omimhelp.html) for more information] or selecting search fields using the Limits tab. This exercise will demonstrate searches using the Limits tab.
1. Select the Limits tab at the top of the OMIM page shown in the screenshot below.
URL for OMIM homepage: www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

Most genes, disorders, and traits listed on the Human Genome Landmarks (HGL) poster were taken from the title fields of OMIM records, so we can narrow our search to look only for those records that have hemochromatosis in the title field. By selecting hemochromatosis from the HGL poster, we also know that the gene for this disorder is found on chromosome 6.

2. From the Limits page, enter hemochromatosis into the search box and select the Title box and chromosome 6 as shown in the screenshot below. Click Go to submit your search.
NOTE: Searching for OMIM records associated with multi-gene disorders, such as breast cancer or diabetes, which are caused by alterations in genes on different chromosomes, may provide multiple OMIM records in the search results. Limiting your search to just one chromosome for a multi-gene disorder may only retrieve a subset of all the records associated with that disorder.

3. The search should return one result: MIM ID #235200. A screenshot of the full OMIM record for the hemochromatosis disorder is shown below.
4. Let's examine some of the features of this record:
- Each OMIM record is assigned a unique six-digit MIM ID number located at the top of each entry. For hemochromatosis, the MIM ID is 235200. As a unique identifier for a disorder, the MIM ID can be used to search other databases for information about a particular disorder.
- The number sign (#) prefix in front of the MIM ID means that this entry refers to the description of a phenotype, and the molecular basis for this phenotype is known. For more information about other MIM number prefixes, see OMIM Help (www.ncbi.nlm.nih.gov/Omim/omimhelp.html#MIMnumberPrefix).
- Below the MIM ID, you will find the disorder name and the official gene symbol (shown in the image on the next page). The official gene symbol, which is HFE for hemochromatosis, serves as a unique identifier for a gene. To be "official," a gene symbol must have been approved by the HUGO Gene Nomenclature Committee (www.genenames.org). The gene
symbol is especially useful when searching other databases (such as sequence, genome-mapping, and structure databases) for gene-specific information.

NOTE: For a disorder like hemochromatosis, which is primarily caused by mutations in a single gene, the official gene symbol may be included in the record title. For complex disorders like breast cancer, official symbols for associated genes will be described in the first paragraph of text.

- The Gene map locus describes where a gene can be found on a chromosome. For the gene locus 6p21.3, 6 is the chromosome number, p indicates the short arm of the chromosome, and 21.3 is a number assigned to a particular region of the chromosome. Clicking on a gene map locus opens the OMIM Gene Map, a table of genes organized by chromosomal location.
- The amount of text within an OMIM record varies according to what is known about a particular gene, disorder, or trait. Since hemochromatosis is well studied, a lot of information is known about this disorder and its gene. Some different types of information that may be included in an OMIM record are disorder description, inheritance, molecular genetics, genotype and phenotype correlations, diagnosis, population genetics, and animal models.
- Each record includes a Table of Contents box on the right with quick links to different sections within the record.
5. To learn more about the molecular basis of hemochromatosis, select the Molecular Genetics link in the Table of Contents box (see screenshot on previous page). The Molecular Genetics section of the OMIM record for hemochromatosis is shown below.
- One study showed that about 83% of hemochromatosis cases are related to the C282Y mutation. The C282Y notation means that a mutation occurs in the DNA sequence that changes the amino acid at position 282 in the protein product from a cysteine (C) to a tyrosine (Y).
6. Click on the first link for the C282Y mutation, 613609.0001. This link will take you to the OMIM record for the HFE gene (MIM ID *613609; the asterisk prefix indicates the record represents a gene of known sequence). OMIM often maintains separate records for
phenotypes (such as the disorder hemochromatosis) and the genes associated with those phenotypes.
7. The Allelic Variants section of the OMIM record for the HFE gene is shown in the screenshot below. This section typically describes some of the most notable gene mutations (also called allelic variants) that produce disease phenotypes. Note that the C282Y mutation is also known as the CYS282TYR mutation, and it is the first of several mutations that have been identified for the HFE gene. To see a listing of the different mutations for the HFE gene, click on the See allelic variants in tabular display link.
8. Now you are ready to answer Questions 1-2 for Activity 1 in the worksheet on page 51.
9. Scroll to the top of this OMIM record, and click on the Limits tab. Let's use options on the Limits page to determine how many genes in the human genome have been described in OMIM.
- Uncheck the boxes for Title and chromosome 6.
- Check the boxes beside the MIM Number Prefix options for "* gene with known sequence" and "+ gene with known sequence and phenotype" as shown in the screenshot on the next page.
- Then click the Go button beside the search box at the top of the page.
10. You should retrieve over 13,500 search results. Of the estimated 20,000 to 25,000 genes in the human genome, about 13,500 genes have records in OMIM. You may want to test your new search skills by using OMIM to search for other genes or genetic conditions. In addition to OMIM, another good resource for learning about genetic disorders and associated genes is the GeneTests website, which is described in the next part of this activity.
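
The two notations met in the OMIM steps above, prefixed MIM IDs such as #235200 and amino acid substitutions such as C282Y, are regular enough to take apart programmatically. The sketch below is illustrative only: the function names are hypothetical, and the prefix table covers just the three prefixes mentioned in this activity (OMIM Help lists others).

```python
import re

# Only the MIM number prefixes described in this activity.
MIM_PREFIXES = {
    "#": "phenotype description, molecular basis known",
    "*": "gene of known sequence",
    "+": "gene with known sequence and phenotype",
}

# Single-letter to three-letter amino acid abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def parse_mim_id(text):
    """Split an identifier such as '#235200' into (number, prefix meaning)."""
    match = re.fullmatch(r"([#*+]?)(\d{6})", text.strip())
    if match is None:
        raise ValueError(f"not a six-digit MIM ID: {text!r}")
    prefix, number = match.groups()
    return number, MIM_PREFIXES.get(prefix, "no prefix given")


def parse_substitution(text):
    """Split a substitution such as 'C282Y' into (original, position, new)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", text.strip())
    if match is None:
        raise ValueError(f"not a single-letter substitution: {text!r}")
    original, position, new = match.groups()
    return ONE_TO_THREE[original], int(position), ONE_TO_THREE[new]


print(parse_mim_id("#235200"))
print(parse_mim_id("*613609"))
print(parse_substitution("C282Y"))
```

Reading C282Y this way gives cysteine (Cys) at position 282 replaced by tyrosine (Tyr), which matches the alternative name CYS282TYR used in the Allelic Variants section.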
GeneTests
The GeneTests website is a medical genetics information resource developed by researchers and healthcare professionals and funded by the National Institutes of Health. In addition to providing up-to-date, authoritative reports (GeneReviews) on genetic disorders, the site also includes educational materials (e.g., fact sheets on genetic testing and counseling, PowerPoint slides, and an illustrated glossary) and online directories of genetic laboratories and clinics.

This activity focuses on accessing and using genetic disorder information available from GeneReviews. All entries are written and reviewed by physicians, so the language is similar to that of medical text. While the amount and kind of content can vary greatly from record to record in OMIM, all reports in GeneReviews will provide similar kinds of information and share the same organizational structure.

Let's go to the GeneTests website (www.genetests.org) to find a GeneReview for hereditary hemochromatosis. The screenshot of the GeneTests homepage is shown on the next page.
1. Click on [button image] in the navigation bar at the top.
2. At the GeneReviews search page (shown below), use the Gene Symbol search option, select "exactly matches" from the drop-down menu, and enter HFE into the search box. Click Go to submit your search.
3. Beside the search result HFE-Associated Hereditary Hemochromatosis, select the [button image] link to access the hereditary hemochromatosis review shown below.
4. On the right side of the screen is a navigation column with links to different sections of the HFE-Associated Hereditary Hemochromatosis GeneReview.
5. Access the Summary section to learn about disease characteristics and treatment for hemochromatosis. This section can help answer Question 3 for Activity 1 in the worksheet on page 51.
6. Access the Molecular Genetics section for a brief overview of this disorder's molecular basis. Within this section you can find information about:
- official symbol for the gene associated with this disorder.
- chromosomal locus of the gene.
- gene size and the number of exons in the gene.
- name of the gene's protein product.
- description of the protein's function.
- mutations in nucleotide and amino acid sequences that cause abnormal protein products and disease phenotypes.
- links to scientific literature and other databases for more information.
Activity 2
Online Resource: NCBI Map Viewer
Find the hereditary hemochromatosis gene on a chromosome map.
NCBI Map Viewer
NCBI Map Viewer is a Web-based tool for viewing and searching an organism's complete genome. Users also can view maps of individual chromosomes and zoom into specific regions within chromosomes to explore the genome at the sequence level.

Map Viewer provides access to several different types of maps for different organisms. Many of these maps are meaningful only to scientific researchers. A discussion of all the different types of maps and genomic data is beyond the scope of this activity, which will focus only on how to locate a specific gene locus on a chromosome map.

1. Go to the NCBI Map Viewer website (www.ncbi.nlm.nih.gov/mapview/). In the list of Primates, click on the Build 37.2 link for Homo sapiens (human).
2. The Map Viewer page for the entire human genome is shown in the screenshot below.
Homo sapiens genome view: www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606
3. In Activity 1, we learned that the official symbol for the hereditary hemochromatosis gene is HFE, and its locus is 6p21.3. Let's find the HFE gene on chromosome 6.

What is a locus?
The locus for a particular gene describes the region of a chromosome where that gene can be found. For the 6p21.3 locus: 6 is the chromosome number, p indicates the short arm of the chromosome, and 21.3 is the number assigned to a particular band or region on a chromosome. When chromosomes are stained in the lab, light and dark bands appear, and each band is numbered. The higher the number, the farther away the band is from the centromere. A locus containing q is found on the long arm of a chromosome.
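
The locus notation described in the box above has three parts (chromosome, arm, band), so it can be split with a short pattern. This is a minimal sketch with a hypothetical function name; it assumes simple loci of the form shown here (chromosome 1 to 22, X, or Y, then p or q, then a band number) and does not handle ranges or other cytogenetic shorthand.

```python
import re

# chromosome (1-22, X, or Y), arm (p or q), band number with optional sub-band
LOCUS_PATTERN = re.compile(r"([1-9]|1\d|2[0-2]|X|Y)([pq])(\d+(?:\.\d+)?)")


def parse_locus(locus):
    """Split a locus such as '6p21.3' into chromosome, arm, and band."""
    match = LOCUS_PATTERN.fullmatch(locus.strip())
    if match is None:
        raise ValueError(f"unrecognized locus: {locus!r}")
    chromosome, arm, band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short" if arm == "p" else "long",
        "band": band,
    }


print(parse_locus("6p21.3"))
```

For the HFE locus this yields chromosome 6, short arm, band 21.3.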
4. In the search box at the top of the Map Viewer page, enter HFE[sym] and then click the Find button to submit your search. Adding the [sym] search field qualifier to the end of your search term specifies your query so that only those results containing the HFE gene are retrieved.
5. Red tick marks should be displayed on chromosome 6, indicating the approximate location of the HFE gene in the middle of the short arm of chromosome 6 (see screenshot below). The red number (61) labeling chromosome 6 indicates the number of objects mapped to different assemblies of the human genome that include the HFE gene.
6. Click on the number 6 link below the chromosome. This will open a view of chromosome 6 that should look like the screenshot below. In the next step we will modify this view so we can see an ideogram showing the region of chromosome 6 where the HFE gene can be found.
7. To modify the display options, click on the Maps & Options button in the upper right corner. This will open a window for customizing map options. Make the following adjustments.
- Remove all maps listed under Maps Displayed (left to right) except the Gene map. To remove a map, select it with your mouse and then click the REMOVE button.
- Under Available Maps select ideogr (you will need to scroll through more than half of the available maps) and then click the ADD button. The ideogram map is a graphic showing the banding pattern of a chromosome.
- The Maps Displayed list should look like the screenshot below. The Gene map should be designated as your master map. To make a map the master, select it with your mouse and then click the Make Master/Move to Bottom button. In the chromosome view, a master map is shown at the right side of the screen along with its details and descriptive text. The Gene map includes links for learning more about the genes mapped to a particular region of genomic sequence on a chromosome.
- Under More Options near the bottom of the window, change Page Length from 30 to 10. The Page Length option is highlighted in the screenshot below. This will adjust the height of the displayed map.
- Before you click the OK button to submit your changes, the options window should resemble the screenshot below.
8. The new map of chromosome 6 should resemble the screenshot on the next page.
9. Check out some of Map Viewer's features displayed in the screenshot above.
- The portion of chromosome 6 displayed in Map Viewer is highlighted on the ideogram in the blue navigation column on the left. Notice that the red mark indicating the position of the HFE gene lines up with the ideogram at the 6p22 chromosome band, not 6p21.3.
- Rounded to the nearest thousand, the region of sequence displayed begins at about the 26,086,000th nucleotide and ends at about the 26,100,000th nucleotide of the DNA sequence of chromosome 6. The total DNA sequence for chromosome 6 is about 171 million base pairs long, but this view only shows about 14,000 base pairs.
- Clicking on the Ideogram or Genes_seq maps (not the labels) will open a pop-up window with options for zooming in or out on the displayed maps. Map Viewer has zoomed in so much to show the HFE gene that there isn't much of the ideogram map displayed. You can also zoom in and out using the zoom option in the blue navigation column.
- The Genes_seq map provides links to gene-specific entries in other NCBI databases.
  o HFE: Links to the HFE entry in the Entrez Gene database, a compendium of genes and mapped phenotypes.
  o OMIM: Links to the hemochromatosis entry in the Online Mendelian Inheritance in Man (OMIM) database covered in Activity 1.
  o HGNC: Links to the gene symbol report maintained by the HUGO Gene Nomenclature Committee.
  o sv: Links to Sequence Viewer, a graphical interface for investigating the gene's sequence as well as genomic sequence upstream and downstream of the gene.
  o pr: Links to sequence records for the gene's protein product maintained in NCBI's Protein database.
  o dl: Links to a page for downloading the range of sequence data displayed in Map Viewer.
  o ev: Links to Evidence Viewer, a tool for finding biological evidence that supports a particular gene model and for exploring the different types of expressed sequences that align to a particular area within a genome.
  o mm: Links to Model Maker, a tool for building your own version of a gene model by adding or removing exons.
  o hm: Links to Homologene, a resource for comparing genes in homologous segments of DNA from different organisms.
  o sts: Links to UniSTS, a comprehensive database that integrates genetic marker and mapping information. A sequence tagged site (STS) is a short (200 to 500 base pairs) DNA sequence that has a single occurrence in the human genome. Detectable by polymerase chain reaction (PCR), STSs are useful for localizing and orienting the sequence data reported from many different laboratories.
  o CCDS: Links to the CCDS project, an effort to ensure that coding regions within the human genome are consistently annotated.
  o SNP: Links to records for single nucleotide polymorphisms (SNPs) and other areas of sequence variation that have been identified in the selected gene.
10. Let's zoom out to view the entire chromosome using the Maps & Options window.
- Click on Maps & Options again to open the options window.
- Delete the numbers defining the Region Shown at the top of the options window. This will modify the display so it shows the entire chromosome.
- Under More Options near the bottom of the window, change Page Length from 10 to 20. The Page Length option is highlighted in the screenshot on the next page. This will display 20 labeled genes in the master map and should provide enough space on the screen to view the entire chromosome with readable labels for the chromosome bands.
- Once the Maps & Options window resembles the screenshot on the following page, click the OK button to submit your changes.
11. Your view of chromosome 6 should resemble the screenshot on the next page.
- To see a more comprehensive listing of genes on chromosome 6, select the Data As Table View link in the blue navigation column on the left. The Data As Table View displays 1,000 of the genes on chromosome 6 and shows where genes start and stop in the chromosome's DNA sequence.
- Scroll down to the bottom of the map to examine the Summary of Maps section. Use this information and what you have learned about Map Viewer to answer the Questions for Activity 2 on page 51.
Activity 3
Online Resources: NCBI Entrez Gene and GenBank
Examine gene sequence and structure.
NCBI Entrez Gene and GenBank
Entrez Gene is an NCBI resource that serves as a single-query interface for accessing sequence and other biological information for specific genes from a variety of sequenced organisms. GenBank is NCBI's comprehensive repository of annotated DNA sequences.

This activity covers how to use Entrez Gene to access the genomic DNA sequence of the hereditary hemochromatosis (HFE) gene. Then by examining some different features of a GenBank record for the HFE gene, we will learn about the gene's structure (e.g., intron and exon composition, coding sequence).

1. To begin, let's go to the Entrez Gene homepage (www.ncbi.nlm.nih.gov/gene). In the search box at the top, enter HFE[sym] AND human[orgn] as shown in the screenshot below. Be sure to capitalize any Boolean operator (AND, OR, and NOT) included in your search statements. Then submit your search.
Search Tip: Adding [sym] to the end of your query term tells Entrez Gene that you are searching by gene symbol only. If you do not specify that you want to search the gene symbol field, the search will return multiple records that include the query term anywhere within a record's content. Adding [orgn] to a search term limits the search to genes from a specific organism. For more information on options for refining your search, see the Search Field Descriptions and Qualifiers section of the Entrez Help Document (www.ncbi.nlm.nih.gov/entrez/query/static/help/Summary_Matrices.html).
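
The same qualified search term can also be assembled in a script. The sketch below only builds the query string and a URL; it does not contact NCBI. The workbook itself uses the web search box, so the use of NCBI's E-utilities ESearch endpoint here is an assumption added for illustration, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (not used in the workbook's own steps).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_query(symbol, organism):
    """Combine a gene symbol and organism using field qualifiers.

    The Boolean operator is written in capitals, as the activity requires.
    """
    return f"{symbol}[sym] AND {organism}[orgn]"


def esearch_url(term, db="gene"):
    """Build an ESearch URL for the given database and search term."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term})


term = gene_query("HFE", "human")
print(term)
print(esearch_url(term))
```

Building the term in one place makes it easy to swap in another gene symbol or organism while keeping the qualifiers and the capitalized AND consistent.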
2. Submitting this search should retrieve a single result. The HFE record is shown below.
3. In the Summary section you can find information about the function of the gene's protein product. The HFE protein is thought to have a role in regulating iron transport into cells, and defects in the HFE gene can cause the iron absorption disorder hereditary hemochromatosis. Use information provided in the Summary section to answer Question 1 for Activity 3 in the worksheet on page 52.
4. Below the summary section is the Genomic regions, transcripts, and products section.
- The sequence viewer box shows a graphic model of the HFE gene consisting of a thin gray line (representing introns that are removed when the mRNA is processed) connected to thicker green boxes (representing exons).
- The portion of the chromosome 6 sequence included in the sequence viewer box is noted in the upper left corner.
- Click on the GenBank link in the upper right corner to access the GenBank record for the HFE gene sequence that is part of the sequence data generated by the International Human Genome Project. A screenshot of this GenBank record is shown on the next page.
5. Check out some of the following features in the GenBank record.
At the top we see that only a very small portion of chromosome 6 (from the 26,087,448th
base to the 26,097,059th base) is included in this record.
The first Reference listed for this record identifies the International Human Genome
Sequencing Consortium as the source for this sequence information. Thus this sequence
is a product of the international Human Genome Project.
Even after a genome sequence is published in a journal and reported as complete, the
research community continues to analyze the genome sequence data and improve the
annotation that describes different features encoded within the genome sequence. Note
that this record was last modified October 25, 2010.
6. Scroll down to the FEATURES section of this GenBank record (see screenshot on next page).
The HFE gene is 9,612 base pairs (bp) long.
The information in this GenBank record for the HFE gene was "Derived by automated
computational analysis using gene prediction method" as a part of the Human Genome
Project.
From the multiple entries for mRNA listed in this record, we see that more than one
mRNA transcript can be generated from the HFE gene. For example, an exon included in
one mRNA transcript might be left out in another transcript. Each of these different
mRNA transcripts from the same gene is known as a variant.
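The 9,612 bp gene length quoted in step 6 follows from the chromosome coordinates quoted in step 5. As a minimal sketch (the function name `span_length` is our own and is not part of any NCBI tool), the arithmetic can be checked in Python:

```python
def span_length(start: int, end: int) -> int:
    """Number of bases in a span whose first and last positions are both included."""
    return end - start + 1


# Chromosome 6 coordinates of the record, as quoted in step 5.
hfe_start = 26_087_448
hfe_end = 26_097_059

print(span_length(hfe_start, hfe_end))  # 9612
```

Both end positions belong to the record, which is why 1 is added to the difference.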
7. Use your browser's back button to return to the Entrez Gene page for the human HFE gene.
8. Let's access another GenBank record for the HFE gene sequence to see how information can
vary in records that come from different sources. As shown in the screenshot on the next
page, select the Related sequences link in the Table of contents box on the right side of the
screen.
9. In the Related Sequences for the HFE gene (see screenshot on the next page), select the
genomic sequence record Z92910.1.
How did you know which genomic sequence to select?
The problem with archival sequence databases like NCBI's GenBank is that they
usually have multiple sequence records for the same gene. You may need to open
each record individually and browse through the definition, sequence
annotation, and comments to determine how much of the gene's nucleotide
sequence is contained within each record.
For example, the U91328.1 record contains the sequence of a genomic segment
that not only includes the HFE gene sequence but also sequences for other
genes. Y09801.1 contains only sequence information for the HFE promoter and
the HFE gene's first exon. Of the genomic records listed, Z92910.1 has the most
complete sequence information for the HFE gene.
In sequence databases such as GenBank, genomic DNA sequence records for
eukaryotic organisms contain both exons and introns, while mRNA sequences
are intron-free DNA sequences. All sequences in GenBank and similar
repositories use the single-letter abbreviations for the DNA bases adenine (A),
cytosine (C), guanine (G), and thymine (T) to represent each nucleotide. Even
mRNA sequence records use A, C, G, and T, where T is used to replace each
uracil (U) in the mRNA sequence.
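The T-for-U convention described above amounts to a simple letter substitution. The sketch below illustrates it with a made-up 12-base fragment (not taken from any HFE record); the function names are our own.

```python
def rna_to_genbank_letters(rna: str) -> str:
    """Write an RNA sequence the way an mRNA record stores it: T in place of U."""
    return rna.upper().replace("U", "T")


def genbank_letters_to_rna(dna: str) -> str:
    """Recover the RNA spelling by putting U back wherever the record shows T."""
    return dna.upper().replace("T", "U")


# Made-up fragment used only for illustration.
example_mrna = "AUGGGUCCUCGU"

as_stored = rna_to_genbank_letters(example_mrna)
print(as_stored)                          # ATGGGTCCTCGT
print(genbank_letters_to_rna(as_stored))  # AUGGGUCCUCGU
```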
Related Sequences for the HFE Gene in Entrez Gene
10. A screenshot of the GenBank record Z92910.1 for the HFE gene is shown on the next page.
The DNA sequence included in this record is 12,146 base pairs (bp) long. In addition to
containing the genomic sequence of the HFE gene, this record also contains several
hundred additional base pairs of sequence upstream and downstream of the gene.
This record was originally submitted by a researcher to GenBank in 1997, so the sequence
of the HFE gene was known several years before the Human Genome Project was
complete.
Scroll down to the FEATURES section of this record and use this information to answer
Questions 2–4 for Activity 3 on page 52. Note that clicking on the gene link in the
FEATURES section shows that the length of the HFE gene is different from what we
observed in the GenBank record examined in step 5 of this activity.
GenBank Record Z92910.1 for Human HFE Gene
11. Some features of the sequence in GenBank record Z92910.1 include:
source: Required for every GenBank record, the source provides the entire sequence
length and the scientific name of the source organism. Other types of source information
may include chromosome number, map location, and clone or strain identification.
gene: This feature provides nucleotide numbers indicating where the gene starts and
stops. This link opens a new sequence record that shows only the gene sequence.
exon: This feature provides nucleotide numbers indicating where each exon begins and
ends. You will see several of these entries as you scroll down. Each exon is a sequence
segment that codes for a portion of processed (intron-free) mRNA. The name of the gene
to which the exon belongs and the exon number are provided. An exon link opens a
new sequence record that shows only the exon sequence.
CDS: The coding sequence (CDS) consists of nucleotides that actually code for amino acids
of the protein product. This feature includes the coding sequence's amino acid translation
and may also contain gene name, gene product function, a link to protein sequence
record, and cross-references to other database entries. A CDS link opens a new
sequence record that shows only the coding sequence.
intron: This feature provides the nucleotide numbers indicating where each intron begins
and ends. An intron is a segment of noncoding sequence that is transcribed but removed
from the transcript by splicing together the exons on either side of it. An intron link
opens a new sequence record that shows only the intron sequence.
What's the difference between exons and coding sequence?
Exons often are described as short segments of protein coding sequence. This is a bit of an
oversimplification. Exons are segments of sequence spliced together after introns have
been removed from pre-mRNA. Exons carry the coding sequence of a gene, but some
exons may contain no coding sequence. Portions of exons or even entire exons may
contain sequence that is not translated into amino acids. These are the untranslated
regions (UTR) of mRNA. UTRs are found upstream and downstream of the
protein-coding sequence. See diagram on right.
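The distinction between exons and coding sequence can be made concrete by counting, for each exon, how many bases fall inside the CDS and how many are untranslated. The sketch below uses hypothetical exon and CDS coordinates chosen only for illustration; they are not the HFE values from record Z92910.1, and the function name is our own.

```python
def coding_bases_per_exon(exons, cds_start, cds_end):
    """For each exon (start, end), count the bases that lie inside the coding sequence."""
    counts = []
    for start, end in exons:
        overlap_start = max(start, cds_start)
        overlap_end = min(end, cds_end)
        # A negative span means the exon lies entirely outside the CDS.
        counts.append(max(0, overlap_end - overlap_start + 1))
    return counts


# Hypothetical coordinates (assumed for illustration, not from any real record).
exons = [(1, 120), (301, 420), (601, 750)]
cds_start, cds_end = 61, 690

exon_lengths = [end - start + 1 for start, end in exons]
coding = coding_bases_per_exon(exons, cds_start, cds_end)
untranslated = [length - c for length, c in zip(exon_lengths, coding)]

print(exon_lengths)   # [120, 120, 150]
print(coding)         # [60, 120, 90]
print(untranslated)   # [60, 0, 60]
```

In this made-up gene the first and last exons each carry 60 untranslated bases, so the spliced mRNA (390 bases) is longer than the coding sequence (270 bases).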
12. Sequence information in a GenBank record can also be displayed using graphics in the NCBI
Sequence Viewer. To access Sequence Viewer from a GenBank record, click on the Graphics
link in the upper left corner (as shown below).
13. The Sequence Viewer option for GenBank record Z92910.1 is shown in the screenshot below.
The top panel displays the entire sequence included in the GenBank record, the green bar
represents the HFE gene sequence, and the blue outline of a box with arrows indicates
which portion of the sequence is shown in the panel below. Click and drag the arrows on
the blue-box outline to change how much of the sequence is displayed in the lower panel.
You can also use the arrows on the left side of the lower panel to move along the
sequence and see where exons and other gene features begin and end. The slider below
the arrows can be used to zoom in and out on the sequence.
Activity 4
Online Resources: UniProt Protein Knowledgebase and BLAST Searching
Access the amino acid sequence of a gene's protein product.
Compare the HFE protein sequence with protein sequences of other organisms.
UniProt Protein Knowledgebase and BLAST Searching
The Protein Knowledgebase, which is part of the Universal Protein Resource (UniProt), is a
comprehensive, freely accessible database that the scientific community uses to access
high-quality protein sequence and functional information. This activity covers how to use
UniProt to learn about the amino acid sequence and other features of the hereditary
hemochromatosis protein.
1. Go to the UniProt home page (www.uniprot.org), enter HFE into the query box as shown in
the screenshot below, and then submit your search.
2. From the list of results (shown in the screenshot on the next page), notice that some entries
have gold stars and others have gray stars. Those with gold stars have descriptions of protein
functions and characteristics that have been manually reviewed by experts. Entries with gray
stars have descriptions that were automatically generated, and experts have not yet reviewed
these records. Thus selecting a search result with a gold star will provide you with richer,
higher quality information about a protein.
Select accession number Q30201 for the HFE_HUMAN entry for the hereditary
hemochromatosis protein.
3. The UniProt entry for the HFE protein is shown on the next page. The blue navigation bar at
the top of the screen contains links to different parts of the UniProt record for this protein.
Make a note of the accession number (Q30201) for this protein. We will use the accession
number to search for protein structural information in Activity 5.
Scroll down through the record and review the Protein attributes and the General
annotation sections to answer Questions 1–3 for Activity 4 in the worksheet on page 52.
4. In the Protein attributes section, for Sequence processing, note "The displayed sequence is
further processed into a mature form." This means that part of the HFE protein chain needs to
be cut off by a proteolytic enzyme to form the mature functional protein.
5. Click on Sequence annotation in the blue navigation bar near the top of the record (marked in
the screenshot above).
6. The Sequence annotation section of the HFE protein record is shown in the screenshot on the
next page.
Under Molecule processing in the Sequence annotation section, notice that the signal
peptide comprises amino acids 1–22. The first 22 amino acids are not associated with any
domains (functional units within a protein). This portion is cleaved from the complete
protein sequence to make the mature, functional HFE protein, which consists of amino
acids 23–348. Clicking on the blue Position(s) numbers in the sequence annotation will
open a window showing the selected sequence highlighted within the context of the
entire protein sequence.
In Activity 1 we learned that the cysteine at amino acid position 282 is changed to a
tyrosine in a common mutation that causes hemochromatosis. Review the Regions and
Amino acid modifications parts of the Sequence annotation section to answer
Questions 4–5 for Activity 4 on page 52.
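The positions described in step 6 can be checked against the protein sequence itself. The sketch below is only an illustration: it copies the Q30201 sequence as printed in the Activity 5 worksheet of this workbook, and the variable and function names are our own. Python strings are indexed from 0, so position n in the protein is index n - 1.

```python
# HFE_HUMAN (Q30201) sequence as printed in the Activity 5 worksheet.
HFE_HUMAN = "".join("""
MGPRARPALL LLMLLQTAVL QGRLLRSHSL HYLFMGASEQ DLGLSLFEAL GYVDDQLFVF
YDHESRRVEP RTPWVSSRIS SQMWLQLSQS LKGWDHMFTV DFWTIMENHN HSKESHTLQV
ILGCEMQEDN STEGYWKYGY DGQDHLEFCP DTLDWRAAEP RAWPTKLEWE RHKIRARQNR
AYLERDCPAQ LQQLLELGRG VLDQQVPPLV KVTHHVTSSV TTLRCRALNY YPQNITMKWL
KDKQPMDAKE FEPKDVLPNG DGTYQGWITL AVPPGEEQRY TCQVEHPGLD QPLIVIWEPS
PSGTLVIGVI SGIAVFVVIL FIGILFIILR KRQGSRGAMG HYVLAERE
""".split())

SIGNAL_PEPTIDE_END = 22  # signal peptide is amino acids 1-22 (step 6)


def residue_at(sequence: str, position: int) -> str:
    """Return the amino acid at a 1-based position."""
    return sequence[position - 1]


signal_peptide = HFE_HUMAN[:SIGNAL_PEPTIDE_END]   # amino acids 1-22
mature_protein = HFE_HUMAN[SIGNAL_PEPTIDE_END:]   # amino acids 23-348

print(len(HFE_HUMAN))              # 348
print(len(signal_peptide))         # 22
print(len(mature_protein))         # 326
print(residue_at(HFE_HUMAN, 282))  # C (the cysteine changed to tyrosine)
```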
7. Scroll down to the Secondary structure part of the Sequence annotation section (shown in
image below) and click on Details below the colored bar.
8. The secondary structure details show which segments of protein sequence make up beta
strands, alpha helices, or the turns that form between beta strands and alpha helices. These
secondary elements are important in determining the three-dimensional protein structure.
Use this secondary structural information to answer Question 6 for Activity 4 on page 52.
9. Return to the top of the HFE protein record by scrolling or by clicking Names in the blue
navigation bar. Click on the Blast tab at the top of the page.
NOTE: BLAST (Basic Local Alignment Search Tool) is a tool used to calculate how similar
nucleotide or protein sequences are among the same or different kinds of organisms. Many
resources that maintain biological sequence information support their own BLAST
searching capabilities to retrieve and compare sequence data. For more information about
BLAST, see The NCBI Handbook (www.ncbi.nlm.nih.gov/books/NBK21097/).
Protein sequences are often preferred over nucleotide sequences for BLAST searching
because of the greater variability in nucleotide sequences. Remember that with the genetic
code, different codons of nucleotides can specify the same amino acid. Thus proteins that
have similar amino acid sequences may have considerably different nucleotide sequences
encoding those proteins.
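The point about codon redundancy can be illustrated with two made-up DNA sequences that differ at several bases yet encode the same four amino acids. This is a sketch only: the sequences are invented for illustration, the codon table is a small excerpt of the standard genetic code, and the function names are our own.

```python
# Excerpt of the standard genetic code (only the codons used in this example).
CODON_EXCERPT = {
    "ATG": "M",
    "TGT": "C", "TGC": "C",
    "TTA": "L", "CTG": "L",
    "TAT": "Y", "TAC": "Y",
}


def translate(dna: str) -> str:
    """Translate a DNA sequence codon by codon using the excerpt above."""
    return "".join(CODON_EXCERPT[dna[i:i + 3]] for i in range(0, len(dna), 3))


def count_matches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences carry the same letter."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Two invented sequences, 12 bases each.
dna_1 = "ATGTGTTTATAT"
dna_2 = "ATGTGCCTGTAC"

print(translate(dna_1), translate(dna_2))   # MCLY MCLY
print(count_matches(dna_1, dna_2), "of", len(dna_1), "bases identical")
```

Here the protein sequences are identical while only 8 of the 12 bases match, which is why a protein-level comparison can detect a relationship that a nucleotide-level comparison would score lower.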
10. A screenshot of the BLAST search feature for the HFE protein is shown below.
The amino acid sequence of the complete HFE protein is automatically entered into the
text box on the left. The single-letter abbreviations used to represent each amino acid are
explained in the Table of Standard Genetic Code on page 50.
Click on the Blast button to compare the amino acid sequence of the HFE protein with all
the sequences available from the UniProt Knowledgebase. Be patient. A BLAST search
may take several minutes depending on how busy the server is.
11. Once the results are retrieved, scroll down to the Detailed BLAST results (see screenshot on
next page).
The Identity column on the right provides the percent of each entry's amino acid
sequence that is identical to the sequence submitted. To sort all of your results from
highest to lowest Identity values, click on the arrows at the top of the Identity column.
To see more results, click Next in the upper right corner.
Use the Detailed BLAST results to answer Question 7 for Activity 4 on page 53.
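To give a feel for what an Identity percentage means, the sketch below counts identical positions in two sequences that are assumed to be already aligned (gaps written as "-"). This is a simplified picture, not how BLAST itself works: BLAST first has to find the alignment, and the exact denominator a given server reports may differ. The query is the first ten amino acids of HFE; the two subject sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences show the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


query = "MGPRARPALL"          # first ten amino acids of human HFE
subject_one = "MGPRAGPALL"    # invented: one substitution
subject_two = "MG-RARPSLL"    # invented: one gap and one substitution

print(percent_identity(query, subject_one))  # 90.0
print(percent_identity(query, subject_two))  # 80.0
```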
Detailed BLAST Results for the HFE Protein in UniProt
Activity 5
Online Resources: Protein Data Bank
Explore the sequence and structure of the gene's protein product.
Protein Data Bank
This activity demonstrates how to find and view a protein structure using tools and resources
available from the Protein Data Bank (PDB). PDB is an international archive of 3-D structural
information for biological macromolecules. PDB records provide access to several interactive
molecular graphics programs. This activity also uses FirstGlance in Jmol, a resource that works in
most browsers for viewing the major molecular features of a structure with just a few
mouse-clicks.
Before You Begin
Many features of the PDB website require newer Web browsers with JavaScript and cookies
enabled, and pop-ups should not be blocked. For more information on system requirements
see PDB Frequently Asked Questions (www.rcsb.org/pdb/static.do?p=home/faq.html).
Some Protein Structure Basics
Proteins are created by linking amino acids in a linear fashion to form polypeptide chains. The
amino acid sequence of a polypeptide chain is the primary structure of a protein. See the
Table of Standard Genetic Code on page 50 for single-letter and three-letter abbreviations for
the 20 different amino acids.
Amino acids have different chemical properties. For example, some amino acid residues are
strictly hydrophobic (water fearing) and must be protected from aqueous environments,
while other amino acids are hydrophilic (water loving). The substitution of just one amino
acid for another with very different chemical properties can have serious consequences for a
protein's structure and function.
The folding of regions within the polypeptide chain into alpha helices and beta sheets is a
protein's secondary structure.
The packing of the entire polypeptide chain into a three-dimensional globular unit is a
protein's tertiary structure.
If a protein molecule is a complex of more than one polypeptide chain, then the complete
structure of this molecular complex is called a protein's quaternary structure.
A domain is a discrete portion of a protein with its own function and specific
three-dimensional structure. The combination of domains in a single protein determines its
overall function.
Different parts of a polypeptide chain can be linked by disulfide bridges that form between
two cysteine residues. Disulfide bridges (or disulfide bonds) stabilize a protein's
three-dimensional structure. The loss of a disulfide bridge would be detrimental to a protein's
overall structure.
Finding a Structure Record in PDB
1. To begin, let's go to the Protein Data Bank (www.rcsb.org/pdb/).
NOTE: If you are new to PDB, be sure to check out the Education links in the light
blue column on the left of the screen. Under Educational Resources you can find
posters, tutorials, activities, and lessons. Molecule of the Month is a collection of
vignettes, each featuring a different molecular structure and its importance to
human welfare.
2. Beside the search box at the top of the PDB home page, select Advanced Search.
3. On the Advanced Search page, from the Choose a Query Type drop box select UniProtKB
Accession Number(s). In Activity 4 we accessed the human hemochromatosis protein record
Q30201 in the UniProt Protein Knowledgebase. Enter Q30201 in the search box. The
advanced search page should look like the screenshot below. Select the Submit Query button
to submit your search.
4. The search should return two hits. Scroll down the page to see a brief summary of each
search result. One record (1DE4) provides structural information on the hemochromatosis
protein HFE complexed with a receptor, and the other record (1A6Z) just provides structural
information for the HFE protein. Click on 1A6Z HFE (HUMAN) HEMOCHROMATOSIS PROTEIN
to open this PDB record.
5. The summary tab of the 1A6Z record is shown in the screenshot above.
Note the Molecular Description box in the center of the screenshot. This structure is a
complex of four polymer chains: A, B, C, and D. A and C are identical HFE polypeptide
chains, and B and D are identical chains of another protein called beta-2-microglobulin.
Note the Primary Citation in the 1A6Z record. The best way to learn about structure
details is to access the article listed as the primary citation. Although the full text for some
articles may be freely available online, many articles are accessible only by subscription.
Some university research libraries may provide public access to their journal collections.
The article for this structure has been accessed to reveal the following details:
o Only the soluble portion of the HFE polypeptide chain is included in the 1A6Z
structure. The transmembrane domain is missing, so the HFE protein in this
structure has only 275 of the 348 amino acids in the complete HFE protein
sequence.
o The first 22 amino acids of the HFE polypeptide sequence have been excluded
because they are not part of the mature, functional protein. Therefore, the first
amino acid in this structure is really the 23rd, and cysteine 260 is the cysteine
residue involved in the CYS282TYR mutation that we learned about in Activity 1.
o Each HFE polypeptide chain is complexed with another polypeptide chain called
beta-2-microglobulin.
o The 1A6Z structure consists of two HFE–beta-2-microglobulin complexes.
6. Select the Sequence tab to examine the sequence and secondary structure details for this
structure.
7. The Sequence and Structure Details for record 1A6Z are shown in the screenshot below.
The HFE protein sequence (polypeptide chain A) is presented first. Each letter in the
protein sequence represents a different amino acid. C stands for cysteine. See the Table
of Standard Genetic Code on page 50 to determine which amino acid is represented by
each letter.
Secondary structure details are mapped onto sequence details. Different graphical
symbols are used to represent extended beta strands, alpha helices, bends, and turns.
8. Select the display external (UniProtKB) sequence link highlighted in the previous screenshot.
9. The sequence page will reload and display the amino acid numbers for the UniProt HFE
protein sequence (that we examined in Activity 4) above the line of single-letter amino acid
abbreviations (see screenshot below).
Find cysteine 282 in the UniProt sequence. Cysteine 282 is the amino acid that is replaced
by tyrosine in the CYS282TYR mutation.
You will see that cysteine 282 in the UniProt sequence is at position 260 in the PDB
structure sequence. In Activity 4, we learned that cysteine 282 forms a disulfide bond
with cysteine 225 in the UniProt HFE protein sequence. In the HFE protein sequence for
PDB structure 1A6Z, we see that cysteine 260 forms a disulfide bond with cysteine 203
(which corresponds to cysteine 225 in the UniProt sequence). Disulfide bonds are critical
to forming the proper structural arrangement needed to make a functional protein;
therefore, the loss of cysteine 260 would be detrimental to protein structure. Answer the
first two questions for Activity 5 in the worksheet on page 54.
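The two numbering schemes in step 9 differ by the 22 amino acids of the signal peptide that are missing from the structure. As a minimal sketch, assuming this constant offset holds for the HFE chain in 1A6Z as described above (other PDB entries may be numbered differently), the conversion is a subtraction. The function names are our own.

```python
SIGNAL_PEPTIDE_LENGTH = 22  # amino acids 1-22 are absent from the 1A6Z HFE chain


def uniprot_to_pdb(position: int) -> int:
    """Convert a UniProt HFE position to the numbering used in structure 1A6Z."""
    if position <= SIGNAL_PEPTIDE_LENGTH:
        raise ValueError("signal peptide residues are not part of the structure")
    return position - SIGNAL_PEPTIDE_LENGTH


def pdb_to_uniprot(position: int) -> int:
    """Convert a 1A6Z HFE position back to UniProt numbering."""
    return position + SIGNAL_PEPTIDE_LENGTH


for uniprot_position in (225, 282):
    print(uniprot_position, "->", uniprot_to_pdb(uniprot_position))
# 225 -> 203
# 282 -> 260
```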
Viewing the Structure
10. Select the Summary tab near the top of the page to return to the 1A6Z record summary. In
the Biological Assembly 1 box in the upper right corner of the page there are several options
for viewing the molecular structure. Clicking on the More Images link will open a page with
options for downloading a still image of the HFE molecular complex 1A6Z. Although PDB
provides access to several different molecular viewers for examining a 3-D representation of a
molecular complex, many of these options were designed for scientists who specialize in
studying molecular structures. In this activity, we will use a molecular viewer called
FirstGlance in Jmol, which is one of the more user-friendly options for displaying the major
structural features of a molecule. FirstGlance in Jmol was developed to work in all popular
web browsers without having to download and install anything.
11. To access FirstGlance, first click on the left arrow next to the Biological Assembly 1 label
above the molecular image. This should change the box label to Asymmetric Unit.
12. By clicking on the arrow next to Other Viewers, a drop-down menu will appear. Select
FirstGlance from the drop-down menu (see screenshot below).
13. A new page should open displaying structure 1A6Z using FirstGlance in Jmol (see screenshot
below).
To stop the spinning of the molecule, click the Spin box in the upper left.
To remove the S- labels, uncheck the Show box beside Labels X, S-, ?.
14. The structure is initially displayed using the Cartoon option, which assigns a different color to
each molecular chain in the structure. Chains A, B, C, and D should be displayed. Earlier in the
activity we learned that chains A and C are identical HFE chains and chains B and D are
identical beta-2-microglobulin chains.
Clicking anywhere on the molecule will generate a label in the lower left corner showing
the amino acid residue and the protein chain that you have selected.
Click on each colored chain to find Chain A, which is one of the two HFE protein chains. In
the screenshot on the next page, Chain A is the blue chain.
If you need to rotate the structure, simply click on the structure and drag with your mouse.
To undo any of the changes you have made and reset the structure to its original
configuration, click Reset in the upper left corner.
By clicking on the blue chain, the label in the lower left indicates that the blue chain is Chain A.
15. Let's hide all the chains except Chain A. Click on the Hide.. link in the upper left corner and
then click on each chain except Chain A. Your screen should look like the screenshot below.
16. Click on the Center Visible Chains link (highlighted in screenshot above) to place Chain A in
the center of the display panel.
17. Once Chain A is centered, use the Zoom tool to enlarge Chain A. In addition to using the Zoom
arrows in the upper left corner, you can zoom in and out by clicking on the background of the
structure and then using the wheel on your mouse. Alternatively, you can hold down the
Shift key and drag the mouse up and down over the molecule to zoom in and out. Your screen
should look something like the screenshot below.
18. Let's find cysteine 260 and cysteine 203 (the cysteine residues that form the disulfide bond
involved in the CYS282TYR mutation). Click on the Find.. link (highlighted in screenshot
above).
19. The Find option (shown in the screenshot on the next page) allows you to search for
particular residues within a molecule. The locations of the residues are indicated using yellow
dots. The background color automatically changes to black when you select Find. A black
background makes the yellow dots easier to see. You can toggle between black and white
background colors by clicking on the Background box in the upper left corner.
Type CYS260,CYS203 into the text box.
Press the Enter key on your keyboard to submit your search.
Yellow dots should indicate where these two residues are in the protein chain. You may
need to rotate the structure by clicking and dragging your mouse over the molecule so
that you can obtain a good view of the yellow dots. Note that the yellow dots surround a
thin gold bar. This thin gold bar represents a disulfide bond. You can see that a bond
between cysteines 203 and 260 would create a strong connection between two different
strands within the protein.
Finding Cysteine Residues in the HFE Protein
20. To obtain a better view of the disulfide bonds in the HFE protein, click on the More Views..
link in the upper left corner, and then click on the Disulfide Bonds: Show All link. The page
should change so that it looks like the screenshot below. The backbone of the protein chain is
modified to a thin line (which is difficult to see in the screenshot), and the disulfide bonds
become thicker and easier to see. The cysteine residues are also labeled. Answer Questions
3–4 for Activity 5 in the worksheet on page 54.
21. Now that you are familiar with a few options for modifying a molecular structure using
FirstGlance, you may want to Reset the structure and practice what you have learned. In
addition to the display options in the upper left corner of the screen, you can also use pop-up
menus to modify the structure by clicking on Jmol in the lower right corner of the display
panel (highlighted in the previous screenshot).
22. If you are interested in copying or saving a particular view of a structure that you have
created, check out the Presenting Molecular Views from FirstGlance in Jmol page
(molvis.sdsc.edu/fgij/slides.htm).
Protein Structure and Hereditary Hemochromatosis Development
By examining the HFE protein's sequence and structure, we discover that the cysteine lost in the
CYS282TYR mutation has an important role in establishing the correct three-dimensional HFE
structure. In this mutation, a cysteine residue is replaced by another amino acid, tyrosine, and the
disulfide bond between two cysteines in the polypeptide chain is lost. This is detrimental to the
protein's structure. As a result, the HFE protein can no longer perform its normal function of
regulating iron uptake, and cells become overloaded with iron. This buildup of iron in cells, if
untreated, can lead to organ damage and other complications.
Table of Standard Genetic Code for Translating DNA Sequence Records

First   Second base
base    T                  C              A              G
T       TTT Phe (F)        TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
        TTC Phe (F)        TCC Ser (S)    TAC Tyr (Y)    TGC Cys (C)
        TTA Leu (L)        TCA Ser (S)    TAA STOP       TGA STOP
        TTG Leu (L)        TCG Ser (S)    TAG STOP       TGG Trp (W)
C       CTT Leu (L)        CCT Pro (P)    CAT His (H)    CGT Arg (R)
        CTC Leu (L)        CCC Pro (P)    CAC His (H)    CGC Arg (R)
        CTA Leu (L)        CCA Pro (P)    CAA Gln (Q)    CGA Arg (R)
        CTG Leu (L)        CCG Pro (P)    CAG Gln (Q)    CGG Arg (R)
A       ATT Ile (I)        ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
        ATC Ile (I)        ACC Thr (T)    AAC Asn (N)    AGC Ser (S)
        ATA Ile (I)        ACA Thr (T)    AAA Lys (K)    AGA Arg (R)
        ATG Met (M) START  ACG Thr (T)    AAG Lys (K)    AGG Arg (R)
G       GTT Val (V)        GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
        GTC Val (V)        GCC Ala (A)    GAC Asp (D)    GGC Gly (G)
        GTA Val (V)        GCA Ala (A)    GAA Glu (E)    GGA Gly (G)
        GTG Val (V)        GCG Ala (A)    GAG Glu (E)    GGG Gly (G)

Key to the Table of Standard Genetic Code
Alanine        ALA  A      Arginine       ARG  R
Asparagine     ASN  N      Aspartic acid  ASP  D
Cysteine       CYS  C      Glutamic acid  GLU  E
Glutamine      GLN  Q      Glycine        GLY  G
Histidine      HIS  H      Isoleucine     ILE  I
Leucine        LEU  L      Lysine         LYS  K
Methionine     MET  M      Phenylalanine  PHE  F
Proline        PRO  P      Serine         SER  S
Threonine      THR  T      Tryptophan     TRP  W
Tyrosine       TYR  Y      Valine         VAL  V
START = Initiation Signal (signifies the beginning of a polypeptide chain)
STOP = Termination Signal (signifies the end of a polypeptide chain)
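The table can also be read programmatically. The sketch below builds the standard genetic code from a compact 64-letter string (codons ordered with T, C, A, G at each position, matching the table layout) and translates a short made-up DNA sequence; the sequence and the function name are our own and are not taken from any HFE record.

```python
from itertools import product

BASES = "TCAG"
# One-letter amino acids for TTT, TTC, TTA, TTG, TCT, ... GGG; "*" marks STOP.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon from its first base, stopping at a STOP codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up sequence: ATG GGT CCT CGT TAA
print(translate("ATGGGTCCTCGTTAA"))            # MGPR
print(CODON_TABLE["TGT"], CODON_TABLE["TAT"])  # C Y
```

The last line shows that a single G-to-A change in the second position of a cysteine codon (TGT to TAT, or TGC to TAC) yields a tyrosine codon, the kind of amino acid change seen in the CYS282TYR mutation.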
Hereditary Hemochromatosis Worksheet
This worksheet provides questions to be answered as you complete the activities in the Gene
Gateway Workbook.
Questions for Activity 1
1) What is the official gene symbol of the hereditary hemochromatosis gene?
2) Which allelic variant (genetic mutation) most commonly causes hereditary hemochromatosis?
3) What are some characteristics of hereditary hemochromatosis? How is it treated?
Questions for Activity 2
1) On the diagram to the right, mark the general region where
the HFE gene can be found on chromosome 6.
2) About how many genes are on chromosome 6?
3) How long is the DNA sequence for chromosome 6?
Questions for Activity 3
1) Using the summary from the Entrez Gene record for the HFE gene, briefly describe the
function of the gene's protein product.
Use the GenBank sequence record Z92910.1 to answer questions 2–4.
2) In the Features section of record Z92910.1, select the gene link. How many base pairs (bp) are
in the genomic sequence of the HFE gene?
3) Scroll through the Features section of the gene sequence in Z92910.1. How many exons have
been identified in this sequence?
4) Return to the main record Z92910.1. Select the CDS link. How many base pairs are in the
coding sequence?
Questions for Activity 4
1) How many amino acids (AA) are in the complete HFE protein?
2) In what part of the cell is the HFE protein located?
3) What type of tissue does not express the HFE protein?
4) Is cysteine 282 found on the extracellular or cytoplasmic side of the HFE protein?
5) What is the number of the cysteine residue that forms a disulfide bond with cysteine 282?
6) What kind of secondary structural element contains cysteine 282: alpha helix, turn, or beta
strand?
7) Using the BLAST search results, list the first 10 non-human organisms that have proteins
similar to the human HFE protein sequence. Include the percent identity score with each
organism you list, and order the list from highest to lowest identity score. Skip any human
entries, and do not list any organism more than once.
     Organism Name                Identity %
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
Questions for Activity 5
1. Examine the amino acid sequence for the human HFE protein from the UniProt Protein
Knowledgebase (shown below). Find cysteine 282, the amino acid that is replaced by tyrosine
in the CYS282TYR mutation. Refer to the Table of Standard Genetic Code on page 50 for help
with the single-letter amino acid abbreviations.
10 20 30 40 50 60
| | | | | |
MGPRARPALL LLMLLQTAVL QGRLLRSHSL HYLFMGASEQ DLGLSLFEAL GYVDDQLFVF
70 80 90 100 110 120
| | | | | |
YDHESRRVEP RTPWVSSRIS SQMWLQLSQS LKGWDHMFTV DFWTIMENHN HSKESHTLQV
130 140 150 160 170 180
| | | | | |
ILGCEMQEDN STEGYWKYGY DGQDHLEFCP DTLDWRAAEP RAWPTKLEWE RHKIRARQNR
190 200 210 220 230 240
| | | | | |
AYLERDCPAQ LQQLLELGRG VLDQQVPPLV KVTHHVTSSV TTLRCRALNY YPQNITMKWL
250 260 270 280 290 300
| | | | | |
KDKQPMDAKE FEPKDVLPNG DGTYQGWITL AVPPGEEQRY TCQVEHPGLD QPLIVIWEPS
310 320 330 340
| | | |
PSGTLVIGVI SGIAVFVVIL FIGILFIILR KRQGSRGAMG HYVLAERE
2. Compare the amino acid sequence above with the HFE sequence details provided for PDB
structure 1A6Z. In question 1, underline the portion of the amino acid sequence included in
the PDB structure.
3. How many disulfide bonds are present in the hereditary hemochromatosis protein?
4. Why is the cysteine residue affected in the CYS282TYR mutation important?