Bioinformatics

Upload: rodrigo-carmo

Post on 30-Oct-2015


    Updated: February 2011

    The Gene Gateway Workbook

    A collection of activities introducing new users to the web resources that scientists access to learn about genetic disorders, genes, and proteins.

    Using hereditary hemochromatosis as a model, access a variety of websites and databases to

    Learn about a genetic disorder and its associated gene.

    Identify mutations that cause the disorder.

    Find the gene on a chromosome map.

    Examine the gene's sequence and structure.

    Access the amino acid sequence of a gene's protein product.

    Explore the 3-D structure of the gene's protein product.

    To view the chromosomes of the Human Genome

    Landmarks poster online, order your free copy of

    the poster, or download additional copies of this

    workbook, go to the Gene Gateway website:

    genomics.energy.gov/genegateway/


    U.S. Department of Energy Office of Biological and Environmental Research

    The Gene Gateway Workbook    genomics.energy.gov/genegateway/    Updated: Feb 2011


    Acknowledgements

    This workbook was last updated February 2011. With much appreciation, we would like to acknowledge Carrie Morjan, Aurora University, who facilitated the creation of this latest version by sharing her updates to The Gene Gateway Workbook for use in her genetics classes.

    This workbook was first produced by the Biological and Environmental Research Information System at Oak Ridge National Laboratory, Oak Ridge, Tennessee, July 2003, with support from the U.S. Department of Energy Office of Biological and Environmental Research.

    Office of Biological and Environmental Research
    Office of Science
    U.S. Department of Energy (DOE)

    For More Information

    This workbook is freely downloadable from the Gene Gateway website (see link below). For questions or comments concerning this document, contact Jennifer Bownas by email at [email protected].

    Gene Gateway
    genomics.energy.gov/genegateway/

    Human Genome Project Information
    www.ornl.gov/hgmis/home.shtml

    DOE Genomic Science Program
    genomicscience.energy.gov

    DOE Office of Biological and Environmental Research
    science.energy.gov/ber/


    Table of Contents

    Introduction ..... 4
        Why Use Hereditary Hemochromatosis as a Model? ..... 5
        Some Basic Concepts to Understand Before Starting ..... 5

    Activity 1 ..... 6
        Online Mendelian Inheritance in Man (OMIM) ..... 6
        GeneTests ..... 11

    Activity 2 ..... 14
        NCBI Map Viewer ..... 14

    Activity 3 ..... 22
        NCBI Entrez Gene and GenBank ..... 22

    Activity 4 ..... 32
        UniProt Protein Knowledgebase and BLAST Searching ..... 32

    Activity 5 ..... 38
        Protein Data Bank ..... 38

    Table of Standard Genetic Code for Translating DNA Sequence Records ..... 50

    Hereditary Hemochromatosis Worksheet ..... 51


    Introduction

    The Gene Gateway Workbook is a collection of activities with screenshots and step-by-step instructions designed to introduce new users to genetic-disorder and bioinformatics resources freely available on the Web. It should take about 3 hours to complete all five activities.

    The workbook activities were derived from more detailed guides and tutorials available at the Gene Gateway website (genomics.energy.gov/genegateway/). This website was created as a resource for learning more about the genes, traits, and disorders listed on the Human Genome Landmarks (HGL) poster, but it can be used to investigate any gene or genetic disorder of interest.

    Many guides to using bioinformatics resources are designed for bioscience researchers and are too technical for nonexperts. This workbook and other Gene Gateway resources target a more general audience: teachers, high school and college students, patients with disorders and their families, and anyone else who wants to learn more about how life works at a molecular level.

    This workbook shows you how to get started using bioinformatics resources that often intimidate and overwhelm new users. It also demonstrates how information from one resource, such as annotated protein sequence data from the UniProt Protein Knowledgebase, can be used to reinforce and clarify information available from another resource, such as three-dimensional (3-D) structures from Protein Data Bank (PDB). Gene Gateway provides users with a systematic approach to using multiple bioinformatics databases to gain a better understanding of how genes and proteins can contribute to the development of a particular genetic condition.

    Using the genetic disorder hereditary hemochromatosis as a model, this workbook shows you how to access:

    Online Mendelian Inheritance in Man (OMIM) and GeneReviews to learn about a genetic disorder, its associated gene or genes, and common disease-causing mutations.

    NCBI Map Viewer to find a gene locus on a chromosome map.

    NCBI Entrez Gene and GenBank to examine the sequence and structure of a gene.

    UniProt Protein Knowledgebase to find the annotated amino acid sequence of a gene's protein product.

    Protein Data Bank to view and modify the 3-D structure of the gene's protein product.

    Skills gained by working through the activities in this workbook can be applied to learning about other genetic disorders, genes, and proteins.

    This workbook and other genome science resources are available from the website for the genome programs of the Office of Biological and Environmental Research, U.S. Department of Energy Office of Science (genomics.energy.gov/).


    Why Use Hereditary Hemochromatosis as a Model?

    Hereditary hemochromatosis, a disorder in which too much iron accumulates in certain tissues and organs, is caused by changes in the DNA sequence of a single gene, so the genetic basis of this condition is easier to understand than more complex disorders caused by alterations in multiple genes.

    The gene and its protein product are relatively well studied. Three-dimensional structures of the protein product are available in PDB, the international repository for macromolecular structure data.

    Hereditary hemochromatosis is the most common autosomal recessive disorder affecting individuals of Northern European descent (about 1 in 200 Caucasians develop hereditary hemochromatosis).

    Effective methods for treatment are available with early diagnosis.

    Some Basic Concepts to Understand Before Starting

    Genes are the basic physical and functional units of heredity. Each gene is located on a particular region of a chromosome and has a specific ordered sequence of nucleotides (the building blocks of DNA).

    Central dogma of molecular biology: DNA → RNA → Protein

    - Genetic information is stored in DNA.

    - Segments of DNA that encode proteins or other functional products are called genes.

    - Gene sequences are transcribed into messenger RNA intermediates (mRNA).

    - mRNA intermediates are translated into proteins that perform most life functions.

    Eukaryotic genes have introns and exons. Exons contain nucleotides that are translated into amino acids of proteins. Exons are separated from each other by intervening segments of DNA called introns. Introns do not code for protein, and they are removed when eukaryotic mRNA is processed. Exons are spliced back together to form the intron-free mRNA strand that is used as a template to make proteins.

    Special cellular components (ribosomes) use the triplet genetic code to translate the nucleotides of an mRNA sequence into the amino acid sequence of a protein. A Table of Standard Genetic Code is provided on page 50 of this workbook.

    There are 20 different amino acids. Proteins are created by linking amino acids together in a linear fashion to form polypeptide chains. See the Table of Standard Genetic Code on page 50 for single-letter and three-letter abbreviations for the 20 different amino acids.

    Polypeptide chains fold into 3-D structures that can associate with other molecular structures to perform specific functions.
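The concepts above (transcription, intron removal, and translation with the triplet genetic code) can be illustrated with a short Python sketch. This is only a toy illustration, not part of the workbook: the sequence `toy_gene`, the exon coordinates, and the function names are hypothetical, and the sketch treats the coding strand as the transcript source for simplicity.

```python
from itertools import product

# Standard genetic code as a 64-letter string; codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... (bases cycle T, C, A, G; "*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Write the coding strand as RNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def splice(pre_mrna, exons):
    """Join exon segments (0-based, end-exclusive) into intron-free mRNA."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def translate(mrna):
    """Read codons three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene (NOT the real HFE sequence): exon, intron, exon.
toy_gene = "ATGTGC" + "GTAAGTCCAG" + "TACTAA"
toy_exons = [(0, 6), (16, 22)]

mrna = splice(transcribe(toy_gene), toy_exons)
print(mrna)             # AUGUGCUACUAA
print(translate(mrna))  # MCY
```

The output uses the single-letter amino acid abbreviations (M = methionine, C = cysteine, Y = tyrosine) that appear throughout the later activities.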


    Activity 1
    Online Resources: OMIM and GeneTests

    Learn about the genetic disorder and its associated gene.

    Identify mutations that cause the disorder.

    Online Mendelian Inheritance in Man (OMIM)

    OMIM is a comprehensive database of human genes, genetic traits, and disorders created by researchers at Johns Hopkins University. The OMIM database, which is updated daily, is accessible through the National Center for Biotechnology Information (NCBI) suite of online resources. Each record in OMIM summarizes the body of research relevant to a particular gene, trait, or disorder.

    To access OMIM, let's go to the NCBI home page (www.ncbi.nlm.nih.gov) shown below, and then click on OMIM in the box on the upper right.

    A screenshot of the OMIM home page is shown on the following page. The easiest way to begin a search is to simply type a disorder name in the search box at the top of the OMIM page and submit your search. However, NCBI also supports a variety of features for narrowing a search and browsing disorders alphabetically (using OMIM Morbid Map) or by chromosomal location (using OMIM Gene Map).

    To narrow a search, NCBI has options for typing search field qualifiers into the search box [see OMIM Help (www.ncbi.nlm.nih.gov/Omim/omimhelp.html) for more information] or selecting search fields using the Limits tab. This exercise will demonstrate searches using the Limits tab.


    1. Select the Limits tab at the top of the OMIM page shown in the screenshot below.

    URL for OMIM home page: www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

    Most genes, disorders, and traits listed on the Human Genome Landmarks (HGL) poster were taken from the title fields of OMIM records, so we can narrow our search to look only for those records that have "hemochromatosis" in the title field. By selecting hemochromatosis from the HGL poster, we also know that the gene for this disorder is found on chromosome 6.

    2. From the Limits page, enter hemochromatosis into the search box and select the Title box and chromosome 6 as shown in the screenshot below. Click Go to submit your search.


    NOTE: Searching for OMIM records associated with multi-gene disorders, such as breast cancer or diabetes, which are caused by alterations in genes on different chromosomes, may provide multiple OMIM records in the search results. Limiting your search to just one chromosome for a multi-gene disorder may only retrieve a subset of all the records associated with that disorder.

    3. The search should return one result: MIM ID #235200. A screenshot of the full OMIM record for the hemochromatosis disorder is shown below.

    4. Let's examine some of the features of this record:

    Each OMIM record is assigned a unique six-digit MIM ID number located at the top of each entry. For hemochromatosis, the MIM ID is 235200. As a unique identifier for a disorder, the MIM ID can be used to search other databases for information about a particular disorder.

    The number sign (#) prefix in front of the MIM ID means that this entry refers to the description of a phenotype, and the molecular basis for this phenotype is known. For more information about other MIM number prefixes, see OMIM Help (www.ncbi.nlm.nih.gov/Omim/omimhelp.html#MIMnumberPrefix).

    Below the MIM ID, you will find the disorder name and the official gene symbol (shown in the image on the next page). The official gene symbol, which is HFE for hemochromatosis, serves as a unique identifier for a gene. To be "official," a gene symbol must have been approved by the HUGO Gene Nomenclature Committee (www.genenames.org). The gene symbol is especially useful when searching other databases (such as sequence, genome-mapping, and structure databases) for gene-specific information.

    NOTE: For a disorder like hemochromatosis, which is primarily caused by mutations in a single gene, the official gene symbol may be included in the record title. For complex disorders like breast cancer, official symbols for associated genes will be described in the first paragraph of text.

    The Gene map locus describes where a gene can be found on a chromosome. For the gene locus 6p21.3, 6 is the chromosome number, p indicates the short arm of the chromosome, and 21.3 is a number assigned to a particular region of the chromosome. Clicking on a gene map locus opens the OMIM Gene Map, a table of genes organized by chromosomal location.

    The amount of text within an OMIM record varies according to what is known about a particular gene, disorder, or trait. Since hemochromatosis is well studied, a lot of information is known about this disorder and its gene. Some different types of information that may be included in an OMIM record are disorder description, inheritance, molecular genetics, genotype and phenotype correlations, diagnosis, population genetics, and animal models.

    Each record includes a Table of Contents box on the right with quick links to different sections within the record.

    5. To learn more about the molecular basis of hemochromatosis, select the Molecular Genetics link in the Table of Contents box (see screenshot on previous page). The Molecular Genetics section of the OMIM record for hemochromatosis is shown below.

    One study showed that about 83% of hemochromatosis cases are related to the C282Y mutation. The C282Y notation means that a mutation occurs in the DNA sequence that changes the amino acid at position 282 in the protein product from a cysteine (C) to a tyrosine (Y).

    6. Click on the first link for the C282Y mutation, 613609.0001. This link will take you to the OMIM record for the HFE gene (MIM ID *613609; the asterisk prefix indicates the record represents a gene of known sequence). OMIM often maintains separate records for phenotypes (such as the disorder hemochromatosis) and the genes associated with those phenotypes.

    7. The Allelic Variants section of the OMIM record for the HFE gene is shown in the screenshot below. This section typically describes some of the most notable gene mutations (also called allelic variants) that produce disease phenotypes. Note that the C282Y mutation is also known as the CYS282TYR mutation, and it is the first of several mutations that have been identified for the HFE gene. To see a listing of the different mutations for the HFE gene, click on the See allelic variants in tabular display link.

    8. Now you are ready to answer Questions 1-2 for Activity 1 in the worksheet on page 51.

    9. Scroll to the top of this OMIM record, and click on the Limits tab. Let's use options on the Limits page to determine how many genes in the human genome have been described in OMIM.

    Uncheck the boxes for Title and chromosome 6.

    Check the boxes beside the MIM Number Prefix options for "* gene with known sequence" and "+ gene with known sequence and phenotype" as shown in the screenshot on the next page.

    Then click the Go button beside the search box at the top of the page.


    10. You should retrieve over 13,500 search results. Of the estimated 20,000 to 25,000 genes in the human genome, about 13,500 genes have records in OMIM. You may want to test your new search skills by using OMIM to search for other genes or genetic conditions. In addition to OMIM, another good resource for learning about genetic disorders and associated genes is the GeneTests website, which is described in the next part of this activity.
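The identifiers met in the OMIM steps above (a prefixed MIM ID such as #235200, an allelic variant number such as 613609.0001, and a substitution such as C282Y or CYS282TYR) follow simple patterns. The Python sketch below shows one way they might be taken apart. It is an illustration only: the function names are hypothetical, the prefix table lists just the three prefixes described in this activity, and the three-letter lookup covers only the two amino acids needed for this example.

```python
import re

# Only the MIM number prefixes described in this activity.
MIM_PREFIXES = {
    "#": "phenotype description, molecular basis known",
    "*": "gene with known sequence",
    "+": "gene with known sequence and phenotype",
}

# Partial lookup (assumption: just enough for the C282Y example).
THREE_TO_ONE = {"CYS": "C", "TYR": "Y"}


def parse_mim(entry):
    """Split an entry like '#235200' or '613609.0001' into its parts."""
    match = re.fullmatch(r"([#*+]?)(\d{6})(?:\.(\d{4}))?", entry)
    if match is None:
        raise ValueError(f"not a MIM identifier: {entry!r}")
    prefix, mim_id, variant = match.groups()
    return {
        "prefix_meaning": MIM_PREFIXES.get(prefix),  # None if no prefix given
        "mim_id": mim_id,
        "variant": variant,                          # None if no variant suffix
    }


def _one_letter(code):
    """Return the single-letter form of a one- or three-letter amino acid code."""
    code = code.upper()
    return THREE_TO_ONE[code] if len(code) == 3 else code


def parse_substitution(notation):
    """Parse 'C282Y' or 'CYS282TYR' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Za-z]|[A-Za-z]{3})(\d+)([A-Za-z]|[A-Za-z]{3})", notation)
    if match is None:
        raise ValueError(f"not an amino acid substitution: {notation!r}")
    original, position, replacement = match.groups()
    return _one_letter(original), int(position), _one_letter(replacement)


print(parse_mim("#235200"))
print(parse_mim("613609.0001"))
print(parse_substitution("C282Y"))      # ('C', 282, 'Y')
print(parse_substitution("CYS282TYR"))  # ('C', 282, 'Y')
```

Both spellings of the mutation reduce to the same triple: cysteine replaced by tyrosine at position 282.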

    GeneTests

    The GeneTests website is a medical genetics information resource developed by researchers and healthcare professionals and funded by the National Institutes of Health. In addition to providing up-to-date, authoritative reports (GeneReviews) on genetic disorders, the site also includes educational materials (e.g., fact sheets on genetic testing and counseling, PowerPoint slides, and an illustrated glossary) and online directories of genetic laboratories and clinics.

    This activity focuses on accessing and using genetic disorder information available from GeneReviews. All entries are written and reviewed by physicians, so the language is similar to that of medical text. While the amount and kind of content can vary greatly from record to record in OMIM, all reports in GeneReviews will provide similar kinds of information and share the same organizational structure.

    Let's go to the GeneTests website (www.genetests.org) to find a GeneReview for hereditary hemochromatosis. The screenshot of the GeneTests home page is shown on the next page.


    1. Click on GeneReviews in the navigation bar at the top.

    2. At the GeneReviews search page (shown below), use the Gene Symbol search option, select "exactly matches" from the drop-down menu, and enter HFE into the search box. Click Go to submit your search.


    3. Beside the search result HFE-Associated Hereditary Hemochromatosis, select the link to access the hereditary hemochromatosis review shown below.

    4. On the right side of the screen is a navigation column with links to different sections of the HFE-Associated Hereditary Hemochromatosis GeneReview.

    5. Access the Summary section to learn about disease characteristics and treatment for hemochromatosis. This section can help answer Question 3 for Activity 1 in the worksheet on page 51.

    6. Access the Molecular Genetics section for a brief overview of this disorder's molecular basis. Within this section you can find information about:

    official symbol for the gene associated with this disorder.

    chromosomal locus of the gene.

    gene size and the number of exons in the gene.

    name of the gene's protein product.

    description of the protein's function.

    mutations in nucleotide and amino acid sequences that cause abnormal protein products and disease phenotypes.

    links to scientific literature and other databases for more information.


    Activity 2
    Online Resource: NCBI Map Viewer

    Find the hereditary hemochromatosis gene on a chromosome map.

    NCBI Map Viewer

    NCBI Map Viewer is a Web-based tool for viewing and searching an organism's complete genome. Users also can view maps of individual chromosomes and zoom into specific regions within chromosomes to explore the genome at the sequence level.

    Map Viewer provides access to several different types of maps for different organisms. Many of these maps are meaningful only to scientific researchers. A discussion of all the different types of maps and genomic data is beyond the scope of this activity, which will focus only on how to locate a specific gene locus on a chromosome map.

    1. Go to the NCBI Map Viewer website (www.ncbi.nlm.nih.gov/mapview/). In the list of Primates, click on the Build 37.2 link for Homo sapiens (human).


    2. The Map Viewer page for the entire human genome is shown in the screenshot below.

    Homo sapiens genome view: www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606

    3. In Activity 1, we learned that the official symbol for the hereditary hemochromatosis gene is HFE, and its locus is 6p21.3. Let's find the HFE gene on chromosome 6.

    What is a locus?

    The locus for a particular gene describes the region of a chromosome where that gene can be found. For the 6p21.3 locus: 6 is the chromosome number, p indicates the short arm of the chromosome, and 21.3 is the number assigned to a particular band or region on a chromosome. When chromosomes are stained in the lab, light and dark bands appear, and each band is numbered. The higher the number, the farther away the band is from the centromere. A locus containing q is found on the long arm of a chromosome.
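As an aside, the locus notation described in the box above is regular enough to split apart programmatically. The following Python sketch is a hypothetical illustration (the function name `parse_locus` and the returned dictionary layout are assumptions, and it handles only simple loci of the form chromosome + arm + band, not ranges or other cytogenetic notation).

```python
import re


def parse_locus(locus):
    """Split a simple locus such as '6p21.3' into chromosome, arm, and band."""
    match = re.fullmatch(r"(\d{1,2}|X|Y)([pq])(\d+(?:\.\d+)?)", locus)
    if match is None:
        raise ValueError(f"unrecognized locus: {locus!r}")
    chromosome, arm, band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short" if arm == "p" else "long",  # p = short arm, q = long arm
        "band": band,
    }


print(parse_locus("6p21.3"))
# {'chromosome': '6', 'arm': 'short', 'band': '21.3'}
```

Keeping the band as a string avoids treating "21.3" as a decimal number; it is a label for a region, not a measurement.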

    4. In the search box at the top of the Map Viewer page, enter HFE[sym] and then click the Find button to submit your search. Adding the [sym] search field qualifier to the end of your search term specifies your query so that only those results containing the HFE gene are retrieved.


    5. Red tick marks should be displayed on chromosome 6, indicating the approximate location of the HFE gene in the middle of the short arm of chromosome 6 (see screenshot below). The red number (61) labeling chromosome 6 indicates the number of objects mapped to different assemblies of the human genome that include the HFE gene.

    6. Click on the number 6 link below the chromosome. This will open a view of chromosome 6 that should look like the screenshot below. In the next step we will modify this view so we can see an ideogram showing the region of chromosome 6 where the HFE gene can be found.


    7. To modify the display options, click on the Maps & Options button in the upper right corner. This will open a window for customizing map options. Make the following adjustments.

    Remove all maps listed under Maps Displayed (left to right) except the Gene map. To remove a map, select it with your mouse and then click the REMOVE button.

    Under Available Maps select ideogr (you will need to scroll through more than half of the available maps) and then click the ADD button. The ideogram map is a graphic showing the banding pattern of a chromosome.

    The Maps Displayed list should look like the screenshot below. The Gene map should be designated as your master map. To make a map the master, select it with your mouse and then click the Make Master/Move to Bottom button. In the chromosome view, a master map is shown at the right side of the screen along with its details and descriptive text. The Gene map includes links for learning more about the genes mapped to a particular region of genomic sequence on a chromosome.

    Under More Options near the bottom of the window, change Page Length from 30 to 10. The Page Length option is highlighted in the screenshot below. This will adjust the height of the displayed map.

    Before you click the OK button to submit your changes, the options window should resemble the screenshot below.

    8. The new map of chromosome 6 should resemble the screenshot on the next page.


    9. Check out some of Map Viewer's features displayed in the screenshot above.

    The portion of chromosome 6 displayed in Map Viewer is highlighted on the ideogram in the blue navigation column on the left. Notice that the red mark indicating the position of the HFE gene lines up with the ideogram at the 6p22 chromosome band, not 6p21.3.

    Rounded to the nearest thousand, the region of sequence displayed begins at about the 26,086,000th nucleotide and ends at about the 26,100,000th nucleotide of the DNA sequence of chromosome 6. The total DNA sequence for chromosome 6 is about 171 million base pairs long, but this view only shows about 14,000 base pairs.

    Clicking on the Ideogram or Genes_seq maps (not the labels) will open a pop-up window with options for zooming in or out on the displayed maps. Because Map Viewer has zoomed in so far to show the HFE gene, little of the ideogram map is displayed. You can also zoom in and out using the zoom option in the blue navigation column.

    The Genes_seq map provides links to gene-specific entries in other NCBI databases.

    o HFE: Links to the HFE entry in the Entrez Gene database, a compendium of genes and mapped phenotypes.

    o OMIM: Links to the hemochromatosis entry in the Online Mendelian Inheritance in Man (OMIM) database covered in Activity 1.


    o HGNC: Links to the gene symbol report maintained by the HUGO Gene Nomenclature Committee.

    o sv: Links to Sequence Viewer, a graphical interface for investigating the gene's sequence as well as genomic sequence upstream and downstream of the gene.

    o pr: Links to sequence records for the gene's protein product maintained in NCBI's Protein database.

    o dl: Links to a page for downloading the range of sequence data displayed in Map Viewer.

    o ev: Links to Evidence Viewer, a tool for finding biological evidence that supports a particular gene model and for exploring the different types of expressed sequences that align to a particular area within a genome.

    o mm: Links to Model Maker, a tool for building your own version of a gene model by adding or removing exons.

    o hm: Links to HomoloGene, a resource for comparing genes in homologous segments of DNA from different organisms.

    o sts: Links to UniSTS, a comprehensive database that integrates genetic marker and mapping information. A sequence tagged site (STS) is a short (200 to 500 base pairs) DNA sequence that has a single occurrence in the human genome. Detectable by polymerase chain reaction (PCR), STSs are useful for localizing and orienting the sequence data reported from many different laboratories.

    o CCDS: Links to the CCDS project, an effort to ensure that coding regions within the human genome are consistently annotated.

    o SNP: Links to records for single nucleotide polymorphisms (SNPs) and other areas of sequence variation that have been identified in the selected gene.

    10. Let's zoom out to view the entire chromosome using the Maps & Options window.

    Click on Maps & Options again to open the options window.

    Delete the numbers defining the Region Shown at the top of the options window. This will modify the display so it shows the entire chromosome.

    Under More Options near the bottom of the window, change Page Length from 10 to 20. The Page Length option is highlighted in the screenshot on the next page. This will display 20 labeled genes in the master map and should provide enough space on the screen to view the entire chromosome with readable labels for the chromosome bands.

    Once the Maps & Options window resembles the screenshot on the following page, click the OK button to submit your changes.


    11. Your view of chromosome 6 should resemble the screenshot on the next page.

    To see a more comprehensive listing of genes on chromosome 6, select the Data As Table View link in the blue navigation column on the left. The Data As Table View displays 1,000 of the genes on chromosome 6 and shows where genes start and stop in the chromosome's DNA sequence.

    Scroll down to the bottom of the map to examine the Summary of Maps section. Use this information and what you have learned about Map Viewer to answer the Questions for Activity 2 on page 51.


    Activity 3
    Online Resources: NCBI Entrez Gene and GenBank

    Examine gene sequence and structure.

    NCBI Entrez Gene and GenBank

    Entrez Gene is an NCBI resource that serves as a single-query interface for accessing sequence and other biological information for specific genes from a variety of sequenced organisms. GenBank is NCBI's comprehensive repository of annotated DNA sequences.

    This activity covers how to use Entrez Gene to access the genomic DNA sequence of the hereditary hemochromatosis (HFE) gene. Then by examining some different features of a GenBank record for the HFE gene, we will learn about the gene's structure (e.g., intron and exon composition, coding sequence).

    1. To begin, let's go to the Entrez Gene home page (www.ncbi.nlm.nih.gov/gene). In the search box at the top, enter HFE[sym] AND human[orgn] as shown in the screenshot below. Be sure to capitalize any Boolean operator (AND, OR, and NOT) included in your search statements. Then submit your search.


    Search Tip: Adding [sym] to the end of your query term tells Entrez Gene that you are searching by gene symbol only. If you do not specify that you want to search the gene symbol field, the search will return multiple records that include the query term anywhere within a record's content. Adding [orgn] to a search term limits the search to genes from a specific organism. For more information on options for refining your search, see the Search Field Descriptions and Qualifiers section of the Entrez Help Document (www.ncbi.nlm.nih.gov/entrez/query/static/help/Summary_Matrices.html).
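    The same fielded query can also be assembled in a script. The sketch below only builds the query string and a URL; it does not contact NCBI. It assumes NCBI's E-utilities esearch service, which the workbook does not cover, and assumes that service accepts the same [sym] and [orgn] field tags as the web search box. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: NCBI's E-utilities "esearch" endpoint (not part of the workbook).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_query(symbol, organism):
    """Join a gene symbol and an organism with field tags and an uppercase AND."""
    return f"{symbol}[sym] AND {organism}[orgn]"


def build_esearch_url(term):
    """Return a URL that would run `term` against the Gene database."""
    return ESEARCH_URL + "?" + urlencode({"db": "gene", "term": term})


term = build_gene_query("HFE", "human")
print(term)
print(build_esearch_url(term))
```

    Keeping the Boolean operator uppercase in the helper mirrors the capitalization rule given in step 1.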

    2. Submitting this search should retrieve a single result. The HFE record is shown below.

    3. In the Summary section you can find information about the function of the gene's protein product. The HFE protein is thought to have a role in regulating iron transport into cells, and defects in the HFE gene can cause the iron absorption disorder hereditary hemochromatosis. Use information provided in the Summary section to answer Question 1 for Activity 3 in the worksheet on page 52.

    4. Below the summary section is the Genomic regions, transcripts, and products section.

    The sequence viewer box shows a graphic model of the HFE gene consisting of a thin gray line (representing introns that are removed when the mRNA is processed) connected to thicker green boxes (representing exons).

    The portion of the chromosome 6 sequence included in the sequence viewer box is noted in the upper left corner.

    Click on the GenBank link in the upper right corner to access the GenBank record for the HFE gene sequence that is part of the sequence data generated by the International Human Genome Project. A screenshot of this GenBank record is shown on the next page.

    5. Check out some of the following features in the GenBank record.

    At the top we see that only a very small portion of chromosome 6 (from the 26,087,448th base to the 26,097,059th base) is included in this record.

    The first Reference listed for this record identifies the International Human Genome Sequencing Consortium as the source for this sequence information. Thus this sequence is a product of the international Human Genome Project.

    Even after a genome sequence is published in a journal and reported as complete, the research community continues to analyze the genome sequence data and improve the annotation that describes different features encoded within the genome sequence. Note that this record was last modified October 25, 2010.

    6. Scroll down to the FEATURES section of this GenBank record (see screenshot on next page).

    The HFE gene is 9,612 base pairs (bp) long.

    The information in this GenBank record for the HFE gene was "Derived by automated computational analysis using gene prediction method" as a part of the Human Genome Project.

    From the multiple entries for mRNA listed in this record, we see that more than one mRNA transcript can be generated from the HFE gene. For example, an exon included in one mRNA transcript might be left out in another transcript. Each of these different mRNA transcripts from the same gene is known as a variant.
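    The 9,612 bp gene length follows directly from the chromosome 6 coordinates noted in step 5, because both the first and the last base are counted. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def span_length(start, end):
    """Length of a 1-based coordinate range that includes both end points."""
    return end - start + 1


# Chromosome 6 coordinates quoted for this GenBank record.
hfe_start = 26_087_448
hfe_end = 26_097_059

print(span_length(hfe_start, hfe_end))  # 9612
```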

    7. Use your browser's back button to return to the Entrez Gene page for the human HFE gene.

    8. Let's access another GenBank record for the HFE gene sequence to see how information can vary in records that come from different sources. As shown in the screenshot on the next page, select the Related sequences link in the Table of contents box on the right side of the screen.

    9. In the Related Sequences for the HFE gene (see screenshot on the next page), select the genomic sequence record Z92910.1.

    How did you know which genomic sequence to select?

    The problem with archival sequence databases like NCBI's GenBank is that they usually have multiple sequence records for the same gene. You may need to open each record individually and browse through definition, sequence annotation, and comments to determine how much of the gene's nucleotide sequence is contained within each record.

    For example, the U91328.1 record contains the sequence of a genomic segment that not only includes the HFE gene sequence but also sequences for other genes. Y09801.1 contains only sequence information for the HFE promoter and the HFE gene's first exon. Of the genomic records listed, Z92910.1 has the most complete sequence information for the HFE gene.

    In sequence databases such as GenBank, genomic DNA sequence records for eukaryotic organisms contain both exons and introns, while mRNA sequences are intron-free DNA sequences. All sequences in GenBank and similar repositories use the single-letter abbreviations for the DNA bases adenine (A), cytosine (C), guanine (G), and thymine (T) to represent each nucleotide. Even mRNA sequence records use A, C, G, and T where T is used to replace each uracil (U) in the mRNA sequence.
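    Because mRNA records are stored in the DNA alphabet, recovering the RNA spelling is a one-letter substitution. A minimal sketch, using a made-up six-base fragment rather than real HFE sequence (function names are hypothetical):

```python
def dna_alphabet_to_rna(seq):
    """Rewrite a sequence stored with T so that it uses U, as in actual mRNA."""
    return seq.upper().replace("T", "U")


def rna_to_dna_alphabet(seq):
    """Rewrite an RNA sequence using the A, C, G, T alphabet used in GenBank records."""
    return seq.upper().replace("U", "T")


stored = "ATGGCT"  # made-up fragment, as it would appear in a sequence record
print(dna_alphabet_to_rna(stored))
```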

    Related Sequences for the HFE Gene in Entrez Gene

    10. A screenshot of the GenBank record Z92910.1 for the HFE gene is shown on the next page.

    The DNA sequence included in this record is 12,146 base pairs (bp) long. In addition to containing the genomic sequence of the HFE gene, this record also contains several hundred additional base pairs of sequence upstream and downstream of the gene.

    This record was originally submitted by a researcher to GenBank in 1997, so the sequence of the HFE gene was known several years before the Human Genome Project was complete.

    Scroll down to the FEATURES section of this record and use this information to answer Questions 2–4 for Activity 3 on page 52. Note that clicking on the gene link in the FEATURES section shows that the length of the HFE gene is different from what we observed in the GenBank record examined in step 5 of this activity.

    GenBank Record Z92910.1 for Human HFE Gene

    11. Some features of the sequence in GenBank record Z92910.1 include

    source: Required for every GenBank record, the source provides the entire sequence length and the scientific name of the source organism. Other types of source information may include chromosome number, map location, and clone or strain identification.

    gene: This feature provides nucleotide numbers indicating where the gene stops and starts. This link opens a new sequence record that shows only the gene sequence.

    exon: This feature provides nucleotide numbers indicating where each exon begins and ends. You will see several of these entries as you scroll down. Each exon is a sequence segment that codes for a portion of processed (intron-free) mRNA. The name of the gene to which the exon belongs and the exon number are provided. An exon link opens a new sequence record that shows only the exon sequence.

    CDS: The coding sequence (CDS) consists of nucleotides that actually code for amino acids of the protein product. This feature includes the coding sequence's amino acid translation and may also contain gene name, gene product function, a link to protein sequence record, and cross-references to other database entries. A CDS link opens a new sequence record that shows only the coding sequence.

    intron: This feature provides the nucleotide numbers indicating where each intron begins and ends. An intron is a segment of noncoding sequence that is transcribed but removed from the transcript by splicing together the exons on either side of it. An intron link opens a new sequence record that shows only the intron sequence.
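    The nucleotide numbers in these features are written as location strings such as 10..20, or join(10..20,31..45) when a feature is made of several segments. The sketch below reads such strings and adds up the segment lengths. It is a simplified illustration: the coordinates are invented (not taken from Z92910.1), the function names are hypothetical, and strand or partial-end markers such as complement() and ">" are not handled.

```python
import re


def parse_location(location):
    """Return (start, end) pairs found in a simple GenBank-style location string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def total_length(location):
    """Sum the lengths of all segments, counting both end points of each."""
    return sum(end - start + 1 for start, end in parse_location(location))


example = "join(10..20,31..45)"  # invented coordinates for illustration
print(parse_location(example))
print(total_length(example))
```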

    What's the difference between exons and coding sequence?

    Exons often are described as short segments of protein coding sequence. This is a bit of an oversimplification. Exons are segments of sequence spliced together after introns have been removed from pre-mRNA. Exons carry the coding sequence of a gene, but some exons may contain no coding sequence. Portions of exons or even entire exons may contain sequence that is not translated into amino acids. These are the untranslated regions (UTR) of mRNA. UTRs are found upstream and downstream of the protein-coding sequence. See diagram on right.
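    The distinction between exons and coding sequence can be made concrete with a toy gene. In the sketch below every sequence and coordinate is invented. Splicing the two exons gives the processed mRNA, and the coding sequence is only the part running from the start codon through the first in-frame stop codon, so each exon also carries some untranslated sequence. Taking the first ATG as the start is a simplifying assumption; in a real record the CDS feature states the coding coordinates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Toy gene (invented): exon 1, one intron, exon 2.
exon1 = "GGCACC" + "ATGGCT"   # 5' UTR followed by the start of the coding sequence
intron = "GTAAGTTTTCAG"
exon2 = "TGCTAA" + "TTTCCC"   # end of the coding sequence (stop TAA) followed by 3' UTR
genomic = exon1 + intron + exon2

# 1-based inclusive coordinates, in the style of GenBank exon features.
exons = [(1, 12), (25, 36)]


def splice_exons(sequence, exon_ranges):
    """Join the exon segments, which removes the introns between them."""
    return "".join(sequence[start - 1:end] for start, end in exon_ranges)


def find_cds(mrna):
    """Return the sequence from the first ATG through the first in-frame stop codon."""
    start = mrna.find("ATG")
    if start == -1:
        return ""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return ""


mrna = splice_exons(genomic, exons)
print(mrna)
print(find_cds(mrna))
```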

    12. Sequence information in a GenBank record can also be displayed using graphics in the NCBI Sequence Viewer. To access Sequence Viewer from a GenBank record, click on the Graphics link in the upper left corner (as shown below).

    13. The Sequence Viewer option for GenBank record Z92910.1 is shown in the screenshot below.

    The top panel displays the entire sequence included in the GenBank record, the green bar represents the HFE gene sequence, and the blue outline of a box with arrows indicates which portion of the sequence is shown in the panel below. Click and drag the arrows on the blue-box outline to change how much of the sequence is displayed in the lower panel.

    You can also use the arrows on the left side of the lower panel to move along the sequence and see where exons and other gene features begin and end. The slider below the arrows can be used to zoom in and out on the sequence.

    Activity 4 Online Resources: UniProt Protein Knowledgebase and BLAST Searching

    Access the amino acid sequence of a gene's protein product.

    Compare the HFE protein sequence with protein sequences of other organisms.

    UniProt Protein Knowledgebase and BLAST Searching

    The Protein Knowledgebase, which is part of the Universal Protein Resource (UniProt), is a comprehensive, freely accessible database that the scientific community uses to access high-quality protein sequence and functional information. This activity covers how to use UniProt to learn about the amino acid sequence and other features of the hereditary hemochromatosis protein.

    1. Go to the UniProt home page (www.uniprot.org), enter HFE into the query box as shown in the screenshot below, and then submit your search.

    2. From the list of results (shown in the screenshot on the next page), notice that some entries have gold stars and others have gray stars. Those with gold stars have descriptions of protein functions and characteristics that have been manually reviewed by experts. Entries with gray stars have descriptions that were automatically generated, and experts have not yet reviewed these records. Thus selecting a search result with a gold star will provide you with richer, higher quality information about a protein.

    Select accession number Q30201 for the HFE_HUMAN entry for the hereditary hemochromatosis protein.

    3. The UniProt entry for the HFE protein is shown on the next page. The blue navigation bar at the top of the screen contains links to different parts of the UniProt record for this protein.

    Make a note of the accession number (Q30201) for this protein. We will use the accession number to search for protein structural information in Activity 5.

    Scroll down through the record and review the Protein attributes and the General annotation sections to answer Questions 1–3 for Activity 4 in the worksheet on page 52.

    4. In the Protein attributes section, for Sequence processing, note "The displayed sequence is further processed into a mature form." This means that part of the HFE protein chain needs to be cut off by a proteolytic enzyme to form the mature functional protein.

    5. Click on Sequence annotation in the blue navigation bar near the top of the record (marked in the screenshot above).

    6. The Sequence annotation section of the HFE protein record is shown in the screenshot on the next page.

    Under Molecule processing in the Sequence annotation section, notice that the signal peptide comprises amino acids 1–22. The first 22 amino acids are not associated with any domains (functional units within a protein). This portion is cleaved from the complete protein sequence to make the mature, functional HFE protein, which consists of amino acids 23–348. Clicking on the blue Position(s) numbers in the sequence annotation will open a window showing the selected sequence highlighted within the context of the entire protein sequence.

    In Activity 1 we learned that the cysteine at amino acid position 282 is changed to a tyrosine in a common mutation that causes hemochromatosis. Review the Regions and Amino acid modifications parts of the Sequence annotation section to answer Questions 4–5 for Activity 4 on page 52.
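    Removing the signal peptide amounts to dropping the first 22 residues of the 348-residue precursor, which leaves a mature chain of 326 residues. A minimal sketch, using a placeholder string rather than the real HFE sequence (the function name is hypothetical):

```python
SIGNAL_PEPTIDE_END = 22   # last residue of the signal peptide (amino acids 1-22)
PRECURSOR_LENGTH = 348    # length of the complete HFE protein sequence


def mature_chain(precursor, signal_end=SIGNAL_PEPTIDE_END):
    """Return the chain that remains after the signal peptide is cleaved off."""
    return precursor[signal_end:]


# Placeholder sequence of the right length; X stands in for the real residues.
placeholder = "M" + "X" * (PRECURSOR_LENGTH - 1)
print(len(mature_chain(placeholder)))  # 326
```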

    7. Scroll down to the Secondary structure part of the Sequence annotation section (shown in image below) and click on Details below the colored bar.

    8. The secondary structure details show which segments of protein sequence make up beta strands, alpha helices, or the turns that form between beta strands and alpha helices. These secondary elements are important in determining the three-dimensional protein structure. Use this secondary structural information to answer Question 6 for Activity 4 on page 52.

    9. Return to the top of the HFE protein record by scrolling or by clicking Names in the blue navigation bar. Click on the Blast tab at the top of the page.

    NOTE: BLAST (Basic Local Alignment Search Tool) is a tool used to calculate how similar nucleotide or protein sequences are among the same or different kinds of organisms. Many resources that maintain biological sequence information often support their own BLAST searching capabilities to retrieve and compare sequence data. For more information about BLAST, see The NCBI Handbook (www.ncbi.nlm.nih.gov/books/NBK21097/).

    Protein sequences are often preferred over nucleotide sequences for BLAST searching because of the greater variability in nucleotide sequences. Remember with the genetic code, different codons of nucleotides can specify the same amino acid. Thus proteins that have similar amino acid sequences may have considerably different nucleotide sequences encoding those proteins.
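    A small example shows how far two nucleotide sequences can drift while still encoding the same protein. The two 12-base sequences below are invented, and the codon dictionary holds only the codons they use (all taken from the standard genetic code). Both translate to the same four amino acids even though fewer than half of their bases match.

```python
# Partial codon table: only the codons needed for this illustration.
CODONS = {
    "ATG": "M",
    "CTG": "L", "TTA": "L",
    "TCT": "S", "AGC": "S",
    "CGT": "R", "AGA": "R",
}


def translate(dna):
    """Translate a DNA string codon by codon using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def matching_positions(a, b):
    """Count positions at which two equal-length sequences carry the same letter."""
    return sum(1 for x, y in zip(a, b) if x == y)


seq_a = "ATGCTGTCTCGT"  # invented
seq_b = "ATGTTAAGCAGA"  # invented, same protein with different codons

print(translate(seq_a), translate(seq_b))
print(matching_positions(seq_a, seq_b), "of", len(seq_a), "bases match")
```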

    10. A screenshot of the BLAST search feature for the HFE protein is shown below.

    The amino acid sequence of the complete HFE protein is automatically entered into the text box on the left. The single-letter abbreviations used to represent each amino acid are explained in the Table of Standard Genetic Code on page 50.

    Click on the Blast button to compare the amino acid sequence of the HFE protein with all the sequences available from the UniProt Knowledgebase. Be patient. A BLAST search may take several minutes depending on how busy the server is.

    11. Once the results are retrieved, scroll down to the Detailed BLAST results (see screenshot on next page).

    The Identity column on the right provides the percent of each entry's amino acid sequence that is identical to the sequence submitted. To sort all of your results from highest to lowest Identity values, click on the arrows at the top of the Identity column. To see more results, click Next in the upper right corner.

    Use the Detailed BLAST results to answer Question 7 for Activity 4 on page 53.
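    To give a feel for what an identity percentage measures, the sketch below counts identical positions in two sequences that have already been aligned, with "-" marking a gap. It is a simplified illustration under stated assumptions: the two ten-residue fragments are invented, the function name is hypothetical, and BLAST's own alignment and scoring steps are not reproduced here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences show the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


query = "MKTAYIAKQR"    # invented fragment
subject = "MKTAHIA-QR"  # invented fragment with one substitution and one gap

print(percent_identity(query, subject))
```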

    Detailed BLAST Results for the HFE Protein in UniProt

    Activity 5 Online Resources: Protein Data Bank

    Explore the sequence and structure of the gene's protein product.

    Protein Data Bank

    This activity demonstrates how to find and view a protein structure using tools and resources available from the Protein Data Bank (PDB). PDB is an international archive of 3-D structural information for biological macromolecules. PDB records provide access to several interactive molecular graphics programs. This activity also uses FirstGlance in Jmol, a resource that works in most browsers for viewing the major molecular features of a structure with just a few mouse-clicks.

    Before You Begin

    Many features of the PDB website require newer Web browsers with JavaScript and cookies enabled, and pop-ups should not be blocked. For more information on system requirements see PDB Frequently Asked Questions (www.rcsb.org/pdb/static.do?p=home/faq.html).

    Some Protein Structure Basics

    Proteins are created by linking amino acids in a linear fashion to form polypeptide chains. The amino acid sequence of a polypeptide chain is the primary structure of a protein. See the Table of Standard Genetic Code on page 50 for single-letter and three-letter abbreviations for the 20 different amino acids.

    Amino acids have different chemical properties. For example, some amino acid residues are strictly hydrophobic (water fearing) and must be protected from aqueous environments, while other amino acids are hydrophilic (water loving). The substitution of just one amino acid for another with very different chemical properties can have serious consequences for a protein's structure and function.

    The folding of regions within the polypeptide chain into alpha helices and beta sheets is a protein's secondary structure.

    The packing of the entire polypeptide chain into a three-dimensional globular unit is a protein's tertiary structure.

    If a protein molecule is a complex of more than one polypeptide chain, then the complete structure of this molecular complex is called a protein's quaternary structure.

    A domain is a discrete portion of a protein with its own function and specific three-dimensional structure. The combination of domains in a single protein determines its overall function.

    Different parts of a polypeptide chain can be linked by disulfide bridges that form between two cysteine residues. Disulfide bridges (or disulfide bonds) stabilize a protein's three-dimensional structure. The loss of a disulfide bridge would be detrimental to a protein's overall structure.

    Finding a Structure Record in PDB

    1. To begin, let's go to the Protein Data Bank (www.rcsb.org/pdb/).

    NOTE: If you are new to PDB, be sure to check out the Education links in the light blue column on the left of the screen. Under Educational Resources you can find posters, tutorials, activities, and lessons. Molecule of the Month is a collection of vignettes, each featuring a different molecular structure and its importance to human welfare.

    2. Beside the search box at the top of the PDB home page, select Advanced Search.

    3. On the Advanced Search page, from the Choose a Query Type drop box select UniProtKB Accession Number(s). In Activity 4 we accessed the human hemochromatosis protein record Q30201 in the UniProt Protein Knowledgebase. Enter Q30201 in the search box. The advanced search page should look like the screenshot below. Select the Submit Query button to submit your search.

    4. The search should return two hits. Scroll down the page to see a brief summary of each search result. One record (1DE4) provides structural information on the hemochromatosis protein HFE complexed with a receptor, and the other record (1A6Z) just provides structural information for the HFE protein. Click on 1A6Z HFE (HUMAN) HEMOCHROMATOSIS PROTEIN to open this PDB record.

    5. The summary tab of the 1A6Z record is shown in the screenshot above.

    Note the Molecular Description box in the center of the screenshot. This structure is a complex of four polymer chains: A, B, C, and D. A and C are identical HFE polypeptide chains, and B and D are identical chains of another protein called beta-2-microglobulin.

    Note the Primary Citation in the 1A6Z record. The best way to learn about structure details is to access the article listed as the primary citation. Although the full text for some articles may be freely available online, many articles are accessible only by subscription. Some university research libraries may provide public access to their journal collections. The article for this structure has been accessed to reveal the following details:

    o Only the soluble portion of the HFE polypeptide chain is included in the 1A6Z structure. The transmembrane domain is missing, so the HFE protein in this structure has only 275 of the 348 amino acids in the complete HFE protein sequence.

    o The first 22 amino acids of the HFE polypeptide sequence have been excluded because they are not part of the mature, functional protein. Therefore, the first amino acid in this structure is really the 23rd, and cysteine 260 is the cysteine residue involved in the CYS282TYR mutation that we learned about in Activity 1.

    o Each HFE polypeptide chain is complexed with another polypeptide chain called beta-2-microglobulin.

    o The 1A6Z structure consists of two HFE–beta-2-microglobulin complexes.

    6. Select the Sequence tab to examine the sequence and secondary structure details for this structure.

    7. The Sequence and Structure Details for record 1A6Z are shown in the screenshot below.

    The HFE protein sequence (polypeptide chain A) is presented first. Each letter in the protein sequence represents a different amino acid. C stands for cysteine. See the Table of Standard Genetic Code on page 50 to determine which amino acid is represented by each letter.

    Secondary structure details are mapped onto sequence details. Different graphical symbols are used to represent extended beta strands, alpha helixes, bends, and turns.

    8. Select the display external (UniProtKB) sequence link highlighted in the previous screenshot.

    9. The sequence page will reload and display the amino acid numbers for the UniProt HFE protein sequence (that we examined in Activity 4) above the line of single-letter amino acid abbreviations (see screenshot below).

    Find cysteine 282 in the UniProt sequence. Cysteine 282 is the amino acid that is replaced by tyrosine in the CYS282TYR mutation.

    You will see that cysteine 282 in the UniProt sequence is at position 260 in the PDB structure sequence. In Activity 4, we learned that cysteine 282 forms a disulfide bond with cysteine 225 in the UniProt HFE protein sequence. In the HFE protein sequence for PDB structure 1A6Z, we see that cysteine 260 forms a disulfide bond with cysteine 203 (which corresponds to cysteine 225 in the UniProt sequence). Disulfide bonds are critical to forming the proper structural arrangement needed to make a functional protein; therefore, the loss of cysteine 260 would be detrimental to protein structure. Answer the first two questions for Activity 5 in the worksheet on page 54.
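    The two numbering schemes differ by the 22 residues of the signal peptide, so converting between them is a fixed shift. A minimal sketch of that conversion (function names are hypothetical, and the 22-residue offset applies to this structure as described above; other PDB entries may be numbered differently):

```python
SIGNAL_PEPTIDE_LENGTH = 22  # residues present in UniProt Q30201 but absent from 1A6Z


def uniprot_to_pdb(position):
    """Convert a UniProt HFE residue number to the numbering used in 1A6Z."""
    if position <= SIGNAL_PEPTIDE_LENGTH:
        raise ValueError("signal peptide residues are not part of the structure")
    return position - SIGNAL_PEPTIDE_LENGTH


def pdb_to_uniprot(position):
    """Convert a 1A6Z residue number back to UniProt numbering."""
    return position + SIGNAL_PEPTIDE_LENGTH


print(uniprot_to_pdb(282))  # 260
print(uniprot_to_pdb(225))  # 203
```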

    Viewing the Structure

    10. Select the Summary tab near the top of the page to return to the 1A6Z record summary. In the Biological Assembly 1 box in the upper right corner of the page there are several options for viewing the molecular structure. Clicking on the More Images link will open a page with options for downloading a still image of the HFE molecular complex 1A6Z. Although PDB provides access to several different molecular viewers for examining a 3-D representation of a molecular complex, many of these options were designed for scientists who specialize in studying molecular structures. In this activity, we will use a molecular viewer called FirstGlance in Jmol, which is one of the more user-friendly options for displaying the major structural features of a molecule. FirstGlance in Jmol was developed to work in all popular web browsers without having to download and install anything.

    11. To access FirstGlance, first click on the left arrow next to the Biological Assembly 1 label above the molecular image. This should change the box label to Asymmetric Unit.

    12. By clicking on the arrow next to Other Viewers, a drop-down menu will appear. Select FirstGlance from the drop-down menu (see screenshot below).

    13. A new page should open displaying structure 1A6Z using FirstGlance in Jmol (see screenshot below).

    To stop the spinning of the molecule, click the Spin box in the upper left.

    To remove the S- labels, uncheck the Show box beside Labels X, S-, ?.

    14. The structure is initially displayed using the Cartoon option, which assigns a different color to each molecular chain in the structure. Chains A, B, C, and D should be displayed. Earlier in the activity we learned that chains A and C are identical HFE chains and chains B and D are identical beta-2-microglobulin chains.

    Clicking anywhere on the molecule will generate a label in the lower left corner showing the amino acid residue and the protein chain that you have selected.

    Click on each colored chain to find Chain A, which is one of the two HFE protein chains. In the screenshot on the next page, Chain A is the blue chain.

    If you need to rotate the structure, simply click on the structure and drag with your mouse.

    To undo any of the changes you have made and reset the structure to its original configuration, click Reset in the upper left corner.

    Clicking on the blue chain produces a label in the lower left indicating that the blue chain is Chain A.

    15. Let's hide all the chains except Chain A. Click on the Hide.. link in the upper left corner and then click on each chain except Chain A. Your screen should look like the screenshot below.

    16. Click on the Center Visible Chains link (highlighted in screenshot above) to place Chain A in the center of the display panel.

    17. Once Chain A is centered, use the Zoom tool to enlarge Chain A. In addition to using the Zoom arrows in the upper left corner, you can zoom in and out by clicking on the background of the structure and then using the wheel on your mouse. Alternatively, you can hold down the Shift key and drag the mouse up and down over the molecule to zoom in and out. Your screen should look something like the screenshot below.

    18. Let's find cysteine 260 and cysteine 203 (the cysteine residues that form the disulfide bond involved in the CYS282TYR mutation). Click on the Find.. link (highlighted in screenshot above).

    19. The Find option (shown in the screenshot on the next page) allows you to search for particular residues within a molecule. The locations of the residues are indicated using yellow dots. The background color automatically changes to black when you select Find. A black background makes the yellow dots easier to see. You can toggle between black and white background colors by clicking on the Background box in the upper left corner.

    Type CYS260,CYS203 into the text box.

    Press the Enter key on your keyboard to submit your search.

    Yellow dots should indicate where these two residues are in the protein chain. You may need to rotate the structure by clicking and dragging your mouse over the molecule so that you can obtain a good view of the yellow dots. Note that the yellow dots surround a thin gold bar. This thin gold bar represents a disulfide bond. You can see that a bond between cysteines 203 and 260 would create a strong connection between two different strands within the protein.

    Finding Cysteine Residues in the HFE Protein

    20. To obtain a better view of the disulfide bonds in the HFE protein, click on the More Views.. link in the upper left corner, and then click on the Disulfide Bonds: Show All link. The page should change so that it looks like the screenshot below. The backbone of the protein chain is modified to a thin line (which is difficult to see in the screenshot), and the disulfide bonds become thicker and easier to see. The cysteine residues are also labeled. Answer Questions 3–4 for Activity 5 in the worksheet on page 54.

    21. Now that you are familiar with a few options for modifying a molecular structure using FirstGlance, you may want to Reset the structure and practice what you have learned. In addition to the display options in the upper left corner of the screen, you can also use pop-up menus to modify the structure by clicking on Jmol in the lower right corner of the display panel (highlighted in the previous screenshot).

    22. If you are interested in copying or saving a particular view of a structure that you have created, check out the Presenting Molecular Views from FirstGlance in Jmol page (molvis.sdsc.edu/fgij/slides.htm).

    Protein Structure and Hereditary Hemochromatosis Development

    By examining the HFE protein's sequence and structure, we discover that the cysteine lost in the CYS282TYR mutation has an important role in establishing the correct three-dimensional HFE structure. In this mutation, a cysteine residue is replaced by another amino acid, tyrosine, and the disulfide bond between two cysteines in the polypeptide chain is lost. This is detrimental to the protein's structure. As a result, the HFE protein can no longer perform its normal function of regulating iron uptake, and cells become overloaded with iron. This buildup of iron in cells, if untreated, can lead to organ damage and other complications.

    Table of Standard Genetic Code forTranslating DNA Sequence Records

    First   Second base
    base    T                  C            A            G

    T       TTT Phe (F)        TCT Ser (S)  TAT Tyr (Y)  TGT Cys (C)
            TTC Phe (F)        TCC Ser (S)  TAC Tyr (Y)  TGC Cys (C)
            TTA Leu (L)        TCA Ser (S)  TAA STOP     TGA STOP
            TTG Leu (L)        TCG Ser (S)  TAG STOP     TGG Trp (W)

    C       CTT Leu (L)        CCT Pro (P)  CAT His (H)  CGT Arg (R)
            CTC Leu (L)        CCC Pro (P)  CAC His (H)  CGC Arg (R)
            CTA Leu (L)        CCA Pro (P)  CAA Gln (Q)  CGA Arg (R)
            CTG Leu (L)        CCG Pro (P)  CAG Gln (Q)  CGG Arg (R)

    A       ATT Ile (I)        ACT Thr (T)  AAT Asn (N)  AGT Ser (S)
            ATC Ile (I)        ACC Thr (T)  AAC Asn (N)  AGC Ser (S)
            ATA Ile (I)        ACA Thr (T)  AAA Lys (K)  AGA Arg (R)
            ATG Met (M) START  ACG Thr (T)  AAG Lys (K)  AGG Arg (R)

    G       GTT Val (V)        GCT Ala (A)  GAT Asp (D)  GGT Gly (G)
            GTC Val (V)        GCC Ala (A)  GAC Asp (D)  GGC Gly (G)
            GTA Val (V)        GCA Ala (A)  GAA Glu (E)  GGA Gly (G)
            GTG Val (V)        GCG Ala (A)  GAG Glu (E)  GGG Gly (G)

    Key to the Table of Standard Genetic Code

    Alanine     ALA  A      Arginine       ARG  R
    Asparagine  ASN  N      Aspartic acid  ASP  D
    Cysteine    CYS  C      Glutamic acid  GLU  E
    Glutamine   GLN  Q      Glycine        GLY  G
    Histidine   HIS  H      Isoleucine     ILE  I
    Leucine     LEU  L      Lysine         LYS  K
    Methionine  MET  M      Phenylalanine  PHE  F
    Proline     PRO  P      Serine         SER  S
    Threonine   THR  T      Tryptophan     TRP  W
    Tyrosine    TYR  Y      Valine         VAL  V

    START = Initiation Signal (signifies the beginning of a polypeptide chain)

    STOP = Termination Signal (signifies the end of a polypeptide chain)
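
    The table can also be applied by a short program. The Python sketch below is one possible way to do this and is not part of the original workbook: it builds the 64-codon lookup in the same order as the table (first base, then second base, then third base, each running T, C, A, G) and translates a sequence codon by codon until it reaches a STOP. The names CODON_TABLE and translate, and the 12-base example sequence, are made up for illustration; the example is not taken from any GenBank record.

```python
# Minimal sketch: translate a DNA coding sequence with the standard genetic code.
BASES = "TCAG"

# One-letter amino acids ordered by first base, then second base, then third
# base, each running T, C, A, G. "*" marks a STOP codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)

# Map every three-base codon to its one-letter amino acid.
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base, three bases at a time, up to the first STOP."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # raises KeyError on non-TCAG letters
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence: ATG (Met), TGC (Cys), TAC (Tyr), TAA (STOP).
print(translate("ATGTGCTACTAA"))
```

    In the example, the codons TGC and TAC differ by a single base, which shows how one base change in a DNA sequence can replace a cysteine with a tyrosine.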


    Hereditary Hemochromatosis Worksheet

    This worksheet provides questions to be answered as you complete the activities in the Gene Gateway Workbook.

    Questions for Activity 1

    1) What is the official gene symbol of the hereditary hemochromatosis gene?

    2) Which allelic variant (genetic mutation) most commonly causes hereditary hemochromatosis?

    3) What are some characteristics of hereditary hemochromatosis? How is it treated?

    Questions for Activity 2

    1) On the diagram to the right, mark the general region where the HFE gene can be found on chromosome 6.

    2) About how many genes are on chromosome 6?

    3) How long is the DNA sequence for chromosome 6?


    Questions for Activity 3

    1) Using the summary from the Entrez Gene record for the HFE gene, briefly describe the function of the gene's protein product.

    Use the GenBank sequence record Z92910.1 to answer questions 2-4.

    2) In the Features section of record Z92910.1, select the gene link. How many base pairs (bp) are in the genomic sequence of the HFE gene?

    3) Scroll through the Features section of the gene sequence in Z92910.1. How many exons have been identified in this sequence?

    4) Return to the main record Z92910.1. Select the CDS link. How many base pairs are in the coding sequence?
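
    GenBank records describe a feature's position with spans written as start..end, and a coding sequence split across several exons is usually written as join(...) around a list of spans. As a rough illustration of how a length in base pairs follows from such a location, the Python sketch below adds up the spans. The function names and the coordinates in the example are invented for illustration; they are not the coordinates found in Z92910.1.

```python
import re


def location_length(location: str) -> int:
    """Add up the lengths of every start..end span in a GenBank location string."""
    total = 0
    for start, end in re.findall(r"(\d+)\.\.(\d+)", location):
        # Both endpoints are part of the span, hence the +1.
        total += int(end) - int(start) + 1
    return total


def protein_length(cds_bp: int) -> int:
    """Amino acids encoded by a complete CDS whose last codon is a STOP (assumption)."""
    return cds_bp // 3 - 1


# Hypothetical three-exon coding sequence, NOT taken from record Z92910.1.
example_cds = "join(101..175,1201..1464,1801..2076)"

cds_bp = location_length(example_cds)
print(cds_bp, "bp in the coding sequence")
print(protein_length(cds_bp), "amino acids, if the final codon is a STOP")
```

    The sketch only handles spans written as start..end. Single-base positions and references to other records would need extra handling.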

    Questions for Activity 4

    1) How many amino acids (AA) are in the complete HFE protein?

    2) In what part of the cell is the HFE protein located?

    3) What type of tissue does not express the HFE protein?

    4) Is cysteine 282 found on the extracellular or cytoplasmic side of the HFE protein?

    5) What is the number of the cysteine residue that forms a disulfide bond with cysteine 282?

    6) What kind of secondary structural element contains cysteine 282: alpha helix, turn, or beta strand?


    7) Using the BLAST search results, list the first 10 non-human organisms that have proteins similar to the human HFE protein sequence. Include the percent identity score with each organism you list, and order the list from highest to lowest identity score. Skip any human entries, and do not list any organism more than once.

    Organism Name        Identity %

    1.

    2.

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.
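
    If the BLAST hits are copied into a list, the selection rules in question 7 can be applied mechanically. The Python sketch below is only an illustration of those rules: it walks the hits in the order BLAST reported them, skips human entries and repeated organisms, stops after ten, and then sorts by percent identity. The function name, the placeholder organism names, and the identity values are all invented; they are not real BLAST output.

```python
def top_nonhuman_hits(hits, limit=10):
    """Pick the first `limit` distinct non-human organisms, highest identity first.

    `hits` is a list of (organism, percent_identity) pairs in BLAST result order.
    """
    seen = set()
    selected = []
    for organism, identity in hits:
        # Skip human entries and organisms that were already listed.
        if organism == "Homo sapiens" or organism in seen:
            continue
        seen.add(organism)
        selected.append((organism, identity))
        if len(selected) == limit:
            break
    # Order the chosen organisms from highest to lowest percent identity.
    return sorted(selected, key=lambda hit: hit[1], reverse=True)


# Placeholder hits for illustration only (not real search results).
example_hits = [
    ("Homo sapiens", 100.0),
    ("Species A", 98.5),
    ("Homo sapiens", 99.0),
    ("Species B", 91.2),
    ("Species A", 97.0),
    ("Species C", 93.4),
]

for organism, identity in top_nonhuman_hits(example_hits):
    print(f"{organism:<12} {identity:5.1f}")
```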


    Questions for Activity 5

    1. Examine the amino acid sequence for the human HFE protein from the UniProt Protein Knowledgebase (shown below). Find cysteine 282, the amino acid that is replaced by tyrosine in the CYS282TYR mutation. Refer to the Table of Standard Genetic Code on page 50 for help with the single-letter amino acid abbreviations.

            10         20         30         40         50         60
             |          |          |          |          |          |
    MGPRARPALL LLMLLQTAVL QGRLLRSHSL HYLFMGASEQ DLGLSLFEAL GYVDDQLFVF

            70         80         90        100        110        120
             |          |          |          |          |          |
    YDHESRRVEP RTPWVSSRIS SQMWLQLSQS LKGWDHMFTV DFWTIMENHN HSKESHTLQV

           130        140        150        160        170        180
             |          |          |          |          |          |
    ILGCEMQEDN STEGYWKYGY DGQDHLEFCP DTLDWRAAEP RAWPTKLEWE RHKIRARQNR

           190        200        210        220        230        240
             |          |          |          |          |          |
    AYLERDCPAQ LQQLLELGRG VLDQQVPPLV KVTHHVTSSV TTLRCRALNY YPQNITMKWL

           250        260        270        280        290        300
             |          |          |          |          |          |
    KDKQPMDAKE FEPKDVLPNG DGTYQGWITL AVPPGEEQRY TCQVEHPGLD QPLIVIWEPS

           310        320        330        340
             |          |          |          |
    PSGTLVIGVI SGIAVFVVIL FIGILFIILR KRQGSRGAMG HYVLAERE

    2. Compare the amino acid sequence above with the HFE sequence details provided for PDB structure 1A6Z. In question 1, underline the portion of the amino acid sequence included in the PDB structure.

    3. How many disulfide bonds are present in the hereditary hemochromatosis protein?

    4. Why is the cysteine residue affected in the CYS282TYR mutation important?
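
    Counting to residue 282 by hand is easy to get wrong, so the position can be double-checked with a few lines of Python. The sketch below is an optional aid rather than part of the worksheet. It uses the sequence exactly as printed in question 1; the function names are made up for illustration. Residue numbers start at 1, while Python indexes start at 0, so position 282 is index 281.

```python
# HFE amino acid sequence as printed in question 1 (blocks of 10 residues).
HFE_SEQUENCE = "".join("""
    MGPRARPALL LLMLLQTAVL QGRLLRSHSL HYLFMGASEQ DLGLSLFEAL GYVDDQLFVF
    YDHESRRVEP RTPWVSSRIS SQMWLQLSQS LKGWDHMFTV DFWTIMENHN HSKESHTLQV
    ILGCEMQEDN STEGYWKYGY DGQDHLEFCP DTLDWRAAEP RAWPTKLEWE RHKIRARQNR
    AYLERDCPAQ LQQLLELGRG VLDQQVPPLV KVTHHVTSSV TTLRCRALNY YPQNITMKWL
    KDKQPMDAKE FEPKDVLPNG DGTYQGWITL AVPPGEEQRY TCQVEHPGLD QPLIVIWEPS
    PSGTLVIGVI SGIAVFVVIL FIGILFIILR KRQGSRGAMG HYVLAERE
""".split())


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position."""
    return sequence[position - 1]


def positions_of(sequence: str, residue: str) -> list:
    """Return every 1-based position where the given residue occurs."""
    return [i + 1 for i, aa in enumerate(sequence) if aa == residue]


def apply_substitution(sequence: str, old: str, position: int, new: str) -> str:
    """Replace one residue, checking first that the expected residue is there."""
    if residue_at(sequence, position) != old:
        raise ValueError(f"position {position} is not {old}")
    return sequence[:position - 1] + new + sequence[position:]


print("Residue 282:", residue_at(HFE_SEQUENCE, 282))
print("Cysteine positions:", positions_of(HFE_SEQUENCE, "C"))

# CYS282TYR: cysteine (C) at position 282 replaced by tyrosine (Y).
mutant = apply_substitution(HFE_SEQUENCE, "C", 282, "Y")
print("Mutant residue 282:", residue_at(mutant, 282))
```

    Listing the cysteine positions gives the candidates to check against the disulfide bond annotations in the UniProt record. The sequence alone does not show which cysteines are actually bonded.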