A Comparative Genomic Analysis of the Bacterium Staphylococcus aureus · 2018-04-30


A Comparative Genomic Analysis of the Bacterium Staphylococcus aureus

Marisa Benjamin, Biology, BIO 490 Genomic Bioinformatics

Results

Figure 3. Data quantity and quality for Strain 1 (SAMEA104137068), a strain isolated from an endocarditis infection. The FastQC output shows the impact of running Trimmomatic. The box-and-whisker plot shows base-call quality within reads.
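A minimal sketch of how the per-position quality values behind a plot like Figure 3 can be derived from FASTQ quality strings. It assumes the common Phred+33 encoding; the function names are hypothetical and this is not FastQC's own implementation.

```python
from statistics import median


def phred_scores(quality_line, offset=33):
    """Decode one FASTQ quality string into per-base Phred scores.

    Assumes Phred+33 encoding (offset=33) unless told otherwise.
    """
    return [ord(ch) - offset for ch in quality_line]


def per_position_median(quality_lines, offset=33):
    """Median Phred score at each read position.

    Reads may differ in length (for example after trimming), so each
    position only uses the reads that are long enough to cover it.
    """
    longest = max(len(q) for q in quality_lines)
    medians = []
    for pos in range(longest):
        column = [ord(q[pos]) - offset for q in quality_lines if len(q) > pos]
        medians.append(median(column))
    return medians
```

Comparing these medians before and after trimming gives a rough numeric view of the improvement that the box-and-whisker plot shows graphically.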

Results: Focusing on the Biology

Methodology: Data Mining and Bioinformatics Pipeline

Abstract

Background/Introduction: S. aureus

• S. aureus is found in the normal flora of the body and is pathogenic only in some individuals.
• It is very hard to treat if the strain is methicillin-resistant.
• Virulence factors from Staphylococcus aureus can be structural or secreted products that lead to pathogenesis; they are classified into secreted toxins, superantigen toxins, and surface proteins.
• Exotoxins can include toxic shock syndrome toxin 1 (TSST-1), Panton-Valentine leukocidin (PVL), and staphylococcal enterotoxins (SE).

Conclusions

• This was a successful approach to identifying the core genome.

Acknowledgements

We gratefully acknowledge the support of the Hubbard Genome Center of UNH, especially Kelley Thomas, Krystalynne Morris, Stephen Simpson, Jordan Ramsdell and Devon Thomas, for all aspects of bioinformatics support. This support is funded through the NH INBRE program (P20GM103506, from the National Institute of General Medical Sciences of the NIH). We also thank Dan Williams of the KSC Information Technology Department for his continued support of the computational resources needed to accomplish this work. We thank Dana Gibson for support in the printing of this poster.



• Staphylococcus aureus is a member of the normal flora of the body and is a Gram-positive, coccal (round) bacterium.
• It is found in the respiratory tract, nose, and skin.
• There are pathogenic forms, which are a common cause of skin infections, abscesses, and respiratory infections, and nonpathogenic forms that lie dormant.
• 20-30% of the human population are long-term carriers.
• Pathogenic strains promote infection by producing virulence factors such as protein toxins and by expressing a cell-surface protein that binds and inactivates antibodies.
• There is no approved vaccine to prevent infection by this bacterium; infections can only be treated with antibiotics.
• If an infection is left untreated, the fatality rate is 80%; with treatment, the fatality rate is still 15-50%, depending on age and health.
• The most common antibiotic-resistant strain of S. aureus is MRSA (methicillin-resistant S. aureus).
• S. aureus is the leading cause of bloodstream infections and the most common cause of infective endocarditis; four endocarditis strains were used in this study.

This study focuses on the bacterium Staphylococcus aureus. Seven strains were examined: four from endocarditis infections and three CC8 (clonal complex 8) strains. The background section gives a basic outline of this bacterium. The methodology is then explained through the bioinformatics pipeline that was applied to my data. Tables display the results of running the data through the pipeline. In addition to the data for my seven strains, reference data were downloaded from the NCBI and analyzed. The reference data were also compared to different strains of S. aureus using a whole-genome sequence-based phylogenetic tree.

Figure 2. Flow chart of my bioinformatics pipeline, explaining the process I used.
1. DNA of interest is collected in the lab.
2. The DNA is then sequenced.
3. The raw data are acquired from the EBI and downloaded into RON.
4. Trimmomatic is used to trim the data and assess overall quality.
5. FastQC is used after Trimmomatic as an additional data-quality tool.
6. To assemble the data, the paired forward and reverse reads from Trimmomatic are run through SPAdes.
7. QUAST is used for quality assessment of the assembly.
8. Prokka is run to annotate the data.
9. Finally, Roary is run on the .gff files of all strains.
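The computational steps above (4 to 9) can be sketched as a list of command lines. This is an illustrative sketch only: the poster does not report the exact commands or parameters, so every file name, output directory, and trimming setting below is an assumption, and the helper names are hypothetical. The functions only build the commands; they do not run them.

```python
from pathlib import Path


def build_sample_commands(sample, r1, r2, reference,
                          adapters="adapters.fa", outdir="results"):
    """Build per-strain commands for steps 4-8 (trim, QC, assemble, assess, annotate).

    All paths and the Trimmomatic settings are assumed values for illustration.
    """
    out = Path(outdir) / sample
    p1, u1 = out / "R1.paired.fq.gz", out / "R1.unpaired.fq.gz"
    p2, u2 = out / "R2.paired.fq.gz", out / "R2.unpaired.fq.gz"
    contigs = out / "spades" / "contigs.fasta"
    return [
        # Step 4: paired-end trimming (clip settings are assumptions)
        ["trimmomatic", "PE", r1, r2, str(p1), str(u1), str(p2), str(u2),
         f"ILLUMINACLIP:{adapters}:2:30:10", "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # Step 5: quality report on the trimmed reads (output dir must exist)
        ["fastqc", str(p1), str(p2), "-o", str(out / "fastqc")],
        # Step 6: assembly from the paired forward and reverse reads
        ["spades.py", "-1", str(p1), "-2", str(p2), "-o", str(out / "spades")],
        # Step 7: assembly quality assessment against a reference genome
        ["quast.py", str(contigs), "-r", reference, "-o", str(out / "quast")],
        # Step 8: annotation; Prokka writes <prefix>.gff into its outdir
        ["prokka", "--outdir", str(out / "prokka"), "--prefix", sample,
         str(contigs)],
    ]


def build_roary_command(gff_files, outdir="roary_out"):
    """Step 9: pan-genome analysis across all strain .gff files.

    -e with -n requests a core-gene alignment built with MAFFT.
    """
    return ["roary", "-f", outdir, "-e", "-n", *gff_files]
```

Each command list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the output directories exist.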

Purpose of the Project

I aimed to characterize and compare seven strains of Staphylococcus aureus using a whole-genome approach: four strains that caused endocarditis and three CC8 strains associated with a human-to-bovine host jump.

Figure 4. Comparison of one sample against a reference strain. This figure was generated using QUAST.

Table 2. SPAdes results. Column one lists the seven strains. Column two gives the total number of contigs (SPAdes output). Column three gives the N50 size and column four the GC content of each assembly.
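For readers unfamiliar with the metrics in Table 2, here is a minimal sketch of how N50 and GC content are defined. The function names are hypothetical and the values in Table 2 come from the assembly tools themselves, not from this code.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the total length is covered
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def gc_content(contigs):
    """Fraction of G and C bases across all contig sequences."""
    bases = "".join(contigs).upper()
    if not bases:
        return 0.0
    return (bases.count("G") + bases.count("C")) / len(bases)
```

A higher N50 with fewer contigs generally indicates a more contiguous assembly.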

Table 3. Prokka results showing the genome size, the number of protein-coding sequences, the number of tRNAs, and the number of rRNAs for each strain.
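Counts like those in Table 3 can be tallied from an annotation file. The sketch below assumes a GFF3 file whose third column holds the feature type (CDS, tRNA, rRNA) and which may end with an embedded `##FASTA` sequence section; `count_features` is a hypothetical helper, not part of Prokka.

```python
from collections import Counter


def count_features(gff_text, wanted=("CDS", "tRNA", "rRNA")):
    """Count selected feature types in GFF3 text."""
    counts = Counter()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # the sequence section follows; no more feature rows
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) >= 3 and fields[2] in wanted:
            counts[fields[2]] += 1
    return {kind: counts[kind] for kind in wanted}
```

Running this on each strain's .gff file would give one row of such a table.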

Figure 5. Whole-genome sequence-based phylogenetic tree obtained through Roary.

Figure 1. U.S. mortality rates from infectious diseases and other causes in 2005.

Table 1. Column one shows the strain being analyzed. Column two shows where the strain was acquired from. Column three describes why that specific strain was used in this study.

Figure 2 shows S. aureus through a microscope.

Analysis of the core genome shared by the seven strains revealed the presence of these toxins: Leucotoxin LukEv and LukDv, Exotoxin Type A, and Enterotoxin A, B, C-1, D, E, G, and H.
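The idea of a core genome can be illustrated with a small sketch: a gene counts as core when it is present in (nearly) all strains. The 99% threshold, the input format, and the function name are assumptions for illustration; the poster does not state the settings used with Roary.

```python
def core_genes(presence, n_strains, threshold=0.99):
    """Return the sorted genes found in at least `threshold` of the strains.

    presence: mapping of gene name -> set of strain names carrying that gene.
    """
    return sorted(
        gene for gene, strains in presence.items()
        if len(strains) / n_strains >= threshold
    )
```

With seven strains and a 99% threshold, a gene must be present in all seven to count as core.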