Megan Milton & Mallory Van Wyngaarden - Managing Barcode Data Library Generation

Post on 13-Jan-2015

DESCRIPTION

How to manage barcode data library generation using BOLD systems

TRANSCRIPT

Barcode of Life Data Systems (BOLD)

www.boldsystems.org (v2.5)

v3.boldsystems.org (v3.0 beta)

Managing Barcode Data Library Generation

Fourth International Barcode of Life Conference - Workshop

Megan Milton and Mallory Van Wyngaarden

Monday, November 28, 2011 – University of Adelaide, Australia

Barcode Library Generation

Needs
• Scope (taxonomic and/or geographic)
• Barcode standards compliance
• Completion of data
• Access by all participants
• Quality control process
• Data Curation/updates
• Avoid duplication of effort
• Computational power for analysis
• Protection of data

BOLD Workbench

How BOLD addresses these needs:
• Secure Data Storage
• Online access anywhere
• Permission-based sharing
• Taxonomy Browser (view progress so far)
• Built-in Quality Control checks
• Progress feeds/Activity log
• Analysis tools on BOLD compute cluster

User Registration

Getting Started

Requesting an Account
– Requirements:
• Valid Email Address
• Institutional Affiliation
• Password

Creating a Project
– Project Identifiers
• Project code
• Project type
– Markers
• Primary
• Secondary
– Campaign
– Description
– Project permissions

Project Creation Form

Specimen Page

Sequence Page

Barcode Record = Specimen data + Molecular data

Standard Workflow - order of upload

Specimen Data

Images

Traces

Sequences

Specimen Data Submissions

Single Specimen Upload Form

Specimen Data
– Single Uploads
• Identifiers
• Taxonomy
• Specimen Details
• Collection data
– Batch Uploads
• New and updated records
• Template spreadsheet
• Submit through BOLD to Data Management Team

Image Submissions

Image Library

Image Data
– Required Fields
• Sample ID
• Process ID
• Image File
• Original Specimen
• View Metadata
• Licensing
– Resolution
• < 20 Megapixels
– Assemble Package
• Images (.jpeg format)
• Spreadsheet (template)
• Maximum zipped file size 190MB
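
The "Assemble Package" step above can be sketched in Python. This is only an illustration under stated assumptions: the function name `build_image_package`, the folder layout and the spreadsheet file name are hypothetical, and the 190MB limit from the slide is read here as 190 × 1024 × 1024 bytes.

```python
import zipfile
from pathlib import Path

# Limit taken from the slide; treating "MB" as binary megabytes is an assumption.
MAX_ZIP_BYTES = 190 * 1024 * 1024
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def build_image_package(image_dir, spreadsheet, out_zip):
    """Zip the JPEG images plus the filled-in template spreadsheet.

    Returns the size of the zip in bytes and raises ValueError if the
    package has no images or exceeds the size limit.
    """
    image_dir, spreadsheet, out_zip = Path(image_dir), Path(spreadsheet), Path(out_zip)
    images = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not images:
        raise ValueError(f"no JPEG images found in {image_dir}")

    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store files flat (no folders) so names match the spreadsheet entries.
        zf.write(spreadsheet, arcname=spreadsheet.name)
        for img in images:
            zf.write(img, arcname=img.name)

    size = out_zip.stat().st_size
    if size > MAX_ZIP_BYTES:
        raise ValueError(f"package is {size} bytes, over the {MAX_ZIP_BYTES} byte limit")
    return size
```

A call such as `build_image_package("images", "image_template.xls", "image_package.zip")` (all three names are placeholders) would fail early on an oversized package, so it can be split before submission.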

Trace Submissions

Trace File Viewer

Trace Files
– Sequencing details:
• Trace file in .ab1 or .scf
• Phred File in .phd.1
• PCR primers
• Sequencing primer
• Direction
• Marker
• Attribution to run site
– Assemble Package
• Electropherograms
• Spreadsheet (template)
• Maximum zipped file size 190MB

Primer Submissions

Primer Database

Primer Database
– Search by
• Primer code
• Submitter
• Target marker
• Reference/Citation

Primer Submission Form

Primers
– Required Fields
• Primer code
• Primer description
• Target marker
• Primer sequence
• Reference/Citation
• Direction
• *Public/private
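
Before filling in the submission form, a primer record can be checked against the required fields listed above. The sketch below is an assumption-laden illustration, not BOLD's own validation: the dictionary keys and the function name `check_primer` are hypothetical, the sequence check uses the standard IUPAC nucleotide codes, and the starred Public/private field is treated as optional because the slide does not explain the asterisk.

```python
# Hypothetical key names mirroring the required fields on the slide.
REQUIRED_FIELDS = (
    "primer_code",
    "primer_description",
    "target_marker",
    "primer_sequence",
    "reference",
    "direction",
)

# Standard IUPAC nucleotide codes (degenerate bases included).
IUPAC_CODES = set("ACGTURYSWKMBDHVN")


def check_primer(record):
    """Return a list of problems found in a primer record (empty if none)."""
    problems = [
        f"missing {field}"
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]
    sequence = str(record.get("primer_sequence", "")).strip().upper()
    bad = sorted(set(sequence) - IUPAC_CODES)
    if bad:
        problems.append("non-IUPAC characters in primer_sequence: " + "".join(bad))
    return problems


example = {
    "primer_code": "LCO1490",
    "primer_description": "Universal COI forward primer",
    "target_marker": "COI-5P",
    "primer_sequence": "GGTCAACAAATCATAAAGATATTGG",
    "reference": "Folmer et al. 1994",
    "direction": "Forward",
}
print(check_primer(example))
```

Returning a list of problems rather than raising on the first one lets a submitter fix a record in a single pass.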

Sequence Submissions

Sequence Page

Sequence Data
– Required Fields
• Aligned sequences in FASTA format
• Header can use Process ID or Sample ID
• Marker
• Run Site (Institution)
• < 1000 sequences per upload
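
Because each upload takes fewer than 1000 sequences, a larger FASTA file has to be split first. A minimal Python sketch of that step follows; the function names are hypothetical, and the headers are assumed to hold only the Process ID or Sample ID.

```python
# The slide says "< 1000 sequences per upload", so 999 is the largest batch.
MAX_PER_UPLOAD = 999


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_for_upload(records, size=MAX_PER_UPLOAD):
    """Split records into consecutive batches of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def to_fasta(records):
    """Write records back out, one header line and one sequence line each."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)
```

Each batch returned by `split_for_upload` can be passed to `to_fasta` and pasted or saved as a separate upload. Alignment gaps (`-`) pass through untouched, since the sequences are expected to be aligned already.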

Project Console
– Project Permissions and Publication
• Project manager only
– Project Statistics
– Upload/Downloads
– Sequence Analysis
– Specimen Aggregates
– Activity Feed
– Tags and Comments

Project Console

Project Summary

Record List and Icons

Record List
– Identification
– Specimen Page
• Specimen information
• Image data
– Sequence Page
• Sequence(s), trace files and primer
– Icons and flags
– Tagging and Comments on multiple records

Taxon ID Tree

Data Validation

Taxon ID Tree
– Requires: good quality sequences, some level of taxonomy; images are recommended
– Highlights common contaminations
– Colourize by taxonomy, geography, etc.
– Helps to catch misidentifications
– Add pictures for comparison
– Use to help make identifications

Nearest Neighbour Summary

Nearest Neighbour
– Tabular Format
– Requires low level taxonomy
– Highlights:
• Low Divergence compared to nearest neighbour
• Divergence that is less than the intra-specific
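
The idea behind the first highlight (low divergence to the nearest neighbour) can be illustrated with a small Python sketch. It is not BOLD's implementation: it swaps in a plain p-distance on aligned sequences, the function names are hypothetical, and the 2% default threshold is an arbitrary example value.

```python
def p_distance(a, b):
    """Proportion of differing sites, counting only sites where both are A/C/G/T."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def nearest_other_species(records):
    """For each (record_id, species, sequence), find the closest record
    belonging to a different species. Returns {record_id: (neighbour_id, distance)},
    with None as the value when no other species is present."""
    nearest = {}
    for rid, species, seq in records:
        best = None
        for oid, other_species, other_seq in records:
            if other_species == species:
                continue
            d = p_distance(seq, other_seq)
            if best is None or d < best[1]:
                best = (oid, d)
        nearest[rid] = best
    return nearest


def flag_low_divergence(records, threshold=0.02):
    """Record IDs whose nearest other-species neighbour is closer than threshold."""
    return sorted(
        rid
        for rid, best in nearest_other_species(records).items()
        if best is not None and best[1] < threshold
    )
```

Records flagged this way are candidates for a second look at the identification or for possible contamination, which is how the summary is used for validation in this workflow.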

Specimen and Sequence Pages

Data Curation

Editing Records
– Review graphs and flags in Project Summary
– Review and edit specimen page
– Review sequence page
• Sequence
• Trace
• Primer
– Replace or delete images, traces, sequences

Publication

Publishing Project
– Submitting to GenBank
– Making projects public on BOLD

Published Project

Bibliography Submissions

Biblio Submission Form and Publication Database

Bibliography
– Required Fields:
• Title
• Authors
• Abstract
• Journal details
– Connect to BOLD records
• Primary records
• Secondary records

support@boldsystems.org
