ISBER 2014 Annual Meeting & Exhibits May 20-24, 2014, Orlando, FL
Biobank Informatics: Best Practices in
Standardization, Change, Migration
ISBER Informatics Working Group
International Society for Biological and Environmental Repositories
Agenda
• Informatics Survey 2013 update
• Best Practices for Biobank Systems
• Managing Change
• Best Practices for Data Migration
• Data Standardization and Sharing
ISBER Informatics Survey
2014
Kevin Meagher
Took survey in 2012?
[Chart: Yes / No]
Organization Type
[Bar chart, 0% to 60%, series 2010, 2012, 2014. Categories: Pharma, Biotech, Storage Fac, Govt, Other, Hosp, Res Lab, Academic]
Kinds of Specimens
[Bar chart, 0% to 100%, series 2010, 2012, 2014. Categories: Plant, Other, Animal, Fluids, DNA, Tissue, Blood, Human]
(check all that apply) 2010 n=69, 2012 n=72, 2014 n=55
How many specimens?
[Bar chart, 0% to 50%, series 2010, 2012, 2014. Categories: Less than 1,000; 1,000 - 10,000; 10,001 - 100,000; 100,001 - 1,000,000; More than 1,000,000]
2010 n=69, 2012 n=72, 2014 n=55
Current system in 2014
[Diagram of system types in use: Database, Paper, Spreadsheet. Counts shown: 10, 2, 11, 0, 27, 0, 1]
IT Support
[Bar chart, 0% to 50%, series 2014, 2012. Categories: No IT support, Dedicated IT, 3rd party, Organization IT]
2014 n=55, 2012 n=72
Labeling
[Bar chart, 0% to 60%, series 2010, 2012, 2014. Categories: RFID, Other, Linear barcode engraved, 2D barcode engraved, Handwriting, Linear barcode, Print label, 2D barcode]
Current System
[Bar chart, 0% to 100%, series 2012, 2014. Categories: Database, Spreadsheet, Paper]
Satisfaction with Current System
[Bar chart, 0% to 60%, by survey year (2010, 2012, 2014). Categories: Very satisfied, Somewhat satisfied, Neither, Somewhat dissatisfied, Very dissatisfied]
Data Sharing Importance
[Bar chart, scale 0 to 5, series 2014, 2012, 2010. Categories: Search inside, Search outside, Sharing inside, Sharing outside]
Meet Data Sharing Needs?
[Bar chart, scale 0 to 5, series 2014, 2012. Categories: Search inside, Search outside, Sharing inside, Sharing outside]
Compatibility Importance
[Bar chart, scale 0 to 5, series 2014, 2012, 2010. Categories: LIMS, EDC, CRI, Lab equipment, Data WH]
Importance of Technical Needs
[Bar chart, scale 0 to 5, series 2014, 2012, 2010. Categories: Internal IT, Vendor support, Customize w/o programming, Customize w/ programming]
Technical Needs
[Bar chart, scale 0 to 5, series 2014, 2012. Categories: Internal IT, Vendor support, Customize w/o programming, Customize w/ programming]
Technical Needs Met 2014
[Bar chart, scale 0 to 5, series Importance, How Well]
Importance Of Information
[Bar chart, scale 0 to 5, series 2010, 2012, 2014]
Meets Information Requirement
[Bar chart, scale 0 to 5, series 2012, 2014]
Information Requirements 2014 Importance & How Well Met
[Bar chart, scale 0 to 5]
Importance of Features
[Bar chart, scale 0 to 5, series 2012, 2014]
Current System Meet Requirements
[Bar chart, scale 0 to 5, series 2012, 2014]
2014 System Meet Requirements
[Bar chart, scale 0 to 5, series Importance, How Well]
Top Problems or Barriers
• Integration: 12
• Cost: 8
• Ease of use: 7
• Lack of Standardization: 6
• Reporting: 5
• Security/Privacy: 5
• Lack of Support: 4
• Migration: 4
• Customization of system: 4
• Executive Buy in: 4
• Expert consultants: 3
• Adaptation: 2
• Flexibility/Extensibility: 2
• Stability: 2
Best Practices for Choosing and
Implementing Information Systems
Status Quo
• Informatics systems range from paper systems to enterprise systems
• Spreadsheets, homegrown databases, commercial off-the-shelf software, custom-built applications
Systems Used in 2012
[Diagram of system types in use: Database, Paper, Spreadsheet. Counts shown: 16, 10, 6, 1, 36, 0, 4]
ISBER Best Practices
• ISBER Best Practices for Repositories: Collection, Storage, Retrieval and Distribution of Biological Materials for Research, 3rd Edition, published 2012
• www.isber.org/?page=BPR
Specimen Tracking
• Inventory management system is essential
• Data integrity checking
• Track location of each sample
• Annotation on each sample
• Significant events
– Thaws, Processing, Distribution, Destruction
Specimen Tracking
• Full query capability throughout the system
• Unique KeyID assigned to each Sample (container)
• Cross reference external identifier(s) with internal identifier
Specimen Location
• Must have ability to uniquely identify each position in storage
• Different configurations should be supported (e.g., uprights, chests, LN2 tanks, straws)
• Show available open positions
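The location requirements above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed design: the position format (`freezer/rack/box/cell`), the function names, and the sample IDs are all assumptions made for the example.

```python
from itertools import product

def all_positions(freezer, racks, boxes, rows, cols):
    """Enumerate every storage position as a unique string ID (format is hypothetical)."""
    return [
        f"{freezer}/R{rack}/B{box}/{chr(ord('A') + row)}{col}"
        for rack, box, row, col in product(
            range(1, racks + 1), range(1, boxes + 1), range(rows), range(1, cols + 1)
        )
    ]

def open_positions(positions, occupied):
    """Return positions that hold no sample, preserving storage order."""
    taken = set(occupied)
    return [p for p in positions if p not in taken]

# A tiny 2 x 2 box in one rack of a hypothetical freezer "FZ01".
positions = all_positions("FZ01", racks=1, boxes=1, rows=2, cols=2)
occupied = {"FZ01/R1/B1/A1": "S-0001", "FZ01/R1/B1/B2": "S-0002"}
free = open_positions(positions, occupied)
```

Other configurations (chests, LN2 tanks, straws) would need their own enumeration, but the same idea applies: every position gets exactly one ID, and open positions are whatever is not occupied.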
Additional Specimen Descriptors
• Specimen type, container type, quantity, collection date/time, receipt date, processing date/method, storage temperature
• Sample processing and Chain of Custody
• Quality of the sample
• Metadata
Labels
• Use bar codes together with human-readable text
• Test labels and ribbons in all projected storage and processing conditions
• Label should have an ID linked back to the software
Audit Trail
• Computer-generated and automatic
– Original data and new data
– Date and time changed
– How the change was made
– Who made the changes
– Why the changes were made
• Not editable
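One way to picture these audit trail fields is as an append-only log of immutable entries. The sketch below is an assumption-laden illustration in Python: the class names, field names, and example values are hypothetical, and a production system would enforce immutability in the database rather than in application memory.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: an entry cannot be edited after creation
class AuditEntry:
    record_id: str
    field: str
    old_value: str        # original data
    new_value: str        # new data
    changed_at: datetime  # date and time changed
    method: str           # how the change was made
    user: str             # who made the change
    reason: str           # why the change was made

class AuditTrail:
    """Append-only log; it offers no update or delete operation."""

    def __init__(self):
        self._entries = []

    def record(self, record_id, field, old_value, new_value, method, user, reason):
        # The timestamp is generated by the system, never supplied by the user.
        entry = AuditEntry(record_id, field, old_value, new_value,
                           datetime.now(timezone.utc), method, user, reason)
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)  # read-only view

trail = AuditTrail()
entry = trail.record("S-0001", "storage_temp_c", "-20", "-80",
                     "web form", "jdoe", "moved to long-term storage")
```

Attempting `entry.new_value = "4"` raises `FrozenInstanceError`, which is the in-memory analogue of "not editable".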
Security
• NIST and FISMA compliance
• User authentication & password standards
• Security roles with defined privilege levels
• Automatic log-off after inactivity
• Audit trail of login attempts
• Remote communication over an encrypted channel (SSL/VPN)
• PHI secured (HIPAA & HITECH)
System Interoperability
• Benefits
– Reduced manual re-entry of data (‘Sneakerware’)
– Data accessibility with a minimum of applications running
• Minimum Requirements
– Publicly documented Application Programming Interface (API)
– Common public vocabularies
Security and External Access
• System must enforce all data integrity, security, and audit trail requirements for external access
• Ability to control group-level security so that external users cannot see each other's information
Reporting
• Support the workflow
– Empty Positions, Count Reports, Shipping Manifests
• Query editor
• Report designer
• Save reports for future execution
• Generate printed output and electronic data files (CSV, PDF, XML)
Backup
• Regularly scheduled backups are essential
• Restores of the backups must be tested on a regular basis
• Where is the backup located?
• Don’t be caught unprepared!
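A restore test can be as simple as comparing a checksum of the source export with a checksum of the restored copy. The Python sketch below assumes file-level backups and uses throwaway files in place of a real database export; the file names and helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def restore_matches(source, restored):
    """True when the restored file is byte-for-byte identical to the source."""
    return sha256_of(source) == sha256_of(restored)

# Demonstration with temporary files standing in for an inventory export.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "inventory_export.csv"
    restored = Path(folder) / "inventory_restored.csv"
    source.write_text("specimen_id,position\nS-0001,FZ01/R1/B1/A1\n")
    restored.write_text(source.read_text())      # a faithful restore
    restore_ok = restore_matches(source, restored)
    restored.write_text("specimen_id,position\n")  # a truncated restore
    truncated_ok = restore_matches(source, restored)
```

A matching checksum shows only that the files are identical; it does not replace opening the restored system and confirming that it actually runs.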
Validation
• Is the data accurate?
• Does the system perform reliably?
• Can you discern invalid or altered records?
• Does the system perform in the manner expected?
• Test in your environment with your data
Quality Assurance Audits
• User requirements
• Industry-specific certification requirements
• Details of the review and approval process
• Procedures followed to test the functionality of the software
• Processes used to handle program errors and modifications
• Training provided
Systems are compliant; not software
Core Elements
• System Security
• Software and System Validation
• Audit Trails
• Electronic Signature Security
• Password Maintenance
• Record Retention and Protection
• Operational, Authority, & Device Checks
• Document Control
Software Supplier Relationships
• You are responsible for qualifying your software suppliers.
• Applies to both external and internal suppliers
• Evaluation categories
– Business Perspective
– Technical Perspective
Computerized System Validation
• Documented requirements and specifications
• Written validation plan
• Evidence of reviews and test results
• Traceability of validation back to requirements
Computerized System Validation
• Documented approval of validation
• Defined change control process
• Vendor verification
US Regulatory considerations
• 21 CFR Part 11
– Any records required by or submitted to the FDA in electronic form
• CLIA
• CAP
International Considerations
• HTA (UK)
• Other ISO?
Conclusion
• Use ISBER Best Practices as a guide
• Know your supplier
• Understand the regulations as they apply to you
• Document everything! (URS, SOW, FRS, UAT, Validation)
Questions?
Mark Cada
Life Sciences Account Manager
LabVantage Solutions
484-202-0442
STRETCH BREAK!
So this doesn’t become true…
Managing Change
Mary E. Edgerton, MD, PhD
Roadmap
1. What is Change Management?
2. What does Change Management Have to Do with Migration?
3. What are some Best Practices in Change Management?
4. Useful References
What is Change Management?
• The process of “transitioning individuals, teams, and organizations to a desired future state”¹
• Is especially important where new software systems affect numerous groups within an organization
• Can make or break acceptance of a new software system
¹ Kotter, J. (July 12, 2011). "Change Management vs. Change Leadership -- What's the Difference?". Forbes online.
What is Change Management?
Watzlawick, Weakland, and Fisch¹ divide change into two types.
• First-order change
– A variation in procedures, but the workflow is unchanged (e.g., a new version of software without major changes)
• Second-order change
– A major shift in practice (migration to a new system is usually second-order, particularly if issues such as adherence to best practices or compliance with new regulations are involved)
¹ Watzlawick, Weakland, and Fisch, Change: Principles of Problem Formation and Problem Resolution, 1974
What is Change Management?
Borrowing from social psychology,¹ we know that individual behavior is linked to motivational concepts. This can affect the process.
• Two equally good choices
– Can paralyze the community with indecision
• Choosing the lesser of two evils
– Often the perception of buy versus develop
• Opposing interests among the stakeholders
– The users versus information technology
¹ Kurt Lewin in Deutsch and Krauss, Theories in Social Psychology, 1965
What is Change Management?
Applying the Everett Rogers Diffusion of Innovations¹ theory, there are five categories of product adopters. Even in a “benevolent dictatorship” where system selection is enforced across an institution, the users can be identified in these five categories.
• Innovators
– Technologically adventurous, always ready for the newest thing (try to hire people who buy everything on-line)
• Early adopters
– Social leaders (helps if the new system addresses some major deficiencies)
• Early majority
– Deliberate (need to see the proof of the improvements of the new system; while slow they can be reliable supporters as new groups come on)
• Late majority
– Skeptical, traditional, resistant to change (often people who are not technologically inclined)
• Laggards
– New system is a threat (often the Principal Investigator who perceives loss of ownership of data)
¹ Everett Rogers, Diffusion of Innovations, first published 1962, now in 6th edition 2011
What is Change Management?
Lorenzi and Riley¹ define eight categories for potential pitfalls when managing change.
• Communication
• Culture
• Underestimation of complexity
• Scope creep
• Organizational
• Technology
• Training
• Leadership/Ownership
¹ Lorenzi and Riley, “Managing change”, JAMIA 2000.
What does Change Management Have to Do with Migration?
• Question from me to our software trainer
– “How do people like the new release of our tissue collection module?”
• Answer
– “Well, there seem to be two reactions. The existing users hate it but the new users just coming on love it.”
• Conclusion
– People hate change!
– “If it weren’t for the people …”¹
¹ Kurt Vonnegut Jr., Player Piano
What does Change Management Have to Do with Migration?
• Change management is especially important where new software systems affect numerous groups within an organization
• Biospecimen repository systems are typically used by multiple groups within an organization:
– Consenters
– Bank staff
– Tissue collectors
– Data entry for updating and annotation
– Tissue distribution
– Regulatory oversight
– Reporting for grants, sponsors
– Accounting/finance
– Customers
• Acceptance depends upon how you handle Change Management
What are some Best Practices in Change Management?
Good communication
• Identify the stakeholders
• Understand their perspectives
• Meet with them regularly
• Status updates
• Manage communications within your own team in large, complex migrations
What are some Best Practices in Change Management?
Address culture change
• Recognize big change
– Is this a move from paper to a web-based front end with a back-end relational database?
• Address hostility between the user group and IT/IS
• Identify IT/IS as a stakeholder with responsibility
• Engage tech-savvy individuals in the user group who will be innovators or early adopters
What are some Best Practices in Change Management?
Avoid underestimation of complexity
• e.g., the Affordable Care Act
• Manage expectations up front
• Involve technical personnel to view requirements
• Pair a requirements analyst with a technical person while gathering requirements
• Double your estimate of the time and effort
• Apply AGILE or a similar approach
• Reduce loss of credibility over missed deadlines
What are some Best Practices in Change Management?
Avoid scope creep
• Very common
• Gather requirements and manage expectations
• Watch out for the innovators!
• Document the plan
• Engage all stakeholders in a sign-off of the deliverables
• Be able to separate critical from desirable scope changes
• Don’t introduce scope creep from the IS side, because then no one will stay in scope
What are some Best Practices in Change Management?
Address organizational issues
• Avoid a large, disconnected team without leadership or accountability
• Be prepared for staff turnover in the implementation team, the software support team, or among the stakeholders; it can be a problem
• Don’t try to use software to fix a “people” problem
• Define metrics for success up front
• Establish rules for which requirements need the agreement of a majority of stakeholders and which need unanimous agreement
What are some Best Practices in Change Management?
Technology
• Don’t make this too hard
• Use robust communications and programming technology
• Don’t constantly redesign or reconfigure your system
What are some Best Practices in Change Management?
Training
• Have a plan for training users on the new system
• Provide workshops to train people on new versions
• Try to keep the system as simple as possible, so that it is either intuitive or a computer-based training module is sufficient
What are some Best Practices in Change Management?
Leadership
• Engage all of the stakeholders as owners of the project (goes back to communications)
• Engage institutional leadership to endorse the project
• Avoid emotional ties to software
• Engage domain experts, but make sure they have sufficient time commitment to the project
Useful References
• Lorenzi and Riley, “Managing change: an overview”, JAMIA 7: 116-124. 2000.
• Edgerton, Grizzle, and Washington, “The deployment of a tissue request tracking system for the CHTN: a case study in managing change in informatics for biobanking operations”, BMC Medical Informatics and Decision Making 10:32. 2010.
• Rogers, Diffusion of Innovations, 6th edition 2011
• Fisher, Ury, and Patton, Getting to Yes: Negotiating Agreement Without Giving In, 3rd edition 2011
Data Migration
Ashokkumar A. Patel
Overview
• Who
– …are the stakeholders?
– …will perform the data migration?
• What
– …are common problems that your group might face with data migration?
– …are the expectations of your end users? Did they like the outcome?
– …type of migration is it (from spreadsheets to home grown/commercial system, etc.)?
• When
– …is there a need for the data migration?
• Why
– …are you doing data migration? Reasons?
• How
– …are issues/problems with the quality of data addressed with your end users prior to migration?
– …long does it take to complete the data migration?
• Case studies from multiple perspectives: Vendor, Academic, and Informatician.
All biobank data is NOT the same
• Similarity
– Patient enrollment/registration
– Inventory management
– Shipping
(they all have key concepts of biobanking)
• Differences
– Data model
– Vocabulary
– Coding
(however, they do not have the tools to translate their data with other systems)
What to expect when you are expecting….
Source: www.imdb.com
Why Migrate Data?
• Current system is overextended and needs additional functionality
• Enterprise-wide biobank initiatives
• Existing biobanks forming new networks for newly funded projects
– Need to appear as a single entity
• Existing network of biobanks expanding
• New partners to your existing biobank
Reasons for Data Migration
• Good reasons:
– Ability to have your biobank get an accurate accounting of your specimens/data
– Ability to SHARE your data with OTHERS
• Not so good reasons:
– “Better and Newer” technologies
Before migration: things to consider
• The migration of data from one system to another is a non-trivial task.
– What data is to be migrated?
– How will the data be transferred to the new system?
– Does any cleaning of the original data need to happen?
– What data translations need to happen?
– How will the IDs from one system be transferred to another?
– How to handle any confidential or personal information?
– How to validate that the migration was done correctly?
– How to document the data transfer?
Types of Data Migration
• What type of migration are you planning: from spreadsheets to a home grown system, a commercial system, etc.?
– Many different kinds of systems can be involved:
• SQL DB
• Access DB
• Excel
• ASCII text file
• Paper records
Reflection of your legacy data
• Most common problems:
– Use of non-standardized vocabularies
– Mismatch of IDs when merging multiple files
– Multiple data formats (e.g., dates and units such as cm vs. mm)
– Poor or multiple data models implemented
• Patient->Specimen->Sample
• Specimen->Sample
• Collections->Patient->Specimen->Sample
• Collections->Specimen->Sample
• Sample data entry (TMA vs. paraffin block)
– Free text fields
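The "multiple data formats" problem is often handled with a small normalization layer before loading. The Python sketch below is only an illustration: the list of legacy date formats is an assumption (including that slash dates are month-first), and real legacy data usually needs more formats and a review queue for values that match none.

```python
from datetime import datetime

# Assumed legacy date formats; slash dates are assumed to be month/day/year.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")
MM_PER_UNIT = {"mm": 1.0, "cm": 10.0}

def normalize_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave for manual curation rather than guess

def to_millimetres(value, unit):
    """Convert a length to millimetres; raises KeyError on an unknown unit."""
    return float(value) * MM_PER_UNIT[unit.strip().lower()]

iso_a = normalize_date("05/20/2014")
iso_b = normalize_date("20140520")
unknown = normalize_date("May 2014")
size_mm = to_millimetres("2.5", "cm")
```

Returning `None` for unrecognized dates, instead of guessing, keeps ambiguous values visible to the people who own the data.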
Case Study #1a: commercial vendor’s view
Single well-formed database
• A conversion from a well-structured, validated relational database containing over four million specimens to a commercial biobanking system.
– The coding of the two databases was different
– The ID structures were not compatible
– Both databases stored their information at the specimen level, i.e., one record per individual specimen.
• The original database structure was normalized and well formed, using look-up tables, primary and secondary keys, and very few free-form text fields.
– The data was analyzed for errors, and a conversion plan mapping the existing data structures to the new data structures was created.
• The conversion plan was reviewed with the system owners, and plans for addressing the (very few) data anomalies were constructed.
• The data was exported out of the original database into CSV files, converted using SAS programs, and then loaded into the new database using a SQL script.
• The data was then exported from the new system into CSV format and converted back to the old data codes using a SAS program.
• This output was then compared to the original exported CSV file to ensure that the migration was correct.
• Due to the structured nature of the source database, this process took less than one week.
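The round-trip check described in this case study (export, convert, load, export again, convert back, compare) can be sketched as follows. The case study used SAS and SQL; this Python version is a stand-in, and the column names, code tables, and sample rows are hypothetical.

```python
import csv
import io

# Hypothetical code mapping between the legacy system and the new system.
OLD_TO_NEW = {"BLD": "BLOOD", "TIS": "TISSUE"}
NEW_TO_OLD = {new: old for old, new in OLD_TO_NEW.items()}

def read_csv(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def convert(rows, mapping):
    """Translate the specimen_type code of each row using the given mapping."""
    return [{**row, "specimen_type": mapping[row["specimen_type"]]} for row in rows]

def round_trip_differences(original, migrated_back):
    """Return the IDs whose records differ between the original export and the round trip."""
    before = {row["specimen_id"]: row for row in original}
    after = {row["specimen_id"]: row for row in migrated_back}
    return sorted(k for k in before.keys() | after.keys() if before.get(k) != after.get(k))

legacy_export = "specimen_id,specimen_type\nS1,BLD\nS2,TIS\n"
original = read_csv(legacy_export)
loaded = convert(original, OLD_TO_NEW)        # stands in for the load into the new system
exported_back = convert(loaded, NEW_TO_OLD)   # export from the new system, back to old codes
differences = round_trip_differences(original, exported_back)
```

An empty `differences` list means every record survived the round trip unchanged; any ID that appears there points at a record to investigate.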
Case Study #1b: commercial vendor’s view
Multiple data sources with little or no validation
• This migration involved the conversion of 22 different study collections into a single new commercial system for a large institution.
• The data was stored in 25 different formats including RDBMS, SAS data files, Excel, text, Microsoft Word, and paper.
– Most of the 25 source datasets did not have validation
• Invalid dates, unlinked IDs, blank entries, bad codes, and other data problems had to be addressed.
• In the end, 25 different data analyses had to be performed to convert the source data.
• Each source was divided into three categories:
– the Good (data that could be converted)
– the Bad (data with discrepancies that needed resolution by the collection center)
– the Ugly (data that was so erroneous that it could not be migrated)
• Each of these datasets had to be converted and loaded into the database, a process that took over 18 months.
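The Good/Bad/Ugly split can be expressed as a small triage function. The rules below are assumptions chosen for illustration (a missing specimen ID makes a record unmigratable; a bad date or unlinked patient ID sends it back for resolution); the case study does not state the actual criteria, and the field names are hypothetical.

```python
from datetime import datetime

def valid_date(text):
    """True if text is a real calendar date in YYYY-MM-DD form."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except (ValueError, TypeError):
        return False

def triage(record, known_patient_ids):
    """Classify one source record as 'good', 'bad', or 'ugly' (assumed rules)."""
    if not record.get("specimen_id"):
        return "ugly"   # nothing to anchor the record to; cannot be migrated
    problems = []
    if not valid_date(record.get("collected")):
        problems.append("invalid date")
    if record.get("patient_id") not in known_patient_ids:
        problems.append("unlinked ID")
    return "bad" if problems else "good"  # 'bad' goes back to the collection center

known = {"P1"}
records = [
    {"specimen_id": "S1", "patient_id": "P1", "collected": "2010-03-01"},
    {"specimen_id": "S2", "patient_id": "P9", "collected": "2010-03-01"},
    {"specimen_id": "S3", "patient_id": "P1", "collected": "2010-02-30"},
    {"specimen_id": "", "patient_id": "P1", "collected": "2010-03-01"},
]
categories = [triage(record, known) for record in records]
```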
Case Study #2: Multi-institute distributed biobank
• Multi-institute biobank project with several major hubs in the USA with international affiliate sites and a single coordinating data/administrative center started in 1999.
• Original informatics system, developed on a MS Access platform in 2000 and revised regularly since, has served the resource well in its earlier developmental stages.
• The project has grown and now moves into a new phase of program operations and development, with expansion to other hubs and other module requirements (i.e. TMA management).
• Access database was installed at each hub (identified data) as well as at the central coordinating site (de-identified).
• Each Hub was able to modify the Access database, as long as the core data was preserved.
Case Study #2: problems with legacy system
• Lack of functionality in the current database implementation
– Lacks multi-user capabilities
– Lag time between Hub and Central data
– Specimen Inventory/Special collections are not effectively managed
• Patient->Batch->Specimen model
– Counts of samples/aliquots recorded for each specimen (e.g., 10 units available)
• Miscommunication between the Hub staff and the Bioinformatics team
• Barriers with technical skill sets
– Not all of the Hubs have DBAs with a similar or minimal level of technical expertise.
• Discrepancy of data received between the Hubs and the Central site
Case Study #2: migration
• Implemented a 2-phase migration process
– Phase 1: quick migration of existing data model from Access to Oracle platform
• Allowed quick deployment of a web-based application to address many of the “simple” problems due to the Access db.
• Allowed controlled vocabulary of existing data sets (not standardized)
• Allowed each Hub to “clean up” legacy data
– Phase 2: reorganize data model and create modular components of biobanking
• Use standardized vocabulary
• Allow multiple workflow variations for data collection
• Seamless integration with other tools/applications used
• Conform to Best Practices
Case Study #3: Large Academic Enterprise Biobank Initiative
• A large academic center in the USA has a federated tissue banking infrastructure with an executive bank, the Institutional Tissue Bank (ITB), and twenty-eight satellite banks, each specializing in an organ system (e.g., lung), a disease (e.g., sarcoma), or a tissue type (e.g., cytology).
• The individual banks are of varying ages.
• Except for seven of these that used the ITB to log in their samples, the banks had control over their information system and the type of information that was collected.
• They have now migrated twenty of the banks onto a single system that has been mandated by the institution as the enterprise system of record.
• The new system accommodates multiple modes of tissue acquisition and management, including various administrative privileges.
• On average it took 6 months per bank to assess their requirements for any customizations prior to migration and then to review their data and migrate it. Some banks have taken longer because of multiple complicated modules.
Case Study #3: common problems
• Data clean-up
– Solutions: Mapping, simple natural language parsing (NLP) using rules to disaggregate data, iterations to try to reduce manual curation to less than 500 records.
• Reconciliation of tissue/entries census
– Solutions: Agree upon metrics in advance of migration. Obtain scripts, code or procedures used for the legacy system census.
• Missing data
– Solution: If possible, capture missing data profiles and make sure all parties understand their impact.
• Agree upon criteria for a successful migration in advance
– Solution: Make the metrics for success part of the signoff of requirements.
• Agree upon what the final system will consist of
– Solution: Develop a requirements document for the new system that both parties sign off on. This helps to prevent scope creep.
• Mixed diagnostic and research tissue being held by banks
• Allow banks to declare “Bank Specific Attributes” to define extra attributes for data they are collecting
– Solution: Better control of Vocabulary and Data Elements
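Rule-based disaggregation of free text, as mentioned under data clean-up, might look like the Python sketch below. The rules, the field names, and the example strings are hypothetical and far simpler than what a real bank would need; the point is the pattern of applying rules and flagging whatever no rule can handle for manual curation.

```python
import re

# Hypothetical rules for a free-text field such as "lung tumor 2.5 cm frozen".
SIZE_RULE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)
PRESERVATION_TERMS = ("frozen", "ffpe", "fresh")

def disaggregate(text):
    """Split a free-text entry into structured fields; flag it if no rule applies."""
    result = {"size": None, "unit": None, "preservation": None, "needs_curation": False}
    size = SIZE_RULE.search(text)
    if size:
        result["size"] = float(size.group(1))
        result["unit"] = size.group(2).lower()
    lowered = text.lower()
    for term in PRESERVATION_TERMS:
        if re.search(rf"\b{term}\b", lowered):  # whole-word match only
            result["preservation"] = term
            break
    result["needs_curation"] = size is None and result["preservation"] is None
    return result

parsed = disaggregate("Lung tumor 2.5 CM, Frozen")
unparsed = disaggregate("see pathology report")
```

Counting the records with `needs_curation` set after each iteration of rule writing gives a direct measure of progress toward a manual-curation target.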
Retrospective vs. Prospective Data Collection
• Legacy data problems
– (Already discussed in previous slides)
• Prospective data collection issues:
– Workflow
– Data standardization (different textual terms/coding)
– Change management/Human Factors
– Expectations of End Users
– Centralized vs. Distributed/Federated Models
Managing Change
Source: Lorenzi and Riley, JAMIA 2000: 7:116-124
Lessons Learned
• Do not start without an agreement on expectations
• Communicate as often as possible
• Extend ownership of the new system to the party that is migrating
• Add time at the end of the project in your forecasting for surprises
• Create a modular design that can be configured to a user’s needs.
• Read Lorenzi and Riley’s paper with your audience so that the stakeholders also take ownership of the migration problems.
– Lorenzi and Riley, JAMIA 2000: 7:116-124
• When the users have to curate migrated data, it helps to give them tools for monitoring this process.
• Provide tools that compare the migrated data to the equivalent data in the old system.
• When possible, create a tool so that users can map their existing data elements and specific data values to those used in your new system.
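A value-mapping tool of the kind suggested in the last lesson can start very small. In this Python sketch the mapping table is assumed to be supplied by the users, and the terms and function name are hypothetical; the tool applies the mapping and reports what is still unmapped so that users can monitor their own curation.

```python
from collections import Counter

def apply_mapping(values, mapping):
    """Map legacy values to new-system terms and count the values left unmapped."""
    mapped, unmapped = [], Counter()
    for value in values:
        key = value.strip().lower()  # tolerate case and stray whitespace
        if key in mapping:
            mapped.append(mapping[key])
        else:
            mapped.append(None)      # leave a visible gap rather than guess
            unmapped[value] += 1
    return mapped, unmapped

# Hypothetical user-supplied mapping from legacy terms to the new vocabulary.
mapping = {"bld": "Blood", "whole blood": "Blood", "tis": "Tissue"}
legacy = ["BLD", "Whole Blood ", "tis", "serum?", "serum?"]
mapped, unmapped = apply_mapping(legacy, mapping)
```

The `unmapped` counter doubles as a to-do list: the most frequent unmapped values are the ones worth adding to the mapping first.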
Take Home Message
• Past experience has shown that data migrations benefit from several steps that were described:
– Analysis of the Old Data
– Review of the New Data Structures
– Creation of a Conversion Plan
• Conversion of Coded Values
• Conversion of IDs
• Linkage to the Original Data (if needed)
• Cleaning of Patient Identifiable IDs (if needed)
– A Post-Migration Analysis and Validation
– Consider Human Factors with Change Management
– Time and Cost of Migration can vary (e.g., size of bank, quality/integrity of existing data, personnel, and complexity of required modules)
Thank You
Ashokkumar A. Patel, MD
Case Western Reserve University: Instructor, Division of General Medical Sciences, Cancer Center
Bioinformatics PI, AIDS and Cancer Specimen Resource
Case Comprehensive Cancer Center: Coordinator for Clinical Research Informatics; Computational Genomic Epidemiology of Cancer Fellow (R25)
2103 Cornell Rd., WRB 3rd Floor
Cleveland, OH 44106
Phone: 216-368-5106
Fax: 216-368-0494
Data Standardization for Data Sharing
Piper Mullins
Topics
• Metadata Standards – What and Why
• Implementing Standardization
• Sharing Collection Metadata
What Is Metadata?
• Metadata is: Data ‘reporting’
– WHO collected the specimen?
– WHAT is the specimen type?
– WHEN was it collected?
– WHERE is it located?
– HOW was the specimen collected?
– WHY was the specimen collected?
What is Metadata? Inventory Descriptors
(ISBER Best Practices)
• Specimen type, vial type, volume, collection date/time, receipt date, processing date/method, storage temperature, preservative
• Sample processing and movement
• Quality of the sample
What is Metadata? Example Vocabularies
Common public vocabularies for relevant data points:
– SNOMED-CT
– LOINC
– ICD9
– DarwinCore
– ISIS
Why Use Metadata Standards? Inventory Management
– Ensures Consistent, Standard Vocabulary
– Avoids re-entry or duplication of data
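One way a consistent vocabulary can be enforced at data entry is to normalize free-text values against a controlled list. The sketch below is an assumption-laden illustration: the vocabulary `ALLOWED_SPECIMEN_TYPES` and the function `normalize_specimen_type` are hypothetical, and a real system would more likely draw its terms from one of the public vocabularies.

```python
# Hypothetical local vocabulary of specimen types.
ALLOWED_SPECIMEN_TYPES = {"Blood", "Tissue", "DNA", "Fluids"}


def normalize_specimen_type(raw_value, allowed=ALLOWED_SPECIMEN_TYPES):
    """Return the canonical term for raw_value, or None if it is not in the vocabulary."""
    # Match case-insensitively and ignore surrounding whitespace.
    lookup = {term.casefold(): term for term in allowed}
    return lookup.get(raw_value.strip().casefold())
```

Values that come back as `None` can be routed to a curator instead of being stored as a new spelling of an existing term.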
Why Use Metadata Standards? Interoperability & Data Sharing
– Ensures users are able to access appropriate specimens
– Enhances Reporting
Implementing a Standard
• Pick a standard for your database…
• SNOMED-CT, LOINC, ICD9, TBWG, DarwinCore, ISIS, etc.
• OR Create the standard within your organization
http://xkcd.com/927/
Example Standardization Project
Example Standardization Project Stage 1: Gather data
Stage 2: Map Data Fields
Diagram labels: DarwinCore Categories, Lab Metadata, Unit Database Entities
Standardization
Stage 2, cont: Map Data Fields
Common Elements
• Unique Identifier
• Genus Species
• Date Collected
From Departmental Metadata…
• Voucher No#
• Field No#
• Country/Continent/Ocean
• Type/Rare
• Tissue Sample
• DNA Sample Quality
• Tissue ID
• DNA Concentration
• Archival Status
• Restrictions
• Teats Collected
• Field Collection Name
• Sample Preservative
• Storage
• Temperature
• Tube Comments
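The field-mapping stage can be sketched as a lookup table from local field names to Darwin Core terms. The targets used here (`occurrenceID`, `scientificName`, `eventDate`, `catalogNumber`, `fieldNumber`, `preparations`) are real Darwin Core terms, but which local field maps to which term is an assumption made for illustration and is not the mapping used in the project described on these slides.

```python
# Illustrative mapping only; the pairing of local fields to terms is an assumption.
FIELD_TO_DWC = {
    "Unique Identifier": "occurrenceID",
    "Genus Species": "scientificName",
    "Date Collected": "eventDate",
    "Voucher No#": "catalogNumber",
    "Field No#": "fieldNumber",
    "Sample Preservative": "preparations",
}


def to_darwin_core(record, field_map=FIELD_TO_DWC):
    """Split a local record into Darwin Core fields and fields with no mapping yet."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = field_map.get(field)
        if target is None:
            unmapped[field] = value
        else:
            mapped[target] = value
    return mapped, unmapped


# Hypothetical departmental record.
record = {
    "Unique Identifier": "urn:example:0001",
    "Genus Species": "Panthera leo",
    "Date Collected": "2013-06-01",
    "DNA Concentration": "25 ng/uL",
}
mapped, unmapped = to_darwin_core(record)
```

Fields that land in `unmapped`, such as the DNA concentration here, are the department-specific metadata that would still need a home in the shared standard.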
Stage 3: Create Forms for one Standard
Stage 4: Market the Standard Form Example Enterprise-level Information Management System
Data Standards = Data Sharing
Biobank Database
Information management system chart
Lab Metadata Level
• Standard Processes
• Shared Core Metadata
Institutional Collection Level
• Metadata Standard
• Shared Database
• Interoperable
• External Data Sharing
Aggregation Layers
Data Sharing Example Biodiversity.aq
biodiversity.aq example
Data Sharing Example idigbio.org
• Each client:
– Maps initial process with inventory/collection catalogue
– Maps to Data Standard (ex. DarwinCore v2)
• uses Automated Export to create desired file – CSV and/or text
• uses DwC CSV file with Integrated Publishing Toolkit to create DwC-Archive file
• Shares DwC-A file with GBIF (Global Biodiversity Information Facility)
• GBIF – harvests periodically and replaces an old dataset with a newer one.
idigbio.org
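The automated export step in the workflow above can be sketched with Python's standard `csv` module. This covers only the creation of a CSV file with Darwin Core column headers; building the DwC-Archive with the Integrated Publishing Toolkit and sharing it with GBIF are separate steps not shown. The column selection and the function name `export_dwc_csv` are assumptions for illustration.

```python
import csv
import io

# Assumed subset of Darwin Core terms to export.
DWC_COLUMNS = ["occurrenceID", "scientificName", "eventDate"]


def export_dwc_csv(records, columns=DWC_COLUMNS):
    """Serialize records to CSV text, keeping only the chosen Darwin Core columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # drop local fields that are not being shared
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical record; "localNote" is an internal field that is not exported.
records = [
    {
        "occurrenceID": "urn:example:0001",
        "scientificName": "Panthera leo",
        "eventDate": "2013-06-01",
        "localNote": "internal only",
    }
]
csv_text = export_dwc_csv(records)
```

Writing to an in-memory buffer keeps the sketch self-contained; a scheduled export would write the same text to a file for the publishing tool to pick up.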
Project Tips: Standardization & Sharing
Team-based Collaborative Approach
Stakeholder Communication
Slide from: http://interoperability.ucsd.edu/
Diagram labels: “Translator”, Users, IT Designers, PSCI, Research Communities, IT Dept.
Final Thoughts
Fin
Piper Mullins, MIS
Program Coordinator, Pan-Smithsonian Cryo-Initiative
202-633-4054
Get Involved
People like you are the backbone of our work! ISBER is looking for volunteers with strong organizational and leadership skills to get involved. Learn more… contact Debra Leiolani Garcia at [email protected]