Development and Experience with Tissue Banking Tools to Support Cancer Research
Waqas Amin, M.D., Anil V. Parwani, M.D., Ph.D., and Michael J. Becich, M.D., Ph.D.(1)
(1) Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
(2) Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Introduction:
Over the last decade, the Department of Biomedical Informatics (DBMI) at the University of Pittsburgh has developed and deployed various tissue banking informatics tools to expedite translational medicine research.
Tissue banking informatics deals with the management of clinicopathologic annotation, inventory management, and distribution of biospecimens that are collected and stored for translational research use by the scientific community.
Tissue Banking Informatics:
Aggregation: The process of associating tissue samples with valuable data, including demographic, epidemiology, pathology, progression, vital status, therapy, and outcomes-related data.
Standardization: Collected data must be uniform and shareable. A standardized approach to annotation ensures uniformity, consistency, and quality of collected data, which facilitates information sharing across multiple institutions.
Searchable: An information model supported by a standardized data collection approach allows annotated tissue samples to be matched with research queries, thereby facilitating better understanding of the experimental design and results.
Data Requirement in Cancer Research:
High-quality, accurate, and comprehensive data are required to support genomic, proteomic, clinical, and translational research.
Data must be acquired in accordance with legal and ethical human subject policies.
Types of data collected:
  Demographic data
  Patient clinical data
  Pathology block level data
  Patient treatment data
  Outcome and follow-up data
  Biochemical data
  Genomic level data
  Cell and tissue level data
Data Collection Standards:
Development of Common Data Elements (CDEs):
Standardized clinical annotations are defined in detail using metadata. This allows uniform, consistent, shareable data collection across multiple institutes/systems.
Development of CDEs is supervised by a multidisciplinary team, and the CDE subcommittee developed consensus CDEs incorporating the following standards applicable to organ-specific tissue:
  Association of Directors of Anatomic and Surgical Pathology (ADASP) Cancer Reporting Guidelines
  American Joint Committee on Cancer (AJCC) Cancer Staging Manual
  North American Association of Central Cancer Registries (NAACCR) Data Standards for Cancer Registries
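To make the idea of a metadata-defined CDE concrete, the following is a minimal Python sketch of how an annotation record might be checked against CDE definitions. The element names, permissible values, and the `validate` helper are hypothetical illustrations; they are not the consensus CDEs or the actual tooling described in these slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommonDataElement:
    """Metadata describing one annotation field (hypothetical structure)."""
    name: str
    definition: str
    permissible_values: Optional[frozenset] = None  # None means free text


# Hypothetical example elements; real CDEs would follow the standards listed above.
CDES = {
    "vital_status": CommonDataElement(
        name="vital_status",
        definition="Patient vital status at last follow-up",
        permissible_values=frozenset({"alive", "deceased", "unknown"}),
    ),
    "specimen_type": CommonDataElement(
        name="specimen_type",
        definition="Type of banked biospecimen",
        permissible_values=frozenset({"paraffin block", "fresh frozen tissue", "serum"}),
    ),
}


def validate(record: dict) -> list:
    """Return a list of human-readable problems found in one annotation record."""
    problems = []
    for field, value in record.items():
        cde = CDES.get(field)
        if cde is None:
            problems.append(f"{field}: not a defined CDE")
        elif cde.permissible_values is not None and value not in cde.permissible_values:
            problems.append(f"{field}: '{value}' is not a permissible value")
    return problems
```

Because every site validates against the same definitions, records that pass can be pooled across institutions without per-site cleanup, which is the point of the standardization step.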
Data Sources:
Data import from automated electronic systems such as AP-LIS, CP-LIS, and radiology and registry information systems (RIS).
Patient questionnaires, patient health records and treatment charts, existing databases, consultation with referring physicians, archived data, and pathology reports.
De-Identification of PHI:
The purpose is to ensure proper confidentiality and privacy of human subjects based upon Institutional Review Board approved protocols.
De-identification of PHI is done by an Honest Broker according to Health Insurance Portability and Accountability Act (HIPAA) regulations, by assigning unique codes to patient identifiers.
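The coding step can be sketched in Python as follows. This is only an illustration under stated assumptions: the `HonestBroker` class, the field names, and the `SUBJ-` code format are hypothetical, and replacing an identifier with a code does not by itself satisfy HIPAA de-identification requirements.

```python
import secrets


class HonestBroker:
    """Keeps the only link between patient identifiers and research codes."""

    def __init__(self):
        # Held by the broker only; never shared with researchers.
        self._code_by_identifier = {}

    def code_for(self, identifier: str) -> str:
        """Return a stable random code for an identifier, creating one if needed."""
        if identifier not in self._code_by_identifier:
            # Random code, so it cannot be derived from the identifier itself.
            self._code_by_identifier[identifier] = "SUBJ-" + secrets.token_hex(8)
        return self._code_by_identifier[identifier]

    def deidentify(self, record: dict, phi_fields: set, id_field: str) -> dict:
        """Drop PHI fields and replace the patient identifier with its code."""
        cleaned = {
            key: value
            for key, value in record.items()
            if key not in phi_fields and key != id_field
        }
        cleaned["subject_code"] = self.code_for(record[id_field])
        return cleaned


broker = HonestBroker()
raw = {"mrn": "12345", "name": "Jane Doe", "diagnosis": "mesothelioma"}
print(broker.deidentify(raw, phi_fields={"name"}, id_field="mrn"))
```

The same patient always receives the same code, so specimens and follow-up data collected at different times can still be linked in the research view without exposing the identifier.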
Specimen collection and standardization
Biospecimens are collected according to standardized pathology and tissue banking protocols. Biospecimens collected and stored for the tissue banking projects include:
  Paraffin blocks
  Fresh frozen tissue
  Blood products, including:
    Serum
    Plasma
    Buffy coat
    RBC
    WBC
Tissue Banking Information Models and Architecture:
Two types of information models have been utilized in the development of the tissue banks.
Organ-specific databases (OSD):
  Cooperative Prostate Cancer Tissue Resource (CPCTR) (www.cpctr.info)
  Pennsylvania Cancer Alliance for Bioinformatics Consortium (PCABC) (www.pcabc.upmc.edu)
  Early Detection Research Network (EDRN) Colorectal and Pancreatic Neoplasm database
  SPORE Head and Neck Neoplasm Database
Model Driven Approach (Database):
  National Mesothelioma Virtual Bank (NMVB) (www.mesotissue.org)
OSD (Organ Specific Database):
OSD has a three-tiered architecture and is implemented on Oracle Application Server v10.1.2.3 running on Windows 2003 and Oracle RDBMS v10.2.0.2 running on an AIX 5L virtual host definition supported by IBM x3850 system hardware.
Dynamic web pages are generated for the database users using Oracle HTTP Server and mod_plsql extensions.
The data annotation engine is a flexible, dynamic web-based tool, while the data query engine enables investigators to search de-identified information within the warehouse through a "point and click" interface.
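One way a "point and click" selection could be turned into a database search is sketched below in Python. This is an assumption-laden illustration, not the OSD implementation: it uses the standard `sqlite3` module as a stand-in for Oracle and mod_plsql, and the `cases` table, its columns, and the `build_query` helper are hypothetical.

```python
import sqlite3

# Hypothetical whitelist of searchable, de-identified columns.
SEARCHABLE = {"tissue_type", "specimen_type", "vital_status"}


def build_query(filters: dict):
    """Turn {column: value} selections into parameterized SQL and its parameters."""
    unknown = set(filters) - SEARCHABLE
    if unknown:
        raise ValueError(f"not searchable: {sorted(unknown)}")
    columns = sorted(filters)  # fixed order keeps the generated SQL deterministic
    where = " AND ".join(f"{col} = ?" for col in columns)
    sql = "SELECT subject_code FROM cases"
    if where:
        sql += " WHERE " + where
    return sql + " ORDER BY subject_code", [filters[col] for col in columns]


def demo():
    """Run one example search against a small in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (subject_code TEXT, tissue_type TEXT, "
        "specimen_type TEXT, vital_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO cases VALUES (?, ?, ?, ?)",
        [
            ("S1", "prostate", "paraffin block", "alive"),
            ("S2", "prostate", "serum", "alive"),
            ("S3", "breast", "paraffin block", "deceased"),
        ],
    )
    sql, params = build_query({"tissue_type": "prostate", "specimen_type": "serum"})
    return [row[0] for row in conn.execute(sql, params)]
```

Column names come only from the whitelist and values are passed as bound parameters, so user selections never reach the SQL text directly.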
OSD Multi Tier Architecture:
[Figure: OSD multi-tier architecture diagram. Labels shown: Physical Data; Presentation; Metadata Engine; Application Data Layer; Common Data Elements (CDE) Definitions; Business Rules Engine; Mapping Engine; HELP Builder; Security Engine; Registration; Authorization; Authentication; Security Data Layer; Metadata Data Layer; Metadata Curation; Manual Annotation; Data Query; Data Import/Export; Admin Security]
OSD (Meta Data Builder Tool):
OSD Feature List:
To address the needs of heterogeneous users, we identified numerous criteria for success. Some requirements and features are listed below:
  Quick statistics on overall data.
  Multi-mode search: Multiplex search and Advanced search.
  Mechanism for keeping users oriented (e.g., help, persistence of last entered query text).
  Results in tabular form, with sorting on each column, including access to the full case report.
  Both Honest Broker and de-identified (researcher) access.
  Controlled access to subjects for different studies.
Feature List (contd.):
  Standard and customized query results of the data.
  Individual research and consent based access to information.
  Quick search using cases saved in "My Cases".
  Query Builder interface.
  Online Help Manual Builder.
  The model can support a multi-institutional data enterprise model.
  User Management Module helps create, revoke, and control users' access and activities within the database.
  Business layer allows for creation of complex/logical data fields based on data interpretation by experts.
OSD Model Based Head and Neck Neoplasm Virtual Biorepository:
A bioinformatics-driven system is being developed to utilize multimodal data sets from patient questionnaires and from clinical, pathological, radiology, and molecular systems.
Results in one architecture supported by a set of CDEs to facilitate basic science, clinical, and translational research.
Systems are designed to facilitate semantic and syntactic interoperability in the development of data elements (i.e., metadata or data descriptors using controlled vocabulary and ontology).
Provides data entry, data mining, and analysis tools.
OSD Integration with other Data Sources:
[Figure: data sources integrated with the SPORE H&N Neoplasm Database. Labels shown: Genotype Lab data; Bio-marker data; Radiology (PET/CT) data; Patient Insurance information; Human Papilloma Virus Questionnaire data; Epidemiology Project-1 questionnaire data; AP-LIS/CP-LIS; RIS; BIOS]
Data Collection & Annotation Tool:
[Screenshots of the tool, with the following captions]
  User authentication
  User Management Module: the administrator can create, edit, revoke, and control users and their access to different applications
  Manual data collection module
  Case summary
  Users can switch quickly between the available applications according to their access rights
  Quick overall review of statistics on the collected database
  Data query template
  Standard view
  Descriptions of different views for reference
  Allows data export for statistical analysis packages, such as SAS
  Full case report view (identified or de-identified as per access level)
  Users can have multiple "My Cases" lists for different studies
  Users can also select any data field to create personalized views and save them under "My Views"
  Administrators can edit or create data views
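The export for statistical packages mentioned among the tool's capabilities could work along the following lines. This Python sketch writes a saved view as CSV text, a format that SAS and similar packages can import. The `export_view` helper and the column names are hypothetical, and the actual export format of the tool is not stated in the slides.

```python
import csv
import io


def export_view(rows: list, columns: list) -> str:
    """Serialize only the selected columns of each case as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip fields that are not part of the view
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cases = [
    {"subject_code": "S1", "tissue_type": "prostate", "internal_note": "not exported"},
    {"subject_code": "S2", "tissue_type": "breast", "internal_note": "not exported"},
]
print(export_view(cases, ["subject_code", "tissue_type"]))
```

Restricting the output to the columns of a saved view keeps fields outside that view out of the exported file.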
OSD based Databases Accruals:
(The last three columns give the total number of biospecimens by type.)
Virtual Biorepository | Tissue type | Total # Cases | Paraffin Blocks | Frozen Blocks | Blood/Serum/Plasma
CPCTR | Prostate | 7000 | 34641 | 17508 | 17508
PCABC | Breast | 3645 | 1760 | 847 | 823
PCABC | Melanoma | 1762 | 1885 | 168 | 112
PCABC | Prostate | 7327 | 5457 | 1642 | 415
EDRN Colorectal and Pancreatic Neoplasm Virtual Biorepository | Pancreas and colon | 2459 | 175 | 942 | 1254
SPORE's Head & Neck Neoplasm Virtual Biorepository | Head and Neck Neoplasm | 11622 | 2237 | 0 | 1038
(Amin et al., Tissue banking informatics, 2010)
Model Driven Database (MDD):
NMVB is developed using a model-driven approach (MDD).
Application components are generated from UML domain models.
Java based application designed using a Model-Driven Development framework.
MDD (contd.):
Web Tier: Constructs web pages from the metadata dictionary.
Business Tier: Provides an object/relational mapping mechanism, a metadata interrogation mechanism, an application programming interface, and a set of shared services.
Data Tier: Consists of the domain database that houses clinically annotated data, indexes to support the query mechanism, and security data.
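The web tier's approach of building pages from a metadata dictionary can be illustrated with a short sketch. NMVB itself is a Java application; the Python below is only an illustration of the general technique, and the `METADATA` entries, field names, and rendering functions are hypothetical.

```python
from html import escape

# Hypothetical metadata dictionary entries; names and types are illustrative only.
METADATA = [
    {
        "name": "histologic_type",
        "label": "Histologic type",
        "type": "choice",
        "choices": ["epithelioid", "sarcomatoid", "biphasic"],
    },
    {"name": "age_at_diagnosis", "label": "Age at diagnosis", "type": "number"},
]


def render_field(entry: dict) -> str:
    """Render one metadata entry as an HTML form control."""
    name = escape(entry["name"])
    label = escape(entry["label"])
    if entry["type"] == "choice":
        options = "".join(f"<option>{escape(c)}</option>" for c in entry["choices"])
        control = f'<select name="{name}">{options}</select>'
    else:
        field_type = escape(entry["type"])
        control = f'<input name="{name}" type="{field_type}">'
    return f"<label>{label} {control}</label>"


def render_form(metadata: list) -> str:
    """Build a whole data entry form from the metadata dictionary."""
    return "<form>" + "".join(render_field(e) for e in metadata) + "</form>"
```

With this design, adding a data element means adding a dictionary entry rather than writing a new page, which is the main appeal of a model-driven approach.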
Virtual Component of NMVB:
  Statistical Data Query Interface
  Approved Investigator Query Interface
  Data Entry Interface
www.mesotissue.org
NMVB Accruals:
Year | Retrospective Cases | Prospective Cases | Overall NMVB Total
2006 | 515 | 8 | 523
2007 | 585 | 50 | 635
2008 | 605 | 105 | 710
2009 | 674 | 162 | 836
2010 (to date) | 674 | 183 | 865
Conclusion:
Informatics-supported tissue banking initiatives act as a large source of annotated biospecimens and facilitate basic and clinical science research.
Tissue banking infrastructure allows efficient governance, standardized capture of data, and detailed standardized annotation at the local institute and across multiple collaborating sites.
Finally, tissue banking tools developed at DBMI (Department of Biomedical Informatics) provide an important knowledge base for the development of integrated tissue banking efforts and benefit other tissue banking initiatives by providing consultation.
Thank you