
We implement a gradient-based interpretation method that attributes prediction scores to the input by computing the attribution score as gradient × input (Ancona et al., 2018). We compute the frequency of words with attribution score ≥ 0.2 (a threshold chosen to select on average 3% of the words per note) and use the MetaMap dictionary as a filter to extract medically relevant terms. We show the top 10 words that the model most strongly associates with each disease.
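As a minimal sketch, gradient × input attribution can be computed as below. The "model" here is a toy linear scorer over word embeddings (so the gradient of the score with respect to each embedding is simply the weight vector); the vocabulary, embeddings, and weights are invented for illustration and are not the poster's actual neural encoder.

```python
# Toy sketch of gradient x input attribution with a linear scorer.
# All vocabulary, embeddings, and weights below are illustrative.

vocab = ["fever", "cough", "the", "lameness", "exam"]
embed = {                                  # tiny hand-made embeddings, dim 3
    "fever":    [0.9, 0.1, 0.0],
    "cough":    [0.4, 0.8, 0.1],
    "the":      [0.0, 0.0, 0.1],
    "lameness": [0.7, 0.2, 0.6],
    "exam":     [0.1, 0.1, 0.2],
}
w = [1.0, 0.5, -0.2]                       # disease-score weights

def attribution(word):
    """Gradient x input for one token: since score = sum_i w . x_i,
    d(score)/d(x_i) = w, so the token's attribution is (w * x_i) summed."""
    return sum(wi * xi for wi, xi in zip(w, embed[word]))

note = ["fever", "cough", "the", "exam"]
scores = {word: attribution(word) for word in note}

# Keep words whose attribution meets the threshold (0.2 in the poster);
# counting kept words across many notes gives the frequency ranking.
kept = [word for word, s in scores.items() if s >= 0.2]
```

With these toy values, content words like "fever" and "cough" pass the threshold while the stopword "the" does not, which mirrors how the MetaMap filter plus thresholding surfaces medically relevant terms.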

Interpretation

Data

Example

Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding

Large-scale veterinary clinical records can become a powerful resource for patient care and research. However, clinicians lack the time and resources to annotate patient records with standard medical diagnostic codes. The lack of standard coding makes it challenging to use the clinical data to improve patient care, and models trained on one dataset can easily become biased and perform poorly on general clinical records. We refer to this as the cross-hospital challenge.

○ The veterinary medicine domain lacks coding infrastructure and standardized nomenclatures across medical institutions.

Why is annotating clinical notes automatically important for the veterinary medicine field?

● Identifying clinical cohorts of veterinary patients on a large scale for clinical research (Baraban, 2014).

● Animals have important translational impact on the study of human disease (Kol, 2015).

● Spontaneous models of disease in companion animals are used in drug development pipelines (Hernandez, 2018).

Introduction

Yuhui Zhang*, Allen Nie*, James Zou
Tsinghua University, Stanford University

We use three datasets. CSU and PP are labeled with SNOMED-CT codes by veterinarians from Colorado State University and a commercial veterinary practice, respectively; 42 diseases are actually coded in the data. SAGE is a large set of clinical notes from the SAGE Centers without codes.
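Since each note can carry several of the 42 disease codes, the labels form a multi-label target. A small sketch of how a coded note could be turned into a binary indicator vector (the code identifiers below are invented stand-ins, not real SNOMED-CT codes, and the real data has 42 columns rather than four):

```python
# Hypothetical multi-label encoding over disease codes.
# "D01".."D04" are invented placeholders for SNOMED-CT codes.

disease_codes = ["D01", "D02", "D03", "D04"]       # 42 codes in the real data
code_to_col = {c: i for i, c in enumerate(disease_codes)}

def to_multilabel(note_codes):
    """One binary indicator per disease code; a note may carry several."""
    y = [0] * len(disease_codes)
    for c in note_codes:
        y[code_to_col[c]] = 1
    return y

y = to_multilabel(["D01", "D04"])                  # note coded with two diseases
```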

Two tasks are shown: generative modeling (top) and supervised learning (bottom). The dashed arrows represent the generative modeling process on the unlabeled SAGE data, and the solid arrows represent the supervised learning process on the labeled CSU data. An additional test is done on the PP data (not shown).

Model

MetaMap processes a document and outputs a list of matched medically relevant keywords; we use a bag-of-words feature representation over these keywords to train an SVM or MLP. CAML (Mullenbach et al., 2018) is the state of the art on MIMIC, an open dataset of ICU medical records. LSTM and Transformer are the two base encoder models; +Word2Vec initializes with Word2Vec embeddings trained on SAGE; +Pretrain initializes with a generative modeling loss on SAGE; +Auxiliary adds a generative modeling loss on CSU to the classification objective on CSU. We find that generative modeling outperforms Word2Vec and helps the Transformer more.
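A minimal sketch of the +Auxiliary objective: the total training loss combines the multi-label classification loss with a weighted generative (next-token) modeling loss on the same notes. The weight `lam` and all function names here are illustrative assumptions, not values from the poster.

```python
import math

# Sketch of a combined classification + generative auxiliary objective.
# `lam` and every name below are illustrative, not from the poster.

EPS = 1e-9  # numerical floor to keep log() finite

def multilabel_bce(probs, targets):
    """Binary cross-entropy averaged over the disease codes."""
    return -sum(t * math.log(p + EPS) + (1 - t) * math.log(1 - p + EPS)
                for p, t in zip(probs, targets)) / len(probs)

def lm_nll(next_token_probs):
    """Generative modeling loss: mean negative log-likelihood of the
    probability the model assigned to each actual next token."""
    return -sum(math.log(p + EPS) for p in next_token_probs) / len(next_token_probs)

def auxiliary_objective(cls_probs, cls_targets, next_token_probs, lam=0.5):
    # Total loss = classification loss + lam * generative auxiliary term.
    return multilabel_bce(cls_probs, cls_targets) + lam * lm_nll(next_token_probs)
```

A perfect model (probability 1 on every true code and next token) drives the objective to roughly zero, while uncertain predictions raise both terms; tuning `lam` trades off coding accuracy against the auxiliary language-modeling signal.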

Results

Aronson, A. R., and Lang, F.-M. 2010. An overview of MetaMap: historical perspective and recent advances. Journal of the American Medical Informatics Association 17(3):229–236.

Radford, A.; Narasimhan, K.; Salimans, T.; and Sutskever, I. 2018. Improving language understanding by generative pre-training.

Ancona, M.; Ceolini, E.; Oztireli, C.; and Gross, M. 2018. Towards better understanding of gradient-based attribution methods for deep neural networks. In 6th International Conference on Learning Representations (ICLR 2018).

References

Data are available from the authors only upon reasonable request and with permission of the Colorado State University College of Veterinary Medicine, the private hospital, and the SAGE Centers. Code is available at https://github.com/yuhui-zh15/VetTag.

Availability