Learning Molecular Fingerprints from the Graph Up
David Duvenaud, Dougal Maclaurin,
Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli,
Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams
Harvard University
September 30, 2015
Motivation
• Want to do regression on molecules
• For virtual screening of drugs, materials, etc.
• Problem: Molecules can be any size and shape
• Only know how to learn from fixed-size examples.
• How to take a molecule in and produce a fixed-size vector?
Circular Fingerprints
• Standard method lists all substructures below a certain size
• Can do this by combining the hash of each atom with those of its bonded neighbors
• Hash value indexes into a fixed-size vector
• Problem: can’t optimize with gradients
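The hashing procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from any chemistry toolkit: the molecule is a hypothetical adjacency list with integer atom features, `stable_hash` is an illustrative helper built on `hashlib`, and sorting the neighbor identifiers is an added choice to make the result independent of neighbor order.

```python
import hashlib

def stable_hash(values):
    """Deterministic hash of a tuple of ints (illustrative stand-in for the fingerprint hash)."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def circular_fingerprint(atom_features, neighbors, radius, length):
    """Binary fingerprint from per-atom integer features and an adjacency list."""
    fingerprint = [0] * length
    r = [stable_hash((f,)) for f in atom_features]  # initial atom identifiers
    for _ in range(radius):
        new_r = []
        for a in range(len(r)):
            # Combine the atom's identifier with its neighbors' identifiers.
            # Sorting (an assumption here) removes dependence on neighbor order.
            v = (r[a], *sorted(r[n] for n in neighbors[a]))
            h = stable_hash(v)
            new_r.append(h)
            fingerprint[h % length] = 1  # hash value indexes into the vector
        r = new_r
    return fingerprint

# Hypothetical toy molecule: heavy atoms C-C-O, features are atomic numbers.
atom_features = [6, 6, 8]
neighbors = [[1], [0, 2], [1]]
fp = circular_fingerprint(atom_features, neighbors, radius=2, length=16)
print(fp)
```

Every step after the feature lookup is a hash, a modulo, or a hard write, so no useful gradient reaches any parameter, which is the problem the last bullet names.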
What would Ryan do?
• Maybe we can build a message-passing network
• Same function is applied to each node (atom) and its neighbors
• Like a convolutional net
• At the top, add all nodes’ vectors together
• If we use a softmax, this generalizes circular fingerprints
Continuous-izing Circular Fingerprints
Circular fingerprints
  Input: molecule, radius R, fingerprint length S
  Initialize: fingerprint vector f ← 0_S
  for each atom a in molecule do
      r_a ← g(a)                      ▷ look up atom features
  for L = 1 to R do                   ▷ for each layer
      for each atom a in molecule do
          r_1 … r_N = neighbors(a)
          v ← [r_a, r_1, …, r_N]      ▷ concatenate
          r_a ← hash(v)               ▷ hash function
          i ← mod(r_a, S)             ▷ convert to index
          f_i ← 1                     ▷ write 1 at index
  Return: binary vector f

Neural graph fingerprints
  Input: molecule, radius R, weights H_1^1 … H_R^5, output weights W_1 … W_R
  Initialize: fingerprint vector f ← 0_S
  for each atom a in molecule do
      r_a ← g(a)                      ▷ look up atom features
  for L = 1 to R do                   ▷ for each layer
      for each atom a in molecule do
          r_1 … r_N = neighbors(a)
          v ← r_a + Σ_{i=1}^{N} r_i   ▷ sum
          r_a ← σ(v H_L^N)            ▷ smooth function
          i ← softmax(r_a W_L)        ▷ sparsify
          f ← f + i                   ▷ add to fingerprint
  Return: real-valued vector f

Every non-differentiable operation is replaced with a differentiable analog.
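The neural graph fingerprint listing above translates fairly directly into NumPy. The following is a minimal sketch, not the authors' code. It makes several assumptions: `tanh` is used for the smooth function σ, atom features are hypothetical one-hot vectors, weights are random rather than trained, all atoms in a layer are updated from the previous layer's values, and `hidden_weights[L][n]` holds the matrix for layer L and atoms with n neighbors.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

def neural_fingerprint(atom_features, neighbors, hidden_weights, output_weights):
    """Real-valued fingerprint; one (hidden, output) weight pair per layer."""
    f = np.zeros(output_weights[0].shape[1])
    r = np.asarray(atom_features, dtype=float)
    for H, W in zip(hidden_weights, output_weights):
        new_r = np.empty_like(r)
        for a, nbrs in enumerate(neighbors):
            v = r[a] + sum(r[n] for n in nbrs)       # sum replaces concatenation
            new_r[a] = np.tanh(v @ H[len(nbrs)])     # smooth function replaces hash
            f += softmax(new_r[a] @ W)               # softmax replaces index + write
        r = new_r
    return f

# Hypothetical toy setup: 3 atoms (C, C, O) with one-hot features, random weights.
rng = np.random.default_rng(0)
d, S, R = 3, 8, 2  # feature width, fingerprint length, radius
hidden_weights = [{n: rng.normal(size=(d, d)) for n in range(1, 6)} for _ in range(R)]
output_weights = [rng.normal(size=(d, S)) for _ in range(R)]
atom_features = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0]])
neighbors = [[1], [0, 2], [1]]
fp = neural_fingerprint(atom_features, neighbors, hidden_weights, output_weights)
print(fp)
```

Because each softmax output sums to 1, the fingerprint entries sum to the number of atoms times the number of layers, and summing over atoms makes the result independent of atom ordering.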
Generalizing Circular Fingerprints
• If we generalize existing fingerprints, we can’t not win (unless we overfit)
• Large random weights make neural nets act like hash functions
• Looked at similarities between pairwise distances.
[Figure: "Neural vs Circular distances, r = 0.823". Scatter plot of neural fingerprint distances (y-axis, 0.5 to 1.0) against circular fingerprint distances (x-axis, 0.5 to 1.0).]
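One way to see why large weights push the network toward hash-like behavior is to look at the softmax alone: scaling its input sharpens the output toward a one-hot vector, which resembles writing a 1 at a single index. A small NumPy sketch, with arbitrary illustrative logits and scale factors (not values from the talk):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([0.3, -1.2, 0.9, 0.1])  # arbitrary illustrative values

# Larger scales stand in for larger weight magnitudes.
for scale in (1.0, 10.0, 100.0):
    print(scale, np.round(softmax(scale * logits), 3))

soft = softmax(logits)           # spread across several entries
sharp = softmax(100.0 * logits)  # nearly all mass on a single index
```

With small weights the contribution is spread over many fingerprint entries; with large weights nearly all of it lands on one index.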
Generalizing Circular Fingerprints
• If we generalize existing fingerprints, we can’t not win (unless we overfit)
• Large random weights make neural nets act like hash functions
• Looked at performance of random weights.
[Figure: RMSE (log Mol/L) against fingerprint radius (0 to 6) for three methods: circular fingerprints, random conv with large parameters, random conv with small parameters.]
Performance
| Dataset | Solubility | Drug efficacy | Photovoltaic efficiency |
|---|---|---|---|
| Units | log Mol/L | EC50 in nM | percent |
| Predict mean | 4.29 ± 0.40 | 1.47 ± 0.07 | 6.40 ± 0.09 |
| Circular FPs + linear layer | 1.84 ± 0.08 | 1.13 ± 0.03 | 2.62 ± 0.07 |
| Circular FPs + neural net | 1.40 ± 0.15 | 1.24 ± 0.03 | 2.04 ± 0.07 |
| Neural FPs + linear layer | 0.74 ± 0.09 | 1.16 ± 0.03 | 2.71 ± 0.13 |
| Neural FPs + neural net | 0.53 ± 0.07 | 1.17 ± 0.03 | 1.44 ± 0.11 |

• Could also try varying depth of neural net on top (used one hidden layer here)
Interpretability
• Circular fingerprints activate for a single substructure
• No generalization
• No notion of similarity
• Let’s put a linear layer on top of neural fingerprints and examine which fragments activate the most predictive features.
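A rough sketch of that inspection step, assuming we already have each atom's contribution to every fingerprint feature (the softmax outputs before they are summed) and a fitted linear layer. All arrays below are synthetic placeholders, not results from the talk, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_atoms, fp_length = 5, 8

# Hypothetical per-atom contributions for one molecule at one layer.
# Each row sums to 1, like a softmax output.
contrib = rng.dirichlet(np.ones(fp_length), size=num_atoms)

# Stand-in for the weights of a fitted linear layer on top of the fingerprint.
linear_weights = rng.normal(size=fp_length)

# Feature with the largest positive weight (most predictive of a high target).
top_feature = int(np.argmax(linear_weights))

# Atoms ranked by how strongly they activate that feature.
atom_ranking = np.argsort(-contrib[:, top_feature])
print(top_feature, atom_ranking)
```

The top-ranked atom, together with its neighborhood out to the layer's radius, is the fragment one would draw for that feature. Using `np.argmin` instead would pick the feature most predictive of a low target.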
Interpretability: Solubility
Fragments activating the feature most predictive of solubility:
[Figure: molecular fragments; recovered atom labels include O, OH, and NH.]
Fragments activating the feature most predictive of insolubility:
Interpretability: Toxicity
Fragments most activated by toxicity feature on SR-MMP dataset:
Fragments most activated by toxicity feature on NR-AHR dataset:
Future Work
• Limitation: Slow because of so many weight transforms
• Could use low-rank weight matrices
• Limitation: All features are local
• Could learn to “parse” molecules
• But how to take gradients?
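For the first limitation, the low-rank idea can be illustrated with a short NumPy sketch. It assumes a hypothetical feature width `d` and rank `k`, with the weight matrix factored as `U @ V`; the talk does not specify a particular factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 4  # hypothetical feature width and rank

U = rng.normal(size=(d, k))
V = rng.normal(size=(k, d))
v = rng.normal(size=d)  # one atom's summed feature vector

full = v @ (U @ V)     # materializes the d x d matrix first
cheap = (v @ U) @ V    # same product through the rank-k bottleneck

full_params = d * d          # parameters in an unconstrained matrix
low_rank_params = 2 * d * k  # parameters in the two factors
print(full_params, low_rank_params)
```

The two products agree up to floating-point error, while the factored form stores and multiplies far fewer numbers when `k` is much smaller than `d`.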