Text and Data Mining for Material Synthesis. Elsa Olivetti, MIT; Gerbrand Ceder, UC Berkeley (Departments of Materials Science & Engineering); Andrew McCallum, UMass Amherst (Department of Computer Science & Engineering)

Post on 27-Mar-2020


Page 1

Text and Data Mining for Material Synthesis

Elsa Olivetti, MIT
Gerbrand Ceder, UC Berkeley
Departments of Materials Science & Engineering

Andrew McCallum, UMass Amherst
Department of Computer Science & Engineering

Page 2

Challenges for technology development: Timeline for development is long

Silicon Solar Cells
• 1954: 1st practical silicon solar cell invented at Bell Labs
• 1963: Sharp produces 1st practical solar module of silicon solar cells
• 1982: Kyocera 1st mass produces polysilicon cells by today’s standard process

Lithium Ion Batteries
• 1980: Oxford demonstrates 1st viable rechargeable lithium battery
• 1991: Sony sells 1st commercially available Li-ion batteries for high price consumer electronics
• 2008: 1st uses of Li-ion battery in production vehicles

Page 3

Modern data‐driven and first‐principles materials design accelerates pace of what to make…

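Many-electron Hamiltonian (atomic units):

$$H = -\frac{1}{2}\sum_{i=1}^{N_e} \nabla_i^2 + \sum_{i=1}^{N_e} V_{nuclear}(r_i) + \frac{1}{2}\sum_{i=1}^{N_e}\sum_{j \neq i}^{N_e} \frac{1}{|r_j - r_i|}$$

[Figure: computed voltages (3.2 V, 3.86 V, 3.7 V, 3.76 V, 4.09 V); panels: Phase diagrams, Surfaces, Bandgaps]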

Page 4

Page 5

Page 6

Text extraction workflow

Page 7

Status of data dissemination

Page 8

Page 9

Example: suggesting synthesis conditions for a specific morphology in titania

Edward Kim et al., Chemistry of Materials 2017

Experimentally‐accessible (and reported) variables to facilitate practical synthesis route planning. 

*Grey circled points = accuracy test points
All other points = training data points

Page 10

Virtual synthesis screening is hard: data is sparse & scarce

“Sparse” = each recipe is a high-dimensional vector of synthesis actions
“Scarce” = for materials of interest, not many papers have been published to train on

Can deep learning / generative models be useful for synthesis screening? 
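To make the "sparse" point concrete, here is a minimal sketch of how a recipe could be encoded as a multi-hot vector of synthesis actions. The action vocabulary and the function names (`ACTION_VOCAB`, `encode_actions`, `sparsity`) are hypothetical and chosen only for illustration; the slides do not specify the actual encoding.

```python
# Hypothetical action vocabulary; one mined from the literature would be
# much larger, which is what makes the resulting vectors sparse.
ACTION_VOCAB = [
    "mix", "grind", "calcine", "sinter", "anneal",
    "dissolve", "stir", "dry", "wash", "filter",
]


def encode_actions(actions, vocab=ACTION_VOCAB):
    """Return a multi-hot vector: 1 where the recipe uses that action."""
    index = {name: i for i, name in enumerate(vocab)}
    vec = [0] * len(vocab)
    for action in actions:
        # Actions outside the vocabulary are ignored in this sketch.
        if action in index:
            vec[index[action]] = 1
    return vec


def sparsity(vec):
    """Fraction of entries that are zero."""
    return vec.count(0) / len(vec)


recipe = ["mix", "calcine", "anneal"]
vec = encode_actions(recipe)
print(vec, sparsity(vec))  # prints: [1, 0, 1, 0, 1, 0, 0, 0, 0, 0] 0.7
```

Even with only ten actions, a three-step recipe leaves 70% of the vector empty; the fraction of zeros grows as the vocabulary does.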

Page 11

Schematic of Variational Autoencoder

Variational autoencoder:
• Loss = reconstruction + f(Gaussian)
• Also a generative model

Edward Kim et al., npj Computational Materials 2017

Collaborator, Stefanie Jegelka, CSAIL, MIT
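The loss on this slide can be sketched in NumPy as follows. This assumes the textbook VAE formulation, in which the "f(Gaussian)" term is the KL divergence between the encoder's diagonal Gaussian and a standard normal prior, and it uses squared error as the reconstruction term. Both choices are assumptions rather than details taken from the slides, and the function names are illustrative.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)


def reconstruction_error(x, x_hat):
    """Squared error summed over input features (one possible choice)."""
    return np.sum((x - x_hat) ** 2, axis=-1)


def vae_loss(x, x_hat, mu, log_var):
    """Batch mean of reconstruction error plus the Gaussian (KL) term."""
    per_sample = reconstruction_error(x, x_hat) + kl_to_standard_normal(mu, log_var)
    return np.mean(per_sample)


def sample_latent(mu, log_var, rng):
    """Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


# One synthesis vector with a 2-dimensional latent code.
x = np.array([[1.0, 0.0, 1.0]])
mu = np.array([[1.0, 0.0]])
log_var = np.zeros((1, 2))
z = sample_latent(mu, log_var, np.random.default_rng(0))
loss = vae_loss(x, np.zeros_like(x), mu, log_var)
```

The model is generative because the KL term pulls the latent codes toward N(0, I): drawing z from that prior and passing it through a trained decoder (not shown here) yields new synthesis vectors.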

Page 12

Data augmentation with text & data mining 

Edward Kim et al., npj Computational Materials 2017

Page 13

Example: suggesting synthesis conditions for stabilizing desired materials

Polymorphs of MnO2 overlaid with the most probable alkali ion used in synthesis (intercalation-based phase stability)

Edward Kim et al., npj Computational Materials 2017

[Figure labels: Photocatalysts; Lithium-ion batteries; Molecular sieves; Alkaline batteries; 10,200 articles]

Page 14

Exploratory: Rare phase in common material

Clustering of latent space shows driving conditions for polymorph of TiO2 for photocatalysis

Edward Kim et al., npj Computational Materials 2017
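As a rough illustration of clustering in a latent space, the sketch below runs a small k-means on toy 2-D latent vectors. The slides do not say which clustering method was used, so k-means, the deterministic farthest-point initialization, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def farthest_point_init(points, k):
    """Deterministic init: start at points[0], then repeatedly add the
    point farthest from all centers chosen so far."""
    centers = [points[0]]
    for _ in range(1, k):
        diffs = points[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.min(np.linalg.norm(diffs, axis=2), axis=1)
        centers.append(points[nearest.argmax()])
    return np.array(centers, dtype=float)


def kmeans(points, k, n_iter=20):
    """Plain Lloyd iterations; returns (labels, centers)."""
    points = np.asarray(points, dtype=float)
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        diffs = points[:, None, :] - centers[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Move each center to the mean of its members.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers


# Toy latent vectors: two well-separated groups standing in for
# syntheses that lead to two different polymorphs.
latent = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],
    [10, 10], [10, 11], [11, 10], [11, 11],
], dtype=float)
labels, centers = kmeans(latent, k=2)
```

In practice one would then inspect which synthesis variables differ most between clusters to see what conditions drive each group.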

Page 15

Comparison of literature / virtual samples for SrTiO3 synthesis
(Reference N/A = virtual sample)

Calcination       Sintering        Annealing       NaOH (M)  Reference
800 °C, 2 h       -                -               1         Ye et al., 2016
800 °C, 2 h       1250 °C, 2 h     -               -         Zhao et al., 2004
1000 °C, 12 h     -                500 °C, 2 h     -         Zhao et al., 2015
600-750 °C, 4 h   -                -               -         Puangpetch et al., 2008
721 °C, 1.8 h     -                468 °C, 0.4 h   -         N/A
-                 -                450 °C, 0.9 h   1         N/A
955 °C, 6 h       1182 °C, 7.5 h   -               -         N/A

Edward Kim et al., npj Computational Materials 2017

One cannot train a model exclusively on literature data and classify something as successful or not, since there are no negative examples in the literature

Page 16

Key elements in machine learning for energy technology

Ramprasad et al., npj Computational Materials 2017

Page 17

Applicability and Next steps

• Continue to improve pipeline and disseminate information to the community
• Inform structure for data going forward
• Use cases in:
  • Solid state synthesis, hydrothermal and sol gel methods
  • Alloy design
  • Electrolyte performance

Page 18

Thank you

olivetti.mit.edu

synthesisproject.org