Using MongoDB for Materials Discovery Michael Kocher and Dan Gunter Lawrence Berkeley National Lab

Upload: dan-gunter

Post on 29-Jul-2015


Page 1: Using MongoDB for Materials Discovery

Using MongoDB for Materials Discovery

Michael Kocher and Dan Gunter

Lawrence Berkeley National Lab

Page 2: Using MongoDB for Materials Discovery

Energy Mission at LBNL

• Li-ion Batteries

• Photovoltaics (Solar Cells)

• Thermoelectrics

• Biofuels

• New Computational Tools

• Cutting-edge Spectroscopic Tools (Advanced Light Source)

http://carboncycle2.lbl.gov/

Page 3: Using MongoDB for Materials Discovery

Current Material Design model is Slow

18 years: the average time from a new material's discovery to its commercialization

Bringing New Materials to the Market: Eagar, T.W. Technology Review Feb 1995, 98, 42.

Page 4: Using MongoDB for Materials Discovery

Materials Genome Initiative: A Renaissance of American Manufacturing

“To help businesses discover, develop, and deploy new materials twice as fast, we're launching what we call the Materials Genome Initiative. The invention of silicon circuits and lithium-ion batteries made computers and iPods and iPads possible -- but it took years to get those technologies from the drawing board to the marketplace. We can do it faster.”

- President Obama at Carnegie Mellon University 6/24/2011

Page 5: Using MongoDB for Materials Discovery

What is a Material?

Page 6: Using MongoDB for Materials Discovery

NaCl Silicon

Page 7: Using MongoDB for Materials Discovery

LiCoO2 (structure figure; atom labels: Li, O, Co)

Page 8: Using MongoDB for Materials Discovery

What can we compute using quantum mechanics?

No empirical parameters!

• volume

• density

• total energy

• formation energy

• metallic?

• etc.

Page 9: Using MongoDB for Materials Discovery

MIT and LBNL collaboration

“The Google of Material Science Data”

MaterialsProject.org

Page 10: Using MongoDB for Materials Discovery

Inverting the Problem

Page 11: Using MongoDB for Materials Discovery

Detailed Properties

Page 12: Using MongoDB for Materials Discovery

Machine Learning

Structure 1, Structure 2, Structure 3, Structure 4, Structure 5, Structure 6

materials.bson → Learning Algorithm → (new materials)

Prof. Gerbrand Ceder (DOI: 10.1103/PhysRevLett.91.135503)

What about Na, V, P, O?

How often can you substitute Mg for Ca?
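
One way to read the question above is as a counting problem over known compounds. The sketch below is an illustration only and is not the method of the cited paper: it takes a made-up list of (prototype label, element set) records and counts, for each pair of elements, how often two compounds share a prototype and differ by exactly that one swap. The records, the prototype labels `P1` and `P2`, and the function name `count_substitutions` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: (prototype label, set of elements).
# The prototype assignments are invented for illustration; real data
# would come from a collection such as materials.bson.
COMPOUNDS = [
    ("P1", frozenset({"Mg", "O"})),
    ("P1", frozenset({"Ca", "O"})),
    ("P1", frozenset({"Na", "Cl"})),
    ("P2", frozenset({"Ca", "Ti", "O"})),
    ("P2", frozenset({"Mg", "Ti", "O"})),
    ("P2", frozenset({"Sr", "Ti", "O"})),
]

def count_substitutions(compounds):
    """Count element swaps between compounds that share a prototype."""
    counts = Counter()
    for (proto_a, elems_a), (proto_b, elems_b) in combinations(compounds, 2):
        if proto_a != proto_b:
            continue
        only_a = elems_a - elems_b
        only_b = elems_b - elems_a
        # Exactly one element differs on each side: a single substitution.
        if len(only_a) == 1 and len(only_b) == 1:
            pair = tuple(sorted(only_a | only_b))
            counts[pair] += 1
    return counts

substitutions = count_substitutions(COMPOUNDS)
print(substitutions[("Ca", "Mg")])
```

A count like this is only a starting point; turning it into a substitution probability would require a normalization that the slide does not describe.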

Page 13: Using MongoDB for Materials Discovery

Materials Project: A Play in Three Acts

I. Data generation using HTC

II. Data storage

III. Data analysis/logging

Page 14: Using MongoDB for Materials Discovery

Act I: Managing Calculations

• Centralized distributed model is the only way to go

• Hub is at LBNL

• Store the state in the database

• Overview of running many MPI jobs at many different HPC centers
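
The idea of keeping job state in the database can be sketched as a small state machine over job documents. The example below is a hedged illustration that uses an in-memory list in place of a MongoDB collection; the field names (`state`, `host`), the state values, and the helpers `claim_next_job` and `finish_job` are assumptions, not taken from the slides. Against a real database the claim step would need to be a single atomic update (for example PyMongo's `find_one_and_update`) so that two managers cannot take the same job.

```python
# In-memory stand-in for a queue collection; each dict is one job document.
# Field names and state values are hypothetical.
queue = [
    {"_id": 1, "formula": "LiCoO2", "state": "WAITING", "host": None},
    {"_id": 2, "formula": "NaCl", "state": "WAITING", "host": None},
]

def claim_next_job(jobs, host):
    """Mark the first waiting job as running on `host` and return it."""
    for job in jobs:
        if job["state"] == "WAITING":
            job["state"] = "RUNNING"
            job["host"] = host
            return job
    return None  # nothing left to run

def finish_job(job, ok):
    """Record the outcome so every site sees the same state."""
    job["state"] = "DONE" if ok else "FAILED"

first = claim_next_job(queue, "hopper")
second = claim_next_job(queue, "lr1")
finish_job(first, ok=True)
finish_job(second, ok=False)
print([(j["_id"], j["state"], j["host"]) for j in queue])
```

Because the state lives in one place, any site can ask which jobs are waiting, running, or failed without contacting the other sites.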

Page 15: Using MongoDB for Materials Discovery

MasterQueue: master_queue.bson

‘The Brain’

HPC

• NERSC (Oakland): Franklin, Hopper, Carver

• Lawrencium (Berkeley): lr1, lr2

• manager.x runs on each machine

• builder.x: pull crystal, create a new engine, add to queue

Page 16: Using MongoDB for Materials Discovery

Example

MongoDB

Centralized Logging and Management

• Sites: NERSC (Oakland), LBNL, MIT, Kentucky

• Machines: Franklin, Hopper, Carver, lr1, lr2, CathodeO1, DLX

• manager.x runs on each machine

query = {'elements': {'$all': ['Li', 'O']}, 'nelectrons': {'$lte': 200}}
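
To make the query above concrete, the sketch below evaluates a syntactically corrected form of it against a few made-up documents using a tiny pure-Python matcher. The matcher only mimics the `$all` and `$lte` operators and is not MongoDB's implementation; the sample documents and the helper name `matches` are invented. With PyMongo and a running server, the same dictionary would instead be passed to `collection.find(query)`.

```python
query = {"elements": {"$all": ["Li", "O"]}, "nelectrons": {"$lte": 200}}

# Invented sample documents; the third is a placeholder for a large cell.
docs = [
    {"formula": "LiCoO2", "elements": ["Li", "Co", "O"], "nelectrons": 46},
    {"formula": "NaCl", "elements": ["Na", "Cl"], "nelectrons": 28},
    {"formula": "Li2O-big", "elements": ["Li", "O"], "nelectrons": 500},
]

def matches(doc, query):
    """Return True if `doc` satisfies every condition in `query`.

    Supports only the two operators used here: $all and $lte.
    """
    for field, condition in query.items():
        value = doc.get(field)
        for op, arg in condition.items():
            if op == "$all":
                # Every listed item must appear in the document's array.
                if value is None or not set(arg) <= set(value):
                    return False
            elif op == "$lte":
                if value is None or not value <= arg:
                    return False
            else:
                raise ValueError(f"unsupported operator: {op}")
    return True

hits = [d["formula"] for d in docs if matches(d, query)]
print(hits)
```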

Page 17: Using MongoDB for Materials Discovery

Act II: Core Data Storage

Page 18: Using MongoDB for Materials Discovery

Very Complex Documents

Page 19: Using MongoDB for Materials Discovery

Powerful Querying

Every crystal that has (Li or Na or K), (Mn), (O or S or F or Si) plus one other element except (Zn or Ni or Fe or Cu or Co)

{
  "lattice.volume": { "$lt": 500 },
  "elements": { "$all": ["Mn"], "$size": 4, "$nin": ["Zn", "Ni", "Fe", "Cu", "Co"] },
  "atoms": { "$elemMatch": { "oxidation_state": 3, "symbol": "Mn" } },
  "$where": "match_all(this.element_names, ['Li', 'Na', 'K'], ['Mn'], ['O', 'S', 'F', 'Si'])"
}
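
The `match_all` helper called inside `$where` is not shown on the slide. One plausible reading, sketched here in Python as an assumption, is that it returns true when the document's element list contains at least one element from every group. The function below follows that reading and restates the size and exclusion constraints from the English description; the name `is_candidate` is hypothetical and none of this is the authors' code.

```python
def match_all(element_names, *groups):
    """True if every group contributes at least one element (assumed semantics)."""
    present = set(element_names)
    return all(present & set(group) for group in groups)

EXCLUDED = {"Zn", "Ni", "Fe", "Cu", "Co"}

def is_candidate(element_names):
    """Four elements, none excluded, one from each of the three groups."""
    return (
        len(element_names) == 4
        and not EXCLUDED & set(element_names)
        and match_all(
            element_names,
            ["Li", "Na", "K"],
            ["Mn"],
            ["O", "S", "F", "Si"],
        )
    )

print(is_candidate(["Li", "Mn", "O", "P"]))   # fourth element is unconstrained
print(is_candidate(["Li", "Mn", "O", "Fe"]))  # Fe is on the excluded list
print(is_candidate(["Li", "Mn", "O"]))        # only three elements
```

The sketch leaves out the volume and oxidation-state conditions, which depend on fields beyond the element list.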

Page 20: Using MongoDB for Materials Discovery

pre-MongoDB :(

((SELECT structure.structureid
  FROM structure NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
  WHERE structureid IN
    ((SELECT structure.structureid
      FROM structure NATURAL INNER JOIN elemententry
      WHERE elemententry.symbol='Li'
      INTERSECT
      SELECT structure.structureid
      FROM structure NATURAL INNER JOIN elemententry
      WHERE elemententry.symbol='O')
     INTERSECT
     SELECT structure.structureid
     FROM structure NATURAL INNER JOIN database NATURAL INNER JOIN databaseentry
     WHERE database.title='ICSD'))
 EXCEPT
 (SELECT structure.structureid
  FROM structure
  WHERE structure.entryid IN
    (SELECT duplicateentry.entryid FROM duplicateentry)))
EXCEPT
(SELECT structure.structureid
 FROM structure
 WHERE structure.entryid IN
   (SELECT entryid FROM removals))

Search for materials with Li and O, excluding duplicates

Page 21: Using MongoDB for Materials Discovery

Map/Reduce

tasks.bson → MR → materials.bson

✓ Calculation 12, Calculation 13, Calculation 14, Calculation 15
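
The step from task records to material records can be pictured as a group-and-reduce over calculations. The sketch below is plain Python, not MongoDB's `mapReduce` command, and the field names (`material_id`, `calc_id`, `energy`, `ok`) as well as the rule "keep the lowest-energy successful calculation" are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical task records; all values are invented.
tasks = [
    {"calc_id": 12, "material_id": "m-1", "energy": -5.2, "ok": True},
    {"calc_id": 13, "material_id": "m-1", "energy": -5.0, "ok": True},
    {"calc_id": 14, "material_id": "m-1", "energy": -6.0, "ok": False},
    {"calc_id": 15, "material_id": "m-2", "energy": -3.1, "ok": True},
]

def map_task(task):
    """Map step: emit (key, value) pairs for successful calculations only."""
    if task["ok"]:
        yield task["material_id"], task

def reduce_tasks(material_id, group):
    """Reduce step: keep the lowest-energy calculation for one material."""
    best = min(group, key=lambda t: t["energy"])
    return {"_id": material_id, "calc_id": best["calc_id"], "energy": best["energy"]}

# Shuffle step: collect emitted values under their keys.
grouped = defaultdict(list)
for task in tasks:
    for key, value in map_task(task):
        grouped[key].append(value)

materials = [reduce_tasks(key, group) for key, group in sorted(grouped.items())]
print(materials)
```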

Page 22: Using MongoDB for Materials Discovery

Every App uses MongoDB

by G. Hautier

structure_predictors.bson, candidate_materials.bson, diffraction_patterns.bson

Page 23: Using MongoDB for Materials Discovery

Structure Predictor

Page 24: Using MongoDB for Materials Discovery

Diffraction Pattern

Page 25: Using MongoDB for Materials Discovery

Act III: Analytics and Logging

Page 26: Using MongoDB for Materials Discovery

Rich Error Analysis

Experimental Calculated

Page 27: Using MongoDB for Materials Discovery

Integrated logging just makes sense

• Semi-structured data easily stored

• Can correlate with all other data

• Automation Layer: Failed tasks

• Web/App Layer
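
As a hedged illustration of the points above, the sketch below keeps log entries as plain dictionaries with differing fields and joins the failures back to task records through a shared id. All field names and values are hypothetical, and Python lists and dicts stand in for MongoDB collections.

```python
# Log entries need not share a schema: extra fields appear only when relevant.
logs = [
    {"task_id": 12, "level": "INFO", "msg": "started", "host": "hopper"},
    {"task_id": 12, "level": "ERROR", "msg": "not converged", "steps": 200},
    {"task_id": 15, "level": "INFO", "msg": "finished", "walltime_s": 3600},
]

# Task records kept alongside the logs (invented values).
task_table = {12: {"formula": "LiCoO2"}, 15: {"formula": "NaCl"}}

def failed_tasks(log_entries, tasks_by_id):
    """Correlate ERROR log entries with the tasks they belong to."""
    report = []
    for entry in log_entries:
        if entry["level"] == "ERROR":
            task = tasks_by_id.get(entry["task_id"], {})
            report.append((entry["task_id"], task.get("formula"), entry["msg"]))
    return report

failures = failed_tasks(logs, task_table)
print(failures)
```

An automation layer could use a report like this to decide which failed tasks to resubmit.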

Page 28: Using MongoDB for Materials Discovery

Conclusions

• MongoDB is a very versatile tool

• Used in several different cases

• Elegant query syntax

• Very useful for scientific data storage

• A lot of exciting future ideas

Page 29: Using MongoDB for Materials Discovery

Acknowledgements

Page 30: Using MongoDB for Materials Discovery

Thanks!

MaterialsProject.org