![Page 1: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/1.jpg)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
DAT204
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes with MongoDB & AWS
![Page 2: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/2.jpg)
World leader in serving science
Revenues of $17 billion
50,000 employees
50 countries
![Page 3: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/3.jpg)
A Mass Spectrometer tells you…
What’s in there and how much
![Page 4: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/4.jpg)
![Page 5: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/5.jpg)
Making the world cleaner and safer
![Page 6: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/6.jpg)
Mars Organic Molecule Analyzer (MOMA) will take a modified Thermo Linear Ion Trap Mass Spectrometer to Mars in 2020
![Page 7: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/7.jpg)
![Page 8: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/8.jpg)
What beer looks like in a mass spec
![Page 9: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/9.jpg)
![Page 10: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/10.jpg)
![Page 11: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/11.jpg)
Demo
![Page 12: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/12.jpg)
Instrument
MongoDB
MS Instrument Connect
Demo: instrument connect
![Page 13: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/13.jpg)
Demo: remote monitoring a mass spectrometer
![Page 14: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/14.jpg)
Why does Thermo use MongoDB?
![Page 15: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/15.jpg)
ThermoFisher apps using MongoDB
Starting on MongoDB
XML → MongoDB
Oracle → MongoDB
SQLite → MongoDB
Postgres → MongoDB
Amazon DynamoDB → MongoDB Atlas
![Page 16: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/16.jpg)
Scientific apps = humongous data
![Page 17: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/17.jpg)
Big molecules = big data
![Page 18: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/18.jpg)
instrument {
    UserId : "[email protected]",
    MachineName : "TRACEFINDER8",
    Location : "Austin",
    AcquisitionStationName : "TSQ 8000",
    LastErrorEventDate : "2016-09-05",
    LastErrorEventValue : null,
    RuntimeEstimate : {
        MeasuredElaspedDuration : 0.21966,
        Confidence : HighConfidence
    },
    RunManagerStatus : {
        Status : "Acquire",
        Sequence : "Testosterone",
        SampleName : "Drugx",
        VialPosition : "1",
        Rawfile : "2pg_161029205505",
        Instmethod : "1x.meth",
        Instrument : "TSQ 8000",
        IsPaused : false,
        Operator : "Fred",
    }
}
Why MongoDB was chosen
• Performance
• Developer productivity
• Cost effective
• Runs anywhere
• Rich feature set
• Achieved legal and regulatory approval
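The instrument document above nests run state inside sub-documents, so a client can address a single value with a dotted path. The following is a minimal sketch of that idea in plain Python, with no database involved. The helper names `get_path` and `is_acquiring` are hypothetical, and only a subset of the slide's fields is reproduced.

```python
from typing import Any

# Subset of the instrument document from the slide, as a Python dict.
instrument = {
    "MachineName": "TRACEFINDER8",
    "Location": "Austin",
    "AcquisitionStationName": "TSQ 8000",
    "LastErrorEventValue": None,
    "RunManagerStatus": {
        "Status": "Acquire",
        "Sequence": "Testosterone",
        "IsPaused": False,
        "Operator": "Fred",
    },
}


def get_path(doc: dict, path: str, default: Any = None) -> Any:
    """Resolve a MongoDB-style dotted path such as 'RunManagerStatus.Status'."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def is_acquiring(doc: dict) -> bool:
    """True when the instrument reports an active, unpaused acquisition."""
    return (
        get_path(doc, "RunManagerStatus.Status") == "Acquire"
        and get_path(doc, "RunManagerStatus.IsPaused") is False
    )
```

Against a live collection, the same dotted path would typically appear in a query filter such as `{"RunManagerStatus.Status": "Acquire"}`.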
![Page 19: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/19.jpg)
MongoDB is a Swiss army knife
• Hierarchical data
• Relational data
• Queues
• File storage
• Device state
Amazon SQS
Amazon S3
Amazon IoT
![Page 20: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/20.jpg)
Join example
• Version 3.2 introduced the $lookup operator
• SQL query
• MongoDB C# driver query
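The queries from this slide did not survive in the transcript. As a rough illustration only, the sketch below shows the shape of a `$lookup` stage and emulates its left outer join behaviour on in-memory lists. It is written in Python rather than C#, the collection and field names (`samples`, `compounds`, `compound_id`) are hypothetical, and array-valued local fields are not handled.

```python
from collections import defaultdict

# The aggregation stage as it would be passed to a driver.
# Roughly: SELECT * FROM samples s LEFT OUTER JOIN compounds c
#          ON s.compound_id = c._id
# Collection and field names are hypothetical.
lookup_stage = {
    "$lookup": {
        "from": "compounds",
        "localField": "compound_id",
        "foreignField": "_id",
        "as": "compound",
    }
}


def lookup(left, right, local_field, foreign_field, as_field):
    """In-memory emulation of $lookup's left outer join semantics."""
    index = defaultdict(list)
    for doc in right:
        index[doc.get(foreign_field)].append(doc)
    # Every left document is kept; unmatched ones get an empty array.
    return [
        {**doc, as_field: index.get(doc.get(local_field), [])}
        for doc in left
    ]


samples = [
    {"_id": 1, "compound_id": "c1"},
    {"_id": 2, "compound_id": "c9"},  # no matching compound
]
compounds = [{"_id": "c1", "name": "Testosterone"}]

spec = lookup_stage["$lookup"]
joined = lookup(
    samples, compounds, spec["localField"], spec["foreignField"], spec["as"]
)
```

Unlike a SQL join, the matches arrive as an array on each left-hand document, so an unmatched row is kept with an empty `compound` array.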
![Page 21: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/21.jpg)
MongoDB has caught up to relational DBs
“Notably, we show that the MUPG (match, unwind, project, group) fragment is already at least as expressive as full relational algebra over (the relational view of) a single collection, and in particular able to express arbitrary joins.”
– Bolzano University in Italy
![Page 22: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/22.jpg)
The evolution of MongoDB: Headline Features by Release
1.0 (2009)
2.4 (GA 2013): Hash-Based Sharding, Roles, Kerberos, On-Prem Monitoring
2.6 (GA 2014): $out, Index Intersection, Text Search, Field-Level Redaction, LDAP & x509, Auditing
3.0 (GA 2015): Doc-Level Concurrency, Compression, Storage Engine API, ≤50 replicas, Auditing ++, Ops Manager
3.2 (GA 2015): Document Validation, $lookup, Fast Failover, Simpler Scalability, Aggregation ++, Encryption At Rest, In-Memory Storage Engine, BI Connector, MongoDB Compass, APM Integration, Profiler Visualization, Auto Index Builds, Backups to File System
3.4 (GA 2016): Linearizable reads, Intra-cluster compression, Views, Log Redaction, Graph Processing, Decimal, Collations, Faceted Navigation, Spark Connector ++, Zones ++, Aggregation ++, Auto-balancing ++, ARM, Power, zSeries, BI Connector ++, Compass ++, Hardware Monitoring, Server Pool, LDAP Authorization, Encrypted Backups, Cloud Foundry Integration
Atlas
![Page 23: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/23.jpg)
MySQL vs. MongoDB
![Page 24: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/24.jpg)
Database schema
MySQL schema
MongoDB schema
![Page 25: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/25.jpg)
Inserting data: MongoDB vs. MySQL
• Inserting 1,615 chemical compound records into two parent-child tables.
• To optimize the MySQL query, we turned off foreign keys during insert and used a string builder to create a bulk insert SQL statement. This improved insert performance by a factor of 360.
• Compare to MongoDB.
| Database | Milliseconds | Lines of code |
| --- | --- | --- |
| MySQL not optimized | 147,600 (2.5 minutes) | 21 |
| MySQL optimized | 410 | 40 |
| MongoDB | 68 | 1 |
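One plausible reason the MongoDB insert needs so little code is that child rows can be embedded in their parent document, so a batch is written in one call instead of two coordinated table inserts. The sketch below shows that reshaping step only. The row layout, the field names, and the sample values are made up for illustration and are not the schema from the talk.

```python
from collections import defaultdict

# Hypothetical parent and child rows, shaped like two relational tables.
compounds = [
    {"id": 1, "name": "Testosterone"},
    {"id": 2, "name": "Caffeine"},
]
peaks = [
    {"compound_id": 1, "mz": 289.2},
    {"compound_id": 1, "mz": 97.1},
    {"compound_id": 2, "mz": 195.1},
]


def to_documents(parents, children, fk="compound_id"):
    """Embed each parent's child rows as an array inside the parent document."""
    by_parent = defaultdict(list)
    for child in children:
        # Drop the foreign key: nesting already records the relationship.
        by_parent[child[fk]].append(
            {k: v for k, v in child.items() if k != fk}
        )
    return [
        {"_id": p["id"], "name": p["name"], "peaks": by_parent.get(p["id"], [])}
        for p in parents
    ]


docs = to_documents(compounds, peaks)
# With a driver such as PyMongo, the whole batch is then a single call:
#     collection.insert_many(docs)
```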
![Page 26: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/26.jpg)
Inserting data: MongoDB vs. MySQL
![Page 27: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/27.jpg)
Selecting data: MongoDB vs. MySQL
• Query 600,000 rows of SampleCompound result data.
• To optimize the MySQL select query, we created a dictionary to look up child records for each parent. This improved performance by a factor of 300. Optimization effort: 2 engineers and 2 weeks.
| Database | Seconds | Lines of code |
| --- | --- | --- |
| MySQL not optimized | 2,400 (40 minutes) | 20 |
| MySQL optimized | 8.2 | 29 |
| MongoDB | 17.5 | 7 |
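The dictionary optimization mentioned above can be sketched as follows. This is an assumed reconstruction of the general technique in Python, not the team's code: the naive version rescans all child rows for every parent, while the indexed version groups children by parent key once. The field names `id` and `parent_id` are hypothetical.

```python
from collections import defaultdict


def attach_children_naive(parents, children):
    """O(P x C): rescan every child row for each parent."""
    return {
        p["id"]: [c for c in children if c["parent_id"] == p["id"]]
        for p in parents
    }


def attach_children_indexed(parents, children):
    """O(P + C): build a dictionary keyed by parent id once, then look up."""
    by_parent = defaultdict(list)
    for c in children:
        by_parent[c["parent_id"]].append(c)
    return {p["id"]: by_parent.get(p["id"], []) for p in parents}
```

Both functions return the same mapping. Only the amount of work differs, and that difference grows with the number of rows.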
![Page 28: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/28.jpg)
Update: MongoDB vs. MySQL
![Page 29: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/29.jpg)
Migrating to MongoDB reduced code by 3.5x
| | SQLite | MongoDB |
| --- | --- | --- |
| Data layer lines of code | 4271 | 1260 |
![Page 30: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/30.jpg)
MongoDB compared to DynamoDB
| MongoDB | DynamoDB |
| --- | --- |
| Anywhere | AWS |
| Rich ad-hoc query language + IDE | No ad-hoc query language |
| Many operators (joins, aggregation, etc.) | Fewer operators |
| Excellent performance | Excellent performance |
| Easy to deploy (with Atlas) | Easy to deploy each table |
| Adding tables requires no configuration changes | Adding tables requires additional configuration and cost |
| Easy to use from AWS services but not natively integrated | Native integration with AWS services: IAM, VPC, Lambda, Kinesis |
| Released in 2009 | Released in 2012 |
![Page 31: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/31.jpg)
MongoDB vs. S3 performance
Downloading a 220 KB object from MongoDB was 7x faster cold and 3x faster warm.

| | MongoDB | Amazon S3 |
| --- | --- | --- |
| Retrieve document first time | 68 ms | 468 ms |
| Retrieve document second time | 13 ms | 38 ms |
![Page 32: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/32.jpg)
MongoDB vs. S3 performance
MongoDB was 11x faster than S3 for partial document loading.

| | MongoDB | S3 |
| --- | --- | --- |
| Data size | 400 bytes | 2.1 MB |
| Performance | 19 ms | 214 ms |
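Partial document loading means asking the database for a few fields rather than the whole object, which an object store that returns entire files cannot do as directly. The sketch below emulates an inclusion projection in plain Python to show what comes back. The `project` helper and the sample document are hypothetical, and overlapping paths such as `a` together with `a.b` are not handled.

```python
def project(doc, paths):
    """Return only the requested dotted paths, like an inclusion projection."""
    result = {}
    for path in paths:
        src, dst = doc, result
        keys = path.split(".")
        for i, key in enumerate(keys):
            if not isinstance(src, dict) or key not in src:
                break
            if i == len(keys) - 1:
                dst[key] = src[key]
            elif isinstance(src[key], dict):
                # Descend in the source and mirror the nesting in the result.
                src = src[key]
                dst = dst.setdefault(key, {})
            else:
                break
    return result


# Hypothetical document: small status fields next to a large payload.
run = {
    "_id": "run-1",
    "RunManagerStatus": {"Status": "Acquire", "Operator": "Fred"},
    "spectrum": [0.0] * 1000,
}

status_only = project(run, ["RunManagerStatus.Status"])
```

With PyMongo, the corresponding server-side request would look like `collection.find_one({"_id": "run-1"}, {"RunManagerStatus.Status": 1})`, which also returns `_id` unless it is excluded.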
![Page 33: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/33.jpg)
Reducing processing from days to minutes
![Page 34: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/34.jpg)
Frameworks used to parallelize algorithms
• AWS Lambda
• Docker and Amazon ECS
• Spark and Amazon Elastic MapReduce (EMR)
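All three frameworks above follow the same pattern: split the data into independent chunks, process the chunks concurrently, and combine the partial results. The sketch below shows that pattern on a single machine using Python's standard thread pool as a stand-in. It does not call Lambda, ECS, or Spark, and `process_chunk` is a placeholder for a real algorithm.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Placeholder for a per-chunk algorithm, here a simple sum."""
    return sum(chunk)


def split(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_parallel(items, n_workers=4):
    """Fan chunks out to workers, then combine the partial results."""
    chunks = split(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

Threads in CPython will not speed up CPU-bound work. The point here is the fan-out and combine structure, which maps onto one Lambda invocation, container task, or Spark partition per chunk.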
![Page 35: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/35.jpg)
Parallel data processing
![Page 36: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/36.jpg)
Why Atlas?
• Easy
• Performant
• Seamless migration
• Robust
• No downtime, even when scaling up
![Page 37: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/37.jpg)
Building MongoDB Atlas on Amazon Web Services
![Page 38: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/38.jpg)
Operations burden
PATCHES
UPGRADES
SECURITY
BACKUPS
RECOVERY
99.999% UPTIME
UPSCALE
DOWNSCALE
PERFORMANCE
UAT
STAGING
MONITORING
ALERTS
PROVISION
CONFIGURE
INSTALL
![Page 39: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/39.jpg)
Database as a service for MongoDB
Automated
Available On-Demand
Secure
Highly Available
Automated Backups
Elastically Scalable
![Page 40: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/40.jpg)
Fully managed MongoDB clusters
Customer only needs to choose the shape and size of the cluster
● Instance size (CPU and RAM)
● Replication factor
● Number of shards
● Disk space
● Disk speed
Screenshot of create dialog
Cluster features
![Page 41: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/41.jpg)
VPC peering
IP address whitelist
SCRAM-SHA-1 authentication
readWriteAnyDatabase
enableSharding
clusterMonitor
SSL
Using well-known CA
Trust system CAs by default
Security features
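From the client side, the SSL and SCRAM-SHA-1 settings listed above usually surface as options on the connection string. The sketch below assembles such a string in Python. The host names, user, and replica set name are hypothetical, and a real Atlas cluster supplies its own connection string, so treat this only as an illustration of the standard URI options.

```python
from urllib.parse import quote_plus, urlencode


def build_uri(user, password, hosts, replica_set):
    """Build a MongoDB connection string with TLS and SCRAM-SHA-1 enabled."""
    options = urlencode({
        "replicaSet": replica_set,
        "ssl": "true",
        "authSource": "admin",
        "authMechanism": "SCRAM-SHA-1",
    })
    # Percent-encode credentials so characters like '@' or '/' stay unambiguous.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{','.join(hosts)}/?{options}"


uri = build_uri(
    "app",
    "p@ss/word",
    ["host-a.example.net:27017", "host-b.example.net:27017"],
    "rs0",
)
```

Because the server certificates chain to a well-known CA, a client that trusts the system CAs should not need a custom CA file.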
![Page 42: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/42.jpg)
Backup
Automation
Monitoring
Key components
![Page 43: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/43.jpg)
AWS Account X—Region Y
VPC (Customer N)
Availability Zone A
Availability Zone B
Availability Zone C
Subnet A Subnet B Subnet C
mongod—27017
mongod—27017
mongod—27017
Customer container with replica set
![Page 44: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/44.jpg)
AWS Account X—Region Y
VPC (Customer N)
Availability Zone A
Availability Zone B
Availability Zone C
Subnet A Subnet B Subnet C
Customer container with sharded cluster
[Diagram: each of the three subnets contains nodes labeled shard0, S, shard1, S, shard2, and config]
![Page 45: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/45.jpg)
mongod—27017
mongod—27017
mongod—27017
One security group per VPC applied to all Amazon EC2 instances
Three classes of security rules:
● MongoDB traffic between cluster members
● MongoDB traffic between application and clusters
● SSH traffic between production support jump box and EC2 instance
App Server Jump Box
IP firewall using security groups
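The three rule classes above can be read as a small allow-list of (source, port) pairs. The sketch below models that reading in Python. It is not the AWS security group API, the role names are hypothetical, and port 22 is assumed as the usual SSH default because the slide does not state it.

```python
from dataclasses import dataclass

MONGOD_PORT = 27017  # port shown on the slides
SSH_PORT = 22        # assumption: the default SSH port is used


@dataclass(frozen=True)
class Rule:
    source: str  # role of the traffic source (hypothetical labels)
    port: int


# One rule set per VPC, applied to every EC2 instance in it.
RULES = frozenset({
    Rule("cluster-member", MONGOD_PORT),  # traffic between cluster members
    Rule("app-server", MONGOD_PORT),      # application to cluster
    Rule("jump-box", SSH_PORT),           # production support access
})


def is_allowed(source, port):
    """Anything not explicitly listed is denied."""
    return Rule(source, port) in RULES
```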
![Page 46: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/46.jpg)
172.31.248.0/21
10.0.0.0/16
VPC peering
Your VPC
Elastic LB
CIDR Block: 10.0.0.0/16
Atlas VPC
AZ 1 AZ 2 AZ 3
CIDR Block: 172.31.248.0/21
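VPC peering requires that the two VPCs use non-overlapping CIDR blocks, which is why the diagram gives each side its own range. A quick check with Python's standard `ipaddress` module, using the two blocks shown above (the `can_peer` helper is a hypothetical name):

```python
import ipaddress

your_vpc = ipaddress.ip_network("10.0.0.0/16")
atlas_vpc = ipaddress.ip_network("172.31.248.0/21")


def can_peer(a, b):
    """Peering needs non-overlapping ranges so routes are unambiguous."""
    return not a.overlaps(b)


peerable = can_peer(your_vpc, atlas_vpc)
```

A block such as `10.0.8.0/21` would fail this check against `10.0.0.0/16`, because it falls inside that range.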
![Page 47: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/47.jpg)
![Page 48: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/48.jpg)
“We want Prime to be such a good value, you’d be irresponsible not to be a member.”
– Jeff Bezos
![Page 49: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/49.jpg)
Questions?
![Page 50: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/50.jpg)
Thank you!
![Page 51: How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to Minutes w/ MongoDB Atlas on AWS](https://reader035.vdocuments.us/reader035/viewer/2022070603/587064fd1a28ab48378b4c29/html5/thumbnails/51.jpg)
Remember to complete your evaluations!