MP-PIPE for Soybean Proteome
Brad Barnes
27/11/15
COMP 5704
Problem
Soybean: a high-protein plant, grown in Canada
Contains over 70,000 different proteins
Unsigned short: 2^16 = 65,536 values (0 to 65,535), fewer than 70,000
Unable to use with PIPE
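To make the limit concrete, here is a minimal Python sketch of what happens when a protein index is stored in a 16-bit unsigned field. It assumes, as the slide implies, that PIPE keeps protein indices in an unsigned short. The function names are hypothetical and are not part of PIPE.

```python
import ctypes

USHORT_MAX = 2**16 - 1  # 65,535, the largest value a 16-bit unsigned short can hold


def store_as_ushort(index):
    """Return the value actually kept when index is stored in a C unsigned short."""
    # ctypes does no overflow checking, so out-of-range values wrap modulo 2**16.
    return ctypes.c_ushort(index).value


def fits_in_ushort(protein_count):
    """True if the indices 0 .. protein_count - 1 all fit in an unsigned short."""
    return protein_count - 1 <= USHORT_MAX


print(store_as_ushort(65535))  # largest index that is still representable
print(store_as_ushort(70000))  # wraps around and collides with a smaller index
print(fits_in_ushort(70000))   # False for a proteome of 70,000 proteins
```

A wrapped index does not fail loudly: it silently refers to a different protein, which is why a wider integer type is needed rather than a range check alone.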
The Cluster
18 nodes
32 GB RAM
8 core processors
100 GB SSD
Source: www.dehne.net
PIPE Pipeline
1. Prepare data
2. Run genTab to build database
3. Run MP-PIPE to predict interactions
Memory
Error: Proc killed with Signal 9
Running out of memory
Top output: [screenshot not included in transcript]
Logging
Errors in regular PIPE:
Process killed with signal 11 (Segmentation Fault)
Need logging to file!
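One way to get such failures into a file is a small wrapper that runs a command and logs how it ended. The sketch below is only an illustration in Python, not the logging that was added to PIPE. The logger name and the functions `make_file_logger`, `describe_exit`, and `run_and_log` are hypothetical. It relies on the documented behaviour of `subprocess` on POSIX systems, where a negative return code `-N` means the child was terminated by signal `N`.

```python
import logging
import signal
import subprocess


def make_file_logger(path):
    """Create a logger that appends timestamped messages to the file at path."""
    logger = logging.getLogger("pipe_run")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def describe_exit(returncode):
    """Turn a subprocess return code into a readable message."""
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        number = -returncode
        return "killed by signal %d (%s)" % (number, signal.Signals(number).name)
    return "exited with status %d" % returncode


def run_and_log(command, logger):
    """Run command (a list of arguments) and log how it ended."""
    result = subprocess.run(command)
    message = describe_exit(result.returncode)
    if result.returncode == 0:
        logger.info("%s: %s", command[0], message)
    else:
        logger.error("%s: %s", command[0], message)
    return message
```

With this wrapper, an out-of-memory kill would be recorded as signal 9 (SIGKILL) and a segmentation fault as signal 11 (SIGSEGV), so the two failure modes from the slides can be told apart after the run.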
Debugging
Debug in single-threaded mode
Attach gdb debugger to file
Trace error to hash table lookup (with a very long protein)
Issue: very large protein sizes lead to integer overflow
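The slides do not say which expression overflowed, so the following Python sketch is purely illustrative. It assumes a hypothetical table offset computed as `row * row_length + column` in a 32-bit signed integer, and shows how a large enough size makes the offset go negative, the kind of value that can turn a table lookup into a segmentation fault. All names and numbers are assumptions.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def as_c_int32(value):
    """Return value as a 32-bit signed C int would hold it after wraparound."""
    return ctypes.c_int32(value).value


def checked_offset(row, row_length, column):
    """Compute row * row_length + column, refusing results that do not fit in 32 bits."""
    offset = row * row_length + column  # Python integers do not overflow
    if not 0 <= offset <= INT32_MAX:
        raise OverflowError("offset %d does not fit in a 32-bit signed int" % offset)
    return offset


# Hypothetical sizes chosen only to trigger the overflow.
print(as_c_int32(50000 * 50000))   # negative after wraparound
print(checked_offset(10, 100, 5))  # small inputs are unaffected
```

A check like `checked_offset` fails early with a clear message; the alternative fix is to widen the type used for the computation.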
Testing
“The principal objective of software testing is to give confidence in the software.” – Anonymous
Small datasets with known results
Large dataset for final test
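Checking a small dataset against known results can be automated with a comparison helper. The Python sketch below is an assumption about how such a check could look, not the test code used in this project. It treats results as a mapping from a protein pair to a score, and the function name `compare_results` and the tolerance are hypothetical.

```python
import math


def compare_results(expected, actual, tolerance=1e-6):
    """Return a list of differences between known and newly computed scores."""
    problems = []
    for pair, expected_score in sorted(expected.items()):
        if pair not in actual:
            problems.append("missing pair %s" % (pair,))
        elif not math.isclose(actual[pair], expected_score, abs_tol=tolerance):
            problems.append(
                "score mismatch for %s: expected %r, got %r"
                % (pair, expected_score, actual[pair])
            )
    for pair in sorted(set(actual) - set(expected)):
        problems.append("unexpected pair %s" % (pair,))
    return problems


# Hypothetical known results for a tiny dataset.
known = {("P1", "P2"): 0.75, ("P1", "P3"): 0.10}
computed = {("P1", "P2"): 0.75, ("P1", "P3"): 0.10}
print(compare_results(known, computed))  # an empty list means the run matches
```

An empty list from the small datasets gives confidence before committing cluster time to the large final run.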
Performance Tuning
Number of threads vs. speedup for MP-PIPE [chart not included in transcript]
Source: (A. Schoenrock et al. 2011)
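For reference, speedup on a threads-vs-speedup chart is conventionally the single-thread runtime divided by the runtime with p threads, and efficiency is speedup divided by p. A minimal Python sketch follows. The timings are hypothetical placeholders, not measurements from these slides or from the cited paper.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup S(p) = T(1) / T(p)."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, threads):
    """Parallel efficiency E(p) = S(p) / p."""
    return speedup(serial_seconds, parallel_seconds) / threads


# Hypothetical runtimes in seconds, keyed by thread count.
timings = {1: 800.0, 2: 400.0, 4: 200.0, 8: 125.0}

for threads, seconds in sorted(timings.items()):
    print(threads, speedup(timings[1], seconds), efficiency(timings[1], seconds, threads))
```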
Version Control
Checkpoint results
Work on different things
Conclusion
Modified PIPE to work for soybeans
genTab limited by memory => doubled runtime
MP-PIPE performance constant
Validated with tests
Added logging to file
Fixed integer overflow issue
Questions
1. What was the issue with PIPE?
2. How were changes verified?
3. What’s one useful tool for software development?