Advanced Computing Data By: Jacob Nikolai, Jade Mendoza, Katerina Bonilla, & Robert Volkmann Project Mentor : Brandon T. Klein

Post on 20-Jun-2022


Page 1: Advanced Computing Data

Advanced Computing Data

By: Jacob Nikolai, Jade Mendoza, Katerina Bonilla, & Robert Volkmann

Project Mentor: Brandon T. Klein

Page 2: Advanced Computing Data

Project Statement

Advanced Computing Data (ACD) is a small project focused on using advanced computing technologies to enable data ontology of COVID-19 and historical weather data for the Sandia National Laboratories STAR Program 2021 Summer Camp. The ACD project will bring in data from the COVID Data and Weather Data projects to help foster a solution.

Page 3: Advanced Computing Data

Approach

Through the use of advanced computing technologies and principles from the disciplines of data ontology, data science, and DevOps, we quickly analyzed large datasets to uncover inherent relationships, insights, and models that are not easily seen with traditional data science methods or by the human eye. The data selected for analysis were COVID-19 and weather datasets from publicly available sources.

Page 4: Advanced Computing Data

Objective

Our objective is to import, read, and use data on COVID-19 and historical weather to determine whether there is a relationship between the two. To do so, we first had to become familiar with the necessary tools: Git, GitLab, Slack, Neo4j, Cypher, Docker, and Amazon Web Services (AWS).
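One common way to check for a relationship between two daily series is to pair the values by date and compute a Pearson correlation coefficient. The sketch below illustrates that idea only; the slides do not say which measure the team used. The function names and the toy values are assumptions made up for this example, not project data.

```python
from math import sqrt


def join_on_date(covid_rows, weather_rows):
    """Pair values that share a date; dates missing from either side are dropped."""
    weather_by_date = dict(weather_rows)
    return [
        (value, weather_by_date[date])
        for date, value in covid_rows
        if date in weather_by_date
    ]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Toy placeholder values (NOT real COVID-19 or weather measurements).
covid = [("2021-01-01", 10), ("2021-01-02", 20), ("2021-01-03", 30)]
weather = [
    ("2021-01-01", 5.0),
    ("2021-01-02", 7.0),
    ("2021-01-03", 9.0),
    ("2021-01-04", 4.0),  # no matching COVID row, so it is dropped by the join
]

pairs = join_on_date(covid, weather)
r = pearson(pairs)
```

A coefficient near 1 or -1 suggests a strong linear relationship and a value near 0 suggests none. A correlation alone does not show that weather causes changes in COVID-19 figures.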

Page 5: Advanced Computing Data

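Git

Git is a version control tool for developing and coordinating files and programs among many people. It supports non-linear workflows (branching and merging) and protects data integrity. Git is a distributed version control system: every contributor keeps a full copy of the project history, and teams usually share their work through a common server.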

Page 6: Advanced Computing Data

What is a CVCS

A Centralized Version Control System, or CVCS for short, is a system in which a single server holds the versions of the data and is accessed by different clients (such as computers). Each client copies the data from the server, then alters it, adds to it, or removes from it, and finally sends it back 'upstream' to the server, where the changes are saved as a new version. Git itself is distributed rather than centralized, but a team can work in a similar way by treating one shared server, such as GitLab, as the common upstream.
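The check-out, edit, and send-upstream cycle described above can be sketched with a toy in-memory server. This is a minimal illustration of the idea, not how Git or any real version control system is implemented; the class name `CentralRepo` and its methods are hypothetical.

```python
class CentralRepo:
    """A toy central server that keeps every committed version of one file."""

    def __init__(self, initial_text):
        self.versions = [initial_text]  # index in this list = version number

    def checkout(self):
        """Return (version_number, text) for the latest version."""
        return len(self.versions) - 1, self.versions[-1]

    def commit(self, base_version, new_text):
        """Save new_text as a new version, but only if the client is up to date."""
        latest = len(self.versions) - 1
        if base_version != latest:
            raise RuntimeError("out of date: check out the latest version first")
        self.versions.append(new_text)
        return latest + 1


server = CentralRepo("line one\n")

# Two clients copy the same version from the server.
base_a, text_a = server.checkout()
base_b, text_b = server.checkout()

# Client A sends its change upstream first; it becomes version 1.
version_a = server.commit(base_a, text_a + "line two\n")

# Client B is now working from an old version, so its commit is rejected.
try:
    server.commit(base_b, text_b + "another line\n")
    rejected = False
except RuntimeError:
    rejected = True
```

The older version is never overwritten: `server.versions[0]` still holds the original text after client A's commit.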

Page 7: Advanced Computing Data

GitLab

GitLab is a web-based platform for hosting Git repositories, where people can list and resolve issues, work together, and share information and programs. Most of our teamwork took place there, and it is how we shared files and tracked issues.

Page 8: Advanced Computing Data

Slack

Slack is an organized private group messaging system focused on connecting people through projects and making it easier to communicate about these projects.

Page 9: Advanced Computing Data

Neo4j

Neo4j is a graph database management system (DBMS) that is queried with the Cypher language. It stores data as nodes and the relationships between them, and lets that data be retrieved and updated over time through queries.

Head image by: https://neo4j.com/

Page 10: Advanced Computing Data

Cypher

Cypher is the query language of Neo4j. Unlike general-purpose programming languages, Cypher is built around graphs: data is stored as 'nodes', and queries describe how nodes relate to one another. Its syntax was inspired by SQL, which shows in keywords such as WHERE and ORDER BY and in the way node property values are filtered and returned.
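To make the idea of nodes and relationships concrete, here is a small in-memory sketch of a property graph with a pattern lookup that loosely mirrors a Cypher `MATCH`. It does not use Neo4j. The labels `CovidReport`, `WeatherReading`, and `Day`, the relationship type `REPORTED_ON`, and all property values are hypothetical placeholders, not the project's actual schema or data.

```python
class Graph:
    """A tiny property graph: labeled nodes plus typed, directed relationships."""

    def __init__(self):
        self.nodes = {}          # node id -> {"label": str, "props": dict}
        self.relationships = []  # list of (start id, relationship type, end id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, "props": props}

    def relate(self, start, rel_type, end):
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both nodes must exist before they can be related")
        self.relationships.append((start, rel_type, end))

    def match(self, start_label, rel_type, end_label):
        """Rough analogue of the Cypher pattern
        MATCH (a:StartLabel)-[:REL_TYPE]->(b:EndLabel) RETURN a, b
        """
        return [
            (start, end)
            for start, kind, end in self.relationships
            if kind == rel_type
            and self.nodes[start]["label"] == start_label
            and self.nodes[end]["label"] == end_label
        ]


g = Graph()
# Placeholder property values, chosen only to show the structure.
g.add_node("day-1", "Day", date="2021-01-01")
g.add_node("covid-1", "CovidReport", deaths=0)
g.add_node("weather-1", "WeatherReading", temperature_c=0.0)
g.relate("covid-1", "REPORTED_ON", "day-1")
g.relate("weather-1", "REPORTED_ON", "day-1")

# Roughly: MATCH (a:CovidReport)-[:REPORTED_ON]->(b:Day) RETURN a, b
covid_days = g.match("CovidReport", "REPORTED_ON", "Day")
```

Because both the COVID-19 report and the weather reading point at the same `Day` node, the two datasets become connected through that shared node, which is the kind of relationship a graph database makes easy to query.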

Image by: https://stackoverflow.com/questions/52763891/cypher-neo4j-find-all-relationships-as-long-as-one-relationship-from-node-sa

Page 11: Advanced Computing Data

Docker

Docker packages software and its dependencies into 'containers', so the same code can be moved between devices and run the same way on each. Through its API, it also enables access to public and private cloud applications. We plan to use Docker together with the Neo4j database to give our program web access.

Images by: https://www.docker.com/ and https://www.planningpme.com/planningpme-api.htm

Page 12: Advanced Computing Data

Results

In our data, deaths were higher in the summer months than in the winter months.

Page 13: Advanced Computing Data
Page 14: Advanced Computing Data
Page 15: Advanced Computing Data

Conclusions

As a group, we learned how to use Git and GitLab with each other, and how to build on each program. In addition, we used Neo4j and its query language, Cypher, to import data and combine files. Docker and Amazon Web Services were used to help with the conclusion of the project. We also learned how to overcome the challenges we faced, such as download problems, difficulties importing data, GitLab crashing, and much more.

Participating in this program taught us as a group that we can overcome adversity, whether the challenges are unforeseen or expected. Working remotely, and for only two weeks, brought many difficulties because the participants were all using different devices and internet connections. In addition, not being able to work together in person meant that we had to create our own collaborative environment from home.

Page 16: Advanced Computing Data

We would like to thank Ms. Cheryl Garcia for putting the STAR Camp together this year. It was a tough year, and we are so thankful you put the time and effort into putting the program together this year!

We would also like to give a huge thank you to our mentor, Mr. Brandon Thorin Klein. We are all so appreciative of everything you have shown us and helped us with.

Page 17: Advanced Computing Data

Questions?