Big Data in the Cloud? Yes, You Can Do It in OpenStack
“The term cloud computing covers a variety of services and applications that users access on demand over the Internet, as opposed to using them on premises.”
“OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard, CLI, RESTful API ...”
What’s around Data Processing?
◇ Big Data
◇ Data Science
◇ Cloud
◇ Machine Learning
◇ Pattern Recognition
◇ Neural Networks
◇ Etc ...
OpenStack Sahara
The Sahara project provides a simple means to provision data-intensive application clusters (Spark or Hadoop) on top of OpenStack.
https://wiki.openstack.org/wiki/Sahara
Architecture
http://docs.openstack.org/developer/sahara/architecture.html
Getting Started
- Clusters
- Templates
- Provisioning Plugins
- Image Registry
- Data Processing Frameworks
- Elastic Data Processing (EDP)
http://docs.openstack.org/developer/sahara/userdoc/edp.html
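To make the relationship between templates and clusters more concrete, here is a minimal sketch that assembles the kind of JSON bodies a client might send to Sahara. The field names are assumed to follow the Sahara REST API (v1.1); the helper functions, the plugin version, and every ID below are hypothetical placeholders, so check them against your own deployment before relying on them.

```python
import json

PLUGIN = "vanilla"   # one of the provisioning plugins listed in this deck
VERSION = "2.7.1"    # hypothetical plugin version; check your deployment


def node_group_template(name, flavor_id, processes):
    """Describe one kind of node, e.g. a master or a worker."""
    return {
        "name": name,
        "plugin_name": PLUGIN,
        "hadoop_version": VERSION,
        "flavor_id": flavor_id,
        "node_processes": list(processes),
    }


def cluster_template(name, groups):
    """Combine node groups into a reusable cluster layout.

    `groups` maps a group name to (node group template ID, instance count).
    """
    return {
        "name": name,
        "plugin_name": PLUGIN,
        "hadoop_version": VERSION,
        "node_groups": [
            {"name": group, "node_group_template_id": template_id, "count": count}
            for group, (template_id, count) in groups.items()
        ],
    }


def cluster_request(name, cluster_template_id, image_id):
    """Ask for a cluster built from a template and a registered image."""
    return {
        "name": name,
        "plugin_name": PLUGIN,
        "hadoop_version": VERSION,
        "cluster_template_id": cluster_template_id,
        "default_image_id": image_id,
    }


if __name__ == "__main__":
    # All IDs here are made up for illustration.
    worker = node_group_template("worker", "2", ["datanode", "nodemanager"])
    layout = cluster_template(
        "small-hadoop",
        {"master": ("ngt-master-id", 1), "worker": ("ngt-worker-id", 3)},
    )
    request = cluster_request("demo", "ct-small-id", "image-id")
    print(json.dumps([worker, layout, request], indent=2))
```

The point of the layering is reuse: a node group template is defined once, a cluster template references it with a count, and a cluster request only needs the template ID plus an image from the image registry.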
More Features ...
- OpenStack Block Storage support
- Cluster Scaling
- Data locality
- Distributed Mode
- Hadoop HDFS High Availability
- Orchestration support
- …
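As an illustration of cluster scaling, the sketch below computes a scaling request from the current and desired instance counts per node group. The `resize_node_groups` key is assumed to match the body Sahara accepts when updating a cluster; the `scale_request` helper and the group names are hypothetical.

```python
def scale_request(current_counts, desired_counts):
    """Build a scaling body containing only the node groups whose size changes.

    Both arguments map a node group name to an instance count.
    """
    resize = []
    for name, count in sorted(desired_counts.items()):
        if name not in current_counts:
            raise KeyError(f"unknown node group: {name}")
        if count < 0:
            raise ValueError(f"negative count for node group: {name}")
        if count != current_counts[name]:
            resize.append({"name": name, "count": count})
    return {"resize_node_groups": resize}


if __name__ == "__main__":
    # Grow the worker group from 3 to 5 instances; the master is untouched.
    print(scale_request({"master": 1, "worker": 3}, {"worker": 5}))
```

Groups that already have the desired size are left out, so an unchanged cluster yields an empty resize list.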
Data-Processing Frameworks
- Hadoop
- Spark
- Storm
Provisioning Plugins
- Vanilla - Vanilla Apache Hadoop
- Ambari - Hortonworks Data Platform
- Spark - Apache Spark with Cloudera HDFS
- MapR Distribution - MapR plugin with MapR File System
- Cloudera - Cloudera Hadoop
Elastic Data Processing (EDP)
Allows the execution of jobs on clusters created from Sahara. It supports:
- Hive, Pig, MapReduce.Streaming, Java, and Shell job types on Hadoop clusters
- Spark jobs
- Storage of job binaries in the Shared File Systems service (Manila) or in Sahara's own database
- Access to input and output data sources in:
  - HDFS
  - Swift
  - Manila
http://docs.openstack.org/developer/sahara/userdoc/edp.html
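The EDP pieces above (data sources, a job of a given type, and an execution on a cluster) can be sketched as plain request bodies. This is a simplified illustration, not the Sahara client: the helper names and all IDs and URLs are hypothetical, the field names are assumed to follow the Sahara REST API (v1.1), and inferring the data source type from the URL scheme is a shortcut taken only for this example.

```python
from urllib.parse import urlparse

# Job types and data source backends as listed on the EDP slide.
JOB_TYPES = {"Hive", "Pig", "MapReduce.Streaming", "Java", "Shell", "Spark"}
DATA_SOURCE_TYPES = {"hdfs", "swift", "manila"}


def data_source(name, url):
    """Describe an input or output location, typed by its URL scheme."""
    scheme = urlparse(url).scheme
    if scheme not in DATA_SOURCE_TYPES:
        raise ValueError(f"unsupported data source URL: {url}")
    return {"name": name, "type": scheme, "url": url}


def job(name, job_type, main_binary_ids):
    """Describe a job; the binaries are referenced by previously stored IDs."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unsupported job type: {job_type}")
    return {"name": name, "type": job_type, "mains": list(main_binary_ids)}


def job_execution(cluster_id, input_id, output_id, configs=None):
    """Describe one run of a job on a cluster with given input and output."""
    return {
        "cluster_id": cluster_id,
        "input_id": input_id,
        "output_id": output_id,
        "job_configs": {"configs": dict(configs or {})},
    }


if __name__ == "__main__":
    # Every name, URL, and ID below is made up for illustration.
    print(data_source("logs-in", "swift://demo-container/input"))
    print(data_source("logs-out", "hdfs://namenode:8020/user/demo/output"))
    print(job("wordcount", "Pig", ["binary-id"]))
    print(job_execution("cluster-id", "input-id", "output-id"))
```

Keeping data sources separate from the job definition means the same job can be rerun against different inputs and outputs by changing only the execution request.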
Resources
- Documentation
  - http://docs.openstack.org/developer/sahara/
  - https://wiki.openstack.org/wiki/Sahara
- Hadoop/Spark Images
  - http://sahara-files.mirantis.com/images/upstream/mitaka/
- OpenStack Auto-deployment with RDO
  - https://www.rdoproject.org/install/quickstart/
- Videos
  - https://www.youtube.com/watch?v=idAaLo1stbw
  - https://www.youtube.com/watch?v=TgPTjrf1y0A
Thanks! Any questions?
You can find me at:
◇ @obedmr
◇ [email protected]