Apache Hadoop – The Big Name In The Big Data World


Upload: sarajstanford

Post on 30-Dec-2015


DESCRIPTION

Data has been piling up in organizations for years. More recently, driven by the enthusiasm around ‘Big Data’ and ‘Business Intelligence’, organizations have become aware that valuable information can be drawn from accurately stored data, which is why they now store large volumes of data and extract the information they need in the format they require.

TRANSCRIPT

Page 1: Apache Hadoop – The Big Name In The Big Data World

Java/J2EE Capabilities

Apache Hadoop – The Big Name In The Big Data World

Page 2: Apache Hadoop – The Big Name In The Big Data World

What is Apache Hadoop?

•A proficient data management framework for Big Data

•Open source software for distributed processing of large chunks of data

•Offers distributed parallel processing across servers, scaling from a single server to clusters of many machines

•Processing and analysis of thousands of terabytes of data

•Apt framework to increase business efficiency and maximize ROI

•Latest release as of this presentation: Release 2.6.0, 18 November 2014


Page 3: Apache Hadoop – The Big Name In The Big Data World

Main Modules of Hadoop

•Hadoop Common

•HDFS (Hadoop Distributed File System)

•Hadoop YARN

•Hadoop MapReduce


Page 4: Apache Hadoop – The Big Name In The Big Data World

Main Modules of Hadoop (contd.)

•Hadoop Common

  - Common utilities that support the other Hadoop modules and subprojects

  - Includes file system, RPC and serialization libraries

•Hadoop Distributed File System (HDFS)

  - Distributed file system that gives applications access to their data

  - Spans all nodes in a Hadoop cluster, linking them into one big file system

  - Java based, giving scalable and reliable data storage
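
To make the idea of one file system spanning many nodes more concrete, here is a minimal sketch of how a file might be cut into fixed-size blocks, with each block copied to several nodes. It does not use the Hadoop API. The class name, the node names and the round-robin placement are assumptions made for illustration only; the real HDFS placement policy is rack-aware and more involved. The 128 MB block size and replication factor of 3 mirror the usual Hadoop 2.x defaults, which are configurable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockPlacementSketch {

    // Assumed values mirroring common Hadoop 2.x defaults (both are configurable).
    public static final long BLOCK_SIZE = 128L * 1024 * 1024; // 128 MB
    public static final int REPLICATION = 3;

    // Number of blocks needed for a file, rounding up for a partial last block.
    public static long blockCount(long fileSizeBytes) {
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    // Hypothetical round-robin placement: block b goes to nodes b, b+1, b+2 (mod n).
    // This is NOT the real HDFS policy; it only shows that each block has several holders.
    public static List<List<String>> placeBlocks(long fileSizeBytes, List<String> nodes) {
        int replicas = Math.min(REPLICATION, nodes.size());
        long blocks = blockCount(fileSizeBytes);
        List<List<String>> placement = new ArrayList<>();
        for (long b = 0; b < blocks; b++) {
            List<String> holders = new ArrayList<>();
            for (int r = 0; r < replicas; r++) {
                holders.add(nodes.get((int) ((b + r) % nodes.size())));
            }
            placement.add(holders);
        }
        return placement;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        long fileSize = 300L * 1024 * 1024; // a 300 MB file needs 3 blocks
        List<List<String>> placement = placeBlocks(fileSize, nodes);
        for (int b = 0; b < placement.size(); b++) {
            System.out.println("block " + b + " -> " + placement.get(b));
        }
    }
}
```

Because every block lives on more than one node in this sketch, losing a single node still leaves at least two copies of each block, which is the intuition behind the reliable storage mentioned above.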


Page 5: Apache Hadoop – The Big Name In The Big Data World

Main Modules of Hadoop (contd.)

•Hadoop YARN

  - Used for job scheduling and cluster resource management

  - Splits the two roles of the JobTracker, namely resource management and job scheduling, into separate components

•Hadoop MapReduce

  - System for parallel processing of large data sets

  - A framework that assigns work to the nodes in a cluster

  - Used to write applications that process large amounts of data reliably across multiple hardware nodes
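
The MapReduce programming model can be illustrated with the classic word count. The sketch below is a plain in-memory Java imitation of the three phases (map, shuffle, reduce) and does not use the Hadoop API. The class and method names are hypothetical, and everything runs sequentially in one process, whereas Hadoop would run the map and reduce tasks in parallel on different nodes of the cluster.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key (sorted by key, as a TreeMap).
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the counts collected for one word.
    public static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: runs the three phases one after another, in a single process.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data big cluster", "data on a cluster");
        System.out.println(wordCount(lines)); // prints {a=1, big=2, cluster=2, data=2, on=1}
    }
}
```

The point of the model is that `map` and `reduce` each look at only one line or one key at a time, so a framework is free to spread those calls across many nodes and rerun them if a node fails.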


Page 6: Apache Hadoop – The Big Name In The Big Data World

Other Hadoop Related Projects at Apache

•Avro

•Cassandra

•HBase

•Hive

•Pig

•Spark

•Ambari

•Chukwa

•Mahout

•Tez

•ZooKeeper


Page 7: Apache Hadoop – The Big Name In The Big Data World

Why Hadoop?

•Next-generation real-time analytics

•Rich ecosystem

•Scale-out storage

•Reduced cost of ownership

•Scalability, Flexibility and Reliability

•Fault tolerance

•Simple programming models


Page 8: Apache Hadoop – The Big Name In The Big Data World

Looking Forward To Having A Mutually Beneficial Association.

Assuring You Of Our Best Services Always.

SPEC INDIA

"SPEC House", Parth Complex, Swastik Cross Road, Navrangpura, Ahmedabad-380 009, INDIA.

Tel.: +91-79-26404031 to 34

VoIP: +1-908-450-9862

Instant Messengers

spec.bd | spec_india | bd.spec

specindia2009 | specindia.bd

e-mail: [email protected]

URL: http://www.spec-india.com

THANK YOU