WEATHER DATA ANALYTICS USING
HADOOP
By,
M.S. Najima Begum,
II B.Sc(Computer Science)
Department of Computer Science,
ANJA College, Sivakasi.
About Hadoop
• The Apache Hadoop software library is a framework that
allows for the distributed processing of large data sets across
clusters of computers using simple programming models.
• It is designed to scale up from single servers to thousands of
machines, each offering local computation and storage.
Abstract
In this paper, we use the Hadoop platform to analyse weather
data from various years using MapReduce programs written in
Java. The programs find the maximum or minimum temperature for
each year, and the results can be used for further analysis.
The paper uses these Hadoop modules:
• Hadoop Common: The common utilities that support the
other Hadoop modules.
• Hadoop Distributed File System (HDFS™): A distributed
file system that provides high-throughput access to application
data.
• Hadoop MapReduce: A YARN-based system for parallel
processing of large data sets.
Hadoop Architecture
Hadoop Architecture = HDFS Architecture + MapReduce
MapReduce
WEATHER DATA ANALYTICS
Weather prediction is the application of technology to predict
the state of the atmosphere for a given location.
DATA
• The data we will use is from the National Climatic Data
Center (NCDC, http://www.ncdc.noaa.gov/).
• The data is stored using a line-oriented ASCII format, in
which each line is a record.
• The format supports a rich set of meteorological elements,
many of which are optional or with variable data lengths.
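A minimal Java sketch of reading two fields from one such record might look like this. The column positions (year at characters 15 to 18, signed temperature at characters 87 to 91) are assumptions based on the commonly documented NCDC fixed-width layout, and the class, method names and padded sample record are hypothetical.

```java
import java.util.Arrays;

// Sketch only: field offsets are assumed, not taken from the slides.
final class NcdcRecordSketch {

    // Year is assumed to occupy characters 15..18 (0-based).
    static String year(String record) {
        return record.substring(15, 19);
    }

    // Temperature is assumed to be a sign plus four digits at 87..91.
    // Integer.parseInt accepts a leading '+' or '-'.
    static int airTemperature(String record) {
        return Integer.parseInt(record.substring(87, 92));
    }

    // Hypothetical helper: builds a padded 105-character record so the
    // sketch can run without a real NCDC file.
    static String syntheticRecord(String year, String signedTemp) {
        char[] buf = new char[105];
        Arrays.fill(buf, '9');
        year.getChars(0, 4, buf, 15);
        signedTemp.getChars(0, 5, buf, 87);
        return new String(buf);
    }

    public static void main(String[] args) {
        String record = syntheticRecord("1950", "-0011");
        System.out.println(year(record) + " " + airTemperature(record));
    }
}
```

Because every field sits at a fixed column, plain substring calls are enough; no delimiter parsing is needed.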
FINDING MAXIMUM TEMPERATURE IN VARIOUS YEARS
• INPUT
0043011990999991950051512004...9999999N9+00221+99999999999...
0043011990999991950051518004...9999999N9-00111+99999999999...
0043012650999991949032412004...0500001N9+01111+99999999999...
0043012650999991949032418004...0500001N9+00781+99999999999...
• INPUT TO MAPPER:
(106,0043011990999992010051512004...9999999N9+00221+99999999999...)
(212,0043011990999991950051518004...9999999N9-00111+99999999999...)
(318,0043012650999992011032412004...0500001N9+01111+99999999999...)
(424,0043012650999991949032418004...0500001N9+00781+99999999999...)
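The keys in these pairs are the byte offsets at which each line starts in the input file. As a rough illustration, the sketch below computes such offsets for a list of lines, assuming ASCII text, a single newline after every line, and 105-character records (which is what the spacing of 106 between the keys suggests). The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: mimics how a line-oriented input format keys each record.
final class LineOffsetSketch {

    // Returns the starting byte offset of every line, assuming each line
    // is followed by exactly one '\n' byte.
    static List<Long> startOffsets(List<String> lines) {
        List<Long> offsets = new ArrayList<>();
        long position = 0;
        for (String line : lines) {
            offsets.add(position);
            position += line.getBytes(StandardCharsets.US_ASCII).length + 1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        char[] buf = new char[105];          // hypothetical record length
        Arrays.fill(buf, '0');
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            lines.add(new String(buf));
        }
        System.out.println(startOffsets(lines)); // [0, 106, 212, 318, 424]
    }
}
```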
INPUT TO REDUCER
The input to the reducer is the output of the mapper, which the
framework sorts and groups by key (here, the year) before the
reducer runs.
• OUTPUT FROM MAPPER
(2010, 22)
(1950, −11)
(2011, 111)
(1949, 78)
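The grouping and maximum-finding steps applied to these pairs can be sketched in plain Java, without the Hadoop API, as follows. This is an illustrative approximation of the shuffle and reduce phases, not the program used in the paper; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: simulates shuffle (group by year) and reduce (max) in memory.
final class MaxTemperatureSketch {

    // Shuffle step: collect all temperatures per year, sorted by year.
    static Map<Integer, List<Integer>> groupByYear(int[][] pairs) {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (int[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], y -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    // Reduce step: keep the largest temperature seen for each year.
    static Map<Integer, Integer> maxPerYear(Map<Integer, List<Integer>> grouped) {
        Map<Integer, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : grouped.entrySet()) {
            int max = Integer.MIN_VALUE;
            for (int temperature : entry.getValue()) {
                max = Math.max(max, temperature);
            }
            result.put(entry.getKey(), max);
        }
        return result;
    }

    public static void main(String[] args) {
        // The four (year, temperature) pairs listed above.
        int[][] mapperOutput = { {2010, 22}, {1950, -11}, {2011, 111}, {1949, 78} };
        System.out.println(maxPerYear(groupByYear(mapperOutput)));
        // {1949=78, 1950=-11, 2010=22, 2011=111}
    }
}
```

With only one reading per year the maximum equals the reading itself; on a full dataset each year's list holds many values and the reducer keeps the largest.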
WEATHER DATA
PROCESSING
OUTPUT
2010 280
The maximum temperature in the year 2010 is 280. NCDC
temperatures are recorded in tenths of a degree Celsius, so
this corresponds to 28.0 °C.
CONCLUSION
We have analysed the weather data in a big data environment.
The method used in our paper is Hadoop with MapReduce, which
processes the sensor data stored by the National Climatic Data
Center (NCDC).
Any Queries…