HETEROGENEITY-BASED FAIR DATA DISTRIBUTION IN A HADOOP ENVIRONMENT
Keywords: Hadoop, Data Locality, Distributed Computing, Big Data, Cloud Computing

Abstract
The Hadoop framework has been developed to effectively process data-intensive MapReduce applications.
Hadoop users specify the application computation logic in terms of a map and a reduce function, which are often termed
MapReduce applications. The Hadoop distributed file system is used to store the MapReduce application data on the
Hadoop cluster nodes, called Datanodes, while the Namenode serves as the control point for all Datanodes. Although this design increases resilience, the current data-distribution methodologies are not necessarily efficient for heterogeneous distributed environments such as public clouds. This work contends that existing data distribution techniques are not necessarily
suitable, since the performance of Hadoop typically degrades in heterogeneous environments whenever data distribution
is not aligned with the computing capability of the nodes. Data locality and its impact on the performance of Hadoop are key factors, since they affect performance in the Map phase when tasks are scheduled. The
task scheduling techniques in Hadoop should therefore consider data locality to enhance performance. Various task scheduling techniques have been analysed to understand their data-locality awareness when scheduling applications. Other system factors also play a major role in achieving high performance in Hadoop data processing.
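The map and reduce functions mentioned above can be illustrated with a minimal sketch. This is a pure-Python word-count simulation of the MapReduce model, not Hadoop's actual Java API; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative only:

```python
from collections import defaultdict

def map_phase(document):
    """Map function: emit (word, 1) pairs, as a Hadoop mapper would."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Group intermediate values by key (the shuffle/sort step
    that Hadoop performs between the Map and Reduce phases)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce function: aggregate all values for one key."""
    return (key, sum(values))

# Two toy "input splits"; on a real cluster each would live on a Datanode.
documents = ["big data on hadoop", "data locality in hadoop"]
intermediate = [pair for doc in documents for pair in map_phase(doc)]
result = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(result)  # {'big': 1, 'data': 2, 'on': 1, 'hadoop': 2, 'locality': 1, 'in': 1}
```

In Hadoop itself, data locality matters because each `map_phase` call is scheduled as a task that ideally runs on the Datanode already holding its input split, avoiding network transfer of the data.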