High Performance Hadoop Distributed File System

Mohamed Elkawkagy; Heba Elbeh

doi:10.2991/ijndc.k.200515.007

Volume 8, Issue 3, June 2020, Pages 119 - 123

Authors

Mohamed Elkawkagy^*, Heba Elbeh

Computer Science Department, Menoufia University, Shibin Elkom, Menoufia, Egypt

^*Corresponding author. Email: M_nabil_shams@yahoo.com

Received 13 March 2020, Accepted 24 April 2020, Available Online 25 May 2020.

DOI: 10.2991/ijndc.k.200515.007
Keywords: Cloud; HDFS; fault tolerance; reliability
Abstract: Although most companies are expected to be running 1000-node Hadoop clusters by the end of 2020, Hadoop deployments still face many challenges, such as security, fault tolerance, and flexibility. Hadoop is a software framework for handling big data, and it includes a distributed file system called the Hadoop Distributed File System (HDFS). HDFS provides fault tolerance through data replication: it copies the data to multiple DataNodes, which achieves reliability and availability. Although data replication works well, it still wastes considerable time because it uses a single-pipeline paradigm. The proposed approach improves the performance of HDFS by transferring data blocks over multiple pipelines instead of a single pipeline. In addition, each DataNode updates its reliability value after each round and sends the updated value to the NameNode. The NameNode sorts the DataNodes according to their reliability values. When a client submits a request to upload a data block, the NameNode replies with a list of high-reliability DataNodes, which is intended to achieve high performance. The proposed approach is fully implemented, and the experimental results show that it improves the performance of HDFS write operations.
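The selection and transfer steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the Hadoop API: the class names, the `send` callback, and in particular the reliability update rule (an exponential moving average with weight `alpha`) are assumptions made here for illustration, since the abstract does not specify them.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class DataNode:
    name: str
    reliability: float = 1.0  # assumed to lie in [0, 1]

    def report_round(self, succeeded: bool, alpha: float = 0.5) -> None:
        # Hypothetical update rule (not from the paper): exponential
        # moving average of the success/failure outcome of each round.
        outcome = 1.0 if succeeded else 0.0
        self.reliability = (1 - alpha) * self.reliability + alpha * outcome


class NameNode:
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def choose_targets(self, replicas: int = 3):
        # Sort DataNodes by reliability, highest first, and return the top ones.
        ranked = sorted(self.nodes, key=lambda n: n.reliability, reverse=True)
        return ranked[:replicas]


def write_block(block: bytes, targets, send):
    """Send one block to every target concurrently (one pipeline per target)
    instead of chaining the targets in a single pipeline.
    `send(node, block)` is a caller-supplied stand-in that returns True on success."""
    if not targets:
        return {}
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        results = list(pool.map(lambda node: send(node, block), targets))
    outcome = {}
    for node, ok in zip(targets, results):
        node.report_round(ok)  # each DataNode updates its value after the round
        outcome[node.name] = ok
    return outcome
```

In this sketch, a DataNode that fails a round drops in the ranking and is less likely to appear in the list returned for the next upload request.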
Copyright: © 2020 The Authors. Published by Atlantis Press SARL.
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

Journal: International Journal of Networked and Distributed Computing
Volume-Issue: 8 - 3
Pages: 119 - 123
Publication Date: 2020/05/25
ISSN (Online): 2211-7946
ISSN (Print): 2211-7938

Cite this article

TY  - JOUR
AU  - Mohamed Elkawkagy
AU  - Heba Elbeh
PY  - 2020
DA  - 2020/05/25
TI  - High Performance Hadoop Distributed File System
JO  - International Journal of Networked and Distributed Computing
SP  - 119
EP  - 123
VL  - 8
IS  - 3
SN  - 2211-7946
UR  - https://doi.org/10.2991/ijndc.k.200515.007
DO  - 10.2991/ijndc.k.200515.007
ID  - Elkawkagy2020
ER  -