A New Data Classification Algorithm for Data-Intensive Computing Environments

Qizhi Deng; Longbo Zhang; Xin Qian; Yali Chen; Fengying Wang

doi:10.2991/iccia.2012.335

Authors

Qizhi Deng, Longbo Zhang, Xin Qian, Yali Chen, Fengying Wang

Corresponding Author

Qizhi Deng

Available Online May 2014.

DOI: 10.2991/iccia.2012.335
Keywords: MapReduce, Data-Intensive, SPRINT, Gini index
Abstract: To address the scalability and data availability problems that data mining techniques encounter in data-intensive computing, this paper presents a new tree learning method. By introducing MapReduce, the SPRINT-based tree learning method scales well when handling large datasets. Moreover, we define the search for a split point as a series of distributed computations, each of which is implemented with the MapReduce model. A new data structure, called the class distribution table, is introduced to assist the calculation of histograms. Experiments and analysis of the results show that the algorithm has strong data mining processing capabilities in data-intensive computing environments.
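As an illustration of the split-point search the abstract describes, the following minimal Python sketch evaluates Gini-index splits for one numeric attribute from per-partition class counts. It is not the authors' implementation: the MapReduce steps are simulated in-process with the built-in `map` and `functools.reduce`, a plain `Counter` stands in for the class distribution table (whose exact layout the abstract does not give), and all function names and sample records are hypothetical.

```python
from collections import Counter
from functools import reduce


def gini(counts):
    """Gini index 1 - sum(p_j^2) of a class histogram."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def map_class_counts(partition):
    """Map step (simulated): count class labels in one partition."""
    return Counter(label for _, label in partition)


def reduce_class_counts(left, right):
    """Reduce step (simulated): merge two partial class histograms."""
    return left + right


def best_split(records, total_counts):
    """Scan records sorted by value, SPRINT-style, keeping one histogram
    for the records below the candidate split and one for those above.
    Returns (threshold, weighted Gini index of the split)."""
    below = Counter()
    above = Counter(total_counts)
    n = sum(total_counts.values())
    best = (None, float("inf"))
    ordered = sorted(records)
    for i in range(len(ordered) - 1):
        value, label = ordered[i]
        below[label] += 1
        above[label] -= 1
        next_value = ordered[i + 1][0]
        if value == next_value:
            continue  # no split point between equal attribute values
        n_below = i + 1
        score = (n_below / n) * gini(below) + ((n - n_below) / n) * gini(above)
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best


# Hypothetical (attribute value, class label) records in three partitions.
partitions = [
    [(23, "high"), (17, "high")],
    [(43, "low"), (68, "low")],
    [(32, "low"), (20, "high")],
]
total = reduce(reduce_class_counts, map(map_class_counts, partitions))
records = [record for part in partitions for record in part]
threshold, score = best_split(records, total)
print(threshold, score)
```

For these sample records the two classes separate cleanly between the values 23 and 32, so the scan selects the midpoint 27.5 with a weighted Gini index of 0.0. In a real deployment the sorted scan itself would also be distributed, which this single-process sketch does not attempt.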
Copyright: © 2013, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title: Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
Series: Advances in Intelligent Systems Research
Publication Date: May 2014
ISBN: 978-94-91216-41-1
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Qizhi Deng
AU  - Longbo Zhang
AU  - Xin Qian
AU  - Yali Chen
AU  - Fengying Wang
PY  - 2014/05
DA  - 2014/05
TI  - A New Data Classification Algorithm for Data-Intensive Computing Environments
BT  - Proceedings of the 2012 2nd International Conference on Computer and Information Application (ICCIA 2012)
PB  - Atlantis Press
SP  - 1351
EP  - 1354
SN  - 1951-6851
UR  - https://doi.org/10.2991/iccia.2012.335
DO  - 10.2991/iccia.2012.335
ID  - Deng2014/05
ER  -
