Proceedings of the 22nd International Scientific Conference on Applications of Mathematics and Statistics in Economics (AMSE 2019)

Effect of ordinal variable transformations on hierarchical clustering results: A case study on the Big Data phenomenon

Authors
Hana Řezanková, Richard Novák
Corresponding Author
Hana Řezanková
Available Online October 2019.
DOI
10.2991/amse-19.2019.9
Keywords
cluster analysis, ordinal variables, hierarchical clustering, data transformation, distance measures, Big Data phenomenon, New Digital Divide
Abstract

The aim of the paper is to show some possible transformations of ordinal variables in cluster analysis and to discuss their effect on hierarchical clustering results. Although several papers comparing different approaches to clustering objects characterized by ordinal variables have been published, the comparisons are incomplete and also include variables other than ordinal ones (e.g. nominal variables). This paper considers the following ways of representing ordinal variables in clustering: “original” values (from one to the number of categories), standardized values, values transformed based on the range, ranks of the original values (averaged in case of ties), standardized ranks, and ranks transformed based on the range (the usually recommended option). The results of the complete linkage method obtained with the Manhattan and Euclidean distances are compared for different numbers of clusters. Moreover, these results are compared with the results obtained by the TwoStep algorithm. The case study is based on the answers of 481 respondents concerning awareness of problems related to the “Big Data Phenomenon” and the “New Digital Divide”.
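
The six representations listed in the abstract, together with complete linkage under the Manhattan and Euclidean distances, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give formulas, so the range transformation is assumed here to be min-max scaling, standardization is assumed to use the population standard deviation, and all function names and the six-respondent toy data are hypothetical (they are not the survey data). The TwoStep algorithm is not covered.

```python
from statistics import mean, pstdev

def average_ranks(values):
    """Ranks 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def standardize(values):
    # Assumption: z-scores with the population standard deviation.
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def range_scale(values):
    # Assumption: "based on the range" is read as min-max scaling to [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def transformations(values):
    """Return the six representations of one ordinal variable."""
    ranks = average_ranks(values)
    return {
        "original": list(values),
        "standardized": standardize(values),
        "range": range_scale(values),
        "ranks": ranks,
        "standardized_ranks": standardize(ranks),
        "range_ranks": range_scale(ranks),
    }

def transform_rows(rows, name):
    """Apply one named transformation to every variable (column) of the data."""
    cols = [transformations(list(col))[name] for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def complete_linkage(rows, k, dist):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    whose farthest members are closest, until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist(rows[i], rows[j]) for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return sorted(sorted(c) for c in clusters)

# Hypothetical answers of six respondents to two 5-point ordinal items.
answers = [[1, 1], [1, 2], [2, 1], [4, 5], [5, 5], [5, 4]]

for name in transformations([1, 2]):
    for dist in (manhattan, euclidean):
        rows = transform_rows(answers, name)
        print(name, dist.__name__, complete_linkage(rows, 2, dist))
```

On this well-separated toy data every combination should recover the same two groups. Differences between transformations, such as those the paper investigates, would only be expected to show up on less clear-cut data, for example when variables have different numbers of categories or skewed category frequencies.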

Copyright
© 2019, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 22nd International Scientific Conference on Applications of Mathematics and Statistics in Economics (AMSE 2019)
Series
Atlantis Studies in Uncertainty Modelling
Publication Date
October 2019
ISBN
978-94-6252-804-8
ISSN
2589-6644

Cite this article

TY  - CONF
AU  - Hana Řezanková
AU  - Richard Novák
PY  - 2019/10
DA  - 2019/10
TI  - Effect of ordinal variable transformations on hierarchical clustering results: A case study on the Big Data phenomenon
BT  - Proceedings of the 22nd International Scientific Conference on Applications of Mathematics and Statistics in Economics (AMSE 2019)
PB  - Atlantis Press
SP  - 81
EP  - 90
SN  - 2589-6644
UR  - https://doi.org/10.2991/amse-19.2019.9
DO  - 10.2991/amse-19.2019.9
ID  - Řezanková2019/10
ER  -