DisCoSet: Discovery of Contrast Sets to Reduce Dimensionality and Improve Classification
- DOI
- 10.1080/18756891.2015.1113750
- Keywords
- Contrast sets, dimensionality reduction, classification, information retrieval, data mining
- Abstract
Traditionally, contrast set mining aims to find a set of rules that best distinguish the instances of different user-defined groups. Contrast sets are conjunctions of attribute-value pairs that are significantly more frequent in one group than in the others. Typically, they are extracted from categorical data or discretized numerical data, and existing rule-based methods require user-defined thresholds to select them. In this paper, we propose a greedy algorithm, called DisCoSet, that incrementally finds a minimum set of local features that best distinguishes a class from the other classes without resorting to discretization. The discovered contrast sets considerably reduce the dimensionality of the feature vectors and significantly improve classification accuracy. Across different types of datasets, the proposed algorithm reduces the dimensionality of class instances by 40%-97% of the original length while improving classification accuracy by 10%-24%.
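To make the traditional definition concrete, the following minimal sketch computes how much more frequent a conjunction of attribute-value pairs is in one group than in the others. It illustrates only the classical rule-based notion of a contrast set on categorical data, not the DisCoSet algorithm itself, whose details are not given in the abstract. The function names (`support`, `support_difference`), the toy data, and the choice to compare against the highest support among the other groups are all assumptions made for illustration; a real method would also need a statistical test to decide whether the difference is significant.

```python
from typing import Dict, List, Mapping

# A row maps attribute names to categorical values (hypothetical toy format).
Row = Mapping[str, str]


def support(rows: List[Row], itemset: Mapping[str, str]) -> float:
    """Fraction of rows that match every attribute-value pair in itemset."""
    if not rows:
        return 0.0
    hits = sum(
        all(row.get(attr) == value for attr, value in itemset.items())
        for row in rows
    )
    return hits / len(rows)


def support_difference(
    groups: Dict[str, List[Row]], itemset: Mapping[str, str], target: str
) -> float:
    """Support in the target group minus the highest support in any other group.

    Assumption: comparing against the best competing group is one simple way
    to score a candidate; it is not taken from the paper.
    """
    target_support = support(groups[target], itemset)
    other_support = max(
        support(rows, itemset) for name, rows in groups.items() if name != target
    )
    return target_support - other_support


# Toy data, invented for illustration only.
groups = {
    "spam": [
        {"has_link": "yes", "caps": "high"},
        {"has_link": "yes", "caps": "high"},
        {"has_link": "yes", "caps": "low"},
        {"has_link": "no", "caps": "high"},
    ],
    "ham": [
        {"has_link": "no", "caps": "low"},
        {"has_link": "yes", "caps": "low"},
        {"has_link": "no", "caps": "low"},
        {"has_link": "no", "caps": "high"},
    ],
}

candidate = {"has_link": "yes", "caps": "high"}
diff = support_difference(groups, candidate, "spam")
print(f"support difference for {candidate}: {diff}")
```

A candidate whose difference exceeds a chosen threshold would be kept as a contrast set. That threshold is exactly the kind of user-defined parameter the abstract says existing rule-based methods depend on.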
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - JOUR
AU - Zaher Al Aghbari
AU - Imran N. Junejo
PY - 2015
DA - 2015/12/01
TI - DisCoSet: Discovery of Contrast Sets to Reduce Dimensionality and Improve Classification
JO - International Journal of Computational Intelligence Systems
SP - 1178
EP - 1191
VL - 8
IS - 6
SN - 1875-6883
UR - https://doi.org/10.1080/18756891.2015.1113750
DO - 10.1080/18756891.2015.1113750
ID - AlAghbari2015
ER -