International Journal of Computational Intelligence Systems

Volume 13, Issue 1, 2020, Pages 1473 - 1482

Attribute Reduction of Boolean Matrix in Neighborhood Rough Set Model

Authors
Yan Gao, Changwei Lv, Zhengjiang Wu*
College of Computer Science and Technology, Henan Polytechnic University, No. 2001 Century Avenue, Jiaozuo, Henan 454003, P.R. China
*Corresponding author. Email: wuzhengjiang@hpu.edu.cn
Received 2 January 2020, Accepted 2 September 2020, Available Online 21 September 2020.
DOI
10.2991/ijcis.d.200915.004
Keywords
Neighborhood rough set; Boolean matrix; Attribute reduction; GPU
Abstract

Neighborhood rough set is a powerful tool for continuous-value information systems. Graphics processing unit (GPU) computing can efficiently accelerate the matrix-based calculation of attribute reductions and approximation sets. In this paper, we rewrite the neighborhood approximation sets in matrix-based form. Based on the matrix-based neighborhood approximation sets, we propose the relative dependency degree of attributes and the corresponding algorithm (DBM). Furthermore, we design an attribute reduction algorithm (ARNI) for continuous-value information systems. Compared with other algorithms, ARNI effectively removes redundant attributes with little loss of classification accuracy. On the other hand, the experiments show that ARNI, built on the matrix-based rough set model, can be significantly accelerated by GPU: the speedup is many times over the central processing unit implementation.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The explosive growth of data volume increases the complexity of data and makes data processing more difficult than before. By removing unnecessary or irrelevant attributes from the data, the efficiency of decision-making can be improved. The rough set theory proposed by Pawlak can quickly deal with uncertain and inaccurate problems [1] and has been widely used in machine learning, data mining, and other fields [2]. To construct an equivalence relation of the Pawlak rough set for a continuous-value information system, the continuous values need to be clustered into mutually disjoint blocks. However, discretizing continuous values introduces uncertainty and may lose essential information. To solve this problem, many rough set models have been proposed, such as fuzzy rough sets [3-6], covering rough sets [7-9], the semi-monolayer cover rough set [10], neighborhood rough sets [11-14], and granule-based rough sets [15-17]. The neighborhood rough set is a feasible model for handling continuous values without discretization.

In neighborhood rough sets, determining whether two elements are neighbors is easier and more localized than in other methods [11], which offers better potential for parallel and distributed computation [18]. Furthermore, by marking the neighborhood of an element, we can label some data and improve the usability of unlabeled big data [19]. However, computational complexity remains an unavoidable issue for the neighborhood rough set [19,20].

A matrix description makes the calculation of approximation sets more efficient. Huang et al. [21] defined three composition operations, studied their characteristic matrices, and investigated the relationship between the characteristic matrices and covering approximation operators. Wang et al. [22] represented three types of existing covering approximation operators with Boolean matrices. Ma [23] selected further covering approximation models and rewrote them in matrix form.

Graphics processing unit (GPU) acceleration is widely used not only in deep learning [24] but also in operations on dense matrices and blocks [25], such as minimizing the overhead encountered in transforming tensor contractions into matrix multiplications [26], object sorting [27], and the computation of equivalence classes and approximation sets [28]. Generally, a GPU is at least 10 times faster than a coetaneous central processing unit (CPU) [29]. Zhang et al. [30] adopted a multi-GPU solution to accelerate their parallel matrix-based method for computing approximations and achieved a 334.9-fold speedup over the CPU.

The matrixing of the whole computation process is the premise of the GPU speedup. For the neighborhood rough approximation operators, disjunction and conjunction between Boolean matrices can represent union and intersection between sets. Through a sequence of substitution operations, the operations of the neighborhood rough set are transformed from the element-based form to the matrix-based form. Furthermore, we propose the relative dependency degree of attributes based on the matrixed neighborhood approximation sets, together with the corresponding algorithms (DBM and ARNI).

This paper is organized as follows: In Section 2, some basic concepts about neighborhood rough sets are introduced. In Section 3, we propose attribute reduction algorithms DBM and ARNI based on the Boolean matrix. In Section 4, a series of comparative experiments are designed to show the feasibility and efficiency of ARNI. Section 5 is the conclusion of this paper.

2. PRELIMINARIES

In information systems, the elements can easily be clustered into information granules by different methods. How to describe these granules systematically within one framework is a basic question in the construction of approximation spaces in rough set theory. The binary relation and the neighborhood system are two common entry points. A binary relation summarizes the principle that generates the information granules, whereas covering or neighborhood systems directly organize the related elements into a block or a neighborhood. In the classic Pawlak rough set, the equivalence relation and the equivalence classes can be transformed into each other. In other models, however, the transformation is no longer smooth [31]. The choice between a binary relation and a neighborhood system depends almost entirely on the character of the information system. For example, in set-valued or incomplete information systems, we can use not only the tolerance relation and its generalized forms [18] but also maximal consistent blocks [11] or semi-monolayer covering [10] to organize the blocks of the system. For information systems with noise, the variable precision binary relation [32] and the probabilistic binary relation [33] are frequently mentioned. For continuous information systems, we can build a hypersphere neighborhood to describe the degree of similarity among the elements, as shown in Figure 1.

Figure 1

The ε-neighborhood of the elements.

In this section, we introduce the basic concepts about neighborhood rough set and two Boolean matrix operations.

2.1. Neighborhood Rough Set

In information systems, let $U=\{x_1,x_2,x_3,\dots,x_n\}$ be a nonempty and finite set, $A=\{a_1,a_2,a_3,\dots,a_m\}$ be the finite condition-attribute set of $U$, and $d$ be the decision attribute of $U$, whose decision classes are $U/d=\{d_1,d_2,d_3,\dots,d_k\}$. If a family of neighborhood relations can be generated by the condition-attribute set $A$, then $NS=(U,A)$ is called a neighborhood system, and $NDS=(U,A,d)$ is called a neighborhood decision system.

Definition 1.

[12,13] In the neighborhood system $NS=(U,A)$, for $x_i\in U$, the neighborhood of $x_i$ is defined as

$\varepsilon(x_i)=\{x_j \mid x_j\in U,\ \delta_p(x_i,x_j)\le\varepsilon\}$  (1)

where $\varepsilon$ is the radius of the hypersphere neighborhood and $\delta_p$ is the Minkowsky distance between $x_i$ and $x_j$: (1) it is called the Manhattan distance if $p=1$, (2) the Euclidean distance if $p=2$, (3) the Chebyshev distance if $p=\infty$. The Euclidean distance is selected in this paper.
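For illustration, the small Python sketch below (our code, not the authors'; the function name is ours) evaluates the Minkowsky distance of Definition 1 and its three named special cases on the first two rows of Table 1:

```python
# A minimal sketch of the Minkowsky distance; p=float("inf") gives Chebyshev.
import numpy as np

def minkowsky(x: np.ndarray, y: np.ndarray, p: float = 2.0) -> float:
    diff = np.abs(x - y)
    if np.isinf(p):
        return diff.max()                  # Chebyshev distance (p = infinity)
    return (diff ** p).sum() ** (1.0 / p)  # p = 1: Manhattan, p = 2: Euclidean

x1, x2 = np.array([0.9, 0.6, 1.4, 1.3]), np.array([0.5, 1.0, 1.5, 1.1])
print(minkowsky(x1, x2, 1))               # 1.1
print(minkowsky(x1, x2, 2))               # ~0.608, so x2 is not in eps(x1) for eps = 0.5
print(minkowsky(x1, x2, float("inf")))    # 0.4
```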

Hu proposed the definition of the neighborhood rough set [13,14], a specialized covering rough model on the neighborhood approximation space (Definition 2).

Definition 2.

[11-14,34] In the neighborhood system $NS=(U,A)$, for any $X\subseteq U$, the upper approximation $\overline{N}(X)$, the lower approximation $\underline{N}(X)$, and the boundary region $BN(X)$ of $X$ are defined as follows, respectively:

$\overline{N}(X)=\{x_i \mid \varepsilon(x_i)\cap X\ne\emptyset\}$  (2)
$\underline{N}(X)=\{x_i \mid \varepsilon(x_i)\subseteq X\}$  (3)
$BN(X)=\overline{N}(X)-\underline{N}(X)$  (4)

Example 1.

Given the neighborhood decision system $NDS=(U,A,d)$ of Table 1, where $d$ is the decision attribute and $a_i\in A$ are the condition attributes, suppose that $\varepsilon=0.5$ and that the neighborhood of $x_i$ under $A$ is computed with

$\delta_p(x_i,x_j)=\sqrt{\sum_{k=1}^{4}\big(v(x_i,a_k)-v(x_j,a_k)\big)^2}$,
$\varepsilon(x_i)=\{x_j \mid x_j\in U,\ \delta_p(x_i,x_j)\le 0.5\}$

where $i,j=1,2,3,4,5$. Then
U a1 a2 a3 a4 d
x1 0.9 0.6 1.4 1.3 0
x2 0.5 1.0 1.5 1.1 0
x3 0.6 1.3 1.4 1.0 1
x4 1.0 1.2 1.7 1.5 1
x5 1.1 0.9 2.5 1.4 1
Table 1

Decision information system.

The values of the neighborhoods are

$\varepsilon(x_1)=\{x_1\}$, $\varepsilon(x_2)=\{x_2,x_3\}$, $\varepsilon(x_3)=\{x_2,x_3\}$, $\varepsilon(x_4)=\{x_4\}$, $\varepsilon(x_5)=\{x_5\}$

By (2) and (3), with the decision classes $d_1=\{x_1,x_2\}$ and $d_2=\{x_3,x_4,x_5\}$ read from Table 1, the upper and lower approximation sets of the neighborhood rough set are as follows:

$\overline{N}(d_1)=\{x_i \mid \varepsilon(x_i)\cap d_1\ne\emptyset\}=\{x_1,x_2,x_3\}$,
$\overline{N}(d_2)=\{x_2,x_3,x_4,x_5\}$

and

$\underline{N}(d_1)=\{x_i \mid \varepsilon(x_i)\subseteq d_1\}=\{x_1\}$,
$\underline{N}(d_2)=\{x_4,x_5\}$
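For concreteness, the following Python sketch (our illustration, not the authors' code; the variable names are ours) recomputes the neighborhoods and the element-based approximations of this example:

```python
import numpy as np

# Table 1 as a matrix; rows are x1..x5, columns are a1..a4.
U = np.array([
    [0.9, 0.6, 1.4, 1.3],
    [0.5, 1.0, 1.5, 1.1],
    [0.6, 1.3, 1.4, 1.0],
    [1.0, 1.2, 1.7, 1.5],
    [1.1, 0.9, 2.5, 1.4],
])
d = np.array([0, 0, 1, 1, 1])   # decision: d1 = {x1, x2}, d2 = {x3, x4, x5}
eps = 0.5

# Pairwise Euclidean distances; neighborhood(i) = {j : delta(x_i, x_j) <= eps}
dist = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=2)
neighborhood = [set(np.flatnonzero(dist[i] <= eps)) for i in range(len(U))]

# Definition 2, element-based upper/lower approximations of each decision class
for label in (0, 1):
    block = set(np.flatnonzero(d == label))
    upper = sorted(i for i in range(len(U)) if neighborhood[i] & block)
    lower = sorted(i for i in range(len(U)) if neighborhood[i] <= block)
    print(f"d{label + 1}: upper = {upper}, lower = {lower}")
# d1: upper = [0, 1, 2], lower = [0]        (x1, x2, x3 / x1, 0-indexed)
# d2: upper = [1, 2, 3, 4], lower = [3, 4]  (x2..x5 / x4, x5)
```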

2.2. Boolean Matrix Operation

The Boolean matrix has frequently been used to describe classic binary relations and neighborhood systems. To matricize the upper and lower neighborhood approximation operations, we need to upgrade the traditional disjunction- and conjunction-based operations so that they can calculate the union and intersection of two sets.

Definition 3.

[23,34] Let $\odot$ be an operation between two Boolean matrices, and let $A_{n\times m}=(a_{ik})_{n\times m}$ and $B_{m\times l}=(b_{kj})_{m\times l}$ be two Boolean matrices. Then $C_{n\times l}=(c_{ij})_{n\times l}=A\odot B$ is calculated as

$c_{ij}=\bigvee_{k=1}^{m}(a_{ik}\wedge b_{kj})$  (5)

where $i=1,2,3,\dots,n$ and $j=1,2,3,\dots,l$; $\vee$ and $\wedge$ are the Boolean logic operations of disjunction and conjunction, respectively.

Definition 4.

[23,34] Let $\boxdot$ be a matrix transformational operation between two Boolean matrices, and let $A_{n\times m}=(a_{ik})_{n\times m}$ and $B_{m\times l}=(b_{kj})_{m\times l}$ be two Boolean matrices. Then $D_{n\times l}=(d_{ij})_{n\times l}=A\boxdot B$ is calculated as

$d_{ij}=\bigwedge_{k=1}^{m}\big((1-a_{ik})\vee b_{kj}\big)$  (6)

where $i=1,2,3,\dots,n$ and $j=1,2,3,\dots,l$.
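As an illustration of how Eqs. (5) and (6) can be computed, the NumPy sketch below (our code; the function names and the broadcasting strategy are our choices, not the paper's) implements both operations for 0/1 integer matrices:

```python
import numpy as np

def bool_product(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Eq. (5): c_ij = OR_k (a_ik AND b_kj), the Boolean matrix product."""
    return (A[:, :, None] & B[None, :, :]).any(axis=1).astype(int)

def bool_residual(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Eq. (6): d_ij = AND_k ((1 - a_ik) OR b_kj)."""
    return ((1 - A[:, :, None]) | B[None, :, :]).all(axis=1).astype(int)

A = np.array([[1, 0], [1, 1]])
B = np.array([[0, 1], [1, 0]])
print(bool_product(A, B))    # [[0 1], [1 1]]
print(bool_residual(A, B))   # [[0 1], [0 0]]
```

Broadcasting replaces the explicit loop over $k$: for an $n\times m$ and an $m\times l$ input, both functions return an $n\times l$ 0/1 matrix.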

3. ATTRIBUTE REDUCTION BASED ON BOOLEAN MATRIX

In this section, the Boolean matrix operations $\odot$ and $\boxdot$ are used to matricize the set operations (union and intersection) in the neighborhood rough approximation operations. Furthermore, we define a dependency degree on the neighborhood decision system and propose a new attribute reduction algorithm. The matrix-based calculation of the neighborhood approximation operations remains consistent with the element-based calculation.

3.1. Boolean Matrix Representation of Approximations

In this section, we rewrite the neighborhood approximation operation by the Boolean matrix and its matrix operations.

Definition 5.

[34] In the neighborhood system $NS=(U,A)$, let $B\subseteq A$ and $x_i,x_j\in U$. The Boolean matrix $M_B=(m_{ij})_{n\times n}$ of the neighborhood cluster $N_B$ corresponding to $B$ is defined as

$m_{ij}=\begin{cases}1, & x_j\in\varepsilon(x_i)\\ 0, & x_j\notin\varepsilon(x_i)\end{cases}$  (7)

where $i,j=1,2,3,\dots,n$.

Obviously, this definition indicates that $M_B$ is a Boolean matrix: $N_B$ is the neighborhood cluster of $U$ with respect to the condition-attribute set $B$, $(U,N_B)$ is the neighborhood approximation space, $M_B$ is its Boolean matrix representation, and $M_U$ is the set of Boolean matrices on $U$.

For any $X\subseteq U=\{x_1,x_2,x_3,\dots,x_n\}$, the eigenvector $\lambda_X=(a_1,a_2,a_3,\dots,a_n)^T$ [24,25,34] of $X$ contains only 1s and 0s: for $x_i\in U$, $a_i=1$ if $x_i\in X$, and $a_i=0$ otherwise. For example, suppose $U=\{x_1,x_2,x_3,x_4\}$ and $X=\{x_2,x_4\}$; then $\lambda_X=(0,1,0,1)^T$. Since $\lambda_X$ is a Boolean vector, the two operations $\odot$ and $\boxdot$ are also applicable between a Boolean matrix and a Boolean vector.

Definition 6.

[22,23] In the approximation space $(U,N)$, let $M_B$ be its Boolean matrix representation. The relation matrix $R_B=(r_{ik})_{n\times n}$ of the approximation space is defined as

$R_B=M_B\odot M_B^{T}$  (8)

where $M_B^{T}$ is the transpose of $M_B$ and $\odot$ is the operation of Definition 3.

Proposition 1.

In the neighborhood approximation space $(U,N)$ with relation matrix $R_B$, let $U/d=\{d_1,d_2,d_3,\dots,d_k\}$. For $B\subseteq A$, let $\lambda_{d_i}$ be the eigenvector of $d_i$. The eigenvectors corresponding to the upper and lower approximations of $d_i$ under $B$ are, respectively,

$\lambda_{\overline{N}(d_i)}=R_B\odot\lambda_{d_i}$  (9)
$\lambda_{\underline{N}(d_i)}=R_B\boxdot\lambda_{d_i}$  (10)

where $i=1,2,3,\dots,k$.

Proof

($\Rightarrow$) For any $t\in\{1,2,3,\dots,n\}$, if $x_t\in\overline{N}(d_i)$, i.e., the $t$-th component of $\lambda_{\overline{N}(d_i)}$ is 1, then $\varepsilon(x_t)\cap d_i\ne\emptyset$, so there exists $x_l\in d_i$ with $x_l\in\varepsilon(x_t)$. Thus $r_{tl}=1$ and $a_l=1$, and hence $\bigvee_{l=1}^{n}(r_{tl}\wedge a_l)=1$.

($\Leftarrow$) For any $t\in\{1,2,3,\dots,n\}$, if $\bigvee_{l=1}^{n}(r_{tl}\wedge a_l)=1$, then there exists $l\in\{1,2,3,\dots,n\}$ such that $r_{tl}=1$ and $a_l=1$. Thus $x_l\in\varepsilon(x_t)$ and $x_l\in d_i$, so $\varepsilon(x_t)\cap d_i\ne\emptyset$, and hence $x_t\in\overline{N}(d_i)$.

Therefore, for every $t$, the $t$-th component of $\lambda_{\overline{N}(d_i)}$ equals $\bigvee_{l=1}^{n}(r_{tl}\wedge a_l)$, i.e., $\lambda_{\overline{N}(d_i)}=R_B\odot\lambda_{d_i}$, which completes the proof of formula (9).

The proof of formula (10) is similar to that of formula (9); hence Proposition 1 holds.

Definition 7.

[34] In the neighborhood approximation space $(U,N)$, let $U/d=\{d_1,d_2,d_3,\dots,d_k\}$, and let $\lambda_{\overline{N}(d_i)}=(a_1,a_2,\dots,a_n)^T$ and $\lambda_{\underline{N}(d_i)}$ be the eigenvectors corresponding to the upper and lower approximations of $d_i$. Then the upper approximation $\overline{N}(d_i)$ and the lower approximation $\underline{N}(d_i)$ of $d_i$ are defined as follows, respectively:

$\overline{N}(d_i)=\bigcup_{j=1}^{n}\{x_j \mid a_j=1,\ a_j\in\lambda_{\overline{N}(d_i)}\}$  (11)
$\underline{N}(d_i)=\bigcup_{j=1}^{n}\{x_j \mid a_j=1,\ a_j\in\lambda_{\underline{N}(d_i)}\}$  (12)

Definition 8.

In the neighborhood approximation space $(U,N)$, let $U/d=\{d_1,d_2,d_3,\dots,d_k\}$, and let $\lambda_{\overline{N}(d_i)}$ and $\lambda_{\underline{N}(d_i)}$ be the eigenvectors corresponding to the upper and lower approximations of $d_i$. Then $\lambda_{\overline{N}(d)}$ and $\lambda_{\underline{N}(d)}$ of $d$ are defined as follows, respectively:

$\lambda_{\overline{N}(d)}=\bigvee_{i=1}^{k}\lambda_{\overline{N}(d_i)}$  (13)
$\lambda_{\underline{N}(d)}=\bigvee_{i=1}^{k}\lambda_{\underline{N}(d_i)}$  (14)

Suppose that the eigenvectors are

$\lambda_X=(a_1,a_2,\dots,a_n)^T$, $\lambda_Y=(b_1,b_2,\dots,b_n)^T$;

then

$\lambda_X\vee\lambda_Y=(\max(a_1,b_1),\max(a_2,b_2),\dots,\max(a_n,b_n))^T$

So far, we have accomplished the matrix description of the neighborhood rough set.

Example 2 (Continuing Example 1).

Let $NDS=(U,A,d)$ be the same neighborhood decision system as in Example 1. We use the Boolean matrix method to calculate the upper and lower approximation sets of the neighborhood rough set. The Boolean matrix of the neighborhood and the eigenvectors of the decision classes are

$M_B=\begin{pmatrix}1&0&0&0&0\\0&1&1&0&0\\0&1&1&0&0\\0&0&0&1&0\\0&0&0&0&1\end{pmatrix}$,
$\lambda_{d_1}=(1,1,0,0,0)^T$,
$\lambda_{d_2}=(0,0,1,1,1)^T$

The relation matrix $R_B$ is

$R_B=M_B\odot M_B^{T}=\begin{pmatrix}1&0&0&0&0\\0&1&1&0&0\\0&1&1&0&0\\0&0&0&1&0\\0&0&0&0&1\end{pmatrix}$

The eigenvectors corresponding to the upper and lower approximation sets of $d_1$ are

$\lambda_{\overline{N}(d_1)}=R_B\odot\lambda_{d_1}=(1,1,1,0,0)^T$,
$\lambda_{\underline{N}(d_1)}=R_B\boxdot\lambda_{d_1}=(1,0,0,0,0)^T$;

so the upper and lower approximation sets of $d_1$ are $\overline{N}(d_1)=\{x_1,x_2,x_3\}$ and $\underline{N}(d_1)=\{x_1\}$.

Similarly,

$\lambda_{\overline{N}(d_2)}=(0,1,1,1,1)^T$,
$\lambda_{\underline{N}(d_2)}=(0,0,0,1,1)^T$;

so $\overline{N}(d_2)=\{x_2,x_3,x_4,x_5\}$ and $\underline{N}(d_2)=\{x_4,x_5\}$.

By the matrix method, we obtain the same result as in Example 1.
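The matrix route of Example 2 can be checked with the short sketch below (our code, not the paper's; the lambda helpers inline Eqs. (5) and (6)). It builds $R_B$ by Eq. (8) and applies Eqs. (9) and (10):

```python
import numpy as np

product = lambda A, B: (A[:, :, None] & B[None, :, :]).any(axis=1).astype(int)        # Eq. (5)
residual = lambda A, B: ((1 - A[:, :, None]) | B[None, :, :]).all(axis=1).astype(int)  # Eq. (6)

M_B = np.array([[1, 0, 0, 0, 0],
                [0, 1, 1, 0, 0],
                [0, 1, 1, 0, 0],
                [0, 0, 0, 1, 0],
                [0, 0, 0, 0, 1]])
lams = {"d1": np.array([[1, 1, 0, 0, 0]]).T, "d2": np.array([[0, 0, 1, 1, 1]]).T}

R_B = product(M_B, M_B.T)                  # Eq. (8)
for name, lam in lams.items():
    upper = product(R_B, lam).ravel()      # Eq. (9)
    lower = residual(R_B, lam).ravel()     # Eq. (10)
    print(name, "upper:", upper, "lower:", lower)
# d1 upper: [1 1 1 0 0]  lower: [1 0 0 0 0]
# d2 upper: [0 1 1 1 1]  lower: [0 0 0 1 1]
```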

3.2. Reduction of Neighborhood Rough Set

The dependency degree and the importance degree are widely used in attribute reduction. For the neighborhood rough set, we specify these two indexes based on the approximation sets.

Definition 9.

In the neighborhood approximation space $(U,N)$, let $A$ be a condition-attribute set on $U$ and $U/d=\{d_1,d_2,d_3,\dots,d_k\}$. For $B\subseteq A$, the dependency degree $\gamma_B(d)$ of $d$ with respect to $B$ is defined as

$\gamma_B(d)=\dfrac{S(\lambda_{\underline{N}(d)})}{|U|}$  (15)

where $S(\lambda_{\underline{N}(d)})$ is the sum of all elements of the eigenvector $\lambda_{\underline{N}(d)}$, i.e., the number of elements in the lower approximation of $d$.

Unless otherwise stated, the relative dependency degree is the dependency degree of the decision attribute $d$ with respect to the whole attribute set $A$. Obviously, $\gamma_B(d)\in[0,1]$: the greater the value, the more elements of $U$ are certainly classified, and the stronger the classification ability of the attribute subset $B$.

Definition 10.

[34,35] In the neighborhood decision system $NDS=(U,A,d)$, let $B\subseteq A$ and $a\in A$. The neighborhood importance degree $\sigma_B(a)$ of $a$ relative to $B$ is

  1. If $a\in B$,

    $\sigma_B(a)=\gamma_B(d)-\gamma_{B-\{a\}}(d)$  (16)

  2. If $a\in A-B$,

    $\sigma_B(a)=\gamma_{B\cup\{a\}}(d)-\gamma_B(d)$  (17)

Definition 11.

In the neighborhood decision system $NDS=(U,A,d)$, let $C\subseteq A$. If $\gamma_C(d)=\gamma_A(d)$ and $\sigma_C(a)>0$ for every $a\in C$, then $C$ is called an attribute reduction of $NDS$.

Example 3 (Continuing Example 2).

Let $NDS=(U,A,d)$ be the same neighborhood decision system as in Example 1, with the upper and lower approximations of the neighborhood rough set $\overline{N}(d_1)=\{x_1,x_2,x_3\}$, $\underline{N}(d_1)=\{x_1\}$, $\overline{N}(d_2)=\{x_2,x_3,x_4,x_5\}$, and $\underline{N}(d_2)=\{x_4,x_5\}$.

The eigenvectors corresponding to the upper and lower approximations of $d$ are as follows:

$\lambda_{\overline{N}(d)}=\bigvee_{i=1}^{2}\lambda_{\overline{N}(d_i)}=(1,1,1,1,1)^T$,
$\lambda_{\underline{N}(d)}=\bigvee_{i=1}^{2}\lambda_{\underline{N}(d_i)}=(1,0,0,1,1)^T$.

Thus, the dependency degree of $d$ with respect to $A$ is

$\gamma_A(d)=\dfrac{S(\lambda_{\underline{N}(d)})}{|U|}=\dfrac{3}{5}$.

Similarly, the dependency degrees for the other condition-attribute subsets are shown in Table 2 (the computation is sketched after the table). Because $\sigma_A(a_4)=0$, the attribute $a_4$ is redundant: removing it does not change the lower approximation set of the data set.

$B$  $\lambda_{\overline{N}(d)}$  $\lambda_{\underline{N}(d)}$  $\overline{N}(d)$  $\underline{N}(d)$  $\gamma_B(d)$  $\sigma_A(a_i)$
$A$  $(1,1,1,1,1)^T$  $(1,0,0,1,1)^T$  $\{x_1,x_2,x_3,x_4,x_5\}$  $\{x_1,x_4,x_5\}$  $3/5$  —
$A-\{a_1\}$  $(1,1,1,1,1)^T$  $(1,0,0,0,1)^T$  $\{x_1,x_2,x_3,x_4,x_5\}$  $\{x_1,x_5\}$  $2/5$  $1/5$
$A-\{a_2\}$  $(1,1,1,1,1)^T$  $(0,0,0,0,1)^T$  $\{x_1,x_2,x_3,x_4,x_5\}$  $\{x_5\}$  $1/5$  $2/5$
$A-\{a_3\}$  $(1,1,1,1,1)^T$  $(0,0,0,1,0)^T$  $\{x_1,x_2,x_3,x_4,x_5\}$  $\{x_4\}$  $1/5$  $2/5$
$A-\{a_4\}$  $(1,1,1,1,1)^T$  $(1,0,0,1,1)^T$  $\{x_1,x_2,x_3,x_4,x_5\}$  $\{x_1,x_4,x_5\}$  $3/5$  $0$
Table 2

The result of the subset of condition-attribute sets.
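The values of Table 2 can be reproduced with the following sketch (our illustration; the helper name `gamma` is ours), which evaluates Eq. (15) for $A$ and for each subset $A-\{a_i\}$, and the importance of Eq. (16):

```python
import numpy as np

U = np.array([[0.9, 0.6, 1.4, 1.3],
              [0.5, 1.0, 1.5, 1.1],
              [0.6, 1.3, 1.4, 1.0],
              [1.0, 1.2, 1.7, 1.5],
              [1.1, 0.9, 2.5, 1.4]])
d = np.array([0, 0, 1, 1, 1])
eps = 0.5

def gamma(cols):
    """Dependency degree gamma_B(d) of Eq. (15) for the attribute subset `cols`."""
    X = U[:, cols]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    R = dist <= eps                                  # Boolean relation matrix R_B
    lam_lower = np.zeros(len(U), dtype=int)
    for label in np.unique(d):
        lam = d == label
        # x_i enters the lower approximation iff eps(x_i) lies inside the class
        lam_lower |= np.array([not np.any(R[i] & ~lam) for i in range(len(U))])
    return lam_lower.sum() / len(U)                  # S(lambda) / |U|

full = [0, 1, 2, 3]
print("gamma_A(d) =", gamma(full))                   # 0.6 = 3/5
for a in full:
    rest = [c for c in full if c != a]
    print(f"sigma_A(a{a + 1}) =", gamma(full) - gamma(rest))   # Eq. (16)
# sigma_A(a4) = 0.0, so a4 is redundant
```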

3.3. Attribute Reduction Algorithm Based on Boolean Matrix

Based on the previous propositions and definitions, we design two algorithms Dependency Degree based on Boolean Matrix (DBM) and Attribute Reduction based on Neighborhood Importance Degree (ARNI).

Algorithm DBM calculates the dependency degree of the decision attribute with respect to a subset of condition attributes in the data set. Suppose that there are $n$ samples, $m$ condition attributes, and $k$ classes in the data set. In line 2 of DBM, we initialize an $n\times n$ matrix; the corresponding time complexity is $O(n^2)$. According to Definition 5, the complexity of lines 3-9 is $O(n^2)$. In line 10 of DBM, we initialize a vector of size $n\times 1$, with time complexity $O(n)$. In lines 11-15, we compute $\gamma_A(d)$, the dependency degree of $d$ with respect to $A$, based on Definitions 7-9; the time complexity of these lines is $O(n^2k)$. Thus, the overall time complexity of Algorithm DBM is $O(n^2k)$. Algorithm DBM is an essential part of Algorithm ARNI.

ARNI calls DBM at line 4 to calculate the neighborhood importance degree of an attribute $a$ relative to $A$; its time complexity is $O(n^2k)$. The steps in lines 9-20 execute at most $0.5m(m-1)$ such calls. Hence, the time complexity of Algorithm ARNI is $O(n^2m^2k)$. A sketch of one possible implementation follows.
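The pseudocode listings of DBM and ARNI are not reproduced in this version of the text, so the Python sketch below is only one plausible reading of their description: DBM evaluates the dependency degree of Eq. (15) through the Boolean-matrix route, and ARNI greedily adds the attribute with the largest importance degree (Eq. (17)) until the dependency degree of the subset reaches $\gamma_A(d)$. All function and variable names are ours.

```python
import numpy as np

def dbm(U, d, cols, eps):
    """Dependency degree gamma_B(d) via the Boolean matrix route
    (Eqs. (8), (10), (14), (15)); a sketch of Algorithm DBM."""
    X = U[:, cols]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    R = (dist <= eps).astype(int)                    # relation matrix R_B
    lam_lower = np.zeros(len(U), dtype=int)
    for label in np.unique(d):
        lam = (d == label).astype(int)[:, None]      # eigenvector of the class
        low = ((1 - R[:, :, None]) | lam[None, :, :]).all(axis=1)   # Eq. (10)
        lam_lower |= low.ravel().astype(int)         # Eq. (14)
    return lam_lower.sum() / len(U)                  # Eq. (15)

def arni(U, d, eps):
    """Greedy forward selection driven by Eq. (17); one reading of ARNI."""
    m = U.shape[1]
    target = dbm(U, d, list(range(m)), eps)          # gamma_A(d)
    red, remaining, current = [], list(range(m)), 0.0
    while remaining and current < target:
        best = max(remaining, key=lambda a: dbm(U, d, red + [a], eps))
        score = dbm(U, d, red + [best], eps)
        if score <= current:                         # no candidate has sigma > 0
            break
        red.append(best)
        remaining.remove(best)
        current = score
    return red
```

Each iteration of the while loop calls DBM once per remaining attribute, which matches the $0.5m(m-1)$ bound on the number of DBM calls and the overall $O(n^2m^2k)$ complexity stated above.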

4. EXPERIMENTS

In the experiments, Algorithm DBM calculates the relative dependency degree of every condition attribute, and Algorithm ARNI removes the redundant attributes from the continuous information systems. To verify the feasibility and effectiveness of ARNI, we design a series of comparative experiments in two parts. The first part tests the reduction ability of Algorithm ARNI; the second tests the acceleration performance of GPU on the matrix-based model.

4.1. Experiment Environment and Objects

The experimental data in this paper come from the UCI machine learning repository [36] and are listed in Table 3. All values in those data sets are normalized before use.

Data Sets Logogram Samples Attributes Classes
biodeg BI 1055 41 2
debrecen DR 1151 19 2
ForestTypes FT 326 27 4
glass GL 214 10 6
ionosphere IP 351 33 2
iris IR 150 4 3
movement_libras ML 360 90 15
sonar SN 208 60 2
wdbc WD 569 30 2
wine WI 178 13 3
Table 3

Description of data sets.

Algorithms DBM and ARNI are programmed in Python 3.6.7. The experiments run on a computer with Windows 7 Ultimate, an Intel i5-5200U CPU at 2.2 GHz, 4 GB of memory, and an NVIDIA 820M GPU.

4.2. Comparative Experiment between ARNI and Other Reduction Algorithms

To demonstrate the effectiveness of ARNI, we compare our algorithm with NRS, CRS, and PCA in terms of the classification performance of K-Nearest Neighbor with K = 3 (KNN) and Support Vector Machine (SVM).

NRS is another classic reduction algorithm in neighborhood rough set theory [12,13]. Table 4 shows the number of features retained after reduction by ARNI and NRS for different radii ε of the hypersphere neighborhood. In general, the reduction performance of ARNI and NRS differs. As shown in Table 4, in 19 of 30 experiments the number of attributes retained by ARNI is no less than that of NRS. The classification accuracy of KNN (SVM) after ARNI reduction is superior to that after NRS reduction in 25 (24) of 30 experiments. Specifically, when ε = 0.01, although ARNI usually retains more attributes than NRS, its classification accuracy is generally higher. When ε = 0.05, the advantage of ARNI in the size of the reduced attribute subset begins to appear, and its classification accuracy remains generally higher than that of NRS. When ε = 0.10, the classification accuracy of ARNI is almost the same as or better than that of NRS, while ARNI retains fewer attributes.

ε  Data Sets  NRS: Reduction  KNN  SVM  ARNI: Reduction  KNN  SVM
0.01 BI 9 0.8216 0.7128 9 0.8216 0.7128
DR 7 0.6221 0.5951 8 0.6413 0.6245
FT 4 0.8108 0.7293 6 0.8234 0.7183
GL 4 0.6889 0.5653 4 0.6889 0.5653
IP 5 0.9007 0.8638 5 0.9007 0.8638
IR 3 0.9533 0.9667 3 0.9533 0.9667
ML 7 0.5956 0.3980 12 0.6811 0.4235
SN 4 0.5889 0.5374 7 0.6218 0.6134
WD 4 0.8999 0.9177 6 0.9189 0.9255
WI 4 0.8144 0.8147 6 0.9513 0.9666
0.05 BI 14 0.8284 0.6948 13 0.8328 0.7493
DR 13 0.6230 0.6168 11 0.6334 0.6213
FT 7 0.8393 0.7414 9 0.8306 0.7298
GL 9 0.6555 0.4696 8 0.6641 0.4958
IP 6 0.9091 0.8867 6 0.9091 0.8867
IR 3 0.9533 0.9667 3 0.9533 0.9667
ML 31 0.7011 0.4111 22 0.6989 0.4564
SN 7 0.5794 0.5753 11 0.6375 0.6384
WD 9 0.9316 0.9230 10 0.9489 0.9216
WI 5 0.8262 0.8141 4 0.8783 0.8818
0.1 BI 23 0.8159 0.7658 20 0.8200 0.7625
DR 19 0.6229 0.6142 17 0.6213 0.6169
FT 17 0.8429 0.7379 19 0.8436 0.7379
GL 9 0.6555 0.4696 7 0.6479 0.4563
IP 8 0.8952 0.8780 8 0.8952 0.8780
IR 4 0.9533 0.9667 3 0.9533 0.9667
ML 23 0.7000 0.4089 19 0.7167 0.4360
SN 9 0.6493 0.6492 11 0.6375 0.6384
WD 11 0.9403 0.9213 11 0.9403 0.9213
WI 6 0.8655 0.8655 4 0.8783 0.8818

ARNI, Attribute Reduction based on Neighborhood Importance Degree; KNN, K-nearest neighbor; SVM, support vector machine.

Table 4

Reduction results of two different algorithms.

Next, we provide a comprehensive comparison of ARNI, NRS, and reduction algorithms that are not based on neighborhood rough set theory: the reduction algorithm of the classic rough set (CRS) [1] and principal component analysis (PCA), compared in terms of the number of attributes and the classification accuracy after reduction. For ARNI and NRS, we choose the moderate-performance parameter ε = 0.01 as the neighborhood radius. For CRS, all continuous values in the decision systems are clustered into two categories by KMeans (K = 2), and the reduction algorithm of CRS is the one based on the discernibility matrix; because it yields many reducts, we randomly choose 3 of them and report their average as the final result. For PCA, to simplify the experimental analysis, the final number of attributes is set to be the same as for ARNI.

The results of the comparative experiment are shown in Table 5. The classifiers perform better on the ARNI-reduced decision systems than on those produced by the other algorithms. That is to say, attribute reduction based on ARNI provides better support for KNN and SVM than the others.

Data Sets  RAW: Reduction KNN SVM  CRS: Reduction KNN SVM  NRS: Reduction KNN SVM  PCA: Reduction KNN SVM  ARNI: Reduction KNN SVM
BI 41 0.8368 0.7725 33 0.8368 0.7772 9 0.8216 0.7128 9 0.8226 0.7008 9 0.8216 0.7128
DR 19 0.6229 0.6142 17 0.6203 0.6142 7 0.6221 0.5951 8 0.6117 0.6156 8 0.6413 0.6245
FT 27 0.8168 0.7042 18 0.7908 0.6374 4 0.8108 0.7293 6 0.8025 0.7178 6 0.8234 0.7183
GL 10 0.6555 0.4696 8 0.6379 0.4973 4 0.6889 0.5653 4 0.6690 0.5277 4 0.6889 0.5653
IP 33 0.8553 0.8523 27 0.8437 0.8664 5 0.9007 0.8638 5 0.8693 0.8809 5 0.9007 0.8638
IR 4 0.9533 0.9667 4 0.9533 0.9667 3 0.9533 0.9667 3 0.9533 0.9600 3 0.9533 0.9667
ML 90 0.7956 0.5389 17 0.7389 0.4678 7 0.5956 0.3980 12 0.6900 0.5889 12 0.6811 0.4235
SN 60 0.6432 0.6388 9 0.6498 0.7024 4 0.5889 0.5374 7 0.6644 0.6726 7 0.6722 0.6134
WD 30 0.9702 0.9527 23 0.9648 0.9474 4 0.8999 0.9177 6 0.9499 0.9422 6 0.9189 0.9255
WI 13 0.9555 0.9781 13 0.9555 0.9781 4 0.8144 0.8147 6 0.9513 0.9666 6 0.9513 0.9667

ARNI, Attribute Reduction based on Neighborhood Importance Degree; KNN, K-nearest neighbor; SVM, support vector machine; CRS, classic rough set; PCA, principal component analysis.

Table 5

Reduction results of different algorithms.

4.3. Experiment about the Effect of ε

The granularity ε is a hyperparameter of ARNI. In this experiment, we test its effect on attribute reduction and classification accuracy, showing the relationship between ε, the number of attributes after reduction, and the classification accuracy.

The range of ε is from 0.02 to 1, with a step size of 0.02. The classifier is KNN (K = 3). The final accuracy is the average of 10-fold cross-validation. The detailed results are shown in Figure 2, and a sketch of the evaluation protocol follows the figure.

Figure 2

Numbers of selected features and classification accuracy with the different granularity ε.
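The evaluation protocol can be sketched as follows (our code; the paper does not name its KNN implementation, so scikit-learn is our assumed tool):

```python
# Mean accuracy of KNN (K = 3) over 10-fold cross-validation on a reduced
# data set; `selected` is the list of attribute columns kept by a reduction.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def evaluate(X: np.ndarray, y: np.ndarray, selected: list) -> float:
    knn = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(knn, X[:, selected], y, cv=10).mean()
```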

4.4. Experiment about GPU Acceleration

A GPU consists of a set of streaming multiprocessors (SMs), each of which contains a set of processor cores called streaming processors (SPs). Its many hardware threads are therefore well suited to large numbers of repeated or parallel operations, and GPU acceleration can effectively improve computational efficiency.

Compute Unified Device Architecture (CUDA), introduced by NVIDIA, is a heterogeneous parallel computing model that involves both CPU and GPU. In this paper, we use CUDA V7.0 [37] and the PyTorch framework [38] to recode ARNI. The program runs on the CPU versus the GPU. The running times are shown in Figure 3, and the speedup ratio ρ of the GPU on different data sets, defined as

$\rho=\dfrac{\text{CPU running time}}{\text{GPU running time}}$,

is shown in Figure 4 (a sketch of the timing methodology follows Figure 4).

Figure 3

Running time on using only central processing unit (CPU) and using graphics processing unit (GPU) acceleration with different data sets.

Figure 4

Speedup ratio of graphics processing unit (GPU) with different datasets.
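As an illustration of the timing methodology (our sketch, not the authors' benchmark; the matrix size, sparsity, and iteration count are our choices, and the API is that of a current PyTorch rather than the version 0.4 used in the paper), the same Boolean kernel of Eq. (9) can be timed on CPU and GPU and the ratio reported as ρ:

```python
import time
import torch

def upper_eigenvector(R, lam):
    # Eq. (9) on 0/1 tensors: OR_k (r_ik AND a_k) equals max_k r_ik * a_k
    return (R * lam.T).max(dim=1, keepdim=True)[0]

def bench(device, n=4096, iters=20):
    R = (torch.rand(n, n, device=device) < 0.1).float()    # random relation matrix
    lam = (torch.rand(n, 1, device=device) < 0.5).float()  # random eigenvector
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        upper_eigenvector(R, lam)
    if device == "cuda":
        torch.cuda.synchronize()                           # wait for GPU kernels
    return (time.perf_counter() - start) / iters

cpu_time = bench("cpu")
if torch.cuda.is_available():
    gpu_time = bench("cuda")
    print(f"speedup rho = {cpu_time / gpu_time:.1f}x")
```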

5. CONCLUSION

How to reduce attributes effectively in information systems is a core challenge in rough set theory. The matrix description of rough sets and of the reduction algorithm can be executed more efficiently than other forms on a computer, especially on a GPU. In this paper, we propose an attribute reduction algorithm for the neighborhood rough set based on the Boolean matrix. Specifically, we define the dependency degree of the decision attribute and, based on the matrix description of the neighborhood rough set, give a method for computing the relative dependency degree by matrix operations (Algorithm DBM). Furthermore, we realize attribute reduction by matrix computing (Algorithm ARNI). The results on UCI data sets show that, compared with existing attribute reduction algorithms, ARNI can remove redundant attributes while less affecting the classification accuracy. Besides, we also show the acceleration performance of GPU on the matrix-based model.

CONFLICT OF INTEREST

The authors declare that there is no conflict of interest.

AUTHORS’ CONTRIBUTIONS

Yan Gao conceived and designed the study. Changwei Lv performed the experiments and wrote the paper. Zhengjiang Wu reviewed and edited the manuscript. All authors read and approved the manuscript.

ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (No. 11601129, 61972134).

REFERENCES

19.T. Lin, K. Huang, and Q. Liu, Rough sets, neighborhood systems and approximation, in Proceedings of the Fifth International Symposium on Methodologies of Intelligent Systems, Vol. 22, 1990, pp. 130-141.
36.D. Dua and C. Graff, UCI Machine Learning Repository, 2019. https://archive.ics.uci.edu/ml/datasets.php
37.Nvidia, CUDA Toolkit Documentation v8.0, 2017. https://docs.nvidia.com/cuda/archive/8.0
38.PyTorch, PyTorch 0.4 Chinese document, 2019. https://pytorch.apachecn.org/docs/0.4
