International Journal of Computational Intelligence Systems

Volume 13, Issue 1, 2020, Pages 1014 - 1026

Computation of Support and Confidence for Interval-Valued Fuzzy Association Rules

Authors
Michal Burda*, ORCID, Viktor Pavliska, Petra Murinová
Institute for Research and Applications of Fuzzy Modeling, CE IT4Innovations, Division University of Ostrava, 30. Dubna 22, 701 03 Ostrava, Czech Republic
*Corresponding author. Email: michal.burda@osu.cz
Received 3 February 2020, Accepted 13 July 2020, Available Online 29 July 2020.
DOI
10.2991/ijcis.d.200715.001
Keywords
Interval-valued data; Fuzzy association rules; Support; Confidence; Algorithm
Abstract

The aim of this paper is to provide an algorithm for the computation of the support and confidence of association rules on interval-valued fuzzy sets. Each element of an interval-valued fuzzy set has a membership degree defined as an interval. Such membership intervals may be interpreted as partial knowledge, where the precise membership value is not known. The computations of the support and the confidence are discussed with respect to the three most common triangular norms (minimum, product and Łukasiewicz), which act as the conjunction in the definitions of support and confidence.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The main objective of this paper is to continue the development of generalized association rules with missing values, which are a tool for the exploratory analysis of large data sets. In [1], we introduced a generalized algorithm for the association analysis of natural data with undefined values.

Association rules were first introduced by Hájek et al. in the late 1960s [2] through their formulation of the General Unary Hypotheses Automaton (GUHA) method [3] for processing binary or categorical data. Agrawal [4] reinvented this approach independently in 1993. An analysis of natural data based on linguistic summaries was developed by Yager in 1982 [5] and later by Kacprzyk [6] and, in many aspects, is similar to association analysis. An extension of association rules for fuzzy data was also developed in 2009 by Ralbovský [7]. An interpretation of data using expressions of natural language, namely using the generalized intermediate quantifiers “Almost all, Most, Many,” was proposed in [8].

In real-world applications, it is very common for the data being analyzed to contain missing values. There may be many reasons for data unavailability; some of them are discussed in [9]. If an analytical method cannot work with missing data, records with missing values are often removed. Another common practice is to impute some reasonable values into the empty cells, provided the missingness does not depend on the unobserved or missing value itself [10]. Imputation is performed by substituting the missing values with averages or by applying a more sophisticated approach, such as machine learning techniques [11].

On the other hand, some analytical methods allow for the direct processing of missing values. For instance, an extension of the GUHA method allowing the search for association rules on binary and categorical data with missing values was introduced in [3]. The popular C4.5 algorithm [12] for decision trees can also manage missing values via distribution-based imputation.

The problem of missing values has also been studied from the point of view of mathematical logic. The fundamental grounds were laid by Kleene, Bochvar, Sobociński and others by establishing three-valued logics, which were also studied by Łukasiewicz [13] in 1920. An overview can be found in [14]. Generalized approaches were introduced in [15], [16] and [17]. A three-valued logic modeling borderline or unknown values is presented in [18]. A deeper interpretation of the third truth value (possible, undefined, half-true, inconsistent) can be found in [19].

The missingness of a value is a special case of partial knowledge. Other values may be measured imprecisely, which can be described by an interval of possible values. From that perspective, a completely missing value can be characterized with an interval that is identical to the whole domain of the attribute.

Recently, computations with intervals have become popular. For instance, Grzegorzewski [20] introduces measures of dispersion for interval data, Billard and Diday [21] focus on regression analysis for interval-valued data and Lima Neto and De Carvalho [22] develop a nonlinear interval-valued regression.

The aim of this paper is to generalize the concept of association rules for interval-valued fuzzy sets as introduced in [23]. The fuzzy set generalizes a classical mathematical set by allowing a partial membership of an element. Each element of the universe is assigned a degree of membership to a given fuzzy set. Interval-valued fuzzy sets further generalize this concept by using intervals of membership degrees. These intervals may be interpreted as partial information where the exact membership degree is not known. Interval-valued fuzzy sets were studied in [24] and [25].

This paper enables the search for association rules on fuzzy data that have missing information. As the fuzziness captures the degree of an object possessing some property, the lack of information in data or imprecision in membership degree assignment may be modeled using the intervals of these membership degrees.

Concretely, this paper focuses on an algorithm for the fast computation of the most common quality measures of an association rule: the support and confidence. The support measures the number of records in the database that satisfy both the antecedent A and the consequent C of the association rule A ⇒ C. The confidence is a measure defined as a generalization of conditional probability: it is the fraction of records satisfying the consequent C within the set of records that fulfill the antecedent A. In fuzzy association analysis, the “and” connective is modeled with a triangular norm (t-norm). The presented algorithms are designed for the three most commonly used t-norms: Gödel (minimum), Goguen (product) and Łukasiewicz.

Example 1.

Consider a dataset with concentrations of potassium and water pH measured in order to assess their influence on the growth of water plants. U is the set of experiments conducted. The initial three numeric columns (pH, potassium, growth rate) can be transformed into several fuzzy sets such as “acidic pH,” “neutral pH,” “basic pH,” “low potassium,” “medium potassium,” “high potassium,” etc. From that, an association rule searching algorithm may find the rule “acidic pH” ∧ “medium potassium” ⇒ “high growth rate”.

If the original numeric values were all measured precisely, then the membership degrees to the fuzzy sets would also be precise and we would obtain a precise confidence of that rule. On the other hand, if some measurements were missing or if the measuring tool was imprecise, we might be given intervals of possible values instead of precise numbers, which may lead to intervals of possible membership degrees and thus to an imprecise confidence.

The rest of the paper is organized as follows: Section 2 recalls basic definitions from interval-valued fuzzy set theory and introduces the reader to (noninterval) fuzzy association rules, Section 3 generalizes association rules for interval-valued fuzzy sets, and the final section concludes the paper.

2. PRELIMINARIES

The main goal of this section is to introduce a mathematical background that will be used for fuzzy association rules.

2.1. Interval-Valued Fuzzy Sets

A fuzzy set [26] is defined as a mapping from a universe of discourse U to the real interval [0,1], i.e., F: U → [0,1]. Unlike crisp sets, where an object either fully belongs or does not belong to a set, fuzzy sets enable an object u ∈ U to belong partially to the set F in a degree F(u). A fuzzy set X is a subset of a fuzzy set Y, X ⊆ Y, if X(u) ≤ Y(u) for all u ∈ U. The size of a fuzzy set X is |X| = Σ_{u∈U} X(u).

Triangular norms (t-norms) are binary operations ⊗: [0,1]×[0,1] → [0,1], which fulfill the following axioms for each x, y, z ∈ [0,1]:

  • T1 (commutativity): x ⊗ y = y ⊗ x;

  • T2 (associativity): x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z;

  • T3 (monotonicity): x ⊗ y ≤ x ⊗ z whenever y ≤ z;

  • T4 (boundary condition): 1 ⊗ x = x.

T-norms were studied in depth by Klement et al. [27] and later elaborated by many others. The most common examples of t-norms are

  1. the Gödel (minimum) t-norm: x ⊗ y = min{x, y},

  2. the Goguen (product) t-norm: x ⊗ y = x·y,

  3. the Łukasiewicz t-norm: x ⊗ y = max{0, x + y − 1}.
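These three t-norms can be sketched as plain functions on [0,1] (a minimal illustration with names of our choosing, not code from the paper):

```python
def godel(x, y):
    """Gödel (minimum) t-norm."""
    return min(x, y)

def goguen(x, y):
    """Goguen (product) t-norm."""
    return x * y

def lukasiewicz(x, y):
    """Łukasiewicz t-norm."""
    return max(0.0, x + y - 1.0)

# All three satisfy the boundary condition T4: 1 ⊗ x = x.
```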

Convention. In the sequel, we will work with closed intervals [x1,x2] ⊆ [0,1]. By 𝕀 we will denote the set of all closed intervals on [0,1], i.e., 𝕀 = {[x1,x2] | [x1,x2] ⊆ [0,1]}. For X = [x1,x2] ∈ 𝕀, the lower and the upper bound will be denoted by X̲ and X̄, respectively, i.e., X̲ = x1 and X̄ = x2.

An interval-valued fuzzy set F on the universe U is a mapping F: U → 𝕀 assigning an interval from 𝕀 to any element from the universe U.

First of all, we start with a definition of the ordering on 𝕀. Let X, Y ∈ 𝕀; then X ≤ Y iff X ∨ Y = Y (equivalently, X ∧ Y = X), where

X ∨ Y = [max{X̲, Y̲}, max{X̄, Ȳ}],   X ∧ Y = [min{X̲, Y̲}, min{X̄, Ȳ}].

We continue with definitions of basic operations on 𝕀. We generalize this approach to interval fuzzy logic in the definitions below, which were introduced in [24,25].

A t-norm on 𝕀 is a binary operation ⊗: 𝕀×𝕀 → 𝕀, which satisfies the following axioms for each X, Y, Z ∈ 𝕀:

(T1) commutativity: X ⊗ Y = Y ⊗ X;

(T2) associativity: X ⊗ (Y ⊗ Z) = (X ⊗ Y) ⊗ Z;

(T3) monotonicity: X ⊗ Y ≤ X ⊗ Z whenever Y ≤ Z;

(T4) boundary condition: [1,1] ⊗ X = X.

Theorem 1.

(Representability [28]) Let ⊗: [0,1]×[0,1] → [0,1] be a t-norm. For X, Y ∈ 𝕀, we put

X ⊗ Y = [X̲ ⊗ Y̲, X̄ ⊗ Ȳ].

Then ⊗ is a t-norm on 𝕀.

Proof.

Let ⊗ be a t-norm on [0,1]; hence ⊗ is commutative. Then obviously

X ⊗ Y = [X̲ ⊗ Y̲, X̄ ⊗ Ȳ] = [Y̲ ⊗ X̲, Ȳ ⊗ X̄] = Y ⊗ X,

so we see that ⊗ on 𝕀 is commutative, too. We can prove the associativity (T2) and the boundary condition (T4) in a similar way.

To prove the monotonicity (T3), let us consider arbitrary Y, Z ∈ 𝕀 such that Y ≤ Z. By definition, Y ∨ Z = Z, which means that max{Y̲, Z̲} = Z̲ and max{Ȳ, Z̄} = Z̄, i.e., Y̲ ≤ Z̲ and Ȳ ≤ Z̄. Since ⊗ is monotone, we can see that for any X ∈ 𝕀, X̲ ⊗ Y̲ ≤ X̲ ⊗ Z̲ and X̄ ⊗ Ȳ ≤ X̄ ⊗ Z̄, which is equivalent to (X ⊗ Y) ∨ (X ⊗ Z) = X ⊗ Z, which means that X ⊗ Y ≤ X ⊗ Z, from which we can see the monotonicity of ⊗ on 𝕀.
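The representability theorem lets us lift any t-norm from [0,1] to 𝕀 simply by applying it to the interval endpoints. A small sketch (function names are ours; intervals are modeled as (lower, upper) pairs):

```python
def iv_tnorm(tnorm, x, y):
    """Lift a t-norm on [0,1] to intervals per the representability theorem:
    [x1,x2] ⊗ [y1,y2] = [x1 ⊗ y1, x2 ⊗ y2]."""
    return (tnorm(x[0], y[0]), tnorm(x[1], y[1]))

# For instance, [0.5, 0.9] ⊗ [0.8, 1.0] under the product t-norm
# is the interval [0.4, 0.9].
lo, hi = iv_tnorm(lambda a, b: a * b, (0.5, 0.9), (0.8, 1.0))
```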

Convention. Whenever we need to perform an operation between an element a ∈ [0,1] and an interval X ∈ 𝕀 (e.g., a ⊗ X), we promote the element a to the interval [a,a], so that the operation defined for intervals can be used.

2.2. (Noninterval) Fuzzy Association Rules

Let O = {o1, o2, …, on}, n > 0, be a finite set of abstract elements called objects and 𝔸 = {a1, a2, …, am}, m > 0, be a finite set of attributes. Within the association rules framework, a dataset D is a mapping that assigns to each object o ∈ O and attribute a ∈ 𝔸 a truth degree D(a,o) ∈ [0,1], which represents the intensity of assignment of the attribute a to the object o. We also define

D(X,o) = ⊗_{a∈X} D(a,o)   (1)

for each subset X ⊆ 𝔸 and a fixed t-norm ⊗.

An association rule is a formula A ⇒ C, where A ⊆ 𝔸 is an antecedent and C ⊆ 𝔸 is a consequent. It is natural to assume A ∩ C = ∅ and also |C| = 1.

As each combination of attributes in the antecedent and consequent forms a well-formed association rule, an important problem is to identify rules that are relevant to the given dataset D. There exists a large number of measures of rule relevance; an overview can be found in [29]. Perhaps the most essential association rule quality measures are the support and confidence, which may be defined as follows:

Let A ⇒ C be an association rule and D be a dataset. Then

supp(A ⇒ C) = Σ_{o∈O} D(A,o) ⊗ D(C,o),

conf(A ⇒ C) = Σ_{o∈O} D(A,o) ⊗ D(C,o) / Σ_{o∈O} D(A,o).   (2)

Example 2.

Consider the dataset D in Table 1. Here O = {o1, …, o5} are the table rows and 𝔸 = {a1, a2, a3} are the table columns. The numeric values in the table represent the membership degrees D(a,o), for a ∈ 𝔸 and o ∈ O. Let A = {a1, a2} and let ⊗ be the Gödel (minimum) t-norm. Then D(A,o1) = min{0.9, 1.0} = 0.9, D(A,o2) = 0, D(A,o3) = 0.8, etc. Let C = {a3}. The association rule A ⇒ C, i.e., {a1, a2} ⇒ {a3}, has the following support and confidence:

supp(A ⇒ C) = 0.9 + 0 + 0.8 + 0 + 0 = 1.7,

conf(A ⇒ C) = 1.7 / (0.9 + 0 + 0.8 + 0.1 + 0) ≈ 0.94.

a1 a2 a3
o1 0.9 1.0 0.9
o2 0.7 0.0 0.1
o3 0.9 0.8 1.0
o4 0.1 0.9 0.0
o5 0.0 0.0 0.2
Table 1

An exemplary noninterval fuzzy dataset D. Objects from O are represented by table rows, attributes from 𝔸 by table columns.
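The computation in Example 2 can be replicated with a few lines of Python (a sketch with our own naming, the dataset of Table 1 hard-coded):

```python
# Table 1: membership degrees D(a,o); list positions are the rows o1..o5.
D = {
    "a1": [0.9, 0.7, 0.9, 0.1, 0.0],
    "a2": [1.0, 0.0, 0.8, 0.9, 0.0],
    "a3": [0.9, 0.1, 1.0, 0.0, 0.2],
}
n = 5

def degree(attrs, i, tnorm=min):
    """D(X, o_i): combine the attribute degrees of row i with the t-norm, as in (1)."""
    result = 1.0  # neutral element of every t-norm (boundary condition T4)
    for a in attrs:
        result = tnorm(result, D[a][i])
    return result

def supp_conf(A, C, tnorm=min):
    """Support and confidence of the rule A => C as in equation (2)."""
    num = sum(tnorm(degree(A, i, tnorm), degree(C, i, tnorm)) for i in range(n))
    den = sum(degree(A, i, tnorm) for i in range(n))
    return num, num / den

s, c = supp_conf({"a1", "a2"}, {"a3"})
```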

The mining process of association rules starts with the generation of all association rules with antecedents of length one. Support measures are computed for all of the generated rules, and the rules that exceed the user-defined minimum support threshold serve as a basis for longer rules. The process relies on the fact that the support is nonincreasing with respect to the extension of the antecedent by new attributes. The monotonicity of the support is often the basis of efficient association rule mining algorithms. Rules that satisfy the minimum support condition are then subjected to a confidence evaluation. A rule with both support and confidence above the user-defined thresholds is considered interesting and is included in the results. Details of the algorithm may be found in [30].

3. ASSOCIATION RULES ON INTERVAL-VALUED FUZZY SETS

In order to extend the association rules framework to data containing interval-valued membership degrees, one has to switch the range of membership degrees from [0,1] to 𝕀. In other words, the dataset D becomes a mapping such that D(a,o) ∈ 𝕀 for each a ∈ 𝔸 and each o ∈ O. Similarly to (1), we define

D(X,o) = D(x1,o) ⊗ D(x2,o) ⊗ … ⊗ D(xk,o),

for X = {x1, x2, …, xk} ⊆ 𝔸 and a fixed t-norm ⊗ on 𝕀.

For interval-valued membership degrees, the support and confidence of an association rule cannot be computed directly as in (2). Note that the interval membership degrees represent partial knowledge: the exact membership degree is only known to lie in the given interval. Therefore, the rule's support and confidence must also be represented as intervals. Although the evaluation of the support is quite straightforward, the computation of the lower and upper bounds of the confidence is an optimization problem that involves finding the exact values of the membership degrees such that the resulting confidence is minimal or maximal. To formalize this problem mathematically, we introduce the notion of a possible world as follows.

Definition 1.

Let D be a dataset that assigns an interval of truth values to each pair (a, o) of an attribute a ∈ 𝔸 and an object o ∈ O, i.e., D(a,o) ∈ 𝕀. A possible world of the dataset D is a mapping w_D: 𝔸×O → [0,1] such that w_D(a,o) ∈ D(a,o) for each a ∈ 𝔸, o ∈ O. The class of all such mappings w_D will be denoted by W_D.

Convention. For the sake of simplicity, we omit the subscript D in w_D and W_D if D is evident from the context. Moreover, we put w(A,o) = ⊗_{a∈A} w(a,o), for any A ⊆ 𝔸.

Example 3.

Table 2 shows a dataset D with interval-valued fuzzy membership degrees, from which we can see, e.g., D(a1,o3) = [0.5, 0.9]. If we put A = {a1, a2} and let ⊗ be the Goguen (product) t-norm, we obtain D(A,o3) = [0.5·0.8, 0.9·1.0] = [0.4, 0.9], etc. As one can see, Table 1 may be viewed as a possible world of the interval-valued dataset from Table 2. Another possible world is, e.g., the w with w(a1,o1) = 0.95, w(a2,o1) = 1.0, w(a3,o1) = 0.88, w(a1,o2) = 0.0, etc.

a1 a2 a3
o1 [0.9, 1.0] [1.0, 1.0] [0.8, 0.9]
o2 [0.0, 0.7] [0.0, 0.5] [0.1, 0.1]
o3 [0.5, 0.9] [0.8, 1.0] [1.0, 1.0]
o4 [0.0, 0.1] [0.9, 0.9] [0.0, 0.0]
o5 [0.0, 0.0] [0.0, 0.5] [0.2, 1.0]
Table 2

An exemplary interval fuzzy dataset D. Objects from O are represented by table rows, attributes from 𝔸 by table columns.
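To make the later examples easy to follow, Table 2 can be represented with pairs of bounds; combining the columns a1 and a2 endpoint-wise with the product t-norm (Theorem 1) reproduces the intervals D(A,o) that appear in the subsequent examples. A sketch with our own variable names:

```python
# Table 2: interval membership degrees (lower, upper), list positions are o1..o5.
D = {
    "a1": [(0.9, 1.0), (0.0, 0.7), (0.5, 0.9), (0.0, 0.1), (0.0, 0.0)],
    "a2": [(1.0, 1.0), (0.0, 0.5), (0.8, 1.0), (0.9, 0.9), (0.0, 0.5)],
    "a3": [(0.8, 0.9), (0.1, 0.1), (1.0, 1.0), (0.0, 0.0), (0.2, 1.0)],
}

def iv_degree(attrs, i, tnorm=lambda a, b: a * b):
    """D(X, o_i) on intervals: apply the t-norm to lower and upper bounds separately."""
    lo, hi = 1.0, 1.0
    for a in attrs:
        lo, hi = tnorm(lo, D[a][i][0]), tnorm(hi, D[a][i][1])
    return lo, hi

# The antecedent intervals D({a1,a2}, o) under the product t-norm.
DA = [iv_degree(["a1", "a2"], i) for i in range(5)]
```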

In the following subsections, we generalize the quality measures of an association rule—the support and confidence—for interval-valued data. We also propose efficient algorithms for their computation and prove their correctness.

3.1. The Support

In this section, a generalization of an association rule's support for interval-valued data is provided. The subsequent theorem shows a rather straightforward way of computing it.

Definition 2.

Let D be a dataset, let A ⇒ C be a rule and let ⊗ be a t-norm. Then the support of A ⇒ C on a possible world w ∈ W is

supp^{A⇒C}(w) = Σ_{o∈O} w(A,o) ⊗ w(C,o).

The interval-valued support of A ⇒ C in the dataset D is

supp(A ⇒ C) = [supp1, supp2],

with the maximal supp1 and the minimal supp2 such that

∀w ∈ W: supp1 ≤ supp^{A⇒C}(w) ≤ supp2.

For simplicity, we omit the superscript and write only supp(w), because the rule A ⇒ C is fixed throughout the paper. Obviously, supp(w) ∈ [0, |O|].

As stated in the subsequent theorem, finding the boundaries supp1 and supp2 is quite straightforward.

Theorem 2.

Let A ⇒ C be a rule and let D be an (interval-valued) dataset. Let w1, w2 ∈ W be such that for each o ∈ O,

w1(A,o) = D̲(A,o),  w1(C,o) = D̲(C,o),
w2(A,o) = D̄(A,o),  w2(C,o) = D̄(C,o).

Then for any t-norm ⊗, supp(A ⇒ C) = [supp1, supp2], where supp1 = supp(w1) and supp2 = supp(w2).

Proof.

The proof evidently follows from the monotonicity of the t-norm ⊗.

Example 4.

Let ⊗ be the Gödel (minimum) t-norm. According to Theorem 2, the support of the rule {a1, a2} ⇒ {a3} in Table 2 can be obtained by evaluating the support of two possible worlds: the world w1 of lower bounds and the world w2 of upper bounds:

supp1 = min{0.9, 1, 0.8} + min{0, 0, 0.1} + min{0.5, 0.8, 1} + min{0, 0.9, 0} + min{0, 0, 0.2} = 1.3,
supp2 = min{1, 1, 0.9} + min{0.7, 0.5, 0.1} + min{0.9, 1, 1} + min{0.1, 0.9, 0} + min{0, 0.5, 1} = 1.9.
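Theorem 2 thus reduces the interval-valued support to two ordinary support computations, one on the world of lower bounds and one on the world of upper bounds. A sketch reproducing Example 4 (the dataset of Table 2 is hard-coded; names are ours):

```python
from functools import reduce

# Table 2 as (lower, upper) pairs; rows o1..o5, columns a1, a2, a3.
rows = [
    [(0.9, 1.0), (1.0, 1.0), (0.8, 0.9)],
    [(0.0, 0.7), (0.0, 0.5), (0.1, 0.1)],
    [(0.5, 0.9), (0.8, 1.0), (1.0, 1.0)],
    [(0.0, 0.1), (0.9, 0.9), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.5), (0.2, 1.0)],
]

def iv_support(rows, tnorm=min):
    """Interval support of {a1,a2} => {a3} via Theorem 2: evaluate the ordinary
    support on the lower-bound world w1 and on the upper-bound world w2."""
    supp1 = sum(reduce(tnorm, (cell[0] for cell in row)) for row in rows)
    supp2 = sum(reduce(tnorm, (cell[1] for cell in row)) for row in rows)
    return supp1, supp2

s1, s2 = iv_support(rows)
```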

Theorem 3.

Let A ⇒ C be a rule, A, C ⊆ 𝔸, let a ∈ 𝔸 be an attribute, a ∉ A ∪ C, and let D be a dataset. Then for an arbitrary t-norm ⊗,

supp((A ∪ {a}) ⇒ C) ≤ supp(A ⇒ C),   (3)

where ≤ is understood componentwise on the lower and upper bounds.

Proof.

Note that according to Theorem 2,

supp̲(A ⇒ C) = Σ_{o∈O} D̲(A,o) ⊗ D̲(C,o),
supp̲((A ∪ {a}) ⇒ C) = Σ_{o∈O} D̲(A,o) ⊗ D̲(a,o) ⊗ D̲(C,o).

(Analogous equations hold for the upper bounds of the support.) Since ⊗ is monotone (and hence x ⊗ y ≤ y for all x, y, because x ⊗ y ≤ 1 ⊗ y = y), we can immediately see that inequality (3) holds.

Theorem 3 states that extending a rule's antecedent with a new attribute does not increase the rule's support. As discussed in Section 2.2, this property of the support is important for association rule searching algorithms. Theorem 3 allows us to use, e.g., Agrawal's Apriori algorithm [30] to search for association rules also on interval-valued data by constraining the support with a user-defined minSupp threshold, e.g., as follows:

  • supp̄(A ⇒ C) ≥ minSupp (the upper bound exceeds the threshold);

  • supp̲(A ⇒ C) ≥ minSupp (the lower bound exceeds the threshold); or

  • ½(supp̲(A ⇒ C) + supp̄(A ⇒ C)) ≥ minSupp (the midpoint exceeds the threshold).

3.2. The Confidence

The objective of this section is to provide an algorithm for computing the association rule's confidence on interval-valued data. The problem is analyzed from the perspective of possible worlds (see Definition 1).

First, the setting of w(C,o) ∈ D(C,o) that yields the minimum or maximum confidence is discussed. After that, the setting of w(A,o) ∈ D(A,o) is analyzed separately with respect to the Gödel (minimum), Goguen (product) and Łukasiewicz t-norms, and the potential solutions from the interval D(A,o) are identified. It is shown that, for each o ∈ O, at most two suspicious points for w(A,o) exist that need to be examined in the search for the extreme confidence. (A combination of such points is called a suspicious world.) After that, an efficient algorithm (Algorithm 1) is presented that solves the underlying optimization sub-problem; it is later used to find the extreme confidence (minimum or maximum) based on the suspicious points (Algorithms 2, 3 and 4).

Algorithm 1 An algorithm for searching for F(Imin), as introduced in Lemma 11. The input arguments G, H are assumed to be real numbers, H > 0, and the input arguments g and h are assumed to be vectors of length n, g = (g1, …, gn), h = (h1, …, hn), hi > 0.

 function FracMinim(G, H, g, h)
   heap ← new empty min-heap
   for all i ∈ {1, …, n} do
    push the fraction gi/hi into heap
   end for
   while heap is not empty do
    pop a minimal element a/b from heap
    if a/b < G/H then
     G ← G + a
     H ← H + b
    else
     break the while loop
    end if
   end while
   return G/H
 end function
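Algorithm 1 can be sketched in Python with the standard heapq module (a direct transcription under our own naming, not the authors' code):

```python
import heapq

def frac_minim(G, H, g, h):
    """Greedy minimization of F(I) = (G + sum_I g_i) / (H + sum_I h_i).

    The fractions g_i/h_i are popped in ascending order; by Lemma 10, adding
    a pair with g_i/h_i < G/H strictly decreases the running fraction, and
    once the smallest remaining fraction is >= G/H no further pair can
    decrease it.  Assumes H > 0 and h_i > 0 for all i.
    """
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap:
        ratio, gi, hi = heapq.heappop(heap)
        if ratio < G / H:
            G += gi
            H += hi
        else:
            break
    return G / H
```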

Algorithm 2 The computation of conf(A ⇒ C) on the dataset D for the Goguen (product) t-norm. (−g′ denotes the element-wise negation of g′; the maximization is turned into a minimization of the negated numerator.)

 function Confidence(D, A, C)
   G, G′, H ← 0
   g, g′, h ← new empty vectors
   for all o ∈ O do
    al ← D̲(A,o), ah ← D̄(A,o)
    cl ← D̲(C,o), ch ← D̄(C,o)
    G ← G + al·cl
    G′ ← G′ + al·ch
    H ← H + al
    if al ≠ ah then
     append (ah·cl − al·cl) to g
     append (ah·ch − al·ch) to g′
     append (ah − al) to h
    end if
   end for
   lo ← FracMinim(G, H, g, h)
   hi ← −FracMinim(−G′, H, −g′, h)
   return [lo, hi]
 end function

Definition 3.

Let D be a dataset, let A ⇒ C be a rule and let w ∈ W be such that w(A,o) ≠ 0 for some o ∈ O. Let ⊗ be a t-norm. Then the confidence of A ⇒ C in a possible world w is

conf^{A⇒C}(w) = Σ_{o∈O} w(A,o) ⊗ w(C,o) / Σ_{o∈O} w(A,o).   (4)

The interval-valued confidence of A ⇒ C in the dataset D is

conf(A ⇒ C) = [conf1, conf2],

with the maximal conf1 and the minimal conf2 such that

∀w ∈ W: conf1 ≤ conf^{A⇒C}(w) ≤ conf2.

Obviously, conf^{A⇒C}(w) ∈ [0,1]. Again, we omit the superscript A ⇒ C where the rule is obvious from the context.

Algorithm 3 The computation of conf_min(A ⇒ C) on the dataset D for the Gödel (minimum) t-norm

 function ConfidenceMin(D, A, C)
   G, H, M, N ← 0
   g, h ← new empty vectors
   for all o ∈ O do
    al ← D̲(A,o), ah ← D̄(A,o)
    cl ← D̲(C,o), ch ← D̄(C,o)
    if ah ≤ cl then
     G ← G + al
     H ← H + al
         because min(al, cl) = al
    else if al ≥ cl then
     G ← G + cl
     H ← H + ah
         because min(ah, cl) = cl
    else
     G ← G + al
     H ← H + al
         because min(al, cl) = al
     if al ≠ ah then
      append (cl − al) to g
      append (ah − al) to h
     end if
    end if
    a ← closest([al, ah], ch)
    M ← M + min(a, ch)
    N ← N + a
   end for
   lo ← FracMinim(G, H, g, h)
   hi ← M/N
   return [lo, hi]
 end function

Example 5.

Let D be the dataset from Table 2 and ⊗ be the Goguen (product) t-norm. Let A = {a1, a2} and C = {a3}. Let us define three possible worlds, w1, w2, w3, by taking the lower bounds for C and the lower bounds for A (w1), the upper bounds for A (w2) and purposely mixed bounds for A (w3). Please see Table 3 and compare it with Table 2. Neither the selection of the lower bounds as in w1 nor of the upper bounds as in w2 yields the minimum confidence of the rule A ⇒ C obtainable from D; a mixed selection such as w3 yields a lower confidence than both: conf^{A⇒C}(w1) ≈ 0.86, conf^{A⇒C}(w2) ≈ 0.74, conf^{A⇒C}(w3) ≈ 0.67.

w1
w2
w3
A C A C A C
o1 0.9 0.8 1.0 0.8 1.0 0.8
o2 0.0 0.1 0.35 0.1 0.35 0.1
o3 0.4 1.0 0.9 1.0 0.4 1.0
o4 0.0 0.0 0.09 0.0 0.09 0.0
o5 0.0 0.2 0.0 0.2 0.0 0.2
Table 3

Three possible worlds for the dataset D in Table 2. w1 takes the lower bounds for A, w2 takes the upper bounds for A, and w3 mixes the bounds for A, yielding a lower confidence of the rule A ⇒ C than both w1 and w2.

As can be seen from the example, finding the boundaries conf1 and conf2 of the confidence of an association rule A ⇒ C is not as straightforward as for the support. It is an optimization problem, equivalent to finding the global extremes (minima and maxima) of function (4), i.e., to finding the worlds that yield the minimal (resp. maximal) possible confidence.

Worlds w ∈ W with optimal values of w(C,o) can be obtained according to the following lemma.

Lemma 4.

Let A ⇒ C be a rule, D be a dataset and W1, W2 ⊆ W be such that

W1 = {w | w(C,o) = D̲(C,o), ∀o ∈ O},
W2 = {w | w(C,o) = D̄(C,o), ∀o ∈ O},

and let

conf(A ⇒ C) = [conf1, conf2].

Then for any t-norm ⊗, there exist w1 ∈ W1 and w2 ∈ W2 such that conf1 = conf(w1) and conf2 = conf(w2).

Proof.

The proof evidently follows from the monotonicity of the t-norm.

Lemma 4 gives straightforward advice on how to select the values of w(C,o) for each o ∈ O in order to minimize (resp. maximize) the confidence conf(w). Now, let us sketch the way of finding the values of w(A,o) (for all o ∈ O) such that the confidence becomes extreme.

Searching for a possible world w ∈ W with globally maximal (resp. minimal) confidence conf(w) is equivalent to searching for the global extremes of a function f(x1, …, xn) with n = |O| parameters, where xi corresponds to w(A,oi), for oi ∈ O and for all i ∈ {1, …, n}. Formally, the function f is a mapping

f: D(A,o1) × … × D(A,on) → [0,1],

defined as

f(x1, …, xn) = Σ_{i=1}^n xi ⊗ w(C,oi) / Σ_{i=1}^n xi = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi,   (5)

where for each i ∈ {1, …, n}, ti: [0,1] → [0,1] is a function defined as

ti(x) = x ⊗ w(C,oi),

since the values w(C,oi) are fixed.

The extremes of function f may be found:

  • at boundaries of intervals D(A,oi), or

  • at critical points, where partial derivatives of f with respect to xi are equal to zero or do not exist.

Lemma 5.

Let the function f be defined for Σ_{i=1}^n xi ≠ 0 as

f(x1, …, xn) = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi,

where ti: [0,1] → [0,1] is, for each i ∈ {1, …, n}, a differentiable function. Then for j ∈ {1, …, n},

∂f(x1, …, xn)/∂xj = 0   if   dtj/dxj (xj) = f(x1, …, xn).   (6)

The function f(x1, …, xn) is increasing with respect to xj if

dtj/dxj (xj) > f(x1, …, xn)   (7)

and decreasing with respect to xj if

dtj/dxj (xj) < f(x1, …, xn).   (8)

Proof.

Let ∂f(x1, …, xn)/∂xj = 0, i.e.,

(dtj/dxj (xj) · Σ_{i=1}^n xi − Σ_{i=1}^n ti(xi)) / (Σ_{i=1}^n xi)² = 0,

which yields

dtj/dxj (xj) = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi = f(x1, …, xn).

The inequalities (7) and (8) easily follow from the fact that a function f is increasing with respect to xj if ∂f(x1, …, xn)/∂xj > 0 and decreasing with respect to xj if ∂f(x1, …, xn)/∂xj < 0.

Lemma 5 provides a general framework for finding the points suspicious of being extremes for an arbitrary t-norm ⊗. Below we examine the suspicious points for the most commonly used t-norms: the Goguen product (·), the Gödel minimum (min) and the Łukasiewicz t-norm (⊗L).

Lemma 6.

(Suspicious worlds related to the product t-norm) Let A ⇒ C be a rule and D be a dataset. Let

P = {w ∈ W | ∀o ∈ O: w(A,o) ∈ {D̲(A,o), D̄(A,o)}}.

Then there exist w1, w2 ∈ P such that conf·(w1) and conf·(w2) are the lower and the upper bound of conf·(A ⇒ C), respectively.

Proof.

For w ∈ W, conf·(w) can be viewed as the function f in (5). If we put ti(xi) = xi · w(C,oi), where w(C,oi) is fixed according to Lemma 4, we obtain from equation (6) of Lemma 5 the following equality:

dtj/dxj (xj) = w(C,oj) = Σ_{i=1}^n xi · w(C,oi) / Σ_{i=1}^n xi,   j ∈ {1, …, |O|}.

Since ⊗ is simply the product, the equation above can be further simplified to

w(C,oj) · Σ_{i=1}^n xi = Σ_{i=1}^n xi · w(C,oi),

w(C,oj) · xj + w(C,oj) · Σ_{i≠j} xi = xj · w(C,oj) + Σ_{i≠j} xi · w(C,oi),

w(C,oj) = Σ_{i≠j} xi · w(C,oi) / Σ_{i≠j} xi.   (9)

So, the partial derivative ∂f/∂xj is equal to zero for all xj ∈ D(A,oj) if (9) holds. In other words, either the local extreme does not exist (if (9) is not fulfilled) or the extreme is at any point xj ∈ D(A,oj) (if (9) holds). This means that the function is monotone with respect to xj. In each case, the global extreme of conf·(w) is thus reached if w(A,oj) is equal to either the lower or the upper bound of the interval D(A,oj), i.e., D̲(A,oj) or D̄(A,oj).

Example 6.

Consider the data from Example 3 and Table 2. Naturally, there exists an infinite number of possible worlds represented by the interval-valued data. However, if we are searching for the worlds with minimum or maximum confidence of the rule A ⇒ C, A = {a1, a2}, C = {a3}, and the product t-norm is used for the computations, there exists, according to Lemmas 4 and 6, a finite set P of possible worlds, which contains both extreme worlds. Any combination of the values of the suspicious points from Table 4 is a possible world from the set P. As can be seen, |P| = 64.

Lower conf.
Upper conf.
A C A C
o1 {0.9,1.0} {0.8} {0.9,1.0} {0.9}
o2 {0.0,0.35} {0.1} {0.0,0.35} {0.1}
o3 {0.4,0.9} {1.0} {0.4,0.9} {1.0}
o4 {0.0,0.09} {0.0} {0.0,0.09} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 4

Sets of suspicious points for rule AC from Example 3, if the product t-norm is being used.
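Because the set P of suspicious worlds is finite, the interval confidence can be cross-checked by brute force: enumerate every combination of interval endpoints for w(A,o) and w(C,o) and take the extreme ratios. A sketch for the rule of this example, with the product-combined intervals D(A,o) from Example 3 and the intervals D(C,o) hard-coded; this is only a check, while Algorithm 2 does the same work in O(n log n) time via FracMinim:

```python
from itertools import product

# Intervals D(A,o) (a1 and a2 combined with the product t-norm) and D(C,o).
DA = [(0.9, 1.0), (0.0, 0.35), (0.4, 0.9), (0.0, 0.09), (0.0, 0.0)]
DC = [(0.8, 0.9), (0.1, 0.1), (1.0, 1.0), (0.0, 0.0), (0.2, 1.0)]

def brute_force_conf(DA, DC):
    """Min/max of conf(w) = sum(a*c) / sum(a) over all endpoint combinations."""
    confs = []
    for avals in product(*DA):       # pick a lower or upper bound per object
        for cvals in product(*DC):
            denom = sum(avals)
            if denom > 0:
                confs.append(sum(a * c for a, c in zip(avals, cvals)) / denom)
    return min(confs), max(confs)

lo, hi = brute_force_conf(DA, DC)
```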

Definition 4.

(Closest element) The function closest: 𝕀 × [0,1] → [0,1] is defined as

closest(X, y) = X̄ if X̄ < y;  X̲ if X̲ > y;  y otherwise,

for X ∈ 𝕀 and y ∈ [0,1].

In other words, closest(X, y) finds the element of the interval X that is closest to y.

Lemma 7.

(Suspicious worlds related to the minimum t-norm) Let A ⇒ C be a rule and D be a dataset. Let

M = {w | ∀o ∈ O: w(A,o) ∈ αo},

where

αo = {D̲(A,o)} if D̄(A,o) ≤ D̲(C,o);  {D̄(A,o)} if D̲(A,o) ≥ D̲(C,o);  {D̲(A,o), D̄(A,o)} otherwise.

Then there exists w1 ∈ M such that conf_min(w1) is the lower bound of conf_min(A ⇒ C).

Let w2 ∈ W be such that

w2(A,o) = closest(D(A,o), D̄(C,o)).   (10)

Then conf_min(w2) is the upper bound of conf_min(A ⇒ C).

Proof.

For the confidence based on the minimum t-norm, we can proceed similarly as in the previous Lemma 6. In this case, we put ti(xi) = min(xi, w(C,oi)), with w(C,oi) again fixed according to Lemma 4. From (6) of Lemma 5 we obtain

dtj/dxj (xj) = Σ_{i=1}^n min(xi, w(C,oi)) / Σ_{i=1}^n xi,   j ∈ {1, …, |O|}.

The derivative of min(xj, w(C,oj)) does not exist for xj = w(C,oj). The derivative equals 1 for xj < w(C,oj), and 0 for xj > w(C,oj).

We can deduce from the inequalities (7) and (8) that the confidence is increasing with respect to xj for xj ≤ w(C,oj) and decreasing with respect to xj for xj ≥ w(C,oj). From that, we can deduce the following values of w(A,o):

  • if D̄(A,o) ≤ D̲(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global minimum;

  • if D̲(A,o) ≥ D̲(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global minimum;

  • otherwise, the global minimum is reached either for D̲(A,o) or D̄(A,o).

Similarly,

  • if D̄(A,o) ≤ D̄(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global maximum;

  • if D̲(A,o) ≥ D̄(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global maximum;

  • otherwise, the global maximum is reached for D̄(C,o).

Example 7.

Consider the data from Example 3 and Table 2 and the minimum t-norm. According to Lemmas 4 and 7, there are only two possible worlds that are candidates for the lower bound of the confidence of the rule A ⇒ C. The upper bound of the confidence is given by the world w2; see Table 5.

M
w2
A C A C
o1 {1.0} {0.8} {0.9} {0.9}
o2 {0.0,0.35} {0.1} {0.1} {0.1}
o3 {0.4} {1.0} {0.9} {1.0}
o4 {0.09} {0.0} {0.0} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 5

Sets of suspicious points for rule AC from Example 3, if the minimum t-norm is being used.
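The upper-bound world w2 of Table 5 can be reproduced directly from Lemma 7: take w2(A,o) = closest(D(A,o), D̄(C,o)) and evaluate the confidence with the minimum t-norm. A sketch (the product-combined intervals D(A,o) from Example 3 are hard-coded; names are ours):

```python
def closest(interval, y):
    """Definition 4: the element of the closed interval nearest to y."""
    lo, hi = interval
    if hi < y:
        return hi
    if lo > y:
        return lo
    return y

DA = [(0.9, 1.0), (0.0, 0.35), (0.4, 0.9), (0.0, 0.09), (0.0, 0.0)]
c_upper = [0.9, 0.1, 1.0, 0.0, 1.0]   # upper bounds of D(C,o)

# w2(A,o) per equation (10), then the min-t-norm confidence on that world.
a_w2 = [closest(iv, c) for iv, c in zip(DA, c_upper)]
conf_upper = sum(min(a, c) for a, c in zip(a_w2, c_upper)) / sum(a_w2)
```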

Lemma 8.

(Suspicious worlds related to the Łukasiewicz t-norm) Let A ⇒ C be a rule and D be a dataset. Let w1 ∈ W be such that

w1(A,o) = closest(D(A,o), 1 − D̲(C,o)).

Then conf_L(w1) is the lower bound of conf_L(A ⇒ C). Let

L = {w | ∀o ∈ O: w(A,o) ∈ αo},

where

αo = {D̲(A,o)} if D̄(A,o) ≤ 1 − D̄(C,o);  {D̄(A,o)} if D̲(A,o) ≥ 1 − D̄(C,o);  {D̲(A,o), D̄(A,o)} otherwise.

Then there exists w2 ∈ L such that conf_L(w2) is the upper bound of conf_L(A ⇒ C).

Proof.

By fixing w(C,oi) according to Lemma 4 and by putting ti(xi) = xi ⊗L w(C,oi) = max(0, xi + w(C,oi) − 1), we obtain from (6) of Lemma 5, for all j ∈ {1, …, |O|}:

dtj/dxj (xj) = d max(0, xj + w(C,oj) − 1)/dxj = Σ_{i=1}^n max(0, xi + w(C,oi) − 1) / Σ_{i=1}^n xi.

The derivative of max(0, xj + w(C,oj) − 1) does not exist for xj = 1 − w(C,oj). The derivative equals 0 for xj < 1 − w(C,oj), and 1 for xj > 1 − w(C,oj).

We can deduce from the inequalities (7) and (8) that the confidence is decreasing with respect to xj for xj ≤ 1 − w(C,oj) and increasing with respect to xj for xj ≥ 1 − w(C,oj). From that, we can deduce the following values of w(A,o):

  • if D̄(A,o) ≤ 1 − D̲(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global minimum;

  • if D̲(A,o) ≥ 1 − D̲(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global minimum;

  • otherwise, the global minimum is reached for 1 − D̲(C,o).

Similarly,

  • if D̄(A,o) ≤ 1 − D̄(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global maximum;

  • if D̲(A,o) ≥ 1 − D̄(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global maximum;

  • otherwise, the global maximum is reached either for D̲(A,o) or D̄(A,o).

Example 8.

Consider the data from Example 3 and Table 2 and the Łukasiewicz t-norm. According to Lemmas 4 and 8, there exists a single possible world that is the candidate for the lower bound, and a different single possible world that is the candidate for the upper bound of the confidence of the rule A ⇒ C; see Table 6.

w1
L
A C A C
o1 {0.9} {0.8} {1.0} {0.9}
o2 {0.35} {0.1} {0.0} {0.1}
o3 {0.4} {1.0} {0.9} {1.0}
o4 {0.09} {0.0} {0.0} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 6

Sets of suspicious points for rule AC from Example 3, if the Łukasiewicz t-norm is being used.

Lemmas 6–8 identify, for each object o ∈ O, the points that are suspected to be global extremes. Recall that, for each examined t-norm, there are at most two suspicious alternative values that could be extreme for each oi ∈ O = {o1, …, on}. The core of the problem of finding the global extremes lies in selecting the correct value from these alternatives. The subsequent lemma transforms the problem into searching for a set of indices I ⊆ {1, …, n′} of the objects oi that pick the greater suspicious value.

Lemma 9.

Let O = {o1, …, on′, …, on}, n′ ≤ n, be a set of objects, ⊗ be a t-norm, A ⇒ C be a rule and D be a dataset. Let us have a set R of possible worlds,

R = {w | ∀oi ∈ O: w(A,oi) ∈ αi ∧ w(C,oi) = ci},

where

αi = {ai, a′i} for i ∈ {1, …, n′},  αi = {ai} for i ∈ {n′+1, …, n},

ai, a′i ∈ D(A,oi), ai < a′i, and ci ∈ D(C,oi). Then

min{conf(w) | w ∈ R} = F(Imin),  max{conf(w) | w ∈ R} = F(Imax),

where

F(I) = (G + Σ_{i∈I} gi) / (H + Σ_{i∈I} hi),   I ⊆ {1, …, n′},

and where

G = Σ_{i=1}^n ai ⊗ ci,  H = Σ_{i=1}^n ai,  gi = a′i ⊗ ci − ai ⊗ ci,  hi = a′i − ai,

and Imin, Imax ⊆ {1, …, n′} are such that for all I ⊆ {1, …, n′},

F(Imin) ≤ F(I) ≤ F(Imax).

Proof.

Let w ∈ R and let I = {i | w(A,o_i) = a′_i}. Then

conf(w) = (Σ_{o_i∈O} w(A,o_i) ⊗ w(C,o_i)) / (Σ_{o_i∈O} w(A,o_i))
        = (Σ_{i∈I} (a′_i ⊗ c_i) + Σ_{i∉I} (a_i ⊗ c_i)) / (Σ_{i∈I} a′_i + Σ_{i∉I} a_i)
        = (Σ_{i=1}^{n} (a_i ⊗ c_i) + Σ_{i∈I} ((a′_i ⊗ c_i) − (a_i ⊗ c_i))) / (Σ_{i=1}^{n} a_i + Σ_{i∈I} (a′_i − a_i))
        = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i) = F(I).
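Lemma 9 turns the extremization of the confidence into the extremization of F(I) over index sets. On small instances, the reduction can be sanity-checked by exhaustively enumerating all subsets I — an illustrative (exponential) Python sketch, not one of the proposed algorithms:

```python
from itertools import combinations

def f_extremes(G, H, g, h):
    # Enumerate F(I) = (G + sum of g[i]) / (H + sum of h[i]) over all
    # I subset of {0, ..., n'-1} and return the pair (minimum, maximum).
    n = len(g)
    best_min = best_max = G / H  # I = empty set
    for r in range(1, n + 1):
        for I in combinations(range(n), r):
            v = (G + sum(g[i] for i in I)) / (H + sum(h[i] for i in I))
            best_min = min(best_min, v)
            best_max = max(best_max, v)
    return best_min, best_max
```

For example, with G = 2, H = 4, g = (−1, 3), h = (2, 1), the four subsets yield the values 1/2, 1/6, 1 and 4/7, so the extremes are F(I_min) = 1/6 and F(I_max) = 1.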

Lemma 10.

Let G, H, g, h ∈ ℝ, H, h > 0. If g/h < G/H then

g/h < (G + g)/(H + h) < G/H.

Proof.

From g/h < G/H and h > 0 we get G > gH/h. By substituting this bound on G into (G + g)/(H + h), we obtain

(gH/h + g)/(H + h) < (G + g)/(H + h),
(g(H + h)/h)/(H + h) < (G + g)/(H + h),

which yields g/h < (G + g)/(H + h). The second inequality can be proven analogously.

Lemma 11.

Let G, H ∈ ℝ, H > 0, let g_i, h_i ∈ ℝ, h_i > 0, for all i ∈ {1, …, n}, and let

F(I) = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n}. (11)

Moreover, let I_min, I_max ⊆ {1, …, n} be such that for all I ⊆ {1, …, n},

F(I_min) ≤ F(I) ≤ F(I_max). (12)

Then g_j/h_j < F(I_min) implies j ∈ I_min, and g_j/h_j > F(I_min) implies j ∉ I_min. Similarly, g_j/h_j > F(I_max) implies j ∈ I_max, and g_j/h_j < F(I_max) implies j ∉ I_max.

Proof.

Let us prove only the first implication, i.e., that g_j/h_j < F(I_min) implies j ∈ I_min. (The other implications can be proven analogously.) Assume to the contrary that g_j/h_j < F(I_min) and j ∉ I_min. Let G′ = G + Σ_{i∈I_min} g_i and H′ = H + Σ_{i∈I_min} h_i. Then F(I_min) = G′/H′ > g_j/h_j. According to Lemma 10, it is evident that

F(I_min) = G′/H′ > (G′ + g_j)/(H′ + h_j) = F(I_min ∪ {j}),

which yields that F(I_min) is not minimal, which is a contradiction.

Next, we present the FracMinim algorithm, which searches for F(I_min) (see Algorithm 1). The algorithm uses a so-called minimum heap (min-heap), a data structure that allows us to efficiently obtain the smallest element of a set of stored values. The algorithm first puts all fractions g_i/h_i into the heap and then iteratively pops the elements in ascending order, starting from the smallest. Such an approach is functionally equivalent to iterating through the collection of values sorted in ascending order by g_i/h_i. The benefit of using a heap instead of a sorted collection is a better time complexity: the elements that are never popped from the heap never need to be fully ordered.
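Based on this description and on Lemmas 10–12, the core of FracMinim can be sketched in Python with the standard `heapq` module (an illustrative sketch, not the authors' reference implementation):

```python
import heapq

def frac_minim(G, H, g, h):
    """Minimize (G + sum of selected g[i]) / (H + sum of selected h[i]).

    Fractions g[i]/h[i] are popped from a min-heap in ascending order;
    by Lemma 10, merging a fraction smaller than the current ratio G/H
    strictly decreases the ratio, so the greedy loop stops as soon as
    the smallest remaining fraction is no longer below G/H.
    """
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H
```

For example, frac_minim(2, 4, [-1, 3], [2, 1]) merges only the fraction −1/2 and returns (2 − 1)/(4 + 2) = 1/6, the minimum of F(I) over all four subsets.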

Lemma 12.

Let G, H ∈ ℝ, H > 0, and let g, h be vectors of real numbers, i.e., g = (g_1, …, g_n) and h = (h_1, …, h_n), where g_i, h_i ∈ ℝ and h_i > 0 for all i ∈ {1, …, n}. Then the call FracMinim(G, H, g, h), as defined in Algorithm 1, returns F(I_min) as in (12).

Proof.

We are going to prove this lemma inductively by proving the simultaneous validity of the following statements:

  (a) if g_i/h_i ≥ G/H for all i ∈ {1, …, n} then FracMinim(G, H, g, h) = F(I_min), where g = (g_1, …, g_n) and h = (h_1, …, h_n);

  (b) let j ∈ {1, …, n} be such that g_j/h_j ≤ g_i/h_i for all i ∈ {1, …, n}; if g_j/h_j < G/H then FracMinim(G, H, g, h) = FracMinim(G′, H′, g′, h′), where G′ = G + g_j, H′ = H + h_j, g′ = (g_1, …, g_{j−1}, g_{j+1}, …, g_n), h′ = (h_1, …, h_{j−1}, h_{j+1}, …, h_n).

Ad (a): Let g_i/h_i ≥ G/H for all i ∈ {1, …, n} (possibly with n = 0); then Algorithm 1 obviously returns G/H. It remains to prove that G/H is the correct value of F(I_min). Assume to the contrary that F(I_min) < G/H. Then from the definition of F(I) in (11) it obviously follows that I_min ≠ ∅. On the other hand, it follows from the assumption g_i/h_i ≥ G/H > F(I_min) and from Lemma 11 that i ∉ I_min, for all i ∈ {1, …, n}, which is in contradiction with I_min ≠ ∅.

Ad (b): Without loss of generality, let us assume the vectors g and h to be indexed according to the ascending order of the values g_i/h_i, i.e., g_1/h_1 ≤ g_2/h_2 ≤ … ≤ g_n/h_n. Now, let j ∈ {1, …, n} be such that g_j/h_j ≤ g_i/h_i for all i ∈ {1, …, n} and let k = |{i | g_i/h_i = g_j/h_j}|, i.e., k is the number of equally smallest fractions. If k = 1 then obviously also j = 1, and the calls FracMinim(G, H, g, h) and FracMinim(G′, H′, g′, h′) evidently return the same result. If k > 1 then j ∈ {1, …, k} and it is clear that the call FracMinim(G, H, g, h) will process at least the first k elements of the vectors g and h, i.e., after k steps of the "while" loop, the variables G and H become incremented by Σ_{i=1}^{k} g_i and Σ_{i=1}^{k} h_i, respectively. This is due to the fact that g_j/h_j < G/H implies, according to Lemma 10, that g_j/h_j < (G + g_j)/(H + h_j). Hence, the "if" condition inside the "while" loop remains true for all of the first k fractions in the heap. Then, obviously, the call FracMinim(G′, H′, g′, h′) will surely process the first k − 1 fractions and thus increment G′ and H′ by Σ_{i=1}^{k−1} g′_i = Σ_{i=1, i≠j}^{k} g_i and Σ_{i=1}^{k−1} h′_i = Σ_{i=1, i≠j}^{k} h_i, respectively. This proves FracMinim(G, H, g, h) = FracMinim(G′, H′, g′, h′).

Corollary 13.

Let G, H ∈ ℝ, H > 0, let g = (g_1, …, g_n), h = (h_1, …, h_n), where g_i, h_i ∈ ℝ, h_i > 0, for all i ∈ {1, …, n}. Then the call of the function FracMinim(−G, H, −g, h), as defined in Algorithm 1, returns −F(I_max) for F(I) as in (11) and I_max as in (12).

Proof.

The call of FracMinim(−G, H, −g, h) searches for F′(I′_min), where

F′(I) = (−G + Σ_{i∈I} (−g_i)) / (H + Σ_{i∈I} h_i) = −(G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i) = −F(I),

for F(I) as in (11) and where for all I ⊆ {1, …, n},

F′(I′_min) ≤ F′(I) = −F(I).

Hence, −F′(I′_min) is an upper bound of F(I) that is attained, and therefore, F′(I′_min) = −F(I_max).
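Corollary 13 gives a recipe for maximization by negation. Assuming a `frac_minim` implementation of Algorithm 1 (a minimal version is inlined below so that the sketch is self-contained), a hypothetical `frac_maxim` becomes a one-liner:

```python
import heapq

def frac_minim(G, H, g, h):
    # Greedy minimizer of F(I), as in Algorithm 1 (Lemma 12).
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H

def frac_maxim(G, H, g, h):
    # Corollary 13: the maximum of F(I) equals the negated minimum of
    # -F(I), obtained by negating G and every g[i] while keeping H, h.
    return -frac_minim(-G, H, [-gi for gi in g], h)
```

For instance, frac_maxim(2, 4, [-1, 3], [2, 1]) returns (2 + 3)/(4 + 1) = 1, the maximum of F(I) over all subsets.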

Theorem 14.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence(D,A,C), as defined in Algorithm 2, returns conf(A ⇒ C).

Proof.

Algorithm 2 prepares arguments for the calls of FracMinim. Namely, the variables G, G′, H and the vectors g, g′, h are initialized as follows (where O = {o_1, …, o_n}):

G = Σ_{i=1}^{n} D̲(A,o_i) · D̲(C,o_i), (13)

G′ = Σ_{i=1}^{n} D̲(A,o_i) · D̄(C,o_i), (14)

H = Σ_{i=1}^{n} D̲(A,o_i), (15)

g = (g_i)_{i=1, g_i≠0}^{n}, where g_i = D̄(A,o_i) · D̲(C,o_i) − D̲(A,o_i) · D̲(C,o_i), (16)

g′ = (g′_i)_{i=1, g′_i≠0}^{n}, where g′_i = D̄(A,o_i) · D̄(C,o_i) − D̲(A,o_i) · D̄(C,o_i), (17)

h = (h_i)_{i=1, h_i≠0}^{n}, where h_i = D̄(A,o_i) − D̲(A,o_i). (18)

Let n′ = |g| = |g′| = |h|. Note that n′ ≤ n because the vectors do not contain zero elements, which correspond to objects o_i such that D(A,o_i) is a single-element interval.

First of all, let us prove that the algorithm computes the correct lower bound lo of the confidence. According to Lemma 4, the lower bound is obtained for w(C,o) = D̲(C,o) (for each o ∈ O). Regarding w(A,o), the lower bound of the confidence is obtained for w(A,o) ∈ {D̲(A,o), D̄(A,o)} (for each o ∈ O), which follows from Lemma 6. The whole problem of finding the lower bound thus consists in finding the proper combination of the values w(A,o) for all o.

In other words, we have a set of possible worlds

R = {w | ∀o_i ∈ O: w(A,o_i) ∈ {D̲(A,o_i), D̄(A,o_i)} ∧ w(C,o_i) = D̲(C,o_i)},

from which we have to select the one that yields the minimum confidence. According to Lemma 9, finding the minimum confidence among the worlds w ∈ R is equivalent to finding F(I_min), where

F(I) = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n′},

with G, H from (13), (15) and g = (g_i), h = (h_i) from (16), (18). The minimum confidence is obtained by calling FracMinim(G, H, g, h), as proven in Lemma 12.

Similarly, finding the upper bound of the confidence is equivalent to finding F′(I_max), where

F′(I) = (G′ + Σ_{i∈I} g′_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n′},

with G′, H from (14), (15) and g′ = (g′_i), h = (h_i) from (17), (18). According to Corollary 13, the maximum confidence is obtained as −FracMinim(−G′, H, −g′, h).
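On toy data, the bounds produced by this construction can be cross-checked against a brute-force search over the endpoint worlds of Lemma 6. The following Python sketch (hypothetical helper name, exponential in n, for illustration only) computes the product t-norm bounds directly from the definition, assuming at least one nonzero lower bound of the antecedent so that no division by zero occurs:

```python
from itertools import product

def conf_bounds_goguen_brute(A_lo, A_hi, C_lo, C_hi):
    # Brute-force confidence bounds for the Goguen (product) t-norm.
    # Lemma 4 fixes the consequent at its lower (resp. upper) bound;
    # Lemma 6 restricts suspicious antecedent values to the endpoints.
    worlds = list(product(*zip(A_lo, A_hi)))
    lo = min(sum(a * c for a, c in zip(w, C_lo)) / sum(w) for w in worlds)
    hi = max(sum(a * c for a, c in zip(w, C_hi)) / sum(w) for w in worlds)
    return lo, hi
```

An O(n log n) implementation following Theorem 14 must return the same pair of bounds on any such instance.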

Theorem 15.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence_min(D,A,C), as defined in Algorithm 3, returns conf_min(A ⇒ C).

Proof.

The proof is analogous to the proof of Theorem 14. It is again based on the search for a possible world w ∈ W that yields a globally minimal or maximal confidence. Again, the lower or upper bound of the confidence is reached for w(C,o) equal to D̲(C,o) or D̄(C,o), respectively; see Lemma 4.

Let us prove the correctness of the computation of the lower bound lo. Let

S = {i | D̄(A,o_i) ≤ D̲(C,o_i)},
T = {i | D̲(A,o_i) ≥ D̲(C,o_i)},
U = {i | D̲(A,o_i) < D̲(C,o_i) < D̄(A,o_i)}.

Note that S ∪ T ∪ U = {1, …, n} and that S, T, U are pairwise disjoint. The algorithm initializes the variables G, H, g, h as follows:

G = Σ_{i∈S∪U} D̲(A,o_i) + Σ_{i∈T} D̲(C,o_i),
H = Σ_{i∈S∪U} D̲(A,o_i) + Σ_{i∈T} D̄(A,o_i),
g = (g_i)_{i∈U, g_i≠0}, where g_i = D̲(C,o_i) − D̲(A,o_i),
h = (h_i)_{i∈U, h_i≠0}, where h_i = D̄(A,o_i) − D̲(A,o_i).

Note that G, g are equivalent to

G = Σ_{i=1}^{n} min(D̲(A,o_i), D̲(C,o_i)),
g = (g_i)_{i∈U, g_i≠0}, where g_i = min(D̄(A,o_i), D̲(C,o_i)) − min(D̲(A,o_i), D̲(C,o_i)).

It is now easy to see that G, H, g, h are equivalent to G, H, g, h from Lemma 9, with a_i, a′_i set as in Lemma 7. All of this, together with Lemma 11 and Lemma 12, leads to the fact that the call FracMinim(G, H, g, h) indeed returns the lower bound of the confidence.

The correctness of the upper bound hi follows easily from (10) in Lemma 7.
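The S/T/U case analysis translates directly into the initialization phase of Algorithm 3. A possible Python sketch of that phase (variable names are illustrative, following the proof) is:

```python
def init_lower_min_tnorm(A_lo, A_hi, C_lo):
    """Build G, H, g, h of Lemma 9 for the lower confidence bound
    under the Gödel (minimum) t-norm, per the S/T/U partition:
      S: whole antecedent interval below the consequent lower bound,
      T: whole antecedent interval above it,
      U: the consequent lower bound lies strictly inside the interval.
    """
    G = H = 0.0
    g, h = [], []
    for al, ah, cl in zip(A_lo, A_hi, C_lo):
        if ah <= cl:          # i in S: min(a, cl) = a on the interval
            G += al
            H += al
        elif al >= cl:        # i in T: min(a, cl) = cl on the interval
            G += cl
            H += ah
        else:                 # i in U: two suspicious endpoints al, ah
            G += al
            H += al
            g.append(cl - al)
            h.append(ah - al)
    return G, H, g, h
```

The returned quadruple is then passed to FracMinim to obtain the lower bound.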

Theorem 16.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence_L(D,A,C), as defined in Algorithm 4, returns conf_L(A ⇒ C).

Proof.

The proof is closely analogous to the proofs of Theorems 14 and 15. It follows from Lemmas 4, 8, 9 and 12 and from Corollary 13.

Theorem 17.

The time complexity of FracMinim, Confidence, Confidence_min and Confidence_L is in O(n log n), for n = |O|.

Proof.

First of all, let us focus on the time complexity of FracMinim. It is known that pushing an element into a heap, as well as popping the minimal element from it, is in O(log n). FracMinim performs this step n times, which results in O(n log n). After that, the elements are removed from the heap sequentially, which is again in O(n log n). Hence, the whole of FracMinim is in O(n log n).

Regarding Algorithms 2–4, the initialization steps can be executed with time complexity in O(n). The subsequent calls of FracMinim are in O(n log n); hence, the whole algorithms are in O(n log n), too.

4. EXPERIMENT

In this section, a practical experiment is conducted to show the efficiency of the algorithms in realistic settings. As many other papers have already addressed the usefulness of association analysis, we limit ourselves to the computation of the confidence, which is the main topic of this paper.

To evaluate the performance, we randomly generated (from the uniform distribution) four vectors of equal size, which represent the lower and upper bounds of the antecedent A and the consequent C. Three algorithms were then executed to compute the confidence of A ⇒ C based on the Gödel, Goguen and Łukasiewicz t-norms. Since we are not aware of any other algorithm for the computation of the confidence on interval data, we compared them to the algorithm that computes the confidence on noninterval data, which simply evaluates equation (2). Naturally, the noninterval algorithm is much faster because it has a linear time complexity, O(n). The results can be found in Table 7.

Data Size Non-int G P L
400 0.00 0.07 0.19 0.03
800 0.00 0.05 0.35 0.07
1600 0.00 0.09 0.70 0.12
3200 0.01 0.18 1.49 0.24
6400 0.01 0.35 2.93 0.48
12800 0.01 0.37 3.21 0.53
25600 0.02 0.71 6.93 1.02
51200 0.12 1.52 14.83 2.24
102400 0.09 2.84 29.60 4.31
204800 0.21 5.74 65.66 9.13
409600 0.63 12.12 144.46 18.59
819200 1.39 25.11 322.94 38.27
1638400 2.17 52.72 850.75 81.70
3276800 4.44 108.16 1769.73 167.95
6553600 8.60 226.69 4073.44 365.52
13107200 17.43 482.65 9182.41 795.08
Table 7

Experimental running times (in milliseconds) of computations of confidence based on Gödel t-norm (G), Goguen product t-norm (P) and Łukasiewicz t-norm (L). A noninterval confidence computation (non-int) is also provided for comparison.

As can be seen, the algorithm for the Gödel t-norm is about 32 times slower than the noninterval computation, the Łukasiewicz variant about 43 times slower, and the confidence based on the Goguen t-norm is computed on average 319 times slower than the noninterval variant. Nevertheless, the results are satisfactory: the computation of the interval-valued confidence, although much more complicated than the noninterval variant, still costs less than a second for both the Gödel and the Łukasiewicz t-norms, even for data with more than 13 million records, each of them interval-valued, which will often not be the case in real applications.

The single-threaded experiment was performed on an Intel(R) Core(TM) i5-3570 CPU at 3.40 GHz.

Algorithm 4 Computation of conf_L(A ⇒ C) on the dataset D

 function Confidence_L(D, A, C)
   G, H, M, N ← 0
   g, h ← new empty vectors
   for all o ∈ O do
     a_l ← D̲(A,o), a_h ← D̄(A,o)
     c_l ← D̲(C,o), c_h ← D̄(C,o)
     a ← closest([a_l, a_h], 1 − c_l)
     M ← M + max(0, a + c_l − 1)
     N ← N + a
     if a_h ≤ 1 − c_h then
       H ← H + a_l
     else if a_l ≥ 1 − c_h then
       G ← G + a_h + c_h − 1
       H ← H + a_h
     else
       H ← H + a_l
       if a_l ≠ a_h then
         append (a_h + c_h − 1) to g
         append (a_h − a_l) to h
       end if
     end if
   end for
   lo ← M/N
   hi ← −FracMinim(−G, H, −g, h)
   return [lo, hi]
 end function
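Algorithm 4 admits a direct transcription into Python. The sketch below bundles it with a minimal `frac_minim` standing in for Algorithm 1; it is an illustrative translation rather than the authors' reference code, and it assumes that the denominators stay positive (at least one nonzero antecedent lower bound):

```python
import heapq

def frac_minim(G, H, g, h):
    # Algorithm 1: greedy minimization of (G + sum g)/(H + sum h).
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H

def confidence_lukasiewicz(A_lo, A_hi, C_lo, C_hi):
    """Interval bounds [lo, hi] of the confidence under the Łukasiewicz
    t-norm a ⊗ c = max(0, a + c - 1), following Algorithm 4."""
    G = H = M = N = 0.0
    g, h = [], []
    for al, ah, cl, ch in zip(A_lo, A_hi, C_lo, C_hi):
        a = min(max(1 - cl, al), ah)   # closest([al, ah], 1 - cl)
        M += max(0.0, a + cl - 1)      # numerator of the lower bound
        N += a                         # denominator of the lower bound
        if ah <= 1 - ch:               # conjunction is 0 on the interval
            H += al
        elif al >= 1 - ch:             # conjunction is a + ch - 1
            G += ah + ch - 1
            H += ah
        else:                          # two suspicious endpoints
            H += al
            if al != ah:
                g.append(ah + ch - 1)
                h.append(ah - al)
    lo = M / N
    hi = -frac_minim(-G, H, [-gi for gi in g], h)
    return lo, hi
```

For a single object with D(A,o) = [0.4, 0.9] and D(C,o) = [0.7, 0.8], the call returns lo = 0.1/0.4 = 0.25 and hi = 0.7/0.9, matching a direct extremization of max(0, a + c − 1)/a over the intervals.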

5. CONCLUSION

In this paper, efficient algorithms were presented that allow the computation of the support and confidence of an association rule from interval-valued fuzzy data. Interval values may be used to model partial knowledge, i.e., values that are unknown or measured with a limited precision. The proposed algorithms compute the minimal and maximal possible support and confidence based on the assumption that the nonprecise data are independent. It was shown that the computation of the confidence on interval-valued data is more complex than the computation on noninterval data: the time complexity of the interval-valued confidence is in O(n log n), whereas the time complexity of the noninterval confidence is in O(n). However, the most time-consuming part of the association analysis is the search for rules with sufficiently high support, whose complexity remains unchanged (as discussed in Theorem 3).

The computation of the minimal and maximal confidence is an optimization problem that depends on the applied t-norm. The algorithms presented here allow computation with the Gödel (minimum), Goguen (product) and Łukasiewicz t-norms. However, several results are applicable to an arbitrary t-norm; see Lemmas 4 and 5. Moreover, the developed algorithms are directly applicable to any t-norm for which it can be proven that there are at most two suspicious values per data row (see Lemma 5). The investigation of other t-norms or families of t-norms is left for future work.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHORS' CONTRIBUTIONS

All authors contributed to the work presented in this paper. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS

The authors acknowledge support by the project "LQ1602 IT4Innovations excellence in science" and by GAČR 16-19170S.

REFERENCES

2. P. Hájek, The problem of a general conception of the GUHA method, Kybernetika, Vol. 4, 1968, pp. 505-515. https://dml.cz//handle/10338.dmlcz/125459
4. R. Agrawal and R. Srikant, Fast algorithms for mining association rules, in Proceedings of the 20th International Conference on Very Large Databases (Santiago de Chile, Chile), AAAI Press, 1994, pp. 487-499. https://dl.acm.org/doi/proceedings/10.5555/645920
6. J. Kacprzyk, R.R. Yager, and S. Zadrożny, A fuzzy logic based approach to linguistic summaries of databases, Int. J. Appl. Math. Comput. Sci., Vol. 10, 2000, pp. 813-834. https://www.amcs.uz.zgora.pl/?action=paper&paper=930
7. M. Ralbovský, Fuzzy GUHA, PhD Thesis, University of Economics, Prague, Czech Republic, 2009. https://insis.vse.cz/zp/14956/podrobnosti
12. J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1993. https://dl.acm.org/doi/book/10.5555/152181
13. J. Łukasiewicz, O logice trójwartościowej, Ruch Filozoficzny, Vol. 5, 1920, pp. 170-171. https://philpapers.org/rec/UKAOLT-2
