International Journal of Computational Intelligence Systems

Volume 13, Issue 1, 2020, Pages 1014 - 1026

Computation of Support and Confidence for Interval-Valued Fuzzy Association Rules

Authors
Michal Burda*, ORCID, Viktor Pavliska, Petra Murinová
Institute for Research and Applications of Fuzzy Modeling, CE IT4Innovations, Division University of Ostrava, 30. Dubna 22, 701 03 Ostrava, Czech Republic
*Corresponding author. Email: michal.burda@osu.cz
Received 3 February 2020, Accepted 13 July 2020, Available Online 29 July 2020.
DOI
10.2991/ijcis.d.200715.001
Keywords
Interval-valued data; Fuzzy association rules; Support; Confidence; Algorithm
Abstract

The aim of this paper is to provide an algorithm for the computation of the support and confidence of association rules on interval-valued fuzzy sets. Each element of an interval-valued fuzzy set has a membership degree defined as an interval. Such membership intervals may be interpreted as partial knowledge, where the precise membership value is not known. The computations of the support and the confidence are discussed with respect to the three most common triangular norms (minimum, product and Łukasiewicz), which act as the conjunction in the definitions of support and confidence.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The main objective of this paper is to continue the development of generalized association rules with missing values, which are a tool for the exploratory analysis of large data sets. In [1], we introduced a generalized algorithm for the association analysis of natural data with undefined values.

Association rules were first introduced by Hájek et al. in the late 1960s [2] through their formulation of the General Unary Hypotheses Automaton (GUHA) method [3] for processing binary or categorical data. Agrawal [4] reinvented this approach independently in 1993. An analysis of natural data based on linguistic summaries was developed by Yager in 1982 [5] and later by Kacprzyk [6] and, in many aspects, is similar to association analysis. An extension of association rules for fuzzy data was also developed in 2009 by Ralbovský [7]. An interpretation of data using expressions of natural language, namely using the generalized intermediate quantifiers “Almost all, Most, Many,” was proposed in [8].

In real-world applications, it is very common for the data being analyzed to contain missing values. There may be many reasons for data unavailability; some of them are discussed in [9]. If an analytical method cannot work with missing data, records with missing values are often removed. Another common practice is to impute some reasonable values into the empty cells, provided the missingness does not depend on the unobserved or missing value itself [10]. Imputation is performed by substituting the missing values with averages or by applying a more sophisticated approach, such as machine learning techniques [11].

On the other hand, some analytical methods allow for the direct processing of missing values. For instance, an extension of the GUHA method allowing the search for association rules on binary and categorical data with missing values was introduced in [3]. The popular C4.5 algorithm [12] for decision trees can also manage missing values via distribution-based imputation.

The problem of missing values has also been studied from the point of view of mathematical logic. The fundamental grounds were laid by Kleene, Bochvar, Sobociński and others by establishing three-valued logics, which were also studied by Łukasiewicz [13] in 1920. An overview can be found in [14]. Generalized approaches were introduced in [15], [16] and [17]. A three-valued logic modeling borderline or unknown values is presented in [18]. A deeper interpretation of the third truth value (possible, undefined, half-true, inconsistent) can be found in [19].

The missingness of a value is a special case of partial knowledge. Other values may be measured imprecisely, which can be described by an interval of possible values. From that perspective, a completely missing value can be characterized with an interval that is identical to the whole domain of the attribute.

Recently, computations with intervals have become popular. For instance, Grzegorzewski [20] introduces measures of dispersion for interval data, Billard and Diday [21] focus on regression analysis for interval-valued data and Lima Neto and De Carvalho [22] develop a nonlinear interval-valued regression.

The aim of this paper is to generalize the concept of association rules for interval-valued fuzzy sets as introduced in [23]. The fuzzy set generalizes a classical mathematical set by allowing a partial membership of an element. Each element of the universe is assigned a degree of membership to a given fuzzy set. Interval-valued fuzzy sets further generalize this concept by using intervals of membership degrees. These intervals may be interpreted as partial information where the exact membership degree is not known. Interval-valued fuzzy sets were studied in [24] and [25].

This paper enables the search for association rules on fuzzy data that have missing information. As the fuzziness captures the degree of an object possessing some property, the lack of information in data or imprecision in membership degree assignment may be modeled using the intervals of these membership degrees.

Concretely, this paper focuses on an algorithm for the fast computation of the most common quality measures of an association rule: the support and confidence. The support measures the number of records in the database that satisfy both the antecedent A and the consequent C of the association rule A ⇒ C. The confidence is a measure defined as a generalization of conditional probability: it is the fraction of records satisfying the consequent C within the set of records that fulfill the antecedent A. In fuzzy association analysis, the “and” connective is modeled with a triangular norm (t-norm). The presented algorithms are designed for the three most commonly used t-norms: Gödel (minimum), Goguen (product) and Łukasiewicz.

Example 1.

Consider a dataset with concentrations of potassium and water pH measured in order to assess their influence on the growth of water plants. U is the set of experiments conducted. The initial three numeric columns (pH, potassium, growth rate) can be transformed into several fuzzy sets such as “acidic pH,” “neutral pH,” “basic pH,” “low potassium,” “medium potassium,” “high potassium,” etc. From that, an association rule searching algorithm may find the rule “acidic pH” ∧ “medium potassium” ⇒ “high growth rate”.

If the original numeric values were all measured precisely, then the membership degrees to the fuzzy sets would also be precise and we would obtain a precise confidence of that rule. On the other hand, if some measurements were missing or if the measuring tool was imprecise, we might be given intervals of possible values instead of precise numbers, which may lead to intervals of possible membership degrees and thus to an imprecise confidence.

The rest of the paper is organized as follows: Section 2 recalls basic definitions from interval-valued fuzzy set theory and introduces the reader to (noninterval) fuzzy association rules, Section 3 generalizes association rules for interval-valued fuzzy sets, and the final section concludes the paper.

2. PRELIMINARIES

The main goal of this section is to introduce a mathematical background that will be used for fuzzy association rules.

2.1. Interval-Valued Fuzzy Sets

A fuzzy set [26] is defined as a mapping from a universe of discourse U to the real interval [0,1], i.e., F: U → [0,1]. Unlike crisp sets, where an object either fully belongs or does not belong to a set, fuzzy sets enable an object u ∈ U to belong partially to the set F in a degree F(u). A fuzzy set X is a subset of a fuzzy set Y, X ⊆ Y, if X(u) ≤ Y(u) for all u ∈ U. The size of a fuzzy set X is |X| = Σ_{u∈U} X(u).

Triangular norms (t-norms) are binary operations ⊗: [0,1]×[0,1] → [0,1], which fulfill the following axioms for each x, y, z ∈ [0,1]:

  • T1 (commutativity): x ⊗ y = y ⊗ x;

  • T2 (associativity): x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z;

  • T3 (monotonicity): x ⊗ y ≤ x ⊗ z whenever y ≤ z;

  • T4 (boundary condition): 1 ⊗ x = x.

T-norms were studied in depth by Klement et al. [27] and later elaborated by many others. The most common examples of t-norms are

  1. the Gödel (minimum) t-norm: x ⊗ y = min{x, y},

  2. the Goguen (product) t-norm: x ⊗ y = x·y,

  3. the Łukasiewicz t-norm: x ⊗ y = max{0, x + y − 1}.
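These three t-norms can be sketched as plain functions on [0,1] (a minimal illustration with names of our choosing, not code from the paper):

```python
def godel(x, y):
    """Gödel (minimum) t-norm."""
    return min(x, y)

def goguen(x, y):
    """Goguen (product) t-norm."""
    return x * y

def lukasiewicz(x, y):
    """Łukasiewicz t-norm."""
    return max(0.0, x + y - 1.0)

# All three satisfy the boundary condition T4: 1 ⊗ x = x.
```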

Convention. In the sequel, we will work with closed intervals [x1,x2] ⊆ [0,1]. By 𝕀 we will denote the set of all closed intervals on [0,1], i.e., 𝕀 = {[x1,x2] | [x1,x2] ⊆ [0,1]}. For X = [x1,x2] ∈ 𝕀, the lower and the upper bound will be denoted by X̲ and X̄, respectively, i.e., X̲ = x1 and X̄ = x2.

An interval-valued fuzzy set F on the universe U is a mapping F: U → 𝕀 assigning an interval from 𝕀 to any element from the universe U.

First of all, we start with a definition of the ordering on 𝕀. Let X, Y ∈ 𝕀; then X ≤ Y iff X ∨ Y = Y (equivalently, X ∧ Y = X), where

X ∨ Y = [max{X̲, Y̲}, max{X̄, Ȳ}],   X ∧ Y = [min{X̲, Y̲}, min{X̄, Ȳ}].

We continue with definitions of basic operations on 𝕀. We generalize this approach to interval fuzzy logic in the definitions below, which were introduced in [24,25].

A t-norm on 𝕀 is a binary operation ⊗: 𝕀×𝕀 → 𝕀, which satisfies the following axioms for each X, Y, Z ∈ 𝕀:

(T1) commutativity: X ⊗ Y = Y ⊗ X;

(T2) associativity: X ⊗ (Y ⊗ Z) = (X ⊗ Y) ⊗ Z;

(T3) monotonicity: X ⊗ Y ≤ X ⊗ Z whenever Y ≤ Z;

(T4) boundary condition: [1,1] ⊗ X = X.

Theorem 1.

(Representability [28]) Let ⊗: [0,1]×[0,1] → [0,1] be a t-norm. For X, Y ∈ 𝕀, we put

X ⊗ Y = [X̲ ⊗ Y̲, X̄ ⊗ Ȳ].

Then ⊗ is a t-norm on 𝕀.

Proof.

Let ⊗ be a t-norm on [0,1]; hence ⊗ is commutative. Then obviously

X ⊗ Y = [X̲ ⊗ Y̲, X̄ ⊗ Ȳ] = [Y̲ ⊗ X̲, Ȳ ⊗ X̄] = Y ⊗ X,

so we see that ⊗ on 𝕀 is commutative, too. We can prove the associativity (T2) and the boundary condition (T4) in a similar way.

To prove the monotonicity (T3), let us consider arbitrary Y, Z ∈ 𝕀 such that Y ≤ Z. By definition, Y ∨ Z = Z, which means that max{Y̲, Z̲} = Z̲ and max{Ȳ, Z̄} = Z̄, i.e., Y̲ ≤ Z̲ and Ȳ ≤ Z̄. Since ⊗ is monotone, we can see that for any X ∈ 𝕀, X̲ ⊗ Y̲ ≤ X̲ ⊗ Z̲ and X̄ ⊗ Ȳ ≤ X̄ ⊗ Z̄, which is equivalent to (X ⊗ Y) ∨ (X ⊗ Z) = X ⊗ Z, which means that X ⊗ Y ≤ X ⊗ Z, from which we can see the monotonicity of ⊗ on 𝕀.
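The representability theorem lets us lift any t-norm from [0,1] to 𝕀 simply by applying it to the interval endpoints. A small sketch (function names are ours; intervals are modeled as (lower, upper) pairs):

```python
def iv_tnorm(tnorm, x, y):
    """Lift a t-norm on [0,1] to intervals per the representability theorem:
    [x1,x2] ⊗ [y1,y2] = [x1 ⊗ y1, x2 ⊗ y2]."""
    return (tnorm(x[0], y[0]), tnorm(x[1], y[1]))

# For instance, [0.5, 0.9] ⊗ [0.8, 1.0] under the product t-norm
# is the interval [0.4, 0.9].
lo, hi = iv_tnorm(lambda a, b: a * b, (0.5, 0.9), (0.8, 1.0))
```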

Convention. Whenever we need to perform an operation between an element a ∈ [0,1] and an interval X ∈ 𝕀 (e.g., a ⊗ X), we promote the element a to the interval [a,a], so that the operation defined for intervals can be used.

2.2. (Noninterval) Fuzzy Association Rules

Let O = {o1, o2, …, on}, n > 0, be a finite set of abstract elements called objects and 𝔸 = {a1, a2, …, am}, m > 0, be a finite set of attributes. Within the association rules framework, a dataset D is a mapping that assigns to each object o ∈ O and attribute a ∈ 𝔸 a truth degree D(a,o) ∈ [0,1], which represents the intensity of assignment of the attribute a to the object o. We also define

D(X,o) = ⊗_{a∈X} D(a,o)   (1)

for each subset X ⊆ 𝔸 and a fixed t-norm ⊗.

An association rule is a formula A ⇒ C, where A ⊆ 𝔸 is an antecedent and C ⊆ 𝔸 is a consequent. It is natural to assume A ∩ C = ∅ and also |C| = 1.

As each combination of attributes in the antecedent and consequent forms a well-formed association rule, an important problem is to identify rules that are relevant to the given dataset D. There exists a large number of measures of rule relevance; an overview can be found in [29]. Perhaps the most essential association rule quality measures are the support and confidence, which may be defined as follows:

Let A ⇒ C be an association rule and D be a dataset. Then

supp(A ⇒ C) = Σ_{o∈O} D(A,o) ⊗ D(C,o),

conf(A ⇒ C) = Σ_{o∈O} D(A,o) ⊗ D(C,o) / Σ_{o∈O} D(A,o).   (2)

Example 2.

Consider the dataset D in Table 1. Here O = {o1, …, o5} are the table rows and 𝔸 = {a1, a2, a3} are the table columns. The numeric values in the table represent the membership degrees D(a,o), for a ∈ 𝔸 and o ∈ O. Let A = {a1, a2} and let ⊗ be the Gödel (minimum) t-norm. Then D(A,o1) = min{0.9, 1.0} = 0.9, D(A,o2) = 0, D(A,o3) = 0.8, etc. Let C = {a3}. The association rule A ⇒ C, i.e., {a1, a2} ⇒ {a3}, has the following support and confidence:

supp(A ⇒ C) = 0.9 + 0 + 0.8 + 0 + 0 = 1.7,

conf(A ⇒ C) = 1.7 / (0.9 + 0 + 0.8 + 0.1 + 0) ≈ 0.94.

a1 a2 a3
o1 0.9 1.0 0.9
o2 0.7 0.0 0.1
o3 0.9 0.8 1.0
o4 0.1 0.9 0.0
o5 0.0 0.0 0.2
Table 1

An exemplary noninterval fuzzy dataset D. Objects from O are represented by table rows, attributes from 𝔸 by table columns.
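The computation in Example 2 can be replicated with a few lines of Python (a sketch with our own naming, the dataset of Table 1 hard-coded):

```python
# Table 1: membership degrees D(a,o); list positions are the rows o1..o5.
D = {
    "a1": [0.9, 0.7, 0.9, 0.1, 0.0],
    "a2": [1.0, 0.0, 0.8, 0.9, 0.0],
    "a3": [0.9, 0.1, 1.0, 0.0, 0.2],
}
n = 5

def degree(attrs, i, tnorm=min):
    """D(X, o_i): combine the attribute degrees of row i with the t-norm, as in (1)."""
    result = 1.0  # neutral element of every t-norm (boundary condition T4)
    for a in attrs:
        result = tnorm(result, D[a][i])
    return result

def supp_conf(A, C, tnorm=min):
    """Support and confidence of the rule A => C as in equation (2)."""
    num = sum(tnorm(degree(A, i, tnorm), degree(C, i, tnorm)) for i in range(n))
    den = sum(degree(A, i, tnorm) for i in range(n))
    return num, num / den

s, c = supp_conf({"a1", "a2"}, {"a3"})
```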

The mining process of association rules starts with the generation of all association rules with antecedents of length one. Support measures are computed for all of the generated rules, and the rules that exceed the user-defined minimum support threshold serve as a basis for longer rules. The process relies on the fact that the support is nonincreasing with respect to the extension of the antecedent by new attributes. The monotonicity of the support is often the basis of efficient association rule mining algorithms. Rules that satisfy the minimum support condition are then subjected to a confidence evaluation. A rule with both support and confidence above the user-defined thresholds is considered interesting and is included in the results. Details of the algorithm may be found in [30].

3. ASSOCIATION RULES ON INTERVAL-VALUED FUZZY SETS

In order to extend the association rules framework to data containing interval-valued membership degrees, one has to switch the range of membership degrees from [0,1] to 𝕀. In other words, the dataset D becomes a mapping such that D(a,o) ∈ 𝕀 for each a ∈ 𝔸 and each o ∈ O. Similarly to (1), we define

D(X,o) = D(x1,o) ⊗ D(x2,o) ⊗ … ⊗ D(xk,o),

for X = {x1, x2, …, xk} ⊆ 𝔸 and a fixed t-norm ⊗ on 𝕀.

For interval-valued membership degrees, the support and confidence of an association rule cannot be computed directly as in (2). Note that the interval membership degrees represent partial knowledge: the exact membership degree is only known to lie in the given interval. Therefore, the rule's support and confidence must also be represented as intervals. Although the evaluation of the support is quite straightforward, the computation of the lower and upper bounds of the confidence is an optimization problem that involves finding the exact values of the membership degrees such that the resulting confidence is minimal or maximal. To formalize this problem mathematically, we introduce the notion of a possible world as follows.

Definition 1.

Let D be a dataset that assigns an interval of truth values to each pair (a, o) of an attribute a ∈ 𝔸 and an object o ∈ O, i.e., D(a,o) ∈ 𝕀. A possible world of the dataset D is a mapping w_D: 𝔸×O → [0,1] such that w_D(a,o) ∈ D(a,o) for each a ∈ 𝔸, o ∈ O. The class of all such mappings w_D will be denoted by W_D.

Convention. For the sake of simplicity, we omit the subscript D in w_D and W_D if D is evident from the context. Moreover, we put w(A,o) = ⊗_{a∈A} w(a,o), for any A ⊆ 𝔸.

Example 3.

Table 2 shows a dataset D with interval-valued fuzzy membership degrees, from which we can see, e.g., D(a1,o3) = [0.5, 0.9]. If we put A = {a1, a2} and let ⊗ be the Goguen (product) t-norm, we obtain D(A,o3) = [0.5·0.8, 0.9·1.0] = [0.4, 0.9], etc. As one can see, Table 1 may be viewed as a possible world of the interval-valued dataset from Table 2. Another possible world is, e.g., the w with w(a1,o1) = 0.95, w(a2,o1) = 1.0, w(a3,o1) = 0.88, w(a1,o2) = 0.0, etc.

a1 a2 a3
o1 [0.9, 1.0] [1.0, 1.0] [0.8, 0.9]
o2 [0.0, 0.7] [0.0, 0.5] [0.1, 0.1]
o3 [0.5, 0.9] [0.8, 1.0] [1.0, 1.0]
o4 [0.0, 0.1] [0.9, 0.9] [0.0, 0.0]
o5 [0.0, 0.0] [0.0, 0.5] [0.2, 1.0]
Table 2

An exemplary interval fuzzy dataset D. Objects from O are represented by table rows, attributes from 𝔸 by table columns.
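To make the later examples easy to follow, Table 2 can be represented with pairs of bounds; combining the columns a1 and a2 endpoint-wise with the product t-norm (Theorem 1) reproduces the intervals D(A,o) that appear in the subsequent examples. A sketch with our own variable names:

```python
# Table 2: interval membership degrees (lower, upper), list positions are o1..o5.
D = {
    "a1": [(0.9, 1.0), (0.0, 0.7), (0.5, 0.9), (0.0, 0.1), (0.0, 0.0)],
    "a2": [(1.0, 1.0), (0.0, 0.5), (0.8, 1.0), (0.9, 0.9), (0.0, 0.5)],
    "a3": [(0.8, 0.9), (0.1, 0.1), (1.0, 1.0), (0.0, 0.0), (0.2, 1.0)],
}

def iv_degree(attrs, i, tnorm=lambda a, b: a * b):
    """D(X, o_i) on intervals: apply the t-norm to lower and upper bounds separately."""
    lo, hi = 1.0, 1.0
    for a in attrs:
        lo, hi = tnorm(lo, D[a][i][0]), tnorm(hi, D[a][i][1])
    return lo, hi

# The antecedent intervals D({a1,a2}, o) under the product t-norm.
DA = [iv_degree(["a1", "a2"], i) for i in range(5)]
```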

In the following subsections, we generalize the quality measures of an association rule—the support and confidence—for interval-valued data. We also propose efficient algorithms for their computation and prove their correctness.

3.1. The Support

In this section, a generalization of an association rule's support for interval-valued data is provided. The subsequent theorem shows a rather straightforward way of computing it.

Definition 2.

Let D be a dataset, let A ⇒ C be a rule and let ⊗ be a t-norm. Then the support of A ⇒ C on a possible world w ∈ W is

supp^{A⇒C}(w) = Σ_{o∈O} w(A,o) ⊗ w(C,o).

The interval-valued support of A ⇒ C in the dataset D is

supp(A ⇒ C) = [supp1, supp2],

with the maximal supp1 and the minimal supp2 such that

∀w ∈ W: supp1 ≤ supp^{A⇒C}(w) ≤ supp2.

For simplicity, we omit the superscript and write only supp(w), because the rule A ⇒ C is fixed throughout the paper. Obviously, supp(w) ∈ [0, |O|].

As stated in the subsequent theorem, finding the boundaries supp1 and supp2 is quite straightforward.

Theorem 2.

Let A ⇒ C be a rule and let D be an (interval-valued) dataset. Let w1, w2 ∈ W be such that for each o ∈ O,

w1(A,o) = D̲(A,o),  w1(C,o) = D̲(C,o),
w2(A,o) = D̄(A,o),  w2(C,o) = D̄(C,o).

Then for any t-norm ⊗, supp(A ⇒ C) = [supp1, supp2], where supp1 = supp(w1) and supp2 = supp(w2).

Proof.

The proof evidently follows from the monotonicity of the t-norm ⊗.

Example 4.

Let ⊗ be the Gödel (minimum) t-norm. According to Theorem 2, the support of the rule {a1, a2} ⇒ {a3} in Table 2 can be obtained by evaluating the support of two possible worlds: the world w1 of lower bounds and the world w2 of upper bounds:

supp1 = min{0.9, 1, 0.8} + min{0, 0, 0.1} + min{0.5, 0.8, 1} + min{0, 0.9, 0} + min{0, 0, 0.2} = 1.3,
supp2 = min{1, 1, 0.9} + min{0.7, 0.5, 0.1} + min{0.9, 1, 1} + min{0.1, 0.9, 0} + min{0, 0.5, 1} = 1.9.
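Theorem 2 thus reduces the interval-valued support to two ordinary support computations, one on the world of lower bounds and one on the world of upper bounds. A sketch reproducing Example 4 (the dataset of Table 2 is hard-coded; names are ours):

```python
from functools import reduce

# Table 2 as (lower, upper) pairs; rows o1..o5, columns a1, a2, a3.
rows = [
    [(0.9, 1.0), (1.0, 1.0), (0.8, 0.9)],
    [(0.0, 0.7), (0.0, 0.5), (0.1, 0.1)],
    [(0.5, 0.9), (0.8, 1.0), (1.0, 1.0)],
    [(0.0, 0.1), (0.9, 0.9), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.5), (0.2, 1.0)],
]

def iv_support(rows, tnorm=min):
    """Interval support of {a1,a2} => {a3} via Theorem 2: evaluate the ordinary
    support on the lower-bound world w1 and on the upper-bound world w2."""
    supp1 = sum(reduce(tnorm, (cell[0] for cell in row)) for row in rows)
    supp2 = sum(reduce(tnorm, (cell[1] for cell in row)) for row in rows)
    return supp1, supp2

s1, s2 = iv_support(rows)
```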

Theorem 3.

Let A ⇒ C be a rule, A, C ⊆ 𝔸, let a ∈ 𝔸 be an attribute, a ∉ A ∪ C, and let D be a dataset. Then for an arbitrary t-norm ⊗,

supp((A ∪ {a}) ⇒ C) ≤ supp(A ⇒ C),   (3)

where ≤ is understood componentwise on the lower and upper bounds.

Proof.

Note that according to Theorem 2,

supp̲(A ⇒ C) = Σ_{o∈O} D̲(A,o) ⊗ D̲(C,o),
supp̲((A ∪ {a}) ⇒ C) = Σ_{o∈O} D̲(A,o) ⊗ D̲(a,o) ⊗ D̲(C,o).

(Analogous equations hold for the upper bounds of the support.) Since ⊗ is monotone (and hence x ⊗ y ≤ y for all x, y, because x ⊗ y ≤ 1 ⊗ y = y), we can immediately see that inequality (3) holds.

Theorem 3 states that extending a rule's antecedent with a new attribute does not increase the rule's support. As discussed in Section 2.2, this property of the support is important for association rule searching algorithms. Theorem 3 allows us to use, e.g., Agrawal's Apriori algorithm [30] to search for association rules also on interval-valued data by constraining the support with a user-defined minSupp threshold, e.g., as follows:

  • supp̄(A ⇒ C) ≥ minSupp (the upper bound exceeds the threshold);

  • supp̲(A ⇒ C) ≥ minSupp (the lower bound exceeds the threshold); or

  • ½(supp̲(A ⇒ C) + supp̄(A ⇒ C)) ≥ minSupp (the midpoint exceeds the threshold).

3.2. The Confidence

The objective of this section is to provide an algorithm for computing the association rule's confidence on interval-valued data. The problem is analyzed from the perspective of possible worlds (see Definition 1).

First, the setting of w(C,o) ∈ D(C,o) that yields the minimum or maximum confidence is discussed. After that, the setting of w(A,o) ∈ D(A,o) is analyzed separately with respect to the Gödel (minimum), Goguen (product) and Łukasiewicz t-norms, and the potential solutions from the interval D(A,o) are identified. It is shown that, for each o ∈ O, at most two suspicious points for w(A,o) exist that need to be examined in the search for the extreme confidence. (A combination of such points is called a suspicious world.) After that, an efficient algorithm (Algorithm 1) is presented that solves the underlying optimization sub-problem; it is later used to find the extreme confidence (minimum or maximum) based on the suspicious points (Algorithms 2, 3 and 4).

Algorithm 1 An algorithm for searching for F(Imin), as introduced in Lemma 11. The input arguments G, H are assumed to be real numbers, H > 0, and the input arguments g and h are assumed to be vectors of length n, g = (g1, …, gn), h = (h1, …, hn), hi > 0.

 function FracMinim(G, H, g, h)
   heap ← new empty min-heap
   for all i ∈ {1, …, n} do
    push the fraction gi/hi into heap
   end for
   while heap is not empty do
    pop a minimal element a/b from heap
    if a/b < G/H then
     G ← G + a
     H ← H + b
    else
     break the while loop
    end if
   end while
   return G/H
 end function
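Algorithm 1 can be sketched in Python with the standard heapq module (a direct transcription under our own naming, not the authors' code):

```python
import heapq

def frac_minim(G, H, g, h):
    """Greedy minimization of F(I) = (G + sum_I g_i) / (H + sum_I h_i).

    The fractions g_i/h_i are popped in ascending order; by Lemma 10, adding
    a pair with g_i/h_i < G/H strictly decreases the running fraction, and
    once the smallest remaining fraction is >= G/H no further pair can
    decrease it.  Assumes H > 0 and h_i > 0 for all i.
    """
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap:
        ratio, gi, hi = heapq.heappop(heap)
        if ratio < G / H:
            G += gi
            H += hi
        else:
            break
    return G / H
```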

Algorithm 2 The computation of conf(A ⇒ C) on the dataset D for the Goguen (product) t-norm. (−g′ denotes the element-wise negation of g′; the maximization is turned into a minimization of the negated numerator.)

 function Confidence(D, A, C)
   G, G′, H ← 0
   g, g′, h ← new empty vectors
   for all o ∈ O do
    al ← D̲(A,o), ah ← D̄(A,o)
    cl ← D̲(C,o), ch ← D̄(C,o)
    G ← G + al·cl
    G′ ← G′ + al·ch
    H ← H + al
    if al ≠ ah then
     append (ah·cl − al·cl) to g
     append (ah·ch − al·ch) to g′
     append (ah − al) to h
    end if
   end for
   lo ← FracMinim(G, H, g, h)
   hi ← −FracMinim(−G′, H, −g′, h)
   return [lo, hi]
 end function

Definition 3.

Let D be a dataset, let A ⇒ C be a rule and let w ∈ W be such that w(A,o) ≠ 0 for some o ∈ O. Let ⊗ be a t-norm. Then the confidence of A ⇒ C in a possible world w is

conf^{A⇒C}(w) = Σ_{o∈O} w(A,o) ⊗ w(C,o) / Σ_{o∈O} w(A,o).   (4)

The interval-valued confidence of A ⇒ C in the dataset D is

conf(A ⇒ C) = [conf1, conf2],

with the maximal conf1 and the minimal conf2 such that

∀w ∈ W: conf1 ≤ conf^{A⇒C}(w) ≤ conf2.

Obviously, conf^{A⇒C}(w) ∈ [0,1]. Again, we omit the superscript A ⇒ C where the rule is obvious from the context.

Algorithm 3 The computation of conf_min(A ⇒ C) on the dataset D for the Gödel (minimum) t-norm

 function ConfidenceMin(D, A, C)
   G, H, M, N ← 0
   g, h ← new empty vectors
   for all o ∈ O do
    al ← D̲(A,o), ah ← D̄(A,o)
    cl ← D̲(C,o), ch ← D̄(C,o)
    if ah ≤ cl then
     G ← G + al
     H ← H + al
         because min(al, cl) = al
    else if al ≥ cl then
     G ← G + cl
     H ← H + ah
         because min(ah, cl) = cl
    else
     G ← G + al
     H ← H + al
         because min(al, cl) = al
     if al ≠ ah then
      append (cl − al) to g
      append (ah − al) to h
     end if
    end if
    a ← closest([al, ah], ch)
    M ← M + min(a, ch)
    N ← N + a
   end for
   lo ← FracMinim(G, H, g, h)
   hi ← M/N
   return [lo, hi]
 end function

Example 5.

Let D be the dataset from Table 2 and ⊗ be the Goguen (product) t-norm. Let A = {a1, a2} and C = {a3}. Let us define three possible worlds, w1, w2, w3, by taking the lower bounds for C and the lower bounds for A (w1), the upper bounds for A (w2) and purposely mixed bounds for A (w3). Please see Table 3 and compare it with Table 2. Neither the selection of the lower bounds as in w1 nor of the upper bounds as in w2 yields the minimum confidence of the rule A ⇒ C obtainable from D; a mixed selection such as w3 yields a lower confidence than both: conf^{A⇒C}(w1) ≈ 0.86, conf^{A⇒C}(w2) ≈ 0.74, conf^{A⇒C}(w3) ≈ 0.67.

w1
w2
w3
A C A C A C
o1 0.9 0.8 1.0 0.8 1.0 0.8
o2 0.0 0.1 0.35 0.1 0.35 0.1
o3 0.4 1.0 0.9 1.0 0.4 1.0
o4 0.0 0.0 0.09 0.0 0.09 0.0
o5 0.0 0.2 0.0 0.2 0.0 0.2
Table 3

Three possible worlds for the dataset D in Table 2. w1 takes the lower bounds for A, w2 takes the upper bounds for A, and w3 mixes the bounds for A, yielding a lower confidence of the rule A ⇒ C than both w1 and w2.

As can be seen from the example, finding the boundaries conf1 and conf2 of the confidence of an association rule A ⇒ C is not as straightforward as for the support. It is an optimization problem, equivalent to finding the global extremes (minima and maxima) of function (4), i.e., to finding the worlds that yield the minimal (resp. maximal) possible confidence.

Worlds w ∈ W with optimal values of w(C,o) can be obtained according to the following lemma.

Lemma 4.

Let A ⇒ C be a rule, D be a dataset and W1, W2 ⊆ W be such that

W1 = {w | w(C,o) = D̲(C,o), ∀o ∈ O},
W2 = {w | w(C,o) = D̄(C,o), ∀o ∈ O},

and let

conf(A ⇒ C) = [conf1, conf2].

Then for any t-norm ⊗, there exist w1 ∈ W1 and w2 ∈ W2 such that conf1 = conf(w1) and conf2 = conf(w2).

Proof.

The proof evidently follows from the monotonicity of the t-norm.

Lemma 4 gives straightforward advice on how to select the values of w(C,o) for each o ∈ O in order to minimize (resp. maximize) the confidence conf(w). Now, let us sketch the way of finding the values of w(A,o) (for all o ∈ O) such that the confidence becomes extreme.

Searching for a possible world w ∈ W with globally maximal (resp. minimal) confidence conf(w) is equivalent to searching for the global extremes of a function f(x1, …, xn) with n = |O| parameters, where xi corresponds to w(A,oi), for oi ∈ O and for all i ∈ {1, …, n}. Formally, the function f is a mapping

f: D(A,o1) × … × D(A,on) → [0,1],

defined as

f(x1, …, xn) = Σ_{i=1}^n xi ⊗ w(C,oi) / Σ_{i=1}^n xi = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi,   (5)

where for each i ∈ {1, …, n}, ti: [0,1] → [0,1] is a function defined as

ti(x) = x ⊗ w(C,oi),

since the values w(C,oi) are fixed.

The extremes of function f may be found:

  • at boundaries of intervals D(A,oi), or

  • at critical points, where partial derivatives of f with respect to xi are equal to zero or do not exist.

Lemma 5.

Let the function f be defined for Σ_{i=1}^n xi ≠ 0 as

f(x1, …, xn) = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi,

where ti: [0,1] → [0,1] is, for each i ∈ {1, …, n}, a differentiable function. Then for j ∈ {1, …, n},

∂f(x1, …, xn)/∂xj = 0   if   dtj/dxj (xj) = f(x1, …, xn).   (6)

The function f(x1, …, xn) is increasing with respect to xj if

dtj/dxj (xj) > f(x1, …, xn)   (7)

and decreasing with respect to xj if

dtj/dxj (xj) < f(x1, …, xn).   (8)

Proof.

Let ∂f(x1, …, xn)/∂xj = 0, i.e.,

(dtj/dxj (xj) · Σ_{i=1}^n xi − Σ_{i=1}^n ti(xi)) / (Σ_{i=1}^n xi)² = 0,

which yields

dtj/dxj (xj) = Σ_{i=1}^n ti(xi) / Σ_{i=1}^n xi = f(x1, …, xn).

The inequalities (7) and (8) easily follow from the fact that a function f is increasing with respect to xj if ∂f(x1, …, xn)/∂xj > 0 and decreasing with respect to xj if ∂f(x1, …, xn)/∂xj < 0.

Lemma 5 provides a general framework for finding the points suspicious of being extremes for an arbitrary t-norm ⊗. Below we examine the suspicious points for the most commonly used t-norms: the Goguen product (·), the Gödel minimum (min) and the Łukasiewicz t-norm (⊗L).

Lemma 6.

(Suspicious worlds related to the product t-norm) Let A ⇒ C be a rule and D be a dataset. Let

P = {w ∈ W | ∀o ∈ O: w(A,o) ∈ {D̲(A,o), D̄(A,o)}}.

Then there exist w1, w2 ∈ P such that conf·(w1) and conf·(w2) are the lower and the upper bound of conf·(A ⇒ C), respectively.

Proof.

For w ∈ W, conf·(w) can be viewed as the function f in (5). If we put ti(xi) = xi · w(C,oi), where w(C,oi) is fixed according to Lemma 4, we obtain from equation (6) of Lemma 5 the following equality:

dtj/dxj (xj) = w(C,oj) = Σ_{i=1}^n xi · w(C,oi) / Σ_{i=1}^n xi,   j ∈ {1, …, |O|}.

Since ⊗ is simply the product, the equation above can be further simplified to

w(C,oj) · Σ_{i=1}^n xi = Σ_{i=1}^n xi · w(C,oi),

w(C,oj) · xj + w(C,oj) · Σ_{i≠j} xi = xj · w(C,oj) + Σ_{i≠j} xi · w(C,oi),

w(C,oj) = Σ_{i≠j} xi · w(C,oi) / Σ_{i≠j} xi.   (9)

So, the partial derivative ∂f/∂xj is equal to zero for all xj ∈ D(A,oj) if (9) holds. In other words, either the local extreme does not exist (if (9) is not fulfilled) or the extreme is at any point xj ∈ D(A,oj) (if (9) holds). This means that the function is monotone with respect to xj. In each case, the global extreme of conf·(w) is thus reached if w(A,oj) is equal to either the lower or the upper bound of the interval D(A,oj), i.e., D̲(A,oj) or D̄(A,oj).

Example 6.

Consider the data from Example 3 and Table 2. Naturally, there exists an infinite number of possible worlds represented by the interval-valued data. However, if we are searching for the worlds with minimum or maximum confidence of the rule A ⇒ C, A = {a1, a2}, C = {a3}, and the product t-norm is used for the computations, there exists, according to Lemmas 4 and 6, a finite set P of possible worlds, which contains both extreme worlds. Any combination of the values of the suspicious points from Table 4 is a possible world from the set P. As can be seen, |P| = 64.

Lower conf.
Upper conf.
A C A C
o1 {0.9,1.0} {0.8} {0.9,1.0} {0.9}
o2 {0.0,0.35} {0.1} {0.0,0.35} {0.1}
o3 {0.4,0.9} {1.0} {0.4,0.9} {1.0}
o4 {0.0,0.09} {0.0} {0.0,0.09} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 4

Sets of suspicious points for rule AC from Example 3, if the product t-norm is being used.
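Because the set P of suspicious worlds is finite, the interval confidence can be cross-checked by brute force: enumerate every combination of interval endpoints for w(A,o) and w(C,o) and take the extreme ratios. A sketch for the rule of this example, with the product-combined intervals D(A,o) from Example 3 and the intervals D(C,o) hard-coded; this is only a check, while Algorithm 2 does the same work in O(n log n) time via FracMinim:

```python
from itertools import product

# Intervals D(A,o) (a1 and a2 combined with the product t-norm) and D(C,o).
DA = [(0.9, 1.0), (0.0, 0.35), (0.4, 0.9), (0.0, 0.09), (0.0, 0.0)]
DC = [(0.8, 0.9), (0.1, 0.1), (1.0, 1.0), (0.0, 0.0), (0.2, 1.0)]

def brute_force_conf(DA, DC):
    """Min/max of conf(w) = sum(a*c) / sum(a) over all endpoint combinations."""
    confs = []
    for avals in product(*DA):       # pick a lower or upper bound per object
        for cvals in product(*DC):
            denom = sum(avals)
            if denom > 0:
                confs.append(sum(a * c for a, c in zip(avals, cvals)) / denom)
    return min(confs), max(confs)

lo, hi = brute_force_conf(DA, DC)
```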

Definition 4.

(Closest element) The function closest: 𝕀 × [0,1] → [0,1] is defined as

closest(X, y) = X̄ if X̄ < y;  X̲ if X̲ > y;  y otherwise,

for X ∈ 𝕀 and y ∈ [0,1].

In other words, closest(X, y) finds the element of the interval X that is closest to y.

Lemma 7.

(Suspicious worlds related to the minimum t-norm) Let A ⇒ C be a rule and D be a dataset. Let

M = {w | ∀o ∈ O: w(A,o) ∈ αo},

where

αo = {D̲(A,o)} if D̄(A,o) ≤ D̲(C,o);  {D̄(A,o)} if D̲(A,o) ≥ D̲(C,o);  {D̲(A,o), D̄(A,o)} otherwise.

Then there exists w1 ∈ M such that conf_min(w1) is the lower bound of conf_min(A ⇒ C).

Let w2 ∈ W be such that

w2(A,o) = closest(D(A,o), D̄(C,o)).   (10)

Then conf_min(w2) is the upper bound of conf_min(A ⇒ C).

Proof.

For the confidence based on the minimum t-norm, we can proceed similarly as in the previous Lemma 6. In this case, we put ti(xi) = min(xi, w(C,oi)), with w(C,oi) again fixed according to Lemma 4. From (6) of Lemma 5 we obtain

dtj/dxj (xj) = Σ_{i=1}^n min(xi, w(C,oi)) / Σ_{i=1}^n xi,   j ∈ {1, …, |O|}.

The derivative of min(xj, w(C,oj)) does not exist for xj = w(C,oj). The derivative equals 1 for xj < w(C,oj), and 0 for xj > w(C,oj).

We can deduce from the inequalities (7) and (8) that the confidence is increasing with respect to xj for xj ≤ w(C,oj) and decreasing with respect to xj for xj ≥ w(C,oj). From that, we can deduce the following values of w(A,o):

  • if D̄(A,o) ≤ D̲(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global minimum;

  • if D̲(A,o) ≥ D̲(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global minimum;

  • otherwise, the global minimum is reached either for D̲(A,o) or D̄(A,o).

Similarly,

  • if D̄(A,o) ≤ D̄(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global maximum;

  • if D̲(A,o) ≥ D̄(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global maximum;

  • otherwise, the global maximum is reached for D̄(C,o).

Example 7.

Consider the data from Example 3 and Table 2 and the minimum t-norm. According to Lemmas 4 and 7, there are only two possible worlds that are candidates for the lower bound of the confidence of the rule A ⇒ C. The upper bound of the confidence is given by the world w2; see Table 5.

M
w2
A C A C
o1 {1.0} {0.8} {0.9} {0.9}
o2 {0.0,0.35} {0.1} {0.1} {0.1}
o3 {0.4} {1.0} {0.9} {1.0}
o4 {0.09} {0.0} {0.0} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 5

Sets of suspicious points for rule AC from Example 3, if the minimum t-norm is being used.
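The upper-bound world w2 of Table 5 can be reproduced directly from Lemma 7: take w2(A,o) = closest(D(A,o), D̄(C,o)) and evaluate the confidence with the minimum t-norm. A sketch (the product-combined intervals D(A,o) from Example 3 are hard-coded; names are ours):

```python
def closest(interval, y):
    """Definition 4: the element of the closed interval nearest to y."""
    lo, hi = interval
    if hi < y:
        return hi
    if lo > y:
        return lo
    return y

DA = [(0.9, 1.0), (0.0, 0.35), (0.4, 0.9), (0.0, 0.09), (0.0, 0.0)]
c_upper = [0.9, 0.1, 1.0, 0.0, 1.0]   # upper bounds of D(C,o)

# w2(A,o) per equation (10), then the min-t-norm confidence on that world.
a_w2 = [closest(iv, c) for iv, c in zip(DA, c_upper)]
conf_upper = sum(min(a, c) for a, c in zip(a_w2, c_upper)) / sum(a_w2)
```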

Lemma 8.

(Suspicious worlds related to the Łukasiewicz t-norm) Let A ⇒ C be a rule and D be a dataset. Let w1 ∈ W be such that

w1(A,o) = closest(D(A,o), 1 − D̲(C,o)).

Then conf_L(w1) is the lower bound of conf_L(A ⇒ C). Let

L = {w | ∀o ∈ O: w(A,o) ∈ αo},

where

αo = {D̲(A,o)} if D̄(A,o) ≤ 1 − D̄(C,o);  {D̄(A,o)} if D̲(A,o) ≥ 1 − D̄(C,o);  {D̲(A,o), D̄(A,o)} otherwise.

Then there exists w2 ∈ L such that conf_L(w2) is the upper bound of conf_L(A ⇒ C).

Proof.

By fixing w(C,oi) according to Lemma 4 and by putting ti(xi) = xi ⊗L w(C,oi) = max(0, xi + w(C,oi) − 1), we obtain from (6) of Lemma 5, for all j ∈ {1, …, |O|}:

dtj/dxj (xj) = d max(0, xj + w(C,oj) − 1)/dxj = Σ_{i=1}^n max(0, xi + w(C,oi) − 1) / Σ_{i=1}^n xi.

The derivative of max(0, xj + w(C,oj) − 1) does not exist for xj = 1 − w(C,oj). The derivative equals 0 for xj < 1 − w(C,oj), and 1 for xj > 1 − w(C,oj).

We can deduce from the inequalities (7) and (8) that the confidence is decreasing with respect to xj for xj ≤ 1 − w(C,oj) and increasing with respect to xj for xj ≥ 1 − w(C,oj). From that, we can deduce the following values of w(A,o):

  • if D̄(A,o) ≤ 1 − D̲(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global minimum;

  • if D̲(A,o) ≥ 1 − D̲(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global minimum;

  • otherwise, the global minimum is reached for 1 − D̲(C,o).

Similarly,

  • if D̄(A,o) ≤ 1 − D̄(C,o), the confidence is decreasing on the whole interval D(A,o). Therefore, D̲(A,o) is the point of the global maximum;

  • if D̲(A,o) ≥ 1 − D̄(C,o), the confidence is increasing on the whole interval D(A,o). Therefore, D̄(A,o) is the point of the global maximum;

  • otherwise, the global maximum is reached either for D̲(A,o) or D̄(A,o).

Example 8.

Consider the data from Example 3 and Table 2 and the Łukasiewicz t-norm. According to Lemmas 4 and 8, there exists a single possible world that is the candidate for the lower bound, and a different single possible world that is the candidate for the upper bound of the confidence of the rule A ⇒ C; see Table 6.

w1
L
A C A C
o1 {0.9} {0.8} {1.0} {0.9}
o2 {0.35} {0.1} {0.0} {0.1}
o3 {0.4} {1.0} {0.9} {1.0}
o4 {0.09} {0.0} {0.0} {0.0}
o5 {0.0} {0.2} {0.0} {1.0}
Table 6

Sets of suspicious points for rule AC from Example 3, if the Łukasiewicz t-norm is being used.

Lemmas 6–8 identify, for each object o ∈ O, the points that are suspected to be global extremes. Recall that, for each examined t-norm, there are at most two suspicious alternative values that could be extreme for each oi ∈ O = {o1, …, on}. The core of the problem of finding the global extremes lies in selecting the correct value from these alternatives. The subsequent lemma transforms the problem into searching for a set of indices I ⊆ {1, …, n′} of the objects oi that pick the greater suspicious value.

Lemma 9.

Let O = {o1, …, on′, …, on}, n′ ≤ n, be a set of objects, ⊗ be a t-norm, A ⇒ C be a rule and D be a dataset. Let us have a set R of possible worlds,

R = {w | ∀oi ∈ O: w(A,oi) ∈ αi ∧ w(C,oi) = ci},

where

αi = {ai, a′i} for i ∈ {1, …, n′},  αi = {ai} for i ∈ {n′+1, …, n},

ai, a′i ∈ D(A,oi), ai < a′i, and ci ∈ D(C,oi). Then

min{conf(w) | w ∈ R} = F(Imin),  max{conf(w) | w ∈ R} = F(Imax),

where

F(I) = (G + Σ_{i∈I} gi) / (H + Σ_{i∈I} hi),   I ⊆ {1, …, n′},

and where

G = Σ_{i=1}^n ai ⊗ ci,  H = Σ_{i=1}^n ai,  gi = a′i ⊗ ci − ai ⊗ ci,  hi = a′i − ai,

and Imin, Imax ⊆ {1, …, n′} are such that for all I ⊆ {1, …, n′},

F(Imin) ≤ F(I) ≤ F(Imax).

Proof.

Let w ∈ R and let I = {i | w(A,o_i) = a′_i}. Then

conf(w) = (Σ_{o_i∈O} w(A,o_i) ⊗ w(C,o_i)) / (Σ_{o_i∈O} w(A,o_i))
        = (Σ_{i∈I} (a′_i ⊗ c_i) + Σ_{i∉I} (a_i ⊗ c_i)) / (Σ_{i∈I} a′_i + Σ_{i∉I} a_i)
        = (Σ_{i=1}^{n} (a_i ⊗ c_i) + Σ_{i∈I} ((a′_i ⊗ c_i) − (a_i ⊗ c_i))) / (Σ_{i=1}^{n} a_i + Σ_{i∈I} (a′_i − a_i))
        = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i) = F(I).
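Lemma 9 turns the extremization of the confidence into the extremization of F(I) over index sets. On small instances, the reduction can be sanity-checked by exhaustively enumerating all subsets I — an illustrative (exponential) Python sketch, not one of the proposed algorithms:

```python
from itertools import combinations

def f_extremes(G, H, g, h):
    # Enumerate F(I) = (G + sum of g[i]) / (H + sum of h[i]) over all
    # I subset of {0, ..., n'-1} and return the pair (minimum, maximum).
    n = len(g)
    best_min = best_max = G / H  # I = empty set
    for r in range(1, n + 1):
        for I in combinations(range(n), r):
            v = (G + sum(g[i] for i in I)) / (H + sum(h[i] for i in I))
            best_min = min(best_min, v)
            best_max = max(best_max, v)
    return best_min, best_max
```

For example, with G = 2, H = 4, g = (−1, 3), h = (2, 1), the four subsets yield the values 1/2, 1/6, 1 and 4/7, so the extremes are F(I_min) = 1/6 and F(I_max) = 1.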

Lemma 10.

Let G, H, g, h ∈ ℝ, H, h > 0. If g/h < G/H then

g/h < (G + g)/(H + h) < G/H.

Proof.

From g/h < G/H and h > 0 we get G > gH/h. By substituting this bound on G into (G + g)/(H + h), we obtain

(gH/h + g)/(H + h) < (G + g)/(H + h),
(g(H + h)/h)/(H + h) < (G + g)/(H + h),

which yields g/h < (G + g)/(H + h). The second inequality can be proven analogously.

Lemma 11.

Let G, H ∈ ℝ, H > 0, let g_i, h_i ∈ ℝ, h_i > 0, for all i ∈ {1, …, n}, and let

F(I) = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n}. (11)

Moreover, let I_min, I_max ⊆ {1, …, n} be such that for all I ⊆ {1, …, n},

F(I_min) ≤ F(I) ≤ F(I_max). (12)

Then g_j/h_j < F(I_min) implies j ∈ I_min, and g_j/h_j > F(I_min) implies j ∉ I_min. Similarly, g_j/h_j > F(I_max) implies j ∈ I_max, and g_j/h_j < F(I_max) implies j ∉ I_max.

Proof.

Let us prove only the first implication, i.e., that g_j/h_j < F(I_min) implies j ∈ I_min. (The other implications can be proven analogously.) Assume to the contrary that g_j/h_j < F(I_min) and j ∉ I_min. Let G′ = G + Σ_{i∈I_min} g_i and H′ = H + Σ_{i∈I_min} h_i. Then F(I_min) = G′/H′ > g_j/h_j. According to Lemma 10, it is evident that

F(I_min) = G′/H′ > (G′ + g_j)/(H′ + h_j) = F(I_min ∪ {j}),

which yields that F(I_min) is not minimal, which is a contradiction.

Next, we present the FracMinim algorithm, which searches for F(I_min) (see Algorithm 1). The algorithm uses a so-called minimum heap (min-heap), a data structure that allows us to efficiently obtain the smallest element of a set of stored values. The algorithm first puts all fractions g_i/h_i into the heap and then iteratively pops the elements in ascending order, starting from the smallest. Such an approach is functionally equivalent to iterating through the collection of values sorted in ascending order by g_i/h_i. The benefit of using a heap instead of a sorted collection is a better time complexity: the elements that are never popped from the heap never need to be fully ordered.
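Based on this description and on Lemmas 10–12, the core of FracMinim can be sketched in Python with the standard `heapq` module (an illustrative sketch, not the authors' reference implementation):

```python
import heapq

def frac_minim(G, H, g, h):
    """Minimize (G + sum of selected g[i]) / (H + sum of selected h[i]).

    Fractions g[i]/h[i] are popped from a min-heap in ascending order;
    by Lemma 10, merging a fraction smaller than the current ratio G/H
    strictly decreases the ratio, so the greedy loop stops as soon as
    the smallest remaining fraction is no longer below G/H.
    """
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H
```

For example, frac_minim(2, 4, [-1, 3], [2, 1]) merges only the fraction −1/2 and returns (2 − 1)/(4 + 2) = 1/6, the minimum of F(I) over all four subsets.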

Lemma 12.

Let G, H ∈ ℝ, H > 0, and let g, h be vectors of real numbers, i.e., g = (g_1, …, g_n) and h = (h_1, …, h_n), where g_i, h_i ∈ ℝ and h_i > 0 for all i ∈ {1, …, n}. Then the call FracMinim(G, H, g, h), as defined in Algorithm 1, returns F(I_min) as in (12).

Proof.

We are going to prove this lemma inductively by proving the simultaneous validity of the following statements:

  (a) if g_i/h_i ≥ G/H for all i ∈ {1, …, n} then FracMinim(G, H, g, h) = F(I_min), where g = (g_1, …, g_n) and h = (h_1, …, h_n);

  (b) let j ∈ {1, …, n} be such that g_j/h_j ≤ g_i/h_i for all i ∈ {1, …, n}; if g_j/h_j < G/H then FracMinim(G, H, g, h) = FracMinim(G′, H′, g′, h′), where G′ = G + g_j, H′ = H + h_j, g′ = (g_1, …, g_{j−1}, g_{j+1}, …, g_n), h′ = (h_1, …, h_{j−1}, h_{j+1}, …, h_n).

Ad (a): Let g_i/h_i ≥ G/H for all i ∈ {1, …, n} (possibly with n = 0); then Algorithm 1 obviously returns G/H. It remains to prove that G/H is the correct value of F(I_min). Assume to the contrary that F(I_min) < G/H. Then from the definition of F(I) in (11) it obviously follows that I_min ≠ ∅. On the other hand, it follows from the assumption g_i/h_i ≥ G/H > F(I_min) and from Lemma 11 that i ∉ I_min, for all i ∈ {1, …, n}, which is in contradiction with I_min ≠ ∅.

Ad (b): Without loss of generality, let us assume the vectors g and h to be indexed according to the ascending order of the values g_i/h_i, i.e., g_1/h_1 ≤ g_2/h_2 ≤ … ≤ g_n/h_n. Now, let j ∈ {1, …, n} be such that g_j/h_j ≤ g_i/h_i for all i ∈ {1, …, n} and let k = |{i | g_i/h_i = g_j/h_j}|, i.e., k is the number of equally smallest fractions. If k = 1 then obviously also j = 1, and the calls FracMinim(G, H, g, h) and FracMinim(G′, H′, g′, h′) evidently return the same result. If k > 1 then j ∈ {1, …, k} and it is clear that the call FracMinim(G, H, g, h) will process at least the first k elements of the vectors g and h, i.e., after k steps of the "while" loop, the variables G and H become incremented by Σ_{i=1}^{k} g_i and Σ_{i=1}^{k} h_i, respectively. This is due to the fact that g_j/h_j < G/H implies, according to Lemma 10, that g_j/h_j < (G + g_j)/(H + h_j). Hence, the "if" condition inside the "while" loop remains true for all of the first k fractions in the heap. Then, obviously, the call FracMinim(G′, H′, g′, h′) will surely process the first k − 1 fractions and thus increment G′ and H′ by Σ_{i=1}^{k−1} g′_i = Σ_{i=1, i≠j}^{k} g_i and Σ_{i=1}^{k−1} h′_i = Σ_{i=1, i≠j}^{k} h_i, respectively. This proves FracMinim(G, H, g, h) = FracMinim(G′, H′, g′, h′).

Corollary 13.

Let G, H ∈ ℝ, H > 0, let g = (g_1, …, g_n), h = (h_1, …, h_n), where g_i, h_i ∈ ℝ, h_i > 0, for all i ∈ {1, …, n}. Then the call of the function FracMinim(−G, H, −g, h), as defined in Algorithm 1, returns −F(I_max) for F(I) as in (11) and I_max as in (12).

Proof.

The call of FracMinim(−G, H, −g, h) searches for F′(I′_min), where

F′(I) = (−G + Σ_{i∈I} (−g_i)) / (H + Σ_{i∈I} h_i) = −(G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i) = −F(I),

for F(I) as in (11) and where for all I ⊆ {1, …, n},

F′(I′_min) ≤ F′(I) = −F(I).

Hence, −F′(I′_min) is an upper bound of F(I) that is attained, and therefore, F′(I′_min) = −F(I_max).
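Corollary 13 gives a recipe for maximization by negation. Assuming a `frac_minim` implementation of Algorithm 1 (a minimal version is inlined below so that the sketch is self-contained), a hypothetical `frac_maxim` becomes a one-liner:

```python
import heapq

def frac_minim(G, H, g, h):
    # Greedy minimizer of F(I), as in Algorithm 1 (Lemma 12).
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H

def frac_maxim(G, H, g, h):
    # Corollary 13: the maximum of F(I) equals the negated minimum of
    # -F(I), obtained by negating G and every g[i] while keeping H, h.
    return -frac_minim(-G, H, [-gi for gi in g], h)
```

For instance, frac_maxim(2, 4, [-1, 3], [2, 1]) returns (2 + 3)/(4 + 1) = 1, the maximum of F(I) over all subsets.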

Theorem 14.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence(D,A,C), as defined in Algorithm 2, returns conf(A ⇒ C).

Proof.

Algorithm 2 prepares arguments for the calls of FracMinim. Namely, the variables G, G′, H and the vectors g, g′, h are initialized as follows (where O = {o_1, …, o_n}):

G = Σ_{i=1}^{n} D̲(A,o_i) · D̲(C,o_i), (13)

G′ = Σ_{i=1}^{n} D̲(A,o_i) · D̄(C,o_i), (14)

H = Σ_{i=1}^{n} D̲(A,o_i), (15)

g = (g_i)_{i=1, g_i≠0}^{n}, where g_i = D̄(A,o_i) · D̲(C,o_i) − D̲(A,o_i) · D̲(C,o_i), (16)

g′ = (g′_i)_{i=1, g′_i≠0}^{n}, where g′_i = D̄(A,o_i) · D̄(C,o_i) − D̲(A,o_i) · D̄(C,o_i), (17)

h = (h_i)_{i=1, h_i≠0}^{n}, where h_i = D̄(A,o_i) − D̲(A,o_i). (18)

Let n′ = |g| = |g′| = |h|. Note that n′ ≤ n because the vectors do not contain zero elements, which correspond to objects o_i such that D(A,o_i) is a single-element interval.

First of all, let us prove that the algorithm computes the correct lower bound lo of the confidence. According to Lemma 4, the lower bound is obtained for w(C,o) = D̲(C,o) (for each o ∈ O). Regarding w(A,o), the lower bound of the confidence is obtained for w(A,o) ∈ {D̲(A,o), D̄(A,o)} (for each o ∈ O), which follows from Lemma 6. The whole problem of finding the lower bound thus consists in finding the proper combination of the values w(A,o) for all o.

In other words, we have a set of possible worlds

R = {w | ∀o_i ∈ O: w(A,o_i) ∈ {D̲(A,o_i), D̄(A,o_i)} ∧ w(C,o_i) = D̲(C,o_i)},

from which we have to select the one that yields the minimum confidence. According to Lemma 9, finding the minimum confidence among the worlds w ∈ R is equivalent to finding F(I_min), where

F(I) = (G + Σ_{i∈I} g_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n′},

with G, H from (13), (15) and g = (g_i), h = (h_i) from (16), (18). The minimum confidence is obtained by calling FracMinim(G, H, g, h), as proven in Lemma 12.

Similarly, finding the upper bound of the confidence is equivalent to finding F′(I_max), where

F′(I) = (G′ + Σ_{i∈I} g′_i) / (H + Σ_{i∈I} h_i),  I ⊆ {1, …, n′},

with G′, H from (14), (15) and g′ = (g′_i), h = (h_i) from (17), (18). According to Corollary 13, the maximum confidence is obtained as −FracMinim(−G′, H, −g′, h).
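On toy data, the bounds produced by this construction can be cross-checked against a brute-force search over the endpoint worlds of Lemma 6. The following Python sketch (hypothetical helper name, exponential in n, for illustration only) computes the product t-norm bounds directly from the definition, assuming at least one nonzero lower bound of the antecedent so that no division by zero occurs:

```python
from itertools import product

def conf_bounds_goguen_brute(A_lo, A_hi, C_lo, C_hi):
    # Brute-force confidence bounds for the Goguen (product) t-norm.
    # Lemma 4 fixes the consequent at its lower (resp. upper) bound;
    # Lemma 6 restricts suspicious antecedent values to the endpoints.
    worlds = list(product(*zip(A_lo, A_hi)))
    lo = min(sum(a * c for a, c in zip(w, C_lo)) / sum(w) for w in worlds)
    hi = max(sum(a * c for a, c in zip(w, C_hi)) / sum(w) for w in worlds)
    return lo, hi
```

An O(n log n) implementation following Theorem 14 must return the same pair of bounds on any such instance.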

Theorem 15.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence_min(D,A,C), as defined in Algorithm 3, returns conf_min(A ⇒ C).

Proof.

The proof is analogous to the proof of Theorem 14. It is again based on the search for a possible world w ∈ W that yields a globally minimal or maximal confidence. Again, the lower or upper bound of the confidence is reached for w(C,o) equal to D̲(C,o) or D̄(C,o), respectively; see Lemma 4.

Let us prove the correctness of the computation of the lower bound lo. Let

S = {i | D̄(A,o_i) ≤ D̲(C,o_i)},
T = {i | D̲(A,o_i) ≥ D̲(C,o_i)},
U = {i | D̲(A,o_i) < D̲(C,o_i) < D̄(A,o_i)}.

Note that S ∪ T ∪ U = {1, …, n} and that S, T, U are pairwise disjoint. The algorithm initializes the variables G, H, g, h as follows:

G = Σ_{i∈S∪U} D̲(A,o_i) + Σ_{i∈T} D̲(C,o_i),
H = Σ_{i∈S∪U} D̲(A,o_i) + Σ_{i∈T} D̄(A,o_i),
g = (g_i)_{i∈U, g_i≠0}, where g_i = D̲(C,o_i) − D̲(A,o_i),
h = (h_i)_{i∈U, h_i≠0}, where h_i = D̄(A,o_i) − D̲(A,o_i).

Note that G, g are equivalent to

G = Σ_{i=1}^{n} min(D̲(A,o_i), D̲(C,o_i)),
g = (g_i)_{i∈U, g_i≠0}, where g_i = min(D̄(A,o_i), D̲(C,o_i)) − min(D̲(A,o_i), D̲(C,o_i)).

It is now easy to see that G, H, g, h are equivalent to G, H, g, h from Lemma 9, with a_i, a′_i set as in Lemma 7. All of this, together with Lemma 11 and Lemma 12, leads to the fact that the call FracMinim(G, H, g, h) indeed returns the lower bound of the confidence.

The correctness of the upper bound hi follows easily from (10) in Lemma 7.
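The S/T/U case analysis translates directly into the initialization phase of Algorithm 3. A possible Python sketch of that phase (variable names are illustrative, following the proof) is:

```python
def init_lower_min_tnorm(A_lo, A_hi, C_lo):
    """Build G, H, g, h of Lemma 9 for the lower confidence bound
    under the Gödel (minimum) t-norm, per the S/T/U partition:
      S: whole antecedent interval below the consequent lower bound,
      T: whole antecedent interval above it,
      U: the consequent lower bound lies strictly inside the interval.
    """
    G = H = 0.0
    g, h = [], []
    for al, ah, cl in zip(A_lo, A_hi, C_lo):
        if ah <= cl:          # i in S: min(a, cl) = a on the interval
            G += al
            H += al
        elif al >= cl:        # i in T: min(a, cl) = cl on the interval
            G += cl
            H += ah
        else:                 # i in U: two suspicious endpoints al, ah
            G += al
            H += al
            g.append(cl - al)
            h.append(ah - al)
    return G, H, g, h
```

The returned quadruple is then passed to FracMinim to obtain the lower bound.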

Theorem 16.

Let D be a dataset and A ⇒ C be an association rule. Then the call of the function Confidence_L(D,A,C), as defined in Algorithm 4, returns conf_L(A ⇒ C).

Proof.

The proof is closely analogous to the proofs of Theorems 14 and 15. It follows from Lemmas 4, 8, 9 and 12 and from Corollary 13.

Theorem 17.

The time complexity of FracMinim, Confidence, Confidence_min and Confidence_L is in O(n log n), for n = |O|.

Proof.

First of all, let us focus on the time complexity of FracMinim. It is known that pushing an element into a heap, as well as popping the minimal element from it, is in O(log n). FracMinim performs this step n times, which results in O(n log n). After that, the elements are removed from the heap sequentially, which is again in O(n log n). Hence, the whole of FracMinim is in O(n log n).

Regarding Algorithms 2–4, the initialization steps can be executed with time complexity in O(n). The subsequent calls of FracMinim are in O(n log n); hence, the whole algorithms are in O(n log n), too.

4. EXPERIMENT

In this section, a practical experiment is conducted to show the efficiency of the algorithms in realistic settings. As many other papers have already addressed the usefulness of association analysis, we limit ourselves to the computation of the confidence, which is the main topic of this paper.

To evaluate the performance, we randomly generated (from the uniform distribution) four vectors of equal size, which represent the lower and upper bounds of the antecedent A and the consequent C. Three algorithms were then executed to compute the confidence of A ⇒ C based on the Gödel, Goguen and Łukasiewicz t-norms. Since we are not aware of any other algorithm for the computation of the confidence on interval data, we compared them to the algorithm that computes the confidence on noninterval data, which simply evaluates equation (2). Naturally, the noninterval algorithm is much faster because it has a linear time complexity, O(n). The results can be found in Table 7.

Data Size Non-int G P L
400 0.00 0.07 0.19 0.03
800 0.00 0.05 0.35 0.07
1600 0.00 0.09 0.70 0.12
3200 0.01 0.18 1.49 0.24
6400 0.01 0.35 2.93 0.48
12800 0.01 0.37 3.21 0.53
25600 0.02 0.71 6.93 1.02
51200 0.12 1.52 14.83 2.24
102400 0.09 2.84 29.60 4.31
204800 0.21 5.74 65.66 9.13
409600 0.63 12.12 144.46 18.59
819200 1.39 25.11 322.94 38.27
1638400 2.17 52.72 850.75 81.70
3276800 4.44 108.16 1769.73 167.95
6553600 8.60 226.69 4073.44 365.52
13107200 17.43 482.65 9182.41 795.08
Table 7

Experimental running times (in milliseconds) of computations of confidence based on Gödel t-norm (G), Goguen product t-norm (P) and Łukasiewicz t-norm (L). A noninterval confidence computation (non-int) is also provided for comparison.

As can be seen, the algorithm for the Gödel t-norm is about 32 times slower than the noninterval computation, the Łukasiewicz variant about 43 times slower, and the confidence based on the Goguen t-norm is computed on average 319 times slower than the noninterval variant. Nevertheless, the results are satisfactory: the computation of the interval-valued confidence, although much more complicated than the noninterval variant, still costs less than a second for both the Gödel and the Łukasiewicz t-norms, even for data with more than 13 million records, each of them interval-valued, which will often not be the case in real applications.

The single-threaded experiment was performed on an Intel(R) Core(TM) i5-3570 CPU at 3.40 GHz.

Algorithm 4 Computation of conf_L(A ⇒ C) on the dataset D

 function Confidence_L(D, A, C)
   G, H, M, N ← 0
   g, h ← new empty vectors
   for all o ∈ O do
     a_l ← D̲(A,o), a_h ← D̄(A,o)
     c_l ← D̲(C,o), c_h ← D̄(C,o)
     a ← closest([a_l, a_h], 1 − c_l)
     M ← M + max(0, a + c_l − 1)
     N ← N + a
     if a_h ≤ 1 − c_h then
       H ← H + a_l
     else if a_l ≥ 1 − c_h then
       G ← G + a_h + c_h − 1
       H ← H + a_h
     else
       H ← H + a_l
       if a_l ≠ a_h then
         append (a_h + c_h − 1) to g
         append (a_h − a_l) to h
       end if
     end if
   end for
   lo ← M/N
   hi ← −FracMinim(−G, H, −g, h)
   return [lo, hi]
 end function
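Algorithm 4 admits a direct transcription into Python. The sketch below bundles it with a minimal `frac_minim` standing in for Algorithm 1; it is an illustrative translation rather than the authors' reference code, and it assumes that the denominators stay positive (at least one nonzero antecedent lower bound):

```python
import heapq

def frac_minim(G, H, g, h):
    # Algorithm 1: greedy minimization of (G + sum g)/(H + sum h).
    heap = [(gi / hi, gi, hi) for gi, hi in zip(g, h)]
    heapq.heapify(heap)
    while heap and heap[0][0] < G / H:
        _, gi, hi = heapq.heappop(heap)
        G += gi
        H += hi
    return G / H

def confidence_lukasiewicz(A_lo, A_hi, C_lo, C_hi):
    """Interval bounds [lo, hi] of the confidence under the Łukasiewicz
    t-norm a ⊗ c = max(0, a + c - 1), following Algorithm 4."""
    G = H = M = N = 0.0
    g, h = [], []
    for al, ah, cl, ch in zip(A_lo, A_hi, C_lo, C_hi):
        a = min(max(1 - cl, al), ah)   # closest([al, ah], 1 - cl)
        M += max(0.0, a + cl - 1)      # numerator of the lower bound
        N += a                         # denominator of the lower bound
        if ah <= 1 - ch:               # conjunction is 0 on the interval
            H += al
        elif al >= 1 - ch:             # conjunction is a + ch - 1
            G += ah + ch - 1
            H += ah
        else:                          # two suspicious endpoints
            H += al
            if al != ah:
                g.append(ah + ch - 1)
                h.append(ah - al)
    lo = M / N
    hi = -frac_minim(-G, H, [-gi for gi in g], h)
    return lo, hi
```

For a single object with D(A,o) = [0.4, 0.9] and D(C,o) = [0.7, 0.8], the call returns lo = 0.1/0.4 = 0.25 and hi = 0.7/0.9, matching a direct extremization of max(0, a + c − 1)/a over the intervals.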

5. CONCLUSION

In this paper, efficient algorithms were presented that allow the computation of the support and confidence of an association rule from interval-valued fuzzy data. Interval values may be used to model partial knowledge, i.e., values that are unknown or measured with a limited precision. The proposed algorithms compute the minimal and maximal possible support and confidence based on the assumption that the nonprecise data are independent. It was shown that the computation of the confidence on interval-valued data is more complex than the computation on noninterval data: the time complexity of the interval-valued confidence is in O(n log n), whereas the time complexity of the noninterval confidence is in O(n). However, the most time-consuming part of the association analysis is the search for rules with sufficiently high support, whose complexity remains unchanged (as discussed in Theorem 3).

The computation of the minimal and maximal confidence is an optimization problem that depends on the applied t-norm. The algorithms presented here allow computation with the Gödel (minimum), Goguen (product) and Łukasiewicz t-norms. However, several results are applicable to an arbitrary t-norm; see Lemmas 4 and 5. Moreover, the developed algorithms are directly applicable to any t-norm for which it can be proven that there are at most two suspicious values per data row (see Lemma 5). The investigation of other t-norms or families of t-norms is left for future work.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

AUTHORS' CONTRIBUTIONS

All authors contributed to the work presented in this paper. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS

The authors acknowledge support by the project "LQ1602 IT4Innovations excellence in science" and by GAČR 16-19170S.

REFERENCES

2. P. Hájek, The problem of a general conception of the GUHA method, Kybernetika, Vol. 4, 1968, pp. 505-515. https://dml.cz//handle/10338.dmlcz/125459
4. R. Agrawal and R. Srikant, Fast algorithms for mining association rules, in Proceedings of the 20th International Conference on Very Large Databases (Santiago de Chile, Chile), AAAI Press, 1994, pp. 487-499. https://dl.acm.org/doi/proceedings/10.5555/645920
6. J. Kacprzyk, R.R. Yager, and S. Zadrożny, A fuzzy logic based approach to linguistic summaries of databases, Int. J. Appl. Math. Comput. Sci., Vol. 10, 2000, pp. 813-834. https://www.amcs.uz.zgora.pl/?action=paper&paper=930
7. M. Ralbovský, Fuzzy GUHA, PhD Thesis, University of Economics, Prague, Czech Republic, 2009. https://insis.vse.cz/zp/14956/podrobnosti
12. J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1993. https://dl.acm.org/doi/book/10.5555/152181
13. J. Łukasiewicz, O logice trójwartościowej, Ruch Filozoficzny, Vol. 5, 1920, pp. 170-171. https://philpapers.org/rec/UKAOLT-2
