International Journal of Computational Intelligence Systems

Volume 12, Issue 2, 2019, Pages 1497 - 1511

Fuzzy System Based on Two-Step Cascade Genetic Optimization Strategy for Tobacco Tar Prediction

Authors
Muamer Kafadar1, *, Zikrija Avdagic1, Lejla Begic Fazlic2
1Faculty of Electrical Engineering, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
2Trier University of Applied Sciences, Environmental Campus, Germany
*Corresponding author. Email: muamer.kafadar@gmail.com
Corresponding Author
Muamer Kafadar
Received 10 November 2019, Accepted 18 November 2019, Available Online 3 December 2019.
DOI
10.2991/ijcis.d.191122.001
Keywords
Adaptive neuro fuzzy system (ANFISs); Genetic algorithm (GA); Fuzzy logic (FUZZY); Tar; GA-ANFIS; GA-FUZZY; GA-GA-FUZZY
Abstract

Accurately measuring cigarette tar constituents poses many challenges, including the need for standardized smoke generation methods for unstable mixtures. In this research, algorithms that fuse artificial intelligence methods were developed to predict tar concentration. The outputs are three fuzzy structures optimized with genetic algorithms: genetic algorithm (GA)-FUZZY, GA-adaptive neuro fuzzy inference system (ANFIS) and GA-GA-FUZZY. The proposed algorithms are used for tar prediction in the cigarette production process. The prediction results are compared with laboratory chromatography (high-performance liquid chromatography (HPLC)) readings.

Copyright
© 2019 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Tar is the term for the toxic chemical residue produced by the burning process. Tar is present in all cigarettes, and its concentration increases as the cigarette burns, so the last “inhalation” of cigarette smoke contains twice as much tar as the first. Tar is expressed in mg tar/cigarette, that is, the amount of tar “caught” in the cigarette filter during standard cigarette smoking. The actual amount of tar that a smoker inhales is unknown. Modern devices that simulate lung function under the norms and requirements of the tobacco industry measure the levels of tar, nicotine and other components that remain after tobacco burning. In the laboratory, the amount of tar is analyzed by high-performance liquid chromatography (HPLC) [1] and gas chromatography (GC) [2]. When tar is analyzed with a gas chromatograph, smoke from a so-called light cigarette contains less tar than smoke from an “ordinary” cigarette. However, the machine cannot predict the amount of tar that the smoker actually inhales. Detection and prediction of tar concentration is a key factor in the production process, where cigarette contents change frequently (in accordance with regulations on reducing the concentration of toxic components). Cigarette quality control involves complex chemical and physical analysis, and some regulatory agencies try to control the quality of cigarettes on the market [3]. In the scientific literature there are studies that predict tar using mathematical models (adaptive neuro fuzzy classifiers) [4–6] and statistical analyses [7].

“Organizations in different regions of the world, including Parties to the World Health Organization Framework Convention on Tobacco Control (WHO FCTC) and the US Food and Drug Administration (FDA), are working to increase the regulation of tobacco and cigarette-smoke constituents” [2].

Historically, the constituents measured and reported in cigarette smoke have comprised tar, nicotine and, more recently, carbon monoxide, for which test methods have been validated.

The reproducibility of the tar measurement was not reported but is assumed to satisfy the performance criteria stated in the relevant International Organization for Standardization (ISO) standard. In other words, for the measurement of tar, the reproducibility expressed as relative standard deviation (RSD) should range from 73% at 0.82 mg to 11% at 17.4 mg, according to ISO 4387:2000 [2].
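As a small illustration of the criterion above, a relative standard deviation can be computed as the sample standard deviation divided by the mean, expressed in percent. The function name and the sample readings below are hypothetical; they are not taken from ISO 4387:2000 or from the data used in this study.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation in percent: 100 * sample std / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical repeated tar readings (mg/cigarette), for illustration only.
readings = [9.0, 10.0, 11.0]
rsd = rsd_percent(readings)
```

A lower RSD means that repeated measurements of the same product agree more closely.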

Fuzzy systems (FSs) are built from mathematical sets that represent vague and imprecise information, combined with mathematical operations.

The genetic algorithm (GA) is a heuristic approach based on natural selection. For that reason, GA is used for the optimal design of the antecedent parts (membership functions (MFs)) of fuzzy logic rules and of the vector of linear coefficients in the consequent parts of the rules.

This approach has been applied to the management of energy flows in the microgrids of an energy management system [8], and particle swarm optimization has been used to tune MFs to attain a lower prediction error [9].

In the majority of adaptive neuro fuzzy inference systems (ANFIS), MFs are not adapted to the context of each variable, but some cases require additional steps that adapt the MFs to the context of the variables [10].

Producing a good filter design that reduces the tar content in gas is challenging [11]; in that context, it was shown that an ANFIS model can predict the tar content for any filter design with high precision.

In the last decade, the available publications have focused on applications of evolutionary computing (GAs) and fuzzy logic to the prediction of smoke constituents (tar, nicotine, carbon dioxide, tobacco-specific nitrosamines, benzo[a]pyrene, aldehydes, volatile organic compounds and carbon monoxide). Such publications are relatively rare in comparison with other fields of research.

A genetic learning algorithm was the basis for knowledge representation using new extended fuzzy rule models [12,13]. One technical application handles and processes high-throughput data by integrating the GA-FUZZY algorithm with the Hadoop MapReduce technique in order to solve gene classification problems [14]. A newer evolutionary computing algorithm, “Jumping-genes (JGs),” has been combined with fuzzy logic to produce a better digital audio classifier [15].

Classical control theory usually requires a mathematical model for designing the controller, and the inaccuracy of such modeling has been addressed using ANFIS [16]. In the medical field, ANFIS modeling is suitable for survival prediction [17].

The development of hybrid GA-ANFIS systems gave new momentum to the improvement of overall system behavior.

Modern medicine applies artificial intelligence techniques to predict heart disease, such as a basic fuzzy logic expert system approach [18] and improved GA-optimized fuzzy expert systems [19].

In recent years there have also been relatively few applications of evolutionary computing (GAs) and ANFIS to tar prediction in comparison with the fields of engineering and medicine. Notable engineering examples are applications of GA to optimize ANFIS parameters in a classifier of physical work rate [20] and in a structure tension prediction model [21].

ANFIS models optimized with evolutionary algorithms were developed to predict hazardous natural events, such as wildfire patterns [22] and annual rainfall [23], in order to minimize the loss of life and goods.

Another algorithm [24] maximizes the power efficiency of photovoltaic systems exposed to different climate conditions, also using a GA-optimized ANFIS model.

In the power and energy field, an ANFIS model was used to control generator terminal voltage in order to design a power stabilizer system [25].

In general, nonlinear control processes are hard to predict, and for that reason GA optimization of an ANFIS controller is suitable for keeping the temperature of a plastic extrusion system within an acceptable range [26]. In other areas of life, marketing and advertising are constantly improving [27], and ANFIS models have been used to analyze advertising decision making.

The World Health Organization reports that heart diseases cause the most deaths each year, especially in developing countries. An ANFIS classifier was developed for seven input variables from the Cleveland Clinic Foundation heart disease dataset [28]. This model was improved using the same data source and an advanced fuzzy resolution mechanism [29].

A further reduction of the number of attributes in the dataset was obtained using GA for feature selection [18].

Based on the same principle of coupling GA and ANFIS, a model for short-term energy consumption prediction has been developed [30].

Our goal was to develop a novel FS based on a two-step cascade genetic optimization strategy (GA-GA-FUZZY) for tobacco tar prediction and to improve the prediction of the tar level in a cigarette filter using only four cigarette features: diameter, filter ventilation, nicotine and carbon monoxide. In our previous research [31], the proposed combinatorial MF parameters algorithm combined the advantages of ANFIS [32] and GA [33,34]. In the first step, the ANFIS combinatorial mechanism generates six different fuzzy structures. In the second step, GA optimizes [35] these structures. Together, these two steps constitute the GA-FUZZY algorithm.

This paper extends the initial work [31] and shows better prediction accuracy than recently published papers.

We developed a new algorithm, GA-GA-FUZZY, for the determination and prediction of tar as volatile compounds in cigarette smoke. In the first step, we developed an algorithm for combinatorial subtractive clustering ANFIS whose parameters are optimized with GA, resulting in an optimal fuzzy structure. In the second step, the parameters of that fuzzy structure (MFs) were optimized. We named this two-step optimization algorithm GA-GA-FUZZY. The proposed algorithms are used to predict the amount of tar in the cigarette production process. The work was carried out in the MATLAB interactive environment, which is suitable for system modeling. We compared our results with the amount of tar obtained by lab analysis through HPLC [1]. The performance of these algorithms is also compared with our previous results based on ANFIS and GA [31], which shows the effectiveness of the new approach.

The paper is organized in six sections. In Section 1, we described the process of tar detection and listed some research studies on different approaches using combination of GA and ANFIS algorithms. Section 2 describes methodological framework including dataset description, ANFIS architecture and GA used in our research. Section 3 introduces GA-FUZZY, GA-ANFIS and GA-GA-FUZZY design architectures including mathematical descriptions and pseudo codes. Section 4 describes evaluation of developed systems, description of implemented algorithms in MATLAB and numerical results. Section 5 gives discussion and comparative analyses. Finally, Section 6 presents our conclusions.

2. DATASETS AND METHODICAL FRAMEWORK

In this section we provide dataset description and background information related to components implemented in our research.

2.1. Datasets

Rigorous control by the quality guarantee service includes physical–chemical, sensory and tasting evaluation of the raw material and the final product in the cigarette production process. The amount of smoke from a cigarette depends on many factors, such as the puff volume and the number of puffs and the interval between them.

The datasets (Dataset 1 and Dataset 2) [36] used in our study were collected according to the ISO smoke collection method (35 mL puffs lasting two seconds, taken every 60 seconds). An artificial smoker is a device that measures a cigarette's chemical characteristics under standard conditions (Table 1).

| Smoking parameter | ISO |
| --- | --- |
| Conditioning temperature | 22 ± 1 °C |
| Relative humidity (%) | 60 ± 2 |
| Volume of withdrawal (ml) | 35 |
| Duration of withdrawal (s) | 2 |
| Frequency of withdrawal (1/s) | 1/60 |
| Length of cigarette rest (unburned portion) | 23 mm (not less than 8 mm from filter length) |

ISO = International Organization for Standardization

Table 1

Conditions for artificial smoking by ISO.

Total smoke condensate is the residue formed in the cigarette filter during smoking. The substance named tar is extracted from it by separating water and nicotine. This process takes twelve hours to yield precise results.

For our research, we used lab measurements of various cigarette types (909 analyses in total, 20 cigarettes per analysis) collected from 2014 to 2016. A set of 854 samples was used for training, testing and checking, and a set of 55 samples was used for validation.
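The partition described above can be sketched as follows. The 909/854/55 counts come from the text; the shuffling, the seed and the 70/15/15 division of the 854 development samples into training, testing and checking subsets are assumptions made only for illustration, since the text does not state how the samples were assigned.

```python
import random


def split_indices(n_total=909, n_validation=55,
                  fractions=(0.70, 0.15, 0.15), seed=0):
    """Split sample indices into train/test/check plus a held-out validation set.

    The fractions and the random shuffle are illustrative assumptions.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    validation = indices[:n_validation]
    development = indices[n_validation:]
    n_train = int(fractions[0] * len(development))
    n_test = int(fractions[1] * len(development))
    return {
        "train": development[:n_train],
        "test": development[n_train:n_train + n_test],
        "check": development[n_train + n_test:],
        "validation": validation,
    }


parts = split_indices()
```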

For additional validation, one independent group of samples was taken directly from the production line, and the prediction results were compared with the results that the quality guarantee service obtained through HPLC. Part of the dataset is presented in Table 2.

| Moisture (%) | Smoke resistance PD (mmVS) | Smoke resistance CPD (mmVS) | Diameter (mm) | Filter ventilation (%) | Total ventilation (%) | Puff/cig | TPM (mg/cig) | Nicotine (mg/cig) | Water (mg/cig) | CO (mg/cig) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 13.58 | 102 | 115 | 7.81 | 19.2 | 38.8 | 6.92 | 9.58 | 0.56 | 0.79 | 10.60 |
| | 102 | 116 | 7.81 | 19.5 | 39.5 | 6.98 | 9.59 | 0.54 | 0.83 | 10.90 |
| | 102 | 116 | 7.81 | 19.6 | 38.7 | 6.96 | 9.64 | 0.51 | 0.78 | 10.60 |
| 13.58 | 102 | 116 | 7.81 | 19.4 | 39.0 | 6.95 | 9.60 | 0.54 | 0.80 | 10.70 |
| 13.56 | 105 | 120 | 7.80 | 21.0 | 38.9 | 6.80 | 8.73 | 0.50 | 0.53 | 10.20 |
| | 105 | 120 | 7.80 | 20.5 | 38.6 | 6.94 | 9.10 | 0.52 | 0.60 | 10.80 |
| | 104 | 118 | 7.80 | 20.1 | 37.8 | 7.03 | 9.14 | 0.52 | 0.63 | 10.30 |
| 14.06 | 102 | 115 | 7.80 | 19.1 | 38.3 | 7.03 | 9.82 | 0.58 | 0.61 | 10.30 |
| | 103 | 115 | 7.80 | 18.5 | 37.0 | 7.03 | 9.82 | 0.58 | 0.61 | 10.40 |
| | 102 | 116 | 7.81 | 19.6 | 39.0 | 7.16 | 9.72 | 0.59 | 0.60 | 10.40 |
| 14.28 | 96 | 107 | 7.84 | 17.3 | 34.7 | 6.90 | 10.41 | 0.64 | 0.73 | 10.10 |

Table 2

Part of the data used in this study.

By applying the new two-step combinatorial algorithm GA-GA-FUZZY, we try to predict the amount of tar in a cigarette knowing only four cigarette parameters: diameter, filter ventilation, nicotine and carbon monoxide.

2.2. ANFIS Architecture

In this work, we used the ANFIS architecture [31] presented in Figure 1, with four cigarette features as inputs, one output (the concentration of tar) and sixteen rules.

Figure 1

Adaptive neuro fuzzy inference systems (ANFIS) structure used in this work.

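ANFIS has five layers, described below:

Layer 1 (L1)

The output of the i-th node of the first layer, $O_{1,i}$, $i \in L1$ (an adaptive node with node function $\varphi$), is given by Eqs. (1–4):

$$O_{1,i} = \varphi_{M_i}(D), \quad i = 1, 2 \tag{1}$$

$$O_{1,i} = \varphi_{N_{i-2}}(FV), \quad i = 3, 4 \tag{2}$$

$$O_{1,i} = \varphi_{P_{i-4}}(N), \quad i = 5, 6 \tag{3}$$

$$O_{1,i} = \varphi_{Q_{i-6}}(CO), \quad i = 7, 8 \tag{4}$$

where the premise parameters act on the input values (D: diameter, FV: filter ventilation, N: nicotine and CO: carbon monoxide) and $M_i, N_i, P_i, Q_i$ are the linguistic labels associated with the node function.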

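Layer 2 (L2)

Every node $i \in L2$ is a fixed node, labeled $\Pi$, whose output is the product of its inputs, as in Eq. (5):

$$O_{2,i} = \varphi_{M_i}(D) \times \varphi_{N_i}(FV) \times \varphi_{P_i}(N) \times \varphi_{Q_i}(CO), \quad i = \overline{1,16} \tag{5}$$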

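Layer 3 (L3)

Every node $i \in L3$ is a fixed node. With $r_i$ denoting the firing strength of the i-th rule, the output of the i-th node of layer 3 is expressed as Eq. (6):

$$O_{3,i} = \bar{r}_i = \frac{r_i}{\sum_{j=1}^{16} r_j}, \quad i = \overline{1,16} \tag{6}$$

The values $\bar{r}_i$ are referred to as normalized firing strengths.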

Layer 4 (L4)

Every node $i \in L4$ is an adaptive node. The output of the i-th node of layer 4 is given by Eq. (7):

$$O_{4,i} = \bar{r}_i z_i = \bar{r}_i \left(m_i D + n_i FV + o_i N + p_i CO + s_i\right), \quad i = \overline{1,16} \tag{7}$$

where $\bar{r}_i$ is the output of L3 and $\{m_i, n_i, o_i, p_i, s_i\}$ are the consequent parameters of $z_i$ (LRSE updated).

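Layer 5 (L5)

The single node of this layer is a fixed node, labeled $\Sigma$, that computes the overall output, expressed as Eq. (8):

$$O_{5} = \sum_{i=1}^{16} \bar{r}_i z_i = \frac{\sum_{i} r_i z_i}{\sum_{i} r_i}, \quad i = \overline{1,16} \tag{8}$$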
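The five layers can be traced in a short forward-pass sketch. It assumes generalized bell membership functions (three parameters per MF, in line with the parameter count used later for the chromosome) and two MFs per input, giving 2^4 = 16 rules. The MF parameter values and consequent coefficients are placeholders chosen for illustration, not values fitted in this study.

```python
import itertools
import math


def gbell(x, a, b, c):
    """Generalized bell MF (a: width, b: slope, c: center); an assumed MF type."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(inputs, mf_params, consequents):
    """Forward pass of a first-order Sugeno ANFIS through layers 1 to 5."""
    # Layer 1: membership degree of every input in each of its MFs.
    mu = [[gbell(x, *p) for p in params] for x, params in zip(inputs, mf_params)]
    # Layer 2: firing strength of each rule (one MF per input, product of degrees).
    rules = itertools.product(*[range(len(m)) for m in mu])
    r = [math.prod(mu[k][j] for k, j in enumerate(rule)) for rule in rules]
    # Layer 3: normalized firing strengths.
    total = sum(r)
    r_bar = [ri / total for ri in r]
    # Layer 4: linear consequent z_i = m*D + n*FV + o*N + p*CO + s per rule.
    z = [sum(c * x for c, x in zip(coef[:-1], inputs)) + coef[-1]
         for coef in consequents]
    # Layer 5: weighted sum of the rule outputs.
    return sum(rb * zi for rb, zi in zip(r_bar, z))


# Inputs: diameter, filter ventilation, nicotine, CO (first row of Table 2).
example_inputs = [7.81, 19.2, 0.56, 10.6]
# Placeholder (a, b, c) triples, two MFs per input.
example_mf_params = [
    [(0.05, 2.0, 7.78), (0.05, 2.0, 7.84)],
    [(2.0, 2.0, 18.0), (2.0, 2.0, 21.0)],
    [(0.05, 2.0, 0.50), (0.05, 2.0, 0.60)],
    [(0.5, 2.0, 10.0), (0.5, 2.0, 11.0)],
]
# Placeholder consequents: every rule outputs the constant 7.0.
example_consequents = [[0.0, 0.0, 0.0, 0.0, 7.0] for _ in range(16)]
tar_estimate = anfis_forward(example_inputs, example_mf_params, example_consequents)
```

Because the normalized firing strengths sum to one, identical constant consequents return that constant, which is a convenient sanity check for the normalization in layer 3.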

2.3. Genetic Algorithm

GAs are a family of algorithms that use genetic principles present in nature to solve computational problems.

These natural principles include inheritance, crossover, mutation, survival of the fittest and migration. John Holland invented the original GA in the early 1970s.

Figure 2 shows the GA flow diagram used in our research.

Figure 2

Genetic algorithm flow diagram.
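A generic GA loop of this kind can be sketched in a few lines. This is a minimal real-coded GA with tournament selection, arithmetic crossover, Gaussian mutation and elitism; the operator choices, default settings and the function name are assumptions for illustration and do not reproduce the MATLAB configuration used in this study.

```python
import random


def genetic_minimize(fitness, lb, ub, pop_size=30, generations=50,
                     mutation_rate=0.1, fitness_limit=0.0, seed=0):
    """Minimize `fitness` over the box [lb, ub] with a simple real-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lb, ub)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness)
        history.append(fitness(scored[0]))
        if history[-1] <= fitness_limit:  # stopping criterion: fitness limit
            break
        children = [list(ind) for ind in scored[:2]]  # elitism: keep the two best
        while len(children) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(pop, 3), key=fitness)
            p2 = min(rng.sample(pop, 3), key=fitness)
            # Arithmetic crossover.
            w = rng.random()
            child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation, then clamp each gene to its bounds.
            for k, (lo, hi) in enumerate(zip(lb, ub)):
                if rng.random() < mutation_rate:
                    child[k] += rng.gauss(0.0, 0.1 * (hi - lo))
                child[k] = min(hi, max(lo, child[k]))
            children.append(child)
        pop = children
    best = min(pop, key=fitness)
    return best, fitness(best), history


# Toy objective: squared distance to the point (0.3, 0.3).
best, best_err, history = genetic_minimize(
    lambda v: sum((g - 0.3) ** 2 for g in v), [0.0, 0.0], [1.0, 1.0])
```

Elitism guarantees that the best fitness never gets worse from one generation to the next, which makes the recorded history easy to interpret.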

3. MODELING OF GA-GA-FUZZY

In the first step, we developed the GA-FUZZY algorithm (part of the Control block in Figure 3), in which the FIS MFs are optimized. Results from this step serve as the reference for the Target block (Figure 3), which is developed through the GA-ANFIS and GA-GA-FUZZY algorithms. The GA-ANFIS algorithm optimizes the ANFIS structure parameters, and the resulting FIS structures are compared with the FIS structures from the Control block. The GA-GA-FUZZY algorithm is based on two-step cascade optimization.

Figure 3

Two-step optimization system design (genetic algorithm (GA)-GA-FUZZY).

3.1. GA-FUZZY Control Block

3.1.1. Description

The combinatorial MF parameters algorithm GA-FUZZY optimizes FIS MF parameters using GA. It was developed in the first part of the research [31] and consists of the following steps:

  1. Step: Generating different ANFIS structures [37]

  2. Step: Using the ANFIS structures from Step 1, the algorithm genetically optimizes each fuzzy structure and selects the best one

  3. Step: Generating the best prediction and the lowest error for the best GA-FUZZY optimized FIS structure

Six combinations of ANFIS parameters are generated and each of them is used in ANFIS, so six fuzzy (FIS) structures are created. Every FIS structure is then optimized using GA to tune the FIS MF parameters so as to minimize the validation error.

Combinatorial MF parameters algorithm GA-FUZZY block diagram is shown in Figure 4.

Figure 4

Genetic algorithm (GA)-FUZZY diagram.

The block diagram can be described in the following steps:

Two-Step Cascade Genetic Optimization algorithm main steps:

Step 1: Develop Control block

  Step 1.1: Develop combinatorial membership function parameters algorithm-GA-FUZZY

    Step 1.1.1: Generate six different fuzzy structures

Step 2: Develop Target Block

  Step 2.1: Develop combinatorial subtractive clustering ANFIS parameters algorithm- GA-ANFIS

    Step 2.1.1: Generate best ANFIS SC parameters (best fuzzy structure)

  Step 2.2: Develop two-stage combinatorial algorithm GA-GA-FUZZY

    Step 2.2.1: Generate best FISSCopt,MFopt structure

Step 3: Apply validation datasets on results from 1.1.1, 2.1.1 and 2.2.1

  1. Database initialization

    Database initialization represents data processing, which defines the data for training, testing, verification and validation. The validation set is allocated to validate the model

  2. ANFIS initialization

    Step 1: create the ANFIS network, defined by

    • Number of inputs and

    • Number of MFs

    Step 2: generate different types of ANFIS structure as follows:

    • Grid Partition in combination with hybrid learning rule and linear output MF (GP1)

    • Grid Partition in combination with hybrid learning rule and constant output MF (GP2)

    • Grid Partition in combination with backpropagation learning rule and linear output MF (GP3)

    • Grid Partition in combination with backpropagation learning rule and constant output MF (GP4)

    • Subtractive Clustering in combination with hybrid learning rule (SC1)

    • Subtractive Clustering in combination with backpropagation learning rule (SC2)

  3. Initial configuration of the GA and GA optimization of ANFIS structure

    1. Define population size, fitness limit, GA operators and the number of generations

    2. GA-FUZZY chromosome structure is created using ANFIS complexity of calculation [38]

    3. Stopping GA criteria

      c1) Prediction: A set of validation data is applied on genetic optimized structure

      c2) Results: Two functions were created to display the results:

      • The best GA-FUZZY structure with the lowest error and

      • The best GA-FUZZY structure with the best prediction

3.1.2. Mathematical description

Different ANFIS structures were generated using one of two partition techniques, Grid Partition or Subtractive Clustering, in combination with different optimization methods (hybrid and backpropagation).

The ANFIS combinations used in this work are described in Table 3.

| Learning method | Output MF | Partition technique: Grid Partition | Partition technique: Subtractive Clustering |
| --- | --- | --- | --- |
| Hybrid | Linear | + | + |
| Hybrid | Constant | + | |
| Backpropagation | Linear | + | + |
| Backpropagation | Constant | + | |

ANFIS = adaptive neuro fuzzy inference systems; MFs = membership functions

Table 3

ANFIS combinatorics.

We used six different fuzzy structures named GP1 (Grid Partition, hybrid, linear), GP2 (Grid Partition, backpropagation, linear), GP3 (Grid Partition, hybrid, constant), GP4 (Grid Partition, backpropagation, constant), SC1 (Subtractive Clustering, hybrid) and SC2 (Subtractive Clustering, backpropagation).
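The six combinations can be enumerated programmatically. The sketch below assumes, as Table 3 indicates, that Subtractive Clustering is paired only with a linear output MF while Grid Partition is used with both output types; the dictionary keys and string labels are illustrative names.

```python
from itertools import product


def anfis_combinations():
    """Enumerate partition / learning rule / output MF combinations."""
    combos = []
    for partition, learning, output in product(
            ("grid_partition", "subtractive_clustering"),
            ("hybrid", "backpropagation"),
            ("linear", "constant")):
        # Assumption from Table 3: no constant output MF with subtractive clustering.
        if partition == "subtractive_clustering" and output == "constant":
            continue
        combos.append({"partition": partition, "learning": learning, "output": output})
    return combos


combos = anfis_combinations()
```

Four of the resulting entries use Grid Partition and two use Subtractive Clustering, for six structures in total.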

The most important part of developing this algorithm is establishing the relationship between the data structure and the chromosomes. The ANFIS computational formula [39] (see Table 4) was used to determine the size of the chromosome structure.

| Layer | Type | Nodes | Parameters |
| --- | --- | --- | --- |
| L0, O0,i | Inputs | NumIn | 0 |
| L1, O1,i | Values | NumMf*NumIn | 3*(NumMf*NumIn) |
| L2, O2,i | Rules | NumMf^NumIn | 0 |
| L3, O3,i | Normalization | NumMf^NumIn | 0 |
| L4, O4,i | Lin. function | NumMf^NumIn | (NumIn + 1)*NumMf^NumIn |
| L5, O5,i | Sum | 1 | 0 |

Legend: ANFIS = adaptive neuro fuzzy inference systems; NumIn = Number of inputs; NumMf = Number of membership functions; Li = Layer number; Ok,i = Output of k-th layer for i-th node

Table 4

ANFIS computational formula.

Chromosome length is defined as Eq. (9).

$$L_{Chrom} = L1 + L4 = 3 \cdot NumMf \cdot NumIn + (NumIn + 1) \cdot NumMf^{NumIn} \tag{9}$$
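As a worked check of Eq. (9), the architecture of Section 2.2 (four inputs, two MFs per input) gives 3·2·4 + 5·2^4 = 24 + 80 = 104 genes. The helper functions below are illustrative names; the three variable-count formulas are the ones listed in the pseudo code of Section 3.1.3, and the rule count used for the subtractive clustering case is a made-up example value.

```python
import math


def chromosome_length(num_mf, num_in):
    """Eq. (9): premise parameters (layer 1) plus consequent parameters (layer 4)."""
    return 3 * num_mf * num_in + (num_in + 1) * num_mf ** num_in


def grid_linear_vars(num_mf_per_input):
    """Grid partition, linear output: 3*sum(NumMF) + (NumIn + 1)*prod(NumMF)."""
    num_in = len(num_mf_per_input)
    return 3 * sum(num_mf_per_input) + (num_in + 1) * math.prod(num_mf_per_input)


def grid_constant_vars(num_mf_per_input):
    """Grid partition, constant output: 3*sum(NumMF) + prod(NumMF)."""
    return 3 * sum(num_mf_per_input) + math.prod(num_mf_per_input)


def subclust_vars(num_in, num_rule):
    """Subtractive clustering: 2*NumIn*NumRule + NumRule*(NumIn + 1)."""
    return 2 * num_in * num_rule + num_rule * (num_in + 1)


length = chromosome_length(num_mf=2, num_in=4)   # 104
```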

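Mathematically, the combinatorial GA-FUZZY algorithm is represented as Eq. (10):

$$FIS_{i}^{MF_{opt}} = GA\left(\bigcup_{i=1}^{6} ANFIS(DataSet)_{combinatoric}\right), \tag{10}$$

where

$$GA\left(\bigcup_{i=1}^{6} ANFIS(DataSet)_{combinatoric}\right) = GA\big(ANFIS(DataSet)_{GP1} + ANFIS(DataSet)_{GP2} + ANFIS(DataSet)_{GP3} + ANFIS(DataSet)_{GP4} + ANFIS(DataSet)_{SC1} + ANFIS(DataSet)_{SC2}\big) \tag{11}$$

and

$$i = \begin{cases} 1, & \text{for Grid Partition Hybrid Linear} \\ 2, & \text{for Grid Partition Backpropagation Linear} \\ 3, & \text{for Grid Partition Hybrid Constant} \\ 4, & \text{for Grid Partition Backpropagation Constant} \\ 5, & \text{for Subtractive Clustering Hybrid} \\ 6, & \text{for Subtractive Clustering Backpropagation} \end{cases} \tag{12}$$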

3.1.3. Pseudo code

Pseudo code for combinatorial membership function parameters algorithm GA-FUZZY is shown as follows (Algorithm 1):

Algorithm 1: Pseudo code of GA-FUZZY algorithm

Require
 X: training dataset, Y: checking dataset, Z: validation dataset
 isCluster: {0: Grid Partition Linear, 1: Subtractive Clustering, 2: Grid Partition Constant}
 ANFIS params {NumMF, NumIn, NumRule, NumEpoch, Optim. method, Constant (Linear), Tolerance (error)}
 GA params {Number of Generations, Population Size, Crossover operator, Mutation operator, Fitness limit}

 RUN ANFIS
  FOR i = 1 to 6
   Fuzzy System initialization {apply genfis1 OR genfis2}
   Input other ANFIS params of learning
   BEGIN process of learning
    Use command “anfis”
   END when tolerance achieved
   Generate ANFIS structure FISi {i = 1,…,6}
   Tag FISi with its isCluster value
  END (FOR)
 END

 RUN GA
  FOR i = 1 to 6
   FF = @(param) FitnessFunction(param, FISi, X, Z, NumMf, NumIn)  {param = err}
   IF isCluster == 0
    Number of Variables = 3*sum(NumMF) + (NumIn + 1)*prod(NumMF)
    Final parameters = GA(FF, Number of Variables, GA params)
   END
   IF isCluster == 1
    Number of Variables = 2*NumIn*NumRule + NumRule*(NumIn + 1)
    Final parameters = GA(FF, Number of Variables, GA params)
   END
   IF isCluster == 2
    Number of Variables = 3*sum(NumMF) + prod(NumMF)
    Final parameters = GA(FF, Number of Variables, GA params)
   END
   FISiMFopt = GA(ANFIS(X, Y)) = Final parameters applied to FISi
   Resulti = evaluate(Z, FISiMFopt)  {i = 1, 2, …, 6}
  END (FOR)
 END

 Best results = min(RMSE(Resulti))

Legend: NumIn—Number of inputs; NumMf—Number of membership functions; NumRule—Number of rules; NumEpoch—Number of epochs; FISiMFopt=GA (FISi) GA optimization based on membership function (MF) parameters tuning; RMSE—Root Mean Square Error (RMSE)
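The final selection step of Algorithm 1, choosing the structure with the lowest RMSE on the validation set, can be sketched as follows. The function names and the prediction values are hypothetical and serve only to show the selection logic; they are not results from this study.

```python
import math


def rmse(predicted, target):
    """Root mean square error between two equally long sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def best_structure(predictions_by_structure, target):
    """Return (name, rmse) of the structure with the lowest validation RMSE."""
    errors = {name: rmse(pred, target)
              for name, pred in predictions_by_structure.items()}
    name = min(errors, key=errors.get)
    return name, errors[name]


# Made-up validation targets and predictions (mg tar/cigarette).
target = [8.0, 9.0, 10.0]
predictions = {
    "structure_A": [8.5, 9.5, 10.5],
    "structure_B": [8.1, 9.0, 9.9],
}
winner, winner_rmse = best_structure(predictions, target)
```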

3.2. GA-ANFIS Sub-Target Block

3.2.1. Description

In the second phase of the research, our goal was to develop the combinatorial subtractive clustering ANFIS parameters algorithm, GA-ANFIS, which performs genetic optimization of the subtractive clustering parameters of ANFIS. Within ANFIS, subtractive clustering provides a quick method for taking input/output training data and generating a Sugeno FS that models the behavior of the data [38].

The main idea is to create an algorithm that minimizes the error (learning methods: hybrid and backpropagation) by genetically optimizing the values of the subtractive clustering parameters. Range of Influence specifies a cluster center's range of influence in each data dimension. Squash Factor is the factor that multiplies the radii values determining the neighborhood of a cluster center, in order to squash the potential of outlying points to be considered part of that cluster.

Accept Ratio sets the potential, as a fraction of the potential of the first cluster, above which another data point will be accepted as cluster center.

Reject Ratio sets the potential, as a fraction of the potential of the first cluster, below which a data point will be rejected as cluster center. The number of independent input variables in the objective function is four. The chromosome structure is presented in Figure 5.

Figure 5

Chromosome representation of genetic algorithm-adaptive neuro fuzzy inference systems (GA-ANFIS) algorithm.
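To show how the four genes act, here is a minimal sketch of subtractive clustering in the commonly cited potential-based form. It assumes data already scaled to [0, 1], a single scalar Range of Influence for all dimensions and an Accept Ratio below 1. The default values mirror commonly used settings and are assumptions, as is the function name; this is not the MATLAB routine used in the study.

```python
import math


def subtractive_clustering(points, range_of_influence=0.5, squash_factor=1.25,
                           accept_ratio=0.5, reject_ratio=0.15):
    """Return cluster centers chosen among `points` (tuples scaled to [0, 1])."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    alpha = 4.0 / range_of_influence ** 2
    beta = 4.0 / (squash_factor * range_of_influence) ** 2
    # Initial potential: density of neighbors around every point.
    potential = [sum(math.exp(-alpha * dist2(p, q)) for q in points) for p in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        ratio = peak / first_peak
        if ratio > accept_ratio:
            accept = True
        elif ratio < reject_ratio:
            break  # remaining points are too weak to become centers
        else:
            # Gray zone: accept only if far enough from the existing centers.
            d_min = min(math.sqrt(dist2(points[k], c)) for c in centers)
            accept = d_min / range_of_influence + ratio >= 1.0
        if accept:
            centers.append(points[k])
            # Squash the potential of points near the new center.
            for i, p in enumerate(points):
                potential[i] -= peak * math.exp(-beta * dist2(p, points[k]))
        else:
            potential[k] = 0.0
    return centers


# Two well-separated toy groups; the chromosome is [RoI, SF, AR, RR].
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
chromosome = [0.5, 1.25, 0.5, 0.15]
toy_centers = subtractive_clustering(toy_points, *chromosome)
```

Each cluster center found this way corresponds to one fuzzy rule, so changing the four genes changes the number and placement of rules in the generated structure.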

3.2.2. Block diagram

The combinatorial subtractive clustering ANFIS parameters algorithm contains three functional parts: data preprocessing (training, checking and validation datasets), genetic optimization of the subtractive clustering ANFIS parameters to create the best FISSCopt structure, and a final validation process. The GA-ANFIS block diagram in Figure 6 shows that GA-ANFIS directly uses the four cigarette characteristic parameters (collected in the datasets) as input variables and has one system output (tar prediction error). The GA optimizes the ANFIS SC parameters to produce the FIS structure with the minimal tar prediction error with respect to the samples in the checking dataset. The GA-ANFIS system generates a single resulting FIS structure.

Figure 6

Combinatorial subtractive clustering parameter adaptive neuro fuzzy inference systems (ANFIS) algorithm genetic algorithm (GA)-ANFIS.

3.2.3. Mathematical description

Mathematically, this algorithm generates the best subtractive clustering parameters as part of the generated structure, which can be represented by Eq. (13).

$$FIS^{SC_{opt}} = GA_{SC}\big(ANFIS(DataSet)\big) \tag{13}$$

3.2.4. Pseudo code

Pseudo code for the combinatorial subtractive clustering ANFIS parameters algorithm GA-ANFIS is shown as follows (Algorithm 2):

Algorithm 2: Pseudo code of GA-ANFIS algorithm

Require
 X: training dataset, Y: checking dataset, Z: validation dataset
 ANFIS params {NumMF, NumIn, NumEpoch, Optim. method, Constant (Linear), Tolerance (error)}
 GA params {Number of Generations, Population Size, Crossover operator, Mutation operator, Fitness limit}

 RUN ANFIS
  FOR i = 1 to 2
   Fuzzy System initialization {apply genfis2}
   Input other ANFIS params of learning
   BEGIN process of learning
    Use command “anfis”
   END when tolerance achieved
   Generate ANFIS structure FISi {i = 1, 2}
  END (FOR)
 END

 RUN GA
  FFerror = @ga_wrapper_func(param_vector)  {param_vector = [RoI SF AC RR]}
  Number of Variables = 4
  Define constraints (lb, ub)
  best_point = ga(@ga_wrapper_func, Number of Variables, lb, ub, GA params)
 END

 Use best_point to create FISSCopt = GA_SC(ANFIS(DataSet))
 Best results = evaluate(Z, FISSCopt)

Legend: NumIn—Number of inputs; NumMf—Number of membership functions; NumRule—Number of rules; NumEpoch—Number of epochs; FIS SCopt = GA (FISSC)—GA optimization based on subcluster (SC) parameters tuning

3.3. GA-GA-FUZZY Target Block

3.3.1. Description

Two-step combinatorial algorithm GA-GA-FUZZY optimizes structure by subtractive clustering parameters and MFs FISSCopt,MFopt. The GA-GA-FUZZY algorithm can be represented in the following steps:

  1. Data preprocessing (preparing samples for training, checking and validation)

  2. Generating number of MF for each input

    • 2.1 Generating different combinations of fuzzy structure FISi, i = 1,…,6

    • 3.1 Applying the combinatorial subtractive clustering GA-ANFIS algorithm

      • 3.1.1 Generating the best subtractive clustering parameters (4 parameters) by applying GA on ANFIS parameters

      • 3.1.2 Creating the FIS structure with the GA-optimized SC parameters, FISSCopt

      • 3.1.3 Model validation: applying the validation set to the FISSCopt structure

    • 3.2 Applying the GA-FUZZY algorithm

      • 3.2.1 Applying the GA-FUZZY algorithm to the structures generated in steps 2.1 and 3.1.2

      • 3.2.2 Generating the GA-optimized structures FISiMFopt, i = 1,…,6, and FISSCopt,MFopt, which represents the GA-optimized FISSCopt structure

      • 3.2.3 Validation

3.3.2. Block diagram

The two-step combinatorial algorithm GA-GA-FUZZY contains several functional parts: data preprocessing (training, checking and validation datasets); genetic optimization of the subtractive clustering ANFIS parameters to create the best FISSCopt structure, and its validation; generation of different combinations of fuzzy structure FISi (i = 1,…,6); final genetic optimization of the generated FISSCopt and FISi (i = 1,…,6) structures; and validation of the optimized systems. The GA-GA-FUZZY block diagram is shown in Figure 7.

Figure 7

Two-step genetic algorithm (GA)-GA-FUZZY algorithm.

In general, the core GA-GA-FUZZY algorithm enhances the GA-ANFIS FIS structure by additionally optimizing the fuzzy MF parameters.

3.3.3. Mathematical description

The result of the two-step combinatorial algorithm is a structure optimized by both sub-clustering (SC) parameters and MF parameters, mathematically represented as Eq. (14).

$$FIS^{SC_{opt},MF_{opt}} = GA\left(\bigcup_{i=1}^{6} ANFIS(DataSet)_{combinatorics} \cup GA_{SC}\big(ANFIS(DataSet)\big)\right) \tag{14}$$

As Eq. (14) shows, the final optimized fuzzy structures result from using GA to optimize the fuzzy MF parameters on the union of two parts: the six fuzzy structures produced by one ANFIS model through combinations of its parameters, and the two fuzzy (FIS) structures produced by ANFIS after the first level of optimization, in which GA tunes the ANFIS sub-clustering parameters.

3.3.4. Pseudo Code

Pseudo code for GA-GA-FUZZY algorithm is shown as follows (Algorithm 3):

Algorithm 3: Pseudo code of GA-GA-FUZZY algorithm

Require
 X: training dataset, Y: checking dataset, Z: validation dataset
 isCluster: {0: Grid Partition Linear, 1: Subtractive Clustering, 2: Grid Partition Constant}
 ANFIS params {NumMF, NumIn, NumRule, NumEpoch, Optim. method, Constant (Linear), Tolerance (error)}
 GA params {Number of Generations, Population Size, Crossover operator, Mutation operator, Fitness limit}

 RUN ANFIS
  j = 6
  FOR i = 1 to j
   Fuzzy System initialization {apply genfis1 OR genfis2}
   Input other ANFIS params of learning
   BEGIN process of learning
    Use command “anfis”
   END when tolerance achieved
   Generate ANFIS structure FISi {i = 1,…,6}
   Tag FISi with its isCluster value
  END (FOR)
 END

 RUN GA
  Number of Variables = 4 (RoI, SF, AC, RR)
  FFerror = @ga_wrapper(param_vector)  {param_vector = [RoI SF AC RR]}
  {ga_wrapper(param_vector) = ANFIS(X, Y) built with SC params = param_vector}
  best_param = ga(@ga_wrapper, Number of Variables, constraints, GA params)
 END

 RUN ANFIS
  FOR best_param
   Fuzzy System initialization {apply genfis2}
   Give best_param vector to ANFIS
   BEGIN process of learning
    Use command “anfis”
   END when tolerance achieved
   Create FISSCopt = GA_SC(ANFIS(X, Y)) = best_param applied to FISSC
   Tag FISSCopt with its isCluster value
  END (FOR)
  Result(j+2) = evaluate(Z, FISSCopt)
 END

 RUN GA
  FF = @(param) FitnessFunction(param, FISi, FISSCopt, X, Z, NumMf, NumRule, NumIn)  {param = err}
  IF isCluster == 0
   Number of Variables = 3*sum(NumMF) + (NumIn + 1)*prod(NumMF)
   Final parameters = GA(FF, Number of Variables, GA params)
  END
  IF isCluster == 1
   Number of Variables = 2*NumIn*NumRule + NumRule*(NumIn + 1)
   Final parameters = GA(FF, Number of Variables, GA params)
  END
  IF isCluster == 2
   Number of Variables = 3*sum(NumMF) + prod(NumMF)
   Final parameters = GA(FF, Number of Variables, GA params)
  END
  FOR i = 1 to 6
   FISiMFopt = GA(ANFIS(X, Y)) = Final parameters applied to FISi
   Resulti = evaluate(Z, FISiMFopt)  {i = 1, 2, …, j}
  END (FOR)
  FISSCopt,MFopt = GA(GA_SC(ANFIS(X, Y))) = Final parameters applied to FISSCopt
  Result(j+1) = evaluate(Z, FISSCopt,MFopt)
 END

 Best_result = optimum of RMSE(Resulti) over i = 1,…,j+2

Legend: NumIn: number of inputs; NumMf: number of membership functions; NumRule: number of rules; NumEpoch: number of epochs; FISiMFopt=GA(FISi): GA optimization based on membership function (MF) parameter tuning; FISSCopt=GA(FISSC): GA optimization based on subcluster (SC) parameter tuning; FISiSCoptMFopt=GA(GA(FISiSC)): GA optimization based on SC parameter tuning in the first run and MF parameter tuning in the second run; RMSE: root mean square error
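The three "Number of Variables" expressions in the pseudocode can be checked with a few lines of code. The following is a minimal Python sketch, not the authors' MATLAB implementation; the function name `n_ga_variables` and the example sizes (four inputs, two MFs per input, three rules) are assumptions chosen only for illustration. One possible reading of the expressions is that each grid-partition MF contributes three tunable parameters and each subtractive-clustering MF contributes two, but the source does not state this explicitly.

```python
from math import prod


def n_ga_variables(is_cluster, num_in, num_mf=None, num_rule=None):
    """Chromosome length for each structure type, following the pseudocode.

    is_cluster: 0 = grid partition (linear), 1 = subtractive clustering,
                2 = grid partition (constant).
    num_mf:     list with the number of MFs per input (grid partition only).
    num_rule:   number of rules (subtractive clustering only).
    """
    if is_cluster == 0:
        # MF parameters plus (NumIn + 1) linear coefficients per rule
        return 3 * sum(num_mf) + (num_in + 1) * prod(num_mf)
    if is_cluster == 1:
        # two MF parameters per input per rule plus linear coefficients
        return 2 * num_in * num_rule + num_rule * (num_in + 1)
    if is_cluster == 2:
        # MF parameters plus one constant per rule
        return 3 * sum(num_mf) + prod(num_mf)
    raise ValueError("is_cluster must be 0, 1 or 2")


# Illustrative sizes only (assumed, not taken from the paper).
print(n_ga_variables(0, num_in=4, num_mf=[2, 2, 2, 2]))  # 3*8 + 5*16 = 104
print(n_ga_variables(1, num_in=4, num_rule=3))           # 24 + 15 = 39
print(n_ga_variables(2, num_in=4, num_mf=[2, 2, 2, 2]))  # 3*8 + 16 = 40
```

The grid-partition counts grow with the product of the MF counts, so the search space of the second GA run expands quickly as inputs or MFs are added, while the subtractive-clustering count grows only linearly with the number of rules.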

4. IMPLEMENTATION AND EVALUATION OF DEVELOPED SYSTEM

We used MATLAB for the analysis, development and implementation of our GA-GA-FUZZY system because it integrates high-performance computation, visualization and programming in an easy-to-use environment for modelling, simulation and rapid prototyping. Its rich library of toolboxes (Math, Fuzzy Logic, Optimization, Neural Networks) makes it possible to develop applications with a graphical user interface (GUI) and reproducible results in many fields. We used the MATLAB core function libraries (Optimization Toolbox and Fuzzy Toolbox), the coder, and the MATLAB GUI. The Fuzzy Toolbox was used for fuzzy and ANFIS modelling, and the Optimization Toolbox was used for optimization of the fuzzy structures. This framework was the core environment for GA-GA-FUZZY algorithm development. The main GUI (Figure 8) consists of four areas, which have several subsystems:

  • Area 1 is used for database initialization, with three generated data sample sets for system development (training, testing and checking) and one independent data sample set for external model validation.

  • Area 2 is used for ANFIS network creation (number of inputs and number of MFs per input). By using different combinations of the techniques presented in Table 3, different fuzzy structures were created.

  • Area 3 is used for GA parameterization. In this step, we define the basic parameters of the GA: number of generations, population size, fitness limit and crossover operator.

  • Area 4 consists of three subsystems: GA-FUZZY (area 4a), GA-ANFIS (area 4b) and GA-GA-FUZZY (area 4c).

Figure 8

Main graphical user interface (GUI).

We used Dataset 1 and Dataset 2 to create and evaluate the control block by applying the GA-FUZZY GUI, which was developed in previous research [31] (Figure 9). Different fuzzy structures are initialized and GA-optimized (area 1 and area 2 in Figure 9). The combinatorial MF parameters algorithm GA-FUZZY is validated using the validation dataset (area 3 in Figure 9).

Figure 9

Genetic algorithm (GA)-FUZZY graphical user interface (GUI).

The GA-GA-FUZZY system processing consists of two steps (area 4b and area 4c in Figure 8). In the first step, we apply the GA-ANFIS algorithm to obtain the best subtractive clustering parameters and generate the best GA-ANFIS structure (area “a” in Figure 10).

Figure 10

Genetic algorithm (GA)-GA-FUZZY graphical user interface (GUI).

Then, we optimized this structure with the GA-GA-FUZZY algorithm (area “b” in Figure 10). The resulting GA-GA-FUZZY structure was used in the validation process, where we used two different validation datasets (area “c” in Figure 10).
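The two-step idea described above can be illustrated with a toy example. The sketch below is written in Python and is only an assumption-laden illustration, not the authors' MATLAB code: it uses a single input, a sine curve as stand-in data, a zero-order Sugeno system with Gaussian MFs, an evenly spaced rule layout in place of subtractive clustering, and a very small elitist GA. All names (`predict`, `rmse`, `build_from_radius`, `ga_minimize`, `run_cascade`) are hypothetical. Step 1 searches over one structural parameter (a radius); step 2 starts from the step-1 system and tunes every MF and consequent parameter.

```python
import math
import random


def predict(params, x):
    """Zero-order Sugeno output; params holds (center, sigma, consequent) per rule."""
    num = den = 0.0
    for k in range(0, len(params), 3):
        c, s, q = params[k:k + 3]
        w = math.exp(-0.5 * ((x - c) / max(abs(s), 1e-6)) ** 2)
        num += w * q
        den += w
    return num / den if den > 0.0 else 0.0


def rmse(params, data):
    return math.sqrt(sum((predict(params, x) - y) ** 2 for x, y in data) / len(data))


def build_from_radius(radius, data):
    """Hypothetical stand-in for subtractive clustering: evenly spaced rules on [0, 1]."""
    radius = min(max(radius, 0.1), 1.0)
    n_rules = int(round(1.0 / radius)) + 1
    params = []
    for r in range(n_rules):
        center = r / (n_rules - 1)
        consequent = min(data, key=lambda p: abs(p[0] - center))[1]
        params += [center, radius / 2.0, consequent]
    return params


def ga_minimize(fitness, seed, step, rng, generations=30, pop_size=20):
    """Tiny elitist GA (uniform crossover, Gaussian mutation) seeded with `seed`."""
    pop = [list(seed)] + [[g + rng.gauss(0.0, step) for g in seed] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]
            child = [g + rng.gauss(0.0, step) if rng.random() < 0.3 else g for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)


def run_cascade(seed=0):
    rng = random.Random(seed)
    train = [(i / 20.0, math.sin(2.0 * math.pi * i / 20.0)) for i in range(21)]
    # Step 1: GA over the single structural parameter (the radius).
    best = ga_minimize(lambda ind: rmse(build_from_radius(ind[0], train), train), [0.5], 0.1, rng)
    fis_sc = build_from_radius(best[0], train)
    # Step 2: GA over all MF and consequent parameters, seeded with the step-1 system.
    fis_sc_mf = ga_minimize(lambda ind: rmse(ind, train), fis_sc, 0.05, rng)
    return rmse(fis_sc, train), rmse(fis_sc_mf, train)


step1_error, step2_error = run_cascade()
print(f"training RMSE after step 1: {step1_error:.4f}, after step 2: {step2_error:.4f}")
```

Because the second GA is seeded with the step-1 system and keeps its best individuals, the training error after step 2 cannot exceed the error after step 1 in this sketch. That guarantee holds only for the data used as the fitness target; behaviour on separate validation data still has to be checked, as the validation step in area “c” does.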

Evaluation results are as follows:

  1. GA-FUZZY algorithm

    The results of the GA-FUZZY algorithm are presented in Table 5. The residual is calculated using the samples in the external validation dataset.

    The best structure generated by the GA-FUZZY algorithm is FISGP1MFopt for Dataset 2. Predicted vs. expected values for the GA-FUZZY algorithm are presented in Figure 11.

  2. GA-ANFIS algorithm

    Table 6 shows the results of the GA-ANFIS algorithm for both datasets.

    The GA-ANFIS interface surface for the best GA-ANFIS structure (built on Dataset 2) and the testing results for this system are shown in Figure 12.

  3. GA-GA-FUZZY algorithm

    The final results for the GA-GA-FUZZY algorithm are presented in Table 7.

    Predicted vs. expected values are shown in Figure 13. We compared the performance of these three approaches, and the comparison suggests that a combination of these algorithms can be used to improve the efficiency of the solution. The results showed that the combinatorial and functional GA-GA-FUZZY algorithm is promising.

Figure 11

Genetic algorithm (GA)-FUZZY interface surface and diagram of predicted vs. expected values for FISGP1MFopt.

Figure 12

Genetic algorithm (GA)-adaptive neuro fuzzy inference systems (ANFIS) interface surface and diagram of predicted vs. expected values.

Figure 13

Genetic algorithm (GA)-GA-FUZZY interface surface and diagram of predicted vs. expected values.

| Fuzzy class (Sample Data: Dataset 1/Dataset 2) | Train Error | Testing Error | Checking Error | Residual |
|---|---|---|---|---|
| FISGP1MFopt | 0.0557/0.0148 | 0.9477/0.5743 | 0.1493/0.0939 | 0.0333/0.0202 |
| FISGP2MFopt | 0.7959/0.1339 | 0.8118/1.5320 | 1.7388/0.2888 | 0.9029/0.1282 |
| FISGP3MFopt | 0.0079/0.0387 | 0.5663/1.0892 | 0.1485/0.1219 | 0.1921/0.1029 |
| FISGP4MFopt | 0.7658/0.2831 | 0.5822/1.7075 | 0.9105/0.2864 | 0.2829/0.2866 |
| FISSC1MFopt | 0.8647/0.4403 | 0.7000/0.4146 | 0.4283/0.1075 | 0.2821/0.1092 |
| FISSC2MFopt | 1.4535/0.1825 | 0.9534/0.3203 | 1.7824/0.0863 | 1.0202/0.1191 |

Legend: GP1 = Grid Partition/Hybrid/Linear; GP2 = Grid Partition/Hybrid/Constant; GP3 = Grid Partition /Backpropagation/Linear; GP4 = Grid Partition/Backpropagation/Constant; SC1 = Subtractive Clustering/Hybrid; SC2 = Subtractive Clustering/Backpropagation; MFopt = GA optimization of fuzzy membership function parameters; FIS = Fuzzy system

Table 5

GA-FUZZY results (tar amount prediction error).
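The text does not state which error column decides the "best" GA-FUZZY structure, so it can help to rank the Table 5 rows by different columns. The short Python sketch below does this using only the values printed in Table 5; the names `TABLE5`, `COLUMNS` and `best_by` are hypothetical helpers introduced for illustration.

```python
# (train, testing, checking, residual), each as a (Dataset 1, Dataset 2) pair,
# copied from Table 5.
TABLE5 = {
    "FISGP1MFopt": ((0.0557, 0.0148), (0.9477, 0.5743), (0.1493, 0.0939), (0.0333, 0.0202)),
    "FISGP2MFopt": ((0.7959, 0.1339), (0.8118, 1.5320), (1.7388, 0.2888), (0.9029, 0.1282)),
    "FISGP3MFopt": ((0.0079, 0.0387), (0.5663, 1.0892), (0.1485, 0.1219), (0.1921, 0.1029)),
    "FISGP4MFopt": ((0.7658, 0.2831), (0.5822, 1.7075), (0.9105, 0.2864), (0.2829, 0.2866)),
    "FISSC1MFopt": ((0.8647, 0.4403), (0.7000, 0.4146), (0.4283, 0.1075), (0.2821, 0.1092)),
    "FISSC2MFopt": ((1.4535, 0.1825), (0.9534, 0.3203), (1.7824, 0.0863), (1.0202, 0.1191)),
}
COLUMNS = {"train": 0, "testing": 1, "checking": 2, "residual": 3}


def best_by(column, dataset):
    """Name of the structure with the smallest error in `column` for dataset 1 or 2."""
    col = COLUMNS[column]
    return min(TABLE5, key=lambda name: TABLE5[name][col][dataset - 1])


for dataset in (1, 2):
    for column in ("train", "residual"):
        print(f"Dataset {dataset}, lowest {column} error: {best_by(column, dataset)}")
```

Ranked by residual, FISGP1MFopt comes first for both datasets. Ranked by training error, FISGP3MFopt comes first for Dataset 1 and FISGP1MFopt for Dataset 2, which are the rows whose values appear in the GA-FUZZY column of Table 8. This is an observation from the printed numbers, not a statement of the authors' selection rule.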

| Datasets | Training Error | Checking Error | Testing Error | Residual (valid. data) |
|---|---|---|---|---|
| Dataset 1 | 5.44e-13 | 1.054 | 0.7977 | 0.068 |
| Dataset 2 | 5.12e-12 | 0.387 | 0.348 | 0.088 |

Table 6

GA-ANFIS algorithm results (tar amount prediction error).

| Fuzzy Class (Sample Data: Dataset 1/Dataset 2) | Train Error | Testing Error | Checking Error | Residual |
|---|---|---|---|---|
| FISSCopt,MFopt | 0.4075/0.0021 | 0.3953/0.012 | 0.40887/0.0834 | 0.0178/0.0190 |

Legend: MFopt = GA optimization of fuzzy membership function parameters; SCopt = GA optimization of fuzzy subclustering parameters; FIS = Fuzzy system

Table 7

GA-GA-FUZZY algorithm results (tar amount prediction error).

5. DISCUSSION AND COMPARATIVE ANALYSES

We compared similar methods with the proposed algorithm (Table 8).

| Optimization Method | Existing Systems: GA-FUZZY | Existing Systems: GA-ANFIS | Proposed System: GA-GA-FUZZY |
|---|---|---|---|
| Training error | 0.0079/0.0148 | 5.44e-13/5.12e-12 | 0.4075/0.0021 |
| Checking error | 0.1485/0.0939 | 1.054/0.387 | 0.4089/0.0834 |
| Residual (prediction error using validation samples) | 0.1921/0.0202 | 0.068/0.088 | 0.0178/0.0190 |

Legend: GA = Genetic algorithm; ANFIS = adaptive neuro fuzzy inference systems; Sample data: Dataset 1/Dataset 2, where Dataset 1 has 909 samples (644 training; 210 checking) for ANFIS model development and an additional 55 samples for external validation, and Dataset 2 has 909 samples (644 training; 210 checking) for ANFIS model development and an additional 55 samples for external validation

Table 8

Comparison of tar amount prediction error of proposed and existing systems.

In our study, we concluded that the GA-GA-FUZZY algorithm improves the prediction results compared with the GA-ANFIS algorithm: the prediction error on the validation samples falls from 0.068 to 0.0178 for Dataset 1 and from 0.088 to 0.0190 for Dataset 2 (Table 8). The proposed approach obtained better accuracy for both tar datasets.

Moreover, for Dataset 2 the accuracy is higher than the average accuracy reported in related research. By applying the new approach in our study, we achieved more precise prediction results than the research known so far.
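The size of the improvement can be expressed as a relative reduction of the residual. The following Python sketch uses only the residual values reported in Table 8; the helper name `relative_reduction` is an illustrative assumption.

```python
def relative_reduction(before, after):
    """Fractional decrease of an error value, e.g. 0.25 means 25% lower."""
    return (before - after) / before


# Residuals on the external validation samples, as reported in Table 8.
residuals = {
    "Dataset 1": {"GA-ANFIS": 0.068, "GA-GA-FUZZY": 0.0178},
    "Dataset 2": {"GA-ANFIS": 0.088, "GA-GA-FUZZY": 0.0190},
}

for name, r in residuals.items():
    cut = relative_reduction(r["GA-ANFIS"], r["GA-GA-FUZZY"])
    print(f"{name}: residual reduced by {cut:.1%}")
```

With these values, the residual of GA-GA-FUZZY is roughly 74% lower than that of GA-ANFIS for Dataset 1 and roughly 78% lower for Dataset 2.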

Production bills of material (production BOMs) are frequently changed in order to reduce toxic components, including tar, so the speed of obtaining information on the effects of a change is key when deciding on mass production based on a proposed change to the production BOM. Thus, when cigarette parameters change, the new algorithm lets us track the changes in the level of toxic components faster and more precisely.

6. CONCLUSION

In this paper we built three different algorithms:

  1. Combinatorial MF parameters algorithm GA-FUZZY, genetic optimization of FIS MF

  2. Combinatorial subtractive clustering ANFIS parameters algorithm GA-ANFIS, genetic optimization of ANFIS parameters

  3. Two-step combinatorial algorithm GA-GA-FUZZY based on two-step cascade optimization, where GA optimizes fuzzy inference systems generated after genetic optimization of ANFIS parameters. GA is applied on FIS structure generated by GA-ANFIS algorithm

The results show that the GA-GA-FUZZY algorithm performs better in predicting tar in cigarettes while knowing only four cigarette parameters: diameter, CO, nicotine and filter ventilation.

As Table 9 shows, the GA-GA-FUZZY model performed best in predicting the tar amount on the external validation dataset. These final results show that, on unknown input data, the GA-GA-FUZZY tar prediction is within 1.9% of accuracy. The HPLC methodology is much more time-consuming, so prediction models have a role in faster processing and data analysis.

| Proposed Systems (Sample Data: Dataset 1/Dataset 2) | GA-FUZZY | GA-ANFIS | GA-GA-FUZZY |
|---|---|---|---|
| Residual (prediction error using validation data samples) | 0.1921/0.0202 | 0.068/0.088 | 0.0178/0.0190 |
| HPLC prediction error | 0.267 | 0.267 | 0.267 |

GA = Genetic algorithm; ANFIS = adaptive neuro fuzzy inference systems; HPLC = High-performance liquid chromatography

Table 9

Comparison of tar amount prediction error between proposed systems.

From our experiment, we can infer the following:

  1. Four inputs (cigarette features) are used for the first level of GA-ANFIS optimization

  2. The combinatorial subtractive clustering ANFIS parameters algorithm GA-ANFIS generates the best subtractive clustering ANFIS parameters and the best SC GA-ANFIS structure

  3. The GA-GA-FUZZY algorithm optimizes the SC GA-ANFIS fuzzy structure (the result of the combinatorial subtractive clustering ANFIS parameters algorithm GA-ANFIS)

  4. Validation of the model was done for the GA-GA-FUZZY structure, and its results are presented

The GA-GA-FUZZY optimization algorithm is based on two-step cascade optimization, where the GA in the second step optimizes the fuzzy inference systems generated after genetic optimization of the ANFIS parameters in the first step. The results generated using the GA-GA-FUZZY algorithm demonstrate a new methodology for detecting tar in mainstream smoke and show the effectiveness of the proposed approach. In our study, we combined soft computing techniques with artificial smoking techniques to achieve even more precise results for tar amount prediction.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

AUTHORS' CONTRIBUTIONS

MK conceived the conceptual model and designed the research. ZA evaluated the appropriate artificial intelligence methods to be employed in this research. MK and LBF designed and implemented the simulation model, developed the MATLAB GUI and ran the prognostic model. MK, LBF and ZA performed the experiments, analyzed the output data and discussed the results. All authors read, reviewed and approved the final manuscript.

ETHICAL APPROVAL

This article does not contain any studies with human participants performed by any of the authors.

ACKNOWLEDGMENTS

We would like to thank the company “Tobacco Factory Sarajevo” for its help in collecting the data.

REFERENCES

1.A. Vairale, R. Khan, G.S. Jadhav, V. Nalamothu, and P. Sivaswaroop, RP HPLC method for the quantification of coal tar in topical foam, Anal. Chem. Indian J., Vol. 10, 2011, pp. 464-469.
14.M.-T. Wu, J.-S. Wu, C.-N. Lee, and M.-C. Chen, A genetic algorithm-fuzzy-based voting mechanism combined with hadoop map-reduce technique for microarray data classification, J. Comput., Vol. 24, 2013, pp. 40-48.
15.L. Sheng and X. Mab, A novel GA-fuzzy classification method for audio signals, J. Inf. Comput. Sci., Vol. 9, 2012, pp. 595-610.
16.A.V. Gite, R.M. Bodade, and B.M. Raut, ANFIS controller and its application, Int. J. Eng. Res. Technol., Vol. 2, 2013, pp. 1-5.
19.N. Pathania and N. Rithika, Implementation of fuzzy controller for diagonse of patient heart disease, Int. J. Innov. Sci. Eng. Technol., Vol. 2, 2015, pp. 694-698.
23.M. Hanefi CALP, A hybrid ANFIS-GA approach for estimation of regional rainfall amount, Gazi Univ. J. Sci., Vol. 32, 2019, pp. 145-162.
28.N. Ziasabounchi and I. Askerzade, ANFİS based classification model for heart disease prediction, Int. J. Eng. Comput. Sci. IJECS-IJENS., Vol. 14, 2014, pp. 7-12.
39.P. Bonissone, Adaptive Neural Fuzzy Inference Systems (ANFIS): Analysis and Applications, Schenectady, New York.
Journal
International Journal of Computational Intelligence Systems
Volume-Issue
12 - 2
Pages
1497 - 1511
Publication Date
2019/12/03
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891
