International Journal of Computational Intelligence Systems

Volume 12, Issue 2, 2019, Pages 1436 - 1445

Optimized Intelligent Design for Smart Systems Hybrid Beamforming and Power Adaptation Algorithms for Sensor Networks Decision-Making Approach

Ali Kamil Khiarullah1, *, Ufuk Tureli1, Didem Kivanc2
1Electronics And Communications Engineering, Yildiz Technical University, Istanbul, Turkey-34220
2Electrical And Electronics Engineering, Okan University, Istanbul, Turkey-34959
*Corresponding author: Ali Kamil Khiarullah
Received 1 September 2019, Accepted 10 November 2019, Available Online 27 November 2019.
DOI: 10.2991/ijcis.d.191121.001
Keywords: Optimized intelligent design for smart systems; Wireless network beamforming; Power adaption; Interference; Strategic decision-making; Game theory; Reinforcement learning

During the last two decades, power adaptation and beamforming solutions have been proposed for multiple input multiple output (MIMO) ad hoc networks. Game theory based methods, such as cooperative and non-cooperative joint beamforming and power control for MIMO ad hoc systems, address interference and overhead reduction but have failed to balance communication overhead against power minimization. The cooperative game-theoretic method achieves power minimization but introduces overhead. The non-cooperative game-theoretic solution reduces the overhead but needs more power and more iterations to converge. In this paper, novel game theory based algorithms are proposed to achieve the trade-off between power control and communication overhead for multiple-antenna wireless ad hoc networks operating in a multi-user interference environment. The optimized joint iterative power adaptation and beamforming method is designed to minimize the mutual interference at every wireless node while keeping a constant received signal to interference noise ratio (SINR) at every receiver node. First, a cooperative potential game based algorithm is designed for power and interference minimization, in which user clustering and binary weight books are used to reduce the overhead. Then, a non-cooperative approach using reinforcement learning (RL) is proposed to reduce the number of iterations and the power consumption in the network. The proposed RL procedure is fully distributed, as every transmit node requires only an observation of its instantaneous beamformer label, which can be obtained from its receive node. The simulation results for both methods show efficient power adaptation and beamforming for small and large networks, with lower overhead and interference than state-of-the-art methods.

© 2019 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license.


The ad hoc multiple input multiple output (MIMO) communication system is the basis for Massive MIMO, the future generation of wireless communication systems. Multiuser MIMO (MU-MIMO) boosts the design and development of Massive MIMO technologies [1]. The efficiency of a MIMO communication system depends mainly on the adaptability of transmit parameters such as beamformer selection, transmit power, transmission rate, and modulation technique. Beamformer selection plays an important role in adaptive wireless communication systems. For MIMO, beamforming methods have been designed for different types of networks, such as point-to-point, cellular, and ad hoc networks [2], and a number of methods address the challenges specific to each type. Conventional beamforming provides power control, throughput maximization, and capacity enhancement, especially for point-to-point and cellular networks. For MIMO ad hoc networks, distributed beamforming solutions improve system throughput and reduce energy consumption but suffer from interference and communication overhead. Several distributed beamforming methods have been designed in the literature to minimize interference and communication overhead.

As discussed above, beamforming solutions for MIMO fall into three categories. Point-to-point MIMO beamforming solutions are described in [3–6], and beamformers and linear precoders (eigencoders) for point-to-point MIMO communications are proposed in [7,8]. The beamforming algorithms designed for cellular networks in [9–11] reduce the power and improve the capacity of single-antenna mobile transmitters and array-equipped base stations. For point-to-point and cellular MIMO networks, beamforming and power control are achieved with little overhead and interference. The problem is harder for ad hoc MIMO networks, which operate without a centralized controller and with moving nodes. For MIMO ad hoc networks, distributed beamforming techniques enhance network throughput and minimize energy consumption [12,13]. The main challenge for optimization based solutions in ad hoc networks is that they need systematic study, because the environment of such networks is interference limited and the overhead introduced by beamforming algorithms degrades performance. Distributed spatial beamforming techniques were introduced earlier in [14,15] for multi-user ad hoc MIMO networks under channel reciprocity conditions. However, these methods introduce transmission overhead during power control at each iteration. To minimize the communication overhead, many iterative algorithms were subsequently introduced for ad hoc MIMO networks. However, these techniques do not achieve the trade-off, and the convergence of the iterative algorithms has not been investigated.

On the other hand, an information structure can be viewed in many formats, each of which gives a fuller understanding and a clearer view of the problem of interest; when predictive capability is available, it can be used to refine that view of the information (to modify, reinforce, or possibly reject it). Mathematical modeling, such as game theory and information theory, can capture important features of reality and meaningful data structure in a manageable manner, with efficient information processing. To arrive at smart and trustworthy decision-making strategies, qualified intuition is also strongly needed. A good decision aid methodology implies understanding reality in a manageable way [16]. Game theory is a useful mathematical tool for studying environments in which multiple players interact and make decisions; the players are usually assumed to be rational, meaning that they try to optimize their own interests [17].

Game theory based solutions have delivered efficiency, together with convergence analysis, for important problems in wireless communications such as joint code-division multiple access [18], distributed power control [19], and optimum transmission signaling strategies [20,21]. Game theory based beamforming and power adaptation methods have therefore been designed for multi-user ad hoc MIMO network communications to achieve power efficiency with minimum overhead. In [22], initial cooperative and non-cooperative game theory based methods were designed for joint channel and power control in wireless mesh networks, and channel allocation and beamforming for ad hoc networks was studied in [23]; however, these solutions are suboptimal for wireless communications. For ad hoc networks, reducing power with distributed algorithms and transmit beamformer selection is a difficult task, and hence it is challenging to formulate beamforming games for multi-user wireless communications. In [24,25], the first attempt at joint discrete transmit beamforming and power adaptation was introduced using cooperative and non-cooperative game theory based solutions. This decentralized approach optimizes the transmit beamformer and power adaptation by exploiting local information with an acceptable computational burden under the constraint of a constant received target signal to interference noise ratio (SINR). However, designing an optimized game theory based approach for power adaptation and beamforming in ad hoc MIMO systems remains a complex research problem, as both solutions failed to achieve the trade-off between power minimization and overhead.
In [25], the first solution, the cooperative power minimization algorithm (COPMA), achieved power minimization but introduced overhead, and the second, non-cooperative solution, the regret-matching-based joint transmit beamformer and power selection game (RMSG), minimized the communication overhead but needed more power and more iterations to converge. In this paper, we further optimize the COPMA and RMSG techniques to minimize the overhead and the power consumption, respectively. In [26], a reinforcement learning (RL) based algorithm is used to determine a suitable policy for selecting between two techniques, beamforming (BF) and power control (PC), for sensor array networks, exploiting the individual features of each method according to an SINR threshold. The key issue for the RL agent is to select the action (BF or PC) that satisfies the system requirement given the learning state, which in this case is the SINR. The RL algorithm uses Q-learning with an epsilon-greedy (ε-greedy) policy and is trained offline. Artificial intelligence (AI) techniques have also been described and applied to beamforming, power control, and MIMO wireless communication problems [27–30].

The key contributions of this paper are summarized as follows:

  1. An enhanced co-operative power minimization algorithm (ECOPMA) for MIMO ad hoc networks is proposed to overcome the limitations of COPMA. The cooperative method uses the potential game approach: it first computes the allocated power and beamformer for each user until convergence to the steady state, a Nash equilibrium (NE), and then divides users into clusters [31] and enables them to converge at the same time in order to minimize the overhead. In short, the users cooperate with each other to reduce power and interference using potential game theory and the NE.

  2. To reduce the communication overhead in ECOPMA, binary weight books are used instead of complex Grassmannian weight books. The binary weight book reduces the overhead incurred by the cooperative solution.

  3. A reinforcement learning based power allocation and beamformer algorithm (RLPBA) is proposed for multi-user ad hoc MIMO systems. RLPBA is an optimized non-cooperative solution built on an RL game based approach in which local information is used for the beamforming and power adaptation decisions. The reinforcement approach is intended to reduce the number of iterations, and hence the transmit power, needed for convergence.

  4. Extensive experimental results and comparative evaluations are presented for ECOPMA and RLPBA against centralized and state-of-the-art cooperative and non-cooperative algorithms.

Section 2 presents the proposed system model and the ECOPMA and RLPBA algorithms. Section 3 discusses the simulation results. Section 4 presents the conclusion and future work.


This paper proposes two techniques, ECOPMA and RLPBA, to address power minimization with minimum interference and communication overhead and guaranteed QoS for multi-user MIMO ad hoc systems under a constant SINR constraint. ECOPMA is based on cooperative potential games and uses binary weight books instead of the complex Grassmannian weight book to reduce the communication overhead. ECOPMA is similar to COPMA [25], with the addition of binary weight books. RLPBA is a non-cooperative game based power allocation and beamforming technique similar to RMSG [25], with the use of reinforcement learning to reduce the power consumption and the number of iterations to converge. Section 2.1 presents the MIMO ad hoc communication model and the game theory problem. Section 2.2 presents the design of the centralized beamforming and power allocation technique. Section 2.3 presents the design of ECOPMA, and Section 2.4 describes the design of RLPBA.

2.1. System Model

Figure 1 shows the system model, adopted from [25], which consists of a wireless ad hoc system with multiple-antenna node pairs sharing the same channel. Interference is generated by the other node pairs operating on the same channel. There are $N$ node pairs in total, and each pair $q$ consists of a single transmitting node and a single receiving node. Every transmitting and receiving node is equipped with $A$ antennas. Every node pair has the beamformer pair $(w_q, t_q)$. The complex transmitted symbol stream is $s_q \in \mathbb{C}$ and the received symbol stream is $\hat{s}_q \in \mathbb{C}$. The received signal vector at the $q$th receiving node, $R_q \in \mathbb{C}^{A}$, is the sum of the desired signal, the interference from the other transmitting nodes, and noise.

Here $U_{q,j}$ denotes the $A \times A$ MIMO channel between the $j$th transmitting node and the $q$th receiving node, which is assumed to be quasi-static, and $E_q$ is the power of the $q$th transmitting node. The additive white Gaussian noise terms $n_q$ have identical covariance matrices.
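A received-signal model consistent with these definitions is shown below. It is given as an assumption based on the standard narrowband MIMO interference model, not as the authors' exact expression. The received vector is written in lowercase, $r_q$, to keep it apart from the covariance matrix $R_q$ used later, and the symbols $s_j$ are assumed to have unit power.

```latex
\begin{align}
r_q &= \sqrt{E_q}\, U_{q,q}\, t_q\, s_q
      + \sum_{j=1,\; j \neq q}^{N} \sqrt{E_j}\, U_{q,j}\, t_j\, s_j + n_q, \\
\hat{s}_q &= w_q^{H} r_q .
\end{align}
```

The first term is the desired signal, the sum is the co-channel interference from the other $N-1$ transmitters, and the receive beamformer $w_q$ combines the $A$ antenna outputs into the symbol estimate.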

Figure 1

Environment and agent system interaction method [26].

This work considers the worst case, in which all pairs always have packets to transmit over the wireless channel. The network is assumed to be synchronous. The set of codebook beamformers available to the $q$th node pair is denoted here by $\mathcal{T}_q = \{t_q^{1}, t_q^{2}, \ldots, t_q^{\Upsilon}\}$, with cardinality $\Upsilon$. The receiving node selects the transmit beamformer from the codebook and feeds back the index of the selected beamformer, so every node can select among the $\Upsilon$ transmit beamformers in the codebook. Let $t_q \in \mathcal{T}_q$ be the transmit beamformer selected for the $q$th node pair. The transmit beamformer selections and the transmission powers of the $N$ node pairs are collected in $T = [t_1, t_2, \ldots, t_N]$ and $E = [E_1, E_2, \ldots, E_N]$, respectively. The interference and noise covariance at the $q$th receiving node is an $A \times A$ matrix $R_q$ that depends on $T_{-q}$ and $E_{-q}$, the transmit beamformers and powers of the nodes other than $q$.

Further, the normalized receive beamformer at the $q$th receiving node is computed as $w_q = \hat{w}_q / \lVert \hat{w}_q \rVert$, where $\hat{w}_q = R_q^{-1} U_{q,q} t_q$. The resulting SINR at the $q$th receiving node due to the desired transmitter of the $q$th node pair is denoted $\tau_q$.
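Forms of the covariance, receive beamformer, and SINR that are consistent with the definitions above are sketched below. They assume unit-power symbols and a noise covariance of $\sigma^{2} I_A$ (the symbol $\sigma^{2}$ is introduced here for illustration), and they should be read as a plausible reconstruction, not as the authors' exact equations.

```latex
\begin{align}
R_q &= \sum_{j \neq q} E_j\, U_{q,j}\, t_j\, t_j^{H}\, U_{q,j}^{H} + \sigma^{2} I_A, \\
w_q &= \frac{\hat{w}_q}{\lVert \hat{w}_q \rVert}, \qquad \hat{w}_q = R_q^{-1} U_{q,q}\, t_q, \\
\tau_q &= \frac{E_q\, \lvert w_q^{H} U_{q,q}\, t_q \rvert^{2}}{w_q^{H} R_q\, w_q}
        = E_q\, t_q^{H} U_{q,q}^{H} R_q^{-1} U_{q,q}\, t_q .
\end{align}
```

The last equality follows by substituting $\hat{w}_q$: with $v = U_{q,q} t_q$, the numerator becomes $E_q (v^{H} R_q^{-1} v)^{2}$ and the denominator $v^{H} R_q^{-1} v$, while the normalization cancels. For fixed interference the SINR is therefore linear in $E_q$.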

The proposed methods therefore aim to meet the target SINR by adjusting the transmit powers. The resulting optimization problem is described below.

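The objective is to reduce the transmit energy of all node pairs $q \in \{1, 2, \ldots, N\}$ in the network under the constraint of a constant SINR $\gamma_0$. We define the optimization problem as:

$$\min_{T,\,E} \; \sum_{q=1}^{N} E_q$$

$$\text{subject to} \quad \tau_q \geq \gamma_0, \quad \lVert w_q \rVert = \lVert t_q \rVert = 1, \quad E_{\min} < E_q \leq E_{\max},$$

where $T$ and $E$ denote the transmit beamformer selections and transmit powers of all node pairs, and $E_{\min}$ and $E_{\max}$ are the minimum and maximum transmit powers, respectively.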

This problem is represented in game-theoretic terms as the normal-form game $\langle N, C, \{F_q\}_{q=1}^{N} \rangle$,

where $N$ is the set of players, $C$ is the set of available actions of all $N$ players, and $\{F_q\}_{q=1}^{N}$ is the set of utility functions that the players associate with their strategies. The action $c_q \in C_q$ of a player $q$ is the selection of a transmit power $E_q \in (E_{\min}, E_{\max}]$ and a transmit beamformer $t_q$ from its codebook. The players select actions that increase their utility functions. The convergence point in the proposed case is a set of strategies, that is, a set of beamformer selections $T = [t_1, t_2, \ldots, t_N]$ and powers $E = [E_1, E_2, \ldots, E_N]$, from which no player would deviate. Such a set of strategies is known as an NE [30]: a strategy profile $c$ from which no player can increase its utility by a unilateral deviation, that is, by changing its own strategy while the actions of the other players stay the same. In this paper we design two node-pairing scenarios, cooperative and non-cooperative, to obtain the best outcomes that satisfy the objective function with guaranteed convergence.
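The NE condition described above can be checked mechanically for a small finite game. The sketch below is illustrative only: the two-player game, the `toy_power` table, and the function name `is_nash_equilibrium` are assumptions made for the example and do not come from the paper.

```python
from itertools import product
from typing import Callable, Sequence

Profile = tuple[int, ...]

def is_nash_equilibrium(profile: Profile,
                        action_sets: Sequence[Sequence[int]],
                        utilities: Sequence[Callable[[Profile], float]]) -> bool:
    """True if no single player can raise its own utility by deviating alone."""
    for q, actions in enumerate(action_sets):
        current = utilities[q](profile)
        for alt in actions:
            # Change only player q's action; everyone else stays put.
            deviated = profile[:q] + (alt,) + profile[q + 1:]
            if utilities[q](deviated) > current:
                return False
    return True

# Hypothetical identical-interest game: both players share the utility
# "minus total power", with made-up power values per joint action.
toy_power = {(0, 0): 3.0, (0, 1): 5.0, (1, 0): 4.0, (1, 1): 2.0}
utilities = [lambda c: -toy_power[c]] * 2
action_sets = [(0, 1), (0, 1)]

equilibria = [c for c in product(*action_sets)
              if is_nash_equilibrium(c, action_sets, utilities)]
print(equilibria)
```

In this toy game both (0, 0) and (1, 1) pass the check, although only (1, 1) minimizes the toy total power. An NE is therefore not automatically the global optimum, which is one reason iterative schemes add randomness to their update step.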

2.2. Centralized Method

Before discussing the decentralized solutions, this section formulates the centralized solution for MIMO ad hoc communication systems. A centralized agent [32] jointly selects the transmit beamformers and the corresponding transmit powers that minimize the overall transmit power of all the antennas; the minimizing $T$ and $E$ are the optimal transmit beamformer and power solutions, respectively.

The transmit power $E_q$ of the $q$th node pair, Eq. (8), is the power required to reach the target SINR with the selected beamformers.
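One way to write the centralized problem and the per-pair power is shown below. It is offered as an assumption that follows from the SINR model of Section 2.1 with $\hat{w}_q = R_q^{-1} U_{q,q} t_q$, where $R_q$ is the interference and noise covariance, and not as the authors' exact Eqs. (7) and (8). The star superscript for the optimum is notation chosen here.

```latex
\begin{align}
(T^{\star}, E^{\star}) &= \arg\min_{T,\,E} \sum_{q=1}^{N} E_q
   \quad \text{s.t.} \quad \tau_q \geq \gamma_0,\; E_{\min} < E_q \leq E_{\max}, \\
E_q &= \frac{\gamma_0}{t_q^{H}\, U_{q,q}^{H}\, R_q^{-1}\, U_{q,q}\, t_q}.
\end{align}
```

The second line sets the SINR $\tau_q = E_q\, t_q^{H} U_{q,q}^{H} R_q^{-1} U_{q,q} t_q$ equal to the target $\gamma_0$. Because $R_q$ depends on the powers of the other transmitters, a change at one node generally changes the power required at the others.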

In this case, the centralized agent computes the total network power for all $\Upsilon^{N}$ possible beamforming vector combinations, which becomes intractable for large-scale wireless ad hoc networks. This severe complexity rules out the centralized method for multi-user MIMO ad hoc systems. To solve this problem, we introduce decentralized solutions in this paper.
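To give a sense of scale, the short sketch below evaluates the number of joint beamformer combinations for the codebook size (16) and the network sizes (5 and 10 pairs) used later in Section 3. The function name is a hypothetical helper introduced for the example.

```python
def centralized_search_size(codebook_size: int, num_pairs: int) -> int:
    """Number of joint beamformer combinations a centralized agent must score."""
    return codebook_size ** num_pairs

small = centralized_search_size(16, 5)    # 5-pair network
large = centralized_search_size(16, 10)   # 10-pair network
print(f"N = 5:  {small:,} combinations")
print(f"N = 10: {large:,} combinations")
```

Doubling the number of pairs squares the size of the search space, from about one million to about one trillion combinations, and each combination still needs its own power computation.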


2.3. ECOPMA

As the name indicates, ECOPMA is a cooperative game based solution (in which nodes cooperate with each other to reach the optimum solution) for beamforming and adaptive power allocation in multi-user MIMO ad hoc systems. As discussed earlier, ECOPMA is similar to COPMA [25], with the addition of clustering in which two users are paired and of a binary codebook to reduce the computational overhead. In this work, we generate a binary codebook with 16 codewords (the codebook size $\Upsilon$), each of dimension 4 (the number of antennas $A$) [33].
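The text does not spell out how the binary codebook of [33] is built. One simple construction that matches the stated size (16 codewords of dimension 4) is to take every ±1 sign pattern and scale it to unit norm. The sketch below shows that construction as an assumption; the actual codebook in [33] may differ.

```python
import itertools
import math

def binary_codebook(num_antennas: int) -> list[tuple[float, ...]]:
    """All +/-1 sign patterns of length num_antennas, scaled to unit norm.

    ASSUMPTION: illustrative construction, not taken from the paper or [33].
    """
    scale = 1.0 / math.sqrt(num_antennas)
    return [tuple(sign * scale for sign in signs)
            for signs in itertools.product((1, -1), repeat=num_antennas)]

codebook = binary_codebook(4)
print(len(codebook))   # number of codewords
print(codebook[0])     # first codeword
```

Each codeword is fully described by 4 sign bits, which is the kind of saving a binary weight book offers over complex-valued Grassmannian entries. Note that in this construction a codeword and its negation differ only by a phase, so a practical design might keep fewer distinct beams.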

In the clustering mode, we assume that node pairs with similar transmit power properties are grouped into one cluster. The mobile users are clustered so that they converge at the same time, which minimizes energy and overhead. In [34], it is shown that, in the multi-user scenario, pairing two users into a cluster lets them share a common transmit beamforming vector and hence improves QoS performance with minimum overhead and power consumption. ECOPMA finds the joint optimal transmit power allocation and transmit beamformer such that the total power consumption in the network is minimized. Let the users be grouped into $M$ clusters such that the users in a cluster share a common transmit beamforming vector, and assume that the $q$th node pair belongs to the $m$th cluster, $m \in \{1, 2, \ldots, M\}$. For the clustering, we assume the following conditions on the transmit and receive beamforming vectors: (1) zero-forcing (ZF) precoding is applied at each receiving node to remove the inter-cluster interference, and (2) signal alignment is performed at the receiver between users in the same cluster. Accordingly, let $E_{q_m,m}$ and $T_{q_m,m}$ denote the transmission power vector and the transmit beamformer selection vector, respectively, for the $q$th node pair across the $m$th transmit and receive clusters. The objective function of ECOPMA, Eq. (9), is defined over these quantities.

Eq. (9) is also taken as every user's utility function, Eq. (10).


This paper models a special case of potential games called the identical interest game [25,35]. It is then easy to verify that every identical interest game has at least one pure NE, namely any action profile that maximizes the common utility $F(T, E)$. The ECOPMA technique is described in Algorithm 1.

Algorithm 1 ECOPMA

1. Inputs:
     k: predefined number of iterations
     M: number of clusters

2. Apply user clustering to all mobile users, regardless of user pairing.

3. FOR each cluster m in M, do

4.   Initialization: for each pair q, initialize the transmit beamformers and transmit powers:
       4.1. $\lVert w_{q_m,m} \rVert = \lVert t_{q_m,m} \rVert = 1$, for all $q \in N$, $m \in M$
       4.2. $E_{q_m,m} = E_{\max}$, for all $q \in N$

5.   Repeat: randomly select the qth node pair with probability 1/N.
       5.1. Set $t_{q_m,m}(n) = t_{q_m,m}(n-1)$ (current transmit beamformer for the qth node pair).
       5.2. Calculate $E_q^{current}$ using Eq. (8).
       5.3. Randomly select a transmit beamformer $t_q^{updated}$ and use Eq. (8) to compute the transmit power $E_q^{updated}$ required with the updated transmit beamformer.
       5.4. Form the vector $[ID_{q_m}, E_q^{current}, E_q^{updated}]$ and broadcast it to all other node pairs $j$, $j \in N$.
       5.5. On receiving the data vector, each node pair j does the following:
              IF ($E_j$ changes due to interference at the jth receiver)
                 set $E_j^{current} = E_j^{updated}$, then compute the new transmit power and set it as $E_j^{updated}$
              ELSE
                 leave $E_j^{current}$ and $E_j^{updated}$ unchanged
              END IF
              Send the vector $[ID_{j,m}, E_j^{current}, E_j^{updated}]$ back to node pair q.
       5.6. Node pair q computes the total network transmit power $E^{current} = \sum_{q=1}^{N} E_q^{current}$ and the updated power $E^{updated} = \sum_{q=1}^{N} E_q^{updated}$ with $t_q^{updated}$.
       5.7. The updating node pair q selects $t_q^{updated}$ with a probability computed from the smoothing factor $\tau > 0$.
       5.8. The qth node pair broadcasts a decision signal stating whether the new transmit beamformer is kept.
       5.9. If it is not kept, every other node pair $j$, $j \in N \setminus \{q\}$, keeps $E_j^{updated} = E_j^{current}$.

6.   Until the predefined number of iteration steps k is reached.


The smoothing factor is used as in [25]; the core contribution of this work to Algorithm 1 is the clustering, which is the difference between COPMA [25] and ECOPMA. $ID_{q_m}$ is the unique identifier of the single user that belongs to the $q$th node pair and the $m$th cluster.
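Step 5.7 of Algorithm 1 refers to a probability expression that is not reproduced above. A common choice in log-linear learning for potential games, and the assumption made in this sketch, is a logistic function of the change in total power scaled by the smoothing factor. The function name and the numeric values are hypothetical.

```python
import math

def keep_probability(e_current: float, e_updated: float, tau: float) -> float:
    """Probability of keeping the updated beamformer.

    ASSUMED form (not taken from the paper):
        1 / (1 + exp((e_updated - e_current) / tau))
    e_current, e_updated: total network transmit power before/after the change.
    tau: smoothing factor, tau > 0.
    """
    x = (e_updated - e_current) / tau
    # Evaluate the logistic function without overflowing exp() for large |x|.
    if x >= 0:
        z = math.exp(-x)
        return z / (1.0 + z)
    return 1.0 / (1.0 + math.exp(x))

print(keep_probability(100.0, 90.0, 5.0))    # update saves power
print(keep_probability(90.0, 90.0, 5.0))     # no change in power
print(keep_probability(90.0, 100.0, 1e-9))   # update costs power, tiny tau
```

Under this assumed rule, an update that lowers the total power is kept with probability above one half, and shrinking the smoothing factor makes the choice nearly deterministic. A larger smoothing factor leaves room to accept a worse beamformer occasionally, which can help the search leave a poor equilibrium.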

2.4. RLPBA

The second solution proposed in this paper is a non-cooperative game theory based approach to beamforming and adaptive power allocation in multi-user ad hoc MIMO systems using RL. The main aim is to design a distributed learning technique for joint transmit beamformer and power selection that needs only local information for its updates. A utility function for non-cooperative users is used in this case. In contrast to the cooperative solution, the $q$th node pair focuses only on minimizing its own power rather than the power of the complete network. Every player's utility function depends on its own selection of transmit beamformer and power and, through the perceived interference, on the other players' selections of beamformers and transmit powers. The proposed RLPBA solution assigns “rewards” to the selection of an action or strategy [26,36]. We formulate the non-cooperative game using an actor-critic learning algorithm, known as the Continuous Actor-Critic Learning Automaton, to predict and select the beamformer and then calculate the new transmit power accordingly.

Actor-critic is one of the important temporal-difference (TD) methods. It uses a separate memory structure to represent the policy independently of the value function. (In contrast to the reward Re, which is a short-term return, the value function is the anticipated long-term return, discounted by a decreasing factor.) The policy structure, which the agent uses to choose the next action given the current state, is known as the actor because it is used to select actions, and the estimated value function is known as the critic because it criticizes the actions made by the actor. The critic observes and evaluates the policy currently being followed by the actor [37].
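The division of labour between actor and critic can be illustrated with one tabular update for a single state. This is a generic textbook-style sketch, not the Continuous Actor-Critic Learning Automaton used in the paper; the step sizes `alpha` and `beta`, the `discount`, and the numeric inputs are illustrative assumptions.

```python
import math

def softmax(preferences):
    """Turn action preferences into selection probabilities."""
    top = max(preferences)
    exps = [math.exp(p - top) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(value, preferences, action, reward, next_value,
                      alpha=0.1, beta=0.1, discount=0.9):
    """One tabular actor-critic update for a single state.

    Critic: computes the TD error and corrects its value estimate.
    Actor: shifts the preference of the action that was just taken.
    """
    td_error = reward + discount * next_value - value
    new_value = value + alpha * td_error
    new_preferences = list(preferences)
    new_preferences[action] += beta * td_error
    return new_value, new_preferences, td_error

# Two actions, action 1 was taken and earned a reward of 1.
value, preferences, td_error = actor_critic_step(
    value=0.0, preferences=[0.0, 0.0], action=1, reward=1.0, next_value=0.0)
policy = softmax(preferences)
print(td_error, value, preferences, policy)
```

A positive TD error means the outcome was better than the critic expected, so the actor raises the preference of the chosen action and the policy starts to favour it.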

As in Figure 1, let $S$ denote the set of states, which contains the $\Upsilon$ transmit beamformers of the codebook that every node can select. Let the vector $\hat{t}_q$ (notation chosen here) denote the action set of user $q$, that is, $\hat{t}_q = \{t_q(1), t_q(2), \ldots, t_q(\Upsilon)\}$, where $t_q(i)$ denotes the transmit beamformer vector selected by the $q$th user in iteration $i$. The reward function $Re_q^{k}(\hat{t}_q)$ of the $q$th user for an action $\hat{t}_q$ at the $k$th iteration is defined by Eq. (11).

Every user $q$ computes $Re_q(\hat{t}_q)$ for each action over all past steps, with the actions of all other players unchanged. Every user $q$ then updates its reward value for every action $\hat{t}_q$ using Eq. (12).

After the update, the transmit beamformer $t_q^{k}$ is selected by computing the probability $P_q^{k}(\hat{t}_q)$ in Eq. (13).

Every user $q$ chooses an action or strategy according to the value of $P_q^{k}(\hat{t}_q)$ while checking the decision condition in Eq. (14).

Given the selected transmit beamformer $t_q^{k}$, the new transmit power $E_q$ is computed using Eq. (8). Algorithm 2 summarizes these steps.

Algorithm 2 RLPBA

Inputs:
     k: predefined number of iterations

1. FOR i = 1, 2, …, k
2.   FOR q = 1, 2, …, N
3.     Compute the reward value using Eq. (11)
4.     Update the reward table using Eq. (12)
5.     Compute the probability values using Eq. (13)
6.     Decide on the selection of $t_q^{k}$ using Eq. (14)
7.     Compute the new transmit power for $t_q^{k}$ using Eq. (8)
     END FOR
   END FOR
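The loop in Algorithm 2 can be sketched for a single transmit node with the interference held fixed. Eqs. (11) to (14) are not reproduced in the text, so this sketch substitutes common choices and labels them as assumptions: the reward is the negative normalized required power, the reward table is updated by an exponential moving average, and the selection probability is a Boltzmann (softmax) distribution. The list `required_power` stands in for the power that Eq. (8) would return for each beamformer; all names and numbers are hypothetical.

```python
import math
import random

def softmax(values, temperature):
    """Boltzmann distribution over the reward table (ASSUMED stand-in for Eq. (13))."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def rlpba_single_user(required_power, iterations, step=0.2,
                      temperature=0.05, seed=0):
    """Toy learning loop for one transmit node; all update rules are ASSUMED."""
    rng = random.Random(seed)
    e_max = max(required_power)
    rewards = [0.0] * len(required_power)   # reward table, one entry per beamformer
    probs = softmax(rewards, temperature)   # starts out uniform
    for _ in range(iterations):
        # Sample a beamformer index from the current P.M.F. (stand-in for Eq. (14)).
        action = rng.choices(range(len(probs)), weights=probs)[0]
        # Less required power gives a larger reward (stand-in for Eq. (11)).
        reward = -required_power[action] / e_max
        # Moving-average update of the reward table (stand-in for Eq. (12)).
        rewards[action] += step * (reward - rewards[action])
        probs = softmax(rewards, temperature)
    return probs

# Hypothetical required powers in mW for four candidate beamformers.
required_power = [40.0, 10.0, 25.0, 80.0]
initial_pmf = rlpba_single_user(required_power, iterations=0)
final_pmf = rlpba_single_user(required_power, iterations=500)
print(initial_pmf)
print(final_pmf)
```

The P.M.F. starts uniform and then concentrates on the beamformer that needs the least power, using only the node's own observations. This is the qualitative behaviour that the single-user P.M.F. figures in Section 3 are meant to show.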




This section evaluates the proposed ECOPMA and RLPBA methods against state-of-the-art solutions, namely centralized optimization, RMSG [25], and COPMA [25]. For the evaluation we designed ad hoc networks with 5 (small) and 10 (large) homogeneous pairs, each pair having one transmitter and one receiver node. The complete sets of parameters for the 5-pair and 10-pair networks are given in Tables 1 and 2, respectively. COPMA [25] uses a Grassmannian codebook of size ϒ = 16 with A = 3 antennas for all users; because the Grassmannian codebook is complex, ECOPMA uses a binary codebook of size ϒ = 16 with A = 4 antennas for all users in this work.

Parameter                              | Value
Number of wireless ad hoc pairs        | 5
Size of network                        | 30 m × 30 m
Constant received SINR                 | 10 dB
Grassmannian codebook                  | ϒ = 16, A = 3
Binary codebook                        | ϒ = 16, A = 4
Emax                                   | 100 mW
Emin                                   | 1 mW
Propagation channel                    | Radio propagation channel (path loss exponent 4)
Predefined number of iterations (k)    | 120
Smoothing factor (COPMA and ECOPMA)    | 0.01/k²
Clusters (ECOPMA)                      | 2
Methods investigated                   | Centralized, COPMA, RMSG, ECOPMA, and RLPBA

Notes. COPMA = cooperative power minimization algorithm; ECOPMA = enhanced co-operative power minimization algorithm; MIMO = multiple input multiple output; RLPBA = reinforcement learning based power allocation and beamformer algorithm; RMSG = regret-matching-based joint transmit beamformer and power selection game; SINR = signal to interference noise ratio.

Table 1

Simulation parameters for 5-Pairs wireless ad hoc MIMO system.

Parameter                              | Value
Number of wireless ad hoc pairs        | 10
Size of network                        | 100 m × 100 m
Constant received SINR                 | 10 dB
Grassmannian codebook                  | ϒ = 16, A = 3
Binary codebook                        | ϒ = 16, A = 4
Emax                                   | 100 mW
Emin                                   | 1 mW
Propagation channel                    | Radio propagation channel (path loss exponent 4)
Predefined number of iterations (k)    | 1500
Smoothing factor (COPMA and ECOPMA)    | 200/k²
Clusters (ECOPMA)                      | 4
Methods investigated                   | Centralized, COPMA, RMSG, ECOPMA, and RLPBA

Notes. COPMA = cooperative power minimization algorithm; ECOPMA = enhanced co-operative power minimization algorithm; MIMO = multiple input multiple output; RLPBA = reinforcement learning based power allocation and beamformer algorithm; RMSG = regret-matching-based joint transmit beamformer and power selection game; SINR = signal to interference noise ratio.

Table 2

Simulation parameters for 10-Pairs wireless ad hoc MIMO system.

3.1. Evaluations of 5-Pairs Wireless Ad hoc System

Table 1 lists the simulation parameters used to investigate these methods.

Using the parameters in Table 1, which are designed for evaluating small ad hoc networks, we measured the total power consumption. Figure 2 compares the total power consumed by the network for the cooperative methods COPMA and ECOPMA. The purpose of ECOPMA is to reduce the overhead by reducing the number of iterations relative to COPMA, which the results confirm (Figure 2). ECOPMA settles at the global optimum combination after 87 iterations, compared with 93 iterations for COPMA. The reduction in iterations lowers both the communication overhead and the total power consumption of the network. The improvement of ECOPMA is due to the clustering and the simple binary codebook. The non-cooperative distributed learning based techniques are evaluated similarly in Figure 3. RLPBA is designed to reduce the power consumption by reducing the number of iterations needed to reach the optimum allocation for the network pairs, exploiting the advantages of the RL technique over the regret based learning approach. The results show that RLPBA reduces the total transmission power for small wireless ad hoc MIMO networks compared with RMSG. RLPBA takes 110 iterations for the total network transmission power to converge, compared with 120 iterations for RMSG.
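For scale, the iteration counts reported above correspond to the following relative reductions. This is simple arithmetic on the stated numbers, not an additional result.

```latex
\begin{align}
\text{ECOPMA vs. COPMA:} \quad & \frac{93 - 87}{93} = \frac{6}{93} \approx 6.5\%, \\
\text{RLPBA vs. RMSG:} \quad & \frac{120 - 110}{120} = \frac{10}{120} \approx 8.3\%.
\end{align}
```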

Figure 2

Total transmit power versus iterations (N = 5) for cooperative methods evaluations.

Figure 3

Total transmit power versus iterations (N = 5) for non-cooperative methods evaluations.

Figure 4 compares all the methods. It shows that the total network power of the non-cooperative methods keeps varying over the 120 iterations. The cooperative methods (COPMA and ECOPMA) achieve a larger reduction in total transmit power than the non-cooperative methods (RMSG and RLPBA), but they introduce significant overhead in comparison, because the update step of the non-cooperative methods needs less overhead. Within each class, the proposed cooperative and non-cooperative methods outperform the existing solutions considered in this paper.

Figure 4

Total transmit power versus iterations (N = 5).

Figure 5 shows the power trajectories in ECOPMA for every user pair in the network. Every user starts at the maximum power level (i.e., 100 mW), and the power is then updated iteratively by the ECOPMA algorithm until the NE is reached.

Figure 5

Transmit power versus iterations with N = 5.

Figure 6 shows how the probability mass function (P.M.F.) of the RLPBA method, computed by Eq. (13), changes after iterations 1, 12, 50, and 100 for a single user. At the start, the user selects the transmit beamformers with equal probability; the P.M.F. then changes as the RL procedure progresses.

Figure 6

Probability mass function (P.M.F.) of the reinforcement learning based power allocation and beamformer algorithm (RLPBA) for a single user (N = 5).

3.2. Evaluations of 10-Pairs Wireless Ad hoc System

Table 2 lists the simulation parameters used to evaluate the different methods on large wireless ad hoc MIMO systems. Compared with Table 1, the network area, number of iterations, smoothing factor, and number of clusters are changed to suit the larger network topology.

As in the N = 5 scenario, Figures 7–9 show the total network transmission power for the cooperative methods, the non-cooperative methods, and all methods, respectively.

Figure 7

Total transmit power versus iterations (N = 10) for cooperative methods evaluations.

Figure 8

Total transmit power versus iterations (N = 10) for non-cooperative methods evaluations.

Figure 9

Total transmit power versus iterations (N = 10).

The results show that the proposed cooperative and non-cooperative methods achieve lower total transmission power and lower overhead than the existing cooperative and non-cooperative methods. The proposed solutions reduce the number of iterations needed to reach the NE and hence deliver higher throughput than the existing techniques. The centralized approach is no longer feasible in this scenario because of the enormous strategy space required [25]. As seen in Figure 9, the cooperative methods achieve a larger reduction in total transmit power than the non-cooperative methods because they converge earlier. The non-cooperative methods take many more iterations to reach the NE and hence fail to reduce the transmit power as far, but they incur less overhead than the cooperative solutions. For N = 10, the proposed solution further improves on the existing cooperative method, and RLPBA reduces the number of iterations to convergence, and hence the total transmit power, compared with RMSG. Figure 10 shows the variation in the P.M.F. of the RLPBA method, computed by Eq. (13), after iterations 1, 500, 1000, and 1500 for a single user in this scenario.

Figure 10

Probability mass function (P.M.F.) of the reinforcement learning based power allocation and beamformer algorithm (RLPBA) for a single user (N = 10).

The proposed solutions reduce the communication overhead compared with previous methods. Clustering with a binary codebook reduces the communication overhead of the cooperative method relative to COPMA, and RL reduces the communication overhead relative to RMSG.


In this paper we studied the design of beamformer selection and its important role in adaptive wireless communication systems. We proposed two game theory based algorithms that jointly optimize transmit power and transmit beamforming in ad hoc wireless MIMO systems with minimum power consumption and overhead. We designed a cooperative solution (ECOPMA) using clustering and a binary codebook, and a non-cooperative solution (RLPBA) using distributed reinforcement learning. The simulation results demonstrate the convergence properties of the proposed techniques and their performance in terms of overall power minimization, convergence rate, and communication overhead in the network. The proposed solutions improve on state-of-the-art methods: in the 5-pair network, ECOPMA converges in 87 iterations versus 93 for COPMA, and RLPBA in 110 versus 120 for RMSG. For future work, we suggest investigating performance under different propagation channels and signal to interference noise ratio (SINR) constraints.


17. W. Saad and M. Bennis, Game Theory for Future Wireless Networks: Challenges and Opportunities, ICC, London, UK, 2015 (tutorial).
24. E. Zeydan, D. Kivanc, and U. Tureli, Joint iterative beamforming and power control for MIMO ad hoc networks, in IEEE Global Telecommunications Conference GLOBECOM (Miami, FL), 2010, pp. 1-5.
28. J. Yoshida and A. Hirose, Beamforming for impulse-radio UWB communication systems based on complex-valued spatio-temporal neural networks, in Proceedings of the International Symposium on Electromagnetic Theory (Hiroshima, Japan), 2013, pp. 848-851.
