International Journal of Computational Intelligence Systems

Volume 14, Issue 1, 2021, Pages 896 - 921

Value-Based Reasoning in Autonomous Agents

Authors
Tomasz Zurek1,*, Michail Mokkas2
1Institute of Computer Science, Maria Curie-Sklodowska University, plac Marii Curie-Skłodowskiej 5, Lublin, 20-400, Poland
2Polish-Japanese Academy of Information Technology, Koszykowa 86, Warsaw, 02-008, Poland
*Corresponding author. Email: tomasz.zurek@poczta.umcs.lublin.pl
Received 27 March 2020, Accepted 11 January 2021, Available Online 19 February 2021.
DOI
10.2991/ijcis.d.210203.001
Keywords
Reasoning; Model; Goals; Values; Expert systems; Autonomous agents
Abstract

The issue of decision-making by autonomous agents constitutes a current research topic for many researchers. In this paper we propose to extend an existing model of value-based teleological reasoning with a new, numerical representation of the level of value promotion. We present and discuss proofs of the compatibility of the previous and current models, a formal mechanism for converting the parameters of an autonomous device into levels of value promotion, a mechanism for integration with machine learning approaches, and a comprehensive argumentation-based reasoning mechanism for making decisions.

Copyright
© 2021 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

A distinctive feature of modern technical devices is their increasing self-sufficiency. One may presume that in the future the user will only formulate the most rudimentary rules of how a device works (or goals), whereas the everyday aspects of system operation will be regulated completely autonomously. The most advanced types of such devices are self-driving vehicles, where the user merely sets the journey's destination and the vehicle develops the itinerary on its own and makes hundreds of traffic-related decisions.

The issue of decision-making by autonomous agents has been a research topic for a number of researchers representing various fields of science [1]. Typically, it is perceived as the selection of the best possible action to perform. Classical decision theory, developed mostly by economists, relies on the concept of utility, where utility is a way of representing the agent's preferences. Utility theory can be combined with probability theory to build the principle of maximum expected utility (PMEU), which states that an agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action [2,3]. In the early 2000s an alternative approach to classical decision theory emerged in which a decision-making situation is modeled on the basis of a logic-based representation of human reasoning and argumentation and so-called practical reasoning, i.e., reasoning about the action which should be performed by the agent (discussed from various points of view in [4-8] and others; e.g., [9,10] present it in a general discussion of argumentation). In this work we focus on the latter approach, where decisions are made on the grounds of a model of human reasoning and argumentation.

In major decision-making models it is usually assumed that the purpose of system operation is to accomplish some state of affairs pre-declared by the user ([11-13] and many others). Many authors (including [14,15]) believe, however, that increasingly complex devices must be endowed not only with the ability to find the best possible way to attain the state of affairs set by the user, but also with the ability to set goals themselves. The model proposed in this paper is based on this assumption, as we believe that the increasing autonomy of devices calls for systems able to self-sufficiently set themselves a goal and take measures to accomplish it.

Zurek [14] presents a framework allowing for autonomous goal-setting by a device, based on the values which should be promoted. The model relies on the differentiation between several kinds of goals: abstract goals (i.e., minimal levels to which values should be promoted) and material goals (i.e., particular states of affairs which fulfil abstract goals).

The objective of this paper is to propose, for the model from [14], a new method of representing the levels to which particular decision options promote various values, as well as a modification of the reasoning mechanism allowing for the autonomous setting and realization of goals. This study is an expansion of the research results presented during the FedCSIS 2017 conference and further discussed in [16]. The proposed mechanism may serve as a formal foundation for the realization of an autonomous agent and allow for an integration of the logic-based reasoning mechanism discussed in [14] with machine learning approaches. A simple example of a partially autonomous mobile device illustrates how the model operates.

1.1. Contribution

This study is an expansion of the research results briefly discussed during the FedCSIS 2017 conference [16]. The paper builds on the background of our two previous papers [14,16]: Sections 3 and 7 briefly summarize [14], and Sections 4.1 and (partially) 4.2 summarize [16].

Our main contribution to the previous results can be summarized as follows:

  • We present and discuss proofs of compatibility of our model (the basics of which were presented in [16]) with the general model presented in [14] as well as with rationality postulates (Sections 4.2 and 4.3);

  • We introduce an exemplary mechanism of conversion of the parameters of the autonomous device and its environment into the levels of promotion of values (Section 6);

  • We discuss the problem of the discrete and continuous decision space (Section 8);

  • We introduce and discuss the mechanism of integration of our model with machine learning approaches (Sections 9.1 and 9.2);

  • We introduce a general and comprehensive argumentation framework allowing for making value- and goal-based decisions (Section 10);

  • The model is illustrated by an example (Section 11).

The key point of the model from [14] was the possibility of making a decision (establishing the desired state of affairs and performing the action leading to it) even in decision situations which were not predicted in advance. This is realized on the basis of an abstract goal (the minimal required levels of promotion of values), which allows for the choice of the best practical goal (the action leading to the desired state of affairs). In order to make such a decision, the agent required evaluations (in terms of the values from its abstract goal) of the possible decisions, as well as orderings of those evaluations; obtaining these orderings was an important disadvantage of that approach. Our new approach extends the previous one by introducing the possibility of computing the levels of promotion of values and the orderings between them.

2. THE MAIN IDEA

The framework presented in [14] (and summarized in the following) provides strong possibilities for modeling reasoning in autonomous systems; however, its practical implementation requires solving a number of major problems. In our opinion, a serious disadvantage of the system is the requirement to declare a great number of orders indicating the relations between the levels to which various situations promote various values and sets of values. This task proves simple only in the case of decision support systems which contain a relatively low number of decision options and values. Unfortunately, for the majority of real decision-making situations, these sets are too large. We find it indispensable to extend the model (1) with a mechanism allowing for an automatic evaluation of the levels to which given values are promoted by various situations and (2) with a mechanism which determines the cumulative evaluation of the promotion of various values. The key assumption of our work is that estimating the levels to which possible actions and states of affairs promote values is easier than declaring the orderings between them (in [14] it is shown that the number of orderings can be significantly greater than the number of all values, actions, and states of affairs).

We plan to attain our goal by introducing a numeric representation of the levels of value promotion. Such a representation of value levels (for single values as well as for value sets) will facilitate their comparison, leading to easier and quicker reasoning (searching the order sets $O$ and $OR$ will not be necessary). Additionally, for systems in which values are connected with the physical parameters of a device or its environment (e.g., the battery level of a mobile device, transfer speed, temperature, etc.), we propose a basic mechanism for the automatic translation of physical units into levels of value promotion, which can be developed further and adjusted to evaluate values that are difficult to measure, e.g., the degree of resemblance to a pattern or the evaluation of a security level (as in [17-20]). The completion of this goal requires: (1) a definition of the representation method for the level of value promotion, including each value's weight; (2) the development of a mechanism determining the cumulative evaluation of the promotion of various values in a given situation, thus making the comparison of various options possible; (3) relevant modifications of the inference rules discussed in [14]; (4) a discussion of the possibility of integrating our mechanism with machine learning-based approaches; (5) verification that the developed mechanisms contradict neither intuition nor the formal properties of the model presented in [14].

Some comments: many real-life decision situations involve a huge number of options to choose from. Since a computer cannot distinguish between "reasonable" and "non-reasonable" options, in most decision systems all possibilities have to be analyzed. Since their number can be high, such an analysis can be a serious challenge (especially because it requires detailed knowledge about all the options and the relations between them, and because its result has a large impact on the quality of the decision). This problem exists in most (if not all) formal decision-making mechanisms. Looking at decision-making mechanisms from a very general point of view, we can distinguish two main approaches to collecting the knowledge necessary to make a decision:

  1. Introduce the knowledge directly to the system (knowledge-based approaches);

  2. Train the system by the examples of previous decisions (machine learning-based approaches).

Both approaches have their own advantages and disadvantages: knowledge-based systems require the preparation and formulation of all the necessary knowledge, but they allow for full control over the decision-making process (and can justify the decision). Machine learning-based approaches require an extensive set of examples to train the system (the quality of the decision depends on the size and quality of the training set) and, since most of them work as a black box, they cannot justify their decisions, which may exclude such systems from many decision areas (e.g., legal matters, which usually require a justification of the decision).

In order to illustrate the problem, we present the knowledge requirements of several knowledge-based decision-making models: in the classical PMEU model [2,3], the level of utility must be declared (or computed) and a level of probability assigned to every option. The AATS+V model initially [7] required the declaration of the promotion or demotion of each value for every possible transition between states of affairs (note that there can be many more transitions than states), but later works ([21] and others) allow the promotion or demotion to be calculated; in later versions, the model allowed for calculating not only the promotion or demotion, but also the levels of promotion and demotion. In Amgoud et al. [22], the orders between all possible options (representing the preference relation) must be declared; additionally, the preference relations between arguments for and against each option are computed on the basis of the strength of the knowledge supporting each argument (which also has to be declared in advance). In the model of practical reasoning presented in [11], states of affairs (decision options) require the declaration of the list of values supporting and demoting each state of affairs, as well as preorders on states representing the agents' preferences, which allows for the analysis of the value system and decision-making (the choice of a state of affairs). The popular BDI model also requires the analysis and comparison of all possible decision options.

In all the above examples, the amount of knowledge which has to be declared is (as in our approach) very high, and the quality of that knowledge is crucial to the quality of decisions. The difference between our approach and the approaches presented above lies in the kind of knowledge which has to be introduced to the system. In our model we introduce the levels of promotion of particular values and weight functions, instead of various orderings between states, actions, arguments, etc. An open problem remains: which kind of knowledge is easier to obtain? In our opinion, the answer to this question is not trivial and requires further research, especially regarding the possibility of utilizing machine learning mechanisms to support obtaining such knowledge. In Section 6 we introduce a general proposition on how machine learning mechanisms can help estimate the levels of promotion of particular values.

3. GVR MODEL

We will begin with a summarized discussion of the model of teleological reasoning from [14], further referred to as the GVR model.

Firstly, the naming convention will be presented:

  • By upper case letters we denote sets;

  • By lower case letters we denote propositions;

  • Subscripts denote names of propositions;

  • Superscripts denote names of sets;

  • Greek letters denote functions;

  • Other symbols will be defined later (except trivial logical and set-theory ones).

Definition 1.

[State of affairs] Let $S=\{s_0,s_1,s_2,\ldots\}$ be a finite, nonempty set of propositions. Each proposition represents one state of affairs. Let $\gamma$ be a function which returns 1 if a given state of affairs is true and 0 otherwise. One and only one element of set $S$ can be true: if $\gamma(s_y)=1$, then $\forall s_x \in S: s_x \neq s_y \Rightarrow \gamma(s_x)=0$. $s_0$ is the initial state of affairs. We assume that all states of affairs are separate. If we wanted to model a case in which more than one state of affairs is achievable simultaneously, they would have to be divided into separate decision options: e.g., having two states of affairs $s_a$ and $s_b$, an agent has the available states of affairs $S=\{s_a, s_b, s_{a,b}\}$. Such an approach may cause a combinatorial explosion, but since set $S$ contains only possible states, the number of possibilities will be significantly lower than $2^n$.

Definition 2.

[Actions] As an action we understand an activity which carries out a transition from a certain state of affairs to another state of affairs. Actions are represented by propositions from the set $A=\{a_1,a_2,\ldots,a_k\}$.

It is worth noticing that a particular action cannot be performed in every state of affairs. The set of all possible actions in all possible states of affairs is denoted $AS$ ($AS \subseteq A \times S$). Set $AS$ is a set of pairs $AS=\{as_{i,j}, as_{k,w},\ldots\}$ in which $as_{i,j}=\langle a_i,s_j \rangle$ and $as_{k,w}=\langle a_k,s_w \rangle$ (the first subscript denotes the name of an action, the second the name of a state of affairs). Each pair represents that a given action (e.g., $a_i$) can be performed in a given state of affairs (e.g., $s_j$). By $AS_j$ (where $AS_j \subseteq AS$) we denote the set of actions possible to perform in a state of affairs $s_j$.

Function $\delta: AS \to S$ returns the result of performing an action $a_i$ in a state of affairs $s_j$. By $\delta(as_{i,j})=s_y$, where $as_{i,j}=\langle a_i,s_j \rangle$, we denote that the result of performing an action $a_i$ in a state of affairs $s_j$ is $s_y$. The presentation of our model will be illustrated by a simple running example (adapted from [14]):

Example 1.

[Running example] John is in a bar and he is thirsty (state of affairs $s_{thirsty}$); he should drink a pint of lager (action $a_{drinkingLager}$), which will slake his thirst and lead to the state of affairs $s_{satisfied}$. By $as_{drinkingLager,thirsty}$ we denote the action-state pair (drinking lager in the state thirsty). Function $\delta(as_{drinkingLager,thirsty})=s_{satisfied}$ shows that drinking lager in the state thirsty leads to the state satisfied.

Definition 3.

[Transition process] Let $\varepsilon: AS \times S \to S$ be a partial function which represents performing an action $a$ in a state of affairs $s$. If $\delta(as_{i,j})=s_y$ and $\gamma(s_j)=1$ (the result of performing an action $a_i$ in a state of affairs $s_j$ is $s_y$), then performing $\varepsilon(as_{i,j})$ causes the change $\gamma(s_j)=0$ and $\gamma(s_y)=1$.

Example 2.

[Running example, cont.] If John, being thirsty ($\gamma(s_{thirsty})=1$), drinks a pint of lager (action-state pair $as_{drinkingLager,thirsty}$), then, since drinking beer brings about satisfaction ($\delta(as_{drinkingLager,thirsty})=s_{satisfied}$), the actual state of affairs will change:

$\gamma(s_{thirsty})=0$ and $\gamma(s_{satisfied})=1$.
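The mechanics of Definitions 1-3 can be sketched in a few lines of code. This is a minimal illustration only, not part of the original model; the names follow the running example.

```python
# States of affairs, actions, and the transition functions delta/epsilon.
S = {"thirsty", "satisfied"}                         # set S
AS = {("drinkingLager", "thirsty")}                  # action-state pairs
DELTA = {("drinkingLager", "thirsty"): "satisfied"}  # delta: AS -> S

current = "thirsty"  # the single state with gamma(s) = 1

def epsilon(action: str, state: str) -> str:
    """Perform an action in a state; gamma flips to the resulting state."""
    global current
    assert (action, state) in AS and current == state
    current = DELTA[(action, state)]
    return current

print(epsilon("drinkingLager", "thirsty"))  # -> 'satisfied'
```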

Definition 4.

[Situation] By a situation $x_n$ we understand a particular state of affairs or the result of a particular action performed in a given state of affairs. The set of situations $X$ is the union of the set of states of affairs and the set of results of actions: $X = S \cup AS$. By $x_n \in X$ we denote an element of set $X$. By $X_j$ we denote the set of situations available from a state of affairs $s_j$: $X_j = \{s_j\} \cup AS_j$. (We introduce the concept of situation because both states of affairs and actions can become decision options, and it is easier to use the symbol $X$ instead of $S \cup AS$.)

Example 3.

[Running example, cont.] In our case, the set of situations $X$ contains the actual state of affairs and one action-state pair: $X=\{s_{thirsty}, as_{drinkingLager,thirsty}\}$.

Definition 5.

[Values] Following Rohan [23], we have to separate the two meanings of the word value: a value may be understood as a concept or as a process.

  1. Value as an abstract concept which allows for the estimation of a particular action or state of affairs and influences one's behavior. $V$ is a set of values: $V=\{v_1,v_2,\ldots,v_n\}$.

  2. Value as a process of estimating the extent to which a particular situation (a state of affairs and/or an action) $x$ promotes a value $v_i$. By $v_i(x)$ we denote the extent to which $x$ promotes a value $v_i$. By $V(X)$ we denote the set of all valuations of all situations.

It is important to emphasize that values can be promoted to a certain degree by a particular state of affairs or action: $v(as_{i,j})$ represents the degree to which a value $v$ is promoted by the state of affairs which is the result of performing an action $a_i$ in a state of affairs $s_j$.

Example 4.

[Running example, cont.] We introduce two values: happiness and frugality ($V=\{v_{happiness}, v_{frugality}\}$).

The levels of promotion of those values by both available situations ($X=\{s_{thirsty}, as_{drinkingLager,thirsty}\}$) are:

$V(X)=\{v_{happiness}(s_{thirsty}), v_{happiness}(as_{drinkingLager,thirsty}), v_{frugality}(s_{thirsty}), v_{frugality}(as_{drinkingLager,thirsty})\}$

By $V_i(X)$ we denote the set of all possible extents to which a value $v_i$ from set $V$ may be promoted by any possible situation $x \in X$.

Example 5.

[Running example, cont.] For value $v_{happiness}$ the set $V_{happiness}(X)$ contains $\{v_{happiness}(s_{thirsty}), v_{happiness}(as_{drinkingLager,thirsty})\}$.

For value $v_{frugality}$ the set $V_{frugality}(X)$ contains $\{v_{frugality}(s_{thirsty}), v_{frugality}(as_{drinkingLager,thirsty})\}$.

A partial order $O_i=(\geq; V_i(X))$ represents the relation between the extents to which values are promoted: $v_i(x_n) \geq v_i(x_m)$ means that $x_n \in X$ promotes a value $v_i$ to a no lesser extent than $x_m \in X$. If $v_i(x_n) \geq v_i(x_m)$ and $v_i(x_m) \geq v_i(x_n)$, then the extents to which situations $x_n$ and $x_m$ promote a value $v_i$ are equal ($v_i(x_n)=v_i(x_m)$). If $v_i(x_n) \geq v_i(x_m)$ and $v_i(x_m) \not\geq v_i(x_n)$, then $v_i(x_n) > v_i(x_m)$.

Example 6.

[Running example, cont.] We assume that $O_{happiness}$ contains $\{v_{happiness}(as_{drinkingLager,thirsty}) > v_{happiness}(s_{thirsty})\}$, which means that drinking lager promotes happiness to a greater extent than staying thirsty, and that $O_{frugality}$ contains $\{v_{frugality}(s_{thirsty}) > v_{frugality}(as_{drinkingLager,thirsty})\}$, which means that staying thirsty promotes frugality to a greater extent than drinking lager.

In real-life reasoning people do not rely only on comparisons of the levels of promotion of one value; usually, they compare the levels of promotion of various values. Strictly speaking, such levels are incomparable, but in practice people compare not only the levels of promotion of various values, but also the levels of promotion of various sets of values.

Definition 6.

[Sets of values] By $V^Z \subseteq V$ we denote a subset (named $Z$) of the set of values $V$ which consists of the values $v_i, v_j, \ldots \in V^Z$.

By $V^{x_i} \subseteq V$ we denote the set of values promoted by a situation $x_i$.

Definition 7.

[The level of promotion of a set of values] By $V^Z(x_n)$ we denote the set of estimations of the levels of promotion of the values constituting set $V^Z$ by a situation $x_n \in X$. If $V^Z=\{v_z,v_t\}$, then $V^Z(x_n)=\{v_z(x_n),v_t(x_n)\}$.

By $V^{x_i}(x_i)$ (when the superscript contains the name of a situation) we denote the set of estimations of the levels of promotion of all values promoted by a situation $x_i \in X$.

Example 7.

[Running example, cont.] Since both available situations in our example promote the same values (but to different extents), $V^Z(s_{thirsty})=V^{s_{thirsty}}(s_{thirsty})$ and $V^Z(as_{drinkingLager,thirsty})=V^{as_{drinkingLager,thirsty}}(as_{drinkingLager,thirsty})$.

Definition 8.

[Value-extent preference] A partial order $OR=(\succ; 2^{V(X)})$ represents a preference relation between various values and various sets of situations: $V^Z(x_n) \succ V^Y(x_m)$ means that the extent to which values from set $V^Z$ are promoted by a situation $x_n$ is preferred to the extent to which values from set $V^Y$ are promoted by a situation $x_m$. Properties of the $OR$ relation:

  • Relation $OR$ is a strict partial order, hence it is irreflexive, asymmetric, and transitive.

  • If $V^Z$ is a set of values promoted by a situation $x_1$ ($V^Z \subseteq V^{x_1}$) and $V^X \subseteq V^Z$, then:

    $V^X(x_1) \succ V^Y(x_2) \Rightarrow V^Z(x_1) \succ V^Y(x_2)$.

Example 8.

[Running example, cont.] By $V^Z(as_{drinkingLager,thirsty}) \succ V^Z(s_{thirsty})$ we denote that John prefers the extents to which drinking lager promotes both values ($v_{happiness}(as_{drinkingLager,thirsty})$ and $v_{frugality}(as_{drinkingLager,thirsty})$) to the extents to which staying thirsty promotes both values ($v_{happiness}(s_{thirsty})$ and $v_{frugality}(s_{thirsty})$), which means that he prefers drinking lager to staying thirsty.

Note that John prefers beer even though frugality is promoted to a lower extent than by staying thirsty. This reflects a case in which someone has to balance values and accept a decision in which some values are promoted maximally at the expense of others.

How can we determine whether the extents to which all values are promoted by one situation are preferred to the extents to which all values are promoted by another situation? In the simplest case, we can assume some properties connecting orders $O$ and $OR$, which may allow for reasoning about preferences; however, we have to note that order $O$ is not sufficient to infer all necessary orders $OR$. Firstly, we have to assume some basics:

  • By $V^{x_1 \geq x_2}$ we denote the values which $x_1$ promotes to a no lesser extent than $x_2$: $V^{x_1 \geq x_2}=\{v_i \mid v_i(x_1) \geq v_i(x_2)\}$.

  • By $V^{x_2 > x_1}$ we denote the values which $x_2$ promotes to a higher extent than $x_1$: $V^{x_2 > x_1}=\{v_i \mid v_i(x_2) > v_i(x_1)\}$.

  • By $V^{x_1 \setminus x_2}$ we denote the values which are promoted by $x_1$ but not by $x_2$.

  • By $V^{x_2 \setminus x_1}$ we denote the values which are promoted by $x_2$ but not by $x_1$.

  • $V^{x_2 \succ x_1} = V^{x_2 > x_1} \cup V^{x_2 \setminus x_1}$

  • $V^{x_1 \succeq x_2} = V^{x_1 \geq x_2} \cup V^{x_1 \setminus x_2}$

  • For the sake of consistency, we also assume that the order declarations are rational, i.e., there are no declarations leading to situations in which $OR$ includes both $V^A \succ V^B$ and $V^B \succ V^A$ (e.g., there are no declarations like $V^A \subseteq V^{x_1}$, $V^B \subseteq V^{x_2}$, $V^A(x_1) \succ V^{x_2}(x_2)$ and $V^B(x_2) \succ V^{x_1}(x_1)$).

Then we formulate relations between orders O and OR:

  • The first and basic mechanism connecting orders $O$ and $OR$ is very simple: on the basis of the ordering of the extents to which a given value is promoted by two situations, a preference between these extents can be derived:

    $v_i(x_1) > v_i(x_2) \Rightarrow \{v_i(x_1)\} \succ \{v_i(x_2)\}$  (1)

Example 9.

[Running example, cont.] This can be illustrated by a simple example. Since $v_{happiness}(as_{drinkingLager,thirsty}) > v_{happiness}(s_{thirsty})$, then:

$\{v_{happiness}(as_{drinkingLager,thirsty})\} \succ \{v_{happiness}(s_{thirsty})\}$

Because drinking lager promotes $v_{happiness}$ to a higher extent than staying thirsty, in the light of this value drinking beer is preferred to staying thirsty.

  • The next mechanism allows preferences to be derived between the sets of values promoted by given situations: if every value promoted by a situation $x_2$ is also promoted by a situation $x_1$, $x_1$ promotes each of them to a no lesser extent than $x_2$, and at least one of these values is promoted by $x_1$ to a higher extent than by $x_2$, then the extents to which values are promoted by $x_1$ are preferred to those of $x_2$:

    $(\forall v_z \in V^{x_2}\,(v_z(x_1) \geq v_z(x_2)) \wedge \exists v_t \in V^{x_2}\,(v_t(x_1) > v_t(x_2))) \Rightarrow V^{x_1}(x_1) \succ V^{x_2}(x_2)$  (2)

Example 10.

[Running example, cont.] In order to illustrate the above we have to add a new value: social relations ($v_{socialRelations} \in V$). Both available situations promote this value to the extents $v_{socialRelations}(as_{drinkingLager,thirsty})$ and $v_{socialRelations}(s_{thirsty})$, where $(v_{socialRelations}(as_{drinkingLager,thirsty}) = v_{socialRelations}(s_{thirsty})) \in O_{socialRelations}$ (which means that both situations promote this value to the same level).

If we assume that there are only two values in set $V$ ($v_{socialRelations}, v_{happiness} \in V$), then the sets $V^{as_{drinkingLager,thirsty}}$ and $V^{s_{thirsty}}$ are equal and contain only those two values.

Since both values in $V^{s_{thirsty}}$ are promoted by $as_{drinkingLager,thirsty}$ at least to the same level ($v_{socialRelations}$ to the same level, $v_{happiness}$ to a higher extent), then

$V^{as_{drinkingLager,thirsty}}(as_{drinkingLager,thirsty}) \succ V^{s_{thirsty}}(s_{thirsty})$

  • If the set of values promoted by a situation $x_2$ ($V^{x_2}$) consists of two sets: the values which are promoted by $x_1$ to a higher extent than by $x_2$ ($V^{x_1 > x_2}$) and the remaining values promoted by $x_2$ ($V^{x_2 \succ x_1}$), and if there exists a set of values $V^s \subseteq V^{x_1 > x_2}$ whose promotion by $x_1$ is preferred to the levels to which the values from set $V^{x_2 \succ x_1}$ are promoted by $x_2$, then the extent to which values are promoted by a situation $x_1$ is preferred to the extent to which values are promoted by a situation $x_2$:

    $(V^{x_2} = V^{x_1 > x_2} \cup V^{x_2 \succ x_1}) \wedge \exists V^s \subseteq V^{x_1 > x_2}\,(V^s(x_1) \succ V^{x_2 \succ x_1}(x_2)) \Rightarrow V^{x_1}(x_1) \succ V^{x_2}(x_2)$  (3)

Example 11.

[Running example, cont.] In order to illustrate the above we assume that drinking lager promotes social relations and happiness ($V^{as_{drinkingLager,thirsty}}=\{v_{socialRelations}, v_{happiness}\}$; note that drinking lager does not promote frugality) and staying thirsty promotes social relations and frugality ($V^{s_{thirsty}}=\{v_{socialRelations}, v_{frugality}\}$; note that happiness is not promoted by staying thirsty). Additionally we assume that $(v_{socialRelations}(as_{drinkingLager,thirsty}) = v_{socialRelations}(s_{thirsty})) \in O_{socialRelations}$ and $(\{v_{happiness}(as_{drinkingLager,thirsty})\} \succ \{v_{frugality}(s_{thirsty})\}) \in OR$, which can be interpreted as follows: John prefers the level of promotion of happiness by drinking beer to the level of promotion of frugality by staying thirsty.

On the basis of the above assumptions and formula (3) we can say that

$V^{as_{drinkingLager,thirsty}}(as_{drinkingLager,thirsty}) \succ V^{s_{thirsty}}(s_{thirsty})$

The above can be simply justified: since social relations are promoted to an equal level by both situations and drinking beer is preferred to staying thirsty in terms of happiness and frugality, the agent prefers drinking lager.

  • In any other case, it is impossible to derive a preference order.

As we have noticed earlier, orders from set $O$ do not allow for the inference of all necessary orders from set $OR$; hence some of them must be declared in advance. The main challenge in modeling the above preference relation lies in the necessity of declaring a large number of orders between the extents to which situations promote all subsets of values from set $V$.
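The derivable part of $OR$ can nevertheless be computed mechanically. Below is a minimal sketch of formula (2), assuming, purely for illustration, that promotion levels are already numeric (the GVR model itself only assumes the partial orders $O_i$); the names and numbers are hypothetical, echoing Example 10.

```python
# promo_x maps each value promoted by situation x to its promotion level.
def prefers_by_formula_2(promo_x1: dict, promo_x2: dict) -> bool:
    """True if x1 promotes every value promoted by x2 at least as much,
    and at least one of them strictly more (formula (2))."""
    if not set(promo_x2) <= set(promo_x1):
        return False  # some value promoted by x2 is not promoted by x1
    weak = all(promo_x1[v] >= promo_x2[v] for v in promo_x2)
    strict = any(promo_x1[v] > promo_x2[v] for v in promo_x2)
    return weak and strict

# Example 10: lager matches on social relations and wins on happiness.
lager = {"socialRelations": 0.4, "happiness": 0.7}
thirsty = {"socialRelations": 0.4, "happiness": 0.2}
assert prefers_by_formula_2(lager, thirsty)
```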

3.1. Goals

As it has already been noticed, we need to draw a distinction between a few different kinds of goals [14]:

Definition 9.

[Abstract goals] Abstract goals are goals represented by the minimal extents to which a particular situation promotes a given set of values (in our model the valuation of any target state of affairs can be considered only together with the valuation of the action which leads to it; hence by a situation we will understand a particular action performed in a particular state of affairs):

  • $G^A=\{g^a_1, g^a_2, \ldots\}$ is a set of abstract goals.

  • By $v_n^{min}(g^a)$ we denote the minimal extent to which the promotion of a value $v_n$ satisfies a goal $g^a$.

  • By $v_n(x_1) \geq v_n^{min}(g^a)$ we denote that a goal $g^a$ is satisfied by a situation $x_1$ with respect to a value $v_n$.

  • By $v_n \in g^a$ we denote that the minimal extent of a given value $v_n$ is declared in a goal $g^a$: $v_n \in g^a \Leftrightarrow \exists v_n^{min}(g^a)$.

Definition 10.

[Abstract unreachable goals] Abstract unreachable goals are abstract goals in which the agent desires one value to be promoted as much as possible, while the other values are to be promoted no less than to a certain extent:

  • $G^{UA}=\{g^{ua}_1, g^{ua}_2, \ldots\}$ is a set of abstract unreachable goals.

  • By $v_n^{min}(g^{ua})$ we denote the minimal extent to which the promotion of a value $v_n$ satisfies a goal $g^{ua}$.

  • Function $\omega: G^{UA} \to V$ returns the value which should be promoted to the maximal possible extent. A value $v_m$ such that $\omega(g^{ua})=v_m$ will be called the key value of an abstract unreachable goal $g^{ua}$. Note that value $v_m$ also belongs to the ordinary goal values of $g^{ua}$ ($v_m \in g^{ua}$) and has its own minimal extent to which it should be promoted.

Definition 11.

[Material goals] Material goals are particular situations which satisfy given abstract goals.

  • $G^M=\{g^m_1, g^m_2, \ldots\}$ is a set of material goals.

  • A material goal is a particular action performed in a given state of affairs: $g^m_k=(as_{t,m})$, where $as_{t,m} \in AS$.

  • By $sat(g^m_k, g^a_l)$ we denote that a material goal $g^m_k$ satisfies an abstract goal $g^a_l$ and is possible to achieve:

    $sat(g^m_k, g^a_l) \Leftrightarrow \forall v_i \in g^a_l\,(v_i(as_{t,m}) \geq v_i^{min}(g^a_l)) \wedge \exists as_{t,m}\,(g^m_k=(as_{t,m}) \in AS \wedge \gamma(s_m)=1)$.

Definition 12.

[Practical goal] A practical goal is a material goal which is achievable, which satisfies the agent's abstract goal, and which the agent is going to reach. A practical goal will be denoted as gp. The agent can only have one practical goal at a time.

Example 12.

[Running example, cont.] Let us assume that John has one abstract goal $g^a=\{v_{happiness}^{min}(g^a), v_{frugality}^{min}(g^a)\}$ expressed by two minimal acceptable levels of promotion of two values: happiness and frugality.

We have three decision options: $as_{drinkingLager,thirsty}$, $as_{drinking2Lagers,thirsty}$, and $s_{thirsty}$ (drinking a beer, drinking two beers, staying thirsty). We assume that $O_{happiness}$ contains:

$\{v_{happiness}(as_{drinkingLager,thirsty}) > v_{happiness}^{min}(g^a)\}$, $\{v_{happiness}(as_{drinking2Lagers,thirsty}) > v_{happiness}^{min}(g^a)\}$, $\{v_{happiness}(s_{thirsty}) < v_{happiness}^{min}(g^a)\}$,

which means that drinking one lager (or two) promotes happiness to a satisfactory level, while staying thirsty does not.

$O_{frugality}$ contains:

$\{v_{frugality}(s_{thirsty}) > v_{frugality}^{min}(g^a)\}$, $\{v_{frugality}(as_{drinkingLager,thirsty}) > v_{frugality}^{min}(g^a)\}$, $\{v_{frugality}(as_{drinking2Lagers,thirsty}) < v_{frugality}^{min}(g^a)\}$,

which means that staying thirsty or drinking one lager promotes frugality to a satisfactory level, while drinking two beers does not (two beers are too expensive).

Since our agent has abstract goal $g^a$, we can derive that only one option (drinking one lager) promotes all values from goal $g^a$ to a satisfactory level, thanks to which $as_{drinkingLager,thirsty}$ becomes a material and practical goal.

Now let us assume that John has an unreachable abstract goal $g^{ua}=\{v_{happiness}^{min}(g^{ua}), v_{frugality}^{min}(g^{ua})\}$ in which $\omega(g^{ua})=v_{happiness}$. We assume that the thresholds in $g^{ua}$ are the same as in $g^a$ and the key value of $g^{ua}$ is happiness (happiness should be promoted as much as possible while keeping the other values promoted at least to a satisfactory level).

3.2. Inference Mechanism

Zurek [14] discusses a number of inference rules as well as a simple argumentation framework which allows for reasoning about goals and making decisions concerning the fulfilment of these goals. Our paper includes a discussion of several inference rules (Section 7) and a slightly modified argumentation framework (Section 10). An exhaustive discussion of the inference rules' structure and properties is included in [14].

4. NUMERIC REPRESENTATION OF THE LEVELS OF VALUE PROMOTION BY PARTICULAR SITUATIONS

The underlying objective of this section is to discuss a new semantics (the basics of which were presented in [16]) for the model described in [14], which allows for a numeric representation of the levels to which given situations promote various values. This task consists of two main parts: first, a proposal and discussion of the numeric method of representing the level to which given values are promoted; then the development of the mechanism determining this level and the cumulative evaluation of value sets, the mechanism of comparing them (equivalents of the $O$ and $OR$ relations), a discussion of the properties of the proposed semantics, and verification that the new propositions contradict neither intuition nor the formal assumptions of the model from [14].

4.1. Numeric Representation of Levels to Which Various Situations Promote Particular Values

In the model presented in [14], the level to which a given value (e.g., $v_i$) is promoted by a given situation (e.g., $x_j$) is represented as $v_i(x_j) \in V(X)$, without a definition of the kind and range of values it can take. The relations between the levels to which a given value $v_i$ is promoted by situations $x_1$ and $x_2$ are regulated by a partial order $O_i$. In our model (first presented in [16]), we assume that $v_i(x_j)$ is expressed as a number from the range $[0;1)$, where $v_i(x_j)=0$ means that a value $v_i$ is not promoted by a situation $x_j$, and $v_i(x_j)=1$ would mean that $v_i$ is promoted by $x_j$ to the maximal possible level; however, we assume that the maximal fulfilment of a value is not attainable: the value may be promoted arbitrarily close to the maximal level, but cannot reach it. Such a representation allows us to compare the levels to which the same value is promoted by various situations (order $O$ becomes a total order).

Assuming a numeric representation of the level of promotion of values, one cannot miss the fact that not all values are equally important to each user. The simplest solution would be to assign a constant weight to each value (from the range $(0;1]$), though this proposition is a far cry from the way humans reason. In real decision-making, a person does not assign constant weights to various values: e.g., for a mobile device user wishing to run a web application, the battery level seems less important than transfer speed, but if the battery is nearly exhausted, this fact becomes the critical value for the user. In most cases, the weight of a given value depends on the user's preferences, external factors, and the level to which the value is promoted (as in the abovementioned battery example). This phenomenon is noticed by the author of [24]. A numeric representation of values in legal decision-making was discussed in [25]. The authors introduced a system dedicated strictly to legal reasoning with factors. Despite some common elements (numbers represent the weights of values), this model is different from ours: firstly, because it was designed strictly for legal reasoning, where there are only two possible and strictly opposite decision options (plaintiff-defendant); secondly, because the authors of [25] use factors as a case representation (which are absent in our system); and thirdly, because the system uses a different calculus than we do (e.g., we have a dedicated function $\Theta$ allowing for the cumulative evaluation of different values, whereas in [25] pro-plaintiff values' weights are added and pro-defendant values' weights are subtracted). Obviously there are some general similarities, e.g., the numerical representation of values allows (similarly to our approach) for the comparison of cases (or decisions) which promote different values. We also agree with the authors of [25] that not all values are equally important. This observation, together with the fact that the importance of a given value can depend on the level of its promotion (and, possibly, of other values), is the reason for introducing the weight functions ($\Omega_i$).

Definition 13.

(Function of the weight [16]) Let $\Omega_i: [0;1) \to [0;1)$ be the weight function referring to a value $v_i$. We assume that every weight function is continuous and increasing on $[0;1)$ (this assumption is indispensable for the preservation of the features of the GVR model). For every value $v_i \in V$ there exists at most one function $\Omega_i$. Let $\Omega$ be the set of weight functions. By $vo_i(x_j)=\Omega_i(v_i(x_j))$ we denote the level of promotion of a value $v_i$ by a situation $x_j$ taking into account weight $\Omega_i$; the value $vo_i(x_j)$ denotes the relative level to which a situation $x_j$ promotes a value $v_i$. Let $VO(X)=\{vo_i(x_j) \mid vo_i(x_j)=\Omega_i(v_i(x_j)) \wedge v_i(x_j) \in V(X)\}$ be the set of all values $vo_i$ in all situations.

$VO^Z(X)$ will denote any subset of $VO(X)$ named $Z$.

The most basic kind of function $\Omega_i$ is a linear function $\Omega_i(v_i(x))=a \cdot v_i(x)$, where $a$ is a constant from the range $(0;1]$; it is, however, possible to define more complex functions which better express the relative preferences between values. For example, in the battery case presented above we can use the following function:

$$\Omega_{batt}(v_{batt}(x)) = \begin{cases} 0.1\,v_{batt}(x) & \text{where } v_{batt}(x) \leq 0.2 \\ 1.6\,v_{batt}(x) - 0.3 & \text{where } 0.2 < v_{batt}(x) \leq 0.45 \\ 0.1\,v_{batt}(x) + 0.45 & \text{where } v_{batt}(x) > 0.45 \end{cases}$$

If the level of promotion of the battery charge ($v_{batt}$) is above 45%, then changes in its value do not strongly influence the relative level of promotion of the value ($vo_{batt}$). If the level of promotion of the battery charge decreases below 45%, then the relative level of promotion decreases rapidly, almost to 0, which means that such a low level of battery charge becomes critical. The above example presents a situation where, for a low level of battery charge (below 0.2), the weight of the $v_{batt}$ value becomes very low (close to zero), which strongly decreases the cumulative evaluation of the given situation (see Section 4.2). If for a given value $v_i$ it is impossible to determine function $\Omega_i$, then we can assume the default $\Omega_i(v_i(x))=a \cdot v_i(x)$, where $a=0.5$.
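The piecewise weight function above translates directly into code. A minimal sketch, with the coefficients as printed in the example:

```python
def omega_batt(v_batt: float) -> float:
    """Relative promotion level vo_batt = Omega_batt(v_batt)."""
    if v_batt <= 0.2:
        return 0.1 * v_batt        # nearly exhausted: weight close to zero
    if v_batt <= 0.45:
        return 1.6 * v_batt - 0.3  # critical zone: weight drops rapidly
    return 0.1 * v_batt + 0.45     # comfortable zone: weight barely changes

print(omega_batt(0.9))  # ~0.54 -> high charge, high relative promotion
print(omega_batt(0.3))  # ~0.18 -> falling fast inside the critical zone
```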

4.2. Cumulative Evaluation of the Level of Promotion of a Value Set by a Given Situation

In the model discussed in [14], a given situation may promote various values to various extents. This is represented by the set $V^Z(x_n)$, where $V^Z$ is a subset (named $Z$) of the value set $V$: $V^Z(x_n)$ denotes the set of estimations of the levels of promotion of the values constituting set $V^Z$ by a situation $x_n \in X$; if $V^Z=\{v_z,v_t\}$, then $V^Z(x_n)=\{v_z(x_n),v_t(x_n)\}$. The GVR model also introduces the order relation $OR$ between the extents to which various sets of values are promoted by various situations (Def. 8). Properties of the $OR$ order are discussed in [14].

Since our model assumes that the grounds for evaluation are the relative levels to which particular values are promoted by situations, the correspondent of set $V(X)$ in our model will be set $VO(X)$ and its subsets [16].

For the realization of value-based reasoning to be possible, it is indispensable to define the correspondent of order $OR$ for the new semantics. Introducing the numeric representation of the level to which a given situation promotes a given value ($v_i(x_n)$) and the weight functions $\Omega$ provides the possibility to develop a mechanism able to determine the cumulative evaluation of the promotion of a value set by a given situation.

Firstly, we assume that the cumulative evaluation of the level to which a given value set is promoted by a given action will be a number from the range $[0;1)$, where 0 is interpreted as the minimal level of promotion and 1 as the maximal level of promotion (impossible to attain). Such a representation will enable the comparison of various situations, including those which promote different values.

Definition 14.

(Function $\Theta$ [16]) Let $\Theta: VO^Z(x) \to [0;1)$ be a function returning the cumulative evaluation of the level to which a situation $x$ promotes a value set $V^Z$.

If:

  • $V^Z=\{v_1\}$, then $\Theta(VO^Z(x)) = vo_1(x)$;

  • $V^Z=\{v_1,v_2\}$, then $\Theta(VO^Z(x)) = vo_1(x) + vo_2(x) - vo_1(x)\,vo_2(x)$;

  • $V^Z=\{v_1,v_2,v_3\}$, then the value returned by function $\Theta$ is determined in the following manner: first, we determine $\Theta(VO^{\{v_1,v_2\}}(x))$ for $V^{\{v_1,v_2\}}=\{v_1,v_2\}$, then we determine $\Theta(VO^Z(x)) = \Theta(VO^{\{v_1,v_2\}}(x)) + vo_3(x) - \Theta(VO^{\{v_1,v_2\}}(x))\,vo_3(x)$;

  • In the case of a higher number of values in set $V^Z$, the cumulative value $\Theta(VO^Z(x))$ is determined analogously to the previous case.

The main idea of function $\Theta$ is to introduce a joint evaluation of a particular situation. Such an evaluation allows for the comparison of various situations (even those which promote different values). For example, suppose we have two options to choose from: staying at home to work ($x_{home}$) or going to a party ($x_{party}$). We know that staying at home promotes productivity to a certain level (e.g., $vo_{prod}(x_{home})=0.5$) and saves our money ($vo_{money}(x_{home})=0.4$). Going to the party promotes fun ($vo_{fun}(x_{party})=0.6$) and allows for contact with friends ($vo_{friends}(x_{party})=0.4$).

The cumulative evaluation of staying at home is

$\Theta(VO^{home}(x_{home})) = vo_{prod}(x_{home}) + vo_{money}(x_{home}) - vo_{prod}(x_{home})\,vo_{money}(x_{home}) = 0.7$

The cumulative evaluation of going to the party is

$\Theta(VO^{party}(x_{party})) = vo_{fun}(x_{party}) + vo_{friends}(x_{party}) - vo_{fun}(x_{party})\,vo_{friends}(x_{party}) = 0.76$

The cumulative evaluation of the values promoted by the party is higher than the cumulative evaluation of the values promoted by staying at home (note that the evaluation is based on the relative levels of the values' promotion), which can be interpreted as the situation in which this particular agent prefers going to the party to staying at home. The concept of function $\Theta$ simulates a real-life problem in which we have to compare situations promoting different, and sometimes difficult to compare, values.
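Since the pairwise rule $a+b-ab$ is simply folded over the set of relative levels, $\Theta$ fits in a few lines. A minimal sketch reproducing the numbers above:

```python
from functools import reduce

def theta(vo_levels):
    """Cumulative evaluation Theta of a list of relative levels vo_i(x)."""
    return reduce(lambda acc, vo: acc + vo - acc * vo, vo_levels)

home = theta([0.5, 0.4])   # productivity, money  -> 0.7
party = theta([0.6, 0.4])  # fun, friends         -> 0.76
assert party > home        # realizes the ORO preference: party over home
```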

The properties of function $\Theta$ (both are illustrated by the sketch below):

  1. The value returned by function $\Theta$ is independent of the order in which the particular values from $V^Z$ are reviewed (the proof can be found in Appendix A.0.1).

  2. Function $\Theta$ is monotonic (by monotonic we understand not only that it is non-decreasing, but also that adding a new value promoted by a given situation increases the cumulative evaluation $\Theta(VO^Z(x))$): if $VO^Z(x)$ is the set of relative levels of promotion of the values from set $V^Z$ in a situation $x$ and the situation $x$ also promotes a value $v_j \notin V^Z$, then $\Theta(VO^{Z+v_j}(x)) > \Theta(VO^Z(x))$ (the cumulative evaluation of the level of promotion of the values from $V^Z$ together with $v_j$ will be higher than the evaluation of the level of promotion of the values from $V^Z$ alone). The proof can be found in Appendix A.0.2.
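Both properties can be checked numerically. A quick self-contained sketch (the levels are arbitrary illustrative numbers):

```python
from functools import reduce
from itertools import permutations

def theta(vo_levels):
    return reduce(lambda acc, vo: acc + vo - acc * vo, vo_levels)

levels = [0.3, 0.6, 0.2]
evals = {round(theta(list(p)), 12) for p in permutations(levels)}
assert len(evals) == 1                        # property 1: order-independent

assert theta(levels + [0.1]) > theta(levels)  # property 2: adding a promoted
                                              # value raises the evaluation
```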

As we have already noticed, values should not be treated equally (they are not all equally important), and therefore we proposed a set of weight functions Ω. Based on that, we assume that in our model the equivalent of order OR will be order ORO:

Definition 15.

(Value-extent-weight preference [16]) A total order $ORO=(\succeq; 2^{VO(X)})$ represents a preference relation between various values, their weight functions, and various sets of situations. $VO^Z(x_n) \succeq VO^Y(x_m)$ means that the relative extent to which values from set $V^Y$ are promoted by a situation $x_m$ is not preferred to the relative extent to which values from set $V^Z$ are promoted by a situation $x_n$.

We assume that $\Theta(VO^Z(x_n)) \geq \Theta(VO^Y(x_m)) \Leftrightarrow VO^Z(x_n) \succeq VO^Y(x_m)$.

Since function $\Theta$ allows the cumulative evaluation of the promotion of all value sets in all situations to be determined, order $ORO$ can be unambiguously determined on the basis of the above formula. Hence, in practice, there is no necessity to declare the relations in order $ORO$.

If $VO^Z(x_n) \succeq VO^Y(x_m)$ and $VO^Y(x_m) \succeq VO^Z(x_n)$, then we say that $VO^Z(x_n) \approx VO^Y(x_m)$.

If $VO^Z(x_n) \succeq VO^Y(x_m)$ and $VO^Y(x_m) \not\succeq VO^Z(x_n)$, then we say that $VO^Z(x_n) \succ VO^Y(x_m)$.

4.3. Relations Between Orders O and ORO

Zurek [14] (as well as Section 3 of this paper: Def. 8 and formulae (1), (2), (3)) introduces the properties of order $OR$ and the relations between orders $O$ and $OR$. These relations may be used to determine unknown relations of order $OR$. Since in our model the bases for the evaluation of a situation are the relative levels to which values are promoted (taking the weight functions into account), the equivalent of order $OR$ for the numeric representation of the levels of value promotion is order $ORO$. To preserve the features of the mechanism from [14], we have to ensure that order $ORO$ has the same properties as $OR$ and preserves the relations between $OR$ and order $O$. We assumed that the functions in $\Omega$ are continuous and increasing, due to which we know that if $v_i(x_j) > v_i(x_k)$, then $\Omega_i(v_i(x_j)) > \Omega_i(v_i(x_k))$. The proof is trivial.

Knowing the properties of the functions from the Ω set, we will examine the behavior of function Θ and, in consequence, the features of order ORO. Firstly, we will investigate whether order ORO has features similar to OR (def. 8).

  1. Since order $ORO$ is a total order, it fulfills the assumptions of a partial order. The only difference is that order $ORO$ is not a strict order.

  2. The other feature of order $OR$ is preserved: if $V^Z$ is a set of values promoted by a situation $x_1$ ($V^Z \subseteq V^{x_1}$) and $V^X \subseteq V^Z$, then

    $VO^X(x_1) \succeq VO^Y(x_2) \Rightarrow VO^Z(x_1) \succeq VO^Y(x_2)$. Since function $\Theta$, which supports order $ORO$, is monotonic (see property 2 of function $\Theta$), this condition is fulfilled (the proof can be found in Appendix A.0.3).

Furthermore, we will analyze whether the relations between orders $O$ and $ORO$ are the same as the relations formulated in [14] (recalled in Section 3 of this paper, formulae (1), (2), (3)) between orders $O$ and $OR$ (the proofs can be found in Appendix A.0.4).

  1. The first relation will be fulfilled if $v_i(x_1) > v_i(x_2) \Rightarrow \{vo_i(x_1)\} \succ \{vo_i(x_2)\}$.

  2. The second relation will be fulfilled if

    $(\forall v_z \in V^{x_2}\,(v_z(x_1) \geq v_z(x_2)) \wedge \exists v_t \in V^{x_2}\,(v_t(x_1) > v_t(x_2))) \Rightarrow VO^{x_1}(x_1) \succ VO^{x_2}(x_2)$

  3. The third relation will be introduced differently than in [14]. It will be fulfilled if

    $(V^{x_2}(x_2) = V^{x_1 > x_2}(x_2) \cup V^{x_2 \succ x_1}(x_2)) \wedge (V^{x_1}(x_1) = V^{x_2 > x_1}(x_1) \cup V^{x_1 \succeq x_2}(x_1)) \wedge \exists VO^s(x_1) \subseteq VO^{x_1 \succeq x_2}(x_1)\,(VO^s(x_1) \succ VO^{x_2 \succ x_1}(x_2)) \Rightarrow VO^{x_1}(x_1) \succ VO^{x_2}(x_2)$

5. GOALS

The four kinds of goals defined in [14] (and discussed in Section 3.1 of this paper) remain the same. The only changes are caused by the fact that the level of promotion of values is expressed in numbers, and therefore the threshold values ($v_n^{min}(g^a)$ and $v_n^{min}(g^{ua})$) will also be numbers from the range $[0;1)$. These values may be declared directly by the user or determined from function $\Phi$.

Although we focus on the four main kinds of goals presented in [14], we do not exclude the possibility of adding other types of goals. For example, should a need arise for a more sophisticated kind of goal allowing for balancing between the levels of promotion of values, we can introduce a new kind of abstract goal in which the threshold values ($v_n^{min}(g^a)$) are not fixed but depend on the levels of promotion of other values (such a goal makes it possible to represent a situation in which we allow a small demotion of one value in order to obtain a strong promotion of another). The discussion of such new types of goals is left for future work.
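With numeric thresholds, checking whether a material goal satisfies an abstract goal reduces to a simple comparison. Below is a minimal sketch of $sat(g^m, g^a)$ under the assumptions of this section; the option names and numbers are hypothetical, echoing Example 12.

```python
def satisfies(promotion: dict, abstract_goal: dict) -> bool:
    """True if every value of the abstract goal is promoted at least
    to its minimal required level v_n_min(ga)."""
    return all(promotion.get(v, 0.0) >= v_min
               for v, v_min in abstract_goal.items())

goal = {"happiness": 0.4, "frugality": 0.3}        # abstract goal thresholds
options = {                                        # achievable situations
    "one_lager":    {"happiness": 0.6, "frugality": 0.5},
    "two_lagers":   {"happiness": 0.8, "frugality": 0.1},
    "stay_thirsty": {"happiness": 0.1, "frugality": 0.9},
}
# Options satisfying the goal become candidate material (practical) goals.
print([name for name, p in options.items() if satisfies(p, goal)])
# -> ['one_lager']
```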

6. DECISION SPACE

Similarly to other argumentation-based mechanisms of practical reasoning and decision-making, our model assumes a discrete space of available events. When making a decision, the agent is faced with a known number of available options. It is worth noticing that in our model we assume that all actions are certain (i.e., the result of a given action is always certain); the problem of uncertain results of actions and of the behavior of other agents was discussed in [26]. Adding such a possibility to our model is feasible, but it requires further development and will be discussed in future work.

In the presented model, the decision-making options available in a state of affairs $s_j$ have been described in the form of a set of situations $X_j$, which allows for maintaining the current state of affairs ($s_j$) or performing an action included in the set of available actions ($AS_j$). For a closer analysis of the decision-making process, we need to differentiate between the decision space (i.e., the total number of options available to the agent) and the respective decision variables. To date, our model provided the agent with a set $AS_j$ of actions available in a given state of affairs $s_j$. Building upon this, we now wish to expand our model in such a way that the respective actions in set $AS_j$ are combinations of the values of the particular decision variables.

Definition 16.

[Decision variables] Let $x_i=(d_1,d_2,\ldots,d_n)^T$ be a vector of decision variables referred to as $i$ and corresponding to an action taken in accordance with the parameters expressed as $(d_1,d_2,\ldots,d_n)^T$. Let $X$ be a set of decision variable vectors. Vector $x_i$ represents the selection of a certain specific action with the parameters $(d_1,d_2,\ldots,d_n)$, with one exceptional case: the action of taking no action whatsoever and maintaining the current state of affairs. Let $D_1$ be the domain (set of possible values) of variable $d_1$, $D_2$ the domain of variable $d_2$, etc. If the domain of each variable is finite and discrete, the set $X$ (the decision space) will be a subset of the Cartesian product of the variable domains $d_1,\ldots,d_n$: $X \subseteq D_1 \times D_2 \times \cdots \times D_n$. In other words, any situation from $X_j$ can be expressed as a vector from set $X$ (note that not every combination of decision parameters represents a possible decision; some combinations are not available). In the subsequent parts of these deliberations, whenever it is necessary to refer directly to the vector of decision variables, it will be denoted as $x_i$; otherwise, we will refer to actions ($as_{i,j}$) and situations ($x$), bearing in mind, however, that they relate to a vector of decision variables.

Example 13.

[Running example, cont.] The above definition can be illustrated by an example. Let us assume that the following decision-making situation occurs: we are in a pub (state of affairs $s_{pub}$), feel like having a glass of beer, and can choose between two types of beverage (lager and stout) as well as two glass sizes: 0.5 or 0.3 liters. We also have the choice of taking no action whatsoever. Given the above, we are dealing with two decision variables: $d_{beer}$ and $d_{glass}$, where the domain of $d_{beer}$ is $D_{beer}=\{lager, stout, nothing\}$ and the domain of $d_{glass}$ is $D_{glass}=\{0.5, 0.3, nothing\}$. Hence, the decision space is as follows:

$\{as_{pub,(lager,0.5)^T}, as_{pub,(stout,0.5)^T}, as_{pub,(lager,0.3)^T}, as_{pub,(stout,0.3)^T}, as_{pub,(nothing,nothing)^T}\}$
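Enumerating such a discrete decision space is straightforward. A minimal sketch for Example 13; the availability filter is an assumption made here purely for illustration:

```python
from itertools import product

d_beer = ["lager", "stout", "nothing"]
d_glass = [0.5, 0.3, "nothing"]

def available(beer, glass) -> bool:
    # "nothing" only makes sense for both variables at once: one cannot
    # order no beer in a 0.3 l glass, nor a beer in no glass.
    return (beer == "nothing") == (glass == "nothing")

decision_space = [x for x in product(d_beer, d_glass) if available(*x)]
print(decision_space)
# [('lager', 0.5), ('lager', 0.3), ('stout', 0.5),
#  ('stout', 0.3), ('nothing', 'nothing')]
```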

If the number of decision variables and the number of values in the domain of each variable are high, then we can face a combinatorial explosion of the number of options to choose from (the number can be decreased because not all combinations of decision options are available; e.g., there is no $as_{pub,(nothing,0.3)^T}$ or $as_{pub,(nothing,0.5)^T}$ in $AS$). Such a situation can be the origin of some computational problems, but (as we noted in the comments in Section 2) this is a problem which appears in most (if not all) formal methods of decision-making: the rational choice of an option requires the analysis of all possibilities, no matter how we describe them (as decision vectors, states of affairs, transitions between states, actions, etc.).

In Zurek and Mokkas [16], we proposed a very simplified model for the transformation of a physical quantity $pv_i$ into a certain level of promotion of a given value $v_i$ by employing the $\Phi_i$ function. The $\Phi_i$ function allows the value of one physical quantity (or any given decision parameter) to be converted into the level of value promotion. Unfortunately, it does not account for the fact that a decision-making situation typically entails numerous decision variables (expressed in our model by the variable vector). Therefore, because that model proved too simplistic to be applicable to a more complex decision-making reality, we expanded it as described below.

Definition 17.

Let $\Phi_i: X \to [0;1)$ be a function that transforms the vector of decision variables into the level of promotion of value $v_i$: $v_i(x)$. Let $\Phi$ be the set of transformation functions. Our model can make use of several different transformation functions, which can be declared separately depending on the nature of the values.

If an action $as_{k,j}$ is represented by a decision vector $x$, then $\Phi_i(as_{k,j})=\Phi_i(x)$.

The level of promotion determined in this way can be used for further inference.

The above definition can be illustrated by an example:

Example 14.

[Running example, cont.] Let us assume that John has the following decision options: $as_{pub,(lager,0.5)^T}$, $as_{pub,(stout,0.5)^T}$, $as_{pub,(lager,0.3)^T}$, $as_{pub,(stout,0.3)^T}$, $as_{pub,(nothing,nothing)^T}$.

Let us assume the function $\Phi_{happiness}(x)$:

$$\Phi_{happiness}(x) = \begin{cases} 0 & \text{where } x = as_{pub,(nothing,nothing)^T} \\ x_2 & \text{where } x = as_{pub,(lager,x_2)^T} \text{ and } x_2 \leq 0.8 \\ 1.2\,x_2 & \text{where } x = as_{pub,(stout,x_2)^T} \text{ and } x_2 \leq 0.8 \\ 0.3 & \text{elsewhere} \end{cases}$$

The above function shows that staying without a drink does not promote John's happiness, and that he prefers to drink more rather than less, but drinking more than 0.8 liter (or drinking an unknown drink) promotes happiness only to 0.3. John also prefers stout to lager: drinking 0.5 liter of stout promotes his happiness to 0.6, while drinking 0.5 liter of lager promotes happiness to 0.5. This example shows that in a new, unexpected situation (e.g., there is a new menu in the pub in which beer is served in different glasses: 0.2 and 0.6 l) John can still assign levels of promotion of the value happiness to the possible decisions.
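A minimal sketch of this transformation function; the inequality boundaries follow our reading of the example, with $x_2$ as the glass size in liters:

```python
def phi_happiness(beer: str, litres) -> float:
    """Level of promotion of happiness for a decision vector (beer, litres)."""
    if beer == "nothing":
        return 0.0
    if beer == "lager" and litres <= 0.8:
        return litres           # more lager, more happiness
    if beer == "stout" and litres <= 0.8:
        return 1.2 * litres     # stout is preferred to lager
    return 0.3                  # too much beer, or an unknown drink

print(phi_happiness("stout", 0.5))          # -> 0.6
print(phi_happiness("lager", 0.5))          # -> 0.5
print(phi_happiness("nothing", "nothing"))  # -> 0.0
print(phi_happiness("lager", 0.6))          # new menu size, still covered
```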

Under real-world circumstances, it is not always possible to determine separate functions Φ and Ω for each respective value, particularly if these are obtained inductively (with the use of machine learning mechanisms) rather than predeclared. In such a case we can assume the existence of a single function combining the properties of functions Φi and Ωi:

Definition 18.

Let $\Lambda_i: X \to [0;1)$ be a function that transforms the vector of decision variables into the relative level of promotion of value $v_i$: $vo_i(x)$. Let $\Lambda$ be the set of transformation functions. Our model can make use of several different transformation functions, which can be declared separately depending on the nature of the values.

Let us assume that $\Lambda_i(x)$ is equivalent to $\Omega_i(\Phi_i(x))$.

If an action $as_{k,j}$ is represented by a decision vector $x$, then $\Lambda_i(as_{k,j})=\Lambda_i(x)$.

Some comments: the issue of establishing the functions from the sets $\Phi$ or $\Lambda$ is an important and nontrivial problem. The simplest (but very often difficult or impossible) way to establish the functions is to declare them manually. Although each function can have an individual character, sometimes we can use a default function. For example, if we know that only one decision variable (e.g., $d_1$, where by $p_{d_1}$ we denote the value of the decision variable $d_1$) influences value $v_i$, we know the maximal and minimal possible values of $d_1$ (denoted $max(d_1)$ and $min(d_1)$), and we know that the influence is linear, then we can compute $v_i(x)$ on the basis of the normalization function used in [16]:

$v_i(x) = \Phi_1(x) = (p_{d_1} - min(d_1)) / (max(d_1) + e - min(d_1))$, where:

  • $p_{d_1}$ is the actual value of the decision variable $d_1$;

  • $x$ is the situation that promotes a certain value;

  • $min(d_1)$ is the minimal value of the decision variable $d_1$;

  • $max(d_1)$ is the maximal value of the decision variable $d_1$;

  • $e$ is an arbitrarily small positive quantity.

The result of $\Phi_1(x)$ is the level of the corresponding value $v_i(x)$.

For decision variables in which higher levels indicate a lower promotion of the value, we invert the result of the normalization function: $1-\Phi_1(x)$.

More problems arise if the maximal or minimal values of a given decision variable are known only partially (we can estimate $\max_e(d_1)$ or $\min_e(d_1)$, but we cannot exclude examples above the estimated maximum or below the estimated minimum). In such situations we can introduce an imperfect solution (one disrupting the linearity of function $\Phi_1$): if $p_{d_1}$ lies between $\min_e(d_1)$ and $\max_e(d_1)$, it is computed on the basis of the modified normalization function:

$v_i(x) = \Phi_1(x) = 0.8 \cdot (p_{d_1} - \min_e(d_1)) / (\max_e(d_1) + e - \min_e(d_1)) + 0.1$

where e, 0.8, and 0.1 are coefficients.

If $p_{d_1}$ is higher than the previously known maximum, the level of promotion of value $v_i$ can be computed on the basis of the equation $v_i(x) = v_i(\max(d_1)) + (1 - v_i(\max(d_1)))/2$ (where $v_i(\max(d_1))$ is the level of promotion of value $v_i$ by the actual maximal value of $d_1$). If $p_{d_1}$ is lower than the previously known minimum, the level of promotion of value $v_i$ can be computed on the basis of the equation $v_i(x) = v_i(\min(d_1)) - v_i(\min(d_1))/2$ (where $v_i(\min(d_1))$ is the level of promotion of value $v_i$ by the actual minimal value of $d_1$). If a subsequent new $p_{d_1}$ falls between $\max_e(d_1)$ and the new maximum, then $v_i(x)$ should fall between the corresponding levels of promotion. Such a solution is imperfect, but in our opinion it can be sufficient in many situations.
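The modified normalization and the out-of-range extrapolation can be sketched analogously (again illustrative, not part of the formal model):

```python
# A sketch of the modified normalization for partially known bounds:
# inside [min_e, max_e] the level is squeezed into [0.1, 0.9); a new
# observation beyond a bound is mapped halfway towards 1 (or 0).
E = 1e-9

def normalize_estimated(p_d1: float, min_e: float, max_e: float) -> float:
    if min_e <= p_d1 <= max_e:
        return 0.8 * (p_d1 - min_e) / (max_e + E - min_e) + 0.1
    if p_d1 > max_e:
        v_max = normalize_estimated(max_e, min_e, max_e)
        return v_max + (1.0 - v_max) / 2   # halfway between v(max_e) and 1
    v_min = normalize_estimated(min_e, min_e, max_e)
    return v_min - v_min / 2               # halfway between 0 and v(min_e)

print(normalize_estimated(50.0, 0.0, 100.0))   # 0.5
print(normalize_estimated(140.0, 0.0, 100.0))  # ~0.95, above the known max
```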

Alternatively, functions from the sets Φ or Λ can be obtained by machine learning mechanisms. Such a task requires a relatively large training set comprising various decision vectors with a proper estimation of each value assigned to them. This problem is presented in more detail in Section 9.2.

7. INFERENCE RULES

In this work, inference rules will be presented in the form of argumentation schemes, which are forms of argument representing stereotypical patterns of human reasoning.

Argumentation schemes have an antecedent part and a consequent part separated by a double bar, which stands for the sign of defeasible inference. Argumentation schemes in computational models are usually interpreted as defeasible inference rules and constitute a part of a whole argumentation framework, providing a basis on which the process of reasoning is conducted (a detailed description of the framework is presented in Section 10).

The author of [14] introduces a number of such argumentation schemes allowing for the realization of value-based and teleological reasoning. The issue of argumentation schemes has been widely discussed in many papers, and there is a number of different views on their nature. Prakken [27] presents an in-depth analysis of various approaches to argumentation schemes. From the most general point of view, he divides them into two approaches: the logical one, in which an argument is treated as a defeasible inference rule, and the dialogical one, in which arguments are dialogical devices. Since in our approach we do not introduce any dialog moves with shifting of the burden of proof, we understand argumentation schemes as defeasible inference rules; however, unlike in Prakken's paper, our model is not purely logical because it contains some procedural elements (especially the function ε, which changes the state of affairs).

Note that, unlike standard approaches to argumentation schemes, we do not introduce critical questions (ways of challenging arguments). This is due to the purely technical character of our model (we do not model the human reasoning process; rather, we create a reasoning framework for autonomous agents). The ways of challenging arguments are implemented directly in the reasoning framework.

Presented below are the mechanisms from [14] and [16] which have been adapted to our reasoning model with a numeric representation of the level of value promotion (the names correspond to the mechanisms from [14]). Since the AS1 scheme from [14] is very basic and is practically included in AS2, it is omitted.

  • AS2

    Generalized practical reasoning: if in circumstances $s_m$ performing an action $a_t$ is preferred to remaining in $s_m$ and $as_{t,m} \in AS$, then action $a_t$ should be performed:

    $s_m \in S \land as_{t,m} \in AS : VO_{as_{t,m}}(as_{t,m}) \succ VO_{s_m}(s_m)$
    ═══════════════════════════════
    $\varepsilon(as_{t,m})$

    In the above scheme, the preference relation from [14] (Def. 9) is expressed by means of the order $O_{RO}$ ($\succ$), which takes the weight function into consideration.

  • AS3

    Reasoning with abstract goals: if in the current circumstances $s_m$ achieving an abstract goal $ga_k$ is possible by an action $a_t$ performed in $s_m$, then the action $as_{t,m}$ becomes the practical goal $g_p$:

    $ga_k \in GA \land s_m \in S \land as_{t,m} \in AS : \gamma(s_m) = 1 \land sat(as_{t,m}, ga_k)$
    ═══════════════════════════════
    $g_p = as_{t,m}$

  • AS4

    Reasoning with an unreachable abstract goal: in the current circumstances $s_m$, the state of affairs $s_m$ does not satisfy the goal or promotes the key value of $gua_m$ to a lesser extent than $as_{t,m}$; both $as_{t,m}$ and $as_{k,m}$ satisfy the ordinary values of the goal $gua_m$, but $as_{t,m}$ satisfies the key value of $gua_m$ to a greater extent than $as_{k,m}$; hence $as_{t,m}$ becomes the practical goal $g_p$:

    $gua_m \in GUA \land s_m \in S \land as_{t,m} \in AS :$
    $\gamma(s_m) = 1$
    $\land\ \exists v_z \in gua_m : (v_z(s_m) < v_{z\,min}(gua_m))$
    $\lor (v_z(s_m) < v_z(as_{t,m}))$
    $\land\ \forall v_x \in gua_m : (v_x(as_{t,m}) \ge v_{x\,min}(gua_m))$
    $\land\ \omega(gua_m) = v_y$
    $\land\ \neg\exists as_{k,m} \in AS : ((v_y(as_{k,m}) > v_y(as_{t,m}))$
    $\land\ \forall v_x \in gua_m : (v_x(as_{k,m}) \ge v_{x\,min}(gua_m)))$
    ═══════════════════════════════
    $g_p = as_{t,m}$

    The unreachable abstract goal can be understood as a specific kind of abstract goal in which one value should be promoted to the highest possible extent. The idea is that the agent chooses the decision (action) which promotes this (key) value to the highest possible extent, while the rest of the values are promoted at least to their satisfactory levels.

    For the sake of clarity, we explain the subsequent verses of the above scheme:

    In the first verse, we assume that there exist an abstract unreachable goal and an available action.

    The second verse confirms the actual state of affairs.

    The third and fourth verses examine whether the actual state of affairs fails to satisfy the goal (whether the ordinary values are promoted below their thresholds, or there exists an available action which promotes the key value to a higher extent). If it fails, then

    the fifth verse examines whether there exists an available action ($as_{t,m}$) which promotes the ordinary values to their satisfactory levels. If so,

    the last three verses examine whether there exists another available action which promotes the key value to a higher extent than $as_{t,m}$ while promoting the ordinary values at least to the declared thresholds. If not, then $as_{t,m}$ may become the practical goal.

  • AS5

    Goal-driven practical reasoning: in the current circumstances $s_m$, in order to achieve the practical goal $g_p$, an action $a_t$ should be performed:

    $s_m \in S \land as_{t,m} \in AS : (\gamma(s_m) = 1) \land (g_p = as_{t,m})$
    ═══════════════════════════════
    $\varepsilon(as_{t,m})$

    Although we have focused on the inference rules from [14], note that new inference rules can be added to the above list.

8. DECISION-MAKING AS AN OPTIMIZATION PROBLEM

One can notice that our model shares some elements with multi-attribute utility theory (MAUT) [28]. Obviously, MAUT has been widely discussed and developed, and it would be difficult to cover here the relations between our approach and all its versions, but we will try to present a brief look at the similarities and differences between our model and the most basic approach to multi-attribute utility theory.

Similarly to MAUT, we evaluate possible decision options in the light of different values (in MAUT, utilities) and then compute the cumulative evaluation of a decision (in MAUT, the aggregate utility function), which is the basis of the decision. The aggregation of utility in MAUT can be additive or multiplicative. In our work we use the Θ function, which can be seen as a variation of the additive rule, but normalized to the range ⟨0;1).
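Although the paper does not fix one closed form of Θ here, the worked example in Section 11 aggregates relative promotion levels pairwise as a + b - a*b, which keeps the result within ⟨0;1). A minimal Python sketch under that assumption:

```python
# A sketch of a cumulative evaluation Theta consistent with the
# computations in Section 11: levels are folded pairwise with
# a + b - a*b, which keeps the aggregate in [0, 1) whenever every
# input lies in [0, 1).
from functools import reduce

def theta(relative_levels):
    """Cumulative evaluation of a set VO(x) of relative promotion levels."""
    return reduce(lambda a, b: a + b - a * b, relative_levels, 0.0)

print(round(theta([0.499, 0.18, 0.094]), 3))  # 0.628, cf. Section 11
```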

Unlike MAUT, we do not use weights as numbers; we use weight functions, which can be more complex, and we do not implement reasoning with uncertainty (though we plan to do so in future work). The goal of MAUT is to find the option with the highest aggregate utility. In our work the goals of the model can be more complex: similarly to MAUT, we can maximize the cumulative evaluation of all values (such an approach can be achieved with the use of AS2), but we also introduce thresholds (minimal acceptable levels of promotion of values) and the possibility of maximizing one value while keeping the others above their thresholds (as in unreachable abstract goals). Moreover, our system allows for the utilization of other argumentation schemes (to appear in future works).

Concluding, since there are a number of similarities (and some differences) between our model and MAUT, we believe that our approach can be seen as a bridge between value-based decision-making and utility theory.

The model presented in our paper was designed, above all, for situations involving a finite and discrete space of events wherein a finite ASj set can be presented.

A considerably more complex situation is observed where at least one of the decision-making variables relates to an infinite set of values. If it is infinite but discrete, it is often possible to limit the range of variability for the given decision variable without prejudice to the quality of the decision. If the domain of at least one of the variables is continuous, the decision space will also be continuous or semi-continuous. In such a situation, the variable should be, if it is possible, discretized. If it is not possible, the argumentation-based approach fails: the number of possible decision-making options is infinite and no viable arguments for or against the respective options can be formulated. In this context, the most effective decision-making models are based not on argumentation, but on single- and multi-criterion optimization aimed at maximizing the target function. Although the mechanism presented herein was developed with a discrete event space in mind, its underlying premise can, to a certain extent, be also applied in the development of an optimization model. The key premise of our model is the assumption that decisions are rooted in values that should be promoted or a goal comprising a set of values that should be promoted up to a certain minimum level. Should we treat such values as criteria in a decision-making process, the entire sequence could be interpreted as single- or multi-criterion optimization.

In Section 6, we introduced a set of functions Φ which allowed us to evaluate the level to which particular values are promoted by different values of the vector x (and therefore the respective actions from the set AS). The assessment can be processed by employing the weight function (Ωi), which subsequently allows the application of the Θ function for the purposes of a joint function-based evaluation of the promotion of the selected set of values VZ, which in turn will allow a comparison between the promotion levels for various values promoted by different values of decision variables, as described by the vector x.

At the beginning of our paper, we distinguished several types of goals that an agent can achieve. Abstract goals and unreachable abstract goals correspond to the agent's needs with respect to the levels up to which particular values ought to be promoted. Based on the same, material goals (i.e., specific actions facilitating particular states of affairs) which fulfil abstract goals are defined, and based on these, the practical goal, i.e., the action to be performed by the agent, is determined. Presented below is the method of approaching the decision-making problem, based on the thus defined goals, as an optimization task.

Let us begin by assuming the following:

Let the model of a decision-making situation D comprise the pair:

$D = (F(x), l(x))$, where x is the vector of decision variables $(d_1, d_2, \ldots, d_n)$, $F(x)$ is the goal function which ought to be maximized, and $l(x)$ is the set of formulae limiting the values of the input vector. Explained below is the method for developing a model of the optimization situation in three distinct cases:

  1. The first case is one where no preimposed goal exists. The decision is based on a choice of the option (option vector) which ensures the best total promotion of all values (taking into account the weight of the respective values).

    In the analyzed case, we have a defined goal function $F(x)$ (maximizing the promotion of all values relative to their weights), but no limitations: $l(x) = \emptyset$.

    Assuming that function Φ allows us to determine the level of promotion of values, $v_i(x) = \Phi_i(x)$, that function Ω determines the relative level of value promotion, $vo_i(x) = \Omega_i(v_i(x))$, and that $VO(x)$ is the set of relative promotion levels for all values promoted by the vector x, the goal function will be as follows: $F(x) = \max(\Theta(VO(x)))$.

  2. In the second case, we have a declared abstract goal $ga \in GA$, i.e., a set of minimum levels for values within a given set $V_{ga}$. This case is not in itself an optimization task, as it declares no goal function to be increased: the abstract goal does not define the value or values that should be maximized.

    An abstract goal allows the definition of a set of limitations $l(x)$. The set will contain the number of limitations declared in the goal ga. Each limitation will pertain to a single value. For each value $v_n$ whose minimum level is declared in the goal ga ($v_n \in ga$), the following must be true: $\Phi_n(x) \ge v_{n\,min}(ga)$.

    Because the abstract goal itself does not define a goal function as such, the function from case 1 can be adopted as the goal function: $F(x) = \max(\Theta(VO(x)))$.

  3. In the third case, the agent has a declared unreachable abstract goal $gua \in GUA$. An unreachable abstract goal is a set of minimum levels up to which the respective values ought to be promoted, with one (the key value) requiring the maximum possible level of promotion.

    In this case, the goal function will entail maximizing the level of promotion for the key value whereas the minimum levels of promotion for ordinary values of the goal gua will constitute the limits:

    If $\omega(gua) = v_m$, then the goal function is to maximize the level of promotion of the value $v_m$: $F(x) = \max(\Phi_m(x))$.

    The limit set will be defined on the basis of the ordinary values of the unreachable abstract goal:

    For each value $v_n$ whose minimum level is declared in the goal gua ($v_n \in gua$), the following must be true: $\Phi_n(x) \ge v_{n\,min}(gua)$.

Having obtained, after employing one of the aforementioned approaches, the set l and the function $F(x)$, we can use an adequate optimization mechanism to determine the vector $x_i$ (or vectors, where more than one applies) for which the function $F(x)$ reaches its maximum, thus rendering it (or one of them) the best possible decision.
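For a finite, discrete decision space, the three cases above reduce to a constrained search. The following Python sketch frames case 3; the domains, value functions, and thresholds are purely illustrative:

```python
# A sketch of case 3 as a constrained search over a finite decision
# space: maximize the key value's promotion subject to the limits l(x).
from itertools import product

domains = [("a", "b"), (1, 2, 3)]                    # toy decision variables
phis = {"v_key": lambda x: x[1] / 3.0,               # key value to maximize
        "v_ord": lambda x: 0.9 if x[0] == "a" else 0.4}
limits = {"v_ord": 0.5}                              # l(x): ordinary value >= 0.5

feasible = [x for x in product(*domains)
            if all(phis[v](x) >= t for v, t in limits.items())]
best = max(feasible, key=phis["v_key"])
print(best)  # ('a', 3): the feasible option maximizing the key value
```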

As the development of the optimization mechanism is outside the scope of this paper, and because we do not introduce any specific functions Φ or Ω (which can differ significantly depending on the actual decision-making problem at hand), it is not possible to introduce here a single optimization mechanism that would allow us to identify the maxima of the goal function $F(x)$ in each of the three cases.

9. SOME COMMENTS ON DECISION-MAKING

In the introduction to this paper, we observed that the ethical context of decisions made by an autonomous agent becomes more important where the environment in which it is to function influences other people and becomes more unpredictable. This has been confirmed by the famous example of the Tay bot [29] which very quickly “learned” to become a racist. The example relates also to the problem of trust placed in learning mechanisms: how far can we trust a device whose actions in the real world can be potentially dangerous to humans? This is particularly true of mechanisms that are not capable of explaining their actions (such as neural networks): A device can “learn” bad behaviors by observing the behavior of other agents (similarly to humans who can be persuaded to do something they would not normally do through the influence of “bad company” or peer pressure).

The proposed framework allows us to include ethical concerns in the decision-making process. It also allows us to view ethics in a somewhat broader perspective, i.e., not only as based on deontic principles of mandatory action and prohibition, but also focused on the promotion of values and their adequate, relative proportions. Indeed, some authors [15,30] go as far as to directly suggest that under certain circumstances deontic rules should be broken and indicate values as a viable justification for the decision to break a rule.

An in-depth discussion of the problem of ethical behavior of autonomous systems is presented in [bench-capon2020]. The author discusses the problem of autonomous systems in the light of three main ethical theories: consequentialism, deontological ethics, and virtue ethics. Although such an analysis is outside the scope of this paper, in our opinion (similarly to the AATS+V model mentioned in [bench-capon2020]) our model can be treated as an attempt to model virtue ethics, especially through the use of values in abstract goals and unreachable abstract goals, which can represent virtues (however, we have also implemented some elements of consequentialism, e.g., the analysis of actions in the light of their future consequences).

Moreover, although the author of [bench-capon2020] points out that his approach is focused on modeling ethical behavior rather than ethical reasoning, we believe that autonomously setting the practical goal (the action and state of affairs to be achieved) from an abstract or unreachable abstract goal can be seen as a first step towards modeling ethical reasoning.

9.1. Two Sources of Knowledge

People acquire the knowledge necessary in decision-making processes in various ways. In the most generalized perspective, these can be divided into two groups: knowledge acquired directly from sources (books, teachers, parents, etc.) and knowledge acquired through experience, where a person acquires knowledge through induction, by interacting with the world and evaluating the effects and consequences of certain behaviors. For instance, the first method can allow us to learn road traffic rules: what is and what is not allowed, whether it is acceptable to drive 100 km/h in an urban area, etc. Meanwhile, the second method allows us to learn how to determine whether the distance from the car in front of us is safe, whether the speed at which we enter a corner will allow us to negotiate it safely, etc.

The development of formal models of knowledge representation, inference, and argumentation allows us to represent knowledge acquired using the first of the above methods. On the other hand, the spectacular advances made in machine learning in recent years allowed the development of very complex systems focused on the second method of knowledge acquisition.

Based on the above, it is our belief that autonomous systems acting in the real world should take advantage of both of the above approaches. The model presented herein is based on formal reasoning, i.e., it allows the representation of knowledge acquired in the first manner; however, it has been structured in such a way so as to facilitate the implementation of a hybrid system that would draw upon both approaches to the use of artificial intelligence in decision-making.

9.2. Hybrid Model of Decision-Making

In this section, we will propose a way to combine the described model with machine learning systems. The idea of integration of the two approaches entails the division of functions between the respective types of knowledge:

  1. On the one hand, we have knowledge in the form of straightforward and clear principles of reasoning and rules for their application (constructing arguments, resolving conflicts between arguments, etc.). This type of knowledge can be input into the system directly, as it allows a clear and unambiguous choice of the best option based on the values of the agent's goals (what is more, the model is also capable of explaining why this is the case).

  2. On the other hand we have knowledge which allows us to recognize whether the conditions for applying the inference rules have been satisfied, to what extent the respective values are indeed promoted, etc. This type of knowledge is less about inference and more about pattern recognition, classification, etc.

On the basis of the above division, it can be assumed that functions from the set Φ need not be directly provided. They can take the form of any mechanism assigning to a decision vector a number from the range ⟨0;1), and can therefore be machine learning based mechanisms, even ones based on the so-called "black box" principle. An ML-based function from the set Φ can be obtained by so-called supervised machine learning (linear or nonlinear regression or, for cases with a small number of discrete options, a classifier). In order to build the function, a training set of various decision vectors (x), with a proper estimation of the level of promotion of a given value assigned to each of them, should be prepared. On the basis of the training set, the regression function (or classifier) can be trained (it is important, however, that the estimated value be a number from the range ⟨0;1)). Naturally, the development of such a training set and mechanism is anything but easy, but this far exceeds the scope of the discussion at hand; in our opinion, it can be an interesting direction of future research. Similarly to the functions from the set Φ, the product of the machine learning process can be the weight function from the set Ω (as a filter processing the level of value promotion, i.e., a number in the range ⟨0;1), into the weighted level of promotion, i.e., a number in the same range), or the function δ returning the result of a given action being performed under a specific state of affairs (function δ can take the form of a classifier assigning a certain decision vector x to one of the available states of affairs $s_k \in S$). It is noteworthy that under real-world circumstances the determination of separate functions $\Phi_i$ and $\Omega_i$ can prove very difficult, and indeed unnecessary; such situations are best handled by considering them jointly and determining, through the application of machine learning mechanisms, a single function $\Lambda_i$.
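A minimal sketch of the supervised variant described above; the tiny training set and the choice of scikit-learn's GradientBoostingRegressor are illustrative assumptions only:

```python
# A sketch of obtaining a transformation function Lambda_i by supervised
# learning: decision vectors x paired with expert-estimated relative
# promotion levels in [0, 1).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

X_train = np.array([[0.1, 0.9], [0.5, 0.5], [0.9, 0.2], [0.3, 0.7]])
y_train = np.array([0.15, 0.45, 0.80, 0.30])

model = GradientBoostingRegressor().fit(X_train, y_train)

def lambda_i(x):
    # Clip so the learned function stays within the required range [0, 1).
    return float(np.clip(model.predict([x])[0], 0.0, 1.0 - 1e-9))

print(lambda_i([0.6, 0.4]))
```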

Another, alternative way of obtaining $\Phi_i$ and $\Omega_i$, or $\Lambda_i$, is learning from the agent's own experience through some form of reinforcement learning. However, this is a prospective topic to be discussed in a future work.

Concluding, the framework presented in our paper allows the integration of machine learning based solutions with traditional solutions involving logical inference modeling.

10. ARGUMENTATION FRAMEWORK

In this section we introduce a reasoning mechanism allowing for decision-making. The mechanism allows for establishing the agent's goal and performing actions that fulfil that goal (practical reasoning). The argumentation-based practical reasoning framework introduced in our work is based on the simple reasoning mechanism presented in [14].

  1. We assume that sets S, A, X, and V are elements of knowledge base KB.

  2. By $KB \vdash E$ we denote that E is derived from knowledge base KB. E can be inferred from the knowledge base where E is an element of any of the subsets of KB, or E is an outcome of the operation of one of the functions (γ, δ, ε, Φ, Ω, Θ) or of standard mathematical (set-theoretic, logical, etc.) operators on the elements of any of the subsets of KB.

  3. Let IR be a set of inference rules. As inference rules we understand the argumentation schemes described in Section 7.

  4. We assume that arguments are constructed on the basis of inference rules from the set IR and knowledge base KB. An argument is constructed when $KB \vdash form$ and form satisfies the antecedent part of an inference rule. By $Conc(arg_x) = E$ we denote that E is the conclusion of an argument $arg_x$.

  5. ARG is the smallest set of all finite arguments constructed from KB and IR.

  6. Two arguments are in conflict if one attacks the other.

  7. There is only one kind of attack: rebuttal, which is an attack on the conclusion of an argument. The attack has a symmetrical character: if arg1 attacks arg2, then arg2 also attacks arg1.

  8. An attack on the conclusion of an argument arg1 occurs if:

    • $\exists arg_1, arg_2 \in ARG : Conc(arg_1) = \varepsilon(as_{t,m}) \land Conc(arg_2) = \varepsilon(as_{z,m}) \land a_t \ne a_z$

      ($arg_1$ concludes that $\varepsilon(as_{t,m})$ and there exists an argument $arg_2$ which concludes $\varepsilon(as_{z,m})$, where $a_t \ne a_z$), or

    • $\exists arg_1, arg_2 \in ARG : Conc(arg_1) = (g_p = as_{t,m}) \land Conc(arg_2) = (g_p = as_{z,m}) \land a_t \ne a_z$

      ($arg_1$ concludes that $g_p = as_{t,m}$ and there exists an argument $arg_2$ which concludes that $g_p = as_{z,m}$, where $a_t \ne a_z$).

  9. We assume a partial ordering between arguments, $O_{ARG} = (ARG, \succ)$, where $arg_1 \succ arg_2$ means that $arg_1$ is stronger than $arg_2$.

  10. We assume that the basic grounds for determining the order ($O_{ARG}$) between arguments is the inference rule on the basis of which the argument is constructed. We assume that AS5 > AS2, meaning that if arguments $arg_1$ and $arg_2$ are in conflict, $arg_1$ is built on the basis of AS5, and $arg_2$ is built on the basis of AS2, then $arg_1 \succ arg_2$.

  11. In order to eliminate conflicts between arguments constructed on the basis of AS3 and AS4 (the root of this conflict lies in the fact that an unreachable abstract goal is also an abstract goal), we assume that if an unreachable abstract goal is declared, then the system does not generate arguments on the basis of AS3.

  12. Argument $arg_1$ defeats argument $arg_2$ when argument $arg_1$ rebuts argument $arg_2$ and $arg_1 \succ arg_2$.

  13. Reasoning about priorities: we assume that priorities between conflicting arguments built on the basis of the same inference rule depend on the values promoted by the application of the argument.

    • If both arguments ($arg_1$ and $arg_2$) are built on the basis of inference rule AS2, argument $arg_1$ attacks argument $arg_2$ (or vice versa), the conclusion of $arg_1$ is $\varepsilon(x_1)$, the conclusion of $arg_2$ is $\varepsilon(x_2)$, and $VO_{x_1}(x_1) \succ VO_{x_2}(x_2)$, then $arg_1 \succ arg_2$.

    • If both arguments ($arg_1$ and $arg_2$) are built on the basis of inference rule AS3 or AS4, argument $arg_1$ attacks argument $arg_2$ (or vice versa), the conclusion of $arg_1$ is $g_p = g_{ml}$, where $g_{ml} = x_1$, the conclusion of $arg_2$ is $g_p = g_{mk}$, where $g_{mk} = x_2$, and $VO_{x_1}(x_1) \succ VO_{x_2}(x_2)$, then $arg_1 \succ arg_2$.

  14. If two arguments ($arg_1$, $arg_2$) are built on the basis of the same inference rule (AS2, AS3, or AS4), $arg_1 \nsucc arg_2$, and $arg_2 \nsucc arg_1$, then both conclusions equally fulfil the agent's goal and it is impossible to resolve the conflict on the grounds of value preferences. Such a situation will appear if it is impossible to derive a preference between situations (e.g., if multiple situations promote the cumulative levels of values' promotion to the same extent, while individual values are promoted to different levels). Such a conflict can be resolved with some other inference rule (not listed in Section 7), remain unresolved, or be resolved randomly.

  15. Argument arg1 is not defeated if it is not attacked by any argument or all arguments which attack arg1 are defeated.

  16. If an argument whose conclusion sets a practical goal (i.e., gp=ast,m) is not defeated, then its conclusion will be added to the knowledge base.

  17. If an argument (e.g., $arg_1$) is defeated, then its conclusion is deleted from KB and the action connected with its conclusion is not performed. If the conclusion of a defeated argument is a premise of another argument (e.g., $arg_2$) and $\neg\exists arg_x \in ARG : Conc(arg_x) = Conc(arg_1)$, then $arg_2$ is also defeated.

  18. If one of the arguments concludes that ε(ast,m) and the argument is not defeated, it brings about performing an action ast,m.

Generally speaking, the argumentation framework used in the example was inspired by the ASPIC+ argumentation framework [31].
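To illustrate how the points above interact, the following Python sketch (our illustrative reading, not the paper's formal machinery) encodes arguments, symmetric rebuttal between different practical-goal proposals, and Θ-based defeat:

```python
# A minimal sketch of points 6-13 of the framework; class and field
# names are illustrative, not the paper's notation.
from dataclasses import dataclass
from functools import reduce

def theta(levels):
    return reduce(lambda a, b: a + b - a * b, levels, 0.0)

@dataclass
class Argument:
    scheme: str        # e.g. "AS4"
    conclusion: str    # e.g. "gp = as_(AES,256,medium,fast)"
    vo: tuple          # relative promotion levels of the proposed option

def rebuts(a1: Argument, a2: Argument) -> bool:
    # Conflicting practical-goal conclusions attack each other (point 8).
    return a1.conclusion != a2.conclusion

def defeats(a1: Argument, a2: Argument) -> bool:
    # For arguments built on the same scheme, the option with the higher
    # cumulative evaluation yields the stronger argument (point 13).
    return (rebuts(a1, a2) and a1.scheme == a2.scheme
            and theta(a1.vo) > theta(a2.vo))

arg1 = Argument("AS4", "gp = average", (0.499, 0.18, 0.094))
arg2 = Argument("AS4", "gp = fast", (0.499, 0.18, 0.133))
print(defeats(arg2, arg1))  # True, as in the example of Section 11
```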

11. EXAMPLE

This section illustrates the process of deciding which action should be performed by the agent. We assume that there exists a smart device Dev1 in the form of "smart glasses" worn by a person in an industrial environment. The device Dev1 is fully autonomous and can perform a variety of actions without the active participation of the user who wears it. One of the functionalities of such a device is scanning for objects of interest and, when one is found, taking pictures of the person's field of view. Dev1 should back up the pictures on an external storage server Ser1 when a stable connection is found. We also assume that Ser1 is accessible in the public network (Internet) and that Dev1 is capable of connecting to that network as well (i.e., via WiFi or 4G infrastructure). Such a scenario involves many parameters that affect the image upload speed, the reliability of the connection, the energy supply of the device, carrier fees, and the security and quality of the resource being sent. For a better presentation of our mechanism, we assume that the Dev1 device focuses only on the security, upload speed, and quality of the resource. One of the main strengths of our method that we would like to point out in this example is its compatibility with the self-governing capabilities of the agent (Dev1). This is partly achieved via the use of a numerical representation for most of the parameters that the evaluation process relies on. Most similar methods do not focus on the autonomy of the agent. Dev1 sets its practical goals autonomously based on the current state of the environment. After the evaluation process, it decides whether or not to perform an action to achieve a given goal.

Three values are important for the agent:

  • vsecurity

  • vquality

  • vspeed

The agent declares an abstract unreachable goal gua in which it declares the minimal levels of promotion of the three values:

$v_{security\,min}(gua) = 0.85$

$v_{quality\,min}(gua) = 0.5$

$v_{speed\,min}(gua) = 0.45$

The key value of the goal gua is security (ω(gua)=vsecurity), which means that the level of promotion of security should be as high as possible.

The process of deciding which action should be performed is carried out by the device Dev1 setting a practical goal, which involves five stages. Each stage is described in detail below:

11.1. STAGE 1: Decision Space

This stage involves the definition of the decision space. First of all, we have to define the set of decision vectors, which we will denote as x.

In our case, Dev1 has not yet sent the image for backup; therefore, it is in the initial state of affairs $s_{ins}$.

Now we declare the decision space:

We assume that we have four decision variables (parameters):

  • encryption algorithm,

  • encryption key length,

  • compression level,

  • network speed

Next, we assume the domains of particular decision parameters:

  • $x_1 = D_{security\,algorithm} = \{none, Twofish, AES\}$

  • $x_2 = D_{security\,key} = \{0, 128, 192, 256\}$

  • $x_3 = D_{compression\,level} = \{low, medium, high\}$

  • $x_4 = D_{network\,speed} = \{slow, average, fast\}$

These domains specify all the possible physical values that a parameter can hold.

The next step is to define the vectors for all combinations of possible actions. Since the key length 0 accompanies only the algorithm none, there are $7 \times 3 \times 3 = 63$ such combinations, and each action vector simply enumerates its parameter settings:

$as_{(none,0,low,slow)} = (none, 0, low, slow)^T$

$as_{(Twofish,128,medium,average)} = (Twofish, 128, medium, average)^T$

$as_{(AES,256,high,fast)} = (AES, 256, high, fast)^T$

and analogously for every other admissible combination of $x_1$, $x_2$, $x_3$, and $x_4$.

11.2. STAGE 2: Calculation of Values (Φ)

During the second stage of our decision-making process, we have to calculate the levels of promotion of the values for all possible actions that can be performed in the initial state of the device.

The set Φ has three functions: $\Phi = \{\Phi_{security}, \Phi_{quality}, \Phi_{speed}\}$. All three functions can take two kinds of arguments: the initial state of affairs ($s_{ins}$) and the elements of the set $AS_{ins}$.

If by $x = (x_1, x_2, x_3, x_4)$ we denote our decision vector, and a, b, and c denote coefficients used in Definitions 19-21, then the function $\Phi_{security}$ will be as follows:

Definition 19. [Definition of Φsecurity]

For the special case in which Dev1 stays in the initial state of affairs (Φsecurity(sins)), the level of security equals Φsecurity(sins)=0 (since the image has not been backed up, the security of the data equals zero). For the other cases, if

  • x1=none, then a=0;

  • x1=Twofish, then a=0.95;

  • x1=AES, then a=1.

Then

$\Phi_{security}(as_x) = x_2 \cdot a / (\max(x_2) + 1/\max(x_2) - \min(x_2))$.

For example,

$v_{security}(as_{(Twofish,128,low,average),ins}) = \Phi_{security}(as_{(Twofish,128,low,average),ins}) = (128 \cdot a)/(256 + 0.0039 - 0) = (128 \cdot 0.95)/256.0039 = 0.474$

The levels of promotion of the value $v_{security}$ depend only on $x_1$ and $x_2$, so they are identical for all compression levels ($x_3$) and network speeds ($x_4$):

$v_{security}(s_{ins}) = \Phi_{security}(s_{ins}) = 0$

$x_1 = none$, $x_2 = 0$: $v_{security}(as_{(none,0,x_3,x_4),ins}) = 0.0$

$x_1 = Twofish$, $x_2 = 128$: $v_{security}(as_{(Twofish,128,x_3,x_4),ins}) = 0.474$

$x_1 = Twofish$, $x_2 = 192$: $v_{security}(as_{(Twofish,192,x_3,x_4),ins}) = 0.712$

$x_1 = Twofish$, $x_2 = 256$: $v_{security}(as_{(Twofish,256,x_3,x_4),ins}) = 0.949$

$x_1 = AES$, $x_2 = 128$: $v_{security}(as_{(AES,128,x_3,x_4),ins}) = 0.499$

$x_1 = AES$, $x_2 = 192$: $v_{security}(as_{(AES,192,x_3,x_4),ins}) = 0.749$

$x_1 = AES$, $x_2 = 256$: $v_{security}(as_{(AES,256,x_3,x_4),ins}) = 0.999$

Definition 20.

[Φquality] Let us assume that:

  • For the initial state of affairs, $\Phi_{quality}(s_{ins}) = 0.8$, which means that the quality of the unsent picture is high (due to low compression). In the other cases, $\forall as_{(x_1,x_2,x_3,x_4),ins} \in AS_{ins}$:

    • $(x_3 = low) \Rightarrow \Phi_{quality} = 0.8$

    • $(x_3 = medium) \Rightarrow \Phi_{quality} = 0.6$

    • $(x_3 = high) \Rightarrow \Phi_{quality} = 0.2$

Note that quality is inversely proportional to the compression level (lower compression = higher quality).

On the basis of the function $\Phi_{quality}$ we compute the levels of promotion of $v_{quality}$, which depend only on the compression level $x_3$ (they are identical for all encryption settings and network speeds):

$v_{quality}(s_{ins}) = 0.8$

$x_3 = low$: $v_{quality}(as_{(x_1,x_2,low,x_4),ins}) = 0.8$

$x_3 = medium$: $v_{quality}(as_{(x_1,x_2,medium,x_4),ins}) = 0.6$

$x_3 = high$: $v_{quality}(as_{(x_1,x_2,high,x_4),ins}) = 0.2$

Definition 21. [Φspeed]

Let us assume that:

  • For the initial state of affairs, $\Phi_{speed}(s_{ins}) = 0$, which means that, since the picture has not yet been sent, no transfer speed is promoted. In the other cases, $\forall as_{(x_1,x_2,x_3,x_4),ins} \in AS_{ins}$, if:

    • x1=none, then a=1;

    • x1=Twofish, then a=0.95;

    • x1=AES, then a=0.9;

    and if:

    • x3=low, then b=0.4;

    • x3=medium, then b=0.75;

    • x3=high, then b=1;

    and if:

    • x4=slow, then c=0.2;

    • x4=average, then c=0.7;

    • x4=fast, then c=0.99;

    then:

    $\Phi_{speed}(as_x) = a \cdot b \cdot c$.

On the basis of the function $\Phi_{speed}$ we compute the levels of promotion of $v_{speed}$. Since $\Phi_{speed} = a \cdot b \cdot c$ does not depend on the key length $x_2$, the levels are determined by $x_1$, $x_3$, and $x_4$ alone (each entry below gives $v_{speed}(as_{(x_1,x_2,x_3,x_4),ins})$ for the indicated compression level $x_3$ and network speed $x_4$):

$v_{speed}(s_{ins}) = 0$

$x_1 = none$: low: slow 0.08, average 0.28, fast 0.396; medium: slow 0.15, average 0.525, fast 0.742; high: slow 0.2, average 0.7, fast 0.99

$x_1 = Twofish$: low: slow 0.076, average 0.266, fast 0.376; medium: slow 0.142, average 0.498, fast 0.705; high: slow 0.19, average 0.665, fast 0.94

$x_1 = AES$: low: slow 0.072, average 0.252, fast 0.356; medium: slow 0.135, average 0.472, fast 0.668; high: slow 0.18, average 0.63, fast 0.891
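The three functions can be sketched directly in Python; the coefficient tables below restate Definitions 19-21, while the function and variable names are our illustrative assumptions:

```python
# A sketch reproducing the three Phi functions of the Dev1 example.
A_SEC = {"none": 0.0, "Twofish": 0.95, "AES": 1.0}
QUALITY = {"low": 0.8, "medium": 0.6, "high": 0.2}
A_SPD = {"none": 1.0, "Twofish": 0.95, "AES": 0.9}
B_SPD = {"low": 0.4, "medium": 0.75, "high": 1.0}
C_SPD = {"slow": 0.2, "average": 0.7, "fast": 0.99}

def phi_security(x1, x2, x2_max=256, x2_min=0):
    return x2 * A_SEC[x1] / (x2_max + 1 / x2_max - x2_min)

def phi_quality(x3):
    return QUALITY[x3]

def phi_speed(x1, x3, x4):
    return A_SPD[x1] * B_SPD[x3] * C_SPD[x4]

x = ("AES", 256, "medium", "fast")
print(phi_security(x[0], x[1]))              # 0.99998..., 0.999 truncated
print(phi_quality(x[2]))                     # 0.6
print(round(phi_speed(x[0], x[2], x[3]), 3)) # 0.668
```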

Since the values have different importance, we assume three weight functions: $\Omega_{security}$, $\Omega_{quality}$, and $\Omega_{speed}$. In our case, we assume that all of them are linear:

$vo_{security}(x) = \Omega_{security}(v_{security}(x)) = 0.5 \cdot v_{security}(x)$;

$vo_{quality}(x) = \Omega_{quality}(v_{quality}(x)) = 0.3 \cdot v_{quality}(x)$;

$vo_{speed}(x) = \Omega_{speed}(v_{speed}(x)) = 0.2 \cdot v_{speed}(x)$.

11.3. STAGE 3: Construction of Arguments

In the previous stages we declared an abstract unreachable goal gua. Since an abstract unreachable goal is also an ordinary abstract goal, the system could in principle use two argumentation schemes to determine the practical goal: AS3 and AS4. However, because an unreachable abstract goal has been declared, the system does not generate AS3-based arguments (point 11 of the framework in Section 10), so the argument construction process considers only AS4-type arguments.

Since we assumed that:

vsecuritymin(gua)=0.85,

vqualitymin(gua)=0.5, and

vspeedmin(gua)=0.45

we can notice that the initial state of affairs $s_{ins}$ does not satisfy the goal ($v_{security}(s_{ins}) < v_{security\,min}(gua)$), but since gua is an unreachable abstract goal, two actions satisfy the conditions of the AS4 inference rule: $as_{ins,(AES,256,medium,average)^T}$ and $as_{ins,(AES,256,medium,fast)^T}$.

This is because both actions promote $v_{quality}$, $v_{speed}$, and $v_{security}$ above the minimal requirements and promote the key value ($v_{security}$) to a greater level than any other admissible action ($v_{security}(as_{ins,(AES,256,medium,average)^T}) = v_{security}(as_{ins,(AES,256,medium,fast)^T}) = 0.999$). On the basis of that, we can construct two arguments:

  • $arg_1$ with conclusion $g_p = as_{ins,(AES,256,medium,average)^T}$;

  • $arg_2$ with conclusion $g_p = as_{ins,(AES,256,medium,fast)^T}$.

To sum up, we have two arguments setting two different practical goals. Since we can only have one practical goal, the conflict needs to be resolved in the next stage.

11.4. STAGE 4: Conflict Resolution

We have two arguments with two different conclusions:

  • $arg_1$ with conclusion $g_p = as_{ins,(AES,256,medium,average)^T}$;

  • $arg_2$ with conclusion $g_p = as_{ins,(AES,256,medium,fast)^T}$.

Arguments $arg_1$ and $arg_2$ attack each other, and both are built on the basis of the same inference rule (AS4). For such a conflict, the argumentation framework regulates that if both arguments ($arg_1$ and $arg_2$) are built on the basis of inference rule AS4 and attack each other, the conclusion of $arg_1$ is $g_p = g_{ml}$, where $g_{ml} = x_1$, the conclusion of $arg_2$ is $g_p = g_{mk}$, where $g_{mk} = x_2$, and $VO_{x_2}(x_2) \succ VO_{x_1}(x_1)$, then $arg_2 \succ arg_1$.

In our case, the preference between the two arguments is determined as follows.

In the first step, we calculate the relative levels of promotion of all three values for both options:

$vo_{security}(as_{(AES,256,medium,average),ins}) = \Omega_{security}(v_{security}(as_{(AES,256,medium,average),ins})) = 0.499$

$vo_{security}(as_{(AES,256,medium,fast),ins}) = \Omega_{security}(v_{security}(as_{(AES,256,medium,fast),ins})) = 0.499$

$vo_{quality}(as_{(AES,256,medium,average),ins}) = \Omega_{quality}(v_{quality}(as_{(AES,256,medium,average),ins})) = 0.18$

$vo_{quality}(as_{(AES,256,medium,fast),ins}) = \Omega_{quality}(v_{quality}(as_{(AES,256,medium,fast),ins})) = 0.18$

$vo_{speed}(as_{(AES,256,medium,average),ins}) = \Omega_{speed}(v_{speed}(as_{(AES,256,medium,average),ins})) = 0.094$

$vo_{speed}(as_{(AES,256,medium,fast),ins}) = \Omega_{speed}(v_{speed}(as_{(AES,256,medium,fast),ins})) = 0.133$

In the second step, we calculate the cumulative evaluations of the two conflicting options:

$o_1 = vo_{security}(as_{(AES,256,medium,average),ins}) + vo_{quality}(as_{(AES,256,medium,average),ins}) - vo_{security}(as_{(AES,256,medium,average),ins}) \cdot vo_{quality}(as_{(AES,256,medium,average),ins}) = 0.499 + 0.18 - 0.499 \cdot 0.18 = 0.589$

$\Theta(VO(as_{(AES,256,medium,average),ins})) = o_1 + vo_{speed}(as_{(AES,256,medium,average),ins}) - o_1 \cdot vo_{speed}(as_{(AES,256,medium,average),ins}) = 0.589 + 0.094 - 0.589 \cdot 0.094 = 0.628$

$o_2 = vo_{security}(as_{(AES,256,medium,fast),ins}) + vo_{quality}(as_{(AES,256,medium,fast),ins}) - vo_{security}(as_{(AES,256,medium,fast),ins}) \cdot vo_{quality}(as_{(AES,256,medium,fast),ins}) = 0.499 + 0.18 - 0.499 \cdot 0.18 = 0.589$

$\Theta(VO(as_{(AES,256,medium,fast),ins})) = o_2 + vo_{speed}(as_{(AES,256,medium,fast),ins}) - o_2 \cdot vo_{speed}(as_{(AES,256,medium,fast),ins}) = 0.589 + 0.133 - 0.589 \cdot 0.133 = 0.644$

In consequence, we obtain:

$\Theta(VO(as_{(AES,256,medium,fast),ins})) > \Theta(VO(as_{(AES,256,medium,average),ins}))$, and therefore

$VO(as_{(AES,256,medium,average),ins}) \nsucc VO(as_{(AES,256,medium,fast),ins})$ and

$VO(as_{(AES,256,medium,fast),ins}) \succ VO(as_{(AES,256,medium,average),ins})$,

which means that $arg_2$ defeats $arg_1$ ($arg_2 \succ arg_1$).
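The Stage 4 arithmetic can be checked with a few lines of Python (an illustrative verification assuming the pairwise a + b - a*b aggregation used above; the text truncates intermediate results to three decimals):

```python
# Verifying the Stage 4 cumulative evaluations.
def theta(levels):
    out = 0.0
    for v in levels:
        out = out + v - out * v
    return out

print(round(theta([0.499, 0.18, 0.094]), 3))  # 0.628 for (AES,256,medium,average)
print(round(theta([0.499, 0.18, 0.133]), 3))  # 0.644 for (AES,256,medium,fast)
```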

11.5. STAGE 5: Construction of the Final Argument

The last stage involves the construction of the final argument, which determines the action that the device Dev1 will perform in the given circumstances.

Since the system established a single practical goal, we can create the following argument $arg_3$ based on AS5, which concludes that $\varepsilon(as_{(AES,256,medium,fast)^T,ins})$.

The argumentation scheme indicates that action as(AES,256,medium,fast)T,ins is the single practical goal of the device Dev1 and therefore this action should be performed.

Finally, the device has to perform the action: use AES-256 encryption, a medium level of compression, and choose a network with fast upload speed ($\varepsilon(as_{(AES,256,medium,fast)^T,ins})$).

12. RELATED WORKS

The issue of using a numeric representation of the levels to which values are promoted during decision-making has not yet been discussed much in the literature. The author of [24] proposes a simple model of representation of the levels of value promotion, though it is highlighted that "This is not meant to suggest that one can usually balance competing legal values through numerical computations." Atkinson and Bench-Capon [26] (a developed version of the approach from [25]) include a proposition of extending the AATS+V model by a simple numeric representation of the agent's preferences (using the product of values and their weights) and of the level of utility regarding a decision-making situation. Our concept of weights and levels of promotion is somewhat more general: in our work the importance of a value is represented by a weight function, which does not have to be linear. In order to compute the relative levels of promotion of a value (vo), we use both the levels of promotion of values, represented by numbers from ⟨0;1), and the weight function, which describes the importance of the value, while in [26] the numerical representation (expressed by weights) is constructed on the basis of a binary representation of values (values are either promoted or not by particular decisions). Our approach allows us to estimate not only the weight of a value (on the basis of its promotion or demotion) but also the relative level of promotion (which takes into consideration both the weight and the level of promotion). This is the reason why we assumed that weights are not fixed multipliers but functions (e.g., see the battery example in Section 4.1).

The process of transition from a physical value to the evaluation of the value itself (which in our work is represented by functions from the set Φ) is realized in [21] by a logic program Δ. There is a significant difference between our approach and that of [21]: our approach is based on a function returning a number (the level of promotion of a given value), while in [21] Δ allows for deriving that a value is demoted, promoted, or neutral. We believe that the numerical representation allows for a more fine-grained representation of the level of promotion of values.

The authors of [2] present a decision-making model for multi-agent systems with an agent architecture (named BVG) making it possible to reason on the basis of values. The authors provide the possibility of numeric evaluation of the value level, but they establish constant weights and do not allow for comparing various options promoting various values. However, the system presented in [2] contains an important feature which is absent in our work: the possibility of representing failure of an action (the action does not lead to the planned state of affairs). This is an interesting property, which should be discussed in the future. Several other works (e.g., [32,33]) introduce methods of numeric representation of the agent's preferences, though lacking any link to values or the possibility of autonomous goal-setting.

An interesting mechanism of numeric representation of the levels of value promotion and the relations between them is proposed in [34]. It is based on a premise largely similar to that described herein (decision-making options promote values to a different extent, and the levels thereof can be compared). However, despite said similarities, the two mechanisms differ in certain key aspects: Sartor introduces the concept of utility, which is absent from our model, as in our opinion it provides no additional benefits: in our model, the level of value promotion is determined exactly on the basis of its utility. Moreover, the weight function proposed by Sartor is linear, which, as observed in Section 4.1, often fails to correspond to real-life decision-making situations. The two models (ours and Sartor's) also differ in the manner in which certain quantities are determined (e.g., the total level of value promotion, which in Sartor's approach is obtained through simple addition). Furthermore, unlike ourselves, Sartor does not introduce any inference mechanism and offers no formal, in-depth analysis of the model he proposes.

An interesting decision-making model for an autonomous vehicle is proposed in [35], where the authors describe a very intricate framework based on the partially observable Markov decision process. The authors' primary goal is to provide a general solution for AV decision-making at any intersection. The model is developed in such a way as to interact with an arbitrary, unspecified number of other road users while at the same time accounting for the unpredictability of their behavior. The model is based on a value function rooted in the expected reward for a given policy. The value function allows the determination of the total regret, whose minimization is the primary goal of the model. The solution proposed in [35] is dedicated to a very specific type of autonomous device and provides a very advanced decision-making model in the context of behavior at a road intersection, but it does not account for a broader decision-making context based on the promotion of a variety of values and does not allow for autonomous goal-setting by a device.

13. DISCUSSION AND CONCLUSIONS

The framework included in [14] allows for the modeling of reasoning in autonomous systems. Regrettably, its practical implementation requires the declaration of a large number of orders describing relations between the levels to which various situations promote various values and sets of values. In real decision-making problems, the declaration of such a large number of orders is very challenging and is only feasible for situations with relatively small sets of values and performable actions. The main objective of our work is to propose modifications of the framework from [14] which facilitate the implementation of decision-making systems for autonomous agents. Our model can be seen as a kind of theoretical framework constituting a formal background for decision-making mechanisms. The basics of the framework are constructed in a flexible way, which allows for the development of the model by adding new functions to the sets Φ, Ω, and Λ, by utilizing various methods of obtaining these functions (including machine learning mechanisms), by adding new kinds of goals, inference rules, etc. Moreover, in particular kinds of cases (described in Section 8), the basics of the model can function as a background for optimization algorithms. Such flexibility allows for adjusting the model to various kinds of autonomous devices.

While making a decision, a person intuitively evaluates the available decision options, dividing them into better and worse ones (as presented in [14] and other works), but does not attach any numeric values to them. This paper introduces a modified approach, where the level to which particular situations promote various values is represented as a number from the range ⟨0;1). Though unlike the typical human approach, we believe this is much more natural for all kinds of technical devices. A proper definition of the functions from the set Φ allows for automatic evaluation of the decision options. The modification we propose allows for a substantial reduction of the number of orderings (declarations of O and OR will not be necessary) because the levels to which particular values are promoted can be easily compared; moreover, removing the need to search large order sets may significantly accelerate the decision-making process. The proposed mechanism of determining the cumulative evaluation of particular situations (decision options), combined with the weight functions, makes it possible to compare complex situations promoting various values to various levels. Although this entails the necessity to declare the function sets Φ and Ω, it is worth noticing that in situations where the user or the creator of the system does not know which function to use, the system may use default functions or machine learning mechanisms which, on the basis of previous data, can learn the course of the Φ and Ω functions.

The problem of reasoning with values and goals has been discussed by a number of authors; however, in our opinion, there are still many issues requiring further development. One of the most important drawbacks of the existing models is the lack of a comprehensive discussion of the definitions of values, goals, and the relations between them. The problem of the relations between values and goals, in particular, has been, in our view, insufficiently analyzed and stands in need of broader discussion.

The main objective of this paper is to present and discuss a new view of the problem of the representation of values and goals in reasoning. We have pointed out some elements in which our idea differs from other concepts of the utilization of values in reasoning, particularly those presented in [7] and [21]. We do not intend to prove wrong the ideas included in the aforementioned works or to claim they are irrelevant for the representation of various aspects of reasoning with goals and values; rather, this is an attempt to demonstrate a new and more intuitive view of the problem. Obviously, the notion of intuitiveness is a subjective one, but we believe that connecting the goals and motivations of decision makers with minimal acceptable levels of promotion of values allows us to establish a simpler and more abstract (trans-situational) way of representing the goals of agents, on the basis of which decisions can be made.

The model presented in this paper provides a comprehensive tool for the representation of values and goals in practical reasoning. Although there are decision-making models which allow for reasoning about values and goals (e.g., [21,36]), we are convinced that our model of the relations between goals and values makes a clear connection between those two concepts and allows us to create an autonomous system which can independently set itself a goal. We believe this is the most important novelty of the paper.

Most models presented in the literature view values as promoted (or demoted) by a change of a state of affairs. Our idea is slightly different: in the model we propose, values are promoted by a particular state of affairs or by actions leading to a particular state of affairs, while the level of their promotion by a change of states can be seen as the difference between the promotion of values by the initial state and by the action which leads to the target state of affairs. In our view, such an idea is more intuitive and allows for a more adequate representation of human reasoning, in which not only a change of states can be the subject of valuation, but also a group of given states of affairs. These assumptions bring our approach closer to multi-criteria decision-making problems, and we believe that our approach can be seen as a kind of bridge between value-based and multi-criteria decision systems.
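
For illustration, a minimal sketch (with hypothetical value names and levels) of this reading of promotion by a change of states as a per-value difference:

```python
def promotion_by_change(initial: dict, target: dict) -> dict:
    """Per-value difference between the levels promoted by the target
    state of affairs (or the action leading to it) and the initial state."""
    values = set(initial) | set(target)
    return {v: target.get(v, 0.0) - initial.get(v, 0.0) for v in values}

# Hypothetical promotion levels before and after an action:
before = {"safety": 0.4, "comfort": 0.6}
after = {"safety": 0.7, "comfort": 0.5, "economy": 0.2}
delta = promotion_by_change(before, after)
# delta: safety +0.3, comfort -0.1 (demoted), economy +0.2
```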

On the basis of the above definitions, value-based inference patterns have been introduced with the use of argumentation schemes. The above definition of values allows for the formulation of the concept of abstract, material, and practical goals, which in consequence leads to an improved description of various kinds of teleological reasoning.

Most models presented in the literature view goals as a particular state of affairs which the agent is going to reach. In our conception, an abstract goal is a set of extents to which values should be promoted. On the basis of such a goal, the agent can infer which action brings about the desired goal. This view of the process of teleological reasoning allows for the autonomous formulation of a goal by the agent.
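
As a minimal sketch (assuming, hypothetically, that an abstract goal is a map from value names to minimal acceptable promotion levels), the agent can filter candidate actions by whether the states of affairs they bring about promote every relevant value to the required extent:

```python
# Hypothetical abstract goal: minimal acceptable promotion level per value.
goal = {"safety": 0.7, "comfort": 0.2}

def satisfies(option: dict, goal: dict) -> bool:
    """True if the option promotes every value in the goal to at least
    the required extent."""
    return all(option.get(v, 0.0) >= threshold for v, threshold in goal.items())

# Candidate actions described by the promotion levels they bring about:
options = {
    "brake_gently": {"safety": 0.8, "comfort": 0.6},
    "brake_hard":   {"safety": 0.9, "comfort": 0.1},
}
feasible = [name for name, levels in options.items() if satisfies(levels, goal)]
# feasible == ["brake_gently"]
```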

Generally speaking, the model presented in the paper introduces a bridge between two separate but connected concepts of values and goals, providing a precise description of the relations between them as well as the possibility of reasoning about goals, which, in consequence, allows us to model the motivations of the agent.

There are a few important issues calling for further discussion which we are going to elaborate on in our future work. The first one is the problem of joint actions with multiple agents. We have assumed that by an action we understand not only a single activity performed by a single agent, but also a joint action performed by a group of agents. This is, however, a topic which requires further elaboration (the most comprehensive discussion of the problem is presented in [21]). The second issue is the potential incorporation of the model into a formal framework of legal argumentation (ASPIC+ [31] or Carneades [37]). We see this possibility as important in the context of the implementation of this model, but it still requires some elaboration which can be performed in future works. The third issue is the problem of decision-making in a legally regulated environment, including an analysis of various reasoning mechanisms [38,39], conflict resolution [40,41], interpretation [42,43], and others.

CONFLICTS OF INTEREST

The authors declare no conflicts of interest.

AUTHORS' CONTRIBUTIONS

Tomasz Zurek: 80% (The idea, formal modeling, proofs, analysis, discussion), Michail Mokkas: 20% (Case analysis).

APPENDIX: PROOFS

A.0.1. Independence of the Ordering of Values

Proof.

The proof is trivial: if set VZ includes one or two values, the order is of no importance. For a set consisting of three elements, let us assume that VZ = {v1, v2, v3} and VOZ(x) = {vo1(x), vo2(x), vo3(x)}.

Then:

For any order of accumulation of vo1(x), vo2(x), and vo3(x), the result is the same:

If we accumulate vo1(x) with vo2(x), and the result with vo3(x), then

Θ(VOZ(x)) = (vo1(x) + vo2(x) − vo1(x)·vo2(x)) + vo3(x) − (vo1(x) + vo2(x) − vo1(x)·vo2(x))·vo3(x) = vo1(x) + vo2(x) + vo3(x) − vo1(x)·vo2(x) − vo1(x)·vo3(x) − vo2(x)·vo3(x) + vo1(x)·vo2(x)·vo3(x).

If we accumulate vo1(x) with vo3(x), and the result with vo2(x), then

Θ(VOZ(x)) = (vo1(x) + vo3(x) − vo1(x)·vo3(x)) + vo2(x) − (vo1(x) + vo3(x) − vo1(x)·vo3(x))·vo2(x) = vo1(x) + vo2(x) + vo3(x) − vo1(x)·vo2(x) − vo1(x)·vo3(x) − vo2(x)·vo3(x) + vo1(x)·vo2(x)·vo3(x).

If we accumulate vo2(x) with vo3(x), and the result with vo1(x), then

Θ(VOZ(x)) = (vo2(x) + vo3(x) − vo2(x)·vo3(x)) + vo1(x) − (vo2(x) + vo3(x) − vo2(x)·vo3(x))·vo1(x) = vo1(x) + vo2(x) + vo3(x) − vo1(x)·vo2(x) − vo1(x)·vo3(x) − vo2(x)·vo3(x) + vo1(x)·vo2(x)·vo3(x).

For a higher number of values in VZ, the accumulated result of any subset may take the place of vo1(x), vo2(x), or vo3(x) and be determined with the use of the same formulae; hence, by induction, the order of accumulation is insignificant.
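
The order independence can also be checked numerically; the short sketch below (with arbitrary illustrative levels) accumulates three promotion levels in every possible order and confirms that all permutations agree:

```python
from functools import reduce
from itertools import permutations

def acc(a: float, b: float) -> float:
    """Probabilistic sum used for accumulation: a + b - a*b."""
    return a + b - a * b

levels = [0.3, 0.5, 0.8]  # arbitrary vo_i(x) values from [0, 1)
results = {round(reduce(acc, order), 12) for order in permutations(levels)}
assert len(results) == 1  # every accumulation order yields the same Θ
```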

A.0.2. Proof that Θ is Monotonic

Proof.

Assuming that the cumulative evaluation of the relative levels of promotion of the values from set VZ in a situation x is Θ(VOZ(x)), the cumulative evaluation of the relative levels of promotion of the values from set VZ ∪ {vj} will be: Θ(VOZ+vj(x)) = Θ(VOZ(x)) + vj(x) − Θ(VOZ(x))·vj(x). Since Θ(VOZ(x)) and vj(x) are from the range [0, 1), then Θ(VOZ(x)) + vj(x) − Θ(VOZ(x))·vj(x) ≥ Θ(VOZ(x)) and Θ(VOZ(x)) + vj(x) − Θ(VOZ(x))·vj(x) ≥ vj(x). In consequence, Θ(VOZ+vj(x)) ≥ Θ(VOZ(x)).
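
A quick numerical sanity check of this inequality (with purely illustrative values):

```python
def acc(a: float, b: float) -> float:
    return a + b - a * b

theta_z = 0.6  # hypothetical Θ(VOZ(x))
for vj in (0.0, 0.25, 0.9):
    # adding a value never decreases the cumulative evaluation
    assert acc(theta_z, vj) >= theta_z and acc(theta_z, vj) >= vj
```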

A.0.3. Proof that ORO Satisfies the Second Property of Relation OR.

Proof.

Θ is monotonic.

If VOX ⊆ VOZ and if VOX(x1) ⪰ VOY(x2), then the cumulative relative evaluation of the level to which a situation x1 promotes value set VOZ cannot be lower than the evaluation of VOX: Θ(VOZ(x1)) ≥ Θ(VOX(x1)). Hence Θ(VOZ(x1)) ≥ Θ(VOY(x2)) and, in consequence, VOZ(x1) ⪰ VOY(x2).

A.0.4. Proof that the Relations between O and ORO are the Same as between O and OR.

  1. The first relation will be fulfilled if: vi(x1) > vi(x2) ⇒ {voi(x1)} ⪰ {voi(x2)}.

    Proof.

    Since every function in Ω is increasing and vi(x1) > vi(x2), then voi(x1) > voi(x2). Since function Θ is non-decreasing and voi(x1) > voi(x2), then Θ({voi(x1)}) ≥ Θ({voi(x2)}). Hence if vi(x1) > vi(x2), then {voi(x1)} ⪰ {voi(x2)}.

  2. The second relation will be fulfilled if:

    (∀vz ∈ Vx2: vz(x1) ≥ vz(x2)) ∧ (∃vt ∈ Vx2: vt(x1) > vt(x2)) ⇒ VOx1(x1) ≻ VOx2(x2)

    Proof.

    Since every function in Ω is increasing and ∀vz ∈ Vx2: vz(x1) ≥ vz(x2), then (1) ∀vz ∈ Vx2: voz(x1) ≥ voz(x2) and (2) ∃vt ∈ Vx2: vot(x1) > vot(x2). Since function Θ is non-decreasing and monotonic and (1) and (2) are true, then Θ(VOx1(x1)) > Θ(VOx2(x2)), which (in consequence) gives: VOx1(x1) ≻ VOx2(x2).

  3. The third relation will be introduced differently than in [14]:

    (Vx2(x2) = Vx1≻x2(x2) ∪ Vx2≻x1(x2))

    (Vx1(x1) = Vx2≻x1(x1) ∪ Vx1≻x2(x1))

    ∃VOs(x1) ⊆ VOx1≻x2(x1): (VOs(x1) ≻ VOx2≻x1(x2)) ⇒ VOx1(x1) ≻ VOx2(x2)

    Proof.

    The above formula may be also presented as:

    ∃VOs(x1) ⊆ VOx1≻x2(x1): (Θ(VOs(x1)) > Θ(VOx2≻x1(x2))) ⇒ VOx1(x1) ≻ VOx2(x2).

    The set of values promoted by a situation x2 consists of two sets: a set of values promoted by x2 to a lower level than by x1 (VOx1≻x2(x2)) and the remaining values (that is, values promoted by x2 to a higher level than by x1 and values not promoted by x1 whatsoever: VOx2≻x1(x2)). Since function Θ is increasing and monotonic, Θ(VOx1≻x2(x1)) ≥ Θ(VOx1≻x2(x2)). If in the set of values promoted by x1 to a higher level than by x2 there exists a subset VOs(x1) which is preferred over the entire set VOx2≻x1(x2), that is Θ(VOs(x1)) > Θ(VOx2≻x1(x2)), then, considering the monotonic nature of the function Θ, Θ(VOx1(x1)) > Θ(VOx2(x2)), that is VOx1(x1) ≻ VOx2(x2), will be true.

REFERENCES

1. S.O. Hansson, Decision Theory – A Brief Introduction, 1994. https://people.kth.se/~soh/decisiontheory.pdf
3. S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, New Jersey, USA, 2009.
12. A.S. Rao and M.P. Georgeff, BDI agents: from theory to practice, in Proceedings of the First International Conference on Multi-Agent Systems (ICMAS-95), San Francisco, USA, 1995, pp. 312-319. https://www.aaai.org/Papers/ICMAS/1995/ICMAS95-042.pdf
27. H. Prakken, On the nature of argument schemes, in C.A. Reed and C. Tindale (eds.), Dialectics, Dialogue and Argumentation. An Examination of Douglas Walton's Theories of Reasoning and Argument, College Publications, London, 2010, pp. 167-185.
32. P. Giorgini, Reasoning with goal models, in Proceedings of the 21st International Conference on Conceptual Modeling (ER '02), London, UK, 2002, pp. 167-181. http://dl.acm.org/citation.cfm?id=647525.725913