Information Sciences xxx (2013) xxx–xxx
Contents lists available at SciVerse ScienceDirect
Information Sciences
journal homepage: www.elsevier.com/locate/ins

How to measure the amount of knowledge conveyed by Atanassov's intuitionistic fuzzy sets ☆

Eulalia Szmidt ⇑, Janusz Kacprzyk, Paweł Bujnowski
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland

Article info
Article history: Available online xxxx
Keywords: Intuitionistic fuzzy sets; Information; Knowledge; Entropy

Abstract
We address the problem of how to measure the amount of knowledge conveyed by an Atanassov's intuitionistic fuzzy set (A-IFS for short). The problem is relevant from the point of view of many application areas, notably decision making. The amount of knowledge considered is strongly linked to the related amount of information. Our analysis is concerned with an intrinsic relationship between the positive and negative information and a lack of information expressed by the hesitation margin. Illustrative examples are shown.
© 2013 Elsevier Inc. All rights reserved.

1. Introduction

In the case of data represented in terms of fuzzy sets, the information conveyed is expressed by a membership function. Knowledge, on the other hand, is basically information considered in a particular useful context. The transformation of information into knowledge is critical from the practical point of view (cf. [12]), and a notable example is the omnipresent problem of decision making. In this paper we are concerned with the information conveyed by a piece of data represented by an A-IFS, and then with the related knowledge that is placed in the context considered.
Information conveyed by an A-IFS may be considered a generalization of information conveyed by a fuzzy set. It consists of the two terms present in the definition of the A-IFS, i.e., the membership and non-membership functions ("responsible" for the positive and negative information, respectively). For practical purposes, however, following the general attitude of using as much information as is available, it seems expedient, even necessary, to also take into account the so-called hesitation margin (cf. [17,19,26,21,28,29,33,5,6,36–38], etc.).

Entropy is often viewed as a dual measure of the amount of knowledge. In this paper we show that entropy alone (cf. [21,28]) may not be a satisfactory dual measure of knowledge useful from the point of view of decision making in the A-IFS context. The reason is that an entropy measure answers the question about the fuzziness as such but does not consider how the fuzziness is distributed. So, two situations, one with the maximal entropy for a membership function equal to a non-membership function (e.g., both equal to 0.5), and another in which we know absolutely nothing (i.e., both equal to 0), are equal from the point of view of the entropy measure (in terms of the A-IFSs). However, from the point of view of decision making the two situations are clearly different.

We should bear in mind that a properly constructed entropy measure for fuzzy sets takes values in the interval [0, 1], so we cannot increase the value above 1 when considering the entropy of A-IFSs. Each properly constructed entropy for the A-IFSs, due to the very sense of entropy, should also take values in [0, 1]. A simple way to overcome this situation is to calculate the entropy of the A-IFSs using an entropy measure for fuzzy sets, and to add separately a term related to the hesitation margin. In other words, we might express the entropy via two numbers. But such a measure would be difficult to use while solving real problems (many conditions would have to be verified to come to a conclusion).

☆ This paper is an extended version of our paper presented during WILF 2011 [34].
⇑ Corresponding author.
E-mail addresses: [email protected] (E. Szmidt), [email protected] (J. Kacprzyk).
0020-0255/$ - see front matter © 2013 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ins.2012.12.046
Please cite this article in press as: E. Szmidt et al., How to measure the amount of knowledge conveyed by Atanassov's intuitionistic fuzzy sets, Inform. Sci. (2013), http://dx.doi.org/10.1016/j.ins.2012.12.046

Two needs motivate this paper: the need for consistency between the entropy for fuzzy sets and for A-IFSs, and the need to express in a simple way the differences that arise, from the point of view of decision making, when the hesitation margin is greater than 0 for the A-IFSs. We therefore propose a new measure of knowledge for the A-IFSs, which is meant to complement the entropy measure so as to properly capture additional features that may be relevant while making decisions.

We would like to stress that the problem does not boil down to introducing a new entropy measure or to comparing the existing entropy measures. The reason is that properly defined entropy measures for the A-IFSs cannot reflect in a simple way (from the point of view of calculation) the additional piece of information that results from hesitation margins in the A-IFS context while remaining consistent with the values obtained for fuzzy sets.

The new measure of the amount of knowledge is tested on a simple example taken from Quinlan's original paper [11] but solved using different tools than therein. This example, simple as it may appear, is a challenge to many classification and machine learning methods.
Here we use it to verify whether the new measure of the amount of knowledge leads to the same optimal solution as that obtained by Quinlan. The data for the other example we consider come from the benchmark data set known as "Sonar" [40].

2. Brief introduction to the A-IFSs

The concept of a fuzzy set in X [39], given by

$$A' = \{\langle x, \mu_{A'}(x)\rangle \mid x \in X\} \tag{1}$$

where $\mu_{A'}(x) \in [0, 1]$ is the membership function of the fuzzy set $A'$, can be generalized. One of the well-known generalizations is that of an A-IFS [1–3] A, which is given by

$$A = \{\langle x, \mu_A(x), \nu_A(x)\rangle \mid x \in X\} \tag{2}$$

where $\mu_A: X \to [0, 1]$ and $\nu_A: X \to [0, 1]$ are such that

$$0 \le \mu_A(x) + \nu_A(x) \le 1 \tag{3}$$

and $\mu_A(x), \nu_A(x) \in [0, 1]$ denote the degree of membership and the degree of non-membership of $x \in X$, respectively.

An additional concept related to an A-IFS in X, which is not only an obvious result of (2) and (3) but is also relevant for applications, is called the hesitation margin of $x \in A$, given by (cf. [2])

$$\pi_A(x) = 1 - \mu_A(x) - \nu_A(x) \tag{4}$$

which expresses (a degree of) lack of information on whether x belongs to A or not. It is obvious that $0 \le \pi_A(x) \le 1$ for each $x \in X$.

The hesitation margin (4) turns out to be important while considering distances [17,19,26,33], entropy [21,25,28], similarity [29] for the A-IFSs, ranking [30,31], etc., i.e., the measures that play a crucial role in virtually all information processing tasks. Hesitation margins also turn out to be relevant for applications: in image processing (cf. [5,6]), in the classification of imbalanced and overlapping classes (cf. [36–38]), and in group decision making, negotiations, voting and other situations [16,18,20,22–24,27].

As we will use the three-term representation of A-IFSs, each element x will be described via a triple $(\mu, \nu, \pi)$, i.e., by the membership $\mu$, non-membership $\nu$, and hesitation margin $\pi$.

The concept of a complement of an A-IFS A is clearly crucial. It is denoted by $A^C$ and defined usually, and also here, as (cf. [2]):

$$A^C = \{\langle x, \nu_A(x), \mu_A(x), \pi_A(x)\rangle \mid x \in X\} \tag{5}$$

2.1. Two geometrical representations of the A-IFSs

For each element x belonging to an A-IFS A, the values of membership, non-membership and the intuitionistic fuzzy index (the hesitation margin) sum up to one, i.e.,

$$\mu_A(x) + \nu_A(x) + \pi_A(x) = 1 \tag{6}$$

and each of the three values is from [0, 1]. We can therefore imagine a unit cube (Fig. 1) inside which there is an MNH triangle where the above equation is fulfilled. In other words, the MNH triangle represents a surface on which the coordinates of any element belonging to an A-IFS can be represented. Each point belonging to the MNH triangle is described via three coordinates: $(\mu, \nu, \pi)$. Points M and N represent crisp elements. Point M(1, 0, 0) represents elements fully belonging to an A-IFS, as $\mu = 1$. Point N(0, 1, 0) represents elements fully not belonging to an A-IFS, as $\nu = 1$. Point H(0, 0, 1) represents elements about which we are not able to say whether they belong or do not belong to an A-IFS (intuitionistic fuzzy index $\pi = 1$). Such an interpretation is intuitively appealing and provides means for the representation of many aspects of imperfect information.

Fig. 1. Geometrical representation in 3D.

Segment MN (where $\pi = 0$) represents elements belonging to classical fuzzy sets ($\mu + \nu = 1$). Any other combination of the values characterizing an A-IFS can be represented inside the triangle MNH. In other words, each element belonging to an A-IFS can be represented as a point $(\mu, \nu, \pi)$ belonging to the triangle MNH (Fig. 1).
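The three-term description of a single element can be sketched in a few lines of Python. This is only an illustrative sketch: the class name `IFSElement` and its methods are hypothetical names introduced here, not notation from the paper, and the element (0.5, 0.1) is an arbitrary example. The class checks condition (3), derives the hesitation margin from (4), and forms the complement as in (5).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IFSElement:
    """One element of an A-IFS (hypothetical helper, for illustration only)."""

    mu: float  # degree of membership
    nu: float  # degree of non-membership

    def __post_init__(self) -> None:
        # Condition (3): both degrees lie in [0, 1] and their sum is at most 1.
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.nu <= 1.0):
            raise ValueError("mu and nu must lie in [0, 1]")
        if self.mu + self.nu > 1.0 + 1e-12:
            raise ValueError("mu + nu must not exceed 1")

    @property
    def pi(self) -> float:
        # Hesitation margin, Eq. (4); tiny negative rounding errors are clamped to 0.
        return max(0.0, 1.0 - self.mu - self.nu)

    def triple(self) -> Tuple[float, float, float]:
        # Three-term representation (mu, nu, pi); the terms sum to 1 as in Eq. (6).
        return (self.mu, self.nu, self.pi)

    def complement(self) -> "IFSElement":
        # Eq. (5): membership and non-membership swap, the hesitation margin is unchanged.
        return IFSElement(self.nu, self.mu)


# Vertices of the MNH triangle and one arbitrary interior element.
M = IFSElement(1.0, 0.0)
N = IFSElement(0.0, 1.0)
H = IFSElement(0.0, 0.0)
x = IFSElement(0.5, 0.1)
```

Storing only the two degrees and deriving the hesitation margin on demand is a design choice of this sketch; it keeps (6) satisfied by construction.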
It is worth mentioning that the geometrical interpretation is directly related to the definition of an A-IFS introduced by Atanassov [1,2], and it does not need any additional assumptions.

Another possible geometrical representation of an A-IFS is in two dimensions (2D), see Fig. 2 (cf. [2]). It is worth noticing that although we use a 2D figure (which is more convenient to draw in many cases), we still adopt our approach (e.g., [19,26,21,28,29,33,4]) of taking into account all three terms (membership, non-membership and hesitation margin values) describing the A-IFS. As previously, any element belonging to an A-IFS may be represented inside an MNO triangle (O is the projection of H in Fig. 1). Each point belonging to the MNO triangle is still described by the three coordinates $(\mu, \nu, \pi)$, and points M and N represent, as previously, crisp elements. Point M(1, 0, 0) represents elements fully belonging to an A-IFS, as $\mu = 1$, and point N(0, 1, 0) represents elements fully not belonging to an A-IFS, as $\nu = 1$. Point O(0, 0, 1) represents elements about which we are not able to say whether they belong or do not belong to an A-IFS (the intuitionistic fuzzy index $\pi = 1$). Segment MN (where $\pi = 0$) represents elements belonging to classical fuzzy sets ($\mu + \nu = 1$). For example, point $x_1(0.2, 0.8, 0)$ (Fig. 2), like any element from segment MN, represents an element of a fuzzy set. A line parallel to MN describes the elements with the same value of the hesitation margin. In Fig. 2 we can see point $x_3(0.5, 0.1, 0.4)$ representing an element with the hesitation margin equal to 0.4, and point $x_2(0.2, 0, 0.8)$ representing an element with the hesitation margin equal to 0.8. The closer a line parallel to MN is to O, the higher the hesitation margin.

Fig. 2. Geometrical representation in 2D.

By employing the above (equivalent) geometrical representations, we can calculate distances between any two A-IFSs containing n elements.

2.2. Distances between the A-IFSs

In Szmidt and Kacprzyk [19], Szmidt and Baldwin [13,14], and especially in Szmidt and Kacprzyk [26,33], it is shown why, when calculating distances between A-IFSs, we should take into account all three functions describing A-IFSs. In [26] not only are the reasons for taking all three functions into account given, but also some serious problems that can occur when only two functions are taken into account.

In our further considerations we will use the normalized Hamming distance between any two A-IFSs A and B in $X = \{x_1, x_2, \ldots, x_n\}$ (cf. [19,26]):

$$l_{IFS}(A, B) = \frac{1}{2n} \sum_{i=1}^{n} \bigl( |\mu_A(x_i) - \mu_B(x_i)| + |\nu_A(x_i) - \nu_B(x_i)| + |\pi_A(x_i) - \pi_B(x_i)| \bigr) \tag{7}$$

The distance is from the interval [0, 1] and fulfills all the conditions of a metric.

2.3. Entropy of the A-IFSs

We will verify here whether the entropy of the A-IFSs may be a reliable measure of the amount of knowledge from the point of view of decision making. The entropy, as considered here, is meant as a non-probabilistic-type entropy measure for the A-IFSs in the sense of De Luca and Termini's [10] axioms, which are intuitive and have been widely employed in the fuzzy literature. The axioms were properly reformulated for the A-IFSs (see [21]) and are discussed at length by Szmidt and Kacprzyk [21,25,28]. Here we recall only the basic idea.

The entropy, as considered here, answers the question: how fuzzy is a fuzzy set? In other words, entropy E(x) measures the missing information which may be necessary to say whether an element x described by $(\mu, \nu, \pi)$ fully belongs or fully does not belong to our set.

Definition 1.
A ratio-based measure of fuzziness, i.e., the entropy of an (intuitionistic fuzzy) element x, is given in the following way [21]:

$$E(x) = \frac{a}{b} \tag{8}$$

where a is the distance $(x, x_{near})$ from x to the nearer element $x_{near}$ among the elements M(1, 0, 0) and N(0, 1, 0), and b is the distance $(x, x_{far})$ from x to the farther element $x_{far}$ among the elements M(1, 0, 0) and N(0, 1, 0).

By the very sense of entropy, we assume that measure (8) is equal to 0 for both crisp elements M(1, 0, 0) and N(0, 1, 0). In other words, we do not allow the situation in which the denominator in (8) is equal to 0. Different ways of expressing entropy E(x) (8), with examples, are presented by Szmidt and Kacprzyk in [21,25,28].

Formula (8) describes the degree of fuzziness for a single element belonging to an A-IFS. For n elements belonging to an A-IFS we have

$$E = \frac{1}{n} \sum_{i=1}^{n} E(x_i) \tag{9}$$

Fig. 3 shows the values of entropy E(x) (8) and its contour plot as a function of the membership and non-membership values. To better see the shape, the entropy is presented for $\mu$ and $\nu$ over the whole range [0, 1] (instead of for $\mu + \nu \le 1$ only). For the same reason (to better see the shape), the contour plot of the entropy (8) is given only for the range of $\mu$ and $\nu$ for which $\mu + \nu \le 1$.

Fig. 3. (a) Entropy E(x) (8) and (b) its contour plot.

Fig. 4. A typical shape of a properly defined entropy measure.

It is worth noticing (cf. Fig. 3) that for all the values of the hesitation margin, the entropy (8) reaches its maximum (equal to 1) for the membership value $\mu$ equal to the non-membership value $\nu$ (segment GH in Fig. 3b).
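The distance (7) and the entropy (8) and (9) can be sketched directly from their definitions. The following Python sketch is only an illustration under stated assumptions: elements are stored as plain triples (mu, nu, pi), and the function names `hamming`, `element_entropy` and `set_entropy` are hypothetical names chosen here, not taken from the paper.

```python
from typing import Sequence, Tuple

Element = Tuple[float, float, float]  # (mu, nu, pi), assumed to sum to 1

M: Element = (1.0, 0.0, 0.0)  # crisp element: fully belongs
N: Element = (0.0, 1.0, 0.0)  # crisp element: fully does not belong


def hamming(a: Sequence[Element], b: Sequence[Element]) -> float:
    """Normalized Hamming distance between two A-IFSs, Eq. (7)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both sets must be non-empty and of equal size")
    total = sum(abs(p - q) for ea, eb in zip(a, b) for p, q in zip(ea, eb))
    return total / (2 * len(a))


def element_entropy(x: Element) -> float:
    """Ratio-based entropy of a single element, Eq. (8): E(x) = a / b."""
    d_m = hamming([x], [M])
    d_n = hamming([x], [N])
    near, far = min(d_m, d_n), max(d_m, d_n)
    if far == 0.0:
        # Cannot happen for a valid triple; guarded because (8) excludes b = 0.
        raise ValueError("denominator in (8) must not be 0")
    return near / far


def set_entropy(elements: Sequence[Element]) -> float:
    """Entropy of an A-IFS with n elements, Eq. (9): the mean of E(x_i)."""
    return sum(element_entropy(x) for x in elements) / len(elements)
```

With these definitions, `element_entropy` returns 0 for M and N and 1 for both (0.5, 0.5, 0) and (0, 0, 1), which agrees with the behaviour of (8) described above.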
Entropy (8) is equal to 0 for the two crisp elements M(1, 0, 0) and N(0, 1, 0) (presented in Figs. 1 and 2 and having their counterparts in Fig. 3a and b). On segment MN the entropy increases from 0 (at both M and N) towards G (for which $\mu = \nu$), where its value is equal to 1 (Fig. 4). The shape of the entropy on segment MN (Fig. 4) is typical for entropy and repeats on each segment parallel to MN (i.e., on each line representing elements with the same value of the hesitation margin). This is a proper feature of any entropy measure, but a question arises whether such an entropy measure conveys all the knowledge that is important from the point of view of decision making.

3. Desirable features of a measure of information, and measure of knowledge for the A-IFSs

Information concerning a separate element x belonging to an A-IFS is equal to $\mu(x) + \nu(x)$, or [cf. (4)] $1 - \pi(x)$. But this is one aspect of information only. For each fixed $\pi$ there are different possible combinations of $\mu$ and $\nu$, and the combination strongly influences the amount of knowledge from the point of view of decision making. The knowledge (for a fixed $\pi$) is different when the values of $\mu$ and $\nu$ are distant and when they are close. For example, if $\pi = 0.1$, then it can well be argued that the knowledge from the point of view of decision making for $\mu = 0.9$ and $\nu = 0$, or for $\mu = 0.7$ and $\nu = 0.2$, is bigger than for the case $\mu = 0.45$ and $\nu = 0.45$ (although in all cases $\mu + \nu = 0.9$). The entropy measure proposed by Szmidt and Kacprzyk [21,25] takes this aspect into account, and it is a good measure answering the question of how fuzzy an A-IFS is.

On the other hand, while making decisions, one is interested in distinguishing between the following situations [35]:
we have no information at all, and
we have a large number of arguments in favor but an equally large number of arguments in favor of the opposite statement (against).
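To see concretely why entropy alone cannot separate these two situations, one can evaluate (8) with the normalized Hamming distance (7) taken for a single element (n = 1). Writing G = (0.5, 0.5, 0) for "equally many arguments for and against" and H = (0, 0, 1) for "no information at all", both distances to M(1, 0, 0) and N(0, 1, 0) coincide in each case:

$$
\begin{aligned}
E(G) &= \frac{\tfrac{1}{2}\,(0.5 + 0.5 + 0)}{\tfrac{1}{2}\,(0.5 + 0.5 + 0)} = \frac{0.5}{0.5} = 1, \\
E(H) &= \frac{\tfrac{1}{2}\,(1 + 0 + 1)}{\tfrac{1}{2}\,(0 + 1 + 1)} = \frac{1}{1} = 1 .
\end{aligned}
$$

Both elements attain the maximal entropy, although G carries information $\mu + \nu = 1$ and H carries none.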
In other words, we would like to have a measure that makes a difference between (0.5, 0.5, 0) and (0, 0, 1). To distinguish between these two types of situations, we should take into account, besides the entropy measure, also the hesitation margin $\pi$.

It seems that a good measure of the amount of knowledge (useful from the point of view of decision making, which is in our case the useful context) connected to a separate element x of an A-IFS is:

$$K(x) = 1 - 0.5\,(E(x) + \pi(x)) \tag{10}$$

where E(x) is the entropy measure given by (8) [21] and $\pi(x)$ is the hesitation margin. Measure K(x) (10) makes it possible to meaningfully represent what, in our context, is meant by the amount of knowledge, and it is simple, both conceptually and numerically, which will certainly be a big asset while solving complex real-world problems.

The properties of (10) are:

1. $0 \le K(x) \le 1$.
2. $K(x) = K(x^C)$.
3. For a fixed E(x), K(x) increases while $\pi$ decreases.
4. For a fixed value of $\pi$, K(x) behaves dually to an entropy measure (i.e., as $1 - E(x)$).

Fig. 5. (a) Measure K(x) and (b) its contour plot.

In Fig. 5 we can see the shape of K(x) and its contour plot. We may easily notice the desirable differences, from the point of view of decision making, between the shapes of K(x) and of the entropy E(x) (Fig. 3). Measure K is equal to 0.5 for G(0.5, 0.5, 0), i.e., for the "most fuzzy" element of a fuzzy set (of course, in the sense that this triple representation, which is used for the A-IFSs, is equivalent to the "membership–nonmembership" representation of a fuzzy set). Since E(x) = 1 along the whole segment GH while $\pi$ grows from 0 to 1, it follows from (10) that K decreases along segment GH, reaching its minimal value 0 for H(0, 0, 1).
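Measure (10) is easy to evaluate numerically. The sketch below is an illustration, not the authors' implementation: the function names are hypothetical, and the entropy is computed from a closed form that follows from (7) and (8) for a single element. Because $\nu + \pi = 1 - \mu$, the Hamming distance from x to M(1, 0, 0) equals $1 - \mu$, and similarly the distance to N(0, 1, 0) equals $1 - \nu$, so $E(x) = (1 - \max(\mu, \nu)) / (1 - \min(\mu, \nu))$.

```python
def hesitation(mu: float, nu: float) -> float:
    """Hesitation margin, Eq. (4); tiny negative rounding errors are clamped to 0."""
    return max(0.0, 1.0 - mu - nu)


def entropy(mu: float, nu: float) -> float:
    """Entropy (8) under the Hamming distance (7), in closed form.

    d(x, M) = 1 - mu and d(x, N) = 1 - nu, hence
    E(x) = (1 - max(mu, nu)) / (1 - min(mu, nu)).
    """
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0) or mu + nu > 1.0 + 1e-12:
        raise ValueError("need mu, nu in [0, 1] with mu + nu <= 1")
    # For valid inputs min(mu, nu) <= 0.5, so the denominator is at least 0.5.
    return (1.0 - max(mu, nu)) / (1.0 - min(mu, nu))


def knowledge(mu: float, nu: float) -> float:
    """Amount of knowledge for one element, Eq. (10)."""
    return 1.0 - 0.5 * (entropy(mu, nu) + hesitation(mu, nu))


# Elements with the same hesitation margin 0.1 but different mu/nu balance.
examples = [(0.9, 0.0), (0.7, 0.2), (0.45, 0.45)]
values = [knowledge(mu, nu) for mu, nu in examples]
```

Under these definitions `knowledge(1, 0)` is 1, `knowledge(0.5, 0.5)` is 0.5 and `knowledge(0, 0)` is 0, so the elements (0.5, 0.5, 0) and (0, 0, 1), which share the entropy value 1, are told apart by K. For the three elements with hesitation margin 0.1, K decreases as the membership and non-membership values get closer to each other.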
For the other elements of A-IFSs, the values of K are such that the values of their entropy are increased according to the values of the hesitation margins (reflecting the additional information carried by the hesitation margins, which contributes to capturing the essence of the amount of knowledge). It is worth noticing that the components in (10) are independent, i.e., entropy E is independent of $\pi$, since the shape of E (as presented in Fig. 3) is always the same (cf. Fig. 4) regardless of $\pi$. In fact, as we have already emphasized, this was our motivation to propose the measure K (10). Obviously, the use of the mean (average) value in (10) is the straightforward, commonly used choice, motivated by its simplicity and good mathematical properties, but one can well imagine the use of other aggregation operators.

In the case of n elements, the total amount of knowledge K is:

$$K = \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - 0.5\,(E(x_i) + \pi(x_i))\bigr) \tag{11}$$

Now we will test and verify the proposed measure of knowledge K. First, we will use the problem formulated by Quinlan [11] to see if we obtain similar results. Next, we will consider the "Sonar" benchmark data from [40].

3.1. Examples

Quinlan's example, the so-called "Saturday Morning", considers classification with nominal data. The example is small and illustrative, yet it is a challenge to many classification and machine learning methods. The main idea of Quinlan's solution is to select the best attribute to split the training set (Quinlan used the so-called Information Gain, which is a dual measure to Shannon's entropy). Quinlan obtained 100% accuracy and the optimal solution (the minimal possible tree). The ranking of the attributes as pointed out by Quinlan [11] is therefore the best as far as the amount of knowledge is concerned.

In Quinlan's example the objects are described by attributes. Each attribute represents a feature and takes on discrete, mutually exclusive values.
For example, if the objects were "Saturday Mornings" and the classification involved the weather, possible attributes might be [11]:

outlook, with values {sunny, overcast, rain},
temperature, with values {cool, mild, hot},
humidity, with values {high, normal}, and
windy, with values {true, false}.

A particular Saturday morning, an example, might be described as: outlook: sunny; temperature: mild; humidity: high; windy: true.

Each object (example) belongs to one of two mutually exclusive classes, C, i.e., C = {P, N}, where P denotes the set of positive examples and N that of negative examples. There are 14 training examples, as shown in Table 1.

Table 1. The "Saturday Morning" data from [11].

| No. | Outlook | Temperature | Humidity | Windy | Class |
|---|---|---|---|---|---|
| 1 | Sunny | Hot | High | False | N |
| 2 | Sunny | Hot | High | True | N |
| 3 | Overcast | Hot | High | False | P |
| 4 | Rain | Mild | High | False | P |
| 5 | Rain | Cool | Normal | False | P |
| 6 | Rain | Cool | Normal | True | N |
| 7 | Overcast | Cool | Normal | True | P |
| 8 | Sunny | Mild | High | False | N |
| 9 | Sunny | Cool | Normal | False | P |
| 10 | Rain | Mild | Normal | False | P |
| 11 | Sunny | Mild | Normal | True | P |
| 12 | Overcast | Mild | High | True | P |
| 13 | Overcast | Hot | Normal | False | P |
| 14 | Rain | Mild | High | True | N |

Each training example e is represented by the attribute-value pairs, i.e., $\{(A_i, a_{i,j});\ i = 1, \ldots, l_i\}$, where $A_i$ is an attribute and $a_{i,j}$ is its value, one of j possible values (for each ith attribute j can be different, e.g., for outlook: j = 3, for humidity: j = 2, etc.).

First, as in Szmidt and Kacprzyk [32], we make use of a frequency description of the problem, close to that of De Carvalho et al. [7–9], who use histograms to derive some proximity measures. The frequency measure (Table 2) used for the description of the data (Table 1) is:

$$f(A_i, a_{i,j}, C) = V(C; A_i = a_{i,j}) / p_C \tag{12}$$

where $C \in \{P, N\}$; $V(C; A_i = a_{i,j})$ is the number of training examples of C for which $A_i = a_{i,j}$; and $p_C$ is the number of the training examples of C.

Table 2. The frequencies obtained (Outlook: S = sunny, O = overcast, R = rain; Temperature: H = hot, M = mild, C = cool; Humidity: H = high, N = normal; Windy: T = true, F = false).

| | Outlook S | Outlook O | Outlook R | Temperature H | Temperature M | Temperature C | Humidity H | Humidity N | Windy T | Windy F |
|---|---|---|---|---|---|---|---|---|---|---|
| Positive | 2/9 | 4/9 | 3/9 | 2/9 | 4/9 | 3/9 | 3/9 | 6/9 | 3/9 | 6/9 |
| Negative | 3/5 | 0 | 2/5 | 2/5 | 2/5 | 1/5 | 4/5 | 1/5 | 3/5 | 2/5 |

To describe the "Saturday Morning" data via the A-IFSs, we use an algorithm, based on the mass assignment theory and proposed in [15], to assign the parameters of an A-IFS model which describes the attributes (the relative frequency distribution functions given in Table 2 were the starting point of the algorithm). The assigned description of the attributes in terms of A-IFSs is given in Table 3 (all three terms describing A-IFSs are presented, i.e., the values of the membership and non-membership degrees, and of the hesitation margin).

Table 3. The counterpart A-IFS model (value abbreviations as in Table 2).

| | Outlook S | Outlook O | Outlook R | Temperature H | Temperature M | Temperature C | Humidity H | Humidity N | Windy T | Windy F |
|---|---|---|---|---|---|---|---|---|---|---|
| Hesitation margins | 0.67 | 0 | 0.69 | 0.67 | 1 | 0.49 | 0.67 | 0.4 | 0.67 | 0.8 |
| Membership values | 0 | 1 | 0.2 | 0 | 0 | 0.4 | 0 | 0.6 | 0 | 0.2 |
| Non-membership values | 0.33 | 0 | 0.11 | 0.33 | 0 | 0.11 | 0.33 | 0 | 0.33 | 0 |

In other words, we have a counterpart table in which the problem (Table 1) is described in terms of A-IFSs (see Tables 3 and 4), i.e., by triples $(\mu(\cdot), \nu(\cdot), \pi(\cdot))$; for instance, (0, 0.33, 0.67) is for "sunny".

Table 4. The "Saturday Morning" data in terms of A-IFSs.

| No. | Outlook | Temperature | Humidity | Windy | Class |
|---|---|---|---|---|---|
| 1 | (0, 0.33, 0.67) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | (0.2, 0, 0.8) | N |
| 2 | (0, 0.33, 0.67) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | N |
| 3 | (1, 0, 0) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | (0.2, 0, 0.8) | P |
| 4 | (0.2, 0.11, 0.69) | (0, 0, 1) | (0, 0.33, 0.67) | (0.2, 0, 0.8) | P |
| 5 | (0.2, 0.11, 0.69) | (0.4, 0.11, 0.49) | (0.6, 0, 0.4) | (0.2, 0, 0.8) | P |
| 6 | (0.2, 0.11, 0.69) | (0.4, 0.11, 0.49) | (0.6, 0, 0.4) | (0, 0.33, 0.67) | N |
| 7 | (1, 0, 0) | (0.4, 0.11, 0.49) | (0.6, 0, 0.4) | (0, 0.33, 0.67) | P |
| 8 | (0, 0.33, 0.67) | (0, 0, 1) | (0, 0.33, 0.67) | (0.2, 0, 0.8) | N |
| 9 | (0, 0.33, 0.67) | (0.4, 0.11, 0.49) | (0.6, 0, 0.4) | (0.2, 0, 0.8) | P |
| 10 | (0.2, 0.11, 0.69) | (0, 0, 1) | (0.6, 0, 0.4) | (0.2, 0, 0.8) | P |
| 11 | (0, 0.33, 0.67) | (0, 0, 1) | (0.6, 0, 0.4) | (0, 0.33, 0.67) | P |
| 12 | (1, 0, 0) | (0, 0, 1) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | P |
| 13 | (1, 0, 0) | (0, 0.33, 0.67) | (0.6, 0, 0.4) | (0.2, 0, 0.8) | P |
| 14 | (0.2, 0.11, 0.69) | (0, 0, 1) | (0, 0.33, 0.67) | (0, 0.33, 0.67) | N |

Having the description of Quinlan's example in terms of the A-IFSs, we can examine the effects of evaluating the attributes by the proposed measure of knowledge K (10) from the point of view of taking most advantage of the available knowledge. Table 5 gives the results obtained for each attribute from the point of view of the measure of the amount of knowledge K (10), entropy E (8) and (9), and an average hesitation margin $\bar{\pi}$.

Table 5. Evaluation of attributes of the "Saturday Morning" data.

| Attribute (1) | K (2) | Entropy (3) | $\bar{\pi}$ (4) |
|---|---|---|---|
| Outlook | 0.49 | 0.56 | 0.453 |
| Temperature | 0.23 | 0.813 | 0.72 |
| Humidity | 0.47 | 0.535 | 0.535 |
| Windy | 0.26 | 0.744 | 0.735 |

Let us recall that in the original solution given by Quinlan [11] (leading to the minimal tree), the order of the attributes, from the most informative one, is the following:

Outlook; Humidity; Windy; Temperature

If we order the attributes taking into account the entropy (8) and (9), the most informative attributes are indicated by the smallest values in the 3rd column of Table 5:

Humidity; Outlook; Windy; Temperature

i.e., the order of the attributes is different (Humidity replaced Outlook; this order would not result in the smallest tree).

On the other hand, if we order the attributes taking into account the minimal average values of the hesitation margin only, the most informative attributes are (Table 5, 4th column):

Outlook; Humidity; Temperature; Windy

i.e., again, the order of the attributes is changed (Temperature replaced Windy).

However, when we apply the measure of the amount of knowledge (10), the results are the same as those of Quinlan (leading to the minimal tree, i.e., the most valuable from the point of view of decision making; see the 2nd column in Table 5), i.e.:

Outlook; Humidity; Windy; Temperature

We have also tested the usefulness of measure K on the benchmark "Sonar" data [40] with 208 objects and 60 attributes (each taking values from 0.0 to 1.0). The objects are classified into two classes: "Rock" (97 objects) and "Mine" (111 objects). As for the previous example, we have constructed an A-IFS counterpart of the data. For each attribute the measure of the amount of knowledge K (10), entropy E (8) and (9), and an average hesitation margin $\bar{\pi}$ have been calculated. The results for the ten most important attributes are given in Table 6. Taking into account the measure K, attribute 11 turns out to be the most important, whereas from the point of view of entropy, attribute 12 is the best. To assess both measures we have constructed two intuitionistic fuzzy trees: the first tree uses attribute 11 first, the second tree uses attribute 12 first.
We have verified the accuracy of classification for both trees. The results are shown in Table 7. The values of classification accuracy presented in Table 7 have been obtained by generating the trees considered (of depth equal to 1) 100 times with 10-fold cross-validation. The tree making use of attribute 11 (i.e., the attribute pointed out by the measure K) has recognized each class better (rocks: 68%, mines: 81%) than the tree making use of attribute 12 pointed out by the entropy (rocks: 64%, mines: 80%). The general accuracy of the simple tree is also better when using the attribute pointed out by the measure K (75%) than when using the attribute pointed out by entropy (72%).

Table 6. Evaluation of attributes of the "Sonar" data.

| Attribute (1) | K (2) | Entropy (3) | $\bar{\pi}$ (4) |
|---|---|---|---|
| Attribute 11 | 0.354 | 0.711 | 0.581 |
| Attribute 12 | 0.330 | 0.704 | 0.637 |
| Attribute 10 | 0.312 | 0.775 | 0.602 |
| Attribute 49 | 0.308 | 0.772 | 0.612 |
| Attribute 21 | 0.302 | 0.769 | 0.628 |
| Attribute 36 | 0.299 | 0.769 | 0.632 |
| Attribute 48 | 0.292 | 0.799 | 0.618 |
| Attribute 9 | 0.283 | 0.798 | 0.637 |
| Attribute 45 | 0.270 | 0.827 | 0.632 |
| Attribute 44 | 0.270 | 0.843 | 0.618 |

Table 7. Evaluation of attributes of the "Sonar" data.

| Chosen attribute | Accuracy: Rocks | Accuracy: Mines | Accuracy: Both classes |
|---|---|---|---|
| Attribute 11 (K) | 67.88 | 80.59 | 74.66 |
| Attribute 12 (Entropy) | 63.73 | 80.05 | 72.44 |

It is worth mentioning that we have discussed the differences in the attribute order as pointed out by the measure K and the entropy as far as the first attributes are concerned.
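The attribute orderings reported for both data sets can be re-derived from the tabulated entropy and average hesitation margin values. The sketch below is an illustration under stated assumptions: the dictionaries copy the Entropy and $\bar{\pi}$ columns of Tables 5 and 6, K is recomputed from these rounded averages with formula (10), and all names (`SATURDAY`, `SONAR`, `rank_by_knowledge`, and so on) are hypothetical. It does not reconstruct the A-IFS model of the data.

```python
from typing import Dict, Hashable, List, Tuple

# attribute -> (entropy E, average hesitation margin), copied from Tables 5 and 6.
SATURDAY: Dict[Hashable, Tuple[float, float]] = {
    "Outlook": (0.56, 0.453),
    "Temperature": (0.813, 0.72),
    "Humidity": (0.535, 0.535),
    "Windy": (0.744, 0.735),
}
SONAR: Dict[Hashable, Tuple[float, float]] = {
    11: (0.711, 0.581), 12: (0.704, 0.637), 10: (0.775, 0.602),
    49: (0.772, 0.612), 21: (0.769, 0.628), 36: (0.769, 0.632),
    48: (0.799, 0.618), 9: (0.798, 0.637), 45: (0.827, 0.632),
    44: (0.843, 0.618),
}


def knowledge(entropy: float, hesitation: float) -> float:
    """Formula (10) applied to an attribute's entropy and average hesitation margin."""
    return 1.0 - 0.5 * (entropy + hesitation)


def rank_by_knowledge(table: Dict[Hashable, Tuple[float, float]]) -> List[Hashable]:
    # The higher the amount of knowledge, the better.
    return sorted(table, key=lambda a: knowledge(*table[a]), reverse=True)


def rank_by_entropy(table: Dict[Hashable, Tuple[float, float]]) -> List[Hashable]:
    # The smaller the entropy, the better; sorted() is stable, so ties keep table order.
    return sorted(table, key=lambda a: table[a][0])


def rank_by_hesitation(table: Dict[Hashable, Tuple[float, float]]) -> List[Hashable]:
    # The smaller the average hesitation margin, the better.
    return sorted(table, key=lambda a: table[a][1])


for name, table in (("Saturday Morning", SATURDAY), ("Sonar", SONAR)):
    print(name, "by K:      ", rank_by_knowledge(table))
    print(name, "by entropy:", rank_by_entropy(table))
```

With these inputs, ranking by K puts Outlook first for the "Saturday Morning" data and attribute 11 first for the "Sonar" data, while ranking by entropy puts Humidity and attribute 12 first, respectively.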
But we may notice that in general the order of the other attributes as given by K (the higher the amount of knowledge, the better), i.e., 11, 12, 10, 49, 21, 36, 48, 9, 45, 44, is different from the order pointed out by entropy (the smaller the entropy, the better), i.e., 12, 11, 21, 36, 49, 10, 9, 48, 45, 44. Obviously, we may imagine data sets for which the results given by K and the entropy coincide, but the idea of introducing a measure of the amount of knowledge K, which in a sense extends entropy by stressing the importance of hesitation margins, seems fully justified, in particular from the point of view of decision making.

4. Conclusions

A new measure of the amount of knowledge for information conveyed by the A-IFSs was proposed, with emphasis on its usefulness for decision making. The new measure exhibits the advantages of the entropy measure (reflecting a relationship between the positive and negative information) and additionally emphasizes the influence of the lacking information (expressed by the hesitation margins).

References

[1] K. Atanassov, Intuitionistic Fuzzy Sets, VII ITKR Session, Sofia (Deposed in Centr. Sci.-Techn. Library of Bulg. Acad. of Sci., 1697/84), 1983 (in Bulgarian).
[2] K. Atanassov, Intuitionistic Fuzzy Sets: Theory and Applications, Springer-Verlag, 1999.
[3] K. Atanassov, On Intuitionistic Fuzzy Sets Theory, Springer-Verlag, 2012.
[4] K. Atanassov, V. Tasseva, E. Szmidt, J. Kacprzyk, On the geometrical interpretations of the intuitionistic fuzzy sets, in: K. Atanassov, J. Kacprzyk, M. Krawczak, E. Szmidt (Eds.), Issues in the Representation and Processing of Uncertain and Imprecise Information. Fuzzy Sets, Intuitionistic Fuzzy Sets, Generalized Nets, and Related Topics, EXIT, Warsaw, 2005.
[5] H. Bustince, V. Mohedano, E. Barrenechea, M. Pagola, An algorithm for calculating the threshold of an image representing uncertainty through A-IFSs, in: IPMU'2006, 2006, pp. 2383–2390.
[6] H. Bustince, V. Mohedano, E. Barrenechea, M. Pagola, Image thresholding using intuitionistic fuzzy sets, in: K. Atanassov, J. Kacprzyk, M. Krawczak, E. Szmidt (Eds.), Issues in the Representation and Processing of Uncertain and Imprecise Information. Fuzzy Sets, Intuitionistic Fuzzy Sets, Generalized Nets, and Related Topics, EXIT, Warsaw, 2005.
[7] F.A.T. De Carvalho, Proximity coefficients between Boolean symbolic objects, in: E. Diday et al. (Eds.), New Approaches in Classification and Data Analysis, Springer-Verlag, Heidelberg, 1994, pp. 387–394.
[8] F.A.T. De Carvalho, Extension based proximities between Boolean symbolic objects, in: C. Hayashi et al. (Eds.), Data Science, Classification and Related Methods, Springer-Verlag, Tokyo, 1998, pp. 370–378.
[9] F.A.T. De Carvalho, R.M.C. Souza, Statistical proximity functions of Boolean symbolic objects based on histograms, in: A. Rizzi, M. Vichi, H.-H. Bock (Eds.), Advances in Data Science and Classification, Springer-Verlag, Heidelberg, 1998, pp. 391–396.
[10] A. De Luca, S. Termini, A definition of a non-probabilistic entropy in the setting of fuzzy sets theory, Inform. Control 20 (1972) 301–312.
[11] J.R. Quinlan, Induction of decision trees, Mach. Learn. 1 (1986) 81–106.
[12] T. Stewart, Wealth of Knowledge, Doubleday, New York, 2001.
[13] E. Szmidt, J. Baldwin, New similarity measure for intuitionistic fuzzy set theory and mass assignment theory, Notes IFSs 9 (3) (2003) 60–76.
[14] E. Szmidt, J. Baldwin, Entropy for intuitionistic fuzzy set theory and mass assignment theory, Notes IFSs 10 (3) (2004) 15–28.
[15] E. Szmidt, J. Baldwin, Intuitionistic Fuzzy Set Functions, Mass Assignment Theory, Possibility Theory and Histograms, 2006 IEEE World Congress on Computational Intelligence, 2006, pp. 237–243.
[16] E. Szmidt, J. Kacprzyk, Remarks on some applications of intuitionistic fuzzy sets in decision making, Notes IFS 2 (3) (1996) 22–31.
[17] E. Szmidt, J. Kacprzyk, On measuring distances between intuitionistic fuzzy sets, Notes IFS 3 (4) (1997) 1–13.
[18] E. Szmidt, J. Kacprzyk, Group decision making under intuitionistic fuzzy preference relations, in: IPMU'98, 1998, pp. 172–178.
[19] E. Szmidt, J. Kacprzyk, Distances between intuitionistic fuzzy sets, Fuzzy Sets Syst. 114 (3) (2000) 505–518.
[20] E. Szmidt, J. Kacprzyk, On measures on consensus under intuitionistic fuzzy relations, in: IPMU 2000, 2000, pp. 1454–1461.
[21] E. Szmidt, J. Kacprzyk, Entropy for intuitionistic fuzzy sets, Fuzzy Sets Syst. 118 (3) (2001) 467–477.
[22] E. Szmidt, J. Kacprzyk, Analysis of consensus under intuitionistic fuzzy preferences, in: Int. Conf. in Fuzzy Logic and Technology, De Montfort Univ. Leicester, UK, 2001, pp. 79–82.
[23] E. Szmidt, J. Kacprzyk, Analysis of agreement in a group of experts via distances between intuitionistic fuzzy preferences, in: IPMU 2002, 2002, pp. 1859–1865.
[24] E. Szmidt, J. Kacprzyk, An intuitionistic fuzzy set based approach to intelligent data analysis (an application to medical diagnosis), in: A. Abraham, L. Jain, J. Kacprzyk (Eds.), Recent Advances in Intelligent Paradigms and Applications, Springer-Verlag, 2002, pp. 57–70.
[25] E. Szmidt, J. Kacprzyk, Similarity of intuitionistic fuzzy sets and the Jaccard coefficient, in: IPMU 2004, 2004, pp. 1405–1412.
[26] E. Szmidt, J. Kacprzyk, Distances between intuitionistic fuzzy sets: straightforward approaches may not work, in: 3rd International IEEE Conference Intelligent Systems IS06, London, 2006, pp. 716–721.
[27] E. Szmidt, J. Kacprzyk, An Application of Intuitionistic Fuzzy Set Similarity Measures to a Multi-criteria Decision Making Problem. ICAISC 2006, LNAI 4029, Springer-Verlag, 2006, pp. 314–323.
[28] E. Szmidt, J. Kacprzyk, Some problems with entropy measures for the Atanassov intuitionistic fuzzy sets, LNAI, vol. 4578, Springer-Verlag, 2007, pp. 291–297.
[29] E. Szmidt, J. Kacprzyk, A new similarity measure for intuitionistic fuzzy sets: straightforward approaches may not work, in: 2007 IEEE Conf. on Fuzzy Systems, 2007a, pp. 481–486.
[30] E. Szmidt, J. Kacprzyk, A new approach to ranking alternatives expressed via intuitionistic fuzzy sets, in: D. Ruan et al. (Eds.), Computational Intelligence in Decision and Control, World Scientific, 2008, pp. 265–270.
[31] E. Szmidt, J. Kacprzyk, Amount of information and its reliability in the ranking of Atanassov's intuitionistic fuzzy alternatives, in: E. Rakus-Andersson, R. Yager, N. Ichalkaranje, L.C. Jain (Eds.), Recent Advances in Decision Making, SCI 222, Springer-Verlag, 2009, pp. 7–19.
[32] E. Szmidt, J. Kacprzyk, Dealing with typical values via Atanassov's intuitionistic fuzzy sets, Int. J. General Syst. 39 (5) (2010) 489–506.
[33] E. Szmidt, J. Kacprzyk, Intuitionistic fuzzy sets – two and three term representations in the context of a Hausdorff distance, Acta Univ. Matt. Belii Ser. Math. 19 (19) (2011) 53–62. <http://Actamth.savbb.sk>.
[34] E. Szmidt, J. Kacprzyk, P. Bujnowski, Measuring the amount of knowledge for Atanassov's intuitionistic fuzzy sets, LNAI 6857 (2011) 17–24.
[35] E. Szmidt, V. Kreinovich, Symmetry between true, false, and uncertain: an explanation, Notes IFS 15 (4) (2009) 1–8.
[36] E. Szmidt, M. Kukier, Classification of imbalanced and overlapping classes using intuitionistic fuzzy sets, in: 3rd International IEEE Conference Intelligent Systems IS06, London, 2006, pp. 722–727.
[37] E. Szmidt, M. Kukier, A new approach to classification of imbalanced classes via Atanassov's intuitionistic fuzzy sets, in: Hsiao-Fan Wang (Ed.), Intelligent Data Analysis: Developing New Methodologies Through Pattern Discovery and Recovery, Idea Group, 2008, pp. 65–102.
[38] E. Szmidt, M.
Kukier, Atanassov’s intuitionistic fuzzy sets in classiﬁcation of imbalanced and overlapping classes, in: Panagiotis Chountas, Ilias Petrounias, Janusz Kacprzyk (Eds.), Intelligent Techniques and Tools for Novel System Architectures, Springer, Berlin Heidelberg, 2008, pp. 455–471. L.A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338–353. http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks). Please cite this article in press as: E. Szmidt et al., How to measure the amount of knowledge conveyed by Atanassov’s intuitionistic fuzzy sets, Inform. Sci. (2013), http://dx.doi.org/10.1016/j.ins.2012.12.046
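The disagreement between the two attribute orderings quoted at the end of Section 3 (by K and by entropy) can be made concrete with a small sketch. The two lists below are copied from the text. The helper names (`rank_positions`, `discordant_pairs`) and the use of a discordant-pair count as the disagreement measure are illustrative assumptions, not part of the original analysis.

```python
# Attribute orderings quoted in the text (attribute indices, best first).
order_by_knowledge = [11, 12, 10, 49, 21, 36, 48, 9, 45, 44]  # ranked by K
order_by_entropy = [12, 11, 21, 36, 49, 10, 9, 48, 45, 44]    # ranked by entropy


def rank_positions(order):
    """Map each attribute to its 1-based position in the ordering (hypothetical helper)."""
    return {attribute: position for position, attribute in enumerate(order, start=1)}


def discordant_pairs(order_a, order_b):
    """Count attribute pairs that the two orderings rank in opposite order."""
    pos_b = rank_positions(order_b)
    count = 0
    for i in range(len(order_a)):
        for j in range(i + 1, len(order_a)):
            # order_a places order_a[i] ahead of order_a[j]; check whether order_b agrees.
            if pos_b[order_a[i]] > pos_b[order_a[j]]:
                count += 1
    return count


# Attributes that occupy the same rank in both orderings.
same_position = [a for a, b in zip(order_by_knowledge, order_by_entropy) if a == b]
n_discordant = discordant_pairs(order_by_knowledge, order_by_entropy)
n_pairs = len(order_by_knowledge) * (len(order_by_knowledge) - 1) // 2

print("attributes at the same rank:", same_position)
print("discordant pairs:", n_discordant, "of", n_pairs)
```

For these two lists, only attributes 45 and 44 keep the same rank, and 7 of the 45 attribute pairs are ordered differently. The two measures therefore agree on the weakest attributes but not on the best ones.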
