Approximation operators based on preconcepts

Abstract Using the notion of preconcept, we generalize Pawlak’s approximation operators from a one-dimensional space to a two-dimensional space in a formal context. In a formal context, we present two groups of approximation operators in a two-dimensional space: one is aided by an equivalence relation defined on the attribute set, and the other by the lattice-theoretical property of the family of preconcepts. In addition, we analyze the properties of these approximation operators. The results show that, with the assistance of the family of preconcepts, the above groups of approximation operators can approximate all the subsets in a formal context. Some biological examples show that the two groups of approximation operators provided in this article have the potential to assist biologists in the phylogenetic analysis of insects.


Introduction
Formal concept analysis, proposed in [1], is a mathematical framework for conceptual data analysis and knowledge processing. Since its inception, many researchers have improved the construction of formal concepts to expand the scope of application of formal concept analysis. Among them, Stahl and Wille [2] introduced into formal concept analysis a mathematization of the notion of "preconcept," which is used in Piaget's cognitive psychology to explain the developmental stage between the stage of senso-motor intelligence and the stage of operational intelligence; this notion generalizes the definition of formal concepts. For this new notion, it has been proved [3] that the family of preconcepts forms a lattice under its hierarchical order; Vormbrock and Wille [4] demonstrated that the idea of preconcepts enriches the theory of formal concept analysis. Additionally, for a formal context, the family of preconcepts provides more information than the set of formal concepts, since it follows from [1,2] that every formal concept is a preconcept, but not vice versa.
Rough set theory, introduced in [5], accounts for the definability of a concept by means of approximations in an approximation space. As a mathematical tool for data analysis and knowledge discovery, rough set theory rests on its basic notions, namely the lower and upper approximation operators [6]. In the development of rough set theory, approximation operators are typically defined by equivalence relations [7,8]. Researchers have proposed many generalized notions of approximation operators. For instance, new approximation operators were provided in [7] with the help of an equivalence relation, a finite Boolean algebra, a lattice, and a poset; approximation operators were presented in [9] with the assistance of concepts and definable sets, in granule-based and subsystem-based forms; Ganter [10] gave approximation operators using granules on one universe set and also discussed the lattice properties of approximation operators defined by equivalence relations and granules. Other methods can be found in [11-14].
Since formal concept analysis and rough set theory are two related mathematical tools in the areas of knowledge representation and knowledge processing, some authors have introduced the notion of approximation operators into formal concept analysis. For example, in a formal context, lower and upper approximation operators were defined in [15] with the aid of an object quasi-order; Xiao et al. [16] discussed (object) approximation operators defined by formal contexts; and in [17], the approximation considered for a formal context was generalized from an equivalence relation to the object preorder. There are some other results on the application of rough sets to formal concept analysis [14,18,19].
However, all the above approximation operators are defined on one set; in the language of geometry, they are defined in a one-dimensional space. In a formal context, using attribute implications, Ganter and Meschke [20] defined an operator supp, which acts on attribute sets. After that, in a soft-granulated (formal) context, they defined the approximation operators supp and supp [20]. As a matter of fact, the pair of supp and supp is not used for a formal context directly but applies to a soft-granulated context generalized from a formal context, though supp and supp are expressed in a two-dimensional form. That is to say, some information extracted from some related systems needs to be considered on two non-related sets at the same time, that is, on a two-dimensional space in the language of geometry. Every preconcept is expressed in exactly such a two-dimensional form. In fact, some results of rough set theory have been obtained with respect to a two-dimensional space for some of the preconcepts. For example, Mao [21] provided approximation operators in a two-dimensional space for semiconcepts, which are a class of preconcepts. Since a one-dimensional space is a special case of some two-dimensional spaces, more attention should be paid to approximation operators defined in a two-dimensional form, so as to extend the research and application range of rough sets, though the results in this direction are still few compared with the results on approximations in one-dimensional forms.
Additionally, the main purpose of formal concept analysis and rough set theory is to deal with problems that exist in real life. The following example and remark demonstrate this point.

Example 1.
Table 1 provides some biological information, which is a combination of the biological information in Tables 2 and 3 of the study by Liu and Ren [22].

Remark 1.
First, we analyze Example 1 as follows.
The context (O 0 , P 0 , I 0 ) in Example 1 is a formal context, with the relation I 0 as given in Table 1.
For any A ⊆ O 0 , let A′ be the maximal set, with respect to set inclusion, such that every sample in A has the characteristics in A′, i.e., A′ = {y ∈ P 0 | a owns the attribute y, for every a ∈ A}. In the cluster analysis of biology, biologists consider (A, B) for the biological samples A, where B ⊆ A′, since sometimes biologists want to know only a part B of the common characteristics A′ of A. In view of [4], (A, B) is a preconcept since B ⊆ A′. In other words, biologists sometimes pay attention to the set of preconcepts in a formal context. In some cases, biologists also want to obtain information that is not in the family of preconcepts with the aid of preconcepts, though this was not realized in [22] and some other related studies.
Second, a brief summary.
There exist some methods to search out preconcepts in a formal context, such as those in [2-4,23]. This study defines approximation operators on two non-related sets, i.e., on the two-dimensional space O × P since O ∩ P = ∅, with preconcepts for a formal context (O, P, I), using two methods to approximate the information not in the family of preconcepts. That is to say, in this article, we apply rough set theory to the study of formal concept analysis.
The rest of this article is arranged as follows. We will review some notions and properties in Section 2. Section 3 provides two groups of approximation operators based on preconcepts. Additionally, some properties of the two groups of approximation operators are discussed. In Section 4, we will analyze all the approximation operators presented in this article. We conclude this article and leave room for our future research studies in Section 5.

Some notions and lemmas
In this section, we recall briefly some notions and lemmas, which will be used in this article. For more details on formal concept analysis, see Ganter and Wille [24], and for rough set, see Pawlak [6].
Definition 1. Let 𝕂 = (O, P, I) be a formal context, for which O and P are sets with O ∩ P = ∅, while I is a binary relation between O and P, i.e., I ⊆ O × P; the elements of O and P are called objects and attributes, respectively; gIm stands for (g, m) ∈ I. The derivation operators of 𝕂 are defined as follows for X ⊆ O and Y ⊆ P: X′ = {m ∈ P | gIm for all g ∈ X} and Y′ = {g ∈ O | gIm for all m ∈ Y}.
In real life, |O| < ∞ and |P| < ∞ hold for an information system 𝕂. Hence, this article only considers formal contexts (O, P, I) satisfying |O| < ∞ and |P| < ∞.
In this article, if Z ⊆ O (or Z ⊆ P) satisfies |Z| = 1 such as Z = {z}, then {z}′ is simply denoted as z′ in what follows.
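The two derivation operators can be sketched in a few lines of Python. This is only an illustrative sketch: the toy context below (objects g1 to g3, attributes m1 to m3) is hypothetical and is not the data of Table 1, and the function names are our own.

```python
# Toy formal context (hypothetical data, not Table 1 of the article).
O = {"g1", "g2", "g3"}
P = {"m1", "m2", "m3"}
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m2"), ("g3", "m2"), ("g3", "m3")}


def derive_objects(X, P, I):
    """X' : the attributes shared by every object in X."""
    return {m for m in P if all((g, m) in I for g in X)}


def derive_attributes(Y, O, I):
    """Y' : the objects that have every attribute in Y."""
    return {g for g in O if all((g, m) in I for m in Y)}


print(sorted(derive_objects({"g1", "g2"}, P, I)))  # ['m2']
print(sorted(derive_attributes({"m2"}, O, I)))     # ['g1', 'g2', 'g3']
print(sorted(derive_objects(set(), P, I)))         # ['m1', 'm2', 'm3']
```

The last call shows the usual convention that the empty object set derives to the whole attribute set, because the condition "for all g ∈ X" is vacuously true.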

Lemma 1. [24]
In a formal context 𝕂 = (O, P, I), the two derivation operators in Definition 1 satisfy the following conditions for any , where i = 1, 2 and j ∈ J.
3 . This is same as that by Definition 1 in Example 2.
Let ( ) be the family of all preconcepts in .
Some notations: let G, M be two sets, and .
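As a small illustration of the family of preconcepts, the following Python sketch enumerates all preconcepts of a tiny context by brute force, using the condition B ⊆ A′ stated in Remark 1. The context and all function names are hypothetical choices made for this sketch, and the exhaustive search is only practical for very small O and P.

```python
from itertools import combinations

# Toy formal context (hypothetical data, not Table 1 of the article).
O = ["g1", "g2"]
P = ["m1", "m2"]
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m2")}


def powerset(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def common_attributes(A):
    """A' : the attributes shared by every object in A."""
    return frozenset(m for m in P if all((g, m) in I for g in A))


def is_preconcept(A, B):
    """(A, B) is a preconcept when B is contained in A'."""
    return frozenset(B) <= common_attributes(A)


preconcepts = [(A, B) for A in powerset(O) for B in powerset(P)
               if is_preconcept(A, B)]

print(len(preconcepts))                         # 12
print(is_preconcept({"g1", "g2"}, {"m2"}))      # True
print(is_preconcept({"g1", "g2"}, {"m1"}))      # False
```

Of the 16 pairs (A, B) in this toy context, 12 are preconcepts; the remaining four are exactly the pairs that the approximation operators of Section 3 are meant to approximate.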

Approximation operators
The notion of preconcept is a new idea for extracting information from a formal context 𝕂 = (O, P, I). Although some methods are known for searching out every preconcept of 𝕂, no method has been found so far for extracting information from subsets outside the family of preconcepts with the help of the family of preconcepts. When biologists carry out phylogenetic analyses of insects, they need to hypothesize the evolutionary processes of the insects on some theoretical basis, since the evolution of insects is a historical process and cannot be reproduced. From Example 1 and Remark 1, we find that some biological information can be expressed in the form of a formal context. Hence, biologists may sometimes obtain the results they need with the aid of the family of preconcepts and, as a result, may find support for their hypotheses.
Based on the above analysis from theory and practice, we consider that the most important task is to find the properties of the family of preconcepts. Hence, this section gives two groups of approximation operators for a given formal context in the two-dimensional space O × P, so as to approximate those subsets that are not preconcepts by the elements of the family of preconcepts.
The following theorem will be used in the sequel.
holds for any B ⊆ P.
Proof.
by Definition 2. Example 5 shows that using the results in Theorem 1 is easier than that of Definition 1 to obtain

Set equivalence relation approximation operators
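Similar to the equivalence relation for an information table defined in [5,6,8], we give an equivalence relation for a formal context.
Lemma 3. Let 𝕂 = (O, P, I) be a formal context. Define a binary relation R on P by y 1 Ry 2 ⇔ y 1 ′ = y 2 ′ for any y 1 , y 2 ∈ P. Then, R is an equivalence relation on P.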
Proof. Routine verification from the definition of equivalence relation. □
Similar to the relation R in Lemma 3, we can also define an equivalence relation S on O by xSy ⇔ x′ = y′ for any x, y ∈ O in a formal context (O, P, I).
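The relations R and S group elements whose derivations coincide, so their equivalence classes can be computed by using the derivation as a dictionary key. The following Python sketch illustrates this on a hypothetical toy context (not Table 1); the helper names are our own.

```python
from collections import defaultdict

# Toy formal context (hypothetical data, not Table 1 of the article).
O = ["g1", "g2", "g3", "g4"]
P = ["m1", "m2", "m3"]
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m1"), ("g2", "m2"),
     ("g3", "m3"), ("g4", "m1"), ("g4", "m2")}


def attribute_extent(m):
    """m' : the objects that have attribute m."""
    return frozenset(g for g in O if (g, m) in I)


def object_intent(g):
    """g' : the attributes owned by object g."""
    return frozenset(m for m in P if (g, m) in I)


def classes_by_derivation(elements, derive):
    """Partition elements so that two elements share a class
    exactly when their derivations are equal."""
    groups = defaultdict(list)
    for e in elements:
        groups[derive(e)].append(e)
    return sorted(sorted(c) for c in groups.values())


print(classes_by_derivation(P, attribute_extent))  # [['m1', 'm2'], ['m3']]
print(classes_by_derivation(O, object_intent))     # [['g1', 'g2', 'g4'], ['g3']]
```

The first line of output is the partition of P induced by R and the second is the partition of O induced by S for this toy context.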
The following example indicates how to set up an equivalence relation in some biological discussion.
Example 6. Let (O 0 , P 0 , I 0 ) be the formal context defined in Example 1. Let S be the binary relation defined by xSy ⇔ x′ = y′ for any x, y ∈ O 0 . Then, S is easily seen to be an equivalence relation on O 0 .
Let S be defined as in Example 6. Let x, y ∈ O 0 and xSy. Then, in the study of Liu and Ren [22], x and y were found to be in the same cluster, as shown in Figure 1 in [22], using cluster analysis with SPSS19.0. In fact, according to the basic principles of cladistic systematics, the elements in [z] share symplesiomorphies for any z ∈ O 0 . This agrees with the result at the first layer, read from left to right, of Figure 1 in [22]. Hence, the definition of S is also meaningful in biology.
Example 6 shows the importance of equivalence relation on the set of objects or the set of attributes in a formal context for extracting information and biological research.
Using Figure 1 in [22], we can illustrate our idea of the discussion of preconcepts more clearly as follows. Figure 1 is similar to Figure 1 in [22], except that we have labeled every vertex other than a and b for ease of reference. The shape of Figure 1 is only roughly the same as that of Figure 1 in [22], which is drawn more finely. Nevertheless, the whole tree of Figure 1 in [22] is the same as that of Figure 1.
We can explain Figure 1 as follows.
(1) Combining the above (3) and Definition 1 with Table 1, Lemma 1, and Example 6, we can determine the pair attached to each vertex of Figure 1.
(2) {a j , j = 1,…,16} is the set of the 16 specimens in Table 1, which is also used in [22]. Figure 1 in [22] is a dendrogram of 16 genera of Blaptini based on the 9 characteristics of defensive glands. The equivalence relation S on O 0 = {a j , j = 1,…,16} in Example 6 divides the 16 objects exactly as at the first layer in Figure 1, which is the same as that in Figure 1 in [22]. This also shows that the relation S in Example 6 divides the 16 specimens correctly.
(3) By (1), every vertex in the eight layers of Figure 1 is a preconcept.
Additionally, using Lemma 2, we know l 11 ∨ l 12 = l 4 and l 13 ∨ l 14 = l 8 . Since the whole tree of Figure 1 in [22] is the same as that of Figure 1, the above expressions reflect the idea of the theory of preconcepts. Actually, comparing Figure 1 with Figure 1 in [22], we find that the above expressions also reflect the idea of biology.
In addition, Figure 1 in [22] does not directly attach a symbol to every point, but every point is obtained to show that the corresponding specimens A share the same characteristics B. In fact, the points shown at the different layers in Figure 1 in [22] demonstrate that the pair (A, B) is a preconcept according to Lemma 2. This view is shown in Figure 1 and the above analysis.
(4) In Figure 1 in [22], we observe that the 16 specimens are divided into two parts: part a and part b. According to the discussion of the two parts in [22], we think that this division was obtained from biological knowledge. Hence, Figure 1 uses this division directly. However, in part a of Figure 1, there are l 1j (j = 1, 2, 3, 4, 5). Considering Lemma 2, we can obtain further expressions in these vertices. The result of any of these expressions is a point that should appear in the construction of preconcepts obtained by combining Table 1 with the division into two parts, since these points are preconcepts by Lemma 2. So that readers can easily compare the idea of Figure 1 in [22] with ours, these points do not appear in Figure 1. In fact, they are seen neither in Figure 1 in [22] nor in any of the explanations of Figure 1 in [22]. Perhaps SPSS19.0 considered them not to be "important." However, one of these preconcepts may sometimes be of value to researchers who extract information from Table 1. In this case, the idea of preconcepts may help the researchers.
(5) The above analysis also demonstrates that the discussion of preconcepts, which is a part of formal concept analysis, is necessary. If researchers want to extract information from a family of biological data to obtain results such as a dendrogram, then preconcept theory may help them. To assist in extracting the needed information from a family of data, the extraction methods need to be continuously improved. Hence, first of all, the theory of preconcepts should be constantly enriched.
Actually, the goal of this study is to enrich the theory of preconcepts, so as to serve more applied fields. In other words, the discussion in this study is necessary.
No matter how rich the theory of preconcepts is, it is important for biologists to perform their research according to the fundamental knowledge of biology. Other tools and theories, such as SPSS and preconcepts, only assist them in their work. Certainly, we hope that preconcept theory proves to be a useful one.
Using the equivalence relation R in Lemma 3, we can define some operators as follows.
do not have any significance using Definition 3.
Using Theorem 1 (1) and Definition 3, we obtain ( , , . However, in biology, A = ∅ means that there are no samples to be considered. This case does not have any significance for biologists. Additionally, the equivalence relation R is defined on the set P of attributes. If only the set P is considered, then R̲ and R̄ reduce to the standard model defined in [5,6,8]. On this basis, the pair of operators R̲ and R̄ in Definition 3 can be regarded as a generalization of the approximation operators in [5,6,8].
, respectively. Then, the following formula is correct.
Proof. We divide the proof into two steps.
Step 1. To prove: Proof. It is straightforward from Lemma 4. □ , holds, then (A, B) is a preconcept.
That is to say, we find ( To prove item (2 , .

Remark 3.
(1) It is easy to see that, in a formal context 𝕂 = (O, P, I), , holds for any preconcept (A, B) according to Definitions 1 and 3. Hence, Theorem 2 and Corollary 2 are two criteria that characterize the family of preconcepts.
(2) Summarizing the nature of the approximation operators in [5,6,8], the author of [21] provides three conditions (p1), (p2), and (p3) that two operators should satisfy to be lower and upper approximation operators in an information system. Considering the two operators R̲ and R̄ in Definition 3 for a formal context 𝕂 = (O, P, I) and the above discussion of R̲ and R̄, we find the following results (2.1)-(2.3). (2.1) 𝕂 is an information system and the family of preconcepts is the set of fundamental sets. That is, the condition (p1) in [21] holds for R̲ and R̄.
, . It is easy to obtain ( ) ⊆ ( ) in some formal contexts, such as the formal context in Example 1, according to Example 7. Thus, we cannot confirm that ( , holds. That is to say, the condition (p2) does not hold for R̲ and R̄.
(3) Combining the above item (2) and Remark 2(4), we can roughly say that the two operators R̲ and R̄ in Definition 3 form a group of approximation operators in 𝕂.
Applying the above results of this section to the formal context in Example 1, we obtain the following Example 9 to illustrate Theorem 2.
and Theorem 1 with Definition 2, we obtain . Hence, by Definition 3 and Theorem 1, .
In fact, according to Table 1, A consists of Blaps and Thaumatoblaps, and B consists of two characteristics: "glands ovoid" and "the wall of glands thick." That (A, B) is not a preconcept of the formal context in Example 1 implies that "glands ovoid" and "the wall of glands thick" are not both common characteristics of Blaps and Thaumatoblaps. The characteristic set B is not a set of symplesiomorphies of Blaps and Thaumatoblaps. The result obtained above expresses that Blaps and Thaumatoblaps have a common characteristic, "glands ovoid." At least, this characteristic is one of their symplesiomorphies.
, , we also see the advantage of Definition 3 in approximating the subsets that are not preconcepts.
(2) Using the operators R̲ and R̄ in Definition 3, we cannot approximate the subsets, such that ( , though we give a way to judge ( is significant by Remark 2(5). It follows that R̲(C, D) is significant. Hence, in the case of U(C, D) = ∅, we can use the lower approximation operator R̲ to approximate (C, D).
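The discussion above notes that, when only the attribute set P is considered, the operators reduce to the standard model of [5,6,8]. The following Python sketch illustrates only that one-dimensional standard model (lower and upper approximation of an attribute subset with respect to a partition of P); it is not an implementation of Definition 3, and the partition and subset used are hypothetical.

```python
def pawlak_approximations(classes, B):
    """Standard lower and upper approximations of B with respect to
    a partition (list of equivalence classes) of the universe."""
    B = set(B)
    # Lower approximation: union of the classes entirely inside B.
    lower = set().union(*[set(c) for c in classes if set(c) <= B])
    # Upper approximation: union of the classes that meet B.
    upper = set().union(*[set(c) for c in classes if set(c) & B])
    return lower, upper


# Hypothetical partition of an attribute set P = {m1, m2, m3, m4}.
classes = [{"m1", "m2"}, {"m3"}, {"m4"}]
lower, upper = pawlak_approximations(classes, {"m1", "m3"})

print(sorted(lower))  # ['m3']
print(sorted(upper))  # ['m1', 'm2', 'm3']
```

A subset is exactly definable in this model when its lower and upper approximations coincide, which here happens for unions of classes such as {m1, m2, m3}.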

Approximation operators from lattices
Applying the equivalence relation on the set P of attributes in Lemma 3 for a formal context 𝕂 = (O, P, I), we provided, similarly to the approximation operators defined in [5,6,8], lower and upper approximation operators in Section 3.1. However, ( is not correct for some ( , . This is a weakness of the two operators with respect to the three conditions (p1), (p2), and (p3) in [21].
Hence, we hope to find a way, not relying on classical equivalence relations, to characterize all the preconcepts and to approximate any members that are not preconcepts. This section realizes this hope.
Burgmann and Wille [3] proved that the family of preconcepts is a lattice (see Lemma 2). Approximation operators in lattices were given in [7], though the approximation operators in [7] are only considered for Boolean algebras. However, we can draw some ideas from [7] to give the following definition.
We may easily know (  H(A, B). These results can be found in the following definition, which is somewhat similar to that in [7] in its style of writing, though Definition 5 is different from that in [7].
Furthermore, according to the property of ( ), we receive y D e f i n i t i o n 5 , by Lemma 2 , .
We also receive . Moreover, ( 2 is correct according to the definition of ≤ 2 in Lemma 2. To prove item (2).
, . Then, (A, B) ∈ L(A, B) holds since A ⊆ A and B ⊇ B. Combining X ⊆ A and Y ⊇ B for any (X, Y) ∈ L(A, B), we obtain (X, Y) ≤ 2 (A, B). Hence, according to the lattice property of the family of preconcepts, we obtain ( , . Using the lattice property of the family of preconcepts, we know ( , . To prove item (3).
, . This means ( . Considering the lattice property of , . Combining Theorem 3 (2) and , , then combining the items (2) and (3) in Theorem 3, we obtain ( The following example expresses how to explain the definitions provided in this section and how to utilize Theorem 3. Step 1. To search L(A, B). a a  a a  2 , , , ,

Remark 5.
First, to explain the results in Example 10.
(1) It is easy to see from the process in Example 10 that Example 10 also illustrates Definitions 4 and 5 and, further, Corollary 3. holds for i̲ and h̄. 3 and D ⊇ F. It is easy to see that ≤ 3 is a generalization of ≤ 2 from the family of preconcepts to 2 O×P . To avoid confusion, ≤ 3 is still denoted by ≤ 2 .
, . The analysis for Definition 4 below Definition 4 and beyond Definition 5 shows 2 . Hence, (p2) in [21] holds for i̲ and h̄.
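The lattice ingredients used in this section can be sketched in Python under stated assumptions. The order ≤ 2 and the set L(A, B) follow the description in the proof above ((X, Y) ≤ 2 (A, B) when X ⊆ A and Y ⊇ B); the join formula (union of object sets, intersection of attribute sets) is the standard supremum of the preconcept lattice. The toy context is hypothetical, and taking the join of L(A, B) is shown only as an illustration of approximating from below; it is not claimed to reproduce Definition 5.

```python
from itertools import combinations

# Toy formal context (hypothetical data, not Table 1 of the article).
O = ["g1", "g2"]
P = ["m1", "m2"]
I = {("g1", "m1"), ("g1", "m2"), ("g2", "m2")}


def powerset(items):
    """All subsets of items, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_preconcept(A, B):
    """(A, B) is a preconcept when every object of A has every attribute of B."""
    return all((g, m) in I for g in A for m in B)


PRECONCEPTS = [(A, B) for A in powerset(O) for B in powerset(P)
               if is_preconcept(A, B)]


def leq2(x, y):
    """(X, Y) <=_2 (A, B) when X is a subset of A and Y is a superset of B."""
    (X, Y), (A, B) = x, y
    return X <= A and Y >= B


def lower_set(A, B):
    """L(A, B): the preconcepts lying below (A, B) in the order <=_2."""
    return [p for p in PRECONCEPTS if leq2(p, (frozenset(A), frozenset(B)))]


def join(pairs):
    """Supremum in the preconcept lattice: union of object sets,
    intersection of attribute sets (the empty join is (empty set, P))."""
    pairs = list(pairs)
    objs = frozenset().union(*(X for X, _ in pairs))
    attrs = frozenset(P).intersection(*(Y for _, Y in pairs))
    return objs, attrs


A, B = {"g1", "g2"}, {"m1"}
print(is_preconcept(A, B))                  # False: g2 lacks m1
below = join(lower_set(A, B))
print(sorted(below[0]), sorted(below[1]))   # ['g1'] ['m1']
```

In this toy case the pair ({g1, g2}, {m1}) is not a preconcept, and the join of the preconcepts below it is the preconcept ({g1}, {m1}).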

Analysis
This section will analyze the two groups of approximation operators given in Section 3.
Before our analysis, we give an example.
First, we analyze Example 11. Combining the ideas in Sections 3.1 and 3.2, we provide Example 11. However, Example 11 indicates that there is no advantage in directly using expressions similar to the lower and upper approximation operators in [5,6,8] to define approximation operators with the assistance of lattice theory, since (∅, ∅) and (O, ∅) are easily seen to be preconcepts by Theorem 1. This implies that the ways in Sections 3.1 and 3.2 of setting up approximation operators are better than their combination as in Example 11.
Second, we give some analysis for the two pairs of operators in Section 3.
(1) It is clear from Section 3.1 that the approximation operators R̲ and R̄ have been analyzed from many different aspects, such as in Remarks 2, 3, and 4.
(2) In Section 3.2, using lattice theory, we present two approximation operators, which can approximate every subset in (O, P). Though these approximation operators are not in the raw equivalence-relation form of [5,6,8], they still continue the thoughts in [5,6,8]. Additionally, the approach in Section 3.2 can be used for every preconcept lattice, whether or not it is Boolean. Thus, the definition of approximation operators in Section 3.2 is an extension of that in [7], which is a variation of that in [5,6,8] for Boolean algebras. Therefore, the approximation operators defined in Section 3.2 generalize those in [5,6,8] from a one-dimensional space to a two-dimensional space for information systems.
Third, a comparison between "R̲, R̄" and "i̲, h̄."
(1) Both the pair R̲ and R̄ and the pair i̲ and h̄ are generalizations of Pawlak's standard model in [5,6,8] from a one-dimensional space to a two-dimensional space, and both satisfy the conditions (p1) and (p3) in [21].
(2) R̲ and R̄ are easy to define by an equivalence relation, as in the model in [5,6,8], though the pair R̲ and R̄ does not satisfy the condition (p2) in [21]. The pair i̲ and h̄ satisfies the condition (p2) in [21] with the partial order ≤ 2 defined in Remark 5, but it is not easy to set up, since this model requires the fundamental sets of the given information system to form a lattice with the partial order ≤ 2 .
(3) According to the problem under consideration and the reader's mathematical background, the reader can select, from the pair R̲ and R̄ and the pair i̲ and h̄ provided in Section 3, the one that is thought to solve the problem more easily.
Fourth, a summary. Taken together, the above analysis shows that for a formal context 𝕂 = (O, P, I), we can not only generalize the thoughts in [5,6,8] from a one-dimensional space to a two-dimensional space but also provide different methods to approximate the subsets in 2 O×P . The examples based on the characteristic data of the defensive glands of 16 genera of Blaptini indicate that the results of this study can assist biologists in their research on the phylogenetic analysis of insects.

Conclusion
In the two-dimensional space O × P, we give two groups of approximation operators based on the family of preconcepts for a formal context 𝕂 = (O, P, I). In fact, we hope to find many more approximation operators on spaces of more than one dimension for different research requirements. Our future work is: (1) to provide more approximation operators for 𝕂; (2) to use approximation operators defined for 𝕂 to find useful algorithms for extracting information, such as in the classification and analysis of biological characteristics of insects; (3) to find many more ways of combining rough sets and formal concept analysis in applications that serve the study of real life; (4) to generalize the approximation operators in [5,6,8] from a one-dimensional space to an n-dimensional space for an information system (X 1 , X 2 ,…,X n , I), where 2 ≤ n and I ⊆ X 1 × X 2 × … × X n .