Linked Data (detailed information in  or ) is a current trend in information technologies, including geoinformatics and geomatics (several examples of implementations of Linked Data in the geographical and spatial data domain were published in [3–5]). The Linked Data approach enables us to publish various types of geographical and spatial data in a highly interoperable way. The best description of Linked Data is provided by the 5-star rating scheme of Linked Open Data . This ranking describes Linked Data as data sets under an open license (*), available in a machine-readable (**), non-proprietary format (***), ideally in the RDF (Resource Description Framework) standard (****). The last level (*****) is essential. It can be expressed by the sentence “Link your data to other people's data to provide context”. This statement is a key prerequisite for the development of the Internet of Things, which is based on the vision that the combination of “captured data with data retrieved from other sources, e.g., with data that is contained in the Web, gives rise to new synergistic services that go beyond the services that can be provided by an isolated embedded system”.
The Linked Data approach has many benefits, but the key one is “the provision of integrated access to data from a wide range of distributed and heterogeneous data sources”, which is strongly related to links between data sets or objects. Several authors (for example  or ), however, point out shortcomings of the Linked Data approach. For example, the article  mentions that the links between data sets are “too shallow to realize much of the benefits promised”.
The main principle of the Linked Data approach is the formation of links between particular data. These links can be based on various relations (such as topological connections or part-whole relationships), but the relation expressing equivalence is among the most frequently used links (as is evident from the Vocabulary of Interlinked Datasets of important Linked Data sources). The goal of this article is to check whether concepts representing the same geographical phenomenon in various Linked Data resources and connected by links are more similar than concepts expressing the same phenomenon but standing alone; in other words, whether the Linked Data approach represents a trustworthy and reliable system that always interconnects the relevant concepts. Beyond the Linked Data domain, the results and methodology of this study can contribute to other activities connected to data and its understanding, such as ontology alignment, data harmonization or general data interoperability. The geographical concept “forest” is used as the domain for testing the above-mentioned assumption. The semantics of this very common concept was studied in previous works of Bennett , Helms  or Comber . A definition and other information on the geographical concept “forest” is essential in dealing with tasks related to deforestation, landscape changes, protection of species, floods, production of oxygen etc.
The article is structured as follows. The section Related works and terminology focuses on studies of geographical concepts, thesauri (as one of the semantic tools most frequently used by common users as well as a tool applying the Linked Data approach), the SKOS standard (Simple Knowledge Organization System), which is used in thesauri, and the Linked Data approach. Then important publications dealing with geographical concepts, their specification, similarity and quality of links are introduced. The next section, Methods, introduces the principles of comparing “forest” concepts in the thesauri selected for this research. The section Results shows the computed similarity of the compared concepts. The results, proposals for handling geographical concepts in thesauri and the continuation of the research are discussed and summarized in the last section, Discussion.
2 Related works and terminology
This section is an overview of key theoretical terms used in this article (Essential terms) and an introduction of important background and related studies (Related studies) concentrated on several quality aspects of Linked Data (primarily links interconnecting equivalent or very similar concepts) and on the investigation of similarity of concepts.
2.1 Essential terms
The following paragraphs focus on four crucial terms of this article: geographical concepts; thesauri, as one of the most frequently employed semantic tools for common users who are not usually experts in thesauri domain and therefore consider information provided by thesauri as reliable; the SKOS standard, used to keep the structure in thesauri; and the Linked Data approach.
The current world of geomatics, geoinformatics and other disciplines related to spatial data and information is connected to various tools and services dealing with the semantics of data (for example thesauri, ontologies or controlled vocabularies). Semantic information has to enable better understanding, sharing, integration and combination of any geographical and spatial data. It also improves communication related to geographical phenomena and reduces misinterpretation of data and information. Tools such as thesauri or ontologies provide a set of concepts, including geographical concepts.
The term “geographical concepts” is mentioned in many publications focused on conceptual modelling of geographical information or geo-ontologies. In the majority of publications (e.g. [13–15]), the definitions of geographical concepts are quite vague. Geographical concepts are concepts with a relation to a location in geographical space. The studies discuss above all the scope of geographical concepts. The scope can be very narrow, similar to gazetteers, including geographical objects such as cities or mountains, or it can be very broad, covering not only all locatable objects, but also concepts connected to geography and related fields (for example “volcano” or “ocean”).
The term “geographical concept” proceeds from the general word “concept” in the context of ontological and conceptual modelling. Both terms, “concept” and “geographical concept”, are described and analysed in detail in . The term “concept” (or “conceptualization”) appeared in Gruber's fundamental definition of ontology in the sense of information sciences . The term “concept” is described in publications [17–19]: “a concept can be anything about which something is said, and, therefore, could also be the description of a task, function, action, strategy, reasoning process, etc.”. A connection between concepts and semantics is mentioned in : “a concept may be anything: an animal, a technique, and so on. Operationally, a concept is the set of all terms used in all languages to describe the same idea.” The authors are aware of many open questions and discussions on the correct definition and specification of the term “concept” (this is evident from a comparison of the above-mentioned articles and papers), but the scope of this document does not allow a broader presentation of these issues, which deserve dedicated research.
A thesaurus, as one of the most important semantic tools dealing with concepts, is defined as “a list of technical terms with relations among them, enabling generic retrieval of documents having different but related keywords”. Holanda in  points out that “thesaurus is one, out of many, possible representation of term (or word) connectivity”. Another definition  specifies the types of relations in thesauri: “A thesaurus is mainly a controlled vocabulary - a domain-specific vocabulary, made up of terms not words that are linked to one another by cross-referencing”. Other definitions and the evolution of the thesaurus concept are described in . “The structure of a thesaurus is generally defined a priori. A controlled set of words or expressions (terms) is organised in a known order and structure. The relationships between the terms (e.g. equivalence, homographic, hierarchical and associative) are displayed clearly and identified by standardised relationship indicators (e.g. BT broader term, NT narrower term and RT related term), which are employed reciprocally.” [23]. It is necessary to mention that modern thesauri, such as the examples used in this research (see the section Methods), constitute an integral component of the Linked Data cloud (see linkeddata.org), because the contained terms and concepts are usually linked to other thesauri and semantic tools.
Simple Knowledge Organization System (SKOS) is a standard provided by the World Wide Web Consortium (W3C) to support the Semantic Web and knowledge organization systems, including thesauri. SKOS is an application of the RDF standard and is commonly serialized in XML (Extensible Markup Language). SKOS “consists of a set of RDF properties and RDFS (RDF Schema) classes that can be used to express the content and structure of a concept scheme as an RDF graph.”. According to  SKOS is based on “conceptual resources (concepts) which can be identified with URIs (Uniform Resource Identifier), labeled with lexical strings in one or more natural languages, documented with various types of notes, semantically related to each other in informal hierarchies and association networks and aggregated into concept schemes”. The key design principles of SKOS, including history, rationale, particular components, mapping, relations and formal semantics, are explained in .
The best description of the Linked Data approach, including concepts and several RDF-based standards such as SKOS, is provided by the 5-star rating scheme of Linked Open Data . As mentioned in the section Introduction, the necessary properties of such data can be summarized as machine-readable data under an open license, stored in the RDF format. The most important property is a connection by links to external data. These links should interconnect concepts on the basis of relations that are defined in various standards, for example SKOS or the Web Ontology Language (OWL). Tim Berners-Lee  states that Linked Data rely on two main standards, URI and RDF. URI provides the mechanism of unique identifiers for each element. These identifiers are used to create links between data. The RDF standard deals with the triple data structure (subject - predicate - object), which enables us to describe all data and information in a universal way.
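The triple structure and the role of URIs described above can be illustrated with a minimal Python sketch. The `example.org` URIs and the helper `external_links` are hypothetical and serve only as an illustration; only the SKOS namespace is taken from the W3C standard.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One RDF statement: subject - predicate - object."""
    subject: str
    predicate: str
    obj: str


# Namespace of the SKOS vocabulary (W3C).
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical concept URIs, used only for illustration.
triples = [
    Triple("http://example.org/thesaurusA/forest", SKOS + "prefLabel", "forest"),
    Triple("http://example.org/thesaurusA/forest", SKOS + "broader",
           "http://example.org/thesaurusA/vegetation"),
    Triple("http://example.org/thesaurusA/forest", SKOS + "exactMatch",
           "http://example.org/thesaurusB/forests"),
]


def external_links(data: List[Triple], predicate: str) -> List[Tuple[str, str]]:
    """Return (subject, object) pairs of all triples using the given predicate."""
    return [(t.subject, t.obj) for t in data if t.predicate == predicate]


links = external_links(triples, SKOS + "exactMatch")
```

In this sketch the third triple is the kind of link to external data that the fifth star of the rating scheme requires.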
2.2 Related Studies
Research focused on geographical concepts, their interconnection and similarity is very broad. As mentioned, this paper deals with the concept “forest”. The essential publication “What is a forest? On the vagueness of certain geographic concepts”  was already mentioned in the Introduction. That article focuses on semantic research and evaluation of geographical concepts representing the “forest” phenomenon. The ideas introduced in it are expanded in further articles and papers dealing with geographical concepts [12, 30], building geo-ontologies [31, 32] or implementing fuzzy logic approaches into the conceptualization process .
Since the beginning of the Linked Data approach and the Semantic Web, several studies have evaluated the quality of information provided by relations between concepts. The authors of the article Towards Linkset Quality for Complementing SKOS Thesauri  test relations in thesauri. Other studies [35–39] concentrate on the relation owl:sameAs as the key expression of equivalence between concepts in the OWL language. Ding in [36] proposes “a general strategy for integrating and fusing information from the URIs in an owl:sameAs network” based on various types of description.
In order to assess the quality of links between concepts on the basis of the provided explicit information, it is necessary to investigate the similarity of the compared concepts. There are various approaches to investigating similarity (e.g. [40–42]). For example,  presents three approaches: feature-based models, semantic-network-based models (semantic distances) and information-content-based models. The article  dealing with similarity of concepts in WordNet also presents three types of semantic similarity computation (edge-based methods, information-based statistics methods and hybrid methods) as well as many references. The feature-based model is closely connected to Tversky's studies (e.g. [43–45]). Tversky-based methods are also mentioned in other publications, for example [14, 46, 47].
Another approach to similarity measurement is based on Formal Concept Analysis (FCA) [46, 48–50]. This method is commonly used for comparing concepts within one ontological system. Using it therefore requires merging the concepts into one ontology.
The SKOS format uses the relation skos:exactMatch to express that concepts are equivalent or very similar. It is defined as follows: “skos:exactMatch indicates a high degree of confidence that two concepts can be used interchangeably across a wide range of information retrieval applications”. This description is vague, because it does not state what the “high degree of confidence” means and how it can be investigated, computed or compared. Therefore, the following methods and their implementation evaluate the statement that geographical concepts related to the “forest” phenomenon and interconnected by the skos:exactMatch relation are more similar than self-standing concepts.
3 Methods
This section is divided into three parts describing particular phases of the research:
Selection of tested thesauri
Extraction of information from thesauri
Computation of similarity of the “forest” concepts in selected thesauri
The structure of this chapter is also depicted in detail in Figure 1.
3.1 Selection of tested thesauri
The tested thesauri were selected on the basis of several conditions intended to eliminate unsuitable products. Because of the following criteria, thesauri such as TheSoz (Thesaurus Sozialwissenschaften), the Deutsche Nationalbibliothek thesaurus or RAMEAU (Répertoire d'autorité-matière encyclopédique et alphabétique unifié) of the Bibliothèque nationale de France were not included in the research. This also applies to general concept resources such as DBpedia, Wikidata or WordNet. The selection criteria include:
Respected and well-known tools developed for a long time.
Containing a large number of concepts.
Tools generally focused on scientific disciplines related to geography.
The thesauri contain the “forest” concept (and its description and relations) in English (to eliminate the risk of wrong translation).
They are maintained by a respected organization, company, consortium or in case of community administration they have many real users.
The tools providing information under an open or free license were preferred.
The research was realized with the use of the following selected thesauri (in alphabetical order):
AGROVOC (the acronym AV is used in the following text and tables),
EuroVoc (EV),
General Multilingual Environmental Thesaurus (GE),
Linked Thesaurus fRamework for Environment / Environmental Applications Reference Thesaurus (LE),
The National Agricultural Library's Agricultural Thesaurus (NA),
OECD (Organisation for Economic Cooperation and Development) Macrothesaurus (OE),
STW (Standard Thesaurus Wirtschaft) Thesaurus for Economics (ST).
All selected thesauri contain a concept related to the “forest” phenomenon. These concepts are labelled as 'forest' (GE and LE), 'Forest' (ST) and 'forests' (AV, EV, NA and OE). All these forms of the noun were taken as equivalent.
Extraction of information
The next step consists of collecting all explicit information provided by the thesauri. Thesauri usually provide four kinds of information on (not only geographical) concepts: explicit descriptions, annotations or definitions of concepts; information following from the implemented hierarchy; other relations; and links to external resources. There are three main semantic relations in the SKOS standard ([24, 25]) and in thesauri. Two of them relate to the hierarchical system of concepts: skos:broader (BT) and skos:narrower (NT) define the hierarchy between two concepts. The property skos:related (RT) is used to assert an associative link between two SKOS concepts. The definitions, descriptions and particular subjects of the above-mentioned relations are extracted to a word list (according to [12, 51, 52]). This step limits the impact of human interpretation of the information. After the extraction of concepts from each thesaurus, several changes, for example transformation to the singular or conversion to lower case, were made to get a uniform set of terms. The word lists were written down as an XML file that was processed with XSLT (Extensible Stylesheet Language - Transformation) to compute the similarity of concepts (see the following chapter).
Further information extracted from the source data consists of the skos:exactMatch relations interconnecting the “forest” concepts in the various thesauri. The schema in Figure 2 shows how the selected thesauri are interconnected in the case of the studied concept.
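The normalization of extracted terms into a uniform word list can be sketched in Python as follows. This is only an illustrative sketch: the original processing was done in XSLT, and the function names and the deliberately naive singularization rule (stripping a trailing "s") are assumptions made for this example.

```python
import re
from typing import Iterable, Set


def to_singular(word: str) -> str:
    """Naive singularization (assumption): drop a trailing 's' from longer words."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def word_list(texts: Iterable[str]) -> Set[str]:
    """Decompose labels or descriptions into a set of lower-case singular words."""
    words = set()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            words.add(to_singular(token))
    return words


# 'Forests' and 'forest' collapse into one term, as in the label handling above.
terms = word_list(["Forests", "forest resources"])
```

A production-quality pipeline would need a proper lemmatizer, because a rule this simple mishandles irregular plurals.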
3.2 Computation of similarity
The similarity (2–5) was computed separately for the four main types of information provided by the thesauri and mentioned in the previous section. A similar approach of computing the similarity of various types of information separately was also used in [53, 54], where four types of similarity (syntactic, property, neighbourhood and context) are recognized.
The total similarity computed in our research was obtained as the average of the particular values (6). The similarity was computed according to Tversky (1; the principles are mentioned in , the formula was published in ). A similar approach was used for example in [14, 46, 47].
$$S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha\,|X \setminus Y| + \beta\,|Y \setminus X|} \qquad (1)$$
The characters X and Y in (1) represent the input sets (in this case, X and Y are the particular lists of words created by decomposition of the information provided by the thesauri). The parameters α and β were set to 1 (the Tanimoto coefficient as a special case of the Tversky index). Other coefficients such as Dice's coefficient (Table 1) were tested, but the results were similar. Comparing Table 1 and Table 4, the distributions of maxima and minima (local as well as absolute) are evidently the same, and the differences between corresponding values in both tables (the course of the function) are also very similar. The correlation between both tables (Table 1 and Table 4) equals 0,988. Apart from the similar character of the outputs, an important reason for selecting the Tversky index is that it can be asymmetric when α ≠ β (unlike Dice's coefficient). It can therefore accommodate a possible extension of the research by relations that are not symmetric.
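The Tversky index and its special cases can be sketched in a few lines of Python. This is not the XSLT template used in the research; the function name and the example word sets are hypothetical, and returning 0 for two empty sets is an assumption chosen to mirror the treatment of missing relations as 0.

```python
from typing import Set


def tversky(x: Set[str], y: Set[str], alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky index of two word sets; alpha = beta = 1 gives the Tanimoto coefficient."""
    common = len(x & y)
    denominator = common + alpha * len(x - y) + beta * len(y - x)
    if denominator == 0:
        # Both sets are empty (e.g. a missing relation): treated as no similarity.
        return 0.0
    return common / denominator


# Hypothetical word lists of two "forest" concepts.
x = {"forest", "tree", "land"}
y = {"forest", "tree", "wood", "area"}

tanimoto = tversky(x, y)                      # alpha = beta = 1
dice = tversky(x, y, alpha=0.5, beta=0.5)     # Dice's coefficient
one_sided = tversky(x, y, alpha=1.0, beta=0.0)
reversed_one_sided = tversky(y, x, alpha=1.0, beta=0.0)
```

With α = β the index is symmetric; the last two calls differ only in the order of the arguments and show the asymmetry that appears once α ≠ β.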
The similarity was computed with the use of an XSLT template developed by the first author. The template transforms the input data file containing all explicit information separated into the word list into an HTML (HyperText Markup Language) file. This HTML file contains the tables with particular similarities.
4 Results
The comparison of similarity of the “forest” concepts defined in the above-mentioned thesauri is summarized in the following tables, which show particular aspects of similarity. Rows and columns of the tables represent the “forest” concept in the individual thesauri. The values (between 0 and 1) show the similarity between particular concepts in the thesauri. The same concepts (on the top-left to bottom-right diagonal) naturally show the maximum similarity (value 1). The similarity is expressed by the Tversky index (Equation 1 in the section Methods).
Table 2 shows the similarity of the definitions (or descriptions) of the concepts. It reveals one of the main problems of thesauri: missing explicit descriptions in the form of definitions or other texts. Only three thesauri (AGROVOC, GEMET and LusTRE/EARTh) contain a detailed specification of the “forest” concept. It is evident that GEMET and LusTRE/EARTh use the same definition (adopted from ). The similarity based on definitions is computed without stop words (words carrying only syntactic information). All forms of words with the same meaning were taken as equivalent terms (this rule was kept in the other analyses as well).
The following tables express the similarity of object relations, which are typical for thesauri based on the SKOS standard: broader terms (Table 3), narrower terms (Table 4) and related terms (Table 5).
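The removal of stop words before comparing definitions can be sketched as follows. The stop-word list and the example definition are assumptions made for illustration; the list actually used in the research is not given in the text.

```python
import re
from typing import Set

# Illustrative stop-word list (assumption); a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "in", "with", "by", "is", "to"}


def content_words(definition: str) -> Set[str]:
    """Return the lower-case words of a definition that are not stop words."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    return {token for token in tokens if token not in STOP_WORDS}


# Hypothetical definition text, not quoted from any of the thesauri.
words = content_words("An area of land covered by trees")
```

The resulting sets can then be compared with the same set-based index as the other word lists.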
In order to summarize the similarity of particular concepts, average values from the previous tables (Table 2 – Table 5) were calculated (Table 6). The authors tested various weights of the aspects of similarity, but finally all weights were considered equal (set to the value 1): explicit descriptions or definitions are the most important for humans to understand a concept, whereas the standardized and formalized object relations can be processed automatically, so neither aspect was given priority.
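The averaging of the four partial similarities can be sketched as a weighted mean. The aspect names, the function name and the numeric values are hypothetical; with all weights equal to 1 the result reduces to the plain average used for Table 6.

```python
from typing import Dict, Optional

ASPECTS = ("definition", "broader", "narrower", "related")


def total_similarity(parts: Dict[str, float],
                     weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the partial similarities; equal weights by default."""
    if weights is None:
        weights = {aspect: 1.0 for aspect in ASPECTS}
    total_weight = sum(weights[aspect] for aspect in ASPECTS)
    return sum(weights[aspect] * parts[aspect] for aspect in ASPECTS) / total_weight


# Illustrative partial similarities of one pair of concepts (not measured values).
parts = {"definition": 1.0, "broader": 0.2, "narrower": 0.0, "related": 0.4}

equal = total_similarity(parts)
definition_heavy = total_similarity(
    parts, {"definition": 2.0, "broader": 1.0, "narrower": 1.0, "related": 1.0})
```

Keeping the weights as a parameter makes it easy to repeat the comparison with a different emphasis on definitions or relations.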
Table 6 contains three types of extreme values:
Similarity of the same concepts (the top-left to bottom-right diagonal).
Similarity of the “forest” concepts in GEMET and LusTRE/EARTh. The value 0,38 is the highest of all computed similarities. The reason is that both thesauri use the same definition of the “forest” concept. The example of GEMET and LusTRE/EARTh illustrates another problem of the skos:exactMatch relation and its implementation. While the “forest” concept in LusTRE/EARTh is connected to the concept with the same name in GEMET, there is no inverse relation.
Entirely different concepts with a similarity value of 0 include the following pairs: STW - EuroVoc and STW - OECD (neither of these pairs is interconnected by the skos:exactMatch relation).
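The missing inverse relation mentioned above can be detected mechanically. The following sketch uses the thesaurus acronyms from the section Methods; only the one-directional LE to GE link is taken from the text, the remaining links are hypothetical, and the function name is an assumption.

```python
from typing import Iterable, List, Tuple


def missing_inverse(links: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return directed exactMatch links (a, b) for which (b, a) is not present."""
    link_set = set(links)
    return sorted((a, b) for (a, b) in link_set if (b, a) not in link_set)


exact_match = [
    ("LE", "GE"),                 # LusTRE/EARTh -> GEMET, no inverse (from the text)
    ("AV", "NA"), ("NA", "AV"),   # hypothetical reciprocal pair
]

one_directional = missing_inverse(exact_match)
```

Such a check could support the completion of inverse relations in the thesauri.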
If the extreme value (0,38) is eliminated, the set of similarity values is quite homogeneous. Figure 3 compares two histograms of similarity values. White columns show the number of similarity values between non-interlinked concepts falling into each interval of similarity (the range of similarity is limited by the values in Table 6). Grey columns represent the similarity values of concepts interconnected by the skos:exactMatch relation.
The following scheme (Fig. 4) also presents the results of the comparison. Two thesauri are connected if the total similarity (Table 6) of their “forest” concepts is higher than 0. Therefore two pairs (EV-ST and OE-ST) are missing, as is the extreme value (0,38). Black lines connect concepts interlinked by the skos:exactMatch relation, while silver lines are used for concepts that are not interconnected. The width of the lines represents the values of similarity according to Table 6. The values are divided into equal intervals according to Fig. 3. The line width changes from 0 for the lowest interval to 9 pixels for the interval (0,09;0,10).
Both outputs (Fig. 3 and Fig. 4) show the same result: there is no direct relation between the interconnection of concepts and the value of similarity. This result is supported by the average similarity (after eliminating the extreme values, which fall into both types of concepts) for interconnected (0,05) and non-interconnected concepts (0,04).
5 Discussion
From the results presented in the previous section, it is evident that in the case of the “forest” concepts and the selected thesauri the concepts interconnected by the skos:exactMatch relation are not considerably more similar than other concepts. This statement is based on the following facts:
Random or non-ordered occurrence of interconnected and not interconnected concepts in the histogram (Figure 3). It is not possible to say that the number of interconnected concepts tends to any side of the graph.
The average similarity of interconnected concepts is higher (0,085, compared to 0,049 for non-interconnected concepts). But if the extreme values are removed (to make the data set more homogeneous and not influenced by one very different value), the average similarity of non-interconnected concepts is even higher than that of the interconnected concepts (0,056 compared to 0,051).
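The comparison of average similarities with and without an extreme value can be sketched as follows. The thesaurus names A to D and all numeric values are purely illustrative and are not the values of Table 6; the function name is an assumption.

```python
from statistics import mean
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Pair = FrozenSet[str]


def group_means(similarity: Dict[Pair, float],
                linked_pairs: Set[Pair],
                exclude: Iterable[Pair] = ()) -> Tuple[float, float]:
    """Mean similarity of interlinked and of non-interlinked pairs.

    Pairs listed in `exclude` (e.g. an extreme value) are skipped.
    Raises statistics.StatisticsError if one of the groups ends up empty.
    """
    excluded = set(exclude)
    linked, unlinked = [], []
    for pair, value in similarity.items():
        if pair in excluded:
            continue
        (linked if pair in linked_pairs else unlinked).append(value)
    return mean(linked), mean(unlinked)


# Illustrative values only.
ab, ac, bc, ad = (frozenset(p) for p in (("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")))
similarity = {ab: 0.40, ac: 0.04, bc: 0.06, ad: 0.05}
linked_pairs = {ab, ac}

with_outlier = group_means(similarity, linked_pairs)
without_outlier = group_means(similarity, linked_pairs, exclude=[ab])
```

In this toy data set a single extreme pair is enough to reverse the ordering of the two averages, which is the effect described in the second point above.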
Regarding the results of this research, the authors claim that the construction of relations expressing “a high degree of confidence” does not follow the explicit semantic information provided by thesauri and other semantic tools. The highest value of similarity is 0,358. This is too low (the maximum similarity is 1) to support the statement mentioned in the Introduction section, namely that concepts representing the same phenomenon from various resources and connected by links are more similar than concepts expressing the same phenomenon but standing alone. This fact is emphasized by the average value of similarity, which is also very low (about 0,05).
It seems that the semantic relations between the concepts are probably created on the basis of implicit semantics: the subjective view of the authors, editors or managers of the thesauri, their experience with other semantic tools and similarity based on the name of a concept. Implicit semantics is not shareable in a wide or global community. Processing of implicit semantic information by machines is also impossible. Therefore, its implementation cannot support interoperability and sharing of knowledge efficiently.
Other reasons for the low similarity of the “forest” concepts are partially mentioned in the article . Several premises of the geographical domain that contain sources of vagueness have been published. Similarly to the mountains, marsh or thicket concepts (mentioned in ), the “forest” concept does not have “a precise, universally acknowledged definitions”.
The low similarity values are also caused by the use of vague terms in the definition and very generic description of the studied relation (skos:exactMatch), which contains very general and non-specific phrase “a high degree of confidence”. According to  “'High' and 'dense' are adjectives, which give some indication of physical properties of a feature but do not specify any definite measurable requirement. 'Very' accentuates vague adjectives but does not make them any more definite.”
Both mentioned cases of vagueness represent a combination of conceptual and sorites vagueness (mentioned in ). The conceptual vagueness (closely connected to ambiguity) consists in inadequate explicit definitions and descriptions (for example very poor or missing characterization of the “forest” concept in several thesauri, see Table 2). The sorites vagueness (based on the Sorites paradox) concerns various and very subjective viewing of several properties (for example “high degree”).
The research introduced in this article has not been completed. The further steps of the research of semantic similarity of geographical concepts in semantic tools will be divided into four main parts:
Improving the methods of similarity investigation and computation. For example, in the research published in this paper missing relations receive the same value (0) as existing relations that share no similarity. Other approaches to similarity computation mentioned above will also be studied in more detail.
Extending the set of tested geographical concepts (general land cover and land use concepts, because the publications [56, 57] report important heterogeneities) and adding new resources of concepts.
Description of relations and similarity by an approach based on multi-valued logic can be realized.
As a final result of the long-term research, recommendations focused on building semantic relations with focus on context and any explicit specification will be published.
The goal of this article is to verify whether selected geographical concepts representing the same geographical phenomenon from various resources and interconnected by a relation expressing very high affinity are really more similar than concepts standing alone. The geographical concept “forest” is used as the domain for testing, because the “forest” phenomenon and the concepts representing it are essential in dealing with tasks related to deforestation, landscape changes, protection of species, floods, production of oxygen, tourism, forestry etc. The set of studied semantic tools was narrowed down to include only relevant thesauri (AGROVOC, EuroVoc, GEMET, LusTRE/EARTh, NAL, OECD Macrothesaurus and STW Thesaurus for Economics) containing geographical concepts. Finally, the skos:exactMatch relation, which expresses a high affinity of interconnected concepts, was chosen, because the SKOS format is typically used in thesauri. The above-mentioned methodology, demonstrated on the “forest” concept, can easily be applied to a broader set of concepts.
This proximity of concepts was evaluated on the basis of computation of similarity of each type of explicit information provided by the thesauri. These types of information included definitions, descriptions or annotations, hierarchical relations (broader and narrower terms) and semantic relation (related term). The content of subjects of these relations was decomposed into particular words (usually nouns) and the similarity was computed on the basis of Tversky's approach. Total similarity was gained as the average of four particular similarity values based on various types of relations.
The results show that in the case of “forest” concepts and the selected thesauri the concepts interconnected by the skos:exactMatch relation are not considerably more similar than other concepts. It is evident from the low correlation of the “is interconnected” property and the similarity of the concept, the histogram of similarity (Figure 3) and the average similarity of the interconnected and noninterconnected concepts. On the basis of the results of the research it is possible to claim that a construction of skos:exactMatch relation does not follow explicit semantic information provided by thesauri.
The results of this research can be used for further development of the studied thesauri, because they should not be a definitive solution, but a living system absorbing new data, information and knowledge. Improvements of thesauri can consist in the completion of inverse relations or the extension, harmonization and standardization of explicit descriptions and specifications of concepts.
Regardless of the results of our research, Linked Data are a very important component of the contemporary world of information technologies. Linked Data enable us to interconnect self-standing and isolated data resources and objects. Since the links connect not only data objects to each other, but also data objects to relevant items in vocabularies, Linked Data could contribute to better understanding and sharing of data. But there is a crucial question: are the particular components of Linked Data (primarily the links) really reliable? In the context of this article the question could be narrowed down: are concepts connected by the skos:exactMatch relation much more similar than other concepts, and is this similarity really high? The research published in this paper shows that the answer to the above-mentioned questions is negative (at least in the case of the studied thesauri, concepts and relations).
These results do not criticize the Linked Data approach and its implementation in the geographical domain. They point out that Linked Data need clear and understandable descriptions with a minimum of vague terms. These descriptions should be based on respected publications, standards and norms. They should follow a general consensus and offer alternatives, but only with a detailed explanation of the meaning and usage of such alternatives. Hierarchical and semantic relations also have to be constructed on the basis of detailed external information and expert knowledge. The cooperation of semantic engineers and geographers (and other domain experts) is crucial. It is also necessary to emphasize the key role of explicit and formal semantics and of a uniform approach to the development of interconnections of geographical concepts. These recommendations could contribute to a better use of the great potential of Linked Data in the geographical domain. The authors are aware that a complete elimination of vagueness in geographical concepts in Linked Data is an unrealistic expectation. But any particular improvement that provides less vague information supports interoperability and higher-quality communication and information transfer. These small steps focused on semantics are a very important part of the never-ending effort towards the Semantic Web.
This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports.
Bizer, C., Heath, T., Idehen, K., Berners-Lee, T. Linked data on the web (LDOW2008). In Proceedings of the 17th international conference on World Wide Web, 2008, pp. 1265–1266.
Bizer, C., Heath, T., Berners-Lee, T. Linked data-the story so far. International journal on semantic web and information systems, 2009, 5(3), 1–22.
Goodwin, J., Dolbear, C., Hart, G. Geographical linked data: The administrative geography of Great Britain on the semantic web. Transactions in GIS, 2008, 12(s1), 19–30.
Stadler, C., Lehmann, J., Hoffner, K., Auer, S. Linkedgeodata: A core for a web of spatial open data. Semantic Web, 2012, 3(4), 333–354.
Kritikos, K., Rousakis, Y., Kotzinos, D. Linked open GeoData management in the cloud. In Proceedings of the 2nd International Workshop on Open Data, 2013, p. 3. ACM.
Berners-Lee, T. Design issues: Linked data. World Wide Web Consortium, 2006.
Kopetz, H. Internet of things. In Real-Time Systems, 2011, pp. 307–323. Springer US.
Bechhofer, S., Buchan, I., De Roure, D., Missier, P., Ainsworth, J., Bhagat, J. et al. Why linked data is not enough for scientists. Future Generation Computer Systems, 2013, 29(2), 599–611.
Jain, P., Hitzler, P., Yeh, P. Z., Verma, K., Sheth, A. P. Linked Data Is Merely More Data. In AAAI Spring Symposium: linked data meets artificial intelligence, 2010.
Bennett, B. What is a forest? On the vagueness of certain geographic concepts. Topoi, 2001, 20(2), 189–201.
Helms, J. A. Forest, forestry, forester: What do these terms mean? Journal of Forestry, 2002, 100(8), 15–19.
Comber, A. J., Wadsworth, R. A., Fisher, P. F. Using semantics to clarify the conceptual confusion between land cover and land use: the example of forest. Journal of Land Use Science, 2008, 3(2–3), 185–198.
Schwering, A., Raubal, M. Spatial relations for semantic similarity measurement. In Perspectives in conceptual modeling, 2005, pp. 259–269. Springer Berlin Heidelberg.
Kavouras, M., Kokla, M. Theories of geographic concepts: ontological approaches to semantic integration. CRC Press, 2007.
Haav, H. M., Kaljuvee, A., Luts, M., Vajakas, T. Ontology-Based Retrieval of Spatially Related Objects for Location Based Services. In On the Move to Meaningful Internet Systems: OTM, 2009, pp. 1010–1024. Springer Berlin Heidelberg.
Gruber, T. R. A translation approach to portable ontology specifications. Knowledge acquisition, 1993, 5(2), 199–220.
Gomez-Perez, A., Benjamins, R. Overview of knowledge sharing and reuse components: Ontologies and problem-solving methods. IJCAI and the Scandinavian AI Societies. CEUR Workshop Proceedings, 1999.
Corcho, O., Gomez-Perez, A. A roadmap to ontology specification languages. In Knowledge Engineering and Knowledge Management Methods, Models, and Tools, 2000, pp. 80–96. Springer Berlin Heidelberg.
Gomez-Perez, A., Corcho, O. Ontology languages for the semantic web. Intelligent Systems, IEEE, 2002, 17(1), 54–60.
Caracciolo, C. AGROVOC model description and analysis. With suggestion for improvements. (FAO internal document), 2013.
Miyamoto, S., Miyake, T., Nakayama, K. Generation of a pseudothesaurus for information retrieval based on cooccurrences and fuzzy set operations. Systems, Man and Cybernetics, IEEE Transactions on GIS, 1993, (1), 62–70.
Holanda, A., Torres Pisa, I., Kinouchi, O., Souto Martinez, A., Seron Ruiz, E. Thesaurus as a complex network. Physica A: Statistical Mechanics and its Applications, 2004, 344(3), 530–536.
Severino, F. The term development in the thesauri of international organisations. The European Journal of Development Research, 2007, 19(2), 327–351.
Pastor-Sanchez, J. A., Martinez Mendez, F. J., Rodriguez-Munoz, J. V. Advantages of Thesaurus Representation Using the Simple Knowledge Organization System (SKOS) Compared with Proposed Alternatives. Information Research: An International Electronic Journal, 2009, 14(4), n4.
Miles, A., Bechhofer, S. SKOS Simple Knowledge Organization System Reference. W3C Recommendation, 2009.
Miles, A., Matthews, B., Wilson, M., Brickley, D. SKOS core: simple knowledge organisation for the web. In International Conference on Dublin Core and Metadata Applications, 2005, p. 3.
Isaac, A., Summers, E. SKOS simple knowledge organization system primer. W3C Working Group Note, 2008.
Baker, T., Bechhofer, S., Isaac, A., Miles, A., Schreiber, G., Summers, E. Key choices in the design of Simple Knowledge Organization System (SKOS). Web Semantics: Science, Services and Agents on the World Wide Web, 2013, 20, 35–49.
Van Assem, M., Malaise, V., Miles, A., Schreiber, G. A method to convert thesauri to SKOS. Springer Berlin Heidelberg, 2005, pp. 95–109.
Bennett, B., Mallenby, D., Third, A. An Ontology for Grounding Vague Geographic Terms. In FOIS, 2008, Vol. 183, pp. 280–293.
Tomai, E., Kavouras, M. From onto-geonoesis to onto-genesis: The design of geographic ontologies. Geoinformatica, 2004, 8(3), 285–302.
Mark, D., Smith, B., Egenhofer, M., Hirtle, S. Ontological foundations for geographic information science. Research Challenges in Geographic Information Science, 2004, 335–350.
Fisher, P., Cheng, T., Wood, J. Higher order vagueness in geographical information: empirical geographical population of type n fuzzy sets. Geoinformatica, 2007, 11(3), 311–330.
Albertoni, R., De Martino, M., Podesta, P. Towards Linkset Quality for Complementing SKOS Thesauri. 2014.
Hogan, A., Polleres, A., Umbrich, J., Zimmermann, A. Some entities are more equal than others: statistical methods to consolidate Linked Data. In 4th International Workshop on New Forms of Reasoning for the Semantic Web: Scalable and Dynamic (NeFoRS2010), 2010.
Ding, L., Shinavier, J., Finin, T., McGuinness, D. L. owl:sameAs and Linked Data: An empirical study. WebSci10: Extending the Frontiers of Society On-Line, 2010.
Ding, L., Shinavier, J., Shangguan, Z., McGuinness, D. L. SameAs networks and beyond: analyzing deployment status and implications of owl:sameAs in linked data. In The Semantic Web – ISWC, 2010, pp. 145–160. Springer Berlin Heidelberg.
Halpin, H., Hayes, P. J. When owl:sameAs isn't the Same: An Analysis of Identity Links on the Semantic Web. In LDOW, 2010.
Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S. An empirical survey of linked data conformance. Web Semantics: Science, Services and Agents on the World Wide Web, 2012, 14, 14–44.
Lin, F., Sandkuhl, K. A survey of exploiting wordnet in ontology matching. In Artificial Intelligence in Theory and Practice II, 2008, pp. 341–350. Springer US.
Borgida, A., Walsh, T. J., Hirsh, H. Towards Measuring Similarity in Description Logics. Description Logics, 2005, 147.
Formica, A. Concept similarity by evaluating information contents and feature vectors: a combined approach. Communications of the ACM, 2009, 52(3), 145–149.
Tversky, A. Features of similarity. Psychological Review, 1977, 84(4), 327–352.
Tversky, A., Gati, I. Studies of similarity. Cognition and categorization, 1978, 1, pp. 79–98.
Tversky, A., Gati, I. Similarity, separability, and the triangle inequality. Psychological Review, 1982, 89(2), 123.
Wang, L., Liu, X. A new model of evaluating concept similarity. Knowledge-Based Systems, 2008, 21(8), 842–846.
d'Amato, C., Staab, S., Fanizzi, N. On the influence of description logics ontologies on conceptual similarity. In Knowledge Engineering: Practice and Patterns, 2008, pp. 48–63. Springer Berlin Heidelberg.
Formica, A. Ontology-based concept similarity in formal concept analysis. Information Sciences, 2006, 176(18), 2624–2641.
Yang, Y., Du, Y., Sun, J., Hai, Y. A topic-specific web crawler with concept similarity context graph based on FCA. In Advanced intelligent computing theories and applications. With aspects of artificial intelligence, 2008, pp. 840–847. Springer Berlin Heidelberg.
Formica, A. Concept similarity in Formal Concept Analysis: An information content approach. Knowledge-Based Systems, 2008, 21(1), 80–87.
Lee, M. C., Liu, Z. L., Chen, H. H., Lai, J. B., Lin, Y. T. FCA based concept constructing and similarity measurement algorithms. In Advanced Information Management and Service (IMS), 2010, pp. 384–388.
Ballatore, A., Wilson, D. C., Bertolotto, M. Computing the semantic similarity of geographic terms using volunteered lexical definitions. International Journal of Geographical Information Science, 2013, 27(10), 2099–2118.
Guisheng, Y., Qiuyan, S. Research on ontology-based measuring semantic similarity. In Internet Computing in Science and Engineering, 2008. ICICSE'08, 2008, pp. 250–253.
Ngan, L. D., Hang, T. M., Goh, A. E. Semantic similarity between concepts from different OWL ontologies. In Industrial Informatics, 2006, pp. 618–623.
Dunster, J., Dunster, K. Dictionary of natural resource management. Dictionary of natural resource management, 1996.
Cerba, O. Ontologie jako nastroj pro navrhy datovych modelu vybranych temat priloh smernice INSPIRE. Dissertation, Univerzita Karlova v Praze, 2011. (in Czech)
Belgiu, M., Strobl, J., Mittlboeck, M. Adding Semantics To Spatial Content. A Land Cover Scenario, 2012.
The latest version of the SKOS W3C Recommendation is published at http://www.w3.org/TR/2009/REC-skos-reference-20090818/.
Particular thesauri may deal with other kinds of relations or hierarchies, such as microthesauri or themes. Because these are not standardized and are used only in special cases, they were not incorporated into the research.
About the article
Published Online: 2016-11-12
Published in Print: 2016-01-01
Citation Information: Open Geosciences, Volume 8, Issue 1, Pages 556–566, ISSN (Online) 2391-5447, DOI: https://doi.org/10.1515/geo-2016-0049.
© 2016 O. Čerba and K. Jedlička, published by De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (BY-NC-ND 3.0).