The Internet of Things (IoT) will connect billions of embedded computers that can sense and influence their environment. By integrating perception and control of the real world with data and services available on the Web, a wide range of novel applications can be realized, including Smart Cities, Smart Homes, or Smart Grids. A prerequisite for integrating sensor data with other data on the Web is a common data format that is not constrained to a specific domain, such that joint queries over diverse data sources can be efficiently performed. The Semantic Web offers such a data format called RDF, which essentially consists of subject-predicate-object triples to formulate arbitrary facts, as well as a query language called SPARQL to pose queries over sets of such triples. In order to scale to the huge amount of sensor data being produced in the IoT, RDF databases and SPARQL query engines need to be implemented in a distributed fashion, in particular using peer-to-peer (P2P) techniques. Existing solutions in
that space offer only limited functionality and cannot be easily extended. Therefore, after surveying the state of the art, we propose a generic framework that fully supports SPARQL but allows plugging in different P2P systems and distribution strategies. We also present and evaluate a novel probabilistic distribution strategy that supports non-uniformly distributed RDF triples.
PIK is the professional journal for the use of information systems, dealing with topics related to information processing and communications techniques. It is the only German-language journal that covers the increasingly important fields of supercomputers, parallel computers and high-output workstations. PIK addresses practitioners and decision-makers in business, science and industry.