Distributing knowledge amongst smart environments 


Models for smart environments

Developing applications dedicated to smart environments is inherently complex, as it requires adapting to various kinds of changing context information, in particular those associated with users, connected devices and computational resources. These constantly evolving environments call for adequate models and reasoning techniques to support the creation of evolvable context-aware applications [BBH+10]. Over the last ten years, the pervasive computing community has produced different modelling approaches directed at this goal, i.e., to formally describe connected devices, applications and environments. While these approaches were designed to support some common requirements (e.g., to lower service or data heterogeneity, to capture service mobility, to represent dependencies between services, to describe the state, time and planning associated with a service, etc.), they all differ in their expressiveness, the reasoning that they allow and the computation time required to process them.
Early models focused on representing context and were mostly dominated by key-value as well as markup-scheme-based approaches. The major weakness of these approaches, however, was their inability to derive higher-level information, e.g., through a reasoning procedure. As pointed out by [BBH+10], the introduction of the W3C standard for describing mobile devices, Composite Capabilities / Preference Profile (CC/PP), marked the first context modelling approaches to include elementary constraints and relationships between context types. Based on the Resource Description Framework (RDF) – a language designed by the W3C to represent any resource on the Web – CC/PP inherited a simple entailment regime allowing higher-level context information to be derived. Limitations of RDF-based context modelling were however pointed out in various works [IRRH03, SLP04] and finally led to the design of more expressive approaches based on database or knowledge management techniques, which the following subsections detail.
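To make this derivation capability concrete, the following minimal Python sketch saturates a toy triple store with two RDFS-style rules (subclass transitivity and type propagation). The vocabulary (ex:Smartphone, etc.) and the two rules are illustrative assumptions of ours, not the actual CC/PP schema or its exact entailment regime.

# Toy triple store; the vocabulary is illustrative, not the CC/PP schema.
triples = {
    ("ex:device42", "rdf:type", "ex:Smartphone"),
    ("ex:Smartphone", "rdfs:subClassOf", "ex:MobileDevice"),
    ("ex:MobileDevice", "rdfs:subClassOf", "ex:ConnectedDevice"),
}

def entail(kb):
    """Saturate the triple set with two RDFS-style rules:
    rdfs11 (subClassOf transitivity) and rdfs9 (type propagation)."""
    kb = set(kb)
    changed = True
    while changed:
        derived = set()
        for (s, p, o) in kb:
            if p == "rdfs:subClassOf":
                for (s2, p2, o2) in kb:
                    if p2 == "rdfs:subClassOf" and s2 == o:
                        derived.add((s, "rdfs:subClassOf", o2))   # rdfs11
                    if p2 == "rdf:type" and o2 == s:
                        derived.add((s2, "rdf:type", o))          # rdfs9
        changed = bool(derived - kb)
        kb |= derived
    return kb

kb = entail(triples)
assert ("ex:device42", "rdf:type", "ex:ConnectedDevice") in kb  # derived fact

Even this small rule set suffices to derive the higher-level fact that device42 is a connected device, which a flat key-value model cannot express.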

Ontological models and semantic middlewares

Aside from the role-based approach, the increasing popularity of ontologies at the beginning of the 2000s [BLHL01] led to the development of ontology-based models and semantic middleware solutions. Such models and solutions gained momentum and became an important area of research [NAA09], with an enthusiasm accentuated by the advances made in the Semantic Web community. Indeed, the development of (computationally) tractable Semantic Web profiles – OWL-DL – together with the realization of efficient reasoning tools paved the way to rich and processable context models, i.e., models from which higher-level context information can be derived.
Over the years, a plethora of models and associated middlewares have been designed, each focusing on a particular domain (healthcare [LLD, PG11], tourism [Knu04, DD09], transport [NK09, BGM+10, BVAC13], building automation [SVVB12, DP14], home [CNS+06, BC08, WH11, CKY+11], agriculture [CPLM11, WWG13], etc.) in order to better automate users' daily activities and tasks. Many studies focused on deriving context information [Che03, GS04, SLPF03, RMCM03, PBW+04, CGZK04, SPL06] as well as on providing frameworks enabling the realization of composite and reconfigurable services, e.g., through agents or applications [KKK+08, MPG+08, ILMF11, RNS+08, SCM10]. The rise of sensing-enabled objects resulted in multiple efforts to build models for recognizing situations [CNS+06, KHF+11, CNW12, MDEK13]. Other models [ELES06, HWG07, Goo, CNBC10], as well as those surveyed in [CHN+09], were dedicated to providing generic descriptions of sensors and the entities that they measure.
All these approaches share the view that defining a common semantics for connected devices, sensors, tasks, etc. improves their interoperability, stepping toward the vision expressed by [Las05], in which Semantic Web technologies are seen as particularly well suited to solve the “interoperability nightmare” introduced by ubiquitous computing.
The following subsections will present a broad overview of the existing models and middlewares that we consider relevant in the scope of this dissertation. In particular, we will concentrate on the models and middlewares enabling context information derivation and device discovery. We will however leave aside vertical models, too specific to a particular domain and consequently overlooking many aspects that this dissertation intends to cover.

Searching through semantic similarity measures

As mentioned in the previous chapter, we believe that establishing semantic similarity measures to support the discovery of connected devices in smart environments is a key requirement. As the topic is widely covered by the Semantic Web community, this section provides a review of the best-known measures and places them in the context of concepts described in ontologies almost devoid of instances and underpinned by the DL SHOIQ. The choice of this DL stems from the complexity of the models established in Chapter 3 of this dissertation. Amongst the various ways to classify the existing semantic similarity measures, one is to consider two groups: the extensional-based and the intensional-based approaches.

Extensional-based similarity measures

The extensional-based similarity measures use the extensions of the concepts that they compare (i.e., their instances) to determine a similarity value.
In this category, many methods have considered the overlap of the extensions of the two compared concepts in order to produce similarity measurements. Grounded in the work of Jaccard [Jac01], D'amato et al. [dFE05] proposed a semantic similarity measure using the ratio of the number of instances belonging to the intersection of the compared concepts to the number of instances belonging to their union. Resnik [Res95] proposed a slightly different approach, defining the semantic distance of two concepts in terms of the IC (Information Content) conveyed by their LCA. To compute the IC of a concept, Resnik used the work of Ross [Ros09], where the Information Content is quantified as a negative log likelihood (IC(C) = −log(p(C))) and where p(C) is a probability associated with the concept C. In his work, Resnik defined p(C) as the probability of occurrence of C in a corpus. Further work by Jiang & Conrath [JC97] as well as Lin [Lin98] extended Resnik's vision by considering the variation between the IC conveyed by the compared concepts and the IC conveyed by their LCA to obtain a semantic similarity. In these works, the probability associated with any concept C is the fraction of instances belonging to C divided by the total number of instances in the ontology.
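As an illustration, the following Python sketch contrasts the overlap-based (Jaccard-style) measure with the IC-based family on a toy ontology; the concept names, instance sets and the assumption that the LCA is given are ours, not taken from the cited works.

import math

# Toy ontology: each concept maps to its set of instances.
instances = {
    "Device":     {"d1", "d2", "d3", "d4", "d5", "d6"},
    "Sensor":     {"d1", "d2", "d3"},
    "Actuator":   {"d4", "d5"},
    "TempSensor": {"d1"},
}
TOTAL = len(instances["Device"])  # all instances of the toy ontology

def jaccard(c1, c2):
    """Overlap of extensions: |C1 ∩ C2| / |C1 ∪ C2|."""
    e1, e2 = instances[c1], instances[c2]
    return len(e1 & e2) / len(e1 | e2)

def ic(c):
    """Information Content IC(C) = -log p(C), with p(C) the fraction
    of the ontology's instances that belong to C."""
    return -math.log(len(instances[c]) / TOTAL)

def resnik(lca):
    """Resnik: similarity is the IC of the least common ancestor."""
    return ic(lca)

def jiang_conrath(c1, c2, lca):
    """Jiang & Conrath: distance as the variation between the ICs of
    the compared concepts and the IC of their LCA."""
    return ic(c1) + ic(c2) - 2 * ic(lca)

def lin(c1, c2, lca):
    """Lin: ratio of shared to total information content."""
    return 2 * ic(lca) / (ic(c1) + ic(c2))

print(lin("TempSensor", "Sensor", lca="Sensor"))   # ~0.56
print(jaccard("TempSensor", "Actuator"))           # 0.0: disjoint extensions

The last line already hints at the drawback discussed below: pure extension overlap yields zero for concepts with disjoint instance sets, regardless of any semantics they may share.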
Another set of methods was developed to cope with the semantics of ontologies underpinned by expressive DLs (at least up to ALC). In particular, D'amato, Fanizzi et al. [dFE05, dFE06, FDE08] proposed several measures eliciting the semantics of any concept defined in such ontologies. In each of these works, the concepts of the ontology are rewritten into their normal forms and a semantic similarity measure is proposed, relying on the overlap of the extensions of the compared concepts.
The main drawback of these methods, however, is that they do not correctly interpret the semantics underlying the two concepts that they compare. In particular, they consider two concepts to be totally dissimilar if they do not share any instance, whether or not they have common semantics (see [DSF08] for a discussion on this point).
To cope with this limitation, D'amato et al. [DSF08] proposed to compute the semantic similarity of ontological concepts by relying on the variation between the number of instances in the concept extensions and the number of instances in the extension of what they call the “good common subsumer” (GCS) of the compared concepts. The use of the GCS – a variant of the LCS, but for concepts expressed through operators of the ALE DL – allows this measure to take into account the semantics underlying the compared concepts, while computing the variation of the extensions (instead of their overlap) addresses the aforementioned drawback. However, this measure has two other limitations. First, it does not bring much benefit in the case of ontologies underpinned by a DL more expressive than ALE; in particular, it does not take into account the special features of DLs allowing hierarchies of (possibly) transitive roles. Second, it assumes that the ontology covers both the concepts and (at least) a large portion of the instances of a given domain, which again is not always the case. To summarize, we believe that this measure should be chosen in the case of ontologies defining both the concepts and the instances of a given domain and whose semantics is based on a DL up to ALE. The experiments reported in Chapter 6 show how this measure fails in the case of ontologies devoid of instances or underpinned by an expressive DL such as SHOIQ.
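The intuition of this variation-based computation can be sketched as follows; the normalization is a simplification of ours and not the exact formula of [DSF08], and we assume the GCS extension is given rather than computed from ALE rewritings.

def gcs_similarity(ext_c, ext_d, ext_gcs):
    """ext_c, ext_d: instance sets of the compared concepts C and D;
    ext_gcs: instance set of their good common subsumer GCS(C, D),
    assumed to contain both ext_c and ext_d."""
    if not ext_gcs:
        return 0.0
    # Variation: fraction of GCS instances falling outside each concept.
    var_c = (len(ext_gcs) - len(ext_c)) / len(ext_gcs)
    var_d = (len(ext_gcs) - len(ext_d)) / len(ext_gcs)
    return 1.0 - (var_c + var_d) / 2

# Disjoint extensions, yet a non-zero similarity through their GCS.
print(gcs_similarity({"d1", "d2"}, {"d3"}, {"d1", "d2", "d3"}))  # 0.5

Unlike the overlap-based measures above, two concepts with disjoint extensions can still score high here, provided they jointly cover most of their good common subsumer.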


Intensional-based similarity measures

Unlike extensional measures, intensional measures focus on the structure of concept definitions to evaluate their similarity. In this category, many semantic similarity measures have been proposed as functions of the path distance between concepts in the hierarchical graph underlying the ontology. In this context, Rada et al. [RMBB89] proposed that the semantic distance of two concepts accounts for the length of their shortest path (through their LCA). Leacock & Chodorow [LCM98] further transformed this distance into a similarity by considering the longest path existing in the hierarchical graph. Wu & Palmer [WP94] considered the path between the ⊤ element of the hierarchical graph and the LCA of both compared concepts. Finally, Ganjisaffar [GANJ06] set weights on each concept of the graph based on the maximum length of the paths existing between this concept and the ⊤ concept, and further used the LCA of two compared concepts to obtain a similarity measure. These four measures have in common that they only consider the hierarchical graph of concepts to compute semantic similarity. As a consequence, they can be applied equally to ontologies whether they contain instances or not. However, they are not able to evaluate the semantics underlying a concept: since they rely only on the hierarchical graph, they give the same similarity value to any pair of concepts sharing the same LCA, whether or not those concepts share common semantics.
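Two representatives of this family can be sketched in a few lines of Python over a toy hierarchy (child-to-parent links, with "Top" playing the role of ⊤); the hierarchy and helper names are illustrative.

parent = {
    "Device": "Top",
    "Sensor": "Device",
    "Actuator": "Device",
    "TempSensor": "Sensor",
    "HumiditySensor": "Sensor",
}

def ancestors(c):
    """Path from a concept up to Top, inclusive."""
    path = [c]
    while c in parent:
        c = parent[c]
        path.append(c)
    return path

def lca(c1, c2):
    """Least common ancestor in the hierarchical graph."""
    a1 = ancestors(c1)
    return next(a for a in ancestors(c2) if a in a1)

def depth(c):
    return len(ancestors(c)) - 1  # Top has depth 0

def rada_distance(c1, c2):
    """Rada et al.: length of the shortest path through the LCA."""
    a = lca(c1, c2)
    return (depth(c1) - depth(a)) + (depth(c2) - depth(a))

def wu_palmer(c1, c2):
    """Wu & Palmer: 2 * depth(LCA) / (depth(c1) + depth(c2))."""
    return 2 * depth(lca(c1, c2)) / (depth(c1) + depth(c2))

# Both calls below involve the same LCA ("Sensor") and thus any pair of
# leaves under it gets identical scores, whatever their definitions carry.
print(rada_distance("TempSensor", "HumiditySensor"))  # 2
print(wu_palmer("TempSensor", "HumiditySensor"))      # 0.667

The final comments illustrate the limitation stated above: path-based scores are blind to everything in a concept definition except its position in the hierarchy.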
Other works proposed to consider all possible relations (i.e., in NR ∪ {⊑}) between any two concepts to compute their similarity. In this context, Sussna [Sus93] used weights on all edges linking two concepts to compute a semantic distance. Although this method computes a different “shortest path” between two concepts, it still does not consider the semantics that the concepts convey. Pirro et al. [PE10] proposed to assign a score of informativeness to each concept by using the whole set of relations defined in the ontology. For each concept definition, they defined an extended Information Content (eIC) consisting of the sum of the intrinsic Information Content (iIC) [SVH04] applied to each property in the definition of the concept. Computing the iIC of a concept involves the ratio between the number of its sub-concepts in the hierarchical graph and the number of concepts defined in the ontology. This method, however, has the same drawback as the aforementioned ones: it is not able to convey the semantics underlying a concept, as it does not consider the type of restriction (e.g., universal, existential, cardinality, etc.) applied to the properties in the definition of a concept.
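For concreteness, the iIC of [SVH04] and the eIC summation can be sketched as follows, using the log-based formulation of the sub-concept ratio (iIC(C) = 1 - log(sub(C) + 1) / log(N)); the counts below are hypothetical, and we represent each property in a definition simply by its number of sub-elements in the hierarchy.

import math

N = 100  # hypothetical number of concepts defined in the ontology

def iic(num_subconcepts):
    """Intrinsic IC [SVH04]: the more sub-concepts an element has in
    the hierarchical graph, the less informative it is."""
    return 1 - math.log(num_subconcepts + 1) / math.log(N)

def eic(sub_counts_per_property):
    """Extended IC [PE10]: sum of the iIC over each property occurring
    in the definition of the concept."""
    return sum(iic(n) for n in sub_counts_per_property)

print(iic(0))           # a leaf element: maximally informative (1.0)
print(iic(N - 1))       # a near-root element: 0.0, least informative
print(eic([0, 4, 10]))  # a concept defined through three properties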

Table of contents:

Abstract
Résumé
Introduction
Emerging eco-system
Difficulties in enabling the use of smart spaces
Contributions
Outline of the thesis
1 Preliminaries 
1.1 Background
1.2 Notations and Definitions
2 Related Works 
2.1 Models for smart environments
2.2 Searching through semantic similarity measures
3 Defining models to support mobile users 
3.1 Rationale in using Semantic Web technologies
3.2 Modelling connected devices
3.3 Modelling the location associated to smart environments
3.4 Semantic models for application templates
3.5 Representing user profiles
3.6 Conclusions
4 Towards producing efficient searching procedures 
4.1 Preamble
4.2 A semantic similarity measure for SHOIQ concepts
4.3 Example of application
4.4 Conclusions
5 Distributing knowledge amongst smart environments 
5.1 Preamble
5.2 Federated architecture of nodes
5.3 Sharing knowledge between federated nodes
5.4 Conclusions
6 Experimentations 
6.1 Implementations
6.2 Experimentations
6.3 Conclusions
Conclusions 
Summary of contributions
Perspectives
A Subsumption relations in SHOIQ 
B Algorithms generating pseudo-concepts 
Bibliography 
