ONTOLOGIES AND EXPLANATION
The act of explaining usually echoes causality. The Oxford dictionary defines an explanation as "a statement, fact, or situation that tells you why something happened; a reason given for something".
The idea of using knowledge from ontologies to learn explanatory and causal models has been considered in several works. Ćutić and Gini (2014) present the idea of integrating causal knowledge from ontologies for causal discovery; Messaoud et al. (2009) offer a method for iterative causal discovery that integrates knowledge from previously designed ontologies into the learning of causal Bayesian networks. The main pitfall of these works is that they treat the properties within an ontology as causal, which is not always the case; this is why we tried, in this work, to develop a more generic method.
One of the main benefits of integrating knowledge from ontologies would be the ability to provide structured and understandable explanations of why an artificial intelligence has reached a given conclusion. Indeed, explainable artificial intelligence is a trending topic that rose with the multiplication of "black box" systems (i.e. systems that cannot explain their results in a manner humans can reason with). Many works have emerged on this topic (Ribeiro et al., 2016; Zhang and Chen, 2018), mostly about neural networks. Bayesian networks, on the other hand, are much easier to explain because of their graphical structure. Work can be done on their structure and on sensitivity analysis (Lacave and Diez, 2002, 2004), but in general they are fairly straightforward to understand, since every node value can be explained with respect to the others. However, ontologies can bring another layer of explanation, as they introduce semantic notions to describe the relations between the nodes. Indeed, as we have seen before, there is no semantics behind the orientation of the probabilistic relations in a Bayesian network: A → B has the same validity (i.e. one model is not better than the other) as B → A. However, if an ontology tells us that A → B is not possible, then we have more information about the model and can rule out all Bayesian networks containing this relation.
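To illustrate how such a constraint prunes the set of candidate models, the following minimal Python sketch enumerates the directed structures over two variables and discards those containing an arc declared impossible. The names `candidate_structures`, `respects_ontology` and `forbidden_arcs` are hypothetical and do not come from any library; an actual learner would more likely apply this filter during structure search than by exhaustive enumeration.

```python
from itertools import combinations, product

def candidate_structures(nodes):
    """Enumerate every choice (no arc, a -> b, b -> a) for each node pair.

    No acyclicity check is done: it is unnecessary for two nodes,
    but a larger example would need one.
    """
    pairs = list(combinations(nodes, 2))
    options = [[None, (a, b), (b, a)] for a, b in pairs]
    for choice in product(*options):
        yield frozenset(arc for arc in choice if arc is not None)

def respects_ontology(structure, forbidden_arcs):
    """True if the structure contains none of the forbidden arcs."""
    return structure.isdisjoint(forbidden_arcs)

# Assumed ontological knowledge for this sketch: A -> B is impossible.
forbidden_arcs = {("A", "B")}

allowed = [s for s in candidate_structures(["A", "B"])
           if respects_ontology(s, forbidden_arcs)]
```

Here only the empty structure and B → A survive, so the ontology removes an orientation that the data alone could not have distinguished.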
As a consequence, ontologies can help to build arguments and explanations that models such as Bayesian networks would not be able to discover. For instance, Besnard et al. (2014) present a tool combining ontological and causal knowledge in order to generate arguments and counterarguments in favor of different facts by defining enriched causal knowledge. However, they did not cover the learning part.
DOMAIN OF APPLICATION
In this chapter, we focus on a particular ontology that sparked our interest in combining probabilistic graphical models and ontologies: the PO2 ontology, dedicated to transformation processes. We describe what we consider to be a transformation process before introducing the PO2 ontology.
A transformation process is a generic way of describing a sequence of steps. By analogy, each step (i.e. operation) can itself be considered as a small thermodynamic system, with information entering (the inputs) and leaving (the outputs) it. The particularity, and difficulty, of analyzing such processes is that this flow of information is heterogeneous (different variables, scales) and time-dependent: each operation takes place within a specific time frame. This results in a substantial number of variables that need to be described, as illustrated in Fig. 2.1.
In this figure, an example of a transformation process is given. Each black dot indicates a variable associated with a scale (ordinate axis) and a step (abscissa axis). To study such a process, it would be interesting to analyse how the variables influence each other: to do so, we have to conduct multiple analyses, across the steps, the scales and the variables themselves.
For the rest of this section, we use the following vocabulary when describing a transformation process:
• A step is an operation taking place at a specific time, which can be absolute (e.g. "January 1st 2020, 3:03 p.m.") or relative ("step 1 happens before step 2").
• A process is a succession of steps.
• An attribute is a variable that represents a measure taken during the transformation process.
• An input is an attribute defined at the beginning of the step that characterizes it (e.g. an oven temperature).
• An output is an attribute that results from the step.
We will also consider that all transformation processes studied in this section are well-defined: all inputs and outputs are useful to describe them. In most cases outputs are consequences of inputs. Two steps are considered identical if they share the same inputs and outputs, even if their values vary. For instance, while baking a cake, we always define the step Cooking with oven temperature as an input. However, the temperature value itself can change, depending on what we intend to cook.
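The vocabulary above can be sketched as a small data structure. The following Python snippet is only an illustration under stated assumptions: the class names `Attribute` and `Step`, the helper `same_step`, the attribute `mixing_duration` and the numeric values are all hypothetical and are not taken from the PO2 ontology. It shows that step identity depends on which inputs and outputs are declared, not on their values.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class Attribute:
    name: str
    value: Optional[float] = None  # the measured value may differ between runs

@dataclass
class Step:
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

    def signature(self):
        """Names of the inputs and outputs, ignoring their values."""
        return (frozenset(a.name for a in self.inputs),
                frozenset(a.name for a in self.outputs))

def same_step(s1, s2):
    """Two steps are identical if they share the same inputs and outputs."""
    return s1.signature() == s2.signature()

# Illustrative values only: the same Cooking step run at two temperatures.
cooking_low = Step("Cooking", inputs=[Attribute("oven_temperature", 180.0)])
cooking_high = Step("Cooking", inputs=[Attribute("oven_temperature", 220.0)])
# A step with a different (hypothetical) input is a different step.
mixing = Step("Mixing", inputs=[Attribute("mixing_duration", 5.0)])
```

With these definitions, `same_step(cooking_low, cooking_high)` holds even though the temperatures differ, whereas `same_step(cooking_low, mixing)` does not.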
In conclusion, transformation processes are characterized by two kinds of complexity: time and scale. The goal of learning a probabilistic model to represent a transformation process is to be able to explain each attribute with respect to the others in spite of these issues.
Example 7. Suppose an agro-food company wants to test a new process for baking its cakes. It has defined a process Baking Cake, composed of the following succession of steps: Mixing → Cooking → Tasting. This process is detailed in Fig. 2.2. Having a probabilistic model of it would allow us to verify the impact of the quantity of eggs, the quantity of milk and the oven's temperature on the taste of the final product.
PROCESS AND OBSERVATION ONTOLOGY PO2
The description of transformation processes we gave in the previous section is consistent with the way the core ontology PO2 is defined. Its main conceptual components are four classes: the Step, Participant, Observation and Attribute classes. The participants and observations are described as follows:
• Participants represent the inputs. In the ontology, they can be of three kinds: method (e.g. a measurement method, a selection method); mixture (e.g. a cake); and device (e.g. an oven). The mixture usually represents the product transformed throughout the whole process.
• Observations represent the outputs. They usually embody the different measures taken on the mixture and are characterized by multiple concepts: scale, sensory or computed observations.
The attribute is the intrinsic value associated with these participants and observations in order to describe them. Each attribute is itself characterized by a numeric value and a unit, given by the datatype properties po2:hasForValue and po2:hasForUnite.
In this ontology, a process is a whole operation with a particular goal (e.g. baking a cake, transforming a product); if two processes share the same goal, they are the same. However, there are multiple means to the same end: sharing a goal does not mean always following the same succession of steps. As a consequence, the ontology distinguishes the different successions of steps as itineraries. In an itinerary, each step is defined both by its participants and observations, and by its relations to the other steps, for instance with the property isBefore. An overview of these conceptual components is given in Fig. 2.3.
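As an illustration of how the isBefore relations of an itinerary determine the order of its steps, here is a minimal Python sketch using the standard-library `graphlib` module. It assumes the relations have already been extracted as (earlier, later) pairs; the function name `order_steps` and the pair representation are hypothetical, and the step names reuse the baking process of Example 7.

```python
from graphlib import TopologicalSorter

def order_steps(is_before):
    """Return the steps in an order compatible with (earlier, later) pairs.

    Raises graphlib.CycleError if the pairs are contradictory.
    """
    sorter = TopologicalSorter()
    for earlier, later in is_before:
        # graphlib expects a node followed by its predecessors.
        sorter.add(later, earlier)
    return list(sorter.static_order())

# isBefore pairs assumed for the itinerary of Example 7.
itinerary = [("Mixing", "Cooking"), ("Cooking", "Tasting")]
ordered = order_steps(itinerary)
```

For this linear itinerary the only compatible order is Mixing, Cooking, Tasting; with branching itineraries several orders may be valid and the function returns one of them.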
BUILDING THE RELATIONAL SCHEMA
Given the constraints C1 and C2, we can define the generic relational schema presented in Figure 2.6. In this figure, we describe two generic steps happening at times t and t + 1, both with participants and observations.
C1 ensures that events can have a probabilistic relation towards concomitant or future events, but not towards past ones. This constraint is translated into a temporal relational chain between the Step classes, oriented from Step_t to Step_{t+1}.
C2 ensures that participants can have a probabilistic relation towards observations of the same step, or towards participants and observations of the next steps, but not towards past observations or past participants. This constraint is translated both by the explanatory relational chain between participant and observation, and by the temporal one between Step_t and Step_{t+1}.
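The two constraints can be read as a simple test on candidate arcs. The Python sketch below is one possible reading, not the implementation used in this work: the function `arc_allowed` and the role labels are hypothetical, and the treatment of arcs between two attributes of the same role within the same step (allowed here) is an assumption, since C1 and C2 as stated do not settle that case explicitly.

```python
PARTICIPANT, OBSERVATION = "participant", "observation"

def arc_allowed(src_role, src_time, dst_role, dst_time):
    """Check a candidate arc src -> dst against C1 and C2.

    Times are step indices (t, t + 1, ...).
    """
    if dst_time < src_time:
        return False  # C1 and C2: no relation towards the past
    if dst_time > src_time:
        return True   # future participants and observations are reachable
    # Same step: participants explain observations, not the reverse (C2).
    if src_role == OBSERVATION and dst_role == PARTICIPANT:
        return False
    # Assumption of this sketch: same-role arcs within a step are allowed.
    return True
```

For example, a participant at step t may point to an observation at step t, while an observation at step t may only point to a participant from step t + 1 onwards.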
When building a specific relational schema RS_PO2 for a given domain, we define as many domain step classes as there are steps in the domain ontology. These classes are linked by temporal relational chains that we determine using the po2:isBefore object property.
Example 9. In the cooking transformation process of Example 7, we have three steps, and therefore three classes in RS_PO2. They are linked such that, for instance, Cooking has a relational chain towards Tasting and Mixing has one towards Cooking. In addition, going through Cooking's reference slot gives Mixing another relational chain, towards Tasting. The resulting relational schema is presented in Figure 2.7.
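The composition of relational chains in Example 9 amounts to following reference slots transitively. The following Python sketch makes this explicit under simplifying assumptions: the direct chains are given as a plain dictionary, and the function name `reachable_steps` is hypothetical rather than part of any PRM library.

```python
def reachable_steps(direct_chains):
    """For each step class, the set of classes reachable by composing chains."""
    steps = set(direct_chains)
    for targets in direct_chains.values():
        steps |= set(targets)
    reach = {}
    for start in steps:
        seen = set()
        stack = list(direct_chains.get(start, ()))
        while stack:
            step = stack.pop()
            if step not in seen:
                seen.add(step)
                # Follow the reference slots of the class just reached.
                stack.extend(direct_chains.get(step, ()))
        reach[start] = seen
    return reach

# Direct temporal chains of Example 9, derived from po2:isBefore.
direct = {"Mixing": {"Cooking"}, "Cooking": {"Tasting"}}
reach = reachable_steps(direct)
```

Mixing reaches both Cooking and Tasting, the latter only through Cooking's reference slot, while Tasting reaches no other class.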
Table of contents:
1 Background and State of the Art
1.1 Probabilistic Models
1.1.1 Discrete Probability Theory
1.1.2 Bayesian Networks
1.1.3 Learning Bayesian Networks
1.1.4 Essential Graphs
1.1.5 Probabilistic Relational Models
1.1.6 Learning under constraints
1.1.7 Using Ontologies to learn Bayesian networks
1.2.2 Causal discovery
1.2.3 Ontologies and Explanation
2 Learning a Probabilistic Relational Model from a Specific Ontology
2.1 Domain of application
2.1.1 Transformation Processes
2.1.2 Process and Observation Ontology PO2
2.2 ON2PRM Algorithm
2.2.2 Building the Relational Schema
2.2.3 Learning the Relational Model
2.3.1 Generation of synthetic data sets
2.4.1 Determination of explaining and explained attributes
2.4.2 Defining the temporality
3 Interactive Building of a Relational Schema From Any Knowledge Base
3.1 Definition of a generic relational schema
3.1.1 Explicitation of constraints
3.1.2 Structure of the Stack Model
3.2 CAROLL Algorithm
3.2.1 Expert assumption
3.2.2 Assumption’s Attributes Identification
3.3 Towards causal discovery
3.3.1 Validating causal arcs
3.3.2 Possible conclusions
3.3.3 Incompatibility of constraints
3.4.1 Synthetic data set
3.4.3 Control parameters in cheese fabrication
4 Semi-Automatic Building of a Relational Schema from a Knowledge Base
4.1 Closing the Open-World Assumption
4.1.1 General Idea
4.1.2 Defining the Transformation Rules
4.1.3 Limits and Conclusion
4.2 ACROSS Algorithm
4.2.1 Comparison between CAROLL and ACROSS
4.2.3 Relational Schema’s Automatic Generation
4.2.4 User modifications
4.4 Final Remarks
4.4.2 Expert feedback
Conclusion and Perspectives
Summary of Results
Discussion and Future Works
A User’s modifications
A.0.1 Delete a class
A.0.2 Fuse two classes of the same type
A.0.3 Divide a class
A.0.4 Create a Mutually Explaining class
A.0.5 Remove an attribute
A.0.6 Create a relational slot
A.0.7 Remove a relational slot
A.0.8 Reverse a relational slot
A.0.9 Filter the instances used for the learning
B Summary in French (Résumé en Français)