
## From Bayesian Networks to Markov Networks

MNs and BNs are closely related, but each can represent sets of independencies that the other cannot. Let us first illustrate, with the following example, a case where they represent the same ones.

Example 1.7 Let us first consider the following BN B: A → B → C. It defines the following joint probability distribution: P(A, B, C) = P(A) · P(B|A) · P(C|B). Now let us consider the MN M: A — B — C. The joint probability distribution of this MN would be P(A, B, C) = (1/Z) · φ1(A, B) · φ2(B, C). In this case, possible values for the MN's factors could be φ1(A, B) = P(A) · P(B|A) and φ2(B, C) = P(C|B).

The partition function Z would then be equal to 1. We can also see that B and M both encode the same set of independencies: d-sep(A; C | B) and sep(A; C | B). The previous example illustrates a situation in which a BN and a MN encode the same set of independencies. However, in most situations MNs and BNs with similar graphical structures encode different sets of independencies.
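As a concrete check of example 1.7, the following sketch (with hypothetical binary CPT values chosen only for illustration) builds the MN factors φ1(A, B) = P(A)P(B|A) and φ2(B, C) = P(C|B) and verifies that the partition function is 1 and that both models define the same joint distribution.

```python
import itertools

# Hypothetical binary CPTs for the BN A -> B -> C (values for illustration only).
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5}  # key: (c, b)

# MN factors chosen as in the example: phi1(A,B) = P(A)P(B|A), phi2(B,C) = P(C|B).
phi1 = lambda a, b: P_A[a] * P_B_given_A[(b, a)]
phi2 = lambda b, c: P_C_given_B[(c, b)]

# The partition function sums the unnormalized product over all assignments;
# since the factors already multiply to a probability distribution, Z = 1.
Z = sum(phi1(a, b) * phi2(b, c)
        for a, b, c in itertools.product([0, 1], repeat=3))
print(Z)

# The MN joint equals the BN joint for every assignment.
for a, b, c in itertools.product([0, 1], repeat=3):
    bn = P_A[a] * P_B_given_A[(b, a)] * P_C_given_B[(c, b)]
    mn = phi1(a, b) * phi2(b, c) / Z
    assert abs(bn - mn) < 1e-12
```

Any other split of the three CPTs between the two factors would work equally well, which illustrates that MN factors, unlike CPTs, carry no local probabilistic meaning.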

Example 1.8 Let A, B, C and D be four random variables and let us suppose we want to model the following independencies: (A ⊥⊥ D | {B, C}) and (B ⊥⊥ C | {A, D}). Figure 1.2a illustrates a MN for which both independencies hold and no other. Figure 1.2b is an attempt to model the same set of independencies using a BN. Unfortunately, whereas we have d-sep(A; D | {B, C}), B and C are not d-separated by {A, D}. Figure 1.2c is another unsuccessful attempt to model figure 1.2a's independencies. If we look at the joint probability distributions induced by figures 1.2b and 1.2c, we have (respectively) P(A, B, C, D) = P(A) P(B|A) P(C|A) P(D|B, C) and P(A, B, C, D) = P(A) P(D) P(B|A, D) P(C|A, D). If we consider those conditional probability distributions as factors, they induce links that do not exist in figure 1.2a: B — C for figure 1.2b and A — D for figure 1.2c. Conversely, note that the set of independencies represented by figure 1.2b cannot be represented exactly by any MN.
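The two independencies of the square MN in figure 1.2a can be verified numerically by brute force. The sketch below uses hypothetical factor values over binary variables; the two conditional independencies hold for any choice of factor values, since the Gibbs distribution factorizes across the square's edges.

```python
import itertools

# Hypothetical factor values for the square MN A-B, A-C, B-D, C-D of figure 1.2a.
f_AB = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
f_AC = {(0, 0): 1.5, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 1.0}
f_BD = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 1.0}
f_CD = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

def joint(a, b, c, d):
    return f_AB[(a, b)] * f_AC[(a, c)] * f_BD[(b, d)] * f_CD[(c, d)]

Z = sum(joint(*x) for x in itertools.product([0, 1], repeat=4))
P = {x: joint(*x) / Z for x in itertools.product([0, 1], repeat=4)}

def independent(i, j, cond):
    # X_i ⊥⊥ X_j | cond  iff  P(i,j,cond) * P(cond) == P(i,cond) * P(j,cond)
    # for every assignment; indices refer to positions in (a, b, c, d).
    def marg(keys, x):
        return sum(p for y, p in P.items() if all(y[k] == x[k] for k in keys))
    for x in itertools.product([0, 1], repeat=4):
        lhs = marg(cond + [i, j], x) * marg(cond, x)
        rhs = marg(cond + [i], x) * marg(cond + [j], x)
        if abs(lhs - rhs) > 1e-12:
            return False
    return True

# Indices: A=0, B=1, C=2, D=3.
print(independent(0, 3, [1, 2]))  # True: A ⊥⊥ D | {B, C}
print(independent(1, 2, [0, 3]))  # True: B ⊥⊥ C | {A, D}
```

Running the same check on the factorizations of figures 1.2b and 1.2c would show that each BN loses one of the two independencies, as discussed above.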

### Modeling using Bayesian Networks

When using BNs to model specific domains, experts usually require specific functionalities; we will now present the most commonly used ones. For BNs, local structures are the data structures used to represent conditional probability distributions. It is important to differentiate the data structures used for probabilistic computations from those used for modeling purposes. Different data structures induce different costs for constructing a BN. CPTs are a straightforward representation of a discrete conditional probability distribution and are considered the classic data structure for encoding probabilities in BNs. However, the difficulty of filling a CPT grows exponentially with the number of random variables. Regarding ergonomics, a considerable amount of research has been carried out by the software companies selling BN-oriented products. In our case, however, we are more interested in alternative data structures that require fewer parameters, which helps reduce both the memory consumption of large CPTs and the number of computations. We will also take a glimpse at deterministic and probabilistic functions, which are a must-have feature when dealing with expert knowledge. Such functions span from standard probability distributions to logical operators.
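The exponential growth mentioned above is easy to quantify. A minimal sketch: a full CPT for a variable with k states and n parents (each also with k states) has (k - 1) · k^n free parameters, one probability per parent configuration minus the normalization constraint.

```python
# Free parameters of a full CPT: (k - 1) independent probabilities
# for each of the k**n parent configurations.
def cpt_free_parameters(k, n_parents):
    return (k - 1) * k ** n_parents

# With binary variables, each extra parent doubles the table:
for n in [1, 2, 5, 10]:
    print(n, cpt_free_parameters(2, n))  # 2, 4, 32, 1024 parameters
```

This is precisely why local structures that exploit regularities (context-specific independence, deterministic functions, noisy-OR-style combination rules) can dramatically reduce both elicitation effort and memory consumption.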

#### From Variable Elimination to junction tree inference algorithms

Remarkably, VE (Zhang and Poole, 1994, 1996; Dechter, 1999) is younger than junction-tree-based algorithms (Lauritzen and Spiegelhalter, 1988; Shenoy and Shafer, 1990; Jensen et al., 1990). We chose to present VE first because it offers a more intuitive grasp of probabilistic inference; moreover, its simplicity makes it a good introduction to inference in PGMs. In fact, VE was introduced as a simplification of junction-tree-based algorithms. However, it is more appropriate to present junction-tree-based algorithms as a sophistication of VE. To understand the link between the two, we must first discuss triangulation and its importance for both inference approaches.
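A minimal sketch of VE on the chain A → B → C from example 1.7 makes the idea concrete: to compute P(C) we multiply the factors mentioning A, sum A out, then do the same for B. The CPT values below are hypothetical, and the tiny factor representation (a scope plus a table) is introduced here only for illustration.

```python
import itertools

# A factor is a scope (ordered variable names) and a table over binary assignments.
Factor = lambda scope, table: {"scope": tuple(scope), "table": table}

f_A  = Factor(["A"], {(0,): 0.6, (1,): 0.4})
f_BA = Factor(["B", "A"], {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8})
f_CB = Factor(["C", "B"], {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5})

def multiply(f, g):
    # Pointwise product over the union of the two scopes.
    scope = list(f["scope"]) + [v for v in g["scope"] if v not in f["scope"]]
    table = {}
    for x in itertools.product([0, 1], repeat=len(scope)):
        val = dict(zip(scope, x))
        table[x] = (f["table"][tuple(val[v] for v in f["scope"])]
                    * g["table"][tuple(val[v] for v in g["scope"])])
    return Factor(scope, table)

def sum_out(f, var):
    # Marginalize var out of the factor, producing a smaller intermediate factor.
    i = f["scope"].index(var)
    scope = [v for v in f["scope"] if v != var]
    table = {}
    for x, p in f["table"].items():
        key = x[:i] + x[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return Factor(scope, table)

# Eliminate A, then B; the remaining factor over C is P(C).
tau1 = sum_out(multiply(f_A, f_BA), "A")   # intermediate factor over B: P(B)
P_C  = sum_out(multiply(tau1, f_CB), "B")  # final factor over C: P(C)
print(P_C["table"])
```

Each `sum_out` produces exactly the kind of intermediate factor that, as discussed next, triangulation materializes graphically as fill-ins.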

**The Shafer-Shenoy inference algorithm**

The Shafer-Shenoy (SS) inference algorithm transforms MNs into junction trees and computes marginal probabilities using a message-passing scheme (Shenoy and Shafer, 1990). In section 2.3 we detailed triangulation and gave a hint of how junction trees cache computations. We will now formally present the construction steps of junction trees, present the SS inference algorithm, and explain the importance of the running intersection property.
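Before the formal presentation, here is a minimal sketch of the message-passing idea on the smallest possible junction tree: two cliques C1 = {A, B} and C2 = {B, C} joined by the separator {B}. The clique potentials below are hypothetical (taken from the chain factorization of example 1.7, so the tree is already normalized); each clique sends its neighbor the sum of its potential over the non-separator variables.

```python
# Clique potentials over binary variables (hypothetical values):
# psi1(A,B) = P(A)P(B|A), key (a, b); psi2(B,C) = P(C|B), key (b, c).
psi1 = {(0, 0): 0.42, (0, 1): 0.18, (1, 0): 0.08, (1, 1): 0.32}
psi2 = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}

# Message C1 -> C2 over the separator {B}: sum A out of psi1.
mu_12 = {b: sum(psi1[(a, b)] for a in (0, 1)) for b in (0, 1)}
# Message C2 -> C1 over {B}: sum C out of psi2.
mu_21 = {b: sum(psi2[(b, c)] for c in (0, 1)) for b in (0, 1)}

# A clique's belief is its potential times its incoming messages; marginalizing
# either belief onto the separator must give the same distribution over B.
P_B_from_C1 = {b: sum(psi1[(a, b)] * mu_21[b] for a in (0, 1)) for b in (0, 1)}
P_B_from_C2 = {b: sum(mu_12[b] * psi2[(b, c)] for c in (0, 1)) for b in (0, 1)}
print(P_B_from_C1)
print(P_B_from_C2)
```

The agreement of the two separator marginals is the calibration property that SS guarantees after one inward and one outward pass over the tree.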

The first step in building a junction tree is to triangulate the MN's graph. We saw in section 2.3 that eliminating random variables one by one adds fill-ins, i.e., eliminating a random variable produces a new factor that links random variables that were not originally connected. When we triangulate a graph, we add fill-ins and by doing so we fix the elimination order: fill-ins are the graphical representation of the intermediate factors created by successively eliminating random variables. The cliques of the triangulated graph then represent the intermediate factors created by VE, and can be organized in a clique graph.
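The fill-in mechanism can be sketched directly: eliminating a node creates a factor over all its remaining neighbors, which graphically means connecting those neighbors pairwise. The helper below (an illustrative implementation, not a fixed algorithm from the text) records the fill-ins produced by a given elimination order.

```python
from itertools import combinations

def fill_ins(edges, order):
    """Return the fill-in edges added by eliminating nodes in the given order."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    added = []
    for node in order:
        neigh = adj.pop(node, set())
        # Eliminating `node` creates a factor over its neighbors,
        # so every non-adjacent neighbor pair gets a fill-in edge.
        for u, v in combinations(sorted(neigh), 2):
            if v not in adj[u]:
                added.append((u, v))
                adj[u].add(v)
                adj[v].add(u)
        for u in neigh:
            adj[u].discard(node)
    return added

# Square graph A-B, A-C, B-D, C-D: eliminating A first links B and C.
print(fill_ins([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")],
               ["A", "B", "C", "D"]))  # [('B', 'C')]
```

On the square from example 1.8, the single fill-in B — C is exactly the edge that triangulates the graph; once it is added, the remaining eliminations create no new edges.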

Definition 2.7 (Clique Graph) Let G be an undirected graph. A clique graph U of G is an undirected graph with a node for each clique Ci in G such that there is no other clique Cj in G with Ci ⊆ Cj. An edge Ci — Cj exists in U only if Ci ∩ Cj ≠ ∅. Conversely, if Ci ∩ Cj ≠ ∅, then there exists a path linking Ci and Cj in U.
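Definition 2.7 can be sketched in a few lines: collect the maximal cliques of the (small) triangulated graph by brute force, then connect every pair of cliques with a non-empty intersection. The brute-force clique search below is illustrative only and is exponential; it is fine for toy graphs but not for real triangulated networks.

```python
from itertools import combinations

def maximal_cliques(nodes, edges):
    """Brute-force maximal cliques of a small undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    is_clique = lambda s: all(v in adj[u] for u, v in combinations(s, 2))
    cliques = [set(s) for r in range(1, len(nodes) + 1)
               for s in combinations(nodes, r) if is_clique(s)]
    # Keep only cliques not strictly contained in another clique.
    return [c for c in cliques if not any(c < d for d in cliques)]

def clique_graph(cliques):
    """Edges of U: one per pair of cliques with a non-empty intersection."""
    return [(i, j) for i, j in combinations(range(len(cliques)), 2)
            if cliques[i] & cliques[j]]

# Triangulated square from the running example: A-B, A-C, B-C (fill-in), B-D, C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
cl = maximal_cliques(nodes, edges)
print([sorted(c) for c in cl])  # [['A', 'B', 'C'], ['B', 'C', 'D']]
print(clique_graph(cl))         # [(0, 1)]: the two cliques share {B, C}
```

Here the clique graph is already a tree; in general it contains cycles, and building a junction tree amounts to extracting a maximum-weight spanning tree that satisfies the running intersection property.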

**Table of contents:**

Notations

Introduction

**1 Probabilistic Graphical Models**

1.1 Independencies, graphs and I-maps

1.2 Markov Networks

1.3 Bayesian Networks

1.4 From Bayesian Networks to Markov Networks

1.5 Modeling using Bayesian Networks

**2 Probabilistic Inference**

2.1 Evidence representation

2.2 Variable Elimination

2.3 From Variable Elimination to junction tree inference algorithms

2.4 The Shafer-Shenoy inference algorithm

2.5 Other approaches and conclusion

**3 Bayesian Networks Extensions**

3.1 Early extensions

3.2 Object-Oriented Models

3.3 Probabilistic Relational Models

3.4 First-Order Probabilistic Models

3.5 Discussion

**4 Reinforcing Probabilistic Relational Model’s Object-Oriented Aspect**

4.1 Probabilistic Relational Models

4.2 Class inheritance

4.3 Parfactor representation of PRMs

**5 Structured Probabilistic Inference**

5.1 Ground inference limits

5.2 Structured Variable Elimination

5.3 Structured Probabilistic Inference

5.4 Structured BayesBall

5.5 Experimental results

**6 Pattern Discovery for Structured Probabilistic Inference**

6.1 Formalizing pattern discovery

6.2 Complexity of pattern discovery

6.3 An Approximate Algorithm for pattern discovery

6.4 Experimental Results

Conclusion

Appendices

**A SKOOL Language Specification**

A.1 SKOOL project structure

A.2 Attribute type declaration

A.3 Class and interface declaration

A.4 System declaration

A.5 Query unit declaration

A.6 Functions

**B French Summary**

B.1 Object-oriented probabilistic graphical models

B.2 Reinforcing the object-oriented aspect of PRMs

B.3 Structured probabilistic inference

B.4 Instance pattern discovery