Get Complete Project Material File(s) Now! »

## Hypernode graphs

Chapter abstract In this chapter, we introduce the notion of hypernode graph, which is our main contribution to the modeling of binary relations between sets of entities. Our main motivation is to define a model that goes beyond pairwise relations and allows to encode set based similarities. In the first part of the chapter, we review some important definitions concerning undirected graphs as the main model to represent pairwise re-lations between individuals. In particular, we recall the notion of Lapla-cian matrix which is the cornerstone of the spectral theory and discuss its main properties. In the second part, we introduce formally our model and discuss how we can extend the notion of Laplacian matrix to this new class of objects.

Finally, we discuss the expressiveness of our Laplacian framework and compare it with other extensions for higher order learning.

### Undirected Graphs and Spectral Framework

In the following, we recall the commonly accepted definitions of undirected graphs, graph Laplacians and graph kernels.

**Graphs and Laplacians**

An undirected graph g “ pN, Eq is a set of nodes N together with a set of undirected edges E. We define n “ |N| and p “ |E|. Each edge is an unordered pair ti, ju of distinct nodes (no self-loops) and is associated with a non negative weight wti,ju. We define the neighborhood N piq of a node i to be the set of all nodes j that are connected to i by an edge

We define a path on a graph to be a finite sequence of nodes that are con-nected by a sequence of edges. A connected component is a maximal set of nodes in which any two nodes are connected to each other by at least one path.

Example 1. Let g1 be the undirected graph defined by N “ t1, 2, 3, 4u and E “ tt1, 3u, t1, 4u, t2, 4u, t3, 4uu. The edge weights are set to wt1,3u “ 2, wt1,4u “ 1, wt2,4u “ 1 and wt3,4u “ 0.5. We represent g1 in Figure 2.1 below, together with its adjacency matrix W1 and its degree matrix D1. The sequence of nodes p1, 4, 3q is a path in the graph g1 but the set t1, 3, 4u is not a connected component since we can add the node 2 without losing the connectivity property.

We now consider the problem of collective node valuation: for a given graph g “ pN, Eq, the objective is to find a node valuation function f : N Ñ R. As said in the previous chapter, we expect from f that it assigns close values to nodes that are linked by edges in the graph (homophilic assumption). When it happens, we say that f is smooth on g. In order to quantify the smoothness of a function f over g, let us first assume that each edge is associated with an arbitrary orientation. Then, we define the gradient function grad for f by, for every oriented edge pi, jq,

The symmetric matrix Δ “ GT G is called the (unnormalized) graph Lapla-cian, which is also proved to be defined by Δ “ D ´ W where D and W are the degree matrix and the adjacency matrix of the graph. Note that the Laplacian operator is often introduced directly under the standard form D ´W , without considering the underlying concept of graph gradient. How-ever, gradients prove to be essential when it comes to extend the notion of Laplacian operator to the hypernode graphs.

Proposition 2. Let g be an undirected graph and Δ its Laplacian matrix.

Δ is symmetric and positive semi-definite. Moreover, we have

NullpΔq “ NullpGq “ NullpΩq “ Spanp1C1 , . . . , 1Cp q ,

where 1C1 , . . . , 1Cp are the indicator functions of the k connected compo-nents of g and NullpΔq denotes the nullspace of Δ. As a consequence, the dimension of NullpΔq is equal to the number of connected components.

Proof. The symmetricity and the positive semi-definiteness of Δ comes di-rectly from the fact that Δ “ GT G. If f is in NullpΔq, we have fT Δf “ 0 “ }Gf}22 so f P NullpGq. Conversely if Gf “ 0, then Δf “ GT Gf “ 0. Moreover, Gf “ 0 if and only if }Gf}22 “ 0 “ Ωpfq, which concludes the proof of the equality between the three nullspaces.

Let us now consider f P NullpGq. We have, for every edge pi, jq, gradpfqpi, jq “ ?wi,jpfpjq ´ fpiqq “ 0

As a consequence, as soon as two nodes i and j are linked by an edge, we must have fpiq “ fpjq. By transitivity, all the nodes that belong to the same connected component should have the same value. In the following, we will denote by αt the value taken by f on the connected component Ct. Since by definition the connected components have null intersections so f P Spanp1C1 , . . . , 1Cp q. The reverse is straightforward: let us choose f in Spanp1C1 , . . . , 1Cp q. For any edge ti, ju P E, i and j have to belong to the same connected component. Since fpiq “ fpjq, gradpfqpi, jq “ 0 and Δf “ }Gf}22 “ 0.

A direct consequence of Proposition 2 is that NullpΔq is never reduced to t0u since it always contains the constant vector 1. Since Δ is symmetric and positive semi-definite, it is orthogonally diagonalizable and all its eigenvalues are positive or null. In the following, we assume that the q ď n eigenvalues are ordered as follow 0 “ λ1 ă λ2 ă ¨ ¨ ¨ ă λq

Note that if f is a unit eigenvector associated with the eigenvalue λ, then Ωpfq “ f T λf “ λ. The unit eigenvectors of Δ can be seen as the harmonic modes of the graph and their smoothness depend on their related eigenvalue. A mode associated with the eigenvalue λ1 “ 0 is a fundamental mode. Be-cause of Proposition 2, any fundamental mode is a linear combination of the indicator functions of the connected components.

Example 2. Let us consider the graph g2 with N “ t1, 2, 3, 4, 5, 6, 7, 8u presented in Figure 2.2 below.

The Laplacian Δ2 of this graph has four distinct eigenvalues λ1 “ 0, λ2 “ 0.092, λ3 “ 4 and λ4 “ 4.3. λ1 is associated to the fundamental mode f1 “ 1N . λ2 is associated to a unique mode f2 (Ωpf2q “ λ2 “ 0.092). Since λ4 ą λ3 ąą λ2 so the modes associated to λ3 and λ4 are far less smooth than f2. We present in Figure 2.2 below the modes f1 and f2. Note that these modes can be used to feed a clustering algorithm on the nodes. This idea leads to the so-called spectral clustering algorithm (see Von Luxburg 2007).

Note that some papers also consider a diﬀerent version of the Laplacian ma-trix called normalized Laplacian that is defined by Δn “ I ´D´1{2W D´1{2. As shown in (Zhou et al. 2005),

#### Graph kernels and distances

In this section, we review the notion of graph distance and introduce some important notations for the next chapters. A graph distance is a function d : N ˆ N Ñ R that defines a metric on the nodeset N. Namely, d has to satisfy the following axioms:

1. Positivity: @pi, jq P N2, dpi, jq ě 0

2. Symmetricity: @pi, jq P N2, dpi, jq “ dpj, iq

3. Coincidence: @pi, jq P N2, dpi, jq “ 0 ô i “ j

4. Triangle Inequality: @pi, j, kq P N3, dpi, jq ď dpi, kq ` dpk, jq

A standard graph distance is the shortest-path distance and is defined for any pair of nodes pi, j q as the minimum sum of weights along a path joining i and j. For example, let us consider graph g3 presented in Figure 2.3. The closest nodes to 4 in the sense of the shortest-path distance are 1, 2, 3 and 5 (the distance is equal to 1 for these four nodes). The shortest-path distance can be computed using algorithms such as Djikstra’s algorithm and serves as a basis for many applications.

A main limitation of this approach is that it does not take into account the global connectivity of the graph. In the case of g3, the shortest-path distance does not reflect the existence of the two clusters t1, 2, 3, 4u and t5, 6, 7, 8u. In the following, we will introduce a family of graph distances based on the Laplacian matrix Δ that solves this issue. In particular, we will see that one of these distances is strongly related to the notion of random walk on graphs.

**Hypernode graphs**

In this section, we introduce our model for representing binary relations over sets. As discussed in the previous chapter, our goal is to encode pairwise similarity relations between sets of nodes. We start by defining formally the notion of hypernode graph and present some important notations. Then, we introduce our extension of the graph spectral framework (see Section 2.1) and discussed its main properties.

**Model definition**

Definition 1. A hypernode graph h “ pN, Hq is a set of nodes N together with a set of hyperedges H. Each hyperedge h P H is an unordered pair tsh, thu of two non empty and disjoint hypernodes (a hypernode is a subset of N). Each hyperedge h P H has a weight function wh mapping every node

i in sh Yth to a positive real number w hpiq (for i R sh Yth, we define whpiq “

**Table of contents :**

**1 Introduction **

**2 Hypernode graphs **

2.1 Undirected Graphs and Spectral Framework

2.1.1 Graphs and Laplacians

2.1.2 Graph kernels and distances

2.2 Hypernode graphs

2.2.1 Model definition

2.2.2 Hypernode graph Laplacians

2.2.3 Equivalent hypernode graphs

2.3 Expressiveness of the Laplacian framework

2.3.1 The case of hypernode graphs

2.3.2 The case of hypergraph Laplacians

2.4 Conclusion

**3 Properties of hypernode graphs **

3.1 Hypernode graphs, graphs and signed graphs

3.1.1 Pairwise Weight Matrix and Laplacians

3.1.2 Signed graph reduction

3.2 Paths and components in hypernode graphs

3.2.1 Paths and signed components

3.2.2 Independent components and strong connectivity

3.3 Hypernode graph kernels and distances

3.3.1 Definition and main properties

3.3.2 Diffusion on hypernode graphs and relations with d2

3.3.3 The transition matrix P “ D´1W

3.4 Conclusion

**4 Skill rating with hypernode graphs **

4.1 Skill rating in multiplayer games

4.1.1 Notations and team additive model

4.1.2 The Elo rating system

4.1.3 The TrueSkill rating system

4.2 Learning skill ratings with hypernode graphs

4.2.1 Modeling Games with Hypernode Graphs

4.2.2 Regularizing the hypernode graph

4.2.3 Inferring Skill Ratings and Predicting Game Outcomes

4.3 Experiments

4.3.1 Tennis Singles

4.3.2 Tennis Doubles

4.3.3 Xbox Title Halo2

4.4 Conclusion

**5 Perspectives and open problems**

5.1 Cuts in hypernode graphs

5.1.1 Cuts in undirected graphs

5.1.2 Cuts and hypernode graphs

5.1.3 The Min-Cut problem on hypernode graphs

5.1.4 Relation with the signed graph cuts

5.1.5 Algorithmical perspectives and partial results

5.2 Directed hypernode graphs

5.3 An algebraical interpretation of hypernode graphs

5.3.1 The classes of Graph Kernels and Graph Laplacians

5.3.2 The class of Hypernode graph Laplacians

5.3.3 A convex hull conjecture and an intermediate class Fpnq

5.3.4 A Riemanian geometry for strongly connected hypernode graphs

**6 Conclusion **

**Appendices**