Get Complete Project Material File(s) Now! »

## Discrete methods in Image Processing

In the last chapter, we described the Bayesian rational for the denoising problem.

We turn once again to probability thinking, but restricting our analysis to discrete probabilistic models. Markov Random Fields (MRF) are in the foundations of the methods discussed in this chapter, and an inspiration for many others. Image pixels are naturally interpreted as Markov states, and image properties, such as spatial coherence, are encoded as potentials stored in the edges of the image grid graph.

Problems are solved by computing the solution of maximum a posteriori probability (MAP). The MAP inference itself boils down to minimize the challenging class of pseudo-boolean functions.

In some fortunate cases, the functions can be minimized exactly and efficiently by a reduction to a max-flow (min-cut) problem, and it happens that such cases model imaging problems, as segmentation, nicely well. In fact, the minimum cut defines a partition, and one can interpret the cut as the contour separating two objects. It is the key for fruitful research that has followed. One can abstract the MRF machinery and define potentials on vertices and edges of the grid graph such that its minimum cut answers the problem being posed. Moreover, the potentials can model geometric properties of objects embedded in the grid graph, and one can use cuts to estimate the objects perimeters, for example.

We start this chapter by giving a brief description of Markov Random Fields and the minimization problem arising from the MAP inference. In the second section we present some properties of pseudo-boolean functions and how to optimize them. In the third section we describe the special class of submodular functions and efficient algorithms to compute the minimum of such functions. Finally, we describe successful models in the image processing community based on graph cuts and how one can inject geometric information in them.

**Markov Random Fields**

Let G = (V, E) an undirected graph with vertices set V and edges set E. The set of adjacent vertices to v ∈ V is denoted N(v). Given two subsets S,Q ⊂ V, a (S,Q)-cut is any subset of edges E′ ⊂ E such that S,Q are in different connected components in the graph G(V, E \E′). We denote cut(S,Q) the set of all (S,Q) cuts.

For each vertex v ∈ V we associate a discrete random variable Xv that take values from a label set v according with some distribution P. We group all random variables in vector X and we write XS to refer to the set of associated variables with vertex set S ⊂ V. We also group all the label sets in the collection . We denote WX the set of all configurations for the random vector X. We say that H = (G,X, , P) is a Markov Random Field (MRF) if for any non-adjacent states Xu and Xv, the probability distribution P satisfies the independence conditions below

**Hidden Markov model**

Imaging problems are naturally coupled with a set of observations, namely the color intensities of each pixel. We can expect that such observations play a role in any probabilistic model pretending to solve an imaging problem. The Hidden Markov Model (HMM) is a subclass of MRF that incorporates such observed variables in its definition and is quite often used in the image processing literature.

We are going to be interested in problems arising from the setting in which the states of random variables in Y are known, and one wishes to infer the states of random variables in X. In other words, we are interested to find w⋆ such that w⋆ = argmax w = P(X = w | Y ). (2.6)

The vector Y is called the vector of observed variables, and in image problems they are usually associated to the color intensity of pixels.

The set of labels X is defined according to the problem. In segmentation it could represent the different partitions in which to segment the image, e.g. a label to encode vehicles, another to encode pedestrians, a third to encode the sky and so on. In stereo, the labels could mean the relative depth of the object with respect to the others. In denoising and reconstruction problems in general, it could be the color intensities themselves.

Equations (2.9) and (2.10) defines a Potts model. Minimizing Equation (2.10) is NP-hard [BVZ01] if variables xp, xq are not binary, i.e., the underneath HMM is a multilabel one. In the binary case, Equations (2.9) and (2.10) are referred as the Ising model, and this one can be minimized efficiently.

In fact, MAP inference in multilabel HMMs are much more difficult than MAP inference in binary HMMs. At first glance, the multilabeling extension does not seem to be an issue, as we can always transform a multilabel problem in a binary one by including as many as log2 |X| new variables. The difficulty is that the resulting energy is very likely to lie in a class of binary energies whose minimization is NP-hard [Ram+08]. Therefore, MAP inference in multilabel HMMs is not likely to be solved exactly in an efficient manner. Instead, approximation algorithms as (α, β)-swap [BVZ01] or range moves [Vek07] are used.

In the next two sections we are going to focus on the analysis of binary first-order HMMs, i.e., MRF that are encoded by 1, 2-cliques potentials and binary random variables. This restriction is justified by two reasons: first because there exist efficient algorithms to solve them (see Table 2.1); and second because of its generality.

A general MRF can be transformed into an equivalent binary first-order MRF by including a sufficient number of auxiliary variables. Naturally, such transformations are not always useful. In the multilabel case the derived energy is almost surely non-submodular and in the case of high-order cliques, the inclusion of auxiliary variables may turn the minimization problem impractical [Ish10].

In binary first-order MRFs, the clique potentials maps binary vectors to real values and the energy belongs to the class of quadratic pseudo-boolean functions, which can be written as In 0th order HMM, a simple Greedy algorithm computes the MAP inference. In 1st order HMM we have linear algorithms based on dynamic programming for 1-connected graphs and max-flow for other topologies. For orders greater than 2, MAP inference can be computed efficiently if the 2-clique potential is submodular.

Notice, however, that the best algorithm for orders greater than 2 is quite expensive (Q denotes the time to evaluate the function). Submodular functions is a special class of Pseudo-Boolean functions discussed in Section 2.2 that can be minimized in polynomial time, in constrast with non-submodular, which in this case is proven to be a problem in NP-hard. There are other classes of pseudo-boolean functions that are efficiently optimized, some of them are described in [BH02]

In the next section, we explore the class of pseudo-boolean functions and how to optimize them.

### Pseudo-boolean functions

Definition 1(Pseudo-boolean function): A pseudo boolean function f : {0, 1}n → R is a function that maps binary vectors to real values.

Pseudo-boolean functions (PBF) can be interpreted as set functions, i.e., functions that map sets to real values. Indeed, a binary vector of n elements is in bijection with the power sets of V = {1, 2, · · · , n}. Therefore, the PBF f : {0, 1}n → R can be written as f(x1, · · · , xn) = X S⊂2V cS Y i∈S xi. (2.11) where cS ∈ R is a coefficient associated to each subset S. Equation (2.11) is denoted the polynomial form of PBF f. The order of a PBF is defined as the cardinality of the largest subset S in which cS > 0. The polynomial form is unique.

Sometimes it is convenient to express f in its so called posiform representation, usually denoted by φ instead of f. In its posiform representation we use literals ¯xi, xi ∈ L to represent states 0, 1 of indexed variable i and all coefficients are positive except the one associated to the empty set (which is also called the constant term of the posiform and denoted a∅).

φ(f) = φ(x1, · · · , xn) = X T⊂2L aT Y p∈T p, (2.12) where aT ≥ 0 whenever T 6= ∅. The posiform representation can be seen as a truth table with positive values for all configurations, except the one in which all variables are set to zero. One can pass from posiform to polynomial representation by exchanging ¯xi and (1−xi). A posiform is said maximum-constant if its constant term C(φ) is maximum among all posiforms representing f. A maximum-constant posiform is denoted φ⋆. Both general and maximum-constants posiforms are not unique representations of the PBF f.

**Table of contents :**

**Introduction **

**I Image Processing and Digital Geometry **

**1 Variational methods in Image Processing **

1.1 Inverse problems in imaging

1.2 Bayesian rationale and total variation

1.2.1 Tikhonov regularization

1.2.2 Euler-Lagrange equation

1.2.3 Total variation regularization

1.3 Standard techniques

1.3.1 Curve evolution

1.3.2 Level set

1.3.3 Minimum path

1.3.4 Convex relaxation

**2 Discrete methods in Image Processing **

2.1 Markov Random Fields

2.1.1 Clique factorization and Gibbs energy

2.1.2 Hidden Markov model

2.1.3 Grid graph and Tikhonov denoising revisited

2.1.4 Potts and Ising models

2.2 Pseudo-boolean functions

2.2.1 PBF optimization

2.2.2 Submodularity

2.3 Graph cut models

2.3.1 Binary segmentation

2.3.2 Geodesics computation

**3 Curvature as a regularizer **

3.1 Curvature and the curve-shortening flow

3.1.1 Definitions

3.1.2 Curve-shortening flow

3.2 Diffusion and level curves motion

3.2.1 Curvature in denoising and image segmentation

3.2.2 Curvature and the connectivity principle applied to inpainting

3.3 Elastica curve

3.3.1 Imaging models using the elastica

3.4 Discrete methods and squared curvature

3.4.1 Discrete elastica

3.4.2 Linear programming model for image segmentation using the discrete elastica

3.4.3 Unconstrained formulations

**4 Digital Geometry **

4.1 Ground concepts

4.1.1 Digital grid, digitization and digital line

4.1.2 Exact sampling versus digitization

4.2 Geometric measurements in digital objects

4.2.1 Multigrid convergence and perimeter estimation

4.2.2 Tangent and multigrid convergence of local quantities

4.2.3 Multigrid convergent estimators of curvature

4.3 Conclusion

**II Contributions **

**5 A combinatorial model for digital elastica shape optimization **

5.1 Digital elastica

5.2 Local combinatorial scheme

5.3 Experimental results

5.3.1 Free elastica

5.3.2 Constrained elastica

5.3.3 Running time

5.4 Global optimization

5.4.1 Simplified digital elastica

5.4.2 Optimization model for simplified digital elastica

5.4.3 Topological constraints

5.4.4 Linear relaxation of P1

5.4.5 Unconstrained version of P1

5.5 Conclusion

**6 A 2-step evolution model driven by digital elastica minimization **

6.1 FlipFlow model

6.1.1 Definitions

6.1.2 Model and algorithm

6.1.3 Algorithm discussion

6.2 Optimization method

6.3 Evaluation across m-rings

6.4 Data term and image segmentation

6.5 Conclusion

**7 A single step evolution model driven by digital elastica minimization**

7.1 BalanceFlow model

7.1.1 Definitions

7.1.2 Algorithm

7.2 Relation with FlipFlow

7.3 Conclusion

**8 Digital elastica minimization via graph cuts **

8.1 GraphFlow model

8.1.1 Candidate graphs and solution candidates set

8.1.2 GraphFlow algorithm

8.2 Conclusion

**9 Experimental analysis **

9.1 Free elastica

9.1.1 Exp-General

9.1.2 Exp-Radius

9.2 Constrained elastica

9.2.1 Discussion

9.3 Image segmentation

9.3.1 Influence of parameters

9.3.2 Comparison

9.4 Conclusion

**10 Conclusion and perspectives **

Appendices

A Curvature and distant disks

B Pixel incidence matrix