Get Complete Project Material File(s) Now! »

## Classication and Regression Trees

Classication and Regression Trees (CART) were introduced in the 80’s by Breiman et al [21], which creates powerful and simple binary tree based models for classication and regression problems, in statistical learning theory [113]. In both cases, the method consisted in creating rectangular partition of a feature space (high dimensional Rn), either t a model over each of these rectangles in case of regression, produce a classication. These trees (now called decision trees) described then the estimator for the regression function, or a linear separator for classication tasks. The trees we consider here only address recursive binary partitions as show in in the gure 1.2. To avoid over-tting the data, such trees are dealt with commonly in two broad ways: Firstly, prune them (which is what we will consider for the rest of our study). Secondly one can use aggregations of dierent subset of points to average over many such trees, which also includes the classical Random Forests which consists in using a random subspace of features to partition a node and aggregate them.

### Salembier-Garido’s Optimal Pruning & Guigue’s Scale-set

Salembier-Garrido and Guigues [47, 100] generalized the CART framework for the constrained optimization problem. Salembier-Garrido study binary partition trees, which represent a hierarchy of partitions created by using the max-tree representation, while Guigues considers a hierarchy created from complete linkage on regions of an oversegmentation [48], titled as Cocoons. In both studies the cost-complexity pruning, a greedy strategy is used to nd a constrained minimum, while replacing the constraint of the size of the subtree by a more general function which is the perimeter of the partition. See Equation (1.14).

One must note now that the dierence in these methods w.r.t CART is the interpretation of the constrained minimum. Salembier-Garido and Guigues provide a Lagrangian multiplier interpretation of the optimization [39]. This is discussed later in section 2.1. Constraints are now based on the number of regions and length of perimeter, thus providinga larger range of constraint values, while not requiring a large initial tree to start pruning like in the case of CART. Salembier-Garrido [100] provided a Rate-Distortion interpretation of the constraint problem of choosing a segmentation from a hierarchywith the constrained perimeter.

#### Dynamic program for minimal cut

The algorithm achieving the optimally pruned subtree in [21], also known as the BFOS (after the authors, Breiman, Friedman, Olshen, Stone) algorithm in literature, consists of a dynamic program, that performs a bottom-up scan, while pruning o a branch or the node its rooted in. Dynamic program consists in solving a structured complex problem by decomposing it into smaller simpler subproblems. These subproblems require to be overlapping in nature, such that a solution to a subproblem is calculated only once and serves to solve a larger subproblem. Furthermore this partial solution when aggregated with others, produces a global solution. The aggregation is quite often has a successive approximation interpretation. Following Guigues, in view of the Bellmenian characteristic of any dynamic program, in order to nd the optimal cut, while one performs a bottom up search starting at the leaves level, the algorithm will implicitly have to solve subproblems appearing at higher scales before some lower scales. This means, some nodes appearing early lower down in the hierarchy might be part of a global solution.

**Table of contents :**

**0 Motivation and Thesis Overview **

0.1 Motivation

0.1.1 Convex vs. Lattice based optimization

0.1.2 A Morphological Approach

0.2 Thesis Organization

**1 Braids and Energetic Lattices **

1.1 Basic notions and Notations

1.1.1 Partitions and Partial Partitions

1.1.2 Hierarchy

1.1.3 Classes

1.1.4 Energy

1.1.5 Recall on Orderings and lattices

1.2 Optimization on Hierarchies of Partitions

1.2.1 Classication and Regression Trees

1.2.1.1 Pruning CART trees

1.2.1.2 Optimal tree pruning

1.2.1.3 Uniqueness and Monotonicity

1.2.1.4 Departing from Binary Trees

1.2.2 Salembier-Garido’s Optimal Pruning & Guigue’s Scale-set

1.2.3 Problem review on hierarchies

1.2.4 Dynamic program for minimal cut

1.2.5 Going from Scale-sets To Energetic Lattices

1.3 Braids of Partitions

1.3.1 Denition

1.3.2 Classes of Braids

1.3.3 Motivations

1.4 Energetic lattices of braids

1.4.1 Energetic ordering

1.4.2 Energetic lattice

1.4.3 The three lattices

1.5 h-increasing energies

1.5.1 The two orderings and

1.5.2 Minimal cut and h-increasingness

1.5.3 Simple example for a non h-increasing energy

1.6 h-increasing compositions and Minkowski norms

1.6.1 Soille’s Constrained Connectivity and Hierarchies

1.6.1.1 (; !)-components composed by supremum

1.6.2 Dominant ancestor by supremum

1.6.3 Composition of _-generated energies

1.6.4 h-increasingness and generalizing DP

1.6.5 Composition by alternating sum-supremum

1.7 Scale increasing families of energies

1.7.1 Scale increasingness

1.7.2 Scale space properties

1.7.3 The Scale Function

1.8 Inf-modularity

1.8.1 Inf-modularity and scale increasingness

1.8.2 Inf-modularity Vs Sub-additivity

1.9 Summary

**2 Constrained optimization **

2.1 Review on Constrained optimization on Trees

2.1.1 Rate Distortion Theory

2.1.2 Image segmentation formulation using Rate Distortion

2.1.3 Tree structured Vector Quantization(TSVQ)

2.2 Lagrangian Multipliers and Everett’s Theorem

2.2.1 Reminder on Lagrange Multipliers

2.2.2 The Relaxation Theorem

2.2.2.1 Everett’s Main, and Theorems

2.2.3 Lagrangian Dual and KKT conditions

2.2.4 Reviewing constrained optimization on hierarchies

2.3 Guigue’s -cuts are upper bounds

2.3.1 Counter-example

2.3.2 Lessons from the Counter-Example

2.4 Improving the upper-bound -cuts

2.4.1 Perturbing the Scale Function

2.4.2 Penalty Methods

2.5 The energies ! = !’ + !@

2.6 Discussion on Everett’s theorem

2.6.1 Gap’s and lower bounds

2.7 Minimal -cuts and energetic lattices

2.8 Lagrangian Minimization by Energy (LME)

2.8.1 Vector case

2.8.2 Costs

2.8.3 Discussion

2.9 Lagrange minimization by Cut-Constraints (LMCC)

2.10 Class constrained minimization (CCM)

2.10.1 Single constraint

2.10.2 Implementation for inf-modular !@

2.10.3 Class constraint versus Lagrange minimization by energies

2.10.4 Vector case (multi constraints)

2.10.5 Overview of the three models

2.11 Primal Vs Dual: C-cuts Vs -cuts

2.12 Summary

**3 Applications of Energetic Lattice **

3.1 A few useful h-increasing energies

3.1.1 Mumford and Shah energy

3.1.2 Additive energy by convexity

3.1.2.1 Additive energies by active contours

3.1.3 Mumford-Shah Energy for Color image segmentation

3.1.4 Hierarchical structure based energies

3.2 Ground truth Proximal Energies

3.2.1 Ground truth evaluation

3.2.2 Segmentation Versus GT Partitions: Renements and Overlaps .

3.2.3 Segmentation Evaluation Measures

3.2.4 Haussdorf Distance

3.2.5 Hausdor distance

3.2.6 Half Haussdorf distances

3.2.6.1 Precision energy

3.2.6.2 Recall energy

3.2.7 Composition of !G(S) and G(S):

3.2.8 Composing multiple ground truth sets

3.2.9 Number of Classes Constraint

3.2.10 h-increasing Coverage Measures

3.2.11 Local linear dissimilarity

3.2.12 Global Precision-Recall similarity integrals

3.2.13 Proximity between hierarchies

3.2.14 Ground truth energies to Saliency functions

3.3 Summary

**4 Hierarchies and Saliency function **

4.1 Ground truth Evaluation of Segmentation Hierarchies

4.1.1 Contour proximity

4.2 Jordan Curves

4.2.1 Normal Segmentations

4.2.2 Describing Segmentation with Jordan Curves

4.3 Jordan Nets

4.3.1 Denitions

4.3.2 Ordering and lattice of J-nets

4.3.3 Net Opening in Literature

4.4 Watershed transformation and Saliency function

4.4.1 Watershed

4.4.1.1 Saliency functions

4.5 Fusion of hierarchies and functions

4.6 Composing hierarchies and numerical functions

4.6.1 Lower bounds by opening

4.6.2 Lack of upper bounds by closing

4.6.3 Upper bound by Geodesic Reconstruction

4.6.4 Discussion on Graph based methods

4.7 Algorithm and Experiments

4.7.1 Net Opening by up-sampling and down-sampling

4.7.2 Fusion of ground truth and hierarchy

4.7.3 Fusions of two hierarchies

4.7.4 Composition by ^: Hausdor distance Ordered saliencies

4.7.4.1 Partition Asymmetry and Hausdor distance

4.7.5 Combining multiple ground truths with a single hierarchy

4.7.6 Measuring structural changes after transformations

4.7.7 Geometric and Intrinsic Net openings

4.8 Braids from net opening

4.8.1 Braids from multiple functions on single J-net

4.8.2 Intersection of multiple Jordan nets

4.8.3 Braids over multiple hierarchies

4.9 Summary

**5 Algorithms and Graphs **

5.1 Region merging methods review

5.2 Algorithms

5.2.1 Optimal Cut DP on HOP

5.2.2 Optimal Cut DP on BOP

5.3 Intersection Graphs for Partition selection

5.3.1 Denitions

5.3.2 Partition Graphs

5.3.3 Maximally weighted Independent Set on HOP Intersection Graph

5.3.4 Intersection Graph for Braids

5.4 Max-Flow Min-Cut Analysis

5.4.1 Graph Structure for HOP

5.4.2 Graph Structure for BOP

5.5 Summary

**6 Conclusion **

6.1 Thesis Contributions

6.2 Applications and Future perspectives

6.2.1 Multi-segmentations

6.2.2 Thematic partitions

6.2.3 Thematic prediction

6.2.4 Combination of earth images

**Bibliography**