Automated force volume image processing for biological samples 


Bayesian formulation of sparse signal restoration

The Bayesian formulation of the inverse problem y = Ax + n consists in inferring the distribution of x = (q, r) knowing y. The MAP estimator of x can be obtained by maximizing the marginal distribution l(q|y) [13] or the joint distribution l(q, r|y) [39], [22]. Following [39], we focus on the joint likelihood l(q, r|y), which leads to a cost function involving the squared error ‖y − Ax‖² and the ℓ0-norm of x. Assuming a Gaussian white noise n ∼ N(0, σ_n² I_m), independent from x, it is easy to obtain

L(q, r) ≜ −2σ_n² log[l(q, r|y)] = ‖y − A(q ⊙ r)‖² + (σ_n²/σ_x²)‖r‖² + λ‖q‖₀ + C,   (2.4)

where q ⊙ r denotes the entrywise product (i.e., the signal x whose amplitudes are r on the support of q and 0 elsewhere), λ = 2σ_n² log(1/ρ − 1) involves the Bernoulli parameter ρ of the prior, and C is a constant. Given q, let us split r into two subvectors u and t indexed by the null and non-null entries of q, respectively, and let A_Q denote the submatrix of A gathering the columns indexed by the support Q of q. Since A(q ⊙ r) = A_Q t does not depend on u, it is easy to check that min_u L(q, t, u) = L(q, t, 0). Finally, the joint MAP estimation problem reduces to the minimization of L(q, t, 0) with respect to (q, t).
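To make the origin of (2.4) explicit, here is a brief sketch of the computation, assuming the Bernoulli-Gaussian prior recalled in subsection 2.2.2 (i.i.d. entries with P(q_i = 1) = ρ and r_i ∼ N(0, σ_x²), independent of each other and of n); the three terms of (2.4) come from the three factors of l(q, r|y) ∝ l(y|q, r) l(r) l(q):

```latex
% Sketch of the derivation of (2.4), up to additive constants (collected in C):
\begin{align*}
-2\sigma_n^2 \log l(y \mid q, r) &= \|y - A(q \odot r)\|^2 + \mathrm{const}, \\
-2\sigma_n^2 \log l(r)           &= \frac{\sigma_n^2}{\sigma_x^2}\,\|r\|^2 + \mathrm{const}, \\
-2\sigma_n^2 \log l(q)           &= 2\sigma_n^2 \log\!\Big(\tfrac{1-\rho}{\rho}\Big)\,\|q\|_0
                                    + \mathrm{const} \;=\; \lambda\,\|q\|_0 + \mathrm{const}.
\end{align*}
% Summing the three lines gives L(q, r) in (2.4), with lambda = 2*sigma_n^2*log(1/rho - 1).
```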

Mixed ℓ2-ℓ0 minimization as a limit case

A signal x is sparse if some entries x_i are equal to 0. Since this definition does not impose constraints on the range of values of the non-zero amplitudes, we choose to describe a sparse signal by a limit Bernoulli-Gaussian model in which the amplitude variance σ_x² is set to infinity. In this limit, the term (σ_n²/σ_x²)‖r‖² vanishes, and the minimization of L(q, t, 0) thus rereads:

min_{q,t} { L(q, t, 0) = ‖y − A_Q t‖² + λ‖q‖₀ }.   (2.5)
Theorem 1. The above formulation (2.5) is equivalent to:

min_{x ∈ ℝⁿ} { J(x; λ) = ‖y − Ax‖² + λ‖x‖₀ },   (2.6)

which is referred to as the ℓ0-penalized least-square problem. The term "equivalent" means that given a minimizer (q, t) of (2.5), the related vector x (equal to t on the support of q and to 0 elsewhere) is a minimizer of (2.6), and conversely, given a minimizer x of (2.6), the vectors q and t defined as the support of x and its non-zero amplitudes, respectively, are such that (q, t) is a minimizer of (2.5).
Proof. To prove the equivalence, we first prove that min_x J(x; λ) = min_{q,t} L(q, t, 0):
– Let x be a minimizer of J(·; λ). We set q and t to the support and the non-zero amplitudes of x, respectively. Obviously, J(x; λ) = L(q, t, 0). It follows that min_x J(x; λ) ≥ min_{q,t} L(q, t, 0).
– Let (q, t) be a minimizer of L(q, t, 0). The vector x, equal to t on the support of q and to 0 elsewhere, is such that Ax = A_Q t and ‖x‖₀ = ‖t‖₀ ≤ ‖q‖₀. Therefore, J(x; λ) ≤ L(q, t, 0). It follows that min_{q,t} L(q, t, 0) ≥ min_x J(x; λ).
We have shown that min_x J(x; λ) = min_{q,t} L(q, t, 0), but also that the minimizers of both problems coincide, i.e., they describe identical signals.
In the following, we focus on the minimization problem (2.6) involving the penalization term λ‖x‖₀. The hyperparameter λ is fixed; it controls the level of sparsity of the desired solution. The algorithm that will be developed is based on an efficient search of the support of x. In that respect, the ℓ0-penalized least-square problem does not drastically differ from the ℓ0-constrained problem min ‖y − Ax‖² subject to ‖x‖₀ ≤ k.
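As a small illustration of how λ trades data fidelity against sparsity in (2.6), the following sketch (Python/numpy, with made-up data that is not from the thesis) evaluates J(x; λ) for a sparse candidate and for a dense least-squares candidate; increasing λ favors the sparse one:

```python
import numpy as np

def l0_penalized_cost(y, A, x, lam):
    """J(x; lambda) = ||y - A x||^2 + lambda * ||x||_0  (eq. 2.6)."""
    r = y - A @ x
    return r @ r + lam * np.count_nonzero(x)

# Toy data (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_sparse = np.zeros(50)
x_sparse[[3, 17, 42]] = [1.0, -2.0, 0.5]
y = A @ x_sparse + 0.05 * rng.standard_normal(20)
x_dense = np.linalg.pinv(A) @ y          # dense minimum-norm least-squares solution

for lam in (0.01, 1.0):
    print(lam, l0_penalized_cost(y, A, x_sparse, lam),
               l0_penalized_cost(y, A, x_dense, lam))
```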

Adaptation of SMLR to ℓ0-penalized least-square optimization

We propose to adapt the SMLR algorithm to the minimization of the mixed ℓ2-ℓ0 cost function J(x; λ) defined in (2.6). To clearly distinguish it from SMLR, which specifically aims at minimizing (2.4), the adapted algorithm will be termed Single Best Replacement (SBR).

Principle of SMLR and main notations

The Single Most Likely Replacement algorithm [38] is a deterministic coordinatewise ascent algorithm to maximize log-likelihood functions of the form l(q|y) (marginal MAP estimation) or l(q, t, 0|y) (joint MAP estimation). In the latter case, it is easy to check from (2.4) that given q, the maximizer of l(q, t, 0|y) with respect to t has a closed-form expression t = t(q). Consequently, the joint MAP estimation reduces to the maximization of l(q, t(q), 0|y) with respect to q. At each SMLR iteration, all the possible single replacements of the support q (set q_i ← 1 − q_i while keeping the other q_j, j ≠ i, unchanged) are tested, then the replacement yielding the maximal increase of l(q, t(q), 0|y) is chosen. This task is repeated until no single replacement can increase l(q, t(q), 0|y) anymore. The number of possible supports q being finite (2ⁿ), and SMLR being an ascent algorithm, it terminates after a finite number of iterations.
Before adapting SMLR, let us introduce some useful notations.
– We denote by Q • i a single replacement, i.e., the insertion or removal of an index i into/from the active set Q:

Q • i ≜ Q ∪ {i} if i ∉ Q, and Q \ {i} otherwise.   (2.7)

– If Card[Q] ≤ min(m, n), we define the cost functions:

J_Q(λ) ≜ J(x_Q; λ) = E_Q + λ‖x_Q‖₀,   (2.8)
K_Q(λ) = E_Q + λ Card[Q],   (2.9)

where the least-square solution x_Q and the corresponding error E_Q have been defined in (2.1) and (2.2). Obviously, J_Q(λ) = K_Q(λ) if and only if x_Q has a support equal to Q. In subsection 2.3.2, we introduce a first version of SBR involving J_Q(λ) only, and in subsection 2.3.3, we present an alternative (simpler) version relying on the computation of K_Q(λ) instead of J_Q(λ), and we discuss to what extent both versions differ. Then, subsection 2.3.4 describes the behavior of SBR and states its main properties.
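For concreteness, here is a minimal sketch of these quantities (function names are mine; x_Q and E_Q follow the definitions referred to in (2.1) and (2.2), namely the least-squares solution restricted to the columns of A indexed by Q and the corresponding squared residual):

```python
import numpy as np

def restricted_least_squares(y, A, Q):
    """x_Q and E_Q: least-squares solution supported on Q and its squared error."""
    Q = sorted(Q)
    A_Q = A[:, Q]
    t, *_ = np.linalg.lstsq(A_Q, y, rcond=None)   # t = argmin ||y - A_Q t||^2
    x_Q = np.zeros(A.shape[1])
    x_Q[Q] = t
    E_Q = float(np.sum((y - A_Q @ t) ** 2))
    return x_Q, E_Q

def K_cost(y, A, Q, lam):
    """K_Q(lambda) = E_Q + lambda * Card[Q]  (eq. 2.9)."""
    return restricted_least_squares(y, A, Q)[1] + lam * len(Q)
```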

The Single Best Replacement algorithm (first version)

SMLR can be seen as an exploration strategy for discrete optimization rather than an algorithm specific to a posterior likelihood function. Here, we use the same strategy to minimize the cost function J(x; λ). We rename the algorithm Single Best Replacement to remove any statistical connotation. The SBR algorithm works as follows. At each iteration, the n possible single replacements Q • i (i = 1, ..., n) are tested, then the best one is selected, i.e., the replacement yielding the maximal decrease of J(x; λ). This task is repeated until J(x; λ) cannot decrease anymore. Let us detail an SBR iteration.
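As a rough illustration of this loop, here is a brute-force sketch (it uses the simpler cost K_Q(λ) of (2.9), i.e., the variant of subsection 2.3.3, and the hypothetical helpers from the sketch above; the efficient recursive implementation is the topic of Section 2.4):

```python
def sbr(y, A, lam):
    """Single Best Replacement: greedy descent on K_Q(lambda) over supports Q."""
    m, n = A.shape
    Q, best = set(), K_cost(y, A, set(), lam)
    while True:
        # Test the n single replacements Q . i and keep the best one.
        candidates = []
        for i in range(n):
            Q_i = Q ^ {i}                       # insert i if absent, remove it otherwise
            if len(Q_i) <= min(m, n):
                candidates.append((K_cost(y, A, Q_i, lam), Q_i))
        new_cost, new_Q = min(candidates, key=lambda c: c[0])
        if new_cost >= best:                    # no single replacement decreases the cost
            x_Q, _ = restricted_least_squares(y, A, Q)
            return x_Q, Q
        best, Q = new_cost, new_Q
```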

Behavior and adaptations of SBR

Termination: SBR is a descent algorithm because the value of K_Q(λ) is always decreasing. Consequently, a set Q cannot be explored twice and, similarly to SMLR, SBR terminates after a finite number of iterations. Notice that the size of Q remains lower than or equal to min(m, n). Indeed, if a set Q of cardinality min(m, n) is reached, then E_Q is equal to 0 due to the URP assumption. Hence, any replacement of the form Q′ = Q ∪ {i} yields an increase of the cost function (K_{Q′}(λ) = K_Q(λ) + λ). We emphasize that no stopping condition is needed, unlike many algorithms which require setting a maximum number of iterations (MP and variations, OLS) and/or a threshold on the squared error variation (CoSaMP, Iterative Hard Thresholding).
Proposition 1. Under the assumptions of Theorem 2, each SBR iterate x_Q is almost surely a local minimizer of J(x; λ) and of the ℓ0-constrained problem min E(x) subject to ‖x‖₀ ≤ k, with k = Card[Q]. This property holds in particular for the SBR output.
Proof. Let x = x_Q be an SBR iterate. According to Theorem 2, the support S(x) is almost surely equal to Q. Setting ε = min_{i∈Q} |x_i| > 0, it is easy to check that if x′ ∈ ℝⁿ satisfies ‖x′ − x‖ < ε, then x′_i ≠ 0 for all i ∈ Q. Thus, ‖x′ − x‖ < ε implies that S(x′) ⊇ S(x), and then ‖x′‖₀ ≥ k.
Now, let x′ satisfy ‖x′ − x‖ < ε.
– If ‖x′‖₀ = k, then necessarily S(x′) = S(x) = Q. By definition of x = x_Q, E(x′) ≥ E(x). It follows that x is a local minimizer of the ℓ0-constrained problem. Also, J(x′; λ) ≥ J(x; λ) holds.


Recursive implementation

At each SBR iteration, n least-square problems of the form (2.13) must be solved, each requiring the inversion of the Gram matrix G_Q ≜ A_Q^T A_Q of size k × k. The computational cost can be high since, in the general case, a matrix inversion costs O(k³) scalar operations. Following an idea widespread in the subset selection literature, we propose to solve (2.13) in a recursive manner.
A first possibility is to use the Gram-Schmidt procedure [23], [42], which yields an orthogonal decomposition A_Q = WU, where W is an m × k matrix with orthogonal columns and U is a k × k upper triangular matrix. Although it yields an efficient updating strategy when an index is included into the active set (leading to the update of A_{Q′} = [A_Q, a_i]), the Gram-Schmidt procedure does not extend with the same level of efficiency when an index removal is considered [34].
An alternative is to use the block matrix inversion lemma [43], which allows an efficient update of G_Q⁻¹ for both index insertion and removal. An efficient SMLR implementation is proposed in [13], based on the recursive update of matrices of the form (G_Q + (σ_n²/σ_x²) I_k)⁻¹. This approach can easily be adapted to SBR, where the matrix to update is G_Q⁻¹ (see also [44] for the downdate step). However, we have observed numerical instabilities when the selected columns of A are highly correlated.
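To illustrate the insertion case of this update (a sketch in my own notation, not the thesis implementation; the removal/downdate is handled analogously), note that the Schur complement δ below tends to zero when the new column is nearly collinear with the already selected ones, which is precisely where the instabilities mentioned above appear:

```python
import numpy as np

def gram_inverse_insert(G_inv, A_Q, a_i):
    """Update (A_Q^T A_Q)^{-1} when column a_i is appended to A_Q,
    using the block matrix inversion lemma (new index placed last)."""
    b = A_Q.T @ a_i                       # cross-correlations with the active columns
    u = G_inv @ b
    delta = a_i @ a_i - b @ u             # Schur complement; ill-conditioned if close to 0
    k = G_inv.shape[0]
    G_new = np.empty((k + 1, k + 1))
    G_new[:k, :k] = G_inv + np.outer(u, u) / delta
    G_new[:k, k] = -u / delta
    G_new[k, :k] = -u / delta
    G_new[k, k] = 1.0 / delta
    return G_new
```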
A more stable solution is based on the Cholesky factorization G_Q = L_Q L_Q^T, where L_Q is a lower triangular matrix. Updating L_Q is advantageous since it is better conditioned than G_Q⁻¹. This update can easily be done in the insertion case [45]. It is less easy for removals, since they break the triangular structure of L_Q. A recursive update of the Cholesky factor of G_Q⁻¹ was recently proposed [46]. Here, we introduce a simpler strategy relying on the factorization of G_Q.
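For the insertion case, the Cholesky update reduces to a single triangular solve, as in the following sketch (function name mine; SciPy is assumed only for the triangular solve):

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_insert(L_Q, A_Q, a_i):
    """Update the Cholesky factor L_Q of G_Q = A_Q^T A_Q when column a_i
    is appended (new index placed last)."""
    b = A_Q.T @ a_i
    w = solve_triangular(L_Q, b, lower=True)     # solve L_Q w = b
    d = np.sqrt(a_i @ a_i - w @ w)               # new diagonal entry
    k = L_Q.shape[0]
    L_new = np.zeros((k + 1, k + 1))
    L_new[:k, :k] = L_Q
    L_new[k, :k] = w
    L_new[k, k] = d
    return L_new
```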

Table of contents:

1 General introduction
1.1 Application context and objectives
1.1.1 General context
1.1.2 AFM microscopy
1.1.3 Project objectives
1.2 Data processing and algorithmic needs
1.2.1 Processing of force curves and force-volume images
1.2.2 Formulation of signal smoothing
1.2.3 Separation of delayed sparse sources
1.3 Main contributions
1.3.1 Sparse approximation by minimization of a mixed ℓ2-ℓ0 criterion
1.3.2 Continuation algorithm
1.3.3 Joint detection of discontinuities
1.3.4 Separation of delayed sparse sources
1.3.5 Segmentation of a force curve and estimation of physical parameters
1.4 Organization of the remaining chapters of the thesis
2 From Bernoulli-Gaussian deconvolution to sparse signal restoration 
2.1 Introduction
2.2 From Bernoulli-Gaussian signal modeling to sparse signal representation
2.2.1 Preliminary definitions and working assumptions
2.2.2 Bernoulli-Gaussian models
2.2.3 Bayesian formulation of sparse signal restoration
2.2.4 Mixed ℓ2-ℓ0 minimization as a limit case
2.3 Adaptation of SMLR to ℓ0-penalized least-square optimization
2.3.1 Principle of SMLR and main notations
2.3.2 The Single Best Replacement algorithm (first version)
2.3.3 Modified version of SBR (final version)
2.3.4 Behavior and adaptations of SBR
2.4 Implementation issues
2.4.1 Basic implementation
2.4.2 Recursive implementation
2.4.3 Efficient strategy based on the Cholesky factorization
2.4.4 Memory requirements and computation burden
2.5 Deconvolution of a sparse signal with a Gaussian impulse response
2.5.1 Dictionary and simulated data
2.5.2 Separation of two close Gaussian features
2.5.3 Behavior of SBR for noisy data
2.6 Joint detection of discontinuities at different orders in a signal
2.6.1 Approximation of a spline of degree p
2.6.2 Approximation of a piecewise polynomial of maximum degree P
2.6.3 Adaptation of SBR
2.6.4 Numerical simulations
2.6.5 AFM data processing
2.6.6 Discussion
2.7 Conclusion
3 A continuation algorithm for ℓ0-penalized least-square optimization
3.1 Introduction
3.2 Mixed ℓ2-ℓ0 optimization
3.2.1 Definitions and working assumptions
3.2.2 Solution path
3.3 Continuation algorithm
3.3.1 Single Best Replacement
3.3.2 Recursive computation of
3.3.3 CSBR algorithm
3.4 Numerical simulations and real data processing
3.4.1 Discontinuity detection
3.4.2 Sparse spike train deconvolution
3.4.3 Sensitivity to the mutual coherence of the dictionary
3.5 Conclusion
4 Automated force volume image processing for biological samples 
4.1 Introduction
4.2 Theory and physical models
4.2.1 Physical models relevant for the approach curves
4.2.2 Physical models relevant for the retraction curves
4.3 Algorithms
4.3.1 Force curve segmentation
4.3.2 Fitting of the approach curves
4.3.3 Fitting of the retraction curves
4.3.4 Processing of a force-volume image
4.4 Results and Discussion
4.4.1 Bacterial culture and sample preparation
4.4.2 AFM measurements and preparation of experiments
4.4.3 Data processing for the approach force curve
4.4.4 Data processing for the retraction force curve
4.4.5 Discussion on the performance of the algorithm for force curve analysis
4.5 Conclusion
5 Separate sparse sources from a large number of shifted mixtures
5.1 Introduction and motivation
5.2 Sparse source separation problem
5.2.1 Different kinds of mixing models
5.2.2 Separation of instantaneous mixture of sparse sources
5.2.3 Degenerate Unmixing Estimation Technique (DUET)
5.2.4 Enlightening points of existing algorithms
5.3 Identifiability
5.3.1 Modeling of sparse signals
5.3.2 Ambiguities of source separation problem
5.3.3 Definition of eigen interval
5.3.4 Sufficient condition for identifiability
5.3.5 Extension of sufficient condition
5.4 Algorithm to separate sparse spike trains
5.4.1 Separate sparse spike trains in noise-free setting
5.4.2 Extension to noisy setting
5.5 Experiments
5.5.1 Effect of bin width of histogram
5.5.2 Effect of perturbation on locations of spikes
5.5.3 Effect of thresholding the histograms
5.5.4 AFM data processing
5.6 Conclusion
Conclusion

