Generating orthogonal matrices with prescribed leverage scores

Get Complete Project Material File(s) Now! »

subsampling problems with linear structure

In order to situate the contributions of this thesis, a brief review of randomized subsampling techniques for problems with a linear structure is given in the following. We start by the problem of linear regression under constraint on the number of observations. Consider a matrix X 2 RNd that contains the d features of N observations, with N d, and assume the rank of X is equal to d. A recurrent task in data analysis and statistics is linear regression, which consists in looking for the minimum-length vector w 2 Rd among the vectors minimizing the residual ky 􀀀 Xwk2, (1.1) for a given vector of labels y 2 RN. It is well known that the solution is given by w = X+y, where X+ is the Moore-Penrose pseudo-inverse of X and the optimal residual writes.

determinantal sampling

As we have seen in the previous section, independent sampling was used in a variety of approximation tasks that have linear structure. These sub-sampling methods rely on first-order information such as the leverage score distribution. Now we will review another approach of sampling that relies on high order information. This is achieved by determinantal point processes and their variants. Indeed, this class of distributions defines the most natural extension of leverage score sampling with negative correlation property.
We start again with the problem of linear regression under budget constraint. Remember that, for this problem, it is required to estimate X+y using a subset of [N]. Now, if X is of full rank, the Moore-Penrose pseudo inverse has a determinantal representation (Ben-Tal and Teboulle, 1990) where IN,d = fI [N], jIj = dg, DI = Det2 XI,: and D = åI2IN,d DI . In other words, the solution of the regression problem defined for the pair (X, y) is a mixture of the solutions of the smaller regression problems defined by the pairs (XI,:, yI), and the weight is proportional to Det2 XI,: that is, the volume spanned by the rows of XI,: in the space Rd. In particular, singular sub-matrices XI,: do not participate in this mixture. Obviously, this determinantal representation cannot be used in this raw form. Indeed, an exhaustive enumeration of IN,d would cost much more than calculating X+y. Yet, its structure is amenable to a probabilistic interpretation. Indeed, if we consider a random element S of IN,d such that 8I 2 IN,d, P(S = I) = DI/D,.

point processes

Intuitively, a point process is a random discrete subset of points in some measurable space (D, B). In order to define this object rigorously, it is more convenient to see a discrete subset of D as an atomic measure defined on D as illustrated in Figure 2.1. Indeed, as we shall see, under some conditions on D, the set of measures on D have nice properties that are compatible with the standard setting of probability theory. Polish spaces are recurrently used in probability theory. They define a large class of topological spaces that includes any countable set equipped with the discrete topology, closed subspaces of Rd endowed with the usual topology induced by Euclidean norm and many more. This abstract class seems to be convenient to give a unified definition of DPPs. Definition 2.1. Let (D, T ) be a topological space. (D, T ) is said to be a Polish space, if there exists a metric d on D such that
(D, d) is a complete metric space,
(D, d) is separable: there exists a countable subset fxn; n 2 Ng D dense in D, and the topology induced by d is equal to T . We say that a Polish space is a separable completely metrizable topological space. The requirements of a Polish space allow to work with the classical concepts of probability theory: the existence of conditional distributions, the definition of weak convergence, the regularity of probability measures…

READ A combinatorial description of the centralizer algebras connected to the Links-Gould Invariant

Discrete subsets as counting measures

Once we have defined the topological space, we move to defining the corresponding Borel s-algebra B: the smallest s-algebra in D that contains all open subsets of D. Its elements are called Borel sets. A measure m is a function from B to R+ [ f¥g that is non-negative, s-additive and vanishes at the empty set. A locally finite measure is a measure m on (D, B) such that for every B 2 B that is relatively compact (its closure is compact) we have m(B) < +¥.

Table of contents :

1 introduction
1.1 Subsampling problems with linear structure
1.2 Determinantal sampling
2 determinantal point processes
2.1 Point processes
2.2 Determinantal point processes
3 column subset selection using projection dpps
3.1 Notation
3.2 Related work
3.3 The proposed algorithm
3.4 Main results
3.5 Numerical experiments
3.6 Discussion
3.7 Proofs
Appendices
3.A Principal angles and the Cosine Sine decomposition
3.B Majorization and Schur convexity
3.C Another interpretation of the k-leverage scores
3.D Generating orthogonal matrices with prescribed leverage scores
4 kernel quadrature using dpps
4.1 Related work
4.2 The kernel quadrature framework
4.3 Projection DPP for optimal kernel quadrature
4.4 Numerical simulations
4.5 Discussion
4.6 Proofs
5 kernel interpolation using volume sampling
5.1 Introduction
5.2 Three topics on kernel interpolation
5.3 Volume Sampling and DPPs
5.4 Main Results
5.5 Sketch of the Proofs
5.6 Numerical Simulations
5.7 Discussion
5.8 Proofs
6 conclusion and perspectives
6.1 Conclusion
6.2 Perspectives
R´esum´e en franc¸ais
Notations