Table of Contents
1 Introduction
1.1 Machine Learning
1.2 Supervised Machine Learning
1.3 Mathematical Framework
1.4 Analysis of Empirical Risk Minimization
1.5 Complexity Results in Convex Optimization
1.6 Stochastic Approximation
1.7 Online Convex Optimization
1.8 Digest of Least-Squares Regression
I Stochastic Approximation and Least-Squares Regression
2 Multistep Methods for Quadratic Optimization
2.1 Introduction
2.2 Second-Order Iterative Algorithms for Quadratic Functions
2.3 Convergence with Noiseless Gradients
2.4 Quadratic Optimization with Additive Noise
2.5 Experiments
2.6 Conclusion
Appendices
2.A Additional Experimental Results
2.B Proofs of Section 2.2
2.C Proofs of Section 2.3
2.D Proof of Theorem 3
2.E Lower Bounds
2.F Proofs of Section 2.4
2.G Comparison with Other Algorithms
2.H Lower Bound for Stochastic Optimization for Least-Squares
3 Optimal Convergence Rates for Least-Squares Regression through Stochastic Approximation
3.1 Introduction
3.2 Least-Squares Regression
3.3 Averaged Stochastic Gradient Descent
3.4 Accelerated Averaged Stochastic Gradient Descent
3.5 Tighter Dimension-Independent Convergence Rates
3.6 Experiments
3.7 Conclusion
Appendices
3.A Proofs of Section 3.3
3.B Proof of Theorem 7
3.C Convergence of Accelerated Averaged Stochastic Gradient Descent
3.D Technical Lemmas
4 Dual Averaging Algorithm for Composite Least-Squares Problems
4.1 Introduction
4.2 Dual Averaging Algorithm
4.3 Stochastic Convergence Results for Quadratic Functions
4.4 Parallel Between Dual Averaging and Mirror Descent
4.5 Experiments
4.6 Conclusion
Appendices
4.A Unambiguity of the Primal Iterate
4.B Proof of Convergence of Deterministic DA
4.C Proof of Proposition 9
4.D Proof of Proposition 10
4.E Lower Bound for Non-Strongly Convex Quadratic Regularization
4.F Lower Bound for Stochastic Approximation Problems
4.G Lower Bounds on the Rates of Convergence of DA and MD Algorithms
4.H Continuous-Time Interpretation of DA and MD
4.I Examples of Different Geometries
4.J Proof of Proposition
4.K Standard Benchmarks
II Applications of the Quadratic Loss in Machine Learning
5 Application to Discriminative Clustering
5.1 Introduction
5.2 Joint Dimension Reduction and Clustering
5.3 Regularization
5.4 Extension to Multiple Labels
5.5 Theoretical Analysis
5.6 Algorithms
5.7 Experiments
5.8 Conclusion
Appendices
5.A Joint Clustering and Dimension Reduction
5.B Full (Unsuccessful) Relaxation
5.C Equivalent Relaxation
5.D Auxiliary Results for Section 5.5.1
5.E Auxiliary Results for Sparse Extension
5.F Proof of Multi-Label Results
5.G Efficient Optimization Problem
6 Application to Isotonic Regression and Seriation Problems
6.1 Introduction
6.2 Problem Setup and Related Work
6.3 Main Results
6.4 Further Results in the Monotone Case
6.5 Discussion
Appendices
6.A Proof of the Upper Bounds
6.B Metric Entropy
6.C Proof of the Lower Bounds
6.D Matrices with Increasing Columns
6.E Upper Bounds in a Trivial Case
6.F Unimodal Regression
7 Conclusion and Future Work
7.1 Summary of the Thesis
7.2 Perspectives