Cross Validation and Maximum Likelihood with misspecified family of covariance functions


Classical asymptotic results for parametric estimation

To define an asymptotic framework, it is first necessary to let the size $n$ of the observation vector vary. However, we shall keep in mind that the parametric family of distributions of definition 3.1 depends on the number of observations $n$. It is hence necessary to parameterize all these different distributions on $\mathbb{R}^n$, for $n$ varying, by a parameter $\theta$ independent of $n$.
In subsection 3.2.1, we will see that the distribution of a random vector $y$, of size $n$, coming from a centered Gaussian process $Y$ at $x^{(1)}, \ldots, x^{(n)} \in D$ can be parameterized independently of $n$, by a parameter $\theta$ characterizing the covariance function of $Y$. In this subsection 3.1.2, we will consider the case where $y$ is composed of $n$ iid random variables, so that $\theta$ is a parameter for their common distribution and is hence independent of $n$.
In view of the discussion above, we will consider the framework of definition 3.10 in this subsection 3.1.2.
Definition 3.10. Let $y$ be a random vector of size $n$, with $n \in \mathbb{N}$ varying. Assume that the components $y_1, \ldots, y_n$ of $y$ are iid. A parametric family of iid distributions for $y$ is a parametric family of distributions on $\mathbb{R}$, defined by $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$, where $P_\theta$ is a probability distribution on $\mathbb{R}$ and $\Theta$ is a subset of $\mathbb{R}^p$.
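As an illustration of this iid framework, the following minimal sketch takes the Gaussian family on $\mathbb{R}$ with $\theta = (\text{mean}, \text{variance})$, so that $p = 2$, and computes the Maximum Likelihood estimate for increasing $n$. The choice of family, the true value `theta_true` and the helper name `gaussian_mle` are assumptions made for the example and do not come from the text.

```python
import numpy as np

def gaussian_mle(y):
    """Maximum Likelihood estimate of theta = (mean, variance) for iid Gaussian data."""
    mean_hat = y.mean()
    var_hat = ((y - mean_hat) ** 2).mean()  # ML estimate divides by n, not n - 1
    return np.array([mean_hat, var_hat])

rng = np.random.default_rng(0)
theta_true = np.array([1.0, 2.0])  # hypothetical true parameter (mean, variance)

# The same parameter theta indexes the distribution of y for every size n.
errors = {}
for n in (100, 10_000, 1_000_000):
    y = rng.normal(theta_true[0], np.sqrt(theta_true[1]), size=n)
    errors[n] = np.abs(gaussian_mle(y) - theta_true).max()
    print(n, errors[n])
```

The point of the sketch is only that a single parameter, independent of $n$, describes the distribution of the observation vector for every $n$, which is what makes statements about the behavior of an estimator as $n$ grows meaningful.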

Estimation of the covariance function for Gaussian processes

In this section, we present the parametric estimation of the covariance function of a Gaussian process Y , from an observation vector y. In subsection 3.2.1, we detail the framework for the covariance function estimation. In subsections 3.2.2 and 3.2.3, we present the Maximum Likelihood (ML) and Cross Validation (CV) estimators. In subsection 3.2.4, we provide the explicit gradients of the criteria for the ML and CV estimators. Finally, in subsection 3.2.5, we discuss the rather open problem of taking the covariance function estimation error into account in the Kriging predictions.

Parametric estimation of the covariance function

As discussed in section 3.1, it is unreasonable to consider all possible covariance functions as possible candidates for the Gaussian process at hand. Hence, similarly to definition 3.1, it is classical to assume a parametric family for the covariance function of a Gaussian process $Y$. Furthermore, in the present manuscript, we especially study the classical case of a family of stationary covariance functions. These two remarks motivate the following definition of a parametric family of stationary covariance functions.
Definition 3.18. A parametric family of stationary covariance functions is of the form $\{K_\theta, \theta \in \Theta\}$.

The challenge of taking into account the uncertainty on the covariance function

The Kriging equations of subsection 2.2.2 assume that the covariance function of the Gaussian process $Y$ is known. In practice, this function is estimated beforehand, yielding plug-in ([Ste99], chapter 6.8) prediction equations. The plug-in approach does not take into account the randomness of the covariance function estimator. Let $\hat{\theta}$ be an estimator of the covariance hyper-parameter $\theta$ that verifies, for any $n \times 1$ vector $v$ and any $m \times 1$ vector $w$, $\hat{\theta}(v) = \hat{\theta}(v + Hw)$, with $H$ the regression matrix. In the simple Kriging case, this conventionally adds no condition on the estimator $\hat{\theta}$. In [Ste99] p.201, the estimator $\hat{\theta}$ is said to depend only on the contrasts of $y$. Note that all the estimators studied in the manuscript do depend only on the contrasts of $y$. Indeed, for example, the likelihood criterion in proposition 3.22 is written as a function of $y - H\hat{\beta}(y)$, with $\hat{\beta}(y + Hv) = \hat{\beta}(y) + v$. Similarly, for CV in (3.16), the LOO criterion is written $y^t M y$, with $M$ a matrix so that $MH = 0$. It is shown in [ZC92], and discussed in [Ste99] p.201, that, in this case, the estimator $\hat{\theta}$ is independent of the prediction error $\hat{y}_{\theta^{(0)},0} - y_0$ at a new point $x^{(0)}$, computed with the true covariance hyper-parameter $\theta^{(0)}$. As a result, we have that $\hat{y}_{\theta^{(0)},0} - y_0$ is independent of $\hat{y}_{\theta^{(0)},0} - \hat{y}_{\hat{\theta},0}$. Hence $E((\hat{y}_{\hat{\theta},0} - y_0)^2) = E((\hat{y}_{\theta^{(0)},0} - y_0)^2) + E((\hat{y}_{\theta^{(0)},0} - \hat{y}_{\hat{\theta},0})^2)$.
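The step from independence to the decomposition of the mean square error can be spelled out as follows. This is a sketch, writing $\theta^{(0)}$ for the true covariance hyper-parameter and $\hat{\theta}$ for its estimator, and assuming that both terms have finite second moments and that the Kriging prediction error under the true covariance function has zero mean (unbiasedness of the Kriging predictor).

```latex
\begin{align*}
a &= \hat{y}_{\theta^{(0)},0} - y_0,
\qquad
b = \hat{y}_{\hat{\theta},0} - \hat{y}_{\theta^{(0)},0},
\qquad
\hat{y}_{\hat{\theta},0} - y_0 = a + b, \\
E\big((a+b)^2\big) &= E(a^2) + 2\,E(ab) + E(b^2), \\
E(ab) &= E(a)\,E(b) = 0
\quad \text{(independence of $a$ and $b$, and $E(a) = 0$)}, \\
E\big((\hat{y}_{\hat{\theta},0} - y_0)^2\big)
&= E\big((\hat{y}_{\theta^{(0)},0} - y_0)^2\big)
 + E\big((\hat{y}_{\theta^{(0)},0} - \hat{y}_{\hat{\theta},0})^2\big).
\end{align*}
```

The second term is nonnegative, so under these assumptions the plug-in predictor has a mean square error at least as large as that of the predictor using the true hyper-parameter.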

Asymptotic results for prediction with fixed covariance function

Consider the fixed-domain asymptotic framework and consider a fixed point $x \in D$. The goal of this subsection 4.2.1 is to answer the question: when $n \to +\infty$, does the prediction error of $Y(x)$ given $y_1, \ldots, y_n$ at $x^{(1)}, \ldots, x^{(n)}$ go to zero?
Note that Kriging predictions are not expected to be consistent in the increasing-domain asymptotic framework. Indeed, the interpoint distance is bounded away from zero when $n \to +\infty$, so that most of the points in the prediction domain remain isolated.
Naturally, in the fixed-domain asymptotic framework, it is desirable that Kriging predictions are consistent. Indeed, when predicting a continuous function on a fixed bounded domain $D$, many simple approximation methods have their prediction error vanishing when the number of observations goes to $+\infty$.
We will first answer the question when $Y$ is a Gaussian process with known mean structure and covariance function. This corresponds to the question of the consistency of Kriging, when the Gaussian process assumption is correct and the mean structure and the covariance function are well-specified.
Second, we will consider the question when the observations stem from a deterministic continuous function $f$, which is modeled as a trajectory of a Gaussian process $Y$ with fixed mean structure and covariance function. This second case can include the case of a misspecification of the mean structure or covariance function of the Gaussian process $Y$, when this Gaussian process does yield continuous trajectories. This second question corresponds to the robustness of Kriging in the case where the Gaussian process assumption is wrong. This has an important practical influence, since Kriging models are often applied, for instance, to approximate deterministic computer models. We will review some results in the literature, but we will also see that this question is not fully solved yet, to the best of our knowledge.
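The first question (known, well-specified covariance function) can be illustrated numerically. The sketch below computes the simple Kriging predictive variance at a fixed point of $D = [0, 1]$ for nested regular grids of increasing size. The exponential covariance function, its parameter values, the prediction point `x0` and the function names are all assumptions chosen for the example, not taken from the text.

```python
import numpy as np

def exponential_cov(h, sigma2=1.0, ell=0.5):
    """Assumed stationary covariance: sigma2 * exp(-|h| / ell)."""
    return sigma2 * np.exp(-np.abs(h) / ell)

def simple_kriging_variance(x_obs, x0):
    """Predictive variance of Y(x0) given Y at x_obs, zero mean, known covariance."""
    K = exponential_cov(x_obs[:, None] - x_obs[None, :])  # covariance of observations
    k0 = exponential_cov(x_obs - x0)                      # covariances with Y(x0)
    return exponential_cov(0.0) - k0 @ np.linalg.solve(K, k0)

x0 = 1.0 / np.sqrt(2.0)                     # fixed prediction point, never on the grid
sizes = [2 ** k + 1 for k in range(1, 7)]   # nested grids: 3, 5, 9, ..., 65 points
variances = [simple_kriging_variance(np.linspace(0.0, 1.0, n), x0) for n in sizes]

for n, v in zip(sizes, variances):
    print(n, v)
```

Because the grids are nested, each design contains the previous one, so the predictive variance cannot increase from one size to the next. In this fixed-domain setting it shrinks as the observation points fill the domain, which is the behavior the consistency question is about. The sketch says nothing about the second question, where the data come from a deterministic function rather than from the assumed Gaussian process.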


Table of contents :

1 Introduction
I Kriging models
2 Kriging models with known covariance function
2.1 Gaussian processes
2.1.1 Definition and properties of Gaussian processes
2.1.2 The relationship between the covariance function and the trajectories of a Gaussian process
2.2 Prediction and conditional simulation for Gaussian processes
2.2.1 Ordinary, simple and universal Kriging models
2.2.2 Point-wise prediction
2.2.3 Conditional simulation of Gaussian processes
2.2.4 Cross Validation formulas
2.2.5 Alternative RKHS formulation
3 Covariance function estimation for Kriging models
3.1 Introduction to parametric estimation
3.1.1 Definition and properties for parametric estimation
3.1.2 Classical asymptotic results for parametric estimation
3.2 Estimation of the covariance function for Gaussian processes
3.2.1 Parametric estimation of the covariance function
3.2.2 Maximum Likelihood for estimation
3.2.3 Cross Validation for estimation
3.2.4 Gradients of the different criteria
3.2.5 The challenge of taking into account the uncertainty on the covariance function
4 Asymptotic results for Kriging
4.1 Two asymptotic frameworks
4.2 Asymptotic results for prediction with fixed covariance function
4.2.1 Consistency
4.2.2 Asymptotic influence of a misspecified covariance function
4.3 Asymptotic results for Maximum Likelihood
4.3.1 Expansion-domain asymptotic results
4.3.2 Fixed-domain asymptotic results
II Cross Validation and Maximum Likelihood for covariance hyperparameter estimation
5 Cross Validation and Maximum Likelihood with well-specified family of covariance functions
5.1 Introduction
5.2 Expansion-domain asymptotic framework with randomly perturbed regular grid
5.3 Consistency and asymptotic normality for Maximum Likelihood and Cross Validation
5.3.1 Consistency and asymptotic normality
5.3.2 Closed form expressions of the asymptotic variances in dimension one
5.4 Study of the asymptotic variance
5.4.1 Small random perturbations
5.4.2 Large random perturbations
5.4.3 Estimating both the correlation length and the smoothness parameter
5.4.4 Discussion
5.5 Analysis of the Kriging prediction
5.5.1 Asymptotic influence of covariance hyper-parameter misspecification on prediction
5.5.2 Influence of covariance hyper-parameter estimation on prediction
5.5.3 Analysis of the impact of the spatial sampling on the Kriging prediction
5.6 Conclusion
5.7 Proofs
5.7.1 Proofs for subsection 5.3.1
5.7.2 Proofs for subsection 5.3.2
5.7.3 Proofs for section 5.5
6 Cross Validation and Maximum Likelihood with misspecified family of covariance functions
6.1 Introduction
6.2 Estimation of a single variance parameter
6.2.1 Theoretical framework
6.2.2 Numerical results
6.3 Estimation of variance and correlation hyper-parameters
6.3.1 Procedure
6.3.2 Results and discussion
6.4 Discussion
III Applications to Uncertainty Quantification for Computer Experiments
7 Probabilistic modeling of discrepancy between computer model and experiments
7.1 Framework for computer models and experiments
7.2 Errors modeled by a variability of the physical system
7.2.1 The general probabilistic model
7.2.2 Non-linear methods
7.2.3 Methods based on a linearization of the computer model
7.3 Errors modeled by a model error process
7.3.1 The general probabilistic model
7.3.2 Non-linear methods
7.3.3 Methods based on a linearization of the computer model
8 Calibration and improved prediction of the thermal-hydraulic code FLICA
8.1 Presentation of FLICA 4 and of the experimental results
8.1.1 The thermal-hydraulic code FLICA 4
8.1.2 The experimental results
8.2 Description of the procedure for the Gaussian process modeling
8.2.1 Objectives for the universal Kriging procedure
8.2.2 Exponential, Matérn and Gaussian covariance functions considered
8.2.3 K-folds Cross Validation for Kriging model validation
8.3 Results
8.3.1 Results in the isothermal regime
8.3.2 Results in the single-phase regime
8.3.3 Influence of the linear approximation
9 Kriging meta-modeling of the GERMINAL computer model
9.1 Introduction
9.2 Presentation and context for the GERMINAL computer model
9.2.1 A nuclear reactor core design problem
9.2.2 Inputs and outputs considered
9.2.3 Setting for the Kriging model
9.3 Results of the Kriging model
9.3.1 Interpretation of the estimated covariance hyper-parameters
9.3.2 Prediction results
9.3.3 Detection of computation failures for the « Fusion_Margin » output
10 Conclusion and perspectives
A Notation
B Reference