## Quantitative and qualitative model-based approaches

From a model-based viewpoint, a fault can be detected if a reference or model of the nominal behaviour of the monitored system is available. A model is commonly a knowledge representation; the knowledge can be quantitative (numerical) or qualitative (symbolic) in nature. Accordingly, the model can be mathematical, empirical, graphical, symbolic, explicit or implicit. The variables observed from the system, consisting of sensor outputs, actuator inputs and process data, are basic representations of the system state. Modelling the fault-free system behaviour amounts to extracting relationships among the observed variables that hold in normal operating conditions and change when a fault occurs. The quantitative model-based approach expresses the relationships describing the system behaviour as mathematical functions. The qualitative model, in contrast, expresses these relationships in terms of qualitative functions and heuristic symptoms. The quantitative approach uses analytical information to build the model, while the qualitative approach exploits as much knowledge about the process as possible and can easily handle qualitative information [6].

### Quantitative model types

The behaviour of processes is governed by the laws of physics. The quantitative approach aims at describing either the system's governing laws or the overall process behaviour (dynamics) using mathematical equations and functions of the observed variables. The challenge is to build a model that is as accurate, robust and complete as possible with minimal modelling effort. According to the amount of information available to build such a mathematical model, three classes of mathematical models can be distinguished [5]:

• White-box models: These are physical models based on a deep and precise understanding of the process physics. Sufficient information about the relationships among process variables and the functions describing their variations is available, yielding a comprehensive model of the monitored system. First-principles models use established laws of physics to write equations relating the process variables, without fitting the model to experimental data. A state-space model with known dynamics and real physical parameters is an example. Advantages: all the dynamics are known accurately. Shortcomings: they may yield complex models, as systems are often multi-physics with coupled and non-linear phenomena.

• Black-box (experimental) models: This is the case of systems with unknown dynamics, hence providing little prior knowledge. A general input-output model aiming only to reproduce the system's output response to changes in its inputs can be used. Common structures for a linear input-output model are the FIR, ARX and ARMAX models. Neural-network models are non-linear black-box models which, in practice, may achieve higher accuracy than linear models. Black-box models have flexible structures whose parameters do not necessarily carry physical meaning. The model is identified via system identification methods; a least-squares algorithm, for example, can be used to identify an ARX model from observed data and any available knowledge of the approximate orders and pure time delay.

Advantages: The identification step can be tuned to obtain simple models that fit the objectives (control, diagnosis, optimisation, etc.).

Shortcomings: They require an experimental test bench and the capability of injecting different excitation signals to cover all the system dynamics. Black-box models usually have a limited domain of validity.
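The white-box case above can be sketched in a few lines. The following is a minimal illustration, not drawn from the thesis: a discrete-time state-space model x[k+1] = A x[k] + B u[k], y[k] = C x[k], whose matrices encode hypothetical first-principles parameters (a unit mass with viscous damping, Euler-discretised).

```python
import numpy as np

# Hypothetical physical parameters: a unit mass with viscous damping c,
# discretised with sample time dt (Euler forward).
m, c, dt = 1.0, 0.5, 0.01
A = np.array([[1.0, dt],
              [0.0, 1.0 - c * dt / m]])  # [position; velocity] dynamics
B = np.array([0.0, dt / m])              # force input
C = np.array([1.0, 0.0])                 # only the position is measured

def simulate(u, x0=np.zeros(2)):
    """Propagate x[k+1] = A x[k] + B u[k] and return y[k] = C x[k]."""
    x, y = np.array(x0, dtype=float), []
    for uk in u:
        y.append(float(C @ x))
        x = A @ x + B * uk
    return np.array(y)

y = simulate(np.ones(100))  # response to a constant unit force
```

Every matrix entry here carries a physical meaning (mass, damping, sample time), which is exactly what distinguishes the white-box case from the black-box one.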
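The black-box identification step mentioned above can also be sketched. The snippet below is a hedged illustration with hypothetical "true" parameters, used only to generate data: a first-order ARX model y[k] = -a·y[k-1] + b·u[k-1] + e[k] is identified by ordinary least squares from an input-output record.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true = -0.8, 0.5          # hypothetical "unknown" dynamics
N = 500
u = rng.standard_normal(N)          # persistent excitation signal
y = np.zeros(N)
for k in range(1, N):
    y[k] = -a_true * y[k - 1] + b_true * u[k - 1] + 0.01 * rng.standard_normal()

# Regressor matrix: row k is [-y[k-1], u[k-1]]; target is y[k].
Phi = np.column_stack([-y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta                # least-squares parameter estimates
```

The identified parameters a_hat and b_hat reproduce the input-output behaviour without carrying any physical meaning, and the model is only valid over the range of dynamics excited by u.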

#### Subspace projection methods for feature extraction

Principal Component Analysis (PCA) addresses the problem of fitting a low-dimensional linear subspace to a given set of measurements embedded in a higher-dimensional space. The data consist of measurements at N different times of physical variables, for example temperature, voltage or vibration, or of calculated variables (features), for example statistical moments. The time points play the role of observations. To solve this problem, PCA exploits linear correlations among variables. Statistically, the PCA problem consists in estimating the first principal components of a vector of random variables: the first principal components are uncorrelated linear combinations of the original variables that successively maximise the total variance of the data projection [73]. Geometrically, this is equivalent to minimising the sum of squared errors (Euclidean distances) between the observations and their projections. Both formulations lead to the eigenvalue decomposition (EVD) as a solution, such that the first principal components are given by the dominant eigenvectors of the data covariance matrix, i.e. the eigenvectors associated with the largest eigenvalues.

PCA hence performs an orthogonal partition of the original variable/feature space into two complementary lower-dimensional subspaces: the principal subspace, also called the signal or representation subspace, which contains the most significant variations in the data set, and the residual subspace, also called the null subspace, which represents noise, redundancy, outliers (atypical, small-probability observations) and significant errors.
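The EVD solution and the subspace partition can be sketched directly. The example below uses synthetic data (three correlated variables generated from two latent sources, a construction chosen purely for illustration): the covariance matrix is eigendecomposed, the dominant eigenvectors span the principal subspace and the remainder spans the residual subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1000
t_lat = rng.standard_normal((N, 2))                # 2 latent sources
W = np.array([[1.0, 0.0], [0.8, 0.2], [0.5, -0.5]])
X = t_lat @ W.T + 0.05 * rng.standard_normal((N, 3))  # 3 correlated variables

Xc = X - X.mean(axis=0)                            # centre the data
S = Xc.T @ Xc / (N - 1)                            # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)               # eigh returns ascending order
order = np.argsort(eigvals)[::-1]                  # re-sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

l = 2                                              # number of retained components
P = eigvecs[:, :l]        # principal (signal/representation) subspace
P_res = eigvecs[:, l:]    # residual (null) subspace: noise and redundancy
scores = Xc @ P           # projections onto the principal subspace
```

Because the data effectively live near a two-dimensional subspace, the third eigenvalue is of the order of the noise variance, which is why the residual subspace captures little more than noise.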

**KL divergence in model-based change detection**

The KL divergence has a particular place in model-based FDD, owing to its strong connection with the logarithm of the likelihood ratio. The log-likelihood ratio is itself a key concept in mathematical statistics and underpins the design of statistical decision functions between the 'no-fault' and 'fault' hypotheses. Statistical decision techniques, commonly Shewhart and moving-average control charts and the CUSUM and GLR (Generalised Likelihood Ratio) algorithms, were designed to detect abrupt changes in signal characteristics based on properties of the log-likelihood ratio. The basic types of changes are a deviation from the reference mean value and a scale increase in the distribution of the observed signals.

Assume we are given a finite observation set z1, z2, …, zN, and two possible hypotheses H0 and H1, referring respectively to the null hypothesis 'no fault (no change)' and the alternative hypothesis 'fault (change)'. Suppose further that the sample has a probability density or mass function h parametrised by θ, such that θ = θ0 under H0 and θ = θ1 under H1. The log-likelihood ratio is then

S(z1, …, zN) = ∑_{i=1}^{N} ln [ h(z_i; θ1) / h(z_i; θ0) ].
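As a hedged illustration (the means and variance below are hypothetical, chosen only for the example), consider the classical Gaussian mean-change case: under H0 the samples follow N(μ0, σ²) and under H1 they follow N(μ1, σ²). The cumulative log-likelihood ratio then has a simple closed form, drifting negative under H0 and positive under H1, which is what CUSUM-type detectors exploit.

```python
import numpy as np

mu0, mu1, sigma = 0.0, 1.0, 1.0  # hypothetical hypothesis parameters

def log_likelihood_ratio(z):
    """Cumulative LLR S_N = sum_i ln[h(z_i; theta1)/h(z_i; theta0)].
    For two Gaussians with equal variance this reduces term-by-term to
    (mu1 - mu0)/sigma**2 * (z_i - (mu0 + mu1)/2)."""
    z = np.asarray(z, dtype=float)
    return float(np.sum((mu1 - mu0) / sigma**2 * (z - (mu0 + mu1) / 2)))

rng = np.random.default_rng(2)
S_nofault = log_likelihood_ratio(rng.normal(mu0, sigma, 500))  # drifts negative
S_fault = log_likelihood_ratio(rng.normal(mu1, sigma, 500))    # drifts positive
```

The expected drift per sample is ∓(μ1 − μ0)²/(2σ²), i.e. exactly the KL divergence between the two Gaussian laws, which is the connection the text alludes to.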

**PCA for fault detection and diagnosis**

Fault detection and diagnosis using PCA has received considerable attention over the last two decades. The advantage of PCA over other fault diagnosis approaches (Chapter 1) is that building the reference model, against which the operating process is compared to generate decision functions about the system health state, requires neither detailed physical knowledge of the process nor extensive theoretical studies. The only information needed is a good historical database describing normal process operation. The basic idea of PCA was explained in Chapter 1, Section 1.3.2, where we also reviewed some of the most interesting PCA variants proposed in the literature to cope with the limitations of the classical linear version. We consider here, without loss of generality, the application of the KL divergence to the principal components generated with the classical PCA technique.

**PCA’s model identification**

Let us consider N measurements/observations of m correlated variables x1, x2, …, xm, collected from the process at different sampling instants i. They are arranged as the rows of a matrix X[N×m], X = [x(1), · · · , x(i), · · · , x(N)]′, where x(i) ∈ Rm is the ith measurement of the m variables. The data are recorded when the process is in a state of control (healthy operating mode), so that the PCA model built from them acts as a reference of the normal process behaviour. The analysis can be made on either the covariance or the correlation matrix of the data [39].

It is often wise to work with the correlation matrix, which normalises the variances of the variables before applying PCA, so that the different underlying relationships between variables can be extracted. The covariance matrix has, however, advantages over the correlation matrix, especially when PCA is used as an inferential and not merely a descriptive tool: statistical inferences about the principal component population are easier to derive for the covariance matrix than for the correlation one [40]. Let X̄ denote the autoscaled matrix of X; the columns of X̄ are either centred, or centred and reduced, depending on whether the covariance or the correlation matrix is used. PCA transforms the data matrix X̄ into a new matrix T[N×m] of uncorrelated variables t1, t2, …, tm termed principal component scores, T = [t(1), · · · , t(i), · · · , t(N)]′, where t(i) ∈ Rm. The new variables are linear combinations of the original ones and successively maximise the total data variance. T is obtained from X̄ by an orthogonal transformation whose vectors are the eigenvectors of the covariance/correlation matrix. These vectors, termed the loading vectors, are arranged as columns of a matrix P[m×m] in descending order of their corresponding eigenvalues.
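The identification step just described can be sketched end to end. The snippet below is an illustrative sketch on synthetic "healthy" data (four strongly correlated variables, a construction assumed here only for demonstration): each column is centred and reduced, the correlation matrix is eigendecomposed, the loading matrix P is sorted by descending eigenvalue, and the scores are T = X̄P.

```python
import numpy as np

rng = np.random.default_rng(3)
N, m = 500, 4
base = rng.standard_normal((N, 1))  # common factor driving all variables
# Four correlated variables: common factor plus small independent noise.
X = np.hstack([base + 0.1 * rng.standard_normal((N, 1)) for _ in range(m)])

# Centre and reduce each column (correlation-matrix analysis).
Xbar = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

R = Xbar.T @ Xbar / (N - 1)          # sample correlation matrix
eigvals, P = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]    # descending eigenvalue order
eigvals, P = eigvals[order], P[:, order]

T = Xbar @ P                         # principal component scores
```

By construction T′T/(N − 1) = P′RP = diag(λ1, …, λm), so the score variables are uncorrelated and their variances are the eigenvalues, with the first component absorbing most of the shared variation.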

**Table of contents :**

List of Figures

List of Tables

Nomenclature

General introduction

**1 Fault Detection and Diagnosis: State of the art**

1.1 Introduction

1.2 Diagnosis process

1.3 Quantitative and qualitative model-based approaches

1.3.1 The quantitative approach

1.3.1.1 Quantitative model types

1.3.1.2 Residual generation

1.3.1.3 Residual evaluation

1.3.2 The qualitative approach

1.3.2.1 Qualitative modelling

1.3.2.2 Diagnosis strategies

1.4 Data-driven approach

1.4.1 Background

1.4.2 Subspace projection methods for feature extraction

1.4.2.1 Principal Component Analysis

1.4.2.2 Partial Least Squares

1.4.2.3 Independent Component Analysis

1.4.2.4 Linear Discriminant Analysis

1.4.2.5 Other subspace methods

1.5 Synthesis

1.6 Problem statement

1.7 Conclusion

**2 Kullback-Leibler Divergence for Fault Detection**

2.1 Introduction

2.2 Motivations and Outline

2.3 Kullback-Leibler divergence

2.3.1 Definition

2.3.2 KL divergence in model-based change detection

2.3.3 Estimation

2.4 Detection performance evaluation

2.5 PCA for fault detection and diagnosis

2.5.1 PCA’s model identification

2.5.2 PCA-based fault detection

2.5.2.1 Distance-based statistics

2.5.2.2 Angular-based statistics

2.5.3 Application of KL divergence to the principal components

2.6 Evaluation on a numerical example

2.6.1 Fault detection procedure description

2.6.2 Detection results

2.6.2.1 Detection with common statistics

2.6.2.2 Detection with KL divergence

2.6.3 Detection performance evaluation with respect to small faults

2.6.3.1 The evaluation procedure

2.6.3.2 Evaluation results

2.7 Conclusion

**3 Kullback-Leibler Divergence for Fault Estimation**

3.1 Introduction

3.2 Assumptions

3.2.1 Data modelling

3.2.2 Fault modelling

3.2.3 Assumptions' consequence

3.3 Analytical model derivation

3.3.1 KL divergence expression

3.3.2 Covariance matrix case

3.3.3 Correlation matrix case

3.4 Fault estimation

3.4.1 Fault amplitude estimator

3.4.2 Probabilistic model for the fault amplitude estimate

3.5 Simulation results and discussions

3.5.1 Procedure description

3.5.2 Model validation

3.5.2.1 Divergence model

3.5.2.2 Fault amplitude estimation model

3.5.3 Estimation error

3.5.3.1 Impact of faulty sample length

3.5.3.2 Impact of the data size

3.6 Comparison to another estimation method

3.7 Conclusion

**4 Application to non-destructive inspection-based damage detection**

4.1 Introduction

4.2 Motivation and outline

4.3 Application to ECT-based damage detection

4.3.1 Eddy-current testing principle

4.3.2 Experimental test bed description

4.3.3 Data structure and preprocessing

4.3.4 Detection and diagnosis results

4.3.4.1 Crack detection results

4.3.4.2 Crack characterisation results

4.4 Application to vibration-based damage detection

4.4.1 Overview of bearings FDD

4.4.2 Problem statement and contribution

4.4.3 Method description

4.4.3.1 The global spectral analysis description

4.4.3.2 Discrimination of faults in the bearing balls using LDA

4.4.4 Experimental data

4.4.4.1 Experimental test bed description

4.4.4.2 Time-domain detection

4.4.4.3 Data preprocessing

4.4.5 Results with PCA

4.4.6 Results with LDA

4.5 Conclusion

**Bibliography**