Deep learning network explainability through feature analysis 


Description of the MSTAR dataset

The MSTAR public dataset was developed by the U.S. Defense Advanced Research Projects Agency (DARPA) and the U.S. Air Force Research Laboratory (AFRL) [12, 17]. This database was collected under Horizontal-Horizontal (HH) polarisation in X-band with a 30 cm × 30 cm resolution.


The database is composed of 10 different targets shown in Fig. 2.5 with their visual and SAR representations. The targets span several categories: a bulldozer (D7), a truck (ZIL), a rocket launcher (2S1), an air defence unit (ZSU), armoured personnel carriers (BRDM2, BTR60, BTR70, BMP2) and tanks (T62, T72). A particularity of the BMP and T72 targets is that several physical vehicles represent the same target. Moreover, for the T72, various variants are available as seen in Fig. 2.6, with fuel barrels, a skirt or reactive armour.

Reasons for creating a new dataset

Algorithms performing ATR rely on a training set to recognise targets in a testing set. To ensure a fair analysis of the performance of these algorithms and avoid any possible bias in the results, training and test images should be taken from independent sets of data.
It has been shown that the MSTAR dataset [12] contains a high degree of correlation between images in the training and testing sets due to the presence of correlated background [13, 25]. Indeed, it has been demonstrated that the recognition rates of algorithms tested on the MSTAR remain high even when the target to recognise is artificially hidden. Moreover, the data released to the public and included in the guidelines [18] contain only two targets (BMP, T72) with complete training and testing sets, and one, the BTR60, with only one sequence for each set. In order to evaluate the correlation in the MSTAR dataset, a simple ATR method is applied to full SAR images as well as to SAR images with the target hidden, and the results are compared.

Nearest neighbour classification

The training and testing sets are chosen according to Section 2.4.2. The nearest neighbour classification method consists of comparing the image to be classified in the testing set to every image in the training set. To be able to compare all the images, they are scaled to the same size. The chosen scale was 64 × 64 pixels, small enough for quick matching while retaining some precision. Images of a different size are rescaled using bicubic interpolation: each pixel without a directly assigned intensity receives as intensity a weighted average of the 4 × 4 neighbouring pixels with assigned values. Once both images are the same size, the global intensity distance between them is computed as in Eq. (2.16):

D(I1, I2) = ∑_{x=1}^{n} ∑_{y=1}^{m} |I1(x, y) − I2(x, y)|   (2.16)

where Ik is one of the two images to compare and Ik(x, y) is the intensity of Ik at row x and column y.
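The global intensity distance of Eq. (2.16) can be sketched as follows; this is a minimal illustration of the formula, not the thesis author's code, and the toy 2 × 2 arrays are purely illustrative.

```python
import numpy as np

def l1_distance(img1, img2):
    # Eq. (2.16): D(I1, I2) = sum over all rows x and columns y
    # of the absolute pixel-wise intensity differences.
    assert img1.shape == img2.shape, "images must be scaled to the same size first"
    return np.abs(img1.astype(np.float64) - img2.astype(np.float64)).sum()

# toy example on two 2 x 2 "images":
a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 0], [0, 4]])
# |1-1| + |2-0| + |3-0| + |4-4| = 5
print(l1_distance(a, b))  # -> 5.0
```

In practice the images would first be resampled to 64 × 64 pixels with bicubic interpolation, e.g. with an image-processing library, before this distance is evaluated.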
This distance is computed between each test image and all the available training images. The class of the test image is then chosen to be the class of the target in the training image with the lowest distance to the test image. The full images are classified, as well as images in which the central area covering the target is hidden as in Fig. 2.7, so that the target representation does not interfere in the classification. The target is hidden using a mask, a black square of 34 × 34 pixels placed in the middle of the image. The partially hidden images are classified using training images with the central area covered as well. Thus, two classification scores are computed for each dataset with SOC (3 and 10 targets) and EOC (4 targets with different depression angles, variants and configurations): one for the full images and one for the masked images.
Results of the nearest neighbour method on full SAR images and partially masked SAR images.
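The masking-and-classification procedure described above can be sketched as below. This is an illustrative reconstruction under the stated parameters (64 × 64 images, a central 34 × 34 black mask); the function names are hypothetical, not from the thesis.

```python
import numpy as np

def mask_centre(img, mask_size=34):
    """Hide the central target area with a black square, as in Fig. 2.7.
    mask_size=34 matches the 34 x 34 pixel mask described in the text."""
    out = img.astype(np.float64).copy()
    n, m = out.shape
    r0, c0 = (n - mask_size) // 2, (m - mask_size) // 2
    out[r0:r0 + mask_size, c0:c0 + mask_size] = 0.0
    return out

def nearest_neighbour_class(test_img, train_imgs, train_labels, masked=False):
    """Assign the class of the training image with the lowest global
    intensity distance (Eq. 2.16) to the test image."""
    test_img = test_img.astype(np.float64)
    if masked:  # hide the target in both test and training images
        test_img = mask_centre(test_img)
        train_imgs = [mask_centre(t) for t in train_imgs]
    dists = [np.abs(test_img - t.astype(np.float64)).sum() for t in train_imgs]
    return train_labels[int(np.argmin(dists))]
```

Running the same classifier twice, once with `masked=False` and once with `masked=True`, yields the two scores compared in the correlation study: a high masked score would indicate that the background, not the target, drives the classification.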

Description of the Military Ground Target Dataset (MGTD)

Configuration of the experiment The antenna is placed around 5 m away from the target on an adjustable-height tripod to allow measurements at depression angles between 11° and 13°. The detailed range for each sequence of measurements can be found in Appendix A. The target is on a rotating turntable and high range resolution profiles are measured at every predefined angular step. A single Horizontal-Horizontal polarisation is used. To avoid any ringing due to unwanted movements of the setup, a latency period is introduced after each rotation step to ensure the target is still before each measurement. The emitted signal is a stepped frequency waveform spanning a bandwidth of 5 GHz between 13 GHz and 18 GHz, with 4001 frequency points. The signal is generated and acquired using an Anritsu Vector Network Analyzer (VNA). A piece of Radiation Absorbent Material (RAM) is laid in front of the turntable (as shown in Fig. 2.9) in order to prevent unwanted multipath effects from the turntable.
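The standard stepped-frequency relations, which the thesis does not spell out, indicate what the quoted parameters imply for the range axis; the short computation below is an assumption-free application of the textbook formulas ΔR = c/2B and R_unamb = c/2Δf, not figures taken from the thesis.

```python
# Implied range parameters of the stepped-frequency waveform.
c = 3e8     # speed of light in m/s (approximation)
B = 5e9     # bandwidth: 13 GHz to 18 GHz
N = 4001    # number of frequency points

delta_R = c / (2 * B)        # theoretical range resolution
delta_f = B / (N - 1)        # frequency step between points
R_unamb = c / (2 * delta_f)  # unambiguous range of the profile

print(delta_R)  # -> 0.03 (m), i.e. a 3 cm range resolution
print(R_unamb)  # -> 120.0 (m), ample for a ~5 m stand-off
```

The 3 cm resolution is consistent with imaging 1.5-1.7 m scale models, and the 120 m unambiguous range comfortably exceeds the ~5 m antenna-target distance of the setup.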
Targets The presented database includes 3 classes. The image representation of each target depends on its signature, i.e., the way it backscatters the energy sent by the radar. The various classes are characterised by major signature changes. The first class is the T64 tank shown in Fig. 2.10 (b). The second class is the T72 tank shown in Fig. 2.10 (c). The third class is the BMP1 tank shown in Fig. 2.10 (a). All three targets are scale models made mainly of plastic with some metallic parts. The BMP1 model is 1.5 m long and the T64 and T72 models are 1.7 m long.
Environmental variables To avoid correlation between the training and testing sets in the MGTD, the images belonging to the training and testing sets are separated so that the sets of environmental conditions differ between training and test images. Each target also has training images with varied environmental conditions so that the algorithm can be more resilient. A sequence is defined as a group of images obtained from one single experiment. Images from a single sequence thus have identical environmental factors except for the target orientation. Environmental details on the sequences chosen for the training and testing sets are in Sections 2.6.2 and 2.6.2. A detailed description of all the sequences created can be found in Appendix A.
Orientation The MGTD dataset consists of target images taken every 5◦ starting from 0◦ for which the radar faces the front of the target. The training and testing sets are formed using independent image sequences collected under different environmental conditions.
Depression Angle The depression angles used to generate the dataset range from 21.8° to 23.4° for training and from 17.5° to 20.3° for testing. The depression angle is changed by adjusting the height of the antenna mount relative to the distance between the antenna and the target. When the depression angle is changed, the reflecting surfaces of the target effectively present a new orientation, which impacts the way the signal is backscattered. A change in the depression angle thus affects the target signature.
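The height-to-angle relation mentioned above follows from simple flat-ground trigonometry. The helper below is a hypothetical illustration, assuming the ~5 m stand-off quoted earlier and measuring height relative to the target; it is not a procedure taken from the thesis.

```python
import math

def antenna_height(ground_range_m, depression_deg):
    """Antenna height above the target needed for a given depression
    angle, assuming flat geometry: h = d * tan(theta)."""
    return ground_range_m * math.tan(math.radians(depression_deg))

# With the ~5 m stand-off of the experiment, the quoted angle
# ranges correspond to roughly 1.6 m (testing) to 2.2 m (training):
for angle in (17.5, 20.3, 21.8, 23.4):
    print(f"{angle:5.1f} deg -> {antenna_height(5.0, angle):.2f} m")
```

This makes concrete why an adjustable-height tripod suffices to sweep the depression angles used for the training and testing sets.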
Configuration changes A configuration change is defined as the displacement of an element of the target. In practice, it is a change of the turret or gun direction that is considered a configuration change. The orientations of the turret relative to its central position are taken from the following sets of angles, for the training: {-90; -45; 45; 90} and for the testing: {-30; 0; 30}. The gun had only two discrete positions, up and down.
Lab environment Not all the sequences were taken in the lab during the same period. It is not possible to remove the background entirely from the images, as measurements of the background alone could not be taken: the targets were too heavy to be moved easily. Thus, the surrounding objects in the background can influence the resulting image. To limit correlation relative to the background, the sequences chosen for the training and testing sets were not taken in the same time period and thus have different laboratory backgrounds. All the data were taken over three different time periods, labelled 1, 2 and 3 in Section 2.6.2.


Table of contents:

List of Figures
List of Tables
List of Equations
List of Abbreviations
1 Introduction 
1.1 Overview and motivations
1.2 Research objectives and contributions
1.3 Organisation of the thesis
2 Generation of SAR and ISAR data for ATR 
2.1 Summary
2.2 Introduction
2.3 SAR and ISAR theory
2.4 Description of the MSTAR dataset
2.5 Reasons for creating a new dataset
2.6 Description of the Military Ground Target Dataset (MGTD)
2.7 Guidelines for the performance quantification of a SAR ATR method on the MGTD
2.8 Conclusion
3 SAR image classification theory 
3.1 Classification
3.2 Target orientation determination
4 Feature-based classification 
4.1 Summary
4.2 Introduction
4.3 Segmentation
4.4 Classification with features
5 Deep learning classification 
5.1 Summary
5.2 Introduction
5.3 Deep learning approach with classical architecture
5.4 Deep learning approach with pose informed architecture
5.5 Conclusion
6 Deep learning network explainability through feature analysis 
6.1 Summary
6.2 Introduction
6.3 Computation of occlusion maps and classification maps
6.4 Role of the target, shadow and clutter in the classification
6.5 Study of the intensities of the pixels composing the critical features
6.6 Influence of the target in the location of the critical features
6.7 Influence of the orientation in the location of the critical features
6.8 Evolution of the features along the CNN depth
6.9 Limitations of the feature analysis carried out
6.10 Conclusion
7 Discussion and future work 
7.1 Research summary
7.2 Evaluation of SAR ATR methods
7.3 Application of classification methods from the optical to the SAR domain
7.4 Influence of the acquisition environment on the classification scores
A Generation of SAR and ISAR data for ATR 
B Deep learning classification 
C Deep learning network explainability through feature analysis

