Feature extraction from basis vectors using lines properties


Motivations and goals of the thesis

Sub-meter resolution satellite images capture very detailed information, such as the shapes of buildings and industrial installations, detailed road and road furniture structures, and vehicles. Their information content is therefore extremely rich, but also difficult to extract. Classical image descriptors such as spectral information, texture, and shape are no longer accurate enough to describe the image content.
Recently, much research has been devoted to studying and developing algorithms for extracting information from high-resolution optical satellite images.
Among the different scenes in satellite imagery, urban areas and geometrical structures have been the most interesting for many applications and studies. We aim to extract the intrinsic cues of satellite images and to propose robust descriptors with which a variety of scenes, especially geometrical structures, can be recognized in VHR (Very High Resolution) satellite imagery. For example, using these descriptors, we would be able to find similar urban zones in different parts of a large satellite image.
Here, we focus on geometrical shapes, urban areas, and man-made structures inside satellite images, because the description and recognition of natural scenes normally poses no major difficulty. Images of natural landscapes usually have properties that let us use a number of texture-like features as their descriptors. They correspond to a specific range of frequencies, and changes in their spatial domain happen continuously and, usually, in a quasi-periodic manner. In addition, they usually do not contain distinct lines, edges, or geometric objects. Figure 1.1(a) shows part of a forest as an example of a natural landscape. In man-made structures, on the other hand, we usually find geometrical objects containing separating lines and edges, which are not necessarily distributed regularly inside the image. Thus, compared with natural landscapes, this kind of image cannot be described properly with textural features.
Texture-like features give a global interpretation of the scene but do not provide detailed information about the objects inside it. Local descriptors and morphological operators, on the other hand, are capable of detecting geometrical objects and characterizing urban areas, but they are usually time-consuming and complicated methods with very long feature vectors. What we need are features that sit neither exactly at the level of texture nor at the level of local descriptors. Moreover, local descriptors and morphological operators are usually used in object detection algorithms, whereas in many applications the goal is not to detect particular objects or zones but to give a semantic interpretation of scenes containing different landscapes, particularly man-made structures. The principal purpose of this thesis is to propose patch descriptors that are capable of characterizing geometrical structures with regard to the context of the satellite image patches.
Independent Component Analysis (ICA) is the theoretical basis of the thesis. Here, we only state the principal property of ICA that motivates us to use it for satellite image characterization; the details of ICA are given in Chapter 5. Bell and Sejnowski [2] applied ICA to natural images and found that the independent components of images include short lines and edges. This is an important property for geometrical structure characterization, since geometrical objects normally consist of lines and edges. Thus, ICA could be a suitable candidate for defining descriptors for satellite image patches containing geometrical structures.
Figure 1.1: Examples of two classes of satellite images and the ICA basis vectors obtained for them. (a): Forest, typical of natural landscapes. (b): Urban area, typical of geometrical structures. (c) and (d): ICA basis vectors obtained for the two classes. The urban area basis vectors contain lines, bars, and edges, whereas the forest basis vectors are more homogeneous.
Figure 1.1 shows two satellite image patches, one from a forest and the other from an urban area, together with examples of the ICA basis vectors obtained for each class of data. The difference between the two sets of basis vectors indicates the capability of ICA for satellite image characterization. In particular, the edges and lines in the urban area basis vectors demonstrate that ICA can capture the principal characteristics of geometrical structures.
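As a rough illustration of how such basis vectors can be obtained, the following is a minimal sketch only, not the procedure used in the thesis. It assumes square micro patches sampled at random from a grayscale image, PCA whitening, and a symmetric FastICA iteration with a tanh nonlinearity. The function names, the patch size, the number of components, and the iteration count are all hypothetical choices made for the example.

```python
import numpy as np

def _sym_orth(w):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W, computed via SVD."""
    u, _, vt = np.linalg.svd(w)
    return u @ vt

def extract_micro_patches(image, n, count, rng):
    """Sample `count` random n-by-n micro patches and flatten each into a vector."""
    rows = rng.integers(0, image.shape[0] - n + 1, size=count)
    cols = rng.integers(0, image.shape[1] - n + 1, size=count)
    return np.stack([image[r:r + n, c:c + n].ravel() for r, c in zip(rows, cols)])

def ica_basis_vectors(patches, k, iters=200, seed=0):
    """Return (basis, w): k basis vectors (one per row) and the unmixing rotation."""
    x = patches - patches.mean(axis=0)           # centre each pixel position
    cov = x.T @ x / len(x)
    eigval, eigvec = np.linalg.eigh(cov)         # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:k]           # keep the k largest (PCA reduction)
    d, e = eigval[top], eigvec[:, top]
    z = x @ e / np.sqrt(d)                       # whitened data, shape (samples, k)

    rng = np.random.default_rng(seed)
    w = _sym_orth(rng.standard_normal((k, k)))
    for _ in range(iters):
        y = z @ w.T                              # current source estimates
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        # FastICA fixed-point update for all units at once
        w_new = (g.T @ z) / len(z) - np.diag(g_prime.mean(axis=0)) @ w
        w = _sym_orth(w_new)

    # Mixing matrix back in pixel space: each row is one basis vector
    basis = w @ (e * np.sqrt(d)).T
    return basis, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((128, 128))               # stand-in for a satellite image patch
    patches = extract_micro_patches(image, n=8, count=2000, rng=rng)
    basis, _ = ica_basis_vectors(patches, k=16)
    print(basis.reshape(16, 8, 8).shape)         # each basis vector viewed as an 8x8 image
```

Reshaping each row of basis to n-by-n gives images comparable in kind to those in Figure 1.1(c) and (d); on real urban data one would expect line-like and edge-like patterns, while random noise as used here yields no meaningful structure.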
Throughout the thesis we extract features related to Independent Component Analysis from VHR optical satellite images. These features are intended to characterize this kind of image, especially images that contain man-made or geometrical structures.

Overview of thesis contributions

The main purpose of the thesis is to propose descriptors for optical satellite image patches. A descriptor can be defined simply as a vector of features, where every feature describes one characteristic of the image or of a pattern inside the image. Researchers have previously presented many methods for extracting features from images. In Chapter 2 we give definitions and notations for different image features, as well as for the feature extraction methods, focusing mostly on the methods related to our work.
Our research domain is also strongly related to Earth Observation (EO) and Remote Sensing (RS). In Chapter 3 we therefore introduce the basic concepts, goals, and challenges of Earth Observation and Remote Sensing, and we explain what kinds of satellite images are used in the thesis.
Then, in Chapter 4, we review previous studies to give an overview of the research surrounding the main aspects of our work. Since our objective is to characterize geometrical or man-made landscapes, it is strongly related to urban area detection, so we first survey work on urban area detection and classification. We also review the state of the art of Independent Component Analysis (ICA) and its applications in satellite image processing.
Since Independent Component Analysis (ICA) is the theoretical framework of the thesis, its fundamentals, concepts, and algorithms are explained in a separate chapter, Chapter 5.
We started our practical work on ICA with a study of the effect of the scale size and dimensionality of the ICA system when it is used for satellite image indexing. There is a relation between the size of the ICA basis vectors and the capability of ICA to characterize satellite images. Normally, increasing the size of the ICA basis vectors makes the ICA system more capable of indexing satellite images, but the amount of computation grows as well. Thus, the size of the ICA basis vectors cannot be increased without limit.
A similar relation exists for the dimensionality of the ICA system, that is, the number of ICA components. The dimensionality is usually expressed as the reduction factor, which is the normalized number of ICA components: the ratio of the number of ICA components to n², where n is the side length of the n × n ICA basis vectors. The purpose of Chapter 6 is to find the optimum point for the size of the ICA basis vectors and the number of components. We define the reconstruction error as one criterion of the ICA system's capability for image characterization, and we take the computation time needed to obtain the basis vectors as the other criterion. Using cost functions that combine these two criteria, we conclude that the optimum reduction factor lies between 0.08 and 0.14 and that basis vectors of size 16 × 16 are the most suitable for our work.
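To make the reduction factor concrete, the sketch below converts it into a number of components and measures a reconstruction error. It rests on an assumption that the source does not spell out here: that ICA is run after a PCA reduction to k dimensions, in which case the k ICA basis vectors span the same subspace as the top k principal components, so a PCA projection can stand in for the ICA reconstruction. The function names and the mean squared error definition are hypothetical choices, not the thesis's own.

```python
import numpy as np

def components_for(reduction_factor, n):
    """Number of ICA components for n-by-n basis vectors at a given reduction factor."""
    return round(reduction_factor * n * n)

def reconstruction_error(patches, k):
    """Mean squared error after projecting centred patch vectors onto the
    subspace of the top-k principal components (assumed stand-in for ICA)."""
    x = patches - patches.mean(axis=0)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    v = vt[:k]                                   # k orthonormal directions
    x_hat = x @ v.T @ v                          # project, then reconstruct
    return float(np.mean((x - x_hat) ** 2))

if __name__ == "__main__":
    # For 16 x 16 basis vectors, the reported range 0.08 to 0.14 corresponds to:
    print(components_for(0.08, 16), components_for(0.14, 16))  # prints "20 36"
```

Under this reading, the reported optimum corresponds to roughly 20 to 36 components out of a possible 256 for 16 × 16 basis vectors. The reconstruction error can only decrease as k grows, which is why it has to be traded off against computation time rather than minimized on its own.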
In addition, Chapter 6 proposes an approach for reducing the redundancy in a set of basis vectors. We propose using a set of Gabor-wavelet filters to choose the best micro patches for learning. In other words, we choose the micro patches that have the highest energy in the responses of a set of Gabor filters.
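One possible reading of this selection step is sketched below. It is an assumption-laden illustration: the filter bank (one wavelength, four orientations), the energy definition (sum of squared even and odd filter responses of the mean-removed patch), and all function names are hypothetical, since the text above does not give the exact filters or energy measure.

```python
import numpy as np

def gabor_kernel(n, wavelength, theta, sigma):
    """Even (cosine) and odd (sine) parts of an n-by-n Gabor kernel."""
    half = (n - 1) / 2.0
    y, x = np.mgrid[0:n, 0:n] - half             # coordinates centred on the patch
    xr = x * np.cos(theta) + y * np.sin(theta)   # coordinate along the wave direction
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    phase = 2.0 * np.pi * xr / wavelength
    return envelope * np.cos(phase), envelope * np.sin(phase)

def gabor_energy(patch, bank):
    """Sum over the bank of squared even and odd responses of the mean-removed patch."""
    p = patch - patch.mean()
    return sum(np.sum(p * even) ** 2 + np.sum(p * odd) ** 2 for even, odd in bank)

def select_high_energy(patches, bank, keep):
    """Indices of the `keep` patches with the highest Gabor energy, plus all energies."""
    energies = np.array([gabor_energy(p, bank) for p in patches])
    order = np.argsort(energies)[::-1][:keep]
    return order, energies

if __name__ == "__main__":
    n = 8
    angles = (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)
    bank = [gabor_kernel(n, wavelength=4.0, theta=t, sigma=2.0) for t in angles]
    rng = np.random.default_rng(0)
    patches = rng.random((100, n, n))            # stand-in micro patches
    order, _ = select_high_energy(patches, bank, keep=20)
    print(len(order))
```

Flat, featureless patches score near zero under this measure, while patches with oriented structure at the filter wavelength score high, so the learning set is biased toward patches that carry edges and lines.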


Texture features

In the field of image processing there is no clear-cut definition of texture. The available definitions are based on texture analysis methods and on the features extracted from the image. Texture can nevertheless be considered as repeated patterns of pixels over a spatial domain. Textures usually appear random and unstructured, however, because their model assumes that some amount of noise is added to the patterns and that the repetition frequency changes from one area to another.
Regularity, directionality, smoothness, and coarseness are examples of texture properties perceived by the human eye. Texture analysis has been used extensively to characterize and classify remotely sensed images. In this section we study Haralick and Gabor-wavelet features as typical texture features, in order to investigate the properties of textural analysis.
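As a small illustration of the Haralick family, the sketch below builds a gray-level co-occurrence matrix (GLCM) and computes three commonly used statistics from it. It is a minimal sketch under stated assumptions: the image is already quantized to integer gray levels in the range 0 to levels - 1, a single non-negative pixel offset is used, and the function names and the choice of these three statistics are illustrative, not taken from the thesis.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for offset (dy, dx).

    `image` must hold integers in [0, levels); dx and dy must be non-negative.
    """
    img = np.asarray(image)
    h, w = img.shape
    a = img[0:h - dy, 0:w - dx].ravel()          # reference pixels
    b = img[dy:h, dx:w].ravel()                  # neighbours at the given offset
    m = np.zeros((levels, levels), dtype=float)
    np.add.at(m, (a, b), 1.0)                    # count co-occurring level pairs
    m = m + m.T                                  # make the matrix symmetric
    return m / m.sum()

def haralick_subset(p):
    """Contrast, angular second moment, and homogeneity of a normalized GLCM."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "asm": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
    }

if __name__ == "__main__":
    checkerboard = np.indices((4, 4)).sum(axis=0) % 2
    print(haralick_subset(glcm(checkerboard, levels=2)))
```

A constant image gives zero contrast and an angular second moment of 1, whereas a one-pixel checkerboard with a horizontal offset of one pixel gives a contrast of 1, which matches the intuition that these statistics respond to smoothness and coarseness.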

Table of contents:

Chapter 0: French résumé
Chapter 1: Introduction
1.1 Motivations and goals of the thesis
1.2 Overview of thesis contributions
Chapter 2: Feature Extraction Methods
2.1 Image intensity features
2.2 Texture features
2.2.1 Haralick features
2.2.2 Gabor wavelet features
2.3 Local Features
2.3.1 Scale Invariant Feature Transform
2.4 What kind of features do we need?
Chapter 3: Satellite Images Properties
3.1 Active and passive sensors
3.2 Optical satellite sensors
3.2.1 Resolution
3.2.2 Panchromatic or Multispectral
3.3 Sub-meter optical satellite images
3.4 Contextual image patches for feature extraction
Chapter 4: State of the Art
4.1 Urban area characterization state of the art
4.2 ICA State of the art
Chapter 5: Principles of Independent Component Analysis
5.2 Fundamentals of Independent Component Analysis
5.3 Assumptions for the mean and variance of sources
5.4 Pre-processing steps
5.5 Measurement of statistical independence
Chapter 6: ICA for Satellite Images: Scale and Dimensionality Behavior
6.1 ICA for image data
6.1.1 Image rescaling
6.1.2 Micro patches
6.1.3 Micro patch conversion to the vector form
6.1.4 Principal Components
6.1.5 ICA basis vectors
6.2 Dimensionality behavior of ICA components
6.2.1 Reconstruction
6.2.2 Reconstruction error
6.2.3 Optimum reduction factor
6.3 Scale behavior of ICA components
6.4 Gabor filters pre-processing step
6.5 Conclusions
Chapter 7: Feature Extraction From ICA Sources
7.1 Features for a micro patch
7.2 Features for contextual patches
7.2.1 Number of sampled micro patches
7.3 Simple clustering for evaluation
7.4 Dimensionality and Scale size effects
7.5 Basis vectors improvement
7.6 Conclusions
Chapter 8: Middle-level Topographic ICA features
8.1 Principles of Topographic ICA
8.2 TICA basis vector production
8.2.1 Scale size of TICA system
8.2.2 Dimensionality of TICA components
8.2.3 Topography dimensions
8.2.4 Neighborhood dimensions
8.2.5 Pre-processing steps
8.2.6 TICA learning procedure
8.3 Middle-level TICA features
8.3.1 Low level TICA features generation
8.3.2 Middle-level features definition
8.4 Simple clustering for evaluation
8.5 Conclusions
Chapter 9: Feature Extraction From ICA Basis Vectors: Bag of Words model
9.1 Basis vectors of contextual patch carry its signature
9.1.1 Learning procedure for one contextual patch
9.1.2 Choosing the dimensionality and the size of basis vectors
9.1.3 Number of learning micro patches
9.2 Bag of words model
9.2.1 Visual documents
9.2.2 Visual words for each document
9.2.3 Dictionary
9.2.4 Labeling each word of document by dictionary words
9.2.5 Bayesian approach for classification
9.2.6 Improved labeling and features
9.2.7 Simple clustering for evaluation
9.3 Conclusions
Chapter 10: Feature Extraction From ICA Basis Vectors: Line and Gradient Features
10.1 Lines and gradient as basic characteristics of basis vectors
10.2 Edge detection
10.2.1 Edge strength estimation based on first-order gradient
10.2.2 Thresholding and edge thinning
10.3 Line estimation
10.3.1 Three-pixel line detection
10.3.2 Enlarging the three-pixel lines
10.4 Feature extraction from basis vectors using lines properties
10.4.1 Finding lines inside the basis vectors
10.4.2 Length, Gradient and angle as the important line properties
10.4.3 Number of elements in a bin as feature
10.5 Simple clustering for evaluation
10.6 Conclusions
Chapter 11: Image Descriptor Based on Line Segments
11.1 Motivation
11.2 Lines properties of contextual patch as features
11.2.1 Finding lines inside the contextual patch
11.2.2 Length, Gradient and Angle as the important line properties
11.2.3 Number of elements in a bin as feature
11.3 Simple clustering for evaluation
11.4 Conclusions
Chapter 12: Evaluation
12.1 Support Vector Machine
12.2 Supervised classification based on SVM
12.2.1 Relevance feedback tool
12.2.2 Contextual patch database
12.2.3 Feature extraction
12.2.4 Class detection
12.3 Conclusion
Chapter 13: Conclusions and perspectives
13.1 Conclusion
13.2 Perspectives
Bibliography