Saliency-based BCS recovery
Visual saliency is a cognitive mechanism of the human visual system (HVS) that identifies the significant visual information (salient or foreground regions) and filters out redundant visual information (non-salient or background regions) when exploring natural scenes. Since the HVS performs numerous functions during viewing, it is not possible to build this complex mechanism into the direct sensing process. Besides the hardware problems, integrating salient region detection into cameras is not straightforward because it is time- and energy-consuming. However, by considering a feedback channel between the sensor and the receiver, BCS can exploit visual saliency to improve the image reconstruction quality. The graph-based visual saliency (GBVS) model is a well-known algorithm for extracting the salient regions of an image. In the following, the GBVS and BCS-SPL algorithms are combined to build an image CS recovery algorithm that adapts to human visual perception, called saliency-based BCS with smoothed-projected reconstruction (SBCS-SPL). In this adaptive procedure, high sampling rates are assigned to the salient regions and low sampling rates to the non-salient regions, leading to a better recovery quality.
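The adaptive allocation described above can be illustrated with a minimal sketch. The proportional rule, the function name `allocate_subrates`, and the clipping bounds `s_min` and `s_max` are assumptions made for illustration only; the text does not specify the exact allocation rule used by SBCS-SPL.

```python
def allocate_subrates(block_saliency, target_subrate, s_min=0.05, s_max=0.9):
    """Assign a subrate to each block in proportion to its saliency.

    Assumed rule (not taken from the thesis): scale the target subrate by
    the ratio of the block saliency to the mean saliency, then clip.
    """
    mean_sal = sum(block_saliency) / len(block_saliency)
    if mean_sal <= 0:
        # No saliency information: fall back to uniform sampling.
        return [target_subrate] * len(block_saliency)
    rates = []
    for s in block_saliency:
        rate = target_subrate * s / mean_sal
        rates.append(min(s_max, max(s_min, rate)))
    return rates


# Mean saliency of four hypothetical blocks, from least to most salient.
saliency = [0.1, 0.2, 0.3, 0.4]
rates = allocate_subrates(saliency, target_subrate=0.3)
```

When no rate is clipped, the average of the allocated subrates equals the target subrate, so the total measurement budget is unchanged and only its distribution over the blocks is adapted.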
The block diagram of the SBCS-SPL scheme is shown in Fig. 3.1. First, the basic BCS acquisition is performed at the sensor side, i.e., the image is partitioned into non-overlapping blocks of $\sqrt{N_b} \times \sqrt{N_b}$ pixels, each sampled at the same subrate $S$. At the receiver side, an initial reconstruction of the image is obtained using the BCS-SPL algorithm. In the next step, a saliency map of the whole image is computed from the initial reconstructed image using the GBVS model. The GBVS consists of the following three main steps:
Feature map extraction: A feature map represents the image in terms of a well-defined feature. Such maps can be computed either from the first-order statistics of stimulus features such as intensity, color, orientation, and texture, or by linear filtering of the image followed by some elementary nonlinearity. For example, a contrast map computed from the intensity variance in local patches of the image, or simply the grayscale intensity values, can be considered as a feature map.
Activation map computation: In the second step, the feature map is transformed into an activation map that highlights the parts of the scene that, for a specific feature, differ strongly from their surroundings. This is implemented by a graph-based random walk applied to the feature map to extract the local activity for each feature type. First, a graph is generated by connecting the nodes of a feature map. Then, by assigning a weight to the edge between each pair of nodes, the graph is interpreted as a Markov chain. The equilibrium distribution of this chain accumulates mass at nodes that are highly dissimilar to their surrounding nodes, and this mass is taken as the activation and saliency value.
Normalization and combination: In the last stage of the attention model, a saliency map is generated by normalizing the activation map, which prevents salient regions from being masked by noise or by less salient regions. A similar graph-based approach is applied to the activation map in this step, in which mass flows preferentially to the nodes with high activation. Please refer to the original GBVS paper and the references therein for more details.
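The activation step can be sketched on a tiny feature map as follows. This is a simplified illustration under stated assumptions, not the reference GBVS implementation: the dissimilarity is taken as the absolute difference of feature values, the spatial fall-off is Gaussian with a hypothetical width `sigma`, a small `eps` keeps the chain connected, and a lazy power iteration is used so that the iteration converges even when the chain is nearly periodic.

```python
import math


def activation_map(feature, sigma=1.5, eps=1e-6, iters=200):
    """Equilibrium distribution of a Markov chain built on a feature map."""
    rows, cols = len(feature), len(feature[0])
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    n = len(nodes)  # assumes at least two nodes

    # Edge weight = dissimilarity of the two feature values times a
    # Gaussian fall-off in their spatial distance (no self-loops).
    P = [[0.0] * n for _ in range(n)]
    for i, (ri, ci) in enumerate(nodes):
        for j, (rj, cj) in enumerate(nodes):
            if i == j:
                continue
            dissim = abs(feature[ri][ci] - feature[rj][cj]) + eps
            dist2 = (ri - rj) ** 2 + (ci - cj) ** 2
            P[i][j] = dissim * math.exp(-dist2 / (2.0 * sigma ** 2))
        total = sum(P[i])
        P[i] = [w / total for w in P[i]]  # rows become transition probabilities

    # Lazy power iteration p <- (p + pP) / 2; it has the same equilibrium
    # distribution as the original chain.
    p = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
        p = [0.5 * a + 0.5 * b for a, b in zip(p, step)]
    return [p[r * cols:(r + 1) * cols] for r in range(rows)]


# A flat 3x3 feature map with one outlier in the centre.
feature = [[1.0, 1.0, 1.0],
           [1.0, 5.0, 1.0],
           [1.0, 1.0, 1.0]]
act = activation_map(feature)
```

In this example the mass concentrates on the centre node, the only one that differs from its surroundings, which is the behaviour the activation step relies on.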
Residual-based Image CS Reconstruction Algorithms
In the previous section, an image BCS recovery algorithm was presented in which the recovery quality is improved by establishing a feedback channel between the receiver and the sensor in an adaptive acquisition framework. This section introduces receiver-based CS recovery algorithms that require no change in the signal acquisition part.
In video compression, motion estimation and compensation are used to exploit the temporal sparsity. In this technique, the first frame within a group of pictures (GOP) is encoded first. Then, exploiting the high resemblance among the frames within the same GOP, instead of fully encoding the next frame, only the residual between the original frame and its prediction from the previous frame is encoded. The resulting residual is clearly sparser than the frame itself. The same encoding procedure is used for multi-view image and video compression, exploiting disparity estimation and compensation methods. A fundamental property of the CS theory is that a very sparse signal can be perfectly reconstructed from a small number of measurements. Therefore, higher sparsity can be provided by estimating a prediction of the image and exploiting it in the CS reconstruction [87, 88], which implicitly increases the signal compressibility. In this paradigm, during the reconstruction step, a prediction of the current block is obtained by exploiting the strong correlation among the blocks within an image. Then, the CS reconstruction algorithm is driven by the measurement-domain residual, i.e., the difference between the measurements of the current block and those of its prediction [69, 76, 77]. Suppose the measurements of block $x_i$ are obtained via $y_i = \Phi_b x_i$.
Let $\tilde{x}_i$ be a prediction of the block $x_i$; then the measurement-domain residual of the $i$-th block is $r_i = y_i - \Phi_b \tilde{x}_i$. Given the measurement-domain residuals $r$ of all blocks, the final reconstructed frame $\hat{x}$ is computed as
$$\hat{x} = \tilde{x} + \text{BCS-SPL}(r, \Phi_b). \tag{3.8}$$
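The residual-driven reconstruction in (3.8) can be sketched for a single block as follows. BCS-SPL itself is not reproduced here: as a loudly labeled stand-in, the recovery routine is a simple back-projection $\Phi^T r$, which is the minimum-norm solution when the rows of the measurement matrix are orthonormal. The function names and the toy data are hypothetical.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def back_project(y, Phi):
    """Stand-in for BCS-SPL: Phi^T y (min-norm solution for orthonormal rows)."""
    n = len(Phi[0])
    return [sum(Phi[k][j] * y[k] for k in range(len(Phi))) for j in range(n)]


def residual_reconstruct(y, Phi, x_pred, recover=back_project):
    """x_hat = x_pred + recover(y - Phi x_pred, Phi), as in (3.8)."""
    r = [a - b for a, b in zip(y, matvec(Phi, x_pred))]
    d = recover(r, Phi)
    return [p + q for p, q in zip(x_pred, d)]


def sq_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


# Two orthonormal measurement rows for a block of four pixels (subrate 0.5).
Phi = [[0.5, 0.5, 0.5, 0.5],
       [0.5, -0.5, 0.5, -0.5]]
x = [10.0, 12.0, 11.0, 13.0]       # original block (unknown to the receiver)
x_pred = [10.0, 12.0, 11.0, 12.0]  # prediction available at the receiver
y = matvec(Phi, x)

err_direct = sq_error(back_project(y, Phi), x)
err_residual = sq_error(residual_reconstruct(y, Phi, x_pred), x)
```

With this stand-in, the reconstruction error depends only on the part of the signal that the measurements cannot see. Replacing $x$ by the much smaller residual $x - \tilde{x}$ therefore reduces the error whenever the prediction is close to the original block.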
Residual-based CS reconstruction using intra-image prediction
In this section, the basic framework of intra-image prediction is exploited to propose a novel strategy for block-based image CS reconstruction, called block-based CS reconstruction using intra-image prediction (BCS-IIP). First, each image is partitioned into $K$ non-overlapping blocks of $\sqrt{N_b} \times \sqrt{N_b}$ pixels, and the measurements of each block are obtained separately using an $M_b \times N_b$ Gaussian measurement matrix $\Phi_b$ via $y_i = \Phi_b x_i$ at the subrate $S = M_b / N_b$, where $x_i$ is the vector representation of the $i$-th block. The block diagram of the BCS-IIP algorithm is depicted in Fig. 3.5. The algorithm consists of two phases: an initial phase and an enhancement phase. In the initial phase, the BCS-IIP algorithm builds an initial reconstruction $\hat{x}$ of the image. This reconstruction can be obtained using any CS reconstruction algorithm; the BCS-SPL algorithm is employed in our implementation. In the enhancement phase, the initial reconstructed image $\hat{x}$ is partitioned into $L$ non-overlapping blocks of $\sqrt{N_p} \times \sqrt{N_p}$ pixels, where $N_p$ is not necessarily equal to $N_b$. For prediction, small blocks improve the prediction quality, whereas for CS, larger blocks give a better reconstruction. In the former case, a smaller block size increases the correlation among neighboring blocks, leading to a more accurate prediction. In the latter case, larger acquisition blocks are sparser, which is a necessary condition for the success of CS reconstruction algorithms.
Let the set $\mathcal{S} = \{z_1, z_2, \dots, z_8\}$ consist of the 8 blocks surrounding the current block $z$ in the initial reconstructed image $\hat{x}$, as shown in Fig. 3.6. The similarity between the block $z$ and its $k$-th surrounding block is measured by the sum of absolute differences (SAD):
$$\mathrm{SAD}_k = \sum_{m=1}^{\sqrt{N_p}} \sum_{n=1}^{\sqrt{N_p}} \left| z_k(m, n) - z(m, n) \right|. \tag{3.9}$$
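The SAD computation in (3.9) can be sketched as follows for an image stored as a list of rows. The helper names and the handling of border blocks (neighbors that fall outside the image are skipped) are assumptions made for this illustration.

```python
def get_block(img, top, left, size):
    """Return the size x size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def neighbour_sads(img, top, left, size):
    """SAD between the current block and each of its surrounding blocks.

    Keys are (row, column) offsets in block units; neighbours that would
    fall outside the image are skipped (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    cur = get_block(img, top, left, size)
    out = {}
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            t, l = top + dr * size, left + dc * size
            if 0 <= t and t + size <= h and 0 <= l and l + size <= w:
                out[(dr, dc)] = sad(get_block(img, t, l, size), cur)
    return out


# 6x6 test image whose intensity grows with the column index.
img = [[c for c in range(6)] for _ in range(6)]
sads = neighbour_sads(img, top=2, left=2, size=2)
```

In this example the blocks directly above and below the current block are identical to it (SAD of zero), while all other neighbors differ, so a SAD-based similarity measure would favor the vertical neighbors.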
Residual-based compressed sensing recovery using a trained dictionary
In the previous section, a residual-based CS reconstruction algorithm was described in which the final reconstructed image is obtained using a measurement-domain residual. In general, CS recovery of the residual image, i.e., the difference between the original image and its prediction, improves the CS reconstruction performance [76, 91]. In the BCS-IIP algorithm, a weighted average of the spatially surrounding blocks is computed to obtain a prediction for each block. The limitation of this method is that its performance relies heavily on the quality of the prediction, which rests on the assumption that adjacent blocks within an image are similar. To address this issue, the non-local self-similarity of the patches within an image can be exploited to produce a prediction that is as close as possible to each block, and thus to generate a highly compressible residual.
The well-known sparse representation (SR) models an image patch as a linear combination of a small number of atoms chosen from an over-complete dictionary. Group sparse representation (GSR) is a more recent modeling technique for natural images. In this technique, instead of modeling single image patches, a group of non-local image patches with similar structures is considered as the basic unit of the sparse representation. Based on this concept, the GSR is used to obtain a prediction for each block. This prediction drives a residual-based CS reconstruction algorithm, called block-based CS reconstruction using group sparse representation (BCS-GSR).
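The grouping of non-local similar patches can be sketched as a simple block-matching search. The squared Euclidean distance, the search-window size, and the function names are assumptions of this illustration; the text does not fix the matching criterion used to build the groups.

```python
def extract_patch(img, top, left, size):
    """Flatten the size x size patch whose top-left pixel is (top, left)."""
    return [img[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def build_group(img, top, left, size, window, group_size):
    """Positions of the group_size patches most similar to the reference.

    Candidates are all patches whose top-left corner lies within `window`
    pixels of the reference; similarity is the squared Euclidean distance.
    """
    h, w = len(img), len(img[0])
    ref = extract_patch(img, top, left, size)
    candidates = []
    for t in range(max(0, top - window), min(h - size, top + window) + 1):
        for l in range(max(0, left - window), min(w - size, left + window) + 1):
            patch = extract_patch(img, t, l, size)
            dist = sum((a - b) ** 2 for a, b in zip(ref, patch))
            candidates.append((dist, t, l))
    candidates.sort()  # smallest distance first; ties broken by position
    return [(t, l) for _, t, l in candidates[:group_size]]


# 6x6 image with a horizontal pattern of period 3, so that identical
# patches occur at non-adjacent positions.
img = [[c % 3 for c in range(6)] for _ in range(6)]
group = build_group(img, top=0, left=0, size=2, window=4, group_size=3)
```

The reference patch is always the first member of its own group, and the remaining members are taken from wherever in the search window the same structure repeats, not only from adjacent blocks.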
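The image is first partitioned into $K$ non-overlapping blocks of $\sqrt{N_b} \times \sqrt{N_b}$ pixels. Then, each block is sampled as $y_i = \Phi_b x_i$, where $x_i \in \mathbb{R}^{N_b}$ is the vector representation of the $i$-th block of the image, $y_i \in \mathbb{R}^{M_b}$ is the measurement vector, and $\Phi_b$ is the measurement matrix of size $M_b \times N_b$. All blocks are assumed to be sampled at the same subrate; therefore, the overall subrate is $S = M_b / N_b$.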
Given the measurement vectors $[y_i]_{i=1}^{K}$, the objective is to recover the image patches $[x_i]_{i=1}^{K}$ in order to reconstruct the whole image $x$. In the first step, the aim is to find an optimal prediction of the image $x$ by solving the following minimization problem:
$$\tilde{x} = \arg\min_{z} \|x - z\|_2^2 + \lambda \Psi(z), \tag{3.13}$$
where $\|x - z\|_2^2$ is the $\ell_2$ data-fidelity term, $\Psi(z)$ is the regularization term that regularizes the solution space of the fidelity term, and $\lambda$ is the regularization parameter. As discussed in Section 2.5, one of the most significant regularization terms is sparsity, which means that natural images are locally sparse. In other words, each local image patch can be accurately represented by a few elements chosen from a dictionary [29, 31, 36]. Further, many repetitive high-level patterns and regular structures are distributed globally across an image. This property of similar patches within an image, called non-local self-similarity, can be combined with the local sparsity of natural images to form a more suitable and efficient regularization term for the minimization problem in (3.13), yielding a much more accurate prediction [32, 34, 92]. Although the obtained residual has a random nature, it is a highly sparse signal that can be recovered more accurately from the corresponding measurements.
Joint Sparse Representation (JSR)
In recent years, sparse data representation has been extensively utilized in different applications by means of $\ell_0$-norm and $\ell_1$-norm minimization techniques. Based on this modeling, an image patch is represented using a small number of basis functions chosen from an over-complete dictionary. The choice of the over-complete dictionary plays an important role in sparse representation modeling. One of the most flexible ways to obtain such a dictionary is to learn it from a set of example image patches, which has been an active field of research over the past decade [23, 35, 36, 117].
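Representing a patch with a few atoms of an over-complete dictionary can be sketched with plain matching pursuit, a greedy approximation of the $\ell_0$-constrained problem. This is an illustrative choice and not necessarily the solver used in this work; the toy dictionary (five unit-norm atoms in three dimensions) is hypothetical and hand-made, not learned.

```python
import math


def normalise(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding: pick the best-correlated atom at each step."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Over-complete dictionary: 5 unit-norm atoms for 3-dimensional signals.
atoms = [normalise(a) for a in ([1, 0, 0], [0, 1, 0], [0, 0, 1],
                                [1, 1, 0], [0, 1, 1])]
signal = [3.0, 3.0, 0.0]
coeffs, residual = matching_pursuit(signal, atoms, n_nonzero=1)
```

With the orthonormal basis alone, this signal would need two coefficients. The extra atom along the direction (1, 1, 0) represents it with a single one, which is why the choice of dictionary matters for sparsity.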
Recently, Yang et al. proposed a joint sparse representation model for image super-resolution. They assume that there exist two coupled over-complete dictionaries for two different modalities (low-resolution (LR) and high-resolution (HR) image patches), over which each pair of samples from the two modalities has the same sparse representation. Based on this model, a coupled dictionary learning method is proposed in which the coupling is realized by forcing the low- and high-resolution patches to share the same sparse feature space. This joint sparse representation model has been extensively used in other applications, including classification, cross-modal matching, and multispectral image change detection. Another model assumes that the sparse representations of the two modalities are related to each other via a linear mapping function. This relaxes the strong assumption of the coupled dictionary learning algorithm and brings more flexibility in characterizing the image structures. However, a single linear mapping may not be sufficient to relate the sparse representations of the two modalities accurately. In the field of cross-modal matching, a new coupling model has recently been proposed that uses two linear mappings to project the sparse representations of the paired samples from the two modalities into a common space, enabling a joint assessment of the two modalities.
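The three coupling models described above can be summarized in a common notation. The symbols are assumptions introduced here for illustration: $\alpha_x$ and $\alpha_y$ denote the sparse codes of a paired sample in the two modalities, $M$ a single linear mapping, and $W_x$, $W_y$ the two projections into the common space. The exact formulations in the cited works may differ, for example in the direction of the mapping or in the penalty used to enforce each relation.

```latex
\begin{align*}
\text{shared representation:} \quad & \alpha_x = \alpha_y \\
\text{single linear mapping:} \quad & \alpha_x \approx M \alpha_y \\
\text{common-space mapping:}  \quad & W_x \alpha_x \approx W_y \alpha_y
\end{align*}
```

In this notation, the shared representation is the special case $M = I$ of the single mapping, and the single mapping is in turn the special case $W_x = I$, $W_y = M$ of the common-space model, which makes the increasing flexibility of the three models explicit.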
Performance Analysis of the Proposed Algorithm
The proposed method offers several benefits. First, learning a separate dictionary for each domain preserves the main structure of the data in both domains, which can then be represented well by sparse linear combinations of the dictionary atoms. Further, learning the mapping function in the sparse representation domain makes it more accurate, since the irrelevant information in the spatial domain is discarded. Several mapping approaches, which relate the sparse representations of the original and corrupted patches in different ways, were introduced in Section 4.2.2. In this section, the effect of the common space mapping defined in (4.4) on the error concealment (EC) performance is evaluated in comparison with the direct mapping defined in (4.3). We also evaluate the EC performance of the coupling method in which the mapping matrix $M$ in (4.3) is set to the identity matrix.
Table 4.5 reports the experimental results in terms of PSNR, where the image Lena is corrupted with the random loss pattern at different packet loss rates (PLRs), ranging from 10% to 50%. The image is concealed using the JSR-based EC algorithm with the above coupling terms. "JSR-I", "JSR-D", and "JSR-C" denote mapping with the identity matrix, direct mapping, and common space mapping, respectively. It can be observed that coupling the sparse coefficients through a common space gives better error recovery than the other coupling methods. The reason is that transferring the sparse representations into a common space provides more freedom to uncover the relationships between the sparse coefficients of the original and corrupted patches. This shows that using an accurate coupling term in the objective function (4.5) plays an important role in improving the error recovery.
Table of contents
Table of content
List of Figures
List of Tables
List of Algorithms
List of Abbreviations
1.1 Main Objectives of the Thesis
1.2 Thesis Outline
2 Sparsity-Based Signal Models
2.3 Sparse Modeling using Over-complete Dictionaries
2.3.1 Dictionary Choice
2.3.2 Analysis versus Synthesis
2.4 Inverse Problems
2.5 CS-based Image Acquisition and Analysis Sparse Reconstruction
2.6 Block-based CS Reconstruction
3 Image Compressed Sensing Recovery
3.1.2 Adaptive block-based CS recovery
3.1.3 Saliency-based BCS recovery
3.2 Residual-based Image CS Reconstruction Algorithms
3.2.1 Residual-based CS reconstruction using intra-image prediction
3.2.2 Residual-based compressed sensing recovery using a trained dictionary
3.3 Experimental Results
4 Receiver-based Error Concealment
4.1.1 Previous Works on Error Concealment
4.1.2 Joint Sparse Representation (JSR)
4.2 Joint Sparse Representation for EC
4.2.1 Problem Formulation
4.2.2 Joint Dictionary-Mapping Learning
4.2.3 The JSR based EC
4.2.4 Block Recovery Order
4.3 JSR-based EC with Non-local and Local Regularization
4.3.1 Non-local Self-similarity for Regularization
4.3.2 Local Structural Regularity for Regularization
4.3.3 Recovery Algorithm
4.4 Experimental Study
4.4.1 Training Sets Description
4.4.2 Experimental Setup
4.4.3 EC Quality
4.4.4 Computational Costs
4.4.5 Performance Analysis of the Proposed Algorithm
5 Transmitter-based Error Concealment
5.2 Analysis Sparse Recovery-based EC
5.2.1 Transmitter Side
5.2.2 Receiver Side
5.2.4 Experimental Results
5.2.5 Reconstruction Quality
5.2.6 Packet length Effect
5.2.7 Wavelet Functions Effect
5.2.8 SB Size Effect
6 Sparse Representation-based Image Compression
6.1.1 Related Work
6.2 Image Compression using Adaptive Sparse Representations
6.2.1 Adaptive Synthesis Sparse Representation
6.2.2 Quantization and Entropy Coding
6.3 Experimental Results
7 Conclusion and Future Directions