Existing closed world re-id approaches used in open world re-id

Sparse representation, an intermediate step for other methods

This paragraph introduces a few other person re-identification approaches that use sparse representations, but neither for ranking with the residual errors nor as new features.
In [110], as in [109], coupled probe and gallery dictionaries are learnt for three body parts (head, body, legs). The method is semi-supervised and can make use of unlabelled data. At test time, the sparse codes of probe elements are computed using the probe dictionary. The obtained sparse codes are then used to recover new features using the gallery dictionary. The matching is based on the sum, over the body parts, of the Euclidean distances between the new features and the existing gallery features.
In [111], instead of coupled probe and gallery dictionaries, a coupled low- and high-resolution patch dictionary is learnt, together with a mapping matrix that converts features corresponding to low-resolution images into features that could be extracted from high-resolution images. In the person re-identification scenario considered, gallery images, which correspond to known identities, are assumed to be high-resolution images. The query images, on the other hand, are low-resolution. Similarly to [110], the sparse codes of the patches of a probe image, obtained through optimization with the low-resolution dictionary, are used to compute new high-resolution features thanks to the high-resolution dictionary and the mapping matrix. Gallery image patches are approximated by a sparse linear combination of elements of the high-resolution dictionary, and the new feature of a gallery image is the concatenation of the reconstructions of all its patches. For a given probe image, the gallery images are ranked based on the distance between their new high-resolution features and those of the probe.
In [112], the proposed semi-supervised coupled dictionary method is used to obtain sample-specific SVMs. For each training probe identity, an SVM is learnt to separate positive pairs from negative pairs. Then a coupled dictionary, composed of a feature dictionary and an SVM weights dictionary, is learnt jointly with a mapping between the two dictionaries. At test time, from the coding of the probe image using the feature dictionary, the sample-specific SVM weights are recovered and used to determine the matching score of that probe image with each of the gallery images.
In [113], sparse codes of probe elements are computed using the gallery as the dictionary. But instead of being used for ranking the gallery identities, the residual errors simply serve as a means to assess the reliability of a matching.

Generalizing person re-identification

The performance of person re-identification methods has risen tremendously in the past few years for the closed world case. However, the situation where every query person has been previously identified, and is thus included in the gallery set, does not correspond to the context of most real applications. Therefore the open world re-identification topic has emerged and is gaining strong interest.
However, there are many ways of relaxing the closed world assumption, and the person re-identification research community has not yet settled on one situation over another. Indeed, "open" simply refers to the fact that the gallery set of known people is not comprehensive. It could correspond, for example, to a case where the gallery is empty and grows as new identities come in, or to a case where not all identities are included in the gallery set but the gallery remains unchanged even if a new identity is captured. Many more situations could match the expression "open world", each being specific and corresponding to different applications with distinctive goals. This is why, even though most methods first developed for the closed world task could apply to open world tasks, and vice versa, we chose to present the methods specifically tackling an open world task in a separate section. We grouped the existing papers generalizing the person re-identification task into three groups corresponding to the type of scenario tackled: identity inference, group-based verification, and detection and re-identification.

Closed world measures

Closed world person re-identification measures mainly evaluate how relevant the ranking of the gallery identities by similarity is for each probe person. We present these measures from the most widely used to the least used.
CMC (Cumulative Matching Characteristic) represents the proportion of right matches found in the r top ranks. The ranks range from one to the number of gallery identities. To use this measure, it is assumed that for each probe person, the person re-identification method returns a ranked list of the gallery identities, where each gallery identity appears only once. In the multi-shot case, the similarity between a probe person and a gallery person is often chosen as the maximum value of the similarity between their image pairs.
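The CMC definition above can be illustrated with a minimal sketch. The function names `cmc_curve` and `multishot_similarity`, the data layout and the toy identities are assumptions made for this example; they do not come from the cited papers.

```python
def multishot_similarity(pair_similarities):
    """Multi-shot case: keep the maximum similarity over all image pairs."""
    return max(pair_similarities)


def cmc_curve(ranked_lists, true_ids):
    """Return cmc, where cmc[r - 1] is the proportion of probes whose
    right match appears within the r top ranks.

    ranked_lists[i]: gallery identities sorted from most to least similar
                     to probe i, each identity appearing exactly once.
    true_ids[i]:     identity of probe i (closed world: always in the list).
    """
    n_gallery = len(ranked_lists[0])
    hits = [0] * n_gallery
    for ranking, true_id in zip(ranked_lists, true_ids):
        position = ranking.index(true_id)  # 0-based rank of the right match
        # A match found at a given rank also counts for all larger ranks.
        for r in range(position, n_gallery):
            hits[r] += 1
    return [h / len(true_ids) for h in hits]


# Hypothetical toy data: three probes, three gallery identities.
rankings = [["a", "b", "c"], ["c", "b", "a"], ["a", "c", "b"]]
probe_ids = ["a", "b", "c"]
print(cmc_curve(rankings, probe_ids))
```

In this toy example the right match is at the first rank for one probe out of three and within the first two ranks for all of them, so the curve rises from one third to one.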
mAP (mean Average Precision) is used for multi-shot cases. Gallery images of the same person are not grouped together so as to return a single similarity value for each gallery person. Instead, a ranked list of gallery images is returned for every query person. For a set of queries Q, the mean average precision is the mean of the average precision of each query:
$$\mathrm{mAP} = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP}(q) \qquad (2.8)$$
where AP is the average precision. The average precision is the mean of the precision scores after each relevant gallery image is retrieved. It is defined by:
$$\mathrm{AP}(q) = \frac{1}{R} \sum_{r=1}^{R} \mathrm{Precision}_r(q)$$
where R is the number of relevant gallery images for the query q and Precision_r(q) is the precision obtained once the r-th relevant image has been retrieved.
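As a small illustration of these two formulas, the sketch below computes AP and mAP from ranked lists of relevance flags. The function names and the boolean encoding (True when the gallery image at that rank shows the query person) are assumptions chosen for the example.

```python
def average_precision(relevance):
    """Average precision of one query.

    relevance: list of booleans, one per gallery image in ranked order,
               True when the image shows the query person.
    """
    n_relevant = 0
    precision_sum = 0.0
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            n_relevant += 1
            # Precision measured right after this relevant image is retrieved.
            precision_sum += n_relevant / position
    # Convention assumed here: a query with no relevant image scores 0.
    return precision_sum / n_relevant if n_relevant else 0.0


def mean_average_precision(relevance_per_query):
    """Mean of the average precisions over the set of queries."""
    aps = [average_precision(rel) for rel in relevance_per_query]
    return sum(aps) / len(aps)


# Hypothetical toy data: two queries with ranked relevance flags.
queries = [[True, False, True], [False, True]]
print(mean_average_precision(queries))
```

For the first query the precisions after the two relevant images are 1 and 2/3, giving an AP of 5/6; the second query has an AP of 1/2, so the mAP of this toy set is 2/3.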

From closed world re-id to open world detection and re-identification

In the open world re-identification task we tackle in this thesis, we relax the assumption that the person to be re-identified has been identified before. Therefore, strictly speaking, open world re-identification is not a re-identification task. The open world re-identification task we adopt here is actually a generalization of the closed world re-identification task, which can be decomposed into two subtasks: detection and re-identification. The detection task consists in determining whether the presented probe person should be matched with one of the gallery people or rejected as an imposter (someone who is not present in the gallery).
The objective of the re-identification task is to rank the gallery identities that are considered possible right matches for the presented probe person. Contrary to the closed world task, the open world task requires evaluating the detection aspect in addition to the relative ranking aspect. The evaluation commonly adopted in the few papers that deal with the same problem ([13, 46, 47]) is the one presented in the Operid paper [13]. The proposed DIR vs FAR evaluation (defined in section 2.5.2) is a unified measure for detection and ranking. As a reminder, the Detection and Identification Rate DIR(τ, r) represents the proportion of common identities that are found in the first r ranks with a dissimilarity score smaller than τ, and the False Acceptance Rate FAR(τ) is the proportion of imposter identities whose dissimilarity score to their closest gallery identity is smaller than τ (only the first rank matches are considered). Indeed, FAR is only a function of τ and it only takes into account the worst wrong match, i.e. the one with the smallest dissimilarity score; the other wrong matches, with bigger dissimilarity scores, are not taken into account in the FAR value. DIR is a function of the rank r and of the threshold τ, so DIR at first rank is solely a function of the threshold τ, and it only takes into account the first match. DIR at first rank versus FAR is thus a function of the threshold τ: it reflects the proportion of well re-identified non-imposter probe identities for given proportions of wrongly matched imposter probe identities, where the dissimilarity scores of matched elements are below the threshold τ.
Let's compare once again the methods A, B and C presented in Figure 3.1, but in the open world case this time. For the sake of the example, we examine two open world partitions of the identities. In both partitions, the probe set of identities is composed of all 5 identities. In the first partition, the gallery set contains people 1, 2 and 3. In the second partition, the gallery set contains people 3, 4 and 5. Since there are only two probe imposters, the non-zero False Acceptance Rate (FAR) only takes 2 values: 50% when only one of them is wrongly accepted and 100% when both are wrongly accepted. For each FAR value, the dissimilarity score of the wrongly accepted probe imposter is used as the decision threshold to determine the corresponding DIR. In our example, since there are two possible values of FAR, there are two decision thresholds τ1 and τ2. Figure 3.2 presents the open world situations for the two described partitions. For each probe person, only the first match is taken into account, i.e. we only consider the pair with the smallest dissimilarity score. These pairs are circled in black in the figure. For each FAR value and its corresponding decision threshold, the DIR value is computed by counting the proportion of non-imposter probe identities re-identified with a dissimilarity value smaller than the decision threshold. On the first partition, for all three methods, the open world DIR vs FAR results are perfect (100% recognition at first rank when the false acceptance rate equals 50%). Indeed, the circled green dots correspond to dissimilarity values that are smaller than the decision threshold τ1 associated with the first wrongly accepted probe imposter identity. On the second partition, however, the performances differ greatly depending on the method. While method A still achieves perfect DIR vs FAR results, methods B and C perform poorly. The open world results are provided in Table 3.1.
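The two rates recalled above can be computed as in the following sketch, which follows the wording of this section rather than any reference implementation. The function name `dir_far`, the dictionary-based data layout and the toy scores are assumptions made for the example.

```python
def dir_far(genuine, imposters, tau, r=1):
    """Return (DIR(tau, r), FAR(tau)) for dissimilarity scores.

    genuine:   list of (true_id, scores) for probes present in the gallery,
               where scores maps each gallery identity to its dissimilarity.
    imposters: list of scores dictionaries for probes absent from the gallery.
    """
    detected = 0
    for true_id, scores in genuine:
        ranked = sorted(scores, key=scores.get)  # smallest dissimilarity first
        # Right match within the first r ranks AND below the threshold.
        if true_id in ranked[:r] and scores[true_id] < tau:
            detected += 1
    # An imposter is wrongly accepted when its closest gallery identity
    # has a dissimilarity below the threshold.
    accepted = sum(1 for scores in imposters if min(scores.values()) < tau)
    return detected / len(genuine), accepted / len(imposters)


# Hypothetical toy scores: two gallery identities, two genuine probes,
# two imposters.
genuine = [("g1", {"g1": 0.2, "g2": 0.6}), ("g2", {"g1": 0.3, "g2": 0.5})]
imposters = [{"g1": 0.4, "g2": 0.7}, {"g1": 0.9, "g2": 0.8}]
print(dir_far(genuine, imposters, tau=0.45))
```

With these toy scores and a threshold of 0.45, one genuine probe out of two is found at first rank below the threshold and one imposter out of two is wrongly accepted, so both rates equal 0.5.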
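The procedure used in this example (one decision threshold per wrongly accepted imposter, then a DIR at first rank for each threshold) can be sketched as follows. The function name, the input layout and the numeric scores are hypothetical; they are not the values of Figures 3.1 and 3.2, and the sketch assumes that all imposter scores are distinct.

```python
def dir_at_far_points(genuine_first_matches, imposter_scores):
    """Return the list of (FAR, DIR at first rank) operating points.

    genuine_first_matches: one (is_right_match, dissimilarity) pair per
        non-imposter probe, describing its first (closest) gallery match.
    imposter_scores: dissimilarity of each imposter to its closest
        gallery identity (assumed distinct).
    """
    n_imposters = len(imposter_scores)
    n_genuine = len(genuine_first_matches)
    points = []
    for k, tau in enumerate(sorted(imposter_scores), start=1):
        # tau is the score of the k-th wrongly accepted imposter, so k
        # imposters are accepted at this operating point.
        well_reidentified = sum(
            1 for is_right, d in genuine_first_matches if is_right and d < tau
        )
        points.append((k / n_imposters, well_reidentified / n_genuine))
    return points


# Hypothetical scores: three non-imposter probes and two imposters.
first_matches = [(True, 0.1), (True, 0.2), (True, 0.5)]
imposter_min_scores = [0.6, 0.3]
print(dir_at_far_points(first_matches, imposter_min_scores))
```

With two imposters, only the FAR values 50% and 100% are produced, as in the text. Here the third probe is correctly matched but with a dissimilarity larger than the first threshold, so the DIR at 50% FAR is 2/3 and reaches 100% only at 100% FAR.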

Existing closed world re-id approaches used in open world re-identification

Since the person re-identification task was long tackled only in the closed world setting, with methods evaluated using CMC, a few methods are based only on ranking constraints per probe identity. This is the case, for example, of the metric learning methods EIML [76] and WARCA [78] and of neural network approaches [94, 95, 45].
Although it is possible that some of these approaches perform well for the open world re-id task, nothing in their formulation ensures it.
Some methods, instead of being based on ranking constraints per identity, are based on minimization problems that involve constraints on positive and negative pairs and that do not specifically distinguish these constraints per probe person. This is the case, for example, of the metric learning methods KISSME [21] and XQDA [23], where the objective is to minimize intra-class variance and maximize extra-class variance, and of RPLM [79], whose objective is to minimize the distance of positive pairs and maximize the distance of negative pairs. However, these kinds of methods specify neither a threshold below which the intra-class variance and the positive pair distances should fall, nor a threshold above which the extra-class variance and the negative pair distances should lie. Therefore, depending on the dataset, or even on the partition, the decision threshold corresponding to a given FAR value could vary a lot.
Some methods developed for closed world person re-identification, such as PCCA [20], cast the re-identification task as a binary classification task and introduce a threshold during the training phase. However, even when a threshold is used during training to separate positive pairs from negative pairs, the best decision threshold on the test set might not be the one used for training, because the distribution of the distances of test data is often shifted to the right (bigger values), negative pairs being better modeled.

Existing open world re-id approaches

Some papers specifically tackle an open world re-identification task. Among them, some [1, 2, 3, 4] tackle open world tasks different from the one we deal with, while a few others [13, 46, 47] also tackle the open world task we consider, composed of the two subtasks of detection and re-identification.
The papers [1] that adopt a definition of the open world re-identification task different from ours present methods that focus on the specific aspects of their own definition. In the multi-label inference paper [1], the optimization of the cost function aims at grouping images by identity and assigning a label to each detection. Rather than having to determine whether a person is a known gallery person or not, the total number of probe identities is unknown and has to be determined by the algorithm: there is actually no gallery. In [2, 3, 94] the gallery is assumed to be composed of a small group of target identities, and the goal is to determine whether a probe person is one of the gallery identities or not, without necessarily giving the exact identity of a probe person who is found to belong to the target set. In all three methods, the optimization is based on relative ranking constraints.

Table of contents:

List of Figures
List of Tables
1 Introduction
1.1 Context
1.2 Challenges
1.3 Thesis objectives
1.4 Thesis outline
1.5 List of publications
2 Related work
2.1 Introduction
2.2 Datasets
2.3 Closed world approaches
2.3.1 Representation learning
2.3.2 Metric Learning
2.3.3 Neural networks
2.3.4 Sparse representations
2.3.5 Re-ranking methods
2.4 Generalizing person re-identification
2.4.1 Identity Inference
2.4.2 Group-based verification
2.4.3 Detection and Re-Identification
2.4.4 Drone based
2.5 Evaluation measures
2.5.1 Closed world measures
2.5.2 Open world measures
2.6 Conclusion
2.7 Position of our work
3 COPReV
3.1 Motivation
3.1.1 Closed world re-identification
3.1.2 From closed world re-id to open world detection and re-id
3.1.3 Existing closed world re-id approaches used in open world re-id
3.1.4 Existing open world re-id approaches
3.2 COPReV
3.2.1 Overview
3.2.2 Problem notations
3.2.3 Mathematical formulation
3.2.4 Optimization
3.3 Experimental results
3.3.1 Feature extraction
3.3.2 Implementation details
3.3.3 Datasets and Re-ID scenarios
3.3.4 Precision about the evaluations
3.3.5 Evaluation on closed world re-id scenario
3.3.6 Evaluation on open world re-id scenario
3.3.7 Discussion on the evaluation measures and practical uses
3.3.8 Evaluation on the verification task
3.3.9 About the initialization
3.3.10 Robustness to unbalanced data
3.4 Conclusion
4 Sparse representations with enhanced collaboration
4.1 Preliminaries
4.1.1 Notations: training and testing data
4.1.2 Notations: sparse coding
4.1.3 Features prerequisites
4.2 Collaborative versus non collaborative sparse coding
4.2.1 Non collaborative sparse coding of probe elements
4.2.2 Collaborative sparse coding of probe elements
4.2.3 Comparison of non collaborative and collaborative sparse coding
4.3 Collaboration enhanced sparse coding for open world re-id
4.3.1 Enhanced collaboration for open world re-identification
4.3.2 Additional dictionary D
4.3.3 A method also relevant for person verification
4.3.4 About the exploitation of multi-shot data
4.4 Experimental results
4.4.1 Implementation details and feature extraction
4.4.2 Datasets, training and testing sets, testing protocols, evaluation
4.4.3 Evaluation on closed and open world re-identification tasks
4.4.4 Evaluation on the person verification task
4.5 Conclusion
5 Bidirectional Sparse Representations
5.1 Difference between sparse coding of probe and gallery elements
5.1.1 Known and undetermined identities
5.1.2 Availability of gallery and probe data
5.1.3 Final goal
5.2 Reverse direction: sparse coding of gallery elements
5.2.1 Sparse representation of gallery elements
5.2.2 Choice of the additional dictionaries
5.3 Ranking of gallery identities, meaning of the residual errors
5.4 Combination of both representations
5.5 Complexity
5.6 Experimental results
5.6.1 Evaluation on closed and open world re-identification tasks
5.6.2 Influence of the choice of the additional dictionaries
5.6.3 Influence of the number of probe identities simultaneously available
5.6.4 Evaluation on the person verification task
5.7 Conclusion
6 Conclusion and Perspectives
6.1 Conclusion
6.2 Perspectives
6.2.1 Design adapted features for sparse coding approaches
6.2.2 Adapt the sparse coding framework to multi-camera scenarios
6.2.3 Learn the additional dictionaries in the reverse direction sparse coding
6.2.4 Learn the additional dictionary in the direct direction sparse coding
6.2.5 Complexity and speed considerations for huge datasets
6.2.6 A better use of simultaneously appearing people
6.2.7 Generalize even more the re-identification task: dynamic set of identities
7 Summary in French
7.1 Introduction
7.2 State of the art
7.3 COPReV
7.3.1 Presentation of the method
7.3.2 Experimental results
7.3.3 Conclusion
7.4 Sparse representations with enhanced collaboration
7.4.1 Presentation of the method
7.4.2 Experimental results
7.4.3 Conclusion
7.5 Bidirectional collaborative representation
7.5.1 Presentation of the method
7.5.2 Experimental results
7.5.3 Conclusion
7.6 Conclusion and perspectives
7.6.1 Conclusion
7.6.2 Perspectives
Bibliography