EVOLUTION OF HETEROGENEOUS ENSEMBLES THROUGH DYNAMIC PARTICLE SWARM OPTIMIZATION FOR VIDEO-BASED FACE RECOGNITION
Statistical and Neural Classifiers
Although statistical and neural pattern classifiers may offer a flexible solution to biometric recognition problems, their performance depends heavily on the availability of representative reference data. Moreover, the majority of the classifiers proposed in the literature assume a static classification environment and can only perform supervised batch learning of a finite data set. To account for new information from new data, they must accumulate it in memory and retrain from the start using all previously acquired learning data. Otherwise, new data may corrupt the classifier’s previously acquired knowledge and compromise its ability to achieve a high level of generalization during future operations (the catastrophic forgetting problem). In the context of face recognition, this would lead to the corruption of facial class models as new data are added over time.
Video-based face recognition is becoming an important function in enhanced surveillance systems, which must simultaneously process many video feeds. As these applications must perform in real time, the design of efficient systems for facial matching involves a trade-off between classification speed, accuracy, and resources for the storage of facial models. For instance, today’s video surveillance networks are comprised of a growing number of IP cameras.
The need to design and store representative facial models for recognition – either more user templates or their statistical representation – increases the resource requirements of the system.
In addition, matching captured facial images to models for a large number of frames from different sources may severely increase the computational burden. Finally, the memory and time complexity associated with storing and relearning from the start on all cumulative data makes supervised batch learning impossible in this situation.
When new data becomes available, classifiers can be updated through supervised incremental learning in order to accommodate new knowledge and avoid a growing divergence between class models and their underlying distributions. Unlike batch learning, this approach avoids redundant and costly recomputation, and it reduces the memory resources associated with storing classifiers.
Learning and adapting classifiers in changing classification environments raises the so-called stability-plasticity dilemma, where stability refers to retaining existing and relevant knowledge while plasticity enables learning new knowledge (Grossberg (1988)). The literature proposes many classifiers which re-estimate their own parameters and architecture through incremental learning (Carpenter et al. (1991); Chakraborty and Pal (2003); Fritzke (1996); Okamoto et al. (2003); Ruping (2001)). However, if the plasticity of these classifiers is not adjusted to accommodate new knowledge presented with new reference data, they can still be affected by the catastrophic forgetting problem (Canuto et al. (2000); Dubrawski (1997); Fung and Liu (2003); Granger et al. (2007); Kapp et al. (2009)).
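The stability-plasticity trade-off can be illustrated with a deliberately simple sketch. It assumes a single class prototype updated by a convex combination with a learning rate `beta`; the function names and the update rule are illustrative assumptions, not the learning rule of any classifier cited above.

```python
def update_prototype(w, x, beta):
    """Move prototype w toward sample x; beta in [0, 1] sets the plasticity."""
    return [(1.0 - beta) * wi + beta * xi for wi, xi in zip(w, x)]


def learn(stream, beta, w0):
    """Present the samples of a stream one at a time, without storing them."""
    w = list(w0)
    for x in stream:
        w = update_prototype(w, x, beta)
    return w


# The old knowledge sits at the origin; five new samples arrive at (1, 1).
old_prototype = [0.0, 0.0]
new_samples = [[1.0, 1.0]] * 5

stable = learn(new_samples, beta=0.1, w0=old_prototype)   # retains most old knowledge
plastic = learn(new_samples, beta=0.9, w0=old_prototype)  # almost fully overwritten
```

With a low `beta` the prototype stays close to what was learned before (stability), while a high `beta` lets the new samples overwrite it almost entirely (plasticity, at the risk of forgetting).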
Recently, various methods employing adaptive ensembles of classifiers to perform incremental learning have been put into practice (Polikar et al. (2001); Kapp et al. (2010)). For a wide range of applications, where adaptation is not necessarily required, classifier ensembles make it possible to exploit several views of the same problem to improve overall accuracy and reliability. With the use of a combination function, they also offer more flexibility than single classifiers in how class models can be managed and adapted. These methods can be divided into three general categories (Kuncheva (2004)). The first category consists of dynamic combination, or “horse racing”, methods, where individual base classifiers are trained in advance to form a fixed ensemble and only the combination rule is changed dynamically (Blum (1997); Widmer and Kubat (1996); Xingquan et al. (2004)).
The second category consists of methods that rely on new data to update the parameters of the ensemble’s base classifiers through an online learner (Gama et al. (1999)). If blocks of data are available, training can also be performed in batch mode, with or without changing the combination rule at the same time (Breiman (1999); Ganti et al. (2002); Oza (2000); Wang et al. (2003)). The last main category consists of methods that grow ensembles by adding new base classifiers and replacing old or underperforming ones when new data is available (Chen et al. (2001); Street and Kim (2001); Kolter and Maloof (2007); Tsymbal et al. (2008)). Finally, some adaptive ensembles use hybrid approaches that combine adding new base classifiers with adjusting the combination rule to update class models. The most notable are streaming random forests with entropy (Abdulsalam et al. (2011)), Hoeffding trees with Kalman filter-based active change detection using an adaptive sliding window (Bifet et al. (2010)), maintaining and choosing the better of two ensembles trained with current and old data (Scholz and Klinkenberg (2006)), and the AdaBoost-like Learn++ (Polikar et al. (2001)).
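The category of growing ensembles can be sketched as follows. This is a minimal illustration under stated assumptions: the nearest-centroid base classifier, the `max_size` limit, the rule of dropping the member with the lowest accuracy on the newest block, and the majority vote are all hypothetical choices made for the example, not the procedure of any method cited above.

```python
from collections import Counter


class NearestCentroid:
    """Toy base classifier: one mean vector per class (illustrative stand-in)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist2)


def accuracy(clf, X, y):
    return sum(clf.predict(x) == t for x, t in zip(X, y)) / len(y)


class GrowingEnsemble:
    """Adds one member per data block and drops the weakest when full."""

    def __init__(self, max_size=3):
        self.max_size = max_size
        self.members = []

    def update(self, X, y):
        # The new member is trained on the new block only; old members are untouched.
        self.members.append(NearestCentroid().fit(X, y))
        if len(self.members) > self.max_size:
            # Assumed replacement rule: drop the lowest accuracy on the newest block.
            worst = min(self.members, key=lambda m: accuracy(m, X, y))
            self.members.remove(worst)

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]  # simple majority vote


ens = GrowingEnsemble(max_size=2)
ens.update([[0.0, 0.0], [1.0, 1.0]], ["a", "b"])
ens.update([[0.1, 0.0], [0.9, 1.0]], ["a", "b"])
ens.update([[0.0, 0.1], [1.0, 0.9]], ["a", "b"])
label = ens.predict([0.2, 0.1])
```

Note that existing members are never retrained in this sketch: knowledge enters the ensemble only through new members and leaves it only through replacement.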
Among these methods, horse racing approaches cannot accommodate new knowledge since base classifiers in the ensemble are never updated with new data. On the other hand, while online learners and growing ensembles can be used to explore unknown regions of the feature space, most methods focus on the notion of concept drift, where underlying class distributions change over time. They incrementally append new classifiers to a pool without updating the parameters of preexisting members, which would risk losing old knowledge. While these classifiers are trained with new data, their plasticity (or learning dynamics) is set beforehand and remains fixed throughout the learning process, without being adjusted to accommodate new knowledge. The underlying assumption is that old concepts, represented by old data, should never be revisited or reinforced, in contrast with new concepts presented with new data. Although this may happen in a face recognition application, for instance when classes are added to and removed from the system, it is not necessarily the case. In fact, when few biometric reference samples are available, the most common change is in the knowledge of the underlying distributions, which is initially incomplete. Moreover, face recognition systems in unconstrained scenes often face recurring changes in the environment (e.g., lighting effects over the course of a day) and in the individuals to recognize (beard, haircut, glasses, etc.). In this context, adaptive ensemble methods that focus on concept drift may forget old concepts that are still valid.
As detailed for the Learn++ algorithm (Polikar et al. (2001)) in Appendix I, methods that rely exclusively on adding new ensemble members to explore the feature space become problematic if all classes are not always represented. In the current face recognition application, for instance, when new data becomes available after a classifier is designed and deployed in the field, it will most likely be sampled from a few people, or even a single person, at a time. While previously trained classifiers will not be able to recognize new classes, the ones trained with the new data will contain only the facial models of individuals registered in the system at that time.
Objective and contributions
This thesis addresses the challenges mentioned above and seeks to provide a video face recognition system with a means to perform enrollment and update of biometric models incrementally when new data becomes available. In the context of real-world video applications, where classifier predictions must be accurate and available in real time, an ideal face classifier must accommodate emerging reference samples such that two objectives are minimized: classification error rate and computational cost. To achieve this, the relationship between a classification environment, where a classifier’s decision boundaries are defined, and an optimization environment, composed of a hyperparameter search space and an objective space, is studied and characterized. The result, and the core of this thesis, is a supervised incremental learning strategy based on particle swarm optimization (PSO) that is used to evolve a swarm of fuzzy ARTMAP (FAM) neural networks in response to new data. As each particle in the hyperparameter search space corresponds to a FAM network, the learning strategy co-optimizes all classifier parameters (hyperparameters, weights, and architecture) in order to maximize accuracy while minimizing computational cost and memory resources.
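The idea of particles moving through a hyperparameter search space can be sketched with a canonical global-best PSO. This is only an illustration under stated assumptions: it is the textbook static PSO rather than the dynamic variants developed in this thesis, the search space is normalized to the unit hypercube, and `toy_fitness` is a hypothetical stand-in for the error rate that would be obtained by training and validating a classifier with the hyperparameters encoded by a particle.

```python
import random


def pso_minimize(fitness, dim, n_particles=10, n_iters=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical global-best PSO over the unit hypercube [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so that each hyperparameter stays in its valid range.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_fitness(h):
    # Hypothetical objective with its minimum at h = (0.3, 0.3).
    return sum((x - 0.3) ** 2 for x in h)


best_h, best_val = pso_minimize(toy_fitness, dim=2)
```

In an incremental learning setting the objective changes whenever new data arrives, so positions that were optimal before may no longer be; a static optimizer such as this one does not account for that.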
Table of Contents
0.1 Problem Statement
0.1.1 Biometric Systems
0.1.2 Statistical and Neural Classifiers
0.1.3 Adaptive Ensembles
0.2 Objective and contributions
0.3 Organization of the Thesis
CHAPTER 1 AN ADAPTIVE CLASSIFICATION SYSTEM FOR VIDEO-BASED FACE RECOGNITION
1.2 Biometrics and face recognition from video sequences
1.3 Adaptive classification system
1.3.1 Long term memory
1.3.2 Fuzzy ARTMAP Neural Networks
1.3.3 Dynamic particle swarm optimization
1.4 Experimental Methodology
1.4.1 Video Data bases
1.4.2 Incremental learning scenarios
1.4.3 Experimental protocol
1.5 Results and Discussion
1.5.1 Experiment (A) – Impact of the LTM for validation data
1.5.1.1 Enrollment scenario
1.5.1.2 Update scenario
1.5.2 Experiment (B) – Impact of dynamic optimization
1.5.2.1 Enrollment scenario
1.5.2.2 Update scenario
CHAPTER 2 EVOLUTION OF HETEROGENEOUS ENSEMBLES THROUGH DYNAMIC PARTICLE SWARM OPTIMIZATION FOR VIDEO-BASED FACE RECOGNITION
2.2 An adaptive multiclassifier system
2.2.1 Fuzzy ARTMAP neural network classifiers
2.2.2 Dynamic particle swarm optimization
2.3 Strategy for evolving heterogeneous ensemble of FAM networks
2.3.1 Generation and evolution of heterogeneous classifier pools
2.3.2 Selection of diversified ensembles
2.4 Experimental methodology
2.4.1 Application–face recognition in video
2.4.2 Video data bases
2.4.3 Incremental learning scenarios
2.4.4 Experimental protocol
2.4.5 Performance evaluation and diversity indicator
2.5 Results and discussion
2.5.1 Performance for single images (ROIs)
2.5.2 Performance for video-streams (multiple ROIs)
2.5.3 Particle diversity -vs- classifier diversity
CHAPTER 3 DYNAMIC MULTI-OBJECTIVE EVOLUTION OF CLASSIFIER ENSEMBLES APPLIED TO VIDEO-BASED FACE RECOGNITION
3.2 Adaptive biometrics and video face recognition
3.3 Adaptive classifier ensembles
3.3.1 An adaptive multiclassifier system
3.3.2 Fuzzy ARTMAP neural network classifiers
3.3.3 Adaptation as a dynamic MOO problem
3.4 Evolution of incremental learning ensembles
3.4.1 ADNPSO incremental learning strategy
3.4.2 Aggregated dynamical niching PSO
3.4.3 Specialized archive and ensemble selection
3.5 Experimental methodology
3.5.1 Video data bases
3.5.2 Incremental learning scenarios
3.5.3 Experimental protocol
3.5.4 Performance evaluation
3.6 Results and discussion
3.6.1 Performance during video-based face recognition
3.6.2 Swarm and archive evolution during optimization
APPENDIX I ANALYSIS OF THE LEARN++ ALGORITHM FOR VIDEO-BASED
APPENDIX II INCREMENTAL LEARNING AS A DYNAMIC OPTIMIZATION PROBLEM