Mining Fuzzy Moving Object Clusters 

Get Complete Project Material File(s) Now! »

Illustrative Example and Motivations

In this section, we will give an example to generally present challenging issues and our motivations in object movement pattern mining.
1) The first issue is how to efficiently manage different kinds of patterns? Indeed, given a specific dataset, e.g. see Figure 1.1, it is difficult to know which kind of patterns is useful to analyze the data. Intuitively, that could be a set of closed swarms indicate that the cars o1 and o2 or the cars o3 and o4 are moving together. However, that also could be a set of moving object clusters state that the cars o1,o2,o3 and o4 are moving together from C to D to E to F, etc.
To deal with this issue, one of potential solutions is employing all the existing ap-proaches, each of which extracts a specific pattern, to obtain all kinds of patterns. Then, user can analyze the final results to understand the object movement behavior. Naturally, the computation is costly and time consuming since we need to execute different algo-rithms consecutively. Additionally, in some applications object locations are continuously updated, e.g. cars report their locations by using Global Positioning System (GPS). There-fore, new data is always available and it means that we need to execute again and again the algorithms on the entire data to update the final results. This is of course, cost-prohibitive and time consuming as well.
2) The second issue is that the existing patterns are relevant and help us to fully under-stand the complex and unpredictable behavior of moving objects or not. For instance, they require the group of moving objects to be together for at least mint timestamps, i.e. could be consecutive or completely be non-consecutive, which might not be practical in the real cases. Enforcing consecutive time constraint may results in loosing meaningful patterns while completely relax this constrain may generate large amount of extraneous patterns.
For instance, see Figure 1.1, enforcing the consecutive time constraint results in losing of the pattern « the two cars, o1 and o2, are moving together from B to E and to F » since they are not close each other at time t3. While completely relax this constraint will generate the extraneous patterns such as « the two cars, o1 and o2, are moving together from B to E to F to G to K and to L ». This pattern are irrelevant because they meet each other at L after 949 timestamps by chance and not actually moving together from K to L.
Another illustrative example is that the traditional movement patterns usually focus on an unchanged group of objects and thus they cannot capture the object moving trend. Indeed, objects can step by step getting together to go to some place or leaving each other from that place.
For instance, in Figure 1.1, « all the cars are gathering at E after A to C and to D ». This phenomenon is involved in many real world applications such as traffic congestion, animal or population migration, location prediction, etc. Therefore, it demands the definition of novel movement pattern models to enrich the application benefit.
3) Naturally, end user can be overwhelmed by a huge number of extracted patterns al-though only a few of them are useful. Indeed, thousand of patterns representing redundant knowledge clearly poses limit in their usefulness. However, relatively few researchers have addressed the problem of reducing movement pattern redundancy.
4) The potential utility of the developed spatio-temporal mining tools must be justi-fied in their applicable field. One of our ultimate goals is to develop techniques that not only benefit the computer science society, but more importantly, are applicable to vari-ous interdisciplinary science and engineering fields. Therefore, it is always crucial for us to emphasize the practicability of our methods. Performance measures of the developed tools in terms of accessibility, scalability, reliability and other human factors also need to be carefully studied, in order to provide answers to concerns from domain experts.

Contributions

All of the mentioned issues have been addressed in my thesis. We propose a three step framework, i.e. Figure 1.2. First of all, we develop a unifying approach to extract and manage multiple patterns. Second, we discover novel kinds of movement patterns and efficient algorithms to mine them. To select informative patterns from thousand of very redundant ones, we further design a compression step based on Minimum Description Length prin-cipal. To justify the utility of the proposed concepts and developed approaches, we also supply a demonstration system, named MULTI_MOVE, which is designed to efficiently and automatically extract different movement patterns. Our contributions are generally pre-sented as follows:
All in One: Mining Multiple Movement Patterns. Designing a unifying approach which can manage multiple movement patterns is very challenging. This is because: 1) there are many kinds of patterns and each of which have their own characteristics. 2) Additionally the results need to be reusable to obtain the final results when new data is available. 3) Obtaining the optimal parameters is a difficult task for most of algorithms which require parameter setting.
All these challenging issues have been throughly considered and well addressed in my proposed method, GET_MOVE [12]. The basic idea of GET_MOVE is to reconfigure moving object data to a cluster matrix then movement patterns will be redefined in an itemset context. Next, frequent closed itemset will be adopted to efficiently extract closed itemsets from which movement patterns can be mined.
Fuzzy Moving Object Clusters. To soften the consecutive time constraint, the key chal-lenge is to deal with the time gap between a pair of clusters since: 1) it is difficult to rec-ognize which size of a time gap is relevant or not, 2) we need to know when the patterns should be ended to eliminate uninteresting ones.
To address these issues, we present the definition of fuzzy time gap and fuzzy time gap participation index. Obtained patterns are of the type « the two cars, o1 and o2, are moving together from B to E to E to G and to K with 60% weak, 20% medium and 20% strong time gaps », see Figure 1.1. These patterns are characterized by their time gap frequency (or support), which is by definition the proportion of time gaps involved in the patterns. To extract all the fuzzy closed swarms, we propose a novel itemset-based property so that GET_MOVE can be employed to gather all the fuzzy closed swarms.
Time Relaxed Gradual Trajectory Patterns. A gradual moving object cluster is a list of moving object clusters which satisfy the graduality constraint and integrity condition during at least mint timestamps. The graduality constraint can be the increase or decrease of the number of objects and the integrity condition can be that all the objects should remain in the next cluster.
As exemplified in Figure 1.1, the retrieved knowledge from traditional patterns can be « the two cars, o1 and o2, are moving together from t2 to t6 » or « the two cars, o3 and o4, are moving together from t1 to t4 ». Even if these patterns are relevant, they do not really present the actual moving behavior which can be « from t1 to t4, as time passes, the more cars are following the trajectory {A i C i D i E} ».
To avoid finding redundant rGpatterns, we only focus on discovering the complete set of maximal rGpatterns. The basic idea is that if C is a rGpattern, it is unnecessary to output any subset C0 of C even if C0 may also satisfy rGpattern requirements.
Efficiently extracting of complete set of maximal rGpatterns in a large moving object data, denoted db, is a non-trivial task. First, the size of all possible combinations is expo-nential, i.e. approximately 2|Cdb| where Cdb is the set of all clusters extracted from the data db and |Cdb| is significantly large. Second, none of previous work (i.e. frequent pattern mining [1] [15], moving object clusters [7] [17] [18] [23]) solves exactly the same issue as finding maximal rGpatterns. This is because they do not address the graduality in terms of itemsets or moving clusters. However, graduality is a key feature in rGpattern context. Thus, the discovery of rGpatterns introduces a new problem that needs to be solved by specifically designed techniques.
Facing the huge potential search space, we propose an efficient approach, named CLUSTERGROWTH, which shares the same spirit with the ObjectGrowth algorithm [23] but different in terms of design and goal. In CLUSTERGROWTH, we design two efficient rules which are Graduality Pruning and Backward Pruning to end unnecessary further search. After cutting a great portion of invalid candidates through the pruning rules, we perform a further check, i.e. Actual Maximum Checking, to report interesting maximal rGpatterns on-the-fly. This final step avoids using more space to store candidates and extra time for post-processing.
To enrich the utility of the rGpattern concept, we adapt the Minimum Description Length (MDL) principle for mining representative rGpatterns. An encoding scheme which is designed to deal with different kinds of overlapping rGpattern structures is proposed. We show that mining representative rGpatterns is NP-Hard and therefore we propose two heuristic algorithms to extract compressing rGpatterns. The first algorithm, named COMPOGP, uses a greedy two-phase approach. To overcome performance with the required candidate generation in COMPOGP we propose an effective algorithm, called DICOMPOGP, to directly mine compressing rGpatterns.
Mining Representative Object Movement Patterns. Most of the researches are devoted to extract trajectories that differ in their structure and characteristic in order to capture different object behaviors. The first issue is constituted from the fact that all these meth-ods extract thousand of patterns resulting in a huge amount of redundant knowledge that poses limit in their usefulness. The second issue is supplied from the nature of spatio-temporal database from which different type of moving object patterns could be extracted. This means that using only a single type of patterns is not sufficient to supply an insightful picture of the whole database.
Motivating by these issues, we develop a Minimum Description Length (MDL)-based approach that is able to compress spatio-temporal data combining different kinds of mov-ing object patterns. The core of the method is to design and encoding scheme for different movement patterns, and then an approach which allow overlapping between them. The proposed method results in a rank of the patterns involved in the summarization of the dataset.
MULTI_MOVE System. Despite the growing demands for diverse applications, there have been few scalable tools for mining massive and sophisticated moving object data. Even if some tools are available for extracting patterns (e.g. [33]), they mainly focus on specific kinds of patterns at a time. Obviously, when considering a dataset, it is quite dif-ficult for the decision maker to know in advance which kinds of patterns are embedded in the data. To cope with this issue, we propose the MULTI_MOVE system to reveal, au-tomatically and in a very efficient way, collective movement patterns like convoys, group patterns, closed swarms, moving clusters and also periodic patterns. Starting from the results of MULTI_MOVE, the user can then visualize, browse and compare the different ex-tracted patterns through a user friendly interface in Google Maps and Google Earth. The MULTI_MOVE system has gained a lot of publicity, since it is the first system that enable users to obtain the comparative results.
Moreover, we also demonstrate that our proposed approaches and systems can be ef-ficiently applied to many other domains such as gene expression analysis, and tweet user behavior analyzing. The GET_MOVE approach was applied on 3 HIV time-series gene ex-pression dataset to outline relationships between genes based on their expressions at dif-ferent timestamps following infection. We have found that clustering gene expression data groups together efficiently genes of similar function based on their FC value and coherent results with our knowledge on HIV-1 versus HIV-2 infection were obtained. In additional, trajectories expressing the evolution of French political communities can be extracted from the political tweets which have been gathered during the French Election Campaign 2012.
Mining Multi-Relational Gradual Patterns. As an extra work, we also present the multi-relational gradual patterns. Indeed, gradual patterns highlight covariations of at-tributes of the form “The more/less X, the more/less Y ». Their usefulness in several applications has recently stimulated the synthesis of several algorithms for their automated dis-covery from large datasets. However, existing techniques require all the interesting data to be in a single database relation or table. This work extends the notion of gradual pat-tern to the case in which the co-variations are possibly expressed between attributes of different database relations. The interestingness measure for this class of “relational grad-ual patterns » is defined on the basis of both Kendall’s ⌧ and gradual support. Moreover, we propose two algorithms, named ⌧RGp and gradRGp, for the discovery of relational gradual rules, and three pruning strategies to reduce the search space. The efficiency of the algorithms is empirically validated, and the usefulness of relational gradual patterns is proved on some real-world databases.
The remaining of the report is organized as follows. The related work is discussed in Chapter 2. GET_MOVE is presented in Chapter 3. Mining fuzzy closed swarms and gradual trajectory patterns are described in Chapters 4 and 5 separately. Mining representative movement patterns will be introduced in Chapter 6. I also briefly present MULTI_MOVE system in Chapter 7. The extra work, i.e. mining multi-relational gradual patterns, and the conclusion will be drawn in Chapters 8 and 9.
= {o1,o2,…,oz}.

READ  Effects of stochastic wave forcing on equilibrium shoreline modelling across the 21st century

Related Work

Preamble

The problem of object movement patterns has been extensively addressed over the last years. Basically, an object movement patterns are designed to group similar trajecto-ries or objects which tend to move together during a time interval. So many different definitions can be proposed and today lots of patterns have been defined such as flocks, convoys, closed swarms, moving clusters and even periodic patterns.

Preliminary Definitions

First of all, let us assume that we have a group of moving objects Odb a set of timestamps Tdb = {t1,t2,…,tn} and at each timestamp ti 2 Tdb, spatial infor-mation 1 x,y for each object. For example, Table 2.1 illustrates an example of a moving object database. Usually, in object movement pattern mining, we are interested in ex-tracting a group of objects staying together during a period. Therefore, from now, O = {oi1 ,oi2 ,…,oip }(O µ Odb) stands for a group of objects, T = {ta1 ,ta2 ,…,tam }(T µ Tdb) is the set of timestamps within which objects stay together. Let mino be the user-defined threshold standing for a minimum number of objects and mint the minimum number of timestamps. Thus |O| (resp. |T|) must be greater than or equal to mino (resp. mint).
Database of clusters. A database of clusters, Cdb = {Ct1 ,Ct2 ,…,Ct n }, is the collection of snapshots of the moving object clusters at timestamps {t1,t2,…,tn}. Note that an object could belong to several clusters at one timestamp (i.e. cluster overlapping). Given a cluster c 2 Ct(2 Cdb) and c µ Odb, |c| and t(c) are respectively used to denote the number of objects belong to cluster c and the timestamp that c involved in. To make the process more general, the clustering 2 is taken as a preprocessing step. In the following, we formally define all the different kinds of movement patterns.

Table of contents :

1 Introduction 
1.1 Illustrative Example andMotivations
1.2 Contributions
2 RelatedWork 
2.1 Preliminary Definitions
2.2 ObjectMovement PatternMining
3 All in One: Mining Multiple Movement Patterns 
3.1 Object Movement Patterns in Itemset Context
3.2 Frequent Closed Itemset-based ObjectMovement PatternMining Algorithm
3.2.1 GeT_Move
3.2.2 Incremental GeT_Move
3.3 Preliminarily Experimental Results
3.3.1 Effectiveness
3.3.2 Efficiency
3.3.3 Toward A Parameter Free Incremental GeT_Move Algorithm
3.3.4 Object Movement Pattern Mining  Algorithm  Based on Explicit Combination of FCI Pairs
3.4 Experimental Results
3.4.1 Parameter Free Incremental GeT_Move Efficiency
3.4.2 Movement Pattern Mining Algorithm Based on Explicit Combination of FCI Pairs
3.5 Discussion
4 Mining Fuzzy Moving Object Clusters 
4.1 Introduction
4.2 Fuzzy Closed Swarms
4.3 Discovering of Fuzzy Closed Swarms
4.4 Experimental Results
4.4.1 Effectiveness
4.4.2 Parameter Sensitiveness
4.5 Discussion
5 Mining Time Relaxed Gradual Moving Object Clusters 
5.1 Introduction
5.2 Problem Statement
5.3 Discovering Maximal Time Relaxed Gradual Trajectory Patterns
5.3.1 ClusterGrowth Approach
5.3.2 The ClusterGrowth Implementation
5.4 Preliminarily Experimental Results
5.4.1 Effectiveness and PatternMeaning
5.4.2 Parameter Sensitiveness
5.5 Mining Representative Gradual Trajectory Patterns
5.5.1 Problem Statement
5.5.2 Encoding Scheme
5.5.3 Complexity Analysis
5.5.4 Mining top-K Representative rGpatterns
5.6 Experimental Results onMining Representative rGpatterns
5.7 Discussion
6 Mining Representative Movement Patterns through Compression 
6.1 Introduction
6.2 Problem Statement
6.3 Encoding Scheme
6.3.1 Movement Pattern Dictionary-based Encoding
6.3.2 OverlappingMovement Pattern Encoding
6.4 Mining Compression ObjectMovement Patterns
6.4.1 Naive Greedy Approach
6.4.2 Smart Greedy Approach
6.5 Experimental Results
6.6 Discussion
7 MiningMulti-Relational Gradual Patterns 
7.1 Introduction
7.2 Preliminarily Definitions
7.2.1 Multi-Relational Data
7.2.2 Gradual Pattern: Single Relation vsMulti-Relations
7.2.3 Multi-Relational Gradual Pattern
7.3 Pattern Occurrences
7.4 Pattern Support
7.4.1 Kendall’s ⌧-basedMulti-Relational Gradual Pattern Support
7.4.2 Gradual Support
7.5 Multi-Relational Gradual PatternMining Algorithms
7.5.1 MiningMono-Relational Gradual Patterns
7.5.2 DiscoveringMulti-Relational Gradual Patterns
7.6 Experimental Results
7.6.1 Multi-Relational Gradual Patterns
7.6.2 Efficiency and Pattern Distribution
7.7 RelatedWork
7.8 Discussion
8 Applications 
8.1 Introduction
8.2 The MULTI_MOVE System Architecture
8.3 Other Applications
8.3.1 Mining Trajectories on Genes
8.3.2 Mining Trajectories on Tweets
8.4 Discussion
9 Conclusion & Perspectives 
9.1 Conclusion
9.2 Streaming GeT_Move: Mining Representative Movement Patterns from Streaming Trajectory Data
9.3 CorGpattern: Combined Time Relaxed Gpattern
9.3.1 CorGpattern Definition
9.3.2 CoClusterGrowth: DiscoveringMaximal CorGpatterns
9.4 DirectlyMining RepresentativeMovement Patterns through Compression
9.5 CompletedMiningMulti-Relational Gradual Patterns
9.6 TrajectoryMining on Diverse Applications
9.6.1 Social Networks and SocialMedia
9.6.2 Remote Sensing, Spatial Information on Satellite Image Processing
10 Publications 
10.1 International Conferences and Journals
Bibliography

GET THE COMPLETE PROJECT

Related Posts