Human Mobility Assessment
A deep understanding of human-mobility patterns can yield interesting insights into a variety of important societal issues. As a matter of fact, evaluating the eﬀect of human displacements on the environment maps to determining how large popu-lations move in their daily lives. Likewise, understanding the spread of a disease requires a clear picture of how humans move and interact . Other examples abound in such fields as urban planning, where knowing how people come and go can help determining where to deploy infrastructure and how to reduce traﬃc con-gestion . Moreover, cloud and content delivery networks , and location-based recommender systems  , can all benefit from quantitative and qualitative knowledge of users’ mobility patterns.
Nowadays, the huge worldwide mobile-phone penetration is increasingly turning the mobile network into a gigantic ubiquitous sensing platform, enabling large-scale analysis and applications. The objective of this chapter is to check the appropriate-ness of using mobile phone data in estimating the trajectory of people across urban areas and to assess the pertinence of diﬀerent conceivable trajectory estimation approaches in terms of deviation from real human trajectories.1 After reviewing related work from the literature and describing the dataset used in our study, we define an appropriate, compact and easy to compute mobility metric to qualify user mobility, the “radius of gyration”. Then we present the diﬀerent interpolation methods used to model user trajectories and we compare them together in terms of deviation from real trajectories. The results highlight the dependence of mobility characteristic with the radius of gyration.
Related Work on Macro-Mobility Analysis
In recent years, mobile data-based research reached important conclusions about various aspects of human characteristics, such as human mobility and calling pat-terns  , virus spreading  , social networks   , urban and transport planning   and network design .
Nevertheless, in such user displacement sampling data, a high uncertainty is related to users movements, since available samples strongly depend on the user-network interaction frequency. For instance, Call Data Records CDRs (i.e., data records containing information related to phone calls, such as the origination and destination addresses of the calls, the starting time and the duration of the calls, etc.) alone do not provide a suﬃcient fine granularity and accuracy, exhibiting a vast uncertainty about the periods when the user is not active, i.e., not communicating. This represents a challenge for applications and analysis assuming ubiquitous and continuous user-tracking capability.
Some modeling techniques have been proposed in the literature to predict user movement between two places.
Authors in  and  infer the top-k routes traversing a given location se-quence within a specified travel time from uncertain trajectories; they use check-in datasets from mobile social applications.2 The proposed methods permit to identify the most popular travel routes in a city, but they do not allow constructing time-sensitive routes, i.e., the proper visiting time of places as well as the transit time between two places cannot be determined.
Authors in  propose a space-time prism approach, where the prism represents reachable positions as a space-time cube, given user’s origin and destination points– i.e., the assumption of knowing the location of a user at one time and then again at another time fits well mobile phone data in which we only know users’ position during their communication events – as well as time budget and maximum speed. Spatial prisms so allow evaluating binary statements, such as the encounter potential between two moving users. However, the maximum speed that cannot be set for all users in general, is considered as a major limitation of the model applicability.
Similarly, the authors in  propose a probabilistic extension of the space-time approach, applying a non-uniform probability distribution within the space-time prism. A strong assumption made therein is that users move linearly over time. This hypothesis is in a high contrast with the results obtained in  that show the tendency of users to stay in the vicinity of their call places.
Authors in  propose a probabilistic inter-call mobility model, using a finite Gaussian mixture model to determine users’ position between their consecutive communication events (call or SMS) using CDRs. The model evaluates the density estimation of the spatio-temporal probability distribution of users position between calls, but it does not give an approximation of the fine-grained trajectory between calls.
User displacements using GPS traces have been analyzed in ; the authors find that the displacement behavior shows Levy walk properties (i.e., random walk with pause and flight lengths following truncated power laws). While very interesting in order to model inter-contact time distributions and general massive mobility, such random-based approaches cannot give precise approximations between given points on a per-user basis.
In this chapter, we assess the pertinence of diﬀerent conceivable trajectory es-timation approaches in terms of deviation from real available trajectories, via the analysis of real mobile network data for the state of Massachusetts in USA.
Airsage Dataset Description
We use a dataset consisting of anonymous cellular phone signaling data collected by AirSage , which converts cellular network signaling data into anonymous locations over time for cellular devices. The dataset consists of location estimations
– latitude and longitude – for about one million devices from July to October 2009 in the state of Massachusetts. These data are generated each time the device connects to the cellular network including:
• when a call is placed or received (both at the beginning and end of a call);
• when a short message is sent or received;
• when the user connects to the Internet (e.g., to browse the web, or through email synchronization programs).
The location estimations3 not only consist of identifiers of the mobile phone towers that the mobile phones are connected to, but also an estimation of their positions generated through triangulation, using the AirSage’s Wireless Signal Extraction technology . This technology aggregates and analyzes wireless signaling data4 from mobile phones to securely and privately monitor the location and movement of populations in real-time, while guaranteeing acceptable user anonymity and privacy. We note that the observation period is limited to one day because the anonymized user identifiers change from one day to another to ensure user privacy.5
People do not behave similarly; each person has diﬀerent mobility habits and be-haviors. Many studies have been conducted to find mobility patterns from network data sampling, from very complex and complete ones able to determine precise mo-tifs (e.g., ), to more aggregated and synthetic ones extracting a single parameter to characterize user mobility. A suﬃciently precise, synthetic and easy to compute parameter is the radius of gyration. In , authors show that the radius of gyra-tion is defined as the deviation of user positions from the corresponding centroid position.
Given a user u ∈ U (i.e., U represents the set of all users) who has been located at nu(t) locations until time t, its radius of gyration can be thus computed as: wherer iu(t) represents the i th position recorded for the user u andr cmu(t) repre-sents its centroid defined as the center of mass of the user’s recorded displacements.
To explore the statistical properties of the population’s mobility patterns, the cumulative distribution function (CDF) of the radius of gyration for the smartphone users is represented in Figure 2.1 using the dataset described in Section 2.2. It is easy to distinguish four main user categories6 based on steep changes in the CDF slope.
• Users with rg ≤ 3km, who can be identified as the most sedentary people.
• Users with 3km ≤ rg ≤ 10km. They might be identified as urban mobile people as the diameter of the Boston metropolitan area is very approximately around 10 km.
• Users with 10km ≤ rg ≤ 32km. They might be identified as peri-urban mobile people as the diameter of the Boston peri-urban area is very approximately around 32 km.
• Users with rg ≥ 32km, who can be identified as commuters spanning the whole Massachusetts state area.
This ranking seems appropriate as the total traveled length increases with the radius of gyration, as displayed in Figure 2.27. Moreover, this correlation may be interpreted by the fact that the radius of gyration can be viewed as a proper “territory” of each user, and thus increasing the territory area means that the person is able to move over longer distances.
Trajectory Interpolation Methods
Diﬀerent interpolation methods have been proposed in the literature to describe moving object trajectories. We present in the following a selection of classical ones, showing how they approximate the real trajectory; referring to the example in Figure 2.3.
• the Linear Interpolation, is a popular interpolation used for moving objects . It is presented in Figure 2.3(b). It is obtained by joining straight interpolating lines between each pair of consecutive samples. Users are supposed to move at a constant speed along the straight lines. One limitation of the linear interpo-lation is that it can fail in some situations where the interval of time between interpolated points is high. For example, suppose there are two points A and B in the road network with a curved path connecting them: with the linear interpolation we always assume the user drives along a straight line.
• the Nearest-neighbor Interpolation, is an interpolation often used in mapping programs , also known as proximal interpolation. It consists of taking, for each position, the value of the nearest sampling position in time (not plotted in Figure 2.3 because of the simplistic decision). Therefore, if we detect the same user in two diﬀerent instants, at point A and point B respectively, the nearest interpolation attaches the user to position A for the first half period of time, and to position B for the second half.
• the Piecewise Cubic Hermite Interpolation is often used in image processing studies (see ). It is depicted in Figure 2.3(c). It is a third-degree spline that interpolates the function by a cubic polynomial using values of the function and its derivatives at the ends of each subinterval. This method interpolates the samples in such a way that the first derivative is continuous, but the second derivative is not necessary continuous. The slopes are chosen in a way that the function is “shape preserving” and respects monotonicity. Suppose a subinterval [x1, x2], with the function values: y1 = f (x1), y2 = f (x2) and the derivative values d1 = f ′(x1) and d2 = f ′(x2) are given. The cubic polynomial function in this subinterval is given by: C(x) = a + b(x − x1) + c(x − x1)2 + d(x − x1)2(x − x2) (2.3) satisfying C(x1) = y1, C(x2) = y2, C′(x1) = d1 and C′(x2) = d2 This inter-polation determines the coeﬃcients a, b, c and d noting that: C′(x) = b + 2c(x − x1) + d[(x − x1)2 + 2(x − x1)(x − x2)] (2.4) is also continuous. c = y1′−d1 and d = x2−x1
The solution to this system is given by: a = y1; b = d1; d1+d2−2y1′ ′ y2−y1 (x2−x1)2 , where y1 = x2−x1 .
In order to qualify the precision of diﬀerent interpolation methods, we have to de-termine the deviation of the estimated trajectory from the real one. To determine real user trajectories, we fine-select data of those smartphone holders with a lot of samplings, typically those data-plan users with persistent Internet connectivity due to applications such as e-mail synch. By selecting users with more than 1000 con-nections (position samplings) during a given day, we can filter out 707 smartphone users from the whole dataset.
In order to reproduce “normal-phone user” sampling, we subsample8 real trajec-tories (i.e. smartphone user trajectories) according to the experimental inter-event time distribution (i.e., the inter-event time is defined as the time between two con-secutive connections, done by the same user, to the mobile network), extracted from the same dataset and given in Figure 2.4. We determine it by analyzing real normal-phone user samplings (for which the real trajectory is unknown), available in the Airsage original dataset. Therefore, we extract, from the real trajectory, a first random position Pi(longitudei, latitudei, timei), then the corresponding next positions are extracted according to the inter-event time distribution. Hence, given a real trajectory with a high number of positions, and its subsampling that repro-duces normal user’s activity, we apply an interpolation method (as of Section 2.4) to estimate the trajectory between the subsampled points. Given a real user’s position Pi(longitudei, latitudei, timei), we estimate its corresponding position Pi′ (longitude′i, latitude ′i, timei). Then we determine the deviation between the two points Pi and Pi′ as the distance separating the exact position Pi to the estimated position Pi′ in the interpolating curve joining the samples.
Table of contents :
I Mobility Estimation
2 Human Mobility Assessment
2.1 Related Work on Macro-Mobility Analysis
2.2 Airsage Dataset Description
2.3 Mobility Metrics
2.4 Trajectory Interpolation Methods
2.5 Trajectory Modeling
2.6.1 Interpolation Error
2.6.2 Interpolations’ Probability Density Function
3 Estimation of Mobile Crowded Spots
3.2 Orange Dataset Description
3.3 Content Consumption Cartography
3.3.1 Content Consumption Spatial Distributions
3.3.2 Application Usages Distributions
3.4 Crowded Spot Estimators
3.4.1 Related Work on Cell Load Estimation
3.4.2 Trajectory-based Crowded Spot Estimator
3.4.3 Territory-based Crowded Spot Estimator
3.4.4 Comparison between Estimators
II Resource Contention and Offloading Solutions
4 Horizontal Traffic Offloading with Small-Cell Networks: Cooperative Resource Allocation Approaches
4.2 Related Work on Resource Allocation in Femtocell Networks
4.2.1 Split-Spectrum Distributed Schemes
4.2.2 Game-Theoretic Approaches
4.3 Context and Problem Formulation
4.3.2 Related Centralized Optimization Problem
4.3.3 Possible Distributed Approaches
4.3.4 Bankruptcy Game Modeling
Interference Set Detection
Bankruptcy Game Iteration
4.3.5 Possible Imputation Schemes
4.3.6 An Illustrative Example
4.4 Performance Evaluation
4.4.1 Throughput Analysis
4.4.2 Fairness Analysis
4.4.3 Time Complexity Analysis
4.5 Wireless Mesh Networks Use-Case
4.5.1 Performance Evaluation
4.6 Dealing with Cheating Behaviors
5 Vertical Traffic Offloading over Passpoint Access Points
5.2 Passpoint Hotspot-Device Signaling
5.3 Related Work on WiFi Traffic Offloading
5.4 Cellular Network Dataset
5.5 Data Traffic Offloading over Passpoint Hotspots: Methodology
5.6 Data Traffic Offloading over Passpoint Hotspots: Simulation Results
5.6.1 Radio Model
5.6.2 Achievable Gain with different Hotspot Selection Policies
5.6.3 Passpoint Placement Schemes
5.6.4 Energy Saving Gain
6 Content Offloading in Information Centric Networking
6.2 Related Work on Cache Allocation in ICN
6.3 ICN Cache Allocation Framework and Rules
6.3.1 Problem Formulation
6.3.2 Cache Allocation to Content Providers
Allocation by Proportional Fairness (PF)
Allocation by Max-Min Fairness (MMF)
Allocation by Shapley Value
Allocation by Nucleolus
6.3.3 Cache allocation algorithm
6.3.4 Pricing Framework
6.4 Performance Evaluation
6.4.1 Content Access Latency
6.4.2 Fairness of Cache Imputations
7 Conclusion and Perspectives
A Cooperative Game Theory
A.2 Cooperative Game: Definition
A.3 Some Definitions
A.4 Solutions for Coalitional Games
A.4.2 Shapley Value
A.5 Bankruptcy Game
A.5.1 Talmud Mystery
A.5.2 Talmud Solution and Bankruptcy Game