Epidemic simulation on temporal network data

Dynamic processes on networks

Nodes and edges can have different states. Networks represent the connections between nodes, and these connections can influence the states of the nodes. To understand how the state of a node develops under the influence of its neighbors' states, we study processes on the network. An introduction to complex dynamics on networks can be found in Barrat et al. [9]. In the case of diffusion processes, the state of a node is characterized by the number or volume of diffusing particles which occupy the node. For example, traffic flow on road networks or electricity flow on power grids can be modeled by diffusion processes. In these flow networks, edges can have a distinct capacity which cannot be exceeded. If there are no sinks or sources, the number of particles or the volume of the flow is constant. In the case of spreading processes, there is no conservation of the entity (opinion, information or disease) which is spread. They can be used to model the spread of diseases or rumours, as well as the spreading or forming of opinions. In opinion spreading models like the voter model [19], a node adopts the state of a random neighbor. In models of epidemic spreading, a node can transfer its state to a neighboring node if the neighbor is in the 'susceptible' state and the node itself is in the 'infectious' state. Other state changes, like the transition from the 'infectious' to the 'recovered' state, happen independently of the influence of neighbors. Further processes with specific rules for the change of node states can be used to better understand social phenomena as diverse as the consensus on word use in language (naming game [87, 27, 8]) or the evolution of cooperation strategies (prisoner's dilemma [69]). Here we study epidemic spreading processes on time-varying networks more closely.
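The epidemic update rule described above can be illustrated with a minimal sketch of one discrete-time step on a static network. The function name `sir_step`, the dictionary-based network format and the per-step probabilities `beta` and `mu` are assumptions made for this illustration; they are not taken from the thesis.

```python
import random

def sir_step(adjacency, state, beta, mu, rng=random):
    """Advance a discrete-time SIR process by one step (illustrative sketch).

    adjacency: dict mapping node -> iterable of neighbours (assumed format)
    state:     dict mapping node -> 'S', 'I' or 'R'
    beta:      per-step probability that an infectious node infects
               a given susceptible neighbour
    mu:        per-step probability that an infectious node recovers
    """
    new_state = dict(state)
    for node, node_state in state.items():
        if node_state != 'I':
            continue
        # Transmission depends on the states of the neighbours.
        for neighbour in adjacency[node]:
            if state[neighbour] == 'S' and rng.random() < beta:
                new_state[neighbour] = 'I'
        # Recovery happens independently of the neighbours.
        if rng.random() < mu:
            new_state[node] = 'R'
    return new_state
```

All transitions are computed from the old `state` and written to a copy, so a node infected during a step cannot infect others within the same step.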

Epidemic spreading

To study epidemic spread in populations, the population is divided into different compartments. These compartments correspond to node states: all people in one compartment are in the same state. We can distinguish between several compartments, for example, susceptible, infected and recovered. People move between compartments at given transition rates. The models are named after their compartments: with S for susceptible, I for infectious, E for exposed and R for recovered, we have, among others, the SI, SIR and SEIR models. The different models are used in different situations. If people cannot recover from the disease and stay infected forever, or at least longer than the time span considered, then the SI model is appropriate. The SIR model applies if people recover with lifelong immunity or with immunity that outlasts the modeled time span, or if they die from the disease. A refinement of this model is the SEIR model, which accounts for the period during which people carry the disease without yet being able to infect others. Other models, in which people become susceptible again after being infectious (SIS) or after having recovered (SIRS), are not treated here. We introduce the deterministic compartment models described by differential equations in a homogeneously mixing population [3, 47] and afterwards describe the simulation on networks for the SIR model.
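As a rough illustration of the deterministic SIR model in a homogeneously mixing population, the sketch below integrates the standard textbook equations ds/dt = -beta*s*i, di/dt = beta*s*i - mu*i, dr/dt = mu*i with a forward-Euler scheme. The function name, the symbols `beta` and `mu`, and the choice of integration scheme are assumptions for this example and may differ from the notation used in the thesis.

```python
def integrate_sir(beta, mu, s0, i0, rec0, dt, steps):
    """Forward-Euler integration of the deterministic SIR equations.

    s, i and r are population fractions; beta is the transmission rate
    and mu the recovery rate (illustrative notation).
    """
    s, i, r = s0, i0, rec0
    trajectory = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # flow S -> I
        new_recoveries = mu * i * dt         # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory
```

A call such as `integrate_sir(beta=0.5, mu=0.1, s0=0.99, i0=0.01, rec0=0.0, dt=0.1, steps=1000)` returns the fractions at every step; the parameter values here are arbitrary and chosen only to produce an outbreak.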

Network topology’s influence on epidemic processes

When studying epidemic spread on static networks, it is of particular interest to understand how the topological structure of the network influences the course of the epidemic [46, 75, 49, 68, 64, 94, 81, 45]. To this end, simulations on networks with particular structural properties are often compared to simulations on random networks. It has thus been found that the nature of the degree distribution has a strong influence on the epidemic threshold and the overall outcome of the epidemic. In particular, networks with a scale-free degree distribution facilitate epidemic outbreaks. In these scale-free networks the epidemic threshold is reduced, and epidemics propagate faster than in random networks [75, 76, 54, 65]. This has a direct consequence for immunization strategies. Random vaccination on scale-free networks is inefficient, as it often removes only nodes with limited importance for the spreading process, while targeted immunization can easily lead to a complete disintegration of the network by removing the hubs, that is, nodes with high degree and consequently high importance [2, 23]. Another property of many social networks is a dense community structure [33]. While communities can facilitate spreading on a limited local scale, they globally hinder the diffusion of information. If, on top of the community structure, weights are correlated with the topology in such a way that inter-community links have low weight and intra-community links have high weight [37], this effect of trapping information in communities is enhanced [70]. If community structures are strong, it is therefore a good immunization strategy to vaccinate individuals who bridge different communities [83]. Similarly, clustering slows down spreading [104] and can reduce the epidemic size [49] and R0 [60].
Furthermore, when information on the temporal structure of the network is available, comparing simulations on dynamic networks with particular temporal properties to simulations on networks with randomized dynamics can reveal the influence of these temporal properties. In particular, the burstiness of contact patterns slows down epidemic spread [44], while correlations between events can sometimes facilitate the propagation of epidemics [82].
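The contrast between random and targeted immunization can be made concrete on a toy example. The sketch below uses a hypothetical star network (one hub, six leaves), far simpler than a scale-free network, and measures the largest connected component that remains after removing either the hub or a single leaf. The helper `largest_component_size` and the network itself are illustrative assumptions.

```python
from collections import deque

def largest_component_size(adjacency, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search over the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best

# Toy network: node 0 is a hub linked to six leaves (hypothetical data).
star = {0: [1, 2, 3, 4, 5, 6]}
star.update({leaf: [0] for leaf in range(1, 7)})

hub = max(star, key=lambda n: len(star[n]))                   # highest-degree node
after_targeted = largest_component_size(star, removed={hub})  # hub immunized
after_leaf = largest_component_size(star, removed={1})        # one leaf immunized
```

In this extreme case, immunizing the hub leaves only isolated nodes, whereas immunizing a leaf leaves the rest of the network connected. Real contact networks are less clear-cut, so this only illustrates the direction of the effect.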

Overview

The subsequent chapters treat the following subjects. Chapter 2 gives a short summary of the datasets used. Limitations of the data and the effects of decisions concerning the data representations, such as the choice of the minimal time step length of temporal networks, are also discussed. In chapter 3, we investigate the effect of the dynamics of the network on the spreading process on the network. The focus is on the interplay between the timescales of the data and of the process on the data, as well as on the finite-time effects of the data. In chapter 4, we follow two directions to simplify the data representation. On the one hand, we look at the optimal aggregation time of the temporal network; on the other hand, we try to simplify the aggregated networks by grouping nodes together. Here we introduce a contact matrix of distributions, which allows us to keep some of the heterogeneity of the links, even though single nodes adopt group properties and lose their individuality. In chapter 5, we consider the efficacy of immunization schemes which can be derived from networks based on different data representations with different levels of detail. As we have a wide choice of data representations, we try to find a method which uses the maximum level of detail of the data in order to choose the optimal nodes for immunization, and we discuss its limitations. In chapter 6, we test how much data is necessary to make predictions for immunization schemes and how reliable such predictions are. We also test the applicability of generalized data representations to other situations and discuss their limits. Finally, in chapter 7, we look at the relation between distances on static and dynamic networks, at the distribution of temporal distances and at the distribution of the number of intermediary nodes in spreading processes. We give a short conclusion in chapter 8.

SocioPatterns

The data sets used in this thesis were collected by the SocioPatterns collaboration [85]. They comprise face-to-face contact data between individuals at different venues. The data sets which we use come from two settings: conferences and hospitals. Participants were equipped with radio-frequency identification (RFID) tags, which emit and receive signals in a peer-to-peer fashion. The emitted radio packets contain a unique identifier for the device and the time of emission. The tags register contacts autonomously whenever two participants face each other at a distance below 1 to 1.5 meters. The angle of detection is about 120°. As radio signals are absorbed by the water in the body, the device can only efficiently emit signals towards the front of the body, which greatly reduces the risk of false positive contacts between people who are in proximity but not facing each other. The temporal resolution of the contact data is very high, as contacts are registered continuously. However, detection of contacts is not instantaneous. The RFID tags alternate between emit and receive cycles. A packet emitted during the emit cycle of one tag can only be registered by another tag that is in its receive cycle. Thus, it can take some time before a contact is registered. The contact data was therefore discretized into 20-second timesteps, which guaranteed with a probability of 99% that an actual contact was recorded [20]. In addition, meaningful contacts were assumed not to last much less than 20 seconds.
Further details about the collected data and the method of data collection can be found at the SocioPatterns website (www.sociopatterns.org) and in related papers [85, 20, 11].
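The discretization into 20-second timesteps can be sketched as a simple binning of raw detections. The input format (tuples of time in seconds and two tag identifiers) and the function name `discretize_contacts` are assumptions made for this example; the actual SocioPatterns processing pipeline is described in the cited papers and may differ.

```python
def discretize_contacts(raw_contacts, step=20):
    """Map raw (time_in_seconds, tag_a, tag_b) detections onto timesteps.

    Returns a sorted list of unique (step_start, a, b) events with a < b,
    so that several detections of the same pair within one window of
    `step` seconds count as a single contact event.
    """
    events = set()
    for t, a, b in raw_contacts:
        step_start = int(t // step) * step
        events.add((step_start, min(a, b), max(a, b)))
    return sorted(events)
```

Storing each pair in a fixed order (smaller identifier first) treats contacts as undirected, which matches the peer-to-peer nature of the detection.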

The datasets

The data used throughout this work are face-to-face contact data from different venues, collected by the SocioPatterns collaboration [85]. The data sets vary in many properties, such as the number of participants, the density of links or the number of days. Depending on the setting, some data sets also include meta-information about the participants. For instance, the participants in the hospital data sets can be classified as Assistants, Doctors, Nurses, Patients or Caregivers. To better understand the processes on the networks, it helps to know some of the structural and dynamical properties of the data. In the following, a short overview of some important aspects of the data is given. We use two types of data sets, data from conferences and data from hospitals. The conferences were the Congress of the Société Française d'Hygiène Hospitalière (sfhh), the European Semantic Web Conference (eswc) and the ACM Hypertext Conference (ht). The hospital data came from the children's hospital Ospedale Bambino Gesù (obg) in Rome and from a pediatric ward in a hospital in Lyon (lyon2011 and lyon2012). The duration and the number of participants of each data set are given in Tab. 2.1.

Activity

Fig. 2.1 shows, for each timestep, the number of participants who are in contact with other participants, as well as the number of connections which are active at that instant. This activity shows strong daily patterns. Coffee breaks and lunch breaks are marked by high peaks of activity, whereas no contacts take place during the night. Depending on the dataset, weekly variations can also be noticed. For instance, there is a strong drop in activity during the weekend in the "lyon2012" dataset, and the first and last days of the "eswc" conference show lower activity. The percentage of nodes which are in contact at any given time also varies between the datasets. While in the "obg" dataset never more than about 15% of the participants are in contact at the same time, in the "sfhh" dataset this share exceeds 30% at peak times. The networks also differ in the number of distinct contacts per person, the average number of events per unit of time and other properties, as listed in Tab. 2.2.
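The two activity measures described above (participants in contact and active connections per timestep) could be computed from a list of contact events roughly as follows. The event format `(t, i, j)` and the function name `activity_per_timestep` are assumptions made for this sketch.

```python
from collections import defaultdict

def activity_per_timestep(events):
    """Count active nodes and active links for every timestep.

    events: iterable of (t, i, j) contacts between nodes i and j at timestep t
    Returns {t: (number_of_nodes_in_contact, number_of_active_links)}.
    Timesteps without any contact (for example nights) do not appear.
    """
    nodes = defaultdict(set)
    links = defaultdict(set)
    for t, i, j in events:
        nodes[t].update((i, j))
        links[t].add((min(i, j), max(i, j)))  # undirected link, counted once
    return {t: (len(nodes[t]), len(links[t])) for t in sorted(nodes)}
```

Dividing the node count by the total number of participants gives the percentage of nodes in contact at each timestep.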

Table of contents:

1 Introduction
1.1 Why networks?
1.2 What are networks?
1.3 How to classify networks?
1.4 Dynamic processes on networks
1.4.1 Epidemic spreading
1.4.2 Simulation
1.5 Network topology’s influence on epidemic processes
1.6 Overview
2 Data collection
2.1 SocioPatterns
2.2 The datasets
2.2.1 Activity
2.2.2 Degree vs. Strength
2.2.3 Contact-dynamics distributions
2.3 Limitations of data
2.4 A short note about cleaning
2.5 Incomplete samples
2.6 Discrete timesteps
3 Epidemic simulation on temporal network data
3.1 Activity fluctuations
3.2 Influence of starting time
3.3 Effect of nights
3.4 Finite time
3.5 Model networks
3.6 Conclusion
4 Data representation
4.1 Time resolution
4.2 Structural resolution
4.2.1 Choice of groups
4.2.2 Heterogeneity of weights
4.2.3 Daily networks
4.2.4 Influence of roles
4.2.5 R0-correction
4.3 Conclusion
5 Immunization on dynamic networks and data representations
5.1 Influence of the data representations
5.2 Immunization strategies on static data representations
5.3 Effect of a limited time window
5.4 Time dependence of ranking efficiency
5.5 Immunization strategies on dynamic networks: significance
5.6 Conclusion
6 Predictability
6.1 Degree ranking
6.2 Data-based predictions of epidemic spread
6.2.1 Comparing datasets
6.2.2 Effect of data variability on epidemic predictions
6.3 Conclusion
7 Distances
7.1 Static distance vs. dynamic distance
7.2 Temporal path lengths and infection-path lengths
7.2.1 Discrete vs continuous
7.2.2 Influence of link density
7.2.3 Influence of the weight distribution
7.3 Distance on face-to-face contact networks
7.4 Conclusion
8 Conclusions
A Appendix