influence maximization in social networks

Get Complete Project Material File(s) Now! »

social influence analysis applications

Influence propagation studies have found applications in various fields. From studying human, animal or even plant epidemics [78, 92, 133] to viral marketing [99], social media analytics [161], spread of rumors [122], expert finding [9], recommendation systems [81, 106, 152] etc.
A key task in order to understand information and influence diffusion is the identification of vital nodes that play a significant role in such cases. Such nodes may allow us to control the spread of an epidemy [48, 129], to predict successful scientists and scientific publications based on co-authorship and citation networks [54, 135, 165], design influential advertisements for new products [99, 110] etc. For example, in the case of virus propagation, such as influenza, the transmission of the disease mainly depends on the extend of contacts of the infected person to the susceptible population; thus, being able to locate and vaccinate individuals with good spreading properties can prevent from a potential outbreak of the disease, leading to efficient strategies of epidemic control. In a similar way, suppose that our goal is to promote an idea or a product in order to be adopted by a large fraction of individuals in the network. A key idea behind viral marketing is the word-of-mouth effect [153]; individuals that have already adopted the product, recommend it to their friends who in turn do the same to their own social circle, forming a cascade of recommendations [56]. The basic question here is how to target a few initial individuals (e.g., by giving them free samples of the product or explaining them the idea), that can maximize the spread of influence in the network, leading to a successful promotion campaign. Nevertheless, locating such users in a network is not a trivial task and numerous research has been conducted to solve the problem in the area [111]. It has been of significant importance to identify such nodes that will maximize the influence and information diffusion at the end of a respective phenomenon in a network. The problem is actually split into two subtopics: i) Identification of individual influential nodes that have good spreading properties and ii) Identification of a group of nodes that by acting all together will maximize the total spread of influence in a network. Indeed the two tasks greatly differ as finding a ranking of the nodes that by acting individually can influence a great part of the network cannot be directly used in order to discover the set of nodes that will – by acting at the same time – maximize the spreading of information in a graph. Of course this is justified by the fact that putting some of the most influential spreaders together will not result in a most influential set of such spreaders, because their respective influences may be and is usually largely overlapped.

The SIR model

The SIR model [53] is a model that is used to describe an acute infectious disease. The latter refers to an infection which presents a rapid immune response after a short period of time.
The model assumes a population of N individuals, divided on the following three states.
• Susceptible (S): the individual is not yet infected, thus being susceptible to the epidemic;
• Infected (I): the individual has been infected with the disease and it is capable of spreading the disease to the susceptible population;
• Recovered (R): after an individual has experienced the infectious period, it is considered as removed from the disease and it is not able to be infected again or to transmit the disease to others (immune to further infection or death). Every individual that is on the I state can infect individuals with probability , called infection rate, and afterwards it can recover with probability , called recovery rate. The state diagram of the model is presented in Figure 3.1.

The SIR model applied in networks

In our experiments throughout the Chapter we will be using the SIR model to simulate a spreading process. Based on its definition, the model assumes a fully mixed population: an infected individual can equally infect any other member of the population to which it belongs to. In order to apply the model in a population of individuals which form connections between them (i.e., a network) we follow a more realistic approach. In such a case any susceptible node can only be infected by an infected neighbor on the graph. As we will present in Section 3.5, initially all the nodes of the network are set at the susceptible state S, except from the one that we are interested to examine its performance which is set at the infected state I. Then, at each time step t of the process, every node that is on the I state can infect its susceptible neighbors with probability and afterwards it can recover with probability . Note that, a node cannot directly pass from state I to state R during the same time step t.

Table of contents :

1 introduction
1.1 Social Networks and Social Influence
1.2 Social Influence examples
1.3 Social Influence Analysis Applications
1.4 Thesis statement and overview of contributions
1.4.1 Identification of individual influential spreaders
1.4.2 Identification of a group of influential spreaders
1.4.3 Secure and Private computation of influential metrics
1.5 Outline of the thesis
2 basic concept and preliminaries
2.1 Introduction to Graph Theory
2.2 Adjacency Matrix and Eigenvalues
2.3 Node centralities
2.3.1 Structural centralities
2.3.2 Iterative refinement centralities
2.4 Description of Graph Datasets
3 locating influential spreaders in social networks
3.1 Introduction
3.2 Preliminaries and Background
3.2.1 k-core decomposition
3.2.2 K-truss decomposition
3.2.3 Epidemic models
3.2.4 The SIR model applied in networks
3.3 Related work
3.4 K-truss decomposition for identifying influential nodes
3.5 Experimental Evaluation
3.5.1 Datasets and Methodology
3.5.2 Evaluating the spreading performance
3.5.3 Comparison to the optimal spreading
3.5.4 Impact of infection and recovery rate on the spreading process
3.6 Exploration of network centralities in spreading processes
3.6.1 Evaluation of Results
3.7 Conclusions and Future Work
4 influence maximization in social networks
4.1 Introduction
4.2 Preliminaries and Background
4.2.1 The Influence Maximization (IM) problem
4.2.2 Diffusion Models
4.3 Related work
4.4 MATrix Influence (MATI) Algorithm
4.4.1 Influence in Social Networks
4.4.2 Influence Computation under the LT Model
4.4.3 Influence Computation under the IC Model
4.5 Experimental Evaluation
4.5.1 Datasets
4.5.2 Baseline Algorithms
4.5.3 Experimental Results
4.6 Conclusions and Future Work
5 private, secure and distributed computation of k-cores
5.1 Introduction
5.2 Problem Statement and Preliminiaries
5.2.1 Problem Statement
5.2.2 Preliminaries and Background
5.3 Related work
5.3.1 k-core Computation
5.3.2 Core Maintenance
5.3.3 Decentralized Personal Data Management Platforms
5.4 P2P Algorithm for Core Maintenance
5.4.1 Local variables
5.4.2 Handling Messages and Events
5.4.3 Computing coreness estimations
5.4.4 Example
5.5 Analytical and Experimental Study
5.5.1 Analytical study
5.5.2 Complexity: Experimental Study
5.6 Security and Privacy Analysis
5.6.1 Attack Model
5.6.2 Privacy and Information Quality
5.6.3 Experimental results
5.7 Conclusions and Remarks
6 concluding remarks
6.1 Summary of Contributions and Future Work
6.2 Epilogue
bibliography