Journal Title
Title of Journal: Social Network Analysis and Mining
Abbreviation: Soc Netw Anal Min
Publisher
Springer Vienna
Authors: Kirill Dyagilev, Shie Mannor, Elad Yom-Tov
Publish Date: 2013/02/12
Volume: 3, Issue: 3, Pages: 521-541
Abstract
We consider the dynamics of rapid propagation of information (RPI) in mobile phone networks. We propose a heuristic method for identification of sequences of calls that supposedly propagate the same information and apply it to large-scale real-world data. We show that some of the information propagation events identified by the proposed method can explain the physical co-location of subscribers. We further show that features of subscribers' behavior in these events can be used for efficient churn prediction. To the best of our knowledge, our method for churn prediction is the first method that relies on dynamic, rather than static, social behavior. Finally, we introduce two generative models that address different aspects of RPI. One model describes the emergence of sequences of calls that lead to RPI. The other model describes the emergence of different topologies of paths in which the information propagates from one subscriber to another. We report high correspondence between certain features observed in the data and these models.

We proceed to investigate the properties of some of these topologies. Assume that $M \geq 5$ and consider a tree generated by the information flow tree model with parameters as above. Let $N$ denote the total number of nodes in the tree, and let $E_i$, for $i = 1, 2, \ldots, 4$, denote the event that the generated tree is of Topology $i$. The following proposition assesses one of the basic properties of these topologies, namely the distribution $\mathbb{P}(N = \cdot \mid E_1)$ of sizes of RPIs in each topology.

In this section, we investigate the stability of dissemination leaders over different days. We consider a training period of three consecutive days and a testing period of the 14 following days. Overall, there are 19 pairs of training and testing periods in DS1 and eight pairs in DS2.

We compare the set of dissemination leaders found in the training period to the set of dissemination leaders in the testing period. In particular, we gather the following statistics: (S1) the fraction of dissemination leaders in the testing period that were also dissemination leaders in the training period; (S2) the fraction of dissemination leaders in the training period that were also dissemination leaders in the testing period; (S3) the fraction of RPIs in the testing period in which the dominant node was a dissemination leader in the training period. We note that statistics S1 and S3 are different, since a subscriber can be a dominant node in more than one RPI during the testing period.

As a baseline for these statistics, we use synthetic data in which the probability of a user becoming a dominant node in some RPI on the current day does not depend on whether it was a dissemination leader on a previous day. This synthetic data can be generated by the following baseline model. We assume that there exists a general pool of $L$ dissemination leaders and numbers $n_i$, for $i = 1, \ldots, D$, of RPIs on the $i$-th day that have a dominant node. The identities of the dominant nodes in the $n_i$ RPIs of day $i$ are selected uniformly at random, with repetitions, from the pool of size $L$.
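The baseline model and the three statistics can be illustrated with a short Python sketch. This is not the authors' code. It assumes, for illustration only, that a subscriber counts as a dissemination leader in a period if it is the dominant node of at least one RPI in that period; the excerpt does not give the exact definition. The function names `simulate_baseline` and `stability_stats` and all parameter values are hypothetical.

```python
import random


def simulate_baseline(pool_size, rpis_per_day, seed=0):
    """Baseline model: for each day i, draw n_i dominant nodes uniformly
    at random, with repetitions, from a pool of `pool_size` leaders."""
    rng = random.Random(seed)
    return [[rng.randrange(pool_size) for _ in range(n)] for n in rpis_per_day]


def stability_stats(days, train_len=3, test_len=14, start=0):
    """Compute (S1, S2, S3) for one training/testing pair.

    `days` is a list of days; each day is a list holding the dominant
    node of every RPI observed on that day.
    ASSUMPTION: a leader in a period is any node that is dominant in at
    least one RPI of that period.
    """
    train = days[start:start + train_len]
    test = days[start + train_len:start + train_len + test_len]

    train_leaders = {node for day in train for node in day}
    test_dominants = [node for day in test for node in day]  # one per RPI
    test_leaders = set(test_dominants)
    common = train_leaders & test_leaders

    # S1: share of testing-period leaders already seen in training.
    s1 = len(common) / len(test_leaders) if test_leaders else 0.0
    # S2: share of training-period leaders seen again in testing.
    s2 = len(common) / len(train_leaders) if train_leaders else 0.0
    # S3: share of testing-period RPIs whose dominant node led in training.
    s3 = (sum(node in train_leaders for node in test_dominants)
          / len(test_dominants)) if test_dominants else 0.0
    return s1, s2, s3


if __name__ == "__main__":
    # Hypothetical parameters: L = 1000 leaders, 50 RPIs per day, D = 17 days.
    synthetic = simulate_baseline(pool_size=1000, rpis_per_day=[50] * 17)
    print(stability_stats(synthetic))
```

Because S3 counts RPIs while S1 counts distinct subscribers, a subscriber who dominates several RPIs in the testing period contributes more to S3 than to S1, which is the distinction the text draws between the two.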