Learning Influence Probabilities In Social Networks By Amit Goyal, Francesco Bonchi & Laks V. S. Lakshmanan Gal Bar-On [email protected], Ben-Gurion University of the Negev, Department of Computer Science, Fall 2017 Table of contents: Introduction Related work Problem definition Solution framework Models of Influence Algorithm Step 1: Learning the Parameters of the models Algorithm Step 2: Evaluating the Models Algorithm Step 3: Predicting Time Experimental Evaluation Summary Related work: Domingos & Richardson: - Mining the network value of customers - Mining Knowledge-sharing sites for viral marketing D. Kempe, J. M. Kleinberg and E Tardos. - Maximizing the spread of influence through a social network J. Leskovec - Cost-effective outbreak detection in networks Problem Definition: 2

e g 6 4 c 6 b e a1 5 g a1 7 d a1 10 f

a1 11 b a1 12 a a1 16 f d 2 9 a Problem Definition: Given a social graph and an action log, two questions are asked: Can we use them to build models ?of influence

Can we predict the time by which a user is expected to perform an action? Problem Definition: When a user performs an action, he may have one of many reasons for doing so: - He may have heard of it outside of the network. - The action is very popular. - He may be genuinely influenced by seeing his social friends perform that action. Action Propagation: ( , , ) Indicates that user performed action at time The action log contains such a tuple for every action performed by every user. We say that an action propagates from user to iff: 1. 2. with 3. We write where

Action Propagation: 2 e 2 e g g 6 5 4 c d 1 f 2 4 b a

6 e a1 5 g a1 7 d a1 10 f a1 11 b a1 12

a a1 16 b f d 2 a 9 Some Necessary Definitions: The universe of actions - Number of actions performed by user - Number of actions performed by both and v - Number of actions performed by either or v - Number of actions performed by either or v Solution Framework: Given an inactive user and the set of its activated neighbors we would like to determine , the joint influence probability of on ( ) =1 ( 1 , )

If we can conclude that u will activate as well Solution Framework: User influenceability: Action influenceability: The average time delay is Models: Static Models: Independent on time Are easy to learn and test Bernoulli Distribution Jaccard Index Partial Credits Models: Static Models: Independent on time Are easy to learn and test Bernoulli

Distribution Jaccard Index Partial Credits Models: Partial Credits Static Models: Independent on time Are easy to learn and test Bernoulli Model w. Partial Credits Jaccard Model w. Partial Credits Models: Continuous Time (CT) Models: Assume that influence probabilities are continuous functions of time Very accurate Expensive to test on large data sets

, = 0 , , The probability of influencing its neighbor at time ( ) =1 ( 1 The joint probability of being influenced by a combination of its active neighbors at time , )

Models: Discrete Time (DT) Models: The influence of on remains constant for a time window of and then drops to 0, i.e. is contagious to in the time interval When that happens, is updated as follows: ( ) , ( )= 1 , :Partial credit definition for DT , , ( )= 1 ( 0< ( ) ( ) , ) Table of contents: Introduction Related work

Problem definition Solution framework Models of Influence Algorithm Step 1: Learning the Parameters of the models Algorithm Step 2: Evaluating the Models Algorithm Step 3: Predicting Time Experimental Evaluation Summary Influence Matrix Social Graph P P Q P Q X X R X Action Log Q

= R R Influence Matrix Social Graph P P Q P Q X X R 2 X Action Log Q

= R R Influence Matrix Social Graph P P Q P 4 Q X X R 2 X Action Log

Q = R R Influence Matrix Propagation Graph (a1) Social Graph P P X R 2 R X Q P

4 P Q X Action Log Q = R P a1 5 Influence Matrix Propagation Graph (a1) Social Graph P P

P X R 2 R X Q P 4 Q X Action Log Q R Propagation Graph (a3) =

R P a1 5 R a3 6 Influence Matrix Propagation Graph (a1) Social Graph P 5 P P 2 X

R Q R X Q P 4 Q X Action Log Q R Propagation Graph (a3) = R P

a1 5 Q a1 10 R a3 6 Influence Matrix Propagation Graph (a1) Social Graph P 5 P P Q

2 1 1 X Action Log R Propagation Graph (a3) = X R Q R X Q P 4 Q

R P a1 5 Q a1 10 R a3 6 Influence Matrix Propagation Graph (a1) Social Graph P 5 P

P 2 X R Q X Propagation Graph (a2) Q 1 1 R Q Propagation Graph (a3) = R R

X Q P 4 Q Action Log P a1 5 Q a1 10 Q a2 12 R

a3 6 Influence Matrix Propagation Graph (a1) Social Graph P P 5 P 2 X R Q X Propagation Graph (a2) Q

1 1 R Q 2 R Propagation Graph (a3) = P 8 R R X Q P 4

Q Action Log P a1 5 Q a1 10 Q a2 12 R a2 14 R a3

6 P a3 14 Influence Matrix Propagation Graph (a1) Social Graph P P 5 P 4 X R X Propagation Graph (a2)

Q 1 1 R Q 2 R Propagation Graph (a3) = P 8 R R X Q R

Q 2 P 1 0 Q Action Log P a1 5 Q a1 10 R a1 15 Q

a2 12 R a2 14 R a3 6 P a3 14 P 5 P 4 Influence Matrix

Propagation Graph (a1) Social Graph 1 0 R Q 2 P Q R P X 1/3,5 1/3,1 0 Q

0/0 X 1/3,2 R 1/3,8 0,0 X Propagation Graph (a2) Q 1 1 R Q 2 R Propagation Graph (a3)

= P 8 R Action Log P a1 5 Q a1 10 R a1 15 Q a2

12 R a2 14 R a3 6 P a3 14 P 5 P 4 Influence Matrix Propagation Graph (a1)

Social Graph 1 0 R Q 2 P Q R P X 1/3,5 1/3,1 0 Q 0/0 X

1/3,2 R 1/3,8 0,0 X Propagation Graph (a2) 1 1 Q 0 , R Q 2 1 = =

3 = 2 R Propagation Graph (a3) ( ( ) ( ) ) , = 2 = 5 =5 1 P 8 R Action Log P

a1 5 Q a1 10 R a1 15 Q a2 12 R a2 14 R a3

6 P a3 14 Evaluating the Models: For each action, we maintain a with entries of the form The flag can have one of the following values: 0 if never performs the action but at least one of its neighbors does 1 if performs it but at least one of its neighbors has done so before it 2- if is the initiator of the action in its neighborhood After scanning all tuples for an action, contains all users who are active or are neighbors of one or more active users. Evaluating the Models: The performance is depicted using ROC curves, which plot the TPR against the FPR = + = +

=0 =1 < < Evaluating the Models: With Continuous Time Models, we also consider the chronological order in which different neighbors of performed the action. Predicting Time: We are able not only to predict whether a user performs an action, but also the time interval in which it is most likely to happen:

=+ , 2 Experimental Evaluation: Out of 6.2 million users and 71 million edges, we took users who are members of at least one group and got a graph of 1.4 million users with 40.5 million edges. The graph has 34,766 connected components, where the largest one consists of 1.32 million users and 40.45 edges (99.72%). The rest are ignored. The action log contained a total of 35.97 million tuples. Experimental Evaluation: Experimental Evaluation: Experimental Evaluation: Summary: We understood that we scan build models of influence from a social graph and an action log. In addition to predicting whether a user will perform an action, the predictions on users with high influenceability scores had high precision. We were also able to predict the time by which the user will perform the action after its neighbors have done so. We discovered that DT models can be tested much more efficiently and yield accuracy levels very close to those of the CT models. !Thank You