The dynamics of information diffusion and interaction in online social networks have attracted increasing attention in recent years. Various studies have shown that the characteristics of human activity strongly affect information diffusion. In particular, time is a critical factor that deeply influences the dynamics on and of social media, because it is determined by users’ multi-level activities, which comprise browsing, using and controlling information. All timestamped activities reflect how users spend time interacting in online environments [4, 5]. However, the temporal process that emerges from the interaction of individuals on websites such as Twitter, Facebook and YouTube needs to be explored in more depth.
In the last decade, access to high-resolution datasets from social media platforms has provided opportunities to uncover the structure and dynamics of the social interaction network at different levels, from the small-scale perspective of the individual to the large-scale, collective behavior of the masses. Twitter is being appropriated for conversational interaction and is starting to be used for collaboration as well. Thus, the social interaction on Twitter can be defined as an adaptive network that models the feedback from the network structure. Yang et al. explored the timestamps of each user’s behaviors and found that the values of retweet intervals may be influenced by daily routines, which is important for retweet prediction. Lerman et al. studied the chain of URL recommendations for each user on Twitter, which represents the user’s activity explicitly. Their results suggested that information about the activity of a user’s friends in the social network can be used to predict his or her activity. Bruno et al. constructed a model of users’ behavior on Twitter that includes finite priority queuing and time resources and reproduces the observed social behavior. These results show that social interaction and information transmission are concurrent, so the temporal structure of interaction must influence the properties of information spreading. Although there are models of human activity in email, MSN interactions and telephone communication, there is still a lack of models of human behavior for social interaction on Twitter that explain the communication patterns and how they correlate with the time-varying topological structure of the social interaction network.
In addition, individual-based models of collective social behavior traditionally include two basic ingredients: the mechanism of interaction and the network of interactions. We therefore introduce a model of temporal networks that allows the human dynamics and the social interactions obtained from extensive Twitter data to be modelled simultaneously. Temporal networks differ significantly from static networks in many ways, including methods of representation, models, and spreading dynamics [15, 16]. Here we are concerned with temporal graphs whose nodes have functions, where the emergence of edges depends on the behavior of the nodes [17, 18]. Such a graph captures the activity dynamics of the nodes and reflects the attention and time constraints in human communication. In general, models of temporal networks can be used to explain how social interaction is generated at the collective level. In particular, a power-law degree distribution is found in some temporal networks that are used to represent contact networks or social interaction networks [19, 20, 21]. In these studies, null models are used to randomly shuffle the event sequences, the intervals of the edges and the length of the communication, thereby demonstrating the effect of timing factors.
Motivated by the above analysis, in this paper we study users’ communication patterns on a class of temporal networks whose nodes have an activity window, which represents the dynamics of social interaction on Twitter. Based on the Stanford dataset, we extracted the statistics of the activity window, analyzed the structure of the social interaction network and identified the users’ communication patterns under the activity window model. In addition, we propose a generation model for the interaction network on Twitter. By applying null models to the dataset, we show that the distribution of the activity window influences the degree distribution of the interaction network.
2 Description of the active-window-based interaction network
2.1 The social interaction network
Twitter is a typical message transfer network. On this kind of communication platform, the relationships and the interactions among users build new channels for information diffusion. As shown in Figure 1, each user A has an input set Input_A = {(m_j, t_1), (m_k, t_2), ...}, in which the tweets m_j or comments m_k are posted by A’s followees j and k. These tweets or comments arrive in turn and are marked by their arrival time t_n. When A is online, he browses a limited number of tweets backwards in his input set, from t_k to t_{k−l}, chooses a message m_l and retweets it to his followers. Thus, the interaction network TG = {V(A(w)), E(e(A, B, p → w_A)), T} is composed of the set of nodes V, the set of edges E, and the duration of the interaction network T = [t_0, t_k]. For every A ∈ V, w is named the activity window and denotes the possible length of his backtracking over the messages that cascade from his followees. The activity window is in effect a local prediction of the behavior of the user when messages posted by his friends have already appeared on his timeline. For every e(A, B, p → w_A) ∈ E, A is the followee of B. If p is 1, the edge is active, which depends on the length of w_A and on whether tweets posted by his followees have fallen into this activity window.
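The definitions above can be illustrated with a minimal Python sketch. It assumes that times are measured in minutes and that an edge counts as active (p = 1) as soon as at least one message from the followee lies inside the user's activity window; the names Message, messages_in_window and active_edges are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    author: str     # followee who posted the tweet or comment
    arrival: float  # arrival time t_n in the input set (assumed unit: minutes)


def messages_in_window(input_set, now, window):
    """Messages that arrived during the last `window` minutes before `now`."""
    return [m for m in input_set if now - window <= m.arrival <= now]


def active_edges(user, input_set, now, window):
    """Edges (followee, user) with p = 1 under the assumption stated above."""
    authors = {m.author for m in messages_in_window(input_set, now, window)}
    return sorted((author, user) for author in authors)


# Input set of user A: two messages from followee j, one from followee k.
input_a = [Message("j", 10.0), Message("k", 50.0), Message("j", 58.0)]
edges = active_edges("A", input_a, now=60.0, window=15.0)
```

With a 15-minute window only the messages that arrived at minutes 50 and 58 are visible, so both followees yield an active edge; shrinking the window removes edges without changing the follower graph.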
2.2 Extracting the activity window
With limited crawling technology, it is too difficult to capture the complete picture of how users use the Twitter app. We therefore used a machine-learning algorithm to estimate the activity window. A Twitter user A has a sequence of tweeting and retweeting activities, which we define as Seq_A = {(tw_A, St_1), (RT_j, t_2, St_2), (CM_j, t_3, St_3), ...} and which makes it explicit when he is online. For example, (tw_A, St_1) means that A wrote a new tweet at time St_1; (RT_j, t_2, St_2) means that at time St_2, A retweeted a message of j that arrived at time t_2; and (CM_j, t_3, St_3) means that at time St_3, A commented on a message of j that arrived at time t_3. Thus, intervals such as [St_k, t_k] mark the time difference between the arrival of j’s message in Input_A and the moment A retweets or comments on it. With the EM-GMM algorithm we can identify two types of time-difference sets and their probabilities. One is the set of short intervals SI_A with prior probability p_short, which corresponds to regular online behavior (e.g., browsing, retweeting or tweeting). The other is the set of long intervals LI_A with prior probability p_long, which result from daily routines such as meetings or sleeping. According to previous findings, both the long intervals and the short intervals contribute to the discovered patterns of regularity, so we mix them to estimate W_A, which is defined as W_A = SI_A · p_short + LI_A · p_long. Thus, the activity window represents a user’s ability or habit of using social media.
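A rough sketch of this estimate is given below. It is not the authors' implementation: it assumes a one-dimensional, two-component Gaussian mixture fitted by plain EM with NumPy, and it reads SI_A and LI_A in the formula as the means of the short and long components. The function names fit_two_gaussians and activity_window are hypothetical.

```python
import numpy as np


def fit_two_gaussians(x, n_iter=200):
    """Fit a 1-D two-component Gaussian mixture with EM.

    Returns the component means and mixing weights, ordered short to long.
    """
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    # Initialise the two means from the lower and upper half of the data.
    mu = np.array([x[x <= median].mean(), x[x > median].mean()])
    var = np.full(2, x.var() + 1e-6)
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each interval.
        diff = x[:, None] - mu
        dens = weight * np.exp(-diff ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update means, variances and mixing weights.
        n_k = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / n_k
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k + 1e-6
        weight = n_k / x.size
    order = np.argsort(mu)
    return mu[order], weight[order]


def activity_window(intervals):
    """W_A = SI_A * p_short + LI_A * p_long, with SI_A and LI_A taken as
    the component means (an assumption of this sketch)."""
    mu, weight = fit_two_gaussians(intervals)
    return float(mu[0] * weight[0] + mu[1] * weight[1])


# Synthetic intervals in minutes: many short ones, fewer long ones.
rng = np.random.default_rng(0)
intervals = np.concatenate([rng.normal(10, 2, 300), rng.normal(480, 30, 100)])
w_a = activity_window(intervals)
```

Under this reading, the weighted sum of the component means is the mean of the fitted mixture, so the two components mainly serve to separate regular browsing from daily-routine gaps and to expose p_short and p_long.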
We analyzed the records of 550,000 users over one month, consisting of 207 million tweets and 124 million followee-follower relationships. The topology of the followee-follower network has small-world properties, with L = 4.36, <C> = 0.01 and M = 0.12. In addition, we built the interaction network according to the "RT" or "@" tags in the dataset and measured its degree distribution. We collected the users’ activity sequences described in Section 2.2, estimated the activity windows with the EM-GMM algorithm and measured the communication patterns based on the activity window. Using the activity-driven model, we then examined the effect of the activity window on the interaction network.
3.1 Structure of the social interaction network
The social interaction network is constructed from users’ activities, such as "RT" and "@". Here we represented these temporal data as a static graph and found that it was much sparser and more highly clustered than the corresponding “followee-follower” network. Although it contains only 70,994 nodes and 216,582 edges, the clustering coefficient and the modularity increased to 0.13 and 0.621, respectively.
We divided the interaction network into four aggregated networks by week and observed their degree distributions. Figure 2(a) shows the four fragment interaction networks, each covering one week. They have the same degree distribution, which follows a power law with a slope of about −1.45 ± 0.02 (R-square ≥ 0.95). Each fragment network has about 37,000 nodes and 66,000 edges, with <k> = 6.12. The clustering coefficient is 0.12 ± 0.01 and the modularity is 0.71 ± 0.01. The four aggregated interaction networks are shown in Figure 2(b); their degree distributions are all power laws, and the slope increases from −1.45 to −1.19 as the interaction becomes deeper. These distributions indicate a strong heterogeneity in the way users distribute their time across their social relationships, and the patterns do not depend on the time scale.
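One simple way to obtain such a slope and R-square value is an ordinary least-squares fit on the log-log degree distribution. The sketch below assumes that approach (the source does not state the fitting procedure) and uses NumPy's polyfit; the function name fit_degree_power_law is hypothetical.

```python
from collections import Counter

import numpy as np


def fit_degree_power_law(degrees):
    """Least-squares line through (log10 k, log10 P(k)).

    Returns the slope and the R-square of the fit. Nodes of degree 0 are
    ignored because they cannot be placed on a log axis.
    """
    counts = Counter(d for d in degrees if d > 0)
    ks = sorted(counts)
    total = sum(counts.values())
    log_k = np.log10(np.array(ks, dtype=float))
    log_p = np.log10(np.array([counts[k] / total for k in ks]))
    slope, intercept = np.polyfit(log_k, log_p, 1)
    residuals = log_p - (slope * log_k + intercept)
    r_square = 1.0 - residuals.var() / log_p.var()
    return slope, r_square


# Synthetic degree sequence whose counts decay roughly as k^(-1.45).
degrees = [k for k in range(1, 21) for _ in range(int(10000 * k ** -1.45))]
slope, r_square = fit_degree_power_law(degrees)
```

Least squares on log-log axes is easy to reproduce but is known to be sensitive to the sparse tail of the distribution, so binning or a maximum-likelihood fit may be preferable for real data.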
3.2 Features of the activity window
Based on the Stanford Twitter dataset, we analyzed the statistical characteristics of the activity windows. Figure 3(a) shows a power-law distribution with an exponential cut-off for W. This distribution differs from those of other online human activities, such as email activity, online games, or web page browsing, which have exponent α < 2 [16, 21]. The reason is that the distribution of W follows a power law with slope −0.86 when W is smaller than 45 minutes, whereas it follows an exponential distribution with exponent −0.02 when W is larger than 45 minutes. Most of the users’ activity windows range from 5 to 30 minutes. This means that some users aim to receive messages as quickly as possible within their finite time. Their activity pattern has a bursty character, which may be caused by the stimulation of recent interactions. Other users browse messages tweeted more than 45 minutes earlier, and their browsing behavior is memoryless, which may be influenced by the length of their input set.
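For simulation purposes, the two regimes described above can be turned into a discrete distribution over window sizes. The following sketch makes several assumptions that are not in the text: windows are whole minutes between 5 and 231 (the range later used by the W-U null model), the crossover sits exactly at 45 minutes, and the two branches are joined continuously there. The names window_pmf and sample_windows are hypothetical.

```python
import numpy as np


def window_pmf(w_min=5, w_max=231, w_c=45, alpha=0.86, rate=0.02):
    """Probability mass over integer window sizes (minutes).

    Power law w^(-alpha) below the crossover w_c, exponential exp(-rate * w)
    from w_c onwards, scaled so both branches agree at w_c (assumption).
    """
    w = np.arange(w_min, w_max + 1, dtype=float)
    scale = w_c ** (-alpha) / np.exp(-rate * w_c)
    weights = np.where(w < w_c, w ** (-alpha), scale * np.exp(-rate * w))
    return w, weights / weights.sum()


def sample_windows(n, rng):
    """Draw n activity windows from the piecewise distribution."""
    w, pmf = window_pmf()
    return rng.choice(w, size=n, p=pmf)


windows = sample_windows(1000, np.random.default_rng(0))
```

The exponent and rate defaults are the magnitudes of the reported values −0.86 and −0.02; everything else is a modelling choice of this sketch.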
Furthermore, in Figure 3(a) we count the number of tweets in the activity windows, which follows a power-law distribution. This means that most users browse no more than 100 tweets at once. Figure 3(b) displays the relationship between the number of tweets and the size of W, which grows non-linearly but is clustered in the lower left of the figure. Therefore, most users may browse about 50 to 100 tweets. This limited number of tweets is probably due to the limited number of the user’s followees, which may be about 250 in accordance with Dunbar’s theory.
3.3 Patterns of communication
Besides the users’ behaviors, increasing attention is being paid to the relationship between social connectivity and the intensity of communication. For example, Miritello et al. investigated this relationship, in line with previous studies of both scientific collaboration and air-transportation networks. We therefore observed the interaction network constructed from users’ activities, such as RT and @. Here we observe a more complex behavior for large values of k_i. We measured the interaction stability and burstiness for a user with degree k_i and explored their relationships with the interaction network. The burstiness of human activity, that is, a temporally inhomogeneous contact process, has been shown to be important for information diffusion and results in prevalent decay times in information diffusion. Each edge in the interaction network, named a social tie, has an interaction burstiness measured by cv_ij = σ_ij/μ_ij, where σ_ij and μ_ij are the standard deviation and the mean of the interaction intervals between i and j. If cv_ij is 0, the two users contact each other at a fixed frequency, whereas values around 1 correspond to memoryless, Poisson-like contact. In Figure 4, the value of the interaction burstiness fluctuates from 0 to 2.3. In addition, the larger the activity window, the larger the range of the interaction burstiness. The heterogeneity of the interaction burstiness may be caused by the users’ limited attention, because the more social interactions a user has, the less time he dedicates to each tie. Each edge in the interaction network also has an interaction stability, which is defined in terms of the maximum and the minimum interaction intervals between i and j. Figure 5 shows that the interaction stability not only increases with the size of the activity window, but also depends on the degrees of the two users involved in the social tie.
When both users have small degrees in the interaction network, most of the interaction stabilities are scattered between 0 and 1 hour [23, 24]. When their degrees exceed 10, the interaction stability decreases with the size of the activity window. This means that more stable interaction occurs among users with more interaction partners and shorter backtracking lengths.
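The interaction burstiness cv_ij = σ_ij/μ_ij used in this section can be computed per social tie from the timestamps of the contacts between i and j. The sketch below uses only the Python standard library and assumes the population standard deviation; the function name interaction_burstiness is hypothetical. The interaction stability is not sketched because its formula is not recoverable from the text.

```python
import statistics


def interaction_burstiness(timestamps):
    """Coefficient of variation of the intervals between consecutive contacts."""
    ts = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three contacts to measure burstiness")
    mu = statistics.fmean(intervals)
    if mu == 0:
        raise ValueError("all contacts share the same timestamp")
    sigma = statistics.pstdev(intervals)  # population standard deviation
    return sigma / mu


# Contact times (minutes) for two hypothetical ties.
regular_tie = interaction_burstiness([0, 10, 20, 30, 40])
bursty_tie = interaction_burstiness([0, 1, 2, 3, 100])
```

The first tie has equal intervals, while the second consists of a quick burst followed by a long silence, which raises the coefficient of variation.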
3.4 Effect of the activity window on the interaction network
As mentioned above, the activity window shapes the communication pattern of users on Twitter through users’ transient online activities and results in the time-varying dynamics of the interaction network. This could be because the different sizes of the activity windows introduce temporal ordering, delays, and durations into the asynchronous large-scale contact on Twitter. Thus the interaction network not only records the contacts on Twitter, but is also a time-aggregated network of information diffusion. When the time characteristics of users’ activities are taken into account, the dynamics of information spreading differ significantly from those of the SI, SIR, or SIS models. In this study, we explored the effect of the activity window on the generation of an interaction network using the sampled dataset, which retains the relationships and the timelines of users. Based on the dataset, we combine the activity-driven model with the activity window to simulate the dynamics of the interaction network with time-varying features. The model is named ADAW (activity-driven model with activity window) and is as follows:
– Load the follower-followee network, the input sets and the activity windows, whose distribution p(w) follows a power law with an exponential cut-off. The slope of the power-law part is −0.86 and the rate of the exponential part is −0.02. Choose 514 nodes as seeds (0.1% of all nodes) to create new tweets and push them to their friends.
– At each discrete time step t, the instant interaction network G_t starts with N nodes.
– With probability a_i each node becomes active, where a_i is derived from activity_i, the number of activities of node i in one month, which is determined by the empirical data. If node i is active, he takes one of the following actions: creating a new tweet with probability p, or going back along his timeline with probability 1−p, over a span equal to W_i, and randomly choosing m tweets to retweet or comment on. In the latter case, m links of the interaction network are generated.
– At the next time step t+1, all edges in G_t are deleted.
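The steps above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' code: a_i is taken as activity_i divided by the largest activity (the exact formula is missing from the text), a time step and the window size share the same unit, messages posted at step t become visible to followers from step t+1, and retweets are pushed on to followers. All names (simulate_adaw, followers, p_new) are hypothetical.

```python
import random


def simulate_adaw(followers, activity, windows, seeds, steps,
                  p_new=0.3, m=1, rng=None):
    """Return one edge set G_t per time step; edges are (followee, follower)."""
    rng = rng or random.Random(0)
    nodes = list(followers)
    a_max = max(activity.values())
    inputs = {v: [] for v in nodes}  # input set: (author, arrival time)

    def push(author, t):
        for follower in followers[author]:
            inputs[follower].append((author, t))

    for seed in seeds:               # seeds create the first tweets at t = 0
        push(seed, 0)

    snapshots = []
    for t in range(1, steps + 1):
        edges = set()                # G_t starts without edges
        posts = []
        for v in nodes:
            if rng.random() >= activity[v] / a_max:
                continue             # node stays inactive in this step
            if rng.random() < p_new:
                posts.append(v)      # create a new tweet
                continue
            # Go back over the timeline within the activity window W_v.
            visible = [(a, ta) for a, ta in inputs[v] if t - windows[v] <= ta < t]
            for author, _ in rng.sample(visible, min(m, len(visible))):
                edges.add((author, v))
                posts.append(v)      # the retweet reaches v's followers
        for v in posts:
            push(v, t)
        snapshots.append(edges)      # G_t is discarded before step t + 1
    return snapshots


followers = {"a": ["b", "c"], "b": ["c"], "c": []}
snapshots = simulate_adaw(followers, activity={"a": 1, "b": 1, "c": 1},
                          windows={"a": 5, "b": 5, "c": 5}, seeds=["a"],
                          steps=10, p_new=0.0)
aggregated = set().union(*snapshots)
```

In this toy run every node is active at each step and never creates new tweets, so the seed's tweet is retweeted only while it stays inside the 5-step windows; afterwards the only remaining activity is c picking up b's retweets. The aggregated network is the union of the per-step edge sets.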
To gain insight into the effects of the activity windows, we employed null models in the ADAW model. In each of the following null models, a different feature of the activity windows is randomized:
W-Random: The whole community structure in the follower-followee network and user timeline are retained. The activity windows are randomly exchanged between random nodes. The temporal correlation consisting of online time length between the follower-followee relationships is destroyed.
W-U: The whole community structure in the follower-followee network and the user timelines are retained. The activity windows are drawn from the uniform distribution p(w) = U[5, 231]. The temporal correlation consisting of the online time length between the follower-followee relationships is destroyed.
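Both null models only change the assignment of window sizes and leave the follower-followee network and timelines untouched, which a short sketch makes explicit. It assumes that W-Random is a full random permutation of the windows across nodes and that W-U draws each window independently from U[5, 231]; the function names w_random and w_uniform are hypothetical.

```python
import random


def w_random(windows, rng):
    """W-Random: shuffle the existing window sizes among the nodes."""
    nodes = list(windows)
    values = [windows[v] for v in nodes]
    rng.shuffle(values)
    return dict(zip(nodes, values))


def w_uniform(windows, rng, low=5, high=231):
    """W-U: replace every window by an independent draw from U[low, high]."""
    return {v: rng.uniform(low, high) for v in windows}


rng = random.Random(1)
empirical = {"a": 5.0, "b": 30.0, "c": 120.0}
shuffled = w_random(empirical, rng)
uniform = w_uniform(empirical, rng)
```

W-Random preserves the empirical distribution of window sizes and destroys only its correlation with the network position, whereas W-U also replaces the distribution itself.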
For each simulation we recorded the instant interaction networks G_t and accumulated them into an aggregated network when the simulation ended. Figure 6(a) presents the degree distribution of each aggregated network; all are power-law distributions, but with different slopes. The degree distribution of the interaction network generated by ADAW is the most similar to that of the Stanford dataset. In 200 repeated simulations of the ADAW model, 186 aggregated networks had degree distributions that passed the power-law test, with a slope of about −1.45 ± 0.02. In the results of the W-Random and W-U models, only 134 and 162 aggregated networks, respectively, had power-law degree distributions, with slopes of about −1.28 ± 0.02 and −1.32 ± 0.02.
Switching the activity window sizes brings more deviation to the aggregated networks in the two null models, which indicates that the activity window is a temporal factor in the dynamic mechanisms that generate interaction networks on Twitter. Figure 6(b) shows the cumulative proportions of users participating in the interaction over one month. When the correlation between the degree and the activity window is shuffled randomly, the cumulative proportions are larger than those of ADAW and W-U. This result suggests that both topological and temporal correlations slow down the spreading. Furthermore, W-U has a higher cumulative number of nodes than ADAW in the first 5 days, after which the order reverses. We interpret this phenomenon through a phased analysis of the simulation. At the beginning of the ADAW simulation, most users have a small activity window and a larger Δ_ij, which slows the growth of the number of participants. As the simulation proceeds, more and more users appear who have larger activity windows, lower Δ_ij, and cv_ij scattered around 1. The probability of interaction therefore increases, and the number of participants exceeds that of the W-U model.
Online social network systems can be extremely dynamic and are generated almost instantaneously. Understanding their dynamical properties is not only of fundamental interest, but also has a broad range of applications. One contribution of this study is a temporal network model whose nodes have an activity window, which describes the dynamics of the interaction network on Twitter. The activity window, estimated with the EM algorithm, depicts users’ fragmented participation on Twitter and follows a power-law distribution with an exponential cut-off. We have also concentrated on how the activity window affects the communication pattern of online interaction. The results show that an increase in the degrees of the social ties leads to a dramatic drop in the unevenness of the interaction stability between users. The interaction burstiness, however, is related to the degrees in the interaction network: the higher the degrees, the higher the burstiness.
The other contribution of this study is the ADAW model, developed by combining the activity window with the activity-driven model. It shows that the activity window affects the generation of the interaction network. In addition, we introduced null models into ADAW in which the activity window characteristics of the nodes are randomly shuffled. We were thus able to distinguish the effects of activity windows with different features, including a power-law distribution with an exponential cut-off, a random distribution and a uniform distribution. The results of the ADAW model are the closest to those of the real dataset, because the time characteristics of individuals’ activity slow down the speed of spreading in a small-world network [2, 23]. Comparing the simulation results of ADAW with those of the other two distributions, we also found that ADAW reduces the interaction scale relative to the randomly shuffled windows, but eventually yields a larger scale than the uniform distribution.
Our results also provide new insights for describing the temporal variation of online social networks. In particular, the ADAW model complements existing ways of depicting information diffusion and can be extended further by including other social processes. Our future work will therefore explore how the complex contagion mechanism works on it.
JZ is grateful to Professor Ying-Cheng Lai and his students of Arizona State University for the helpful discussions, insights, and the support of the experimental equipment. The authors would also like to thank the anonymous referees for their careful work and thoughtful suggestions that have helped improve this paper substantially.
Iribarren J.L., Moro E., Impact of human activity patterns on the dynamics of information diffusion, Phys. Rev. Lett., 2009, 103, 3, 038702.
Ruiz C.V., Aiello L.M., Jaimes A., Modeling dynamics of attention in social media with user efficiency, EPJ Data Sci., 2014, 3, 1, 5-20.
Pronovost G., Social Time, Curr. Sociol., 1989, 37, 3, 1-98.
Guille A. et al., Information diffusion in online social networks: A survey, ACM SIGMOD Rec., 2013, 42, 2, 17-28.
Honey C., Herring S.C., Beyond microblogging: Conversation and collaboration via Twitter, Proc. Int. Conf. System Sciences IEEE (2009, Washington, DC, USA), 1-10.
Saito K. et al., Learning diffusion probability based on node attributes in social networks, Proc. Int. Symp. Methodologies for Intelligent Systems (2011, Warsaw, Poland), 153-162.
Yang Z. et al., Understanding retweeting behaviors in social networks, Proc. 19th ACM Int. Conf. on Information and Knowledge Management (2010, Toronto, Canada), 1633-1636.
Macskassy S.A., Michelson M., Why do people retweet? Anti-homophily wins the day!, Proc. Int. Conf. on Weblogs and Social Media (2011, Barcelona, Catalonia, Spain).
Castellano C., Fortunato S., Loreto V., Statistical physics of social dynamics, Rev. Mod. Phys., 2009, 81, 2, 591-646.
Holme P., Saramäki J., Temporal networks, Phys. Rep., 2012, 519, 3, 97-125.
Gomez Rodriguez M., Leskovec J., Schölkopf B., Structure and dynamics of information pathways in online media, Proc. 6th ACM Int. Conf. on Web Search and Data Mining (2013, New York, USA), 23-32.
Liljeros F., Edling C.R., Amaral L.A.N. et al., The web of human sexual contacts, Nature, 2001, 411, 6840, 907-908.
Miritello G., Lara R., Moro E., Time allocation in social networks: correlation between social structure and human communication dynamics, Temporal Networks, 2013, Springer Berlin Heidelberg, 175-190.
Vázquez A. et al., Modeling bursts and heavy tails in human dynamics, Phys. Rev. E, 2006, 73, 3, 036127.
Wu S. et al., Who says what to whom on Twitter, Proc. 20th Int. Conf. on World Wide Web (2011, Hyderabad, India), 705-714.
Tuljapurkar S., Infectious diseases of humans: Dynamics and control, Science, 1991, 254, 5031, 591-593.
Guille A., Hacid H., A predictive model for the temporal dynamics of information diffusion in online social networks, Proc. 21st Int. Conf. on World Wide Web (2012, Lyon, France), 1145-1152.
Leskovec J., Krevl A., Large Network Dataset Collection, http://snap.stanford.edu/data June 2014.
About the article
Published Online: 2018-11-19
Conflicts of Interest: The authors declare that there is no conflict of interest regarding the publication of this paper.
Funding Statement: This work was supported in part by the National Natural Science Foundation (grant numbers 71401024 and 71801145).
Citation Information: Open Physics, Volume 16, Issue 1, Pages 685–691, ISSN (Online) 2391-5471, DOI: https://doi.org/10.1515/phys-2018-0087.
© 2018 Jun Zhang et al., published by De Gruyter. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (CC BY-NC-ND 4.0).