Attention Dynamics on the Web
Prof. Christian Bauckhage
found in IEEE Spectrum, Oct. 20, 2014:
M. Jordan: . . . If I just look at all the people who have a heart attack
and compare them to all the people that don’t have a
heart attack, and I’m looking for combinations of the
columns that predict heart attacks, I will find all kinds of
spurious combinations of columns, because there are
huge numbers of them. . . .
Spectrum: Do you think this aspect of big data is currently
underappreciated?
M. Jordan: Definitely. . . .
my translation:
• there is a need for
• model driven data mining
• interpretable data mining
this presentation:
• illustrates what this may mean
The subject of collective attention is
central to an information age where
millions of people are inundated with
daily messages.
It is thus of interest to understand
how attention to novel items propagates
and eventually fades among large
populations.
Wu & Huberman, 2007
Overview
Internet Memes
Bauckhage, Kersting, Hadiji, Proc. AAAI ICWSM, 2013
Social Media & Web-based Services
Bauckhage, Kersting, Rastegarpanah, Proc. ACM WWW, 2014
Bauckhage, Kersting, arXiv, 2014
Internet Memes
mundane viral content on the Web:
• catch phrases (“epic fail”, “so much win”, . . . )
• videos (“chocolate rain”, “gangnam style”, . . . )
• images (“y u no”, “o rly”, “advice dog”, . . . )
Internet Memes
originators:
• 4chan, tumblr, youtube, memegenerator, . . .
multipliers:
• reddit, failblog, knowyourmeme, . . .
recipients:
• the social Web at large
Internet Memes
content evolution: e.g. “y u no” meme
original phenotype
mutations alluding to pop culture
references to meme culture
occurrences in other media
Internet Memes
life cycles: typical patterns of growing and declining popularity
[Figure: two panels of popularity time series (scale 0 to 100, years 2004 to 2013) for the memes "o rly", "fmylife", "call me maybe", "llama song", "y u no", and "has cheezburger"; one panel is captioned "right skewed, narrow about peak", the other "right skewed, broad about peak"]
Internet Memes
Internet memes:
• dynamic media objects
• inside jokes / hip underground knowledge
• transgress media and cultural boundaries
• show common patterns w.r.t. collective attention dynamics
[Figure: two panels of popularity time series (scale 0 to 100, years 2004 to 2013) for the memes "numa numa", "internet is for ...", "lol wut", "double rainbow", "herp derp", and "courage wolf"; one panel is captioned "right skewed, narrow about peak", the other "right skewed, broad about peak"]
Internet Memes and Collective Attention
questions:
Q1: which social mechanisms explain emergence of noticeably
skewed meme related outbreak data?
Q2: which mathematical functions model these mechanisms?
Q3: how well do such models fit empirical data?
Q4: are Internet memes nothing but fads?
Internet Memes and Collective Attention
fads:
• behavior related to ideas,
activities, or products
• enthusiastically followed by
large populations
• when fad catches on, number
of adopters grows rapidly
• once perception of novelty is
gone, behavior fades out
• dynamics often modeled using
life-time distributions
Internet Memes and Collective Attention
approach:
• analyze > 200 meme related Google Trends time series
• use of Google Trends is increasingly popular and justified:
J. Teevan et al., Understanding and Predicting Personal
Navigation, Proc. WSDM, 2011
J. Mellon, Search Indices and Issue Salience, Sociology Working
Papers, University of Oxford, 2011
• investigate use of life-time distributions for modeling
• in particular: Weibull, Gompertz, Frechet, and Log-Normal;
the latter for baseline comparison to our previous work in
C. Bauckhage, Insights into Internet Memes, Proc. ICWS, 2011
• provide physical explanation for good performance
Life-Time Distributions
Definitions and Properties
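Weibull:
$f_{WB}(t \mid \kappa, \lambda) = \frac{\kappa}{\lambda} \left(\frac{t}{\lambda}\right)^{\kappa-1} e^{-(t/\lambda)^{\kappa}}$
Gompertz:
$f_{GO}(t \mid \eta, \gamma) = \gamma \eta \, e^{\gamma t} \, e^{-\eta (e^{\gamma t} - 1)}$
Frechet:
$f_{FR}(t \mid \alpha, \beta) = \frac{\alpha}{\beta} \left(\frac{t}{\beta}\right)^{-\alpha-1} e^{-(t/\beta)^{-\alpha}}$
Log-Normal:
$f_{LN}(t \mid \mu, \sigma) = \frac{1}{t \sigma \sqrt{2\pi}} \, e^{-\frac{(\log t - \mu)^2}{2\sigma^2}}$
[Figure: four panels plotting f_WB(t), f_GO(t), f_FR(t), and f_LN(t) for t from 1 to 13; the Weibull panel varies κ = 1.0, 1.5, 2.0, 2.5, 3.0, the Frechet panel varies α = 0.75, 1.00, 1.50, 2.00, 2.50, and the Gompertz and Log-Normal panels vary γ and σ]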
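The four densities can be written down directly from their definitions. The following is a minimal sketch in plain Python, not code from the talk: the function names, the midpoint integrator, and the parameter values are assumptions chosen only to check that each formula behaves like a probability density.

```python
import math


def weibull_pdf(t, kappa, lam):
    """Weibull density f_WB(t | kappa, lambda) for t > 0."""
    z = t / lam
    return (kappa / lam) * z ** (kappa - 1.0) * math.exp(-(z ** kappa))


def frechet_pdf(t, alpha, beta):
    """Frechet density f_FR(t | alpha, beta) for t > 0."""
    z = t / beta
    return (alpha / beta) * z ** (-alpha - 1.0) * math.exp(-(z ** (-alpha)))


def gompertz_pdf(t, eta, gamma):
    """Gompertz density f_GO(t | eta, gamma) for t >= 0."""
    growth = math.exp(gamma * t)
    return gamma * eta * growth * math.exp(-eta * (growth - 1.0))


def lognormal_pdf(t, mu, sigma):
    """Log-normal density f_LN(t | mu, sigma) for t > 0."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, n=20_000):
    """Midpoint rule; it never evaluates f at the endpoints a and b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Hypothetical parameter choices, used only to exercise the formulas.
for name, pdf in [
    ("Weibull", lambda t: weibull_pdf(t, 2.0, 3.0)),
    ("Frechet", lambda t: frechet_pdf(t, 2.5, 3.0)),
    ("Gompertz", lambda t: gompertz_pdf(t, 0.1, 0.5)),
    ("Log-Normal", lambda t: lognormal_pdf(t, 1.0, 0.5)),
]:
    print(name, round(integrate(pdf, 0.0, 30.0), 4))
```

Each printed value approximates the probability mass on the interval from 0 to 30. The Frechet value stays visibly below 1 because of its heavy right tail.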
Life-Time Distributions
Empirical Results (1)
[Figure: four panels ("o rly", "has cheezburger", "it's over 9000", "ytmnd"), each showing the Google Trends time series (scale 0 to 100, years up to 2013) together with the Weibull, Gompertz, Frechet, and Log-Normal fits]
Life-Time Distributions
Empirical Results (2)
data sets:
• set 1: time series that grow, peak, and decline during
observation period
• set 2: time series that only decline or only grow during
observation period
                             f_WB     f_GO     f_FR     f_LN
204 memes (set 1)            16.6%    12.2%    62.0%     9.2%
10 memes (set 2)             30.0%    10.0%    50.0%    10.0%
all memes (set 1 ∪ set 2)    17.2%    12.1%    61.4%     9.3%
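The slides do not say how the distributions were fitted to the time series. Purely as an illustration, the sketch below fits a scaled Weibull density to a synthetic series by brute-force least squares over a parameter grid. The function name `fit_scaled_weibull`, the grid ranges, the amplitude parameter, and the synthetic data are all assumptions, not the authors' procedure or data.

```python
import math


def weibull_pdf(t, kappa, lam):
    z = t / lam
    return (kappa / lam) * z ** (kappa - 1.0) * math.exp(-(z ** kappa))


def fit_scaled_weibull(times, values):
    """Grid-search least squares for values[i] ~ amp * f_WB(times[i] | kappa, lam)."""
    best = None
    for k in range(10, 41):            # kappa = 1.0, 1.1, ..., 4.0
        kappa = k / 10.0
        for lam in range(1, 61):       # lambda = 1, 2, ..., 60 time steps
            f = [weibull_pdf(t, kappa, float(lam)) for t in times]
            denom = sum(x * x for x in f)
            if denom == 0.0:
                continue
            # For fixed (kappa, lam) the best amplitude has a closed form.
            amp = sum(y * x for y, x in zip(values, f)) / denom
            sse = sum((y - amp * x) ** 2 for y, x in zip(values, f))
            if best is None or sse < best[0]:
                best = (sse, kappa, float(lam), amp)
    return best


# Synthetic, noise-free series (NOT real Google Trends data).
times = list(range(1, 121))
values = [1500.0 * weibull_pdf(t, 2.0, 24.0) for t in times]
sse, kappa, lam, amp = fit_scaled_weibull(times, values)
print(kappa, lam, round(amp, 1), sse)
```

Repeating such a fit for each candidate distribution and comparing the residuals would be one way to decide which model describes a given series best. A real analysis would more likely use a proper optimizer and a statistical goodness-of-fit test.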
Interpretation
growth equations:
• propensity of an entity (e.g. attention) to grow:
continuously growing function g(t)
• propensity of an entity (e.g. attention) to decline:
continuously growing function d(t)
• common general growth dynamics:
f(t) = g(t) − d(t)   or   f(t) = g(t) / d(t)
⇒ f(t) grows as long as g(t) grows more quickly than d(t)
Interpretation
Weibull pdf and cdf:
$f(t) = b c \, t^{c-1} e^{-b t^{c}}$
$F(t) = 1 - e^{-b t^{c}}$
[Figure: f(t) and F(t) plotted over t]
therefore:
$f(t) = b c \, t^{c-1} - b c \, t^{c-1} F(t)$
thus:
• the Weibull implicitly encodes a subtractive growth process
• growth and decline are polynomial in t
• decline depends on F(t)
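The subtractive form follows from substituting $e^{-b t^{c}} = 1 - F(t)$ into the pdf. A quick numerical check, sketched here with hypothetical values for b and c, confirms that the pdf and the growth-minus-decline expression agree up to floating-point rounding.

```python
import math


def weibull_pdf(t, b, c):
    """f(t) = b c t^(c-1) exp(-b t^c)."""
    return b * c * t ** (c - 1.0) * math.exp(-b * t ** c)


def weibull_cdf(t, b, c):
    """F(t) = 1 - exp(-b t^c)."""
    return 1.0 - math.exp(-b * t ** c)


def growth_minus_decline(t, b, c):
    """g(t) - d(t) with g(t) = b c t^(c-1) and d(t) = g(t) F(t)."""
    growth = b * c * t ** (c - 1.0)
    decline = growth * weibull_cdf(t, b, c)
    return growth - decline


b, c = 0.05, 2.0  # hypothetical parameters, for illustration only
ts = [0.5 * k for k in range(1, 41)]  # t = 0.5, 1.0, ..., 20.0
max_gap = max(abs(weibull_pdf(t, b, c) - growth_minus_decline(t, b, c)) for t in ts)
print(max_gap)
```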
Interpretation
Gompertz and Frechet:
$f(t) = b c \, e^{c t} - b c \, e^{c t} F(t)$
and
$f(t) = \frac{F(t)}{b c \, t^{c+1}}$
thus:
• Gompertz and Frechet also encode growth dynamics
note:
• the Log-Normal cannot be expressed this way
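Both statements can be checked numerically. The sketch below assumes the Gompertz cdf $F(t) = 1 - e^{-b(e^{ct} - 1)}$ and, for the Frechet, the (α, β) parametrization from the definitions slide rather than the constants b and c used here. It therefore only verifies that the Frechet pdf is F(t) divided by a multiple of $t^{\alpha+1}$, without committing to a particular constant. All parameter values are hypothetical.

```python
import math


def gompertz_pdf(t, b, c):
    return b * c * math.exp(c * t) * math.exp(-b * (math.exp(c * t) - 1.0))


def gompertz_cdf(t, b, c):
    return 1.0 - math.exp(-b * (math.exp(c * t) - 1.0))


def frechet_pdf(t, alpha, beta):
    z = t / beta
    return (alpha / beta) * z ** (-alpha - 1.0) * math.exp(-(z ** (-alpha)))


def frechet_cdf(t, alpha, beta):
    return math.exp(-((t / beta) ** (-alpha)))


ts = [0.5 * k for k in range(2, 41)]  # t = 1.0, 1.5, ..., 20.0

# Gompertz: subtractive form f(t) = g(t) - g(t) F(t) with g(t) = b c e^(ct).
b, c = 0.1, 0.3
gompertz_gap = max(
    abs(gompertz_pdf(t, b, c)
        - (b * c * math.exp(c * t) - b * c * math.exp(c * t) * gompertz_cdf(t, b, c)))
    for t in ts
)

# Frechet: divisive form, f(t) * t^(alpha + 1) / F(t) does not depend on t.
alpha, beta = 1.5, 3.0
ratios = [
    frechet_pdf(t, alpha, beta) * t ** (alpha + 1.0) / frechet_cdf(t, alpha, beta)
    for t in ts
]
frechet_spread = max(ratios) - min(ratios)

print(gompertz_gap, frechet_spread, alpha * beta ** alpha)
```

In this parametrization the constant ratio equals α β^α, so growth is carried by F(t) and decline by a power of t.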
Conclusion
collective attention to memes:
• depends on F(t), the amount of attention attracted so far
⇔ attention attracted so far influences future popularity
⇔ the cumulative distribution F(t) acts as a momentum term
whose drag increases over time
⇔ the more a population gets used
to a meme, the more quickly it loses
its appeal
Internet memes
seem to be fads
Social Media & Web-based Services
[Figure: six panels (buzznet, librarything, flickr, failblog, studiVZ, wikipedia), each showing the Google Trends time series (scale 0 to 100, years 2004 to 2013) together with a shifted Gompertz fit]
Social Media and Collective Attention
approach:
• consider 175 social media and Web-based services
• collect Google Trends data for 45 countries and worldwide
⇒ analyze > 8,000 time series
• investigate use of economic diffusion models
• shifted Gompertz
• Weibull
• Bass
Economic Diffusion models
Definitions and Properties
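shifted Gompertz:
$f_{SG}(t \mid \eta, \beta) = \beta \, e^{-\beta t - \eta e^{-\beta t}} \left[ 1 + \eta \left( 1 - e^{-\beta t} \right) \right]$
Bass:
$f_{BA}(t \mid p, q) = \frac{(p+q)^2}{p} \, \frac{e^{-(p+q) t}}{\left( 1 + \frac{q}{p} e^{-(p+q) t} \right)^2}$
[Figure: f_SG(t) for β = 0.5, 1.0, 2.5, 5.0, 7.5 and f_BA(t) for p = 0.033, 0.066, 0.100, 0.250, 0.500, each plotted for t from 1 to 13]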
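As a sanity check on the two diffusion densities, the sketch below implements them in plain Python, integrates each numerically, and evaluates the Bass density at its mode $t^* = \ln(q/p) / (p+q)$, which exists for q > p. The helper names and the parameter values are assumptions. Here β = 0.5 and p = 0.033 appear in the plot legends, while η = 5 and q = 0.4 are made up for illustration.

```python
import math


def shifted_gompertz_pdf(t, eta, beta):
    """f_SG(t | eta, beta) for t >= 0."""
    u = math.exp(-beta * t)
    return beta * u * math.exp(-eta * u) * (1.0 + eta * (1.0 - u))


def bass_pdf(t, p, q):
    """f_BA(t | p, q) for t >= 0."""
    u = math.exp(-(p + q) * t)
    return ((p + q) ** 2 / p) * u / (1.0 + (q / p) * u) ** 2


def bass_peak_time(p, q):
    """Mode of the Bass density; positive only when q > p."""
    return math.log(q / p) / (p + q)


def integrate(f, a, b, n=20_000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


eta, beta = 5.0, 0.5   # eta is a hypothetical value
p, q = 0.033, 0.4      # q is a hypothetical value
sg_mass = integrate(lambda t: shifted_gompertz_pdf(t, eta, beta), 0.0, 60.0)
ba_mass = integrate(lambda t: bass_pdf(t, p, q), 0.0, 60.0)
t_star = bass_peak_time(p, q)
print(round(sg_mass, 4), round(ba_mass, 4), round(t_star, 2))
```

Both integrals should come out close to 1, and the Bass density should be no larger slightly before or after `t_star` than at `t_star` itself.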
Economic Diffusion models
Empirical Results (1)
Table: Goodness of fit w.r.t. regions of the world

              f_SG               f_BA               f_WB
region        ⟨p⟩    p > 0.05    ⟨p⟩    p > 0.05    ⟨p⟩    p > 0.05
Africa        0.61   68%         0.55   62%         0.50   57%
Asia          0.57   63%         0.49   54%         0.48   53%
Australia     0.66   70%         0.53   59%         0.50   58%
Europe        0.59   65%         0.48   51%         0.56   54%
N-America     0.54   57%         0.44   50%         0.39   44%
S-America     0.65   71%         0.54   59%         0.55   62%
worldwide     0.59   64%         0.50   55%         0.47   53%
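The two columns reported per model can be read as summary statistics over the p-values of many individual fits. The sketch below assumes that ⟨p⟩ denotes the mean p-value and that "p > 0.05" is the share of time series whose fit is not rejected at the 5% level. The slides do not state which test produced the p-values, and the list below is made-up illustration data, not results from the talk.

```python
def summarize_p_values(p_values, level=0.05):
    """Return (mean p-value, share of p-values above the significance level)."""
    mean_p = sum(p_values) / len(p_values)
    share = sum(1 for p in p_values if p > level) / len(p_values)
    return mean_p, share


# Hypothetical p-values, one per fitted time series.
p_values = [0.61, 0.03, 0.20, 0.74, 0.01, 0.35, 0.09, 0.04]
mean_p, share = summarize_p_values(p_values)
print(f"<p> = {mean_p:.2f}, p > 0.05 for {share:.0%} of the series")
```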
Economic Diffusion models
Empirical Results (2)
Table: Goodness of fit w.r.t. languages of the world

              f_SG               f_BA               f_WB
language      ⟨p⟩    p > 0.05    ⟨p⟩    p > 0.05    ⟨p⟩    p > 0.05
English       0.55   58%         0.44   49%         0.39   45%
Spanish       0.63   68%         0.52   56%         0.54   60%
Portuguese    0.60   67%         0.50   56%         0.47   51%
Russian       0.68   76%         0.58   66%         0.69   76%
French        0.55   60%         0.46   51%         0.39   45%
German        0.58   64%         0.47   52%         0.47   54%
Chinese       0.50   52%         0.42   46%         0.43   47%
Japanese      0.42   52%         0.38   44%         0.31   38%
Hindi         0.57   64%         0.47   54%         0.48   52%
average       0.57   62%         0.47   52%         0.45   51%
Economic Diffusion models
Empirical Results (3)
[Figure: eight panels (google+, myspace, facebook, ebay, craigslist, amazon, youtube, twitter), each showing the Google Trends time series (years up to 2013) together with the Weibull, Bass, and shifted Gompertz fits]
Economic Diffusion models
Empirical Results (4)
[Figure: log-log scatter plot of the fitted shifted Gompertz parameters, shape η (10^0 to 10^2) against scale β (10^-5 to 10^-1), with labeled points for amazon, ebay, facebook, myspace, twitter, youtube, google+, and craigslist]
Interpretation
observations / conclusions:
• economic diffusion models provide accurate and significant
explanations of general trends in aggregated search
frequency data related to social media
• collective attention to social media evolves according to
simple and highly regular dynamics of growth and decline
• collective attention to social media evolves similarly across
the globe, independent of the regions of origin or cultural
backgrounds of Web users
• most social media services are able to attract growing
collective attention for a period of 4 to 6 years before user
interest inevitably begins to subside
Predictions
examples:
[Figure: six panels (facebook, youtube, twitter, ebay, paypal, amazon), each showing the Google Trends time series together with Weibull, Bass, and shifted Gompertz fits extrapolated through 2018]
social media services
seem to follow hype cycles
Finally . . .
hype cycles are everywhere:
[Figure: three panels for the search terms "social media", "cloud computing", and "big data", each showing the Google Trends time series together with a shifted Gompertz fit / prediction extrapolated through 2024]
Summary
Internet memes:
• are a staple of contemporary Web culture
• grow and decline in popularity
dynamics of collective attention to memes:
• can be explained using plausible models
• show characteristics of fads / hype cycles
dynamics of collective attention to social media:
• can be explained using economic diffusion models
• appear to be highly regular and predictable
Thank You!
mail:
• christian.bauckhage@iais.fraunhofer.de
web:
• mmprec.iais.fraunhofer.de/bauckhage
social networks:
• ResearchGate
• LinkedIn
• XING