Author Archives: Chris Danforth

Moose on the Loose!

Note: a version of this post was given by the author for Invocation at the UVM College of Engineering & Mathematical Sciences graduation ceremony, Flynn Theatre, May 18, 2014.

A few weeks ago—on one of those beautiful spring mornings that makes the long winter seem like it happened elsewhere—something quite remarkable took place here at the University of Vermont.

At the time, I was sitting outside on a small wooden picnic bench near my office on Trinity campus. The sun was shining, the birds were chirping their layered periodic rhythms, and the Green Mountains were finally living up to their name after months of radiating white.

Vermont Mountains Panorama at Sunrise Mt Ascutney, Vermont, New England, USA

I’d love to say that I was meditating within our glorious landscape. I’d really like to say that I was deeply appreciating the sacred gifts offered by Mother Nature. The truth is that I was totally geeking out.

I was reading storylabber Dilan Kiley’s undergraduate Honors thesis. It was awesome! He quantified the spread of information on Twitter in response to sudden, unanticipated events. These system-scale shocks can briefly synchronize our society’s chaotic collective attention. And while reading about them in Dilan’s thesis, I experienced one myself.

I heard a strange noise nearby and looked up to find a moose staring at me from 10 feet away. Well played, Dilan.

The moose was out of breath, having been chased up the hill from the lake by an excited mob of followers. In a moment that seemed to last several seconds, we looked at each other. I was in awe. The moose looked unsure, confused, and lost.

Our time together was quickly interrupted by animal control officers. They were sprinting after the moose, trying to steer it safely into the woods. A small flock of undergraduates followed, looking at each other in disbelief at what they were seeing.

Not surprisingly, a seven-foot-tall, thousand-pound wild animal jogging through campus caused quite a stir! Pictures of the moose received thousands of likes on social media, and #mooseontheloose started trending, at least here in Vermont. After the moose spent a few hours in the spotlight, Vermont Fish & Wildlife happily reported that it had found its way back into the woods north of campus.

I tell you this story today because I think the moose’s adventure offers some lessons for us as you wander off campus to find your way home.

In the past few weeks, I’ve spoken to many of you, asking about your plans for the future. This is a time of great transition in your life. Most of you don’t have a grand plan, or even a muddy pond to call home.

Like the moose, you too may feel a bit lost. You too will have many people taking your picture, and making a big fuss over you. 

Over the coming months, you too will have well-intentioned loved ones trying to steer you to a safe path in life, advising you where to go, and what to do. You too will have to find your way through a noisy, often confusing set of uncertain options.

As people, we imitate role models whom we admire, using their past choices to inform our own. As scientists, we use mathematical models to make predictions, which are helpful, because unfortunately, observations of the future are not available at this time [1].

Seemingly inconsequential decisions that you make may change your life in the biggest ways. But which decisions are most important? On which decisions will your life be sensitively dependent?

I reached out to the hero of our story via his parody Twitter account @BTVMoose. Really. Talk about geeking out. I asked for words of wisdom for the class of 2014. Overcoming the great modern difficulty of finding a wireless internet signal in the dense forest, he was able to tweet this advice:

To paraphrase this bit of spiritual guidance: you may need to wander around a bit, before you find your way.

[1] Original quote from Knutson and Tuleya, Journal of Climate, 2005.

 


Filed under geohappiness, mathematics, prediction, social phenomena

Now online: the Dow Jones Index of Happiness

Total excitement, people: our website hedonometer.org has gone live. We’re measuring Twitter’s happiness in real time. Please check it out!

If you’re still here, here’s the blurb from the site’s about page:

Happiness: It’s what most people say they want. So how do we know how happy people are? You can’t improve or understand what you can’t measure. In a blow to happiness, we’re very good at measuring economic indices and this means we tend to focus on them. With hedonometer.org we’ve created an instrument that measures the happiness of large populations in real time.

Our hedonometer is based on people’s online expressions, capitalizing on data-rich social media, and we’re measuring how people present themselves to the outside world. For our first version of hedonometer.org, we’re using Twitter as a source but in principle we can expand to any data source in any language. We’ll also be adding an API soon.

So this is just a start – we invite you to explore the Twitter time series, let us know what you think, and follow the daily updates through the hedonometer Twitter feed.


Filed under networks, psychology, social phenomena

A data-driven study of the patterns of life for 180,000 people

Here at the Computational Story Lab, some of us commute by foot, some by car, and a few deliver themselves by bike, even in the middle of our cold, snowful Vermont winter.  Occasionally, we transport ourselves over very long distances in magic flying tubes with wings to attend conferences, to see family, or for travel.  So what do our movement patterns look like over time?  Are there distinct kinds of movement patterns as we look across populations, or are they variations on a single theme?

Inspired by an analysis of mobile phone data by Marta Gonzalez at MIT, James Bagrow at Northwestern, and colleagues, we used 37 million geotagged tweets to characterize the movement patterns of 180,000 people during their 2011 travels. We used the standard deviation in their position, a.k.a. radius of gyration, as a reflection of their movement. As an example, below we plot a dot for each geotagged tweet we found posted in the San Francisco Bay area, colored by the author’s radius of gyration.


The Bay Area is shown with a dot for each tweet, colored by the radius of gyration of its author. The color scale is logarithmic, so we can compare people with very different habits.

You can see from the picture that there are many people with a radius near 100km tweeting from downtown San Francisco. This pattern could reflect a concentration of tourists visiting the area, or individuals who live downtown and travel for work or pleasure. Images for New York City, Chicago, and Los Angeles are also quite beautiful.
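For readers who want to try this on their own data, here is a minimal sketch of the radius of gyration as the root-mean-square distance of a person's positions from their average position. It assumes positions are already flat (x, y) coordinates in kilometers; real geotags would first need a map projection, and the two example users are made up.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centroid.

    `points` is a list of (x, y) positions in kilometers on a flat plane
    (an assumption made for this sketch).
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one position")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Two hypothetical users: one stays within a kilometer, one roams 100 km.
homebody = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
traveler = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]

print(radius_of_gyration(homebody))
print(radius_of_gyration(traveler))
```

The traveler's radius comes out 100 times larger than the homebody's, which is why a logarithmic color scale helps when both kinds of people share a map.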

In the image below, we rotated every individual’s movement pattern so that the origin represents their average location, and the horizontal line heading to the left represents their principal axis (most likely the path from home to work). We also stretched or shrunk the vertical and horizontal axes for each individual, so that everyone could fit on the same picture. Basically, we have a heatmap of collective movement, with each individual in their own intrinsic reference frame. The immediate good news for these kinds of data-driven studies is that we see a very similar form to those found for mobile phone data sets. Apart from being a different social signal, geotagged tweets also have much better spatial resolution than mobile phone calls, which are referenced by the nearest cellphone tower.

Movement pattern exhibited by 180,000 individuals in 2011, as inferred from 37 million geolocated tweets. Colormap shows the probability density in log10. Note that despite the resemblance, this image is neither a nested rainbow horseshoe crab, nor the Mandelbrot set.


Several features of the map reveal interesting patterns. First, the teardrop shape of the contours demonstrates that people travel predominantly along their principal axis, with deviations becoming shorter and less frequent as they move farther away. Second, the appearance of two spatially distinct yellow regions suggests that people spend the vast majority of their time near two locations. We refer to these locations as the work and home locales, where the home locale is centered on the dark red region right of the origin, and the work locale is centered just left of the origin.

Finally, we see a clear horizontal asymmetry indicating the increasingly isotropic variation in movement surrounding the home locale, as compared to the work locale. We suspect this to be a reflection of the tendency to be more familiar with the surroundings of one’s home, and to explore these surroundings in a more social context. The up-down symmetry demonstrates the remarkable consistency of the movement patterns revealed by the data.
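The intrinsic reference frame behind this map can be sketched in a few lines: center each person's positions on their average location, then rotate so the direction of greatest spread lies along the horizontal axis. This is only a rough illustration with flat (x, y) coordinates assumed; it leaves out the per-person stretching and the choice of which end of the axis points left, and the function name is hypothetical.

```python
import math


def intrinsic_frame(points):
    """Center points on their mean and rotate the principal axis onto x."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]

    # Entries of the 2x2 covariance matrix of the centered positions.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Orientation of the direction of largest variance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate every point by -theta so that direction becomes horizontal.
    return [(c * x + s * y, -s * x + c * y) for x, y in centered]


# A made-up commuter who moves back and forth along a diagonal road.
commuter = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
for x, y in intrinsic_frame(commuter):
    print(round(x, 3), round(y, 3))
```

After the rotation, the commuter's diagonal road lies flat along the horizontal axis, so many people's patterns can be overlaid in one picture.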

We see a clear separation between the most likely and second most likely position.


Looking just at the messages posted along the work-home corridor, the distribution is skewed left, with movement from home in a heading opposite work seen to be highly unlikely.

The isotropy ratio shows the change in the probability density's shape as a function of radius.


Above we see that individuals who move around a lot have a much larger variation in their positions along their principal axis, exhibiting a less circular pattern of life than people who stay close to home. Remarkably, the isotropy ratio decays logarithmically with radius.
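One way to put a number on "how circular" a pattern of life is: compare the spread across the principal axis with the spread along it. The sketch below assumes the isotropy ratio is that ratio of standard deviations (1 for a perfectly round pattern, 0 for motion along a single line); see the paper for the exact definition used in the figure. Coordinates are assumed flat and the sample points are invented.

```python
import math


def isotropy_ratio(points):
    """Std. dev. across the principal axis divided by std. dev. along it."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix give the two variances.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major = half_trace + spread
    minor = max(half_trace - spread, 0.0)  # guard against round-off
    if major == 0.0:
        raise ValueError("all positions are identical")
    return math.sqrt(minor / major)


round_pattern = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
stretched_pattern = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]

print(isotropy_ratio(round_pattern))
print(isotropy_ratio(stretched_pattern))
```

Under this assumed definition, the stretched pattern scores half of the round one, matching the intuition that long-distance movers trace out skinnier shapes.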

Finally, we grabbed messages from the most prolific tweople, those 300 champions who had posted more than 10,000 geotagged messages in 2011. We received 10% of these messages through our gardenhose feed from Twitter. Below, we plot the times during the week that they post from their most frequently visited location. These folks most likely have the geotag switch on for all messages, and exhibit a very regular routine.

A robust diurnal cycle is observed in the hourly time of day at which statuses are updated, with those from the mode location (black curve) occurring more often than other locations (red curve) in the morning and evening.


Peaks in activity are seen in the morning (8-10am) and evening (10pm-midnight), separated by lulls in the afternoon (2-4pm) and overnight (2-4am) hours.  As we and our friend Captain Obvious would expect, people tend to tweet more from their home locale than any other locale (red curve) in the morning and evening.

Bottom line: Despite our seemingly different patterns of life, we are remarkably similar in the way we move around. Our walks are a far cry from random.

Next up: We’ll examine the emotional content of tweets as a function of distance.  Is home where the heart is?

For more details on these results, see our paper Happiness and the Patterns of Life: A Study of Geolocated Tweets.


Filed under networks, physics, prediction, social phenomena

Chaos in an Atmosphere Hanging on a Wall

This month marks the 50th anniversary of the 1963 publication of Ed Lorenz’s groundbreaking paper, Deterministic Nonperiodic Flow, in the Journal of the Atmospheric Sciences. This seminal work, now cited more than 11,000 times, inspired a generation of mathematicians and physicists to bravely relax their linear assumptions about reality, and embrace the nonlinearity governing our complex world. Quoting from the abstract of his paper:

‘A simple system representing cellular convection is solved numerically. All of the solutions are found to be unstable, and almost all of them are nonperiodic.’

While many scientists had observed and characterized nonlinear behavior before, Lorenz was the first to simulate this remarkable phenomenon in a simple set of differential equations using a computer. He went on to demonstrate the limit of predictability of the atmosphere to be roughly 2 weeks, the time it takes for two virtually indistinguishable weather patterns to become completely different. No matter how accurate our satellite measurements get, no matter how fast our computers become, we will never be able to predict the likelihood of rain beyond 14 days. This phenomenon became known as the butterfly effect, popularized in James Gleick’s book Chaos.


Lorenz’s sketch of the attractor for his system.
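If you'd like to watch two virtually indistinguishable states drift apart yourself, here is a small sketch of the Lorenz system with the classic parameter values (10, 28, 8/3). The fourth-order Runge-Kutta integrator, the step size, and the starting points are choices made for this illustration, not a reconstruction of Lorenz's original computation.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2.0))
    k3 = lorenz(shifted(state, k2, dt / 2.0))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(first, second, dt=0.01, steps=4000):
    """Distance between two trajectories at every time step."""
    history = [math.dist(first, second)]
    for _ in range(steps):
        first, second = rk4_step(first, dt), rk4_step(second, dt)
        history.append(math.dist(first, second))
    return history


# Two starting points that differ by one part in a hundred million.
history = separation_history((1.0, 1.0, 1.0), (1.0, 1.0, 1.0 + 1e-8))
print("initial separation:", history[0])
print("largest separation:", max(history))
```

The two runs start closer together than any instrument could resolve, yet the gap between them grows until it is as large as the attractor itself.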

Inspired by the work of Lorenz and colleagues, in our lab at the University of Vermont we’re using Computational Fluid Dynamics (CFD) simulations to understand the flow behaviors observed in a physical experiment. It’s a testbed for developing mathematical techniques to improve the predictions made by weather and climate models. Here you’ll find a brief video describing the experiment analogous to the model developed by Lorenz:

And below you’ll find a CFD simulation of the dynamics observed in the experiment:

What is most remarkable about Lorenz’s 1963 model is its relevance to the state-of-the-art in weather prediction today, despite the enormous advances that have been made in theoretical, observational, and computational studies of the Earth’s atmosphere. Every PhD student working in the field of weather prediction cuts their teeth testing data assimilation schemes on simple models proposed by Lorenz. His influence is incalculable.

In 2005, while I was a PhD student in Applied Mathematics at the University of Maryland, the legendary Lorenz visited my advisor Eugenia Kalnay in her office in the Department of Atmospheric & Oceanic Science. At some point during his stay, he penned the following on a piece of paper:

‘Chaos: When the present determines the future, but the approximate present does not approximately determine the future.’

Even near the end of his career, Lorenz was still searching for the essence of nonlinearity, seeking to describe this incredibly complicated phenomenon in the simplest of terms.

_______________________________________________________________

*Note: this post also appeared as part of the Mathematics of Planet Earth 2013 daily blog.

Taming Atmospheric Chaos with Big Data, a talk I gave at the 2011 UVM TEDx Conference Big Data, Big Stories:


Filed under mathematics, physics, prediction

If you’re happy and we know it … are your friends?

Do your friends influence your behavior?  Of course they do.  But it’s hard to actually measure their influence.  Social contagion is difficult to distinguish from homophily, the tendency we have to seek relationships with people like ourselves.

In response to the “happiness is contagious” phenomenon promoted by Nicholas Christakis and James Fowler, we here at onehappybird were wondering whether happy Twitter users were more likely to be connected to each other.  In other words, is happiness assortative in the Twitter social network?  (See related work here.)

In the image below, each circle represents a person in the social network of the center node.  We color nodes by the happiness of their tweets during a single week.  Pink colors are happier, gray colors are sadder, and nodes depicted with the color black did not meet our thresholding criteria (50 labMT words).

We established a friendship link between two users if they both replied directly to the other at least once during the week.
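As a rough sketch of that rule, the snippet below keeps a link only when replies flow in both directions within the week. The input format, a list of (author, recipient) pairs, and the user names are assumptions for illustration.

```python
def reciprocal_reply_edges(replies):
    """Return undirected friendship links from one week of replies.

    `replies` is an iterable of (author, recipient) pairs. A link is kept
    only if each user replied to the other at least once.
    """
    directed = {(a, b) for a, b in replies if a != b}
    return {tuple(sorted((a, b))) for a, b in directed if (b, a) in directed}


# Hypothetical week of replies: ann and bob talk to each other, as do
# cat and dan, but ann's reply to cat is never returned.
week = [
    ("ann", "bob"),
    ("bob", "ann"),
    ("ann", "cat"),
    ("cat", "dan"),
    ("dan", "cat"),
    ("dan", "cat"),
]
print(sorted(reciprocal_reply_edges(week)))
```

Requiring a reply in each direction filters out one-sided mentions, such as fans replying to celebrities who never answer.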

As users are added to this network, it quickly becomes difficult to tell whether pink nodes are disproportionately connected to each other, so instead we look at the correlation of their happiness scores.  The plot below shows the Spearman correlation coefficient of the happiness ranks for roughly 100,000 people, with blue squares and green diamonds indicating different word thresholds, and red circles representing the same network but with randomly shuffled happiness scores.

The larger correlation for friends indicates that happy users are likely to be connected to each other, as are sad users. Moving further away from one’s local social neighborhood to friends of friends, and friends of friends of friends, the strength of assortativity decreases as expected.
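Here is a toy version of that comparison: a Spearman correlation over friend pairs, alongside the same calculation after randomly shuffling the happiness scores among users. The happiness values and the tiny network are invented, each link is counted in both directions (an assumption of this sketch), and the rank correlation is written out by hand so the example needs nothing beyond the standard library.

```python
import random


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def friend_correlation(happiness, edges):
    """Correlate each user's happiness with that of their friends."""
    pairs = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
    return spearman([happiness[a] for a, _ in pairs],
                    [happiness[b] for _, b in pairs])


def shuffled_correlation(happiness, edges, seed=0):
    """Same network, but with happiness scores shuffled among users."""
    users = list(happiness)
    scores = [happiness[u] for u in users]
    random.Random(seed).shuffle(scores)
    return friend_correlation(dict(zip(users, scores)), edges)


# Made-up happiness scores and friendship links.
happiness = {"a": 6.5, "b": 6.4, "c": 5.1, "d": 5.0, "e": 5.8, "f": 5.9}
edges = [("a", "b"), ("c", "d"), ("e", "f")]

print("friends: ", friend_correlation(happiness, edges))
print("shuffled:", shuffled_correlation(happiness, edges))
```

In a real analysis the shuffle would be repeated many times to see how large a correlation the network structure alone tends to produce.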

We also looked at the average happiness of users as a function of their number of friends (degree k). Happiness increases gradually with popularity, with large degree nodes demonstrating a larger average happiness than small degree nodes.

The most popular users used words such as “you,” “thanks,” and “lol” more frequently than small degree nodes, while the latter group used words such as “damn,” “hate,” and “tired” more frequently.  The transition appears to occur near Dunbar’s number (around 150), demonstrating a quantitative difference between personal and professional relationships.

Finally, here we show a visualization of the reciprocal-reply network for the day of October 28, 2008.

The size of the nodes is proportional to their degree, and colors indicate communities detected by Gephi’s community detection algorithm.

For more details, see the publication:

C. A. Bliss, I. M. Kloumann, K. D. Harris, C. M. Danforth, P. S. Dodds.  Twitter Reciprocal Reply Networks Exhibit Assortativity with Respect to Happiness. Journal of Computational Science. 2012. [pdf]

Abstract: Based on nearly 40 million message pairs posted to Twitter between September 2008 and February 2009, we construct and examine the revealed social network structure and dynamics over the time scales of days, weeks, and months. At the level of user behavior, we employ our recently developed hedonometric analysis methods to investigate patterns of sentiment expression. We find users’ average happiness scores to be positively and significantly correlated with those of users one, two, and three links away. We strengthen our analysis by proposing and using a null model to test the effect of network topology on the assortativity of happiness. We also find evidence that more well connected users write happier status updates, with a transition occurring around Dunbar’s number. More generally, our work provides evidence of a social sub-network structure within Twitter and raises several methodological points of interest with regard to social network reconstructions.


Filed under psychology, social phenomena

Chaos in an Experimental Toy Climate

In the 1960s, MIT meteorologist Edward Lorenz was investigating the effects of nonlinearity on short-term weather prediction in a model of convection. In his ground-breaking paper “Deterministic Nonperiodic Flow,” Lorenz showed that numerical solutions of the model exhibit sensitive dependence on their initial conditions, leading virtually indistinguishable states to diverge quickly. This phenomenon, which became known as chaos, is a major contributor to inaccuracies in weather and climate forecasts.

The thermal convection loop is an experimental analog of Lorenz’s system in the form of a hula-hoop shaped tube, filled with fluid, and oriented vertically like a wheel. The bottom half of the tube is warmed uniformly by a bath of hot water and the top half is cooled. Under certain conditions, a steady state is never reached, and the fluid switches direction in an unpredictable pattern.

In the past few years, we have used Computational Fluid Dynamics (CFD) simulations of the loop as a testbed for data assimilation, ensemble forecasting, and model error experiments in weather and climate prediction. Our team is developing algorithms to improve forecasts and uncertainty quantification using this simple but realistic toy climate. Successful techniques are then implemented on more realistic weather and climate models.

Details:

K. D. Harris, E.-H. Ridouane, D. L. Hitt, C. M. Danforth. 2012. Predicting Flow Reversals in Chaotic Natural Convection using Data Assimilation. Tellus A, 64, 17598. [pdf]

N. Allgaier, K. D. Harris, C. M. Danforth. 2012. Empirical Correction of a Toy Climate Model. Physical Review E. 85, 026201. [pdf]

R. Lieb-Lappen, C. M. Danforth. 2012. Aggressive Shadowing of a Low-Dimensional Model of Atmospheric Dynamics. Physica D. Volume 241, Issue 6, Pages 637–648. [pdf]

E.-H. Ridouane, C. M. Danforth, D. L. Hitt. 2009. A 2-D Numerical Study Of Chaotic Flow In A Natural Convection Loop. International Journal of Heat and Mass Transfer. [pdf]

and a lecture on the topic given by Danforth to the Applied Dynamics graduate course at UNC Chapel Hill:

Funding for the project comes from NASA and NSF through the Mathematics and Climate Research Network.


Filed under physics

Hedonometrics

Our paper “Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter” appears in PLoS ONE this week. Their blog encourages you to tweet for the sake of science!

Among other findings, in this paper we demonstrate that human ratings of the happiness of an individual word correlate very strongly with the average happiness of the words that co-occur with it. This implies that tweets containing particular keywords can be used as an unsolicited public opinion poll.

For example, tweets containing “Tiger Woods” became decidedly less positive after his Thanksgiving disaster in 2009, as the words ‘accident’, ‘crash’, ‘scandal’, and ‘cheating’ became more abundant while the word ‘love’ appeared less often.

Happiness is measured relative to the ambient background of all tweets.

Sad words are blue, happy words are yellow. Up (down) arrow indicates that the word appeared more (less) frequently in tweets containing "Tiger Woods".

Generally, tweets containing personal pronouns tell a positive prosocial story with ‘our’ and ‘you’ outranking ‘I’ and ‘me’ in happiness. The least happy pronoun on our list is the easily demonized ‘they’.

Emoticons in increasing order of happiness are ‘:(’, ‘:-(’, ‘;-)’, ‘;)’, ‘:-)’, and ‘:)’. In terms of increasing information content (diversity of words co-occurring with each emoticon), the order is ‘:(’, ‘:-(’, ‘:)’, ‘:-)’, ‘;)’, and ‘;-)’. We see that happy emoticons co-occur with words of higher levels of both happiness and information but the ordering changes in a way that appears to reflect a richness associated with cheekiness and mischief: the two emoticons involving semi-colon winks are third and fourth in terms of happiness but first and second for information.

A list of the happiness ratings of tweets containing some interesting keywords can be seen here.

And not surprisingly, the happiness of all tweets appearing on a given day of the week correlates well with the happiness ratings humans give each day.

Happiness of tweets appearing on a given day

Human ratings of the happiness of each day of the week

You can download the language assessment by Mechanical Turk (labMT 1.0) word list here. It is a text file containing the set of 10,222 most frequently occurring words in the New York Times, Google Books, music lyrics, and tweets, as well as their average happiness evaluations according to users on Mechanical Turk.  See the paper for details.
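To give a feel for how a word list like this gets used, here is a bare-bones sketch that averages the happiness of the scored words in a piece of text, weighted by how often each word appears. The handful of scores below are invented placeholders rather than values from labMT, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter

# Placeholder scores for illustration only; not taken from the labMT list.
HYPOTHETICAL_SCORES = {
    "love": 8.4,
    "happy": 8.3,
    "the": 5.0,
    "crash": 2.6,
    "accident": 2.1,
}


def average_happiness(text, scores):
    """Frequency-weighted mean score of the scored words in `text`.

    Returns None when the text contains no scored words.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * count for word, count in counts.items()) / total


print(average_happiness("Love, love, love the weekend", HYPOTHETICAL_SCORES))
print(average_happiness("What a crash. What an accident.", HYPOTHETICAL_SCORES))
```

Words missing from the list are simply skipped, so a text with no scored words has no happiness value at all rather than a neutral one.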

Much more to come regarding sociotechnical phenomena…


Filed under social phenomena

The Happiest Distribution

Do you laugh within your tweets? e.g. hahaha!!! Here we show the number of times these different laugh species appear in tweets as a function of how many ha’s they contain. A few observations:

  1. Longer laughs are less frequent, and on logarithmic axes the frequency falls off along a straight line. The black line has a slope of -5 and appears to match the data over at least 5 decades in frequency… Zipf would be proud of the people: Hahaha power law?
  2. ha is less frequent than haha but slightly more frequent than hahaha.
  3. Only a select few humans are able to make it out beyond 100 letters without a typo.  Congratulations!
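If you want to count laugh species in your own pile of tweets, something like the sketch below would do it: find each run of "ha", record how many times it repeats, and fit a straight line to the counts on log-log axes. The regular expression and the least-squares fit are assumptions for this illustration rather than the method used for the figure, and the sample tweets are made up.

```python
import math
import re
from collections import Counter

# A laugh is a whole word made only of repeated "ha" (case-insensitive).
LAUGH = re.compile(r"\b(?:ha)+\b", re.IGNORECASE)


def count_laughs(tweets):
    """Map number of ha's -> number of laughs of that length."""
    counts = Counter()
    for tweet in tweets:
        for laugh in LAUGH.findall(tweet):
            counts[len(laugh) // 2] += 1
    return counts


def loglog_slope(counts):
    """Least-squares slope of log10(frequency) against log10(length)."""
    xs = [math.log10(length) for length in counts]
    ys = [math.log10(counts[length]) for length in counts]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denominator = sum((x - mx) ** 2 for x in xs)
    return numerator / denominator


tweets = ["haha that is funny", "Hahaha!!!", "ha", "haha"]
print(dict(count_laughs(tweets)))

# Synthetic counts that follow an exact power law with exponent -5.
synthetic = {n: 1e6 * n ** -5 for n in range(1, 11)}
print(loglog_slope(synthetic))
```

The word boundaries keep ordinary words like "that" from being counted as laughs.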

Thanks to one of our students, Tyler Gray, for sorting this all out.


Filed under social phenomena

Happy and we know it

Science Magazine published a piece today framing Twitter as a laboratory for research, Social Scientists Wade Into The Tweet Stream, including the above figure showing our hedonometer’s measure of happiness in 2011 as a function of day. Dodds was also interviewed by Science for their weekly podcast, and by Benedict Carey for a New York Times piece, Happy and You Know It? So Are Millions on Twitter.


Filed under social phenomena

Tweet Cartography

Six months of geo-located messages from Twitter’s gardenhose feed, roughly 20 million.  World, US, and NYC twitterific projections.

PDF versions available here. Made possible by data ninjas Kameron Harris and Morgan Frank.


Filed under social phenomena