Tag Archives: maps

A data-driven study of the patterns of life for 180,000 people

Here at the Computational Story Lab, some of us commute by foot, some by car, and a few deliver themselves by bike, even in the middle of our cold, snowful Vermont winter.  Occasionally, we transport ourselves over very long distances in magic flying tubes with wings to attend conferences, to see family, or for travel.  So what do our movement patterns look like over time?  Are there distinct kinds of movement patterns as we look across populations, or are they variations on a single theme?

Inspired by an analysis of mobile phone data by Marta Gonzalez at MIT, James Bagrow at Northwestern, and colleagues, we used 37 million geotagged tweets to characterize the movement patterns of 180,000 people during their 2011 travels. We used the standard deviation in their position, a.k.a. radius of gyration, as a reflection of their movement. As an example, below we plot a dot for each geotagged tweet we found posted in the San Francisco Bay area, colored by the author’s radius of gyration.

The Bay Area is shown with a dot for each tweet, colored by the radius of gyration of its author. The color scale is logarithmic, so we can compare people with very different habits.

You can see from the picture that many people with a radius near 100 km tweet from downtown San Francisco. This pattern could reflect a concentration of tourists visiting the area, or individuals who live downtown and travel for work or pleasure. Images for New York City, Chicago, and Los Angeles are also quite beautiful.
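For readers who want to try this at home, here is a minimal sketch of a radius of gyration calculation: the root-mean-square distance of a person's tweet locations from their average location. The flat-map projection, the Earth radius constant, and the function names are assumptions of this sketch, not necessarily what we used in the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def to_local_km(points):
    """Project (lat, lon) pairs in degrees onto a flat x/y plane in km.

    Uses a simple equirectangular projection about the mean position,
    which is only reasonable when the points are not spread too widely.
    """
    lat0 = sum(lat for lat, _ in points) / len(points)
    lon0 = sum(lon for _, lon in points) / len(points)
    cos_lat0 = math.cos(math.radians(lat0))
    return [
        (EARTH_RADIUS_KM * math.radians(lon - lon0) * cos_lat0,
         EARTH_RADIUS_KM * math.radians(lat - lat0))
        for lat, lon in points
    ]


def radius_of_gyration(points):
    """Root-mean-square distance (km) of the points from their mean position."""
    xy = to_local_km(points)  # already centered on the mean position
    return math.sqrt(sum(x * x + y * y for x, y in xy) / len(xy))


# Two tweets on the equator, two degrees of longitude apart.
print(radius_of_gyration([(0.0, -1.0), (0.0, 1.0)]))
```

Someone who only ever tweets from one spot gets a radius of zero, while a frequent flyer's radius is dominated by their farthest trips, which is why a logarithmic color scale helps.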

In the image below, we rotated every individual’s movement pattern so that the origin represents their average location, and the horizontal line heading to the left represents their principal axis (most likely the path from home to work). We also stretched or shrunk the vertical and horizontal axes for each individual, so that everyone could fit on the same picture. Basically, we have a heatmap of collective movement, with each individual in their own intrinsic reference frame. The immediate good news for these kinds of data-driven studies is that we see a very similar form to those found for mobile phone data sets. Apart from being a different social signal, geotagged tweets also have much better spatial resolution than mobile phone calls, which are referenced by the nearest cellphone tower.

Movement pattern exhibited by 180,000 individuals in 2011, as inferred from 37 million geolocated tweets. Colormap shows the probability density in log10. Note that despite the resemblance, this image is neither a nested rainbow horseshoe crab, nor the Mandelbrot set.
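The change of reference frame behind this picture can be sketched roughly as follows: center each person's positions on their average location, rotate so the direction of greatest variance lies along the horizontal axis, then rescale each axis. Dividing by the standard deviation along each axis is an assumption of this sketch, as is the function name; the sketch also does not decide which end of the principal axis points left.

```python
import math


def to_intrinsic_frame(xy):
    """Center, rotate onto the principal axis, and rescale planar (x, y) points.

    Returns the transformed points plus the standard deviations along the
    major and minor axes (in the input units).
    """
    n = len(xy)
    mean_x = sum(x for x, _ in xy) / n
    mean_y = sum(y for _, y in xy) / n
    centered = [(x - mean_x, y - mean_y) for x, y in xy]

    # Entries of the 2x2 covariance matrix of the centered positions.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the direction of greatest variance (the principal axis).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate by -theta so the principal axis becomes the horizontal axis.
    rotated = [(c * x + s * y, -s * x + c * y) for x, y in centered]

    sd_major = math.sqrt(sum(u * u for u, _ in rotated) / n)
    sd_minor = math.sqrt(sum(v * v for _, v in rotated) / n)

    # Rescale each axis (assumed choice: one standard deviation per unit).
    scaled = [
        (u / sd_major if sd_major else 0.0, v / sd_minor if sd_minor else 0.0)
        for u, v in rotated
    ]
    return scaled, sd_major, sd_minor
```

Once everyone is in their own frame like this, the transformed points from all individuals can be binned into a single two-dimensional histogram to produce a heatmap.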

Several features of the map reveal interesting patterns. First, the teardrop shape of the contours demonstrates that people travel predominantly along their principal axis, with deviations becoming shorter and less frequent as they move farther away. Second, the appearance of two spatially distinct yellow regions suggests that people spend the vast majority of their time near two locations. We refer to these locations as the work and home locales, where the home locale is centered on the dark red region right of the origin, and the work locale is centered just left of the origin.

Finally, we see a clear horizontal asymmetry: movement around the home locale is more isotropic than movement around the work locale. We suspect this to be a reflection of the tendency to be more familiar with the surroundings of one’s home, and to explore these surroundings in a more social context. The up-down symmetry demonstrates the remarkable consistency of the movement patterns revealed by the data.

We see a clear separation between the most likely and second most likely position.

Looking just at the messages posted along the work-home corridor, the distribution is skewed left: movement away from home in the direction opposite to work is highly unlikely.

The isotropy ratio shows the change in the probability density’s shape as a function of radius.

Above we see that individuals who move around a lot have a much larger variation in their positions along their principal axis, exhibiting a less circular pattern of life than people who stay close to home. Remarkably, the isotropy ratio decays logarithmically with radius.
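One way to put a number on "how circular" a pattern of life is can be sketched as below. We assume here that the isotropy ratio is the standard deviation across the principal axis divided by the standard deviation along it, so a value of 1 means a circular pattern and values near 0 mean a needle-like one. That definition and the function name are assumptions of this sketch; see the paper for the precise definition.

```python
import math


def isotropy_ratio(xy):
    """Ratio of minor-axis to major-axis standard deviation for planar points.

    Computed from the eigenvalues of the 2x2 covariance matrix, which are
    the variances along the principal (major) and perpendicular (minor) axes.
    """
    n = len(xy)
    mean_x = sum(x for x, _ in xy) / n
    mean_y = sum(y for _, y in xy) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in xy) / n
    syy = sum((y - mean_y) ** 2 for _, y in xy) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in xy) / n

    # Closed-form eigenvalues of [[sxx, sxy], [sxy, syy]].
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    var_major = half_trace + root
    var_minor = max(0.0, half_trace - root)  # guard against rounding below zero

    if var_major == 0.0:
        return float("nan")  # all points coincide; the ratio is undefined
    return math.sqrt(var_minor / var_major)


print(isotropy_ratio([(1, 0), (-1, 0), (0, 1), (0, -1)]))  # circular pattern
print(isotropy_ratio([(2, 0), (-2, 0), (0, 1), (0, -1)]))  # stretched pattern
```

Grouping individuals by their radius of gyration and averaging this ratio within each group would give a curve of isotropy against radius.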

Finally, we grabbed messages from the most prolific tweople, those 300 champions who had posted more than 10,000 geotagged messages in 2011. We received 10% of these messages through our gardenhose feed from Twitter. Below, we plot the times during the week that they post from their most frequently visited location. These folks most likely have the geotag switch on for all messages, and exhibit a very regular routine.

A robust diurnal cycle is observed in the hourly time of day at which statuses are updated, with those from the mode location (black curve) occurring more often than other locations (red curve) in the morning and evening.

Peaks in activity are seen in the morning (8-10am) and evening (10pm-midnight), separated by lulls in the afternoon (2-4pm) and overnight (2-4am) hours. As we and our friend Captain Obvious would expect, people tend to tweet more from their home locale (black curve) than from any other locale (red curve) in the morning and evening.
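The counting behind a plot like this can be sketched in a few lines: find each person's most frequently visited (mode) location, then tally their tweets by hour of day, separately for the mode location and for everywhere else. The input format, a list of (hour, location label) pairs, and the function name are assumptions of this sketch; in practice the location label might be a rounded latitude/longitude cell.

```python
from collections import Counter


def hourly_profile(tweets):
    """Tally tweets per hour of day at the mode location and elsewhere.

    tweets: list of (hour_of_day, location_label) pairs, with hour in 0..23.
    Returns (mode_location, counts_at_mode, counts_elsewhere), where each
    counts list has 24 entries.
    """
    tweets = list(tweets)
    if not tweets:
        raise ValueError("need at least one tweet")

    # The mode location is the label that appears most often.
    mode_location, _ = Counter(loc for _, loc in tweets).most_common(1)[0]

    counts_at_mode = [0] * 24
    counts_elsewhere = [0] * 24
    for hour, loc in tweets:
        if loc == mode_location:
            counts_at_mode[hour] += 1
        else:
            counts_elsewhere[hour] += 1
    return mode_location, counts_at_mode, counts_elsewhere


sample = [(8, "cell_A"), (9, "cell_A"), (14, "cell_B"), (22, "cell_A")]
mode, at_mode, elsewhere = hourly_profile(sample)
print(mode, at_mode[8], elsewhere[14])
```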

Bottom line: Despite our seemingly different patterns of life, we are remarkably similar in the way we move around. Our walks are a far cry from random.

Next up: We’ll examine the emotional content of tweets as a function of distance.  Is home where the heart is?

For more details on these results, see our paper Happiness and the Patterns of Life: A Study of Geolocated Tweets.

Filed under networks, physics, prediction, social phenomena

Where is the happiest city in the USA?

Is Disneyland really the happiest place on Earth?* How happy is the city you live in? We have already seen how the hedonometer can be used to find the happiest street corner in New York City; now it’s time to let it loose on the entire United States.

We plotted over 10 million geotagged tweets from 2011 (all our results are in this paper), coloring each point by the average happiness of nearby words (details on how we calculate happiness can be found in this article published in PLoS ONE):

Image

As well as cities and the roads between them, we can make out many regions of higher and lower happiness, even within individual cities. As an example, check out this tweet-generated map of the city of Chicago:

Tweet-generated map of Chicago. Click to enlarge.

Notice the striking contrast between the relatively happy Central/North Side of the city, and the sadder South Side. You can also find a few airports in this map, and if you look very closely you might even be able to pick out happy and sad terminals!

To quantify this variation in happiness a bit better, let’s look at the average happiness of each state:

Image

Southern states tend to produce sadder words than those in northern New England or out west. Hawaii emerges as the happiest state and Louisiana as the saddest, due to relative differences in the frequencies of happy and sad words used in each state. Here at onehappybird, we characterize such differences by “word shifts”, which are basically word clouds for grown-ups. You can find examples of these, as well as the full list of the average happiness of each state, here (page best viewed using Google Chrome).
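To give a flavor of how relative word frequencies turn into a happiness score, here is a minimal sketch of a frequency-weighted average over scored words. The tiny dictionary of scores below is made up purely for illustration (it is not our word list), and the function name is hypothetical; the PLoS ONE article describes the actual method.

```python
from collections import Counter

# Made-up happiness scores for illustration only; NOT the real word list.
ILLUSTRATIVE_SCORES = {
    "wine": 7.0,
    "cheers": 7.5,
    "restaurant": 7.0,
    "traffic": 3.0,
}


def average_happiness(words, scores=ILLUSTRATIVE_SCORES):
    """Frequency-weighted average score of the words that have a score.

    Words without a score are ignored. Returns None when no scored word
    appears, since the average is undefined in that case.
    """
    counts = Counter(w.lower() for w in words if w.lower() in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * n for word, n in counts.items()) / total


print(average_happiness("Cheers to the wine at this restaurant".split()))
print(average_happiness("Stuck in traffic traffic traffic".split()))
```

Because the score is an average weighted by word counts, two places can differ in happiness simply because one uses a handful of happy or sad words more often than the other, which is exactly what a word shift displays.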

Zooming in further to the level of cities, we produced a similar list for 373 cities in the lower 48 states (you can find the full list, as well as maps and word shifts for each city, here). With a score of 6.25, we found the happiest city to be Napa, CA, due to a relative abundance of such happy words as “restaurant”, “wine”, and even “cheers”, along with a lack of profanity.

Word shift for Napa, CA.

At the other end of the spectrum, we found the saddest city to be Beaumont, TX, with a score of 5.82. In general, cities in the south tended to be less happy than those in the north, with a major contributing factor being the relative abundance of profanity used in those cities.

We can go even further than this, and group cities by similarities in word usage. Each square in the heatmap below represents the similarity (Spearman correlation for you mathematically minded onehappybird watchers) between word distributions for the largest cities in the US. Red squares mean that the corresponding cities use words in a similar fashion, while blue means that those cities tend to use different types of words with respect to each other. The colors in the tree diagram at the top signify clusters of cities exhibiting similar word usage (below a certain threshold).

As we might expect for two cities that are geographically nearby, New Orleans and Baton Rouge are clumped together at the bottom right of the figure. On the other hand, New York and Seattle get clumped together as well, suggesting that similarities in language depend on more than just geographical proximity.

Heatmap and tree diagram of word-usage similarity between the largest US cities.
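For the mathematically minded, the similarity measure can be sketched as a Spearman correlation between two cities' word counts: rank the words in each city by frequency, then take the ordinary (Pearson) correlation of the ranks. Treating words missing from one city as having a count of zero, and the function names, are assumptions of this sketch.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length lists (nan if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def spearman_similarity(counts_a, counts_b):
    """Spearman correlation between two word-count dictionaries."""
    vocab = sorted(set(counts_a) | set(counts_b))
    a = [counts_a.get(word, 0) for word in vocab]  # missing words count as 0
    b = [counts_b.get(word, 0) for word in vocab]
    return pearson(average_ranks(a), average_ranks(b))


city_one = {"coffee": 30, "rain": 20, "beach": 5}
city_two = {"coffee": 12, "rain": 9, "beach": 1}
print(spearman_similarity(city_one, city_two))
```

Computing this for every pair of cities gives a matrix of similarities, which can then be handed to a hierarchical clustering routine to build a tree diagram.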

You can find more information about happiness and cities, as well as details on the methods used to produce these results, in our arXiv research article. In our next post, we’ll look at how these results are related to various underlying socioeconomic characteristics of cities. What makes a city happy or sad? Can we use Big Data to predict future changes in the demographics, health, or happiness of a city? How does happiness relate to the food you eat?

*By the way, to answer the question at the start of this post: According to this analysis Disneyland is not the happiest place on Earth; it isn’t even the happiest place in Southern California! See if you can find it in this tweet-generated map of LA! Or find your city here.

Filed under geohappiness, mathematics, psychology, social phenomena