The Null Device blog: Analysis of Facebook updates

Following in the footsteps of OKCupid's data-mining blog, some people at Facebook have recently analysed a sample of status updates by word category, extracting correlations between word categories (as well as overall subject matter and positivity/negativity), time of day and probability of updates being liked/commented on. The analysis has shown, among other things:

There are correlations between word categories and age: older people use more first-person plurals, positive emotions and references to religion and family, while young people tend to talk in the first-person singular (presumably adolescent alienation?), mention sadness and death, swear a lot and talk about sex, music and TV.
People with more friends talk more about social processes and other people, and have higher total word counts. Talking about home, family and emotions is correlated with having fewer friends, though the categories most strongly correlated with having fewer friends are time and the past.
Positive emotions are one of the most likely categories to be liked, but least likely to attract comments. Negative emotions, however, attract a lot of comments (presumably from the people posting empathetic "Don't Like" messages).
The one thing less likeable than negative emotions is talk about sleeping.
People who talk about metaphysical or religious subjects are the most likely to be friends with one another. And people who use prepositions a lot tend not to be friends with people who swear a lot or exhibit anger or negative emotions.
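
For the curious, the kind of analysis summarised above can be roughly sketched as follows. This is only a minimal illustration, not Facebook's actual method: the tiny word lists, the sample updates and the like counts are all made up, and the helper names (`category_rates`, `pearson`, `correlate`) are hypothetical.

```python
import math
import re

# Hypothetical mini-lexicon; the real study's word categories are not reproduced here.
CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine"},
    "first_person_plural": {"we", "us", "our", "ours"},
    "sleep": {"sleep", "sleeping", "tired", "bed", "nap"},
}

def category_rates(text):
    """Return, per category, the fraction of words in `text` that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty updates
    return {name: sum(w in vocab for w in words) / total
            for name, vocab in CATEGORIES.items()}

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Made-up sample data: (status update, number of likes).
SAMPLE = [
    ("We had a great day with our family", 12),
    ("I am so tired, going to bed", 1),
    ("I need sleep", 0),
    ("We love our new home", 9),
]

def correlate(category, sample=SAMPLE):
    """Correlate how often a category appears in each update with its like count."""
    rates = [category_rates(text)[category] for text, _ in sample]
    likes = [count for _, count in sample]
    return pearson(rates, likes)
```

Calling `correlate("sleep")` on this toy sample gives a negative value and `correlate("first_person_plural")` a positive one, but with four invented updates that says nothing about real users; it only shows the shape of the calculation.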

There are 3 comments on "Analysis of Facebook updates":

Posted by: Greg Thu Dec 30 02:27:35 2010

What makes this research interesting/problematic (depending on point of view) is the heavy degree of impression-management that occurs within sites like FB.

Actions available in FB are given natural-language names that are no more than metaphors - and it's easy to forget they're metaphors.

So for example a "friend" isn't really a friend, though they can be. "Friend" really means "fellow FB user whom one has chosen to include in their contacts list, and who has agreed to be listed". "Like" doesn't really mean like, though it can. If person X "likes" posting Y, it merely means they have chosen to say so publicly, and this needn't bear any relation to what they actually like.

Metaphor is used in software all the time - we use variables and user-interface elements that are abstractions of real-world things. With enough usage the metaphor is sometimes forgotten: "file" and "desktop" are examples.

I suspect FB want us to "forget the metaphor" and take their use of words like "friend" and "like" at face value.

Posted by: Greg Thu Dec 30 02:36:40 2010

... So, in research like the above, we can't assume that users with more or fewer "friends" actually have more or fewer friends, or that kinds of posts that are "liked" more or less than others are actually liked more or less.

Users from under-represented demographics (e.g. oldies and people from some countries) are likely to have fewer FB "friends", though they might have plenty of friends who aren't on FB. Groups that use FB less will have views that are under-represented within the FB corpus of postings.

We can't even take the demographic data supplied by users too seriously. There are plenty of FB profiles that have the wrong age or religion, plenty of people who have more than one profile, and so on. The things one can choose to "like" are limited. A person's FB activities needn't bear any resemblance to offline reality.

All we can take at face value is comment-counts and the like. Statistics derived from them can be interesting, but it would be a mistake to attempt to derive psychological insights from them.

Posted by: Greg Thu Dec 30 03:07:25 2010

What would be interesting IMO would be to (somehow) measure how close Facebook representations are to offline reality, and then compare this distance with that of other online social systems.

Facebook encourages (demands in some respects) that users' online representations match their offline properties. (FB haven't quite mandated that *actions* match.)

Contrast this with Second Life, which demands that online representations be fictional. For example, SL users may not use their real names, and there is a user culture of never linking SL and real identities. Yet research has shown that avatar appearances in SL tend to converge to idealizations of their users' offline characteristics (http://www2.parc.com/csl/members/nicolas/documents/CHI2009-Avatars.pdf).

I'd hypothesize that users' representations of themselves in FB are likewise idealizations of their offline selves, and that the offline-online distance may be greater than we realize, perhaps not even radically different from SL's.