A new study by Anita Elberse, published in the Harvard Business Review, raises questions about the validity of Chris Anderson's Long Tail theory. If you're related to Rip Van Winkle, the Long Tail theory suggests that the dramatically lower distribution costs for media (such as music and movies) enabled by the internet have the potential to reshape the demand curve for media. Traditionally, these businesses have been hits-driven, with the majority of revenue and profits being attributable to a small number of items (the hits). Anderson argues that the internet's ability to serve niches cost-effectively increases the demand for items further down the "tail" of the demand curve, making the aggregate demand for the tail comparable to that for the head.
Anderson's insight resonated instantly with the digerati. It is said that Helen of Troy's face launched a thousand ships; the Long Tail theory certainly launched more than a thousand startups, all with an obligatory Long Tail slide in their investor pitches. Recently, however, there has been a creeping suspicion that the data don't support the theory; the backlash has been spearheaded, among others, by Lee Gomes of the Wall Street Journal. In her piece, Anita Elberse does a deep dive into the data and concludes that the Long Tail theory is flawed.
Anderson has posted a rebuttal on his blog, pointing out a problem with Elberse's analysis: defining the head and tail in percentage terms. There is some truth to Anderson's rebuttal. But the heart of Elberse's criticism lies not in the definition of the head and the tail. It's in using McPhee's theory of exposure to conclude that positive feedback effects reinforce the popularity of hits, while dooming items in the tail to perpetual obscurity. She presents data from Quickflix, an Australian movie rentals service, showing that movies in the tail are rated on average lower than movies in the head. Thus, movies in the tail are destined to remain in the tail. Elberse exhorts media executives to concentrate their resources on backing a small set of potential blockbusters, rather than fritter them away on niches.
The big problem with this argument is that it conflates cause and effect. Before the internet, distribution was expensive, and there was no way for consumers to provide instant feedback on products. Consumers then got little choice in the matter of what items were readily available and what items were hard to find. Thus, the hits were picked by a few studio executives, publishers, or record producers who "greenlighted" projects they thought had hit potential. But when distribution is cheap, and consumer feedback loops are in place, the items that a lot of consumers like become popular and move into the head. It's not that items in the tail are inherently rated lower; items are in the tail precisely because they are rated lower.
It's as if we're comparing two systems of government, a hereditary aristocracy and a democracy, by comparing the sizes of the ruling elite in the two cases. That misses the point entirely. What matters is not the size of the ruling elite, it's how they got there. So, the big change wrought by the internet is not so much to change the shape of the demand curve for media products, as Anderson claims; nor has there been no change whatsoever, as Elberse posits. The big change is not in what fraction of the demand is in the head, it's in how the items that are in the head got there in the first place. Any change in the shape of the curve itself is incidental.
There's another market where we are seeing this phenomenon play out: the market for Facebook (and MySpace) apps. In earlier years, it took a lot of capital to get a company off the ground. The companies that got funded were the ones with good business plans who could convince VCs to take the plunge based on the people, the plan, and potentially some intellectual property. But it doesn't take much capital to write a Facebook app, leading to a proliferation of them. This paves the way for the expected inversion. Facebook users don't use the apps that VCs fund. Instead, Facebook users decide which apps they like, and VCs fund the ones, such as Slide and RockYou, that gain popularity.
It is instructive to look at the Facebook app trends study published by Roger Magoulas and Ben Lorica at O'Reilly Research. The study shows that at last count, there were close to 30,000 Facebook apps. Usage, however, is highly concentrated among the top few apps, a classic example of a hits-driven industry (see graph) -- no long tail. However, these hits have been produced by the collective action of millions of Facebook users, rather than by a small set of savvy media executives. And there's a lot of churn: new applications join the winners and old winners die and are buried in the tail.
The real Long Tail created by the internet is not the long tail of consumption, but the long tail of influence. Earlier, the ability to influence the decisions on who the winners and losers were rested with a few media executives. Now every social network user has some potential influence, however small, on the result. The long tail of influence, combined with instant feedback loops, leads to a short tail of consumption. The Facebook app market is a leading indicator of the path the entire media industry will take in years to come.
Update: Chris Anderson has posted a rebuttal in the Comments. Thanks Chris! Please do read his comment and my response. Chris points out that Facebook apps still follow a power law distribution. It doesn't matter how long the tail is, what matters is how heavy it is. The area under the long tail is a function of both length and depth, and depends crucially on the power law exponent. For the mathematically minded, the details are here.
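To make the dependence on the exponent concrete, here is a minimal sketch. It assumes a pure Zipf-style model in which the item at rank r has demand proportional to r^(-exponent); the function name head_share, the 1% cutoff for the "head", and the three exponents are illustrative choices, and the 30,000 item count simply echoes the app count mentioned above. None of this uses the actual Facebook usage data.

```python
def head_share(num_items, exponent, head_fraction=0.01):
    """Share of total demand captured by the top `head_fraction` of items,
    assuming the item at rank r has demand proportional to r ** -exponent."""
    demand = [rank ** -exponent for rank in range(1, num_items + 1)]
    head_size = max(1, int(num_items * head_fraction))
    return sum(demand[:head_size]) / sum(demand)


# Hypothetical exponents; the catalog size mirrors the ~30,000 apps cited above.
for exponent in (0.5, 1.0, 2.0):
    share = head_share(30_000, exponent)
    print(f"exponent {exponent}: top 1% of items capture {share:.1%} of demand")
```

Under this model the same catalog, with the same long tail, is tail-dominated for a small exponent and almost entirely hits-dominated for a large one, which is the sense in which the tail's weight rather than its length is what matters.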
Sigh. I was with you right up until you introduced the Facebook data. A Long Tail is a powerlaw distribution, which looks exactly like what you've shown. All powerlaws have a huge drop-off like that--but the tail being long (get it?) the area under what appears to be almost nothing adds up to a lot. The only way you can tell whether it really does conform to the theory or not is to plot it log-log and see if it's a straight line.
Posted by: Chris Anderson | July 10, 2008 at 08:02 AM
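The log-log check described in the comment above can be sketched roughly as follows. The helper loglog_fit is a hypothetical name, and both data series are synthetic (they are not the Facebook or Quickflix numbers). A straight-line fit on log-log axes is only a rough diagnostic, so treat the result as a hint rather than a proof.

```python
import math


def loglog_fit(values):
    """Least-squares line through (log rank, log value).

    `values` must be positive and sorted in descending order.
    Returns (slope, r_squared) of the fitted line.
    """
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Synthetic rank/usage curves, for illustration only.
power_law = [1000 * rank ** -1.5 for rank in range(1, 501)]
exponential = [1000 * math.exp(-0.02 * rank) for rank in range(1, 501)]

print("power law:  ", loglog_fit(power_law))
print("exponential:", loglog_fit(exponential))
```

For the synthetic power law the fitted slope recovers the exponent and the fit is essentially perfect; the exponential series bends on log-log axes, so its r_squared comes out noticeably lower.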
Chris, not all power laws lead to heavy tails. Depending on the exponent of the power law, you can get either the classic hits-dominated model, or a heavy tail. The Facebook data does show that we have a hits-dominated marketplace.
That said, I think you were on to something in that the internet transforms media. It's just not in the narrow way you describe in your book. Time for a new edition?
Posted by: Anand Rajaraman | July 10, 2008 at 08:10 AM
My understanding has been that all power laws are heavy-tailed. That understanding seems to jibe with Wikipedia definitions: a power law has pdf p(x) ~ ax^k; a dist is heavy-tailed if \lim_{x \to \infty} e^{\lambda x} p(x) = \infty for all \lambda > 0. I don't see how a power law can outweigh an exponential in the limit as x approaches infinity. Am I missing something?
http://en.wikipedia.org/wiki/Power_law
http://en.wikipedia.org/wiki/Heavy-tailed_distribution
Posted by: Jason Rennie | July 10, 2008 at 02:36 PM
@Jason
Yes, as long as a function is sub-exponential, it results in a 'heavy tail'. That's the mathematical definition.
Posted by: D. Liu | July 11, 2008 at 09:41 AM
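For readers who want the limit from the two comments above worked out, here is a short derivation. It assumes a decaying power-law density written as p(x) = a x^{-k} with a > 0 and k > 0 (the sign convention is an assumption made here for clarity; the comment writes the exponent as k).

```latex
\[
  \lim_{x \to \infty} e^{\lambda x}\, a x^{-k}
  = \lim_{x \to \infty} a \exp\bigl(\lambda x - k \ln x\bigr)
  = \infty
  \quad \text{for every } \lambda > 0,
\]
% because \lambda x grows linearly while k \ln x grows only logarithmically,
% so \lambda x - k \ln x \to \infty regardless of the value of k.
```

So every power law meets the formal heavy-tail definition regardless of k; the definition alone says nothing about how much of the total sits in the tail.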
Jason, D. Liu: Yes, you guys are right as far as the mathematical definition is concerned. But the actual heaviness of the tail, in terms of practical consequences, depends on the exponent. Details at this page:
http://anand.typepad.com/datawocky/not-all-powerlaws-have-lo.html
Posted by: Anand Rajaraman | July 11, 2008 at 01:54 PM
Anand, this is exactly the thing I wanted to say in the comment feed on Anita's article.
http://conversationstarter.hbsp.com/2008/07/the_long_tail_debate_a_respons.html#c026062
Has there been any analysis made on the rate of change of the hits in any specific industry? My gut feeling (based on Lily Allen, Kate Nash, every deep house producer you can think of, etc.) is that in the music industry the rate of change is significantly higher now than before.
Posted by: Johan | July 22, 2008 at 05:43 AM
Chris is of course correct, and that is part of the reason for my frustration with lots of long tail discussions. There's a simple and straightforward way to see whether data fits a long tail description or not, and yet people choose to ignore it most of the time, turning to arbitrary hand-waving arguments with a few numbers and percentages thrown in.
I am especially annoyed by this being the case in Elberse's paper, but I managed to find sufficient data for partial analysis in one part of her paper. I was hoping to find that her data fits a long-tailed distribution but found out that an exponential one is a much better fit.
This makes me even more puzzled.
Anyway, you can find it here:
http://longtailanalysis.blogspot.com/2008/07/shes-got-point-or-three.html
Posted by: Shahar | July 23, 2008 at 02:12 AM