Understanding Media Markets in the Digital Age: Economics and Methodology

January 3, 2014

This post is about a recent chapter I co-authored for the NBER volume on the Economics of Digitization. 

Viewpoints on topics such as filesharing, copyright enforcement, and digital distribution strategies can be quite polarized.  Further complicating the picture, empirical “research” appears to support both sides: some studies suggest that piracy harms sales of artistic works, while others seem to suggest that piracy may benefit those sales.  But not all “research” is created equal.  In our recent chapter for the NBER “Economics of Digitization” book, we attempt to explain the difference between high-quality research that can support its claims and lower-quality research that uses real data but makes claims those data do not support.  In this chapter, we take an agnostic stance on the copyright debate, but we set forth a primer on how good research in this area should be conducted.

Correlation and Causation

Our central tenet is one that few people in the debate will disagree with: correlation is not causation.  The key to the debate over many copyright and digitization issues in media lies in determining the causal effect of one thing on another, such as the causal effect of piracy on sales of media.  Simplistic analysis of data usually produces only correlations.  For example, consider a claim put forth in some research: “music pirates purchase more music than non-pirates.”  Often this claim is offered as evidence that piracy is not harmful, or even that it helps music sales.  But consider the following.  In a pre-piracy world, some people bought music CDs and some did not.  Many in the second group were simply not interested in acquiring music; they weren't that into it, or weren't into owning it.  Now imagine that Napster suddenly introduces filesharing/piracy to the world.  Who do you think will start acquiring music on Napster?  The people who liked music enough to buy CDs, or the people who never showed any interest in acquiring it before?  While some of the latter group may be attracted by Napster's zero price, the people who actually like music are clearly more likely to start pirating than the people who never liked music enough to acquire any.  So we will now observe that the people who pirate more (the former group) also purchase more, because they were always bigger purchasers than the latter group, even before piracy existed.  In other words, piracy will be positively correlated with purchases across individuals.  But this does not mean that piracy helps sales.

The question you want to answer is this: if piracy had not come into existence, how much would people be buying/paying for music, and how does that compare to how much they actually buy?  The difference between what they actually buy/pay and what they would buy if piracy were unavailable (the counterfactual) is the causal effect piracy has had on sales.  This is a more complicated question, but it is the one that needs to be answered to understand the effects of the erosion of copyright and to discuss what copyright policy should look like in the digital era.
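
To make the thought experiment concrete, here is a minimal simulation sketch.  Everything in it is an invented assumption for illustration (the taste scale, the rule that music lovers are likelier to pirate, and the 20% displacement of purchases); none of it comes from the chapter or from real data.  It shows that pirates can buy more than non-pirates on average even when piracy reduces every pirate's purchases.

```python
import random


def simulate(n=10_000, seed=0):
    """Simulate buyers whose taste for music drives both buying and pirating."""
    rng = random.Random(seed)
    pirate_sales, non_pirate_sales, effects = [], [], []
    for _ in range(n):
        taste = rng.random()            # 0 = indifferent, 1 = loves music
        no_piracy_sales = 10 * taste    # counterfactual purchases without piracy
        pirates = rng.random() < taste  # music lovers are likelier to pirate
        # Assumption: piracy displaces 20% of a pirate's purchases.
        actual_sales = no_piracy_sales * (0.8 if pirates else 1.0)
        effects.append(actual_sales - no_piracy_sales)
        (pirate_sales if pirates else non_pirate_sales).append(actual_sales)
    return {
        "pirate_mean": sum(pirate_sales) / len(pirate_sales),
        "non_pirate_mean": sum(non_pirate_sales) / len(non_pirate_sales),
        "avg_causal_effect": sum(effects) / n,
    }


result = simulate()
print(result)
```

In this toy world the pirates' average purchases exceed the non-pirates', yet the average causal effect of piracy is negative by construction.  The cross-sectional comparison and the counterfactual comparison answer different questions.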

Good Methodology vs. Bad

In our chapter, we discuss different methodologies that can be used to make causal inferences from data and how these methodologies (long used in the field of economics) can be applied to experiments that have occurred in digital media.  For example, we show that after the shutdown of Megaupload, a popular filesharing cyberlocker, digital movie sales actually decreased across the world.  However, this is because the shutdown occurred in January of 2012, and sales decrease every January as they come down from Christmas highs.  As an example of a methodology called “difference-in-difference”, we discuss one of our research papers, which shows that in countries where Megaupload was more widely used, the post-Christmas sales decline was less steep than in countries where Megaupload was not popular.  This indicates that the natural seasonal decline in sales was smaller in 2012 as a result of the shutdown of the piracy site.  We show how the combination of this fact and other evidence in the data (for example, the fact that the countries in our data all had similar digital movie sales trends before the shutdown) leads to the reasonable interpretation that the shutdown of Megaupload actually caused digital movie revenues to be 7-9% higher than they would have been without the shutdown.  A simple report that digital sales were lower in the two weeks after the shutdown than in the two weeks before would be very misleading because of the natural seasonal trends.  Instead, careful research and application of methodology for causal inference were required to uncover the true impact of the Megaupload shutdown on digital movie revenues.
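
The arithmetic behind a difference-in-difference comparison can be sketched in a few lines.  This is a simplified two-group version with made-up sales indices, not the study's data or its actual specification; the function and variable names are hypothetical.  It assumes that, absent the shutdown, both groups of countries would have followed the same seasonal trend.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical sales indices (pre-shutdown = 100); NOT the study's data.
high_use_pre, high_use_post = 100.0, 90.0   # countries where the site was popular
low_use_pre, low_use_post = 100.0, 85.0     # countries where it was little used

naive_change = high_use_post - high_use_pre
did_estimate = diff_in_diff(high_use_pre, high_use_post, low_use_pre, low_use_post)

print(naive_change)   # negative: sales fell after the shutdown
print(did_estimate)   # positive: they fell less than the seasonal baseline
```

The naive before/after change is negative, while the difference-in-difference estimate is positive, because the comparison group supplies the seasonal decline that would have happened anyway.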

In the NBER chapter, we provide several examples of our own prior research that isolates causal effects from correlations in the data.  Our research combines econometric methodology with “natural experiments” (discrete shocks to filesharing or to legal digital availability) to make our causal claims.  In addition to our Megaupload study, we discuss our paper that isolates the causal impact of HADOPI (an anti-filesharing law in France) on digital music sales: HADOPI caused a 23-35% increase in digital music sales.  We also discuss our paper showing that offering television content on the iTunes video store can reduce piracy of that same content without cannibalizing sales of DVD box sets.  Through our application of methodologies for causal inference, we build a case that legal sellers of media can compete with piracy either by making legal offerings more attractive or by making illegal consumption channels (like filesharing) less attractive.


Our hope for our NBER chapter is twofold.  First, we hope that it will spur additional research in digital media that is grounded in valid methodology and that focuses on teasing causality out of correlations in data.  Such research is essential to guide the distribution strategies of firms as well as government copyright policy in the digital era.  Second, we hope that readers of research on digital media and copyright can use our chapter to better understand why studies in digital media may seem to be in conflict.  Our chapter can serve as a short text on the difference between good research that can support its claims (through randomized or natural experiments) and lower-quality research that uses real data but makes causal claims those data do not support.


