An Objective Analysis of Piracy Site Blocking

I’ve just returned from my first visit to Taiwan, where I met with representatives from the ROC Government and the local creative industries, and participated in an international conference regarding the continued evolution of online governance.  One of the issues discussed, which seems to be politically sensitive in Taiwan, concerned proposed amendments to Taiwan’s copyright law and specifically a provision for blocking pirate websites.  Apparently this has been considered in the past but has in each instance been met with strong opposition from citizens concerned about the impact it will have on use of the Internet – as a result, the government has been reluctant to act.

Arguments against piracy website blocking seem to mostly take one of two forms:  there are arguments about it violating freedom of speech or potentially “breaking the Internet” (to use a popular phrase), and there are arguments of the form “it won’t have any impact, you can still find ways to pirate any content you want, so why do it?”  I’m not a legal scholar, so I don’t have any special expertise on freedom of speech.  I’m also not an expert on the technical side of the Internet – although I do know that site blocking has been implemented in over 40 countries and in studying site blocking in several of those countries I’ve seen no evidence of a “broken Internet” or even a serious challenge to net neutrality.  But I would leave the arguments about this to those whose expertise lies in the area of Internet policy regulation.

That said, I can offer my expertise on the argument that website blocking won’t have any impact because pirate content can always be found somewhere.  I’ve been doing empirical research on this claim for the better part of a decade now; specifically, I’ve been looking at whether supply side antipiracy policies – enforcement actions that target piracy sites or protocols – can influence consumers to turn from illegal channels to legal ones.  And I’ve been keeping an eye on the research of colleagues asking this same question.  My findings are summarized in a peer-reviewed article published in Communications of the ACM.

Basically, there is no evidence of a supply side antipiracy action that completely removes popular content from the Internet – you can always find it somewhere.  And there is evidence that antipiracy actions that do not make piracy sufficiently difficult have no meaningful impact on legal sales or total piracy, because consumers can either find other sites from which to pirate the content or circumvent the enforcement actions.  These results are found by Aguiar et al., by Poort et al., and by Danaher et al.

However, research also shows that taking actions against piracy sites can sometimes make piracy so inconvenient that a meaningful group of marginal consumers will turn their behavior from illegal sources to legal ones.  For example, Mike Smith and I found that when Megaupload and Megavideo (the two largest piracy cyberlockers in the world) were both shut down, many sites that linked to their content stopped working and many cyberlockers shifted their policies to be less tolerant toward copyright infringement.  The result was a causal increase in revenues from digital movie sales and rentals of about 7.5%.  Later, along with Rahul Telang, we found that even though the UK blocking of The Pirate Bay caused little decrease in total piracy and no increase in legal consumption, the UK blocking of 19 major video piracy sites in November 2013 caused a 12% increase in legal consumption and a large decrease in total piracy.  Later, when 53 more piracy websites were blocked in November 2014, we found a similar effect.

The point is that supply side antipiracy actions fail when they only slightly raise the difficulty of pirating content, but seem to succeed in nudging consumption toward legal channels when they sufficiently raise the cost/inconvenience of pirating content.

Based on my trip, it appears that Taiwan’s creators want to explore piracy site blocking as a potential means of protecting content against piracy and increasing legitimate consumption.  By all means, interested parties should debate what impact this could have on the Internet and/or whether it is legally desirable.  But all of the research that I have undertaken indicates that one cannot argue against piracy website blocking by saying it won’t work – the data clearly indicate that when enough sites are blocked to make piracy sufficiently inconvenient, viewers migrate some of their consumption from illegal channels toward legal, revenue-generating channels.

My hope is that policy makers in Taiwan, and elsewhere, will not base their discussions and decisions solely on passionate rhetoric and political posturing, but will instead consider all sides of the argument – including empirical data – reasonably and objectively.  Creators and Internet users in Taiwan and around the world deserve no less than that.



Content, Geographic Restrictions, VPN’s, and Fairness

BBC has started blocking access to their iPlayer service from anyone trying to connect through a VPN.

To understand why this is controversial and gets at a much larger issue, you need to know a few things about the BBC (besides when they’ll be producing the next season of Sherlock!), about VPN’s, and about geographic restrictions.

The Background:

The BBC is actually a publicly funded entertainment company in that it receives a large amount of money from the British government to produce content.  That money is raised through what is effectively a tax on UK citizens (or at least UK citizens who watch TV content).  Thus, unlike content on NBC, Fox, or HBO, one could consider BBC content as already having been “paid for” by UK citizens once it is produced.  As such, the BBC makes a lot of this content available for free to UK citizens through their iPlayer (particularly recent content).  Viewers from other countries may not access the iPlayer, with the justification that they have not “paid for” the content yet. However, through the company BBC Worldwide, BBC does make content available to people in other countries in a variety of ways, such as licensing it to over-the-air channels, digital streaming services like Netflix, digital download services like iTunes, etc.  Revenues from these deals offset the amount of “tax” that needs to be collected to produce BBC content (I don’t know if it is viewed that way at BBC or not, but from an economic perspective this is effectively the case – if BBC Worldwide were not selling the content abroad, BBC would need more money to produce what it is currently producing).

The Dilemma:

Individuals outside the UK could effectively access free BBC content through the iPlayer if they used a VPN to appear to be connecting from within the UK (not unlike, say, Aussies who accessed US Netflix using a VPN long before it was introduced to Australia… except that said Aussies paid the Netflix subscription fee).  To stop this, it is reported that BBC has blocked connections coming from a VPN.

The Techdirt take on this (linked above) is that this also blocks some UK citizens who want to connect through a VPN from doing so, though BBC is attempting to mitigate this.  In a world (post-Snowden, as Techdirt notes) where people want privacy, this is costly to people.  This is a legitimate point.  Techdirt also implies that BBC is failing to see all of the interest in their content as a business opportunity.  This is less legitimate.  BBC clearly sees this; it is the entire purpose of BBC Worldwide (which makes the content available in a number of formats and has been reasonably aggressive in shortening international delay windows).

The issue is that providing free access to BBC content to non-UK citizens basically allows them to free ride.  UK citizens have paid for the content, others have not.  BBC has a right to monetize their content internationally so as to offset the tax burden on UK citizens in producing the content.

If the argument is that there is a more effective or elegant way to do this than blocking access through VPN’s, I’m all ears and I’m interested – I am no technical expert.  If the argument is that BBC should allow free access to their content to non-UK citizens, I have to disagree.

The overarching point is that geographic restrictions have been one way that companies have either price discriminated or found ways to monetize content by licensing rights to distributors on a country-by-country basis.  In a world where VPN’s are increasingly desirable (for a number of reasons), geographic restrictions will be harder to enforce.  I’m curious if there is a known technical solution to this problem yet.

Piracy – Removing Content vs. Removing Access


Sometimes synthesizing research from several papers allows for insights not contained in any one of those papers.  Recently I’ve asked myself “Why is it that shutting down Megaupload increased digital movie revenues, but blocking access to The Pirate Bay in the UK did not increase legal consumption?  Why did the shutdown of Kino.to cause little to no increase in legal consumption in Germany?  After all, they are all piracy sites.”

The answer, I think, lies not in whether the action was a shutdown or a block, but rather what the antipiracy action in question actually did.  It’s about whether you’ve removed the actual content from the Internet, or removed one channel of access to the content.

Removing Content:

When Megaupload was shut down, the Department of Justice (in coordination with foreign authorities) raided the headquarters and seized the servers that hosted the content.  The next day, all of the content that had been available on Megaupload or Megavideo was gone.  Other sites that linked to that content now contained broken links.  Sure, some of that content also existed on other sites.  But the Megaupload content was gone.  And we saw that it caused digital movie revenues to increase by 6.5-8.5%.

Removing Access:

But when the UK courts ordered ISP’s to block access to The Pirate Bay, any content or links (torrent files) there were not removed from the Internet.  You could still access The Pirate Bay through a VPN.  Other sites could mirror the content on The Pirate Bay and unless ISP’s quickly realized this and also blocked access to those sites, you could access Pirate Bay content through said mirror sites.  And in our study, we found that this was exactly what happened – people found ways to access the content and did not move toward legal channels.

What about when Kino.to was shut down in Germany?  That site was fully shut down so the content was removed from the Internet.  Except that Kino.to never hosted content (or hosted very little) – it was simply a popular linking site that people went to in order to click a link to a copyright-infringing file hosted elsewhere.  When this site was shut down, other sites could and did easily link to the same content.  So the Kino shutdown has more in common with the Pirate Bay block in the UK than it does with the Megaupload shutdown.

What Works?

Is there any way that removing access without seizing content can work?  Maybe.  Our website blocking study also showed that when 19 sites were all blocked within a month, pirates did increase legal consumption meaningfully.  So maybe removing access to one site does not inconvenience pirates as much as removing all Megaupload content from the Internet did, but it seems as if removing access to many sites can create enough inconvenience to matter.

Piracy Website Blocking – Effective or Not?

New Study:

My coauthors and I recently released a paper entitled “The Effect of Piracy Website Blocking on Consumer Behavior.”  In it, we ask whether the strategy of having Internet Service Providers (ISPs) block access to websites  that facilitate distribution of illegal content is effective in increasing traffic to legal sites for that content.

To determine this, we exploit two natural experiments in the UK – the court-ordered blocking of The Pirate Bay in May 2012 and the subsequent court-ordered blocking of 19 major piracy sites in November 2013.  Rather than simply looking at time trends (which are affected by many factors), we ask whether users of the blocked sites (the “treated group”) increase their consumption of legal content more than non-users of the blocked sites (the “control group”, since the blocks had no real impact on them).

Our results are interesting.  We find that blocking The Pirate Bay had no causal impact on legal consumption.  Rather, former Pirate Bay users either found other piracy websites or they turned to technologies that allowed them to circumvent the blocks, such as VPN’s.  This finding is consistent with a recent study by Aguiar, Claussen, and Peukert, who found that shutting down a single piracy linking site in Germany did not meaningfully increase legal consumption there for the same reason.

In contrast, we find that the near simultaneous blocking of 19 major piracy sites in November 2013 caused users of those sites to increase their usage of legal video streaming sites like Netflix or Viewster by 12% (over and above the control group).  The lightest users of the blocked sites increased their legal consumption by only 3.5% while the heaviest users of the blocked sites increased their legal consumption by 23.6%, which is exactly what one would expect if the blocks caused the increase.

The fact that the Pirate Bay block did not migrate consumers to legal channels while the 19 site block did suggests that only persistent blocking of multiple piracy sites can be effective in converting pirates to legal channels.  The most likely explanation is that when you block one major site, it is easy for pirates to find their second favorite site.  But when you block 19 major sites, not all pirates have a “20th favorite site” and the cost of finding another reliable site is prohibitive enough to convert some of those pirates.  Other aspects of the data support this interpretation (read the paper to find out!).

Policy Implications:

These findings are especially timely in Australia, where the House and Senate have recently passed a controversial bill that will allow for piracy website blocking as a form of antipiracy enforcement.  Opponents of the bill argue that such actions will be ineffective – our research suggests that these measures are only ineffective if they do not sufficiently increase the cost of piracy to the individual.  If enough websites are blocked, our study implies that it could meaningfully increase legitimate consumption in Australia.

Of course, while our research is a step toward quantifying the potential benefits of piracy website blocking, it does not allow for a full cost-benefit analysis.  First, we do not know how or if increased industry revenues from reduced piracy will change incentives to produce quality content.  Second, we do not know the direct or indirect costs associated with website blocking activities.

That said, if this new bill is utilized in Australia, it may provide more opportunities to study the effect of website blocking (and it appears as if there are some data available on the cost of such activities there.)

Streaming content on Hulu significantly reduces piracy

In an article published a while ago in Marketing Science, my colleagues and I showed that removing video content from the iTunes video store caused a significant increase in piracy of that content without stimulating DVD sales.  Further, adding the same content back to iTunes caused a decrease in piracy, but at a smaller level than the increase that corresponded to removal.

Recently we chose to revisit this topic but in a different context – we noted that about a year after Hulu came into existence (streaming NBC and Fox content), ABC chose to start streaming the last four episodes of a number of its hit television shows on Hulu.  In an NBER chapter on the Economics of Digitization, we asked whether piracy of ABC content changed after its addition to Hulu in a different fashion than any natural, seasonal changes in piracy of CBS, Fox, NBC, or CW content.  What we found was not surprising – piracy of ABC content dropped 20% more than piracy of the other networks’ content immediately following the addition of this content to Hulu.  See the graph attached to this blog post which shows the trend in ABC piracy vs. the “control” networks, surrounding the date of ABC’s addition to Hulu.


It is not surprising that making content freely available (with short advertisements) online at any time of day can migrate consumers from illegal channels to this legal one.  Of course, streaming on Hulu has a lower profit margin than over-the-air broadcasts (advertisers pay less for Hulu ad time than over-the-air ad time).  So to put this finding into context, one would want to ask if Hulu streaming has any impact on viewership in other channels like over-the-air broadcast, digital downloads, or DVD box set sales.  Our finding was put forth as an example of using good methodology to tease out the causal effect of Hulu on piracy.

Surprising Finding

However, in the midst of this seemingly innocuous exercise, we noticed something else that was less expected.  In the graph above, notice that piracy of Fox, CBS, CW, and NBC content also seems to drop somewhat immediately following the addition of ABC content to Hulu.  Perhaps this is due to chance, but we notice that the drop in the control group (after the experiment) is an immediate break from its former trend.  In other words, it almost looks as if maybe the control group experienced a smaller drop in piracy that might be attributed to ABC’s addition to Hulu…

Why might this be?  Well, imagine a viewer is pirating her favorite two shows on ABC as well as a show she likes on CBS (which does not stream on Hulu).  Now imagine that those two favorite ABC shows become available on Hulu and she switches over to Hulu for its convenience and relative safety (I find Hulu much easier than piracy).  But something else happens… she discovers an NBC show that she really likes on Hulu and starts watching it while she’s there… this replaces the time she used to spend watching the CBS show she pirated.  As a result, piracy of CBS content drops even though they didn’t add their content to Hulu.

As researchers, can we be certain of this narrative?  Absolutely not – there is not enough in our data to validate this story.  However, it’s an interesting and not ridiculous explanation for the apparent drop in control group content after the ABC experiment on Hulu.  If it’s true, that means that when a network adds its content to a major digital distribution site, it has positive spillovers for other networks on that site due to consumer discovery.  Certainly this possibility is worth some additional research to see if it can be supported.

I wouldn’t make such a big deal of this story right now since our data do not really provide enough proof, but we observed the same sort of thing back in our iTunes paper.  When NBC removed their content from iTunes, NBC piracy spiked very high but piracy of the control group content experienced a smaller spike as well.  We also could not prove this at the time, but it almost suggested that when NBC viewers turned to piracy as a result of lack of iTunes availability, they started pirating content of other networks more as well (consistent with the idea that there is a “fixed cost” to piracy).

I think the idea of spillovers across networks in the “channel distribution game” is a very interesting idea that deserves further exploration.

Understanding Media Markets in the Digital Age: Economics and Methodology

This post is about a recent chapter I co-authored for the NBER volume on the Economics of Digitization. 

Viewpoints on such topics as filesharing, copyright enforcement, and digital distribution strategies can be quite polarized.  Further complicating the picture, empirical “research” appears to support both sides of the issues on these topics, with one example being the fact that some studies suggest that piracy harms sales of artistic works while others seem to suggest that piracy may be beneficial toward these sales.  But not all “research” is created equal – in our recent chapter for the NBER “Economics of Digitization” book, we attempt to explain the difference between high quality research that can support its claims and lower quality research which uses real data but makes claims that are not supported by these data.  In this chapter, we take an agnostic stance as to the debate over copyright but we set forth a primer for how good research in this area should be conducted.

Correlation and Causation

Our central tenet is one that few people in the debate will disagree with – correlation is not causation.  The key to the debate over many copyright and digitization issues in media lies in determining the causal effect of one thing on another, such as the causal effect of piracy on sales of media.  Simplistic analysis of data usually produces only correlations.  For example, consider the claim put forth in some research – “music pirates purchase more music than non-pirates.”  Often this claim is put forth as evidence that piracy is not harmful or even that it is helpful to music sales.  But consider the following:  In a pre-piracy world, some people probably bought music CD’s and some people did not.  Many in the second group simply are not interested in acquiring music – they simply aren’t that into it, or aren’t into owning it.  Now imagine that suddenly Napster introduces filesharing/piracy to the world.  Who do you think will start acquiring music on Napster?  The group of people who liked music enough to buy CD’s, or the group who never showed any interest in acquiring it before?  While some of the latter group may be attracted to the zero price of Napster, clearly the people who actually like music are more likely to start pirating than the people who didn’t like music enough to ever acquire any before.  So now we will observe that people who are pirating more (the former group) are also purchasing more (they were always bigger purchasers than the latter group even before piracy existed).  In other words, piracy will be positively correlated with purchases across individuals.  But this does not mean that piracy helps sales.  The question you want to answer is this – if piracy had not come into existence, how much would people be buying/paying for music and how does that compare to how much they actually buy?  
The difference between what they actually buy/pay and what they would buy if piracy were unavailable (the counterfactual) is the causal effect piracy has had on sales.  This is a more complicated question, but it is the one that needs to be answered to understand the effects of the erosion of copyright and to discuss what copyright policy should look like in the digital era.
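
To make the Napster thought experiment concrete, here is a minimal simulation sketch.  Every number in it is an assumption chosen for illustration (a “taste for music” score, the rule that music lovers are more likely to pirate, and a 20% displacement of purchases by piracy); none of it comes from our data.  It simply shows that pirates can buy more than non-pirates on average even in a world where piracy reduces every pirate’s purchases.

```python
import random

def simulate(n=10000, seed=0):
    """Simulate n hypothetical consumers; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        taste = rng.random()              # assumed taste for music, between 0 and 1
        would_buy = 10 * taste            # purchases in a world without piracy (the counterfactual)
        pirates = rng.random() < taste    # assumption: music lovers are more likely to pirate
        # assumption: piracy displaces 20% of a pirate's purchases
        actual_buy = would_buy * (0.8 if pirates else 1.0)
        rows.append((pirates, would_buy, actual_buy))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

rows = simulate()

# The naive comparison: what pirates buy vs. what non-pirates buy.
pirate_avg = mean(actual for is_pirate, _, actual in rows if is_pirate)
nonpirate_avg = mean(actual for is_pirate, _, actual in rows if not is_pirate)

# The causal effect: actual purchases minus counterfactual purchases, per person.
causal_effect = mean(actual - would for _, would, actual in rows)

print(f"pirates buy {pirate_avg:.2f}, non-pirates buy {nonpirate_avg:.2f}")
print(f"average causal effect of piracy on purchases: {causal_effect:.2f}")
```

In this toy world the naive comparison shows pirates buying more, yet the causal effect is negative by construction.  The catch, of course, is that real data never hand you the counterfactual column, which is why the methodology matters.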

Good Methodology vs. Bad

In our chapter, we discuss different methodologies that can be used to make causal inferences from data and how these methodologies (long used in the field of economics) can be applied to experiments that have occurred in digital media.  For example, we show that after the shutdown of Megaupload, the popular filesharing cyberlocker, digital movie sales actually decreased across the world.  However, this is because the shutdown occurred in January of 2012, and sales decrease every year in January as they come down from Christmas highs.  As an example of a methodology called “difference-in-difference”, we discuss one of our research papers that shows that in countries where Megaupload was more widely used, the post-Christmas sales decline was less steep than in countries where Megaupload was not popular.  This indicates that the natural seasonal decline in sales was smaller in 2012 as a result of the shutdown of the piracy site, and we show how the combination of this fact and other evidence in the data (for example, the fact that countries in our data all had similar digital movie sales trends before the shutdown) leads to the reasonable interpretation that the shutdown of Megaupload actually caused digital movie revenues to be 7-9% higher than they would have been if not for the shutdown.  A simple report that digital sales were lower in the two weeks after the shutdown than the two weeks before would be very misleading due to the natural seasonal trends – instead, careful research and application of methodology for causal inference was required to uncover the true impact of the Megaupload shutdown on digital movie revenues.
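
The arithmetic behind this can be sketched in a few lines.  The revenue figures below are hypothetical index values, not our data, and the sketch uses a simplified two-group comparison in which low-use countries stand in for the seasonal trend (the actual paper uses a continuous measure of Megaupload use).  It shows how a naive before/after comparison and a difference-in-difference comparison can point in opposite directions.

```python
# Hypothetical weekly digital movie revenue index (assumed numbers, not real data).
sales = {
    "high_megaupload_use": {"before": 100.0, "after": 92.0},
    "low_megaupload_use": {"before": 100.0, "after": 85.0},
}

def pct_change(group):
    """Percentage change in revenue from before the shutdown to after it."""
    return 100 * (group["after"] - group["before"]) / group["before"]

# Naive view: revenue fell after the shutdown, even where Megaupload was popular.
naive_change = pct_change(sales["high_megaupload_use"])

# Low-use countries approximate what the post-Christmas decline looks like
# when the shutdown matters little (an assumption of this simplified sketch).
seasonal_change = pct_change(sales["low_megaupload_use"])

# Difference-in-difference: the change over and above the seasonal decline.
did_estimate = naive_change - seasonal_change

print(f"naive before/after change: {naive_change:+.1f}%")
print(f"seasonal decline (low-use countries): {seasonal_change:+.1f}%")
print(f"difference-in-difference estimate: {did_estimate:+.1f} percentage points")
```

With these made-up numbers the naive change is negative while the difference-in-difference estimate is positive, which is the pattern described above: sales fell everywhere in January, but fell less where Megaupload had been popular.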

In the NBER chapter, we provide several examples of our own prior research that isolates causal effects from correlations in the data.  Our research combines econometric methodology with “natural experiments” – discrete shocks to filesharing or to legal digital availability – to make our causal claims. In addition to our Megaupload study, we discuss our paper which isolates the causal impact of HADOPI (an anti-filesharing law in France) on digital music sales – HADOPI caused a 23-35% increase in digital music sales.  We also discuss our paper that shows that offering television content on the iTunes video store can reduce piracy of that same content without cannibalizing sales of DVD box sets.  Through our application of methodologies for causal inference, we build a case that legal sellers of media can compete with piracy either by making legal offerings more attractive or by making illegal consumption channels (like filesharing) less attractive.


Our hope for our NBER chapter is two-fold:  First, we hope that it will spur additional research in digital media that is grounded in valid methodology and that focuses on teasing causality out of correlations in data.   Such research is essential to guide the distribution strategies of firms as well as government copyright policy in the digital era.   Second, we hope that readers of research on digital media and copyright can use our chapter to better understand why studies in digital media may seem to be in conflict – our chapter can serve as a short text on understanding the difference between good research that can support its claims (through randomized or natural experiments) and lower quality research that uses real data but makes causal claims not supported by said data.

Explanation of Megaupload Study (or: Econometrics 101)

As I’ve already blogged, Mike Smith and I released a study on the impact of the Megaupload shutdown on digital movie sales and rentals.

Since we found that it actually boosted revenues meaningfully, there are naturally a number of people who don’t like the study and criticize it without even reading the abstract, let alone the paper.

The most common critique in comments on blogs and news articles is that “sales were increasing anyway because of (digital growth) (new digital channels) (blockbusters released in January) (insert your favorite reason you think sales would have grown here).”  I suppose people think that as economists we would not have thought of this.

I thought I’d explain the actual methodology of the study below and why it accounts for this… in other words, why you might find it compelling.  But I’ll do so by analogy without any econometrics equations.

What A Bad Study Would Look Like:

Imagine you wanted to know the effectiveness of a new medicine in treating the common cold.  And imagine you had a good way to measure how bad a patient’s symptoms were each day.  If you just took 100 people who had had the cold for 4 days and gave them your pill, and 2 days later they were all very improved, you could hardly release this as a study.  People would (rightfully) say “colds usually last about 4 days, your patients all would have been recovering anyway!  You guys are hacks (paid for by big pharmaceuticals).”  That would be the equivalent of simply asking how sales changed around the world after the Megaupload shutdown.  This is what people are claiming we did.  It’s also precisely what we say in our half page abstract that we did not do.  So this blog post is to provide some more information to people whose response to our paper is “correlation is not causation.”

What Our Study Actually Did:

What you would really like is to split the patients randomly into two groups of 50, give half of them the medicine and give the other half an identical looking sugar pill.  Then ask how the “treated” group compares in 2 days to the “control” group that got the sugar pill.  That would be a reasonable study.  As scientists, we would have loved it if the government randomly picked half the countries in the world to completely block access to Megaupload and left it untouched in the other countries – we could ask how sales changed in the (randomly chosen) blocked countries compared to the unblocked.  If a new release came out in January that boosted sales, it would boost sales in both sets of countries so our estimate of the effect of the shutdown would not be biased.  Even if you think that the new release came out in some countries but not others (in January), since they were chosen at random it should be coming out in approximately equal numbers of blocked and unblocked countries.  That would be a great way to get a good estimate of the impact.  The unblocked countries give you an estimate of how sales would have changed if not for the shutdown, and any change in the blocked countries over and above this might be attributed to the shutdown.  Like a medical trial with control and treatment groups.

Unfortunately we didn’t have that experiment.  But we had something that is similar and equally valid.  Imagine that you couldn’t give any of your patients pure sugar pills but you could give some patients pills that were 80% medicine and 20% sugar.  And you could give some patients pills that were 60% medicine and 40% sugar.  And some patients pills that were 20% medicine and 80% sugar.  Imagine that before you gave them the pills, all groups of patients were recovering or not recovering at equal rates.  So you have evidence that they are all following about the same recovery track.  Then, immediately after you give them the pills, the people who got the 80% medicine pill have the highest amount of recovery.  And the people who got the 60% medicine pill have reasonably high (but not as high) recovery.  And the people who got the 20% medicine pill have the lowest amount of recovery.  Given that the groups were following the same trend beforehand you would have expected them to continue to do so, but *immediately* after getting the pill you observed a strong significant positive correlation between “recovery” and “% medicine in the pill”.  Would you not think that the most likely explanation for this was that the medicine has a causal effect treating the cold?  That’s why we call our correlation a causal impact.

Our situation was analogous.  After controlling for various variables (including Christmas), we found that countries with high Megaupload use had similar sales trends to countries with low Megaupload use before the shutdown (the levels of sales were different, but the time trends were the same).  Immediately after the shutdown, the sale changes were no longer the same.  The sales change was positively correlated with the pre-shutdown amount of Megaupload use.  Countries with high pre-shutdown Megaupload adoption had higher sales growth (or less loss) than countries with low adoption.  Would you not now say that the most logical explanation for this immediate change from no correlation to a correlation is that the Megaupload shutdown causally affected sales?
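
For readers who prefer to see the logic in code, here is a minimal sketch of the comparison.  The five countries and all of their numbers are hypothetical (they are labeled A through E precisely because they are not our data), and the real paper uses regression models with additional controls.  The sketch fits a simple least-squares slope of sales change on pre-shutdown Megaupload use, once for a pre-shutdown period and once for the post-shutdown period.

```python
# Hypothetical countries: (label, pre-shutdown Megaupload use as a share of
# Internet users, % sales change in a pre-shutdown period, % sales change
# right after the shutdown).  All values are assumed for illustration.
countries = [
    ("A", 0.02, -1.0, -14.0),
    ("B", 0.05, -1.2, -12.5),
    ("C", 0.08, -0.9, -11.0),
    ("D", 0.12, -1.1, -9.5),
    ("E", 0.15, -1.0, -8.0),
]

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

use = [c[1] for c in countries]
pre_change = [c[2] for c in countries]
post_change = [c[3] for c in countries]

# Before the shutdown: sales trends should be unrelated to Megaupload use.
pre_slope = ols_slope(use, pre_change)

# After the shutdown: sales changes line up with pre-shutdown Megaupload use.
post_slope = ols_slope(use, post_change)

print(f"pre-shutdown slope: {pre_slope:.2f}")
print(f"post-shutdown slope: {post_slope:.2f}")
```

In this toy data the pre-shutdown slope is close to zero and the post-shutdown slope is large and positive, mirroring the “no correlation, then suddenly a correlation” pattern described above.  A near-zero pre-period slope is what makes the post-period slope interpretable as an effect of the shutdown rather than a pre-existing difference between countries.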

Is This 100% Proof?

Of course it is not 100% proven.  Perhaps lots of invisible fairies *just happened to appear in January 2012* in countries with high Megaupload use and told consumers to start buying more movies.  And some fairies appeared in medium Megaupload countries and told consumers to start buying a few more movies.  And no such fairies appeared in low Megaupload countries.  But how likely is this counter-explanation?  Can you come up with a counter-explanation that is more likely than fairies?  If so, please post here – we love exploring alternate theories to see if they could explain our findings or not.  We want to know the truth.  We just can’t think of any reasonably likely counter-explanations yet (except the movie fairies!).


This methodology, known as a difference-in-difference technique that exploits treatment intensity, is a very common methodology used in economics and is the basis for many studies in very highly ranked peer-reviewed journals.  By the way, when we requested the data from the studios, we did not tell them our methodology.  We don’t believe they tampered with the data (or we wouldn’t publish).  But even if they hypothetically would have chosen to, they would have had to falsely increase their sales in countries like Spain and France (high Megaupload adoption) while lowering (or not changing) their sales in countries like the US and UK (low Megaupload countries).  That’s the only way one could hypothetically fake data to produce our results.  Since they did not know what our methodology was, that hardly seems like the manner in which one would tamper with data, right?  Lowering or ignoring sales in the US?