This study examines the impact of peer-to-peer (P2P) file sharing on the Australian theatrical film industry. Using a large data set of torrent downloads observed on three popular P2P networks, we find evidence of a sales displacement effect on box office revenues. However, although statistically significant, the economic significance of this displacement appears relatively small. To establish causality, we make use of the state-day-level panel data structure permitting the use of film fixed effects to help mitigate the endogeneity between film revenues and downloads. To further assist identification, we propose a downloading cost function that considers other states’ downloading activities as a proxy for the number of peers in the download swarm; the US DVD release date as a supply shock to P2P networks; and the substantial structural progression within the Australian internet service provision industry that occurred over the sample period. We observe that the release gap between the US and Australian markets is a key contributor to piracy early in a film’s theatrical life. This finding provides a partial explanation for the industry’s move towards coordinated worldwide releases.
Since the arrival of peer-to-peer (P2P) file-sharing technologies more than a decade ago, intellectual property protection has become an increasingly important issue for copyright-protected products that can be digitally reproduced. P2P file-sharing technologies, such as BitTorrent, allow internet users to illegally share copyrighted content (e.g. software, games, books, music, television shows and films) at minimal cost provided they have a high-speed internet connection and sufficient download capacity. The increasing popularity of such services, particularly among younger generations, has created something of a revolution in the way many people access and consume content. In response to the increasing incidence of “digital piracy”, owners of copyrighted content have pursued legal recourse in many countries, with high-profile cases brought against file-sharing services, individuals using file-sharing services, and internet service providers (ISPs).
Although it is extremely difficult to put an exact figure on the extent and costs of digital piracy, a recent annual industry survey by the Business Software Alliance of 15,000 computer users across 33 countries found that more than 57% of respondents admitted to pirating software in 2012, up from 42% in 2011. Another recent study of piracy habits in the United States found that 46% of adults have bought, copied, or downloaded unauthorised music, TV shows or films and that these practices correlate strongly with youth and moderately with higher incomes. In Europe, a 2010 study by the International Chamber of Commerce found that internet pirates downloaded €10 billion worth of music, film and television and claimed that digital piracy could cost the content industries €240 billion in revenue and 1.2 million jobs by 2015. And in Australia, two recent (2011) studies by the Australian Content Industry Group (ACIG) and the Australian Federation Against Copyright Theft (AFACT) estimate annual losses at A$900m and A$1.37b, respectively.
Although numerous industry studies have estimated significant costs from digital piracy, many of them likely overestimate these costs because calculations typically assume a one-for-one sales displacement effect from illegal downloads. This implies that everyone who downloaded the product would have been prepared to pay full price in the absence of the illegal alternative – an extremely uncomfortable assumption. More measured industry studies have assumed some “substitution rate” between number of downloads and lost sales which is less than one-for-one.  But often these rates are arbitrarily imposed and not based on any real evidence on consumer decision-making.
While substitution rates represent a more honest approach by which to measure potential sales displacement effects, many academics and industry observers have offered an alternative perspective that piracy may actually have beneficial effects on sales. This could occur, for example, if illegal downloading acts as a sample which precedes legal paid consumption, or if there are bandwagon effects in demand from shared word-of-mouth. In fact, relatively simple economic theory can predict either a positive or negative legal consumption effect from piracy, and which effect dominates is ultimately an empirical question. As Dejean (2009) and Waldfogel (2012) discuss in detail, three broad approaches have been pursued in the empirical literature investigating demand-side effects of digital piracy on legal paid consumption. First, a number of studies have examined aggregate sales vis-à-vis internet usage, or computer ownership, as a proxy for downloading activity. These studies typically pursue either a cross-sectional approach (e.g. Peitz and Waelbroeck 2004; Zentner 2005; Walls 2008), a time-series approach (e.g. Stevans and Sessions 2005), or a combination thereof (e.g. Michel 2006; Liebowitz 2008). Second, some studies have utilised actual download information (e.g. Liebowitz 2006; Bhattacharjee et al. 2007; Oberholzer-Gee and Strumpf 2007; De Vany and Walls 2007). Third, others have based analyses on data obtained by surveying individuals on their consumption behaviour (e.g. Zentner 2006; Rob and Waldfogel 2006; 2007; Hennig-Thurau, Henning, and Sattler 2007; Waldfogel 2010).
Conceptually, the best approach would appear to be the second one, where actual downloading activity is measured directly and related to legitimate sales. However, this approach is complicated by the inherent simultaneity between sales and downloads, which exists because films that are popular at the box office are also more likely to be popular with illegal downloaders. A particularly impressive study employing data on actual downloads vis-à-vis sales is that of Oberholzer-Gee and Strumpf (2007). Their study of the US recorded music industry used file-sharing data collected from OpenNap, a centralised P2P network, providing a sample capturing 0.01% of the world’s downloads, and contemporaneous album sales (retail and on-line) over a 17-week period in late 2002. To mitigate the endogeneity between sales and downloads, they considered the number of German school-aged children on holiday under the assumption that German students provided much of the supply of songs on file-sharing networks. However, they were unable to detect any displacement effect between download activity and music sales, concluding that the observed decline in music sales is not primarily the result of file sharing.
Our study similarly investigates digital piracy using actual download and sales data, but with specific application to the theatrical film industry. We employ an extensive data set of daily Australian state/territory level P2P torrent downloads and contemporaneous box office revenues. Our study is the first that we know of to consider digital piracy in the film industry using a large data set of actual downloading activity. Our empirical methodology is in many respects similar to the approach of Oberholzer-Gee and Strumpf. However, as discussed further below, there are a number of subtle and important differences. Similar to Oberholzer-Gee and Strumpf, our identification strategy uses the notion of a download cost function which has variation in both the film and time dimensions and also allows for structural changes observed in the Australian internet market over the sample period, which have likely reduced costs associated with downloading.
At a film-temporal level, we consider downloading cost to be (i) inversely related to the number of peers in the P2P swarm and (ii) lower once the film has been released on DVD in the US. Regarding (i), we proxy the number of peers in the download swarm by using the contemporaneous number of downloads occurring in geographically separated Australian states/territories – an approach that has some similarity to the use of (average) price in other markets sometimes employed in differentiated goods models of demand (e.g. Hausman, Leonard, and Zona 1994; Nevo 2001). As discussed in further detail below, this identification strategy is greatly assisted by the panel structure of our data which allows us to account for film-level heterogeneity with film fixed effects. Regarding (ii), we include the US DVD release date as an instrument under the assumption that more and better quality torrent files would likely be available after the DVD release, reducing the opportunity cost from downloading a poor quality torrent. At a more temporal level, we also make use of the large increases in broadband subscriptions, speeds and capacities offered by Australian ISPs, allowing us to introduce an additional moment condition into our model.
Like Oberholzer-Gee and Strumpf, we find no evidence of a contemporaneous relationship between downloading and sales (i.e. box office revenues), but we do find evidence of a sales displacement effect when downloads are considered as a dynamic stock over one-, two-, three- and four-week windows. We also observe that both contemporaneous and dynamic stock downloads have a significant negative impact on first-week box office. Given many films are subject to a release lag between the US and Australian markets, this suggests downloading activity post-US release but pre-Australian release decreases opening week revenues, which are well known to be particularly important in a film’s life. We find that the release delay between the US and Australian markets provides an opportunistic window for online pirates which is statistically related to decreased revenues at the box office. Although the present impact on box office revenues appears small, with the increasing use of file-sharing technology and increased speed of bandwidth in Australia, this problem is likely to increase. The trend towards day-and-date releases seems the most sensible response given this increasing threat in the absence of legal solutions.
2 Australian Context
Australia provides an interesting context within which to study digital piracy – and in particular that related to theatrical films. Australians are well known to be some of the most frequent cinema-goers worldwide. According to statistics compiled by Screen Australia, in 2010 Australia ranked third behind Iceland and Singapore in terms of annual admissions per capita, with the US ranking fourth. However, it is also widely known that Australians are among the most avid users of P2P file-sharing technologies for music, television and film. Australia’s attraction to file sharing is often attributed to relatively high content prices as well as international release delays for TV and film. With the National Broadband Network (NBN) to be progressively rolled out over the next decade, content industries fear that digital piracy will proliferate even further as consumers are able to access and download illegal content with increasing speed and ease.
It is likely that the introduction of the NBN has been a key contributor in the content industries’ public campaign for legislation to deal with piracy in Australia over recent years. However, although content providers have been lobbying the government to take a stand against piracy, thus far their efforts have largely been ignored with policy makers encouraging continued negotiations with ISPs to find a cooperative solution.  As a result of political inaction, some content owners have pursued direct legal action against individuals and, subsequently, ISPs in their war on piracy. The initial legal strategy pursued by some companies involved actions previously employed in the United States (and other countries) by sending Australian ISPs “cease and desist” notices outlining details of the customer’s infringement and requesting they threaten the individual with disconnection of their internet service.  However, not all ISPs complied with such instruction and the failure to comply by Australia’s second largest ISP, iiNet, resulted in a landmark court case spanning more than four years.
In November 2008, a consortium of 34 record labels, pay-TV providers, film studios and other content providers filed the case against iiNet for failing to discipline its customers in relation to allegations of copyright infringement. The case of Roadshow Films and others v iiNet (or commonly AFACT v iiNet) was initially heard by the Federal Court of Australia and decided on 4 February 2010 with the trial judge ruling in favour of iiNet and awarding costs. In passing judgement, the trial judge noted that while iiNet users did infringe copyright, it was not the responsibility of iiNet to police its customers on the infringement of other parties’ copyrights. The decision was subsequently appealed by AFACT to the full bench (Full Court) of the Federal Court on 24 February 2011 but was again dismissed by the presiding judges. The appeal judges upheld the initial decision but for different reasons. They noted that although iiNet showed an indifferent attitude to the complainants’ allegations, iiNet’s inaction did not constitute authorisation for the act of copyright infringement. On further appeal, the case was heard by the High (Supreme) Court of Australia which again sided with iiNet in judgement passed on 20 April 2012. The Court unanimously dismissed AFACT’s appeal and ordered AFACT to pay costs of approximately A$9m. Given the Courts’ decisions on this case, we believe it is unlikely downloading behaviour would have reduced in response to these rulings.
Although it is undeniably true that Australians are among the world’s most prolific users of P2P file-sharing technologies, it is also true that until recently – and certainly over the sample period of our study – many internet users faced constraints on the activity given typical internet plans contained data download limits which were relatively low in comparison to other countries. Given a typical film download can be in excess of one gigabyte, consumers with limited data allowances would have to be mindful of the amount of downloading performed each month so as not to incur excess data charges or have their download speed throttled. It would therefore seem reasonable to believe that such constraints would incline internet users towards downloading more highly valued content, such as new films playing at cinemas.
However, while many internet users have been constrained by download limits in Australia, the particular period of our study also coincides with dramatic increases in data allowances offered by all the major Australian ISPs. Data collected by the Australian Bureau of Statistics (ABS 2011) on internet subscriptions, speeds and downloads supports large increases in all these areas for all Australian ISPs operating during our sample period. In relation to broadband subscriptions, from December 2009 to June 2011 there was an increase from 8.0 m to 10.3 m (i.e. 28%) subscribers and an increase from 2.5 m to 5.0 m (i.e. 102%) subscribers for plans offering speeds above 8 Mb/second. In relation to actual volume of data downloaded, the quarterly volume increased from 127,661 Tb (December 2009 quarter) to 274,096 Tb (June 2011 quarter) – an increase of more than 114%. These aggregate statistics provide evidence of a dramatically changing landscape in internet service provision in Australia, as the overall number of broadband subscriptions increased alongside faster download speeds and less restrictive data allowances. Ceteris paribus, these changes would almost certainly increase levels of downloading observed over our sample period. We make use of this (exogenous) structural change in the empirical model we outline below.
3 Data and Descriptive Statistics
To investigate the impact of file sharing on film revenues, we employ an extensive data set of Australian state/territory level daily box office revenues and P2P torrent downloads of 166 films released in Australian cinemas between January 2010 and August 2011. The films in our sample are typically large budget “Hollywood-type” films that received a wide release in the US theatrical market (i.e. >2000 theatres, as per conventional industry definition) as well as an Australian theatrical release. Given the international nature of these films, a priori we would expect substantial interest in both cinematic consumption and illegal downloading allowing us to investigate potential displacement relationships between downloading activity and box office revenues. The torrent data were sourced from Peer Media Technology – a company which, among other services, measures digital piracy for companies in the entertainment, software and publishing industries.  We tracked downloads on three popular P2P networks: (1) BitTorrent, (2) eDonkey, and (3) Ares, where a download is defined as a unique instance of an IP address attempting to download an appropriately named torrent file on a given day.  The IP addresses are subsequently geo-located by another company, MaxMind, to provide state/territory-level number of downloads per title per day in our particular context. Peer Media Technology estimates that their measurement provides approximately 55% of all downloads in the Australian context. 
The torrent data of each film in our study span a longer period than the observed Australian theatrical life of the film, allowing us to track downloads which may have occurred both before and after the theatrical window. In particular, we observe downloads post US release, but pre-Australian release. Figure 1 reveals that the number of downloads spikes after the initial US release, presumably the result of increased availability and interest. Approximately 100 days after the theatrical release, another peak in the distribution is evident, giving the appearance of a bimodal distribution. We conjecture that this second peak coincides with the US DVD release date: DVDs are typically released no sooner than three months after the theatrical release, a pattern that is strongly evident in our data. As discussed in detail below, we make use of this feature of our data and include the US DVD release date as an instrument in our empirical models. The temporal distribution of downloads relative to the Australian release date reveals a significant number of downloads occurring prior to the Australian release. This is almost certainly related to release gaps between the two markets – an issue we discuss in more detail below. During the theatrical release window, we observe contemporaneous (daily) box office revenues and number of theatres for each film disaggregated to the state/territory level (Rentrak). In addition, we also observe US box office revenues, opening theatres and cinematic release dates; hence we observe the release gap between the US and Australian theatrical releases.
In total we observe 295,304 torrent download data points and 64,328 daily box office revenue and theatre data points. We limit our attention to 20 weeks of box office revenues post Australian release which provides us 56,663 data points in the final estimation sample. Tables 1 and 2 provide summary statistics for our data. Table 1 details aggregate-level information for the 166 films observed. On average, each film earned nearly A$8.9m at the Australian box office and was released on 259 screens. The highest earning film, Harry Potter and the Deathly Hallows Part 2, made almost A$51m. The average number of downloads per title was almost 113,000, with an average file size of 1.2 Gb. The most downloaded film, Inception, was downloaded more than 435,000 times. As noted above, all the films in our sample had a wide release in the US market of at least 2,000 theatres (average of 3,125) and were generally large budget titles with an average (estimated) production budget of US$70m (data sourced from IMDb) and earning an average of just under US$90m at the US box office. Of all 166 films we observe, the correlation between total revenue and total downloads is moderately strong at 0.522. Figure 2 shows this relation with a simple linear regression which reveals a statistically significant positive relation.
Table 1: Film-level summary statistics
|Variable|N|Mean|Median|Std. Dev.|Min|Max|
|Total revenue (AUD)|166|8,869,875|6,121,475|9,310,205|559|50,800,000|
|Aus opening weekend screens|166|259|242|135|2|758|
|US opening weekend theatres|166|3,125|3,045|511|2,012|4,468|
|US total revenue (US$m)|166|89.0|60.6|79.2|10.1|415.0|

[Table 2: Summary statistics for the estimation sample by state/territory. Columns: Mean, Median, Std. Dev., Min, Max, reported separately for revenue and downloads.]
Table 2 provides summary statistics for the disaggregated data used in estimation. Of the 56,663 data points, the average film’s daily (state/territory level) revenue is A$25,148 and the average number of downloads is 95. These are simply weighted averages of the state/territory data contained in the body of the table. The state/territory summary data are consistent with respective state/territory populations in terms of mean ranking. However, the data for Northern Territory (NT) downloads are particularly low due to problems in primary data collection. Aggregating all revenue and downloads across states/territories for all films observed on each of the 596 days in our sample, the average Australia-wide daily revenue was just under A$2.5m, with the highest recorded single day revenue of A$9.4m occurring on Wednesday July 13, 2011, coinciding with the release of Harry Potter and the Deathly Hallows. In terms of downloads, nation-wide the daily average was just over 31,000 with the highest single day number of downloads occurring on Sunday June 12, 2011, where more than 61,700 downloads occurred. Our data also display intra-week seasonality in relation to both revenue and downloads. Unsurprisingly, Saturdays, Sundays and Fridays recorded the highest average nation-wide revenues (downloads) at A$4.0m (35,296), A$3.3m (34,863), and A$2.7m (28,424), respectively. We control for the intra-week seasonality in our model with the use of day-of-week dummy variables as discussed in the following section.
4 Econometric Model
4.1 Contemporaneous Downloads
Our empirical model aims to quantify the sales displacement effect from illegal P2P torrent downloads on box office revenue. The approach is similar to Oberholzer-Gee and Strumpf (2007), with a number of important differences. First, and most obviously, our context is theatrical film revenues rather than music sales. This removes the complicated issue of transforming single downloads (sales) to a proxy for album downloads (sales). Second, the data are observed at daily, rather than weekly, levels. Also, our data are observed at the state/territory level, rather than national level. Third, our identification strategy relies on a downloading cost function which relates to contemporaneous downloads in other Australian states (proxying for number of peers), US DVD release date, and significant structural changes in the Australian internet service industry over our sample period. We discuss identification in more detail below.
Our empirical model is based on the following equation

$$\ln R_{ist} = \alpha + \beta \ln D_{ist} + X_{ist}'\gamma + \mu_s + \nu_i + \omega_{ist}$$

where $R_{ist}$ and $D_{ist}$ define revenue and downloads of film i in state s on date t, respectively. The logarithmic transformation is applied to revenue and download data since both distributions are bounded below at zero and are right skewed. An additional benefit of the log-transformation is that the impact of downloads on revenues can be interpreted as an elasticity. We partition control variables into observable and unobservable vectors $X_{ist}$ and $\omega_{ist}$, respectively. In particular, vector $X_{ist}$ includes the (time-variant) number of theatres showing film i in state s on date t ($TH_{ist}$), which also enters in log form; the week of the run of film i in state s on date t ($WEEK_{ist}$); as well as a set of dummy variables for day-of-week effects ($DOW_d$, where d indexes day). Additionally, we include state/territory fixed effects $\mu_s$ and film fixed effects $\nu_i$.
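The elasticity interpretation of a log-log specification can be illustrated with a short calculation. The sketch below is only an illustration: the coefficient value is hypothetical and is not an estimate from this study, and the function name `revenue_change` is introduced here for convenience.

```python
def revenue_change(beta: float, download_change: float) -> float:
    """Proportional change in revenue implied by a log-log coefficient.

    If ln(revenue) = ... + beta * ln(downloads), then scaling downloads by
    (1 + download_change) scales revenue by (1 + download_change) ** beta.
    """
    return (1.0 + download_change) ** beta - 1.0


# Hypothetical elasticity, chosen only for illustration (not a result of the paper).
beta = -0.05

exact = revenue_change(beta, 0.10)  # exact effect of a 10% rise in downloads
approx = beta * 0.10                # first-order elasticity approximation

print(f"exact: {exact:.4%}  approx: {approx:.4%}")
```

For small changes the two figures are close, which is why the coefficient on log downloads is read directly as the percentage change in revenue associated with a one per cent change in downloads.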
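We seek to make causal inference on the parameter $\beta$, the coefficient on log downloads in the revenue equation. However, because popular films at the box office are also likely to be popular on P2P networks, download activity should be treated as endogenous. In terms of the revenue equation above, this implies that $E[\ln D_{ist}\,\omega_{ist}] \neq 0$, where $\omega_{ist}$ collects the unobserved determinants of revenue. To remedy this problem, we employ two-step efficient generalized method of moments (GMM) estimation with robust standard errors (Hansen 1982).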
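The endogeneity problem can be made concrete with a toy example. The sketch below is not the two-step efficient GMM estimator used in the paper; it is the simplest just-identified instrumental-variables estimator, applied to four hand-constructed observations (all values are assumptions for illustration). An unobserved popularity term raises both downloads and revenue, so the naive regression slope has the wrong sign, while a cost shifter that is unrelated to popularity recovers the assumed effect.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Covariance of two equal-length lists (the 1/n scaling cancels in the ratios below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Toy data constructed by hand; none of it comes from the paper.
z = [1.0, -1.0, 1.0, -1.0]   # cost shifter: moves downloads, unrelated to popularity
u = [1.0, 1.0, -1.0, -1.0]   # unobserved popularity: raises downloads AND revenue
true_beta = -0.5             # assumed displacement effect

x = [zi + ui for zi, ui in zip(z, u)]                     # (log) downloads
y = [true_beta * xi + 2.0 * ui for xi, ui in zip(x, u)]   # (log) revenue

beta_ols = cov(x, y) / cov(x, x)   # biased: downloads are correlated with popularity
beta_iv = cov(z, y) / cov(z, x)    # uses only the variation in downloads driven by z

print("OLS slope:", beta_ols)
print("IV slope: ", beta_iv)
```

In this construction the instrument is exactly uncorrelated with the popularity term, so the IV ratio equals the assumed effect; with real data that orthogonality is an identifying assumption rather than something that can be verified directly.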
Our instruments are chosen based on the assumption of a downloading cost function with “cost-shifters” which would not be expected to be related to demand for theatrical consumption. We propose the following

$$C_{ist} = f(P_{it},\, DVD_{it},\, t)$$

where $C_{ist}$ is the cost of downloading film i in state s on date t, $P_{it}$ is the number of peers of film i on date t, $DVD_{it}$ is a binary variable relating to whether film i had been released on DVD (in the US market) at date t, and t represents a time (trend) index. The first argument of the cost function relates to the number of peers who are (contemporaneously) “seeding” the film file. Typically, the more seeders in a swarm, the faster the download and the more likely the chance of a successful download. Therefore, we would expect $\partial C_{ist}/\partial P_{it} < 0$. Although we do not observe actual information on the number of peers, we proxy this variable with the (contemporaneous) total number of downloads made across all other Australian states on date t, $D_{i,-s,t} = \sum_{r \neq s} D_{irt}$, where $D_{irt}$ denotes downloads of film i in state r on date t. Regarding the second argument, we expect downloading costs to fall once the US DVD has been released, under the assumption that higher quality torrent downloads are then more likely to exist on P2P networks. This is supported by the second mode observed in Figure 1, which suggests an increase in downloading activity approximately three months post-theatrical release. Therefore, we expect $\partial C_{ist}/\partial DVD_{it} < 0$. The final argument is a simple time trend, which enters the first-stage regression linearly, allowing for an exponential fit in levels given the dependent variable (downloads) is measured in log form. As reported in Section 2, the period 2010–2011 saw significant increases in broadband subscriptions, speeds and data allowances in Australia. As a result, downloading costs have fallen significantly over our sample and we conjecture $\partial C_{ist}/\partial t < 0$. We discuss identification issues in more detail in the following section.
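The peer proxy described above, total downloads of the same film on the same date in all other states, is a leave-one-out sum. A minimal sketch of its construction follows, assuming one record per film, state and date; the record layout, the function name `other_state_downloads` and the sample values are hypothetical.

```python
from collections import defaultdict


def other_state_downloads(records):
    """Map (film, state, date) to total downloads of that film on that date
    in all OTHER states.

    `records` is an iterable of (film, state, date, downloads) tuples and is
    assumed to contain at most one tuple per (film, state, date).
    """
    records = list(records)
    national = defaultdict(int)  # (film, date) -> downloads summed over all states
    for film, state, date, n in records:
        national[(film, date)] += n
    # Leave-one-out: national total minus the focal state's own downloads.
    return {
        (film, state, date): national[(film, date)] - n
        for film, state, date, n in records
    }


# Hypothetical sample values for illustration only.
records = [
    ("Film A", "NSW", "2011-07-13", 120),
    ("Film A", "VIC", "2011-07-13", 90),
    ("Film A", "QLD", "2011-07-13", 60),
    ("Film B", "NSW", "2011-07-13", 10),
]
instrument = other_state_downloads(records)
print(instrument[("Film A", "NSW", "2011-07-13")])
```

Subtracting the focal state's own count from the national total keeps the instrument free of the focal state's downloads by construction, which is the geographic separation the identification argument relies on.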
4.2 Identification
Identification in our model derives from the use of film fixed effects and the inclusion of instruments in the GMM estimation procedure that relate to downloading activity in other states (as a proxy for total number of peers), the US DVD release date and a trend variable to account for structural changes in the Australian internet industry.
The primary endogeneity problem we face is one of unobserved film quality that makes popular films at the box office similarly popular on P2P networks. To address this at a global level, we use film fixed effects. Of course, this implicitly treats the endogeneity problem as constant through time, which may not be true. For example, a film may benefit from a national advertising campaign or awards after its release. Our model deals with such temporal (global) demand shocks implicitly through the inclusion of theatres and the assumption that cinema managers respond dynamically to demand shocks by adjusting the number of daily theatres. The instruments in our GMM estimation, most notably summed downloads in other states, therefore handle a more subtle form of time-varying endogeneity that occurs at a state (local) level, rather than national (global) level. Because we use downloads in other states, the instruments cannot assist in identification when demand shocks occur at the national level. Rather, the time-varying demand shocks that our instruments can assist with are state-level effects that may be orthogonal to other states. For example, this could include local newspaper critic reviews or localised word-of-mouth via social networks. We now discuss the instruments we use in more detail.
There are three well-known requirements for a good instrument: (1) that it itself does not belong in the focal relationship, (2) that it is uncorrelated with the error process, and (3) that it is correlated with the endogenous variable. Taken together, the instrument’s effect on the outcome variable occurs only through its association with the endogenous variable. Our first two instruments clearly satisfy the first requirement. By assumption of geographic separation, downloads in other states couldn’t displace sales in a focal state. And US DVD release date certainly has no place as a determinant of Australian box office revenues. As for trend, we note three reasons why we do not believe trend should appear in the focal (revenue) relation: (1) the relatively short time frame of our sample (20 months), (2) no significant increase in cinema infrastructure, and (3) no significant trend effect in sales.  With the significant changes in internet provision previously discussed, it is therefore appropriate to consider the trend as also satisfying the first requirement outlined above and be excluded from the focal equation.
The second and third requirements deserve further attention. Regarding the first instrument, the use of summed downloads in other Australian states is taken as a proxy for the number of peers (seeders) in the P2P network and, as such, should be inversely related to downloading costs. That is, more downloads in other states suggests more seeders generally on a network and as a result a higher chance of a quicker and successful download. In some respects, the use of the total number of downloads from geographically separated markets as an instrument is similar to the use of (average) price from other markets when modelling demand for differentiated goods (e.g. Hausman, Leonard, and Zona 1994; Nevo 2001). The intuition is that prices are correlated through common marginal cost shocks. Assuming the errors in demand are independent across markets, this “cost shifter” approach is valid.
In our context, we similarly require an independence assumption for total downloads in other states to be a valid instrument. Specifically, the unobserved determinants of revenue (the error term $\omega_{ist}$ of the revenue equation in Section 4.1) should not also be correlated with our instrument. We argue this is true based largely on the panel structure of our data permitting us to use film fixed effects to absorb much of the film level heterogeneity, which is the major source of our endogeneity. We do, however, include instruments to assist with further identification owing to potential state-level (local) time-varying effects. We implicitly assume that the remaining (time) variation in the instrument is then more reflective of technical aspects and shocks related to downloading than any systematic variable which relates to unobserved demand. For example, a low number of downloads could reflect a poor quality torrent file (e.g. camcorder version) or a problematic tracker (i.e. the server which directs the torrent uploads and downloads not responding). As a result, a low number of peers may indicate a high chance of download failure and an associated higher cost (wasted time and data capacity). It is also typically true that more seeders imply a faster download with higher probability of success. Downloaders therefore prefer torrents for which they can see high seeder ratios. At the margin, therefore, downloaders would prefer to download titles with more active swarms.
Regarding the second instrument, the US DVD release date likely leads to high-quality torrent files becoming available on P2P networks. As a result, downloaders face lower (net) costs of downloading – or, alternatively stated, receive greater (net) benefits. We conjecture the marginal consumer who is sensitive to quality is more likely to download a DVD quality torrent file over, for example, a low-quality cam version.
The third instrument we make use of is related to the changing structural features of the Australian internet landscape. As discussed in Section 2, there were significant increases in data offerings and speeds from Australian ISPs over the sample period. As a direct result, broadband subscriptions, speeds and data downloaded jumped dramatically. In this environment, it would seem that illegal downloading costs would have declined and downloading increased as a result. To capture this we include a simple linear trend as an additional instrument in our model.
4.3 Dynamic Stock of Downloads
The model described in eq.  captures the potential sales displacement for film i, in market s, on date t. There are, however, a number of reasons why it would seem more appropriate to consider the total number of downloads over some window of time prior to, and including, date t rather than only those downloads observed contemporaneously. For example, illegal downloaders may browse torrent sites for new titles and may initiate a download with the intent of viewing it at a later time and, we assume, forgo the cinematic alternative. In addition, downloading speeds (at least during the period of analysis in Australia) could be restrictively slow, preventing instant playback and requiring that files be downloaded in advance of actual consumption. Further, once a download has been completed, an individual may share the file with friends, which would similarly contribute to potential future sales displacement effects. For these reasons, we consider it likely that downloading over some window of time affects future box office sales in a way that would not be captured in our contemporaneous specification. We subsequently refer to this alternate approach as a dynamic stock of downloads in our empirical methodology and consider the modified model:
where defines downloads of film i in state s over a period of T days, where T = 7, 14, 21, 28, prior to and including the date of observation. Alternatively stated, . Essentially, the modification simply considers the dynamic stock of downloads over one, two, three or four weeks prior to (and including) date t. Our first instrument, related to the number of downloaders in other states, as discussed above, , is similarly redefined and now represents total downloads of film i in all other states/territories over T, or . All other variables remain as specified above.
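The dynamic stock and its other-states counterpart can be illustrated with a minimal sketch. The function names, the dictionary layout keyed by (film, state, day), and the treatment of missing days as zero downloads are all assumptions made for illustration; the text does not describe the authors' implementation, and the estimated models use logs of these quantities.

```python
from collections import defaultdict

def dynamic_stock(daily, T):
    """Sum downloads over the T days up to and including each date.

    `daily` maps (film, state, day) -> downloads, with `day` an integer index.
    Days absent from `daily` count as zero downloads (an assumption).
    """
    return {
        (film, state, day): sum(daily.get((film, state, day - k), 0) for k in range(T))
        for (film, state, day) in daily
    }

def other_states_stock(stock):
    """For each (film, state, day), total the stock over all other states."""
    totals = defaultdict(int)
    for (film, state, day), value in stock.items():
        totals[(film, day)] += value
    return {
        (film, state, day): totals[(film, day)] - value
        for (film, state, day), value in stock.items()
    }

# Toy data: one film, two states, three days (made-up numbers).
daily = {
    ("A", "NSW", 1): 10, ("A", "NSW", 2): 20, ("A", "NSW", 3): 30,
    ("A", "VIC", 1): 1, ("A", "VIC", 2): 2, ("A", "VIC", 3): 3,
}
stock = dynamic_stock(daily, T=2)
instrument = other_states_stock(stock)
print(stock[("A", "NSW", 3)], instrument[("A", "NSW", 3)])
```

Subtracting the focal state's own stock from the film-day total keeps the instrument free of the focal state's downloads, which is the property the identification argument relies on.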
This transformation requires further discussion in relation to the role of our instruments, in particular how to interpret the dynamic stock variable in the cost function. The number of downloads per T is now more reflective of the aggregated popularity of the torrent on the P2P networks. The time-invariant heterogeneity will, once again, be picked up through the film fixed effects, but the temporal variation will pick up changes in technical aspects of the torrent file. As an example, take the case of a typical film at the box office which is enjoying an average reception and whose theatres/screens decrease in the familiar pattern over some number of weeks. Suppose up until week three only a bad (camcorder) torrent was available for this film. As a result, educated downloaders avoided this torrent and only low downloads were observed. Assume, for the sake of argument, that beyond the third week a better torrent appeared. This would increase the number of downloads, increase the number of peers, and consequently reduce download costs. This temporal variation is entirely related to the changes in the P2P offerings, and the instrument is constructed to capture this type of variation. We continue discussion of identification in Section 6.
5 Estimation Results
5.1 Contemporaneous Downloads
Table 3 provides GMM estimation results for the base model where downloads are treated contemporaneously to revenue (i.e. on the same date of observation). The first column reports simple OLS estimates without film fixed effects. The coefficient on the key variable of interest, contemporaneous downloads, is positive and significantly different from zero at 1%. The other estimated coefficients conform with a priori expectations. Specifically, week-of-run and contemporaneous theatres have negative and positive signs, respectively. Again, both are statistically significant. Inclusion of film fixed effects in the second column of Table 3 purges some of the endogeneity between revenue and downloads that exists because of unobservable shared tastes for more popular films. Notably, the estimated coefficient on contemporaneous downloads is reduced but is still significantly positive. The third and fourth columns represent the first- and second-stage GMM estimation defined in eq. , respectively. The first-stage estimates reveal a strong and significant positive relation with downloads in other states and the US DVD release date, but an insignificant relation with trend (t), although the sign is as expected. Using the Kleibergen and Paap (2006) LM test for under-identification, the Cragg and Donald (1993) test for weak identification, and the Hansen J test for over-identification (see, for example, Hayashi 2000), we find our instruments reject under- and weak-identification but do not reject over-identification. However, it is well known that this test is sensitive to large sample sizes.
|1st stage||2nd stage|
|US DVD Release_it||0.166***|
|Under identified (P-Value)||7008.3|
|Weakly identified (P-Value)||46987.5|
|Over identified (P-Value)||808.3|
The second-stage results of the GMM estimation reveal a further decrease in the magnitude of the estimated coefficient on contemporaneous downloads when compared to the OLS specification with film fixed effects. The positive relation between downloading activity and revenues is still apparent and significant, but the magnitude of the effect has been substantially reduced. This supports the view that the excluded variables (instruments) are helping to identify the model by reducing the positive bias induced by the endogeneity of downloads. The other estimated coefficients in the second-stage regressions are similar in magnitude and sign to the OLS results with film fixed effects in column 2.
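The direction of the bias described here can be illustrated on simulated data. The sketch below is not the paper's estimator: it uses plain two-stage least squares with a single instrument rather than GMM with film fixed effects, and every number in it (sample size, coefficients, the assumed true effect of -0.2) is made up for illustration. An unobserved popularity shock raises both downloads and revenue, which pushes the OLS coefficient upward, while the instrumented estimate moves back towards the assumed true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_effect = -0.2                      # assumed "true" displacement effect

popularity = rng.normal(size=n)         # unobserved taste shock driving both series
instrument = rng.normal(size=n)         # stands in for downloads in other states
downloads = 0.8 * instrument + popularity + 0.5 * rng.normal(size=n)
revenue = true_effect * downloads + 2.0 * popularity + 0.5 * rng.normal(size=n)

def slope(y, x):
    """OLS slope of y on x, with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Naive OLS: biased upward because popularity is omitted.
ols = slope(revenue, downloads)

# 2SLS: first stage fits downloads on the instrument,
# second stage regresses revenue on the fitted values.
Z = np.column_stack([np.ones(n), instrument])
fitted = Z @ np.linalg.lstsq(Z, downloads, rcond=None)[0]
iv = slope(revenue, fitted)

print(f"OLS: {ols:.2f}, 2SLS: {iv:.2f}")
```

In this toy setting the instrument is valid by construction, since it is drawn independently of the popularity shock; in the paper that independence is an assumption argued for in the text, not something the data can guarantee.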
Time-invariant and film-specific variables, such as budget, cast/director appeal, (pre-release) advertising, genre and classification rating, are implicitly captured in our model by the inclusion of film fixed effects. To examine the contributions of some of these variables, we extract individual film fixed effects and consider them against the time-invariant film-specific variables observed in our data set. The correlation between budget and the extracted fixed effects is 0.50 (compared to a correlation of 0.65 with total film revenues), and the correlation between (national) opening week screens and fixed effects is 0.72 (compared to a correlation of 0.82 with total film revenues). In a basic OLS regression with fixed effects as the dependent variable, the estimated coefficients of both (log) budget and (log) opening screens are positive and significant at the 1% level, with an R² = 0.44. Both relationships still hold at 1% when controls are added for the categorical variables of sequel, genre and classification rating, with R² = 0.64. In comparison, a regression of (log) total revenue on the same set of covariates found similar explanatory evidence in terms of sign and significance, with R² = 0.81. To the extent that the extracted fixed effects correlate strongly with key variables that have been shown to be important attributes of overall film revenues, these findings support the use of film fixed effects to capture the time-invariant determinants of demand for the films we observe.
5.2 Dynamic Stock of Downloads
As discussed in Section 4, we consider it likely that individuals make theatre attendance decisions after a download decision. Under this assumption, it would be more appropriate to consider the number of downloads over some window of time prior to the actual date at which box office is observed. Table 4 provides evidence when this window of time is considered at one, two, three and four weeks prior to (and including) the date of observation. We report only results from the fixed effects and GMM models with all three instruments (i.e. sum of downloads from other states, US DVD release date and trend). In all four models, T = 7, 14, 21, 28, the (log) downloads coefficient from the GMM estimation is less than the OLS fixed effects estimate (as also observed in the contemporaneous results reported in Table 3), providing evidence of a reduction in the inherent bias of this coefficient. However, unlike the contemporaneous case, the relationship between downloads and revenues is now negative and statistically significant, implying a sales displacement effect. The increasing (absolute) magnitude of the download coefficient over the four models, T = 7, 14, 21, 28, is a simple manifestation of the increasing dynamic stock. The estimated coefficients suggest a sales displacement elasticity in the range 0.06–0.31. We discuss the economic interpretation of these results further below.
|ln Revenue_ist||T= 7||T= 14|
|1st stage||2nd stage||1st stage||2nd stage|
|US DVD Release_it||0.163***||0.156***|
|1st stage||2nd stage||1st stage||2nd stage|
|US DVD Release_it||0.149***||0.142***|
Regarding the first-stage results, the primary instrument relating to download activity in other states continues to be positive (as expected) and highly statistically significant across all four models. The US DVD release date and trend variables are also highly significant with positive signs. This is consistent with our a priori intuition that the cost of downloading is inversely related to the availability of a DVD-quality torrent file and that download costs have fallen due to the changing structural features of the Australian internet service provision industry.
6 Discussion of Estimation Results
6.1 Instruments and Identification
Correct causal inference depends critically on the model being correctly specified. In modelling a potential sales displacement effect, the inherent difficulty lies in purging the simultaneity between downloads and revenues, which manifests in a positive bias on the estimated coefficient in the absence of remedial measures. Fixed effects provide a solution for the time-invariant component of the endogeneity in panel data sets such as ours, as the (time-invariant) heterogeneity between films can be accommodated. However, full identification of the effect requires an instrumental variable which also has some temporal variation and therefore removes remaining time-varying unobserved heterogeneous effects. We argue that our instruments are strong and satisfy both economic and statistical requirements. However, it is necessary to restate that the temporal endogeneity we treat with our instruments (specifically, downloads in other states) is related to what we term local (state) effects, rather than global (national) effects. Examples include temporal demand shocks that occur due to local media effects and/or local network effects.
Intuitively, identification beyond fixed effects stems from an assumption that temporal changes in downloading activity relate more to technical aspects of file sharing rather than unobserved temporal (global) shocks to demand. This is not to say that there is no unobserved relation between box office revenues and downloads, but simply to say that it is not one that is systematically time varying beyond that accounted for by the inclusion of other control variables (e.g. theatres) and is thus accounted for by film fixed effects.
It is worth further interrogating this critical assumption. As discussed previously, there is similarity in approach to the use of (average) price in geographically separated markets (also known as Hausman-type instruments) to instrument price in a focal market when modelling demand for differentiated products. However, many researchers have also noted that this approach becomes invalid if unobserved demand shocks are correlated between markets. For example, national advertising/marketing campaigns may generate an unobserved shock to demand and could potentially increase prices. In our context, advertising is one possible source of time-varying heterogeneity which may affect both downloads and revenues at the (global) national level and is not observed in our model. However, it is well documented that the large majority of typical advertising spending on a movie occurs pre-release (typically in excess of 90%) and therefore would not affect demand temporally beyond what is captured by the film fixed effect.
Similar concerns may also be put forward for other types of post-release promotion of the film. This could, for example, be critics’ reviews or award nominations and/or wins. While we cannot rule out these effects entirely, most films are reviewed prior to (or coinciding with) opening weekend. Additionally, the release delay between the US and Australian markets means many reviews are already available online prior to the Australian release. Award nominations/wins are potentially an issue that could cause an unobserved demand shock to both revenues and downloads. However, the relatively low number of films in our sample which were playing in cinemas when major award nominations and/or wins took place leads us to believe this would not be a significant effect. Moreover, the types of films we consider in our sampling frame (i.e. US wide release films) are often not those that are typically nominated or receive major awards. 
While we have explained the theoretical justification for why our instruments, in particular the summed downloads of other states, are valid, it is also of value to evaluate the statistical evidence a little more closely. The results of the first-stage regressions reveal a particularly high R², which indicates high explanatory power of the (included and excluded) instruments. This may create concern that the correlation is too high in the sense that the first stage is just recovering the endogenous variable. This would certainly be true if there was near perfect correlation between downloads in the focal market and other states. However, simple regressions of (log) downloads in the focal state on (log) downloads in other states for T = 1, 7, 14, 28 reveal R² in the range 0.22–0.34. In terms of simple correlations, the range is 0.47–0.58 between the endogenous (log) download variable and the respective (log) downloads in other states. It is apparent that even though there is reasonably strong correlation between the endogenous variable and the instrument (the statistical requirement), the first stage is far from simply recovering the endogenous variable, and the other instruments play an important role. As well, in terms of the statistical requirement that the instrument is uncorrelated with revenues, the condition appears to be well satisfied, with simple correlations in the range −0.11 to 0.08. More formally, it was also observed that the statistical tests rejected under-identification and weak-identification, but not over-identification. However, as noted above, we believe this is in part due to the large sample size.
6.2 Opening Week Revenues and Release Delay
One potential criticism of our model is that we are observing daily film revenues over the theatrical life of a film, which would typically be decreasing, whereas the dynamic stock variable may be increasing. Although we include a week-of-run variable in both the first- and second-stage regressions, this potentially inverse relation may be driving the results. To address this potential issue, we restrict the model described in eq.  to only model revenues of films in their opening week of theatrical release. This means the week-of-run variable is now redundant, as all films are observed in their first week of release. Table 5 provides results for the second-stage regressions for the contemporaneous and dynamic models. In all cases the coefficient on the downloads variable is negative and statistically significant, implying that first-week revenues are subject to a sales displacement effect. As illustrated graphically in Figure 1, the fact that many films are subject to a release delay between the US and Australian markets seems a logical explanation of the decreased first-week sales when there is an opportunity for download prior to the opportunity for legal consumption. In our sample of 166 films, the average release gap between the US and Australian markets was 28 days (median of 13 days). One film, Thor, was released in Australia two weeks prior to the US release, a further six films were released in Australian cinemas one week prior to the US release, and 50 films had a simultaneous release with the US release (opening within one day of the US release). The remainder of the films had a positive release gap, with the greatest gap being for Diary of a Wimpy Kid, which opened in Australian cinemas six months after its US release.
|ln Revenue_ist||GMM 2nd stage|
A simple regression of (log) downloads (observed within the first week) on observation date relative to the US release (controlling for day-of-week, state and film) retrieved a significant positive relation, suggesting that first-week downloads increase the more time has elapsed since the US release, ceteris paribus. Given that illegal copies of films typically show up after the US release, this evidence supports the view that the release delay between the US and Australian markets provides opportunities for illegal consumption before the film is theatrically released in Australia. When release delays between the US and Australian markets are significant, and extend beyond the US DVD release date, the likelihood of high-quality torrent files appearing on networks increases, and increased levels of downloading would displace more sales.
6.3 Weekly Model
The model outlined in Section 4 was estimated with daily data. To examine whether this feature of our data has any bearing on results, we also consider a model where revenue and downloads are considered at the weekly level, rather than daily. We do this for both the contemporaneous and dynamic stock models. The contemporaneous model is analogous to eq.  but now the time subscript t represents a week, rather than a day, and the downloads variable is redefined accordingly. The theatres variable is redefined so that it now represents the maximum number of theatres screening on any day of that week for the film of interest. Also, the day-of-week dummy variables are now redundant. The dynamic stock model is similarly redefined in terms of weeks, with the horizon now representing the number of weeks included prior to week t. We consider dynamic horizons in which downloads are observed for one, two, three and four weeks prior to, and including, week t.
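The move from daily to weekly observations can be sketched as follows. This is a hedged illustration only: the row layout, the function name `to_weekly`, and the convention that days are counted from 0 at the start of a film's run are assumptions, as is the choice to sum revenue and downloads within the week (the text states only that theatres take the weekly maximum).

```python
from collections import defaultdict

def to_weekly(daily_rows):
    """Aggregate daily rows to (film, state, week) records.

    Each row is (film, state, day, revenue, downloads, theatres), with `day`
    counted from 0 at the start of the run (an assumption).
    Revenue and downloads are summed; theatres takes the weekly maximum.
    """
    weekly = defaultdict(lambda: {"revenue": 0.0, "downloads": 0, "theatres": 0})
    for film, state, day, revenue, downloads, theatres in daily_rows:
        record = weekly[(film, state, day // 7)]
        record["revenue"] += revenue
        record["downloads"] += downloads
        record["theatres"] = max(record["theatres"], theatres)
    return dict(weekly)

# Toy rows (made-up numbers): two days in week 0, one day in week 1.
rows = [
    ("A", "NSW", 0, 100.0, 5, 10),
    ("A", "NSW", 6, 50.0, 3, 12),
    ("A", "NSW", 7, 20.0, 1, 8),
]
weekly = to_weekly(rows)
print(weekly[("A", "NSW", 0)])
```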
Results for the weekly model are reported in Table 6, which displays second-stage regression results for the contemporaneous and dynamic stock models. In the contemporaneous and one-week models, a positive and significant relation between downloads and revenues is observed. However, in the two-, three- and four-week models the relationship is negative, implying a statistically significant sales displacement relationship. The magnitude of the estimated coefficients suggests a 0.15% reduction in revenue for a 1% increase in downloads over the four weeks prior to, and including, the week of observation.
|ln Revenue||GMM 2nd stage|
6.4 Forward Looking Dynamic Stock of Downloads
It has been argued that the dynamic stock approach to considering future sales displacement is more realistic than a contemporaneous approach. It might also be possible, however, that an individual forgoes cinema attendance today by making a decision to download the title at some point in the future. Obviously, this argument relies on the title being available for download and that the intention is actually carried out. To test whether downloaders forgo theatrical consumption prior to actually downloading the file, we consider a simple modification of our previously specified dynamic model  in which T is now forward looking. We denote this forward looking dynamic stock as and define for the horizons . Similarly our instrument is also redefined as for all .
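A forward-looking stock can be sketched by reversing the direction of the window. The function name and data layout below are hypothetical, and whether date t itself belongs in the forward window is an assumption here (it is included), since the defining expression is not recoverable from the text.

```python
def forward_stock(daily, T):
    """Sum downloads over the T days starting at, and including, each date.

    `daily` maps (film, state, day) -> downloads, with `day` an integer index.
    Days absent from `daily` count as zero downloads (an assumption).
    """
    return {
        (film, state, day): sum(daily.get((film, state, day + k), 0) for k in range(T))
        for (film, state, day) in daily
    }

# Toy data: one film in one state over three days (made-up numbers).
daily = {("A", "NSW", 1): 10, ("A", "NSW", 2): 20, ("A", "NSW", 3): 30}
forward = forward_stock(daily, T=2)
print(forward[("A", "NSW", 1)])
```

The instrument would be rebuilt from these forward-looking stocks in the same way as before, by totalling the stock over all other states for the same film and date.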
As Table 7 demonstrates, the estimated coefficients on the (log) downloads variable across all specifications remain positive, implying no displacement effect. Taken with the evidence of the (backward-looking) dynamic stock model, this might suggest that individuals who partake in downloading (and substitute it for paid cinematic consumption) are impatient and seek out films on torrent sites early in their theatrical life, often prior to the official release if the film has already been released overseas, as discussed above. This observation is also consistent with the “movie-maven” subculture among (particularly young and tech-savvy) consumers who desire to see a film early and before the masses.
|ln Revenue_ist||GMM 2nd stage|
6.5 Sales Displacement Effects
Although we have detected a statistically significant negative impact from file sharing on box office sales, the economic significance of these effects appears relatively small. We demonstrate this by providing some “back-of-the-envelope” numbers concerning the potential sales displacement effects of piracy implied by our results. Table 8 provides estimates of sales displacement or “substitution rates” for the daily, opening week and weekly models as presented in Tables 4–6. 
|Daily Model||T= 7||T= 14||T= 21||T= 28|
|Adjusted median downloads [a]||680||1,345||1,962||2,485|
|Opening week model||T= 7||T= 14||T= 21||T= 28|
|Adjusted median downloads [a]||371||565||665||725|
|Weekly model||=1||=2||=3||=4 [b]|
|Adjusted median downloads||–||1,562||1,933||2,232|
If we focus on the daily T = 7, 14, 21, 28 models, we observe sales displacement elasticities of 0.06, 0.16, 0.25 and 0.31, respectively. Given that the median daily box office of films in our sample is A$3,593 (Table 2), this translates to a reduction in revenue of between A$2.06 and A$11.01, or between 0.16 and 0.86 admissions assuming an average ticket price of A$12.87 (the average ticket price reported by Screen Australia for 2011), for a 1% increase in downloading activity across these time horizons. Given that the state/territory-level median numbers of downloads over one, two, three and four weeks are 680, 1,345, 1,962 and 2,485 (adjusting for Peer Media Technology’s estimate of a 55% market coverage), this suggests that somewhere between 29 and 42 downloads displace one purchased ticket each day, depending upon which model is considered. Put another way, for every 100 downloads somewhere between 2.4 and 3.4 cinema admissions are displaced.
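The arithmetic behind these substitution rates can be retraced with a short sketch. It takes the reported revenue reductions (A$2.06 and A$11.01), the ticket price and the adjusted median downloads directly from the text; the function and variable names are illustrative only. Starting from the rounded elasticities instead would give slightly different dollar figures, presumably because the published elasticities are rounded.

```python
TICKET_PRICE = 12.87  # A$, Screen Australia average ticket price for 2011

def downloads_per_ticket(revenue_loss, median_downloads):
    """Downloads needed to displace one admission, given a 1% rise in downloads.

    revenue_loss: A$ revenue reduction associated with the 1% rise.
    median_downloads: adjusted median downloads over the window.
    """
    admissions_lost = revenue_loss / TICKET_PRICE
    extra_downloads = 0.01 * median_downloads
    return extra_downloads / admissions_lost

short_window = downloads_per_ticket(revenue_loss=2.06, median_downloads=680)   # T = 7
long_window = downloads_per_ticket(revenue_loss=11.01, median_downloads=2485)  # T = 28

# Downloads per displaced ticket, then admissions displaced per 100 downloads.
print(round(short_window, 1), round(long_window, 1))
print(round(100 / short_window, 1), round(100 / long_window, 1))
```

The two ends of the range come from the one-week and four-week models: the longer window has a larger elasticity but also a much larger stock of downloads, so fewer downloads are needed per displaced admission.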
When considering the opening week and weekly models, we find lower levels of displacement for the weekly model but substantially higher levels of sales displacement for opening week revenues. Given the range of elasticities from Table 5, and the median opening week daily revenue of A$31,046, we estimate that a 1% increase in downloads displaces between 1.65 and 2.01 paid admissions. With the median number of downloads ranging between 371 (T=7) and 725 (T=28), these levels imply that anywhere from 2.2 to 3.8 downloads displace one paid admission. This rate appears relatively high, but in terms of economic significance the overall potential effect is low because of the relatively low levels of downloading actually taking place in the first week. In part this reflects the large number of films with simultaneous releases in the United States and Australian markets, but it is apparent that the longer a film’s release is delayed between the two markets, the more likely the title is to appear on torrent sites, which in turn increases the number of illegal downloads.
In terms of the weekly model’s results in Table 6, and given the median weekly revenue of A$19,142, the estimated displacement for T=2, 3 and 4 (weeks) suggests that between 6.0 and 8.3 downloads displace one paid admission over a weekly time horizon. This ratio is lower than in the daily model, where it took between 29 and 42 downloads over the various windows considered to displace a sale on a given day. In the weekly model, we are observing approximately the same time frame for downloads but now a week for revenue/admissions. However, the calculation is not simply a division by seven, as the medians are not related by this scale. As it turns out, the estimated displacement effect is larger in the daily model if the substitution rate is simply multiplied by seven.
To put these “back-of-the-envelope” calculations in perspective, assume that (as some industry reports implicitly suggest) one download displaces one paid admission. Then the total lost revenue of the median film is in the order of A$2.34m (adjusting the observed median downloads of 100,000, see Table 1, for the 55% estimated market coverage, and assuming a ticket price of A$12.87), or about 38%. If we instead apply the maximum (daily) substitution rate of 3.5 admissions per 100 downloads, keeping the same median downloads and ticket price, this would imply total lost revenues of A$82k (1.3%) for the median film.
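The two scenarios compared above reduce to a few lines of arithmetic, sketched here with the figures quoted in the text. The constant names are illustrative, and the substitution rate is read as 3.5 admissions displaced per 100 downloads.

```python
MEDIAN_OBSERVED_DOWNLOADS = 100_000  # median film, Table 1
MARKET_COVERAGE = 0.55               # estimated share of downloads observed
TICKET_PRICE = 12.87                 # A$, 2011 average ticket price
ADMISSIONS_PER_100_DOWNLOADS = 3.5   # maximum daily substitution rate

# Scale observed downloads up to the estimated total.
adjusted_downloads = MEDIAN_OBSERVED_DOWNLOADS / MARKET_COVERAGE

# Scenario 1: every download displaces one paid admission.
loss_one_to_one = adjusted_downloads * TICKET_PRICE

# Scenario 2: the estimated substitution rate applies instead.
loss_estimated = loss_one_to_one * ADMISSIONS_PER_100_DOWNLOADS / 100

print(f"A${loss_one_to_one:,.0f} versus A${loss_estimated:,.0f}")
```

The gap between the two figures is the ratio of the substitution rates alone, since downloads and ticket price are held fixed across scenarios.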
This study has investigated digital piracy in the context of the Australian theatrical film industry. We find evidence of a sales displacement effect from illegal downloading on box office revenues. However, our estimates suggest that (at least at present) the economic magnitude of this effect is small. One particular issue our study sheds light on is that piracy behaviour increases with the release gap between the US and Australian markets. Opening week revenues were shown to decline significantly because of downloads that occurred prior to the theatrical release. This finding is unsurprising and provides a partial explanation for the observed and growing trend of day-and-date worldwide releases, particularly for blockbuster titles.
Whether the theatrical film industry is likely to suffer revenue declines similar to those observed in the music industry is yet to be seen. Certainly there are important differences between the two industries, such as the large size of film files relative to music files, as well as the extent to which a download provides a substitute for the social experience of cinematic consumption. Also, over the time frame of our study, Australian broadband internet plans and speeds were often restrictive for downloading films, but this will change dramatically in the very near future, especially with the roll-out of the National Broadband Network (NBN).
We are grateful to participants of the University of Sydney Microeconometric and Public Policy working group (Sydney, October 2012), Motion Picture Scholars Workshop (Los Angeles, November 2012), University of Melbourne Institute of Applied Economic and Social Research seminar series (Melbourne, February 2013), 42nd Australian Conference of Economists (Perth, July 2013), Econometric Society Australasian Meeting 2013 (Sydney, July 2013), Macquarie University seminar series (Sydney, August 2013), and 18th International Conference of the Association of Cultural Economics International (Montreal, July 2014) for comments that have helped to improve the manuscript. We are also grateful for comments of an anonymous referee. Any remaining errors are our own.
Adermon, A., and C. Y. Liang. 2014. “Piracy and Music Sales: The Effects of an Anti-Piracy Law.” Journal of Economic Behavior and Organization 105:90–106. doi:10.1016/j.jebo.2014.04.026
Australian Bureau of Statistics. 2011. Catalogue 8153.0. Internet Activity Australia, June 2011.
Bhattacharjee, S., R. Gopal, K. Lertwachara, J. Marsden, and R. Telang. 2007. “The Effect of Digital Sharing Technologies on Music Markets: A Survival Analysis of Albums on Rankings Charts.” Management Science 53:1359–74. doi:10.1287/mnsc.1070.0699
Britton, S. 2011. Exclusive – Australian downloaders targeted. www.mediawave.tv March 24. Accessed March 24, 2011. http://www.mediawave.tv/site/item.cfm?item=F695F6940631EA84670E9237FA46C46E.
TERA Consulting. 2010. Building a digital economy: the importance of saving jobs in the EU’s creative industries. Accessed March 31, 2010. http://www.droit-technologie.org/upload/dossier/doc/219-1.pdf.
Danaher, B., and M. Smith. 2014. “Gone in 60 Seconds: The Impact of the Megaupload Shutdown on Movie Sales.” International Journal of Industrial Organization 33:1–8. doi:10.1016/j.ijindorg.2013.12.001
Danaher, B., M. D. Smith, R. Telang, and S. Chen. 2014. “The Effect of Graduated Response Anti-Piracy Laws on Music Sales: Evidence from an Event Study in France.” Journal of Industrial Economics 62:541–53. doi:10.2139/ssrn.1989240
Danaher, B., and J. Waldfogel. 2012. Reel piracy: The effect of online film piracy on international box office sales. Unpublished manuscript, Wellesley College. doi:10.2139/ssrn.1986299
Ernesto. 2012. Who’s pirating Game of Thrones, and why? www.torrentfreak.com, May 20. Accessed May 20, 2012. http://torrentfreak.com/whos-pirating-game-of-thrones-and-why-120520/.
Hausman, J., G. Leonard, and J. Zona. 1994. “Competitive Analysis with Differentiated Products.” Annals of Economics and Statistics 34:159–80.
Hayashi, F. 2000. Econometrics. Princeton, NJ: Princeton University Press.
Liebowitz, S. 2008. “Testing File Sharing’s Impact on Music Album Sales in Cities.” Management Science 54:853–9.
Michel, N. 2006. “The Impact of Digital File Sharing on the Music Industry: An Empirical Analysis.” The B.E. Journal of Economic Analysis and Policy 6 (1). Article 18. doi:10.2202/1538-0653.1549
Mitra-Kahn, B. H. 2011. “Copyright, Evidence and Lobbynomics: The World After the UK’s Hargreaves Review.” Review of Economic Research on Copyright Issues 8:65–100.
Peitz, M., and P. Waelbroeck. 2004. “The Effect of Internet Piracy on Music Sales: Cross-Section Evidence.” Review of Economic Research on Copyright Issues 1:71–9.
Peitz, M., and P. Waelbroeck. 2006. “Piracy of Digital Products: A Critical Review of the Theoretical Literature.” Information Economics and Policy 18:449–76. doi:10.1016/j.infoecopol.2006.06.005
Rob, R., and J. Waldfogel. 2006. “Piracy on the High C’s: Music Downloading, Sales Displacement, and Social Welfare in a Sample of College Students.” Journal of Law and Economics 49:29–62. doi:10.3386/w10874
Stevans, L., and D. Sessions. 2005. “An Empirical Investigation Into the Effect of Music Downloading on the Consumer Expenditure of Recorded Music: A Time Series Approach.” Journal of Consumer Policy 28:311–24. doi:10.1007/s10603-005-8645-y
Waldfogel, J. 2012. “Music Piracy and Its Effects on Demand, Supply and Welfare.” In Innovation Policy and the Economy Volume 12, edited by Lerner, J. and Stern, S., 91–109. Chicago, IL: University of Chicago Press, NBER. doi:10.1086/663157
Walls, W. D. 2008. “Economics of Motion Pictures.” In The New Palgrave Dictionary of Economics, Volume 5, edited by Blume, L. and Durlauf, S., 787–91. London: Palgrave Macmillan Ltd.
Walls, W. D. 2014. “Bestsellers and Blockbusters: Movies, Music, and Books.” In Handbook of the Economics of Art and Culture, edited by Ginsburgh, V. and Throsby, D., 185–214. Amsterdam: North Holland. doi:10.1016/B978-0-444-53776-8.00008-8
Zentner, A. 2005. “File Sharing and International Sales of Copyrighted Music: An Empirical Analysis with a Cross Section of Countries.” The B.E. Journal of Economic Analysis and Policy 5 (1). Article 21. doi:10.2202/1538-0653.1452
Zuel, B. 2012. Australians world’s worst for illegal music downloads. Sydney Morning Herald, September 19.
©2016 by De Gruyter