A replication plan for “Does social media reduce corruption?” (Information Economics and Policy, 2017)

The importance of replicating economic research to improve the validity of findings has been the topic of an ongoing discussion, but there is no consensus on what that means in practice. This article discusses a rationale for replicating a study and offers a plan for how one might approach a replication of an actual study. (Published in Special Issue The practice of replication) JEL B40 D73


Introduction
The importance of replicating empirical research in economics has been advanced by numerous studies. The difficulties associated with replicating studies have also been discussed at length (Duvendack et al., 2017; Lindsay and Ehrenberg, 1993, among others). The topic this paper will address is an approach to actually doing a meaningful replication.
Before looking at a specific approach to performing a replication, it is important to consider a rationale for the replication. For this, assume a model with three actors: an original researcher, a replicator, and a consumer of the research output, who is dependent on the researcher and the replicator to determine what is true. Replications are of value if the consumer gains more certainty about the reliability of the results obtained by the researcher. Value would be obtained if it is assumed that the researcher may be fallible and/or biased, which may include being unethical, leading to erroneous or biased results, while the replicator is an infallible and unbiased discerner of the truth. Under this scenario, a replicator's acceptance, rejection, or clarification of the researcher's results would, therefore, create value for the consumer. If, on the other hand, the researcher is an infallible and unbiased discerner of the truth and the replicator is fallible and/or biased, then the replicator's rejection or clarification of the researcher's results would unfairly raise doubts and may unjustly reduce the value of the original research.
Given the possibility that both the researcher and the replicator may be fallible and/or biased, the question arises, what is the value of a replication to the consumer? The replication may lead to the unfortunate outcome of confirming erroneous or biased results, or rejecting correct results. The replication may also lead to the fortunate outcome of rejecting erroneous or biased results or confirming correct results. If the consumer is not able to determine the truth in the results of the researcher or the replicator, the consumer may rely on other means, such as author or journal reputations, to determine what to believe. In such situations, a replication is of limited or no value.
The challenge, therefore, is how to perform a replication that creates value for the consumer. Since it cannot be assumed that all replicators will be infallible and unbiased, a set of procedures that reduces the fallibility and/or bias of a replicator would increase the credibility of the replication and, therefore, provide value to the consumer.

Replication Principles
Since there is no agreed-upon definition of what a replication is, the replicator first needs to define the scope of the replication. A number of papers dealing with replications have presented frameworks for classifying types of replication studies (Clemens, 2017; Morrison et al., 2010). Replications are usually differentiated along two dimensions: the data used and the methods employed to analyze the data. This paper will use a slightly different approach by starting with the question that replication studies should be dealing with, which is, are the results of the original study reliable? Or, put in a more operational form, do the results hold up to constructive scrutiny? Looked at from this perspective, a replication can range from a quality control exercise, to an expanded robustness check based on an evaluation of the data considered and the methods employed, to further extensions of the original study. Types of replications can differ based on how far along that continuum the replicator journeys.
In deciding the scope of the replication, one should consider the validity of the approach to be taken, which is why the above framework is suggested. If the replication doesn't start with the original data and the methods employed in the original study to determine if the results can be truly replicated, then the results from the replication can easily be dismissed if they differ from the results of the original study. By confirming that the original results can be duplicated using the original data and, optimally, the original code, any criticism of the original study can be based on full knowledge regarding how the original results were obtained. If the original results cannot be replicated as a result of honest attempts to do so (a discussion with the authors of the original study should be part of this process), then it is safe to say that the results of the original study do not hold up to constructive scrutiny and, therefore, are unreliable.
Since replications are not universally performed on all published empirical research in economics, an additional issue that the replicator should consider is the process used to choose a particular study to replicate. One method has been to replicate all papers in a particular journal or dealing with a particular topic. By applying the same approach to a number of papers, the replicator avoids the criticism of singling out a particular paper for excessive scrutiny. While this approach has proven useful in providing evidence that replications should more routinely be performed, it may be impractical if the objective is to go beyond seeing whether the researcher's data and code can be used to obtain identical results.
Since one motivation for selecting a particular study to replicate is that something seems "wrong" with the results, it may seem that the replicator is merely trying to undermine the results of the original study rather than constructively scrutinizing how they were obtained. Given this potential bias in selecting a particular study, the replicator should clearly indicate where the results of the replication of the original study end and where anything new, in terms of data or methodology, is being introduced. Subjecting the original results to constructive scrutiny can be seen as equivalent to cross-examination as employed by the court system. Just as in courts there are rules to make the system fairer, there should be rules that replicators follow to preserve the integrity of the process. The rest of this paper expands on these guidelines by showing how they can be used to develop a plan to replicate an actual paper.

Replication Plan
The paper chosen for this purpose is "Does Social Media Reduce Corruption?" by Jha and Sarangi (2017). As the title suggests, Jha and Sarangi set out to answer the question: Does social media reduce corruption? While accepting the broader claim that all research should be replicated before being accepted, the author chose this particular paper for the replication discussion to highlight a number of issues in developing a replication plan. As will be seen, these issues relate to the data, the methods, and a consideration of the conclusions drawn. Before discussing a replication plan for this paper, a brief summary of the main result is provided.
The primary finding of the Jha and Sarangi paper is that Facebook penetration in a country, used as a proxy for social media penetration, is negatively correlated with corruption levels. To analyze this relationship, they start by estimating the following equation, where i indexes countries:
$$\text{Corruption}_i = \alpha + \beta\,\text{Facebook}_i + \delta\,\text{Internet}_i + \gamma_1 \log(\text{GDPPC}_i) + \gamma_2\,\text{Political Rights}_i + \gamma_3 \log(\text{Population}_i) + \gamma_4\,\text{Christian}_i + \gamma_5\,\text{Muslim}_i + \varepsilon_i$$
The Corruption measure used is the negative transformation of the 2012 Control of Corruption Index, which is published by the World Bank and, as published, ranges from -2.5 to 2.5 with 2.5 representing the lowest level of corruption. The Facebook variable represents Facebook penetration and comes from an organization called Quintly. The Internet variable represents internet penetration, and GDPPC is per capita GDP. The data for both of these come from the World Bank, along with the data for the Population variable. The Political Rights variable is published by Freedom House and ranges from 1 to 7, with 1 representing the best level of political rights. The Christian and Muslim variables represent the proportion of the population identified as such and are published by the Association of Religion Data Archive. The year for the data is 2012, except for the religion variables, which are from 2005. Data for all of these variables were available for 170 countries.
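As a rough illustration of how a cross-country specification of this form might be estimated, the following Python sketch fits an OLS regression with an intercept and seven regressors. It is not the authors' code: the function names (fit_ols, simulate), the column labels, and the simulated data are all assumptions made for illustration, and the standard errors are the conventional (non-robust) ones, which may differ from what the original paper reports.

```python
import numpy as np

# Hypothetical column labels mirroring the regressors in the baseline equation.
REGRESSORS = ["facebook", "internet", "log_gdppc", "political_rights",
              "log_population", "christian", "muslim"]


def fit_ols(X, y):
    """Fit y = a + X b + e by OLS.

    Returns (coefficients, standard errors, t-statistics); index 0 is the
    intercept, index 1 the first regressor, and so on. Standard errors are
    the conventional homoskedastic ones (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = n - design.shape[1]                      # residual degrees of freedom
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


def simulate(n=170, beta_facebook=-0.0114, seed=0):
    """Generate SIMULATED data (not the authors' data) with a known slope."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(REGRESSORS)))
    X[:, 0] = rng.uniform(0, 60, size=n)           # made-up penetration, in percent
    true_beta = np.zeros(len(REGRESSORS))
    true_beta[0] = beta_facebook
    y = 0.5 + X @ true_beta + rng.normal(scale=0.1, size=n)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    beta, se, t = fit_ols(X, y)
    print("Facebook coefficient:", beta[1], "t-statistic:", t[1])
```

In an actual replication, the simulated arrays would be replaced by the authors' dataset, and the estimator would be matched to whatever the original code uses.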
The OLS estimates of this model found the coefficient of Facebook to be highly significant (p<0.01) and of the correct (negative) sign to indicate a negative relationship between Facebook penetration and level of corruption. Alternative specifications of the model presented also show Facebook penetration to be significant and of the correct sign. The coefficients of Facebook presented in the paper range from -0.0114 to -0.0163. As the authors point out, this means that by the most conservative estimate of the coefficient (-0.0114) a one-standard-deviation (18.2%) increase in Facebook penetration (mean = 20.62%) lowers the corruption measure by about 0.2, which is equivalent to raising the Control of Corruption Index by about 0.2.
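The magnitude quoted above can be checked with a line of arithmetic. The following is only a worked sketch using the figures reported in the summary (the coefficient range and the standard deviation of 18.2); the second line, for the largest coefficient in absolute value, is a calculation added here for comparison rather than a figure taken from the original paper.

```latex
\begin{align*}
\Delta\,\text{Corruption} &= \hat{\beta} \times \sigma_{\text{Facebook}}
  = -0.0114 \times 18.2 \approx -0.207 \\
\Delta\,\text{Corruption} &= -0.0163 \times 18.2 \approx -0.297
  \quad \text{(largest reported coefficient in absolute value)}
\end{align*}
```

Because the dependent variable is the negative of the Control of Corruption Index, a change of about -0.2 in the corruption measure corresponds to a rise of about 0.2 in the index itself, on a scale that runs from -2.5 to 2.5.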
The replication plan being presented for this study would best be described as replication with constructive scrutiny to determine the extent to which the results could be described as reliable. I will assume that I, as a replicator, fall short of the standard of being infallible and unbiased, and, therefore, the replication plan assesses only the evidence presented and, where appropriate, scrutinizes choices made regarding data considered or methods used that differ from the choices made in the papers the authors reference.
The starting point in the replication would be to assess the argument made in light of the evidence presented. As mentioned in the summary of the Jha and Sarangi paper, the authors present evidence, based on data from more than 150 countries, that Facebook penetration, used as a proxy for social media, is negatively correlated with corruption. While the authors point out that they are not claiming that their evidence supports the hypothesis that Facebook penetration causes less corruption, their interest is in the significance of this social media variable. The next step would be to obtain the data and code used to generate the results presented in the paper. Some journals require these to be submitted but are sometimes rather lax about actually obtaining them. Often, the author needs to be contacted to obtain the data and the code. The data and code for the chosen paper, for instance, are not available from the journal, but the authors did make them available to me upon request. Contacting the author, even if the data are publicly available, should be done anyway to let the author know what your plans are and to raise any preliminary questions you may have. An alternative approach is to try to recreate the data based on the descriptions presented in the paper. While this is a recommended step further along in the replication plan, it is more efficient to start with the original data and code, given that the descriptions provided in papers may not be sufficient to recreate identical data because of possible revisions, corrections, transformations, etc.
Once the original data is obtained, the next step would be to determine whether the results can be duplicated. If they cannot, discrepancies should be clarified with the author rather than assuming that the replicator is infallible. Having dealt with or noted discrepancies, if any, the replicator should next look at whether there are any issues with the data. The chosen paper, for instance, relies on data from a number of sources that have been merged together. Has that been done correctly? Are there some observations contained in the dataset that are not included in the results? Would the results be different if they were included? Are there data that reasonably should have been included that are not included? For instance, in the paper chosen, it is noted that Argentina was not included. Does the data differ from the data that is currently available? Would any changes significantly alter the results? Does the description of the data in the paper match the data actually used? This all assumes that the same approach the researcher used is being used to analyze the data.
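One of the data questions above, whether the merge across sources dropped or mismatched countries, lends itself to a simple mechanical check. The Python sketch below is one hypothetical way to do it: the function name audit_merge, the source labels, and the placeholder country codes (AAA, BBB, CCC) are all assumptions for illustration and do not describe the authors' actual files.

```python
def audit_merge(sources):
    """Compare country coverage across data sources.

    sources: dict mapping a source label to an iterable of country identifiers.
    Returns (complete, missing), where `complete` is the set of countries
    present in every source and `missing` maps each remaining country to the
    sorted list of sources that lack it.
    """
    coverage = {name: set(ids) for name, ids in sources.items()}
    all_countries = set().union(*coverage.values())
    complete = set.intersection(*coverage.values())
    missing = {
        country: sorted(name for name, ids in coverage.items() if country not in ids)
        for country in sorted(all_countries - complete)
    }
    return complete, missing


if __name__ == "__main__":
    # Placeholder identifiers only; these are not real coverage facts.
    example = {
        "corruption": {"AAA", "BBB", "CCC"},
        "facebook": {"AAA", "CCC"},
        "religion": {"AAA", "BBB"},
    }
    complete, missing = audit_merge(example)
    print("In every source:", sorted(complete))
    for country, lacking in missing.items():
        print(country, "is absent from", lacking)
```

A country that appears in `missing` is a candidate for the questions raised above: was it excluded deliberately, and would including it change the results?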
The statistical methodology should also be evaluated. Do the methods actually used match the descriptions discussed in the paper? Was a reasonable approach taken to analyze the data? One danger when critically evaluating the data and/or the methodology in a replication is that the farther either deviates from the original study, the more useful it may appear to the consumer, but the less reasonable it actually may seem if the results do not support the conclusions of the original. For instance, with the paper chosen, it might seem reasonable to wonder whether changes in corruption are correlated with changes in Facebook penetration. Since this question goes beyond the scope of what the authors are arguing, it may appear to be an unreasonable basis on which to claim that the replication rejects the results of the original study but a reasonable part of an extension of the original. It might seem reasonable as part of the replication to ask why the authors did not report results that include the Protestant percentage of the population by country, given that it is a variable included in many of the papers cited. Again, though, any changes to the data or the methodology included in the replication that yield results that lead to conclusions that differ from the original should be clearly indicated and explained.

Judging Replication Success
A replication of this paper could "fail" to replicate in several different ways. The most obvious way is if the results obtained using exactly the same data and code failed to yield a statistically significant coefficient on Facebook. It could also be said to have failed if errors are found in the data, either because the data differ from the published sources or because they were assembled incorrectly (for example, data from one country matched up with another country), and correcting those errors yields a result at odds with the main results of the paper. A third way is if the actual methods used in the analysis differ from what was discussed in the paper in a way that renders the results meaningless. If the replication yields identical results or results that do not undermine the results of the original, then the replication has "successfully" replicated the original study.
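The first of these criteria can be written down as an explicit decision rule before the replication is run, which limits the replicator's discretion after seeing the results. The Python sketch below is one possible formalization, not a standard from the literature: the names Estimate and judge_replication, the tolerance, and the significance threshold alpha are assumptions, and the p-values in the usage example are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    coef: float      # estimated coefficient on Facebook penetration
    p_value: float   # p-value for that coefficient


def judge_replication(original, replicated, alpha=0.05, tol=1e-4):
    """Classify a replicated estimate against the original.

    "identical":  coefficient and p-value match within `tol`.
    "consistent": same sign as the original and significant at `alpha`.
    "fails":      wrong sign or not significant at `alpha`.
    Both `alpha` and `tol` are assumptions the replicator should fix in advance.
    """
    if (abs(replicated.coef - original.coef) <= tol
            and abs(replicated.p_value - original.p_value) <= tol):
        return "identical"
    same_sign = replicated.coef * original.coef > 0
    if same_sign and replicated.p_value < alpha:
        return "consistent"
    return "fails"


if __name__ == "__main__":
    # -0.0114 is the most conservative coefficient reported in the paper;
    # every p-value below is invented for illustration.
    original = Estimate(coef=-0.0114, p_value=0.005)
    print(judge_replication(original, Estimate(-0.0114, 0.005)))
    print(judge_replication(original, Estimate(-0.0120, 0.020)))
    print(judge_replication(original, Estimate(-0.0030, 0.400)))
```

The second and third failure modes, data errors and a mismatch between described and actual methods, require judgment about the data and code themselves and are not captured by a rule of this kind.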

Conclusion
In the end, the result of a replication is whether it confirms or rejects all or parts of the original study. In the strictest sense, the confirmation or rejection should apply only to whether or not the original results can be reproduced using the original data and code. In a broader sense, a rejection can also apply to whether there are any issues with the original data and/or code that render the results meaningless. Any issues beyond that should be seen as clarification or an extension of the original study. The value of the replication lies in its persuasiveness from the perspective of the consumer. If the replication results show no significant discrepancies from the original, it may seem as though effort was wasted needlessly,