Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects’ predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer’s disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models that explain a clinically realistic percentage of the outcome variance.
We are grateful to Xinkun Nie and Oleg Sofrygin for enlightening conversations and to Rachael C. Aikens for feedback on a draft of this article. Data collection and sharing for this project was funded in part by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. 
Data collection and sharing for this project was funded in part by the University of California, San Diego Alzheimer’s Disease Cooperative Study (ADCS) (National Institute on Aging Grant Number U19AG010483).
Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
Appendix A. Mathematical results
Throughout, we assume sufficient regularity conditions for the asymptotic normality of M-estimators to hold. The details can be found in Chapter 5 (Theorem 5.23) of van der Vaart [49].
(Rosenblum). The influence function for the linear regression treatment effect estimator we describe in Section 3 is ψ = ψ1 − ψ0 where

ψw = (1{W = w}/πw)(Y − Qβ(w, X)) + Qβ(w, X) − μw

and πw = P(W = w) and μw = E[Qβ(w, X)], with Qβ(w, X) denoting the fitted linear model’s prediction. The parameters β are those that maximize the (model-based) likelihood in expectation (under the true law of the data). In other words, β characterizes the linear model that comes as close as possible to the true conditional mean function and μw is its mean value (averaged over X).
(Difference-in-means). The “difference-in-means” (or “unadjusted”) estimator of τ = μ1 − μ0 is τ̂Δ = Ȳ1 − Ȳ0, where Ȳw = (1/nw) Σ{i: Wi = w} Yi is the sample mean outcome in arm w.
Note that throughout the appendix we omit the subscript n on estimators. E.g. τΔ is shorthand for τΔ,n and our asymptotic statements refer to the sequence of estimators as n becomes large.
The difference-in-means estimator has asymptotic variance given by

σ1²/π1 + σ0²/π0

where σw² = Var(Yw) and πw = P(W = w).
This fact is well known. One proof follows the outline of the proof of Theorem A.3 below, taking Z⊤ = [1, W]. □
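As a concrete illustration (our own minimal numpy sketch, not part of the original derivation; the function name is ours), the difference-in-means estimator and the usual plug-in estimate of its sampling variance can be written as:

```python
import numpy as np

def difference_in_means(y, w):
    # Unadjusted estimator of tau = mu1 - mu0, plus the plug-in
    # variance estimate sigma1^2/n1 + sigma0^2/n0.
    y, w = np.asarray(y, float), np.asarray(w)
    y1, y0 = y[w == 1], y[w == 0]
    tau_hat = y1.mean() - y0.mean()
    var_hat = y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size
    return tau_hat, var_hat

# Toy trial with true effect tau = 1 and 1:1 randomization.
rng = np.random.default_rng(0)
n = 1000
w = rng.binomial(1, 0.5, n)
y = 1.0 * w + rng.normal(0, 1, n)
tau_hat, var_hat = difference_in_means(y, w)
```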
(ANCOVA I). The “ANCOVA I” estimator of τ = μ1 − μ0 (denoted τ̂I) is the effect estimated using a linear regression with predictors Z ⊤ = [1, W, X ⊤ ] and outcome Y.
(ANCOVA II). The “ANCOVA II” estimator of τ = μ1 − μ0 (denoted τ̂II) is the effect estimated using a linear regression with predictors Z ⊤ = [1, W, (X − X̄)⊤, W(X − X̄)⊤] (the ANCOVA I predictors plus treatment-by-covariate interactions, with covariates centered so that the coefficient on W retains its marginal interpretation) and outcome Y.
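To make the two specifications concrete, here is a minimal numpy sketch (our own illustration; the centering convention in the interacted model is an assumption about the intended specification):

```python
import numpy as np

def ancova_estimators(y, w, x):
    # ANCOVA I: regress Y on [1, W, X].
    # ANCOVA II: additionally include treatment-by-centered-covariate
    # interactions. Returns the coefficient on W from each fit.
    y, w = np.asarray(y, float), np.asarray(w, float)
    x = np.atleast_2d(np.asarray(x, float).T).T  # ensure shape (n, p)
    xc = x - x.mean(axis=0)                      # centered covariates
    Z1 = np.column_stack([np.ones_like(w), w, x])
    Z2 = np.column_stack([np.ones_like(w), w, xc, w[:, None] * xc])
    b1, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    b2, *_ = np.linalg.lstsq(Z2, y, rcond=None)
    return b1[1], b2[1]

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
w = rng.binomial(1, 0.5, n)
y = 2 * x + 1.0 * w + rng.normal(size=n)  # true effect tau = 1
tau1, tau2 = ancova_estimators(y, w, x)
```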
The following two Theorems A.3 and A.4 are mild generalizations of, or follow closely from, results stated in Leon et al. [24] and Yang and Tsiatis [16]. Details are provided here for the reader’s convenience.
The ANCOVA I estimator is asymptotically unbiased for τ = μ1 − μ0 and has asymptotic variance given by
where , , and .
where , , and τ = μ1 − μ0. Thus . In this equation and from here on, let . So clearly . Then, from Eq. (7),
where . An application of Lemma A.1 and some algebra gives
By the theory of influence functions, our estimator has a limiting distribution 
The asymptotic variance of is thus . The first term is the variance of the influence function for the difference-in-means (also called “unadjusted”) estimator. It may be verified that this evaluates to where . The variance of ϕ is
The covariance of the two terms involves the expectations (note that ξ = π0ξ0 + π1ξ1):
where we have introduced ξ* = π1ξ0 + π0ξ1. Assembling these pieces obtains the desired result. □
When X ∈ R (a single covariate), a consistent estimate of the sampling variance is
where and the “hat” quantities are any consistent estimates of their respective population parameters.
This follows from the definitions and Slutsky’s theorem. □
If either π0 = π1 or ξ0 = ξ1, then
The ANCOVA II estimator is asymptotically unbiased for τ = μ1 − μ0 and has asymptotic variance given by
Arguments similar to those in Theorem A.3 show that the influence function for the GLM marginal effect estimator with this specification is identical to Eq. (12) except that ξ = π0ξ0 + π1ξ1 is replaced by ξ* = π1ξ0 + π0ξ1. Specifically ψII = ψ1,II − ψ0,II with
The result follows from proceeding along the outline of Theorem A.3. □
When X ∈ R (a single covariate), a consistent estimate of the sampling variance is
Adding covariates to the ANCOVA II estimator can only decrease its asymptotic variance.
Consider using covariates X with variance Σ x and covariance with Y w of ξw,x versus a set of covariates [X, M] ( ) such that M is not a linear combination of the variables in X. Let , and . Let ξm* = π0ξ1,m + π1ξ0,m and ξx* = π0ξ1,x + π1ξ0,x. From Eq. (22) and some matrix algebra the difference in asymptotic variance between these two estimators is
The denominator must be positive because , implies . □
ANCOVA II is a more efficient estimator than ANCOVA I or difference-in-means. ANCOVA I may or may not be more efficient than difference-in-means (unless π0 = π1 = 0.5 or ξ0 = ξ1, in which case it is as efficient as ANCOVA II). In a slight abuse of notation,
because Eq. (22) subtracted from Eq. (9) is . is self-evident from Eq. (22). To show we rely on an example: using X ∈ R with π1 = 5/6 (so π0 = 1/6), ξ1 = 4 and ξ0 = 1 in Eq. (9) gives a positive addition to . □
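The counterexample can be checked by simulation. The sketch below is our own illustration (not from the original): with Var(X) = 1, the slopes are chosen so that ξ1 = Cov(X, Y1) = 4 and ξ0 = Cov(X, Y0) = 1, and π1 = 5/6, matching the parameters in the proof. The Monte Carlo variances show ANCOVA II beating both alternatives while ANCOVA I is worse than difference-in-means:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_once(n=300, pi1=5/6):
    # Single covariate with Var(X) = 1; arm-specific slopes 4 and 1
    # give xi_1 = 4, xi_0 = 1 as in the counterexample.
    x = rng.normal(size=n)
    w = (rng.random(n) < pi1).astype(float)
    y = np.where(w == 1, 4 * x, x) + rng.normal(size=n)
    xc = x - x.mean()
    t_dm = y[w == 1].mean() - y[w == 0].mean()      # difference-in-means
    Z1 = np.column_stack([np.ones(n), w, x])        # ANCOVA I
    Z2 = np.column_stack([np.ones(n), w, xc, w * xc])  # ANCOVA II
    t_a1 = np.linalg.lstsq(Z1, y, rcond=None)[0][1]
    t_a2 = np.linalg.lstsq(Z2, y, rcond=None)[0][1]
    return t_dm, t_a1, t_a2

draws = np.array([simulate_once() for _ in range(2000)])
v_dm, v_a1, v_a2 = draws.var(axis=0)  # Monte Carlo variances
```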
Consider using the ANCOVA II estimator with an arbitrary (multivariate) transformation of the covariates f(X) in place of the raw covariates X. Among all fixed transformations f(X), the transformation is optimal in terms of efficiency. Furthermore, the estimator is semiparametric efficient: the ANCOVA II estimator with used as the vector of covariates has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.
Consider replacing X in the interacted linear model (ANCOVA II) with an arbitrary fixed (possibly multivariate) function of the covariates f(X). By Eq. (23) and our definitions of ξ* and V the influence function for this estimator is ψ = ψ1 − ψ0 with
where and . Consider now using the special transformation f(X) ⊤ = [μ0(X), μ1(X)] where . Note that and by an orthogonal decomposition of Y w . Plugging these in and performing the appropriate algebra shows that V f ξf* in this case is such that h w (X) in Eq. (30) is π0(μ1(X) − μ1) + π1(μ0(X) − μ0). A little algebra shows
The result is precisely the efficient influence function for the treatment effect [24, 26]. It is known that no regular and asymptotically linear (RAL) estimator (which essentially all practical and reasonable estimators are) can be more efficient than an estimator with this influence function.
Presume a constant treatment effect: μ1(X) = μ0(X) + τ. Then the ANCOVA II analysis that uses μ0(X) in the role of X has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.
μ1(X) = μ0(X) + τ implies . Following the outline for the proof of Lemma A.6 above shows that the influence function for the ANCOVA II estimator with μ0(X) as the single covariate is
which is the same as the efficient influence function when μ1(X) = μ0(X) + τ. □
Corollary A.6.1 also holds when the ANCOVA II estimator is replaced by the ANCOVA I estimator.
Theorem A.5 establishes that ANCOVA I is as efficient as ANCOVA II when ξ0 = ξ1. A constant treatment effect means that μ1(X) = μ0(X) + τ, and this ensures the equality of the covariances. □
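A minimal end-to-end sketch of the procedure these corollaries license (our own toy example, with invented data-generating functions): fit a prognostic model to historical control data, then use its predictions as the single adjustment covariate in an interacted ANCOVA on the trial data.

```python
import numpy as np

rng = np.random.default_rng(3)

# --- Historical (standard-of-care) data: learn a prognostic model m ---
n_hist = 5000
xh = rng.normal(size=(n_hist, 2))
mu0 = lambda x: x[:, 0] + np.sin(x[:, 1])   # true control-arm mean (toy)
yh = mu0(xh) + rng.normal(size=n_hist)
# A simple stand-in for a flexible learner: OLS on hand-built basis functions.
basis = lambda x: np.column_stack([np.ones(len(x)), x, np.sin(x[:, 1:2])])
coef, *_ = np.linalg.lstsq(basis(xh), yh, rcond=None)
m = lambda x: basis(x) @ coef               # learned prognostic model

# --- Trial data: ANCOVA II with the prognostic score as the covariate ---
n = 500
x = rng.normal(size=(n, 2))
w = rng.binomial(1, 0.5, n).astype(float)
y = mu0(x) + 1.0 * w + rng.normal(size=n)   # constant effect tau = 1
s = m(x)                                    # prognostic scores
sc = s - s.mean()
Z = np.column_stack([np.ones(n), w, sc, w * sc])
tau_hat = np.linalg.lstsq(Z, y, rcond=None)[0][1]
```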
The following lemma is required for the proof that follows it.
Let be a bounded function on a compact set and let be a sequence of uniformly bounded random functions such that . Let be a random variable independent of . Then , , and .
and X are independent, so let their joint distribution factor into and P. Now
The final convergence holds by our assumption that . This shows and convergence in probability follows.
Taking advantage of the fact that |f|, |f n | ≤ b are bounded we can make similar arguments to show that and . Slutsky’s theorem and the definition of covariance and variance then imply and as desired. □
Let . Under the conditions of the above lemma, .
Let . By the above lemma, our assumption that , and Slutsky’s theorem, . Together with the uniform bound on and Cauchy-Schwarz this is also enough to ensure that .
Now note by the triangle inequality and the fact that . Thus
as desired. □
Presume X has compact support and there is a constant treatment effect: μ1(X) = μ0(X) + τ with |μ0(x)| < b bounded. Let m(x) be a (random) function learned from the external data (Y′, X′)n′ such that |m(x)| < b is also bounded and so that the learned model approaches the truth in MSE as n′ → ∞. If the number of trial samples n grows in tandem with the size of the historical data n′ (i.e. n = O(n′)), then the ANCOVA II analysis that uses the learned model m(X) in the role of X has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.
Define our estimator of interest as the ANCOVA II estimator that uses the learned model m(X) in place of the covariates X if m(X) is not numerically constant up to some machine precision and otherwise as the difference-in-means estimator. Denote this estimator (omitting the II subscript for the duration of this proof). Define the “oracle” estimator as the equivalent estimator that uses the true conditional mean μ0(X) instead of the estimate m(X) and denote this estimator . The oracle estimator is obviously infeasible in practice because μ0(⋅) is not known. Corollary A.6.1 proves that the oracle estimator is semiparametric efficient (i.e. has the lowest possible asymptotic variance among regular and asymptotically linear estimators). Thus, letting denote the optimal asymptotic variance, we have that . If we can show that , then Slutsky’s theorem and the delta method imply that has the same asymptotic properties as , i.e. . In other words, since the oracle estimator is efficient with a known asymptotic variance, the feasible estimator is also efficient and has the same asymptotic variance because the two are asymptotically equivalent.
Showing requires an intermediate estimator that is asymptotically equivalent to . Using the assumption of the constant effect and Eq. (23) from Theorem A.4 we can show (with an application of the law of total variance) that the influence function for using some fixed m(⋅) is ψ = ψ1 − ψ0 with
where denotes that the expectation (or variance or covariance) is taken only with respect to X, i.e. m(⋅) is considered fixed.
Let and let where ψ* is the influence function above with μ0(⋅) substituted for m(⋅). Note that and share the same influence function so we must have that . Similarly, . Therefore if , then we have as desired. This is useful because the estimator and its oracle counterpart are easier to work with.
To wit, consider the difference . So to establish the desired convergence it suffices to show . Expanding,
where we’ve abbreviated and . Our plan is to show that both of these terms L2-converge to 0 at the rate , so that they both converge in probability at that rate, as does their sum (which is what we want). To show L2 convergence for the first term, we must consider the expression
and show that it converges to 0. Recalling that m itself is random (it depends on the external data (X′, Y′)) but independent of the trial data (X, W, Y), note that we can treat m(⋅) as if it were a fixed function and B as a fixed constant if we condition on the external data. After conditioning, the quantity inside the parentheses is IID and has mean zero because its μ0(X) − m(X)B and (by randomization) and because . Therefore the quantity above is
where we’ve used the fact that the summands are IID to pass the variance through the sum and effectively gain the 1/n required to cancel the n. The same argument shows that the equivalent for the second term in Eq. (35) is (note m and B are random here).
To complete the proof we invoke Corollary A.7.1 in combination with our assumptions |m(x)| < b, |μ0(x)| < b and to arrive at the fact that and . The condition that in Corollary A.7.1 is automatically satisfied because we only include the prognostic score in the regression if it has nonzero variance. Thus the expectations and converge to 0 as desired. □
Theorem A.8 also holds for the ANCOVA I estimator.
In the case of a constant treatment effect ANCOVA I and ANCOVA II have the same asymptotic variance (Theorem A.5). The result follows immediately. □
Appendix B. Estimating σw and ρw for power calculations
One method for obtaining estimates for the marginal potential outcome variances (σw²) and potential outcome-prognostic score correlations (ρw) is to use prior data, for example data from the placebo control arm of a previous trial performed on a similar population (separate from the data used to train the prognostic model). In this case we presume we have access to a vector of outcomes Y″ = (Y″1, …, Y″n″) for these subjects and their corresponding prognostic scores M″, calculated by applying the prognostic model m to each subject’s vector of baseline covariates, i.e. M″i = m(X″i).
The control-arm marginal outcome variance can be estimated with the usual estimator

σ̂0² = (1/(n″ − 1)) Σi (Y″i − Ȳ″)²

where n″ is the number of prior-data subjects and Ȳ″ is their mean outcome.
The correlation ρ0 between M″ and Y″ can be estimated by

ρ̂0 = Σi (M″i − M̄″)(Y″i − Ȳ″) / √( Σi (M″i − M̄″)² · Σi (Y″i − Ȳ″)² ),

which is the usual sample correlation coefficient. These values may be inflated (σ̂0²) or deflated (ρ̂0) in order to provide more conservative estimates of power.
The corresponding values for the treatment arm can rarely be estimated from data because treatment-arm data for the experimental treatment is likely to be scarce or unavailable. It is therefore prudent to assume σ0 = σ1 and ρ0 = ρ1, the latter of which holds exactly if the effect of treatment is constant across the population. It may also be prudent (and conservative) to assume a slightly higher value for σ1 and a slightly smaller value for ρ1 relative to their control-arm counterparts in the absence of data to the contrary.
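These steps can be sketched in a few lines of numpy (our own hypothetical pilot dataset; the final 1 − ρ² sample-size factor is the standard ANCOVA approximation of Borm et al. [21], and the 10% deflation of ρ̂0 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical pilot control-arm data: prognostic scores M'' and outcomes Y''.
n_pilot = 400
m_pp = rng.normal(size=n_pilot)                  # prognostic scores
y_pp = 0.8 * m_pp + rng.normal(0, 0.6, n_pilot)  # outcomes (true rho = 0.8)

sigma0_hat = y_pp.std(ddof=1)                # usual estimate of sigma_0
rho0_hat = np.corrcoef(m_pp, y_pp)[0, 1]     # usual sample correlation

# Deflate rho for a conservative power calculation.
rho_power = 0.9 * rho0_hat
# Standard ANCOVA approximation [21]: the adjusted analysis needs roughly
# (1 - rho^2) times the unadjusted sample size.
relative_n = 1 - rho_power ** 2
```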
Appendix C. Additional simulation results
Here we detail a full set of simulation results using additional specifications for the regression estimators (Figure 1). “Covariates” indicates whether the raw covariates were adjusted for. “Prognostic score” indicates whether any prognostic score was used, and, if so, whether it was estimated from a training dataset or whether the true value was used. “Interactions” specifies whether treatment × (covariates and/or prognostic score) interactions were used. “SE” indicates the standard deviation of the mean squared error.
| Simulation | Covariates | Prognostic score | Interactions | MSE | SE |
|---|---|---|---|---|---|
| Baseline | False | None | True | 7.64 × 10−2 | 1.08 × 10−3 |
| Baseline | False | None | False | 7.64 × 10−2 | 1.08 × 10−3 |
| Baseline | False | Estimated | True | 1.76 × 10−2 | 2.46 × 10−4 |
| Baseline | False | Estimated | False | 1.75 × 10−2 | 2.45 × 10−4 |
| Baseline | False | Oracle | True | 7.69 × 10−3 | 1.09 × 10−4 |
| Baseline | False | Oracle | False | 7.69 × 10−3 | 1.09 × 10−4 |
| Baseline | True | None | True | 5.07 × 10−2 | 7.18 × 10−4 |
| Baseline | True | None | False | 5.04 × 10−2 | 7.14 × 10−4 |
| Baseline | True | Estimated | True | 1.74 × 10−2 | 2.46 × 10−4 |
| Baseline | True | Estimated | False | 1.73 × 10−2 | 2.44 × 10−4 |
| Baseline | True | Oracle | True | 7.85 × 10−3 | 1.11 × 10−4 |
| Baseline | True | Oracle | False | 7.85 × 10−3 | 1.11 × 10−4 |
| Surrogate | False | None | True | 7.47 × 10−2 | 1.05 × 10−3 |
| Surrogate | False | None | False | 7.47 × 10−2 | 1.05 × 10−3 |
| Surrogate | False | Estimated | True | 4.05 × 10−2 | 5.69 × 10−4 |
| Surrogate | False | Estimated | False | 4.03 × 10−2 | 5.66 × 10−4 |
| Surrogate | False | Oracle | True | 8.25 × 10−3 | 1.18 × 10−4 |
| Surrogate | False | Oracle | False | 8.24 × 10−3 | 1.18 × 10−4 |
| Surrogate | True | None | True | 5.03 × 10−2 | 7.09 × 10−4 |
| Surrogate | True | None | False | 5.00 × 10−2 | 7.04 × 10−4 |
| Surrogate | True | Estimated | True | 3.75 × 10−2 | 5.27 × 10−4 |
| Surrogate | True | Estimated | False | 3.72 × 10−2 | 5.23 × 10−4 |
| Surrogate | True | Oracle | True | 8.41 × 10−3 | 1.20 × 10−4 |
| Surrogate | True | Oracle | False | 8.41 × 10−3 | 1.20 × 10−4 |
| Shifted | False | None | True | 7.65 × 10−2 | 1.10 × 10−3 |
| Shifted | False | None | False | 7.65 × 10−2 | 1.10 × 10−3 |
| Shifted | False | Estimated | True | 6.79 × 10−2 | 9.62 × 10−4 |
| Shifted | False | Estimated | False | 6.79 × 10−2 | 9.62 × 10−4 |
| Shifted | False | Oracle | True | 8.20 × 10−3 | 1.15 × 10−4 |
| Shifted | False | Oracle | False | 8.20 × 10−3 | 1.15 × 10−4 |
| Shifted | True | None | True | 5.03 × 10−2 | 7.11 × 10−4 |
| Shifted | True | None | False | 5.00 × 10−2 | 7.05 × 10−4 |
| Shifted | True | Estimated | True | 4.91 × 10−2 | 6.97 × 10−4 |
| Shifted | True | Estimated | False | 4.86 × 10−2 | 6.90 × 10−4 |
| Shifted | True | Oracle | True | 8.34 × 10−3 | 1.17 × 10−4 |
| Shifted | True | Oracle | False | 8.34 × 10−3 | 1.17 × 10−4 |
| Strong | False | None | True | 7.73 × 10−2 | 1.08 × 10−3 |
| Strong | False | None | False | 7.73 × 10−2 | 1.08 × 10−3 |
| Strong | False | Estimated | True | 1.85 × 10−2 | 2.65 × 10−4 |
| Strong | False | Estimated | False | 1.85 × 10−2 | 2.64 × 10−4 |
| Strong | False | Oracle | True | 8.16 × 10−3 | 1.16 × 10−4 |
| Strong | False | Oracle | False | 8.16 × 10−3 | 1.16 × 10−4 |
| Strong | True | None | True | 5.14 × 10−2 | 7.18 × 10−4 |
| Strong | True | None | False | 5.11 × 10−2 | 7.13 × 10−4 |
| Strong | True | Estimated | True | 1.84 × 10−2 | 2.62 × 10−4 |
| Strong | True | Estimated | False | 1.82 × 10−2 | 2.59 × 10−4 |
| Strong | True | Oracle | True | 8.33 × 10−3 | 1.18 × 10−4 |
| Strong | True | Oracle | False | 8.32 × 10−3 | 1.18 × 10−4 |
| Linear | False | None | True | 3.49 × 10−2 | 4.83 × 10−4 |
| Linear | False | None | False | 3.49 × 10−2 | 4.83 × 10−4 |
| Linear | False | Estimated | True | 9.64 × 10−3 | 1.38 × 10−4 |
| Linear | False | Estimated | False | 9.64 × 10−3 | 1.38 × 10−4 |
| Linear | False | Oracle | True | 8.20 × 10−3 | 1.16 × 10−4 |
| Linear | False | Oracle | False | 8.20 × 10−3 | 1.16 × 10−4 |
| Linear | True | None | True | 8.37 × 10−3 | 1.18 × 10−4 |
| Linear | True | None | False | 8.37 × 10−3 | 1.18 × 10−4 |
| Linear | True | Estimated | True | 8.39 × 10−3 | 1.19 × 10−4 |
| Linear | True | Estimated | False | 8.39 × 10−3 | 1.19 × 10−4 |
| Linear | True | Oracle | True | 8.37 × 10−3 | 1.18 × 10−4 |
| Linear | True | Oracle | False | 8.37 × 10−3 | 1.18 × 10−4 |
| Heterogeneous | False | None | True | 5.54 × 10−2 | 7.76 × 10−4 |
| Heterogeneous | False | None | False | 5.54 × 10−2 | 7.76 × 10−4 |
| Heterogeneous | False | Estimated | True | 2.30 × 10−2 | 3.23 × 10−4 |
| Heterogeneous | False | Estimated | False | 2.32 × 10−2 | 3.25 × 10−4 |
| Heterogeneous | False | Oracle | True | 2.29 × 10−2 | 3.20 × 10−4 |
| Heterogeneous | False | Oracle | False | 2.32 × 10−2 | 3.24 × 10−4 |
| Heterogeneous | True | None | True | 2.99 × 10−2 | 4.30 × 10−4 |
| Heterogeneous | True | None | False | 2.98 × 10−2 | 4.29 × 10−4 |
| Heterogeneous | True | Estimated | True | 2.13 × 10−2 | 3.01 × 10−4 |
| Heterogeneous | True | Estimated | False | 2.19 × 10−2 | 3.08 × 10−4 |
| Heterogeneous | True | Oracle | True | 1.89 × 10−2 | 2.69 × 10−4 |
| Heterogeneous | True | Oracle | False | 1.98 × 10−2 | 2.81 × 10−4 |
Appendix D. Covariates in the empirical demonstration dataset
| Covariate | Description |
|---|---|
| AChEI or memantine usage | Whether a subject is using a class of symptomatic Alzheimer’s drugs |
| ADAS commands | Assesses the subject’s ability to follow commands |
| ADAS comprehension | Assesses the subject’s ability to understand spoken language |
| ADAS construction | Assesses the subject’s ability to draw basic figures |
| ADAS ideational | Assesses the subject’s ability to carry out a basic task |
| ADAS naming | Assesses the subject’s ability to name common objects |
| ADAS orientation | Assesses the subject’s knowledge of time and place |
| ADAS remember instructions | Assesses the subject’s ability to remember test instructions |
| ADAS spoken language | Assesses the subject’s ability to speak clearly |
| ADAS word finding | Assesses the subject’s word finding in speech |
| ADAS word recall | Assesses the subject’s ability to recall a list of words |
| ADAS word recognition | Assesses the subject’s ability to remember and identify words |
| Age | Subject age at baseline |
| ApoE e4 allele count | The number of ApoE e4 alleles a subject has (0, 1, or 2) |
| CDR community | Assesses the subject’s engagement in community activities |
| CDR home and hobbies | Assesses the subject’s engagement in home and personal activities |
| CDR judgement | Assesses the subject’s judgement skills |
| CDR memory | Assesses the subject’s memory |
| CDR orientation | Assesses the subject’s knowledge of time and place |
| CDR personal care | Assesses the subject’s ability to care for themselves |
| Diastolic blood pressure | The diastolic blood pressure of a subject |
| Education (years) | The number of years of education of a subject |
| Heart rate | The resting heart rate of a subject |
| Height | The height of a subject |
| Indicator for clinical trial | 1 if the subject is in an RCT, 0 if not |
| MMSE attention and calculation | Assesses the subject’s attention and calculation skills |
| MMSE language | Assesses the subject’s language skills |
| MMSE orientation | Assesses the subject’s knowledge of place and time |
| MMSE recall | Assesses the subject’s ability to remember prompts |
| MMSE registration | Assesses the subject’s ability to repeat prompts |
| Region: Europe | 1 if the subject lives in Europe, 0 otherwise |
| Region: Northern America | 1 if the subject lives in the US or Canada, 0 otherwise |
| Region: Other | 1 if the subject lives outside of Europe/US/Canada, 0 otherwise |
| Serious adverse events | The number of serious adverse events reported |
| Sex | 1 if female, 0 if male |
| Systolic blood pressure | The systolic blood pressure of a subject |
| Weight | The weight of a subject |
2. Sox, HC, Goodman, SN. The methods of comparative effectiveness research. Annu Rev Public Health 2012;33:425–45. https://doi.org/10.1146/annurev-publhealth-031811-124610.
4. Hannan, EL. Randomized clinical trials and observational studies: guidelines for assessing respective strengths and limitations. JACC Cardiovasc Interv 2008;1:211–7. https://doi.org/10.1016/j.jcin.2008.01.008.
5. Kopp-Schneider, A, Calderazzo, S, Wiesenfarth, M. Power gains by using external information in clinical trials are typically not possible when requiring strict type I error control. Biom J 2020;62:361–74. https://doi.org/10.1002/bimj.201800395.
7. Lim, J, Walley, R, Yuan, J, Liu, J, Dabral, A, Best, N. Minimizing patient burden through the use of historical subject-level data in innovative confirmatory clinical trials. Ther Innov Regul Sci 2018;52:546–59. https://doi.org/10.1177/2168479018778282.
9. Ghadessi, M, Tang, R, Zhou, J, Liu, R, Wang, C, Toyoizumi, K, et al. A roadmap to using historical controls in clinical trials – by Drug Information Association Adaptive Design Scientific Working Group (DIA-ADSWG). Orphanet J Rare Dis 2020;15:69. https://doi.org/10.1186/s13023-020-1332-x.
12. Wyss, R, Lunt, M, Brookhart, MA, Glynn, RJ, Stürmer, T. Reducing bias amplification in the presence of unmeasured confounding through out-of-sample estimation strategies for the disease risk score. J Causal Inference 2014;2:131–46. https://doi.org/10.1515/jci-2014-0009.
14. Kahan, BC, Jairath, V, Doré, CJ, Morris, TP. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials 2014;15:139. https://doi.org/10.1186/1745-6215-15-139.
15. Raab, GM, Day, S, Sales, J. How to select covariates to include in the analysis of a clinical trial. Contr Clin Trials 2000;21:330–42. https://doi.org/10.1016/s0197-2456(00)00061-1.
16. Yang, L, Tsiatis, AA. Efficiency study of estimators for a treatment effect in a pretest–posttest trial. Am Statistician 2001;55:314–21. https://doi.org/10.1198/000313001753272466.
17. Committee for Medicinal Products for Human Use. Guideline on adjustment for baseline covariates in clinical trials. London: European Medicines Agency; 2015.
18. Cooney, MT, Dudina, AL, Graham, IM. Value and limitations of existing scores for the assessment of cardiovascular risk: a review for clinicians. J Am Coll Cardiol 2009;54:1209–27. https://doi.org/10.1016/j.jacc.2009.07.020.
19. Austin, SR, Wong, Y-N, Uzzo, RG, Beck, JR, Egleston, BL. Why summary comorbidity measures such as the Charlson comorbidity index and Elixhauser score work. Med Care 2015;53:e65–72. https://doi.org/10.1097/mlr.0b013e318297429c.
20. Ambrosius, WT, Sink, KM, Foy, CG, Berlowitz, DR, Cheung, AK, Cushman, WC, et al., The SPRINT Study Research Group. The design and rationale of a multicenter clinical trial comparing two strategies for control of systolic blood pressure: the systolic blood pressure intervention trial (SPRINT). Clin Trials 2014;11:532–46. https://doi.org/10.1177/1740774514537404.
21. Borm, GF, Fransen, J, Lemmens, WAJG. A simple sample size formula for analysis of covariance in randomized clinical trials. J Clin Epidemiol 2007;60:1234–8. https://doi.org/10.1016/j.jclinepi.2007.02.006.
23. Wang, B, Ogburn, EL, Rosenblum, M. Analysis of covariance in randomized trials: more precision and valid confidence intervals, without model assumptions. Biometrics 2019;75:1391–400. https://doi.org/10.1111/biom.13062.
24. Leon, S, Tsiatis, AA, Davidian, M. Semiparametric estimation of treatment effect in a pretest–posttest study. Biometrics 2003;59:1046–55. https://doi.org/10.1111/j.0006-341x.2003.00120.x.
26. Tsiatis, A. Semiparametric theory and missing data. New York: Springer Science & Business Media; 2007.
27. Luo, Y, Spindler, M. High-dimensional L2 boosting: rate of convergence. arXiv preprint; 2016.
30. Syrgkanis, V, Zampetakis, M. Estimation and inference with trees and forests in high dimensions. arXiv preprint; 2020.
31. Pedregosa, F, Varoquaux, G, Gramfort, A, Michel, V, Thirion, B, Grisel, O, et al. Scikit-learn: machine learning in Python. arXiv preprint; 2012.
32. Quinn, JF, Raman, R, Thomas, RG, Yurko-Mauro, K, Nelson, EB, Van Dyck, C, et al. Docosahexaenoic acid supplementation and cognitive decline in Alzheimer disease: a randomized trial. J Am Med Assoc 2010;304:1903–11. https://doi.org/10.1001/jama.2010.1510.
33. Coon, KD, Myers, AJ, Craig, DW, Webster, JA, Pearson, JV, Lince, DH, et al. A high-density whole-genome association study reveals that APOE is the major susceptibility gene for sporadic late-onset Alzheimer’s disease. J Clin Psychiatr 2007;68:613–8. https://doi.org/10.4088/jcp.v68n0419.
35. Galasko, D, Bennett, D, Sano, M, Ernesto, C, Thomas, R, Grundman, M, et al. An inventory to assess activities of daily living for clinical trials in Alzheimer’s disease. The Alzheimer’s Disease Cooperative Study. Alzheimer Dis Assoc Disord 1997;11:S33–9. https://doi.org/10.1097/00002093-199700112-00005.
37. Neville, J, Kopko, S, Broadbent, S, Avilés, E, Stafford, R, Solinsky, CM, et al., Coalition Against Major Diseases. Development of a unified clinical trial database for Alzheimer’s disease. Alzheimer’s Dementia 2015;11:1212–21. https://doi.org/10.1016/j.jalz.2014.11.005.
38. Romero, K, Mars, M, Frank, D, Anthony, M, Neville, J, Kirby, L, et al. The Coalition Against Major Diseases: developing tools for an integrated drug development process for Alzheimer’s and Parkinson’s diseases. Clin Pharmacol Ther 2009;86:365–7. https://doi.org/10.1038/clpt.2009.165.
39. Chernozhukov, V, Chetverikov, D, Demirer, M, Duflo, E, Hansen, C, Newey, W, et al. Double/debiased machine learning for treatment and structural parameters. Econom J 2018;21:C1–68. https://doi.org/10.1111/ectj.12097.
40. Wager, S, Du, W, Taylor, J, Tibshirani, RJ. High-dimensional regression adjustments in randomized experiments. Proc Natl Acad Sci USA 2016;113:12673–8. https://doi.org/10.1073/pnas.1614732113.
41. Rothe, C. Flexible covariate adjustments in randomized experiments. Working paper; 2018.
42. Dankar, FK, El Emam, K. The application of differential privacy to health data. In: Proceedings of the 2012 Joint EDBT/ICDT Workshops (EDBT-ICDT ’12); 2012:158–66. https://doi.org/10.1145/2320765.2320816.
43. Brisimi, TS, Chen, R, Mela, T, Olshevsky, A, Paschalidis, IC, Shi, W. Federated learning of predictive models from federated electronic health records. Int J Med Inf 2018;112:59–67. https://doi.org/10.1016/j.ijmedinf.2018.01.007.
44. Fisher, CK, Smith, AM, Walsh, JR, and the Coalition Against Major Diseases. Machine learning for comprehensive forecasting of Alzheimer’s disease progression. Sci Rep 2019;9:13622. https://doi.org/10.1038/s41598-019-49656-2.
45. Rajkomar, A, Oren, E, Chen, K, Dai, AM, Hajaj, N, Hardt, M, et al. Scalable and accurate deep learning with electronic health records. npj Digital Medicine 2018;1:18. https://doi.org/10.1038/s41746-018-0029-1.
47. Miotto, R, Wang, F, Wang, S, Jiang, X, Dudley, JT. Deep learning for healthcare: review, opportunities and challenges. Briefings Bioinf 2018;19:1236–46. https://doi.org/10.1093/bib/bbx044.
48. Dubois, S, Romano, N, Jung, K, Shah, N, Kale, D. The effectiveness of transfer learning in electronic health records data. In: Workshop Track – ICLR; 2017.
49. van der Vaart, AW. Asymptotic statistics. Cambridge: Cambridge University Press; 2000.
50. Robins, JM, Rotnitzky, A, Zhao, LP. Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 1994;89:846–66. https://doi.org/10.2307/2290910.
51. Rosenblum, M, van der Laan, MJ. Simple, efficient estimators of treatment effects in randomized trials using generalized linear models to leverage baseline variables. Int J Biostat 2010;6:13. https://doi.org/10.2202/1557-4679.1138.
53. Long, JS, Ervin, LH. Using heteroscedasticity consistent standard errors in the linear regression model. Am Statistician 2000;54:217–24. https://doi.org/10.1080/00031305.2000.10474549.
© 2021 Walter de Gruyter GmbH, Berlin/Boston