Published by De Gruyter, December 22, 2021

Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score

  • Alejandro Schuler, David Walsh, Diana Hall, Jon Walsh, Charles Fisher; for the Critical Path for Alzheimer’s Disease, the Alzheimer’s Disease Neuroimaging Initiative, and the Alzheimer’s Disease Cooperative Study

Abstract

Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects’ predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer’s disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models that explain a clinically realistic percentage of the outcome variance.
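The procedure summarized above can be sketched end-to-end in a few lines. This is an illustrative outline on synthetic data, not the authors' implementation: the "prognostic model" here is plain least squares, and all variable names and simulation settings are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Historical (standard-of-care) data: train a prognostic model ---
n_hist = 5000
X_hist = rng.normal(size=(n_hist, 3))
y_hist = X_hist @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n_hist)
# The prognostic model here is ordinary least squares; in practice it could
# be any regression learner fit to historical control outcomes.
beta = np.linalg.lstsq(np.c_[np.ones(n_hist), X_hist], y_hist, rcond=None)[0]

# --- Trial data: randomized treatment W, true effect tau = 1 ---
n = 500
X = rng.normal(size=(n, 3))
W = rng.binomial(1, 0.5, size=n)
y = X @ np.array([1.0, -0.5, 2.0]) + 1.0 * W + rng.normal(size=n)

# --- Prognostic scores for trial subjects, then linear adjustment ---
M = np.c_[np.ones(n), X] @ beta   # predicted outcomes (prognostic scores)
Z = np.c_[np.ones(n), W, M]       # regress Y on [1, W, M]
tau_hat = np.linalg.lstsq(Z, y, rcond=None)[0][1]
```

Because randomization guarantees that M is independent of W, adjusting for the score cannot bias the effect estimate; it only absorbs outcome variance.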


Corresponding author: Alejandro Schuler, UC Berkeley Center for Targeted Learning, Berkeley, CA, USA, E-mail:

Critical Path for Alzheimer's Disease: Data used in the preparation of this article were obtained from the Critical Path Institute’s Critical Path for Alzheimer’s Disease (CPAD) consortium. As such, the investigators within CPAD contributed to the design and implementation of the CPAD database and/or provided data, but did not participate in the analysis of the data or the writing of this report.

The Alzheimer's Disease Neuroimaging Initiative: Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found in this document.

The Alzheimer's Disease Cooperative Study: Data used in preparation of this manuscript/publication/article were obtained from the University of California, San Diego Alzheimer’s Disease Cooperative Study. Consequently, the ADCS Core Directors contributed to the design and implementation of the ADCS and/or provided data but did not participate in analysis or writing of this report.


Acknowledgments

We are grateful to Xinkun Nie and Oleg Sofrygin for enlightening conversations and to Rachael C. Aikens for feedback on a draft of this article. Data collection and sharing for this project was funded in part by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. 
Data collection and sharing for this project was funded in part by the University of California, San Diego Alzheimer’s Disease Cooperative Study (ADCS) (National Institute on Aging Grant Number U19AG010483).

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

Appendix A. Mathematical results

Throughout we assume enough regularity conditions for the asymptotic normality of M-estimators to hold. The details can be found in Chapter 5 (Theorem 5.23) of van der Vaart [49].

Lemma A.1

(Rosenblum). The influence function for the linear regression treatment effect estimator we describe in Section 3 is $\psi = \psi_1 - \psi_0$ where

(7) $\psi_w = \frac{W_w}{\pi_w}\left(Y - \hat{\mu}_w^*(X)\right) + \left(\hat{\mu}_w^*(X) - \hat{\mu}_w^*\right)$

and $\hat{\mu}_w^*(X) = Z_w \beta^*$ and $\hat{\mu}_w^* = E[\hat{\mu}_w^*(X)]$. The parameters $\beta^*$ are those that maximize the (model-based) likelihood in expectation (under the true law of the data). In other words, $\hat{\mu}_w^*(X)$ characterizes the linear model that comes as close as possible to the true conditional mean function $\mu_w(X) = E[Y_w \mid X]$, and $\hat{\mu}_w^*$ is its mean value (averaged over X).

This follows from results in Robins et al. [50]. An accessible presentation for the case of generalized linear models is given in Rosenblum and Laan [51].

Definition A.1

(Difference-in-means). The “difference-in-means” (or “unadjusted”) estimator of $\tau = \mu_1 - \mu_0$ is $\hat{\tau}_\Delta = \hat{E}[Y \mid W = 1] - \hat{E}[Y \mid W = 0]$.

Note that throughout the appendix we omit the subscript n on estimators, e.g. $\hat{\tau}_\Delta$ is shorthand for $\hat{\tau}_{\Delta,n}$, and our asymptotic statements refer to the sequence of estimators as n becomes large.

Lemma A.2

The difference-in-means estimator has asymptotic variance given by

(8) $n V[\hat{\tau}_\Delta] \overset{p}{\to} \frac{\sigma_0^2}{\pi_0} + \frac{\sigma_1^2}{\pi_1}$

where $\sigma_w^2 = V[Y_w]$.

Proof

This fact is well-known. One proof follows the outline of Theorem A.3 below, taking Z = [1, W]. □
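As a quick numerical sanity check of Eq. (8) (not part of the proof), a Monte Carlo simulation with known $\sigma_0$, $\sigma_1$, and $\pi_w$ recovers the stated limit for $n V[\hat{\tau}_\Delta]$; the settings below are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(1)
pi1 = 0.5
pi0 = 1 - pi1
sigma0, sigma1, tau = 1.0, 2.0, 0.7
n, reps = 1000, 2000

estimates = []
for _ in range(reps):
    W = rng.binomial(1, pi1, size=n)
    # Potential outcomes: Y_0 ~ N(0, sigma0^2), Y_1 ~ N(tau, sigma1^2)
    noise = np.where(W == 1, sigma1, sigma0) * rng.normal(size=n)
    Y = tau * W + noise
    estimates.append(Y[W == 1].mean() - Y[W == 0].mean())

empirical = n * np.var(estimates)                 # n * V[tau_hat_Delta]
theoretical = sigma0**2 / pi0 + sigma1**2 / pi1   # Eq. (8): 2 + 8 = 10
```

With these settings the empirical value lands close to the theoretical limit of 10, up to Monte Carlo error.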

Definition A.2

(ANCOVA I). The “ANCOVA I” estimator of $\tau = \mu_1 - \mu_0$ (denoted $\hat{\tau}_I$) is the effect estimated using a linear regression with predictors Z = [1, W, X] and outcome Y.

Definition A.3

(ANCOVA II). The “ANCOVA II” estimator of $\tau = \mu_1 - \mu_0$ (denoted $\hat{\tau}_{II}$) is the effect estimated using a linear regression with predictors $Z = [1, \tilde{W}, \tilde{X}, \tilde{W}\tilde{X}^T]$ and outcome $\tilde{Y}$.
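The two specifications can be written out concretely. The sketch below is ours (the helper name and synthetic data are illustrative); the outcome is left uncentered in the ANCOVA II regression because, with an intercept included, centering Y does not change the treatment coefficient.

```python
import numpy as np

def ancova_estimates(W, X, Y):
    """Treatment-effect estimates from the ANCOVA I and ANCOVA II regressions.

    ANCOVA I regresses Y on Z = [1, W, X]; ANCOVA II regresses Y on
    Z = [1, W~, X~, W~ * X~], where tildes denote centering by the sample
    mean. In both cases the coefficient on the treatment column is the
    effect estimate.
    """
    n = len(W)
    Z1 = np.c_[np.ones(n), W, X]                      # ANCOVA I design
    Wt = W - W.mean()
    Xt = X - X.mean(axis=0)
    Z2 = np.c_[np.ones(n), Wt, Xt, Wt[:, None] * Xt]  # ANCOVA II design
    tau_I = np.linalg.lstsq(Z1, Y, rcond=None)[0][1]
    tau_II = np.linalg.lstsq(Z2, Y, rcond=None)[0][1]
    return tau_I, tau_II

# Example with synthetic data and a true effect of 0.5:
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
W = rng.binomial(1, 0.5, size=n).astype(float)
Y = X @ np.array([1.0, -1.0]) + 0.5 * W + rng.normal(size=n)
tau_I, tau_II = ancova_estimates(W, X, Y)
```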

The following two Theorems A.3 and A.4 are mild generalizations of or follow closely from results stated in Leon et al. [24] and Yang and Tsiatis [16]. Details are provided here for the reader’s convenience.

Theorem A.3

The ANCOVA I estimator is asymptotically unbiased for $\tau = \mu_1 - \mu_0$ and has asymptotic variance given by

(9) $n V[\hat{\tau}_I] \overset{p}{\to} \frac{\sigma_0^2}{\pi_0} + \frac{\sigma_1^2}{\pi_1} + \frac{1}{\pi_0 \pi_1}\xi^T V \xi - \frac{2}{\pi_0 \pi_1}\xi_*^T V \xi$

where $\xi = \pi_0 C[Y_0, X] + \pi_1 C[Y_1, X]$, $\xi_* = \pi_0 C[Y_1, X] + \pi_1 C[Y_0, X]$, and $V = V[X]^{-1}$.

Proof

We begin by applying Lemma A.1. Minimization of the expected log-likelihood shows that $\beta^* = E[Z^T Z]^{-1} E[Z^T Y]$. Some algebra[12] demonstrates

(10) $\beta^* = \left[\mu_0,\; \tau,\; (V\xi)^T\right]^T$

where $V = V[X]^{-1}$, $\xi = C[X, Y]$, and $\tau = \mu_1 - \mu_0$. Thus $\hat{\mu}_w^*(X) = \mu_0 + w\tau + \tilde{X}^T V \xi = \mu_w + \tilde{X}^T V \xi$. In this equation and from here on, let $\tilde{X} = X - E[X]$. So clearly $\hat{\mu}_w^* = \mu_w$. Then, from Eq. (7),

(11) $\psi_w = \frac{W_w}{\pi_w}(Y - \mu_w) - \underbrace{\frac{\tilde{W}_w}{\pi_w}\left(\tilde{X}^T V \xi\right)}_{h_w(X)}$

where $\tilde{W}_w = W_w - \pi_w$. An application of Lemma A.1 and some algebra gives

(12) $\psi_I = \underbrace{\underbrace{\frac{W_1}{\pi_1}(Y - \mu_1)}_{\psi_{1,\Delta}} - \underbrace{\frac{W_0}{\pi_0}(Y - \mu_0)}_{\psi_{0,\Delta}}}_{\psi_\Delta} - \underbrace{\frac{(W_1 - \pi_1)\left(\tilde{X}^T V \xi\right)}{\pi_0 \pi_1}}_{h(X) \,\equiv\, \phi}$

It is known that all regular and asymptotically linear estimators of the treatment effect have an influence function of this form with h(X) dependent on the choice of estimator [24, 26].

By the theory of influence functions, our estimator has a limiting distribution [26]

(13) $\sqrt{n}\left(\hat{\tau}_I - \tau\right) \overset{d}{\to} N\left(0, E[\psi_I^2]\right)$

The asymptotic variance of $\hat{\tau}_I$ is thus $E[\psi_I^2] = E[(\psi_\Delta - \phi)^2] = E[\psi_\Delta^2] - 2E[\psi_\Delta \phi] + E[\phi^2]$. The first term is the variance of the influence function for the difference-in-means (also called “unadjusted”) estimator. It may be verified that this evaluates to $E[\psi_\Delta^2] = \frac{\sigma_0^2}{\pi_0} + \frac{\sigma_1^2}{\pi_1}$ where $\sigma_w^2 = V[Y_w]$. The variance of $\phi$ is

(14) $E[\phi^2] = E\left[\left(\frac{W_1 - \pi_1}{\pi_0 \pi_1} \tilde{X}^T V \xi\right)^2\right]$

(15) $= \frac{E\left[(W_1 - \pi_1)^2\right]}{\pi_0^2 \pi_1^2}\, \xi^T V E\left[\tilde{X}\tilde{X}^T\right] V \xi$

(16) $= \frac{1}{\pi_0 \pi_1}\, \xi^T V \xi$

The covariance of the two terms involves the expectations $E[(Y_w - \mu_w)\tilde{X}] = C[Y_w, X] = \xi_w$ (note that $\xi = \pi_0 \xi_0 + \pi_1 \xi_1$):

(17) $E[\psi_\Delta \phi] = E[\psi_{1,\Delta}\phi] - E[\psi_{0,\Delta}\phi]$

(18) $= \frac{1}{\pi_1}\xi_1^T V \xi + \frac{1}{\pi_0}\xi_0^T V \xi$

(19) $= \frac{1}{\pi_0 \pi_1}\xi_*^T V \xi$

where we have introduced $\xi_* = \pi_1 \xi_0 + \pi_0 \xi_1$. Assembling these terms obtains the desired result. □

Corollary A.3.1

When $X \in \mathbb{R}$ (a single covariate), a consistent estimate of the sampling variance $V[\hat{\tau}_I]$ is

(20) $\hat{\nu}_I^2 = \frac{\hat{\sigma}_0^2}{n_0} + \frac{\hat{\sigma}_1^2}{n_1} + \frac{n_0 n_1}{n}\left(\frac{\hat{\rho}_0 \hat{\sigma}_0}{n_1} + \frac{\hat{\rho}_1 \hat{\sigma}_1}{n_0}\right)^2 - 2\,\frac{n_0 n_1}{n}\left(\frac{\hat{\rho}_0 \hat{\sigma}_0}{n_1} + \frac{\hat{\rho}_1 \hat{\sigma}_1}{n_0}\right)\left(\frac{\hat{\rho}_0 \hat{\sigma}_0}{n_0} + \frac{\hat{\rho}_1 \hat{\sigma}_1}{n_1}\right)$

where $\rho_w = C[Y_w, X]\,/\,\sqrt{V[X]\, V[Y_w]}$ and the “hat” quantities are any consistent estimates of their respective population parameters.

Proof

This follows from the definitions and Slutsky’s theorem. □
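Eq. (20) translates directly into code. A minimal sketch (the function name is ours), taking consistent estimates $\hat{\sigma}_w$, $\hat{\rho}_w$ and the arm sizes as inputs:

```python
def var_ancova1(s0, s1, r0, r1, n0, n1):
    """Sampling-variance estimate for ANCOVA I with a single covariate, Eq. (20).

    s0, s1: consistent estimates of the outcome standard deviations;
    r0, r1: estimates of the outcome-covariate correlations rho_w;
    n0, n1: control- and treatment-arm sample sizes.
    """
    n = n0 + n1
    a = r0 * s0 / n1 + r1 * s1 / n0  # quantity squared in Eq. (20)
    b = r0 * s0 / n0 + r1 * s1 / n1  # quantity from the covariance term
    return s0**2 / n0 + s1**2 / n1 + (n0 * n1 / n) * (a**2 - 2 * a * b)
```

With $\hat{\rho}_0 = \hat{\rho}_1 = 0$ this reduces to the usual two-sample variance $\hat{\sigma}_0^2/n_0 + \hat{\sigma}_1^2/n_1$, as expected for an uninformative covariate.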

Corollary A.3.2

If either $\pi_0 = \pi_1$ or $\xi_0 = \xi_1$, then

(21) $n V[\hat{\tau}_I] \overset{p}{\to} \frac{\sigma_0^2}{\pi_0} + \frac{\sigma_1^2}{\pi_1} - \frac{1}{\pi_0 \pi_1}\xi_*^T V \xi_*$

Theorem A.4

The ANCOVA II estimator is asymptotically unbiased for $\tau = \mu_1 - \mu_0$ and has asymptotic variance given by

(22) $n V[\hat{\tau}_{II}] \overset{p}{\to} \frac{\sigma_0^2}{\pi_0} + \frac{\sigma_1^2}{\pi_1} - \frac{1}{\pi_0 \pi_1}\xi_*^T V \xi_*$

Proof

Arguments similar to those in Theorem A.3 show that the influence function for the GLM marginal effect estimator with this specification is identical to Eq. (12) except that $\xi = \pi_0 \xi_0 + \pi_1 \xi_1$ is replaced by $\xi_* = \pi_1 \xi_0 + \pi_0 \xi_1$. Specifically, $\psi_{II} = \psi_{1,II} - \psi_{0,II}$ with

(23) $\psi_{w,II} = \frac{W_w}{\pi_w}(Y - \mu_w) - \underbrace{\frac{\tilde{W}_w}{\pi_w}\tilde{X}^T V \xi_*}_{h_w(X)}$

The result follows from proceeding along the outline of Theorem A.3. □

Corollary A.4.1

When $X \in \mathbb{R}$ (a single covariate), a consistent estimate of the sampling variance $V[\hat{\tau}_{II}]$ is

(24) $\hat{\nu}_{II}^2 = \frac{\hat{\sigma}_0^2}{n_0} + \frac{\hat{\sigma}_1^2}{n_1} - \frac{n_0 n_1}{n}\left(\frac{\hat{\rho}_0 \hat{\sigma}_0}{n_0} + \frac{\hat{\rho}_1 \hat{\sigma}_1}{n_1}\right)^2$
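Eq. (24) likewise translates directly into code (the function name is ours):

```python
def var_ancova2(s0, s1, r0, r1, n0, n1):
    """Sampling-variance estimate for ANCOVA II with a single covariate, Eq. (24).

    s0, s1: estimated outcome standard deviations; r0, r1: estimated
    outcome-covariate correlations; n0, n1: arm sample sizes.
    """
    n = n0 + n1
    b = r0 * s0 / n0 + r1 * s1 / n1
    return s0**2 / n0 + s1**2 / n1 - (n0 * n1 / n) * b**2
```

Stronger outcome-covariate correlation always shrinks the variance here, and with equal arm sizes the result agrees with Eq. (20), consistent with the $\pi_0 = \pi_1$ case of Theorem A.5.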

Corollary A.4.2

Adding covariates to the ANCOVA II estimator can only decrease its asymptotic variance.

Proof

Consider using covariates X with variance $\Sigma_x$ and covariance with $Y_w$ of $\xi_{w,x}$ versus a set of covariates [X, M] ($M \in \mathbb{R}$) such that M is not a linear combination of the variables in X. Let $C[X, M] = \zeta$, $V[M] = \sigma_m^2$, and $C[Y_w, M] = \xi_{w,m}$. Let $\xi_{m*} = \pi_0 \xi_{1,m} + \pi_1 \xi_{0,m}$ and $\xi_{x*} = \pi_0 \xi_{1,x} + \pi_1 \xi_{0,x}$. From Eq. (22) and some matrix algebra, the difference in asymptotic variance between these two estimators is

(25) $\frac{1}{\pi_0 \pi_1}\,\frac{\left(\xi_{m*} - \xi_{x*}^T \Sigma_x^{-1} \zeta\right)^2}{\sigma_m^2 - \zeta^T \Sigma_x^{-1} \zeta}$

The denominator must be positive: $V[[X, M]] \succeq 0$ and $V[X] \succ 0$ imply $\det(V[[X, M]]) = \det(\Sigma_x)\left(\sigma_m^2 - \zeta^T \Sigma_x^{-1} \zeta\right) \geq 0$, with equality only when M is a linear combination of the variables in X, which we have ruled out. □

Theorem A.5

ANCOVA II is a more efficient estimator than ANCOVA I or difference-in-means. ANCOVA I may or may not be more efficient than difference-in-means (unless $\pi_0 = \pi_1 = 0.5$ or $\xi_0 = \xi_1$, in which case it is as efficient as ANCOVA II). In a slight abuse of notation,

(26) $V[\hat{\tau}_{II}] \leq V[\hat{\tau}_I]$

(27) $V[\hat{\tau}_{II}] \leq V[\hat{\tau}_\Delta]$

(28) $V[\hat{\tau}_I] \lessgtr V[\hat{\tau}_\Delta]$

(29) $\pi_0 = \pi_1 \implies V[\hat{\tau}_I] = V[\hat{\tau}_{II}]$

Proof

$V[\hat{\tau}_{II}] \leq V[\hat{\tau}_I]$ because Eq. (22) subtracted from Eq. (9) is $\left\|V^{1/2}(\xi - \xi_*)\right\|^2 / (\pi_0 \pi_1) \geq 0$. $V[\hat{\tau}_{II}] \leq V[\hat{\tau}_\Delta]$ is self-evident from Eq. (22). To show $V[\hat{\tau}_I] \lessgtr V[\hat{\tau}_\Delta]$ we rely on an example: using $X \in \mathbb{R}$ with $\pi_1 = 5/6$ (so $\pi_0 = 1/6$), $\xi_1 = 4$, and $\xi_0 = 1$ in Eq. (9) gives a positive addition to $V[\hat{\tau}_\Delta]$. □
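The counterexample is easy to check numerically. With a scalar X normalized so that V[X] = 1 (hence V = 1), the covariate-dependent part of Eq. (9) evaluates to a positive number:

```python
# pi_1 = 5/6, xi_1 = 4, xi_0 = 1, scalar X with V[X] = 1 (so V = 1)
pi1 = 5 / 6
pi0 = 1 - pi1
xi0, xi1 = 1.0, 4.0
xi = pi0 * xi0 + pi1 * xi1        # = 3.5
xi_star = pi1 * xi0 + pi0 * xi1   # = 1.5
# Covariate-dependent part of Eq. (9); a positive value means ANCOVA I is
# *less* efficient than difference-in-means in this example.
extra = (xi**2 - 2 * xi_star * xi) / (pi0 * pi1)  # = 12.6 > 0
```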

Lemma A.6

Consider using the ANCOVA II estimator with an arbitrary (multivariate) transformation of the covariates f(X) in place of the raw covariates X. Among all fixed transformations f(X), the transformation $[\mu_0(X), \mu_1(X)]$ is optimal in terms of efficiency. Furthermore, the estimator is semiparametric efficient: the ANCOVA II estimator with $[\mu_0(X), \mu_1(X)]$ used as the vector of covariates has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.

Consider replacing X in the interacted linear model (ANCOVA II) with an arbitrary fixed (possibly multivariate) function of the covariates f(X). By Eq. (23) and our definitions of $\xi_*$ and V, the influence function for this estimator is $\psi = \psi_1 - \psi_0$ with

(30) $\psi_w = \frac{W_w}{\pi_w}(Y - \mu_w) - \underbrace{\frac{\tilde{W}_w}{\pi_w}\left(f(X) - E[f(X)]\right)^T V_f\, \xi_{f*}}_{h_w(X)}$

where $\xi_{f*} = \pi_1 C[Y_0, f(X)] + \pi_0 C[Y_1, f(X)]$ and $V_f = V[f(X)]^{-1}$. Consider now using the special transformation $f(X) = [\mu_0(X), \mu_1(X)]$ where $\mu_w(X) = E[Y_w \mid X]$. Note that $C[Y_w, \mu_w(X)] = V[\mu_w(X)]$ and $C[Y_1, \mu_0(X)] = C[\mu_1(X), \mu_0(X)]$ by an orthogonal decomposition of $Y_w$.[13] Plugging these in and performing the appropriate algebra shows that $V_f\, \xi_{f*}$ in this case is $[\pi_1, \pi_0]^T$, so the quantity $\left(f(X) - E[f(X)]\right)^T V_f\, \xi_{f*}$ in Eq. (30) is $\pi_0(\mu_1(X) - \mu_1) + \pi_1(\mu_0(X) - \mu_0)$. A little algebra shows

(31) $\psi = \psi_1 - \psi_0$

(32) $= \frac{W_1}{\pi_1}(Y - \mu_1) - \frac{W_0}{\pi_0}(Y - \mu_0) - (W_1 - \pi_1)\,\frac{\pi_0(\mu_1(X) - \mu_1) + \pi_1(\mu_0(X) - \mu_0)}{\pi_0 \pi_1}$

The result is precisely the efficient influence function for the treatment effect [24, 26]. It is known that no regular and asymptotically linear (RAL) estimator (which essentially all practical and reasonable estimators are) can be more efficient than an estimator with this influence function. □

Corollary A.6.1

Presume a constant treatment effect: $\mu_1(X) = \mu_0(X) + \tau$. Then the ANCOVA II analysis that uses $\mu_0(X)$ in the role of X has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.

Proof

$\mu_1(X) = \mu_0(X) + \tau$ implies $C[\mu_0(X), \mu_1(X)] = V[\mu_0(X)] = V[\mu_1(X)]$. Following the outline of the proof of Lemma A.6 above shows that the influence function for the ANCOVA II estimator with $\mu_0(X)$ as the single covariate is

(33) $\psi = \frac{W_1}{\pi_1}(Y - \mu_1) - \frac{W_0}{\pi_0}(Y - \mu_0) - (W_1 - \pi_1)\,\frac{\mu_0(X) - \mu_0}{\pi_0 \pi_1}$

which is the same as the efficient influence function when $\mu_1(X) = \mu_0(X) + \tau$. □

Corollary A.6.2

Corollary A.6.1 also holds when the ANCOVA II estimator is replaced by the ANCOVA I estimator.

Proof

Theorem A.5 establishes that ANCOVA I is as efficient as ANCOVA II when $C[m(X), Y_0] = \xi_0 = \xi_1 = C[m(X), Y_1]$. A constant treatment effect means that $\mu_1(X) = \mu_0(X) + \tau$, and this ensures the equality of the covariances. □

The following lemma is required for the proof that follows it.

Lemma A.7

Let $f: \mathcal{X} \to \mathbb{R}$ be a bounded function on a compact set $\mathcal{X}$ and let $\hat{f}_n: \mathcal{X} \to \mathbb{R}$ be a sequence of uniformly bounded random functions such that $\|f(X) - \hat{f}_n(X)\|_{L^2} \to 0$. Let $X \in \mathcal{X}$ be a random variable independent of $\hat{f}_n$. Then $E_X[\hat{f}_n(X)] \overset{p}{\to} E[f(X)]$, $C_X[f(X), \hat{f}_n(X)] \overset{p}{\to} V[f(X)]$, and $V_X[\hat{f}_n(X)] \overset{p}{\to} V[f(X)]$.

Proof

$\hat{f}_n$ and X are independent, so let their joint distribution factor into $P_n$ and $P$. Now

$$\int \left(E_X[\hat{f}_n(X)] - E[f(X)]\right)^2 dP_n = \int \left(\int \hat{f}_n(X)\, dP - \int f(X)\, dP\right)^2 dP_n = \int \left(\int \left(\hat{f}_n(X) - f(X)\right) dP\right)^2 dP_n \leq \iint \left(\hat{f}_n(X) - f(X)\right)^2 dP\, dP_n \;\; \text{(Jensen's inequality)} \;\to 0$$

The final convergence holds by our assumption that $\|f(X) - \hat{f}_n(X)\|_{L^2} \to 0$. This shows $E_X[\hat{f}_n(X)] \overset{L^2}{\to} E[f(X)]$, and convergence in probability follows.

Taking advantage of the fact that $|f|, |\hat{f}_n| \leq b$ are bounded, we can make similar arguments to show that $E_X[f(X)\hat{f}_n(X)] \overset{p}{\to} E[f(X)^2]$ and $E_X[\hat{f}_n(X)^2] \overset{p}{\to} E[f(X)^2]$. Slutsky's theorem and the definitions of covariance and variance then imply $C_X[f(X), \hat{f}_n(X)] \overset{p}{\to} V[f(X)]$ and $V_X[\hat{f}_n(X)] \overset{p}{\to} V[f(X)]$ as desired. □

Corollary A.7.1

Let $V_X[\hat{f}_n(X)] > \epsilon > 0$. Under the conditions of the above lemma, $\left\| f(X) - \hat{f}_n(X)\,\frac{C_X[f(X), \hat{f}_n(X)]}{V_X[\hat{f}_n(X)]} \right\|_{L^2} \to 0$.

Proof

Let $B_n = \frac{C_X[f(X), \hat{f}_n(X)]}{V_X[\hat{f}_n(X)]}$. By the above lemma, our assumption that $V_X[\hat{f}_n(X)] > \epsilon > 0$, and Slutsky's theorem, $B_n \overset{p}{\to} 1$. Together with the uniform bound on $V_X[\hat{f}_n(X)]$ and Cauchy–Schwarz, this is also enough to ensure that $(1 - B_n) \overset{L^2}{\to} 0$.

Now note that $\left|f(x) - \hat{f}_n(x)B_n\right| \leq \left|f(x) - \hat{f}_n(x)\right| + b\left|1 - B_n\right|$ by the triangle inequality and the fact that $|\hat{f}_n(x)| < b$. Thus

$$E\left[\left(f(X) - \hat{f}_n(X)B_n\right)^2\right] \leq \underbrace{E\left[\left(f(X) - \hat{f}_n(X)\right)^2\right]}_{\to 0 \text{ (by assumption)}} + b^2\,\underbrace{E\left[(1 - B_n)^2\right]}_{\to 0 \text{ (shown above)}} + 2b\, E\left[\left|f(X) - \hat{f}_n(X)\right|\left|1 - B_n\right|\right] \to 0$$

where the cross term vanishes by Cauchy–Schwarz applied to the first two terms,

as desired. □

Theorem A.8

Presume X has compact support and there is a constant treatment effect: $\mu_1(X) = \mu_0(X) + \tau$ with $|\mu_0(x)| < b$ bounded. Let m(x) be a (random) function learned from the external data $(Y', X')_{n'}$ such that $|m(x)| < b$ is also bounded and $\|m(X) - \mu_0(X)\|_{L^2} \to 0$, so that the learned model approaches the truth in mean squared error as $n' \to \infty$. If the number of trial samples n grows in tandem with the size of the historical data $n'$ (i.e. $n = O(n')$), then the ANCOVA II analysis that uses the learned model m(X) in the role of X has the lowest possible asymptotic variance among all regular and asymptotically linear estimators with access to the covariates X.

Proof

Define our estimator of interest as the ANCOVA II estimator that uses the learned model m(X) in place of the covariates X if m(X) is not numerically constant up to some machine precision, and otherwise as the difference-in-means estimator. Denote this estimator $\hat{\tau}$ (omitting the II subscript for the duration of this proof). Define the “oracle” estimator as the equivalent estimator that uses the true conditional mean $\mu_0(X)$ instead of the estimate m(X), and denote this estimator $\hat{\tau}^*$. The oracle estimator is obviously infeasible in practice because $\mu_0(\cdot)$ is not known. Corollary A.6.1 proves that the oracle estimator is semiparametric efficient (i.e. has the lowest possible asymptotic variance among regular and asymptotically linear estimators). Thus, letting $\nu_*^2$ denote the optimal asymptotic variance, we have that $\sqrt{n}(\hat{\tau}^* - \tau) \overset{d}{\to} N(0, \nu_*^2)$. If we can show that $\sqrt{n}(\hat{\tau} - \hat{\tau}^*) \overset{p}{\to} 0$, then Slutsky's theorem and the delta method imply that $\hat{\tau}$ has the same asymptotic properties as $\hat{\tau}^*$, i.e. $\sqrt{n}(\hat{\tau} - \tau) \overset{d}{\to} N(0, \nu_*^2)$. In other words, since the oracle estimator is efficient with a known asymptotic variance, the feasible estimator is also efficient and has the same asymptotic variance because the two are asymptotically equivalent.

Showing $\sqrt{n}(\hat{\tau} - \hat{\tau}^*) \overset{p}{\to} 0$ requires an intermediate estimator that is asymptotically equivalent to $\hat{\tau}$. Using the assumption of the constant effect and Eq. (23) from Theorem A.4, we can show (with an application of the law of total variance) that the influence function for $\hat{\tau}$ using some fixed $m(\cdot)$ is $\psi = \psi_1 - \psi_0$ with

(34) $\psi_w = \frac{W_w}{\pi_w}(Y - \mu_w) - \frac{\tilde{W}_w}{\pi_w}\left(m(X) - E_X[m(X)]\right)\frac{C_X[m(X), \mu_0(X)]}{V_X[m(X)]}$

where $E_X[m(X)]$ denotes that the expectation (or variance or covariance) is taken only with respect to X, i.e. $m(\cdot)$ is considered fixed.

Let $\check{\tau} = \hat{E}[\psi] + \tau$ and let $\check{\tau}^* = \hat{E}[\psi^*] + \tau$, where $\psi^*$ is the influence function above with $\mu_0(\cdot)$ substituted for $m(\cdot)$. Note that $\hat{\tau}$ and $\check{\tau}$ share the same influence function, so we must have $\sqrt{n}(\hat{\tau} - \check{\tau}) \overset{p}{\to} 0$. Similarly, $\sqrt{n}(\hat{\tau}^* - \check{\tau}^*) \overset{p}{\to} 0$. Therefore if $\sqrt{n}(\check{\tau} - \check{\tau}^*) \overset{p}{\to} 0$, then we have $\sqrt{n}(\hat{\tau} - \hat{\tau}^*) \overset{p}{\to} 0$ as desired. This is useful because the estimator $\check{\tau}$ and its oracle counterpart $\check{\tau}^*$ are easier to work with.

To wit, consider the difference $\check{\tau} - \check{\tau}^* = \hat{E}\left[(\psi_1 - \psi_0) - (\psi_1^* - \psi_0^*)\right]$. So all we need to show the desired convergence $\sqrt{n}(\check{\tau} - \check{\tau}^*) \overset{p}{\to} 0$ is to show $\sqrt{n}\,\hat{E}[\psi_w - \psi_w^*] \overset{p}{\to} 0$. Expanding,

(35) $\hat{E}\left[\psi_w - \psi_w^*\right] = \frac{1}{n}\sum_i^n \frac{\tilde{W}_{w,i}}{\pi_w}\left[\left(\mu_0(X_i) - E_X[\mu_0(X)]\right)\frac{C_X[\mu_0(X), \mu_0(X)]}{V_X[\mu_0(X)]} - \left(m(X_i) - E_X[m(X)]\right)\frac{C_X[m(X), \mu_0(X)]}{V_X[m(X)]}\right] = \frac{1}{n}\sum_i^n \frac{\tilde{W}_{w,i}}{\pi_w}\left(\mu_0(X_i) - m(X_i)B\right) - \frac{1}{n}\sum_i^n \frac{\tilde{W}_{w,i}}{\pi_w}\left(\bar{\mu}_0 - \bar{m}B\right)$

where we’ve abbreviated $B = \frac{C_X[m(X), \mu_0(X)]}{V_X[m(X)]}$, $\bar{m} = E_X[m(X)]$, and $\bar{\mu}_0 = E_X[\mu_0(X)]$. Our plan is to show that both of these terms L²-converge to 0 at the $\sqrt{n}$ rate, so that they both converge in probability at that rate, as does their sum (which is what we want). To show L² convergence for the first term, we must consider the expression

(36) $E_{n'}\left[\left(\frac{1}{\sqrt{n}}\sum_i^n \frac{\tilde{W}_w}{\pi_w}\left(\mu_0(X) - m(X)B\right)\right)^2\right]$

and show that it converges to 0. Recalling that m itself is random (it depends on the external data $(X', Y')$) but independent of the trial data (X, W, Y), note that we can treat $m(\cdot)$ as if it were a fixed function and B as a fixed constant if we condition on the external data. After conditioning, the quantity inside the parentheses is IID and has mean zero, because $\mu_0(X) - m(X)B$ and $\tilde{W}_w$ are independent (by randomization) and because $E[\tilde{W}_w] = 0$. Therefore the quantity above is

(37) $E_{n'}\left[E\left[\left(\frac{1}{\sqrt{n}}\sum_i^n \frac{\tilde{W}_w}{\pi_w}\left(\mu_0(X) - m(X)B\right)\right)^2 \,\middle|\, X', Y'\right]\right] = E_{n'}\left[V\left[\frac{1}{\sqrt{n}}\sum_i^n \frac{\tilde{W}_w}{\pi_w}\left(\mu_0(X) - m(X)B\right) \,\middle|\, X', Y'\right]\right] = E_{n'}\left[\frac{n}{n}\,V\left[\frac{\tilde{W}_w}{\pi_w}\left(\mu_0(X) - m(X)B\right) \,\middle|\, X', Y'\right]\right] = \frac{1 - \pi_w}{\pi_w}\,E\left[\left(\mu_0(X) - m(X)B\right)^2\right]$

where we’ve used the fact that the summands are IID to pass the variance through the sum, effectively gaining the 1/n required to cancel the n. The same argument shows that the equivalent expression for the second term in Eq. (35) is $\frac{1 - \pi_w}{\pi_w}\,E\left[\left(\bar{\mu}_0 - \bar{m}B\right)^2\right]$ (note that $\bar{m}$ and B are random here).

To complete the proof we invoke Corollary A.7.1 in combination with our assumptions $|m(x)| < b$, $|\mu_0(x)| < b$, and $\|m(X) - \mu_0(X)\|_{L^2} \to 0$ to arrive at the fact that $\|m(X)B - \mu_0(X)\|_{L^2} \to 0$ and $\|\bar{m}B - \bar{\mu}_0\|_{L^2} \to 0$. The condition $V_X[\hat{f}_n(X)] > \epsilon > 0$ in Corollary A.7.1 is automatically satisfied because we only include the prognostic score in the regression if it has nonzero variance. Thus the expectations $\frac{1 - \pi_w}{\pi_w}E\left[\left(\mu_0(X) - m(X)B\right)^2\right]$ and $\frac{1 - \pi_w}{\pi_w}E\left[\left(\bar{\mu}_0 - \bar{m}B\right)^2\right]$ converge to 0 as desired. □

Corollary A.8.1

Theorem A.8 also holds for the ANCOVA I estimator.

Proof

In the case of a constant treatment effect ANCOVA I and ANCOVA II have the same asymptotic variance (Theorem A.5). The result follows immediately. □

Appendix B. Estimating $\sigma_w^2$ and $\rho_w$ for power calculations

One method for obtaining estimates of the marginal potential outcome variances ($\sigma_w^2$) and potential outcome–prognostic score correlations ($\rho_w$) is to use prior data, for example data from the placebo control arm of a previous trial performed on a similar population (separate from the data used to train the prognostic model). In this case we presume we have access to a vector $Y'' = [Y''_1 \ldots Y''_{n''}]$ of outcomes for these subjects and their corresponding prognostic scores $M'' = [M''_1 \ldots M''_{n''}]$, calculated by applying the prognostic model m to each subject's vector of baseline covariates $X''$, i.e. $M''_i = m(X''_i)$.

The control-arm marginal outcome variance $\sigma_0^2$ can be estimated with the usual estimator

$\hat{\sigma}_0^2 = \frac{1}{n'' - 1}\sum_i \left(Y''_i - \bar{Y}''\right)^2$

The correlation $\rho_0$ between M″ and Y″ can be estimated by

$\hat{\rho}_0 = \frac{\sum_i \left(Y''_i - \bar{Y}''\right)\left(M''_i - \bar{M}''\right)}{\sqrt{\sum_i \left(Y''_i - \bar{Y}''\right)^2 \sum_i \left(M''_i - \bar{M}''\right)^2}}$

which is the usual sample correlation coefficient. These values may be inflated ($\sigma_0^2$) or deflated ($\rho_0$) in order to provide more conservative estimates of power.

The corresponding values for the treatment arm can rarely be estimated from data because data on the experimental treatment is likely to be scarce or unavailable. It is therefore prudent to assume $\sigma_1^2 = \sigma_0^2$ and $\rho_1 = \rho_0$, the latter of which holds exactly if the effect of treatment is constant across the population. It may also be prudent (and conservative) to assume a slightly higher value for $\sigma_1^2$ and a slightly smaller value for $\rho_1$ relative to their control-arm counterparts in the absence of data to the contrary.
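The estimators above feed directly into a back-of-the-envelope sample size calculation. The sketch below uses synthetic "pilot" data (all names and settings are ours) and the simplification suggested in the text: equal allocation, $\sigma_1 = \sigma_0$, and $\rho_1 = \rho_0$, under which Eq. (24) implies the adjusted analysis needs a fraction $1 - \rho_0^2$ of the unadjusted sample size.

```python
import math
import random

random.seed(0)

# Hypothetical pilot data: control-arm outcomes Y'' and prognostic scores M''
# (in practice M''_i = m(X''_i) with m trained on separate historical data).
n_pp = 200
M_pp = [random.gauss(0, 1) for _ in range(n_pp)]
Y_pp = [m + random.gauss(0, 1) for m in M_pp]  # true correlation ~ 1/sqrt(2)

y_bar = sum(Y_pp) / n_pp
m_bar = sum(M_pp) / n_pp
s0_sq = sum((y - y_bar) ** 2 for y in Y_pp) / (n_pp - 1)  # sigma_0^2 estimate
rho_0 = (sum((y - y_bar) * (m - m_bar) for y, m in zip(Y_pp, M_pp))
         / math.sqrt(sum((y - y_bar) ** 2 for y in Y_pp)
                     * sum((m - m_bar) ** 2 for m in M_pp)))

# Under equal allocation with sigma_1 = sigma_0 and rho_1 = rho_0, the
# required sample size scales with the variance in Eq. (24), i.e. by a
# factor of (1 - rho_0^2) relative to the unadjusted analysis.
relative_n = 1 - rho_0 ** 2
```

A prognostic model explaining 10–30% of the outcome variance ($\rho_0^2$ between 0.1 and 0.3) thus corresponds to the 10–30% sample size reductions quoted in the abstract.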

Appendix C. Additional simulation results

Here we detail a full set of simulation results using additional specifications for the regression estimators (Figure 1). “Covariates” indicates whether the raw covariates were adjusted for. “Prognostic score” indicates whether any prognostic score was used, and, if so, whether it was estimated from a training dataset or whether the true value was used. “Interactions” specifies whether treatment × (covariates and/or prognostic score) interactions were used. “SE” indicates the standard deviation of the mean squared error.

Scenario Covariates Prognostic score Interaction MSE SE
Baseline False None True 7.64 × 10−2 1.08 × 10−3
Baseline False None False 7.64 × 10−2 1.08 × 10−3
Baseline False Estimated True 1.76 × 10−2 2.46 × 10−4
Baseline False Estimated False 1.75 × 10−2 2.45 × 10−4
Baseline False Oracle True 7.69 × 10−3 1.09 × 10−4
Baseline False Oracle False 7.69 × 10−3 1.09 × 10−4
Baseline True None True 5.07 × 10−2 7.18 × 10−4
Baseline True None False 5.04 × 10−2 7.14 × 10−4
Baseline True Estimated True 1.74 × 10−2 2.46 × 10−4
Baseline True Estimated False 1.73 × 10−2 2.44 × 10−4
Baseline True Oracle True 7.85 × 10−3 1.11 × 10−4
Baseline True Oracle False 7.85 × 10−3 1.11 × 10−4
Surrogate False None True 7.47 × 10−2 1.05 × 10−3
Surrogate False None False 7.47 × 10−2 1.05 × 10−3
Surrogate False Estimated True 4.05 × 10−2 5.69 × 10−4
Surrogate False Estimated False 4.03 × 10−2 5.66 × 10−4
Surrogate False Oracle True 8.25 × 10−3 1.18 × 10−4
Surrogate False Oracle False 8.24 × 10−3 1.18 × 10−4
Surrogate True None True 5.03 × 10−2 7.09 × 10−4
Surrogate True None False 5.00 × 10−2 7.04 × 10−4
Surrogate True Estimated True 3.75 × 10−2 5.27 × 10−4
Surrogate True Estimated False 3.72 × 10−2 5.23 × 10−4
Surrogate True Oracle True 8.41 × 10−3 1.20 × 10−4
Surrogate True Oracle False 8.41 × 10−3 1.20 × 10−4
Shifted False None True 7.65 × 10−2 1.10 × 10−3
Shifted False None False 7.65 × 10−2 1.10 × 10−3
Shifted False Estimated True 6.79 × 10−2 9.62 × 10−4
Shifted False Estimated False 6.79 × 10−2 9.62 × 10−4
Shifted False Oracle True 8.20 × 10−3 1.15 × 10−4
Shifted False Oracle False 8.20 × 10−3 1.15 × 10−4
Shifted True None True 5.03 × 10−2 7.11 × 10−4
Shifted True None False 5.00 × 10−2 7.05 × 10−4
Shifted True Estimated True 4.91 × 10−2 6.97 × 10−4
Shifted True Estimated False 4.86 × 10−2 6.90 × 10−4
Shifted True Oracle True 8.34 × 10−3 1.17 × 10−4
Shifted True Oracle False 8.34 × 10−3 1.17 × 10−4
Strong False None True 7.73 × 10−2 1.08 × 10−3
Strong False None False 7.73 × 10−2 1.08 × 10−3
Strong False Estimated True 1.85 × 10−2 2.65 × 10−4
Strong False Estimated False 1.85 × 10−2 2.64 × 10−4
Strong False Oracle True 8.16 × 10−3 1.16 × 10−4
Strong False Oracle False 8.16 × 10−3 1.16 × 10−4
Strong True None True 5.14 × 10−2 7.18 × 10−4
Strong True None False 5.11 × 10−2 7.13 × 10−4
Strong True Estimated True 1.84 × 10−2 2.62 × 10−4
Strong True Estimated False 1.82 × 10−2 2.59 × 10−4
Strong True Oracle True 8.33 × 10−3 1.18 × 10−4
Strong True Oracle False 8.32 × 10−3 1.18 × 10−4
Linear False None True 3.49 × 10−2 4.83 × 10−4
Linear False None False 3.49 × 10−2 4.83 × 10−4
Linear False Estimated True 9.64 × 10−3 1.38 × 10−4
Linear False Estimated False 9.64 × 10−3 1.38 × 10−4
Linear False Oracle True 8.20 × 10−3 1.16 × 10−4
Linear False Oracle False 8.20 × 10−3 1.16 × 10−4
Linear True None True 8.37 × 10−3 1.18 × 10−4
Linear True None False 8.37 × 10−3 1.18 × 10−4
Linear True Estimated True 8.39 × 10−3 1.19 × 10−4
Linear True Estimated False 8.39 × 10−3 1.19 × 10−4
Linear True Oracle True 8.37 × 10−3 1.18 × 10−4
Linear True Oracle False 8.37 × 10−3 1.18 × 10−4
Heterogeneous False None True 5.54 × 10−2 7.76 × 10−4
Heterogeneous False None False 5.54 × 10−2 7.76 × 10−4
Heterogeneous False Estimated True 2.30 × 10−2 3.23 × 10−4
Heterogeneous False Estimated False 2.32 × 10−2 3.25 × 10−4
Heterogeneous False Oracle True 2.29 × 10−2 3.20 × 10−4
Heterogeneous False Oracle False 2.32 × 10−2 3.24 × 10−4
Heterogeneous True None True 2.99 × 10−2 4.30 × 10−4
Heterogeneous True None False 2.98 × 10−2 4.29 × 10−4
Heterogeneous True Estimated True 2.13 × 10−2 3.01 × 10−4
Heterogeneous True Estimated False 2.19 × 10−2 3.08 × 10−4
Heterogeneous True Oracle True 1.89 × 10−2 2.69 × 10−4
Heterogeneous True Oracle False 1.98 × 10−2 2.81 × 10−4

Figure 1: Visualization of the simulation results presented in tabular form above.

Appendix D. Covariates in the empirical demonstration dataset

Table 4:

Baseline covariates in the DHA study and ADNI/CPAD historical training data.

Covariate Description
AChEI or memantine usage Whether a subject is using a class of symptomatic Alzheimer’s drugs
ADAS commands Assesses the subject’s ability to follow commands
ADAS comprehension Assesses the subject’s ability to understand spoken language
ADAS construction Assesses the subject’s ability to draw basic figures
ADAS ideational Assesses the subject’s ability to carry out a basic task
ADAS naming Assesses the subject’s ability to name common objects
ADAS orientation Assesses the subject’s knowledge of time and place
ADAS remember instructions Assesses the subject’s ability to remember test instructions
ADAS spoken language Assesses the subject’s ability to speak clearly
ADAS word finding Assesses the subject’s word finding in speech
ADAS word recall Assesses the subject’s ability to recall a list of words
ADAS word recognition Assesses the subject’s ability to remember and identify words
Age Subject age at baseline
ApoE e4 Allele count The number of ApoE e4 alleles a subject has (0, 1, or 2)
CDR community Assesses the subject’s engagement in community activities
CDR home and hobbies Assesses the subject’s engagement in home and personal activities
CDR judgement Assesses the subject’s judgement skills
CDR memory Assesses the subject’s memory
CDR orientation Assesses the subject’s knowledge of time and place
CDR personal care Assesses the subject’s ability to care for themselves
Diastolic blood pressure The diastolic blood pressure of a subject
Education (Years) The number of years of education of a subject
Heart rate The resting heart rate of a subject
Height The height of a subject
Indicator for clinical trial 1 if the subject is in an RCT, 0 if not
MMSE attention and calculation Assesses the subject’s attention and calculation skills
MMSE language Assesses the subject’s language skills
MMSE orientation Assesses the subject’s knowledge of place and time
MMSE recall Assesses the subject’s ability to remember prompts
MMSE registration Assesses the subject’s ability to repeat prompts
Region: Europe 1 if the subject lives in Europe, 0 otherwise
Region: Northern America 1 if the subject lives in the US or Canada, 0 otherwise
Region: Other 1 if the subject lives outside of Europe/US/Canada, 0 otherwise
Serious adverse events The number of serious adverse events reported
Sex 1 if female, 0 if male
Systolic blood pressure The systolic blood pressure of a subject
Weight The weight of a subject


Received: 2021-07-23
Revised: 2021-10-28
Accepted: 2021-11-28
Published Online: 2021-12-22

© 2021 Walter de Gruyter GmbH, Berlin/Boston
