Positivity is a requirement for the identification of both statistical and causal effect parameters. As a heuristic example, if a treatment or intervention of interest is absolutely never given to a particular subgroup of patients, perhaps because it is against medical guidelines or the law, then the average causal effect (of treatment) parameter for such patients is nonsensical. Mathematically, a violation of positivity corresponds with a parameter whose efficient influence curve is unbounded. For the parameter considered here it is easy to see that a positivity violation means that we get a comprised of at least one fraction with a 0 in the denominator, which means .

In practice, we often encounter data that exhibits near-positivity violations. For example, in the ATRIA-1 study, it was uncommon for persons with a history of bleeding events and falls to be prescribed warfarin. The case where treatment is particularly rare for certain subgroups is generally known as a practical positivity violation. For estimators whose influence curves have factors that involve the inverse probability of treatment, practical positivity violations are a concern. Because such estimators behave statistically like the empirical mean of their respective influence curves, a very small probability of treatment in the denominator can lead to a very large outlying value. The effect of this type of outlier may be likened to the effect of an outlier in the typical estimation of an empirical mean. The TMLE, IPCW, and DRIPCW are all estimators of this type, and a careful investigation of their respective influence curves can give insight into their behaviors under practical positivity violations.
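The outlier effect described above can be illustrated with a minimal sketch. The toy data, the function name `ipw_terms`, and the simple form A·Y/g are assumptions made for illustration only; they are not the influence curves of the estimators studied here.

```python
def ipw_terms(treated, outcome, g):
    """Per-subject inverse-probability-weighted terms A * Y / g.

    `g` holds each subject's (assumed known) probability of treatment.
    """
    return [a * y / p for a, y, p in zip(treated, outcome, g)]


# Hypothetical toy data: five treated subjects, all with outcome 1.
treated = [1, 1, 1, 1, 1]
outcome = [1, 1, 1, 1, 1]

# Scenario 1: treatment is equally likely for everyone.
g_ok = [0.5, 0.5, 0.5, 0.5, 0.5]
# Scenario 2: treatment is very rare for the last subject.
g_rare = [0.5, 0.5, 0.5, 0.5, 0.001]

terms_ok = ipw_terms(treated, outcome, g_ok)
terms_rare = ipw_terms(treated, outcome, g_rare)

# The empirical mean is dominated by the single large term 1 / 0.001.
mean_ok = sum(terms_ok) / len(terms_ok)
mean_rare = sum(terms_rare) / len(terms_rare)
print(mean_ok, mean_rare)
```

A single subject with a treatment probability of 0.001 contributes a term of about 1000, which moves the empirical mean in the same way an extreme outlier would.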

Given the true , the TMLE has influence curve

where is the asymptotic limit of our estimator . When is estimated with then the influence curve of the TMLE is the above influence curve minus its projection on the tangent space [8] of the model for . Confidence intervals may be based on the empirical variance of . The DRIPCW influence curve is similar except that it is based on the initial estimator of the *Q*-factors of the likelihood, i.e. .

Given the true , the IPCW estimator influence curve is

where .

Note that the KM can be expressed as an IPCW where the estimator of the *g*-factors of the likelihood does not depend on the covariates. Its influence curve follows the same form as that of the IPCW above.

Under regularity conditions, including when is bounded, the TMLE is efficient when and both converge to true and , and remains consistent if one of them is correct. Despite this, its performance relative to the IPCW under positivity violations is not obvious. The reason is that the TMLE influence curve involves a sum of fractions with in the denominator, whereas the IPCW influence curve has only one such fraction. In the TMLE, the numerator of each fraction is always a number in the interval [–1,1], whereas in the IPCW the numerator is exactly equal to 0 or 1.

The simulation study presented here investigates the performance of each estimator under positivity violations. Violations were introduced by scaling the linear component of the treatment assignment mechanism by constant factors of 1, 10, and 20, corresponding, respectively, to “no”, “substantial”, and “extreme” positivity violations. In the “no” positivity violation scenario, no truncation of the *g*-factors was required. In the “substantial” positivity violation scenario, 28% of the treatment group, for whom , required truncation, and 4% of the control group, for whom , required truncation. In the “extreme” positivity violation scenario, 41% of the treatment group and 12% of the controls required truncation.
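How scaling the linear component pushes treatment probabilities toward 0 and 1 can be sketched as follows. This is not the data-generating process used in the study: the single uniform covariate, the unit coefficient, and the names `expit`, `truncate`, and `fraction_truncated` are assumptions chosen for illustration. The bounds [0.01, 0.99] and the scale factors 1, 10, and 20 are those stated in the text.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def truncate(p, lower=0.01, upper=0.99):
    """Truncate a probability to the interval [lower, upper]."""
    return min(max(p, lower), upper)


def fraction_truncated(scale, w_values, lower=0.01, upper=0.99):
    """Fraction of subjects whose treatment probability needs truncation.

    The treatment mechanism is assumed to be expit(scale * w), where w is
    the (hypothetical) linear component for each subject.
    """
    g = [expit(scale * w) for w in w_values]
    n_out = sum(1 for p in g if p < lower or p > upper)
    return n_out / len(g)


random.seed(0)
# Hypothetical linear component: one covariate, uniform on [-1, 1].
w_values = [random.uniform(-1.0, 1.0) for _ in range(10000)]

fractions = {s: fraction_truncated(s, w_values) for s in (1, 10, 20)}
for s in (1, 10, 20):
    print(s, fractions[s])
```

With this toy covariate, a scale of 1 keeps every probability inside the bounds, while larger scales move a growing share of subjects outside them. The resulting fractions are specific to this sketch and are not expected to match the percentages reported above.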

For the TMLE, IPCW, and DRIPCW the treatment assignment and censoring mechanisms were estimated with consistent parametric logistic regressions. To reflect common practice, the estimated *g*-factors at every time point were truncated to the interval [0.01, 0.99]. To carry out the TMLE, MLE, and DRIPCW, we first stratified our data set according to each treatment intervention, and then fit initial estimators of the intervention-specific *Q*-factors at every time point with Super Learners [10–14]. In brief, the Super Learner at each time point is a convex weighted average of the predictions from a library of five candidate estimators. The convex weights are estimated from the data to minimize the expectation of the 5-fold cross-validated negative Bernoulli log-likelihood. In the simulations here, the library of candidate estimators included (1) the unconditional mean; (2) a logistic regression for continuous outcomes in the interval [0,1]; (3) ordinary least squares regression; (4) a neural network (3-layer perceptron) with two hidden units; and (5) a recursive partitioning decision tree.
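The convex-weighting step can be sketched in a deliberately simplified form. The sketch below assumes only two candidates, takes their predictions to be cross-validated (held-out) already, and uses a grid search rather than a proper constrained optimizer. The function names and toy data are hypothetical, and this is not the Super Learner implementation used in the study.

```python
import math


def neg_bernoulli_loglik(y, p, eps=1e-12):
    """Negative Bernoulli log-likelihood for an outcome y in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def convex_weight_grid(y, pred_a, pred_b, steps=100):
    """Grid search for alpha in [0, 1] minimizing the mean loss of
    alpha * pred_a + (1 - alpha) * pred_b.

    pred_a and pred_b are assumed to be held-out predictions.
    """
    best_alpha, best_loss = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        loss = sum(
            neg_bernoulli_loglik(yi, alpha * a + (1.0 - alpha) * b)
            for yi, a, b in zip(y, pred_a, pred_b)
        ) / len(y)
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha, best_loss


# Hypothetical held-out outcomes and predictions from two candidates.
y = [1, 0, 1, 0]
pred_a = [0.9, 0.1, 0.9, 0.1]  # an informative candidate
pred_b = [0.5, 0.5, 0.5, 0.5]  # the unconditional mean

best_alpha, best_loss = convex_weight_grid(y, pred_a, pred_b)
print(best_alpha, best_loss)
```

In this toy case all weight goes to the informative candidate. With a library of five candidates, the same loss would be minimized over a five-dimensional weight vector constrained to be nonnegative and to sum to one.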

Table 1 Simulation: performance under positivity violations

Results are shown in Table 1. In the absence of positivity violations, the KM is heavily biased because both the treatment and censoring probabilities depend on covariate values. The KM also had the highest mean squared error. The MLE substitution estimator performed somewhat better, but was still more biased and had higher standard error than the remaining estimators. The IPCW, DRIPCW, and TMLE had the lowest mean squared errors. The biases of the TMLE and DRIPCW were higher than that of the IPCW, though their magnitudes were negligible with respect to statistical inference, and 95% confidence intervals based on their respective influence curves had reasonably good coverage probabilities.

The bias and mean squared error of the IPCW increased with higher levels of positivity violation. The DRIPCW and TMLE performed better than the IPCW under extreme positivity violations, though the best estimator with respect to overall mean squared error in this scenario was the MLE. Interestingly, the performance of the MLE, which does not rely on estimation of the *g*-factors of the likelihood, also suffered with increasing positivity violations.
