

The relationship between hormonal contraceptive method use and sexually transmitted infections (STIs) is not well understood. Studies that implement routine screening for STIs among different contraceptive users, such as the ASPIRE HIV-1 prevention trial, can be useful for identifying potential risk factors for STIs. However, the complex nature of non-random data can make it challenging to estimate associations for potential risk factors. In particular, if screening for the disease is not random (i.e., it is driven by symptoms or other clinical indicators), estimates of association can suffer from bias, often referred to as informative sampling bias. Time-varying predictors and potential stratification variables can further contribute to difficulty in obtaining unbiased estimates. In this paper, we estimate the association between time-varying contraceptive use and STI acquisition, in the presence of informative sampling, by extending the work of Buzkova (2010). We use a two-step procedure to jointly model the non-random screening process and STI risk. In the first step, inverse intensity rate ratio (IIRR) weights are estimated. In the second step, a weighted proportional rate model is fit to estimate the IIRR-weighted hazard ratio. We apply the method to evaluate the relationship between hormonal contraception and risk of STIs among women participating in a biomedical HIV-1 prevention trial. We compare our results using the proposed weighted method to those generated using conventional approaches that do not account for potential informative sampling bias or do not use the full potential of the data. Using the IIRR-weighted approach, we found that depot medroxyprogesterone acetate users have a significantly decreased hazard of Trichomonas vaginalis acquisition compared to IUD users (hazard ratio: 0.44, 95% CI: (0.25, 0.83)), which is consistent with the literature. We did not find a significantly increased or decreased hazard of other STIs for hormonal contraceptive users compared to non-hormonal IUD users.



This paper establishes the relationship between the spreading dynamics of the COVID-19 pandemic in Morocco and the efficiency of the measures and actions taken by public authorities to contain it. The main objective is to predict the evolution of the COVID-19 pandemic in Morocco and to estimate the time needed for its disappearance.


For these reasons, we have highlighted the role of mathematical models in understanding the transmission chain of this virus as well as its future evolution. Then we used the SIR epidemiological model, which proves to be well suited to address this issue. It shows that identification of the key parameters of this pandemic, such as the probability of transmission, should help to adequately explain its behaviour and make it easier to predict its progress.


As a result, the measures and actions taken by the public authorities in Morocco made it possible to record a lower virus reproduction number than in many other countries.


So, in the case of Morocco, we were able to predict that the COVID-19 pandemic should disappear in a shorter time, and with fewer infected individuals, than in other countries.
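As a rough illustration of the SIR model mentioned in this abstract, the sketch below integrates the standard SIR equations with a simple forward Euler scheme. The parameter names (`beta` for the transmission rate, `gamma` for the recovery rate) and the numerical scheme are assumptions made for illustration; the abstract does not state how the authors parameterized or fitted their model.

```python
def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    s0, i0 and rec0 are population fractions (susceptible, infected,
    recovered) and should sum to 1. Returns one (s, i, r) tuple per day,
    including day 0.
    """
    s, i, r = s0, i0, rec0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt   # flow S -> I
            new_recoveries = gamma * i * dt      # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


def basic_reproduction_number(beta, gamma):
    """R0 for the standard SIR model: transmission rate over recovery rate."""
    return beta / gamma
```

In this sketch, an epidemic grows only when `basic_reproduction_number(beta, gamma)` exceeds 1, which is why measures that lower the transmission rate shorten the outbreak and reduce its peak.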


This manuscript extends the definition of the Absolute Standardized Mean Difference (ASMD) for binary exposure (M = 2) to cases with M > 2 exposure levels on multiply imputed data sets. The Maximal Maximized Standardized Difference (MMSD) and the Maximal Averaged Standardized Difference (MASD) are proposed. Missing data were introduced into the covariates of the simulated data at different percentages under the missing at random (MAR) assumption. We then investigate the performance of these two metric definitions using simulated full and imputed data sets. The performance of the MASD and the MMSD was validated by relating the balance metrics to estimation bias. The results show that there is an association between the balance metrics and bias. The proposed balance diagnostics therefore seem appropriate for assessing balance for the generalized propensity score (GPS) under multiple imputation.
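To make the idea of a multi-group balance metric concrete, the sketch below computes the usual two-group ASMD for one covariate, takes the maximum over all pairs of exposure groups, and then summarizes that maximum across imputed data sets by either averaging or maximizing. This is only one plausible reading: the abstract does not give the exact MMSD and MASD formulas, the function names are hypothetical, and the sketch is unweighted (no GPS weights are applied).

```python
from itertools import combinations
from statistics import mean, variance


def asmd(x_a, x_b):
    """Absolute standardized mean difference of one covariate between two groups."""
    pooled_sd = ((variance(x_a) + variance(x_b)) / 2) ** 0.5
    return abs(mean(x_a) - mean(x_b)) / pooled_sd


def max_pairwise_asmd(groups):
    """Largest ASMD over all pairs of exposure groups.

    `groups` maps an exposure level to the covariate values observed in it.
    """
    return max(asmd(groups[a], groups[b]) for a, b in combinations(groups, 2))


def summarize_over_imputations(imputed_groups, how="mean"):
    """Combine the per-imputation maxima by averaging ("mean") or maximizing ("max")."""
    values = [max_pairwise_asmd(groups) for groups in imputed_groups]
    return max(values) if how == "max" else mean(values)
```

Averaging across imputations smooths out imputation noise, whereas taking the maximum gives a more conservative diagnostic; which behaviour corresponds to which of the proposed metrics is not specified here.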


When studying the causal effect of x on y, researchers may conduct regression and report a confidence interval for the slope coefficient βx. This common confidence interval provides an assessment of uncertainty from sampling error, but it does not assess uncertainty from confounding. An intervention on x may produce a response in y that is unexpected, and such misinterpretation of the slope happens when there are confounding factors w. When w are measured we may conduct multiple regression, but when w are unmeasured it is common practice to include a precautionary statement when reporting the confidence interval, warning against unwarranted causal interpretation. If the goal is robust causal interpretation, then we can do something more informative. Uncertainty in the specification of three confounding parameters can be propagated through an equation to produce a confounding interval. Here, we develop supporting mathematical theory and describe an example application. Our proposed methodology applies well to studies of a continuous response or rare outcome. It is a general method for quantifying error from model uncertainty. Whereas confidence intervals are used to assess uncertainty from unmeasured individuals, confounding intervals can be used to assess uncertainty from unmeasured attributes.


Investigating the joint exposure to several risk factors is becoming a key component of epidemiologic studies. Individuals are exposed to multiple factors, often simultaneously, and evaluating patterns of exposures and high-dimensional interactions may allow for a better understanding of health risks at the individual level. When jointly evaluating high-dimensional exposures, common statistical methods should be integrated with machine learning techniques that may better account for complex settings. Among these, Logic regression was developed to investigate a large number of binary exposures as they relate to a given outcome. This method may be of interest in several public health settings, yet has never been presented to an epidemiologic audience. In this paper, we review and discuss Logic regression as a potential tool for epidemiological studies, using an example of occupation history (68 binary exposures of primary occupations) and amyotrophic lateral sclerosis in a population-based Danish cohort. Logic regression identifies predictors that are Boolean combinations of the original (binary) exposures, fully operating within the regression framework of interest (e.g., linear, logistic). Combinations of exposures are graphically presented as Logic trees, and techniques for selecting the best Logic model are available and of high importance. While highlighting several advantages of the method, we also discuss specific drawbacks and practical issues that should be considered when using Logic regression in population-based studies. With this paper, we encourage researchers to explore the use of machine learning techniques when evaluating high-dimensional epidemiologic data, and we advocate the need for further methodological work in the area.
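The notion of a predictor built as a Boolean combination of binary exposures can be illustrated with a small evaluator for a Logic tree. The sketch below covers only the evaluation of a given tree into a single 0/1 predictor; it does not implement the search over candidate trees or the model selection that Logic regression performs. The nested-tuple representation and the exposure names in the usage note are hypothetical and not taken from the paper.

```python
def evaluate_logic_tree(tree, exposures):
    """Evaluate a Boolean combination of binary exposures for one individual.

    `tree` is either an exposure name (a leaf) or a tuple such as
    ("and", left, right), ("or", left, right) or ("not", child).
    `exposures` maps exposure names to 0/1 values.
    """
    if isinstance(tree, str):
        return bool(exposures[tree])
    operator, *children = tree
    if operator == "not":
        return not evaluate_logic_tree(children[0], exposures)
    results = [evaluate_logic_tree(child, exposures) for child in children]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")
```

For example, with hypothetical exposures, the tree `("and", "farming", ("not", "welding"))` yields a predictor that is true for individuals who worked in farming but never in welding; that single predictor would then enter the chosen regression model as an ordinary binary covariate.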


Clinic-based cohort studies enroll patients on first being admitted to the clinic, and follow them as part of usual care, with interest being in the marginal mean of the outcome process. As the required frequency of follow-up varies among patients, these studies often feature irregular visit times, with no two patients sharing a visit time. Inverse-intensity weighting has been developed to handle this; however, it requires that the visit process be conditionally independent of the outcome given the observed history. When patients schedule visits in response to changes in their health (for example, a disease flare), the conditional independence assumption is no longer plausible, leading to biased results. We suggest additional information that can be collected to ensure that conditional independence holds, and examine how this might be used in the analysis. This allows clinic-based cohort studies to be used to determine longitudinal outcomes without incurring bias due to irregular follow-up.


Mediation analysis is popular for examining the extent to which the effect of an exposure on an outcome operates through an intermediate variable. When the exposure is subject to misclassification, the estimated effects can be severely biased. In this paper, for the case of a binary mediator, we first study the bias in traditional direct and indirect effect estimates in the presence of conditional non-differential misclassification of a binary exposure. We show that in the absence of interaction, the misclassification of the exposure will bias the direct effect towards the null but can bias the indirect effect in either direction. We then develop an EM algorithm approach to correcting for the misclassification, and conduct simulation studies to assess the performance of the correction approach. Finally, we apply the approach to National Center for Health Statistics birth certificate data to study the effect of smoking status on preterm birth mediated through pre-eclampsia.


Here, we address the issue of experimental design for animal and crop disease transmission experiments, where the goal is to identify some characteristic of the underlying infectious disease system via a mechanistic disease transmission model. Design for such non-linear models is complicated by the fact that the optimal design depends upon the parameters of the model, so the problem is set in a simulation-based Bayesian framework using informative priors. This involves simulating the experiment over a given design repeatedly using parameter values drawn from the prior, calculating a Monte Carlo estimate of the utility function from those simulations for the given design, and then repeating this over the design space in order to find an optimal design or set of designs.

Here we consider two agricultural scenarios. The first involves an experiment to characterize the effectiveness of a vaccine-based treatment on an animal disease in an in-barn setting. The design question of interest is on which days to make observations if we can observe the disease status of all animals on only two days. The second envisages a trial being carried out to estimate the spatio-temporal transmission dynamics of a crop disease. The design question considered here is how far apart to space the plants from each other to best capture those dynamics. In the in-barn animal experiment, we see that for the prior scenarios considered, observations taken very close to the beginning of the experiment tend to lead to designs with the highest values of our chosen utility functions. In the crop trial, we see that over the prior scenarios considered, spacing between plants is important for experimental performance, and that placing plants too close together is particularly deleterious to that performance.
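The simulation-based design search described in this abstract (draw parameters from the prior, simulate the experiment under a candidate design, average a utility over the simulations, and repeat over the design space) can be sketched generically as follows. The function names and the callable interfaces (`sample_prior`, `simulate`, `utility`) are assumptions for illustration; the abstract does not specify the disease models, priors, or utility functions the authors used.

```python
import random


def monte_carlo_utility(design, sample_prior, simulate, utility, n_draws, rng):
    """Monte Carlo estimate of the expected utility of a single design."""
    total = 0.0
    for _ in range(n_draws):
        theta = sample_prior(rng)             # parameter draw from the prior
        data = simulate(theta, design, rng)   # simulated experiment outcome
        total += utility(theta, data, design)
    return total / n_draws


def best_design(designs, sample_prior, simulate, utility, n_draws=500, seed=0):
    """Score every candidate design and return the best one with all scores."""
    rng = random.Random(seed)
    scores = {
        design: monte_carlo_utility(design, sample_prior, simulate,
                                    utility, n_draws, rng)
        for design in designs
    }
    return max(scores, key=scores.get), scores
```

In the in-barn scenario, a candidate design could be a pair of observation days; in the crop scenario, a plant spacing. Because each score is a noisy estimate, designs with similar scores are better reported as a set of near-optimal designs than as a single winner.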