
# Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido

Online ISSN: 1544-6115
Volume 14, Issue 1

# Bayesian mixed-effects model for the analysis of a series of FRAP images

Martina Feilke
• Department of Statistics, Ludwig-Maximilians-University Munich, Ludwigstr. 33, 80539 Munich, Germany
/ Katrin Schneider
• Department of Biology and Center for Integrated Protein Science, Ludwig Maximilians University Munich, 82152 Planegg-Martinsried, Germany
/ Volker J. Schmid
• Corresponding author
• Department of Statistics, Ludwig-Maximilians-University Munich, Ludwigstr. 33, 80539 Munich, Germany
Published Online: 2014-12-13 | DOI: https://doi.org/10.1515/sagmb-2014-0013

## Abstract

The binding behavior of molecules in nuclei of living cells can be studied through the analysis of images from fluorescence recovery after photobleaching experiments. However, there is still a lack of methodology for the statistical evaluation of FRAP data, especially for the joint analysis of multiple dynamic images. We propose a hierarchical Bayesian nonlinear model with mixed-effect priors based on local compartment models in order to obtain joint parameter estimates for all nuclei as well as to account for the heterogeneity of the nuclei population. We apply our method to a series of FRAP experiments of DNA methyltransferase 1 tagged to green fluorescent protein expressed in a somatic mouse cell line and compare the results to the application of three different fixed-effects models to the same series of FRAP experiments. With the proposed model, we get estimates of the off-rates of the interactions of the molecules under study together with credible intervals, and additionally gain information about the variability between nuclei. The proposed model is superior to and more robust than the tested fixed-effects models. Therefore, it can be used for the joint analysis of data from FRAP experiments on various similar nuclei.

This article offers supplementary material which is provided at the end of the article.

## 1 Introduction

Fluorescence recovery after photobleaching (FRAP) is an imaging technique to investigate the binding behavior of molecules inside organisms, cells or cellular sub-compartments in vivo (Meyvis et al., 1999; Reits and Neefjes, 2001; Sprague et al., 2004). To analyze, for example, the dynamic properties of proteins of interest within the cell nucleus, the proteins are genetically tagged to a fluorescent protein [e.g., green fluorescent protein (GFP)] and expressed in cells of interest. A part of the molecules in the cell nucleus is bleached by a focused laser beam and the recovery in the bleached part of the nucleus is observed by capturing images of the nucleus in predefined time intervals (McNally, 2008). Typically, such analyses are done on several similar nuclei and the resulting rate estimates are summarized afterwards. However, the results often differ between nuclei. This variability is not only due to random observation noise and the randomness of the underlying diffusion process, but is also caused by cell-to-cell variation or the cellular status of the examined cells. We therefore propose to analyze all nuclei together and to account for the variance between nuclei by using mixed-effects models.

To date, when data from FRAP experiments for the same molecule on multiple similar cell nuclei are analyzed with a mathematical model, either the data of each recovery curve are analyzed separately and the results for all cell nuclei are regarded together (e.g., Schneider et al., 2013), or the data of the different cell nuclei are pooled, averaged and then analyzed (e.g., McNally, 2008). In the second case, the recovery curves of the cell nuclei are averaged to obtain a smooth curve that can then be analyzed by the same mathematical model that is usually used for the analysis of one recovery curve. It is, however, vital that only data of comparable fluorescence intensities are averaged (Sprague et al., 2004).

Random effects are frequently used in linear models for longitudinal data. They account for the fact that subjects are sampled randomly from a heterogeneous population (Pinheiro and Bates, 2000). Typically, random effects are combined with fixed effects, i.e., the usual effects in a linear model, resulting in mixed-effects models. Mixed-effects models are used in many applications including agriculture, pharmacokinetics, and geophysics (Pinheiro and Bates, 2000), as well as clinical trials (Brown and Prescott, 1999). Mixed-effects models describe the relationships between a response variable and covariates that are grouped by one or several factors (Pinheiro and Bates, 2000). In our approach, fixed effects are parameters that are associated with the recovery curves of all cell nuclei, while random effects are parameters associated with the recovery curves of the individual cell nuclei.

The aim of a FRAP experiment is to infer the binding behavior of the unbleached molecules in the cell nucleus from their speed of movement. Because the bleached and unbleached molecules are assumed to behave identically, this also yields the binding behavior of all molecules of interest in the cell nucleus, bleached and unbleached. See Sprague and McNally (2005) for more information on FRAP experiments. In this paper, we concentrate on half-nucleus FRAP [as opposed to circle FRAP or strip FRAP (Sprague et al., 2004; Mueller et al., 2008)], which should cover representative fractions of heterogeneously distributed binding sites in all cell cycle stages (Schneider et al., 2013). An example of such data is given in Figure 1. In the first post-bleach image (second image from left, after 0.15 s), it is apparent that one half of the cell nucleus has been bleached. The recovery of fluorescence in this half can be tracked over time in the subsequent images.

Figure 1

Fluorescence recovery after photobleaching one half of the nucleus of a mouse C2C12 cell expressing GFP-Dnmt1 in late S phase. Images of a cell nucleus in a FRAP experiment: In the prebleach image, the complete cell nucleus is visible because all molecules are fluorescent. In the first postbleach image (acquired after 0.15 s), it is obvious that one half of the nucleus has been bleached. The subsequent images show the recovery of the fluorescence in the nucleus after 5, 10, 20 and 50 s.

We propose a Bayesian nonlinear regression with mixed-effect priors for the simultaneous analysis of all recovery curves resulting from a series of FRAP experiments. In the following section, the data used in our analyses is described and the compartment model as well as the differential equations associated with it are introduced. Thereafter, the model equation, which is based on the solution to the differential equations, is presented. In the subsequent section, the Bayesian nonlinear mixed-effects model, which consists of the data model, the prior model and the hyper priors, is described. Then, the parameter estimation procedure is presented, and the three different fixed-effects models with which the mixed-effects model is compared, are introduced, together with the information criterion used for the comparison. In the Results section, we present the results from the application of the proposed mixed-effects model to a series of FRAP experiments of GFP-tagged DNA methyltransferase 1 (GFP-Dnmt1) expressed in a somatic mouse cell line. Moreover, the comparison between the mixed-effects model and the fixed-effects models is done. The paper ends with a discussion of the proposed approach.

## 2.1 Data

In this paper, we use FRAP data sets of GFP-Dnmt1 expressed in mouse C2C12 myoblast cells (Schneider et al., 2013), which were obtained from multiple cell nuclei and can therefore be utilized to illustrate how our nonlinear regression model can be used to fit all available data at once. DNA methylation at position 5 of cytosines within CpG dinucleotide sequences is an important biochemical process for the stable epigenetic gene silencing in vertebrates (Bird, 2002; Spada et al., 2006). The maintenance methyltransferase Dnmt1 reestablishes methylation of hemi-methylated CpG sites generated during DNA replication in S phase and thus ensures propagation of genomic methylation patterns over many cell divisions.

In order to study the cell cycle dependent binding behavior of wild type Dnmt1, we analyzed data from cells in different cell cycle stages as identified by the nuclear distribution pattern of GFP-Dnmt1 (Schneider et al., 2013): 12 cells with diffuse nuclear distribution (mostly G1 phase and possibly also late G2 phase), 26 cells in early S phase with Dnmt1 association at early replication foci, and 11 cells in late S phase with Dnmt1 associating with late replicating heterochromatin clusters. For each cell, the concentration of unbleached GFP-Dnmt1 in the bleached half of the cell nucleus was documented every 0.15 s up to 779 times after the bleaching. For the cells with diffuse nuclear distribution, the concentration was measured 778 times for 10 cells. For the two remaining cells with diffuse nuclear distribution, 390 and 480 measurements, respectively, were available. 778 measurements were available for 24 cells in early S phase. For the other two cells in early S phase, 774 and 754 measurements, respectively, were available. For the cells in late S phase, 779 measurements were available for 10 cells, whereas 777 measurements were available for the eleventh cell. The original FRAP data has been normalized by a triple normalization procedure, see Dargatz (2010), Schneider et al. (2013).

The major part of the FRAP data analyzed in this paper has previously been published and analyzed (Schneider et al., 2013). The goal of the experiment was to identify the contribution of two different kinds of interactions in which Dnmt1 is involved in different cell cycle phases. The interactions can be attributed to two subdomains of Dnmt1: first, the proliferating cell nuclear antigen (PCNA)-binding domain (Schermelleh et al., 2007; Schneider et al., 2013), and second, the targeting sequence domain, which targets Dnmt1 to the replication sites in S phase (Schneider et al., 2013). Schneider et al. (2013) use the term “mobility classes” instead of the term “binding partners,” because interactions with similar on- and off-rates cannot be distinguished (Schneider, 2009) and hence form one mobility class (MC). Moreover, processes like anomalous diffusion, which are not related to binding, can also be represented by an MC.

## 2.2 Nonlinear recovery model

The movement of a molecule of interest in a cell nucleus is influenced by diffusion and by the interactions, including binding reactions, in which the molecule is involved (Mueller et al., 2010; Hemmerich et al., 2011; van Royen et al., 2011; Mazza et al., 2012). It is possible to model this process by using the full reaction-diffusion equations (Carrero et al., 2004; Sprague et al., 2004; Beaudouin et al., 2006). As we strive for an analytical solution of the equations describing the movement of the molecule of interest, we use a simplification of the full reaction-diffusion equations. Usually, one of the following three simplifications is employed: the pure-diffusion scenario, the effective diffusion scenario or the reaction dominant scenario (Sprague et al., 2004). A pure-diffusion scenario (Sprague et al., 2004) is present when most of the fluorescent molecules are free and interactions can be ignored. An effective diffusion scenario (Sprague et al., 2004; Beaudouin et al., 2006; Mueller et al., 2008; van Royen et al., 2009) occurs “when the reaction process is much faster than diffusion” (Sprague et al., 2004). A reaction dominant scenario is present when diffusion is very fast compared to the timescale of the image acquisition and to the reaction process (Sprague et al., 2004).

The interactions in which Dnmt1 is involved are described by on- and off-rates. In Schneider et al. (2013), where a correction value for diffusion was used, it was found that, in S phase, Dnmt1 is involved in interactions with relatively small off-rates. In the case of binding reactions, this means that the molecules of interest have a relatively long residence time (about 10–20 s) at their binding sites. We do not have sufficient information about the magnitude of the on-rate. For these reasons, and because we aim for an analytical solution to the ordinary differential equations describing the movement of Dnmt1, we assume a reaction dominant scenario for our data.

In a reaction dominant FRAP scenario, diffusion is very fast in comparison to reaction processes and the time scale of the FRAP measurement (Bulinski et al., 2001; Coscoy et al., 2002; Dundr et al., 2002) and the recovery curve in the bleached part of the cell nucleus can be modeled using a nonlinear regression model (Sprague et al., 2004).

Here, we regard cases with two or three MCs (Schneider et al., 2013). In all considered cell cycle phases, a MC with a very long residence time compared to the time of image acquisition is indicated (Schneider et al., 2013). For this MC, we estimate only one parameter, and it is later also referred to as “immobile fraction.” In cells with diffuse localization and in early S phase, one additional MC has been identified. For the late S phase, two additional MCs with different off-rates were found.

The binding sites to which the molecules of interest bind are assumed to be part of large complexes, which are relatively immobile on the time scale of the FRAP measurement and the molecular movement (Carrero et al., 2004; Sprague et al., 2004). A compartment model with two or three compartments (Figure 2; the immobile fraction is ignored in this representation) is used to describe the change of the concentration of unbleached molecules in the bleached part of the cell nucleus. In a compartment model with two compartments, the molecules can be either free or bound. Exchange between the compartment of the free and the compartment of the bound molecules occurs with rates $b_1^{\mathrm{on}*}$ and $b_1^{\mathrm{off}}$. In a compartment model with three compartments, the molecules can be either free or bound in one of two discriminable binding states. Exchange between the compartment of the free molecules and the compartments of the bound molecules occurs with rates $b_1^{\mathrm{on}*}$ and $b_1^{\mathrm{off}}$, and $b_2^{\mathrm{on}*}$ and $b_2^{\mathrm{off}}$, respectively. A similar procedure based on the reaction equation of a binding interaction was proposed by Sprague et al. (2004).

Figure 2

(A) Compartment model with two compartments and (B) compartment model with three compartments.

The on- and off-rates of the binding reaction are denoted by $b_k^{\mathrm{on}*}$ and $b_k^{\mathrm{off}}$, $k=0,\ldots,K$. As stated in Sprague et al. (2004), $b_k^{\mathrm{on}*}$ is actually a pseudo-on-rate. It is the product of the actual on-rate $b_k^{\mathrm{on}}$ and the concentration of vacant binding sites belonging to MC $k$. It is constant during the entire recovery process, because we assume that the biological system is in equilibrium before the bleaching and because bleaching does not affect the number of vacant binding sites (Sprague et al., 2004).

Let $f(t)=[\mathrm{Free}](t)$ denote the concentration of the free molecules and $a_k(t)=[\mathrm{Bound}_k](t)$ the concentration of the bound molecules in MC $k$ at time $t$. We can describe the change of the concentration of the free and bound molecules based on the compartment model by the two differential equations

$\frac{d}{dt}f(t)=\sum_{k=0}^{K}\left(-b_k^{\mathrm{on}*}f(t)+b_k^{\mathrm{off}}a_k(t)\right)+D_f\nabla^2 f(t), \qquad (1)$

with $\nabla^2$ the Laplacian operator and $D_f$ the diffusion coefficient for free proteins, and

$\frac{d}{dt}a_k(t)=b_k^{\mathrm{on}*}f(t)-b_k^{\mathrm{off}}a_k(t). \qquad (2)$

The molecules in the cell nucleus are in equilibrium before the bleaching. In a diffusion-uncoupled FRAP scenario, the free molecules are moreover assumed to be in equilibrium again immediately after the bleaching. Therefore $f(t)=f_{eq}$, a constant, and equation (2) can be written as

$\frac{d}{dt}a_k(t)=b_k^{\mathrm{on}*}f_{eq}-b_k^{\mathrm{off}}a_k(t). \qquad (3)$

Moreover, we do not have to model the change of the concentration of the free molecules. It suffices to model the change of concentration of the bound molecules, which means that equation (1) can be ignored.

With the initial condition $a_k(0)=0$, which means that at time $t=0$ (the time of the bleaching) the concentration of unbleached bound molecules in MC $k$ in the bleached area equals zero, the solution of equation (3) is

$a_k(t)=\frac{b_k^{\mathrm{on}*}f_{eq}}{b_k^{\mathrm{off}}}-\frac{b_k^{\mathrm{on}*}f_{eq}}{b_k^{\mathrm{off}}}\exp\left(-b_k^{\mathrm{off}}t\right). \qquad (4)$

As the system is in equilibrium before bleaching, we have $\frac{d}{dt}f(t)=0$, $\frac{d}{dt}a_k(t)=0$ and constant steady-state intensities $f_{eq}$, $a_{k,eq}$. Together with equation (3) we get

$a_{k,eq}=\frac{b_k^{\mathrm{on}*}f_{eq}}{b_k^{\mathrm{off}}}, \qquad (5)$

and can therefore write equation (4) as

$a_k(t)=a_{k,eq}\left(1-\exp\left(-b_k^{\mathrm{off}}t\right)\right). \qquad (6)$
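The closed form in equation (6) can be checked numerically against equation (3). The following is a minimal sketch, not the authors' code: it integrates equation (3) with explicit Euler steps and compares the result to equation (6). The parameter values and the function names `bound_closed_form` and `bound_euler` are assumptions chosen only for illustration.

```python
import math

def bound_closed_form(t, a_eq, b_off):
    """Equation (6): a_k(t) = a_k,eq * (1 - exp(-b_off * t))."""
    return a_eq * (1.0 - math.exp(-b_off * t))

def bound_euler(t_end, b_on_star, f_eq, b_off, dt=1e-3):
    """Integrate equation (3) with explicit Euler steps, starting at a_k(0) = 0."""
    a = 0.0
    for _ in range(int(round(t_end / dt))):
        a += dt * (b_on_star * f_eq - b_off * a)
    return a

# Hypothetical values, not taken from the paper.
b_on_star, b_off, f_eq = 0.05, 0.1, 0.4
a_eq = b_on_star * f_eq / b_off  # equation (5)
numeric = bound_euler(20.0, b_on_star, f_eq, b_off)
exact = bound_closed_form(20.0, a_eq, b_off)
```

With a small step size, `numeric` and `exact` should agree closely, which is consistent with equation (6) solving equation (3) under the condition that the bound concentration is zero at the time of bleaching.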

The observed value during FRAP recovery is the total fluorescence intensity in the bleached area. It can be described by the sum of the bound and the free unbleached molecules plus an error. The sum of the bound and the free unbleached molecules is denoted by total(t):

$total(t)=f_{eq}+\sum_{k=0}^{K}a_k(t). \qquad (7)$

For our analysis, in each cell nucleus, the fluorescence intensity has been averaged over the bleached part of the cell nucleus. Therefore, in our analysis, $f_{eq}$ is the average of the intensity of the free fluorescent molecules in the bleached half, and $a_k(t)$ is the average of the intensity of the bound fluorescent molecules in the bleached part of the nucleus. With equation (6) we can then write

$total(t)=f_{eq}+\sum_{k=0}^{K}a_{k,eq}\left(1-\exp\left(-b_k^{\mathrm{off}}t\right)\right). \qquad (8)$

With $f_{eq}+\sum_{k=0}^{K}a_{k,eq}=1$, which holds because the concentration of the unbleached molecules has been normalized to one, we arrive at

$total(t)=1-\sum_{k=0}^{K}a_{k,eq}\exp\left(-b_k^{\mathrm{off}}t\right), \qquad (9)$

which is the deterministic approximation of the model with multiple mobility classes in Fuchs (2013).
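As an illustration of equation (9), the recovery curve can be evaluated directly from the equilibrium fractions and off-rates. This is a minimal sketch under assumed parameter values. The function name `total_recovery` and the numbers are hypothetical, and the class with index 0 is given an off-rate of zero to mimic an immobile fraction.

```python
import math

def total_recovery(t, a_eq, b_off):
    """Equation (9): 1 - sum_k a_k,eq * exp(-b_off_k * t)."""
    return 1.0 - sum(a * math.exp(-b * t) for a, b in zip(a_eq, b_off))

# Hypothetical parameters: k = 0 is an immobile fraction (off-rate 0), k = 1 a mobile class.
a_eq = [0.1, 0.5]
b_off = [0.0, 0.1]
times = [0.15 * i for i in range(1, 6)]  # sampling every 0.15 s, as in the data description
curve = [total_recovery(t, a_eq, b_off) for t in times]
```

Because the term with off-rate zero never decays, the curve in this sketch levels off at one minus the immobile fraction rather than at one.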

## 3 Bayesian nonlinear mixed-effects model

In order to analyze all recovery curves from all nuclei simultaneously, we use a hierarchical Bayesian model. Such models consist of three levels:

1. Data model, here derived from the nonlinear model described above,

2. Prior model, here a mixed-effects model in order to account for the heterogeneity in the nuclei population,

3. Hyper prior model, i.e., prior assumptions on all unknown parameters in level 2.

## 3.1 Data model

The total observed concentration of unbleached molecules in the bleached part of the cell nucleus of cell $j$, $j=1,\ldots,J$, at time $t_i$, $i=1,\ldots,T_j$, is denoted by $C_j(t_i)$. We assume Gaussian noise for the observations

$C_j(t_i)\sim N\left(total_j(t_i),\sigma^2\right). \qquad (10)$

The true concentration of unbleached molecules is modeled by the nonlinear model

$total_j(t_i)=1-\sum_{k=0}^{K}a_{kj}\exp\left(-b_{kj}^{\mathrm{off}}t_i\right). \qquad (11)$

Therefore, we fit the mixed-effects model

$C_j(t_i)=1-\sum_{k=0}^{K}a_{kj}\exp\left(-b_{kj}^{\mathrm{off}}t_i\right)+\varepsilon_{ij} \qquad (12)$

to the data of each cell cycle phase, where $\varepsilon_{ij}$ are independent Gaussian noise terms with mean 0 and variance $\sigma^2$.

## 3.2 Prior model

In a Bayesian framework, prior probability density functions (pdf) have to be defined for all unknown parameters. Here, for the parameters $a_{kj}$ and $b_{kj}^{\mathrm{off}}$, we use a mixed-effect decomposition of the form

$a_{kj}=a_k+\alpha_{kj},\qquad b_{kj}^{\mathrm{off}}=\exp(f_k+\phi_{kj})=\exp(f_k)\cdot\exp(\phi_{kj}), \qquad (13)$

with $b_k^{\mathrm{off}}=\exp(f_k)$ and $\beta_{kj}^{\mathrm{off}}=\exp(\phi_{kj})$. So each of these parameters is split into a fixed effect, which represents a joint parameter for all recovery curves of all cell nuclei, and a random effect representing a curve-specific parameter. The prior for the parameter $b_{kj}^{\mathrm{off}}$ moreover incorporates the knowledge that transfer rates must be non-negative (Schmid et al., 2009). We do not assume non-negativity for the parameter $a_{kj}$, as $a_{0j}$ can also be negative. This is due to the fact that a triple normalization procedure has been applied to the data, which assumes that the equilibrium concentration of unbleached molecules in the bleached part of the cell nucleus is one. However, due to erroneous pre-processing, $a_{0j}$ can also be smaller than zero, which leads to an equilibrium concentration larger than one.

For the fixed effects, uniform priors of the form

$p(a_k)=p\left(b_k^{\mathrm{off}}\right)\propto\mathrm{constant} \qquad (14)$

are used. These prior distributions are uninformative, which means that they do not contain any relevant information.

As prior distributions for the nuclei-specific random effects, we use Gaussian distributions and log-normal distributions, respectively, which are given by

$\alpha_{kj}\sim N\left(0,\tau_{\alpha_k}^2\right),\qquad \beta_{kj}^{\mathrm{off}}\sim LN\left(0,\tau_{\beta_k^{\mathrm{off}}}^2\right), \qquad (15)$

where $\tau_{\alpha_k}^2$ and $\tau_{\beta_k^{\mathrm{off}}}^2$ are unknown variance parameters.
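To make the decomposition in equation (13) concrete, the following sketch draws nucleus-specific parameters from the random-effect priors in equation (15) and generates noisy recovery curves according to equation (12). It is an illustrative simulation, not the authors' implementation. The function name `simulate_nuclei`, all numerical values, and the convention of marking the immobile fraction with `None` are assumptions. The arguments `tau_alpha` and `tau_phi` are standard deviations, i.e., square roots of the variance parameters.

```python
import math
import random

def simulate_nuclei(a, f, tau_alpha, tau_phi, sigma, n_nuclei, times, rng):
    """Draw nucleus-specific parameters (equation 13) and noisy curves (equation 12)."""
    curves, off_rates = [], []
    for _ in range(n_nuclei):
        # a_kj = a_k + alpha_kj with Gaussian alpha_kj
        a_j = [a_k + rng.gauss(0.0, s) for a_k, s in zip(a, tau_alpha)]
        # b_kj = exp(f_k + phi_kj); f_k = None marks the immobile fraction (off-rate fixed at 0)
        b_j = [0.0 if f_k is None else math.exp(f_k + rng.gauss(0.0, s))
               for f_k, s in zip(f, tau_phi)]
        mean = [1.0 - sum(ak * math.exp(-bk * t) for ak, bk in zip(a_j, b_j))
                for t in times]
        curves.append([m + rng.gauss(0.0, sigma) for m in mean])
        off_rates.append(b_j)
    return curves, off_rates

rng = random.Random(1)
times = [0.15 * i for i in range(1, 101)]
curves, off_rates = simulate_nuclei(
    a=[0.1, 0.5], f=[None, math.log(0.1)],      # hypothetical fixed effects
    tau_alpha=[0.02, 0.05], tau_phi=[0.0, 0.2],  # hypothetical standard deviations
    sigma=0.02, n_nuclei=5, times=times, rng=rng,
)
```

Since the random effect on the off-rate acts multiplicatively through the exponential, every simulated off-rate for the mobile class is positive, whereas the additive Gaussian effect on $a_{kj}$ can produce negative values.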

## 3.3 Hyper priors

Additional prior pdfs have to be defined for all other unknown parameters. As prior distributions for the unknown variance parameters, inverse Gamma distributions, which are given by

$\tau_{\alpha_k}^2\sim IG(c_k,d_k),\qquad \tau_{\beta_k^{\mathrm{off}}}^2\sim IG(e_k,g_k), \qquad (16)$

are used. The inverse Gamma distribution is a conjugate prior for the variance of a Gaussian distribution with known mean.

By using uninformative priors for the parameters $a_k$ and $b_k^{\mathrm{off}}$, we ensure that as much variance as possible is covered by the fixed effects. Only the variability that is not covered by the fixed effects is captured by the random effects. The definition of the hyperpriors with prudently chosen parameters on the variances of the parameters $a_{kj}$ and $\beta_{kj}^{\mathrm{off}}$ leads to a shrinkage of the random effects, so that they do not cover variance explained by the fixed effects (Schmid et al., 2009).

If $K=1$, which means that there is one MC in addition to the immobile fraction, we have to choose the parameters for the three inverse Gamma distributions

$\tau_{\alpha_0}^2\sim IG(c_0,d_0),\qquad \tau_{\alpha_1}^2\sim IG(c_1,d_1),\qquad \tau_{\beta_1^{\mathrm{off}}}^2\sim IG(e_1,g_1). \qquad (17)$

For the diffuse and the early S phase, we choose the parameters

$c_0=c_1=e_1=1,\qquad d_0=d_1=g_1=10^{-4}.$

When $K=2$, which means that there exist two MCs in addition to the immobile fraction, we have the inverse Gamma distributions

$\tau_{\alpha_0}^2\sim IG(c_0,d_0),\qquad \tau_{\alpha_1}^2\sim IG(c_1,d_1),\qquad \tau_{\beta_1^{\mathrm{off}}}^2\sim IG(e_1,g_1), \qquad (18)$

$\tau_{\alpha_2}^2\sim IG(c_2,d_2),\qquad \tau_{\beta_2^{\mathrm{off}}}^2\sim IG(e_2,g_2). \qquad (19)$

Analogously, the parameters chosen for the late S phase are

$c_0=c_1=c_2=e_1=e_2=1,\qquad d_0=d_1=d_2=g_1=g_2=10^{-4}.$

By choosing the first parameter of the inverse Gamma distribution to be 1 and the second parameter to be considerably smaller than 1, we perform shrinkage of the variances of the random effects. The smaller the second parameter is with respect to 1, the stronger is the shrinkage of the variance of the corresponding random effect, and, hence, the more variance is covered by the corresponding fixed effect.

As prior for the variance $\sigma^2$ of the noise term $\varepsilon_{ij}$, we define the inverse Gamma distribution $\sigma^2\sim IG(a,b)$ with $a=b=1$. We assume a priori independence of all unknown parameters.
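The shrinkage effect of the hyper prior can be illustrated by sampling from it. The sketch below assumes the common shape and scale parameterization of the inverse Gamma distribution, with density proportional to $x^{-c-1}\exp(-d/x)$. The text does not state the parameterization explicitly, so this is an assumption, as is the helper name `sample_inverse_gamma`.

```python
import math
import random

def sample_inverse_gamma(shape, scale, rng):
    """If X ~ Gamma(shape, rate=scale), then 1/X ~ IG(shape, scale)."""
    # random.gammavariate takes (alpha, beta) with beta a scale parameter, so pass 1/rate.
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

rng = random.Random(0)
draws = sorted(sample_inverse_gamma(1.0, 1e-4, rng) for _ in range(20000))
empirical_median = draws[len(draws) // 2]
# For shape 1, the median of IG(1, d) is d / ln 2.
theoretical_median = 1e-4 / math.log(2.0)
```

Under this parameterization, half of the prior mass for each random-effect variance lies below roughly $1.4\times10^{-4}$, which shows how a small second parameter pulls the variances toward zero.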

## 3.4 Posterior distribution and MCMC inference

In the Bayesian framework, all conclusions are drawn from the posterior distribution. The posterior pdf can be computed via Bayes’ theorem (Carlin and Louis, 2000):

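$p(\theta\mid y)=\frac{f(y\mid\theta)\,\pi(\theta)}{\int f(y\mid\tilde{\theta})\,\pi(\tilde{\theta})\,d\tilde{\theta}}, \qquad (20)$

with $f(y\mid\theta)$ the pdf of the data distribution defined in (10) and $\pi(\theta)$ the product of the prior distributions.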

A Markov chain Monte Carlo (MCMC)-algorithm with Gibbs- and Metropolis-Hastings (MH)-update steps is applied to obtain samples from the full conditional distributions of the parameters of the nonlinear regression model, which can be derived from the posterior distribution. Therefore, in each iteration of the algorithm, a random sample from the conditional posterior distribution (given all other parameters and the data) is drawn for each parameter. The full conditional distributions of all parameters can be found in the electronic appendix.

The parameters $a_k$ and $a_{kj}$ are drawn in Gaussian Gibbs steps, because their full conditional distributions are Gaussian distributions, from which one can sample directly. For the parameters $\sigma^2$, $\tau_{\alpha_k}^2$, and $\tau_{\beta_k^{\mathrm{off}}}^2$, Gamma Gibbs steps are used, because the full conditional distributions of the parameters are inverse Gamma distributions. The parameters $b_k^{\mathrm{off}}$ and $\beta_{kj}^{\mathrm{off}}$ are drawn in MH-steps with random walk proposals, because their full conditional distributions are not standard distributions. For the MC which is present in all considered cell cycle phases and has a very long residence time compared to the time of image acquisition ($k=0$), the parameters $b_0^{\mathrm{off}}$ and $\beta_{0j}^{\mathrm{off}}$ are close to zero. Therefore, we set $b_0^{\mathrm{off}}=\beta_{0j}^{\mathrm{off}}=0$ and estimate only the parameter $a_0$ for this immobile fraction (Sprague and McNally, 2005; Schermelleh et al., 2007; Schneider et al., 2013).

The random walk proposal of the Metropolis-Hastings algorithm was tuned and resulted in acceptance rates between 35% and 52%. This is broadly in accordance with recommendations for acceptance rates; for example, Gilks et al. (1996) recommend acceptance rates between 15% and 50% [see also Gelman et al. (1996) and Roberts et al. (1994)]. In our opinion, acceptance rates should be rather higher than lower, because with acceptance rates that are too low it might happen that part of the state space is never visited, which we intend to avoid.
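A random-walk MH update for an off-rate can be sketched as follows. This is a deliberately simplified illustration, not the sampler used in the paper: it treats a single recovery curve with one mobile class, holds $a_0$, $a_1$ and $\sigma$ fixed at known values, uses a flat prior on the positive off-rate, and proposes on the original scale of the off-rate. The function names, the step size, and the simulated data are all assumptions.

```python
import math
import random

def log_likelihood(b_off, times, data, a0, a1, sigma):
    """Gaussian log-likelihood (up to a constant) for one curve with one mobile class."""
    ssr = sum((y - (1.0 - a0 - a1 * math.exp(-b_off * t))) ** 2
              for t, y in zip(times, data))
    return -0.5 * ssr / sigma ** 2

def mh_off_rate(times, data, a0, a1, sigma, start, step, n_iter, rng):
    """Random-walk Metropolis-Hastings for b_off with a flat prior on b_off > 0."""
    current = start
    current_ll = log_likelihood(current, times, data, a0, a1, sigma)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        if proposal > 0.0:  # proposals outside the support are rejected
            proposal_ll = log_likelihood(proposal, times, data, a0, a1, sigma)
            if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
                current, current_ll = proposal, proposal_ll
                accepted += 1
        samples.append(current)
    return samples, accepted / n_iter

# Simulated curve with a hypothetical true off-rate of 0.1.
rng = random.Random(42)
times = [0.15 * i for i in range(1, 301)]
data = [1.0 - 0.1 - 0.5 * math.exp(-0.1 * t) + rng.gauss(0.0, 0.02) for t in times]
samples, acceptance_rate = mh_off_rate(times, data, 0.1, 0.5, 0.02,
                                       start=0.2, step=0.002, n_iter=4000, rng=rng)
kept = sorted(samples[1000:])  # discard burn-in
posterior_median = kept[len(kept) // 2]
```

The step size `step` plays the role of the tuning parameter: larger steps lower the acceptance rate, smaller steps raise it at the cost of slower exploration.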

We ran 10 parallel chains for each model (one model per cell cycle phase). For each parameter, a point estimate was obtained via the median of the sample formed by the observations of the converged parallel chains. Additionally, a 95% credible interval was calculated for each parameter. Approximate convergence of the parallel chains was diagnosed if the upper confidence limit of the potential scale reduction factor (Gelman and Rubin, 1992; Brooks and Gelman, 1998; Plummer et al., 2006) was smaller than or equal to 1.1. The number of burn-in iterations was determined by visual inspection of the sampling paths.
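The convergence check can be illustrated with the point estimate of the potential scale reduction factor. The sketch below computes only this point estimate from the within-chain and between-chain variances. The criterion described above uses the upper confidence limit of the factor, which is not reproduced here. The function name and the synthetic chains are assumptions.

```python
import random

def potential_scale_reduction(chains):
    """Point estimate of the Gelman-Rubin potential scale reduction factor."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Mean within-chain variance W
    within = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    # Variance of the chain means, i.e., B / n
    between_over_n = sum((mu - grand) ** 2 for mu in means) / (m - 1)
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5

rng = random.Random(3)
# Four chains sampling the same distribution, and four chains stuck in two different regions.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(500)] for shift in (0.0, 0.0, 3.0, 3.0)]
r_mixed = potential_scale_reduction(mixed)
r_stuck = potential_scale_reduction(stuck)
```

Chains that explore the same distribution give a factor close to 1, while chains that have settled in different regions give a value well above the threshold of 1.1.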

In order to check the sensitivity to the prior assumptions, i.e., to the choice of the parameter values of the hyper priors, we ran the mixed-effects model with different parameter values for the hyper priors. The parameter estimates together with 95% credible intervals can be found in Tables 1–4 in the electronic appendix. We found that the point estimates remain essentially the same for the different parameters of the hyper priors: changes occur at most at the third decimal place of the estimates. The width of the credible intervals varies slightly. Overall, we found that the parameter estimates are not very sensitive to the choice of the parameter values of the hyper priors.

Table 1

Mixed-effects model: fixed effects – median plus 95% credible interval.

Table 2

Mixed-effects model: variances of random effects – median plus 95% credible interval.

Table 3

Fixed-effects model 1 (fitted to the whole of the data): fixed effects – median plus 95% credible interval.

Table 4

Fixed-effects model 2 (fitted to the individual recovery curves): fixed effects – median [min, max].

To evaluate the model fit, we compared the mixed-effects model – which was fitted to the whole of the data resulting from the FRAP experiments – to

1. a model without random effects fitted to the whole of the data,

2. a model without random effects fitted to the individual recovery curves,

3. a model without random effects fitted to the averaged recovery curves (all recovery curves of the same phase were averaged).

For each of these three scenarios, we fitted the following fixed-effects model to the data of each cell cycle phase:

$C(t_i)=1-\sum_{k=0}^{K}a_k\exp\left(-b_k^{\mathrm{off}}t_i\right)+\varepsilon_i,\qquad \varepsilon_i\sim N(0,\sigma^2). \qquad (21)$

As for the mixed-effects model, we ran 10 parallel chains for each modeling alternative. Again, we calculated the medians and the 95% credible intervals of the samples formed by the observations of the converged chains (upper confidence limit of the potential scale reduction factor ≤1.1). The number of burn-in iterations was again determined by visual inspection of the sampling paths.

The Deviance Information Criterion (DIC) served as a measure of the model fit for the comparison of the mixed-effects model to the fixed-effects models 1 and 2. It is a suitable information criterion for model selection in hierarchical models, where parameters may outnumber observations and measures like the Akaike information criterion (AIC) or the Bayesian information criterion (BIC) cannot be directly applied (Spiegelhalter et al., 2002). The DIC itself is not an absolute measure, that is, the absolute values cannot be interpreted, but they can be compared between models. We did not compare the mixed-effects model and fixed-effects model 3 on the basis of the DIC, because these two models were fit to different kinds of data. In contrast to the mixed-effects model, fixed-effects model 3 was not fitted to the whole set of recovery curves but to the averaged recovery curve per phase.

The DIC can be calculated by the deviance of the medians $D(\theta_{med})$ plus two times the effective number of parameters $p_D$ (Spiegelhalter et al., 2002):

$DIC=D(\theta_{med})+2p_D.$

The deviance is a measure of the fit of a model and is calculated by

$D(\theta)=-2\,l(\theta),$

where $l(\theta)$ is the log-likelihood. The effective number of parameters is a measure of the complexity of the model. It is the median deviance minus the deviance of the medians and is calculated by

$p_D=\mathrm{median}(D(\theta))-D(\theta_{med}).$

The effective number of parameters is high for models with a high effective model complexity. When comparing two models on the basis of their DIC, the model with the lower DIC is to be favored.
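The median-based DIC defined above can be computed from posterior samples as sketched below. The function `dic_from_samples` and the toy likelihood are assumptions introduced for illustration. The toy example uses a single observation and three hypothetical posterior draws so that the result can be checked by hand.

```python
import math
import statistics

def dic_from_samples(samples, log_likelihood):
    """DIC = D(theta_med) + 2 * pD with pD = median(D(theta)) - D(theta_med)."""
    deviances = [-2.0 * log_likelihood(theta) for theta in samples]
    # Component-wise posterior median of the parameter vector
    theta_med = [statistics.median(column) for column in zip(*samples)]
    d_med = -2.0 * log_likelihood(theta_med)
    p_d = statistics.median(deviances) - d_med
    return d_med + 2.0 * p_d, p_d

# Toy example: one observation y = 1 from N(theta, 1) and three hypothetical posterior draws.
def toy_log_likelihood(theta):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (1.0 - theta[0]) ** 2

dic, p_d = dic_from_samples([[0.0], [1.0], [2.0]], toy_log_likelihood)
```

In this toy case the deviance at the median draw is $\log(2\pi)$ and the median deviance is $\log(2\pi)+1$, so the effective number of parameters is 1 and the DIC is $\log(2\pi)+2$.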

All software was written in the programming languages R (R Core Team, 2013) and C.

## 4.1 Mixed-effects model

By using the Bayesian regression model with mixed-effect priors we gain common parameter estimates for all cell nuclei through the estimation of the fixed effects, as well as curve specific parameter estimates through the estimation of the random effects, and estimates for the variances of the random effects.

In Figure 3, for each phase (diffuse, early S, and late S phase), the estimated joint recovery curve for all cell nuclei is shown together with the normalized data. The joint recovery curve is computed using the posterior medians of the MCMC-samples of the fixed effects.

Figure 3

Normalized data together with estimated joint recovery curve. The estimated joint recovery curves for all cell nuclei using the posterior medians of the MCMC-samples of the fixed effects are shown together with the normalized data for all three reviewed cell cycle phases – (A) diffuse, (B) early S, (C) late S phase.

The random effects take into account the variability resulting from the joint analysis of data of multiple cell nuclei, which is not covered by the fixed effects. In Figure 4, the estimated joint recovery curve for all cell nuclei (black, solid line) is shown together with the cell nuclei-specific curves (colored, dashed lines), which are computed using the posterior medians of the MCMC-samples of the curve-specific random effects.

Figure 4

Estimated joint recovery curve together with cell nuclei-specific recovery curves. The cell nuclei-specific recovery curves using the posterior medians of the MCMC-samples of the random effects for all three reviewed cell cycle phases – (A) diffuse, (B) early S, (C) late S phase – are shown together with the estimated joint recovery curve.

The posterior medians of the fixed parameters ak and $bkoff$ together with 95%-credible intervals can be found in Table 1. Density plots of the posterior distributions of the fixed parameters can be found in the electronic appendix (Figures 13). All parameters could be estimated with small variance. For the diffuse phase, the posterior median of the fixed effect of the off-rate is denoted by $b1off$ and equals 0.163 (0.152, 0.174). For the early S phase, $b1off$ equals 0.094 (0.089, 0.105). In both cases, we assumed that there is only one MC in addition to the immobile fraction (K=1), based on Schneider et al. (2013). In the presence of binding, the off-rate is the rate of the unbinding reaction where a protein is unsoldered from its binding site (Sprague and McNally, 2005), and its inverse is the residence time, i.e., the time a protein remains at a binding site (McNally, 2008). In early S phase, binding of Dnmt1 to immobilized PCNA trimetric rings at replication forks takes place (Sporbert et al., 2005; Schermelleh et al., 2007; Schneider et al., 2013). The median residence time of Dnmt1 at this binding site is about 11 s. In the diffuse phase, the off-rate can not be interpreted in the same way, because in this phase, there is no specific binding partner present and the additional MC is probably due to anomalous diffusion behavior (Schneider et al., 2013).

For the late S phase, we assumed that there are two distinct MCs in addition to the immobile fraction ($K=2$) (Schneider et al., 2013). The posterior medians of the fixed effects of the off-rates are $b_1^{\text{off}}=0.217\ (0.183, 0.273)$ and $b_2^{\text{off}}=0.043\ (0.038, 0.048)$, which correspond to median residence times of about 5 s and about 23 s, respectively. This is consistent with the finding that the protein Dnmt1 is involved in two distinct interactions in the late S phase (Schneider et al., 2013).
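The conversion from off-rates to residence times used above can be reproduced with a short sketch. It assumes that the off-rates are given in 1/s, which is what the reported residence times in seconds imply. Because inversion is a monotone decreasing map, the posterior median and the end points of a credible interval can be mapped directly, with the interval end points swapped. The residence-time intervals computed this way are an illustration and are not reported in the article; the function names are hypothetical.

```python
def residence_time(off_rate):
    """Residence time (s) as the inverse of an off-rate (assumed unit: 1/s)."""
    if off_rate <= 0:
        raise ValueError("off-rate must be positive")
    return 1.0 / off_rate


def residence_time_summary(median, lower, upper):
    """Map an off-rate median and 95% interval to the residence-time scale.

    Inversion is monotone decreasing, so quantiles map to quantiles and the
    interval end points swap: the upper off-rate limit gives the lower
    residence-time limit.
    """
    return (residence_time(median), residence_time(upper), residence_time(lower))


# Posterior medians and 95%-credible intervals of the off-rates as quoted in the text.
reported = {
    "early S, b1_off": (0.094, 0.089, 0.105),
    "late S, b1_off": (0.217, 0.183, 0.273),
    "late S, b2_off": (0.043, 0.038, 0.048),
}

summaries = {name: residence_time_summary(*vals) for name, vals in reported.items()}

for name, (med, low, high) in summaries.items():
    print(f"{name}: residence time {med:.1f} s ({low:.1f}, {high:.1f})")
```

Rounded to whole seconds, the medians agree with the values of about 11 s, 5 s, and 23 s given in the text.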

As it is of essential interest how much variance is captured by the random effects, point estimates for the variances of the random effects plus 95%-credible intervals were calculated and can be found in Table 2. Density plots of the posterior distributions of the variances can be found in the electronic appendix (Figures 4–6).

Each of the credible intervals in Tables 1 and 2 contains the true parameter with a posterior probability of 95%. To give an impression of the variation of the off-rates in the population of cell nuclei, for each cell cycle phase, we calculated the minimum and the maximum of the products of the median of the fixed effect of the off-rate and the medians of the nuclei-specific off-rates to get approximate limits within which the off-rates of the different cell nuclei lie. Accordingly, the off-rate for the diffuse phase varies approximately between 0.141 and 0.195 over the different cell nuclei, whereas for the early S phase, it lies approximately between 0.056 and 0.143 (residence time: about 7–18 s). For the late S phase, $b_1^{\text{off}}$ varies between 0.129 and 0.293 (residence time: about 3–8 s), and $b_2^{\text{off}}$ varies between 0.026 and 0.059 (residence time: about 17–38 s). Thus, we not only get a joint point estimate of the off-rate for the population of cell nuclei for each cell cycle phase, but also gain information about the variation of the off-rate in the population of nuclei.
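The calculation of these approximate limits can be sketched as follows. The sketch assumes that the nucleus-specific effects act multiplicatively on the fixed effect, which is how we read "products" above. The nucleus-specific factors in the example are hypothetical placeholders chosen only for illustration; they are not the posterior medians from the article, and the function name is likewise an assumption.

```python
def off_rate_limits(fixed_median, nucleus_factors):
    """Approximate range of nucleus-specific off-rates.

    Each nucleus-specific off-rate is taken as the product of the posterior
    median of the fixed effect and the posterior median of that nucleus'
    multiplicative effect (an assumption of this sketch).
    """
    products = [fixed_median * factor for factor in nucleus_factors]
    return min(products), max(products)


# Fixed-effect median for the early S phase as quoted in the text.
fixed_median_early_s = 0.094

# HYPOTHETICAL nucleus-specific factors, for illustration only.
hypothetical_factors = [0.60, 0.85, 1.00, 1.20, 1.50]

low, high = off_rate_limits(fixed_median_early_s, hypothetical_factors)

# The residence-time range follows by inversion; the limits swap places.
print(f"off-rate range: {low:.3f} to {high:.3f}")
print(f"residence-time range: {1 / high:.1f} s to {1 / low:.1f} s")
```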

## 4.2 Comparison between the mixed-effects model and the fixed-effects models

Table 3 provides the posterior medians of the fixed effects $a_k$ and $b_k^{\text{off}}$ together with 95%-credible intervals for fixed-effects model 1, where regression model (21) was fitted to the whole of the data resulting from the FRAP experiments. Table 4 provides the posterior medians, minima, and maxima of the fixed effects $a_k$ and $b_k^{\text{off}}$ for fixed-effects model 2, where regression model (21) was fitted to the individual recovery curves of all cell nuclei. In Table 5, the posterior medians of the fixed effects $a_k$ and $b_k^{\text{off}}$ together with 95%-credible intervals for fixed-effects model 3, where regression model (21) was fitted to the averaged recovery curves, are shown. All three tables provide estimates for each cell cycle phase under review. Density plots of the posterior distributions of the inferred parameters for the fixed-effects models 1–3 can be found in the electronic appendix (Figures 7–18). For fixed-effects model 2, exemplary density plots of the inferred parameters are shown for two cell nuclei of each cell cycle phase. In Figure 5, the point estimates and the 95%-credible intervals for the mixed-effects model and fixed-effects models 1 and 3 are displayed.

Table 5

Fixed-effects model 3 (fitted to the averaged recovery curves): fixed effects – median plus 95% credible interval.

Figure 5

Point estimates and 95%-credible intervals for the fixed effects of the mixed-effects model and fixed-effects models 1 and 3. The posterior medians of the fixed effects together with 95%-credible intervals are shown for the proposed mixed-effects model and fixed-effects models 1 and 3 for all three reviewed cell cycle phases.

Table 6 contains the DIC, the effective number of parameters ($p_D$) and the deviance of the medians [$D(\theta_{\text{med}})$] for the proposed mixed-effects model and the fixed-effects models 1 and 2 introduced in Section 3.4.

Table 6

Deviance of the medians [$D(\theta_{\text{med}})$], effective number of parameters ($p_D$) and DIC of the mixed-effects model and fixed-effects models 1 and 2.

The point estimates for the fixed effects provided by fixed-effects model 1 (Table 3) differ from the estimates provided by the mixed-effects model (Table 1). Figure 5 and the comparison of Table 3 with Table 1 reveal that the 95%-credible intervals for the fixed parameters resulting from fitting fixed-effects model 1 to the whole of the data are considerably narrower than the credible intervals for the fixed parameters resulting from fitting the proposed mixed-effects model to the whole of the data. It could therefore be erroneously concluded that fixed-effects model 1 estimates the fixed parameters more precisely than the proposed mixed-effects model. This is not the case: fixed-effects model 1 assumes many independent observations and hence ignores the structure of the data, i.e., that the observations come from different cells and are not all independent. Due to this assumption, the variance of the fixed effects is underestimated.
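The effect of ignoring the grouping of observations by nucleus can be illustrated with a small simulation. This is a deliberately simplified sketch and not the nonlinear model of the article: each simulated nucleus has its own level plus independent noise, and we compare the standard error of the overall mean under the (wrong) assumption that all observations are independent with a standard error based on the nucleus means. All names and simulation settings are assumptions made for illustration.

```python
import random
import statistics


def simulate_nuclei(n_nuclei, n_obs, between_sd, noise_sd, seed=1):
    """Simulate n_obs noisy observations for each of n_nuclei nuclei."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_nuclei):
        nucleus_level = rng.gauss(0.0, between_sd)  # nucleus-specific deviation
        data.append([nucleus_level + rng.gauss(0.0, noise_sd) for _ in range(n_obs)])
    return data


def naive_standard_error(data):
    """Standard error of the mean when all observations are treated as independent."""
    pooled = [y for nucleus in data for y in nucleus]
    return statistics.stdev(pooled) / len(pooled) ** 0.5


def nucleus_aware_standard_error(data):
    """Standard error of the mean based on one summary value per nucleus."""
    means = [statistics.mean(nucleus) for nucleus in data]
    return statistics.stdev(means) / len(means) ** 0.5


data = simulate_nuclei(n_nuclei=10, n_obs=50, between_sd=1.0, noise_sd=0.1)
naive_se = naive_standard_error(data)
aware_se = nucleus_aware_standard_error(data)

print(f"naive SE (independence assumed): {naive_se:.3f}")
print(f"SE based on nucleus means:       {aware_se:.3f}")
```

With strong between-nucleus variability, the naive standard error is much smaller than the one that respects the grouping, which mirrors the overly narrow credible intervals of fixed-effects model 1.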

For all three cell cycle phases (diffuse, early S, and late S phase), the DIC of the mixed-effects model is considerably lower than the DIC of fixed-effects model 1. The clearest result is found for the late S phase, where the DIC of the mixed-effects model is almost two times lower than the DIC of fixed-effects model 1. According to the DIC, the proposed mixed-effects model is therefore superior to fixed-effects model 1 for all three cell cycle phases.

Most of the point estimates resulting from fitting fixed-effects model 2 to the individual recovery curves (Table 4) differ from the point estimates resulting from fitting the proposed mixed-effects model to the whole of the data (Table 1). The DIC of the mixed-effects model is lower than the DIC of fixed-effects model 2 for the diffuse and late S phase. Only for the early S phase is the DIC of fixed-effects model 2 lower than the DIC of the mixed-effects model. Overall, however, the DIC is of approximately the same magnitude for both models.

Figure 5 reveals that most of the credible intervals for the fixed effects $a_k$ and $b_k^{\text{off}}$ resulting from fitting fixed-effects model 3 to the averaged recovery curves (Table 5) are broader than or of approximately the same size as the corresponding credible intervals resulting from the proposed mixed-effects model (Table 1). Only for the fixed effect $a_0$ is the reverse true. This is due to the difference in the number of data points between the two models: for fixed-effects model 3, we have only one recovery curve per cell cycle phase because of the averaging of the data. Therefore, the estimation of the fixed effects is more precise when using the proposed mixed-effects model. Moreover, we are of the opinion that averaging the recovery curves induces a loss of information because not all available data is used and the variability contained in the data is not appropriately quantified, which is why we favor the proposed mixed-effects model over fixed-effects model 3.

Overall, we conclude that the mixed-effects model is superior to the fixed-effects models 1–3, because the DIC of the mixed-effects model is lower in almost all considered scenarios. Moreover, it adequately reflects the heterogeneity of the data caused by cell-to-cell variability through the estimation of the variances of the random effects. The heterogeneity of the data is also taken into account by fixed-effects model 2, which gives point estimates of the fixed effects for each curve per cell cycle phase. However, estimating the variance through a mixed-effects model is the more appropriate and convenient way to quantify the cell-to-cell variability. Moreover, the proposed mixed-effects model is more robust than the fixed-effects models 1–3 because it uses more information.

## 5 Discussion

Our objective was to develop an approach with which data from FRAP experiments on various similar cell nuclei can be analyzed in one model, taking into account the variability contained in the data. The variability can only be assessed by considering data of several cell nuclei in a joint model.

Using the proposed Bayesian nonlinear regression model with mixed-effect priors, we are able to jointly analyze the recovery curves of all available cell nuclei per cell cycle phase. All available data resulting from different FRAP experiments can thus be used for the estimation of the parameters of interest, and no data is ignored. Hence, a distinct benefit of the proposed model is that we fit only one model to the whole of the data arising from all available cell nuclei, which is more convenient than fitting one model per cell nucleus and analyzing the results afterwards.

Curve-specific effects are taken into account by the use of random effects. These are, however, shrunk towards zero, so that most variability in the data is captured by the fixed effects. The variability of the parameters of interest can nevertheless be quantified through the estimated variance of the random effects. Hence, the proposed method not only yields estimates of the parameters of interest, that is, the binding rates, but also gives insight into the variability between cells. That is, the proposed method decomposes the total variability in the data into the variability between cells and the remaining variability, for example due to noise. The Bayesian technique gives complete posterior distributions for the binding rates in each phase, from which credible intervals can be computed that show the precision of the point estimates. The mixed-effect approach makes it possible to quantify the variability between cells, which, so far, has not been studied. For example, in our data we see a similar variability between cells in the off-rates in early S phase and late S phase, but the variability in the concentration of bound molecules at equilibrium is higher in the late S phase compared to the early S phase.

Algorithms for nonlinear model fitting are sensitive to the choice of starting values and can have convergence issues. Therefore, the model is typically fitted several times using a grid of starting values or random starting values, and the best model is determined using an information criterion like Akaike's information criterion (AIC) or the Bayesian information criterion (BIC). This results in a high computational burden. Using a Bayesian approach, the algorithm is guaranteed to converge and the resulting parameter estimates do not depend on any starting values, which reduces computation time. Moreover, the regression model is very flexible. Mixed-effect priors on the nonlinear parameters can be incorporated easily into the nonlinear regression, which is a novel approach. In addition, the proposed technique allows all data to be analyzed jointly. For our data, the whole analysis of all data took 42 min (diffuse phase), 127 min (early S phase), and 138 min (late S phase), respectively.

In our approach, the number of MCs for the molecule and the cell cycle phase is a fixed parameter that was adopted from a previous study using a refined compartmental approach (Schneider et al., 2013). To make sure that the number of MCs is still valid with the mixed-effects model, we additionally fitted the mixed-effects model with two MCs for the cell cycle phases diffuse and early S, and the mixed-effects model with one MC for the cell cycle phase late S. For the cell cycle phases diffuse and early S, convergence and redundancy issues arise when fitting the model with two MCs, which indicates that the model with two MCs is not the appropriate one. Redundancy issues may, for example, occur when the exponential rates are too similar. In Reich (1981), a redundancy measure has been used to show that parameters in a sum of two exponentials model are highly redundant if the exponential rates differ by less than a factor of 5 (Sommer and Schmid, 2014). The mixed-effects model with one MC for the cell cycle phase late S can be fitted without convergence problems. The resulting DIC is -47057.77 ($p_D=63.26$, $D(\theta_{\text{med}})=-47184.30$), which is higher than the DIC for the mixed-effects model with two MCs (DIC = -60138.01). Hence, we conclude that according to the DIC, the model with two MCs is more suitable for the data of the late S phase than the model with one MC. The number of MCs is moreover biologically sound, as described in Schneider et al. (2013).
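The reported numbers for the one-MC model can be checked against the usual DIC identity of Spiegelhalter et al. (2002), DIC = plug-in deviance + 2 × effective number of parameters, here with the deviance evaluated at the posterior medians. The following is a minimal sketch of that arithmetic using only the values quoted above; the function and variable names are hypothetical.

```python
def dic_from_plug_in(plug_in_deviance, p_d):
    """DIC as plug-in deviance plus twice the effective number of parameters."""
    return plug_in_deviance + 2.0 * p_d


# Late S phase, mixed-effects model with one MC (values quoted in the text).
deviance_at_medians = -47184.30
effective_parameters = 63.26
dic_one_mc = dic_from_plug_in(deviance_at_medians, effective_parameters)

# Late S phase, mixed-effects model with two MCs (DIC quoted in the text).
dic_two_mc = -60138.01

print(f"DIC, one MC:  {dic_one_mc:.2f}")
print(f"DIC, two MCs: {dic_two_mc:.2f}")
print("model preferred by DIC:", "two MCs" if dic_two_mc < dic_one_mc else "one MC")
```

Up to rounding of the published inputs, the recomputed value matches the reported DIC for the one-MC model, and the lower DIC of the two-MC model supports the choice of $K=2$ for the late S phase.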

We can conclude that the DIC of the mixed-effects model is lower than or in approximately the same range as the DIC of all considered models without random effects for all three cell cycle phases. With the mixed-effects model, we additionally gain valuable insight into the variability in the population of cell nuclei in the different cell cycle phases through the estimation of random effects and their variances.

With the proposed mixed-effects model, we obtain estimates of the off-rates of the interactions in which the molecules of interest are involved, as well as of the variances of the random effects. Therefore, the model is useful for the analysis of data from FRAP experiments on various similar cell nuclei. With this model, it is no longer necessary to analyze each recovery curve belonging to an experiment on one cell nucleus separately and summarize the results afterwards, or to pool and average the data of experiments on multiple similar cell nuclei in order to analyze them. Instead, the data of FRAP experiments on different cell nuclei can be analyzed simultaneously by one single model.

The main goal of this study is to show that the proposed technique can be used for the joint analysis of the data of many cells at once, furthermore providing insight into the variation of the off-rates in the population of cell nuclei. This is a novel approach in the field of FRAP analysis. Although we use a simplified kinetic model here, the approach can easily be adapted to other FRAP experiments and any kinetic model for such FRAP experiments.

## Acknowledgments

We thank Prof. Dr. Heinrich Leonhardt (Biocenter Martinsried, LMU Munich) and Dr. Lothar Schermelleh (Department of Biochemistry, University of Oxford) for the provision of the data and helpful discussions. Thanks to Dr. Joseph W. Sakshaug for proofreading. MF and VS were supported by Deutsche Forschungsgemeinschaft (DFG SCHM 2747/1-1). KS was supported by the International Max Planck Research School for Molecular and Cellular Life Sciences (IMPRS-LS).

## References

• Beaudouin, J., F. Mora-Bermúdez, T. Klee, N. Daigle and J. Ellenberg (2006): “Dissecting the contribution of diffusion and interactions to the mobility of nuclear proteins,” Biophys. J., 90, 1878–1894.

• Bird, A. (2002): “DNA methylation patterns and epigenetic memory,” Gene. Dev., 16, 6–21.

• Brooks, S. P. and A. Gelman (1998): “General methods for monitoring convergence of iterative simulations,” J. Comput. Graph. Stat., 7, 434–455.

• Brown, H. and R. Prescott (1999): Applied mixed models in medicine, Chichester, UK: Wiley.

• Bulinski, J. C., D. J. Odde, B. J. Howell, T. D. Salmon and C. M. Waterman-Storer (2001): “Rapid dynamics of the microtubule binding of ensconsin in vivo,” J. Cell Sci., 114, 3885–3897.

• Carlin, B. P. and T. A. Louis (2000): Bayes and empirical Bayes methods for data analysis, London, UK: Chapman & Hall, 2nd edition.

• Carrero, G., E. Crawford, M. J. Hendzel and G. de Vries (2004): “Characterizing fluorescence recovery curves for nuclear proteins undergoing binding events,” B Math. Biol., 66, 1515–1545.

• Coscoy, S., F. Waharte, A. Gautreau, M. Martin, D. Louvard, P. Mangeat, M. Arpin and F. Amblard (2002): “Molecular analysis of microscopic ezrin dynamics by two-photon FRAP,” Proc. Natl. Acad. Sci. USA., 99, 12813–12818.

• Dargatz, C. (2010): Bayesian inference for diffusion processes with applications in life sciences, LMU Munich: Dissertation.

• Dundr, M., U. Hoffmann-Rohrer, Q. Hu, I. Grummt, L. I. Rothblum, R. D. Phair and T. Misteli (2002): “A kinetic framework for a mammalian RNA polymerase in vivo,” Science, 298, 1623–1626.

• Fuchs, C. (2013): Inference for diffusion processes with applications in life sciences, Heidelberg: Springer.

• Gelman, A. and D. B. Rubin (1992): “Inference from iterative simulation using multiple sequences,” Stat. Sci., 7, 457–511.

• Gelman, A., G. O. Roberts and W. R. Gilks (1996): Bayesian statistics 5, Oxford, UK: Oxford University Press.

• Gilks, W. R., S. Richardson and D. J. Spiegelhalter (1996): Markov chain Monte Carlo in practice, London, UK: Chapman & Hall.

• Hemmerich, P., L. Schmiedeberg and S. Diekmann (2011): “Dynamic as well as stable protein interactions contribute to genome function and maintenance,” Chromosome Res., 19, 131–151.

• Mazza, D., A. Abernathy, N. Golob, T. Morisaki and J. G. McNally (2012): “A benchmark for chromatin binding measurements in live cells,” Nucleic Acids Res., 40, e119.

• McNally, J. G. (2008): “Quantitative FRAP in analysis of molecular binding dynamics in vivo,” Method Cell Biol., 85, 329–351.

• Meyvis, T. K. L., S. C. De Smedt, P. van Oostveldt and J. Demeester (1999): “Fluorescence recovery after photobleaching: a versatile tool for mobility and interaction measurements in pharmaceutical research,” Pharmaceut. Res., 16, 1153–1162.

• Mueller, F., P. Wach and J. G. McNally (2008): “Evidence for a common mode of transcription factor interaction with chromatin as revealed by improved quantitative fluorescence recovery after photobleaching,” Biophys. J., 94, 3323–3339.

• Mueller, F., D. Mazza, T. J. Stasevich and J. G. McNally (2010): “FRAP and kinetic modeling in the analysis of nuclear protein dynamics: what do we really know?” Curr. Opin. Cell Biol., 22, 403–411.

• Pinheiro, J. and D. Bates (2000): Mixed-effects models in S and S-PLUS, New York: Springer Verlag.

• Plummer, M., N. Best, K. Cowles and K. Vines (2006): “CODA: convergence diagnosis and output analysis for MCMC,” R News, 6, 7–11, URL http://CRAN.R-project.org/doc/Rnews/.

• R Core Team (2013): R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org/, ISBN 3-900051-07-0.

• Reich, J. G. (1981): On parameter redundancy in curve fitting of kinetic data, Kinetic data analysis: design and analysis of enzyme and pharmacokinetic experiments, New York: Plenum Press.

• Reits, E. A. and J. J. Neefjes (2001): “From fixed to FRAP: measuring protein mobility and activity in living cells,” Nat. Cell Biol., 3, E145–E147.

• Roberts, G. O., A. Gelman and W. R. Gilks (1994): “Weak convergence and optimal scaling of random walk Metropolis algorithms,” Research Report 94.16, Statistical Laboratory, University of Cambridge.

• Schermelleh, L., A. Haemmer, F. Spada, N. Rösing, D. Meilinger, U. Rothbauer, M. C. Cardoso and H. Leonhardt (2007): “Dynamics of Dnmt1 interaction with the replication machinery and its role in postreplicative maintenance of DNA methylation,” Nucleic Acids Res., 35, 4301–4312.

• Schmid, V. J., B. Whitcher, A. R. Padhani, N. J. Taylor and G. Yang (2009): “A Bayesian hierarchical model for the analysis of a longitudinal dynamic contrast-enhanced MRI oncology study,” Magnet. Reson. Med., 61, 163–174.

• Schneider, K. (2009): Analysis of cell cycle dependent kinetics of Dnmt1 by FRAP and kinetic modeling, LMU Munich: Diploma Thesis.

• Schneider, K., C. Fuchs, A. Dobay, A. Rottach, W. Qin, P. Wolf, J. M. Álvarez Castro, M. M. Nalaskowski, E. Kremmer, V. Schmid, H. Leonhardt and L. Schermelleh (2013): “Dissection of cell cycle-dependent dynamics of Dnmt1 by FRAP and diffusion-coupled modeling,” Nucleic Acids Res., 41, 4860–4876.

• Sommer, J. and V. J. Schmid (2014): “Spatial two-tissue compartment model for dynamic contrast enhanced magnetic resonance imaging,” J. Roy. Stat. Soc. C-App., 63, 695–713.

• Spada, F., U. Rothbauer, K. Zolghadr, L. Schermelleh and H. Leonhardt (2006): “Regulation of DNA methyltransferase 1,” Adv. Enzyme Regul., 46, 224–234.

• Spiegelhalter, D. J., N. G. Best, B. P. Carlin and A. van der Linde (2002): “Bayesian measures of model complexity and fit,” J. Roy. Stat. Soc. B., 64, 583–639.

• Sporbert, A., P. Domaing, M. C. Cardoso and H. Leonhardt (2005): “PCNA acts as a stationary loading platform for transiently interacting Okazaki fragment maturation proteins,” Nucleic Acids Res., 33, 3521–3528.

• Sprague, B. L. and J. G. McNally (2005): “FRAP analysis of binding: proper and fitting,” TRENDS Cell Biol., 15, 84–91.

• Sprague, B. L., R. L. Pego, D. A. Stavreva and J. G. McNally (2004): “Analysis of binding reactions by fluorescence recovery after photobleaching,” Biophys. J., 86, 3473–3495.

• van Royen, M. E., P. Farla, K. A. Mattern, B. Geverts, J. Trapman and A. B. Houtsmuller (2009): “Fluorescence recovery after photobleaching (FRAP) to study nuclear protein dynamics in living cells,” Meth. Mol. Biol., 464, 363–385.

• van Royen, M. E., A. Zotter, S. M. Ibrahim, B. Geverts and A. B. Houtsmuller (2011): “Nuclear proteins: finding and binding target sites in chromatin,” Chromosome Res., 19, 83–98.

## Supplemental Material

The online version of this article (DOI: 10.1515/sagmb-2014-0013) offers supplementary material, available to authorized users.

Corresponding author: Volker J. Schmid, Department of Statistics, Ludwig-Maximilians-University Munich, Ludwigstr. 33, 80539 Munich, Germany

Published Online: 2014-12-13

Published in Print: 2015-02-01

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 14, Issue 1, Pages 35–51, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302,
