Longitudinal image data based on fluorescent proteins play a crucial role in both in vivo and in vitro analysis of various biological processes such as gene expression and cell lineage fate. Assessing the growth patterns of different cell types within a heterogeneous population and monitoring their interactions enables biomedical researchers to determine the role of different cell types in important biological processes such as organ development and regeneration, malignant growth or immune responses under various experimental conditions. For example, tumor progression has been shown to be affected by bidirectional interactions among cancer cells or between cancer cells and cells from the microenvironment, including tumor-infiltrating immune cells. Being able to study these interactions in a laboratory setting is therefore highly relevant, but is complicated by the difficulty of dissecting the effects of the different cell types as soon as the number of cell types exceeds two. In the present study, we used longitudinal image data collected from multicolor live-cell imaging growth experiments of co-cultures of cancer cells and fibroblasts (a key cell type in the tumor microenvironment) as well as behaviourally distinct (cloned) cancer cells. Using a high-content imaging system, we were able to acquire characteristics for each individual cell at subsequent times, including fluorescent properties, spatial coordinates, and morphological features. The motivation of this work was to design a model allowing the determination of spatio-temporal growth interactions between these multiple cell populations.
In longitudinal growth experiments, the two important goals are to determine growth rates for different cell populations and to assess how interactions between cell types may affect their growth. Whilst a wide range of descriptive data analysis approaches have been used in applications, inference based on a comprehensive model of multicolor cell data is an open research area. The main challenges are related to the presence of complicated spatio-temporal interactions amongst cells and difficulties related to tracking individual cells across time from image data. Typical longitudinal experiments consist of a relatively small number of measurements (e.g. 5 to 20 images taken every few hours), which is adequate for monitoring cell growth. Tracking individual cells would typically require more frequent measurements, complicating the practicality of the experiments in terms of the storage cost of very large image files and the cytotoxicity induced by the imaging process.
Although tracking individual cell trajectories is difficult due to cell migration, overlapping cells, changes in cell morphology, image artifacts, cell death and division, obtaining cell counts by cell type (represented by a certain color) is straightforward and can be easily automated. To describe the spatial distribution of different cell types, we propose to divide an image into a number of contiguous regions (tiles) to form a regular lattice structure as shown in Figure 1(a). We then record the frequency of cells of each color in each tile at subsequent time points and, based on these counts, model the spatial and temporal dependencies of cell growth.
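The tiling step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `tile_counts`, the argument names, and the convention that coordinates lie in a `width` by `height` image with integer color labels are all assumptions made for the example.

```python
import numpy as np

def tile_counts(x, y, color, n_colors, width, height, n_rows, n_cols):
    """Count cells of each color in each tile of a regular n_rows x n_cols grid.

    x, y  : cell coordinates, assumed to satisfy 0 <= x <= width, 0 <= y <= height
    color : integer color label in {0, ..., n_colors - 1} for each cell (assumed coding)
    Returns an integer array of shape (n_colors, n_rows, n_cols).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    color = np.asarray(color, dtype=int)
    # Map each coordinate to a tile index; clip so points on the far border stay in range.
    col = np.clip((x / width * n_cols).astype(int), 0, n_cols - 1)
    row = np.clip((y / height * n_rows).astype(int), 0, n_rows - 1)
    counts = np.zeros((n_colors, n_rows, n_cols), dtype=int)
    # Unbuffered add, so several cells falling in the same tile are all counted.
    np.add.at(counts, (color, row, col), 1)
    return counts
```

Applying such a function to the cell coordinates of each image gives one array of counts per time point, which is the kind of lattice data the model takes as input.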
To model spatio-temporal data, one could choose to approximate the spatio-temporal process by a spatial process of time series, that is, to view the process as a multivariate spatial process where the multivariate dependencies are inherited from temporal dependencies. In other words, it can be seen as a temporal extension of spatial processes.
The most popular way of developing a spatial process is through the conditionally auto-regressive (CAR) model proposed by Besag. Waller, Carlin, Xia, and Gelfand extend the CAR model to a spatio-temporal setting by allowing spatial effects to vary across time. However, the model lacks a specification of temporal dependency, as also noted by Knorr-Held. More recently, Quick, Waller, and Casper proposed a multivariate space-time CAR (MSTCAR) model, which is essentially a multivariate CAR model in which both temporal and between-group dependencies are modelled as multivariate dependencies. Other works related to spatial processes of time series include Sansó, Schmidt, Nobre, et al. and Quick, Waller, and Casper.
Alternatively, one can also think of the process as a time series of spatial processes, or a spatial extension of a time series. This is the approach we take in our spatio-temporal modelling. The underlying notion is that “the temporal dependence is more natural to model than the spatial dependence”.
Following Cox et al., it is useful to distinguish two modelling approaches for the analysis of time series data commonly seen in the spatio-temporal modelling literature: parameter-driven and observation-driven models. In a parameter-driven model, the dependence between subsequent observations is modelled by a latent stochastic process, which evolves independently of the past history of the observation process. In contrast, in an observation-driven model, time dependence arises because the conditional expectation of the outcome given the past depends explicitly on the past values.
For multivariate count data, the advantage of parameter-driven models is that one can easily assume that the conditional expectation of the observed process (on log-scale), as a latent process, is (multivariate) normal. There are extensive works related to latent spatio-temporal models under the Bayesian framework, including models with Gaussian data modelled by (multivariate) Gaussian process with an additive error [10, 11, 12, 13], Poisson data with conditional expectation modelled by Gaussian latent process ([14, 15] and Chapter 7 of ) and Poisson data with multivariate log-gamma latent process . However, estimation of parameters in parameter-driven models requires considerable computational effort, as does prediction of the latent process.
On the other hand, in observation-driven models, inference is possible in a (penalized) maximum likelihood framework, so these models can be fitted easily even for quite complex regression structures. Schrödle, Held, and Rue proposed a parameter-driven spatio-temporal model and compared it with a similar observation-driven model proposed by Paul, Held, and Toschke. They conclude that the parameter-driven model performs slightly better in terms of prediction in some cases. However, while the computation time for the observation-driven model is mostly less than a second, fitting a parameter-driven model takes several hours, if it converges at all, because of the complexity of the latent autoregressive process. Besides, their model contains only five parameters, while in our application the number of parameters of interest grows quadratically with the number of cell populations, which makes parameter-driven models intractable even with a moderate number of cell populations.
Therefore, we choose to work with a spatial extension of observation-driven time series. Zeger and Qaqish  review various observation-driven time series models with a quasi-likelihood estimation. Fokianos and Tjøstheim  develop and study the probabilistic properties of a log-linear autoregressive time series model for Poisson data, as an extension of the model considered by Fokianos, Rahbek, and Tjøstheim . See Scott, et al. and Kedem and Fokianos [23, 24] for a complete review.
Literature about observation-driven spatio-temporal models, however, is relatively sparse. Held, Höhle, and Hofmann propose a multivariate time series model where parameters are allowed to vary across space. Paul et al. extended the model such that spatial dependences are captured by additional parameters that quantify the “directed influence” of neighboring areas at previous time points on the observation of interest. Paul and Held further extend the model by introducing random effects. Note that these approaches directly model the conditional expectation of the count data, meaning that they use an identity link function instead of the canonical log-link. Thus, the parameters are required to be positive to ensure that the resulting conditional expectation is positive. Knorr-Held and Richardson propose a space-time model for surveillance data in which, apart from separate seasonal and spatial components, they include an autoregressive term with a latent indicator.
In this paper, we develop a conditional spatial-temporal model for multivariate count data on tiled images, and provide its application on tiled images in the context of longitudinal cancer cell monitoring experiments. Our model enables us to measure the effect on the growth rate of each cell population and changes due to local cross-population interactions. Specifically, we consider a multivariate Poisson model with intensity modeled as a log-linear form similar to those in  and , and we quantify spatio-temporal impacts of different cell populations in neighboring tiles through model parameters, as illustrated in Figure 1(b). Impacts are allowed to be positive or negative, and unlike those models that describe between group dependence through a covariance matrix, influences do not have to be symmetrical in our model. Another main advantage of the proposed framework is that it enables one to accommodate spatio-temporal cell interactions for heterogeneous cell populations within a relatively parsimonious statistical model.
Since the model complexity can potentially be very large in the presence of many cell types, it is also important to address the question of how to select an appropriate model by retaining only the meaningful spatio-temporal interactions between cell populations. We carry out model selection using the common model selection criteria for parametric models, the Akaike and the Bayesian information criteria (AIC and BIC).
The remainder of the paper is organized as follows. In Section 2, we introduce the conditional spatio-temporal lattice model for multivariate count data and develop maximum likelihood inference tools. In the same section, we discuss the asymptotic properties of our estimator and standard errors. In Section 3, we study the performance of our methodology using simulated data, and compare it to that of the multivariate conditional autoregressive (MCAR) model. In Section 4, we apply our method, as well as the MCAR model, to analyze datasets from an in vitro experiment in which cancer cells are co-cultured with fibroblasts. In Section 5, we conclude and give final remarks.
2.1 Multicolor spatial autoregressive model on the lattice
Let be a discrete lattice. In the context of our application, the lattice is obtained by tiling a microscope image into tiles, denoted by . The total number of tiles is a monotonically increasing function of One can choose various forms of lattice, for example, the regular or hexagonal lattices. For simplicity, we tile the image into regular rectangular tiles, which makes An example of a tiled image with is shown in Figure 1(a). Denote a pair of neighboring tiles with , if tiles and share the same border or coincide (). Each tile may contain cells of different colors; thus, we let be a finite set of colors and denote by the total number of colors. Let be the sample of observations where is the collection of observations at time point , and is the vector of observed frequencies for color on the lattice at time . The joint distribution for the spatio-temporal process on the lattice is difficult to specify, due to local spatial interactions for neighboring tiles and global interactions occurring at the level of the entire image. An additional issue is that cells tend to be clustered together due to the cell division process and other biological mechanisms; thus it is not uncommon to observe low counts in a considerable portion of tiles. In typical longitudinal experiments, the number of time points seldom go beyond due to experimental, storage and processing cost, while can be relatively large. So we work under the framework where is assumed to be finite, while is allowed to grow to infinity.
We suppose that the count for the th tile follows a marginal Poisson distribution , with intensity modeled by the canonical log-link , where takes the following spatial autoregressive form: (1)(2)
for all , with being the number of tiles in a neighborhood of tile . Although we are adopting the regular grids for simplicity, the model is readily applicable to other tiling strategies. Changing the tiling strategy would only change the realisations of in (2).
Here, we assume that the conditional count for different tiles at time is independent conditioning on information from , i.e.
for all and This does not suggest that they ( and ) are independent, but rather that their spatio-temporal dependence is due to the structure of intensity in (1). Conditional independence is a commonly used assumption for spatio-temporal models in a non-gaussian setting [3, 28], since it’s exceedingly difficult to work with multivariate non-Gaussian distribution .
The elements of the parameter vector are main effects corresponding to a baseline average count for cells of different colors. The spatio-temporal interactions are measured by the statistic in (2), which essentially counts the number of cells of color in the neighborhood of tile at time . Hence, the autoregressive parameter is interpreted as positive or negative change in the average number of cells with color , due to interactions with cells of color in neighbouring tiles. A positive (or a negative) sign of means that the presence of cells of color in neighboring tiles promotes (or inhibits) the growth of cells of color . The spatio-temporal effects are collected in the weighted incidence matrix . This may be used to generate weighted directed graphs, as shown in the example of Figure 2, where the nodes of the directed graph correspond to cell types, and the directed edges are negative or positive spatio-temporal interactions between cell types.
Equation (1) could be extended to some more specific form, for example, , where are interpreted as the effect of cells of color from neighbouring (but not the same) tiles have on the growth of cells with color , while as the effect of cells of color from the same tile. However, we stick to the model in (1) because we have no evidence showing that the more complex model is advantageous from a model selection view point.
We choose to work with a log-linear form for the autoregressive equation in eq. (1), where we add 1 to the counts at the previous time point and then apply a logarithmic transform. This offers several advantages compared to the more commonly used linear form. First, the intensity and the past counts are transformed onto the same scale. Moreover, this model can accommodate both positive and negative correlations, while it is not possible to account for positive association in a stationary model if past counts are directly included as explanatory variables. For example, with the model for a single color, the log-intensity would be a linear function of the previous count, which may lead to instability of the Poisson means if the autoregressive coefficient is positive, since the intensity is then allowed to increase exponentially fast. Finally, adding 1 to the counts copes with zero data values: the logarithm is not defined at zero, which arises often, and the transform maps zero counts to zero values of the transformed covariate.
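To make the structure of the intensity concrete, the following sketch computes a neighborhood statistic and the resulting Poisson intensities on a regular grid, and draws one time step of counts. Several details are assumptions rather than statements from the text: the neighborhood is taken to be the tile itself plus its edge-adjacent tiles, the statistic is taken to be the average of log(1 + count) over that neighborhood, and the names `neighborhood_stat`, `intensity`, `simulate_step`, `alpha`, and `beta` are illustrative.

```python
import numpy as np

def neighborhood_stat(counts):
    """Mean of log(1 + count) over each tile's neighborhood, per color.

    counts: array (n_colors, n_rows, n_cols) of counts at the previous time point.
    ASSUMPTIONS (not from the text): neighbors are the tile itself plus the up to
    four edge-adjacent tiles, and the sum is divided by the neighborhood size.
    """
    z = np.log1p(np.asarray(counts, dtype=float))
    total = z.copy()                  # contribution of the tile itself
    n_nb = np.ones(z.shape[1:])       # neighborhood size, starting at 1
    total[:, 1:, :] += z[:, :-1, :]   # tile above
    n_nb[1:, :] += 1
    total[:, :-1, :] += z[:, 1:, :]   # tile below
    n_nb[:-1, :] += 1
    total[:, :, 1:] += z[:, :, :-1]   # tile to the left
    n_nb[:, 1:] += 1
    total[:, :, :-1] += z[:, :, 1:]   # tile to the right
    n_nb[:, :-1] += 1
    return total / n_nb

def intensity(counts_prev, alpha, beta):
    """Log-linear intensity: exp(alpha[c] + sum_k beta[c, k] * S[k]) for each tile."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    s = neighborhood_stat(counts_prev)
    log_lam = alpha[:, None, None] + np.einsum("ck,kij->cij", beta, s)
    return np.exp(log_lam)

def simulate_step(counts_prev, alpha, beta, rng):
    """Draw conditionally independent Poisson counts for the next time point."""
    return rng.poisson(intensity(counts_prev, alpha, beta))
```

In this sketch, a positive entry `beta[c, k]` raises the expected count of color `c` where color `k` was abundant nearby at the previous time point, and a negative entry lowers it; `beta` need not be symmetric.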
2.2 Likelihood inference
Let be the overall parameter vector , where is a -dimensional vector defined in Section 2.1 and is a matrix of color interaction effects, is the total number of parameters. In this section, we develop a weighted maximum likelihood estimator for our model, (3)
where is the expected number of cells with color in tile at time , defined in (1) and the weights are given constants. The weighted maximum likelihood estimator (MLE), , is obtained by maximizing the weighted log-likelihood function (4)
where . Equivalently, is formed by solving the weighted estimating equations (5)
where , denotes the Kronecker product, is the gradient operator with respect to and .
Specific weights could be used to address the presence of outliers. Following Ferrari and Vecchia  and La Vecchia, Camponovo, and Ferrari , the influence of strong outliers could be avoided by taking weights of form with being a tuning constant smaller than 1. However, for the current application we use constant weights all equal to 1.
Our empirical results show that this choice performs reasonably well in terms of estimation accuracy in all our numerical examples and guarantees optimal variance for the estimator under correct model specification. The solution to eq. (5) is obtained by a standard Fisher scoring algorithm, which is found to be stable and converges fast in all our numerical examples.
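Because the model is log-linear in the parameters once the past counts are given, the Fisher scoring iteration can be sketched as for an ordinary weighted Poisson regression. The code below is a minimal illustration under assumptions: the design matrix `X` is assumed to hold, for one color, a column of ones followed by the neighborhood statistics stacked over tiles and time points, and the function name and arguments are hypothetical.

```python
import numpy as np

def fisher_scoring_poisson(X, y, w=None, n_iter=50, tol=1e-10):
    """Weighted Poisson log-linear fit by Fisher scoring.

    X : (n_obs, n_params) design matrix (assumed layout: intercept in column 0)
    y : (n_obs,) observed counts
    w : (n_obs,) fixed weights; defaults to all ones
    Returns the estimate and the Fisher information evaluated at it.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    theta = np.zeros(X.shape[1])
    theta[0] = np.log(max(y.mean(), 1e-8))   # start the intercept at the log mean count
    for _ in range(n_iter):
        lam = np.exp(X @ theta)
        score = X.T @ (w * (y - lam))             # weighted score vector
        info = X.T @ (X * (w * lam)[:, None])     # weighted Fisher information
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    lam = np.exp(X @ theta)
    info = X.T @ (X * (w * lam)[:, None])
    return theta, info
```

With all weights equal to 1, this reduces to the usual maximum likelihood fit of a Poisson regression with log link.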
Finally, in practical applications it is also important to address the question of how to select an appropriate model by retaining only the meaningful spatio-temporal interactions between cell populations, thereby avoiding over-parametrized models. Model selection plays an important role by balancing goodness-of-fit and model complexity. Here, we select non-zero model parameters based on traditional model selection approaches: the Akaike information criterion, , and the Bayesian information criterion, .
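For reference, the two criteria can be computed from a fitted model as sketched below, using their standard textbook definitions (minus twice the log-likelihood plus a complexity penalty). Which sample size enters the BIC penalty in this setting is an assumption here, as are the function names.

```python
import numpy as np
from math import lgamma

def poisson_loglik(y, lam):
    """Poisson log-likelihood of counts y given intensities lam (same shape)."""
    y = np.asarray(y, dtype=float)
    lam = np.asarray(lam, dtype=float)
    log_fact = np.array([lgamma(v + 1.0) for v in y.ravel()]).reshape(y.shape)
    return float(np.sum(y * np.log(lam) - lam - log_fact))

def aic_bic(loglik, n_params, n_obs):
    """Standard AIC and BIC; n_obs is the assumed effective sample size."""
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + np.log(n_obs) * n_params
    return aic, bic
```

Candidate models that set different subsets of interaction parameters to zero can then be ranked by either criterion, with smaller values preferred.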
2.3 Asymptotic properties and standard errors
In this section, we give an overview of the asymptotic behavior of the estimator introduced in Section 2.2. In our setting we consider a fixed number of time points, , whilst the lattice is allowed to increase. This reflects the notion that the statistician is allowed to choose an increasingly fine tiling grid as the number of cells increases. If the regularity conditions stated in the Appendix hold, then converges in distribution to a -variate normal distribution with zero mean vector and identity variance, as , with given in (6). Asymptotic normality of follows by applying the limit theorems for M-estimators for nonlinear spatial models developed by Jenish and Prucha. One condition required to ensure this behaviour is that has constant entries at the initial time point , which is quite realistic since typically cells are seeded randomly at the beginning of the experiment. Our proofs mainly verify the -mixing conditions and the -uniform integrability of the score functions, which ensure a pointwise law of large numbers; together with an additional stochastic equicontinuity condition, this gives the uniform law of large numbers required by Jenish and Prucha.
The asymptotic variance of is , where is the Hessian matrix (6)
with being the partial score function for the th tile. Direct evaluation of may be challenging since the expectations in (6) are intractable. Thus, we estimate by the empirical counterpart
Note that the above estimators approximate the quantities in formula (6) by conditional expectations. Our numerical results suggest that the above variance approximation yields confidence intervals with coverage close to the nominal level . Besides the above formulas, we also consider confidence intervals obtained by a parametric bootstrap approach. Specifically, we generate bootstrap samples by sampling at subsequent times from the conditional model specified in eqs. (1) and (2) with . From such bootstrap samples, we obtain bootstrapped estimators, , which are used to estimate by the usual covariance estimator , where . Finally, a confidence interval for is obtained as , where is the -quantile of a standard normal distribution, and is an estimate of obtained by either eq. (7) or bootstrap resampling.
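The parametric bootstrap interval described above can be sketched generically. In the sketch, `simulate` and `fit` are placeholders supplied by the user (hypothetical names): `simulate` should draw a full data set from the fitted conditional model, time point by time point, and `fit` should return the estimate for a data set. The toy usage below replaces the spatio-temporal model with a single Poisson mean purely to keep the example short.

```python
import numpy as np

def bootstrap_ci(theta_hat, simulate, fit, n_boot=200, z=1.959964, seed=0):
    """Normal-approximation confidence intervals with bootstrap standard errors.

    theta_hat : 1-D array of estimates
    simulate  : callable (theta, rng) -> simulated data set (user-supplied)
    fit       : callable (data) -> 1-D array of estimates (user-supplied)
    z         : standard normal quantile (default gives a 95% interval)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    rng = np.random.default_rng(seed)
    boot = np.array([fit(simulate(theta_hat, rng)) for _ in range(n_boot)])
    cov = np.atleast_2d(np.cov(boot, rowvar=False))   # covariance of bootstrap estimates
    se = np.sqrt(np.diag(cov))
    return theta_hat - z * se, theta_hat + z * se, cov

# Toy usage: one Poisson mean on the log scale, 400 observations.
toy_simulate = lambda th, rng: rng.poisson(np.exp(th[0]), size=400)
toy_fit = lambda data: np.array([np.log(data.mean())])
lower, upper, cov = bootstrap_ci(np.array([1.0]), toy_simulate, toy_fit)
```

The same function applies to the full model once `simulate` generates counts sequentially from the fitted intensities and `fit` runs the estimation routine.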
3 Monte Carlo simulations
In our Monte Carlo experiments, we generate data from a Poisson model as follows. At time , we populate tiles using equal counts for cells of different colors. For , observations are drawn from the multivariate Poisson model Recall that the rate defined in Section 2.1 contains autoregressive coefficients , which are collected in the matrix .
We assess the performance of MLE under different settings concerning the size and sparsity of . Consider the three models with the following choices of :
Denote Model as the model corresponding to . In Model 1, all the effects in have the same size; in Model 2, the effects have decreasing sizes; Model 3 is the same as Model 1, but with some interactions exactly equal to zero.
We set for all three models. The above parameter choices reflect the situation where the generated process has a moderate growth.
In Table 1 and Table 2, we show results based on 1000 Monte Carlo runs generated from Models 1-3, for and and . In Table 1, we show Monte Carlo estimates of squared bias and variance of . Both squared bias and variance of our estimator are quite small in all three models, and decrease as gets larger. The variances of Model 2 are slightly larger than those in the other two models due to the increasing difficulty in estimating parameters close to zero.
In Table 2, we report the coverage probability for symmetric confidence intervals of the form , where is the quantile for a standard normal distribution, with The standard error, , is obtained by the square root of the diagonal elements of and the parametric bootstrap estimate, and , described in Section 2.3. The coverage probabilities of the confidence intervals are very close to the nominal level for both methods.
In Table 3, we show results for the model selection based on 1000 Monte Carlo samples from Model 3 using the AIC and the BIC given in Section 2 for and . We report Type A error (a term is not selected when it actually belongs to the true model ) and Type B error (a term is selected when it is not in the true model ). For both AIC and BIC, model selection is more accurate for large . As expected, AIC tends to over-select, and BIC outperforms AIC, with zero Type A error and very low Type B error.
Finally, we compare the performance of our model with the following multivariate conditional autoregressive (MCAR) model proposed by Leroux, Lei, and Breslow :
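The two selection errors can be computed from indicator vectors of selected and truly non-zero terms, as in the sketch below. Expressing each error as a proportion (of true terms missed, and of null terms selected) is an assumption about the normalization, and the function name is illustrative.

```python
import numpy as np

def selection_errors(selected, truth):
    """Type A: true terms not selected; Type B: null terms selected (as proportions)."""
    selected = np.asarray(selected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    type_a = np.sum(truth & ~selected) / max(np.sum(truth), 1)
    type_b = np.sum(~truth & selected) / max(np.sum(~truth), 1)
    return float(type_a), float(type_b)
```

Averaging these quantities over Monte Carlo replications gives error rates of the kind reported for each criterion.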
where are random effects with conditional distribution
where is a spatial autocorrelation parameter, with corresponding to independence, while corresponds to the intrinsic model, and is a between-variable covariance matrix, which is assumed to have no fixed structure, and is the number of tiles in a neighborhood of tile as defined in Section 2.1. Let be a vector of regression parameters, where is defined in Section 2.1 and is the intercept. Let the covariate be a -dimensional vector consisting of vectors: , where carries the information from the neighbouring tiles at the previous time point, defined in eq. (2).
An independent Gaussian prior, , is specified for each regression parameter in . A uniform prior on the unit interval, , is specified for . For covariance matrix , assume an inverse Wishart distribution with identity scale matrix and degree of freedom.
To evaluate the performance of MLE under our model and estimators obtained by the MCAR model, we generate set of data from Model 1. Estimation of the MCAR model is done by MCMC sampling, using the R package CARBayes by Lee . Table 4 shows Monte Carlo estimates of squared bias, variance, the coverage probability of confidence intervals and computation time for and . Two of the settings are the same as those shown for Model 1 in Table 1: and . In estimation of MCAR, we also show results of two MCMC settings: 1. MCAR1: MCMC samples generated and discarded as the burn-in period; 2. MCAR2: samples with discarded. Coverage probabilities for our model are computed as , where is the quantile for a standard normal distribution. The standard error, , is obtained by taking the square root of the diagonal elements of described in Section 2.3.
Overall, our method performs better than MCAR at analysing the kind of data that we generate, especially when and/or is small, with much smaller bias and variance, as well as shorter computation time. The performance of MCAR improves significantly as the model gets more complicated (i.e. larger ), and when and increase. In the case where and , it performs almost as well as our model; however, it takes almost an hour to obtain the estimates, while our method requires less than a minute. Besides, for the coverage probabilities to reach the nominal level, MCAR seems to require a larger MCMC sample size as the model gets more complicated, while those of our model are stable and close to the nominal level in all cases.
4 Analysis of the cancer cell growth data
Cancer cell behaviour is believed to be determined by several factors including genetic profile and differentiation state. However, the presence of other cancer cells and non-cancer cells has also been shown to have a great impact on overall tumor behaviour [34, 35]. It is therefore important to be able to dissect and quantify these interactions in complex culture systems. The data sets in this section represent a cancer cell-fibroblast co-culture experiment. The data sets analyzed consist of counts of cell types (different cancer cell populations expressing different fluorescent proteins, and non-fluorescent fibroblasts) from 9 subsequent images taken at an 8-hour frequency over a period of 3 days using the Operetta high-content imager (Perkin Elmer). Information regarding cell type (fluorescent profile) and spatial coordinates for each individual cell were extracted using the associated software (Harmony, Perkin Elmer).
Each image was subsequently tiled using a regular grid.
We choose the number of tiles to balance the fit of the model against the ability to capture local impacts between cell populations. More specifically, decreasing the tile size enables one to detect local impacts between cell populations, which is one of the objectives of our analysis. However, if the tiles are too small, most tiles will contain no cells, and the conditional Poisson model would not fit the data well. On the other hand, when the tiles are too large the model would fit the data well (the conditional Poisson would be approximately a conditional normal model), but we lose information on local impacts. We recommend 0 to 20 average cells per tile, since for such a choice our diagnostic and goodness-of-fit analyses suggest that the conditional Poisson model fits the data well whilst enabling us to measure local correlation effects between populations.
4.1 Cancer cell-fibroblast co-culture experiment
In this experiment, cancer cells are co-cultured with fibroblasts, a predominant cell type in the tumor microenvironment, believed to affect tumor progression, partly due to interactions with and activation by cancer cells . In this experiment, fibroblasts (F) are non-fluorescent whereas cancer cells fluoresce either in the red (R) or green (G) channels due to the experimental expression of mCherry or GFP proteins, respectively. Cells were initially seeded at a ratio of 1:1:2 (R:G:F).
Model selection and inference. We applied our methodology to quantify the magnitude and direction of the impacts that the considered cell types have on each other's growth. To select the relevant terms in the intensity expression (1), we carry out model selection using the BIC model selection criterion. In Table 5, we show estimated parameters for the full and the BIC models, with bootstrap confidence intervals in parentheses. Figure 2 illustrates estimated spatio-temporal impacts between cell types using a directed graph. The solid and dashed arrows represent respectively significant and not significant impacts between cell types at the confidence level. Significant impacts coincide with parameters selected by BIC.
The interactions within each cell type () are significant, which is consistent with healthy growing cells. As anticipated, the effects for the cancer cells are larger than those for the slower growing fibroblasts. The validity of the estimated parameters is also supported by the similar sizes of the parameters for the green and red cancer cells. This is expected, since the red and green cancer cells are biologically identical except for the fluorescent protein they express. Interestingly, the estimated effects within both types of cancer cells () are larger than the impacts they have on one another ( and ). This is not surprising, since reflects not only impacts between cells from the same cell population, but also cell proliferation. The fact that we are able to detect the impacts between the red and green cancer cells confirms that our methodology is sensitive enough to detect biologically relevant impacts even though no interactions were found between the cancer cells and the fibroblasts. This might be due to the fact that we used normal fibroblasts that had not previously been in contact with cancer cells and thus had not been activated to support tumor progression as is the case with cancer-activated fibroblasts.
Goodness-of-fit and one-step-ahead prediction. To illustrate the goodness-of-fit of the estimated model, we generate cell counts for each type in each tile, , from the Pois() distribution for , where is computed using observations at time , with parameters estimated from the entire dataset. In Figure 4, we compare the actually observed and generated cell counts for GFP cancer cells (G), mCherry cancer cells (R) and fibroblasts (F) across the entire image. The solid and dashed curves for all cell types are close, suggesting that the model fits the data reasonably well. As anticipated, the overall growth rates for the red and green cancer cells are similar, and considerably larger than the growth rate for fibroblasts.
To assess the prediction performance of our method, we consider one-step-ahead forecasting using parameters estimated from a moving window of five time points. In Figure 3, we show quantiles of observed cell counts against predicted counts for each tile. The upper and lower confidence bounds are computed non-parametrically by taking and , where and are the empirical distributions of the observations and predictions at time respectively . The identity line falls within the confidence bands in each plot, indicating a satisfactory prediction performance.
Comparison with MCAR model. Next, we compare the estimates as well as the goodness-of-fit on the real data with the MCAR model. Parameter estimates are shown in Table 5, with confidence intervals given in parentheses. Results from both models are mostly consistent with each other. Specifically, both models show that impacts within each cell type () are significant, the effects for cancer cells are larger than those for the slower growing fibroblasts, the green and red cancer cells have a positive impact on each other, and cancer cells have no impact on fibroblasts. The only difference is that the MCAR model shows a negative impact of fibroblasts on the green cancer cells only, while our model detects no significant impact on either type of cancer cell. Since the red and green cancer cells are biologically identical except for the fluorescent protein they express, we expect a symmetrical result for both types of cancer cells.
In Figure 4, apart from the observed (solid curve) and generated (dashed curve) cell counts from our model, we also show the generated cell counts from the MCAR model (dotted curve) for the green cancer cells (G), red cancer cells (R) and fibroblasts (F) across the entire image. Compared to the dotted curves, the dashed curves are slightly closer to the solid ones, which suggests that our model is more appropriate for analysing this type of data than the MCAR model.
5 Conclusion and final remarks
In this paper, we introduced a conditional spatial autoregressive model and accompanying inference tools for multivariate spatio-temporal cell count data. The new methodology enables one to measure the overall cell growth rate in longitudinal experiments and spatio-temporal interactions with either homogeneous or heterogeneous cell populations. The proposed inference approach is computationally tractable and strikes a good balance between computational feasibility and statistical accuracy. Numerical findings from simulated and real data in Sections 3 and 4 confirm the validity of the proposed approach in terms of prediction, goodness-of-fit and estimation accuracy.
The data sets described in this paper serve as a proof-of-concept that the proposed methodology works. However, the potential applications and the relevant questions that the methodology can help to answer in cancer cell biology are plentiful. To build on the examples given in this paper, the methodology can be used to study interactions between cancer cells and a wide range of cancer-relevant cell types such as cancer-activated fibroblasts, macrophages, and other immune cells when co-cultured. Since a substantial proportion of cancer cells in tumors are in close proximity to other cell types that have been shown to affect tumor progression, using these co-cultures is more representative of the situation in a patient than studying cancer cells on their own. In addition to giving the final cell number, the presented approach can dissect which cell types affect the growth of others, and to what extent, in complex heterogeneous populations. This could be relevant in a drug discovery setting to determine whether a drug affects cancer cell growth due to internal effects (on other cancer cells) or by interfering with the interaction between the cancer cells and other cell types. Drugs with different targets and mechanisms of action are particularly sought after, as they provide a wider target profile, increasing the chance of patients responding as well as reducing the risk of tumors becoming resistant. The impact of different genes and associated pathways in different cell types in relation to inter-cellular interactions can also be studied by genetically modifying the cell type(s) in question before mixing the cells together. This could help to identify new potential drug targets. Our approach is also applicable in other kinds of studies where local spatial cell-cell interactions are believed to affect cell growth, such as studies of neurodegenerative diseases and wound healing/tissue regeneration.
In addition to evaluating cell growth, our approach can also be used to study transitions between cellular phenotypes upon interaction with other cell types, provided that the different phenotypes studied can be distinguished from one another based on the image data. Finally, it is worth noting that issues may arise when cells become too confluent/dense, as this may lead to segmentation problems in the imaging system. If the cells become completely confluent, they are likely to progressively stop growing. If one wants to measure for a longer period of time, experiments can be performed in larger wells/plates or with smaller starting cell numbers.
Our method offers several practical advantages to researchers interested in analysing multivariate count data on heterogeneous cell populations. First, the conditional Poisson model does not require tracking individual cells across time, a process that is often difficult to automate due to cell movement, morphology changes at subsequent time points, and additional complications related to storage of large data files. Second, we are able to quantify local spatio-temporal interactions between different cell populations from a very simple experimental set-up where the different cell populations are grown together in a single experimental condition (co-culture). An alternative, solely experimentally-based strategy would require monitoring the different cell types alone and together at different cell densities (number of cells per condition) in order to make inferences in terms of potential interactions. However, such an approach would give no possibility of evaluating the spatial relations in the co-culture conditions and would still restrict the number of simultaneously tested cell types to two.
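To make the untracked count data concrete, the following minimal Python sketch bins per-cell coordinates from a single image into a regular grid of tiles and counts the cells of each type in each tile. The function name `tile_counts`, the `(x, y, cell_type)` tuple layout, and the image and grid dimensions are illustrative assumptions and are not taken from the original analysis.

```python
from collections import Counter


def tile_counts(cells, width, height, n_cols, n_rows):
    """Count cells per (row, col, cell_type) on a regular n_rows x n_cols grid.

    `cells` is an iterable of (x, y, cell_type) with 0 <= x <= width and
    0 <= y <= height. All names and the data layout are hypothetical.
    """
    counts = Counter()
    for x, y, cell_type in cells:
        # Clamp so that points on the right/bottom edge fall in the last tile.
        col = min(int(x / width * n_cols), n_cols - 1)
        row = min(int(y / height * n_rows), n_rows - 1)
        counts[(row, col, cell_type)] += 1
    return counts


# Toy data for one time point: green (G) and red (R) cancer cells, fibroblasts (F).
cells_t0 = [(10.0, 5.0, "G"), (12.0, 7.0, "G"), (90.0, 95.0, "R"), (55.0, 40.0, "F")]
counts_t0 = tile_counts(cells_t0, width=100.0, height=100.0, n_cols=2, n_rows=2)
```

Repeating this for every image yields one count per tile, cell type and time point, without linking any individual cell from one image to the next.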
In the future, we foresee several useful extensions of the current methodology, possibly enabling the treatment of more complex experimental settings. First, complex experiments involving a large number of cell populations, , would imply an over-parametrized model. Clearly, this large number of parameters would be detrimental to both statistical accuracy and reliable optimization of the likelihood objective function in (4). To address these issues, we plan to explore a penalized likelihood of the form , where is a nonnegative sparsity-inducing penalty function. For example, in a different likelihood setting, Bradic et al.  consider the -type penalty .
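As a rough illustration of the idea of a penalized likelihood, the sketch below subtracts a sparsity-inducing penalty from a Poisson log-likelihood. It makes two loud assumptions: a generic Poisson log-linear regression stands in for the model of Section 2, and an L1 (lasso-type) penalty is used purely for illustration, since the specific penalty intended in the text is not recoverable here. The names `poisson_loglik`, `penalized_objective` and `lam` are hypothetical.

```python
import math


def poisson_loglik(theta, X, y):
    """Poisson log-linear log-likelihood, dropping the constant -log(y!) terms.

    ASSUMPTION: mean = exp(x . theta); this is a stand-in, not the paper's model.
    """
    total = 0.0
    for row, count in zip(X, y):
        eta = sum(t * v for t, v in zip(theta, row))  # linear predictor
        total += count * eta - math.exp(eta)          # y * log(mu) - mu
    return total


def penalized_objective(theta, X, y, lam):
    """Log-likelihood minus an ASSUMED L1 penalty with weight lam >= 0."""
    penalty = lam * sum(abs(t) for t in theta)
    return poisson_loglik(theta, X, y) - penalty


# Toy evaluation: one observation, two interaction parameters.
value = penalized_objective([1.0, -1.0], X=[[1.0, 1.0]], y=[2], lam=0.5)
```

Maximizing such an objective shrinks small interaction parameters towards zero, which is the sense in which the penalty counteracts over-parametrization.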
Second, for certain experiments, it would be desirable to modify the statistics in eq. (2) to include additional information on cell growth such as the distance between heterogeneous cells, and covariates describing cell morphology.
Third, it would be useful to develop a more principled way to select the tile size and number, and to consider tiling the microscope image into a hexagonal lattice. This is a more natural choice in real applications, since the distances between neighboring tiles would be more even than in a regular lattice.
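The remark on neighbor distances can be checked with a short calculation. The sketch below compares centre-to-centre distances to the neighbors of a tile in two layouts with unit spacing. It assumes that the regular lattice is a square grid whose neighborhood includes the four diagonal tiles (eight neighbors in total); this is an assumption made for illustration and may differ from the neighborhood of Section 2.

```python
import math

# Offsets from a tile centre to its neighbours, in units of the tile spacing.
# ASSUMPTION: the square-grid neighbourhood includes the four diagonal tiles.
square_8 = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
hexagonal_6 = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]


def distance_spread(offsets):
    """Return (min, max) centre-to-centre distance over the neighbour offsets."""
    distances = [math.hypot(dx, dy) for dx, dy in offsets]
    return min(distances), max(distances)


square_min, square_max = distance_spread(square_8)   # edge vs diagonal neighbours
hex_min, hex_max = distance_spread(hexagonal_6)      # all six neighbours
```

Under these assumptions the diagonal neighbors of a square tile are a factor of the square root of two farther away than the edge neighbors, whereas all six neighbors of a hexagonal tile are equidistant.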
Finally, although numerical results (not reported here) show that our method is quite robust in the presence of mild outliers (with around of contaminated data), we expect that more severe or numerous outliers will have some influence on the estimates, since the Poisson score function is unbounded. To address this problem, the log-likelihood scores in eq. (5) should be replaced by a robust alternative. Following Ferrari and La Vecchia  and La Vecchia et al. , robustness can be obtained by the so-called -entropy estimation method, obtained simply by replacing the usual logarithm in the log-likelihood estimating equation by the -logarithm function if , and if , for all . This ensures a bounded influence function for the implied estimator and therefore guarantees control of the bias under contamination.
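A minimal sketch of this replacement is given below. The definition of the generalized logarithm is the one commonly used in the Lq-likelihood literature, namely log(u) when q = 1 and (u^(1-q) - 1)/(1-q) otherwise; it is assumed here, because the formula itself did not survive in the text above. A constant Poisson mean `mu` is also assumed for simplicity, and the function names are hypothetical.

```python
import math


def lq(u, q):
    """Generalized logarithm: log(u) if q == 1, else (u**(1-q) - 1) / (1-q).

    ASSUMPTION: this standard form is what the text refers to.
    """
    if u <= 0:
        raise ValueError("u must be positive")
    if q == 1:
        return math.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)


def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu > 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def lq_likelihood(counts, mu, q):
    """Sum of generalized logarithms of Poisson probabilities.

    With q = 1 this is the ordinary log-likelihood.
    """
    return sum(lq(poisson_pmf(y, mu), q) for y in counts)
```

Differentiating this objective weights each observation's usual score by its model probability raised to the power 1 - q, so that for q slightly below 1 observations that are very unlikely under the model contribute less to the estimating equation.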
The authors wish to acknowledge support from the Australian National Health and Medical Research Council grants 1049561, 1064987 and 1069024 to Frédéric Hollande. Christina Mølck is supported by the Danish Cancer Society.
In the first part of this section, we provide technical lemmas required to prove asymptotic properties of the estimator
Denote as the expectation with respect to and as the expectation of . Let be the set of tiles in the neighborhood of tile , with radius . Specifically, for two locations and , we say if Thus, the neighborhood defined in Section 2 is of radius , i.e. . Denote . Actually, for any tile that is not on the boundary of the image,
In the remainder of this paper we use the following assumptions:
A.1: The parameter space is a compact subset of , and that is the unique maximiser of
A.2: The matrix is full rank.
Let be independent Poisson random variables with mean respectively, where is a finite positive integer. Then for any positive integer ,
Denote , with corresponding observation and conditional mean , then (8)
the denotes Stirling number of the second kind,
Similarly, for any , we have since .
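The role of the Stirling numbers can be illustrated numerically. The exact displayed statement is not recoverable here, so the sketch below checks only the standard identity that the k-th raw moment of a Poisson variable with mean lambda equals the sum over j of S(k, j) lambda^j, where S(k, j) is the Stirling number of the second kind. Since a sum of independent Poisson variables is again Poisson with mean equal to the sum of the means, the same identity applies to such sums. All function names are hypothetical.

```python
import math


def stirling2(n, k):
    """Stirling number of the second kind via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def poisson_moment_stirling(lam, k):
    """k-th raw moment of Poisson(lam) from the Stirling-number identity."""
    return sum(stirling2(k, j) * lam ** j for j in range(k + 1))


def poisson_moment_series(lam, k, upper=200):
    """k-th raw moment of Poisson(lam) by direct (truncated) summation."""
    total = 0.0
    for x in range(upper + 1):
        log_pmf = -lam + x * math.log(lam) - math.lgamma(x + 1)
        total += (x ** k) * math.exp(log_pmf)
    return total


# Third moment with mean 2.5: lam + 3*lam**2 + lam**3 = 36.875.
m_stirling = poisson_moment_stirling(2.5, 3)
m_series = poisson_moment_series(2.5, 3)
```

The truncation point `upper=200` is an arbitrary choice that is adequate for small means; it would need to be increased for large lambda.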
Next, we proceed by induction.
For , by the conditional independence assumption and Lemma 1, we have
Since and has constant entries at time point ,
Suppose eq. (8) is true for , then for , we have
Given Assumption A.1, for any finite constant and
By the definition of given in Lemma 2, we know that is bounded for all bounded under assumption A.1. Thus, Lemma 2 implies
For simplicity, define the distance between tile and as if
Let be the collection of counts in tiles at time that are correlated with the count in tile at time (). Due to the neighborhood structure in the autoregressive term described in Section 2, one can easily tell that is a neighbourhood around tile , with the radius equal to .
Due to the condition that has constant entries at time , we have if which is true when
For any , is a neighborhood around tile , with a radius .
Since we have
In the second part of this section, we study the asymptotic properties of the estimator .
(Existence and uniqueness) If Assumption A.2 holds, then there exists a unique maximizer of , denoted by .
First, since is compact and is continuous, at least one maximiser of exists. Next, we wish to prove that the maximiser is unique. The Hessian matrix of can be written as a block matrix
where is a matrix. Matrix is positive semidefinite with rank 1. By Assumption A.2, is full rank, which means is positive definite for all and , since This shows that is strictly convex, which implies is unique.
[Consistency] If the regularity assumption A.1 holds, then with probability tending to 1, as .
We proceed by verifying the conditions of Theorem 2 in . First we show that the score functions are -Uniform Integrable for , i.e. (10)
The general form of each entry of is , take , we have (a)
which is finite by Lemma 3. This gives us the boundedness of , i.e.
which implies -Uniform Integrability, for .
Second, we show the stochastic equicontinuity of , i.e.
The is a matrix, with each column being either or , and
Thus, the non-zero entries of have the general form: , which are bounded by an argument analogous to Lemma 3.
Thirdly, we check mixing conditions. Let and be two subsets of , and let be the algebra generated by random variables .
Then the mixing coefficient for the random field is defined as
Following Bai et al. , in an dimensional space, we need (a) (b) For (c) where and is the distance between sets and .
For any fixed and ,
consider and , then and as By Lemma 4, we have if . Thus, for any , provided that , that is, if .
This implies all three mixing conditions.
If the regularity assumptions A.1 and A.2 hold, we have converges in distribution to a variate Normal with zero mean vector and identity variance, as .
First, we show the uniform law of large numbers for : (11)
where as defined in Section 2. Note that (12)
The first term in eq. (12) is , since , which is shown to be finite in the proof of Proposition 2.
For the second term in eq. (12), by Lemma 2 we have
where is finite by Lemma 2. Thus, the second term in eq. (12) is also of order element wise, which means as Therefore, eq. (11) follows by Chebyshev’s inequality.
Second, , which is shown to be positive definite under Assumption A.2 in Proposition 1. Thus, together with uniform Integrability in eq. (10) and the mixing conditions, by Theorem 1 in , we have (13)
Finally, by Taylor’s expansion, (14)
where is a vector with elements between and Since by Proposition 2, we have The second derivative is a matrix, with entries being either or , where and and Due to the structure of and in Section 2, all non-zero elements in are monotone with respect to Thus, there exists such that for all Therefore, we have ,
which can be shown to be finite by an argument analogous to Lemma 3.
Thus, eq. (13) can be written as
Besag J. Spatial interaction and the statistical analysis of lattice systems. J R Stat Soc Series B Methodol. 1974;192–236.
Knorr-Held L. Bayesian modelling of inseparable space-time variation in disease risk. 1999.
Quick H, Waller LA, Casper M. Hierarchical multivariate space-time methods for modeling counts with an application to stroke mortality data. arXiv preprint arXiv:1602.04528. 2016.
Cressie N, Wikle CK. Statistics for spatio-temporal data. John Wiley & Sons, 2011.
Cox DR, Gudmundsson G, Lindgren G, Bondesson L, Harsaae E, Laake P, Juselius K, Lauritzen SL. Statistical analysis of time series: Some recent developments [with discussion and reply]. Scand J Stat. 1981;93–115.
Bradley JR, Holan SH, Wikle CK. Multivariate spatio-temporal models for high-dimensional areal data with application to longitudinal employer-household dynamics. Ann Appl Stat. 2015;9:1761–1791.
Bradley JR, Holan SH, Wikle CK. Multivariate spatio-temporal survey fusion with application to the American Community Survey and Local Area Unemployment Statistics. Stat. 2016;5:224–233.
Holan S, Wikle C. Hierarchical dynamic generalized linear mixed models for discrete-valued spatio-temporal data. Handbook of Discrete-Valued Time Series, 2015.
Dunsmuir WT, Scott DJ, et al. The glarma package for observation driven time series regression of counts. J Stat Softw. 2015;67:1–36.
Wikle CK, Anderson CJ. Climatological analysis of tornado report counts using a hierarchical Bayesian spatiotemporal model. J Geophys Res Atmos. 2003;108.
Ferrari D, La Vecchia D. On robust estimation via pseudo-additive information. Biometrika. 2011;99:238–244.
Leroux BG, Lei X, Breslow N. Estimation of disease rates in small areas: a new mixed model for spatial dependence. In: Statistical models in epidemiology, the environment, and clinical trials, 179–191. Springer, 2000.
Lee D. CARBayes: An R package for Bayesian spatial modeling with conditional autoregressive priors. J Stat Softw. 2013;55:1–24.
Koenker R. Quantile regression. No. 38, Cambridge University Press, 2005.
Bradic J, Fan J, Wang W. Penalized composite quasi-likelihood for ultrahigh dimensional variable selection. J R Stat Soc Series B Stat Methodol. 2011;73:325–349.
About the article
Published Online: 2018-07-07
This article was supported by the Australian National Health and Medical Research Council (1049561, 1064987 and 1069024)