Äijö, T. and H. Lähdesmäki (2009): “Learning gene regulatory networks from gene expression measurements using non-parametric molecular kinetics,” Bioinformatics, 25, 2937–2944.
Barenco, M., D. Tomescu, D. Brewer, R. Callard, J. Stark, and M. Hubank (2006): “Ranked prediction of p53 targets using hidden variable dynamic modeling,” Genome Biology, 7, R25.
Beal, M., F. Falciani, Z. Ghahramani, C. Rangel, and D. Wild (2005): “A Bayesian approach to reconstructing genetic regulatory networks with hidden factors,” Bioinformatics, 21, 349–356.
Beal, M. (2003): Variational Algorithms for Approximate Bayesian Inference, Ph.D. thesis, Gatsby Computational Neuroscience Unit, University College London, UK.
Bishop, C. M. (2006): Pattern Recognition and Machine Learning, Singapore: Springer.
Brandt, S. (1999): Data Analysis: Statistical and Computational Methods for Scientists and Engineers, New York, USA: Springer.
Brooks, S. and A. Gelman (1999): “General methods for monitoring convergence of iterative simulations,” J. Comput. Graph. Stat., 7, 434–455.
Butte, A. J. and I. S. Kohane (2000): “Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements,” in Pacific Symposium on Biocomputing, volume 5, 418–429.
Davis, J. and M. Goadrich (2006): “The relationship between Precision-Recall and ROC curves,” Proceedings of the 23rd International Conference on Machine Learning, 233–240.
Edwards, K., O. Akman, K. Knox, P. Lumsden, A. Thomson, P. Brown, A. Pokhilko, L. Kozma-Bognar, F. Nagy, D. Rand, and A. J. Millar (2010): “Quantitative analysis of regulatory flexibility under changing environmental conditions,” Mol. Syst. Biol., 6, 424.
Feugier, F. and A. Satake (2012): “Dynamical feedback between circadian clock and sucrose availability explains adaptive response of starch metabolism to various photoperiods,” Front. Plant Sci., 3.
Friedman, J., T. Hastie, and R. Tibshirani (2010): “Regularization paths for generalized linear models via coordinate descent,” J. Stat. Softw., 33, 1–22.
Geiger, D. and D. Heckerman (1994): “Learning gaussian networks,” in International Conference on Uncertainty in Artificial Intelligence, Seattle, WA: Morgan Kaufmann Publishers, 235–243.
Grzegorczyk, M. and D. Husmeier (2012): “A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology,” Stat. Appl. Genet. Mol. Biol. (SAGMB), 11, article 7.
Grzegorczyk, M. and D. Husmeier (2013): “Regularization of non-homogeneous dynamic Bayesian networks with global information-coupling based on hierarchical Bayesian models,” Mach. Learn., 91, 1–50.
Hanley, J. A. and B. J. McNeil (1982): “The meaning and use of the area under a receiver operating characteristic (ROC) curve,” Radiology, 143, 29–36.
Hastie, T., R. Tibshirani, and J. H. Friedman (2001): The Elements of Statistical Learning, volume 1, New York: Springer.
Herrero, E., E. Kolmos, N. Bujdoso, Y. Yuan, M. Wang, M. C. Berns, H. Uhlworm, G. Coupland, R. Saini, M. Jaskolski, A. Webb, J. Gonçalves, and S. J. Davis (2012): “EARLY FLOWERING4 recruitment of EARLY FLOWERING3 in the nucleus sustains the Arabidopsis circadian clock,” Plant Cell, 24, 428–443.
Husmeier, D. (1999): Neural Networks for Conditional Probability Estimation: Forecasting Beyond Point Predictions, Perspectives in Neural Computing, London: Springer.
Husmeier, D. (2003): “Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks,” Bioinformatics, 19, 2271–2282.
Kalaitzis, A. A., A. Honkela, P. Gao, and N. D. Lawrence (2013): gptk: Gaussian processes tool-kit, URL http://CRAN.R-project.org/package=gptk, R package version 1.06.
Ko, Y., C. Zhai, and S. Rodriguez-Zas (2007): “Inference of gene pathways using Gaussian mixture models,” in International Conference on Bioinformatics and Biomedicine, Fremont, CA, 362–367.
Kolmos, E., M. Nowak, M. Werner, K. Fischer, G. Schwarz, S. Mathews, H. Schoof, F. Nagy, J. M. Bujnicki, and S. J. Davis (2009): “Integrating ELF4 into the circadian system through combined structural and functional studies,” HFSP J, 3, 350–366.
Lawrence, N. D., M. Girolami, M. Rattray, and G. Sanguinetti (2010): Learning and inference in computational systems biology, Cambridge, MA: MIT Press.
Locke, J. C. W., M. M. Southern, L. Kozma-Bognár, V. Hibberd, P. E. Brown, M. S. Turner, and A. J. Millar (2005): “Extension of a genetic network model by iterative experimentation and mathematical analysis,” Mol. Syst. Biol., 1.
Locke, J. C. W., L. Kozma-Bognár, P. D. Gould, B. Fehér, E. Kevei, F. Nagy, M. S. Turner, A. Hall, and A. J. Millar (2006): “Experimental validation of a predicted feedback loop in the multi-oscillator clock of Arabidopsis thaliana,” Mol. Syst. Biol., 2.
Margolin, A. A., I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. Dalla-Favera, and A. Califano (2006): “ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context,” BMC Bioinformatics, 7.
Marin, J.-M. and C. P. Robert (2007): Bayesian core: A practical approach to computational Bayesian statistics, New York, USA: Springer.
Morrissey, E. R., M. A. Juárez, K. J. Denby, and N. J. Burroughs (2011): “Inferring the time-invariant topology of a nonlinear sparse gene regulatory network using fully Bayesian spline autoregression,” Biostatistics, 12, 682–694.
Murphy, K. P. (2012): Machine learning: a probabilistic perspective, Cambridge, MA: MIT Press.
Nabney, I. (2002): NETLAB: algorithms for pattern recognition, Springer.
Neuneier, R., F. Hergert, W. Finnoff, and D. Ormoneit (1994): “Estimation of conditional densities: a comparison of neural network approaches,” in International Conference on Artificial Neural Networks, National Cheng Kung University, Taiwan: Springer, 689–692.
Opgen-Rhein, R. and K. Strimmer (2007): “From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data,” BMC Syst. Biol., 1.
Pokhilko, A., A. Fernández, K. Edwards, M. Southern, K. Halliday, and A. Millar (2012): “The clock gene circuit in Arabidopsis includes a repressilator with additional feedback loops,” Mol. Syst. Biol., 8, 574.
Pokhilko, A., S. Hodge, K. Stratford, K. Knox, K. Edwards, A. Thomson, T. Mizuno, and A. Millar (2010): “Data assimilation constrains new connections and components in a complex, eukaryotic circadian clock model,” Mol. Syst. Biol., 6.
Pokhilko, A., P. Mas, A. J. Millar, et al. (2013): “Modeling the widespread effects of TOC1 signaling on the plant circadian clock and its outputs,” BMC Syst. Biol., 7, 1–12.
Rasmussen, C. E., R. M. Neal, G. E. Hinton, D. van Camp, M. Revow, Z. Ghahramani, R. Kustra, and R. Tibshirani (1996): “The DELVE manual,” URL http://www.cs.toronto.edu/delve.
Rasmussen, C. E. (1996): Evaluation of Gaussian processes and other methods for non-linear regression, Ph.D. thesis, Citeseer.
Rasmussen, C. and C. Williams (2006): Gaussian processes for machine learning, volume 1, Cambridge, MA: MIT Press.
Solak, E., R. Murray-Smith, W. E. Leithead, D. J. Leith, and C. E. Rasmussen (2002): “Derivative observations in Gaussian process models of dynamic systems,” Advances in Neural Information Processing Systems, MIT Press: Vancouver, Canada, 1033–1040.
Tibshirani, R. (1995): “Regression shrinkage and selection via the Lasso,” J. R. Stat. Soc. Series B, 58, 267–288.
TiMet (2014): “The TiMet Project - Linking the clock to metabolism,” URL http://timing-metabolism.eu.
Tipping, M. and A. Faul (2003): “Fast marginal likelihood maximisation for sparse Bayesian models,” in Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Key West, FL, 1, 3–6.
Tipping, M. (2001): “Sparse Bayesian learning and the relevance vector machine,” Journal of Machine Learning Research, 1, 211–244.
Weirauch, M. T., A. Cote, R. Norel, M. Annala, Y. Zhao, T. R. Riley, J. Saez-Rodriguez, T. Cokelaer, A. Vedenko, S. Talukder, DREAM5 Consortium, H. J. Bussemaker, Q. D. Morris, M. L. Bulyk, G. Stolovitzky, and T. R. Hughes (2013): “Evaluation of methods for modeling transcription factor sequence specificity,” Nat. Biotechnol., 31, 126–134.
Werhli, A. V., M. Grzegorczyk, and D. Husmeier (2006): “Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks,” Bioinformatics, 22, 2523–2531.
Wilkinson, D. (2011): Stochastic modeling for systems biology, volume 44, Taylor & Francis, Boca Raton, FL: CRC press.
Zoppoli, P., S. Morganella, and M. Ceccarelli (2010): “TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach,” BMC Bioinformatics, 11.
About the article
Published Online: 2014-05-26
Published in Print: 2014-06-01
Note that the sets of potential regulators are defined for each gene g specifically. That is, the potential regulators for two target variables yg and yg′ can be different, e.g., if certain (biologically motivated) restrictions are imposed.
For consistency with the fundamental equation of transcription, equation (1), we will enforce that each regulator set πg for yg contains the concentration xg of g, symbolically xg∈πg.
Note that vector x·,t includes every available regulator without any dependency on the target gene g.
Note that the repeated bi-partitioning of the genes into targets and putative regulators renders Glasso equivalent to Lasso, as discussed on page 4 of Friedman et al. (2008). Lasso will be discussed in Section 2.3.
We set: ν=0.005, Aδ=2, and Bδ=0.2, as in Grzegorczyk and Husmeier (2012).
We note that the coupled variant of the non-homogeneous Bayesian regression model cannot be represented properly as a graphical model, as the regression parameter vectors are sequentially coupled among adjacent segments via equations (21–22).
For each yg we apply exactly the same permutation to order the realizations of the explanatory variables (covariates) and thereby ensure that the segment-specific design matrices are built properly.
In our study we follow Rogers and Girolami (2005) and use a slightly modified version of the fast marginal likelihood algorithm from Tipping and Faul (2003) for optimization.
We use the authors’ terminology, although the model is not a proper Bayesian network.
More precisely, the reduced mean vector is obtained by deleting the element corresponding to the target variable yg,t in μg,h, and the reduced covariance matrix is obtained by deleting the row and the column corresponding to yg,t in Σg,h.
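The deletion described in this footnote can be sketched in Python with NumPy. This is only an illustration: the function name `drop_target`, the argument `target_idx`, and the example numbers are hypothetical and do not come from the article.

```python
import numpy as np

def drop_target(mu, sigma, target_idx):
    """Remove the target variable from a Gaussian mean vector and covariance matrix.

    mu: 1-D array (mean vector), sigma: 2-D square array (covariance matrix),
    target_idx: position of the target variable (hypothetical argument name).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Indices of all variables except the target.
    keep = np.delete(np.arange(mu.shape[0]), target_idx)
    # Delete one element of the mean, and one row plus one column of the covariance.
    return mu[keep], sigma[np.ix_(keep, keep)]

# Illustrative numbers only: three variables, the target sits at position 1.
mu_example = np.array([1.0, 2.0, 3.0])
sigma_example = np.array([[4.0, 1.0, 0.0],
                          [1.0, 5.0, 2.0],
                          [0.0, 2.0, 6.0]])
mu_reduced, sigma_reduced = drop_target(mu_example, sigma_example, target_idx=1)
```

Using `np.ix_` selects the same index set along both axes, so the row and the column of the target are removed together.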
Note that the abbreviation “BGe” was introduced by Geiger and Heckerman (1994) and stands for Bayesian metric for Gaussian networks having score equivalence; see Geiger and Heckerman (1994) for more details.
We turned off the translation of those proteins that contribute to the interactions we wanted to suppress.
In the model equations defined by Guerriero et al. (2012) the concentration of P only appears in a product with the binary light indicator L, where the light variable L is equal to zero in the absence of light.
For the Bayesian methods this can be enforced by setting the prior P(πg) to zero for all πg with xg∉πg.
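The zero-prior constraint in this footnote can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the article: the prior is taken to be uniform over the admissible regulator sets, the set size is capped by a hypothetical `max_size`, and the function and variable names are invented for the example.

```python
from itertools import combinations

def constrained_uniform_prior(candidates, target, max_size):
    """Map each regulator set to a prior probability.

    Sets that do not contain `target` (the target gene's own concentration)
    receive prior zero; the remaining sets share the mass uniformly
    (the uniform choice is an assumption made for this sketch).
    """
    if target not in candidates or max_size < 1:
        raise ValueError("target must be a candidate and max_size must be >= 1")
    # Enumerate every candidate set with at most `max_size` regulators.
    all_sets = [
        frozenset(subset)
        for size in range(max_size + 1)
        for subset in combinations(candidates, size)
    ]
    admissible = [s for s in all_sets if target in s]
    weight = 1.0 / len(admissible)
    return {s: (weight if target in s else 0.0) for s in all_sets}

# Hypothetical example: three potential regulators, at most two per set.
prior = constrained_uniform_prior(["x_g", "x_a", "x_b"], target="x_g", max_size=2)
```

Because excluded sets have exactly zero prior mass, any posterior computed from this prior also assigns them zero probability, whatever the likelihood.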
Matlab software for Disciplined Convex Programming: http://cvxr.com/cvx/.
Note that the maximal number of hidden nodes n is restricted by the number of regulators, Gg. In our simulation study we analyzed various data sets, and we employed the lowest Gg as an upper bound on the number of hidden nodes n.
In our study we initialized the EM algorithm with allocations obtained by the k-means clustering algorithm. The initial 𝕂g centers of the k-means algorithm were sampled from a multivariate Gaussian N(μ, I) distribution, where I is the identity matrix and μ is a random expectation vector with entries sampled independently from continuous uniform distributions on the interval [–1, +1]. To avoid initializing the EM algorithm with allocations that contain unoccupied (empty) mixture components, we re-sampled the initial centers and re-ran the k-means algorithm whenever the k-means output contained empty components.
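The initialization procedure in this footnote can be sketched as follows. This is a minimal illustration, not the authors' implementation: the plain Lloyd-style k-means loop, the iteration count, the retry limit `max_tries`, and all function names are assumptions made for the example.

```python
import numpy as np

def sample_initial_centers(k, dim, rng):
    # Random expectation vector with entries drawn uniformly from [-1, +1].
    mu = rng.uniform(-1.0, 1.0, size=dim)
    # k centers drawn from N(mu, I).
    return rng.normal(loc=mu, scale=1.0, size=(k, dim))

def assign(data, centers):
    # Index of the nearest center (squared Euclidean distance) for each point.
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_labels(data, centers, n_iter=50):
    centers = centers.copy()
    for _ in range(n_iter):
        labels = assign(data, centers)
        for j in range(centers.shape[0]):
            members = data[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return assign(data, centers)

def initial_allocation(data, k, rng, max_tries=1000):
    """Re-sample centers and re-run k-means until no component is empty."""
    for _ in range(max_tries):
        centers = sample_initial_centers(k, data.shape[1], rng)
        labels = kmeans_labels(data, centers)
        if len(np.unique(labels)) == k:
            return labels
    raise RuntimeError("no allocation without empty components was found")

# Illustrative synthetic data: 60 observations of 3 standardized variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(60, 3))
labels = initial_allocation(data, k=3, rng=rng)
```

The returned `labels` would then serve as the starting allocation for the EM algorithm; the retry limit only guards against an endless loop when the requested number of components is too large for the data.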
Loosely speaking, this setting (μ0=0 and T0=I) reflects our “prior belief” that all domain variables, i.e., the potential regulators and the target variable, are i.i.d. standard normally distributed.
The sensitivity is the proportion of true interactions that have been detected; the specificity is the proportion of non-interactions that have been avoided.
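These two definitions can be made concrete with a short sketch. The edge representation as (regulator, target) pairs, the function name, and the toy network are assumptions made for illustration only.

```python
def sensitivity_specificity(true_edges, predicted_edges, all_pairs):
    """Compute sensitivity and specificity of a predicted network.

    Each edge is a (regulator, target) pair; `all_pairs` lists every
    possible interaction, so the non-interactions are all_pairs - true_edges.
    """
    true_edges = set(true_edges)
    predicted_edges = set(predicted_edges)
    non_edges = set(all_pairs) - true_edges
    tp = len(true_edges & predicted_edges)   # true interactions detected
    fn = len(true_edges - predicted_edges)   # true interactions missed
    tn = len(non_edges - predicted_edges)    # non-interactions avoided
    fp = len(non_edges & predicted_edges)    # non-interactions wrongly predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy network with three genes and no self-loops: six possible interactions.
genes = ["A", "B", "C"]
all_pairs = [(r, t) for r in genes for t in genes if r != t]
true_edges = [("A", "B"), ("B", "C")]
predicted_edges = [("A", "B"), ("C", "A")]
sens, spec = sensitivity_specificity(true_edges, predicted_edges, all_pairs)
```

In this toy case one of the two true interactions is detected (sensitivity 0.5) and three of the four non-interactions are avoided (specificity 0.75).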