Statistical Applications in Genetics and Molecular Biology

Volume 12, Issue 6


A new variance stabilizing transformation for gene expression data analysis

Diana M. Kelmansky / Elena J. Martínez / Víctor Leiva
  • Corresponding author
  • Departamento de Estadística, Universidad de Valparaíso, Avda. Gran Bretaña 1111, Playa Ancha, Valparaíso, Chile
Published Online: 2013-10-16 | DOI: https://doi.org/10.1515/sagmb-2012-0030


In this paper, we introduce a new family of power transformations that has the generalized logarithm as one of its members, in the same way that the usual logarithm belongs to the family of Box-Cox power transformations. Although the new family was developed for analyzing gene expression data, it applies to a wider range of data in which the variance is related to the mean. We study the analytical properties of the new family of transformations, as well as the mean-variance relationships that its members stabilize. We propose a methodology based on this new family, which includes a simple strategy for selecting the family member appropriate for a given data set. We evaluate the finite-sample behavior of different classical and robust estimators based on this strategy by Monte Carlo simulations. Finally, we apply the proposed transformation to real genomic data to show empirically how the new methodology stabilizes the variance of these data.

Keywords: classical and robust estimators; linear models; microarrays; Monte Carlo method; power transformations; R software; regression methods
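
For orientation, the two transformations named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only and is not the new family proposed in the paper, whose definition does not appear on this page. It assumes the usual Box-Cox form (y^λ − 1)/λ and one common parametrization of the generalized logarithm, log(y + sqrt(y² + c²)). The function names, the parameter values, and the simulated two-component error model (multiplicative lognormal error plus additive normal error) are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def box_cox(y, lam):
    """Box-Cox power transformation for y > 0; lam == 0 gives the usual logarithm."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def glog(y, c):
    """Generalized logarithm log(y + sqrt(y^2 + c^2)), defined for every real y when c > 0.

    Computed as asinh(y / c) + log(c), which is algebraically the same expression
    but avoids cancellation when y is negative.
    """
    return math.asinh(y / c) + math.log(c)


def simulate_sds(mu_levels, sigma_eta=0.1, sigma_eps=20.0, n=10000, seed=0):
    """Simulate y = mu * exp(eta) + eps at each mean level (assumed illustrative model).

    eta ~ N(0, sigma_eta^2) is the multiplicative error, eps ~ N(0, sigma_eps^2)
    the additive error. Returns (mu, sd of y, sd of glog(y)) for each level.
    """
    rng = random.Random(seed)
    # Standard deviation of exp(eta), i.e. of a lognormal variable with log-mean 0.
    s_eta = math.sqrt(math.exp(sigma_eta ** 2) * (math.exp(sigma_eta ** 2) - 1.0))
    # Delta-method choice of c: Var(y) ~ s_eta^2 * (mu^2 + c^2) with c = sigma_eps / s_eta,
    # so a transformation with derivative 1 / sqrt(mu^2 + c^2) flattens the variance.
    c = sigma_eps / s_eta
    rows = []
    for mu in mu_levels:
        ys = [
            mu * math.exp(rng.gauss(0.0, sigma_eta)) + rng.gauss(0.0, sigma_eps)
            for _ in range(n)
        ]
        transformed = [glog(y, c) for y in ys]
        rows.append((mu, statistics.stdev(ys), statistics.stdev(transformed)))
    return rows


if __name__ == "__main__":
    for mu, sd_raw, sd_glog in simulate_sds([0.0, 100.0, 1000.0, 10000.0, 100000.0]):
        print(f"mu={mu:>9.0f}  sd(y)={sd_raw:>10.2f}  sd(glog(y))={sd_glog:.4f}")
```

Under these assumed settings, the standard deviation of the raw values grows with the mean, while the standard deviation after `glog` should stay roughly constant across mean levels. That flattening is what variance stabilization means in practice.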


About the article

Corresponding author: Víctor Leiva, Departamento de Estadística, Universidad de Valparaíso, Avda. Gran Bretaña 1111, Playa Ancha, Valparaíso, Chile, URL: www.deuv.cl/leiva

Published Online: 2013-10-16

Published in Print: 2013-12-01

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 12, Issue 6, Pages 653–666, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2012-0030.

©2013 by Walter de Gruyter Berlin Boston.

Citing Articles

  • Arthur C. Tsai, Michelle Liou, Maria Simak, and Philip E. Cheng, Computational Statistics & Data Analysis, 2017, Volume 115, Page 250
  • Diana Kelmansky and Lila Ricci, Microarrays, 2017, Volume 6, Number 1, Page 5
