Weighted Kolmogorov Smirnov testing: an alternative for Gene Set Enrichment Analysis

Konstantina Charmpi
  • Université Grenoble Alpes, France
  • Laboratoire Jean Kuntzmann, CNRS UMR5224, Grenoble, France
  • Laboratoire d’Excellence TOUCAN, Toulouse, France
Bernard Ycart
  • Corresponding author
  • Université Grenoble Alpes, France
  • Laboratoire Jean Kuntzmann, CNRS UMR5224, Grenoble, France
  • Laboratoire d’Excellence TOUCAN, Toulouse, France
Published Online: 2015-05-30 | DOI: https://doi.org/10.1515/sagmb-2014-0077


Gene Set Enrichment Analysis (GSEA) is a basic tool for the analysis of genomic data. Its test statistic is based on a cumulated weight function, and its distribution under the null hypothesis is evaluated by Monte-Carlo simulation. Here, it is proposed to subtract from the cumulated weight function its asymptotic expectation, and then to scale the difference. Under the null hypothesis, the convergence in distribution of the new test statistic is proved, using the theory of empirical processes. The limiting distribution needs to be computed only once, and can then be reused for many different gene sets, which results in large savings in computing time. The test defined in this way is called the Weighted Kolmogorov Smirnov (WKS) test. Using expression data from the GEO repository, tested against the C2 collection of the MSig Database, the classical GSEA test and the new procedure have been compared. Our conclusion is that, beyond its mathematical and algorithmic advantages, the WKS test could in many cases be more informative than the classical GSEA test.
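
The centering and scaling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: the weight function `weight` (written g below), its integral `expected_cumulated_weight` (G), the scaling by the square root of the gene set size k, and the modelling of gene positions under the null hypothesis as independent uniform draws on [0, 1]. All function names are hypothetical. For uniform positions, the expectation of the cumulated weight at t is G(t), which is the quantity subtracted here.

```python
import math
import random


def weight(u):
    # Hypothetical non-negative weight function g on [0, 1] (an assumption,
    # not the weights used in the paper).
    return 2.0 * (1.0 - u)


def expected_cumulated_weight(t):
    # G(t): integral of weight() over [0, t]. For uniform positions this is
    # the expectation of the cumulated weight function at t.
    return 2.0 * t - t * t


def wks_statistic(positions):
    """Return sqrt(k) * sup_t |S_k(t) - G(t)| for k positions in [0, 1].

    S_k(t) is the cumulated weight (1/k) * sum of weight(u) over u <= t.
    S_k is a step function and G is non-decreasing, so the supremum is
    reached just before a jump, just after a jump, or at t = 1.
    """
    k = len(positions)
    cumulated = 0.0
    sup = 0.0
    for u in sorted(positions):
        target = expected_cumulated_weight(u)
        sup = max(sup, abs(cumulated - target))  # left limit at the jump
        cumulated += weight(u) / k
        sup = max(sup, abs(cumulated - target))  # value right after the jump
    sup = max(sup, abs(cumulated - expected_cumulated_weight(1.0)))
    return math.sqrt(k) * sup


def simulate_null(k, n_sim, seed=0):
    # Monte-Carlo sample of the statistic for k uniform positions. It is
    # computed once and can be reused for every gene set that is tested.
    rng = random.Random(seed)
    return sorted(
        wks_statistic([rng.random() for _ in range(k)]) for _ in range(n_sim)
    )


def p_value(observed, null_sample):
    # Fraction of simulated statistics at least as large as the observed one.
    exceed = sum(1 for s in null_sample if s >= observed)
    return exceed / len(null_sample)
```

In this sketch, a gene set would be turned into positions by dividing the rank of each of its genes by the total number of ranked genes, for example `wks_statistic([r / n for r in ranks])`. A single call such as `null = simulate_null(200, 10000)` would then serve as an approximation of the limiting distribution for all gene sets, with `p_value(stat, null)` giving an approximate p-value for each one.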

Keywords: empirical processes; GSEA; Monte-Carlo simulation; statistical test; weak convergence

AMS Subject Classification: Primary 62F03; Secondary 60F17


