Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido

Volume 3, Issue 1


PLS Dimension Reduction for Classification with Microarray Data

Anne-Laure Boulesteix
Published Online: 2004-11-23 | DOI: https://doi.org/10.2202/1544-6115.1075

Partial Least Squares (PLS) dimension reduction is known to give good prediction accuracy in the context of classification with high-dimensional microarray data. In this paper, the classification procedure consisting of PLS dimension reduction followed by linear discriminant analysis on the new components is compared with some of the best state-of-the-art classification methods. A boosting algorithm is also applied to this classification method, and a simple procedure for choosing the number of PLS components is suggested. The connection between PLS dimension reduction and gene selection is examined, and a property of the first PLS component for binary classification is proved. Finally, we show on real data how PLS can be used for data visualization. The whole study is based on nine real microarray cancer data sets.
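As a rough illustration of the two-step procedure described above (PLS dimension reduction, then linear discriminant analysis on the new components), the following is a minimal sketch in pure Python. It is not the paper's implementation. It makes several simplifying assumptions: only the first PLS component is extracted, using the standard fact that its weight vector is proportional to the centered predictors times the centered response; the discriminant step assumes equal class priors and a common variance, which in one dimension reduces to assigning each sample to the nearest class mean; and the data are synthetic, with hypothetical sizes (200 "genes", 10 of them informative). All function names are illustrative.

```python
import random


def column_means(X):
    """Mean of each column (gene) of a list-of-rows matrix."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]


def first_pls_weights(X, y):
    """Unit-norm weight vector of the first PLS component.

    For a single response, the first weight vector is proportional to
    Xc^T yc, where Xc and yc are the column-centered data and response.
    """
    means = column_means(X)
    y_mean = sum(y) / len(y)
    w = [
        sum((X[i][j] - means[j]) * (y[i] - y_mean) for i in range(len(X)))
        for j in range(len(means))
    ]
    norm = sum(v * v for v in w) ** 0.5
    return [v / norm for v in w], means


def project(X, w, means):
    """Scores of each sample on the component (centered with training means)."""
    return [sum((row[j] - means[j]) * w[j] for j in range(len(w))) for row in X]


def fit_lda_1d(scores, y):
    """Class means of the scores; enough for 1-D LDA with equal priors."""
    s0 = [s for s, label in zip(scores, y) if label == 0]
    s1 = [s for s, label in zip(scores, y) if label == 1]
    return sum(s0) / len(s0), sum(s1) / len(s1)


def predict_lda_1d(scores, m0, m1):
    """Assign each score to the nearest class mean."""
    return [1 if abs(s - m1) < abs(s - m0) else 0 for s in scores]


def simulate(n_per_class, n_genes, n_informative, shift, rng):
    """Synthetic two-class expression data (an assumption, not real data)."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift  # informative genes differ between classes
            X.append(row)
            y.append(label)
    return X, y


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


rng = random.Random(0)
X_train, y_train = simulate(20, 200, 10, 2.0, rng)
X_test, y_test = simulate(10, 200, 10, 2.0, rng)

# Step 1: dimension reduction (fit on training data only).
w, means = first_pls_weights(X_train, y_train)
train_scores = project(X_train, w, means)
test_scores = project(X_test, w, means)

# Step 2: discriminant analysis on the new component.
m0, m1 = fit_lda_1d(train_scores, y_train)
train_acc = accuracy(predict_lda_1d(train_scores, m0, m1), y_train)
test_acc = accuracy(predict_lda_1d(test_scores, m0, m1), y_test)
print("training accuracy:", train_acc)
print("test accuracy:", test_acc)
```

The weights and centering are estimated from the training samples alone and then reused for the test samples, so the held-out accuracy is not contaminated by the dimension-reduction step. Extending the sketch to several components would require deflating the data after each component and a multivariate discriminant rule, both of which are omitted here.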

Keywords: partial least squares; feature extraction; variable selection; boosting; gene expression; discriminant analysis; supervised learning

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 3, Issue 1, Pages 1–30, ISSN (Online) 1544-6115, DOI: https://doi.org/10.2202/1544-6115.1075.

