Published by De Gruyter November 19, 2013

Estimation of weighted log partial area under the ROC curve and its application to MicroRNA expression data

  • Ahmed Hossain and Joseph Beyene

Abstract

MicroRNAs (miRNAs) are short non-coding RNAs that play critical roles in numerous cellular processes through post-transcriptional functions. Aberrant roles of miRNAs have been reported in a number of diseases. Because the level of noise varies dramatically across miRNAs, a robust computational method is vital for discovering novel miRNAs. In this paper, we propose a flexible rank-based procedure for estimating a weighted log partial area under the receiver operating characteristic (ROC) curve statistic for selecting differentially expressed miRNAs. The statistic combines partial area under the curve (pAUC) estimates with their corresponding variances. The proposed method does not involve complicated formulas and does not require advanced programming skills. Two real datasets are analyzed to illustrate the method, and a simulation study is carried out to assess the performance of different miRNA ranking statistics. We conclude that the proposed method offers robust results with large samples for miRNA expression data, and that it can be used as an alternative analytical tool for identifying a list of target miRNAs for further biological and clinical investigation.
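
As an illustration of the pAUC building block named in the abstract, the following is a minimal sketch of a rank-based empirical pAUC for a single miRNA. It is not the authors' weighted log pAUC statistic: the weighting, the log transformation, and the variance estimate are not specified in this abstract and are not reproduced here. The function name `empirical_pauc`, the parameter `max_fpr`, and the example values are hypothetical, and the sketch assumes that higher expression indicates a case and that ties count against the case.

```python
import math


def empirical_pauc(cases, controls, max_fpr=0.1):
    """Empirical partial AUC over false-positive rates in [0, max_fpr].

    Illustrative only: assumes higher values indicate cases and treats
    ties between a case and a control as a loss for the case.
    """
    if not 0.0 < max_fpr <= 1.0:
        raise ValueError("max_fpr must be in (0, 1]")
    n_cases, n_controls = len(cases), len(controls)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both groups must be non-empty")
    # Number of highest-valued controls inside the FPR window
    # (small tolerance guards against floating-point round-off).
    k = math.floor(max_fpr * n_controls + 1e-9)
    top_controls = sorted(controls, reverse=True)[:k]
    # Count (case, control) pairs in which the case outranks a top control.
    wins = sum(1 for x in cases for y in top_controls if x > y)
    return wins / (n_cases * n_controls)


# Made-up expression values for one miRNA.
cases = [3, 5, 7]
controls = [1, 2, 4, 6]
print(empirical_pauc(cases, controls, max_fpr=0.5))  # 0.25
print(empirical_pauc(cases, controls, max_fpr=1.0))  # 0.75 (full AUC)
```

With `max_fpr=1.0` the quantity reduces to the usual Mann-Whitney estimate of the full AUC, which is one way to sanity-check the sketch.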


Corresponding author: Ahmed Hossain, Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, Ontario L8S4K1, Canada, Tel.: +1-416-8362934, e-mail:

AH acknowledges post-doctoral fellowship funding from the Drug Safety and Effectiveness Cross-Disciplinary Training (DSECT) Program. JB would like to acknowledge Discovery Grant funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) (grant number 293295-2009) and Canadian Institutes of Health Research (CIHR) (grant number 84392). JB holds the John D. Cameron Endowed Chair in the Genetic Determinants of Chronic Diseases, Department of Clinical Epidemiology and Biostatistics, McMaster University. We would like to thank two anonymous reviewers and the editor for insightful comments that improved the presentation and clarity of our manuscript.


Published Online: 2013-11-19
Published in Print: 2013-12-01

©2013 by Walter de Gruyter Berlin Boston

Downloaded on 28.3.2024 from https://www.degruyter.com/document/doi/10.1515/sagmb-2013-0035/html