Search Results

You are looking at 1–10 of 29 items:


cyanobacteria. Finally, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. Keywords: statistical learning techniques, cyanobacteria, cyanotoxins, support vector machines (SVM), regression analysis. PACS® (2010): 62J02, 62M45, 92B20, 65C60, 68T05. *Corresponding author: P. J. García Nieto, Department of Mathematics, Faculty of Sciences, University of Oviedo, 33007 Oviedo, Spain. E-mail: paulino.lato@gmail.com. J. R. Alonso Fernández, C. Díaz Muñiz: Cantabrian Basin Authority, Spanish Ministry of Agriculture, Food and

functions of Z. Theoretical arguments are followed by numerical examples providing statistics for inverses of random matrices, solutions of stochastic algebraic equations, and eigenvalues/eigenvectors of random matrices. Keywords: random eigenvalue problem, random matrices, stochastic algebraic equations, stochastic reduced order models. MSC 2010: 65C05, 65C50, 65C60. Mircea Grigoriu: Cornell University, Ithaca, NY 14853-3501, USA, e-mail: mdg12@cornell.edu. 1 Introduction. Most physical systems have uncertain properties and are subjected to random actions, so that their
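The snippet above mentions Monte Carlo statistics for inverses of random matrices. As a minimal, self-contained illustration (not the paper's method), the sketch below estimates E[A⁻¹] for a hypothetical random diagonal matrix A = diag(U₁, U₂) with Uᵢ ~ Uniform(1, 2), for which each diagonal entry of E[A⁻¹] is E[1/U] = ln 2:

```python
import math
import random

def mc_mean_inverse(n_samples: int, seed: int = 0):
    """Monte Carlo estimate of E[A^{-1}] for A = diag(U1, U2), Ui ~ Uniform(1, 2)."""
    rng = random.Random(seed)
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_samples):
        u1, u2 = rng.uniform(1.0, 2.0), rng.uniform(1.0, 2.0)
        # the inverse of a diagonal matrix is the diagonal of reciprocals
        acc[0][0] += 1.0 / u1
        acc[1][1] += 1.0 / u2
    return [[acc[i][j] / n_samples for j in range(2)] for i in range(2)]

est = mc_mean_inverse(200_000)
# exact diagonal value: E[1/U] = integral of 1/u over [1, 2] = ln 2
```

The diagonal case keeps the inversion trivial; the same sampling loop extends to general random matrices once a full inverse is substituted.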

be used in given situations. Keywords: Bhattacharyya bound, Cramér-Rao bound, generalized gamma distribution, Hammersley-Chapman-Robbins bound, inverse Gaussian distribution, Burr Type III distribution, Burr Type XII distribution, Kshirsagar bound. MSC 2010: 65C20, 65C60. S. Nayeban: Department of Statistics, School of Mathematical Sciences, Ferdowsi University of Mashhad, P.O. Box 91775-1159, Mashhad, Iran. A. H. Rezaei Roknabadi: Department of Statistics, School of Mathematical Sciences, Ferdowsi University of Mashhad, P.O. Box 91775-1159, Mashhad, Iran
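The bounds listed in the snippet all lower-bound the variance of unbiased estimators through quantities such as the Fisher information. As a hedged illustration (not drawn from the paper), the sketch below checks by Monte Carlo that the Fisher information of one Exponential(λ) observation about λ equals 1/λ²: the score is ∂/∂λ log f(x; λ) = 1/λ − x, and E[(1/λ − X)²] = Var(X) = 1/λ², so the Cramér-Rao bound for an unbiased estimator of λ from n observations is λ²/n.

```python
import random

def mc_fisher_information(lam: float, n_samples: int = 200_000, seed: int = 1) -> float:
    """Monte Carlo estimate of E[(d/d lam log f(X; lam))^2], X ~ Exponential(lam)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.expovariate(lam)
        score = 1.0 / lam - x      # derivative of log(lam) - lam * x w.r.t. lam
        total += score * score
    return total / n_samples

lam = 2.0
info = mc_fisher_information(lam)    # close to 1 / lam**2 = 0.25
cr_bound = 1.0 / (info * 100)        # Cramér-Rao bound for n = 100 observations
```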

Abstract

Numerical challenges inherent in algorithms for computing worst Value-at-Risk in homogeneous portfolios are identified and solutions as well as words of warning concerning their implementation are provided. Furthermore, both conceptual and computational improvements to the Rearrangement Algorithm for approximating worst Value-at-Risk for portfolios with arbitrary marginal loss distributions are given. In particular, a novel Adaptive Rearrangement Algorithm is introduced and investigated. These algorithms are implemented using the R package qrmtools and may be of interest in any context in which it is required to find columnwise permutations of a matrix such that the minimal (maximal) row sum is maximized (minimized).
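The qrmtools package implements these algorithms in R; as a language-neutral sketch of the core rearrangement step the abstract alludes to (the basic Rearrangement Algorithm, not the Adaptive variant introduced in the paper), each column is in turn oppositely ordered to the row sums of the remaining columns, a step that cannot decrease the minimal row sum:

```python
import random

def rearrange(X, max_sweeps=50):
    """Basic rearrangement: oppositely order each column to the row sums of the
    other columns, sweeping over columns until no column changes."""
    d, n = len(X), len(X[0])
    for _ in range(max_sweeps):
        changed = False
        for j in range(d):
            partial = [sum(X[k][i] for k in range(d) if k != j) for i in range(n)]
            order = sorted(range(n), key=lambda i: partial[i])  # ascending partial sums
            vals = sorted(X[j], reverse=True)                   # descending column values
            newcol = [0.0] * n
            for r, i in enumerate(order):
                newcol[i] = vals[r]
            if newcol != X[j]:
                X[j], changed = newcol, True
        if not changed:
            break
    return X

rng = random.Random(42)
X = [[rng.random() for _ in range(8)] for _ in range(3)]    # 3 columns, 8 rows
before = min(sum(col[i] for col in X) for i in range(8))
rearrange(X)
after = min(sum(col[i] for col in X) for i in range(8))     # after >= before
```

Pairing the largest values of one column with the smallest partial sums is exactly the "oppositely ordered" step, which is why the minimal row sum is monotonically pushed upward.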

Abstract

A generalization of the linear least squares method to a wide class of parametric nonlinear inverse problems is presented. The approach is based on operator equations whose solution is a selected function of the parameters. The generalization rests on two mandatory conditions: the operator equations are linear in the estimated parameters, and the operators admit discrete approximations. Because it requires no iterations, the approach is well suited for hardware implementation and for constructing a first approximation for the nonlinear least squares method. Examples of parametric problems, including the estimation of parameters of some higher transcendental functions, are presented.
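A toy illustration of reducing a nonlinear parametric fit to a linear one (a deliberately simple sketch, not the paper's operator-equation construction): the model y = a·exp(b·t) is nonlinear in (a, b), but taking logarithms gives ln y = ln a + b·t, which is linear in the transformed parameters and solvable by ordinary least squares without iterations.

```python
import math

def fit_exponential(ts, ys):
    """Fit y = a * exp(b * t) by linear least squares on the pairs (t, ln y)."""
    n = len(ts)
    ls = [math.log(y) for y in ys]
    tbar = sum(ts) / n
    lbar = sum(ls) / n
    # slope b and intercept c of the least squares line ln y = c + b * t
    b = sum((t - tbar) * (l - lbar) for t, l in zip(ts, ls)) / \
        sum((t - tbar) ** 2 for t in ts)
    c = lbar - b * tbar
    return math.exp(c), b                       # recover (a, b)

ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3.0 * math.exp(-0.7 * t) for t in ts]     # noise-free data with a = 3, b = -0.7
a_hat, b_hat = fit_exponential(ts, ys)
```

On noisy data the log transform reweights the errors, which is one reason such a fit is often used only as the first approximation for an iterative nonlinear least squares refinement, as the abstract suggests.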

Abstract

High-throughput sequencing techniques are increasingly affordable and produce massive amounts of data. Together with other high-throughput technologies, such as microarrays, an enormous amount of data has accumulated in databases. The collection of these valuable data has been routine for more than a decade. Despite different technologies, many experiments share the same goal. For instance, the aims of RNA-seq studies often coincide with those of differential gene expression experiments based on microarrays. As such, it would be logical to utilize all available data. However, there is a lack of biostatistical tools for the integration of results obtained from different technologies. Although diverse technological platforms produce different raw data, one commonality for experiments with the same goal is that all the outcomes can be transformed into a platform-independent data format – rankings – for the same set of items. Here we present the R package TopKLists, which allows for statistical inference on the lengths of informative (top-k) partial lists, for stochastic aggregation of full or partial lists, and for graphical exploration of the input and consolidated output. A graphical user interface has also been implemented for providing access to the underlying algorithms. To illustrate the applicability and usefulness of the package, we integrated microRNA data of non-small cell lung cancer across different measurement techniques and drew conclusions. The package can be obtained from CRAN under a LGPL-3 license.
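TopKLists is an R package; as a minimal, platform-independent illustration of the ranking idea only (Borda counting, not the package's stochastic aggregation method, and with hypothetical miRNA names), the sketch below merges several full rankings of the same items by their total rank position:

```python
def borda_aggregate(rankings):
    """Aggregate full rankings (lists of the same items, best first) by total position."""
    items = rankings[0]
    score = {item: 0 for item in items}
    for ranking in rankings:
        for pos, item in enumerate(ranking):
            score[item] += pos                  # smaller total position = better
    return sorted(items, key=lambda it: score[it])

lists = [
    ["miR-21", "miR-155", "let-7a", "miR-34a"],   # hypothetical platform 1
    ["miR-21", "let-7a", "miR-155", "miR-34a"],   # hypothetical platform 2
    ["miR-155", "miR-21", "miR-34a", "let-7a"],   # hypothetical platform 3
]
consensus = borda_aggregate(lists)
```

Because every platform's output is reduced to positions over the same item set, the raw measurement scales never need to be reconciled, which is the commonality the abstract emphasizes.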

Abstract

Partial least squares regression – or PLS regression – is a multivariate method in which the model parameters are estimated using either the SIMPLS or NIPALS algorithm. PLS regression has been extensively used in applied research because of its effectiveness in analyzing relationships between an outcome and one or several components. Note that the NIPALS algorithm can provide parameter estimates on incomplete data. The selection of the number of components used to build a representative model in PLS regression is a central issue. However, how to deal with missing data when using PLS regression remains a matter of debate. Several approaches have been proposed in the literature, including the Q² criterion, and the AIC and BIC criteria. Here we study the behavior of the NIPALS algorithm when used to fit a PLS regression for various proportions of missing data and different types of missingness. We compare criteria to select the number of components for a PLS regression on incomplete data sets and on imputed data sets using three imputation methods: multiple imputation by chained equations, k-nearest neighbour imputation, and singular value decomposition imputation. We tested various criteria with different proportions of missing data (ranging from 5% to 50%) under different missingness assumptions. Q²-based leave-one-out component selection methods gave more reliable results than AIC- and BIC-based ones.
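As a sketch of the underlying machinery only (complete data, a single component, univariate response; not the missing-data variants studied in the paper), one NIPALS step for PLS1 computes a weight vector w ∝ Xᵀy, scores t = Xw, a loading p = Xᵀt/(tᵀt), and a regression coefficient b = yᵀt/(tᵀt):

```python
def pls1_one_component(X, y):
    """One-component PLS1 via a single NIPALS step (complete data assumed)."""
    n, p_dim = len(X), len(X[0])
    # weight vector w = X^T y, normalised to unit length
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p_dim)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # scores t = X w
    t = [sum(X[i][j] * w[j] for j in range(p_dim)) for i in range(n)]
    tt = sum(v * v for v in t)
    # loading vector and regression coefficient on the scores
    p_load = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p_dim)]
    b = sum(y[i] * t[i] for i in range(n)) / tt
    return w, t, p_load, b

# rank-one toy data: rows of X are multiples of one direction, y follows the scores
direction = [0.5, 1.0, -0.3]
scores = [1.0, 2.0, 3.0, 4.0]
X = [[s * d for d in direction] for s in scores]
y = [2.0 * s for s in scores]
w, t, p_load, b = pls1_one_component(X, y)
y_hat = [b * ti for ti in t]    # exact here because the data are rank one
```

Further components would be extracted by deflating X with the loading p and repeating; the missing-data behaviour the paper analyzes arises when these inner products are computed over incomplete rows.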

Abstract

Bivariate counts are collected in many sectors of research but the analysis of such data is often challenging because each series of counts may exhibit different levels and types of dispersion. This paper addresses this problem by proposing a flexible bivariate COM-Poisson model that may handle any combination of over-, equi- and under-dispersion at any level. In this paper, the bivariate COM-Poisson is developed via Archimedean copulas. The Generalized Quasi-Likelihood (GQL) approach is used to estimate the unknown mean parameters in the copula-based bivariate COM-Poisson model, while the dependence parameter is estimated using the copula likelihood. We further introduce a Monte Carlo experiment to generate bivariate COM-Poisson data under different dispersion levels. The performance of the GQL approach is assessed on the simulated data. The model is applied to analyze real-life epileptic seizure data.
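For reference, the standard univariate COM-Poisson pmf (background only, not the paper's copula construction) is P(Y = y) = λʸ/(y!)^ν · 1/Z(λ, ν) with normalising constant Z(λ, ν) = Σⱼ λʲ/(j!)^ν; ν = 1 recovers the Poisson, ν > 1 gives under-dispersion and ν < 1 over-dispersion. A truncated-sum sketch, adequate for moderate λ:

```python
import math

def com_poisson_pmf(y, lam, nu, j_max=100):
    """COM-Poisson pmf; Z is truncated at j_max terms (adequate for moderate lam)."""
    z, term = 0.0, 1.0                 # term_j = lam^j / (j!)^nu, with term_0 = 1
    for j in range(j_max):
        z += term
        term *= lam / (j + 1) ** nu    # recursive update avoids huge factorials
    num = 1.0
    for j in range(1, y + 1):
        num *= lam / j ** nu           # builds lam^y / (y!)^nu incrementally
    return num / z

# nu = 1 reduces to the ordinary Poisson distribution
lam = 3.0
p_com = com_poisson_pmf(2, lam, 1.0)
p_poisson = math.exp(-lam) * lam ** 2 / math.factorial(2)
```

The recursive term update sidesteps overflow in (j!)^ν, which is the usual practical concern when evaluating Z.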

Abstract

The sum of log-normal variates is encountered in many challenging applications such as performance analysis of wireless communication systems and financial engineering. Several approximation methods have been reported in the literature. However, these methods are not accurate in the tail regions. These regions are of paramount interest, as small probability values have to be evaluated with high precision. Variance reduction techniques are known to yield accurate, yet efficient, estimates of small probability values. Most of the existing approaches have focused on estimating the right tail of the sum of log-normal random variables (RVs). Here, we instead consider the left tail of the sum of correlated log-normal variates with Gaussian copula, under a mild assumption on the covariance matrix. We propose an estimator combining an existing mean-shifting importance sampling approach with a control variate technique. This estimator has an asymptotically vanishing relative error, which represents a major finding in the context of the left-tail simulation of the sum of log-normal RVs. Finally, we perform simulations to evaluate the performance of the proposed estimator in comparison with existing ones.
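A simplified sketch of mean-shifting importance sampling for the left tail (independent standard-normal exponents only; no Gaussian copula and no control variate, so this is a stripped-down assumption, not the estimator proposed in the paper): sampling each Xᵢ ~ N(μ, 1) with μ < 0 pushes the sum Σ exp(Xᵢ) toward small values, and the likelihood ratio exp(dμ²/2 − μΣxᵢ) corrects the bias.

```python
import math
import random

def left_tail_plain_mc(gamma, d, n, seed=0):
    """Plain Monte Carlo estimate of P(sum_i exp(X_i) < gamma), X_i iid N(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if sum(math.exp(rng.gauss(0.0, 1.0)) for _ in range(d)) < gamma:
            hits += 1
    return hits / n

def left_tail_is(gamma, d, n, mu=-1.0, seed=1):
    """Mean-shifting importance sampling: draw X_i ~ N(mu, 1) and reweight."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        xs = [rng.gauss(mu, 1.0) for _ in range(d)]
        if sum(math.exp(x) for x in xs) < gamma:
            # likelihood ratio of N(0,1)^d to N(mu,1)^d at the sampled point
            total += math.exp(d * mu * mu / 2.0 - mu * sum(xs))
    return total / n

p_mc = left_tail_plain_mc(1.0, 2, 200_000)   # feasible here; fails deep in the tail
p_is = left_tail_is(1.0, 2, 100_000)
```

For thresholds deep in the left tail, the plain estimator sees essentially no hits while the shifted sampler still does, which is the motivation for the variance-reduction machinery the abstract describes.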

both scrambled quasi-Monte Carlo and variance reduction methods to improve the accuracy of Monte Carlo schemes. We also present theoretical analyses and numerical experiments to validate our numerical algorithms. Keywords: quasi-Monte Carlo, parallel computing, scrambled sequences, sensitivity derivatives, computational fluid dynamics. AMS classification: 65C05, 65C20, 65C60. 1. Introduction. We consider here estimating an integral over [0, 1]^s: I(f) = ∫_{[0,1]^s} f(x) dx. (1.1) Assume that {ξ_i}, 1 ≤ i ≤ N, is an s-dimensional quasirandom sequence. The quasi-Monte Carlo (QMC
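The integral in the snippet can be illustrated with a plain, unscrambled quasi-Monte Carlo rule in one dimension (a toy sketch, not the parallel scrambled scheme the paper develops): the base-2 van der Corput sequence fills [0, 1) with low discrepancy, and averaging f over its first N points approximates I(f).

```python
def van_der_corput(i: int, base: int = 2) -> float:
    """Radical-inverse point: reflect the base-b digits of i about the radix point."""
    x, denom = 0.0, 1.0
    while i > 0:
        denom *= base
        i, digit = divmod(i, base)
        x += digit / denom
    return x

def qmc_integrate(f, n: int) -> float:
    """Quasi-Monte Carlo estimate of I(f) = integral of f over [0, 1]."""
    return sum(f(van_der_corput(i)) for i in range(1, n + 1)) / n

estimate = qmc_integrate(lambda x: x * x, 4096)   # exact value is 1/3
```

For functions of bounded variation the error of such a rule decays like O(log N / N), versus the O(N^{-1/2}) of plain Monte Carlo; scrambling, as in the paper, additionally restores an unbiased variance estimate.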