Search Results

You are looking at 1-10 of 27 items:

  • "Bayesian shrinkage"

Variational Bayes Procedure for Effective Classification of Tumor Type with Microarray Gene Expression Data. Takeshi Hayashi. Abstract: Recently, microarrays that can simultaneously measure the expression levels of thousands of genes have become a valuable tool for classifying tumors. For such classification, where the sample size is usually much smaller than the number of genes, it is essential to construct properly sparse models for accurately predicting tumor types to avoid over-fitting. Bayesian shrinkage estimation is considered a suitable method for providing such sparse

λ, which both estimates λ and samples from a range of λ values instead of just a single λ, which often seems to give better fits. – Classical forms like lasso give the same parameter estimates as the posterior mode, but the posterior mean from MCMC is often a more reliable estimator of future values. Bayesian shrinkage can be applied easily to the regression and GLM models that actuaries use. You just need a vector of observed values and a design matrix with columns for each variable and rows with the values of the variables at each of the data points. Then
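The equivalence between lasso and the posterior mode mentioned in the snippet above can be illustrated with a minimal sketch. It assumes a design matrix with a single column, Gaussian noise with unit variance, and a Laplace prior with rate `lam`; the function names and the data are hypothetical and do not come from the cited article. Under those assumptions the lasso estimate has a closed form (soft thresholding), and it coincides with the minimizer of the negative log posterior.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_one_coef(x, y, lam):
    """Minimizer of 0.5 * sum((y_i - b * x_i)^2) + lam * |b| for one column x."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return soft_threshold(sxy, lam) / sxx


def neg_log_posterior(b, x, y, lam):
    """Unit-variance Gaussian likelihood plus Laplace prior, up to a constant."""
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    return 0.5 * rss + lam * abs(b)


# Hypothetical data: one predictor, three observations.
x = [1.0, 2.0, 3.0]
y = [1.1, 1.9, 3.2]

ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
b_map = lasso_one_coef(x, y, lam=2.0)
print(ols, b_map)  # the penalized estimate is pulled toward zero
```

The sketch yields only the posterior mode. The posterior mean that the snippet prefers would require sampling (for example MCMC), which is not shown here.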


Traditional NBA player evaluation metrics are based on scoring differential or some pace-adjusted linear combination of box score statistics like points, rebounds, assists, etc. These measures treat performances with the outcome of the game still in question (e.g. tie score with five minutes left) in exactly the same way as they treat performances with the outcome virtually decided (e.g. when one team leads by 30 points with one minute left). Because they ignore the context in which players perform, these measures can result in misleading estimates of how players help their teams win. We instead use a win probability framework for evaluating the impact NBA players have on their teams’ chances of winning. We propose a Bayesian linear regression model to estimate an individual player’s impact, after controlling for the other players on the court. We introduce several posterior summaries to derive rank-orderings of players within their team and across the league. This allows us to identify highly paid players with low impact relative to their teammates, as well as players whose high impact is not captured by existing metrics.


We present a regularized logistic regression model for evaluating player contributions in hockey. The traditional metric for this purpose is the plus-minus statistic, which allocates a single unit of credit (for or against) to each player on the ice for a goal. However, plus-minus scores measure only the marginal effect of players, do not account for sample size, and provide a very noisy estimate of performance. We investigate a related regression problem: what does each player on the ice contribute, beyond aggregate team performance and other factors, to the odds that a given goal was scored by their team? Due to the large-p (number of players) and imbalanced design setting of hockey analysis, a major part of our contribution is a careful treatment of prior shrinkage in model estimation. We showcase two recently developed techniques – for posterior maximization or simulation – that make such analysis feasible. Each approach is accompanied with publicly available software and we include the simple commands used in our analysis. Our results show that most players do not stand out as measurably strong (positive or negative) contributors. This allows the stars to really shine, reveals diamonds in the rough overlooked by earlier analyses, and argues that some of the highest paid players in the league are not making contributions worth their expense.

(PBA) tournaments between 2003 and 2014 (see Section 4 – Data). The repetitive nature of bowling that makes it so conducive to hot-hand analysis also makes it conducive to frame-by-frame simulation. Rather than rely on noisy empirical distributions, we model how a bowler will perform on each frame, accounting for his hot- or cold-handedness with a 4th-order Markov model as suggested by Martin (2006) (see Section 5 – Model). Following McCarthy (2011), we also explore to what extent Bayesian shrinkage can be used to improve our model. The result is a relatively

: 10.2202/1544-6115.1504 and Q2(a; b; c) ̂MLE a(̂MLE 0) exp b R + cV; where h, a, b, and c are tuning parameters chosen by the user and R = S2=n (̂MLE 0)2 ; V S= p n 1 +R sign(̂MLE 0): They are modified for paired and unpaired microarray data and 0 = 0 is chosen according to the fact that most genes are equivalently expressed. See Section 2.2 of the Supplementary Information for the definitions of these estimators and the tuning parameters values. 3.2.2 Bayesian shrinkage model Scott and Berger (2006) have described a fully Bayesian model for estimating

separation between the overall model sparsity and the distinctive degrees of shrinkage resulting from the factorization $\lambda_i = \delta \eta_i$ of the idiosyncratic regularization parameters under the EBL causes LASSO to adaptively release the shrinkage pressure on important parameters while inflexibly shrinking spurious ones towards zero [15, 16]. This differential shrinkage obviates the tuning of the regularization parameter, which remains a challenging issue in Bayesian shrinkage analysis [18, 19]. The EBL has been successfully applied

ANV in estimating both the intercept and main effect for all the three ρ values. For the case ρ = 0, the gain is from information borrowing among probes through a hierarchical Bayes setup (i.e., Bayesian shrinkage). For ρ > 0, it is from both information borrowing and spatial smoothing. By contrast, when estimating the intercept, ANV-s improves ANV only for ρ = 0.75 (moderately high), and it is worse than ANV for ρ = 0 or 0.5, meaning that smoothing data among adjacent probes may hurt the performance in estimation when the autocorrelation does not

a given transcript. As a final stage the authors partition the genes into groups which show similar linkages. Other authors have suggested more sophisticated models which provide a wider search of the model space: in particular Bayesian Variable Selection schemes based around Reversible Jump Markov Chain Monte Carlo (Stephens and Fisch, 1998, Yi et al., 2003, Jia and Xu, 2007, Bottolo and Richardson, 2010). We will not discuss such methods here, or related Bayesian shrinkage regression methods such as Xiaohong and Shizhong (2010) which do not produce sparse

toward zero. This attenuation should be greater for hospitals in which the quality indicators are not estimated precisely. This premise is the basic idea behind Bayesian shrinkage estimators (Morris 1983): the observed variation in quality indicators will tend to overstate the amount of actual variation across hospitals, so by pulling all the estimates (especially the more imprecise estimates) back toward the mean, we can improve prediction accuracy. The second problem with the conventional method is that it does not use any of the information available in other
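The idea in the snippet above, pulling imprecise estimates further toward the mean, can be sketched with a simple normal-normal empirical Bayes calculation. This is an illustrative assumption and not the exact estimator of Morris (1983) or of the cited paper: the grand mean is the unweighted average, the between-hospital variance is a method-of-moments guess, and the function name and data are hypothetical.

```python
from statistics import mean, pvariance


def shrink_toward_mean(estimates, sampling_vars):
    """Shrink each noisy estimate toward the grand mean.

    The weight on the raw estimate is between_var / (between_var + v),
    so a larger sampling variance v means stronger shrinkage.
    """
    grand_mean = mean(estimates)
    # Observed spread overstates true spread: subtract the average noise variance.
    between_var = max(0.0, pvariance(estimates) - mean(sampling_vars))
    shrunk = []
    for y, v in zip(estimates, sampling_vars):
        total = between_var + v
        weight = between_var / total if total > 0 else 0.0
        shrunk.append(grand_mean + weight * (y - grand_mean))
    return shrunk


# Hypothetical quality indicators for four hospitals; the third is imprecise.
raw = [0.10, 0.20, 0.30, 0.40]
noise = [0.001, 0.001, 0.02, 0.001]
print(shrink_toward_mean(raw, noise))
```

In this example the second and third hospitals start equally far from the mean, but the third, whose indicator is measured less precisely, ends up closer to it.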