Fast and reliable parameter tuning in high dimensions
In this talk, we study the problem of parameter tuning, or equivalently out-of-sample risk estimation, in high-dimensional settings where standard techniques such as K-fold cross-validation suffer from large biases. Motivated by the low bias of the leave-one-out cross-validation (LO) method, we propose a computationally efficient closed-form approximate leave-one-out formula (ALO) for a large class of regularized estimators. Given the regularized estimate, calculating ALO requires only minor computational overhead. Under mild assumptions on the data-generating process, we obtain a finite-sample upper bound on |LO - ALO|. Our theoretical analysis shows that |LO - ALO| converges to zero with overwhelming probability as both n and p tend to infinity, where the dimension p of the feature vectors may be comparable with, or even greater than, the number of observations n. Despite the high dimensionality of the problem, our theoretical results do not require any sparsity assumption on the vector of regression coefficients. Our extensive numerical experiments show that |LO - ALO| decreases as n and p increase, revealing the excellent finite-sample performance of ALO.
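The abstract does not spell out the ALO formula, but for ridge regression the idea can be illustrated with the classical leverage-based leave-one-out shortcut, where the approximation is in fact exact. The sketch below (an illustrative example, not the speaker's general formula; the data, λ, and dimensions are arbitrary choices) compares brute-force LO predictions with the closed-form shortcut y_i - ŷ_i^LO = (y_i - ŷ_i)/(1 - H_ii), computed from a single fit on the full data:

```python
import numpy as np

# Synthetic regression problem (illustrative sizes; p comparable to n).
rng = np.random.default_rng(0)
n, p, lam = 50, 30, 1.0
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + 0.5 * rng.standard_normal(n)

# One ridge fit on the full data; H is the hat (smoother) matrix.
H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
y_hat = H @ y

# Closed-form leave-one-out predictions from the single fit:
# residual_i^LO = residual_i / (1 - H_ii).
alo = y - (y - y_hat) / (1.0 - np.diag(H))

# Brute-force leave-one-out: refit n times, predict the held-out point.
lo = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    Xi, yi = X[mask], y[mask]
    b = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
    lo[i] = X[i] @ b

# For ridge the shortcut is exact, so the gap is numerical noise.
print(np.max(np.abs(lo - alo)))
```

The appeal mirrored in the talk is computational: the brute-force loop solves n regularized problems, while the shortcut reuses one fit. For non-quadratic losses or non-smooth penalties the exactness is lost, which is where an approximate formula with provable |LO - ALO| guarantees becomes the interesting object.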
Prof. Arian Maleki
Associate Professor, Columbia University
November 8, 2019, 11:45 AM, EB2 1230
Arian Maleki is an associate professor in the Department of Statistics at Columbia University. He received his Ph.D. from Stanford University in 2010. Before joining Columbia University, he was a postdoctoral scholar at Rice University. Arian is interested in high-dimensional statistics, compressed sensing, and machine learning.
The Department of Electrical and Computer Engineering hosts a regularly scheduled seminar series featuring preeminent and leading researchers in the US and around the world, to help promote North Carolina as a center of innovation and knowledge and to safeguard its place as a leader in research.