A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification
Guo-Xun Yuan, Kai-Wei Chang, Cho-Jui Hsieh, and Chih-Jen Lin, in JMLR, 2010.
Abstract
Large-scale linear classification is widely used in many areas. The L1-regularized form can be applied for feature selection; however, its non-differentiability makes training more difficult. Although various optimization methods have been proposed in recent years, these methods have not yet been compared suitably. In this paper, we first broadly review existing methods. Then, we discuss state-of-the-art software packages in detail and propose two efficient implementations. Extensive comparisons indicate that carefully implemented coordinate descent methods are very suitable for training large document data.
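The coordinate descent approach the abstract refers to solves, for each weight, a one-variable subproblem of the L1-regularized logistic-regression objective. As an illustration only (not the paper's optimized implementation; the function name and parameters below are chosen for this sketch), here is a minimal NumPy version of a CDN-style update: a one-dimensional Newton step with soft-thresholding, followed by a backtracking line search on the true objective.

```python
import numpy as np

def cdn_l1_logreg(X, y, C=1.0, n_sweeps=50, sigma=0.01, beta=0.5):
    """Cyclic coordinate descent for
        min_w ||w||_1 + C * sum_i log(1 + exp(-y_i * w^T x_i)),
    with y_i in {-1, +1}. Simplified sketch; assumes dense X."""
    n, p = X.shape
    w = np.zeros(p)
    t = np.zeros(n)  # t_i = y_i * w^T x_i, maintained incrementally

    def loss(t_vec):
        # C * sum log(1 + exp(-t_i)), computed stably
        return C * np.logaddexp(0.0, -t_vec).sum()

    for _ in range(n_sweeps):
        for j in range(p):
            s = np.exp(-np.logaddexp(0.0, t))          # sigma(-t_i)
            g = -C * np.dot(y * X[:, j], s)            # L'_j
            h = C * np.dot(X[:, j] ** 2, s * (1 - s)) + 1e-12  # L''_j
            # Newton direction for the 1-D quadratic model plus |w_j + d|
            if g + 1.0 <= h * w[j]:
                d = -(g + 1.0) / h
            elif g - 1.0 >= h * w[j]:
                d = -(g - 1.0) / h
            else:
                d = -w[j]
            if d == 0.0:
                continue
            # backtracking line search with a sufficient-decrease condition
            f_old = abs(w[j]) + loss(t)
            delta = g * d + abs(w[j] + d) - abs(w[j])
            step = 1.0
            while step > 1e-10:
                w_j_new = w[j] + step * d
                t_new = t + step * d * (y * X[:, j])
                f_new = abs(w_j_new) + loss(t_new)
                if f_new <= f_old + sigma * step * delta:
                    w[j], t = w_j_new, t_new
                    break
                step *= beta
    return w
```

On a small synthetic problem the sparse solution typically keeps only the informative features; the incremental maintenance of `t` is what makes each coordinate update cheap, which is one reason coordinate descent scales well to large sparse document data.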
Bib Entry
@article{YCHL10,
author = {Yuan, Guo-Xun and Chang, Kai-Wei and Hsieh, Cho-Jui and Lin, Chih-Jen},
title = {A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification},
journal = {Journal of Machine Learning Research},
volume = {11},
year = {2010}
}
Related Publications
- Large Linear Classification When Data Cannot Fit In Memory, TKDD, 2012
- Selective Block Minimization for Faster Convergence of Limited Memory Large-scale Linear Models, KDD, 2011
- Iterative Scaling and Coordinate Descent Methods for Maximum Entropy Models, JMLR, 2010
- Training and Testing Low-degree Polynomial Data Mappings via Linear SVM, JMLR, 2010
- A Sequential Dual Method for Large Scale Multi-Class Linear SVMs, KDD, 2008
- A Dual Coordinate Descent Method for Large-Scale Linear SVM, ICML, 2008
- Coordinate Descent Method for Large-scale L2-loss Linear SVM, JMLR, 2008
- LIBLINEAR: A Library for Large Linear Classification, JMLR, 2008