Iterative Scaling and Coordinate Descent Methods for Maximum Entropy Models

Fang-Lan Huang, Cho-Jui Hsieh, Kai-Wei Chang, and Chih-Jen Lin, in JMLR, 2010.


Abstract

Maximum entropy (Maxent) is useful in natural language processing and many other areas. Iterative scaling (IS) methods are among the most popular approaches for solving Maxent. With the many variants of IS methods, it is difficult to understand them and to see their differences. In this paper, we create a general and unified framework for iterative scaling methods. This framework also connects iterative scaling and coordinate descent methods. We prove general convergence results for IS methods and analyze their computational complexity. Based on the proposed framework, we extend a coordinate descent method for linear SVM to Maxent. Experimental results show that the proposed method is faster than existing iterative scaling methods.
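
To make the coordinate descent idea concrete, below is a minimal sketch of one-variable Newton updates for an L2-regularized Maxent model (here, multinomial logistic regression). This is an illustrative toy under stated assumptions, not the authors' implementation: it takes plain Newton steps with no line search or other safeguard, and the names (cd_maxent, sigma2, n_sweeps) are hypothetical.

import numpy as np

def cd_maxent(X, y, n_classes, sigma2=1.0, n_sweeps=20):
    """Cyclic coordinate descent: one Newton step per weight W[k, j]."""
    n, d = X.shape
    W = np.zeros((n_classes, d))
    scores = X @ W.T                              # n x K, kept in sync with W
    for _ in range(n_sweeps):
        for k in range(n_classes):
            for j in range(d):
                # class posteriors P(k | x_i) under the current weights
                P = np.exp(scores - scores.max(axis=1, keepdims=True))
                P /= P.sum(axis=1, keepdims=True)
                pk = P[:, k]
                # first and second derivatives of the regularized negative
                # log-likelihood with respect to the single weight W[k, j]
                g = X[:, j] @ (pk - (y == k)) + W[k, j] / sigma2
                h = (X[:, j] ** 2) @ (pk * (1.0 - pk)) + 1.0 / sigma2
                delta = -g / h                    # unsafeguarded Newton step
                W[k, j] += delta
                scores[:, k] += delta * X[:, j]   # O(n) bookkeeping per update
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X @ rng.normal(size=(3, 5)).T).argmax(axis=1)  # separable toy labels
    W = cd_maxent(X, y, n_classes=3)
    print("training accuracy:", ((X @ W.T).argmax(axis=1) == y).mean())

The O(n) score update after each one-variable step is what keeps individual coordinate updates cheap; recomputing the posteriors from the cached scores still costs O(nK) per step here, which a tuned implementation would also avoid.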


Bib Entry

@article{HHCL10,
  author = {Huang, Fang-Lan and Hsieh, Cho-Jui and Chang, Kai-Wei and Lin, Chih-Jen},
  title = {Iterative Scaling and Coordinate Descent Methods for Maximum Entropy Models},
  journal = {Journal of Machine Learning Research},
  year = {2010}
}

Related Publications

  1. Large Linear Classification When Data Cannot Fit In Memory, TKDD, 2012
  2. Selective Block Minimization for Faster Convergence of Limited Memory Large-scale Linear Models, KDD, 2011
  3. A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification, JMLR, 2010
  4. Training and Testing Low-degree Polynomial Data Mappings via Linear SVM, JMLR, 2010
  5. A Sequential Dual Method for Large Scale Multi-Class Linear SVMs, KDD, 2008
  6. A Dual Coordinate Descent Method for Large-Scale Linear SVM, ICML, 2008
  7. Coordinate Descent Method for Large-scale L2-loss Linear SVM, JMLR, 2008
  8. LIBLINEAR: A Library for Large Linear Classification, JMLR, 2008