Share this page:

Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems

Shyam Upadhyay, Ming-Wei Chang, Kai-Wei Chang, and Wen-tau Yih, in EMNLP, 2016.

Download the full text


Abstract

Automatically solving algebra word problems has raised considerable interest recently. Existing state-of-the-art approaches mainly rely on learning from human annotated equations. In this paper, we demonstrate that it is possible to efficiently mine algebra problems and their numerical solutions with little to no manual effort. To leverage the mined dataset, we propose a novel structured-output learning algorithm that aims to learn from both explicit (e.g., equations) and implicit (e.g., solutions) supervision signals jointly. Enabled by this new algorithm, our model gains 4.6% absolute improvement in accuracy on the ALG-514 benchmark compared to the one without using implicit supervision. The final model also outperforms the current state-of-the-art approach by 3%. Dataset



Bib Entry

@inproceedings{BCWS16,
  author = {Upadhyay, Shyam and Chang, Ming-Wei and Chang, Kai-Wei and Yih, Wen-tau},
  title = {Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems},
  booktitle = {EMNLP},
  year = {2016}
}

Related Publications

  • GPT-GNN: Generative Pre-Training of Graph Neural Networks

    Ziniu Hu, Yuxiao Dong, Kuansan Wang, Kai-Wei Chang, and Yizhou Sun, in KDD, 2020.
    Abstract BibTeX Details
    Graph neural networks (GNNs) have been demonstrated to besuccessful in modeling graph-structured data. However, training GNNs requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce labeling effort is to pre-train an expressive GNN model on unlabelled data with self-supervision and then transfer the learned knowledge to downstream models. In this paper, we present the GPT-GNN’s framework to initialize GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task to pre-train a GNN,which allows the GNN to capture the intrinsic structural and semantic properties of the graph. We factorize the likelihood of graph generation into two components: 1) attribute generation, and 2) edgegeneration. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on thebillion-scale academic graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art base GNN models without pre-training by up to 9.1% across different downstream tasks.
    @inproceedings{hu2020gptgnn,
      author = {Hu, Ziniu and Dong, Yuxiao and Wang, Kuansan and Chang, Kai-Wei and Sun, Yizhou},
      title = {GPT-GNN: Generative Pre-Training of Graph Neural Networks},
      booktitle = {KDD},
      year = {2020}
    }
    
    Details
  • SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics

    Da Yin, Tao Meng, and Kai-Wei Chang, in ACL, 2020.
    Full Text Slides Video Abstract BibTeX Details
    We propose SentiBERT, a variant of BERT that effectively captures compositional sentiment semantics. The model incorporates contextualized representation with binary constituency parse tree to capture semantic composition. Comprehensive experiments demonstrate that SentiBERT achieves competitive performance on phrase-level sentiment classification. We further demonstrate that the sentiment composition learned from the phrase-level annotations on SST can be transferred to other sentiment analysis tasks as well as related tasks, such as emotion classification tasks. Moreover, we conduct ablation studies and design visualization methods to understand SentiBERT. We show that SentiBERT is better than baseline approaches in capturing negation and the contrastive relation and model the compositional sentiment semantics.
    @inproceedings{yin2020sentibert,
      author = {Yin, Da and Meng, Tao and Chang, Kai-Wei},
      title = {SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics},
      booktitle = {ACL},
      year = {2020}
    }
    
    Details
  • Building Language Models for Text with Named Entities

    Md Rizwan Parvez, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang, in ACL, 2018.
    Full Text Poster Code Abstract BibTeX Details
    Text in many domains involves a significant amount of named entities. Predicting the entity names is often challenging for a language model as they appear less frequent on the training corpus. In this paper, we propose a novel and effective approach to building a language model which can learn the entity names by leveraging their entity type information. We also introduce two benchmark datasets based on recipes and Java programming codes, on which we evaluate the proposed model. Experimental results show that our model achieves 52.2% better perplexity in recipe generation and 40.3% on code generation than state-of-the-art language models.
    @inproceedings{parvez2018building,
      author = {Parvez, Md Rizwan and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
      title = {Building Language Models for Text with Named Entities},
      booktitle = {ACL},
      year = {2018}
    }
    
    Details
  • Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems

    Shyam Upadhyay, Ming-Wei Chang, Kai-Wei Chang, and Wen-tau Yih, in EMNLP, 2016.
    Full Text Abstract BibTeX Details
    Automatically solving algebra word problems has raised considerable interest recently. Existing state-of-the-art approaches mainly rely on learning from human annotated equations. In this paper, we demonstrate that it is possible to efficiently mine algebra problems and their numerical solutions with little to no manual effort. To leverage the mined dataset, we propose a novel structured-output learning algorithm that aims to learn from both explicit (e.g., equations) and implicit (e.g., solutions) supervision signals jointly. Enabled by this new algorithm, our model gains 4.6% absolute improvement in accuracy on the ALG-514 benchmark compared to the one without using implicit supervision. The final model also outperforms the current state-of-the-art approach by 3%.
    Dataset
    @inproceedings{BCWS16,
      author = {Upadhyay, Shyam and Chang, Ming-Wei and Chang, Kai-Wei and Yih, Wen-tau},
      title = {Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems},
      booktitle = {EMNLP},
      year = {2016}
    }
    
    Details