Multiview convolutional neural networks for multidocument extractive summarization

Abstract

Multidocument summarization has gained popularity in many real-world applications because it allows vital information to be extracted within a short time. Extractive summarization aims to generate a summary of a document or a set of documents by ranking sentences, and the ranking results rely heavily on the quality of the sentence features. However, almost all previous algorithms require hand-crafted features for sentence representation. In this paper, we leverage word embeddings to represent sentences and thereby avoid the intensive labor of feature engineering. We develop an enhanced convolutional neural network (CNN), termed multiview CNN, that obtains sentence features and ranks sentences jointly. Multiview learning is incorporated into the model to enhance the learning capability of the original CNN. We evaluate the generic summarization performance of the proposed method on five Document Understanding Conference (DUC) datasets. The proposed system outperforms state-of-the-art approaches, and the improvement is statistically significant, as shown by a paired t-test.

Publication
IEEE Transactions on Cybernetics