Web document summarization by exploiting social context with matrix co-factorization
Authors:Minh-Tien Nguyen  Viet Cuong Tran  Xuan Hoai Nguyen  Le-Minh Nguyen
Institution:1. Faculty of Information Technology, Hung Yen University of Technology and Education, Hung Yen, Vietnam;2. School of Information Science, Japan Advanced Institute of Science and Technology (JAIST), 1-1 Asahidai, Nomi, Ishikawa, 923-1292, Japan;3. Hanoi University of Science and Technology, Hanoi, Vietnam;4. AI Academy Vietnam, 489 Hoang Quoc Viet Rd, Hanoi, Vietnam
Abstract:In the context of social media, users usually post relevant information corresponding to the contents of events mentioned in a Web document. This information has two important properties: (i) it reflects the content of an event and (ii) it shares hidden topics with sentences in the main document. In this paper, we present a novel model that captures how document sentences and user posts (comments or tweets) share hidden topics, and uses the relevant posts to summarize Web documents. Unlike previous methods, which are usually based on hand-crafted features, our approach ranks document sentences and user posts by their importance to the topics. The relation between sentences and user posts is formulated in a shared topic matrix, which represents their mutual reinforcement. Our proposed matrix co-factorization algorithm computes the score of each document sentence and user post and extracts the top-ranked document sentences and comments (or tweets) as a summary. We apply the model to three social context summarization datasets in two languages, English and Vietnamese, and also to DUC 2004 (a standard corpus for the traditional summarization task). According to the experimental results, our model significantly outperforms basic matrix factorization and achieves ROUGE scores competitive with state-of-the-art methods.
Keywords:Data mining; Information retrieval; Document summarization; Social context summarization; Matrix factorization
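
The co-factorization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no objective function or update rules, so the code below assumes a generic non-negative co-factorization in which a sentence-term matrix `X_s` and a post-term matrix `X_p` are factorized with one shared topic matrix `H`, using standard multiplicative updates. The names `co_factorize` and `top_items`, the squared-error objective, and the scoring rule (sum of topic weights per row) are all hypothetical choices made for illustration.

```python
import numpy as np

def co_factorize(X_s, X_p, k=2, iters=200, eps=1e-9, seed=0):
    """Jointly factorize X_s ~ W_s @ H and X_p ~ W_p @ H with a shared H.

    Assumption: squared-error loss with multiplicative (NMF-style) updates;
    the paper's actual objective may differ.
    """
    rng = np.random.default_rng(seed)
    W_s = rng.random((X_s.shape[0], k))   # sentence-topic weights
    W_p = rng.random((X_p.shape[0], k))   # post-topic weights
    H = rng.random((k, X_s.shape[1]))     # shared topic-term matrix
    for _ in range(iters):
        W_s *= (X_s @ H.T) / (W_s @ H @ H.T + eps)
        W_p *= (X_p @ H.T) / (W_p @ H @ H.T + eps)
        # H is updated from both sides, which couples sentences and posts.
        H *= (W_s.T @ X_s + W_p.T @ X_p) / ((W_s.T @ W_s + W_p.T @ W_p) @ H + eps)
    return W_s, W_p, H

def top_items(W, m):
    """Rank rows by total topic weight (a hypothetical scoring rule)."""
    scores = W.sum(axis=1)
    return np.argsort(-scores)[:m]

# Toy term-count matrices: 3 sentences and 2 posts over a 4-term vocabulary.
X_s = np.array([[2.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 2.0],
                [1.0, 1.0, 0.0, 0.0]])
X_p = np.array([[1.0, 2.0, 0.0, 0.0],
                [0.0, 0.0, 2.0, 1.0]])

W_s, W_p, H = co_factorize(X_s, X_p, k=2)
summary_sentences = top_items(W_s, 2)  # indices of the selected sentences
summary_posts = top_items(W_p, 1)      # index of the selected post
```

Because both factorizations share `H`, topics that appear in the posts influence the weights assigned to sentences and vice versa, which is one simple way to express the mutual reinforcement the abstract describes.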
This article is indexed in ScienceDirect and other databases.