
Using LSA and text segmentation to improve automatic Chinese dialogue text summarization
Authors: LIU Chuan-han, WANG Yong-cheng, ZHENG Fei, LIU De-rong
Affiliations: [1] Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200030, China; [2] Center for Biomimetic Sensing and Control Research, Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China
Funding: Project (No. 2002AA119050) supported by the National Hi-Tech Research and Development Program (863) of China
Abstract: Automatic summarization of Chinese dialogue-style text is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document. All question paragraphs are then identified, and an automatic text segmentation approach analogous to TextTiling is used to improve the precision of correlating question paragraphs with answer paragraphs. Finally, "important" sentences are extracted from the generic content and the question-answer pairs to generate a complete summary. Experimental results showed that the approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.

Chinese keywords: Chinese dialogue; automatic text summarization; latent semantic analysis; text analysis; dialogue style
Received: 2006-07-04
Revised: 2006-10-07

LIU Chuan-han, WANG Yong-cheng, ZHENG Fei, LIU De-rong. Using LSA and text segmentation to improve automatic Chinese dialogue text summarization[J]. Journal of Zhejiang University Science, 2007, 8(1): 79-87.
Keywords: Automatic text summarization; Latent semantic analysis (LSA); Text segmentation; Dialogue style; Coherence; Question-answer pairs
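
The abstract mentions a segmentation step "analogous to TextTiling" for pairing question paragraphs with answer paragraphs. The following is a minimal sketch of that general idea, not the authors' method: it assumes paragraphs arrive already tokenized (Chinese word segmentation is out of scope here), uses raw term counts in place of whatever representation the paper uses, and all function names and the `min_depth` threshold are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def gap_similarities(paragraphs):
    """Similarity across each gap between consecutive paragraphs (token lists)."""
    bags = [Counter(p) for p in paragraphs]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]


def depth_scores(sims):
    """Depth of each gap: how far its similarity sits below the peaks on both sides."""
    depths = []
    for i, s in enumerate(sims):
        left = i
        while left > 0 and sims[left - 1] >= sims[left]:
            left -= 1  # climb to the nearest peak on the left
        right = i
        while right < len(sims) - 1 and sims[right + 1] >= sims[right]:
            right += 1  # climb to the nearest peak on the right
        depths.append((sims[left] - s) + (sims[right] - s))
    return depths


def segment(paragraphs, min_depth=0.5):
    """Return indices i such that a topic boundary falls after paragraph i.

    min_depth is an assumed, hand-picked cutoff, not a value from the paper.
    """
    depths = depth_scores(gap_similarities(paragraphs))
    return [i for i, d in enumerate(depths) if d >= min_depth]
```

For example, with the toy input `[["price", "cost"], ["price", "cost", "fee"], ["weather", "rain"], ["weather", "rain", "wind"]]`, `segment` returns `[1]`, placing the only boundary between the second and third paragraphs. A question paragraph could then be matched to candidate answer paragraphs inside the same segment.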
This article is indexed in CNKI, VIP (维普), SpringerLink, and other databases.