Update Summarization Based on Heat Conduction Model

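Abstract: In addition to the two requirements of traditional topic-focused multi-document summarization, namely topic relevance and information diversity, update summarization must also meet users' need for novel information. This paper proposes HeatSum, an extractive algorithm for update summarization based on a heat conduction model. The method naturally exploits the relationships between sentences and the topic, between new and old sentences, and between already-selected and candidate sentences, and finds sentences that are topic-relevant, diverse, and novel. Experimental results show that HeatSum performs comparably to the best-performing extractive method in the TAC09 evaluation and outperforms the other baseline methods.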

Keywords: update summarization, topic-focused multi-document summarization, heat conduction model

Abstract: Besides topic relevance and information diversity, the two problems tackled by traditional topic-focused multi-document summarization, update summarization must also address information novelty. This paper proposes HeatSum, an extractive approach to update summarization based on heat conduction. The heat conduction process naturally exploits the relationships among the given topic, the old sentences, the new sentences, and the selected and yet-to-be-selected sentences to find suitable sentences for the update summary. HeatSum can therefore address all of these problems simultaneously in a unified way. Experiments on the TAC 2009 benchmark with ROUGE evaluation show that HeatSum performs comparably to the best-performing systems in the TAC tasks and significantly outperforms the other baseline methods.

Key words: Update Summarization; Topic-Oriented Multi-Document Summarization; Heat Conduction Model
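
The abstract does not give HeatSum's equations, so the following is only a minimal sketch of one way heat conduction could rank sentences for an update summary. It assumes (none of this is confirmed by the text above) that the topic acts as a heat source held at temperature 1, that old sentences and already-selected sentences act as heat sinks held at temperature 0, and that every other sentence settles at the similarity-weighted mean temperature of its neighbours. The names `steady_state_temperatures`, `select_sentences`, and the toy similarity matrix `W` are hypothetical.

```python
from typing import List, Set

def steady_state_temperatures(weights: List[List[float]],
                              sources: Set[int],
                              sinks: Set[int],
                              iterations: int = 200) -> List[float]:
    """Relax free nodes toward the weighted mean of their neighbours.

    Assumption for this sketch: sources are pinned at 1.0, sinks at 0.0.
    """
    n = len(weights)
    temp = [1.0 if i in sources else 0.0 for i in range(n)]
    for _ in range(iterations):
        new_temp = temp[:]
        for i in range(n):
            if i in sources or i in sinks:
                continue  # pinned nodes keep their temperature
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total > 0:
                new_temp[i] = sum(weights[i][j] * temp[j]
                                  for j in range(n) if j != i) / total
        temp = new_temp
    return temp

def select_sentences(weights: List[List[float]], topic: int,
                     old: List[int], candidates: List[int],
                     k: int) -> List[int]:
    """Greedily pick the hottest candidate, then turn it into a sink."""
    sinks = set(old)
    selected: List[int] = []
    for _ in range(k):
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        temp = steady_state_temperatures(weights, {topic}, sinks)
        best = max(remaining, key=lambda c: temp[c])
        selected.append(best)
        sinks.add(best)  # a chosen sentence now cools sentences similar to it
    return selected

# Toy symmetric similarity matrix (made-up values):
# node 0 = topic, node 1 = old sentence, nodes 2-4 = new candidate sentences.
W = [
    [0.0, 0.5, 0.8, 0.7, 0.3],
    [0.5, 0.0, 0.1, 0.9, 0.1],
    [0.8, 0.1, 0.0, 0.2, 0.1],
    [0.7, 0.9, 0.2, 0.0, 0.1],
    [0.3, 0.1, 0.1, 0.1, 0.0],
]

summary = select_sentences(W, topic=0, old=[1], candidates=[2, 3, 4], k=2)
```

In this toy graph, sentence 3 is more similar to the topic than sentence 4 but is also very similar to the old sentence, so the sink cools it and sentence 4 is preferred. That is the kind of relevance, diversity, and novelty trade-off the abstract describes, under the assumptions stated above.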

Received: 2010-10-13

CLC number: TP391

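Funding: Supported by the Key Program of the National Natural Science Foundation of China (No. 60933005), the National Natural Science Foundation of China (No. 60903139, 61003166), and the National 863 Program of China (No. 2010AA012500).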

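About the authors: DU Pan, male, born in 1981, Ph.D. His main research interest is web mining. E-mail: xiaopandu@gmail.com. GUO Jia-Feng, male, born in 1980, Ph.D. His main research interests are social search, web mining, and information retrieval. ZHANG Jin, male, born in 1978, Ph.D. His main research interests are text mining and automatic summarization. CHENG Xue-Qi, male, born in 1971, Ph.D., research professor. His main research interests are network science, web search and data mining, P2P and distributed systems, and information security. ZHANG Xu, female, born in 1983, master's degree. Her main research interests are video content analysis and retrieval.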

Cite this article:

DU Pan, GUO Jia-Feng, ZHANG Jin, CHENG Xue-Qi, ZHANG Xu. Update Summarization Based on Heat Conduction Model[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(3): 367-374.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/ or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2012/V25/I3/367
