Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2024, Vol. 37, Issue 9: 798-810    DOI: 10.16451/j.cnki.issn1003-6059.202409004
Research and Applications
Soft Prompt Learning with Internal Knowledge Expansion for Clickbait Detection
DONG Bingbing (1, 2), WU Xindong (1, 2)
1. Key Laboratory of Knowledge Engineering with Big Data of Ministry of Education of China, Hefei University of Technology, Hefei 230009;
2. School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601

Abstract: The main purpose of clickbait is to increase page views and advertising revenue by enticing users to click on links. Clickbait content is often low-quality, misleading, or false, and can adversely affect users. Existing prompt learning methods based on pre-trained language models rely on external open knowledge bases to detect clickbait. As a result, their performance is constrained by the quality and availability of those knowledge bases, and retrieving from them inevitably introduces query and response latency. To address this issue, a soft prompt learning method with internal knowledge expansion for clickbait detection (SPCD_IE) is proposed in this paper. Expansion words are extracted from the training dataset itself, and hierarchical clustering and optimization strategies are employed to fine-tune the obtained expansion words during prompt learning, so that no knowledge has to be retrieved from external knowledge bases. Moreover, soft prompt learning is used to obtain the best prompts for specific text types, avoiding the bias introduced by manual templates. In few-shot scenarios, although SPCD_IE expands using internal knowledge only, experimental results on three public clickbait datasets show that it achieves better detection performance in less time.
Key words: Clickbait Detection, Soft Prompt, Internal Knowledge Expansion, Prompt Learning
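The idea of drawing expansion words from the training data itself and then grouping them hierarchically can be illustrated with a small, self-contained sketch. Everything specific here is an assumption made for illustration and is not taken from SPCD_IE: the smoothed log-odds score used to rank words, the Jaccard distance over headline co-occurrence, the average-linkage merging rule, the 0.5 threshold, the function names, and the toy headlines. The soft prompt and the fine-tuning of expansion words during prompt learning are not shown.

```python
import math
import re
from collections import Counter

# Toy labeled headlines (hypothetical data, for illustration only).
HEADLINES = [
    "You will not believe what happened next",
    "This trick will shock you",
    "You will never believe this trick",
    "Senate passes annual budget bill",
    "Central bank raises interest rates",
    "Senate debates budget and interest rates",
]
LABELS = ["clickbait"] * 3 + ["news"] * 3


def tokenize(text):
    """Lowercase a headline and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def expansion_words(headlines, labels, top_k=3):
    """Return, per label, the top_k words most associated with that label.

    Association is a smoothed log-odds ratio of the word's frequency inside
    the class versus outside it (an illustrative choice, not the paper's).
    """
    counts = {label: Counter() for label in set(labels)}
    for text, label in zip(headlines, labels):
        counts[label].update(tokenize(text))
    vocab = set().union(*counts.values())
    result = {}
    for label, inside in counts.items():
        outside = Counter()
        for other, other_counts in counts.items():
            if other != label:
                outside.update(other_counts)
        n_in, n_out = sum(inside.values()), sum(outside.values())

        def score(word):
            p_in = (inside[word] + 1) / (n_in + len(vocab))
            p_out = (outside[word] + 1) / (n_out + len(vocab))
            return math.log(p_in / p_out)

        # Ties are broken alphabetically so the output is deterministic.
        result[label] = sorted(vocab, key=lambda w: (-score(w), w))[:top_k]
    return result


def occurrence_sets(headlines):
    """Map each word to the set of headline indices in which it appears."""
    index = {}
    for i, text in enumerate(headlines):
        for word in tokenize(text):
            index.setdefault(word, set()).add(i)
    return index


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two non-empty sets."""
    return 1.0 - len(a & b) / len(a | b)


def agglomerate(words, features, threshold=0.5):
    """Bottom-up clustering with average linkage.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[w] for w in words]

    def linkage(c1, c2):
        dists = [jaccard_distance(features[a], features[b])
                 for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    words = expansion_words(HEADLINES, LABELS, top_k=4)
    features = occurrence_sets(HEADLINES)
    for label, candidates in sorted(words.items()):
        print(label, agglomerate(candidates, features))
```

Because both steps read only the labeled training headlines, nothing is fetched from an external knowledge base, which is the property the abstract emphasizes. In practice a library routine such as SciPy's hierarchical clustering over embedding vectors would likely replace the hand-written loop.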
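Received: 2024-05-08
CLC number (ZTFLH): TP 391
Funding: Supported by the National Natural Science Foundation of China (No. 62120106008)
Corresponding author: WU Xindong, Ph.D., professor. His research interests include data mining, big data analytics, and knowledge-based systems. E-mail: xwu@hfut.edu.cn
About the author: DONG Bingbing, Ph.D. candidate. Research interest: data mining. E-mail: blingdong@mail.hfut.edu.cn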
Cite this article:
DONG Bingbing, WU Xindong. Soft Prompt Learning with Internal Knowledge Expansion for Clickbait Detection. Pattern Recognition and Artificial Intelligence, 2024, 37(9): 798-810.
Link to this article:
http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202409004      or     http://manu46.magtech.com.cn/Jweb_prai/CN/Y2024/V37/I9/798