WU Chen^{1,2}, ZHANG Ruqing^{1,2}, GUO Jiafeng^{1,2}, FAN Yixing^{1,2}
1. Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract: Ranking competition is prevalent in Web retrieval, and this adversarial attack behavior causes undesirable effects on retrieval quality. Studying attack methods is therefore conducive to designing more robust ranking models. Existing attack methods produce perturbations that are easily recognized by humans and cannot attack neural ranking models effectively. In this paper, a gradient-based adversarial attack method (GARA) is proposed, consisting of gradient-based word importance ranking, gradient-based adversarial ranking attack, and embedding-based word replacement. Given a target ranking model, backpropagation is first performed on a constructed ranking-based adversarial attack objective. The most important words of a specific document are then identified from the gradient information, and these words are perturbed in the word embedding space via projected gradient descent. Finally, using the counter-fitting technique, the document perturbation is completed by substituting each important word with a synonym that is semantically similar to the original word and nearest to the perturbed word vector. Experiments on the MQ2007 and MS MARCO datasets demonstrate the effectiveness of the proposed method.
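To make the three-stage pipeline concrete, the sketch below implements it in PyTorch under stated assumptions: the ranking model is assumed to be a callable that scores a (query embedding, document embedding) pair, and the counter-fitted synonym table is assumed to map each token id to candidate ids and their vectors. All names here (gara_attack, model, synonyms) are illustrative placeholders, not the authors' released code.

import torch

def gara_attack(model, embedding, query_ids, doc_ids,
                synonyms, k=5, eps=1.0, alpha=0.1, steps=10):
    """Perturb the k most important document words so the (query, doc)
    relevance score rises, then snap each perturbed embedding to its
    nearest counter-fitted synonym. `model(q_emb, d_emb)` is assumed
    to return a scalar relevance score."""
    query_emb = embedding(query_ids).detach()
    doc_emb = embedding(doc_ids).detach().clone().requires_grad_(True)

    # 1. Gradient-based word importance ranking: backpropagate the
    #    ranking-based attack objective (negative relevance score, so
    #    descending it raises the score) and rank words by the L2 norm
    #    of their embedding gradients.
    loss = -model(query_emb, doc_emb)
    loss.backward()
    importance = doc_emb.grad.norm(dim=-1)       # one score per word
    top_idx = importance.topk(k).indices         # k most important words

    # 2. Gradient-based adversarial ranking attack: projected gradient
    #    descent in embedding space, restricted to the important
    #    positions and an eps-ball around the original embeddings.
    orig = doc_emb.detach().clone()
    mask = torch.zeros_like(orig)
    mask[top_idx] = 1.0
    adv = orig.clone()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = -model(query_emb, adv)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv - alpha * grad.sign()              # descend attack loss
            delta = (adv - orig).clamp(-eps, eps) * mask # project, keep top-k
            adv = orig + delta

    # 3. Embedding-based word replacement: substitute each important
    #    word with the counter-fitted synonym nearest to its perturbed
    #    vector, keeping the substitution semantically close.
    adv_ids = doc_ids.clone()
    for i in top_idx.tolist():
        tok = doc_ids[i].item()
        if tok not in synonyms:
            continue
        cand_ids, cand_emb = synonyms[tok]       # (n,) ids, (n, dim) vectors
        dist = (cand_emb - adv[i]).norm(dim=-1)
        adv_ids[i] = cand_ids[dist.argmin()]
    return adv_ids

In practice one would also cap the number of replaced words and check that the rewritten document remains fluent and semantically close to the original; the sketch omits these constraints for brevity.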