宋泽宇, 李旸, 李德玉, 王素格. 融合标签关系的法律文本多标签分类方法. 模式识别与人工智能, 2022,35(2): 185-192
SONG Zeyu, LI Yang, LI Deyu, WANG Suge. Multi-label Classification of Legal Text with Fusion of Label Relations. PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022,35(2): 185-192.
Multi-label Classification of Legal Text with Fusion of Label Relations
SONG Zeyu1, LI Yang2, LI Deyu1,3, WANG Suge1,3
1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
2. School of Finance, Shanxi University of Finance and Economics, Taiyuan 030006
3. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Corresponding author:
LI Deyu, Ph.D., professor. His research interests include granular computing and machine learning.
About the authors:
SONG Zeyu, master's student. His research interests include text mining and natural language processing.
LI Yang, Ph.D., lecturer. Her research interests include text sentiment analysis.
WANG Suge, Ph.D., professor. Her research interests include natural language processing and sentiment analysis.
Fund: Supported by the National Natural Science Foundation of China (No. 62072294, 62076158, 62106130, 61906112), the Key Research and Development Program of Shanxi Province (No. 201803D421024), and the Graduate Innovation Programs of Shanxi Province (No. 2021Y149)
Abstract
With the rapid development of big data technology, multi-label text classification has found many applications in the judicial field. A legal text typically carries multiple element labels, and these labels are interdependent or correlated, so identifying them accurately requires a multi-label classification method. This paper proposes a multi-label classification method for legal texts that fuses label relations (MLC-FLR). A label co-occurrence matrix is constructed, and a graph convolutional network is applied to it to capture the dependencies between labels. A label attention mechanism computes the degree of correlation between a legal text and each label word, yielding a label-specific semantic representation of the text. Finally, the label dependencies and the label-specific semantic representation are combined into a comprehensive text representation, which is used for multi-label classification. Experimental results on legal text datasets show that MLC-FLR achieves better classification accuracy and stability.
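The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the conditional-probability form of the co-occurrence matrix, the row normalization, the single ReLU graph-convolution layer, the dot-product attention, and the dot-product fusion at the end are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np


def label_cooccurrence(Y):
    """Y: (n_texts, n_labels) binary matrix.
    Returns C with C[i, j] = share of texts with label i that also carry label j
    (an assumed conditional-probability form), with the diagonal set to 1."""
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                       # counts[i, j] = texts with both labels
    occur = np.diag(counts).copy()         # occur[i] = texts with label i
    C = counts / np.maximum(occur, 1.0)[:, None]
    np.fill_diagonal(C, 1.0)               # self-loops
    return C


def row_normalize(A):
    """Scale each row to sum to 1 (assumed normalization)."""
    return A / A.sum(axis=1, keepdims=True)


def gcn_layer(A_hat, H, W):
    """One graph-convolution step: ReLU(A_hat @ H @ W)."""
    return np.maximum(A_hat @ H @ W, 0.0)


def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def label_attention(tokens, label_emb):
    """tokens: (T, d) word vectors of one text; label_emb: (L, d).
    Returns (L, d) label-specific text vectors and the (L, T) attention weights."""
    weights = softmax(label_emb @ tokens.T, axis=1)
    return weights @ tokens, weights


# Toy data: 4 texts, 3 labels, 5 tokens, embedding size 4 (all illustrative).
rng = np.random.default_rng(0)
Y = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]])
A_hat = row_normalize(label_cooccurrence(Y))
label_emb = rng.normal(size=(3, 4))
label_dep = gcn_layer(A_hat, label_emb, rng.normal(size=(4, 4)))
label_text, _ = label_attention(rng.normal(size=(5, 4)), label_emb)

# Assumed fusion: one logit per label from the dot product of its
# dependency vector and its label-specific text vector.
probs = 1.0 / (1.0 + np.exp(-(label_dep * label_text).sum(axis=1)))
print(probs.shape)
```

Each label receives an independent sigmoid score, so several labels can be assigned to the same text, which is what distinguishes the multi-label setting from single-label classification.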
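3) Semantic-unit-based multi-label text classification (Semantic-Unit-Based Dilated Convolution for Multi-label Text Classification, SU4MLC)[27]. Multi-layer dilated convolution produces higher-level semantic-unit representations, and a corresponding hybrid attention mechanism extracts information at both the word level and the semantic-unit level.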
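To make the dilated-convolution idea concrete, here is a minimal NumPy sketch of a 1-D dilated convolution stacked over three layers. It is only an illustration of the general technique, not the SU4MLC implementation: the kernel size, the dilation rates 1, 2, 3, the ReLU, and the function names are assumptions, and the hybrid attention mechanism is omitted.

```python
import numpy as np


def dilated_conv1d(x, kernel, dilation):
    """x: (T, d_in) token vectors; kernel: (k, d_in, d_out).
    Unpadded convolution; output length is T - (k - 1) * dilation."""
    k = kernel.shape[0]
    out_len = x.shape[0] - (k - 1) * dilation
    if out_len <= 0:
        raise ValueError("sequence too short for this kernel size and dilation")
    out = np.zeros((out_len, kernel.shape[2]))
    for j in range(k):
        # Tap j reads the input shifted by j * dilation positions.
        out += x[j * dilation : j * dilation + out_len] @ kernel[j]
    return out


def receptive_field(kernel_size, dilations):
    """Number of input tokens seen by one output position of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy stack: 20 tokens, 8 features, kernel size 3, dilations 1, 2, 3 (illustrative).
rng = np.random.default_rng(0)
h = rng.normal(size=(20, 8))
dilations = [1, 2, 3]
for d in dilations:
    h = np.maximum(dilated_conv1d(h, rng.normal(size=(3, 8, 8)), d), 0.0)

print(h.shape, receptive_field(3, dilations))
```

Increasing the dilation from layer to layer widens the span of words each output position covers without adding parameters, which is how stacked layers can summarize units larger than single words.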
READ J, PFAHRINGER B, HOLMES G, et al. Classifier Chains for Multi-label Classification. 2011, 85(3): 333-359.
[5]
ELISSEEFF A, WESTON J. Cambridge, USA: The MIT Press, 2001: 681-687.
[6]
THOMPSON P. Automatic Categorization of Case Law // Proc of the 8th International Conference on Artificial Intelligence and Law. 2001: 70-77.
[7]
ŞULEA OM, ZAMPIERI M, MALMASI S, et al. [C/OL]. [2021-05-03]. https://arxiv.org/pdf/171009306.pdf.
[8]
CONNEAU A, SCHWENK H, BARRAULT L, et al. Very Deep Convolutional Networks for Text Classification // Proc of the 15th Conference of the European Chapter of Association for Computational Linguistics. 2017: 1107-1116.
REYES O, MORELL C, VENTURA S. Scalable Extensions of the ReliefF Algorithm for Weighting and Selecting Features on the Multi-label Learning Context. 2015, 161: 168-182.
[11]
ZHANG ML, ZHOU ZH. A Review on Multi-label Learning Algorithms. 2014, 26(8): 1819-1837.
[12]
LIU JZ, CHANG WC, WU YX, et al. Deep Learning for Extreme Multi-label Text Classification // Proc of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2017: 115-124.
[13]
YOU RH, DAI SY, ZHANG Z, et al. [C/OL]. [2021-05-03]. https://arxiv.org/pdf/1811.01727v1.pdf.
[14]
YANG PC, SUN X, LI W, et al. SGM: Sequence Generation Model for Multi-label Classification // Proc of the 27th International Conference on Computational Linguistics. 2018: 3915-3926.
[15]
YE H, JIANG X, LUO ZC, et al. Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions // Proc of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018: 1854-1864.
[16]
YANG ZC, YANG DY, DYER C, et al. Hierarchical Attention Networks for Document Classification // Proc of the Conference of the North American Chapter of the Association for Computational Linguistics (Human Language Technologies). 2016: 1480-1489.
[17]
YANG PC, LUO FL, MA SM, et al. A Deep Reinforced Sequence-to-Set Model for Multi-label Classification // Proc of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 5252-5258.
[18]
LUO BF, FENG YS, XU JB, et al. Learning to Predict Charges for Criminal Cases with Legal Basis // Proc of the Conference on Empirical Methods in Natural Language Processing. 2017: 2727-2736.
[19]
DU CX, CHEN ZZ, FENG FL, et al. [2021-05-03]. https://arxiv.org/pdf/1811.09386v1.pdf.
[20]
BYRD J, LIPTON ZC. What Is the Effect of Importance Weighting in Deep Learning? // Proc of the 36th International Conference on Machine Learning. 2019: 872-881.
[21]
CHAWLA NV, BOWYER KW, HALL LO, et al. SMOTE: Synthetic Minority Over-Sampling Technique. 2002, 16: 321-357.
[22]
CUI Y, JIA ML, LIN TY, et al. Class-Balanced Loss Based on Effective Number of Samples // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 9260-9269.
[23]
VASWANI A, SHAZEER N, PARMAR N, et al. Cambridge, USA: The MIT Press, 2017: 6000-6010.
[24]
CHEN ZM, WEI XS, WANG P, et al. Multi-label Image Recognition with Graph Convolutional Networks // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 5172-5181.
[25]
NAM J, KIM J, MENCÍA EL, et al. Large-Scale Multi-label Text Classification Revisiting Neural Networks // Proc of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 2014: 437-452.
[26]
WANG DX. Research on Element Identification for Legal Documents. Master Dissertation. 2020. (in Chinese)
[27]
LIN JY, SU Q, YANG PC, et al. Semantic-Unit-Based Dilated Convolution for Multi-label Text Classification // Proc of the Conference on Empirical Methods in Natural Language Processing. 2018: 4554-4564.
[28]
XIAO L, HUANG X, CHEN BL, et al. Label-Specific Document Representation for Multi-label Text Classification // Proc of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 466-475.