Pattern Recognition and Artificial Intelligence
  2019, Vol. 32 Issue (6): 545-556    DOI: 10.16451/j.cnki.issn1003-6059.201906007
Researches and Applications
Safe Sample Screening Based Sampling Method for Imbalanced Data
SHI Hongbo¹, LIU Yanxin¹, JI Suqin¹
1. College of Information, Shanxi University of Finance and Economics, Taiyuan 030006

Abstract  

Undersampling may discard valuable information, and the synthetic minority oversampling technique (SMOTE) may aggravate the overlap between the majority class and the minority class. This paper proposes a sampling method, Screening_SMOTE, that combines undersampling based on safe sample screening with SMOTE. The undersampling step uses safe screening rules to identify and discard some of the non-informative instances and noise instances in the majority class. Minority class instances generated by SMOTE are then added to the screened dataset. Because the screening-based undersampling removes noise instances from the majority class while avoiding the loss of informative instances, it alleviates class overlap. Experimental results show that Screening_SMOTE rebalances imbalanced datasets effectively, especially high-dimensional ones.
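The two-stage pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the safe screening rules, so `screen_majority` is a hypothetical distance-based stand-in that simply keeps the majority instances nearest the minority class (it does not remove noise). The function names, the `keep_fraction` parameter, and the choice to oversample until both classes are the same size are likewise assumptions. The `smote` function follows the standard SMOTE idea of interpolating between a minority instance and one of its k nearest minority neighbours.

```python
import math
import random


def screen_majority(majority, minority, keep_fraction=0.5):
    """Hypothetical stand-in for safe sample screening (NOT the paper's rules).

    Keeps the fraction of majority instances closest to the minority class,
    on the crude assumption that far-away instances are non-informative.
    """
    def dist_to_minority(p):
        return min(math.dist(p, q) for q in minority)

    ranked = sorted(majority, key=dist_to_minority)
    n_keep = max(1, round(keep_fraction * len(majority)))
    return ranked[:n_keep]


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic minority points by linear interpolation.

    Each point lies on the segment between a randomly chosen minority
    instance and one of its k nearest minority neighbours.
    Requires at least two minority instances.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def screening_then_smote(majority, minority, keep_fraction=0.5, k=5, seed=0):
    """Undersample the majority class first, then oversample the minority.

    Assumption: oversample until the two classes have equal size.
    """
    rng = random.Random(seed)
    kept = screen_majority(majority, minority, keep_fraction)
    n_new = max(0, len(kept) - len(minority))
    return kept, minority + smote(minority, n_new, k, rng)


# Toy 2-D data with an imbalance ratio of 10:1.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
minority = [(data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)) for _ in range(10)]

kept, new_minority = screening_then_smote(majority, minority)
print(len(kept), len(new_minority))  # prints: 50 50
```

Running the undersampling step first means SMOTE only has to close a smaller gap, so fewer synthetic points are needed. How well this relieves class overlap depends on the actual screening rule, which this sketch does not reproduce.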

Key words: Imbalanced Data; Safe Sample Screening; Undersampling; Imbalance Ratio; Synthetic Minority Oversampling Technique (SMOTE)
Received: 29 January 2019     
ZTFLH: TP 391  
About authors: SHI Hongbo (corresponding author), Ph.D., professor. Her research interests include machine learning and artificial intelligence. LIU Yanxin, master's student. Her research interests include machine learning. JI Suqin, master's degree, associate professor. Her research interests include machine learning and data mining.
Cite this article:   
SHI Hongbo, LIU Yanxin, JI Suqin. Safe Sample Screening Based Sampling Method for Imbalanced Data[J]. Pattern Recognition and Artificial Intelligence, 2019, 32(6): 545-556.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201906007      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2019/V32/I6/545
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn