模式识别与人工智能
Pattern Recognition and Artificial Intelligence  2024, Vol. 37 Issue (4): 352-367    DOI: 10.16451/j.cnki.issn1003-6059.202404006
Researches and Applications
Vicinal Distribution Based Denoising Diffusion Probabilistic Model
SHI Hongbo1, WAN Bowen1, ZHANG Ying2
1. School of Information, Shanxi University of Finance and Economics, Taiyuan 030031;
2. College of Computer Science and Technology, Harbin Engineering University, Harbin 150009

Abstract  

Tabular datasets with limited sample sizes lack both invariance structure and sufficient samples, so traditional generative data augmentation methods struggle to produce diverse data that conforms to the original data distribution. To address this issue, a vicinal distribution-based denoising diffusion probabilistic model (VD-DDPM) and its learning algorithm are proposed, based on the characteristics of tabular data and the principle of vicinal risk minimization. First, the features of tabular data with limited sample size are analyzed. Weakly correlated features are selected via prior knowledge, and the vicinal distribution of the training samples is constructed. Then, VD-DDPM is built on data sampled from the vicinal distribution. Finally, a diverse dataset that conforms to the original data distribution is generated via the VD-DDPM generation algorithm. Experiments on multiple datasets verify the effectiveness of the proposed algorithm in terms of both the quality of the generated data and the performance of the downstream model.
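
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the form of the vicinal distribution or how prior knowledge selects features. The sketch assumes a Gaussian vicinity around each training row, substitutes a hypothetical correlation threshold for the prior-knowledge selection step, and uses the standard DDPM forward noising step in place of a full training loop. All function names and parameters (`select_weakly_correlated`, `sample_vicinal`, `ddpm_forward`, `threshold`, `sigma`, `n_per_sample`) are illustrative assumptions.

```python
import numpy as np

def select_weakly_correlated(X, threshold=0.3):
    """Indices of columns whose largest absolute Pearson correlation with any
    other column is below `threshold` (assumed stand-in for prior knowledge)."""
    corr = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(corr, 0.0)  # ignore each column's correlation with itself
    return np.where(np.abs(corr).max(axis=0) < threshold)[0]

def sample_vicinal(X, cols, sigma=0.1, n_per_sample=5, rng=None):
    """Draw `n_per_sample` points from an assumed Gaussian vicinity of each row,
    perturbing only the columns listed in `cols`."""
    rng = np.random.default_rng() if rng is None else rng
    X_rep = np.repeat(X, n_per_sample, axis=0).astype(float)
    scale = sigma * X[:, cols].std(axis=0)  # per-feature noise scale
    noise = rng.normal(size=(X_rep.shape[0], len(cols))) * scale
    X_rep[:, cols] += noise
    return X_rep

def ddpm_forward(x0, t, betas, rng):
    """Standard DDPM forward step:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

# Toy data: columns 0 and 1 are strongly correlated, column 2 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, a + 0.05 * rng.normal(size=200), rng.normal(size=200)])

cols = select_weakly_correlated(X)        # candidate features to perturb
X_aug = sample_vicinal(X, cols, rng=rng)  # samples from the vicinal distribution
betas = np.linspace(1e-4, 0.02, 1000)     # linear noise schedule (assumed)
x_t, eps = ddpm_forward(X_aug, t=500, betas=betas, rng=rng)
```

In a full model, a denoising network would be trained to predict `eps` from `x_t` and `t`, and new rows would be generated by running the learned reverse process. Perturbing only weakly correlated features is one plausible reading of the abstract: it leaves dependencies between strongly correlated columns untouched.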

Key words: Data Augmentation, Vicinal Risk Minimization, Vicinal Distribution, Diffusion Models, Tabular Data
Received: 04 February 2024     
CLC Number: TP 391
Fund:

Special Fund for the Central Government to Guide Local Technological Development (No. YDZJSX20231A057), Humanities and Social Sciences Program of Ministry of Education of China (No. 22YJAZH092)

Corresponding Author: SHI Hongbo, Ph.D., professor. Her research interests include machine learning and data mining.
About authors: WAN Bowen, Master's student. His research interests include machine learning and data mining. ZHANG Ying, Ph.D. candidate. Her research interests include machine learning and data mining.
Cite this article:   
SHI Hongbo, WAN Bowen, ZHANG Ying. Vicinal Distribution Based Denoising Diffusion Probabilistic Model[J]. Pattern Recognition and Artificial Intelligence, 2024, 37(4): 352-367.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202404006      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2024/V37/I4/352
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn