RGAN-EL: A GAN and ensemble learning-based hybrid approach for imbalanced data classification |
| |
Institution: | 1. School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; 2. ByteDance, CA, United States |
| |
Abstract: | Imbalanced sample distribution is often the main cause of performance degradation in machine learning algorithms. To address this, this study proposes a hybrid framework (RGAN-EL) that combines generative adversarial networks (GANs) with ensemble learning to improve classification performance on imbalanced data. First, we propose a training sample selection strategy based on roulette wheel selection, so that the GAN pays more attention to the class-overlapping region when fitting the sample distribution. Second, we design two kinds of generator training loss and propose a noise sample filtering method to improve the quality of the generated samples. Then, minority class samples are oversampled using the improved GAN (RGAN) to obtain a balanced training set. Finally, training and prediction are carried out with an ensemble learning strategy. We conducted experiments on 41 real imbalanced data sets using two evaluation metrics: F1-score and AUC. Specifically, we compare RGAN-EL with six typical ensemble learning methods, and RGAN with three typical GAN models. The experimental results show that RGAN-EL is significantly better than the six ensemble learning methods, and that RGAN improves substantially on the three classical GAN models. |
| |
Keywords: | Imbalanced data; Generative adversarial networks; Data sampling; Ensemble learning |
This article is indexed in ScienceDirect and other databases. |
|
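The abstract mentions a roulette wheel selection step that biases GAN training samples toward the class-overlapping region, but it does not say how the selection weights are computed. The sketch below is only an illustration under stated assumptions: the overlap score (the share of majority-class points among the k nearest neighbours of each minority sample), the function names overlap_weights and roulette_wheel_select, and the parameters k and eps are all hypothetical and are not taken from the paper.

```python
import numpy as np


def overlap_weights(X, y, minority_label=1, k=5, eps=1e-3):
    """Hypothetical overlap score (an assumption, not the paper's formula):
    share of majority-class points among the k nearest neighbours of each
    minority sample, plus eps so every sample keeps a non-zero chance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    # Euclidean distances from each minority sample to every sample.
    d = np.linalg.norm(X[min_idx, None, :] - X[None, :, :], axis=2)
    d[np.arange(len(min_idx)), min_idx] = np.inf  # ignore the sample itself
    nn = np.argsort(d, axis=1)[:, :k]
    frac_majority = (y[nn] != minority_label).mean(axis=1)
    return min_idx, frac_majority + eps


def roulette_wheel_select(weights, n_draws, rng):
    """Draw n_draws positions with probability proportional to weight."""
    w = np.asarray(weights, dtype=float)
    cum = np.cumsum(w)                      # slot boundaries on the wheel
    spins = rng.random(n_draws) * cum[-1]   # where each spin lands
    picks = np.searchsorted(cum, spins, side="right")
    return np.minimum(picks, len(w) - 1)    # guard against rounding at the edge


if __name__ == "__main__":
    # Toy data: one minority point sits inside the majority cluster.
    X = [[0, 0], [0.1, 0], [0, 0.1], [0.05, 0.05], [5, 5], [5.1, 5], [5, 5.1]]
    y = [0, 0, 0, 1, 1, 1, 1]
    idx, w = overlap_weights(X, y, minority_label=1, k=2)
    rng = np.random.default_rng(0)
    chosen = idx[roulette_wheel_select(w, 10, rng)]
    print(chosen)  # indices of minority samples picked for GAN training
```

With these weights, minority samples surrounded by majority-class neighbours are drawn far more often than samples deep inside the minority cluster, which is one plausible way to make a generator focus on the overlapping region.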