
Research on Data Classification of Imbalanced Lacquer Film on Ancient Lacquerware
Cite this article: ZHANG Lan-bin, XU Guo-qing, LI Lan. Research on Data Classification of Imbalanced Lacquer Film on Ancient Lacquerware[J]. Introduction of Educational Technology (教育技术导刊), 2021, 20(1): 84-88.
Authors: ZHANG Lan-bin (张岚斌)  XU Guo-qing (徐国庆)  LI Lan (李澜)
Institution: 1. School of Computer Science & Engineering, Wuhan Institute of Technology, Wuhan 430205, China; 2. Hubei Provincial Museum, Wuhan 430077, China
Funding: Natural Science Foundation of Hubei Province (2014CFB786); 10th Graduate Education Innovation Fund of Wuhan Institute of Technology (CX2018210)
Received: 2020-05-06
Abstract: Lacquer film data from ancient lacquerware are imbalanced across classes and small in sample size, and traditional machine learning algorithms classify them poorly. To address these problems, an improved SMOTE oversampling method is proposed that changes the sample distribution of the lacquer film data until the classes are balanced. The method compares the Euclidean distances between samples of each class to remove noisy data from the synthetic samples, then classifies with the random forest algorithm from ensemble learning, which improves classification accuracy on the minority class. Experimental results on UCI data sets show that the improved oversampling method performs better, raising the F1-score and AUC evaluation metrics by more than 2% and 5%, respectively. Comparative experiments that combine the improved oversampling method with machine learning algorithms show that the random forest algorithm achieves higher accuracy: in determining the age of ancient lacquerware, its F1-score and AUC reach 87.76% and 89.34%.
Keywords: lacquer film on ancient lacquerware  oversampling  ensemble learning  random forest
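
The abstract does not give the exact rule used to discard noisy synthetic samples, so the following is only a minimal sketch of the general idea: SMOTE-style interpolation between minority-class neighbours, followed by a Euclidean-distance check. The function name `smote_with_filter`, the parameter `k`, the toy data, and the filtering rule (keep a synthetic sample only if its nearest minority sample is at least as close as its nearest majority sample) are all assumptions for illustration, not the authors' method.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_with_filter(minority, majority, n_new, k=3, seed=0):
    """Create up to n_new synthetic minority samples (hypothetical helper).

    Each candidate is interpolated between a minority sample and one of
    its k nearest minority neighbours, as in standard SMOTE. A candidate
    is then kept only if it is no farther from the minority class than
    from the majority class (an assumed noise rule).
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        mate = rng.choice(neighbours)
        # Interpolate at a random position on the segment base -> mate
        gap = rng.random()
        candidate = [b + gap * (m - b) for b, m in zip(base, mate)]
        # Assumed noise filter based on Euclidean distance to each class
        d_min = min(euclidean(candidate, p) for p in minority)
        d_maj = min(euclidean(candidate, p) for p in majority)
        if d_min <= d_maj:
            kept.append(candidate)
    return kept


# Toy two-feature data: 4 minority samples and 8 majority samples
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3], [1.1, 1.1]]
majority = [[3.0, 3.0], [3.2, 2.8], [2.9, 3.1], [3.1, 3.3],
            [2.8, 2.9], [3.3, 3.0], [3.0, 2.7], [2.7, 3.2]]

synthetic = smote_with_filter(minority, majority, n_new=4)
balanced_minority = minority + synthetic
print(len(balanced_minority), len(majority))
```

The balanced data would then be passed to a classifier; in Python, a random forest is commonly taken from a library such as scikit-learn (`RandomForestClassifier`), which is not shown here. Because filtered candidates are dropped rather than regenerated, the function can return fewer than `n_new` samples when the classes overlap.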