1.

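黄静, 薛书田, 肖进. 《软科学》, 2017, (7): 131-134

This study combines semi-supervised learning with the Bagging multi-classifier ensemble to build a Bagging-based semi-supervised ensemble model for imbalanced class distributions (SSEBI), which uses both labeled and unlabeled samples to improve model performance. The model has three stages: (1) selectively label a subset of samples from the unlabeled data set and train several base classifiers; (2) classify the test-set samples with the trained base classifiers; (3) combine the classification results into a final result. An empirical analysis on five customer credit scoring data sets demonstrates the effectiveness of the proposed SSEBI model.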
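The three stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SSEBI implementation: the abstract does not say how unlabeled samples are selected or how imbalance is handled, so the nearest-centroid base learner, the margin rule for selective labeling, the class-balanced bootstrap, and all names (`fit_centroids`, `ssebi_sketch`, `margin`) are hypothetical.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_centroids(X, y):
    """Nearest-centroid base learner (assumed): one mean vector per class."""
    counts = Counter(y)
    sums = {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(model, x):
    return min(model, key=lambda c: dist2(model[c], x))

def ssebi_sketch(X_lab, y_lab, X_unlab, X_test, n_models=5, margin=1.0, seed=0):
    rng = random.Random(seed)
    base = fit_centroids(X_lab, y_lab)
    by_class = {}
    for xi, yi in zip(X_lab, y_lab):
        by_class.setdefault(yi, []).append(xi)
    # Stage 1a: label only those unlabeled samples whose two closest class
    # centroids differ by at least `margin` in squared distance (assumed rule).
    for x in X_unlab:
        d = sorted(dist2(c, x) for c in base.values())
        if len(d) < 2 or d[1] - d[0] >= margin:
            by_class[predict(base, x)].append(x)
    # Stage 1b: train base classifiers on class-balanced bootstrap samples
    # (assumed way of coping with the imbalanced class distribution).
    k = min(len(pts) for pts in by_class.values())
    models = []
    for _ in range(n_models):
        Xb, yb = [], []
        for label, pts in by_class.items():
            Xb.extend(rng.choice(pts) for _draw in range(k))
            yb.extend([label] * k)
        models.append(fit_centroids(Xb, yb))
    # Stages 2 and 3: classify each test sample with every base classifier,
    # then combine the predictions by majority vote.
    return [Counter(predict(m, x) for m in models).most_common(1)[0][0]
            for x in X_test]

X_lab, y_lab = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0]], [0, 0, 1]
X_unlab = [[0.1, 0.2], [5.1, 4.9], [2.5, 2.5]]  # the last point is ambiguous and stays unlabeled
print(ssebi_sketch(X_lab, y_lab, X_unlab, [[0.0, 0.3], [4.8, 5.2]]))
```

Drawing the same number of samples from each class in every bootstrap keeps the minority class represented in all base classifiers, which is one simple way to respect the imbalanced setting the abstract mentions.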

2.

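Supervised Classification of Remote Sensing Images Based on Multiple Classifier Combination

张耀宇, 于光, 王志强. 《黑龙江科技信息》, 2011, (34): 59-59

Remote sensing image classification is an important part of remote sensing image processing. Because supervised classification methods for remote sensing images differ in their range of application and each classification mechanism has its own strengths and weaknesses, multiple classifiers are combined to classify the images. The results show that, compared with a single classifier, supervised classification with combined classifiers effectively improves the accuracy of thematic information extraction from remote sensing images.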

3.

A Data Classification Model Based on Bagging of Multiple Classifier Types

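段尧清, 林平, 李施展. 《情报科学》, 2019, 37(4): 59-65

[Purpose/Significance] Data classification is a major topic in data mining research. Because individual classification algorithms differ in performance, no single algorithm handles most classification problems well, so a data classification method based on bagging multiple types of classifiers has both theoretical significance and practical value. [Method/Process] With accuracy as the performance measure, five different types of classification algorithms serve as classifiers; each is trained on randomly drawn training sets to obtain several weak classifiers, which are then combined into a strong classifier through automatically optimized weighting. Experiments compare the mean and standard deviation of classification accuracy for the five algorithms and the bagging algorithm, showing the relative performance and stability of each algorithm on four data sets. [Results/Conclusions] Experiments on four UCI data sets show that, compared with the five individual classification algorithms, the bagging algorithm is stable on most data sets and also generalizes better.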
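The weighted combination step can be illustrated with a small sketch. The abstract does not describe its "automatically optimized weighting", so the rule used here (each classifier's weight equals its accuracy on a held-out set) is an assumption, and the three toy classifiers are hypothetical stand-ins for the five algorithm types.

```python
from collections import defaultdict

def accuracy(clf, X, y):
    """Fraction of samples a classifier labels correctly."""
    return sum(clf(x) == t for x, t in zip(X, y)) / len(y)

def weighted_vote(classifiers, weights, x):
    """Sum each classifier's weight onto the label it predicts; return the top label."""
    scores = defaultdict(float)
    for clf, w in zip(classifiers, weights):
        scores[clf(x)] += w
    return max(scores, key=scores.get)

# Three toy weak classifiers of different forms (hypothetical stand-ins).
def clf_a(x): return int(x[0] > 0.5)
def clf_b(x): return int(x[1] > 0.5)
def clf_c(x): return int(x[0] + x[1] > 1.0)

X_val = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.1], [0.2, 0.3]]
y_val = [1, 0, 1, 0]
pool = [clf_a, clf_b, clf_c]
weights = [accuracy(c, X_val, y_val) for c in pool]  # assumed weighting rule

print(weights)
print(weighted_vote(pool, weights, [0.9, 0.9]))
```

In a full bagging setup each weak classifier would first be trained on its own random draw from the training set; only the combination step is shown here.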

4.

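Automatic Text Classification Based on Radial Basis Functions

黄翠玉. 《情报科学》, 2013, (5): 67-71

Text classification is still done mainly by hand. To reduce the uncertainty and errors of manual classification, the radial basis function (RBF) algorithm is introduced into an automatic text classification system. Experiments show that a classifier built with RBF performs well in automatic text classification: its average test score (F1) is higher than the F1 scores of both the BP and kNN classifiers.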

5.

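A Face Recognition Algorithm Based on Bayesian Multi-Classifier Fusion

顾伟, 刘文杰, 朱忠浩. 《人天科学研究》, 2014, (12): 65-67

Building on a study of Bayesian classifiers, this paper proposes a combined face recognition algorithm that joins the Bayesian classification rule with Fisher linear discriminant analysis, uses key local features of the face, and fuses multiple classifiers through a weighted sum of similarities. Experiments show that the method recognizes faces more accurately when no illumination processing is applied.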

6.

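An Automatic Classification System for Chinese Journals Based on a Bayes Classifier

萧莉明, 于宽, 蔡珣. 《现代情报》, 2007, 27(4): 146-147, 150

This paper designs an effective automatic classification system for Chinese journals based on a Bayesian classifier. First, the system treats the journal title as the only indexed content and uses automatic word segmentation to split titles into the sample set to be classified. Second, using a classification library built by training on library sample data, it applies a Bayesian classifier to classify Chinese journals automatically. Experiments show that the classifier is both efficient and accurate on Chinese journals.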

7.

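Damage Identification for Overhead Transmission Line Towers Using Random-Forest-Based Data Fusion

张晔, 杨国田. 《黑龙江科技信息》, 2014, (20): 102-102

Overhead transmission line towers are the main means of power transmission in China, and damage to them causes serious economic losses. This paper proposes a damage identification method for overhead transmission lines based on random forests and data fusion. First, multiple sensors acquire the vibration acceleration signals of the tower at different damage locations and severities, and wavelet packets decompose the signals over several levels. Next, the extracted band energy values form feature vectors that are fed into the corresponding random forests for training and testing. Finally, the secondary decisions of the multiple random forest classifiers are fused to reach a final decision on the tower's damage condition. The method is tested on a model of a 500 kV high-voltage transmission tower and compared with a single classifier. Analysis of the experimental data shows that the method identifies tower damage better than a single RF classifier and effectively improves on the recognition ability of a single classifier. It also shows good classification performance and fault tolerance.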

8.

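Research and Application of PCA and SVM in Classifying Esophageal Cancer Images of Xinjiang Kazakh Patients

《科技通报》, 2017, (2)

Objective: To use PCA and SVM for feature extraction, feature selection, and classification of X-ray images of esophageal cancer in Xinjiang Kazakh patients. Methods: Texture features based on the gray-level co-occurrence matrix and frequency-domain features from the wavelet transform are extracted. A two-step feature selection method is proposed that combines selection by area under the ROC curve with principal component analysis, and Bayes and SVM classifiers are used to classify the images in order to verify the discriminative power of the extracted features. Results: Features selected at the AUC 0.7 cutoff, after principal component analysis, yield the highest classification accuracy and AUC when fed into the SVM and Bayes classifiers: accuracies of 91% and 85% and AUC values of 0.945 and 0.915, respectively. Conclusions: SVM classifies well, and the two-step feature selection method effectively removes collinearity among features and greatly improves their discriminative power. The study is expected to improve the overall performance of CAD systems for esophageal cancer in Xinjiang Kazakh patients.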
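The first step of the two-step selection, filtering features by their single-feature AUC, can be sketched as below. This is an illustration only: the comparison rule (`auc > threshold`), the direction-agnostic treatment of AUC, and the function names are assumptions, and the second step (principal component analysis on the retained columns) is omitted.

```python
def feature_auc(values, labels):
    """AUC of one feature used directly as a score (Mann-Whitney formulation).

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the larger feature value; ties count as half.
    """
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_by_auc(X, y, threshold=0.7):
    """Return the column indices whose AUC exceeds the threshold (assumed rule)."""
    keep = []
    for j in range(len(X[0])):
        auc = feature_auc([row[j] for row in X], y)
        auc = max(auc, 1.0 - auc)  # a feature may separate classes in either direction
        if auc > threshold:
            keep.append(j)
    return keep

X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [0, 0, 1, 1]
print(select_by_auc(X, y))  # column 0 separates the classes, column 1 does not
```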

9.

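An Intent Classification Optimization Method Based on Co-training

邱云飞, 刘聪. 《现代情报》, 2019, 39(5): 57

[Purpose/Significance] To address feature sparsity, semantic ambiguity, and insufficient labeled data when purely statistical natural language processing techniques are used for intent classification of short texts from social networks, this paper proposes a Co-training intent classification method that incorporates psycholinguistic information. [Method/Process] First, to enrich semantic information, psycholinguistic cues carrying sentiment polarity are fused in during text feature extraction to extend the feature dimensions. Second, to address limited labeled data, a semi-supervised ensemble method is used during model training to co-train two machine learning classifiers (one based on event content expression and one based on emotional event expression). Finally, classification is performed by voting on the product of confidence scores. [Conclusions/Results] Experiments show that corpora enriched with psycholinguistic information and then co-trained yield better classification results.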
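Two pieces of this pipeline lend themselves to a short sketch: choosing confident pseudo-labels during a co-training round, and the final vote by confidence product. Both functions are assumptions about details the abstract leaves open; the 0.9 threshold, the dictionary representation of class probabilities, and the label names are hypothetical.

```python
def confident_pseudo_labels(probs, threshold=0.9):
    """Pick unlabeled items whose top class probability reaches the threshold.

    In co-training, items labeled confidently by one classifier are added
    to the training set of the other classifier.
    """
    picked = []
    for i, p in enumerate(probs):
        label = max(p, key=p.get)
        if p[label] >= threshold:
            picked.append((i, label))
    return picked

def product_vote(prob_a, prob_b):
    """Final decision: the class with the largest product of the two confidences."""
    scores = {c: prob_a[c] * prob_b[c] for c in prob_a.keys() & prob_b.keys()}
    return max(scores, key=scores.get)

# Output of a content-based view and a sentiment-based view for one short text.
content_view = {"purchase": 0.6, "other": 0.4}
emotion_view = {"purchase": 0.3, "other": 0.7}
print(product_vote(content_view, emotion_view))
```

The product rewards classes that both views support: here "purchase" scores 0.18 and "other" scores 0.28, so the second view's stronger confidence decides the outcome.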

10.

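A Classification System for Chinese Web Pages in Search Engine Results

周莹. 《科教文汇》, 2007, (13)

This paper designs and implements a classification system for Chinese web pages returned by search engines, including a method for determining the page type of keyword search results and extraction of page topic content. For pages that are hard to classify, it adopts summary-based clustering of web search results and a learning-based classifier design for web search results. Finally, a Chinese text classifier is constructed and implemented in code, and its performance is tested on examples.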

11.

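NN-MDC: An Improved Minimum Distance Classifier

郭亚琴, 王兴洲. 《人天科学研究》, 2009, (10)

To improve the performance of the minimum distance classifier (MDC), this paper proposes an improved MDC called NN-MDC. It first prunes the training samples, keeping or discarding each sample according to whether its class label matches that of its nearest neighbor, and then trains the classifier on the remaining samples. Experiments on standard UCI data sets show that the proposed NN-MDC achieves higher classification accuracy than MDC.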
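The pruning rule and the classifier it feeds are simple enough to sketch directly. The code below follows the description in the abstract (drop a training sample when its nearest neighbor has a different label, then fit class means), but the Euclidean metric, the tie-breaking by index, and the function names are assumptions.

```python
def dist2(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def prune(X, y):
    """Keep a sample only if its nearest neighbor carries the same label."""
    kept_X, kept_y = [], []
    for i, xi in enumerate(X):
        others = (k for k in range(len(X)) if k != i)
        nearest = min(others, key=lambda k: dist2(xi, X[k]))
        if y[nearest] == y[i]:
            kept_X.append(xi)
            kept_y.append(y[i])
    return kept_X, kept_y

def fit_mdc(X, y):
    """Minimum distance classifier: one mean vector per class."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(pts) for col in zip(*pts)]
            for c, pts in groups.items()}

def predict_mdc(means, x):
    return min(means, key=lambda c: dist2(means[c], x))

X = [[0.0, 0.0], [0.0, 0.2], [0.2, 0.0], [0.2, 0.2],
     [5.0, 5.0], [5.0, 5.2], [5.2, 5.0],
     [1.0, 1.0]]                      # last sample: class 1 label inside class 0 territory
y = [0, 0, 0, 0, 1, 1, 1, 1]

plain = fit_mdc(X, y)
pruned = fit_mdc(*prune(X, y))
print(predict_mdc(plain, [2.4, 2.4]), predict_mdc(pruned, [2.4, 2.4]))
```

In this toy set the mislabeled-looking sample pulls the class 1 mean toward class 0, so the plain MDC assigns the query to class 1; after pruning, the class 1 mean moves back and the query is assigned to class 0.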

12.

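Application of Naive Bayes to Text Classification

熊志斌, 刘冬. 《人天科学研究》, 2013, (2): 49-51

Naive Bayes is a typical machine learning technique that can be applied to text classification. The paper uses naive Bayes theory to describe how a Bayesian classifier is trained on samples and how it computes a classification, and builds a text classifier on that basis. Experiments show that naive Bayes classifies text well.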
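The training and classification computations can be sketched with a multinomial naive Bayes model. This is a generic textbook version rather than the paper's classifier: Laplace (add-one) smoothing, the skipping of out-of-vocabulary words, and the toy documents are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Collect class and per-class word counts."""
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab

def classify_nb(model, tokens):
    """Return the label maximizing log P(label) + sum of log P(word | label)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for w in tokens:
            if w in vocab:  # words never seen in training are skipped
                # Laplace smoothing keeps unseen (word, label) pairs non-zero.
                score += math.log((word_counts[label][w] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

docs = [(["goal", "match", "team"], "sports"),
        (["team", "win"], "sports"),
        (["stock", "market"], "finance"),
        (["market", "price", "stock"], "finance")]
model = train_nb(docs)
print(classify_nb(model, ["team", "match"]))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.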

13.

A distance-based weighting framework for boosting the performance of dynamic ensemble selection

《Information Processing & Management》, 2019, 56(4): 1300-1316

Dynamic Ensemble Selection (DES) is one of the most common and effective machine learning techniques for classification problems. DES systems aim to construct an ensemble of the most appropriate classifiers, selected from a candidate pool according to the competence level of each individual classifier. Since several classifiers are selected, how they are combined becomes crucial. However, most current DES approaches focus on combining the selected classifiers while ignoring the local information surrounding the query sample to be classified. To boost the performance of DES-based classification systems, this paper proposes a dynamic weighting framework for classifier fusion when obtaining the final output of a DES system. The proposed method first employs a DES approach to obtain a group of classifiers for a query sample. Then, the hypothesis vector of the selected ensemble is obtained from an analysis of consensus. Finally, a distance-based weighting scheme adjusts the hypothesis vector according to the closeness of the query sample to each class. The proposed method is tested on 30 real-world datasets with six well-known DES approaches, based on both homogeneous and heterogeneous ensembles. The results, supported by proper statistical tests, show that the method outperforms the original DES framework in terms of both accuracy and kappa measures.
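The last two steps (building a hypothesis vector from the selected classifiers and adjusting it by closeness to each class) might look like the sketch below. The abstract gives no formulas, so every concrete choice here is an assumption: the hypothesis vector as vote fractions, closeness as normalized inverse distance to class centroids, and the elementwise product used for the adjustment.

```python
import math

def hypothesis_vector(votes, classes):
    """Fraction of the selected classifiers voting for each class."""
    return [votes.count(c) / len(votes) for c in classes]

def closeness_weights(query, centroids):
    """Normalized inverse distance from the query to each class centroid (assumed form)."""
    inv = [1.0 / (math.dist(query, c) + 1e-9) for c in centroids]
    total = sum(inv)
    return [v / total for v in inv]

def adjust(hypothesis, weights):
    """Reweight the hypothesis vector by closeness and renormalize."""
    prod = [h * w for h, w in zip(hypothesis, weights)]
    total = sum(prod)
    return [p / total for p in prod]

classes = [0, 1]
votes = [0, 0, 1, 1, 1]                 # outputs of five dynamically selected classifiers
centroids = [[0.0, 0.0], [4.0, 0.0]]    # hypothetical class centroids
query = [1.0, 0.0]

h = hypothesis_vector(votes, classes)
adjusted = adjust(h, closeness_weights(query, centroids))
print(h, adjusted)
```

In this toy case the plain vote favors class 1 (0.4 versus 0.6), but the query lies three times closer to class 0, so the adjusted vector favors class 0.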

14.

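Application of Genetic Algorithms to Technical Analysis for Stock Investment

冯平, 宣慧玉. 《预测》, 2001, 20(2): 38-41

This paper first analyzes the characteristics of technical analysis for stock investment, then presents the basic theory of genetic algorithms and of classifier systems based on genetic algorithms. Finally, it discusses in detail how genetic algorithms and classifier systems can be used to computerize the two most common methods of technical analysis for stock investment (indicator analysis and chart analysis).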

15.

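A Facial Landmark Detection Method Combining Improved LBP Features with Random Forest

王鹏, 葛红. 《人天科学研究》, 2013, (5): 139-141

This paper proposes a facial landmark detection method that uses a small number of frontal images and does not require normalized face images, whereas traditional facial landmark detection methods require strict image preprocessing. Random forest is a classifier ensemble algorithm that handles multi-class problems well, and although the LBP feature is simple, it captures a large amount of texture information. The improved LBP feature is combined with random forest to form a facial landmark detection method. LBP features are extracted from Gaussian-smoothed images to generate a feature for each point; the useful features are computed as positive examples and combined with a set of negative examples to form the training set. Classification with the random forest classifier yields a low error rate of only about 10%.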
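For readers unfamiliar with LBP, the basic 3x3 operator is sketched below. The abstract does not describe its improved variant, so this shows only the standard operator; the clockwise bit order starting at the top-left neighbor and the `>=` comparison are common conventions chosen here as assumptions.

```python
def lbp_code(img, r, c):
    """Basic 3x3 local binary pattern code of the pixel at (r, c).

    Each of the 8 neighbors contributes one bit: 1 if the neighbor is at
    least as bright as the center pixel, else 0.
    """
    center = img[r][c]
    # Clockwise from the top-left neighbor (assumed bit order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

img = [[5, 9, 1],
       [4, 6, 7],
       [2, 3, 8]]
print(lbp_code(img, 1, 1))  # neighbors 9, 7 and 8 are >= 6, giving bits 1, 3 and 4
```

A histogram of such codes over a patch around a candidate point is a typical texture descriptor that can then be passed to a classifier such as a random forest.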

16.

Discourse-aware rumour stance classification in social media using sequential classifiers

Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein. 《Information Processing & Management》, 2018, 54(2): 273-290

Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, querying or commenting on an earlier post, is of increasing interest to researchers. While most previous work has focused on using individual tweets as classifier inputs, here we report on the performance of sequential classifiers that exploit the discourse features inherent in social media interactions or ‘conversational threads’. Testing the effectiveness of four sequential classifiers – Hawkes Processes, Linear-Chain Conditional Random Fields (Linear CRF), Tree-Structured Conditional Random Fields (Tree CRF) and Long Short Term Memory networks (LSTM) – on eight datasets associated with breaking news stories, and looking at different types of local and contextual features, our work sheds new light on the development of accurate stance classifiers. We show that sequential classifiers that exploit discourse properties in social media conversations while using only local features outperform non-sequential classifiers. Furthermore, we show that LSTM using a reduced set of features can outperform the other sequential classifiers; this performance is consistent across datasets and across types of stances. To conclude, our work also analyses the different features under study, identifying those that best help characterise and distinguish between stances, such as supporting tweets being more likely to be accompanied by evidence than denying tweets. We also set forth a number of directions for future research.

17.

Multi-source information fusion to identify water supply pipe leakage based on SVM and VMD

《Information Processing & Management》, 2022, 59(2): 102819

To address the low leakage recognition rate of water pipes caused by the influence of operating conditions in practice, a multi-source information fusion recognition method based on VMD and SVM is proposed. The method first uses VMD to decompose the acoustic vibration signal of the water pipe, and a principle for selecting IMF components is then proposed. The selected IMF components are used to extract three VMD feature vectors: the kurtosis vector, the sample entropy vector, and the center frequency vector. Because the feature vectors differ greatly in their sensitivity to different operating conditions, the three are combined into a new feature vector by multi-source information fusion, which is finally input into an SVM classifier for leak recognition. A comparison of experimental results shows that the method effectively distinguishes pipe leak signals from those of other operating conditions. The recognition accuracy reaches 98.75%, which is 1.04 times higher than that of the SVM sorting technique, 1.18 times higher than that of SVM classification based on the sample entropy vector of VMD, 1.14 times higher than that of SVM classification based on the kurtosis vector of VMD, and 1.11 times higher than that of SVM classification based on the center frequency vector of VMD.
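The fusion step itself can be illustrated in a few lines. The sketch assumes that fusion means concatenating the three per-IMF feature vectors, which the abstract suggests but does not state, and it computes only the kurtosis feature; the VMD decomposition, sample entropy, and center frequency are not implemented, and the numeric values for the latter two are made-up placeholders.

```python
def kurtosis(signal):
    """Kurtosis as the fourth central moment divided by the squared variance."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((v - mean) ** 2 for v in signal) / n
    m4 = sum((v - mean) ** 4 for v in signal) / n
    return m4 / (m2 * m2)

def fuse(*vectors):
    """Concatenate feature vectors into one fused vector (assumed fusion rule)."""
    return [v for vec in vectors for v in vec]

# Two toy signals standing in for selected IMF components.
imfs = [[1.0, -1.0, 1.0, -1.0], [0.0, 2.0, 0.0, -2.0]]
kurtosis_vec = [kurtosis(s) for s in imfs]
entropy_vec = [0.31, 0.58]    # placeholder values, not computed here
frequency_vec = [120.0, 45.0]  # placeholder values, not computed here

fused = fuse(kurtosis_vec, entropy_vec, frequency_vec)
print(fused)
```

The fused vector would then serve as the input to the SVM classifier, so that features sensitive to different operating conditions are all available to it at once.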

18.

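Design of an Email Filtering System Based on Text Classification

浦海晨, 万晓冬. 《科技广场》, 2005, (6): 21-24

The flood of spam creates an urgent technical need. This article presents a spam filtering system model based on text classification: it first describes the overall workflow of the system, then explains its key components, including text segmentation, text feature extraction, and the Winnow linear classifier.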
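Winnow is a mistake-driven linear classifier over binary features that updates its weights multiplicatively. A minimal sketch follows; it is the textbook variant with promotion and demotion, not necessarily the configuration used in the article, and the threshold of half the feature count, the factor `alpha=2.0`, and the four-word vocabulary are assumptions.

```python
def winnow_predict(w, theta, x):
    """Predict 1 (spam) when the summed weight of active features exceeds theta."""
    return int(sum(wi for wi, xi in zip(w, x) if xi) > theta)

def winnow_train(samples, n_features, alpha=2.0, epochs=10):
    """Train Winnow: multiply active weights by alpha on a missed positive,
    divide them by alpha on a false positive, and leave them alone otherwise."""
    w = [1.0] * n_features
    theta = n_features / 2  # assumed threshold
    for _ in range(epochs):
        for x, y in samples:
            if winnow_predict(w, theta, x) == y:
                continue
            factor = alpha if y == 1 else 1.0 / alpha
            w = [wi * factor if xi else wi for wi, xi in zip(w, x)]
    return w, theta

# Binary bag-of-words features: [free, winner, meeting, report] (hypothetical vocabulary).
samples = [([1, 0, 0, 0], 1), ([0, 1, 0, 0], 1),
           ([0, 0, 1, 0], 0), ([0, 0, 0, 1], 0),
           ([1, 0, 1, 0], 1), ([0, 0, 1, 1], 0)]
w, theta = winnow_train(samples, n_features=4)
print(w, winnow_predict(w, theta, [1, 0, 0, 1]))
```

Because updates are multiplicative, the weights of the few words that actually indicate spam grow quickly while irrelevant words stay near their initial weight, which suits the high-dimensional sparse features produced by text segmentation.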