Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Classical supervised machine learning (ML) follows the closed-world assumption. However, this assumption does not hold in an open-world dynamic environment, so automated systems must be able to discover and identify unseen instances. Open-world ML can deal with unseen instances and classes through a two-step process: (1) discover and classify unseen instances and (2) identify novel classes discovered in step (1). Most existing research on open-world machine learning (OWML) focuses only on step 1. However, performing step 2 is required to build intelligent systems. The proposed framework comprises three different but interconnected modules that discover and identify unseen classes. Our in-depth performance evaluation establishes that the proposed framework improves open accuracy by up to 8% compared to the state-of-the-art models.
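
The first step of the two-step process above (separating unseen instances from known ones) can be illustrated with a minimal sketch. This is not the framework described in the abstract, whose three modules are not detailed here; it assumes a simple confidence-threshold rejection rule, and the function name `predict_open_world` and the threshold value are hypothetical.

```python
from typing import Dict, Optional


def predict_open_world(probs: Dict[str, float], threshold: float = 0.7) -> Optional[str]:
    """Return the most probable known class, or None when the instance looks unseen.

    Assumption: `probs` holds a closed-world classifier's class probabilities,
    and low maximum confidence is treated as a sign of an unseen instance.
    """
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    return label if confidence >= threshold else None


# A confident prediction is accepted; an ambiguous one is flagged as unseen.
known = predict_open_world({"cat": 0.92, "dog": 0.08})
unseen = predict_open_world({"cat": 0.55, "dog": 0.45})
```

Instances flagged as unseen (returned as `None`) would then be passed to step 2, where novel classes are identified; that step is outside the scope of this sketch.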

2.
Many machine learning algorithms have been applied to text classification tasks. In the machine learning paradigm, a general inductive process automatically builds a text classifier by learning, generally known as supervised learning. However, supervised learning approaches have some problems. The most notable problem is that they require a large number of labeled training documents for accurate learning. While unlabeled documents are plentiful and easily collected, labeled documents are difficult to generate because the labeling task must be done by humans. In this paper, we propose a new text classification method based on unsupervised or semi-supervised learning. The proposed method launches text classification tasks with only unlabeled documents and the title word of each category for learning, and then automatically learns a text classifier by using bootstrapping and feature projection techniques. Experimental results show that the proposed method achieves reasonably useful performance compared to a supervised method. If the proposed method is used in a text classification task, building text classification systems will become significantly faster and less expensive.
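
To make the idea of starting from title words concrete, here is a minimal sketch of one possible seeding step. It is an assumption, not the paper's bootstrapping or feature projection procedure: a document receives a provisional label only when exactly one category's title word appears in it. The function name `seed_label` and the example categories are hypothetical.

```python
from typing import Dict, Optional


def seed_label(doc: str, title_words: Dict[str, str]) -> Optional[str]:
    """Assign a provisional category when exactly one title word occurs in the document.

    `title_words` maps each category name to its single title word.
    Documents with zero or several matches stay unlabeled (None).
    """
    tokens = set(doc.lower().split())
    matches = [cat for cat, word in title_words.items() if word.lower() in tokens]
    return matches[0] if len(matches) == 1 else None


categories = {"sports": "football", "finance": "bank"}
first = seed_label("The team won the football match", categories)
second = seed_label("The bank sponsored a football club", categories)
```

The provisionally labeled documents could then serve as the initial training set that a bootstrapping loop grows iteratively.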

3.
In recent years, Zero-shot Node Classification (ZNC), an emerging and more difficult task in which the classes of testing nodes are unobserved in the training stage, has started to attract attention. Existing studies for ZNC mainly utilize Graph Neural Networks (GNNs) to construct the feature subspace to align with the classes’ semantic subspace, thus enabling knowledge transfer from seen classes to unseen classes. However, the modeling of the node feature is single-view and unilateral, e.g., the bag-of-words vector, which is not enough to fully describe the characteristics of the node itself. To address this dilemma, we propose the Multi-View Enhanced zero-shot node classification paradigm (MVE) to promote the machine’s generality and bring it closer to a human-like thinking mode. Specifically, multi-view features are obtained from different aspects, such as pre-trained model embeddings, knowledge graphs, and statistical methods, and then fused by a contrastive learning module into the compositional node representation. Meanwhile, a developed Graph Convolutional Network (GCN) is used to make the nodes fully absorb the information of neighbors, while the over-smoothing issue is alleviated by the multi-view features and the proposed contrastive learning mechanism. Experimental results on three public datasets show an average 25% improvement compared to baseline methods, demonstrating the superiority of our multi-view learning framework. The code and data can be found at https://github.com/guaiqihen/MVE.

4.
Users of social media websites tend to rapidly spread breaking news and trending stories without considering their truthfulness. This facilitates the spread of rumors through social networks. A rumor is a story or statement whose truthfulness has not been verified. Efficiently detecting and acting upon rumors throughout social networks is of high importance to minimizing their harmful effect. However, detecting them is not a trivial task, because they belong to unseen topics or events that are not covered in the training dataset. In this paper, we study the problem of detecting breaking news rumors, instead of long-lasting rumors, that spread in social media. We propose a new approach that jointly learns word embeddings and trains a recurrent neural network with two different objectives to automatically identify rumors. The proposed strategy is simple but effective at mitigating topic shift issues. Emerging rumors do not have to be false at the time of detection; they can later be deemed true or false. However, most previous studies on rumor detection focus on long-standing rumors and assume that rumors are always false. In contrast, our experiment simulates a cross-topic emerging rumor detection scenario with a real-life rumor dataset. Experimental results suggest that our proposed model outperforms state-of-the-art methods in terms of precision, recall, and F1.

5.
Few-shot intent recognition aims to identify a user’s intent from an utterance with limited training data. Many existing methods rely mainly on the generic knowledge acquired on the base classes to identify the novel classes. Such methods typically ignore the characteristics of each meta task itself, and therefore cannot make full use of the limited given samples when classifying unseen classes. To deal with such issues, we propose a Contrastive learning-based Task Adaptation model (CTA) for few-shot intent recognition. In detail, we leverage contrastive learning to help achieve task adaptation and make full use of the limited samples of novel classes. First, a self-attention layer is employed in the task adaptation module, which aims to establish interactions between samples of different categories so that new representations are task-specific rather than relying entirely on the base classes. Then, the contrastive-based loss functions and the semantics of the label name are respectively used for reducing the similarity between sample representations in different categories while increasing it in the same categories. Experimental results on a public dataset, OOS, verify the effectiveness of our proposal by beating the competitive baselines in terms of accuracy. In addition, we conduct cross-domain experiments on three datasets, i.e., OOS, SNIPS, and ATIS. We find that CTA gains obvious improvements in terms of accuracy in all cross-domain experiments, indicating that it has a better generalization ability than other competitive baselines in both cross-domain and single-domain settings.

6.
Authorship disambiguation is an urgent issue that affects the quality of digital library services and for which supervised solutions have been proposed, delivering state-of-the-art effectiveness. However, particular challenges such as the prohibitive cost of labeling vast amounts of examples (there are many ambiguous authors), the huge hypothesis space (there are several features and authors from which many different disambiguation functions may be derived), and the skewed author popularity distribution (few authors are very prolific, while most appear in only a few citations) may prevent such techniques from reaching their full potential. In this article, we introduce an associative author name disambiguation approach that identifies authorship by extracting, from training examples, rules associating citation features (e.g., coauthor names, work title, publication venue) to specific authors. As our main contribution we propose three associative author name disambiguators: (1) EAND (Eager Associative Name Disambiguation), our basic method that explores association rules for name disambiguation; (2) LAND (Lazy Associative Name Disambiguation), which extracts rules on a demand-driven basis at disambiguation time, reducing the hypothesis space by focusing on examples that are most suitable for the task; and (3) SLAND (Self-Training LAND), which extends LAND with self-training capabilities, thus drastically reducing the number of examples required for building effective disambiguation functions, besides being able to detect novel/unseen authors in the test set. Experiments demonstrate that all our disambiguators are effective and that, in particular, SLAND is able to outperform state-of-the-art supervised disambiguators, providing gains that range from 12% to more than 400%, being extremely effective and practical.

7.
Few-Shot Event Classification (FSEC) aims at assigning event labels to unlabeled sentences when limited annotated samples are available. Existing works mainly focus on using meta-learning to overcome the low-resource problem, which still requires abundant held-out classes for model learning and selection. Thus, we propose to deal with the low-resource problem by utilizing prompts. Further, existing methods suffer from severe trigger biases that may result in the context being ignored. That is, correct classifications are obtained by looking only at the triggers, which hurts the model’s generalization ability. Thus, we propose a knowledgeable augmented-trigger prompt FSEC framework (AugPrompt), which can overcome the bias issues and alleviate the classification bottleneck brought by insufficient data. In detail, we first design an External Knowledge Injection (EKI) module to incorporate an external knowledge base (Related Words) for trigger augmentation. Then, we propose an Event Prompt Generation (EPG) module to generate appropriate discrete prompts for initializing the continuous prompts. After that, we propose an Event Prompt Tuning (EPT) module to automatically search prompts in the continuous space for FSEC and finally predict the corresponding event types of the inputs. We conduct extensive experiments on two public English datasets for FSEC, i.e., FewEvent and RAMS. The experimental results show the superiority of our proposal over the competitive baselines, where the maximum accuracy increase compared to the strongest baseline reaches 10.8%.

8.
Irony as a literary technique is widely used in online texts such as Twitter posts. Accurate irony detection is crucial for tasks such as effective sentiment analysis. A text’s ironic intent is defined by its context incongruity. For example, in the phrase “I love being ignored”, the irony is defined by the incongruity between the positive word “love” and the negative context of “being ignored”. Existing studies mostly formulate irony detection as a standard supervised learning text categorization task, relying on explicit expressions for detecting context incongruity. In this paper, we instead formulate irony detection as a transfer learning task where supervised learning on irony-labeled text is enriched with knowledge transferred from external sentiment analysis resources. Importantly, we focus on identifying the hidden, implicit incongruity without relying on explicit incongruity expressions, as in “I like to think of myself as a broken down Justin Bieber – my philosophy professor.” We propose three transfer learning-based approaches to using sentiment knowledge to improve the attention mechanism of recurrent neural models for capturing hidden patterns for incongruity. Our main findings are: (1) using sentiment knowledge from external resources is a very effective approach to improving irony detection; (2) for detecting implicit incongruity, transferring deep sentiment features seems to be the most effective way. Experiments show that our proposed models outperform state-of-the-art neural models for irony detection.

9.
Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives that identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen during training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic contents. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.
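
The character n-gram representation mentioned above can be sketched in a few lines. This is a generic illustration of the feature type, not the paper's feature pipeline; the function name `char_ngrams` and the default `n` are hypothetical choices.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count every overlapping character n-gram in `text`.

    Texts shorter than `n` produce an empty Counter.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Bigram counts for a short string.
bigrams = char_ngrams("banana", n=2)
```

Unlike word n-grams, these counts capture sub-word patterns such as punctuation habits and affixes, which is one reason they are often used as style features.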

10.
Sequential recommendation models a user’s historical sequence to predict future items. Existing studies utilize deep learning methods and contrastive learning for data augmentation to alleviate data sparsity. However, these existing methods cannot learn accurate, high-quality item representations while augmenting data. In addition, they usually ignore data noise and user cold-start issues. To solve the above issues, we investigate the possibility of combining a Generative Adversarial Network (GAN) with contrastive learning for sequential recommendation to balance data sparsity and noise. Specifically, we propose a new framework, Enhanced Contrastive Learning with Generative Adversarial Network for Sequential Recommendation (ECGAN-Rec), which models the training process as a GAN and the recommendation task as the main task of the discriminator. We design a sequence augmentation module and a contrastive GAN module to implement both data-level and model-level augmentations. In addition, the contrastive GAN learns more accurate, high-quality item representations to alleviate data noise after data augmentation. Furthermore, we propose an enhanced Transformer recommender based on GAN to optimize the performance of the model. Experimental results on three open datasets validate the efficiency and effectiveness of the proposed model and its ability to balance data noise and data sparsity. Specifically, the improvements of ECGAN-Rec in two evaluation metrics (HR@N and NDCG@N) over the state-of-the-art model on the Beauty, Sports and Yelp datasets are 34.95%, 36.68%, and 13.66%, respectively. Our implemented model is available via https://github.com/nishawn/ECGANRec-master.
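
The two metrics named above, HR@N and NDCG@N, can be sketched as follows. The abstract does not state the evaluation protocol, so this sketch assumes the common setting with a single held-out target item per user, in which the ideal DCG equals 1. The function names are hypothetical.

```python
import math
from typing import Sequence


def hit_rate_at_n(ranked: Sequence[str], target: str, n: int) -> float:
    """HR@N for one user: 1.0 if the target item is among the top-N, else 0.0."""
    return 1.0 if target in ranked[:n] else 0.0


def ndcg_at_n(ranked: Sequence[str], target: str, n: int) -> float:
    """NDCG@N for one user with a single relevant item (ideal DCG is 1)."""
    top = list(ranked[:n])
    if target not in top:
        return 0.0
    rank = top.index(target) + 1  # 1-indexed position of the target
    return 1.0 / math.log2(rank + 1)


ranking = ["a", "b", "c", "d"]
hr = hit_rate_at_n(ranking, "c", n=3)
ndcg = ndcg_at_n(ranking, "c", n=3)  # target at rank 3 -> 1 / log2(4)
```

Dataset-level scores are obtained by averaging these per-user values; HR@N only records whether the target was retrieved, while NDCG@N also rewards placing it higher in the list.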

11.
Deep multi-view clustering (MVC) aims to mine and employ the complex relationships among views to learn compact data clusters with deep neural networks in an unsupervised manner. More recent deep contrastive learning (CL) methods have shown promising performance in MVC by learning cluster-oriented deep feature representations, which is realized by contrasting positive and negative sample pairs. However, most existing deep contrastive MVC methods focus only on one-sided contrastive learning, such as feature-level or cluster-level contrast, failing to integrate the two sides or bring in more important aspects of contrast. Additionally, most of them work in a separate two-stage manner, i.e., first feature learning and then data clustering, so the two stages fail to benefit each other. To address the above challenges, in this paper we propose a novel joint contrastive triple-learning framework to learn multi-view discriminative feature representations for deep clustering, which is threefold, i.e., feature-level alignment-oriented and commonality-oriented CL, and cluster-level consistency-oriented CL. The former two submodules aim to contrast the encoded feature representations of data samples at different feature levels, while the last contrasts the data samples in the cluster-level representations. Benefiting from the triple contrast, more discriminative representations of views can be obtained. Meanwhile, a view weight learning module is designed to learn and exploit the quantitative complementary information across the learned discriminative features of each view. Thus, the contrastive triple-learning module, the view weight learning module, and the data clustering module with these fused features are jointly performed, so that these modules are mutually beneficial. Extensive experiments on several challenging multi-view datasets show the superiority of the proposed method over many state-of-the-art methods, especially the large improvements of 15.5% and 8.1% on Caltech-4V and CCV in terms of accuracy. Due to its promising performance on visual datasets, the proposed method can be applied to many practical visual applications such as visual recognition and analysis. The source code of the proposed method is provided at https://github.com/ShizheHu/Joint-Contrastive-Triple-learning.

12.
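Li Ying, Yu Qian. Bulletin of Science and Technology (《科技通报》), 2012, 28(8): 29-32
Dimensionality reduction is a key step in the classification and recognition of pulmonary nodules. Existing methods all reduce the dimensionality of the data from every class as a whole, ignoring the differences between classes in terms of feature subsets. This paper proposes a supervised manifold feature extraction approach that combines class sets and class pairs, and applies it to pulmonary nodule classification, ultimately forming a pulmonary nodule classification system based on CT images. Experimental results demonstrate the effectiveness of the method.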

13.
This research presents an enhanced approach for Aspect-Based Sentiment Analysis (ABSA) of Arabic hotel reviews using supervised machine learning. In line with state-of-the-art research, the proposed approach trains a set of classifiers with morphological, syntactic, and semantic features to address three research tasks: (a) T1: Aspect Category Identification, (b) T2: Opinion Target Expression (OTE) Extraction, and (c) T3: Sentiment Polarity Identification. The classifiers employed include Naïve Bayes, Bayes Networks, Decision Tree, K-Nearest Neighbor (K-NN), and Support-Vector Machine (SVM). The approach was evaluated using a reference dataset based on the Semantic Evaluation 2016 workshop (SemEval-2016: Task-5). Results show that the supervised learning approach outperforms related work evaluated using the same dataset. More precisely, evaluation results show that all classifiers in the proposed approach outperform the baseline approach, and the overall enhancement for the best performing classifier (SVM) is around 53% for T1, around 59% for T2, and around 19% for T3.

14.
Previous federated recommender systems are based on traditional matrix factorization, which can improve personalized service but is vulnerable to gradient inference attacks. Most of them adopt model averaging to fit the data heterogeneity of federated recommender systems, which increases training costs. To address privacy and efficiency, we propose an efficient federated item similarity model for heterogeneous recommendation, called FedIS, which can train a global item-based collaborative filtering model to eliminate user feature dependencies. Specifically, we extend the neural item similarity model to the federated setting, where each client only locally optimizes the shared item feature matrix. We then propose a fast-convergent federated aggregation method inspired by meta-learning to address heterogeneous user updates and accelerate the convergence of global training. Furthermore, we propose a two-stage perturbation method to protect both local training and transmission while reducing communication costs. Finally, extensive experiments on four real-world datasets validate that FedIS can provide more competitive performance on federated recommendations. Our proposed method also shows significant training efficiency with less performance degradation.
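
For context, the model averaging that the abstract attributes to earlier systems can be sketched as a size-weighted mean of client parameters. This illustrates that baseline only, not the meta-learning-inspired aggregation of FedIS, whose details are not given here. The function name `average_models` and the flat-list parameter layout are assumptions.

```python
from typing import List, Sequence


def average_models(client_weights: Sequence[Sequence[float]],
                   client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighting each client by its data size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients with 1 and 3 local samples; the larger client dominates the mean.
global_weights = average_models([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because every round blends all client updates uniformly by data size, heterogeneous clients can pull the global model in conflicting directions, which is the training cost the abstract refers to.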

15.
With the widespread application of 3D capture devices, diverse 3D object datasets from different domains have emerged recently. Consequently, retrieving 3D objects from different domains is becoming a significant and challenging task. Existing approaches mainly focus on retrieval within a single dataset, which significantly constrains their use in real-world applications. This paper addresses cross-domain object retrieval in an unsupervised manner, where the labels of samples from the source domain are provided while the labels of samples from the target domain are unknown. We propose a joint deep feature learning and visual domain adaptation method (Deep-VDA) to solve the cross-domain 3D object retrieval problem through end-to-end learning. Specifically, benefiting from the advantages of deep learning networks, Deep-VDA employs MVCNN for deep feature extraction and domain alignment for unsupervised domain adaptation. The framework minimizes the statistical and geometric shift between domains in an unsupervised manner by preserving both the common and the unique characteristics of each domain. Deep-VDA can improve the robustness of object features from different domains, which is important for maintaining strong retrieval performance.

16.
Big data generated by social media is a valuable source of information that offers an excellent opportunity to mine valuable insights. In particular, user-generated content such as reviews, recommendations, and users’ behavior data is useful for supporting the marketing activities of many companies. Knowing what users are saying, through reviews in social media, about the products they bought or the services they used is a key factor in decision making. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Although deep learning for sentiment analysis has achieved great success and allowed several firms to analyze and extract relevant information from their textual data, a model that runs in a traditional environment cannot remain effective as the volume of data grows, which underlines the importance of efficient distributed deep learning models for social big data analytics. Besides, social media analysis is known to be a complex process that involves a set of complex tasks. Therefore, it is important to address the challenges and issues of social big data analytics and to enhance the performance of deep learning techniques in terms of classification accuracy to obtain better decisions. In this paper, we propose an approach for sentiment analysis that adopts fastText with Recurrent Neural Network (RNN) variants to represent textual data efficiently. It then employs the new representations to perform the classification task. Its main objective is to enhance the performance of well-known RNN variants in terms of classification accuracy and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics. It is designed to ingest, store, process, index, and visualize huge amounts of information in real time. The proposed system adopts distributed machine learning with our proposed method for enhancing decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants (i.e., Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU)). Specifically, we tested the efficiency of our approach using the three different deep learning models. The results show that our proposed approach is able to enhance the performance of the three models. The current work can provide several benefits for researchers and practitioners who want to collect, handle, analyze, and visualize several sources of information in real time. It can also contribute to a better understanding of public opinion and user behaviors using our proposed system with the improved variants of the most powerful distributed deep learning and machine learning algorithms. Furthermore, it is able to increase the classification accuracy of several existing works based on RNN models for sentiment analysis.

17.
While image-to-image translation has been extensively studied, existing methods designed for transformation between instances of different shapes from different domains have a number of limitations. In this paper, we propose a novel approach (hereafter referred to as ObjectVariedGAN) to handle geometric translation. One may encounter large and significant shape changes during image-to-image translation, especially object transfiguration. Thus, we focus on synthesizing the desired results to maintain the shape of the foreground object without requiring paired training data. Specifically, our proposed approach learns the mapping between source domains and target domains, where the shapes of objects differ significantly. Feature similarity loss is introduced to encourage generative adversarial networks (GANs) to obtain the structure attribute of objects (e.g., object segmentation masks). Additionally, to satisfy the requirement of utilizing unaligned datasets, cycle-consistency loss is combined with context-preserving loss. Our approach feeds the generator with source image(s), incorporated with the instance segmentation mask, and guides the network to generate the desired target domain output. To verify the effectiveness of the proposed approach, extensive experiments are conducted on pre-processed examples from the MS-COCO dataset. A comparative summary of the findings demonstrates that ObjectVariedGAN outperforms other competing approaches in terms of Inception Score, Frechet Inception Distance, and human cognitive preference.

18.
Cross-genre author profiling aims to build generalized models for predicting profile traits of authors that can be helpful across different text genres for computer forensics, marketing, and other applications. The cross-genre author profiling task becomes challenging when dealing with low-resourced languages due to the lack of standard corpora and methods. The task becomes even more challenging when the data is code-switched, which is informal and unstructured. In previous studies, the problem of cross-genre author profiling has been mainly explored for mono-lingual texts in highly resourced languages (English, Spanish, etc.). However, it has not been thoroughly explored for code-switched text, which is widely used for communication over social media. To fill this gap, we propose a transfer learning-based solution for the cross-genre author profiling task on code-switched (English–RomanUrdu) text using three widely known genres: Facebook comments/posts, Tweets, and SMS messages. In this article, we first experiment with traditional machine learning, deep learning, and pre-trained transfer learning models (MBERT, XLMRoBERTa, ULMFiT, and XLNET) for the same-genre and cross-genre gender identification task. We then propose a novel Trans-Switch approach that focuses on the code-switching nature of the text and trains on specialized language models. In addition, we developed three RomanUrdu-to-English translated corpora to study the impact of translation on author profiling tasks. The results show that the proposed Trans-Switch model outperforms the baseline deep learning and pre-trained transfer learning models for the cross-genre author profiling task on code-switched text. Further, the experiments show that translating RomanUrdu text does not improve results.

19.
20.
Question classification (QC) involves classifying a given question based on the expected answer type and is an important task in Question Answering (QA) systems. Existing approaches for question classification use the full training dataset to fine-tune the models. However, developing large labelled datasets is expensive and time-consuming. Hence, there is a need to develop approaches that can achieve comparable or state-of-the-art performance using limited training instances. In this paper, we propose an approach that uses data augmentation as a tool to generate additional training instances. We evaluate our proposed approach on two question classification datasets, namely TREC and ICHI. Experimental results show that our proposed approach (a) reduces the requirement of labelled instances by up to 81.7% and achieves a new state-of-the-art accuracy of 98.11 on the TREC dataset and (b) reduces it by up to 75% and achieves 67.9 on the ICHI dataset.
