Similar articles
1.
Multimodal relation extraction is a critical task in information extraction. It aims to predict the class of relation between head and tail entities from linguistic sequences and related images. However, current works are vulnerable to less relevant visual objects detected in images and cannot sufficiently fuse visual information into text pre-trained models. To overcome these problems, we propose a Two-Stage Visual Fusion Network (TSVFN) that applies multimodal fusion to vision-enhanced entity relation extraction. In the first stage, we design multimodal graphs, whose novelty lies mainly in transforming sequence learning into graph learning. In the second stage, we merge the transformer-based visual representation into the text pre-trained model through a multi-scale cross-model projector. Specifically, two multimodal fusion operations are implemented inside the pre-trained model. Together, the two fusion stages achieve deep interaction of multimodal, multi-structured data. In extensive experiments on the MNRE dataset, our model outperforms the current state-of-the-art method by 1.76%, 1.52%, 1.29%, and 1.17% in accuracy, precision, recall, and F1 score, respectively. Moreover, our model also achieves excellent results when fewer samples are available.

2.
Stance detection aims to determine whether the author of a text supports, opposes, or is neutral towards a given target. In most real-world scenarios, stance detection needs to work in a zero-shot manner, i.e., predicting stances for unseen targets without labeled data. One critical challenge of zero-shot stance detection is the absence of contextual information about the targets. Current works mostly concentrate on introducing external knowledge to supplement information about targets, but the noisy schema-linking process hinders their performance in practice. To address this issue, we argue that previous studies have ignored the extensive target-related information contained in the unlabeled data during the training phase, and we propose a simple yet efficient Multi-Perspective Contrastive Learning Framework for zero-shot stance detection. Our framework is capable of leveraging information not only from labeled data but also from extensive unlabeled data. To this end, we design target-oriented contrastive learning and label-oriented contrastive learning to capture more comprehensive target representations and more distinguishable stance features. We conduct extensive experiments on three widely adopted datasets (from 4,870 to 33,090 instances), namely SemEval-2016, WT-WT, and VAST. Our framework achieves 53.6%, 77.1%, and 72.4% macro-average F1 scores on these three datasets, showing 2.71% and 0.25% improvements over state-of-the-art baselines on the SemEval-2016 and WT-WT datasets and comparable results on the more challenging VAST dataset.

3.
Multimodal sentiment analysis aims to judge the sentiment of multimodal data uploaded by Internet users to social media platforms. On the one hand, existing studies focus on the fusion mechanism for multimodal data such as text, audio, and visual content, but ignore the similarity between text and audio and between text and visual content, as well as the heterogeneity of audio and visual content, which biases sentiment analysis. On the other hand, multimodal data brings noise irrelevant to sentiment analysis, which reduces the effectiveness of fusion. In this paper, we propose a Polar-Vector and Strength-Vector mixer model called PS-Mixer, which is based on MLP-Mixer, to achieve better communication between data of different modalities for multimodal sentiment analysis. Specifically, we design a Polar-Vector (PV) and a Strength-Vector (SV) to judge the polarity and the strength of sentiment separately. PV is obtained from the communication between text and visual features and decides whether the sentiment is positive, negative, or neutral. SV is obtained from the communication between text and audio features and estimates the sentiment strength in the range of 0 to 3. Furthermore, we devise an MLP-Communication module (MLP-C), composed of several fully connected layers and activation functions, that makes the different modal features interact fully in both the horizontal and the vertical directions. This is a novel attempt to use MLPs for multimodal information communication. Finally, we mix PV and SV to obtain a fusion vector used to judge the sentiment state. The proposed PS-Mixer is tested on two publicly available datasets, CMU-MOSEI and CMU-MOSI, and achieves state-of-the-art (SOTA) performance on CMU-MOSEI compared with baseline methods. The code is available at: https://github.com/metaphysicser/PS-Mixer.

4.
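This paper first reviews the state of research on the analysis of digital TV viewing behavior. It then proposes introducing clickstream data warehousing, data mining, and customer relationship management techniques into digital TV audience behavior analysis, and discusses the feasibility of, and an approach to, building a digital TV clickstream data warehouse application system.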

5.
Imbalanced sample distribution is often the main reason for the performance degradation of machine learning algorithms. This study therefore proposes a hybrid framework (RGAN-EL) that combines generative adversarial networks and ensemble learning to improve classification performance on imbalanced data. First, we propose a training sample selection strategy based on roulette wheel selection so that the GAN pays more attention to the class-overlapping area when fitting the sample distribution. Second, we design two kinds of generator training loss and propose a noise sample filtering method to improve the quality of generated samples. Then, minority class samples are oversampled using the improved RGAN to obtain a balanced training set. Finally, training and prediction are carried out with an ensemble learning strategy. We conducted experiments on 41 real imbalanced data sets using two evaluation metrics: F1-score and AUC. Specifically, we compare RGAN-EL with six typical ensemble learning methods and RGAN with three typical GAN models. The experimental results show that RGAN-EL is significantly better than the other six ensemble learning methods, and that RGAN greatly improves on the three classical GAN models.
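The abstract does not say how the selection weights are computed, so the following is only a minimal sketch of generic roulette wheel (fitness-proportionate) selection, not the RGAN-EL implementation. The function name `roulette_wheel_select` and the idea of passing a per-sample "overlap score" as the weight are assumptions made for illustration.

```python
import bisect
import itertools
import random


def roulette_wheel_select(weights, k, rng=None):
    """Draw k indices with replacement, index i having probability weights[i] / sum(weights).

    `weights` is a hypothetical per-sample score (for example, how strongly a
    sample lies in the class-overlapping area); the abstract does not define it.
    """
    if not weights or any(w < 0 for w in weights):
        raise ValueError("weights must be non-empty and non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    if rng is None:
        rng = random.Random()
    # Cumulative sums mark the right edge of each sample's slice of the wheel.
    cumulative = list(itertools.accumulate(weights))
    picks = []
    for _ in range(k):
        r = rng.random() * total  # spin: a point in [0, total)
        index = bisect.bisect_right(cumulative, r)
        picks.append(min(index, len(weights) - 1))  # guard against float rounding
    return picks
```

Samples with larger weights occupy a wider slice of the wheel and are drawn more often, while zero-weight samples are never drawn. Python's standard library offers the same behavior through `random.choices(population, weights=..., k=...)`.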

6.
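Online education has become an important part of modern education and has opened up vast possibilities for innovation in educational formats. However, in online education as typified by MOOCs, the number of learners who register for a course is usually far higher than the number who finish it, so course completion rates are low. This paper explores the factors that affect online learning outcomes, especially course completion rates, and proposes improvements accordingly. Taking an empirical approach, it applies association rule mining with the open-source data mining tool WEKA to the learning records of 39 courses on the XuetangX (学堂在线) platform. The analysis yields a set of big-data-based, practically instructive association rules about online learning behavior, which provide a reference for follow-up research.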
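Association rules of the kind described above are usually ranked by support and confidence. The sketch below shows these two standard measures in plain Python; it is not the WEKA workflow used in the paper, and the learner-behavior labels in the usage note are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("transactions must be non-empty")
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(transactions, set(antecedent) | set(consequent)) / base
```

For example, with hypothetical learner records `[{"video", "quiz", "completed"}, {"video", "quiz"}, {"video"}, {"quiz", "completed"}]`, the rule `{"video", "quiz"} -> {"completed"}` has support 0.25 for the full itemset and confidence 0.5.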

7.
In this work, we investigate compressed sensing (CS) techniques that exploit prior knowledge to support telemedicine. In particular, the prior knowledge is obtained by computing the probability of appearance of non-zero elements in each row of a sparse matrix, and it is then employed in sensing matrix design and in recovery algorithms for CS systems. A robust sensing matrix is designed by jointly reducing the average mutual coherence and the projection of the sparse representation error. A Probability-Driven Normalized Iterative Hard Thresholding (PD-NIHT) algorithm is developed as the recovery method; it also exploits the prior knowledge of the probability of appearance of non-zero elements, which can bring performance benefits. Simulations are carried out on synthetic data and on endoscopy images of different organs, where the proposed sensing matrix and PD-NIHT algorithm achieve better performance than previously reported algorithms.
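For orientation, here is a minimal sketch of plain iterative hard thresholding (IHT), the basic scheme that normalized variants build on. It is not PD-NIHT: it uses a fixed step size instead of the normalized one and ignores the probability prior described in the abstract. The names `hard_threshold` and `iht` and the dense list-of-lists matrix are assumptions chosen to keep the example dependency-free.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the rest to zero."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:s])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(A, y, s, step=1.0, iters=100):
    """Plain IHT update: x <- H_s(x + step * A^T (y - A x)).

    A is an m x n sensing matrix (list of rows), y the m measurements,
    and s the assumed sparsity of the signal being recovered.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual between the measurements and the current estimate.
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Gradient direction A^T * residual.
        gradient = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = hard_threshold([x[j] + step * gradient[j] for j in range(n)], s)
    return x
```

A fixed step only converges when it is small enough relative to the spectral norm of `A`; normalized IHT instead recomputes the step at every iteration, which is one reason it is preferred in practice.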

8.
Stock movement forecasting is usually formalized as a sequence prediction task based on time series data. Recently, more and more deep learning models with good nonlinear mapping ability have been used to fit dynamic stock time series, but few of them attempt to unveil a market system’s internal dynamics. For instance, the driving force (state) behind a stock’s rise may be the company’s good profitability or concept marketing, and knowing it helps to judge the future trend of the stock. To address this issue, we regard the explored pattern as an organic component of the hidden mechanism. Considering the effective hidden state discovery ability of the Hidden Markov Model (HMM), we aim to integrate it into the training process of the deep learning model. Specifically, we propose a deep learning framework called Hidden Markov Model-Attentive LSTM (HMM-ALSTM) to model stock time series data, which guides the hidden state learning of deep learning methods via the market’s pattern (learned by the HMM) that generates the time series data. In addition, we run a large number of experiments on 6 real-world data sets against 13 stock prediction baselines, predicting both stock movement and return rate. Our proposed HMM-ALSTM achieves an average 10% improvement on all data sets compared to the best baseline.

9.
The wide spread of false information has detrimental effects on society, and false information detection has received wide attention. When new domains appear, the relevant labeled data is scarce, which brings severe challenges to detection. Previous work mainly leverages additional data or domain adaptation technology to assist detection. The former leads to a severe data burden. The latter underutilizes the pre-trained language model because there is a gap between the downstream task and the pre-training task, and it is also inefficient for model storage because it needs to store a set of parameters for each domain. To this end, we propose a meta-prompt based learning (MAP) framework for low-resource false information detection. We exploit the potential of pre-trained language models by constructing templates that transform the detection tasks into pre-training tasks. To keep randomly initialized templates from hindering performance, we learn optimal initial parameters by borrowing the strength of meta learning in fast parameter training. Combining meta learning and prompt learning for detection is non-trivial: constructing meta tasks that yield initial parameters suitable for different domains, and setting up the prompt model’s verbalizer for classification in the noisy low-resource scenario, are both challenging. For the former, we propose a multi-domain meta task construction method to learn domain-invariant meta knowledge. For the latter, we propose a prototype verbalizer to summarize category information and design a noise-resistant prototyping strategy to reduce the influence of noisy data. Extensive experiments on real-world data demonstrate the superiority of MAP in new domains of false information detection.

10.
Some businesses engaged in product development prefer to use a chatbot to gauge customers' views. Getting a chatbot to take context into account remains technically challenging. Sometimes it misjudges the context and makes the wrong decision when predicting the product's originality in the market. When its predictions are accurate, the chatbot helps the enterprise make large profits. However, chatbots may commit errors in dialogs and return inappropriate responses to users, reducing the confidentiality of product and marketing information. This, in turn, reduces the enterprise's gain and imposes additional costs on businesses. To improve the performance of chatbots, AI models based on deep learning are used. This research proposes a multi-headed deep neural network (MH-DNN) model for addressing the logical and fuzzy errors caused by retrieval chatbot models. The model cuts down on the errors that arise from information loss. In our experiments, the model was trained extensively on a large Ubuntu dialog corpus. The recall scores showed that the MH-DNN approach slightly outperformed selected state-of-the-art retrieval-based chatbot approaches. In our proposed work, the MH-DNN algorithm reached accuracy rates of 94% and 92% with and without the help of the Seq2Seq technique, respectively.

11.
Sedentarism is a common problem that can affect human health and wellbeing. Predicting sedentary behaviour is an emerging area that can benefit from data collected by sensors available in ubiquitous devices, such as wearables and smartphones. In this paper, we present an approach for predicting the sedentary behaviour of a user from data collected by sensors installed in wearable and mobile devices. We compare personal and impersonal models using a real-life dataset consisting of sensing data from 48 users over 10 weeks. We found that impersonal models using Deep Neural Networks were able to accurately predict the subject’s future sedentary behaviour.

12.
Sequential recommendation models a user’s historical sequence to predict future items. Existing studies use deep learning methods and contrastive learning for data augmentation to alleviate data sparsity. However, these methods cannot learn accurate, high-quality item representations while augmenting data. In addition, they usually ignore data noise and user cold-start issues. To solve these issues, we investigate combining a Generative Adversarial Network (GAN) with contrastive learning for sequential recommendation in order to balance data sparsity and noise. Specifically, we propose a new framework, Enhanced Contrastive Learning with Generative Adversarial Network for Sequential Recommendation (ECGAN-Rec), which models the training process as a GAN and treats the recommendation task as the main task of the discriminator. We design a sequence augmentation module and a contrastive GAN module to implement both data-level and model-level augmentations. In addition, the contrastive GAN learns more accurate, high-quality item representations to alleviate data noise after data augmentation. Furthermore, we propose an enhanced Transformer recommender based on GAN to optimize the performance of the model. Experimental results on three open datasets validate the efficiency and effectiveness of the proposed model and its ability to balance data noise and data sparsity. Specifically, the improvements of ECGAN-Rec in two evaluation metrics (HR@N and NDCG@N) over the state-of-the-art model on the Beauty, Sports, and Yelp datasets are 34.95%, 36.68%, and 13.66%, respectively. Our implementation is available at https://github.com/nishawn/ECGANRec-master.

13.
Federated learning (FL), a popular distributed machine learning paradigm, has driven the integration of knowledge from ubiquitous data owners under one roof. Although FL is designed to preserve privacy by nature, the supposedly well-sanitized parameters still convey sensitive information (e.g., through reconstruction attacks), while existing technical countermeasures provide weak explainability for the privacy understanding and protection practices of general users. This work investigates these privacy concerns with an exploratory study and elaborates on data owners’ expectations in FL. Based on the analysis, we design the first interactive visualization system for FL privacy that supports intelligible privacy inspection and adjustment for data owners. Specifically, our proposal facilitates sample recommendation for joint privacy–performance training at cold start. It then provides visual interpretation and attention rendering of privacy risks, both per attacking channel and in a holistic view. It further supports interactive privacy enhancement involving both user initiative and differential privacy, and an iterative trade-off with real-time inference accuracy estimation. We evaluate the effectiveness of the system and collect qualitative feedback from users. The results show that 96.7% of users acknowledge the benefits for privacy inspection and adjustment and 90.3% are willing to use our system. More importantly, 87.1% report an increased willingness to contribute data for FL.

14.
Adequate adherence is a necessary condition for success with any intervention, including computerized cognitive training designed to mitigate age-related cognitive decline. Tailored prompting systems offer promise for promoting adherence and facilitating intervention success. However, developing adherence support systems capable of just-in-time adaptive reminders requires understanding the factors that predict adherence, particularly an imminent adherence lapse. In this study we built machine learning models to predict participants’ adherence at different levels (overall and weekly) using data collected from a previous cognitive training intervention. We then built machine learning models to predict adherence using a variety of baseline measures (demographic, attitudinal, and cognitive ability variables), as well as deep learning models to predict the next week's adherence using variables derived from training interactions in the previous week. Logistic regression models with selected baseline variables were able to predict overall adherence with moderate accuracy (AUROC: 0.71), while some recurrent neural network models were able to predict weekly adherence with high accuracy (AUROC: 0.84-0.86) based on daily interactions. Post hoc explanation of the machine learning models revealed that general self-efficacy, objective memory measures, and technology self-efficacy were most predictive of participants’ overall adherence, while time of training, sessions played, and game outcomes were predictive of the next week's adherence. Machine-learning based approaches revealed that both individual difference characteristics and previous intervention interactions provide useful information for predicting adherence, and these insights can provide initial clues as to whom to target with adherence support strategies and when to provide support. This information will inform the development of technology-based, just-in-time adherence support systems.

15.
With the emergence and development of deep generative models such as variational auto-encoders (VAEs), research on topic modeling has extended to a new area: neural topic modeling, which aims to learn disentangled topics to understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, and neural topic models (NTMs) built on it inherit this defect. In this paper, we argue that the optimization objectives of contrastive learning are consistent with two important goals (alignment and uniformity) of well-disentangled topic learning. The optimization objectives of contrastive learning are also consistent with two key evaluation measures for topic models, topic coherence and topic diversity. We therefore conclude that the alignment and uniformity of disentangled topic learning can be quantified with topic coherence and topic diversity. Accordingly, we propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare our proposed CNTM with strong baseline models on widely used metrics. Our model achieves the best topic coherence scores under the most general evaluation setting (100% of topics selected), with 25.0%, 10.9%, 24.6%, and 51.3% improvements over the second-best models’ scores on the four datasets 20 Newsgroups, Web Snippets, Tag My News, and Reuters, respectively. Our method also obtains the second-best topic diversity scores on the 20 Newsgroups and Web Snippets datasets. Our experimental results show that CNTM can effectively leverage the disentanglement ability of contrastive learning to address the inherent defect of neural topic modeling and obtain better topic quality.
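The abstract does not define topic diversity. A definition commonly used in the neural topic modeling literature is the fraction of unique words among the top-k words of all topics, and the sketch below assumes that definition; the function name `topic_diversity` and the default `top_k=25` are illustrative choices, not details taken from the paper.

```python
def topic_diversity(topics, top_k=25):
    """Fraction of distinct words among the top_k words of every topic.

    `topics` is a list of word lists, each ordered from most to least
    probable under that topic. A value near 1.0 means the topics share few
    top words; a value near 0 means the topics are largely redundant.
    """
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("topics must contain at least one word")
    return len(set(top_words)) / len(top_words)
```

With two toy topics `["game", "team", "season"]` and `["game", "player", "coach"]` and `top_k=3`, five of the six top words are distinct, so the score is 5/6.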

16.
Technology-intensive industries spend huge resources on production in order to commercialize successful products. If market demand keeps changing, the capacity to refresh the offerings rapidly and cost-effectively is an important competitive advantage. Even if components and designs need to be modified as new models are released, their underlying technology and designs can generally be reused to allow rapid, economical development. Data are considered an important raw material that can influence multidisciplinary analysis, government, and business efficiency. In this paper, the Efficient Data Analytics (EDA) method is suggested to address societal challenges. The proposed method aims to share the authors' views and perspectives on the emerging opportunities and challenges of the efficient data revolution. EDA covers four key aspects of technology reuse: strategy, method, culture, and information technology. These dimensions are further broken down into concepts supporting technology reuse, including design on the technology platform and reuse assessments. In practice, the system can evaluate an organization's existing reuse capabilities and offer an overall theoretical review of activities promoting technology reuse. To prove the system's concepts, industrial scenarios highlighting real questions of technological growth are used. In addition, six possible societal benefits of EDA are illustrated: enhanced decision management and incident prediction, data-informed technologies and innovative market models, direct social and climate benefits, community engagement, accountability, and public trust. Some best practices are suggested to capture these advantages. The experimental results suggest that EDA increases reusability knowledge in the organization (96.3%), operational cost (95.1%), prediction ratio (97.4%), community engagement ratio (94.1%), and public trust (98.5%).

17.
In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of specific genres, including presentations, book chapters, technical papers, brochures, reports, and letters. Previously, methods have been proposed mainly for title extraction from research papers. It has not been clear whether automatic title extraction from general documents is possible. As a case study, we consider extraction from Office documents, including Word and PowerPoint. In our approach, we annotate titles in sample documents (for Word and PowerPoint, respectively) and take them as training data, train machine learning models, and perform title extraction using the trained models. Our method is unique in that we mainly use formatting information such as font size as features in the models. It turns out that the use of formatting information can lead to quite accurate extraction from general documents. In an experiment on intranet data, precision and recall for title extraction from Word are 0.810 and 0.837, respectively, and precision and recall for title extraction from PowerPoint are 0.875 and 0.895, respectively. Other important new findings in this work are that we can train models in one domain and apply them to other domains and, more surprisingly, that we can even train models in one language and apply them to other languages. Moreover, we can significantly improve search ranking results in document retrieval by using the extracted titles.

18.
孙娟; 李松岭. 《科教文汇》, 2014, (21): 145-146
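"Learn it in, speak it out" (“学进去,讲出来”) is a classroom teaching approach that takes students' autonomous learning as the main mode of learning and cooperative learning as the main form of teaching organization. It uses "learning it in" and "speaking it out" both to guide how students learn and as the basic requirement for meeting learning objectives.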

19.
We present a novel multimodal query expansion strategy, based on genetic programming (GP), for image search in visually-oriented e-commerce applications. Our GP-based approach aims both at learning to expand queries with multimodal information and at learning to compute the “best” ranking for the expanded queries. However, unlike in previous work, the query is expressed only in terms of visual content, which brings several challenges for this type of application. To evaluate the effectiveness of our method, we have collected two datasets containing images of clothing products taken from different online shops. Experimental results indicate that our method is an effective alternative for improving the quality of image search results when compared to a genetic programming system based only on visual information. Our method achieves gains varying from 10.8% against the strongest learning-to-rank baseline to 54% against an ad hoc specialized solution for the particular domain at hand.

20.
Viewer gifting is an important business model in the live streaming industry and is closely related to the income of platforms and streamers. Previous studies on gifting prediction are often limited to cross-sectional data and consider the problem from the macro perspective of the whole live stream. As a result, the multimodal information and the time accumulation effect of live streaming content on viewer gifting behavior are ignored. In this paper, we put forward a multimodal time-series method (MTM) for predicting real-time gifting. The core module of the method is the multimodal time-series analysis (MTA), which aims at effectively fusing multimodal information. Specifically, the proposed orthogonal projection (OP) model can promote cross-modal information interaction without introducing additional parameters. To achieve the interaction of multimodal information at the same level, we also design a stackable joint representation layer, which lets the representation of each target modality (visual, acoustic, and textual) benefit from all the other modalities. Residual connections are introduced as well to ensure the integration of low-level and high-level information. On our dataset, our model improves on other advanced models by at least 8% in F1. Meanwhile, the MTA is able to meet the real-time requirements of the live streaming setting, and has demonstrated its robustness and transferability in other tasks. Our research may offer some insights into how to efficiently fuse multimodal information, and contributes to the research on viewer gifting behavior prediction in the live streaming context.
