Similar literature
 Found 20 similar documents; search took 31 ms
1.
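任妮, 鲍彤, 沈耕宇, 郭婷. 《情报科学》 (Information Science), 2021, 39(11): 96-102
[Purpose/Significance] Domain-oriented fine-grained named entity recognition is important for improving the precision of text mining. Taking named entities of tomato diseases and pests as an example, this paper explores a deep-learning approach to domain-oriented fine-grained named entity recognition. [Methods/Process] Using e-books, papers, and web pages as data sources, six entity types (variety, disease or pest, symptom, time, plant part, and control agent) were annotated. Character vectors pre-trained with BERT and with CBOW were each fed into a BiLSTM-CRF model for training, and rules were applied after recognition to control entity boundaries. [Results/Conclusions] Combining BERT pre-trained character vectors with BiLSTM-CRF, the F-score reached 81.03% after the supplementary rules were applied, outperforming the other models and performing well on entity recognition in the tomato disease and pest domain. [Innovation/Limitations] BERT pre-trained character vectors effectively reduce the impact of word segmentation errors on entities in this domain, and rules tailored to the characteristics of each entity type effectively control entity boundaries and improve recognition accuracy. However, the rules are applied only at the test stage and are not incorporated into training, so overall accuracy still has room for improvement.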
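The F-score reported in this abstract is typically computed at the entity level. The following is a minimal sketch, assuming that an entity counts as correct only when its start, end, and type all match the gold annotation; the span tuples and type names (`VARIETY`, `DISEASE`, `PART`) are hypothetical and are not taken from the paper.

```python
def span_f1(gold, predicted):
    """Entity-level precision, recall, and F1 under exact span-and-type matching.

    Each entity is a (start, end, type) tuple. Exact matching is an assumption;
    the paper may use a different matching criterion.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: the predicted DISEASE entity has a wrong right boundary.
gold_entities = {(0, 2, "VARIETY"), (5, 8, "DISEASE"), (10, 12, "PART")}
predicted_entities = {(0, 2, "VARIETY"), (5, 7, "DISEASE"), (10, 12, "PART")}

precision, recall, f1 = span_f1(gold_entities, predicted_entities)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case a single boundary error costs both a true positive and a correct prediction, which illustrates why boundary-correcting rules can raise the F-score.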

2.
Visual Question Answering (VQA) requires reasoning about the visually-grounded relations in the image and question context. A crucial aspect of solving complex questions is reliable multi-hop reasoning, i.e., dynamically learning the interplay between visual entities in each step. In this paper, we investigate the potential of the reasoning graph network on multi-hop reasoning questions, especially those requiring more than three hops. We call this model QMRGT: A Question-Guided Multi-hop Reasoning Graph Network. It constructs a cross-modal interaction module (CIM) and a multi-hop reasoning graph network (MRGT) and infers an answer by dynamically updating the inter-associated instruction between two modalities. Our graph reasoning module can be applied to any multi-modal model. Experiments on the VQA 2.0 and GQA datasets (in fully supervised and O.O.D. settings) show that both QMRGT and pre-trained V&L models combined with MRGT improve performance on visual question answering tasks. Graph-based multi-hop reasoning provides an effective signal for the visual question answering challenge, both for O.O.D. and for high-level reasoning questions.

3.
4.
Image–text matching is a crucial branch of multimedia retrieval that relies on learning inter-modal correspondences. Most existing methods focus on global or local correspondence and fail to explore fine-grained global–local alignment. Moreover, the issue of how to infer more accurate similarity scores remains unresolved. In this study, we propose a novel unifying knowledge iterative dissemination and relational reconstruction (KIDRR) network for image–text matching. In particular, the knowledge graph iterative dissemination module is designed to iteratively broadcast global semantic knowledge, enabling relevant nodes to be associated and resulting in fine-grained intra-modal correlations and features. Vector-based similarity representations are then learned from multiple perspectives to model multi-level alignments comprehensively. The relation graph reconstruction module is further developed to enhance cross-modal correspondences by constructing similarity relation graphs and adaptively reconstructing them. We conducted experiments on the Flickr30K and MSCOCO datasets, which have 31,783 and 123,287 images, respectively. Experiments show that KIDRR achieves improvements of nearly 2.2% and 1.6% in Recall@1 on Flickr30K and MSCOCO, respectively, compared to the current state-of-the-art baselines.
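Recall@1, the metric quoted in this abstract, can be illustrated with a short sketch. It assumes a square similarity matrix in which the correct candidate for query `i` is candidate `i`; this one-to-one pairing is a simplification, and the toy scores are made up rather than taken from the paper.

```python
def recall_at_k(similarity, k=1):
    """Fraction of queries whose ground-truth candidate appears in the top k.

    similarity[i][j] is the score between query i and candidate j.
    Assumption: the ground truth for query i is candidate i.
    """
    hits = 0
    for query_index, row in enumerate(similarity):
        # Candidate indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if query_index in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Made-up similarity scores for three image queries against three captions.
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
print(recall_at_k(toy_scores, k=1))
print(recall_at_k(toy_scores, k=2))
```

Here the second query ranks the wrong caption first but the right one second, so it counts as a miss at k=1 and a hit at k=2.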

5.
Among existing knowledge graph based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid the enormous cost of labeling large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning path: (1) incorrectly reasoned relations, even when the final answer entity may be correct; (2) correctly reasoned relations in a wrong order, which leads to an incorrect answer entity. To address these issues, this paper treats the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each single reasoning hop is constructed with regard to both the fitness of the taken action and the evaluation of the unreasoned information remaining in the updated question embeddings. In addition, to lead the agent to the answer entity along the correct reasoning path, an evaluation network is designed to evaluate the action taken in each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.

6.
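[Purpose/Significance] This study mines scientific knowledge from scientific and technical literature on the basis of knowledge elements, establishes links among pieces of scientific knowledge, and builds a fine-grained knowledge graph, aiming to break down barriers between pieces of knowledge and meet users' fine-grained knowledge needs. [Methods/Process] First, a graph-based knowledge element representation framework is built, and a fine-grained knowledge organization model is constructed with knowledge elements at its core. Second, a knowledge graph oriented to knowledge elements in scientific literature is designed, and the construction workflow is examined so that the graph can be built automatically. Finally, an empirical study is carried out using abstracts and introductions from scientific literature as experimental data, producing a knowledge graph of knowledge elements in scientific literature. [Results/Conclusions] The resulting knowledge graph not only presents intuitively the problems that academic papers investigate and the methods and models they propose, but also reveals the intrinsic relationships among pieces of scientific knowledge. [Innovation/Limitations] This paper explores a fine-grained knowledge organization model and builds a knowledge graph oriented to knowledge elements in scientific literature. Future work will continue to refine the construction workflow and explore application areas for the knowledge graph.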

7.
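A Configurable Trusted Boot System   Total citations: 3, self-citations: 0, citations by others: 3
The security of the boot process is the foundation of computer system security: a secure boot system must ensure that the entities in the boot execution chain have not been tampered with after power-on. Current trusted boot work based on the Trusted Platform Module (TPM) can only record and report the evidence chain of system boot in a trusted manner; it cannot verify that chain or take further action. This paper proposes a configurable trusted boot system that can be configured for authenticated boot and secure boot, and that supports fine-grained file verification and trusted recovery of the operating system kernel. The design rationale is presented and the prototype work is described; experiments show that the system effectively achieves its design goals.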
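The idea of fine-grained file verification can be sketched as comparing each boot file's digest against a trusted reference list. This is only an illustration under stated assumptions: the abstract does not say which hash function or data layout the system uses, so SHA-256, the file names, and the `verify_boot_files` helper are all hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def verify_boot_files(files, expected_digests):
    """Return the names of files whose digest is missing from, or differs
    from, the trusted reference digests (hypothetical helper)."""
    tampered = []
    for name, content in files.items():
        if expected_digests.get(name) != sha256_hex(content):
            tampered.append(name)
    return tampered


# Hypothetical boot files and a reference list built from known-good copies.
kernel = b"kernel image bytes"
config = b"boot configuration"
reference = {"vmlinuz": sha256_hex(kernel), "boot.cfg": sha256_hex(config)}

print(verify_boot_files({"vmlinuz": kernel, "boot.cfg": config}, reference))
print(verify_boot_files({"vmlinuz": kernel + b"!", "boot.cfg": config}, reference))
```

A secure-boot policy would refuse to continue when the returned list is non-empty, whereas an authenticated-boot policy would only record the result; a recovery step, as the abstract describes for the kernel, would restore the flagged file from a trusted copy.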

8.
To improve multimodal recognition of negative sentiment in online public opinion on public health emergencies, we constructed a novel multimodal fine-grained negative sentiment recognition model based on graph convolutional networks (GCN) and ensemble learning. This model comprises BERT and ViT-based multimodal feature representation, GCN-based feature fusion, multiple classifiers, and ensemble learning-based decision fusion. First, image-text data about COVID-19 is collected from Sina Weibo, and the text and image features are extracted through BERT and ViT, respectively. Second, the image-text fused features are generated through GCN in the constructed microblog graph. Finally, AdaBoost is trained to decide the final sentiments recognized by the best classifiers on image, text, and image-text fused features. The results show that the F1-score of this model is 84.13% in sentiment polarity recognition and 82.06% in fine-grained negative sentiment recognition, improvements of 4.13% and 7.55%, respectively, over the best result obtained with image-text feature fusion.

9.
Existing approaches to learning path recommendation for online learning communities mainly rely on the individual characteristics of users or the historical records of their learning processes, but pay less attention to the semantics of users’ postings and the context. To facilitate the knowledge understanding and personalized learning of users in online learning communities, it is necessary to conduct a fine-grained analysis of user data to capture their dynamic learning characteristics and potential knowledge levels, so as to recommend appropriate learning paths. In this paper, we propose a fine-grained and multi-context-aware learning path recommendation model for online learning communities based on a knowledge graph. First, we design a multidimensional knowledge graph to address the monotonous and incomplete presentation of entity information in a single-layer knowledge graph. Second, we use the topic preference features of users’ postings to determine the starting point of learning paths. We then strengthen the distant relationships of knowledge in the global context using the multidimensional knowledge graph when generating and recommending learning paths. Finally, we build a user background similarity matrix to establish user connections in the local context, so as to recommend users with similar knowledge levels and learning preferences and synchronize their subsequent postings. Experimental results show that the proposed model can recommend appropriate learning paths for users, and that the recommended similar users and postings are effective.

10.
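李晓敏, 王昊, 李跃艳. 《情报科学》 (Information Science), 2022, 40(4): 156-165
[Purpose/Significance] To help researchers quickly and accurately find academic papers relevant to their research interests, an academic paper recommendation model based on fine-grained semantic entities is constructed. [Methods/Process] Semantic entities of three types identified in earlier experiments (research topic, research object, and theory/technique) are used as content features of academic papers and core authors. The TF-IDF algorithm, the TextRank algorithm, and the LDA model are each used to obtain feature words for papers and core authors; the feature words are vectorized with Word2vec; the cosine similarity between core authors and papers is computed; and the 20 papers with the highest cosine similarity are recommended to each author. [Results/Conclusions] The recommendations generated from the feature words of the three algorithms are compared using precision, recall, and F-score. The results show that feature words obtained with TF-IDF yield the best recommendations. A worked example of the recommendation results shows that the proposed model recommends papers matching researchers' interests more comprehensively, improving research efficiency. [Innovation/Limitations] This paper starts from the content features of academic papers and applies different algorithms to type-refined keywords to select feature words for core authors, thereby recommending papers; however, the network relationships contained in academic papers are not considered.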
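The final ranking step described in this abstract (cosine similarity between an author vector and paper vectors, keeping the top 20) can be sketched as follows. The two-dimensional vectors and paper identifiers are made up for illustration; in the paper the vectors come from Word2vec embeddings of feature words, and how those embeddings are aggregated into a single vector is not specified here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(author_vector, paper_vectors, top_k=20):
    """Return (paper_id, similarity) pairs for the top_k most similar papers."""
    scored = [(pid, cosine(author_vector, vec)) for pid, vec in paper_vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors for three papers and one core author.
papers = {"p1": [1.0, 0.0], "p2": [0.6, 0.8], "p3": [0.0, 1.0]}
author = [1.0, 0.2]

print([pid for pid, _ in recommend(author, papers, top_k=2)])
```

With real data, `top_k` would stay at its default of 20 to match the Top20 cut-off mentioned in the abstract.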

11.
The identification of knowledge graph entity mentions in textual content has already attracted much attention. The major assumption of existing work is that entities are explicitly mentioned in text and only need to be disambiguated and linked. However, this assumption does not necessarily hold for social content, where a significant portion of information is implied. The focus of our work in this paper is to identify whether textual social content includes implicit mentions of knowledge graph entities, which forms a two-class classification problem. To this end, we adopt the systemic functional linguistic framework, which allows for capturing meaning expressed through language. Based on this theoretical framework, we systematically introduce two classes of features, namely syntagmatic and paradigmatic features, for implicit entity recognition. In our experiments, we show the utility of these features for the task, report on ablation studies, measure the impact of the feature subsets on each other, and provide a detailed error analysis of our technique.

12.
Learning a continuous dense low-dimensional representation of knowledge graphs (KGs), known as knowledge graph embedding (KGE), has been viewed as the key to intelligent reasoning for deep learning (DL) and has gained much attention in recent years. To address the problem that current KGE models are generally ineffective on small-scale sparse datasets, we propose a novel method, RelaGraph, to improve the representation of entities and relations in KGs by introducing neighborhood relations. RelaGraph extends the neighborhood information during entity encoding and adds the neighborhood relations to mine deeper levels of graph structure information, so as to make up for the shortage of information in the generated subgraph. This method represents KG components in a vector space in a way that captures the structure of the graph, avoiding underlearning or overfitting. KGE based on RelaGraph is evaluated on a small-scale sparse graph, KMHEO, where the MRR reached 0.49, 34 percentage points higher than that of the SOTA methods, and it also performs well on several other datasets. Additionally, the vectors learned by RelaGraph are used to introduce DL into several KG-related downstream tasks, which achieved excellent results, verifying the superiority of KGE-based methods.
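The MRR figure quoted in this abstract is the mean reciprocal rank of the correct entity across test queries. The following is a minimal sketch of the metric itself, using made-up ranks rather than any results from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all queries, where rank is the 1-based position
    of the correct entity in the model's ranked candidate list."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Made-up ranks of the correct tail entity for three link-prediction queries.
example_ranks = [1, 2, 4]
print(round(mean_reciprocal_rank(example_ranks), 4))
```

Because each term is at most 1, MRR lies in (0, 1]; a gain of 34 percentage points therefore means the correct entity moves substantially closer to the top of the ranking on average.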

13.
Recently, geolocalisation of tweets has become important for a wide range of real-time applications, including real-time event detection, topic detection, and disaster and emergency analysis. However, the number of relevant geotagged tweets available to enable such tasks remains insufficient. To overcome this limitation, predicting the location of non-geotagged tweets, while challenging, can increase the sample of geotagged data and has consequences for a wide range of applications. In this paper, we propose a location inference method that utilises a ranking approach combined with a majority voting of tweets, where each vote is weighted based on evidence gathered from the ranking. Using geotagged tweets from two cities, Chicago and New York (USA), our experimental results demonstrate that our method statistically significantly outperforms state-of-the-art baselines in terms of accuracy and error distance in both cities, at the cost of decreased coverage. Finally, we investigated the applicability of our method in a real-time scenario by means of a traffic incident detection task. Our analysis shows that our fine-grained geolocalisation method can overcome the limitations of geotagged tweets and precisely map incident-related tweets to the real location of the incident.
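The weighted majority voting step can be sketched as follows. This is an illustration under assumptions, not the authors' method: it assumes that the ranking returns geotagged tweets as `(area, score)` pairs in descending order of relevance and that the retrieval score itself serves as the vote weight; the area labels and the `top_n` cut-off are hypothetical.

```python
from collections import defaultdict


def infer_location(ranked_tweets, top_n=3):
    """Pick the area with the largest total weight among the top_n ranked tweets.

    ranked_tweets: list of (area, score) pairs, best match first.
    Assumption: the retrieval score is used directly as the vote weight.
    Returns None when there is nothing to vote on.
    """
    votes = defaultdict(float)
    for area, score in ranked_tweets[:top_n]:
        votes[area] += score
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical ranking of geotagged tweets for one non-geotagged tweet.
ranked = [("area_A", 0.9), ("area_B", 0.8), ("area_B", 0.7), ("area_A", 0.1)]

print(infer_location(ranked, top_n=3))
print(infer_location(ranked, top_n=1))
```

Returning `None` when no evidence is available mirrors the trade-off noted in the abstract: a method that abstains on weak evidence gains accuracy at the cost of coverage.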

14.
15.
Visual Question Answering (VQA) systems have achieved great success in general scenarios. In the medical domain, VQA systems are still in their infancy, as the datasets are limited in scale and application scenarios. Current medical VQA datasets are designed to conduct basic analyses of medical imaging, such as modalities, planes, organ systems, and abnormalities. They aim to provide constructive medical suggestions for doctors and contain a large number of professional terms, so they are of limited help to patients. In this paper, we introduce a new Patient-oriented Visual Question Answering (P-VQA) dataset, which supports building a VQA system for patients by covering an entire treatment process, including medical consultation, imaging diagnosis, clinical diagnosis, treatment advice, and review. P-VQA covers 20 common diseases with 2,169 medical images, 24,800 question-answering pairs, and a medical knowledge graph containing 419 entities. In terms of methodology, we propose a Medical Knowledge-based VQA Network (MKBN) to answer questions according to the images and the medical knowledge graph in P-VQA. MKBN learns two cluster embeddings (disease-related and relation-related embeddings) according to the structural characteristics of the medical knowledge graph and learns three different interactive features (image-question, image-disease, and question-relation) according to the characteristics of diagnosis. For comparison, we evaluate several state-of-the-art baselines on the P-VQA dataset as benchmarks. Experimental results on P-VQA demonstrate that MKBN achieves state-of-the-art performance compared with baseline methods. The dataset is available at https://github.com/cs-jerhuang/P-VQA.

16.
Humans are able to reason from multiple sources to arrive at the correct answer. In the context of Multiple Choice Question Answering (MCQA), knowledge graphs can provide subgraphs based on different combinations of questions and answers, mimicking the way humans find answers. However, current research mainly focuses on independent reasoning on a single graph for each question–answer pair, lacking the ability for joint reasoning among all answer candidates. In this paper, we propose a novel method, KMSQA, which leverages multiple subgraphs from the large knowledge graph ConceptNet to model the comprehensive reasoning process. We further encode the knowledge graphs with shared Graph Neural Networks (GNNs) and perform joint reasoning across multiple subgraphs. We evaluate our model on two common datasets: CommonsenseQA (CSQA) and OpenBookQA (OBQA). Our method achieves an exact match score of 74.53% on CSQA and 71.80% on OBQA, outperforming all eight baselines.

17.
Visual dialog, a visual-language task, enables an AI agent to engage in conversation with humans grounded in a given image. To generate appropriate answers for a series of questions in the dialog, the agent is required to understand the comprehensive visual content of an image and the fine-grained textual context of the dialog. However, previous studies typically used object-level visual features to represent a whole image, which focuses only on the local perspective of an image and ignores the importance of its global information. In this paper, we propose a novel model, Human-Like Visual Cognitive and Language-Memory Network for Visual Dialog (HVLM), to simulate global and local dual-perspective cognition in the human visual system and understand an image comprehensively. HVLM consists of two key modules, Local-to-Global Graph Convolutional Visual Cognition (LG-GCVC) and Question-guided Language Topic Memory (T-Mem). Specifically, in the LG-GCVC module, we design a question-guided dual-perspective reasoning process to jointly learn visual content from both local and global perspectives through a simple spectral graph convolution network. Furthermore, in the T-Mem module, we design an iterative learning strategy to gradually enhance fine-grained textual context details via an attention mechanism. Experimental results demonstrate the strength of our proposed model, which achieves comparable performance on the benchmark datasets VisDial v1.0 and VisDial v0.9.

18.
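[Purpose/Significance] Automatic extraction of concept hierarchy relations makes it possible to divide concepts into fine-grained semantic levels quickly on large datasets, providing a reference for the subsequent fine-grained construction of domain ontologies. [Methods/Process] First, a term set composed of compound terms and keywords is filtered by term frequency, document frequency, and semantic similarity to obtain the concept set for the domain of academic paper evaluation. Second, concept co-occurrence and contextual semantic information are both considered: the former is expressed by a document-concept matrix and a concept co-occurrence matrix, the latter by word2vec word vectors, and the two are integrated through cosine similarity to obtain a concept similarity matrix. Finally, taking the concepts with the highest degree of association as cluster centers, spectral clustering is applied to the similarity matrix to obtain the concept hierarchy for the domain. [Results/Conclusions] Experiments show that the proposed model achieves high accuracy and that the resulting domain concept hierarchy is reasonable. [Innovation/Limitations] This paper proposes a model for automatic extraction of concept hierarchy relations based on word co-occurrence and word vectors. The model extracts concept hierarchy relations automatically, but the method for determining class labels is fairly simple and merits further study.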

19.
Identifying petition expectation for government response plays an important role in government administrative service. Although some petition platforms allow citizens to label the petition expectation when they submit e-petitions, misunderstanding and misselection of petition labels still necessitate manual classification. Automatic petition expectation identification faces challenges from the poor context information, heavy noise, and casual syntactic structure of petition text. In this paper, we propose a novel deep reinforcement learning based method, named PecidRL, for the correction and identification of petition expectation (citizens’ demands for the level of government response). We collect a dataset of 237,042 petitions from Message Board for Leaders, the largest official petition platform in China. First, we introduce a deep reinforcement learning framework to automatically correct the mislabeled and ambiguous labels of the petitions. Then, multi-view textual features, including word-level and document-level semantic features, sentiment features, and different textual graph representations, are extracted and integrated to provide richer auxiliary information. Furthermore, based on the corrected petitions, 19 novel petition expectation identification models are constructed by extending 11 popular machine learning models for petition expectation detection. Finally, comprehensive comparison and evaluation are conducted to select the final petition expectation identification model with the best performance. After correction by PecidRL, each metric on all extended petition expectation identification models improves by an average of 8.3%, with the highest increase reaching 14.2%. The optimal model is Peti-SVM-bert, with the highest accuracy of 93.66%. We also analyze the petition expectation label variation of the dataset using PecidRL. We find that 16.9% of e-petitioners tend to exaggerate the urgency of their petitions so that the government pays closer attention to their appeals, and that the urgency of 4.4% of petitions is underestimated. This study has substantial academic and practical value in improving government efficiency. Additionally, a web server has been developed for government administrators and other researchers, which can be accessed at http://www.csbg-jlu.info/PecidRL/.

20.
With the prosperity and development of the digital economy, many fraudsters have emerged on e-commerce platforms, fabricating fraudulent reviews to mislead consumers’ shopping decisions for profit. Moreover, in order to evade fraud detection, fraudsters continue to evolve and exhibit adversarial camouflage and collaborative attacks. In this paper, we propose a novel temporal burstiness and collaborative camouflage aware method (TBCCA) for fraudster detection. Specifically, we capture the hidden temporal burstiness features behind the camouflage strategy using a time series prediction model, and identify highly suspicious target products by assigning suspicious scores as node priors. Meanwhile, a propagation graph integrating review collusion is constructed, and an iterative fraud confidence propagation algorithm based on Loopy Belief Propagation (LBP) is designed to infer the labels of nodes in the graph. Comprehensive experiments are conducted to compare TBCCA with state-of-the-art fraudster detection approaches, and the results show that TBCCA can effectively identify fraudsters in real review networks, achieving a 6%–10% performance improvement over other baselines.
