Similar literature
20 similar documents found; search time: 31 ms
1.
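Ke Jia (柯佳). 《情报科学》 (Information Science), 2021, 39(10): 165-169
[Purpose/Significance] Entity relation extraction is foundational work for building domain ontologies and knowledge graphs and for developing question-answering systems. Distant supervision aligns large-scale unstructured text with the entities of an existing knowledge base and labels training samples automatically. This removes the time-consuming manual annotation of training corpora that supervised machine learning requires, but it also introduces data noise. [Method/Process] This paper reviews in detail recent methods that combine distant supervision with deep learning to reduce noise in training samples and improve entity relation extraction performance. [Results/Conclusions] Convolutional neural networks are better at capturing local and key features of a sentence; long short-term memory networks are better at handling long-distance dependencies between the entity pair in a sentence; the models extract lexical and syntactic features of sentences automatically; attention mechanisms give greater weight to the key context and words of a sentence; and incorporating prior knowledge into neural network models enriches the semantic information of entity pairs, significantly improving relation extraction performance. [Innovation/Limitations] Future research should consider how to handle overlapping relations and long-tail semantic relations between entity pairs, so as to address relation noise more comprehensively.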
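The distant-supervision labeling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the method of any surveyed paper: the toy knowledge base `KB`, the function name `distant_label`, and the substring-matching rule are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Hawaii"): "born_in",
    ("Apple", "Steve Jobs"): "founded_by",
}


def distant_label(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both entities of a KB fact.

    Assumption: entity mentions are found by plain substring matching.
    """
    labeled = []
    for sentence in sentences:
        for (head, tail), relation in kb.items():
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, tail, relation))
    return labeled


corpus = [
    "Barack Obama was born in Hawaii.",
    # Mentions both entities but does not express born_in: a noisy label.
    "Barack Obama visited Hawaii last year.",
    # Mentions only one entity of the (Apple, Steve Jobs) fact: not labeled.
    "Steve Jobs introduced the iPhone.",
]
samples = distant_label(corpus, KB)
for sentence, head, tail, relation in samples:
    print(f"{relation}({head}, {tail}) <- {sentence}")
```

The second sentence receives the `born_in` label even though it does not express that relation, which is the kind of noise the surveyed denoising methods aim to reduce.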

2.
The majority of currently available entity alignment (EA) solutions rely primarily on structural information to align entities, which is biased and disregards additional multi-source information. To compensate for inadequate structural details, this article proposes SKEA, a simple but flexible framework for Entity Alignment with cross-modal supervision of Supporting Knowledge. We employ a relational aggregate network to specifically utilize the details about the entity and its neighbors. To overcome the limitations of relational features, two multi-modal encoding modules are used to extract visual and textual information. In each iteration, SKEA generates a new set of potential aligned entity pairs using the knowledge of two reference modalities, which can enhance the model’s supervision. It is important to note that the supporting information used in our framework does not participate in the network’s backpropagation, which considerably improves efficiency and differs dramatically from earlier work. In comparison to existing baselines, experiments demonstrate that our proposed framework can incorporate multi-aspect information efficiently and enable supervisory signals from other modalities to transmit to entities. The maximum performance improvement of 5.24% indicates our suggested framework’s superiority, especially for sparse KGs.

3.
4.
We propose an approach to the retrieval of entities that have a specific relationship with the entity given in a query. Our research goal is to investigate whether the related entity finding problem can be addressed by combining a measure of relatedness of candidate answer entities to the query with the likelihood that the candidate answer entity belongs to the target entity category specified in the query. An initial list of candidate entities, extracted from top-ranked documents retrieved for the query, is refined using a number of statistical and linguistic methods. The proposed method extracts the category of the target entity from the query, identifies instances of this category as seed entities, and computes similarity between candidate and seed entities. The evaluation was conducted on the Related Entity Finding task of the Entity Track of TREC 2010, as well as the QA list questions from TREC 2005 and 2006. Evaluation results demonstrate that the proposed methods are effective in finding related entities.
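One way to picture the combination of query relatedness and category likelihood is the sketch below. It is an assumption-laden illustration rather than the paper's scoring function: the sparse feature vectors, the mean cosine similarity to seed entities as a stand-in for category likelihood, and the product used to combine the two scores are all hypothetical choices.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # sparse feature vector: term -> weight


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def category_likelihood(candidate: Vector, seeds: List[Vector]) -> float:
    """Assumed proxy: mean similarity of the candidate to the seed entities."""
    if not seeds:
        return 0.0
    return sum(cosine(candidate, s) for s in seeds) / len(seeds)


def rank_candidates(
    relatedness: Dict[str, float],
    features: Dict[str, Vector],
    seeds: List[Vector],
) -> List[Tuple[str, float]]:
    """Score = relatedness to the query * category likelihood (assumed)."""
    scored = [
        (name, relatedness[name] * category_likelihood(features[name], seeds))
        for name in relatedness
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical query: "airlines that fly to Frankfurt", target category: airline.
seeds = [{"airline": 1.0, "carrier": 1.0}]
features = {"Lufthansa": {"airline": 1.0}, "Frankfurt": {"city": 1.0}}
relatedness = {"Lufthansa": 0.6, "Frankfurt": 0.9}
ranked = rank_candidates(relatedness, features, seeds)
print(ranked)
```

In this toy run, "Frankfurt" is more related to the query but shares no features with the seed entities, so the category term pushes it below "Lufthansa".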

5.
Overlapping entity relation extraction has received extensive research attention in recent years. However, existing methods suffer from the limitation of long-distance dependencies between entities, and fail to extract the relations when the overlapping situation is relatively complex. This issue limits the performance of the task. In this paper, we propose an end-to-end neural model for overlapping relation extraction by treating the task as a quintuple prediction problem. The proposed method first constructs the entity graphs by enumerating possible candidate spans, then models the relational graphs between entities via a graph attention model. Experimental results on five benchmark datasets show that the proposed model achieves the current best performance, outperforming previous methods and baseline systems by a large margin. Further analysis shows that our model can effectively capture the long-distance dependencies between entities in a long sentence.

6.
Fact verification aims to retrieve relevant evidence from a knowledge base, e.g., Wikipedia, to verify the given claims. Existing methods consider only sentence-level semantics for evidence representations and typically neglect the importance of fine-grained features in the evidence-related sentences. In addition, the interpretability of the reasoning process has not been well studied in the field of fact verification. To address these issues, we propose an entity-graph based reasoning method for fact verification, abbreviated as RoEG, which generates fine-grained features of evidence at the entity level and models human reasoning paths based on an entity graph. In detail, to capture the semantic relations of retrieved evidence, RoEG introduces the entities as nodes and constructs the edges in the graph based on three linking strategies. Then, RoEG utilizes a selection gate to constrain the information propagation in the sub-graph of relevant entities and applies a graph neural network to propagate the entity features for reasoning. Finally, RoEG employs an attention aggregator to gather the information of entities for label prediction. Experimental results on the large-scale benchmark dataset FEVER demonstrate the effectiveness of our proposal, which beats competitive baselines in terms of label accuracy and FEVER Score. In particular, for the task of multiple-evidence fact verification, RoEG produces 5.48% and 4.35% improvements in label accuracy and FEVER Score, respectively, over the state-of-the-art baseline. In addition, RoEG performs better when more entities are involved in fact verification.

7.
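Wang Ying (王颖), Yu Gaihong (于改红), Xie Jing (谢靖). 《情报科学》 (Information Science), 2021, 39(8): 67-77
[Purpose/Significance] Through deep mining and semantic organization of academic resources, this work enables the discovery of associations among academic resources and the knowledge they contain. [Method/Process] This paper proposes a method for discovering associations among academic resources based on a full-text knowledge network, and designs the model and construction workflow of that network. Using the full text of 520 journal articles on Arabidopsis from the PubMed Central database as experimental data, full-text parsing and mining decompose the articles into fine-grained knowledge to form a full-text knowledge network. SPARQL queries and the RelFinder visualization tool are then used to run association discovery experiments at three levels: the digital resource level, the knowledge unit level, and the knowledge object level. [Results/Conclusions] Building a full-text knowledge network to organize and mine academic resources at fine granularity helps uncover potential associations among different academic resources and the knowledge they contain, which is significant for the in-depth use of academic resources. [Innovation/Limitations] The novelty lies in using a full-text knowledge network to reveal and organize academic resources at fine granularity and to discover potential associations; the limitation is that large-scale application has not yet been carried out.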

8.
Entity disambiguation is a fundamental task of semantic Web annotation. Entity Linking (EL) is an essential procedure in entity disambiguation, which aims to link a mention appearing in a plain text to a structured or semi-structured knowledge base, such as Wikipedia. Existing research on EL usually annotates the mentions in a text one by one and treats entities as independent of each other. However, this might not be true in many application scenarios. For example, if two mentions appear in one text, they are likely to have certain intrinsic relationships. In this paper, we first propose a novel query expansion method for candidate generation that utilizes the co-occurrence information of mentions. We further propose a re-ranking model which can be iteratively adjusted based on the prediction in the previous round. Experiments on real-world data demonstrate the effectiveness of our proposed methods for entity disambiguation.

9.
Recent developments have shown that entity-based models that rely on information from the knowledge graph can improve document retrieval performance. However, given the non-transitive nature of relatedness between entities on the knowledge graph, the use of semantic relatedness measures can lead to topic drift. To address this issue, we propose a relevance-based model for entity selection based on pseudo-relevance feedback, which is then used to systematically expand the input query, leading to improved retrieval performance. We perform our experiments on the widely used TREC Web corpora and empirically show that our proposed approach to entity selection significantly improves ad hoc document retrieval compared to strong baselines. More concretely, the contributions of this work are as follows: (1) We introduce a graphical probability model that captures dependencies between entities within the query and documents. (2) We propose an unsupervised entity selection method based on the graphical model for query entity expansion and then for ad hoc retrieval. (3) We thoroughly evaluate our method and compare it with the state-of-the-art keyword and entity based retrieval methods. We demonstrate that the proposed retrieval model shows improved performance over all the other baselines on ClueWeb09B and ClueWeb12B, two widely used Web corpora, on the [email protected], and [email protected] metrics. We also show that the proposed method is most effective on the difficult queries. In addition, we compare our proposed entity selection with a state-of-the-art entity selection technique within the context of ad hoc retrieval using a basic query expansion method and illustrate that it provides more effective retrieval for all expansion weights and different numbers of expansion entities.

10.
The knowledge contained in academic literature is interesting to mine. Inspired by the idea of molecular marker tracing in the field of biochemistry, three types of named entities, namely methods, datasets, and metrics, are extracted and used as artificial intelligence (AI) markers for AI literature. These entities can be used to trace the research process described in the bodies of papers, which opens up new perspectives for seeking and mining more valuable academic information. Firstly, a named entity recognition model is used to extract AI markers from large-scale AI literature. A multi-stage self-paced learning strategy (MSPL) is proposed to address the negative influence of hard and noisy samples on model training. Secondly, original papers are traced for AI markers. Statistical and propagation analyses are performed based on the tracing results. Finally, the co-occurrences of AI markers are used to achieve clustering. The evolution within method clusters is explored. The above-mentioned mining based on AI markers yields many significant findings. For example, the propagation rate of the datasets gradually increases. The methods proposed by China in recent years have an increasing influence on other countries.

11.
Learning semantic representations of documents is essential for various downstream applications, including text classification and information retrieval. Entities, as important sources of information, have been playing a crucial role in assisting latent representations of documents. In this work, we hypothesize that entities are not monolithic concepts; instead they have multiple aspects, and different documents may be discussing different aspects of a given entity. Given that, we argue that from an entity-centric point of view, a document related to multiple entities shall be (a) represented differently for different entities (multiple entity-centric representations), and (b) each entity-centric representation should reflect the specific aspects of the entity discussed in the document. In this work, we devise the following research questions: (1) can we confirm that entities have multiple aspects, with different aspects reflected in different documents, (2) can we learn a representation of entity aspects from a collection of documents, and a representation of a document based on the multiple entities and their aspects as reflected in the documents, (3) does this novel representation improve algorithm performance in downstream applications, and (4) what is a reasonable number of aspects per entity? To answer these questions we model each entity using multiple aspects (entity facets), where each entity facet is represented as a mixture of latent topics. Then, given a document associated with multiple entities, we assume multiple entity-centric representations, where each entity-centric representation is a mixture of entity facets for each entity.
Finally, a novel graphical model, the Entity Facet Topic Model (EFTM), is proposed in order to learn entity-centric document representations, entity facets, and latent topics. Through experimentation we confirm that (1) entities are multi-faceted concepts which we can model and learn, (2) a multi-faceted entity-centric modeling of documents can lead to effective representations, which (3) can have an impact in downstream applications, and (4) considering a small number of facets is effective enough. In particular, we visualize entity facets within a set of documents, and demonstrate that indeed different sets of documents reflect different facets of entities. Further, we demonstrate that the proposed entity facet topic model generates better document representations in terms of perplexity, compared to state-of-the-art document representation methods. Moreover, we show that the proposed model outperforms baseline methods in the application of multi-label classification. Finally, we study the impact of EFTM’s parameters and find that a small number of facets better captures entity-specific topics, which confirms the intuition that on average an entity has a small number of facets reflected in documents.

12.
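Ding Shengchun (丁晟春), Fang Zhen (方振), Wang Nan (王楠). 《现代情报》 (Journal of Modern Information), 2009, 40(3): 103-110
[Purpose/Significance] To address the scattered, disordered, and fragmented state of multi-source heterogeneous enterprise data on public online platforms, a Bi-LSTM-CRF deep learning model is proposed for named entity recognition in the business domain. [Method/Process] The method covers the recognition of three types of named entities: full company names, abbreviated company names, and person names. [Results/Conclusions] Experimental results show an average F-score of 90.85% across the three entity types, which validates the proposed method and shows that it effectively improves named entity recognition in the business domain.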
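An average F-score over several entity types is commonly computed as a macro average of per-type F1 values. The sketch below assumes that convention; the abstract does not state which averaging was used, and the counts in the example are invented for illustration and are not the paper's results.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of the per-type F1 scores (assumed averaging scheme)."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


# Made-up (tp, fp, fn) counts per entity type, for illustration only.
counts = {
    "company_full_name": (90, 10, 10),
    "company_short_name": (80, 20, 20),
    "person_name": (95, 5, 5),
}
per_type = {name: f1(*c) for name, c in counts.items()}
average = macro_f1(counts)
print(per_type, average)
```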

13.
Despite the accepted need for strategic alignment in the manufacturing strategy literature, there has been relatively little research aimed at simultaneously aligning decisions in the structural and infrastructural areas with the competitive priorities of an organization. This article attempts to bridge the gap by examining how work force management practices, an infrastructural decision, should be aligned with both the process technology of a manufacturing plant, a structural decision, and its competitive priorities. A conceptual model for linking competitive priorities, process technologies, and work force management practices is presented, and specific and testable propositions are developed for future empirical research. The proposed conceptual model is based on the premise that the process of matching work force management practices to competitive priorities involves identification of the key managerial tasks underlying various competitive priorities. These tasks, in turn, are matched with the process technology characteristics and work force management practices to seek a good fit, which is expected to improve manufacturing performance.

14.
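Wang Changfeng (王长峰), Zhao Di (赵迪), Shi Zhiwu (史志武), Tan Chengcao (檀程操). 《科研管理》 (Science Research Management), 2013, 34(11): 131-136
Load balancing is a key problem in optimally allocating knowledge resources across heterogeneous platforms within the cloud-computing cyberspace alliance for heterogeneous-organization knowledge integration in mega science and technology (engineering) projects. The mechanism used to solve it strongly affects the validity, accuracy, and timeliness of heterogeneous-organization knowledge and greatly improves knowledge sharing among the many stakeholders of the heterogeneous organizations. Based on the ant colony optimization algorithm, and taking into account the characteristics of, and demand for, heterogeneous-organization knowledge in such projects under complex dynamic environments, this paper builds a load-balancing mechanism for that cloud-computing cyberspace alliance. The mechanism largely achieves a balanced knowledge load and an optimal allocation of heterogeneous-organization knowledge resources, improving the effectiveness of that knowledge.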

15.
In recent years, reasoning over knowledge graphs (KGs) has been widely adopted to empower retrieval systems, recommender systems, and question answering systems, generating a surge in research interest. Recently developed reasoning methods usually suffer from poor performance when applied to incomplete or sparse KGs, due to the lack of evidential paths that can reach target entities. To solve this problem, we propose a hybrid multi-hop reasoning model with reinforcement learning (RL) called SparKGR, which implements dynamic path completion and iterative rule guidance strategies to increase reasoning performance over sparse KGs. Firstly, the model dynamically completes the missing paths using rule guidance to augment the action space for the RL agent; this strategy effectively reduces the sparsity of KGs, thus increasing path search efficiency. Secondly, an iterative optimization of rule induction and fact inference is designed to incorporate global information from KGs to guide the RL agent's exploration; this optimization iteratively improves overall training performance. We further evaluated the SparKGR model through different tasks on five real-world datasets extracted from Freebase, Wikidata and NELL. The experimental results indicate that SparKGR outperforms state-of-the-art baseline models without losing interpretability.

16.
One strategy to recognize nested entities is to enumerate overlapped entity spans for classification. However, current models independently verify every entity span, which ignores the semantic dependency between spans. In this paper, we first propose a planarized sentence representation to represent nested named entities. Then, a bi-directional two-dimensional recurrent operation is implemented to learn semantic dependencies between spans. Our method is evaluated on seven public datasets for named entity recognition, where it achieves competitive performance. The experimental results show that our method is effective at resolving nested named entities and learning semantic dependencies between them.
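The span-enumeration strategy mentioned at the start of this abstract can be sketched as below. It covers only the enumeration baseline, not the planarized representation or the two-dimensional recurrent operation; the function name `enumerate_spans` and the `max_len` cutoff are assumptions made for the example.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end exclusive, of at most max_len tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


tokens = ["Bank", "of", "China", "Tower"]
spans = enumerate_spans(tokens, max_len=4)

# Nested candidates appear as separate spans: "China" lies inside
# "Bank of China", which lies inside "Bank of China Tower".
for start, end in [(2, 3), (0, 3), (0, 4)]:
    print((start, end), " ".join(tokens[start:end]))
```

Each span would then be classified on its own, which is the independence between spans that the abstract criticizes.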

17.
We examine patent licensing business models of non-practicing entities that generate revenue by selling, licensing, or litigating patents. They may also pursue R&D activities, invent new technologies, or provide services to inventors or product companies. We describe their business models and patent market behavior and then compare their litigation strategies against product companies using a matched sample of highly comparable patents. The main differences among patent licensing firms stem from their technological capabilities, patent portfolio sizes, and external relationships. We find that licensing firms with technological capabilities often pursue litigation until decision and engage in forum shopping. In contrast, litigation incidence, parties involved, and outcomes are primarily determined by patent characteristics, not entity types. Licensing business models drive the acquisition of certain types of patents that influence the outcomes of the patent system. We argue that patent policy should strengthen mechanisms to discover invention quality rather than focus on the amount of litigation or types of entities.

18.
Among existing knowledge graph based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid this enormous cost of labeling on large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning path: (1) incorrectly reasoned relations, even when the final answer entity may be correct; (2) correctly reasoned relations in a wrong order, which leads to an incorrect answer entity. To address these issues, this paper considers the multi-hop KGQA task as a Markov decision process, and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each single reasoning hop is constructed with regard to both the fitness of the taken action and the evaluation of the unreasoned information remaining in the updated question embeddings. In addition, to lead the agent to the answer entity along the correct reasoning path, an evaluation network is designed to evaluate the taken action in each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms the state-of-the-art approaches in terms of answering accuracy.
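The idea of integrating terminal and instant rewards can be illustrated with the sketch below. It is not the RIPE reward function: the discounted sum, the `gamma` and `terminal_weight` parameters, and the 0/1 terminal reward are assumptions chosen to show how per-hop rewards densify an otherwise sparse signal.

```python
from typing import List


def integrated_return(
    instant_rewards: List[float],
    reached_answer: bool,
    gamma: float = 0.9,
    terminal_weight: float = 1.0,
) -> float:
    """Discounted sum of per-hop instant rewards plus a terminal reward.

    Assumptions: the terminal reward is 1.0 when the final entity is the
    answer and 0.0 otherwise, and it is discounted like one extra step.
    """
    total = 0.0
    for hop, reward in enumerate(instant_rewards):
        total += (gamma ** hop) * reward
    terminal = 1.0 if reached_answer else 0.0
    total += terminal_weight * (gamma ** len(instant_rewards)) * terminal
    return total


# With the terminal reward alone, a failed two-hop path gives no signal at all.
sparse_only = integrated_return([0.0, 0.0], reached_answer=False, gamma=1.0)
# With instant rewards, a path whose hops were partly sensible still scores.
with_instant = integrated_return([0.5, 0.5], reached_answer=False, gamma=1.0)
full_path = integrated_return([0.5, 0.5], reached_answer=True, gamma=1.0)
print(sparse_only, with_instant, full_path)
```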

19.
Entity alignment is an important task for Knowledge Graph (KG) completion, which aims to identify the same entities in different KGs. Most previous works utilize only the relation structures of KGs and ignore the heterogeneity of relations and attributes of KGs. However, this information can provide additional features and improve the accuracy of entity alignment. In this paper, we propose a novel Multi-Heterogeneous Neighborhood-Aware model (MHNA) for KG alignment. MHNA aggregates multi-heterogeneous information of aligned entities, including the entity name, relations, attributes and attribute values. An important contribution is the design of a variant attention mechanism, which adds the feature information of relations and attributes to the calculation of attention coefficients. Extensive experiments on three well-known benchmark datasets show that MHNA significantly outperforms 12 state-of-the-art approaches, demonstrating that our approach has good scalability and superiority in both cross-language and monolingual KGs. An ablation study further supports the effectiveness of our variant attention mechanism.

20.
Document-level relation extraction (RE) aims to extract relations between entities that may span multiple sentences. Existing methods mainly rely on two types of techniques: pre-trained language models (PLMs) and reasoning skills. Although various reasoning methods have been proposed, how to elicit learnt factual knowledge from PLMs for better reasoning ability has not yet been explored. In this paper, we propose a novel Collective Prompt Tuning with Relation Inference (CPT-RI) method for document-level RE that improves upon existing models in two respects. First, considering the long input and various templates, we adopt a collective prompt tuning method, which is an update-and-reuse strategy. A generic prompt is first encoded and then updated with exact entity pairs for relation-specific prompts. Second, we introduce a relation inference module to conduct global reasoning over all relation prompts via constrained semantic segmentation. Extensive experiments on two publicly available benchmark datasets demonstrate the effectiveness of our proposed CPT-RI as compared to the baseline model (ATLOP (Zhou et al., 2021)), improving the F1 score by 0.57% on the DocRED dataset, 2.20% on the CDR dataset, and 2.30% on the GDA dataset. In addition, further ablation studies verify the effects of the collective prompt tuning and relation inference.
