Similar Literature
 Found 18 similar articles; search took 258 ms
1.
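An automatic question answering system is developed for frequently asked question sets in the book publishing domain, focusing on question indexing and retrieval. For indexing, questions are indexed by combining word segmentation, part-of-speech tagging, and shallow semantic analysis; for retrieval, question similarity is computed using a feature vector space together with semantic classes. Finally, the system is implemented.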

2.
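Guo Haihong, Li Jiao, Dai Tao. 情报工程 (Technology Intelligence Engineering), 2016, 2(6): 039-049
This paper aims to build a classification method for Chinese health questions and, by manually classifying and annotating hypertension-related health questions, to analyze the public's hypertension-related health information needs while providing a corpus for developing intelligent Chinese question answering systems on hypertension. Based on clinical question classification and a hierarchical model of consumer health information query scenarios, the study constructs a four-level topic classification method for Chinese health questions. Five annotators independently annotated 2,000 samples randomly drawn from nearly 100,000 hypertension-related questions collected from a Chinese health website, in order to refine the classification method and test its reliability, build an annotated corpus, and analyze the public's hypertension-related health information needs. The inter-rater reliability (kappa) of the five annotators' independent annotations at the fourth-level categories was 0.63, indicating that the classification results are reliable; the first-level categories reached high agreement (kappa = 0.82), slightly better than comparable international studies. Questions in the six first-level categories of treatment, diagnosis, healthy lifestyle, clinical findings/condition management, epidemiology, and choice of health care provider accounted for 48.1%, 23.8%, 11.9%, 5.2%, 9.0%, and 1.9% of the sample, respectively. The resulting classification method can be used to organize large health question sets and improve retrieval efficiency; the annotated sample questions can serve as a corpus for research on automatic classification of hypertension-related health questions; and the resulting topic distribution can guide knowledge resource development for health websites. In addition, the way the classification method was constructed, the corpus annotation workflow, and the inter-rater reliability measurement can serve as a reference for user question classification and corpus construction in open and other restricted domains.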

3.
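Research on Question Processing for a Chinese FrameNet Question Answering System   Cited by: 1 (self-citations: 0, citations by others: 1)
Question processing is the first problem a question answering system must address. The Chinese FrameNet question answering system takes the Chinese FrameNet ontology as its basis and the legal domain as its object of study, investigating question processing and exploring new QA system design techniques to meet users' need for accurate information retrieval. The paper represents the syntactic relations of a query question with dependency relations and matches the query against templates in a question template library to determine its valence pattern, thereby annotating the query with frame semantics and laying the groundwork for the design of a QA-based frame semantic retrieval system.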

4.
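Research on Question Processing in a Question Answering System for Farmers*   Cited by: 1 (self-citations: 0, citations by others: 1)
To make it easier for farmers to obtain information, the article focuses on developing a question answering system for farmers and proposes that the system consist of four modules: knowledge base construction, question processing, information retrieval, and answer extraction, with question processing as the research focus. After summarizing the characteristics of farmers' questions, it proposes a question classification method based on interrogative words and phrases. During question processing it removes politeness words and builds "special rule tables" for informal interrogative words and for questions with no interrogative word, so that questions can be classified and keywords extracted effectively. It also expands keywords with a purpose-built "synonym expansion vocabulary" and assigns different weight baselines, laying the groundwork for the information retrieval module.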

5.
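With the arrival of the big data era, question answering systems have become an effective means of obtaining information. Question classification, a key component of such systems, directly affects system performance. Current question classification research concentrates on modern Chinese, and studies on classifying questions about Chinese classics remain rare. Starting from the concept of question classification, this paper builds a question classification corpus for Chinese classics, designs a question classification scheme for them, and compares the question classification performance of support vector machines, recurrent neural networks, long short-term memory networks, bidirectional long short-term memory networks, and BERT. The experiments show that BERT classifies questions better than support vector machines and traditional deep learning models, reaching an F1 of 95.55% on the proposed classification scheme; BERT thus has an advantage in question classification for Chinese classics and has value for wider adoption and application.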

6.
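Design and Implementation of Question Analysis for a Chinese FrameNet Question Answering System   Cited by: 1 (self-citations: 0, citations by others: 1)
Using the principles of frame semantics, a semantic frame oriented to question analysis, the Q frame, is constructed, and semantic analysis of questions is implemented on that basis. An approach to question analysis design is proposed from the perspective of semantic rules: target words for different question types are determined from the dependency syntax tree, pattern matching is used to perform Q-frame-based semantic analysis, frame semantic annotation of the question is completed through mapping, and the question focus and question type are finally determined.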

7.
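The article introduces automatic question answering systems, which automatically find answers to questions in large volumes of data. It gives the definition of an automatic question answering system; analyzes the state of development, the classification of such systems, and how they differ from traditional information retrieval; focuses on the techniques they use; and finally designs an implementation model for an automatic question answering system using shallow syntactic analysis, named entity extraction, and passage segmentation and ranking.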

8.
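Research on an Automatic Question Answering System for Frequently Asked Question Sets Based on a Word Co-occurrence Model   Cited by: 1 (self-citations: 0, citations by others: 1)
Adding an auxiliary module based on frequently asked questions (FAQ) to an automatic question answering system is an effective way to answer common questions. The key problem is comparing the similarity between the user's question and the questions in the FAQ, finding the most similar FAQ question, and returning its answer. This paper introduces a word co-occurrence model into question similarity matching, using mutual information to construct co-occurring words and combining information such as the number of related keywords and question length to compute the similarity between questions. Experimental results show that the FAQ question answering system with the word co-occurrence model achieves high accuracy and fast response.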

9.
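Research on a Chinese Question Answering System Model   Cited by: 4 (self-citations: 0, citations by others: 4)
Zhang Liang, Huang Heyan, Hu Chunling. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2006, 25(2): 197-201
Question answering is an advanced form of information retrieval and a focus and hot topic of research in the field. This paper analyzes fairly comprehensively the key technologies and knowledge resource platforms involved in Chinese question answering systems, proposes a complete processing model for a Chinese question answering system, clearly describes the system's operating mechanism and processing flow, and finally discusses in detail two key algorithms: the formal expansion algorithm and the answer extraction algorithm.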

10.
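Sentence similarity computation is an important theoretical foundation and key implementation technique for automatic question answering systems. Many sentence similarity methods are currently used in Chinese automatic question answering, but the lack of systematic analysis makes them inconvenient for researchers. According to the feature information they use, these methods can be divided into four categories: keyword-based, semantics-based, syntactic-structure-based, and multi-information-based. By comparing the experimental results of each category, the paper identifies their respective strengths and weaknesses. It also points out that multi-information methods are the current mainstream and that finding the optimal weighting of different feature information is the research priority for this category. In addition, it offers a view on the concept of similarity: Chinese automatic question answering systems in fact rely on sentence relatedness rather than sentence similarity. The study is intended as a reference for research on sentence similarity computation in Chinese automatic question answering.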

11.
Automatic question answering using the web: Beyond the Factoid   Cited by: 4 (self-citations: 0, citations by others: 4)
In this paper we describe and evaluate a Question Answering (QA) system that goes beyond answering factoid questions. Our approach to QA assumes no restrictions on the type of questions that are handled, and no assumption that the answers to be provided are factoids. We present an unsupervised approach for collecting question and answer pairs from FAQ pages, which we use to collect a corpus of 1 million question/answer pairs from FAQ pages available on the Web. This corpus is used to train various statistical models employed by our QA system: a statistical chunker used to transform a natural language-posed question into a phrase-based query to be submitted for exact match to an off-the-shelf search engine; an answer/question translation model, used to assess the likelihood that a proposed answer is indeed an answer to the posed question; and an answer language model, used to assess the likelihood that a proposed answer is a well-formed answer. We evaluate our QA system in a modular fashion, by comparing the performance of baseline algorithms against our proposed algorithms for various modules in our QA system. The evaluation shows that our system achieves reasonable performance in terms of answer accuracy for a large variety of complex, non-factoid questions.

12.
Background: Question-answering systems (or QA Systems) stand as a new alternative for Information Retrieval Systems. Most users frequently need to retrieve specific information about a factual question rather than obtain a whole document. Objectives: The study evaluates the efficiency of QA systems as terminological sources for physicians, specialised translators and users in general. It assesses the performance of one open-domain QA system, START, and one restricted-domain QA system, MedQA. Method: The study collected two hundred definitional questions (What is…?), either general or specialised, from the health website WebMD. Sources used by the open-domain QA system, START, and the restricted-domain QA system, MedQA, were studied to retrieve answers, and later a range of evaluation measures (precision, Mean Reciprocal Rank, Total Reciprocal Rank, First Hit Success) were applied to mark the quality of answers. Results: It was established that both systems are useful in the retrieval of valid definitional healthcare information, with an acceptable degree of coherent and precise responses from both. The answers supplied by MedQA were more reliable than those of START in the sense that they came from specialised clinical or academic sources, most of them showing links to further research articles. Conclusions: Results obtained show the potential of this type of tool in the more general realm of information access, and the retrieval of health information. They may be considered a good, reliable and reasonably precise alternative in alleviating the information overload. Both QA systems can help professionals and users obtain healthcare information.
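As an illustration of three of the rank-based measures named in the abstract above, the following is a minimal sketch using their commonly cited definitions. The abstract does not give the study's exact formulas, so the definitions, the function names, and the boolean "judgement list" input format (one True/False per returned answer, in rank order) are all assumptions made for this example.

```python
from typing import Sequence


def reciprocal_rank(judgements: Sequence[bool]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, is_correct in enumerate(judgements, start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def total_reciprocal_rank(judgements: Sequence[bool]) -> float:
    """Sum 1/rank over every correct answer (assumed definition of TRR)."""
    return float(
        sum(1.0 / rank for rank, ok in enumerate(judgements, start=1) if ok)
    )


def first_hit_success(judgements: Sequence[bool]) -> float:
    """Return 1.0 if the top-ranked answer is correct, else 0.0 (assumed FHS)."""
    return 1.0 if len(judgements) > 0 and judgements[0] else 0.0


def mean_reciprocal_rank(per_question: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions (0.0 for no questions)."""
    if not per_question:
        return 0.0
    return sum(reciprocal_rank(j) for j in per_question) / len(per_question)


# Hypothetical judgements for three questions; not data from the study.
example_runs = [
    [False, True, True],    # first correct answer at rank 2
    [True, False, False],   # first correct answer at rank 1
    [False, False, False],  # no correct answer returned
]
```

Under these definitions, `mean_reciprocal_rank(example_runs)` averages the per-question values 1/2, 1 and 0, while `total_reciprocal_rank` rewards a system for every correct answer in the list rather than only the first.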

13.
Given a user question, the goal of a Question Answering (QA) system is to retrieve answers rather than full documents or even best-matching passages, as most Information Retrieval systems currently do. In this paper, we present BRUJA, a QA system for the management of multilingual collections. BRUJA works with three languages (English, Spanish and French). The BRUJA architecture is not formed with three monolingual QA systems but instead uses English as an interlingua to perform the usual QA tasks such as question classification and answer extraction. In addition, BRUJA uses Cross Language Information Retrieval (CLIR) techniques to retrieve relevant documents from a multilingual collection. On the one hand, we have more documents to find answers from but on the other hand, we are introducing noise into the system because of translations to the interlingua (English) and the CLIR module. The question is whether the difficulty of managing three languages is worth it or whether a monolingual QA system delivers better results. We report on in-depth experimentation and demonstrate that our multilingual QA system gets better results than its monolingual counterpart whenever it uses good translation resources and, especially, CLIR techniques that are state-of-the-art.

14.
This article presents FIDJI, a question-answering (QA) system for French. FIDJI combines syntactic information with traditional QA techniques such as named entity recognition and term weighting; it does not require any pre-processing other than classical search engine indexing. Among other uses of syntax, we experiment in this system with the validation of answers through different documents, as well as specific techniques for answering different types of questions (e.g., yes/no or list questions). We present several experiments which show the benefits of syntactic analysis, as well as multi-document validation. Different types of questions and corpora are tested, and specificities are commented on. Links with result aggregation are also discussed.

15.
16.
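The paper questions the rule on classifying works of "war history" in the Chinese Library Classification (中图法), 4th Edition: User Manual, argues that the rule is one-sided and should be revised, analyzes and argues the issue, and offers the author's own view.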

17.
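[Purpose/Significance] To build a quality evaluation index system for user-generated answers in social Q&A communities, enabling automated, user-need-oriented evaluation and filtering of answer quality and improving the quality of knowledge services in such communities. [Method/Process] Social emotion features and user features are introduced, and factor analysis and structural equation modeling are used to empirically construct the index system. An automated answer quality evaluation method is designed on a GA-BP neural network model. Finally, data from the Zhihu (知乎) website are used to apply the index system and the automated method. [Results/Conclusions] An index system is built with five dimensions: answer text features, answerer features, timeliness features, user features, and social emotion features. Experimental analysis finds that the GA-BP-based automated evaluation method has higher accuracy and lower mean error than other methods, is feasible and effective, and can be further applied and promoted in practice.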

18.
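Research on Methods for Mapping Chinese Questions to RDF Triples   Cited by: 1 (self-citations: 0, citations by others: 1)
The paper explores how to convert Chinese questions into RDF triples. It first analyzes the characteristics of Chinese questions, then, drawing on the strengths of the RDF(S) model, explores the correspondence between RDF triples and question semantics, and proposes two mapping modes, direct and indirect. The method needs only shallow syntactic analysis: the qualifying constituents obtained are mapped to semantic labels inside the triple, which reduces the difficulty of syntactic analysis and triple assembly. Finally, it analyzes the problems remaining in the mapping method and identifies priorities for future work.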
