Found 18 similar documents (search time: 150 ms)
1.
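A full-text retrieval system consists of three functional modules: indexing, retrieval, and storage. This paper analyzes the system architecture and the main technical difficulties: designing the XML database, building the inverted index file, and Chinese word segmentation. On this basis, it builds a Lucene/XML-based full-text retrieval system for journal literature.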
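The inverted index file mentioned in this abstract can be illustrated with a minimal sketch. It is plain Python rather than Lucene, and it assumes overlapping character bigrams as a crude stand-in for Chinese word segmentation; the function names and sample documents are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def bigrams(text):
    """Split text into overlapping two-character tokens.

    Assumption: bigrams replace a real Chinese word segmenter here.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def build_inverted_index(docs):
    """Map each token to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in bigrams(text):
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every bigram of the query."""
    tokens = bigrams(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], []))
    for token in tokens[1:]:
        result &= set(index.get(token, []))
    return sorted(result)


# Hypothetical sample documents, for illustration only.
docs = {1: "全文检索系统", 2: "中文分词技术", 3: "全文数据库检索"}
index = build_inverted_index(docs)
print(search(index, "全文"))
print(search(index, "数据库"))
```

Because a multi-character query is matched as a conjunction of bigrams, this sketch can return documents where the bigrams occur in different places; a real system would also store positions.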
2.
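Design and Implementation of an XML-Based Full-Text Retrieval Prototype System   Total citations: 1 (self-citations: 0; citations by others: 1)
To address the slow indexing, delayed updates, and low retrieval efficiency of the search engines currently used on organizational websites, this paper builds an XML-based full-text retrieval prototype system, following an in-depth analysis of the strengths of Lucene and XML for building search engines. The system uses XML as a common data interface and Lucene as the implementation platform, enabling fast, timely indexing and more efficient retrieval.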
3.
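Research and Implementation of a Full-Text Database Based on the Lucene Retrieval Engine   Total citations: 15 (self-citations: 0; citations by others: 15)
Building on an analysis of database technologies, this paper compares the characteristics and key issues of full-text databases. It introduces Lucene, a search-engine toolkit for full-text retrieval, applies it to a concrete application, and studies indexing and search techniques for full-text databases. It proposes and implements a way of organizing a full-text database that needs no back-end database. The results show that both indexing and search are efficient in time and space.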
4.
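Application of Lucene Full-Text Retrieval and a Study of Retrieval Efficiency Testing   Total citations: 1 (self-citations: 0; citations by others: 1)
A full-text retrieval system is designed using Lucene. It consists of three functional modules: indexing, retrieval, and storage. The second part analyzes the main technical difficulties, namely PDF data conversion, XML document design, and the tokenization, construction, and efficiency of the index. It also tests the Chinese word-segmentation analyzers, the index file expansion ratio, the factors that affect indexing, and the system's retrieval response time. The security of the XML database also deserves attention.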
5.
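Implementing XML Indexing on a Traditional Text Retrieval System   Total citations: 3 (self-citations: 0; citations by others: 3)
As an important standard for information exchange and storage, XML has drawn growing attention from researchers. Research on XML indexing mechanisms and their implementation, an important part of XML retrieval research, has produced some results. However, most of this work builds on databases or dedicated semi-structured data managers. This paper proposes a method for building an XML index on top of Okapi, a traditional text retrieval system. It first introduces Okapi's index structure, then examines the storage structure and implementation of the XML index in depth, and evaluates the index's performance.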
6.
7.
8.
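To meet the full-text retrieval needs of a quality management information system, the authors compared database retrieval with search-engine retrieval and chose the open-source search engine DotLucene to implement the feature. The paper describes in detail the method for incremental index updates, the index-building process, and the implementation of the query Web service.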
9.
10.
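Implementing Full-Text Website Search with Java + XML   Total citations: 2 (self-citations: 0; citations by others: 2)
Describes how to implement full-text site search with Java and XML: an indexing tool written in Java indexes the web documents and writes the index to an XML document; a Servlet and JDOM then read and query the XML document and return the results to the client.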
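The two steps in this abstract (write the index to an XML document, then read and query it) can be sketched as follows. This is an illustration only: it uses Python's standard `xml.etree.ElementTree` rather than the Java, Servlet, and JDOM stack the paper describes, and the element names (`index`, `term`, `doc`), the whitespace tokenization, and the sample pages are all assumptions.

```python
import xml.etree.ElementTree as ET


def build_index_xml(pages):
    """Build an <index> element: one <term> per word, one <doc> child per page URL."""
    postings = {}
    for url, text in pages.items():
        # Assumption: lowercase whitespace tokenization is good enough for a demo.
        for word in set(text.lower().split()):
            postings.setdefault(word, []).append(url)
    root = ET.Element("index")
    for word in sorted(postings):
        term = ET.SubElement(root, "term", {"value": word})
        for url in sorted(postings[word]):
            ET.SubElement(term, "doc", {"url": url})
    return root


def query_index_xml(root, word):
    """Return the URLs listed under the <term> whose value matches the word."""
    for term in root.iter("term"):
        if term.get("value") == word.lower():
            return [doc.get("url") for doc in term.findall("doc")]
    return []


# Hypothetical pages, for illustration only.
pages = {"/a.html": "XML full text search", "/b.html": "Java servlet search"}

# Serialize the index to XML text, then parse it back as a query service would.
xml_text = ET.tostring(build_index_xml(pages), encoding="unicode")
root = ET.fromstring(xml_text)
print(query_index_xml(root, "search"))
```

Scanning every `<term>` per query is linear in the vocabulary size, which is acceptable for a small site but is one reason larger systems use a dedicated index format.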
11.
12.
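Zhang Jian (张健). 《现代图书情报技术》, 2005, 21(4): 83-85
As one of the standards of the Internet, XML documents are commonly used to describe, store, and exchange text data. This paper discusses a technical approach to storing images in XML documents, covering the XML document structure, image storage, online submission, download, and display. It describes the key technical points of each step and gives ASP.NET program code. The pure-XML, ASP.NET-based image management technique discussed here needs no database driver and is easy to implement.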
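One common way to keep binary image data inside an XML document is to Base64-encode it as element text. The sketch below assumes that approach; the abstract does not state which encoding the paper uses, the paper's own code is ASP.NET rather than Python, and the element and attribute names here are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def add_image(root, name, data):
    """Append an <image> element whose text is the Base64 form of the raw bytes."""
    img = ET.SubElement(root, "image", {"name": name})
    img.text = base64.b64encode(data).decode("ascii")
    return img


def get_image(root, name):
    """Return the decoded bytes of the named image, or None if it is absent."""
    for img in root.findall("image"):
        if img.get("name") == name:
            return base64.b64decode(img.text or "")
    return None


album = ET.Element("images")
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])  # arbitrary sample bytes, not a real image
add_image(album, "logo.png", payload)

# Round-trip through serialized XML, as a download or display step would.
restored = ET.fromstring(ET.tostring(album, encoding="unicode"))
print(get_image(restored, "logo.png") == payload)
```

Base64 inflates the stored size by roughly one third, so this design trades storage space for the convenience of a single text-only file.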
13.
This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and uses eXist to extract elements from those articles. For the content-only topics, we undertake a preliminary analysis of the INEX 2003 relevance assessments in order to identify the types of highly relevant document components. Further analysis identifies two complementary sub-cases of relevance assessments (General and Specific) and two categories of topics (Broad and Narrow). We develop a novel retrieval module that, for a content-only topic, uses the answer list returned by a native XML database to dynamically determine the preferable units of retrieval, which we call Coherent Retrieval Elements. Our experiments show that, when each of the three systems is evaluated against different retrieval scenarios (such as different cases of relevance assessments, different topic categories and different choices of evaluation metrics), the XML retrieval systems exhibit varying behaviour and the best performance can be reached for different values of the retrieval parameters. In the case of INEX 2003 relevance assessments for the content-only topics, our newly developed hybrid XML retrieval system is substantially more effective than either Zettair or eXist, and yields robust and very effective XML retrieval.
14.
15.
16.
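This paper uses Borland IDAPI relational database integration technology to integrate multiple relational database systems, managed with the information storage and retrieval software QUICKIMS, in order to provide full-text retrieval over relational databases. It integrates the data structures, data access methods, and data types of PC-based and SQL-based relational databases. It transfers base tables and the results of single-database or multi-database queries, generating the files and indexes that QUICKIMS requires. It provides Boolean, prefix-match, field-restricted, adjacency, and positional retrieval over the relational databases. Converting the relational data dynamically reduces wasted storage space.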
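Three of the retrieval modes listed in this abstract (Boolean, prefix-match, and adjacency) can be sketched over a small positional index. This is a generic illustration in plain Python, not the QUICKIMS or IDAPI implementation; field-restricted retrieval is left out, and all names and sample records are hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map word -> {doc_id: [positions]} so adjacency can be checked later."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def docs_with(index, word):
    """Set of document ids that contain the word."""
    return set(index.get(word.lower(), {}))


def boolean_and(index, a, b):
    return docs_with(index, a) & docs_with(index, b)


def boolean_or(index, a, b):
    return docs_with(index, a) | docs_with(index, b)


def prefix_search(index, prefix):
    """Prefix-match retrieval: any indexed word that starts with the prefix."""
    result = set()
    for word, postings in index.items():
        if word.startswith(prefix.lower()):
            result |= set(postings)
    return result


def adjacent(index, first, second):
    """Adjacency retrieval: documents where `second` immediately follows `first`."""
    first_postings = index.get(first.lower(), {})
    second_postings = index.get(second.lower(), {})
    result = set()
    for doc_id in first_postings.keys() & second_postings.keys():
        second_positions = set(second_postings[doc_id])
        if any(p + 1 in second_positions for p in first_postings[doc_id]):
            result.add(doc_id)
    return result


# Hypothetical records, for illustration only.
docs = {
    1: "full text retrieval system",
    2: "relational database full index",
    3: "text retrieval for relational data",
}
index = build_positional_index(docs)
print(sorted(boolean_and(index, "full", "text")))
print(sorted(prefix_search(index, "relat")))
print(sorted(adjacent(index, "text", "retrieval")))
```

Storing positions rather than only document ids is what makes adjacency and positional queries possible, at the cost of a larger index.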
17.
Mohamed Taher Amin Ahmed Khan Muhammed Burhanuddin. International Information and Library Review, 2013, 45(4): 337-345
This paper addresses the difficulty of retrieving precise information from large full-text repositories of magazine articles, and proposes an Extensible Markup Language (XML) vocabulary for improving retrieval rates. The hypothesis tested was that magazine articles marked up with an XML vocabulary and indexed only by selected parts give more precise search results than the same search using a full-text index. The study was exploratory: 29 magazine articles were tested, and 8 scholars were interviewed to define 23 search strategies and evaluate the results. The data showed that precision improved from 40.72% with full-text search to 62.84% using XML markup and searching only within specific labels. The library and information science community needs to revise the vocabulary and carry out more testing in order to obtain a valid vocabulary and provide more research results. The cultural characteristics and politics of the librarian and information manager community are as important as technical issues for any technical proposal to be implemented successfully and achieve interoperability.
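The comparison in this abstract rests on precision, the share of retrieved documents that are relevant, measured once for full-text search and once for search limited to selected XML labels. The sketch below shows that calculation on a toy corpus. The articles, tag names, and relevance judgment are invented for illustration and do not reproduce the study's vocabulary or its figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical marked-up articles; <title> and <body> are assumed tag names.
ARTICLES = {
    "a1": "<article><title>Solar power</title><body>Panels and storage costs.</body></article>",
    "a2": "<article><title>City budgets</title><body>Funding for solar power projects.</body></article>",
    "a3": "<article><title>Travel notes</title><body>A power outage delayed the train.</body></article>",
}


def search(articles, term, tags=None):
    """Return ids whose text contains the term; if tags is given, look only inside those elements."""
    hits = []
    for doc_id, xml_text in articles.items():
        root = ET.fromstring(xml_text)
        if tags is None:
            text = " ".join(root.itertext())
        else:
            text = " ".join("".join(el.itertext()) for tag in tags for el in root.iter(tag))
        if term.lower() in text.lower():
            hits.append(doc_id)
    return hits


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant (0.0 when nothing is retrieved)."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)


relevant = {"a1"}  # assumed relevance judgment for the query "power"
full_text_hits = search(ARTICLES, "power")
labelled_hits = search(ARTICLES, "power", tags=["title"])
print(precision(full_text_hits, relevant), precision(labelled_hits, relevant))
```

Restricting the search to a label raises precision in this toy case because incidental mentions in the body no longer match; it can also lower recall, which this sketch does not measure.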
18.
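Research and Implementation of Dynamic Indexing in Full-Text Retrieval Systems   Total citations: 6 (self-citations: 1; citations by others: 5)
Analyzes how static indexing is implemented in traditional full-text retrieval systems and discusses its advantages and disadvantages. The paper then proposes a dynamic indexing technique, explains its principle, and presents implementations on two database development platforms.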