Similar Literature
 Found 20 similar documents; search took 218 ms
1.
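乔林, 糜仲春, 刘亮, 张群. 《情报学报》, 2006, 25(4): 420-427
When retrieving literature, especially with multiple keywords, the result rankings produced by existing scientific search engines have certain problems. This paper derives a relevance formula for weighted two-keyword literature retrieval, proposes on that basis a combined weighted retrieval method for multiple keywords, and combines it with each document's average annual citation frequency to produce a ranking score that reflects both relevance and quality. Finally, a worked example verifies the effectiveness of the method.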
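The abstract does not give the relevance formula or the way it is combined with the citation indicator, so the following is only a minimal sketch of the general idea under stated assumptions. The weighted keyword-coverage relevance, the linear blend, the `alpha` parameter, and all names (`Paper`, `weighted_relevance`, `annual_citations`, `rank`) are hypothetical and are not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    text: str
    year: int
    citations: int


def weighted_relevance(text: str, weights: dict[str, float]) -> float:
    """Weighted share of query keywords that occur in the text (0..1).

    Assumption: a simple coverage measure stands in for the paper's formula.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    lowered = text.lower()
    hit = sum(w for kw, w in weights.items() if kw.lower() in lowered)
    return hit / total


def annual_citations(paper: Paper, current_year: int) -> float:
    """Average citations per year since publication (age is at least one year)."""
    age = max(current_year - paper.year, 1)
    return paper.citations / age


def rank(papers: list[Paper], weights: dict[str, float],
         current_year: int, alpha: float = 0.7) -> list[tuple[float, Paper]]:
    """Sort by alpha * relevance + (1 - alpha) * normalized citation rate.

    Assumption: a linear blend; alpha is a hypothetical tuning parameter.
    """
    rates = [annual_citations(p, current_year) for p in papers]
    top = max(rates, default=0.0) or 1.0  # avoid division by zero
    scored = [
        (alpha * weighted_relevance(p.text, weights) + (1 - alpha) * r / top, p)
        for p, r in zip(papers, rates)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Normalizing the citation rate by the largest rate in the result set keeps both terms on a 0..1 scale, so `alpha` directly expresses how much weight relevance receives relative to quality.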

2.
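Traditional search engines usually work by crawling and analyzing full-text keywords, which leads to three major shortcomings: low precision caused by the lack of semantic description; low retrieval efficiency caused by redundant and ambiguous results; and too few access points for retrieval. Drawing on the advantages of DC (Dublin Core) metadata for describing web resources, the research group designed DCSE, a web search engine system based on DC metadata, in an attempt to overcome these shortcomings. DCSE automatically crawls web pages that contain DC descriptions, stores the DC description information in a database, and makes it available for user searches after sorting and indexing. The search interface offers multi-field Boolean combination search with the 15 DC elements as search fields, and results are displayed using the content of each DC element, such as title, creator, description, and date. Users improve precision through multi-field combined search and, thanks to the clear result display, can quickly judge and select the information they need, which improves retrieval efficiency.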

3.
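Research on Multi-Keyword Weighted Retrieval Fusion Based on Word Order   Total citations: 1, self-citations: 0, citations by others: 1
 Analyzes three multi-keyword weighting methods for metasearch reported in China and abroad and points out that they ignore word order; proposes a comprehensive multi-keyword weighting method that incorporates word-order features, a significant improvement over existing weighting methods; and studies a metasearch result fusion method based on D-S theory. Experiments show that the method markedly improves retrieval performance.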

4.
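Through a literature survey, this paper summarizes and analyzes foreign literature from 2002 to 2007 on search engine retrieval results. It addresses three aspects (coverage, overlap, and relevance) and reviews: the evaluation of result coverage, the factors that influence it, and methods for improving it; the evaluation of result overlap; and the evaluation of result relevance, the influence of page ranking algorithms and text analysis methods on relevance and their role in improving it, and other factors that influence relevance. The aim is to understand research trends abroad and to provide a reference for follow-up research in China.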

5.
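An Exploration of Improving Keyword Search in Search Engines   Total citations: 7, self-citations: 0, citations by others: 7
Keyword search is one of the two basic retrieval functions of search engines. After briefly describing and analyzing the measures search engines take to strengthen keyword search, the article focuses on how to organize indexes using the principles and methods of integrated classification and subject indexing in order to improve keyword search in Chinese search engines.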

6.
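The MetaCrawler Metasearch Engine   Total citations: 5, self-citations: 0, citations by others: 5
This paper describes the features, search methods, and result output of MetaCrawler, a metasearch engine. It concludes that MetaCrawler is an easy-to-use, powerful online retrieval tool and an ideal choice for searchers seeking information on the Web.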

7.
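Techniques for Organizing Search Engine Results   Total citations: 9, self-citations: 0, citations by others: 9
赵荣, 黄燕云, 张露. 《情报学报》, 2004, 23(1): 69-72
This paper comprehensively analyzes the principles and applications of several major techniques for ranking and organizing search engine results, including keyword frequency and position, web page link-level algorithms, and the organization of results by category.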

8.
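Application of Descriptors in the Network Environment   Total citations: 1, self-citations: 0, citations by others: 1
Describes three modes of applying descriptors in the network environment: indexing and searching directly with descriptors; expanding search queries in keyword-based search engines; and using descriptors to achieve compatibility and interconversion between different thesauri or classification schemes, so as to support cross-searching.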

9.
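To make it easier for users to browse the results returned by search engines, this paper proposes a new TF-IDF-based method for computing text similarity, together with a strategy of multi-level text clustering using an incremental clustering algorithm with approximately linear time complexity. It also proposes a strategy for extracting keywords from multiple texts: nouns or noun phrases in a cluster are taken as candidate keywords; a weighting function that jointly considers each candidate's term frequency, position of occurrence, length, and the text length computes its weight; and the candidates with the highest weights are automatically extracted as category keywords, with no manual intervention and no supporting corpus. Experimental results on collected Baidu and ODP corpora and on public tests demonstrate the effectiveness of the proposed method.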
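The abstract does not spell out its new similarity measure or its clustering algorithm. As a rough illustration of the general pipeline only, the sketch below uses the standard TF-IDF cosine similarity and a plain single-pass incremental clustering, which are stand-ins and not the authors' method. The smoothed IDF, the `threshold` parameter, and all function names are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Build sparse TF-IDF vectors from tokenized documents (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        vectors.append({
            t: (c / length) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(vectors: list[dict[str, float]],
                        threshold: float = 0.2) -> list[list[int]]:
    """Assign each document to the most similar cluster, or open a new one.

    Each document is visited once, so the cost is roughly linear in the
    number of documents when the number of clusters stays small.
    """
    clusters: list[list[int]] = []
    centroids: list[dict[str, float]] = []  # summed member vectors
    for i, vec in enumerate(vectors):
        best, best_sim = -1, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best < 0:
            clusters.append([i])
            centroids.append(dict(vec))
        else:
            clusters[best].append(i)
            for t, w in vec.items():
                centroids[best][t] = centroids[best].get(t, 0.0) + w
    return clusters
```

Because cosine similarity ignores vector length, keeping the centroid as an unnormalized sum of member vectors gives the same comparison as using the mean, and it makes the incremental update a simple addition.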

10.
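On Keyword Search in Chinese Search Engines   Total citations: 2, self-citations: 0, citations by others: 2
Briefly analyzes the concept, classification, and working principles of web search engines. It identifies how keywords function in Chinese search engine retrieval and the common reasons keyword searches fail. Finally, it discusses optimization strategies for keyword search in Chinese search engines.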

11.
The authors of this paper investigated the impact of the advanced search features of three common search engines on retrieval result performance: Yahoo, Google, and Live Search. The authors analyzed 240 search queries with different information need emphases to determine retrieval effectiveness differences among regular search, title search, exact phrase search, and PDF file format restriction search. A one-way ANOVA method and regression analysis method were used for the study. It was found that the PDF file format restriction search achieved the best retrieval performance among Yahoo, Google, and Live Search. The regular search achieved the best web page ranking performance among Yahoo, Google, and Live Search. The findings of this study can be used to assist users in formulating an appropriate search strategy to improve search effectiveness, and to shed light on how search engines react to different types of search features in terms of retrieval effectiveness.
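The abstract names one-way ANOVA as an analysis method but does not report the data or the test statistics. As a reminder of what that test computes, here is a minimal sketch of the F statistic in plain Python. The function name is hypothetical, and mapping each group to one search type (regular, title, exact phrase, PDF restriction) is only an assumed reading of the study design.

```python
from statistics import mean


def one_way_anova_f(groups: list[list[float]]) -> float:
    """Return the F statistic of a one-way ANOVA over the given groups.

    F = (between-group sum of squares / (k - 1))
        / (within-group sum of squares / (N - k))
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```

A large F value means the group means differ by more than the spread inside each group would suggest; turning F into a p-value additionally requires the F distribution with (k - 1, N - k) degrees of freedom, which is not shown here.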

12.
The authors of this paper investigated the impact of the advanced search features of three common search engines on retrieval result performance: Yahoo, Google, and Live Search. The authors analyzed 240 search queries with different information need emphases to determine retrieval effectiveness differences among regular search, title search, exact phrase search, and PDF file format restriction search. A one-way ANOVA method and regression analysis method were used for the study. It was found that the PDF file format restriction search achieved the best retrieval performance among Yahoo, Google, and Live Search. The regular search achieved the best web page ranking performance among Yahoo, Google, and Live Search. The findings of this study can be used to assist users in formulating an appropriate search strategy to improve search effectiveness, and to shed light on how search engines react to different types of search features in terms of retrieval effectiveness.

13.
Measuring Search Engine Quality   Total citations: 12, self-citations: 3, citations by others: 9
The effectiveness of twenty public search engines is evaluated using TREC-inspired methods and a set of 54 queries taken from real Web search logs. The World Wide Web is taken as the test collection and a combination of crawler and text retrieval system is evaluated. The engines are compared on a range of measures derivable from binary relevance judgments of the first seven live results returned. Statistical testing reveals a significant difference between engines and high intercorrelations between measures. Surprisingly, given the dynamic nature of the Web and the time elapsed, there is also a high correlation between results of this study and a previous study by Gordon and Pathak. For nearly all engines, there is a gradual decline in precision at increasing cutoff after some initial fluctuation. Performance of the engines as a group is found to be inferior to the group of participants in the TREC-8 Large Web task, although the best engines approach the median of those systems. Shortcomings of current Web search evaluation methodology are identified and recommendations are made for future improvements. In particular, the present study and its predecessors deal with queries which are assumed to derive from a need to find a selection of documents relevant to a topic. By contrast, real Web search reflects a range of other information need types which require different judging and different measures.
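Precision at a cutoff is one measure that can be derived from binary relevance judgments of the first seven results. The sketch below shows one way to compute it and to average it across queries. The function names are hypothetical, and treating missing results as non-relevant (dividing by `k` even when fewer than `k` results were returned) is an assumption, not necessarily the convention the study used.

```python
def precision_at_k(judgments: list[bool], k: int) -> float:
    """Fraction of the top-k results judged relevant.

    Assumption: if fewer than k results were returned, the missing
    positions count as non-relevant, so the divisor is always k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(judgments[:k]) / k


def mean_precision_curve(runs: list[list[bool]], max_k: int = 7) -> list[float]:
    """Mean precision over all queries at each cutoff 1..max_k."""
    if not runs:
        raise ValueError("need at least one query")
    return [
        sum(precision_at_k(run, k) for run in runs) / len(runs)
        for k in range(1, max_k + 1)
    ]
```

Plotting the returned list against the cutoff gives the kind of curve on which a decline in precision at increasing cutoff would be visible.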

14.
Empirical modeling of the score distributions associated with retrieved documents is an essential task for many retrieval applications. In this work, we propose modeling the relevant documents’ scores by a mixture of Gaussians and the non-relevant scores by a Gamma distribution. Applying Variational Bayes we automatically trade off the goodness-of-fit with the complexity of the model. We test our model on traditional retrieval functions and actual search engines submitted to TREC. We demonstrate the utility of our model in inferring precision-recall curves. In all experiments our model outperforms the dominant exponential-Gaussian model.
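To make the model shape concrete, the sketch below evaluates a Gaussian-mixture density for relevant scores and a Gamma density for non-relevant scores, then applies Bayes' rule to get the probability that a document with a given score is relevant. It does not perform the Variational Bayes fitting described in the abstract; the parameters are assumed to be already known, and the function names and the `prior` argument are hypothetical.

```python
import math


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gamma_pdf(x: float, shape: float, scale: float) -> float:
    """Density of a Gamma distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))


def relevant_density(score: float,
                     components: list[tuple[float, float, float]]) -> float:
    """Mixture of Gaussians; each component is (weight, mu, sigma)."""
    return sum(w * gaussian_pdf(score, mu, sigma) for w, mu, sigma in components)


def posterior_relevance(score: float,
                        components: list[tuple[float, float, float]],
                        shape: float, scale: float, prior: float) -> float:
    """P(relevant | score) by Bayes' rule, given the prior share of relevant docs."""
    rel = prior * relevant_density(score, components)
    non = (1.0 - prior) * gamma_pdf(score, shape, scale)
    if rel + non == 0:
        return 0.0
    return rel / (rel + non)
```

Integrating the two weighted densities above a score threshold gives the expected numbers of relevant and non-relevant documents retrieved at that threshold, which is one route from such a model to an inferred precision-recall curve.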

15.
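A Study of Retrieval Languages for Chinese Search Engines   Total citations: 1, self-citations: 0, citations by others: 1
Chinese search engines largely meet users' needs for retrieving Chinese-language network information resources, but problems remain, such as unsatisfactory retrieval results. This paper analyzes existing Chinese search engines from the perspective of retrieval languages and argues that applying the principles and methods of retrieval languages to Chinese search engines is bound to greatly improve their retrieval efficiency.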

16.
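Building on an analysis of existing related research, this paper designs JITIRA, a session-management-based Web just-in-time information retrieval agent. The agent processes the queries a user submits to a search engine or the new web pages the user opens, constructs new queries on the fly on that basis, and submits them to the search engine on the user's behalf. Experiments show that the method helps improve precision.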

17.
Summary

This study evaluates how well eight major search engines produced answers to twenty-one real reference questions and five made-up subject questions. The retrieval and relevancy-ranking abilities of search engines were measured by precision, duplicate, most-relevant-item score, and relevancy-ranking score. Search engines did not produce good results for the reference questions, but did well with the subject questions. T-tests found the two types of questions quite different in nature, so the best engines were identified by the type of questions. Open Text was the best in handling the reference questions, and InfoSeek was the best at answering subject questions.

18.
Despite a clear improvement of temporal search and retrieval applications, current search engines are still mostly unaware of the temporal dimension. Indeed, in most cases, systems are limited to offering the user the chance to restrict the search to a particular time period or to simply rely on an explicitly specified time span. If the user is not explicit in his/her search intents (e.g., “philip seymour hoffman”), search engines are likely to fail to present an overall historic perspective of the topic. In most such cases, they are limited to retrieving the most recent results. One possible solution to this shortcoming is to understand the different time periods of the query. In this context, most state-of-the-art methodologies consider any occurrence of temporal expressions in web documents and other web data as equally relevant to an implicit time sensitive query. To approach this problem in a more adequate manner, we propose in this paper the detection of temporal expressions relevant to the query. Unlike previous metadata and query log-based approaches, we show how to achieve this goal based on information extracted from document content. However, instead of simply focusing on the detection of the most obvious date, we are also interested in retrieving the set of dates that are relevant to the query. Towards this goal, we define a general similarity measure that makes use of co-occurrences of words and years based on corpus statistics and a classification methodology that is able to identify the set of top relevant dates for a given implicit time sensitive query, while filtering out the non-relevant ones. Through extensive experimental evaluation, we mean to demonstrate that our approach offers promising results in the field of temporal information retrieval (T-IR), as demonstrated by the experiments conducted over several baselines on web corpora collections.

19.
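A Summary of Recent Advances in Search Engines   Total citations: 1, self-citations: 0, citations by others: 1
Search engines have become the most important tool for using the Internet. Some novel and innovative search engines have recently appeared on the Web, or new search engine functions have been uncovered, and some of these research results directly represent the direction in which search engines are developing. Through tracking, trial use, and analysis, the article summarizes new search engines and search engines with new functions, to help readers better understand the current state of search engines worldwide.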

20.
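Case-based teaching has played an important role in literature retrieval courses. However, with the maturing of natural language processing in third-generation search engines and the trend toward intelligent database retrieval, some teaching cases have become ineffective to a certain degree. The article lists examples of this phenomenon and analyzes its causes, with the aim of offering ideas and suggestions for improving the teaching of literature retrieval courses.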
