Similar literature
1.
Relevance feedback is an effective technique for improving search accuracy in interactive information retrieval. In this paper, we study an interesting optimization problem in interactive feedback that aims at optimizing the tradeoff between presenting search results with the highest immediate utility to a user (but not necessarily most useful for collecting feedback information) and presenting search results with the best potential for collecting useful feedback information (but not necessarily the most useful documents from a user’s perspective). Optimizing such an exploration–exploitation tradeoff is key to the optimization of the overall utility of relevance feedback to a user in the entire session of relevance feedback. We formally frame this tradeoff as a problem of optimizing the diversification of search results, since relevance judgments on more diversified results have been shown to be more useful for relevance feedback. We propose a machine learning approach to adaptively optimizing the diversification of search results for each query so as to optimize the overall utility in an entire session. Experimental results on three representative retrieval test collections show that the proposed learning approach can effectively optimize the exploration–exploitation tradeoff and outperforms the traditional relevance feedback approach, which only does exploitation without exploration.
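The abstract does not say how diversification is controlled, so the following is only an illustrative sketch of the tradeoff it describes. It uses Maximal Marginal Relevance (MMR), a standard greedy re-ranker swapped in here as an assumption and not the authors' learned method. The names `mmr_rerank`, `relevance`, `similarity` and `lam` are hypothetical; `lam = 1.0` corresponds to pure exploitation (rank by relevance only), and lower values push more diverse results up.

```python
def mmr_rerank(relevance, similarity, lam, k):
    """Greedy MMR re-ranking (illustrative, not the paper's method).

    relevance:  dict mapping doc id -> relevance score (assumed given)
    similarity: function (doc_a, doc_b) -> similarity in [0, 1] (assumed given)
    lam:        weight on relevance; (1 - lam) is the weight on novelty
    k:          number of results to return
    """
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def mmr_score(doc):
            # Redundancy is the similarity to the closest already-selected doc.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        # sorted() makes tie-breaking deterministic (alphabetical by doc id).
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

A per-query learner in the spirit of the abstract would choose `lam` for each query; here it is simply passed in by the caller.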

2.
毛振鹏  胡滨  代海岩 《晋图学刊》2005,(5):23-25,39
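Establishing a quality evaluation system for search engines can guide users in web information retrieval and website search engine optimization, and can promote the continued upgrading of search engine functionality. Within such a system, the overall qualitative evaluation assesses a search engine's user comfort, degree of specialization, and degree of intelligence; the quantitative evaluation applies traditional retrieval metrics and web retrieval metrics to assess individual aspects of the search engine.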

3.
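Interactive cross-language information retrieval is an important branch of information retrieval. Building on an analysis of theoretical research on the interactive cross-language retrieval process, evaluation metrics, and progress in user behavior studies, we design a user retrieval experiment in which users take part in the whole cross-language retrieval process. The results show that: users' query terms come mainly from the titles of the search topics; users judge document relevance with fairly high accuracy; the full text of target-language documents, translated abstracts, and translated full texts are all accepted by users as bases for judgment; translation optimization, and its combination with query expansion, are very effective in an interactive setting; users remain willing to make further selections among translations after feedback; and users both need and approve of interacting with a cross-language information retrieval system. User behavior analysis helps guide the design and practice of interactive cross-language information retrieval systems.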

4.
While past research has shown that learning outcomes can be influenced by the amount of effort students invest during the learning process, there has been little research into this question for scenarios where people use search engines to learn. In fact, learning-related tasks represent a significant fraction of the time users spend using Web search, so methods for evaluating and optimizing search engines to maximize learning are likely to have broad impact. Thus, we introduce and evaluate a retrieval algorithm designed to maximize educational utility for a vocabulary learning task, in which users learn a set of important keywords for a given topic by reading representative documents on diverse aspects of the topic. Using a crowdsourced pilot study, we compare the learning outcomes of users across four conditions corresponding to rankings that optimize for different levels of keyword density. We find that adding keyword density to the retrieval objective gave significant learning gains on some topics, with higher levels of keyword density generally corresponding to more time spent reading per word, and stronger learning gains per word read. We conclude that our approach to optimizing search ranking for educational utility leads to retrieved document sets that ultimately may result in more efficient learning of important concepts.

5.
There have been a number of linear, feature-based models proposed by the information retrieval community recently. Although each model is presented differently, they all share a common underlying framework. In this paper, we explore and discuss the theoretical issues of this framework, including a novel look at the parameter space. We then detail supervised training algorithms that directly maximize the evaluation metric under consideration, such as mean average precision. We present results that show training models in this way can lead to significantly better test set performance compared to other training methods that do not directly maximize the metric. Finally, we show that linear feature-based models can consistently and significantly outperform current state of the art retrieval models with the correct choice of features.

6.
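[Purpose/Significance] Given the currently low usability of health search engines in China and the low satisfaction of their users, this study evaluates the usability of three commonly used health search engines, with the aim of promoting the development of such search engine technology and improving the quality of information services. [Method/Process] Taking the user's perspective and using an experimental method, the study examines and compares the effectiveness, efficiency, and satisfaction of three health search engines: "有问必答", "好大夫在线", and "39健康搜". [Result/Conclusion] The three search engines perform well in system response speed and learnability, but precision still needs improvement; the returned pages are rich in content, yet also suffer from repetition, verbosity, and a lack of rigor. The study finds that users' subjective evaluations of health information conflict to some extent with evaluations based on clinical evidence; how to reconcile the two and build a more comprehensive and effective evaluation index system for health information requires further research.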

7.
Information Filtering in TREC-9 and TDT-3: A Comparative Analysis   (cited by: 2; self-citations: 0; citations by others: 2)
Much work on automated information filtering has been done in the TREC and TDT domains, but differences in corpora, the nature of TREC topics vs. TDT events, the constraints imposed on training and testing, and the choices of performance measures confound any meaningful comparison between these domains. We attempt to bridge the gap between them by evaluating the performance of the k-nearest-neighbor (kNN) classification system on the corpus and categories from one domain using the constraints of the other. To maximize comparability and understand the effect of the evaluation metrics specific to each domain, we optimize the performance of kNN separately for the F1, T9P (preferred metric for TREC-9) and C_trk (official metric for TDT-3) metrics. A thorough comparison of our within-domain and cross-domain results demonstrates that the corpus used for TREC-9 is more challenging for an information filtering system than the TDT-3 corpus, and strongly suggests that the TDT-3 event tracking task itself is more difficult than the TREC batch filtering task. We also show that optimizing performance in TREC-9 and TDT-3 tends to result in systems with different performance characteristics, confounding any meaningful comparison between the two domains, and that T9P and C_trk both have properties that make them undesirable as general information filtering metrics.

8.
We study the problem of web search result diversification in the case where intent-based relevance scores are available. A diversified search result will hopefully satisfy the information need of users who may have different intents. In this context, we first analyze the properties of an intent-based metric, ERR-IA, to measure relevance and diversity together. We argue that this is a better metric than some previously proposed intent-aware metrics and show that it has a better correlation with abandonment rate. We then propose an algorithm to rerank web search results based on optimizing an objective function corresponding to this metric and evaluate it on shopping-related queries.
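The abstract names ERR-IA without defining it. As a minimal sketch, assuming the usual formulation of Expected Reciprocal Rank (a cascade model in which a document with grade g satisfies the user with probability (2^g - 1) / 2^g_max) and its intent-aware form as the intent-probability-weighted average of per-intent ERR, the metric could be computed as follows. The function names, the dictionary layout, and the default `max_grade` are hypothetical choices for illustration, not taken from the paper.

```python
def err(grades, max_grade=4):
    """Expected Reciprocal Rank of one ranked list of relevance grades."""
    p_reach = 1.0  # probability the user reaches the current rank unsatisfied
    score = 0.0
    for rank, grade in enumerate(grades, start=1):
        p_satisfy = (2 ** grade - 1) / (2 ** max_grade)
        score += p_reach * p_satisfy / rank
        p_reach *= 1.0 - p_satisfy
    return score


def err_ia(intent_probs, grades_by_intent, max_grade=4):
    """Intent-aware ERR: per-intent ERR weighted by P(intent | query).

    intent_probs:     dict intent -> probability (assumed to sum to 1)
    grades_by_intent: dict intent -> grades of the SAME ranking under that intent
    """
    return sum(
        prob * err(grades_by_intent[intent], max_grade)
        for intent, prob in intent_probs.items()
    )
```

For example, with two equally likely intents and binary grades (`max_grade=1`), a ranking whose first result serves one intent and whose second serves the other scores `err_ia({"a": 0.5, "b": 0.5}, {"a": [1, 0], "b": [0, 1]}, max_grade=1)`, which rewards covering both intents near the top.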

9.
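Building on the ACSI (American Customer Satisfaction Index) model, we construct a Web Search Engine Satisfaction (WSES) model. Drawing on existing search engine evaluation index systems together with the WSES model, we then establish a corresponding system of measurement indicators, which lays the foundation for subsequent validation with structural equation modeling and provides a reference for evaluating satisfaction with Web search engines.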

10.
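Building the Joint Service System for Novelty Search at the Chinese Academy of Sciences and Exploring Its Service Model   (cited by: 1; self-citations: 0; citations by others: 1)
Faced with the mismatch between strong peak-period demand for novelty search services and insufficient staff, the Chinese Academy of Sciences created a new joint service system for novelty search: it established a joint service mechanism, trained certified searchers and novelty-search analysts across more than 100 research institutes, and developed a novelty search work platform. Three years of practice show that the joint service has effectively promoted novelty search services, extended the scope of service, raised the service capability of institute-level libraries and the fulfillment rate for users within the Academy, and effectively expanded the service team. The article summarizes the experience gained and the problems that remain, and proposes improvements.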

11.
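Ontology-Based Personalized Retrieval   (cited by: 4; self-citations: 0; citations by others: 4)
Most current retrieval tools are designed for all users and do not consider an individual user's particular information needs. This paper proposes an ontology-based personalized retrieval method that automatically learns from a user's query history and builds a user interest model, which it uses to infer the real intent behind the user's new queries and so meet the user's particular information needs. The method is suited to intelligent information retrieval in specific Internet domains, for specific user groups, and on enterprise networks.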

12.
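This paper analyzes the relationship between search engine evaluation and search engine usability analysis, and examines the requirements of usability evaluation in light of the three stages of querying information with a search engine. Based on four main sources of usability evaluation indicators, it builds a hierarchical model of search engine usability evaluation indicators, calculates the indicator weights using the Delphi method, and finally forms a search engine usability evaluation index system of practical value.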

13.
The critical task of predicting clicks on search advertisements is typically addressed by learning from historical click data. When enough history is observed for a given query-ad pair, future clicks can be accurately modeled. However, based on the empirical distribution of queries, sufficient historical information is unavailable for many query-ad pairs. The sparsity of data for new and rare queries makes it difficult to accurately estimate clicks for a significant portion of typical search engine traffic. In this paper we provide analysis to motivate modeling approaches that can reduce the sparsity of the large space of user search queries. We then propose methods to improve click and relevance models for sponsored search by mining click behavior for partial user queries. We aggregate click history for individual query words, as well as for phrases extracted with a CRF model. The new models show significant improvement in clicks and revenue compared to state-of-the-art baselines trained on several months of query logs. Results are reported on live traffic of a commercial search engine, in addition to results from offline evaluation.

14.
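[Purpose/Significance] Information search is a common way for people to look up information. Current search systems support fact-finding well, but search for the purpose of learning remains under-studied. "Search as learning" has become a focus of interactive information retrieval research in recent years; such studies treat search as a learning process, attempt to assess the knowledge users acquire while searching, and then propose ways to optimize system features that support user learning. This paper focuses on how to comprehensively assess users' knowledge levels before and after search, as a reference for this line of research. [Method/Process] Using a user experiment, we assess the knowledge content users write before and after searching, propose a user knowledge assessment method that combines quantity and quality dimensions, and apply it to users' knowledge levels before and after searching on learning-oriented tasks. In the data analysis stage, statistical methods are used to test the difference between post-search and pre-search knowledge levels. [Result/Conclusion] In terms of quantity, users' knowledge becomes more comprehensive and deeper once the search is complete, with significant increases in the number of knowledge points, the number of knowledge facets, and the breadth and depth of those facets. Knowledge facets with a higher degree of specialization also emerge after search, and some concepts that were vague before search are expressed more clearly and precisely afterwards. In terms of quality, the great majority of users improve after search in the relevance of their knowledge, the degree of analysis, and the statement of their own views.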

15.
We constructed an end-user model to measure end-user satisfaction with the quality of database search results, using the customer satisfaction theory as a metric. We investigated end-user satisfaction and analyzed key factors which affected user satisfaction. The results show that the end-users' perception of value is the key factor among all of the different factors that impact satisfaction with regard to quality and end-users are willing to make efforts to obtain a higher quality of data. Users tend to evaluate their satisfaction from the perspective of their demands, and database developers should be user oriented in order to improve the level of satisfaction with the data in the database.

16.
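This paper discusses the problems of current search engines and the semantic functions required of them. Then, based on Web search engines and the Semantic Web, it proposes a functional flow diagram for a search engine in the Semantic Web environment and analyzes the engine's functions in terms of the crawler, ontologies and knowledge bases, semantic annotation, filtering and inference, semantic indexing, and semantic retrieval. Search engines in the Semantic Web environment will allow information and knowledge needs to be expressed and satisfied semantically in a better and more precise way, and will promote efficient information and knowledge management.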

17.
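Addressing three key problems of personalized search, namely collecting user information, dynamically updating the user information base, and the personalized retrieval algorithm, this paper proposes, on an exploratory basis, an Ajax-based scheme for tracking user behavior, a strategy for dynamically updating the user behavior information base session by session, and a vector space retrieval model that incorporates the user profile. On this basis, it designs and implements an experimental personalized search engine system.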
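The abstract above mentions a vector space retrieval model that takes user information into account but gives no formula. One common way to do this, shown here purely as an assumption, is to add a weighted user-profile vector to the query vector and rank documents by cosine similarity. The names `cosine`, `personalized_query`, `rank` and the weight `alpha` are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def personalized_query(query, profile, alpha=0.5):
    """Blend the query vector with the user profile (alpha is an assumed weight)."""
    terms = set(query) | set(profile)
    return {t: query.get(t, 0.0) + alpha * profile.get(t, 0.0) for t in terms}


def rank(query, profile, docs, alpha=0.5):
    """Return document names sorted by similarity to the personalized query."""
    q = personalized_query(query, profile, alpha)
    return sorted(docs, key=lambda name: cosine(q, docs[name]), reverse=True)
```

With `alpha = 0` this reduces to plain vector space ranking; a larger `alpha` lets the stored profile disambiguate short queries, for instance favoring car pages for the query "jaguar" when the profile is weighted toward "car".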

18.
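Web information search behavior is closely tied to users' daily lives. User-cognition-oriented web information search applies the cognitive viewpoint and social cognitive theory to web information search, and is a retrieval paradigm distinct from traditional information retrieval and user-oriented information retrieval. After introducing user-cognition-oriented search models, including the cognitive interaction model, the information problem-solving model, and a model of web information search behavior using search engines, the paper further analyzes the factors that influence the web information search process in terms of user factors, the information environment, and the social context.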

19.
Batch IR evaluations are usually performed in a framework that consists of a document collection, a set of queries, a set of relevance judgments, and one or more effectiveness metrics. A large number of evaluation metrics have been proposed, with two primary families having emerged: recall-based metrics, and utility-based metrics. In both families, the pragmatics of forming judgments mean that it is usual to evaluate the metric to some chosen depth such as \(k=20\) or \(k=100\), without necessarily fully considering the ramifications associated with that choice. Our aim in this paper is to explore the relative risks arising with fixed-depth evaluation in the two families, and document the complex interplay between metric evaluation depth and judgment pooling depth. Using a range of TREC resources including NewsWire data and the ClueWeb collection, we: (1) examine the implications of finite pooling on the subsequent usefulness of different test collections, including specifying options for truncated evaluation; and (2) determine the extent to which various metrics correlate with themselves when computed to different evaluation depths using those judgments. We demonstrate that the judgment pools constructed for the ClueWeb collections lack resilience, and are suited primarily to the application of top-heavy utility-based metrics rather than recall-based metrics; and that on the majority of the established test collections, and across a range of evaluation depths, recall-based metrics tend to be more volatile in the system rankings they generate than are utility-based metrics. That is, experimentation using utility-based metrics is more robust to choices such as the evaluation depth employed than is experimentation using recall-based metrics. This distinction should be noted by researchers as they plan and execute system-versus-system retrieval experiments.
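To make the notion of evaluating "to depth k" concrete, the sketch below computes one metric from each family at a chosen cutoff. Precision@k stands in for the utility-based family and truncated average precision (normalized by the total number of known relevant documents) for the recall-based family; these two choices, the function names, and the convention that unjudged documents count as non-relevant are illustrative assumptions, not the paper's experimental setup.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k results that are judged relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision_at_k(ranking, relevant, k):
    """Average precision truncated at depth k.

    Normalizing by len(relevant) makes the score depend on how many relevant
    documents the judgments contain, which is what ties it to pooling depth.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document's rank
    return total / len(relevant)
```

Calling both functions on the same ranking with, say, `k=20` and `k=100` and comparing the resulting system orderings is the kind of depth-sensitivity check the abstract describes.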

20.
Knowledge transfer for cross domain learning to rank   (cited by: 1; self-citations: 1; citations by others: 0)
Recently, learning to rank technology has been attracting increasing attention from both academia and industry in the areas of machine learning and information retrieval. A number of algorithms have been proposed to rank documents according to the user-given query using a human-labeled training dataset. A basic assumption behind general learning to rank algorithms is that the training and test data are drawn from the same data distribution. However, this assumption does not always hold true in real world applications. For example, it can be violated when the labeled training data become outdated or originally come from a domain different from that of the test data. Such situations bring a new problem, which we define as cross domain learning to rank. In this paper, we aim at improving the learning of a ranking model in the target domain by leveraging knowledge from the outdated or out-of-domain data (both are referred to as source domain data). We first give a formal definition of the cross domain learning to rank problem. Following this, two novel methods are proposed to conduct knowledge transfer at feature level and instance level, respectively. These two methods both utilize Ranking SVM as the basic learner. In the experiments, we evaluate these two methods using data from benchmark datasets for document retrieval. The results show that the feature-level transfer method performs better with steady improvements over baseline approaches across different datasets, while the instance-level transfer method comes out with varying performance depending on the dataset used.
