Similar literature
 Found 7 similar documents (search time: 0 ms)
1.
2.
In information retrieval (IR), improving effectiveness often sacrifices the stability of an IR system. Many risk-sensitive metrics have been proposed to evaluate stability. Owing to theoretical limitations, however, current work studies effectiveness and stability separately and has not explored the effectiveness–stability tradeoff. In this paper, we propose a Bias–Variance Tradeoff Evaluation (BV-Test) framework, based on the bias–variance decomposition of the mean squared error, to measure both the overall performance of a system (considering effectiveness and stability together) and its tradeoff between effectiveness and stability. In this framework, we define generalized bias–variance metrics based on either the Cranfield-style experimental set-up, where the document collection is fixed (across topics), or the set-up where the document collection is a sample (per-topic). Compared with risk-sensitive evaluation methods, our work not only measures the effectiveness–stability tradeoff of a system but also tracks the source of system instability. Experiments on the TREC Ad-hoc track (1993–1999) and Web track (2010–2014) show a clear effectiveness–stability tradeoff both across topics and per-topic, and show that topic grouping and max–min normalization can effectively reduce the bias–variance tradeoff. Experimental results on the TREC Session track (2010–2012) also show that query reformulation and an increase in user data benefit effectiveness and stability simultaneously.
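
The bias–variance decomposition of the mean squared error that this abstract builds on can be illustrated with a minimal Python sketch. This is not the BV-Test framework itself: the function name `bias_variance`, the choice of a fixed target score of 1.0 (a perfect per-topic score), and the sample scores are assumptions made purely for illustration.

```python
from statistics import fmean, pvariance


def bias_variance(scores, target=1.0):
    """Split the mean squared error of per-topic scores into bias^2 and variance.

    `target` is an assumed ideal score (hypothetical; not taken from the paper).
    Returns (bias_squared, variance, mse), where mse == bias_squared + variance.
    """
    scores = list(scores)
    mean_score = fmean(scores)
    # Squared bias: how far the average score is from the target (effectiveness).
    bias_squared = (mean_score - target) ** 2
    # Population variance: how much scores fluctuate across topics (stability).
    variance = pvariance(scores, mu=mean_score)
    # Mean squared error of the scores with respect to the target.
    mse = fmean([(s - target) ** 2 for s in scores])
    return bias_squared, variance, mse


# Hypothetical per-topic effectiveness scores for one system.
example_scores = [0.2, 0.4, 0.6, 0.8]
b2, var, mse = bias_variance(example_scores)
```

For any list of scores, the returned `mse` equals `bias_squared + variance` up to floating-point error, so two systems with the same mean effectiveness can still differ in overall error through the variance term.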

3.
Knowledge acquisition and bilingual terminology extraction from multilingual corpora are challenging tasks for cross-language information retrieval. In this study, we propose a novel method for mining high-quality translation knowledge from a Persian–English comparable corpus that we constructed, the University of Tehran Persian–English Comparable Corpus (UTPECC). We extract translation knowledge using a Term Association Network (TAN) constructed from term co-occurrences in the same language as well as term associations across languages. We further propose a post-processing step that checks the validity of term translations by detecting mistranslated terms as outliers. Evaluation results on two different data sets show that translating queries using UTPECC and the proposed methods significantly outperforms simple dictionary-based methods. Moreover, the experimental results show that our methods are especially effective in translating Out-Of-Vocabulary terms and in expanding query words based on their associated terms.

4.
Many operational IR indexes are non-normalized, i.e. no lemmatization, stemming or similar techniques have been employed in indexing. This poses a challenge for dictionary-based cross-language information retrieval (CLIR), because translations are mostly lemmas. In this study, we address the challenge of dictionary-based CLIR in a non-normalized index. We test two alternative approaches: FCG (Frequent Case Generation) and s-gramming. The idea of FCG is to automatically generate the most frequent inflected forms for a given lemma. FCG has been tested in monolingual retrieval and has been shown to be a good method for inflected retrieval, especially for highly inflected languages. S-gramming is an approximate string matching technique (an extension of n-gramming). The language pairs in our tests were English–Finnish, English–Swedish, Swedish–Finnish and Finnish–Swedish. Both approaches performed quite well, but the results varied depending on the language pair. S-gramming and FCG performed about equally in all language pairs except Finnish–Swedish, where s-gramming outperformed FCG.
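
To make the idea of approximate matching with an extension of n-grams concrete, here is a minimal Python sketch. The abstract does not give the exact gram definition or similarity measure used in the study, so this sketch assumes character pairs in which a fixed number of characters may be skipped between the two characters (skip 0 gives ordinary digrams), compared with a Jaccard coefficient. The function names `s_grams`, `s_gram_profile` and `s_gram_similarity`, the default skip values and the example words are all hypothetical.

```python
def s_grams(word, skip):
    """Return the set of character pairs in `word` separated by `skip` characters.

    skip=0 yields ordinary adjacent digrams; skip=1 skips one character, etc.
    """
    step = skip + 1
    return {word[i] + word[i + step] for i in range(len(word) - step)}


def s_gram_profile(word, skips=(0, 1)):
    """Collect the grams for every skip length, tagged with that skip length."""
    return {(skip, gram) for skip in skips for gram in s_grams(word, skip)}


def s_gram_similarity(a, b, skips=(0, 1)):
    """Jaccard similarity between the gram profiles of two strings (assumed measure)."""
    profile_a = s_gram_profile(a, skips)
    profile_b = s_gram_profile(b, skips)
    if not profile_a and not profile_b:
        return 1.0  # two strings too short to produce grams are treated as equal
    return len(profile_a & profile_b) / len(profile_a | profile_b)


# Hypothetical usage: score an inflected index term against a lemma.
score = s_gram_similarity("talossa", "talo")
```

In a non-normalized index, a translated lemma could be compared against index terms in this way and the highest-scoring inflected forms used as query keys; the threshold and ranking strategy would need to be tuned per language pair.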

5.
Research on collaborative information retrieval (CIR) has shown positive impacts of collaboration on retrieval effectiveness for complex and/or exploratory tasks. The synergic effect of accomplishing something greater than the sum of its individual components is reached by bringing together collaborators’ complementary skills. However, existing approaches often fail to consider that collaborators might refine their skills and actions throughout the search session, and that a flexible system mediation guided by collaborators’ behaviors should dynamically adapt to this situation in order to optimize search effectiveness. In this article, we propose a new unsupervised collaborative ranking algorithm which leverages collaborators’ actions for (1) mining their latent roles in order to extract their complementary search behaviors; and (2) ranking documents with respect to the latent roles of collaborators. Experiments using two user studies, with 25 and 10 pairs of collaborators respectively, demonstrate the benefit of such an unsupervised method driven by collaborators’ behaviors throughout the search session. In addition, a qualitative analysis of the identified latent roles is provided to explain an over-learning effect observed in one of the datasets.

6.
Estimating the query model is an important task in language modeling (LM) approaches to information retrieval (IR). An ideal estimate is expected to be not only effective, in terms of high mean retrieval performance over all queries, but also stable, in terms of low variance of retrieval performance across different queries. In practice, however, improving effectiveness can sacrifice stability, and vice versa. In this paper, we propose to study this tradeoff from a new perspective, i.e., the bias–variance tradeoff, which is a fundamental theory in statistics. We formulate the notion of bias–variance with regard to the retrieval performance and estimation quality of query models. We then investigate several estimated query models by analyzing when and why the bias–variance tradeoff occurs, and how bias and variance can be reduced simultaneously. A series of experiments on four TREC collections has been conducted to systematically evaluate our bias–variance analysis. Our approach and results could form an analysis framework and a novel evaluation strategy for query language modeling.

7.
Emotions are an integral component of all human activities, including human–computer interactions. This article reviews literature on the theories of emotions, methods for studying emotions, and their role in human information behaviour. It also examines current research on emotions in library and information science, information retrieval and human–computer interaction, and outlines some of the challenges and directions for future work.

