20 similar documents retrieved; search took 93 ms.
1.
The Web contains a tremendous amount of information. It is challenging to determine which Web documents are relevant to a user query, and even more challenging to rank them according to their degrees of relevance. In this paper, we propose a probabilistic retrieval model using logistic regression for recognizing multiple-record Web documents against an application ontology, a simple conceptual modeling approach. We notice that many Web documents contain a sequence of chunks of textual information, each of which constitutes a record. Documents of this type are referred to as multiple-record documents. In our categorization approach, a document is represented by a set of term frequencies of index terms, a density heuristic value, and a grouping heuristic value. We first apply logistic regression analysis to the relevance probabilities using the (i) index terms, (ii) density value, and (iii) grouping value of each training document. The relevance probability of each test document is then interpolated from the fitted curves. Contrary to other probabilistic retrieval models, our model makes only a weak independence assumption and is capable of handling important dependency relationships among index terms. In addition, we use logistic regression, instead of linear regression analysis, because the relevance probabilities of training documents are discrete. Using one test set of car-ads and another of obituary Web documents, our probabilistic model achieves an average recall of 100%, precision of 83.3%, and accuracy of 92.5%.
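As a rough illustration of the idea in this abstract (not the authors' exact estimator; feature values and data below are invented), a logistic model can be fit on per-document feature vectors of index-term frequencies plus density and grouping heuristics, and then used to interpolate a relevance probability for unseen documents:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    # plain stochastic gradient descent on the logistic loss;
    # each row of X is [term frequencies..., density, grouping]
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(n):
                w[j] -= lr * err * xi[j]
            b -= lr * err
    return w, b

def relevance_probability(w, b, x):
    # interpolated relevance probability for a test document
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# toy training data: [tf("car"), tf("price"), density, grouping], label 1 = relevant
X = [[3, 2, 0.8, 0.9], [0, 1, 0.2, 0.1], [4, 3, 0.7, 0.8], [1, 0, 0.1, 0.2]]
y = [1, 0, 1, 0]
w, b = train_logistic(X, y)
```

Because the model is discriminative, no independence between index terms is assumed beyond the linear form of the decision boundary.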
2.
To cope with the fact that, in the ad hoc retrieval setting, documents relevant to a query could contain very few (short) parts (passages) with query-related information, researchers have proposed passage-based document ranking approaches. We show that several of these retrieval methods can be understood, and new ones can be derived, using the same probabilistic model. We use language-model estimates to instantiate specific retrieval algorithms, and in doing so present a novel passage language model that integrates information from the containing document to an extent controlled by the estimated document homogeneity. Several document-homogeneity measures that we present yield passage language models that are more effective than the standard passage model for basic document retrieval and for constructing and utilizing passage-based relevance models; these relevance models also outperform a document-based relevance model. Finally, we demonstrate the merits of using the document-homogeneity measures to integrate document-query and passage-query similarity information for document retrieval.
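A minimal sketch of the core idea, under assumptions of my own (the homogeneity measure below, mean Jaccard overlap of passage vocabularies, is illustrative and not necessarily one of the paper's measures): interpolate a passage unigram model with the containing document's model, weighting the document more heavily the more homogeneous it appears.

```python
from collections import Counter
from itertools import combinations

def unigram_lm(text):
    # maximum-likelihood unigram language model
    counts = Counter(text.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def homogeneity(passages):
    # illustrative homogeneity: mean Jaccard overlap between passage
    # vocabularies (requires at least two passages); a homogeneous
    # document's passages share most of their vocabulary
    sims = []
    for a, b in combinations(passages, 2):
        va, vb = set(a.split()), set(b.split())
        sims.append(len(va & vb) / len(va | vb))
    return sum(sims) / len(sims)

def passage_model(passage, doc, h):
    # interpolate passage LM with document LM; weight h on the document
    p_pas, p_doc = unigram_lm(passage), unigram_lm(doc)
    vocab = set(p_pas) | set(p_doc)
    return {w: (1 - h) * p_pas.get(w, 0.0) + h * p_doc.get(w, 0.0) for w in vocab}

passages = ["ranking models for retrieval", "retrieval with passage ranking"]
doc = " ".join(passages)
h = homogeneity(passages)
model = passage_model(passages[0], doc, h)
```

For a perfectly homogeneous document (h near 1) the passage model collapses into the document model; for a heterogeneous one the passage speaks mostly for itself.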
3.
Comparing Rank and Score Combination Methods for Data Fusion in Information Retrieval
Combining multiple sources of evidence (multiple query formulations, multiple retrieval schemes or systems) has been shown, mostly experimentally, to be effective for data fusion in information retrieval. However, the question of why and how combination should be done still remains largely unanswered. In this paper, we provide a model for simulation and a framework for analysis in the study of data fusion in the information retrieval domain. A rank/score function is defined, and the concept of a Cayley graph is used in the design and analysis of our framework. The model and framework have led us to a better understanding of data fusion phenomena in information retrieval. In particular, by exploiting the graphical properties of the rank/score function, we show analytically and by simulation that combination using rank performs better than combination using score under certain conditions. Moreover, we demonstrate that the rank/score function may be used as a predictive variable for the effectiveness of combining multiple sources of evidence. The authors dedicate this paper to the memory of our friend and colleague Professor Jacob Shapiro, who passed away in September 2003. Supported in part by the DIMACS NSF grant STC-91-19999 and by NJ Commission. Supported in part by a grant from The City University of New York PSC-CUNY Research Award.
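To make the rank-versus-score distinction concrete, here is a minimal sketch (toy runs and scores are invented; these are the classic CombSUM and Borda-style schemes, not this paper's framework) of the two combination strategies compared:

```python
def score_fusion(runs):
    # CombSUM: add min-max normalised scores from each system
    fused = {}
    for run in runs:
        lo, hi = min(run.values()), max(run.values())
        for doc, s in run.items():
            norm = (s - lo) / (hi - lo) if hi > lo else 0.0
            fused[doc] = fused.get(doc, 0.0) + norm
    return sorted(fused, key=fused.get, reverse=True)

def rank_fusion(runs):
    # Borda-style: add points based on rank position, ignoring raw scores
    fused = {}
    for run in runs:
        ranking = sorted(run, key=run.get, reverse=True)
        for i, doc in enumerate(ranking):
            fused[doc] = fused.get(doc, 0.0) + (len(ranking) - i)
    return sorted(fused, key=fused.get, reverse=True)

run_a = {"d1": 0.9, "d2": 0.8, "d3": 0.1}
run_b = {"d1": 0.5, "d3": 0.4, "d2": 0.3}
```

The paper's rank/score function characterizes when the rank-based variant should be preferred; here the two can disagree whenever score gaps are uneven across systems.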
4.
Web search engines are increasingly deploying many features, combined using learning to rank techniques. However, various practical questions remain concerning the manner in which learning to rank should be deployed. For instance, a sample of documents with sufficient recall is used, such that re-ranking of the sample by the learned model brings the relevant documents to the top. However, the properties of the document sample, such as when to stop ranking (i.e. its minimum effective size), remain unstudied. Similarly, effective listwise learning to rank techniques minimise a loss function corresponding to a standard information retrieval evaluation measure. However, the appropriate choice of how to calculate the loss function, i.e. the choice of the learning evaluation measure and the rank depth at which this measure should be calculated, is as yet unclear. In this paper, we address all of these issues by formulating various hypotheses and research questions, before performing exhaustive experiments using multiple learning to rank techniques and different types of information needs on the ClueWeb09 and LETOR corpora. Among many conclusions, we find, for instance, that the smallest effective sample for a given query set is dependent on the type of information need of the queries, the document representation used during sampling and the test evaluation measure. As the sample size is varied, the selected features markedly change; for instance, we find that link analysis features are favoured for smaller document samples. Moreover, despite reflecting a more realistic user model, the recently proposed ERR measure is not as effective as the traditional NDCG as a learning loss function. Overall, our comprehensive experiments provide the first empirical derivation of best practices for learning to rank deployments.
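The two candidate learning measures discussed above can be sketched as follows (standard textbook forms; the gain mapping and grade scale below are my own illustrative choices, not the paper's exact configuration):

```python
import math

def ndcg_at(gains, k):
    # NDCG@k: discounted cumulative gain normalised by the ideal ordering
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

def err_at(grades, k, max_grade=3):
    # ERR@k: cascade user model in which the user stops at a document
    # with probability derived from its relevance grade
    err, p_continue = 0.0, 1.0
    for i, g in enumerate(grades[:k]):
        r = (2 ** g - 1) / (2 ** max_grade)
        err += p_continue * r / (i + 1)
        p_continue *= (1 - r)
    return err
```

ERR's cascade model sharply discounts everything below the first highly relevant document, which is one intuition for why it can behave differently from NDCG when used as a learning loss.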
6.
On information retrieval metrics designed for evaluation with incomplete relevance assessments
Modern information retrieval (IR) test collections have grown in size, but the available manpower for relevance assessments has more or less remained constant. Hence, how to reliably evaluate and compare IR systems using incomplete relevance data, where many documents exist that were never examined by the relevance assessors, is receiving a lot of attention. This article compares the robustness of IR metrics to incomplete relevance assessments, using four different sets of graded-relevance test collections with submitted runs: the TREC 2003 and 2004 robust track data and the NTCIR-6 Japanese and Chinese IR data from the crosslingual task. Following previous work, we artificially reduce the original relevance data to simulate IR evaluation environments with extremely incomplete relevance data. We then investigate the effect of this reduction on discriminative power, which we define as the proportion of system pairs with a statistically significant difference for a given probability of Type I error, and on Kendall's rank correlation, which reflects the overall resemblance of two system rankings according to two different metrics or two different relevance data sets. According to these experiments, Q′, nDCG′ and AP′ proposed by Sakai are superior to bpref proposed by Buckley and Voorhees and to Rank-Biased Precision proposed by Moffat and Zobel. We also point out some weaknesses of bpref and Rank-Biased Precision by examining their formal definitions.
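Two of the ideas compared above can be sketched briefly. bpref follows its published definition; the "condensed list" helper illustrates the construction behind Sakai's primed metrics (which then apply a base metric such as AP or nDCG to the condensed list). The example ranking and judgments are invented:

```python
def bpref(ranking, judged_rel, judged_nonrel):
    # bpref (Buckley & Voorhees): each retrieved relevant document is
    # penalised by the fraction of judged non-relevant documents above it;
    # unjudged documents are ignored entirely
    R = len(judged_rel)
    score, nonrel_seen = 0.0, 0
    for doc in ranking:
        if doc in judged_nonrel:
            nonrel_seen += 1
        elif doc in judged_rel:
            score += 1 - min(nonrel_seen, R) / R
    return score / R if R else 0.0

def condensed(ranking, judged):
    # Sakai's primed metrics (Q', nDCG', AP') evaluate a condensed list:
    # unjudged documents are removed before computing the base metric
    return [d for d in ranking if d in judged]

ranking = ["d1", "dX", "d2", "d3"]   # dX was never judged
rel, nonrel = {"d1", "d2"}, {"d3"}
```

Both approaches avoid treating unjudged documents as non-relevant, which is the key to robustness under incomplete assessments.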
7.
T. Couto, N. Ziviani, P. Calado, M. Cristo, M. Gonçalves, E. S. de Moura, W. Brandão, Information Retrieval, 2010, 13(4): 315-345
Automatic document classification can be used to organize documents in a digital library, construct on-line directories, improve the precision of web searching, or improve the interactions between users and search engines. In this paper we explore how linkage information inherent to different document collections can be used to enhance the effectiveness of classification algorithms. We have experimented with three link-based bibliometric measures, co-citation, bibliographic coupling and Amsler, on three different document collections: a digital library of computer science papers, a web directory and an on-line encyclopedia. Results show that both hyperlink and citation information can be used to learn reliable and effective classifiers based on a kNN classifier. In one of the test collections used, we obtained improvements of up to 69.8% in macro-averaged F1 over the traditional text-based kNN classifier, which serves as the baseline in our experiments. We also present alternative ways of combining bibliometric-based classifiers with text-based classifiers. Finally, we conducted studies to analyze the situations in which the bibliometric-based classifiers failed, and show that in such cases it is hard to reach consensus regarding the correct classes, even for human judges.
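A minimal sketch of the approach, using co-citation as the link-based similarity inside a kNN classifier (the toy citation graph and labels are invented, and this simplification omits the paper's other measures, bibliographic coupling and Amsler):

```python
from collections import Counter

def cocitation(cites, a, b):
    # co-citation similarity: number of documents that cite both a and b
    return sum(1 for refs in cites.values() if a in refs and b in refs)

def knn_classify(doc, labeled, cites, k=3):
    # kNN over a bibliometric similarity instead of a text similarity
    neighbours = sorted(labeled, key=lambda d: cocitation(cites, doc, d),
                        reverse=True)[:k]
    votes = Counter(labeled[d] for d in neighbours)
    return votes.most_common(1)[0][0]

# toy citation graph: each key cites the set of documents on the right
cites = {
    "p1": {"a", "b"}, "p2": {"a", "b"}, "p3": {"a", "c"},
    "p4": {"x", "c"}, "p5": {"x", "c"},
}
labeled = {"b": "IR", "c": "DB"}
```

The same classifier works unchanged for hyperlinks by reading the web graph as a citation graph.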
8.
Many queries have multiple interpretations; they are ambiguous or underspecified. This is especially true in the context of Web search. To account for this, much recent research has focused on creating systems that produce diverse ranked lists. In order to validate these systems, several new evaluation measures have been created to quantify diversity. Ideally, diversity evaluation measures would distinguish between systems by the amount of diversity in the ranked lists they produce. Unfortunately, diversity is also a function of the collection over which the system is run and a system’s performance at ad-hoc retrieval. A ranked list built from a collection that does not cover multiple subtopics cannot be diversified; neither can a ranked list that contains no relevant documents. To ensure that we are assessing systems by their diversity, we develop (1) a family of evaluation measures that take into account the diversity of the collection and (2) a meta-evaluation measure that explicitly controls for performance. We demonstrate experimentally that our new measures can achieve substantial improvements in sensitivity to diversity without reducing discriminative power.
9.
Direct optimization of evaluation measures has become an important branch of learning to rank for information retrieval (IR). Since IR evaluation measures are difficult to optimize due to their non-continuity and non-differentiability, most direct optimization methods optimize surrogate functions instead, which we call surrogate measures. A critical issue regarding these methods is whether the optimization of the surrogate measures can really lead to the optimization of the original IR evaluation measures. In this work, we perform a formal analysis of this issue. We propose a concept named "tendency correlation" to describe the relationship between a surrogate measure and its corresponding IR evaluation measure. We show that when a surrogate measure has arbitrarily strong tendency correlation with an IR evaluation measure, its optimization will lead to the effective optimization of the original IR evaluation measure. Then, we analyze the tendency correlations of the surrogate measures optimized in a number of direct optimization methods. We prove that the surrogate measures in SoftRank and ApproxRank can have arbitrarily strong tendency correlation with the original IR evaluation measures, regardless of the data distribution, when some parameters are appropriately set. However, the surrogate measures in SVM-MAP, DORM-NDCG, PermuRank-MAP, and SVM-NDCG cannot have arbitrarily strong tendency correlation with the original IR evaluation measures on certain distributions of data. Therefore SoftRank and ApproxRank are theoretically sounder than SVM-MAP, DORM-NDCG, PermuRank-MAP, and SVM-NDCG, and are expected to result in better ranking performance. Our theoretical findings can explain the experimental results observed on public benchmark datasets.
10.
Fernando Diaz, Information Retrieval, 2007, 10(6): 531-562
We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from unregularized scores, consistently and significantly result in improved performance for a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
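The flavor of score regularization can be sketched as iterative smoothing of scores over a document-similarity graph (a simplified stand-in for the paper's matrix formulation; the similarity values, the anchor weight alpha, and the data are all invented):

```python
def regularize(scores, sim, alpha=0.5, iters=20):
    # iteratively move each document's score toward the similarity-weighted
    # mean of its neighbours, anchored to the original retrieval scores
    docs = list(scores)
    s = dict(scores)
    for _ in range(iters):
        new = {}
        for d in docs:
            wsum = sum(sim.get((d, o), 0.0) for o in docs if o != d)
            if wsum == 0:
                new[d] = s[d]          # isolated document: score unchanged
                continue
            neigh = sum(sim.get((d, o), 0.0) * s[o]
                        for o in docs if o != d) / wsum
            new[d] = (1 - alpha) * scores[d] + alpha * neigh
        s = new
    return s

# "a" and "b" are topically related; "c" is unrelated to both
reg = regularize({"a": 1.0, "b": 0.0, "c": 0.9},
                 {("a", "b"): 1.0, ("b", "a"): 1.0})
```

Here the low-scoring document "b" is pulled up by its highly scored neighbour, while the isolated document "c" keeps its original score, which is the qualitative behaviour the cluster hypothesis predicts.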
11.
Legal Reference Services Quarterly, 2013, 32(4): 53-67
This article contends that the relative effectiveness of Lexis and Westlaw should be measured by their ability to find and rank relevant documents. It contains an evaluation of Boolean systems and includes discussion of the Blair and Maron and the Dabney studies in which high recall was the goal of online searches. It then suggests that when other means of discovering relevant cases are available, the need for high recall is diminished. Finally, the article examines and presents performance test results of Lexis' Freestyle and Westlaw's Win natural language search engines.
12.
Oren Kurland, Information Retrieval, 2009, 12(4): 437-460
To obtain high precision at top ranks by a search performed in response to a query, researchers have proposed a cluster-based re-ranking paradigm: clustering an initial list of documents that are the most highly ranked by some initial search, and using information induced from these clusters (often called query-specific clusters) for re-ranking the list. However, results concerning the effectiveness of various automatic cluster-based re-ranking methods have been inconclusive. We show that using query-specific clusters for automatic re-ranking of top-retrieved documents is effective with several methods in which clusters play different roles, among which is the smoothing of document language models. We do so by adapting previously proposed cluster-based retrieval approaches, which are based on (static) query-independent clusters for ranking all documents in a corpus, to the re-ranking setting wherein clusters are query-specific. The best performing method that we develop outperforms both the initial document-based ranking and some previously proposed cluster-based re-ranking approaches; furthermore, this algorithm consistently outperforms a state-of-the-art pseudo-feedback-based approach. In further exploration we study the performance of cluster-based smoothing methods for re-ranking with various (soft and hard) clustering algorithms, and demonstrate the importance of clusters in providing context from the initial list through a comparison to using single documents to this end.
13.
K. T. L. V. Vaughan, Medical Reference Services Quarterly, 2013, 32(3): 328-341
The catalogs of 11 university libraries were analyzed against the Basic Resources for Pharmaceutical Education (BRPE) to measure the percentage coverage of the core total list as well as the core sublist. There is no clear trend in these data linking school age, size, or rank with percentage coverage of the total list or the "First Purchase" core list when treated as independent variables. Approximately half of the schools have significantly higher percentages of core titles than statistically expected. Based on these data, it is difficult to predict what percentage of the titles on the BRPE a library will contain.
14.
The Reference Librarian, 2013, 54(79-80): 395-407
The gathering, organization, and archiving of critical business intelligence is a complex task. Competitive Intelligence systems gather information for use in the decision making process. Knowledge Management Systems are used to organize this knowledge. Library Science provides structure for the storage of published documents, in both printed and electronic formats. This paper proposes that the common link among the three disciplines is Archive Theory, which is the process by which an archive of information is built. This process provides a framework for analysis of what documents or information to retain and what format to use when retaining them. The paper details the linkage and concludes with an example of a working system that ties all parts together.
15.
Robert W. P. Luk, Information Retrieval, 2008, 11(6): 539-561
This paper discusses various issues concerning the rank equivalence of Lafferty and Zhai between the log-odds ratio and the query likelihood of probabilistic retrieval models. It highlights that Robertson's concerns about this equivalence may arise when multiple probability distributions are assumed to be uniformly distributed, after assuming that the marginal probability logically follows from Kolmogorov's probability axioms. It also clarifies that there are two types of rank equivalence relations between probabilistic models, namely strict and weak rank equivalence. This paper focuses on strict rank equivalence, which requires the event spaces of the participating probabilistic models to be identical. It is possible for two probabilistic models to be strict rank equivalent when they use different probability estimation methods. This paper shows that the query likelihood, p(q|d, r), is strict rank equivalent to p(q|d) of the language model of Ponte and Croft by applying assumptions 1 and 2 of Lafferty and Zhai. In addition, some statistical component language models may be strict rank equivalent to the log-odds ratio, and some statistical component models using the log-odds ratio may be strict rank equivalent to the query likelihood. Finally, we suggest adding a random variable for the user information need to the probabilistic retrieval models for clarification when these models deal with multiple requests.
16.
In information retrieval, since it is hard to identify users' information needs, many approaches have tried to solve this problem by expanding initial queries and reweighting the terms in the expanded queries using users' relevance judgments. Although relevance feedback is most effective when relevance information about retrieved documents is provided by users, such information is not always available. Another solution is to use correlated terms for query expansion. The main problem with this approach is how to construct term-term correlations that can be used effectively to improve retrieval performance. In this study, we construct query concepts that denote users' information needs from a document space, rather than reformulating initial queries using term correlations and/or users' relevance feedback. To form query concepts, we extract features from each document, then cluster the features into primitive concepts that are in turn used to form query concepts. Experiments are performed on the Associated Press (AP) dataset taken from the TREC collection. The experimental evaluation shows that our proposed framework, called QCM (Query Concept Method), outperforms a baseline probabilistic retrieval model on TREC retrieval.
17.
18.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target languages in response to a user query in a single source language. In a multilingual federated search environment, different information sources contain documents in different languages. A general search strategy in multilingual federated search environments is to translate the user query into each language of the information sources and run a monolingual search in each information source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information sources, which are in different languages. This is known as the results merging problem for multilingual information retrieval. Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the other hand, a more effective merging method has been proposed: download and translate all retrieved documents into the source language and generate the final ranked list by running a monolingual search in the search client. The latter method is more effective but incurs a large amount of online communication and computation costs. This paper proposes an effective and efficient approach to the results merging task for multilingual ranked lists. In particular, it downloads only a small number of documents from the individual ranked lists of each user query to calculate comparable document scores, utilizing both the query-based translation method and the document-based translation method. Then, query-specific and source-specific transformation models can be trained for individual ranked lists using the information in these downloaded documents. These transformation models are used to estimate comparable document scores for all retrieved documents, so that the documents can be sorted into a final ranked list. This merging approach is efficient because only a subset of the retrieved documents is downloaded and translated online. Furthermore, an extensive set of experiments on Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results merging algorithm with different transformation models. This paper also provides thorough experimental results as well as detailed analysis. All of the work substantially extends the preliminary research in Si and Callan (in: Peters (ed.), Results of the Cross-Language Evaluation Forum, CLEF 2005, 2005).
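The core merging step can be sketched as follows, under simplifying assumptions of my own: each per-source transformation model is a simple least-squares linear fit from source-specific scores to the comparable scores computed for the few downloaded documents (the paper's transformation models may be richer), and the runs, scores, and training pairs below are invented:

```python
def fit_linear(pairs):
    # least-squares fit of comparable_score ~ a * source_score + b,
    # trained on the few documents downloaded and scored centrally
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def merge(ranked_lists, training):
    # map every source-specific score onto the comparable scale, then sort
    merged = []
    for source, run in ranked_lists.items():
        a, b = fit_linear(training[source])
        merged += [(doc, a * s + b) for doc, s in run]
    return [doc for doc, _ in sorted(merged, key=lambda t: t[1], reverse=True)]

ranked_lists = {"de": [("d1", 10.0), ("d2", 8.0)],
                "fr": [("f1", 0.9), ("f2", 0.2)]}
training = {"de": [(10.0, 0.9), (6.0, 0.5)],
            "fr": [(0.9, 0.8), (0.2, 0.1)]}
```

Only the training pairs require downloading and translating documents; every other retrieved document is placed on the comparable scale by the fitted transformation alone.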
19.
As the volume and variety of information sources continues to grow, it is increasingly difficult to obtain information that accurately matches user information needs. A number of factors affect information retrieval effectiveness (the accuracy of matching user information needs against the retrieved information). First, users often do not present search queries in the form that optimally represents their information need. Second, the measure of a document's relevance is often highly subjective between different users. Third, information sources might contain heterogeneous documents, in multiple formats, and the representation of documents is not unified. This paper discusses an approach for improving information retrieval effectiveness for document databases. It is proposed that retrieval effectiveness can be improved by applying computational intelligence techniques for modelling information needs, through interactive reinforcement learning. The method combines qualitative (subjective) user relevance feedback with quantitative (algorithmic) measures of the relevance of retrieved documents. An information retrieval system is developed whose retrieval effectiveness is evaluated using traditional precision and recall.
20.
This study explores the potential of an online platform that encourages journalists to post the documents behind their news stories to help restore deteriorating public trust in the news media. Based on a content analysis of 200 news items and 315 accompanying documents posted on DocumentCloud, findings indicate that, contrary to journalists' traditional reluctance to rely on documents, the platform succeeds in encouraging extensive use of documents by both mainstream and alternative journalists. Findings show that documents serve mainly to support factual claims (in 96 percent of items) and to enhance the transparency of news processes, allowing audiences unmediated access to raw materials and a greater capacity to evaluate information independently. However, there are no apparent signs that journalists verified the content of the documents. The article suggests that DocumentCloud is a unique example of a technology that may succeed where the former technology that promised to serve as a journalistic reference system, hyperlinks, failed. If the DocumentCloud experiment is implemented on a wider scale, it may have significant theoretical and practical implications, which are discussed here.