Information Retrieval (IR) develops complex systems, constituted of several components, which aim at returning and optimally ranking the most relevant documents in response to user queries. In this context, experimental evaluation plays a central role, since it allows for measuring the effectiveness of IR systems, increasing the understanding of their functioning, and better directing the efforts for improving them. Current evaluation methodologies are limited by two major factors: (i) IR systems are evaluated as “black boxes”, since it is not possible to decompose the contributions of the different components, e.g., stop lists, stemmers, and IR models; (ii) given that it is not possible to predict the effectiveness of an IR system, both academia and industry need to explore huge numbers of systems, originated by large combinatorial compositions of their components, to understand how they perform and how these components interact together. We propose a Combinatorial visuaL Analytics system for Information Retrieval Evaluation (CLAIRE) which allows for exploring and making sense of the performances of a large number of IR systems, in order to quickly and intuitively grasp which system configurations are preferred, what the contributions of the different components are, and how these components interact together. The CLAIRE system is then validated against use cases based on several test collections using a wide set of systems, generated by a combinatorial composition of several off-the-shelf components, representing the most common denominator almost always present in English IR systems. In particular, we validate the findings enabled by CLAIRE with respect to consolidated deep statistical analyses and we show that the CLAIRE system allows the generation of new insights, which were not detectable with traditional approaches.
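
The combinatorial composition of components mentioned in this abstract can be pictured as a grid of configurations. The sketch below is illustrative only: the component names are hypothetical placeholders, and the abstract does not say which stop lists, stemmers, or models were actually combined.

```python
from itertools import product

# Hypothetical component choices; the abstract names only the component types.
STOP_LISTS = ["none", "short", "long"]
STEMMERS = ["none", "porter", "krovetz"]
IR_MODELS = ["bm25", "tfidf", "lm_dirichlet"]


def grid_of_systems(stop_lists, stemmers, models):
    """Return every (stop list, stemmer, model) combination as a dict."""
    return [
        {"stop_list": s, "stemmer": t, "model": m}
        for s, t, m in product(stop_lists, stemmers, models)
    ]


systems = grid_of_systems(STOP_LISTS, STEMMERS, IR_MODELS)
print(len(systems))  # number of system configurations to evaluate
```

With three options per component the grid already holds 27 systems, which suggests why exploring such spaces by hand quickly becomes impractical.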

The quality of stemming algorithms is typically measured in two different ways: (i) how accurately they map the variant forms of a word to the same stem; or (ii) how much improvement they bring to Information Retrieval systems. In this article, we evaluate various stemming algorithms, in four languages, in terms of accuracy and in terms of their aid to Information Retrieval. The aim is to assess whether the most accurate stemmers are also the ones that bring the biggest gain in Information Retrieval. Experiments in English, French, Portuguese, and Spanish show that this is not always the case, as stemmers with higher error rates yield better retrieval quality. As a byproduct, we also identified the most accurate stemmers and the best for Information Retrieval purposes.

The Inductive Query By Example (IQBE) paradigm allows a system to automatically derive queries for a specific Information Retrieval System (IRS). Classic IRSs based on this paradigm [Smith, M., & Smith, M. (1997). The use of genetic programming to build Boolean queries for text retrieval through relevance feedback. Journal of Information Science, 23(6), 423–431] generate a single solution (a Boolean query) in each run: the one with the best fitness value, which is usually based on a weighted combination of the basic performance criteria, precision and recall.
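
A fitness value of the kind described here, a weighted combination of precision and recall, might be computed as in the following sketch. The linear form and the weight `alpha` are assumptions made for illustration; the abstract does not give the exact formula used by the cited systems.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def fitness(retrieved, relevant, alpha=0.5):
    """Weighted combination of precision and recall.

    alpha is a hypothetical weight in [0, 1]; alpha=1 rewards precision only.
    """
    p, r = precision_recall(retrieved, relevant)
    return alpha * p + (1 - alpha) * r


# Documents returned by a candidate Boolean query vs. the known relevant set.
score = fitness(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
```

A genetic programming run would evaluate this function for each candidate query and keep the query with the highest score.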

The strongest tradition of IR systems evaluation has focused on system effectiveness; more recently, there has been a growing interest in evaluation of Interactive IR systems, balancing system and user-oriented evaluation criteria. In this paper we shift the focus to considering how IR systems, and particularly digital libraries, can be evaluated to assess (and improve) their fit with users’ broader work activities. Taking this focus, we answer a different set of evaluation questions that reveal more about the design of interfaces, user–system interactions and how systems may be deployed in the information working context. The planning and conduct of such evaluation studies share some features with the established methods for conducting IR evaluation studies, but come with a shift in emphasis; for example, a greater range of ethical considerations may be pertinent. We present the PRET A Rapporter framework for structuring user-centred evaluation studies and illustrate its application to three evaluation studies of digital library systems.  相似文献   

In this paper, we describe an operational methodology for characterizing the architecture of complex technical systems and demonstrate its application to a large sample of software releases. Our methodology is based upon directed network graphs, which allows us to identify all of the direct and indirect linkages between the components in a system. We use this approach to define three fundamental architectural patterns, which we label core–periphery, multi-core, and hierarchical. Applying our methodology to a sample of 1286 software releases from 17 applications, we find that the majority of releases possess a “core–periphery” structure. This architecture is characterized by a single dominant cyclic group of components (the “Core”) that is large relative to the system as a whole as well as to other cyclic groups in the system. We show that the size of the Core varies widely, even for systems that perform the same function. These differences appear to be associated with different models of development – open, distributed organizations develop systems with smaller Cores, while closed, co-located organizations develop systems with larger Cores. Our findings establish some “stylized facts” about the fine-grained structure of large, real-world technical systems, serving as a point of departure for future empirical work.  相似文献   
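
One way to read "cyclic group" is as a set of components that can all reach one another through direct or indirect dependencies. Under that assumption, the Core can be sketched as the largest such group in a directed dependency graph. The graph and function names below are hypothetical, and the paper's own classification rules are not reproduced here.

```python
def reachable(graph, start):
    """Return all nodes reachable from start by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cyclic_groups(graph):
    """Group nodes that can reach each other (directly or indirectly)."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    reach = {n: reachable(graph, n) for n in nodes}
    groups, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # m belongs with n when each is reachable from the other.
        group = {m for m in reach[n] if n in reach[m]}
        if len(group) > 1:
            groups.append(group)
            assigned |= group
    return groups


# Hypothetical dependency graph: component -> components it depends on.
deps = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"], "f": ["a"]}
groups = cyclic_groups(deps)
core = max(groups, key=len)
core_share = len(core) / 6  # Core size relative to the six components
```

Here the Core is `{a, b, c}`, half of the system, while `{d, e}` is a smaller cyclic group and `f` belongs to none.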

Whereas in language words of high frequency are generally associated with low content [Bookstein, A., & Swanson, D. (1974). Probabilistic models for automatic indexing. Journal of the American Society of Information Science, 25(5), 312–318; Damerau, F. J. (1965). An experiment in automatic indexing. American Documentation, 16, 283–289; Harter, S. P. (1974). A probabilistic approach to automatic keyword indexing. PhD thesis, University of Chicago; Sparck-Jones, K. (1972). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28, 11–21; Yu, C., & Salton, G. (1976). Precision weighting – an effective automatic indexing method. Journal of the Association for Computing Machinery (ACM), 23(1), 76–88], shallow syntactic fragments of high frequency generally correspond to lexical fragments of high content [Lioma, C., & Ounis, I. (2006). Examining the content load of part of speech blocks for information retrieval. In Proceedings of the international committee on computational linguistics and the association for computational linguistics (COLING/ACL 2006), Sydney, Australia]. We apply this finding to Information Retrieval as follows. We present a novel automatic query reformulation technique, which is based on shallow syntactic evidence induced from various language samples, and used to enhance the performance of an Information Retrieval system. Firstly, we draw shallow syntactic evidence from language samples of varying size, and compare the effect of language sample size upon retrieval performance, when using our syntactically-based query reformulation (SQR) technique. Secondly, we compare SQR to a state-of-the-art probabilistic pseudo-relevance feedback technique. Additionally, we combine both techniques and evaluate their compatibility. We evaluate our proposed technique across two standard Text REtrieval Conference (TREC) English test collections, and three statistically different weighting models. Experimental results suggest that SQR markedly enhances retrieval performance, and is at least comparable to pseudo-relevance feedback. Notably, the combination of SQR and pseudo-relevance feedback further enhances retrieval performance considerably. These collective experimental results confirm the tenet that high frequency shallow syntactic fragments correspond to content-bearing lexical fragments.

Recently, question series have become one focus of research in question answering. These series are comprised of individual factoid, list, and “other” questions organized around a central topic, and represent abstractions of user–system dialogs. Existing evaluation methodologies have yet to catch up with this richer task model, as they fail to take into account contextual dependencies and different user behaviors. This paper presents a novel simulation-based methodology for evaluating answers to question series that addresses some of these shortcomings. Using this methodology, we examine two different behavior models: a “QA-styled” user and an “IR-styled” user. Results suggest that an off-the-shelf document retrieval system is competitive with state-of-the-art QA systems in this task. Advantages and limitations of evaluations based on user simulations are also discussed.  相似文献   

New methods and new systems are needed to filter or to selectively distribute the increasing volume of electronic information being produced nowadays. An effective information filtering system is one that provides exactly the information that satisfies the user's interests, with minimum effort by the user to describe them. Such a system will have to adapt to the user's changing interests. In this paper we describe and evaluate a learning model for information filtering which is an adaptation of the generalized probabilistic model of Information Retrieval. The model is based on the concept of “uncertainty sampling”, a technique that allows for relevance feedback both on relevant and nonrelevant documents. The proposed learning model is the core of a prototype information filtering system called ProFile.
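
In its common formulation, uncertainty sampling asks the user for feedback on the documents whose estimated relevance is least certain, so feedback can arrive for documents that turn out relevant or nonrelevant. The sketch below shows that common formulation with hypothetical scores; it is not taken from ProFile, whose details the abstract does not give.

```python
def most_uncertain(scores, k=1):
    """Pick the k documents whose relevance estimate is closest to 0.5.

    scores maps a document id to an estimated probability of relevance.
    Ties are broken by document id so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda d: (abs(scores[d] - 0.5), d))
    return ranked[:k]


# Hypothetical probabilities produced by a filtering model.
estimates = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.45}
to_label = most_uncertain(estimates, k=2)
```

The confidently scored documents `d1` and `d3` are skipped, and the user is asked about `d2` and `d4`, where a judgment is most informative to the model.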

Relevance-Based Language Models, commonly known as Relevance Models, are successful approaches to explicitly introduce the concept of relevance in the statistical Language Modelling framework of Information Retrieval. These models achieve state-of-the-art retrieval performance in the pseudo relevance feedback task. On the other hand, the field of recommender systems is a fertile research area where users are provided with personalised recommendations in several applications. In this paper, we propose an adaptation of the Relevance Modelling framework to effectively suggest recommendations to a user. We also propose a probabilistic clustering technique to perform the neighbour selection process as a way to achieve a better approximation of the set of relevant items in the pseudo relevance feedback process. These techniques, although well known in the Information Retrieval field, have not been applied yet to recommender systems, and, as the empirical evaluation results show, both proposals outperform individually several baseline methods. Furthermore, by combining both approaches even larger effectiveness improvements are achieved.  相似文献   

Collaborative Filtering techniques have become very popular in recent years as an effective method to provide personalized recommendations. They generally obtain much better accuracy than other techniques such as content-based filtering, because they are based on the opinions of users with tastes or interests similar to those of the user they are recommending to. However, this is precisely the reason for one of their main limitations: the cold-start problem, that is, how to recommend new items that have not yet been rated, or how to offer good recommendations to users they have no information about, for example because those users have recently joined the system. In fact, the new user problem is particularly serious, because an unsatisfied user may stop using the system before it can even collect enough information to generate good recommendations. In this article we tackle this problem with a novel approach called “profile expansion”, based on the query expansion techniques used in Information Retrieval. In particular, we propose and evaluate three kinds of techniques: item-global, item-local and user-local. The experiments we have performed show that both item-global and user-local offer outstanding improvements in precision, up to 100%. Moreover, the improvements are statistically significant and consistent among different movie recommendation datasets and several training conditions.

In this paper, we present a well-defined general matrix framework for modelling Information Retrieval (IR). In this framework, collections, documents and queries correspond to matrix spaces. Retrieval aspects, such as content, structure and semantics, are expressed by matrices defined in these spaces and by matrix operations applied on them. The dualities of these spaces are identified through the application of frequency-based operations on the proposed matrices and through the investigation of the meaning of their eigenvectors. This allows term weighting concepts used for content-based retrieval, such as term frequency and inverse document frequency, to translate directly to concepts for structure-based retrieval. In addition, concepts such as pagerank, authorities and hubs, determined by exploiting the structural relationships between linked documents, can be defined with respect to the semantic relationships between terms. Moreover, this mathematical framework can be used to express classical and alternative evaluation measures, involving, for instance, the structure of documents, and to further explain and relate IR models and theory. The high level of reusability and abstraction of the framework leads to a logical layer for IR that makes system design and construction significantly more efficient, and thus, better and increasingly personalised systems can be built at lower costs.  相似文献   

The transformation of many governments all around the world into new forms, namely, electronic government (e-Government), could not leave the Greek government unaffected. Therefore, it initiated an e-Government project related to national information systems and finance services, the Greek Taxation Information System (TAXIS). The purpose of this paper is to investigate the success of TAXIS from the perspective of expert employees, who work in public taxation agencies. This topic is interesting, because TAXIS is applied in a tax-driven country, under a mandatory setting. Also, it is the first time that the success of this project is examined, from the perspective of employees, using IS success models. The study adapts DeLone and McLean [DeLone, W. H., & McLean, E. R. (2003). The DeLone and McLean model of information systems success: A ten year update. Journal of Management Information Systems, 19(4), 9–30] and Seddon's [Seddon, P. B. (1997). A respecification and extension of the DeLone and McLean model of IS success. Information Systems Research, 8(3) 240–253] information systems success models. The model developed includes the constructs of information, system and service quality, perceived usefulness and user satisfaction. The results provide evidence that there are strong connections between the five success constructs. All hypothesized relationships are supported, except for the relationship between system quality and user satisfaction. The empirical evidence and discussion presented can help the Greek Government improve and fully exploit the potential of TAXIS as an innovative tool for taxation purposes.  相似文献   

The increasing volume of textual information on any topic requires its compression to allow humans to digest it. This implies detecting the most important information and condensing it. These challenges have led to new developments in the area of Natural Language Processing (NLP) and Information Retrieval (IR) such as narrative summarization and evaluation methodologies for narrative extraction. Despite some progress over recent years with several solutions for information extraction and text summarization, the problems of generating consistent narrative summaries and evaluating them are still unresolved. With regard to evaluation, manual assessment is expensive, subjective and not applicable in real time or to large collections. Moreover, it does not provide re-usable benchmarks. Nevertheless, commonly used metrics for summary evaluation still imply substantial human effort since they require a comparison of candidate summaries with a set of reference summaries. The contributions of this paper are three-fold. First, we provide a comprehensive overview of existing metrics for summary evaluation. We discuss several limitations of existing frameworks for summary evaluation. Second, we introduce an automatic framework for the evaluation of metrics that does not require any human annotation. Finally, we evaluate the existing assessment metrics on a Wikipedia data set and a collection of scientific articles using this framework. Our findings show that the majority of existing metrics based on vocabulary overlap are not suitable for assessment based on comparison with a full text and we discuss this outcome.  相似文献   

Question Answering (QA) systems are developed to answer human questions. In this paper, we have proposed a framework for answering definitional and factoid questions, enriched by machine learning and evolutionary methods and integrated in a web-based QA system. Our main purpose is to build new features by combining state-of-the-art features with arithmetic operators. To accomplish this goal, we have presented a Genetic Programming (GP)-based approach. The exact GP duty is to find the most promising formulas, made by a set of features and operators, which can accurately rank paragraphs, sentences, and words. We have also developed a QA system in order to test the new features. The input of our system is texts of documents retrieved by a search engine. To answer definitional questions, our system performs paragraph ranking and returns the most related paragraph. Moreover, in order to answer factoid questions, the system evaluates sentences of the filtered paragraphs ranked by the previous module of our framework. After this phase, the system extracts one or more words from the ranked sentences based on a set of hand-made patterns and ranks them to find the final answer. We have used Text Retrieval Conference (TREC) QA track questions, web data, and AQUAINT and AQUAINT-2 datasets for training and testing our system. Results show that the learned features can perform a better ranking in comparison with other evaluation formulas.  相似文献   

Climate adaptation research increasingly focuses on the socio-cultural dimensions of change. In this context, narrative research is often seen as a qualitative social science method used to frame adaptation communication. However, this perspective neglects an important insight provided by narrative theory as applied in the cognitive sciences and other practical fields: human cognition is organized around specific narrative structures. In adaptation, this means that how we ‘story’ the environment determines how we understand and practice adaptation, how risks are defined, who is authorized as actors in the change debate, and the range of policy options considered. Furthermore, relating an experience through story-telling is already doing ‘knowledge work’, or learning. In taking narrative beyond its use as an extractive social research methodology, we argue that narrative research offers an innovative, holistic approach to a better understanding of socio-ecological systems and the improved, participatory design of local adaptation policies. Beyond producing data on local knowledge(s) and socio-cultural and affective-emotive factors influencing adaptive capacity, it can significantly inform public engagement, deliberation and learning strategies–features of systemic adaptive governance. We critically discuss narrative as both a self-reflective methodology and as a paradigmatic shift in future adaptation research and practice. We explore the narrative approach as a basis for participatory learning in the governance of socio-ecological systems. Finally, we assemble arguments for investing in alternative governance approaches consistent with a shift to a ‘narrative paradigm’.  相似文献   

Although there has been a great deal of research into Collaborative Information Retrieval (CIR) and Collaborative Information Seeking (CIS), the majority has assumed that team members have the same level of unrestricted access to underlying information. However, observations from different domains (e.g. healthcare, business, etc.) have suggested that collaboration sometimes involves people with differing levels of access to underlying information. This type of scenario has been referred to as Multi-Level Collaborative Information Retrieval (MLCIR). To the best of our knowledge, no studies have been conducted to investigate the effect of awareness, an existing CIR/CIS concept, on MLCIR. To address this gap in current knowledge, we conducted two separate user studies using a total of 5 different collaborative search interfaces and 3 information access scenarios. A number of Information Retrieval (IR), CIS and CIR evaluation metrics, as well as questionnaires, were used to compare the interfaces. Design interviews were also conducted after evaluations to obtain qualitative feedback from participants. Results suggested that query properties such as time spent on query, query popularity and query effectiveness could allow users to obtain information about the team's search performance and implicitly suggest better queries without disclosing sensitive data. In addition, having access to a history of intersecting viewed, relevant and bookmarked documents could have a positive effect similar to that of query properties. It was also found that being able to easily identify different team members and their actions is important for users in MLCIR. Based on our findings, we provide important design recommendations to help develop new CIR and MLCIR interfaces.

For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.

Interactive query expansion (IQE) (cf. [Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Science and Technology, 31, 121–187]) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. However, IQE is seldom used in operational settings. Two possible explanations for this are that IQE is generally not integrated into searchers’ established information-seeking behaviors (e.g., examining lists of documents), and it may not be offered at a time in the search when it is needed most (i.e., during the initial query formulation). These challenges can be addressed by coupling IQE more closely with familiar search activities, rather than as a separate functionality that searchers must learn. In this article we introduce and evaluate a variant of IQE known as Real-Time Query Expansion (RTQE). As a searcher enters their query in a text box at the interface, RTQE provides a list of suggested additional query terms, in effect offering query expansion options while the query is formulated. To investigate how the technique is used – and when it may be useful – we conducted a user study comparing three search interfaces: a baseline interface with no query expansion support; an interface that provides expansion options during query entry; and a third interface that provides options after queries have been submitted to a search system. The results show that offering RTQE leads to better quality initial queries, more engagement in the search, and an increase in the uptake of query expansion. However, the results also imply that care must be taken when implementing RTQE interactively. Our findings have broad implications for how IQE should be offered, and form part of our research on the development of techniques to support the increased use of query expansion.

In the traditional evaluation of information retrieval systems, assessors are asked to determine the relevance of a document on a graded scale, independent of any other documents. Such judgments are absolute judgments. Learning to rank brings some new challenges to this traditional evaluation methodology, especially regarding absolute relevance judgments. Recently, preference judgments have been investigated as an alternative. Instead of assigning a relevance grade to a document, an assessor looks at a pair of pages and judges which one is better. In this paper, we generalize pairwise preference judgments to relative judgments. We formulate the problem of relative judgments in a formal way and then propose a new strategy called Select-the-Best-Ones to solve the problem. Through user studies, we compare our proposed method with a pairwise preference judgment method and an absolute judgment method. The results indicate that users can distinguish by about one more relevance degree when using relative methods than when using the absolute method. Consequently, the relative methods generate 15–30% more document pairs for learning to rank. Compared to the pairwise method, our proposed method increases the agreement among assessors from 95% to 99%, while halving the labeling time and the number of pairs discordant with experts’ judgments.
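
The link between finer relevance distinctions and more training pairs can be illustrated with a small sketch. It assumes, as is common in pairwise learning to rank, that a training pair is formed only when two documents for the same query receive different grades; the grades below are hypothetical and the paper's own procedure may differ.

```python
from itertools import combinations


def preference_pairs(grades):
    """Return (better, worse) pairs for documents with different grades.

    grades maps a document id to its relevance grade for one query.
    Documents with equal grades yield no training pair.
    """
    pairs = []
    for a, b in combinations(sorted(grades), 2):
        if grades[a] > grades[b]:
            pairs.append((a, b))
        elif grades[b] > grades[a]:
            pairs.append((b, a))
    return pairs


# Hypothetical judgments for the same four documents.
coarse = {"d1": 2, "d2": 1, "d3": 1, "d4": 0}  # d2 and d3 tie
finer = {"d1": 3, "d2": 2, "d3": 1, "d4": 0}   # one more degree separates them

coarse_pairs = preference_pairs(coarse)
finer_pairs = preference_pairs(finer)
```

In this toy case the tie between `d2` and `d3` is broken by the extra relevance degree, so the finer judgments yield six pairs instead of five.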

This paper extends Goffman's Indirect Method of Information Retrieval by suggesting a more flexible search strategy. The suggested strategy in its simplified version is more effective than the Boolean strategy used in existing information retrieval systems and can be implemented at comparable costs. Some of its features might further increase the effectiveness of information retrieval systems.
