Similar literature
 20 similar documents found (search time: 750 ms)
1.
Searchers can face problems finding the information they seek. One reason for this is that they may have difficulty devising queries to express their information needs. In this article, we describe an approach that uses unobtrusive monitoring of interaction to proactively support searchers. The approach chooses terms to better represent information needs by monitoring searcher interaction with different representations of top-ranked documents. Information needs are dynamic and can change as a searcher views information. The approach we propose gathers evidence on potential changes in these needs and uses this evidence to choose new retrieval strategies. We present an evaluation of how well our technique estimates information needs, how well it estimates changes in these needs and the appropriateness of the interface support it offers. The results are presented and the avenues for future research identified.

2.
A critical challenge for Web search engines concerns how they present relevant results to searchers. The traditional approach is to produce a ranked list of results with title and summary (snippet) information, and these snippets are usually chosen based on the current query. Snippets play a vital sensemaking role, helping searchers to efficiently make sense of a collection of search results, as well as determine the likely relevance of individual results. Recently researchers have begun to explore how snippets might also be adapted based on searcher preferences as a way to better highlight relevant results to the searcher. In this paper we focus on the role of snippets in collaborative web search and describe a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers. We go on to show how this so-called social summarization technique can generate summaries that are significantly better adapted to searcher preferences and describe a novel personalized search interface that combines result recommendation with social summarization.

3.
Information seeking is traditionally conducted in environments where search results are represented at the user interface by a minimal amount of meta-information such as titles and query-based summaries. The goal of this form of presentation is to give searchers sufficient context to help them make informed interaction decisions without overloading them cognitively. The principle of polyrepresentation [Ingwersen, P. (1996). Cognitive perspectives of information retrieval interaction: elements of a cognitive IR theory. Journal of Documentation 52, 3–50] suggests that information retrieval (IR) systems should provide and use different cognitive structures during acts of communication to reduce the uncertainty associated with interactive IR. In previous work we have created content-rich search interfaces that implement an aspect of polyrepresentative theory, and are capable of displaying multiple representations of the retrieved documents simultaneously at the results interface. Searcher interaction with content-rich interfaces was used as implicit relevance feedback (IRF) to construct modified queries. These interfaces have been shown to be successful in experimentation with human subjects, but we do not know whether the information was presented in a way that makes good use of the display space, or whether the most useful components were positioned in easily accessible locations, for use in IRF. In this article we use simulations of searcher interaction behaviour as design tools to determine the most rational interface design for when IRF is employed. This research forms part of the iterative design of interfaces to proactively support searchers.

4.
Experimental results of cross-language information retrieval (CLIR) do not indicate why a model fails or how a model could be improved. One basic research question is thus whether it is possible to provide conditions by which one can analytically evaluate any existing or new CLIR strategy and improve the design of CLIR models. Inspired by the heuristics in monolingual IR, we introduce in this paper Dilution/Concentration (D/C) conditions to characterize good CLIR models based on direct intuitions under artificial settings. The conditions, derived from first principles in CLIR, generalize the idea of the query structuring approach. Empirical results with state-of-the-art CLIR models show that when a condition is not satisfied, it often indicates non-optimality of the method. In general, we find that the empirical performance of a retrieval formula is tightly related to how well it satisfies the conditions. Lastly, we propose, by following the D/C conditions, several novel CLIR models based on the information-based models, which again shows that the D/C conditions are effective in characterizing good CLIR models.

5.
For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.

6.
In this paper, we use time series analysis to evaluate predictive scenarios using search engine transactional logs. Our goal is to develop models for the analysis of searchers’ behaviors over time and investigate if time series analysis is a valid method for predicting relationships between searcher actions. Time series analysis is a method often used to understand the underlying characteristics of temporal data in order to make forecasts. In this study, we used a Web search engine transactional log and time series analysis to investigate users’ actions. We conducted our analysis in two phases. In the initial phase, we employed a basic analysis and found that 10% of searchers clicked on sponsored links. However, from 22:00 to 24:00, searchers almost exclusively clicked on the organic links, with almost no clicks on sponsored links. In the second and more extensive phase, we used a one-step prediction time series analysis method along with a transfer function method. The period rarely affects navigational and transactional queries, while rates for transactional queries vary during different periods. Our results show that the average length of a searcher session is approximately 2.9 interactions and that this average is consistent across time periods. Most importantly, our findings show that searchers who submit the shortest queries (i.e., in number of terms) click on the highest-ranked results. We discuss implications, including predictive value, and future research.

7.
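李村合 (Li Cunhe), 刘竞 (Liu Jing). 《情报科学》 (Information Science), 2005, 23(6): 905-907, 916
Cloaking is a technique in which the page shown to Google and other search engines differs from the page shown to ordinary visitors. Its purpose is usually to manipulate search engine rankings, so that ordinary users are misled when they look for information in search results. This paper introduces the definition of cloaking and the reasons it arose, and analyzes and evaluates cloaking techniques.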

8.
Interactive query expansion (IQE) (cf. [Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Systems and Technology, 31, 121–187]) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. However, IQE is seldom used in operational settings. Two possible explanations for this are that IQE is generally not integrated into searchers’ established information-seeking behaviors (e.g., examining lists of documents), and it may not be offered at a time in the search when it is needed most (i.e., during the initial query formulation). These challenges can be addressed by coupling IQE more closely with familiar search activities, rather than as a separate functionality that searchers must learn. In this article we introduce and evaluate a variant of IQE known as Real-Time Query Expansion (RTQE). As a searcher enters their query in a text box at the interface, RTQE provides a list of suggested additional query terms, in effect offering query expansion options while the query is formulated. To investigate how the technique is used – and when it may be useful – we conducted a user study comparing three search interfaces: a baseline interface with no query expansion support; an interface that provides expansion options during query entry; and a third interface that provides options after queries have been submitted to a search system. The results show that offering RTQE leads to better quality initial queries, more engagement in the search, and an increase in the uptake of query expansion. However, the results also imply that care must be taken when implementing RTQE interactively. Our findings have broad implications for how IQE should be offered, and form part of our research on the development of techniques to support the increased use of query expansion.

9.
Taylor (1968) dramatically stated that information seekers/searchers do not use their real Q1-level of information need when formulating their query to the system. Instead, they use a compromised Q4-level form of their need. The article directly confronts what Taylor's (1968) Q1-level information need is: the “actual” or “real” information need of the searcher. The article conceptually and operationally defines Taylor's Q1-level of information need, using Belkin's (1980) ASK concept as a basis for designing a system intervention that shifts the searcher from representing the Q4-level compromised form of the need in her query to representing instead her Q1-level real information need. The article describes the Q1 Actualizing Intervention Model, which can be built into a system capable of actualizing the uncertainty distribution of the searcher's belief ASK so that information search is directed by the searcher's real Q1-level information need. The objective of the Q1 Actualizing Intervention Model is to enable in our Knowledge Age the introduction of intervention IR systems that are organic and human-centric, designed to initiate organic knowledge production processes in the searcher.

10.
This paper presents a study of relevance feedback in a cross-language information retrieval environment. We have performed an experiment in which Portuguese speakers are asked to judge the relevance of English documents, documents hand-translated to Portuguese, and documents automatically translated to Portuguese. The goals of the experiment were to answer two questions: (i) how well can native Portuguese searchers recognise relevant documents written in English, compared to documents that are hand translated and automatically translated to Portuguese; and (ii) what is the impact of misjudged documents on the performance improvement that can be achieved by relevance feedback (RF). Surprisingly, the results show that machine translation is as effective as hand translation in aiding users to assess relevance in the experiment. In addition, the impact of misjudged documents on the performance of RF is overall just moderate, and varies greatly for different query topics.

11.
In contrast with their monolingual counterparts, little attention has been paid to the effects that misspelled queries have on the performance of Cross-Language Information Retrieval (CLIR) systems. The present work makes a first attempt to fill this gap by extending our previous work on monolingual retrieval in order to study the impact that the progressive addition of misspellings to input queries has, this time, on the output of CLIR systems. Two approaches for dealing with this problem are analyzed in this paper. The first is the use of automatic spelling correction techniques, for which we consider two algorithms: one for the correction of isolated words and one for correction based on the linguistic context of the misspelled word. The second approach is the use of character n-grams both as index terms and translation units, seeking to take advantage of their inherent robustness and language-independence. All these approaches have been tested on a Spanish-to-English CLIR system, that is, Spanish queries on English documents. Real, user-generated spelling errors have been used under a methodology that allows us to study the effectiveness of the different approaches and their behavior when confronted with different error rates. The results obtained show the high sensitivity of classic word-based approaches to misspelled queries, although spelling correction techniques can mitigate such negative effects. On the other hand, the use of character n-grams provides great robustness against misspellings.
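
As a rough illustration of why character n-grams tolerate misspellings, the sketch below splits words into overlapping n-grams and compares a correctly spelled word with a misspelled one. It is not the system described in the abstract: the function names `char_ngrams` and `dice_overlap`, the choice of n = 4, and the Dice coefficient as the similarity measure are all assumptions made for illustration.

```python
def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of each whitespace-separated word.

    Words no longer than n characters are kept whole (an illustrative choice).
    """
    grams = []
    for word in text.lower().split():
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def dice_overlap(a, b, n=4):
    """Dice coefficient between the n-gram sets of two strings (hypothetical helper)."""
    grams_a, grams_b = set(char_ngrams(a, n)), set(char_ngrams(b, n))
    if not grams_a and not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# A transposition error breaks an exact word match, but some n-grams survive.
print(char_ngrams("tomato"))                        # ['toma', 'omat', 'mato']
print(dice_overlap("information", "infromation"))   # 0.375
```

With whole-word indexing the misspelled query term would match nothing, whereas here three of its eight 4-grams still match the correct form.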

12.
In this paper, we explore the effects of individual pressure level and time constraint on searchers' behaviors and their assessment of search experience within the framework of interactive information retrieval (IIR). A user experiment was conducted in which 40 participants individually searched for information in a laboratory setting under two conditions: with time constraint (TC) and with no time constraint (NTC). Participants filled in a Perceived Stress Scale questionnaire to measure their chronic pressure value (subjective stress), and their pressure value was recorded as their individual characteristic. The results showed that the more chronic pressure searchers have, the more search effort they devote, including more time in searching and more time to complete the search tasks, especially when there was no time constraint. Time constraint and searchers’ pressure value had a significant effect on users’ numbers of scrolling actions per minute. The results indicate that when given a time constraint, searchers with higher pressure values tend to lower their reading or scanning speed, while searchers with lower pressure values tend to accelerate their reading or scanning speed. The results suggest that different people react to a change in the time condition in different ways, especially people with higher pressure. Therefore, it is necessary to examine users’ search behaviors in person-in-situation frameworks to analyze the effects of contextual factors on users. This study contributes to our knowledge of how contextual factors and individual characteristics affect searchers’ behaviors and has implications for the design of IIR systems.

13.
Cross-lingual semantic interoperability has drawn significant attention in recent digital library and World Wide Web research as the information in languages other than English has grown exponentially. Cross-lingual information retrieval (CLIR) across different European languages, such as English, Spanish, and French, has been widely explored; however, CLIR across European languages and Oriental languages is still in the initial stage. To cross the language boundary, the corpus-based approach is promising for overcoming the limitations of the knowledge-based and controlled-vocabulary approaches, but collecting parallel corpora between a European language and an Oriental language is not an easy task. Length-based and text-based approaches are the two major approaches to aligning parallel documents. In this paper, we investigate several techniques using these approaches and compare their performance in aligning English and Chinese titles of parallel documents available on the Web.
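
To make the idea of a length-based approach concrete, the sketch below scores a candidate English–Chinese title pair by how close the ratio of their character lengths is to an expected ratio, then picks the best-scoring candidate. This is only a minimal illustration, not one of the techniques evaluated in the abstract: the names `length_score` and `best_match` are hypothetical, and the default `expected_ratio=3.0` is an arbitrary placeholder that would need to be estimated from real aligned data.

```python
def length_score(en_title, zh_title, expected_ratio=3.0):
    """Score in (0, 1]; 1.0 when len(en) / len(zh) equals expected_ratio.

    English length ignores spaces; Chinese length counts characters.
    expected_ratio is an assumed placeholder, not a measured value.
    """
    en_len = len(en_title.replace(" ", ""))
    zh_len = len(zh_title)
    if en_len == 0 or zh_len == 0:
        return 0.0
    ratio = en_len / zh_len
    return min(ratio, expected_ratio) / max(ratio, expected_ratio)


def best_match(en_title, zh_candidates, expected_ratio=3.0):
    """Return the candidate Chinese title whose length best fits the English title."""
    return max(zh_candidates,
               key=lambda zh: length_score(en_title, zh, expected_ratio))
```

A length signal alone is weak when many titles have similar lengths, which is presumably why it is contrasted with text-based approaches that look at the words themselves.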

14.
In this paper, we compile and review several experiments measuring cross-lingual information retrieval (CLIR) performance as a function of the following resources: bilingual term lists, parallel corpora, machine translation (MT), and stemmers. Our CLIR system uses a simple probabilistic language model; the studies used TREC test corpora over Chinese, Spanish and Arabic. Our findings include:
  • One can achieve an acceptable CLIR performance using only a bilingual term list (70–80% on Chinese and Arabic corpora).
  • However, if a bilingual term list and parallel corpora are available, CLIR performance can rival monolingual performance.
  • If no parallel corpus is available, pseudo-parallel texts produced by an MT system can partially overcome the lack of parallel text.
  • While stemming is normally useful, with a very large parallel corpus for Arabic–English, stemming hurt performance in our empirical studies with Arabic, a highly inflected language.

15.
Search engine researchers typically depict search as the solitary activity of an individual searcher. In contrast, results from our critical-incident survey of 150 users on Amazon’s Mechanical Turk service suggest that social interactions play an important role throughout the search process. A second survey, also of 150 users but focused instead on difficulties encountered during searches, suggests similar conclusions. These social interactions range from highly coordinated collaborations with shared goals to loosely coordinated collaborations in which only advice is sought. Our main contribution is that we have integrated models from previous work in sensemaking and information-seeking behavior to present a canonical social model of user activities before, during, and after a search episode, suggesting where in the search process both explicitly and implicitly shared information may be valuable to individual searchers.

16.
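In citation retrieval and indexing verification services (查收查引), some documents cannot be found. Failing to retrieve a document affects users and also indicates that the library needs to improve its service quality. This article proposes targeted strategies and suggestions from three perspectives: database vendors, users, and search staff.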

17.
This introductory paper covers not only the research content of the articles in this special issue of IP&M but attempts to characterize the state-of-the-art in the Cross-Language Information Retrieval (CLIR) domain. We present our view of some major directions for CLIR research in the future. In particular, we find that insufficient attention has been given to the Web as a resource for multilingual research, and to languages which are spoken by hundreds of millions of people in the world but have been mainly neglected by the CLIR research community. In addition, we find that most CLIR evaluation has focussed narrowly on the news genre to the exclusion of other important genres such as scientific and technical literature. The paper concludes by describing an ambitious 5-year research plan proposed by James Mayfield and Paul McNamee.

18.
Based on a mixed circulation management model in which documents can be borrowed from and returned to either the new library or the old library, and with the aim of reducing cost and development complexity, this paper designs a Radio Frequency Identification (RFID) intelligent library management system that is compatible with the old document management system and can run alongside it. The system makes full use of the central database of the original library document information management system and uses middleware to integrate seamlessly with the old system. This both demonstrates the technical advantages of the RFID intelligent library management system and achieves the management goal of borrowing and returning documents across the new and old libraries.

19.
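王昊 (Wang Hao). 《情报科学》 (Information Science), 2005, 23(10): 1573-1578
This paper discusses a system model that combines cross-language information retrieval (CLIR) technology with digital library (D-Lib) technology. It first introduces the concepts of CLIR and D-Lib and the related technologies involved. It then discusses the necessity and feasibility of applying CLIR technology in D-Lib. Next, it combines the CLIR technology platform with the D-Lib system architecture to design a CLIR-based D-Lib system model. Finally, it offers the author's views on the problems that currently exist in applying CLIR technology together with D-Lib.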

20.
A main challenge in Cross-Language Information Retrieval (CLIR) is to estimate a proper translation model from available translation resources, since translation quality directly affects the retrieval performance. Among different translation resources, we focus on obtaining translation models from comparable corpora, because they provide appropriate translations for both languages and domains with limited linguistic resources. In this paper, we employ a two-step approach to build an effective translation model from comparable corpora, without requiring any additional linguistic resources, for the CLIR task. In the first step, translations are extracted by deriving correlations between source–target word pairs. These correlations are used to estimate word translation probabilities in the second step. We propose a language modeling approach for the first step, where modeling based on probability distribution provides two key advantages. First, our approach can be tuned more easily than heuristically adjusted previous work. Second, it provides a principled basis for integrating additional lexical and translational relations to improve the accuracy of translations from comparable corpora. As an indication, we integrate monolingual relations of word co-occurrences into the process of translation extraction, which helps to extract more reliable translations for low-frequency words in a comparable corpus. Experimental results on an English–Persian comparable corpus show that our method outperforms the previous approaches in terms of both translation quality and the performance of CLIR. Indeed, the proposed method is naturally applicable to any comparable corpus, regardless of its languages. In addition, we demonstrate the significant impact of word translation probabilities, estimated in the second step of our approach, on the performance of CLIR.
