首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 662 毫秒
1.
We present an image retrieval framework based on automatic query expansion in a concept feature space by generalizing the vector space model of information retrieval. In this framework, images are represented by vectors of weighted concepts similar to the keyword-based representation used in text retrieval. To generate the concept vocabularies, a statistical model is built by utilizing Support Vector Machine (SVM)-based classification techniques. The images are represented as "bag of concepts" that comprise perceptually and/or semantically distinguishable color and texture patches from local image regions in a multi-dimensional feature space. To explore the correlation between the concepts and overcome the assumption of feature independence in this model, we propose query expansion techniques in the image domain from a new perspective based on both local and global analysis. For the local analysis, the correlations between the concepts based on the co-occurrence pattern, and the metrical constraints based on the neighborhood proximity between the concepts in encoded images, are analyzed by considering local feedback information. We also analyze the concept similarities in the collection as a whole in the form of a similarity thesaurus and propose an efficient query expansion based on the global analysis. The experimental results on a photographic collection of natural scenes and a biomedical database of different imaging modalities demonstrate the effectiveness of the proposed framework in terms of precision and recall.  相似文献   

2.
Traditional content based image retrieval attempts to retrieve images using syntactic features for a query image. Annotated image banks and Google allow the use of text to retrieve images. In this paper, we studied the task of using the content of an image to retrieve information in general. We describe the significance of object identification in an information retrieval paradigm that uses image set as intermediate means in indexing and matching. We also describe a unique Singapore Tourist Object Identification Collection with associated queries and relevance judgments for evaluating the new task and the need for efficient image matching using simple image features. We present comprehensive experimental evaluation on the effects of feature dimensions, context, spatial weightings, coverage of image indexes, and query devices on task performance. Lastly we describe the current system developed to support mobile image-based tourist information retrieval.  相似文献   

3.
Web searchers commonly have difficulties crafting queries to fulfill their information needs; even after they are able to craft a query, they often find it challenging to evaluate the results of their Web searches. Sources of these problems include the lack of support for constructing and refining queries, and the static nature of the list-based representations of Web search results. WordBars has been developed to assist users in their Web search and exploration tasks. This system provides a visual representation of the frequencies of the terms found in the first 100 document surrogates returned from an initial query, in the form of a histogram. Exploration of the search results is supported through term selection in the histogram, resulting in a re-sorting of the search results based on the use of the selected terms in the document surrogates. Terms from the histogram can be easily added or removed from the query, generating a new set of search results. Examples illustrate how WordBars can provide valuable support for query refinement and search results exploration, both when vague and specific initial queries are provided. User evaluations with both expert and intermediate Web searchers illustrate the benefits of the interactive exploration features of WordBars in terms of effectiveness as well as subjective measures. Although differences were found in the demographics of these two user groups, both were able to benefit from the features of WordBars.  相似文献   

4.
5.
A critical challenge for Web search engines concerns how they present relevant results to searchers. The traditional approach is to produce a ranked list of results with title and summary (snippet) information, and these snippets are usually chosen based on the current query. Snippets play a vital sensemaking role, helping searchers to efficiently make sense of a collection of search results, as well as determine the likely relevance of individual results. Recently researchers have begun to explore how snippets might also be adapted based on searcher preferences as a way to better highlight relevant results to the searcher. In this paper we focus on the role of snippets in collaborative web search and describe a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers. We go on to show how this so-called social summarization technique can generate summaries that are significantly better adapted to searcher preferences and describe a novel personalized search interface that combines result recommendation with social summarization.  相似文献   

6.
综合运用基于文本与基于内容技术检索Web图像   总被引:1,自引:0,他引:1  
黄崑 《情报科学》2004,22(11):1391-1395,1408
本文介绍了基于文本和基于内容的图像检索技术,并归纳分析了Web图像的特点,指出综合运用文本和内容信息共同检索Web图像,最后对建立Web图像搜索引擎提出了建议。  相似文献   

7.
Content-based image retrieval (CBIR) with global features is notoriously noisy, especially for image queries with low percentages of relevant images in a collection. Moreover, CBIR typically ranks the whole collection, which is inefficient for large databases. We experiment with a method for image retrieval from multimedia databases, which improves both the effectiveness and efficiency of traditional CBIR by exploring secondary media. We perform retrieval in a two-stage fashion: first rank by a secondary medium, and then perform CBIR only on the top-K items. Thus, effectiveness is improved by performing CBIR on a ‘better’ subset. Using a relatively ‘cheap’ first stage, efficiency is also improved via the fewer CBIR operations performed. Our main novelty is that K is dynamic, i.e. estimated per query to optimize a predefined effectiveness measure. We show that our dynamic two-stage method can be significantly more effective and robust than similar setups with static thresholds previously proposed. In additional experiments using local feature derivatives in the visual stage instead of global, such as the emerging visual codebook approach, we find that two-stage does not work very well. We attribute the weaker performance of the visual codebook to the enhanced visual diversity produced by the textual stage which diminishes codebook’s advantage over global features. Furthermore, we compare dynamic two-stage retrieval to traditional score-based fusion of results retrieved visually and textually. We find that fusion is also significantly more effective than single-medium baselines. Although, there is no clear winner between two-stage and fusion, the methods exhibit different robustness features; nevertheless, two-stage retrieval provides efficiency benefits over fusion.  相似文献   

8.
In this paper, we propose a re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR). In conventional CBIR systems, it is often observed that images visually dissimilar to a query image are ranked high in retrieval results. To remedy this problem, we utilize the similarity relationship of the retrieved results via post-retrieval clustering. In the first step of our method, images are retrieved using visual features such as color histogram. Next, the retrieved images are analyzed using hierarchical agglomerative clustering methods (HACM) and the rank of the results is adjusted according to the distance of a cluster from a query. In addition, we analyze the effects of clustering methods, query-cluster similarity functions, and weighting factors in the proposed method. We conducted a number of experiments using several clustering methods and cluster parameters. Experimental results show that the proposed method achieves an improvement of retrieval effectiveness of over 10% on average in the average normalized modified retrieval rank (ANMRR) measure.  相似文献   

9.
Both general and domain-specific search engines have adopted query suggestion techniques to help users formulate effective queries. In the specific domain of literature search (e.g., finding academic papers), the initial queries are usually based on a draft paper or abstract, rather than short lists of keywords. In this paper, we investigate phrasal-concept query suggestions for literature search. These suggestions explicitly specify important phrasal concepts related to an initial detailed query. The merits of phrasal-concept query suggestions for this domain are their readability and retrieval effectiveness: (1) phrasal concepts are natural for academic authors because of their frequent use of terminology and subject-specific phrases and (2) academic papers describe their key ideas via these subject-specific phrases, and thus phrasal concepts can be used effectively to find those papers. We propose a novel phrasal-concept query suggestion technique that generates queries by identifying key phrasal-concepts from pseudo-labeled documents and combines them with related phrases. Our proposed technique is evaluated in terms of both user preference and retrieval effectiveness. We conduct user experiments to verify a preference for our approach, in comparison to baseline query suggestion methods, and demonstrate the effectiveness of the technique with retrieval experiments.  相似文献   

10.
Professional, workplace searching is different from general searching, because it is typically limited to specific facets and targeted to a single answer. We have developed the semantic component (SC) model, which is a search feature that allows searchers to structure and specify the search to context-specific aspects of the main topic of the documents. We have tested the model in an interactive searching study with family doctors with the purpose to explore doctors’ querying behaviour, how they applied the means for specifying a search, and how these features contributed to the search outcome. In general, the doctors were capable of exploiting system features and search tactics during the searching. Most searchers produced well-structured queries that contained appropriate search facets. When searches failed it was not due to query structure or query length. Failures were mostly caused by the well-known vocabulary problem. The problem was exacerbated by using certain filters as Boolean filters. The best working queries were structured into 2–3 main facets out of 3–5 possible search facets, and expressed with terms reflecting the focal view of the search task. The findings at the same time support and extend previous results about query structure and exhaustivity showing the importance of selecting central search facets and express them from the perspective of search task. The SC model was applied in the highest performing queries except one. The findings suggest that the model might be a helpful feature to structure queries into central, appropriate facets, and in returning highly relevant documents.  相似文献   

11.
Interactive query expansion (IQE) (c.f. [Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Systems and Technology, 31, 121–187]) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. However, IQE is seldom used in operational settings. Two possible explanations for this are that IQE is generally not integrated into searchers’ established information-seeking behaviors (e.g., examining lists of documents), and it may not be offered at a time in the search when it is needed most (i.e., during the initial query formulation). These challenges can be addressed by coupling IQE more closely with familiar search activities, rather than as a separate functionality that searchers must learn. In this article we introduce and evaluate a variant of IQE known as Real-Time Query Expansion (RTQE). As a searcher enters their query in a text box at the interface, RTQE provides a list of suggested additional query terms, in effect offering query expansion options while the query is formulated. To investigate how the technique is used – and when it may be useful – we conducted a user study comparing three search interfaces: a baseline interface with no query expansion support; an interface that provides expansion options during query entry, and a third interface that provides options after queries have been submitted to a search system. The results show that offering RTQE leads to better quality initial queries, more engagement in the search, and an increase in the uptake of query expansion. However, the results also imply that care must be taken when implementing RTQE interactively. Our findings have broad implications for how IQE should be offered, and form part of our research on the development of techniques to support the increased use of query expansion.  相似文献   

12.
13.
Traditional information retrieval techniques that primarily rely on keyword-based linking of the query and document spaces face challenges such as the vocabulary mismatch problem where relevant documents to a given query might not be retrieved simply due to the use of different terminology for describing the same concepts. As such, semantic search techniques aim to address such limitations of keyword-based retrieval models by incorporating semantic information from standard knowledge bases such as Freebase and DBpedia. The literature has already shown that while the sole consideration of semantic information might not lead to improved retrieval performance over keyword-based search, their consideration enables the retrieval of a set of relevant documents that cannot be retrieved by keyword-based methods. As such, building indices that store and provide access to semantic information during the retrieval process is important. While the process for building and querying keyword-based indices is quite well understood, the incorporation of semantic information within search indices is still an open challenge. Existing work have proposed to build one unified index encompassing both textual and semantic information or to build separate yet integrated indices for each information type but they face limitations such as increased query process time. In this paper, we propose to use neural embeddings-based representations of term, semantic entity, semantic type and documents within the same embedding space to facilitate the development of a unified search index that would consist of these four information types. We perform experiments on standard and widely used document collections including Clueweb09-B and Robust04 to evaluate our proposed indexing strategy from both effectiveness and efficiency perspectives. Based on our experiments, we find that when neural embeddings are used to build inverted indices; hence relaxing the requirement to explicitly observe the posting list key in the indexed document: (a) retrieval efficiency will increase compared to a standard inverted index, hence reduces the index size and query processing time, and (b) while retrieval efficiency, which is the main objective of an efficient indexing mechanism improves using our proposed method, retrieval effectiveness also retains competitive performance compared to the baseline in terms of retrieving a reasonable number of relevant documents from the indexed corpus.  相似文献   

14.
XMage is introduced in this paper as a method for partial similarity searching in image databases. Region-based image retrieval is a method of retrieving partially similar images. It has been proposed as a way to accurately process queries in an image database. In region-based image retrieval, region matching is indispensable for computing the partial similarity between two images because the query processing is based upon regions instead of the entire image. A naive method of region matching is a sequential comparison between regions, which causes severe overhead and deteriorates the performance of query processing. In this paper, a new image contents representation, called Condensed eXtended Histogram (CXHistogram), is presented in conjunction with a well-defined distance function CXSim() on the CX-Histogram. The CXSim() is a new image-to-image similarity measure to compute the partial similarity between two images. It achieves the effect of comparing regions of two images by simply comparing the two images. The CXSim() reduces query space by pruning irrelevant images, and it is used as a filtering function before sequential scanning. Extensive experiments were performed on real image data to evaluate XMage. It provides a significant pruning of irrelevant images with no false dismissals. As a consequence, it achieves up to 5.9-fold speed-up in search over the R*-tree search followed by sequential scanning.  相似文献   

15.
16.
Content-based image retrieval systems require the development of relevance feedback mechanisms that allow the user to progressively refine the system's response to a query. In this paper a new relevance feedback mechanism is described which evaluates the feature distributions of the images judged relevant, or not relevant, by the user and dynamically updates both the similarity measure and the query in order to accurately represent the user's particular information needs. Experimental results demonstrate the effectiveness of this mechanism.  相似文献   

17.
Searching for relevant material that satisfies the information need of a user, within a large document collection is a critical activity for web search engines. Query Expansion techniques are widely used by search engines for the disambiguation of user’s information need and for improving the information retrieval (IR) performance. Knowledge-based, corpus-based and relevance feedback, are the main QE techniques, that employ different approaches for expanding the user query with synonyms of the search terms (word synonymy) in order to bring more relevant documents and for filtering documents that contain search terms but with a different meaning (also known as word polysemy problem) than the user intended. This work, surveys existing query expansion techniques, highlights their strengths and limitations and introduces a new method that combines the power of knowledge-based or corpus-based techniques with that of relevance feedback. Experimental evaluation on three information retrieval benchmark datasets shows that the application of knowledge or corpus-based query expansion techniques on the results of the relevance feedback step improves the information retrieval performance, with knowledge-based techniques providing significantly better results than their simple relevance feedback alternatives in all sets.  相似文献   

18.
In this paper, we explore the effects of individual pressure level and time constraint on searchers' behaviors and their assessment of search experience within the framework of interactive information retrieval. A user experiment was conducted in which 40 participants individually searched for information in a laboratory setting under two conditions: with time constraint (TC) and with no time constraint (NTC). Participants filled in a Perceived Stress Scale questionnaire to measure their chronic pressure value (subjective stress), and their pressure value was recorded as their individual characteristic. The results showed that the more chronic pressure the searcher has, the more search efforts they devote, including more time in searching and more time to complete the search tasks, especially when there was no time constraint. Time constraint and searchers’ pressure value had a significant effect on users’ numbers of scrolling actions per minute. The results indicate that when given a time constraint, searchers with higher-pressure values tend to lower their reading or scanning speed, while searchers with lower-pressure values tend to accelerate their reading or scanning speed. The results suggested different people would react to the time condition change in different ways, especially people with higher pressure. Therefore, it is necessary to examine users’ search behaviors in person-in-situation frameworks to analyze the effects of contextual factors on users. This study contributes to our knowledge of how contextual factors and individual characteristics affect searchers’ behaviors and have implications for the design of IIR systems.  相似文献   

19.
Ontologies are frequently used in information retrieval being their main applications the expansion of queries, semantic indexing of documents and the organization of search results. Ontologies provide lexical items, allow conceptual normalization and provide different types of relations. However, the optimization of an ontology to perform information retrieval tasks is still unclear. In this paper, we use an ontology query model to analyze the usefulness of ontologies in effectively performing document searches. Moreover, we propose an algorithm to refine ontologies for information retrieval tasks with preliminary positive results.  相似文献   

20.
Web search engines are beginning to offer access to multimedia searching, including audio, video and image searching. In this paper we report findings from a study examining the state of multimedia search functionality on major general and specialized Web search engines. We investigated 102 Web search engines to examine: (1) how many Web search engines offer multimedia searching, (2) the type of multimedia search functionality and methods offered, such as “query by example”, and (3) the supports for personalization or customization which are accessible as advanced search. Findings include: (1) few major Web search engines offer multimedia searching and (2) multimedia Web search functionality is generally limited. Our findings show that despite the increasing level of interest in multimedia Web search, those few Web search engines offering multimedia Web search, provide limited multimedia search functionality. Keywords are still the only means of multimedia retrieval, while other methods such as “query by example” are offered by less than 1% of Web search engines examined.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号