Found 20 similar documents; search took 31 ms
1.
2.
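To achieve effective multilingual access to Web or digital library information, user interaction must be fully taken into account. Through a user experiment, this study examines the role of user relevance feedback mechanisms in multilingual information access and analyzes the characteristics of user behavior. The experimental results show that query expansion, translation refinement, and the combination of the two are all effective user relevance feedback methods.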
3.
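Faced with ever-expanding multilingual information resources, cross-language information retrieval has become a key technology for global knowledge access and sharing. This work builds a practical query translation interface for cross-language retrieval that can be easily embedded in any information retrieval platform, extending the multilingual processing capability of existing platforms. The interface applies several translation disambiguation strategies, based on longest phrases, query classification, and a probabilistic dictionary. It is evaluated in terms of both query translation accuracy and runtime efficiency, and the experimental results confirm that the adopted methods are feasible.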
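The longest-phrase and probabilistic-dictionary strategies mentioned in this abstract can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: the function `translate_query`, the toy dictionary `toy_dict`, and the greedy matching rule are hypothetical and are not taken from the paper, and the query-classification step is omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical structure: source phrase (tuple of tokens) -> {translation: probability}.
PhraseDict = Dict[Tuple[str, ...], Dict[str, float]]


def translate_query(tokens: List[str], phrase_dict: PhraseDict, max_len: int = 4) -> List[str]:
    """Greedy longest-phrase-first translation; picks the most probable candidate."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest span starting at position i first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidates = phrase_dict.get(tuple(tokens[i:i + n]))
            if candidates:
                out.append(max(candidates, key=candidates.get))
                i += n
                break
        else:
            # No dictionary entry of any length: keep the token untranslated.
            out.append(tokens[i])
            i += 1
    return out


# Toy data, invented for illustration only.
toy_dict: PhraseDict = {
    ("信息", "检索"): {"information retrieval": 0.9, "message search": 0.1},
    ("信息",): {"information": 0.7, "message": 0.3},
    ("检索",): {"retrieval": 0.6, "search": 0.4},
    ("跨", "语言"): {"cross-language": 1.0},
}

print(translate_query(["跨", "语言", "信息", "检索", "系统"], toy_dict))
# → ['cross-language', 'information retrieval', '系统']
```

Matching the two-token phrase before its parts avoids translating "信息" and "检索" independently, which is the kind of ambiguity a longest-phrase strategy is meant to reduce.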
4.
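Application Mechanisms of Ontologies in Cross-Language Information Retrieval   Total citations: 3, self-citations: 1, citations by others: 2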
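Explains the meaning of a multilingual ontology and points out the domain knowledge it corresponds to in different languages. Analyzes how multilingual ontologies improve cross-language information retrieval in three respects: query expansion, semantic annotation, and concept-based indexing. By introducing the concept-correspondence methods used for multilingual ontologies in the EuroWordNet and Cindor systems, the paper discusses the mapping of multilingual ontology bases, the most critical issue in applying ontologies to cross-language information retrieval. It concludes that using an intermediate language for concept representation, and linking it to the vocabulary of different languages through dictionary translation, is a good approach to multilingual ontology mapping.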
5.
Multilingual retrieval (querying of multiple document collections each in a different language) can be achieved by combining several individual techniques which enhance retrieval: machine translation to cross the language barrier, relevance feedback to add words to the initial query, decompounding for languages with complex term structure, and data fusion to combine monolingual retrieval results from different languages. Using the CLEF 2001 and CLEF 2002 topics and document collections, this paper evaluates these techniques within the context of a monolingual document ranking formula based upon logistic regression. Each individual technique yields improved performance over runs which do not utilize that technique. Moreover the techniques are complementary, in that combining the best techniques outperforms individual technique performance. An approximate but fast document translation using bilingual wordlists created from machine translation systems is presented and evaluated. The fast document translation is as effective as query translation in multilingual retrieval. Furthermore, when fast document translation is combined with query translation in multilingual retrieval, the performance is significantly better than that of query translation or fast document translation.
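The "approximate but fast document translation" described here amounts to a word-by-word lookup in a precomputed bilingual wordlist. A minimal sketch of that idea follows; the function name `fast_translate`, the lower-casing rule, and the toy German-English wordlist are assumptions made for illustration, not details from the paper.

```python
from typing import Dict, List


def fast_translate(tokens: List[str], wordlist: Dict[str, str]) -> List[str]:
    """Replace each token by its single wordlist translation; copy unknown tokens unchanged."""
    return [wordlist.get(token.lower(), token) for token in tokens]


# Hypothetical wordlist; in the described setting it would be built once by
# running the collection vocabulary through a machine translation system.
toy_wordlist = {"haus": "house", "und": "and", "garten": "garden"}

print(fast_translate(["Haus", "und", "Garten", "2001"], toy_wordlist))
# → ['house', 'and', 'garden', '2001']
```

Because each vocabulary entry is translated only once, the per-document cost is a dictionary lookup per token, which is what makes translating a whole collection affordable.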
6.
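[Purpose/Significance] To make effective use of multilingual shared database resources for the Belt and Road Initiative, the problem of cross-language retrieval must be solved. Based on a survey of the retrieval functions of existing Belt and Road databases, this paper analyzes the retrieval requirements of Belt and Road multilingual shared databases and, from the perspective of a survey of cross-language retrieval platforms, provides a reference for designing and developing their cross-language retrieval functions. [Method/Process] Using literature review and Web survey methods, 11 typical cross-language retrieval platforms from China and abroad were selected and analyzed in six respects: cross-language retrieval method, translation implementation method, retrieval function settings, presentation of retrieval results, interface, and supported retrieval languages; their implementation methods are summarized. [Results/Conclusions] Strategies are proposed for the design and development of cross-language retrieval functions in Belt and Road multilingual shared databases: adopt a query-document translation approach based on neural machine translation, implement multiple retrieval functions, apply visualization techniques to present retrieval results, and provide multilingual retrieval interfaces and resources.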
7.
《Library & information science research》2005,27(2):249-263
The problem of language in Web searching has been discussed primarily in the area of cross-language information retrieval (CLIR). However, much CLIR research centers on investigation of the effectiveness of automatic translation techniques. The case study reported here explored bilingual user behaviors, perceptions, and preferences with respect to the capability of the Web as a multilingual information resource. Twenty-eight bilingual academic users from Myongji University in Korea were recruited for the study. Findings show that the subjects did not use Web search engines as multilingual tools. For search queries, they selected a language that represents their information need most accurately depending on the types of information task rather than choosing their first language. Subjects expressed concerns about the accuracy of machine translation of scholarly terminologies and preferred to have user control over multilingual Web searches.
8.
Martin Braschler 《Information Retrieval》2004,7(1-2):183-204
We describe the Eurospider component for Cross-Language Information Retrieval (CLIR) that has been employed for experiments at all three CLEF campaigns to date. The central aspect of our efforts is the use of combination approaches, effectively combining multiple language pairs, translation resources and translation methods into one multilingual retrieval system. We discuss the implications of building a system that allows flexible combination, give details of the various translation resources and methods, and investigate the impact of merging intermediate results generated by the individual steps. An analysis of the resulting combination system is given which also takes into account additional requirements when deploying the system as a component in an operational, commercial setting.
9.
Multilingual information retrieval is generally understood to mean the retrieval of relevant information in multiple target
languages in response to a user query in a single source language. In a multilingual federated search environment, different
information sources contain documents in different languages. A general search strategy in multilingual federated search environments
is to translate the user query to each language of the information sources and run a monolingual search in each information
source. It is then necessary to obtain a single ranked document list by merging the individual ranked lists from the information
sources that are in different languages. This is known as the results merging problem for multilingual information retrieval.
Previous research has shown that the simple approach of normalizing source-specific document scores is not effective. On the
other hand, a more effective merging method was proposed to download and translate all retrieved documents into the source
language and generate the final ranked list by running a monolingual search in the search client. The latter method is more
effective but is associated with a large amount of online communication and computation costs. This paper proposes an effective
and efficient approach for the results merging task of multilingual ranked lists. Particularly, it downloads only a small
number of documents from the individual ranked lists of each user query to calculate comparable document scores by utilizing
both the query-based translation method and the document-based translation method. Then, query-specific and source-specific
transformation models can be trained for individual ranked lists by using the information of these downloaded documents. These
transformation models are used to estimate comparable document scores for all retrieved documents and thus the documents can
be sorted into a final ranked list. This merging approach is efficient as only a subset of the retrieved documents are downloaded
and translated online. Furthermore, an extensive set of experiments on the Cross-Language Evaluation Forum (CLEF) data has demonstrated the effectiveness of the query-specific and source-specific results merging algorithm against other
alternatives. The new research in this paper proposes different variants of the query-specific and source-specific results
merging algorithm with different transformation models. This paper also provides thorough experimental results as well as
detailed analysis. All of the work substantially extends the preliminary research in (Si and Callan, in: Peters (ed.) Results
of the cross-language evaluation forum-CLEF 2005, 2005).
Hao Yuan
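The query-specific and source-specific merging idea in this abstract can be sketched as follows: for each source, comparable scores are known only for a few downloaded documents, a transformation model is fitted on those, and the model then maps every source-specific score onto the comparable scale. This is a rough illustration under stated assumptions. A simple least-squares line stands in for the transformation model, the names `fit_line` and `merge_ranked_lists` and all scores are invented, and how the comparable scores themselves are computed is not shown.

```python
from typing import Dict, List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # degenerate sample: fall back to a constant
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return a, mean_y - a * mean_x


def merge_ranked_lists(
    source_scores: Dict[str, Dict[str, float]],
    comparable_scores: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """source_scores[source][doc]: source-specific score of every retrieved document.
    comparable_scores[source][doc]: comparable score of the few downloaded documents."""
    merged: List[Tuple[str, float]] = []
    for source, scores in source_scores.items():
        sample = comparable_scores[source]
        # One model per source for this query (query-specific, source-specific).
        a, b = fit_line([scores[d] for d in sample], [sample[d] for d in sample])
        merged.extend((doc, a * s + b) for doc, s in scores.items())
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


# Toy data: two sources whose raw scores live on different scales.
source_scores = {
    "fr": {"fr1": 10.0, "fr2": 8.0, "fr3": 6.0},
    "de": {"de1": 0.9, "de2": 0.5, "de3": 0.1},
}
comparable_scores = {
    "fr": {"fr1": 0.8, "fr3": 0.4},
    "de": {"de1": 0.9, "de3": 0.1},
}

print([doc for doc, _ in merge_ranked_lists(source_scores, comparable_scores)])
# → ['de1', 'fr1', 'fr2', 'de2', 'fr3', 'de3']
```

Only two documents per source need comparable scores in this toy run, which mirrors the efficiency argument: the cost of downloading and translating is paid for a small sample rather than for every retrieved document.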
10.
M. Á. García-Cumbreras F. Martínez-Santiago L. A. Ureña-López 《Information Retrieval》2012,15(5):413-432
Given a user question, the goal of a Question Answering (QA) system is to retrieve answers rather than full documents or even best-matching passages, as most Information Retrieval systems currently do. In this paper, we present BRUJA, a QA system for the management of multilingual collections. BRUJA works with three languages (English, Spanish and French). The BRUJA architecture is not formed with three monolingual QA systems but instead uses English as an interlingua to perform the usual QA tasks such as question classification and answer extraction. In addition, BRUJA uses Cross Language Information Retrieval (CLIR) techniques to retrieve relevant documents from a multilingual collection. On the one hand, we have more documents to find answers from but on the other hand, we are introducing noise into the system because of translations to the interlingua (English) and the CLIR module. The question is whether the difficulty of managing three languages is worth it or whether a monolingual QA system delivers better results. We report on in-depth experimentation and demonstrate that our multilingual QA system gets better results than its monolingual counterpart whenever it uses good translation resources and, especially, CLIR techniques that are state-of-the-art.
11.
12.
13.
14.
We present a system for multilingual information retrieval that allows users to formulate queries in their preferred language and retrieve relevant information from a collection containing documents in multiple languages. The system is based on a process of document level alignments, where documents of different languages are paired according to their similarity. The resulting mapping allows us to produce a multilingual comparable corpus. Such a corpus has multiple interesting applications. It allows us to build a data structure for query translation in cross-language information retrieval (CLIR). Moreover, we also perform pseudo relevance feedback on the alignments to improve our retrieval results. And finally, multiple retrieval runs can be merged into one unified result list. The resulting system is inexpensive, adaptable to domain-specific collections and new languages and has performed very well at the TREC-7 conference CLIR system comparison.
15.
Fernando Martínez-Santiago L. Alfonso Ureña-López Maite Martín-Valdivia 《Information Retrieval》2006,9(1):71-93
A usual strategy to implement CLIR (Cross-Language Information Retrieval) systems is the so-called query translation approach.
The user query is translated for each language present in the multilingual collection in order to compute an independent monolingual
information retrieval process per language. Thus, this approach divides documents according to language. In this way, we obtain
as many different collections as languages. After searching in these corpora and obtaining a result list per language, we
must merge them in order to provide a single list of retrieved articles.
In this paper, we propose an approach to obtain a single list of relevant documents for CLIR systems driven by query translation.
This approach, which we call 2-step RSV (RSV: Retrieval Status Value), is based on the re-indexing of the retrieved documents
according to the query vocabulary, and it performs noticeably better than traditional methods.
The proposed method requires query vocabulary alignment: given a word for a given query, we must know the translation or translations
to the other languages. Because this is not always possible, we have investigated a mixed model. This mixed model is applied
in order to deal with queries with partial word-level alignment. The results prove that even in this scenario, 2-step RSV
performs better than traditional merging methods.
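The re-indexing step behind 2-step RSV can be sketched roughly as below: each query word and its aligned translations are collapsed into one language-independent concept, the documents returned by the per-language searches are re-indexed over those concepts only, and a second ranking is computed over the pooled set. This is a minimal sketch, assuming full word-level alignment; the function name `two_step_rank`, the tf-idf style weighting, and the toy documents are assumptions, since the abstract does not give the actual weighting scheme.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def two_step_rank(
    concepts: Dict[str, Set[str]],
    retrieved: Dict[str, List[str]],
) -> List[Tuple[str, float]]:
    """concepts: query term -> that term plus its aligned translations.
    retrieved: doc id -> tokens, pooled from the per-language first-step result lists."""
    # Map every surface word, in any language, to its concept.
    word_to_concept = {w: c for c, words in concepts.items() for w in words}
    # Re-index: represent each retrieved document by query concepts only.
    index = {
        doc: Counter(word_to_concept[t] for t in tokens if t in word_to_concept)
        for doc, tokens in retrieved.items()
    }
    n_docs = len(index)
    # Document frequency of each concept over the pooled retrieved set.
    df = Counter(c for counts in index.values() for c in counts)
    scored = [
        (doc, sum(tf * math.log(1 + n_docs / df[c]) for c, tf in counts.items()))
        for doc, counts in index.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data, invented for illustration.
concepts = {
    "oil": {"oil", "petróleo", "pétrole"},
    "price": {"price", "precio", "prix"},
}
retrieved = {
    "en1": ["oil", "price", "rise"],
    "es1": ["precio", "del", "petróleo", "petróleo"],
    "fr1": ["prix", "du", "pain"],
}

print([doc for doc, _ in two_step_rank(concepts, retrieved)])
# → ['es1', 'en1', 'fr1']
```

Because document frequencies are computed over one pooled index, the second-step scores are directly comparable across languages, which is what removes the need to reconcile per-language score scales.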
16.
《Library Acquisitions: Practice & Theory》1991,15(2):155-163
The multiple meanings of Judaica and bibliographic control are defined. A broad view of Judaica is taken, that is, intersection with Hebraica as opposed to exclusion thereof. Bibliographic control is presented as a collection development function, while bibliographic organization is the cataloger's domain. Four problem areas in defining the scope of Judaica are identified: overlap with Hebraica and Israeli publications, the numerous cognate disciplines of Jewish studies, Judaism as a religion vs Jews as an ethnic group, and “who is a Jew?” Each issue is discussed from both the bibliographer's and the subject cataloger's perspective. The automatic identification of works relevant to a Judaica collection is dependent upon appropriate subject analysis or classification. Judaica bibliographers should therefore monitor developments in Judaica cataloging and advocate changes that would simplify control of the multidisciplinary, multilingual literature of Jewish studies.
17.
This article reports on the results of an exploratory user-centered study that examined how technological advancements in natural language processing (NLP) such as the availability of multilingual information access (MLIA) tools impact the information searching behavior of bi/multilingual academic users. Thirty-one bi/multilingual students participated in a controlled lab-based user experiment in which they carried out two assigned tasks each on Google and WorldCat for a total of four tasks, and then completed a post experiment questionnaire. The captures from the experiment showed 86.7% of the participants using multilingual information access tools. Further analyses of the captures also showed that participants were more likely to use MLIA tools when the instructions for the task were stated in their native language. An independent samples t-test revealed that participants spent less time on their searches when they used MLIA tools. The study revealed considerable diversity in the information searching behavior of the participants, even within the same pair of languages, and even for the same user. Diversity was noted for instance, on which tasks MLIA tools were used and in how these tools were used. User-centered designed, personalized multilingual information retrieval (PMLIR) models could hold promise for best representing the information searching behavior of bi/multilingual users.
18.
19.
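A Preliminary Study on Visualization for Multilingual Information Retrieval Systems   Total citations: 1, self-citations: 0, citations by others: 1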
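Research on multilingual retrieval is very important now that the variety of information keeps growing. Besides research on retrieval techniques and translation functions, the use of information visualization and interface design is another research focus. According to previous studies and literature reviews, information visualization has proven to be an effective way to help users carry out multilingual information retrieval. This study proposes a visualization model for multilingual information retrieval systems together with its design scheme, and points out future directions for the field.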