Found 20 similar documents (search time: 380 ms)
1.
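Although the traditional vector space model is regarded as one of the most imaginative and creative retrieval models, it has shortcomings: for example, it does not take document structure or document type into account. This paper analyzes these issues, proposes corresponding improvements, and constructs an improved vector space model.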
2.
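Based on a study of the digitization, standardization, full-text retrieval, and security strategies for government documents in the new e-government environment, this paper proposes a model for digitizing paper government documents and building a full-text database. Building on that model, it proposes a solution for digitizing government documents and constructing the full-text database.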
3.
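By analyzing the non-standardized management of teachers' personal documents that is currently common across universities, this paper argues that unscientific management of electronic documents harms teaching and academic administration. It identifies the role of standardized document management and offers practical suggestions.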
4.
5.
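In information gathering, the representation of the Web document model is one of the important factors affecting collection precision. This paper uses the LIRA system to represent users' information needs as targets, analyzes the structure of the Web document model, proposes a model-based control method for predictively collecting target information in a specific domain, and gives optimization indicators for the model through user self-learning experiments.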
6.
7.
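Traditional information retrieval methods overlook the influence of document structure on the retrieval process. This paper proposes an improved retrieval method based on document structure. The method first filters the document collection using the first class of feature fields, then performs matching and ranking using the second class of feature fields. It introduces the AHP method to dynamically determine the importance weight of each feature field, and finally combines the field-level results into an overall similarity value by a vector inner product. Experimental results show that the method can improve retrieval precision and the reasonableness of result ranking.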
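The two-stage procedure in the abstract above (filter, then weighted matching) can be illustrated with a minimal sketch. The abstract does not say which fields belong to each class or how AHP is computed, so everything here is an assumption: the field names, the pairwise comparison matrix, the precomputed per-field similarity scores, and the use of the row geometric-mean approximation for AHP priorities are all hypothetical choices, not the paper's implementation.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] says how much more important field i is than field j.
    """
    means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(means)
    return [m / total for m in means]

def rank_documents(docs, required, weights):
    """Stage 1: filter on first-class fields. Stage 2: rank by weighted inner product."""
    # Stage 1: keep only documents whose filter fields match the request exactly.
    kept = [d for d in docs
            if all(d["filters"].get(k) == v for k, v in required.items())]
    # Stage 2: overall similarity = inner product of weights and per-field similarities.
    scored = [(sum(w * s for w, s in zip(weights, d["sims"])), d["id"]) for d in kept]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Hypothetical second-class fields: title, abstract, body (title judged most important).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical documents with precomputed per-field similarities to the query.
docs = [
    {"id": "a", "filters": {"type": "article"}, "sims": [0.9, 0.2, 0.1]},
    {"id": "b", "filters": {"type": "article"}, "sims": [0.3, 0.9, 0.8]},
    {"id": "c", "filters": {"type": "report"}, "sims": [1.0, 1.0, 1.0]},
]
ranking = rank_documents(docs, {"type": "article"}, weights)
```

Document `c` matches the query best on every field but is dropped in stage 1, which shows why the order of the two stages matters.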
8.
9.
10.
11.
《Information processing & management》2001,37(4):623-637
One of the most important problems in information retrieval is determining the order of documents in the answer returned to the user. Many methods and algorithms for document ordering have been proposed. The method introduced in this paper differs from them especially in that it uses a probabilistic model of a document set. In this model documents are regarded as states of a Markov chain, where transition probabilities are directly proportional to similarities between documents. Steady-state probabilities reflect similarities of particular documents to the whole answer set. If documents are ordered according to these probabilities, at the top of a list there will be documents that are the best representatives of the set, and at the bottom those which are the worst representatives. The method was tested against databases INSPEC and Networked Computer Science Technical Reference Library (NCSTRL). Test results are positive. Values of the Kendall rank correlation coefficient indicate high similarity between rankings generated by the proposed method and rankings produced by experts. Results are comparable with rankings generated by the vector model using standard weighting schema tf·idf.
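The Markov-chain ordering described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the similarity matrix is made up, each row is assumed to have a positive sum, and power iteration is just one convenient way to approximate the steady-state probabilities.

```python
def steady_state_ranking(sim, iterations=200, tol=1e-12):
    """Rank documents by the steady-state probabilities of a similarity-based Markov chain."""
    n = len(sim)
    # Transition probabilities are proportional to similarities: normalize each row.
    transition = []
    for row in sim:
        total = sum(row)  # assumed positive for every document
        transition.append([value / total for value in row])
    # Power iteration, starting from the uniform distribution.
    probs = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(probs[i] * transition[i][j] for i in range(n)) for j in range(n)]
        delta = max(abs(a - b) for a, b in zip(nxt, probs))
        probs = nxt
        if delta < tol:
            break
    # Highest steady-state probability first: the best representatives of the set.
    order = sorted(range(n), key=lambda j: probs[j], reverse=True)
    return order, probs

# Hypothetical symmetric document-document similarities for a four-document answer set.
similarities = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.5, 0.2],
    [0.7, 0.5, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
order, probs = steady_state_ranking(similarities)
```

In this toy input, document 3 is only weakly similar to the others, so it receives the lowest steady-state probability and lands at the bottom of the list.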
12.
Several papers have appeared that have analyzed recent developments in the problem of processing, in a document retrieval system, queries expressed as Boolean expressions. The purpose of this paper is to continue that analysis. We shall show that the concept of threshold values resolves the problems inherent with relevance weights. Moreover, we shall explore possible evaluation mechanisms for retrieval of documents, based on fuzzy-set-theoretic considerations.
13.
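[Purpose/Significance] This paper aims to provide a reference for libraries seeking to improve their service level. [Method/Process] Through comparative and bibliometric analysis of research on library WeChat and library Weibo topics, it examines the overall state of library WeChat research, publication trends, disciplinary distribution, hot topics, and existing problems. By aggregating surveys of WeChat services at libraries of all types nationwide, it analyzes service adoption, service content, operation and service evaluation, existing problems, and possible causes. [Result/Conclusion] WeChat is not only a new publicity tool and social medium for libraries but also a resource rich in original information. The paper proposes that libraries make use of WeChat from a resource perspective.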
14.
15.
This paper presents a probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they relax the traditional assumption of independent relevance of documents.
16.
17.
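This paper surveys and analyzes the state of ancient-book digitization at 49 Project 211 universities in China with relatively large ancient-book holdings, from three aspects: construction of databases of ancient books in their collections, the number of ancient-book electronic resources purchased, and construction of specialized ancient-book databases. It points out issues that deserve attention in digitizing ancient books, which is significant for university libraries seeking to learn from experience and deepen their ancient-book services.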
18.
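Using bibliometric methods, this paper selects literature on digital library research indexed in the Scopus database over the decade 2001-2010 and compares Chinese and foreign authors in terms of publication counts, subject terms, document types, publication venues, authors and research institutions, and citations, in order to clarify China's international influence in digital library research. The results show that the number of digital library publications by Chinese authors has grown rapidly in the last three years of that period, but conference papers account for too large a share, citation rates are low, and overall international influence is low. The causes are also discussed, to provide a reference for the future development of digital library science in China.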
19.
Abraham Bookstein 《Information processing & management》1977,13(6):377-383
Most automated information retrieval systems operate by relating a document to a request by means of a measure of pertinence, and then retrieving the most pertinent documents for their patrons. In this paper the consistency of this operating procedure with the well known Swets Model is examined. It is shown that accepting the assumptions made by Swets would result in the possible rejection of the most pertinent documents in favor of those that are less pertinent. This conclusion is a consequence of the normality assumptions of the model, while other distributions, such as the Poisson distribution, are consistent with the standard procedure. In the course of the development, the fundamentals of decision theory and signal detection theory are reviewed.