Similar literature
 Found 20 similar documents; search took 31 ms
1.
Identifying and representing the content of a document was, and still is, one of the main concerns of information retrieval systems. Representation of content is not dependent on the search strategy and other elements of information retrieval systems (IRS) but rather has some relationship with them. In the conventional IRS, each document in the file is characterized by one or more index terms which supposedly describe its content. Those terms are assigned from the natural language or from a pre-prepared list (Thesaurus). Over the years, other means of representing content were suggested. Also, attempts were made to combine several of them assuming independence. This paper discusses the attributes of the items in the data-base and their qualities. It seems that there is no single one which has all the desired qualities. If the attributes are neither totally independent nor highly correlated, then combining them in a certain way may increase effectiveness. The justification for this comes from the users' information seeking behavior: users use index terms, authors' names, citations, and other attributes in their searches. A model to accommodate the above hypothesis is formulated, and the small experiment performed indicates that the hypothesis may be true and that this way of combining might improve effectiveness.

2.
《Research Policy》2022,51(4):104484
Although citations are widely used to measure the influence of scientific works, research shows that many citations serve rhetorical functions and reflect little-to-no influence on the citing authors. If highly cited papers disproportionately attract rhetorical citations then their citation counts may reflect rhetorical usefulness more than influence. Alternatively, researchers may perceive highly cited papers to be of higher quality and invest more effort into reading them, leading to disproportionately substantive citations. We test these arguments using data on 17,154 randomly sampled citations collected via surveys from 9,380 corresponding authors in 15 fields. We find that most citations (54%) had little-to-no influence on the citing authors. However, citations to the most highly cited papers were 2–3 times more likely to denote substantial influence. Experimental and correlational data show a key mechanism: displaying low citation counts lowers perceptions of a paper's quality, and papers with poor perceived quality are read more superficially. The results suggest that higher citation counts lead to more meaningful engagement from readers and, consequently, the most highly cited papers influence the research frontier much more than their raw citation counts imply.

3.
Although more than a million academic papers have been posted on Facebook, there is little detailed research about which fields or cross-field issues are involved and whether there are field or public interest relationships between Facebook mentions and future citations. In response, we identified health and biomedical scientific papers mentioned on Facebook and assigned subjects to them using the MeSH and Science Metrix journal classification schema. Multistage adaptive LASSO and unpenalized least-squares regressions were used to model Facebook mentions by fields and MeSH terms. The fields Science and Technology, General and Internal Medicine, Complementary and Alternative Medicine, and Sport Sciences produced higher Facebook mention counts than average. However, no MeSH cross-field issue differences were found in the rate of attracting Facebook mentions. The relationship between Facebook mentions and citations varies between both fields and MeSH cross-field issues. General and Internal Medicine, Cardiovascular System and Hematology, and Developmental Biology have the strongest correlations between Facebook mentions and citations, probably due to high citation rates and high Facebook visibility in these areas.

4.
Prior art patent citations have become a popular measure of patent quality and knowledge flow between firms. Interpreting these measurements is complicated, in some cases, because prior art citations are added by patent examiners as well as by patent applicants. The U.S. Patent and Trademark Office (USPTO) adopted new reporting procedures in 2001, making it possible to measure examiner and applicant citations separately for the first time. We analyzed prior art citations listed in all U.S. patents granted in 2001-2003, and found that examiners played a significant role in identifying prior art, adding 63% of citations on the average patent, and all citations on 40% of patents granted. An analysis of variance found that firm-specific variables explain most of the variation in examiner-citation shares. Using multivariate regression, we found that foreign applicants to the USPTO had the highest proportion of citations added by examiners. High-volume patent applicants had a greater proportion of examiner citations, and a substantial number of firms won patents without listing a single applicant citation. In terms of technology, we found higher examiner shares among patents in electronics, communications, and computer-related fields. Taken together, our findings suggest that firm-level patenting practices, particularly among high-volume applicants, have a strong influence on citation data and merit additional research.

5.
David Tan 《Research Policy》2010,39(1):89-102
Patent applicants and examiners do not always have the same views about what constitutes a patent's relevant prior art. We propose that the processes of categorization and classification variably shape the interface between applicants and examiners by influencing assessments of similarity between new and existing technologies. Some inventions sit in technological domains that cut across the categorical boundaries implied by examiners’ patterns of specialization. Some sit in domains wherein the classification system that guides examiner searches is more volatile. In either of these circumstances, heightened ambiguity leads to more examiner-added citations on patents that are granted. We test and confirm our predictions in a sample of patents granted to semiconductor firms in 2005.

6.
Using a composite sample of over 3600 index terms drawn from 11 different machine-readable bibliographic data bases, estimates were made of the spelling error frequencies of each of these data bases, as well as the frequency of posting to misspelled terms. The terms studied included assigned index terms as well as some terms from titles and abstracts. The frequency of index term misspellings ranged from a high of almost 23% for one data base to a low of less than 12% for another data base. The frequency of posting to misspelled terms ranged from about one posting in 8000 citations for one data base, to about one posting in 160 citations in another data base. The impact of these error rates is discussed for the tape supplier, tape user and end user. Some suggestions are given regarding search strategy.

7.
Technology transfer, research and development and engineering projects frequently require in-depth literature reviews. These reviews are carried out using computerized, bibliographic data bases. The review and/or searching process involves keywords selected from data base thesauri. The search strategy is formulated to provide both breadth and depth of coverage and yields both relevant and nonrelevant citations. Experience indicates that about 10–20% of the citations are relevant. As a consequence, significant amounts of time are required to eliminate the nonrelevant citations. This paper describes statistically based, lexical association methods which can be employed to determine citation relevance. In particular, the searcher selects relevant terms from citation-derived indexes and this information along with lexical statistics is used to determine citation relevance. Preliminary results are encouraging with the techniques providing an effective concentration of relevant citations.

8.
We exploit a unique database on research and invention disclosure of faculty at 11 major US universities over a period of 17 years to explore the extent to which faculty involvement in license activity has affected their research profiles. We relate faculty disclosures to their industry and government-sponsored research, publications, and citations. Recent disclosure by faculty has a positive effect on industry and government funding, but, if they disclose multiple times, the effect on government funding can be negative. Recent and repeated disclosures increase the faculty member's publication count as well as the importance of these publications in terms of citations. We also examine life-cycle effects and find that the ability to attract funding and the rate of publication increase as the faculty member ages but at a decreasing rate. We also find that post-tenure, both types of funding decrease.

9.
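Disciplinary differences in China's internationally co-authored scientific papers   Total citations: 1, self-citations: 0, citations by others: 1
 This paper examines disciplinary differences in China's internationally co-authored papers indexed in SCIE over the ten years 1996-2005. Comparing the relative levels of paper counts and citation impact reveals clear differences among three broad subject areas: basic research, engineering and technology, and biomedicine. Basic research disciplines have relatively high numbers of both papers and citations; engineering and technology disciplines have many papers but relatively few citations; biomedicine has fewer papers but relatively more citations. The paper also introduces the h-index into the disciplinary comparison, treating each discipline as the unit of paper output. The same three-way difference appears: h-indices are generally high in basic research disciplines, generally low in engineering and technology, and intermediate in biomedicine.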
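The discipline-level h-index mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (the largest h such that h papers each have at least h citations) applied to all papers of one discipline; the function name and the sample counts are hypothetical and are not taken from the paper.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only get smaller from here on
    return h


# Made-up citation counts for the papers of two disciplines (illustration only).
basic_research = [10, 8, 5, 4, 3]
engineering = [25, 8, 5, 3, 3]
print(h_index(basic_research), h_index(engineering))
```

Treating a discipline as the "author" simply means pooling the citation counts of all its papers before calling the function.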

10.
Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or may be used to place a document on a theory vs practice spectrum. This distinction is also studied here using the type-token ratio to differentiate between sublanguages. The statistical significance of windows is computed, based on the presence of terms in titles, abstracts, citations, and section headers, as well as binary-independent and inverse-document-frequency weightings. The characteristics of windows are studied by examining their within-window density and the S concentration, the concentration of terms from various document fields (e.g. title, abstract) in the fulltext. The rate of window occurrences from the beginning to the end of document fulltext differs between academic fields. Different syntactic structures in sublanguages are examined, and their use is considered for discriminating between specific academic disciplines and, more generally, between theory vs practice or knowledge vs applications-oriented documents.
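The type-token ratio used in this abstract to separate sublanguages can be sketched as follows. This is only an illustration under stated assumptions: tokens are taken to be lowercase runs of ASCII letters, which is a simplification chosen here and not the tokenization used in the study, and the function name is hypothetical.

```python
import re


def type_token_ratio(text):
    """Number of distinct word forms (types) divided by number of words (tokens)."""
    tokens = re.findall(r"[a-z]+", text.lower())  # assumption: letters-only tokens
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# "the" occurs twice, so there are 5 types among 6 tokens.
print(type_token_ratio("The cat sat on the mat"))
```

Because the ratio falls as texts get longer, comparisons between sublanguages are usually made on samples of equal length.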

11.
Citation rates are becoming increasingly important in judging the research quality of journals, institutions and departments, and individual faculty. This paper looks at the pattern of citations across different management science journals and over time. A stochastic model is proposed which views the generating mechanism of citations as a gamma mixture of Poisson processes generating overall a negative binomial distribution. This is tested empirically with a large sample of papers published in 1990 from six management science journals and found to fit well. The model is extended to include obsolescence, i.e., that the citation rate for a paper varies over its cited lifetime. This leads to the additional citations distribution which shows that future citations are a linear function of past citations with a time-dependent and decreasing slope. This is also verified empirically in a way that allows different obsolescence functions to be fitted to the data. Conclusions concerning the predictability of future citations, and future research in this area are discussed.
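The claim that a gamma mixture of Poisson processes yields a negative binomial distribution can be checked numerically. The sketch below assumes a constant citation rate per paper (no obsolescence, unlike the paper's extended model) and a Gamma(shape, rate) distribution of rates across papers; the parameter values and function names are hypothetical. The last function shows the standard conjugate result that the expected rate is linear in past citations with a slope that shrinks as time elapses.

```python
import math


def neg_binomial_pmf(k, shape, rate):
    """P(X = k) when X | lam ~ Poisson(lam) and lam ~ Gamma(shape, rate)."""
    log_p = (math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
             + shape * math.log(rate / (rate + 1.0))
             - k * math.log(rate + 1.0))
    return math.exp(log_p)


def mixture_pmf_numeric(k, shape, rate, upper=60.0, steps=60000):
    """Same probability, by trapezoidal integration over the Poisson rate lam."""
    h = upper / steps
    total = 0.0
    # Endpoints are skipped: the integrand is ~0 there when shape > 1.
    for i in range(1, steps):
        lam = i * h
        log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
        log_gamma = (shape * math.log(rate) + (shape - 1.0) * math.log(lam)
                     - rate * lam - math.lgamma(shape))
        total += math.exp(log_poisson + log_gamma)
    return total * h


def expected_future_rate(past_citations, elapsed, shape, rate):
    """Posterior mean of lam after `past_citations` in `elapsed` time units."""
    return (shape + past_citations) / (rate + elapsed)


SHAPE, RATE = 2.0, 0.5  # hypothetical parameters, mean rate = SHAPE / RATE = 4
for k in (0, 1, 3):
    print(k, neg_binomial_pmf(k, SHAPE, RATE), mixture_pmf_numeric(k, SHAPE, RATE))
```

The closed form and the numerical integral should agree to several decimal places, which is the sense in which the mixture "generates overall" a negative binomial.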

12.
Authors and searchers usually express the same things in many different ways, which causes problems in free text searching of text databases. Thus, a switching tool connecting the different names of one concept is needed. This study tests the effectiveness of a thesaurus as a search-aid in free text searching of a full text database. A set of queries was searched against a large full text database of newspaper articles. The search-aid thesaurus constructed for the test contains the usual relationships of a thesaurus, namely equivalence, hierarchical, and associative relationships. Each query was searched in five distinct modes: basic search, synonym search, narrower term search, related term search, and union of all previous searches. The basic searches contained only terms included in the original query statements. In the synonym searches, the terms of the basic search were extended by disjunction of the synonyms given by the search-aid thesaurus without modifying the overall logic of the basic search. Likewise, the basic search was extended in turn with the narrower terms and with the related terms given by the search-aid thesaurus. The last search mode included the basic terms and all the terms used in the previous searches. The searches were analyzed in terms of relative recall and precision; relative recall was estimated by setting the recall of the union search to 100%. On average, relative recall was 47.2% in the basic search, compared with 100% in the union search; average precision decreased only from 62.5% in the basic search to 51.2% in the union search.
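The search modes and the two measures described in this abstract can be sketched on toy data. Everything below is an assumption made for illustration: documents are modeled as sets of words, a query as a conjunction of terms, and expansion replaces each term by an OR-group of the term plus its thesaurus entries, leaving the overall AND logic unchanged. The function names and data are hypothetical and do not come from the study.

```python
def expand(query_terms, thesaurus):
    """Turn each query term into an OR-group: the term plus its thesaurus entries."""
    return [{term} | set(thesaurus.get(term, ())) for term in query_terms]


def search(or_groups, documents):
    """Ids of documents containing at least one term from every OR-group."""
    return {doc_id for doc_id, words in documents.items()
            if all(group & words for group in or_groups)}


def precision(retrieved, relevant):
    """Share of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def relative_recall(retrieved, union_retrieved, relevant):
    """Recall measured against the relevant documents found by the union search."""
    pool = union_retrieved & relevant
    return len(retrieved & pool) / len(pool) if pool else 0.0


# Toy collection (made up): document id -> set of words.
documents = {
    1: {"car", "price"},
    2: {"automobile", "price"},
    3: {"car", "cost"},
    4: {"automobile", "price", "museum"},
}
relevant = {1, 2}
query = ["car", "price"]
synonyms = {"car": ["automobile"]}

basic = search(expand(query, {}), documents)
synonym = search(expand(query, synonyms), documents)
union = basic | synonym
print(relative_recall(basic, union, relevant), precision(basic, relevant))
print(relative_recall(synonym, union, relevant), precision(synonym, relevant))
```

Setting the union search's recall to 100% means relevant documents that no mode retrieved are invisible to the measure, so relative recall is an upper bound on true recall.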

13.
Numerous metrics have been developed to identify revolutionary science which is crucial for advancing science. However, these metrics have rarely successfully identified revolutionary discoveries. We propose a two-dimension metric to quantify revolutionary discoveries by combining the consolidation-or-destabilization (CD) index with the citation count. To verify the validity of the metric, we utilize multivariate linear regression to investigate the differences in the CD indices and citations between 164 Nobel prize-winning papers from 1976 to 2016 (i.e., revolutionary science) and 9,034 counterparts that are similar to the Nobel prize-winning papers in terms of bibliographic information. We find that our proposed metric successfully shows a significant and distinct difference between the Nobel prize-winning papers and their counterparts in that the former receive around 880 more citations and 0.07 higher CD indices than the latter. The reliability of our proposed measure is robust.
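The abstract does not define the CD index, so the sketch below rests on an assumption: it uses the commonly cited formulation in which each later paper scores +1 if it cites the focal paper but none of its references, -1 if it cites both, and 0 if it cites only the references, averaged over all such later papers. The function name and the sample identifiers are hypothetical, and the time window the original index applies is omitted.

```python
def cd_index(citers_of_focal, citers_of_predecessors):
    """Mean of (-2*f*b + f) over papers citing the focal paper or its references.

    f = 1 if the later paper cites the focal paper, b = 1 if it cites at least
    one of the focal paper's references (its predecessors).
    """
    all_citers = citers_of_focal | citers_of_predecessors
    if not all_citers:
        return None  # undefined when nothing cites the focal paper or its references
    score = 0
    for paper in all_citers:
        f = 1 if paper in citers_of_focal else 0
        b = 1 if paper in citers_of_predecessors else 0
        score += -2 * f * b + f
    return score / len(all_citers)


# Made-up example: a and b cite only the focal paper, c cites both, d only the references.
print(cd_index({"a", "b", "c"}, {"c", "d"}))
```

A two-dimensional metric in the abstract's sense would then pair this value with `len(citers_of_focal)`, the plain citation count.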

14.
15.
王超 《情报探索》2020,(6):33-39
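[Purpose/Significance] Exploring the relationship between the citation counts and download counts of papers is important for evaluating paper impact. [Method/Process] Using the CNKI database, and based on more than 55,000 papers published in 2006 in journals listed in 《中文核心期刊要目总览》 in science, engineering, agriculture and medicine as well as economics, history, law and philosophy, this study analyzes the citation distribution of papers in each category, compares the download counts of papers with equal citation counts and the average citation counts of papers with similar download counts, and uses Spearman's method to compute rank correlation coefficients between downloads and citations for each category. [Results/Conclusions] The citation distributions of the different categories follow the same trend: as the citation count rises, the share of papers falls quickly toward zero, and the distribution can be approximated by an exponential decay function. In absolute terms, download and citation counts differ considerably and show no obvious correlation, which is related to the inherent characteristics of downloading and citing and to the literature-use patterns of each category. Spearman rank correlation analysis shows that, at the level of individual papers, downloads and citations are strongly rank-correlated: a paper's download rank over a given period can be used to predict its citation rank in a statistical sense, which can serve as a reference for paper evaluation.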
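The Spearman rank correlation between downloads and citations used in this study can be sketched with the standard library alone. The version below assumes the usual treatment of ties (tied values share their average rank) and computes the Pearson correlation of the two rank vectors; the function names and the sample counts are hypothetical and are not data from the paper.

```python
import math


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend the run of tied values
        avg = (i + j) / 2.0 + 1.0       # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up download and citation counts for five papers (illustration only).
downloads = [120, 340, 90, 560, 200]
citations = [3, 9, 1, 8, 4]
print(spearman(downloads, citations))
```

Because only ranks enter the calculation, large differences in absolute download and citation counts do not prevent a strong rank correlation, which is consistent with the contrast the abstract draws.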

16.
杨思洛  邢欣  郑梦雪 《现代情报》2019,39(7):143-152
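[Purpose/Significance] To compare the current impact of books from G20 countries and provide a reference for raising the international impact of Chinese books. [Method/Process] Taking books in five disciplines (chemistry, engineering, medicine, law and history) indexed by Springer as the sample and Bookmetrix as the data source, and after discussing indicator coverage and correlation, the study compares three aspects: book output, five types of indicators (downloads, citations, mentions, readers and reviews), and co-authorship. [Results/Conclusions] The study finds: 1) For indicator coverage and correlation, the overall picture resembles that at the country level for China, the United States, Germany and others: downloads have the highest coverage, and the correlation between readers and citations is relatively high and stable. 2) China's output of international books lags well behind that of Germany, the United States and the United Kingdom, but China performs well on indicator means and overall ranking. 3) In the G20 book co-authorship network, international collaboration in humanities and social science books is markedly weaker than in science and engineering disciplines; the United States and Germany have strong research collaboration capacity and influence; China's international collaboration has a certain foundation, but its overall influence is not prominent.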

17.
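To address the limitations of the relative impact and percentile indicators, a modified percentile indicator is proposed. The relative impact and percentile indicators are compared in terms of citation distribution characteristics, differences in citation counts within the same percentile zone, and the association between citation counts and time since publication. On this basis, the two indicators are merged and a timeliness parameter is introduced to construct the modified percentile indicator, which is applied in a case study of core library and information science journals and university libraries. The results show that the modified percentile indicator places papers in the same percentile interval but with different publication ages and citation counts on a common scale for comparison, and that its results reflect the quantitative characteristics and inherent regularities of citations more objectively.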

18.
鞠秀芳 《现代情报》2018,38(11):14-17
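Authenticity, accuracy, directness and completeness are the primary principles for citing references. However, various studies show that improper citation practices, such as citing without attribution, excessive citation, vague attribution and even false citation, are increasingly common in current research publications. This seriously harms the academic climate of scientific research and causes considerable inconvenience for readers, journal reviewers and the assessment of research output. This paper uses a text similarity algorithm to build a method for identifying the validity of journal citations, attempting to determine from massive journal citation data whether citations are genuine and valid. Experiments show that the method achieves good results in identifying citation validity and can provide a reliable basis for detecting false citations, thereby helping editors find and correct them and upholding the rigor and pragmatism of scientific research.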

19.
We propose an empirical strategy to estimate competition in innovation markets. Our method relates firms’ market return on equity to information about patent citation patterns. Two innovations are implemented in the methodology. First is the application of daily abnormal stock returns rather than annual measures of Tobin's q. Second is the creation of citation patterns related to the area of science a firm patents in as represented by the detailed patent classification system. We find that markets positively reward firms when patents are granted. We further find that a firm's market value increases when its patent portfolio is cited. We find evidence of competition in innovation markets. The market reacts at the time that the citation occurs and does not anticipate future citations at the time of patenting. Holding this effect constant, we find that citations from patents in the same area of science tend to reduce market value. We interpret these findings as consistent with more citations indicating more valuable intellectual property but citations from competing technologies decreasing it.

20.
An experimental computer intermediary system, CONIT, that assists users in accessing and searching heterogeneous retrieval systems has been enhanced with various search aids. Controlled experiments have been conducted to compare the effectiveness of the enhanced CONIT intermediary with that of human expert intermediary search specialists. Some 16 end users, none of whom had previously operated either CONIT or any of the four connected retrieval systems, performed searches on 20 different topics using CONIT with no assistance other than that provided by CONIT itself (except to recover from computer/software bugs). These same users also performed searches on the same topics with the help of human expert intermediaries who searched using the retrieval systems directly. Sometimes CONIT and sometimes the human expert were clearly superior in terms of such parameters as recall and search time. In general, however, users searching alone with CONIT achieved somewhat higher online recall at the expense of longer session times. We conclude that advanced experimental intermediary techniques are now capable of providing search assistance whose effectiveness at least approximates that of human intermediaries in some contexts. Also analyzed is the cost effectiveness of current intermediary systems. Finally, consideration is given to the prospects for much more advanced systems which would perform such functions as automatic data-base selection and the simulation of human experts, and thereby make information retrieval more effective for all classes of users.
