Similar Articles
Found 20 similar articles (search time: 31 ms)
1.
We evaluate author impact indicators and ranking algorithms on two publication databases using large test data sets of well-established researchers. The test data consist of (1) ACM fellowships and (2) various lifetime achievement awards. We also evaluate different approaches to dividing the credit for a paper among its co-authors and analyse the impact of self-citations. Furthermore, we evaluate different graph normalisation approaches for the case when PageRank is computed on author citation graphs. We find that PageRank outperforms citation counts in identifying well-established researchers. This holds true when PageRank is computed on author citation graphs, but also when PageRank is computed on paper graphs and paper scores are divided among co-authors. In general, the best results are obtained when co-authors receive an equal share of a paper's score, independent of which impact indicator is used to compute paper scores. The results also show that removing author self-citations improves the results of most ranking metrics. Lastly, we find that it is more important to personalise the PageRank algorithm appropriately on the paper level than to decide whether to include or exclude self-citations. On the author level, however, we find that author graph normalisation is more important than personalisation.
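As a rough illustration of the kind of computation involved (the author graph, damping factor, and iteration count are invented for the sketch, and none of the paper's normalisation or personalisation variants are reproduced), PageRank on an author citation graph can be written as a short power iteration:

```python
# Minimal power-iteration PageRank on a toy author citation graph.
# Authors and edges are illustrative, not data from the study.

def pagerank(edges, nodes, damping=0.85, iters=100):
    """edges: list of (citing, cited) pairs; returns {node: score}."""
    out_deg = {n: 0 for n in nodes}
    for src, _ in edges:
        out_deg[src] += 1
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, dst in edges:
            new[dst] += damping * rank[src] / out_deg[src]
        # dangling authors (no outgoing citations) spread rank uniformly
        dangling = sum(rank[n] for n in nodes if out_deg[n] == 0)
        for n in nodes:
            new[n] += damping * dangling / len(nodes)
        rank = new
    return rank

authors = ["A", "B", "C"]
citations = [("A", "B"), ("C", "B"), ("B", "C")]
scores = pagerank(citations, authors)
# B is cited by both A and C, so B should come out on top
best = max(scores, key=scores.get)
```

The paper-graph variant works the same way, except that each paper's final score is then split among its co-authors (equally, in the best-performing configuration above).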

2.
We evaluate article-level metrics along two dimensions. Firstly, we analyse metrics' ranking bias in terms of fields and time. Secondly, we evaluate their performance based on test data consisting of (1) papers that have won high-impact awards and (2) papers that have won prizes for outstanding quality. We consider different citation impact indicators and indirect ranking algorithms in combination with various normalisation approaches (mean-based, percentile-based, co-citation-based, and post hoc rescaling). We execute all experiments on two publication databases that use different field categorisation schemes (author-chosen concept categories and categories based on papers' semantic information). In terms of bias, we find that citation counts are consistently less time biased but more field biased than PageRank. Furthermore, rescaling paper scores by a constant number of similarly aged papers reduces time bias more effectively than normalising by calendar years. We also find that percentile citation scores are less field and time biased than mean-normalised citation counts. In terms of performance, we find that time-normalised metrics identify high-impact papers better shortly after their publication than their non-normalised variants do. After 7 to 10 years, however, the non-normalised metrics perform better. A similar trend exists for the set of high-quality papers, where the performance cross-over points occur after 5 to 10 years. Lastly, we find that personalising PageRank with papers' citation counts reduces time bias but increases field bias. Similarly, using papers' associated journal impact factors to personalise PageRank increases its field bias. In terms of performance, PageRank should always be personalised with papers' citation counts and time-rescaled for citation windows smaller than 7 to 10 years.
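The two time-normalisation variants compared above differ only in the reference set used as the denominator: the papers published in the same calendar year, or a fixed number of papers closest in age. A minimal sketch of both on invented data (the counts and cohort size `n` are assumptions, not the paper's setup):

```python
from statistics import mean

# toy (publication_year, citation_count) records; values are illustrative
papers = [(2015, 40), (2015, 10), (2016, 30), (2016, 6), (2017, 12), (2017, 2)]

def year_normalised(papers):
    """Divide each count by the mean count of its calendar-year cohort."""
    by_year = {}
    for year, c in papers:
        by_year.setdefault(year, []).append(c)
    return [c / mean(by_year[year]) for year, c in papers]

def window_normalised(papers, n=4):
    """Divide each count by the mean of the n papers closest in age."""
    result = []
    for year, c in papers:
        peers = sorted(papers, key=lambda p: abs(p[0] - year))[:n]
        result.append(c / mean(p[1] for p in peers))
    return result
```

The second variant keeps the denominator's sample size constant across papers, which is the property the abstract credits with reducing time bias more effectively.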

3.
The process of assessing individual authors should rely upon a proper aggregation of reliable and valid paper quality metrics. Citations are merely one possible way to measure the appreciation of publications. In this study we propose some new, SJR- and SNIP-based indicators, which take into account not only the broadly conceived popularity of a paper (manifested by its number of citations) but also other factors, such as its potential or the quality of the papers that cite it. We explore the relation and correlation between different metrics and study how they affect the values of a real-valued generalised h-index calculated for 11 prominent scientometricians. We note that the h-index is a very unstable impact function, highly sensitive to scaling of its input elements. Our analysis is not only of theoretical significance: data scaling is often performed to normalise citations across disciplines, and uncontrolled application of this operation may lead to unfair decisions biased toward some groups. This puts into question the validity of assessing and ranking authors using the h-index. Obviously, a good impact function for use in practice should not be as sensitive to changes in the input data as the one analysed here.
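The scaling sensitivity noted above is easy to reproduce even with the standard integer h-index (the paper studies a real-valued generalisation; the citation profile below is invented for illustration):

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

profile = [10, 8, 5, 4, 3, 2]
doubled = [2 * c for c in profile]
# h_index(profile) is 4, yet doubling every count only raises it to 5:
# the index reacts non-uniformly to a uniform rescaling of its input.
```

Because the response to rescaling depends on the shape of each author's citation profile, two authors ranked one way before normalisation can swap places after it, which is exactly the instability the study warns about.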

4.
Metrics based on percentile ranks (PRs) for measuring scholarly impact involve complex treatment because of various defects, such as overvaluing or undervaluing an object owing to the percentile ranking scheme, ignoring precise citation variation among objects ranked next to each other, and inconsistency caused by additional papers or citations. These defects are especially pronounced in small data sets. To avoid the complicated treatment that PR-based metrics require, we propose two new indicators: the citation-based indicator (CBI) and the combined impact indicator (CII). Document types of publications are taken into account. With the two indicators, one is no longer bothered by the complex issues encountered by PR-based indicators, and for a small data set of fewer than 100 papers no special calculation is needed. The CBI is based solely on citation counts, while the CII measures the integrated contributions of publications and citations. Both synthetic and empirical data are used to compare the related indicators. The CII and the PR-based indicator I3 are highly correlated, but the former reflects citation impact more, while the latter relates more to publications.
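The "inconsistency caused by additional papers" that motivates the abstract above is easy to see in a toy percentile computation (the ranking scheme and data here are illustrative assumptions, not the paper's I3 definition):

```python
def percentile_ranks(citations):
    """Percentile of each paper: share of papers with strictly fewer citations."""
    n = len(citations)
    return [100.0 * sum(1 for other in citations if other < c) / n
            for c in citations]

small = percentile_ranks([9, 5, 3])        # ranks: [66.7, 33.3, 0.0]
grown = percentile_ranks([9, 5, 3, 1])     # ranks: [75.0, 50.0, 25.0, 0.0]
# The 5-citation paper jumps from the 33rd to the 50th percentile merely
# because one weaker paper was added -- the small-dataset defect noted above.
```

An indicator computed directly from citation counts, as the CBI is, does not shift when unrelated papers enter or leave the data set.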

5.
Research on Citation Networks Based on Web Structure Mining Algorithms
Building on a comparative analysis of two classic web structure mining algorithms (HITS and PageRank), this paper applies PageRank to a large-scale citation network. For a citation network of 236,517 SCI articles, we compute a PageRank value for each publication and analyse in depth the relationship between a publication's PageRank value and the commonly used citation count indicator. The analysis shows that PageRank values correlate strongly with citation counts and exhibit a similar power-law distribution; however, among highly cited publications, the PageRank algorithm better discriminates latent importance and, to a large extent, weakens the influence of author self-citation on the objectivity of publication evaluation.

6.
In the past, recursive algorithms such as PageRank, originally conceived for the Web, have been successfully used to rank nodes in the citation networks of papers, authors, or journals. Unlike citation counts, they have proved to determine prestige rather than popularity. However, bibliographic networks, in contrast to the Web, have some specific features that enable the assignment of different weights to citations, thus adding more information to the process of finding prominence. For example, a citation between two authors may be weighted according to whether and when those two authors collaborated with each other, information that can be found in the co-authorship network. In this study, we define several PageRank modifications that weight citations between authors differently based on information from the co-authorship graph. In addition, we put emphasis on the time of publications and citations. We test our algorithms on Web of Science data for computer science journal articles and determine the most prominent computer scientists in the 10-year period 1996-2005. Besides a correlation analysis, we also compare our rankings to the lists of ACM A. M. Turing Award and ACM SIGMOD E. F. Codd Innovations Award winners and find that the new time-aware methods outperform standard PageRank and its time-unaware weighted variants.

7.
The objective assessment of the prestige of an academic institution is a difficult and hotly debated task. In the last few years, different types of university rankings have been proposed to quantify it, yet the debate on what exactly rankings measure endures. To address this issue, we measure a quantitative and reliable proxy of the academic reputation of a given institution and compare our findings with well-established impact indicators and academic rankings. Specifically, we study citation patterns among universities in five different Web of Science Subject Categories and apply the PageRank algorithm to the five resulting citation networks. The rationale behind our work is that scientific citations are driven by the reputation of the reference, so the PageRank algorithm is expected to yield a ranking that reflects the reputation of an academic institution in a specific field. Given the volume of data analysed, our findings are statistically sound and less prone to bias than, for instance, the ad-hoc surveys often employed by ranking bodies to attain similar outcomes. The approach proposed in our paper may contribute to enhancing ranking methodologies by reconciling the qualitative evaluation of academic prestige with its quantitative measurement via publication impact.

8.
Several studies have reported on metrics for measuring the influence of scientific topics from different perspectives; however, current ranking methods ignore the reinforcing effect of other academic entities on topic influence. In this paper, we develop an effective topic ranking model, 4ERRank, which models the influence transfer mechanism among all academic entities in a complex academic network using a four-layer network design that incorporates the strengthening effect of multiple entities on topic influence. The PageRank algorithm is used to calculate the initial influence of topics, papers, authors, and journals in a homogeneous network, whereas the HITS algorithm is used to express the mutual reinforcement between topics, papers, authors, and journals in a heterogeneous network, iteratively calculating the final topic influence values. Taking a specific interdisciplinary domain, social media, as a case study, we applied the 4ERRank model to the 19,527 topics that met the selection criteria. The experimental results demonstrate that the 4ERRank model successfully synthesises the behaviour of classic co-word metrics and effectively reflects highly cited topics. This study enriches the methodology for assessing topic impact and contributes to future topic-based retrieval and prediction tasks.

9.
Main path analysis is a popular method for extracting the backbone of scientific evolution from a (paper) citation network. Its first and core step, called search path counting, weights citation arcs by the number of scientific influence paths running from old to new papers. Search path counting shows high potential for scientific impact evaluation because of its semantic similarity to the meaning of a scientific impact indicator, i.e., how many papers are influenced and to what extent. In addition, the algorithmic idea of search path counting resembles many known indirect citation impact indicators. Inspired by these observations, this paper presents the FSPC (Forward Search Path Count) framework as an alternative scientific impact indicator based on indirect citations. Two critical assumptions are made to ensure the effectiveness of FSPC. First, knowledge decay is introduced to weight scientific influence paths in decreasing order of length. Second, path capping is introduced to mimic human literature search and citing behaviour. Through experiments on two well-studied datasets against two carefully created gold-standard sets of papers, we demonstrate that FSPC achieves surprisingly good performance not only in recognising high-impact papers but also in identifying undercited papers.
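The two assumptions, knowledge decay and path capping, can be sketched as a small recursion over a citation DAG (the graph, decay factor, and cap below are illustrative assumptions, not the paper's FSPC parameters):

```python
# Illustrative forward-path indicator with exponential knowledge decay
# and a path-length cap, in the spirit of the FSPC idea described above.

# cited_by[p] lists the papers that cite p (the old -> new direction)
cited_by = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def forward_path_score(paper, decay=0.5, cap=3):
    """Sum decay**length over all forward citation paths of length <= cap."""
    def walk(node, depth):
        if depth == cap:
            return 0.0           # path capping: stop extending long paths
        total = 0.0
        for citing in cited_by[node]:
            # each one-step extension contributes a decayed unit of impact
            total += decay ** (depth + 1) + walk(citing, depth + 1)
        return total
    return walk(paper, 0)
```

Here paper "A" scores 1.5 (two length-1 paths at weight 0.5 plus two length-2 paths at weight 0.25), so indirect citations raise its score above its raw in-degree of zero, while direct citation counting would treat "A" and "D" very differently.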

10.
Questionable publications have been accused of "greedy" practices; however, their influence on academia has not been gauged. Here, we probe the impact of questionable publications through a systematic and comprehensive analysis with various participants from academia and compare the results with those of their unaccused counterparts using billions of citation records, including liaisons, i.e., journals and publishers, and prosumers, i.e., authors. Questionable publications attribute publisher-level self-citations to their journals while limiting journal-level self-citations; yet conventional journal-level metrics are unable to detect these publisher-level self-citations. We propose a hybrid journal-publisher metric for detecting the self-favouring citations that questionable journals (QJs) receive from their publishers. Additionally, we demonstrate that questionable publications were less disruptive and less influential than their counterparts. Our findings indicate an inflated citation impact of suspicious academic publishers and provide a basis for actionable policy-making against questionable publications.

11.
A variety of bibliometric measures have been proposed to quantify the impact of researchers and their work. The h-index is a notable and widely used example that aims to improve on simple metrics such as raw counts of papers or citations. However, a limitation of this measure is that it considers authors in isolation and does not account for contributions made through a collaborative team. To address this, we propose a natural variant that we dub the Social h-index, which redistributes the h-index score to reflect an individual's impact on the research community. In addition to describing this new measure, we provide examples, discuss its properties, and contrast it with other measures.

12.
It is widely accepted that data are fundamental for research and should therefore be cited just as textual scientific publications are. However, issues such as data citation and the handling and counting of the credit generated by such citations remain open research questions. Data credit is a new measure of value built on top of data citation, which enables us to annotate data with a value representing its importance. Data credit can be considered a new tool that, together with traditional citations, helps to recognise the value of data and its creators in a world ever more dependent on data. In this paper we define data credit distribution (DCD) as a process by which the credit generated by citations is given to the individual elements of a database. We focus on a scenario where a paper cites data from a database obtained by issuing a query. The citation generates credit, which is then divided among the database entities responsible for generating the query output. One key aspect of our work is to credit not only the explicitly cited entities but also those that contribute to their existence yet are not accounted for in the query output. We propose a credit distribution strategy (CDS) based on data provenance and implement a system that uses the information provided by data citations to distribute the credit in a relational database accordingly. As a use case and for evaluation purposes, we adopt the IUPHAR/BPS Guide to Pharmacology (GtoPdb), a curated relational database. We show how credit can be used to highlight areas of the database that are frequently used. Moreover, we underline how credit rewards data and authors based on their research impact and not merely on their number of citations. This can lead to the design of new bibliometrics for data citations.

13.
Scholarly citations, widely seen as tangible measures of the impact and significance of academic papers, guide critical decisions by research administrators and policy makers. Citation distributions form characteristic patterns that can be revealed by big-data analysis. However, citation dynamics vary significantly among subject areas, countries, and other groupings. The problem is how to quantify those differences and separate global from local citation characteristics. Here, we carry out an extensive analysis of the power-law relationship between the total citation count and the h-index to detect a functional dependence among its parameters for different science domains. The results demonstrate that the statistical structure of the citation indicators admits representation by a global scale and a set of local exponents. The scale parameters are evaluated for different research actors, from individual researchers to entire countries, employing subject- and affiliation-based divisions of science into domains. The results can inform research assessment and classification into subject areas; the proposed divide-and-conquer approach can be applied to hidden scales in other power-law systems.

14.
Citation counts are increasingly used to assess the impact on the scientific community of publications produced by a researcher, an institution, or a country. Many institutions use bibliometric indicators to steer research policy and to inform hiring and promotion decisions. Given the importance that counting citations has today, the aim of the work presented here is to show how citations are distributed within a scientific area and to determine how the citation count depends on article features. All articles referenced in the Web of Science in 2004 for Biology & Biochemistry, Chemistry, Mathematics, and Physics were considered. We show that the distribution of citations is well represented by a double exponential-Poisson law. The mean citation rate depends on the number of co-authors, the number of addresses, and the number of references, although this dependence deviates somewhat from linear behaviour. The dependence of mean impact on the number of pages was found to be very weak. For Biology & Biochemistry and Chemistry we found a linear relationship between the mean citations per article and the impact factor; for Mathematics and Physics the results are close to linear.

15.
16.
Research on Methods for Identifying Sleeping Beauty and Prince Publications
[Purpose/Significance] To study methods for identifying sleeping beauty publications and their princes, analyse the awakening mechanism, and provide a theoretical basis for discovering "prince" authors in the scholarly communication system and for uncovering and awakening the latent value of lowly cited and uncited publications. [Method/Process] Two objective indicators, the citation rate indicator and the sleeping beauty index, were used to identify sleeping beauty publications that appeared in four leading clinical medicine journals between 1970 and 2005. Prince publications that awaken sleeping beauties were sought according to four principles: (1) published close to the year of the citation surge; (2) themselves highly cited; (3) highly co-cited with the sleeping beauty publication; (4) exerting a clearly visible "pulling" effect on the sleeping beauty's annual citation curve, i.e., at least around the year of the sleeping beauty's citation surge, the prince's annual citation count should exceed the sleeping beauty's. [Result/Conclusion] Because it considers the citation curve over the full citation window, the citation rate indicator can identify papers with long citation life cycles that are still frequently cited today; the sleeping beauty index can quickly identify sleeping beauty publications but cannot reflect the citation curve after annual citations peak; combining the citation rate with the mean annual citations in the first five years after publication identifies sleeping beauty publications more effectively. The analysis finds that "consensus-type" publications such as reviews, guidelines, and monographs play a key role in triggering the citation surge of sleeping beauty publications that propose new but not-yet-recognised ideas. We suggest that post hoc identification of sleeping beauties combine objective indicators with expert judgement, that ex ante prediction track whether a paper is recommended and cited by "consensus-type" publications, and that research evaluation pay particular attention to papers with low citation rates.
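A widely used objective instantiation of the sleeping beauty index mentioned above is the beauty coefficient B of Ke et al., which sums how far a paper's yearly citation counts sag below the straight line from its publication-year count to its citation peak. A minimal sketch (the toy citation histories are invented for illustration):

```python
def beauty_coefficient(yearly_citations):
    """Ke et al.'s B: summed, per-year-normalised gap between the line from
    (0, c0) to the citation peak (tm, ctm) and the actual yearly counts."""
    c = yearly_citations
    tm = max(range(len(c)), key=lambda t: c[t])  # year of peak citations
    if tm == 0:
        return 0.0                               # peaked immediately: no sleep
    c0, ctm = c[0], c[tm]
    return sum(((ctm - c0) / tm * t + c0 - c[t]) / max(1, c[t])
               for t in range(tm + 1))

# A long-dormant then awakened history scores high; a steady riser scores 0.
sleeper = [0, 0, 0, 0, 0, 0, 0, 0, 20]
steady  = [5, 6, 7, 8, 9, 10, 11, 12, 13]
```

A limitation the abstract notes also shows up here: B only looks at years up to the citation peak, so it says nothing about how the curve behaves after the peak, which is why the study pairs it with the full-window citation rate indicator.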

17.
The way retracted papers are mentioned in post-retraction citations reflects the perception of the citing authors. The characteristics of post-retraction citations are therefore worth studying to provide insights into breaking the citation chain of retracted papers. In this study, full-text analysis is used to compare citation location and citation sentiment (attitudes and dispositions toward the cited work) between citations that correctly mention the retracted status (CM) and those that do not mention it (NM). Statistical tests are carried out to explore the effect of CM on post-retraction citations in the field of psychology. The citation sentiment of CM papers is equally distributed among negative, neutral, and positive, while for NM papers it is mainly neutral or positive. CM papers tend to cite retracted papers in the Methodology section, whereas NM papers cite them more in the Theoretical Background and Conclusion sections. The perception efficiency of retractions in psychology is low: the average unaware duration (UD, the period between the publication of the retraction note and the first citation that directly points out the retracted status) is 2.88 years. UD is also negatively correlated with the quantity of CM and with the growth rate of NM, the proportionate change in NM before and after the first CM paper appears (P < 0.01). After awareness of a retraction, the average rate of change (ARC, the total change divided by the time taken) of NM declines significantly (Z = -2.823, P < 0.01), whereas CM rises in most disciplines, which helps reduce possible interdisciplinary impact.

18.
Identifying future influential papers among newly published ones is an important yet challenging issue in bibliometrics. As newly published papers have no or limited citation history, linear extrapolation of their citation counts, as motivated by the well-known preferential attachment mechanism, is not applicable. We translate the recently introduced notion of discoverers to the citation network setting and show that there are authors who frequently cite recent papers that later become highly cited; these authors are referred to as discoverers. We develop a method for the early identification of highly cited papers based on early citations from discoverers. The results show that the identified discoverers have a consistent citing pattern over time and that early citations from them are a valuable indicator for predicting a paper's future citation counts. The discoverers themselves are potential future outstanding researchers, as they receive more citations than average.

19.
[Purpose/Significance] In citation analysis, a paper's future citations can be predicted from some of its attributes, and the predictions can be used to evaluate papers, their authors, the authors' institutions, and the publication venues. [Method/Process] Factors affecting citations are studied from three perspectives: the venue, the authors, and the paper itself. SCI-indexed papers in library and information science serve as the analysis and validation data; logistic regression, GBDT, XGBoost, AdaBoost, random forest, and other algorithms are used for prediction; multiple sets of evaluation metrics are used to compare the prediction methods; and GBDT is used to identify the factors with the greatest influence on citations. [Result/Conclusion] The degree to which the three groups of factors influence citation prediction is determined, a prediction model is constructed, and papers' citations over a future period are predicted with good accuracy. Extensive experiments show that GBDT, XGBoost, and random forest have the strongest predictive power, and the longer the prediction window, the relatively better the results.

20.
Across the various scientific domains, significant differences occur with respect to research publishing formats, frequencies, and citing practices, the nature and organisation of research, and the number and impact of a given domain's academic journals. Consequently, differences occur in the citations and h-indices of researchers. This paper attempts to identify cross-domain differences using quantitative and qualitative measures. The study focuses on the relationships among citations, most-cited papers, and h-indices across domains and research group sizes. The analysis is based on the research output of approximately 10,000 researchers in Slovenia, of which we focus on the 6536 researchers working in 284 research group programmes in 2008-2012. As comparative measures of cross-domain research output, we propose the research impact cube (RIC) representation and the analysis of most-cited papers, highest impact factors, and citation distribution graphs (Lorenz curves). The analysis of Lotka's model resulted in the proposal of a binary citation frequencies (BCF) distribution model that describes publishing frequencies well. The results may be used as a model to measure, compare, and evaluate fields of science at the global, national, and research community levels, to streamline research policies, and to evaluate progress over a definite time period.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号