Assigning credit to scientific datasets using article citation networks
Institution: 1. School of Information Management, Nanjing University, Nanjing 210023, China; 2. School of Information Studies, Syracuse University, Syracuse, NY 13244, USA
Abstract: Citation is a well-established mechanism for connecting scientific artifacts. Citation analysis uses citation networks for a variety of purposes, most prominently to give credit to scientists' work. However, under current citation practices, scientists tend to cite only publications and leave out other types of artifacts such as datasets. As a result, datasets do not receive appropriate credit even though they are increasingly reused and experimented with. We develop a network flow measure, called DataRank, aimed at closing this gap. DataRank assigns a relative value to each node in the network based on how citations flow through the graph, differentiating publication and dataset flow rates. We evaluate the quality of DataRank by estimating its accuracy at predicting the usage of real datasets: web visits to GenBank and downloads of Figshare datasets. We show that DataRank predicts this usage better than the alternatives while offering additional interpretable outcomes. We discuss improvements to citation behavior and algorithms to properly track and assign credit to datasets.
Keywords: DataRank; Scientific dataset; Dataset impact; Citation network
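
The abstract describes DataRank only at a high level: a value per node derived from how citations flow through the graph, with different flow rates for publications and datasets. The sketch below illustrates that general idea and is not the authors' formula. It assumes a CiteRank-style propagation in which each node starts with a weight that decays with age, the decay time depends on the node type, and a fraction `alpha` of each node's weight is passed on to the nodes it cites. The function name `flow_rank`, the parameters `tau` and `alpha`, and the toy graph are all hypothetical.

```python
import math


def flow_rank(cites, kind, age, tau, alpha=0.5, iters=100):
    """Illustrative flow score on a citation graph (assumed model, not DataRank itself).

    cites: dict mapping a node to the list of nodes it cites
    kind:  dict mapping a node to "pub" or "data"
    age:   dict mapping a node to its age in years
    tau:   dict mapping a node type to its decay time (the type-specific rate)
    """
    nodes = list(kind)
    # Starting weight: recent nodes are more likely entry points; decay differs by type.
    start = {n: math.exp(-age[n] / tau[kind[n]]) for n in nodes}
    total = sum(start.values())
    wave = {n: s / total for n, s in start.items()}
    score = dict(wave)
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            targets = cites.get(u, [])
            if not targets:
                continue  # nothing cited, so the flow stops here
            share = alpha * wave[u] / len(targets)
            for v in targets:
                nxt[v] += share
        wave = nxt
        # Accumulate the flow that reached each node at this step.
        for n in nodes:
            score[n] += wave[n]
    return score


# Toy graph: two publications and one dataset (hypothetical data).
cites = {"p1": ["p2", "d1"], "p2": ["d1"]}
kind = {"p1": "pub", "p2": "pub", "d1": "data"}
age = {"p1": 1.0, "p2": 1.0, "d1": 1.0}
tau = {"pub": 2.0, "data": 4.0}
scores = flow_rank(cites, kind, age, tau)
```

In this toy graph the dataset `d1` is cited by both publications, so it accumulates the most flow. Changing `tau["data"]` relative to `tau["pub"]` shifts how much weight datasets start with, which is one simple way to model type-specific flow rates.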
This article is indexed in ScienceDirect and other databases.