1.

Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data

Lutz Bornmann Robin Haunschild 《Journal of Informetrics》2018,12(3):998-1011

Recently, two new indicators (Equalized Mean-based Normalized Proportion Cited, EMNPC; Mean-based Normalized Proportion Cited, MNPC) were proposed which are intended for sparse scientometrics data, e.g., alternative metrics (altmetrics). The indicators compare the proportion of mentioned papers (e.g., on Facebook) of a unit (e.g., a researcher or institution) with the proportion of mentioned papers in the corresponding fields and publication years (the expected values). In this study, we propose a third indicator (Mantel-Haenszel quotient, MHq) belonging to the same indicator family. The MHq is based on the MH analysis – an established method in statistics for the comparison of proportions. We test (using citations and assessments by peers, i.e., F1000Prime recommendations) whether the three indicators can distinguish between different quality levels as defined on the basis of the assessments by peers. Thus, we test their convergent validity. We find that the indicator MHq is able to distinguish between the quality levels in most cases while MNPC and EMNPC are not. Since the MHq is shown in this study to be a valid indicator, we apply it to six types of zero-inflated altmetrics data and test whether different altmetrics sources are related to quality. The results for the various altmetrics demonstrate that the relationship between altmetrics (Wikipedia, Facebook, blogs, and news data) and assessments by peers is not as strong as the relationship between citations and assessments by peers. In fact, the relationship between citations and peer assessments is about two to three times stronger than the association between altmetrics and assessments by peers.
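
The abstract above says the MHq builds on Mantel-Haenszel analysis for comparing proportions. As a rough illustration only, the following sketch computes the classical Mantel-Haenszel pooled odds ratio across strata (for example, field and publication-year combinations). It is an assumption that this textbook estimator resembles the paper's MHq; the function name, the table layout, and the example counts are all hypothetical.

```python
from typing import Iterable, Tuple

# One stratum (e.g., a field/publication-year combination) as a 2x2 table:
# (unit_mentioned, unit_not_mentioned, reference_mentioned, reference_not_mentioned)
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(strata: Iterable[Stratum]) -> float:
    """Classical Mantel-Haenszel pooled odds ratio over 2x2 strata.

    Textbook estimator: sum(a*d/n) / sum(b*c/n). This is NOT claimed to be
    the exact MHq definition used in the paper.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these strata")
    return numerator / denominator


# Hypothetical counts, invented purely for illustration.
example_strata = [
    (3, 7, 10, 80),  # hypothetical field A
    (1, 9, 5, 85),   # hypothetical field B
]
pooled = mantel_haenszel_odds_ratio(example_strata)
```

A pooled value above 1 would indicate that the unit's papers are mentioned more often than the reference sets in the same strata; with a single stratum the estimator reduces to the ordinary odds ratio.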

2.

Implicit indexing of natural language text by reorganizing bytecodes

Nieves R. Brisaboa Antonio Fariña Susana Ladra Gonzalo Navarro 《Information Retrieval》2012,15(6):527-557

Word-based byte-oriented compression has succeeded on large natural language text databases, by providing competitive compression ratios, fast random access, and direct sequential searching. We show that by just rearranging the target symbols of the compressed text into a tree-shaped structure, and using negligible additional space, we obtain a new implicitly indexed representation of the compressed text, where search times are drastically improved. The occurrences of a word can be listed directly, without any text scanning, and in general any inverted-index-like capability, such as efficient phrase searches, can be emulated without storing any inverted list information. We experimentally show that our proposal performs not only much more efficiently than sequential searches over compressed text, but also than explicit inverted indexes and other types of indexes, when using little extra space. Our representation is especially successful when searching for single words and short phrases.

3.

Information rights and national security

Nadia Caidi Anthony Ross 《Government Information Quarterly》2005,22(4):663-684

The changes in the global information landscape, as epitomized by the reaction of governments to the 9/11 attacks, resulted in legislation, policy, and the formation of agencies that have affected many issues related to information and its use. This article examines the recent multiplicity of challenges that affect citizens' control and use of information. In the name of the war on terror, greater national security, and globalization trends, information laws and policies often go further than is necessary and impact on the information rights of citizens. In this article, we advocate for bringing together what are at times disparate information issues under one label, namely, “information rights” (which include privacy, freedom of expression, access, etc.). Information rights are apprehended from a user-centered perspective (i.e., users as citizens, not just consumers). They cover many different aspects of the information life cycle and the roles and responsibilities of individuals and communities. Such an approach provides an alternative way of framing current information issues as they relate to national security policies and civil liberties in the broader sense.

4.

The ineffectiveness of within-document term frequency in text classification

W. John Wilbur Won Kim 《Information Retrieval》2009,12(5):509-525

For the purposes of classification it is common to represent a document as a bag of words. Such a representation consists of the individual terms making up the document together with the number of times each term appears in the document. All classification methods make use of the terms. It is common to also make use of the local term frequencies at the price of some added complication in the model. Examples are the naïve Bayes multinomial model (MM), the Dirichlet compound multinomial model (DCM) and the exponential-family approximation of the DCM (EDCM), as well as support vector machines (SVM). Although it is usually claimed that incorporating local word frequency in a document improves text classification performance, we here test whether such claims are true or not. In this paper we show experimentally that simplified forms of the MM, EDCM, and SVM models which ignore the frequency of each word in a document perform at about the same level as MM, DCM, EDCM and SVM models which incorporate local term frequency. We also present a new form of the naïve Bayes multivariate Bernoulli model (MBM) which is able to make use of local term frequency and show again that it offers no significant advantage over the plain MBM. We conclude that word burstiness is so strong that additional occurrences of a word essentially add no useful information to a classifier.
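
The contrast drawn in the abstract above, between a bag of words that keeps within-document term frequencies and a simplified form that only records whether a term occurs, can be sketched as follows. This is an illustrative assumption about the two representations, not the authors' code; the function names and the whitespace tokenization are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def term_frequencies(tokens: List[str]) -> Dict[str, int]:
    """Bag of words with local term frequency: term -> number of occurrences."""
    return dict(Counter(tokens))


def binary_presence(tokens: List[str]) -> Dict[str, int]:
    """Simplified bag of words: term -> 1 if it occurs at all (frequency ignored)."""
    return {term: 1 for term in set(tokens)}


# Naive whitespace tokenization, used here only to keep the sketch short.
tokens = "the cat saw the other cat".split()
with_counts = term_frequencies(tokens)
presence_only = binary_presence(tokens)
```

Both dictionaries contain the same terms; they differ only in whether repeated occurrences of a word change its value, which is the information the abstract reports as adding little for classification.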

5.

Abbreviations, Full Spellings, and Searchers’ Preferences

Jeffrey Beall 《Cataloging & classification quarterly》2013,51(6):443-456

This study examined ten selected word pairs, each containing a word's full spelling and its abbreviation, to determine which form search engine users preferred in searching. Using seven search logs gathered from several Internet search engines with approximately 608 MB of data, the study measured the occurrences of the twenty terms. The selected words are important in library cataloging, for some are prescribed abbreviations in metadata content standards. The study found that in eight of the ten word pairs users preferred to search words’ full spellings over the abbreviations, often by a high margin.

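6.

Tracing the Origin of the Term "Gongwen" (公文, Official Document)

丁海斌 (Ding Haibin) 康胜利 (Kang Shengli) 《北京档案》 (Beijing Archives) 2015,(11):13-16

Addressing the competing accounts of the origin of the term "gongwen" (official document), this paper applies literature retrieval statistics, comparison with similar terms, and textual analysis. It finds that the term originated in the late Eastern Han dynasty but only gradually became prevalent in the Song, Ming, and Qing dynasties; that its number of occurrences and its frequency show an overall increasing trend; and that its document-related usage is fairly broad, with its scope of application remaining consistently within the basic category of "official government administrative documents". The term is a typical example of ancient Chinese vocabulary that has flourished in modern times.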

7.

Reaching Additional Users with Proactive Chat

Linda Rich Vera Lux 《The Reference Librarian》2018,59(1):23-34

Despite a general decline in recent years in academic libraries’ reference desk statistics, research indicates that library users continue to have complex research questions but are largely unaware that librarians are waiting and ready to assist them. The challenge for librarians is to connect with users at their point of need. At Bowling Green State University, we are making a move in this direction with proactive (pop-up) chat widgets embedded within our library Web pages, catalog, and databases. Since implementation, the number of chat reference questions received has more than doubled, helping us reach additional users both on- and off-campus.

8.

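Library Book-Borrowing Systems and a Single-Scale Bipartite Network Model (Total citations: 8; self-citations: 0; citations by others: 8)

傅林华 (Fu Linhua) 郭建峰 (Guo Jianfeng) 朱建阳 (Zhu Jianyang) 《情报学报》 (Journal of the China Society for Scientific and Technical Information) 2004,23(5):571-575

This paper studies the library, an interesting complex system, from a network perspective. Readers and books are linked through borrowing, and this relationship can be described in network terms at two levels: a bipartite (reader-book) network and unipartite (reader-reader, book-book) networks. Using the coordination number distribution (the degree distribution, in network terms) as our tool, we study the network formed by the borrowing records of the circulation desk of the Beijing Normal University Library over 14 months. We find that it shows a clear single-scale property: the coordination number distribution decays exponentially. We then propose a single-scale bipartite network model to explain this, which qualitatively reproduces the empirical result.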
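
The abstract above analyzes a reader-book borrowing network through its coordination number (degree) distribution. A minimal sketch of how such distributions might be tabulated from loan records is shown below. The record layout, the function name, and the choice to count a repeated loan of the same book by the same reader as a single link are all assumptions, not details taken from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Loan = Tuple[Hashable, Hashable]  # hypothetical record: (reader_id, book_id)


def degree_distributions(loans: Iterable[Loan]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (reader_distribution, book_distribution) for a bipartite loan network.

    Each distribution maps a degree k to the number of nodes with that degree.
    Assumption: repeated loans of the same pair form one link, not several.
    """
    books_by_reader = defaultdict(set)
    readers_by_book = defaultdict(set)
    for reader, book in loans:
        books_by_reader[reader].add(book)
        readers_by_book[book].add(reader)
    reader_dist = Counter(len(books) for books in books_by_reader.values())
    book_dist = Counter(len(readers) for readers in readers_by_book.values())
    return dict(reader_dist), dict(book_dist)


# Invented records for illustration; ("r1", "b1") appears twice on purpose.
example_loans = [("r1", "b1"), ("r1", "b2"), ("r2", "b1"), ("r3", "b1"), ("r1", "b1")]
reader_dist, book_dist = degree_distributions(example_loans)
```

Plotting the logarithm of these counts against the degree k is one common way to check for exponential decay, since an exponential distribution appears as a roughly straight line on such a plot.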

9.

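Research Progress in Natural Language Semantic Analysis (Total citations: 5; self-citations: 0; citations by others: 5)

秦春秀 (Qin Chunxiu) 祝婷 (Zhu Ting) 赵捧未 (Zhao Pengwei) 张毅 (Zhang Yi) 《图书情报工作》 (Library and Information Service) 2014,58(22):130-137

Following the constituent levels of natural language (words, sentences, and discourse), this paper analyzes, for each level, what semantic analysis involves, the existing research strategies, their theoretical foundations, and the main methods in use, and compares the two main existing research strategies. It holds that word-level semantic analysis means determining word meaning and measuring the semantic similarity or relatedness between two words; that sentence-level semantic analysis comprises sentence meaning analysis and sentence semantic similarity analysis; and that text-level semantic analysis is the process of identifying semantic information such as a text's meaning, topic, and category. Current natural language semantic analysis follows two main research strategies: semantic analysis based on knowledge or semantic rules, and semantic analysis based on statistics. Methods that combine statistics with rules will be the mainstream approach to natural language semantic analysis in the future, and ontological semantics is an important foundation for it.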

10.

How to consider fractional counting and field normalization in the statistical modeling of bibliometric data: A multilevel Poisson regression approach

Rüdiger Mutz Hans-Dieter Daniel 《Journal of Informetrics》2019,13(2):643-657

The numerical-algorithmic procedures of fractional counting and field normalization are often mentioned as indispensable requirements for bibliometric analyses. Against the background of the increasing importance of statistics in bibliometrics, a multilevel Poisson regression model (level 1: publication, level 2: author) shows possible ways to consider fractional counting and field normalization in a statistical model (fractional counting I). However, due to the assumption of duplicate publications in the data set, the approach is not quite optimal. Therefore, a more advanced approach, a multilevel multiple membership model, is proposed that no longer provides for duplicates (fractional counting II). It is assumed that the citation impact can essentially be attributed to time-stable dispositions of researchers as authors who contribute with different fractions to the success of a publication’s citation. The two approaches are applied to bibliometric data for 254 scientists working in social science methodology. A major advantage of fractional counting II is that the results no longer depend on the type of fractional counting (e.g., equal weighting). Differences between authors in rankings are reproduced more clearly than on the basis of percentiles. In addition, the strong importance of field normalization is demonstrated; 60% of the citation variance is explained by field normalization.
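
The abstract above mentions equal weighting as one type of fractional counting. The sketch below shows only that arithmetic: each of the k authors of a paper receives 1/k of the publication and 1/k of its citations. It does not implement the multilevel Poisson or multiple membership models proposed in the paper, and the function name, input layout, and example data are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, List, Tuple

Paper = Tuple[int, List[str]]  # hypothetical record: (citation_count, author_names)


def fractional_counts(papers: List[Paper]) -> Tuple[Dict[str, Fraction], Dict[str, Fraction]]:
    """Equal-weight fractional counting of publications and citations per author."""
    publications = defaultdict(Fraction)  # Fraction() starts at 0
    citations = defaultdict(Fraction)
    for citation_count, authors in papers:
        unique_authors = sorted(set(authors))
        if not unique_authors:
            continue  # a paper without authors cannot be credited to anyone
        share = Fraction(1, len(unique_authors))
        for author in unique_authors:
            publications[author] += share
            citations[author] += share * citation_count
    return dict(publications), dict(citations)


# Invented data for illustration only.
example_papers = [(10, ["A", "B"]), (6, ["A"]), (9, ["A", "B", "C"])]
pub_credit, cite_credit = fractional_counts(example_papers)
```

Exact fractions are used so that the credits for each paper sum to exactly one publication, which makes the bookkeeping easy to verify; a floating-point version would work equally well for larger data sets.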