Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Egghe and Proot [Egghe, L., & Proot, G. (2007). The estimation of the number of lost multi-copy documents: A new type of informetrics theory. Journal of Informetrics] introduce a simple probabilistic model to estimate the number of lost multi-copy documents based on the numbers of retrieved ones. We show that their model in practice can essentially be described by the well-known Poisson approximation to the binomial. This enables us to adopt a traditional maximum likelihood estimation (MLE) approach which allows the construction of (approximate) confidence intervals for the parameters of interest, thereby resolving an open problem left by the authors. We further show that the general estimation problem is a variant of a well-known unseen species problem. This work should be viewed as supplementing that of Egghe and Proot [Egghe, L., & Proot, G. (2007). The estimation of the number of lost multi-copy documents: A new type of informetrics theory. Journal of Informetrics]. It turns out that their results are broadly in line with those produced by this rather more robust statistical analysis.
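The abstract names a Poisson model and maximum likelihood but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' estimator. It assumes the number of surviving copies per document is Poisson with rate `lam`, that documents with zero surviving copies are never observed (a zero-truncated Poisson), and that the input is a hypothetical mapping from copy count to number of documents. The function name `fit_zero_truncated_poisson` is illustrative only, and confidence intervals are not computed here.

```python
import math

def fit_zero_truncated_poisson(copy_counts):
    """Fit a zero-truncated Poisson by maximum likelihood.

    copy_counts maps k (>= 1 surviving copies) to the number of
    documents observed with exactly k copies (hypothetical input format).
    Returns (lam, lost), where lost is the estimated number of documents
    with zero surviving copies.
    """
    n = sum(copy_counts.values())
    mean = sum(k * c for k, c in copy_counts.items()) / n
    if mean <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive rate")

    # The MLE solves lam / (1 - exp(-lam)) = mean. The left side is
    # increasing in lam, tends to 1 as lam -> 0, and exceeds mean at
    # lam = mean, so the root lies in (0, mean); find it by bisection.
    lo, hi = 1e-9, mean
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid / (1 - math.exp(-mid)) < mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2

    # Under the model, P(0 copies) = exp(-lam); scale the n observed
    # documents up to the implied number of unobserved ones.
    p0 = math.exp(-lam)
    lost = n * p0 / (1 - p0)
    return lam, lost

# Made-up example: 50 documents with 1 copy, 30 with 2, 15 with 3, 5 with 4.
lam, lost = fit_zero_truncated_poisson({1: 50, 2: 30, 3: 15, 4: 5})
print(lam, lost)
```

The counts in the example are invented for illustration and do not come from the cited study.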

2.
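As an authoritative conference in information science, the ASIS&T Annual Meeting is an important venue for presenting and disseminating the field's latest research. This paper takes the 74 papers accepted at the 82nd ASIS&T Annual Meeting as its object of study and gives a detailed review and trend analysis in four areas: information behavior, information organization and management, information technology and metrics, and information services. Research on information behavior covers reading, research, gaming, health, and retrieval, and is closer to people's information behavior in everyday study and life. Information organization pays more attention to metadata, data links between entities, data reuse, and data governance. In information management, both personal and institutional information management place more emphasis on user experience and users' emotional factors. Information technology focuses more on text analysis techniques, while informetrics focuses more on interdisciplinary knowledge structures and offers new perspectives for research on peer review and retraction. With an emphasis on information ethics and data ethics, information services focus on inclusive library support services for special populations, information system development and data organization for the digital humanities, and data analysis and community services for social networks.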

3.
Throughout the ages, man has recorded events, ideas, research and knowledge for the benefit of future generations. This information has historically been housed and preserved by various entities such as libraries, universities, government agencies and archives. In addition to preservation, one of the primary objectives of these custodial entities was to provide access to the information. The amount of printed material during the explosion of the Information Age alone exceeded mankind's complete history of documents. Due to the increased volume of data, the difficulty of accessing it and the growing demand for instant information, man's ingenuity transformed the concept of extracting information. A need for assisting the present generation in finding specific information was evident; information became a commodity rather than a benefactor; and the information broker was spawned.

4.
Calls for public engagement and participation in AI governance align strongly with a public value management approach to public administration. Simultaneously, the prominence of commercial vendors and consultants in AI discourse emphasizes market value and efficiency in a way often associated with the private sector and New Public Management. To understand how this might influence the consolidation of AI governance regimes and decision-making by public administrators, 16 national strategies for AI are subjected to content analysis. References to the public's role and public engagement mechanisms are mapped across national strategies, as is the articulation of values related to professionalism, efficiency, service, engagement, and the private sector. Though engagement rhetoric is common, references to specific engagement mechanisms and activities are rare. Analysis of value relationships highlights congruence of engagement values with professionalism and private sector values, and raises concerns about neoliberal technology frames that normalize AI, obscuring policy complexity and trade-offs.

5.
In today's global culture, where the Internet has established itself as the main tool for communication and commerce, the capability to analyze and predict citizens' behavior at massive scale has become a priority for governments in terms of collective intelligence and security. At the same time, in the context of the novel possibilities that artificial intelligence (AI) brings to governments for understanding and developing collective behavior analysis, important concerns related to citizens' privacy have emerged. In order to identify the main uses that governments make of AI and to define citizens' concerns about their privacy, in the present study we undertook a systematic review of the literature, conducted in-depth interviews, and applied data-mining techniques. Based on our results, we classified and discussed the risks to citizens' privacy according to the types of AI strategies used by governments that may affect collective behavior and cause massive behavior modification. Our results revealed 11 uses of AI strategies by governments to improve their interaction with citizens, organizations in cities, services provided by public institutions, or the economy, among other areas. In relation to citizens' privacy when AI is used by governments, we identified 8 topics related to human behavior predictions, intelligence decision making, decision automation, digital surveillance, data privacy law and regulation, and the risk of behavior modification. The paper concludes with a discussion of the development of regulations focused on the ethical design of citizen data collection, where implications for governments are presented aimed at regulating security, ethics, and data privacy. Additionally, we propose a research agenda composed of 16 research questions to be investigated in further research.

6.
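To address the uneven quality of content structure in texts, this paper proposes a method for evaluating and analyzing text content structure. The method treats the sentences of a text as nodes and the nouns shared between sentences as edges to build a complex network of the text, and then uses topological properties of that network to analyze the text's structural features. A complex network is built from a news text case, and measures such as degree, strength, shortest path, and weighted clustering coefficient are computed. These measures evaluate the quality of text content structure well and also provide an important reference for understanding and extracting the main idea of a text, generating summaries, and text retrieval and filtering.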
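The network construction in this abstract (sentences as nodes, shared nouns as edges) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the nouns of each sentence have already been extracted by some tagger, it takes the edge weight to be the number of shared nouns, and it computes only degree and strength. The function names and the sample data are hypothetical.

```python
from itertools import combinations

def build_sentence_network(sentence_nouns):
    """Return {(i, j): weight} for every sentence pair sharing a noun.

    sentence_nouns is a list with one collection of nouns per sentence
    (assumed to be pre-extracted). The weight is the number of shared
    nouns, which is an assumption about how edges are weighted.
    """
    edges = {}
    for i, j in combinations(range(len(sentence_nouns)), 2):
        shared = set(sentence_nouns[i]) & set(sentence_nouns[j])
        if shared:
            edges[(i, j)] = len(shared)
    return edges

def degree_and_strength(n_nodes, edges):
    """Degree = number of neighbours; strength = sum of edge weights."""
    degree = [0] * n_nodes
    strength = [0] * n_nodes
    for (i, j), weight in edges.items():
        degree[i] += 1
        degree[j] += 1
        strength[i] += weight
        strength[j] += weight
    return degree, strength

# Made-up nouns for four sentences of a short news text.
nouns = [
    {"city", "flood"},
    {"flood", "rescue", "city"},
    {"rescue"},
    {"budget"},
]
edges = build_sentence_network(nouns)
degree, strength = degree_and_strength(len(nouns), edges)
print(edges)     # {(0, 1): 2, (1, 2): 1}
print(degree)    # [1, 2, 1, 0]
print(strength)  # [2, 3, 1, 0]
```

In this toy example the fourth sentence shares no noun with the others and ends up isolated, which is the kind of structural signal the degree and strength measures expose.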

7.
8.
Governments are increasing digital communication with citizens, yet little is known about how the public sector influences communicators’ daily social media activities. This ethnographic study uses interviews, documents, and participant observation to offer a rare emic view of the US Coast Guard (USCG) social media program. Breaking up the monolithic public sector communication context, influences on social media communication were nested within five contexts: organization, military, parent agency, federal government, and the US public sector. By observing how the contexts and related attributes influence personnel and the program, the study provides insights related to social media communication processes rather than merely content products. Findings extend theoretical and practical applications by identifying enablers and challenges to government social media communication within an applied context. USCG's culture and history of transparency and engagement drive the strategy, while resource constraints and a devaluing of social media within the decentralized organization constrain program effectiveness and real-time engagement.

9.
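Zhao Huaming, Qian Li, Yu Li. 《图书情报工作》 (Library and Information Service), 2020, 64(11): 108-115
[Purpose/Significance] To explore the recognition and extraction of scientific named entities and their relations, improve recognition in complex cases such as long sentences, and provide a reference for further applications. [Method/Process] Building on dependency-syntax feature analysis, the paper proposes a method for extracting relations between scientific named entities. The process comprises: (1) part-of-speech tagging of the target text with the Stanford Tagger; (2) based on the tagging results, splitting the target text into structurally regular semantic fragments around core predicates and the SAO (subject-action-object) structure; (3) using dependency parsing to find the subjects and objects semantically related to the core predicate, forming (entity, relation, entity) triples. [Results/Conclusions] Comparative tests against mainstream algorithms such as Ollie and Reverb show that the method can effectively improve the accuracy of scientific named entity recognition.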

10.
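Semantic enrichment of resources supported by informetrics is an emerging research focus in the digital library field. At three levels (publication co-occurrence and coupling, citation co-occurrence and coupling, and publication-citation co-occurrence), the paper comprehensively reveals, and extends by inference, the metadata concepts in digital document resources and the metric-semantic relations among them. It then systematically constructs an informetric semantic network for digital document resources that consists of the concepts and relations produced by metric analysis and shares the characteristics of ontology models, semantic networks, and social networks.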

11.
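A text semantic processing method based on latent semantic indexing and ontology is proposed. First, an ontology-based virtual standard text feature vector is constructed. Next, latent semantic indexing is used to cluster the text collection semantically, with the virtual standard text feature vector as the reference. Finally, guided by the virtual standard text feature vector, knowledge in the ontology base is used to label explicitly the category and semantics of each cluster obtained. Experiments show that the method clusters texts effectively at the semantic level and that the clustering results explicitly show the category to which each cluster belongs.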

12.
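Research on the Foundations and Development of Informetrics   Total citations: 6, self-citations: 0, citations by others: 6
The foundation of informetrics refers to its logical starting point in the genetic sense. The first is its "heteronomous" foundation, the socio-economic basis that determines its research content, chiefly the demands arising from the informatization and knowledge-orientation of social labor. The second is its "autonomous" foundation, the internal laws that determine its disciplinary form, chiefly the constraints imposed by the forms of results in related disciplines and by the knowledge structure of the "scientific community". Informetrics formed and developed on the basis of bibliometrics and scientometrics. Its formation and development accord with the development of the socio-economic base, in particular the socialization and informatization of science and communication and the shift of information resources to electronic and networked forms. It is the inevitable result of quantitative research in information science.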

13.
Summarizing Similarities and Differences Among Related Documents   Total citations: 10, self-citations: 0, citations by others: 10
In many modern information retrieval applications, a common problem which arises is the existence of multiple documents covering similar information, as in the case of multiple news stories about an event or a sequence of events. A particular challenge for text summarization is to be able to summarize the similarities and differences in information content among these documents. The approach described here exploits the results of recent progress in information extraction to represent salient units of text and their relationships. By exploiting meaningful relations between units based on an analysis of text cohesion and the context in which the comparison is desired, the summarizer can pinpoint similarities and differences, and align text segments. In evaluation experiments, these techniques for exploiting cohesion relations result in summaries which (i) help users more quickly complete a retrieval task, (ii) result in improved alignment accuracy over baselines, and (iii) improve identification of topic-relevant similarities and differences.

14.
This study investigates the public's initial trust in so-called “artificial intelligence” (AI) chatbots about to be introduced into use in the public sector. While the societal impacts of AI are widely speculated about, empirical testing remains rare. To narrow this gap, this study builds on theories of operators' trust in machines in industrial settings and proposes that initial public trust in chatbot responses depends on (i) the area of enquiry, since expectations about a chatbot's performance vary with the topic, and (ii) the purposes that governments communicate to the public for introducing the use of chatbots. Analyses based on an experimental online survey in Japan generated results indicating that, if a government were to announce its intention to use “AI” chatbots to answer public enquiries, the public's initial trust in their responses would be lower in the area of parental support than in the area of waste separation, with a moderate effect size. Communicating purposes that would directly benefit citizens, such as achieving uniformity in response quality and timeliness in responding, would enhance public trust in chatbots. Although the effect sizes are small, communicating these purposes might still be worthwhile, as it would be an inexpensive measure for a government to take.

15.
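Yin Peili. 《图书与情报》 (Library and Information), 2011, (3): 53-56, 84
Oral materials are materials whose information was originally obtained and transmitted by word of mouth. According to the direction of information flow during their creation, they can be divided into two types: narrative and interview. Under China's current Copyright Law, oral materials should be protected as "works", but cases such as works made in the course of employment and, for interview-type oral materials, joint works need to be distinguished. For copyright issues arising in their compilation and use, the principle of "autonomy of will" should be respected, and ownership of rights should be made clear through instruments such as a "letter of authorization".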

16.
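A Brief Discussion of the Rescue and Preservation of Republican-Era Documents   Total citations: 1, self-citations: 0, citations by others: 1
Documents published during the Republican period are a highly important body of literature whose intellectual and cultural value is no less than that of rare ancient books. However, because of papermaking conditions in that period and weak present-day awareness of the need to protect these documents, their acidification, aging, and damage are extremely serious. Strengthening their rescue and preservation is therefore of great importance. This paper offers some views on the preservation of Republican-era documents for colleagues' reference.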

17.
18.
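[Purpose/Significance] Most existing research on entity ranking in news documents is document-centered or entity-centered, such as text classification and entity linking, and little of it considers the importance of entities within a text. This study examines importance-based entity ranking for news documents. [Method/Process] Given a document, the importance of each entity in it relative to the document is assessed, and the entities are ranked accordingly. Experiments are run on the Sogou whole-web news dataset, and the ranking results are evaluated with two measures, NDCG and the inverse-pair ratio (逆序对比率). [Results/Conclusions] The experiments show that methods based on entity frequency, TF*IDF, information entropy, and TextRank, as well as the ensemble method, all perform well, while the method based on the clustering coefficient performs only moderately. The TF*IDF method reaches an NDCG of 95.86%, the best result on that measure; the ensemble method reaches an inverse-pair ratio of 84.46%, the best result on that measure.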
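NDCG, one of the two evaluation measures named above, can be sketched as follows. The abstract does not say which gain and discount variant the study uses, so this sketch assumes the common form with linear gain and a log2 position discount. The function names and the sample relevance grades are illustrative only.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount.

    relevances lists the graded importance of each entity in ranked
    order, best-ranked first. Position 0 is discounted by log2(2) = 1.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Made-up importance grades for the entities of one document,
# listed in the order a ranking method returned them.
print(ndcg([3, 2, 1]))  # already in ideal order, so this prints 1.0
print(ndcg([1, 2, 3]))  # reversed order scores below 1.0
```

A ranking that places the most important entities first scores 1.0, and any other ordering of the same grades scores lower.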

19.
In this paper a machine learning approach for classifying Arabic text documents is presented. To handle the high dimensionality of text documents, embeddings are used to map each document (instance) into R (the set of real numbers), representing the tri-gram frequency statistics profile of the document. Classification is achieved by computing a dissimilarity measure, called the Manhattan distance, between the profile of the instance to be classified and the profiles of all the instances in the training set. The class (category) to which an instance (document) belongs is the one with the smallest computed Manhattan distance. The Dice similarity measure is used to compare the performance of the method. Results show that tri-gram text classification using the Dice measure outperforms classification using the Manhattan measure.
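The classification scheme described here can be sketched as a nearest-profile classifier. This is a minimal sketch under stated assumptions rather than the paper's implementation: tri-grams are taken over characters, profiles hold relative frequencies, and the Dice measure is computed over the sets of distinct tri-grams. The function names and the toy training data are hypothetical. The code works on any Unicode string, Arabic included.

```python
from collections import Counter

def trigram_profile(text):
    """Map each character tri-gram to its relative frequency in text."""
    grams = [text[i:i + 3] for i in range(len(text) - 2)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}

def manhattan(p, q):
    """Sum of absolute frequency differences over all tri-grams."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))

def dice(p, q):
    """Dice similarity over the sets of distinct tri-grams (assumed form)."""
    a, b = set(p), set(q)
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

def classify(text, training):
    """Return the label of the training text with the smallest Manhattan
    distance to text. training is a list of (text, label) pairs."""
    profile = trigram_profile(text)
    nearest = min(training, key=lambda item: manhattan(profile, trigram_profile(item[0])))
    return nearest[1]

# Toy training set with made-up texts and labels.
training = [("aaaaaa bbbb", "A"), ("xyzxyzxyz", "B")]
print(classify("xyzxyz", training))  # prints B
```

To classify with the Dice measure instead, the same loop would pick the training text with the largest `dice` value rather than the smallest `manhattan` value.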

20.
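Taking as its sample the papers published in Scientometrics and Journal of Informetrics from 2003 to 2012 and the papers in the proceedings of the International Conference on Scientometrics and Informetrics (ISSI conference), this study compares the development of scientometrics and informetrics in mainland China and Taiwan. The study covers four aspects: bibliometric analysis of papers, citation analysis, collaboration, and research content. The results show that mainland China and Taiwan have become major producers of scientometrics and informetrics papers internationally, but the average annual citations per paper for mainland China are slightly below the world average, and its AR-index is lower than Taiwan's. Scholars from the two regions have produced co-authored papers, but collaboration is limited to a few scholars and a few institutions. The research objects and methods of the two regions have common ground as well as distinctive features: mainland scholars place more emphasis on scientometrics, and Taiwan scholars on technology measurement. Finally, several issues that need in-depth discussion in order to promote the development of scientometrics and informetrics in the two regions are raised.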
