20 similar documents found (search time: 15 ms)
林杰, 苗润生. 《情报学报》, 2020, 39(1): 68-80
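A topic map of professional social media covers the topics discussed in a forum and the relationships among them; it is valuable for applications such as identifying directions for professional product innovation and building indexes of professional knowledge. Drawing on deep learning and text mining, this paper proposes a method for constructing topic maps from professional social media. First, a Skip-Gram model is trained on professional social media text; the model's hidden-layer weights and its output predictions are used to obtain the semantic similarity and the contextual relatedness between words, respectively. Second, based on these two measures, an existing domain seed ontology vocabulary is expanded by adding words that are semantically similar or contextually adjacent, which provides high-quality domain vocabulary for topic extraction. Third, using the expanded vocabulary, an LDA topic model that incorporates ontology terms extracts topics and topic words from the text. Finally, a relatedness weight is defined from semantic similarity and contextual relatedness, and a graph model with spectral clustering yields the associations and hierarchy among topics and topic words. Experiments on an automotive forum corpus show that the purity of the topic words obtained is 20.2% higher than with LDA alone, and that the method presents the relationships between topics clearly and reasonably.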
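The first step above (semantic similarity from Skip-Gram hidden-layer weights) can be illustrated with a minimal sketch. It assumes that each row of the hidden-layer weight matrix is used as a word vector and that similarity is measured by cosine similarity; the abstract does not state the exact measure, and the three-dimensional vectors and the helper `most_similar` are hypothetical, for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_n=2):
    """Rank the other words by cosine similarity to `word` (hypothetical helper)."""
    target = embeddings[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical word vectors standing in for rows of the hidden-layer weights.
embeddings = {
    "engine": [0.9, 0.1, 0.0],
    "motor": [0.8, 0.2, 0.1],
    "paint": [0.0, 0.1, 0.9],
}

print(most_similar("engine", embeddings))
```

Words whose similarity to a seed ontology term exceeds some threshold would then be candidates for vocabulary expansion; the threshold itself is not given in the abstract.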

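The article aims to trace the evolution of health literacy research over the past 30-plus years and to reveal its research hotspots and research fronts. Papers on the topic "health literacy" indexed in the SCI and SSCI databases from 1990 to 2010 were taken as the data. The knowledge visualization software Citespace was used to draw a document co-citation network map and to identify the key node documents in the evolution of health literacy research; its hybrid document co-citation and noun-phrase network map, keyword clustering, and burst-term detection functions were used to analyze research hotspots and fronts. Eleven key node documents clearly show the evolutionary path of the field, and two distinct clusters and 19 burst terms characterize its hotspots and fronts. The article concludes that health literacy and mental health literacy are currently the two mainstream research areas, and that the meaning of health literacy, the readability of health materials, rapid assessment tools and methods, health education, emergencies, and chronic and infectious diseases, represented by AIDS, are the research fronts and development trends of health literacy.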

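[Purpose/Significance] Accurate identification of research fronts is a strategic need at the national macro level, and bibliometrics, as a quantitative method, is widely used for detecting scientific topics and identifying research fronts. [Method/Process] The paper reviews the development and the methodological models of research-front topic detection, introduces the concept of the global micro model, and describes in detail the topic creation method used by the SciVal module, including direct-citation clustering of documents, keyword-based topic naming, and the topic prominence algorithm used to select research fronts. It then empirically analyzes the characteristics of the 96,000 topics created by SciVal and of the top 1% selected as research-front topics. [Results/Conclusions] The global micro model can identify all topics across the whole of science in a single pass, but disciplines differ in how they perform on research fronts, and topic prominence should not simply be equated with importance. The number of papers in a topic is moderately correlated with the topic's rank. Automatically extracted keyword terms name and describe topics at the level of the disciplinary field and in terms of distinctiveness. Trend analysis of graphene-related front topics can be used to find key nodes and emerging topics.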

Research topic studies have gained popularity in many disciplines, including library and information science (LIS). However, the lack of representation of library science and librarianship in the literature indicates a research bias caused by preset methodological parameters, which are commonly based on impact factor scores in Thomson Reuters' Journal Citation Reports. In this study, the authors use an improved journal selection criterion and an author-supplied keyword clustering and analysis technique to study the most recent ten years of LIS journal publications. This article presents a clear picture of popular research topics in seminal literature to help practicing librarians and library science scholars better understand and anticipate research trends in the LIS field.

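Ten Years of Digital Library Research in China: An Analysis Based on Research Projects   Total citations: 1; self-citations: 0; citations by others: 1
Using 504 funded research projects in the digital library field in China from 2000 to 2010 as the data, the article summarizes the basic characteristics of the projects from three angles (annual number of projects funded, regional distribution, and funding source) and analyzes project topics and research trends from four angles (time-series analysis of research topics, analysis of topic keywords, keyword-based analysis of research hotspots, and keyword-based analysis of research trends).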

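Starting from the normalized intersection of the two disciplines' keywords, the paper introduces hierarchical cluster analysis and strategic diagram analysis as quantitative methods and, together with a cohesion index for naming clusters, draws a clustering dendrogram and a strategic diagram of high-frequency shared keywords to explore the internal connections and development of interdisciplinary research hotspots. It reaches some useful conclusions about the topic division, structural characteristics, and evolution of the hotspot areas where library and information science intersects with journalism and communication.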

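Analysis of Research Trends in Library and Information Science Based on Co-occurring Keyword Statistics   Total citations: 3; self-citations: 1; citations by others: 2
Taking the 142,303 keywords of 40,634 articles in 17 core library and information science journals as the object of study, and using keyword pairs, relative word-frequency statistics, and a theoretical keyword-pair matrix, the paper analyzes the distribution of hot topics in core Chinese library and information science research over the past ten years and how it has changed. On that basis it identifies the core research topics, the temporal characteristics of high-frequency keyword pairs, and some topics that remain unexplored.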
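Counting keyword pairs is the basic operation behind the co-occurrence statistics described above. The sketch below is a minimal illustration, assuming each article is represented by its list of author keywords and that a pair is counted once per article; the function name `count_keyword_pairs` and the sample data are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def count_keyword_pairs(papers):
    """Count how many papers each unordered keyword pair co-occurs in."""
    pair_counts = Counter()
    for keywords in papers:
        # De-duplicate and sort so that (a, b) and (b, a) are the same pair.
        unique = sorted(set(keywords))
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical keyword lists, one per article.
papers = [
    ["digital library", "information retrieval", "metadata"],
    ["digital library", "metadata"],
    ["information retrieval", "search engine"],
]

pairs = count_keyword_pairs(papers)
for pair, count in pairs.most_common(3):
    print(pair, count)
```

Pairs that never appear in the counter are candidates for the "unexplored topics" mentioned in the abstract, although the paper's theoretical keyword-pair matrix may define them differently.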

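[Purpose/Significance] Library science, information science, and archival science are important disciplines for training a new type of information and knowledge professional. This paper reveals the research hotspots and trends of Chinese and international journal articles in these three disciplines during the 13th Five-Year Plan period, in order to offer some ideas for future research in China, help advance the disciplines, and strengthen their academic voice. [Method/Process] CNKI (China) and Web of Science (international) serve as data sources, and core journal articles in the three disciplines from 2016-2020 form the sample. Visualization software including Citespace, VOSviewer, and Gephi is used to reveal the hotspots and trends in Chinese- and foreign-language journals through publication and citation counts, keyword co-occurrence, and time-zone distribution. [Results/Conclusions] During the period, the number of Chinese journal articles declined slightly year by year while the number of foreign-language journal articles rose, and research in the field became increasingly international. Hotspots in both sets of journals show diversified research topics, progressively deeper research on "library"-related topics, and article topics that closely follow the discipline's hotspots and developments, each set with its own emphasis. Overall, the trends are persistence in and deepening of traditional research areas, cross-fertilization with other disciplines, and expansion into areas that concern the general public.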

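Keyword analysis of academic journals is important for grasping a discipline's topics and structure. This study uses network embedding to extract higher-order information from a large keyword association network and applies a clustering algorithm to visualize keyword topics in the "Library Science; Information Science" discipline. It first characterizes the local clustering and global distribution of keywords, then analyzes the topics whose popularity persisted, rose, or declined over the most recent four years, and finally compares domestic and international keyword association networks to reveal the similarities and differences between Chinese and international research hotspots.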

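Traditional expert identification systems mostly represent an expert's expertise with a set of weighted keywords, but such keyword-based descriptions are insufficient to capture the expert's research topics. The paper proposes an expertise representation based on domain ontology concepts: a domain ontology is built to describe the core concepts of the domain and the relations between them, the Google distance is used to compute the semantic similarity between keywords and ontology concepts, and keywords are mapped to concepts, which yields an ontology-concept-based representation of expertise.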
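The abstract mentions only "Google distance". Assuming this refers to the Normalized Google Distance of Cilibrasi and Vitányi, the measure can be sketched as below. The hit counts are hypothetical, and how the paper turns the distance into a similarity score is not stated, so that step is left out.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance between two terms.

    fx, fy: number of pages containing each term.
    fxy:    number of pages containing both terms.
    n:      total number of indexed pages (must exceed fx and fy).
    """
    if n <= min(fx, fy):
        raise ValueError("n must be larger than the individual hit counts")
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # terms that never co-occur are infinitely far apart
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Hypothetical hit counts for a keyword and an ontology concept.
print(normalized_google_distance(fx=1000, fy=1000, fxy=10, n=10**6))
```

A smaller distance means the keyword and the concept co-occur more often than chance would suggest, so a keyword would be mapped to the concept with the smallest distance under this assumption.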

Identifying research fronts is an essential aspect of promoting scientific development. Many researchers choose their research directions and topics by analyzing their field's current research fronts. Many previous researchers have used academic papers or patents to identify research fronts; however, such sources are potentially outdated, which reduces the prospective value of research front detection. Considering this, this work proposes adapted indicators to conduct research front topic detection based on research grant data, with the aim of identifying research front topics and forecasting trends using path analysis. First, research topics were identified using topic modeling, and then the mapping relations from topics to both fund projects and cross-domain categories were built. Then, research front topics were detected by multi-dimensional measurements, and the evolution of research topics was analyzed using topic evolution visualization to predict development trends. Finally, the Brillouin index was used to measure the cross-domain degree. Our method was evaluated using a dataset from the field of health informatics and was shown to be effective in research front identification. We found that the proposed adapted indicators were informative in identifying the evolutionary trends in the health informatics field. In addition, research grants with higher cross-domain degrees are more likely to receive a high amount of funding.
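The Brillouin index named above is a standard diversity measure, H = (ln N! - Σ ln n_i!) / N, where n_i is the count in category i and N is the total. The sketch below computes it for a grant's distribution over cross-domain categories; the category counts are hypothetical, and how the paper assigns grants to categories is not reproduced here.

```python
import math


def brillouin_index(counts):
    """Brillouin diversity index H = (ln N! - sum(ln n_i!)) / N."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    # lgamma(k + 1) equals ln(k!), which avoids computing huge factorials.
    log_total_factorial = math.lgamma(total + 1)
    log_part_factorials = sum(math.lgamma(c + 1) for c in counts)
    return (log_total_factorial - log_part_factorials) / total


# Hypothetical counts of one grant's topics across three domain categories.
print(brillouin_index([4, 3, 3]))
# A grant confined to a single category has no cross-domain diversity.
print(brillouin_index([10]))
```

Higher values indicate that a grant is spread more evenly over more categories, which is the sense in which the index can serve as a cross-domain degree.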

This article sought to investigate the evolution of library and information science by tracking the author-supplied keywords in the research articles published in the domain between 1971 and 2015. Data was extracted from Thomson Reuters' mainstream citation indexes and analysed using the VOSviewer software to obtain author-supplied keyword frequencies in each decade since 1971. We identified the most salient and common research themes in LIS and how the themes have evolved, using the author-supplied keywords as a proxy for research themes in the field. Results indicate that the field of LIS has evolved in terms of its subject focus from information systems design and management in the 1970s to scientific communication, information storage and retrieval, information access, information and knowledge management, and user education in 2015. The application of ICTs in LIS practice and education, too, has emerged as a prominent topic in the field. These issues have the potential to shape, or have already shaped, the LIS curriculum in some LIS schools in the continent.

The number of received citations has been used as an indicator of the impact of academic publications. Developing tools to find papers that have the potential to become highly cited has recently attracted increasing scientific attention. Topics of concern to scholars may change over time in accordance with research trends, resulting in changes in received citations. Author-defined keywords, titles, and abstracts provide valuable information about a research article. This study applies latent Dirichlet allocation to extract topics and keywords from articles; five keyword popularity (KP) features are defined as indicators of emerging trends of articles. Binary classification models built with several supervised learning techniques are used to predict whether papers are highly cited or less highly cited. We empirically compare KP features of articles with other commonly used journal-related and author-related features proposed in previous studies. The results show that, with KP features, the prediction models are more effective than those with journal and/or author features, especially in the management information system discipline.

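On Determining Index Keywords for Web Pages in Search Engines   Total citations: 2; self-citations: 0; citations by others: 2
The paper discusses how search engines apply the traditional weighted term-frequency algorithm when indexing online information with keywords and the factors that affect keyword weights, points out that a post-controlled vocabulary is an effective way to improve the performance of keyword language, and finally proposes a new post-controlled vocabulary based on the logical NOT operation to improve search engine precision.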

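Extracting and analyzing a domain's important keywords and their evolution patterns is important for exploring and predicting the research focus and trends of domain knowledge. The paper uses eigendecomposition to extract the important structural components of a domain knowledge network, and extracts and analyzes important keywords from the perspective of the network's global structural relations. The results show that, from this perspective, a domain always retains some important keywords that stay constant; the associations among these constant keywords are sparse and include knowledge links with structural-hole characteristics; and newly emerging important keywords follow an emergence pattern of first becoming important structures and then becoming association cores.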

While past research has shown that learning outcomes can be influenced by the amount of effort students invest during the learning process, there has been little research into this question for scenarios where people use search engines to learn. In fact, learning-related tasks represent a significant fraction of the time users spend using Web search, so methods for evaluating and optimizing search engines to maximize learning are likely to have broad impact. Thus, we introduce and evaluate a retrieval algorithm designed to maximize educational utility for a vocabulary learning task, in which users learn a set of important keywords for a given topic by reading representative documents on diverse aspects of the topic. Using a crowdsourced pilot study, we compare the learning outcomes of users across four conditions corresponding to rankings that optimize for different levels of keyword density. We find that adding keyword density to the retrieval objective gave significant learning gains on some topics, with higher levels of keyword density generally corresponding to more time spent reading per word, and stronger learning gains per word read. We conclude that our approach to optimizing search ranking for educational utility leads to retrieved document sets that ultimately may result in more efficient learning of important concepts.

Several studies have reported on metrics for measuring the influence of scientific topics from different perspectives; however, current ranking methods ignore the reinforcing effect of other academic entities on topic influence. In this paper, we developed an effective topic ranking model, 4ERRank, by modeling the influence transfer mechanism among all academic entities in a complex academic network using a four-layer network design that incorporates the strengthening effect of multiple entities on topic influence. The PageRank algorithm is used to calculate the initial influence of topics, papers, authors, and journals in a homogeneous network, whereas the HITS algorithm is used to express the mutual reinforcement between topics, papers, authors, and journals in a heterogeneous network, iteratively calculating the final topic influence value. Using social media data, a specific interdisciplinary domain, we applied the 4ERRank model to the 19,527 topics that met the inclusion criteria. The experimental results demonstrate that the 4ERRank model can successfully synthesize the performance of classic co-word metrics and effectively reflect highly cited topics. This study enriches the methodology for assessing topic impact and contributes to the development of future topic-based retrieval and prediction tasks.
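The homogeneous-network step above relies on PageRank. The following is a minimal power-iteration sketch of standard PageRank only; it does not reproduce the paper's four-layer network or its HITS-based reinforcement step, and the small citation graph, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    links maps each node to the list of nodes it points to.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same baseline share from random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: topics B and C both point to topic A.
scores = pagerank({"A": [], "B": ["A"], "C": ["A"]})
print(max(scores, key=scores.get))
```

In the model described above, scores of this kind would serve only as initial influence values that the heterogeneous reinforcement step then updates.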

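Topic-Level Analysis of Interdisciplinarity Types Using Citation Content   Total citations: 1; self-citations: 0; citations by others: 1
[Purpose/Significance] Macro-level studies of interdisciplinarity cannot characterize interdisciplinary topics, and micro-level studies are still at the topic-mining stage. This paper addresses the calculation of topic-level interdisciplinarity from the content perspective and builds a quantitative standard for classifying interdisciplinarity. [Method/Process] First, academic papers are collected and their citation content is parsed, and a term set is used to obtain terms and term topics. Next, the repetition rate of topic terms in the citation content is counted. Then the topic interdisciplinarity degree between disciplines is calculated. Finally, classification and analysis are carried out based on the distribution entropy of the topic interdisciplinarity degree. [Results/Conclusions] The results show that: (1) six disciplines find it hard to intersect with medicine at the level of practical, applied knowledge, whereas the theoretical foundations of medicine clearly intersect with the knowledge of the six disciplines; (2) there are three types of interdisciplinarity: within-boundary, tool-type, and cross-boundary. In sum, terms in citation content can be used to calculate topic interdisciplinarity effectively and to study interdisciplinarity types quantitatively.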
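The abstract does not define its "distribution entropy". As a rough illustration, the sketch below assumes it is the Shannon entropy of the interdisciplinarity degrees after normalizing them to sum to 1; the function name and the sample values are hypothetical.

```python
import math


def distribution_entropy(values):
    """Shannon entropy (in bits) of non-negative values normalized to sum to 1."""
    positive = [v for v in values if v > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    probabilities = [v / total for v in positive]
    return -sum(p * math.log2(p) for p in probabilities)


# Hypothetical interdisciplinarity degrees of one discipline across topics.
print(distribution_entropy([0.5, 0.3, 0.2]))
# Intersection concentrated in a single topic gives zero entropy.
print(distribution_entropy([1.0, 0.0, 0.0]))
```

Under this assumption, low entropy means the intersection is concentrated in a few topics and high entropy means it is spread across many, which is one way such a measure could separate types of interdisciplinarity.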

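Accurately studying and measuring the logical relationships and structure of scientific knowledge is an important basis for research management activities such as science policy research and the planning of research funding. Academic journals, as an important platform for disseminating and exchanging scientific knowledge, are an effective vehicle for probing the structure of scientific knowledge, but different journal classification systems directly and broadly affect the measurement results. Starting from journal clustering and taking into account the distance factor in journal co-citation, the article uses a deep learning algorithm to compute journal similarity and cluster journals, and on that basis studies methods for measuring the structure of scientific knowledge, with an empirical study on a citation database of Chinese humanities and social sciences journals. The results show that the knowledge structure of China's humanities and social sciences is fairly clearly divided: journals from different disciplinary categories or research fields were assigned to corresponding groups, which indicates that, from the perspective of journal use, the boundaries of this knowledge structure are relatively clear. The article then mines the research topics of two groups of law journals. The keyword co-occurrence networks show clearly that, although the research topics of the two groups overlap to some extent, they also differ markedly in specific research content.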

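[Purpose/Significance] To study the topical characteristics of articles published by Chinese library and information science scholars in authoritative international journals. [Method/Process] The keywords of the literature are analyzed as a whole, the period is divided into three stages according to changes in publication volume, and the overall research hotspots are identified. The stage with the most publications and the fastest growth is then divided more finely, and keyword co-occurrence is used to analyze how research topics change over time. [Results/Conclusions] Chinese scholars' research topics in international venues are concentrated and show little diversity; over the past ten years, scientometrics and social network topics have developed rapidly, and knowledge organization and management topics have remained consistently popular.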
