Similar documents
 20 similar documents found; search time 31 ms
1.
Simon J 《Endeavour》2002,26(4):132-136
During the 18th century, mineralogy constituted an integral part of natural history, sharing the concerns of botany and zoology over collection and classification. In Paris, many people owned private mineral collections, but these have been largely neglected by historians. Here, I examine the place of private collections in the history of mineralogy, arguing that they contributed socially, economically and intellectually to the field in a period before the dominance of the large national collection. I also show how the interests of private collectors diverged from those of the curators of public collections, particularly following the French Revolution.

2.
The currently used Basic-Applied research classification schemes and the criteria used to define them: (a) do not really identify different types of research activity but rather motivation of performers and sponsors or the expected generality of results; (b) generate relatively ambiguous statistical data and (c) generate great difficulties among those who have to produce or collect the information. It is suggested that this particular mode of research classification is not optimally responsive to the needs of policy makers, who are primarily interested in the utility of results, and that this utility is related to different operational types of research. A substitute broad research activity taxonomy is proposed based on a single criterion - degree of external intellectual constraints. On this basis, a simplified operational mode of statistical data collection is proposed. It is illustrated how science policy formulation, especially in the areas of resource allocation and organization, would be facilitated by use of these more suitable concepts and the resulting improved quantitative information.

3.
Transductive classification is a useful way to classify texts when labeled training examples are insufficient. Several algorithms to perform transductive classification considering text collections represented in a vector space model have been proposed. However, the use of these algorithms is unfeasible in practical applications due to the independence assumption among instances or terms and the drawbacks of these algorithms. Network-based algorithms come up to avoid the drawbacks of the algorithms based on vector space model and to improve transductive classification. Networks are mostly used for label propagation, in which some labeled objects propagate their labels to other objects through the network connections. Bipartite networks are useful to represent text collections as networks and perform label propagation. The generation of this type of network avoids requirements such as collections with hyperlinks or citations, computation of similarities among all texts in the collection, as well as the setup of a number of parameters. In a bipartite heterogeneous network, objects correspond to documents and terms, and the connections are given by the occurrences of terms in documents. The label propagation is performed from documents to terms and then from terms to documents iteratively. Nevertheless, instead of using terms just as means of label propagation, in this article we propose the use of the bipartite network structure to define the relevance scores of terms for classes through an optimization process and then propagate these relevance scores to define labels for unlabeled documents. The new document labels are used to redefine the relevance scores of terms which consequently redefine the labels of unlabeled documents in an iterative process. We demonstrated that the proposed approach surpasses the algorithms for transductive classification based on vector space model or networks. 
Moreover, we demonstrated that the proposed algorithm makes effective use of unlabeled documents to improve classification and that it is faster than other transductive algorithms.
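
The document-to-term and term-to-document propagation described in this abstract can be illustrated with a minimal sketch. It shows only plain label propagation over a bipartite document-term network, not the optimization-based relevance scoring the authors propose. The function name `propagate_labels`, its parameters, and the normalisation step are assumptions made for illustration, and every labeled document is assumed to appear in `doc_terms`.

```python
from collections import defaultdict


def propagate_labels(doc_terms, seed_labels, iterations=10):
    """Plain bipartite label propagation (hypothetical helper, not the paper's algorithm).

    doc_terms:   dict doc_id -> dict term -> occurrence count (the network edges)
    seed_labels: dict doc_id -> class label for the labeled documents
    Returns dict doc_id -> predicted label for every document that received a score.
    """
    classes = sorted(set(seed_labels.values()))
    # Labeled documents start with a one-hot class distribution.
    doc_scores = {
        d: {c: 1.0 if seed_labels[d] == c else 0.0 for c in classes}
        for d in seed_labels
    }

    for _ in range(iterations):
        # Step 1: documents -> terms, weighted by term occurrences.
        term_scores = defaultdict(lambda: dict.fromkeys(classes, 0.0))
        for d, scores in doc_scores.items():
            for t, w in doc_terms[d].items():
                for c in classes:
                    term_scores[t][c] += w * scores[c]
        # Normalise each term's scores into a class distribution.
        for scores in term_scores.values():
            total = sum(scores.values())
            if total > 0:
                for c in classes:
                    scores[c] /= total

        # Step 2: terms -> documents; labeled documents keep their labels.
        new_doc_scores = {}
        for d, terms in doc_terms.items():
            if d in seed_labels:
                new_doc_scores[d] = doc_scores[d]
                continue
            scores = dict.fromkeys(classes, 0.0)
            for t, w in terms.items():
                if t in term_scores:
                    for c in classes:
                        scores[c] += w * term_scores[t][c]
            total = sum(scores.values())
            if total > 0:
                new_doc_scores[d] = {c: s / total for c, s in scores.items()}
        doc_scores = new_doc_scores

    # Pick the highest-scoring class for each scored document.
    return {d: max(classes, key=lambda c: s[c]) for d, s in doc_scores.items()}
```

Clamping the labeled documents on every pass is one common design choice; without it, the seed labels could drift as scores flow back from the terms.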

4.
Users of the Internet are stripped of voice inflections, body language, and other common cues of conversation; only their words are left. Some claim that the lack of these social cues and the lack of hierarchy in the structure of the Internet provide the potential for equality in cyberspace. Many others have shown, though, that the issues of power in cyberspace are similar to the issues of power in physical space. This article examines an intersection of feminism and cyberspace in the ethos of online discussion. It is a rhetorical analysis of two popular feminist newsgroups, alt.feminism and soc.feminism. Do these newsgroups create a feminist and inclusive space online? What are the rhetorical strategies that make an online space more or less inclusive of women? Usenet newsgroups reveal the rhetorical power of these bare words. Although no formal means of discrimination is built into Usenet newsgroup discussions, discrimination does occur through the subtle and not-so-subtle use of language. This article looks at how various characteristics of language are used on the two newsgroups. Though the ethos on such discussion forums is dynamic, the analysis reveals examples of how sarcastic questioning, strong assertions, accusatory disagreements, and sexist comments can create a hostile and noninclusive ethos.

5.
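[Purpose/Significance] Unlike collections of literature resources, the organization of online audio resource collections is more strongly personalized. Revealing user preferences can both extend what is known about the behavioral patterns of organizing digital resource collections and help improve online audio resource services. [Method/Process] User-organized audio resource collections from a representative online audio sharing platform were selected as the sample, and the high-frequency terms in collection names were analyzed to explore the logic and organizational preferences of users when creating online audio resource collections. [Result/Conclusion] Whereas the organization of literature resource collections emphasizes document type, subject area and similar attributes, users creating online audio resource collections tend to give priority to emotional expression (internal attribution) and only then describe style, theme and language (external attribution).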

6.
In this paper we introduce the notion of content locality in distributed document collections. Content locality is the degree to which content-similar documents are colocated in a distributed collection. We propose two metrics for measurement of content locality, one based on topic signatures and the other based on collection statistics. We provide derivations and analysis of both metrics and use them to measure the content locality in two kinds of document collections, the well-known TREC corpus and the Networked Computer Science Technical Report Library (NCSTRL), an operational digital library. We also show that content locality can be thought of temporally as well as spatially and provide evidence of its existence in temporally ordered document collections like news feeds.

7.
Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on a hypothesis that a term’s role in accumulated retrieval sessions in the past affects its general importance regardless. It utilizes availability of past retrieval results consisting of the queries that contain a particular term, retrieved documents, and their relevance judgments. A term’s evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values for the relevant and non-relevant document distributions in the past are different. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results using standard test collections show that the proposed term weighting scheme improves on conventional TF*IDF and language model based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics based on TF*IDF. We also show how the proposed term weighting scheme based on the notion of evidential weights is related to the well-known weighting schemes based on language modeling and probabilistic models.
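
For readers unfamiliar with the TF*IDF baseline this abstract compares against, a minimal sketch follows. It covers only a textbook TF*IDF variant (raw term frequency times the natural log of inverse document frequency) and does not implement the evidential weights, which the abstract does not specify in enough detail. The function name `tf_idf` and the tokenised input format are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Textbook TF*IDF (hypothetical helper, not the evidential weighting scheme).

    documents: list of token lists.
    Returns a list with one dict (term -> weight) per document.
    """
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))

    weights = []
    for tokens in documents:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append(
            {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
        )
    return weights
```

Under this variant a term that occurs in every document receives weight zero, which is the kind of purely collection-based statistic the evidential weights are said to complement.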

8.
In this paper, we aim to improve query expansion for ad-hoc retrieval, by proposing a more fine-grained term reweighting process. This fine-grained process uses statistics from the representation of documents in various fields, such as their titles, the anchor text of their incoming links, and their body content. The contribution of this paper is twofold: First, we propose a novel query expansion mechanism on fields by combining field evidence available in a corpus. Second, we propose an adaptive query expansion mechanism that selects an appropriate collection resource, either the local collection, or a high-quality external resource, for query expansion on a per-query basis. The two proposed query expansion approaches are thoroughly evaluated using two standard Text Retrieval Conference (TREC) Web collections, namely the WT10G collection and the large-scale .GOV2 collection. From the experimental results, we observe a statistically significant improvement compared with the baselines. Moreover, we conclude that the adaptive query expansion mechanism is very effective when the external collection used is much larger than the local collection.

9.
It is a common assumption that digital technology stores and retrieves text differently than physical libraries do. Websites and digitized text uploaded on them are retrieved via algorithms, tied together by links, and so on, whereas physical libraries are structured by classification schemes, catalogs, and indexes. Nevertheless, looking further back in history, it can be argued that digital technology in fact makes us store and retrieve text as humanity did to begin with, before the invention of the printing press. This article argues how.

10.
We present an efficient document clustering algorithm that uses a term frequency vector for each document instead of using a huge proximity matrix. The algorithm has the following features: (1) it requires a relatively small amount of memory and runs fast, (2) it produces a hierarchy in the form of a document classification tree and (3) the hierarchy obtained by the algorithm explicitly reveals a collection structure. We confirm these features and thus show the algorithm's feasibility through clustering experiments in which we use two collections of Japanese documents, the sizes of which are 83,099 and 14,701 documents. We also introduce an application of this algorithm to a document browser. This browser is used in our Japanese-to-English translation aid system. The browsing module of the system consists of a huge database of Japanese news articles and their English translations. The Japanese article collection is clustered into a hierarchy by our method. Since each node in the hierarchy corresponds to a topic in the collection, we can use the hierarchy to directly access articles by topic. A user can learn general translation knowledge of each topic by browsing the Japanese articles and their English translations. We also discuss techniques of presenting a large tree-formed hierarchy on a computer screen.

11.
Queries submitted to search engines can be classified according to the user goals into three distinct categories: navigational, informational, and transactional. Such classification may be useful, for instance, as additional information for advertisement selection algorithms and for search engine ranking functions, among other possible applications. This paper presents a study about the impact of using several features extracted from the document collection and query logs on the task of automatically identifying the users’ goals behind their queries. We propose the use of new features not previously reported in literature and study their impact on the quality of the query classification task. Further, we study the impact of each feature on different web collections, showing that the choice of the best set of features may change according to the target collection.

12.
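In building digital libraries, resource collections are everywhere, and establishing a metadata standard for them is the basis for providing online services and data exchange for resource collections. This paper analyzes the concept of a resource collection, existing metadata standards for resource collections, and the relationship between resource collections and resource objects. It argues that a resource collection is itself a kind of resource object, and therefore proposes analyzing resource collections with metadata depth, a method applicable to metadata standards for resource objects, which simplifies the establishment of metadata standards for resource collections.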

13.
In order to organise and manage geospatial and georeferenced information on the Web making them convenient for searching and browsing, a digital portal known as G-Portal has been designed and implemented. Compared to other digital libraries, G-Portal is unique for several of its features. It maintains metadata resources in XML with flexible resource schemas. Logical groupings of metadata resources as projects and layers are possible to allow the entire metadata collection to be partitioned differently for users with different information needs. These metadata resources can be displayed in both the classification-based and map-based interfaces provided by G-Portal. G-Portal further incorporates both a query module and an annotation module for users to search metadata and to create additional knowledge for sharing respectively. G-Portal also includes a resource classification module that categorizes resources into one or more hierarchical category trees based on user-defined classification schemas. This paper gives an overview of the G-Portal design and implementation. The portal features will be illustrated using a collection of high school geography examination-related resources.

14.
Conglomerates as a general framework for informetric research   Total citations: 2, self-citations: 0, citations by others: 2
We introduce conglomerates as a general framework for informetric (and other) research. A conglomerate consists of two collections: a finite source collection and a pool, and two mappings: a source-item map and a magnitude map. The ratio of the sum of all magnitudes of item-sets, and the number of elements in the source collection is called the conglomerate ratio. It is a kind of average, generalizing the notion of an impact factor. The source-item relation of a conglomerate leads to a list of sources ranked according to the magnitude of their corresponding item-sets. This list, called a Zipf list, is the basic ingredient for all considerations related to power laws and Lotkaian or Zipfian informetrics. Examples where this framework applies are: impact factors, including web impact factors, Bradford–Lotka type bibliographies, first-citation studies, word use, diffusion factors, elections and even bestsellers lists.
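
The conglomerate ratio and the Zipf list defined in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: the source-item map is represented as a dictionary from each source to its item-set, the magnitude map defaults to the size of the item-set, and the names `conglomerate_ratio` and `zipf_list` are hypothetical.

```python
def conglomerate_ratio(item_sets, magnitude=len):
    """Sum of item-set magnitudes divided by the number of sources.

    item_sets: dict source -> collection of items (assumed form of the source-item map)
    magnitude: function from an item-set to a number (assumed form of the magnitude map)
    """
    if not item_sets:
        raise ValueError("the source collection must not be empty")
    return sum(magnitude(items) for items in item_sets.values()) / len(item_sets)


def zipf_list(item_sets, magnitude=len):
    """Sources ranked by decreasing magnitude of their item-sets.

    Returns a list of (rank, source, magnitude) tuples; ties keep input order.
    """
    ranked = sorted(
        item_sets.items(), key=lambda pair: magnitude(pair[1]), reverse=True
    )
    return [
        (rank, source, magnitude(items))
        for rank, (source, items) in enumerate(ranked, start=1)
    ]


# Illustrative data: three articles (sources) and the papers citing them (items).
citations = {"a1": ["c1", "c2", "c3"], "a2": ["c4"], "a3": []}
ratio = conglomerate_ratio(citations)  # (3 + 1 + 0) / 3
ranking = zipf_list(citations)         # a1 first, then a2, then a3
```

With articles as sources and citing papers as items, the ratio is an average number of citations per article, which is the sense in which the abstract says it generalizes an impact factor.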

15.
Many user-centred studies of electronic information resources include a think-aloud element – where users are asked to verbalise their thoughts, interface actions and sometimes their feelings whilst using these resources to help them complete one or more information tasks. These studies are usually conducted with the purpose of identifying usability issues related to the resource(s) used or understanding aspects of users’ information behaviour. However, few of these studies present detailed accounts of how their think-aloud data was collected and analysed or provide detailed reflection on methodological decisions made. In this article, we discuss and reflect on the methodology used when planning and conducting a think-aloud study of lawyers’ interactive information behaviour. Our discussion is framed by Blandford et al.’s PRET A Rapporter (‘ready to report’) framework – a framework that can be used to plan, conduct and describe user-centred studies of electronic information resource use from an information work perspective.

16.
This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for categorization of documents consisting of a collection of fields, or arbitrary tree-structured documents that can be adequately modeled with such a flat structure. The approaches range from trivial modifications of text modeling to more elaborate schemes, specifically tailored to structured documents. We combine these methods with three different text classification algorithms and evaluate their performance on four standard datasets containing different types of semi-structured documents. The best results were obtained with stacking, an approach in which predictions based on different structural components are combined by a meta classifier. A further improvement of this method is achieved by including the flat text model in the final prediction.

17.
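Research on optimizing classification systems for network information resources   Total citations: 2, self-citations: 1, citations by others: 2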
刘红泉 《现代情报》2006,26(7):26-28,31
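Searching by classification is a common way for people to retrieve network information resources, and different network classification systems each have their own characteristics in category structure and resource selection. This paper analyzes the basic principles and design rules for optimizing network classification systems, and proposes a basic design approach for classification systems of network information resources together with guiding principles and measures for structural optimization.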

18.
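On the integration of resources between university libraries and departmental reference rooms   Total citations: 3, self-citations: 0, citations by others: 3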
倪红华  周雪伟  左文革 《现代情报》2009,29(7):135-137,143
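The paper objectively analyzes the multiple factors behind the increasingly prominent mismatch between document supply and demand in university libraries. Taking into account the characteristics, strengths and existing problems of departmental reference rooms in universities, it explains why integrating the resources of university libraries and departmental reference rooms is necessary. From the perspectives of management mechanism, management model, resource development and reader services, it proposes goals and an implementation plan for building a university-wide document resource guarantee system and achieving on-campus joint development and sharing of resources.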

19.
20.
Our lives are increasingly intertwined with the digital realm, and with new technology, new ethical problems emerge. The academic field that addresses these problems—which we tentatively call ‘digital ethics’—can be an important intellectual resource for policy making and regulation. This is why it is important to understand how the new ethical challenges of a digital society are being met by academic research. We have undertaken a scientometric analysis to arrive at a better understanding of the nature, scope and dynamics of the field of digital ethics. Our approach in this paper shows how the field of digital ethics is distributed over various academic disciplines. By first having experts select a collection of keywords central to digital ethics, we have generated a dataset of articles discussing these issues. This approach allows us to generate a scientometric visualisation of the field of digital ethics, without being constrained by any preconceived definitions of academic disciplines. We have first of all found that the number of publications pertaining to digital ethics is exponentially increasing. We furthermore established that whereas one may expect digital ethics to be a species of ethics, we in fact found that the various questions pertaining to digital ethics are predominantly being discussed in computer science, law and biomedical science. It is in these fields, more than in the independent field of ethics, that ethical discourse is being developed around concrete and often technical issues. Moreover, it appears that some important ethical values are very prominent in one field (e.g., autonomy in medical science), while being almost absent in others. 
We conclude that to get a thorough understanding of, and grip on, all the hard ethical questions of a digital society, ethicists, policy makers and legal scholars will need to familiarize themselves with the concrete and practical work that is being done across a range of different scientific fields to deal with these questions.

