Similar Documents
 20 similar documents found (search time: 31 ms)
1.
Problem-solving by everyday individuals is thought to occur as a two-step process: an individual first identifies or formulates a problem, then enters a subsequent search to find the best solution. Here, however, we consider an alternative process that everyday individuals may use for solution finding, first theorized by von Hippel and von Krogh (2016). Specifically, von Hippel and von Krogh proposed that everyday individuals may sometimes discover a solution and the need it satisfies simultaneously, without a priori problem formulation, a cognitive process they called "need-solution pair recognition". Drawing on a rich literature from psychology and neuroscience, we propose that the seemingly spontaneous discoveries made through need-solution pair recognition are natural products of the object recognition system and its underlying mechanisms. This view asserts that on encountering an object and reasoning about how it might be used (i.e. functional object understanding), an individual's perception of the object may culminate in recognizing it as a solution, and in some cases as a solution to a problem previously unknown to him or her, thus bypassing formal problem formulation and active solution searching entirely. To empirically test this view, we manipulated the ability of everyday individuals to reason functionally about objects while we examined the spontaneous occurrence of solutions found by either need-solution pair recognition or traditional problem-first problem-solving. Consistent with our hypothesized mechanism, our results indicate that need-solution pair recognition occurs more frequently when constraints on functional object understanding are reduced. That is, we found that need-solution pair discoveries outpaced solutions found through traditional problem-solving in environments with unfamiliar objects, where participants were not directed to solve specific problems.
Our results provide clear evidence that everyday individuals in the household sector do not always innovate through traditional problem-solving processes, but instead may arrive at solutions as they recognize and reason about objects. Implications for research and practice in household innovation, and for innovation more generally, are considered.

2.
Measuring the similarity between the semantic relations that hold between words is an important step in numerous natural language processing tasks such as answering word analogy questions, classifying compound nouns, and word sense disambiguation. Given two word pairs (A, B) and (C, D), we propose a method to measure the relational similarity between the semantic relations that exist between the two words in each pair. Typically, a high degree of relational similarity is observed between proportional analogies (i.e. analogies among the four words of the form "A is to B as C is to D"). We describe eight types of relational symmetries that are frequently observed in proportional analogies and use those symmetries to robustly and accurately estimate the relational similarity between two given word pairs. We use automatically extracted lexical-syntactic patterns to represent the semantic relations between two words, and then match those patterns in Web search engine snippets to find candidate words that form proportional analogies with the original word pair. The eight types of relational symmetries are used as features in a supervised learning approach, and we evaluate the proposed method on the Scholastic Aptitude Test (SAT) word analogy benchmark dataset. Our experimental results show that the proposed method can accurately measure relational similarity between word pairs by exploiting the symmetries in proportional analogies. The proposed method achieves an SAT score of 49.2% on the benchmark dataset, which is comparable to the best results reported on this dataset.
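The symmetry features the abstract describes can be illustrated by enumerating the orderings of a word pair's four terms under which a proportional analogy is commonly held to remain valid. This is a hedged sketch: the exact symmetry set and the example words are assumptions for illustration, not the paper's definitions.

```python
def symmetric_forms(a, b, c, d):
    """Enumerate eight orderings of (a, b, c, d) under which the proportional
    analogy "a is to b as c is to d" is commonly considered to stay valid:
    swapping the two pairs, inverting both relations, exchanging the middle
    terms, and combinations of these.
    """
    return [
        (a, b, c, d),  # original: a:b :: c:d
        (c, d, a, b),  # swap the two pairs
        (b, a, d, c),  # invert both relations
        (d, c, b, a),  # swap pairs and invert relations
        (a, c, b, d),  # exchange middle terms: a:c :: b:d
        (b, d, a, c),
        (c, a, d, b),
        (d, b, c, a),
    ]

# Hypothetical SAT-style analogy: ostrich:bird :: lion:mammal
forms = symmetric_forms("ostrich", "bird", "lion", "mammal")
```

A relational-similarity system could compute a pattern-match score for each of the eight forms and feed the eight scores to a supervised classifier as features.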

3.
We study several machine learning algorithms for cross-language patent retrieval and classification. In contrast with most other studies of machine learning for cross-language information retrieval, which apply learning techniques only to monolingual sub-tasks, our algorithms exploit bilingual training documents and learn a semantic representation from them. We study Japanese–English cross-language patent retrieval using Kernel Canonical Correlation Analysis (KCCA), a method for finding correlated linear relationships between two views in kernel-defined feature spaces. The results are quite encouraging and significantly better than those obtained by other state-of-the-art methods. We also investigate learning algorithms for cross-language document classification, based on KCCA and Support Vector Machines (SVM). In particular, we study two ways of combining KCCA and SVM and find that one particular combination, called SVM_2k, achieves better results than the other learning algorithms on both bilingual and monolingual test documents.
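The core of (K)CCA is finding maximally correlated directions across two views of the same items. A minimal sketch of the linear special case, assuming numpy and a toy two-view dataset (the paper's kernelized version would replace the covariance matrices with centered Gram matrices):

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-6):
    """Return the canonical correlations between data matrices X and Y
    (rows = paired samples). Whiten each view, then take the singular
    values of the cross-covariance in the whitened spaces."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])  # regularized covariances
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(M):
        w, V = np.linalg.eigh(M)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    K = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return np.linalg.svd(K, compute_uv=False)

# Two "views" generated from the same 2-dimensional latent signal,
# standing in for paired bilingual documents
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 3))
Y = Z @ rng.normal(size=(2, 4))
corrs = canonical_correlations(X, Y)
```

Because both views are exact linear functions of a shared two-dimensional signal, the top two canonical correlations come out near 1 and the third near 0.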

4.
In view of the sizeable climate change challenge, we need a clean innovation machine operating at full speed. Beyond the supply of public clean R&D infrastructure and clean public procurement, the development and adoption of new clean technologies by the private sector must be assured to reduce greenhouse gas (GHG) emissions. The private clean innovation machine, left on its own, is not up to this challenge. It needs government intervention to address the combination of environmental and knowledge externalities and to overcome path dependencies. The firm-level evidence presented in this contribution on the motives of private sector firms for introducing clean innovations, drawn from the latest Flemish CIS eco-innovation survey, confirms that firms are responsive to eco-policy demand interventions. At the same time, the high importance of demand pull from customers, and of voluntary codes of conduct or voluntary sector agreements, as drivers for introducing clean innovations is a reminder of the internal strength of the private innovation machine, which governments need to leverage. Policy interventions are shown to be more powerful in inducing the adoption and development of new clean technologies when designed as part of a time-consistent policy mix that shapes future expectations.

5.
Social media has become the most popular platform for free speech. This freedom has given the oppressed an opportunity to raise their voice against injustice, but it has also led to a disturbing trend of spreading hateful content of various kinds. Pakistan has been dealing with sectarian and ethnic violence for the last three decades, and there is now a growing volume of disturbing content about religion, sect, and ethnicity on social media. This creates the need for an automated system for detecting controversial content on social media in Urdu, the national language of Pakistan. The biggest hurdle that has thwarted Urdu language processing is the scarcity of language resources, annotated datasets, and pretrained language models. In this study, we address the problem of detecting interfaith, sectarian, and ethnic hatred on social media in the Urdu language using machine learning and deep learning techniques. In particular, we have: (1) developed and presented guidelines for annotating Urdu text with appropriate labels at two levels of classification, (2) developed a large dataset of 21,759 tweets using these guidelines and made it publicly available, and (3) conducted experiments comparing eight supervised machine learning and deep learning techniques for the automated identification of hateful content. In the first step, experiments treat hateful content detection as a binary classification task; in the second step, the classification of interfaith, sectarian, and ethnic hatred is performed as a multiclass classification task. Overall, Bidirectional Encoder Representations from Transformers (BERT) proved to be the most effective technique for identifying hateful content in Urdu tweets.
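The binary classification step can be sketched with a from-scratch multinomial Naive Bayes baseline. This is not the paper's BERT model, and the toy English phrases below are hypothetical stand-ins for the annotated Urdu tweets; the sketch only illustrates the supervised binary setup.

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing, from scratch."""

    def fit(self, docs, labels):
        n = len(labels)
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        v = len(self.vocab)
        def log_posterior(c):
            return self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / (self.totals[c] + v))
                for w in doc.lower().split())
        return max(self.classes, key=log_posterior)

# Hypothetical toy training data standing in for annotated tweets
train_docs = [
    "those people are evil and should leave",
    "we hate that group they ruin everything",
    "lovely weather for a walk today",
    "the match was great fun to watch",
]
train_labels = ["hateful", "hateful", "neutral", "neutral"]
clf = NaiveBayes().fit(train_docs, train_labels)
```

In the study's second step, the same setup would be rerun with multiclass labels (interfaith, sectarian, ethnic) on the tweets flagged as hateful.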

6.
With the advent of Web 2.0, many online platforms, such as social networks, online blogs, and magazines, produce massive amounts of textual data. This text carries information that can be used for the betterment of humanity, so there is a dire need to extract the potential information within it. This study presents an overview of approaches for extracting such valuable information nuggets from text and presenting them in a brief, clear, and concise way. In this regard, two major tasks are reviewed: automatic keyword extraction and text summarization. To compile the literature, scientific articles were collected from major digital computing research repositories. The survey covers approaches from early work all the way to recent advancements using machine learning solutions. The survey finds that annotated benchmark datasets for various textual data generators, such as Twitter and social forums, are not available; this scarcity of datasets has slowed progress in many domains. Applications of deep learning to automatic keyword extraction also remain relatively unaddressed, so the impact of various deep architectures stands as an open research direction. For text summarization, deep learning techniques have been applied since the advent of word vectors and currently govern the state of the art for abstractive summarization. One of the major current challenges in these tasks is semantics-aware evaluation of generated results.
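Among the early approaches the survey covers, the simplest keyword-extraction baseline is frequency counting over stopword-filtered tokens. A minimal stdlib sketch (the stopword list is a small illustrative assumption, not a standard resource):

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real systems use much larger ones
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is",
             "for", "on", "that", "with", "as", "are"}

def extract_keywords(text, k=5):
    """Return the k most frequent non-stopword terms as keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(k)]

text = ("Text summarization condenses documents. Keyword extraction finds "
        "salient terms. Summarization and extraction both rely on term "
        "statistics, and summarization is widely studied.")
keywords = extract_keywords(text, 3)
```

Graph-based (e.g., TextRank-style) and supervised methods reviewed in the literature refine exactly this kind of candidate ranking.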

7.
Innovation researchers have begun to look beyond how users develop tangible objects or product innovations, and have moved to investigate the existence and impact of intangible user-developed innovations in techniques and services in the household sector. In this paper, to incorporate technique and service innovations and other varieties of intangible innovation not yet described in the literature into an efficient and encompassing typology, we propose the new concept of intangible behavioral innovation as an overarching category that stands in contrast to tangible product innovation. Behavioral innovation is defined as one or a connected sequence of intangible problem-solving activities that provide a functionally novel benefit to the user developer relative to previous practice. In a pilot study using a relatively novel big-data-gathering and semantic analysis approach, we demonstrate that behavioral innovation exists and can be identified in user-generated content posted openly online in peer-to-peer discussion forums relating to household sector activities such as parenting. The majority (138 of 168) of the user innovations captured in our samples of discussion comments were intangible behavioral innovations, most of which were developed by women. Most of the behavioral innovations identified were diffused by their user developers in response to specific requests for help or advice from peers in their online community. Thus, incorporating the new concept of intangible behavioral innovation into studies of user innovation's scope and significance in the household sector can help clarify which users innovate in our communities of interest, what and how they innovate, why they are triggered to diffuse their innovations peer-to-peer, and how their innovative activities might impact social welfare.

8.
In a study of innovations developed by mountain bikers, we find that user-innovators almost always utilize “local” information - information already in their possession or generated by themselves - both to determine the need for and to develop the solutions for their innovations. We argue that this finding fits the economic incentives operating on users. Local need information will in general be the most relevant to user-innovators, since the bulk of their innovation-related rewards typically come from in-house use. User-innovators will increasingly tend to rely on local solution information as the stickiness of non-local solution information rises. When user-innovators do rely on local information, it may be possible to predict the general nature of the innovations they might develop.

9.
Nowadays, online word-of-mouth has an increasing impact on people's views and decisions, and the classification and sentiment analysis of online consumer reviews have attracted significant research attention. In this study, we propose and implement a new method for extracting and classifying comments about online dating services (ODS). Unlike traditional sentiment analysis, which mainly focuses on product attributes, we attempt to infer and extract the emotion concept of each emotional review by introducing social cognitive theory. We selected 4,300 comments with extremely negative or positive emotions published on dating websites as a sample, and used three machine learning algorithms to analyze their sentiment. To test and compare efficiency in studying user behavior, we applied various sentiment analysis and machine learning techniques as well as dictionary-based sentiment analysis. We found that combining machine learning with the lexicon-based method achieves higher accuracy than either type of sentiment analysis alone. This research provides a new perspective on the study of user behavior.
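The combination the study reports, lexicon scores blended with a machine-learned probability, can be sketched as follows. The word lists, the weighting scheme, and the `ml_prob` input are illustrative assumptions; the paper's actual lexicons and classifiers are not specified here.

```python
# Tiny illustrative polarity lexicons (real lexicons are much larger)
POSITIVE = {"great", "love", "wonderful", "amazing", "good"}
NEGATIVE = {"bad", "terrible", "awful", "hate", "poor"}

def lexicon_score(text):
    """Count positive minus negative lexicon hits."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def classify(text, ml_prob=None, weight=0.5):
    """Combine a binarized lexicon verdict with an optional machine-learned
    probability of positivity, using a simple weighted average."""
    lex = 1 if lexicon_score(text) > 0 else 0
    if ml_prob is None:
        return "positive" if lex else "negative"
    combined = weight * lex + (1 - weight) * ml_prob
    return "positive" if combined >= 0.5 else "negative"
```

With `weight` below 0.5, a confident classifier can override the lexicon, which is one simple way such hybrid systems gain accuracy over either component alone.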

10.
We study a fully automatic cross-language information retrieval method based on the KCCA algorithm, which learns a semantic (vector-space) representation of documents from a bilingual training corpus. Experiments applying KCCA to Chinese–English cross-language patent retrieval produced encouraging results, clearly better than those obtained with previous techniques.

11.
Today, given the vast amount of textual data, automated extractive text summarization is one of the most common and practical techniques for organizing information. Extractive summarization selects the most appropriate sentences from the text and provides a representative summary. Sentences, as individual textual units, are usually too short for major text processing techniques to perform well, so it seems vital to bridge the gap between short text units and conventional text processing methods. In this study, we propose a semantic method for implementing an extractive multi-document summarizer system using a combination of statistical, machine learning based, and graph-based methods. It is a language-independent and unsupervised system. The proposed framework learns the semantic representation of words from a set of given documents via the word2vec method. It expands each sentence, through an innovative method, with the most informative and least redundant words related to the main topic of the sentence. Sentence expansion implicitly performs word sense disambiguation and tunes the conceptual densities towards the central topic of each sentence. The framework then estimates the importance of sentences by using a graph representation of the documents. To identify the most important topics of the documents, we propose an inventive clustering approach that autonomously determines the number of clusters and their initial centroids, and clusters sentences accordingly. The system selects the best sentences from appropriate clusters for the final summary with respect to information salience, minimum redundancy, and adequate coverage.
A set of extensive experiments on the DUC2002 and DUC2006 datasets was conducted to investigate the proposed scheme. Experimental results showed that the proposed sentence expansion algorithm and clustering approach considerably enhance the performance of the summarization system. Comparative experiments also demonstrated that the proposed framework outperforms most state-of-the-art summarizer systems and can impressively assist the task of extractive text summarization.
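The graph-based importance estimation described above can be sketched with a word-overlap similarity graph where each sentence is scored by its total similarity to the rest of the document. This is a simplified stand-in for the paper's word2vec-expanded representation, using Jaccard overlap on raw tokens instead:

```python
import re

def summarize(text, n=2):
    """Rank sentences by cumulative word-overlap (Jaccard) similarity with
    the other sentences, and return the top n in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip())
                 if s.strip()]
    bags = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]

    def sim(a, b):
        return len(a & b) / (len(a | b) or 1)

    scores = [sum(sim(bags[i], bags[j]) for j in range(len(bags)) if j != i)
              for i in range(len(bags))]
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:n])
    return " ".join(sentences[i] for i in top)

text = ("The cat sat on the mat. The cat chased the dog. "
        "A bird flew away. The dog and the cat played.")
summary = summarize(text, n=1)
```

The off-topic sentence ("A bird flew away.") shares no vocabulary with the rest, scores zero, and is never selected, which is the intuition behind centrality-based extraction.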

12.
In this paper, we propose a new learning method for extracting bilingual word pairs from parallel corpora in various languages. In cross-language information retrieval, the system must deal with various languages, so automatic extraction of bilingual word pairs from parallel corpora is important. However, previous work based on statistical methods is insufficient because of the sparse data problem. Our learning method automatically acquires rules that are effective against the sparse data problem, using only parallel corpora and no prior bilingual resource (e.g., a bilingual dictionary or a machine translation system). We call this method Inductive Chain Learning (ICL). Moreover, a system using ICL can extract bilingual word pairs even from bilingual sentence pairs in which the grammatical structures of the source language differ from those of the target language, because the acquired rules carry the information needed to cope with different word orders in local parts of bilingual sentence pairs. Evaluation experiments demonstrated that the recall of systems based on several statistical approaches was improved through the use of ICL.
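For context, the statistical baseline that ICL improves on can be sketched as sentence-level co-occurrence scoring with the Dice coefficient. This is a generic baseline sketch, not the paper's ICL method, and the romanized source-language words below are hypothetical toy data:

```python
from collections import Counter
from itertools import product

def dice_pairs(parallel, threshold=0.8):
    """Score each candidate (source, target) word pair by the Dice
    coefficient of sentence-level co-occurrence in a parallel corpus,
    and keep pairs above the threshold."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in parallel:
        src, tgt = set(src_sent.split()), set(tgt_sent.split())
        src_count.update(src)
        tgt_count.update(tgt)
        pair_count.update(product(src, tgt))  # all co-occurring pairs
    scores = {(s, t): 2 * c / (src_count[s] + tgt_count[t])
              for (s, t), c in pair_count.items()}
    return {p: v for p, v in scores.items() if v >= threshold}

# Hypothetical three-sentence parallel corpus (romanized source side)
corpus = [
    ("inu wa hashiru", "the dog runs"),
    ("inu wa taberu", "the dog eats"),
    ("neko wa nemuru", "the cat sleeps"),
]
pairs = dice_pairs(corpus)
```

With only three sentence pairs, words seen once get unreliable scores, which is exactly the sparse data problem the abstract says rule acquisition is meant to address.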

13.
Users of Social Networking Sites (SNSs) like Facebook, LinkedIn or Twitter face two problems: (1) it is difficult for them to keep track of their social friendships and friends’ social activities scattered across different SNSs; and (2) they are often overwhelmed by the huge amount of social data (friends’ updates and other activities). To address these two problems, we propose a user-centric system called “SocConnect” (Social Connect) for aggregating social data from different SNSs and allowing users to create personalized social and semantic contexts for their social data. Users can blend and group friends on different SNSs, and rate friends and their activities as favourite, neutral or disliked. SocConnect then uses machine learning techniques to provide personalized recommendations of friends’ activities that may interest each user. A prototype is implemented to demonstrate these functionalities. An evaluation with real users confirms that users generally like the proposed functionalities, and that machine learning can be effectively applied to provide personalized recommendations of friends’ activities and help users deal with cognitive overload.

14.
A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low-rank nonnegative matrix factorization algorithm to retain natural data nonnegativity, thereby eliminating the need for the subtractive basis vector and encoding calculations present in techniques such as principal component analysis for semantic feature abstraction. Existing techniques for nonnegative matrix factorization are reviewed, and a new hybrid technique for nonnegative matrix factorization is proposed. Performance evaluations of the proposed method are conducted on benchmark text collections used in standard topic detection studies.
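The nonnegativity-preserving factorization at the heart of this approach can be sketched with the classic Lee-Seung multiplicative updates, which keep both factors nonnegative by construction. This is a generic NMF sketch on a toy term-document matrix, not the paper's hybrid technique:

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factor nonnegative V (terms x docs) into W (terms x rank) @ H
    (rank x docs) with multiplicative updates; nonnegativity is preserved
    because every update multiplies by a nonnegative ratio."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy term-document counts: two topic blocks plus a mixed document
V = np.array([[3., 0., 1.],
              [2., 0., 1.],
              [0., 4., 0.],
              [0., 3., 1.]])
W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H)
```

Columns of W act as additive "topic" vectors over terms, which is the interpretability advantage over PCA's signed components that the abstract alludes to.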

15.
Many machine learning algorithms have been applied to text classification tasks. In the machine learning paradigm, a general inductive process automatically builds a text classifier by learning from examples, an approach generally known as supervised learning. However, supervised learning approaches have some problems. The most notable is that they require a large number of labeled training documents for accurate learning. While unlabeled documents are plentiful and easily collected, labeled documents are difficult to obtain because the labeling task must be done by human annotators. In this paper, we propose a new text classification method based on unsupervised or semi-supervised learning. The proposed method begins with only unlabeled documents and the title word of each category, and then automatically learns a text classifier using bootstrapping and feature projection techniques. Experimental results showed that the proposed method achieves reasonably useful performance compared with a supervised method. If the proposed method is used in a text classification task, building text classification systems will become significantly faster and less expensive.
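The bootstrapping idea, seed-labeling documents that contain a category's title word and then growing class profiles to label the rest, can be sketched as follows. The documents and title words are hypothetical, and this greedy word-overlap scorer is a simplified stand-in for the paper's feature projection step:

```python
from collections import Counter

def bootstrap_classify(docs, title_words):
    """Seed-label documents containing a category's title word, build
    per-class word profiles from the seeds, then label remaining documents
    by word overlap with each profile, expanding profiles as we go."""
    labels = {}
    profiles = {c: Counter() for c in title_words}
    # Pass 1: seed labeling from title words
    for i, doc in enumerate(docs):
        words = doc.lower().split()
        for c, seed in title_words.items():
            if seed in words:
                labels[i] = c
                profiles[c].update(words)
                break
    # Pass 2: bootstrap the unlabeled documents
    for i, doc in enumerate(docs):
        if i in labels:
            continue
        words = doc.lower().split()
        labels[i] = max(profiles, key=lambda c: sum(profiles[c][w] for w in words))
        profiles[labels[i]].update(words)
    return labels

docs = [
    "the sports match had goals",
    "a recipe for cooking pasta",
    "goals and match highlights",
    "pasta sauce recipe tips",
]
labels = bootstrap_classify(docs, {"sports": "sports", "cooking": "cooking"})
```

No document-level labels are supplied; only the two category title words drive the whole labeling, which is the method's cost advantage over supervised training.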

16.
Ke Jia. Information Science (情报科学), 2021, 39(10): 165-169
[Purpose/Significance] Entity-relation extraction is foundational work for building domain ontologies and knowledge graphs and for developing question-answering systems. Distant supervision aligns large-scale unstructured text with entity pairs in an existing knowledge base to label training samples automatically, removing the time-consuming and labor-intensive manual annotation required by supervised machine learning, but it also introduces data noise. [Method/Process] This paper systematically reviews recent methods that combine distant supervision with deep learning to reduce training-sample noise and improve entity-relation extraction performance. [Result/Conclusion] Convolutional neural networks better capture local and salient sentence features, while long short-term memory networks better handle long-distance dependencies between entity pairs in a sentence; these models automatically extract lexical and syntactic features, attention mechanisms assign greater weight to key context and words, and incorporating prior knowledge into neural models enriches the semantic information of entity pairs, significantly improving relation-extraction performance. [Innovation/Limitation] Future research should consider methods for handling overlapping relations and long-tail semantic relations between entity pairs, to address relation noise more comprehensively.

17.
This paper examines the extent to which users in developing countries innovate, the factors that enable these innovations, and whether they are meaningful on a global stage. To study this issue, we conducted an empirical investigation into the origin and types of innovations in financial services offered via mobile phones, a global, multi-billion-dollar industry in which developing economies play an important role. We used the complete list of mobile financial services, as reported by the GSM Association, and collected detailed histories of the development of the services and their innovation process. Our analysis, the first of its kind, shows that 85% of the innovations in this field originated in developing countries. We also conclude that at least 50% of all mobile financial services were pioneered by users, approximately 45% by producers, and the remainder jointly by users and producers. The main factors contributing to the emergence of these innovations in developing countries are high levels of need and the existence of flexible platforms, in combination with increased access to information and communication technology. Additionally, services developed by users diffused at more than double the rate of producer innovations. Finally, we observe that three-quarters of the innovations that originated in non-OECD countries have already diffused to OECD countries, so the (user) innovations are globally meaningful. This study suggests that the traditional North-to-South diffusion framework fails to explain these new sources of innovation and may require re-examination.

18.
With the development of 3D technology and the increase in available 3D models, 2D image-based 3D model retrieval has drawn increased attention from scholars. Previous works align cross-domain features via adversarial domain alignment and semantic alignment. However, the features extracted by previous methods are disturbed by residual domain-specific features, and the lack of labels for 3D models makes semantic alignment challenging. We therefore propose disentangled feature learning combined with enhanced semantic alignment to address these problems. On one hand, disentangled feature learning decouples the entangled raw features into isolated domain-invariant and domain-specific features, and the domain-specific features are dropped while performing adversarial domain alignment and semantic alignment to acquire domain-invariant features. On the other hand, we mine semantic consistency by compacting each 3D model sample with its nearest neighbors to further enhance semantic alignment for the unlabeled 3D model domain. We report comprehensive experiments on two public datasets, and the results demonstrate the superiority of the proposed method. On the MI3DOR-2 dataset in particular, our method outperforms the current state-of-the-art methods with a gain of 2.88% on the strictest retrieval metric, NN.

19.
In this paper, we propose a multi-strategic matching and merging approach to find correspondences between ontologies based on the syntactic and semantic characteristics and constraints of Topic Maps. Our multi-strategic matching approach consists of a linguistic module and a Topic Maps constraints-based module. The linguistic module computes similarities between concepts using morphological analysis, string normalization, tokenization, and language-dependent heuristics. The Topic Maps constraints-based module takes advantage of several Topic Maps-dependent techniques such as topic property-based matching, hierarchy-based matching, and association-based matching. This is a composite matching procedure that need not generate a cross-pairing of all topics from the two ontologies, because unmatched pairs of topics can be removed using the characteristics and constraints of the Topic Maps. Merging between Topic Maps follows the matching operations. We set up a MERGE function to integrate two Topic Maps into a new Topic Map that satisfies merge requirements such as entity preservation, property preservation, relation preservation, and conflict resolution. For our experiments, we used oriental philosophy ontologies, western philosophy ontologies, the Yahoo western philosophy dictionary, and the Wikipedia philosophy ontology as input. Our experiments show that the automatically generated matching results conform to the outputs generated manually by domain experts and can be of great benefit to the subsequent merging operations.
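The linguistic module's string normalization and similarity step can be sketched with the standard library alone. The normalization rules and the 0.85 threshold are illustrative assumptions; the philosopher names are toy inputs in the spirit of the paper's test ontologies:

```python
import difflib
import re

def normalize(label):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", label.lower())).strip()

def name_similarity(a, b):
    """Character-level similarity between two normalized topic names."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def match_topics(topics_a, topics_b, threshold=0.85):
    """Return candidate correspondences whose name similarity clears the
    threshold; constraint-based modules would prune this list further."""
    return [(a, b) for a in topics_a for b in topics_b
            if name_similarity(a, b) >= threshold]

matches = match_topics(["Plato", "Kant"], ["plato", "Hegel"])
```

In a full system, the Topic Maps constraints (properties, hierarchy, associations) would filter these linguistic candidates rather than requiring a score over every cross-pair.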

20.
This paper presents a binary classification of entrepreneurs in British historical data, made possible by the recent availability of big data from the I-CeM dataset. The main task is to attribute an employment status to individuals who did not fully report entrepreneur status in the earlier censuses (1851–1881). The paper assesses the accuracy of different classifiers and machine learning algorithms, including deep learning, for this classification problem. We first adopt a ground-truth dataset from the later censuses to train the computer, using logistic regression (the standard in the literature for this kind of binary classification), to distinguish entrepreneurs from non-entrepreneurs (i.e. workers). The initial accuracy of this baseline method is 0.74. We compare logistic regression with ten optimized machine learning algorithms: nearest neighbors, linear and radial support vector machines, Gaussian process, decision tree, random forest, neural network, AdaBoost, naive Bayes, and quadratic discriminant analysis. The best results come from boosting and ensemble methods: AdaBoost achieves an accuracy of 0.95. Deep learning, as a standalone category of algorithms, further improves accuracy to 0.96 without using the rich text data in the OccString feature, a string of up to 500 characters containing the full occupational statement of each individual collected in the earlier censuses. Finally, using this OccString feature, we implement both shallow (bag-of-words) and deep learning (a recurrent neural network with a long short-term memory layer) algorithms. These methods all achieve accuracies above 0.99, with the recurrent neural network the best model at an accuracy of 0.9978. The results show that the standard algorithm for this classification can be outperformed by machine learning algorithms, confirming the value of extending the techniques traditionally used in the literature for this type of classification problem.
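The logistic-regression baseline the paper starts from can be sketched from scratch with gradient descent. The single "workforce size" feature and its values below are hypothetical toy data, not I-CeM census features:

```python
import math

def train_logistic(X, y, lr=0.5, epochs=300):
    """Stochastic gradient descent for binary logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # predicted probability
            g = p - yi                          # gradient of log-loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z > 0 else 0

# Hypothetical single-feature data: a scaled "workforce size" signal,
# label 1 = entrepreneur (employer), 0 = worker
X = [[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(X, y)
```

The stronger results in the paper come from replacing this linear decision boundary with boosted ensembles and, ultimately, recurrent networks over the occupational text.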

