Similar Literature
20 similar documents found (search time: 46 ms)
1.
This paper describes Transformative Computing technologies as a new area of modern information science that plays a crucial role in the development of future IT technologies. Transformative computing joins communication technologies and data-processing techniques with advanced AI solutions that can analytically process and manage acquired data. It enhances the possibilities of efficient data analysis and, thanks to the application of AI, opens new areas of data exploration in planning, decision support, and advanced secure information management in distributed systems. In this paper we focus on new possible areas of transformative computing applications, especially semantic information processing and cognitive data reasoning.

2.
This paper presents new theoretical and applied solutions for intelligent data analysis and information management in the field of cognitive economics. Intelligent data analysis and information management are performed by information systems called cognitive systems, dedicated to the semantic interpretation of acquired business information. To interpret the meaning of the analysed data, complex linguistic algorithms must be used, on the basis of which it is possible to find the core information elements for business-process forecasting and economic knowledge management. The main subject of this paper is the presentation of selected methods of semantic data analysis in cognitive economics that allow both local and global information management. Here, semantic analysis methods are dedicated to cognitive economics problems, namely the interpretation, analysis and assessment of the meaning of selected sets of economic and financial ratios. The meaning of the interpreted data sets is assessed by analysing the layers of meaning they contain. The semantic information obtained may be used in future business-process evaluation and forecasting.

3.
This publication discusses ways of using cognitive analysis for semantically interpreting economic figures. A semantic analysis that uses mathematical-linguistic algorithms to extract meaning from sets of analysed data is illustrated with an example of a class of cognitive systems designed to analyse economic figures, or more precisely, financial ratios. The cognitive analysis systems presented here are discussed using the example of the class of Cognitive Financial Analysis Information Systems (CFAIS). This publication proposes algorithms executed by a broad class of automatic data interpretation and understanding systems designed for in-depth semantic analysis and interpretation of the results obtained. This is done by defining new system classes as applications supporting decision-making processes, useful in various areas of knowledge.

4.
This study uses data mining techniques to examine the effect of various demographic, cognitive and psychographic factors on Egyptian citizens' use of e-government services. Data mining draws on a broad family of computationally intensive methods that include decision trees, neural networks, rule induction, machine learning and graphic visualization. Three artificial neural network models (multi-layer perceptron [MLP], probabilistic neural network [PNN] and self-organizing map [SOM]) and three machine learning techniques (classification and regression trees [CART], multivariate adaptive regression splines [MARS], and support vector machines [SVM]) are compared to a standard statistical method (linear discriminant analysis [LDA]). The variables considered are sex, age, educational level, and the perceived usefulness, ease of use, compatibility, subjective norms, trust, civic mindedness, and attitudes associated with e-government services. The study shows how various dimensions of e-government service usage behavior can be identified by uncovering complex patterns in the dataset, and demonstrates the classification abilities of data mining techniques.

5.
Latent semantic indexing is an unsupervised learning method that can automatically learn the data needed for morphological analysis from raw, unprocessed text, and it improves learning results by computing the semantic relatedness between words. This paper first briefly reviews the concepts of morphological analysis and morphological learning and the earlier morphological learning methods, then describes a morphological learning method based on this theory, followed by several improvements to and evaluations of the method, and closes with conclusions and an outlook.

6.
Although deep learning breakthroughs in NLP are based on learning distributed word representations with neural language models, these methods suffer from a classic drawback of unsupervised learning techniques, and the performance of general-purpose word embeddings has been shown to be heavily task-dependent. To tackle this issue, recent studies have proposed learning sentiment-enhanced word vectors for sentiment analysis. The common limitation of these approaches, however, is that they require external sentiment lexicon sources, whose construction and maintenance involve a set of complex, time-consuming, and error-prone tasks. In this regard, this paper proposes a sentiment lexicon embedding method that represents the semantic relationships of sentiment words better than existing word embedding techniques, without a manually annotated sentiment corpus. The major distinguishing factor of the proposed framework is that it jointly encodes morphemes and their POS tags and trains only the important lexical morphemes in the embedding space. To verify the effectiveness of the proposed method, we conducted experiments against two baseline models. The revised embedding approach mitigated the problems of conventional context-based word embedding methods and, in turn, improved the performance of sentiment classification.

7.
This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.

8.
董坤 《现代情报》2015,35(2):12-17
Addressing the shortcomings of current methods for classifying and organising intangible cultural heritage, this paper proposes a linked-data-based framework for its semantic organisation. By constructing an ontology description model of intangible cultural heritage, the semantic description of its knowledge units and their relationships is realised. On this basis, the RDF model and linking mechanism adopted by linked data are used to semantically integrate and organise the knowledge units and the relationships between them.

9.
Probabilistic topic models are unsupervised generative models which model document content as a two-step generation process, that is, documents are observed as mixtures of latent concepts or topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingual to multilingual settings. Novel topic models have been designed to work with parallel and comparable texts. We define multilingual probabilistic topic modeling (MuPTM) and present the first full overview of the current research, methodology, advantages and limitations in MuPTM. As a representative example, we choose a natural extension of the omnipresent LDA model to multilingual settings called bilingual LDA (BiLDA). We provide a thorough overview of this representative multilingual model from its high-level modeling assumptions down to its mathematical foundations. We demonstrate how to use the data representation by means of output sets of (i) per-topic word distributions and (ii) per-document topic distributions coming from a multilingual probabilistic topic model in various real-life cross-lingual tasks involving different languages, without any external language pair dependent translation resource: (1) cross-lingual event-centered news clustering, (2) cross-lingual document classification, (3) cross-lingual semantic similarity, and (4) cross-lingual information retrieval. We also briefly review several other applications present in the relevant literature, and introduce and illustrate two related modeling concepts: topic smoothing and topic pruning. In summary, this article encompasses the current research in multilingual probabilistic topic modeling. By presenting a series of potential applications, we reveal the importance of the language-independent and language pair independent data representations by means of MuPTM. 
We provide clear directions for future research in the field by providing a systematic overview of how to link and transfer aspect knowledge across corpora written in different languages via the shared space of latent cross-lingual topics, that is, how to effectively employ learned per-topic word distributions and per-document topic distributions of any multilingual probabilistic topic model in various cross-lingual applications.

10.
高雪霞  王芸 《科技通报》2012,28(8):74-76
This paper proposes a graphics description algorithm based on a difference-constrained natural-semantics model. A difference-similarity model of natural image features is established, and dimensionality reduction is used to classify the ambiguous features arising during image conversion, realising the conversion from the visual features of computer graphics to semantic descriptions and eliminating ambiguity in the conversion. Experimental results show that, for image feature understanding and natural-semantics-based image feature description, the method helps narrow the semantic gap between the mathematical description of image features and human understanding, achieving good results.

11.
To obtain high performance, previous work on FAQ retrieval used high-level knowledge bases or handcrafted rules. However, constructing these knowledge bases and rules is a time- and effort-consuming job whenever application domains change. To overcome this problem, we propose a high-performance FAQ retrieval system that uses only users' query logs as knowledge sources. At indexing time, the proposed system efficiently clusters users' query logs using classification techniques based on latent semantic analysis; at retrieval time, it smooths FAQs using the query log clusters. In experiments, the proposed system outperformed conventional information retrieval systems on FAQ retrieval. Through various experiments, we found that the proposed system can alleviate the critical lexical disagreement problems of short-document retrieval. We also believe the proposed system is more practical and reliable than previous FAQ retrieval systems because it uses only data-driven methods without high-level knowledge sources.
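The core idea behind matching a user query against FAQ candidates can be illustrated independently of the authors' system. The sketch below (a minimal illustration, not the paper's method, which additionally uses latent semantic analysis over query-log clusters) ranks hypothetical FAQ strings against a query by cosine similarity over raw term-count vectors:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, faqs: list) -> str:
    """Return the FAQ whose term vector is closest to the query."""
    qv = Counter(query.lower().split())
    return max(faqs, key=lambda f: cosine(qv, Counter(f.lower().split())))

# Hypothetical FAQ entries for illustration only.
faqs = [
    "how do i reset my password",
    "how can i change my shipping address",
    "what payment methods are accepted",
]
print(retrieve("forgot password how to reset", faqs))
# prints "how do i reset my password"
```

Raw term matching like this suffers from exactly the lexical disagreement problem the abstract mentions ("forgot" never matches "reset"); projecting both vectors into a latent semantic space is what mitigates it.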

12.
Figures and tables in scientific articles serve as data sources for various academic data mining tasks, and these tasks require their input data in its entirety. However, existing studies measure algorithm performance using the same IoU (Intersection over Union) or IoU-based metrics that are used for natural scenes, and there is a gap between a high IoU and detection entirety in scientific figure and table detection. In this paper we demonstrate the existence of this gap and argue that its leading cause is detection error in the boundary area. We propose an effective detection method that cascades semantic segmentation and contour detection. The semantic segmentation model adopts a novel loss function that increases the weights of boundary parts, and a categorized Dice metric to evaluate the imbalanced pixels in the segmentation result. Under rigorous testing criteria, the proposed method yields a page-level F1 of 0.983, exceeding state-of-the-art academic figure and table detection methods. These results can significantly improve data quality and reduce data cleaning costs for downstream applications.
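The gap between a high IoU score and detection entirety is easy to see from the metric's definition. The sketch below (boxes written as (x1, y1, x2, y2), a common but here assumed convention) shows that a detection clipping 10% off one edge of a table still scores IoU = 0.9, even though part of the table's content is lost:

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Ground-truth table is 100x100; the detection misses a 10-pixel strip
# on the right edge, yet the score stays high.
print(iou((0, 0, 100, 100), (0, 0, 90, 100)))  # prints 0.9
```

This is why the abstract distinguishes boundary-area error from overall overlap: an IoU threshold of, say, 0.9 still admits detections that truncate rows or columns.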

13.
Advances in networking and information technology have led to an exponential amount of data generated by people in their day-to-day lives, giving rise to Big Data Analytics (BDA). Cognitive computing is an Artificial Intelligence (AI) based paradigm that can reduce the issues faced during BDA. Sentiment Analysis (SA), in turn, is employed to understand linguistic content such as tweets: extracting features and computing the subjectivity and sentiment of the texts they contain. Applying SA to big data helps businesses draw commercially useful insights from text-oriented content. In this view, this paper presents a new cognitive computing and big data analysis tool for SA. The proposed model involves pre-processing, feature extraction, feature selection and classification, and the Hadoop MapReduce tool is used to handle big data. The model first undergoes pre-processing to remove unwanted words. Then, Term Frequency-Inverse Document Frequency (TF-IDF) is used as the feature extraction technique to produce the feature vectors. A Binary Brain Storm Optimization (BBSO) algorithm is used for Feature Selection (FS), thereby improving classification performance, and Fuzzy Cognitive Maps (FCMs) serve as the classifier for positive or negative sentiment. A comprehensive experimental analysis on the benchmark dataset confirms the better performance of the presented BBSO-FCM model in terms of different measures.
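TF-IDF, the feature extraction step named above, weighs each term by its frequency within a document against its rarity across the corpus. A minimal pure-Python sketch follows; the smoothed idf variant shown is one common formulation, not necessarily the paper's exact formula, and the toy documents are illustrative:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document.

    tf  = term count / document length
    idf = log((1 + N) / (1 + df))   (smoothed document frequency)
    """
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (tf[t] / len(doc)) * math.log((1 + n) / (1 + df[t]))
                    for t in tf})
    return out

docs = [["good", "product"], ["bad", "service"], ["good", "service"]]
weights = tf_idf(docs)
```

Because "product" appears in only one document while "good" appears in two, "product" receives the larger weight in the first document, which is exactly the discriminative effect the feature-selection stage then builds on.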

14.
The proposed work explores and compares the potency of syntactic-semantic linguistic structures in plagiarism detection using natural language processing techniques. It examines linguistic features, viz., part-of-speech tags, chunks and semantic roles, for detecting plagiarized fragments, and utilizes a combined syntactic-semantic similarity metric that extracts semantic concepts from the WordNet lexical database. The linguistic information is used for effective pre-processing and for semantically relevant comparisons. Another major contribution is the analysis of the proposed approach on plagiarism cases of various complexity levels: the impact of plagiarism types and complexity levels on the extracted features is analyzed and discussed. Further, unlike existing systems evaluated on limited data sets, the proposed approach is evaluated at larger scale using the plagiarism corpora provided by the PAN competitions from 2009 to 2014. The approach shows considerable improvement over the top-ranked systems of the respective years, and the evaluation and analysis of various cases of plagiarism also reflect the advantage of deeper linguistic features for identifying manually plagiarized data.

15.
高亚琪  王昊  刘渊晨 《情报科学》2021,39(10):107-117
[Purpose/Significance] To address the insufficient expression of semantic features when managing image resources by computer, this study explores and analyses the influence of features and feature fusion on classification results, and proposes a method to improve the accuracy of semantic image classification. [Method/Process] Four image styles are defined and image description features are divided into three levels; the characteristics of feature fusion are explored to find features that effectively express image semantics. SVM, CNN, LSTM and transfer learning methods are used to classify image style, and the algorithms are combined to improve classification. [Result/Conclusion] Deep features extracted by a transfer-learning-based ResNet18 model express high-level image semantics well, and combining them with an SVM improves classification accuracy. Features are not always complementary; feature redundancy should be avoided during feature selection, as it reduces classification efficiency. [Innovation/Limitation] Only a few styles were defined, and the style an image exhibits is not absolute, since images can often carry multiple labels; future work should enrich the image dataset and attempt multi-label classification.

16.
With the advent of Web 2.0, many online platforms such as social networks, online blogs and magazines produce massive amounts of textual data. This text carries information that can be used for the betterment of humanity, so there is a dire need to extract the potential information within it. This study presents an overview of approaches that can extract such valuable information nuggets from text and present them in a brief, clear and concise way. Two major tasks are reviewed: automatic keyword extraction and text summarization. To compile the literature, scientific articles were collected from the major digital computing research repositories. The survey covers early approaches all the way through recent advancements based on machine learning. It finds that annotated benchmark datasets for many textual data generators, such as Twitter and social forums, are not available, and this scarcity of datasets has slowed progress in several domains. Applications of deep learning to automatic keyword extraction are also relatively unaddressed, so the impact of various deep architectures stands as an open research direction. For text summarization, deep learning techniques applied since the advent of word vectors currently govern the state of the art in abstractive summarization. One of the major remaining challenges in these tasks is the semantics-aware evaluation of generated results.

17.
Governmental and local authorities face many new information and communication technology challenges. The amount of data is rapidly increasing, and data sets are published in different formats. New services are based on linking and processing differently structured data from various sources, while users expect openness of public data, fast processing, and intuitive visualisation. The article addresses these challenges and proposes a new government enterprise architecture framework that includes partial architectures for big and open linked data storage, processing, and publishing using cloud computing. First, the key concepts are defined. Next, the basic architectural roles and components are specified; the components result from the decomposition of related frameworks. The main part of the article presents the detailed proposal of the architecture framework and partial views of the architecture (sub-architectures). A methodology, including proposed steps, solutions and responsibilities for them, is then described, following the verification and validation of the new framework with respect to quality attributes. The new framework responds to emerging ICT trends so that government enterprise architecture can evolve continually and represent current architectural components and their relationships.

18.
Relation classification is one of the most fundamental tasks in the cross-media area and is essential for many practical applications such as information extraction, question answering systems, and knowledge base construction. In the cross-media semantic retrieval task, in order to meet the needs of uniform cross-media representation and semantic analysis, it is necessary to analyze the potential semantic relationships and construct a semantically related cross-media knowledge graph. Relationship classification technology is an important part of solving semantic correlation classification. Most existing methods regard relation classification as a multi-classification task without considering the correlation between different relationships. However, two relationships in opposite directions are usually not independent of each other, so such relationships are easily confused in the traditional approach. To solve the problem of confusing relationships that share the same semantics but differ in entity direction, this paper proposes a neural network fusing discrimination information for relation classification. In the proposed model, discrimination information is used to distinguish relationships of the same semantics with different entity directions: the direction of an entity in space is transformed into the direction of a vector in mathematics by entity vector subtraction, and the result of the subtraction is used as the discrimination information. The model consists of three modules: a sentence representation module, a relation discrimination module and a discrimination fusion module. Two methods are used for feature fusion: one is cascade-based, the other is based on a convolutional neural network. In addition, the loss function of the model combines the cross-entropy function with a modified Max-Margin function.
The experimental results show that the proposed discriminant feature is effective in distinguishing confusable relationships, and that the proposed loss function improves model performance to a certain extent. Finally, the proposed model achieves an F1 value of 84.8% without any additional features or NLP analysis tools, and thus has a promising prospect of being incorporated into various cross-media systems.

19.
Three-way opinion classification (3WOC) models are based on a human perspective of opinion classification and offer human-like decision-making capabilities. The purpose of this quantitative study was to determine the effectiveness of a three-way decision-making framework with multiple features (fuzzy features and semantic features) in simulating human judgement of opinions. A simple prototype of the three-way decision model was run against the Amazon Musical Instrument dataset to evaluate the model. The data used to verify the results were collected from 125 respondents via an online survey; participants tested the model in context and then immediately filled in the online questionnaire. Results show that the statistical correlation between semantic features and fuzzy features is low, so classification coverage and accuracy can be increased when both types of features are used together rather than either type alone. With the integration of semantic and fuzzy features, our three-way decision model performs better than a two-way classification model, and the 3WOC model simulates the judgements people execute when making decisions. Finally, we offer usability recommendations based on our analysis. A three-way decision-making framework is a better solution for simulating human judgement of opinion classification than a two-way decision model. These outcomes will help in developing better opinion classification systems that can support businesses and organisations in making strategic plans to improve their products or services based on customer preference patterns.
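The distinctive step in a three-way decision, compared with a binary classifier, is the deliberate boundary region. A minimal sketch of the idea follows, using a pair of acceptance/rejection thresholds; the threshold values and opinion scores are illustrative assumptions, not taken from the study:

```python
def three_way(score, alpha=0.7, beta=0.3):
    """Classify an opinion score into positive / negative / boundary.

    score >= alpha : accept as positive
    score <= beta  : reject as negative
    otherwise      : boundary region, deferred for further evidence
    """
    if score >= alpha:
        return "positive"
    if score <= beta:
        return "negative"
    return "boundary"  # like a human: withhold judgement on unclear cases

for s in (0.9, 0.5, 0.1):
    print(s, three_way(s))
```

A two-way model is the degenerate case alpha == beta; keeping them apart is what lets low-confidence opinions be resolved later with additional (e.g. fuzzy plus semantic) features instead of being forced into a wrong class.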

20.
Big data generated by social media is a valuable source of information and offers an excellent opportunity to mine valuable insights. In particular, user-generated content such as reviews, recommendations, and user behaviour data is useful for supporting the marketing activities of many companies. Knowing what users are saying about the products they bought or the services they used, through reviews in social media, is a key factor in decision making. Sentiment analysis is one of the fundamental tasks in Natural Language Processing. Although deep learning for sentiment analysis has achieved great success and allowed several firms to analyze and extract relevant information from their textual data, as the volume of data grows a model that runs in a traditional environment cannot be effective, which implies the importance of efficient distributed deep learning models for social big data analytics. Moreover, social media analysis is a complex process involving a set of complex tasks. It is therefore important to address the challenges and issues of social big data analytics and to enhance the classification accuracy of deep learning techniques so as to obtain better decisions. In this paper, we propose an approach for sentiment analysis that adopts fastText with Recurrent Neural Network variants to represent textual data efficiently, and then uses the new representations to perform the classification task. Its main objective is to enhance the classification accuracy of well-known Recurrent Neural Network (RNN) variants and to handle large-scale data. In addition, we propose a distributed intelligent system for real-time social big data analytics, designed to ingest, store, process, index, and visualize huge amounts of information in real time.
The proposed system adopts distributed machine learning with our proposed method to enhance decision-making processes. Extensive experiments conducted on two benchmark data sets demonstrate that our proposal for sentiment analysis outperforms well-known distributed recurrent neural network variants: Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), and Gated Recurrent Unit (GRU). Specifically, we tested the efficiency of our approach using these three deep learning models, and the results show that it enhances the performance of all three. This work can benefit researchers and practitioners who want to collect, handle, analyze and visualize several sources of information in real time, and can contribute to a better understanding of public opinion and user behaviour using the proposed system with improved variants of powerful distributed deep learning and machine learning algorithms. Furthermore, it can increase the classification accuracy of several existing RNN-based approaches to sentiment analysis.
