Similar literature
 Found 20 similar documents; search took 31 ms
1.
2.
Convolutional neural networks (CNNs) and their variants have led to many state-of-the-art results in various fields. However, a clear theoretical understanding of such networks is still lacking. Recently, a multilayer convolutional sparse coding (ML-CSC) model has been proposed and proved to be equivalent to such simply stacked networks (plain networks). Here, we consider the initialization, the dictionary design and the number of iterations in each layer to be factors that greatly affect the performance of the ML-CSC model. Inspired by these considerations, we propose two novel multilayer models: the residual convolutional sparse coding (Res-CSC) model and the mixed-scale dense convolutional sparse coding (MSD-CSC) model. They are closely related to the residual neural network (ResNet) and the mixed-scale (dilated) dense neural network (MSDNet), respectively. Mathematically, we derive the skip connection in the ResNet as a special case of a new forward propagation rule for the ML-CSC model. We also find a theoretical interpretation of dilated convolution and dense connection in the MSDNet by analyzing the MSD-CSC model, which gives a clear mathematical understanding of each. We implement the iterative soft thresholding algorithm and its fast version to solve the Res-CSC and MSD-CSC models. The unfolding operation can be employed for further improvement. Finally, extensive numerical experiments and comparison with competing methods demonstrate their effectiveness.
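For readers unfamiliar with the iterative soft thresholding algorithm (ISTA) named above, a minimal sketch for a single-layer, non-convolutional sparse coding problem follows. It is not the authors' Res-CSC or MSD-CSC solver; the dictionary `D`, signal `x`, penalty `lam`, step size and function names are all illustrative assumptions.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(D, x, lam, step, n_iter=100):
    """Approximately minimize 0.5 * ||x - D z||^2 + lam * ||z||_1 over z.

    D is a list of rows (hypothetical dictionary). For convergence, step
    should not exceed 1 / L, where L is the largest eigenvalue of D^T D.
    """
    n_rows, n_cols = len(D), len(D[0])
    z = [0.0] * n_cols
    for _ in range(n_iter):
        # Gradient of the smooth term: D^T (D z - x)
        residual = [sum(D[i][j] * z[j] for j in range(n_cols)) - x[i]
                    for i in range(n_rows)]
        grad = [sum(D[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by soft thresholding
        z = [soft_threshold(z[j] - step * grad[j], step * lam)
             for j in range(n_cols)]
    return z


# With an identity dictionary the minimizer is soft_threshold(x, lam).
identity = [[1.0, 0.0], [0.0, 1.0]]
print(ista(identity, [2.0, -0.3], lam=0.5, step=1.0, n_iter=10))  # [1.5, 0.0]
```

The small coefficient is set exactly to zero while the large one is shrunk by `lam`, which is the sparsity-inducing behaviour the abstract relies on.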

3.
Augmented reality is very useful in medical education because of the difficulty of bringing body organs into a regular classroom. In this paper, we propose to apply augmented reality to improve teaching in medical schools and institutes. We propose a novel convolutional neural network (CNN) for gesture recognition, which recognizes a user's gestures as specific instructions. We use augmented reality technology for anatomy learning, simulating scenarios in which students learn anatomy with HoloLens instead of rare specimens. We have used mesh reconstruction to reconstruct the 3D specimens. A user interface featuring augmented reality has been designed to fit the common process of anatomy learning. To improve the interaction services, we have applied gestures as an input source and improved the accuracy of gesture recognition with an updated deep convolutional neural network. Our proposed learning method includes many separate training procedures using cloud computing. Each training model and its related inputs are sent to our cloud, and the results are returned to the server. The suggested cloud includes Windows and Android devices, on which deep convolutional learning libraries can be installed. Compared with previous gesture recognition approaches, our approach is not only more accurate but also has more potential for adding new gestures. Furthermore, we have shown that neural networks can be combined with augmented reality as a rising field, and that the combination has great potential for medical learning and education systems.

4.
5.
Knowledge graphs are sizeable graph-structured knowledge with both abstract and concrete concepts in the form of entities and relations. Recently, convolutional neural networks have achieved outstanding results for more expressive representations of knowledge graphs. However, existing deep learning-based models exploit semantic information from single-level feature interaction, potentially limiting expressiveness. We propose a knowledge graph embedding model with an attention-based high-low level features interaction convolutional network called ConvHLE to alleviate this issue. This model effectively harvests richer semantic information and generates more expressive representations. Concretely, the multilayer convolutional neural network is utilized to fuse high-low level features. Then, features in fused feature maps interact with other informative neighbors through the criss-cross attention mechanism, which expands the receptive fields and boosts the quality of interactions. Finally, a plausibility score function is proposed for the evaluation of our model. The performance of ConvHLE is experimentally investigated on six benchmark datasets with individual characteristics. Extensive experimental results prove that ConvHLE learns more expressive and discriminative feature representations and has outperformed other state-of-the-art baselines over most metrics when addressing link prediction tasks. Comparing MRR and Hits@1 on FB15K-237, our model outperforms the baseline ConvE by 13.5% and 16.0%, respectively.
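The MRR and Hits@1 figures quoted above are rank-based link prediction metrics. A minimal sketch of how they are usually computed is given below; the function names and the example ranks are hypothetical and are not taken from the ConvHLE experiments.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank, where rank is the 1-based position of the
    correct entity in the model's sorted candidate list."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct entity for five test triples.
example_ranks = [1, 3, 2, 10, 1]
print(round(mean_reciprocal_rank(example_ranks), 3))  # 0.587
print(hits_at_k(example_ranks, 1))                    # 0.4
```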

6.
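The key problem in video transmission control for mobile networks is how to avoid congestion and maximize video quality in the presence of packet loss. Working on a wireless network platform, this paper uses several transmission control techniques to reduce the impact of transmission errors so that the decoder operates correctly. Based on the RTCP/RTP protocols, it studies key issues in video transmission such as error concealment, error control, and rate control, and analyzes several methods and approaches to the rate control problem.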

7.
易淼 《科技广场》2009,(5):120-122
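Digital image coding is the prerequisite and foundation for the transmission, storage, and playback of digital information. This paper outlines the development of international and Chinese domestic video coding standards and analyzes the technical characteristics of the hybrid coding framework. It also analyzes scalable coding and multi-view video coding, which currently complement hybrid coding, and describes their key technologies.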

8.
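To achieve higher transmission rates over limited bandwidth, image data must be compressed before transmission. Compared with existing wavelet transmission methods, embedded zerotree wavelet coding exploits the multiresolution property of wavelets for transmission. Artificial neural networks have been applied to many image processing problems; in image compression, they show advantages over traditional methods when handling noisy or incomplete image data.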

9.
Aim: In a pilot study to improve detection of malignant lesions in breast mammograms, we aimed to develop a new method called BDR-CNN-GCN, combining two advanced neural networks: (i) graph convolutional network (GCN); and (ii) convolutional neural network (CNN). Method: We utilised a standard 8-layer CNN, then integrated two improvement techniques: (i) batch normalization (BN) and (ii) dropout (DO). Finally, we utilized rank-based stochastic pooling (RSP) to substitute the traditional max pooling. This resulted in BDR-CNN, which is a combination of CNN, BN, DO, and RSP. This BDR-CNN was hybridized with a two-layer GCN, and yielded our BDR-CNN-GCN model which was then utilized for analysis of breast mammograms as a 14-way data augmentation method. Results: As proof of concept, we ran our BDR-CNN-GCN algorithm 10 times on the breast mini-MIAS dataset (containing 322 mammographic images), achieving a sensitivity of 96.20±2.90%, a specificity of 96.00±2.31% and an accuracy of 96.10±1.60%. Conclusion: Our BDR-CNN-GCN showed improved performance compared to five proposed neural network models and 15 state-of-the-art breast cancer detection approaches, proving to be an effective method for data augmentation and improved detection of malignant breast masses.
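Sensitivity, specificity and accuracy, as reported above, are ratios taken from a binary confusion matrix. The sketch below shows the standard definitions; the counts are invented for illustration and are not the mini-MIAS results, and the function name is hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: malignant cases detected / missed.
    tn/fp: benign cases correctly cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Made-up counts for a single run on 100 images.
print(binary_metrics(tp=48, fn=2, tn=45, fp=5))  # (0.96, 0.9, 0.93)
```

A mean and standard deviation such as 96.20±2.90% would then be taken over the values from repeated runs.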

10.
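A network QoS control scheme that applies DiffServ over MPLS to deep-space network communication is proposed to improve QoS control for various services. Simulation results show that the scheme meets the transmission delay and packet loss requirements of streaming media such as video and voice in deep-space communication networks.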

11.
Several information processing technologies capable of augmenting human performance in handling a range of emergency situations are featured in this discussion. Among the more sophisticated systems that have proven useful in enhancing information collection, transmission, and selective processing are packet radio networks, "expert planning systems," multiple-satellite technology, and such related emerging developments as "internetting" and "machine intelligence."

12.
With the increasing growth of video data, especially in cyberspace, video captioning or the representation of video data in the form of natural language has been receiving an increasing amount of interest in several applications like video retrieval, action recognition, and video understanding, to name a few. In recent years, deep neural networks have been successfully applied for the task of video captioning. However, most existing methods describe a video clip using only one sentence that may not correctly cover the semantic content of the video clip. In this paper, a new multi-sentence video captioning algorithm is proposed using a content-oriented beam search approach and a multi-stage refining method. We use a new content-oriented beam search algorithm to update the probabilities of words generated by the trained deep networks. The proposed beam search algorithm leverages the high-level semantic information of an input video using an object detector and the structural dictionary of sentences. We also use a multi-stage refining approach to remove structurally wrong sentences as well as sentences that are less related to the semantic content of the video. To this intent, a new two-branch deep neural network is proposed to measure the relevance score between a sentence and a video. We evaluated the performance of the proposed method with two popular video captioning databases and compared the results with the results of some state-of-the-art approaches. The experiments showed the superior performance of the proposed algorithm. For instance, in the MSVD database, the proposed method shows an enhancement of 6% for the best-1 sentences in comparison to the best state-of-the-art alternative.

13.
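An intelligent video analysis front end based on the ADSP-BF561 is presented. Exploiting the dual-core architecture of the BF561, a dual-core application model is used to design and implement video capture, intelligent video analysis, and H.264 video encoding running in parallel, together with transmission of the compressed bitstream. Experiments show that the system detects object intrusion and object loss effectively in real time while encoding video frames in H.264 and forwarding them.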

14.
The Franklin Institute, Philadelphia, Pennsylvania awarded the 2001 Bower Award and prize to Paul Baran, in recognition of efforts to advance our knowledge of physical science or its application, for his seminal invention of packet switching, the foundation of modern communications networks and, in particular, the Internet. Simply stated, the technology of packet switching allows pieces of information to be divided into small packets or "envelopes" of information that are addressed, sent using multiple available routes to a specific destination, then reassembled. This technology, a post office-like system, revolutionized the telecommunications industry. Originally devised during the Cold War for a military communications system survivable in the event of nuclear attack, packet switching became the foundation of computer networks including the Internet and truly has altered the world in which we live.
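The divide, address, route and reassemble idea described above can be illustrated with a toy sketch. It ignores real network concerns such as headers, loss and retransmission; the function names and packet size are assumptions made for illustration.

```python
import random


def packetize(message, size):
    """Split a message into (sequence_number, payload) packets."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]


def reassemble(packets):
    """Rebuild the message by ordering packets on their sequence number."""
    return "".join(payload for _, payload in sorted(packets, key=lambda p: p[0]))


packets = packetize("packet switching", 4)
random.shuffle(packets)      # packets may arrive out of order over different routes
print(reassemble(packets))   # packet switching
```

The sequence number plays the role of the address on each "envelope": it is what lets the receiver restore the original order regardless of the route each packet took.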

15.
The VISION (video indexing for searching over networks) digital video library system has been developed in our laboratory as a testbed for evaluating automatic and comprehensive mechanisms for video archive creation and content-based search, filtering and retrieval of video over local and wide area networks. In order to provide access to video footage within seconds of broadcast, we have developed a new pipelined digital video processing architecture which is capable of digitizing, processing, indexing and compressing video in real time on an inexpensive general purpose computer. These videos were automatically partitioned into short scenes using video, audio and closed-caption information. The resulting scenes are indexed based on their captions and stored in a multimedia database. A client-server-based graphical user interface was developed to enable users to remotely search this archive and view selected video segments over networks of different bandwidths. Additionally, VISION classifies the incoming videos with respect to a taxonomy of categories and will selectively send users videos which match their individual profiles.

16.
Semantic image segmentation is a challenging problem in image processing where deep convolutional neural networks (CNNs) have been applied with great success in recent years. It deals with pixel-wise classification of an input image, dividing it into regions of multiple object classes. However, CNNs are opaque models. Given a trained CNN, it is hard to tell which information encoded in the input image is important for the network to perform segmentation. Such information could be useful to judge whether a trained network learned to segment in a plausible way or how its performance can be improved. For a trained CNN, we formulate an optimization problem to extract relevant image fractions for semantic segmentation. We try to identify a subset of pixels that contain the relevant information for the segmentation of one selected object class. In experiments on the Cityscapes dataset, we show that this is an easy way to gain valuable insight into a CNN trained for semantic segmentation. Looking at the relevant image fractions, we can identify possible limits of a trained network and draw conclusions about possible improvements.

17.
Deep forest     
Current deep-learning models are mostly built upon neural networks, i.e. multiple layers of parameterized differentiable non-linear modules that can be trained by backpropagation. In this paper, we explore the possibility of building deep models based on non-differentiable modules such as decision trees. After a discussion about the mystery behind deep neural networks, particularly by contrasting them with shallow neural networks and traditional machine-learning techniques such as decision trees and boosting machines, we conjecture that the success of deep neural networks owes much to three characteristics, i.e. layer-by-layer processing, in-model feature transformation and sufficient model complexity. On one hand, our conjecture may offer inspiration for theoretical understanding of deep learning; on the other hand, to verify the conjecture, we propose an approach that generates deep forest holding these characteristics. This is a decision-tree ensemble approach, with fewer hyper-parameters than deep neural networks, and its model complexity can be automatically determined in a data-dependent way. Experiments show that its performance is quite robust to hyper-parameter settings, such that in most cases, even across different data from different domains, it is able to achieve excellent performance by using the same default setting. This study opens the door to deep learning based on non-differentiable modules without gradient-based adjustment, and exhibits the possibility of constructing deep models without backpropagation.

18.
19.
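Using natural language processing techniques such as text classification and sentiment analysis, a method is developed for evaluating a region's environmental image from Internet text. To meet the analysis needs of ecological and environmental big data, categories of environmental image are defined, and regional environmental image is evaluated from three perspectives: text source, sentiment polarity, and environmental element. An environmental text corpus was manually annotated, and three algorithms (support vector machine, naive Bayes, and convolutional neural network) were compared; the final evaluation model uses a convolutional neural network as its core algorithm. The method classifies well, and the F1 scores of all three classification tasks meet the analysis requirements: between 0.8 and 0.9 for environmental elements, above 0.8 for sentiment analysis, and about 0.9 for text source. Applied to cities in the Yangtze River Delta, the method can process regional environmental hot topics in public opinion in real time, analyze regional environmental image, and provide accurate, intuitive assessment results that offer basic information support for regional environmental management.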

20.
鲍玉来  耿雪来  飞龙 《现代情报》2019,39(8):132-136
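[Purpose/Significance] Extracting knowledge elements from unstructured corpora is an important step in building knowledge graphs. This paper explores a method for extracting knowledge relations in the tourism domain using a convolutional neural network (CNN) model from deep learning. [Method/Process] Data were crawled from specialized tourism websites to build a corpus, part of which was manually annotated to serve as training and test sets. Word segmentation, vectorization, and the CNN model were implemented in Python, and relation extraction experiments were carried out. [Result/Conclusion] The experiments show that a convolutional neural network achieves satisfactory results for relation extraction from unstructured tourism text (Precision 0.77, Recall 0.76, F1-measure 0.76). After manual proofreading and refinement, the extraction results can lay a foundation for building tourism knowledge graphs and domain ontologies.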
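The reported F1-measure can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does this arithmetic; the function name is an assumption, and only the three figures come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 0.77 and recall 0.76 as reported in the abstract.
print(round(f1_score(0.77, 0.76), 2))  # 0.76
```

The result agrees with the F1-measure of 0.76 given in the abstract.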
