
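Path planning algorithm for unmanned surface vessels based on deep Q network
Cite this article: Sui Bowen, Huang Zhijian, Jiang Baoxiang, Zheng Huan, Wen Jiayi. Path planning algorithm for unmanned surface vessels based on deep Q network[J]. Journal of Shanghai Maritime University, 2020, 41(3): 1-5.
Authors: Sui Bowen  Huang Zhijian  Jiang Baoxiang  Zheng Huan  Wen Jiayi
Affiliation: Merchant Marine College, Shanghai Maritime University, Shanghai 201306 (all five authors)
Funding: National Natural Science Foundation of China (61403250)
Abstract: To achieve autonomous obstacle-avoidance navigation of unmanned surface vessels (USVs) in unknown environments, a USV obstacle-avoidance path planning algorithm based on the deep Q network is proposed. The algorithm applies deep learning to the Q-learning algorithm and uses a deep neural network to estimate the Q function, which addresses the curse of dimensionality that traditional Q-learning tends to encounter in path planning for complex waters. Training the model establishes a mapping between perception (input) and decision (output). Following this mapping, the USV executes the action with the largest Q value in each decision cycle, so that it can avoid obstacles and plan an optimal route. Simulation results show that the average loss function converges well after 8 000 training iterations, indicating that the USV has learned how to avoid obstacles and plan an optimal route. The method is a model-free, end-to-end path planning algorithm.

Keywords: unmanned surface vessel (USV)  autonomous obstacle avoidance  path planning  deep Q network  convolutional neural network  reinforcement learning
Received: 2019-08-26
Revised: 2019-09-29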
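
The curse of dimensionality mentioned in the abstract can be illustrated with simple counting. The sketch below is not taken from the paper: the number of bins, the number of state features, and the three-action set are all hypothetical values chosen for illustration, and the linear approximator is only a stand-in for the deep network the authors use. It compares the number of entries a tabular Q function needs with the parameter count of a small parametric approximator.

```python
def q_table_entries(bins_per_feature, num_features, num_actions):
    """Entries in a tabular Q function over a discretized state space."""
    return (bins_per_feature ** num_features) * num_actions


def linear_q_parameters(num_features, num_actions):
    """Parameters of a linear Q approximator (weights plus one bias per action)."""
    return (num_features + 1) * num_actions


if __name__ == "__main__":
    # Hypothetical setup: 10 bins per feature, 3 discrete actions.
    for num_features in (2, 4, 8):
        table = q_table_entries(10, num_features, 3)
        params = linear_q_parameters(num_features, 3)
        print(f"features={num_features}: table entries={table}, parameters={params}")
```

The table grows exponentially with the number of state features, while the approximator grows linearly, which is the motivation the abstract gives for replacing the table with a network.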

Path planning algorithm for unmanned surface vessels based on deep Q network
Sui Bowen, Huang Zhijian, Jiang Baoxiang, Zheng Huan and Wen Jiayi. Path planning algorithm for unmanned surface vessels based on deep Q network[J]. Journal of Shanghai Maritime University, 2020, 41(3): 1-5.
Authors: Sui Bowen  Huang Zhijian  Jiang Baoxiang  Zheng Huan and Wen Jiayi
Institution: Merchant Marine College, Shanghai Maritime University, Shanghai 201306 (all five authors)
Abstract: To enable autonomous obstacle-avoidance navigation of unmanned surface vessels (USVs) in unknown environments, a USV obstacle-avoidance path planning algorithm based on the deep Q network is proposed. The algorithm applies deep learning to Q-learning: a deep neural network estimates the Q function, which addresses the curse of dimensionality that traditional Q-learning tends to run into when planning paths in complex waters. Training the model establishes a mapping between perception (input) and decision (output). Following this mapping, the USV executes the action with the largest Q value in each decision cycle, allowing it to avoid obstacles and plan an optimal route. Simulation results show that the average loss converges well after 8 000 training iterations, indicating that the USV has learned to avoid obstacles and plan an optimal route. The method is a model-free, end-to-end path planning algorithm.
Keywords: unmanned surface vessel (USV)  autonomous obstacle avoidance  path planning  deep Q network  convolutional neural network  reinforcement learning
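
The decision rule described in the abstract (execute the action with the largest Q value in each decision cycle) and the loss that is driven toward convergence during training can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a linear model stands in for the deep (convolutional) network, and the action names, the discount factor `gamma`, the learning rate `lr`, and the epsilon-greedy exploration option are hypothetical choices that the abstract does not specify.

```python
import random

ACTIONS = ["port", "ahead", "starboard"]  # hypothetical discrete action set


def q_values(weights, state):
    """Linear stand-in for the Q network: one weight row per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def select_action(weights, state, epsilon=0.0, rng=random):
    """Epsilon-greedy choice; with epsilon=0 this is the pure argmax-Q rule."""
    if rng.random() < epsilon:
        return rng.randrange(len(weights))
    q = q_values(weights, state)
    return max(range(len(q)), key=q.__getitem__)


def td_target(weights, reward, next_state, done, gamma=0.9):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(q_values(weights, next_state))


def td_update(weights, state, action, target, lr=0.1):
    """One gradient step on 0.5 * (target - Q(s, a))**2; returns the loss before the step."""
    error = target - q_values(weights, state)[action]
    weights[action] = [w + lr * error * s for w, s in zip(weights[action], state)]
    return 0.5 * error ** 2


if __name__ == "__main__":
    # Hypothetical two-feature state, e.g. normalized obstacle and goal bearings.
    weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    state, next_state = [0.2, 0.8], [1.0, 0.0]
    action = select_action(weights, state)
    target = td_target(weights, reward=1.0, next_state=next_state, done=False)
    loss = td_update(weights, state, action, target)
    print(ACTIONS[action], target, loss)
```

A full deep Q network would replace `q_values` with a neural network and typically add components such as experience replay, which the abstract does not describe and which are therefore omitted here.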
This article is indexed in CNKI, Wanfang Data, and other databases.