Tactical Missile Technology

2024, 06, No.228 107-117

Application research of deep reinforcement learning in intelligent control of unmanned aerial vehicle

Hou Lei¹ Jia Beixi¹ Du Ziliang¹ Zhang Peng¹ Wang Tianyu²

Aviation System Engineering Institute of China;

DOI: 10.16358/j.issn.1009-1300.20240016

Published: 2024-12-15

Abstract:

With the wide application of unmanned aerial vehicles (UAVs) in military and civil domains, intelligent technology for UAV control has become a research focus. Deep reinforcement learning (DRL) can solve complex UAV control problems and enable end-to-end decision-making and control, while also bringing new opportunities and challenges to intelligent UAV applications. This paper therefore reviews the application of DRL to the intelligent control of UAVs. The basic principles and common algorithms of DRL are introduced, and its applications in UAV attitude control, flight control, target searching and tracking, cluster cooperative control, and air combat decision control are outlined. The problems and challenges of applying DRL to UAV control are identified, and possible solutions are discussed. Finally, research on DRL for the intelligent control of UAVs is summarized and its prospects are considered, to provide a reference for the development of UAV systems towards automation, autonomy, and intelligence.

Keywords: intelligent UAV; deep reinforcement learning; UAV attitude control; target searching and tracking; UAV swarm; air combat decision control; sim-to-real

Basic Information:

China Classification Code: V279; V249.1

Citation Information:

[1] Hou Lei, Jia Beixi, Du Ziliang, et al. Application research of deep reinforcement learning in intelligent control of unmanned aerial vehicle[J]. Tactical Missile Technology, 2024, No.228(06): 107-117. DOI: 10.16358/j.issn.1009-1300.20240016.
