Abstract:

To address the difficulty of modeling command & control, an intelligent simulation platform based on deep reinforcement learning (DRL) is designed, and a method that uses a DRL agent for command & control is proposed. The platform combines a traditional simulation platform with an agent, giving the agent the ability to perceive the simulation environment and to make command decisions based on that perception. Simulation results for carrier-based aircraft guidance show that the platform can train a command & control agent that makes appropriate decisions to guide the aircraft to the final approach point.

Basic Information:

DOI:10.16358/j.issn.1009-1300.2020.1.007

China Classification Code:TP18;TP391.9

Citation Information:

[1] Wu Zhaoxin, Li Hui, Wang Zhuang, et al. The Design of Intelligence Simulation Platform Based on DRL[J]. Tactical Missile Technology, 2020, No.202(04): 193-200. DOI: 10.16358/j.issn.1009-1300.2020.1.007.

Fund Information:

Joint Fund of the Ministry of Education (6141A02011607)
