    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/139558


Title: Visual Analysis for Drones with Deep Reinforcement Learning in a Virtual Environment
    Authors: 李宣毅
    Lee, Hsuan-I
    Contributors: 紀明德
    Chi, Ming-Te
    李宣毅
    Lee, Hsuan-I
Keywords: Deep reinforcement learning
Drone racing
Virtual environment
Visual analytics
    Date: 2022
    Issue Date: 2022-04-01 15:04:57 (UTC+8)
Abstract: Autonomous drone racing has become very popular in recent years. In 2019, Microsoft's AirSim team held a drone gate-passing competition in a virtual environment at the NeurIPS conference, with the main goal of surpassing the performance of human players. None of the winning contestants designed a method based on deep reinforcement learning (DRL) specifically for this competition. This research therefore uses DRL to train a model that successfully passes the gates and finishes this virtual race, and adopts the ROS system commonly used by real drones as the communication architecture for command transmission, narrowing the gap between the virtual and the real.
It is well known that DRL behaves like a black box: users cannot tell what the model has actually learned. This research therefore designs a visual interface that lets users analyze the model's performance, including a chart of the probability of each action choice, so users can check whether the model's decision in the current state matches common intuition. Finally, neural network visualization techniques are used to identify the causes of poor model performance and to improve the model. In some cases the model was found to behave similarly to a human pilot, which greatly increases trust in DRL and the prospects for real-world application.
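To make the command-transmission architecture concrete, the following is a minimal illustrative sketch (not the thesis code) of how a DRL policy's output could be forwarded over ROS as velocity commands. The node name, the topic name /cmd_vel, the control rate, and the action-to-velocity mapping are all assumptions for illustration.

    import rospy
    from geometry_msgs.msg import Twist

    rospy.init_node("drl_policy_bridge")        # hypothetical node name
    pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
    rate = rospy.Rate(20)                       # assumed 20 Hz control loop

    def publish_action(vx, vy, vz, yaw_rate):
        # Translate one policy action into a Twist velocity command.
        cmd = Twist()
        cmd.linear.x, cmd.linear.y, cmd.linear.z = vx, vy, vz
        cmd.angular.z = yaw_rate
        pub.publish(cmd)

    while not rospy.is_shutdown():
        publish_action(1.0, 0.0, 0.0, 0.0)      # placeholder: fly straight ahead
        rate.sleep()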
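The action-probability chart described above can be approximated as follows. This is a hedged sketch assuming a small discrete-action policy head; the network shape, state size, and action names are hypothetical, and the point is only to show how per-action softmax probabilities for the current state could be plotted.

    import torch
    import torch.nn as nn
    import matplotlib.pyplot as plt

    ACTIONS = ["forward", "left", "right", "up", "down"]   # hypothetical action set

    # Hypothetical stand-in for a trained actor network (e.g., an A2C/PPO head).
    policy = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, len(ACTIONS)))

    state = torch.randn(1, 16)                  # placeholder for the drone's state vector
    with torch.no_grad():
        probs = torch.softmax(policy(state), dim=-1).squeeze(0)

    plt.bar(ACTIONS, probs.numpy())             # one bar per action's selection probability
    plt.ylabel("selection probability")
    plt.title("Policy action distribution for the current state")
    plt.show()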
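As for the neural network visualization used to diagnose poor performance, one common technique (not necessarily the one used in the thesis) is a gradient saliency map. The sketch below assumes a hypothetical image-input policy and shows how the sensitivity of the chosen action to each input pixel could be computed.

    import torch
    import torch.nn as nn

    # Hypothetical CNN policy over camera frames; 5 discrete actions assumed.
    cnn_policy = nn.Sequential(
        nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
        nn.Flatten(), nn.Linear(8 * 64 * 64, 5),
    )

    frame = torch.randn(1, 3, 64, 64, requires_grad=True)  # placeholder camera frame
    logits = cnn_policy(frame)
    logits[0, logits.argmax()].backward()       # gradient of the chosen action's logit

    # Per-pixel importance: max absolute gradient across color channels.
    saliency = frame.grad.abs().max(dim=1)[0]   # shape (1, 64, 64)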
Description: Master's thesis
National Chengchi University
Department of Computer Science
108753130
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0108753130
    Data Type: thesis
    DOI: 10.6814/NCCU202200384
Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

File          Size     Format
313001.pdf    3854 KB  Adobe PDF


All items in the NCCU Institutional Repository (政大典藏) are protected by copyright, with all rights reserved.



Copyright Announcement
1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial purposes such as academic research and public education. Please use the content in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.
2. Every effort has been made to avoid infringing the rights of copyright owners. If you believe that any material on this website infringes copyright, please notify our staff (nccur@nccu.edu.tw); the work will be removed from the repository immediately while the claim is investigated.