    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/118697


    Title: 應用AI強化學習於建立股票交易代理人之研究-以台積電股票為例
    A study on establishing trading agent of stocks by AI reinforcement learning in Taiwan semiconductor manufacturing company stocks
    Authors: 林睿峰
    Lin, Jui-Feng
    Contributors: 姜國輝
    季延平

    林睿峰
    Lin, Jui-Feng
    Keywords: Machine learning
    Reinforcement learning
    Q-learning
    Deep Q-learning
    Date: 2018
    Issue Date: 2018-07-17 11:25:51 (UTC+8)
    Abstract: Reinforcement learning is a machine learning technique inspired by behaviorism in psychology: like a living creature interacting with its environment, the agent gradually adjusts its behavior by pursuing rewards and avoiding punishments. Reinforcement learning excels at sequential decision-making and control, and stock market trading fits the nature of this type of problem.
      The states of the stock market environment are highly diverse and difficult to summarize with a finite set of state categories, so training an agent to learn an appropriate response to every possible state would be prohibitively expensive. This study therefore adopts two training models. The first uses the clustering ability of unsupervised learning to group environmental states before training with the Q-learning algorithm. The second trains the value function with the Deep Q-learning algorithm, which combines reinforcement learning with deep learning; exploiting deep learning's capacity for function approximation, it builds the trading agent on a Deep Q Network (DQN).
      In the system design, the trading agent observes the market state through technical indicators including MA, MACD, RSI, BIAS, and KD. To determine which indicators best represent the market state, seven combinations of indicators were designed, tested, and compared. The difference between the funds held at the end of the trading period and the funds held at the beginning, i.e., the total profit or loss, serves as the reward signal that drives the agent to adjust its trading behavior in pursuit of higher profit.
      Taking Taiwan Semiconductor Manufacturing Company (TSMC) stock as an example, this study uses six years of after-hours data published on the Taiwan Stock Exchange website, from November 3, 2011 to December 1, 2017, to train and test the trading agent. In the best-performing model, the agent achieved an average annual return of 16.14% and formed a stable trading strategy with effective profitability.
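    The record itself contains no code; as a rough illustration of the state and reward design described in the abstract, the following Python sketch shows one plausible way to turn daily closing prices into a technical-indicator state vector (MA, BIAS, RSI) and to compute the end-of-period profit reward. The indicator periods, function names, and sample figures are assumptions for illustration, not taken from the thesis.

    ```python
    # Illustrative sketch only -- not the author's implementation.
    import numpy as np
    import pandas as pd

    def indicator_state(close: pd.Series) -> pd.DataFrame:
        """Per-day state built from an assumed subset of the thesis's indicators."""
        ma5 = close.rolling(5).mean()                     # 5-day moving average (MA)
        bias5 = (close - ma5) / ma5 * 100                 # BIAS: % deviation from the MA
        delta = close.diff()
        gain = delta.clip(lower=0).rolling(14).mean()
        loss = (-delta.clip(upper=0)).rolling(14).mean()
        rsi14 = 100 - 100 / (1 + gain / loss.replace(0, np.nan))  # 14-day RSI
        return pd.DataFrame({"ma5": ma5, "bias5": bias5, "rsi14": rsi14}).dropna()

    def episode_reward(initial_funds: float, final_funds: float) -> float:
        """Reward signal: total profit or loss over the whole trading period."""
        return final_funds - initial_funds

    if __name__ == "__main__":
        # Placeholder price path; the thesis uses TSMC after-hours data from the TWSE website.
        close = pd.Series(100 + np.cumsum(np.random.default_rng(0).normal(0, 1, 60)))
        print(indicator_state(close).tail())
        print(episode_reward(1_000_000, 1_161_400))       # e.g. a 16.14% gain
    ```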
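    The first training model in the abstract (unsupervised clustering of states followed by Q-learning) could look roughly like the sketch below. This is not the thesis implementation: the number of clusters, learning rate, discount factor, exploration rate, and the simplified per-step reward (next-day price change signed by the action, rather than the end-of-period profit) are all assumptions.

    ```python
    # Illustrative sketch of "cluster states, then run tabular Q-learning".
    import numpy as np
    from sklearn.cluster import KMeans

    ACTIONS = ("buy", "sell", "hold")

    def train_q_table(features: np.ndarray, next_day_change: np.ndarray,
                      n_clusters: int = 8, alpha: float = 0.1, gamma: float = 0.95,
                      epsilon: float = 0.1, episodes: int = 100, seed: int = 0) -> np.ndarray:
        """Tabular Q-learning over k-means clusters of (n_days, n_indicators) feature vectors."""
        rng = np.random.default_rng(seed)
        # Unsupervised step: map each day's indicator vector to one of a finite set of states.
        states = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(features)
        q = np.zeros((n_clusters, len(ACTIONS)))
        for _ in range(episodes):
            for t in range(len(states) - 1):
                s, s_next = states[t], states[t + 1]
                # Epsilon-greedy action selection.
                a = rng.integers(len(ACTIONS)) if rng.random() < epsilon else int(q[s].argmax())
                # Simplified reward: gain from buying (or shorting) for one day; holding earns nothing.
                r = next_day_change[t] if a == 0 else -next_day_change[t] if a == 1 else 0.0
                # Standard Q-learning update toward the temporal-difference target.
                q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
        return q
    ```

    The second model described in the abstract would, in the same spirit, replace the Q table with a neural network that approximates Q(state, action), i.e., a DQN.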
    Description: Master's thesis
    National Chengchi University
    Department of Management Information Systems
    105356036
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0105356036
    Data Type: thesis
    DOI: 10.6814/THE.NCCU.MIS.004.2018.A05
    Appears in Collections: [Department of Management Information Systems] Theses

    Files in This Item:

    File  Size  Format
    603601.pdf  2279 KB  Adobe PDF


    All items in 政大典藏 are protected by copyright, with all rights reserved.


    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for academic research and public education for non-commercial use. Please use the content in a proper and reasonable manner that respects the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact the site maintainers (nccur@nccu.edu.tw); the maintainers will immediately take remedial measures such as removing the work from the repository.