政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/130954
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/130954


    Title: 情感分析於電影推薦與評論展現系統之應用
    Application of Sentiment Analysis in Movie Recommendation and Comment-Revealing System
    Authors: 黃德潔
    Huang, Te-Chieh
    Contributors: 鄭宇庭
    黃德潔
    Huang, Te-Chieh
    Keywords: 文字探勘
    情緒分析
    特徵分群
    語意指向
    機器學習
    Text Mining
    Sentiment Analysis
    Feature Clustering
    PMI
    Semantic Orientation
    Machine Learning
    Date: 2020
    Issue Date: 2020-08-03 17:31:00 (UTC+8)
    Abstract: With the rise of text-centered social platforms and microblogs such as Weibo, Twitter, and blogs, consumers' evaluations of products and service quality spread rapidly online. These evaluations strongly influence other consumers' willingness to buy and shape the public's image of the brand. This is especially true of the movie industry: consumers see only the clips in a studio-edited trailer before deciding whether to go to the theater, and tickets cannot be refunded or exchanged afterwards. People therefore pay close attention to online reviews and shared impressions of a movie before buying a ticket. Consequently, how to identify correctly and efficiently the meaning and sentiment that customers express within massive amounts of online information has become a topic that text mining researchers have focused on in recent years.
    This thesis implements an effective movie review system. It collects the short reviews from the 2019 audience satisfaction ranking on the Yahoo! Kimo Movies (Yahoo!奇摩電影) website and classifies them by polarity using opinion extraction, attribute extraction, sentiment analysis, semantic orientation, feature clustering, and machine learning classification. The experiments show an accuracy of 83.74% and an F1-measure of 84.29%, indicating that the study achieves the expected performance in determining review polarity.
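    The semantic orientation step named above can be illustrated with a PMI-based score in the style of Turney (2002). The following is only a minimal sketch under stated assumptions, not the thesis's actual implementation: the function name so_pmi, the document-level co-occurrence counts, the additive smoothing constant, and the English toy tokens and seed words are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Return {word: semantic orientation} from document-level co-occurrence.

    SO(w) = sum of PMI(w, positive seed) - sum of PMI(w, negative seed).
    `smoothing` is an assumed constant that keeps log2 defined for word
    pairs that never co-occur.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(frozenset(pair) for pair in combinations(vocab, 2))

    def pmi(word, seed):
        # log2( P(word, seed) / (P(word) * P(seed)) ) with smoothed joint count
        joint = pair_df[frozenset((word, seed))] + smoothing
        return math.log2(joint * n_docs / (word_df[word] * word_df[seed]))

    def association(word, seeds):
        # Skip the word itself and seeds that never appear in the corpus.
        return sum(pmi(word, s) for s in seeds if s != word and word_df[s] > 0)

    return {w: association(w, pos_seeds) - association(w, neg_seeds)
            for w in word_df}


# Toy, already-tokenized reviews (hypothetical data).
toy_docs = [
    ["plot", "great"],
    ["plot", "great"],
    ["acting", "boring"],
    ["acting", "boring"],
]
toy_scores = so_pmi(toy_docs, pos_seeds=["great"], neg_seeds=["boring"])
print(toy_scores["plot"] > 0, toy_scores["acting"] < 0)
```

    A positive score means the word co-occurs more with the positive seeds than with the negative ones. In practice the seed lists would come from a sentiment lexicon and the tokens from a Chinese word segmenter.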
    The final presentation of reviews has two features. First, reviews are sorted by sentiment intensity from strongest to weakest, so that users first see the reviews richest in sentiment and content. Second, by presenting matched pairs of opinion words and attribute words, the system gives users a multi-faceted sentiment analysis of the movie they search for, showing how the movie is rated in each attribute category, and on that basis recommends suitable movies to consumers.
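    The two presentation features described above can be sketched as follows. This is an illustrative example only: the Review record, the use of the absolute value of a signed score as "sentiment intensity", the per-category averaging, and the sample data are assumptions, not details taken from the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Review:
    text: str
    score: float  # signed sentiment score; the sign is the polarity (assumed)
    # (attribute category, opinion word, pair score) tuples (assumed layout)
    pairs: list = field(default_factory=list)


def rank_by_intensity(reviews):
    """Sort reviews from strongest to weakest sentiment, ignoring polarity."""
    return sorted(reviews, key=lambda r: abs(r.score), reverse=True)


def summarize_by_attribute(reviews):
    """Average the opinion-attribute pair scores within each category."""
    by_category = defaultdict(list)
    for review in reviews:
        for category, _opinion, pair_score in review.pairs:
            by_category[category].append(pair_score)
    return {c: sum(v) / len(v) for c, v in by_category.items()}


# Hypothetical sample reviews.
sample = [
    Review("fine overall", 0.2, [("plot", "fine", 1.0)]),
    Review("dull story, weak ending", -0.9, [("plot", "dull", -0.5)]),
    Review("solid cast", 0.5, [("acting", "solid", 0.5)]),
]
for review in rank_by_intensity(sample):
    print(review.score, review.text)
print(summarize_by_attribute(sample))
```

    Sorting on the absolute value places strongly negative reviews alongside strongly positive ones at the top of the list. A system that wanted positive reviews first would sort on the signed score instead.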
    Description: Master's thesis
    National Chengchi University
    Department of Statistics
    107354004
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0107354004
    Data Type: thesis
    DOI: 10.6814/NCCU202000667
    Appears in Collections: [Department of Statistics] Theses

    Files in This Item:

    File          Description    Size      Format
    400401.pdf                   5655Kb    Adobe PDF


    All items in 政大典藏 are protected by copyright, with all rights reserved.

