National Chengchi University Institutional Repository (NCCUR): Item 140.119/97113


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/97113


    Title: Predicting Political Affiliation for Posts on Facebook Fan Pages
    (original title: 針對臉書粉絲專頁貼文之政治傾向預測)
    Author: Chang, Che Chia (張哲嘉)
    Contributors: Hsu, Kuo Wei (徐國偉)
    Chang, Che Chia (張哲嘉)
    Keywords: political affiliation
    classification
    Facebook
    text mining
    Date: 2016
    Upload time: 2016-06-01 13:53:37 (UTC+8)
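    Abstract: Social media has risen rapidly in recent years, with Facebook in the lead. Taiwan has more than 15 million Facebook users, ranging from public figures to the general public. These emerging information-exchange platforms contain a great deal of meaningful information: every post carries its author's sentiment and stance. Using social media to predict elections and users' political affiliation has become a trend, and in Taiwan political parties and politicians have set up fan pages one after another, using the Internet and social media to campaign and to predict polls. Building on this observation, this study aims to predict the political affiliation of fan page posts. We collect fan page posts of Taiwan's two major parties, the Kuomintang (KMT) and the Democratic Progressive Party (DPP), and build two prediction models: one based on distinct-word features and one based on textual and interaction features. Using data mining techniques, we build classifiers from the pan-blue and pan-green party features that posts exhibit, design and examine a variety of feature combinations in detail, and compare their predictive performance and influencing factors, as well as whether class imbalance in the data affects the classification results. The results show that combining party-typical words among the textual features with log-scaled interaction features, together with a KNN classifier, works best, reaching an accuracy of 0.908 and an F1-score of 0.827.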
    In recent years, social media has become increasingly popular, especially Facebook. In Taiwan, there are more than 15 million Facebook users, ranging from celebrities to the general public. Receiving information from Facebook every day has become part of most people's lifestyle. These new information-exchange platforms contain many meaningful messages, including users' emotions and affiliations. Moreover, using social media data to predict election results and political affiliation is becoming a trend in Taiwan. For example, politicians try to win elections and predict polls by means of the Internet and social media, and every political party has its own fan page. In this thesis, we aim to predict the political inclination of fan page posts, focusing on the KMT and the DPP, the two largest political parties in Taiwan. We select appropriate textual ("literal") and interaction features, use the posts of the two parties to construct classification models that predict political inclination, and finally compare the performance of different classifiers. The results show that the textual and interaction features work best with a KNN classifier, whose accuracy and F1-score are 0.908 and 0.827, respectively.
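The approach summarized in the abstract (party-typical word features, log-scaled interaction features, a KNN classifier, evaluation by accuracy and F1-score) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the placeholder word lists `KMT_WORDS` and `DPP_WORDS`, the choice of likes, comments, and shares as interaction counts, the use of `log1p`, Euclidean distance, `k=3`, and the invented toy posts.

```python
import math
from collections import Counter

# Hypothetical party-typical word lists; placeholders, not taken from the thesis.
KMT_WORDS = {"blue_a", "blue_b"}
DPP_WORDS = {"green_a", "green_b"}


def featurize(tokens, likes, comments, shares):
    """Turn one word-segmented post into a numeric feature vector."""
    kmt_hits = sum(1 for t in tokens if t in KMT_WORDS)
    dpp_hits = sum(1 for t in tokens if t in DPP_WORDS)
    # log1p compresses the wide range of interaction counts and accepts 0.
    return [
        float(kmt_hits),
        float(dpp_hits),
        math.log1p(likes),
        math.log1p(comments),
        math.log1p(shares),
    ]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Euclidean)."""
    nearest = sorted(zip((math.dist(v, x) for v in train_x), train_y))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy posts: (tokens, likes, comments, shares, label).
TRAIN = [
    (["blue_a", "blue_b", "x"], 1200, 80, 40, "KMT"),
    (["blue_a", "y"], 900, 60, 30, "KMT"),
    (["blue_b", "blue_b"], 1500, 100, 50, "KMT"),
    (["green_a", "green_b"], 5000, 300, 200, "DPP"),
    (["green_a", "z"], 4000, 250, 150, "DPP"),
    (["green_b", "green_a", "green_a"], 6000, 400, 260, "DPP"),
]
train_x = [featurize(*row[:4]) for row in TRAIN]
train_y = [row[4] for row in TRAIN]

new_post = featurize(["green_a"], 4500, 280, 170)
print(knn_predict(train_x, train_y, new_post))
```

Reporting F1 alongside accuracy matters when one class has far fewer posts than the other, since a classifier that always predicts the majority class can still score a high accuracy while its F1 on the minority class is zero.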
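    Description: Master's degree
    National Chengchi University
    Department of Computer Science
    103753002
    Source: http://thesis.lib.nccu.edu.tw/record/#G0103753002
    Type: thesis
    Appears in collections: [Department of Computer Science] Theses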

    Files in this item:

    File          Size      Format       Views
    300201.pdf    3365Kb    Adobe PDF    2371


    All items in NCCUR are protected by the original copyright.

