Improving Conversion Rate Prediction with Review Text
|Issue Date: ||2020-09-02 11:43:53 (UTC+8)|
|Abstract: ||With the rise of e-commerce platforms, consumer purchasing habits have gradually changed, and online reviews have become an important factor in customers' purchase intention. Chevalier and Mayzlin studied this question by taking sales rank as the response variable and fitting a regression model to test the significance of review scores and other review features. They did not extract features directly from the review text, owing to a lack of natural language preprocessing methods. Building on the features proposed by Chevalier and Mayzlin, this thesis investigates whether adding review-text information predicts customer purchasing behavior more effectively. The text is represented by two kinds of TF-IDF features and by CBOW and Skip-gram word-embedding vectors.|
The study uses a review dataset from a travel e-commerce platform and is split into three parts. The first part uses machine learning methods to predict review scores from text features, in order to compare the information carried by the text with that carried by the score; the correlation coefficient between predicted and actual scores lies between 0.2 and 0.4. The second part takes the conversion rate as the prediction target, and the third part predicts whether the next week's conversion rate will go up, go down, or stay constant. In both parts, models that add text features are compared with models built only on scores and other review features. The results show that, on this dataset, adding text features improves the prediction of both the conversion rate and its next-period trend when the previous period's conversion rate is not included in the model; when it is included, the improvement is only slight.
|Reference: || Gerard Salton and Michael J. McGill. Introduction to Modern Information Retrieval, October 1986.|
 Tomas Mikolov, Kai Chen, Greg Corrado and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space, September 2013.
 Judith A. Chevalier and Dina Mayzlin. The Effect of Word of Mouth on Sales: Online Book Reviews, August 2006.
 Eric Clemons, Guodong Gao and Lorin M. Hitt. When Online Reviews Meet Hyperdifferentiation: A Study of the Craft Beer Industry, February 2006.
 Nan Hu, Ling Liu and Jie Zhang. Do Online Reviews Affect Product Sales? The Role of Reviewer Characteristics and Temporal Effects, September 2008.
 Yong Liu. Word of Mouth for Movies: Its Dynamics and Impact on Box Office Revenue, July 2006.
 Jerome H. Friedman. Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, Vol. 29, No. 5, 2001.
 Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye and Tie-Yan Liu. LightGBM: A Highly Efficient Gradient Boosting Decision Tree, December 2017.
 Menno van Zaanen and Pieter Kanters. Automatic Mood Classification Using tf*idf Based on Lyrics. In J. Stephen Downie and Remco C. Veltkamp (eds.), 11th International Society for Music Information Retrieval Conference, August 2010.
 Lun-Wei Ku and Hsin-Hsi Chen. Mining Opinions from the Web: Beyond Relevance Retrieval. Journal of the American Society for Information Science and Technology, 58(12), 1838-1850, August 2007.
|Source URI: ||http://thesis.lib.nccu.edu.tw/record/#G0107354029|
|Data Type: ||thesis|
|Appears in Collections:||[Department of Statistics] Theses|
All items in 政大典藏 are protected by copyright, with all rights reserved.