National Chengchi University Institutional Repository (NCCUR): Item 140.119/139264

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/139264

Title:	Predicting Housing Prices using Clustering-based Stacked Generator - A study on Taoyuan City Actual Price Registration Data (original title: 應用實價登錄建立以聚類方法之堆疊泛化房價預測模型 -以桃園市區分建物房價資料為例)
Authors:	黃允亭 Huang, Yun-Ting
Contributors:	陳樹衡; 鄧筱蓉; 黃允亭 (Huang, Yun-Ting)
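Keywords:	Feature selection; Cluster analysis; Machine learning; Ensemble learning; Stacked generalization; Actual Price Registration; Housing price prediction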
Date:	2022
Issue Date:	2022-03-01 17:52:29 (UTC+8)
Abstract:	This research explores the applicability of combining clustering with stacked generalization for predicting housing prices in Taiwan. Using the most recent available Taoyuan City Actual Price Registration data, we first extend the clustering-based ensemble learning method of Trivedi et al. (2015) to develop two-layer clustering-based stacked generalizers. In the first layer, three machine learning methods (Lasso, KNN, and Decision Tree) are used to construct the cluster models. In the second layer, Linear Regression, Random Forest, and XGBoost are used to build the meta-models. The resulting stacked generalizers are then used to predict housing prices in Taoyuan City, and their prediction accuracy is compared with that of other machine learning methods, including Linear Regression, Random Forest, and XGBoost.
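The abstract describes the two-layer architecture only at a high level, so the following is a minimal sketch under stated assumptions rather than the thesis's implementation. It assumes k-means as the clustering step, one regressor fitted per cluster in the first layer, and scikit-learn's `StackingRegressor` (with its default cross-validated predictions) as the stacking mechanism. The `ClusterRegressor` class, the synthetic data, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.cluster import KMeans
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor


class ClusterRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical helper: partition X with k-means, fit one regressor per cluster."""

    def __init__(self, base=None, n_clusters=3, random_state=0):
        self.base = base
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.kmeans_ = KMeans(n_clusters=self.n_clusters, n_init=10,
                              random_state=self.random_state).fit(X)
        self.models_ = {}
        for c in range(self.n_clusters):
            mask = self.kmeans_.labels_ == c
            # Each cluster gets its own fresh copy of the base learner.
            self.models_[c] = clone(self.base).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        X = np.asarray(X)
        labels = self.kmeans_.predict(X)  # route each row to its nearest centroid
        y_hat = np.empty(X.shape[0])
        for c, model in self.models_.items():
            mask = labels == c
            if mask.any():
                y_hat[mask] = model.predict(X[mask])
        return y_hat


# Synthetic stand-in for the registration data (assumption, not real prices).
X, y = make_regression(n_samples=400, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Layer 1: cluster models built from Lasso, KNN, and a decision tree.
layer_one = [
    ("lasso", ClusterRegressor(base=Lasso())),
    ("knn", ClusterRegressor(base=KNeighborsRegressor())),
    ("tree", ClusterRegressor(base=DecisionTreeRegressor(random_state=0))),
]

# Layer 2: a meta-model trained on out-of-fold layer-1 predictions.
stack = StackingRegressor(estimators=layer_one,
                          final_estimator=LinearRegression())
stack.fit(X_train, y_train)
pred = stack.predict(X_test)
print("held-out R^2:", stack.score(X_test, y_test))
```

Under the same assumptions, the meta-model could be swapped for `RandomForestRegressor` or an XGBoost regressor by changing `final_estimator`. Real housing features would likely need scaling before k-means and KNN, which this sketch omits.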
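Reference:	[1] Altman, N. S. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. The American Statistician, 46(3), 175–185.
[2] Breiman, L. (1996a). Bagging Predictors. Machine Learning, 24(2), 123–140.
[3] Breiman, L. (1996b). Stacked Regressions. Machine Learning, 24(1), 49–64.
[4] Breiman, L. (2001). Random Forests. Machine Learning, 45, 5–32.
[5] Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall/CRC, 368.
[6] Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. KDD, 785–794.
[7] Efron, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 7(1), 1–26.
[8] Frank, A., & Asuncion, A. (2010). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml.
[9] Freund, Y. (1995). Boosting a Weak Learning Algorithm by Majority. Information and Computation, 121(2), 256–285.
[10] Freund, Y., & Schapire, R. E. (1997). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55(1), 119–139.
[11] Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22.
[12] Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189–1232.
[13] Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene Selection for Cancer Classification Using Support Vector Machines. Machine Learning, 46(1–3), 389–422.
[14] Ho, T. K. (1995). Random Decision Forests. Proceedings of 3rd International Conference on Document Analysis and Recognition, 278–282.
[15] Huang, S. Y., Yu, F., Tsaih, R. H., & Huang, Y. (2014). Resistant Learning on the Envelope Bulk for Identifying Anomalous Patterns. 2014 International Joint Conference on Neural Networks (IJCNN), 3303–3310.
[16] Schapire, R. E. (1990). The Strength of Weak Learnability. Machine Learning, 5, 197–227.
[17] Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
[18] Ting, K. M., & Witten, I. H. (1997). Stacked Generalization: When Does It Work? Hamilton, New Zealand: University of Waikato, Department of Computer Science.
[19] Trivedi, S., Pardos, Z. A., & Heffernan, N. T. (2015). The Utility of Clustering in Prediction Tasks. arXiv:1509.06163.
[20] Van der Laan, M. J., Polley, E. C., & Hubbard, A. E. (2007). Super Learner. Statistical Applications in Genetics and Molecular Biology, 6(25).
[21] Wolpert, D. H. (1992). Stacked Generalization. Neural Networks, 5(2), 241–259.
[22] Zou, H., & Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 67(2), 301–320.
[23] 何睿婷 (2018). Housing price prediction based on heterogeneous ensemble learning methods [in Chinese]. 通訊世界, 10, 296–297.
[24] 吳晏榕 (2010). A study on the application of housing price indices to bank asset revaluation [in Chinese]. Unpublished master's thesis, Graduate Institute of Economics, National Chengchi University, Taipei.
[25] 洪淑娟, & 雷立芬 (2010). A study of the interaction between prices of existing homes, pre-sale and newly built homes, and macroeconomic variables [in Chinese]. 臺灣銀行季刊, 61(1), 155–167.
[26] 洪鴻智, & 張能政 (2006). The value discovery process of real estate appraisers: Appraisal procedures and the choice of reference points [in Chinese]. 建築與規劃學報, 7(1), 71–90.
[27] 郁嘉綾 (2018). Building a real estate price model for Hangzhou using big data [in Chinese]. Unpublished master's thesis, Graduate Institute of Statistics, National Chengchi University, Taipei.
[28] 張曦方 (1994). An investigation of price differences across residential floors: The case of Taipei City [in Chinese]. Unpublished master's thesis, Graduate Institute of Land Economics, National Chengchi University, Taipei.
[29] 陳敬筌 (2019). Predicting regional average housing prices with deep learning: The case of Taipei City Actual Price Registration data [in Chinese]. Unpublished master's thesis, Executive Master's Program, Department of Information Management, Ming Chuan University, Taipei.
[30] 陳樹衡, 郭子文, & 棗厥庸 (2007). Building a housing price model with regression trees: An empirical analysis of Taiwan [in Chinese]. 住宅學報, 16(1), 1–20.
[31] 馮世傑 (2014). An investigation of variables affecting housing prices: The case of Taipei City [in Chinese]. Unpublished master's thesis, Graduate Institute of International Trade, Soochow University, Taipei.
[32] 黃佳鈴, & 張金鶚 (2005). Land price indices and the assessment of announced current land values from the perspective of separating land and building prices [in Chinese]. 台灣土地研究, 8(2), 73–106.
[33] 楊博文, & 曹布陽 (2017). A housing price prediction model based on ensemble learning [in Chinese]. 電腦知識與技術, 13(29), 191–194.
[34] 蔡育政 (2009). A study of factors affecting real estate prices: The case of the Beitun, Xitun, Nantun, Central, and East districts of Taichung City [in Chinese]. Unpublished master's thesis, Graduate Institute of Finance, Chaoyang University of Technology, Taichung.
[35] 蔡育展 (2017). Machine learning and real estate appraisal [in Chinese]. Unpublished master's thesis, Graduate Institute of Information Management, National Chengchi University, Taipei.
[36] 蔡瑞煌, 高明志, & 張金鶚 (1999). A study on the application of artificial neural networks to real estate appraisal [in Chinese]. 住宅學報, 8, 1–20.
[37] 賴碧瑩 (2007). Applying artificial neural networks to computer-assisted mass appraisal [in Chinese]. 住宅學報, 16(2), 43–65.
[38] 謝明穎 (2017). Building a visualization platform for housing price prediction using machine learning methods [in Chinese]. Unpublished master's thesis, Graduate Institute of Applied Statistics, Department of Statistics and Information Science, Fu Jen Catholic University, New Taipei City.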
Description:	Master's thesis; Department of Economics, National Chengchi University; 107258025
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0107258025
Data Type:	thesis
DOI:	10.6814/NCCU202200343
Appears in Collections:	[Department of Economics] Theses

Files in This Item:

File	Description	Size	Format
802501.pdf		4583Kb	Adobe PDF

All items in the NCCU Institutional Repository are protected by copyright, with all rights reserved.
