    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/158369


    Title: 利用合成文本提升電影推薦系統的效能:RAG框架的實證分析
    Enhancing Movie Recommendation Systems with Synthetic Texts: An Empirical Study Using the RAG Framework
    Authors: 陳鴻文
    Chen, Hung-Wen
    Contributors: 蔡炎龍
    Tsai, Yen-Lung
    陳鴻文
    Chen, Hung-Wen
    Keywords: 檢索增強生成
    大型語言模型
    電影推薦系統
    合成文本
    語意檢索生成
    RAG
    Retrieval-augmented generation
    Large language model
    Movie recommendation system
    Synthetic text
    Semantic retrieval generation
    Date: 2025
    Issue Date: 2025-08-04 13:10:29 (UTC+8)
Abstract: 本研究旨在開發一套能夠理解使用者口語化觀影偏好並以同樣自然語言回應建議的電影推薦系統。鑑於大型語言模型(LLM)在網路資料上訓練時所產生的不穩定性或幻覺等問題,本文引入檢索增強生成(Retrieval-Augmented Generation, RAG)機制以提升推薦內容的準確性與穩定性。首先,RAG 透過向量檢索自有電影資料庫,確保所擷取之上下文資訊正確無誤;接著,將檢索結果與使用者查詢一併輸入 LLM,生成語義豐富且具語境連貫性的推薦建議,兼顧正確性與對話自然度。
系統同時整合外部電影資料庫,透過 RAG 即時加入新上映電影,以提升推薦準確度;並將結構化資料轉換為非結構化文本,在統一框架中結合向量檢索與生成模型進行處理。此外,我們設計實驗生成並評估 LLM 產生的合成文本,以增強電影概述的敘事深度、連貫性與說服力。為克服中文資料較少之問題,模型透過使用英文資料,產出英文以及中文兩種推薦,以驗證在英文資料基礎上之中文推薦的可行性與準確性。最後,本框架結合 LLM 的語意理解能力與 RAG 的精確檢索機制,能從自由描述的查詢中自動推斷使用者偏好,並提供個性化建議,為未來對話式推薦系統之研究與應用提供實務可行之實證。
    This study proposes a movie recommendation system that interprets users’ colloquial viewing preferences and responds with natural-language suggestions. To address the instability and hallucination issues common in large language models (LLMs) trained on heterogeneous data, we adopt a Retrieval-Augmented Generation (RAG) framework. The system first retrieves relevant context from a self-constructed movie database via vector search, then combines the results with user queries to generate coherent and factually grounded recommendations.
    To enhance relevance, external movie databases are integrated, enabling dynamic updates with newly released films. Structured metadata is converted into unstructured text, allowing both retrieval and generation to operate within a unified text-based pipeline. We further evaluate the LLM’s ability to generate synthetic overviews with improved narrative quality. To mitigate the lack of Chinese-language data, English resources are leveraged to generate recommendations in both English and Chinese, demonstrating cross-lingual transferability. By combining the semantic understanding of LLMs with RAG’s precision, the system infers user intent from free-form input and delivers personalized, context-aware suggestions, providing a robust foundation for future conversational recommender systems.
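The retrieve-then-generate pipeline the abstract describes (flattening structured metadata into text, vector retrieval over that text, and assembling the retrieved context with the user query into an LLM prompt) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the field names are hypothetical, and a toy bag-of-words vector stands in for the learned sentence embeddings a real RAG system would use.

```python
import math
from collections import Counter

def metadata_to_text(movie: dict) -> str:
    """Flatten a structured movie record into a prose passage so that
    retrieval and generation share one text-based pipeline.
    Field names here are illustrative, not the thesis's actual schema."""
    text = f"{movie['title']} ({movie['year']}) is a {', '.join(movie['genres'])} film"
    if movie.get("director"):
        text += f" directed by {movie['director']}"
    if movie.get("overview"):
        text += f". {movie['overview']}"
    return text

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would use a sentence-embedding
    # model, but the top-k cosine retrieval step is structurally the same.
    return Counter(text.lower().replace(",", "").replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    # Rank stored passages by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

def build_prompt(query: str, contexts: list[str]) -> str:
    # Ground the LLM in the retrieved context to reduce hallucination.
    ctx = "\n".join(f"- {c}" for c in contexts)
    return ("Using only the movie information below, recommend a film "
            "and explain why.\n"
            f"Context:\n{ctx}\n"
            f"User request: {query}\nRecommendation:")

movies = [
    {"title": "Inception", "year": 2010, "genres": ["sci-fi", "thriller"],
     "director": "Christopher Nolan",
     "overview": "A thief steals secrets through shared dreams."},
    {"title": "Spirited Away", "year": 2001, "genres": ["animation", "fantasy"],
     "director": "Hayao Miyazaki",
     "overview": "A girl must work in a bathhouse for spirits."},
]
passages = [metadata_to_text(m) for m in movies]
query = "I want an animation fantasy movie"
prompt = build_prompt(query, retrieve(query, passages, k=1))
# The assembled prompt would then be sent to an LLM for the final answer.
```

In the sketch the query about animated fantasy retrieves the Spirited Away passage, and only that grounded context reaches the prompt; swapping the toy embedding for a sentence-embedding model and the prompt sink for an actual LLM call yields the retrieve-then-generate loop the abstract describes.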
Description: 碩士 (Master's thesis)
國立政治大學 (National Chengchi University)
應用數學系 (Department of Applied Mathematics)
110751014
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0110751014
    Data Type: thesis
Appears in Collections: [應用數學系] 學位論文 (Department of Applied Mathematics: Theses)

    Files in This Item:

101401.pdf (5,672 KB, Adobe PDF)

