Please use this persistent URL to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/159317
| Title: | RAG-Free Contextual Extension for LLMs: A Study on KV-Cache and Calculus AI Tutoring (original title: 大型語言模型的非檢索式上下文延展機制研究:從鍵值緩存到微積分AI教師) |
| Author: | Sun, Yi-Jia (孫翊珈) |
| Contributors: | Tsai, Yen-Lung (蔡炎龍); Sun, Yi-Jia (孫翊珈) |
| Keywords: | Large Language Models; Key-Value Cache; Context Extension; AI Tutoring System; Retrieval-Free Generation |
| Date: | 2025 |
| Upload time: | 2025-09-01 16:30:03 (UTC+8) |
| Abstract: | With the rapid advancement of Large Language Models (LLMs) in natural language processing, their applications have increasingly extended into educational settings. However, LLMs are constrained by a limited context window, which makes it difficult to maintain contextual coherence and logical consistency across long instructional materials and multi-turn teaching dialogues. Traditional remedies such as Retrieval-Augmented Generation (RAG) can incorporate external knowledge, but they are prone to retrieval bias and context fragmentation, which reduces their effectiveness in educational applications.
This study proposes a retrieval-free context extension strategy based on the Key-Value Cache (KV-Cache) and implements a calculus-focused AI tutoring system built on LaTeX-based calculus materials. The system feeds the material into the model incrementally through a chunked prefill strategy and caches the intermediate key-value computations, allowing the model to retain context across subsequent teaching interactions, save computation, and improve semantic coherence. The experiments compare the proposed KV-Cache system with a RAG-based system and a no-cache baseline, evaluating memory usage, response latency, and teaching continuity.
The results show that the KV-Cache mechanism effectively improves contextual coherence and significantly reduces response latency in long-text teaching scenarios, demonstrating its potential for AI-driven tutoring systems. (A minimal illustrative sketch of the chunked-prefill mechanism is given after the record fields below.) |
| Description: | Master's thesis; National Chengchi University; Department of Applied Mathematics; 111751001 |
| Source: | http://thesis.lib.nccu.edu.tw/record/#G0111751001 |
| Data type: | thesis |
| Appears in collections: | [Department of Applied Mathematics] Theses |
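The sketch below illustrates the chunked-prefill KV-Cache idea described in the abstract: long teaching material is encoded once, chunk by chunk, and the accumulated cache is reused for later questions so the material is never re-encoded or retrieved. It is a minimal sketch assuming the Hugging Face transformers API; the model name, chunk size, file name, and generation settings are illustrative assumptions, not details taken from the thesis.

```python
# Minimal sketch (not the thesis code) of chunked prefill with KV-Cache reuse.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"   # placeholder; the thesis model is not specified here
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).to(device).eval()


def chunked_prefill(material: str, chunk_tokens: int = 512):
    """Prefill long teaching material chunk by chunk, accumulating the
    KV-Cache (past_key_values) so the material is encoded only once."""
    ids = tokenizer(material, return_tensors="pt").input_ids.to(device)
    past = None
    with torch.no_grad():
        for start in range(0, ids.shape[1], chunk_tokens):
            chunk = ids[:, start:start + chunk_tokens]
            out = model(input_ids=chunk, past_key_values=past, use_cache=True)
            past = out.past_key_values   # cache grows with each chunk
    return past, ids


def answer(question: str, past, prefix_ids, max_new_tokens: int = 200):
    """Answer a question by reusing the cached textbook context instead of
    re-encoding or retrieving it (retrieval-free context extension)."""
    q_ids = tokenizer(question, return_tensors="pt",
                      add_special_tokens=False).input_ids.to(device)
    full_ids = torch.cat([prefix_ids, q_ids], dim=1)
    out = model.generate(
        full_ids,
        attention_mask=torch.ones_like(full_ids),
        past_key_values=past,        # only the question tokens are newly processed
        use_cache=True,
        max_new_tokens=max_new_tokens,
    )
    # generate() returns prompt + completion; keep only the new tokens.
    return tokenizer.decode(out[0, full_ids.shape[1]:], skip_special_tokens=True)


# Hypothetical usage: prefill one LaTeX chapter once, then ask questions.
# In practice the prefill cache would be copied (e.g. copy.deepcopy) per question,
# because generate() appends new tokens to the cache in place.
# past, prefix_ids = chunked_prefill(open("calculus_ch1.tex").read())
# print(answer("How is the derivative of x^2 obtained from the limit definition?", past, prefix_ids))
```

Compared with a RAG baseline, this arrangement avoids a retrieval step entirely: latency after prefill depends mainly on the number of new question tokens, at the cost of holding the full cache in memory.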
Files in this item:
| File | Size | Format | Views |
| 100101.pdf | 1103Kb | Adobe PDF | 0 |
All items in the NCCU Institutional Repository (NCCUR) are protected by copyright.