Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/157750
Title: | Research on Diffusion Model Scheduler
Authors: | 謝竣宇 Hsieh, Chun-Yu |
Contributors: | 蔡炎龍 Tsai, Yen-Lung; 謝竣宇 Hsieh, Chun-Yu
Keywords: | Diffusion Models; Schedulers; Image Generation
Date: | 2025 |
Issue Date: | 2025-07-01 14:40:43 (UTC+8) |
Abstract: | This study focuses on image generation techniques, particularly on how different sampling schedulers affect the quality and efficiency of generated results. Traditionally, Generative Adversarial Networks (GANs) dominated the field, producing high-quality images through adversarial training. In recent years, diffusion models have emerged as a powerful alternative: they generate samples through a gradual process of noise addition and denoising, offering strong stability and image fidelity. Latent Diffusion Models (LDMs), which operate in a compressed latent space, further reduce computational cost and serve as the core architecture behind models such as Stable Diffusion. Common sampling strategies include DDPM (Denoising Diffusion Probabilistic Models), DDIM (Denoising Diffusion Implicit Models), and methods based on Stochastic Differential Equations (SDEs) and Ordinary Differential Equations (ODEs). Among these, DDPM tends to be slow, while DDIM improves efficiency through deterministic inference; SDE- and ODE-based approaches reformulate sampling within a continuous-time mathematical framework. This study compares these methods in terms of sample quality, runtime, and convergence across different numbers of sampling steps. Experimental results show that deterministic samplers enhance output stability, with ODE-based methods achieving the fastest generation. Apart from occasional non-convergence of the SDE and DDPM samplers, all methods demonstrate reliable convergence and practical effectiveness.
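The continuous-time framework mentioned in the abstract can be made concrete. In the score-based formulation of Song et al. [18], forward noising is a stochastic differential equation, and sampling amounts to solving either its reverse-time SDE (stochastic, DDPM-like behavior) or the associated probability-flow ODE (deterministic), which shares the same marginal distributions p_t:

```latex
% Forward noising SDE:
dx = f(x, t)\,dt + g(t)\,dw
% Reverse-time SDE (stochastic sampling):
dx = \big[ f(x, t) - g(t)^2 \nabla_x \log p_t(x) \big]\,dt + g(t)\,d\bar{w}
% Probability-flow ODE (deterministic sampling, same marginals p_t):
dx = \big[ f(x, t) - \tfrac{1}{2} g(t)^2 \nabla_x \log p_t(x) \big]\,dt
```

Below is a minimal sketch of the kind of scheduler comparison the abstract describes, written against the Hugging Face diffusers library; the model ID, prompt, seed, and step counts are illustrative assumptions, not the thesis's actual experimental configuration.

```python
# A minimal sketch of a scheduler comparison, assuming the Hugging Face
# `diffusers` library. The model ID, prompt, seed, and step counts are
# illustrative placeholders, not the thesis's experimental configuration.
import time

import torch
from diffusers import (
    DDIMScheduler,                # deterministic sampler (DDIM)
    DDPMScheduler,                # stochastic ancestral sampler (DDPM)
    DPMSolverMultistepScheduler,  # fast ODE-based solver (DPM-Solver++)
    StableDiffusionPipeline,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

schedulers = {
    "DDPM": DDPMScheduler,
    "DDIM": DDIMScheduler,
    "DPM-Solver++": DPMSolverMultistepScheduler,
}

prompt = "a photograph of a mountain lake at sunrise"  # illustrative prompt
for name, scheduler_cls in schedulers.items():
    # Rebuild each scheduler from the pipeline's existing config so the
    # underlying noise schedule stays identical across samplers.
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    for steps in (10, 25, 50):
        # Fix the seed so differences come from the sampler, not the noise draw.
        generator = torch.Generator("cuda").manual_seed(0)
        start = time.perf_counter()
        image = pipe(prompt, num_inference_steps=steps, generator=generator).images[0]
        elapsed = time.perf_counter() - start
        image.save(f"{name}_{steps}steps.png")
        print(f"{name:12s} steps={steps:3d} time={elapsed:6.2f}s")
```

Fixing the seed across schedulers keeps the comparison fair: for deterministic samplers such as DDIM and DPM-Solver++, outputs at increasing step counts converge toward a common image, which matches the stability behavior the abstract reports.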
Reference: | [1] John Butcher. Runge-Kutta methods. Scholarpedia, 2(9):3147, 2007.
[2] Ting Chen. On the importance of noise scheduling for diffusion models. arXiv preprint arXiv:2301.10972, 2023.
[3] Robert J. Elliott and Brian D. O. Anderson. Reverse time diffusions. Stochastic Processes and their Applications, 19(2):327–339, 1985.
[4] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.
[5] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
[6] Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6:695–709, 2005.
[7] Diederik P. Kingma and Max Welling. An introduction to variational autoencoders. Foundations and Trends in Machine Learning, 12(4):307–392, 2019.
[8] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. Advances in Neural Information Processing Systems, 35:5775–5787, 2022.
[9] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver++: Fast solver for guided sampling of diffusion probabilistic models. arXiv preprint arXiv:2211.01095, 2022.
[10] Shakir Mohamed and Balaji Lakshminarayanan. Learning in implicit generative models. arXiv preprint arXiv:1610.03483, 2016.
[11] Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pages 8162–8171. PMLR, 2021.
[12] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[13] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[14] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pages 2256–2265. PMLR, 2015.
[15] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. CoRR, abs/2010.02502, 2020.
[16] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 32, 2019.
[17] Yang Song, Sahaj Garg, Jiaxin Shi, and Stefano Ermon. Sliced score matching: A scalable approach to density and score estimation. In Uncertainty in Artificial Intelligence, pages 574–584. PMLR, 2020.
[18] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
[19] Mingtian Zhang, Tim Z. Xiao, Brooks Paige, and David Barber. Improving VAE-based representation learning. arXiv preprint, 2022.
[20] Wenliang Zhao, Lujia Bai, Yongming Rao, Jie Zhou, and Jiwen Lu. UniPC: A unified predictor-corrector framework for fast sampling of diffusion models. Advances in Neural Information Processing Systems, 36:49842–49869, 2023. |
Description: | Master's thesis, Department of Applied Mathematics, National Chengchi University (student ID 110751013)
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110751013 |
Data Type: | thesis |
Appears in Collections: | [Department of Applied Mathematics] Theses
Files in This Item:
File | Description | Size | Format
101301.pdf | | 17427 KB | Adobe PDF
All items in 政大典藏 are protected by copyright, with all rights reserved.