Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/157750
Title: | Research on Diffusion Model Scheduler
Authors: | 謝竣宇 Hsieh, Chun-Yu |
Contributors: | 蔡炎龍 Tsai, Yen-Lung; 謝竣宇 Hsieh, Chun-Yu
Keywords: | Diffusion Models; Schedulers; Image Generation
Date: | 2025 |
Issue Date: | 2025-07-01 14:40:43 (UTC+8) |
Abstract: | This study focuses on image generation techniques, particularly on how different sampling schedulers affect the quality and efficiency of generated results. Traditionally, Generative Adversarial Networks (GANs) dominated the field, producing high-quality images through adversarial training. In recent years, diffusion models have emerged as a powerful alternative: they generate samples through a gradual process of noise addition and denoising, offering strong stability and image fidelity. Latent Diffusion Models (LDMs), which operate in a compressed latent space, further reduce computational cost and serve as the core architecture behind models such as Stable Diffusion. Common sampling strategies include DDPM (Denoising Diffusion Probabilistic Models), DDIM (Denoising Diffusion Implicit Models), and methods based on Stochastic Differential Equations (SDEs) and Ordinary Differential Equations (ODEs). Among these, DDPM tends to be slow, while DDIM improves efficiency through deterministic inference; SDE- and ODE-based approaches reformulate sampling within a continuous-time mathematical framework. This study compares these methods in terms of sample quality, runtime, and convergence across different numbers of sampling steps. Experimental results show that deterministic samplers enhance output stability, with ODE-based methods achieving the fastest generation. Apart from occasional non-convergence of the SDE and DDPM samplers, all methods demonstrate reliable convergence and practical effectiveness.
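The continuous-time framework mentioned in the abstract can be made concrete. In the score-based formulation of Song et al. [18], forward noising is a stochastic differential equation, and sampling amounts to solving either its reverse-time SDE (stochastic, DDPM-like behavior) or the associated probability-flow ODE (deterministic), which shares the same marginal distributions p_t:

```latex
% Forward noising SDE:
dx = f(x, t)\,dt + g(t)\,dw
% Reverse-time SDE (stochastic sampling):
dx = \big[ f(x, t) - g(t)^2 \nabla_x \log p_t(x) \big]\,dt + g(t)\,d\bar{w}
% Probability-flow ODE (deterministic sampling, same marginals p_t):
dx = \big[ f(x, t) - \tfrac{1}{2} g(t)^2 \nabla_x \log p_t(x) \big]\,dt
```

Below is a minimal sketch of the kind of scheduler comparison the abstract describes, written against the Hugging Face diffusers library; the model ID, prompt, seed, and step counts are illustrative assumptions, not the thesis's actual experimental configuration.

```python
# A minimal sketch of a scheduler comparison, assuming the Hugging Face
# `diffusers` library. The model ID, prompt, seed, and step counts are
# illustrative placeholders, not the thesis's experimental configuration.
import time

import torch
from diffusers import (
    DDIMScheduler,                # deterministic sampler (DDIM)
    DDPMScheduler,                # stochastic ancestral sampler (DDPM)
    DPMSolverMultistepScheduler,  # fast ODE-based solver (DPM-Solver++)
    StableDiffusionPipeline,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

schedulers = {
    "DDPM": DDPMScheduler,
    "DDIM": DDIMScheduler,
    "DPM-Solver++": DPMSolverMultistepScheduler,
}

prompt = "a photograph of a mountain lake at sunrise"  # illustrative prompt
for name, scheduler_cls in schedulers.items():
    # Rebuild each scheduler from the pipeline's existing config so the
    # underlying noise schedule stays identical across samplers.
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    for steps in (10, 25, 50):
        # Fix the seed so differences come from the sampler, not the noise draw.
        generator = torch.Generator("cuda").manual_seed(0)
        start = time.perf_counter()
        image = pipe(prompt, num_inference_steps=steps, generator=generator).images[0]
        elapsed = time.perf_counter() - start
        image.save(f"{name}_{steps}steps.png")
        print(f"{name:12s} steps={steps:3d} time={elapsed:6.2f}s")
```

Fixing the seed across schedulers keeps the comparison fair: for deterministic samplers such as DDIM and DPM-Solver++, outputs at increasing step counts converge toward a common image, which matches the stability behavior the abstract reports.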
Reference: | [1] John Butcher. Runge-Kutta methods. Scholarpedia, 2(9):3147, 2007.
[2] Ting Chen. On the importance of noise scheduling for diffusion models. arXiv preprint arXiv:2301.10972, 2023.
[3] Robert J. Elliott and Brian D. O. Anderson. Reverse time diffusions. Stochastic Processes and their Applications, 19(2):327–339, 1985.
[4] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.
[5] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
[6] Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6:695–709, 2005.
[7] Diederik P. Kingma and Max Welling. An introduction to variational autoencoders. Foundations and Trends in Machine Learning, 12(4):307–392, 2019.
[8] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. Advances in Neural Information Processing Systems, 35:5775–5787, 2022.
[9] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver++: Fast solver for guided sampling of diffusion probabilistic models. arXiv preprint arXiv:2211.01095, 2022.
[10] Shakir Mohamed and Balaji Lakshminarayanan. Learning in implicit generative models. arXiv preprint arXiv:1610.03483, 2016.
[11] Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pages 8162–8171. PMLR, 2021.
[12] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021.
[13] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
[14] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning, pages 2256–2265. PMLR, 2015.
[15] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. CoRR, abs/2010.02502, 2020.
[16] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in Neural Information Processing Systems, 32, 2019.
[17] Yang Song, Sahaj Garg, Jiaxin Shi, and Stefano Ermon. Sliced score matching: A scalable approach to density and score estimation. In Uncertainty in Artificial Intelligence, pages 574–584. PMLR, 2020.
[18] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
[19] Mingtian Zhang, Tim Z. Xiao, Brooks Paige, and David Barber. Improving VAE-based representation learning. arXiv preprint, 2022.
[20] Wenliang Zhao, Lujia Bai, Yongming Rao, Jie Zhou, and Jiwen Lu. UniPC: A unified predictor-corrector framework for fast sampling of diffusion models. Advances in Neural Information Processing Systems, 36:49842–49869, 2023. |
Description: | Master's thesis, Department of Applied Mathematics, National Chengchi University (student ID 110751013)
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110751013 |
Data Type: | thesis |
Appears in Collections: | [Department of Applied Mathematics] Theses
Files in This Item:
File | Description | Size | Format
101301.pdf | | 17427 KB | Adobe PDF
All items in 政大典藏 are protected by copyright, with all rights reserved.