Academic Journal

PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher ...

Bibliographic Details
Title: PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher ...
Authors: Kim, Dongjun; Lai, Chieh-Hsin; Liao, Wei-Hsiang; Takida, Yuhta; Murata, Naoki; Uesaka, Toshimitsu; Mitsufuji, Yuki; Ermon, Stefano
Publication Information: arXiv
Publication Year: 2024
Collection: DataCite Metadata Store (German National Library of Science and Technology)
Subject Terms: Computer Vision and Pattern Recognition cs.CV, Artificial Intelligence cs.AI, Machine Learning cs.LG, Machine Learning stat.ML, FOS Computer and information sciences
Description: To accelerate sampling, diffusion models (DMs) are often distilled into generators that directly map noise to data in a single step. In this approach, the resolution of the generator is fundamentally limited by that of the teacher DM. To overcome this limitation, we propose Progressive Growing of Diffusion Autoencoder (PaGoDA), a technique to progressively grow the resolution of the generator beyond that of the original teacher DM. Our key insight is that a pre-trained, low-resolution DM can be used to deterministically encode high-resolution data to a structured latent space by solving the PF-ODE forward in time (data-to-noise), starting from an appropriately down-sampled image. Using this frozen encoder in an auto-encoder framework, we train a decoder by progressively growing its resolution. Because the decoder is grown progressively, PaGoDA avoids re-training the teacher and student models when the student model is upsampled, which makes the whole training pipeline much cheaper. In experiments, we used our ...
Document Type: article in journal/newspaper
report
Language: unknown
DOI: 10.48550/arxiv.2405.14822
Availability: https://doi.org/10.48550/arxiv.2405.14822
https://arxiv.org/abs/2405.14822
Rights: Creative Commons Attribution Non Commercial Share Alike 4.0 International ; https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode ; cc-by-nc-sa-4.0
Accession Number: edsbas.13D20799
Database: BASE