Hi authors,
Thanks for the great work and for open-sourcing the REPA-E codebase! I am currently studying your VAE regularization setup and noticed a discrepancy between the paper and the code.
In the paper (arXiv:2504.10483v3), the method and experiments sections explicitly state that the reconstruction loss uses MSE:
"In particular, following [stabilityai2025sdvae], we use three losses, 1) Reconstruction Losses ($\mathcal{L}_{\mathrm{MSE}},\mathcal{L}_{\mathrm{LPIPS}}$)..."
"The VAE regularization loss combines multiple objectives and is defined as: $\mathcal{L}_{\mathrm{REG}} = \mathcal{L}_{\mathrm{KL}} + \mathcal{L}_{\mathrm{MSE}} + \mathcal{L}_{\mathrm{LPIPS}} + \mathcal{L}_{\mathrm{GAN}}$."
However, in the provided code (e.g., inside `l1_lpips_kl_gan.yaml`), the `reconstruction_loss` parameter is explicitly set to `"l1"` instead of `"l2"`/MSE.
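To be concrete about the difference I am asking about, here is a minimal pure-Python sketch of the two pixel-wise losses. The `reconstruction_loss` function and its `kind` argument are hypothetical names I made up for illustration; they are not taken from the REPA-E codebase, and the real implementation presumably operates on tensors.

```python
def reconstruction_loss(pred, target, kind="l1"):
    """Mean pixel-wise reconstruction loss over two equal-length sequences.

    Hypothetical helper for illustration only, not the REPA-E implementation.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must be non-empty")
    diffs = [p - t for p, t in zip(pred, target)]
    if kind == "l1":
        # Mean absolute error: grows linearly with the residual.
        return sum(abs(d) for d in diffs) / len(diffs)
    if kind == "l2":
        # Mean squared error: grows quadratically, so large residuals dominate.
        return sum(d * d for d in diffs) / len(diffs)
    raise ValueError(f"unknown reconstruction loss: {kind!r}")


pred = [0.0, 0.5, 1.0, 3.0]
target = [0.0, 0.0, 0.0, 0.0]
l1 = reconstruction_loss(pred, target, "l1")
l2 = reconstruction_loss(pred, target, "l2")
print(l1, l2)
```

The two choices weight large residuals very differently, which is why I would expect the setting to affect the reported reconstruction metrics.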
My questions are:
- Is $\mathcal{L}_{\mathrm{MSE}}$ in the paper just a notational convention for pixel-wise reconstruction loss, while the actual experiments used L1 loss to prevent blurry reconstructions?
- If we aim to exactly reproduce the metrics reported in the paper, should we keep the `l1` configuration provided in the YAML, or do we need to switch it to `l2` (MSE)?
Thank you in advance for clarifying this!