Clarification request: VAE Reconstruction Loss discrepancy between paper ($\mathcal{L}_{\mathrm{MSE}}$) and open-sourced config (L1) #29

@XIONGPEILIN

Hi authors,

Thanks for the great work and for open-sourcing the REPA-E codebase! I am currently studying your VAE regularization setup and noticed a discrepancy between the paper and the code.

In the paper (arXiv:2504.10483v3), the method and experiments sections explicitly state that the reconstruction loss uses MSE:

"In particular, following [stabilityai2025sdvae], we use three losses, 1) Reconstruction Losses ($\mathcal{L}_{\mathrm{MSE}},\mathcal{L}_{\mathrm{LPIPS}}$)..."
"The VAE regularization loss combines multiple objectives and is defined as: $\mathcal{L}_\mathrm{REG} = \mathcal{L}_\mathrm{KL} + \mathcal{L}_\mathrm{MSE} + \mathcal{L}_\mathrm{LPIPS} + \mathcal{L}_\mathrm{GAN}$."

However, in the provided code (e.g., inside l1_lpips_kl_gan.yaml), the reconstruction_loss parameter is explicitly set to "l1" instead of "l2"/MSE.
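
To make the distinction concrete, here is a minimal sketch of the two pixel-wise losses in question. The function name `reconstruction_loss`, its `kind` argument, and the toy inputs are hypothetical; they only mirror the "l1"/"l2" values mentioned above and are not taken from the REPA-E code.

```python
def reconstruction_loss(x, x_hat, kind="l1"):
    """Mean pixel-wise loss between two equal-length sequences of floats.

    Hypothetical helper for illustration only, not the REPA-E implementation.
    """
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    diffs = [a - b for a, b in zip(x, x_hat)]
    if kind == "l1":
        # Mean absolute error.
        return sum(abs(d) for d in diffs) / len(diffs)
    if kind == "l2":
        # Mean squared error (MSE).
        return sum(d * d for d in diffs) / len(diffs)
    raise ValueError(f"unknown reconstruction loss: {kind!r}")


# Toy example: one of four pixels is off by 0.5.
target = [0.0, 0.5, 1.0, 0.0]
recon = [0.0, 0.5, 0.5, 0.0]

l1 = reconstruction_loss(target, recon, "l1")  # 0.5 / 4
l2 = reconstruction_loss(target, recon, "l2")  # 0.25 / 4
print(l1, l2)
```

For errors smaller than 1, the squared term is smaller than the absolute term, so the two settings weight the same residual differently and are not interchangeable when reproducing reported metrics.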

My questions are:

  1. Is $\mathcal{L}_{\mathrm{MSE}}$ in the paper just a notational convention for pixel-wise reconstruction loss, while the actual experiments used L1 loss to prevent blurry reconstructions?
  2. If we aim to perfectly reproduce the metrics reported in the paper, should we stick with the l1 configuration provided in the YAML, or do we need to switch it to l2 (MSE)?

Thank you in advance for clarifying this!
