
why set lr = 2e-4 for oxford102 flowers dataset? #3

Open
lvyufeng opened this issue Nov 27, 2022 · 0 comments
lvyufeng commented Nov 27, 2022

Why set lr = 2e-4 for the Oxford 102 Flowers dataset? I've tried it with denoising-diffusion-pytorch and with my own implementation, denoising-diffusion-mindspore; the loss hovers around 0.4 and the sampled images are always noisy.

Is the weight initialization method different between PyTorch and JAX? I use the training config below, which samples better images:

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
)

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,           # number of steps
    sampling_timesteps = 250,   # number of sampling timesteps (using ddim for faster inference [see citation for ddim paper])
    loss_type = 'l1'            # L1 or L2
)

trainer = Trainer(
    diffusion,
    path,                             # path to the training image folder
    train_batch_size = 16,
    train_lr = 8e-5,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    amp_level = 'O1',                 # turn on mixed precision
)

trainer.train()
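For context on the `ema_decay = 0.995` setting above: it controls an exponential moving average of the model weights, which is typically the copy of the model used for sampling. A minimal sketch of the update rule (illustrative only, not the repository's implementation):

```python
def ema_update(ema_value, new_value, decay=0.995):
    """One EMA step: keep `decay` of the running average and
    blend in (1 - decay) of the latest model value."""
    return decay * ema_value + (1 - decay) * new_value

# With decay = 0.995 the average reacts slowly, smoothing out
# noisy per-step weight updates over training.
avg = 0.0
for step_value in [1.0, 1.0, 1.0]:
    avg = ema_update(avg, step_value)
```

A high decay like 0.995 means each training step only nudges the averaged weights by 0.5% toward the current weights, so short-term oscillations from a large learning rate are damped in the sampling model.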