I think there is a bug in your beta schedule calculation: you keep the default `linear_start` value of `1e-4` and only pass the end value. However, you scale the provided end value (`beta_max` in the configuration) by 1/N_timesteps, while the start value is kept as is.
```python
betas = make_beta_schedule(n_timestep=opt.interval, linear_end=opt.beta_max / opt.interval)
```
I guess that the default values are taken from VPSDE (20 and 0.1) and are already scaled down by 1000 timesteps (20/1000, 0.1/1000), so what you wanted is probably something like this:
```python
import torch

def make_beta_schedule(n_timestep=1000, linear_start=1e-4, linear_end=2e-2):
    # return np.linspace(linear_start, linear_end, n_timestep)
    # Rescale both endpoints, since the defaults correspond to 1000 timesteps.
    scale = 1000 / n_timestep
    linear_start *= scale
    linear_end *= scale
    # Interpolate linearly in sqrt(beta) space, then square.
    betas = torch.linspace(linear_start**0.5, linear_end**0.5, n_timestep, dtype=torch.float64) ** 2
    return betas.numpy()

betas = make_beta_schedule(n_timestep=opt.interval, linear_end=opt.beta_max)
```
Is this correct, or is there a reason to scale only the `linear_end` variable?
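To make the difference concrete, here is a minimal pure-Python sketch that compares the two variants. The helper names (`sqrt_linear_schedule`, `current_schedule`, `proposed_schedule`) and the value `BETA_MAX = 2e-2` are my own assumptions for illustration and are not taken from the repo or its configuration.

```python
def sqrt_linear_schedule(n_timestep, linear_start, linear_end):
    """Interpolate linearly in sqrt(beta) space, then square (needs n_timestep >= 2)."""
    a, b = linear_start ** 0.5, linear_end ** 0.5
    step = (b - a) / (n_timestep - 1)
    return [(a + i * step) ** 2 for i in range(n_timestep)]


def current_schedule(n_timestep, beta_max, linear_start=1e-4):
    # Mirrors the existing call: only the end value is divided by n_timestep.
    return sqrt_linear_schedule(n_timestep, linear_start, beta_max / n_timestep)


def proposed_schedule(n_timestep, beta_max, linear_start=1e-4):
    # Mirrors the proposal: both endpoints are multiplied by 1000 / n_timestep.
    scale = 1000 / n_timestep
    return sqrt_linear_schedule(n_timestep, linear_start * scale, beta_max * scale)


BETA_MAX = 2e-2  # assumed value, for illustration only

for n in (1000, 100):
    cur = current_schedule(n, BETA_MAX)
    new = proposed_schedule(n, BETA_MAX)
    print(f"n={n}: current  start={cur[0]:.2e} end={cur[-1]:.2e} sum={sum(cur):.4f}")
    print(f"n={n}: proposed start={new[0]:.2e} end={new[-1]:.2e} sum={sum(new):.4f}")
```

In this sketch the proposed variant keeps the sum of the betas roughly constant when the number of timesteps changes, whereas the current call rescales only the end point and leaves the start point fixed.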