Possible error on the SDE #4

@happynear

Hi, thanks for the great work and for releasing the code/paper.

I have a question about the SDE sampler in Algorithm 6. The paper uses the interpolation convention

z_t = t * x + (1 - t) * eps

so t is the data coefficient and 1 - t is the noise coefficient. In Algorithm 6, the sampler defines

alpha = 1 - gamma * dt
t_back = alpha * t
z_back = alpha * z + (1 - alpha) * e

If we substitute z = t * x + (1 - t) * eps, then

z_back = alpha * t * x + alpha * (1 - t) * eps + (1 - alpha) * e

The clean-data coefficient is indeed alpha * t = t_back. However, the noise part is a mixture of the previous noise and newly injected independent noise:

alpha * (1 - t) * eps + (1 - alpha) * e

Since eps and e are independent (e is freshly sampled noise), the total noise standard deviation is

sqrt(alpha^2 * (1 - t)^2 + (1 - alpha)^2)

whereas a sample truly at timestep t_back = alpha * t under the paper's interpolation would require noise coefficient

1 - t_back = 1 - alpha * t

These are generally not equal! Therefore, it seems that z_back matches the target clean-data coefficient, but not the target total noise level / marginal distribution at t_back.
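To make the mismatch concrete, here is a small numerical check of the two expressions above. The function names and the values t = 0.5, alpha = 0.9 are only illustrative assumptions, not taken from the paper or the repository. Expanding the squares gives (1 - alpha * t)^2 - [alpha^2 * (1 - t)^2 + (1 - alpha)^2] = 2 * alpha * (1 - alpha) * (1 - t), so the re-noised sample carries too little noise whenever 0 < alpha < 1 and t < 1.

```python
import math

def noise_std_after_renoise(t, alpha):
    """Std of alpha*(1-t)*eps + (1-alpha)*e for independent unit-variance eps and e."""
    return math.sqrt(alpha**2 * (1 - t)**2 + (1 - alpha)**2)

def target_noise_coeff(t, alpha):
    """Noise coefficient 1 - t_back required by z_t = t*x + (1-t)*eps at t_back = alpha*t."""
    return 1 - alpha * t

def variance_gap(t, alpha):
    """Closed form of target^2 - actual^2 obtained by expanding the squares."""
    return 2 * alpha * (1 - alpha) * (1 - t)

# Illustrative values (assumptions, not from the paper).
t, alpha = 0.5, 0.9
actual = noise_std_after_renoise(t, alpha)
target = target_noise_coeff(t, alpha)
print(f"actual noise std = {actual:.4f}, target = {target:.4f}")
print(f"variance gap = {target**2 - actual**2:.4f}")
```

With these values the actual noise std is about 0.461 while the target is 0.55, so the gap is not a rounding effect.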

This is related to the coefficient-preserving sampling discussed in our paper, which emphasizes that stochastic flow samplers should preserve both the data coefficient and the total noise level specified by the scheduler. It would be great if you could consider fixing the algorithm and citing our paper:

@article{wang2025coefficients,
  title={Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching},
  author={Wang, Feng and Yu, Zihao},
  journal={arXiv preprint arXiv:2509.05952},
  year={2025}
}
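For illustration, here is one possible way to make the re-noising step match both coefficients. This is a minimal sketch under my own assumptions, not necessarily the exact scheme from the paper above or the intended fix for Algorithm 6: it keeps the alpha * z scaling and only changes the weight c on the fresh noise so that alpha^2 * (1 - t)^2 + c^2 = (1 - alpha * t)^2. The helper name `renoise_variance_matched` and the scalar setup are hypothetical.

```python
import math
import random
import statistics

def renoise_variance_matched(z, t, alpha, rng):
    """Move a scalar z from time t back to t_back = alpha*t (hypothetical helper).

    Keeps the alpha*z scaling, but weights the fresh noise by c such that the
    total noise std equals 1 - t_back instead of using the weight 1 - alpha.
    """
    t_back = alpha * t
    # Solve alpha^2 (1-t)^2 + c^2 = (1 - t_back)^2 for c (non-negative for alpha, t in [0, 1]).
    c_sq = (1 - t_back) ** 2 - (alpha * (1 - t)) ** 2
    c = math.sqrt(max(c_sq, 0.0))
    e = rng.gauss(0.0, 1.0)  # fresh noise, independent of the noise already in z
    return alpha * z + c * e, t_back

# Monte Carlo check with a fixed clean sample x (illustrative values).
rng = random.Random(0)
t, alpha, x = 0.5, 0.9, 2.0
residuals = []
for _ in range(100_000):
    eps = rng.gauss(0.0, 1.0)
    z = t * x + (1 - t) * eps
    z_back, t_back = renoise_variance_matched(z, t, alpha, rng)
    residuals.append(z_back - t_back * x)  # pure noise part of z_back

noise_std = statistics.pstdev(residuals)
print(f"empirical noise std = {noise_std:.3f}, target 1 - t_back = {1 - t_back:.3f}")
```

The empirical noise std should land close to 1 - t_back, while the data coefficient stays at alpha * t as before.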

A Chinese-language introduction to our paper is also available here: https://zhuanlan.zhihu.com/p/1948388095151026330
