Questions Regarding Optical Flow Supervision and Potential Enhancements #50

@linwk20

First of all, thank you for your impressive work. I’ve been searching for methods that can provide accurate dense depth maps, which is why I believe FlowMap is significantly superior to COLMAP. Using optical flow to fine-tune depth networks seems like a great idea. I have the following questions:

  • Why supervise depth with optical flow? Is it because optical flow is typically more accurate and can give the depth network subpixel reprojection errors? I ask because the optical flow also comes from a DNN and may itself be inaccurate, so why not co-optimize it for the scene as well?
  • Potential improvement with a pretrained MVS model? If we use a large pretrained reconstruction model that takes multiple views as input as the depth estimator, is there a chance of significantly improving the final performance? Or do you think optical flow supervision is already a form of multi-view stereo (MVS), making a pretrained MVS model unnecessary?
  • Can increasing resolution and image count improve depth accuracy? The current training resolution and number of images are limited by GPU memory. However, for models that use Layer Norm instead of Batch Norm, we can accumulate gradients over several smaller passes to get the equivalent of a large batch (the ViT model used in DepthAnything v2, for example, uses Layer Norm). If we use this method to greatly increase the resolution and image count, do you think it will improve the final depth accuracy?
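
To make the gradient accumulation point concrete, here is a minimal sketch of the equivalence I have in mind. It uses a toy 1-D linear model in plain Python, not FlowMap or DepthAnything, and the names `grad_mse` and `accumulated_grad` are made up for illustration. It assumes the loss is a mean over independent samples and that the model has no batch-dependent statistics (as with Layer Norm); I have not verified whether FlowMap's loss decomposes this way.

```python
# Toy check: accumulating gradients over micro-batches reproduces the
# full-batch gradient when the loss is a mean over independent samples
# and the model has no batch-dependent statistics.

def grad_mse(w, b, xs, ys):
    """Gradient of the mean squared error of y = w * x + b w.r.t. (w, b)."""
    n = len(xs)
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb

def accumulated_grad(w, b, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each weighted by its share of the samples."""
    n = len(xs)
    gw_acc, gb_acc = 0.0, 0.0
    for start in range(0, n, micro_batch_size):
        xb = xs[start:start + micro_batch_size]
        yb = ys[start:start + micro_batch_size]
        gw, gb = grad_mse(w, b, xb, yb)
        weight = len(xb) / n  # handles a smaller final micro-batch
        gw_acc += weight * gw
        gb_acc += weight * gb
    return gw_acc, gb_acc

# Made-up data, only for the comparison.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 2.9, 5.1, 7.2, 8.8, 11.1, 13.0, 15.2]
w, b = 0.5, -0.3

full = grad_mse(w, b, xs, ys)
acc = accumulated_grad(w, b, xs, ys, micro_batch_size=3)
print(full, acc)  # the two gradients agree up to floating-point rounding
```

With Batch Norm the micro-batch statistics would differ from the full-batch ones, so this equality would no longer hold exactly, which is why I single out Layer Norm models.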

These are just speculations, and I look forward to your response. Your thoughts may help us design more reasonable experiments. Thank you!