Question regarding the return type of geometry_motion_vae.encode in trainers/unet_base.py

Hi there,

First of all, thank you for your amazing work! 

While reading the code, I got a little confused about the `encode_point_map` function in `trainers/unet_base.py`.

In the `AutoencoderKL` branch, `.latent_dist.mode()` is explicitly called to extract the tensor. However, in the `SeperateAutoencoderKL` and `UnifyAutoencoderKL` branches, the output of `self.geometry_motion_vae.encode` is directly assigned to `latent`, followed by an `assert isinstance(latent, torch.Tensor)`. 

```python
elif self.config.vae_type == "SeperateAutoencoderKL":
    latent = self.geometry_motion_vae.encode(pmap_[i : i + chunk_size])
    # ...
elif self.config.vae_type == "UnifyAutoencoderKL":
    latent = self.geometry_motion_vae.encode( ... )
    
assert isinstance(latent, torch.Tensor)
```
My questions are:

1. Does `self.geometry_motion_vae.encode` return a `DiagonalGaussianDistribution` (or a similar output object) in these two `elif` branches? If so, the `assert` statement might fail here.

2. Assuming it does return a `DiagonalGaussianDistribution`, should we append `.mode()` or `.sample()` to the `encode` call during training in these specific branches?
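
To make the question concrete, here is a rough sketch of the kind of handling I have in mind. Everything in it is hypothetical and not taken from the repository: `to_latent_tensor`, `FakeDistribution`, and `FakeEncoderOutput` are names I made up for illustration, and I am assuming the distribution object exposes `.mode()` and `.sample()` the way the one in the `AutoencoderKL` branch appears to.

```python
class FakeDistribution:
    """Hypothetical stand-in for a DiagonalGaussianDistribution-like object."""

    def mode(self):
        return "mode-latent"

    def sample(self):
        return "sampled-latent"


class FakeEncoderOutput:
    """Hypothetical stand-in for an output object that wraps `.latent_dist`."""

    def __init__(self):
        self.latent_dist = FakeDistribution()


def to_latent_tensor(encoded, sample=False):
    """Return a latent from whatever `encode` produced (hypothetical helper)."""
    # Unwrap `.latent_dist` if the encoder returned an output object.
    dist = getattr(encoded, "latent_dist", encoded)
    # Assumption: only distribution objects expose both a callable `.sample()`
    # and `.mode()`; anything else (e.g. a plain tensor) is returned unchanged.
    if callable(getattr(dist, "sample", None)) and callable(getattr(dist, "mode", None)):
        return dist.sample() if sample else dist.mode()
    return dist
```

If `encode` already returns a plain tensor in the `SeperateAutoencoderKL` and `UnifyAutoencoderKL` branches, a helper like this would be a no-op and the existing `assert` is fine as written; I mainly want to confirm which case applies.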

Thank you in advance for your time and clarification!
