Your paper states that it is aligned with BBDM. Why wasn't VQGAN employed?