Thank you very much for your great work and for sharing the code.
While reading the paper and the official implementation, I had a question regarding the definition of Lip Vertex Error (LVE).
In the paper, LVE is described as computing the maximal L2 error for each predicted frame and then averaging over all frames.
However, in the official code (main/cal_metric.py), it seems that the maximum error is taken over time for each lip vertex first, and then averaged over lip vertices.
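To make the difference concrete, here is a minimal sketch of the two orderings as I understand them. The function names and the toy numbers are my own and are not taken from the repository. I am also assuming a plain (unsquared) L2 distance per vertex, which may not match what the code actually does.

```python
import math


def per_vertex_l2(pred, gt):
    """Return err[t][v]: L2 distance between predicted and ground-truth
    lip vertex v at frame t. Inputs are nested sequences of (x, y, z) points."""
    return [
        [math.dist(p, g) for p, g in zip(pred_frame, gt_frame)]
        for pred_frame, gt_frame in zip(pred, gt)
    ]


def lve_max_per_frame(err):
    """Reading A (the paper's wording): take the max over lip vertices
    in each frame, then average over frames."""
    return sum(max(frame) for frame in err) / len(err)


def lve_max_per_vertex(err):
    """Reading B (my reading of the code): take the max over frames
    for each lip vertex, then average over lip vertices."""
    num_vertices = len(err[0])
    per_vertex_max = [max(frame[v] for frame in err) for v in range(num_vertices)]
    return sum(per_vertex_max) / num_vertices


# Toy error matrix with made-up values: 2 frames x 2 lip vertices.
err = [[1.0, 4.0],
       [2.0, 3.0]]

print(lve_max_per_frame(err))   # 3.5  (frame maxima 4.0 and 3.0)
print(lve_max_per_vertex(err))  # 3.0  (vertex maxima 2.0 and 4.0)
```

Even on this tiny example the two orderings give different values, so the choice would affect the reported numbers.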
Is this difference intentional, and should the implementation be treated as the authoritative definition of LVE when reproducing the reported results?
Any clarification would be greatly appreciated.
Thank you in advance for your time and help.