Currently, I am trying to apply this framework to our 4D-OR dataset (https://github.com/egeozsoy/4D-OR, TU Munich, Germany). After setting up the corresponding dataset files and adapting the projection logic to our camera calibration, we are having trouble getting the posenet training to improve the pose estimates. Since poses are harder to estimate on our dataset, we first tried to reproduce the reported pose estimation results on a subset of the CMU Panoptic dataset.
I freshly cloned this repository, adapted the paths, and otherwise only changed the selected training and validation datasets to:
TRAIN_LIST = ["160224_haggling1"]
VAL_LIST = ["160906_pizza1"]
We used the default configuration (without a pre-trained backbone) for the backbone, root, and posenet training steps. We skipped the optional fine-tuning step.
Unfortunately, the results with the provided configuration were not as good as expected. The human root joints are detected fairly accurately, but the pose estimation training does not seem to work. After the full training, the debug images still look like this:
last epoch, train_2300_3d.png / ~2500 pictures

heatmap:
train_00002300_view_2_hm_pred.png

gt:
train_00002300_view_2_gt.jpg

@keqizero In case you need more information, I am happy to provide it.
Thank you!