Hello, I have trained the baseline models for all of the tasks, but the results are only good for the pushing and picking tasks, and even the picking results are not that good. For pick and place and stacking, the trained baseline models fail consistently. Is this expected behavior?
I ran the reproduce_experiments.py script and then evaluated each trained model with the evaluation pipeline. I can post the videos and the evaluation script if needed.