Interface documentation

Hey,
I'm trying to understand your code, especially how and where you interface with Bullet and Caffe.
I'd be glad if you could answer a few questions:
Where do you evaluate the current net in the simulation?
Where do you read the value used as the reward out of the simulation?
Where and how do you hand the reward back to the solver?
Thanks for the great work and your help.
Best wishes