Why do we do this reward_df processing in graph_generator (def preprocessing(reward_df, max_iter, explicit_node_names=False))?
It would be more consistent to do it when the reward df is first generated with make_df_from_res, so that step 2 (reward_df = run_many.run_all) already returns the proper reward_df.
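
To make the suggestion concrete, here is a rough sketch of the shape I have in mind. The function bodies are placeholders: I am assuming run_all takes the raw results plus max_iter, and the "keep rows below max_iter" logic is a stand-in, not what preprocessing actually does. Plain lists of dicts replace the real DataFrame so the snippet runs on its own.

```python
# Illustrative stand-ins only: the real make_df_from_res, preprocessing and
# run_all live in the project, and their bodies are not reproduced here.

def make_df_from_res(results):
    """Stand-in: build the raw reward table (a list of dict rows here)."""
    return [{"iteration": i, "reward": r} for i, r in enumerate(results)]


def preprocessing(reward_df, max_iter, explicit_node_names=False):
    """Stand-in for the step currently in graph_generator.

    Placeholder logic: keep only rows with iteration < max_iter.
    explicit_node_names is accepted but unused in this sketch.
    """
    return [row for row in reward_df if row["iteration"] < max_iter]


def run_all(results, max_iter, explicit_node_names=False):
    """Proposed shape: build the raw table, preprocess it, return the result."""
    raw = make_df_from_res(results)
    return preprocessing(raw, max_iter, explicit_node_names)


# Step 2 would then hand graph_generator an already-processed reward_df.
reward_df = run_all([0.1, 0.5, 0.9], max_iter=2)
```

With this layout graph_generator would only consume reward_df and would no longer need to know about max_iter or explicit_node_names, unless those are really graph-specific options, in which case keeping preprocessing where it is may be the right call.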