(This may be completely unnecessary/inapplicable, so feel free to just close it)
The performance of simulating a gate on a big state depends drastically on the position of the qubit it acts on (e.g. it is much faster to apply H to the 0th qubit of a 30-qubit state than to the 29th).
So, for "unbalanced" circuits (where a few qubits receive most of the gates) it is very beneficial to move "the most used" qubits to lower indices.
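
To make the idea concrete, here is a rough sketch of the relabeling I have in mind. It is not tied to any simulator's API: the circuit representation (a list of `(gate_name, qubits)` tuples) and the helper names `usage_order_mapping` and `relabel` are made up for illustration. It only counts how often each qubit is touched and gives the busiest qubits the lowest indices.

```python
from collections import Counter


def usage_order_mapping(circuit, n_qubits):
    """Return {old_index: new_index}, giving the most-used qubits the lowest indices.

    `circuit` is a hypothetical representation: a list of (gate_name, qubits) tuples.
    """
    counts = Counter(q for _, qubits in circuit for q in qubits)
    # Sort by descending use count; ties keep the original index order.
    ranked = sorted(range(n_qubits), key=lambda q: (-counts[q], q))
    return {old: new for new, old in enumerate(ranked)}


def relabel(circuit, mapping):
    """Rewrite every gate so that it acts on the remapped qubit indices."""
    return [(name, tuple(mapping[q] for q in qubits)) for name, qubits in circuit]


# Toy "unbalanced" circuit on 30 qubits: qubit 29 is used the most.
circuit = [("H", (29,)), ("CNOT", (29, 5)), ("H", (29,)), ("X", (5,)), ("H", (0,))]
mapping = usage_order_mapping(circuit, 30)
remapped = relabel(circuit, mapping)
```

Since this is a pure relabeling, the final state comes out with its qubits permuted, so any measurement results or amplitudes would have to be translated back through the inverse of `mapping`. Whether the gain outweighs that bookkeeping probably depends on how unbalanced the circuit actually is.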