I am looking for a way to update the gym environment after an episode (or game) ends. I looked at the code in reset() and step_wait(), but after adding some logging I can't figure out when a game is ending.
Note that MicroRTSGridModeVecEnv (usually) runs multiple episodes "in parallel". Not so much in the sense of truly running in parallel on multiple threads, but it doesn't fully play out one game before starting another. Every time you call step(), it takes one step in each of potentially many episodes. This allows for more efficient usage of GPUs, because we can batch up inputs and outputs. Instead of having a single state as input, we can have a larger batch with one state per episode, and the GPU can do a forwards pass of a neural network for all of them in parallel. Then it also produces actions for all the different episodes as outputs in parallel, and they are passed to the game engine to take one step in each episode.
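To make the batching idea concrete, here is a minimal NumPy sketch. It is not the MicroRTS code: `batched_policy`, the linear "network" `weights`, and all the sizes are made up for illustration. It only shows that one matrix multiply can produce an action for every episode at once.

```python
import numpy as np

def batched_policy(obs_batch, weights):
    """Hypothetical policy: one greedy action per episode from a single matmul."""
    logits = obs_batch @ weights      # shape: (num_episodes, num_actions)
    return logits.argmax(axis=1)      # shape: (num_episodes,)

rng = np.random.default_rng(0)
num_episodes, obs_dim, num_actions = 4, 8, 3   # illustrative sizes only

weights = rng.normal(size=(obs_dim, num_actions))
obs_batch = rng.normal(size=(num_episodes, obs_dim))  # one state per episode

# One forward pass yields one action for each of the parallel episodes.
actions = batched_policy(obs_batch, weights)
```

On a GPU the same pattern applies with a real neural network in place of the single matrix multiply.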
Of course, different episodes may end after different numbers of time steps. So, while at the very beginning all episodes are "synchronised" in the sense that they all start at time = 0, this will gradually become desynchronised. Some episodes will end early (and get reset such that new episodes start in those slots), while others are still ongoing.
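The desynchronisation can be illustrated with a toy vectorised environment. This is only a sketch under assumptions: `ToyVecEnv` and its fixed episode lengths are hypothetical and have nothing to do with the real game engine. Each slot resets itself as soon as its episode finishes, so the slots drift apart over time.

```python
import numpy as np

class ToyVecEnv:
    """Hypothetical stand-in for a vectorised env; not the MicroRTS API."""

    def __init__(self, episode_lengths):
        self.episode_lengths = np.asarray(episode_lengths)
        self.t = np.zeros(len(episode_lengths), dtype=int)  # per-slot time step

    def step(self):
        self.t += 1
        done = self.t >= self.episode_lengths  # one flag per slot
        self.t[done] = 0                       # finished slots start a new episode
        return done

env = ToyVecEnv([2, 3, 5])                 # made-up episode lengths per slot
episodes_finished = np.zeros(3, dtype=int)
for _ in range(6):
    done = env.step()
    episodes_finished += done              # count completed episodes per slot
```

All three slots start at time 0, but after six steps they have completed different numbers of episodes and sit at different points in their current ones.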
In step_wait(), you should be able to figure out when individual episodes end though. The done variable there is not a single bool, it's actually a matrix. This matrix is first indexed by player (0 or 1), and then by episode index (ranging from 0-inclusive to number-of-parallel-episodes-exclusive). I suppose which player you use to index doesn't actually matter: if the game is over for one player, it's also over for the other.
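As a rough sketch of how that could look, assuming done can be converted to a NumPy array of shape (2, number-of-parallel-episodes) as described above (the exact type inside step_wait() may differ, so a conversion might be needed first). The helper `finished_episode_indices` and the example values are hypothetical.

```python
import numpy as np

def finished_episode_indices(done, player=0):
    """Return the indices of episodes that ended this step.

    Assumes `done` is indexed as done[player][episode]; which player
    is used should not matter, since a game ends for both at once.
    """
    done = np.asarray(done, dtype=bool)
    return np.flatnonzero(done[player])

# Made-up example: 4 parallel episodes, episodes 1 and 3 just ended.
done = np.array([[False, True, False, True],
                 [False, True, False, True]])

finished = finished_episode_indices(done)
for episode_index in finished:
    # Place whatever per-episode update you need here.
    print(f"episode slot {episode_index} finished")
```

Calling something like this once per step_wait() would give a hook for updating the environment whenever a particular slot finishes a game.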