Inconsistent initialization of object positions #324
Comments
@mtcrawshaw Thanks a lot for bringing this up. This is in fact a bug and not a design choice. Essentially what has happened here is that we have logic to generate multiple random ... The problem that you are facing is that there is an initial position specified by each of the MuJoCo files. This is a pretty easy fix, and I'll be sure to have it out soon (less than a week). Thanks again for reporting this. P.S. @krzentner, can you look at this too? Thanks!
Hmm, upon investigating your bug, what I initially said is not actually the case, @mtcrawshaw. You are right that it has to do with MuJoCo.
Force state rand vec updates to update the sim

In some environments, the locations of objects are updated via changes to the underlying simulator. Those changes are not reflected in the simulator until the simulator itself has been stepped (a call to sim.step()). This commit forces sim.step() whenever state rand vecs are changed, by calling reset inside set_task. Closes #324
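The stale-state behaviour described in the commit message can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToySim`, `ToyEnv`, the `flush_on_set_task` flag, and the position values are all hypothetical and are not metaworld or MuJoCo code. The toy also steps the simulator directly inside `set_task`, whereas the actual commit achieves the step by calling reset there.

```python
class ToySim:
    """Hypothetical stand-in for a simulator: writes are staged until step()."""

    def __init__(self, default_pos):
        self._pending = list(default_pos)  # staged state, not yet observable
        self._visible = list(default_pos)  # state that observations read from

    def set_object_pos(self, pos):
        self._pending = list(pos)

    def step(self):
        # Only a step makes staged writes observable.
        self._visible = list(self._pending)

    def get_object_pos(self):
        return list(self._visible)


class ToyEnv:
    """Hypothetical environment reproducing the symptom from the report."""

    def __init__(self, flush_on_set_task):
        # Made-up default position, standing in for the one in the model file.
        self.sim = ToySim(default_pos=[0.0, 0.6, 0.02])
        self._flush = flush_on_set_task
        self._task_pos = self.sim.get_object_pos()

    def set_task(self, task_pos):
        self._task_pos = list(task_pos)
        if self._flush:
            # The idea of the fix: force a simulator step when the task changes.
            self.sim.set_object_pos(self._task_pos)
            self.sim.step()

    def reset(self):
        # Stages the task position but reads the observation before any step.
        self.sim.set_object_pos(self._task_pos)
        return self.sim.get_object_pos()

    def step(self):
        self.sim.step()
        return self.sim.get_object_pos()


task = [0.1, 0.7, 0.02]  # made-up task position

buggy = ToyEnv(flush_on_set_task=False)
buggy.set_task(task)
first = buggy.reset()   # still the model-file default
buggy.step()
second = buggy.reset()  # now the task position

fixed = ToyEnv(flush_on_set_task=True)
fixed.set_task(task)
fixed_first = fixed.reset()  # task position from the very first episode
```

Without the flush, `first` and `second` disagree, which mirrors the first-episode discrepancy in the report; with it, the first reset already reflects the task.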
Six of the v1 environments (names listed in the output below) seem to have an issue where the initial object positions (elements 3 through 9, exclusive, of the observation) for the very first episode of an environment instantiation are different from the initial object positions for the remaining episodes. I wrote a minimal working example and included it and the output below.

This may seem like an insignificant problem, and in many cases it doesn't really matter, because the importance of the first episode is dwarfed as training gets longer. However, it is crucial in some cases. In my training code, I included an option that creates a new environment instance at the beginning of each episode. This allows training on MT50 with a much smaller memory requirement, since we only need to store one environment instance at a time instead of 50, so I can train MT50 on my laptop. But in this case, every episode is the very first episode of an environment instantiation, so the goals will only ever be set to the initial (erroneous) value.

I'm not sure how to fix this myself, because I assume it has something to do with the way the MuJoCo model is set up, and I haven't dug into that part of metaworld very much. Any help with this would be appreciated. Thanks!
and the resulting output: