EOFError error during remote_worker_envs flags #46346
Comments
@XavierGeerinck Thanks for raising this issue. Can you provide a reproducible example? I guess this might be in the new API stack, which does not support asynchronous vector environments (yet - we wait for a |
Awesome! We are indeed thinking the same and are awaiting the 1.0.0a2 release to start testing. Is there any ETA that you are aware of?
This should come soon, but we do not know of an ETA for it. Would it help, for the time being, to just sample with more Env Runners but a single env in each of them?
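A minimal sketch of the suggested workaround, in case it helps: scale out across Env Runners, keep one env in each, and leave `remote_worker_envs` off. The helper names `single_env_runner_settings` and `build_config`, the choice of PPO, and the `CartPole-v1` environment are all assumptions made for illustration; none of them come from this issue. The keyword names are the ones I understand `AlgorithmConfig.env_runners()` to accept in recent Ray releases, so please check them against your installed version.

```python
def single_env_runner_settings(num_env_runners: int) -> dict:
    """Hypothetical helper: keyword arguments for AlgorithmConfig.env_runners().

    Parallelism comes from the number of Env Runners, with exactly one
    environment per runner and remote sub-environments disabled.
    """
    if num_env_runners < 1:
        raise ValueError("num_env_runners must be at least 1")
    return {
        "num_env_runners": num_env_runners,
        "num_envs_per_env_runner": 1,   # a single env in each Env Runner
        "remote_worker_envs": False,    # avoid the code path that raises EOFError here
    }


def build_config(settings: dict):
    """Hypothetical helper: apply the settings to a PPO config.

    PPO and CartPole-v1 are placeholders (assumptions); swap in your own
    algorithm and environment. Ray is imported lazily so the settings above
    can be inspected without Ray installed.
    """
    from ray.rllib.algorithms.ppo import PPOConfig

    return PPOConfig().environment("CartPole-v1").env_runners(**settings)


settings = single_env_runner_settings(num_env_runners=4)
print(settings)
```

Calling `build_config(settings)` and then building the algorithm from the returned config would give four sampling processes with one environment each, which trades the per-runner vectorization for more runners.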
What happened + What you expected to happen
I am trying to get training to work while setting remote_worker_envs to true, but I am getting an EOFError.
Versions / Dependencies
platform : macOS 14.2 23C64 (arm64)
memory : 48.0 GB
cpu : 16 cores
mac : 6a:3c:67:54:c1:4d
ip : 192.168.4.76
model_info : Mac15,9 (MUW73LL/A)
kernel_version : 23.2.0
git_commit_sha : Unknown
python_version : 3.11.8 (/.venv/bin/python)
pip_version : 24.0 (/.venv/lib/python3.11/site-packages/pip)
torch_version : 2.3.0
docker_version : 24.0.7,
kubernetes_version : 1.28.2
ray_version : 2.31.0
nvidia_smi : Unknown, nvidia-smi was not found
nvidia_cuda : Unknown, nvcc was not found
is_tty : True
Reproduction script
Issue Severity
High: It blocks me from completing my task.