Question about PPO #87

Open
394262597 opened this issue Aug 24, 2024 · 0 comments

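I see that the agent loaded for PPO here uses train on policy (on-policy training), but training this way directly does not involve an experience buffer. Shouldn't PPO's N-step update use an experience buffer, i.e. the corresponding off-policy part? Where does that show up in this code?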
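To make the question concrete, the pattern being asked about can be sketched as follows. This is a toy illustration under my own assumptions, not this repository's code: `RolloutBuffer`, `collect_rollout`, `ppo_update`, and `train` are hypothetical names, and the two-action policy with a single logit `theta` and a reward of 1 for action 1 is invented for the example. It shows N steps collected with the current policy, reused for a few epochs through the clipped surrogate, and then discarded.

```python
import math
import random


class RolloutBuffer:
    """Temporary on-policy storage: filled, reused for a few epochs, then cleared."""

    def __init__(self):
        self.actions, self.rewards, self.old_probs = [], [], []

    def add(self, action, reward, old_prob):
        self.actions.append(action)
        self.rewards.append(reward)
        self.old_probs.append(old_prob)

    def clear(self):
        self.actions, self.rewards, self.old_probs = [], [], []

    def __len__(self):
        return len(self.actions)


def prob_of(theta, action):
    """Probability of `action` under a two-action policy with one logit `theta`."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    return p1 if action == 1 else 1.0 - p1


def collect_rollout(theta, buffer, n_steps, rng):
    """Act with the CURRENT policy for n_steps (toy reward: 1 for action 1)."""
    for _ in range(n_steps):
        action = 1 if rng.random() < prob_of(theta, 1) else 0
        buffer.add(action, float(action), prob_of(theta, action))


def ppo_update(theta, buffer, epochs=4, lr=0.5, clip_eps=0.2):
    """Reuse the same rollout for several epochs with the clipped surrogate."""
    baseline = sum(buffer.rewards) / len(buffer)
    for _ in range(epochs):
        grad = 0.0
        for a, r, old_p in zip(buffer.actions, buffer.rewards, buffer.old_probs):
            adv = r - baseline
            ratio = prob_of(theta, a) / old_p
            # The clipped objective has zero gradient once the ratio has moved
            # past the clip range in the direction the advantage favours.
            clipped = (adv > 0 and ratio > 1 + clip_eps) or (
                adv < 0 and ratio < 1 - clip_eps
            )
            if not clipped:
                p1 = prob_of(theta, 1)
                dlogp = (1.0 - p1) if a == 1 else -p1
                grad += adv * ratio * dlogp
        theta += lr * grad / len(buffer)
    return theta


def train(iterations=20, n_steps=64, seed=0):
    rng = random.Random(seed)
    theta, buffer = 0.0, RolloutBuffer()
    for _ in range(iterations):
        collect_rollout(theta, buffer, n_steps, rng)  # fresh on-policy data
        theta = ppo_update(theta, buffer)             # a few epochs on that data
        buffer.clear()                                # discarded, no long-lived pool
    return theta, len(buffer)
```

In this sketch the only stored experience is the short-lived rollout. The mild off-policy aspect comes from the later epochs, where `theta` has already moved away from the policy that produced `old_probs`, and the probability ratio with clipping accounts for that gap. Whether this repository structures its PPO training the same way is exactly what the question asks.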
