Chapter 13, DDPG Algorithm: a small oversight in the code practice #75

Open
xiyanzzz opened this issue Mar 29, 2024 · 0 comments

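In the code practice of section 13.3 (DDPG), the method def take_action(self, state): of the DDPG class should clip the action it returns.
return action -> return np.clip(action, -self.action_bound, self.action_bound)
The returned action is later fed to the Q network to estimate the Q value at the current time step, so it should not exceed the environment's action limits. The added exploration noise can push it outside those limits, although the probability is small.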
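
To illustrate the proposed fix, here is a minimal NumPy-only sketch. It is not the book's code: the class name NoisyPolicySketch, the tanh-based actor stand-in, the seeded generator, and the parameter values are all assumptions made for illustration. Only the final np.clip line corresponds to the change suggested above.

```python
import numpy as np


class NoisyPolicySketch:
    """Hypothetical stand-in for the DDPG class; only action selection is modeled."""

    def __init__(self, action_dim, action_bound, sigma, seed=0):
        self.action_dim = action_dim
        self.action_bound = action_bound  # largest action magnitude the environment accepts
        self.sigma = sigma                # standard deviation of the Gaussian exploration noise
        self.rng = np.random.default_rng(seed)

    def actor(self, state):
        # Placeholder for the actor network (assumption): a tanh output
        # scaled to the action bound, as a bounded policy head would be.
        return self.action_bound * np.tanh(np.sum(state))

    def take_action(self, state):
        action = self.actor(np.asarray(state, dtype=float))
        # Add noise for exploration; this can push the action out of range.
        action = action + self.sigma * self.rng.standard_normal(self.action_dim)
        # Proposed fix: clip so the stored action respects the environment limits.
        return np.clip(action, -self.action_bound, self.action_bound)


agent = NoisyPolicySketch(action_dim=1, action_bound=2.0, sigma=0.5)
# A state whose deterministic action sits close to the upper bound,
# so the noise frequently overshoots before clipping.
actions = np.array([agent.take_action([3.0, 0.0, 0.0]) for _ in range(1000)])
print(actions.min(), actions.max())
```

With clipping in place, every sampled action stays inside [-action_bound, action_bound], so the action written to the replay buffer and passed to the Q network matches what the environment can actually execute.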
