In dist = Normal(mu, sigma), sigma must be positive, but the actor_net output can be negative, so action_log_prob = dist.log_prob(action) can be nan.
Try:
import torch
from torch.distributions import Normal

a = torch.FloatTensor([1]).cuda()
b = torch.FloatTensor([-1]).cuda()   # negative scale is invalid for Normal
dist = Normal(a, b)
action = dist.sample()
action_log_prob = dist.log_prob(action)
print(action.cpu().numpy())          # nan
print(action_log_prob.item())        # nan
You can add an activation function before the sigma output of the actor network. Using a ReLU or softplus activation can map sigma to a positive value (see the sketch below). Hope it helps.
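For example, here is a minimal sketch of that idea; the Actor class, its layer sizes, and the small epsilon added to sigma are illustrative assumptions, not the repository's actual actor_net:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Normal

class Actor(nn.Module):
    # Illustrative actor head: mu is unconstrained, sigma is forced positive with softplus.
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.fc = nn.Linear(state_dim, hidden)
        self.mu_head = nn.Linear(hidden, action_dim)
        self.sigma_head = nn.Linear(hidden, action_dim)

    def forward(self, x):
        x = torch.relu(self.fc(x))
        mu = self.mu_head(x)
        # softplus keeps sigma > 0; the small constant avoids an exactly-zero scale
        sigma = F.softplus(self.sigma_head(x)) + 1e-5
        return mu, sigma

actor = Actor(state_dim=3, action_dim=1)
mu, sigma = actor(torch.zeros(1, 3))
dist = Normal(mu, sigma)
action = dist.sample()
print(dist.log_prob(action).item())  # finite, no nan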