How should a user implement auxiliary tasks that use additional self-supervision to optimize policy networks during training? For example, one might want to use an image reconstruction loss to provide additional supervision for the vision encoder while using the PPO loss at the same time. Is there a recommended way of realizing this? Thanks for your thoughts and comments.
Replies: 1 comment
Hi @ErcBunny
I think that for this it will be necessary to modify the desired agent implementation. For example, for the PPO agent, compute the additional loss and add it to:
skrl/skrl/agents/torch/ppo/ppo.py
Line 429 in 636936f
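
For reference, here is a minimal standalone sketch of that pattern (this is not skrl's code: the encoder/decoder modules are hypothetical, and the PPO losses are placeholders standing in for the values `_update` already computes):

```python
# Minimal standalone sketch of the pattern described above (not skrl's code).
# The encoder/decoder modules are hypothetical, and the PPO losses below are
# placeholders for the values computed inside the agent's _update method.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # vision encoder (hypothetical)
decoder = nn.Linear(128, 3 * 64 * 64)                               # reconstruction head (hypothetical)
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=3e-4)

images = torch.rand(8, 3, 64, 64)  # sampled image observations (mini-batch)
features = encoder(images)

# placeholders for the losses the PPO update already computes from `features`
policy_loss = entropy_loss = value_loss = features.sum() * 0.0

# auxiliary self-supervision: reconstruct the input images from the features
reconstruction = decoder(features).view_as(images)
aux_loss = F.mse_loss(reconstruction, images)

# single optimization step over the combined objective
# (the line referenced above, with the auxiliary term added)
optimizer.zero_grad()
(policy_loss + entropy_loss + value_loss + aux_loss).backward()
optimizer.step()
```

In skrl this would mean subclassing the agent (or editing a copy of ppo.py) so that the extra term is computed inside `_update` and included in that `backward()` call.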