Logging RL results and tracking them with ModelCheckpoint(monitor=...) #5883
-
I am using PyTorch Lightning in an RL setting and want to save a model whenever it hits a new maximum average reward. I am using the TensorBoard logger, and I return my neural network loss from the training step.
I am also logging my RL environment rewards.
And every 5 epochs I additionally write out another RL reward metric where I use the best (greedy) actions rather than sampling from them.
My question is: how can I set my ModelCheckpoint to monitor that evaluation reward? I know the logging API changed in the new PL version, and I have spent a lot of time looking through the docs and for examples of this, but I have found the logging docs on this to be quite sparse, and it was difficult to even get everything to log in the first place. I am using PyTorch Lightning 1.0.5 and PyTorch 1.7.0. Thank you for any help/guidance.
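Roughly, the structure looks like this (a simplified sketch for illustration; the class name and helpers compute_loss, env_reward_mean and greedy_eval_reward stand in for my actual code):

import pytorch_lightning as pl

class RLAgent(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        loss = self.compute_loss(batch)      # neural network loss
        self.log("train_loss", loss)
        return loss

    def training_epoch_end(self, outputs):
        # average reward collected from the RL environment this epoch
        self.log("avg_reward", self.env_reward_mean())
        if self.current_epoch % 5 == 0:
            # reward obtained with the best (greedy) actions instead of sampled ones
            self.log("eval_reward", self.greedy_eval_reward())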
Replies: 5 comments
-
Hi! Thanks for your contribution, great first issue!
-
I have multiple comments that I did not verify yet, but they might help:
1. self.log only works within a selection of hooks currently. I suggest you try to move the relevant code to training_epoch_end, where self.log should work correctly.
2. Set ModelCheckpoint(monitor=...) explicitly.
3. Since the eval metric is only logged every few epochs, you have two options: 1) use the period parameter so the checkpoint callback only runs on the epochs where you update the monitored quantity, or 2) cache the last value and log it in the epochs between your regular interval, to make the ModelCheckpoint see it as unchanged. The second option may even be the default …
So in summary, I imagine something like this:
# Model
def training_epoch_end(self, outputs):
    # ... compute reward losses
    if self.current_epoch % self.hparams['eval_every'] == 0:
        self.last_eval_mean = ...  # compute the new eval mean here
        self.log("eval_mean", self.last_eval_mean)

# Trainer
trainer = Trainer(callbacks=[ModelCheckpoint(monitor="eval_mean")])
# or maybe also try
trainer = Trainer(callbacks=[ModelCheckpoint(monitor="eval_mean", period=self.hparams['eval_every'])])
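And for option 2) above, something like this (again not verified; it assumes self.last_eval_mean is initialized to None in __init__):

# in __init__: self.last_eval_mean = None
def training_epoch_end(self, outputs):
    if self.current_epoch % self.hparams['eval_every'] == 0:
        self.last_eval_mean = ...  # recompute the eval mean on eval epochs
    if self.last_eval_mean is not None:
        # re-log the cached value every epoch; between eval epochs it stays
        # unchanged, so ModelCheckpoint will not save a new "best" checkpoint
        self.log("eval_mean", self.last_eval_mean)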
-
Thanks for all of this. It sounds like the fundamental problem may be that, with my code, I was not logging from a hook where self.log actually works. I will try this and let you know if it works.
-
I had a very similar issue: in my reinforcement learning framework I wanted to measure the validation performance of my agent, and of course I would do so without a validation dataloader. Maybe pytorch_lightning could at least give a warning once one tries to use self.log from a place where it silently does nothing.
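A rough sketch of a sanity check for this silent failure (assuming the monitored key is called "eval_mean"; not part of the original post):

trainer = Trainer(callbacks=[ModelCheckpoint(monitor="eval_mean")])
trainer.fit(model)
# trainer.callback_metrics holds everything self.log made visible to callbacks;
# if the monitored key is missing here, the checkpoint was silently doing nothing
print(trainer.callback_metrics)
assert "eval_mean" in trainer.callback_metrics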
-
Regarding the …