Two algorithms: standard DQN with experience replay and a target network, and Double DQN with prioritized experience replay.
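As a rough illustration of how the two algorithms differ, here is a minimal NumPy sketch of the bootstrap targets, assuming the textbook definitions of DQN and Double DQN. The function names, argument names, and the `gamma` default are hypothetical and are not taken from this project's code.

```python
import numpy as np


def dqn_target(rewards, dones, q_target_next, gamma=0.99):
    """Standard DQN target: the target network both selects and evaluates
    the next action (max over its own Q-values)."""
    max_next = q_target_next.max(axis=1)
    # (1 - dones) zeroes the bootstrap term on terminal transitions.
    return rewards + gamma * (1.0 - dones) * max_next


def double_dqn_target(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    best_actions = q_online_next.argmax(axis=1)
    evaluated = q_target_next[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1.0 - dones) * evaluated


# Toy batch of two transitions with two actions each (made-up numbers).
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_target_next = np.array([[1.0, 2.0], [3.0, 4.0]])
q_online_next = np.array([[5.0, 0.0], [0.0, 1.0]])

y_dqn = dqn_target(rewards, dones, q_target_next, gamma=0.5)
y_ddqn = double_dqn_target(rewards, dones, q_online_next, q_target_next, gamma=0.5)
```

In the toy batch the first transition gets a lower target under Double DQN, because the action preferred by the online network is not the one the target network rates highest. Prioritized experience replay is not covered by this sketch.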
Both were run for 10M frames on Ice Hockey (OpenAI Gym). Results are similar to those posted on the SLM Lab leaderboards. Performance on Ice Hockey is abysmal with both algorithms.