[BugFix] Fix OOB sampling in PrioritizedSliceSampler #2239

Merged: 2 commits merged on Jun 20, 2024

@vmoens (Contributor) commented Jun 20, 2024

Closes #2230

Unfortunately, I'm not sure how to test this. I guess we should save a tree structure and a mass somewhere to replicate the issue (and possibly fix it directly in the C++ code; @xiaomengy, if you can help with this).
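
One possible way to build such a reproduction is sketched below. This is only a toy, pure-Python stand-in for a sum segment tree, not torchrl's C++ implementation, and `ToySumTree` and `find_prefix_index` are hypothetical names. It assumes that the out-of-bounds index arises when the sampled mass reaches the total priority and the prefix search then walks into unused, zero-priority leaves; the actual cause in #2230 may differ.

```python
class ToySumTree:
    """Pure-Python stand-in for a sum segment tree (hypothetical, not torchrl's class)."""

    def __init__(self, capacity: int) -> None:
        # capacity is assumed to be a power of two; unused leaves keep priority 0.0
        self.capacity = capacity
        # nodes[1] is the root; leaves live at nodes[capacity : 2 * capacity]
        self.nodes = [0.0] * (2 * capacity)

    def update(self, index: int, priority: float) -> None:
        # set one leaf, then refresh the sums on the path up to the root
        pos = index + self.capacity
        self.nodes[pos] = priority
        pos //= 2
        while pos >= 1:
            self.nodes[pos] = self.nodes[2 * pos] + self.nodes[2 * pos + 1]
            pos //= 2

    def total(self) -> float:
        return self.nodes[1]

    def find_prefix_index(self, mass: float) -> int:
        # descend from the root: go left if the mass falls inside the left subtree,
        # otherwise subtract the left sum and go right
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if mass < self.nodes[left]:
                pos = left
            else:
                mass -= self.nodes[left]
                pos = left + 1
        return pos - self.capacity


n_stored = 5                      # only 5 of the 8 leaves hold data
tree = ToySumTree(capacity=8)
for i in range(n_stored):
    tree.update(i, 1.0)

in_range = tree.find_prefix_index(2.5)          # a mass strictly inside the total
raw = tree.find_prefix_index(tree.total())      # a mass equal to the total
clamped = min(raw, n_stored - 1)                # same idea as clamp_max_(len(storage) - 1)
print(in_range, raw, clamped)
```

Saving the leaf priorities and the offending mass from a failing run, then replaying them through a check like this, would turn the report into a deterministic regression test.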

pytorch-bot bot commented Jun 20, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2239

Note: Links to docs will display an error until the docs builds have been completed.

❌ 10 New Failures, 1 Unrelated Failure

As of commit 8438770 with merge base c44a521:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jun 20, 2024
@vmoens added the bug label Jun 20, 2024

github-actions bot commented Jun 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}38$. Worsened: $\large\color{#d91a1a}15$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1318s 61.3574ms 16.2980 Ops/s 16.9326 Ops/s $\color{#d91a1a}-3.75\%$
test_sync 35.0064ms 32.7731ms 30.5128 Ops/s 29.2820 Ops/s $\color{#35bf28}+4.20\%$
test_async 59.4381ms 29.6312ms 33.7482 Ops/s 32.8915 Ops/s $\color{#35bf28}+2.60\%$
test_simple 0.3799s 0.3745s 2.6702 Ops/s 2.5908 Ops/s $\color{#35bf28}+3.06\%$
test_transformed 0.5281s 0.5236s 1.9099 Ops/s 1.8310 Ops/s $\color{#35bf28}+4.31\%$
test_serial 1.3106s 1.2545s 0.7971 Ops/s 0.7757 Ops/s $\color{#35bf28}+2.77\%$
test_parallel 1.1837s 1.1210s 0.8920 Ops/s 0.9273 Ops/s $\color{#d91a1a}-3.80\%$
test_step_mdp_speed[True-True-True-True-True] 0.2099ms 22.5320μs 44.3812 KOps/s 41.5494 KOps/s $\textbf{\color{#35bf28}+6.82\%}$
test_step_mdp_speed[True-True-True-True-False] 72.1450μs 13.2236μs 75.6224 KOps/s 70.1065 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_step_mdp_speed[True-True-True-False-True] 43.3210μs 13.1551μs 76.0160 KOps/s 71.1943 KOps/s $\textbf{\color{#35bf28}+6.77\%}$
test_step_mdp_speed[True-True-True-False-False] 47.7790μs 7.6469μs 130.7720 KOps/s 120.5472 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_step_mdp_speed[True-True-False-True-True] 61.3650μs 23.8461μs 41.9356 KOps/s 39.2643 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_step_mdp_speed[True-True-False-True-False] 39.6440μs 14.5283μs 68.8310 KOps/s 63.8122 KOps/s $\textbf{\color{#35bf28}+7.86\%}$
test_step_mdp_speed[True-True-False-False-True] 40.3950μs 14.3965μs 69.4615 KOps/s 64.8086 KOps/s $\textbf{\color{#35bf28}+7.18\%}$
test_step_mdp_speed[True-True-False-False-False] 32.8810μs 8.9931μs 111.1958 KOps/s 102.8123 KOps/s $\textbf{\color{#35bf28}+8.15\%}$
test_step_mdp_speed[True-False-True-True-True] 90.6260μs 25.3084μs 39.5126 KOps/s 37.4335 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_step_mdp_speed[True-False-True-True-False] 67.9970μs 16.0123μs 62.4520 KOps/s 59.2163 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_step_mdp_speed[True-False-True-False-True] 83.4960μs 14.4870μs 69.0276 KOps/s 65.9064 KOps/s $\color{#35bf28}+4.74\%$
test_step_mdp_speed[True-False-True-False-False] 44.4130μs 9.0052μs 111.0466 KOps/s 105.4339 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_step_mdp_speed[True-False-False-True-True] 0.1061ms 26.5565μs 37.6556 KOps/s 35.8027 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_step_mdp_speed[True-False-False-True-False] 80.7010μs 17.1559μs 58.2891 KOps/s 54.3908 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_step_mdp_speed[True-False-False-False-True] 81.8830μs 15.9025μs 62.8833 KOps/s 60.1860 KOps/s $\color{#35bf28}+4.48\%$
test_step_mdp_speed[True-False-False-False-False] 71.6940μs 10.5006μs 95.2325 KOps/s 90.3042 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_step_mdp_speed[False-True-True-True-True] 63.2090μs 25.4845μs 39.2396 KOps/s 36.5983 KOps/s $\textbf{\color{#35bf28}+7.22\%}$
test_step_mdp_speed[False-True-True-True-False] 77.0740μs 16.0042μs 62.4836 KOps/s 58.4742 KOps/s $\textbf{\color{#35bf28}+6.86\%}$
test_step_mdp_speed[False-True-True-False-True] 92.7540μs 16.9160μs 59.1158 KOps/s 55.4775 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_step_mdp_speed[False-True-True-False-False] 47.3980μs 10.2800μs 97.2762 KOps/s 90.5890 KOps/s $\textbf{\color{#35bf28}+7.38\%}$
test_step_mdp_speed[False-True-False-True-True] 85.4300μs 26.6831μs 37.4768 KOps/s 35.1872 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_step_mdp_speed[False-True-False-True-False] 54.4320μs 17.1072μs 58.4549 KOps/s 53.3465 KOps/s $\textbf{\color{#35bf28}+9.58\%}$
test_step_mdp_speed[False-True-False-False-True] 94.3960μs 18.1019μs 55.2427 KOps/s 52.6609 KOps/s $\color{#35bf28}+4.90\%$
test_step_mdp_speed[False-True-False-False-False] 48.6310μs 11.5554μs 86.5398 KOps/s 81.0884 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_step_mdp_speed[False-False-True-True-True] 0.1072ms 27.6719μs 36.1378 KOps/s 33.9041 KOps/s $\textbf{\color{#35bf28}+6.59\%}$
test_step_mdp_speed[False-False-True-True-False] 61.1740μs 18.5491μs 53.9111 KOps/s 50.2889 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_step_mdp_speed[False-False-True-False-True] 89.3670μs 17.9660μs 55.6607 KOps/s 52.6724 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_step_mdp_speed[False-False-True-False-False] 63.1780μs 11.5454μs 86.6142 KOps/s 81.4575 KOps/s $\textbf{\color{#35bf28}+6.33\%}$
test_step_mdp_speed[False-False-False-True-True] 69.4500μs 29.3055μs 34.1233 KOps/s 32.2546 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_step_mdp_speed[False-False-False-True-False] 85.2900μs 19.5685μs 51.1026 KOps/s 47.9100 KOps/s $\textbf{\color{#35bf28}+6.66\%}$
test_step_mdp_speed[False-False-False-False-True] 76.2460μs 19.4593μs 51.3893 KOps/s 50.1599 KOps/s $\color{#35bf28}+2.45\%$
test_step_mdp_speed[False-False-False-False-False] 47.9490μs 12.6781μs 78.8763 KOps/s 74.5862 KOps/s $\textbf{\color{#35bf28}+5.75\%}$
test_values[generalized_advantage_estimate-True-True] 10.3525ms 10.0144ms 99.8561 Ops/s 105.7614 Ops/s $\textbf{\color{#d91a1a}-5.58\%}$
test_values[vec_generalized_advantage_estimate-True-True] 39.8261ms 36.5262ms 27.3776 Ops/s 29.9462 Ops/s $\textbf{\color{#d91a1a}-8.58\%}$
test_values[td0_return_estimate-False-False] 0.2618ms 0.1773ms 5.6404 KOps/s 6.0439 KOps/s $\textbf{\color{#d91a1a}-6.68\%}$
test_values[td1_return_estimate-False-False] 26.6015ms 23.5810ms 42.4070 Ops/s 41.3536 Ops/s $\color{#35bf28}+2.55\%$
test_values[vec_td1_return_estimate-False-False] 38.8931ms 35.7592ms 27.9648 Ops/s 29.5976 Ops/s $\textbf{\color{#d91a1a}-5.52\%}$
test_values[td_lambda_return_estimate-True-False] 36.6848ms 33.6384ms 29.7279 Ops/s 29.1394 Ops/s $\color{#35bf28}+2.02\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.8390ms 35.7156ms 27.9990 Ops/s 29.6182 Ops/s $\textbf{\color{#d91a1a}-5.47\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.3397ms 8.2961ms 120.5386 Ops/s 120.1403 Ops/s $\color{#35bf28}+0.33\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.7841ms 2.0030ms 499.2595 Ops/s 555.0875 Ops/s $\textbf{\color{#d91a1a}-10.06\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4689ms 0.3578ms 2.7946 KOps/s 2.7795 KOps/s $\color{#35bf28}+0.55\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.2936ms 44.7260ms 22.3584 Ops/s 21.9657 Ops/s $\color{#35bf28}+1.79\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.6245ms 3.0302ms 330.0086 Ops/s 330.5686 Ops/s $\color{#d91a1a}-0.17\%$
test_dqn_speed 6.0949ms 1.3592ms 735.7082 Ops/s 725.5594 Ops/s $\color{#35bf28}+1.40\%$
test_ddpg_speed 3.1883ms 2.8725ms 348.1230 Ops/s 344.2989 Ops/s $\color{#35bf28}+1.11\%$
test_sac_speed 9.6952ms 8.4812ms 117.9079 Ops/s 111.5963 Ops/s $\textbf{\color{#35bf28}+5.66\%}$
test_redq_speed 14.9188ms 13.3452ms 74.9333 Ops/s 71.0143 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_redq_deprec_speed 97.8080ms 14.5723ms 68.6233 Ops/s 70.6847 Ops/s $\color{#d91a1a}-2.92\%$
test_td3_speed 9.4790ms 8.4525ms 118.3079 Ops/s 110.1768 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_cql_speed 37.7974ms 36.8489ms 27.1379 Ops/s 26.1088 Ops/s $\color{#35bf28}+3.94\%$
test_a2c_speed 16.3268ms 7.6418ms 130.8588 Ops/s 100.8662 Ops/s $\textbf{\color{#35bf28}+29.74\%}$
test_ppo_speed 8.2877ms 7.7053ms 129.7815 Ops/s 104.6806 Ops/s $\textbf{\color{#35bf28}+23.98\%}$
test_reinforce_speed 10.1887ms 7.6603ms 130.5437 Ops/s 140.1846 Ops/s $\textbf{\color{#d91a1a}-6.88\%}$
test_iql_speed 36.9033ms 34.2525ms 29.1949 Ops/s 28.4949 Ops/s $\color{#35bf28}+2.46\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4699ms 4.4973ms 222.3573 Ops/s 240.1035 Ops/s $\textbf{\color{#d91a1a}-7.39\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2462ms 0.5456ms 1.8329 KOps/s 1.8397 KOps/s $\color{#d91a1a}-0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.1224ms 0.5082ms 1.9679 KOps/s 1.9101 KOps/s $\color{#35bf28}+3.02\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.8123ms 4.4207ms 226.2107 Ops/s 242.5904 Ops/s $\textbf{\color{#d91a1a}-6.75\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0394ms 0.5171ms 1.9340 KOps/s 1.8601 KOps/s $\color{#35bf28}+3.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6572ms 0.5049ms 1.9808 KOps/s 1.9919 KOps/s $\color{#d91a1a}-0.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.5390ms 1.8478ms 541.1796 Ops/s 559.5705 Ops/s $\color{#d91a1a}-3.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.4120ms 1.7763ms 562.9802 Ops/s 591.2421 Ops/s $\color{#d91a1a}-4.78\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3627ms 4.6374ms 215.6358 Ops/s 241.7700 Ops/s $\textbf{\color{#d91a1a}-10.81\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2522ms 0.6551ms 1.5265 KOps/s 1.2964 KOps/s $\textbf{\color{#35bf28}+17.75\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0219ms 0.6247ms 1.6007 KOps/s 1.7071 KOps/s $\textbf{\color{#d91a1a}-6.23\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.7352ms 4.1706ms 239.7747 Ops/s 287.3745 Ops/s $\textbf{\color{#d91a1a}-16.56\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7467ms 0.5339ms 1.8730 KOps/s 1.9841 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.0364ms 0.5114ms 1.9554 KOps/s 2.1072 KOps/s $\textbf{\color{#d91a1a}-7.20\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2310ms 3.9335ms 254.2257 Ops/s 251.0100 Ops/s $\color{#35bf28}+1.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6261ms 0.5165ms 1.9360 KOps/s 1.8515 KOps/s $\color{#35bf28}+4.56\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6813ms 0.4827ms 2.0715 KOps/s 1.9645 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.7724ms 3.9754ms 251.5481 Ops/s 239.8997 Ops/s $\color{#35bf28}+4.86\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1717ms 0.6706ms 1.4913 KOps/s 1.5413 KOps/s $\color{#d91a1a}-3.24\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9364ms 0.6367ms 1.5706 KOps/s 1.5952 KOps/s $\color{#d91a1a}-1.54\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1319s 6.5644ms 152.3363 Ops/s 148.1597 Ops/s $\color{#35bf28}+2.82\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 15.5556ms 12.9011ms 77.5131 Ops/s 76.5619 Ops/s $\color{#35bf28}+1.24\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.1087s 3.2951ms 303.4839 Ops/s 927.0188 Ops/s $\textbf{\color{#d91a1a}-67.26\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1118s 6.0473ms 165.3626 Ops/s 111.6263 Ops/s $\textbf{\color{#35bf28}+48.14\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 17.2366ms 12.9756ms 77.0676 Ops/s 75.7753 Ops/s $\color{#35bf28}+1.71\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.6774ms 1.1533ms 867.0482 Ops/s 851.1034 Ops/s $\color{#35bf28}+1.87\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1102s 6.1304ms 163.1206 Ops/s 148.5676 Ops/s $\textbf{\color{#35bf28}+9.80\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 15.1564ms 12.6009ms 79.3597 Ops/s 74.7673 Ops/s $\textbf{\color{#35bf28}+6.14\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.9217ms 1.2391ms 807.0114 Ops/s 776.8816 Ops/s $\color{#35bf28}+3.88\%$

github-actions bot commented Jun 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1146s 0.1141s 8.7609 Ops/s 8.8598 Ops/s $\color{#d91a1a}-1.12\%$
test_sync 0.1050s 0.1024s 9.7638 Ops/s 9.6400 Ops/s $\color{#35bf28}+1.28\%$
test_async 0.1954s 78.8844ms 12.6768 Ops/s 10.1433 Ops/s $\textbf{\color{#35bf28}+24.98\%}$
test_single_pixels 0.1237s 0.1225s 8.1666 Ops/s 8.1653 Ops/s $\color{#35bf28}+0.02\%$
test_sync_pixels 83.0279ms 80.8358ms 12.3708 Ops/s 12.8136 Ops/s $\color{#d91a1a}-3.46\%$
test_async_pixels 0.1537s 66.9532ms 14.9358 Ops/s 14.9429 Ops/s $\color{#d91a1a}-0.05\%$
test_simple 0.8145s 0.8040s 1.2438 Ops/s 1.2811 Ops/s $\color{#d91a1a}-2.91\%$
test_transformed 1.0541s 1.0418s 0.9599 Ops/s 0.9665 Ops/s $\color{#d91a1a}-0.68\%$
test_serial 2.5016s 2.4550s 0.4073 Ops/s 0.4172 Ops/s $\color{#d91a1a}-2.36\%$
test_parallel 2.4788s 2.3616s 0.4234 Ops/s 0.4189 Ops/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[True-True-True-True-True] 0.2148ms 31.5189μs 31.7270 KOps/s 32.8317 KOps/s $\color{#d91a1a}-3.36\%$
test_step_mdp_speed[True-True-True-True-False] 0.1362ms 18.4105μs 54.3169 KOps/s 56.1455 KOps/s $\color{#d91a1a}-3.26\%$
test_step_mdp_speed[True-True-True-False-True] 45.7910μs 17.7642μs 56.2930 KOps/s 56.5903 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-True-False-False] 43.6210μs 10.4045μs 96.1127 KOps/s 97.9790 KOps/s $\color{#d91a1a}-1.90\%$
test_step_mdp_speed[True-True-False-True-True] 66.6210μs 33.2382μs 30.0859 KOps/s 31.2370 KOps/s $\color{#d91a1a}-3.69\%$
test_step_mdp_speed[True-True-False-True-False] 49.8300μs 19.8973μs 50.2580 KOps/s 52.3626 KOps/s $\color{#d91a1a}-4.02\%$
test_step_mdp_speed[True-True-False-False-True] 52.7310μs 19.3006μs 51.8119 KOps/s 52.1085 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[True-True-False-False-False] 39.7210μs 12.1641μs 82.2089 KOps/s 85.6134 KOps/s $\color{#d91a1a}-3.98\%$
test_step_mdp_speed[True-False-True-True-True] 65.3110μs 35.0052μs 28.5672 KOps/s 30.1076 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_step_mdp_speed[True-False-True-True-False] 55.1010μs 21.6957μs 46.0921 KOps/s 47.5605 KOps/s $\color{#d91a1a}-3.09\%$
test_step_mdp_speed[True-False-True-False-True] 45.1810μs 19.4560μs 51.3980 KOps/s 51.8150 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[True-False-True-False-False] 37.0210μs 12.0570μs 82.9396 KOps/s 85.1394 KOps/s $\color{#d91a1a}-2.58\%$
test_step_mdp_speed[True-False-False-True-True] 0.1645ms 36.0397μs 27.7472 KOps/s 28.1881 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-False-False-True-False] 60.8320μs 23.1867μs 43.1282 KOps/s 44.7624 KOps/s $\color{#d91a1a}-3.65\%$
test_step_mdp_speed[True-False-False-False-True] 68.4010μs 20.9347μs 47.7676 KOps/s 48.8914 KOps/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[True-False-False-False-False] 42.9600μs 13.7024μs 72.9800 KOps/s 78.3969 KOps/s $\textbf{\color{#d91a1a}-6.91\%}$
test_step_mdp_speed[False-True-True-True-True] 0.2301ms 34.7105μs 28.8097 KOps/s 29.3608 KOps/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[False-True-True-True-False] 0.2034ms 21.2543μs 47.0494 KOps/s 48.4139 KOps/s $\color{#d91a1a}-2.82\%$
test_step_mdp_speed[False-True-True-False-True] 0.2192ms 22.7094μs 44.0347 KOps/s 45.1929 KOps/s $\color{#d91a1a}-2.56\%$
test_step_mdp_speed[False-True-True-False-False] 90.4820μs 13.8338μs 72.2865 KOps/s 73.8764 KOps/s $\color{#d91a1a}-2.15\%$
test_step_mdp_speed[False-True-False-True-True] 72.7220μs 35.8647μs 27.8826 KOps/s 27.8742 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[False-True-False-True-False] 64.2710μs 22.8982μs 43.6715 KOps/s 44.1502 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[False-True-False-False-True] 56.3910μs 24.1067μs 41.4822 KOps/s 41.3019 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-True-False-False-False] 42.5210μs 15.3523μs 65.1368 KOps/s 66.0612 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[False-False-True-True-True] 68.7420μs 37.4364μs 26.7120 KOps/s 27.5805 KOps/s $\color{#d91a1a}-3.15\%$
test_step_mdp_speed[False-False-True-True-False] 0.1264ms 24.6004μs 40.6497 KOps/s 41.6306 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[False-False-True-False-True] 53.5110μs 23.7107μs 42.1750 KOps/s 42.0483 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[False-False-True-False-False] 41.3310μs 15.3868μs 64.9907 KOps/s 66.7393 KOps/s $\color{#d91a1a}-2.62\%$
test_step_mdp_speed[False-False-False-True-True] 55.9310μs 39.5743μs 25.2689 KOps/s 25.0057 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-False-False-True-False] 75.0720μs 26.4817μs 37.7619 KOps/s 38.3279 KOps/s $\color{#d91a1a}-1.48\%$
test_step_mdp_speed[False-False-False-False-True] 0.1359ms 25.4587μs 39.2792 KOps/s 39.5849 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[False-False-False-False-False] 41.2710μs 16.5601μs 60.3861 KOps/s 60.6911 KOps/s $\color{#d91a1a}-0.50\%$
test_values[generalized_advantage_estimate-True-True] 27.2530ms 26.6300ms 37.5516 Ops/s 35.7470 Ops/s $\textbf{\color{#35bf28}+5.05\%}$
test_values[vec_generalized_advantage_estimate-True-True] 0.1090s 3.0995ms 322.6284 Ops/s 340.3571 Ops/s $\textbf{\color{#d91a1a}-5.21\%}$
test_values[td0_return_estimate-False-False] 93.3910μs 66.4304μs 15.0533 KOps/s 14.8726 KOps/s $\color{#35bf28}+1.22\%$
test_values[td1_return_estimate-False-False] 67.1181ms 58.7469ms 17.0222 Ops/s 16.3834 Ops/s $\color{#35bf28}+3.90\%$
test_values[vec_td1_return_estimate-False-False] 1.4245ms 1.1122ms 899.1033 Ops/s 894.3683 Ops/s $\color{#35bf28}+0.53\%$
test_values[td_lambda_return_estimate-True-False] 96.4680ms 92.9338ms 10.7603 Ops/s 10.2169 Ops/s $\textbf{\color{#35bf28}+5.32\%}$
test_values[vec_td_lambda_return_estimate-True-False] 1.4414ms 1.1072ms 903.1894 Ops/s 883.9709 Ops/s $\color{#35bf28}+2.17\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 28.3139ms 27.8611ms 35.8923 Ops/s 35.8777 Ops/s $\color{#35bf28}+0.04\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9927ms 0.7767ms 1.2874 KOps/s 1.3154 KOps/s $\color{#d91a1a}-2.13\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8801ms 0.7149ms 1.3988 KOps/s 1.4317 KOps/s $\color{#d91a1a}-2.30\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6870ms 1.5199ms 657.9285 Ops/s 660.8648 Ops/s $\color{#d91a1a}-0.44\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9274ms 0.7345ms 1.3614 KOps/s 1.3527 KOps/s $\color{#35bf28}+0.65\%$
test_dqn_speed 80.1057ms 1.5703ms 636.8400 Ops/s 695.7407 Ops/s $\textbf{\color{#d91a1a}-8.47\%}$
test_ddpg_speed 3.2956ms 2.9262ms 341.7376 Ops/s 347.7255 Ops/s $\color{#d91a1a}-1.72\%$
test_sac_speed 8.9392ms 8.4231ms 118.7217 Ops/s 120.6895 Ops/s $\color{#d91a1a}-1.63\%$
test_redq_speed 17.6406ms 11.0625ms 90.3952 Ops/s 93.5945 Ops/s $\color{#d91a1a}-3.42\%$
test_redq_deprec_speed 12.1435ms 11.6558ms 85.7943 Ops/s 86.5440 Ops/s $\color{#d91a1a}-0.87\%$
test_td3_speed 18.0573ms 8.4058ms 118.9659 Ops/s 120.2432 Ops/s $\color{#d91a1a}-1.06\%$
test_cql_speed 29.5751ms 26.3069ms 38.0129 Ops/s 38.2363 Ops/s $\color{#d91a1a}-0.58\%$
test_a2c_speed 6.0594ms 5.7284ms 174.5683 Ops/s 175.5867 Ops/s $\color{#d91a1a}-0.58\%$
test_ppo_speed 6.7681ms 6.0861ms 164.3078 Ops/s 164.8776 Ops/s $\color{#d91a1a}-0.35\%$
test_reinforce_speed 5.0840ms 4.6373ms 215.6425 Ops/s 215.1587 Ops/s $\color{#35bf28}+0.22\%$
test_iql_speed 20.7466ms 20.1474ms 49.6341 Ops/s 49.4397 Ops/s $\color{#35bf28}+0.39\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.6603ms 4.4533ms 224.5544 Ops/s 221.4917 Ops/s $\color{#35bf28}+1.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3406ms 0.4111ms 2.4325 KOps/s 3.1797 KOps/s $\textbf{\color{#d91a1a}-23.50\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6138ms 0.3891ms 2.5701 KOps/s 3.4534 KOps/s $\textbf{\color{#d91a1a}-25.58\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.7274ms 4.4613ms 224.1509 Ops/s 223.4503 Ops/s $\color{#35bf28}+0.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6114ms 0.4054ms 2.4670 KOps/s 3.2356 KOps/s $\textbf{\color{#d91a1a}-23.76\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 9.4202ms 0.3823ms 2.6159 KOps/s 3.5111 KOps/s $\textbf{\color{#d91a1a}-25.50\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.9007ms 1.7300ms 578.0181 Ops/s 644.5667 Ops/s $\textbf{\color{#d91a1a}-10.32\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.8988ms 1.6361ms 611.1935 Ops/s 686.8405 Ops/s $\textbf{\color{#d91a1a}-11.01\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.7567ms 4.5912ms 217.8066 Ops/s 217.1305 Ops/s $\color{#35bf28}+0.31\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7066ms 0.5377ms 1.8597 KOps/s 2.0697 KOps/s $\textbf{\color{#d91a1a}-10.14\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 9.6423ms 0.4530ms 2.2073 KOps/s 2.1376 KOps/s $\color{#35bf28}+3.26\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0556ms 4.4922ms 222.6071 Ops/s 223.1150 Ops/s $\color{#d91a1a}-0.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4986ms 0.3205ms 3.1203 KOps/s 3.1719 KOps/s $\color{#d91a1a}-1.63\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 9.4910ms 0.3232ms 3.0943 KOps/s 3.4296 KOps/s $\textbf{\color{#d91a1a}-9.78\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.7095ms 4.4579ms 224.3212 Ops/s 224.9549 Ops/s $\color{#d91a1a}-0.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5777ms 0.3907ms 2.5593 KOps/s 2.4500 KOps/s $\color{#35bf28}+4.46\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 9.5799ms 0.3003ms 3.3300 KOps/s 3.4433 KOps/s $\color{#d91a1a}-3.29\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.8066ms 4.6089ms 216.9702 Ops/s 217.2290 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3186ms 0.5533ms 1.8074 KOps/s 2.2691 KOps/s $\textbf{\color{#d91a1a}-20.35\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7507ms 0.5356ms 1.8671 KOps/s 2.4303 KOps/s $\textbf{\color{#d91a1a}-23.17\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1448s 8.0295ms 124.5402 Ops/s 125.7742 Ops/s $\color{#d91a1a}-0.98\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 20.7034ms 16.0327ms 62.3725 Ops/s 62.4076 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0354ms 0.9246ms 1.0815 KOps/s 840.8454 Ops/s $\textbf{\color{#35bf28}+28.62\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1230s 7.6263ms 131.1246 Ops/s 130.4628 Ops/s $\color{#35bf28}+0.51\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1341s 18.3623ms 54.4593 Ops/s 54.5766 Ops/s $\color{#d91a1a}-0.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.2470ms 1.0252ms 975.4576 Ops/s 1.0462 KOps/s $\textbf{\color{#d91a1a}-6.76\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1249s 7.8202ms 127.8737 Ops/s 127.7830 Ops/s $\color{#35bf28}+0.07\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 20.9688ms 16.0789ms 62.1935 Ops/s 62.3377 Ops/s $\color{#d91a1a}-0.23\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.6962ms 1.2222ms 818.1979 Ops/s 892.2030 Ops/s $\textbf{\color{#d91a1a}-8.29\%}$
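
For reading the tables above: the Change column appears to be the relative difference between the PR's Ops and the Ops on Repo HEAD. The sketch below checks that reading against the first two rows of the CPU table; `relative_change` is a hypothetical helper, not part of the benchmark tooling.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput versus HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) copied from the CPU table
rows = {
    "test_single": (16.2980, 16.9326),
    "test_sync": (30.5128, 29.2820),
}

formatted = {name: f"{relative_change(pr, head):+.2f}%" for name, (pr, head) in rows.items()}
for name, change in formatted.items():
    print(name, change)
```

Rows that mix units (for example KOps/s against Ops/s) need both values converted to the same unit before the formula is applied.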

@vmoens vmoens merged commit eb35793 into main Jun 20, 2024
36 of 47 checks passed
@vmoens vmoens deleted the fix-clamp-index-prb branch June 20, 2024 12:28
@@ -475,6 +475,15 @@ def sample(self, storage: Storage, batch_size: int) -> torch.Tensor:
index = index.unsqueeze(0)
index.clamp_max_(len(storage) - 1)
weight = torch.as_tensor(self._sum_tree[index])

@vmoens it seems the weight here is not updated. I think we should recompute weight = torch.as_tensor(self._sum_tree[index]) after changing the index?
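
The concern can be illustrated with a small torch-free sketch. It assumes the index is adjusted after the weight has already been looked up; `lookup`, `leaf_priorities`, and the clamping step are hypothetical stand-ins for the sampler's sum tree and index correction, not the actual torchrl code.

```python
def lookup(priorities, indices):
    """Stand-in for self._sum_tree[index]: read one priority per index."""
    return [priorities[i] for i in indices]


# three stored items plus one unused leaf with zero priority
leaf_priorities = [0.5, 2.0, 1.0, 0.0]
n_stored = 3

raw_index = [1, 3]  # the second index points past the stored data

# weight read BEFORE the index is corrected: it still describes the old index
stale_weight = lookup(leaf_priorities, raw_index)

# the index is then changed (clamping is used here as the example correction)
index = [min(i, n_stored - 1) for i in raw_index]

# weight read AFTER the correction: it matches the indices actually returned
fresh_weight = lookup(leaf_priorities, index)

print(stale_weight, index, fresh_weight)
```

In this toy case the stale lookup reports a zero weight for an item whose real priority is 1.0, which is why re-reading the weight after the last index change seems like the safer ordering.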

Successfully merging this pull request may close these issues.

[BUG] Unexpected behavior of SumSegmentTree Resulting in Invalid Slices in PrioritizedSliceSampler.sample()
3 participants