[Performance] consolidate TDs in ParallelEnv without buffers #2231
Merged
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2231
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures, 1 Unrelated Failure
As of commit 18d10e8 with merge base eb6c85d:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was also present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
facebook-github-bot added the CLA Signed label Jun 14, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1201s | 59.2226ms | 16.8855 Ops/s | 18.0148 Ops/s | |
test_sync | 38.4063ms | 32.0311ms | 31.2196 Ops/s | 29.0733 Ops/s | |
test_async | 50.0934ms | 29.3137ms | 34.1138 Ops/s | 34.5825 Ops/s | |
test_simple | 0.3748s | 0.3741s | 2.6734 Ops/s | 2.6661 Ops/s | |
test_transformed | 0.5311s | 0.5291s | 1.8901 Ops/s | 1.8087 Ops/s | |
test_serial | 1.3361s | 1.2767s | 0.7832 Ops/s | 0.7908 Ops/s | |
test_parallel | 1.1401s | 1.0794s | 0.9265 Ops/s | 0.9117 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.1376ms | 22.9070μs | 43.6547 KOps/s | 44.7474 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 42.2590μs | 13.5831μs | 73.6208 KOps/s | 74.8991 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 40.0550μs | 13.1530μs | 76.0285 KOps/s | 76.3698 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 26.8700μs | 7.9158μs | 126.3296 KOps/s | 128.7063 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 59.8520μs | 24.1592μs | 41.3920 KOps/s | 41.9946 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 39.0230μs | 14.6738μs | 68.1487 KOps/s | 68.7577 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 42.4000μs | 14.4505μs | 69.2018 KOps/s | 69.6389 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 36.6290μs | 9.1559μs | 109.2198 KOps/s | 111.4671 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 61.9360μs | 25.4166μs | 39.3444 KOps/s | 39.5727 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 36.4580μs | 15.9838μs | 62.5632 KOps/s | 62.1967 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 59.0600μs | 14.2372μs | 70.2387 KOps/s | 70.0893 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 30.1360μs | 9.0054μs | 111.0440 KOps/s | 110.7412 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 57.6280μs | 26.8344μs | 37.2656 KOps/s | 37.5487 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 49.2720μs | 16.8908μs | 59.2039 KOps/s | 58.6740 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 41.1470μs | 15.3864μs | 64.9924 KOps/s | 63.8370 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 30.1470μs | 9.7631μs | 102.4260 KOps/s | 99.2617 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 54.3310μs | 25.6537μs | 38.9807 KOps/s | 39.4914 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 41.4180μs | 16.1330μs | 61.9846 KOps/s | 62.3461 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 43.4220μs | 16.6663μs | 60.0012 KOps/s | 59.9750 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 34.0140μs | 10.3120μs | 96.9748 KOps/s | 97.9057 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 74.3790μs | 26.7348μs | 37.4044 KOps/s | 37.8008 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 68.8480μs | 17.1990μs | 58.1430 KOps/s | 58.5675 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 44.8940μs | 17.4556μs | 57.2882 KOps/s | 55.8906 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 32.8120μs | 11.2573μs | 88.8314 KOps/s | 87.6021 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 59.9020μs | 27.1851μs | 36.7848 KOps/s | 36.4322 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 56.2720μs | 18.2955μs | 54.6583 KOps/s | 54.6852 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 54.1610μs | 18.0791μs | 55.3123 KOps/s | 55.5742 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 42.7690μs | 11.5075μs | 86.9000 KOps/s | 85.9589 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 42.8000μs | 29.5732μs | 33.8144 KOps/s | 34.1094 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 52.9890μs | 19.3028μs | 51.8061 KOps/s | 51.6715 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 42.0390μs | 18.7533μs | 53.3238 KOps/s | 55.5204 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 35.3760μs | 12.2324μs | 81.7502 KOps/s | 79.8214 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 12.4540ms | 9.3418ms | 107.0453 Ops/s | 103.7001 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 51.2840ms | 36.3196ms | 27.5334 Ops/s | 28.1175 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.2418ms | 0.1661ms | 6.0215 KOps/s | 6.0943 KOps/s | |
test_values[td1_return_estimate-False-False] | 39.7834ms | 24.5024ms | 40.8124 Ops/s | 42.5957 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 38.2765ms | 35.6386ms | 28.0595 Ops/s | 28.6149 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 37.1308ms | 33.7669ms | 29.6148 Ops/s | 29.6019 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 37.3138ms | 34.8774ms | 28.6719 Ops/s | 28.1043 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.6799ms | 8.2614ms | 121.0455 Ops/s | 119.1485 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.9017ms | 1.7977ms | 556.2692 Ops/s | 565.1140 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5059ms | 0.3495ms | 2.8612 KOps/s | 2.8799 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 49.0552ms | 45.6233ms | 21.9186 Ops/s | 24.7836 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.7911ms | 3.0204ms | 331.0837 Ops/s | 327.6142 Ops/s | |
test_dqn_speed | 6.6529ms | 1.3431ms | 744.5365 Ops/s | 753.4103 Ops/s | |
test_ddpg_speed | 3.5361ms | 2.8391ms | 352.2260 Ops/s | 361.3506 Ops/s | |
test_sac_speed | 9.3380ms | 8.5319ms | 117.2076 Ops/s | 118.9792 Ops/s | |
test_redq_speed | 15.9461ms | 13.9783ms | 71.5397 Ops/s | 70.1969 Ops/s | |
test_redq_deprec_speed | 14.8197ms | 13.4359ms | 74.4276 Ops/s | 74.2289 Ops/s | |
test_td3_speed | 8.7786ms | 8.4958ms | 117.7052 Ops/s | 120.3391 Ops/s | |
test_cql_speed | 38.0057ms | 36.8549ms | 27.1334 Ops/s | 27.4417 Ops/s | |
test_a2c_speed | 8.8432ms | 7.4467ms | 134.2869 Ops/s | 137.7526 Ops/s | |
test_ppo_speed | 8.9104ms | 7.7297ms | 129.3714 Ops/s | 130.3205 Ops/s | |
test_reinforce_speed | 7.6338ms | 6.6577ms | 150.2028 Ops/s | 150.8066 Ops/s | |
test_iql_speed | 34.2873ms | 32.9241ms | 30.3729 Ops/s | 30.6301 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3115ms | 3.4144ms | 292.8735 Ops/s | 295.6828 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7406ms | 0.4909ms | 2.0373 KOps/s | 1.7878 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.8945ms | 0.4650ms | 2.1505 KOps/s | 2.1407 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.2828ms | 3.4229ms | 292.1469 Ops/s | 299.4209 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8917ms | 0.4837ms | 2.0675 KOps/s | 2.0649 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8149ms | 0.4615ms | 2.1670 KOps/s | 2.1961 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.4840ms | 1.7286ms | 578.4966 Ops/s | 600.2405 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.9854ms | 1.6469ms | 607.1953 Ops/s | 623.0809 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.9718ms | 3.5345ms | 282.9246 Ops/s | 293.1909 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2341ms | 0.6247ms | 1.6007 KOps/s | 1.6093 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9066ms | 0.6029ms | 1.6585 KOps/s | 1.6668 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.1618ms | 3.4044ms | 293.7374 Ops/s | 292.5491 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9672ms | 0.4864ms | 2.0559 KOps/s | 2.0427 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7326ms | 0.4732ms | 2.1133 KOps/s | 2.1586 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.2156ms | 3.3891ms | 295.0678 Ops/s | 291.4815 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5746ms | 0.4813ms | 2.0778 KOps/s | 2.0613 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.7276ms | 0.4662ms | 2.1450 KOps/s | 2.1779 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.3662ms | 3.5405ms | 282.4472 Ops/s | 283.0629 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1784ms | 0.6273ms | 1.5940 KOps/s | 1.6064 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8300ms | 0.6013ms | 1.6631 KOps/s | 1.6637 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1383s | 6.4829ms | 154.2520 Ops/s | 117.6903 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1597s | 15.4608ms | 64.6797 Ops/s | 80.8929 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.6666ms | 1.1401ms | 877.0875 Ops/s | 972.0960 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1230s | 6.1727ms | 162.0040 Ops/s | 174.2111 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 15.4546ms | 12.5859ms | 79.4541 Ops/s | 83.0276 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.5154ms | 1.0415ms | 960.1140 Ops/s | 956.8717 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1091s | 6.0538ms | 165.1862 Ops/s | 163.0899 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.4251ms | 12.8259ms | 77.9673 Ops/s | 80.1762 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.3216ms | 1.2789ms | 781.8920 Ops/s | 815.5396 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1175s | 0.1173s | 8.5256 Ops/s | 8.5332 Ops/s | |
test_sync | 0.1058s | 0.1033s | 9.6775 Ops/s | 10.1744 Ops/s | |
test_async | 0.1988s | 0.1003s | 9.9675 Ops/s | 10.4190 Ops/s | |
test_single_pixels | 0.1284s | 0.1278s | 7.8234 Ops/s | 7.7698 Ops/s | |
test_sync_pixels | 89.7415ms | 84.0924ms | 11.8917 Ops/s | 12.3087 Ops/s | |
test_async_pixels | 0.1601s | 69.8413ms | 14.3182 Ops/s | 15.0135 Ops/s | |
test_simple | 0.8794s | 0.8180s | 1.2225 Ops/s | 1.2426 Ops/s | |
test_transformed | 1.1395s | 1.0780s | 0.9277 Ops/s | 0.9309 Ops/s | |
test_serial | 2.5408s | 2.4798s | 0.4033 Ops/s | 0.4022 Ops/s | |
test_parallel | 2.4179s | 2.3604s | 0.4236 Ops/s | 0.4234 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.1015ms | 34.6777μs | 28.8370 KOps/s | 29.3974 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 44.8620μs | 19.8730μs | 50.3195 KOps/s | 50.5187 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 38.7300μs | 19.7522μs | 50.6274 KOps/s | 51.3364 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 27.4200μs | 11.3384μs | 88.1957 KOps/s | 88.4101 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 51.9310μs | 36.5485μs | 27.3609 KOps/s | 28.1476 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 39.7310μs | 21.8642μs | 45.7369 KOps/s | 47.0170 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 53.2600μs | 21.2967μs | 46.9557 KOps/s | 47.0611 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 38.0010μs | 13.2619μs | 75.4038 KOps/s | 75.8870 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 58.0410μs | 37.9796μs | 26.3299 KOps/s | 26.7121 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 40.3810μs | 23.5614μs | 42.4423 KOps/s | 42.4101 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 39.8710μs | 21.8132μs | 45.8439 KOps/s | 47.2785 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 31.9600μs | 13.2042μs | 75.7336 KOps/s | 75.2700 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 67.2010μs | 40.0701μs | 24.9562 KOps/s | 25.4958 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 44.9910μs | 24.9882μs | 40.0190 KOps/s | 39.2739 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 41.2300μs | 22.7438μs | 43.9680 KOps/s | 43.5983 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 31.1700μs | 14.9397μs | 66.9358 KOps/s | 67.0887 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 54.3300μs | 37.9407μs | 26.3569 KOps/s | 26.7114 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 41.6620μs | 23.8722μs | 41.8898 KOps/s | 42.6934 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 0.1915ms | 25.7233μs | 38.8753 KOps/s | 39.2559 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 34.8600μs | 15.1624μs | 65.9527 KOps/s | 67.3926 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 59.5110μs | 40.1862μs | 24.8842 KOps/s | 25.4845 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 43.4910μs | 25.4521μs | 39.2894 KOps/s | 39.6933 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 47.2310μs | 27.4193μs | 36.4707 KOps/s | 36.6633 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 40.9400μs | 16.9128μs | 59.1270 KOps/s | 60.0393 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 64.3710μs | 41.7882μs | 23.9302 KOps/s | 24.3642 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 45.7700μs | 27.3276μs | 36.5930 KOps/s | 36.7907 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 47.8410μs | 27.6957μs | 36.1067 KOps/s | 37.5462 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 33.8410μs | 16.7383μs | 59.7431 KOps/s | 60.3449 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 64.7610μs | 44.2091μs | 22.6198 KOps/s | 22.9041 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 50.3410μs | 29.3589μs | 34.0612 KOps/s | 34.0527 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 50.0500μs | 28.6516μs | 34.9021 KOps/s | 35.3558 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 35.6300μs | 18.5333μs | 53.9570 KOps/s | 54.7167 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 24.0645ms | 23.5009ms | 42.5516 Ops/s | 42.2931 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 89.6213ms | 2.6805ms | 373.0600 Ops/s | 364.9216 Ops/s | |
test_values[td0_return_estimate-False-False] | 89.8410μs | 64.6989μs | 15.4562 KOps/s | 15.4430 KOps/s | |
test_values[td1_return_estimate-False-False] | 53.5278ms | 53.0963ms | 18.8337 Ops/s | 18.7936 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 1.3287ms | 1.0662ms | 937.9255 Ops/s | 935.3004 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 87.0180ms | 84.6711ms | 11.8104 Ops/s | 11.7714 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 1.4353ms | 1.0650ms | 938.9514 Ops/s | 937.3132 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.8513ms | 23.6582ms | 42.2687 Ops/s | 41.5246 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9291ms | 0.7006ms | 1.4272 KOps/s | 1.4051 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7490ms | 0.6543ms | 1.5284 KOps/s | 1.5082 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.4879ms | 1.4543ms | 687.6153 Ops/s | 687.1469 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.6928ms | 0.6689ms | 1.4949 KOps/s | 1.4940 KOps/s | |
test_dqn_speed | 2.8332ms | 1.4510ms | 689.1869 Ops/s | 696.2505 Ops/s | |
test_ddpg_speed | 3.0951ms | 2.9606ms | 337.7639 Ops/s | 343.2048 Ops/s | |
test_sac_speed | 8.5514ms | 8.3353ms | 119.9712 Ops/s | 118.7998 Ops/s | |
test_redq_speed | 0.1024s | 11.7741ms | 84.9325 Ops/s | 93.7374 Ops/s | |
test_redq_deprec_speed | 12.3327ms | 11.6910ms | 85.5359 Ops/s | 87.4485 Ops/s | |
test_td3_speed | 8.5480ms | 8.3577ms | 119.6506 Ops/s | 119.6194 Ops/s | |
test_cql_speed | 26.0323ms | 25.3865ms | 39.3911 Ops/s | 39.0816 Ops/s | |
test_a2c_speed | 6.9041ms | 5.6856ms | 175.8842 Ops/s | 181.4542 Ops/s | |
test_ppo_speed | 6.2317ms | 5.9792ms | 167.2457 Ops/s | 172.1726 Ops/s | |
test_reinforce_speed | 5.3672ms | 4.6173ms | 216.5748 Ops/s | 216.4199 Ops/s | |
test_iql_speed | 20.2436ms | 19.5416ms | 51.1729 Ops/s | 50.8615 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.8099ms | 4.6532ms | 214.9078 Ops/s | 216.0321 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1075s | 0.6938ms | 1.4413 KOps/s | 1.6835 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7240ms | 0.5731ms | 1.7449 KOps/s | 1.7403 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.8178ms | 4.6115ms | 216.8501 Ops/s | 217.6854 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2738ms | 0.5908ms | 1.6926 KOps/s | 1.7065 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7381ms | 0.5680ms | 1.7607 KOps/s | 1.7712 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 4.4607ms | 2.0773ms | 481.4035 Ops/s | 482.2085 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.1255ms | 1.9622ms | 509.6276 Ops/s | 510.7211 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.8990ms | 4.7772ms | 209.3255 Ops/s | 212.0965 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8643ms | 0.7451ms | 1.3422 KOps/s | 1.2744 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.4420ms | 0.7319ms | 1.3664 KOps/s | 1.3008 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.7710ms | 4.6340ms | 215.7942 Ops/s | 216.6947 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7285ms | 0.5969ms | 1.6753 KOps/s | 1.6578 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7261ms | 0.5724ms | 1.7471 KOps/s | 1.7329 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.7748ms | 4.5687ms | 218.8797 Ops/s | 219.1000 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7603ms | 0.5936ms | 1.6845 KOps/s | 1.6813 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.1391s | 0.7809ms | 1.2806 KOps/s | 1.7543 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.9728ms | 4.7924ms | 208.6652 Ops/s | 209.9341 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9044ms | 0.7494ms | 1.3343 KOps/s | 1.3435 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9021ms | 0.7263ms | 1.3769 KOps/s | 1.3921 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1285s | 7.4243ms | 134.6924 Ops/s | 135.5177 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.0941ms | 15.6439ms | 63.9227 Ops/s | 63.0259 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.3607ms | 1.2820ms | 780.0296 Ops/s | 753.6734 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1243s | 7.3313ms | 136.4023 Ops/s | 102.9270 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1382s | 18.2585ms | 54.7690 Ops/s | 62.3783 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.5703ms | 1.4118ms | 708.3106 Ops/s | 769.1381 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1245s | 7.5022ms | 133.2945 Ops/s | 133.7177 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 18.3551ms | 15.8477ms | 63.1008 Ops/s | 63.2428 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.5474ms | 1.4761ms | 677.4635 Ops/s | 636.8191 Ops/s |
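The Change column in the tables above came through empty. As a rough sketch, the relative difference between "Ops" and "Ops on Repo HEAD" can be recomputed from the row text itself. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling, and the `MOps/s` unit is an assumption since only `Ops/s` and `KOps/s` appear here.

```python
import re

# Multipliers for throughput units. Only Ops/s and KOps/s appear in the
# tables above; MOps/s is included as an assumption for completeness.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '43.6547 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    return float(match.group(1)) * UNIT_SCALE[match.group(2)]


def relative_change(row):
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])   # fourth column: Ops for this PR
    head = parse_ops(cells[4])  # fifth column: Ops on Repo HEAD
    return name, (ops - head) / head * 100.0


# Two rows copied verbatim from the first table.
rows = [
    "test_sync | 38.4063ms | 32.0311ms | 31.2196 Ops/s | 29.0733 Ops/s | |",
    "test_step_mdp_speed[True-True-True-True-True] | 0.1376ms | 22.9070μs | 43.6547 KOps/s | 44.7474 KOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.2f}%")
```

A positive value means the PR branch ran faster than HEAD for that test. Differences of a few percent in a single benchmark run may be within run-to-run variation, so they are best read as indicative only.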
Labels: CLA Signed, performance
Improves the performance of ParallelEnv when it runs without buffers.
FPS for CartPole-v1 with gym on 4 processes (local) goes from 660 fps to 1000 fps on my machine.
Based on pytorch/tensordict#814.
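To put the reported numbers in perspective, here is a minimal arithmetic sketch of the speedup they imply. The function name `speedup` is hypothetical; the two FPS values are the ones quoted in the description.

```python
def speedup(before_fps, after_fps):
    """Return (ratio, percent increase) for a change in throughput."""
    if before_fps <= 0:
        raise ValueError("before_fps must be positive")
    ratio = after_fps / before_fps
    return ratio, (ratio - 1.0) * 100.0


# FPS figures reported in the PR description (CartPole-v1, 4 processes).
ratio, pct = speedup(660.0, 1000.0)
print(f"{ratio:.2f}x faster ({pct:.1f}% more frames per second)")
```

In other words, the change is roughly a 1.5x throughput gain on the author's machine; results on other hardware may differ.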