Commit

fix: forgot to always set _disable_torch_cuda_device_set
Signed-off-by: Terry Kong <terryk@nvidia.com>
terrykong committed Oct 2, 2024
1 parent 0142ee7 commit 148543d
Showing 1 changed file with 1 addition and 0 deletions.
nemo/export/trt_llm/tensorrt_llm_run.py (1 addition, 0 deletions)

@@ -502,6 +502,7 @@ def load_distributed(engine_dir, model_parallel_rank, gpus_per_node):
         # We want the engine to have the mp_rank, but the python runtime to not reassign the device of the current process
         # So we will set it to the current device
         rank=torch.cuda.current_device(),
+        _disable_torch_cuda_device_set=True,
     )

     tensorrt_llm_worker_context.decoder = decoder
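
For context, below is a minimal sketch of how the call containing this hunk plausibly looks inside load_distributed(). The use of TensorRT-LLM's ModelRunnerCpp.from_dir and the surrounding arguments are assumptions inferred from the diff, not verbatim repo code; only rank=torch.cuda.current_device() and _disable_torch_cuda_device_set=True appear in the hunk itself.

    # Hypothetical reconstruction (assumption: the decoder is built with
    # TensorRT-LLM's ModelRunnerCpp.from_dir; only the last two keyword
    # arguments are taken verbatim from the diff above).
    import torch
    from tensorrt_llm.runtime import ModelRunnerCpp

    def load_distributed_sketch(engine_dir, model_parallel_rank, gpus_per_node):
        # Each model-parallel worker process has already been bound to its GPU,
        # so the Python runtime should not re-issue torch.cuda.set_device()
        # based on the rank passed to the runner.
        decoder = ModelRunnerCpp.from_dir(
            engine_dir,
            rank=torch.cuda.current_device(),
            _disable_torch_cuda_device_set=True,
        )
        return decoder

As the in-line comment describes, the intent appears to be that the engine still receives a rank for its own bookkeeping, while the private _disable_torch_cuda_device_set flag keeps the runtime from reassigning the current process's device; this commit makes sure the flag is always set on that call.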
