Fix CUDA-11.4 build issue
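Summary: `#include <torch/script.h>` was introduced by D65260109 and, for reasons not yet understood, triggers an internal compiler error (ICE) in NVCC 11.4. This commit removes the include and replaces the one use of `torch::kInt32` with `at::kInt`.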

Reviewed By: xw285cornell

Differential Revision: D66512512
malfet authored and facebook-github-bot committed Nov 27, 2024
1 parent cffa05a commit d49bc17
Showing 1 changed file with 4 additions and 4 deletions.
@@ -10,7 +10,6 @@
#include <ATen/cuda/CUDAContext.h>
#include <cutlass/util/device_memory.h>
#include <cutlass/util/packed_stride.hpp>
-#include <torch/script.h>

// clang-format off
// The fixed ordering of the headers is required for CUTLASS 3.2+
@@ -332,9 +331,10 @@ at::Tensor bf16bf16bf16_grouped_impl(
auto stream = at::cuda::getCurrentCUDAStream().stream();
int64_t output_offset = 0;

-if (zero_start_index_M.has_value() == true) {
-TORCH_CHECK(zero_start_index_M.value().dtype() == torch::kInt32);
-}
+// If passed, zero_start_index_M must be tensor of int32
+TORCH_CHECK(
+!zero_start_index_M.has_value() ||
+zero_start_index_M->dtype() == at::kInt);

// Set arguments
for (int i = 0; i < problem_count; ++i) {
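
Note on the rewritten check: the second hunk folds the `if` plus `TORCH_CHECK` into a single condition of the form "absent, or present with the right dtype". The sketch below illustrates that equivalence in plain C++ and is not the code from this commit. `DType`, `FakeTensor`, `check_nested`, and `check_flat` are hypothetical stand-ins invented for the illustration, and `std::optional` stands in for the optional tensor argument so the snippet builds without ATen or CUDA.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a tensor dtype tag; not an ATen type.
enum class DType { Int32, Int64, BFloat16 };

// Hypothetical stand-in for a tensor that only records its dtype.
struct FakeTensor {
  DType dtype;
};

// Shape of the check before the change: validate only inside an `if`.
bool check_nested(const std::optional<FakeTensor>& t) {
  if (t.has_value() == true) {
    return t.value().dtype == DType::Int32;
  }
  return true;  // nothing was passed, so there is nothing to validate
}

// Shape of the check after the change: one boolean expression.
// Short-circuit evaluation of || means t->dtype is read only when
// t actually holds a value.
bool check_flat(const std::optional<FakeTensor>& t) {
  return !t.has_value() || t->dtype == DType::Int32;
}
```

Both helpers accept an empty optional and an int32 value, and both reject any other dtype, so under these assumptions the flattened form keeps the same accept/reject behavior while no longer naming anything from the `torch::` namespace.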