Fix index_reduce on fake-hpu
madamczykhabana committed Nov 13, 2024
1 parent d3b6ef8 commit 43869e7
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions vllm/worker/hpu_model_runner.py
@@ -361,12 +361,15 @@ def _set_block_mapping(self, metadata, batch_size, device, dtype):
         else:
             # Unfortunately one_hot on CPU/torch.compile mode/eager mode
             # doesn't handle out of bounds classes,
-            # so we convert all negative values to 0.
-            block_mapping = torch.nn.functional.relu(metadata.block_groups)
+            # so we convert all negative values to 0 (block_mapping) or bs (block_groups)
+            block_groups = metadata.block_groups.to(torch.long)
+            block_mapping = torch.nn.functional.relu(block_groups)
             block_mapping = torch.nn.functional.one_hot(block_mapping,
                                                         num_classes=batch_size)
-            oob_values = metadata.block_groups.lt(0)
+            oob_values = block_groups.lt(0)
             block_mapping.masked_fill_(oob_values.unsqueeze(-1), 0)
+            block_groups.masked_fill_(oob_values, batch_size)
+            metadata = metadata._replace(block_groups=block_groups)
         block_mapping = block_mapping.to(dtype)
         metadata = metadata._replace(block_mapping=block_mapping,
                                      attn_bias=attn_bias)

Check failure on line 361 in vllm/worker/hpu_model_runner.py, GitHub Actions / ruff (3.12): vllm/worker/hpu_model_runner.py:361:81: E501 Line too long (89 > 80)
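For context, the hunk relies on a workaround for torch.nn.functional.one_hot, which (per the in-code comment) does not accept the negative "out of bounds" class values used as padding markers on the CPU/eager/torch.compile path. Below is a minimal standalone sketch of the same pattern; the tensor values, batch_size=4, and the -1 padding marker are illustrative assumptions, not values taken from vLLM.

import torch

# Illustrative inputs (assumed, not from vLLM): block_groups maps each
# block to the batch index it belongs to, with -1 marking padding blocks.
block_groups = torch.tensor([0, 1, 1, -1, 2, -1])
batch_size = 4

# one_hot rejects negative class values, so clamp them to 0 first.
block_groups = block_groups.to(torch.long)
block_mapping = torch.nn.functional.relu(block_groups)
block_mapping = torch.nn.functional.one_hot(block_mapping,
                                            num_classes=batch_size)

# Then undo the clamping: zero out the one-hot rows that were padding, and
# send the padding entries of block_groups to a dummy index == batch_size,
# mirroring the lines this commit adds.
oob_values = block_groups.lt(0)
block_mapping.masked_fill_(oob_values.unsqueeze(-1), 0)
block_groups.masked_fill_(oob_values, batch_size)

print(block_groups.tolist())      # [0, 1, 1, 4, 2, 4]
print(block_mapping[3].tolist())  # [0, 0, 0, 0] -- padding row is all zeros

As a side note, one_hot requires an int64 input tensor, which is presumably what the new .to(torch.long) cast guards against on the fake-hpu (CPU) path.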
