Support varlen FlashAttention #23

Seventeen17 · 2024-09-19T12:25:28Z

No description provided.

baoleai · 2024-09-20T02:58:41Z

torchacc/ops/flash_attn.py

        if softmax_scale is None:
            softmax_scale = q.shape[-1]**(-0.5)
        assert isinstance(window_size, tuple) and len(window_size) == 2

-        softmax_lse, out, rng_state = torch_xla._XLAC._flash_attention_forward(


SPMDFlashAttnVarlenXla and FlashAttnVarlenQKVPackedXla also need to be updated.

Seventeen17 · 2024-09-23

Done. However, I could not find any unit tests (UT) for them.
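
For readers following the hunk above, here is a minimal sketch of the argument handling it shows: the softmax scale defaults to head_dim ** -0.5 and window_size must be a 2-tuple. The helper name resolve_attention_args and the (-1, -1) default for "no sliding window" are assumptions made for illustration; they are not taken from torchacc/ops/flash_attn.py.

```python
def resolve_attention_args(head_dim, softmax_scale=None, window_size=(-1, -1)):
    """Hypothetical helper mirroring the checks visible in the hunk.

    head_dim plays the role of q.shape[-1]. The (-1, -1) default window
    is an assumption for this sketch, not confirmed by the PR.
    """
    if softmax_scale is None:
        # Default scale: 1 / sqrt(head_dim), same as q.shape[-1] ** (-0.5).
        softmax_scale = head_dim ** (-0.5)
    if not (isinstance(window_size, tuple) and len(window_size) == 2):
        raise ValueError("window_size must be a tuple of length 2")
    return softmax_scale, window_size


if __name__ == "__main__":
    scale, window = resolve_attention_args(head_dim=64)
    print(scale, window)
```

Any class that shares this preamble (for example the SPMD and QKV-packed variants mentioned in the comment) would need the same defaulting and validation, which is one reason to cover them with unit tests.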

anw90 · 2024-10-08T02:58:47Z

tests/ops/test_flash_attn.py

-@pytest.mark.parametrize("dtype", [torch.float16, torch.bfloat16])
-@pytest.mark.parametrize("mha_type", ["mha", "mqa", "gqa"])
-@pytest.mark.parametrize("deterministic", [False, True])
+@pytest.mark.parametrize("dtype", [torch.bfloat16])


Why are we removing so many test options?
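
To put a rough number on the concern, the sketch below counts the parameter combinations for the axes visible in this hunk only. It assumes no other parametrize decorators interact with these three, which the excerpt does not confirm; dtype names are written as plain strings so the snippet runs without torch.

```python
from itertools import product

# Axes removed in the hunk (dtype values shown as strings for illustration).
original_axes = {
    "dtype": ["float16", "bfloat16"],
    "mha_type": ["mha", "mqa", "gqa"],
    "deterministic": [False, True],
}

# Axis left after the change, as far as the hunk shows.
reduced_axes = {
    "dtype": ["bfloat16"],
}


def count_cases(axes):
    """Number of test cases generated by the cross product of the axes."""
    return len(list(product(*axes.values())))


if __name__ == "__main__":
    print("before:", count_cases(original_axes))
    print("after:", count_cases(reduced_axes))
```

Under that assumption, the visible axes drop from 12 combinations to 1, so float16, MQA/GQA, and deterministic mode would no longer be exercised by this test.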

anw90 · 2024-10-08T03:00:55Z

tests/ops/test_flash_attn_varlen.py

+    )
+
+
+class FlashAttention2(nn.Module):


What does the 2 in FlashAttention2 mean? Does it stand for FA2 (FlashAttention-2), or is it just a magic number?

anw90 · 2024-10-08T03:02:28Z

tests/ops/test_flash_attn_varlen.py

+        dk_xla,
+        dv_xla,
+    ) = torch.autograd.grad(ret_xla, (q_xla, k_xla, v_xla), g_xla)
+    ta.mark_step()


Use ta.sync instead of ta.mark_step.

Seventeen17 assigned anw90 and baoleai Sep 19, 2024

Seventeen17 requested review from yitongh and anw90 September 19, 2024 12:30

Seventeen17 unassigned anw90 and baoleai Sep 19, 2024

baoleai reviewed Sep 20, 2024

anw90 reviewed Oct 8, 2024

Seventeen17 added 2 commits November 7, 2024 15:42

adapt to new fa

f4e27ac

refine test cases

c3ad6dc

Seventeen17 force-pushed the dev/fa_varlen_swt branch from c6b2cb0 to c3ad6dc on November 19, 2024 07:57

code format

aa16662
