In flash attention, Q's reduction dimension is typically relatively small, so the only reduction tiling is across the K2 dimension, i.e. the reduction dimension of the second GEMM (P x V). Hence, an optimization we can do is to hoist the read of Q from global memory out of the for loop. This generates quite a big speedup (hoisting Q and using a global->register read for Q typically gives a 2x speedup).
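
For intuition, here is a minimal NumPy sketch of that loop structure (not the actual wave/TKW kernel, and without any of its tiling/MMA details): Q participates in every iteration but never changes, so its read sits outside the K2 loop, while the K and V tiles are read per iteration.

```python
import numpy as np

def flash_attention(Q, K, V, k2_tile=128):
    """Streaming (online-softmax) attention.

    Q: (M, K1), K: (K2, K1), V: (K2, N). The only loop-carried reduction is
    over K2 (the reduction dim of the second GEMM, P @ V), so the Q read is
    loop invariant and happens once, before the loop.
    """
    M, _ = Q.shape
    K2, N = V.shape

    q = Q.astype(np.float32)                  # hoisted "global -> register" read of Q

    acc = np.zeros((M, N), dtype=np.float32)             # running P @ V accumulator
    row_max = np.full((M, 1), -np.inf, dtype=np.float32)
    row_sum = np.zeros((M, 1), dtype=np.float32)          # running softmax denominator

    for start in range(0, K2, k2_tile):
        k = K[start:start + k2_tile].astype(np.float32)   # re-read every iteration
        v = V[start:start + k2_tile].astype(np.float32)

        s = q @ k.T                                       # first GEMM (reduces over K1)
        new_max = np.maximum(row_max, s.max(axis=1, keepdims=True))
        p = np.exp(s - new_max)
        scale = np.exp(row_max - new_max)

        row_sum = scale * row_sum + p.sum(axis=1, keepdims=True)
        acc = scale * acc + p @ v                         # second GEMM (reduces over K2)
        row_max = new_max

    return acc / row_sum
```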
To implement the optimization above, we needed to modify:

- `hoisting.py` to also look for `Read` ops whose source is a `captured_var`, replacing the `captured_var` with its counterpart in the RootOp (found by querying the reduction's `implicit_capture`) and removing the `captured_var` from the `Reduction`; otherwise the `scf.for` will be indexing/loading from the wrong bindings. A rough sketch of this rewrite is shown after this list.
- `lit_tests/codegen.py` to test for the hoisted reads from global.
- `lit_tests/attention.py` since this change generates a new schedule.
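
Below is a rough, self-contained sketch of the kind of rewrite `hoisting.py` performs, written against a made-up toy IR: the `Node`/`Reduction` dataclasses and their fields are hypothetical stand-ins (only the `implicit_captures`/`captured_vars` naming mirrors the description above), so treat it as an illustration of the transformation, not the actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                       # e.g. "read", "placeholder"
    operands: list = field(default_factory=list)

@dataclass
class Reduction(Node):
    implicit_captures: list = field(default_factory=list)  # root-graph values captured by the loop
    captured_vars: list = field(default_factory=list)       # their proxies inside the loop body
    body: list = field(default_factory=list)                 # nodes lowered into the scf.for

def hoist_loop_invariant_reads(root_nodes, reduction):
    """Move Reads of captured (loop-invariant) values out of the loop body.

    The hoisted Read must be rebound to the captured var's counterpart in the
    root op; if it kept pointing at the loop-local captured_var, the emitted
    scf.for would index/load from the wrong binding.
    """
    for node in list(reduction.body):
        if node.op != "read":
            continue
        src = node.operands[0]
        if src not in reduction.captured_vars:
            continue                               # source varies across iterations; leave it in the loop

        idx = reduction.captured_vars.index(src)
        node.operands[0] = reduction.implicit_captures[idx]   # rebind to the root counterpart

        reduction.body.remove(node)                            # hoist out of the loop body ...
        root_nodes.insert(root_nodes.index(reduction), node)   # ... to just before the reduction

        # Drop the now-unused capture so the loop signature matches its uses.
        reduction.captured_vars.pop(idx)
        reduction.implicit_captures.pop(idx)

# Tiny usage example: a Read of Q captured into the loop gets pulled in front of it.
q_root = Node("placeholder")
q_captured = Node("captured_var")
read_q = Node("read", [q_captured])
loop = Reduction("reduction",
                 implicit_captures=[q_root],
                 captured_vars=[q_captured],
                 body=[read_q])
graph = [q_root, loop]

hoist_loop_invariant_reads(graph, loop)
assert read_q not in loop.body and read_q.operands[0] is q_root
```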