Adding resize(PadOp) vectorization analysis #3321

Open
wants to merge 95 commits into
base: main

Conversation

jjsjann123
Collaborator

@jjsjann123 jjsjann123 commented Oct 31, 2024

Adds conditional support for resize in vectorization analysis. This PR allows vectorized loads on PadOp directly, without going through a cache load, which improves the performance of the generated kernel.

What's in this PR:

  1. Add a propagation rule for resize in vectorization analysis. The rule works as follows:
    i. For a supported resize: a) project the resize op to the frontier and clear (frontier.begin(), resize_position); b) add the projected extent of the new resize op as gcd(id_from, resize_op->leftExpand(), resize_op->rightExpand()).
    ii. For an unsupported resize: clear [frontier.begin(), resize_position]; no behavior change.

  2. Update TensorView::cacheAfter so callers can opt in a set of uses to cache while leaving the other uses unchanged. This is necessary when an input is used by a PadOp as well as by other operations that rely on a cached load for vectorization.
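
As a rough illustration of the two items above, here is a minimal, self-contained sketch. It is not nvFuser code: `projectedResizeExtent`, `ToyOp`, and `ToyGraph` are hypothetical names invented for this example, and treating an empty use list as "cache every use" is an assumption of the toy model. The first function mirrors the gcd formula from item 1; the toy `cacheAfter` mirrors the opt-in behavior described in item 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Item 1(b), hypothetical helper: the projected extent is the gcd of the
// source extent and both expansion amounts. std::gcd (C++17) works on
// absolute values, and gcd(x, 0) == |x|, so a zero pad leaves the
// result unconstrained by that side.
inline int64_t projectedResizeExtent(
    int64_t id_from, int64_t left_expand, int64_t right_expand) {
  assert(id_from > 0);
  return std::gcd(id_from, std::gcd(left_expand, right_expand));
}

// Item 2, toy stand-ins for a fusion graph (not nvFuser types).
struct ToyOp {
  std::string name;   // name of the op, also used as its output name
  std::string input;  // name of the tensor this op reads from
};

struct ToyGraph {
  std::vector<ToyOp> ops;

  // Insert a cache copy of `tensor` and redirect only the ops listed in
  // `cached_uses` to read from it. Assumption of this sketch: an empty
  // list means every use is redirected.
  std::string cacheAfter(
      const std::string& tensor,
      const std::vector<std::string>& cached_uses = {}) {
    assert(!tensor.empty());
    const std::string cache = tensor + "_cache";
    for (ToyOp& op : ops) {
      if (op.input != tensor) {
        continue;
      }
      const bool selected = cached_uses.empty() ||
          std::find(cached_uses.begin(), cached_uses.end(), op.name) !=
              cached_uses.end();
      if (selected) {
        op.input = cache;
      }
    }
    // The cache op itself still reads from the original tensor.
    ops.push_back({cache, tensor});
    return cache;
  }
};
```

In this toy model, a graph where `pad` and `add` both read `T0` and only `add` is opted in ends up with `pad` still reading `T0` directly (so it can use the resize-aware vectorized load) while `add` reads `T0_cache`.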

Follow up to #3261.
Work toward supporting RoPE performance. Design doc:

jjsjann123 added a commit that referenced this pull request Nov 5, 2024
Added support for lowering TernaryOp:where with vectorization factor.

i.e.
```
predicate
  ? loadGlobalToLocal<...>(&dst[0], &src[i_src])
  : dst.set(0.0f) 
```

Currently this can only be done via manual scheduling. The follow-up PR
on vectorization analysis (#3321) will apply this automatically.
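
The predicated pattern in the commit message above can be modeled on the host as follows. This is only a sketch of the semantics, not the generated kernel code: `loadOrZero` is a hypothetical name, and `std::memcpy` stands in for the vectorized `loadGlobalToLocal` call. The point is that the whole vector of `N` elements is either loaded from the source or filled with zeros, selected by a single predicate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical host-side model of a vectorized `where`:
// when the predicate holds, copy N contiguous floats starting at
// src[i_src]; otherwise produce N zeros (the padded region).
template <std::size_t N>
std::array<float, N> loadOrZero(
    bool predicate, const float* src, std::size_t i_src) {
  std::array<float, N> dst;
  if (predicate) {
    assert(src != nullptr);
    // Stand-in for one vectorized load of N elements.
    std::memcpy(dst.data(), src + i_src, N * sizeof(float));
  } else {
    // Stand-in for dst.set(0.0f) in the pattern above.
    dst.fill(0.0f);
  }
  return dst;
}
```

Because one predicate guards the whole vector, this only matches the padded tensor when the pad boundary falls on a multiple of the vector width, which is the kind of constraint the vectorization analysis has to establish.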
Base automatically changed from jjsjann123/resize_vec to main November 5, 2024 16:51
@jjsjann123
Collaborator Author

!test

@jjsjann123 jjsjann123 changed the title resize(PadOp) vectorization factor analysis Adding resize(PadOp) vectorization analysis Nov 6, 2024
Co-authored-by: Naoya Maruyama <naoyam@users.noreply.github.com>
Collaborator

@naoyam naoyam left a comment

Overall it looks good. I'd just like the few things I commented on to be addressed.

@naoyam
Collaborator

naoyam commented Nov 8, 2024

!test --pybench

@naoyam
Collaborator

naoyam commented Nov 8, 2024

Initiated testing with python benchmarks just in case.

@jjsjann123
Collaborator Author

Thanks, I'll address the issues you brought up and also run some real-size problems so we get a sense of the perf impact. 🙇

@jjsjann123
Collaborator Author

!test --pybench

@jjsjann123
Collaborator Author

!test --pybench
