Performance difference between essentially the same einsum with dummy index interchanging #143
Comments
If you use snakeviz you can see that it is indeed a reshape within `tensordot` that's costly on the first one. |
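For reference, one way to capture such a profile is with `cProfile`, whose dump snakeviz can then visualize; the array shapes and the single contraction here are illustrative assumptions, not the original benchmark:

```python
import cProfile
import numpy as np

# Hypothetical inputs standing in for the arrays from the issue.
A = np.random.rand(60, 60, 60)
B = np.random.rand(60, 60, 60)

prof = cProfile.Profile()
prof.enable()
np.tensordot(A, B, axes=[(0, 1), (0, 1)])
prof.disable()

# Write the stats file; inspect it with: snakeviz tensordot.prof
prof.dump_stats("tensordot.prof")
```

In the snakeviz view, time spent in `reshape`/`transpose` frames under `tensordot` (rather than in the BLAS `dot` call) is the overhead being discussed.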
Just to be clear, one is obviously free to choose the same permutation to apply to both `left_pos` and `right_pos`. One canonical choice would be `left_pos, right_pos = zip(*sorted(zip(left_pos, right_pos)))`, which should make the two contractions above the same — but is it always the fastest ordering? |
We talk about this some in #103. We can optimize the indices for a single contraction assuming that it is in C order, but it may cause issues down the line; in this example it doesn't matter for binary contractions, however. Another way to solve this is a custom `tensor_blas` function. I don't recall why we apparently deprecated it in favor of pure … |
Hi there.
The relevant issue with this observation is google/TensorNetwork#650.
In simple code:
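The original snippet did not survive, but the situation being described can be sketched like this — two `tensordot` calls that are mathematically identical and differ only in the order in which the dummy axes are listed (shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100, 100))
B = rng.standard_normal((100, 100, 100))

# Same pairing of dummy axes, listed in two different orders.
C1 = np.tensordot(A, B, axes=[(0, 1), (0, 1)])
C2 = np.tensordot(A, B, axes=[(1, 0), (1, 0)])

# The results agree...
assert np.allclose(C1, C2)
# ...but one ordering may force tensordot to transpose/reshape its inputs
# before the underlying matrix multiplication, which can dominate runtime.
```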
This huge difference is possibly due to `np.tensordot`: the slower one may have invoked some transpose operations beyond the matrix multiplications.
Any thoughts on this? And is there a consistent `axes` argument ordering that makes `tensordot` fast?