fail to pass the torch.autograd.gradcheck #17
Comments
As mentioned in #13, in most practical cases involving sparse matrices, you would want the derivative with respect to the non-zero elements only. There is some further discussion here: I haven't checked in a while, but at the time pytorch had some issues with gradcheck and sparse matrix operations: Finally, if you want, we consolidated the pure python functions above here:
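For illustration, a minimal sketch of this "non-zero elements only" approach (not code from this repository; the pattern, values, and function name are invented): keep the sparsity pattern fixed and rebuild the sparse matrix from a dense vector of values inside the function, so gradcheck only ever differentiates strided tensors.

```python
import torch

# Fixed sparsity pattern; only the values below are treated as parameters.
indices = torch.tensor([[0, 1, 2, 2],
                        [0, 1, 0, 2]])
values = torch.tensor([2.0, 3.0, -1.0, 4.0], dtype=torch.float64, requires_grad=True)
b = torch.randn(3, dtype=torch.float64, requires_grad=True)

def solve_from_values(values, b):
    # Rebuild the sparse matrix from its non-zero values inside the function,
    # so the inputs seen by gradcheck are all dense (strided) tensors.
    A = torch.sparse_coo_tensor(indices, values, (3, 3))
    return torch.linalg.solve(A.to_dense(), b)

print(torch.autograd.gradcheck(solve_from_values, (values, b)))  # expected: True
```

This sidesteps the layout problem entirely: the gradient that comes back for `values` is exactly the derivative with respect to the non-zero entries.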
Thank you very much for your reply. So, based on the contents of the links, the version of … which is a total mess. I wonder whether there is a way to check the gradient of a sparse matrix. Well, thank you for your advice; I will try the program in this repository.
Original issue

Thank you very much for providing such a good tool!
My problem is that when the input A is a 'real' sparse matrix, not a sparse matrix converted from a dense one, the torch.autograd.gradcheck() function throws an exception. The python program I use is based on the one from "Differentiable sparse linear solver with cupy backend - 'unsupported tensor layout: Sparse' in gradcheck", whose author @tvercaut wrote based on your blog and program. I modified that program and limited it to running only on the CPU. The output is:
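Since the script and its output are elided above, here is a hedged reconstruction of the kind of setup being described (the class name and details are mine, not the original code): a custom autograd.Function that solves Ax = b and returns the blog-style sparse gradient for A, with a genuinely sparse A handed to gradcheck.

```python
import torch

class SparseSolve(torch.autograd.Function):
    @staticmethod
    def forward(ctx, A, b):
        # Densify only to keep the sketch self-contained; the real program
        # calls an actual sparse factorization here.
        x = torch.linalg.solve(A.to_dense(), b)
        ctx.save_for_backward(A, x)
        return x

    @staticmethod
    def backward(ctx, grad_x):
        A, x = ctx.saved_tensors
        grad_b = torch.linalg.solve(A.to_dense().t(), grad_x)
        # Blog-style gradient: -(dL/db) ⊗ x, restricted to A's sparsity pattern.
        grad_A = (-grad_b.outer(x)).sparse_mask(A.coalesce())
        return grad_A, grad_b

A = torch.eye(3, dtype=torch.float64).to_sparse().requires_grad_()
b = torch.randn(3, dtype=torch.float64, requires_grad=True)
# On the PyTorch versions discussed in this thread, the sparse layout of A
# makes this raise, e.g. "unsupported tensor layout: Sparse".
# (Recent releases expose gradcheck(..., masked=True) for sparse inputs.)
torch.autograd.gradcheck(SparseSolve.apply, (A, b))
```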
The success of your and @tvercaut's programs in passing the gradient check can be attributed to the fact that the sparse matrix A you used is actually a dense matrix. Consequently, the autograd() function computes the gradient for each element. The derivative formula from your blog is
$$\frac{\partial L}{\partial A} = - \frac{\partial L}{\partial b} \otimes x$$

Since the matrix A is sparse, $\frac{\partial L}{\partial A_{ij}}$ should be $0$ when $A_{ij} = 0$, but the results computed by pytorch show this is not true.
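This disagreement is easy to see in a tiny dense experiment (arbitrary numbers, for illustration only): autograd's dense gradient is non-zero exactly where the sparse view says it should vanish.

```python
import torch

A = torch.tensor([[2.0, 0.0],
                  [0.0, 3.0]], dtype=torch.float64, requires_grad=True)
b = torch.tensor([1.0, 1.0], dtype=torch.float64)

x = torch.linalg.solve(A, b)
x.sum().backward()
print(A.grad)
# The off-diagonal entries of A.grad are non-zero (here -1/6) even though
# A_ij = 0 there: dense autograd computes -(dL/db) ⊗ x over all entries.
```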
If I change the backward() function so that it returns a dense gradient, then the gradient check passes. However, gradA is now a dense matrix, which is not consistent with the theoretical result. There is a similar issue, #13, without a detailed explanation. So I want to ask: which gradient is right, the sparse one or the dense one?
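For concreteness, the modification described above might look like the following sketch (a guess, since the changed code is elided): the backward returns the unmasked outer product, and with a dense, strided A the check passes because finite differences perturb every entry of A, including the zeros.

```python
import torch

class SparseSolveDenseGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, A, b):
        x = torch.linalg.solve(A, b)
        ctx.save_for_backward(A, x)
        return x

    @staticmethod
    def backward(ctx, grad_x):
        A, x = ctx.saved_tensors
        grad_b = torch.linalg.solve(A.t(), grad_x)
        grad_A = -grad_b.outer(x)  # dense everywhere, not just on A's pattern
        return grad_A, grad_b

A = torch.tensor([[2.0, 0.0],
                  [0.0, 3.0]], dtype=torch.float64, requires_grad=True)
b = torch.randn(2, dtype=torch.float64, requires_grad=True)
print(torch.autograd.gradcheck(SparseSolveDenseGrad.apply, (A, b)))  # expected: True
```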