-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pseudo-inverse of symmetric matrices using SVD / Utility for moving least squares #950
Conversation
symmetricPseudoInverseSVDSerialKernel(InOutMatrix &io, SMatrix &s, UMatrix &u) | ||
{ | ||
using value_t = typename InOutMatrix::non_const_value_type; | ||
std::size_t const size = io.extent(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is io.extent(1) == size
a precondition?
Presumably s
and u
must also be properly sized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and yes. io
, s
and u
must have the same size and this should be guaranteed by the host function. But due to the fact I cannot use assert
in kernels, I skipped checking it. (although, speaking of it, I could use KOKKOS_ASSERT
)
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
test/CMakeLists.txt
Outdated
@@ -235,6 +235,12 @@ target_link_libraries(ArborX_Test_BoostAdapters.exe PRIVATE ArborX Boost::unit_t | |||
target_compile_definitions(ArborX_Test_BoostAdapters.exe PRIVATE BOOST_TEST_DYN_LINK) | |||
add_test(NAME ArborX_Test_BoostAdapters COMMAND ArborX_Test_BoostAdapters.exe) | |||
|
|||
add_executable(ArborX_Test_InterpDetailsSymmPInvSVD.exe tstInterpDetailsSymmPInvSVD.cpp utf_main.cpp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably change all the relevant names ArborX_Test_DetailsInterpolationSVD
. This would emphasize that it is a Details
test, as it starts with ArborX_Test_Details
. We don't need to emphasize what exact SVD is done, so may omit that. And imho, Interpolation
is better than Interp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will stick with InterpDetailsSVD
because it mimics the location and name of the tested file (in src/interpolation/details
). But maybe would it be better to have ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
elsewhere?
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
value_t const a = sigma(p, p); | ||
value_t const b = sigma(p, q); | ||
value_t const c = sigma(q, q); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value_t const a = sigma(p, p); | |
value_t const b = sigma(p, q); | |
value_t const c = sigma(q, q); | |
auto const a = sigma(p, p); | |
auto const b = sigma(p, q); | |
auto const c = sigma(q, q); |
// | b | c | | 0 | y | | ||
// +---+---+ +---+---+ | ||
|
||
value_t cos, sin, x, y; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a bad idea to name variables cos
and sin
. If one puts using namespace std;
somewhere above, this will break.
if (a == c) | ||
{ | ||
cos = Kokkos::sqrt(value_t(2)) / 2; | ||
sin = Kokkos::sqrt(value_t(2)) / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno if the compiler is smart enough to not do this computation twice.
sin = Kokkos::sqrt(value_t(2)) / 2; | |
sin = cos; |
value_t const u = (2 * b) / (a - c); | ||
value_t const v = 1 / Kokkos::sqrt(u * u + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value_t const u = (2 * b) / (a - c); | |
value_t const v = 1 / Kokkos::sqrt(u * u + 1); | |
auto const u = (2 * b) / (a - c); | |
auto const v = 1 / Kokkos::sqrt(u * u + 1); |
|
||
template <typename ExecutionSpace, typename InvMatrices> | ||
void symmetricPseudoInverseSVD(ExecutionSpace const &space, | ||
InvMatrices &inv_matrices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name InvMatrices
is misleading, given that it is both input and output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed the type InOutMatrices
, as the inverse is already implied by the function name.
src/interpolation/details/ArborX_InterpDetailsSymmetricPseudoInverseSVD.hpp
Outdated
Show resolved
Hide resolved
{ | ||
isSquareMatrix(mat); | ||
using value_t = typename Matrix::non_const_value_type; | ||
int const size = mat.extent(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this closer to the place you first use it.
struct | ||
{ | ||
value_t max = 0; | ||
int row = 0; | ||
int col = 0; | ||
} result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even know you could return an object from a local unnamed class.
Sure why not given that these are implementation details.
0618fcf
to
3d105a1
Compare
b8c9556
to
0c5bb0d
Compare
int i = 0; | ||
for (; i < p; i++) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int i = 0; | |
for (; i < p; i++) | |
{ | |
for (int i = 0; i < p; i++) | |
{ |
i++; | ||
for (; i < q; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i++; | |
for (; i < q; i++) | |
for (int i = p + 1; i < q; i++) |
i++; | ||
for (; i < size; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i++; | |
for (; i < size; i++) | |
for (int i = q + 1; i < size; i++) |
for (int i = 0; i < size; i++) | ||
{ | ||
auto const u_ip = U(i, p); | ||
auto const u_iq = U(i, q); | ||
U(i, p) = cos_theta * u_ip + sin_theta * u_iq; | ||
U(i, q) = -sin_theta * u_ip + cos_theta * u_iq; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use hierarchical parallelism for these for-loops or do we assume size
to be small enough that it doesn't matter? Or do we have sufficient parallelism already in the outer loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make a lot of sense to use hierarchical parallelism here (and in other loops) as well as using scratch pads for the auxiliary matrices. However, because this is part of the MLS algorithm, it might not be a bottleneck and it is best to avoid early optimization.
However, the following should be a logical distribution:
- Each team treats a single matrix
- Threads/Vectors handle the loops and reductions
- Use the team scratch pad for
ES
andU
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
// +---+---+ +---+---+ | ||
|
||
value_t cos_theta, sin_theta, x, y; | ||
if (a == c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need to have a tolerance here? What happens if a = 1
and c = 1+1e-15
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A tolerance is not really needed as the algorithm behaves correctly even if the values are very close together. This case is here to avoid a "true" division by zero.
symmetricPseudoInverseSVDSerialKernel(A, ES, U); | ||
}); | ||
} | ||
else if constexpr (InOutMatrices::rank == 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a case where we care about using SVD for a single matrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, at least for now. This was added because the interface of the kernel itself takes only a single matrix. However, I will probably remove it as this might be the source of why cuda+nvcc 11.0.3
fails.
} | ||
|
||
if (!same_dim_size) | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that you don't need this. BOOST_TEST_REQUIRE
does what you want (to check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, as this will abort the current test and not check the other matrices of the same test.
test/tstInterpDetailsSVD.cpp
Outdated
} | ||
makeCase<ExecutionSpace, view_t, double[128][128]>(space, 0, mat, inv, 128); | ||
|
||
// Case for invertible 128x128 matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you plan to add a test in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I do find an way to easily build a non-trivial 128x128 matrix and its inverse. Since I added this comment I did not found one so I will remove that command. This test could be extended if I ever find said matrix.
Windows CI is reporting
and all GPU builds seem to be stuck in |
for (int i = 0; i < size; i++) | ||
max_eigen = Kokkos::max(Kokkos::abs(ES(i, i)), max_eigen); | ||
|
||
// We inverse the diagonal of ES, except if "0" is found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// We inverse the diagonal of ES, except if "0" is found | |
// We invert the diagonal of ES, except if "0" is found |
typename ESMatrix::value_type> && | ||
std::is_same_v<typename ESMatrix::value_type, | ||
typename UMatrix::value_type>, | ||
"Each input matrix must have the same value type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Each input matrix must have the same value type"); | |
"All input matrices must have the same value type"); |
// Pseudo-inverse of symmetric matrices using SVD | ||
// We must find U, E (diagonal and positive) and V such that A = U.E.V^T | ||
// We also suppose, as the input, that A is symmetric, so U = SV where S is | ||
// a sign matrix (only 1 or -1 in the diagonal, 0 elsewhere). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// a sign matrix (only 1 or -1 in the diagonal, 0 elsewhere). | |
// a sign matrix (only 1 or -1 on the diagonal, 0 elsewhere). |
} | ||
|
||
static constexpr value_t epsilon = Kokkos::Experimental::epsilon_v<float>; | ||
while (true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the CI is stuck in this loop?
// | ES(q, p) | ES(q, q) | | b | c | | ||
// +----------+----------+ +---+---+ | ||
|
||
// Lets compute x, y and theta such that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Lets compute x, y and theta such that | |
// Let's compute x, y and theta such that |
y = a + c - x; | ||
} | ||
|
||
// Now lets compute the following new values for U and ES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Now lets compute the following new values for U and ES | |
// Now let's compute the following new values for U and ES |
for (int i = 0; i < size; i++) | ||
{ | ||
auto const u_ip = U(i, p); | ||
auto const u_iq = U(i, q); | ||
U(i, p) = cos_theta * u_ip + sin_theta * u_iq; | ||
U(i, q) = -sin_theta * u_ip + cos_theta * u_iq; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
} | ||
} | ||
|
||
// We compute the max to get a range of the invertible eigen values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// We compute the max to get a range of the invertible eigen values | |
// We compute the max to get a range of the invertible eigenvalues |
I don't know why Windows cannot build this. The logs show no error. I am wondering if the headers must be referenced somewhere else.
It seems more likely that, in job 12, only |
auto const val = Kokkos::abs(mat(i, j) - mat(j, i)); | ||
auto const ref = Kokkos::abs(mat(i, j)); | ||
static constexpr value_t epsilon = | ||
Kokkos::Experimental::epsilon_v<float>; | ||
if (ref == value_t(0) && val > epsilon) | ||
return false; | ||
if (ref != value_t(0) && val / ref > epsilon) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why the check is using tolerances. I think the input matrix should always be symmetric without tolerances, if constructed correctly. So I would just do
auto const val = Kokkos::abs(mat(i, j) - mat(j, i)); | |
auto const ref = Kokkos::abs(mat(i, j)); | |
static constexpr value_t epsilon = | |
Kokkos::Experimental::epsilon_v<float>; | |
if (ref == value_t(0) && val > epsilon) | |
return false; | |
if (ref != value_t(0) && val / ref > epsilon) | |
return false; | |
if (mat(i, j) != mat(j, i)) | |
return false; |
If that's not the case, I would rename the function to include Tol
in the name, and provide an epsilon
function argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change to the tolerance-free version, as it forces users to have their matrices properly symmetric.
test/tstInterpDetailsSVD.cpp
Outdated
#include "BoostTest_CUDA_clang_workarounds.hpp" | ||
#include <boost/test/unit_test.hpp> | ||
|
||
namespace axid = ArborX::Interpolation::Details; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does axid
stand for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It stands for ArborX Interpolation Details
. It is simply a namespace shortcut. But because it is not used in a lot of places, I will use the full namespace name.
test/tstInterpDetailsSVD.cpp
Outdated
host_view src("src", m, n, n); | ||
host_view ref("ref", m, n, n); | ||
U inv("inv", m, n, n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
host_view src("src", m, n, n); | |
host_view ref("ref", m, n, n); | |
U inv("inv", m, n, n); | |
host_view src("Testing::src", m, n, n); | |
host_view ref("Testing::ref", m, n, n); | |
U inv("Testing::inv", m, n, n); |
I'm confused about what's going on with the testing. |
6bde40a
to
61fff17
Compare
25f374f
to
8098fc0
Compare
The windows CI fails because of the location of the added header (two folder deep from |
There are still some warnings to fix, see https://cloud.cees.ornl.gov/jenkins-ci/job/ArborX/job/PR-950/19/gcc/. |
We need to deal with the failing |
I have removed the self-containment test from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good enough.
In an attempt to break-up #946, this PR focuses only on the pseudo inverse used in the moving least squares algorithm. It includes the pseudo inverse itself in
src/interpolation
folder, tests andan iterator for multi-dimensional views (for use in boost tests)a macro to test equality on md views