Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename oneMKL Interface to oneMath #602

Open
wants to merge 42 commits into
base: develop
Choose a base branch
from

Conversation

Rbiessy
Copy link
Contributor

@Rbiessy Rbiessy commented Oct 21, 2024

Description

Related RFC: #564 and related spec PR: uxlfoundation/oneAPI-spec#596
As a reminder the plan is to have this PR and the spec PR approved before we move this repository to https://github.com/uxlfoundation and merge the 2 PRs.

  • The PR has the following implications on the MKLCPU and MKLGPU backends:
    • Had to introduce src/include/common_onemkl_conversion.hpp to convert oneMath common types to oneMKL types. This wasn't needed before as the backends relied on the 2 projects to use the same namespace. The file also provide helper macros to catch oneMKL exceptions and rethrow them as oneMath exceptions.
    • The DFT MKLGPU backend used to catch and rethrow some but not all oneapi::mkl exceptions when committing a descriptor. We discussed this issue by email recently. With the changes needed by the renaming, we now catch and rethrow all the exceptions as oneapi::math exceptions and I was able to remove a related workaround in the tests (FYI @oneapi-src/onemkl-dft-write). Also see related discussion below: Rename oneMKL Interface to oneMath #602 (comment)
    • The RNG MKLGPU backend relied heavily on the 2 libraries using the same namespace. I introduced onemkl_distribution_conversion.hpp to convert the RNG types and call the backend's free functions such as oneapi::mkl::rng::generate, oneapi::mkl::rng::skip_ahead. @oneapi-src/onemkl-rng-write you may want to carefully review src/rng/backends/mklgpu/ as I am not familiar with the RNG domain.
    • The renaming made it clear that some header files were duplicating function declarations for no reason in the BLAS and LAPACK domains. They are removed in baeb476.
  • The PR also remove the domain documentation for BLAS and LAPACK which duplicates the specification (commit b382291). We discussed in the issue [BLAS] Wrong namespace usage in BLAS Documentation #489 that the duplicated documentation was an issue and some of us are keen to remove it. Now is a good time as it becomes even more outdated with the name change. (@oneapi-src/onemkl-blas-write and @oneapi-src/onemkl-lapack-write FYI)
  • The changes assume that the teams @oneapi-src/onemkl-* will be renamed (or duplicated) to @uxlfoundation/onemath-* in the new repository (@rscohn2 FYI)
  • The changes assume that the onemkl Slack channel will be renamed to onemath (@rodburns FYI)
  • Removed USE_MKLREF which was only used in Lapack tests and was not documented. This macro wouldn't work easily with the namespace change anymore (@oneapi-src/onemkl-lapack-write FYI)
  • For reference the remaining occurrences of MKL are:
    • Intel backend names MKLCPU and MKLGPU.
    • Usages of oneapi::mkl inside the Intel backends.
    • File cmake/mkl/MKLConfig.cmake (could probably be removed IMO).
    • onemkl.maintainers@intel.com in CODE_OF_CONDUCT.md and CONTRIBUTING.md. Let me know if there is any plan to create onemath.maintainers@intel.com. If not this can be updated separately.
    • Folder include/oneapi/mkl for deprecated headers and deprecated oneapi::mkl C++ namespace.
    • Deprecated ONEMKL CMake namespace.
    • Some files in the MKLCPU and MKLGPU backends use "mkl" in their name. They could be updated to onemkl if needed.

Checklist

All Submissions

@Rbiessy Rbiessy requested review from a team as code owners October 21, 2024 16:46
Copy link
Contributor

@rafbiels rafbiels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an immense amount of work, well done! I only found a few missed occurrences.

docs/building_the_project_with_dpcpp.rst Outdated Show resolved Hide resolved
docs/README.md Outdated Show resolved Hide resolved
docs/building_the_project_with_adaptivecpp.rst Outdated Show resolved Hide resolved
examples/README.md Outdated Show resolved Hide resolved
@Rbiessy
Copy link
Contributor Author

Rbiessy commented Oct 23, 2024

I merged the PR with the develop branch and the clang-format-19.10 changes which reduces the diff.
I have also updated the PR description with more details on the implications to make it work with the MKLCPU and MKLGPU backends (first bullet point).

Copy link
Contributor

@anantsrivastava30 anantsrivastava30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are a lot of changes for one PR!. Mostly they look fine with what was discussed in the RFC and spce change. However I have a few question in my review.

src/dft/backends/mklgpu/commit.cpp Show resolved Hide resolved
include/oneapi/math/detail/backends_table.hpp Show resolved Hide resolved
@Rbiessy
Copy link
Contributor Author

Rbiessy commented Oct 30, 2024

The conflicts are mostly due to the recent PR for cuSPARSE support. I will work on fixing them but the other domains can still be reviewed.

@andreyfe1
Copy link
Contributor

Thanks for doing this huge work!
I have a general question from my side. Have you used git mv for renaming/duplicating files? This command preserves the file’s history. So, we will be able to see the history of commits for old files.

@Rbiessy
Copy link
Contributor Author

Rbiessy commented Nov 4, 2024

Thanks for doing this huge work! I have a general question from my side. Have you used git mv for renaming/duplicating files? This command preserves the file’s history. So, we will be able to see the history of commits for old files.

Yes I have used git mv in the second commit: 68d3966
In this commit GitHub shows the movement unfortunately when the changes from the other commits are shown GitHub decides it's best to show some files have been deleted and re-created.
To work around this you could try to compare the diff between the commit after the move and the latest commit if that's easier. You can see it here: https://github.com/Rbiessy/oneMKL/compare/600cd44...2a8a76b?diff=split&w=

@andreyfe1
Copy link
Contributor

ok. Thanks. Just wanted to make sure that we don't accidentally miss important information

@Rbiessy
Copy link
Contributor Author

Rbiessy commented Nov 5, 2024

@oneapi-src/onemkl-sparse-write I have updated the PR to fix the conflicts with the cuSPARSE PR. It would be useful if you can review the sparse related changes in this PR before the rocSPARSE PR #544!
Log for Intel backends: intel_mklcpu_mklgpu.txt
Log for Nvidia backends: log_a100.txt

@Rbiessy
Copy link
Contributor Author

Rbiessy commented Nov 7, 2024

I fixed some more recent conflicts.
@oneapi-src/onemkl-blas-write, @oneapi-src/onemkl-lapack-write in the last commit (baeb476) I have removed a couple of header files in blas and lapack which were duplicating function declarations for the MKLCPU and MKLGPU backends. Given that we can simply include the Intel oneMKL headers instead they can be removed here. This was discussed in #606 (comment)
Log testing the MKLCPU and MKLGPU backends with the 2025.0 base toolkit: intel_mklcpu_mklgpu.txt

Copy link
Contributor

@andrewtbarker andrewtbarker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLAS changes look good to me.

docs/building_the_project_with_dpcpp.rst Outdated Show resolved Hide resolved
@sknepper
Copy link
Contributor

sknepper commented Nov 8, 2024

/intelci: run

Copy link
Contributor

@anantsrivastava30 anantsrivastava30 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving on behalf of DFT

@toxicscum
Copy link

/intelci: run

@sknepper
Copy link
Contributor

Thanks for all your work here - there are so many changes, so I know you've spent a lot of effort on this!
It looks like there are some problems building on Windows. For example, I'm seeing an error for the BLAS domain like:

[1/155] C:\cache\stash\oneapi-compiler\20240719_rls\win\package\compiler\latest\bin\icx.exe -fsycl /nologo -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src\include -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\build\bin -IC:\cache\stash\onemkl\regular\2024.2.1\20240722\build__release_win\include /EHsc /DWIN32 /D_WINDOWS /W3 /GR /EHsc -Wno-unused-function -w /O2 /Ob2 /DNDEBUG -iquote C:/temp/ec/mkltest/auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d/sources/include -DMKL_ILP64 -fsycl -QMD -QMT bin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj -QMF bin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj.d /Fobin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj -c C:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src\blas\backends\mklcpu\mklcpu_level2.cpp
FAILED: bin/blas/backends/mklcpu/CMakeFiles/onemath_blas_mklcpu_obj.dir/mklcpu_level2.cpp.obj
C:\cache\stash\oneapi-compiler\20240719_rls\win\package\compiler\latest\bin\icx.exe -fsycl /nologo -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src\include -IC:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\build\bin -IC:\cache\stash\onemkl\regular\2024.2.1\20240722\build__release_win\include /EHsc /DWIN32 /D_WINDOWS /W3 /GR /EHsc -Wno-unused-function -w /O2 /Ob2 /DNDEBUG -iquote C:/temp/ec/mkltest/auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d/sources/include -DMKL_ILP64 -fsycl -QMD -QMT bin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj -QMF bin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj.d /Fobin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj -c C:\temp\ec\mkltest\auto-efa034fe-1902-f165-9fcf-a4bf010d0e2d\sources\src\blas\backends\mklcpu\mklcpu_level2.cpp
icx: error: cannot specify '-Fobin\blas\backends\mklcpu\CMakeFiles\onemath_blas_mklcpu_obj.dir\mklcpu_level2.cpp.obj' when compiling multiple source files

And somewhat similar for LAPACK (and RNG); e.g.,
icx: error: cannot specify '-Fobin\lapack\backends\mklcpu\CMakeFiles\onemath_lapack_mklcpu_obj.dir\mkl_lapack.cpp.obj' when compiling multiple source files

@Rbiessy
Copy link
Contributor Author

Rbiessy commented Nov 13, 2024

Thanks @sknepper I have been able to identify that the issue is due to using -iquote flag here which is not supported on Windows.
This is needed because we now need to include the Intel oneMKL mkl.hpp file to have a declaration of the oneapi::mkl symbols. Including this file conflicts with the mkl.hpp in this project which does not provide the right symbols anymore.
I am looking for another solution other than using -iquote but I am not confident I can find a nice solution. This is a difficult problem to solve and the main reason why I think different projects should not provide the same header files.

In terms of review it is still possible to review most of the changes. I expect the changes needed will only affect CMake files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants