Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage #1937

Closed

Conversation

SergeyKopienko
Contributor

@SergeyKopienko SergeyKopienko commented Nov 18, 2024

One more approach for #1933

Unfortunately, this approach doesn't give us a performance gain compared with #1933.

…re-implement __find_start_point function

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rename template params in __parallel_merge_submitter

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…implementation of __parallel_merge_submitter_large

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…using __parallel_merge_submitter_large in the __parallel_merge

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…removed redundant comment

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…small data types should be acceptable too

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…define __base_diagonals_sp_global_ptr outside of parallel_for

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…calculate and use cached data-size for work-group

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rename some local variables

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…h - debug code

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix review comment: let's use __parallel_merge_submitter with std::uint32_t data type only

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…load source data into SLM by all available work-items in the group

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…remove debug code

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rename some variables

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…removed redundant comment

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…removed redundant assert

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix unused variable

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rename some variables

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…declare load_data_into_slm as inline

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…removed redundant assert

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…additional comments for load_data_into_slm

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rename some local variables and params

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…rewrite the data loading into SLM cache #1

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…h - always use two separate SLM cache

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…use large submitter after 16M items

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…h - using __parallel_merge_submitter_large for all data sizes

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…avoid barrier if we have more than one work-item in each work-group

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…avoid any action in the __parallel_merge_submitter_large::operator() if we don't have any data to process

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…remove inline on load_data_into_slm_impl

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 1fbe771 to 253ca8d Compare November 20, 2024 14:48
…h - debug code under DUMP_DATA_LOADING

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix an error in data loading

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix chunk size on GPU

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix calculation of available SLM memory amount

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…l_merge.h - debug code under DUMP_DATA_LOADING"

This reverts commit 952871e.
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from f604a72 to 39b68e4 Compare November 20, 2024 15:35
…another approach to calculate the amount of work-groups and work-items

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch 2 times, most recently from 0aa0ca3 to 56060d0 Compare November 21, 2024 08:59
…do not use SLM bank size

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 56060d0 to b04b25e Compare November 21, 2024 09:17
…use std::size_t instead of _IdType

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
….h - fix compile errors

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…fix compile errors

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
…using oneapi::dpl::__internal::__value_t to detect range's value types

Signed-off-by: Sergey Kopienko <sergey.kopienko@intel.com>
@SergeyKopienko SergeyKopienko marked this pull request as ready for review November 28, 2024 17:15
@SergeyKopienko SergeyKopienko added this to the 2022.8.0 milestone Nov 29, 2024
@SergeyKopienko SergeyKopienko removed the request for review from MikeDvorskiy November 29, 2024 08:17
@SergeyKopienko SergeyKopienko removed this from the 2022.8.0 milestone Nov 29, 2024
@SergeyKopienko SergeyKopienko marked this pull request as draft November 29, 2024 08:17
@SergeyKopienko SergeyKopienko changed the title Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage Optimize merge algorithm for data sizes equal to or greater than 4M items Dec 16, 2024
@SergeyKopienko SergeyKopienko changed the title Optimize merge algorithm for data sizes equal to or greater than 4M items Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage Dec 16, 2024
@SergeyKopienko SergeyKopienko deleted the dev/skopienko/optimize_merge_to_main_V21_final branch December 23, 2024 08:19
3 participants