
Limit buffer sizes to leave some memory for the platform #1172

Merged: 1 commit, Aug 6, 2024

Conversation

@ouakheli (Contributor):

Some conformance tests directly use the size reported by the runtime as the maximum memory size when allocating buffers. This doesn't leave enough memory for the system to run the tests.
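
For context, a minimal sketch of the pattern described above, assuming a plain OpenCL host program; the function name and error handling are illustrative, not taken from the CTS:

    #include <CL/cl.h>

    /* Query the device's reported maximum allocation size and try to
       allocate exactly that much, which is the pattern this PR changes. */
    cl_int allocate_reported_max(cl_device_id device, cl_context context)
    {
        cl_ulong maxAllocSize = 0;
        cl_int error = clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                                       sizeof(maxAllocSize), &maxAllocSize,
                                       NULL);
        if (error != CL_SUCCESS) return error;

        /* If the runtime reports essentially all memory in the system,
           this call can succeed while leaving the platform too little
           memory to run the rest of the test. */
        cl_mem buf = clCreateBuffer(context, CL_MEM_READ_ONLY,
                                    (size_t)maxAllocSize, NULL, &error);
        if (buf != NULL) clReleaseMemObject(buf);
        return error;
    }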

@jlewis-austin (Contributor) left a comment:

This looks like a pretty fundamental change to code that other implementations pass without a problem.
I'd suggest starting with a detailed analysis of an individual test to show why the current value is unreasonable. Most tests already reduce the allocation size, so the question should be whether those reductions are correct, rather than arbitrarily cutting everything in half.

Review thread on test_conformance/api/test_api_min_max.cpp (resolved).
@ouakheli force-pushed the max_size_change branch 8 times, most recently from 743a633 to 4859c47 on May 24, 2021.
@ouakheli (Contributor, Author):

@jlewis-austin It really depends on what the implementation returns as the max size. If the implementation returns all of the memory available in the system, creating a buffer of that size will succeed, but the system won't have enough memory left.

@jlewis-austin (Contributor) left a comment:

There are some good fixes in here, but it also introduces functional changes that could impact existing implementations. I think we need to address the fundamental question first: should an implementation report a buffer size that can't be created due to platform memory usage? I tend to think that the allocation tests should be able to pass as-is without adjusting the reported numbers, but I welcome other opinions.


    currentSize = maxAllocSize;
    while (currentSize >= maxAllocSize / PASSING_FRACTION)
    {
        log_info("Trying to create a buffer of size of %lld bytes (%gMB).\n",
                 maxAllocSize, (double)maxAllocSize / (1024.0 * 1024.0));
        memHdl = clCreateBuffer(context, CL_MEM_READ_ONLY,
                                (size_t)maxAllocSize, NULL, &error);
        if (error == CL_MEM_OBJECT_ALLOCATION_FAILURE
            || error == CL_OUT_OF_RESOURCES
            || error == CL_OUT_OF_HOST_MEMORY)
        {
@jlewis-austin (Contributor) commented:

The original version doesn't make much sense, but I'm guessing the intent was to stop retrying at maxAllocSize/4. In that case, shouldn't currentSize be used in clCreateBuffer (and decremented) instead of maxAllocSize?

@ouakheli (Contributor, Author) replied:

Changed to currentSize.

    -    while (maxAllocSize > (maxAllocSize / 4)) {
    +    currentSize = maxAllocSize;
    +    while (currentSize >= maxAllocSize / PASSING_FRACTION)
@jlewis-austin (Contributor) commented:

It took me a while to figure out the name "PASSING_FRACTION", but I'm having trouble thinking of anything better. Maybe something like MAX_REDUCTION_FACTOR?

@ouakheli (Contributor, Author) replied:

Changed to MAX_REDUCTION_FACTOR.
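
Putting both review suggestions together, a sketch of the fixed retry loop might look like the following fragment. Only the identifiers come from the diffs above; the decrement step and comments are illustrative:

    #define MAX_REDUCTION_FACTOR 4

    cl_ulong currentSize = maxAllocSize;
    while (currentSize >= maxAllocSize / MAX_REDUCTION_FACTOR)
    {
        log_info("Trying to create a buffer of %llu bytes (%gMB).\n",
                 (unsigned long long)currentSize,
                 (double)currentSize / (1024.0 * 1024.0));
        memHdl = clCreateBuffer(context, CL_MEM_READ_ONLY,
                                (size_t)currentSize, NULL, &error);
        if (error == CL_MEM_OBJECT_ALLOCATION_FAILURE
            || error == CL_OUT_OF_RESOURCES
            || error == CL_OUT_OF_HOST_MEMORY)
        {
            /* Back off and retry with a smaller request; the loop bound
               stops us once we drop below maxAllocSize / 4. */
            currentSize -= currentSize / 8;
            continue;
        }
        break; /* success, or a hard error for the caller to report */
    }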

Comment on lines 514 to 515:

    if ((0 == gIsEmbedded && maxAllocSize < 128 * 1024 * 1024)
        || maxAllocSize < 1 * 1024 * 1024)
@jlewis-austin (Contributor) commented:
These checks can introduce new failures, since maxAllocSize has already been reduced. There should be a clear distinction between the reported max alloc size and any modifications we make to it. I still have the same objection about reducing the allocation size in two different places. Applying this factor up front reduces coverage for implementations that currently pass the test with their reported size; after this change they would be testing only half the size they're capable of.

@ouakheli (Contributor, Author) replied:

This value is 1M for embedded profiles and 128M otherwise.
I changed the check to use requiredAllocSize for consistency, but I don't see an implementation reporting close to 2M or 256M, which are the only cases where the reduction would make it fail.
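
A sketch of the separation being asked for, keeping the device's reported value distinct from the size the test allocates; requiredAllocSize follows the author's naming, the rest is illustrative:

    /* Minimums from the check above: 1 MiB on the embedded profile,
       128 MiB otherwise. */
    cl_ulong requiredAllocSize =
        gIsEmbedded ? 1 * 1024 * 1024 : 128 * 1024 * 1024;

    /* Validate the device's *reported* value against the minimum... */
    if (maxAllocSize < requiredAllocSize)
    {
        log_error("Reported max alloc size is below the required minimum.\n");
        return -1;
    }

    /* ...and derive the reduced size the test actually allocates
       separately, so the validation is unaffected by the reduction. */
    cl_ulong testAllocSize = maxAllocSize / 2;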

    if (clGetDeviceInfo(device, info, sizeof(max_size), &max_size, NULL)
        != CL_SUCCESS)
    {
        throw std::runtime_error("clGetDeviceInfo failed\n");
@jlewis-austin (Contributor) commented:

Is this caught somewhere? It looks like a change in functionality, since the existing test_error_abort macro just returns an error from the caller and allows further testing to continue.

@ouakheli (Contributor, Author) replied:

I tried to harmonize this function with the get_max_param_size function.
Does it really make sense to continue testing if clGetDeviceInfo returns an error?
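
For illustration, a sketch of the harmonized helper as described, where a query failure aborts by throwing rather than returning an error for the caller to propagate; the function name is hypothetical:

    #include <stdexcept>

    static cl_ulong get_max_alloc_size(cl_device_id device, cl_device_info info)
    {
        cl_ulong max_size = 0;
        if (clGetDeviceInfo(device, info, sizeof(max_size), &max_size, NULL)
            != CL_SUCCESS)
        {
            /* Unlike the test_error_abort macro, which returns an error and
               lets the caller decide, throwing stops the test outright. */
            throw std::runtime_error("clGetDeviceInfo failed\n");
        }
        return max_size;
    }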

@ouakheli (Contributor, Author) commented on Jun 10, 2021:

> There are some good fixes in here, but it also introduces functional changes that could impact existing implementations. I think we need to address the fundamental question first: should an implementation report a buffer size that can't be created due to platform memory usage? I tend to think that the allocation tests should be able to pass as-is without adjusting the reported numbers, but I welcome other opinions.

Agreed, we need more opinions on that before proceeding with the other comments, as this is fundamental to this change.
I don't think any implementation should return anything other than the maximum memory available for buffer creation.
Only the application knows how much memory the platform needs and what operations will be performed on the platform with the buffer.
The implementation can't know what the platform needs, and that varies from platform to platform, so if it reported less than the whole memory it would just be applying an arbitrary factor that happens to work on most platforms.

What about passing a parameter to the test, as we do with the reduction% in test_allocation?

@jlewis-austin (Contributor):

I should clarify that most of the reductions look fine, for the tests that aren't specifically testing max alloc size functionality. The main one I'm concerned about is test_api_min_max, where there's no "app" to speak of beyond spinning up the platform and doing a single allocation.

That being said, I think you're right about not estimating platform usage, and I got two trains of thought mixed up in the problem statement. The real question should have been whether the CTS lower limits on retries (i.e., your PASSING_FRACTION) serve as a "reasonable" minimum that should succeed, or whether the tests should remove all limits and simply keep retrying down to size 1. On the max allocation tests, I think it makes sense in either case to always attempt the max size, to verify that it either fails gracefully or succeeds.

@ahesham-arm force-pushed the max_size_change branch 5 times, most recently from d4b02e8 to a6bf03c on June 17, 2024.
@bashbaug (Contributor):

Discussed in the June 18th teleconference. This is an older PR that has been rebased and updated and is now ready for review.

@bashbaug (Contributor) left a comment:

Changes look good to me technically. I do wonder whether we're losing too much coverage reducing allocation size by 50% in many cases, but we can try this and see.

    @@ -24,8 +24,10 @@

     typedef long long unsigned llu;

    +#define REDUCTION_PERCENTAGE_DEFAULT 50
@bashbaug (Contributor) commented:

I'm not necessarily opposed to this, but it does change the default behavior of the test pretty significantly, since a "reduction percentage" less than 100% also affects the number of work-items executed.

Should we just change the number of work-items executed unconditionally? I'd feel more comfortable if this were just changing the max allocation percentage.
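
A sketch of the coupling being pointed out; apart from REDUCTION_PERCENTAGE_DEFAULT, the names and the work-item derivation are illustrative:

    #define REDUCTION_PERCENTAGE_DEFAULT 50

    /* One knob drives both quantities: shrinking the allocation also
       shrinks the number of work-items launched over it. */
    cl_ulong effectiveSize = maxAllocSize * REDUCTION_PERCENTAGE_DEFAULT / 100;
    size_t numWorkItems = (size_t)(effectiveSize / sizeof(cl_uint));

    /* The suggested alternative: reduce the work-item count
       unconditionally and keep testing allocation at the reported size. */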

@bashbaug (Contributor) commented on Aug 6, 2024:

Merging as discussed in the August 6th teleconference.

@bashbaug merged commit f473546 into KhronosGroup:main on Aug 6, 2024. 7 checks passed.
@ahesham-arm deleted the max_size_change branch on August 6, 2024.
6 participants