Limit buffer sizes to leave some memory for the platform #1172
Conversation
Force-pushed from 606b56c to c82bc20.
This looks like a pretty fundamental change to code that other implementations are passing without a problem.
I'd suggest starting with a detailed analysis of an individual test to show why the current value is unreasonable. Most tests do reduce the allocation size, so it seems that the question should be more about whether those reductions are correct or not, versus arbitrarily cutting everything in half.
Force-pushed from 743a633 to 4859c47.
@jlewis-austin It really depends on what the implementation returns as the max size. If the implementation returns the whole available memory in the system, creating a buffer of that size will succeed but the system won't have enough memory left.
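To illustrate the concern, here is a minimal hypothetical sketch (not CTS code): if CL_DEVICE_MAX_MEM_ALLOC_SIZE is reported close to CL_DEVICE_GLOBAL_MEM_SIZE, a test that allocates the full reported size may succeed at the allocation itself while leaving the platform too little memory to run the rest of the test.

```c
/* Hypothetical illustration, not CTS code: allocate exactly the size the
 * implementation reports as its maximum single allocation. If that value
 * is close to the device's total global memory, the allocation can succeed
 * while starving the platform for the remainder of the test. */
#include <CL/cl.h>
#include <stdio.h>

static cl_int allocate_reported_max(cl_context context, cl_device_id device)
{
    cl_ulong maxAllocSize = 0, globalMemSize = 0;
    cl_int error;

    error = clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                            sizeof(maxAllocSize), &maxAllocSize, NULL);
    if (error != CL_SUCCESS) return error;
    error = clGetDeviceInfo(device, CL_DEVICE_GLOBAL_MEM_SIZE,
                            sizeof(globalMemSize), &globalMemSize, NULL);
    if (error != CL_SUCCESS) return error;

    printf("Reported max alloc: %llu bytes, global memory: %llu bytes\n",
           (unsigned long long)maxAllocSize,
           (unsigned long long)globalMemSize);

    /* Allocate the full reported maximum in a single buffer. */
    cl_mem buf = clCreateBuffer(context, CL_MEM_READ_ONLY,
                                (size_t)maxAllocSize, NULL, &error);
    if (buf) clReleaseMemObject(buf);
    return error;
}
```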
There are some good fixes in here, but it also introduces functional changes that could impact existing implementations. I think we need to address the fundamental question first: should an implementation report a buffer size that can't be created due to platform memory usage? I tend to think that the allocation tests should be able to pass as-is without adjusting the reported numbers, but I welcome other opinions.
```c
currentSize = maxAllocSize;
while (currentSize >= maxAllocSize / PASSING_FRACTION)
{
    log_info("Trying to create a buffer of size of %lld bytes (%gMB).\n", maxAllocSize, (double)maxAllocSize/(1024.0*1024.0));
    memHdl = clCreateBuffer( context, CL_MEM_READ_ONLY, (size_t)maxAllocSize, NULL, &error );
    if (error == CL_MEM_OBJECT_ALLOCATION_FAILURE || error == CL_OUT_OF_RESOURCES || error == CL_OUT_OF_HOST_MEMORY) {
```
The original version doesn't make much sense, but I'm guessing the intent was to stop retrying once the size drops below maxAllocSize/4. In that case, shouldn't currentSize be used in clCreateBuffer (and decremented) instead of maxAllocSize?
Change to currentSize
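A sketch of how the loop might read with that change applied, using currentSize for both the allocation and the retry bookkeeping; the variables are the same as in the hunk quoted above, and the halving step is an assumption for illustration (the PR's actual decrement may differ):

```c
/* Sketch of the retry loop with the fix applied: currentSize is what gets
 * allocated and reduced, while maxAllocSize keeps the reported maximum so
 * the loop knows when to give up (at maxAllocSize / PASSING_FRACTION). */
currentSize = maxAllocSize;
while (currentSize >= maxAllocSize / PASSING_FRACTION)
{
    log_info("Trying to create a buffer of %lld bytes (%gMB).\n",
             currentSize, (double)currentSize / (1024.0 * 1024.0));
    memHdl = clCreateBuffer(context, CL_MEM_READ_ONLY, (size_t)currentSize,
                            NULL, &error);
    if (error == CL_MEM_OBJECT_ALLOCATION_FAILURE
        || error == CL_OUT_OF_RESOURCES || error == CL_OUT_OF_HOST_MEMORY)
    {
        currentSize /= 2; /* hypothetical step; retry with a smaller size */
        continue;
    }
    break;
}
```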
```diff
-while (maxAllocSize > (maxAllocSize/4)) {
+currentSize = maxAllocSize;
+while (currentSize >= maxAllocSize / PASSING_FRACTION)
```
It took me a while to figure out the name "PASSING_FRACTION", but I'm having trouble thinking of anything better. Maybe something like MAX_REDUCTION_FACTOR?
Change to MAX_REDUCTION_FACTOR
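For illustration, the rename might read as follows; the value 4 mirrors the original maxAllocSize/4 bound, and the actual constant in the PR may differ:

```c
/* Hypothetical rename: the retry loop gives up once the attempted size
 * drops below this fraction of the reported maximum allocation size. */
#define MAX_REDUCTION_FACTOR 4

currentSize = maxAllocSize;
while (currentSize >= maxAllocSize / MAX_REDUCTION_FACTOR)
{
    /* ... attempt the allocation, reduce currentSize on failure ... */
}
```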
```c
if ((0 == gIsEmbedded && maxAllocSize < 128 * 1024 * 1024)
    || maxAllocSize < 1 * 1024 * 1024)
```
These checks can introduce new failures since maxAllocSize has already been reduced. There should be a clear distinction between reported max alloc size and any modifications we do to it. I still have the same objection about reducing allocation size in two different places. Applying this factor up front is reducing coverage for implementations that currently pass the test with their reported size--after this change they would be testing only half the size they're capable of.
This value is 1M for embedded and 128M otherwise.
I changed the check to use requiredAllocSize for consistency, but is there really an implementation reporting 2M or 256M, which is what it would take for the reduction to make this check fail?
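One hypothetical way to keep the reported value distinct from any reduction is sketched below; it is a fragment assuming the usual CTS harness headers, with requiredAllocSize standing for the 128M/1M spec minimum mentioned above, and it is not the PR's actual code:

```c
/* Sketch: validate the *reported* maximum against the spec-mandated
 * minimum (128M for full profile, 1M for embedded) before reducing the
 * size the test will actually allocate. */
cl_ulong reportedMaxAllocSize = 0;
cl_int error = clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                               sizeof(reportedMaxAllocSize),
                               &reportedMaxAllocSize, NULL);
test_error(error, "clGetDeviceInfo failed");

const cl_ulong requiredAllocSize =
    gIsEmbedded ? 1 * 1024 * 1024 : 128 * 1024 * 1024;
if (reportedMaxAllocSize < requiredAllocSize)
{
    log_error("Reported CL_DEVICE_MAX_MEM_ALLOC_SIZE is below the minimum "
              "required by the specification.\n");
    return -1;
}

/* Any reduction for platform headroom goes into a separate variable,
 * leaving the reported value untouched. */
cl_ulong testAllocSize = reportedMaxAllocSize / 2; /* hypothetical factor */
```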
```c
if (clGetDeviceInfo(device, info, sizeof(max_size), &max_size, NULL)
    != CL_SUCCESS)
{
    throw std::runtime_error("clGetDeviceInfo failed\n");
```
Is this caught somewhere? It looks like a change in functionality, since the existing test_error_abort macro just returns an error from the caller and allows further testing to continue.
I tried to harmonize this function with the get_max_param_size function.
Does it really make sense to continue testing if clGetDeviceInfo returns an error?
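For comparison, here is a sketch of the same query written with the existing test_error_abort macro mentioned above, assuming it takes an error code and a message like the harness's test_error macro:

```c
/* Sketch: report the clGetDeviceInfo failure through the harness macro so
 * the current test exits with an error while the harness can keep running
 * other tests. 'device' and 'info' are the same variables as in the hunk
 * quoted above. */
cl_ulong max_size = 0;
cl_int error =
    clGetDeviceInfo(device, info, sizeof(max_size), &max_size, NULL);
test_error_abort(error, "clGetDeviceInfo failed");
```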
Agreed, we need more opinions on that before proceeding with the other comments, as this is fundamental to this change. What about passing a parameter to the test, as we do with the reduction% in test_allocation?
I should clarify that most of the reductions look fine for the tests that aren't specifically testing max alloc size functionality. The main one I'm concerned about is the max allocation test.

That being said, I think you're right about not estimating platform usage, and I got two trains of thought mixed up in the problem statement. The real question should have been whether the CTS lower limits on retries (i.e., your PASSING_FRACTION) serve as a "reasonable" minimum value that should succeed, or whether the tests should remove all limitations and simply keep retrying down to size 1.

On the max allocation tests, I think it makes sense in either case to always attempt the max size in order to verify that it either fails gracefully or succeeds.
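Under the second option (no lower bound on retries), the loop might look roughly like the sketch below: it always attempts the reported maximum first, so the full-size allocation is verified to either fail gracefully or succeed, then halves with no arbitrary cut-off. Variable names mirror the hunks quoted earlier; this is not the PR's actual code.

```c
/* Sketch: attempt the reported maximum first, then halve on failure until
 * an allocation succeeds or the size reaches 1 byte. */
cl_ulong currentSize = maxAllocSize; /* value reported by the implementation */
cl_mem memHdl = NULL;
cl_int error = CL_SUCCESS;

for (;;)
{
    memHdl = clCreateBuffer(context, CL_MEM_READ_ONLY, (size_t)currentSize,
                            NULL, &error);
    if (error == CL_SUCCESS || currentSize <= 1) break;

    /* The full-size attempt is allowed to fail gracefully; smaller sizes
     * are then tried without any arbitrary cut-off. */
    currentSize /= 2;
}
```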
Force-pushed from 2e029ab to c55b922, c55b922 to 6579c70, 0e12851 to a53a099, and d4b02e8 to a6bf03c.
Discussed in the June 18th teleconference. This is an older PR that has been rebased and updated and is now ready for review.
Changes look good to me technically. I do wonder whether we're losing too much coverage reducing allocation size by 50% in many cases, but we can try this and see.
```diff
@@ -24,8 +24,10 @@

 typedef long long unsigned llu;

+#define REDUCTION_PERCENTAGE_DEFAULT 50
```
I'm not necessarily opposed to this, but it does change the default behavior of the test pretty significantly, since a "reduction percentage" less than 100% also affects the number of work-items executed.
Should we just change the number of work-items executed unconditionally? I'd feel more comfortable if this were just changing the max allocation percentage.
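As a hedged sketch of the coupling being described (illustrative code, not the actual test_allocation source): because the work-item count is derived from the allocation size, a default reduction percentage below 100% shrinks both the memory allocated and the work executed.

```c
/* Illustrative sketch only: a reduction percentage that scales the
 * allocation size also scales the number of work-items when the work-item
 * count is derived from the allocated size. */
#include <CL/cl.h>

#define REDUCTION_PERCENTAGE_DEFAULT 50

static size_t reduced_work_items(cl_device_id device)
{
    cl_ulong reportedMaxAllocSize = 0;
    clGetDeviceInfo(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE,
                    sizeof(reportedMaxAllocSize), &reportedMaxAllocSize, NULL);

    /* Memory the test will actually try to allocate. */
    cl_ulong allocSize =
        reportedMaxAllocSize * REDUCTION_PERCENTAGE_DEFAULT / 100;

    /* Work-item count follows the (reduced) allocation size, which is the
     * coverage concern raised above. */
    return (size_t)(allocSize / sizeof(cl_uint));
}
```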
Some conformance tests directly use the size returned by the runtime for the maximum memory size to allocate buffers. This doesn't leave enough memory for the system to run the tests.
Force-pushed from a6bf03c to a98506b.
Merging as discussed in the August 6th teleconference.