-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cuda.host_empty function #64
Conversation
This PR aims to make the C code compilable using nvcc. The cuda language was added as well as a CudaCodePrinter. Changes to stdlib: Wrapped expressions using complex types in an `ifndef __NVCC__` to avoid processing them with the nvcc compiler --------- Co-authored-by: Mouad Elalj, EmilyBourne
This pull request fixes #48, by implementing a tiny wrapper for CUDA and a wrapper for non-CUDA functionalities only with external 'C'. **Commit Summary** - Implemented new header printer for CUDA. - Added CUDA wrapper assignment - Instead of wrapping all local headers, wrap only C functions with extern 'C' --------- Co-authored-by: EmilyBourne <louise.bourne@gmail.com> Co-authored-by: bauom <40796259+bauom@users.noreply.github.com>
This pull request addresses issue #28 by implementing a new feature in Pyccel that allows users to define custom GPU kernels. The syntax for creating these kernels is inspired by Numba. and I also need to fix issue #45 for testing purposes **Commit Summary** - Introduced KernelCall class - Added cuda printer methods _print_KernelCall and _print_FunctionDef to generate the corresponding CUDA representation for both kernel calls and definitions - Added IndexedFunctionCall represents an indexed function call - Added CUDA module and cuda.synchronize() - Fixing a bug that I found in the header: it does not import the necessary header for the used function --------- Co-authored-by: EmilyBourne <louise.bourne@gmail.com> Co-authored-by: bauom <40796259+bauom@users.noreply.github.com> Co-authored-by: Emily Bourne <emily.bourne@epfl.ch>
…nctions, and refining CUDA type handling
Hello again! Thank you for this new pull request 🤩. Please begin by requesting your checklist using the command |
assert isinstance(rank, int) | ||
assert order in (None, 'C', 'F') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing assert for memory location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably also assert rank>0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think the rank can be less than 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither do I. That's why I want you to add an assert so we have an error if we write code wrong and get something with rank less than 1
other_f_contiguous = other.order in (None, 'F') | ||
self_f_contiguous = self.order in (None, 'F') | ||
order = 'F' if other_f_contiguous and self_f_contiguous else 'C' | ||
return CudaArrayType(result_type, rank, order, self.memory_location) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory mismanagement.
from pyccel import cuda
a = cuda.ones_host(4)
b = cuda.ones_device(4)
c = a + b # According to your function this is ok and c is on host
d = b + a # According to your function this is ok and d is on device
pyccel/ast/cudatypes.py
Outdated
if isinstance(other, FixedSizeNumericType): | ||
return CudaArrayType(elem_type and other) | ||
elif isinstance(other, CudaArrayType): | ||
return CudaArrayType(elem_type+other.element_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this implementation. It also seems to be broken for NumPy arrays.
What is the rank/memory space?
this function returns None. | ||
""" | ||
return self._order | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing memory property
def __repr__(self): | ||
dims = ','.join(':'*self._container_rank) | ||
order_str = f'(order={self._order})' if self._order else '' | ||
return f'{self.element_type}[{dims}]{order_str}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing memory information
/bot run pr_tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be lines in this PR which aren't tested. Please take a look at my comments and add tests which cover the new code.
If this is modified code which cannot be easily tested in this PR please open an issue to request that this code be either removed or tested. Once you have done that please leave a message on the relevant conversation beginning with the line /bot accept
and referencing the issue.
Similarly if the new code cannot be tested for some reason, please leave a comment beginning with the line /bot accept
on the relevant conversation explaining why the code can't be tested.
elif isinstance(other, CudaArrayType) or (isinstance(other, NumpyNDArrayType) and self.memory_location == "host"): | ||
comparison_type = np.zeros(1, dtype = pyccel_type_to_original_type[other.element_type]) | ||
else: | ||
return NotImplemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code isn't tested. Please can you take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/bot accept
The fallback return NotImplemented
does not need testing
return f"cuda_free({var_code});\n" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code isn't tested. Please can you take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/bot accept
the free of the device array will be tested in the next PRs
/bot run pyccel_lint |
/bot run pr_tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be lines in this PR which aren't tested. Please take a look at my comments and add tests which cover the new code.
If this is modified code which cannot be easily tested in this PR please open an issue to request that this code be either removed or tested. Once you have done that please leave a message on the relevant conversation beginning with the line /bot accept
and referencing the issue.
Similarly if the new code cannot be tested for some reason, please leave a comment beginning with the line /bot accept
on the relevant conversation explaining why the code can't be tested.
/bot run pr_tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be lines in this PR which aren't tested. Please take a look at my comments and add tests which cover the new code.
If this is modified code which cannot be easily tested in this PR please open an issue to request that this code be either removed or tested. Once you have done that please leave a message on the relevant conversation beginning with the line /bot accept
and referencing the issue.
Similarly if the new code cannot be tested for some reason, please leave a comment beginning with the line /bot accept
on the relevant conversation explaining why the code can't be tested.
I can't seem to find your checklist to confirm that you have completed all necessary tasks. Please request one using |
I can't seem to find your checklist to confirm that you have completed all necessary tasks. Please request one using |
Here is your checklist. Please tick items off when you have completed them or determined that they are not necessary for this pull request:
|
I can't seem to find your checklist to confirm that you have completed all necessary tasks. Please request one using |
/bot run pr_tests |
I can't seem to find your checklist to confirm that you have completed all necessary tasks. Please request one using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job ! Your PR is using all the code it added/changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job ! Your PR is using all the code it added/changed.
This pull request addresses issue #56 by adding a new feature to 'cuda' host_empty that allows you to allocate memory on the CPU