OpenCL implementation notes
OpenCL support is currently a work in progress in the v1.0.a-opencl branch of this project.
This library comes with optional OpenCL support, which looks promising compared to raw CPU computation.
To compile with OpenCL support, first install the required libraries listed below. Compilation has been tested successfully on Linux and macOS. Then run cmake with the following command (replace %1 with the CMake generator you want to use):
cmake -G %1 -DCG_USE_OPENCL=ON ..
Good Luck.
- CF4OCL: https://github.com/fakenmc/cf4ocl, a higher-level layer over OpenCL that reduces code verbosity. Internally, it requires GLib to be installed. In the long run, I would love to have CGraph require only the OpenCL libraries.
- CLBlast: https://github.com/CNugteren/CLBlast, used mainly for computing dot products, since I do not yet know how to write efficient, optimized OpenCL kernels.
Everything is allocated on the GPU, for both forward and backward mode. There is plenty of room for optimization here: for the moment, even if you never run a backward pass, device memory is still allocated for each node's partial derivative.
Only scalar data types are not allocated on the GPU. I do not think they should be anyway, but we will see.
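To make the memory cost of this policy concrete, here is a minimal sketch in plain C that only counts the device bytes a node would need. It is not CGraph code: the `sketch_*` names are hypothetical, and it assumes single-precision float elements. The "eager" function mirrors the behaviour described above (value plus partial derivative, scalars kept on the host), and the "lazy" one shows one possible optimization, skipping the derivative buffer when no backward pass is planned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds and shape; these are NOT CGraph's real types. */
typedef enum { SKETCH_SCALAR, SKETCH_VECTOR, SKETCH_MATRIX } sketch_kind;

typedef struct {
    sketch_kind kind;
    size_t rows;
    size_t cols;
} sketch_node;

/* Size in bytes of one device buffer holding the node's elements.
 * Assumption: elements are single-precision floats. */
static size_t sketch_buffer_bytes(const sketch_node *n)
{
    return n->rows * n->cols * sizeof(float);
}

/* Policy described in the text: every non-scalar node gets a value
 * buffer AND a partial-derivative buffer, whether or not a backward
 * pass will ever run. Scalars stay on the host (0 device bytes). */
size_t sketch_device_bytes_eager(const sketch_node *n)
{
    assert(n != NULL);
    if (n->kind == SKETCH_SCALAR)
        return 0;
    return 2 * sketch_buffer_bytes(n);
}

/* One possible optimization (an assumption, not implemented in the
 * project): allocate the derivative buffer only when needed. */
size_t sketch_device_bytes_lazy(const sketch_node *n, bool needs_backward)
{
    assert(n != NULL);
    if (n->kind == SKETCH_SCALAR)
        return 0;
    size_t value = sketch_buffer_bytes(n);
    return needs_backward ? 2 * value : value;
}
```

For a forward-only run, the lazy variant would halve the device memory needed by each vector or matrix node. In a real implementation, each counted buffer would correspond to one OpenCL buffer allocation.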
Kernels are provided in a single file for the moment, which I think should be compiled into the program for the sake of simplicity, avoiding any path-lookup problems.
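One way to compile the kernels into the program is to store their source as a string constant in a C file, so nothing has to be located on disk at run time. The sketch below assumes that approach. The kernel `sketch_add_vv` and the accessor `sketch_kernels_source` are hypothetical names for illustration, not taken from the project.

```c
#include <stddef.h>

/* Kernel source embedded as a string literal. The kernel itself is a
 * made-up element-wise addition, used only to illustrate the idea. */
static const char sketch_kernels_src[] =
    "__kernel void sketch_add_vv(__global const float *a,\n"
    "                            __global const float *b,\n"
    "                            __global float *out)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    out[i] = a[i] + b[i];\n"
    "}\n";

/* Returns the embedded source and, if `length` is not NULL, stores
 * its length in bytes (excluding the terminating NUL). */
const char *sketch_kernels_source(size_t *length)
{
    if (length != NULL)
        *length = sizeof(sketch_kernels_src) - 1;
    return sketch_kernels_src;
}
```

The returned pointer and length are in the form that the standard OpenCL call `clCreateProgramWithSource` accepts for its `strings` and `lengths` arguments, so the program can be built without reading any file. The trade-off is that editing a kernel requires recompiling the host program.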