How to reduce the initialization time? #137
Comments
Hello, no, there is no other way. saveApplicationToString only saves the binaries, not the plan, and this is precisely what you are looking for. Best regards,
Is it possible to save the internally compiled kernel functions in a directory so that they can be used directly next time without compiling them again? Is such a solution feasible in VkFFT? Is it difficult?
Hello, saveApplicationToString saves the binaries, which you can later load with the loadApplicationFromString configuration option. See pages 64-65 of the documentation. Best regards,
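As a rough illustration of the save/load round trip described above, here is a minimal sketch of the file handling only. The helper names `cache_write` and `cache_read` are hypothetical and not part of VkFFT. The comments describe how the buffers would connect to VkFFT based on my reading of its documentation, so verify the field names against your version.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write `size` bytes from `data` to `path`.
 * Assumed usage: after initializeVkFFT() with
 * configuration.saveApplicationToString = 1, the binary blob is expected in
 * app.saveApplicationString with length app.applicationStringSize.
 * Returns 0 on success, -1 on failure. */
int cache_write(const char *path, const void *data, size_t size) {
    assert(path != NULL && data != NULL);
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = fclose(f);
    return (written == size && close_failed == 0) ? 0 : -1;
}

/* Hypothetical helper: read a whole file into a malloc'd buffer.
 * Assumed usage: assign the result to configuration.loadApplicationString,
 * set configuration.loadApplicationFromString = 1, call initializeVkFFT(),
 * then free the buffer.
 * Returns NULL if the file is missing or unreadable. */
void *cache_read(const char *path, size_t *size_out) {
    assert(path != NULL && size_out != NULL);
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return NULL; }
    void *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = (size_t)len;
    return buf;
}
```

A `NULL` result from `cache_read` would be the signal to fall back to a normal initialization with saving enabled, so the file exists on the next run.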
Saving the application cannot cover all situations. I want to save the internal kernels, not the application. I would compile all the kernels in advance, so that any later application that requires the same kernel can use it directly without compiling again.
I am sorry, I don't understand. Which situations can't it adapt to? Please provide an example configuration.
I want to implement a function such that any size and any stride can be initialized quickly. My approach is to prepare all the kernels in advance and put them in a directory, so that they don't need to be compiled again when used later. It is impossible to save all applications, as there are too many of them. But the internal kernels are universal and limited in number, so they can be saved in advance.
The internal kernel is not universal and can't be saved in advance. The thing that you call a kernel is a sequence of CPU calls that generates the code for a particular FFT, which is then compiled.
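Since the generated code is specific to one FFT, a cache of saved binaries would have to be keyed by the configuration rather than by a shared kernel. A minimal sketch of such a key, assuming a hypothetical `FftKey` struct and `cache_path` helper (neither is part of VkFFT), could build one file name per size, stride, and precision combination. A real key would probably also need to include the device and backend version, which this sketch leaves out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one FFT configuration; not a VkFFT type. */
typedef struct {
    int dims;              /* number of FFT dimensions, 1 to 3 */
    uint64_t size[3];      /* extent along each axis */
    uint64_t stride[3];    /* element stride along each axis */
    int double_precision;  /* 0 = fp32, 1 = fp64 */
} FftKey;

/* Format a cache file path for `key` inside directory `dir`.
 * Returns 0 on success, -1 if the path does not fit into `out`. */
int cache_path(char *out, size_t cap, const char *dir, const FftKey *key) {
    assert(out != NULL && dir != NULL && key != NULL);
    if (strlen(dir) == 0) dir = "."; /* empty dir means current directory */
    int n = snprintf(out, cap,
                     "%s/fft_d%d_n%" PRIu64 "x%" PRIu64 "x%" PRIu64
                     "_s%" PRIu64 "x%" PRIu64 "x%" PRIu64 "_%s.bin",
                     dir, key->dims,
                     key->size[0], key->size[1], key->size[2],
                     key->stride[0], key->stride[1], key->stride[2],
                     key->double_precision ? "fp64" : "fp32");
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

With a key like this, the cache can be filled lazily: each configuration is compiled once on first use and loaded from its file afterwards, which avoids having to enumerate every size and stride in advance.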
Suppose I extract the generic kernel and compile it in advance, so that later calls can use it directly without compiling at runtime. Can such a feature be achieved by modifying the code? If so, I'll give it a try.
No, it is not possible to create an uberkernel that will work for all system configurations. It would require a big redesign of the library and wouldn't work with all the algorithms.
Just for HIP. And it's OK if it only covers most of the algorithms. Can such a feature be implemented?
This feature is not on my development radar, as it would take too much time to implement for no particular benefit. If you want to experiment with it, you are free to do so.
As long as it can be achieved and takes no more than one month, I will give it a try.
The initialization time of the plan is too long. How can I reduce it? For example, can the compiled kernels be saved? I saw the saveApplicationToString option, which can save the entire plan. Is there any other way?