Hi Stephan,
Regarding Monte Carlo/JIT, please refer to issue #70 for details. The important point is that recording the tape only once is valid only if you can guarantee that the code path taken does not depend on the inputs in any way. If there are branches, data-dependent iterations, different polymorphic calls, etc. (the general case), replaying a single recorded tape is unsafe and produces wrong derivatives. With that in mind, users can use the JIT approach mentioned in the issue, at their own risk, once it is implemented.
Regarding GPUs: a dynamically growing tape for each GPU thread is not practical and would be very inefficient. For most code, even a fixed-size tape would not fit into local or shared memory, and accessing global memory from every GPU thread is highly inefficient. We therefore recommend including GPU functions in a CPU-managed tape via the external functions interface, that is, implementing the adjoint (reverse) pass manually as another GPU kernel. This ensures the highest performance, and since GPU kernels are typically small, writing the corresponding adjoint kernel by hand is usually feasible.
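To illustrate why a tape recorded once is only safe when the code path is input-independent, here is a minimal sketch in plain Python. It does not use XAD: the `Tape` and `Var` classes, the `replay` method, and the function `f` are all hypothetical stand-ins written for this example. The tape is recorded at one input, then replayed at an input that would have taken the other branch.

```python
class Var:
    """Active scalar (hypothetical); every operation is recorded on its tape."""
    def __init__(self, tape, slot):
        self.tape, self.slot = tape, slot

    @property
    def value(self):
        return self.tape.values[self.slot]

    def __mul__(self, other):
        return self.tape.record("mul", self, other)

    def __add__(self, other):
        return self.tape.record("add", self, other)


class Tape:
    """Toy tape (hypothetical, not the XAD API)."""
    def __init__(self):
        self.ops = []     # (kind, out_slot, lhs_slot, rhs_slot)
        self.values = []  # one value per slot

    def input(self, value):
        self.values.append(value)
        return Var(self, len(self.values) - 1)

    def record(self, kind, lhs, rhs):
        a, b = self.values[lhs.slot], self.values[rhs.slot]
        self.values.append(a * b if kind == "mul" else a + b)
        out = len(self.values) - 1
        self.ops.append((kind, out, lhs.slot, rhs.slot))
        return Var(self, out)

    def replay(self, input_slot, value):
        """Re-evaluate the recorded operations for a new input, without re-recording."""
        self.values[input_slot] = value
        for kind, out, l, r in self.ops:
            a, b = self.values[l], self.values[r]
            self.values[out] = a * b if kind == "mul" else a + b

    def adjoint(self, out_slot, in_slot):
        """Reverse sweep: derivative of the output slot w.r.t. the input slot."""
        adj = [0.0] * len(self.values)
        adj[out_slot] = 1.0
        for kind, out, l, r in reversed(self.ops):
            if kind == "mul":
                adj[l] += adj[out] * self.values[r]
                adj[r] += adj[out] * self.values[l]
            else:
                adj[l] += adj[out]
                adj[r] += adj[out]
        return adj[in_slot]


def f(x):
    # Data-dependent branch: which operation gets recorded depends on the input.
    return x * x if x.value > 1.0 else x + x


def derivative_fresh_tape(x0):
    """Record a new tape for this input (the safe, per-path approach)."""
    tape = Tape()
    x = tape.input(x0)
    y = f(x)
    return tape.adjoint(y.slot, x.slot)


# Record once at x = 2.0, which takes the x * x branch.
recorded = Tape()
x_in = recorded.input(2.0)
y_out = f(x_in)


def derivative_replayed(x0):
    """Reuse the tape recorded at x = 2.0 for a different input."""
    recorded.replay(x_in.slot, x0)
    return recorded.adjoint(y_out.slot, x_in.slot)


print(derivative_replayed(3.0), derivative_fresh_tape(3.0))  # same branch: both agree
print(derivative_replayed(0.5), derivative_fresh_tape(0.5))  # other branch: they differ
```

At `x = 0.5` the function evaluates `x + x`, whose derivative is 2, but the replayed tape still contains `x * x` and reports `2 * 0.5 = 1`. No error is raised; the result is simply wrong.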
-
Hello everyone!
I have a general question about using XAD in a Monte Carlo setting. In a simplified, general MC setting, a tape is recorded per path, and the adjoints of that path are then computed to obtain the first-order derivatives. This is repeated for n paths, and the results are averaged to obtain a linear combination of the rows of the Jacobian.
In essence, however, this regenerates a tape for every path, even though the structure of the tape generally does not change; only the values fed into the model do. Has this ever been considered as an MC optimisation, i.e. a code generation/JIT approach in which a single tape is recorded once and executed in parallel?
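To make the per-path procedure concrete, here is a minimal sketch in plain Python that does not use XAD. The model (a European call under geometric Brownian motion), all parameter values, and the function names are assumptions chosen for illustration. The reverse pass of each path is written out by hand instead of being recorded on a tape, and the per-path adjoints are averaged.

```python
import math
import random


def pathwise_delta(s0, strike, rate, vol, maturity, n_paths, seed=0):
    """Average the per-path adjoint d(payoff)/d(s0) over n_paths simulated paths."""
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Forward pass for one path.
        growth = math.exp(drift + diffusion * z)
        s_t = s0 * growth
        # Reverse pass for the same path, written by hand:
        # payoff = discount * max(s_t - strike, 0)
        payoff_bar = 1.0
        s_t_bar = payoff_bar * discount if s_t > strike else 0.0
        s0_bar = s_t_bar * growth
        total += s0_bar
    return total / n_paths


def black_scholes_delta(s0, strike, rate, vol, maturity):
    """Closed-form delta of a European call, used only as a reference value."""
    d1 = (math.log(s0 / strike) + (rate + 0.5 * vol * vol) * maturity) / (
        vol * math.sqrt(maturity)
    )
    return 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))


mc = pathwise_delta(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=100_000)
exact = black_scholes_delta(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"pathwise delta {mc:.4f} vs closed form {exact:.4f}")
```

Even in this small model the test `s_t > strike` comes out differently from path to path, so the sequence of operations is not identical across paths.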
Secondly, has anybody been able to use XAD in a GPU setting? I would be very curious to see whether this is possible.
Looking forward to hearing from everyone.