Split reverse mode for Tapir #115
Comments
Side note: it would be interesting to compare this with the way Enzyme works in split reverse mode.
Right, yes, so: it's not just the arguments which get modified on the reverse-pass -- any intermediate mutable data structures that get modified during the forwards-pass will also get modified during the reverse-pass. For multiple reverse-passes to be safe given only a single forwards-pass, I think it would have to be the case that no mutation occurs on the forwards-pass. I've not thought at all about how you might go about checking this, so I really do think it's the case that you'll have to do a single forwards-pass per reverse-pass. It would definitely be possible to do multiple reverse-passes at the same time if I modified the package to explicitly handle "chunked" reverse-mode, where you pass multiple cotangents back at the same time. This would involve a complete overhaul of the cotangent system and all of the rules though, so it's certainly not happening in the foreseeable future.
These intermediate data structures are stored in the rule object?
Exactly. For example, if you have a function:

    function f(x::Vector{Float64})
        y = map(sin, x)
        z = map(cos, y)
        return z
    end

there will, somewhere in the rule derived to differentiate this function, be memory for
To expand on @willtebbutt's answer, the primary objective of this project is to produce a rewrite of Zygote/ReverseDiff with high performance and rigorous testing. We want to keep the internals transparent and extensible, but generally want to avoid building one hammer for all nails.
And hypothetically, if we want to compute many, many pullbacks with the same forward sweep, would it make sense to deepcopy the rule?
Possibly, but most likely we would find out there are some intricacies that are annoying for one reason or another.
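To make the deepcopy question concrete, here is a minimal sketch in plain Julia. `ToyRule`, `forward!` and `reverse!` are hypothetical names invented for illustration and are not Tapir's rule type or API; the toy only assumes a rule that stores forward-sweep intermediates and mutates its own state on the reverse sweep.

```julia
# Hypothetical toy rule for f(x) = sum(sin, x); NOT Tapir's actual rule type.
mutable struct ToyRule
    cos_x::Vector{Float64}  # intermediate saved by the forward sweep
    grad::Vector{Float64}   # buffer that the reverse sweep accumulates into
end

# Forward sweep: compute the value and save what the reverse sweep needs.
function forward!(rule::ToyRule, x::Vector{Float64})
    rule.cos_x .= cos.(x)
    rule.grad .= 0.0
    return sum(sin, x)
end

# Reverse sweep: accumulates in place, so calling it twice changes the answer.
function reverse!(rule::ToyRule, dy::Float64)
    rule.grad .+= dy .* rule.cos_x
    return copy(rule.grad)
end

x = [0.0, 1.0]
rule = ToyRule(zeros(2), zeros(2))
forward!(rule, x)

snapshot = deepcopy(rule)               # state right after the forward sweep

g1 = reverse!(rule, 1.0)                # first reverse sweep
g2 = reverse!(rule, 1.0)                # reuses the mutated state
g3 = reverse!(deepcopy(snapshot), 1.0)  # fresh copy of the post-forward state
```

In this toy, `g2` differs from `g1` because the second call starts from state the first call already modified, while `g3` agrees with `g1`. Whether the same trick is safe for a real derived rule depends on everything the rule can reach, which is presumably where the "intricacies" would show up.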
We've discussed this on Slack with @willtebbutt but I wanted to make sure where we stand on split reverse mode, i.e. separating the forward sweep from the reverse sweep.
The idea is to be able to perform multiple reverse sweeps with different seeds after just one forward sweep. A typical example is computing a Jacobian (where there is one seed per basis vector of the output space).
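As an illustration of the pattern, here is a minimal hand-written sketch in plain Julia for the function `cos.(sin.(x))`. `forward_sweep` and `jacobian_split` are hypothetical names, no AD package is involved, and the pullback here only reads its saved intermediates, which is exactly the property that makes reusing it safe.

```julia
# Hand-written forward sweep for f(x) = map(cos, map(sin, x)).
# Returns the output and a pullback closure that only READS saved intermediates.
function forward_sweep(x::Vector{Float64})
    y = map(sin, x)
    z = map(cos, y)
    # dz_i/dx_i = -sin(y_i) * cos(x_i); the Jacobian is diagonal for this f.
    pullback(dz::Vector{Float64}) = dz .* (-sin.(y)) .* cos.(x)
    return z, pullback
end

# One forward sweep, then one reverse sweep per basis vector of the output space.
function jacobian_split(x::Vector{Float64})
    z, pullback = forward_sweep(x)
    n = length(z)
    J = zeros(n, length(x))
    for i in 1:n
        seed = zeros(n)
        seed[i] = 1.0
        J[i, :] = pullback(seed)  # row i of the Jacobian
    end
    return J
end

x = [0.1, 0.2, 0.3]
J = jacobian_split(x)
```

Each row of `J` comes from a separate reverse sweep with a different seed, all sharing the single call to `forward_sweep`.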
My question is the following: is that currently possible with Tapir's `rrule`? IIUC, the answer is no for functions that mutate their argument (gdalle/DifferentiationInterface.jl#142), but what about simple allocating functions?

I took inspiration from https://github.com/withbayes/Tapir.jl/blob/f5e2b90cd17fd3127dd0fd8dfa617bc112275626/src/interface.jl#L9-L15 to try and write what I called `value_and_pullback_split` in DifferentiationInterface.

But the behavior of the resulting closure changes at each call.
For some functions it gives different results:
For others it downright errors:
What should I copy to allow for independent pullback calls? Probably `out`, `tf` and `tx`?