Add derivative methods without caches #410

Closed · wants to merge 5 commits

Conversation

apkille (Contributor) commented Aug 11, 2024

Addresses #406. This PR makes defining in-place derivative functions straightforward for any solver (that is, the user does not have to pass caches or adjoints of operators). This is important given the recent merges (#404, qojulia/QuantumOpticsBase.jl#172) that add a SciML interface to QO.jl. The following example illustrates the motivation for this PR.

using QuantumOptics

b = FockBasis(10)
psi0 = fockstate(b, 4)
rho0 = dm(psi0)
H = number(b)
J = [destroy(b), create(b)]
rates = [0.7, 0.5]

Before, we would have to define an in-place function as

f!(drho, rho, p, t) = timeevolution.dmaster_h!(drho, H, J, dagger.(J), rates, rho, copy(drho))

Now, we can instead simply write

f!(drho, rho, p, t) = timeevolution.dmaster_h!(drho, H, J, rates, rho)

And solve the ODE as follows:

using OrdinaryDiffEq

prob = ODEProblem(f!, rho0, (0.0, 1.0))
sol = solve(prob, DP5(); save_everystep=false)

codecov bot commented Aug 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.68%. Comparing base (d2c71fe) to head (2280d97).
Report is 12 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #410      +/-   ##
==========================================
- Coverage   97.03%   96.68%   -0.36%     
==========================================
  Files          18       19       +1     
  Lines        1554     1688     +134     
==========================================
+ Hits         1508     1632     +124     
- Misses         46       56      +10     


apkille (Contributor, Author) commented Aug 14, 2024

@Krastanov @david-pl pinging for review.

Krastanov (Collaborator) commented:

For the following:

f!(drho, rho, p, t) = timeevolution.dmaster_h!(drho, H, J, dagger.(J), rates, rho, copy(drho))

does this mean that each step of the solver will be causing an allocation (the copy)?

Krastanov (Collaborator) commented:

And related to my question, do we want to create something like this:

struct dmaster_h_cache{T,S}
    Jdagger::SomethingSpecific{T}
    rhocache::SomethingElse{S}
end

function (s::dmaster_h_cache)(drho, H, J, rates, rho)
    timeevolution.dmaster_h!(drho, H, J, s.Jdagger, rates, rho, s.rhocache)
end

but a bit cleaner and with better names

apkille (Contributor, Author) commented Aug 14, 2024

does this mean that each step of the solver will be causing an allocation (the copy)?

@Krastanov yes, that is the case. We actually get a really nice performance speedup here using just the ODE interface. Although there's some allocation overhead, the speed is much better (here I set b = FockBasis(100)) and maintains good performance as the problem scales:

julia> @benchmark solve($prob, DP5(); save_everystep=false)
BenchmarkTools.Trial: 263 samples with 1 evaluation.
 Range (min … max):  18.140 ms …  22.007 ms  ┊ GC (min … max): 0.00% … 15.85%
 Time  (median):     18.723 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   19.052 ms ± 765.057 μs  ┊ GC (mean ± σ):  3.10% ±  3.71%

    ▃▂█▄▂ ▃                                                     
  ▄▅███████▅▄▃▃▃▅▃▄▅▃▅▆▅▅▅▆▄▅▆▅▆▄▃▃▄▄▄▃▃▃▃▄▃▄▃▃▃▁▁▃▃▁▁▁▃▁▁▁▁▁▃ ▃
  18.1 ms         Histogram: frequency by time         21.5 ms <

 Memory estimate: 18.58 MiB, allocs estimate: 693.

julia> @benchmark timeevolution.master_h([$0.0, $1.0], $rho0, $H, $J)
BenchmarkTools.Trial: 109 samples with 1 evaluation.
 Range (min … max):  45.506 ms …  48.548 ms  ┊ GC (min … max): 0.00% … 5.60%
 Time  (median):     45.900 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.005 ms ± 505.631 μs  ┊ GC (mean ± σ):  0.27% ± 1.03%

      ▆▆▆▇▄▄█▆▂                                                 
  ▄▄▄▆█████████▆▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▃▁▁▁▃▁▁▁▁▁▁▁▃ ▃
  45.5 ms         Histogram: frequency by time         48.3 ms <

 Memory estimate: 3.12 MiB, allocs estimate: 131.

apkille (Contributor, Author) commented Aug 14, 2024

And related to my question, do we want to create something like this:

struct dmaster_h_cache{T,S}
    Jdagger::SomethingSpecific{T}
    rhocache::SomethingElse{S}
end

function (s::dmaster_h_cache)(drho, H, J, rates, rho)
    timeevolution.dmaster_h!(drho, H, J, s.Jdagger, rates, rho, s.rhocache)
end

but a bit cleaner and with better names

I think this would be really good to implement. It seems pretty inefficient to allocate so much just to provide a scratch copy for the drho update. I'd say it's outside the scope of this PR (as here I'm just creating more straightforward methods), but I'd be happy to introduce this to our solvers.
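
For concreteness, here is a minimal sketch of what such a cache-carrying functor could look like. Nothing like this is part of this PR; the name MasterHCache, its constructor, and its fields are hypothetical, and the sketch assumes the seven-argument timeevolution.dmaster_h! shown earlier in the thread.

# Hypothetical sketch: a callable cache object with the SciML in-place
# signature f!(drho, rho, p, t). The adjoints and the scratch state are
# built once, so the solver does not allocate them on every step.
struct MasterHCache{TH,TJ,TJd,TR,TS}
    H::TH
    J::TJ
    Jdagger::TJd   # dagger.(J), computed once
    rates::TR
    tmp::TS        # reusable scratch copy of the state
end

MasterHCache(H, J, rates, rho0) = MasterHCache(H, J, dagger.(J), rates, copy(rho0))

function (c::MasterHCache)(drho, rho, p, t)
    timeevolution.dmaster_h!(drho, c.H, c.J, c.Jdagger, c.rates, rho, c.tmp)
end

With something like this, the example from the PR description would become f! = MasterHCache(H, J, rates, rho0) followed by prob = ODEProblem(f!, rho0, (0.0, 1.0)), keeping the convenience while avoiding the per-step copy.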

Krastanov (Collaborator) commented:

does this mean that each step of the solver will be causing an allocation (the copy)?

@Krastanov yes, that is the case. We actually get a really nice performance speedup here using just the ODE interface. Although there's some allocation overhead, the speed is much better (here I set b = FockBasis(100)) and maintains good performance as the problem scales:

This confuses me a lot. What is going wrong in master_h such that this new method, which should be strictly slower (because of all the extra copies), is actually faster?

Is there a potential future version of timeevolution.dmaster_h!(drho, H, J, rates, rho) that does not make copies on each step? I am reluctant to include a new approach that is slower than the currently existing one, even if it is more convenient. Can we make this into something that is as fast as the already existing method?

apkille (Contributor, Author) commented Aug 15, 2024

@Krastanov timeevolution.master_h calls dmaster_h! in its backend. Before I submitted the PR, dmaster_h! already copied drho at each step. Here, I am simply defining a method that makes this function usable without the user having to provide an argument that copies drho:

dmaster_h!(drho, H, J, rates, rho) = dmaster_h!(drho, H, J, dagger.(J), rates, rho, copy(drho))

Maybe I am thinking about this the wrong way, but this was my thought process.

apkille marked this pull request as draft on August 15, 2024, 14:07
apkille closed this on Aug 15, 2024