
LazySum of LazyTensor of sparse result in high alloc in integration #352

Closed

AmitRotem opened this issue Dec 13, 2022 · 2 comments

@AmitRotem (Contributor)

Here is a short example with two subsystems. A LazySum of LazyTensor of sparse operators has memory allocations that scale with integration time, whereas a LazySum of sparse operators has a constant number of allocations. Integration takes about 6 times as long for this 225-dimensional Hilbert space. The profiler showed some GC in `_strides`, but other than that all boxes are bluish.

using SparseArrays, QuantumOptics, Plots, BenchmarkTools
# setup
nn = 15
bas = NLevelBasis(nn)
ψ0 = randoperator(bas^2, NLevelBasis(2));
# Hamiltonians
Ht_sum_tensor_sparse, Ht_sum_sparse = let op, op1, op2
    # list of sparse operators
    op = [Operator(bas, sprandn(nn,nn,0.5)) for k=1:4]
    op .+= op
    # LazySum of LazyTensor of sparse operators
    op1 = LazySum(LazyTensor(bas^2, 1, op[1]),
        LazyTensor(bas^2, 2, op[2]),
        LazyTensor(bas^2, (1,2), Tuple(op[3:4])))
    function OP1(t,u)
        op1.factors[3] = Complex(sinpi(t)+sinpi(t/10)+sinpi(t/100)+sinpi(t/1000))
        op1
    end
    # LazySum of sparse operators
    op2 = LazySum(sparse.(op1.operators)...)
    function OP2(t,u)
        op2.factors[3] = Complex(sinpi(t)+sinpi(t/10)+sinpi(t/100)+sinpi(t/1000))
        op2
    end
    OP1, OP2
end;

# allocation of solver vs time span of solution
## LazySum of LazyTensor of sparse operators
_, ψT = timeevolution.schroedinger_dynamic((0.0, 1.0), ψ0, Ht_sum_tensor_sparse); # compile
@time al_sts = [@allocated _, ψT = timeevolution.schroedinger_dynamic((0.0, Float64(k)), ψ0, Ht_sum_tensor_sparse) for k=1:5];
## LazySum of sparse operators
_, ψT = timeevolution.schroedinger_dynamic((0.0, 1.0), ψ0, Ht_sum_sparse); # compile
@time al_ss = [@allocated _, ψT = timeevolution.schroedinger_dynamic((0.0, Float64(k)), ψ0, Ht_sum_sparse) for k=1:5];

# plot
plot(al_sts; label="lazy sum of lazy tensor of sparse") # allocations are linear in the time span
plot!(al_ss; label="lazy sum of sparse") # allocations are constant

tes1(n) = sum(timeevolution.schroedinger_dynamic((0.0, exp2(k)), randoperator(bas^2, NLevelBasis(2)), Ht_sum_tensor_sparse)[2][end] for k=range(0,10,n))
tes2(n) = sum(timeevolution.schroedinger_dynamic((0.0, exp2(k)), randoperator(bas^2, NLevelBasis(2)), Ht_sum_sparse)[2][end] for k=range(0,10,n))
tes1(2),tes2(2);

@time tes1(10); # about 6 times as long as tes2(10)
@time tes2(10);

@profview tes1(10); # shows some GC at `_strides` when creating a vector of zeros, but doesn't take a lot of time
@profview tes2(10);
@amilsted (Collaborator)

It seems we allocate shape and strides vectors every time we do a LazyTensor sparse gemm. This is a holdover from the old LazyTensor code. We can probably replace these vectors with tuples?
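As a minimal sketch of why this matters (not the actual gemm code; `shape_vec` and `shape_tup` are hypothetical names), rebuilding `Vector`s of shape/stride bookkeeping on every call pays heap allocations per call, whereas the tuples returned by `Base.size` and `Base.strides` are isbits and allocation-free:

```julia
# Hypothetical comparison: Vector-based vs. tuple-based shape/stride bookkeeping.
shape_vec(A) = (collect(size(A)), collect(strides(A)))  # allocates two Vectors per call
shape_tup(A) = (size(A), strides(A))                    # isbits tuples, no heap allocation

const A_demo = zeros(15, 15)
shape_vec(A_demo); shape_tup(A_demo)        # warm up so compilation isn't measured
@assert (@allocated shape_tup(A_demo)) == 0
@assert (@allocated shape_vec(A_demo)) > 0
```

In an inner loop such as a gemm called on every ODE step, those per-call `Vector` allocations would add up to exactly the kind of time-span-proportional allocation count shown in the plot above.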

@amilsted (Collaborator) commented Aug 3, 2023

Fixed

@amilsted closed this as completed Aug 3, 2023