
LazySum of LazyTensor of sparse result in high alloc in integration #352

Closed

AmitRotem opened this issue Dec 13, 2022 · 2 comments

@AmitRotem (Contributor)

Here is a short example with two subsystems. A LazySum of LazyTensor of sparse operators has memory allocations that scale with integration time, whereas a LazySum of sparse operators has a constant number of allocations. Integration takes about 6 times as long for this 225-dimensional Hilbert space. The profiler showed some GC in `_strides`, but other than that all boxes are bluish.

using SparseArrays, QuantumOptics, Plots, BenchmarkTools
# setup
nn = 15
bas = NLevelBasis(nn)
ψ0 = randoperator(bas^2, NLevelBasis(2));
# Hamiltonians
Ht_sum_tensor_sparse, Ht_sum_sparse = let op, op1, op2
    # list of sparse operators
    op = [Operator(bas, sprandn(nn,nn,0.5)) for k=1:4]
    op .+= op
    # LazySum of LazyTensor of sparse operators
    op1 = LazySum(LazyTensor(bas^2, 1, op[1]),
        LazyTensor(bas^2, 2, op[2]),
        LazyTensor(bas^2, (1,2), Tuple(op[3:4])))
    function OP1(t,u)
        op1.factors[3] = Complex(sinpi(t)+sinpi(t/10)+sinpi(t/100)+sinpi(t/1000))
        op1
    end
    # LazySum of sparse operators
    op2 = LazySum(sparse.(op1.operators)...)
    function OP2(t,u)
        op2.factors[3] = Complex(sinpi(t)+sinpi(t/10)+sinpi(t/100)+sinpi(t/1000))
        op2
    end
    OP1, OP2
end;

# allocation of solver vs time span of solution
## LazySum of LazyTensor of sparse operators
_, ψT = timeevolution.schroedinger_dynamic((0.0, 1.0), ψ0, Ht_sum_tensor_sparse); # compile
@time al_sts = [@allocated _, ψT = timeevolution.schroedinger_dynamic((0.0, Float64(k)), ψ0, Ht_sum_tensor_sparse) for k=1:5];
## LazySum of sparse operators
_, ψT = timeevolution.schroedinger_dynamic((0.0, 1.0), ψ0, Ht_sum_sparse); # compile
@time al_ss = [@allocated _, ψT = timeevolution.schroedinger_dynamic((0.0, Float64(k)), ψ0, Ht_sum_sparse) for k=1:5];

# plot
plot(al_sts; label="lazy sum of lazy tensor of sparse") # allocations are linear in the time span
plot!(al_ss; label="lazy sum of sparse") # allocations are constant

tes1(n) = sum(timeevolution.schroedinger_dynamic((0.0, exp2(k)), randoperator(bas^2, NLevelBasis(2)), Ht_sum_tensor_sparse)[2][end] for k=range(0,10,n))
tes2(n) = sum(timeevolution.schroedinger_dynamic((0.0, exp2(k)), randoperator(bas^2, NLevelBasis(2)), Ht_sum_sparse)[2][end] for k=range(0,10,n))
tes1(2),tes2(2);

@time tes1(10); # about 6 times as long as tes2(10)
@time tes2(10);

@profview tes1(10); # shows some GC at `_strides` when creating a vector of zeros, but doesn't take a lot of time
@profview tes2(10);
@amilsted (Collaborator)

It seems we allocate shape and strides vectors every time we do a LazyTensor sparse gemm. This is a holdover from the old LazyTensor code. We can probably replace these vectors with tuples?
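As a minimal sketch of why this matters (not the actual gemm code; `shape_vec` and `shape_tup` are hypothetical names), rebuilding `Vector`s of shape/stride bookkeeping on every call pays heap allocations per call, whereas the tuples returned by `Base.size` and `Base.strides` are isbits and allocation-free:

```julia
# Hypothetical comparison: Vector-based vs. tuple-based shape/stride bookkeeping.
shape_vec(A) = (collect(size(A)), collect(strides(A)))  # allocates two Vectors per call
shape_tup(A) = (size(A), strides(A))                    # isbits tuples, no heap allocation

const A_demo = zeros(15, 15)
shape_vec(A_demo); shape_tup(A_demo)        # warm up so compilation isn't measured
@assert (@allocated shape_tup(A_demo)) == 0
@assert (@allocated shape_vec(A_demo)) > 0
```

In an inner loop such as a gemm called on every ODE step, those per-call `Vector` allocations would add up to exactly the kind of time-span-proportional allocation count shown in the plot above.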

@amilsted (Collaborator) commented Aug 3, 2023

Fixed

@amilsted closed this as completed Aug 3, 2023