Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/impurity example #171

Open
wants to merge 15 commits into
base: develop
Choose a base branch
from
197 changes: 197 additions & 0 deletions scripts/Impurity-example.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,197 @@
# # Example 4: Impurity Yrast States

# This is an example calculation of the lowest energy eigenstates at given total momentum (yrast states) of a mobile
# impurity coupled with a one-dimensional Bose gas. We will be using MPI parallelisation
# as such calculations are typically expensive. This script is designed to be run effectively
# on a high performance computing (HPC) unit.
Comment on lines +4 to +6
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused here, and I think it might also be confusing for other future readers. My understanding is that actually this script works on a small enough example that it can be run under CI and without MPI. Telling the reader that it is going to be an expensive calculation and that MPI is necessary is potentially misleading, or at least confusing information. It might be better to say that for larger system sizes (and for data reported in the publication) it makes sense to run this with MPI, but that the current example script can also be run without MPI.

The bigger questions is how many thing you want to teach or show the reader. The simpler and easier to understand a tutorial is, the more helpful it is, usually. A basic example for running with MPI is already available.


# The aim of this example is to showcase how a two-component Hamiltonian in momentum-space can be set up,
# as well as how a multi-stage FCIQMC can be run. Furthermore, this momentum-space setup will incur the
# sign problem, hence the initiator approach for FCIQMC will be used.

# A detailed description of the physical system and the physics behind the model can be found
# at our published paper (open access) "Polaron-Depleton Transition in the Yrast Excitations of a One-Dimensional
# Bose Gas with a Mobile Impurity", M. Yang, M. Čufar, E. Pahl, J. Brand,
# [*Condens. Matter* **7**, 15 (2022)](https://www.mdpi.com/2410-3896/7/1/15).

# A runnable script for this example is located
# [here](https://github.com/joachimbrand/Rimu.jl/blob/develop/scripts/Impurity-example.jl).
# Run it with `mpirun -np [# of CPUs] julia Impurity-example.jl`.
Comment on lines +17 to +19
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. Maybe mention that it can be run without MPI as well.


# ## Initial setup and parameters

# Firstly, we load all needed modules `Rimu` and `Rimu.RMPI` for parallel FCIQMC calculation.

using Rimu
using Rimu.RMPI

# Let's define a function for constructing the starting vector based on
# the total momentum of the coupled system `P`, the number of modes `m` and the
# number of non-impurity bosons `n`. The maximum allowed total momentum equals to the total
# number of particles including the impurity, hence `n+1`. Apart from the zero and the maximum
# total momentum states, we will have more than one configurations in the starting vector
# to reflect various possible excitation options based on intuitions in physics.
function init_dv(P,m,n)
Comment on lines +28 to +34
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this function is not very readable (more comment lines might help) and comes right at the start of the example. I don't have a good alternative suggestion, but this might be an early turnoff for potential readers.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this function could be simplified by making it return only the addresses. This would have the additional benefit of being able to reuse one of them as a starting address for the Hamiltonian.

Then I think the text should also explain which configurations you are constructing, or by which principle you are constructing them.

aIni = BoseFS2C(BoseFS([n; zeros(Int, m-1)]), BoseFS([1; zeros(Int, m-1)]))
dv = InitiatorDVec(aIni=>1.0;style=IsDynamicSemistochastic())
empty!(dv)
Comment on lines +35 to +37
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructing an InitiatorDVec first with the wrong address in it and then emptying it is quite confusing. Why don't you use the constructor for an empty InitiatorDVec?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to test it out. Previously doing dv[add] += 1.0 with an empty dv did not give what I wanted (sorry can't really remember what the problem was).

c = (m+1)÷2 # finding the zero-momentum mode
if P == 0 # zero total momentum
bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c] = n; bfs2[c] = 1
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0
elseif P == (n+1) # maximum total momentum
bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c+1] = n; bfs2[c+1] = 1
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0
else
bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c] = n-(P-1); bfs1[c+1] = P-1; bfs2[c+1] = 1 # move impurity to c+1
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0

bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c] = n-P; bfs1[c+1] = P; bfs2[c] = 1 # move bosons to c+1
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0

bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c] = n; bfs2[c+P] = 1 # move impurity to c+P
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0

if (n-1) >= P >(n÷2)
bfs1 = zeros(Int, m);bfs2 = zeros(Int, m)
bfs1[c] = n-(P+1); bfs1[c+1] = P+1; bfs2[c-1] = 1 # move impurity to c-1 and a boson to c+1
dv[BoseFS2C(BoseFS(bfs1),BoseFS(bfs2))]+=1.0
end
end
return dv
end

# Note that the `dv` will be constructed with `InitiatorDVec()`, meaning
# that the initiator-FCIQMC algorithm will be used.

# Now let's first do some MPI sanity checks and print some information:
mpi_barrier() # optional, use for debugging and sanity checks
@info "After barrier 1" mpi_rank() mpi_size() Threads.nthreads()
Comment on lines +72 to +74
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like some debugging code that I had in my scripts when I was learning how MPI works. It does not fulfil any real function and I'm not sure how helpful it will be to the reader. My suggestion is to delete these lines.

Suggested change
# Now let's first do some MPI sanity checks and print some information:
mpi_barrier() # optional, use for debugging and sanity checks
@info "After barrier 1" mpi_rank() mpi_size() Threads.nthreads()


# Now we specify parameters for constructing a two-component Hamiltonian
P = 3 # total momentum
m = 8 # number of modes
na= 4 # number of non-impurity bosons
γ = 0.2 # boson-boson coupling strength, dimensionless quantity
η = 0.5 # impurity-boson coupling strength, dimensionless quantity
T = m^2/2 # normalised hopping strength
U = m*γ*na/(γ*na/(m*π^2) + 1) # converting γ to U
V = m*η*na/(η*na/(m*π^2) + 1) # converting η to V
# Here we use an initial address `aIni` for constructing the Hamiltonian, but
# it will not be used in the starting vector.
aIni = BoseFS2C(BoseFS([na; zeros(Int, m-1)]), BoseFS([1; zeros(Int, m-1)]))
ham = BoseHubbardMom1D2C(aIni;ta=T,tb=T,ua=U,v=V,dispersion=continuum_dispersion)
Comment on lines +85 to +88
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? I am uneasy with recommending the usage of starting addresses this way. It would be better to construct the Hamiltonian only after you have constructed the starting address that selects the correct momentum subspace. Or is there any good reason for doing it this way?

I'd like to think of the Hamiltonian together with the starting address defining the relevant subspace of Hilbert space. What you are doing works because we are currently not testing which total momentum subspace addresses refer to. If your starting address had a different number of modes or particles, your code might error (or not), or work in unexpected ways.




# Now we can setup the Monte Carlo parameters
steps_warmup = 10_000 # number of QMC steps running with `ham2`
steps_equilibrate = 10_000 # number of QMC steps running with the real `ham`
steps_final = 10_000 # number of QMC steps running with G2 correlators, very slow, be caution!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How slow is this script? The examples have been moved out of the testing job into the documentation job, but we don't want these examples to take too long...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes about 1 min. I can make it quicker by using a smaller system or running fewer steps. What's your opinion for a ideal run time?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole Documenter "build and deploy" step took less than three minutes (after taking about the same time for installing the dependencies. I think that is fine. Still much faster that the automated "testing".

tw = 1_000 # number of walkers, be sure to use a larger enough number to eliminate biases

# Specifying the shift strategy:
s_strat = DoubleLogUpdateAfterTargetWalkers(targetwalkers = tw)
# We use very small time-step size and high starting value of shift
params = RunTillLastStep(step = 0, dτ = 0.00001, laststep = steps_warmup,shift = 200.0)
# As we only use the secondary Hamiltonian `ham2` to generate a staring vector, we don't have to
# save any data in this stage. Progress messages are suppressed with `io=devnull`.
r_strat = ReportDFAndInfo(reporting_interval = 1_000, info_interval = 1_000, writeinfo = is_mpi_root(), io = devnull)

# Wrapping `dv` for MPI:
dv = MPIData(init_dv(P,m,na));
Comment on lines +101 to +102
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add some explanation why we are doing this and what is achieved by this step.


# Let's have a look of the starting vector, in this particular case, all 4 different ways of
# distributing total momenta `P` with `init_dv()` are triggered:
@mpi_root @show dv
Comment on lines +104 to +106
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the @show here has the main purpose of showing how the function init_dv() works, I would suggest to move this demonstration up to where the function is defined. Moreover, if you print out the DVec before wrapping it with MPIData, then it prints out very cutely using the compact notation for Fock strings:

julia> d = init_dv(3,8,4)
InitiatorDVec{BoseFS2C{4, 1, 8, BitString{11, 1, UInt16}, BitString{8, 1, UInt8}, 5},Float64} with 4 entries, style = IsDynamicSemistochastic{Float64,ThresholdCompression,DynamicSemistochastic}(), initiator = Rimu.DictVectors.Initiator{Float64}(1.0)
  fs"|0 0 0 4 0 0 0 0⟩ ⊗ |0 0 0 0 0 0 1 0⟩" => 1.0
  fs"|0 0 0 0 4 0 0 0⟩ ⊗ |0 0 1 0 0 0 0 0⟩" => 1.0
  fs"|0 0 0 2 2 0 0 0⟩ ⊗ |0 0 0 0 1 0 0 0⟩" => 1.0
  fs"|0 0 0 1 3 0 0 0⟩ ⊗ |0 0 0 1 0 0 0 0⟩" => 1.0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! Makes sense.


# ## Stage 1: Running with the "dummy" Hamiltonian

# Here we are constructing a secondary Hamiltonian `ham2` with equal boson-boson and impurity coupling
# strength. We use this Hamiltonian to further generate a batter staring vector. From previous experiences
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# strength. We use this Hamiltonian to further generate a batter staring vector. From previous experiences
# strength. We use this Hamiltonian to further generate a better starting vector. From previous experiences

# calculating impurity problems, this setup can significantly speed up the convergence and help FCIQMC to
# sample the important part of the Hilbert space, especially useful when `η` is very small.
η2 = γ
V2 = m*η2*na/(η2*na/(m*π^2) + 1)
ham2 = BoseHubbardMom1D2C(aIni;ta=T,tb=T,ua=U,v=V2,dispersion=continuum_dispersion)

# Now we run FCIQMC with `lomc!()` and track the elapsed time.
# Both `df` and `state` will be overwritten late with the "real" data.
el = @elapsed df, state = lomc!(ham2, dv; params, s_strat, r_strat,)
@mpi_root @info "Initial fciqmc completed in $(el) seconds."

# ## Stage 2: Running with the real Hamiltonian with replica but no observables

# We are ready to run the real Hamiltonian, here we redefine some variables for saving outputs.
# We save the Monte Carlo data every 1000 steps.
# Progress messages are suppressed with `io=devnull`, for a real job one should remove the
# line to invoke the default `io` and reenable the output messages.
r_strat = ReportToFile(
save_if = is_mpi_root(),
filename = "mpi_df_$(η)_$(P).arrow",
chunk_size = 1000,
return_df = true, # change it to `false` when running the real job
io=devnull # remove this line when running the real job
)

# We will turn on the replica, but without operators for a fast equilibration.
el2 = @elapsed df, state = lomc!(ham,dv; params, s_strat, r_strat, replica = AllOverlaps(2, nothing), laststep = (steps_equilibrate+steps_warmup))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deprecated syntax. Also, could this step use NoStats to save some time?

Suggested change
el2 = @elapsed df, state = lomc!(ham,dv; params, s_strat, r_strat, replica = AllOverlaps(2, nothing), laststep = (steps_equilibrate+steps_warmup))
el2 = @elapsed df, state = lomc!(ham,dv; params, s_strat, r_strat, replica = AllOverlaps(2; operator=nothing), laststep = (steps_equilibrate+steps_warmup))

@mpi_root @info "Replica fciqmc completed in $(el2) seconds."

# ## Stage 3: Running with the real Hamiltonian with replica and observables

# We now at the last stage of the calculation, doing replica FCIQMC with a serious of
# G2 correlators with distance `d` from `0` to `m`. See [`G2Correlator`](@ref).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deprecated method name. Check that the references work!

Suggested change
# G2 correlators with distance `d` from `0` to `m`. See [`G2Correlator`](@ref).
# G2 correlators with distance `d` from `0` to `m`. See [`G2MomCorrelator`](@ref).

# Here we save data every 1000 steps, but using a smaller `chunk_size` like 10 or even 1
# is highly recommended as replica FCIQMC with many observables being calculated are very
# expensive and you don't want to loose too much data if the job stops before finishes.
# Progress messages are suppressed with `io=devnull`, for a real job one should remove the
# line to invoke the default `io` and reenable the output messages.
r_strat = ReportToFile(
save_if = is_mpi_root(),
filename = "mpi_df_g2_$(η)_$(P).arrow",
chunk_size = 1000,
return_df = true, # change it to `false` when running the real job
io = devnull # remove this line when running the real job
)

# Setting up a tuple of G2 correlators:
g = Tuple(G2Correlator.(0:m))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g = Tuple(G2Correlator.(0:m))
g = Tuple(G2MomCorrelator.(0:m))

# By default, for a two-component system the cross-component G2 operators are set up.
# If you want to calculate the correlations within a single component, `G2Correlator(d,:first)`
# or `G2Correlator(d,:second)` can be called based on your choice.
Comment on lines +167 to +168
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# If you want to calculate the correlations within a single component, `G2Correlator(d,:first)`
# or `G2Correlator(d,:second)` can be called based on your choice.
# If you want to calculate the correlations within a single component, `G2MomCorrelator(d,:first)`
# or `G2MomCorrelator(d,:second)` can be called based on your choice.


# Carry over information from the previous stage and set up a new `QMCState`:
new_state = Rimu.QMCState(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is undocumented usage of an unexported constructor for QMCState. Maybe we can leave it for now (because it may be the only way to achieve what you need), but I don't particularly like it. Instead I think we should make lomc! accept an old QMCState and allow keyword arguments to change any of the fields:

new_df, new_state = lomc!(old_state; replica = AllOverlaps(2; operator=g), laststep = new_laststep)

state.hamiltonian, state.replicas, Ref(Int(state.maxlength)),
state.m_strat, r_strat, state.s_strat, state.τ_strat, state.threading, state.post_step, AllOverlaps(2, g)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state.m_strat, r_strat, state.s_strat, state.τ_strat, state.threading, state.post_step, AllOverlaps(2, g)
state.m_strat, r_strat, state.s_strat, state.τ_strat, state.threading, state.post_step, AllOverlaps(2; operator=g)

)
# The final stage
el3 = @elapsed df2, state2 = lomc!(new_state; laststep = (steps_equilibrate+steps_warmup+steps_final))
@mpi_root @info "Replica fciqmc with G2 completed in $(el3) seconds."
println("MPI run finished!")

# ## Post-calculation analysis

# Typically, one should not include any analyses when using MPI, as they will be calculated multiple
# time unless you put the `@mpi_root` macro everywhere. Even so, all other MPI ranks apart from the root
# will be idling and wasting CPU hours on a HPC unit.
Comment on lines +182 to +184
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that one should not do this. Maybe then it is better to not show an example where this is done?

# But here, let's have a look of the calculated G2 correlations:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the example overlaps with the third example script, i.e. just listing and checking the G2 values. Perhaps this is all that can be done in a small example script, but can something be said about the effect of the impurity?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like the analysis that follows is already shown and explained in another example script. Maybe then it would be better to just describe in words what kind of analysis can be done on the data that has been saved to disk and refer to Example 3.

@mpi_root println("Two-body correlator from 2 replicas:")
@mpi_root for d in 0:m
r = rayleigh_replica_estimator(df2; op_name = "Op$(d+1)", skip = 5_000)
println(" G2($d) = $(r.f) ± $(r.σ_f)")
end
# A symmetry between `G2(d)` and `G2(m-d)` can be observed above, which is the expected outcome due
# the periodic boundary conditions.

# Finished !

using Test #src
r = rayleigh_replica_estimator(df2; op_name = "Op1", skip = 5_000) #src
@test r.f ≈ 0.6294961872457038 rtol = 0.01 #src
Comment on lines +196 to +198
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does #src mean that these lines are still executed by Literate.markdown(..., execute=true)? Because the test needs to happen at this point.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I gonna check it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to use #hide to make sure that the testing happens while Literate.jl 'execute's the file. See here:
https://fredrikekre.github.io/Literate.jl/v2/fileformat/#:~:text=Difference%20between%20%60%23src,out%20after%20execution.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is my interpretation of the Literate.jl documentation. If that is indeed the case, then the output of tests can be suppressed by ending with the line

nothing #hide