diff --git a/docs/requirements.yaml b/docs/requirements.yaml index 89ee3154..716b1ca9 100644 --- a/docs/requirements.yaml +++ b/docs/requirements.yaml @@ -9,4 +9,4 @@ dependencies: # Pip-only installs - pip: - nbsphinx - - docutils=0.16 + - docutils==0.16 diff --git a/docs/theory_implementation.rst b/docs/theory_implementation.rst index e9817108..d55086c5 100644 --- a/docs/theory_implementation.rst +++ b/docs/theory_implementation.rst @@ -163,8 +163,8 @@ other during the simulation ensemble. Below we decribe the details of these para * :code:`nst_sim`: The number of simulation steps, i.e. exchange frequency. This option assumes replicas with homogeneous simulation lengths. If this option is not specified, the number of steps defined in the template MDP file will be used. * :code:`mc_scheme`: The method for swapping simulations. Choices include :code:`same-state`/:code:`same_state`, :code:`metropolis`, and :code:`metropolis-eq`/:code:`metropolis_eq`. For more details, please refer to :ref:`doc_mc_schemes`. (Default: :code:`metropolis`) - * :code:`w_scheme`: The method for combining weights. Choices include :code:`None` (unspecified), :code:`exp-avg`/:code:`exp_avg`, and :code:`hist-exp-avg`/:code:`hist_exp_avg`. For more details, please refer to :ref:`doc_w_schemes`. (Default: :code:`hist-exp-avg`) - * :code:`N_cutoff`: The histogram cutoff. Only required if :code:`hist-exp-avg` is used. (Default: 0) + * :code:`w_scheme`: The method for combining weights. Choices include :code:`None` (unspecified), :code:`mean`, and :code:`geo-mean`/:code:`geo_mean`. For more details, please refer to :ref:`doc_w_schemes`. (Default: :code:`None`) + * :code:`N_cutoff`: The histogram cutoff. -1 means that no histogram correction will be performed. (Default: 1000) * :code:`n_ex`: The number of swaps to be proposed in one attempt. This works basically the same as :code:`-nex` flag in GROMACS. A recommended value is :math:`N^3`, where :math:`N` is the number of replicas. If `n_ex` is unspecified or specified as 0, neighboring swapping will be carried out. For more details, please refer to :ref:`doc_swap_basics`. (Default: 0) * :code:`outfile`: The output file for logging how replicas interact with each other. * :code:`verbose`: Whether a verbse log is wanted. @@ -519,14 +519,14 @@ for each replica. Note that the sum of the probabilities of each row (replica) s :: - State 0 1 2 3 4 5 - Rep 1 0.858 0.105 0.016 0.021 X X - Rep 2 X 0.642 0.117 0.193 0.048 X - Rep 3 X X 0.328 0.489 0.133 0.049 + State 0 1 2 3 4 5 + Rep 1 0.85800 0.10507 0.01571 0.02121 X X + Rep 2 X 0.64179 0.11724 0.19330 0.04767 X + Rep 3 X X 0.32809 0.48945 0.13339 0.04907 -Step 2: Calculate the probability ratios between each pair of replicas -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Step 2: Calculate the probability ratios +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Ideally (in the limit of inifinite simulation time), for the 3 states overlapped between replicas 1 and 2, we should have @@ -535,15 +535,16 @@ we should have where :math:`r_{2, 1}` is the "probability ratio" between replicas 2 and 1. However, the probability ratios corresopnding to different states will not be the same in practice, but will diverge with statistical noise -for short timescales. For example, in our case we have +for short timescales. For example, in our case we have the following ratios. (Note that here we calculate with +full precision but only present a few significant figures.) .. math:: - \frac{p_{21}}{p_{11}}=6.1143, \; \frac{p_{22}}{p_{12}} = 7.3125, \; \frac{p_{23}}{p_{13}}=9.1905 + \frac{p_{21}}{p_{11}}=6.10828, \; \frac{p_{22}}{p_{12}} = 7.46068, \; \frac{p_{23}}{p_{13}}=9.11249 Similarly, for states 2 to 4, we need to calculate the probability ratios between replicas 2 and 3: .. math:: - \frac{p_{32}}{p_{22}}=2.8034, \; \frac{p_{33}}{p_{23}} = 2.5337, \; \frac{p_{34}}{p_{24}}=2.7708 + \frac{p_{32}}{p_{22}}=2.79834, \; \frac{p_{33}}{p_{23}} = 2.53204, \; \frac{p_{34}}{p_{24}}=2.79834 Notably, in this case, there is no need to calculate :math:`r_{3, 1}` because :math:`r_{3, 1}` is already determined as :math:`r_{3, 1}=r_{3, 2} \times r_{2, 1}`. Also, there are only 2 overlapping states between replicas 1 and 3, @@ -559,14 +560,14 @@ or geometric averages. - Method 1: Simple average .. math:: - r_{2, 1} = \frac{1}{3}\left(\frac{p_{21}}{p_{11}} + \frac{p_{22}}{p_{12}} + \frac{p_{23}}{p_{13}} \right) \approx 7.5391, \; - r_{3, 2} = \frac{1}{3}\left(\frac{p_{32}}{p_{22}} + \frac{p_{33}}{p_{23}} + \frac{p_{34}}{p_{24}} \right) \approx 2.7026 + r_{2, 1} = \frac{1}{3}\left(\frac{p_{21}}{p_{11}} + \frac{p_{22}}{p_{12}} + \frac{p_{23}}{p_{13}} \right) \approx 7.56049, \; + r_{3, 2} = \frac{1}{3}\left(\frac{p_{32}}{p_{22}} + \frac{p_{33}}{p_{23}} + \frac{p_{34}}{p_{24}} \right) \approx 2.70957 - Method 2: Geometric average .. math:: - r_{2, 1}' = \left(\frac{p_{21}}{p_{11}} \times \frac{p_{22}}{p_{12}} \times \frac{p_{23}}{p_{13}} \right)^{\frac{1}{3}} \approx 7.4345, \; - r_{3, 2}' = \left(\frac{p_{32}}{p_{22}} \times \frac{p_{33}}{p_{23}} \times \frac{p_{34}}{p_{24}} \right)^{\frac{1}{3}} \approx 2.6999 + r_{2, 1}' = \left(\frac{p_{21}}{p_{11}} \times \frac{p_{22}}{p_{12}} \times \frac{p_{23}}{p_{13}} \right)^{\frac{1}{3}} \approx 7.46068, \; + r_{3, 2}' = \left(\frac{p_{32}}{p_{22}} \times \frac{p_{33}}{p_{23}} \times \frac{p_{34}}{p_{24}} \right)^{\frac{1}{3}} \approx 2.70660 Notably, if we take the negative natural logarithm of a probability ratio, we will get a free energy difference. For example, :math:`-\ln (p_{21}/p_{11})=f_{21}-f_{11}`, i.e. the free energy difference between state 1 in replica 2 and state 1 in replica 1. @@ -583,10 +584,10 @@ replicas 2 and 3 as follows: :: - State 0 1 2 3 4 5 - Rep 1 0.858 0.105 0.016 0.021 X X - Rep 2’ X 0.08529 0.01552 0.02560 0.00637 X - Rep 3’ X X 0.01610 0.02400 0.00653 0.00240 + State 0 1 2 3 4 5 + Rep 1 0.85800 0.10507 0.01571 0.02121 X X + Rep 2’ X 0.08489 0.01551 0.02557 0.00630 X + Rep 3’ X X 0.01602 0.02389 0.00651 0.00240 As shown above, we keep the probability vector of replica 1 the same but scale that for the other two. Specifically, the probability vector of replica 2' is that of replica 2 divided by :math:`r_21` and the probability vector of replica 3' is that of replica 3 @@ -596,10 +597,10 @@ Similarly, if we used the probability ratios :math:`r_{21}'` and :math:`r_{32}'` :: - State 0 1 2 3 4 5 - Rep 1 0.858 0.105 0.016 0.021 X X - Rep 2’ X 0.08645 0.01573 0.02596 0.00646 X - Rep 3’ X X 0.01634 0.02436 0.00663 0.00244 + State 0 1 2 3 4 5 + Rep 1 0.85800 0.10507 0.01614 0.02121 X X + Rep 2’ X 0.08602 0.01571 0.02591 0.00639 X + Rep 3’ X X 0.01625 0.02424 0.00661 0.00243 Step 5: Average and convert the probabilities @@ -611,25 +612,25 @@ and :math:`r_{32}`), we have the following probability vector of full range: :: - Final p 0.858 0.9515 0.01587 0.02353 0.00645 0.00240 + Final p 0.85800 0.09498 0.01575 0.02356 0.00641 0.00240 which can be converted to the following vector of alchemical weights (denoted as :math:`\vec{g}`) by taking the negative natural logarithm: :: - Final g 0.1532 0.0497 4.1433 3.7495 5.0437 6.0322 + Final g 0.15321 2.35412 4.15117 3.74831 5.05019 6.03420 For the second case (scaled with :math:`r_{21}'` and :math:`r_{32}'`), we have :: - Final p 0.858 0.09527 0.01602 0.02368 0.00654 0.00244 + Final p 0.85800 0.09507 0.01589 0.02371 0.00649 0.00243 -which can be converted to the following vector of alchemical weights (denoted as :math:`\vec{g'}`): +which can be converted to the following vector of alchemical weights (denoted as :math:`\vec{g}'`): :: - Final g 0.1532 2.3510 4.1339 3.7431 5.0437 6.0158 + Final g 0.15315 2.35314 4.14203 3.74204 5.03658 6.01981 Step 6: Determine the vector of alchemical weights for each replica @@ -640,20 +641,28 @@ we have: :: - State 0 1 2 3 4 5 - Rep 1 0.0 2.199 3.990 3.596 X X - Rep 2 X 0.0 1.791 1.397 2.691 X - Rep 3 X X 0.0 -0.393 0.900 1.889 + State 0 1 2 3 4 5 + Rep 1 0.00000 2.20097 3.99802 3.59516 X X + Rep 2 X 0.00000 1.79706 1.39419 2.69607 X + Rep 3 X X 0.00000 -0.40286 0.89901 1.88303 -Similarly, with :math:`\vec{g'}`, we have: +Similarly, with :math:`\vec{g}'`, we have: :: - State 0 1 2 3 4 5 - Rep 1 0.0 2.198 3.981 3.590 X X - Rep 2 X 0.0 1.783 1.392 2.693 X - Rep 3 X X 0.0 -0.391 0.910 1.882 + State 0 1 2 3 4 5 + Rep 1 0.00000 2.20000 3.98889 3.58889 X X + Rep 2 X 0.00000 1.78889 1.38889 2.68333 X + Rep 3 X X 0.00000 -0.40000 0.89444 1.87778 + +As a reference, here are the original weights: + +:: + State 0 1 2 3 4 5 + Rep 1 0.0 2.1 4.0 3.7 X X + Rep 2 X 0.0 1.7 1.2 2.6 X + Rep 3 X X 0.0 -0.4 0.9 1.9 As shown above, the results using Method 1 and Method 2 are pretty close to each other. Notably, regardless of which type of averages we took, in the case here we were assuming that each replica is equally weighted. In the diff --git a/ensemble_md/ensemble_EXE.py b/ensemble_md/ensemble_EXE.py index f4c8d534..97dd5b07 100644 --- a/ensemble_md/ensemble_EXE.py +++ b/ensemble_md/ensemble_EXE.py @@ -34,9 +34,10 @@ class EnsembleEXE: """ This class helps set up input files of an ensemble of expanded ensemble. - Note that for easier parsing of the mdp template file when initializing the class, please make sure that at least the following GROMACS mdp - parameters specified using dashes instead of underscores: :code:`ref-t`, :code:`vdw-lambdas`, - :code:`coul-lambdas`, :code:`restraint-lambdas`, and :code:`init-lambda-weights`. + Note that for easier parsing of the mdp template file when initializing the class, please make + sure that at least the following GROMACS mdp parameters specified using dashes instead of underscores: + :code:`ref-t`, :code:`vdw-lambdas`, :code:`coul-lambdas`, :code:`restraint-lambdas`, and + :code:`init-lambda-weights`. """ def __init__(self, yml_file): @@ -73,7 +74,7 @@ def __init__(self, yml_file): # Key: Optional argument; Value: Default value optional_args = { "mc_scheme": "metropolis", - "w_scheme": "exp-avg", + "w_scheme": None, "N_cutoff": 1000, "n_ex": 0, # neighbor swaps "outfile": "results.txt", @@ -84,6 +85,9 @@ def __init__(self, yml_file): setattr(self, i, optional_args[i]) # Step 3: Check if the parameters in the YAML file is well-defined + if self.w_scheme not in [None, 'mean', 'geo-mean']: + raise ParameterError("The specified weight combining scheme is not available. Options include None, 'mean', and 'geo-mean'/'geo_mean'.") # noqa: E501 + if self.mc_scheme not in ['same-state', 'same_state', 'metropolis', 'metropolis-eq', 'metropolis_eq']: raise ParameterError("The specified MC scheme is not available. Options include 'same-state', 'metropolis', and 'metropolis-eq'.") # noqa: E501 @@ -97,10 +101,11 @@ def __init__(self, yml_file): if getattr(self, i) <= 0: raise ParameterError(f"The parameter '{i}' should be positive.") - params_non_neg = ['N_cutoff', 'n_ex'] # non-negative parameters - for i in params_non_neg: - if getattr(self, i) < 0: - raise ParameterError(f"The parameter '{i}' should be non-negative.") + if self.n_ex < 0: + raise ParameterError("The parameter 'n_ex' should be non-negative.") + + if self.N_cutoff < 0 and self.N_cutoff != -1: + raise ParameterError("The parameter 'N_cutoff' should be non-negative unless no histogram correction is needed, i.e. N_cutoff = -1.") # noqa: E501 params_str = ['mdp', 'outfile'] for i in params_str: @@ -194,8 +199,6 @@ def map_lambda2state(self): ] = i else: self.lambda_dict[(self.template["vdw-lambdas"][i],)] = i - print(self.lambda_dict) - print(self.state_ranges) self.lambda_ranges = [[list(self.lambda_dict.keys())[j] for j in self.state_ranges[i]]for i in range(len(self.state_ranges))] # noqa: E501 def initialize_MDP(self, idx): @@ -292,9 +295,7 @@ def extract_final_dhdl_info(self, dhdl_files): A list of lambda vectors corresponding to the last sampled states of all simulations. """ states, lambda_vecs = [], [] - if self.verbose is True: - print('\n', end='') - print("Below are the final states being visited:") + print("\nBelow are the final states being visited:") for j in range(self.n_sim): dhdl = extract_dHdl(dhdl_files[j], T=self.temp) lambda_vecs.append(dhdl.index[-1][1:]) @@ -599,92 +600,110 @@ def histogram_correction(self, weights, counts): weights : list An updated list of lists of corected weights. """ - print("\nPerforming histogram correction for the lambda weights ...") - for i in range(len(weights)): # loop over the replicas - print(f" Counts of rep {i}:\t\t{counts[i]}") - print( - f' Original weights of rep {i}:\t{[float(f"{k:.3f}") for k in weights[i]]}' - ) - for j in range(1, len(weights[i])): # loop over the alchemical states - if counts[i][j - 1] != 0 and counts[i][j - 1] != 0: - if np.min([counts[i][j - 1], counts[i][j]]) > self.N_cutoff: - weights[i][j] += np.log(counts[i][j - 1] / counts[i][j]) - print( - f' Corrected weights of rep {i}:\t{[float(f"{k:.3f}") for k in weights[i]]}\n' - ) + if self.N_cutoff == -1: + print('\nNote: No histogram correction will be performed.') + else: + if self.verbose is True: + print("\nPerforming histogram correction for the lambda weights ...") + else: + print("\nPerforming histogram correction for the lambda weights ...", end="") + + for i in range(len(weights)): # loop over the replicas + if self.verbose is True: + print(f" Counts of rep {i}:\t\t{counts[i]}") + print(f' Original weights of rep {i}:\t{[float(f"{k:.3f}") for k in weights[i]]}') + + for j in range(1, len(weights[i])): # loop over the alchemical states + if counts[i][j - 1] != 0 and counts[i][j - 1] != 0: + if np.min([counts[i][j - 1], counts[i][j]]) > self.N_cutoff: + weights[i][j] += np.log(counts[i][j - 1] / counts[i][j]) + + if self.verbose is True: + print(f' Corrected weights of rep {i}:\t{[float(f"{k:.3f}") for k in weights[i]]}\n') + + if self.verbose is False: + print(' Done') + return weights - def combine_weights(self, weights, counts, swap): + def combine_weights(self, weights, method): """ - Combines the lambda weights of the exchanging replicas as needed. + Combine alchemical weights across multiple replicas using probability ratios. Parameters ---------- weights : list - A list of Wang Landau weights of all simulations. - counts : list - A list of final counts of all simulations. - swap : tuple - A tuple of a pair of simulation indices to be swapped. + A list of Wang-Landau weights of ALL simulations + method : str + Method for combining probabilities and probability ratios. Choices include "None", "mean" and "geo-mean". Returns ------- weights : list - A list of updated Wang Landau weights of all simulations. Note that the weights - of the simulations not involved in the exchange will remain the same. + A list of original (method==None) or modified Wang-Landau weights of ALL simulations. """ - if self.w_scheme is None: - pass + if method is None: + print('Note: No weight combination will be performed.') + return weights else: - # Step 1: "Correct" the weights as needed (hist-exp-avg or mbar-exp-avg). - # If min(N_{k-1}, N_k) < cutoff, there's no correction and hist-exp-avg/mbar-exp-avg reduces to exp-avg. - if self.w_scheme == "hist-exp-avg": - weights = self.histogram_correction(weights, counts) - elif self.w_scheme == "mbar-exp-avg": - pass - - # Step 2: Combine the weights - print( - f"Performing weight combination between simulation {swap[0]} and simulation {swap[1]} ... " - ) - overlap = self.state_ranges[swap[0]].intersection( - self.state_ranges[swap[1]] - ) - print( - f" Alchemical range of simulation {swap[0]}: {list(self.state_ranges[swap[0]])}" - ) - print( - f" Alchemical range of simulation {swap[1]}: {list(self.state_ranges[swap[1]])}" - ) - print(f" Overlapped alchemical ranges: {list(overlap)}\n") - - # swap[0] is always smaller than swap[1] - g_0 = np.array(weights[swap[0]][-len(overlap):]) # the last N values - g_1 = np.array( - weights[swap[1]][: len(overlap)] - ) # the first N values (g_1[0] must be 0) - shifted_0 = g_0 - g_0[0] # will be combined with g_1 to generate modified_1 - shifted_1 = g_1 + g_0[0] # will be combined with g_0 to generate modified_0 - print(f' Original g^i: {[float(f"{i:.3f}") for i in weights[swap[0]]]}') - print(f' Original g^j: {[float(f"{i:.3f}") for i in weights[swap[1]]]}') - - if self.w_scheme == "avg": - # DEPRECATED! Kept here just for conveneint testing. - modified_0 = (shifted_1 + g_0) / 2 - modified_1 = (shifted_0 + g_1) / 2 - weights[swap[0]][-len(overlap):] = modified_0 - weights[swap[1]][: len(overlap)] = modified_1 + if self.verbose is True: + print(f'Performing weight combination with the {method} method ...') else: - # This includes exp-avg, hist-exp-avg or mbar-exp-avg - modified_0 = -np.log((np.exp(-g_0) + np.exp(-shifted_1)) / 2) - modified_1 = -np.log((np.exp(-g_1) + np.exp(-shifted_0)) / 2) - weights[swap[0]][-len(overlap):] = modified_0 - weights[swap[1]][: len(overlap)] = modified_1 + print(f'Performing weight combination with the {method} method ...', end='') - print(f' Modified g^i: {[float(f"{i:.3f}") for i in weights[swap[0]]]}') - print(f' Modified g^j: {[float(f"{i:.3f}") for i in weights[swap[1]]]}') + if self.verbose is True: + w = np.round(weights, decimals=3).tolist() # just for printing + print(' Original weights:') + for i in range(len(w)): + print(f' Rep {i}: {w[i]}') + + # Step 1: Convert the weights into probabilities + weights = np.array(weights) + prob = np.array([[np.exp(-i)/np.sum(np.exp(-weights[j])) for i in weights[j]] for j in range(len(weights))]) # noqa: E501 + + # Step 2: Caclulate the probability ratios (after figuring out overlapped states between adjacent replicas) + overlapped = [self.state_ranges[i].intersection(self.state_ranges[i + 1]) for i in range(len(self.state_ranges) - 1)] # noqa: E501 + prob_ratio = [prob[i + 1][: len(overlapped[i])] / prob[i][-len(overlapped[i]):] for i in range(len(overlapped))] # noqa: E501 + + # Step 3: Average the probability ratios + avg_ratio = [1] # This allows easier scaling since the first prob vector stays the same. + if method == 'mean': + avg_ratio.extend([np.mean(prob_ratio[i]) for i in range(len(prob_ratio))]) + elif method == 'geo-mean': + avg_ratio.extend([np.prod(prob_ratio[i])**(1/len(prob_ratio[i])) for i in range(len(prob_ratio))]) + + # Step 4: Scale the probabilities for each replica + scaled_prob = np.array([prob[i] / np.prod(avg_ratio[: i + 1]) for i in range(len(prob))]) + + # Step 5: Average and convert the probabilities + final_p = [] + for i in range(self.n_tot): # global state index + p = [] # a list of probabilities to be averaged for each state + for j in range(len(self.state_ranges)): # j can be regared as the replica index + if i in self.state_ranges[j]: + local_idx = i - j * self.s + p.append(scaled_prob[j][local_idx]) + if method == 'mean': + final_p.append(np.mean(p)) + elif method == 'geo-mean' or method == 'geo_mean': + final_p.append(np.prod(p) ** (1 / len(p))) + + final_g = -np.log(final_p) + + # Step 6: Determine the vector of alchemical weights for each replica + weights = [list(final_g[i: i + self.n_sub] - final_g[i: i + self.n_sub][0]) for i in range(self.n_sim)] + weights = np.round(weights, decimals=5).tolist() - return weights + if self.verbose is True: + w = np.round(weights, decimals=3).tolist() # just for printing + print('\n Modified weights:') + for i in range(len(w)): + print(f' Rep {i}: {w[i]}') + + if self.verbose is False: + print(' DONE') + + return weights def run_EEXE(self, n): """ @@ -702,10 +721,8 @@ def run_EEXE(self, n): if rank == 0: dir_before = [ - i for i in os.listdir(".") if os.path.isdir(os.path.join(".", i)) - ] - if self.verbose is True: - print("Preparing the tpr files for the simulation ensemble...") + i for i in os.listdir(".") if os.path.isdir(os.path.join(".", i))] + print("Preparing the tpr files for the simulation ensemble ...", end="") grompp = gmx.commandline_operation( "gmx", @@ -731,8 +748,8 @@ def run_EEXE(self, n): utils.gmx_output(grompp, self.verbose) # Run all the simulations simultaneously using gmxapi - if rank == 0 and self.verbose is True: - print("Running an ensemble of simulations ...") + if rank == 0: + print("Running an ensemble of simulations ...", end="") if self.parallel is True: tpr = [f'{grompp.output.file["-o"].result()[i]}' for i in range(self.n_sim)] diff --git a/ensemble_md/tests/data/other_yamls/0.yaml b/ensemble_md/tests/data/other_yamls/0.yaml new file mode 100644 index 00000000..b36bddbc --- /dev/null +++ b/ensemble_md/tests/data/other_yamls/0.yaml @@ -0,0 +1,9 @@ +# User-defined parameters +parallel: False # Whether the replicas of EXEE should be run in parallel or not +n_sim: 4 # Number of replica simulations +n_iterations: 10 # Number of iterations +nst_sim: 500 # Number of simulation steps for each homogeneous replica +s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) +w_scheme: "test" +nst_sim: 500 # Number of simulation steps for each homogeneous replica +mdp: 'ensemble_md/tests/data/expanded.mdp' # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/data/other_yamls/10.yaml b/ensemble_md/tests/data/other_yamls/10.yaml new file mode 100644 index 00000000..d0901290 --- /dev/null +++ b/ensemble_md/tests/data/other_yamls/10.yaml @@ -0,0 +1,8 @@ +# User-defined parameters +parallel: False # Whether the replicas of EXEE should be run in parallel or not +n_sim: 4 # Number of replica simulations +n_iterations: 10 # Number of iterations +nst_sim: 500 # Number of simulation steps for each homogeneous replica +s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) +nst_sim: 500 # Number of simulation steps for each homogeneous replica +mdp: 'ensemble_md/tests/data/other_mdps/2.mdp' # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/data/other_yamls/6.yaml b/ensemble_md/tests/data/other_yamls/6.yaml index fd3d8b09..75e9d39a 100644 --- a/ensemble_md/tests/data/other_yamls/6.yaml +++ b/ensemble_md/tests/data/other_yamls/6.yaml @@ -5,4 +5,5 @@ n_iterations: 10 # Number of iterations nst_sim: 500 # Number of simulation steps for each homogeneous replica s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) nst_sim: 500 # Number of simulation steps for each homogeneous replica -mdp: 3 # The MDP template that has the whole range of λ values +N_cutoff: -5 +mdp: 'ensemble_md/tests/data/expanded.mdp' # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/data/other_yamls/7.yaml b/ensemble_md/tests/data/other_yamls/7.yaml index 21953dfe..fd3d8b09 100644 --- a/ensemble_md/tests/data/other_yamls/7.yaml +++ b/ensemble_md/tests/data/other_yamls/7.yaml @@ -1,8 +1,8 @@ # User-defined parameters -parallel: 3 # Whether the replicas of EXEE should be run in parallel or not +parallel: False # Whether the replicas of EXEE should be run in parallel or not n_sim: 4 # Number of replica simulations n_iterations: 10 # Number of iterations nst_sim: 500 # Number of simulation steps for each homogeneous replica s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) nst_sim: 500 # Number of simulation steps for each homogeneous replica -mdp: 'ensemble_md/tests/data/expanded.mdp' # The MDP template that has the whole range of λ values +mdp: 3 # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/data/other_yamls/8.yaml b/ensemble_md/tests/data/other_yamls/8.yaml index edfd5c3b..21953dfe 100644 --- a/ensemble_md/tests/data/other_yamls/8.yaml +++ b/ensemble_md/tests/data/other_yamls/8.yaml @@ -1,8 +1,8 @@ # User-defined parameters -parallel: False # Whether the replicas of EXEE should be run in parallel or not +parallel: 3 # Whether the replicas of EXEE should be run in parallel or not n_sim: 4 # Number of replica simulations n_iterations: 10 # Number of iterations nst_sim: 500 # Number of simulation steps for each homogeneous replica s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) nst_sim: 500 # Number of simulation steps for each homogeneous replica -mdp: 'ensemble_md/tests/data/other_mdps/1.mdp' # The MDP template that has the whole range of λ values +mdp: 'ensemble_md/tests/data/expanded.mdp' # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/data/other_yamls/9.yaml b/ensemble_md/tests/data/other_yamls/9.yaml index d0901290..edfd5c3b 100644 --- a/ensemble_md/tests/data/other_yamls/9.yaml +++ b/ensemble_md/tests/data/other_yamls/9.yaml @@ -5,4 +5,4 @@ n_iterations: 10 # Number of iterations nst_sim: 500 # Number of simulation steps for each homogeneous replica s: 1 # Shift in λ ranges (e.g. s = 2 if λ_2 = (2, 3, 4) & λ_3 = (4, 5, 6)) nst_sim: 500 # Number of simulation steps for each homogeneous replica -mdp: 'ensemble_md/tests/data/other_mdps/2.mdp' # The MDP template that has the whole range of λ values +mdp: 'ensemble_md/tests/data/other_mdps/1.mdp' # The MDP template that has the whole range of λ values diff --git a/ensemble_md/tests/test_ensemble_EXE.py b/ensemble_md/tests/test_ensemble_EXE.py index 7dc94da1..f96145f4 100644 --- a/ensemble_md/tests/test_ensemble_EXE.py +++ b/ensemble_md/tests/test_ensemble_EXE.py @@ -29,7 +29,7 @@ def test_init_1(self): k = 1.380649e-23 NA = 6.0221408e23 assert EEXE.mc_scheme == "metropolis" - assert EEXE.w_scheme == "exp-avg" + assert EEXE.w_scheme is None assert EEXE.N_cutoff == 1000 assert EEXE.n_ex == 0 assert EEXE.outfile == "results.txt" @@ -66,6 +66,10 @@ def test_init_1(self): ] def test_init_2(self): + yaml_0 = os.path.join(input_path, "other_yamls/0.yaml") + with pytest.raises(ParameterError, match="The specified weight combining scheme is not available. Options include None, 'mean', and 'geo-mean'/'geo_mean'."): # noqa: E501 + E0 = EnsembleEXE(yaml_0) # noqa: F841 + yaml_1 = os.path.join(input_path, "other_yamls/1.yaml") with pytest.raises(ParameterError, match=f"Required parameter 'parallel' not specified in {yaml_1}"): E1 = EnsembleEXE(yaml_1) # noqa: F841 @@ -87,16 +91,20 @@ def test_init_2(self): E5 = EnsembleEXE(yaml_5) # noqa: F841 yaml_6 = os.path.join(input_path, "other_yamls/6.yaml") - with pytest.raises(ParameterError, match="The parameter 'mdp' should be a string."): + with pytest.raises(ParameterError, match="The parameter 'N_cutoff' should be non-negative unless no histogram correction is needed, i.e. N_cutoff = -1."): # noqa: E501 E6 = EnsembleEXE(yaml_6) # noqa: F841 yaml_7 = os.path.join(input_path, "other_yamls/7.yaml") - with pytest.raises(ParameterError, match="The parameter 'parallel' should be a boolean variable."): + with pytest.raises(ParameterError, match="The parameter 'mdp' should be a string."): E7 = EnsembleEXE(yaml_7) # noqa: F841 yaml_8 = os.path.join(input_path, "other_yamls/8.yaml") - E8 = EnsembleEXE(yaml_8) - assert E8.lambda_dict == { + with pytest.raises(ParameterError, match="The parameter 'parallel' should be a boolean variable."): + E8 = EnsembleEXE(yaml_8) # noqa: F841 + + yaml_9 = os.path.join(input_path, "other_yamls/9.yaml") + E9 = EnsembleEXE(yaml_9) + assert E9.lambda_dict == { (0.1,): 0, (0.2,): 1, (0.3,): 2, @@ -107,16 +115,16 @@ def test_init_2(self): (0.8,): 7, (1,): 8, } # noqa: E501 - assert E8.lambda_ranges == [ + assert E9.lambda_ranges == [ [(0.1,), (0.2,), (0.3,), (0.4,), (0.5,), (0.6,)], [(0.2,), (0.3,), (0.4,), (0.5,), (0.6,), (0.7,)], [(0.3,), (0.4,), (0.5,), (0.6,), (0.7,), (0.8,)], [(0.4,), (0.5,), (0.6,), (0.7,), (0.8,), (1.0,)], ] - yaml_9 = os.path.join(input_path, "other_yamls/9.yaml") - E9 = EnsembleEXE(yaml_9) - assert E9.lambda_dict == { + yaml_10 = os.path.join(input_path, "other_yamls/10.yaml") + E10 = EnsembleEXE(yaml_10) + assert E10.lambda_dict == { (0, 0, 0): 0, (0.25, 0, 0): 1, (0.5, 0, 0): 2, @@ -127,7 +135,7 @@ def test_init_2(self): (1, 0.75, 0.8): 7, (1, 1, 1): 8, } # noqa: E501 - assert E9.lambda_ranges == [ + assert E10.lambda_ranges == [ [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.5, 0.0, 0.0), (0.75, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.25, 0.2)], [(0.25, 0.0, 0.0), (0.5, 0.0, 0.0), (0.75, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.25, 0.2), (1.0, 0.5, 0.6)], [(0.5, 0.0, 0.0), (0.75, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.25, 0.2), (1.0, 0.5, 0.6), (1.0, 0.75, 0.8)], @@ -142,7 +150,7 @@ def test_print_params(self, capfd): L += "\nImportant parameters of EXEE\n============================\n" L += f"gmxapi version: 0.3.2\nensemble_md version: {ensemble_md.__version__}\n" L += "Output log file: results.txt\nVerbose log file: True\nWhether the replicas run in parallel: False\n" - L += "MC scheme for swapping simulations: metropolis\nScheme for combining weights: exp-avg\n" + L += "MC scheme for swapping simulations: metropolis\nScheme for combining weights: None\n" L += "Histogram cutoff: 1000\nNumber of replicas: 4\nNumber of iterations: 10\n" L += "Number of exchanges in one attempt: 0\n" L += "Length of each replica: 1.0 ps\nTotal number of states: 9\n" @@ -346,6 +354,32 @@ def test_historgam_correction(self): weights_2 = EEXE.histogram_correction(weights_2, counts_2) assert weights_2 == [[0, 10.304, 20.073, 29.364 + np.log(5545 / 5955)]] + def test_combine_weights(self): + EEXE.n_tot = 6 + EEXE.n_sub = 4 + EEXE.s = 1 + EEXE.n_sim = 3 + EEXE.state_ranges = [{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}] + weights = [[0, 2.1, 4.0, 3.7], [0, 1.7, 1.2, 2.6], [0, -0.4, 0.9, 1.9]] + + w1 = EEXE.combine_weights(weights, method=None) + w2 = EEXE.combine_weights(weights, method='mean') + w3 = EEXE.combine_weights(weights, method='geo-mean') + + print(w1) + print(w2) + + assert w1 == weights + assert w2 == [ + [0.0, 2.20097, 3.99803, 3.59516], + [0.0, 1.79706, 1.39419, 2.69607], + [0.0, -0.40286, 0.89901, 1.88303]] + assert w3 == [ + [0.0, 2.2, 3.98889, 3.58889], + [0.0, 1.78889, 1.38889, 2.68333], + [0.0, -0.4, 0.89444, 1.87778]] + + """ def combine_w_inputs(self): swap = (0, 1) weights = [[0, 2.1, 4.0, 3.7, 4.8], [0, -0.4, 0.7, 1.5, 2.4]] @@ -414,6 +448,6 @@ def test_combine_weights(self): ] ), ) - + """ def test_run_EEXE(self): pass diff --git a/ensemble_md/utils.py b/ensemble_md/utils.py index 736e6ce3..b386db3e 100644 --- a/ensemble_md/utils.py +++ b/ensemble_md/utils.py @@ -53,21 +53,17 @@ def gmx_output(gmx_obj, verbose=False): rtn_code = gmx_obj.output.returncode.result() if type(rtn_code) == list: # simulation ensemble if sum(rtn_code) == 0: - if verbose is True: - print("The process was executed successfully.") + print(" DONE") else: - print(f"Return codes: {rtn_code}") + print(f" Return codes: {rtn_code}") for i in range(len(rtn_code)): if rtn_code[i] != 0: - print( - f"STDERR of the process:\n\n {gmx_obj.output.stderr.result()[i]}\n" - ) + print(f"\nSTDERR of the process:\n\n {gmx_obj.output.stderr.result()[i]}\n") else: # single simulation if rtn_code == 0: - if verbose is True: - print("The process was executed successfully.") + print(" DONE") else: - print(f"STDERR of the process:\n\n {gmx_obj.output.stderr.result()}\n") + print(f"\nSTDERR of the process:\n\n {gmx_obj.output.stderr.result()}\n") except AttributeError: raise RuntimeError(f"{repr(gmx_obj)} is not a commandline_operation.")