Merge pull request #9 from wehs7661/remove_gmxapi

Remove the use of gmxapi
wehs7661 · May 1, 2023 · 239776a
2 parents c71ad05 + 156d70b

Showing 12 changed files with 222 additions and 372 deletions.
diff --git a/.circleci/config.yml b/.circleci/config.yml
@@ -48,12 +48,12 @@ jobs:
       - run:
           name: Install the ensemble_md package
           command: |
-            export gmxapi_ROOT=$HOME/pkgs  # set the envrionment variable so gmxapi can be installed successfully
-            python3 -m pip install '.[gmxapi]'
+            python3 -m pip install .
 
       - run:
           name: Run unit tests
           command: |
+            source $HOME/pkgs/bin/GMXRC
             pip3 install pytest 
             pip3 install pytest-cov
             pytest -vv --disable-pytest-warnings --cov=ensemble_md --cov-report=xml --color=yes ensemble_md/tests/

diff --git a/.codecov.yml b/.codecov.yml
diff --git a/.lgtm.yml b/.lgtm.yml
diff --git a/docs/conf.py b/docs/conf.py
@@ -176,4 +176,4 @@
 # autoclass_content = 'both'
 autodoc_member_order = 'bysource'
 napoleon_attr_annotations = True
-autodoc_mock_imports = ["mpi4py", "gmxapi"]
+autodoc_mock_imports = ["mpi4py"]  # gmxapi was also mocked in older versions of ensemble_md
diff --git a/docs/getting_started.rst b/docs/getting_started.rst
@@ -4,21 +4,19 @@
 running, and analyzing GROMACS simulation ensembles. The current implementation is
 mainly for synchronous ensemble of expanded ensemble (EEXE), but we will develop
 methods like asynchronous EEXE, or ensemble of alchemical metadynamics in the future.
-In the current implementation, `gmxapi`_, which is a higher level Python API of GROMACS,
+In the current implementation, the module :code:`subprocess`
 is used to launch GROMACS commands, but we will switch to `SCALE-MS`_ for this purpose
 in the future when possible.
 
 
-.. _`gmxapi`: https://manual.gromacs.org/current/gmxapi/
 .. _`SCALE-MS`: https://scale-ms.readthedocs.io/en/latest/
 
 
 2. Installation
 ===============
 2.1. Requirements
 -----------------
-Before installing :code:`ensemble_md`, one should have working versions of `GROMACS`_
-and `gmxapi`_. Please refer to the linked documentations for full installation instructions.
+Before installing :code:`ensemble_md`, one should have a working version of `GROMACS`_. Please refer to the linked documentation for full installation instructions.
 All the other pip-installable dependencies of :code:`ensemble_md` (specified in :code:`setup.py` of the package)
 will be automatically installed during the installation of the package.
 
@@ -31,14 +29,6 @@ will be automatically installed during the installation of the package.
 
     pip install ensemble-md 
 
-By default, the command above does not install :code:`gmxapi`, so one needs to either
-following the full installation instruction of :code:`gmxapi`, or install
-:code:`gmxapi` along with the package (after sourcing the GROMACS excutable, e.g. 
-:code:`/usr/local/gromacs/bin/GMXRC`) with the following command:
-::
-
-    pip install ensemble-md[gmxapi]
-
 2.3. Installation from source
 -----------------------------
 One can also install :code:`ensemble_md` from the source code, which is available in our
@@ -49,8 +39,7 @@ One can also install :code:`ensemble_md` from the source code, which is availabl
     cd ensemble_md/
     pip install .
 
-To install the pacakg along with :code:`gmxapi`, replace the last command with 
-:code:`pip install '.[gmxapi]'`. If you are interested in contributing to the project, append the 
+If you are interested in contributing to the project, append the flag
 :code:`-e` to the last command to install the project in editable mode
 so that changes you make in the source code take effect without re-installing the package.
 (Pull requests to the project repository are welcome!)

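The revised introduction states that :code:`subprocess` is now used to launch GROMACS commands. The following is a minimal sketch of that general pattern, not the code used by :code:`ensemble_md`. The helper name `run_command` is hypothetical, and the Python interpreter stands in for the GROMACS executable so that the sketch runs on a machine without GROMACS.

```python
import subprocess
import sys


def run_command(arguments):
    """Run a command and return (return code, stdout, stderr) as text.

    Hypothetical helper for illustration only; not part of ensemble_md.
    """
    # Passing a list of arguments (instead of one shell string) avoids quoting issues.
    process = subprocess.run(arguments, capture_output=True, text=True)
    return process.returncode, process.stdout, process.stderr


# Stand-in command so the sketch runs without GROMACS. With GROMACS sourced,
# the list could instead be something like ["gmx", "-version"].
demo_command = [sys.executable, "-c", "print('hello from a child process')"]
returncode, stdout, stderr = run_command(demo_command)

if returncode != 0:
    print(f"Command failed with return code {returncode}:\n{stderr}")
else:
    print(stdout.strip())
```

Checking the return code explicitly matters here because :code:`subprocess.run` does not raise an exception when the child process fails unless :code:`check=True` is passed.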
diff --git a/docs/simulations.rst b/docs/simulations.rst
@@ -4,7 +4,11 @@
 ===============================
 :code:`ensemble_md` provides three command-line interfaces (CLI), including :code:`explore_EEXE`, :code:`run_EEXE` and :code:`analyze_EEXE`.
 :code:`explore_EEXE` helps the user to figure out possible combinations of EEXE parameters, while :code:`run_EEXE` and :code:`analyze_EEXE`
-can be used to perform and analyze EEXE simulations, respectively. Here is the help message of :code:`explore_EEXE`:
+can be used to perform and analyze EEXE simulations, respectively. Below we provide more details about each of these CLIs.
+
+1.1. CLI `explore_EEXE`
+-----------------------
+Here is the help message of :code:`explore_EEXE`:
 
 ::
 
@@ -25,7 +29,9 @@ can be used to perform and analyze EEXE simulations, respectively. Here is the h
                 replicas.
 
 
-And here is the help message of :code:`run_EEXE`:
+1.2. CLI `run_EEXE`
+-------------------
+Here is the help message of :code:`run_EEXE`:
 
 ::
 
@@ -52,6 +58,18 @@ And here is the help message of :code:`run_EEXE`:
                             The maximum number of warnings in parameter specification to be
                             ignored.
 
+In our current implementation, it is assumed that all replicas of an EEXE simulation are run in
+parallel using MPI. Consequently, performing an EEXE simulation using :code:`run_EEXE` requires a command-line interface
+to launch MPI processes, such as :code:`mpirun` or :code:`mpiexec`. For example, on a 128-core node
+in a cluster, one may use :code:`mpirun -np 4 run_EEXE` (or :code:`mpiexec -n 4 run_EEXE`) to run an EEXE simulation composed of 4
+replicas with 4 MPI processes. Note that in this case, it is often recommended to explicitly specify
+the resources allocated to each replica. For example, one can specify :code:`{'-nt': 32}`
+for the EEXE parameter :code:`runtime_args` (specified in the input YAML file, see :ref:`doc_EEXE_parameters`),
+so that each of the 4 replicas uses 32 threads (assuming thread-MPI GROMACS), taking full advantage
+of all 128 cores.
+
+1.3. CLI `analyze_EEXE`
+-----------------------
 Finally, here is the help message of :code:`analyze_EEXE`:
 
 ::
@@ -119,11 +137,9 @@ other during the simulation ensemble. Check :ref:`doc_parameters` for more detai
 
 Step 2: Run the 1st iteration
 -----------------------------
-With all the input files/parameters set up in the previous run, one can use :obj:`.run_EEXE` to run the 
-first iteration. Specifically, :obj:`.run_EEXE` uses :code:`gmxapi.commandline_operation` to launch an GROMACS
-:code:`grompp` command to generate the input MDP file. Then, if :code:`parallel` is specified as :code:`True` 
-in the input YAML file, :code:`gmxapi.mdrun` will be used to run GROMACS :code:`mdrun` commands in parallel, 
-otherwise :code:`gmxapi.commandline_operation` will be used to run simulations serially.
+With all the input files/parameters set up in the previous step, one can run the first iteration
+using :obj:`.run_EEXE`, which uses :code:`subprocess.run` to launch GROMACS :code:`grompp`
+and :code:`mdrun` commands in parallel.
 
 Step 3: Set up the new iteration
 --------------------------------
@@ -194,7 +210,15 @@ In the current implementation of the algorithm, 22 parameters can be specified i
 Note that the two CLIs :code:`run_EEXE` and :code:`analyze_EEXE` share the same input YAML file, so we also
 include parameters for data analysis here.
 
-3.1. Simulation inputs
+3.1. GROMACS executable
+-----------------------
+
+  - :code:`gmx_executable`: (Required)
+      The GROMACS executable to be used to run the EEXE simulation. The value could be as simple as :code:`gmx`
+      or :code:`gmx_mpi` if the executable has been sourced. Otherwise, the full path of the executable (e.g.
+      :code:`/usr/local/gromacs/bin/gmx`, the path returned by the command :code:`which gmx`) should be used.
+
+3.2. Simulation inputs
 ----------------------
 
   - :code:`gro`: (Required)
@@ -204,11 +228,11 @@ include parameters for data analysis here.
   - :code:`mdp`: (Required)
       The MDP template that has the whole range of :math:`λ` values.
 
-3.2. EEXE parameters
+.. _doc_EEXE_parameters:
+
+3.3. EEXE parameters
 --------------------
 
-  - :code:`parallel`: (Required)
-      Whether the replicas of EEXE should be run in parallel or not.
   - :code:`n_sim`: (Required)
       The number of replica simulations.
   - :code:`n_iter`: (Required)
@@ -241,7 +265,7 @@ include parameters for data analysis here.
       Additional runtime arguments to be appended to the GROMACS :code:`mdrun` command provided in a dictionary. 
       For example, one could have :code:`{'-nt': 16}` to run the simulation using 16 threads.
 
-3.3. Output settings
+3.4. Output settings
 --------------------
   - :code:`verbose`: (Optional, Default: :code:`True`)
       Whether a verbose log is wanted. 
@@ -250,7 +274,7 @@ include parameters for data analysis here.
 
 .. _doc_analysis_params:
 
-3.4. Data analysis
+3.5. Data analysis
 ------------------
   - :code:`msm`: (Optional, Default: :code:`False`)
       Whether to build Markov state models (MSMs) for the EEXE simulation and perform relevant analysis.
@@ -271,20 +295,21 @@ include parameters for data analysis here.
   - :code:`seed`: (Optional, Default: None)
       The random seed to use in bootstrapping.
 
-3.5. A template input YAML file
+3.6. A template input YAML file
 -------------------------------
 For convenience, here is a template of the input YAML file, with each optional parameter specified with the default and required 
 parameters left with a blank. Note that specifying :code:`null` is the same as leaving the parameter unspecified (i.e. :code:`None`).
 
 ::
+    # Section 1: GROMACS executable
+    gmx_executable:
 
-    # Section 1: Simulation inputs
+    # Section 2: Simulation inputs
     gro:
     top:
     mdp:
 
-    # Section 2: EEXE parameters
-    parallel:
+    # Section 3: EEXE parameters
     n_sim:
     n_iter:
     s:
@@ -297,11 +322,11 @@ parameters left with a blank. Note that specifying :code:`null` is the same as l
     grompp_args: null
     runtime_args: null
 
-    # Section 3: Output settings
+    # Section 4: Output settings
     verbose: True
     n_ckpt: 100
 
-    # Section 4: Data analysis
+    # Section 5: Data analysis
     msm: False
     free_energy: False 
     df_spacing: 1

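The documentation changes above introduce the :code:`gmx_executable` parameter and give :code:`{'-nt': 32}` as a value of :code:`runtime_args` for 4 replicas on a 128-core node. The sketch below shows one way such a dictionary could be flattened into an :code:`mdrun` argument list. How :code:`ensemble_md` actually assembles the command is not shown in this diff, so the helpers `threads_per_replica` and `build_mdrun_command` and the file prefix `sys_EE` are assumptions made for illustration.

```python
import shutil


def threads_per_replica(n_cores, n_replicas):
    """Split the cores of a node evenly among replicas (integer division)."""
    return n_cores // n_replicas


def build_mdrun_command(gmx_executable, deffnm, runtime_args=None):
    """Return an argument list for an mdrun call.

    Hypothetical helper for illustration only; not part of ensemble_md.
    """
    command = [gmx_executable, "mdrun", "-deffnm", deffnm]
    # runtime_args maps a flag to its value, e.g. {'-nt': 32}; None means no extras.
    for flag, value in (runtime_args or {}).items():
        command.extend([flag, str(value)])
    return command


n_threads = threads_per_replica(128, 4)
command = build_mdrun_command("gmx", "sys_EE", {"-nt": n_threads})
print(" ".join(command))

# shutil.which returns None when the executable cannot be found on PATH,
# which is the case where a full path would be needed for gmx_executable.
if shutil.which(command[0]) is None:
    print(f"'{command[0]}' was not found on PATH; a full path may be needed.")
```

The resulting list can be passed directly to :code:`subprocess.run`, which is why each value is converted to a string.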
diff --git a/ensemble_md/cli/run_EEXE.py b/ensemble_md/cli/run_EEXE.py
@@ -9,7 +9,6 @@
 ####################################################################
 import os
 import sys
-import glob
 import time
 import copy
 import shutil
@@ -90,6 +89,8 @@ def main():
 
     # Step 2: If there is no checkpoint file found/provided, perform the 1st iteration (index 0)
     if os.path.isfile(args.ckpt) is False:
+        start_idx = 1
+
         # 2-1. Set up input files for all simulations with 1 rank
         if rank == 0:
             for i in range(EEXE.n_sim):
@@ -99,46 +100,29 @@ def main():
                 MDP.write(f"sim_{i}/iteration_0/{EEXE.mdp.split('/')[-1]}", skipempty=True)
 
         # 2-2. Run the first ensemble of simulations
-        md = EEXE.run_EEXE(0)
+        EEXE.run_EEXE(0)
 
-        # 2-3. Restructure the directory (move the files from mdrun_0_i0_* to sim_*/iteration_0)
-        if rank == 0:
-            work_dir = md.output.directory.result()
-            for i in range(EEXE.n_sim):
-                if EEXE.verbose is True:
-                    print(f'  Moving files from {work_dir[i].split("/")[-1]}/ to sim_{i}/iteration_0/ ...')
-                    print(f'  Removing the empty folder {work_dir[i].split("/")[-1]} ...')
-                for f in glob.glob(f'{work_dir[i]}/*'):
-                    shutil.move(f, f'sim_{i}/iteration_0/')
-                os.rmdir(work_dir[i])
-        start_idx = 1
     else:
         if rank == 0:
             # If there is a checkpoint file, we see the execution as an extension of an EEXE simulation
             ckpt_data = np.load(args.ckpt)
-            start_idx = len(ckpt_data[0])
+            start_idx = len(ckpt_data[0])  # All rows (one per replica) should have the same length
             print(f'\nGetting prepared to extend the EEXE simulation from iteration {start_idx} ...')
 
-            print('Deleting corrupted data ...')
-            corrupted = glob.glob('gmxapi.commandline.cli*')  # corrupted iteration
-            corrupted.extend(glob.glob('mdrun*'))
-            for i in corrupted:
-                shutil.rmtree(i)
-            if len(corrupted) == 0:
-                corrupt_bool = False
-
-            for i in range(EEXE.n_sim):
-                n_finished = len(next(os.walk(f'sim_{i}'))[1])  # number of finished iterations (the last might be initialized but corrupted though)  # noqa: E501
-                if n_finished == EEXE.n_iter and corrupt_bool is False:
-                    print('Extension aborted: The expected number of iterations have been completed!')
-                    sys.exit()
-                else:
-                    print('Deleting data generated after the checkpoint ...')
+            if start_idx == EEXE.n_iter:
+                print('Extension aborted: The expected number of iterations have been completed!')
+                sys.exit()
+            else:
+                print('Deleting data generated after the checkpoint ...')
+                for i in range(EEXE.n_sim):
+                    n_finished = len(next(os.walk(f'sim_{i}'))[1])  # number of finished iterations
                     for j in range(start_idx, n_finished):
                         print(f'  Deleting the folder sim_{i}/iteration_{j}')
                         shutil.rmtree(f'sim_{i}/iteration_{j}')
 
             # Read g_vecs.npy and rep_trajs.npy so that new data can be appended, if any.
+            # Note that these two arrays are created on rank 0 and should always be modified on rank 0;
+            # otherwise, they must be broadcast to the other ranks.
             EEXE.rep_trajs = [list(i) for i in ckpt_data]
             if os.path.isfile(args.g_vecs) is True:
                 EEXE.g_vecs = [list(i) for i in np.load(args.g_vecs)]
@@ -209,7 +193,9 @@ def main():
                 MDP.write(f"sim_{j}/iteration_{i}/{EEXE.mdp.split('/')[-1]}", skipempty=True)
                 # In run_EEXE(i, swap_pattern), where the tpr files will be generated, we use the top file at the
                 # level of the simulation (the file that will be shared by all simulations). For the gro file, we pass
-                # swap_patter to the function to figure it out internally.
+                # swap_pattern to the function to figure it out internally.
+        else:
+            swap_pattern = None
 
         if -1 not in EEXE.equil and 0 not in EEXE.equil:
             # This is the case where the weights are equilibrated in a weight-updating simulation.
@@ -220,20 +206,11 @@ def main():
 
         # Step 4: Perform another iteration
         # 4-1. Run another ensemble of simulations
-        md = EEXE.run_EEXE(i, swap_pattern)
+        swap_pattern = comm.bcast(swap_pattern, root=0)
+        EEXE.run_EEXE(i, swap_pattern)
 
         if rank == 0:
-            # 4-2. Restructure the directory (move the files from mdrun_{i}_i0_* to sim_*/iteration_{i})
-            work_dir = md.output.directory.result()
-            for j in range(EEXE.n_sim):
-                if EEXE.verbose is True:
-                    print(f'  Moving files from {work_dir[j].split("/")[-1]}/ to sim_{j}/iteration_{i}/ ...')
-                    print(f'  Removing the empty folder {work_dir[j].split("/")[-1]} ...')
-                for f in glob.glob(f'{work_dir[j]}/*'):
-                    shutil.move(f, f'sim_{j}/iteration_{i}/')
-                os.rmdir(work_dir[j])
-
-            # 4-3. Save data
+            # 4-2. Save data
             if (i + 1) % EEXE.n_ckpt == 0:
                 if len(EEXE.g_vecs) != 0:
                     # Save g_vec as a function of time if weight combination was used.