docs: documenting using pymapdl on clusters (#3466)
Commit d71c544 (parent ffdcfa9). 9 changed files with 441 additions and 196 deletions.
.. _ref_hpc_pymapdl:

=======================
PyMAPDL on HPC clusters
=======================

.. _ref_hpc_pymapdl_job:

Introduction
============

PyMAPDL communicates with MAPDL using the gRPC protocol.
This protocol offers the many advantages and features described in
:ref:`ref_project_page`.
One of these features is that it is not required to have both
the PyMAPDL and MAPDL processes running on the same machine.
This possibility opens the door to many configurations, depending
on whether or not you run them both on the HPC compute nodes.
Additionally, you might be able to interact with them (``interactive`` mode)
or not (``batch`` mode).

For information on supported configurations, see :ref:`ref_pymapdl_batch_in_cluster_hpc`.
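
As an illustration of this flexibility, PyMAPDL can connect to an MAPDL
instance that is already running on another machine through its IP address
and port. The helper below is a minimal sketch; the address is a placeholder,
and a reachable MAPDL gRPC server with the ``ansys-mapdl-core`` package
installed locally is assumed:

.. code-block:: python

    def connect_to_remote_mapdl(ip: str, port: int = 50052):
        """Connect to an already running MAPDL gRPC server.

        The default MAPDL gRPC port is 50052. Requires the
        ``ansys-mapdl-core`` package and a reachable server.
        """
        from ansys.mapdl.core import Mapdl

        return Mapdl(ip=ip, port=port)


    # For example (placeholder address):
    # mapdl = connect_to_remote_mapdl("192.168.0.12")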

Since v0.68.5, PyMAPDL can take advantage of the tight integration
between the scheduler and MAPDL to read the job configuration and
launch an MAPDL instance that can use all the resources allocated
to that job.
For instance, if a SLURM job has allocated 8 nodes with 4 cores each,
then PyMAPDL launches an MAPDL instance that uses the 32 cores
spanning those 8 nodes.
You can turn off this behavior by setting the :envvar:`PYMAPDL_ON_SLURM`
environment variable or by passing the ``detect_HPC=False`` argument
to the :func:`launch_mapdl() <ansys.mapdl.core.launcher.launch_mapdl>` function.
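
The detection described above can be sketched with standard SLURM
environment variables (``SLURM_JOB_NUM_NODES`` and ``SLURM_CPUS_ON_NODE``).
This is a simplified model for illustration only, not PyMAPDL's actual
implementation:

.. code-block:: python

    import os


    def total_allocated_cores() -> int:
        """Estimate the total core count of a SLURM job from its environment.

        Illustrative only: a missing variable defaults to 1.
        """
        nodes = int(os.environ.get("SLURM_JOB_NUM_NODES", "1"))
        cores_per_node = int(os.environ.get("SLURM_CPUS_ON_NODE", "1"))
        return nodes * cores_per_node


    # Inside a job that allocated 8 nodes with 4 cores each:
    os.environ["SLURM_JOB_NUM_NODES"] = "8"
    os.environ["SLURM_CPUS_ON_NODE"] = "4"
    print(total_allocated_cores())  # 32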


.. _ref_pymapdl_batch_in_cluster_hpc:

Submit a PyMAPDL batch job to the cluster from the entrypoint node
==================================================================

Many HPC clusters allow their users to log into a machine using
``ssh``, ``vnc``, ``rdp``, or similar technologies and then submit a job
to the cluster from there.
This entrypoint machine, sometimes known as the *head node* or
*entrypoint node*, might be a virtual machine (VDI/VM).

In such cases, once the Python virtual environment with PyMAPDL is
set up and accessible to all the compute nodes, launching a
PyMAPDL job from the entrypoint node is as easy as using the ``sbatch`` command.
When the ``sbatch`` command is used, PyMAPDL runs on the compute nodes and
launches the MAPDL instance there as well.
No changes are needed in a PyMAPDL script to run it on a SLURM cluster.
First, the virtual environment must be activated in the current terminal:

.. code-block:: console

    user@entrypoint-machine:~$ export VENV_PATH=/my/path/to/the/venv
    user@entrypoint-machine:~$ source $VENV_PATH/bin/activate

Once the virtual environment is activated, you can launch any Python
script that has the proper Python shebang (``#!/usr/bin/env python3``).

For instance, assume that you want to launch the following ``main.py`` Python script:

.. code-block:: python
    :caption: main.py

    #!/usr/bin/env python3

    from ansys.mapdl.core import launch_mapdl

    # The number of processors must be lower than the
    # number of CPUs allocated for the job.
    mapdl = launch_mapdl(nproc=10)

    mapdl.prep7()
    n_proc = mapdl.get_value("ACTIVE", 0, "NUMCPU")
    print(f"Number of CPUs: {n_proc}")

    mapdl.exit()

You can submit this script with the following command:

.. code-block:: console

    (venv) user@entrypoint-machine:~$ sbatch main.py

Alternatively, you can remove the shebang from the Python file and submit
a wrapped Python executable call instead (``sbatch`` itself only accepts
batch scripts, so a plain command must be wrapped):

.. code-block:: console

    (venv) user@entrypoint-machine:~$ sbatch --wrap "python main.py"

Additionally, you can change the number of cores used by your job by
setting the :envvar:`PYMAPDL_NPROC` environment variable to the desired value:

.. code-block:: console

    (venv) user@entrypoint-machine:~$ PYMAPDL_NPROC=4 sbatch main.py
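
The precedence at play can be sketched as follows: when set,
:envvar:`PYMAPDL_NPROC` overrides the default core count. This is a
simplified illustration, not PyMAPDL's exact resolution logic:

.. code-block:: python

    import os


    def resolve_nproc(default: int = 10) -> int:
        """Return the core count, letting PYMAPDL_NPROC override the default.

        Simplified model of the environment-variable precedence.
        """
        value = os.environ.get("PYMAPDL_NPROC")
        return int(value) if value else default


    os.environ["PYMAPDL_NPROC"] = "4"
    print(resolve_nproc())  # 4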

You can also pass ``sbatch`` options on the command line.
For instance, to launch a PyMAPDL job that starts a four-core MAPDL instance
on a 10-CPU SLURM job, you can run this command:

.. code-block:: console

    (venv) user@entrypoint-machine:~$ PYMAPDL_NPROC=4 sbatch --partition=qsmall --nodes=10 --ntasks-per-node=1 main.py

Using a submission script
-------------------------

If you need to customize your PyMAPDL job further, you can create a SLURM
submission script for submitting it.
In this case, you must create two files:

- Python script with the PyMAPDL code
- Bash script that activates the virtual environment and calls the
  Python script

.. code-block:: python
    :caption: main.py

    from ansys.mapdl.core import launch_mapdl

    # The number of processors must be lower than the
    # number of CPUs allocated for the job.
    mapdl = launch_mapdl(nproc=10)

    mapdl.prep7()
    n_proc = mapdl.get_value("ACTIVE", 0, "NUMCPU")
    print(f"Number of CPUs: {n_proc}")

    mapdl.exit()

.. code-block:: bash
    :caption: job.sh

    #!/bin/bash
    # Set SLURM options
    #SBATCH --job-name=ansys_job         # Job name
    #SBATCH --partition=qsmall           # Specify the queue/partition name
    #SBATCH --nodes=5                    # Number of nodes
    #SBATCH --ntasks-per-node=2          # Number of tasks (cores) per node
    #SBATCH --time=04:00:00              # Set a time limit (optional but recommended)

    # Set env vars
    export MY_ENV_VAR=VALUE

    # Activate the Python virtual environment
    source /home/user/.venv/bin/activate

    # Call the Python script
    python main.py
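
Before submitting, you can sanity-check that the ``nproc`` value in
``main.py`` fits the allocation requested in ``job.sh`` (here, 5 nodes
with 2 tasks per node, or 10 CPUs in total). The helper below is only an
illustrative aid, not part of PyMAPDL:

.. code-block:: python

    def fits_allocation(nproc: int, nodes: int, ntasks_per_node: int) -> bool:
        """Check that the requested MAPDL cores fit the SLURM allocation."""
        return nproc <= nodes * ntasks_per_node


    # 10 cores fit the 5 x 2 allocation requested in job.sh.
    print(fits_allocation(10, 5, 2))  # True
    print(fits_allocation(16, 5, 2))  # False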

To start the simulation, run this command:

.. code-block:: console

    user@machine:~$ sbatch job.sh
    Submitted batch job 1

In this case, the Python virtual environment does not need to be activated
before submission because it is activated later, in the bash script itself.

The expected output of the job follows:

.. code-block:: text

    Number of CPUs: 10.0

The bash script allows you to customize the environment before running the
Python script.
This bash script performs tasks such as creating environment variables,
moving files to different directories, and printing to ensure your
configuration is correct.