Commit

deploy: cf1dd00
zingale committed Sep 13, 2024
1 parent 00bda2d commit eee7552
Showing 3 changed files with 15 additions and 1 deletion.
7 changes: 7 additions & 0 deletions _sources/nersc-workflow.rst.txt
@@ -22,6 +22,13 @@ includes the restart logic to allow for job chaining.
.. literalinclude:: ../../job_scripts/perlmutter/perlmutter.submit
:language: sh

.. note::

With large reaction networks, you may get GPU out-of-memory errors during
the first burner call. If this happens, you can add
``amrex.the_arena_init_size=0`` after ``${restartString}`` in the srun call
so AMReX doesn't reserve 3/4 of the GPU memory for the device arena.
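
For illustration only, here is a minimal sketch of how the modified ``srun``
line might look. The executable and inputs file names are placeholders (not
taken from the actual submission script), the usual ``srun`` flags are
omitted, and ``${restartString}`` is assumed to be set by the chaining logic
earlier in the script:

.. code-block:: sh

   # illustrative sketch -- executable and inputs names are placeholders,
   # and the srun flags from the real script are omitted for brevity
   srun ./Castro.ex inputs ${restartString} amrex.the_arena_init_size=0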

Below is an example that runs on CPU-only nodes. Here ``ntasks-per-node``
refers to the number of MPI processes (used for distributed parallelism) per
node, and ``cpus-per-task`` refers to the number of hyperthreads used per task
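
The CPU-only submission script itself is not shown in this diff. Purely as a
rough sketch of those two settings (the values below are placeholders, not
taken from the actual script), the ``#SBATCH`` header might contain lines
such as:

.. code-block:: sh

   # placeholder values -- choose them to match the node's core/hyperthread layout
   #SBATCH --ntasks-per-node=16   # MPI processes per node
   #SBATCH --cpus-per-task=8      # hyperthreads given to each MPI task
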
7 changes: 7 additions & 0 deletions nersc-workflow.html
@@ -202,6 +202,13 @@ <h2>Perlmutter
