Running on NERSC Systems
Currently, only the CORI system has been tested.
To install MaTEx-TensorFlow on CORI at NERSC, check out the 'cori_inst' branch of the MaTEx GitHub repository: https://github.com/matex-org/matex.git. For the remaining installation steps, follow the GitHub wiki page: https://github.com/matex-org/matex/wiki/Installing-MaTEx-TensorFlow-CPU
There are, however, some CORI-specific instructions to follow. By default, the PrgEnv-intel module is loaded on CORI; you can confirm this with 'module list'. Switch to the GNU environment with:
module switch PrgEnv-intel PrgEnv-gnu/6.0.3
Set the following environment variable:
export MPI_HOME=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1
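The two steps above can be combined into one guarded snippet. This is a minimal sketch, assuming the `module` command is available in the current shell (as on CORI login nodes); where it is absent, the switch is simply skipped. The variable `mpi_status` is a hypothetical name used only for illustration.

```shell
# Switch to the GNU programming environment only if PrgEnv-intel is loaded.
# `module list` may write to stderr, hence the 2>&1.
if command -v module >/dev/null 2>&1; then
    if module list 2>&1 | grep -q 'PrgEnv-intel'; then
        module switch PrgEnv-intel PrgEnv-gnu/6.0.3
    fi
fi

# MPI location from the instructions above.
export MPI_HOME=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1

# mpi_status is an illustrative variable, not part of MaTEx.
if [ -d "$MPI_HOME" ]; then
    mpi_status=found
else
    mpi_status=missing
fi
echo "MPI_HOME=$MPI_HOME ($mpi_status)"
```

If the last line reports `missing`, the MPT version on the system has probably changed and the path needs adjusting.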
On CORI, an existing installation of MaTEx-TensorFlow is available here (#1): /global/cscratch1/sd/vamatya/matex
Add /global/cscratch1/sd/vamatya/anaconda3/bin to your PATH:
export PATH=/global/cscratch1/sd/vamatya/anaconda3/bin:$PATH
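If this setup is sourced more than once in the same session, the export above keeps prepending the same directory. A small sketch of an idempotent variant follows; `conda_bin` is a hypothetical variable name introduced here for readability.

```shell
# conda_bin is an illustrative variable holding the directory named above.
conda_bin=/global/cscratch1/sd/vamatya/anaconda3/bin

# Prepend the directory only if PATH does not already contain it.
case ":$PATH:" in
    *":$conda_bin:"*) ;;                    # already present, nothing to do
    *) export PATH="$conda_bin:$PATH" ;;    # prepend once
esac
```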
From the directory at path #1, run:
source run_TFEnv.sh
This should activate the Python virtual environment, within which MaTEx-TensorFlow should be usable.
After this, use a Slurm batch script to run MaTEx-TensorFlow on the compute nodes. We have not yet succeeded in running TensorFlow on the host node on CORI (and doing so is probably not the best approach anyway).
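As an illustration of such a batch script, the sketch below writes one to a file named `run_matex.sl`. Everything not stated on this page is an assumption: the file name, the node count, time limit, `haswell` constraint, `debug` QOS, the task count, and the placeholder `my_training_script.py`. Adjust them to your allocation and workload.

```shell
# Write a sample batch script; every #SBATCH value below is an assumption.
cat > run_matex.sl <<'EOF'
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --time=00:30:00
#SBATCH --constraint=haswell
#SBATCH --qos=debug

# Same environment setup as described for the login node.
module switch PrgEnv-intel PrgEnv-gnu/6.0.3
export MPI_HOME=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1
export PATH=/global/cscratch1/sd/vamatya/anaconda3/bin:$PATH
cd /global/cscratch1/sd/vamatya/matex
source run_TFEnv.sh

# my_training_script.py is a hypothetical placeholder for your own script,
# looked up in the directory the job was submitted from.
srun --ntasks=4 python "$SLURM_SUBMIT_DIR/my_training_script.py"
EOF
```

The job would then be submitted with `sbatch run_matex.sl`. Because the heredoc delimiter is quoted, `$PATH` and `$SLURM_SUBMIT_DIR` are written literally and expanded only when the job runs.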
If the above doesn't work, please follow the instructions below.
Set:
export LD_PRELOAD=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1/lib/libmpichcxx.so
Check that PYTHONHOME is set to: /global/cscratch1/sd/vamatya/matex_github/src/deeplearning/tensorflow/cpu/py3.x/py_distro
Next, activate the Python virtual environment, within which MaTEx-TensorFlow should be usable:
source $PYTHONHOME/bin/activate
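The fallback steps can also be run with a few sanity checks, so that a wrong path is reported instead of silently producing a broken environment. This is only a sketch: `mpi_cxx_lib`, `expected_pythonhome`, and `env_status` are hypothetical variable names, and the paths are copied verbatim from the instructions above (including the literal `py3.x` component).

```shell
# Illustrative variables holding the paths given in the instructions.
mpi_cxx_lib=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1/lib/libmpichcxx.so
expected_pythonhome=/global/cscratch1/sd/vamatya/matex_github/src/deeplearning/tensorflow/cpu/py3.x/py_distro

# Preload the MPI C++ library only if it exists, to avoid loader warnings.
if [ -f "$mpi_cxx_lib" ]; then
    export LD_PRELOAD="$mpi_cxx_lib"
fi

# Activate only when PYTHONHOME matches and the activate script is present.
if [ "${PYTHONHOME:-}" = "$expected_pythonhome" ] && [ -f "$PYTHONHOME/bin/activate" ]; then
    . "$PYTHONHOME/bin/activate"    # "." is the portable form of "source"
    env_status=activated
else
    env_status=skipped
    echo "Not activating: PYTHONHOME is '${PYTHONHOME:-unset}' (expected '$expected_pythonhome'), or bin/activate is missing" >&2
fi
echo "environment: $env_status"
```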