
Running on NERSC Systems

Vinay Amatya edited this page Jul 10, 2017 · 7 revisions

Currently, only the CORI system has been tested.

To install Matex-Tensorflow on CORI at NERSC, a 'cori_inst' branch has been added to the Matex GitHub repository: https://github.com/matex-org/matex.git. Please check out the 'cori_inst' branch. For further installation instructions, follow the GitHub wiki page: https://github.com/matex-org/matex/wiki/Installing-MaTEx-TensorFlow-CPU
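
The checkout step can be sketched as a small shell helper. This is a minimal sketch, not part of the MaTEx repository: the function name `fetch_matex` and its optional target-directory argument are hypothetical, and only the repository URL and branch name come from the instructions above.

```shell
# fetch_matex is a hypothetical helper (not part of MaTEx).
# It clones the repository and checks out the 'cori_inst' branch in one step.
# The optional first argument is the target directory (default: ./matex).
fetch_matex() {
    git clone --branch cori_inst https://github.com/matex-org/matex.git "${1:-matex}"
}

# Usage (requires network access):
#   fetch_matex "$HOME/matex"
```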

There are, however, some CORI-specific instructions that need to be followed. By default, the PrgEnv-intel module is loaded on CORI; you can confirm this with 'module list'. Switch to the GNU environment with:

module switch PrgEnv-intel PrgEnv-gnu/6.0.3
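
To confirm that the switch took effect without reading through the 'module list' output, something like the following could be used. It is a sketch that assumes the module system exports the colon-separated `LOADEDMODULES` variable, as Environment Modules normally does; the helper names `current_prgenv` and `prgenv_is_gnu` are hypothetical.

```shell
# Print the first loaded PrgEnv-* module, if any.
# Assumes LOADEDMODULES is a colon-separated list of loaded modules.
current_prgenv() {
    printf '%s\n' "${LOADEDMODULES:-}" | tr ':' '\n' | grep '^PrgEnv-' | head -n 1
}

# Succeed only if the loaded programming environment is PrgEnv-gnu.
prgenv_is_gnu() {
    case "$(current_prgenv)" in
        PrgEnv-gnu|PrgEnv-gnu/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   prgenv_is_gnu || echo "Still on $(current_prgenv); run the module switch above"
```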

Set the following environment variable:

export MPI_HOME=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1
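
Because the MPT version in this path may differ on your system, a quick sanity check can catch a stale value early. The helper below is a hypothetical sketch: it only verifies that `MPI_HOME` is set and contains a `lib/` subdirectory, and says nothing about whether the MPI build itself is compatible.

```shell
# check_mpi_home is a hypothetical helper (not part of MaTEx).
# It checks that MPI_HOME is set and that $MPI_HOME/lib exists.
check_mpi_home() {
    if [ -z "${MPI_HOME:-}" ]; then
        echo "MPI_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$MPI_HOME/lib" ]; then
        echo "No lib/ directory under MPI_HOME=$MPI_HOME" >&2
        return 1
    fi
    echo "MPI_HOME looks usable: $MPI_HOME"
}

# Usage:
#   check_mpi_home || echo "Fix MPI_HOME before building"
```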

On CORI, there is an installation of Matex-Tensorflow here (#1): /global/cscratch1/sd/vamatya/matex

Add /global/cscratch1/sd/vamatya/anaconda3/bin to your PATH:

export PATH=/global/cscratch1/sd/vamatya/anaconda3/bin:$PATH

From the installation directory (#1), run:

source run_TFEnv.sh

This should activate the Python virtual environment, within which Matex-Tensorflow should be usable.

After this, use a Slurm batch script to run Matex-Tensorflow on the compute nodes. We have not yet succeeded in running TensorFlow on the host node on CORI (and doing so is probably not the best approach anyway).
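
A minimal sketch of such a batch script follows, written out with a here-document so it can be saved and submitted. Everything in the `#SBATCH` header (job name, node count, time limit, `haswell` constraint, `debug` QOS) is a placeholder assumption to be checked against current NERSC documentation, and `train.py` stands in for your own script. The body repeats the setup steps from this page so the job does not depend on the submitting shell's environment; it also assumes the `module` command is available inside batch jobs.

```shell
# Write a hypothetical job script; adjust the #SBATCH values for your allocation.
cat > matex_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=matex-tf
#SBATCH --nodes=2
#SBATCH --time=00:30:00
#SBATCH --constraint=haswell
#SBATCH --qos=debug

# Repeat the environment setup from this page on the compute nodes.
module switch PrgEnv-intel PrgEnv-gnu/6.0.3
export MPI_HOME=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1
export PATH=/global/cscratch1/sd/vamatya/anaconda3/bin:$PATH
cd /global/cscratch1/sd/vamatya/matex
source run_TFEnv.sh

# train.py is a placeholder for your own Matex-Tensorflow script.
srun --ntasks=2 python "$SLURM_SUBMIT_DIR/train.py"
EOF

# Submit from a login node:
#   sbatch matex_job.sh
```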

If the above doesn't work, please follow the instructions below.
Set:

export LD_PRELOAD=/opt/cray/pe/mpt/7.4.4/gni/mpich-gnu/5.1/lib/libmpichcxx.so

Please check that PYTHONHOME is set to: /global/cscratch1/sd/vamatya/matex_github/src/deeplearning/tensorflow/cpu/py3.x/py_distro
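
One way to perform this check from the shell is sketched below. The variable `EXPECTED_PYTHONHOME` and the helper `pythonhome_ok` are hypothetical names introduced for illustration; the path itself is copied verbatim from the line above.

```shell
# Expected value, copied verbatim from this page.
EXPECTED_PYTHONHOME=/global/cscratch1/sd/vamatya/matex_github/src/deeplearning/tensorflow/cpu/py3.x/py_distro

# pythonhome_ok is a hypothetical helper: succeeds only on an exact match.
pythonhome_ok() {
    [ "${PYTHONHOME:-}" = "$EXPECTED_PYTHONHOME" ]
}

if pythonhome_ok; then
    echo "PYTHONHOME is set as expected"
else
    echo "PYTHONHOME is '${PYTHONHOME:-unset}'; expected $EXPECTED_PYTHONHOME"
fi
```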

Next, to activate the Python virtual environment within which Matex-Tensorflow should be usable, run:

source $PYTHONHOME/bin/activate