diff --git a/README.md b/README.md
index 5178e36..2beafe4 100644
--- a/README.md
+++ b/README.md
@@ -17,9 +17,9 @@
   - [Check Habana Package Installation for no Docker](#check-habana-package-installation-for-no-docker)
   - [Install SW Stack](#install-sw-stack)
   - [Check TF/Horovod Habana packages](#check-tfhorovod-habana-packages)
-  - [Install TF/Horovod Habana python packages](#install-tfhorovod-habana-python-packages)
+  - [Install TF/Horovod Habana packages](#install-tfhorovod-habana-packages)
   - [Check PT Habana packages](#check-pt-habana-packages)
-  - [Install PT Habana python packages](#install-pt-habana-python-packages)
+  - [Install PT Habana packages](#install-pt-habana-packages)
 - Docker
   - [Do you want to use prebuilt docker or build docker yourself?](#do-you-want-to-use-prebuilt-docker-or-build-docker-yourself)
   - [How to Build Docker Images from Habana Dockerfiles](#how-to-build-docker-images-from-habana-dockerfiles)
@@ -196,12 +196,6 @@ Setup complete, please proceed to [Setup Complete](#Setup-Complete)
 ## Habana Deep Learning AMI from AWS Marketplace
-
-
-**--- Coming Soon ---**
-
-
-
 When using the Habana Deep Learning AMI from AWS Marketplace, you can either directly use containers or install a framework and proceed from there to run directly on the AMI.
@@ -999,7 +993,7 @@ ${PYTHON} -m pip list | grep habana
 ### Are the required python packages installed on your system?
-[Yes](#Setup-Complete) • [No](#Install-TFHorovod-Habana-python-packages)
+[Yes](#Setup-Complete) • [No](#install-tfhorovod-habana-packages)
@@ -1009,7 +1003,7 @@ ${PYTHON} -m pip list | grep habana
-## Install TF/Horovod Habana python packages
+## Install TF/Horovod Habana packages
 This section describes how to obtain and install the TensorFlow software package. The package consists of two main components:
 Base **habana-tensorflow** Python package - Libraries and modules needed to execute TensorFlow on a **single Gaudi** device.
@@ -1021,7 +1015,9 @@ Scale-out **habana-horovod** Python package - Libraries and modules needed to ex
-The following example scripts include instructions from the steps [Base Installation (Single Node)](#Base-Installation-Single-Node) and [Scale-out Installation](#Scale-out-Installation) that can be used for your reference. The scripts install TF 2.5.1.
+The following example scripts include instructions from the steps [Base Installation (Single Node)](#Base-Installation-Single-Node) and [Scale-out Installation](#Scale-out-Installation) that can be used for your reference. The scripts install TF 2.5.1.
+The scripts use Python 3 from ``/usr/bin/``, at the version listed in the [Support Matrix](#SynapseAi-Support-Matrix).
+Make sure that Python 3 is installed there; if it is not, update ``PYTHON=`` in the bash scripts to point to the appropriate interpreter.
 Ubuntu 18.04 example script [u18_tensorflow_installation.sh](https://github.com/HabanaAI/Setup_and_Install/blob/r1.0.1/installation_scripts/u18_tensorflow_installation.sh).
@@ -1149,16 +1145,30 @@ This will search for and list all packages with the word Habana.
 ### Base Installation (Single Node)
 The habana-tensorflow package contains all the binaries and scripts to run topologies on a single-node.
-1. Before installing habana-tensorflow, install supported TensorFlow version. See [Support Matrix](#SynapseAi-Support-Matrix). If no TensorFlow package is available, PIP will automatically fetch it.
+1. All the steps listed below use the ``${PYTHON}`` environment variable, which must be set to the appropriate version of Python according to the [Support Matrix](#SynapseAi-Support-Matrix).
 ```
-${PYTHON} -m pip install tensorflow-cpu==
+export PYTHON=/usr/bin/python # e.g. for U18 it's PYTHON=/usr/bin/python3.7
 ```
+2. Before installing habana-tensorflow, install a supported TensorFlow version. See [Support Matrix](#SynapseAi-Support-Matrix). If no TensorFlow package is available, PIP will automatically fetch it.
+**NOTE:**
+Since the release of TensorFlow 2.7.0, TensorFlow 2.6.0 has broken dependencies on TensorFlow Estimator, Keras and TensorBoard.
+To work around this, explicitly install the matching versions of those packages before installing TensorFlow.
-2. habana-tensorflow is available in the Habana Vault. To allow PIP to search for the habana-tensorflow package, –extra-index-url needs to be specified:
 ```
-${PYTHON} -m pip install habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+# Only when installing tensorflow-cpu==2.6.0
+${PYTHON} -m pip install --user tensorflow-estimator==2.6.0
+${PYTHON} -m pip install --user tensorboard==2.6.0
+${PYTHON} -m pip install --user keras==2.6.0
+```
+Then install tensorflow-cpu:
 ```
-3. Run the below command to make sure the habana-tensorflow package is properly installed:
+${PYTHON} -m pip install --user tensorflow-cpu==
+```
+3. habana-tensorflow is available in the Habana Vault. To allow PIP to search for the habana-tensorflow package, --extra-index-url needs to be specified:
+```
+${PYTHON} -m pip install --user habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+```
+4. Run the command below to make sure the habana-tensorflow package is properly installed:
 ```
 ${PYTHON} -c "import habana_frameworks.tensorflow as htf; print(htf.__version__)"
 ```
@@ -1212,11 +1222,11 @@ export PATH=$MPI_ROOT/bin:$PATH
 ```
 Install mpi4py binding
 ```
-python3 -m pip install mpi4py==3.0.3
+${PYTHON} -m pip install --user mpi4py==3.0.3
 ```
 3. habana-horovod is also stored in the Habana Vault.
 To allow PIP to search for the habana-horovod package, –extra-index-url needs to be specified:
 ```
-${PYTHON} -m pip install habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+${PYTHON} -m pip install --user habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
 ```
 #### See also:
@@ -1274,7 +1284,7 @@ Python dependencies are gatehered in [model_requirements.txt](https://github.com
 Download the file and invoke:
 ```
-python3 -m pip install -r model_requirements.txt
+${PYTHON} -m pip install --user -r model_requirements.txt
 ```
@@ -1371,7 +1381,7 @@ Check for habana-torch and habana-torch-hcl
 ### Are the required python packages installed on your system?
-[Yes](#Setup-Complete) • [No](#Install-PT-Habana-python-packages)
+[Yes](#Setup-Complete) • [No](#install-pt-habana-packages)
@@ -1381,7 +1391,7 @@ Check for habana-torch and habana-torch-hcl
-## Install PT Habana python packages
+## Install PT Habana packages
 ### Install Habana Pytorch
 Ubuntu distributions
@@ -2224,7 +2234,7 @@ It will look similar to this:
 *
 TF 2.6.0
-  ### Pull docker
+  ### Pull docker
   ```
   docker pull vault.habana.ai/gaudi-docker/1.0.1/ubuntu18.04/habanalabs/tensorflow-installer-tf-cpu-2.6.0:1.0.1-81
   ```
@@ -2275,8 +2285,8 @@ It will look similar to this:
 *
 TF 2.6.0
-
-  ### Pull docker
+
+  ### Pull docker
   ```
   docker pull vault.habana.ai/gaudi-docker/1.0.1/amzn2/habanalabs/tensorflow-installer-tf-cpu-2.6.0:1.0.1-81
   ```
diff --git a/dockerfiles/Dockerfile_amzn2_tensorflow_installer b/dockerfiles/Dockerfile_amzn2_tensorflow_installer
index 0f0ee74..1669a6d 100644
--- a/dockerfiles/Dockerfile_amzn2_tensorflow_installer
+++ b/dockerfiles/Dockerfile_amzn2_tensorflow_installer
@@ -56,6 +56,9 @@ RUN wget https://bootstrap.pypa.io/get-pip.py && \
     python3 get-pip.py pip==21.0.1 && \
     rm -rf get-pip.py && \
     pip3 install -r requirements-training-release.txt && \
+    pip3 install tensorflow-estimator==2.6.0 && \
+    pip3 install tensorboard==2.6.0 && \
+    pip3 install keras==2.6.0 && \
     pip3 install tensorflow-cpu==${TF_VERSION} \
     tensorflow-model-optimization==0.5.0 && \
 # pycocotools has to be installed in separated process otherwise it fails with 'numpy.ufunc size changed'
diff --git a/dockerfiles/Dockerfile_ubuntu_tensorflow_installer b/dockerfiles/Dockerfile_ubuntu_tensorflow_installer
index 2546b73..6618959 100644
--- a/dockerfiles/Dockerfile_ubuntu_tensorflow_installer
+++ b/dockerfiles/Dockerfile_ubuntu_tensorflow_installer
@@ -45,6 +45,9 @@ COPY requirements-training-release.txt requirements-training-release.txt
 RUN python3 -m pip install pip==21.0.1 && \
     pip3 install -r requirements-training-release.txt && \
     pip3 uninstall --yes habana tensorflow && \
+    pip3 install tensorflow-estimator==2.6.0 && \
+    pip3 install tensorboard==2.6.0 && \
+    pip3 install keras==2.6.0 && \
 # tensorflow-cpu and -model have to be installed in separated processes otherwise old version of tf will be imported
     pip3 install tensorflow-cpu==${TF_VERSION} && \
     pip3 install tensorflow-model-optimization==0.5.0 && \
diff --git a/installation_scripts/al2_tensorflow_installation.sh b/installation_scripts/al2_tensorflow_installation.sh
index 9f54258..97c68bf 100755
--- a/installation_scripts/al2_tensorflow_installation.sh
+++ b/installation_scripts/al2_tensorflow_installation.sh
@@ -29,6 +29,7 @@ export MPI_ROOT=/usr/local/openmpi
 export LD_LIBRARY_PATH=$MPI_ROOT/lib:$LD_LIBRARY_PATH
 export OPAL_PREFIX=$MPI_ROOT
 export PATH=$MPI_ROOT/bin:$PATH
+export PYTHON=/usr/bin/python3.7
 echo "export MPI_ROOT=${MPI_ROOT}" | sudo tee -a /etc/profile.d/habanalabs.sh
 echo "export OPAL_PREFIX=${MPI_ROOT}" | sudo tee -a /etc/profile.d/habanalabs.sh
 echo 'export LD_LIBRARY_PATH=${MPI_ROOT}/lib:${LD_LIBRARY_PATH}' | sudo tee -a /etc/profile.d/habanalabs.sh
@@ -46,14 +47,13 @@ wget --no-verbose https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-"$
 /sbin/ldconfig
 export MPICC=${MPI_ROOT}/bin/mpicc
-python3 -m pip install mpi4py==3.0.3
+${PYTHON} -m pip install --user mpi4py==3.0.3
 #install base tensorflow package
-python3 -m pip install tensorflow-cpu==2.5.1
-#instal Habana tensorflow bridge & Horovod
-python3 -m pip install habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
-python3 -m pip install habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+${PYTHON} -m pip install --user tensorflow-cpu==2.5.1
+#install Habana tensorflow bridge & Horovod
+${PYTHON} -m pip install --user habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+${PYTHON} -m pip install --user habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
 source /etc/profile.d/habanalabs.sh
-python3 -c 'import tensorflow as tf;import habana_frameworks.tensorflow as htf;htf.library_loader.load_habana_module();x = tf.constant(2);y = x + x;assert y.numpy() == 4, "Sanity check failed: Wrong Add output";assert "HPU" in y.device, "Sanity check failed: Operation not executed on Habana";print("Sanity check passed")'
-
+${PYTHON} -c 'import tensorflow as tf;import habana_frameworks.tensorflow as htf;htf.load_habana_module();x = tf.constant(2);y = x + x;assert y.numpy() == 4, "Sanity check failed: Wrong Add output";assert "HPU" in y.device, "Sanity check failed: Operation not executed on Habana";print("Sanity check passed")'
diff --git a/installation_scripts/u18_tensorflow_installation.sh b/installation_scripts/u18_tensorflow_installation.sh
index 8b67e06..f666bea 100755
--- a/installation_scripts/u18_tensorflow_installation.sh
+++ b/installation_scripts/u18_tensorflow_installation.sh
@@ -26,6 +26,7 @@ export MPI_ROOT=/usr/local/openmpi
 export LD_LIBRARY_PATH=$MPI_ROOT/lib:$LD_LIBRARY_PATH
 export OPAL_PREFIX=$MPI_ROOT
 export PATH=$MPI_ROOT/bin:$PATH
+export PYTHON=/usr/bin/python3.7
 echo "export MPI_ROOT=${MPI_ROOT}" | sudo tee -a /etc/profile.d/habanalabs.sh
 echo "export OPAL_PREFIX=${MPI_ROOT}" | sudo tee -a /etc/profile.d/habanalabs.sh
 echo 'export LD_LIBRARY_PATH=${MPI_ROOT}/lib:${LD_LIBRARY_PATH}' | sudo tee -a /etc/profile.d/habanalabs.sh
@@ -46,14 +47,13 @@ wget --no-verbose https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-"$
     rm -rf openmpi-"${OPENMPI_VER}"* && \
     sudo /sbin/ldconfig
-python3 -m pip install mpi4py==3.0.3
+${PYTHON} -m pip install --user mpi4py==3.0.3
 #install base tensorflow package
-python3 -m pip install tensorflow-cpu==2.5.1
-#instal Habana tensorflow bridge & Horovod
-python3 -m pip install habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
-python3 -m pip install habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+${PYTHON} -m pip install --user tensorflow-cpu==2.5.1
+#install Habana tensorflow bridge & Horovod
+${PYTHON} -m pip install --user habana-tensorflow==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
+${PYTHON} -m pip install --user habana-horovod==1.0.1.81 --extra-index-url https://vault.habana.ai/artifactory/api/pypi/gaudi-python/simple
 source /etc/profile.d/habanalabs.sh
-python3 -c 'import tensorflow as tf;import habana_frameworks.tensorflow as htf;htf.library_loader.load_habana_module();x = tf.constant(2);y = x + x;assert y.numpy() == 4, "Sanity check failed: Wrong Add output";assert "HPU" in y.device, "Sanity check failed: Operation not executed on Habana";print("Sanity check passed")'
-
+${PYTHON} -c 'import tensorflow as tf;import habana_frameworks.tensorflow as htf;htf.load_habana_module();x = tf.constant(2);y = x + x;assert y.numpy() == 4, "Sanity check failed: Wrong Add output";assert "HPU" in y.device, "Sanity check failed: Operation not executed on Habana";print("Sanity check passed")'
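
Review note: both installation scripts now export a fixed `PYTHON=/usr/bin/python3.7`, and the README asks the reader to edit that value when the interpreter lives elsewhere. As a possible follow-up, a guard along the lines below could fail early with a readable message instead of letting the first `pip install` fail. This is only a sketch under assumptions: `resolve_python` is a hypothetical helper name and is not part of this patch or of the Habana scripts.

```shell
# Hypothetical helper (not part of this patch): print the interpreter path
# if it exists and is executable; otherwise report the problem and fail.
resolve_python() {
    candidate="${1:-/usr/bin/python3}"
    if [ -x "${candidate}" ]; then
        printf '%s\n' "${candidate}"
        return 0
    fi
    echo "error: ${candidate} is missing or not executable; set PYTHON to a supported interpreter" >&2
    return 1
}
```

In a script it might be called as `PYTHON="$(resolve_python "${PYTHON:-/usr/bin/python3.7}")" || exit 1` before the first `${PYTHON} -m pip install` line; the default path there simply mirrors the value the scripts already export.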