Update for 2.0.0 (#827)
* Add MKL backend related documentation and update ChangeLog
wei-v-wang authored and Jennifer Myers committed Jun 27, 2017
1 parent 1f3036c commit 41e746a
Showing 10 changed files with 91 additions and 33 deletions.
9 changes: 9 additions & 0 deletions ChangeLog
@@ -1,5 +1,14 @@
# ChangeLog

## v2.0.0 (2017-06-27):

* Added support for MKL backend (-b mkl) on Linux, which boosts neon CPU performance significantly
* Added WGAN model examples for LSUN and MNIST data
* Enabled WGAN and DCGAN model examples for Python3
* Added fix (using file locking) to prevent race conditions when running multiple jobs on the same machine with multiple GPUs
* Added functionality to display information about the hardware, OS, and model used
* Updated appdirs to 1.4.3 to be compatible with CentOS 7.3 for the appliance

## v1.9.0 (2017-05-03):

* Add support for 3D deconvolution
24 changes: 19 additions & 5 deletions README.md
@@ -11,11 +11,12 @@ For fast iteration and model exploration, neon has the fastest performance among
* 2.5s/macrobatch (3072 images) on AlexNet on Titan X (Full run on 1 GPU ~ 26 hrs)
* Training VGG with 16-bit floating point on 1 Titan X takes ~10 days (original paper: 4 GPUs for 2-3 weeks)

-We use neon internally at Nervana to solve our customers' problems across many
+We use neon internally at Intel Nervana to solve our customers' problems across many
[domains](http://www.nervanasys.com/solutions/). We are hiring across several
roles. Apply [here](http://www.nervanasys.com/careers/)!

See the [new features](https://github.com/NervanaSystems/neon/blob/master/ChangeLog) in our latest release.
We want to highlight that neon v2.0.0+ has been optimized for much better CPU performance by enabling the Intel Math Kernel Library (MKL). Remember to turn on MKL by adding `-b mkl` when running neon on Intel Xeon and Xeon Phi CPUs. The DNN (Deep Neural Networks) component of MKL that neon uses is provided free of charge and is downloaded automatically as part of the neon installation.

## Quick Install

@@ -29,12 +30,25 @@ neon (conda users see the [guide](http://neon.nervanasys.com/docs/latest/install
cd neon
make
. .venv/bin/activate
-# run an example with the mkl backend (defaults to the cpu backend (non-mkl)):
-neon examples/mnist_mlp.yaml -b mkl
-# alternatively, use a script (defaults to gpu backend if available):
-python examples/mnist_mlp.py
+# use a script to run an example with the **optimized** CPU (mkl) backend (defaults to the non-optimized CPU backend (cpu) if `-b mkl` is not specified):
+python examples/mnist_mlp.py -b mkl
+# alternatively, use a yaml file (defaults to the gpu backend if available; add a line containing ``backend: mkl`` to enable the MKL backend):
+neon examples/mnist_mlp.yaml
```

## Recommended Settings for neon with MKL on Intel Architectures

The Intel Math Kernel Library takes advantage of the parallelization and vectorization capabilities of Intel Xeon and Xeon Phi systems. When hyperthreading is enabled on the system, we recommend
the following settings to ensure parallel threads are mapped 1:1 to the available physical cores.

```bash
export OMP_NUM_THREADS=<Number of Physical Cores>
export KMP_AFFINITY=compact,1,0,granularity=fine
```
For more information about KMP_AFFINITY, please check [here](https://software.intel.com/en-us/node/522691).
We encourage users to experiment to establish their own best performance settings.
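
The number of physical cores depends on your machine. As a rough sketch (assuming a Linux system with `lscpu` available), the core count can be derived and the variables set like this:

```bash
# physical cores = cores per socket * number of sockets (ignores hyperthreads)
CORES=$(( $(lscpu | awk '/^Core\(s\) per socket:/ {print $4}') * $(lscpu | awk '/^Socket\(s\):/ {print $2}') ))
export OMP_NUM_THREADS=$CORES
export KMP_AFFINITY=compact,1,0,granularity=fine
```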


## Documentation

The complete documentation for neon is available
20 changes: 7 additions & 13 deletions doc/source/index.rst
@@ -36,21 +36,15 @@ Features include:

New features in this release:

-* Add support for 3D deconvolution
-* Generative Adversarial Networks (GAN) implementation, and MNIST DCGAN example, following GoodFellow 2014 (http://arXiv.org/abs/1406.2661)
-* Implement Wasserstein GAN cost function and make associated API changes for GAN models
-* Add a new benchmarking script with per-layer timings
-* Add weight clipping for GDM, RMSProp, Adagrad, Adadelta and Adam optimizers
-* Make multicost an explicit choice in mnist_branch.py example
-* Enable NMS kernels to work with normalized boxes and offset
-* Fix missing links in api.rst [#366]
-* Fix docstring for --datatype option to neon [#367]
-* Fix perl shebang in maxas.py and allow for build with numpy 1.12 [#356]
-* Replace os.path.join for Windows interoperability [#351]
-* Update aeon to 0.2.7 to fix a seg fault on termination
+* Added support for MKL backend (-b mkl) on Linux, which boosts neon CPU performance significantly
+* Added WGAN model examples for LSUN and MNIST data
+* Enabled WGAN and DCGAN model examples for Python3
+* Added fix (using file locking) to prevent race conditions when running multiple jobs on the same machine with multiple GPUs
+* Added functionality to display information about the hardware, OS, and model used
+* Updated appdirs to 1.4.3 to be compatible with CentOS 7.3 for the appliance
* See more in the `change log`_.

-We use neon internally at Nervana to solve our `customers' problems`_
+We use neon internally at Intel Nervana to solve our `customers' problems`_
in many domains. Consider joining us. We are hiring across several
roles. Apply here_!

32 changes: 23 additions & 9 deletions doc/source/installation.rst
@@ -16,7 +16,7 @@
Installation
===============

-Let's get you started using Neon to build deep learning models!
+Let's get you started using neon to build deep learning models!

Requirements
~~~~~~~~~~~~
@@ -40,10 +40,11 @@ packages (different system names shown):
To enable neon's :py:class:`.DataLoader`, several optional libraries should be installed. For image processing, install `OpenCV <http://opencv.org/>`__. For audio and video data, install `ffmpeg <https://ffmpeg.org/>`__. We recommend installing with a package manager (e.g. apt-get or homebrew).
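
On Ubuntu, for example, the package-manager route looks roughly like this (the package names are assumptions and vary by distribution and release):

.. code-block:: bash

    # OpenCV for image processing, ffmpeg for audio/video decoding
    sudo apt-get install python-opencv ffmpeg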


-Additionally, there are several other optional libraries.
+Additionally, there are several other libraries.

-* To enable multi-threading operations on a CPU, install `OpenBLAS <http://www.openblas.net/>`__, then recompile numpy with links to openBLAS (see sample instructions `here <https://hunseblog.wordpress.com/2014/09/15/installing-numpy-and-openblas/>`_). While Neon will run on the CPU, you'll get far better performance using GPUs.
-* Enabling Neon to use GPUs requires installation of `CUDA SDK and drivers <https://developer.nvidia.com/cuda-downloads>`__. We support `Pascal <http://developer.nvidia.com/pascal>`__, `Maxwell <http://maxwell.nvidia.com/>`__ and `Kepler <http://www.nvidia.com/object/nvidia-kepler.html>`__ GPU architectures, but our backend is optimized for Maxwell GPUs. Remember to add the CUDA path to your environment variables.
+* neon v2.0.0+ comes with Intel Math Kernel Library (MKL) support by default, which enables multi-threaded operations on Intel CPUs. It is the recommended library for best CPU performance, and MKL support is enabled automatically when neon is installed.
+* (optional) If you are interested in comparing the multi-threading performance of MKL-optimized neon, install `OpenBLAS <http://www.openblas.net/>`__, then recompile numpy with links to openBLAS (see sample instructions `here <https://hunseblog.wordpress.com/2014/09/15/installing-numpy-and-openblas/>`_). While neon will run on the CPU with OpenBLAS, you'll get better performance using MKL on CPUs or CUDA on GPUs.
+* Enabling neon to use GPUs requires installation of `CUDA SDK and drivers <https://developer.nvidia.com/cuda-downloads>`__. We support `Pascal <http://developer.nvidia.com/pascal>`__, `Maxwell <http://maxwell.nvidia.com/>`__ and `Kepler <http://www.nvidia.com/object/nvidia-kepler.html>`__ GPU architectures, but our backend is optimized for Maxwell GPUs. Remember to add the CUDA path to your environment variables.

For GPU users, remember to add the CUDA path. For example, on Ubuntu:
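A typical setup is sketched below; the ``/usr/local/cuda`` location is an assumption, so adjust it to your install:

.. code-block:: bash

    # make the CUDA toolchain and libraries visible (paths are assumptions)
    export PATH="/usr/local/cuda/bin:$PATH"
    export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"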

@@ -62,7 +63,7 @@ Or on Mac OS X:
Installation
~~~~~~~~~~~~

-We recommend installing Neon within a `virtual
+We recommend installing neon within a `virtual
environment <http://docs.python-guide.org/en/latest/dev/virtualenvs/>`__
to ensure a self-contained environment. To install neon within an
already existing virtual environment, see the System-wide Install section.
@@ -76,7 +77,10 @@ setup neon in this manner, run the following commands:
cd neon; make
This will install the files in the ``neon/.venv/`` directory and will use the python version in the
-default PATH. To instead force a Python2 or Python3 install, supply this as an optional parameter:
+default PATH. Note that neon automatically downloads the released MKLML library, which
+provides MKL support.
+
+To instead force a Python2 or Python3 install, supply this as an optional parameter:

.. code-block:: bash
@@ -95,13 +99,23 @@ To activate the virtual environment, type
. .venv/bin/activate
You will see the prompt change to reflect the activated environment. To
-start Neon and run the MNIST multi-layer perceptron example (the "Hello
+start neon and run the MNIST multi-layer perceptron example (the "Hello
World" of deep learning), enter

.. code-block:: bash
examples/mnist_mlp.py
For better performance on Intel CPUs, start neon and run the MNIST multi-layer
perceptron example with ``-b mkl``

.. code-block:: bash
examples/mnist_mlp.py -b mkl
.. note::
To achieve best performance, we recommend setting KMP_AFFINITY and OMP_NUM_THREADS as follows: ``export KMP_AFFINITY=compact,1,0,granularity=fine`` and ``export OMP_NUM_THREADS=<Number of Physical Cores>``. You can set these environment variables in bash and run ``source ~/.bashrc`` to activate them. You may need to activate the virtual environment again after sourcing bashrc. For detailed information about KMP_AFFINITY, please read here: https://software.intel.com/en-us/node/522691. We encourage users to experiment with these thread affinity configurations to achieve even better performance.
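
As a minimal sketch (the core count of 16 is an assumption; substitute your machine's value), the settings can be persisted in ``~/.bashrc``:

.. code-block:: bash

    # append MKL threading settings to ~/.bashrc and activate them
    echo 'export OMP_NUM_THREADS=16' >> ~/.bashrc
    echo 'export KMP_AFFINITY=compact,1,0,granularity=fine' >> ~/.bashrc
    source ~/.bashrc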

When you are finished, remember to deactivate the environment

.. code-block:: bash
@@ -125,7 +139,7 @@ the guide at http://docs.python-guide.org/en/latest/dev/virtualenvs/.
System-wide install
~~~~~~~~~~~~~~~~~~~

-If you would prefer not to use a new virtual environment, Neon can be
+If you would prefer not to use a new virtual environment, neon can be
installed system-wide with

.. code-block:: bash
@@ -170,7 +184,7 @@ Docker

If you would prefer having a containerized installation of neon and its
dependencies, the open source community has contributed the following
-Docker images (note that these are not supported/maintained by Nervana):
+Docker images (note that these are not supported/maintained by Intel Nervana):

- `neon (CPU only) <https://hub.docker.com/r/kaixhin/neon/>`__
- `neon (GPU) <https://hub.docker.com/r/kaixhin/cuda-neon/>`__
4 changes: 2 additions & 2 deletions doc/source/mnist.rst
@@ -85,7 +85,7 @@ iterators can be obtained with the following code:
Model specification
-------------------

-Training a deep learning model in Neon requires specifying the dataset,
+Training a deep learning model in neon requires specifying the dataset,
a list of layers, a cost function, and the learning rule. Here we guide
you through each item in turn.

@@ -244,7 +244,7 @@ Next steps
~~~~~~~~~~

This simple example guides you through the basic operations needed to
-create and fit a neural network. However, Neon contains a rich feature
+create and fit a neural network. However, neon contains a rich feature
set of customizable layers, metrics, and options. To learn more, we
recommend reading through the :doc:`CIFAR10 tutorial <cifar10>`,
which introduces convolutional neural networks.
22 changes: 22 additions & 0 deletions doc/source/previous_versions.rst
@@ -17,6 +17,26 @@
Previous Versions
=================

neon v1.9.0
-----------

|Docs190|_

neon v1.9.0, released May 3, 2017, supporting:

* Add support for 3D deconvolution
* Generative Adversarial Networks (GAN) implementation, and MNIST DCGAN example, following GoodFellow 2014 (http://arXiv.org/abs/1406.2661)
* Implement Wasserstein GAN cost function and make associated API changes for GAN models
* Add a new benchmarking script with per-layer timings
* Add weight clipping for GDM, RMSProp, Adagrad, Adadelta and Adam optimizers
* Make multicost an explicit choice in mnist_branch.py example
* Enable NMS kernels to work with normalized boxes and offset
* Fix missing links in api.rst [#366]
* Fix docstring for --datatype option to neon [#367]
* Fix perl shebang in maxas.py and allow for build with numpy 1.12 [#356]
* Replace os.path.join for Windows interoperability [#351]
* Update aeon to 0.2.7 to fix a seg fault on termination

neon v1.8.2
-----------

@@ -415,6 +435,7 @@ neon v0.8.1

Initial public release of neon.

.. |Docs190| replace:: Docs
.. |Docs182| replace:: Docs
.. |Docs181| replace:: Docs
.. |Docs180| replace:: Docs
@@ -440,6 +461,7 @@ Initial public release of neon.
.. |Docs9| replace:: Docs
.. |Docs8| replace:: Docs
.. _cudanet: https://github.com/NervanaSystems/cuda-convnet2
.. _Docs190: http://neon.nervanasys.com/docs/1.9.0
.. _Docs182: http://neon.nervanasys.com/docs/1.8.2
.. _Docs181: http://neon.nervanasys.com/docs/1.8.1
.. _Docs180: http://neon.nervanasys.com/docs/1.8.0
6 changes: 4 additions & 2 deletions doc/source/running_models.rst
@@ -18,11 +18,11 @@ Running models

With the virtual environment activated, there are two ways to run models
through neon. The first is to simply execute the python script
-containing the model, as mentioned before:
+containing the model (with ``-b mkl``), as mentioned before:

.. code-block:: bash
-examples/mnist_mlp.py
+examples/mnist_mlp.py -b mkl
This will run the multilayer perceptron (MLP) model and print the final
misclassification error after 10 training epochs. On the first run, neon will download the MNIST dataset. It will create a ``~/nervana`` directory where the raw datasets are kept. The data directory can be controlled with the ``-w`` flag.
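
For instance, the MKL backend can be combined with a custom data directory (the path below is an assumption):

.. code-block:: bash

    # keep datasets under /data/neon instead of ~/nervana (path is an assumption)
    examples/mnist_mlp.py -b mkl -w /data/neon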
@@ -36,6 +36,8 @@ file for the MLP example, enter from the neon repository directory:
neon examples/mnist_mlp.yaml
In a YAML file, the mkl backend can be specified by adding ``backend: mkl``.
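As a minimal sketch (a hypothetical excerpt; the real example files contain full model definitions):

.. code-block:: yaml

    # only the backend line is needed to select MKL
    backend: mkl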

Arguments
---------

3 changes: 3 additions & 0 deletions doc/source/tutorials.rst
@@ -15,6 +15,9 @@ visualizing the results. We recommend reading the section on the neon
* :doc:`Tutorial 4 <creating_new_layers>`: Creating new layers
* :doc:`Tutorial 5 <tools>`: Visualizing the results

Since neon v2.0.0+ now ships with MKL backend support, we encourage users
to add ``-b mkl`` on Intel CPUs for all of the tutorial examples, as shown below.
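
For example, the MNIST and CIFAR10 tutorial scripts can be launched as follows (the script paths are assumptions based on the stock ``examples`` directory):

.. code-block:: bash

    examples/mnist_mlp.py -b mkl
    examples/cifar10_allcnn.py -b mkl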

.. toctree::
:hidden:
:maxdepth: 0
2 changes: 1 addition & 1 deletion neon/data/datasets.py
@@ -129,7 +129,7 @@ def fetch_dataset(url, sourcefile, destfile, totalsz):
destfile (str): Path to the destination.
totalsz (int): Size of the file to be downloaded.
"""
-req = Request('/'.join([url, sourcefile]), headers={'User-Agent': 'neon'})
+req = Request(os.path.join(url, sourcefile), headers={'User-Agent': 'neon'})
# backport https limitation and workaround per http://python-future.org/imports.html
cloudfile = urlopen(req)
neon_logger.display("Downloading file: {}".format(destfile))
2 changes: 1 addition & 1 deletion setup.py
@@ -18,7 +18,7 @@
import subprocess

# Define version information
-VERSION = '1.9.0'
+VERSION = '2.0.0'
FULLVERSION = VERSION
write_version = True
