update dependency version (#3895)
* add torch-ccl into compile bundle

* fix dead link in doc

* update footer link

* update deepspeed dependency version, remove cpu related md files from build_doc.sh

* add xpu perf

* version to 2.1.20

* fix example import

* update torch ccl version

* add mpi path in the scripts

* update dependency version

* move known issue to tutorial repo

* update known issue link

* add note that CPU features are not included

* update log version

* update feature and example doc

* update model zoo version

* add paper to publications

* remove cheat sheet

---------

Co-authored-by: Zheng, Zhaoqiong <zhaoqiong.zheng@intel.com>
Co-authored-by: Ye Ting <ting.ye@intel.com>
3 people authored Mar 30, 2024
1 parent 716d786 commit cc1a83e
Showing 24 changed files with 222 additions and 419 deletions.
14 changes: 7 additions & 7 deletions dependency_version.yml
@@ -4,21 +4,21 @@ gcc:
 llvm:
   version: 16.0.6
 pytorch:
-  version: 2.1.0a0
+  version: 2.1.0.post0+cxx11.abi
   commit: v2.1.0
 torchaudio:
-  version: 2.1.0a0
+  version: 2.1.0.post0+cxx11.abi
   commit: v2.1.0
 torchvision:
-  version: 0.16.0a0
+  version: 0.16.0.post0+cxx11.abi
   commit: v0.16.0
 torch-ccl:
   repo: https://github.com/intel/torch-ccl.git
-  commit: 5f20135ccf8f828738cb3bc5a5ae7816df8100ae
-  version: 2.1.100+xpu
+  commit: 5ee65b42c42a0d91c4cf459d9be40020274003b6
+  version: 2.1.200+xpu
 deepspeed:
   repo: https://github.com/microsoft/DeepSpeed.git
-  version:
+  version: v0.11.2
   commit: 4fc181b01077521ba42379013ce91a1c294e5d8e
 intel-extension-for-deepspeed:
   repo: https://github.com/intel/intel-extension-for-deepspeed.git
@@ -28,7 +28,7 @@ transformers:
   commit: v4.31.0
 protobuf:
   version: 3.20.3
-llm_eval:
+lm_eval:
   version: 0.3.0
 basekit:
   dpcpp-cpp-rt:
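The file is plain YAML, so tooling can consume these pins directly. A minimal sketch (hypothetical, not part of this commit; assumes PyYAML is installed):

```python
# Hypothetical snippet, not from this commit: look up the pinned
# versions in dependency_version.yml. Requires PyYAML (pip install pyyaml).
import yaml

with open("dependency_version.yml") as f:
    deps = yaml.safe_load(f)

print(deps["pytorch"]["version"])    # 2.1.0.post0+cxx11.abi
print(deps["torch-ccl"]["version"])  # 2.1.200+xpu
print(deps["deepspeed"]["commit"])   # pinned DeepSpeed commit hash
```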
3 changes: 3 additions & 0 deletions docs/_static/custom.css
@@ -15,6 +15,9 @@
 a#wap_dns {
   display: none;
 }
+a#wap_nac {
+  display: none;
+}
 
 /* replace the copyright to eliminate the copyright symbol enforced by
    the ReadTheDocs theme */
2 changes: 1 addition & 1 deletion docs/_templates/footer.html
@@ -1,3 +1,3 @@
 {% extends '!footer.html' %} {% block extrafooter %} {{super}}
-<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a> <a data-wap_ref='dns' id='wap_dns' href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html'>| Do Not Share My Personal Information</a> </div> <p></p> <div>&copy; Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that code included in this document is licensed subject to the Zero-Clause BSD open source license (OBSD), <a href='http://opensource.org/licenses/0BSD'>http://opensource.org/licenses/0BSD</a>. </div>
+<p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a> <a href="/#" data-wap_ref="dns" id="wap_dns"><small>Your Privacy Choices</small></a> <a href=https://www.intel.com/content/www/us/en/privacy/privacy-residents-certain-states.html data-wap_ref="nac" id="wap_nac"><small>Notice at Collection</small></a> </div> <p></p> <div>&copy; Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document, with the sole exception that code included in this document is licensed subject to the Zero-Clause BSD open source license (OBSD), <a href='http://opensource.org/licenses/0BSD'>http://opensource.org/licenses/0BSD</a>. </div>
 {% endblock %}
9 changes: 4 additions & 5 deletions docs/index.rst
@@ -15,7 +15,7 @@
 The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing ``intel_extension_for_pytorch``.
 
 .. note::
-
+   - CPU features are not included in GPU-only packages.
    - GPU features are not included in CPU-only packages.
    - Optimizations for CPU-only may have a newer code base due to different development schedules.
 
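As a minimal sketch of that dynamic enabling (illustrative, not text from this commit; it assumes a working XPU installation and driver):

```python
import torch
import intel_extension_for_pytorch as ipex  # the import itself enables the extension

# After the import, the XPU device is registered with PyTorch
# (assumes an Intel GPU and a matching driver are present).
print(ipex.__version__)
print(torch.xpu.is_available())
```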
@@ -26,8 +26,8 @@
 
 You can find more information about the product at:
 
-- `Features <https://intel.github.io/intel-extension-for-pytorch/gpu/latest/tutorials/features>`_
-- `Performance <./tutorials/performance.html>`_
+- `Features <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/features>`_
+- `Performance <https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/performance>`_
 
 Architecture
 ------------
@@ -62,7 +62,7 @@
    tutorials/performance
    tutorials/technical_details
    tutorials/releases
-   tutorials/performance_tuning/known_issues
+   tutorials/known_issues
    tutorials/blogs_publications
    tutorials/license
 
@@ -74,7 +74,6 @@
    tutorials/installation
    tutorials/getting_started
    tutorials/examples
-   tutorials/cheat_sheet
 
 .. toctree::
    :maxdepth: 3
5 changes: 2 additions & 3 deletions docs/tutorials/api_doc.rst
@@ -9,7 +9,7 @@ Device-Agnostic
 .. autofunction:: optimize_transformers
 .. autofunction:: get_fp32_math_mode
 .. autofunction:: set_fp32_math_mode
-.. autoclass:: verbose
+
 
 GPU-Specific
 ************
@@ -43,8 +43,7 @@ Miscellaneous
 
 .. currentmodule:: intel_extension_for_pytorch.xpu.fp8.fp8
 .. autofunction:: fp8_autocast
-.. currentmodule:: intel_extension_for_pytorch.quantization
-.. autofunction:: _gptq
+
 
 Random Number Generator
 =======================
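For orientation, the device-agnostic functions listed above are called through the top-level module. A hedged sketch (verify exact signatures against the rendered API reference):

```python
import intel_extension_for_pytorch as ipex

# Illustrative use of the math-mode accessors documented above;
# signatures should be checked against the rendered API reference.
ipex.set_fp32_math_mode(mode=ipex.FP32MathMode.TF32, device="xpu")
print(ipex.get_fp32_math_mode(device="xpu"))
```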
1 change: 1 addition & 0 deletions docs/tutorials/blogs_publications.md
@@ -1,6 +1,7 @@
 Blogs & Publications
 ====================
 
+* [LLM inference solution on Intel GPU, Dec 2023](https://arxiv.org/abs/2401.05391)
 * [Accelerate Llama 2 with Intel AI Hardware and Software Optimizations, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/news/llama2.html)
 * [Accelerate PyTorch\* Training and Inference Performance using Intel® AMX, Jul 2023](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-pytorch-training-inference-on-amx.html)
 * [Intel® Deep Learning Boost (Intel® DL Boost) - Improve Inference Performance of Hugging Face BERT Base Model in Google Cloud Platform (GCP) Technology Guide, Apr 2023](https://networkbuilders.intel.com/solutionslibrary/intel-deep-learning-boost-intel-dl-boost-improve-inference-performance-of-hugging-face-bert-base-model-in-google-cloud-platform-gcp-technology-guide)
23 changes: 0 additions & 23 deletions docs/tutorials/cheat_sheet.md

This file was deleted.

36 changes: 18 additions & 18 deletions docs/tutorials/examples.md
@@ -4,8 +4,6 @@ Examples
 These examples will help you get started using Intel® Extension for PyTorch\*
 with Intel GPUs.
 
-For examples on Intel CPUs, check the [CPU examples](../../../cpu/latest/tutorials/examples.html).
-
 **Prerequisites**:
 Before running these examples, install the `torchvision` and `transformers` Python packages.

@@ -27,7 +25,7 @@
 To use Intel® Extension for PyTorch\* on training, you need to make the following changes in your code:
 
 1. Import `intel_extension_for_pytorch` as `ipex`.
-2. Use the `ipex.optimize` function, which applies optimizations against the model object, as well as an optimizer object.
+2. Use the `ipex.optimize` function for an additional performance boost; it applies optimizations to both the model object and an optimizer object.
 3. Use Auto Mixed Precision (AMP) with BFloat16 data type.
 4. Convert input tensors, loss criterion and model to XPU, as shown below:

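A minimal sketch of those four steps (illustrative only: the model, optimizer, and data are hypothetical stand-ins, and an XPU-enabled installation is assumed):

```python
import torch
import intel_extension_for_pytorch as ipex

# Hypothetical toy model, loss, and optimizer for illustration.
model = torch.nn.Linear(128, 10)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Step 4: move the model and loss criterion to the XPU device.
model = model.to("xpu")
criterion = criterion.to("xpu")

# Step 2: apply ipex.optimize to the model together with its optimizer.
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

# Step 4 (continued): input tensors on the XPU device.
data = torch.rand(32, 128, device="xpu")
target = torch.randint(0, 10, (32,), device="xpu")

optimizer.zero_grad()
# Step 3: run the forward pass under BFloat16 Auto Mixed Precision.
with torch.xpu.amp.autocast(dtype=torch.bfloat16):
    output = model(data)
    loss = criterion(output, target)
loss.backward()
optimizer.step()
```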
@@ -219,18 +217,20 @@
 
 If *Found IPEX* is shown as dynamic library paths, the extension was linked into the binary. This can be verified with the Linux command *ldd*.
 
+The values of x, y, and z in the following log will change depending on the version you choose.
+
 ```bash
 $ CC=icx CXX=icpx cmake -DCMAKE_PREFIX_PATH=/workspace/libtorch ..
--- The C compiler identification is IntelLLVM 2024.0.0
--- The CXX compiler identification is IntelLLVM 2024.0.0
+-- The C compiler identification is IntelLLVM 202x.y.z
+-- The CXX compiler identification is IntelLLVM 202x.y.z
 -- Detecting C compiler ABI info
 -- Detecting C compiler ABI info - done
--- Check for working C compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icx - skipped
+-- Check for working C compiler: /workspace/intel/oneapi/compiler/202x.y.z/linux/bin/icx - skipped
 -- Detecting C compile features
 -- Detecting C compile features - done
 -- Detecting CXX compiler ABI info
 -- Detecting CXX compiler ABI info - done
--- Check for working CXX compiler: /workspace/intel/oneapi/compiler/2024.0.0/linux/bin/icpx - skipped
+-- Check for working CXX compiler: /workspace/intel/oneapi/compiler/202x.y.z/linux/bin/icpx - skipped
 -- Detecting CXX compile features
 -- Detecting CXX compile features - done
 -- Looking for pthread.h
@@ -252,16 +252,16 @@ $ ldd example-app
 libintel-ext-pt-cpu.so => /workspace/libtorch/lib/libintel-ext-pt-cpu.so (0x00007fd5a1a1b000)
 libintel-ext-pt-gpu.so => /workspace/libtorch/lib/libintel-ext-pt-gpu.so (0x00007fd5862b0000)
 ...
-libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
-libmkl_core.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
-libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
-libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/2024.0.0/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
-libOpenCL.so.1 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
-libsvml.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
-libirng.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
-libimf.so => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
-libintlc.so.5 => /workspace/intel/oneapi/compiler/2024.0.0/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
-libsycl.so.6 => /workspace/intel/oneapi/compiler/2024.0.0/linux/lib/libsycl.so.6 (0x00007fd552f36000)
+libmkl_intel_lp64.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_intel_lp64.so.2 (0x00007fd584ab0000)
+libmkl_core.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_core.so.2 (0x00007fd5806cc000)
+libmkl_gnu_thread.so.2 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_gnu_thread.so.2 (0x00007fd57eb1d000)
+libmkl_sycl.so.3 => /workspace/intel/oneapi/mkl/202x.y.z/lib/intel64/libmkl_sycl.so.3 (0x00007fd55512c000)
+libOpenCL.so.1 => /workspace/intel/oneapi/compiler/202x.y.z/linux/lib/libOpenCL.so.1 (0x00007fd55511d000)
+libsvml.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fd553b11000)
+libirng.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libirng.so (0x00007fd553600000)
+libimf.so => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libimf.so (0x00007fd55321b000)
+libintlc.so.5 => /workspace/intel/oneapi/compiler/202x.y.z/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fd553a9c000)
+libsycl.so.6 => /workspace/intel/oneapi/compiler/202x.y.z/linux/lib/libsycl.so.6 (0x00007fd552f36000)
 ...
 ```

@@ -286,4 +286,4 @@
 
 ## Intel® AI Reference Models
 
-Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v2.12.0) (former Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v2.12.0#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.
+Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/IntelAI/models/tree/v3.1.1) (formerly Model Zoo). A number of PyTorch use cases for benchmarking are also available in the [Use Cases](https://github.com/IntelAI/models/tree/v3.1.1?tab=readme-ov-file#use-cases) section. Models verified on Intel GPUs are marked in the `Model Documentation` column. You can get performance benefits out of the box by simply running scripts in the Intel® AI Reference Models.
25 changes: 11 additions & 14 deletions docs/tutorials/features.rst
@@ -1,8 +1,8 @@
 Features
 ========
 
-Device-Agnostic
-***************
+GPU-Specific
+************
 
 Easy-to-use Python API
 ----------------------
@@ -46,16 +46,15 @@ Quantization
 
 Intel® Extension for PyTorch* currently supports imperative mode and TorchScript mode for post-training static quantization on GPU. This section illustrates the quantization workflow on Intel GPUs.
 
-Check more detailed information for `INT8 Quantization [XPU] <features/int8_overview_xpu.md>`_.
+Check more detailed information for `INT8 Quantization <features/int8_overview_xpu.md>`_.
 
-On Intel® GPUs, Intel® Extension for PyTorch* also provides INT4 and FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_ and `INT4 Quantization <./features/int4.md>`_
+On Intel® GPUs, Intel® Extension for PyTorch* also provides FP8 Quantization. Check more detailed information for `FP8 Quantization <./features/float8.md>`_.
 
 .. toctree::
    :hidden:
    :maxdepth: 1
 
    features/int8_overview_xpu
-   features/int4
   features/float8


@@ -74,9 +73,6 @@
    features/horovod
 
 
-GPU-Specific
-************
-
 DLPack Solution
 ---------------

@@ -131,11 +127,12 @@ For more detailed information, check `FSDP <features/FSDP.md>`_.
 
    features/FSDP
 
-Inductor
---------
-
-For more detailed information, check `Inductor <features/torch_compile_gpu.md>`_.
+torch.compile for GPU (Beta)
+----------------------------
+
+Intel® Extension for PyTorch\* now empowers users to seamlessly harness graph compilation capabilities for optimal PyTorch model performance on Intel GPU via the flagship `torch.compile <https://pytorch.org/docs/stable/generated/torch.compile.html#torch-compile>`_ API through the default "inductor" backend (`TorchInductor <https://dev-discuss.pytorch.org/t/torchinductor-a-pytorch-native-compiler-with-define-by-run-ir-and-symbolic-shapes/747/1>`_).
+
+For more detailed information, check `torch.compile for GPU <features/torch_compile_gpu.md>`_.
 
 .. toctree::
    :hidden:
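Usage follows stock `torch.compile`. A short sketch, assuming an XPU-enabled build (model and shapes are illustrative):

```python
import torch
import intel_extension_for_pytorch  # registers the XPU backend used by torch.compile

# Hypothetical toy model for illustration.
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).to("xpu")
compiled_model = torch.compile(model)  # default "inductor" backend

x = torch.randn(8, 64, device="xpu")
y = compiled_model(x)  # first call compiles; later calls reuse the compiled graph
```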
@@ -144,7 +141,7 @@
 
    features/torch_compile_gpu
 
 Legacy Profiler Tool (Prototype)
------------------------------------
+--------------------------------
 
 The legacy profiler tool is an extension of PyTorch* legacy profiler for profiling operators' overhead on XPU devices. With this tool, you can get the information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch* with profiler support as default and enable this tool by adding a `with` statement before the code segment.

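Enabling it looks roughly like this (a sketch of the prototype's documented pattern; confirm the exact flag names on the feature page):

```python
import torch
import intel_extension_for_pytorch  # builds ship with profiler support by default

# Hypothetical workload for illustration.
model = torch.nn.Linear(64, 64).to("xpu")
x = torch.randn(8, 64, device="xpu")

# Wrap the code segment of interest in the profiler's `with` statement;
# `use_xpu` is the prototype's XPU switch (confirm on the feature page).
with torch.autograd.profiler_legacy.profile(use_xpu=True) as prof:
    y = model(x)

print(prof.key_averages().table(sort_by="self_xpu_time_total"))
```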
@@ -157,7 +154,7 @@
 
    features/profiler_legacy
 
 Simple Trace Tool (Prototype)
---------------------------------
+-----------------------------
 
 Simple Trace is a built-in debugging tool that lets you control printing out the call stack for a piece of code. Once enabled, it can automatically print out verbose messages of called operators in a stack format with indenting to distinguish the context.

@@ -170,7 +167,7 @@
 
    features/simple_trace
 
 Kineto Supported Profiler Tool (Prototype)
---------------------------------------------
+------------------------------------------
 
 The Kineto supported profiler tool is an extension of PyTorch\* profiler for profiling operators' executing time cost on GPU devices. With this tool, you can get information in many fields of the run models or code scripts. Build Intel® Extension for PyTorch\* with Kineto support as default and enable this tool using the `with` statement before the code segment.
