Support loading checkpoints quantized using Autofp8 #286

Merged: 28 commits, Sep 25, 2024

Commits on Sep 16, 2024

  1. Inc on vLLM - Split qk and v calculations

    nirda7 authored and Yantom1 committed Sep 16, 2024
    a6f8dee
  2. 23e931b
  3. ruff fixes

    Yantom1 committed Sep 16, 2024
    363de3c
  4. ruff fixes

    Yantom1 committed Sep 16, 2024
    e4fc78b
  5. isort fixes

    Yantom1 committed Sep 16, 2024
    d165c6e
  6. ruff format

    Yantom1 committed Sep 16, 2024
    6f0016b
  7. 7f587eb
  8. isort fixes

    Yantom1 committed Sep 16, 2024
    c204f3f
  9. yapf fixes

    Yantom1 committed Sep 16, 2024
    2e00486

Commits on Sep 17, 2024

  1. revert commit

    Yantom1 committed Sep 17, 2024
    0f40204
  2. cd24505

Commits on Sep 18, 2024

  1. Revert "Inc on vLLM - Split qk and v calculations"

    This reverts commit a6f8dee.
    Yantom1 committed Sep 18, 2024
    343b533
  2. formnat.sh

    Yantom1 committed Sep 18, 2024
    8657c4c
  3. delete ops.py

    Yantom1 committed Sep 18, 2024
    6b485fb
  4. fix imports

    Yantom1 committed Sep 18, 2024
    2e603ea
  5. isort fix

    Yantom1 committed Sep 18, 2024
    a7a036a

Commits on Sep 19, 2024

  1. 2b4a196

Commits on Sep 23, 2024

  1. pr fix

    Yantom1 committed Sep 23, 2024
    454acc9

Commits on Sep 24, 2024

  1. 26d8321
  2. Update fp8.py

    Yantom1 authored Sep 24, 2024
    e92abd6
  3. Update fused_moe.py

    Yantom1 authored Sep 24, 2024
    f150851
  4. c7dcbbc

Commits on Sep 25, 2024

  1. Update compressed_tensors.py

    Yantom1 authored Sep 25, 2024
    3e8762e
  2. 5726801
  3. Update llama.py

    Yantom1 authored Sep 25, 2024
    426e8e1
  4. 4cf34f4
  5. Update vllm/model_executor/layers/quantization/fp8.py

    Co-authored-by: Konrad Zawora <kzawora@habana.ai>
    Yantom1 and kzawora-intel authored Sep 25, 2024
    f58d4c1
  6. Update fp8.py

    Yantom1 authored Sep 25, 2024
    db9affe