You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
C:\Windows\System32>python
Python 3.12.5 (tags/v3.12.5:ff3bc82, Aug 6 2024, 20:45:27) [MSC v.1940 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
from airllm import AutoModel
bitsandbytes installed
cache_utils installed
MAX_LENGTH=150
from huggingface_hub.hf_api import HfFolder
HfFolder.save_token('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')
model = AutoModel.from_pretrained("nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",compression='4bit')
Fetching 8 files: 100%|█████████████████████████████████████████████████████████████████| 8/8 [00:00<00:00, 255.68it/s]
found_layers:{'model.embed_tokens.': True, 'model.layers.0.': True, 'model.layers.1.': True, 'model.layers.2.': True, 'model.layers.3.': True, 'model.layers.4.': True, 'model.layers.5.': True, 'model.layers.6.': True, 'model.layers.7.': True, 'model.layers.8.': True, 'model.layers.9.': True, 'model.layers.10.': True, 'model.layers.11.': True, 'model.layers.12.': True, 'model.layers.13.': True, 'model.layers.14.': True, 'model.layers.15.': True, 'model.layers.16.': True, 'model.layers.17.': True, 'model.layers.18.': True, 'model.layers.19.': True, 'model.layers.20.': True, 'model.layers.21.': True, 'model.layers.22.': True, 'model.layers.23.': True, 'model.layers.24.': True, 'model.layers.25.': True, 'model.layers.26.': True, 'model.layers.27.': True, 'model.layers.28.': True, 'model.layers.29.': True, 'model.layers.30.': True, 'model.layers.31.': True, 'model.layers.32.': True, 'model.layers.33.': True, 'model.layers.34.': True, 'model.layers.35.': True, 'model.layers.36.': True, 'model.layers.37.': True, 'model.layers.38.': True, 'model.layers.39.': True, 'model.layers.40.': True, 'model.layers.41.': True, 'model.layers.42.': True, 'model.layers.43.': True, 'model.layers.44.': True, 'model.layers.45.': True, 'model.layers.46.': True, 'model.layers.47.': True, 'model.layers.48.': True, 'model.layers.49.': True, 'model.layers.50.': True, 'model.layers.51.': True, 'model.layers.52.': True, 'model.layers.53.': True, 'model.layers.54.': True, 'model.layers.55.': True, 'model.layers.56.': True, 'model.layers.57.': True, 'model.layers.58.': True, 'model.layers.59.': True, 'model.layers.60.': True, 'model.layers.61.': True, 'model.layers.62.': True, 'model.layers.63.': True, 'model.layers.64.': True, 'model.layers.65.': True, 'model.layers.66.': True, 'model.layers.67.': True, 'model.layers.68.': True, 'model.layers.69.': True, 'model.layers.70.': True, 'model.layers.71.': True, 'model.layers.72.': True, 'model.layers.73.': True, 'model.layers.74.': True, 'model.layers.75.': True, 'model.layers.76.': True, 'model.layers.77.': True, 'model.layers.78.': True, 'model.layers.79.': True, 'model.norm.': True, 'lm_head.': True}
saved layers already found in G:\nvidiallama-3_1-nemotron-70b-instruct\models--nvidia--Llama-3.1-Nemotron-70B-Instruct-HF\snapshots\fac73d3507320ec1258620423469b4b38f88df6e\splitted_model.4bit
The class optimum.bettertransformers.transformation.BetterTransformer is deprecated and will be removed in a future release.
new version of transfomer, no need to use BetterTransformer, try setting attn impl to sdpa...
attn imp: <class 'transformers.models.llama.modeling_llama.LlamaSdpaAttention'> not support prefetching for compression for now. loading with no prepetching mode.
The text was updated successfully, but these errors were encountered:
C:\Windows\System32>python
Python 3.12.5 (tags/v3.12.5:ff3bc82, Aug 6 2024, 20:45:27) [MSC v.1940 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
The text was updated successfully, but these errors were encountered: