Unable to load models with adapter weights in offline mode #31700

Closed

amyeroberts opened this issue Jun 28, 2024 · 6 comments · Fixed by huggingface/peft#1976

@amyeroberts (Collaborator)

System Info

  • transformers version: 4.42.0.dev0
  • Platform: Linux-5.15.0-1045-aws-x86_64-with-glibc2.31
  • Python version: 3.10.9
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.31.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cu121 (True)
  • Tensorflow version (GPU?): 2.14.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
  • Jax version: 0.4.13
  • JaxLib version: 0.4.13
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: No
  • GPU type: NVIDIA A10G

Who can help?

Probably me @amyeroberts or @ArthurZucker.

PEFT weight loading code was originally added by @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Unable to load models in offline mode, even when the adapter weights are cached locally.

import os
import torch

# Enable offline mode before importing transformers, so no Hub requests are made.
os.environ['HF_HUB_OFFLINE'] = '1'

from transformers import AutoModelForCausalLM

# haoranxu/ALMA-13B-R ships adapter weights, so this also goes through the
# PEFT adapter loading path inside from_pretrained.
model = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-13B-R",
    torch_dtype=torch.float16,
    device_map="auto",
    local_files_only=True
)

This model uses haoranxu/ALMA-13B-Pretrain as adapter weights.

If you first load the model so that the model and adapter weights are available in the cache, and then re-run in offline mode, the following error occurs:

Traceback (most recent call last):
  File "/home/ubuntu/transformers/../scripts/debug_31552_load_without_safetensors.py", line 8, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/ubuntu/transformers/src/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/ubuntu/transformers/src/transformers/modeling_utils.py", line 3907, in from_pretrained
    model.load_adapter(
  File "/home/ubuntu/transformers/src/transformers/integrations/peft.py", line 201, in load_adapter
    adapter_state_dict = load_peft_weights(peft_model_id, token=token, device=device, **adapter_kwargs)
  File "/data/ml/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 297, in load_peft_weights
    has_remote_safetensors_file = file_exists(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2641, in file_exists
    get_hf_file_metadata(url, token=token)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
    r = _request_wrapper(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
    response = _request_wrapper(
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 395, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/data/ml/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/ml/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/data/ml/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 77, in send
    raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach https://huggingface.co/haoranxu/ALMA-13B-R/resolve/main/adapter_model.safetensors: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

Expected behavior

The model can be loaded in both online and offline mode.

@ArthurZucker (Collaborator)

🫠 sounds like kwargs getting lost maybe?

@amyeroberts (Collaborator, Author)

It's being triggered here in the PEFT library (cc @BenjaminBossan).

Essentially, the path that gets built assumes that if the adapter weights path is local, it is of the form model_id/adapter_model.safetensors. However, if we've already downloaded the model, it will be under path/to/cache/.cache/huggingface/hub/models--{REPO_ID}-{MODEL_ID}/snapshots/{COMMIT_REF}/{WEIGHT_NAME}.safetensors
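
For illustration, a minimal sketch of the two places the adapter file can actually live (the plain local-directory form the current check assumes, versus the hub cache), using huggingface_hub's try_to_load_from_cache; the repo ID and filename are just the ones from the reproduction above:

import os

from huggingface_hub import try_to_load_from_cache

peft_model_id = "haoranxu/ALMA-13B-R"
weight_name = "adapter_model.safetensors"

# Form 1: a plain local directory, e.g. ./haoranxu/ALMA-13B-R/adapter_model.safetensors
local_path = os.path.join(peft_model_id, weight_name)
print("local file exists:", os.path.isfile(local_path))

# Form 2: the hub cache, e.g.
# ~/.cache/huggingface/hub/models--haoranxu--ALMA-13B-R/snapshots/<commit>/adapter_model.safetensors
cached = try_to_load_from_cache(repo_id=peft_model_id, filename=weight_name)
print("cached file:", cached if isinstance(cached, str) else None)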

@BenjaminBossan (Member)

Thanks for flagging this. Indeed, this breaks in offline mode. @Wauplin do you have a suggestion how we can correctly check if the file has already been locally cached?

@LysandreJik (Member)

cc @Wauplin if you have the bandwidth :)

@Wauplin (Contributor) commented Jul 29, 2024

@Wauplin do you have a suggestion how we can correctly check if the file has already been locally cached?

The easiest way to do that is to use hf_hub_download(..., local_files_only=True) in a try/except statement. If a huggingface_hub.utils.LocalEntryNotFoundError error is raised, then the file has not been cached locally. Would that work for you?
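
As a rough sketch of that suggestion (not the actual PEFT fix; the repo ID and filename are just the ones from the reproduction above):

from huggingface_hub import hf_hub_download
from huggingface_hub.utils import LocalEntryNotFoundError

peft_model_id = "haoranxu/ALMA-13B-R"
weight_name = "adapter_model.safetensors"

try:
    # Only look in the local cache, never the network.
    filename = hf_hub_download(peft_model_id, weight_name, local_files_only=True)
    has_local_safetensors_file = True
except LocalEntryNotFoundError:
    # Not cached locally; it may or may not exist on the Hub.
    has_local_safetensors_file = False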

And sorry I missed this notification 🙈

@BenjaminBossan (Member)

I worked on a fix: huggingface/peft#1976. It resolves the issue for me, but I had trouble unit testing it, as dynamically setting offline mode in the unit test seems to have no effect :( I think it would still be okay to merge the fix without a test, but if anyone has an idea how to test it correctly, please let me know.
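
For context on why toggling offline mode dynamically tends not to work: huggingface_hub reads HF_HUB_OFFLINE once at import time, so setting the environment variable inside a test usually has no effect by then. One workaround is to patch the network-touching call at the spot where the code under test imports it; this is only a sketch, and the patch target below is an assumption based on the traceback above:

import os
from unittest import mock

from huggingface_hub.errors import OfflineModeIsEnabled

# Usually too late to matter: huggingface_hub already read HF_HUB_OFFLINE when it was imported.
os.environ["HF_HUB_OFFLINE"] = "1"

# Instead, simulate offline mode by making the specific Hub call raise the same error,
# patched where the caller imported it (assumed target, taken from the traceback above).
with mock.patch(
    "peft.utils.save_and_load.file_exists",
    side_effect=OfflineModeIsEnabled("simulated offline mode"),
):
    pass  # exercise the adapter-loading code path here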
