-
Describe the bug
I'm trying to load a Swin Transformer in Kaggle's offline environment. While it ran without issue yesterday, it now fails to load.
-
@eesoymilk The latest timm versions (0.9.x) use the Hugging Face Hub for weights by default now, and that takes priority over the torch hub cache. Many weights have been remapped for model changes, so it's best to download via the HF Hub. Kaggle really should support passing through the HF Hub, or at least cache it properly, but they seem to have no interest in making things simpler for people, so the madness of manually caching weights in datasets continues...

To override the pretrained location, download the weights file manually from the Hub and try the following:

```python
timm.create_model(
    'swinv2_large_window12to16_192to256',
    pretrained=True,
    pretrained_cfg_overlay=dict(file='path/to/checkpoint'),
)
```

where checkpoint is a .safetensors/.bin/.pth/.pt etc. file.
-
Moving to discussions for future reference.
-
@rwightman can you please point me to the default path where downloaded models are stored in 0.9.x versions?
-
I had a related problem and would like to share the solution. I couldn't find a way to change the model download path (to speed up application deployments) by passing arguments directly; what worked was setting the `HF_HUB_CACHE` environment variable before importing. Here is an example (using timm 1.0.12, torch 2.5.1+cpu, huggingface_hub 0.26.3).

First, let's test using the standard HF cache dir:

```python
import timm
from huggingface_hub import scan_cache_dir

# Load a model to ensure it's downloaded into the standard cache dir
model = timm.create_model("timm/vit_small_patch14_dinov2.lvd142m", pretrained=True, num_classes=0)

# Now check where the downloaded repositories were stored
cache = scan_cache_dir()
for repo in cache.repos:
    print(f"repo {repo.repo_id} stored in {repo.repo_path}")
```

The code above will print the repos stored under the default cache directory (typically `~/.cache/huggingface/hub`).
Now, let's force the download into a custom directory:

```python
!rm -rf /data/models  # Make sure it does not exist, so HF is forced to download the model

import os
os.environ["HF_HUB_CACHE"] = "/data/models/"

# The rest of the code is exactly the same:
import timm
from huggingface_hub import scan_cache_dir

# Load a model to ensure it's downloaded into the custom cache dir
model = timm.create_model("timm/vit_small_patch14_dinov2.lvd142m", pretrained=True, num_classes=0)

# Now check where the downloaded repositories were stored
cache = scan_cache_dir()
for repo in cache.repos:
    print(f"repo {repo.repo_id} stored in {repo.repo_path}")
```

And the output will show the same repo, now stored under `/data/models/`. Note that this has to run in a fresh process: `huggingface_hub` reads `HF_HUB_CACHE` at import time, so setting the variable after the library has been imported has no effect.
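As a quick sanity check (a small sketch; `huggingface_hub.constants` is where the env var gets resolved), you can print the cache directory the library actually picked up:

```python
from huggingface_hub import constants

# Shows the cache directory huggingface_hub resolved at import time
print(constants.HF_HUB_CACHE)
```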
-
FYI @turicas & others, I have a feature issue up to address this: #2338 ... I welcome comments there about the proposed idea from a user-interface aspect. The idea is that you download the model repo ahead of time from the HF Hub, then create the model with an identifier that specifies it's a local folder.

Even right now, you can avoid rooting through the HF Hub cache: download the model using the CLI, then use `pretrained_cfg_overlay` to point to either the `.safetensors` or `.bin` file. You retain more control that way than you do by letting the hub cache manage it. The best use case for the cache is to just let it be if you're on the same machine, switching between offline/online if you're unplugged sometimes. If you are on an isolated network, either share a cache folder that you can sync from another machine, or download models ahead of time explicitly.

To look up the HF Hub repo id without poking around, you can query timm's pretrained config (see the sketch below).
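A minimal sketch of that lookup, assuming a recent timm where `get_pretrained_cfg` is available; the model name is just an example:

```python
from timm.models import get_pretrained_cfg

# Resolve which HF Hub repo backs a given timm model name
cfg = get_pretrained_cfg('swinv2_large_window12to16_192to256')
print(cfg.hf_hub_id)  # repo id to fetch, e.g. with `huggingface-cli download <repo_id>`
```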