Add Beit model and pretrained weights #245

Open
wants to merge 5 commits into
base: main

Conversation

triple-Mu
Contributor

Add Beit model and pretrained weights

tripleMu and others added 2 commits June 30, 2022 15:32
@wzy9813125

1. With oneflow version 0.8.1.dev20220802+cu112, the following error is raised:

Traceback (most recent call last):
  File "/home/deeplearning/wangzheny/vision-beit/train.py", line 16, in <module>
    net = ModelCreator.create_model("beit_base_patch16_224", num_classes=10)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/registry.py", line 53, in create_model
    model = create_fn(pretrained=pretrained, **kwargs)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 411, in beit_base_patch16_224
    model = Beit(**model_kwargs)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 290, in __init__
    [
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 291, in <listcomp>
    Block(
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 160, in __init__
    self.attn = Attention(
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 89, in __init__
    "relative_position_index", gen_relative_position_index(window_size)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 38, in gen_relative_position_index
    relative_coords[:, :, 0] += window_size[0] - 1  # shift to start from 0
  File "/home/deeplearning/miniconda3/envs/onetorch/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 132, in _iadd
    return self.add_(other)
  File "/home/deeplearning/miniconda3/envs/onetorch/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 128, in _add_inplace
    return flow._C.add(self, other, alpha=alpha, inplace=True)
RuntimeError: Check failed: (*outputs_)[i] != inputs_[i] || inputs_[i]->is_contiguous() inplace operation is not allowed if input is non-contiguous and non-contiguous is not supported for this operation

Switching oneflow to version 0.8.1+cu112.git.506cb3f1 makes it run normally.
Minimal reproduction code:

from flowvision.models import ModelCreator
net = ModelCreator.create_model("beit_base_patch16_224", num_classes=10)

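2. I tested beit_base_patch16_224, beit_base_patch16_384, beit_large_patch16_224, beit_large_patch16_384, and beit_large_patch16_512; all of them run normally.

I also tested beit_base_patch16_224_in22k and beit_large_patch16_224_in22k. Both run normally with pretrained weights. When the model is created directly without pretrained weights, however, the num_classes parameter cannot be set.
Minimal reproduction code: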

from flowvision.models import ModelCreator
net = ModelCreator.create_model("beit_base_patch16_224_in22k", num_classes=10)

Error message:

Traceback (most recent call last):
  File "/home/deeplearning/wangzheny/vision-beit/train.py", line 16, in <module>
    net = ModelCreator.create_model("beit_base_patch16_224_in22k", num_classes=10)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/registry.py", line 53, in create_model
    model = create_fn(pretrained=pretrained, **kwargs)
  File "/home/deeplearning/wangzheny/vision-beit/flowvision/models/beit.py", line 459, in beit_base_patch16_224_in22k
    **kwargs
TypeError: type object got multiple values for keyword argument 'num_classes'

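If the output of the final head layer is replaced afterwards, in the same way as when fine-tuning from pretrained weights, the model runs normally.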

from flowvision.models import ModelCreator
import oneflow as flow
net = ModelCreator.create_model("beit_base_patch16_224_in22k")
num_fc = net.head.in_features
net.head = flow.nn.Linear(in_features=num_fc, out_features=10)
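
The TypeError above is consistent with the in22k factory function passing a fixed num_classes together with **kwargs, so a caller-supplied num_classes arrives twice. The code at beit.py line 459 is not shown in this thread, so the following is only a sketch under that assumption; both function names and the patch_size default are hypothetical and not taken from flowvision.

```python
def build_model_kwargs_colliding(**kwargs):
    # Hypothetical factory body: num_classes is hardcoded AND forwarded
    # from the caller, so passing num_classes=10 supplies the key twice.
    return dict(patch_size=16, num_classes=21841, **kwargs)


def build_model_kwargs_overridable(**kwargs):
    # Treat the hardcoded values as defaults and let caller kwargs win.
    defaults = dict(patch_size=16, num_classes=21841)
    return {**defaults, **kwargs}


if __name__ == "__main__":
    try:
        build_model_kwargs_colliding(num_classes=10)
    except TypeError as exc:
        # The exact wording of the message depends on the Python version.
        print("collision:", exc)

    print(build_model_kwargs_overridable(num_classes=10))
```

With the second form, calling the factory without num_classes would still yield the 21841-class default, while an explicit num_classes=10 would override it instead of raising.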

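3. One more thing I am curious about: why is the output of the final head layer as large as 21841 for the networks whose names end in in22k, even after pretraining? (In principle, after ImageNet pretraining the final output should be 1000 classes.)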
(head): Linear(in_features=1024, out_features=21841, bias=True)

@hjchen2

hjchen2 commented Aug 5, 2022

Switching oneflow to version 0.8.1+cu112.git.506cb3f1 makes it run normally.

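You are running in eager mode, right? The add op does not support non-contiguous inputs yet, so an in-place add on a non-contiguous input is not supported. Earlier versions allowed it, but the results they computed were certainly wrong. This problem is being fixed in a unified way.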
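
The traceback fails at an in-place `+=` on a slice of relative_coords. One way to sidestep that pattern is to compute the shifted index in a single out-of-place expression. The sketch below is written in plain Python so that it runs without oneflow; the function name is hypothetical, and any extra entries the PR's gen_relative_position_index may append are omitted. It only illustrates what the window part of the index encodes, not the PR's implementation.

```python
def relative_position_index(window_size):
    """Pairwise relative-position index for a (height, width) window.

    Each offset is shifted to start from 0 and the row offset is scaled
    by (2 * width - 1), all in one expression, so no slice of an
    intermediate array is ever updated in place.
    """
    wh, ww = window_size
    # Flattened (row, col) coordinates of every token in the window.
    coords = [(r, c) for r in range(wh) for c in range(ww)]
    return [
        [
            (ri - rj + wh - 1) * (2 * ww - 1) + (ci - cj + ww - 1)
            for (rj, cj) in coords
        ]
        for (ri, ci) in coords
    ]


if __name__ == "__main__":
    for row in relative_position_index((2, 2)):
        print(row)
```

In a tensor library the same idea would presumably be written as new tensors built from the row and column offsets rather than `+=` on a view, though whether that is the fix adopted in this PR is not stated in the thread.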


5 participants