mmcv version issue #17

Open
libingDY opened this issue Jul 4, 2023 · 11 comments

Comments

@libingDY

libingDY commented Jul 4, 2023

Hello, mmcv has been upgraded to its latest version, but the code in your mmcv_custom is still written against an older mmcv. Could you update the code?

@DotWang
Collaborator

DotWang commented Jul 5, 2023

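@libingDY Hello, the current code is built entirely on the old versions of mmcv, mmdet, mmseg, and so on, and it runs without problems on those versions. Because the changes between the old and new versions are substantial, I have neither the time nor the need to port everything to the new versions, so I recommend running it with the old versions. As far as I know, the new versions may already support some of the functionality in mmcv_custom; you could check, in which case mmcv_custom might not be needed at all. In future work we will use the full set of new versions. Thanks for your interest!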

@libingDY
Author

libingDY commented Jul 5, 2023

Thank you for your reply. I ran the code with older versions of mmcv/mmdet, but then ran into the following problem:
Traceback (most recent call last):
  File "tools/train.py", line 153, in <module>
    main()
  File "tools/train.py", line 142, in main
    train_detector(
  File "/data4/libing/bisai/OBBDetection/mmdet/apis/train.py", line 133, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 53, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/mmcv_full-1.7.1-py3.8-linux-x86_64.egg/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/base.py", line 237, in train_step
    losses = self(**data)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/obb/obb_two_stage.py", line 154, in forward_train
    x = self.extract_feat(img)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/detectors/obb/obb_two_stage.py", line 84, in extract_feat
    x = self.backbone(img)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 777, in forward
    x = self.forward_features(x)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 764, in forward_features
    x = blk(x, Hp, Wp)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/bisai/OBBDetection/mmdet/models/backbones/vitae_nc_win_rvsa_wsz7.py", line 540, in forward
    convX = self.drop_path(self.PCM(x_2d).permute(0, 2, 3, 1).contiguous().view(b, n, c))
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 532, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 711, in get_world_size
    return _get_group_size(group)
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 263, in _get_group_size
    default_pg = _get_default_group()
  File "/data4/libing/anaconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 347, in _get_default_group
    raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
Have you run into a similar problem?

@DotWang
Collaborator

DotWang commented Jul 5, 2023

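@libingDY Hello, I have not run into this. It looks like a DDP (distributed training) issue. I did not modify OBBDetection, so your launch command may be wrong.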
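For context, the traceback passes through mmcv/parallel/data_parallel.py (a single-process run) and ends in the batch-norm layer calling torch.distributed.get_world_size(), which only works after a process group has been initialized. A minimal sketch of a pre-flight check follows, assuming the job is meant to be started through a PyTorch distributed launcher that exports the RANK and WORLD_SIZE environment variables. The helper name launched_distributed is hypothetical and is not part of OBBDetection or mmcv.

```python
import os
from typing import Mapping, Optional


def launched_distributed(env: Optional[Mapping[str, str]] = None) -> bool:
    """Heuristic: True if a launcher exported a consistent RANK and WORLD_SIZE.

    Assumption: the distributed launcher sets both variables for each worker.
    A plain `python tools/train.py ...` run normally leaves them unset.
    """
    env = os.environ if env is None else env
    try:
        rank = int(env["RANK"])
        world_size = int(env["WORLD_SIZE"])
    except (KeyError, ValueError):
        # Missing or non-numeric values: treat as a non-distributed run.
        return False
    return world_size >= 1 and 0 <= rank < world_size


if __name__ == "__main__":
    if not launched_distributed():
        print("No distributed launcher detected; layers that need a "
              "process group (such as SyncBN) are likely to fail.")
```

This only inspects the environment; it does not initialize anything. It is a quick way to tell whether the command used to start training was a distributed one before the first forward pass fails.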

@libingDY
Author

libingDY commented Jul 5, 2023

OK, thank you very much for your reply.

@zhongyas

zhongyas commented Jul 7, 2023

Hello, could you share the package versions used in your old-version setup? Thanks.

@DotWang
Collaborator

DotWang commented Jul 7, 2023

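@zhongyas If you mean mmcv-full, you can pin the version when you install it; I no longer have the old one here. If you mean obbdetection and mmsegmentation, the complete frameworks are in the RSP repository. The RVSA repository only provides the related backbone and config files.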
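After pinning a version at install time, it can help to confirm which mmcv-full the environment actually picks up. The sketch below is one way to do that with the standard library only. The bounds "1.3.0" and "2.0.0" are placeholders chosen for illustration, not versions confirmed by the maintainers, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.7.1' into (1, 7, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def in_range(installed: str, minimum: str, below: str) -> bool:
    """True if minimum <= installed < below, compared numerically."""
    return parse_version(minimum) <= parse_version(installed) < parse_version(below)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mmcv-full")
    if found is None:
        print("mmcv-full is not installed in this environment")
    else:
        # Placeholder bounds: replace with the range the repository documents.
        print(found, "in placeholder range:", in_range(found, "1.3.0", "2.0.0"))
```

The comparison is deliberately simple and ignores pre-release suffixes, so treat it as a sanity check rather than a full version parser.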

@zhongyas

zhongyas commented Jul 7, 2023

Thank you very much for your reply.

@hhb442

hhb442 commented Oct 23, 2023

Hello, I'd like to ask: when this backbone is initialized, is the norm_cfg it uses SyncBN?

@DotWang
Collaborator

DotWang commented Oct 24, 2023

@hhb442 Yes, if you finetune ViTAE-RVSA on multiple GPUs. This is noted in the README.

@vxiaobai

Could you share an archive of your conda environment? I'd like to clone it to avoid errors caused by mismatched versions.

@regainOWO

Quoting libingDY: "Thank you for your reply. I ran the code with older versions of mmcv/mmdet, but then ran into the following problem: [...] RuntimeError: Default process group has not been initialized, please make sure to call init_process_group. Have you run into a similar problem?"

If you are training on a single GPU, this is most likely caused by nn.SyncBatchNorm; replacing it with nn.BatchNorm2d should fix it.
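Where the norm layer is selected through norm_cfg, the same swap can be made at the config level, since mmcv 1.x builds nn.SyncBatchNorm for type 'SyncBN' and nn.BatchNorm2d for type 'BN'. Below is a minimal sketch that rewrites a nested config dict accordingly. The function name syncbn_to_bn and the example config are hypothetical, and it works on plain dicts rather than on mmcv's Config object. If the backbone constructs nn.SyncBatchNorm directly in its source, a config rewrite will not reach it and the class has to be changed in the code, as the comment above describes.

```python
from copy import deepcopy
from typing import Any


def syncbn_to_bn(cfg: Any) -> Any:
    """Return a copy of a nested config with every type='SyncBN' set to 'BN'."""
    if isinstance(cfg, dict):
        out = {key: syncbn_to_bn(value) for key, value in cfg.items()}
        if out.get("type") == "SyncBN":
            out["type"] = "BN"
        return out
    if isinstance(cfg, (list, tuple)):
        # Preserve the container type (list stays list, tuple stays tuple).
        return type(cfg)(syncbn_to_bn(item) for item in cfg)
    return deepcopy(cfg)


# Hypothetical config fragment, for illustration only.
model = dict(
    backbone=dict(
        type="HypotheticalBackbone",
        norm_cfg=dict(type="SyncBN", requires_grad=True),
    ),
    neck=dict(type="HypotheticalNeck", norm_cfg=dict(type="SyncBN")),
)

single_gpu_model = syncbn_to_bn(model)
```

The original dict is left untouched, so the multi-GPU config can stay as the default and the converted copy can be used only for single-GPU runs.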
