Since the mmdet source is already included in this project, do I need to run pip install mmdet==2.25.3? #23

Closed
RicoJYang opened this issue Jul 31, 2023 · 8 comments

@RicoJYang

If I don't run pip install mmdet==2.25.3, the project fails with: No module named mmdet. But when I do run pip install mmdet==2.25.3 and then try to train on my dataset, I get: KeyError: 'CoDETR is not in the models registry'.

@TempleX98
Collaborator

We have included the mmdet source code in this repo, so you don't need to install it via pip.
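A quick way to confirm which copy of mmdet is actually being imported is a diagnostic sketch like the one below, run from the Co-DETR repository root:

    # Check which mmdet Python resolves to. Run from the repo root:
    # mmdet.__file__ should point into the Co-DETR checkout, not site-packages.
    import mmdet
    print(mmdet.__version__)
    print(mmdet.__file__)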

@RicoJYang
Author

> We have included the mmdet source code in this repo, so you don't need to install it via pip.

Thank you for your reply. How can I solve the error KeyError: 'CoDETR is not in the models registry' when I run 'sh tools/dist_train.sh projects/configs/co_deformable_detr/co_deformable_detr_r50_1x_coco.py 2 results/'?

@TempleX98
Collaborator

  1. If you have installed mmdet, you should move the projects folder to your mmdet directory.
  2. Add the line `from projects import *` to tools/train.py, tools/test.py, mmdet/apis/train.py and mmdet/apis/inference.py (a sketch follows below).
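For reference, a minimal sketch of what the top of tools/train.py could look like after step 2 (assuming the projects folder is importable from the working directory; the other listed files would get the same line):

    # Top of tools/train.py after the suggested edit (sketch).
    # Importing the projects package runs its register_module decorators,
    # so CoDETR is added to mmdet's model registry before the config is built.
    from projects import *  # noqa: F401,F403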

@RicoJYang
Author

> 1. If you have installed mmdet, you should move the projects folder to your mmdet directory.
> 2. Add the line `from projects import *` to tools/train.py, tools/test.py, mmdet/apis/train.py and mmdet/apis/inference.py.

Sorry to keep bothering you: I did exactly what you suggested, and the files and folders in mmdet are now apis, core, datasets, __init__.py, models, projects, __pycache__, utils and version.py, but I still haven't solved the problem:

Traceback (most recent call last):
  File "tools/train.py", line 245, in <module>
    main()
  File "tools/train.py", line 216, in main
    test_cfg=cfg.get('test_cfg'))
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmdet-2.25.3-py3.7.egg/mmdet/models/builder.py", line 59, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/utils/registry.py", line 217, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/utils/registry.py", line 47, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'CoDETR is not in the models registry'

(The second training process prints the identical traceback.)

@TempleX98
Collaborator

  1. The best solution is to uninstall your mmdet (pip uninstall mmdet) and directly use the mmdet in our repo (a quick check follows below).
  2. Another solution can be viewed here.
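One possible sanity check for option 1 (an illustrative sketch, run from the Co-DETR repository root after uninstalling the pip-installed mmdet):

    # After `pip uninstall mmdet`, run this from the repo root so the bundled mmdet is used.
    # Importing the projects package should register CoDETR into the DETECTORS registry.
    from mmdet.models import DETECTORS
    from projects import *  # noqa: F401,F403

    print(DETECTORS.get('CoDETR') is not None)  # expected: True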

@RicoJYang
Author

> 1. The best solution is to uninstall your mmdet (pip uninstall mmdet) and directly use the mmdet in our repo.
> 2. Another solution can be viewed here.

Thank you for your reply. I have solved that problem, but when I try to train my own model, there is another error:

Traceback (most recent call last):
  File "tools/train.py", line 245, in <module>
    main()
  File "tools/train.py", line 241, in main
    meta=meta)
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/apis/train.py", line 245, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 59, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
    return old_func(*args, **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/models/detectors/co_detr.py", line 180, in forward_train
    gt_labels, gt_bboxes_ignore)
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/models/dense_heads/co_deformable_detr_head.py", line 629, in forward_train
    losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 198, in new_func
    return old_func(*args, **kwargs)
  File "/mnt/lustre/GPU7/home/yangbo/workspace/codes/Co-DETR-main/mmdet/models/dense_heads/co_deformable_detr_head.py", line 700, in loss
    img_metas, gt_bboxes_ignore)
ValueError: not enough values to unpack (expected 4, got 3)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 107352) of binary: /mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/bin/python
Traceback (most recent call last):
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/run.py", line 718, in run
    )(*cmd_args)
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/mnt/lustre/GPU7/home/yangbo/anaconda3/envs/co-dert/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

I changed the code to `enc_loss_cls, enc_losses_bbox, enc_losses_iou = self.loss_single(enc_cls_scores, enc_bbox_preds, gt_bboxes_list, binary_labels_list, img_metas, gt_bboxes_ignore)`, and the error went away. Will this change affect my training?
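For context, that ValueError means the left-hand side of the assignment expects one more value than loss_single returns at this call site. A minimal standalone illustration (hypothetical values, not the repo's actual implementation):

    # Illustration of the unpacking mismatch behind the error message above.
    def loss_single():
        return 1.0, 2.0, 3.0  # three values

    try:
        a, b, c, d = loss_single()  # asks for four values
    except ValueError as e:
        print(e)  # not enough values to unpack (expected 4, got 3)

    a, b, c = loss_single()  # matches the return arity, so it succeeds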

@TempleX98
Collaborator

This is a bug; I will fix it.

TempleX98 added a commit that referenced this issue Aug 3, 2023
@minh132

minh132 commented Mar 18, 2024

@TempleX98 @RicoJYang When I train the model, I still get the error above:

Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
Traceback (most recent call last):
  File "tools/train.py", line 18, in <module>
    from mmdet.apis import init_random_seed, set_random_seed, train_detector
  File "/root/code/exp/Lab/Co-DETR/mmdet/apis/__init__.py", line 2, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/root/code/exp/Lab/Co-DETR/mmdet/apis/inference.py", line 13, in <module>
    from mmdet.datasets import replace_ImageToTensor
  File "/root/code/exp/Lab/Co-DETR/mmdet/datasets/__init__.py", line 2, in <module>
    from .builder import DATASETS, PIPELINES, build_dataloader, build_dataset
  File "/root/code/exp/Lab/Co-DETR/mmdet/datasets/builder.py", line 26, in <module>
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard_limit))
ValueError: not allowed to raise maximum limit
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 752631) of binary: /root/miniconda3/envs/lab/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/envs/lab/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/lab/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/run.py", line 715, in run
    elastic_launch(
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/envs/lab/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
tools/train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-18_07:35:16
  host      : 02d76903f0b2
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 752631)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
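The failure in this last log is different from the registry issue above: mmdet/datasets/builder.py tries to raise the open-file limit with resource.setrlimit, and some container environments refuse that call. A hedged local workaround, offered only as an assumption rather than an upstream fix, is to clamp the requested soft limit and tolerate environments that reject the call, along these lines:

    # Hedged workaround sketch for the RLIMIT_NOFILE ValueError in the log above
    # (an assumption, not an official fix): keep the hard limit unchanged,
    # clamp the requested soft limit, and ignore environments that forbid the call.
    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target_soft = max(soft, 4096)
    if hard != resource.RLIM_INFINITY:
        target_soft = min(target_soft, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target_soft, hard))
    except ValueError:
        pass  # some containers do not allow changing the limit at all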
