"Cannot pickle 'module' object" error when training my own dataset with mmdetection #2425

Closed
@kdh4672

Description

Describe the bug

I'm trying to train on my own dataset.
When I run
(mmd2) daehyeon@daehyeon-MS-7C13:~/mmdetection$ python tools/train.py configs/mask_rcnn_r50_fpn_1x.py

I get the following error:

# yapf:enable
# runtime settings
total_epochs = 8
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/mask_rcnn_r50_fpn_1x_test'
load_from = None
resume_from = None
workflow = [('train', 1)]

loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2020-04-11 06:20:11,015 - mmdet - INFO - Start running, host: daehyeon@daehyeon-MS-7C13, work_dir: /home/daehyeon/mmdetection/work_dirs/mask_rcnn_r50_fpn_1x_test
2020-04-11 06:20:11,015 - mmdet - INFO - workflow: [('train', 1)], max: 8 epochs
Traceback (most recent call last):
File "tools/train.py", line 144, in <module>
main()
File "tools/train.py", line 132, in main
train_detector(
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmdet-1.1.0+51df8a9-py3.8-linux-x86_64.egg/mmdet/apis/train.py", line 104, in train_detector
_non_dist_train(
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmdet-1.1.0+51df8a9-py3.8-linux-x86_64.egg/mmdet/apis/train.py", line 225, in _non_dist_train
runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/runner.py", line 359, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/runner.py", line 273, in train
self.call_hook('after_train_epoch')
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/runner.py", line 226, in call_hook
getattr(hook, fn_name)(self)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 74, in wrapper
return func(*args, **kwargs)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/hooks/checkpoint.py", line 26, in after_train_epoch
runner.save_checkpoint(
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/runner.py", line 247, in save_checkpoint
save_checkpoint(self.model, filepath, optimizer=optimizer, meta=meta)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 249, in save_checkpoint
torch.save(checkpoint, filename)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/torch/serialization.py", line 327, in save
_legacy_save(obj, opened_file, pickle_module, pickle_protocol)
File "/home/daehyeon/anaconda3/envs/mmd2/lib/python3.8/site-packages/torch/serialization.py", line 400, in _legacy_save
pickler.dump(obj)
TypeError: cannot pickle 'module' object

I don't know what "cannot pickle 'module' object" means or what I should do about it.
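
For context on the message itself: torch.save serializes the checkpoint with Python's pickle by default (the traceback shows pickler.dump), and pickle cannot serialize module objects. The error therefore appears whenever a module is reachable from the object being saved. A minimal sketch that reproduces the same kind of TypeError without mmdetection; the dictionary and its keys are made up for illustration and are not the real checkpoint layout:

```python
import math
import pickle

# Hypothetical stand-in for a checkpoint dict; the keys are illustrative only.
checkpoint = {
    "meta": {"note": "example", "helper": math},  # a module stored by mistake
    "state_dict": {"weight": [0.1, 0.2]},
}

try:
    pickle.dumps(checkpoint)
    error_message = None
except TypeError as exc:
    # pickle refuses the whole object as soon as it reaches the module.
    error_message = str(exc)

print(error_message)
```

The exact wording of the message depends on the Python version, but it names the module type in each case.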

+) However, when I run the same command
(mmd2) daehyeon@daehyeon-MS-7C13:~/mmdetection$ python tools/train.py configs/mask_rcnn_r50_fpn_1x.py

with the COCO dataset instead of my own dataset, it works fine:

loading annotations into memory...
Done (t=0.39s)
creating index...
index created!
2020-04-11 06:45:36,680 - mmdet - INFO - Start running, host: daehyeon@daehyeon-MS-7C13, work_dir: /home/daehyeon/mmdetection/work_dirs/mask_rcnn_r50_fpn_1x
2020-04-11 06:45:36,680 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2020-04-11 06:46:15,204 - mmdet - INFO - Epoch [1][50/2476] lr: 0.00797, eta: 6:20:47, time: 0.770, data_time: 0.026, memory: 3828, loss_rpn_cls: 0.3816, loss_rpn_bbox: 0.0893, loss_cls: 0.6315, acc: 92.6523, loss_bbox: 0.1025, loss_mask: 0.7153, loss: 1.9201
2020-04-11 06:46:54,318 - mmdet - INFO - Epoch [1][100/2476] lr: 0.00931, eta: 6:23:06, time: 0.782, data_time: 0.014, memory: 3828, loss_rpn_cls: 0.2121, loss_rpn_bbox: 0.0620, loss_cls: 0.4719, acc: 93.7617, loss_bbox: 0.1379, loss_mask: 0.6908, loss: 1.5747

In this case, both dataset_type = 'MyDataset' and dataset_type = 'CocoDataset' work.

Reproduction
I read getstart.md to learn how to use my own dataset, and did the following:

  1. In mmdetection/mmdet/my_dataset.py, I wrote

    from .coco import CocoDataset
    from .registry import DATASETS


    @DATASETS.register_module
    class MyDataset(CocoDataset):

        CLASSES = ('lactofit', 'vitamin')
  2. In mmdetection/mmdet/__init__.py, I added

    from .my_dataset import MyDataset

    __all__ = [
        'CustomDataset', 'XMLDataset', 'CocoDataset', 'VOCDataset',
        'CityscapesDataset', 'GroupSampler', 'DistributedGroupSampler',
        'build_dataloader', 'ConcatDataset', 'RepeatDataset', 'WIDERFaceDataset',
        'DATASETS', 'build_dataset', 'MyDataset'
    ]

  3. In configs/mask_rcnn_r50_fpn_1x.py, I edited

    1)) dataset_type = 'MyDataset'

    2)) bbox_head=dict(num_classes=3)  # my dataset has 2 classes, plus 1 for background

    3)) mask_head=dict(num_classes=3)

    4))
    data_root = '../pycococreator/examples/shapes/train/'  # my data root

    data = dict(
        imgs_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type=dataset_type,
            ann_file=data_root + 'instances_nutrients_train.json',
            img_prefix=data_root + 'images_test/',
            pipeline=train_pipeline),
        val=dict(
            type=dataset_type,
            ann_file=data_root + 'instances_nutrients_train.json',
            img_prefix=data_root + 'images_test/',
            pipeline=test_pipeline),
        test=dict(
            type=dataset_type,
            ann_file=data_root + 'instances_nutrients_train.json',
            img_prefix=data_root + 'images_test/',
            pipeline=test_pipeline))
  4. In mmdetection/mmdet/core/evaluation/class_names.py, I edited

    def coco_classes():
        return [
            'lactofit', 'vitamin'
        ]

Finally, I ran (mmd2) daehyeon@daehyeon-MS-7C13:~/mmdetection$ python setup.py develop

This is my dataset link:
https://github.com/kdh4672/my_own_dataset_nutrients_mmdetection.git

I made the COCO-format JSON file using pycococreator: https://github.com/waspinator/pycococreator
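
One quick sanity check for a setup like this is to compare the category names in the generated JSON with the CLASSES tuple of the dataset class. The sketch below assumes only the standard COCO layout (a top-level "categories" list whose entries have a "name" field); the helper name check_categories and the sample file are hypothetical and stand in for the real instances_nutrients_train.json:

```python
import json
import os
import tempfile

CLASSES = ("lactofit", "vitamin")  # same tuple as in MyDataset


def check_categories(ann_file, classes):
    """Compare category names in a COCO-format JSON file with the expected classes."""
    with open(ann_file) as f:
        categories = json.load(f)["categories"]
    names = [category["name"] for category in categories]
    missing = [name for name in classes if name not in names]
    extra = [name for name in names if name not in classes]
    return missing, extra


# Stand-in annotation file so the sketch runs without the real dataset.
sample = {
    "images": [],
    "annotations": [],
    "categories": [{"id": 1, "name": "lactofit"}, {"id": 2, "name": "vitamin"}],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "instances_sample.json")
    with open(path, "w") as f:
        json.dump(sample, f)
    missing, extra = check_categories(path, CLASSES)

# Two foreground classes plus one for background, matching num_classes=3 in the config.
num_classes = len(CLASSES) + 1
print(missing, extra, num_classes)
```

If either list comes back non-empty, the annotation file and the dataset class disagree about the class names.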

Environment
I had some trouble collecting my environment information, but I solved it.
My mmdet version is '1.1.0+51df8a9', and mmdet.version.py has __version__ = '1.1.0+51df8a9'. However, collect_env.py fails because of mmdet.version.py, with
" AttributeError: module 'mmdet' has no attribute '__version__' "
so I edited this line in collect_env.py:
env_info['MMDetection'] = mmdet.__version__ -> env_info['MMDetection'] = '1.1.0+51df8a9'

After that, collect_env works:
(mmd2) daehyeon@daehyeon-MS-7C13:~/mmdetection$ python mmdet/utils/collect_env.py
sys.platform: linux
Python: 3.8.2 (default, Mar 26 2020, 15:53:00) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-10.1
NVCC: Cuda compilation tools, release 10.1, V10.1.105
GPU 0: GeForce GTX 1660
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.4.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.1
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.5.0
OpenCV: 4.2.0
MMCV: 0.4.2
MMDetection: 1.1.0+51df8a9
MMDetection Compiler: GCC 7.4
MMDetection CUDA Compiler: 9.1
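
As an alternative to hard-coding the version string in collect_env.py, the attribute could be read with a fallback. This is only a sketch of the general getattr pattern, not how collect_env.py is written; read_version is a hypothetical helper, and the two stand-in modules exist only so the example runs without mmdet installed:

```python
import types


def read_version(module, default="unknown"):
    """Return module.__version__ when present, otherwise a default string."""
    return getattr(module, "__version__", default)


# Stand-in modules so the sketch runs without mmdet installed.
with_version = types.ModuleType("pkg_with_version")
with_version.__version__ = "1.1.0+51df8a9"
without_version = types.ModuleType("pkg_without_version")

print(read_version(with_version))
print(read_version(without_version))
```

This avoids the AttributeError, although it does not explain why the attribute is missing in the first place.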

Error traceback

(Same traceback as under "Describe the bug" above. It ends with:)
TypeError: cannot pickle 'module' object
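
Since the failure happens inside torch.save(checkpoint, filename), one way to narrow it down is to walk the object being saved and try to pickle each leaf on its own. The helper below is a hypothetical diagnostic sketch, not part of mmcv or mmdetection, and the example dict is invented; in a real run the input would be the dict handed to torch.save:

```python
import pickle
import types


def find_unpicklable(obj, path="checkpoint"):
    """Return (path, type name) pairs for leaves that pickle refuses to serialize."""
    if isinstance(obj, dict):
        found = []
        for key, value in obj.items():
            found.extend(find_unpicklable(value, f"{path}[{key!r}]"))
        return found
    if isinstance(obj, (list, tuple)):
        found = []
        for index, value in enumerate(obj):
            found.extend(find_unpicklable(value, f"{path}[{index}]"))
        return found
    try:
        pickle.dumps(obj)
    except Exception:
        return [(path, type(obj).__name__)]
    return []


# Invented checkpoint-like dict with a module hidden in one entry.
example = {
    "meta": {
        "suspect": types.ModuleType("some_module"),
        "CLASSES": ("lactofit", "vitamin"),
    },
    "state_dict": {"layer.weight": [0.0, 1.0]},
}
problems = find_unpicklable(example)
print(problems)
```

Each reported path points at an entry that would make the save fail, which shows where a module object ended up in the checkpoint.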
