question about ann_to_snn generating snn_dag #38

Open
shirleyatgithub opened this issue Jan 6, 2022 · 25 comments

@shirleyatgithub
shirleyatgithub commented Jan 6, 2022

Dear Author,
Thanks for sharing the code. I ran into a problem when executing ann_to_snn.py and wonder whether you have seen it before.
The error message is as follows:
"
File "/home/gss/PyTorch-Spiking-YOLOv3-main/ann_parser.py", line 102, in relu_wrapper
in_nodes = [find_node_by_tensor(inp)]
File "/home/gss/PyTorch-Spiking-YOLOv3-main/ann_parser.py", line 37, in find_node_by_tensor
raise ValueError("cannot find tensor Size", tensor.size())
ValueError: ('cannot find tensor Size', torch.Size([1, 16, 416, 416]))
"
In ann_parser.py, find_node_by_tensor matches on "v is tensor". In Python this is an identity check: both names must refer to the same object in memory. When the ReLU layer is added, the input of ReLU does not satisfy this condition for any recorded tensor, so rst is empty.
I printed the id of the tensors in this function and got the following output:
conv1 inp id 140221914107048
find node by tensor dag_input0 torch.Size([1, 3, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914107048 140221914107048
add node conv1: ['dag_input0']->['conv1_out1']
conv1 out id 140221914107336
find node by tensor dag_input0 torch.Size([1, 16, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914107336 140221914107048
find node by tensor conv1_out1 torch.Size([1, 16, 416, 416]) torch.Size([1, 16, 416, 416]) 140221914107336 140221914107336
batch_norm1 inp id 140221914107336
batch_norm1 out id 140221914155120
relu1 inp id 140221914106976
find node by tensor dag_input0 torch.Size([1, 16, 416, 416]) torch.Size([1, 3, 416, 416]) 140221914106976 140221914107048
find node by tensor conv1_out1 torch.Size([1, 16, 416, 416]) torch.Size([1, 16, 416, 416]) 140221914106976 140221914155120

I don't understand why the id changes along the way. The only changes I made were setting classes from 80 to 1 and, accordingly, filters from 255 to 18 in the config file "yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg". The ANN defined by this config file trains and tests successfully.
Looking forward to your reply. @cwq159
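
For readers following the log above, here is a minimal sketch in plain Python of how an identity-based lookup like the one described for find_node_by_tensor can come back empty. FakeTensor, Registry, tracked_op and untracked_op are hypothetical stand-ins, not the code in ann_parser.py, and the sketch does not establish which PyTorch call creates the new object between batch_norm1 and relu1. It only shows that any step the parser does not track, and that returns a new object, breaks an "is" match even when the shapes agree.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: it only carries a shape."""

    def __init__(self, shape):
        self.shape = shape


class Registry:
    """Records (name, tensor) pairs and looks tensors up by identity."""

    def __init__(self):
        self.entries = []

    def add(self, name, tensor):
        self.entries.append((name, tensor))

    def find(self, tensor):
        # Same kind of test the issue describes: `v is tensor` (object identity).
        matches = [name for name, v in self.entries if v is tensor]
        if not matches:
            raise ValueError("cannot find tensor Size", tensor.shape)
        return matches[0]


def tracked_op(registry, name, inp, out_shape):
    """An op the parser wraps: it checks its input and registers its output."""
    registry.find(inp)
    out = FakeTensor(out_shape)
    registry.add(name, out)
    return out


def untracked_op(inp):
    """An op the parser does not wrap: new object, same shape."""
    return FakeTensor(inp.shape)


registry = Registry()
x = FakeTensor((1, 3, 416, 416))
registry.add("dag_input0", x)
conv_out = tracked_op(registry, "conv1_out1", x, (1, 16, 416, 416))

hidden = untracked_op(conv_out)  # same shape as conv_out, different id()
try:
    tracked_op(registry, "relu1_out", hidden, hidden.shape)
    lookup_failed = False
except ValueError as err:
    lookup_failed = True
    failure = err.args
```

In this toy setup the lookup for conv_out succeeds, while the lookup for hidden raises the same kind of ValueError as in the traceback, because no registered object is the very same object as hidden.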

@buaa-luzhi

@shirleyatgithub
Hello, which version of PyTorch are you using?

$ python ann_to_snn.py --cfg cfg/yolov3-tiny.cfg --data data/coco.data --weights weights/best.pt --timesteps 128

I ran into the same kind of error: "ValueError: ('cannot find tensor Size', torch.Size([16, 16, 320, 320])) "
I also can't find a PyTorch 1.3.0 build to install.
Do you have this problem too? Looking forward to your reply.

@shirleyatgithub
Author

@buaa-luzhi Yes, it seems to be the same problem. I use torch 1.7.1. Any idea how to solve it?

@buaa-luzhi

@shirleyatgithub
#5
But I couldn't find a PyTorch 1.3 build.

@shirleyatgithub
Author

@buaa-luzhi Why use PyTorch 1.3? The requirements.txt specifies torch>=1.6.0.

@buaa-luzhi

@shirleyatgithub
I don't know.
#5
I was going by this issue.

@buaa-luzhi

@shirleyatgithub
I tried PyTorch 1.7.1 and 1.4 and still had this problem.

@shirleyatgithub
Author

@buaa-luzhi I couldn't find torch 1.3 either, so I tried torch 1.4 (CPU) with Python 3.7. The original error no longer appears, but a different one does:
ann_parser.py", line 221, in parse_ann_model
model(*warpped_input)
File "/home/gss/anaconda3/envs/nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "ann_to_snn.py", line 65, in forward
x = self.listi
File "/home/gss/anaconda3/envs/nlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'out'

@buaa-luzhi

@shirleyatgithub
pip install torch==1.3.1+cu100 torchvision==0.4.2+cu100 -f https://download.pytorch.org/whl/torch_stable.html

@buaa-luzhi

@shirleyatgithub
I'm still testing.

@buaa-luzhi

@shirleyatgithub
I still get this error!
I don't know how to fix it.

@shirleyatgithub
Author

@shirleyatgithub pip install torch==1.3.1+cu100 torchvision==0.4.2+cu100 -f https://download.pytorch.org/whl/torch_stable.html

Thank you, I will try it too.

@cwq159
Owner

cwq159 commented Jan 6, 2022

Please use PyTorch 1.3 with Python 3.7 for this version.
A new version supporting PyTorch 1.7+ will be released soon.
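
A small sketch of a guard for the recommended combination, in case it helps anyone check their environment before running the conversion. The helper names parse_major_minor and matches_recommended are hypothetical and not part of this repository; the only facts taken from this thread are the target versions (PyTorch 1.3, Python 3.7).

```python
import sys


def parse_major_minor(version):
    """Return (major, minor) from strings such as '1.3.1+cu100'."""
    release = version.split("+", 1)[0]  # drop a local tag like '+cu100'
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


def matches_recommended(torch_version, python_version=None):
    """True if the versions match PyTorch 1.3 and Python 3.7."""
    if python_version is None:
        python_version = sys.version_info[:2]  # running interpreter
    return (parse_major_minor(torch_version) == (1, 3)
            and tuple(python_version[:2]) == (3, 7))
```

With PyTorch installed, the check would be called as matches_recommended(torch.__version__) after "import torch"; the second argument is only there so the function can be tried without changing interpreters.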

@buaa-luzhi

@cwq159 @shirleyatgithub
I used PyTorch 1.3 and Python 3.7 and still get this error.
I wonder whether /cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg should be used during the training phase, because I couldn't find /cfg/yolov3-tiny-ours.cfg.
Thanks so much, and looking forward to your reply!

@buaa-luzhi

@cwq159 @shirleyatgithub
(1) Training:
python3 train.py --batch-size 32 --cfg cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg --data data/coco.data --weights ''
(2) Conversion:
python3 ann_to_snn.py --cfg cfg/yolov3-tiny-mp2conv-mp1none-lk2relu-up2tconv.cfg --data data/coco.data --weights weights/best.pt --timesteps 128

What is wrong with this workflow?
The error appears again:
ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640]))

Thanks so much, and looking forward to your reply!

@buaa-luzhi

That error is gone now.
However, as timesteps gets larger, a GPU memory error occurs.
My GPU has only 6 GB of memory. Even with batch_size=1 and timesteps=32,
the GPU memory error still appears.

@buaa-luzhi

@cwq159
Hello, when will the version for PyTorch 1.7 be released?
Thanks!

@cwq159
Owner

cwq159 commented Jan 7, 2022

If you want to increase timesteps, you need a single GPU with enough memory.
In this version the input data is copied timesteps times, and the SNN then computes the output for every copy, so GPU memory use grows with timesteps.
The new version will try to optimize the IF operation to reduce memory usage and will support PyTorch 1.7+. Please look out for it.
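
A back-of-the-envelope sketch of that scaling, not a measurement of this repository. It assumes float32 tensors (4 bytes per element) and that all timesteps copies are held in memory at once, and it uses the two tensor shapes printed in the first comment. The helper replicated_bytes is hypothetical. The result counts only the two tensors named; weights, every other layer's activations and any neuron state would come on top.

```python
def replicated_bytes(shape, timesteps, bytes_per_element=4):
    """Bytes needed to hold `timesteps` copies of one tensor of `shape`."""
    elements = 1
    for dim in shape:
        elements *= dim
    return elements * bytes_per_element * timesteps


MIB = 1024 ** 2
input_shape = (1, 3, 416, 416)   # dag_input0 in the log above
conv1_shape = (1, 16, 416, 416)  # conv1_out1 in the log above

for timesteps in (32, 128):
    inp = replicated_bytes(input_shape, timesteps) / MIB
    act = replicated_bytes(conv1_shape, timesteps) / MIB
    print(f"timesteps={timesteps}: input {inp:.1f} MiB, conv1 output {act:.1f} MiB")
```

Under these assumptions the first convolution's output alone takes 338 MiB at timesteps=32 and 1352 MiB at timesteps=128, which gives a feel for how a full network can exhaust a 6 GB card.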

@buaa-luzhi

@cwq159
Hello, sorry to trouble you again!
What type of GPU do you use?
My GPU memory is small and I want to replace the card with a new one.
Thanks again.

@cwq159
Owner

cwq159 commented Jan 10, 2022

RTX 8000 with 48 GB of memory

@mengjingyouling

Thanks, that is great! How long will it take until the new code is available?

@WuTi0525

WuTi0525 commented Nov 28, 2022

Now that error doesn't exist. However, as timeSteps get larger, a memory error occurs. GPU memory is only 6GB, batch_size is 1, timesteps=32, Still display GPU memory error.

@buaa-luzhi Excuse me, how did you solve this error: ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640]))

@jsckdon

jsckdon commented Apr 25, 2024

@buaa-luzhi ValueError: ('cannot find tensor Size', torch.Size([16, 16, 640, 640])) Hello, how did you solve this problem? Can you tell me?

@mengjingyouling

mengjingyouling commented Apr 25, 2024 via email

Sorry, I actually have no idea about it.

@jsckdon

jsckdon commented Apr 25, 2024

Does this ANN-to-SNN code still have problems? I tried to run it today and hit this error. I saw the request to use Python 3.7 and torch 1.3, but I don't know whether that helps.

@mengjingyouling
mengjingyouling commented Apr 25, 2024 via email
