Problem with torch load pre-trained model #4

Open
SHENG-KAI-HUANG opened this issue Oct 31, 2018 · 5 comments

Comments

@SHENG-KAI-HUANG

SHENG-KAI-HUANG commented Oct 31, 2018

Hi,
I was trying to use the pre-trained model that I downloaded from this repository,
but I ran into the following problem:

==> loading model from pretained weights from file: ./pre-trained/siam_hybridnet_fullsized.t7
Warning: Failed to load function from bytecode: binary string: not a precompiled chunk
Warning: Failed to load function from bytecode: [string "�"]:1: unexpected symbol near char(4)
/home/mark/torch/install/bin/lua: /home/mark/torch/install/share/lua/5.2/torch/File.lua:375: unknown object
stack traceback:
[C]: in function 'error'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:375: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:307: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:369: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/nn/Module.lua:192: in function 'read'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
...
...k/torch/install/share/lua/5.2/cunn/DataParallelTable.lua:398: in function 'read'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:351: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:409: in function 'load'
/usr/relativeCameraPose-master/gpu_util.lua:54: in function 'loadDataParallel'
/usr/relativeCameraPose-master/model.lua:71: in main chunk
[C]: in function 'dofile'
/home/mark/torch/install/share/lua/5.2/paths/init.lua:84: in function 'dofile'
main.lua:29: in main chunk
[C]: in function 'dofile'
...mark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Here is the MD5 hash of the pre-trained model (computed with the md5sum command):
bdf13b947817bd7d3244309b2cda811d ./pre-trained/siam_hybridnet_fullsized.t7

Is this file broken, or is something else wrong?
Could anyone help me?
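
One way to rule out a corrupted download is to recompute the checksum and compare it with a value the maintainers confirm. The following is a minimal Python sketch and is not part of this repository; the function name `md5_of_file` and the chunk size are illustrative assumptions.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    `md5_of_file` and `chunk_size` are illustrative names, not part of
    the repository discussed in this thread.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large model file need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5_of_file("./pre-trained/siam_hybridnet_fullsized.t7")` and comparing the result with the hash quoted above should give the same answer as `md5sum`. A matching hash only shows that the download is intact; it says nothing about whether the local Torch installation can deserialize the file.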

@SHENG-KAI-HUANG
Author

By the way, I also tried loading the model in 'ascii' mode, but I got a different error message:

/home/mark/torch/install/bin/lua: /home/mark/torch/install/share/lua/5.2/torch/File.lua:259: read error: read 0 blocks instead of 1 at /home/mark/torch/pkg/torch/lib/TH/THDiskFile.c:352
stack traceback:
[C]: in function 'readInt'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:259: in function 'readObject'
/home/mark/torch/install/share/lua/5.2/torch/File.lua:409: in function 'load'
test.lua:4: in main chunk
[C]: in function 'dofile'
...mark/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

@imelekhov
Collaborator

Hi there,
Thank you for your interest in our work.
The MD5 sum is correct. Which versions of CUDA and cuDNN do you have? I installed Torch and all the packages (nn, cunn, inn, cudnn) from scratch (with CUDA v9.2 and cuDNN 5.1), and I could at least load the model.

@SHENG-KAI-HUANG
Author

SHENG-KAI-HUANG commented Nov 2, 2018

@imelekhov thank you for your answer. I am using CUDA 8.0 and cuDNN 6.0.

I have tried training the model and created some snapshots, and I can load the .t7 files that I created myself.
According to torch7's website, the binary format used by the load function is platform-dependent, while the ASCII format is platform-independent.
So the differences in settings (or package versions) between your environment and mine may be causing this error.
I therefore think a pre-trained model in ASCII format might solve it.
Would you mind converting the pre-trained model to ASCII format?
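
Since Torch7's load function takes the file format ('binary' or 'ascii') as an argument, it can help to check what a given .t7 file actually contains before choosing the mode. The following Python sketch is only a heuristic and not a Torch API: the name `sniff_t7_format`, the 64-byte probe size, and the returned labels are assumptions made for illustration. It relies on ASCII-mode files consisting of printable text, and on the well-known magic numbers of gzip and zip files to spot an archive that has not been unpacked yet.

```python
def sniff_t7_format(path, probe_bytes=64):
    """Guess how a file was serialized from its first bytes.

    Heuristic only (an assumption, not a Torch API):
    - gzip/zip magic numbers: the file is probably still a compressed archive
    - only printable ASCII and whitespace: probably Torch 'ascii' mode
    - anything else: treated as 'binary'
    """
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    if not head:
        return "empty"
    # gzip files start with 1f 8b; zip files start with "PK\x03\x04".
    if head.startswith(b"\x1f\x8b") or head.startswith(b"PK\x03\x04"):
        return "archive"
    # Printable ASCII plus tab, newline and carriage return.
    printable = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    if all(byte in printable for byte in head):
        return "ascii"
    return "binary"
```

If the function reports "binary" for a file that is then opened in 'ascii' mode (or the reverse), a mode mismatch is one possible explanation for read errors; it is not the only one, since differences between Lua versions or installed packages can also make a valid file fail to load.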

@imelekhov
Collaborator

I see. Sure, no problem. I have converted the original weights to ASCII format and put an archive here. The MD5 sum of the file inside is afcb6f1be9caf4a23d94b399fddfeb3d. Let me know if something goes wrong.

@SHENG-KAI-HUANG
Author

SHENG-KAI-HUANG commented Nov 3, 2018

Well, I still have a problem here.
The error message when loading the ASCII model is:

Warning: Failed to load function from bytecode: (binary): cannot load incompatible bytecode
Warning: Failed to load function from bytecode: [string "2..."]:1: unexpected symbol near '2'
luajit: /home/mark/torch/install/share/lua/5.1/torch/File.lua:259: read error: read 0 blocks instead of 1 at /home/mark/torch/pkg/torch/lib/TH/THDiskFile.c:352

I am using Ubuntu 16.04 with Lua 5.1 now; I am not sure whether the Lua version has an impact.
But it looks like some symbol (or string?) in the ASCII file cannot be recognized on my machine.
I will find some time to install CUDA 9.2 and cuDNN 5.1 and try again, and I will let you know the result as soon as possible.

By the way, would you mind sharing the landmarks dataset that you used for training and validation in the paper?
I have looked at the original dataset, but I don't know how to use it in the way you describe in the paper.
