Architecture issues with Torch.load #42

Closed
bamos opened this issue Oct 29, 2015 · 40 comments

@bamos
Collaborator

bamos commented Oct 29, 2015

From @ananghudaya in #26:

th> net = torch.load('./models/openface/nn4.v1.t7')
...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:289: table index is nil
stack traceback:
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:289: in function 'readObject'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:311: in function 'load'
    [string "net = torch.load('./models/openface/nn4.v1.t7')"]:1: in main chunk
    [C]: in function 'xpcall'
    ...e/ananghudaya/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...daya/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x0804d6d0  
@bamos bamos added the bug label Oct 29, 2015
@bamos bamos added this to the v0.2.0 milestone Oct 29, 2015
@bamos bamos mentioned this issue Oct 29, 2015
@bamos
Collaborator Author

bamos commented Oct 29, 2015

@ananghudaya - https://github.com/teradeep/demo-apps/issues/4 indicates this is an architecture issue. It is a limitation of torch.load I wasn't aware of, though it is clearly documented at https://github.com/torch/torch7/blob/master/doc/serialization.md. I saved the binary model on x86_64 and I think it's only compatible with x86_64. Are you using 32-bit x86 or ARM?

I've saved the model in ASCII format. Can you download it from here and unxz it?

$ md5sum nn4.v1.ascii.t7
735723e2c9cc4eefc00a7df34c9a4d3b  nn4.v1.ascii.t7

Try loading it with:

$ th
th> require 'nn'
th> require 'dpnn'
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')

If this works, I think you'll just need to replace nn4.v1.t7 with nn4.v1.ascii.t7 in the Python demos and add 'ascii' to the torch.load call in https://github.com/cmusatyalab/openface/blob/master/openface/openface_server.lua.

Even though the ascii model is larger, I'll use it in place of the binary one everywhere to avoid issues like this. Thanks for the useful info and for helping me improve the project. I'll make the changes over the next few days.
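
For anyone who would rather verify the download from Python than with md5sum, here is a minimal sketch. The helper names `md5_of_file` and `model_matches` are made up for illustration; the expected digest is the one quoted above for nn4.v1.ascii.t7.

```python
import hashlib

# Checksum quoted in this thread for the uncompressed ASCII model.
EXPECTED_MD5 = "735723e2c9cc4eefc00a7df34c9a4d3b"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large model file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_matches(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling `model_matches('nn4.v1.ascii.t7')` after unxz should return True if the download is intact.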

@ananghudaya

Thanks @bamos

Still no luck in getting it right. I've downloaded and verified the ASCII model. Here is the output:

th> net = torch.load('nn4.v1.ascii.t7', 'ascii')
cannot open <nn4.v1.ascii.t7> in mode r  at /home/ananghudaya/torch/pkg/torch/lib/TH/THDiskFile.c:484
stack traceback:
    [C]: at 0xb720afc0
    [C]: in function 'DiskFile'
    ...e/ananghudaya/torch/install/share/lua/5.1/torch/File.lua:309: in function 'load'
    [string "net = torch.load('nn4.v1.ascii.t7', 'ascii')"]:1: in main chunk
    [C]: in function 'xpcall'
    ...e/ananghudaya/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...daya/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x0804d6d0  

I'm using a 32-bit machine.

@bamos
Collaborator Author

bamos commented Oct 30, 2015

Hi Anang, this error looks like Torch can't find the file.
Did you unxz it and check the md5sum?

-Brandon.


@ananghudaya

Hi @bamos,

Yes, I did. The md5 checksum matches, and I have placed the file in the same folder as the other models.

@bamos
Collaborator Author

bamos commented Nov 3, 2015

Please double check the path to the model.
The error message you're getting is the same error message I get for incorrect paths.

th> model = torch.load('/tmp/does-not-exist.t7')
cannot open </tmp/does-not-exist.t7> in mode r  at /home/bamos/torch/pkg/torch/lib/TH/THDiskFile.c:484
stack traceback:
    [C]: at 0x7f4389ef2a90
    [C]: in function 'DiskFile'
    /home/bamos/torch/install/share/lua/5.1/torch/File.lua:292: in function 'load'
    [string "model = torch.load('/tmp/does-not-exist.t7')"]:1: in main chunk
    [C]: in function 'xpcall'
    /home/bamos/torch/install/share/lua/5.1/trepl/init.lua:648: in function 'repl'
    ...amos/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:185: in main chunk
    [C]: at 0x00406670
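
Since "cannot open ... in mode r" does not say why the open failed, a quick check from Python can narrow it down before handing the path to Torch. This is only a sketch; `describe_model_path` is a hypothetical helper, not part of OpenFace.

```python
import os


def describe_model_path(path):
    """Return (absolute_path, problem); problem is None when the file looks loadable."""
    # Resolve "~" and relative paths so the message shows what will actually be opened.
    absolute = os.path.abspath(os.path.expanduser(path))
    if not os.path.exists(absolute):
        return absolute, "no such file"
    if not os.path.isfile(absolute):
        return absolute, "not a regular file"
    if not os.access(absolute, os.R_OK):
        return absolute, "not readable by the current user"
    return absolute, None
```

Printing the returned absolute path also shows which working directory a relative path was resolved against, which is a common source of this error.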

bamos pushed a commit that referenced this issue Nov 4, 2015
@bamos
Collaborator Author

bamos commented Nov 4, 2015

The ascii model loads in about 30-45 seconds for me and the x86 binary model loads in a few seconds. I'll add a fallback mechanism when we transition to a Lua server in #4 instead of a Lua subprocess so only non 64-bit x86 users will have the 30 second penalty, and it will only be for the first time they start the server, not every time they try to run a new Python program using OpenFace.
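
One way such a fallback could look on the Python side is sketched below. This is an assumption about the design, not the code that was eventually committed; `pick_model` and the list of machine names are illustrative only.

```python
import platform

# Machine strings commonly reported for 64-bit x86; treated here as an assumption.
X86_64_NAMES = ("x86_64", "amd64")


def pick_model(binary_path, ascii_path, machine=None):
    """Return (path, torch_load_format) for the given machine string.

    The fast binary model is used only on 64-bit x86; every other
    architecture falls back to the slower but portable ASCII model.
    """
    machine = (machine or platform.machine()).lower()
    if machine in X86_64_NAMES:
        return binary_path, "binary"
    return ascii_path, "ascii"
```

The second element of the result is the format string that would be passed as the second argument of torch.load.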

@snowlord

I faced the same problem: Torch can't load nn4.v1.ascii.t7. I downloaded nn4.v1.ascii.t7 and checked the md5. As @bamos said, this can be caused by an incorrect path, but I tried an absolute path and it still showed:
cannot open <nn4.v1.ascii.t7> in mode r at /tmp/luarocks_torch-scm-1../torch7/lib/TH/THDiskFile.c :484

@bamos
Collaborator Author

bamos commented Nov 17, 2015

Hi @snowlord - strange! Can you (or @ananghudaya) try saving a small file in binary format, then loading it? Then doing the same with an ASCII-formatted file?

/tmp$ th
th> t = torch.Tensor(10)
th> torch.save('test-binary.t7', t)
th> t2 = torch.load('test-binary.t7')
th> torch.save('test-ascii.t7', t, 'ascii')
th> t3 = torch.load('test-ascii.t7', 'ascii')
th> t:eq(t2):all()
true
th> t:eq(t3):all()
true

If this works, can you then try doing it in a different directory that's not your current working directory?

@snowlord

Hi @bamos, I switched to a 64-bit x86 machine and checked the md5 of the model file. It now shows a different problem:

th> torch.load('./models/openface/nn4.v1.t7')
/usr/local/share/lua/5.1/torch/File.lua:294: unknown object
stack traceback:
    [C]: in function 'error'
    /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:240: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:288: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:272: in function 'readObject'
    /usr/local/share/lua/5.1/torch/File.lua:319: in function 'load'
    [string "_RESULT={torch.load('./models/openface/nn4.v1..."]:1: in main chunk
    [C]: in function 'xpcall'
    /usr/local/share/lua/5.1/trepl/init.lua:650: in function 'repl'
    /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
    [C]: at 0x00406260  

@bamos
Collaborator Author

bamos commented Nov 24, 2015

Hi @snowlord - interesting you're seeing that on 64-bit x86. Somebody in this thread on the torch mailing list got a similar unknown object error and said it was an architecture issue: https://groups.google.com/forum/#!msg/torch7/zNNdXATZxlA/z5A2HocVCgAJ

Does the ascii model work on your 64-bit x86 machine?

@shimen

shimen commented Dec 6, 2015

Hi @bamos ,
I got the same problem:

celeb-classifier.nn4.v1.pkl  cifar10-test.t7  cifar10torchsmall.zip  cifar10-train.t7  nn2.def.lua  nn4.def.lua  nn4.v1.ascii.t7  nn4.v1.t7
-bash-4.1# th                                                                                                                              
th> require 'nn'
{..........}
                                                                      [0.0143s]
th> require 'dpnn'                                                             
true                                                                           
                                                                      [0.0113s]
th> net = torch.load('nn4.v1.t7')                               
/usr/local/share/lua/5.1/torch/File.lua:241: Failed to load function from bytecode: (binary): cannot load incompatible bytecode
stack traceback:
        [C]: in function 'error'
        /usr/local/share/lua/5.1/torch/File.lua:241: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:325: in function 'load'
        [string "net = torch.load('nn4.v1.t7')"]:1: in main chunk
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/trepl/init.lua:668: in function 'repl'
        /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x004051e0
                                                                      [0.0005s]
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')
/usr/local/share/lua/5.1/torch/File.lua:241: Failed to load function from bytecode: (binary): cannot load incompatible bytecode
stack traceback:
        [C]: in function 'error'
        /usr/local/share/lua/5.1/torch/File.lua:241: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:294: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:278: in function 'readObject'
        /usr/local/share/lua/5.1/torch/File.lua:325: in function 'load'
        [string "net = torch.load('nn4.v1.ascii.t7', 'ascii')"]:1: in main chunk
        [C]: in function 'xpcall'
        /usr/local/share/lua/5.1/trepl/init.lua:668: in function 'repl'
        /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x004051e0
                                                                      [0.0053s]
th> net = torch.load('cifar10-train.t7')
                                                                      [0.0134s]
th>

As you can see, I tried loading the model that you provided at this link:
https://groups.google.com/forum/#!msg/torch7/zNNdXATZxlA/z5A2HocVCgAJ
and it loads just fine:
net = torch.load('cifar10-train.t7')

But loading nn4.v1.t7 fails in both formats:
net = torch.load('nn4.v1.t7')
net = torch.load('nn4.v1.ascii.t7', 'ascii')

I ran an md5sum check:

md5sum models/{dlib/*.dat,openface/*.{pkl,t7}}
73fde5e05226548677a050913eed4e04  models/dlib/shape_predictor_68_face_landmarks.dat
c0675d57dc976df601b085f4af67ecb9  models/openface/celeb-classifier.nn4.v1.pkl
735723e2c9cc4eefc00a7df34c9a4d3b  models/openface/nn4.v1.ascii.t7
a59a5ec1938370cd401b257619848960  models/openface/nn4.v1.t7

I'm on x86_64 GNU/Linux.
What seems to be the problem?

Ilya

@shimen

shimen commented Dec 6, 2015

It seems to be a problem with the Lua and LuaJIT versions:

-bash-4.1# lua -v
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
-bash-4.1# luajit -v
LuaJIT 2.0.4 -- Copyright (C) 2005-2015 Mike Pall. http://luajit.org/

These are the versions I use. Which version was the model compiled with?

Ilya

@bamos
Collaborator Author

bamos commented Dec 6, 2015 via email

@shimen

shimen commented Dec 7, 2015

I installed LuaJIT 2.1.0-beta1, and now this command runs without errors:
net = torch.load('nn4.v1.t7')

I built it from:
https://github.com/torch/luajit-rocks

Make sure to pass the option -DWITH_LUAJIT21=ON to cmake:

git clone https://github.com/torch/luajit-rocks.git
cd luajit-rocks
mkdir build
cd build
cmake .. -DWITH_LUAJIT21=ON
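
To check which LuaJIT series is installed before trying the binary model, the `luajit -v` banner can be parsed. The sketch below assumes the banner format shown earlier in this thread and assumes that embedded bytecode only loads within the same major.minor series, which is what the fix above suggests. Both function names are hypothetical.

```python
import re


def parse_luajit_version(banner):
    """Extract (major, minor, patch) from a `luajit -v` banner, or None."""
    match = re.search(r"LuaJIT (\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(part) for part in match.groups()) if match else None


def same_bytecode_series(banner_a, banner_b):
    """True when both banners report the same major.minor series."""
    a = parse_luajit_version(banner_a)
    b = parse_luajit_version(banner_b)
    return a is not None and b is not None and a[:2] == b[:2]
```

For example, the 2.0.4 banner posted above and a "LuaJIT 2.1.0-beta1" banner fall into different series, so `same_bytecode_series` returns False for that pair.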

@lijian8

lijian8 commented Dec 20, 2015

Hi @bamos ,
I'm trying this on a 32-bit ARM platform and changed the Torch model load to
net = torch.load('nn4.v1.ascii.t7', 'ascii')
Strangely, when I run the compare demo script I get the following error message:

Error getting result from Torch subprocess.
Line read:

Exception:

could not convert string to float:

stdout:

stderr:

I ran the same code on an x86_64 platform and it all works, as expected since the ascii version should be platform independent. Could you give me a hint about this issue on the 32-bit ARM platform? Thanks.

@bamos
Collaborator Author

bamos commented Dec 20, 2015

Hi @lijian8,

stdout:

stderr:

Are these both empty? I would expect more content.

I tried to run the same code in X86_64 platform it's all OK since
ascii version should be platform independent. Could you give some
hint about this issue I had on ARM 32 bit platform? Thanks.

I don't have any experience executing on 32-bit ARM.
Maybe the Torch community will be able to help if we can
find a more informative error message.

-Brandon.

@lijian8

lijian8 commented Dec 21, 2015

Hi @bamos,
Yes, these are empty. I'll try running the subprocess directly in Torch to see if I can catch something.
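
A minimal sketch for running the server command by hand and capturing everything it prints is shown below. `run_and_capture` is a hypothetical helper; stdin is pointed at the null device on the assumption that the server should see end-of-input rather than wait for image paths.

```python
import os
import subprocess


def run_and_capture(cmd):
    """Run `cmd` to completion and return (exit_code, stdout, stderr) as text."""
    with open(os.devnull, "r") as devnull:
        proc = subprocess.Popen(
            cmd,
            stdin=devnull,            # the child sees EOF instead of blocking on input
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode the pipes to str
        )
        out, err = proc.communicate()
    return proc.returncode, out, err
```

Passing the same list that OpenFace prints as "cmd:" in its diagnostics (the `/usr/bin/env th .../openface_server.lua -model ... -imgDim 96` list) should surface the Lua stack trace in the returned stderr if the model fails to load.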

@bamos bamos removed this from the v0.2.0 milestone Dec 30, 2015
@SyRenity

I had a very similar issue on a Jetson TK1 board. Here is a solution from another project that might help:

git clone https://github.com/mvitez/torch7.git mvittorch7
cd mvittorch7
luarocks make rocks/torch-scm-1.rockspec

diff --git a/eval.lua b/eval.lua
index 1814180..8cad5ba 100644
--- a/eval.lua
+++ b/eval.lua
@@ -65,8 +65,21 @@ end
 -------------------------------------------------------------------------------
 -- Load the model checkpoint to evaluate
 -------------------------------------------------------------------------------
+local function load(filename)
+   local mode = 'binary'
+   local referenced = true
+   local file = torch.DiskFile(filename, 'r')
+   file[mode](file)
+   file:referenced(referenced)
+   file:longSize(8)
+   file:littleEndianEncoding()
+   local object = file:readObject()
+   file:close()
+   return object
+end
+
 assert(string.len(opt.model) > 0, 'must provide a model')
-local checkpoint = torch.load(opt.model)
+local checkpoint = load(opt.model)
 -- override and collect parameters
 if string.len(opt.input_h5) == 0 then opt.input_h5 = checkpoint.opt.input_h5 end
 if string.len(opt.input_json) == 0 then opt.input_json = checkpoint.opt.input_json end
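
The `file:longSize(8)` and `file:littleEndianEncoding()` calls in the diff tell the reader how wide and in which byte order integers were written. The Python sketch below illustrates, under the assumption that a 64-bit writer stored C longs as 8 little-endian bytes, why a reader that assumes 4-byte longs goes wrong. It is only an analogy with `struct`, not Torch's actual reader, and `read_longs` is a made-up helper.

```python
import struct


def read_longs(buf, long_size, count):
    """Decode `count` little-endian signed integers of `long_size` bytes each."""
    code = {4: "i", 8: "q"}[long_size]
    return struct.unpack_from("<" + code * count, buf)


# Two values stored as 8-byte little-endian longs, as a 64-bit writer would.
payload = struct.pack("<qq", 3, 7)

matching = read_longs(payload, 8, 2)    # reader agrees with the writer
mismatched = read_longs(payload, 4, 2)  # reader assumes 4-byte longs
```

Here `matching` is `(3, 7)`, while `mismatched` is `(3, 0)`: the second read lands in the upper half of the first value, so everything after it is misaligned. That would be consistent with the "table index is nil" and "unknown object" errors reported above.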

@jacklanchantin

I had the same issue. It was fixed by the comment from SyRenity:

git clone https://github.com/mvitez/torch7.git mvittorch7
cd mvittorch7
luarocks make rocks/torch-scm-1.rockspec

@SyRenity

SyRenity commented Feb 2, 2016

@jacklanchantin glad it helped :)

@fmassa

fmassa commented Feb 5, 2016

For information, torch/torch7#476 was merged into master some time ago, so all the changes in @mvitez branch were integrated to torch.

@lijian8

lijian8 commented Feb 10, 2016

Thanks @bamos @SyRenity @jacklanchantin, this issue should be fixed by following the instructions from @SyRenity.

@bamos
Collaborator Author

bamos commented Feb 17, 2016

Great info, thanks all!

@bamos bamos closed this as completed Feb 17, 2016
@ChrisYang

@SyRenity I am also working on a TK1 but still get an error when I load the binary model for OpenFace, even though, as mentioned above, this issue should be fixed. Do you have any clue about my error? Thanks:

net = torch.load('/home/ubuntu/Downloads/face/openface/models/openface/nn4.small2.v1.t7')
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: table index is nil
stack traceback:
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/nn/Module.lua:158: in function 'read'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
[string "net = torch.load('/home/ubuntu/Downloads/face..."]:1: in main chunk
[C]: in function 'xpcall'
/home/ubuntu/torch/install/share/lua/5.1/trepl/init.lua:669: in function 'repl'
...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x0000cff9

@bamos
Collaborator Author

bamos commented Mar 18, 2016

Hi @ChrisYang - can you try using our ascii model from http://openface-models.storage.cmusatyalab.org/nn4.small2.v1.ascii.t7.xz? Unxz it and then use ascii mode in torch.load.

@ChrisYang

@bamos thanks for your prompt reply.
Though I couldn't find your ascii file, I managed to save an ascii version on an x86 machine and can now load it on the TK1.
However, I face some new issues. It runs OK in CPU mode but very slowly on the TK1. When I call net:forward in CUDA mode I get the CUDA runtime error 'too many resources requested for launch at xxx'. Do you have any clue how to solve this?

@apeterswu

apeterswu commented Jun 3, 2016

@shimen Hi, I have the same problem as you: "File.lua failed to load function from bytecode binary string: not a precompiled chunk". I also updated my LuaJIT version to 2.1 beta, but it still fails, and I don't know what to do now. Could anyone help? Thanks.

@shimen

shimen commented Jun 5, 2016

@apeterswu Hi, I'm not sure what the problem is. Since OpenFace version 0.2 I have not had to use this command.

@weiqifa0

Subject: openface
root@tegra-ubuntu:~/openface/openface# ./demos/compare.py images/examples/{lennon*,clapton*}
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: table index is nil
stack traceback:
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:370: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/nn/Module.lua:158: in function 'read'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:351: in function 'readObject'
/home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'
...lib/python2.7/dist-packages/openface/openface_server.lua:46: in main chunk
[C]: in function 'dofile'
...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x0000cff9
Traceback (most recent call last):
File "./demos/compare.py", line 101, in <module>
d = getRep(img1) - getRep(img2)
File "./demos/compare.py", line 92, in getRep
rep = net.forward(alignedFace)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 156, in forward
rep = self.forwardPath(t)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 113, in forwardPath
""".format(self.cmd, self.p.stdout.read()))
Exception:

OpenFace: openface_server.lua subprocess has died.

  • Is the Torch command th on your PATH? Check with which th.
  • If th is on your PATH, try running ./util/profile-network.lua
    to see if Torch can correctly load and run the network.
    • If this gives illegal instruction errors, see the section on
      this in our FAQ at http://cmusatyalab.github.io/openface/faq/
    • In Docker, use a Bash login shell or source
      /root/torch/install/bin/torch-activate for the Torch environment.
  • See this GitHub issue if you are running on a non-64-bit machine:
    Architecture issues with Torch.load #42
  • Please post further issues to our mailing list at
    https://groups.google.com/forum/#!forum/cmu-openface
    Diagnostic information:
    cmd: ['/usr/bin/env', 'th', '/usr/local/lib/python2.7/dist-packages/openface/openface_server.lua', '-model', '/home/ubuntu/openface/openface/demos/../models/openface/nn4.small2.v1.t7', '-imgDim', '96']

    stdout:

Has anyone encountered this problem?

my email is 329410527@qq.com

Thank you very much.

@bamos
Collaborator Author

bamos commented Jul 14, 2016

@bamos
Collaborator Author

bamos commented Jul 19, 2016

Some users following this issue may also be interested in helping improve dlib and its face detector's speed on ARM by adding NEON instructions. Contact @davisking if interested. Here is his comment from another thread:

NEON instructions are similar enough in overall structure that you should
be able to implement alternative versions of the simd classes in dlib (e.g.
https://github.com/davisking/dlib/blob/master/dlib/simd/simd8f.h). All the
simd usage is through these classes, so if there were NEON versions of them
then things would be much faster on ARM. I've had this on my todo list for
a long time but haven't gotten around to it yet. You should give it a go :)

@maxisme

maxisme commented Aug 2, 2016

@bamos do you know of anyone successfully getting this to work on a Raspberry Pi? It is driving me crazy. I have got this working (it takes 18 seconds?!). I changed this line to use the ascii file and tried running again, but I get a File.lua:375: unknown object error. Any ideas?

@nitish11

@maxisme : I got it solved. Refer issue

@BrandonJoffe

Hey @bamos,

I've been trying to use the Docker container in Ubuntu 14.04 on 64-bit x86 architecture. I have switched to the ascii model and I'm getting the same error as weiqifa0 above. I'm not quite sure where to go from here other than doing a fresh install of OpenFace by hand, which I want to avoid. Any suggestions would be great!

Exception in thread frame_process_thread_0:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/host/system/SurveillanceSystem.py", line 534, in process_frame
predictions, alignedFace = self.recogniser.make_prediction(personimg,face_bb)
File "/host/system/FaceRecogniser.py", line 111, in make_prediction
persondict = self.recognize_face(alignedFace)
File "/host/system/FaceRecogniser.py", line 121, in recognize_face
if self.getRep(img) is None:
File "/host/system/FaceRecogniser.py", line 145, in getRep
rep = self.net.forward(alignedFace) # Gets embedding - 128 measurements
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 156, in forward
rep = self.forwardPath(t)
File "/usr/local/lib/python2.7/dist-packages/openface/torch_neural_net.py", line 113, in forwardPath
""".format(self.cmd, self.p.stdout.read()))
Exception:

OpenFace: openface_server.lua subprocess has died.

Diagnostic information:

cmd: ['/usr/bin/env', 'th', '/usr/local/lib/python2.7/dist-packages/openface/openface_server.lua', '-model', '/host/system/../models/openface/nn4.small2.v1.ascii.t7', '-imgDim', '96']

@BrandonJoffe

Don't worry, I just tested with Docker in an Ubuntu VM and it worked perfectly :) Not sure what the issue was.

@KGOURAV

KGOURAV commented Apr 2, 2017

Hey @bamos,
As you suggested, I saved the model in ascii format, and these commands work perfectly:

$ th
th> require 'nn'
th> require 'dpnn'
th> net = torch.load('nn4.v1.ascii.t7', 'ascii')

But when I then run this command:
./demos/classifier.py infer ./generated-embeddings/classifier.pkl your_test_image.jpg

this is the error I get:

/home/pi/torch/install/share/lua/5.1/torch/File.lua:375: unknown object
stack traceback:
[C]: in function 'error'
/home/pi/torch/install/share/lua/5.1/torch/File.lua:375: in function 'readObject'
/home/pi/torch/install/share/lua/5.1/torch/File.lua:409: in function 'load'

./batch-represent/main.lua:33: in main chunk
[C]: in function 'dofile'
...e/pi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00014fa8

@mattanimation

mattanimation commented May 2, 2017

Had the same issue on Ubuntu 16.04 with torch7. The ascii loading method worked with the provided ascii model download link; I just had to modify the ./batch-represent/opt.lua and main.lua files that load the model in the classification testing example on the OpenFace website. However, running the ./demo/compare.py example, which uses the OpenFace Python API, hits the same error. If the cmd in torch_neural_net.py could accept an ascii option, that might be a way around it?
self.cmd = ['/usr/bin/env', 'th', os.path.join(myDir, 'openface_server.lua'), '-model', model, '-imgDim', str(imgDim)]

-- update
I also modified the torch_neural_net.py and openface_server.lua to include the ascii argument and it indeed works as well.
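
A sketch of what that change might look like on the Python side, based on the cmd line quoted above. The `-ascii` flag and the `build_server_cmd` helper are hypothetical: the thread does not show the actual edit, and openface_server.lua would need a matching option that passes 'ascii' to torch.load.

```python
import os


def build_server_cmd(server_dir, model, img_dim, ascii_model=False):
    """Build the argument list used to start the Torch server subprocess."""
    cmd = ['/usr/bin/env', 'th',
           os.path.join(server_dir, 'openface_server.lua'),
           '-model', model, '-imgDim', str(img_dim)]
    if ascii_model:
        # Hypothetical flag: the Lua side must define it and load in 'ascii' mode.
        cmd.append('-ascii')
    return cmd
```

Keeping the flag off by default means existing callers on 64-bit x86 would keep loading the faster binary model.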

@eyobbirhanu4

is it CNN

@eyobbirhanu4

real time and web based...

@01ActeAnnaNager

had some issue

./demos/compare.py images/examples/{lennon*,clapton*}
<openface.torch_neural_net.TorchNeuralNet instance at 0x7fbfe5a0c320>
/home/sct/torch/install/bin/lua: .../sct/torch/install/share/lua/5.1/luarocks/loader.lua:117: error loading module 'treplutils' from file '/home/sct/torch/install/lib/lua/5.1/treplutils.so':
dynamic libraries not enabled; check your Lua installation
stack traceback:
[C]: in function 'a_loader'
.../sct/torch/install/share/lua/5.1/luarocks/loader.lua:117: in function <.../sct/torch/install/share/lua/5.1/luarocks/loader.lua:114>
(tail call): ?
[C]: in function 'require'
/home/sct/torch/install/share/lua/5.1/trepl/init.lua:40: in main chunk
[C]: in function 'require'
.../torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:104: in main chunk
[C]: ?
Traceback (most recent call last):
File "./demos/compare.py", line 102, in <module>
d = getRep(img1) - getRep(img2)
File "./demos/compare.py", line 93, in getRep
rep = net.forward(alignedFace)
File "/home/sct/miniconda3/envs/openface01/lib/python2.7/site-packages/openface/torch_neural_net.py", line 204, in forward
rep = self.forwardPath(t)
File "/home/sct/miniconda3/envs/openface01/lib/python2.7/site-packages/openface/torch_neural_net.py", line 161, in forwardPath
""".format(self.cmd, self.p.stdout.read()))
Exception:

OpenFace: openface_server.lua subprocess has died.

Diagnostic information:

cmd: ['/usr/bin/env', 'th', '/home/sct/miniconda3/envs/openface01/lib/python2.7/site-packages/openface/openface_server.lua', '-model', '/home/sct/CV/openface/demos/../models/openface/nn4.small2.v1.t7', '-imgDim', '96']

============

stdout:
