libtorch C++, fasterrcnn_resnet50_fpn module.forward() Assert #3349

Open
dc986 opened this issue Feb 4, 2021 · 16 comments

Comments

@dc986

dc986 commented Feb 4, 2021

🐛 Bug

module.forward() triggers a debug assertion:

File: minkernel\crts\ucrt\src\appcrt\heap\debug_heap.cpp
Line: 966

Expression: __acrt_first_block == header

To Reproduce

I loaded the scripted model with:

torch::jit::script::Module module;
try {
	module = torch::jit::load(model_path);
}
catch (const c10::Error& e) {
	std::cerr << e.what();
	return -1;
}
module.eval();

I loaded the image into a tensor with:

cv::Mat image; 
cv::Mat3f image_32fc3;

image = cv::imread(image_path, cv::IMREAD_COLOR);
auto h = image.rows;
auto w = image.cols;
auto c = image.channels();

image.convertTo(image_32fc3, CV_32FC3, 1.0f / 255.0f);
at::Tensor inputTensor = torch::from_blob(image_32fc3.data, { 1, h, w, c });
inputTensor = inputTensor.permute({ 0, 3, 1, 2 });
torch::DeviceType device_type = torch::kCPU;
inputTensor = inputTensor.to(device_type);

Both the model and the tensor seem to load correctly, but

std::vector<torch::jit::IValue>  input_to_net;
input_to_net.push_back(inputTensor);
at::Tensor output = module.forward(input_to_net).toTensor();

does not work.

Call stack: [image]

Environment

OS: Microsoft Windows 7 Professional
Language: C++
CMake version: version 3.17.1
Python version: 3.7 (64-bit runtime)
Is CUDA available: N/A
numpy==1.18.5
torch==1.7.1+cpu
torchaudio==0.7.2
torchvision==nightly

Additional context

cc @vfdev-5

@dc986
Author

dc986 commented Feb 8, 2021

Same issue with maskrcnn_resnet50_fpn.

Any ideas?

@bmanga
Contributor

bmanga commented Feb 12, 2021

Are you using the debug versions of libtorch and torchvision?

@dc986
Author

dc986 commented Feb 12, 2021

I've tried both debug and release. Neither works.

@bmanga
Contributor

bmanga commented Feb 12, 2021

Try passing in a list of image tensors (c x h x w) instead of a single tensor that contains a batch of images:

  auto imageList = c10::List<torch::Tensor>({imageTensors...});
  std::vector<torch::jit::IValue> inputs;
  inputs.emplace_back(imageList);

  torch::jit::IValue output = module.forward(inputs);

@bmanga
Contributor

bmanga commented Feb 12, 2021

For reference, this is what I use to convert a cv::Mat to a torch tensor:

torch::Tensor createImageTensor(const cv::Mat &image)
{
  cv::Mat rgbImage;
  cv::cvtColor(image, rgbImage, cv::COLOR_BGR2RGB);

  torch::Tensor tensorImage = torch::from_blob(
      rgbImage.data, {rgbImage.rows, rgbImage.cols, 3},
      torch::TensorOptions().dtype(torch::kByte).requires_grad(false));
  tensorImage = tensorImage.to(torch::kFloat);
  tensorImage /= 255.0;

  tensorImage = tensorImage.transpose(0, 1).transpose(0, 2).contiguous();
  return tensorImage;
}
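
The three steps in this helper (BGR to RGB, bytes to floats in [0, 1], and H x W x C to C x H x W) can also be written out on a raw buffer, which makes the index arithmetic behind the two transpose calls explicit. This is only a sketch in plain C++ with no OpenCV or libtorch dependency; the function name `bgrHwcToRgbChw` is hypothetical, and it assumes a tightly packed 8-bit, 3-channel image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert an interleaved 8-bit BGR image (H x W x 3)
// into a planar float RGB buffer (3 x H x W) scaled to [0, 1].
// Assumes the input is tightly packed, with no row padding.
std::vector<float> bgrHwcToRgbChw(const std::vector<std::uint8_t>& bgr,
                                  std::size_t height, std::size_t width) {
    const std::size_t channels = 3;
    assert(bgr.size() == height * width * channels);

    std::vector<float> chw(bgr.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            for (std::size_t c = 0; c < channels; ++c) {
                // Source index in HWC order; reading channel (2 - c) swaps B and R.
                const std::size_t src =
                    (y * width + x) * channels + (channels - 1 - c);
                // Destination index in CHW order.
                const std::size_t dst = (c * height + y) * width + x;
                chw[dst] = static_cast<float>(bgr[src]) / 255.0f;
            }
        }
    }
    return chw;
}
```

For a 1 x 2 image whose pixels are pure red and pure blue, the output holds the red plane first, then green, then blue, which is the layout the `.contiguous()` call produces in the tensor version.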

@dc986
Author

dc986 commented Feb 12, 2021

Thanks for your answer. This is my code now:

torch::Tensor t1 = createImageTensor(image);
torch::Tensor t2 = createImageTensor(image);

auto imageList = c10::List<torch::Tensor>({ t1, t2 });
std::vector<torch::jit::IValue> input_to_net;
input_to_net.emplace_back(imageList);

auto output = module.forward(input_to_net);

Unfortunately, it triggers the assert again.

Looking at the call stack where the exception is thrown:

torch_cpu.dll!torch::jit::Module::forward(std::vector<c10::IValue,std::allocator<c10::IValue>> inputs)

In file "libtorch-win-shared-with-deps-debug-1.7.1+cpu\libtorch\include\torch\csrc\jit\api\module.h" Line 112

 IValue forward(std::vector<IValue> inputs) {
    return get_method("forward")(std::move(inputs));
  }

input.size() = 0
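
A possible reading of the empty vector, offered as an assumption since the thread does not say at which point the debugger inspected the frame: `forward` takes its argument by value and then passes it on with `std::move`, so once the inner call has started, the parameter in `forward`'s own frame is a moved-from vector and is expected to be empty. The sketch below reproduces that shape in plain C++; `consume`, `forwardLike`, and `MoveReport` are hypothetical stand-ins, not libtorch names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the callee that takes ownership of the arguments.
std::size_t consume(std::vector<int> stack) { return stack.size(); }

struct MoveReport {
    std::size_t seenByCallee;   // elements the callee received
    std::size_t leftInWrapper;  // elements left in the wrapper's parameter
};

// Same shape as the forward() wrapper quoted above:
// a by-value parameter that is handed on with std::move.
MoveReport forwardLike(std::vector<int> inputs) {
    const std::size_t seen = consume(std::move(inputs));
    // The vector move constructor leaves its source empty.
    return MoveReport{seen, inputs.size()};
}
```

Calling `forwardLike({1, 2})` reports two elements seen by the callee and zero left in the wrapper, so a size of 0 observed in that frame does not by itself show that the inputs were lost.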

@bmanga
Contributor

bmanga commented Feb 12, 2021

Can you verify that you can correctly run the tracing test?

@dc986
Author

dc986 commented Feb 16, 2021

I've run the tracing test. The output is the same as what I got with the image.

I also downloaded libtorch nightly and vision master and ran the same tracing test again.
This time the error occurs earlier: when I try to load the model, I get the following errors:
[image]

@bmanga
Contributor

bmanga commented Feb 16, 2021

Did you modify the source code of the test? It seems to be trying to load a file called fasterrcnn_resnet50_fpn_1602_nightly.pth. Files with the .pth extension are not usually the scripted/traced ones.

@dc986
Author

dc986 commented Feb 16, 2021

Yes, sorry, I've tried both.
The .pt extension gives the same output.
[image]

@bmanga
Contributor

bmanga commented Feb 16, 2021

You shouldn't have to modify the source code. The .pt file is generated by the Python script in the tracing directory, so make sure you run that one first.

@dc986
Author

dc986 commented Feb 16, 2021

I compiled the test using CMake: it runs, the model loads correctly, and forward() gives no problems.

When I use the model traced by the test in my Visual Studio project, I am back to the original issue: inference does not work.
I've set Include Directories, Library Directories, and Linker Input. A post-build event copies the torch and torchvision DLLs to the directory containing the executable.
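
Since the same model works from the CMake build but not from the hand-configured Visual Studio project, one thing worth ruling out is a build-configuration mismatch. The thread never confirms this as the cause, so treat it as a hypothesis: the `__acrt_first_block == header` assertion is commonly seen when memory is allocated under one CRT configuration and freed under another, for example when a debug executable passes standard-library containers to a release DLL. The sketch below only reports the relevant compiler macros so the two builds can be compared; `describeBuildConfig` is a hypothetical helper name.

```cpp
#include <string>

// Hypothetical helper: summarize the compiler settings that must agree
// between the application and the DLLs it exchanges containers with.
std::string describeBuildConfig() {
    std::string s;
#ifdef _MSC_VER
    s += "MSVC " + std::to_string(_MSC_VER);
  #ifdef _DEBUG
    s += ", debug CRT (_DEBUG defined)";
  #else
    s += ", release CRT (_DEBUG not defined)";
  #endif
  #ifdef _DLL
    s += ", dynamic CRT (/MD or /MDd)";
  #else
    s += ", static CRT (/MT or /MTd)";
  #endif
  #ifdef _ITERATOR_DEBUG_LEVEL
    s += ", _ITERATOR_DEBUG_LEVEL=" + std::to_string(_ITERATOR_DEBUG_LEVEL);
  #endif
#else
    s += "non-MSVC compiler";
#endif
    return s;
}
```

Printing this string from both the CMake-built test and the Visual Studio project would show whether the two are compiled with the same runtime settings as the libtorch package being linked (debug package with a Debug configuration, release package with a Release configuration).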

@bmanga
Contributor

bmanga commented Feb 16, 2021

Can you share the Python code you use to generate the TorchScript file?

@dc986
Author

dc986 commented Feb 18, 2021

This is my code:

import cv2
import os, sys, time, datetime, random

from PIL import Image
from matplotlib import pyplot as plt

import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
model.eval()

traced_model = torch.jit.script(model)
traced_model.save("my_fasterrcnn_resnet50_fpn.pt")

@bmanga
Contributor

bmanga commented Feb 19, 2021

That looks fine. If you can't correctly run your my_fasterrcnn_resnet50_fpn.pt in the tracing test, I'm out of ideas :/.

@Aquapisces

Aquapisces commented Jan 12, 2022

I may have accidentally found a solution. I ran into a problem similar to yours.
Original code:
auto InputTensor = torch::from_blob(mGlobalCam_P.data, {1, mGlobalCam_P.rows, mGlobalCam_P.cols, 3 }, torch::kFloat);
InputTensor = InputTensor.permute({ 0,3,1,2 });

Working code:
auto InputTensor = torch::from_blob(mGlobalCam_P.data, {1, mGlobalCam_P.rows, mGlobalCam_P.cols, 3 }, torch::kByte);
InputTensor = InputTensor.permute({ 0,3,1,2 }).to(torch::kFloat);
I don't know why, but it seems better to use kByte in from_blob.
Let me know whether this works for you.
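
One plausible explanation, assuming `mGlobalCam_P` is an 8-bit image (the post does not say): `from_blob` neither copies nor converts the data, it only reinterprets the existing memory with the dtype it is given. Declaring an 8-bit buffer as `kFloat` therefore makes the tensor span four times as many bytes as the buffer holds. The sketch below shows the size arithmetic in plain C++; `requiredBytes` and `viewFitsBuffer` are hypothetical helper names, and the 480 x 640 size in the usage note is only an example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes that a view with this shape and element size would read.
std::size_t requiredBytes(const std::vector<std::int64_t>& shape,
                          std::size_t elementSize) {
    std::size_t count = 1;
    for (std::int64_t dim : shape) {
        count *= static_cast<std::size_t>(dim);
    }
    return count * elementSize;
}

// Hypothetical helper: true when the view stays inside the underlying buffer.
bool viewFitsBuffer(std::size_t bufferBytes,
                    const std::vector<std::int64_t>& shape,
                    std::size_t elementSize) {
    return requiredBytes(shape, elementSize) <= bufferBytes;
}
```

For a 480 x 640 image with three 8-bit channels, the buffer holds 921600 bytes. A `{1, 480, 640, 3}` view with 1-byte elements fits exactly, while the same shape with 4-byte floats would need 3686400 bytes and read past the end of the buffer. Wrapping the data as `kByte` and converting with `.to(torch::kFloat)` afterwards avoids that.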


4 participants