
[Build] choose_qparams_tensor_out has wrong return type, causing a build failure on native Windows #4659

Closed
@python3kgae

Description


🐛 Describe the bug

When building on native Windows, I encountered an undefined symbol error.

lld-link : error : undefined symbol: __declspec(dllimport) class std::tuple<class at::Tensor &, class at::Tensor &> __cdecl torch::executor::native::choose_qparams_tensor_out(class at::Tensor const &, __int64, __int64, double, enum c10::ScalarType, class at::Tensor &, class at::Tensor &)

This issue can be worked around with the following patch.

diff --git a/kernels/quantized/cpu/op_choose_qparams.cpp b/kernels/quantized/cpu/op_choose_qparams.cpp
index 47f261407..9bda17192 100644
--- a/kernels/quantized/cpu/op_choose_qparams.cpp
+++ b/kernels/quantized/cpu/op_choose_qparams.cpp
@@ -149,7 +149,7 @@ void choose_qparams(
 }
 } // namespace
 
-std::tuple<Tensor, Tensor> choose_qparams_tensor_out(
+std::tuple<Tensor&, Tensor&> choose_qparams_tensor_out(
     const Tensor& input,
     int64_t quant_min,
     int64_t quant_max,
@@ -164,7 +164,7 @@ std::tuple<Tensor, Tensor> choose_qparams_tensor_out(
   return {scale_out, zero_point_out};
 }
 
-::std::tuple<Tensor, Tensor> choose_qparams_tensor_out(
+::std::tuple<Tensor&, Tensor&> choose_qparams_tensor_out(
     RuntimeContext& context,
     const Tensor& input,
     int64_t quant_min,

Versions

Collecting environment information...
PyTorch version: 2.5.0.dev20240716+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Pro
GCC version: Could not collect
Clang version: 18.1.8
CMake version: version 3.30.2
Libc version: N/A

Python version: 3.10.0 | packaged by conda-forge | (default, Nov 10 2021, 13:20:59) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: False
CUDA runtime version: 12.2.140
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3070 Ti
Nvidia driver version: 551.76
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=3501
DeviceID=CPU0
Family=107
L2CacheSize=16384
L2CacheSpeed=
Manufacturer=AuthenticAMD
MaxClockSpeed=3501
Name=AMD Ryzen Threadripper PRO 3975WX 32-Cores
ProcessorType=3
Revision=12544

Versions of relevant libraries:
[pip3] executorch==0.4.0a0+a70d070
[pip3] numpy==1.21.3
[pip3] torch==2.5.0.dev20240716+cpu
[pip3] torchaudio==2.4.0.dev20240716+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0.dev20240716+cpu
[conda] executorch 0.4.0a0+a70d070 pypi_0 pypi
[conda] numpy 1.21.3 pypi_0 pypi
[conda] torch 2.5.0.dev20240716+cpu pypi_0 pypi
[conda] torchaudio 2.4.0.dev20240716+cpu pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchvision 0.20.0.dev20240716+cpu pypi_0 pypi

Metadata
Labels

module: kernels — Issues related to kernel libraries and utilities, and code under kernels/
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
