Windows use mkl static lib (take 2) #1798

Merged: 9 commits, Apr 26, 2024

xuhancn commented Apr 24, 2024

Resubmits #1790 together with the fix from PR #1797.

While looking into PyTorch issue pytorch/pytorch#124009, I found that libtorch seems to use the shared MKL library and is missing some MKL DLL files.

  1. PyTorch on Linux already uses the static MKL library.
  2. Windows can also support the static MKL library. I validated this while updating the build guide to use mkl-static: pytorch#116946
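
As a rough way to check the observation above, the following Python sketch lists files in a package directory whose names look like MKL runtime DLLs. The `mkl_*.dll` name pattern, the helper names `mkl_dll_names` and `find_mkl_dlls`, and the sample file names are assumptions made for illustration and are not taken from this PR.

```python
from fnmatch import fnmatch
from pathlib import Path


def mkl_dll_names(filenames):
    """Return the names that look like MKL runtime DLLs.

    Assumption: MKL DLLs follow the pattern mkl_*.dll (compared case-insensitively).
    """
    return sorted(name for name in filenames if fnmatch(name.lower(), "mkl_*.dll"))


def find_mkl_dlls(root):
    """Recursively collect MKL-looking DLL names under a package directory."""
    return mkl_dll_names(p.name for p in Path(root).rglob("*") if p.is_file())


# Example with a hypothetical file listing; a real check would call
# find_mkl_dlls() on an unpacked libtorch or wheel directory.
sample = ["torch_cpu.dll", "mkl_core.2.dll", "MKL_intel_thread.2.dll", "c10.dll"]
print(mkl_dll_names(sample))
```

An empty result for a Windows package would be consistent with MKL being linked statically, although it does not prove it, since a DLL could also be loaded from elsewhere on the system.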

Tested in https://github.com/pytorch/pytorch/actions/runs/8836875904/job/24264643410

malfet left a comment

Please create a test PR against PyTorch/PyTorch

xuhancn commented Apr 24, 2024

Triggered a test from the dummy PR: pytorch/pytorch#124853
From the result, https://github.com/pytorch/pytorch/actions/runs/8821164940/job/24216482475, I still need to do some work to enable mkl-static for PyTorch on Windows: pytorch/pytorch#124869

xuhancn commented Apr 25, 2024

> Triggered a test from the dummy PR: pytorch/pytorch#124853 From the result, https://github.com/pytorch/pytorch/actions/runs/8821164940/job/24216482475, I still need to do some work to enable mkl-static for PyTorch on Windows: pytorch/pytorch#124869

After pytorch/pytorch#124925 was merged, the "static MKL not supported" issue is gone.
Test job: https://github.com/pytorch/pytorch/actions/runs/8834373375/job/24255862902?pr=124853

xuhancn commented Apr 25, 2024

malfet changed the title from "Xu win mkl static" to "Windows use mkl static lib (take 2)" on Apr 25, 2024
malfet self-requested a review on April 25, 2024 17:46
xuhancn commented Apr 25, 2024

malfet merged commit eebc2f0 into pytorch:main on Apr 26, 2024
2 checks passed
xuhancn deleted the xu_win_mkl_static branch on April 26, 2024 16:30
atalman added a commit that referenced this pull request Jun 11, 2024
atalman commented Jun 11, 2024

I am getting an error with the latest Windows AMI:

-- Checking for [blis]
--   Library blis: BLAS_blis_LIBRARY-NOTFOUND
-- Checking for [Accelerate]
--   Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND
-- Checking for [vecLib]
--   Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND
-- Checking for [flexiblas]
--   Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND
-- Checking for [openblas]
--   Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
-- Checking for [openblas - pthread - m]
--   Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
-- Checking for [openblas - pthread - m - gomp]
--   Library openblas: BLAS_openblas_LIBRARY-NOTFOUND
-- Checking for [libopenblas]
--   Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND
-- Checking for [goto2 - gfortran]
--   Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
-- Checking for [goto2 - gfortran - pthread]
--   Library goto2: BLAS_goto2_LIBRARY-NOTFOUND
-- Checking for [acml - gfortran]
--   Library acml: BLAS_acml_LIBRARY-NOTFOUND
-- Checking for [blis]
--   Library blis: BLAS_blis_LIBRARY-NOTFOUND
-- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) 
-- Checking for [ptf77blas - atlas - gfortran]
--   Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND

and finally failing:

2024-06-11T16:40:56.0998121Z UfuncCUDA_add.cu
2024-06-11T16:44:04.9996771Z tmpxft_00000284_00000000-7_UfuncCUDA_add.compute_90.cudafe1.cpp
2024-06-11T16:44:04.9997305Z 
2024-06-11T16:44:04.9998101Z [8312/8413] Building CUDA object caffe2\CMakeFiles\torch_cuda.dir\__\aten\src\ATen\native\cuda\Unique.cu.obj
2024-06-11T16:44:04.9999192Z Unique.cu
2024-06-11T16:44:04.9999334Z 
2024-06-11T16:44:04.9999562Z tmpxft_00000ef4_00000000-7_Unique.compute_90.cudafe1.cpp
2024-06-11T16:44:04.9999918Z 
2024-06-11T16:44:05.0000129Z [8313/8413] Linking CXX shared library bin\torch_cuda.dll
2024-06-11T16:44:05.0000668Z FAILED: bin/torch_cuda.dll lib/torch_cuda.lib 
2024-06-11T16:44:05.0005278Z cmd.exe /C "cd . && C:\actions-runner\_work\pytorch-canary\pytorch-canary\builder\windows\conda\envs\py38\Library\bin\cmake.exe -E vs_link_dll --intdir=caffe2\CMakeFiles\torch_cuda.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100190~1.0\x64\mt.exe --manifests  -- C:\PROGRA~2\MICROS~1\2019\BUILDT~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\torch_cuda.rsp  /out:bin\torch_cuda.dll /implib:lib\torch_cuda.lib /pdb:bin\torch_cuda.pdb /dll /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO  && cd ."
2024-06-11T16:44:05.0059710Z LINK: command "C:\PROGRA~2\MICROS~1\2019\BUILDT~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe /nologo @CMakeFiles\torch_cuda.rsp /out:bin\torch_cuda.dll /implib:lib\torch_cuda.lib /pdb:bin\torch_cuda.pdb /dll /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO /MANIFEST /MANIFESTFILE:bin\torch_cuda.dll.manifest" failed (exit code 1181) with the following output:
2024-06-11T16:44:05.0062298Z LINK : fatal error LNK1181: cannot open input file 'NOTFOUND.lib'
2024-06-11T16:44:05.0062728Z 
2024-06-11T16:44:05.0062875Z ninja: build stopped: subcommand failed.
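
One plausible reading of the log above is that a library CMake could not locate ended up on the linker command line, which would account for `NOTFOUND.lib`; the log alone does not confirm this. The Python sketch below only gathers the evidence from such a log: it collects the libraries reported as `...-NOTFOUND` and the input files named in LNK1181 errors. The function name `summarize_log` and both regular expressions are assumptions written against the line formats shown in this comment, not part of any PyTorch or CMake tooling.

```python
import re

# Matches CMake status lines such as
# "--   Library blis: BLAS_blis_LIBRARY-NOTFOUND".
LIB_NOTFOUND = re.compile(r"^--\s+Library (\S+): (\S+-NOTFOUND)\s*$")

# Matches MSVC linker failures such as
# "LINK : fatal error LNK1181: cannot open input file 'NOTFOUND.lib'".
LNK1181 = re.compile(r"fatal error LNK1181: cannot open input file '([^']+)'")


def summarize_log(lines):
    """Return (libraries reported NOTFOUND, input files the linker could not open)."""
    missing = []   # library names, de-duplicated, in first-seen order
    unopened = []  # file names taken from LNK1181 errors
    for line in lines:
        found = LIB_NOTFOUND.match(line)
        if found and found.group(1) not in missing:
            missing.append(found.group(1))
        failed = LNK1181.search(line)
        if failed:
            unopened.append(failed.group(1))
    return missing, unopened


# Example using a few lines copied from the log above.
log = [
    "-- Checking for [blis]",
    "--   Library blis: BLAS_blis_LIBRARY-NOTFOUND",
    "--   Library openblas: BLAS_openblas_LIBRARY-NOTFOUND",
    "2024-06-11T16:44:05.0062298Z LINK : fatal error LNK1181: cannot open input file 'NOTFOUND.lib'",
]
missing, unopened = summarize_log(log)
print("not found:", missing)
print("linker could not open:", unopened)
```

Running this over the full CI log would show at a glance which BLAS candidates were probed and rejected before the link step failed; inspecting `CMakeFiles\torch_cuda.rsp` for a `NOTFOUND` entry would be the natural next step.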

4 participants