MKL-AOT.md

Notes for users of the MKL release (Intel GPU)

Why is the separation process so slow?

The official IPEX (Intel Extension for PyTorch) package is built with JIT (Just-In-Time) support only, not AOT (Ahead-Of-Time); the Demucs-GUI release is packed with this same package. As a result, if you use an Intel GPU, the first separation after each start of Demucs-GUI takes a long time (normally more than 5 minutes) while the model is compiled. The model is recompiled every time you restart Demucs-GUI. JIT compilation may also fail occasionally; if it does, restart Demucs-GUI.

The official package has no AOT support because AOT binaries have to be compiled separately for each GPU architecture, and including all architectures (16 in total) would make the package too large (20 GB+). Instead, I have built the AOT binaries separately for each architecture and uploaded them to FossHUB.

I have built several versions of intel_extension_for_pytorch. All binaries are built for Windows x86_64 and Python 3.11. Demucs-GUI 1.1a2 to 1.2a1 are packed with torch 2.1.0a0+git7bcf7da (patched by Intel) and intel_extension_for_pytorch 2.1.10+git45400a8; from 1.2b1, Demucs-GUI is packed with torch 2.1.0 and intel_extension_for_pytorch 2.1.30+xpu; and from 1.3a1, it is packed with intel_extension_for_pytorch 2.1.40+xpu. You can also install 2.1.30+xpu and 2.1.40+xpu from my own redistribution GitHub repo. The support list for each version is shown below. Note that each support list belongs to a version of intel_extension_for_pytorch, not to a version of Demucs-GUI; a Demucs-GUI version is mentioned only because that release is packed with the corresponding version of intel_extension_for_pytorch. If you are running from source code, you can use any version of intel_extension_for_pytorch that is compatible with your GPU.
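The version ranges in this paragraph can be expressed as a small lookup. The following is only a sketch: the helper names `parse_version` and `bundled_ipex` are made up for illustration and are not part of Demucs-GUI, and the parser assumes version strings shaped like `1.2a1`, `1.2b1` or `1.3`.

```python
import re

# Pre-release stages ordered so that alpha < beta < rc < final release.
_STAGE_ORDER = {"a": 0, "b": 1, "rc": 2, "": 3}


def parse_version(text):
    """Turn a version such as '1.2b1' or '1.3' into a sortable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?(?:(a|b|rc)(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor, patch, stage, number = match.groups()
    return (int(major), int(minor), int(patch or 0),
            _STAGE_ORDER[stage or ""], int(number or 0))


def bundled_ipex(demucs_gui_version):
    """Return the IPEX version packed with a Demucs-GUI release, or None."""
    version = parse_version(demucs_gui_version)
    if version >= parse_version("1.3a1"):
        return "2.1.40+xpu"
    if version >= parse_version("1.2b1"):
        return "2.1.30+xpu"
    if version >= parse_version("1.1a2"):
        return "2.1.10+git45400a8"
    return None  # releases before 1.1a2 are not described on this page
```

For example, `bundled_ipex("1.2")` returns `"2.1.30+xpu"`, because the final 1.2 release sorts after 1.2b1 and before 1.3a1.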

The tables are generated by running ocloc.exe with the device argument set to every value from 0x0000 to 0xFFFF. Theoretically, all of these GPUs should be supported (even those that are not released yet), and some unlisted GPUs may be supported as well.
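The scan over all 65536 device IDs can be sketched as below. This is not the script that produced the tables: `probe` is a hypothetical callable supplied by the caller (a real one would wrap an ocloc.exe invocation, whose exact command line is not shown on this page), and `fake_probe` only stands in for it so the example runs anywhere.

```python
from collections import defaultdict


def scan_device_ids(probe, start=0x0000, stop=0xFFFF):
    """Group 4-digit hex device IDs by the label that `probe` reports.

    `probe` is a hypothetical callable: it receives an ID such as '56A0'
    and returns a label (for example a generation code), or None when the
    ID is not recognized.
    """
    groups = defaultdict(list)
    for value in range(start, stop + 1):
        device_id = f"{value:04X}"  # zero-padded, upper-case hex
        label = probe(device_id)
        if label is not None:
            groups[label].append(device_id)
    return dict(groups)


def fake_probe(device_id):
    """Stand-in for demonstration only; it knows just three IDs."""
    sample = {"56A0": "12.55.8", "56A1": "12.55.8", "4905": "12.10.0"}
    return sample.get(device_id)


result = scan_device_ids(fake_probe)
```

Grouping by the reported label is what turns a long list of individual IDs into one table row per generation code.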

Install AOT enabled IPEX (Windows only, CPython 3.11) version 2.1.40+xpu (Demucs-GUI 1.3a1 and later)

  1. Install Intel graphics driver 31.0.101.4953 or later (Windows) from the official website.
  2. Run Demucs-GUI 1.3a1 or later. If an Intel GPU is detected, the menu bar shows the option About -> About AOT. Click it to open a dialog that asks whether to download an AOT build or open the documentation (this page). The dialog also appears if you start Demucs-GUI on Windows with IPEX enabled but with JIT only. If the build does not have Intel GPU support, the dialog shows a warning.
  3. Click the download button and your browser will download a 7z file from GitHub. Alternatively, use this direct link. The file is about 319 MB.
  4. Do not extract the whole 7z file right away: that would require 24.8 GB of disk space, and you do not need all of the files. Instead, check the desired version in Demucs-GUI's About AOT dialog, or look up the PCI ID of your GPU in the table below. Then extract only the corresponding folder from the 7z file to C:\Demucs-GUI\intel_extension_for_pytorch\bin, replacing the existing intel-ext-pt-gpu.dll file.
  5. Try separating again! The first separation should no longer take a long time. If it still does, try other versions of the AOT build.
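The file replacement in step 4 can also be scripted once the right folder has been extracted to a temporary location. This is a minimal sketch using only the Python standard library; the function name `install_aot_files` and the `.jit.bak` backup suffix are illustrative choices, not something Demucs-GUI provides or expects.

```python
import shutil
from pathlib import Path


def install_aot_files(extracted_dir, bin_dir, backup_suffix=".jit.bak"):
    """Copy every file from extracted_dir into bin_dir.

    A file that is about to be overwritten is backed up once, so the
    original JIT-only build can be restored later by renaming the backup.
    Returns the names of the files that were copied.
    """
    extracted_dir, bin_dir = Path(extracted_dir), Path(bin_dir)
    if not bin_dir.is_dir():
        raise FileNotFoundError(f"bin folder not found: {bin_dir}")
    copied = []
    for source in sorted(extracted_dir.iterdir()):
        if not source.is_file():
            continue  # this sketch ignores sub-folders
        target = bin_dir / source.name
        backup = target.with_name(target.name + backup_suffix)
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)  # keep the file being replaced
        shutil.copy2(source, target)
        copied.append(target.name)
    return copied
```

A call might look like `install_aot_files(r"C:\Temp\aot-folder", r"C:\Demucs-GUI\intel_extension_for_pytorch\bin")`, where the first path is a placeholder for wherever you extracted the folder. Close Demucs-GUI first so that the DLL is not in use.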

The following GPUs are supported with 2.1.40+xpu (for details, see find_device_win.py):

| PCI ID (Only the device part) | Codename | Generation Code | Display Name |
|---|---|---|---|
| 9A40 9A49 9A59 9A60 9A68 9A70 9A78 FF20 | Tiger Lake (tgl tgllp) | 12.0.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics; Intel® UHD G4 |
| 4C80 4C8A 4C8B 4C8C 4C90 4C9A | Rocket Lake (rkl) | 12.1.0 | Intel® UHD Graphics; Intel® UHD Graphics 750; Intel® UHD Graphics 730; Intel® UHD Graphics P750 |
| 4680 4682 4688 468A 468B 4690 4692 4693 A780 A781 A782 A783 A788 A789 A78A A78B | Alder Lake-S (adl-s), Raptor Lake-S (rpl-s) | 12.2.0 | Intel® UHD Graphics; Intel® UHD Graphics 770; Intel® UHD Graphics 730; Intel® UHD Graphics 710 |
| 4626 4628 462A 46A0 46A1 46A2 46A3 46A6 46A8 46AA 46B0 46B1 46B2 46B3 46C0 46C1 46C2 46C3 A720 A721 A7A0 A7A1 A7A8 A7A9 A7AA A7AB A7AC A7AD | Alder Lake-P (adl-p), Raptor Lake-P (rpl-p) | 12.3.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics; Intel® Graphics |
| 46D0 46D1 46D2 46D3 46D4 | Alder Lake-N (adl-n) | 12.4.0 | Intel® UHD Graphics; Intel® Graphics |
| 4905 4906 4907 4908 4909 | DG1 (dg1) | 12.10.0 | Intel® Iris® Xe MAX Graphics; Iris® Xe® Pod; Intel Server GPU SG-18M; Intel® Iris® Xe Graphics; Intel® Iris® Xe MAX 100 Graphics |
| 4F80 4F81 4F82 4F83 4F84 5690 5691 5692 56A0 56A1 56A2 56BE 56BF 56C0 56C2 | Alchemist (acm-g10 ats-m150 dg2-g10 dg2-g10-c0) | 12.55.8 | Intel® Iris® Xe Graphics; Intel® Arc™ A770M Graphics; Intel® Arc™ A730M Graphics; Intel® Arc™ A550M Graphics; Intel® Arc™ A770 Graphics; Intel® Arc™ A750 Graphics; Intel® Arc™ A580 Graphics; Intel® Arc™ A750E Graphics; Intel® Arc™ A580E Graphics; Intel® Data Center GPU Flex 170; Intel® Data Center GPU Flex 170V |
| 4F87 4F88 5693 5694 5695 56A5 56A6 56B0 56B1 56BA 56BB 56BC 56BD 56C1 | Alchemist (acm-g11 ats-m75 dg2-g11 dg2-g11-b1) | 12.56.5 | Intel® Iris® Xe Graphics; Intel® Arc™ A370M Graphics; Intel® Arc™ A350M Graphics; Intel® Iris® Xe MAX A200M Graphics; Intel® Arc™ A380 Graphics; Intel® Arc™ A310 Graphics; Intel® Arc™ Pro A30M Graphics; Intel® Arc™ Pro A40/A50 Graphics; Intel® Arc™ A380E Graphics; Intel® Arc™ A310E Graphics; Intel® Arc™ A370E Graphics; Intel® Arc™ A350E Graphics; Intel® Data Center GPU Flex 140 |
| 4F85 4F86 5696 5697 56A3 56A4 56B2 56B3 | Alchemist (acm-g12 dg2-g12 dg2-g12-a0) | 12.57.0 | Intel® Iris® Xe Graphics; Intel® Arc™ A570M Graphics; Intel® Arc™ A530M Graphics; Intel® Arc™ Xe Graphics; Intel® Arc™ Pro A60M Graphics; Intel® Arc™ Pro A60 Graphics |
| 0BD0 0B69 0B6E 0BD5 0BD6 0BD7 0BD8 0BD9 0BDA 0BDB | Ponte Vecchio (pvc pvc-xt-c0) | 12.60.7 | Intel® Data Center GPU Max 1450; Intel® Data Center GPU Max 1100C; Intel® Data Center GPU Max 1100; Intel® Data Center GPU Max 1550; Intel® Data Center GPU Max 1350; Intel® Data Center GPU Max 1100 |
| 0BD4 | Ponte Vecchio (pvc-vg pvc-xt-c0-vg) | 12.61.7 | Intel® Data Center GPU Max 1550VG |
| 7D40 7D41 7D45 7D60 7D67 | Meteor Lake-M, Meteor Lake-P, Arrow Lake-U (arl-s arl-u mtl-u mtl-s mtl-u-b0) | 12.70.4 | Intel® Graphics |
| 7D55 7DD5 | Meteor Lake-P (mtl-h mtl-p mtl-h-b0) | 12.71.4 | Intel® Arc™ Graphics; Intel® Graphics |
| 7D51 7DD1 | Arrow Lake-P (arl-h arl-h-b0) | 12.74.4 | Intel® Graphics |
| E202 E20B E20C E20D E20E E20F E212 | G21 (bmg-g21 bmg-g21-b0) | 20.1.4 | Intel® Graphics |
| E220 E221 E222 | (No acronym) | 20.2.0 | Intel® Graphics |
| 6420 64A0 64B0 | Lunar Lake (lnl-m lnl-b0) | 20.4.4 | Intel® Graphics; Intel® Arc™ Graphics 130V / 140V |

Display names above come from Intel official documentation and The PCI ID Repository.


Install AOT enabled IPEX (Windows only, CPython 3.11) version 2.1.30+xpu (Demucs-GUI 1.2b1 to 1.2)

  1. Install Intel graphics driver 31.0.101.4953 or later (Windows) from the official website.
  2. Run Demucs-GUI 1.1b1 or later. If an Intel GPU is detected, the menu bar shows the option About -> About AOT. Click it to open a dialog that asks whether to download an AOT build or open the documentation (this page). The dialog also appears if you start Demucs-GUI on Windows with IPEX enabled but with JIT only. If the build does not have Intel GPU support, the dialog shows a warning.
  3. Click the download button and your browser will open a page on FossHUB. You don't need to do anything else; the download will start automatically.
  4. Extract the downloaded 7z file. Assuming Demucs-GUI is stored in C:\Demucs-GUI, extract the 7z file to C:\Demucs-GUI\intel_extension_for_pytorch\bin, replacing the existing file.
  5. Try separating again! The first separation should no longer take a long time. If it still does, try other versions of the AOT build. You can download them by clicking the version number in the table below.
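The driver requirement in step 1 is a dotted version number, which has to be compared field by field rather than as plain text. A minimal sketch follows; the helper names are made up for illustration, and reading the installed driver version from the system is deliberately left out.

```python
MINIMUM_DRIVER = "31.0.101.4953"  # minimum version taken from step 1


def version_tuple(text):
    """Convert '31.0.101.4953' into (31, 0, 101, 4953)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_new_enough(installed, minimum=MINIMUM_DRIVER):
    """Return True if the installed driver meets the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)
```

Comparing integer tuples avoids the trap of string comparison, where "31.0.101.10000" would wrongly sort before "31.0.101.4953".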

You can also identify the PCI ID yourself. Tools like GPU-Z show your device ID in the form "8086 XXXX"; the XXXX part is the PCI ID to look up in the table below. Then download the corresponding version of the AOT build.
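The manual lookup can be sketched in a few lines. The dictionary below is only an excerpt of the 2.1.30+xpu table (three rows that each have a single generation code), and the function names are illustrative; the actual detection logic lives in find_device_win.py and may differ.

```python
import re

# Excerpt of the 2.1.30+xpu table: generation code -> PCI device IDs.
GENERATION_BY_DEVICE = {
    "12.4.0": ["46D0", "46D1", "46D2"],
    "12.10.0": ["4905", "4906", "4907", "4908"],
    "12.57.0": ["4F85", "4F86", "5696", "5697", "56A3", "56A4", "56B2", "56B3"],
}


def device_part(device_id):
    """Extract XXXX from an ID written like '8086 XXXX' (vendor, then device)."""
    match = re.fullmatch(r"\s*8086[\s:-]+([0-9A-Fa-f]{4})\s*", device_id)
    if match is None:
        raise ValueError(f"not an Intel (8086) device ID: {device_id!r}")
    return match.group(1).upper()


def generation_code(device_id):
    """Return the generation code for a device ID, or None if it is not listed."""
    part = device_part(device_id)
    for code, ids in GENERATION_BY_DEVICE.items():
        if part in ids:
            return code
    return None
```

For instance, `generation_code("8086 4905")` returns `"12.10.0"`, which is the DG1 row. A result of `None` only means the ID is missing from this excerpt, not that the GPU is unsupported.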

The following GPUs are supported with 2.1.30+xpu (for details, see find_device_win.py):

| PCI ID (Only the device part) | Architecture | Generation Code | Display Name |
|---|---|---|---|
| 9A40 9A49 9A59 9A60 9A68 9A70 9A78 FF20 | Tiger Lake (tgl tgllp) | 12.0.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics |
| 4C80 4C8A 4C8B 4C8C 4C90 4C9A | Rocket Lake (rkl) | 12.1.0 | Intel® UHD Graphics |
| 4680 4682 4688 468A 4690 4692 4693 A780 A781 A782 A783 A788 A789 A78B | Alder Lake-S, Raptor Lake-S (adl-s) | 12.2.0 | Intel® UHD Graphics |
| 4626 4628 462A 46A0 46A1 46A2 46A3 46A6 46A8 46AA 46B0 46B1 46B2 46B3 46C0 46C1 46C2 46C3 A720 A721 A7A0 A7A1 A7A8 A7A9 | Alder Lake, Raptor Lake-P (adl-p) | 12.3.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics |
| 46D0 46D1 46D2 | Alder Lake-N (adl-n) | 12.4.0 | Intel® UHD Graphics |
| 4905 4906 4907 4908 | DG1 (dg1) | 12.10.0 | Intel® Iris® Xe MAX Graphics; Intel® SG-18M (SG1); Intel® Iris® Xe Graphics |
| 4F80 4F81 4F82 4F83 4F84 5690 5691 5692 56A0 56A1 56A2 56C0 | Alchemist, Intel® Data Center GPU Flex Series (dg2-g10-a0 dg2-g10-a1 dg2-g10-b0 acm-g10 ats-m150 dg2-g10 dg2-g10-c0) | 12.55.0 12.55.1 12.55.4 12.55.8 | Intel® Arc™ A770M Graphics; Intel® Arc™ A730M Graphics; Intel® Arc™ A550M Graphics; Intel® Arc™ A770 Graphics; Intel® Arc™ A750 Graphics; Intel® Data Center GPU Flex 170 |
| 4F87 4F88 5693 5694 5695 56A5 56A6 56B0 56B1 56BA 56BB 56BC 56BD 56C1 | Alchemist, Intel® Data Center GPU Flex Series (dg2-g11-a0 dg2-g11-b0 acm-g11 ats-m75 dg2-g11 dg2-g11-b1) | 12.56.0 12.56.4 12.56.5 | Intel® Arc™ A370M Graphics; Intel® Arc™ A350M Graphics; Intel® Arc™ A380 Graphics; Intel® Arc™ A310 Graphics; Intel® Data Center GPU Flex 140; Intel® Arc™ A-series Graphics |
| 4F85 4F86 5696 5697 56A3 56A4 56B2 56B3 | Alchemist (acm-g12 dg2-g12 dg2-g12-a0) | 12.57.0 | Intel® Arc™ A-series Graphics |
| 4F8C 5698 5699 569A 56A7 56A8 | Alchemist (acm-g20 dg2-g20) | 12.58.0 | Intel® Arc™ A-series Graphics |
| 4F89 56A9 56AA | Alchemist (acm-g21 dg2-g21) | 12.59.0 | Intel® Arc™ A-series Graphics |
| 7D40 7D45 7D60 7D67 | Meteor Lake-M, Meteor Lake-P, Arrow Lake-U (xe-lpg-md-a0 mtl-m mtl-s xe-lpg-md-b0) | 12.70.0 12.70.4 | Intel® Iris® Xe Graphics; Intel® UHD Graphics |
| 7D55 7DD5 | Meteor Lake-P (xe-lpg-lg-a0 mtl-p xe-lpg-lg-b0) | 12.71.0 12.71.4 | Intel® Iris® Xe Graphics |

Install AOT enabled IPEX (Windows only, CPython 3.11) version 2.1.10+xpu (Demucs-GUI 1.1a2 to 1.2a1)

  1. Install Intel graphics driver 31.0.101.4953 or later (Windows) from the official website.
  2. Run Demucs-GUI 1.1b1 or later. If an Intel GPU is detected, the menu bar shows the option About -> About AOT. Click it to open a dialog that asks whether to download an AOT build or open the documentation (this page). The dialog also appears if you start Demucs-GUI on Windows with IPEX enabled but with JIT only. If the build does not have Intel GPU support, the dialog shows a warning.
  3. Click the download button and your browser will open a page on FossHUB. You don't need to do anything else; the download will start automatically.
  4. Extract the downloaded 7z file. Assuming Demucs-GUI is stored in C:\Demucs-GUI, extract the 7z file to C:\Demucs-GUI\intel_extension_for_pytorch\bin, replacing the existing file.
  5. Try separating again! The first separation should no longer take a long time. If it still does, try other versions of the AOT build. You can download them by clicking the version number in the table below.

You can also identify the PCI ID yourself. Tools like GPU-Z show your device ID in the form "8086 XXXX"; the XXXX part is the PCI ID to look up in the table below. Then download the corresponding version of the AOT build.

The following GPUs are supported with 2.1.10+xpu (for details, see find_device_win.py):

| PCI ID (Only the device part) | Architecture | Generation Code | Display Name |
|---|---|---|---|
| 9A40 9A49 9A59 9A60 9A68 9A70 9A78 FF20 | Tiger Lake (tgl tgllp) | 12.0.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics |
| 4C80 4C8A 4C8B 4C8C 4C90 4C9A | Rocket Lake (rkl) | 12.1.0 | Intel® UHD Graphics |
| 4680 4682 4688 468A 4690 4692 4693 A780 A781 A782 A783 A788 A789 A78B | Alder Lake-S, Raptor Lake-S (adl-s) | 12.2.0 | Intel® UHD Graphics |
| 4626 4628 462A 46A0 46A1 46A2 46A3 46A6 46A8 46AA 46B0 46B1 46B2 46B3 46C0 46C1 46C2 46C3 A720 A721 A7A0 A7A1 A7A8 A7A9 | Alder Lake, Raptor Lake-P (adl-p) | 12.3.0 | Intel® UHD Graphics; Intel® Iris® Xe Graphics |
| 46D0 46D1 46D2 | Alder Lake-N (adl-n) | 12.4.0 | Intel® UHD Graphics |
| 4905 4906 4907 4908 | DG1 (dg1) | 12.10.0 | Intel® Iris® Xe MAX Graphics; Intel® SG-18M (SG1); Intel® Iris® Xe Graphics |
| 4F80 4F81 4F82 4F83 4F84 5690 5691 5692 56A0 56A1 56A2 56C0 | Alchemist, Intel® Data Center GPU Flex Series (dg2-g10-a0 dg2-g10-a1 dg2-g10-b0 acm-g10 ats-m150 dg2-g10 dg2-g10-c0) | 12.55.0 12.55.1 12.55.4 12.55.8 | Intel® Arc™ A770M Graphics; Intel® Arc™ A730M Graphics; Intel® Arc™ A550M Graphics; Intel® Arc™ A770 Graphics; Intel® Arc™ A750 Graphics; Intel® Data Center GPU Flex 170 |
| 4F87 4F88 5693 5694 5695 56A5 56A6 56B0 56B1 56C1 | Alchemist, Intel® Data Center GPU Flex Series (dg2-g11-a0 dg2-g11-b0 acm-g11 ats-m75 dg2-g11 dg2-g11-b1) | 12.56.0 12.56.4 12.56.5 | Intel® Arc™ A370M Graphics; Intel® Arc™ A350M Graphics; Intel® Arc™ A380 Graphics; Intel® Arc™ A310 Graphics; Intel® Data Center GPU Flex 140; Intel® Arc™ A-series Graphics (future) |
| 4F85 4F86 5696 5697 56A3 56A4 56B2 56B3 | Alchemist (acm-g12 dg2-g12 dg2-g12-a0) | 12.57.0 | Intel® Arc™ A-series Graphics (future) |
| 4F8C 5698 5699 569A 56A7 56A8 | Alchemist (acm-g20 dg2-g20) | 12.58.0 | Intel® Arc™ A-series Graphics (future) |
| 4F89 56A9 56AA | Alchemist (acm-g21 dg2-g21) | 12.59.0 | Intel® Arc™ A-series Graphics (future) |