[X86][GlobalISel] Added support for SQRT function #132356

Open. Wants to merge 3 commits into base: main.

JaydeepChauhan14

No description provided.

@llvmbot

llvmbot commented Mar 21, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-x86

Author: None (JaydeepChauhan14)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/132356.diff

4 Files Affected:

  • (modified) llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp (+5)
  • (modified) llvm/lib/Target/X86/GISel/X86RegisterBankInfo.cpp (+1)
  • (added) llvm/test/CodeGen/X86/GlobalISel/sqrt.mir (+68)
  • (modified) llvm/test/CodeGen/X86/isel-sqrt.ll (+10-9)
diff --git a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
index 24bf0dd378641..b474d6a3f6356 100644
--- a/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
+++ b/llvm/lib/Target/X86/GISel/X86LegalizerInfo.cpp
@@ -105,6 +105,11 @@ X86LegalizerInfo::X86LegalizerInfo(const X86Subtarget &STI,
                                G_FEXP10, G_FLOG, G_FLOG2, G_FLOG10})
       .libcall();
 
+  getActionDefinitionsBuilder(G_FSQRT)
+      .legalFor(HasSSE1 || UseX87, {s32})
+      .legalFor(HasSSE2 || UseX87, {s64})
+      .legalFor(UseX87, {s80});
+
   // merge/unmerge
   for (unsigned Op : {G_MERGE_VALUES, G_UNMERGE_VALUES}) {
     unsigned BigTyIdx = Op == G_MERGE_VALUES ? 0 : 1;
diff --git a/llvm/lib/Target/X86/GISel/X86RegisterBankInfo.cpp b/llvm/lib/Target/X86/GISel/X86RegisterBankInfo.cpp
index 42faf4299c6d5..0baca81494694 100644
--- a/llvm/lib/Target/X86/GISel/X86RegisterBankInfo.cpp
+++ b/llvm/lib/Target/X86/GISel/X86RegisterBankInfo.cpp
@@ -288,6 +288,7 @@ X86RegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
   SmallVector<PartialMappingIdx, 4> OpRegBankIdx(NumOperands);
 
   switch (Opc) {
+  case TargetOpcode::G_FSQRT:
   case TargetOpcode::G_FPEXT:
   case TargetOpcode::G_FPTRUNC:
   case TargetOpcode::G_FCONSTANT:
diff --git a/llvm/test/CodeGen/X86/GlobalISel/sqrt.mir b/llvm/test/CodeGen/X86/GlobalISel/sqrt.mir
new file mode 100644
index 0000000000000..b55a9f8dceca0
--- /dev/null
+++ b/llvm/test/CodeGen/X86/GlobalISel/sqrt.mir
@@ -0,0 +1,68 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=i686-linux-gnu -run-pass=regbankselect,instruction-select -disable-gisel-legality-check -global-isel -verify-machineinstrs %s -o - | FileCheck %s --check-prefixes GISEL-I686
+
+---
+name:            test_sqrt_f32
+alignment:       16
+legalized:       true
+fixedStack:
+  - { id: 0, type: default, offset: 0, size: 4, alignment: 16, stack-id: default,
+      isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+body:             |
+  bb.1:
+    ; GISEL-I686-LABEL: name: test_sqrt_f32
+    ; GISEL-I686: [[LD_Fp32m:%[0-9]+]]:rfp32 = LD_Fp32m %fixed-stack.0, 1, $noreg, 0, $noreg, implicit-def $fpsw, implicit $fpcw :: (invariant load (s32) from %fixed-stack.0, align 16)
+    ; GISEL-I686-NEXT: [[SQRT_Fp32_:%[0-9]+]]:rfp32 = nofpexcept SQRT_Fp32 [[LD_Fp32m]], implicit-def dead $fpsw, implicit $fpcw
+    ; GISEL-I686-NEXT: $fp0 = COPY [[SQRT_Fp32_]]
+    ; GISEL-I686-NEXT: RET 0, implicit $fp0
+    %1:_(p0) = G_FRAME_INDEX %fixed-stack.0
+    %0:_(s32) = G_LOAD %1(p0) :: (invariant load (s32) from %fixed-stack.0, align 16)
+    %2:_(s32) = G_FSQRT %0
+    $fp0 = COPY %2(s32)
+    RET 0, implicit $fp0
+
+...
+---
+name:            test_sqrt_f64
+alignment:       16
+legalized:       true
+fixedStack:
+  - { id: 0, type: default, offset: 0, size: 8, alignment: 16, stack-id: default,
+      isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+body:             |
+  bb.1:
+    ; GISEL-I686-LABEL: name: test_sqrt_f64
+    ; GISEL-I686: [[DEF:%[0-9]+]]:rfp64 = IMPLICIT_DEF
+    ; GISEL-I686-NEXT: [[SQRT_Fp64_:%[0-9]+]]:rfp64 = nofpexcept SQRT_Fp64 [[DEF]], implicit-def dead $fpsw, implicit $fpcw
+    ; GISEL-I686-NEXT: $fp0 = COPY [[SQRT_Fp64_]]
+    ; GISEL-I686-NEXT: RET 0, implicit $fp0
+    %0:_(s64) = IMPLICIT_DEF
+    %2:_(s64) = G_FSQRT %0
+    $fp0 = COPY %2(s64)
+    RET 0, implicit $fp0
+
+...
+---
+name:            test_sqrt_f80
+alignment:       16
+legalized:       true
+fixedStack:
+  - { id: 0, type: default, offset: 0, size: 10, alignment: 16, stack-id: default,
+      isImmutable: true, isAliased: false, callee-saved-register: '', callee-saved-restored: true,
+      debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
+body:             |
+  bb.1:
+    ; GISEL-I686-LABEL: name: test_sqrt_f80
+    ; GISEL-I686: [[LD_Fp80m:%[0-9]+]]:rfp80 = LD_Fp80m %fixed-stack.0, 1, $noreg, 0, $noreg, implicit-def $fpsw, implicit $fpcw :: (invariant load (s80) from %fixed-stack.0, align 16)
+    ; GISEL-I686-NEXT: [[SQRT_Fp80_:%[0-9]+]]:rfp80 = nofpexcept SQRT_Fp80 [[LD_Fp80m]], implicit-def dead $fpsw, implicit $fpcw
+    ; GISEL-I686-NEXT: $fp0 = COPY [[SQRT_Fp80_]]
+    ; GISEL-I686-NEXT: RET 0, implicit $fp0
+    %1:_(p0) = G_FRAME_INDEX %fixed-stack.0
+    %0:_(s80) = G_LOAD %1(p0) :: (invariant load (s80) from %fixed-stack.0, align 16)
+    %2:_(s80) = G_FSQRT %0
+    $fp0 = COPY %2(s80)
+    RET 0, implicit $fp0
+
+...
diff --git a/llvm/test/CodeGen/X86/isel-sqrt.ll b/llvm/test/CodeGen/X86/isel-sqrt.ll
index 9ac68a11d748c..7204e37347618 100644
--- a/llvm/test/CodeGen/X86/isel-sqrt.ll
+++ b/llvm/test/CodeGen/X86/isel-sqrt.ll
@@ -1,13 +1,14 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2                       | FileCheck %s --check-prefixes=X64,SSE2
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2 -fast-isel            | FileCheck %s --check-prefixes=X64,SSE2
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2 -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X64,SSE2
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx                       | FileCheck %s --check-prefixes=X64,AVX
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -fast-isel            | FileCheck %s --check-prefixes=X64,AVX
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -global-isel -global-isel-abort=2 | FileCheck %s --check-prefixes=X64,AVX
-; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel=0 -fast-isel=0                  | FileCheck %s --check-prefixes=X86,SDAG-X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel                                   | FileCheck %s --check-prefixes=X86,FASTISEL-X86
-; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=2            | FileCheck %s --check-prefixes=X86,GISEL-X86
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2                 | FileCheck %s --check-prefixes=X64,SSE2
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2 -fast-isel      | FileCheck %s --check-prefixes=X64,SSE2
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2 -global-isel    | FileCheck %s --check-prefixes=X64,SSE2
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx                 | FileCheck %s --check-prefixes=X64,AVX
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -fast-isel      | FileCheck %s --check-prefixes=X64,AVX
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -global-isel    | FileCheck %s --check-prefixes=X64,AVX
+; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel=0 -fast-isel=0            | FileCheck %s --check-prefixes=X86,SDAG-X86
+; RUN: llc < %s -mtriple=i686-linux-gnu -fast-isel                             | FileCheck %s --check-prefixes=X86,FASTISEL-X86
+; TODO: The last RUN line fails GlobalISel selection and falls back to DAG selection because loads/stores are not yet supported in i686 mode. Until that support lands, llvm/test/CodeGen/X86/GlobalISel/sqrt.mir covers the i686 GlobalISel path.
+; RUN: llc < %s -mtriple=i686-linux-gnu -global-isel -global-isel-abort=2      | FileCheck %s --check-prefixes=X86,GISEL-X86
 
 define float @test_sqrt_f32(float %a) {
 ; SSE2-LABEL: test_sqrt_f32:

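The three `legalFor` rules added to X86LegalizerInfo.cpp can be read as a small truth table. The following is a minimal standalone sketch of that table, not LLVM code: the `Features` struct and the `isFSqrtLegal` helper are hypothetical names introduced here for illustration, and the flags only mirror the `HasSSE1`, `HasSSE2`, and `UseX87` names that appear in the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the feature flags referenced in the patch.
// This is an illustration only and does not use any LLVM API.
struct Features {
  bool HasSSE1;
  bool HasSSE2;
  bool UseX87;
};

// Mirrors the three legalFor() rules from the diff:
//   s32 is legal with SSE1 or x87,
//   s64 is legal with SSE2 or x87,
//   s80 is legal only with x87.
// Any other width is not covered by the rules shown and is reported as
// not legal in this sketch.
bool isFSqrtLegal(const Features &F, unsigned Bits) {
  switch (Bits) {
  case 32:
    return F.HasSSE1 || F.UseX87;
  case 64:
    return F.HasSSE2 || F.UseX87;
  case 80:
    return F.UseX87;
  default:
    return false;
  }
}
```

Under these assumptions, a configuration with only SSE1 accepts the 32-bit case and rejects the 64-bit and 80-bit cases, while any configuration with x87 enabled accepts all three widths, which matches the i686 checks in sqrt.mir selecting `SQRT_Fp32`, `SQRT_Fp64`, and `SQRT_Fp80`.
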
@JaydeepChauhan14

@arsenm, @RKSimon, @e-kud please review PR.

@RKSimon requested review from RKSimon, e-kud and arsenm on March 21, 2025, 11:21
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx,+sse2 -global-isel | FileCheck %s --check-prefixes=X64,SSE2
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx | FileCheck %s --check-prefixes=X64,AVX
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -fast-isel | FileCheck %s --check-prefixes=X64,AVX
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mattr=-avx2,+avx -global-isel | FileCheck %s --check-prefixes=X64,AVX
Contributor

-global-isel-abort=1 is required otherwise we don't know whether we test GlobalISel or not.

Contributor Author
Done

@e-kud left a comment

LGTM

Addressed the review comments.

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>

@RKSimon left a comment

LGTM
