[llvm][AArch64] SVE2 is an optional feature in ARMv9.0a #96007

Merged
jroelofs merged 4 commits into llvm:main from the jroelofs/sve2-optional branch
Jun 20, 2024

jroelofs
Contributor

@jroelofs jroelofs commented Jun 18, 2024

... so move it out of the implied_features list, and into the DefaultExts list.

@llvmbot
Member

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-aarch64

Author: Jon Roelofs (jroelofs)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/96007.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64Features.td (+1-1)
  • (modified) llvm/lib/Target/AArch64/AArch64Processors.td (+24-12)
  • (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1)
diff --git a/llvm/lib/Target/AArch64/AArch64Features.td b/llvm/lib/Target/AArch64/AArch64Features.td
index ffb899a301459..6e77d71b88829 100644
--- a/llvm/lib/Target/AArch64/AArch64Features.td
+++ b/llvm/lib/Target/AArch64/AArch64Features.td
@@ -847,7 +847,7 @@ def HasV8_9aOps : Architecture64<8, 9, "a", "v8.9a",
   !listconcat(HasV8_8aOps.DefaultExts, [FeatureSPECRES2, FeatureCSSC,
     FeatureRASv2])>;
 def HasV9_0aOps : Architecture64<9, 0, "a", "v9a",
-  [HasV8_5aOps, FeatureMEC, FeatureSVE2],
+  [HasV8_5aOps, FeatureMEC],
   !listconcat(HasV8_5aOps.DefaultExts, [FeatureFullFP16, FeatureSVE,
     FeatureSVE2])>;
 def HasV9_1aOps : Architecture64<9, 1, "a", "v9.1a",
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index e32ca629721ff..53b46ff42b72f 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -690,11 +690,13 @@ def ProcessorFeatures {
   list<SubtargetFeature> A520 = [HasV9_2aOps, FeaturePerfMon, FeatureAM,
                                  FeatureMTE, FeatureETE, FeatureSVE2BitPerm,
                                  FeatureFP16FML,
-                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes];
+                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> A520AE = [HasV9_2aOps, FeaturePerfMon, FeatureAM,
                                  FeatureMTE, FeatureETE, FeatureSVE2BitPerm,
                                  FeatureFP16FML,
-                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes];
+                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> A65  = [HasV8_2aOps, FeatureSHA2, FeatureAES, FeatureFPARMv8,
                                  FeatureNEON, FeatureFullFP16, FeatureDotProd,
                                  FeatureRCPC, FeatureSSBS, FeatureRAS,
@@ -726,19 +728,23 @@ def ProcessorFeatures {
                                  FeatureFP16FML, FeatureSVE, FeatureTRBE,
                                  FeatureSVE2BitPerm, FeatureBF16, FeatureETE,
                                  FeaturePerfMon, FeatureMatMulInt8, FeatureSPE,
-                                 FeatureSB, FeatureSSBS, FeatureFullFP16, FeaturePAuth, FeaturePredRes, FeatureFlagM];
+                                 FeatureSB, FeatureSSBS, FeatureFullFP16, FeaturePAuth, FeaturePredRes, FeatureFlagM,
+                                 FeatureSVE2];
   list<SubtargetFeature> A720 = [HasV9_2aOps, FeatureMTE, FeatureFP16FML,
                                  FeatureTRBE, FeatureSVE2BitPerm, FeatureETE,
                                  FeaturePerfMon, FeatureSPE, FeatureSPE_EEF,
-                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes];
+                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> A720AE = [HasV9_2aOps, FeatureMTE, FeatureFP16FML,
                                  FeatureTRBE, FeatureSVE2BitPerm, FeatureETE,
                                  FeaturePerfMon, FeatureSPE, FeatureSPE_EEF,
-                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes];
+                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> A725 = [HasV9_2aOps, FeatureMTE, FeatureFP16FML,
                                  FeatureETE, FeaturePerfMon, FeatureSPE,
                                  FeatureSVE2BitPerm, FeatureSPE_EEF, FeatureTRBE,
-                                 FeatureFlagM, FeaturePredRes, FeatureSB, FeatureSSBS];
+                                 FeatureFlagM, FeaturePredRes, FeatureSB, FeatureSSBS,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> R82  = [HasV8_0rOps, FeaturePerfMon, FeatureFullFP16,
                                  FeatureFP16FML, FeatureSSBS, FeaturePredRes,
                                  FeatureSB, FeatureRDM, FeatureDotProd,
@@ -771,16 +777,19 @@ def ProcessorFeatures {
                                  FeatureSPE, FeatureBF16, FeatureMatMulInt8,
                                  FeatureMTE, FeatureSVE2BitPerm, FeatureFullFP16,
                                  FeatureFP16FML,
-                                 FeatureSB, FeaturePAuth, FeaturePredRes, FeatureFlagM, FeatureSSBS];
+                                 FeatureSB, FeaturePAuth, FeaturePredRes, FeatureFlagM, FeatureSSBS,
+                                 FeatureSVE2];
   list<SubtargetFeature> X4 =   [HasV9_2aOps,
                                  FeaturePerfMon, FeatureETE, FeatureTRBE,
                                  FeatureSPE, FeatureMTE, FeatureSVE2BitPerm,
                                  FeatureFP16FML, FeatureSPE_EEF,
-                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes];
+                                 FeatureSB, FeatureSSBS, FeaturePAuth, FeatureFlagM, FeaturePredRes,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> X925 = [HasV9_2aOps, FeatureMTE, FeatureFP16FML,
                                  FeatureETE, FeaturePerfMon, FeatureSPE,
                                  FeatureSVE2BitPerm, FeatureSPE_EEF, FeatureTRBE,
-                                 FeatureFlagM, FeaturePredRes, FeatureSB, FeatureSSBS];
+                                 FeatureFlagM, FeaturePredRes, FeatureSB, FeatureSSBS,
+                                 FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> A64FX    = [HasV8_2aOps, FeatureFPARMv8, FeatureNEON,
                                      FeatureSHA2, FeaturePerfMon, FeatureFullFP16,
                                      FeatureSVE, FeatureComplxNum,
@@ -849,7 +858,8 @@ def ProcessorFeatures {
                                       FeatureFullFP16, FeatureMTE, FeaturePerfMon,
                                       FeatureRandGen, FeatureSPE, FeatureSPE_EEF,
                                       FeatureSVE2BitPerm,
-                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM];
+                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM,
+                                      FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> Neoverse512TVB = [HasV8_4aOps, FeatureBF16, FeatureCacheDeepPersist,
                                            FeatureSHA2, FeatureAES, FeatureFPARMv8, FeatureFP16FML,
                                            FeatureFullFP16, FeatureMatMulInt8, FeatureNEON,
@@ -871,12 +881,14 @@ def ProcessorFeatures {
                                       FeatureFullFP16, FeatureLS64, FeatureMTE,
                                       FeaturePerfMon, FeatureRandGen, FeatureSPE,
                                       FeatureSPE_EEF, FeatureSVE2BitPerm, FeatureBRBE,
-                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM];
+                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM,
+                                      FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> NeoverseV3AE = [HasV9_2aOps, FeatureETE, FeatureFP16FML,
                                       FeatureFullFP16, FeatureLS64, FeatureMTE,
                                       FeaturePerfMon, FeatureRandGen, FeatureSPE,
                                       FeatureSPE_EEF, FeatureSVE2BitPerm, FeatureBRBE,
-                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM];
+                                      FeatureSSBS, FeatureSB, FeaturePredRes, FeaturePAuth, FeatureFlagM,
+                                      FeatureSVE, FeatureSVE2];
   list<SubtargetFeature> Saphira    = [HasV8_4aOps, FeatureSHA2, FeatureAES, FeatureFPARMv8,
                                        FeatureNEON, FeatureSPE, FeaturePerfMon];
   list<SubtargetFeature> ThunderX   = [HasV8_0aOps, FeatureCRC, FeatureSHA2, FeatureAES,
diff --git a/llvm/unittests/TargetParser/TargetParserTest.cpp b/llvm/unittests/TargetParser/TargetParserTest.cpp
index a99ef85fbfc81..a04655ca9722a 100644
--- a/llvm/unittests/TargetParser/TargetParserTest.cpp
+++ b/llvm/unittests/TargetParser/TargetParserTest.cpp
@@ -2474,7 +2474,7 @@ AArch64ExtensionDependenciesBaseArchTestParams
          {},
          {"v8.1a", "crc", "fp-armv8", "lse", "rdm", "neon"},
          {}},
-        {AArch64::ARMV9_5A, {}, {"v9.5a", "sve", "sve2", "mops", "cpa"}, {}},
+        {AArch64::ARMV9_5A, {}, {"v9.5a", "mops", "cpa"}, {}},
 
         // Positive modifiers
         {AArch64::ARMV8A, {"fp16"}, {"fullfp16"}, {}},
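
A minimal C++ sketch of why the CPU lists in this diff gain explicit FeatureSVE and FeatureSVE2 entries once v9a stops implying SVE2. It assumes a heavily simplified implication graph (only the edges listed in the code; the real chain in AArch64Features.td has many more), and the names FeatureGraph, expandImplied and impliesGraph are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Feature name -> features it directly implies (illustrative subset only).
using FeatureGraph = std::map<std::string, std::vector<std::string>>;

// Expand a feature list by following "implies" edges until nothing new appears.
std::set<std::string> expandImplied(const FeatureGraph &Implies,
                                    const std::vector<std::string> &Roots) {
  std::set<std::string> Seen;
  std::vector<std::string> Work(Roots.begin(), Roots.end());
  while (!Work.empty()) {
    std::string F = Work.back();
    Work.pop_back();
    if (!Seen.insert(F).second)
      continue; // already expanded
    auto It = Implies.find(F);
    if (It != Implies.end())
      Work.insert(Work.end(), It->second.begin(), It->second.end());
  }
  return Seen;
}

// Assumed, simplified graph. The flag selects pre-patch (v9a implies sve2)
// or post-patch (it does not) behaviour.
FeatureGraph impliesGraph(bool V9aImpliesSVE2) {
  FeatureGraph G = {
      {"v9.2a", {"v9.1a"}},
      {"v9.1a", {"v9a"}},
      {"v9a", {"v8.5a", "mec"}},
      {"sve2", {"sve"}},
  };
  if (V9aImpliesSVE2)
    G["v9a"].push_back("sve2");
  return G;
}
```

Under this model, a CPU whose list starts with v9.2a picks up sve2 through the architecture before the patch, but after the patch it only has sve2 if the list names it explicitly, which is what the added FeatureSVE, FeatureSVE2 entries do.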

@jroelofs
Contributor Author

This is a baby step toward fixing apple-m4's version: v8.7 => v9.2

@pinskia

pinskia commented Jun 19, 2024

Seems like you should treat SIMD/neon as an optional feature of armv8-a too then.

@tmatheson-arm
Contributor

So, while "FEAT_SVE2 is OPTIONAL from Armv9.0", the PCS requires that v9.0-a has SVE2. From the Arm ARM:

"All Armv9-A systems that support standard operating systems with rich application environments also provide hardware support for SVE2 instructions. It is a requirement of the ARM Procedure Call Standard for AArch64, see Procedure Call Standard for the Arm 64-bit Architecture."

I think this is ok to do, as it keeps SVE2 enabled by default for -march=armv9-a. However for any downstream users of LLVM, "target-features"="+v9a" will not get them SVE2, they will need to explicitly add "+v9a,+sve2". That seems ok to me, but I'm not 100% sure.
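
The distinction drawn in this comment can be sketched as a toy C++ model. It is an illustration under stated assumptions, not LLVM's actual data structures: ArchModel, featuresFromAttribute, featuresFromMarch and v9aAfterPatch are hypothetical names, and the feature sets are trimmed to the entries visible in the diff.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model: an architecture carries features it *implies* (always on once
// the architecture feature itself is set) and *default extensions* (added
// only when the architecture is selected through -march=).
struct ArchModel {
  std::string Name;
  std::set<std::string> Implied;
  std::set<std::string> DefaultExts;
};

// Hypothetical: what a raw "target-features"="+<arch>" attribute yields.
std::set<std::string> featuresFromAttribute(const ArchModel &A) {
  std::set<std::string> Out = A.Implied;
  Out.insert(A.Name);
  return Out;
}

// Hypothetical: what -march=<arch> yields (implied plus default extensions).
std::set<std::string> featuresFromMarch(const ArchModel &A) {
  std::set<std::string> Out = featuresFromAttribute(A);
  Out.insert(A.DefaultExts.begin(), A.DefaultExts.end());
  return Out;
}

// v9a as it looks after this patch, reduced to the entries shown in the diff.
ArchModel v9aAfterPatch() {
  return {"v9a", {"v8.5a", "mec"}, {"fullfp16", "sve", "sve2"}};
}

// Usage: the attribute path no longer carries sve2, the -march path does.
void checkV9aModel() {
  ArchModel V9a = v9aAfterPatch();
  assert(featuresFromAttribute(V9a).count("sve2") == 0);
  assert(featuresFromMarch(V9a).count("sve2") == 1);
}
```

In this model, a downstream user who writes only "+v9a" has to append "+sve2" by hand, while the -march= path keeps the previous behaviour.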

Contributor

@tmatheson-arm tmatheson-arm left a comment

LGTM. Might need a release note, to document the backend target-feature dependency change? But I had a quick look and couldn't see any examples of similar changes in the release notes.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jun 19, 2024
@@ -955,6 +955,11 @@ Arm and AArch64 Support
* Arm Neoverse-N3 (neoverse-n3).
* Arm Neoverse-V3 (neoverse-v3).
* Arm Neoverse-V3AE (neoverse-v3ae).
- SVE and SVE2 have been moved to the default extensions list for ARMv9.0,
Collaborator

I don't think this note to Clang is required, because the definition of v9a is an LLVM-internal concept.
From an end-user point of view, invoking Clang with -march=armv9-a should still enable SVE2.
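
One way an end user could check this from their own code is through the ACLE feature macros `__ARM_FEATURE_SVE` and `__ARM_FEATURE_SVE2`, which the compiler defines when the corresponding extension is enabled for the translation unit. A minimal C++ sketch follows; the function names builtWithSVE and builtWithSVE2 are illustrative, not part of any library.

```cpp
#include <cassert>

// True when this translation unit was compiled with SVE enabled.
constexpr bool builtWithSVE() {
#if defined(__ARM_FEATURE_SVE)
  return true;
#else
  return false;
#endif
}

// True when this translation unit was compiled with SVE2 enabled.
constexpr bool builtWithSVE2() {
#if defined(__ARM_FEATURE_SVE2)
  return true;
#else
  return false;
#endif
}

// SVE2 builds on SVE, so an SVE2 build is expected to report SVE as well.
static_assert(!builtWithSVE2() || builtWithSVE(),
              "SVE2 enabled without SVE");
```

If the behaviour described in this comment holds, compiling this for AArch64 with -march=armv9-a should make builtWithSVE2() return true; on other targets, or with SVE2 disabled, both functions return false.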

* SVE and SVE2 have been moved to the default extensions list for ARMv9.0,
making them optional per the Arm ARM. Existing v9.0+ CPUs in the backend that
support these extensions continue to have these features enabled by default
when specified via ``-mcpu=``. The attribute ``"target-features"="+v9a"`` no
Collaborator

nit: Did you mean -march= instead of -mcpu= ?

Contributor Author

I mean both. Adjusted to clarify this.

@jroelofs jroelofs merged commit 037a9a7 into llvm:main Jun 20, 2024
5 of 7 checks passed
@jroelofs jroelofs deleted the jroelofs/sve2-optional branch June 20, 2024 15:31
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
... so move it out of the `implied_features` list, and into the
`DefaultExts` list.
MattPD referenced this pull request in ashvardanian/SimSIMD Aug 16, 2024
This commit adds new capability levels for Arm, allowing
us to differentiate f16-, bf16-, and i8-supporting generations
of CPUs, which are becoming increasingly popular in the datacenter.

This breaks compilation of Rust and Python bindings
due to the "target specific options mismatch".
Labels
backend:AArch64 clang Clang issues not falling into any other category
5 participants