-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FMV][AArch64][clang] Emit fmv-features metadata in LLVM IR. #118544
Conversation
We need to be able to propagate information about FMV attribute strings from C/C++ source to LLVM IR. This is necessary so that we can distinguish which target-features are coming from the cmdline, which are coming from the target attribute, and which are coming from feature dependency expansion. We need this for static resolution of calls in LLVM. Here's a motivating example: Suppose you have target_version("i8mm+dotprod") and target_version("fcma"). The first version clearly has higher priority. Now suppose you specify -march=armv8-a+i8mm on the command line. Then the versions would have target-features "+i8mm,+dotprod" and "+i8mm,+fcma" respectively. If you are using those to deduce version priority, then you would incorrectly deduce that the second version was higher priority than the first.
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Alexandros Lamprineas (labrinea) ChangesWe need to be able to propagate information about FMV attribute strings from C/C++ source to LLVM IR. This is necessary so that we can distinguish which target-features are coming from the cmdline, which are coming from the target attribute, and which are coming from feature dependency expansion. We need this for static resolution of calls in LLVM. Here's a motivating example: Suppose you have target_version("i8mm+dotprod") and target_version("fcma"). The first version clearly has higher priority. Now suppose you specify -march=armv8-a+i8mm on the command line. Then the versions would have target-features "+i8mm,+dotprod" and "+i8mm,+fcma" respectively. If you are using those to deduce version priority, then you would incorrectly deduce that the second version was higher priority than the first. Patch is 45.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118544.diff 7 Files Affected:
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 80e8c5b9df58e7..93d24aabbe04b2 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -14294,7 +14294,7 @@ QualType ASTContext::getCorrespondingSignedFixedPointType(QualType Ty) const {
// corresponding backend features (which may contain duplicates).
static std::vector<std::string> getFMVBackendFeaturesFor(
const llvm::SmallVectorImpl<StringRef> &FMVFeatStrings) {
- std::vector<std::string> BackendFeats{{"+fmv"}};
+ std::vector<std::string> BackendFeats;
llvm::AArch64::ExtensionSet FeatureBits;
for (StringRef F : FMVFeatStrings)
if (auto FMVExt = llvm::AArch64::parseFMVExtension(F))
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index d3d5c0743a520b..e89fadabc57fee 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2745,7 +2745,21 @@ bool CodeGenModule::GetCPUAndFeaturesAttributes(GlobalDecl GD,
Attrs.addAttribute("target-features", llvm::join(Features, ","));
AddedAttr = true;
}
-
+ if (getTarget().getTriple().isAArch64()) {
+ llvm::SmallVector<StringRef, 8> Feats;
+ if (TV)
+ TV->getFeatures(Feats);
+ else if (TC)
+ TC->getFeatures(Feats, GD.getMultiVersionIndex());
+ if (!Feats.empty()) {
+ llvm::sort(Feats);
+ std::string FMVFeatures;
+ for (StringRef F : Feats)
+ FMVFeatures.append(",+" + F.str());
+ Attrs.addAttribute("fmv-features", FMVFeatures.substr(1));
+ AddedAttr = true;
+ }
+ }
return AddedAttr;
}
diff --git a/clang/test/CodeGen/AArch64/fmv-dependencies.c b/clang/test/CodeGen/AArch64/fmv-dependencies.c
index f74b7aa32c7dca..6537545135a335 100644
--- a/clang/test/CodeGen/AArch64/fmv-dependencies.c
+++ b/clang/test/CodeGen/AArch64/fmv-dependencies.c
@@ -42,7 +42,7 @@ __attribute__((target_version("flagm"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Mflagm2() #[[flagm2:[0-9]+]] {
__attribute__((target_version("flagm2"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Mfp() #[[default:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Mfp() #[[fp:[0-9]+]] {
__attribute__((target_version("fp"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Mfp16() #[[fp16:[0-9]+]] {
@@ -99,7 +99,7 @@ __attribute__((target_version("sha2"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msha3() #[[sha3:[0-9]+]] {
__attribute__((target_version("sha3"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Msimd() #[[default]] {
+// CHECK: define dso_local i32 @fmv._Msimd() #[[simd:[0-9]+]] {
__attribute__((target_version("simd"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msm4() #[[sm4:[0-9]+]] {
@@ -150,48 +150,49 @@ int caller() {
return fmv();
}
-// CHECK: attributes #[[aes]] = { {{.*}} "target-features"="+aes,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[bf16]] = { {{.*}} "target-features"="+bf16,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[bti]] = { {{.*}} "target-features"="+bti,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[crc]] = { {{.*}} "target-features"="+crc,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[dit]] = { {{.*}} "target-features"="+dit,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[dotprod]] = { {{.*}} "target-features"="+dotprod,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[dpb]] = { {{.*}} "target-features"="+ccpp,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[dpb2]] = { {{.*}} "target-features"="+ccdp,+ccpp,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[f32mm]] = { {{.*}} "target-features"="+f32mm,+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
-// CHECK: attributes #[[f64mm]] = { {{.*}} "target-features"="+f64mm,+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
-// CHECK: attributes #[[fcma]] = { {{.*}} "target-features"="+complxnum,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[flagm]] = { {{.*}} "target-features"="+flagm,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[flagm2]] = { {{.*}} "target-features"="+altnzcv,+flagm,+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[default]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[fp16]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[fp16fml]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fp16fml,+fullfp16,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[frintts]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fptoint,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[i8mm]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+i8mm,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[jscvt]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+jsconv,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[ls64]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+ls64,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[lse]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+lse,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[memtag]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+mte,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[mops]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+mops,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[predres]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+predres,+v8a"
-// CHECK: attributes #[[rcpc]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+rcpc,+v8a"
-// CHECK: attributes #[[rcpc2]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc-immo,+v8a"
-// CHECK: attributes #[[rcpc3]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc-immo,+rcpc3,+v8a"
-// CHECK: attributes #[[rdm]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+rdm,+v8a"
-// CHECK: attributes #[[rng]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+rand,+v8a"
-// CHECK: attributes #[[sb]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+sb,+v8a"
-// CHECK: attributes #[[sha2]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+sha2,+v8a"
-// CHECK: attributes #[[sha3]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+sha2,+sha3,+v8a"
-// CHECK: attributes #[[sm4]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+sm4,+v8a"
-// CHECK: attributes #[[sme]] = { {{.*}} "target-features"="+bf16,+fmv,+fp-armv8,+neon,+outline-atomics,+sme,+v8a"
-// CHECK: attributes #[[sme_f64f64]] = { {{.*}} "target-features"="+bf16,+fmv,+fp-armv8,+neon,+outline-atomics,+sme,+sme-f64f64,+v8a"
-// CHECK: attributes #[[sme_i16i64]] = { {{.*}} "target-features"="+bf16,+fmv,+fp-armv8,+neon,+outline-atomics,+sme,+sme-i16i64,+v8a"
-// CHECK: attributes #[[sme2]] = { {{.*}} "target-features"="+bf16,+fmv,+fp-armv8,+neon,+outline-atomics,+sme,+sme2,+v8a"
-// CHECK: attributes #[[ssbs]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+ssbs,+v8a"
-// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
-// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
-// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
-// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
-// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
-// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
-// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fmv,+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
+// CHECK: attributes #[[aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[bf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[bti]] = { {{.*}} "target-features"="+bti,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[crc]] = { {{.*}} "target-features"="+crc,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[dit]] = { {{.*}} "target-features"="+dit,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[dotprod]] = { {{.*}} "target-features"="+dotprod,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[dpb]] = { {{.*}} "target-features"="+ccpp,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[dpb2]] = { {{.*}} "target-features"="+ccdp,+ccpp,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[f32mm]] = { {{.*}} "target-features"="+f32mm,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
+// CHECK: attributes #[[f64mm]] = { {{.*}} "target-features"="+f64mm,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
+// CHECK: attributes #[[fcma]] = { {{.*}} "target-features"="+complxnum,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[flagm]] = { {{.*}} "target-features"="+flagm,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[flagm2]] = { {{.*}} "target-features"="+altnzcv,+flagm,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[fp]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[fp16]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[fp16fml]] = { {{.*}} "target-features"="+fp-armv8,+fp16fml,+fullfp16,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[frintts]] = { {{.*}} "target-features"="+fp-armv8,+fptoint,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[i8mm]] = { {{.*}} "target-features"="+fp-armv8,+i8mm,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[jscvt]] = { {{.*}} "target-features"="+fp-armv8,+jsconv,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[ls64]] = { {{.*}} "target-features"="+fp-armv8,+ls64,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[lse]] = { {{.*}} "target-features"="+fp-armv8,+lse,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[memtag]] = { {{.*}} "target-features"="+fp-armv8,+mte,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[mops]] = { {{.*}} "target-features"="+fp-armv8,+mops,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[predres]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+predres,+v8a"
+// CHECK: attributes #[[rcpc]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+v8a"
+// CHECK: attributes #[[rcpc2]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc-immo,+v8a"
+// CHECK: attributes #[[rcpc3]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc-immo,+rcpc3,+v8a"
+// CHECK: attributes #[[rdm]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rdm,+v8a"
+// CHECK: attributes #[[rng]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rand,+v8a"
+// CHECK: attributes #[[sb]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+sb,+v8a"
+// CHECK: attributes #[[sha2]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+sha2,+v8a"
+// CHECK: attributes #[[sha3]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+sha2,+sha3,+v8a"
+// CHECK: attributes #[[simd]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[sm4]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+sm4,+v8a"
+// CHECK: attributes #[[sme]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+sme,+v8a"
+// CHECK: attributes #[[sme_f64f64]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+sme,+sme-f64f64,+v8a"
+// CHECK: attributes #[[sme_i16i64]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+sme,+sme-i16i64,+v8a"
+// CHECK: attributes #[[sme2]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+sme,+sme2,+v8a"
+// CHECK: attributes #[[ssbs]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+ssbs,+v8a"
+// CHECK: attributes #[[sve]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
+// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
+// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve-aes,+sve2,+sve2-aes,+v8a"
+// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sha2,+sha3,+sve,+sve2,+sve2-sha3,+v8a"
+// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sm4,+sve,+sve2,+sve2-sm4,+v8a"
+// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/AArch64/fmv-features.c b/clang/test/CodeGen/AArch64/fmv-features.c
new file mode 100644
index 00000000000000..24082413d9e2c7
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/fmv-features.c
@@ -0,0 +1,202 @@
+// Test all of the AArch64 fmv-features metadata without any dependency expansion.
+// It is used to propagate the attribute string information from C/C++ source to LLVM IR.
+
+// RUN: %clang --target=aarch64-linux-gnu --rtlib=compiler-rt -emit-llvm -S -o - %s | FileCheck %s
+
+// CHECK: define dso_local i32 @fmv._Maes() #[[aes:[0-9]+]] {
+__attribute__((target_version("aes"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mbf16() #[[bf16:[0-9]+]] {
+__attribute__((target_version("bf16"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mbti() #[[bti:[0-9]+]] {
+__attribute__((target_version("bti"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mcrc() #[[crc:[0-9]+]] {
+__attribute__((target_version("crc"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mdit() #[[dit:[0-9]+]] {
+__attribute__((target_version("dit"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mdotprod() #[[dotprod:[0-9]+]] {
+__attribute__((target_version("dotprod"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mdpb() #[[dpb:[0-9]+]] {
+__attribute__((target_version("dpb"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mdpb2() #[[dpb2:[0-9]+]] {
+__attribute__((target_version("dpb2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mf32mm() #[[f32mm:[0-9]+]] {
+__attribute__((target_version("f32mm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mf64mm() #[[f64mm:[0-9]+]] {
+__attribute__((target_version("f64mm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mfcma() #[[fcma:[0-9]+]] {
+__attribute__((target_version("fcma"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mflagm() #[[flagm:[0-9]+]] {
+__attribute__((target_version("flagm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mflagm2() #[[flagm2:[0-9]+]] {
+__attribute__((target_version("flagm2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mfp() #[[fp:[0-9]+]] {
+__attribute__((target_version("fp"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mfp16() #[[fp16:[0-9]+]] {
+__attribute__((target_version("fp16"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mfp16fml() #[[fp16fml:[0-9]+]] {
+__attribute__((target_version("fp16fml"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mfrintts() #[[frintts:[0-9]+]] {
+__attribute__((target_version("frintts"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mi8mm() #[[i8mm:[0-9]+]] {
+__attribute__((target_version("i8mm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mjscvt() #[[jscvt:[0-9]+]] {
+__attribute__((target_version("jscvt"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mls64() #[[ls64:[0-9]+]] {
+__attribute__((target_version("ls64"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mlse() #[[lse:[0-9]+]] {
+__attribute__((target_version("lse"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mmemtag() #[[memtag:[0-9]+]] {
+__attribute__((target_version("memtag"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mmops() #[[mops:[0-9]+]] {
+__attribute__((target_version("mops"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mpredres() #[[predres:[0-9]+]] {
+__attribute__((target_version("predres"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mrcpc() #[[rcpc:[0-9]+]] {
+__attribute__((target_version("rcpc"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mrcpc2() #[[rcpc2:[0-9]+]] {
+__attribute__((target_version("rcpc2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mrcpc3() #[[rcpc3:[0-9]+]] {
+__attribute__((target_version("rcpc3"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mrdm() #[[rdm:[0-9]+]] {
+__attribute__((target_version("rdm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mrng() #[[rng:[0-9]+]] {
+__attribute__((target_version("rng"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msb() #[[sb:[0-9]+]] {
+__attribute__((target_version("sb"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msha2() #[[sha2:[0-9]+]] {
+__attribute__((target_version("sha2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msha3() #[[sha3:[0-9]+]] {
+__attribute__((target_version("sha3"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msimd() #[[simd:[0-9]+]] {
+__attribute__((target_version("simd"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msm4() #[[sm4:[0-9]+]] {
+__attribute__((target_version("sm4"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msme() #[[sme:[0-9]+]] {
+__attribute__((target_version("sme"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msme-f64f64() #[[sme_f64f64:[0-9]+]] {
+__attribute__((target_version("sme-f64f64"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msme-i16i64() #[[sme_i16i64:[0-9]+]] {
+__attribute__((target_version("sme-i16i64"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msme2() #[[sme2:[0-9]+]] {
+__attribute__((target_version("sme2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mssbs() #[[ssbs:[0-9]+]] {
+__attribute__((target_version("ssbs"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve() #[[sve:[0-9]+]] {
+__attribute__((target_version("sve"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve2() #[[sve2:[0-9]+]] {
+__attribute__((target_version("sve2"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve2-aes() #[[sve2_aes:[0-9]+]] {
+__attribute__((target_version("sve2-aes"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve2-bitperm() #[[sve2_bitperm:[0-9]+]] {
+__attribute__((target_version("sve2-bitperm"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve2-sha3() #[[sve2_sha3:[0-9]+]] {
+__attribute__((target_version("sve2-sha3"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Msve2-sm4() #[[sve2_sm4:[0-9]+]] {
+__attribute__((target_version("sve2-sm4"))) int fmv(void) { return 0; }
+
+// CHECK: define dso_local i32 @fmv._Mwfxt() #[[wfxt:[0-9]+]] {
+__attribute__((target_version("wfxt"))) int fmv(void) { return 0; }
+
+// CHECK: define ds...
[truncated]
|
According to the current rules yes, but IMO that rule is broken and doesn't match user expectation. A motivating example might be I think if we fix that rule, and instead sort priorities lexicographically after adding implicit features implied by command-line options, then this change becomes unnecessary. Mind starting a thread with the gcc folks to sort this out? |
@andrewcarlotti is planning a PR in ACLE to fix the priorities and the selection algorithm. I will tag you there when this happens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch itself LGTM though, despite my gripes on the current prioritization rule.
I agree about this. As I said we will adress the priorities and selection algorithm separately.
The lexicographic ordering discards the ACLE priorities. I don't think it's what we want here. I am not sure how I feel about mixing the features inherited from the cmdline option with the fmv features. If we do that we should be detecting them at runtime if we want the static resolution optimization to behave the same as the resolver. However, making the version selection not only depend on the source code but also the cmdline may not be the right thing to do. |
I should clarify that I mean: take the listed features for a given function, sort them according to their priorities highest to lowest as if each priority were a letter in a Consider a multi-versioned function with
Now with those sorted according to that lexicographic order, we have a total order for the set:
The selection process then filters out elements that cannot be chosen per runtime availability of features, and selects the remaining choice that is "greatest" according to that order.
The way I see it, setting command-line options sets the floor for minimum feature set: one cannot expect to run code on a machine that does not at least meet that minimum. Because of that assumption, every version in the multi-version set inherits those implicit features. Imagine a multi-versioned function with three implementations: Do you have a case in mind where this would behave differently than the runtime resolver? I suppose we should distinguish two cases: both with the current sorting rule, and with that lexicographic-ish rule I'm proposing. |
I chose my example (which @labrinea posted here) specifically so that it would still be valid after changing the priorities to use lexicographic ordering (for which I just created ARM-software/acle#370). Your alternative example doesn't exhibit the issue that this patch is intended to address. |
Ah, I see now. Sounds good to me. |
Thanks for looking into this Andrew. There's one thing I would like to clarify here. Jon expressed the opinion that in your example the command line should determine the highest priority version (i8mm+fcma), which isn't what ARM-software/acle#370 suggests. That said, this PR is still need after the amendment to ACLE. Personally I don't have a strong opinion which way to go. I am open to discuss.
The proposal ARM-software/acle#370 does not take the implicit features of the cmdline into account. |
Just a gentle reminder that I would like us to clarify the impact of the cmdline over the version selection algorithm if any. With the Christmas break approaching I fear #87939 may miss the LLVM20 branch date. |
We need to be able to propagate information about FMV attribute strings from C/C++ source to LLVM IR. This is necessary so that we can distinguish which target-features are coming from the cmdline, which are coming from the target attribute, and which are coming from feature dependency expansion. We need this for static resolution of calls in LLVM. Here's a motivating example:
Suppose you have target_version("i8mm+dotprod") and target_version("fcma"). The first version clearly has higher priority. Now suppose you specify -march=armv8-a+i8mm on the command line. Then the versions would have target-features "+i8mm,+dotprod" and "+i8mm,+fcma" respectively. If you are using those to deduce version priority, then you would incorrectly deduce that the second version was higher priority than the first.