Skip to content

Commit

Permalink
[FMV] Fix incorrect system register dependencies.
Browse files Browse the repository at this point in the history
Some features are later versions of others, like sve2 and sve,
therefore performing equality checks on system register values
would incur incorrect feature detection on later hardware. See
ARM-software#320 for example.
Therefore we should instead do >= comparisons when HWCAP info
is not available.

I am also fixing incorrect detection for LSE and WFxT. Lastly to
detect SVE2 and SME2 I am using the SVEVer and SMEVer bitfields
respectively.
  • Loading branch information
labrinea committed Jun 25, 2024
1 parent 9863b3e commit 1ece376
Showing 1 changed file with 53 additions and 49 deletions.
102 changes: 53 additions & 49 deletions main/acle.md
Original file line number Diff line number Diff line change
Expand Up @@ -394,6 +394,10 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
* Added [**Alpha**](#current-status-and-anticipated-changes)
support for SVE2.1 (FEAT_SVE2p1).

#### Changes for next release

* Fixed incorrect system register dependencies in Function Multi Versioning.

### References

This document refers to the following documents.
Expand Down Expand Up @@ -2593,66 +2597,66 @@ The following table lists the architectures feature mapping for AArch64
| **Priority** | **Architecture name** | **Name** | **Dependent feature registers** |
| ------------- | ------------------------ | ------------- | ----------------------------------------- |
| 0 | N/A | default | N/A |
| 10 | `FEAT_RNG` | rng | ```ID_AA64ISAR0_EL1.RNDR == 0b0001``` |
| 20 | `FEAT_FlagM` | flagm | ```ID_AA64ISAR0_EL1.TS == 0b0001 OR ``` <br> ```ID_AA64ISAR0_EL1.TS == 0b0010``` |
| 30 | `FEAT_FlagM2` | flagm2 | ```ID_AA64ISAR0_EL1.TS == 0b0010``` |
| 80 | `FEAT_LSE` | lse | ```ID_AA64ISAR0_EL1.Atomic == 0b0001``` |
| 10 | `FEAT_RNG` | rng | ```ID_AA64ISAR0_EL1.RNDR >= 0b0001``` |
| 20 | `FEAT_FlagM` | flagm | ```ID_AA64ISAR0_EL1.TS >= 0b0001``` |
| 30 | `FEAT_FlagM2` | flagm2 | ```ID_AA64ISAR0_EL1.TS >= 0b0010``` |
| 80 | `FEAT_LSE` | lse | ```ID_AA64ISAR0_EL1.Atomic >= 0b0010``` |
| 90 | Floating-point | fp | ```ID_AA64PFR0_EL1.FP != 0b1111``` |
| 100 | `FEAT_AdvSIMD` | simd | ```ID_AA64PFR0_EL1.AdvSIMD != 0b1111``` |
| 104 | `FEAT_DotProd` | dotprod | ```ID_AA64ISAR0_EL1.DP == 0b0001``` |
| 106 | `FEAT_SM3`, `FEAT_SM4` | sm4 | ```ID_AA64ISAR0_EL1.SM4 == 0b0001 AND ``` <br> ```ID_AA64ISAR0_EL1.SM3 == 0b0001``` |
| 108 | `FEAT_RDM` | rdm, rdma | ```ID_AA64ISAR0_EL1.RDM == 0b0001``` |
| 110 | `FEAT_CRC32` | crc | ```ID_AA64ISAR0_EL1.CRC32 == 0b0001``` |
| 120 | `FEAT_SHA1` | sha1 | ```ID_AA64ISAR0_EL1.SHA1 == 0b0001``` |
| 130 | `FEAT_SHA256` | sha2 | ```ID_AA64ISAR0_EL1.SHA2 == 0b0001``` |
| 140 | `FEAT_SHA512`,`FEAT_SHA3`| sha3 | ```ID_AA64ISAR0_EL1.SHA3 != 0b0000``` |
| 104 | `FEAT_DotProd` | dotprod | ```ID_AA64ISAR0_EL1.DP >= 0b0001``` |
| 106 | `FEAT_SM3`, `FEAT_SM4` | sm4 | ```ID_AA64ISAR0_EL1.SM4 >= 0b0001``` |
| 108 | `FEAT_RDM` | rdm, rdma | ```ID_AA64ISAR0_EL1.RDM >= 0b0001``` |
| 110 | `FEAT_CRC32` | crc | ```ID_AA64ISAR0_EL1.CRC32 >= 0b0001``` |
| 120 | `FEAT_SHA1` | sha1 | ```ID_AA64ISAR0_EL1.SHA1 >= 0b0001``` |
| 130 | `FEAT_SHA256` | sha2 | ```ID_AA64ISAR0_EL1.SHA2 >= 0b0001``` |
| 140 | `FEAT_SHA512`,`FEAT_SHA3`| sha3 | ```ID_AA64ISAR0_EL1.SHA3 >= 0b0001``` |
| 150 | `FEAT_AES` | aes | ```ID_AA64ISAR0_EL1.AES >= 0b0001``` |
| 160 | `FEAT_PMULL` | pmull | ```ID_AA64ISAR0_EL1.AES == 0b0010``` |
| 160 | `FEAT_PMULL` | pmull | ```ID_AA64ISAR0_EL1.AES >= 0b0010``` |
| 170 | `FEAT_FP16` | fp16 | ```ID_AA64PFR0_EL1.FP == 0b0001``` |
| 175 | `FEAT_FHM` | fp16fml | ```ID_AA64ISAR0_EL1.FHM == 0b0001``` |
| 180 | `FEAT_DIT` | dit | ```ID_AA64PFR0_EL1.DIT == 0b0001``` |
| 175 | `FEAT_FHM` | fp16fml | ```ID_AA64ISAR0_EL1.FHM >= 0b0001``` |
| 180 | `FEAT_DIT` | dit | ```ID_AA64PFR0_EL1.DIT >= 0b0001``` |
| 190 | `FEAT_DPB` | dpb | ```ID_AA64ISAR1_EL1.DPB >= 0b0001``` |
| 200 | `FEAT_DPB2` | dpb2 | ```ID_AA64ISAR1_EL1.DPB == 0b0010``` |
| 210 | `FEAT_JSCVT` | jscvt | ```ID_AA64ISAR1_EL1.JSCVT == 0b0001``` |
| 220 | `FEAT_FCMA` | fcma | ```ID_AA64ISAR1_EL1.FCMA == 0b0001``` |
| 230 | `FEAT_LRCPC` | rcpc | ```ID_AA64ISAR1_EL1.LRCPC != 0b0000``` |
| 240 | `FEAT_LRCPC2` | rcpc2 | ```ID_AA64ISAR1_EL1.LRCPC == 0b0010``` |
| 241 | `FEAT_LRCPC3` | rcpc3 | ```ID_AA64ISAR1_EL1.LRCPC == 0b0011``` |
| 250 | `FEAT_FRINTTS` | frintts | ```ID_AA64ISAR1_EL1.FRINTTS == 0b0001``` |
| 260 | `FEAT_DGH` | dgh | ```ID_AA64ISAR1_EL1.DGH == 0b0001``` |
| 270 | `FEAT_I8MM` | i8mm | ```ID_AA64ISAR1_EL1.I8MM == 0b0001``` |
| 280 | `FEAT_BF16` | bf16 | ```ID_AA64ISAR1_EL1.BF16 != 0b0000``` |
| 290 | `FEAT_EBF16` | ebf16 | ```ID_AA64ISAR1_EL1.BF16 == 0b0010``` |
| 300 | `FEAT_RPRES` | rpres | ```ID_AA64ISAR2_EL1.RPRES == 0b0001``` |
| 310 | `FEAT_SVE` | sve | ```ID_AA64PFR0_EL1.SVE != 0b0000 AND ``` <br> ```ID_AA64ZFR0_EL1.SVEver == 0b0000``` |
| 320 | `FEAT_BF16` | sve-bf16 | ```ID_AA64ZFR0_EL1.BF16 != 0b0000``` |
| 330 | `FEAT_EBF16` | sve-ebf16 | ```ID_AA64ZFR0_EL1.BF16 == 0b0010``` |
| 340 | `FEAT_I8MM` | sve-i8mm | ```ID_AA64ZFR0_EL1.I8MM == 0b00001``` |
| 350 | `FEAT_F32MM` | f32mm | ```ID_AA64ZFR0_EL1.F32MM == 0b00001``` |
| 360 | `FEAT_F64MM` | f64mm | ```ID_AA64ZFR0_EL1.F64MM == 0b00001``` |
| 370 | `FEAT_SVE2` | sve2 | ```ID_AA64PFR0_EL1.SVE != 0b0000 AND ``` <br> ```ID_AA64ZFR0_EL1.SVEver == 0b0001``` |
| 380 | `FEAT_SVE_AES` | sve2-aes | ```ID_AA64ZFR0_EL1.AES == 0b0001 OR ``` <br> ```ID_AA64ZFR0_EL1.AES == 0b0010``` |
| 390 | `FEAT_SVE_PMULL128` | sve2-pmull128 | ```ID_AA64ZFR0_EL1.AES == 0b0010``` |
| 400 | `FEAT_SVE_BitPerm` | sve2-bitperm | ```ID_AA64ZFR0_EL1.BitPerm == 0b0001``` |
| 410 | `FEAT_SVE_SHA3` | sve2-sha3 | ```ID_AA64ZFR0_EL1.SHA3 == 0b0001``` |
| 420 | `FEAT_SM3`,`FEAT_SVE_SM4`| sve2-sm4 | ```ID_AA64ZFR0_EL1.SM4 == 0b0001``` |
| 430 | `FEAT_SME` | sme | ```ID_AA64PFR1_EL1.SME == 0b0001``` |
| 200 | `FEAT_DPB2` | dpb2 | ```ID_AA64ISAR1_EL1.DPB >= 0b0010``` |
| 210 | `FEAT_JSCVT` | jscvt | ```ID_AA64ISAR1_EL1.JSCVT >= 0b0001``` |
| 220 | `FEAT_FCMA` | fcma | ```ID_AA64ISAR1_EL1.FCMA >= 0b0001``` |
| 230 | `FEAT_LRCPC` | rcpc | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0001``` |
| 240 | `FEAT_LRCPC2` | rcpc2 | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0010``` |
| 241 | `FEAT_LRCPC3` | rcpc3 | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0011``` |
| 250 | `FEAT_FRINTTS` | frintts | ```ID_AA64ISAR1_EL1.FRINTTS >= 0b0001``` |
| 260 | `FEAT_DGH` | dgh | ```ID_AA64ISAR1_EL1.DGH >= 0b0001``` |
| 270 | `FEAT_I8MM` | i8mm | ```ID_AA64ISAR1_EL1.I8MM >= 0b0001``` |
| 280 | `FEAT_BF16` | bf16 | ```ID_AA64ISAR1_EL1.BF16 >= 0b0001``` |
| 290 | `FEAT_EBF16` | ebf16 | ```ID_AA64ISAR1_EL1.BF16 >= 0b0010``` |
| 300 | `FEAT_RPRES` | rpres | ```ID_AA64ISAR2_EL1.RPRES >= 0b0001``` |
| 310 | `FEAT_SVE` | sve | ```ID_AA64PFR0_EL1.SVE >= 0b0001``` |
| 320 | `FEAT_BF16` | sve-bf16 | ```ID_AA64ZFR0_EL1.BF16 >= 0b0001``` |
| 330 | `FEAT_EBF16` | sve-ebf16 | ```ID_AA64ZFR0_EL1.BF16 >= 0b0010``` |
| 340 | `FEAT_I8MM` | sve-i8mm | ```ID_AA64ZFR0_EL1.I8MM >= 0b00001``` |
| 350 | `FEAT_F32MM` | f32mm | ```ID_AA64ZFR0_EL1.F32MM >= 0b00001``` |
| 360 | `FEAT_F64MM` | f64mm | ```ID_AA64ZFR0_EL1.F64MM >= 0b00001``` |
| 370 | `FEAT_SVE2` | sve2 | ```ID_AA64ZFR0_EL1.SVEver >= 0b0001``` |
| 380 | `FEAT_SVE_AES` | sve2-aes | ```ID_AA64ZFR0_EL1.AES >= 0b0001``` |
| 390 | `FEAT_SVE_PMULL128` | sve2-pmull128 | ```ID_AA64ZFR0_EL1.AES >= 0b0010``` |
| 400 | `FEAT_SVE_BitPerm` | sve2-bitperm | ```ID_AA64ZFR0_EL1.BitPerm >= 0b0001``` |
| 410 | `FEAT_SVE_SHA3` | sve2-sha3 | ```ID_AA64ZFR0_EL1.SHA3 >= 0b0001``` |
| 420 | `FEAT_SM3`,`FEAT_SVE_SM4`| sve2-sm4 | ```ID_AA64ZFR0_EL1.SM4 >= 0b0001``` |
| 430 | `FEAT_SME` | sme | ```ID_AA64PFR1_EL1.SME >= 0b0001``` |
| 440 | `FEAT_MTE` | memtag | ```ID_AA64PFR1_EL1.MTE >= 0b0001``` |
| 450 | `FEAT_MTE2` | memtag2 | ```ID_AA64PFR1_EL1.MTE >= 0b0010``` |
| 460 | `FEAT_MTE3` | memtag3 | ```ID_AA64PFR1_EL1.MTE >= 0b0011``` |
| 470 | `FEAT_SB` | sb | ```ID_AA64ISAR1_EL1.SB == 0b0001``` |
| 480 | `FEAT_SPECRES` | predres | ```ID_AA64ISAR1_EL1.SPECRES == 0b0001``` |
| 490 | `FEAT_SSBS` | ssbs | ```ID_AA64PFR1_EL1.SSBS == 0b0001``` |
| 500 | `FEAT_SSBS2` | ssbs2 | ```ID_AA64PFR1_EL1.SSBS == 0b0010``` |
| 510 | `FEAT_BTI` | bti | ```ID_AA64PFR1_EL1.BT == 0b0001``` |
| 470 | `FEAT_SB` | sb | ```ID_AA64ISAR1_EL1.SB >= 0b0001``` |
| 480 | `FEAT_SPECRES` | predres | ```ID_AA64ISAR1_EL1.SPECRES >= 0b0001``` |
| 490 | `FEAT_SSBS` | ssbs | ```ID_AA64PFR1_EL1.SSBS >= 0b0001``` |
| 500 | `FEAT_SSBS2` | ssbs2 | ```ID_AA64PFR1_EL1.SSBS >= 0b0010``` |
| 510 | `FEAT_BTI` | bti | ```ID_AA64PFR1_EL1.BT >= 0b0001``` |
| 520 | `FEAT_LS64` | ls64 | ```ID_AA64ISAR1_EL1.LS64 >= 0b0001``` |
| 530 | `FEAT_LS64_V` | ls64_v | ```ID_AA64ISAR1_EL1.LS64 >= 0b0010``` |
| 540 | `FEAT_LS64_ACCDATA` | ls64_accdata | ```ID_AA64ISAR1_EL1.LS64 >= 0b0011``` |
| 550 | `FEAT_WFxT` | wfxt | ```ID_AA64ISAR2_EL1.WFxT == 0b0001``` |
| 560 | `FEAT_SME_F64F64` | sme-f64f64 | ```ID_AA64SMFR0_EL1.F64F64 == 0b0001``` |
| 570 | `FEAT_SME_I16I64` | sme-i16i64 | ```ID_AA64SMFR0_EL1.I16I64 == 0b1111``` |
| 580 | `FEAT_SME2` | sme2 | ```ID_AA64PFR1_EL1.SME == 0b0010``` |
| 650 | `FEAT_MOPS` | mops | ```ID_AA64ISAR2_EL1.MOPS == 0b0001``` |
| 550 | `FEAT_WFxT` | wfxt | ```ID_AA64ISAR2_EL1.WFxT >= 0b0010``` |
| 560 | `FEAT_SME_F64F64` | sme-f64f64 | ```ID_AA64SMFR0_EL1.F64F64 == 0b1``` |
| 570 | `FEAT_SME_I16I64` | sme-i16i64 | ```ID_AA64SMFR0_EL1.I16I64 == 0b1111``` |
| 580 | `FEAT_SME2` | sme2 | ```ID_AA64PFR1_EL1.SMEver >= 0b0001``` |
| 650 | `FEAT_MOPS` | mops | ```ID_AA64ISAR2_EL1.MOPS >= 0b0001``` |

### Selection

Expand Down

0 comments on commit 1ece376

Please sign in to comment.