Skip to content

Commit

Permalink
[FMV] Fix incorrect system register dependencies.
Browse files Browse the repository at this point in the history
Some features are later versions of others, like sve2 and sve,
therefore performing equality checks on system register values
would incur incorrect feature detection on later hardware. See
ARM-software#320 for example.
Therefore we should instead do >= comparisons when HWCAP info
is not available.

I am also fixing incorrect detection for LSE and WFxT. Lastly to
detect SVE2 and SME2 I am using the SVEVer and SMEVer bitfields
respectively.
  • Loading branch information
labrinea committed Jun 25, 2024
1 parent 9863b3e commit aa859f9
Showing 1 changed file with 54 additions and 50 deletions.
104 changes: 54 additions & 50 deletions main/acle.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ version: 2024Q2
date-of-issue: 21 June 2024
# LaTeX specific variables
copyright-text: "Copyright: see section \\texorpdfstring{\\nameref{copyright}}{Copyright}."
draftversion: false
draftversion: true
# Jekyll specific variables
header_counter: true
toc: true
Expand Down Expand Up @@ -394,6 +394,10 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin
* Added [**Alpha**](#current-status-and-anticipated-changes)
support for SVE2.1 (FEAT_SVE2p1).

#### Changes for next release

* Fixed incorrect system register dependencies in Function Multi Versioning.

### References

This document refers to the following documents.
Expand Down Expand Up @@ -2593,66 +2597,66 @@ The following table lists the architectures feature mapping for AArch64
| **Priority** | **Architecture name** | **Name** | **Dependent feature registers** |
| ------------- | ------------------------ | ------------- | ----------------------------------------- |
| 0 | N/A | default | N/A |
| 10 | `FEAT_RNG` | rng | ```ID_AA64ISAR0_EL1.RNDR == 0b0001``` |
| 20 | `FEAT_FlagM` | flagm | ```ID_AA64ISAR0_EL1.TS == 0b0001 OR ``` <br> ```ID_AA64ISAR0_EL1.TS == 0b0010``` |
| 30 | `FEAT_FlagM2` | flagm2 | ```ID_AA64ISAR0_EL1.TS == 0b0010``` |
| 80 | `FEAT_LSE` | lse | ```ID_AA64ISAR0_EL1.Atomic == 0b0001``` |
| 10 | `FEAT_RNG` | rng | ```ID_AA64ISAR0_EL1.RNDR >= 0b0001``` |
| 20 | `FEAT_FlagM` | flagm | ```ID_AA64ISAR0_EL1.TS >= 0b0001``` |
| 30 | `FEAT_FlagM2` | flagm2 | ```ID_AA64ISAR0_EL1.TS >= 0b0010``` |
| 80 | `FEAT_LSE` | lse | ```ID_AA64ISAR0_EL1.Atomic >= 0b0010``` |
| 90 | Floating-point | fp | ```ID_AA64PFR0_EL1.FP != 0b1111``` |
| 100 | `FEAT_AdvSIMD` | simd | ```ID_AA64PFR0_EL1.AdvSIMD != 0b1111``` |
| 104 | `FEAT_DotProd` | dotprod | ```ID_AA64ISAR0_EL1.DP == 0b0001``` |
| 106 | `FEAT_SM3`, `FEAT_SM4` | sm4 | ```ID_AA64ISAR0_EL1.SM4 == 0b0001 AND ``` <br> ```ID_AA64ISAR0_EL1.SM3 == 0b0001``` |
| 108 | `FEAT_RDM` | rdm, rdma | ```ID_AA64ISAR0_EL1.RDM == 0b0001``` |
| 110 | `FEAT_CRC32` | crc | ```ID_AA64ISAR0_EL1.CRC32 == 0b0001``` |
| 120 | `FEAT_SHA1` | sha1 | ```ID_AA64ISAR0_EL1.SHA1 == 0b0001``` |
| 130 | `FEAT_SHA256` | sha2 | ```ID_AA64ISAR0_EL1.SHA2 == 0b0001``` |
| 140 | `FEAT_SHA512`,`FEAT_SHA3`| sha3 | ```ID_AA64ISAR0_EL1.SHA3 != 0b0000``` |
| 104 | `FEAT_DotProd` | dotprod | ```ID_AA64ISAR0_EL1.DP >= 0b0001``` |
| 106 | `FEAT_SM3`, `FEAT_SM4` | sm4 | ```ID_AA64ISAR0_EL1.SM4 >= 0b0001``` |
| 108 | `FEAT_RDM` | rdm, rdma | ```ID_AA64ISAR0_EL1.RDM >= 0b0001``` |
| 110 | `FEAT_CRC32` | crc | ```ID_AA64ISAR0_EL1.CRC32 >= 0b0001``` |
| 120 | `FEAT_SHA1` | sha1 | ```ID_AA64ISAR0_EL1.SHA1 >= 0b0001``` |
| 130 | `FEAT_SHA256` | sha2 | ```ID_AA64ISAR0_EL1.SHA2 >= 0b0001``` |
| 140 | `FEAT_SHA512`,`FEAT_SHA3`| sha3 | ```ID_AA64ISAR0_EL1.SHA3 >= 0b0001``` |
| 150 | `FEAT_AES` | aes | ```ID_AA64ISAR0_EL1.AES >= 0b0001``` |
| 160 | `FEAT_PMULL` | pmull | ```ID_AA64ISAR0_EL1.AES == 0b0010``` |
| 160 | `FEAT_PMULL` | pmull | ```ID_AA64ISAR0_EL1.AES >= 0b0010``` |
| 170 | `FEAT_FP16` | fp16 | ```ID_AA64PFR0_EL1.FP == 0b0001``` |
| 175 | `FEAT_FHM` | fp16fml | ```ID_AA64ISAR0_EL1.FHM == 0b0001``` |
| 180 | `FEAT_DIT` | dit | ```ID_AA64PFR0_EL1.DIT == 0b0001``` |
| 175 | `FEAT_FHM` | fp16fml | ```ID_AA64ISAR0_EL1.FHM >= 0b0001``` |
| 180 | `FEAT_DIT` | dit | ```ID_AA64PFR0_EL1.DIT >= 0b0001``` |
| 190 | `FEAT_DPB` | dpb | ```ID_AA64ISAR1_EL1.DPB >= 0b0001``` |
| 200 | `FEAT_DPB2` | dpb2 | ```ID_AA64ISAR1_EL1.DPB == 0b0010``` |
| 210 | `FEAT_JSCVT` | jscvt | ```ID_AA64ISAR1_EL1.JSCVT == 0b0001``` |
| 220 | `FEAT_FCMA` | fcma | ```ID_AA64ISAR1_EL1.FCMA == 0b0001``` |
| 230 | `FEAT_LRCPC` | rcpc | ```ID_AA64ISAR1_EL1.LRCPC != 0b0000``` |
| 240 | `FEAT_LRCPC2` | rcpc2 | ```ID_AA64ISAR1_EL1.LRCPC == 0b0010``` |
| 241 | `FEAT_LRCPC3` | rcpc3 | ```ID_AA64ISAR1_EL1.LRCPC == 0b0011``` |
| 250 | `FEAT_FRINTTS` | frintts | ```ID_AA64ISAR1_EL1.FRINTTS == 0b0001``` |
| 260 | `FEAT_DGH` | dgh | ```ID_AA64ISAR1_EL1.DGH == 0b0001``` |
| 270 | `FEAT_I8MM` | i8mm | ```ID_AA64ISAR1_EL1.I8MM == 0b0001``` |
| 280 | `FEAT_BF16` | bf16 | ```ID_AA64ISAR1_EL1.BF16 != 0b0000``` |
| 290 | `FEAT_EBF16` | ebf16 | ```ID_AA64ISAR1_EL1.BF16 == 0b0010``` |
| 300 | `FEAT_RPRES` | rpres | ```ID_AA64ISAR2_EL1.RPRES == 0b0001``` |
| 310 | `FEAT_SVE` | sve | ```ID_AA64PFR0_EL1.SVE != 0b0000 AND ``` <br> ```ID_AA64ZFR0_EL1.SVEver == 0b0000``` |
| 320 | `FEAT_BF16` | sve-bf16 | ```ID_AA64ZFR0_EL1.BF16 != 0b0000``` |
| 330 | `FEAT_EBF16` | sve-ebf16 | ```ID_AA64ZFR0_EL1.BF16 == 0b0010``` |
| 340 | `FEAT_I8MM` | sve-i8mm | ```ID_AA64ZFR0_EL1.I8MM == 0b00001``` |
| 350 | `FEAT_F32MM` | f32mm | ```ID_AA64ZFR0_EL1.F32MM == 0b00001``` |
| 360 | `FEAT_F64MM` | f64mm | ```ID_AA64ZFR0_EL1.F64MM == 0b00001``` |
| 370 | `FEAT_SVE2` | sve2 | ```ID_AA64PFR0_EL1.SVE != 0b0000 AND ``` <br> ```ID_AA64ZFR0_EL1.SVEver == 0b0001``` |
| 380 | `FEAT_SVE_AES` | sve2-aes | ```ID_AA64ZFR0_EL1.AES == 0b0001 OR ``` <br> ```ID_AA64ZFR0_EL1.AES == 0b0010``` |
| 390 | `FEAT_SVE_PMULL128` | sve2-pmull128 | ```ID_AA64ZFR0_EL1.AES == 0b0010``` |
| 400 | `FEAT_SVE_BitPerm` | sve2-bitperm | ```ID_AA64ZFR0_EL1.BitPerm == 0b0001``` |
| 410 | `FEAT_SVE_SHA3` | sve2-sha3 | ```ID_AA64ZFR0_EL1.SHA3 == 0b0001``` |
| 420 | `FEAT_SM3`,`FEAT_SVE_SM4`| sve2-sm4 | ```ID_AA64ZFR0_EL1.SM4 == 0b0001``` |
| 430 | `FEAT_SME` | sme | ```ID_AA64PFR1_EL1.SME == 0b0001``` |
| 200 | `FEAT_DPB2` | dpb2 | ```ID_AA64ISAR1_EL1.DPB >= 0b0010``` |
| 210 | `FEAT_JSCVT` | jscvt | ```ID_AA64ISAR1_EL1.JSCVT >= 0b0001``` |
| 220 | `FEAT_FCMA` | fcma | ```ID_AA64ISAR1_EL1.FCMA >= 0b0001``` |
| 230 | `FEAT_LRCPC` | rcpc | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0001``` |
| 240 | `FEAT_LRCPC2` | rcpc2 | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0010``` |
| 241 | `FEAT_LRCPC3` | rcpc3 | ```ID_AA64ISAR1_EL1.LRCPC >= 0b0011``` |
| 250 | `FEAT_FRINTTS` | frintts | ```ID_AA64ISAR1_EL1.FRINTTS >= 0b0001``` |
| 260 | `FEAT_DGH` | dgh | ```ID_AA64ISAR1_EL1.DGH >= 0b0001``` |
| 270 | `FEAT_I8MM` | i8mm | ```ID_AA64ISAR1_EL1.I8MM >= 0b0001``` |
| 280 | `FEAT_BF16` | bf16 | ```ID_AA64ISAR1_EL1.BF16 >= 0b0001``` |
| 290 | `FEAT_EBF16` | ebf16 | ```ID_AA64ISAR1_EL1.BF16 >= 0b0010``` |
| 300 | `FEAT_RPRES` | rpres | ```ID_AA64ISAR2_EL1.RPRES >= 0b0001``` |
| 310 | `FEAT_SVE` | sve | ```ID_AA64PFR0_EL1.SVE >= 0b0001``` |
| 320 | `FEAT_BF16` | sve-bf16 | ```ID_AA64ZFR0_EL1.BF16 >= 0b0001``` |
| 330 | `FEAT_EBF16` | sve-ebf16 | ```ID_AA64ZFR0_EL1.BF16 >= 0b0010``` |
| 340 | `FEAT_I8MM` | sve-i8mm | ```ID_AA64ZFR0_EL1.I8MM >= 0b00001``` |
| 350 | `FEAT_F32MM` | f32mm | ```ID_AA64ZFR0_EL1.F32MM >= 0b00001``` |
| 360 | `FEAT_F64MM` | f64mm | ```ID_AA64ZFR0_EL1.F64MM >= 0b00001``` |
| 370 | `FEAT_SVE2` | sve2 | ```ID_AA64ZFR0_EL1.SVEver >= 0b0001``` |
| 380 | `FEAT_SVE_AES` | sve2-aes | ```ID_AA64ZFR0_EL1.AES >= 0b0001``` |
| 390 | `FEAT_SVE_PMULL128` | sve2-pmull128 | ```ID_AA64ZFR0_EL1.AES >= 0b0010``` |
| 400 | `FEAT_SVE_BitPerm` | sve2-bitperm | ```ID_AA64ZFR0_EL1.BitPerm >= 0b0001``` |
| 410 | `FEAT_SVE_SHA3` | sve2-sha3 | ```ID_AA64ZFR0_EL1.SHA3 >= 0b0001``` |
| 420 | `FEAT_SM3`,`FEAT_SVE_SM4`| sve2-sm4 | ```ID_AA64ZFR0_EL1.SM4 >= 0b0001``` |
| 430 | `FEAT_SME` | sme | ```ID_AA64PFR1_EL1.SME >= 0b0001``` |
| 440 | `FEAT_MTE` | memtag | ```ID_AA64PFR1_EL1.MTE >= 0b0001``` |
| 450 | `FEAT_MTE2` | memtag2 | ```ID_AA64PFR1_EL1.MTE >= 0b0010``` |
| 460 | `FEAT_MTE3` | memtag3 | ```ID_AA64PFR1_EL1.MTE >= 0b0011``` |
| 470 | `FEAT_SB` | sb | ```ID_AA64ISAR1_EL1.SB == 0b0001``` |
| 480 | `FEAT_SPECRES` | predres | ```ID_AA64ISAR1_EL1.SPECRES == 0b0001``` |
| 490 | `FEAT_SSBS` | ssbs | ```ID_AA64PFR1_EL1.SSBS == 0b0001``` |
| 500 | `FEAT_SSBS2` | ssbs2 | ```ID_AA64PFR1_EL1.SSBS == 0b0010``` |
| 510 | `FEAT_BTI` | bti | ```ID_AA64PFR1_EL1.BT == 0b0001``` |
| 470 | `FEAT_SB` | sb | ```ID_AA64ISAR1_EL1.SB >= 0b0001``` |
| 480 | `FEAT_SPECRES` | predres | ```ID_AA64ISAR1_EL1.SPECRES >= 0b0001``` |
| 490 | `FEAT_SSBS` | ssbs | ```ID_AA64PFR1_EL1.SSBS >= 0b0001``` |
| 500 | `FEAT_SSBS2` | ssbs2 | ```ID_AA64PFR1_EL1.SSBS >= 0b0010``` |
| 510 | `FEAT_BTI` | bti | ```ID_AA64PFR1_EL1.BT >= 0b0001``` |
| 520 | `FEAT_LS64` | ls64 | ```ID_AA64ISAR1_EL1.LS64 >= 0b0001``` |
| 530 | `FEAT_LS64_V` | ls64_v | ```ID_AA64ISAR1_EL1.LS64 >= 0b0010``` |
| 540 | `FEAT_LS64_ACCDATA` | ls64_accdata | ```ID_AA64ISAR1_EL1.LS64 >= 0b0011``` |
| 550 | `FEAT_WFxT` | wfxt | ```ID_AA64ISAR2_EL1.WFxT == 0b0001``` |
| 560 | `FEAT_SME_F64F64` | sme-f64f64 | ```ID_AA64SMFR0_EL1.F64F64 == 0b0001``` |
| 570 | `FEAT_SME_I16I64` | sme-i16i64 | ```ID_AA64SMFR0_EL1.I16I64 == 0b1111``` |
| 580 | `FEAT_SME2` | sme2 | ```ID_AA64PFR1_EL1.SME == 0b0010``` |
| 650 | `FEAT_MOPS` | mops | ```ID_AA64ISAR2_EL1.MOPS == 0b0001``` |
| 550 | `FEAT_WFxT` | wfxt | ```ID_AA64ISAR2_EL1.WFxT >= 0b0010``` |
| 560 | `FEAT_SME_F64F64` | sme-f64f64 | ```ID_AA64SMFR0_EL1.F64F64 == 0b1``` |
| 570 | `FEAT_SME_I16I64` | sme-i16i64 | ```ID_AA64SMFR0_EL1.I16I64 == 0b1111``` |
| 580 | `FEAT_SME2` | sme2 | ```ID_AA64PFR1_EL1.SMEver >= 0b0001``` |
| 650 | `FEAT_MOPS` | mops | ```ID_AA64ISAR2_EL1.MOPS >= 0b0001``` |

### Selection

Expand Down

0 comments on commit aa859f9

Please sign in to comment.