aesni: Add target_feature annotations to allow intrinsic inlining #165
Otherwise, when built with the nocheck feature, the AES-NI intrinsics are not inlined, causing a severe performance drop.
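For readers unfamiliar with the attribute, here is a minimal sketch of the general pattern, not the code from this PR. The function names `aes_round` and `aes_round_checked` are hypothetical and exist only for illustration. The inner function is compiled with the `aes` feature enabled, so the intrinsic call inside it is eligible for inlining; the safe wrapper does a runtime check, which is an assumption of this sketch (the nocheck feature discussed here skips such checks).

```rust
/// Hypothetical helper (not part of the crate): one AES encryption round.
/// The attribute compiles this function with AES-NI enabled, so the
/// intrinsic below can be inlined into it.
#[cfg(target_arch = "x86_64")]
#[inline]
#[target_feature(enable = "aes")]
unsafe fn aes_round(block: [u8; 16], round_key: [u8; 16]) -> [u8; 16] {
    use core::arch::x86_64::*;
    unsafe {
        // Unaligned loads, since byte arrays carry no 16-byte alignment.
        let b = _mm_loadu_si128(block.as_ptr() as *const __m128i);
        let k = _mm_loadu_si128(round_key.as_ptr() as *const __m128i);
        let r = _mm_aesenc_si128(b, k);
        let mut out = [0u8; 16];
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
        out
    }
}

/// Hypothetical safe wrapper: returns `None` when AES-NI is unavailable
/// (including on non-x86_64 targets).
#[allow(unused_variables)]
pub fn aes_round_checked(block: [u8; 16], round_key: [u8; 16]) -> Option<[u8; 16]> {
    #[cfg(target_arch = "x86_64")]
    {
        if std::is_x86_feature_detected!("aes") {
            // Sound only because the CPU feature was just checked.
            return Some(unsafe { aes_round(block, round_key) });
        }
    }
    None
}
```

Calling `aes_round` without first confirming CPU support is undefined behavior on hardware that lacks AES-NI, which is why the sketch keeps it `unsafe` and private.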
This looks good to me
Decryption could be updated similarly. I didn't need it, but could update that as well if desired.
That'd be appreciated!
Will wait to hear from @newpavlov for a bit before merging.
Added decryption in 82970df; the improvement looks to be similar.
@jack-signal thanks! The clippy error looks unrelated. I can try to fix that up in a separate PR.
Strictly speaking, doing so is incorrect and even having the [...] Do you plan to use it in practice with a runtime detection on top, or do you want to simply omit enabling [...]
Yes in my code I have an enum
and then
But that's true for
Yes. Unfortunately it is not possible to mark safe trait methods as [...] I guess we can merge it as a temporary solution until #135 lands.
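The code the commenter refers to is not preserved in this thread, so the following is only a hypothetical sketch of what enum-based runtime dispatch on top of `#[target_feature]` functions can look like. All names (`Backend`, `Dispatch`, `reverse_block`) are assumptions made for illustration, and a simple SSSE3 byte shuffle stands in for the AES code so that a portable fallback stays short.

```rust
/// Hypothetical backend tag, chosen once at startup.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend {
    #[cfg(target_arch = "x86_64")]
    Ssse3,
    Portable,
}

/// Hypothetical dispatcher. The field is private so the SIMD backend
/// can only be selected through `detect`, after the CPU check.
#[derive(Clone, Copy, Debug)]
pub struct Dispatch(Backend);

impl Dispatch {
    /// Picks the SIMD backend only if the running CPU supports it.
    pub fn detect() -> Self {
        #[cfg(target_arch = "x86_64")]
        {
            if std::is_x86_feature_detected!("ssse3") {
                return Dispatch(Backend::Ssse3);
            }
        }
        Dispatch(Backend::Portable)
    }

    /// Always uses the scalar fallback.
    pub fn portable() -> Self {
        Dispatch(Backend::Portable)
    }

    /// Reverses the byte order of a 16-byte block.
    pub fn reverse_block(self, block: [u8; 16]) -> [u8; 16] {
        match self.0 {
            #[cfg(target_arch = "x86_64")]
            // Sound because `Ssse3` is only constructed after detection.
            Backend::Ssse3 => unsafe { reverse_ssse3(block) },
            _ => reverse_portable(block),
        }
    }
}

fn reverse_portable(block: [u8; 16]) -> [u8; 16] {
    let mut out = block;
    out.reverse();
    out
}

#[cfg(target_arch = "x86_64")]
#[inline]
#[target_feature(enable = "ssse3")]
unsafe fn reverse_ssse3(block: [u8; 16]) -> [u8; 16] {
    use core::arch::x86_64::*;
    unsafe {
        // Output byte i takes input byte 15 - i.
        let mask = _mm_setr_epi8(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0);
        let v = _mm_loadu_si128(block.as_ptr() as *const __m128i);
        let r = _mm_shuffle_epi8(v, mask);
        let mut out = [0u8; 16];
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
        out
    }
}
```

Keeping the backend tag behind a private field is one way to make the safe API sound: safe code cannot select the SIMD path on a CPU that lacks the feature.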
Also add similar attributes for the CTR code?
Information about #165 was accidentally omitted
When built with the nocheck feature, the AES-NI intrinsics are not inlined, causing a severe performance drop (rust-lang/rust#53069).
When compiled with --features=nocheck, the improvement is notable (all numbers in MB/s as reported by cargo bench on an i7-10510U, rustc 1.47).
When compiled with RUSTFLAGS="-C target-feature=+aes,+ssse3" there was no performance change (based on cargo bench output).
Decryption could be updated similarly. I didn't need it, but could update that as well if desired.