We've had a customer report that the bert-base-cased HuggingFace model's pooler module is especially sensitive to float8 quantization during training. After some debugging, the evidence suggests that supporting fused float8 gemm + bias will help in this case.
Code pointer:
ao/torchao/float8/float8_linear.py, line 401 at commit 24a78fe
torch._scaled_mm supports bias, so we just need to rewire the Float8Linear code.