Hello
Thank you for providing pretrained weights.
The `qk_scale` is defined as `embed_dim ** -0.5` in `models/transformer_block.py`. However, as far as I know, the attention scale should be `(embed_dim // num_heads) ** -0.5`, i.e. based on the per-head dimension.
Please check whether I'm right, or whether the current value is intentional.
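To illustrate the difference, here is a minimal sketch comparing the two scale values. The function names and the example sizes (`embed_dim = 768`, `num_heads = 12`) are my own assumptions for illustration and are not taken from this repository.

```python
def full_dim_scale(embed_dim: int) -> float:
    """Scale computed from the full embedding dimension (as described in this issue)."""
    return embed_dim ** -0.5


def per_head_scale(embed_dim: int, num_heads: int) -> float:
    """Scale computed from the per-head dimension, as in standard multi-head attention."""
    head_dim = embed_dim // num_heads
    return head_dim ** -0.5


# Hypothetical sizes, chosen only for illustration.
embed_dim, num_heads = 768, 12

full = full_dim_scale(embed_dim)             # 768 ** -0.5
per_head = per_head_scale(embed_dim, num_heads)  # head_dim = 64, so 64 ** -0.5

# When embed_dim is divisible by num_heads, the two differ by a factor of num_heads ** -0.5.
ratio = full / per_head
print(full, per_head, ratio)
```

With these sizes the full-dimension scale is smaller than the per-head scale by a factor of `num_heads ** -0.5`, so the attention logits are scaled down more than in the usual formulation. If the pretrained weights were trained with the current value, changing it afterwards would presumably alter the model's behavior.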