Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non Cosine-Sim Similarity functions for Sentence Transformer models are broken in v2.0.0 #1731

Closed
orionw opened this issue Jan 8, 2025 · 7 comments · Fixed by #1749
Closed

Comments

@orionw
Copy link
Contributor

orionw commented Jan 8, 2025

I tested using model facebook/contriever-msmarco (which should use dot) and dataset InstructIR (but TBH any dataset works).

If you stick a breakpoint in the sim function after this line you can check, it is always cos_sim. This might be because the SentenceTransformer models are only defined via ModelMeta and the default sim function is not stored in ST (AFAIK).

@Samoed
Copy link
Collaborator

Samoed commented Jan 8, 2025

Default similarity function is stored in sentence_transformer_config.json. Example https://huggingface.co/infgrad/jasper_en_vision_language_v1/blob/c6991f5a54805d15ea1af42f9db2eee4cca96edc/config_sentence_transformers.json#L12

@orionw
Copy link
Contributor Author

orionw commented Jan 8, 2025

Ah, awesome @Samoed! But this is probably a newer update for ST then. I don't see that in contriever-msmarco's files. So then I guess this issue is for older models where that isn't defined and is only defined in the model_meta.

@Samoed
Copy link
Collaborator

Samoed commented Jan 8, 2025

We need to add the model meta for this model with a wrapper that uses the dot function. From what I found when adding a similar function, all models used cosine similarity, but I might be wrong.

@orionw
Copy link
Contributor Author

orionw commented Jan 8, 2025

The model meta says dot - I don't think I know the wrapper you are talking about though.

It does seem to be the only dot model in the file though!

@Samoed
Copy link
Collaborator

Samoed commented Jan 8, 2025

I'll add wrapper for this

@sam-hey
Copy link
Contributor

sam-hey commented Jan 12, 2025

@orionw ModelMeta.similarity_fn_name is currently ignored and does not affect which similarity function is used. However, #1749 has temporarily addressed this issue.

It seems that #1759 might provide a more permanent fix for this behavior.

@Samoed
Copy link
Collaborator

Samoed commented Jan 12, 2025

I think this can be closed for now, because issue was fixed for now

@Samoed Samoed closed this as completed Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants