[Inference] compatibility with third-party Inference providers #1077

Merged: 25 commits merged into main from inference-providers on Jan 9, 2025

Conversation

julien-c (Member) commented on Dec 16, 2024:

TL;DR

Allow users to request 3rd-party inference providers (Sambanova, Replicate, Together, Fal) with @huggingface/inference for a curated set of models on the HF Hub.

For now, requesting a 3rd-party inference provider requires users to pass an API key from that provider as a parameter to the inference function.
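For illustration, a minimal sketch of what such a call could look like; the provider option and the way the key is passed are assumptions inferred from this PR's description, not necessarily the merged API:

import { HfInference } from "@huggingface/inference";

// Sketch: pass the 3rd-party provider's own API key as the access token
// and select the provider on the call (assumed option names).
const client = new HfInference("YOUR_SAMBANOVA_API_KEY");

const res = await client.chatCompletion({
	model: "meta-llama/Llama-3.1-8B-Instruct", // HF Hub model ID from the curated set
	provider: "sambanova", // one of the supported 3rd-party providers
	messages: [{ role: "user", content: "Hello!" }],
});
console.log(res.choices[0].message.content);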

@HuggingFaceDocBuilderDev (bot) commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

julien-c (Member, Author) commented:

cc @xenova fyi

coyotte508 (Member) commented:

LGTM. When I have time I'll look at the tests.

julien-c (Member, Author) commented:

@coyotte508 I think I can use VCR tapes, looks like a cool feature.

coyotte508 (Member) commented:

cc @Aschen ⬆️ :)

julien-c (Member, Author) commented on lines +120 to +121:

/// TODO we will proxy the request server-side (using our own keys) and handle billing for it on the user's HF account.
throw new Error("Inference proxying is not implemented yet");
@SBrandeis self-assigned this on Jan 6, 2025
@SBrandeis force-pushed the inference-providers branch from bf47b3b to 238567a on January 7, 2025 at 14:25
@SBrandeis force-pushed the inference-providers branch from daec1eb to 3939438 on January 8, 2025 at 15:42
@SBrandeis self-requested a review on January 9, 2025 at 11:33
coyotte508 (Member) left a review:

Nice!

coyotte508 (Member) commented on lines +71 to +81 (Jan 9, 2025):

switch (provider) {
	case "replicate":
		return REPLICATE_MODEL_IDS[model];
	case "sambanova":
		return SAMBANOVA_MODEL_IDS[model];
	case "together":
		return TOGETHER_MODEL_IDS[model]?.id;
	case "fal-ai":
		return FAL_AI_MODEL_IDS[model];
	default:
		return model;
}
I think we should maybe at least return REPLICATE_MODEL_IDS[model] ?? model; to fall back to the provided ID in case the user directly provided a Replicate model ID and not an HF model ID. Same with the others.

Note that we could maybe maintain a mapping in the backend and, in case of errors, try to load it only once (like we do for default models associated to tasks). Just a thought for the future, but it would enable new mappings without updating the lib.
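For illustration, the suggested fallback would look like this (a sketch of the proposal above, not the merged code):

switch (provider) {
	case "replicate":
		// fall back to the caller-supplied ID when no mapping exists
		return REPLICATE_MODEL_IDS[model] ?? model;
	case "sambanova":
		return SAMBANOVA_MODEL_IDS[model] ?? model;
	case "together":
		return TOGETHER_MODEL_IDS[model]?.id ?? model;
	case "fal-ai":
		return FAL_AI_MODEL_IDS[model] ?? model;
	default:
		return model;
}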

A contributor replied:

> I think we should maybe at least return REPLICATE_MODEL_IDS[model] ?? model; to fall back to the provided ID in case the user directly provided a Replicate model ID and not an HF model ID. Same with the others.

cc @julien-c - we discussed it and decided to stick to HF model IDs for now

> Note that we could maybe maintain a mapping in the backend and, in case of errors, try to load it only once (like we do for default models associated to tasks). Just a thought for the future, but it would enable new mappings without updating the lib.

Yes, we definitely want a way for 3rd-party providers to expose the mapping HF model ID -> provider ID that does not require hardcoding / updating the huggingface.js lib.
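A rough sketch of what such a dynamically fetched mapping could look like, loaded at most once per provider; the endpoint URL and response shape below are hypothetical, invented for illustration:

// Hypothetical endpoint returning { [hfModelId]: providerModelId }.
const mappingCache = new Map<string, Record<string, string>>();

async function providerModelId(provider: string, hfModelId: string): Promise<string> {
	let mapping = mappingCache.get(provider);
	if (!mapping) {
		// Hypothetical route; a real one would be defined by the Hub and the providers.
		const res = await fetch(`https://huggingface.co/api/partners/${provider}/model-mapping`);
		mapping = res.ok ? ((await res.json()) as Record<string, string>) : {};
		mappingCache.set(provider, mapping); // cache so we only load once
	}
	return mapping[hfModelId] ?? hfModelId; // fall back to the HF ID itself
}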

julien-c (Member, Author) replied:

> we discussed it and decided to stick to HF model IDs for now

Yes, simpler to always be "Hub-centric".

julien-c (Member, Author) added:

Ah yes, and I now remember: the ?? model in my mind was to work out of the box for models that have the same ID on the inference provider as the HF ID, NOT to work if you pass the provider's (different) ID.

julien-c (Member, Author) commented on Jan 9, 2025:

I think this is ready to merge 🎉 🌮

@SBrandeis merged commit 86b1f2e into main on Jan 9, 2025 (6 checks passed)
@SBrandeis deleted the inference-providers branch on January 9, 2025 at 15:40
Aschen (Contributor) commented on Jan 10, 2025:

Hey @julien-c @coyotte508, glad to see that it's still here and that it's useful ;-)

Keep up the great work here, I like it 😄
