♻️ [Tasks] JSON spec: text-generation #468
Conversation
 * Best of
 */
doSample?: boolean;
bestOf?: number;
@Narsil @OlivierDehaene I'm not sure what this parameter represents
do_sample: bool -> Whether generation should be greedy or not (it cannot be set to True if any other sampling param is set, but sometimes you do not want to specify temp, top_p, or anything else, so setting do_sample=True is for exactly that case).
best_of: int -> Run n sampling queries and return the best one (in terms of total logprob). Clashes with streaming.
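For illustration only, here is a minimal sketch of how these two fields could be documented in a `TextGenerationParameters`-style interface; the interface name and doc comments are assumptions based on the explanation above, not the final spec:

```ts
/**
 * Illustrative sketch only (not the final spec): one way `do_sample` / `best_of`
 * could be documented in a TextGenerationParameters-style interface.
 */
interface SamplingParametersSketch {
  /**
   * Whether to sample instead of using greedy decoding. Useful when no other
   * sampling parameter (temperature, top_p, ...) is specified.
   */
  doSample?: boolean;
  /**
   * Number of sampled sequences to run; the one with the highest total
   * log-probability is returned. Clashes with streaming.
   */
  bestOf?: number;
}
```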
Given https://github.com/huggingface/moon-landing/pull/8723, do we really need the conversational spec?
@osanseviero I think speccing the conversational API can still be useful for the widget and inference clients, wdyt?
Sorry for being late to the show here. I'm currently reviewing this PR :) (don't merge it too quickly 🙏)
Tried to list all the differences I've spotted between these specs, the TGI documentation, and the Python client implementation. I think we should stick to the existing TGI parameters as they are already in use + "newer" than the transformers-based parameters.
@@ -3,7 +3,6 @@
 *
(commenting on `inference.ts` but more related to `./spec`)
The TGI server has OpenAPI documentation describing its inputs and outputs (see the Swagger UI and openapi.json). I've spotted a few differences compared to the specs here. Some of these properties are currently specific to TGI and not available in the "transformers-based inference" endpoints.
In `TextGenerationParameters`:
- `max_new_tokens` (integer) => Maximum number of generated tokens.
- `repetition_penalty` (number) => The parameter for repetition penalty. A value of 1.0 means no penalty. See this paper for more details.
- `seed` (integer) => Random sampling seed.
- `stop` (array of string) => Stop generating tokens if a member of `stop` is generated. (currently named `stop_sequences` in these specs / `stop` in TGI => same purpose => to harmonize?)
- `top_n_tokens` (integer) => ??? (no idea what this is. We don't have it in the Python client either)
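To make the gap concrete, here is a rough sketch of what these missing fields could look like on the spec side. Field names and types are assumptions derived from the list above (and TGI's OpenAPI schema), not the final spec:

```ts
/**
 * Sketch of the TGI parameters listed above that are missing from the spec.
 * Names and types are assumptions, not the final spec.
 */
interface TgiExtraParametersSketch {
  /** Maximum number of generated tokens. */
  maxNewTokens?: number;
  /** Repetition penalty; a value of 1.0 means no penalty. */
  repetitionPenalty?: number;
  /** Random sampling seed. */
  seed?: number;
  /** Stop generating when one of these sequences is produced (TGI `stop` / spec `stop_sequences`). */
  stopSequences?: string[];
  /** Present in TGI's schema; its exact purpose is questioned in this thread. */
  topNTokens?: number;
}
```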
In the Python client, along with the `inputs` (string) and `parameters` (TextGenerationParameters) values in the request payload, we also have a `stream` (bool) value. If True, the output type is different (same as `/generate_stream` instead of `/generate` in TGI). I don't see it documented here, so I wonder if these docs might be a bit outdated (ping @Narsil, you might know more?). In any case, this option is currently working and in use, so it is worth keeping.
In `TextGenerationOutputDetails`:
- `best_of_sequences` (array of BestOfSequence) => Additional sequences when using the `best_of` parameter.

`BestOfSequence` (currently missing):
- `generated_text` (string) => The generated text.
- `finish_reason` (FinishReason) => The reason for the generation to finish, represented by a FinishReason value.
- `generated_tokens` (integer) => The number of generated tokens in the sequence.
- `seed` (integer) => The sampling seed if sampling was activated.
- `prefill` (array of InputToken) => The decoder input tokens. Empty if `decoder_input_details` is False.
- `tokens` (array of Token) => The generated tokens.
The `BestOfSequence` object is actually very similar to `TextGenerationOutputDetails`, but one level of nesting below.
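A rough sketch of the missing `BestOfSequence` shape, mirroring the field list above; the camelCase names and the inline token shapes are assumptions, not the final spec:

```ts
/**
 * Sketch of the missing BestOfSequence object, mirroring the field list above.
 * Names, nesting, and the inline token shapes are assumptions, not the final spec.
 */
interface BestOfSequenceSketch {
  /** The generated text. */
  generatedText: string;
  /** Why generation finished (a FinishReason value). */
  finishReason: string;
  /** Number of generated tokens in the sequence. */
  generatedTokens: number;
  /** The sampling seed, if sampling was activated. */
  seed?: number;
  /** Decoder input tokens; empty if `decoder_input_details` is false. */
  prefill: Array<{ id: number; text: string; logprob?: number }>;
  /** The generated tokens. */
  tokens: Array<{ id: number; text: string; logprob: number; special: boolean }>;
}
```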
I deliberately left out the `stream` parameter because it is API/transport-specific (and not inference-specific). I'm happy to revisit that if there's a consensus, though.
Thanks for making the changes @SBrandeis! Re-reviewed it, and it now looks good to me to use in the Python client. Fair enough about not adding the `stream` parameter (and let's revisit if we realize we really need it at some point).
TL;DR:
- Update `text-generation` spec to match the TGI API
- Add `conversational` spec, heavily inspired by the TGI messages API (cc @Wauplin @osanseviero @Narsil)
- Relevant related work: Update conversational widget to use text-generation (+ remove `conversational` task) #457 & https://github.com/huggingface/moon-landing/pull/8723