Tasks: Add image-text-to-text pipeline and inference API to task page #1039
Conversation
Very cool! 🔥
```python
{
    "role": "assistant",
    "content": [
        {"type": "text", "text": "There's a pink flower"},
    ],
},
```
It's a bit strange to me that the input ends with an assistant turn. I see in the example later that the model completes the sentence with more details, but I'm not sure this is compatible with all chat VLMs. Can we maybe skip the `assistant` role from the input and see if the model provides a good description of the image?
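Something like this, for example (just a sketch; the image URL is a placeholder and the exact content keys may vary by processor):

```python
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/flower.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
    # No trailing assistant turn: the model writes the description itself.
]
```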
This has not been addressed; I think it's unusual for users to supply an assistant turn with the input.
Sorry, I thought I had answered this. Basically, it's there to give more control and further align the output during inference. I used the same example here, where you can see the output: https://huggingface.co/docs/transformers/en/tasks/image_text_to_text
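For context, this is roughly the prefill pattern I mean, assuming a recent transformers version that has the image-text-to-text pipeline (the model name and image URL below are placeholders, and the exact content keys can vary between processors):

```python
from transformers import pipeline

# Placeholder model; any chat VLM supported by the pipeline should behave similarly.
pipe = pipeline("image-text-to-text", model="llava-hf/llava-interleave-qwen-0.5b-hf")

messages = [
    {
        "role": "user",
        "content": [
            # Placeholder image URL; the key ("url" vs "image") depends on the processor.
            {"type": "image", "url": "https://example.com/flower.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
    {
        # Prefilled assistant turn: the model continues this sentence,
        # which steers the style and content of the generated description.
        "role": "assistant",
        "content": [
            {"type": "text", "text": "There's a pink flower"},
        ],
    },
]

outputs = pipe(text=messages, max_new_tokens=30)
print(outputs[0])  # output structure may vary slightly between versions
```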
But that example ends with a `user` role, while this one ends with an `assistant` role. I don't think models are expected to be queried with an `assistant` role in the last turn: they receive a conversation that always ends with a `user` role, and then they respond with an `assistant` message.
Sorry, I think I should've linked the specific section. Here you go: https://huggingface.co/docs/transformers/en/tasks/image_text_to_text#pipeline is the one I meant.
Still looks weird/confusing to me, but OK if you feel strongly about it.
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
ah, need to lint
Looking good, let's try to get this merged soon 🔥
```python
{
    "role": "assistant",
    "content": [
        {"type": "text", "text": "There's a pink flower"},
    ],
},
```
This has not been addressed; I think it's unusual for users to supply an assistant turn with the input.
@pcuenca I changed it since it looked counterintuitive as an example; merging. Thanks for the review!
..and remove the long inference