Generating the caption of a given image #3

claudiogreco · 2022-05-31T17:21:38Z

Hello,

Thank you for having implemented this model. Have you already implemented some code to generate the caption of a given image? If not, do you have an idea about how you would do it in this particular architecture?

Thank you in advance.

mk-runner · 2023-11-06T02:52:23Z

logits = coca(
    text = text,
    images = images
) # (4, 512, 20000)

I have the same question. The code above does return the caption logits, but it needs text as input. In the inference phase no text_tokens are available; only image_tokens can be used.

Thank you in advance.
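
One common way around the missing text_tokens at inference is greedy autoregressive decoding: start from a begin-of-sequence token alone, ask the model for the next-token scores, append the best one, and repeat until an end-of-sequence token appears. The sketch below is not taken from this repository. `greedy_decode`, `next_token_logits`, `bos_id`, `eos_id` and `toy_model` are hypothetical names, and the model is replaced by a toy function so the loop runs on its own. With the model from this thread, `next_token_logits` would presumably wrap a call like `coca(text = prefix, images = images)` and return the logits at the last position, assuming the forward pass accepts a partial prefix and uses causal masking.

```python
from typing import Callable, List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def greedy_decode(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    bos_id: int,
    eos_id: int,
    max_len: int = 32,
) -> List[int]:
    """Grow a caption one token at a time, starting from BOS only."""
    tokens = [bos_id]
    for _ in range(max_len):
        # Hypothetical hook: scores over the vocabulary for the next token.
        logits = next_token_logits(tokens)
        next_id = argmax(logits)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_model(prefix: List[int]) -> List[float]:
    """Stand-in for a real model: always favours 'last id + 1'."""
    vocab_size = 6
    target = min(prefix[-1] + 1, vocab_size - 1)
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


caption_ids = greedy_decode(toy_model, bos_id=0, eos_id=4)
print(caption_ids)
```

Greedy decoding is the simplest choice; sampling or beam search would replace only the `argmax` step. The special token ids depend on the tokenizer used during training.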

SeaN0X · 2024-04-18T02:20:52Z

Same problem here. The logits come back as a huge tensor, but I haven't figured out how to convert it to text.
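
For the logits-to-text step, the usual approach is to take the argmax over the vocabulary dimension at each position and map the resulting ids back to tokens. The sketch below works on one sample, i.e. a (sequence length, vocabulary size) slice of a tensor like the (4, 512, 20000) one above, using plain lists so it runs without any model. `logits_to_ids`, `ids_to_text`, `id_to_token` and the toy vocabulary are hypothetical; in practice the mapping would come from whatever tokenizer produced the text ids.

```python
from typing import Dict, List, Sequence


def logits_to_ids(logits: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring vocabulary id at every sequence position."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in logits]


def ids_to_text(ids: List[int], id_to_token: Dict[int, str], eos_id: int) -> str:
    """Map ids to tokens, stopping at the first end-of-sequence id."""
    words = []
    for token_id in ids:
        if token_id == eos_id:
            break
        words.append(id_to_token.get(token_id, "<unk>"))
    return " ".join(words)


# Toy vocabulary and logits, assumptions for illustration only.
id_to_token = {0: "<eos>", 1: "a", 2: "dog", 3: "runs"}
logits = [
    [0.1, 2.0, 0.3, 0.0],  # position 0 favours id 1
    [0.0, 0.1, 1.5, 0.2],  # position 1 favours id 2
    [0.2, 0.0, 0.1, 3.0],  # position 2 favours id 3
    [4.0, 0.0, 0.0, 0.0],  # position 3 favours the EOS id
]

ids = logits_to_ids(logits)
text = ids_to_text(ids, id_to_token, eos_id=0)
print(ids, text)
```

Note that logits computed from a full ground-truth caption are next-token predictions conditioned on that caption, so decoding them this way does not give a caption for an unseen image; that still requires generating token by token.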

elmekkiMalek · 2024-12-17T12:51:14Z

Hello, have you figured out how to do that?
