Auto-Generated Captions API #191

Open
yashrajbharti opened this issue Dec 1, 2024 · 3 comments

yashrajbharti commented Dec 1, 2024

Introduction

The challenge we aim to address is the significant accessibility gap in web video content. Research shows that only 0.5% of web videos currently include captions, leaving a large portion of online video inaccessible to individuals who rely on them. To solve this, we propose adding an autogenerate attribute to the <track> element, enabling browsers to automatically generate captions for videos. This will improve accessibility and encourage widespread adoption of captions across the web, particularly for content creators who may lack the resources to manually create captions.
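
For illustration, the markup could look something like this (the file name and label are placeholders, and the boolean form of the attribute is illustrative; the exact semantics are specified in the explainer):

```html
<!-- Illustrative sketch: the autogenerate attribute (shown here as a boolean)
     asks the browser to generate captions for this track instead of loading
     them from a caption file. -->
<video controls src="lecture.mp4">
  <track kind="captions" srclang="en" label="English (auto-generated)" autogenerate>
</video>
```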

Read the complete Explainer

Feedback (Choose One)

Please provide all feedback below.

I welcome feedback in this thread, but encourage you to file bugs against the Explainer.


Crissov commented Dec 2, 2024

  • Why should this be an opt-in feature?
  • Why does the proposed attribute name contain a hyphen, when autocapitalize, autocomplete and autoplay don’t?
  • Should this be designed to also apply to still images or to tables (replacing summary)?


yashrajbharti commented Dec 2, 2024

> Why should this be an opt-in feature?

Requiring an opt-in for auto-generated captions ensures that developers retain control over the user experience and avoid introducing unintended side effects. For example:

  • Auto-generated captions may not meet quality expectations in all languages or for all types of content.
  • Certain video creators may prefer manual captions to maintain higher accuracy or context-specific relevance.
  • An opt-in approach allows websites to assess the impact of auto-generated captions before widespread adoption, addressing accessibility improvements incrementally.
  • It ensures backward compatibility with existing implementations where developers or platforms rely on manual captioning or have specific requirements for <track> elements without auto-generation (see the markup sketch at the end of this comment).

> Why does the proposed attribute name contain a hyphen, when autocapitalize, autocomplete and autoplay don’t?

Alternatives such as autogenerate or autocaption could be considered during discussion; I have changed the attribute name to autogenerate.

> Should this be designed to also apply to still images or to tables (replacing summary)?

The scope of this proposal is limited to video content and captions, as it addresses a significant accessibility gap (only 0.5% of web videos include captions). Expanding the functionality to still images or tables would dilute the focus of this API, but such use cases could be explored in separate proposals tailored to their unique requirements.
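
To make the opt-in and backward-compatibility points concrete, here is an illustrative sketch (file names are placeholders; the boolean form of autogenerate is assumed):

```html
<!-- Existing markup keeps working unchanged: without autogenerate,
     only the manually authored captions are used. -->
<video controls src="talk.mp4">
  <track kind="captions" srclang="en" src="talk-en.vtt" label="English" default>
</video>

<!-- Opt-in: the author explicitly requests browser-generated captions
     for a video that ships without a caption file. -->
<video controls src="interview.mp4">
  <track kind="captions" srclang="en" label="English (auto-generated)" autogenerate>
</video>
```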

yashrajbharti (Author) commented

To prototype the look and feel of auto-generated captions and support my explainer, I created a vanilla JavaScript solution that demonstrates captions on the fly. You can try it by uploading a video and playing it. The demo also includes examples with preloaded videos and real-time caption generation. I also made a Chrome extension, which can be loaded from the Captions on the Fly repo, to explore how it would work in practice.

The captions have three styles:

| Style | Description |
| --- | --- |
| Static | Captions replace the previous text and display only the latest transcribed line. |
| Scroll | Captions scroll as new lines are added, retaining old transcriptions for context. |
| Append | Captions are appended below the previous ones, keeping up to two lines of transcription visible, with older lines scrollable. |

This solution is designed to highlight different UX approaches to handling live captions dynamically. All styles are draggable and can be positioned anywhere on the screen.
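
For readers who want a sense of how such a demo can be wired up, here is a minimal, self-contained sketch of the Append style. It is not the code from the Captions on the Fly repo; it assumes the Web Speech API's SpeechRecognition (which transcribes microphone input, so the video's audio needs to reach the microphone) and only illustrates the rendering approach:

```html
<!-- Minimal illustration of the "Append" caption style; not the repo's code. -->
<video id="player" controls width="640"></video>
<input type="file" id="file" accept="video/*">
<div id="captions" style="height: 2.6em; overflow-y: auto; font: 1.1em/1.3 sans-serif;"></div>

<script>
  const video = document.getElementById('player');
  const captions = document.getElementById('captions');

  // Let the user pick a local video file to play.
  document.getElementById('file').addEventListener('change', (e) => {
    video.src = URL.createObjectURL(e.target.files[0]);
  });

  // SpeechRecognition is prefixed in Chrome.
  const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new SpeechRecognition();
  recognition.continuous = true;      // keep listening across utterances
  recognition.interimResults = false; // only append finalized lines

  recognition.onresult = (event) => {
    // Append each finalized transcript as a new caption line ("Append" style):
    // roughly two lines stay visible, older lines remain scrollable.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) {
        const line = document.createElement('p');
        line.style.margin = '0'; // avoid default <p> margins so two lines fit
        line.textContent = event.results[i][0].transcript;
        captions.appendChild(line);
        captions.scrollTop = captions.scrollHeight; // keep the newest line in view
      }
    }
  };

  // Start and stop transcription with playback.
  let listening = false;
  video.addEventListener('play', () => {
    if (!listening) { recognition.start(); listening = true; }
  });
  video.addEventListener('pause', () => {
    if (listening) { recognition.stop(); listening = false; }
  });
</script>
```

This mirrors the Append behaviour described in the table above; the Static and Scroll styles would differ only in whether a new line replaces or scrolls the previous content.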

I also hope to adapt this concept to Chrome's built-in Gemini Nano API, enabling real-time, text-based answers from video/audio. The processed data could be cached directly in the client's browser for efficiency. I feel this approach is particularly valuable for live streams, where manually adding captions in real time is impractical.
