Caption output quality: generated SRT files should meet WCAG/FCC accessibility standards #23

@bhuvan-somisetty

The gap no one is talking about

Every pull request in this project focuses on the same question: when should a caption appear? That is half the problem.

The other half is: once we decide to show a caption, is the resulting SRT file actually usable by the people it is meant to serve?

Right now, the pipeline can correctly identify that a gunshot happens at 00:01:14 and emit a [Gunshot] entry into an SRT file, yet that caption can still be invisible to a deaf viewer if it flashes past in 0.4 seconds, or unreadable if it packs 60 words into 4 seconds (900 WPM).

Who gets hurt when we skip this

Closed captions exist for deaf and hard-of-hearing viewers. PlanetRead's content reaches students in classrooms across India, many of whom rely on captions as their only access to audio. A caption that is technically present but cognitively inaccessible is not accessibility — it is the appearance of accessibility.

What the standards say

Three established standards define what "readable" means for captions:

| Standard | Rule |
| --- | --- |
| WCAG 2.1 SC 1.2.2 | Captions must be synchronised with the media |
| FCC 47 CFR § 79.1 | Maximum reading rate: 220 WPM (adult), 130 WPM (children's content) |
| BBC Subtitle Guidelines 2024 | Minimum on-screen time: 1.5 s; line length: ≤ 42 chars (Latin), ≤ 28 chars (Devanagari); inter-caption gap: ≥ 83 ms |

None of the current implementations validate the final SRT/SLS output against any of these.
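
To make the reading-rate rule concrete, here is a minimal Python sketch of how WPM could be computed for one caption. The function names (`parse_timestamp`, `words_per_minute`, `exceeds_limit`) and the `adult`/`children` keys are placeholders for illustration, not existing project code, and counting words by splitting on whitespace is an assumption that may need revisiting for Hindi text.

```python
import re

# Limits taken from the table above; the dictionary keys are hypothetical names.
WPM_LIMITS = {"adult": 220, "children": 130}

# SRT timestamps look like HH:MM:SS,mmm (some tools write a dot instead of a comma).
_TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")


def parse_timestamp(value: str) -> float:
    """Convert an SRT timestamp such as '00:01:14,000' to seconds."""
    match = _TIMESTAMP.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {value!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def words_per_minute(text: str, start: str, end: str) -> float:
    """Reading rate of one caption, counting whitespace-separated words."""
    duration = parse_timestamp(end) - parse_timestamp(start)
    if duration <= 0:
        raise ValueError("caption must end after it starts")
    return len(text.split()) * 60 / duration


def exceeds_limit(text: str, start: str, end: str, content_type: str = "adult") -> bool:
    """True if the caption is faster than the limit for this content type."""
    return words_per_minute(text, start, end) > WPM_LIMITS[content_type]
```

For example, a 60-word caption shown from 00:01:14,000 to 00:01:18,000 works out to 60 × 60 / 4 = 900 WPM, well over either limit.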

Proposed solution

A standalone cc_quality post-processing module that:

  1. Validates generated SRT/SLS files against the above standards and produces a quality score (0–100) with per-caption violation details
  2. Auto-fixes purely timing-based violations (too-short captions, overlaps, narrow gaps) without touching caption text
  3. Supports Hindi/Devanagari — detects script automatically and applies the appropriate line-length limit
  4. Works with every existing implementation — it is input-agnostic; it only cares about the SRT format
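
A rough sketch of what the timing-only auto-fix in point 2 might look like, assuming captions have already been parsed into start/end times in seconds. The `Caption` class and `fix_timing` function are hypothetical names, and the strategy (extend short captions, then trim the end so the next caption keeps its gap) is one possible approach rather than a settled design. Caption text is never modified.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Caption:
    start: float  # seconds
    end: float    # seconds
    text: str


def fix_timing(captions, min_duration=1.5, min_gap=0.083):
    """Return new captions with end times adjusted; start times and text are kept."""
    ordered = sorted(captions, key=lambda c: c.start)
    fixed = []
    for index, caption in enumerate(ordered):
        # Extend captions that are on screen for less than the minimum duration.
        end = max(caption.end, caption.start + min_duration)
        if index + 1 < len(ordered):
            # The latest this caption may end while leaving the minimum gap.
            latest_end = ordered[index + 1].start - min_gap
            if latest_end <= caption.start:
                # Trimming cannot help here; leave it unchanged for manual review.
                fixed.append(caption)
                continue
            end = min(end, latest_end)
        fixed.append(replace(caption, end=end))
    return fixed
```

Because only end times move, a caption squeezed by its neighbour can still be shorter than 1.5 s after fixing. The validator would keep reporting that case instead of silently shifting other captions.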

Rules to enforce

| Rule | Severity | Trigger |
| --- | --- | --- |
| MIN_DURATION | error | Caption on-screen < 1.5 s |
| READING_SPEED | error | WPM exceeds FCC limit for content type |
| LINE_LENGTH | warning | Line exceeds BBC character limit |
| OVERLAP | error | Caption overlaps the next |
| MIN_GAP | warning | Gap < 83 ms (~2 frames at 24 fps) |
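
The rules above could be checked along these lines. This is a sketch under several assumptions: `Caption`, `Violation`, `detect_script`, and `validate` are hypothetical names, script detection simply looks for any character in the Devanagari Unicode block (U+0900 to U+097F), and line length is measured in code points rather than rendered glyphs, which overcounts Devanagari text that uses combining marks. Scoring is left out because the 0–100 formula is not defined yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caption:
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass(frozen=True)
class Violation:
    index: int     # position of the caption in the input list
    rule: str
    severity: str


WPM_LIMITS = {"adult": 220, "children": 130}
LINE_LIMITS = {"latin": 42, "devanagari": 28}
MIN_DURATION = 1.5  # seconds
MIN_GAP = 0.083     # seconds


def detect_script(text: str) -> str:
    """Return 'devanagari' if any character falls in U+0900..U+097F, else 'latin'."""
    if any("\u0900" <= ch <= "\u097f" for ch in text):
        return "devanagari"
    return "latin"


def validate(captions, content_type="adult"):
    """Check each caption against the five rules and return the violations found."""
    violations = []
    for i, cap in enumerate(captions):
        duration = cap.end - cap.start
        if duration < MIN_DURATION:
            violations.append(Violation(i, "MIN_DURATION", "error"))
        if duration > 0:
            wpm = len(cap.text.split()) * 60 / duration
            if wpm > WPM_LIMITS[content_type]:
                violations.append(Violation(i, "READING_SPEED", "error"))
        limit = LINE_LIMITS[detect_script(cap.text)]
        if any(len(line) > limit for line in cap.text.splitlines()):
            violations.append(Violation(i, "LINE_LENGTH", "warning"))
        if i + 1 < len(captions):
            gap = captions[i + 1].start - cap.end
            if gap < 0:
                violations.append(Violation(i, "OVERLAP", "error"))
            elif gap < MIN_GAP:
                violations.append(Violation(i, "MIN_GAP", "warning"))
    return violations
```

The function assumes captions are already sorted by start time, as they normally are in an SRT file.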

CLI sketch

```shell
cc-quality output.srt
cc-quality output.srt --content-type children
cc-quality output.srt --fix --output reviewed.srt
cc-quality output.srt --report json
```
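
One way the flags above might map onto Python's standard `argparse` module is sketched here. Only the flags shown in the CLI sketch come from this proposal. The `adult` and `text` default values, the `srt_file` argument name, and the `build_parser` helper are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed cc-quality command line."""
    parser = argparse.ArgumentParser(
        prog="cc-quality",
        description="Validate an SRT file against caption readability rules.",
    )
    parser.add_argument("srt_file", help="path to the SRT file to check")
    parser.add_argument(
        "--content-type",
        choices=["adult", "children"],  # 'adult' as the default name is assumed
        default="adult",
        help="selects the reading-speed limit",
    )
    parser.add_argument(
        "--fix", action="store_true", help="auto-fix timing-only violations"
    )
    parser.add_argument("--output", help="where to write the fixed SRT file")
    parser.add_argument(
        "--report",
        choices=["text", "json"],  # 'text' as the default format is assumed
        default="text",
        help="report format",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```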

Why this matters beyond the DMP

Once this project is deployed, editors at PlanetRead will feed its SRT output directly into video players. A validator that catches timing and readability problems before publication protects real users and reduces the manual review load — which is exactly the kind of automation this project is meant to provide.

Happy to implement this as a PR if there is interest.
