The gap no one is talking about
Every pull request in this project focuses on the same question: when should a caption appear? That is half the problem.
The other half is: once we decide to show a caption, is the resulting SRT file actually usable by the people it is meant to serve?
Right now, the pipeline can correctly identify that a gunshot happens at 00:01:14 and emit a [Gunshot] entry into an SRT file, and that caption can still be invisible to a deaf viewer if it flashes past in 0.4 seconds, or unreadable if it packs 60 words into 4 seconds (900 WPM).
Who gets hurt when we skip this
Closed captions exist for deaf and hard-of-hearing viewers. PlanetRead's content reaches students in classrooms across India, many of whom rely on captions as their only access to audio. A caption that is technically present but cognitively inaccessible is not accessibility — it is the appearance of accessibility.
What the standards say
Three established standards define what "readable" means for captions:
| Standard | Rule |
| --- | --- |
| WCAG 2.1 SC 1.2.2 | Captions must be synchronised with the media |
| FCC 47 CFR § 79.1 | Maximum reading rate: 220 WPM (adult), 130 WPM (children's content) |
| BBC Subtitle Guidelines 2024 | Minimum on-screen time: 1.5 s; line length: ≤ 42 chars (Latin), ≤ 28 chars (Devanagari); inter-caption gap: ≥ 83 ms |
None of the current implementations validate the final SRT/SLS output against any of these.
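To make the script-dependent line-length limits concrete, here is a minimal Python sketch. The function names, the majority-vote heuristic, and the choice to count Unicode code points (rather than grapheme clusters, which may be the fairer measure for Devanagari) are all assumptions for illustration, not part of any existing implementation.

```python
# Character limits per line, taken from the table above.
LINE_LIMITS = {"latin": 42, "devanagari": 28}


def _is_devanagari(ch: str) -> bool:
    # Devanagari Unicode block: U+0900 to U+097F (letters, vowel signs, digits).
    return "\u0900" <= ch <= "\u097f"


def detect_script(text: str) -> str:
    """Hypothetical heuristic: Devanagari wins if it outnumbers other letters."""
    devanagari = sum(1 for ch in text if _is_devanagari(ch))
    other_letters = sum(1 for ch in text if ch.isalpha() and not _is_devanagari(ch))
    return "devanagari" if devanagari > other_letters else "latin"


def line_length_violations(text: str) -> list[tuple[int, int, int]]:
    """Return (line_number, length, limit) for every line over the limit.

    Length is counted in code points, which is an assumption of this sketch.
    """
    limit = LINE_LIMITS[detect_script(text)]
    return [
        (number, len(line), limit)
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

With this heuristic, a caption with no letters at all (for example a line of punctuation) falls back to the Latin limit.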
Proposed solution
A standalone cc_quality post-processing module that:
- Validates generated SRT/SLS files against the above standards and produces a quality score (0–100) with per-caption violation details
- Auto-fixes purely timing-based violations (too-short captions, overlaps, narrow gaps) without touching caption text
- Supports Hindi/Devanagari — detects script automatically and applies the appropriate line-length limit
- Works with every existing implementation — it is input-agnostic; it only cares about the SRT format
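Because the module would only depend on the SRT format, its entry point could be a small parser like the sketch below. It assumes the common SRT layout (an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then one or more text lines, with blank lines between cues). `parse_srt` and `to_ms` are hypothetical names, and malformed blocks are simply skipped here rather than reported.

```python
import re

# Timing line of an SRT cue, e.g. "00:01:14,000 --> 00:01:14,400".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(text: str) -> list[tuple[int, int, str]]:
    """Return a list of (start_ms, end_ms, caption_text) tuples."""
    cues = []
    normalised = text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # not enough lines for an index and a timing line
        match = TIMING.match(lines[1].strip())
        if not match:
            continue  # skip malformed blocks in this sketch
        fields = match.groups()
        cues.append((to_ms(*fields[:4]), to_ms(*fields[4:]), "\n".join(lines[2:])))
    return cues
```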
Rules to enforce
| Rule | Severity | Trigger |
| --- | --- | --- |
| MIN_DURATION | error | Caption on-screen < 1.5 s |
| READING_SPEED | error | WPM exceeds FCC limit for content type |
| LINE_LENGTH | warning | Line exceeds BBC character limit |
| OVERLAP | error | Caption overlaps the next |
| MIN_GAP | warning | Gap < 83 ms (~2 frames at 24 fps) |
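As a rough illustration of how the timing rules might be evaluated, here is a Python sketch. The `Caption` dataclass and `check` function are hypothetical names, words are counted by splitting on whitespace (an assumption that may need refinement for Hindi), and LINE_LENGTH is left out to keep the example short.

```python
from dataclasses import dataclass

MIN_DURATION_MS = 1500                          # 1.5 s minimum on-screen time
MIN_GAP_MS = 83                                 # about 2 frames at 24 fps
WPM_LIMITS = {"adult": 220, "children": 130}    # limits from the standards table


@dataclass
class Caption:
    index: int
    start_ms: int
    end_ms: int
    text: str


def check(captions: list[Caption], content_type: str = "adult") -> list[tuple[int, str, str]]:
    """Return (caption_index, rule, severity) for each violation found.

    Captions are assumed to be sorted by start time.
    """
    violations = []
    for position, cap in enumerate(captions):
        duration = cap.end_ms - cap.start_ms
        if duration < MIN_DURATION_MS:
            violations.append((cap.index, "MIN_DURATION", "error"))

        # Words per minute = words / (duration in minutes).
        words = len(cap.text.split())
        if duration > 0 and words * 60_000 / duration > WPM_LIMITS[content_type]:
            violations.append((cap.index, "READING_SPEED", "error"))

        if position + 1 < len(captions):
            gap = captions[position + 1].start_ms - cap.end_ms
            if gap < 0:
                violations.append((cap.index, "OVERLAP", "error"))
            elif gap < MIN_GAP_MS:
                violations.append((cap.index, "MIN_GAP", "warning"))
    return violations
```

For instance, a 60-word caption shown for 4 seconds works out to 60 × 60,000 / 4,000 = 900 WPM and would be flagged as READING_SPEED under either content type.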
CLI sketch
cc-quality output.srt
cc-quality output.srt --content-type children
cc-quality output.srt --fix --output reviewed.srt
cc-quality output.srt --report json
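One possible behaviour for `--fix` is sketched below in Python. The policy is an assumption, not a settled design: extend a too-short caption up to 1.5 s, then pull its end time back if it would overlap or crowd the next caption. Start times and caption text are never changed, and `fix_timing` is a hypothetical name.

```python
MIN_DURATION_MS = 1500   # 1.5 s minimum on-screen time
MIN_GAP_MS = 83          # minimum gap before the next caption


def fix_timing(cues: list[tuple[int, int, str]]) -> list[tuple[int, int, str]]:
    """Adjust end times only. cues are (start_ms, end_ms, text), sorted by start."""
    fixed = []
    for position, (start, end, text) in enumerate(cues):
        # Extend too-short captions to the minimum duration.
        new_end = max(end, start + MIN_DURATION_MS)
        if position + 1 < len(cues):
            # Latest end time that still leaves the minimum gap before the next cue.
            latest_end = cues[position + 1][0] - MIN_GAP_MS
            if latest_end <= start:
                # Cannot be fixed by moving the end time alone; leave it unchanged
                # so the validator keeps reporting it.
                fixed.append((start, end, text))
                continue
            new_end = min(new_end, latest_end)
        fixed.append((start, new_end, text))
    return fixed
```

A caption squeezed between close neighbours may still end up shorter than 1.5 s after this pass, so re-running the validator on the fixed file would remain worthwhile.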
Why this matters beyond the DMP
Once this project is deployed, editors at PlanetRead will feed its SRT output directly into video players. A validator that catches timing and readability problems before publication protects real users and reduces the manual review load — which is exactly the kind of automation this project is meant to provide.
Happy to implement this as a PR if there is interest.