TTS in an accessibility plan: where it helps, where it stops
Adding text-to-speech to a website or app looks like an accessibility win. Sometimes it is a real one; sometimes it is a checkbox that distracts from the deeper accessibility work.
There is a comfortable assumption in product teams that "we added a play-as-audio button" is a meaningful accessibility improvement. Sometimes it is. Often it is a feature that helps a small number of users while letting the team feel they have addressed accessibility without doing the harder, more durable work the actual standards require.
This is the honest map of what TTS does and does not do for accessibility, and how to use it as part of a real accessibility plan rather than a substitute for one.
What accessibility actually means
The starting frame: WCAG 2.1 AA is the legally enforced minimum in most jurisdictions in 2026. WCAG 3 is in working draft status, but the substantive obligations on the books for the next several years still come from 2.1 AA, with 2.2 adding incremental requirements. Section 508 in the U.S., the European Accessibility Act in the EU, and similar regimes in other markets all reference WCAG-style criteria.
What WCAG actually requires for audio and reading content:
- Text alternatives for non-text content (images, video, audio).
- Audio descriptions or alternatives for video where the visual content carries information.
- Captions for prerecorded audio content in synchronized media (video with an audio track).
- Compatibility with screen readers and other assistive technology.
- Sufficient color contrast, keyboard navigation, focus indicators.
- Content that can be presented in different ways without losing information or structure.
What WCAG does not require:
- A play-as-audio button on every page.
- TTS narration of every article.
- Synthetic voice as a substitute for screen-reader compatibility.
This is the key distinction the field gets wrong. Accessibility is not a single feature; it is a property of the page. TTS narration can be one feature inside a properly accessible page. It does not replace the underlying accessibility work, and a page with a play-as-audio button but missing semantic HTML, ARIA labels, or keyboard navigation is not accessible just because the audio button exists.
Where TTS genuinely helps
TTS is a real accessibility feature for specific user groups. Naming them honestly:
Users with reading difficulties. For users with dyslexia, ADHD, and certain cognitive disabilities, audio support makes text easier to work through. The user can listen to the page while looking at it, which engages multiple processing pathways and reduces the working-memory load of decoding text. For these users, an opt-in audio mode is genuinely valuable.
Low-vision users who do not use a screen reader. Many low-vision users do not run a full screen reader; they use OS-level magnification and prefer audio playback for long-form content. A page-native audio mode fits their workflow without requiring them to copy text into a separate tool.
Users with temporary conditions. Eye strain after a long workday, a vision issue post-surgery, contact-lens problems. These users are not running assistive tech but benefit from audio playback when reading is uncomfortable.
Users with situational barriers. Driving (where the user genuinely should not be reading), exercising, doing manual work that occupies the hands but not the ears. Audio playback turns text content into something the user can consume without dedicated visual attention. This is not a disability accommodation in the strict sense, but it is a genuine usability improvement for a meaningful slice of users.
Multilingual or low-literacy contexts. Users reading in their second language often process audio better than text, especially for long-form content where decoding speed is the bottleneck. The same holds for users with lower reading proficiency in any language: listening to a page while reading along reinforces understanding.
For all of these groups, a properly built TTS feature on a page is helpful. The feature has to be opt-in (loading audio on every page visit is bandwidth waste for users who do not want it), gracefully styled so it does not distract from the primary content, and well-paired with the underlying text so the listener can switch between modes naturally.
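As a concrete sketch of what "opt-in and paired with the text" can mean in markup: the control is labelled for assistive technology, the audio element uses `preload="none"` so nothing downloads until the user asks, and `aria-describedby` ties the audio back to the text it narrates. All ids here are hypothetical.

```typescript
// Sketch of the markup an opt-in audio mode might render.
// preload="none" defers the download until the user presses play.
function audioModeMarkup(audioSrc: string, transcriptId: string): string {
  return [
    // Label the control so assistive tech announces its purpose.
    `<button type="button" aria-label="Listen to this article"`,
    `        aria-controls="article-audio">Listen</button>`,
    // Nothing is fetched until the user opts in.
    `<audio id="article-audio" src="${audioSrc}" controls preload="none"`,
    // Tie the audio back to the text it narrates.
    `       aria-describedby="${transcriptId}"></audio>`,
  ].join("\n");
}
```

A server template or client framework would render the same attributes; the load-bearing details are `preload="none"` and the explicit labelling, not the rendering mechanism.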
Where TTS does not help
Naming the failure modes is the harder part.
Screen-reader users. Users running JAWS, NVDA, VoiceOver, or TalkBack already have audio output of any text content on the page. The screen reader uses voices the user has selected, navigates the document structure the user is familiar with, and integrates with the user's existing keyboard shortcuts. A page-level "play as audio" button is not useful to these users; they have already heard the page in their preferred voice with their preferred navigation. Worse, a poorly built page-level audio mode can interfere with the screen reader (audio playing over the screen reader's narration, focus stealing, ARIA live regions firing on play) in ways that make the page actively harder to use.
Users with hearing disabilities. Audio playback is not an accessibility feature for users who cannot hear it. The accessibility feature for these users is captions, transcripts, and structured text content. Adding TTS to a page does nothing for them and may distract the team from the captioning work that does help them.
Users navigating with a keyboard or alternative input. A page that plays audio at the press of a button is fine; a page where the audio button is the primary way to consume content is hostile to keyboard users who lose the ability to scrub, scan, and reread. The text remains the accessible artifact; the audio is a complement.
Users on slow connections or limited data. Loading a multi-megabyte audio file on every page visit (or even on user opt-in if the file is large) is a usability cost for users with bandwidth constraints. A play-as-audio feature that downloads aggressively can make the page worse for users in these situations.
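A minimal sketch of a "never download audio the user did not ask for" policy. The 1 MB threshold is illustrative, and the Save-Data signal (`navigator.connection?.saveData`, a non-standard Chromium API) is an assumption; treat its absence as "unknown", not "unlimited".

```typescript
// Decide whether to fetch an audio asset, given user intent and
// bandwidth signals. Threshold and inputs are illustrative.
function shouldFetchAudio(opts: {
  userOptedIn: boolean; // the user pressed the listen control
  saveData: boolean;    // browser reports a data-saving preference
  fileBytes: number;    // known size of the audio asset
}): boolean {
  if (!opts.userOptedIn) return false; // never auto-download
  if (opts.saveData && opts.fileBytes > 1_000_000) {
    return false; // respect Save-Data for large files
  }
  return true;
}
```

In a browser, the second input would come from something like `(navigator as any).connection?.saveData ?? false`, hedged because the Network Information API is not implemented everywhere.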
Users who want to skim, scan, or jump. Reading is faster than listening for almost everyone who can read. A page that nudges users toward audio playback makes the content slower to consume, not faster. This is fine when the user explicitly chooses audio; it is a regression when it becomes a default mode.
The compliance trap
The trap teams fall into: "we have a play-as-audio button, therefore the page is accessible." This is wrong on two axes.
First, the play-as-audio button does not address any of the WCAG 2.1 AA criteria directly. It is not a substitute for semantic HTML, ARIA labels, keyboard navigation, color contrast, focus indicators, or screen-reader compatibility. A page can have a play-as-audio button and fail every WCAG criterion that actually carries legal weight. The button is invisible to compliance audits unless it is part of a broader plan; even then, it is a small input.
Second, "we built it for accessibility" is a claim regulators can probe. If the page builds an audio-mode feature and markets it as an accessibility improvement while neglecting underlying issues (missing alt text, unlabeled buttons, broken keyboard navigation, low color contrast, no transcripts on video), an accessibility complaint or audit will find the underlying issues regardless of the audio feature. The audio button does not buy goodwill that translates into compliance.
This is the trap because the audio feature is satisfying to build. It is concrete, visible, and feels like accessibility work. The underlying compliance work (auditing existing pages for ARIA correctness, fixing color contrast failures, ensuring keyboard parity, adding transcripts and captions to video) is unglamorous, hard to demo, and does not produce a feature the team can put on a launch announcement. Teams gravitate toward the feature and away from the audit, and the result is a site that has an accessibility-themed feature but is not accessible.
How to use TTS as part of a real accessibility plan
The framing that holds up: TTS is an enhancement layer on a page that is already accessible. The plan looks like this.
- Audit and fix the structural foundation. WCAG 2.1 AA compliance covers semantic HTML, ARIA labels, keyboard navigation, color contrast, focus management, captions for video, alt text for images. This is the load-bearing work. Run the audit; fix what fails. Tools like axe-core, WAVE, or paid auditors can identify the issues. This step is not optional and is not replaceable.
- Ensure screen-reader compatibility specifically. Test with NVDA, JAWS, and VoiceOver on actual content. This takes real time; budget for it. Screen-reader users are among the largest groups of assistive-technology users, and the group most often served worst by a "play as audio" button that stands in for proper screen-reader support.
- Add captions and transcripts for video. Same load-bearing work. WCAG requires it; users with hearing disabilities depend on it.
- Then, optionally, add a TTS audio mode for users who prefer it. Build it as opt-in, not auto-play. Pair it with visual highlighting of the text as it is spoken (word-level timestamps earn their keep here). Make it work with keyboard controls. Do not interfere with screen readers.
- Disclose where audio is AI-generated. This is a regulatory concern under the EU AI Act for content reaching EU users; it is also a transparency concern with users generally. A small "voice produced with AI text-to-speech" note adjacent to the audio control covers it.
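The audit in the first step produces machine-readable output. As a sketch, assuming axe-core's documented result shape (each violation carries an `id`, an `impact` level, and the affected `nodes`), the findings can be triaged so serious and critical failures are scheduled ahead of any TTS feature work:

```typescript
// Triage axe-core scan results by impact level. The shape below follows
// axe-core's documented violations format; the triage policy is ours.
interface AxeViolation {
  id: string; // e.g. "color-contrast" or "button-name"
  impact: "minor" | "moderate" | "serious" | "critical";
  nodes: unknown[]; // one entry per failing element
}

// Count failing elements per impact level.
function triage(violations: AxeViolation[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const v of violations) {
    counts[v.impact] = (counts[v.impact] ?? 0) + v.nodes.length;
  }
  return counts;
}
```

The point of triaging is prioritization: a backlog with open "serious" or "critical" counts is the structural work; the audio mode waits behind it.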
The end state is a page that is accessible to screen-reader users out of the box (because of the structural work), accessible to keyboard users (same), captions-equipped for hearing-disabled users (same), and that also offers an opt-in audio mode for users who prefer audio playback. The TTS feature is the cherry on top of a foundation that does the actual work.
A practical test
Run this test on any page where TTS is being framed as an accessibility feature.
- Turn off your monitor and try to navigate the page using only a screen reader. Can you find the main content? Can you read the article? Can you submit any form?
- Try the page with keyboard only. Does focus move sensibly? Can you reach every interactive element? Are focus indicators visible?
- View the page in a high-contrast mode or with a contrast-checking tool. Are any colors below the WCAG 2.1 AA contrast threshold?
- If the page has video, are captions present and accurate?
- If the page has images, do they have meaningful alt text?
If any of these tests fail, fix the failure before claiming the TTS feature is an accessibility win. The TTS feature is not a substitute for the underlying work; it is an addition on top of it.
If all of these tests pass, then yes, adding a TTS mode to the page is a real, useful accessibility enhancement for the user groups it serves, and it adds capability without distracting from the foundation.
The argument in plain language
I think TTS-on-the-page is a useful enhancement that should not be confused with accessibility itself. The accessibility work is the structural work: semantic markup, screen-reader compatibility, captions, keyboard navigation, color contrast. That work is harder to demo and easier to skip. The TTS feature is easier to demo and easier to ship as a marketing line. Teams that prioritize the demo-able feature over the structural work end up with sites that look accessible and are not, and the cost shows up in compliance audits, in user complaints, and in the legal risk that has been mounting in jurisdictions that take accessibility seriously.
Use TTS on your site if you have done the structural work and want to add an opt-in audio mode for users who prefer it. Do not make TTS the headline of your accessibility plan. The distance between the two is larger than it looks; pretending otherwise helps no one.