Captions and Transcripts

People with a range of disabilities, including people who are hard of hearing, deaf, or Deaf and people with varying levels of blindness, depend on media with accurate closed captions and transcripts. Assistive technologies such as screen readers and refreshable braille devices help ensure access to media content.

Many video platforms in use at Rice, including Zoom, Kaltura, and YouTube, provide automatic speech recognition (ASR) captions for uploaded videos. ASR captions and transcripts are only about 80% accurate, which falls below the accessibility threshold. Platforms with ASR captions have integrated editors where content creators can correct spelling, punctuation, and other errors to ensure accuracy.
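As a rough illustration of what a figure like "80% accurate" can mean, the sketch below scores an ASR transcript against a human-corrected reference using word-level edit distance. This is only one common way to estimate accuracy, and it is an assumption here; the platforms named above may measure accuracy differently. The function name word_accuracy and the sample sentences are hypothetical.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Estimate accuracy as 1 minus the word error rate (an assumed metric).

    Capitalization is ignored; punctuation attached to a word still counts,
    so "dog." and "dog" are treated as different words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 1.0 if not hyp else 0.0

    # Word-level Levenshtein distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from ASR output
                            curr[j - 1] + 1,      # extra word in ASR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr

    error_rate = prev[-1] / len(ref)
    return max(0.0, 1.0 - error_rate)


# Hypothetical example: two of ten words were misrecognized.
corrected = "the quick brown fox jumps over the lazy dog today"
asr_output = "the quick brown box jumps over the hazy dog today"
print(f"{word_accuracy(corrected, asr_output):.0%}")
```

A score like this only counts word errors; it says nothing about whether the errors fall on key terms, names, or numbers, so editing by a person remains necessary.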

Accurate captions and transcripts benefit everyone, but remember that they are essential for some disabled people. Broader benefits include:

  • Improved note-taking.
  • Improved attention.
  • Improved comprehension for non-native speakers.

Learn more about transcription accuracy.

Captions vs. Subtitles


Captions display a synchronized text version of the speech (and contextual sound descriptions) in a video, in the same language as the audio.

Closed vs. Open

  • Closed captions can be turned on or off by the viewer.
  • Open captions are always displayed and cannot be turned off.


Subtitles display a synchronized text version of the speech translated into a different language.


Success Criteria for Transcripts


Basic transcripts are a text version of speech and contextually relevant sounds. They are used by individuals who are deaf, hard of hearing, or have difficulty processing auditory information. Descriptive transcripts also include visual information for individuals who are blind.


Interactive transcripts can be created by some media players, like Zoom and Kaltura, which use caption files to display the written speech alongside the video. Phrases are highlighted in the displayed transcript as they are spoken in the video. Interactive transcripts also allow individuals to move within the video by selecting text of interest.
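The idea behind an interactive transcript can be sketched in a few lines of code. The example below assumes the caption file is in WebVTT format and handles only plain timing lines and cue text; it is not how Zoom or Kaltura implement the feature. The names Cue, parse_cues, active_cue, and seek_time, and the sample captions, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


# Matches WebVTT timing lines such as "00:00:01.000 --> 00:00:04.000".
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_cues(vtt: str) -> list[Cue]:
    """Parse a simplified WebVTT document into cues (header and notes are skipped)."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                g = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


def active_cue(cues, playback_time):
    """Return the cue to highlight at the given playback time, if any."""
    for cue in cues:
        if cue.start <= playback_time < cue.end:
            return cue
    return None


def seek_time(cues, phrase):
    """Return the start time of the first cue containing the selected phrase."""
    for cue in cues:
        if phrase.lower() in cue.text.lower():
            return cue.start
    return None


SAMPLE = """WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to the lecture.

00:00:04.500 --> 00:00:08.000
Today we discuss captions.
"""

cues = parse_cues(SAMPLE)
print(active_cue(cues, 5.0))        # cue to highlight five seconds in
print(seek_time(cues, "captions"))  # where to move the video when text is selected
```

Because the same timed cues drive both the highlighting and the navigation, errors in the caption file carry straight through to the interactive transcript, which is another reason to correct ASR captions first.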

Techniques for Transcripts

Recorded Video Captions

Success Criteria for Recorded Video Captions

When preparing a video, accessibility should be considered in the initial planning.

  • Take steps to ensure the audio is high quality, with little background noise, and that the presenter speaks clearly and not too fast. This will help with comprehension as well as captioning later.

  • The speaker’s face should be easily visible since some people watch mouth movement to assist in their comprehension.

  • Adding verbal descriptions of simple visual information can help blind and low-vision audience members, among others. For more complex information, you may need to add descriptions after the recording.

Techniques for Recorded Video Captions

Live Video Captions

Success Criteria for Live Video Captions

CART Captioning

CART stands for Communication Access Realtime Translation.

When hosting an event with a broad audience, such as a webinar, it is best to hire a professional CART vendor to provide accurate, real-time captions.

Likewise, if a student has a documented disability, they are entitled to reasonable accommodations, including human captioning. For more information about hiring a captioning vendor, contact the Disability Resource Center.

Live ASR Captions

For some smaller-scale presentations, hiring a human captioner may not always be possible, partly due to budget constraints.

ASR captions alone are not accurate enough to meet accessibility standards in recorded media. However, ASR captions can still be valuable during live presentations.

Zoom currently offers live ASR captions, and participants can turn on this feature in any Zoom meeting by using the Show Captions option in the actions bar at the bottom of the meeting room screen. Encourage colleagues and students to use this feature, as they can follow along with a live transcript or change the language in which captions are displayed.

We recommend using one of two presentation applications available to members of the Rice community. Both applications include live ASR:

Techniques for Live Video Captions