I'm no video expert. Yet I often find myself encoding, editing, and otherwise manipulating video for the web. Recently, I completed a video project that involved converting a DVD of a 40-minute presentation into a movie that could be viewed on a web page, as a whole or in chapters. The final product had to be captioned.

Converting the DVD into video for the web was easy. I used Handbrake to rip the DVD into MP4 format. Editing was equally easy. I used iMovie to add title screens and transitions, and to break the movie up into chapters. Adding the captioning, however, was tricky.

Why bother with captioning? Here are some good reasons: so that those who are deaf or hard of hearing can enjoy the video, so the text is indexed by search engines, and to aid those for whom English is a second language. And here’s another: the Twenty-First Century Communications and Video Accessibility Act of 2010.

If captioning is important, then why isn't it a mainstream practice? I'm not qualified to answer that question, but my guess is that it's in part due to the fact that captioning is time-consuming and difficult. For instance: with external captioning (where captions are contained in an external file and sync with the video), there are multiple formats and a lack of clear standards. And for embedded captioning (where captions are simply typed in an editor and then exported with the movie), it's just plain tedious work.

For my recent video project, I considered three captioning options:

  1. Embed the captions. The first option is to place the captions directly into the movie itself using a tool such as Final Cut Pro,  iMovie, or  Adobe Premiere.  I have Final Cut Pro, but I tend to use iMovie since most of the video work I do is short and simple. It’s the easiest tool for the job and the results look good. Here’s the thing about iMovie: while there are dozens of title/text effect options, none are designed for captioning (which is surprising given Apple's robust accessability options for the OS). Despite this shortcoming, I’ve discovered that I can 'fake' captions by adding lower thirds to each segment of video. Making a default lower third overlay in iMovie into something that resembles a caption is a matter of changing font sizes. You can see an example of this in a recent video podcast I produced. This works, but it isn't a practical solution for a long movie. In truth, it's really not an ideal solution for any length movie because the captions are permanently embedded in the video. Screen readers and search engines can't see this text. People can’t choose to turn the captions on/off. So I didn't choose this option for my project.
  2. Dump the text on the page. A second option is to dump the captioning for a video on a page, underneath the video as HTML text. This may technically meets accessibility requirements, but it’s a lousy solution. The text is unassociated with the video. One can read the text or watch the video. It's not feasible to do both at the same time. Nix.
  3. Create an external caption file. This last choice is the best solution: create an external caption file that will appear in sync with the video. Captioning is then matched up with the video, it's readable by screen readers, and it's good for search engines. It can also be turned on or off at the user's discretion.

So how do you create and deploy and external caption file? If you simply wish to place a video on Youtube, it's easy. Once you upload your video to the free service, Youtube offers free auto-generated machine transcription. While you'll find that video speech-to-text accuracy is hit-and-miss (more miss in my experience), the important part is that Google generates  time codes that precisely match the the audio in the video. So once you download the caption file from Youtube, it's a simply a matter of manually correcting the text so that what appears in the caption will match what is actually being said in the video.

If you don't want to (or can't because of workplace policy) solely use Youtube to present your video, it's still a very useful tool. How? If you are embedding captions in a video using an editor such as iMovie, YouTube will do half of the work for you by delivering a fair approximation of a transcript. If you want to use an external caption file elsewhere with a different video player, you can still use this Google-generated file. You just need to convert it into the right format.

Here’s the process I used to generate a caption file for my video project:

  • I began by uploading the video to my YouTube channel.
  • I then requested that YouTube auto-generate a Subviewer caption file for this movie (Be patient. It may take hours to get this file back from Google because you'll be in a queue with tons of other people).
  • I then downloaded this file and opened it up in text editor. 
  • The next step is tedious, but necessary: cleaning up the machine-generated text. I opened my movie in a QuickTime player window and, as it played, edited my caption text to correct errors and typos. It's not too bad if you toogle between a text editor and QuickTime using Cmd-Tab.
  • Once I had my cleaned-up Subviewer text file, I copied and pasted it it into a free online converter to generate into the appropriate format. In my case, I generated a DFXP file for use with a Flash player. Here are three conversion tool options:
    • 3PlayMedia Caption Format Converter. This converter lets you convert from SRT or from SBV to  DFXP, SMI or SAMI (Windows Media), CPT.XML (Flash Captionate XML), QT (Quicktime), and STL (Spruce Subtitle File).
    • Subtitle Horse. A free online caption editor. Exports DFXP, SRT, and Adobe Encore files.
    • Subviewer to DFXP. This free online tool from Ohio State University converts a YouTube .SBV file into DFXP, Subrip, or QT (QuickTime caption) files. I used this tool for my project.

What’s the appropriate format?

  • YouTube: Subviewer (.SBV) 
  • iTunes, iOS: Scenarist Closed Caption (.SCC) 
  • Flash: DFXP, Timed Text Markup Language, the W3C recommendation. These are plain ol’ XML files.  You could also use the SubRip (.SRT) file format for Flash.
  • HTML5:  See this post.

If you're not using a hosted service like YouTube or Vimeo (which, incidentally, does not support external captions), you'll of course have to decide how to present the video on your site. There are many options. You can roll your own player with external captions using Adobe Flash. You can use off-the-shelf players that support captioning such as Flowplayer and JW Player — these two commercial products offer very easy setup and they offer HTML5 video players with Flash fallback. Another option: you might try HTML5 with experimental captioning support (note that Safari 5 now supports captioning with the HTML5 video tag). As I said, there are options. The video player discussion is beyond the scope of this post (and I don’t want to go down the HTML5 vs. Flash rabbit hole!).

My main goal here is to point out that Google's machine transcription is good for more than just hosting a captioned video on Youtube. It's trivial to convert this caption file into a variety of formats. The key point is that you don't have to manually add time codes for your video. This critical step is done for you.

Yet even with this handy Google tool, generating caption files (and getting them to work with video players) remains an unwieldy task. We clearly need better tools and standards to help bring video captioning into the mainstream.

P.S. While researching this post, I came across two low-cost tools that look like solid options to create iOS and iTunes movies with captions. Both are from a company called bitfield. The first is called Submerge. This tool makes it very easy to embed (hard-code) subtitles in a movie and will import all the popular external captioning formats. The second is called iSubtitle. This tool will ‘soft-code’ subtitle tracks so you can add multiple files (languages) and easily add metadata to your movie.