Speech to Text vs. Human Transcription: What’s the difference and which is right for you?
There’s no shortage of man vs. machine examples present in everyday life. Should you call an expert or ask Siri? Wait for a traditional cashier or hurry through self-checkout? Hire someone to sweep your floors or purchase a Roomba?
The same options exist when choosing a transcription provider: should you use an automated service or a human transcriptionist? There’s no single right answer. It’s more about which option is right for your workflow.
Choose automated transcription if you:
- Need a transcript immediately
- Are limited on budget
- Have clear audio with only 1-2 speakers
- Only need a rough draft
- Have time to edit it yourself
- Want to search the audio for specific keywords
- Are looking for a few specific quotes
Who uses automated transcription?
Journalists, graduate students, radio stations, podcasters. (This list is not exhaustive. There are also instances when people in these industries choose human transcription instead.)
Choose human transcription if you:
- Need accurate results
- Have a flexible budget
- Don’t want to spend time on editing
- Have publication as an end goal
- Have an audio file with heavy accents or multiple speakers
Who uses human transcription?
Video production companies, market research firms, lawyers. (Again, this list is not exhaustive. There are also instances when people in these industries choose automated transcription instead.)
What’s the difference between automated vs. human transcription?
Human transcription, as the name suggests, is done by real people who listen to an audio file and convert it to text. When it comes to transcription, humans tend to produce far more accurate results, as they are capable of deciphering heavy accents and industry jargon. People are also more effective with tough audio, including background noise and multiple speakers. In short, human transcription beats automated transcription in almost every way except cost and speed. For comparison, our automated transcription service is a tenth of the cost of our human transcription service (Temi.com = $0.10/min., Rev.com = $1.00/min.).
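To see what that price gap means for a real file, here is a minimal sketch in Python that applies the two per-minute rates quoted above to an hour of audio. The function name `transcript_cost` and the 60-minute length are just illustrations, not part of any Rev or Temi tool.

```python
# Per-minute rates quoted in this article, stored as whole cents
# so the arithmetic stays exact.
RATES_CENTS_PER_MIN = {
    "automated": 10,   # Temi.com, $0.10/min
    "human": 100,      # Rev.com, $1.00/min
}


def transcript_cost(minutes: int, service: str) -> float:
    """Return the cost in dollars for `minutes` of audio (hypothetical helper)."""
    return RATES_CENTS_PER_MIN[service] * minutes / 100


# Compare both options for a one-hour recording.
for service in RATES_CENTS_PER_MIN:
    print(f"{service}: ${transcript_cost(60, service):.2f}")
# automated: $6.00
# human: $60.00
```

For an hour of audio, that is $6 versus $60, before counting any time you spend editing the automated draft yourself.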
Automated transcription uses speech recognition to convert audio to text in minutes. It’s much faster than a human transcriptionist but far less accurate, meaning you might have to spend some time cleaning up your transcript with an editing tool. High-quality audio files with one or two speakers, no accents or complicated jargon, and little background noise will produce more accurate transcripts. If you’re a grad student with a tight timeline and a limited budget, this is probably your choice.
So, there you have it. There are positives to both options, and how you plan to use your transcript determines which makes sense for you. Are transcripts a part of your workflow? If so, have you tried both automated and human transcription services? We’d love to hear how you incorporate transcripts into your day-to-day and which option you prefer. Please tell us in the comments!