Google Speech Recognition API vs. Rev AI API
Rev’s network of 50,000 human transcriptionists completes thousands of projects daily. This provided an extensive training set for the Rev AI transcription API.
But is that enough to stack up against automatic speech recognition (ASR) from a tech giant like Google? Let’s compare the accuracy, speed, features, and cost of two popular ASR solutions: Google Speech Recognition API vs. Rev AI API.
Which Is More Accurate: Google Speech Recognition API or Rev AI API?
In our podcast transcription benchmarks, we compared the word error rates (WER) of Rev AI and Google’s video model for 30 podcasts. Rev AI was more accurate than Google for 24 of 30 media files. Rev AI also had a lower average WER at 14.22% compared to 15.82% for Google Speech Recognition.
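For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard definition, not the benchmark's own tooling. The function name `word_error_rate` is ours, and the normalization (lowercasing and whitespace splitting) is an assumption; the benchmark's actual text normalization is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

With this definition, one wrong word in a four-word reference gives a WER of 25%, and a WER can exceed 100% when the output contains many inserted words.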
Feature Comparison: Google Speech Recognition API vs. Rev AI API
Speaker Identification and Diarization
Speaker identification and diarization are crucial for audio files with multiple speakers. These features segment the audio by speaker.
That way, the transcript can indicate who said which words. Rev AI includes full speaker identification and diarization in our service. Google Speech Recognition has these services as well, but they are still in beta.
Language Support
Google supports 125 languages and variants. It also offers beta support for automatic language detection, which works when you specify up to four candidate languages in advance.
Rev AI supports a global English model as well as 30 other world languages.
The global English model recognizes and transcribes speech from several English variants. It even includes English as spoken by a German or French speaker.
Turnaround Speed of Google Speech Recognition API and Rev AI API
For short files (tens of seconds), Google offers impressively rapid transcription turnaround times. For longer files, Google’s transcription takes about half the runtime of the media file. Rev AI’s transcriptions are somewhat slower for short files. However, Rev AI is able to transcribe long files at a remarkable rate: it transcribes media files that are 1-2 hours long in 5-10 minutes.
Ease-of-Use: Google Speech Recognition API vs. Rev AI API
Google has extensive integrations with other Google services, and it is great for complex applications. The recognizable user interface is a big advantage of their ASR. If you’ve used Gmail or other Google products, you’ll feel right at home in their speech API. Being in the Google ecosystem is also a decided advantage if you plan to integrate your application with other Google services.
Rev AI is built for ease of use as a standalone service. The simpler your application or the less you rely on Google integrations, the more advantageous Rev AI becomes. Rev AI’s output files are easy to read and use because they are available as either .txt or .json transcripts. The .json files contain the text, speaker IDs, timestamps, and confidence scores for each word.
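As an illustration of working with the .json output, here is a minimal sketch that groups words by speaker and flags low-confidence words. The field names used (`monologues`, `speaker`, `elements`, `type`, `value`, `ts`, `end_ts`, `confidence`) are our assumption about the transcript layout, and the sample transcript is invented for the example; check the Rev AI API documentation for the exact schema before relying on it.

```python
import json

# Hypothetical sample transcript; the layout is an assumed schema, not real output.
SAMPLE = """
{
  "monologues": [
    {"speaker": 0, "elements": [
      {"type": "text", "value": "Hello", "ts": 0.5, "end_ts": 0.9, "confidence": 0.98},
      {"type": "punct", "value": " "},
      {"type": "text", "value": "there", "ts": 0.9, "end_ts": 1.3, "confidence": 0.61},
      {"type": "punct", "value": "."}
    ]},
    {"speaker": 1, "elements": [
      {"type": "text", "value": "Hi", "ts": 1.6, "end_ts": 1.8, "confidence": 0.95},
      {"type": "punct", "value": "."}
    ]}
  ]
}
"""


def speaker_turns(transcript: dict) -> list[tuple[int, str]]:
    """Return (speaker_id, text) pairs, one per monologue, in order."""
    turns = []
    for monologue in transcript["monologues"]:
        # Words and punctuation are both stored as elements with a "value".
        text = "".join(element["value"] for element in monologue["elements"])
        turns.append((monologue["speaker"], text))
    return turns


def low_confidence_words(transcript: dict, threshold: float = 0.8):
    """Return (word, start_time, confidence) for words scored below threshold."""
    flagged = []
    for monologue in transcript["monologues"]:
        for element in monologue["elements"]:
            # Only word elements are assumed to carry timestamps and confidence.
            if element["type"] == "text" and element["confidence"] < threshold:
                flagged.append((element["value"], element["ts"], element["confidence"]))
    return flagged


transcript = json.loads(SAMPLE)
for speaker, text in speaker_turns(transcript):
    print(f"Speaker {speaker}: {text}")
```

Flagging words below a confidence threshold is one common way to decide which parts of a transcript deserve a human review pass; the 0.8 cutoff here is arbitrary.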
Price Comparison: Google Speech Recognition API vs. Rev AI API
Rev AI charges $0.035 per minute (rounded up to the nearest 15-second increment) for our ASR service base plan. For high-volume users, Rev AI additionally has an enterprise plan that starts at $1.20 per hour ($0.02 per minute) and goes lower in price as volume goes up.
Google has two tiers of ASR service: a standard model and an enhanced video model that is more accurate. The Google video model, which was used in the above accuracy comparisons, is $0.036 per minute (rounded up to the nearest 15-second increment). The standard model is $0.024 per minute, also charged in 15-second increments. Google does offer a discount if you choose to opt in to data logging.
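To see how the 15-second rounding affects a bill, here is a small sketch that applies the per-minute rates quoted above. The function and dictionary names are ours, and the code assumes rounding is applied once per file; neither provider's billing documentation was consulted for that detail, and discounts and enterprise pricing are ignored.

```python
import math
from decimal import Decimal

INCREMENT_SECONDS = 15  # both services bill in 15-second increments

# Per-minute rates as quoted in this article; they may have changed since.
RATES_PER_MINUTE = {
    "rev_ai_base": Decimal("0.035"),
    "google_video": Decimal("0.036"),
    "google_standard": Decimal("0.024"),
}


def billable_seconds(duration_seconds: float) -> int:
    """Round a duration up to the next 15-second increment."""
    return math.ceil(duration_seconds / INCREMENT_SECONDS) * INCREMENT_SECONDS


def transcription_cost(duration_seconds: float, rate_per_minute: Decimal) -> Decimal:
    """Cost in dollars for one file, assuming rounding is applied per file."""
    minutes = Decimal(billable_seconds(duration_seconds)) / 60
    return minutes * rate_per_minute


for name, rate in RATES_PER_MINUTE.items():
    print(name, transcription_cost(100, rate))
```

A 100-second file is billed as 105 seconds (1.75 minutes), which comes to $0.06125 on the Rev AI base plan and $0.063 on Google's video model. `Decimal` is used so that the small per-minute rates are not distorted by binary floating-point rounding.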
Note: Prices for these services change frequently. The prices above were accurate when this article was written. For up-to-date prices, check the Rev AI pricing page and the Google Speech Recognition pricing page.
How Do You Choose a Transcription API?
Google Speech Recognition API and Rev AI API both offer excellent ASR solutions. Google’s API offers impressively robust language coverage. Additionally, it integrates well with other Google offerings. That’s great for applications that are already immersed in the Google ecosystem.
Rev AI’s solution offers better accuracy. This is particularly true for media files that need speaker identification and diarization. Rev AI is also easier to set up and use for standalone applications and has faster turnaround speeds for long media files. But don’t take our word for it. Try out the Rev AI API for free today, and use Rev’s free word error rate calculator and speech recognition benchmarking tools to run these tests yourself.