Announcing Real-Time Transcription and Captioning With Our Streaming API

Today, we’re launching real-time audio transcription with our Streaming API. This new API service, powered by Rev’s proprietary speech engine, allows developers to easily build real-time speech recognition directly into their own applications and services.

This opens up a lot of opportunities for developers and enterprise businesses looking to implement ASR technology. With our Streaming API, you’ll get access to a feature-rich platform that provides not only real-time speech recognition but also automatic punctuation, capitalization, and timestamp generation.

Features of Rev AI’s Streaming API

Our Streaming API makes it easy to connect and send audio to the Rev AI speech engine in real time during a live streaming session.

Real-time Speech Recognition

Our automatic speech recognition (ASR) converts spoken word into text with best-in-class accuracy, now with the capability to transcribe in real-time for streaming and other live applications.

Punctuation & Capitalization

Even in real-time, you’ll get instant punctuation & capitalization in your transcription. Our ASR automatically punctuates (commas, question marks, periods, etc.) and capitalizes for an easy-to-read transcript.

Timestamp Generation

Timestamps are generated in real time, and you’ll receive one for each word.
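
As a rough illustration of what per-word timestamps make possible, the sketch below groups timed words into short caption lines. The `(text, start, end)` tuple shape, the `group_captions` name, and the 3-second limit are assumptions made for this example; they are not the Streaming API's actual response format.

```python
from typing import List, Tuple

# Hypothetical word shape for this sketch: (text, start_seconds, end_seconds).
Word = Tuple[str, float, float]
Caption = Tuple[float, float, str]


def group_captions(words: List[Word], max_seconds: float = 3.0) -> List[Caption]:
    """Group timed words into captions that each span at most max_seconds."""
    captions: List[Caption] = []
    current: List[Word] = []
    for word in words:
        # Close the current caption if adding this word would make it too long.
        if current and word[2] - current[0][1] > max_seconds:
            text = " ".join(w[0] for w in current)
            captions.append((current[0][1], current[-1][2], text))
            current = []
        current.append(word)
    if current:
        text = " ".join(w[0] for w in current)
        captions.append((current[0][1], current[-1][2], text))
    return captions


if __name__ == "__main__":
    sample = [
        ("Hello", 0.0, 0.4),
        ("world.", 0.5, 0.9),
        ("This", 3.2, 3.5),
        ("is", 3.6, 3.7),
        ("live.", 3.8, 4.2),
    ]
    for start, end, text in group_captions(sample):
        print(f"{start:.1f}-{end:.1f}: {text}")
```

Grouping by elapsed time keeps each caption on screen for a predictable duration; a real captioning pipeline might also break on punctuation or character count.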

“We developed real-time audio transcription to meet market demand beyond the asynchronous market, and provide customers—from podcasters to call centers—with deeper speech-to-text capabilities.”

Rev AI General Manager, Jay Lee

Integrating Rev AI’s Streaming API

Our easy-to-use API is designed by developers for developers. We provide you with SDKs, comprehensive API documentation, example code and expert support so you can get started in minutes. All you need to generate your first transcript is an API token.

All connections to Rev AI’s Streaming API start as a WebSocket handshake HTTP request to the streaming endpoint. On successful authorization, the client can start sending binary WebSocket messages containing audio data in one of our supported formats. As speech is detected, Rev AI returns hypotheses of the recognized speech content.
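
The client side of this flow can be sketched roughly as follows: split the audio into chunks and send each one as a binary message. This is a minimal sketch, not Rev AI's SDK. The `send_binary` callable stands in for whatever binary-send method your WebSocket client provides, and the function names and 8000-byte chunk size are assumptions chosen for illustration.

```python
from typing import Callable, Iterator


def audio_chunks(audio: bytes, chunk_size: int = 8000) -> Iterator[bytes]:
    """Yield consecutive slices of audio, one per binary WebSocket message."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def stream_audio(
    audio: bytes,
    send_binary: Callable[[bytes], None],
    chunk_size: int = 8000,
) -> int:
    """Send every chunk through send_binary and return the message count.

    send_binary is a placeholder (an assumption of this sketch) for the
    binary-send method of an already-authorized WebSocket connection.
    """
    sent = 0
    for chunk in audio_chunks(audio, chunk_size):
        send_binary(chunk)
        sent += 1
    return sent


if __name__ == "__main__":
    # Stand-in sender that just records what would go over the wire.
    outbox = []
    count = stream_audio(b"\x00" * 20000, outbox.append)
    print(count, [len(chunk) for chunk in outbox])
```

Passing the sender in as a callable keeps the chunking logic independent of any particular WebSocket library, so it can be exercised without a network connection.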

You can see code examples for our streaming generator, microphone streaming, and streaming local files in our Streaming API docs examples.

Real-Time Streaming Transcription Accuracy

The live captioning and transcription available through Rev AI’s Streaming API deliver accuracy that outperforms other speech recognition services on the market.

Rev AI is best-in-class for real-time speech-to-text accuracy.

“Having accurate transcripts and captions is a major factor in our business, and Rev AI is by far the most reliable provider in this space.”

streamGo’s Production Director, Richard Lee

The real-time transcription offered by Rev AI can handle more complex vocabulary because its model is trained on a data set that includes millions of hours of human-transcribed audio content.

In fact, our benchmarking tests show that Rev’s ASR, powered by the same engine as Rev AI, has the lowest Word Error Rate (WER) among the services we compared.

  • 13.9% WER – Rev
  • 15.1% WER – Google Speech-To-Text
  • 18.0% WER – Amazon Transcribe
  • 18.0% WER – Microsoft Azure Speech-to-Text
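
For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. It illustrates the general definition only; the benchmark methodology behind the figures above (text normalization, test set) is not described here, and the function name is our own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (fox -> box) and one deletion (jumps) over 5 words.
    print(word_error_rate("the quick brown fox jumps", "the quick brown box"))
```

This sketch does no normalization, so differences in case or punctuation count as errors; benchmarks typically normalize both transcripts first.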

Applications of Real-Time Speech Recognition Technology

With Rev’s Streaming API, real-time captioning and transcription can open up new possibilities for your business. Consider the new API for the following use cases:

  • Livestreaming videos, webinars, and podcasts
  • Live events, conferences, keynotes, investor calls, annual meetings, speeches, and announcements
  • Phone calls, meetings, and training sessions

“Rev AI is the first and only provider that allows us to meet customer expectations. With automated live captions, we’ll be able to fulfill accessibility requirements for webcasts without the premium cost of live human captioners.”

streamGo’s Production Director, Richard Lee

Beyond converting speech-to-text as captions or in a transcript, you might also consider more advanced applications for your business:

  • Meeting communication access requirements and complying with accessibility laws for all your live content.
  • Analyzing dialogue from customer phone support calls in real-time for quality assurance monitoring.
  • Performing advanced analytics to derive actionable insights from speech in live content.
  • Adding hands-free voice commands or voice typing functions to your own applications, software, or business tools.

Get Real-Time Transcription with Streaming API

With Rev AI, you’ll get accurate speech recognition in real-time starting at $0.035 a minute with no hidden fees and no up-front commitments. There are no usage limits, the API handles all media types, and your first 5 hours are completely free. Plus, there’s exceptional email, chat, and phone support along with our comprehensive API documentation.
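
To get a feel for what that pricing means in practice, here is a back-of-the-envelope estimate. It assumes the free 5 hours are used up first and that every remaining minute is billed at the quoted starting rate with no rounding; actual billing rules may differ, and the function name is our own.

```python
from decimal import Decimal

RATE_PER_MINUTE = Decimal("0.035")  # quoted "starting at" rate, in USD
FREE_MINUTES = Decimal(5 * 60)      # the first 5 hours are free


def estimated_cost(total_minutes) -> Decimal:
    """Estimate the USD cost of total_minutes of audio (sketch assumptions above)."""
    minutes = Decimal(str(total_minutes))
    if minutes < 0:
        raise ValueError("total_minutes must not be negative")
    # Assumption: free minutes are consumed before any billing starts.
    billable = max(minutes - FREE_MINUTES, Decimal(0))
    return billable * RATE_PER_MINUTE


if __name__ == "__main__":
    for minutes in (120, 300, 1000):
        print(minutes, "minutes ->", estimated_cost(minutes), "USD")
```

`Decimal` is used instead of floats so the currency arithmetic stays exact.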

Try Rev AI for free and start real-time transcription with Rev AI’s Streaming API today.

Real-Time Streaming for Enterprise

We welcome partners that integrate our ASR engine, whether streaming or pre-recorded/batch, to drive business results by making audio data fully accessible and searchable. We can support any of your voice-to-text needs, from human transcription to ASR, in any combination.

To discuss partnering with us, please email us at
