Add Automatic Language Identification to your ASR Applications

Vikram Vaswani

Apr 5, 2022

Rev AI’s Asynchronous Speech-to-Text API makes it easy to transcribe audio even if it’s not in English – simply specify the language code when requesting transcription. But what if you don’t recognize the language in the first place?

Today, we’re happy to announce Rev AI’s new Language Identification API, which automatically identifies the most probable language used in an audio file. This API offers developers a fast, automated, and accurate solution to the problem of language recognition in spoken audio. It accepts and analyzes an input audio file and returns a list of possible languages, ranked by confidence.

Key Feature: No Language Inputs Required

A unique feature of our Language Identification API is that it performs language identification without requiring a list of possible language codes upfront. This feature eliminates the need to first acquire and validate information on language possibilities, reducing work (and code dependencies) for developers and helping them build out ASR applications faster.

Use Cases

Language identification enables a variety of applications and use cases, such as:

Automated language identification for any spoken audio, including voice notes, speeches, conference discussions and more
Automated classification of digital media libraries by language
Automated labeling or verification of the main spoken language in audio files
Better analysis for enterprise decision-making, such as assigning call center employees based on languages used by customers

Get Started

To try out the new API, follow the steps below:

1. Obtain a Rev AI API access token. If you don’t already have one, sign up for a free Rev AI account and generate an access token.

2. Submit an audio file for language identification using the command below.
curl -X POST "https://api.rev.ai/languageid/v1beta/jobs" \
  -H "Authorization: Bearer <REVAI_ACCESS_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"media_url": "https://www.rev.ai/FTC_Sample_1.mp3"}'
Your request must contain an Authorization header containing your API access token and a media_url parameter with a link to the audio file. The command above uses an example Rev AI audio file, but you can replace this with your own.

3. The API response will contain a job identifier (id field). Copy this to your clipboard or note it, as it will be needed for the next step.

4. Language identification normally completes within a minute, although this can vary. Wait 60 seconds, then make a second request to obtain the results, as shown below. Replace the <ID> placeholder with the id obtained in the previous step.
curl -X GET "https://api.rev.ai/languageid/v1beta/jobs/<ID>/result" \
  -H "Authorization: Bearer <REVAI_ACCESS_TOKEN>" \
  -H "Accept: application/vnd.rev.languageid.v1.0+json"
A sample response is shown below:

{
 "top_language": "en",
 "language_confidences": [
   {
     "language": "en",
     "confidence": 0.907
   },
   {
     "language": "nl",
     "confidence": 0.023
   },
   {
     "language": "ar",
     "confidence": 0.023
   },
   {
     "language": "de",
     "confidence": 0.023
   },
   {
     "language": "cmn",
     "confidence": 0.023
   }
 ]
}
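
For readers who prefer a script to curl, steps 2 to 4 could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library, not an official client: the endpoints and headers are copied from the curl commands above, while the function names (build_submit_request, build_result_request, send_json, identify_language) are hypothetical helpers introduced for illustration. It follows the fixed 60-second wait described in step 4 and does not retry if the job is still running.

```python
import json
import time
import urllib.request

# Base URL taken from the curl commands in this article.
API_BASE = "https://api.rev.ai/languageid/v1beta"


def build_submit_request(access_token, media_url):
    """Build (without sending) the POST request that creates a job."""
    body = json.dumps({"media_url": media_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def build_result_request(access_token, job_id):
    """Build (without sending) the GET request that fetches a job's result."""
    return urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}/result",
        method="GET",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/vnd.rev.languageid.v1.0+json",
        },
    )


def send_json(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def identify_language(access_token, media_url, wait_seconds=60,
                      send=send_json, sleep=time.sleep):
    """Submit a job, wait, then return the decoded result.

    `send` and `sleep` are injectable so the flow can be exercised
    without network access; they are illustrative, not part of the API.
    """
    job = send(build_submit_request(access_token, media_url))
    sleep(wait_seconds)  # the article suggests waiting about 60 seconds
    return send(build_result_request(access_token, job["id"]))
```

Calling identify_language("<REVAI_ACCESS_TOKEN>", "https://www.rev.ai/FTC_Sample_1.mp3") would make the two real requests and return a dictionary shaped like the sample response above. An invalid token would surface as a urllib.error.HTTPError raised by urlopen.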

The confidence scores in a result represent how confident our model is that the given language is the one spoken in the submitted audio. Each score is a numeric value in the range [0, 1]. The sample response above shows that the audio has a 90.7% likelihood of being in English, and a 2.3% likelihood each of being in Dutch, Arabic, German, or Mandarin.
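
An application will often want to act on the result only when the model is reasonably sure. The sketch below shows one way that might look in Python, using the sample response above as input. The pick_language helper and its 0.5 threshold are assumptions made for illustration; the API itself does not prescribe a cutoff.

```python
import json

# The sample response shown above, embedded so the example is self-contained.
SAMPLE_RESULT = json.loads("""
{
  "top_language": "en",
  "language_confidences": [
    {"language": "en", "confidence": 0.907},
    {"language": "nl", "confidence": 0.023},
    {"language": "ar", "confidence": 0.023},
    {"language": "de", "confidence": 0.023},
    {"language": "cmn", "confidence": 0.023}
  ]
}
""")


def pick_language(result, min_confidence=0.5):
    """Return the most confident language code, or None if it falls
    below min_confidence (an illustrative threshold, not an API rule)."""
    ranked = sorted(
        result["language_confidences"],
        key=lambda entry: entry["confidence"],
        reverse=True,
    )
    if not ranked:
        return None
    best = ranked[0]
    if best["confidence"] >= min_confidence:
        return best["language"]
    return None


print(pick_language(SAMPLE_RESULT))        # en
print(pick_language(SAMPLE_RESULT, 0.95))  # None
```

Returning None for low-confidence results leaves room for a fallback, such as asking a person to confirm the language before transcription.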

Additional Notes

For developers, a few other points to note:

The API currently recognizes 22 languages. The language codes generated can be passed to our Asynchronous Speech-to-Text API for transcription.
The API performs best on files that contain a single language spoken without a heavy accent.
Clients must authenticate by including their Rev AI access token as a Bearer token in the Authorization header of each request, as in the commands above. If the access token is invalid or the header is missing, a 401 error code will be returned.
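
As an illustration of the first note, the detected language code could be forwarded to a transcription request along these lines. This is a sketch under stated assumptions: the transcription endpoint URL and the language field are taken from the Asynchronous Speech-to-Text API rather than from this article, so check them against the current documentation, and build_transcription_request is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: job-submission endpoint of the Asynchronous Speech-to-Text API.
# It is not stated in this article; verify it in the API documentation.
TRANSCRIPTION_JOBS_URL = "https://api.rev.ai/speechtotext/v1/jobs"


def build_transcription_request(access_token, media_url, language_id_result):
    """Build (without sending) a transcription job request that reuses
    the top_language value from a language identification result."""
    language = language_id_result["top_language"]
    body = json.dumps({"media_url": media_url, "language": language})
    return urllib.request.Request(
        TRANSCRIPTION_JOBS_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # Same Bearer-token header used in the curl commands above.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

The request can then be sent with urllib.request.urlopen; an invalid or missing token would raise urllib.error.HTTPError with status code 401.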

The Language Identification API is currently in open beta. API endpoints and request and response models may change in the future. Always refer to the Language Identification API documentation for the most up-to-date information.

For pay-as-you-go customers, the Language Identification API is priced at $0.003 per minute, with audio duration rounded up to the nearest second and a 15-second minimum. Enterprise pricing is also available; contact your Rev Account Manager for more information. You can send us your feedback or questions by emailing us at support@rev.ai.
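
To make the pay-as-you-go pricing concrete, here is a small worked sketch in Python. It assumes one reading of the rule above: the audio duration is rounded up to a whole second, clips shorter than 15 seconds are billed as 15 seconds, and the per-minute rate is prorated per second. The estimate_cost function is hypothetical and is not a billing tool.

```python
import math
from decimal import Decimal

PRICE_PER_MINUTE = Decimal("0.003")  # USD, from the pricing paragraph above
MINIMUM_SECONDS = 15                 # minimum billable duration


def estimate_cost(duration_seconds):
    """Estimate the charge in USD for one file, assuming the duration is
    rounded up to a whole second and prorated from the per-minute rate."""
    billable_seconds = max(math.ceil(duration_seconds), MINIMUM_SECONDS)
    return PRICE_PER_MINUTE * billable_seconds / 60
```

Under these assumptions, a 4-second clip is billed as 15 seconds ($0.00075), a 60-second file costs $0.003, and a 90.2-second file is billed as 91 seconds ($0.00455).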
