Add Automatic Language Identification to your ASR Applications

Written by: Vikram Vaswani
April 4, 2022

Rev AI’s Asynchronous Speech-to-Text API makes it easy to transcribe audio even if it’s not in English – simply specify the language code when requesting transcription. But what if you don’t recognize the language in the first place?

Today, we’re happy to announce Rev AI’s new Language Identification API, which automatically identifies the most probable language used in an audio file. This API offers developers a fast, automated, and accurate solution to the problem of language recognition in spoken audio. It accepts and analyzes an input audio file and returns a list of possible languages, ranked by confidence.

Key Feature: No Language Inputs Required

A unique feature of our Language Identification API is that it performs language identification without requiring a list of possible language codes upfront. This feature eliminates the need to first acquire and validate information on language possibilities, reducing work (and code dependencies) for developers and helping them build out ASR applications faster.

Use Cases

Language identification enables a variety of applications and use cases, such as:

  • Automated language identification for any spoken audio, including voice notes, speeches, conference discussions and more
  • Automated classification of digital media libraries by language
  • Automated labeling or verification of the main spoken language in audio files
  • Better analysis for enterprise decision-making, such as assigning call center employees based on languages used by customers

Get Started

To try out the new API, follow the steps below:

1. Obtain a Rev AI API access token. If you don’t already have one, sign up for a free Rev AI account and generate an access token.

2. Submit an audio file for language identification using the command below.

curl -X POST "https://api.rev.ai/languageid/v1beta/jobs" \
-H "Authorization: Bearer <REVAI_ACCESS_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"media_url": "https://www.rev.ai/FTC_Sample_1.mp3"}'

Your request must contain an Authorization header containing your API access token and a media_url parameter with a link to the audio file. The command above uses an example Rev AI audio file, but you can replace this with your own.
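For readers who prefer Python to curl, the same request can be sketched with the standard library. This is a minimal sketch rather than an official client: the function names `build_submit_request` and `submit_job` are hypothetical, and the code assumes the response body is a JSON object carrying the job's `id` field.

```python
import json
import urllib.request

LANGUAGE_ID_JOBS_URL = "https://api.rev.ai/languageid/v1beta/jobs"


def build_submit_request(access_token: str, media_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a language ID job."""
    body = json.dumps({"media_url": media_url}).encode("utf-8")
    return urllib.request.Request(
        LANGUAGE_ID_JOBS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def submit_job(access_token: str, media_url: str) -> str:
    """Send the request and return the job identifier.

    Assumption: the response is a JSON object with an "id" field.
    """
    request = build_submit_request(access_token, media_url)
    with urllib.request.urlopen(request) as response:
        job = json.load(response)
    return job["id"]
```

Calling `submit_job("<REVAI_ACCESS_TOKEN>", "https://www.rev.ai/FTC_Sample_1.mp3")` performs the same submission as the curl command. Keeping request construction separate from sending makes the headers and body easy to inspect without network access.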

3. The API response will contain a job identifier (id field). Copy this to your clipboard or note it, as it will be needed for the next step.

4. Language identification is normally completed within a minute, although this can vary. Wait 60 seconds and then make a second request to obtain the results, as below. Replace the <ID> placeholder with the id obtained in the previous step.

curl -X GET "https://api.rev.ai/languageid/v1beta/jobs/<ID>/result" \
-H "Authorization: Bearer <REVAI_ACCESS_TOKEN>" \
-H "Accept: application/vnd.rev.languageid.v1.0+json"
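Because processing time can vary, a script may want to retry the result request instead of waiting a fixed 60 seconds. The sketch below is one way to do that in Python. The helper names are hypothetical, and it assumes that a result request made before the job has finished fails with an HTTP error status; that behavior is not stated in this post, so check it against the API documentation.

```python
import json
import time
import urllib.error
import urllib.request


def build_result_request(access_token: str, job_id: str) -> urllib.request.Request:
    """Build the GET request for the result of a language ID job."""
    return urllib.request.Request(
        f"https://api.rev.ai/languageid/v1beta/jobs/{job_id}/result",
        method="GET",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/vnd.rev.languageid.v1.0+json",
        },
    )


def fetch_result(access_token: str, job_id: str) -> dict:
    """Send the result request and decode the JSON response."""
    with urllib.request.urlopen(build_result_request(access_token, job_id)) as response:
        return json.load(response)


def wait_for_result(fetch, attempts: int = 5, delay_seconds: float = 60.0, sleep=time.sleep) -> dict:
    """Wait, then call fetch(); retry on HTTP errors up to `attempts` times.

    Assumption: an unfinished job makes the result request raise HTTPError.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        sleep(delay_seconds)
        try:
            return fetch()
        except urllib.error.HTTPError:
            if attempt == attempts - 1:
                raise  # give up and surface the last error
```

A typical call would be `wait_for_result(lambda: fetch_result(token, job_id))`. The `sleep` parameter exists so the wait can be replaced in tests.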

A sample response is shown below:

{
 "top_language": "en",
 "language_confidences": [
   {
     "language": "en",
     "confidence": 0.907
   },
   {
     "language": "nl",
     "confidence": 0.023
   },
   {
     "language": "ar",
     "confidence": 0.023
   },
   {
     "language": "de",
     "confidence": 0.023
   },
   {
     "language": "cmn",
     "confidence": 0.023
   }
 ]
} 

Each confidence score in a result indicates how confident our model is that the given language is the one spoken in the submitted audio. The score is always a numeric value in the range [0, 1]. The sample response above shows that the audio has a 90.7% likelihood of being in English, and a 2.3% likelihood each of being in Dutch, Arabic, German, or Mandarin.
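An application will usually want to act on the result only when the model is reasonably sure. The sketch below reads a response shaped like the sample above and returns the top language only if its confidence clears a threshold. The function name `pick_language` and the default threshold of 0.5 are illustrative assumptions, not values recommended by the API.

```python
import json

# Shortened copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "top_language": "en",
  "language_confidences": [
    {"language": "en", "confidence": 0.907},
    {"language": "nl", "confidence": 0.023},
    {"language": "ar", "confidence": 0.023}
  ]
}
"""


def pick_language(result: dict, min_confidence: float = 0.5):
    """Return the top language code, or None if its confidence is too low."""
    confidences = {
        entry["language"]: entry["confidence"]
        for entry in result["language_confidences"]
    }
    top = result["top_language"]
    if confidences.get(top, 0.0) >= min_confidence:
        return top
    return None


result = json.loads(SAMPLE_RESPONSE)
print(pick_language(result))                       # en
print(pick_language(result, min_confidence=0.95))  # None
```

Returning `None` for low-confidence results lets the caller fall back to manual review or a default language instead of transcribing with a doubtful code.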

Additional Notes

For developers, a few other points to note:

  • The API currently recognizes 22 languages. The language codes generated can be passed to our Asynchronous Speech-to-Text API for transcription.
  • The API performs best on files that contain a single language and speech without heavy accents.
  • Clients must authenticate by including their Rev AI access token in the Authorization header of their requests. If the access token is invalid or the header is not present, a 401 error code will be returned.

The Language Identification API is currently in open beta. API endpoints, requests, and response models may change in the future. Always refer to the Language Identification API documentation for the most up-to-date information.
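As noted in the list above, the detected language code can be handed to the Asynchronous Speech-to-Text API. The sketch below builds such a follow-up request in Python. The endpoint URL and the `language` field name are assumptions that do not appear in this post, and `build_transcription_request` is a hypothetical helper, so verify all three against the Asynchronous Speech-to-Text API documentation before relying on them.

```python
import json
import urllib.request

# Assumption: job submission URL of the Asynchronous Speech-to-Text API.
SPEECH_TO_TEXT_JOBS_URL = "https://api.rev.ai/speechtotext/v1/jobs"


def build_transcription_request(
    access_token: str, media_url: str, language_id_result: dict
) -> urllib.request.Request:
    """Build (but do not send) a transcription request using the detected language."""
    body = json.dumps(
        {
            "media_url": media_url,
            # Assumption: the language code is passed in a "language" field.
            "language": language_id_result["top_language"],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        SPEECH_TO_TEXT_JOBS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

The returned request can be sent with `urllib.request.urlopen`, in the same way as any other authenticated call to the API.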

The Language Identification API is priced at $0.003 per minute, rounded up to the nearest second (15 seconds minimum), for our pay-as-you-go customers. Enterprise pricing is also available – contact your Rev Account Manager for more information. You can send us your feedback or questions by emailing us at support@rev.ai.
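To make the pay-as-you-go pricing concrete, here is a small cost estimator. It reflects one reading of the rule above: the duration is rounded up to a whole second, a 15-second minimum applies, and each second costs 1/60 of the per-minute price. The function name `estimate_cost` is hypothetical, and actual billing may differ from this sketch.

```python
import math
from decimal import Decimal

PRICE_PER_MINUTE = Decimal("0.003")  # USD, from the pricing paragraph above
MINIMUM_BILLABLE_SECONDS = 15


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate the cost in USD of identifying the language of one file.

    Assumption: billing is per second, rounded up, with a 15-second minimum.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billable_seconds = max(MINIMUM_BILLABLE_SECONDS, math.ceil(duration_seconds))
    return PRICE_PER_MINUTE * billable_seconds / 60


print(estimate_cost(10))    # 0.00075 (billed as 15 seconds)
print(estimate_cost(90.2))  # 0.00455 (billed as 91 seconds)
```

`Decimal` is used instead of floating point so that small per-second amounts add up exactly when totaled over many files.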
