Language Models

Language models are our core transcription engines, delivering industry-leading accuracy across an unmatched 200+ languages.

We provide three specialized categories of language models to meet diverse transcription needs:

  1. Zero Indic - Optimised performance for Indic languages
  2. Zero Codeswitch - Designed for mixed-language speech patterns
  3. Zero Universal - Support for 200+ languages worldwide

Zero Indic Models 

Specialized models fine-tuned for Indian languages, offering superior accuracy for regional speech patterns and accents.

To use this category of models, set "model": "zero-indic" and pass the appropriate language code.

For example, to transcribe audio in Hindi using Zero Indic Hindi:

data = {
    "model": "zero-indic",
    "language_code": "hi"
}

Supported Languages

Languages currently supported by Zero Indic models:

Language    "model"       "language_code"
Hindi       zero-indic    hi
Telugu      zero-indic    te
Kannada     zero-indic    kn
Bengali     zero-indic    bn

Support for more languages is coming soon.
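
Because Zero Indic accepts only the language codes listed above, it can help to check the code before building the request parameters. The following is a minimal sketch, not part of any official SDK: the helper name build_zero_indic_payload and the ZERO_INDIC_LANGUAGES mapping are hypothetical, and the codes are copied from the table above.

```python
# Language codes copied from the Zero Indic table on this page.
ZERO_INDIC_LANGUAGES = {
    "hi": "Hindi",
    "te": "Telugu",
    "kn": "Kannada",
    "bn": "Bengali",
}


def build_zero_indic_payload(language_code: str) -> dict:
    """Return request parameters for a Zero Indic transcription.

    Hypothetical helper: raises ValueError for codes not in the table.
    """
    if language_code not in ZERO_INDIC_LANGUAGES:
        supported = ", ".join(sorted(ZERO_INDIC_LANGUAGES))
        raise ValueError(
            f"Unsupported Zero Indic language code {language_code!r}; "
            f"expected one of: {supported}"
        )
    return {"model": "zero-indic", "language_code": language_code}


# Build the parameters for Telugu audio.
data = build_zero_indic_payload("te")
print(data)
```

Failing early on an unlisted code keeps the mistake on the client side instead of surfacing later as a rejected request.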

Zero Codeswitch Models

Industry-leading code-switch models designed by Shunya Labs to handle multilingual speech, generating accurate output across multiple languages within a single conversation.

To use this category of models, set "model": "zero-codeswitch" and pass the appropriate language code.

Currently, the Hinglish (mixed Hindi and English) model is available. To transcribe audio in Hinglish:

data = {
    "model": "zero-codeswitch"
    "language_code": "hi-en"
}
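
A model name and a language code are easy to mismatch, for example by sending "hi" with the code-switch model. The sketch below checks an existing parameter dict against the model and language pairs documented on this page so far. The names SUPPORTED_PAIRS and check_model_language are hypothetical and used only for illustration.

```python
# Model/language pairs documented on this page so far.
SUPPORTED_PAIRS = {
    "zero-indic": {"hi", "te", "kn", "bn"},
    "zero-codeswitch": {"hi-en"},
}


def check_model_language(data: dict) -> None:
    """Raise ValueError if the payload pairs a model with an unlisted code."""
    model = data.get("model")
    code = data.get("language_code")
    if model not in SUPPORTED_PAIRS:
        raise ValueError(f"Unknown model: {model!r}")
    if code not in SUPPORTED_PAIRS[model]:
        raise ValueError(f"{model!r} does not list language code {code!r}")


# A valid Hinglish payload passes without raising.
check_model_language({"model": "zero-codeswitch", "language_code": "hi-en"})
```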

Zero Universal

A universal speech-to-text model supporting 200+ languages, providing broad multilingual transcription capabilities across diverse linguistic and acoustic environments.

You can auto-detect the language of your audio by setting "language_code": "auto":

data = {
    "language_code": "auto"
}

For optimal accuracy, specify the language of your audio input from our list of supported languages.

For example, for audio in English:

data = {
    "language_code": "en"
}
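
The two cases above can be folded into one helper that uses an explicit language code when it is known and falls back to auto-detection otherwise. This is a sketch under stated assumptions: build_universal_payload is a hypothetical name, and no "model" key is set because the examples on this page do not show one for Zero Universal.

```python
from typing import Optional


def build_universal_payload(language_code: Optional[str] = None) -> dict:
    """Return request parameters for Zero Universal.

    Hypothetical helper: uses "auto" when no language code is given
    or when the given code is empty or only whitespace.
    """
    code = (language_code or "").strip()
    return {"language_code": code or "auto"}


# Language known in advance: pass it explicitly.
known = build_universal_payload("en")

# Language unknown: fall back to auto-detection.
unknown = build_universal_payload()
```

Defaulting to "auto" only when no code is supplied follows the recommendation above to specify the language whenever it is known.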