Zero STT Med - Batch Transcription Documentation

Medical-grade Speech-to-Text powered by ShunyaLabs

Zero Med is our domain-specific speech recognition model optimized for medical transcription, offering superior accuracy for medical terminology, procedures, and clinical documentation.

Prerequisites

  • Python 3.8 or higher
  • Valid API key (contact [email protected] to get your unique API key)
  • Audio files in supported formats (WAV, MP3, M4A, FLAC, OGG, AAC, WMA, MP4, MKV, MOV, AVI, WebM)

Installation:

pip install requests

Input - REST API

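import requests

# Placeholders: replace with your own API key and audio file path.
API_KEY = "<YOUR_API_KEY>"
AUDIO_PATH = "your_audio_file.wav"

url = "https://tb.shunyalabs.ai/transcribe"
headers = {"X-API-Key": API_KEY}

# Open the audio in binary mode so it is uploaded as multipart form data.
with open(AUDIO_PATH, "rb") as audio_file:
    files = {"file": audio_file}
    data = {
        "language_code": "med-en",     # selects the Zero Med model
        "enable_diarization": "true",  # optional: adds speaker labels
    }
    response = requests.post(url, headers=headers, files=files, data=data)

# Raise an exception on HTTP error statuses before parsing the body.
response.raise_for_status()
result = response.json()

print(result["text"])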

Input - cURL

curl -X POST "https://tb.shunyalabs.ai/transcribe" \
  -H "X-API-Key: <YOUR_API_KEY>" \
  -F "file=@your_audio_file.wav" \
  -F "language_code=med-en" \
  -F "enable_diarization=true"

File Size Limits

Maximum file size: 30 MB

For files larger than 30 MB, split the audio into smaller segments before processing.
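
A local size check before uploading can avoid a rejected request. The following is a minimal sketch, not part of the official client: the helper name `fits_size_limit` is hypothetical, and it assumes "30 MB" means 30,000,000 bytes (the more conservative reading, since the documentation does not say whether the limit is decimal or binary).

```python
import os

# Assumption: the 30 MB limit is treated as 30 * 1000 * 1000 bytes.
# The documentation does not state the exact byte count.
MAX_UPLOAD_BYTES = 30 * 1000 * 1000


def fits_size_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

Calling `fits_size_limit("your_audio_file.wav")` before the request lets you decide whether the file needs to be split first.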

Parameters

Parameter             Value                     Description
--audio-file          <your_audio_file>         Path to your audio file
--language-code       med-en                    Use med-en for Zero Med model
--api-key             <YOUR_API_KEY>            Your authentication key
--api-url             https://tb.shunyalabs.ai  API endpoint
--enable-diarization  True (optional)           Speaker identification
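
The flags above read like options to a command-line wrapper, which is not shown in this document. As a rough sketch only, a parser that accepts the same flags could be built with `argparse` as follows. The function name `build_parser`, the default values, and the choice to treat `--enable-diarization` as an on/off switch are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags listed in the Parameters table (hypothetical CLI)."""
    parser = argparse.ArgumentParser(description="Transcribe an audio file with Zero STT Med")
    parser.add_argument("--audio-file", required=True, help="Path to your audio file")
    # Assumption: med-en is used as the default because it selects the Zero Med model.
    parser.add_argument("--language-code", default="med-en", help="Language/model code")
    parser.add_argument("--api-key", required=True, help="Your authentication key")
    parser.add_argument("--api-url", default="https://tb.shunyalabs.ai", help="API endpoint")
    # Assumption: diarization is modelled as a boolean switch that is off by default.
    parser.add_argument("--enable-diarization", action="store_true", help="Speaker identification")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--audio-file", "your_audio_file.wav", "--api-key", "<YOUR_API_KEY>"])
print(args.audio_file, args.language_code, args.enable_diarization)
```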

Output

The API returns a JSON response with the transcription:

Note: segments and unique_speakers appear only when the enable_diarization parameter is set to true.

{
  "success": true,
  "text": "Patient presents with acute onset chest pain radiating to left arm. History of hypertension and diabetes mellitus type 2.",
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Patient presents with acute onset chest pain radiating to left arm.",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 5.5,
      "end": 9.8,
      "text": "History of hypertension and diabetes mellitus type 2.",
      "speaker": "SPEAKER_00"
    }
  ],
  "total_segments": 2,
  "filename": "your_audio_file.wav",
  "unique_speakers": ["SPEAKER_00"]
}

Relevant Response Fields

Field            Type     Description
success          boolean  Whether transcription succeeded
text             string   Complete transcription text
segments         array    Timestamped segments with speaker labels; appears only when enable_diarization is set to true
total_segments   integer  Number of transcribed segments
filename         string   Original filename
unique_speakers  array    List of speaker IDs found; appears only when enable_diarization is set to true

Segment Object

Each segment in the segments array contains:

Field    Type    Description
start    float   Start time in seconds
end      float   End time in seconds
text     string  Transcribed text for this segment
speaker  string  Speaker identifier (e.g., SPEAKER_00)
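
For code that passes segments around, a small typed wrapper can be easier to work with than raw dictionaries. The sketch below is one possible shape: the `Segment` class, its `duration` property, and the `parse_segments` helper are illustrative names, not part of the API. It assumes `result` is the parsed JSON response and that `segments` may be missing when diarization is off.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One entry of the `segments` array (field names taken from the table above)."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None

    @property
    def duration(self) -> float:
        # Length of the segment in seconds.
        return self.end - self.start


def parse_segments(result: dict) -> List[Segment]:
    """Convert the response's `segments` array into Segment objects.

    Returns an empty list when the key is missing.
    """
    return [
        Segment(
            start=float(item["start"]),
            end=float(item["end"]),
            text=item["text"],
            speaker=item.get("speaker"),
        )
        for item in result.get("segments", [])
    ]
```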

Working with Results

Extract full transcript:

# `result` is the parsed JSON response from the "Input - REST API" example.
full_text = result["text"]
print(full_text)

Process segments with timestamps:

# Default to an empty list because segments is only returned with diarization enabled.
for segment in result.get("segments", []):
    print(f"[{segment['start']:.2f}s - {segment['end']:.2f}s] {segment['speaker']}: {segment['text']}")

Note: the segments field appears only when the enable_diarization parameter is set to true.

Identify unique speakers:

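# Default to an empty list because unique_speakers is only returned with diarization enabled.
speakers = result.get("unique_speakers", [])
print(f"Found {len(speakers)} speakers: {', '.join(speakers)}")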

Note: the unique_speakers field appears only when the enable_diarization parameter is set to true.
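
The segment timestamps can also be combined with the speaker labels, for example to estimate how long each speaker talked. This is an illustrative sketch: the helper name `speaking_time_by_speaker` is hypothetical, and it assumes `result` is the parsed JSON response with diarization enabled.

```python
from collections import defaultdict


def speaking_time_by_speaker(result):
    """Sum segment durations (in seconds) for each speaker label.

    Returns an empty dict when the response has no `segments` key.
    """
    totals = defaultdict(float)
    for segment in result.get("segments", []):
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)
```

Applied to the sample response in the Output section, this gives roughly 9.5 seconds for SPEAKER_00 (5.2 s plus 4.3 s).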

Troubleshooting

Issue                     Cause                    Solution
HTTP 400: Bad Request     Invalid parameters       Verify language_code=med-en
HTTP 401: Unauthorized    Invalid API key          Check authentication credentials
HTTP 413: File Too Large  Exceeds 30 MB limit      Split file into smaller segments
HTTP 500: Internal Error  Server-side issue        Contact support with error details
Connection Timeout        Network/server problem   Verify connectivity and retry
SSL Certificate Error     HTTPS certificate issue  Script handles automatically
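
For the "verify connectivity and retry" case, a small retry wrapper with exponential backoff is one common approach. The sketch below is generic and not taken from the official script: `call_with_retry` and its parameters are hypothetical names, and the attempt count and delays are arbitrary. By default it retries on Python's built-in `TimeoutError` and `ConnectionError`; when sending with `requests`, you would likely pass `retry_on=(requests.exceptions.Timeout, requests.exceptions.ConnectionError)` instead.

```python
import time


def call_with_retry(send, attempts=3, base_delay=1.0,
                    retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call `send()` and retry with exponential backoff on the given exceptions.

    `send` is any zero-argument callable that performs the request.
    The last exception is re-raised once all attempts have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return send()
        except retry_on as exc:
            last_error = exc
            # Wait 1x, 2x, 4x ... base_delay between attempts, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Errors such as HTTP 400, 401, or 413 are not retried here, since repeating the same request would not change the outcome.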

Best Practices

  • Always specify language_code=med-en to use the Zero STT Med model
  • Split files larger than 30 MB into smaller segments before processing
  • Enable diarization for multi-speaker clinical discussions

Model: Zero STT Med | Optimized for: Medical terminology, procedures, and clinical documentation