Zero STT Med - Batch Transcription Documentation

Medical-grade Speech-to-Text powered by ShunyaLabs

Zero Med is our domain-specific speech recognition model optimized for medical transcription, offering superior accuracy for medical terminology, procedures, and clinical documentation.

Prerequisites

Python 3.8 or higher
Valid API key (contact [email protected] to get your unique API key)
Audio files in supported formats (WAV, MP3, M4A, FLAC, OGG, AAC, WMA, MP4, MKV, MOV, AVI, WebM)

Installation:

pip install requests

Input - REST API

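import requests

# Placeholders: replace with your own API key and audio file path.
YOUR_API_KEY = "<YOUR_API_KEY>"
audio_path = "your_audio_file.wav"

url = "https://tb.shunyalabs.ai/transcribe"
headers = {"X-API-Key": YOUR_API_KEY}

# Open the audio in binary mode and send it as multipart form data.
with open(audio_path, "rb") as audio_file:
    files = {"file": audio_file}
    data = {
        "language_code": "med-en",     # selects the medical model
        "enable_diarization": "true",  # optional: adds speaker-labelled segments
    }

    response = requests.post(url, headers=headers, files=files, data=data)
    response.raise_for_status()  # raise on HTTP errors such as 400, 401, 413, 500
    result = response.json()

print(result["text"])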
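
The "Working with Results" snippets later in this guide call a `transcribe_file` helper. A minimal sketch of what such a helper might look like, assuming it simply wraps the request above: the function name and its first two arguments follow the later call, while the `enable_diarization` and `timeout` parameters and the 300-second default are illustrative assumptions rather than documented API behavior.

```python
import requests

API_URL = "https://tb.shunyalabs.ai/transcribe"


def transcribe_file(audio_path, api_key, enable_diarization=True, timeout=300):
    """Upload one audio file and return the parsed JSON response.

    Hypothetical helper: only the URL, header name, and form fields come
    from this guide; the timeout default is an assumption.
    """
    data = {"language_code": "med-en"}
    if enable_diarization:
        # The parameter is optional, so it is only sent when requested.
        data["enable_diarization"] = "true"

    with open(audio_path, "rb") as audio_file:
        response = requests.post(
            API_URL,
            headers={"X-API-Key": api_key},
            files={"file": audio_file},
            data=data,
            timeout=timeout,
        )

    response.raise_for_status()  # surface HTTP 4xx/5xx as requests.HTTPError
    return response.json()
```

Passing an explicit `timeout` matters because `requests` waits indefinitely by default, which is easy to hit with long recordings.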

Input - cURL

curl -X POST "https://tb.shunyalabs.ai/transcribe" \
  -H "X-API-Key: <YOUR_API_KEY>" \
  -F "file=@your_audio_file.wav" \
  -F "language_code=med-en" \
  -F "enable_diarization=true"

File Size Limits

Maximum file size: 30 MB

For files larger than 30 MB, split the audio into smaller segments before processing.
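
One way to do the split for uncompressed WAV input is sketched below, using only Python's standard `wave` module. This is an illustration rather than an official tool: the helper names, the 10-minute default chunk length, and the reading of "30 MB" as 30,000,000 bytes are all assumptions, and compressed formats such as MP3 or M4A would need a different tool.

```python
import os
import wave

# Assumption: "30 MB" is read conservatively as 30,000,000 bytes.
MAX_UPLOAD_BYTES = 30 * 1000 * 1000


def needs_split(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file on disk is larger than the upload limit."""
    return os.path.getsize(path) > limit


def split_wav(path, out_dir, chunk_seconds=600):
    """Split a WAV file into consecutive chunks of chunk_seconds each.

    Returns the list of chunk paths in playback order.
    """
    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(path))[0]
    out_paths = []

    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source audio
            out_path = os.path.join(out_dir, f"{stem}_part{index:03d}.wav")
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width, and rate; the frame count
                # in the header is corrected automatically on close.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1

    return out_paths
```

Timestamps returned for each chunk start from zero, so when stitching results back together the offset of each chunk (its index times `chunk_seconds`) has to be added to the segment times.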

Parameters

Parameter	Value	Description
`--audio-file`	`<your_audio_file>`	Path to your audio file
`--language-code`	`med-en`	Use `med-en` for the Zero Med model
`--api-key`	`<YOUR_API_KEY>`	Your authentication key
`--api-url`	`https://tb.shunyalabs.ai`	API endpoint
`--enable-diarization`	`true` (optional)	Enables speaker identification (diarization)

Output

The API returns a JSON response with the transcription:

Note: segments and unique_speakers appear only when the enable_diarization parameter is set to true.

{
  "success": true,
  "text": "Patient presents with acute onset chest pain radiating to left arm. History of hypertension and diabetes mellitus type 2.",
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Patient presents with acute onset chest pain radiating to left arm.",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 5.5,
      "end": 9.8,
      "text": "History of hypertension and diabetes mellitus type 2.",
      "speaker": "SPEAKER_00"
    }
  ],
  "total_segments": 2,
  "filename": "your_audio_file.wav",
  "unique_speakers": ["SPEAKER_00"]
}

Relevant Response Fields

Field	Type	Description
`success`	boolean	Whether transcription succeeded
`text`	string	Complete transcription text
`segments`	array	Timestamped segments with speaker labels; present only when the enable_diarization parameter is set to true
`total_segments`	integer	Number of transcribed segments
`filename`	string	Original filename
`unique_speakers`	array	List of speaker IDs found; present only when the enable_diarization parameter is set to true
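
Because some of these fields are optional, client code is safer if it does not index them directly. The sketch below shows one defensive way to read a response; `summarize_result` is a hypothetical helper, and since this guide does not document what a failed response looks like, treating a missing or false `success` as an error is an assumption.

```python
def summarize_result(result):
    """Condense a transcription response into a small, always-complete dict."""
    # Assumption: a missing or false "success" flag means the call failed.
    if not result.get("success"):
        raise ValueError("Transcription did not succeed")

    # segments and unique_speakers are absent when diarization is off.
    segments = result.get("segments", [])
    return {
        "text": result.get("text", ""),
        "diarized": "segments" in result,
        "segment_count": result.get("total_segments", len(segments)),
        "speakers": result.get("unique_speakers", []),
    }
```

Callers can then rely on every key being present, whether or not diarization was requested.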

Segment Object

Each segment in the segments array contains:

Note: segments appear only when the enable_diarization parameter is set to true.

Field	Type	Description
`start`	float	Start time in seconds
`end`	float	End time in seconds
`text`	string	Transcribed text for this segment
`speaker`	string	Speaker identifier (e.g., SPEAKER_00)

Working with Results

Extract full transcript:

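# transcribe_file is assumed to wrap the REST request shown under "Input - REST API"
# and return the parsed JSON response.
result = transcribe_file("your_audio_file.wav", YOUR_API_KEY)
full_text = result["text"]
print(full_text)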

Process segments with timestamps:

# segments is present only when diarization was enabled, so fall back to an empty list.
for segment in result.get("segments", []):
    print(f"[{segment['start']:.2f}s - {segment['end']:.2f}s] {segment['speaker']}: {segment['text']}")

Note: segments appear only when the enable_diarization parameter is set to true.

Identify unique speakers:

# unique_speakers is present only when diarization was enabled.
speakers = result.get("unique_speakers", [])
print(f"Found {len(speakers)} speakers: {', '.join(speakers)}")

Note: unique_speakers appears only when the enable_diarization parameter is set to true.
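
Merge consecutive segments into speaker turns:

For clinical notes it is often more readable to group back-to-back segments from the same speaker into a single turn. The sketch below is one way this could be done on the `segments` array; `merge_speaker_turns` is a hypothetical helper, not part of the API.

```python
def merge_speaker_turns(segments):
    """Merge adjacent segments that share a speaker into one turn.

    Each turn keeps the start of its first segment, the end of its last
    segment, and the segment texts joined with spaces.
    """
    turns = []
    for segment in segments:
        if turns and turns[-1]["speaker"] == segment["speaker"]:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["end"] = segment["end"]
            turns[-1]["text"] += " " + segment["text"]
        else:
            # New speaker: start a turn from a copy so the input is not mutated.
            turns.append(dict(segment))
    return turns
```

Applied to the sample response in the "Output" section, the two SPEAKER_00 segments collapse into a single turn running from 0.0 to 9.8 seconds.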

Troubleshooting

Issue	Cause	Solution
HTTP 400: Bad Request	Invalid parameters	Verify `language_code=med-en`
HTTP 401: Unauthorized	Invalid API key	Check authentication credentials
HTTP 413: File Too Large	Exceeds 30 MB limit	Split file into smaller segments
HTTP 500: Internal Error	Server-side issue	Contact support with error details
Connection Timeout	Network/server problem	Verify connectivity and retry
SSL Certificate Error	HTTPS certificate issue	Script handles automatically
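
The table above can be turned into client-side handling along the following lines. This is a sketch under stated assumptions: `advice_for_status` and `call_with_retry` are hypothetical helpers, the advice strings are copied from the table, and the retry count and doubling delay are illustrative defaults rather than documented recommendations.

```python
import time

# Suggested actions copied from the troubleshooting table.
ADVICE_BY_STATUS = {
    400: "Verify language_code=med-en",
    401: "Check authentication credentials",
    413: "Split file into smaller segments",
    500: "Contact support with error details",
}


def advice_for_status(status_code):
    """Return the table's suggested action for an HTTP status code."""
    return ADVICE_BY_STATUS.get(status_code, "No specific guidance documented")


def call_with_retry(call, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying when it raises one of the retry_on exceptions.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The exception from the final attempt is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With `requests`, `retry_on` would typically be `(requests.exceptions.Timeout, requests.exceptions.ConnectionError)`. Retrying is only useful for connectivity problems; the 400, 401, and 413 responses will fail the same way until the request itself is fixed.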

Best Practices

Always specify language_code=med-en to use the Zero STT Med model
Split files larger than 30 MB into smaller segments before processing
Enable diarization for multi-speaker clinical discussions

Model: Zero STT Med | Optimized for: Medical terminology, procedures, and clinical documentation