Pre-recorded audio

The pre-recorded transcription endpoint is designed for audio files you already have on hand: voice notes, meeting clips, voicemail recordings, and similar short-form content up to 5 minutes long. You send one file per request and receive the finished transcript in the response body of the same synchronous call, with no polling and no callbacks.

Constraints

Supported formats: wav, ogg, webm
Max file size: 50 MB
Max duration: 300 seconds (default — can be raised on request)
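
A client-side pre-flight check can catch files that would break these limits before any upload happens. The sketch below is illustrative only: check_audio_file is a hypothetical helper, not part of any Typeless SDK. It judges format by file extension alone, assumes "50 MB" means decimal megabytes (the stricter reading), and measures duration only for WAV files through Python's standard wave module, since ogg and webm need a third-party decoder.

```python
import os
import wave

SUPPORTED_EXTENSIONS = {".wav", ".ogg", ".webm"}  # formats listed above
MAX_FILE_BYTES = 50 * 1000 * 1000  # assumption: decimal MB, the stricter reading
MAX_DURATION_SECONDS = 300         # default limit; may be higher on your account


def check_audio_file(path):
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 50 MB")
    if ext == ".wav":
        try:
            with wave.open(path, "rb") as w:
                duration = w.getnframes() / w.getframerate()
        except (wave.Error, EOFError):
            # Not a PCM WAV the stdlib can read; leave the check to the API
            duration = None
        if duration is not None and duration > MAX_DURATION_SECONDS:
            problems.append(f"audio is {duration:.1f}s, over the 300s limit")
    return problems
```

Calling check_audio_file("meeting.wav") before the upload returns an empty list when nothing looks wrong. The API remains the authority on what it accepts, so treat this as an early warning rather than a guarantee.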

Make a request

Send your audio file in the audio field of a multipart/form-data request and name the model to use in the model field. Make sure your API key is set in the environment first (see Authentication).

# cURL
curl -X POST https://api.typelessapi.com/v1/transcribe \
  -H "Authorization: Token $TYPELESS_API_KEY" \
  -F "audio=@meeting.wav" \
  -F "model=typeless-1.0-pro"

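# Python (uses the third-party requests package)
import os
import requests

# Send the file as the multipart "audio" field; the model goes in a form field.
with open("meeting.wav", "rb") as f: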
    resp = requests.post(
        "https://api.typelessapi.com/v1/transcribe",
        headers={"Authorization": f"Token {os.environ['TYPELESS_API_KEY']}"},
        files={"audio": f},
        data={"model": "typeless-1.0-pro"},
    )
print(resp.json())

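// Node.js 18+ (global fetch, FormData, and Blob), run as an ES module
import fs from "node:fs";

// Wrap the file bytes in a Blob so FormData sends them as a file part.
const form = new FormData();
form.append("audio", new Blob([fs.readFileSync("meeting.wav")]), "meeting.wav");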
form.append("model", "typeless-1.0-pro");

const resp = await fetch("https://api.typelessapi.com/v1/transcribe", {
  method: "POST",
  headers: { Authorization: `Token ${process.env.TYPELESS_API_KEY}` },
  body: form,
});
console.log(await resp.json());

Response

A successful request returns a JSON object containing the transcript, detected language, and billing usage:

{
  "status": "success",
  "result": {
    "transcript": "Let's move the launch to next Thursday and loop in the design team early.",
    "detected_language": "en",
    "duration_seconds": 42.5
  },
  "usage": {
    "billed_audio_seconds": 42.5,
    "output_token_count": 128
  },
  "request_id": "01JXXXXXXXXXXXXXXXXXXXXXXX"
}
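
To illustrate how the fields above might be consumed, here is a minimal sketch that pulls the transcript and usage figures out of a parsed response. summarize_response is a hypothetical helper, and treating any status other than "success" as a failure is an assumption, since this page documents only the success shape.

```python
import json


def summarize_response(body):
    """Extract the commonly used fields from a parsed transcription response."""
    if body.get("status") != "success":
        # Assumption: non-success bodies are not documented on this page
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    result = body["result"]
    usage = body["usage"]
    return {
        "transcript": result["transcript"],
        "language": result["detected_language"],
        "duration_seconds": result["duration_seconds"],
        "billed_seconds": usage["billed_audio_seconds"],
        "request_id": body["request_id"],
    }


# The sample response from this page, parsed the way resp.json() would parse it
sample = json.loads("""
{
  "status": "success",
  "result": {
    "transcript": "Let's move the launch to next Thursday and loop in the design team early.",
    "detected_language": "en",
    "duration_seconds": 42.5
  },
  "usage": {"billed_audio_seconds": 42.5, "output_token_count": 128},
  "request_id": "01JXXXXXXXXXXXXXXXXXXXXXXX"
}
""")
summary = summarize_response(sample)
print(summary["language"], summary["billed_seconds"])
```

Keeping request_id in the summary is a deliberate choice: it is the one value worth logging for every call, since it identifies the request if you need to follow up with support.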

Choosing a language

The language parameter is optional. Pass an ISO 639-1 code (for example, en for English or zh for Mandarin Chinese) to give the model a hint that improves accuracy, especially for shorter clips or accented speech. If you omit it, or pass a code the API does not support or recognize, the API detects the language automatically. The detected_language field is reported on every response.
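
Because an unrecognized code falls back to auto-detection rather than failing, a typo in the hint can go unnoticed. The sketch below builds the non-file form fields and rejects anything that is not shaped like an ISO 639-1 code. build_form_fields is a hypothetical helper, and sending language as a form field alongside model is an assumption; this page does not state where the parameter goes.

```python
import re


def build_form_fields(model, language=None):
    """Build the non-file form fields for a transcription request."""
    fields = {"model": model}
    if language is not None:
        # ISO 639-1 codes are two lowercase letters, e.g. "en" or "zh"
        if not re.fullmatch(r"[a-z]{2}", language):
            raise ValueError(f"not an ISO 639-1 style code: {language!r}")
        # Assumption: the hint travels as a form field named "language"
        fields["language"] = language
    return fields
```

With the requests example above, the result would be passed as data=build_form_fields("typeless-1.0-pro", "en"). The shape check cannot tell whether the API supports a given code, so comparing detected_language in the response against the hint you sent is still worthwhile.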

Choosing a model

The model field is required. Three tiers are available:

Model	Tier
`typeless-1.0-lite`	Economy — lowest cost; recommended for non-real-time (batch) use cases
`typeless-1.0-pro`	Balanced quality, speed, and cost
`typeless-1.0-max`	Recommended default — highest accuracy and fastest turnaround; ideal for noisy audio, strong accents, or technical vocabulary

See Models & Pricing for a full breakdown of per-second rates.

Timeouts

The request is synchronous: your HTTP connection stays open until transcription is complete. Internally, the ASR stage has a 90-second timeout and the refinement stage has a 60-second timeout. If either limit is exceeded, the API returns SERVICE_UNAVAILABLE (503) and you are not charged for that request.
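
Since a timed-out request is not charged, retrying on 503 is a reasonable client strategy. The sketch below shows one way to do it, with exponential backoff. call_with_retry and both constants are hypothetical names, and the client timeout rests on an assumption that the two stages run one after the other, so the server could hold the connection for up to 150 seconds.

```python
import time

# Assumption: ASR and refinement run sequentially, so the worst case is the sum
SERVER_WORST_CASE_SECONDS = 90 + 60
CLIENT_TIMEOUT_SECONDS = SERVER_WORST_CASE_SECONDS + 30  # headroom for the upload


def call_with_retry(send, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns a status other than 503.

    send is any zero-argument callable returning (status_code, body).
    The last result is returned even if it is still a 503.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 503 or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

With requests, send could be a small function that reopens the audio file, posts it with timeout=CLIENT_TIMEOUT_SECONDS, and returns (resp.status_code, resp.json()). Reopening the file on each attempt matters, because a file object that has already been read to the end would upload as empty on the retry.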

Billing is per second of audio with a 15-second minimum per request. See Models & Pricing for rates.
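
The 15-second minimum matters most for very short clips. As a rough sketch, assuming billed time is simply the larger of the audio duration and the minimum with no rounding (consistent with the sample response, where 42.5 seconds of audio is billed as 42.5 seconds), the arithmetic looks like this. billed_seconds is a hypothetical helper, not an API call.

```python
def billed_seconds(duration_seconds, minimum=15.0):
    """Billable seconds for one request under a per-request minimum."""
    # Assumption: no rounding; the minimum is the only adjustment applied
    return max(duration_seconds, minimum)


# Ten 3-second voice notes sent as ten separate requests
total_billed = sum(billed_seconds(3.0) for _ in range(10))
total_audio = 10 * 3.0
print(total_billed, total_audio)
```

Under these assumptions, the ten requests carry 30 seconds of audio but are billed as 150 seconds, which is worth keeping in mind when a workload consists mostly of very short recordings.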
