Streaming audio

Streaming transcription is designed for scenarios where audio is captured in real time: live dictation, call transcription, and real-time captioning pipelines. Rather than waiting until a recording is complete, you open a WebSocket connection and push audio frames as they arrive. The session lifecycle is straightforward:

1. Connect: authenticate via query parameters.
2. Send binary PCM frames: push raw audio as it is captured.
3. Send `close_stream`: signal that you are done sending audio.
4. Receive `result`: the server returns the final transcript and closes the connection.

Connection

Connect to the following URL, passing authentication and stream parameters as query parameters:

wss://api.typelessapi.com/v1/transcribe/stream

Parameter	Type	Default	Description
`token`	string	required	Your API key
`model`	string	required	Model tier — see Models & Pricing
`sample_rate`	integer	16000	8000–48000 Hz
`channels`	integer	1	1 or 2
`encoding`	string	`pcm16`	Only `pcm16` (16-bit little-endian PCM) is supported
`language`	string	auto-detect	Optional ISO 639-1 hint
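As a minimal sketch of how these parameters could be assembled, the helper below builds the connection URL with the standard library so that the token and other values are percent-encoded. The function name `build_stream_url` and its validation checks are illustrative, not part of the API; the range check assumes the 8000–48000 Hz bounds are inclusive.

```python
from urllib.parse import urlencode

STREAM_URL = "wss://api.typelessapi.com/v1/transcribe/stream"


def build_stream_url(token, model, sample_rate=16000, channels=1, language=None):
    """Return the streaming URL with validated, percent-encoded query parameters."""
    # Assumption: the documented 8000-48000 Hz range is inclusive at both ends.
    if not 8000 <= sample_rate <= 48000:
        raise ValueError("sample_rate must be between 8000 and 48000 Hz")
    if channels not in (1, 2):
        raise ValueError("channels must be 1 or 2")
    params = {
        "token": token,
        "model": model,
        "sample_rate": sample_rate,
        "channels": channels,
        "encoding": "pcm16",  # the only encoding listed as supported
    }
    if language is not None:
        params["language"] = language  # optional ISO 639-1 hint, e.g. "en"
    return f"{STREAM_URL}?{urlencode(params)}"
```

Calling `build_stream_url(os.environ["TYPELESS_API_KEY"], "typeless-1.0-pro")` should yield a URL equivalent to the one hard-coded in the example further down, with any special characters in the key escaped.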

Connection lifecycle

You must send the first audio frame within 10 seconds of connecting; `keep_alive` messages do not extend this deadline.

Once the first frame has been sent, the server closes the connection after 60 seconds of inactivity. Any audio frame or `keep_alive` message resets the idle timer.
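If capture can pause for long stretches (for example, a muted microphone), a background task can keep the session open. The sketch below is one way to do that. It assumes the keep-alive message is the JSON object `{"type": "keep_alive"}`, mirroring the `close_stream` message used in the example; check the protocol reference for the exact shape. The name `send_keep_alives` and the 20-second default interval are illustrative choices.

```python
import asyncio
import json

IDLE_TIMEOUT_S = 60  # server-side idle limit described above


async def send_keep_alives(ws, stop, interval=IDLE_TIMEOUT_S / 3):
    """Send a keep_alive message every `interval` seconds until `stop` is set.

    `ws` is any object with an async `send` method (such as a websockets
    connection); `stop` is an asyncio.Event set by the caller.
    """
    while not stop.is_set():
        try:
            # Wake early if the caller signals that audio has resumed or ended.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            # Assumed message shape, mirroring {"type": "close_stream"}.
            await ws.send(json.dumps({"type": "keep_alive"}))
```

A caller would start this with `asyncio.create_task(send_keep_alives(ws, stop))` when audio pauses and call `stop.set()` when it resumes. Because keep-alives do not extend the 10-second first-frame deadline, this only helps after the first audio frame has been sent.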

Example

# Python example (requires the third-party `websockets` package)
import asyncio
import json
import os

import websockets


async def main():
    url = (
        "wss://api.typelessapi.com/v1/transcribe/stream"
        f"?token={os.environ['TYPELESS_API_KEY']}"
        "&model=typeless-1.0-pro"
        "&sample_rate=16000&channels=1&encoding=pcm16"
    )
    async with websockets.connect(url) as ws:
        # meeting.pcm: raw 16 kHz mono little-endian pcm16
        with open("meeting.pcm", "rb") as f:
            while chunk := f.read(3200):  # 100 ms per frame
                await ws.send(chunk)
                await asyncio.sleep(0.1)
        await ws.send(json.dumps({"type": "close_stream"}))
        async for message in ws:
            event = json.loads(message)
            if event["type"] == "result":
                print(event["result"]["transcript"])
                break


asyncio.run(main())

// Node.js example (requires the `ws` package)
import fs from "node:fs";
import WebSocket from "ws";

const url =
  "wss://api.typelessapi.com/v1/transcribe/stream" +
  `?token=${process.env.TYPELESS_API_KEY}` +
  "&model=typeless-1.0-pro" +
  "&sample_rate=16000&channels=1&encoding=pcm16";

const ws = new WebSocket(url);

ws.on("open", () => {
  const pcm = fs.readFileSync("meeting.pcm"); // raw 16 kHz mono pcm16
  const frame = 3200; // 100 ms per frame
  let offset = 0;
  const timer = setInterval(() => {
    if (offset >= pcm.length) {
      clearInterval(timer);
      ws.send(JSON.stringify({ type: "close_stream" }));
      return;
    }
    // production code should check ws.bufferedAmount before sending
    ws.send(pcm.subarray(offset, offset + frame));
    offset += frame;
  }, 100);
});

ws.on("message", (data) => {
  const event = JSON.parse(data.toString());
  if (event.type === "result") {
    console.log(event.result.transcript);
    ws.close();
  }
});
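
The 3200-byte frame size used in both examples follows from the audio format: 16,000 samples per second, 2 bytes per `pcm16` sample, one channel, and 100 ms per frame. For other settings, the same arithmetic can be sketched as below; `frame_bytes` is a hypothetical helper name, not part of any SDK.

```python
BYTES_PER_SAMPLE = 2  # pcm16 is 16 bits per sample


def frame_bytes(sample_rate, channels=1, frame_ms=100):
    """Return the byte length of one PCM16 frame covering about frame_ms milliseconds."""
    # Round down to a whole number of samples so frames never split a sample.
    samples_per_channel = sample_rate * frame_ms // 1000
    return samples_per_channel * channels * BYTES_PER_SAMPLE


print(frame_bytes(16000))         # 3200, matching the examples above
print(frame_bytes(48000, 2, 20))  # 3840
```

Computing the sample count first keeps every frame aligned to whole samples across all channels, which matters when the sample rate does not divide evenly by the frame duration.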

Getting the result

Once you send `close_stream`, the server finishes processing all buffered audio and then delivers a single `result` message containing the complete transcript. Finalization typically completes within a few seconds, but can take up to about a minute when a large amount of audio is still being processed. The server closes the connection immediately afterward.
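Because finalization can take up to about a minute, a client may want a bounded wait rather than blocking indefinitely. The following is a minimal sketch, assuming the `result` message has the shape used in the example above (`event["result"]["transcript"]`). The helper name `wait_for_result` and the 90-second default are illustrative; the default simply leaves some margin over the stated upper bound.

```python
import asyncio
import json


async def wait_for_result(messages, timeout=90.0):
    """Return the transcript from the first `result` message.

    `messages` is any async iterable of JSON text, such as a websockets
    connection. Raises asyncio.TimeoutError if no result arrives in time.
    """

    async def _first_result():
        async for message in messages:
            event = json.loads(message)
            if event.get("type") == "result":
                return event["result"]["transcript"]
        # The iterable ended (connection closed) without a result message.
        raise ConnectionError("stream ended before a result message arrived")

    return await asyncio.wait_for(_first_result(), timeout=timeout)
```

In the Python example, the final `async for` loop could be replaced with `print(await wait_for_result(ws))` after sending `close_stream`.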

If you disconnect or cancel after transcription has started, you are still billed for the audio you sent (15-second minimum applies).

For the complete message protocol, see the `WS /v1/transcribe/stream` reference.
